As AI companies mature, the battle for high-quality data has become one of the most competitive areas in the industry, launching companies like Mercor, Surge, and most famously, Alexandr Wang’s Scale AI. But with Wang now running AI at Meta, many investors see an opening and are willing to fund companies with attractive new strategies for collecting training data.
Y Combinator alumnus Datacurve is one such company, with a focus on high-quality data for software development. The company announced Thursday a $15 million Series A round led by Chemistry’s Mark Goldberg with participation from employees from DeepMind, Vercel, Anthropic, and OpenAI. The Series A comes after a $2.7 million seed round with investment from former Coinbase CTO Balaji Srinivasan.
Datacurve uses a “bounty hunter” system to attract skilled software engineers to produce the most difficult-to-obtain datasets. The company pays for these contributions and has distributed more than $1 million in rewards to date.
But co-founder Selina Ge (pictured above with co-founder Charlie Lee) says the biggest motivation isn’t financial. For high-value services like software development, data work always pays much less than traditional employment, so a company’s most important strength is a positive user experience.
“We treat this as a consumer product, not a data labeling exercise,” Ge said. “We spend a lot of time thinking about how we can optimize our platform so that the people we want are interested and engaged with our platform.”
This becomes especially important as post-training data needs become more complex. While previous models were trained on simple datasets, today’s AI products rely on complex RL environments and must be built through specific and strategic data collection. As environments become more sophisticated, data requirements become more demanding in both quantity and quality. This could be a factor that gives a high-quality data collection company like Datacurve an edge.
As an early-stage company, Datacurve focuses on software engineering, but Ge says the model could just as easily be applied to fields such as finance, marketing, and even healthcare.
“What we are doing now is building an infrastructure for post-training data collection to attract and retain highly talented people in their unique fields,” Ge said.