Editor’s note: Durham-based CloudFactory on Thursday announced a deal to buy Germany-based Hasty. In his own words, CloudFactory founder and CEO Mark Sears explains why the deal was made – and its importance.
DURHAM – The CloudFactory team and I are excited to announce the acquisition of Hasty, a Berlin-based data-centric machine learning platform that allows companies to build and deploy computer vision models faster.
Around the world, CloudFactory is seeing more and more companies lean into data-centric AI practices to move their machine learning models to the next level. Platforms like Hasty let companies work on models and data in tandem, which enables rapid feedback and fast iterations and reduces the time needed to create high-quality datasets.
It’s an exciting time for AI innovators. After lagging behind expectations for a long time, AI is ready to be adopted at scale thanks to data-centric AI.
In this post, I’ll dig deep into the three main reasons CloudFactory is acquiring Hasty to accelerate vision AI:
- We’re advancing the shift from model-centric to data-centric AI.
- We’re adding AI-assisted automated labeling that is best in class.
- We’re integrating humans in the loop and technology to offer a true end-to-end solution.
1. We’re advancing the shift from model-centric to data-centric AI
For years, the focus of AI development has been on the model. Data scientists worked to optimize code while holding datasets constant, largely ignoring “noise” or inconsistency issues in an approach called “model-centric AI.” Early adopters focused on models and data science techniques partly because the alternative comes with a daunting truth: data is hard.
Then in 2021, Andrew Ng began advocating for a new “data-centric” approach to unlock the full power of AI systems and get models to market faster. In a data-centric environment, data scientists and engineers turn their primary attention to the data—finding ways to optimize it iteratively throughout the entire development process.
At CloudFactory, I’ve seen more and more teams shifting from a model-centric to a data-centric approach. Companies building novel vision AI applications need to develop a proprietary data asset to support that application and remain competitive. Companies that understand this fact and embrace it are excelling.
I’ve also seen teams resist this shift and treat the model and data as separate topics.
When AI teams divorce the model from the data, they often operate in a staged approach: collect data, label data, train the model, optimize the model, and deploy the model. This approach leads to slow iterations and limited feedback, which delay or derail many machine learning projects.
Your team can unlock vision AI’s magic by looking at the intersection between the model and its training data.
The Hasty platform does exactly this. The platform trains and retrains models in parallel to the annotation process, so teams can track key model performance improvements while labeling. They’ll get two powerful benefits:
- The maximum possible labeling automation.
- The insights needed to iterate and implement an agile workflow.
Iterative vision AI development process. Source: Hasty
According to chess grandmaster Garry Kasparov, “a weak human player + machine + superior process beats stronger humans and machines with inferior processes.” He said this after watching two amateur players with three laptops beat a field of grandmasters and supercomputers in an open chess tournament.
The conclusion was that while the ingredients are important, a strong process becomes the critical differentiating factor and predictor of success, combining the elements in a manner that plays to their strengths.
Combining CloudFactory’s data annotation at scale with the agile process created with Hasty’s tool results in a world-class human-in-the-loop machine learning platform at the forefront of vision AI.
2. We’re adding AI-assisted automated labeling that is best in class
Over the last decade, CloudFactory has worked with dozens of different tools to label high-quality datasets for our clients. Recently, I’ve seen more labeling platforms supporting automation by allowing companies to plug in custom or off-the-shelf models to assist in labeling. Unfortunately, these automation features are often considered secondary add-ons and afterthoughts.
Hasty’s next-generation platform provides practitioners with several self-learning AI assistants that work out of the box without requiring lengthy setups or configurations. Every annotation is used to update the AI-powered annotation assistants as you label, resulting in custom AI models for each project.
This customization is significant. Your team will eliminate the uncertainty in automation by training Hasty’s models on your data, therefore accelerating automation when appropriate.
Hasty combines these capabilities with a best-in-class approach that uses the model to find potential errors in your labeled data. While still a human-in-the-loop approach, this gives a human reviewer a shortlist of potentially incorrect annotations, so reviewers do not spend large amounts of time on random sampling or manual searches for bad labels.
Hasty’s AI-assisted labeling annotating the referee in a sports match. Source: Hasty
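To make the idea of a model-generated review shortlist concrete, here is a minimal sketch in Python. It is not Hasty's implementation. It assumes one simple heuristic: flag an annotation when the model confidently predicts a different class than the human label. The `Annotation` type, the `shortlist_suspect_labels` function, the sample data, and the 0.8 threshold are all hypothetical names and values chosen for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class Annotation(NamedTuple):
    """One labeled item plus the model's class probabilities (hypothetical structure)."""
    item_id: str
    human_label: str
    model_probs: Dict[str, float]  # class name -> predicted probability


def shortlist_suspect_labels(
    annotations: List[Annotation], min_confidence: float = 0.8
) -> List[Tuple[str, str, str, float]]:
    """Return annotations where the model confidently disagrees with the human label.

    Each result is (item_id, human_label, model_label, model_confidence),
    sorted so the most confident disagreements are reviewed first.
    """
    suspects = []
    for ann in annotations:
        # The model's top class and its probability.
        predicted, confidence = max(ann.model_probs.items(), key=lambda kv: kv[1])
        if predicted != ann.human_label and confidence >= min_confidence:
            suspects.append((ann.item_id, ann.human_label, predicted, confidence))
    suspects.sort(key=lambda s: s[3], reverse=True)
    return suspects


# Illustrative data only: four labeled images with made-up model outputs.
annotations = [
    Annotation("img_001", "referee", {"referee": 0.95, "player": 0.05}),
    Annotation("img_002", "player", {"referee": 0.91, "player": 0.09}),
    Annotation("img_003", "referee", {"referee": 0.40, "player": 0.60}),
    Annotation("img_004", "referee", {"referee": 0.02, "player": 0.98}),
]

review_queue = shortlist_suspect_labels(annotations)
for item_id, human_label, model_label, confidence in review_queue:
    print(f"{item_id}: labeled {human_label}, model says {model_label} ({confidence:.2f})")
```

In this sketch, a reviewer would look at only the two flagged images rather than sampling all four at random. The threshold trades review effort against missed errors, so it would need tuning per project.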
3. We’re integrating humans in the loop and technology to offer a true end-to-end solution
CloudFactory has worked with over 250 AI innovators. We know those who integrate process and technology in a tight iterative loop with people—the humans in the loop—are finding the most success.
Human-in-the-loop AI approach. Source: Hasty
Unfortunately, too many teams still train their models as a separate activity from how they sustain those models in production. They focus on the controlled research and development phase and miss that they are deploying their solutions into a messy reality.
The real world often presents models with a huge number of exceptions that need resolving. Teams who integrate humans in the loop across both the building and the operating of their models can launch earlier and with more confidence. Teams often leverage human-in-the-loop solutions in production for:
- Real-time review and quality control.
- Resolving exceptions and low-confidence results.
- Refreshing models and avoiding data drift through active learning.
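As a rough illustration of the second use case above, the following Python sketch routes production predictions by confidence: high-confidence results are accepted automatically, and the rest go to a human review queue. This is a generic pattern, not a description of CloudFactory's or Hasty's systems. The `route_predictions` function, the sample predictions, and the 0.9 threshold are assumptions made for the example.

```python
from typing import List, Tuple

Prediction = Tuple[str, str, float]  # (item_id, predicted_label, confidence)


def route_predictions(
    predictions: List[Prediction], threshold: float = 0.9
) -> Tuple[List[Tuple[str, str]], List[Prediction]]:
    """Split predictions into auto-accepted results and a human review queue.

    Predictions at or above the threshold are accepted as-is. The rest are
    queued for human review, least confident first.
    """
    auto_accepted = []
    needs_review = []
    for item_id, label, confidence in predictions:
        if confidence >= threshold:
            auto_accepted.append((item_id, label))
        else:
            needs_review.append((item_id, label, confidence))
    needs_review.sort(key=lambda p: p[2])
    return auto_accepted, needs_review


# Illustrative data only: made-up predictions from a vision model in production.
predictions = [
    ("frame_1", "referee", 0.97),
    ("frame_2", "player", 0.55),
    ("frame_3", "ball", 0.91),
    ("frame_4", "player", 0.72),
]

auto_accepted, needs_review = route_predictions(predictions)
print("auto-accepted:", auto_accepted)
print("needs human review:", needs_review)
```

The human-corrected results from the review queue could then be fed back into training, which is one way the active learning loop in the third bullet might be closed.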
CloudFactory is ready to take the next step
CloudFactory believes that an agile, data-centric approach is paramount for success. Labeling automation and humans in the loop should be tightly integrated into this approach.
The reasons listed above are why CloudFactory is acquiring Hasty—to bring together the pieces that will help your team find success with vision AI. We want to help your team build vision AI solutions better and faster.
I look forward to sharing more about this journey with you all in the coming months.