On Thursday, AI platform Clarifai unveiled a new reasoning engine that it claims will make running AI models twice as fast and 40% cheaper. Designed to adapt to a variety of models and cloud hosts, the system employs a range of optimizations to get more inference power out of the same hardware.
“It’s a variety of optimizations, all the way down to CUDA kernels and even advanced speculative decoding techniques,” said CEO Matthew Zeiler. “Essentially, you can get more from the same card.”
The results were verified by a series of benchmark tests from the third-party firm Artificial Analysis, which recorded industry-best results for both throughput and latency.
The process focuses specifically on inference, the computing required to run an AI model that has already been trained. That computing load has become particularly intense with the rise of agentic and reasoning models, which require multiple steps in response to a single command.
Clarifai first launched as a computer vision service and has grown increasingly focused on compute orchestration as the AI boom has drastically increased demand for both GPUs and the data centers that house them. The company first unveiled its compute platform at AWS re:Invent, but the new reasoning engine is the first product tailored specifically to multi-step agentic models.
The product arrives amid intense pressure on AI infrastructure, which has spurred a string of billion-dollar deals. OpenAI has laid out plans for up to $1 trillion in new data center spending, projecting near-endless future demand. But while the hardware build-out has been intense, Clarifai’s CEO believes there is more to be done to optimize the infrastructure that already exists.
“There are software tricks, like the Clarifai reasoning engine, that get more out of a good model like this,” says Zeiler.