Gimlet Labs Raises $80M to Speed Up AI Inference

Gimlet Labs, a startup working on the AI inference bottleneck, has raised an $80 million Series A round led by Menlo Ventures. The company claims to have built the first and only “multi-silicon inference cloud,” software that lets AI workloads run simultaneously across different types of hardware.

The software lets agents chain together steps that call for different hardware configurations, such as compute-intensive tasks and memory-bound operations. Gimlet Labs aims to make AI inference 3x to 10x more efficient while reducing cost and power consumption.

Founded by Stanford adjunct professor Zain Asgar and his team, Gimlet Labs has already partnered with top chip makers like NVIDIA, AMD, Intel, ARM, Cerebras, and d-Matrix. The company’s product is designed for large AI model labs and data centers, and its customer base has more than doubled in the last four months.

The round brings Gimlet Labs’ total funding to $92 million. Previous investors include Sequoia’s Bill Coughran, Stanford Professor Nick McKeown, and Intel CEO Lip-Bu Tan. The company currently employs 30 people and plans to use the new funding to further develop its technology and expand its customer base.

The AI inference bottleneck is a major concern in the tech industry: McKinsey estimates that data center spending could reach nearly $7 trillion by 2030. Gimlet Labs pitches its software as a way to get more out of that spending by running AI workloads more efficiently and with less waste.

Source: https://techcrunch.com/2026/03/23/startup-gimlet-labs-is-solving-the-ai-inference-bottleneck-in-a-surprisingly-elegant-way