Tensormesh Emerges From Stealth to Slash AI Inference Costs and Latency by up to 10x

SAN FRANCISCO — October 23, 2025 — Tensormesh, the company pioneering caching-accelerated inference optimization for enterprise AI, today emerged from stealth with $4.5 million in seed funding led by Laude Ventures. Tensormesh’s technology eliminates redundant computation in AI inference, reducing latency and GPU spend by up to 10x while giving enterprises full control of their data and infrastructure.

Founded by faculty and PhD researchers from the University of Chicago, UC Berkeley, and Carnegie Mellon, Tensormesh builds on years of academic research in distributed systems and AI infrastructure. The company is led by Junchen Jiang, University of Chicago faculty member and co-creator of LMCache, the leading open-source KV caching project with 5K+ GitHub stars and 100+ contributors. LMCache is integrated with frameworks such as vLLM and NVIDIA Dynamo, and has been used across the ecosystem by organizations including Bloomberg, Red Hat, Redis, Tencent, GMI Cloud, and WEKA.

Tensormesh is the first commercial platform to productize caching for large-scale AI inference, pairing LMCache-inspired techniques with enterprise-grade usability, security, and manageability.

“Enterprises today must either send their most sensitive data to third parties or hire entire engineering teams to rebuild infrastructure from scratch,” said Junchen Jiang, Founder and CEO of Tensormesh. “Tensormesh offers a third path: run AI wherever you want, with state-of-the-art optimizations, cost savings, and performance built in.”
“Enterprises everywhere are wrestling with the huge costs of AI inference,” said Ion Stoica, advisor to Tensormesh and Co-Founder and Executive Chairman of Databricks. “Tensormesh’s approach delivers a fundamental breakthrough in efficiency and is poised to become essential infrastructure for any company betting on AI.”

Sharing the KV cache across nodes in a cluster is a key driver of throughput gains and cost savings. Tensormesh supports multiple storage backends, enabling distributed cache sharing for low-latency, high-throughput deployments.
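To make the idea concrete, here is a minimal, hypothetical sketch of cross-node KV-cache sharing. It is not Tensormesh's or LMCache's actual API: the function names are illustrative, and a plain dict stands in for a shared backend such as Redis so the sketch runs anywhere. The core pattern is that one node publishes the KV cache computed for a token prefix under a content-derived key, and any other node serving a request with the same prefix can fetch it and skip that portion of prefill.

```python
import hashlib

# Stand-in for a shared store (e.g. Redis) reachable by every serving node.
backend = {}

def prefix_key(model: str, tokens: list[int]) -> str:
    """Key the cache entry by model and the exact token prefix."""
    digest = hashlib.sha256(str(tokens).encode("utf-8")).hexdigest()
    return f"kv:{model}:{len(tokens)}:{digest}"

def store_kv(model: str, tokens: list[int], kv_blob: bytes) -> None:
    """Publish serialized KV tensors for a token prefix."""
    backend[prefix_key(model, tokens)] = kv_blob

def fetch_kv(model: str, tokens: list[int]):
    """Return (matched_len, blob) for the longest cached prefix, or (0, None)."""
    for end in range(len(tokens), 0, -1):
        blob = backend.get(prefix_key(model, tokens[:end]))
        if blob is not None:
            return end, blob  # reuse: skip prefill for the first `end` tokens
    return 0, None

# Node A computes and publishes the KV cache for a shared system prompt...
store_kv("example-model", [1, 2, 3, 4], b"serialized-kv-tensors")

# ...and Node B, serving a longer request with the same prefix, reuses it.
hit_len, blob = fetch_kv("example-model", [1, 2, 3, 4, 5, 6])
```

A production system would add eviction, tiered storage (GPU memory, CPU memory, disk, remote store), and chunked rather than whole-prefix keys, but the lookup-longest-prefix pattern above is the essence of why shared caching cuts redundant computation.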

“We have closely collaborated with Tensormesh to deliver an impressive solution for distributed LLM KVCache sharing across multiple servers. Redis combined with Tensormesh delivers a scalable solution for low-latency, high-throughput LLM deployments. The benchmarks we ran together demonstrated remarkable improvements in both performance and efficiency, and we’re excited to see the Tensormesh product, which we believe will set a new bar for LLM hosting performance,” said Rowan Trollope, CEO of Redis.

Organizations face immense pressure to balance performance, cost, and control in their AI deployments. Tensormesh lets organizations run AI inference on the infrastructure of their choice while maintaining strong security and low cost. It is cloud-agnostic and available as SaaS or as standalone software, so teams can start small and scale across any public cloud or in-house environment.

“Our partnership with Tensormesh and integration with LMCache played a critical role in helping WEKA open-source aspects of our breakthrough Augmented Memory Grid solution, enabling the broader AI community to tackle some of the toughest challenges in inference today,” said Callan Fox, Lead Product Manager at WEKA.

As inference workloads surge and enterprises search for sustainable ways to scale, demand for new efficiency layers in the AI stack is growing quickly. With its deep research roots and wide open-source adoption, Tensormesh is positioned to meet this moment, bringing caching into the enterprise mainstream and laying a stronger foundation for AI infrastructure.

“Caching is one of the most underutilized levers in AI infrastructure, and this team has found a smart, practical way to apply it at scale,” said Pete Sonsini, Co-Founder and General Partner at Laude Ventures. “This is the moment to define a critical layer in the AI stack, and Tensormesh is well positioned to own it.”

The Tensormesh beta is available now. Sign up at tensormesh.ai.

About Tensormesh

Tensormesh is the AI infrastructure optimization company enabling up to 10x faster inference while keeping full control of data and deployment. Founded by faculty and researchers from the University of Chicago, UC Berkeley, and Carnegie Mellon, Tensormesh commercializes state-of-the-art research to eliminate GPU waste and latency. The software captures and reuses intermediate data other systems discard, delivering breakthrough performance on infrastructure customers own and control. Learn more at www.tensormesh.ai.

Media Contact
Sam Polstein
tensormesh@deeptech.agency

Recent Blog Posts

Enterprise AI Vendor Lock-In: What It Costs When Your Provider Pulls Access (April 22, 2026)

Introducing Tensormesh Beta 2.2: Serverless Inference & $0 Cached Input Tokens (April 15, 2026)

How We Optimized Redis for LLM KV Cache: 0.3 GB/s to 10 GB/s (April 8, 2026)

Introducing Tensormesh Beta 2: One-Click LLM Deployment, New UI & Real-Time Cost Savings (February 25, 2026)

Agent Skills Caching with CacheBlend: Achieving 85% Cache Hit Rates for LLM Agents (February 18, 2026)

Beyond Prefix Caching: How Non-Prefix Caching Achieves 25x Better Hit Rates for AI Agents (February 11, 2026)

The Open Source Revolution: Why Open-Weight AI Models Are Redefining the Future (February 4, 2026)

LMCache's Production-Ready P2P Architecture Powers Tensormesh's 5-10x Cost Reduction (January 28, 2026)

The Document Reprocessing Problem: How LLMs Waste 93% of Your GPU Budget (January 21, 2026)

Building Tensormesh: A Conversation with the CEO (Junchen Jiang) (January 15, 2026)

The Hidden Metric That's Destroying Your AI Agent's Performance & Budget (January 7, 2026)

LMCache ROI Calculator: When KV Cache Storage Reduces AI Inference Costs (December 17, 2025)

AI Inference Costs in 2025: The $255B Market's Energy Crisis and Path to Sustainable Scaling (December 10, 2025)

New Hugging Face Integration: Access 300,000+ AI Models with Real-Time Performance Monitoring (December 3, 2025)

The AI Inference Throughput Challenge: Scaling LLM Applications Efficiently (November 26, 2025)

Solving AI Inference Latency: How Slow Response Times Cost You Millions in Revenue (November 19, 2025)

GPU Cost Crisis: How Model Memory Caching Cuts AI Inference Costs Up to 10× (November 13, 2025)

Comparing LLM Serving Stacks: Introduction to Tensormesh Benchmark (October 21, 2025)