DDN Unveils Infinia 2.0: Sub-Millisecond Data Access for AI Development at Scale
Data intelligence company DDN has emerged from relative obscurity with a compelling story about solving one of AI development's most persistent challenges: getting data to GPUs fast enough to maximize training efficiency. At the recent IT Press Tour, DDN demonstrated how their new Infinia 2.0 platform is reshaping the infrastructure stack for AI applications, particularly for developers building at scale.
The GPU Utilization Problem
DDN's value proposition centers on a critical bottleneck many machine learning engineers face: GPU idle time during model training. As CEO Paul Bloch explained, "When people buy 20,000 GPUs, that's a $3 billion investment. The last thing you want is idle GPUs." DDN's data shows that without optimized storage, GPUs can sit waiting for data up to 60% of the time during training workflows.
The company's solution involves positioning its data intelligence platform between applications and infrastructure to eliminate these bottlenecks. According to DDN, their platform increases GPU efficiency by 30% while reducing data center footprint and power consumption tenfold.
Infinia's Technical Architecture
The technical innovation behind Infinia 2.0 represents a fundamental departure from traditional storage architectures. CTO Sven Oehme detailed how the platform uses a native key-value store at its foundation, with all protocols—S3, GCS, POSIX, SQL—operating as peers rather than layers.
"Traditional storage systems have a block layer, then a file system layer, then an object layer on top," Oehme explained. "In Infinia, all data services are peers accessing the same key-value store directly. This eliminates multiple software layers and dramatically reduces latency."
The architecture underpins some impressive performance claims: DDN reports object listing 100x faster than AWS S3's, 25x lower time-to-first-byte latency, and the ability to handle millions of concurrent operations. In production at Elon Musk's xAI facility, Infinia manages nearly 600 petabytes with over one million concurrent jobs.
Developer Integration Points
For developers, DDN provides multiple integration pathways. The platform offers native S3 compatibility and adds support for Google Cloud Storage APIs, potentially making DDN the only on-premises solution that can run Google Cloud applications natively, without requiring data transformation.
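In practice, S3 compatibility means standard SDKs should work when pointed at the cluster. The sketch below uses boto3 with a custom endpoint; the endpoint URL and credentials are hypothetical placeholders, not documented DDN values.

```python
import boto3

# Hypothetical endpoint and credentials; an S3-compatible Infinia cluster
# would supply its own values. Application code is otherwise unchanged
# from what would run against AWS S3.
s3 = boto3.client(
    "s3",
    endpoint_url="https://infinia.example.internal:9000",
    aws_access_key_id="EXAMPLE_ACCESS_KEY",
    aws_secret_access_key="EXAMPLE_SECRET_KEY",
)

s3.put_object(Bucket="training-data", Key="shard-0001.tar", Body=b"...")
resp = s3.list_objects_v2(Bucket="training-data")
print(resp.get("KeyCount", 0))
```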
The company is releasing a comprehensive SDK with bindings for multiple languages, including Python, Java, Go, and Rust. This allows developers to build custom data services directly on top of Infinia's key-value store, leveraging its high-performance characteristics.
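DDN has not published the SDK surface, so the sketch below invents a minimal key-value client purely for illustration. The point is the pattern it enables: a small custom data service (here, a toy checkpoint catalog) built directly on the key-value store rather than on a file or object layer.

```python
import json
import time

class DictKV:
    """In-memory stand-in for the (unpublished) SDK's key-value client."""
    def __init__(self):
        self._d = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._d[key] = value

    def get(self, key: bytes) -> bytes:
        return self._d[key]


class CheckpointCatalog:
    """Toy data service: index model checkpoints by run and step."""
    def __init__(self, kv):
        self.kv = kv  # anything exposing put(key, value) / get(key)

    def register(self, run_id: str, step: int, location: str) -> None:
        record = {"location": location, "ts": time.time()}
        self.kv.put(f"ckpt/{run_id}/{step:012d}".encode(),
                    json.dumps(record).encode())

    def lookup(self, run_id: str, step: int) -> dict:
        return json.loads(self.kv.get(f"ckpt/{run_id}/{step:012d}".encode()))


catalog = CheckpointCatalog(DictKV())
catalog.register("run-42", 1000, "/checkpoints/run-42/step-1000")
print(catalog.lookup("run-42", 1000)["location"])
```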
Particularly relevant for AI development workflows, DDN supports NVIDIA's KV cache offloading, which lets GPU state be stored and restored quickly. This improves resource utilization for inference workloads, letting users pause and resume sessions.
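Conceptually, KV cache offloading means serializing a model's attention key/value tensors to fast storage and restoring them later, so the prompt does not have to be recomputed on resume. The PyTorch sketch below models only that idea, not DDN's or NVIDIA's actual integration.

```python
import io
import torch

# Minimal sketch of the pause/resume idea: persist a transformer's
# per-layer attention key/value tensors, then restore them later.

def offload_kv_cache(past_key_values, store: dict, session_id: str) -> None:
    """Serialize the cached attention state for a paused session."""
    buf = io.BytesIO()
    torch.save(past_key_values, buf)
    store[session_id] = buf.getvalue()  # stand-in for a put to fast storage

def restore_kv_cache(store: dict, session_id: str):
    """Bring the cached attention state back before resuming decode."""
    return torch.load(io.BytesIO(store[session_id]))

# Toy shapes: 2 layers, batch 1, 8 heads, 128 cached tokens, head dim 64.
cache = [(torch.randn(1, 8, 128, 64), torch.randn(1, 8, 128, 64))
         for _ in range(2)]
store = {}  # in-memory stand-in for an external store
offload_kv_cache(cache, store, "session-abc")
resumed = restore_kv_cache(store, "session-abc")
assert torch.equal(cache[0][0], resumed[0][0])
```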
Multi-Cloud Strategy
DDN's partnership with Google Cloud represents a significant shift for the traditionally HPC-focused company. The Google Cloud Managed Lustre service, powered by DDN's EXAScaler technology, provides developers with persistent parallel file systems that can scale from small experiments to massive training runs.
"It's not marketplace—this is Google's first-party offering," Bloch emphasized. "Every Google rep is incentivized to sell this." The integration enables developers to seamlessly move workloads between on-premises and cloud environments, using the same storage APIs and performance characteristics.
Real-World Performance Gains
DDN's most compelling demonstration involved a retrieval-augmented generation (RAG) pipeline running on AWS infrastructure. By simply replacing AWS S3 with Infinia software running on AWS virtual machines, they achieved a 22x performance improvement while reducing costs by over 60%.
The demo used unmodified AWS microservices, changing only the storage layer. This suggests developers could potentially achieve significant performance gains without rewriting applications—a crucial consideration for teams with existing AI infrastructure investments.
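One plausible reading of why the swap was non-invasive: services that resolve their object-store endpoint from configuration can be pointed at any S3-compatible target without code changes. A minimal sketch, with hypothetical configuration values:

```python
import os
import boto3

def make_store_client():
    # Unset -> default AWS S3; set -> any S3-compatible endpoint
    # (an Infinia endpoint here would be a deployment-specific value).
    endpoint = os.environ.get("OBJECT_STORE_ENDPOINT")
    return boto3.client("s3", endpoint_url=endpoint)

def fetch_document(bucket: str, key: str) -> bytes:
    """A RAG service's retrieval step, identical under either backend."""
    s3 = make_store_client()
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read()
```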
Industry Context
DDN's emergence coincides with the rapid evolution of the AI infrastructure market. The company reports powering over 700,000 GPUs across customer deployments, with individual clusters reaching 200,000 GPUs. Their recent $300 million investment from Blackstone at a $5 billion valuation reflects growing recognition that data infrastructure will be critical as AI moves from training to inference at scale.
"We're seeing customers asking for multiple exabytes in single deployments," Oehme noted. "The data requirements are growing faster than compute in many cases."
Looking Forward
For developers building AI applications, DDN's technology suggests a future where data infrastructure becomes increasingly software-defined and protocol-agnostic. The ability to run the same storage platform everywhere from edge devices to exascale supercomputers, with consistent APIs across cloud and on-premises environments, could significantly simplify the development and deployment of distributed AI systems.
As AI applications transition from foundation models to production inference workloads serving millions of users, the ultra-low-latency data access DDN emphasizes may become a table-stakes requirement rather than a competitive advantage. For now, the company's technology appears to be pushing the boundaries of what's possible in high-performance data infrastructure.