HPE used NVIDIA GTC 2026 to announce the most significant refresh of its AI infrastructure portfolio since the company started building exascale systems. The updates span two product lines — HPE's supercomputing platform and its AI Factory portfolio — and introduce both NVIDIA's new Vera Rubin architecture and the first Vera CPU compute blade from any vendor.
Here's what shipped and why it matters for developers and engineers building on large-scale AI infrastructure.
Supercomputing: First Vera CPU Blade and Quantum-X800 Networking
HPE is adding two new capabilities to the HPE Cray Supercomputing GX5000, its second-generation exascale platform designed to unify AI and HPC workloads.
The first NVIDIA Vera CPU compute blade. The HPE Cray Supercomputing GX240 is a liquid-cooled blade featuring up to 16 NVIDIA Vera CPUs. The Vera CPU is NVIDIA's successor to the Grace CPU, built on custom ARM-based "Olympus" cores with 88 cores per chip. A single rack supports up to 40 blades — that's 640 Vera CPUs and 56,320 ARM cores per rack. HPE claims industry-leading density on the Vera platform.
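The density figures above follow directly from the per-blade and per-rack numbers. A quick back-of-envelope check (assuming every blade slot is fully populated, which is the maximum configuration HPE cites):

```python
# Rack density math for the GX240 blade, from the announced figures.
# Assumption: all 40 blade slots populated at the 16-CPU maximum.
BLADES_PER_RACK = 40
CPUS_PER_BLADE = 16
CORES_PER_CPU = 88  # NVIDIA "Olympus" ARM cores per Vera CPU

cpus_per_rack = BLADES_PER_RACK * CPUS_PER_BLADE
cores_per_rack = cpus_per_rack * CORES_PER_CPU

print(f"CPUs per rack:  {cpus_per_rack}")    # 640
print(f"Cores per rack: {cores_per_rack:,}") # 56,320
```

The 56,320-core figure HPE quotes is exactly this product; it assumes maximum blade population, so partially populated racks scale down linearly.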
For developers working on CPU-intensive AI workloads — data preprocessing, feature engineering, inference serving, simulation — this is the ARM density story reaching a new scale. The Vera CPU is purpose-built for AI data center workloads, not repurposed from another market.
NVIDIA Quantum-X800 InfiniBand networking. The GX5000 now supports Quantum-X800 InfiniBand switches with 144 ports, each running at 800 Gb/s. For teams running distributed training or large-scale inference across thousands of GPUs, the networking fabric determines how efficiently you can scale. Quantum-X800 also includes power-efficiency features such as low-power link states and power profiling — relevant when your electricity bill is a meaningful line item.
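The per-port figure implies a large aggregate per switch. A minimal sketch of that arithmetic, assuming all 144 ports run at the full 800 Gb/s line rate simultaneously:

```python
# Aggregate one-direction switch bandwidth for Quantum-X800,
# assuming all 144 ports at full 800 Gb/s line rate (best case).
PORTS = 144
GBPS_PER_PORT = 800

aggregate_tbps = PORTS * GBPS_PER_PORT / 1000  # Gb/s -> Tb/s
print(f"Aggregate switch bandwidth: {aggregate_tbps:.1f} Tb/s")  # 115.2 Tb/s
```

Real-world throughput depends on topology, congestion, and collective-communication patterns, but the raw port count times line rate is the ceiling the fabric is built around.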
HPE has built the three most powerful exascale supercomputers in the world, as ranked by the November 2025 TOP500 list. This update keeps the Cray platform current with NVIDIA's latest silicon.
AI Factory: Vera Rubin NVL72 and Double-Density GPU Servers
The AI Factory portfolio targets service providers, sovereign entities, and large enterprises running AI at scale. Two new systems stand out.
NVIDIA Vera Rubin NVL72 by HPE. This is the flagship: a rack-scale AI system engineered for frontier models exceeding 1 trillion parameters. Each rack includes 36 Vera CPUs, 72 Rubin GPUs, NVLink interconnects, ConnectX-9 SuperNICs, and BlueField-4 networking. HPE provides liquid-cooling integration, services, and data-center design expertise.
The Vera Rubin platform represents a generational leap. NVIDIA claims 3.3x to 5x inference performance improvement over Blackwell in FP4 workloads, a 10x reduction in inference token costs, and 4x fewer GPUs needed to train equivalent Mixture-of-Experts models. For developers building large-scale AI applications, the math changes — workloads that required massive GPU clusters on Blackwell become tractable on smaller Vera Rubin deployments.
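To see how those ratios change a budget, here is an illustrative calculation. The baseline numbers (cluster size, dollars per million tokens) are made-up placeholders, not HPE or NVIDIA figures; only the 4x and 10x ratios come from the announcement.

```python
# Illustrative only: applying NVIDIA's claimed ratios to a hypothetical budget.
# Baseline values below are placeholders, not vendor figures.
blackwell_gpus_for_moe_training = 10_000  # hypothetical Blackwell cluster size
blackwell_cost_per_m_tokens = 2.00        # hypothetical $ per 1M inference tokens

rubin_gpus = blackwell_gpus_for_moe_training / 4          # claim: 4x fewer GPUs
rubin_cost_per_m_tokens = blackwell_cost_per_m_tokens / 10  # claim: 10x cheaper tokens

print(f"GPUs needed on Vera Rubin: {rubin_gpus:,.0f}")
print(f"Cost per 1M tokens:        ${rubin_cost_per_m_tokens:.2f}")
```

Under those placeholder inputs, a 10,000-GPU MoE training job shrinks to 2,500 GPUs and a $2.00-per-million-token serving cost drops to $0.20 — which is what "the math changes" means in practice, whatever the real baseline is.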
HPE Compute XD700. This is an OCP-inspired AI server built on NVIDIA HGX Rubin NVL8. Each rack supports up to 128 Rubin GPUs — double the GPU density of the previous generation (the HPE ProLiant XD685). The design targets reduced space, power, and cooling costs while increasing training and inference throughput.
For teams that don't need a full NVL72 rack but want high-density GPU access for model training and inference, the XD700 is the more practical entry point.
Enterprise: Security, Multi-Tenancy, and Agentic AI
The enterprise announcements focus on HPE Private Cloud AI, the company's turnkey AI factory for organizations that need on-premises AI infrastructure.
Scaling to 128 GPUs. New network expansion racks let HPE Private Cloud AI deployments scale up to 128 GPUs while maintaining a consistent operational experience. Previously, Private Cloud AI topped out at lower GPU counts, limiting the size of models organizations could run on-prem.
Air-gapped configurations. For sovereign deployments and regulated industries, HPE Private Cloud AI is now available in an air-gapped configuration — fully isolated from external networks. If your data can't leave the building, this is the hardware option that makes AI possible under those constraints.
Confidential AI. HPE ProLiant DL380a Gen12 servers and Private Cloud AI systems are being certified for Fortanix Confidential AI, built on NVIDIA Blackwell Confidential Computing GPUs. Sensitive data gets processed without exposure.
CrowdStrike integration. CrowdStrike delivers agentic security for HPE Private Cloud AI — AI-powered threat detection protecting AI infrastructure, models, and the AI agents running across enterprise environments. As organizations deploy more autonomous AI agents, securing the infrastructure on which those agents run becomes a distinct problem.
NVIDIA AI-Q and Omniverse blueprints. Private Cloud AI ships with updated NVIDIA blueprints, including AI-Q for building customizable AI agents and Omniverse for digital twins. These are pre-configured stacks — hardware and software together — that reduce time to first deployment.
Agentic AI services hub. HPE Services is launching an agents hub for structured enterprise adoption of agentic AI, developing and validating agents powered by NVIDIA Nemotron models. This creates reusable patterns for organizations that want to deploy AI agents but need a structured approach rather than experimentation.
What Developers Should Know
Three things matter most from these announcements.
The Vera CPU is real and shipping in blades. HPE is the first vendor to put it in a compute blade product. If you're building for ARM-based AI infrastructure, the 88-core Olympus architecture, with 640 CPUs per rack, sets a new density benchmark.
Vera Rubin changes the economics. A 10x reduction in inference token costs and 4x fewer GPUs for MoE training means workloads that were cost-prohibitive on Blackwell become viable. For AI startups and research labs, the cost-per-token math is the number that determines what you can build.
Enterprise AI security is becoming a product category. Air-gapped configurations, confidential computing certification, CrowdStrike agentic security integration — these aren't afterthoughts. They're shipping alongside the compute hardware as part of the same product stack. As AI agents proliferate in enterprise environments, the infrastructure securing those agents matters as much as the infrastructure running them.
HPE's full GTC 2026 portfolio is available starting today. Initial Vera Rubin samples are expected to ship to tier-one customers by late 2026, with full production in early 2027.