The NVIDIA H200 NVL is an NVLink-optimized variant of the H200, designed for deployment in dual-GPU configurations connected via an NVLink bridge. Announced alongside the H200 SXM in November 2023, the NVL variant began shipping in 2024 and provides a path to H200 memory capacity in PCIe-compatible server platforms.
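The H200 NVL has the same 141 GB of HBM3e memory and 4.8 TB/s of memory bandwidth as the H200 SXM. The key differentiator is the form factor: it is a PCIe-style card with NVLink connectors, so cards can be joined by an NVLink bridge, which NVIDIA's specifications rate at 900 GB/s per GPU in 2- or 4-way configurations. A bridged pair offers a combined 282 GB of GPU memory (2 × 141 GB) for large-model inference. The two GPUs remain separate devices, so a model is sharded across them rather than placed in a single flat address space.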
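The bandwidth figure matters for inference because token-by-token generation is often limited by how quickly model weights can be streamed from HBM. The following is a minimal back-of-the-envelope sketch, not a benchmark. It assumes the weights are split evenly across a bridged pair, that every weight is read once per generated token, and that KV-cache traffic, inter-GPU communication, and kernel overheads are ignored. The function name and its parameters are illustrative, not part of any NVIDIA tool.

```python
# Rough upper bound on single-stream decode speed when generation is
# limited by HBM reads of the model weights (an assumed, simplified model).

HBM_BANDWIDTH_TBPS = 4.8  # per-card memory bandwidth quoted in the text
NUM_CARDS = 2             # one NVLink-bridged pair


def max_decode_tokens_per_second(params_billion, bytes_per_param,
                                 num_cards=NUM_CARDS,
                                 bandwidth_tbps=HBM_BANDWIDTH_TBPS):
    """Return an optimistic tokens/s ceiling for a bandwidth-bound decoder.

    Assumes weights are sharded evenly, so the cards read their shards
    in parallel and the aggregate bandwidth is num_cards * bandwidth.
    """
    weight_bytes = params_billion * 1e9 * bytes_per_param
    aggregate_bytes_per_second = num_cards * bandwidth_tbps * 1e12
    return aggregate_bytes_per_second / weight_bytes


if __name__ == "__main__":
    # Hypothetical 70-billion-parameter model stored at 2 bytes per weight.
    ceiling = max_decode_tokens_per_second(params_billion=70, bytes_per_param=2)
    print(f"Bandwidth-bound ceiling: {ceiling:.1f} tokens/s per stream")
```

Real throughput will be lower than this ceiling, and batching several requests changes the picture because the same weight reads are shared across the batch.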
The dual-card NVL configuration is an alternative for organizations that cannot deploy SXM-based systems such as DGX or HGX. It gives existing PCIe server infrastructure access to the H200's larger memory without requiring a specialized baseboard. TDP is configurable up to 600 W per card.
The H200 NVL is particularly suited to inference workloads that need large memory capacity but not the full 8-GPU scaling of SXM systems. Common deployments include inference servers running large language models, recommendation systems with large embedding tables, and memory-heavy scientific computing applications.
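Whether a given model belongs on one card or a bridged pair is largely a capacity question. The sketch below estimates the memory needed for weights plus a transformer KV cache and compares it with the 141 GB per card quoted above. The helper names, the 10% headroom reserved for activations and runtime overhead, and the example model shape (80 layers, 8 KV heads, head dimension 128) are all assumptions chosen for illustration. Actual usage depends on the serving framework.

```python
# Capacity check for an LLM on one or two H200 NVL cards (illustrative only).

GB = 1e9                  # decimal gigabytes, as used for HBM capacity
H200_NVL_MEMORY_GB = 141  # per-card capacity quoted in the text


def weights_gb(params_billion, bytes_per_param):
    """Memory for model weights, in GB."""
    return params_billion * 1e9 * bytes_per_param / GB


def kv_cache_gb(num_layers, num_kv_heads, head_dim,
                context_tokens, batch_size, bytes_per_value=2):
    """Memory for the KV cache: one K and one V tensor per layer."""
    values = (2 * num_layers * num_kv_heads * head_dim
              * context_tokens * batch_size)
    return values * bytes_per_value / GB


def fits(total_gb, num_cards, headroom=0.9):
    """True if total_gb fits in the usable share of num_cards cards.

    headroom is an assumed fraction left after activations and runtime
    overhead; it is not a vendor-specified number.
    """
    return total_gb <= num_cards * H200_NVL_MEMORY_GB * headroom


if __name__ == "__main__":
    # Hypothetical 70B-parameter model at 2 bytes per weight,
    # serving 8 concurrent sequences of 8192 tokens each.
    total = weights_gb(70, 2) + kv_cache_gb(80, 8, 128, 8192, 8)
    for cards in (1, 2):
        print(f"{cards} card(s): need {total:.1f} GB, fits = {fits(total, cards)}")
```

Under these assumptions the example workload exceeds a single card but fits comfortably on a bridged pair, which is the situation the dual-card configuration targets.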