The NVIDIA B200 is the first datacenter GPU built on the Blackwell architecture, announced at GTC in March 2024. It is a major architectural step beyond Hopper, introducing a dual-die GPU design and significant advances in AI training and inference capability. Initial shipments began in late 2024.
The B200 uses a dual-die design: two GPU dies manufactured on TSMC's custom 4NP process, containing a combined 208 billion transistors, about 2.6x the H100's 80 billion. The two dies sit in a single package and are connected by a 10 TB/s NV-HBI (High Bandwidth Interface) link, so the pair presents to software as one unified GPU.
Memory capacity is 180 GB of HBM3e with 8 TB/s of aggregate bandwidth, more than double the H100's capacity and roughly 2.4x its bandwidth. Each GPU includes 5th-generation NVLink with 1.8 TB/s of bidirectional bandwidth, enabling NVL72 configurations in which 72 GPUs communicate with one another at full NVLink bandwidth.
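The ratios quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch: the B200 figures are the ones given in the text, while the H100 figures (80 billion transistors, 80 GB, 3.35 TB/s, for the SXM variant) are assumptions brought in for comparison. The dictionary and function names are illustrative only.

```python
# B200 figures as quoted in the text.
B200 = {
    "transistors_bn": 208,   # billions of transistors, both dies combined
    "memory_gb": 180,        # HBM3e capacity
    "mem_bw_tbs": 8.0,       # aggregate memory bandwidth, TB/s
    "nvlink_tbs": 1.8,       # bidirectional NVLink bandwidth per GPU, TB/s
}

# H100 (SXM) reference figures: assumptions, not stated in the text.
H100 = {
    "transistors_bn": 80,
    "memory_gb": 80,
    "mem_bw_tbs": 3.35,
}


def ratio(key):
    """Return the B200-to-H100 ratio for one specification."""
    return B200[key] / H100[key]


def nvl_aggregate_tbs(num_gpus=72):
    """Sum of per-GPU NVLink bandwidth across an NVLink domain, in TB/s."""
    return num_gpus * B200["nvlink_tbs"]


for key in H100:
    print(f"{key}: {ratio(key):.2f}x")
print(f"NVL72 aggregate NVLink bandwidth: {nvl_aggregate_tbs():.1f} TB/s")
```

The aggregate figure is simply 72 times the per-GPU number; it says nothing about switch topology or achievable throughput in a real workload.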
Key architectural innovations include 5th-generation Tensor Cores with native FP4 precision support, which halves the bits per value relative to FP8 and so further improves inference efficiency. The 2nd-generation Transformer Engine adds micro-tensor scaling, which applies scale factors to small blocks of values rather than to whole tensors, giving more granular precision control. New decompression engines accelerate database and data analytics workloads.
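To illustrate why finer-grained scaling matters at 4-bit precision, here is a minimal sketch of block-scaled FP4 quantization in plain Python. It is not NVIDIA's implementation. The E2M1 value set and the power-of-two shared scale follow the general convention of the OCP Microscaling formats, and the block size, rounding rule, and all function names are assumptions chosen for illustration; the text does not specify the exact format the B200 uses.

```python
import math

# Magnitudes representable by a 4-bit E2M1 float (1 sign, 2 exponent, 1 mantissa bit).
FP4_E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_MAX = FP4_E2M1_VALUES[-1]


def quantize_block(block):
    """Quantize one block to FP4 magnitudes plus one shared power-of-two scale."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    # Smallest power-of-two scale such that amax / scale fits within FP4_MAX.
    # (Assumption: chosen to avoid clipping; real formats may pick the scale differently.)
    scale = 2.0 ** math.ceil(math.log2(amax / FP4_MAX))
    quantized = []
    for x in block:
        # Round the scaled magnitude to the nearest representable FP4 value.
        mag = min(FP4_E2M1_VALUES, key=lambda v: abs(abs(x) / scale - v))
        quantized.append(math.copysign(mag, x))
    return scale, quantized


def quantize(values, block_size=32):
    """Split values into consecutive blocks and quantize each with its own scale."""
    return [
        quantize_block(values[i:i + block_size])
        for i in range(0, len(values), block_size)
    ]


def dequantize(blocks):
    """Reconstruct approximate values from (scale, quantized block) pairs."""
    return [scale * q for scale, qs in blocks for q in qs]


# Compare one scale per block of 4 against one scale for the whole list.
values = [0.01, 0.02, -0.03, 0.04, 12.0, -24.0, 36.0, 48.0]
per_block = dequantize(quantize(values, block_size=4))
per_tensor = dequantize(quantize(values, block_size=8))
print("per-block :", per_block)
print("per-tensor:", per_tensor)
```

With a single scale for all eight values, the four small entries fall below the smallest nonzero FP4 step and round to zero. With one scale per block of four, they keep approximately their original magnitude, which is the kind of benefit that block-level scaling is meant to provide.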