How much VRAM does the B200 have?

The B200 is listed with 180 GB of usable HBM3e memory; the 192 GB figure also shown on this page is the full capacity on the package.

When was the B200 released?

The B200 was released in 2024 by NVIDIA, based on the Blackwell architecture in the SXM form factor.

How much does it cost to rent the B200?

The B200 rents for $3.69/hr at the cheapest marketplace, with a typical listing-weighted median of $5.89/hr across 12 marketplace partners. Prices are updated daily.

Is the B200 good for AI training or inference?

The B200 delivers 2250 FP16 TFLOPS (dense, no sparsity) and is listed with 180 GB of HBM3e (192 GB compute VRAM). It is suited to large-model training and high-throughput inference.

NVIDIA · Blackwell · 2024

B200
AIMC Specifications

Complete technical reference: architecture, memory, performance, and live rental pricing.

Memory

180 GB

HBM3e

Form Factor

SXM

Datacenter

FP16 Compute

2250

TFLOPS (dense)

Live Rental Pricing

Current market pricing across all authorized partners, updated daily.

Cheapest

$3.69/hr

Typical (median)

$5.89/hr

Marketplaces

12
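
As a rough illustration of how the hourly rates above translate into a job budget, the Python sketch below multiplies GPU count, hours, and rate. Only the two hourly rates come from this page; the function name, the assumption that rates are quoted per GPU-hour, and the example job (8 GPUs for 72 hours) are hypothetical.

```python
def rental_cost(gpus: int, hours: float, rate_per_gpu_hour: float) -> float:
    """Total cost in USD for renting `gpus` GPUs for `hours` hours."""
    if gpus < 0 or hours < 0 or rate_per_gpu_hour < 0:
        raise ValueError("inputs must be non-negative")
    return round(gpus * hours * rate_per_gpu_hour, 2)


# Hourly rates listed on this page (assumed to be per GPU-hour).
CHEAPEST_RATE = 3.69
MEDIAN_RATE = 5.89

# Hypothetical job: one 8-GPU node running for three days.
low_estimate = rental_cost(8, 72, CHEAPEST_RATE)
typical_estimate = rental_cost(8, 72, MEDIAN_RATE)

print(f"cheapest listing: ${low_estimate:,.2f}")
print(f"median listing:   ${typical_estimate:,.2f}")
```

The gap between the two estimates shows why the choice of marketplace matters more as job length grows; storage, egress, and minimum-commitment terms are not modeled here.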

Full Specifications

Factual specifications from manufacturer datasheets.

Manufacturer	NVIDIA
Architecture	Blackwell
Memory Capacity	180 GB
Memory Type	HBM3e
Form Factor	SXM
Release Year	2024
GPU Class	Datacenter
FP16 TFLOPS (dense)	2250 (estimated)
VRAM (compute)	192 GB
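
To put the memory capacity in practical terms, the following Python sketch estimates whether a model's weights alone fit in 180 GB at a given precision. The bytes-per-parameter table is standard arithmetic, but the helper names and the 70-billion-parameter example are assumptions for illustration, and the estimate ignores activations, KV cache, and optimizer state.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "fp8": 1.0, "fp4": 0.5}


def weight_footprint_gb(params_billion: float, dtype: str) -> float:
    """Memory for weights only, in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[dtype]


def weights_fit(params_billion: float, dtype: str, capacity_gb: float = 180.0) -> bool:
    """True if the weights alone fit in `capacity_gb` of GPU memory."""
    return weight_footprint_gb(params_billion, dtype) <= capacity_gb


# Hypothetical 70B-parameter model on a single 180 GB GPU.
for dtype in ("fp32", "fp16", "fp8", "fp4"):
    size = weight_footprint_gb(70, dtype)
    print(f"{dtype}: {size:.0f} GB, fits={weights_fit(70, dtype)}")
```

A real deployment needs headroom beyond the weights, so a result of `fits=True` is a necessary condition rather than a guarantee.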

Architecture & Use Cases

Technical overview of the B200.

The NVIDIA B200 is the first Blackwell-architecture datacenter GPU, announced at GTC in March 2024. It is a major architectural change from Hopper, introducing a dual-die GPU design and significant advances in AI training and inference capability. Initial shipments began in late 2024.

The B200 uses a dual-die design: two GPU dies manufactured on TSMC's custom 4NP process, with a combined 208 billion transistors, more than 2.5x the H100's 80 billion. The two dies are connected by a 10 TB/s NV-HBI (High Bandwidth Interface) link within a single package and appear to software as one unified GPU.

Memory capacity is 180 GB of HBM3e with 8 TB/s of aggregate bandwidth, more than double the H100's capacity and about 2.4x its bandwidth. The chip includes 5th-generation NVLink with 1.8 TB/s of bidirectional bandwidth per GPU, enabling NVL72 configurations in which 72 GPUs communicate at full NVLink bandwidth.
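
The comparisons with the H100 can be checked with a few lines of arithmetic. In the Python sketch below, the B200 figures come from this page, while the H100 SXM baseline (80 GB, 3.35 TB/s) is an assumption taken from NVIDIA's published H100 specifications and does not appear on this page. The last line derives a roofline-style ridge point: the arithmetic intensity above which a kernel is limited by compute rather than by memory bandwidth.

```python
# B200 figures as listed on this page.
B200_MEMORY_GB = 180.0
B200_BANDWIDTH_TBPS = 8.0
B200_FP16_TFLOPS = 2250.0  # dense, listed as estimated

# H100 SXM baseline (assumption: not stated on this page).
H100_MEMORY_GB = 80.0
H100_BANDWIDTH_TBPS = 3.35

capacity_ratio = B200_MEMORY_GB / H100_MEMORY_GB
bandwidth_ratio = B200_BANDWIDTH_TBPS / H100_BANDWIDTH_TBPS

# TFLOPS divided by TB/s gives FLOPs per byte moved from memory.
ridge_point_flops_per_byte = B200_FP16_TFLOPS / B200_BANDWIDTH_TBPS

print(f"capacity ratio:  {capacity_ratio:.2f}x")
print(f"bandwidth ratio: {bandwidth_ratio:.2f}x")
print(f"FP16 ridge point: {ridge_point_flops_per_byte:.2f} FLOP/byte")
```

Under these assumptions, a kernel performing fewer FP16 operations per byte than the ridge point would be bandwidth-bound, which is typical of small-batch inference.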

Key architectural innovations include 5th-generation Tensor Cores with native FP4 precision support, which further improves inference efficiency. The 2nd-generation Transformer Engine adds micro-tensor scaling for finer-grained precision control. New decompression engines accelerate database and data analytics workloads.
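
The idea behind fine-grained scaling can be sketched in plain Python. This is a simplified illustration, not NVIDIA's implementation: it assumes the E2M1 set of FP4 magnitudes, uses an ordinary float as each block's scale factor, and the function names and block sizes are hypothetical. Each block of values shares one scale, so a block of small values is not flattened to zero by a large value elsewhere in the tensor.

```python
# Magnitudes representable by a 4-bit E2M1 float (sign handled separately).
FP4_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def quantize_block(block):
    """Return (scale, codes); each code is a signed FP4-representable value."""
    max_abs = max((abs(x) for x in block), default=0.0)
    # Map the largest magnitude in the block onto the largest FP4 value.
    scale = max_abs / FP4_MAGNITUDES[-1] if max_abs > 0 else 1.0
    codes = []
    for x in block:
        target = abs(x) / scale
        nearest = min(FP4_MAGNITUDES, key=lambda m: abs(m - target))
        codes.append(nearest if x >= 0 else -nearest)
    return scale, codes


def quantize(values, block_size=32):
    """Split `values` into blocks and quantize each with its own scale."""
    return [quantize_block(values[i:i + block_size])
            for i in range(0, len(values), block_size)]


def dequantize(blocks):
    """Reconstruct approximate values from (scale, codes) pairs."""
    return [scale * code for scale, codes in blocks for code in codes]


values = [0.01, -0.02, 0.03, 100.0, -50.0, 25.0]
per_block = dequantize(quantize(values, block_size=3))   # two scales
per_tensor = dequantize(quantize(values, block_size=6))  # one shared scale

print("per-block: ", per_block)
print("per-tensor:", per_tensor)
```

With one scale for the whole list, the three small values round to zero; with one scale per block of three, they are recovered almost exactly. Smaller blocks cost more scale-factor storage, which is the trade-off that block-scaled formats manage.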