How much VRAM does the H200 have?

The H200 has 141 GB of HBM3e memory.

When was the H200 released?

NVIDIA released the H200 in 2024. It is based on the Hopper architecture and ships in the SXM form factor.

How much does it cost to rent the H200?

The H200 rents for $2.19/hr at the cheapest marketplace, with a listing-weighted median of $4.39/hr across 15 marketplace partners. Prices are updated daily.

Is the H200 good for AI training or inference?

The H200 delivers 989 FP16 TFLOPS (dense, no sparsity) with 141 GB of VRAM. It is suited to both large-model training and high-throughput inference.


NVIDIA · Hopper · 2024

H200
AIMC Specifications


Complete technical reference: architecture, memory, performance, and live rental pricing.

Memory	141 GB HBM3e
Form Factor	SXM (Datacenter)
FP16 Compute	989 TFLOPS (dense)


Live Rental Pricing

Current market pricing across all authorized partners, updated daily.

Cheapest	$2.19/hr
Typical (median)	$4.39/hr
Marketplaces	15
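
To turn the hourly rates above into a job budget, a minimal sketch follows. The 8-GPU, 72-hour job is a hypothetical example rather than a figure from this page, and the helper name rental_cost is illustrative only. The rates are per GPU-hour as quoted above and will drift as listings change.

```python
# Hourly rates quoted in the pricing section above (USD per GPU-hour).
CHEAPEST_USD_PER_HR = 2.19
MEDIAN_USD_PER_HR = 4.39


def rental_cost(rate_per_hr: float, gpus: int, hours: float) -> float:
    """Total rental cost in USD for `gpus` GPUs rented for `hours` hours."""
    return rate_per_hr * gpus * hours


# Hypothetical job (an assumption, not from the page): one 8-GPU node for 72 hours.
GPUS, HOURS = 8, 72

low = rental_cost(CHEAPEST_USD_PER_HR, GPUS, HOURS)
typical = rental_cost(MEDIAN_USD_PER_HR, GPUS, HOURS)

print(f"cheapest listing: ${low:,.2f}")
print(f"median listing:   ${typical:,.2f}")
```

For this hypothetical job the gap between the cheapest and the median listing is roughly a factor of two, so the choice of marketplace can matter as much as the choice of GPU.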


Full Specifications

Factual specifications from manufacturer datasheets.

Manufacturer	NVIDIA
Architecture	Hopper
Memory Capacity	141 GB
Memory Type	HBM3e
Form Factor	SXM
Release Year	2024
GPU Class	Datacenter
FP16 TFLOPS (dense)	989
VRAM (compute)	141 GB

Architecture & Use Cases

Technical overview of the H200.

The NVIDIA H200 is an extended-memory variant of the Hopper architecture, officially announced in November 2023 and shipping to customers in Q2 2024. It is a significant memory upgrade over the H100, with 141 GB of HBM3e memory and 4.8 TB/s of memory bandwidth: a 76% increase in capacity and a 43% increase in bandwidth compared to the H100 SXM.
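
The percentage figures in the paragraph above can be checked with a few lines of arithmetic. The H100 SXM baseline values (80 GB, 3.35 TB/s) are not stated on this page; they are taken here as assumptions from NVIDIA's published H100 SXM specifications.

```python
# H200 figures from this page.
H200_MEM_GB = 141
H200_BW_TBS = 4.8

# Assumed H100 SXM baseline (not stated on this page).
H100_SXM_MEM_GB = 80
H100_SXM_BW_TBS = 3.35


def pct_increase(new: float, old: float) -> float:
    """Relative increase of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


mem_gain = pct_increase(H200_MEM_GB, H100_SXM_MEM_GB)
bw_gain = pct_increase(H200_BW_TBS, H100_SXM_BW_TBS)

print(f"capacity:  +{mem_gain:.0f}%")
print(f"bandwidth: +{bw_gain:.0f}%")
```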

The H200 uses the same GH100 GPU die as the H100, manufactured on TSMC's custom 4N process node, containing 80 billion transistors. It retains all the architectural features of Hopper including 4th-generation Tensor Cores, the Transformer Engine with FP8 precision support, and 4th-generation NVLink with 900 GB/s bidirectional bandwidth.

The SXM form factor is designed for NVIDIA's HGX H200 baseboard, which supports 8-GPU configurations with full NVLink mesh connectivity. The H200 is a drop-in replacement for the H100 SXM in existing DGX and HGX infrastructure, requiring only firmware updates. TDP remains at 700 W, matching the H100 SXM thermal envelope.
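
As a rough sizing aid, the per-GPU figures above can be aggregated for one 8-GPU baseboard. This is a sketch under simplifying assumptions: it counts GPU TDP only and ignores CPUs, NICs, NVSwitch, fans, and power-supply losses, so real node power draw will be noticeably higher.

```python
# Per-GPU figures from this page.
GPUS_PER_BASEBOARD = 8
MEM_PER_GPU_GB = 141
TDP_PER_GPU_W = 700

# Aggregate GPU memory and GPU-only power for one fully populated baseboard.
total_mem_gb = GPUS_PER_BASEBOARD * MEM_PER_GPU_GB
total_gpu_power_kw = GPUS_PER_BASEBOARD * TDP_PER_GPU_W / 1000

print(f"aggregate GPU memory: {total_mem_gb} GB")
print(f"GPU-only power:       {total_gpu_power_kw:.1f} kW")
```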

The primary use case is large language model inference, where the expanded memory allows larger batch sizes and longer context lengths and reduces the need for tensor parallelism across multiple GPUs. For inference workloads that are memory-bandwidth bound, the H200 can run models like Llama 2 70B at nearly double the throughput of the H100.
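
To illustrate why the extra memory translates into larger batches and longer contexts, the sketch below estimates how many tokens of KV cache fit on a single GPU once the model weights are loaded. Everything model-specific is an assumption not taken from this page: the Llama 2 70B shape (80 layers, 8 KV heads with grouped-query attention, head dimension 128), a nominal 70 billion parameters, FP8 weights, and an FP16 KV cache. The estimate ignores activations, framework overhead, and fragmentation, so treat the numbers as upper bounds, not benchmarks.

```python
GB = 1e9  # decimal gigabytes, matching the spec-sheet figures

# Assumed model shape for Llama 2 70B (not from this page).
N_PARAMS = 70e9
N_LAYERS = 80
N_KV_HEADS = 8   # grouped-query attention
HEAD_DIM = 128


def kv_bytes_per_token(bytes_per_elem: int = 2) -> int:
    """Bytes of KV cache per token: keys and values for every layer and KV head."""
    return 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * bytes_per_elem


def max_cached_tokens(vram_gb: float, weight_bytes_per_param: float,
                      kv_bytes_per_elem: int = 2) -> int:
    """Upper bound on KV-cache tokens that fit after the weights are loaded."""
    free_bytes = vram_gb * GB - N_PARAMS * weight_bytes_per_param
    if free_bytes <= 0:
        return 0  # the weights alone do not fit on one GPU
    return int(free_bytes // kv_bytes_per_token(kv_bytes_per_elem))


# Compare single-GPU KV-cache budgets with FP8 weights (1 byte per parameter).
for name, vram_gb in (("H100 SXM", 80), ("H200", 141)):
    tokens = max_cached_tokens(vram_gb, weight_bytes_per_param=1)
    print(f"{name}: up to ~{tokens:,} cached tokens "
          f"(~{tokens // 4096} sequences of 4096 tokens)")
```

Under these assumptions the weights occupy about 70 GB on either GPU, so nearly all of the H200's additional capacity becomes KV-cache headroom. The sketch covers capacity only; actual throughput also depends on memory bandwidth, kernel efficiency, and the serving stack.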