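LLM training is among the most VRAM-intensive workloads in modern AI. A 7B-parameter model in FP16 requires roughly 14 GB for the weights alone (7 billion parameters × 2 bytes). Gradients add about as much again, Adam's two moment estimates add about 8 bytes per parameter when kept in FP32, mixed-precision training usually keeps an FP32 master copy of the weights as well, and activations grow with batch size and sequence length. Production fine-tuning runs typically need 40 GB or more per GPU, and full pretraining of 70B+ models is multi-node territory using FSDP, DeepSpeed ZeRO-3, or Megatron-LM.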
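
The memory arithmetic can be sketched as a small estimator. This is a rough illustration, not a measurement: it assumes a common mixed-precision Adam setup (2 bytes per parameter for FP16 weights, 2 for FP16 gradients, and 12 for an FP32 master copy plus two FP32 Adam moments), and it ignores activations and framework overhead entirely. The function name and default byte counts are assumptions chosen for the example.

```python
def training_memory_gb(n_params, weight_bytes=2, grad_bytes=2, optimizer_bytes=12):
    """Rough memory for weights, gradients and optimizer state, in GB (1e9 bytes).

    Assumed defaults: FP16 weights and gradients, plus 12 bytes per parameter
    for an FP32 master copy and two FP32 Adam moments. Activations are excluded.
    """
    gb = 1e9
    return {
        "weights": n_params * weight_bytes / gb,
        "gradients": n_params * grad_bytes / gb,
        "optimizer": n_params * optimizer_bytes / gb,
        "total": n_params * (weight_bytes + grad_bytes + optimizer_bytes) / gb,
    }


# A 7B-parameter model, before any sharding across GPUs.
estimate = training_memory_gb(7e9)
for name, value in estimate.items():
    print(f"{name:>10}: {value:6.1f} GB")
```

Under these assumptions the total is well above the weights-only figure, which is why full fine-tuning of even a 7B model is usually sharded across several GPUs or replaced with a parameter-efficient method.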

Memory bandwidth often matters more than peak FLOPs for large transformer training because many operations, attention in particular, are memory-bound. NVLink, available on SXM and NVL form factors, enables efficient tensor-parallel and pipeline-parallel sharding across cards. FP8 support on Hopper (H100, H200) and Blackwell (B200) GPUs can roughly halve memory pressure compared to FP16, typically with minimal accuracy loss.

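The dominant frameworks are PyTorch with FSDP/DDP, Hugging Face Accelerate, DeepSpeed, and JAX. For LoRA and QLoRA fine-tuning, the memory requirement drops sharply because the base weights are frozen and only small adapter matrices are trained; QLoRA additionally quantizes the base model to 4 bits. A QLoRA fine-tune of Llama 3 8B fits comfortably in 24 GB.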
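
A back-of-the-envelope estimate shows why QLoRA fits on a 24 GB card. The sketch below assumes a nominal 8 billion base parameters stored at 4 bits each, and a hypothetical 40 million trainable adapter parameters costed at 16 bytes each (weights, gradients, and Adam state). Both the adapter count and the per-parameter cost are illustrative assumptions; activations and quantization metadata are not included.

```python
def qlora_memory_gb(base_params, trainable_params, base_bits=4, trainable_bytes=16):
    """Rough QLoRA memory in GB (1e9 bytes), excluding activations.

    base_bits: storage width of the frozen, quantized base weights.
    trainable_bytes: assumed cost per adapter parameter
                     (weights + gradients + Adam state).
    """
    base_gb = base_params * base_bits / 8 / 1e9
    adapter_gb = trainable_params * trainable_bytes / 1e9
    return base_gb, adapter_gb, base_gb + adapter_gb


# Nominal 8B base model with a hypothetical 40M adapter parameters.
base_gb, adapter_gb, total_gb = qlora_memory_gb(base_params=8e9, trainable_params=40e6)
print(f"frozen 4-bit base: {base_gb:.2f} GB")
print(f"adapters + state:  {adapter_gb:.2f} GB")
print(f"total (no activations): {total_gb:.2f} GB")
```

The remaining headroom on a 24 GB card goes to activations, which depend on batch size and sequence length.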

Key Considerations

• Memory bandwidth (HBM3/HBM3e) often matters more than peak FLOPs
• SXM/NVL form factors enable model-parallel sharding via NVLink
• FP8 support cuts memory pressure roughly in half versus FP16
• Multi-node training requires high-bandwidth interconnect (InfiniBand 400G+)
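
To see why interconnect bandwidth matters, consider the cost of synchronizing gradients once per step. The sketch below is an idealized lower bound under stated assumptions: plain data parallelism, a ring all-reduce (each GPU sends roughly 2 × (N − 1) / N times the payload), one network link per GPU, and no latency or overlap with compute. The function name and the 16-GPU setup are hypothetical.

```python
def ring_allreduce_seconds(payload_bytes, n_gpus, link_gbit_per_s):
    """Idealized ring all-reduce time: bandwidth term only, no latency."""
    link_bytes_per_s = link_gbit_per_s * 1e9 / 8
    # Each GPU sends (and receives) 2 * (N - 1) / N times the payload.
    traffic_bytes = 2 * (n_gpus - 1) / n_gpus * payload_bytes
    return traffic_bytes / link_bytes_per_s


# FP16 gradients of a 7B-parameter model: 7e9 parameters * 2 bytes.
grad_bytes = 7e9 * 2
timings = {
    gbit: ring_allreduce_seconds(grad_bytes, n_gpus=16, link_gbit_per_s=gbit)
    for gbit in (100, 400)
}
for gbit, seconds in timings.items():
    print(f"{gbit} Gbit/s link: {seconds:.2f} s per gradient sync")
```

Real frameworks overlap communication with the backward pass, so the stall per step is usually smaller than this figure, but the scaling with link speed is the same.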

Full Ranking (25)

Every GPU in our index that meets the 24 GB VRAM minimum, ranked by fit score.

GPU	Fit	VRAM	Class	$/hr
A100 PCIe 80GB	100	80 GB	Datacenter	$3.14
A100 SXM 80GB	100	80 GB	Datacenter	$1.39
B200 SXM	100	180 GB	Datacenter	$5.89
B300 SXM	100	192 GB	Datacenter	$7.50
H100 NVL	100	94 GB	Datacenter	$3.11
H100 PCIe	100	80 GB	Datacenter	$3.19
H100 SXM	100	80 GB	Datacenter	$2.50
H200	100	141 GB	Datacenter	$4.39
H200 NVL	100	141 GB	Datacenter	$3.81
MI300X	100	192 GB	Datacenter	$2.82
A100 PCIe 40GB	90	40 GB	Datacenter	$1.31
A100 SXM 40GB	90	40 GB	Datacenter	$0.79
L40S	90	48 GB	Datacenter	$1.60
A40	80	48 GB	Datacenter	$0.54
L40	80	48 GB	Datacenter	$0.86
L4	70	24 GB	Datacenter	$0.86
V100 SXM 32GB	70	32 GB	Datacenter	$0.91
RTX 5090	50	32 GB	Consumer	$0.75
RTX 6000 Ada	50	48 GB	Workstation	$0.79
RTX A6000	50	48 GB	Workstation	$0.57
RTX PRO 6000 Blackwell	50	96 GB	Workstation	$1.99
RTX 4090	40	24 GB	Consumer	$0.64
RTX A5000	40	24 GB	Workstation	$0.27
RTX 3090	25	24 GB	Consumer	$0.41
RTX 3090 Ti	25	24 GB	Consumer	$0.25
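
Fit score and hourly price can be combined into a secondary metric such as price per GB of VRAM. The sketch below copies a handful of rows from the table above and sorts the high-fit ones by that ratio. The 90-point fit cutoff and the metric itself are assumptions made for illustration; they are not part of how the fit score is computed.

```python
# (name, fit, vram_gb, usd_per_hour), copied from a subset of the ranking table.
rows = [
    ("A100 SXM 80GB", 100, 80, 1.39),
    ("H100 SXM", 100, 80, 2.50),
    ("MI300X", 100, 192, 2.82),
    ("B200 SXM", 100, 180, 5.89),
    ("L40S", 90, 48, 1.60),
    ("RTX 4090", 40, 24, 0.64),
]


def rank_by_cost_per_gb(rows, min_fit=90):
    """Keep rows at or above min_fit, cheapest $/hr per GB of VRAM first."""
    eligible = [row for row in rows if row[1] >= min_fit]
    return sorted(eligible, key=lambda row: row[3] / row[2])


ranked = rank_by_cost_per_gb(rows)
for name, fit, vram_gb, usd_per_hour in ranked:
    print(f"{name:<14} fit={fit:<3} ${usd_per_hour / vram_gb:.4f} per GB-hour")
```

Price per GB says nothing about bandwidth, interconnect, or FP8 support, so it is best read alongside the fit score rather than in place of it.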