LLM training is the process of fitting a large language model, typically a transformer with billions of parameters, to large text corpora. Pretraining at frontier scale consumes on the order of millions of GPU-hours on top-tier hardware; even fine-tuning a 70B-parameter model usually requires multi-GPU systems with high-bandwidth interconnects.
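To see why a 70B-parameter model outgrows a single card, here is a rough back-of-the-envelope sketch. It assumes full fine-tuning with the Adam optimizer in mixed precision, using the commonly cited accounting of about 16 bytes of training state per parameter, and an 80 GB card as the reference size. The function names are illustrative, and activation memory is left out, so real requirements are higher.

```python
import math

# Assumed per-parameter training state for mixed-precision Adam (bytes).
BYTES_PER_PARAM = {
    "bf16_weights": 2,
    "bf16_gradients": 2,
    "fp32_master_weights": 4,
    "fp32_adam_momentum": 4,
    "fp32_adam_variance": 4,
}


def training_state_gb(n_params: float) -> float:
    """Weights, gradients, and optimizer state in GB (activations excluded)."""
    return n_params * sum(BYTES_PER_PARAM.values()) / 1e9


def min_gpus(n_params: float, vram_gb: float) -> int:
    """Lower bound on GPU count needed just to hold the training state."""
    return math.ceil(training_state_gb(n_params) / vram_gb)


if __name__ == "__main__":
    n = 70e9  # 70B parameters
    print(f"{training_state_gb(n):.0f} GB of training state")  # 1120 GB
    print(f"at least {min_gpus(n, 80)} GPUs at 80 GB each")    # 14
```

Under these assumptions the training state alone is about 1.1 TB, which is why the model has to be sharded across many GPUs and why interconnect bandwidth matters.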
The compute profile is dominated by matrix multiplications at FP16 or BF16 precision. The backward pass costs roughly twice the compute of the forward pass, and gradients plus optimizer state add several times the memory of the weights alone. Datacenter-class GPUs (H100, H200, B200, A100, MI300X) are the typical hardware tier; consumer cards rarely have enough VRAM and generally lack the NVLink interconnect needed for multi-GPU training at scale.
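The compute side can be sketched with the widely used approximation that a dense transformer needs about 2 FLOPs per parameter per token in the forward pass and about 4 in the backward pass, for roughly 6 in total. The example below is a minimal estimate under that assumption. The token count, peak throughput, and utilization are hypothetical round numbers chosen for illustration, not measurements of any specific GPU.

```python
def training_flops(n_params: float, n_tokens: float) -> dict:
    """Approximate dense-transformer training FLOPs, split by pass."""
    forward = 2.0 * n_params * n_tokens   # ~2 FLOPs per parameter per token
    backward = 4.0 * n_params * n_tokens  # ~2x the forward pass
    return {"forward": forward, "backward": backward, "total": forward + backward}


def gpu_hours(total_flops: float, peak_flops_per_s: float, utilization: float) -> float:
    """GPU-hours at a given peak throughput and achieved utilization (0 to 1)."""
    sustained = peak_flops_per_s * utilization
    return total_flops / sustained / 3600.0


if __name__ == "__main__":
    # Hypothetical run: 70B parameters, 2 trillion tokens.
    flops = training_flops(70e9, 2e12)
    # Assumed 1e15 FLOP/s peak per GPU and 40% utilization.
    hours = gpu_hours(flops["total"], peak_flops_per_s=1e15, utilization=0.4)
    print(f"total FLOPs: {flops['total']:.2e}")
    print(f"estimated GPU-hours: {hours:,.0f}")
```

With these inputs the estimate comes out in the hundreds of thousands of GPU-hours; a lower utilization or older hardware pushes it higher.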
AIMC tracks LLM training as one of 10 workloads, applying a 24 GB VRAM minimum and a 500 FP16 TFLOPS threshold to it. The /for/llm-training hub ranks every viable GPU in the index by fit score.