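TFLOPS (teraFLOPS) measures how many trillion floating-point operations a GPU can perform per second at a given precision. Ratings are precision-specific: a GPU's FP32, FP16, and FP8 TFLOPS are all different numbers. INT8 throughput is often listed alongside them, but because integer operations are not floating-point it is properly quoted in TOPS (trillions of operations per second) rather than TFLOPS.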
Marketing figures often include sparsity, which assumes the hardware can skip zero-valued matrix elements that follow a fixed structured pattern; this doubles the reported rate. Dense (non-sparse) TFLOPS is the more conservative number, and AIMC uses it where possible to avoid overstating capability.
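As a rough illustration of that adjustment, the sketch below converts a sparsity-inflated spec-sheet figure back to a dense one. The function name `dense_tflops`, its parameters, and the 624 TFLOPS input are hypothetical, chosen only for illustration; the default factor of 2 follows the doubling described above. This is not AIMC's actual normalization code.

```python
def dense_tflops(
    advertised_tflops: float,
    includes_sparsity: bool,
    sparsity_factor: float = 2.0,  # assumption: sparsity doubles the quoted rate
) -> float:
    """Return a dense TFLOPS estimate from a spec-sheet figure."""
    if sparsity_factor <= 0:
        raise ValueError("sparsity_factor must be positive")
    if includes_sparsity:
        # Undo the sparsity multiplier to recover the dense rate.
        return advertised_tflops / sparsity_factor
    # Already a dense figure: pass it through unchanged.
    return advertised_tflops


# Hypothetical spec-sheet entry: 624 TFLOPS quoted "with sparsity".
print(dense_tflops(624.0, includes_sparsity=True))  # 312.0
```

Keeping the factor as a parameter is a deliberate choice here: it lets a caller handle a vendor whose sparse multiplier differs from 2 without changing the function.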
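Headline 2026 datacenter ratings (dense FP16, tensor or matrix-engine throughput): H100 SXM ~989 TFLOPS, A100 SXM 80GB ~312 TFLOPS, B200 SXM ~2,250 TFLOPS, MI300X ~1,300 TFLOPS. For consumer parts, the commonly quoted figures are RTX 5090 ~104 TFLOPS and RTX 4090 ~83 TFLOPS; these match the cards' FP32 shader ratings rather than dense FP16 tensor throughput, so they are not directly comparable with the datacenter numbers. TFLOPS is one of three inputs to AIMC's workload fit score.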
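To make the ratings more concrete, the sketch below turns a TFLOPS figure into a best-case time for a single matrix multiplication, using the datacenter numbers quoted above. It relies on the standard count of 2·M·K·N floating-point operations for multiplying an M×K matrix by a K×N matrix. The helper names and the 8192-sized example are assumptions for illustration, and the result is only a lower bound: real kernels reach a fraction of peak, and memory bandwidth often dominates. It is not how AIMC computes its workload fit score.

```python
# Dense FP16 ratings in TFLOPS, copied from the datacenter figures above.
DENSE_FP16_TFLOPS = {
    "A100 SXM 80GB": 312.0,
    "H100 SXM": 989.0,
    "MI300X": 1300.0,
    "B200 SXM": 2250.0,
}


def matmul_flops(m: int, k: int, n: int) -> int:
    """FLOPs for an (m x k) @ (k x n) multiply: one multiply and one add per term."""
    return 2 * m * k * n


def best_case_seconds(flops: float, tflops: float) -> float:
    """Lower-bound runtime, assuming the GPU sustains its full rated throughput."""
    return flops / (tflops * 1e12)  # 1 TFLOPS = 1e12 FLOP/s


# Hypothetical workload: a square 8192 x 8192 x 8192 FP16 matrix multiply.
flops = matmul_flops(8192, 8192, 8192)
for gpu, rating in DENSE_FP16_TFLOPS.items():
    ms = best_case_seconds(flops, rating) * 1e3
    print(f"{gpu}: at least {ms:.2f} ms")
```

The ratio between two GPUs' best-case times is simply the inverse ratio of their ratings, which is why a single-precision-specific TFLOPS figure is useful for comparison even though neither absolute time is achievable in practice.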