Compute Units

CUDA Cores

General-purpose parallel processors in NVIDIA GPUs.

CUDA Cores are the general-purpose parallel processing units in NVIDIA GPUs, responsible for executing programs written in CUDA (NVIDIA's parallel computing platform). They handle FP32 and FP64 work, as well as integer operations and any code that isn't a tensor matrix operation.

A modern datacenter GPU has thousands of CUDA cores — the H100 has 14,592; the RTX 5090 has 21,760. Each CUDA core is much simpler than a CPU core but operates in massive parallel groups.

For deep learning, the relative importance of CUDA cores has decreased as workloads shifted to Tensor Core-eligible operations. They remain critical for scientific computing, rendering, simulation, and any workload using full FP32/FP64 precision.

Related Terms

Concepts directly relevant to CUDA Cores.

Tensor Cores

NVIDIA's specialized matrix-multiplication accelerators used heavily in deep learning.

Workloads Where CUDA Cores Matters

GPU fit analysis for the workloads this concept directly influences.

This definition is part of AIMC's reference glossary — 36 concepts across 10 categories.

Browse full glossary