The NVIDIA L4 is a compact accelerator based on the Ada Lovelace architecture, announced in March 2023 and optimized for AI inference at the edge and in mainstream servers. Its low-power, single-slot form factor targets dense deployments and power-constrained environments.
The L4 is built on the AD104 die and carries 24 GB of GDDR6 memory with 300 GB/s of bandwidth. It has 7,424 CUDA cores, 232 fourth-generation Tensor Cores, and 58 third-generation RT Cores. The chip is manufactured on TSMC's custom 4N process, the node used across the Ada Lovelace family.
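The unit counts and bandwidth figure above can be sanity-checked with a little arithmetic. The sketch below assumes the usual Ada Lovelace layout of 128 CUDA cores, 4 Tensor Cores, and 1 RT Core per streaming multiprocessor (SM), and it assumes a 192-bit memory bus running at 12.5 Gbit/s per pin. None of those per-SM or bus figures appear in the text above, so treat them as illustrative inputs rather than quoted specifications.

```python
# Per-SM unit counts for Ada Lovelace. These are assumptions for this sketch,
# not values stated in the surrounding text.
CUDA_CORES_PER_SM = 128
TENSOR_CORES_PER_SM = 4
RT_CORES_PER_SM = 1


def unit_counts(sm_count: int) -> dict:
    """Return total CUDA, Tensor, and RT core counts for a given SM count."""
    return {
        "cuda_cores": sm_count * CUDA_CORES_PER_SM,
        "tensor_cores": sm_count * TENSOR_CORES_PER_SM,
        "rt_cores": sm_count * RT_CORES_PER_SM,
    }


def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: bus width in bits times per-pin rate, divided by 8."""
    return bus_width_bits * data_rate_gbps / 8


if __name__ == "__main__":
    # 58 SMs is inferred from the 58 RT Cores quoted above (one RT Core per SM).
    print(unit_counts(58))
    # Assumed memory configuration: 192-bit bus, 12.5 Gbit/s per pin.
    print(memory_bandwidth_gb_s(192, 12.5))
```

Under these assumptions the totals match the 7,424 CUDA cores, 232 Tensor Cores, and 300 GB/s quoted above.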
A key feature is the 72 W TDP in a low-profile, single-slot PCIe Gen4 x16 form factor. Because 72 W is below the 75 W that a PCIe x16 slot can supply, the card needs no auxiliary power connector, so it can be deployed in standard servers and in thermally constrained edge environments. Up to 8 L4 GPUs can fit in a standard 4U server.
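A rough power and memory budget for a dense server follows directly from these numbers. The sketch below is only illustrative: the 75 W slot limit is an assumption based on the standard PCIe x16 slot power allowance, the helper names are hypothetical, and the totals cover the GPUs alone (no CPUs, fans, or storage).

```python
L4_TDP_W = 72            # per-card TDP, from the text above
L4_MEMORY_GB = 24        # per-card GDDR6 capacity, from the text above
PCIE_SLOT_LIMIT_W = 75   # assumed power available from a PCIe x16 slot


def needs_aux_power(tdp_w: float, slot_limit_w: float = PCIE_SLOT_LIMIT_W) -> bool:
    """True if a card draws more than the slot alone is assumed to supply."""
    return tdp_w > slot_limit_w


def server_totals(gpu_count: int) -> dict:
    """Aggregate GPU-only power (W) and memory (GB) for a server with gpu_count L4 cards."""
    return {
        "gpu_power_w": gpu_count * L4_TDP_W,
        "gpu_memory_gb": gpu_count * L4_MEMORY_GB,
    }


if __name__ == "__main__":
    print(needs_aux_power(L4_TDP_W))   # False: 72 W fits within the assumed 75 W
    print(server_totals(8))            # 8 cards: 576 W and 192 GB in total
```

For the 8-GPU configuration mentioned above, this gives 576 W of GPU power and 192 GB of aggregate GPU memory.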
Fourth-generation Tensor Cores support FP8, FP16, BF16, TF32, and INT8 operations optimized for inference. Hardware video capabilities include eighth-generation NVENC with AV1 encode and fifth-generation NVDEC. Primary deployment scenarios include edge AI inference, video analytics, and dense inference servers where power efficiency is critical.
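One practical consequence of the lower-precision formats is that more model weights fit in the card's 24 GB of memory. The following is a minimal sketch of that arithmetic, not a sizing tool: the parameter counts are hypothetical, GB is treated as 10^9 bytes, and only weights are counted (activations, KV cache, and runtime overhead are ignored).

```python
# Storage size per weight in bytes. TF32 is a compute mode whose values are
# held in 32-bit containers, so it is counted as 4 bytes here.
BYTES_PER_ELEMENT = {"TF32": 4, "FP16": 2, "BF16": 2, "FP8": 1, "INT8": 1}

L4_MEMORY_GB = 24.0  # from the text above


def weights_gb(num_params: int, precision: str) -> float:
    """Approximate weight footprint in GB (10^9 bytes) at the given precision."""
    return num_params * BYTES_PER_ELEMENT[precision] / 1e9


def weights_fit(num_params: int, precision: str, memory_gb: float = L4_MEMORY_GB) -> bool:
    """True if the weights alone fit in memory_gb; ignores all other memory use."""
    return weights_gb(num_params, precision) <= memory_gb


if __name__ == "__main__":
    hypothetical_model = 13_000_000_000  # illustrative parameter count only
    for precision in ("FP16", "FP8"):
        size = weights_gb(hypothetical_model, precision)
        print(precision, size, weights_fit(hypothetical_model, precision))
```

In this example a hypothetical 13-billion-parameter model needs about 26 GB at FP16, which exceeds a single card, but about 13 GB at FP8 or INT8, which fits with room left for other allocations.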