How much VRAM does the T4 have?

The T4 has 16 GB of GDDR6 memory.

When was the T4 released?

NVIDIA released the T4 in 2018. It is based on the Turing architecture and ships in a PCIe form factor.

How much does it cost to rent the T4?

The T4 rents for $0.15/hr at the cheapest marketplace, with a typical listing-weighted median of $0.50/hr across 4 marketplace partners. Prices are updated daily.

Is the T4 good for AI training or inference?

Performance benchmarks are not currently available for the T4. See the Architecture & Use Cases section for use-case fit.

NVIDIA · Turing · 2018

T4
AIMC Specifications

Complete technical reference: architecture, memory, performance, and live rental pricing.

Memory	16 GB GDDR6
Form Factor	PCIe (Datacenter)
FP16 Compute	Benchmark pending

Live Rental Pricing

Current market pricing across all authorized partners, updated daily.

Cheapest	$0.15/hr
Typical (median)	$0.50/hr
Marketplaces	4
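
For readers unfamiliar with a listing-weighted median, the following is a minimal sketch of one common way to compute it: sort price points and pick the first price at which the cumulative listing count reaches half of all listings. The function name `weighted_median` and the sample listings are hypothetical, chosen only for illustration; they are not this page's data, and the page's exact method is not stated.

```python
def weighted_median(prices_and_counts):
    """Return the price at which the cumulative listing count
    first reaches half of the total listing count."""
    ordered = sorted(prices_and_counts)  # sort by price, ascending
    total = sum(count for _, count in ordered)
    running = 0
    for price, count in ordered:
        running += count
        if running * 2 >= total:
            return price
    raise ValueError("no listings supplied")


# Hypothetical (price per hour in USD, number of listings) pairs.
listings = [(0.20, 3), (0.40, 6), (0.60, 10), (0.90, 5)]

typical = weighted_median(listings)
cheapest = min(price for price, _ in listings)
print(f"cheapest=${cheapest:.2f}/hr typical=${typical:.2f}/hr")
```

Weighting by listing count keeps a single outlier listing from moving the typical price much, which is why the median can sit well above the cheapest offer.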

Full Specifications

Factual specifications from manufacturer datasheets.

Manufacturer	NVIDIA
Architecture	Turing
Memory Capacity	16 GB
Memory Type	GDDR6
Form Factor	PCIe
Release Year	2018
GPU Class	Datacenter

Architecture & Use Cases

Technical overview of the T4.

The NVIDIA T4 is a Turing architecture inference accelerator that became one of the most widely deployed GPUs for AI inference in cloud computing. Announced in September 2018, the T4 established the template for compact, efficient inference GPUs that subsequent generations would follow.

The T4 uses the TU104 die with 16 GB of GDDR6 memory providing 320 GB/s of bandwidth. It includes 2,560 CUDA cores, 320 second-generation Tensor Cores, and 40 RT cores for hardware-accelerated ray tracing. The chip is manufactured on TSMC's 12nm FFN process.
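
The headline figures above are mutually consistent, which the short sketch below checks. It relies on values that are not stated in the text and should be read as assumptions: a 256-bit memory interface, an effective GDDR6 data rate of 10 Gbps per pin, and the Turing streaming multiprocessor layout of 64 CUDA cores, 8 Tensor Cores, and 1 RT core per SM.

```python
# Assumed values (not given in the text above).
BUS_WIDTH_BITS = 256          # memory interface width
DATA_RATE_GBPS_PER_PIN = 10   # effective GDDR6 data rate per pin
CUDA_PER_SM = 64              # Turing SM layout
TENSOR_PER_SM = 8
RT_PER_SM = 1

# Peak bandwidth = bus width x per-pin rate, converted from bits to bytes.
bandwidth_gb_s = BUS_WIDTH_BITS * DATA_RATE_GBPS_PER_PIN / 8

# Derive the SM count from the CUDA core count quoted in the text.
sm_count = 2560 // CUDA_PER_SM
tensor_cores = sm_count * TENSOR_PER_SM
rt_cores = sm_count * RT_PER_SM

print(bandwidth_gb_s, sm_count, tensor_cores, rt_cores)
```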

The T4's defining feature is its compact low-profile, single-slot PCIe Gen3 x16 form factor with a TDP of just 70W. This enables passive cooling (the card relies on chassis airflow) and dense deployment, with up to 20 T4s in a single server chassis. No external power connector is required: the T4 draws all of its power from the PCIe slot.
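
To put the density claim in numbers, here is a rough sketch of the aggregate GPU power and memory for a fully populated chassis. It uses only the per-card figures quoted on this page (70W TDP, 16 GB) and the 20-card maximum; host CPU, fan, and storage power are deliberately left out, so the result is a lower bound on chassis power, not a sizing guide.

```python
CARD_TDP_W = 70        # per-card TDP from the text
CARD_MEMORY_GB = 16    # per-card memory from the text
CARDS_PER_CHASSIS = 20 # maximum density quoted in the text

# Aggregate GPU-only figures for a fully populated chassis.
total_power_w = CARD_TDP_W * CARDS_PER_CHASSIS
total_memory_gb = CARD_MEMORY_GB * CARDS_PER_CHASSIS

print(f"{CARDS_PER_CHASSIS} cards: {total_power_w} W GPU power, "
      f"{total_memory_gb} GB GPU memory")
```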

Second-generation Tensor Cores support INT8 and INT4 operations in addition to FP16, enabling efficient quantized inference. As of 2024 the T4 remains widely deployed, powering inference instances on AWS (G4dn), Azure (NCasT4_v3), Google Cloud (N1 with T4), and most other cloud providers.
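
One practical consequence of lower-precision formats is a smaller weight footprint relative to the card's 16 GB of memory. The sketch below estimates weight storage only, for a hypothetical 7-billion-parameter model picked purely for illustration. It ignores activations, caches, and runtime overhead, and treats 1 GB as 10^9 bytes, so it is a rough guide and not a guarantee that a given model fits.

```python
CARD_MEMORY_GB = 16          # from the specifications above
PARAMS = 7_000_000_000       # hypothetical model size, for illustration only

BITS_PER_WEIGHT = {"FP16": 16, "INT8": 8, "INT4": 4}

# Weight-only footprint in GB (1 GB = 1e9 bytes) for each precision.
footprint_gb = {
    name: PARAMS * bits // 8 / 1e9
    for name, bits in BITS_PER_WEIGHT.items()
}

for name, gb in footprint_gb.items():
    share = gb / CARD_MEMORY_GB
    print(f"{name}: {gb:.1f} GB of weights ({share:.0%} of card memory)")
```

Halving the bits per weight halves the storage, which is why INT8 and INT4 leave much more headroom for activations and batching than FP16 does on a 16 GB card.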