HBM is a vertically stacked DRAM technology that sits on the same package as the GPU and connects to it through a silicon interposer rather than through traces on a traditional PCB. The wide on-package bus (1024 bits or more per stack) delivers memory bandwidth several times higher than GDDR-based consumer cards.
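To see why bus width matters so much, here is a minimal sketch of the usual peak-bandwidth arithmetic: total bus width times per-pin data rate, divided by 8 bits per byte. The function name `peak_bandwidth_gbs` is hypothetical, and the stack count and pin rates are rounded, illustrative assumptions rather than vendor-exact specifications.

```python
def peak_bandwidth_gbs(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Theoretical peak bandwidth in GB/s for a bus of the given width.

    Each pin transfers pin_rate_gbps gigabits per second, so the total is
    width * rate gigabits per second, divided by 8 to convert to gigabytes.
    """
    return bus_width_bits * pin_rate_gbps / 8


# Assumed, approximate configurations for illustration only.
# HBM3-class GPU: five active stacks, 1024 bits each, ~5.2 Gbps per pin.
hbm_gbs = peak_bandwidth_gbs(bus_width_bits=5 * 1024, pin_rate_gbps=5.2)

# GDDR6X-class consumer card: 384-bit bus, 21 Gbps per pin.
gddr_gbs = peak_bandwidth_gbs(bus_width_bits=384, pin_rate_gbps=21.0)

print(f"HBM3-class:   {hbm_gbs:.0f} GB/s")
print(f"GDDR6X-class: {gddr_gbs:.0f} GB/s")
print(f"Ratio:        {hbm_gbs / gddr_gbs:.1f}x")
```

Under these assumptions the GDDR card runs each pin roughly four times faster, yet the HBM configuration still comes out ahead by about 3x because its bus is more than thirteen times wider.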
HBM generations have steadily increased capacity and bandwidth. Quoted as aggregate per-GPU figures rather than per-stack figures: HBM2 reached ~900 GB/s on the V100; HBM2e pushed past 1.2 TB/s, reaching roughly 2 TB/s on the A100 80GB; HBM3 in the H100 SXM delivers approximately 3.35 TB/s; HBM3e reaches 4.8 TB/s in the H200 and roughly 8 TB/s in the B200.
Datacenter GPUs built for training and large-scale inference almost universally use HBM. Consumer cards use GDDR instead (GDDR6X on the RTX 4090, GDDR7 on the RTX 5090), which is cheaper to manufacture and assemble. AIMC's GPU classification reflects this: HBM-equipped cards land in the Datacenter category.
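The classification rule can be sketched as a simple check on the memory type. This is not AIMC's actual implementation: the `Gpu` record, the `classify` function, and the "Consumer" label for the non-HBM case are all hypothetical names chosen for illustration, and only the HBM-to-Datacenter mapping comes from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    """Hypothetical record holding just the fields this sketch needs."""
    name: str
    memory_type: str  # e.g. "HBM3", "HBM3e", "GDDR6X", "GDDR7"


def classify(gpu: Gpu) -> str:
    """Map any HBM variant to "Datacenter"; everything else to "Consumer".

    The "Consumer" label is an assumption; a real classifier may use more
    categories and more signals than the memory type alone.
    """
    if gpu.memory_type.strip().upper().startswith("HBM"):
        return "Datacenter"
    return "Consumer"


cards = [
    Gpu("H100 SXM", "HBM3"),
    Gpu("H200", "HBM3e"),
    Gpu("RTX 4090", "GDDR6X"),
    Gpu("RTX 5090", "GDDR7"),
]

for card in cards:
    print(f"{card.name}: {classify(card)}")
```

Matching on the "HBM" prefix instead of listing each generation means a newer variant would be classified the same way without a code change, at the cost of trusting that the memory-type string is reported consistently.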