InfiniBand is a high-bandwidth, low-latency network interconnect originally designed for HPC clusters and now standard in large-scale GPU training environments. Modern InfiniBand (NDR generation, 400 Gbps per link) provides the bandwidth and microsecond-scale latency needed for synchronous distributed training across many nodes.
Multi-node LLM training is generally impractical without InfiniBand or a comparable interconnect (NVIDIA Spectrum-X, AWS Elastic Fabric Adapter). All-reduce operations, which synchronize gradients across nodes at every training step in data-parallel training, would otherwise become the bottleneck, leaving expensive GPUs idle while they wait on network traffic.
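To make the bottleneck concrete, the sketch below estimates how long one gradient all-reduce takes at different link speeds. It is a back-of-the-envelope model, not a benchmark: it assumes a ring all-reduce (in which each node sends roughly 2(N-1)/N times the payload), a single link per node, and no latency, protocol overhead, or overlap with compute. The function name `ring_allreduce_seconds` and the example figures (7 billion parameters, 2-byte gradients, 8 nodes) are hypothetical and chosen only for illustration.

```python
def ring_allreduce_seconds(payload_bytes: float, num_nodes: int, link_gbps: float) -> float:
    """Bandwidth-only lower bound on the time for one ring all-reduce.

    Assumes one link per node and ignores latency, protocol overhead,
    and any overlap of communication with computation.
    """
    if num_nodes < 2:
        return 0.0  # nothing to synchronize on a single node
    bytes_per_second = link_gbps * 1e9 / 8  # convert Gbit/s to bytes/s
    # In a ring all-reduce each node sends about 2 * (N - 1) / N times the payload.
    traffic_per_node = 2 * (num_nodes - 1) / num_nodes * payload_bytes
    return traffic_per_node / bytes_per_second


# Hypothetical workload: 7e9 parameters with 2-byte (fp16/bf16) gradients on 8 nodes.
grad_bytes = 7e9 * 2

for gbps in (10, 100, 400):
    seconds = ring_allreduce_seconds(grad_bytes, num_nodes=8, link_gbps=gbps)
    print(f"{gbps:>3} Gbps link: ~{seconds:.2f} s per all-reduce")
```

Under these assumptions the synchronization takes roughly 0.5 s on a 400 Gbps link, about 2 s at 100 Gbps, and about 20 s at 10 Gbps. Real clusters usually attach several network adapters per node and overlap communication with the backward pass, so actual figures differ, but the scaling with link bandwidth is what matters here.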
GPU rental marketplaces typically expose interconnect quality as a tier descriptor: "InfiniBand-attached" or "400 Gbps NDR" listings command a premium because they enable workloads that single-node-only listings cannot run. AIMC tracks per-GPU pricing; interconnect tier should be a separate consideration when evaluating multi-node training options.