V100
The Volta V100 (32 GB) — a mature data-center GPU still used for training and inference on a budget.
A16
A quad-GPU Ampere card with 64 GB total, designed for high-density VDI and light inference.
T4
The Turing T4 (16 GB, 70 W), a low-power inference staple offered by most cloud providers.
A40
An Ampere 48 GB professional GPU for inference, rendering and virtual workstations.
A10
An Ampere 24 GB GPU popular for mainstream inference and graphics workloads.
RTX 4090
The Ada Lovelace GeForce RTX 4090 (24 GB) — exceptional price/performance for single-GPU inference and fine-tuning.
P100
The Pascal P100 (16 GB HBM2) — an older data-center GPU for budget inference and HPC.
L4
A compact 72 W Ada Lovelace inference GPU with 24 GB GDDR6 for cost-efficient, high-density serving.
RTX 5080
The Blackwell GeForce RTX 5080 (16 GB GDDR7) for high-end consumer inference and gaming.
RTX A6000
The Ampere RTX A6000 (48 GB) — a proven workstation GPU for large-memory inference and training.
RTX 3090
The Ampere GeForce RTX 3090 (24 GB) — a popular budget GPU for local AI workloads.
A10G
AWS's Ampere A10G (24 GB), the GPU behind G5 instances for inference and light training.
RTX 5090
The Blackwell GeForce RTX 5090 (32 GB GDDR7), NVIDIA's flagship consumer GPU for local AI.
RTX 6000 Ada
The RTX 6000 Ada (48 GB) workstation GPU for professional AI, rendering and simulation.
L40S
An Ada Lovelace universal GPU with 48 GB GDDR6, balancing AI inference, fine-tuning and graphics.
A100
The Ampere A100 (80 GB SXM) — the previous-generation data-center standard, still widely available and cost-effective for training and inference.
H100
NVIDIA's Hopper flagship with 80 GB HBM3 and ~3.35 TB/s of bandwidth — the workhorse for large-scale LLM training and high-throughput inference.
H200 NVL
The PCIe (NVL) variant of the H200 with 141 GB HBM3e, built for mainstream air-cooled servers at a lower board power of up to 600 W.
H200
A Hopper GPU with the same compute as the H100 but 141 GB of HBM3e and ~4.8 TB/s bandwidth, ideal for memory-bound LLM inference.
B300
Blackwell Ultra (B300) with 288 GB HBM3e for the largest models and longest context windows.
B200
NVIDIA's Blackwell data-center GPU with 192 GB HBM3e and ~8 TB/s bandwidth, roughly doubling H100 training throughput.
GH200
The Grace Hopper superchip, combining a Hopper GPU with a Grace CPU over NVLink-C2C, with 96 GB HBM3 (or 144 GB HBM3e in the newer variant) for memory-coherent workloads.
GB200
The Grace Blackwell GB200 superchip, pairing two Blackwell GPUs with a Grace CPU over NVLink-C2C for rack-scale AI.
