GPUDojo

What GPU do I need for AI?

Tell us what you want to run and your budget — we'll recommend the best GPUs with live prices. Or browse the full comparison table.

1. What do you want to run?
2. What's your budget?
3. Any constraints?

How Much VRAM Do I Need?

| Model Size | Min VRAM | Sweet Spot | Example Models |
|---|---|---|---|
| 7-8B | 8GB | 12GB | Mistral 7B, Llama 3.1 8B |
| 14B | 12GB | 16GB | Qwen 2.5 14B |
| 14-30B / MoE | 16GB | 24GB | Mixtral 8x7B, Qwen 32B Q4 |
| 70B+ | 48GB | 48-80GB | Llama 3.1 70B Q4, Qwen 72B |

70B Q4 needs ~40GB. A single 24GB card cannot run it — you need 48GB+ (e.g., A6000, dual P40s).
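A rough rule of thumb behind these numbers: weight memory is parameters × bits ÷ 8, plus runtime overhead for the KV cache and framework. Here is a minimal sketch; the 1.2× overhead factor is an assumption, and real usage varies with context length and runtime.

```python
def vram_gb(params_b: float, bits: int, overhead: float = 1.2) -> float:
    """Rough VRAM estimate for LLM inference.

    params_b: parameter count in billions
    bits: quantization precision (16, 8, 4, ...)
    overhead: assumed multiplier for KV cache / runtime overhead
    """
    weights_gb = params_b * bits / 8  # bits/8 bytes per param; 1B params ~= 1GB at 8-bit
    return weights_gb * overhead

# 70B at 4-bit: 35GB of weights, ~42GB with overhead —
# in the same ballpark as the ~40GB figure above
print(round(vram_gb(70, 4), 1))
```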

Why $/GB VRAM?

For local AI inference, VRAM is the bottleneck. It determines the largest model you can load — and larger models produce smarter, more coherent output. Price per GB of VRAM tells you how much capability you get per USD spent.

This is why a used Tesla P40 (24GB) often beats a new RTX 4060 Ti (8GB) for AI work, despite being older hardware. The P40 can run 30B models that simply won't fit on 8GB. See the full comparison table ranked by this metric.
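The metric itself is just price divided by VRAM capacity. A small sketch with illustrative prices (the dollar figures below are placeholders, not live quotes; check current listings):

```python
# (name, VRAM in GB, price in USD) — prices are assumed for illustration
cards = [
    ("Tesla P40 (used)", 24, 300),
    ("RTX 4060 Ti", 8, 400),
]

# Rank by $/GB of VRAM, cheapest capability first
for name, vram, price in sorted(cards, key=lambda c: c[2] / c[1]):
    print(f"{name}: ${price / vram:.2f}/GB")
```

At these example prices the P40 comes out at $12.50/GB versus $50.00/GB for the 4060 Ti, which is the gap the comparison table surfaces.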

Quantization Explained

LLM weights are stored as floating-point numbers. Quantization reduces their precision (e.g., from 16-bit to 4-bit) so larger models fit in less VRAM, at a small cost in output quality.