NVIDIA RTX 4080 — 16GB

Great performance, but 16GB of VRAM limits you to ~14B models. Consider the RTX 3090 for more VRAM at a lower cost.

Specifications

Brand: NVIDIA
Model: RTX 4080
VRAM: 16GB
Architecture: Ada
CUDA / Stream Processors: 9,728
Memory Bandwidth: 717 GB/s
TDP: 320W
FP32 TFLOPS: 49



For AI / LLM Use

Good for 14B models. 30B requires aggressive quantisation.

What Models Can It Run?

  • 14B Q6_K, 30B Q3_K (tight)
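The fits above can be sanity-checked with a rough back-of-envelope calculation: model size ≈ parameter count × bytes per weight, plus some headroom for the KV cache and runtime overhead. The bytes-per-weight figures and 1.5GB overhead below are illustrative assumptions, not exact GGUF sizes:

```python
# Rough VRAM-fit check for a 16GB card. Heuristic only: real usage
# depends on context length, KV cache size, and the inference runtime.

# Approximate bytes per weight for common GGUF quantisations (assumed).
BYTES_PER_WEIGHT = {
    "Q3_K": 0.44,    # ~3.5 bits/weight
    "Q4_K_M": 0.56,  # ~4.5 bits/weight
    "Q6_K": 0.81,    # ~6.5 bits/weight
    "Q8_0": 1.06,    # ~8.5 bits/weight
    "F16": 2.0,      # full half precision
}

def fits_in_vram(params_billions, quant, vram_gb=16, overhead_gb=1.5):
    """Estimate whether model weights fit, leaving KV-cache headroom."""
    model_gb = params_billions * BYTES_PER_WEIGHT[quant]
    return model_gb + overhead_gb <= vram_gb

print(fits_in_vram(14, "Q6_K"))    # ~11.3 GB of weights -> True
print(fits_in_vram(30, "Q3_K"))    # ~13.2 GB of weights -> True (tight)
print(fits_in_vram(30, "Q4_K_M"))  # ~16.8 GB of weights -> False
```

This matches the list above: 14B Q6_K fits comfortably, 30B only squeezes in at Q3_K.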

Estimated Performance

Generation: ~54 tokens/sec

Prefill: ~875 tokens/sec
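The generation figure is consistent with decode being memory-bandwidth-bound: each generated token requires reading roughly the whole model from VRAM, so tokens/sec ≈ effective bandwidth ÷ model size. The efficiency factor and model size below are illustrative assumptions:

```python
# Back-of-envelope decode-speed estimate for a bandwidth-bound workload.
# Assumed numbers, not benchmarks.

bandwidth_gb_s = 717   # RTX 4080 memory bandwidth (from the spec table)
efficiency = 0.85      # assumed fraction of peak bandwidth achieved
model_size_gb = 11.3   # e.g. a 14B model at Q6_K

tokens_per_sec = bandwidth_gb_s * efficiency / model_size_gb
print(round(tokens_per_sec))  # ~54, in line with the estimate above
```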

Recommended Quantisations

  • Q4_K_M for 14B models
  • Q6_K for 7B-8B models
  • Q8 for 7B if VRAM allows
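The recommendations above amount to "pick the highest-quality quantisation whose weights still leave headroom in 16GB". A hypothetical helper sketching that rule, using the same assumed bytes-per-weight estimates as before:

```python
# Hypothetical helper: pick the best quantisation that fits in VRAM.
# Sizes are rough assumptions; the recommended Q4_K_M for 14B simply
# leaves more KV-cache headroom than the maximal choice here.

QUANTS = [("Q8_0", 1.06), ("Q6_K", 0.81), ("Q4_K_M", 0.56), ("Q3_K", 0.44)]

def recommend_quant(params_billions, vram_gb=16, overhead_gb=1.5):
    for name, bytes_per_weight in QUANTS:  # best quality first
        if params_billions * bytes_per_weight + overhead_gb <= vram_gb:
            return name
    return None  # does not fit at any listed quantisation

print(recommend_quant(7))   # Q8_0 -- "Q8 for 7B if VRAM allows"
print(recommend_quant(30))  # Q3_K -- aggressive quantisation required
```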

Pros & Cons

Pros

  • Ada architecture — good software support
  • Consumer card — easy to install, display output

Cons

  • 16GB VRAM caps practical model size at ~14B
  • 320W TDP — high power draw

Community Verdict

  • r/LocalLLaMA

    Fast but 16GB is the bottleneck. Most recommend spending the same money on a 3090 for more VRAM.
