NVIDIA A6000 — 48GB

48GB VRAM monster. The only realistic way to run 70B models on a single workstation-class card.

Specifications

Brand: NVIDIA
Model: A6000
VRAM: 48GB
Architecture: Ampere
CUDA / Stream Processors: 10,752
Memory Bandwidth: 768 GB/s
TDP: 300W
FP32 TFLOPS: 38.7
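
The FP32 figure is consistent with the shader count: peak FP32 throughput is roughly CUDA cores × 2 FLOPs per cycle (one fused multiply-add) × boost clock. A quick sanity check, assuming a ~1.8 GHz boost clock (the clock is an assumption; it is not listed in the table above):

```python
# Sanity-check peak FP32 throughput against the spec table.
cuda_cores = 10_752
boost_clock_hz = 1.8e9          # assumed ~1.8 GHz boost clock (not in the table)
flops_per_core_per_cycle = 2    # one fused multiply-add per cycle

tflops = cuda_cores * flops_per_core_per_cycle * boost_clock_hz / 1e12
print(f"{tflops:.1f} TFLOPS")   # -> 38.7, matching the listed figure
```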



For AI / LLM Use

Best for running 70B+ models on a single GPU.

What Models Can It Run?

  • 70B Q4_K_M (weights ≈42GB, leaving headroom for context)
  • 30B Q8_0 or Q6_K with room to spare
  • Full precision (FP16) up to roughly 24B (a 30B model at FP16 needs ~60GB and does not fit)
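
The fit claims above can be sanity-checked with a back-of-envelope VRAM estimate. The bits-per-weight values below are rough rules of thumb for GGUF quants (actual file sizes vary by architecture and quant details), and the estimate covers weights only, not KV cache:

```python
# Rough VRAM estimate for common GGUF quantisations.
# Bits-per-weight values are approximate rules of thumb, not exact.
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q6_K": 6.56, "Q8_0": 8.5, "FP16": 16.0}

def weights_gb(params_b: float, quant: str) -> float:
    """Approximate weight memory in GB for a model with params_b billion parameters."""
    return params_b * 1e9 * BITS_PER_WEIGHT[quant] / 8 / 1e9

for params, quant in [(70, "Q4_K_M"), (30, "Q6_K"), (30, "FP16")]:
    gb = weights_gb(params, quant)
    fits = "fits" if gb < 48 else "does not fit"
    print(f"{params}B {quant}: ~{gb:.0f} GB -> {fits} in 48GB (before KV cache)")
```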

Estimated Performance

Generation: ~58 tokens/sec

Prefill: ~691 tokens/sec
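
The page does not say which model the ~58 tokens/sec figure assumes, but single-stream generation is roughly memory-bandwidth-bound: each token requires streaming the full weight set from VRAM once, so bandwidth divided by model size gives a ceiling. A minimal sketch of that roofline estimate, using the 768 GB/s figure from the spec table:

```python
# Upper-bound decode speed from memory bandwidth: each generated token
# streams the model weights from VRAM once. Real throughput is lower
# (KV-cache reads, kernel overhead), so treat this as a ceiling.
bandwidth_gb_s = 768  # from the spec table

def max_tokens_per_sec(model_gb: float) -> float:
    return bandwidth_gb_s / model_gb

print(f"{max_tokens_per_sec(42):.0f} tok/s")  # -> 18 tok/s ceiling for a ~42GB 70B Q4 model
```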

Recommended Quantisations

  • Q6_K or Q8 for best quality
  • Q4_K_M for larger models
  • Full precision (FP16) for models of roughly 24B and below
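
The ladder above can be turned into a simple picker: walk the quants from highest quality down and take the first whose estimated weight size fits the card with some headroom for KV cache. The bits-per-weight figures, the 90% headroom factor, and the `pick_quant` helper are all illustrative assumptions, not a llama.cpp API:

```python
# Pick the highest-quality quant whose weights fit in VRAM with headroom.
# Bits-per-weight values are rough rules of thumb; 10% headroom for KV
# cache is an assumption and is tight for long contexts.
QUANT_LADDER = [("FP16", 16.0), ("Q8_0", 8.5), ("Q6_K", 6.56), ("Q4_K_M", 4.85)]

def pick_quant(params_b: float, vram_gb: float = 48, headroom: float = 0.9):
    for name, bpw in QUANT_LADDER:  # best quality first
        if params_b * bpw / 8 <= vram_gb * headroom:
            return name
    return None  # nothing fits on a single card

print(pick_quant(7))    # -> FP16: small models run at full precision
print(pick_quant(30))   # -> Q8_0
print(pick_quant(70))   # -> Q4_K_M
```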

Pros & Cons

Pros

  • 48GB VRAM — handles large models
  • Ampere architecture — good software support
  • Standard dual-slot form factor with display outputs, installs as easily as a consumer card

Cons

  • 300W TDP — high power draw

Community Verdict

  • r/LocalLLaMA

    Only realistic single-GPU option for 70B Q4 models. 48GB VRAM in a workstation form factor.
