NVIDIA A6000 — 48GB
48GB VRAM monster. One of the only ways to run 70B models on a single workstation-class card.
Specifications
| Specification | Value |
|---|---|
| Brand | NVIDIA |
| Model | A6000 |
| VRAM | 48GB |
| Architecture | Ampere |
| CUDA / Stream Processors | 10,752 |
| Memory Bandwidth | 768 GB/s |
| TDP | 300W |
| FP32 TFLOPS | 38.7 |
GPUDojo is reader-supported. When you buy through links on our site, we may earn an affiliate commission.
For AI / LLM Use
Best for running 70B+ models on a single GPU.
What Models Can It Run?
With 48GB of VRAM, the A6000 can run:
- 70B models at Q4_K_M (~42GB of weights, with room left for context)
- 30B-class models at Q6_K or Q8
- 14B and smaller at full precision (FP16)
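A quick way to sanity-check whether a model fits: weight footprint ≈ parameters × bits-per-weight ÷ 8, plus a few gigabytes for KV cache and runtime overhead. A minimal sketch (the bits-per-weight figures are approximate llama.cpp values, and the 4GB overhead reserve is an assumption — real usage depends on context length):

```python
# Rough VRAM estimate for quantised LLM weights.
# Bits-per-weight are approximate values for llama.cpp K-quants (assumption).
BPW = {"Q4_K_M": 4.85, "Q6_K": 6.59, "Q8_0": 8.5, "FP16": 16.0}

def weights_gb(params_b: float, quant: str) -> float:
    """Approximate weight footprint in GB for params_b billion parameters."""
    return params_b * BPW[quant] / 8

def fits(params_b: float, quant: str, vram_gb: float = 48.0,
         overhead_gb: float = 4.0) -> bool:
    # Reserve ~4GB for KV cache, activations, and CUDA context (assumption).
    return weights_gb(params_b, quant) + overhead_gb <= vram_gb

print(f"70B Q4_K_M: {weights_gb(70, 'Q4_K_M'):.1f} GB, fits: {fits(70, 'Q4_K_M')}")
print(f"70B Q6_K:   {weights_gb(70, 'Q6_K'):.1f} GB, fits: {fits(70, 'Q6_K')}")
print(f"30B Q8_0:   {weights_gb(30, 'Q8_0'):.1f} GB, fits: {fits(30, 'Q8_0')}")
```

By this estimate, 70B at Q4_K_M squeezes in with modest context, while 70B at Q6_K does not — consistent with Q4_K_M being the practical ceiling for 70B on this card.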
Estimated Performance
Generation: ~58 tokens/sec
Prefill: ~691 tokens/sec
(Estimates only — actual throughput varies with model size, quantisation, and context length.)
Recommended Quantisations
- Q6_K or Q8 for best quality
- Q4_K_M for larger models
- Full precision (FP16) for ~14B and below
Pros & Cons
Pros
- 48GB VRAM — handles large models
- Ampere architecture — good software support
- Standard dual-slot workstation card — easy to install, has display outputs
Cons
- 300W TDP — high power draw
Community Verdict
r/LocalLLaMA: "Only realistic single-GPU option for 70B Q4 models. 48GB VRAM in a workstation form factor."