NVIDIA A4000 — 16GB
Specifications
| Specification | Value |
|---|---|
| Brand | NVIDIA |
| Model | A4000 |
| VRAM | 16GB |
| Architecture | Ampere |
| CUDA Cores | 6,144 |
| Memory Bandwidth | 448 GB/s |
| TDP | 140W |
| FP32 TFLOPS | 19.2 |
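
The FP32 and bandwidth figures in the table can be roughly reproduced from first principles. The sketch below assumes a boost clock of about 1.56 GHz, a 256-bit memory bus, and a 14 Gbps per-pin GDDR6 data rate. None of these three values appear on this page, so treat them as assumptions to check against NVIDIA's datasheet.

```python
# Back-of-the-envelope check of the spec table.
# CUDA_CORES comes from the table; the other three constants are
# ASSUMPTIONS not listed on this page.
CUDA_CORES = 6144
BOOST_CLOCK_GHZ = 1.56   # assumed boost clock
BUS_WIDTH_BITS = 256     # assumed memory bus width
DATA_RATE_GBPS = 14.0    # assumed per-pin GDDR6 data rate


def fp32_tflops(cores: int, clock_ghz: float) -> float:
    """Peak FP32 throughput: each core retires one FMA (2 FLOPs) per clock."""
    return 2 * cores * clock_ghz / 1000.0


def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: bus width in bits times per-pin rate, over 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8.0


if __name__ == "__main__":
    print(f"FP32: {fp32_tflops(CUDA_CORES, BOOST_CLOCK_GHZ):.1f} TFLOPS")
    print(f"Bandwidth: {memory_bandwidth_gbs(BUS_WIDTH_BITS, DATA_RATE_GBPS):.0f} GB/s")
```

Both results are theoretical peaks. Sustained clocks and achievable bandwidth under real workloads are lower.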
Price History
Best price up 5% since 2026-03-13
Amazon, eBay
For AI / LLM Use
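Good for 14B models; 30B requires aggressive quantization. The RTX A4000 is a single-slot workstation card with an active blower cooler and four DisplayPort outputs, so it needs no aftermarket cooling and can drive displays directly.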
What Models Can It Run?
- 14B Q6_K, 30B Q3_K (tight)
- 14B Q4_K_M, 7B full precision
- 7B Q6_K, 14B Q3_K (tight)
- 7B Q4_K_M only
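
A rough way to sanity-check fits like these is to estimate the weight footprint from the parameter count and the quantization's bits per weight, then add a fixed allowance for KV cache and runtime buffers. In the sketch below, the bits-per-weight values are approximate figures for GGUF quantizations and the 1 GB overhead is an arbitrary placeholder. Both are assumptions, not numbers from this page.

```python
# Approximate average bits per weight for common GGUF quantizations.
# These are ASSUMED ballpark values; actual file sizes vary by model.
BITS_PER_WEIGHT = {
    "Q3_K": 3.9,
    "Q4_K_M": 4.85,
    "Q6_K": 6.56,
    "Q8_0": 8.5,
    "FP16": 16.0,
}


def weights_gb(params_billions: float, quant: str) -> float:
    """Estimated weight size in GB: parameters times bits per weight, over 8."""
    return params_billions * BITS_PER_WEIGHT[quant] / 8.0


def fits(params_billions: float, quant: str,
         vram_gb: float = 16.0, overhead_gb: float = 1.0) -> bool:
    """True if weights plus an assumed fixed overhead fit in VRAM.

    overhead_gb stands in for KV cache and runtime buffers; it is a
    placeholder and grows with context length in practice.
    """
    return weights_gb(params_billions, quant) + overhead_gb <= vram_gb


if __name__ == "__main__":
    for size, quant in [(14, "Q6_K"), (30, "Q3_K"), (30, "Q4_K_M"), (7, "FP16")]:
        need = weights_gb(size, quant)
        verdict = "fits" if fits(size, quant) else "does not fit"
        print(f"{size}B {quant}: ~{need:.1f} GB weights, {verdict} in 16 GB")
```

Because the KV cache grows with context length, a configuration that fits at a short context can stop fitting at a longer one, so raise `overhead_gb` accordingly.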
Estimated Performance
Generation: ~34 tokens/sec
Prefill: ~343 tokens/sec
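
Token generation is usually limited by memory bandwidth, because every generated token requires reading roughly all of the model weights once. That gives a simple ceiling: bandwidth divided by weight size. The page does not say which model the estimates above refer to, so the 8 GB weight size in this sketch is a hypothetical example, not the configuration behind the ~34 tokens/sec figure.

```python
def decode_ceiling_tps(bandwidth_gbs: float, weights_gb: float) -> float:
    """Upper bound on generation speed, assuming each token reads all weights once."""
    return bandwidth_gbs / weights_gb


def implied_efficiency(observed_tps: float, bandwidth_gbs: float,
                       weights_gb: float) -> float:
    """Fraction of the bandwidth ceiling that an observed speed achieves."""
    return observed_tps / decode_ceiling_tps(bandwidth_gbs, weights_gb)


if __name__ == "__main__":
    BANDWIDTH_GBS = 448.0        # from the spec table
    EXAMPLE_WEIGHTS_GB = 8.0     # HYPOTHETICAL model size, not from this page
    ceiling = decode_ceiling_tps(BANDWIDTH_GBS, EXAMPLE_WEIGHTS_GB)
    print(f"Ceiling for {EXAMPLE_WEIGHTS_GB:.0f} GB of weights: {ceiling:.0f} tokens/sec")
    print(f"34 tokens/sec would be {implied_efficiency(34.0, BANDWIDTH_GBS, EXAMPLE_WEIGHTS_GB):.0%} of that ceiling")
```

Prefill behaves differently: prompt tokens are processed in parallel, so it tends to be compute-bound rather than bandwidth-bound, which is why prefill rates are much higher than generation rates.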
Recommended Quantizations
- Q4_K_M for 14B models
- Q6_K for 7B-8B models
- Q8_0 for 7B models if VRAM allows (weights alone take roughly 7.5GB)
Pros & Cons
Pros
- Only 140W TDP — power efficient
- Ampere architecture — good software support
Cons
- 16GB VRAM: 30B models fit only with aggressive quantization (around Q3_K)
- Moderate memory bandwidth (448 GB/s) limits token generation speed
- Single-slot blower cooler can run warm and loud under sustained load