NVIDIA A4000 — 16GB

Specifications

Brand: NVIDIA
Model: A4000
VRAM: 16GB
Architecture: Ampere
CUDA / Stream Processors: 6,144
Memory Bandwidth: 448 GB/s
TDP: 140W
FP32 TFLOPS: 19.2
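
The FP32 figure can be sanity-checked from the core count. The sketch below assumes a boost clock of about 1.56 GHz, which is not listed on this page, and counts one fused multiply-add (2 FLOPs) per core per cycle. It also expresses the listed throughput per watt of TDP.

```python
# Figures taken from the specifications listed above.
CUDA_CORES = 6144
TDP_WATTS = 140
LISTED_FP32_TFLOPS = 19.2

# ASSUMPTION: boost clock in GHz. This value is not given on this page.
ASSUMED_BOOST_GHZ = 1.56


def peak_fp32_tflops(cores: int, clock_ghz: float) -> float:
    """Peak FP32 throughput, counting one FMA (2 FLOPs) per core per cycle."""
    return 2 * cores * clock_ghz / 1000.0


estimate = peak_fp32_tflops(CUDA_CORES, ASSUMED_BOOST_GHZ)
gflops_per_watt = LISTED_FP32_TFLOPS * 1000 / TDP_WATTS

print(f"Estimated peak FP32: {estimate:.2f} TFLOPS")
print(f"Listed FP32 per watt of TDP: {gflops_per_watt:.0f} GFLOPS/W")
```

Under that assumed clock the estimate lands within rounding of the listed 19.2 TFLOPS. Peak figures like this are ceilings and say little about LLM generation speed, which is mostly limited by memory bandwidth.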

Price History

Prices stable since 2026-03-13

[Price history chart. Dates: 2026-03-13 to 2026-05-08. Sellers: Amazon, Newegg, eBay. Price labels: $1199, $950, $1369, $1475, $825, $850.]

For AI / LLM Use

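Good for 14B models. 30B models fit only with aggressive quantization (around Q3_K). The A4000 is a single-slot workstation card with an active blower cooler and four DisplayPort outputs, so it needs neither aftermarket cooling nor a separate display adapter.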

What Models Can It Run?

  • 14B Q6_K, 30B Q3_K (tight)
  • 14B Q4_K_M, 7B full precision
  • 7B Q6_K, 14B Q3_K (tight)
  • 7B Q4_K_M only
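
Combinations like these can be roughly checked with a weights-only size estimate. This is a minimal sketch: the bits-per-weight values are approximate averages for llama.cpp-style quantizations, not figures from this page, and "Q3_K" is read here as Q3_K_M. Real GGUF files vary by model, and the KV cache and runtime buffers need additional room.

```python
VRAM_GB = 16.0  # from the specifications above

# ASSUMPTION: approximate average bits per weight for llama.cpp-style quants.
# Actual file sizes vary by model architecture and quant mix.
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,
    "Q6_K": 6.56,
    "Q4_K_M": 4.85,
    "Q3_K_M": 3.9,
}


def weight_gb(params_billion: float, quant: str) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8.0


def headroom_gb(params_billion: float, quant: str, vram_gb: float = VRAM_GB) -> float:
    """VRAM left for KV cache and buffers; negative means the weights do not fit."""
    return vram_gb - weight_gb(params_billion, quant)


for params, quant in [(14, "Q6_K"), (30, "Q3_K_M"), (14, "Q4_K_M"), (7, "F16")]:
    size = weight_gb(params, quant)
    spare = headroom_gb(params, quant)
    print(f"{params}B {quant}: ~{size:.1f} GB weights, ~{spare:.1f} GB headroom")
```

With these assumed values, a 30B model at Q3_K_M leaves under 2 GB of headroom, which matches the "tight" label, while 30B at Q4_K_M exceeds 16 GB before any cache is allocated.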

Estimated Performance

Generation: ~34 tokens/sec

Prefill: ~343 tokens/sec
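
Generation speed on a single GPU is usually bounded by memory bandwidth, because every generated token reads roughly all of the weights once. The sketch below computes that ceiling from the 448 GB/s figure. The 8.5 GB model size and the 0.6 efficiency factor are hypothetical illustration values; this page does not state which model or method its estimates are based on.

```python
MEMORY_BANDWIDTH_GBPS = 448.0  # from the specifications above


def decode_tokens_per_sec(model_gb: float,
                          bandwidth_gbps: float = MEMORY_BANDWIDTH_GBPS,
                          efficiency: float = 1.0) -> float:
    """Bandwidth-bound estimate of generation speed.

    Assumes each generated token streams the full set of weights from VRAM
    once. `efficiency` (0..1) is a hypothetical fraction of peak bandwidth
    actually achieved; 1.0 gives the theoretical ceiling.
    """
    if model_gb <= 0:
        raise ValueError("model_gb must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return bandwidth_gbps * efficiency / model_gb


# Hypothetical example: a model whose weights occupy about 8.5 GB.
ceiling = decode_tokens_per_sec(8.5)
derated = decode_tokens_per_sec(8.5, efficiency=0.6)
print(f"Ceiling: {ceiling:.1f} tokens/sec, at 60% efficiency: {derated:.1f} tokens/sec")
```

Prefill behaves differently: it processes many tokens per pass over the weights, so it is limited more by compute than by bandwidth, which is why prefill rates are typically much higher than generation rates.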

Recommended Quantisations

  • Q4_K_M for 14B models
  • Q6_K for 7B-8B models
  • Q8 for 7B if VRAM allows
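
Whether VRAM allows a given quantization also depends on the KV cache, which grows with context length. As a rough sketch, the example below adds an FP16 KV cache to a 7B Q8_0 model. The model shape (32 layers, 32 KV heads, head dimension 128) and the 4096-token context are assumptions chosen for illustration, not values from this page. Models with grouped-query attention have fewer KV heads and a smaller cache.

```python
VRAM_GB = 16.0  # from the specifications above


def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                context_tokens: int, bytes_per_value: int = 2) -> float:
    """Size of the K and V tensors for all layers and cached tokens, in GB (1e9 bytes)."""
    values = 2 * n_layers * n_kv_heads * head_dim * context_tokens  # 2 = K and V
    return values * bytes_per_value / 1e9


# ASSUMPTION: 7B parameters at about 8.5 bits per weight (Q8_0).
weights_gb = 7 * 8.5 / 8

# ASSUMPTION: hypothetical 7B model shape and a 4096-token context.
cache_gb = kv_cache_gb(n_layers=32, n_kv_heads=32, head_dim=128, context_tokens=4096)

total_gb = weights_gb + cache_gb
print(f"Weights ~{weights_gb:.1f} GB + KV cache ~{cache_gb:.1f} GB = ~{total_gb:.1f} GB")
print("Fits in 16 GB" if total_gb < VRAM_GB else "Does not fit in 16 GB")
```

Under these assumptions the total stays well below 16 GB. Longer contexts scale the cache linearly, so the same check is worth repeating for the context length actually used.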

Pros & Cons

Pros

  • Only 140W TDP — power efficient
  • Ampere architecture — good software support

Cons

  • 16GB VRAM: 30B+ models need aggressive quantization (around Q3_K)
  • Moderate memory bandwidth (448 GB/s), which limits token generation speed
  • Single-slot blower cooler: sustained loads may need good case airflow
