NVIDIA Tesla P40 — 24GB

Cheapest 24GB you can buy. No display output, needs aftermarket cooling, but unbeatable $/GB.

Specifications

Brand: NVIDIA
Model: Tesla P40
VRAM: 24GB
Architecture: Pascal
CUDA Cores: 3,840
Memory Bandwidth: 347 GB/s
TDP: 250W
FP32 TFLOPS: 12

Price History

Best price up 10% since 2026-03-13

[Price history chart, 2026-03-13 to 2026-05-08: best price ranged from roughly $269 to $368, tracked across Amazon, Newegg, and eBay.]

For AI / LLM Use

Solid choice for 30B models and comfortable 14B inference. Generation is on the slow side: usable, but not snappy. The Pascal architecture (compute capability 6.1) is old enough that newer framework builds may drop support, so check CUDA compatibility before buying. It's a datacenter card: no display output and a passive heatsink, so plan on aftermarket cooling in a desktop build.
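A quick way to confirm your stack still supports Pascal is to query the card's compute capability. A minimal PyTorch sketch (assuming the P40 is CUDA device 0):

```python
import torch

# Pascal cards like the Tesla P40 report compute capability 6.1.
# Newer framework builds are starting to drop pre-Volta (sm < 7.0)
# kernels, so verify before committing to this card.
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)  # assumes the P40 is device 0
    name = torch.cuda.get_device_name(0)
    print(f"{name}: compute capability {major}.{minor}")
    if (major, minor) < (7, 0):
        print("Pascal-era GPU: check that your build still ships sm_61 kernels.")
else:
    print("No CUDA device visible; check your driver install.")
```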

What Models Can It Run?

  • 30B Q4_K_M, 14B full precision, 70B Q2 (tight)
  • 14B Q6_K, 30B Q3_K (tight)
  • 14B Q4_K_M, 7B full precision
  • 7B Q6_K, 14B Q3_K (tight)
  • 7B Q4_K_M
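The fits/tight calls above follow from simple arithmetic: weights take roughly parameter count times bits per weight, and the KV cache has to live in whatever VRAM is left. A back-of-the-envelope sketch (the bits-per-weight figures are rough assumptions for GGUF quants, not exact values):

```python
# Approximate bits per weight for common GGUF quantisations (assumed values).
BITS_PER_WEIGHT = {"Q2_K": 2.6, "Q3_K": 3.4, "Q4_K_M": 4.8,
                   "Q6_K": 6.6, "Q8_0": 8.5, "FP16": 16.0}

def weight_footprint_gb(params_billions: float, quant: str) -> float:
    """Weights-only footprint in GB; KV cache and runtime overhead come on top."""
    return params_billions * BITS_PER_WEIGHT[quant] / 8

for params, quant in [(30, "Q4_K_M"), (70, "Q2_K"), (14, "Q6_K"), (7, "FP16")]:
    gb = weight_footprint_gb(params, quant)
    print(f"{params}B {quant}: ~{gb:.0f} GB weights, ~{24 - gb:.0f} GB left on a 24GB card")
```

A 70B model at Q2 leaves only about 1 GB for the KV cache, which is why it's marked as tight.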

Estimated Performance

Generation: ~26 tokens/sec

Prefill: ~214 tokens/sec
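Both figures are consistent with decoding being memory-bandwidth bound: each generated token streams the entire weight set from VRAM, so generation speed is capped at bandwidth divided by model size. A rough sketch of that ceiling (the example model sizes are illustrative assumptions):

```python
BANDWIDTH_GB_S = 347  # Tesla P40 memory bandwidth

def decode_ceiling_tps(model_gb: float) -> float:
    """Upper bound on generation speed: every new token re-reads all
    weights, so tokens/sec <= memory bandwidth / model footprint."""
    return BANDWIDTH_GB_S / model_gb

for model_gb in (8, 13, 18):  # e.g. ~14B Q4, ~14B Q8, ~30B Q4 footprints
    print(f"{model_gb:>2} GB model: <= {decode_ceiling_tps(model_gb):.0f} tokens/sec")
```

Real-world throughput lands below this ceiling, which is roughly where the ~26 tokens/sec estimate sits for mid-size quantised models.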

Recommended Quantisations

  • Q4_K_M recommended for 30B models
  • Q6_K or Q8 for 14B and below
  • Full precision for 7B

Pros & Cons

Pros

  • 24GB VRAM — handles large models

Cons

  • Moderate memory bandwidth — not the fastest for inference
  • Older architecture — check CUDA/ROCm compatibility
  • No display output — headless only
  • Passive heatsink: needs an aftermarket cooling solution outside a server chassis
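Given the passive heatsink, it's worth logging temperatures after any fan mod. A minimal polling sketch built on nvidia-smi's CSV query interface (the 5-second interval and 85°C warning threshold are arbitrary choices):

```python
import subprocess
import time

# Poll the first GPU's temperature and power draw via nvidia-smi.
QUERY = ["nvidia-smi", "--query-gpu=temperature.gpu,power.draw",
         "--format=csv,noheader,nounits"]

while True:
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    temp_c, power_w = (field.strip() for field in out.splitlines()[0].split(","))
    print(f"temp {temp_c} C, power {power_w} W")
    if float(temp_c) >= 85:  # arbitrary threshold; tune to your cooling setup
        print("WARNING: running hot, improve airflow")
    time.sleep(5)
```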

Community Verdict

  • r/LocalLLaMA

    Budget legend. Needs a blower cooler mod and no display output, but 24GB for under $200 is hard to beat.
