NVIDIA Tesla M40 - 24GB

Cheapest 24GB GPU at ~$80. Maxwell architecture is slow but the VRAM capacity is real. Needs aftermarket cooling.

Specifications

Brand	NVIDIA
Model	Tesla M40
VRAM	24GB
Architecture	Maxwell
CUDA / Stream Processors	3,072
Memory Bandwidth	288 GB/s
TDP	250W
FP32 TFLOPS	7

Current Prices

eBay	€175	€7.29/GB
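The per-gigabyte figure is the listing price divided by the VRAM capacity. A minimal sketch of that arithmetic follows; the function name `price_per_gb` is made up for illustration.

```python
def price_per_gb(price: float, vram_gb: float) -> float:
    """Return the cost per gigabyte of VRAM, rounded to two decimals."""
    return round(price / vram_gb, 2)


# Listing above: EUR 175 for a 24 GB card.
eur_per_gb = price_per_gb(175, 24)
print(f"EUR {eur_per_gb:.2f}/GB")
```

The same function can be used to compare listings for other cards on a like-for-like basis.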

Prices last updated: 7/8/2026


Price History

Best price dropped 12% since 2026-03-13


For AI / LLM Use

A solid choice for 30B models and comfortable for 14B inference. Generation is slower than on newer cards: usable, but not snappy. The older Maxwell architecture may have limited software support, so check CUDA compatibility before buying. This is a passively cooled datacenter card with no display output, so plan for headless use and aftermarket cooling.
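One way to check CUDA compatibility is to compare the card's compute capability against the minimum your framework build requires. The Tesla M40 (GM200, Maxwell) reports compute capability 5.2. The sketch below assumes a recent NVIDIA driver whose `nvidia-smi` supports the `compute_cap` query field; the minimum values passed to `meets_minimum` are hypothetical and should be taken from your framework's release notes.

```python
import subprocess


def parse_compute_capability(text: str) -> tuple[int, int]:
    """Parse a string such as '5.2' into (major, minor)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def query_compute_capability() -> tuple[int, int]:
    """Ask nvidia-smi for the first GPU's compute capability.

    Assumes a driver recent enough to expose the compute_cap field.
    """
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_compute_capability(out.splitlines()[0])


def meets_minimum(cap: tuple[int, int], minimum: tuple[int, int]) -> bool:
    """True if the card's capability is at least the required minimum."""
    return cap >= minimum


# Example with the M40's capability and a hypothetical framework minimum.
m40 = parse_compute_capability("5.2")
print(meets_minimum(m40, (5, 0)))
```

On a machine with the card installed, `query_compute_capability()` can replace the hard-coded string.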

What Models Can It Run?

30B Q4_K_M, 14B full precision, 70B Q2 (tight)
14B Q6_K, 30B Q3_K (tight)
14B Q4_K_M, 7B full precision
7B Q6_K, 14B Q3_K (tight)
7B Q4_K_M only
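Whether a model fits is mostly a question of weight size versus VRAM. The sketch below estimates weight size from parameter count and bits per weight. The bits-per-weight values are approximate figures for llama.cpp formats, and the 2 GB headroom for KV cache and runtime buffers is an assumption, not a measured value.

```python
# Approximate bits per weight; rough assumptions, not exact file sizes.
BITS_PER_WEIGHT = {"F16": 16.0, "Q8_0": 8.5, "Q6_K": 6.56, "Q4_K_M": 4.85}


def weight_size_gb(params_billion: float, quant: str) -> float:
    """Estimated size of the weights in GB (10^9 bytes)."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8


def fits(params_billion: float, quant: str,
         vram_gb: float = 24.0, headroom_gb: float = 2.0) -> bool:
    """True if weights plus an assumed headroom fit in VRAM."""
    return weight_size_gb(params_billion, quant) + headroom_gb <= vram_gb


print(round(weight_size_gb(30, "Q4_K_M"), 1), fits(30, "Q4_K_M"))
print(round(weight_size_gb(30, "Q6_K"), 1), fits(30, "Q6_K"))
```

Long contexts need more headroom than the default used here, so treat a marginal fit as tight.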

Estimated Performance

Generation: ~22 tokens/sec

Prefill: ~125 tokens/sec
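For context, two common rules of thumb give rough upper bounds on these numbers: generation is limited by how fast the weights can be read from memory, and prefill is limited by compute at roughly 2 FLOPs per parameter per token. The sketch below applies both to the card's specifications. It is not necessarily how the estimates above were derived, and real throughput sits below these ceilings.

```python
def decode_ceiling_tps(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on generation speed: each token reads all weights once."""
    return bandwidth_gb_s / model_size_gb


def prefill_ceiling_tps(tflops: float, params_billion: float) -> float:
    """Upper bound on prefill speed, assuming ~2 FLOPs per parameter per token."""
    return tflops * 1e12 / (2 * params_billion * 1e9)


# Tesla M40: 288 GB/s and 7 FP32 TFLOPS (from the specifications above).
# The 18 GB model size and 30B parameter count are illustrative inputs.
print(decode_ceiling_tps(288, 18))
print(round(prefill_ceiling_tps(7, 30), 1))
```

Smaller or more heavily quantised models raise the generation ceiling because fewer bytes are read per token.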

Recommended Quantisations

Q4_K_M recommended for 30B models
Q6_K or Q8 for 14B and below
Full precision for 7B
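These recommendations follow a simple rule: pick the highest-quality format whose weights still fit. A minimal sketch of that rule is below, using approximate bits-per-weight values for llama.cpp formats and an assumed 2 GB of headroom; both are assumptions rather than figures from this page.

```python
# Formats ordered from highest to lowest quality, with approximate
# bits per weight (rough assumptions).
QUANTS = [("F16", 16.0), ("Q8_0", 8.5), ("Q6_K", 6.56), ("Q4_K_M", 4.85)]


def best_quant(params_billion: float, vram_gb: float = 24.0,
               headroom_gb: float = 2.0):
    """Return the highest-quality format that fits, or None if none does."""
    budget = vram_gb - headroom_gb
    for name, bits in QUANTS:
        if params_billion * bits / 8 <= budget:
            return name
    return None


for size in (30, 14, 7):
    print(f"{size}B -> {best_quant(size)}")
```

With these assumptions the rule selects Q4_K_M for 30B, Q8_0 for 14B, and F16 for 7B on a 24GB card.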

Pros & Cons

Pros

24GB VRAM: fits models up to 30B at Q4_K_M

Cons

Low memory bandwidth (288 GB/s): slower token generation
Older architecture: check CUDA compatibility
No display output: headless only
May need aftermarket cooling solution

Community Verdict

r/LocalLLaMA
The absolute cheapest 24GB card. Passively cooled, so you need a blower mod. Slow, but VRAM is VRAM.
