NVIDIA Tesla K80 — 24GB
Advertised as 24GB, but the K80 is a dual-GPU card: two Kepler dies with 12GB each, so a single process sees at most 12GB unless the workload is split across both. Kepler is too old for most current AI frameworks. Avoid unless it is nearly free.
Specifications
| Brand | NVIDIA |
|---|---|
| Model | Tesla K80 |
| VRAM | 24GB GDDR5 (2 × 12GB) |
| Architecture | Kepler (dual GK210, compute capability 3.7) |
| CUDA / Stream Processors | 4,992 (2 × 2,496) |
| Memory Bandwidth | 480 GB/s (2 × 240 GB/s per die) |
| TDP | 300W |
| FP32 TFLOPS | 8.7 (combined, boost clocks) |
For AI / LLM Use
On paper the 24GB covers 30B-class models, but in practice this is two 12GB GPUs: any model larger than roughly 12GB must be split across both dies, and not every runtime handles that well. Kepler (compute capability 3.7) has been dropped from recent CUDA toolkits and most prebuilt framework wheels, so verify software support before buying. It is also a passively cooled datacenter card with no display output: plan for headless use and serious chassis airflow or an aftermarket fan shroud.
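Before buying, it is worth knowing what your software stack will actually see. A minimal sketch, assuming a CUDA-enabled PyTorch install; the sm_50 threshold and the wheel-support note reflect the general state of recent builds, not any specific release:

```python
import torch

if not torch.cuda.is_available():
    print("No usable CUDA device; recent PyTorch wheels omit Kepler kernels.")
else:
    # A K80 enumerates as TWO devices, one per 12GB die.
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, sm_{props.major}{props.minor}, "
              f"{props.total_memory / 1e9:.1f} GB")
        if (props.major, props.minor) < (5, 0):  # K80 is sm_37 (Kepler)
            print("  Kepler-era device: expect to need an old framework "
                  "release or a from-source build.")
```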
What Models Can It Run?
The tiers below run from the full 24GB pool down to smaller budgets. On a K80, anything larger than roughly 12GB must be split across both dies; the sizing sketch after this list shows the arithmetic.
- 30B Q4_K_M, 14B full precision, 70B Q2 (tight)
- 14B Q6_K, 30B Q3_K (tight)
- 14B Q4_K_M, 7B full precision
- 7B Q6_K, 14B Q3_K (tight)
- 7B Q4_K_M only
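The arithmetic behind tiers like these is simple: weight bytes are roughly parameters × bits-per-weight / 8, plus headroom for the KV cache and runtime overhead. A back-of-envelope sketch; the bits-per-weight figures are approximate GGUF values and the 20% overhead factor is an assumption, not a benchmark:

```python
# Approximate bits per weight for common GGUF quants (assumed values).
BITS = {"Q4_K_M": 4.8, "Q6_K": 6.6, "Q8": 8.5, "FP16": 16.0}

def est_vram_gb(params_b: float, quant: str, overhead: float = 1.2) -> float:
    """Rough GB needed to load a params_b-billion-parameter model."""
    # Billions of params x bits/8 gives billions of bytes, i.e. GB.
    return params_b * BITS[quant] / 8 * overhead

for params_b, quant in [(30, "Q4_K_M"), (14, "Q6_K"), (7, "FP16"), (7, "Q4_K_M")]:
    need = est_vram_gb(params_b, quant)
    # One K80 die exposes 12GB; using the full 24GB needs a two-way split.
    where = "one die" if need <= 12 else ("both dies" if need <= 24 else "won't fit")
    print(f"{params_b}B {quant}: ~{need:.0f} GB -> {where}")
```

Note what this implies: even a 7B model at FP16 (~17GB with overhead) needs both dies, while 7B Q4_K_M (~5GB) fits comfortably on one.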
Estimated Performance
Generation: ~36 tokens/sec
Prefill: ~155 tokens/sec
Treat these as best-case, bandwidth-derived estimates for a small quantised model on one die; Kepler predates the optimised kernels most inference stacks ship today, so real-world numbers are often lower.
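For context, single-stream generation is usually memory-bandwidth bound: each new token reads roughly the whole weight file once, so throughput tops out near bandwidth divided by model size. A sketch of that bound; the 60% efficiency factor is an assumption, and Kepler often lands well below it:

```python
def gen_tokens_per_sec(bandwidth_gb_s: float, model_gb: float,
                       efficiency: float = 0.6) -> float:
    """Upper-bound estimate: bytes moved per token ~= model weight size."""
    return bandwidth_gb_s / model_gb * efficiency

# One K80 die (~240 GB/s) running a ~4.2 GB 7B Q4_K_M model:
print(f"~{gen_tokens_per_sec(240, 4.2):.0f} tokens/sec")  # -> ~34
```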
Recommended Quantisations
- Q4_K_M recommended for 30B models
- Q6_K or Q8 for 14B and below
- Full precision for 7B only if you split across both dies (a 7B FP16 model is ~14GB of weights, more than a single 12GB die holds)
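Encoded as a rule of thumb; the thresholds simply mirror the list above and are this page's guidance, not a library default:

```python
def recommended_quant(params_b: float) -> str:
    """Pick a quantisation per the recommendations above."""
    if params_b >= 30:
        return "Q4_K_M"
    if params_b > 7:
        return "Q6_K or Q8"
    return "FP16 (full precision, both dies)"

for p in (7, 14, 30):
    print(f"{p}B -> {recommended_quant(p)}")
```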
Pros & Cons
Pros
- 24GB total VRAM (2 × 12GB), enough for large models when split across both dies
Cons
- 300W TDP — high power draw
- Kepler architecture, dropped by current CUDA toolkits and most prebuilt framework wheels
- No display output — headless only
- May need aftermarket cooling solution
Community Verdict
- r/LocalLLaMA: "Avoid. Dual-GPU means 12GB per die, Kepler lacks modern CUDA support, and it draws 300W. Buy a P40 instead."