NVIDIA Tesla K80 — 24GB
Advertised as 24GB, but the K80 is a dual-GPU card: two Kepler dies with 12GB each, so a single process sees at most 12GB unless the workload is split across both. Kepler is too old for most current AI frameworks. Avoid unless it is nearly free.
Specifications
| Brand | NVIDIA |
|---|---|
| Model | Tesla K80 |
| VRAM | 24GB GDDR5 (2 × 12GB) |
| Architecture | Kepler (dual GK210, compute capability 3.7) |
| CUDA / Stream Processors | 4,992 (2 × 2,496) |
| Memory Bandwidth | 480 GB/s (2 × 240 GB/s per die) |
| TDP | 300W |
| FP32 TFLOPS | 8.7 (combined, boost clocks) |
For AI / LLM Use
On paper the 24GB covers 30B-class models, but in practice this is two 12GB GPUs: any model larger than roughly 12GB must be split across both dies, and not every runtime handles that well. Kepler (compute capability 3.7) has been dropped from recent CUDA toolkits and most prebuilt framework wheels, so verify software support before buying. It is also a passively cooled datacenter card with no display output: plan for headless use and serious chassis airflow or an aftermarket fan shroud.
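Before buying, it is worth knowing what your software stack will actually see. A minimal sketch, assuming a CUDA-enabled PyTorch install; the sm_50 threshold and the wheel-support note reflect the general state of recent builds, not any specific release:

```python
import torch

if not torch.cuda.is_available():
    print("No usable CUDA device; recent PyTorch wheels omit Kepler kernels.")
else:
    # A K80 enumerates as TWO devices, one per 12GB die.
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, sm_{props.major}{props.minor}, "
              f"{props.total_memory / 1e9:.1f} GB")
        if (props.major, props.minor) < (5, 0):  # K80 is sm_37 (Kepler)
            print("  Kepler-era device: expect to need an old framework "
                  "release or a from-source build.")
```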
What Models Can It Run?
The tiers below run from the full 24GB pool down to smaller budgets. On a K80, anything larger than roughly 12GB must be split across both dies; the sizing sketch after this list shows the arithmetic.
- 30B Q4_K_M, 14B full precision, 70B Q2 (tight)
- 14B Q6_K, 30B Q3_K (tight)
- 14B Q4_K_M, 7B full precision
- 7B Q6_K, 14B Q3_K (tight)
- 7B Q4_K_M only
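The arithmetic behind tiers like these is simple: weight bytes are roughly parameters × bits-per-weight / 8, plus headroom for the KV cache and runtime overhead. A back-of-envelope sketch; the bits-per-weight figures are approximate GGUF values and the 20% overhead factor is an assumption, not a benchmark:

```python
# Approximate bits per weight for common GGUF quants (assumed values).
BITS = {"Q4_K_M": 4.8, "Q6_K": 6.6, "Q8": 8.5, "FP16": 16.0}

def est_vram_gb(params_b: float, quant: str, overhead: float = 1.2) -> float:
    """Rough GB needed to load a params_b-billion-parameter model."""
    # Billions of params x bits/8 gives billions of bytes, i.e. GB.
    return params_b * BITS[quant] / 8 * overhead

for params_b, quant in [(30, "Q4_K_M"), (14, "Q6_K"), (7, "FP16"), (7, "Q4_K_M")]:
    need = est_vram_gb(params_b, quant)
    # One K80 die exposes 12GB; using the full 24GB needs a two-way split.
    where = "one die" if need <= 12 else ("both dies" if need <= 24 else "won't fit")
    print(f"{params_b}B {quant}: ~{need:.0f} GB -> {where}")
```

Note what this implies: even a 7B model at FP16 (~17GB with overhead) needs both dies, while 7B Q4_K_M (~5GB) fits comfortably on one.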
Estimated Performance
Generation: ~36 tokens/sec
Prefill: ~155 tokens/sec
Treat these as best-case, bandwidth-derived estimates for a small quantised model on one die; Kepler predates the optimised kernels most inference stacks ship today, so real-world numbers are often lower.
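For context, single-stream generation is usually memory-bandwidth bound: each new token reads roughly the whole weight file once, so throughput tops out near bandwidth divided by model size. A sketch of that bound; the 60% efficiency factor is an assumption, and Kepler often lands well below it:

```python
def gen_tokens_per_sec(bandwidth_gb_s: float, model_gb: float,
                       efficiency: float = 0.6) -> float:
    """Upper-bound estimate: bytes moved per token ~= model weight size."""
    return bandwidth_gb_s / model_gb * efficiency

# One K80 die (~240 GB/s) running a ~4.2 GB 7B Q4_K_M model:
print(f"~{gen_tokens_per_sec(240, 4.2):.0f} tokens/sec")  # -> ~34
```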
Recommended Quantisations
- Q4_K_M recommended for 30B models
- Q6_K or Q8 for 14B and below
- Full precision for 7B only if you split across both dies (a 7B FP16 model is ~14GB of weights, more than a single 12GB die holds)
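Encoded as a rule of thumb; the thresholds simply mirror the list above and are this page's guidance, not a library default:

```python
def recommended_quant(params_b: float) -> str:
    """Pick a quantisation per the recommendations above."""
    if params_b >= 30:
        return "Q4_K_M"
    if params_b > 7:
        return "Q6_K or Q8"
    return "FP16 (full precision, both dies)"

for p in (7, 14, 30):
    print(f"{p}B -> {recommended_quant(p)}")
```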
Pros & Cons
Pros
- 24GB total VRAM (2 × 12GB), enough for large models when split across both dies
Cons
- 300W TDP — high power draw
- Kepler architecture, dropped by current CUDA toolkits and most prebuilt framework wheels
- No display output — headless only
- May need aftermarket cooling solution
Community Verdict
- r/LocalLLaMA: "Avoid. Dual-GPU means 12GB per die, Kepler lacks modern CUDA support, and it draws 300W. Buy a P40 instead."