NVIDIA Tesla P100 — 16GB
Budget datacenter card with 16GB of HBM2. The older Pascal architecture limits software support, but it still handles 14B models.
Specifications
| Spec | Value |
|---|---|
| Brand | NVIDIA |
| Model | Tesla P100 |
| VRAM | 16GB |
| Architecture | Pascal |
| CUDA Cores | 3,584 |
| Memory Bandwidth | 732 GB/s |
| TDP | 250W |
| FP32 TFLOPS | 9.3 |
For AI / LLM Use
Good for 14B models; 30B requires aggressive quantisation. Pascal (compute capability 6.0) predates tensor cores and has been deprecated in recent CUDA releases, so verify that your inference stack still supports it. As a datacenter card it has no display output, and the passive heatsink needs server-grade airflow or an aftermarket cooling solution.
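A quick way to verify support is to query the compute capability directly. A minimal sketch using PyTorch, assuming a CUDA-enabled PyTorch build is installed:

```python
import torch

# The Tesla P100 reports compute capability 6.0 (Pascal).
# Many modern kernels (e.g. FlashAttention) require 7.0 or higher.
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    name = torch.cuda.get_device_name(0)
    print(f"{name}: compute capability {major}.{minor}")
    if (major, minor) < (7, 0):
        print("Pre-Volta GPU: expect deprecation warnings in recent CUDA stacks.")
else:
    print("No CUDA device visible to PyTorch.")
```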
What Models Can It Run?
Listed from the largest models that fit down to the most conservative picks, leaving progressively more VRAM free for context (the sizing sketch after this list shows the arithmetic):
- 14B Q6_K, 30B Q3_K (tight)
- 14B Q4_K_M, 7B full precision
- 7B Q6_K, 14B Q3_K (tight)
- 7B Q4_K_M only
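To check a specific model before downloading it, you can estimate the footprint yourself. A rough sketch of the arithmetic; the bits-per-weight constants and the example model shape are ballpark assumptions, not measured values:

```python
# Ballpark VRAM estimate for a GGUF model on a 16GB card.
# Bits-per-weight figures are typical llama.cpp values (assumptions).
BPW = {"Q3_K": 3.4, "Q4_K_M": 4.8, "Q6_K": 6.6, "Q8_0": 8.5, "F16": 16.0}

def weights_gb(params_b: float, quant: str) -> float:
    """Approximate weight size in GB for params_b billion parameters."""
    return params_b * BPW[quant] / 8

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context: int, bytes_per_el: int = 2) -> float:
    """FP16 KV cache: 2 tensors (K and V) per layer, per position."""
    return 2 * layers * kv_heads * head_dim * context * bytes_per_el / 1e9

# Example: a hypothetical 14B model (48 layers, 8 KV heads, head_dim 128)
# at Q4_K_M with 8K context on the P100's 16GB.
total = weights_gb(14, "Q4_K_M") + kv_cache_gb(48, 8, 128, 8192)
print(f"~{total:.1f} GB of 16 GB")  # leave ~1 GB headroom for CUDA overhead
```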
Estimated Performance
Generation: ~55 tokens/sec
Prefill: ~166 tokens/sec
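These figures track memory bandwidth: each generated token streams the full weights through the GPU, so decode speed is roughly bandwidth divided by model size. A back-of-envelope sketch (the 0.5 efficiency factor is an assumption; real throughput depends on kernels and context length):

```python
# Bandwidth-bound decode estimate: tokens/sec ~ bandwidth / bytes per token.
BANDWIDTH_GBS = 732  # P100 HBM2, from the spec table above
EFFICIENCY = 0.5     # assumed fraction of peak bandwidth actually achieved

def decode_tps(model_gb: float) -> float:
    """Estimated generation speed for a model occupying model_gb of VRAM."""
    return BANDWIDTH_GBS * EFFICIENCY / model_gb

print(f"14B Q4_K_M (~8.4 GB): ~{decode_tps(8.4):.0f} tok/s")
print(f"7B Q4_K_M (~4.2 GB):  ~{decode_tps(4.2):.0f} tok/s")
```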
Recommended Quantisations
- Q4_K_M for 14B models
- Q6_K for 7B-8B models
- Q8_0 for 7B if VRAM allows
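To put these picks into practice, here is a minimal sketch using the llama-cpp-python bindings; the model path is a placeholder, and full offload assumes the roughly 8.5GB that a 14B Q4_K_M needs (see the sizing sketch above):

```python
from llama_cpp import Llama  # pip install llama-cpp-python (CUDA build)

# Placeholder path: any 14B Q4_K_M GGUF (~8.5 GB) fits in 16 GB.
llm = Llama(
    model_path="models/14b-instruct.Q4_K_M.gguf",
    n_gpu_layers=-1,  # offload every layer to the P100
    n_ctx=4096,       # keep the KV cache modest on a 16 GB card
)

out = llm("Explain HBM2 in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```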
Pros & Cons
Pros
- 16GB of HBM2 VRAM at a budget price
- High memory bandwidth (732 GB/s)
- Comfortable fit for quantised 14B models
Cons
- Pascal architecture deprecated in recent CUDA releases; check software compatibility
- No display output — headless only
- May need aftermarket cooling solution
Community Verdict
- r/LocalLLaMA
16GB HBM2 on a budget. Pascal architecture limits software support but handles 14B Q4 models.