browse/nvidia/rtx-pro-6000-blackwell
01 // Inference benchmarks
Single-stream decode · llama.cpp
Qwen3 32B · Q4_K_M
142 t/s
# env: llama.cpp b4732 · 4096 ctx · batch=1 · prompt=512 · temp=0.0 · median of 5 runs
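A minimal sketch of how a "median of N runs" figure like the one above could be derived from raw timings. The function names and the timings in the demo are hypothetical placeholders, not measurements from this card or output from llama.cpp.

```python
from statistics import median


def tokens_per_second(n_tokens: int, seconds: float) -> float:
    """Decode throughput for one run: generated tokens divided by wall time."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return n_tokens / seconds


def summarize_runs(runs: list[tuple[int, float]]) -> float:
    """Return the median tokens/s over a list of (n_tokens, seconds) runs."""
    if not runs:
        raise ValueError("need at least one run")
    return median(tokens_per_second(n, s) for n, s in runs)


if __name__ == "__main__":
    # Placeholder timings only (hypothetical, chosen for easy arithmetic).
    demo_runs = [(100, 1.0), (100, 2.0), (100, 4.0), (100, 5.0), (100, 10.0)]
    print(f"median: {summarize_runs(demo_runs):.1f} t/s")
```

Reporting the median rather than the mean keeps a single slow run (for example a cold start) from skewing the headline number.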
02 // Hardware specs
Architecture: Blackwell
Process node: TSMC 4NP
Memory: 96 GB
Memory bandwidth: 1,792 GB/s
FP16 compute: 165 TFLOPS
INT8 compute: 330 TOPS
TDP: 600 W
PCIe: Gen 5 x16
Form factor: Dual-slot 2.5
Cooling: Blower
03 // Model fit
Approximate VRAM required to load weights + 4096 ctx KV cache.
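A rough sketch of how a "weights + KV cache" estimate can be computed. This is not the page's own method. It assumes a standard transformer KV cache (one K and one V tensor per layer) stored in FP16, and a flat `overhead_gb` allowance for runtime buffers. The model shape and the ~4.85 bits per weight used for Q4_K_M in the demo are illustrative assumptions; check them against the actual model card and GGUF file.

```python
def estimate_vram_gb(
    n_params_b: float,       # parameter count, in billions
    bits_per_weight: float,  # effective bits per weight for the quant (assumed)
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    ctx_len: int = 4096,
    kv_bytes_per_elem: int = 2,  # FP16 KV cache (assumption)
    overhead_gb: float = 1.0,    # flat allowance for runtime buffers (assumption)
) -> float:
    """Approximate VRAM in GB (1 GB = 1e9 bytes) for weights plus KV cache."""
    weight_bytes = n_params_b * 1e9 * bits_per_weight / 8
    # Factor of 2 covers the separate K and V tensors in every layer.
    kv_cache_bytes = 2 * n_layers * n_kv_heads * head_dim * ctx_len * kv_bytes_per_elem
    return (weight_bytes + kv_cache_bytes) / 1e9 + overhead_gb


def fits(required_gb: float, vram_gb: float = 96.0) -> bool:
    """True if the estimate fits in the given VRAM budget."""
    return required_gb <= vram_gb


if __name__ == "__main__":
    # Illustrative shape for a 32B-class model with grouped-query attention.
    need = estimate_vram_gb(32.8, 4.85, n_layers=64, n_kv_heads=8, head_dim=128)
    print(f"~{need:.1f} GB needed, fits in 96 GB: {fits(need)}")
```

At short contexts the weights dominate; the KV cache term grows linearly with `ctx_len`, so long-context use shifts the balance.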
+ STRENGTHS
- ✓ 96 GB VRAM fits 100B+ models at Q4 on a single card (a 200B model needs about 100 GB for 4-bit weights alone)
- ✓ 1,792 GB/s memory bandwidth · top tier in its class
- ✓ Strong tooling: FP16, FP8, Q8, Q4 all officially supported
− TRADE-OFFS
- − Draws 600 W under load; plan PSU and thermals accordingly
- − $8,499 puts this firmly in the pro tier
- − Driver lock-in to the vendor stack