Notes from the lab
Long-form benchmarks, buying guides, and write-ups on local AI hardware. Updated weekly.
Best GPUs for Running AI Models Locally in 2026: Ranked by tok/s per Dollar
We benchmarked 7 GPUs from $749 to $9,499 on Qwen3 32B with llama.cpp. The RTX 3090 at $749 used delivers the best value. The RTX 5090 at $1,999 is the best overall. Here is every data point.
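For readers who want to reproduce the value ranking, the metric is just throughput divided by price. A minimal sketch below; the tok/s figures are illustrative placeholders, not our measured results (those are in the article):

```python
# Rank GPUs by tok/s per dollar. Numbers here are placeholders for illustration.
gpus = {
    "RTX 3090 (used)": {"price_usd": 749, "tok_s": 30.0},
    "RTX 5090": {"price_usd": 1999, "tok_s": 75.0},
}

def toks_per_dollar(entry):
    """Value metric: decode throughput per dollar of purchase price."""
    return entry["tok_s"] / entry["price_usd"]

# Sort best-value first and print normalized to tok/s per $1,000.
ranked = sorted(gpus.items(), key=lambda kv: toks_per_dollar(kv[1]), reverse=True)
for name, entry in ranked:
    print(f"{name}: {toks_per_dollar(entry) * 1000:.1f} tok/s per $1,000")
```

With these placeholder numbers the used 3090 still comes out ahead on value even though the 5090 is faster in absolute terms, which mirrors the ranking in the article.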
Best Budget GPU for AI Under $1,000 in 2026: Every Option Ranked
We ranked every GPU under $1,000 for local AI inference. The used RTX 3090 at $749 wins on VRAM. The RTX 5070 Ti at $749 wins on tok/s. Here is the full breakdown with benchmarks.
Running Qwen3 235B on a single Mac Studio
We pushed Apple's M3 Ultra with 512GB unified memory to its limits. Here's what 22 tok/s of dense inference actually feels like.
AMD vs NVIDIA for Local AI Inference in 2026: ROCm Has Finally Caught Up
ROCm 7.2 changed the game. The AMD RX 7900 XTX with 24GB at $849 now runs Ollama, llama.cpp, and vLLM out of the box. We compare the full AMD vs NVIDIA stack for local inference — hardware, software, and real-world experience.
RTX PRO 6000 Blackwell vs H100: Which One for Your Home Lab? (2026)
96GB at $8.5k vs 80GB at $30k. We profiled both on Qwen3 72B Q8 with llama.cpp. The RTX PRO 6000 wins on value. The H100 wins on throughput. Here is every benchmark.
The 2026 Used RTX 3090 Buyer's Guide: Mining Cards, OEM Pulls & What to Avoid
The RTX 3090 remains the best $/VRAM GPU for local AI in 2026. 24GB for under $800. Here is exactly what to look for, what to avoid, and where to buy.
DGX Spark, three months in
128GB unified memory in a 1.2kg desktop. Worth $4k? Depends what you're optimizing for.
FP8 vs Q4: how much quality are you actually losing?
Perplexity isn't the whole story. We ran human evals across 6 quantization schemes.
Cooling Blackwell: the case for water
600W in a triple-slot air card means 600W in your office. Here's the AIO data.
Get the index, delivered Mondays.
New benchmarks, price drops, and one well-tested buying recommendation. No spam.