Notes from the lab
Long-form benchmarks, buying guides, and write-ups on local AI hardware. Updated weekly.
Best GPUs for Running AI Models Locally in 2026: Ranked by tok/s per Dollar
We benchmarked 7 GPUs from $749 to $9,499 on Qwen3 32B with llama.cpp. The RTX 3090 at $749 used delivers the best value. The RTX 5090 at $1,999 is the best overall. Here is every data point.
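For readers who want to reproduce the value ranking, the metric is just throughput divided by price. A minimal sketch below; the tok/s figures are illustrative placeholders, not our measured results (those are in the article):

```python
# Rank GPUs by tok/s per dollar. Numbers here are placeholders for illustration.
gpus = {
    "RTX 3090 (used)": {"price_usd": 749, "tok_s": 30.0},
    "RTX 5090": {"price_usd": 1999, "tok_s": 75.0},
}

def toks_per_dollar(entry):
    """Value metric: decode throughput per dollar of purchase price."""
    return entry["tok_s"] / entry["price_usd"]

# Sort best-value first and print normalized to tok/s per $1,000.
ranked = sorted(gpus.items(), key=lambda kv: toks_per_dollar(kv[1]), reverse=True)
for name, entry in ranked:
    print(f"{name}: {toks_per_dollar(entry) * 1000:.1f} tok/s per $1,000")
```

With these placeholder numbers the used 3090 still comes out ahead on value even though the 5090 is faster in absolute terms, which mirrors the ranking in the article.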
Best Budget GPU for AI Under $1,000 in 2026: Every Option Ranked
We ranked every GPU under $1,000 for local AI inference. The used RTX 3090 at $749 wins on VRAM. The RTX 5070 Ti at $749 wins on tok/s. Here is the full breakdown with benchmarks.
Running Qwen3 235B on a single Mac Studio
We pushed Apple's M3 Ultra with 512GB unified memory to its limits. Here's what 22 tok/s of dense inference actually feels like.
AMD vs NVIDIA for Local AI Inference in 2026: ROCm Has Finally Caught Up
ROCm 7.2 changed the game. The AMD RX 7900 XTX with 24GB at $849 now runs Ollama, llama.cpp, and vLLM out of the box. We compare the full AMD vs NVIDIA stack for local inference — hardware, software, and real-world experience.
RTX PRO 6000 Blackwell vs H100: Which One for Your Home Lab? (2026)
96GB at $8.5k vs 80GB at $30k. We profiled both on Qwen3 72B Q8 with llama.cpp. The RTX PRO 6000 wins on value. The H100 wins on throughput. Here is every benchmark.
The 2026 Used RTX 3090 Buyer's Guide: Mining Cards, OEM Pulls & What to Avoid
The RTX 3090 remains the best $/VRAM GPU for local AI in 2026. 24GB for under $800. Here is exactly what to look for, what to avoid, and where to buy.
DGX Spark, three months in
128GB unified memory in a 1.2kg desktop. Worth $4k? Depends what you're optimizing for.
FP8 vs Q4: how much quality are you actually losing?
Perplexity isn't the whole story. We ran human evals across 6 quantization schemes.
Cooling Blackwell: the case for water
600W in a triple-slot air card means 600W in your office. Here's the AIO data.
Get the index, delivered Mondays.
New benchmarks, price drops, and one well-tested buying recommendation. No spam.