What to run on your hardware

Pick your hardware. Models are ranked by Artificial Analysis Intelligence Index v4.0, and only configurations that fit your memory at full native context are shown.

Fit is an estimate based on memory alone; nothing has been benchmarked on the network yet. Scores are full-precision (API-measured).
#   Model   AA Score   % Frontier   Quant   Context   VRAM

Ranked by AA score adjusted for estimated quantization loss, at the best quant that fits at full native context. Q2/Q3 rows lose meaningful quality. Models that don't fit aren't shown. Click a row to see every quant that fits.
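The selection rule described above can be sketched in a few lines of Python. This is only an illustration: the page does not publish its quantization-loss estimates, so the `RETENTION` factors, the `Config` fields, and the example models below are hypothetical values chosen for the sketch.

```python
from dataclasses import dataclass

# Hypothetical quality-retention factors per quant level (NOT from the page).
RETENTION = {"BF16": 1.00, "Q8": 0.99, "Q4": 0.95, "Q3": 0.88, "Q2": 0.75}


@dataclass
class Config:
    model: str
    aa_score: float  # full-precision AA score
    quant: str       # key into RETENTION
    vram_gb: float   # estimated memory at full native context


def adjusted(cfg: Config) -> float:
    """AA score scaled by the assumed retention factor for the quant."""
    return cfg.aa_score * RETENTION[cfg.quant]


def rank(configs: list[Config], memory_gb: float) -> list[Config]:
    """Keep each model's best-scoring quant that fits, then sort by adjusted score."""
    best: dict[str, Config] = {}
    for cfg in configs:
        if cfg.vram_gb > memory_gb:
            continue  # configs that don't fit are not shown
        if cfg.model not in best or adjusted(cfg) > adjusted(best[cfg.model]):
            best[cfg.model] = cfg
    return sorted(best.values(), key=adjusted, reverse=True)


# Made-up example models and memory figures.
configs = [
    Config("model-a", 40.0, "Q8", 50.0),
    Config("model-a", 40.0, "Q4", 28.0),
    Config("model-a", 40.0, "Q2", 16.0),
    Config("model-b", 30.0, "Q8", 20.0),
    Config("model-b", 30.0, "Q4", 12.0),
    Config("model-c", 55.0, "Q2", 60.0),
]
ranked = rank(configs, memory_gb=24.0)
for position, cfg in enumerate(ranked, start=1):
    print(position, cfg.model, cfg.quant, round(adjusted(cfg), 1))
```

With 24 GB available, `model-a` only fits at Q2 and `model-c` does not fit at all, so the sketch lists `model-a` at Q2 followed by `model-b` at Q8. The retention factors decide how close that ordering is, which is why low-bit rows deserve the caution noted above.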

Proprietary frontier

API-only

For context: the closed models you can't run locally. The % frontier column above is relative to Claude Fable 5 at 64.9.

#   Model                       Lab        AA Score  % Frontier
1   Claude Fable 5              Anthropic  64.9      100%
2   Claude Opus 4.8             Anthropic  61.4      95%
3   GPT-5.5 (xhigh)             OpenAI     60.2      93%
4   GPT-5.5 (high)              OpenAI     58.9      91%
5   Claude Opus 4.7             Anthropic  57.3      88%
6   Gemini 3.1 Pro Preview      Google     57.2      88%
7   GPT-5.4 (xhigh)             OpenAI     56.8      88%
8   GPT-5.5 (medium)            OpenAI     56.7      87%
9   Gemini 3.5 Flash (high)     Google     55.3      85%
10  Gemini 3.5 Flash (medium)   Google     54.8      84%
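
The % Frontier figures are consistent with dividing each AA score by the 64.9 reference and rounding to a whole percent. A minimal Python sketch of that arithmetic follows. The rounding rule is an assumption inferred from the table, and `pct_frontier` is a name chosen for this example.

```python
FRONTIER_SCORE = 64.9  # Claude Fable 5, the reference score stated in the text


def pct_frontier(score: float, frontier: float = FRONTIER_SCORE) -> int:
    """Express a score as a whole-number percentage of the frontier score.

    Rounding to the nearest integer is an assumption; the page does not say
    how it rounds.
    """
    return round(score / frontier * 100)


for name, score in [
    ("Claude Opus 4.8", 61.4),
    ("GPT-5.4 (xhigh)", 56.8),
    ("Gemini 3.5 Flash (medium)", 54.8),
]:
    print(f"{name}: {pct_frontier(score)}%")
```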

Scores: Artificial Analysis Intelligence Index v4.0 (10 evals incl. GDPval-AA, Terminal-Bench Hard, SciCode, GPQA Diamond, Humanity's Last Exam). VRAM figures include weights + KV cache at the listed context. Last updated 2026-06-11.
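
As a rough illustration of a "weights + KV cache" figure, the Python sketch below uses the common back-of-envelope formulas for a dense transformer with standard or grouped-query attention. It is not the site's actual estimator. The function names and the example model shape (8B parameters, 32 layers, 8 KV heads, head dimension 128) are hypothetical, and runtime overhead and activations are ignored.

```python
GIB = 2**30


def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Memory for the weights at a given average bits per weight."""
    return params_billion * 1e9 * bits_per_weight / 8 / GIB


def kv_cache_gib(layers: int, kv_heads: int, head_dim: int,
                 context_tokens: int, bytes_per_elem: int = 2) -> float:
    """KV cache size: one key and one value vector per layer, KV head, and token.

    bytes_per_elem=2 assumes a 16-bit cache; a quantized cache would be smaller.
    """
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_elem / GIB


def estimate_vram_gib(params_billion: float, bits_per_weight: float,
                      layers: int, kv_heads: int, head_dim: int,
                      context_tokens: int) -> float:
    """Weights plus KV cache at the given context length."""
    return (weights_gib(params_billion, bits_per_weight)
            + kv_cache_gib(layers, kv_heads, head_dim, context_tokens))


# Hypothetical 8B model at about 4.5 bits per weight with a 131,072-token context.
total = estimate_vram_gib(8, 4.5, layers=32, kv_heads=8, head_dim=128,
                          context_tokens=131072)
print(f"{total:.1f} GiB")
```

In this made-up case the KV cache (16 GiB) is several times larger than the quantized weights (about 4.2 GiB), which shows why requiring full native context can rule out configurations whose weights alone would fit.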