What to run on your hardware
Pick your hardware. Models are ranked by Artificial Analysis Intelligence Index v4.0, and only configurations that fit your memory at full native context are listed.
| # | Model | AA Score | % Frontier | Quant | Context | VRAM |
|---|---|---|---|---|---|---|
* Context capped by this hardware's memory — the model natively supports a larger window, but its KV cache wouldn't fit. The table shows the largest context that fits.
Ranked by AA score adjusted for estimated quantization loss, at the best quant that fits at full native context. Q2/Q3 rows lose meaningful quality. Models that don't fit aren't shown. Click a row to see every quant that fits.
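The fit check described above can be approximated with a back-of-the-envelope estimate. The sketch below is a rough illustration, not this page's actual method: it assumes a standard transformer whose KV cache stores one key and one value vector per layer, per KV head, per token, in 16-bit precision. It ignores activations and runtime overhead. `ModelSpec` and every number in the usage example are hypothetical.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical model description; none of these values come from the table."""
    params_billion: float  # total parameters, in billions
    n_layers: int          # transformer layers
    n_kv_heads: int        # key/value heads per layer
    head_dim: int          # dimension of each head
    native_context: int    # largest context the model supports, in tokens


def weight_bytes(spec: ModelSpec, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights."""
    return spec.params_billion * 1e9 * bits_per_weight / 8


def kv_cache_bytes(spec: ModelSpec, context: int, bytes_per_value: int = 2) -> int:
    """KV cache size: the factor 2 covers keys and values."""
    return 2 * spec.n_layers * spec.n_kv_heads * spec.head_dim * context * bytes_per_value


def total_gib(spec: ModelSpec, bits_per_weight: float, context: int) -> float:
    """Weights plus KV cache at the given context, in GiB."""
    return (weight_bytes(spec, bits_per_weight) + kv_cache_bytes(spec, context)) / GIB


def largest_context_that_fits(spec: ModelSpec, bits_per_weight: float, vram_gib: float) -> int:
    """Largest context (capped at native) whose KV cache fits beside the weights."""
    budget = vram_gib * GIB - weight_bytes(spec, bits_per_weight)
    per_token = kv_cache_bytes(spec, 1)
    if budget < per_token:
        return 0  # even the weights alone do not fit
    return min(spec.native_context, int(budget // per_token))


# Hypothetical 8B model at a 4-bit quant.
example = ModelSpec(params_billion=8.0, n_layers=32, n_kv_heads=8,
                    head_dim=128, native_context=131072)
print(largest_context_that_fits(example, 4, 24))  # full native context fits
print(largest_context_that_fits(example, 4, 16))  # context is capped by memory
```

In this made-up case the 24 GiB card holds the full native window, while the 16 GiB card forces a smaller context, which is the situation the asterisk note describes.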
Proprietary frontier
API-only. For context: the closed models you can't run locally. The % Frontier column above is relative to Claude Fable 5 at 64.9.
| # | Model | Lab | AA Score | % Frontier |
|---|---|---|---|---|
| 1 | Claude Fable 5 | Anthropic | 64.9 | 100% |
| 2 | Claude Opus 4.8 | Anthropic | 61.4 | 95% |
| 3 | GPT-5.5 (xhigh) | OpenAI | 60.2 | 93% |
| 4 | GPT-5.5 (high) | OpenAI | 58.9 | 91% |
| 5 | Claude Opus 4.7 | Anthropic | 57.3 | 88% |
| 6 | Gemini 3.1 Pro Preview | Google | 57.2 | 88% |
| 7 | GPT-5.4 (xhigh) | OpenAI | 56.8 | 88% |
| 8 | GPT-5.5 (medium) | OpenAI | 56.7 | 87% |
| 9 | Gemini 3.5 Flash (high) | Google | 55.3 | 85% |
| 10 | Gemini 3.5 Flash (medium) | Google | 54.8 | 84% |
Scores: Artificial Analysis Intelligence Index v4.0 (10 evals incl. GDPval-AA, Terminal-Bench Hard, SciCode, GPQA Diamond, Humanity's Last Exam). VRAM figures include weights + KV cache at the listed context. Last updated 2026-06-11.
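The % Frontier column can be reproduced from the AA scores. The short sketch below assumes the percentage is the score divided by the top score (64.9) and rounded to the nearest whole number. The rounding rule is an assumption, though it matches every row in the table.

```python
FRONTIER_SCORE = 64.9  # Claude Fable 5, the reference model


def pct_frontier(score: float, frontier: float = FRONTIER_SCORE) -> int:
    """Score as a percentage of the frontier score, rounded to the nearest integer."""
    return round(100 * score / frontier)


# AA scores copied from the table above.
scores = {
    "Claude Fable 5": 64.9,
    "Claude Opus 4.8": 61.4,
    "GPT-5.5 (xhigh)": 60.2,
    "GPT-5.4 (xhigh)": 56.8,
    "GPT-5.5 (medium)": 56.7,
    "Gemini 3.5 Flash (medium)": 54.8,
}

for model, score in scores.items():
    print(f"{model}: {pct_frontier(score)}%")
```

Note that GPT-5.4 (xhigh) and GPT-5.5 (medium) differ by only 0.1 points yet land on 88% and 87%, so adjacent percentages can overstate small score gaps.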