Leaderboard
On-device LLM performance rankings powered by Glicko-2
iPhone 16e
iOS
Rank
#16
Rating
1,939
±16 RD
Win Rate
92.3%
Conservative Rating
1,907
TG Rating
1,902
PP Rating
1,927
Matches
1,032
Record
953W – 79L
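The summary figures above are internally consistent: the win rate equals wins divided by total matches, and the Conservative Rating equals the rating minus twice the RD. The page does not state its formulas, so the Python sketch below is an assumption that only reproduces the displayed (likely rounded) numbers. The function names and the multiplier `k = 2` are hypothetical.

```python
def win_rate(wins: int, losses: int) -> float:
    """Percentage of matches won (0.0 when no matches were played)."""
    total = wins + losses
    return 100.0 * wins / total if total else 0.0


def conservative_rating(rating: float, rd: float, k: float = 2.0) -> float:
    """Lower-bound estimate: rating minus k rating deviations.

    k = 2 is an assumption inferred from the displayed values,
    not a formula documented by the leaderboard.
    """
    return rating - k * rd


wins, losses = 953, 79
print(wins + losses)                     # 1032
print(round(win_rate(wins, losses), 1))  # 92.3
print(conservative_rating(1939, 16))     # 1907.0
```

Subtracting a multiple of the RD penalizes devices with few matches, whose ratings are less certain, which is a common reason to rank by a conservative figure instead of the raw rating.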
Models Tested
| Model | TG Median (tok/s) | PP Median (tok/s) | TG Best (tok/s) | PP Best (tok/s) | Runs |
|---|---|---|---|---|---|
| Qwen3-0.6B-abliterated-TIES.IQ4_XS | 68.24 | 869.13 | 68.24 | 869.13 | 1 |
| gemma-3-1b-it-abliterated-q4_k_m | 42.73 | 647.09 | 42.73 | 647.09 | 1 |
| DeepSeek-R1-Distill-Qwen-1.5B-Abliterated-dpo.IQ4_XS | 36.18 | 363.47 | 36.18 | 363.47 | 1 |
| llama-3.2-1b-instruct-q8_0 | 31.46 | 69.14 | 38.56 | 551.82 | 3 |
| Qwen3-1.7B.Q4_K_M | 28.03 | 44.52 | 28.03 | 44.52 | 1 |
| gemma-3-1b-it-BF16 | 20.34 | 349.36 | 20.34 | 349.36 | 1 |
| gemma-2-2b-it-Q6_K | 19.00 | 240.28 | 20.31 | 245.98 | 3 |
| Phi-3.5-mini-instruct.Q4_K_M | 15.68 | 114.25 | 15.68 | 114.25 | 1 |
| qwen2.5-3b-instruct-q5_k_m | 12.12 | 138.08 | 12.12 | 138.08 | 1 |
| Phi-4-mini-instruct.Q8_0 | 10.51 | 122.05 | 10.96 | 127.56 | 2 |
| Llama-3.2-3B-Instruct-Q6_K | 9.92 | 82.20 | 13.60 | 156.81 | 2 |
| Phi-3.5-mini-instruct_Uncensored-Q6_K_L | 9.13 | 112.24 | 9.13 | 112.24 | 1 |
| Qwen3.5-4B.Q4_K_M | 8.36 | 102.00 | 8.36 | 102.00 | 1 |
| gemma-3-4b-it.Q4_K_S | 7.03 | 14.05 | 7.03 | 14.05 | 1 |
| gemma-3-4b-it.Q8_0 | 7.00 | 54.44 | 7.69 | 92.55 | 2 |
| DeepSeek-R1-Distill-Qwen-7B-Q3_K_L | 6.94 | 45.91 | 6.94 | 45.91 | 1 |
| Qwen3-4B-Q6_K | 6.06 | 9.74 | 6.06 | 9.74 | 1 |
Head-to-Head Record
Performance by App Version