LLM Performance Leaderboard

Interactive comparison of large language models across multiple benchmarks

Rank	Model Name	Organization	License	Arena Elo Score	Votes
1	Grok-3-Preview-02-24	xAI	Proprietary	1412	3,364
2	GPT-4.5-Preview	OpenAI	Proprietary	1411	3,242
3	Gemini-2.0-Flash-Thinking-Exp-01-21	Google	Proprietary	1384	17,487
4	Gemini-2.0-Pro-Exp-02-05	Google	Proprietary	1380	15,466
5	ChatGPT-4o-latest (2025-01-29)	OpenAI	Proprietary	1377	17,221
6	DeepSeek-R1	DeepSeek	MIT	1363	8,580
7	Gemini-2.0-Flash-001	Google	Proprietary	1357	13,257
8	o1-2024-12-17	OpenAI	Proprietary	1352	19,785
9	Qwen2.5-Max	Alibaba	Proprietary	1336	11,930
10	o3-mini-high	OpenAI	Proprietary	1329	9,102
11	o3-mini-high	OpenAI		1325	17,988
12	DeepSeek-V3	DeepSeek	DeepSeek	1318	22,007
13	DeepSeek-V3	DeepSeek		1318	22,846
14	QwQ-32B	Alibaba		1316	7,735
15	GLM-4-Plus-0111	Zhipu	Proprietary	1311	6,035
16	Gemini-2.0-Flash-Lite	Google		1311	2,212
17	Qwen-Plus-0125	Alibaba	Proprietary	1310	6,054
18	GLM-4-Plus-0111	Zhipu		1310	6,026
19	Claude 3.7 Sonnet	Anthropic	Proprietary	1309	4,254
20	Gemini-2.0-Flash-Lite-Preview-02-05	Google	Proprietary	1308	12,774
21	Step-2-16K-Exp	StepFun	Proprietary	1305	5,132
22	Qwen-Plus-0125	Alibaba		1305	6,055
23	o3-mini	OpenAI		1305	24,877
24	o1-mini	OpenAI	Proprietary	1304	54,923
25	o3-mini	OpenAI	Proprietary	1304	15,463
26	Step-2-16K-Exp	StepFun		1304	5,128
27	o1-mini	OpenAI		1304	54,960
28	Command A	Cohere		1303	7,547
29	Gemini-1.5-Pro-002	Google	Proprietary	1302	57,551
30	Hunyuan-TurboS	Tencent		1302	2,452
31	Gemini-1.5-Pro-0902	Google		1302	58,660
32	Hunyuan-Turbo-0110	Tencent		1302	2,511
33	llama-3.3-Nemotron-Super-49B-v1	Nvidia		1296	1,236
34	Grok-2-08-13	xAI	Proprietary	1288	67,038
35	Grok-2-08-13	xAI		1288	67,102
36	Yi-Lightning	01.AI	Proprietary	1287	28,946
37	Yi-Lightning	01 AI		1287	28,972
38	GPT-4o (2024-05-13)	OpenAI		1285	117,769
39	Claude 3.5 Sonnet (20241022)	Anthropic	Proprietary	1284	59,139
40	Claude 3.5 Sonnet	Anthropic		1283	64,670
41	Qwen2.5-plus-1127	Alibaba		1282	10,723
42	Deepseek-v2.5-1210	DeepSeek	DeepSeek	1279	7,247
43	Deepseek-v2.5-1210	DeepSeek		1279	7,246
44	Athene-v2-Chat-72B	Nexusflow	Athene V2	1275	26,092
45	Athene-v2-Chat-72B	NexusFlow		1275	26,093
46	GPT-4o-mini-2024-07-18	OpenAI	Proprietary	1272	66,710
47	Hunyuan-Large-2025-02-10	Tencent		1272	3,859
48	GPT-4o-mini-2024-07-18	OpenAI		1272	71,388
49	Hunyuan-Large-2025-02-10	Tencent	Proprietary	1271	3,860
50	Gemini-1.5-Flash-002	Google	Proprietary	1271	36,979
51	Gemini-1.5-Flash-0902	Google		1271	37,025
52	Llama-4-Maverick-17B-128E-Instruct	Meta		1271	4,917
53	Llama-3.1-405B-Instruct-bf16	Meta	Llama 3.1	1269	34,228
54	Llama-3.1-Nemotron-70B-Instruct	Nvidia		1269	7,580
55	Meta-llama-3.1-405B-Instruct-bf16	Meta		1269	43,795
56	Claude 3.5 Sonnet (20240620)	Anthropic		1268	86,162
57	Meta-llama-3.1-405B-Instruct-fp8	Meta		1267	63,055
58	Gemini Advanced App (2024-05-14)	Google		1267	52,143
59	Grok-2-Mini-08-13	xAI		1266	55,452
60	GPT-4o-2024-08-06	OpenAI		1265	47,982
61	Qwen-Max-0919	Alibaba		1263	17,440
62	Gemini-1.5-Pro-0901	Google		1260	82,436
63	Hunyuan-Standard-2025-02-10	Tencent		1260	4,017
64	Deepseek-v2.5	DeepSeek		1258	26,344
65	Qwen2.5-72B-Instruct	Alibaba		1257	41,532
66	Llama-3.3-70B-Instruct	Meta		1257	38,101
67	GPT-4-Turbo-2024-04-09	OpenAI		1256	102,158
68	Mistral-Large-2407	Mistral		1251	48,212
69	Athene-70B	NexusFlow		1250	20,584
70	GPT-4-1106-preview	OpenAI		1250	103,746
71	Mistral-Large-2411	Mistral		1249	29,895
72	Meta-llama-3.1-70B-Instruct	Meta		1248	58,654
73	Claude 3 Opus	Anthropic		1247	202,697
74	GLM-4-Plus	Zhipu AI		1246	27,784
75	Amazon-Nova-Pro-1.0	Amazon		1245	24,285
76	GPT-4-0125-preview	OpenAI		1245	97,076
77	Llama-3.1-Tulu-3-70B	Ai2		1244	3,014
78	Claude 3.5 Haiku (20241022)	Anthropic		1237	33,322
79	Reka-Core-20240904	Reka AI		1235	7,938
80	Gemini-1.5-Flash-0901	Google		1227	65,565
81	Jamba-1.5-Large	AI21 Labs		1222	65,665
82	Gemma-2-27B-it	Google		1220	79,536
83	Mistral-Small-24B-Instruct-2501	Mistral		1217	14,573
84	Amazon-Nova-Lite-1.0	Amazon		1217	20,648
85	Qwen2.5-Coder-32B-Instruct	Alibaba		1217	5,729
86	Command R+ (08-2024)	Cohere		1215	20,612
87	Gemini-1.5-Flash-8B-001	Google		1212	37,686
88	Llama-3.1-Nemotron-51B-Instruct	Nvidia		1211	3,887
89	Aya-Expanse-32B	Cohere		1209	28,749
90	Gemma-2-9B-it-SimPO	Princeton		1209	10,548
91	GLM-4-0529	Zhipu AI		1207	10,220
92	Llama-3-70B-Instruct	Meta		1206	163,632
93	Reka-Flash-20240904	Reka AI		1205	8,138
94	Phi-4	Microsoft		1205	25,224
95	Claude 3 Sonnet	Anthropic		1201	113,061
96	Nemotron-4-340B-Instruct	Nvidia		1199	10,540
97	Amazon-Nova-Micro-1.0	Amazon		1198	20,663
98	Zephyr-ORPO-141b-A35b-v0.1	HuggingFace		1197	4,857
99	Gemma-2-9B-it	Google		1192	57,211
100	Command R+ (04-2024)	Cohere		1190	80,859
101	Hunyuan-Standard-256K	Tencent		1189	2,901
102	Qwen2-72B-Instruct	Alibaba		1187	38,877
103	GPT-4-0314	OpenAI		1186	55,978
104	Llama-3.1-Tulu-3-8B	Ai2		1185	3,076
105	Ministral-8B-2410	Mistral		1182	5,113
106	Aya-Expanse-8B	Cohere		1180	10,391
107	Command R (08-2024)	Cohere		1180	10,848
108	Claude 3 Haiku	Anthropic		1179	122,304
109	DeepSeek-Coder-V2-Instruct	DeepSeek AI		1178	15,754
110	Meta-llama-3.1-8B-Instruct	Meta		1176	52,597
111	Jamba-1.5-Mini	AI21 Labs		1176	9,272
112	GPT-4-0613	OpenAI		1163	91,640
113	Qwen1.5-110B-Chat	Alibaba		1161	27,431
114	Yi-1.5-34B-Chat	01 AI		1157	25,137
115	Mistral-Large-2402	Mistral		1157	64,928
116	Reka-Flash-21B-online	Reka AI		1156	16,038
117	QwQ-32B-Preview	Alibaba		1153	3,411
118	Llama-3-8B-Instruct	Meta		1152	109,106
119	Command R (04-2024)	Cohere		1149	56,391
120	InternLM2.5-20B-chat	InternLM		1149	10,597
121	Mixtral-8x22b-Instruct-v0.1	Mistral		1148	53,756
122	Mistral Medium	Mistral		1148	35,561
123	Qwen1.5-72B-Chat	Alibaba		1147	40,677
124	Reka-Flash-21B	Reka AI		1147	25,809
125	Gemma-2-2b-it	Google		1144	48,905
126	Granite-3.1-8B-Instruct	IBM		1143	3,294
127	Gemini-1.0-Pro-0901	Google		1131	18,803
128	Qwen1.5-32B-Chat	Alibaba		1125	22,759
129	Phi-3-Medium-4k-Instruct	Microsoft		1123	26,109
130	Granite-3.1-2B-Instruct	IBM		1120	3,383
131	Starling-LM-7B-beta	Nexusflow		1119	16,672
132	Mixtral-8x7B-Instruct-v0.1	Mistral		1114	76,135
133	Yi-34B-Chat	01 AI		1111	15,914
134	Gemini Pro	Google		1111	6,556
135	Qwen1.5-14B-Chat	Alibaba		1109	18,680
136	WizardLM-70B-v1.0	Microsoft		1106	8,384
137	GPT-3.5-Turbo-0125	OpenAI		1106	68,878
138	DBRX-Instruct-Preview	Databricks		1103	33,734
139	Meta-llama-3.2-3B-Instruct	Meta		1103	8,399
140	Phi-3-Small-8k-Instruct	Microsoft		1102	18,473
141	Tulu-2-DPO-70B	AllenAI/UW		1098	6,662
142	Llama-2-70B-chat	Meta		1093	39,599
143	Granite-3.0-8B-Instruct	IBM		1093	7,002
144	OpenChat-3.5-0106	OpenChat		1091	12,987
145	Vicuna-33B	LMSYS		1091	22,948
146	Snowflake Arctic Instruct	Snowflake		1090	34,177
147	Starling-LM-7B-alpha	UC Berkeley		1088	10,417
148	Granite-3.0-2B-Instruct	IBM		1084	7,184
149	Gemma-1.1-7B-it	Google		1084	25,066
150	NV-Llama2-70B-SteerLM-Chat	Nvidia		1081	3,636
151	DeepSeek-LLM-67B-Chat	DeepSeek AI		1077	4,988
152	OpenChat-3.5	OpenChat		1076	8,106
153	OpenHermes-2.5-Mistral-7B	NousResearch		1074	5,089
154	Mistral-7B-Instruct-v0.2	Mistral		1072	20,065
155	Qwen1.5-7B-Chat	Alibaba		1070	4,871
156	Phi-3-Mini-4K-Instruct-June-24	Microsoft		1070	12,803
157	GPT-3.5-Turbo-1106	OpenAI		1067	17,035
158	Phi-3-Mini-4K-Instruct	Microsoft		1066	21,098
159	Dolphin-2.2.1-Mistral-7B	Cognitive Computations		1063	1,713
160	Llama-2-13b-chat	Meta		1063	19,721
161	SOLAR-10.7B-Instruct-v1.0	Upstage AI		1062	4,288
162	Nous-Hermes-2-Mixtral-8x7B-DPO	NousResearch		1061	3,839
163	WizardLM-13b-v1.2	Microsoft		1059	7,175
164	Meta-llama-3.2-1B-Instruct	Meta		1054	8,518
165	Vicuna-13B	LMSYS		1054	19,776
166	Zephyr-7B-beta	HuggingFace		1053	11,318
167	SmolLM2-1.7B-Instruct	HuggingFace		1046	2,376
168	MPT-30B-chat	MosaicML		1045	2,645
169	falcon-180b-chat	TII		1044	1,328
170	CodeLlama-34B-instruct	Meta		1043	7,512
171	CodeLlama-70B-instruct	Meta		1042	1,190
172	Zephyr-7B-alpha	HuggingFace		1040	1,811
173	Gemma-7B-it	Google		1037	9,176
174	Phi-3-Mini-128k-Instruct	Microsoft		1037	21,628
175	Llama-2-7B-chat	Meta		1037	14,535
176	Qwen-14B-Chat	Alibaba		1035	5,068
177	Guanaco-33B	UW		1033	2,998
178	Gemma-1.1-2b-it	Google		1021	11,354
179	StripedHyena-Nous-7B	Together AI		1017	5,278
180	OLMo-7B-instruct	Allen AI		1015	6,499
181	Mistral-7B-Instruct-v0.1	Mistral		1008	9,145
182	Vicuna-7B	LMSYS		1005	7,014
183	PaLM-Chat-Bison-001	Google		1003	8,712
184	Gemma-2B-it	Google		989	4,922
185	Qwen1.5-4B-Chat	Alibaba		988	7,819
186	Koala-13B	UC Berkeley		964	7,021
187	ChatGLM3-6B	Tsinghua		955	4,763
188	GPT4All-13B-Snoozy	Nomic AI		932	1,788
189	MPT-7B-Chat	MosaicML		928	3,997
190	ChatGLM2-6B	Tsinghua		924	2,711
191	RWKV-4-Raven-14B	RWKV		922	4,920
192	Alpaca-13B	Stanford		901	5,865
193	OpenAssistant-Pythia-12B	OpenAssistant		893	6,368
194	ChatGLM-6B	Tsinghua		879	4,987
195	FastChat-T5-3B	LMSYS		868	4,290
196	StableLM-Tuned-Alpha-7B	Stability AI		840	3,339
197	Dolly-V2-12B	Databricks		822	3,481
198	LLaMA-13B	Meta		799	2,445
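
As a rough guide to reading the score column: if the Arena scores are treated as conventional Elo ratings (an assumption; this page does not state how the scores are computed), the gap between two models maps to an expected head-to-head preference rate. Below is a minimal Python sketch of that conversion. The helper name expected_win_rate is hypothetical, and the scores are copied from the table above.

```python
def expected_win_rate(score_a: float, score_b: float) -> float:
    """Expected score of A against B under the standard Elo formula.

    Assumption: the leaderboard scores behave like classic Elo ratings
    on the usual 400-point, base-10 logistic scale.
    """
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / 400))


# Scores copied from the table above.
grok_3_preview = 1412  # rank 1
deepseek_r1 = 1363     # rank 6
llama_13b = 799        # rank 198

# Rank 1 against rank 6, then rank 1 against the last row.
print(f"{expected_win_rate(grok_3_preview, deepseek_r1):.3f}")
print(f"{expected_win_rate(grok_3_preview, llama_13b):.3f}")
```

Under this assumption, equal scores give 0.5, and the 49-point gap between Grok-3-Preview-02-24 and DeepSeek-R1 corresponds to an expected win rate of about 0.57. Rows separated by only a few points are therefore close to a coin flip, especially where the vote count is small.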

Data aggregated from multiple benchmark sources • Last updated: March 2025