LMSYS Chatbot Arena

Quickly evaluate large language models (LLMs) head-to-head using the LMSYS Chatbot Arena. This crowdsourced platform lets you compare two anonymous models, providing direct human feedback to benchmark their performance.

beginner · 5 min · 5 steps

The play

Access the Arena
Navigate to the LMSYS Chatbot Arena website to begin your LLM evaluation.
Start a New Battle
Click 'New Battle' to initiate a fresh comparison between two randomly selected, anonymous LLMs.
Interact and Evaluate
Prompt both models with the same query. Carefully compare their responses for quality, coherence, helpfulness, and overall performance.
Submit Your Vote
Select the model you believe performed better, or choose 'Tie' or 'Neither' if you cannot pick a winner. Written feedback is optional.
Reveal and Learn
After submitting your vote, the names of the models will be revealed. Review the arena's leaderboard and statistics to see how models rank.
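
To make the link between individual votes and a ranking concrete, here is a minimal sketch of how pairwise votes could be turned into a leaderboard with a standard Elo update. It is an illustration, not the Arena's actual pipeline: the model names, the starting rating of 1000, the K-factor of 32, and the choice to score 'Tie' and 'Neither' as a draw are all assumptions made for this example.

```python
from collections import defaultdict

# Assumption: 'tie' and 'neither' both count as a draw (0.5 for each side).
SCORE_FOR_A = {"a": 1.0, "b": 0.0, "tie": 0.5, "neither": 0.5}


def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings, model_a, model_b, outcome, k=32.0):
    """Apply one vote in place. outcome is 'a', 'b', 'tie', or 'neither'."""
    score_a = SCORE_FOR_A[outcome]
    exp_a = expected_score(ratings[model_a], ratings[model_b])
    delta = k * (score_a - exp_a)
    ratings[model_a] += delta
    ratings[model_b] -= delta  # zero-sum: B loses what A gains


def rank_models(votes, initial=1000.0, k=32.0):
    """Replay votes in order and return (model, rating) pairs, best first."""
    ratings = defaultdict(lambda: initial)
    for model_a, model_b, outcome in votes:
        update_elo(ratings, model_a, model_b, outcome, k)
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Hypothetical votes: (model shown as A, model shown as B, outcome).
votes = [
    ("model-x", "model-y", "a"),
    ("model-y", "model-z", "tie"),
    ("model-x", "model-z", "a"),
]

leaderboard = rank_models(votes)
for name, rating in leaderboard:
    print(f"{name}: {rating:.1f}")
```

Because each update is zero-sum, the total rating across models stays constant, so a model only climbs when its votes beat what its current rating predicted. Online Elo also depends on the order in which votes are replayed, which is one reason a real leaderboard may prefer a different fitting method.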

Starter resource

chat.lmsys.org

Source

Article: lmarena.ai