Massive Multitask Language Understanding (MMLU)
A comprehensive benchmark designed to measure an AI model's knowledge across 57 subjects, ranging from the humanities to STEM. It assesses a model's understanding and reasoning in zero-shot and few-shot settings, making it a key benchmark for evaluating general capabilities.
https://huggingface.co/datasets/cais/mmlu
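The few-shot evaluation setting mentioned in the description can be sketched as follows. This is a hypothetical illustration of MMLU-style prompting (a subject header, solved demonstration questions with four lettered choices, then an unanswered test question); the example questions, helper names, and exact wording are assumptions, not the benchmark's official harness.

```python
# Hypothetical sketch of few-shot MMLU-style prompting.
# The demo questions below are invented for illustration.

CHOICE_LABELS = ["A", "B", "C", "D"]

def format_example(question, choices, answer=None):
    """Render one multiple-choice item; include the answer for few-shot demos."""
    lines = [question]
    lines += [f"{label}. {choice}" for label, choice in zip(CHOICE_LABELS, choices)]
    # Demos end with the correct letter; the test item ends with a bare "Answer:".
    lines.append(f"Answer:{' ' + CHOICE_LABELS[answer] if answer is not None else ''}")
    return "\n".join(lines)

def build_prompt(demos, test_question, test_choices, subject="miscellaneous"):
    """k-shot prompt: k solved demos followed by the unanswered test item."""
    header = f"The following are multiple choice questions (with answers) about {subject}.\n\n"
    body = "\n\n".join(format_example(q, c, a) for q, c, a in demos)
    return header + body + "\n\n" + format_example(test_question, test_choices)

# One-shot example: a single solved demo, then the question under evaluation.
demos = [("What is 2 + 2?", ["3", "4", "5", "6"], 1)]
prompt = build_prompt(demos, "What is 3 + 3?", ["5", "6", "7", "8"],
                      subject="arithmetic")
```

The model's next-token prediction after the final `Answer:` (A, B, C, or D) is then scored against the gold label; the zero-shot setting is the same prompt with an empty demo list.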
Overall grade: F (Critical)
Adoption: F · Quality: F · Freshness: A+ · Citations: F · Engagement: F
Specifications
- Pricing: free
- Capabilities: (not listed)
- Integrations: (not listed)
- Use Cases: (not listed)
- API Available: No
- Tags: evaluation-benchmark, multitask, knowledge, reasoning, llm-evaluation, zero-shot, few-shot
- Added: 2026-03-25
- Completeness: 1%
Index Score: 0
- Adoption: 0
- Quality: 0
- Freshness: 100
- Citations: 0
- Engagement: 0