Skip to main content
brand
context
industry
strategy
AaaS
Benchmarkbenchmarks-evaluationv1.0

GPQA Diamond

by NYU / Cohere · free · Last verified 2026-04-24

GPQA Diamond (Graduate-Level Google-Proof Q&A) is a challenging multiple-choice benchmark requiring expert-level knowledge in biology, chemistry, and physics. Questions are designed to be answerable by domain PhD students but not by web search. GPQA Diamond is the standard for measuring frontier scientific reasoning capability.

https://github.com/idavidrein/gpqa
C
CBelow Average
Adoption: C+Quality: B+Freshness: ACitations: CEngagement: F

Specifications

License
Proprietary
Pricing
free
Capabilities
Integrations
Use Cases
API Available
No
Tags
benchmark, science, reasoning, graduate-level, biology, chemistry, physics
Added
2026-04-24
Completeness
60%

Index Score

44
Adoption
50
Quality
70
Freshness
80
Citations
40
Engagement
0

Need this tool deployed for your team?

Get a Custom Setup

Explore the full AI ecosystem on Agents as a Service