Ready to evaluate
Upload a results JSON file or load the example data to see the dashboard.
🏆 AGI-Score
—
/ 100
🗃️ Memory
—
/ 100
🔗 Reasoning
—
/ 100
📋 Subtest Breakdown
🗃️ Memory
Retention
—
Interference
—
Retrieval
—
Cross-Session
—
🔗 Reasoning
Causal
—
Counterfactual
—
Multi-Hop
—
Consistency
—
🪞 Metacognition
Calibration
—
Abstention
—
Self-Correction
—
Boundary
—
🕸️ Capability Radar
📐 Calibration Curve
🏅 Multi-Model Comparison
Load multiple results files to compare models
Loaded Models
| Rank | Model | Memory | Reasoning | Metacognition | AGI-Score |
|---|---|---|---|---|---|
| Load results files to populate the leaderboard | | | | | |