Pull requests: openai/evals
Pull requests list
[Evals] Add eval for Dhivehi diacritical marks
#1495
opened Mar 16, 2024 by
aanaseer
11 of 12 tasks
Fix specifying API arguments from the CLI
#1505
opened Mar 27, 2024 by
LoryPack
Contributor
6 tasks done
Refactor JSONL file loading logic in data.py
#1612
opened Feb 3, 2026 by
Pritiks23
13 tasks done
Add reasoning consistency eval under constrained intermediate steps
#1615
opened Feb 5, 2026 by
getappai
Add Logic Stress Stress-test Suite (v2, v3)
#1622
opened Feb 16, 2026 by
14H034160212
Contributor
Add finance-agent routing eval dataset and builder guidance
#1625
opened Feb 24, 2026 by
maxpetrusenko
eval: add RAIL Score responsible AI evaluation across 8 dimensions
#1640
opened Apr 2, 2026 by
SumitVermakgp
12 tasks done
Add Turkish language evals: logical reasoning and grammar
#1647
opened Apr 23, 2026 by
kayametehan
Update Python version to 3.12 and refresh PR template
#1648
opened Apr 23, 2026 by
kayametehan
Route modern OpenAI models through chat completions
#1651
opened Apr 23, 2026 by
kayametehan