Commit dedd4ef

Update README.md
1 parent a39ccbf commit dedd4ef

1 file changed: +7 -5 lines changed

README.md

Lines changed: 7 additions & 5 deletions
@@ -148,12 +148,14 @@
 
 ## 📰 News
 
-🎉 **[2025-10-28] DeepCode Achieves State-of-the-Art Performance on PaperBench Code-Dev!**
+🎉 **[2025-10] 🎉 [2025-10-28] DeepCode Achieves SOTA on PaperBench!**
 
-- 🏆 **Surpasses Human Experts**: DeepCode achieves **75.9%** on the 3-paper subset, outperforming **Top ML PhD** (72.4%) by **+3.5%**
-- 🥇 **Outperforms Commercial Agents**: **+26.1%** improvement over best commercial code agents (**Cursor, Claude Code, Codex**) with **84.8%** accuracy
-- 🔬 **Advances Scientific Code Generation**: **+22.4%** improvement over PaperCoder, the previous SOTA scientific code agent
-- 🚀 **Beats LLM-Based Agents**: **+30.2%** improvement over best LLM agent frameworks, demonstrating the power of sophisticated agent architecture
+DeepCode sets new benchmarks on OpenAI's PaperBench Code-Dev across all categories:
+
+- 🏆 **Surpasses Human Experts**: **75.9%** (DeepCode) vs Top Machine Learning PhDs 72.4% (+3.5%).
+- 🥇 **Outperforms SOTA Commercial Code Agents**: **84.8%** (DeepCode) vs learning commercial code agents (+26.1%) (Cursor, Claude Code, and Codex).
+- 🔬 **Advances Scientific Coding**: **73.5%** (DeepCode) vs PaperCoder 51.1% (+22.4%).
+- 🚀 **Beats LLM Agents**: **73.5%** (DeepCode) vs best LLM frameworks 43.3% (+30.2%).
 
 ---
 