Starred repositories
Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.
The agent harness performance optimization system. Skills, instincts, memory, security, and research-first development for Claude Code, Codex, Opencode, Cursor and beyond.
Lightweight and portable LLM sandbox runtime (code interpreter) Python library.
Universal skills loader for AI coding agents - npm i -g openskills
MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering
DataSciBench: An LLM Agent Benchmark for Data Science
[ICLR 2025] DSBench: How Far are Data Science Agents from Becoming Data Science Experts?
DeepAnalyze is the first agentic LLM for autonomous data science. 🎈Your AI data analyst: automatically analyzes large volumes of data and generates professional analysis reports in one click!
[EMNLP2025] From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery
📑 PageIndex: Document Index for Vectorless, Reasoning-based RAG
Continuously updated paper list on advancements in Data Agents. Companion repo to our paper "A Survey of Data Agents: Emerging Paradigm or Overstated Hype?"
[WWW'26 Oral🔥] DeepAgent: A General Reasoning Agent with Scalable Toolsets
🔥[ICML'25] Official repository for the paper "Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search"
LLM-Powered Semi-Structured Table Question Answering
InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks (ICML 2024)
We collect papers about "large language models (LLM) for table-related tasks", e.g., using LLMs for the Table QA task. A curated list of "Table + LLM" papers.
AdalFlow: The library to build & auto-optimize LLM applications.
End-to-end Generative Optimization for AI Agents
TextGrad: Automatic "Differentiation" via Text -- using large language models to backpropagate textual gradients. Published in Nature.
[NeurIPS2025] "AI-Researcher: Autonomous Scientific Innovation" -- A production-ready version: https://novix.science/chat
The codebase for the paper "ReAcTable: Enhancing ReAct for Table Question Answering"
TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios
