diff --git a/README.md b/README.md
index ffab6759..18a9c8ec 100644
--- a/README.md
+++ b/README.md
@@ -16,6 +16,10 @@
+
+[Trendshift badge: HKUDS%2FDeepCode | Trendshift]
+
 # [DeepCode Logo] DeepCode: Open Agentic Coding

diff --git a/README_ZH.md b/README_ZH.md
new file mode 100644
index 00000000..298f2357
--- /dev/null
+++ b/README_ZH.md
@@ -0,0 +1,793 @@
+[DeepCode Logo]
+
+    ██████╗ ███████╗███████╗██████╗  ██████╗ ██████╗ ██████╗ ███████╗
+    ██╔══██╗██╔════╝██╔════╝██╔══██╗██╔════╝██╔═══██╗██╔══██╗██╔════╝
+    ██║  ██║█████╗  █████╗  ██████╔╝██║     ██║   ██║██║  ██║█████╗
+    ██║  ██║██╔══╝  ██╔══╝  ██╔═══╝ ██║     ██║   ██║██║  ██║██╔══╝
+    ██████╔╝███████╗███████╗██║     ╚██████╗╚██████╔╝██████╔╝███████╗
+    ╚═════╝ ╚══════╝╚══════╝╚═╝      ╚═════╝ ╚═════╝ ╚═════╝ ╚══════╝
+
+
+[Trendshift badge: HKUDS%2FDeepCode | Trendshift]
+
+# DeepCode: Open Agentic Coding
+
+### *Advancing Code Generation with Multi-Agent Systems*
+

+### 🖥️ **Interface Showcase**
+
+#### 🖥️ **CLI Interface**
+**Terminal-based development environment**
+
+[CLI Interface Demo]
+
+🚀 Advanced terminal experience
+⚡ Fast command-line workflows
+🔧 Developer-friendly interface
+📊 Real-time progress tracking
+
+*A professional terminal interface for power users and CI/CD integration*
+
+#### 🌐 **Web Interface**
+**Visual, interactive experience**
+
+[Web Interface Demo]
+
+🎨 Modern web dashboard
+🖱️ Intuitive drag-and-drop operations
+📱 Responsive design
+🎯 Visual progress tracking
+
+*A polished web interface with a smooth workflow for users of every skill level*
+
+---
+
+### 🎬 **Introduction Video**
+
+[DeepCode Introduction Video]
+
+*🎯 **Watch our full introduction**: see how DeepCode turns research papers and natural language into production-ready code*
+
+[Watch Video]
+
+---
+
+> *"Where AI agents turn ideas into production-ready code"*
+
+---
+
+## 📑 Table of Contents
+
+- [🚀 Key Features](#-key-features)
+- [🏗️ Architecture](#️-architecture)
+- [🚀 Quick Start](#-quick-start)
+- [💡 Examples](#-examples)
+  - [🎬 Live Demos](#-live-demos)
+- [⭐ Star History](#-star-history)
+- [📄 License](#-license)
+
+---
+
+## 🚀 Key Features
+
+#### 🚀 **Paper2Code**
+*Automated implementation of complex algorithms*
+
+Effortlessly turn complex algorithms from research papers into high-quality, production-ready code, accelerating algorithm reproduction.
+
+#### 🎨 **Text2Web**
+*Automated front-end web development*
+
+Turn plain-text descriptions into fully functional, visually polished front-end code for rapid interface creation.
+
+#### ⚙️ **Text2Backend**
+*Automated back-end development*
+
+Generate efficient, scalable, feature-rich back-end code from simple text input, streamlining server-side development.
+
+---
+
+### 🎯 **Autonomous Multi-Agent Workflow**
+
+**The challenges**:
+
+- 📄 **Implementation complexity**: Turning academic papers and sophisticated algorithms into working code demands substantial engineering effort and domain expertise
+
+- 🔬 **Research bottleneck**: Researchers spend valuable time implementing algorithms instead of focusing on core research and discovery
+
+- ⏱️ **Development delays**: Product teams face long waits between a concept and a testable prototype, slowing innovation cycles
+
+- 🔄 **Repetitive coding**: Developers re-implement similar patterns and features instead of building on existing solutions
+
+**DeepCode** tackles these workflow inefficiencies with reliable automation for common development tasks, streamlining the path from concept to code.
+
+```mermaid
+flowchart LR
+    A["📄 Research Papers<br/>💬 Text Prompts<br/>🌐 URLs & Documents<br/>📎 Files: PDF, DOC, PPTX, TXT, HTML"] --> B["🧠 DeepCode<br/>Multi-Agent Engine"]
+    B --> C["🚀 Algorithm Implementation<br/>🎨 Front-End Development<br/>⚙️ Back-End Development"]
+
+    style A fill:#ff6b6b,stroke:#c0392b,stroke-width:2px,color:#000
+    style B fill:#00d4ff,stroke:#0984e3,stroke-width:3px,color:#000
+    style C fill:#00b894,stroke:#00a085,stroke-width:2px,color:#000
+```
+
+---
+
+## 🏗️ Architecture
+
+### 📊 **System Overview**
+
+**DeepCode** is an AI-powered development platform that automates code generation and implementation. Our multi-agent system absorbs the complexity of turning requirements into functional, well-structured code, letting you focus on innovation rather than implementation details.
+
+🎯 **Technical capabilities**:
+
+🧬 **Research-to-Production Pipeline**
+A multimodal document-analysis engine that extracts algorithmic logic and mathematical models from academic papers, then generates optimized implementations with appropriate data structures while preserving computational-complexity characteristics.
+
+🪄 **Natural-Language Code Synthesis**
+Context-aware code generation using language models fine-tuned on curated codebases. Maintains architectural consistency across modules while supporting multiple programming languages and frameworks.
+
+⚡ **Automated Prototyping Engine**
+An intelligent scaffolding system that generates complete application structures, including database schemas, API endpoints, and front-end components. Dependency analysis ensures a scalable architecture from the first generation onward.
+
+💎 **Quality-Assurance Automation**
+Integrates static analysis with automated unit-test generation and documentation synthesis. Employs AST analysis for correctness checks and property-based testing for thorough coverage.
+
+🔮 **CodeRAG Integration System**
+Advanced retrieval-augmented generation that combines semantic vector embeddings with graph-based dependency analysis to automatically discover the best libraries and implementation patterns in large code corpora. A toy sketch of this retrieval idea follows.
+
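+The retrieval idea behind CodeRAG can be pictured with a small example. Below is a minimal, illustrative Python sketch (not DeepCode's actual implementation): candidates are ranked by embedding similarity, with a bonus when a candidate is adjacent to the target file in the dependency graph. All names, vectors, and the `alpha` weight are hypothetical.
+
+```python
+# Toy CodeRAG-style ranking: cosine similarity over embeddings, boosted
+# when the candidate is a dependency-graph neighbor of the target file.
+from math import sqrt
+
+def cosine(a, b):
+    dot = sum(x * y for x, y in zip(a, b))
+    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
+    return dot / (na * nb) if na and nb else 0.0
+
+def rank_candidates(query_vec, candidates, dep_graph, target, alpha=0.2):
+    """candidates: {name: vector}; dep_graph: {name: set of neighbors}."""
+    scores = {}
+    for name, vec in candidates.items():
+        bonus = alpha if name in dep_graph.get(target, set()) else 0.0
+        scores[name] = cosine(query_vec, vec) + bonus
+    return sorted(scores, key=scores.get, reverse=True)
+
+ranked = rank_candidates(
+    query_vec=[0.1, 0.9],
+    candidates={"utils.norm": [0.2, 0.8], "io.loader": [0.9, 0.1]},
+    dep_graph={"model.train": {"utils.norm"}},
+    target="model.train",
+)
+print(ranked)  # -> ['utils.norm', 'io.loader']
+```
+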
+---
+
+### 🔧 **Core Technologies**
+
+- 🧠 **Intelligent Orchestration Agent**: A central decision-making system that coordinates workflow stages and analyzes requirements. Dynamic planning algorithms adapt the execution strategy in real time as project complexity evolves, selecting the best processing strategy for each implementation step.
+
+- 💾 **Efficient Memory Mechanism**: An advanced context-engineering system for managing large-scale code context. A hierarchical memory structure with intelligent compression handles complex codebases, enabling instant retrieval of implementation patterns and preserving semantic consistency across long development sessions.
+
+- 🔍 **Advanced CodeRAG System**: A global code-understanding engine that analyzes intricate interdependencies across repositories. It maps cross-codebase relationships to grasp architectural patterns holistically and uses dependency graphs plus semantic analysis to make globally aware code suggestions during implementation.
+
+---
+
+### 🤖 **DeepCode's Multi-Agent Architecture**
+
+- **🎯 Central Orchestrating Agent**: Drives end-to-end workflow execution and makes strategic decisions, coordinating specialized agents based on input-complexity analysis and applying dynamic task-planning and resource-allocation algorithms.
+
+- **📝 Intent Understanding Agent**: Performs deep semantic analysis of user requirements to decode complex intent. Advanced NLP extracts functional specifications and technical constraints, turning vague human descriptions into precise, actionable development specs via structured task decomposition.
+
+- **📄 Document Parsing Agent**: Processes complex technical documents and research papers with advanced parsing. Document-understanding models extract algorithms and methods, converting academic concepts into practical implementation specifications through intelligent content analysis.
+
+- **🏗️ Code Planning Agent**: Handles architecture design and technology-stack optimization, dynamically planning adaptive development roadmaps, enforcing coding standards, and generating modular structures through automated design-pattern selection.
+
+- **🔍 Code Reference Mining Agent**: Discovers relevant repositories and frameworks with intelligent search algorithms, analyzes codebases for compatibility and integration potential, and makes recommendations based on similarity metrics and automated dependency analysis.
+
+- **📚 Code Indexing Agent**: Builds a comprehensive knowledge graph of discovered codebases, maintains semantic relationships between code components, and enables intelligent retrieval and cross-referencing.
+
+- **🧬 Code Generation Agent**: Synthesizes the gathered information into executable implementations, creating functional interfaces, integrating discovered components, and generating comprehensive test suites and documentation to ensure reproducibility.
+
+---
+
+#### 🛠️ **Implementation Tool Matrix**
+
+**🔧 Powered by MCP (Model Context Protocol)**
+
+DeepCode uses the **Model Context Protocol (MCP)** standard to integrate seamlessly with a variety of tools and services. This standardized approach ensures reliable communication between AI agents and external systems, enabling powerful automation.
+
+##### 📡 **MCP Servers & Tools**
+
+| 🛠️ **MCP Server** | 🔧 **Primary Function** | 💡 **Purpose & Capabilities** |
+|-------------------|-------------------------|-------------------------------|
+| **🔍 brave** | Web search engine | Real-time information retrieval via the Brave Search API |
+| **🌐 bocha-mcp** | Alternative search | Secondary search option with independent API access |
+| **📂 filesystem** | File-system operations | Local file and directory management, read/write operations |
+| **🌐 fetch** | Web content retrieval | Fetch and extract content from URLs and web resources |
+| **📥 github-downloader** | Repository management | Clone and download GitHub repositories for analysis |
+| **📋 file-downloader** | Document processing | Download files (PDF, DOCX, etc.) and convert them to Markdown |
+| **⚡ command-executor** | System commands | Execute bash/shell commands for environment management |
+| **🧬 code-implementation** | Code generation hub | Comprehensive code reproduction with execution and testing |
+| **📚 code-reference-indexer** | Intelligent code search | Smart indexing and search across code repositories |
+| **📄 document-segmentation** | Intelligent document analysis | Smart segmentation of large papers and technical documents |
+
+##### 🔧 **Legacy Tool Functions** *(for reference)*
+
+| 🛠️ **Function** | 🎯 **Usage Context** |
+|-----------------|---------------------|
+| **📄 read_code_mem** | Efficient retrieval of code context from memory |
+| **✍️ write_file** | Direct file-content generation and modification |
+| **🐍 execute_python** | Python code testing and validation |
+| **📁 get_file_structure** | Project-structure analysis and organization |
+| **⚙️ set_workspace** | Dynamic workspace and environment configuration |
+| **📊 get_operation_history** | Process monitoring and operation tracking |
+
+---
+
+A client-side sketch of talking to one of these MCP servers follows.
+
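+As a concrete illustration of the MCP wiring, here is a minimal client sketch. It assumes the official `mcp` Python SDK and the filesystem server from the table above; the tool name and arguments follow that server's published schema, but treat the details as illustrative rather than DeepCode's own client code.
+
+```python
+# Minimal MCP client sketch (assumes the official `mcp` Python SDK).
+# Spawns the filesystem server over stdio, lists its tools, calls one.
+import asyncio
+from mcp import ClientSession, StdioServerParameters
+from mcp.client.stdio import stdio_client
+
+async def main():
+    server = StdioServerParameters(
+        command="npx",
+        args=["-y", "@modelcontextprotocol/server-filesystem", "."],
+    )
+    async with stdio_client(server) as (read, write):
+        async with ClientSession(read, write) as session:
+            await session.initialize()
+            tools = await session.list_tools()
+            print([t.name for t in tools.tools])
+            # Tool name and arguments depend on the server's schema.
+            result = await session.call_tool("read_file", {"path": "README.md"})
+            print(result)
+
+asyncio.run(main())
+```
+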
+🎛️ **Multi-Interface Framework**
+A RESTful API with CLI and web front ends, featuring real-time code streaming, interactive debugging, and an extensible plugin architecture for CI/CD integration.
+
+**🚀 Multi-Agent Intelligence Pipeline:**
+
+### 🌟 **Intelligent Processing Flow**
+
+| Stage | What Happens |
+|-------|--------------|
+| 💡 **Input Layer** | 📄 Research papers • 💬 Natural language • 🌐 URLs • 📋 Requirements |
+| 🎯 **Central Orchestration** | Strategic decision-making • Workflow coordination • Agent management |
+| 📝 **Text Analysis** | Requirement processing |
+| 📄 **Document Analysis** | Paper and specification processing |
+| 📋 **Reproduction Planning** | Deep paper analysis • Code-requirement parsing • Reproduction-strategy development |
+| 🔍 **Reference Analysis** | Repository discovery |
+| 📚 **Code Indexing** | Knowledge-graph construction |
+| 🧬 **Code Implementation** | Implementation generation • Testing • Documentation |
+| ⚡ **Output Delivery** | 📦 Complete codebase • 🧪 Test suites • 📚 Documentation • 🚀 Deployment-ready |
+
+A schematic sketch of this staged hand-off follows the table.
+
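+To make the staged hand-off concrete, here is a purely schematic Python sketch. The stage functions are hypothetical stand-ins for the agents in the table; the real pipeline runs MCP-backed agents asynchronously, so take this only as a reading aid for the flow.
+
+```python
+# Schematic sketch of the staged pipeline above. Each stage is a toy
+# stand-in callable that enriches a shared context dict, in order.
+from typing import Callable, Dict, List
+
+Stage = Callable[[Dict], Dict]
+
+def analyze(ctx: Dict) -> Dict:          # text/document analysis + planning
+    ctx["plan"] = f"reproduction plan for: {ctx['input']}"
+    return ctx
+
+def implement(ctx: Dict) -> Dict:        # code implementation stage
+    ctx["code"] = "# generated code for -> " + ctx["plan"]
+    return ctx
+
+def deliver(ctx: Dict) -> Dict:          # output delivery stage
+    ctx["artifacts"] = ["repo/", "tests/", "README.md"]
+    return ctx
+
+PIPELINE: List[Stage] = [analyze, implement, deliver]
+
+def run(user_input: str) -> Dict:
+    ctx: Dict = {"input": user_input}
+    for stage in PIPELINE:               # orchestrator walks stages in order
+        ctx = stage(ctx)
+    return ctx
+
+print(run("reproduce the paper's core algorithm")["artifacts"])
+```
+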
+### 🔄 **Pipeline Intelligence Features**
+
+- 🎯 **Adaptive Flow**: Dynamic agent selection based on input complexity (a toy sketch of this routing idea follows the list)
+- 🧠 **Smart Coordination**: Intelligent task distribution and parallel processing
+- 🔍 **Context Awareness**: Deep understanding through CodeRAG integration
+- ⚡ **Quality Assurance**: Automated testing and validation throughout
+
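+The "adaptive flow" bullet can be illustrated with a toy router that picks a processing path from rough complexity signals. The thresholds, keywords, and path names here are hypothetical, not DeepCode's actual routing rules.
+
+```python
+# Toy "adaptive flow" router: choose a processing path from coarse
+# input-complexity signals. All thresholds/keywords are illustrative.
+def choose_path(text: str, has_attachment: bool) -> str:
+    lowered = text.lower()
+    if has_attachment or len(text) > 50_000:
+        return "document-analysis"        # papers and long specifications
+    if any(k in lowered for k in ("api", "backend", "endpoint")):
+        return "backend-generation"
+    if any(k in lowered for k in ("ui", "page", "frontend")):
+        return "frontend-generation"
+    return "general-planning"
+
+print(choose_path("Build a REST API for bookings", has_attachment=False))
+# -> backend-generation
+```
+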
+---
+
+## 🚀 Quick Start
+
+### 📦 **Step 1: Installation**
+
+#### ⚡ **Direct Installation (Recommended)**
+
+```bash
+# 🚀 Install the DeepCode package directly
+pip install deepcode-hku
+
+# 🔑 Download the configuration files
+curl -O https://raw.githubusercontent.com/HKUDS/DeepCode/main/mcp_agent.config.yaml
+curl -O https://raw.githubusercontent.com/HKUDS/DeepCode/main/mcp_agent.secrets.yaml
+
+# 🔑 Configure API keys (required)
+# Edit mcp_agent.secrets.yaml with your API keys and base_url:
+# - openai: api_key, base_url (for OpenAI/custom endpoints)
+# - anthropic: api_key (for Claude models)
+
+# 🔑 Configure search API keys for web search (optional)
+# Edit mcp_agent.config.yaml to set your keys:
+# - Brave Search: set BRAVE_API_KEY: "your_key_here" in the brave.env section (around line 28)
+# - Bocha-MCP: set BOCHA_API_KEY: "your_key_here" in the bocha-mcp.env section (around line 74)
+
+# 📄 Configure document segmentation (optional)
+# Edit mcp_agent.config.yaml to control document processing:
+# - enabled: true/false (whether to use smart document segmentation)
+# - size_threshold_chars: 50000 (document size that triggers segmentation)
+```
+
+A small sketch of how these segmentation switches are read is shown below.
+
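+For reference, the segmentation switches above live under the `document_segmentation` key of `mcp_agent.config.yaml`. The following minimal Python sketch shows how they might be read and applied; the key names and defaults match the config file, while the `process` helper is an illustrative placeholder rather than DeepCode's actual code path.
+
+```python
+# Sketch: read document_segmentation from mcp_agent.config.yaml and gate
+# segmentation on document size. process() is a placeholder helper.
+import yaml
+
+def load_segmentation_config(path="mcp_agent.config.yaml"):
+    with open(path, "r", encoding="utf-8") as f:
+        config = yaml.safe_load(f) or {}
+    seg = config.get("document_segmentation", {})
+    return seg.get("enabled", True), seg.get("size_threshold_chars", 50000)
+
+def process(document: str) -> str:
+    enabled, threshold = load_segmentation_config()
+    if enabled and len(document) > threshold:
+        return "segmented"     # smart segmentation path
+    return "traditional"       # fallback for smaller documents
+
+print(process("x" * 60_000))   # -> "segmented" with the default config
+```
+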
+#### 🔧 **Development Installation (From Source)**
+
+<details>
+<summary>📂 Click to expand the development installation options</summary>
+
+##### 🔥 **Using UV (Recommended for Development)**
+
+```bash
+# 🔽 Clone the repository
+git clone https://github.com/HKUDS/DeepCode.git
+cd DeepCode/
+
+# 📦 Install the UV package manager
+curl -LsSf https://astral.sh/uv/install.sh | sh
+
+# 🔧 Install dependencies with UV
+uv venv --python=3.13
+source .venv/bin/activate  # On Windows: .venv\Scripts\activate
+uv pip install -r requirements.txt
+
+# 🔑 Configure API keys, search keys, and document segmentation
+# exactly as described under Step 1 above
+```
+
+##### 🐍 **Using Traditional pip**
+
+```bash
+# 🔽 Clone the repository
+git clone https://github.com/HKUDS/DeepCode.git
+cd DeepCode/
+
+# 📦 Install dependencies
+pip install -r requirements.txt
+
+# 🔑 Configure API keys, search keys, and document segmentation
+# exactly as described under Step 1 above
+```
+
+</details>
+
+#### 🪟 **Windows Users: Additional MCP Server Configuration**
+
+On Windows you may need to configure the MCP servers manually in `mcp_agent.config.yaml`:
+
+```bash
+# 1. Install the MCP servers globally
+npm i -g @modelcontextprotocol/server-brave-search
+npm i -g @modelcontextprotocol/server-filesystem
+
+# 2. Locate your global node_modules path
+npm -g root
+```
+
+Then update `mcp_agent.config.yaml` to use absolute paths:
+
+```yaml
+mcp:
+  servers:
+    brave:
+      command: "node"
+      args: ["C:/Program Files/nodejs/node_modules/@modelcontextprotocol/server-brave-search/dist/index.js"]
+    filesystem:
+      command: "node"
+      args: ["C:/Program Files/nodejs/node_modules/@modelcontextprotocol/server-filesystem/dist/index.js", "."]
+```
+
+> **Note**: Replace the paths with the actual global node_modules path from step 2.
+
+#### 🔍 **Search Server Configuration (Optional)**
+
+DeepCode supports multiple search servers for its web-search capability. Configure your preferred option in `mcp_agent.config.yaml`:
+
+```yaml
+# Default search server configuration
+# Options: "brave" or "bocha-mcp"
+default_search_server: "brave"
+```
+
+**Available options:**
+- **🔍 Brave Search** (`"brave"`):
+  - Default option with high-quality search results
+  - Requires a BRAVE_API_KEY
+  - Recommended for most users
+
+- **🌐 Bocha-MCP** (`"bocha-mcp"`):
+  - Alternative search server
+  - Requires a BOCHA_API_KEY
+  - Implemented as a local Python server
+
+**API-key configuration in mcp_agent.config.yaml:**
+```yaml
+# For Brave Search (default) - around line 28
+brave:
+  command: "npx"
+  args: ["-y", "@modelcontextprotocol/server-brave-search"]
+  env:
+    BRAVE_API_KEY: "your_brave_api_key_here"
+
+# For Bocha-MCP (alternative) - around line 74
+bocha-mcp:
+  command: "python"
+  args: ["tools/bocha_search_server.py"]
+  env:
+    PYTHONPATH: "."
+    BOCHA_API_KEY: "your_bocha_api_key_here"
+```
+
+> **💡 Tip**: Both search servers require an API key. Pick whichever best matches your API access and needs.
+
+### ⚡ **Step 2: Launch the Application**
+
+#### 🚀 **Using the Installed Package (Recommended)**
+
+```bash
+# 🌐 Launch the web interface directly
+deepcode
+
+# The application starts automatically at http://localhost:8501
+```
+
+#### 🛠️ **From Source**
+
+Pick your preferred interface:
+
+##### 🌐 **Web Interface** (recommended)
+```bash
+# With UV
+uv run streamlit run ui/streamlit_app.py
+# Or with plain Python
+streamlit run ui/streamlit_app.py
+```
+
+[Web Access]
+
+##### 🖥️ **CLI Interface** (power users)
+```bash
+# With UV
+uv run python cli/main_cli.py
+# Or with plain Python
+python cli/main_cli.py
+```
+
+[CLI Mode]
+
+### 🎯 **Step 3: Generate Code**
+
+1. **📄 Input**: Upload your research paper, provide requirements, or paste a URL
+2. **🤖 Processing**: Watch the multi-agent system analyze and plan
+3. **⚡ Output**: Receive production-ready code with tests and documentation
+
+---
+
+## 💡 Examples
+
+### 🎬 **Live Demos**
+
+#### 📄 **Paper2Code Demo**: research to implementation
+
+[Paper2Code Demo] **[▶️ Watch the demo](https://www.youtube.com/watch?v=MQZYpLkzsbw)**
+
+*Automatically turns academic papers into production-ready code*
+
+#### 🖼️ **Image Processing Demo**: AI-powered image tools
+
+[Image Processing Demo] **[▶️ Watch the demo](https://www.youtube.com/watch?v=nFt5mLaMEac)**
+
+*Intelligent image processing with background removal and enhancement*
+
+#### 🌐 **Front-End Implementation**: complete web applications
+
+[Frontend Demo] **[▶️ Watch the demo](https://www.youtube.com/watch?v=78wx3dkTaAU)**
+
+*Full-stack web development from concept to deployment*
+
+### 🆕 **Latest Updates**
+
+#### 📄 **Smart Document Segmentation (v1.2.0)**
+- **Intelligent processing**: Automatically handles large research papers and technical documents that exceed LLM token limits
+- **Configurable control**: Toggle segmentation via configuration, with a size-based threshold
+- **Semantic analysis**: Advanced content understanding that preserves algorithms, concepts, and formulas
+- **Backward compatible**: Falls back seamlessly to traditional processing for smaller documents
+
+### 🚀 **Coming Soon**
+
+We are continuously enhancing DeepCode with exciting new capabilities:
+
+#### 🔧 **Enhanced Code Reliability and Validation**
+- **Automated testing**: Comprehensive functional testing with execution validation and error detection.
+- **Code quality assurance**: Multi-level validation through static analysis, dynamic testing, and performance benchmarking.
+- **Intelligent debugging**: AI-driven error detection with automatic correction suggestions.
+
+#### 📊 **PaperBench Performance Showcase**
+- **Benchmark dashboard**: Comprehensive performance metrics on the PaperBench evaluation suite.
+- **Accuracy metrics**: Detailed comparisons against state-of-the-art paper-reproduction systems.
+- **Success analysis**: Statistical analysis across paper categories and complexity levels.
+
+#### ⚡ **System-Level Optimizations**
+- **Performance boosts**: Multi-threaded processing and optimized agent coordination for faster generation.
+- **Enhanced reasoning**: Advanced reasoning capabilities with improved context understanding.
+- **Expanded support**: Broader compatibility with additional programming languages and frameworks.
+
+---
+
+## ⭐ Star History
+
+*Community growth trajectory*
+
+[Star History Chart]
+
+---
+
+### 🚀 **Ready to transform the way you build software?**
+
+[Get Started] [View on GitHub] [Star Project]
+
+---
+
+### 📄 **License**
+
+**MIT License** - Copyright (c) 2025 Data Intelligence Lab, The University of Hong Kong
+
+---
+
+[Visitors badge]
+
diff --git a/__init__.py b/__init__.py index feedfa01..680cae06 100644 --- a/__init__.py +++ b/__init__.py @@ -5,7 +5,7 @@ ⚡ Transform research papers into working code automatically """ -__version__ = "1.0.4" +__version__ = "1.0.5" __author__ = "DeepCode Team" __url__ = "https://github.com/HKUDS/DeepCode" diff --git a/cli/cli_app.py b/cli/cli_app.py index 6504ceda..0b0627fa 100644 --- a/cli/cli_app.py +++ b/cli/cli_app.py @@ -37,8 +37,7 @@ def __init__(self): self.app = None # Will be initialized by workflow adapter self.logger = None self.context = None - # Document segmentation configuration - self.segmentation_config = {"enabled": True, "size_threshold_chars": 50000} + # Document segmentation will be managed by CLI interface async def initialize_mcp_app(self): """初始化MCP应用 - 使用工作流适配器""" @@ -49,50 +48,10 @@ async def cleanup_mcp_app(self): """清理MCP应用 - 使用工作流适配器""" await self.workflow_adapter.cleanup_mcp_app() - def update_segmentation_config(self): - """Update document segmentation configuration in mcp_agent.config.yaml""" - import yaml - import os - - config_path = os.path.join( - os.path.dirname(os.path.dirname(os.path.abspath(__file__))), - "mcp_agent.config.yaml", - ) - - try: - # Read current config - with open(config_path, "r", encoding="utf-8") as f: - config = yaml.safe_load(f) - - # Update document segmentation settings - if "document_segmentation" not in config: - config["document_segmentation"] = {} - - config["document_segmentation"]["enabled"] = self.segmentation_config[ - "enabled" - ] - config["document_segmentation"]["size_threshold_chars"] = ( - self.segmentation_config["size_threshold_chars"] - ) - - # Write updated config - with open(config_path, "w", encoding="utf-8") as f: - yaml.dump(config, f, default_flow_style=False, allow_unicode=True) - - self.cli.print_status( - "📄 Document segmentation configuration updated", "success" - ) - - except Exception as e: - self.cli.print_status( - f"⚠️ Failed to update segmentation config: {str(e)}", "warning" - ) - async def process_input(self, input_source: str, input_type: str): """处理输入源(URL或文件)- 使用升级版智能体编排引擎""" try: - # Update segmentation configuration before processing - self.update_segmentation_config() + # Document segmentation configuration is managed by CLI interface self.cli.print_separator() self.cli.print_status( @@ -281,20 +240,9 @@ async def run_interactive_session(self): self.cli.show_history() elif choice in ["c", "config", "configure"]: - # Sync current segmentation config from CLI interface - self.segmentation_config["enabled"] = self.cli.segmentation_enabled - self.segmentation_config["size_threshold_chars"] = ( - self.cli.segmentation_threshold - ) - + # Show configuration menu - all settings managed by CLI interface self.cli.show_configuration_menu() - # Sync back from CLI interface after configuration changes - self.segmentation_config["enabled"] = self.cli.segmentation_enabled - self.segmentation_config["size_threshold_chars"] = ( - self.cli.segmentation_threshold - ) - else: self.cli.print_status( "Invalid choice. 
Please select U, F, T, C, H, or Q.", "warning" diff --git a/cli/cli_interface.py b/cli/cli_interface.py index 770dc1c9..3bf6304a 100644 --- a/cli/cli_interface.py +++ b/cli/cli_interface.py @@ -40,9 +40,67 @@ def __init__(self): self.is_running = True self.processing_history = [] self.enable_indexing = True # Default configuration - self.segmentation_enabled = True # Default to smart segmentation - self.segmentation_threshold = 50000 # Default threshold + # Load segmentation config from the same source as UI + self._load_segmentation_config() + + # Initialize tkinter availability + self._init_tkinter() + + def _load_segmentation_config(self): + """Load segmentation configuration from mcp_agent.config.yaml""" + try: + from utils.llm_utils import get_document_segmentation_config + + seg_config = get_document_segmentation_config() + self.segmentation_enabled = seg_config.get("enabled", True) + self.segmentation_threshold = seg_config.get("size_threshold_chars", 50000) + except Exception as e: + print(f"⚠️ Warning: Failed to load segmentation config: {e}") + # Fall back to defaults + self.segmentation_enabled = True + self.segmentation_threshold = 50000 + + def _save_segmentation_config(self): + """Save segmentation configuration to mcp_agent.config.yaml""" + import yaml + import os + + # Get the project root directory (where mcp_agent.config.yaml is located) + current_file = os.path.abspath(__file__) + cli_dir = os.path.dirname(current_file) # cli directory + project_root = os.path.dirname(cli_dir) # project root + config_path = os.path.join(project_root, "mcp_agent.config.yaml") + + try: + # Read current config + with open(config_path, "r", encoding="utf-8") as f: + config = yaml.safe_load(f) + + # Update document segmentation settings + if "document_segmentation" not in config: + config["document_segmentation"] = {} + + config["document_segmentation"]["enabled"] = self.segmentation_enabled + config["document_segmentation"]["size_threshold_chars"] = ( + self.segmentation_threshold + ) + + # Write updated config + with open(config_path, "w", encoding="utf-8") as f: + yaml.dump(config, f, default_flow_style=False, allow_unicode=True) + + print( + f"{Colors.OKGREEN}✅ Document segmentation configuration updated{Colors.ENDC}" + ) + + except Exception as e: + print( + f"{Colors.WARNING}⚠️ Failed to update segmentation config: {str(e)}{Colors.ENDC}" + ) + + def _init_tkinter(self): + """Initialize tkinter availability check""" # Check tkinter availability for file dialogs self.tkinter_available = True try: @@ -765,6 +823,8 @@ def show_configuration_menu(self): elif choice in ["s", "segmentation"]: current_state = getattr(self, "segmentation_enabled", True) self.segmentation_enabled = not current_state + # Save the configuration to file + self._save_segmentation_config() seg_mode = ( "📄 Smart Segmentation" if self.segmentation_enabled diff --git a/cli/main_cli.py b/cli/main_cli.py index 1e70baa0..bee8f748 100644 --- a/cli/main_cli.py +++ b/cli/main_cli.py @@ -230,18 +230,16 @@ async def main(): print( f"\n{Colors.MAGENTA}📄 Document segmentation disabled - using traditional processing{Colors.ENDC}" ) - app.segmentation_config = { - "enabled": False, - "size_threshold_chars": args.segmentation_threshold, - } + app.cli.segmentation_enabled = False + app.cli.segmentation_threshold = args.segmentation_threshold + app.cli._save_segmentation_config() else: print( f"\n{Colors.BLUE}📄 Smart document segmentation enabled (threshold: {args.segmentation_threshold} chars){Colors.ENDC}" ) - app.segmentation_config 
= { - "enabled": True, - "size_threshold_chars": args.segmentation_threshold, - } + app.cli.segmentation_enabled = True + app.cli.segmentation_threshold = args.segmentation_threshold + app.cli._save_segmentation_config() # 检查是否为直接处理模式 if args.file or args.url or args.chat: diff --git a/config/mcp_tool_definitions.py b/config/mcp_tool_definitions.py index 36c0b3e2..732c7df8 100644 --- a/config/mcp_tool_definitions.py +++ b/config/mcp_tool_definitions.py @@ -26,8 +26,10 @@ def get_code_implementation_tools() -> List[Dict[str, Any]]: """ return [ MCPToolDefinitions._get_read_file_tool(), + MCPToolDefinitions._get_read_multiple_files_tool(), MCPToolDefinitions._get_read_code_mem_tool(), MCPToolDefinitions._get_write_file_tool(), + MCPToolDefinitions._get_write_multiple_files_tool(), MCPToolDefinitions._get_execute_python_tool(), MCPToolDefinitions._get_execute_bash_tool(), ] @@ -58,6 +60,31 @@ def _get_read_file_tool() -> Dict[str, Any]: }, } + @staticmethod + def _get_read_multiple_files_tool() -> Dict[str, Any]: + """批量读取多个文件工具定义""" + return { + "name": "read_multiple_files", + "description": "Read multiple files in a single operation (for batch reading)", + "input_schema": { + "type": "object", + "properties": { + "file_requests": { + "type": "string", + "description": 'JSON string with file requests, e.g., \'{"file1.py": {}, "file2.py": {"start_line": 1, "end_line": 10}}\' or simple array \'["file1.py", "file2.py"]\'', + }, + "max_files": { + "type": "integer", + "description": "Maximum number of files to read in one operation", + "default": 5, + "minimum": 1, + "maximum": 10, + }, + }, + "required": ["file_requests"], + }, + } + @staticmethod def _get_read_code_mem_tool() -> Dict[str, Any]: """Read code memory tool definition - reads from implement_code_summary.md""" @@ -109,6 +136,41 @@ def _get_write_file_tool() -> Dict[str, Any]: }, } + @staticmethod + def _get_write_multiple_files_tool() -> Dict[str, Any]: + """批量写入多个文件工具定义""" + return { + "name": "write_multiple_files", + "description": "Write multiple files in a single operation (for batch implementation)", + "input_schema": { + "type": "object", + "properties": { + "file_implementations": { + "type": "string", + "description": 'JSON string mapping file paths to content, e.g., \'{"file1.py": "content1", "file2.py": "content2"}\'', + }, + "create_dirs": { + "type": "boolean", + "description": "Whether to create directories if they don't exist", + "default": True, + }, + "create_backup": { + "type": "boolean", + "description": "Whether to create backup files if they already exist", + "default": False, + }, + "max_files": { + "type": "integer", + "description": "Maximum number of files to write in one operation", + "default": 5, + "minimum": 1, + "maximum": 10, + }, + }, + "required": ["file_implementations"], + }, + } + @staticmethod def _get_execute_python_tool() -> Dict[str, Any]: """Python执行工具定义""" diff --git a/config/mcp_tool_definitions_index.py b/config/mcp_tool_definitions_index.py index 830313aa..5abfe4d1 100644 --- a/config/mcp_tool_definitions_index.py +++ b/config/mcp_tool_definitions_index.py @@ -26,11 +26,35 @@ def get_code_implementation_tools() -> List[Dict[str, Any]]: """ return [ MCPToolDefinitions._get_read_file_tool(), + MCPToolDefinitions._get_read_multiple_files_tool(), MCPToolDefinitions._get_read_code_mem_tool(), MCPToolDefinitions._get_write_file_tool(), + MCPToolDefinitions._get_write_multiple_files_tool(), MCPToolDefinitions._get_execute_python_tool(), MCPToolDefinitions._get_execute_bash_tool(), 
MCPToolDefinitions._get_search_code_references_tool(), + MCPToolDefinitions._get_search_code_tool(), + MCPToolDefinitions._get_file_structure_tool(), + MCPToolDefinitions._get_set_workspace_tool(), + MCPToolDefinitions._get_operation_history_tool(), + ] + + @staticmethod + def get_code_evaluation_tools() -> List[Dict[str, Any]]: + """ + 获取代码评估相关的工具定义 + Get tool definitions for code evaluation + """ + return [ + MCPToolDefinitions._get_analyze_repo_structure_tool(), + MCPToolDefinitions._get_detect_dependencies_tool(), + MCPToolDefinitions._get_assess_code_quality_tool(), + MCPToolDefinitions._get_evaluate_documentation_tool(), + MCPToolDefinitions._get_check_reproduction_readiness_tool(), + MCPToolDefinitions._get_generate_evaluation_summary_tool(), + MCPToolDefinitions._get_detect_empty_files_tool(), + MCPToolDefinitions._get_detect_missing_files_tool(), + MCPToolDefinitions._get_generate_code_revision_report_tool(), ] @staticmethod @@ -59,6 +83,31 @@ def _get_read_file_tool() -> Dict[str, Any]: }, } + @staticmethod + def _get_read_multiple_files_tool() -> Dict[str, Any]: + """批量读取多个文件工具定义""" + return { + "name": "read_multiple_files", + "description": "Read multiple files in a single operation (for batch reading)", + "input_schema": { + "type": "object", + "properties": { + "file_requests": { + "type": "string", + "description": 'JSON string with file requests, e.g., \'{"file1.py": {}, "file2.py": {"start_line": 1, "end_line": 10}}\' or simple array \'["file1.py", "file2.py"]\'', + }, + "max_files": { + "type": "integer", + "description": "Maximum number of files to read in one operation", + "default": 5, + "minimum": 1, + "maximum": 10, + }, + }, + "required": ["file_requests"], + }, + } + @staticmethod def _get_read_code_mem_tool() -> Dict[str, Any]: """Read code memory tool definition - reads from implement_code_summary.md""" @@ -110,6 +159,41 @@ def _get_write_file_tool() -> Dict[str, Any]: }, } + @staticmethod + def _get_write_multiple_files_tool() -> Dict[str, Any]: + """批量写入多个文件工具定义""" + return { + "name": "write_multiple_files", + "description": "Write multiple files in a single operation (for batch implementation)", + "input_schema": { + "type": "object", + "properties": { + "file_implementations": { + "type": "string", + "description": 'JSON string mapping file paths to content, e.g., \'{"file1.py": "content1", "file2.py": "content2"}\'', + }, + "create_dirs": { + "type": "boolean", + "description": "Whether to create directories if they don't exist", + "default": True, + }, + "create_backup": { + "type": "boolean", + "description": "Whether to create backup files if they already exist", + "default": False, + }, + "max_files": { + "type": "integer", + "description": "Maximum number of files to write in one operation", + "default": 5, + "minimum": 1, + "maximum": 10, + }, + }, + "required": ["file_implementations"], + }, + } + @staticmethod def _get_execute_python_tool() -> Dict[str, Any]: """Python执行工具定义""" @@ -208,6 +292,56 @@ def _get_search_code_references_tool() -> Dict[str, Any]: }, } + @staticmethod + def _get_search_code_tool() -> Dict[str, Any]: + """代码搜索工具定义 - 在当前代码库中搜索模式""" + return { + "name": "search_code", + "description": "Search patterns in code files within the current repository", + "input_schema": { + "type": "object", + "properties": { + "pattern": { + "type": "string", + "description": "Search pattern", + }, + "file_pattern": { + "type": "string", + "description": "File pattern (e.g., '*.py')", + "default": "*.py", + }, + "use_regex": { + "type": "boolean", 
+ "description": "Whether to use regular expressions", + "default": False, + }, + "search_directory": { + "type": "string", + "description": "Specify search directory (optional)", + }, + }, + "required": ["pattern"], + }, + } + + @staticmethod + def _get_operation_history_tool() -> Dict[str, Any]: + """操作历史工具定义""" + return { + "name": "get_operation_history", + "description": "Get operation history", + "input_schema": { + "type": "object", + "properties": { + "last_n": { + "type": "integer", + "description": "Return the last N operations", + "default": 10, + }, + }, + }, + } + @staticmethod def _get_get_indexes_overview_tool() -> Dict[str, Any]: """获取索引概览工具定义""" @@ -262,6 +396,176 @@ def _get_set_workspace_tool() -> Dict[str, Any]: # } # } + # Code evaluation tool definitions + @staticmethod + def _get_analyze_repo_structure_tool() -> Dict[str, Any]: + return { + "name": "analyze_repo_structure", + "description": "Perform comprehensive repository structure analysis", + "input_schema": { + "type": "object", + "properties": { + "repo_path": { + "type": "string", + "description": "Path to the repository to analyze", + } + }, + "required": ["repo_path"], + }, + } + + @staticmethod + def _get_detect_dependencies_tool() -> Dict[str, Any]: + return { + "name": "detect_dependencies", + "description": "Detect and analyze project dependencies across multiple languages", + "input_schema": { + "type": "object", + "properties": { + "repo_path": { + "type": "string", + "description": "Path to the repository", + } + }, + "required": ["repo_path"], + }, + } + + @staticmethod + def _get_assess_code_quality_tool() -> Dict[str, Any]: + return { + "name": "assess_code_quality", + "description": "Assess code quality metrics and identify potential issues", + "input_schema": { + "type": "object", + "properties": { + "repo_path": { + "type": "string", + "description": "Path to the repository", + } + }, + "required": ["repo_path"], + }, + } + + @staticmethod + def _get_evaluate_documentation_tool() -> Dict[str, Any]: + return { + "name": "evaluate_documentation", + "description": "Evaluate documentation completeness and quality", + "input_schema": { + "type": "object", + "properties": { + "repo_path": { + "type": "string", + "description": "Path to the repository", + }, + "docs_path": { + "type": "string", + "description": "Optional path to external documentation", + }, + }, + "required": ["repo_path"], + }, + } + + @staticmethod + def _get_check_reproduction_readiness_tool() -> Dict[str, Any]: + return { + "name": "check_reproduction_readiness", + "description": "Assess repository readiness for reproduction and validation", + "input_schema": { + "type": "object", + "properties": { + "repo_path": { + "type": "string", + "description": "Path to the repository", + }, + "docs_path": { + "type": "string", + "description": "Optional path to reproduction documentation", + }, + }, + "required": ["repo_path"], + }, + } + + @staticmethod + def _get_generate_evaluation_summary_tool() -> Dict[str, Any]: + return { + "name": "generate_evaluation_summary", + "description": "Generate comprehensive evaluation summary combining all analysis results", + "input_schema": { + "type": "object", + "properties": { + "repo_path": { + "type": "string", + "description": "Path to the repository", + }, + "docs_path": { + "type": "string", + "description": "Optional path to reproduction documentation", + }, + }, + "required": ["repo_path"], + }, + } + + @staticmethod + def _get_detect_empty_files_tool() -> Dict[str, Any]: + return { + "name": 
"detect_empty_files", + "description": "Detect empty files in the repository that may need implementation", + "input_schema": { + "type": "object", + "properties": { + "repo_path": { + "type": "string", + "description": "Path to the repository to analyze", + } + }, + "required": ["repo_path"], + }, + } + + @staticmethod + def _get_detect_missing_files_tool() -> Dict[str, Any]: + return { + "name": "detect_missing_files", + "description": "Detect missing essential files like main programs, tests, requirements, etc.", + "input_schema": { + "type": "object", + "properties": { + "repo_path": { + "type": "string", + "description": "Path to the repository to analyze", + } + }, + "required": ["repo_path"], + }, + } + + @staticmethod + def _get_generate_code_revision_report_tool() -> Dict[str, Any]: + return { + "name": "generate_code_revision_report", + "description": "Generate comprehensive code revision report combining empty files, missing files, and quality analysis", + "input_schema": { + "type": "object", + "properties": { + "repo_path": { + "type": "string", + "description": "Path to the repository to analyze", + }, + "docs_path": { + "type": "string", + "description": "Optional path to documentation", + }, + }, + "required": ["repo_path"], + }, + } + @staticmethod def get_available_tool_sets() -> Dict[str, str]: """ @@ -270,6 +574,7 @@ def get_available_tool_sets() -> Dict[str, str]: """ return { "code_implementation": "代码实现相关工具集 / Code implementation tool set", + "code_evaluation": "代码评估相关工具集 / Code evaluation tool set", # 可以在这里添加更多工具集 # "data_analysis": "数据分析工具集 / Data analysis tool set", # "web_scraping": "网页爬取工具集 / Web scraping tool set", @@ -283,6 +588,7 @@ def get_tool_set(tool_set_name: str) -> List[Dict[str, Any]]: """ tool_sets = { "code_implementation": MCPToolDefinitions.get_code_implementation_tools(), + "code_evaluation": MCPToolDefinitions.get_code_evaluation_tools(), } return tool_sets.get(tool_set_name, []) diff --git a/deepcode.py b/deepcode.py index 76fdec32..9a300c54 100755 --- a/deepcode.py +++ b/deepcode.py @@ -146,8 +146,81 @@ def print_banner(): print(banner) +def launch_paper_test(paper_name: str, fast_mode: bool = False): + """Launch paper testing mode""" + try: + print("\n🧪 Launching Paper Test Mode") + print(f"📄 Paper: {paper_name}") + print(f"⚡ Fast mode: {'enabled' if fast_mode else 'disabled'}") + print("=" * 60) + + # Run the test setup + setup_cmd = [sys.executable, "test_paper.py", paper_name] + if fast_mode: + setup_cmd.append("--fast") + + result = subprocess.run(setup_cmd, check=True) + + if result.returncode == 0: + print("\n✅ Paper test setup completed successfully!") + print("📁 Files are ready in deepcode_lab/papers/") + print("\n💡 Next steps:") + print(" 1. Install MCP dependencies: pip install -r requirements.txt") + print( + f" 2. 
Run full pipeline: python -m workflows.paper_test_engine --paper {paper_name}" + + (" --fast" if fast_mode else "") + ) + + except subprocess.CalledProcessError as e: + print(f"\n❌ Paper test setup failed: {e}") + sys.exit(1) + except Exception as e: + print(f"\n❌ Unexpected error: {e}") + sys.exit(1) + + def main(): """Main function""" + # Parse command line arguments + if len(sys.argv) > 1: + if sys.argv[1] == "test" and len(sys.argv) >= 3: + # Paper testing mode: python deepcode.py test rice [--fast] + paper_name = sys.argv[2] + fast_mode = "--fast" in sys.argv or "-f" in sys.argv + + print_banner() + launch_paper_test(paper_name, fast_mode) + return + elif sys.argv[1] in ["--help", "-h", "help"]: + print_banner() + print(""" +🔧 Usage: + python deepcode.py - Launch web interface + python deepcode.py test - Test paper reproduction + python deepcode.py test --fast - Test paper (fast mode) + +📄 Examples: + python deepcode.py test rice - Test RICE paper reproduction + python deepcode.py test rice --fast - Test RICE paper (fast mode) + +📁 Available papers:""") + + # List available papers + papers_dir = "papers" + if os.path.exists(papers_dir): + for item in os.listdir(papers_dir): + item_path = os.path.join(papers_dir, item) + if os.path.isdir(item_path): + paper_md = os.path.join(item_path, "paper.md") + addendum_md = os.path.join(item_path, "addendum.md") + status = "✅" if os.path.exists(paper_md) else "❌" + addendum_status = "📄" if os.path.exists(addendum_md) else "➖" + print(f" {status} {item} {addendum_status}") + print( + "\n Legend: ✅ = paper.md exists, 📄 = addendum.md exists, ➖ = no addendum" + ) + return + print_banner() # Check dependencies @@ -182,7 +255,7 @@ def main(): "run", str(streamlit_app_path), "--server.port", - "8501", + "8503", "--server.address", "localhost", "--browser.gatherUsageStats", diff --git a/mcp_agent.config.yaml b/mcp_agent.config.yaml index d3b50207..fce8aaf9 100644 --- a/mcp_agent.config.yaml +++ b/mcp_agent.config.yaml @@ -75,6 +75,7 @@ mcp: args: - -y - '@modelcontextprotocol/server-filesystem' + - . command: npx github-downloader: args: @@ -83,8 +84,8 @@ mcp: env: PYTHONPATH: . openai: - default_model: "anthropic/claude-sonnet-4" + base_max_tokens: 16384 + default_model: anthropic/claude-3.5-sonnet + max_tokens_policy: adaptive + retry_max_tokens: 32768 planning_mode: traditional - -anthropic: - default_model: "" diff --git a/mcp_agent.secrets.yaml b/mcp_agent.secrets.yaml index 435cf3f7..6e670162 100644 --- a/mcp_agent.secrets.yaml +++ b/mcp_agent.secrets.yaml @@ -3,5 +3,6 @@ openai: base_url: "" + anthropic: api_key: "" diff --git a/prompts/code_prompts.py b/prompts/code_prompts.py index 13a61ade..9896db3c 100644 --- a/prompts/code_prompts.py +++ b/prompts/code_prompts.py @@ -63,6 +63,8 @@ Task: Handle paper according to input type and save to "./deepcode_lab/papers/id/id.md" Note: Generate id (id is a number) by counting files in "./deepcode_lab/papers/" directory and increment by 1. +CRITICAL RULE: NEVER use write_file tool to create paper content directly. Always use file-downloader tools for PDF/document conversion. + Processing Rules: 1. URL Input (input_type = "url"): - Use "file-downloader" tool to download paper @@ -70,8 +72,9 @@ - Return saved file path and metadata 2. 
File Input (input_type = "file"): - - Move file to "./deepcode_lab/papers/id/" - - Use "file-downloader" tool to convert to .md format + - Move file to "./deepcode_lab/papers/id/" using move_file_to tool + - The move_file_to tool will automatically convert PDF/documents to .md format + - NEVER manually extract content or use write_file - let the conversion tools handle this - Return new saved file path and metadata 3. Directory Input (input_type = "directory"): @@ -581,7 +584,7 @@ ⚠️ IMPORTANT: Generate a COMPLETE plan that includes ALL 5 sections without being cut off by token limits. ## Content Balance Guidelines: -- **Section 1 (File Structure)**: Brief overview (10% of content) - Focus on CORE implementation files only +- **Section 1 (File Structure)**: Brief overview (10% of content) - Include all files but focus on implementation priority - **Section 2 (Implementation Components)**: Detailed but concise (40% of content) - This is the PRIORITY section - **Section 3 (Validation)**: Moderate detail (25% of content) - Essential experiments and tests - **Section 4 (Environment)**: Brief but complete (10% of content) - All necessary dependencies @@ -595,7 +598,7 @@ 4. **FOURTH**: Configuration and data handling 5. **LAST**: Documentation files (README.md, requirements.txt) - These should be created AFTER core implementation -Note: README and requirements.txt are maintenance files that depend on the final implementation, so plan them last. +Note: README and requirements.txt are maintenance files that depend on the final implementation, so plan them last but INCLUDE them in the file structure. # DETAILED SYNTHESIS PROCESS @@ -649,14 +652,16 @@ # - Organize files and directories in the most logical way for implementation # - Create meaningful names and groupings based on paper content # - Keep it clean, intuitive, and focused on what actually needs to be implemented - # - EXCLUDE documentation files (README.md, requirements.txt) - these come last + # - INCLUDE documentation files (README.md, requirements.txt) but mark them for LAST implementation file_structure: | [Design and specify your own project structure here - KEEP THIS BRIEF] - [Focus ONLY on core implementation files, NOT documentation files] + [Include ALL necessary files including README.md and requirements.txt] [Organize based on what this paper actually contains and needs] [Create directories and files that make sense for this specific implementation] - [EXCLUDE: README.md, requirements.txt - these come last in implementation] + [IMPORTANT: Include executable files (e.g., main.py, run.py, train.py, demo.py) - choose names based on repo content] + [Design executable entry points that match the paper's main functionality and experiments] + [NOTE: README.md and requirements.txt should be implemented LAST after all code files] # SECTION 2: Implementation Components @@ -1099,10 +1104,10 @@ - **Reference only**: Use `search_code_references(indexes_path="indexes", target_file=the_file_you_want_to_implement, keywords=the_keywords_you_want_to_search)` for reference, NOT as implementation standard - **Core principle**: Original paper requirements take absolute priority over any reference code found 3. 
**TOOL EXECUTION STRATEGY**: - - ⚠️**Development Cycle (for each new file implementation)**: `read_code_mem` (check existing implementations in Working Directory, use `read_file` as fallback if memory unavailable`) → `search_code_references` (OPTIONAL reference check from `/home/agent/indexes`) → `write_file` (implement based on original paper) → `execute_python` (if should test) - - **Environment Setup**: `write_file` (requirements.txt) → `execute_bash` (pip install) → `execute_python` (verify) + - ⚠️**Development Cycle (for each new file implementation)**: `read_code_mem` (check existing implementations in Working Directory, use `read_file` as fallback if memory unavailable`) → `search_code_references` (OPTIONAL reference check from `/home/agent/indexes`) → `write_file` (implement based on original paper) → `execute_python` (if needed to verify implementation) + - **File Verification**: Use `execute_bash` and `execute_python` when needed to check implementation completeness -4. **CRITICAL**: Use bash and python tools to ACTUALLY REPLICATE the paper yourself - do not provide instructions. +4. **CRITICAL**: Use bash and python tools when needed to CHECK and VERIFY implementation completeness - do not provide instructions. These tools help validate that your implementation files are syntactically correct and properly structured. **Execution Guidelines**: - **Plan First**: Before each action, explain your reasoning and which function you'll use @@ -1147,24 +1152,22 @@ 1. **Identify** what needs to be implemented from the requirements 2. **Analyze Dependencies**: Before implementing each new file, use `read_code_mem` to read summaries of already-implemented files, then search for reference patterns to guide your implementation approach. 3. **Implement** one component at a time -4. **Test** immediately using `execute_python` or `execute_bash` to catch issues early - THIS IS MANDATORY, NOT OPTIONAL +4. **Verify** optionally using `execute_python` or `execute_bash` to check implementation completeness if needed 5. **Integrate** with existing components -6. **Verify** against requirement specifications using execution tools to ensure everything works +6. **Validate** against requirement specifications **TOOL CALLING STRATEGY**: 1. ⚠️ **SINGLE FUNCTION CALL PER MESSAGE**: Each message may perform only one function call. You will see the result of the function right after sending the message. If you need to perform multiple actions, you can always send more messages with subsequent function calls. Do some reasoning before your actions, describing what function calls you are going to use and how they fit into your plan. 2. **TOOL EXECUTION STRATEGY**: - - **Development Cycle (for each new file implementation)**: `read_code_mem` (check existing implementations in Working Directory, use `read_file` as fallback if memory unavailable) → `write_file` (implement) → **MANDATORY TESTING**: `execute_python` or `execute_bash` (ALWAYS test after implementation) - - **Environment Setup**: Use `execute_bash` for installing packages, setting up dependencies, downloading files, etc. - - **Testing & Debugging**: Use `execute_python` for Python code testing and `execute_bash` for system commands, package installation, file operations, and bug fixing - - **⚠️ TESTING REMINDER**: After implementing ANY file, you MUST call either `execute_python` or `execute_bash` to test the implementation. Do not skip this step! 
+ - **Development Cycle (for each new file implementation)**: `read_code_mem` (check existing implementations in Working Directory, use `read_file` as fallback if memory unavailable) → `write_file` (implement) → **Optional Verification**: `execute_python` or `execute_bash` (if needed to check implementation) + - **File Verification**: Use `execute_bash` and `execute_python` when needed to verify implementation completeness. -3. **CRITICAL**: Use `execute_bash` and `execute_python` tools to ACTUALLY IMPLEMENT and TEST the requirements yourself - do not provide instructions. These tools are essential for: - - Installing dependencies and setting up environments (`execute_bash`) - - Testing Python implementations (`execute_python`) - - Debugging and fixing issues (`execute_bash` for system-level, `execute_python` for Python-specific) - - Validating that your code actually works before moving to the next component +3. **CRITICAL**: Use `execute_bash` and `execute_python` tools when needed to CHECK and VERIFY file implementation completeness - do not provide instructions. These tools are essential for: + - Checking file syntax and import correctness (`execute_python`) + - Verifying file structure and dependencies (`execute_bash` for listing, `execute_python` for imports) + - Validating that implemented files are syntactically correct and can be imported + - Ensuring code implementation meets basic functionality requirements **Execution Guidelines**: - **Plan First**: Before each action, explain your reasoning and which function you'll use @@ -1192,83 +1195,174 @@ """ # Chat Agent Planning Prompt (Universal for Academic and Engineering Use) -CHAT_AGENT_PLANNING_PROMPT = """You are a universal project planning agent that creates implementation plans for any coding project: web apps, games, academic research, tools, etc. - -# 🎯 OBJECTIVE -Transform user requirements into a clear, actionable implementation plan with optimal file structure and dependencies. +CHAT_AGENT_PLANNING_PROMPT = """You are a universal software planning agent that produces implementation plans for ANY coding task across ALL languages and stacks: algorithms, libraries, CLIs, web/mobile apps, services/APIs, data/ML pipelines, system tools, infra/DevOps, and paper/code reproduction. -# 📋 OUTPUT FORMAT +# ROLE +- Act as a senior architect and tech lead. +- Convert ambiguous user input into a concrete, buildable plan. +- Proactively identify unknowns; make explicit assumptions; ask crisp follow‑ups only when blockers exist. +# OBJECTIVE +Create a concise but complete plan that a developer can implement without further clarification. + +# OUTPUT RULES (MUST FOLLOW) +- Output ONLY one fenced YAML block. No prose outside the block. +- Use specific names and values (avoid placeholders like "TBD" unless truly unknown). +- If information is missing, add it to open_questions and proceed with reasonable assumptions. +- Keep the file tree minimal but sufficient (<= 15 files unless justification provided in rationale). +- Be language-agnostic: choose stack-specific artifacts based on the determined language/framework (no Python bias). + +# ADAPTIVE PLANNING +Dynamically tailor the plan to the project_type and language_stack inferred from the user input. Include sections that apply; omit irrelevant ones. Common types and emphases: +- web_app/api/service: routing/endpoints, data models, auth, security, deployment. +- cli/tool/library: commands/APIs, packaging, versioning, usage examples. 
+- algorithm/research/paper_reproduction: problem, math/pseudocode, datasets, evaluation. +- data/etl/ml: schemas, pipelines, validation, experiments, tracking. +- system/infrastructure: processes, configs, observability, deployment. +- mobile/desktop: UI navigation, platform constraints, packaging, distribution. + +# YAML SCHEMA (STRICT) ```yaml project_plan: - title: "[Project Name]" - description: "[Brief description]" - project_type: "[web_app|game|academic|tool|api|other]" + meta: + title: "[Project Name]" + description: "[1-3 sentence summary focused on user value]" + project_type: "[web_app|api|cli|library|algorithm|ml_pipeline|system|mobile|desktop|other]" + stakeholders: ["[who benefits/uses it]"] + domain: "[e.g., finance, research, internal tooling]" + language_stack: + primary_language: "[e.g., TypeScript|Go|Rust|Python|Java|C#|C++|Swift|Kotlin|PHP]" + frameworks: ["[e.g., React, Spring, FastAPI, .NET, Qt, Flutter, Angular, Next.js, Express, Gin]"] + build_tool: "[e.g., npm/yarn/pnpm|cargo|go build|maven/gradle|cmake|dotnet|swiftpm|composer|bundler]" + package_manager: "[e.g., npm|pnpm|yarn|pip/poetry|cargo|go modules|maven/gradle|nuget|composer|gem|swiftpm]" + test_framework: "[e.g., jest|pytest|go test|junit|xUnit|rspec|vitest]" + target_platforms: ["[web|server|linux|windows|mac|ios|android|embedded]"] + + scope: + goals: + must: ["[non-negotiable outcomes]"] + should: ["[important but flexible]"] + could: ["[nice-to-have]"] + non_goals: ["[explicitly out-of-scope items]"] + constraints: + - "[performance, budget, platform, compliance, deadlines]" + assumptions: + - "[assumption 1]" + open_questions: + - "[blocking question 1]" + + architecture: + overview: "[1-2 paragraphs describing high-level design and reasoning]" + components: + - name: "[Component]" + responsibilities: ["[what it does]"] + interfaces: ["[APIs/messages/CLI/UI]"] + inputs: ["[data/events/requests]"] + outputs: ["[responses/artifacts/events]"] + dependencies: ["[libraries/services/files]"] + data_model: | + [If applicable: entities/schemas with fields and types] + flows: + - name: "[Key flow]" + steps: ["[step 1]", "[step 2]"] + + interfaces: # Include applicable subsections only + apis: + - name: "[API name]" + method: "[GET|POST|...]" + path: "/resource" + request: { fields: { id: "string", ... } } + response: { fields: { data: "..." 
} } + errors: ["[error cases]"] + cli: + - command: "tool subcmd" + flags: ["--optA", "--optB"] + examples: ["tool subcmd --optA foo"] + ui: + screens: + - name: "[Screen]" + states: ["[empty/loading/error]"] + actions: ["[action]"] + + algorithm_spec: # Include for algorithm/research tasks + problem: "[formal/problem statement]" + inputs: "[types/shapes/ranges]" + outputs: "[types/shapes/metrics]" + pseudocode: | + [If applicable: numbered steps] + complexity: "[time/space complexities]" + evaluation_metrics: ["[metric1]", "[metric2]"] - # CUSTOM FILE TREE STRUCTURE (max 15 files, design as needed) file_structure: | project_root/ - ├── main.py # Entry point - ├── [specific_files] # Core files based on project type - ├── [folder]/ # Organized folders if needed - │ ├── __init__.py - │ └── [module].py - ├── requirements.txt # Dependencies - └── README.md # Basic documentation - - # IMPORTANT: Output ACTUAL file tree structure above, not placeholder text - # Examples by project type: - # Web App: app.py, templates/, static/, models.py, config.py - # Game: main.py, game/, assets/, sprites/, config.yaml - # Academic: algorithm.py, experiments/, data/, utils.py, config.json - # Tool: cli.py, core/, utils.py, tests/, setup.py - - # CORE IMPLEMENTATION PLAN - implementation_steps: - 1. "[First step - usually setup/core structure]" - 2. "[Second step - main functionality]" - 3. "[Third step - integration/interface]" - 4. "[Fourth step - testing/refinement]" - - # DEPENDENCIES & SETUP + ├── [entrypoint_or_bootstrap] # e.g., index.ts, main.go, main.rs, App.swift, Program.cs, server.js, app.py, main.cpp + ├── [modules_or_packages]/ # keep total files <= 15 unless justified + ├── [project_manifest] # choose per stack: package.json|pyproject.toml|requirements.txt|go.mod|Cargo.toml|pom.xml|build.gradle|CMakeLists.txt|composer.json|Gemfile|Package.swift|pubspec.yaml|mix.exs + └── README.md + + implementation_plan: + phases: + - name: "Phase 1 - Foundations" + tasks: + - id: T1 + description: "[task]" + acceptance_criteria: ["[what proves completion]"] + - name: "Phase 2 - Core Features" + tasks: + - id: T2 + description: "[task]" + acceptance_criteria: ["[what proves completion]"] + rationale: "[why this order; critical path; risk-first or value-first]" + dependencies: - required_packages: - - "[package1==version]" - - "[package2>=version]" - optional_packages: - - "[optional1]: [purpose]" + manifests: + - file: "[project_manifest]" + manager: "[package_manager]" + runtime: + - "package==version or name@version" + dev: + - "[linters/test frameworks/build tools]" + services: + - "[db/cache/message broker/3rd-party APIs]" setup_commands: - - "[command to setup environment]" - - "[command to install dependencies]" - - # KEY TECHNICAL DETAILS - tech_stack: - language: "[primary language]" - frameworks: ["[framework1]", "[framework2]"] - key_libraries: ["[lib1]", "[lib2]"] - - main_features: - - "[core feature 1]" - - "[core feature 2]" - - "[core feature 3]" + - "[stack-specific bootstrap, e.g., npm i|pnpm i|yarn; poetry install; go mod tidy; cargo build; mvn package; gradle build; dotnet restore; composer install; swift build]" + configuration: + env_vars: ["[KEY1]", "[KEY2]"] + secrets: ["[how to manage secrets]"] + + quality: + formatting_linting: "[prettier|eslint|ruff|flake8|clang-format|ktlint|gofmt|rustfmt]" + testing: + strategy: "[unit/integration/e2e/property/perf]" + critical_cases: ["[must-pass cases]"] + fixtures: ["[data/mocks]"] + security_privacy: + - "[input validation, authN/Z, secret 
handling, PII, OWASP, supply-chain checks]" + observability: + logging: "[levels/structure/correlation ids]" + metrics: ["[business/technical KPIs]"] + tracing: "[if distributed]" + + deployment: # Include when relevant + target: "[local|docker|kubernetes|serverless|desktop|mobile|edge]" + artifacts: ["[container image|binary|bundle|apk/ipa|wheel|jar]"] + commands: ["[run/build/deploy commands per stack]"] + runtime_profile: "[cpu/mem/storage scale]" + + success_metrics: + - "[quantitative/qualitative criteria for success]" + + next_steps: + - "[follow-ups or stretch goals]" ``` -# 🎯 PLANNING PRINCIPLES -- **Flexibility**: Adapt file structure to project type (no fixed templates) -- **Simplicity**: Keep under 15 files, focus on essentials -- **Practicality**: Include specific packages/versions needed -- **Clarity**: Clear implementation steps that can be directly coded -- **Universality**: Work for any project type (web, game, academic, etc.) - -# 📝 FILE STRUCTURE GUIDELINES -- **MUST OUTPUT**: Actual file tree with specific filenames (not placeholder text) -- Design structure based on project needs, not templates -- Group related functionality logically -- Include main entry point (main.py, app.py, etc.) -- Add config/settings files if needed -- Include requirements.txt or equivalent -- Keep it minimal but complete (max 15 files) -- Use tree format: ├── ─ │ symbols for visual hierarchy""" +# GUIDANCE +- Detect/imply the most suitable language and stack from user input; choose idiomatic project layout for that stack. +- Prefer concrete details over generic advice. No Python bias; pick correct manifests, tools, and commands for the chosen stack. +- For missing details, choose sensible defaults and clearly list assumptions. +- Keep the plan implementable within constraints; avoid over-engineering. +- For polyglot/monorepo needs, propose a lean apps/ and packages/ structure (still <= 15 files unless justified). +""" # ============================================================================= # TRADITIONAL PROMPTS (Non-segmented versions for smaller documents) @@ -1630,7 +1724,7 @@ ⚠️ IMPORTANT: Generate a COMPLETE plan that includes ALL 5 sections without being cut off by token limits. ## Content Balance Guidelines: -- **Section 1 (File Structure)**: Brief overview (10% of content) - Focus on CORE implementation files only +- **Section 1 (File Structure)**: Brief overview (10% of content) - Include all files but focus on implementation priority - **Section 2 (Implementation Components)**: Detailed but concise (40% of content) - This is the PRIORITY section - **Section 3 (Validation)**: Moderate detail (25% of content) - Essential experiments and tests - **Section 4 (Environment)**: Brief but complete (10% of content) - All necessary dependencies @@ -1644,7 +1738,7 @@ 4. **FOURTH**: Configuration and data handling 5. **LAST**: Documentation files (README.md, requirements.txt) - These should be created AFTER core implementation -Note: README and requirements.txt are maintenance files that depend on the final implementation, so plan them last. +Note: README and requirements.txt are maintenance files that depend on the final implementation, so plan them last but INCLUDE them in the file structure. 
# DETAILED SYNTHESIS PROCESS @@ -1698,14 +1792,16 @@ # - Organize files and directories in the most logical way for implementation # - Create meaningful names and groupings based on paper content # - Keep it clean, intuitive, and focused on what actually needs to be implemented - # - EXCLUDE documentation files (README.md, requirements.txt) - these come last + # - INCLUDE documentation files (README.md, requirements.txt) but mark them for LAST implementation file_structure: | [Design and specify your own project structure here - KEEP THIS BRIEF] - [Focus ONLY on core implementation files, NOT documentation files] + [Include ALL necessary files including README.md and requirements.txt] [Organize based on what this paper actually contains and needs] [Create directories and files that make sense for this specific implementation] - [EXCLUDE: README.md, requirements.txt - these come last in implementation] + [IMPORTANT: Include executable files (e.g., main.py, run.py, train.py, demo.py) - choose names based on repo content] + [Design executable entry points that match the paper's main functionality and experiments] + [NOTE: README.md and requirements.txt should be implemented LAST after all code files] # SECTION 2: Implementation Components diff --git a/tools/code_implementation_server.py b/tools/code_implementation_server.py index 172ef720..694dd0ee 100644 --- a/tools/code_implementation_server.py +++ b/tools/code_implementation_server.py @@ -176,6 +176,229 @@ async def read_file( return json.dumps(result, ensure_ascii=False, indent=2) +@mcp.tool() +async def read_multiple_files(file_requests: str, max_files: int = 5) -> str: + """ + Read multiple files in a single operation (for batch reading) + + Args: + file_requests: JSON string with file requests, e.g., + '{"file1.py": {}, "file2.py": {"start_line": 1, "end_line": 10}}' + or simple array: '["file1.py", "file2.py"]' + max_files: Maximum number of files to read in one operation (default: 5) + + Returns: + JSON string of operation results for all files + """ + try: + # Parse the file requests + try: + requests_data = json.loads(file_requests) + except json.JSONDecodeError as e: + return json.dumps( + { + "status": "error", + "message": f"Invalid JSON format for file_requests: {str(e)}", + "operation_type": "multi_file", + "timestamp": datetime.now().isoformat(), + }, + ensure_ascii=False, + indent=2, + ) + + # Normalize requests format + if isinstance(requests_data, list): + # Convert simple array to dict format + normalized_requests = {file_path: {} for file_path in requests_data} + elif isinstance(requests_data, dict): + normalized_requests = requests_data + else: + return json.dumps( + { + "status": "error", + "message": "file_requests must be a JSON object or array", + "operation_type": "multi_file", + "timestamp": datetime.now().isoformat(), + }, + ensure_ascii=False, + indent=2, + ) + + # Validate input + if len(normalized_requests) == 0: + return json.dumps( + { + "status": "error", + "message": "No files provided for reading", + "operation_type": "multi_file", + "timestamp": datetime.now().isoformat(), + }, + ensure_ascii=False, + indent=2, + ) + + if len(normalized_requests) > max_files: + return json.dumps( + { + "status": "error", + "message": f"Too many files provided ({len(normalized_requests)}), maximum is {max_files}", + "operation_type": "multi_file", + "timestamp": datetime.now().isoformat(), + }, + ensure_ascii=False, + indent=2, + ) + + # Process each file + results = { + "status": "success", + "message": f"Successfully 
processed {len(normalized_requests)} files", + "operation_type": "multi_file", + "timestamp": datetime.now().isoformat(), + "files_processed": len(normalized_requests), + "files": {}, + "summary": { + "successful": 0, + "failed": 0, + "total_size_bytes": 0, + "total_lines": 0, + "files_not_found": 0, + }, + } + + # Process each file individually + for file_path, options in normalized_requests.items(): + try: + full_path = validate_path(file_path) + start_line = options.get("start_line") + end_line = options.get("end_line") + + if not full_path.exists(): + results["files"][file_path] = { + "status": "error", + "message": f"File does not exist: {file_path}", + "file_path": file_path, + "content": "", + "total_lines": 0, + "size_bytes": 0, + "start_line": start_line, + "end_line": end_line, + } + results["summary"]["failed"] += 1 + results["summary"]["files_not_found"] += 1 + continue + + with open(full_path, "r", encoding="utf-8") as f: + lines = f.readlines() + + # Handle line range + original_line_count = len(lines) + if start_line is not None or end_line is not None: + start_idx = (start_line - 1) if start_line else 0 + end_idx = end_line if end_line else len(lines) + lines = lines[start_idx:end_idx] + + content = "".join(lines) + size_bytes = len(content.encode("utf-8")) + lines_count = len(lines) + + # Record individual file result + results["files"][file_path] = { + "status": "success", + "message": f"File read successfully: {file_path}", + "file_path": file_path, + "content": content, + "total_lines": lines_count, + "original_total_lines": original_line_count, + "size_bytes": size_bytes, + "start_line": start_line, + "end_line": end_line, + "line_range_applied": start_line is not None + or end_line is not None, + } + + # Update summary + results["summary"]["successful"] += 1 + results["summary"]["total_size_bytes"] += size_bytes + results["summary"]["total_lines"] += lines_count + + # Log individual file operation + log_operation( + "read_file_multi", + { + "file_path": file_path, + "start_line": start_line, + "end_line": end_line, + "lines_read": lines_count, + "size_bytes": size_bytes, + "batch_operation": True, + }, + ) + + except Exception as file_error: + # Record individual file error + results["files"][file_path] = { + "status": "error", + "message": f"Failed to read file: {str(file_error)}", + "file_path": file_path, + "content": "", + "total_lines": 0, + "size_bytes": 0, + "start_line": options.get("start_line"), + "end_line": options.get("end_line"), + } + + results["summary"]["failed"] += 1 + + # Log individual file error + log_operation( + "read_file_multi_error", + { + "file_path": file_path, + "error": str(file_error), + "batch_operation": True, + }, + ) + + # Determine overall status + if results["summary"]["failed"] > 0: + if results["summary"]["successful"] > 0: + results["status"] = "partial_success" + results["message"] = ( + f"Read {results['summary']['successful']} files successfully, {results['summary']['failed']} failed" + ) + else: + results["status"] = "failed" + results["message"] = ( + f"All {results['summary']['failed']} files failed to read" + ) + + # Log overall operation + log_operation( + "read_multiple_files", + { + "files_count": len(normalized_requests), + "successful": results["summary"]["successful"], + "failed": results["summary"]["failed"], + "total_size_bytes": results["summary"]["total_size_bytes"], + "status": results["status"], + }, + ) + + return json.dumps(results, ensure_ascii=False, indent=2) + + except Exception as e: + result = { + 
"status": "error", + "message": f"Failed to read multiple files: {str(e)}", + "operation_type": "multi_file", + "timestamp": datetime.now().isoformat(), + "files_processed": 0, + } + log_operation("read_multiple_files_error", {"error": str(e)}) + return json.dumps(result, ensure_ascii=False, indent=2) + + @mcp.tool() async def write_file( file_path: str, content: str, create_dirs: bool = True, create_backup: bool = False @@ -248,6 +471,215 @@ async def write_file( return json.dumps(result, ensure_ascii=False, indent=2) +@mcp.tool() +async def write_multiple_files( + file_implementations: str, + create_dirs: bool = True, + create_backup: bool = False, + max_files: int = 5, +) -> str: + """ + Write multiple files in a single operation (for batch implementation) + + Args: + file_implementations: JSON string mapping file paths to content, e.g., + '{"file1.py": "content1", "file2.py": "content2"}' + create_dirs: Whether to create directories if they don't exist + create_backup: Whether to create backup files if they already exist + max_files: Maximum number of files to write in one operation (default: 5) + + Returns: + JSON string of operation results for all files + """ + try: + # Parse the file implementations + try: + files_dict = json.loads(file_implementations) + except json.JSONDecodeError as e: + return json.dumps( + { + "status": "error", + "message": f"Invalid JSON format for file_implementations: {str(e)}", + "operation_type": "multi_file", + "timestamp": datetime.now().isoformat(), + }, + ensure_ascii=False, + indent=2, + ) + + # Validate input + if not isinstance(files_dict, dict): + return json.dumps( + { + "status": "error", + "message": "file_implementations must be a JSON object mapping file paths to content", + "operation_type": "multi_file", + "timestamp": datetime.now().isoformat(), + }, + ensure_ascii=False, + indent=2, + ) + + if len(files_dict) == 0: + return json.dumps( + { + "status": "error", + "message": "No files provided for writing", + "operation_type": "multi_file", + "timestamp": datetime.now().isoformat(), + }, + ensure_ascii=False, + indent=2, + ) + + if len(files_dict) > max_files: + return json.dumps( + { + "status": "error", + "message": f"Too many files provided ({len(files_dict)}), maximum is {max_files}", + "operation_type": "multi_file", + "timestamp": datetime.now().isoformat(), + }, + ensure_ascii=False, + indent=2, + ) + + # Process each file + results = { + "status": "success", + "message": f"Successfully processed {len(files_dict)} files", + "operation_type": "multi_file", + "timestamp": datetime.now().isoformat(), + "files_processed": len(files_dict), + "files": {}, + "summary": { + "successful": 0, + "failed": 0, + "total_size_bytes": 0, + "total_lines": 0, + "backups_created": 0, + }, + } + + # Process each file individually + for file_path, content in files_dict.items(): + try: + full_path = validate_path(file_path) + + # Create directories (if needed) + if create_dirs: + full_path.parent.mkdir(parents=True, exist_ok=True) + + # Backup existing file (only when explicitly requested) + backup_created = False + if full_path.exists() and create_backup: + backup_path = full_path.with_suffix(full_path.suffix + ".backup") + shutil.copy2(full_path, backup_path) + backup_created = True + results["summary"]["backups_created"] += 1 + + # Write file + with open(full_path, "w", encoding="utf-8") as f: + f.write(content) + + # Calculate file metrics + size_bytes = len(content.encode("utf-8")) + lines_count = len(content.split("\n")) + + # Update current file 
record + CURRENT_FILES[file_path] = { + "last_modified": datetime.now().isoformat(), + "size_bytes": size_bytes, + "lines": lines_count, + } + + # Record individual file result + results["files"][file_path] = { + "status": "success", + "message": f"File written successfully: {file_path}", + "size_bytes": size_bytes, + "lines_written": lines_count, + "backup_created": backup_created, + } + + # Update summary + results["summary"]["successful"] += 1 + results["summary"]["total_size_bytes"] += size_bytes + results["summary"]["total_lines"] += lines_count + + # Log individual file operation + log_operation( + "write_file_multi", + { + "file_path": file_path, + "size_bytes": size_bytes, + "lines": lines_count, + "backup_created": backup_created, + "batch_operation": True, + }, + ) + + except Exception as file_error: + # Record individual file error + results["files"][file_path] = { + "status": "error", + "message": f"Failed to write file: {str(file_error)}", + "size_bytes": 0, + "lines_written": 0, + "backup_created": False, + } + + results["summary"]["failed"] += 1 + + # Log individual file error + log_operation( + "write_file_multi_error", + { + "file_path": file_path, + "error": str(file_error), + "batch_operation": True, + }, + ) + + # Determine overall status + if results["summary"]["failed"] > 0: + if results["summary"]["successful"] > 0: + results["status"] = "partial_success" + results["message"] = ( + f"Processed {results['summary']['successful']} files successfully, {results['summary']['failed']} failed" + ) + else: + results["status"] = "failed" + results["message"] = ( + f"All {results['summary']['failed']} files failed to write" + ) + + # Log overall operation + log_operation( + "write_multiple_files", + { + "files_count": len(files_dict), + "successful": results["summary"]["successful"], + "failed": results["summary"]["failed"], + "total_size_bytes": results["summary"]["total_size_bytes"], + "status": results["status"], + }, + ) + + return json.dumps(results, ensure_ascii=False, indent=2) + + except Exception as e: + result = { + "status": "error", + "message": f"Failed to write multiple files: {str(e)}", + "operation_type": "multi_file", + "timestamp": datetime.now().isoformat(), + "files_processed": 0, + } + log_operation("write_multiple_files_error", {"error": str(e)}) + return json.dumps(result, ensure_ascii=False, indent=2) + + # ==================== Code Execution Tools ==================== diff --git a/tools/pdf_downloader.py b/tools/pdf_downloader.py index e1e4ed8b..ce55853c 100644 --- a/tools/pdf_downloader.py +++ b/tools/pdf_downloader.py @@ -103,7 +103,17 @@ async def perform_document_conversion( conversion_msg = "" # 首先尝试使用简单的PDF转换器(对于PDF文件) - if file_path.lower().endswith(".pdf") and PYPDF2_AVAILABLE: + # 检查文件是否实际为PDF(无论扩展名如何) + is_pdf_file = False + if PYPDF2_AVAILABLE: + try: + with open(file_path, "rb") as f: + header = f.read(8) + is_pdf_file = header.startswith(b"%PDF") + except Exception: + is_pdf_file = file_path.lower().endswith(".pdf") + + if is_pdf_file and PYPDF2_AVAILABLE: try: simple_converter = SimplePdfConverter() conversion_result = simple_converter.convert_pdf_to_markdown(file_path) diff --git a/ui/components.py b/ui/components.py index 2fa57050..933f419b 100644 --- a/ui/components.py +++ b/ui/components.py @@ -1,3 +1,4 @@ +# -*- coding: utf-8 -*- """ Streamlit UI Components Module @@ -6,388 +7,99 @@ import streamlit as st import sys -from typing import Dict, Any, Optional +from typing import Dict, Any, Optional, List from datetime import datetime 
import json def display_header(): - """Display application header""" + """Display modern, compact application header""" st.markdown( """ -
-        [removed: legacy display_header banner HTML ("🧬 DeepCode", "OPEN-SOURCE CODE AGENT",
-        "⚡ DATA INTELLIGENCE LAB @ HKU • REVOLUTIONIZING RESEARCH REPRODUCIBILITY ⚡") and the
-        legacy display_features() vertical feature cards: "🚀 Paper2Code: Research-to-Production
-        Pipeline", "🎨 Text2Web: Automated Prototyping Engine", "⚙️ Text2Backend: Scalable
-        Architecture Generator", and "🎯 CodeRAG Integration System", each with badges, workflow
-        steps, and an animated code preview]
+        [added: compact header HTML ("◊ DeepCode", "AI Research Engine", "Data Intelligence Lab
+        @ HKU", "ONLINE" status badge) and a rewritten display_features() that renders a
+        capability matrix ("RESEARCH" / "Paper2Code&Text2Code" / "Neural document processing and
+        algorithmic synthesis" / "Multi-Agents") plus a processing pipeline: REQUIREMENTS (Input
+        Requirements) -> PLANNING (Design & Planning) -> IMPLEMENTATION (Code Implementation) ->
+        VALIDATION (Validation & Refinement)]
""", unsafe_allow_html=True, ) @@ -791,9 +503,283 @@ def url_input_component(task_counter: int) -> Optional[str]: return None +def requirement_analysis_mode_selector(task_counter: int) -> str: + """ + Requirement analysis mode selector + + Args: + task_counter: Task counter + + Returns: + Selected mode ("direct" or "guided") + """ + st.markdown( + """ +
+        [section header HTML: "🎯 Choose Your Input Mode" / "Select how you'd like to provide your requirements"]
+ """, + unsafe_allow_html=True, + ) + + mode = st.radio( + "Input mode:", + ["🚀 Direct Input", "🧠 Guided Analysis"], + index=0 + if st.session_state.get("requirement_analysis_mode", "direct") == "direct" + else 1, + horizontal=True, + help="Direct: Enter requirements directly. Guided: AI asks questions to help you clarify needs.", + key=f"req_mode_{task_counter}", + ) + + return "direct" if mode.startswith("🚀") else "guided" + + +def requirement_questions_component( + questions: List[Dict], task_counter: int +) -> Dict[str, str]: + """ + Requirement questions display and answer collection component + + Args: + questions: Question list + task_counter: Task counter + + Returns: + User answer dictionary + """ + st.markdown( + """ +
+        [section header HTML: "📝 Help Us Understand Your Needs Better" / "Please answer the following questions to help us generate better code. You can skip any question."]
+ """, + unsafe_allow_html=True, + ) + + answers = {} + + for i, question in enumerate(questions): + with st.expander( + f"📋 {question.get('category', 'Question')} - {question.get('importance', 'Medium')} Priority", + expanded=i < 3, + ): + st.markdown(f"**{question['question']}**") + + if question.get("hint"): + st.info(f"💡 {question['hint']}") + + answer = st.text_area( + "Your answer:", + placeholder="Enter your answer here, or leave blank to skip...", + height=80, + key=f"answer_{i}_{task_counter}", + ) + + if answer and answer.strip(): + answers[str(i)] = answer.strip() + + st.markdown("---") + st.info(f"📊 You've answered {len(answers)} out of {len(questions)} questions.") + + return answers + + +def requirement_summary_component(summary: str, task_counter: int) -> bool: + """ + Requirement summary display and confirmation component + + Args: + summary: Requirement summary document + task_counter: Task counter + + Returns: + Whether user confirms requirements + """ + st.markdown( + """ +
+        [section header HTML: "📋 Detailed Requirements Summary" / "Based on your input, here's the detailed requirements document we've generated."]
+ """, + unsafe_allow_html=True, + ) + + # Display requirement summary + with st.expander("📖 View Detailed Requirements", expanded=True): + st.markdown(summary) + + # Confirmation options + st.markdown("### 🎯 Next Steps") + + col1, col2, col3 = st.columns(3) + + with col1: + if st.button( + "✅ Looks Good, Proceed", + type="primary", + use_container_width=True, + key=f"confirm_{task_counter}", + ): + # Mark requirements as confirmed, prepare to enter code generation + st.session_state.requirements_confirmed = True + return True + + with col2: + if st.button( + "✏️ Edit Requirements", + type="secondary", + use_container_width=True, + key=f"edit_{task_counter}", + ): + # Enter editing mode + st.session_state.requirement_analysis_step = "editing" + st.session_state.edit_feedback = "" + st.rerun() + + with col3: + if st.button( + "🔄 Start Over", use_container_width=True, key=f"restart_{task_counter}" + ): + # Complete reset + st.session_state.requirement_analysis_mode = "direct" + st.session_state.requirement_analysis_step = "input" + st.session_state.generated_questions = [] + st.session_state.user_answers = {} + st.session_state.detailed_requirements = "" + st.rerun() + + return False + + +def requirement_editing_component(current_requirements: str, task_counter: int) -> bool: + """ + Interactive requirement editing component + + Args: + current_requirements: Current requirement document content + task_counter: Task counter + + Returns: + Whether editing is completed + """ + st.markdown( + """ +
+        [section header HTML: "✏️ Edit Requirements Document" / "Review the current requirements and tell us how you'd like to modify them."]
+ """, + unsafe_allow_html=True, + ) + + # Display current requirements + st.markdown("### 📋 Current Requirements") + with st.expander("📖 View Current Requirements Document", expanded=True): + st.markdown(current_requirements) + + # Ask for modification feedback + st.markdown("### 💭 How would you like to modify the requirements?") + st.markdown("Please describe your changes, additions, or corrections:") + + edit_feedback = st.text_area( + "Your modification request:", + value=st.session_state.edit_feedback, + placeholder="For example:\n- Add user authentication feature\n- Change database from MySQL to PostgreSQL", + height=120, + key=f"edit_feedback_{task_counter}", + ) + + # Update session state + st.session_state.edit_feedback = edit_feedback + + # Action buttons + col1, col2, col3 = st.columns(3) + + with col1: + if st.button( + "🔄 Apply Changes", + type="primary", + use_container_width=True, + key=f"apply_edit_{task_counter}", + ): + if edit_feedback.strip(): + # Start requirement modification process + st.session_state.requirements_editing = True + st.info("🔄 Processing your modification request...") + return True + else: + st.warning("Please provide your modification request first.") + + with col2: + if st.button( + "↩️ Back to Summary", + type="secondary", + use_container_width=True, + key=f"back_summary_{task_counter}", + ): + # Go back to summary view + st.session_state.requirement_analysis_step = "summary" + st.session_state.edit_feedback = "" + st.rerun() + + with col3: + if st.button( + "🔄 Start Over", + use_container_width=True, + key=f"restart_edit_{task_counter}", + ): + # Complete reset + st.session_state.requirement_analysis_mode = "direct" + st.session_state.requirement_analysis_step = "input" + st.session_state.generated_questions = [] + st.session_state.user_answers = {} + st.session_state.detailed_requirements = "" + st.session_state.edit_feedback = "" + st.rerun() + + return False + + def chat_input_component(task_counter: int) -> Optional[str]: """ - Chat input component for coding requirements + Enhanced chat input component with requirement analysis support Args: task_counter: Task counter @@ -801,6 +787,20 @@ def chat_input_component(task_counter: int) -> Optional[str]: Returns: User coding requirements or None """ + # Select input mode + selected_mode = requirement_analysis_mode_selector(task_counter) + + # Update requirement analysis mode + st.session_state.requirement_analysis_mode = selected_mode + + if selected_mode == "direct": + return _direct_input_component(task_counter) + else: + return _guided_analysis_component(task_counter) + + +def _direct_input_component(task_counter: int) -> Optional[str]: + """Direct input mode component""" st.markdown( """

-        [section header HTML: "💬 Describe Your Coding Requirements" / "Tell us what you want to build. Our AI will analyze your requirements and generate a comprehensive implementation plan."]
+        [section header HTML: "🚀 Direct Input Mode" / "Describe your coding requirements directly. Our AI will analyze and generate a comprehensive implementation plan."]

""", @@ -852,7 +852,7 @@ def chat_input_component(task_counter: int) -> Optional[str]: The system should be scalable and production-ready, with proper error handling and documentation.""", height=200, help="Describe what you want to build, including functionality, technologies, and any specific requirements", - key=f"chat_input_{task_counter}", + key=f"direct_input_{task_counter}", ) if user_input and len(user_input.strip()) > 20: # Minimum length check @@ -871,7 +871,7 @@ def chat_input_component(task_counter: int) -> Optional[str]: user_input, height=100, disabled=True, - key=f"preview_{task_counter}", + key=f"direct_preview_{task_counter}", ) return user_input.strip() @@ -885,6 +885,183 @@ def chat_input_component(task_counter: int) -> Optional[str]: return None +def _guided_analysis_component(task_counter: int) -> Optional[str]: + """Guided analysis mode component""" + + # Check if requirements are confirmed, if confirmed return detailed requirements directly + if st.session_state.get("requirements_confirmed", False): + detailed_requirements = st.session_state.get("detailed_requirements", "") + if detailed_requirements: + # Show confirmation message and return requirements for processing + st.success("🎉 Requirement analysis completed! Starting code generation...") + st.info( + "🔄 Automatically proceeding to code generation based on your confirmed requirements." + ) + return detailed_requirements + + st.markdown( + """ +
+        [section header HTML: "🧠 Guided Analysis Mode" / "Let our AI guide you through a series of questions to better understand your requirements."]
+ """, + unsafe_allow_html=True, + ) + + # Check current step + current_step = st.session_state.get("requirement_analysis_step", "input") + + if current_step == "input": + return _guided_input_step(task_counter) + elif current_step == "questions": + return _guided_questions_step(task_counter) + elif current_step == "summary": + return _guided_summary_step(task_counter) + elif current_step == "editing": + return _guided_editing_step(task_counter) + else: + # Reset to initial state + st.session_state.requirement_analysis_step = "input" + st.rerun() + + +def _guided_input_step(task_counter: int) -> Optional[str]: + """Initial input step for guided mode""" + st.markdown("### 📝 Step 1: Tell us your basic idea") + + user_input = st.text_area( + "What would you like to build? (Brief description is fine)", + placeholder="Example: A web app for sentiment analysis of social media posts", + height=120, + help="Don't worry about details - we'll ask specific questions next!", + key=f"guided_input_{task_counter}", + ) + + if user_input and len(user_input.strip()) > 10: + col1, col2 = st.columns([3, 1]) + + with col1: + st.info(f"📝 Initial idea captured: {len(user_input.split())} words") + + with col2: + if st.button( + "🚀 Generate Questions", type="primary", use_container_width=True + ): + # Save initial input and enter question generation step + st.session_state.initial_requirement = user_input.strip() + st.session_state.requirement_analysis_step = "questions" + st.rerun() + + elif user_input and len(user_input.strip()) <= 10: + st.warning( + "⚠️ Please provide at least a brief description (more than 10 characters)" + ) + + return None + + +def _guided_questions_step(task_counter: int) -> Optional[str]: + """Question answering step for guided mode""" + st.markdown("### 🤔 Step 2: Answer questions to refine your requirements") + + # Display initial requirements + with st.expander("📋 Your Initial Idea", expanded=False): + st.write(st.session_state.get("initial_requirement", "")) + + # Check if questions have been generated + if not st.session_state.get("generated_questions"): + st.info("🔄 Generating personalized questions for your project...") + + # Async call needed here, but we show placeholder in UI first + if st.button("🎯 Generate Questions Now", type="primary"): + st.session_state.questions_generating = True + st.rerun() + return None + + # Display questions and collect answers + questions = st.session_state.generated_questions + answers = requirement_questions_component(questions, task_counter) + st.session_state.user_answers = answers + + # Continue button + col1, col2, col3 = st.columns([1, 2, 1]) + + with col2: + if st.button( + "📋 Generate Detailed Requirements", + type="primary", + use_container_width=True, + ): + st.session_state.requirement_analysis_step = "summary" + st.rerun() + + with col1: + if st.button("⬅️ Back", use_container_width=True): + st.session_state.requirement_analysis_step = "input" + st.rerun() + + return None + + +def _guided_summary_step(task_counter: int) -> Optional[str]: + """Requirement summary step for guided mode""" + st.markdown("### 📋 Step 3: Review and confirm your detailed requirements") + + # Check if detailed requirements have been generated + if not st.session_state.get("detailed_requirements"): + st.info("🔄 Generating detailed requirements based on your answers...") + + if st.button("📋 Generate Requirements Now", type="primary"): + st.session_state.requirements_generating = True + st.rerun() + return None + + # Display requirement summary and get confirmation 
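+    # Once confirmed, this summary string is returned as the component's input
+    # source, so downstream processing treats it exactly like a direct-input
+    # requirement.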
+ summary = st.session_state.detailed_requirements + confirmed = requirement_summary_component(summary, task_counter) + + if confirmed: + # Return detailed requirements as final input + return summary + + return None + + +def _guided_editing_step(task_counter: int) -> Optional[str]: + """Requirement editing step for guided mode""" + st.markdown("### ✏️ Step 4: Edit your requirements") + + # Get current requirements + current_requirements = st.session_state.get("detailed_requirements", "") + if not current_requirements: + st.error("No requirements found to edit. Please start over.") + st.session_state.requirement_analysis_step = "input" + st.rerun() + return None + + # Show editing component + editing_requested = requirement_editing_component( + current_requirements, task_counter + ) + + if editing_requested: + # User has provided editing feedback, trigger requirement modification + st.session_state.requirements_editing = True + st.rerun() + return None + + return None + + def input_method_selector(task_counter: int) -> tuple[Optional[str], Optional[str]]: """ Input method selector diff --git a/ui/handlers.py b/ui/handlers.py index f3a03bc3..b109003a 100644 --- a/ui/handlers.py +++ b/ui/handlers.py @@ -587,6 +587,203 @@ def cleanup_temp_file(input_source: str, input_type: str): pass +async def handle_requirement_analysis_workflow( + user_input: str, analysis_mode: str, user_answers: Dict[str, str] = None +) -> Dict[str, Any]: + """ + Handle requirement analysis workflow + + Args: + user_input: User initial requirements + analysis_mode: Analysis mode ("generate_questions" or "summarize_requirements") + user_answers: User answer dictionary + + Returns: + Processing result dictionary + """ + try: + # Import required modules + from workflows.agent_orchestration_engine import ( + execute_requirement_analysis_workflow, + ) + + # Create progress callback function + def update_progress(progress: int, message: str): + # Display progress in Streamlit + st.session_state.current_progress = progress + st.session_state.current_message = message + + # Execute requirement analysis workflow + result = await execute_requirement_analysis_workflow( + user_input=user_input, + analysis_mode=analysis_mode, + user_answers=user_answers, + logger=None, # Can pass in logger + progress_callback=update_progress, + ) + + return result + + except Exception as e: + return { + "status": "error", + "error": str(e), + "message": f"Requirement analysis workflow execution failed: {str(e)}", + } + + +async def handle_requirement_modification_workflow( + current_requirements: str, modification_feedback: str +) -> Dict[str, Any]: + """ + Handle requirement modification workflow + + Args: + current_requirements: Current requirement document content + modification_feedback: User's modification requests and feedback + + Returns: + Processing result dictionary + """ + try: + # Import required modules + from workflows.agents.requirement_analysis_agent import RequirementAnalysisAgent + + # Create progress callback function + def update_progress(progress: int, message: str): + # Display progress in Streamlit + st.session_state.current_progress = progress + st.session_state.current_message = message + + update_progress(10, "🔧 Initializing requirement modification agent...") + + # Initialize RequirementAnalysisAgent + agent = RequirementAnalysisAgent() + + # Initialize agent (LLM is initialized internally) + await agent.initialize() + + update_progress(50, "✏️ Modifying requirements based on your feedback...") + + # Modify requirements + 
result = await agent.modify_requirements( + current_requirements=current_requirements, + modification_feedback=modification_feedback, + ) + + # Cleanup + await agent.cleanup() + + update_progress(100, "✅ Requirements modification completed!") + + return { + "status": "success", + "result": result, + "message": "Requirements modification completed successfully", + } + + except Exception as e: + return { + "status": "error", + "error": str(e), + "message": f"Requirements modification workflow execution failed: {str(e)}", + } + + +def handle_guided_mode_processing(): + """Handle asynchronous processing for guided mode""" + # Check if questions need to be generated + if st.session_state.get("questions_generating", False): + st.session_state.questions_generating = False + + # Asynchronously generate questions + initial_req = st.session_state.get("initial_requirement", "") + if initial_req: + try: + # Use asynchronous processing to generate questions + result = run_async_task_simple( + handle_requirement_analysis_workflow( + user_input=initial_req, analysis_mode="generate_questions" + ) + ) + + if result["status"] == "success": + # Parse JSON result + import json + + questions = json.loads(result["result"]) + st.session_state.generated_questions = questions + else: + st.error( + f"Question generation failed: {result.get('error', 'Unknown error')}" + ) + + except Exception as e: + st.error(f"Question generation exception: {str(e)}") + + # Check if detailed requirements need to be generated + if st.session_state.get("requirements_generating", False): + st.session_state.requirements_generating = False + + # Asynchronously generate detailed requirements + initial_req = st.session_state.get("initial_requirement", "") + user_answers = st.session_state.get("user_answers", {}) + + if initial_req: + try: + # Use asynchronous processing to generate requirement summary + result = run_async_task_simple( + handle_requirement_analysis_workflow( + user_input=initial_req, + analysis_mode="summarize_requirements", + user_answers=user_answers, + ) + ) + + if result["status"] == "success": + st.session_state.detailed_requirements = result["result"] + else: + st.error( + f"Requirement summary generation failed: {result.get('error', 'Unknown error')}" + ) + + except Exception as e: + st.error(f"Requirement summary generation exception: {str(e)}") + + # Check if requirements need to be edited + if st.session_state.get("requirements_editing", False): + st.session_state.requirements_editing = False + st.info("🔧 Starting requirement modification process...") + + # Asynchronously modify requirements based on user feedback + current_requirements = st.session_state.get("detailed_requirements", "") + edit_feedback = st.session_state.get("edit_feedback", "") + + if current_requirements and edit_feedback: + try: + # Use asynchronous processing to modify requirements + result = run_async_task_simple( + handle_requirement_modification_workflow( + current_requirements=current_requirements, + modification_feedback=edit_feedback, + ) + ) + + if result["status"] == "success": + st.session_state.detailed_requirements = result["result"] + st.session_state.requirement_analysis_step = "summary" + st.session_state.edit_feedback = "" + st.success("✅ Requirements updated successfully!") + st.rerun() + else: + st.error( + f"Requirements modification failed: {result.get('error', 'Unknown error')}" + ) + + except Exception as e: + st.error(f"Requirements modification exception: {str(e)}") + + def handle_start_processing_button(input_source: 
str, input_type: str):
    """
    Handle start processing button click
@@ -666,6 +863,30 @@ def initialize_session_state():
            False  # Default enable indexing functionality
        )

+    # Requirement analysis related states
+    if "requirement_analysis_mode" not in st.session_state:
+        st.session_state.requirement_analysis_mode = "direct"  # direct/guided
+    if "requirement_analysis_step" not in st.session_state:
+        st.session_state.requirement_analysis_step = "input"  # input/questions/summary
+    if "generated_questions" not in st.session_state:
+        st.session_state.generated_questions = []
+    if "user_answers" not in st.session_state:
+        st.session_state.user_answers = {}
+    if "detailed_requirements" not in st.session_state:
+        st.session_state.detailed_requirements = ""
+    if "initial_requirement" not in st.session_state:
+        st.session_state.initial_requirement = ""
+    if "questions_generating" not in st.session_state:
+        st.session_state.questions_generating = False
+    if "requirements_generating" not in st.session_state:
+        st.session_state.requirements_generating = False
+    if "requirements_confirmed" not in st.session_state:
+        st.session_state.requirements_confirmed = False
+    if "edit_feedback" not in st.session_state:
+        st.session_state.edit_feedback = ""
+    if "requirements_editing" not in st.session_state:
+        st.session_state.requirements_editing = False
+

 def cleanup_resources():
     """
diff --git a/ui/layout.py b/ui/layout.py
index 54185f55..dbb79638 100644
--- a/ui/layout.py
+++ b/ui/layout.py
@@ -18,17 +18,23 @@
     initialize_session_state,
     handle_start_processing_button,
     handle_error_display,
+    handle_guided_mode_processing,
 )
 from .styles import get_main_styles


 def setup_page_config():
-    """Setup page configuration"""
+    """Setup optimized page configuration"""
     st.set_page_config(
         page_title="DeepCode - AI Research Engine",
         page_icon="🧬",
         layout="wide",
         initial_sidebar_state="expanded",
+        menu_items={
+            "Get Help": "https://github.com/HKUDS/DeepCode",
+            "Report a bug": "https://github.com/HKUDS/DeepCode/issues",
+            "About": "# DeepCode AI Research Engine\nNext-Generation Multi-Agent Coding Platform",
+        },
     )


@@ -38,11 +44,24 @@ def apply_custom_styles():


 def render_main_content():
-    """Render main content area"""
-    # Display header and features
+    """Render main content area with improved layout"""
+    # Display modern, compact header and features
     display_header()
     display_features()
-    st.markdown("---")
+
+    # Add subtle spacing instead of heavy divider
+    st.markdown(
+        """
+        [thin divider HTML]
+        """,
+        unsafe_allow_html=True,
+    )

     # Display results if available
     if st.session_state.show_results and st.session_state.last_result:
@@ -62,11 +81,45 @@ def render_main_content():

 def render_input_interface():
     """Render input interface"""
-    # Get input source and type
-    input_source, input_type = input_method_selector(st.session_state.task_counter)
-
-    # Processing button
-    if input_source and not st.session_state.processing:
+    # Handle async operations for guided mode
+    handle_guided_mode_processing()
+
+    # Check if user is in guided analysis workflow
+    if st.session_state.get(
+        "requirement_analysis_mode"
+    ) == "guided" and st.session_state.get("requirement_analysis_step") in [
+        "questions",
+        "summary",
+        "editing",
+    ]:
+        # User is in guided analysis workflow, show chat input directly
+        from .components import chat_input_component
+
+        input_source = chat_input_component(st.session_state.task_counter)
+        input_type = "chat" if input_source else None
+    else:
+        # Normal flow: show input method selector
+        input_source, input_type = input_method_selector(st.session_state.task_counter)
+
+    # Processing button - Check if requirements are confirmed for guided mode
+    requirements_confirmed = st.session_state.get("requirements_confirmed", False)
+
+    # For guided mode, if requirements are confirmed, automatically start processing
+    if (
+        st.session_state.get("requirement_analysis_mode") == "guided"
+        and requirements_confirmed
+        and input_source
+        and not st.session_state.processing
+    ):
+        # Automatically start processing for confirmed requirements
+        st.session_state.requirements_confirmed = (
+            False  # Clear flag to prevent re-processing
+        )
+        handle_start_processing_button(input_source, input_type)
+    elif (
+        input_source and not st.session_state.processing and not requirements_confirmed
+    ):
+        # Only show Start Processing button if requirements are not already confirmed
         if st.button("🚀 Start Processing", type="primary", use_container_width=True):
             handle_start_processing_button(input_source, input_type)
@@ -75,7 +128,9 @@
         st.warning("⚠️ Do not refresh the page or close the browser during processing.")
     elif not input_source:
-        st.info("👆 Please upload a file or enter a URL to start processing.")
+        st.info(
+            "👆 Please upload a file, enter a URL, or describe your coding requirements to start processing."
+ ) def render_sidebar(): diff --git a/ui/styles.py b/ui/styles.py index a9cad83f..8f9a8d85 100644 --- a/ui/styles.py +++ b/ui/styles.py @@ -52,11 +52,36 @@ def get_main_styles() -> str: --light-accent-purple: #ba68c8; } - /* Global app background and text */ + /* Global app background and text - Enhanced */ .stApp { background: linear-gradient(135deg, var(--primary-bg) 0%, var(--secondary-bg) 100%); color: var(--text-primary); font-family: 'Inter', sans-serif; + min-height: 100vh; + } + + /* Enhanced main container */ + .main .block-container { + padding-top: 2rem !important; + padding-bottom: 2rem !important; + max-width: 1200px !important; + } + + /* Improved responsiveness for all screen sizes */ + @media (max-width: 1200px) { + .main .block-container { + max-width: 95% !important; + padding-left: 2rem !important; + padding-right: 2rem !important; + } + } + + @media (max-width: 768px) { + .main .block-container { + padding-top: 1rem !important; + padding-left: 1rem !important; + padding-right: 1rem !important; + } } /* Force high contrast for all text elements */ @@ -1067,6 +1092,170 @@ def get_main_styles() -> str: font-weight: 600 !important; } + /* Modern Header Styles - Compact & Professional */ + .modern-header { + background: linear-gradient(135deg, rgba(26, 31, 58, 0.8) 0%, rgba(45, 55, 72, 0.6) 100%); + border-radius: 16px; + margin: 1rem 0; + padding: 1.5rem 2rem; + backdrop-filter: blur(20px); + border: 1px solid rgba(100, 181, 246, 0.1); + position: relative; + overflow: hidden; + } + + .modern-header::before { + content: ''; + position: absolute; + top: 0; + left: 0; + right: 0; + height: 2px; + background: linear-gradient(90deg, + var(--neon-cyan) 0%, + var(--neon-blue) 50%, + var(--neon-green) 100%); + z-index: 1; + } + + .header-content { + display: flex; + justify-content: space-between; + align-items: center; + position: relative; + z-index: 2; + } + + .logo-section { + display: flex; + flex-direction: column; + gap: 0.5rem; + } + + .logo-animation { + display: flex; + align-items: center; + gap: 0.8rem; + } + + .dna-helix { + position: relative; + width: 30px; + height: 30px; + } + + .helix-strand { + position: absolute; + width: 100%; + height: 2px; + background: var(--neon-cyan); + border-radius: 2px; + animation: helix-rotate 3s infinite linear; + } + + .helix-strand.strand-1 { + top: 40%; + animation-delay: 0s; + } + + .helix-strand.strand-2 { + top: 60%; + animation-delay: 1.5s; + background: var(--neon-blue); + } + + @keyframes helix-rotate { + 0% { transform: rotateY(0deg) scaleX(1); } + 25% { transform: rotateY(90deg) scaleX(0.3); } + 50% { transform: rotateY(180deg) scaleX(1); } + 75% { transform: rotateY(270deg) scaleX(0.3); } + 100% { transform: rotateY(360deg) scaleX(1); } + } + + .logo-text { + font-size: 1.8rem; + font-weight: 700; + background: linear-gradient(135deg, var(--neon-cyan), var(--neon-blue)); + -webkit-background-clip: text; + -webkit-text-fill-color: transparent; + background-clip: text; + filter: drop-shadow(0 0 20px rgba(77, 208, 225, 0.3)); + } + + .tagline { + display: flex; + align-items: center; + gap: 0.8rem; + font-size: 0.9rem; + } + + .tagline .highlight { + color: var(--neon-cyan); + font-weight: 600; + } + + .tagline .separator { + color: var(--text-muted); + opacity: 0.6; + } + + .tagline .org { + color: var(--text-secondary); + font-weight: 400; + } + + .status-badge { + display: flex; + align-items: center; + gap: 0.5rem; + background: rgba(129, 199, 132, 0.1); + border: 1px solid rgba(129, 199, 132, 0.3); + border-radius: 
20px; + padding: 0.5rem 1rem; + } + + .status-dot { + width: 8px; + height: 8px; + background: var(--neon-green); + border-radius: 50%; + animation: status-pulse 2s infinite ease-in-out; + } + + @keyframes status-pulse { + 0%, 100% { opacity: 1; transform: scale(1); } + 50% { opacity: 0.6; transform: scale(0.8); } + } + + .status-text { + font-size: 0.8rem; + color: var(--neon-green); + font-weight: 600; + letter-spacing: 0.5px; + } + + /* Responsive modern header */ + @media (max-width: 768px) { + .modern-header .header-content { + flex-direction: column; + gap: 1rem; + text-align: center; + } + + .modern-header .tagline { + flex-wrap: wrap; + justify-content: center; + } + + .modern-header .logo-text { + font-size: 1.5rem; + } + + .modern-header { + padding: 1rem 1.5rem; + } + } + /* Streamlit component style overrides */ .stMarkdown h3 { color: var(--neon-cyan) !important; @@ -1403,108 +1592,899 @@ def get_main_styles() -> str: justify-content: space-between; } - /* NEW VERTICAL LAYOUT FEATURE CARDS */ - .feature-card-vertical { - position: relative; - background: linear-gradient(135deg, var(--card-bg) 0%, rgba(45, 55, 72, 0.8) 100%); - backdrop-filter: blur(25px); - border: 1px solid var(--border-color); - padding: 0; + /* AI NEXUS FUTURISTIC LAYOUT - WORLD-CLASS DESIGN */ + .ai-nexus-container { + background: linear-gradient(135deg, + rgba(0, 0, 0, 0.95) 0%, + rgba(15, 20, 42, 0.95) 25%, + rgba(0, 12, 36, 0.95) 50%, + rgba(10, 5, 30, 0.95) 75%, + rgba(0, 0, 0, 0.95) 100%); border-radius: 24px; - margin: 2.5rem 0; - transition: all 0.5s cubic-bezier(0.175, 0.885, 0.32, 1.275); - box-shadow: 0 12px 60px rgba(0, 0, 0, 0.4); + padding: 3rem 2rem; + margin: 2rem 0; + backdrop-filter: blur(20px); + border: 1px solid rgba(0, 255, 255, 0.2); + position: relative; overflow: hidden; - min-height: 500px; + box-shadow: + 0 0 50px rgba(0, 255, 255, 0.1), + inset 0 0 50px rgba(0, 255, 255, 0.05); } - .feature-card-vertical:hover { - transform: translateY(-8px) scale(1.01); - box-shadow: 0 20px 80px rgba(0, 0, 0, 0.5); + .ai-nexus-container::before { + content: ''; + position: absolute; + top: 0; + left: 0; + right: 0; + bottom: 0; + background: + radial-gradient(circle at 20% 30%, rgba(0, 255, 255, 0.1) 0%, transparent 50%), + radial-gradient(circle at 80% 70%, rgba(255, 0, 255, 0.08) 0%, transparent 50%), + radial-gradient(circle at 40% 80%, rgba(0, 255, 0, 0.06) 0%, transparent 50%); + z-index: -1; + animation: neural-pulse 4s ease-in-out infinite alternate; } - /* Card glow effect for vertical cards */ - .card-glow-vertical { + @keyframes neural-pulse { + 0% { opacity: 0.6; transform: scale(1); } + 100% { opacity: 0.9; transform: scale(1.02); } + } + + .quantum-header { + text-align: center; + margin-bottom: 3rem; + position: relative; + } + + .neural-matrix { + position: relative; + display: inline-block; + margin-bottom: 2rem; + width: 120px; + height: 80px; + } + + .matrix-node { position: absolute; - top: -50%; - left: -50%; - width: 200%; - height: 200%; - background: radial-gradient(circle, transparent 30%, rgba(77, 208, 225, 0.03) 60%, transparent 80%); - opacity: 0; - transition: opacity 0.5s ease; - pointer-events: none; - animation: verticalGlowPulse 8s ease-in-out infinite; + width: 8px; + height: 8px; + background: rgba(0, 255, 255, 0.8); + border-radius: 50%; + box-shadow: 0 0 20px rgba(0, 255, 255, 0.6); + animation: node-pulse 2s ease-in-out infinite; } - .feature-card-vertical:hover .card-glow-vertical { - opacity: 1; + .matrix-node.node-1 { + top: 20px; + left: 20px; + 
animation-delay: 0s; } - @keyframes verticalGlowPulse { + .matrix-node.node-2 { + top: 20px; + right: 20px; + animation-delay: 0.7s; + } + + .matrix-node.node-3 { + bottom: 20px; + left: 50%; + transform: translateX(-50%); + animation-delay: 1.4s; + } + + @keyframes node-pulse { 0%, 100% { - transform: rotate(0deg) scale(1); - opacity: 0.3; + opacity: 0.6; + transform: scale(1); + box-shadow: 0 0 20px rgba(0, 255, 255, 0.6); } 50% { - transform: rotate(180deg) scale(1.1); - opacity: 0.7; + opacity: 1; + transform: scale(1.5); + box-shadow: 0 0 30px rgba(0, 255, 255, 1); } } - /* Feature header section */ - .feature-header { - display: flex; - align-items: center; - padding: 2.5rem 3rem 1.5rem 3rem; - background: linear-gradient(135deg, rgba(77, 208, 225, 0.08) 0%, rgba(186, 104, 200, 0.06) 100%); - border-bottom: 1px solid rgba(255, 255, 255, 0.1); - gap: 2rem; + .matrix-connection { + position: absolute; + height: 1px; + background: linear-gradient(90deg, + transparent 0%, + rgba(0, 255, 255, 0.6) 50%, + transparent 100%); + animation: connection-flow 3s linear infinite; } - .feature-logo-container { - position: relative; - display: flex; - align-items: center; - justify-content: center; - width: 80px; - height: 80px; - flex-shrink: 0; + .matrix-connection.conn-1 { + top: 24px; + left: 28px; + width: 64px; } - .feature-icon-large { - font-size: 3.5rem; - z-index: 2; - filter: drop-shadow(0 0 15px rgba(77, 208, 225, 0.5)); + .matrix-connection.conn-2 { + top: 24px; + left: 24px; + width: 72px; + transform: rotate(35deg); + transform-origin: left center; + animation-delay: 1s; } - .feature-header-content { - flex: 1; + .matrix-connection.conn-3 { + top: 24px; + right: 24px; + width: 72px; + transform: rotate(-35deg); + transform-origin: right center; + animation-delay: 2s; } - .feature-title-large { - font-family: 'Inter', sans-serif !important; - color: var(--text-primary) !important; - font-size: 2rem !important; - font-weight: 700 !important; - margin-bottom: 0.5rem !important; - text-shadow: 0 0 20px rgba(255, 255, 255, 0.3); - background: linear-gradient(135deg, var(--neon-cyan), var(--neon-blue)); - background-clip: text; + @keyframes connection-flow { + 0% { + background: linear-gradient(90deg, + transparent 0%, + transparent 10%, + rgba(0, 255, 255, 0.6) 50%, + transparent 90%, + transparent 100%); + } + 100% { + background: linear-gradient(90deg, + transparent 0%, + rgba(0, 255, 255, 0.6) 10%, + transparent 50%, + rgba(0, 255, 255, 0.6) 90%, + transparent 100%); + } + } + + .nexus-title { + font-family: 'JetBrains Mono', monospace; + font-size: 2.2rem; + font-weight: 800; + background: linear-gradient(135deg, + #00ffff 0%, + #ffffff 25%, + #00ff00 50%, + #ffffff 75%, + #ff00ff 100%); + background-size: 200% 200%; -webkit-background-clip: text; -webkit-text-fill-color: transparent; + background-clip: text; + margin: 0; + letter-spacing: 3px; + text-shadow: 0 0 30px rgba(0, 255, 255, 0.3); + animation: nexus-glow 3s ease-in-out infinite alternate; } - .feature-subtitle { - color: var(--text-secondary) !important; - font-size: 1rem !important; - font-weight: 500 !important; + @keyframes nexus-glow { + 0% { + background-position: 0% 50%; + filter: brightness(1); + } + 100% { + background-position: 100% 50%; + filter: brightness(1.2); + } + } + + .nexus-subtitle { + font-family: 'JetBrains Mono', monospace; + font-size: 1rem; + color: rgba(0, 255, 255, 0.8); + letter-spacing: 2px; + margin: 1rem 0 0 0; + text-transform: uppercase; opacity: 0.9; } - .feature-stats { - display: flex; - 
flex-direction: column; + .capability-matrix { + display: grid; + grid-template-columns: repeat(auto-fit, minmax(300px, 1fr)); + gap: 2rem; + margin: 3rem 0; + } + + .capability-node { + background: linear-gradient(135deg, + rgba(0, 0, 0, 0.6) 0%, + rgba(15, 25, 45, 0.4) 100%); + border: 1px solid rgba(0, 255, 255, 0.3); + border-radius: 16px; + padding: 2rem; + position: relative; + overflow: hidden; + transition: all 0.4s cubic-bezier(0.23, 1, 0.320, 1); + backdrop-filter: blur(10px); + } + + .capability-node::before { + content: ''; + position: absolute; + top: 0; + left: 0; + right: 0; + bottom: 0; + background: linear-gradient(135deg, + rgba(0, 255, 255, 0.05) 0%, + transparent 50%, + rgba(255, 0, 255, 0.05) 100%); + opacity: 0; + transition: opacity 0.4s ease; + z-index: -1; + } + + .capability-node:hover { + transform: translateY(-8px) scale(1.02); + border-color: rgba(0, 255, 255, 0.6); + box-shadow: + 0 20px 60px rgba(0, 0, 0, 0.4), + 0 0 30px rgba(0, 255, 255, 0.2); + } + + .capability-node:hover::before { + opacity: 1; + } + + .node-core { + position: relative; + text-align: center; + margin-bottom: 1.5rem; + } + + .core-pulse { + position: absolute; + top: 50%; + left: 50%; + transform: translate(-50%, -50%); + width: 60px; + height: 60px; + border: 2px solid rgba(0, 255, 255, 0.6); + border-radius: 50%; + animation: core-ripple 2s linear infinite; + } + + @keyframes core-ripple { + 0% { + transform: translate(-50%, -50%) scale(0.8); + opacity: 1; + } + 100% { + transform: translate(-50%, -50%) scale(2); + opacity: 0; + } + } + + .core-label { + font-family: 'JetBrains Mono', monospace; + font-size: 0.9rem; + font-weight: 700; + color: #00ffff; + background: rgba(0, 0, 0, 0.8); + padding: 0.8rem 1.2rem; + border-radius: 20px; + border: 1px solid rgba(0, 255, 255, 0.4); + letter-spacing: 1px; + position: relative; + z-index: 2; + display: inline-block; + text-shadow: 0 0 10px rgba(0, 255, 255, 0.5); + } + + .node-description h3 { + font-size: 1.4rem; + font-weight: 600; + color: #ffffff; + margin: 0 0 0.8rem 0; + text-align: center; + } + + .node-description p { + font-size: 0.95rem; + color: rgba(255, 255, 255, 0.8); + margin: 0; + text-align: center; + line-height: 1.5; + } + + .node-metrics { + text-align: center; + margin-top: 1.5rem; + } + + .metric { + font-family: 'JetBrains Mono', monospace; + font-size: 0.75rem; + font-weight: 600; + color: #00ffff; + background: linear-gradient(135deg, + rgba(0, 255, 255, 0.2) 0%, + rgba(0, 255, 255, 0.1) 100%); + padding: 0.4rem 0.8rem; + border-radius: 12px; + border: 1px solid rgba(0, 255, 255, 0.3); + text-transform: uppercase; + letter-spacing: 0.5px; + display: inline-block; + } + + /* Node-specific color themes */ + .research-node { + border-color: rgba(0, 255, 255, 0.3); + } + .research-node .core-label, + .research-node .metric { + color: #00ffff; + border-color: rgba(0, 255, 255, 0.4); + } + + .interface-node { + border-color: rgba(0, 255, 0, 0.3); + } + .interface-node .core-label, + .interface-node .metric { + color: #00ff00; + border-color: rgba(0, 255, 0, 0.4); + } + + .architecture-node { + border-color: rgba(255, 255, 0, 0.3); + } + .architecture-node .core-label, + .architecture-node .metric { + color: #ffff00; + border-color: rgba(255, 255, 0, 0.4); + } + + .text2code-node { + border-color: rgba(0, 255, 0, 0.3); + } + .text2code-node .core-label, + .text2code-node .metric { + color: #00ff00; + border-color: rgba(0, 255, 0, 0.4); + } + + .intelligence-node { + border-color: rgba(255, 0, 255, 0.3); + } + 
.intelligence-node .core-label, + .intelligence-node .metric { + color: #ff00ff; + border-color: rgba(255, 0, 255, 0.4); + } + + .processing-pipeline { + display: flex; + justify-content: center; + align-items: center; + gap: 1rem; + margin: 3rem 0; + padding: 2rem; + background: rgba(0, 0, 0, 0.3); + border-radius: 16px; + border: 1px solid rgba(0, 255, 255, 0.2); + flex-wrap: wrap; + } + + .pipeline-stage { + text-align: center; + position: relative; + } + + .stage-core { + font-family: 'JetBrains Mono', monospace; + font-size: 0.8rem; + font-weight: 700; + color: #ffffff; + background: linear-gradient(135deg, + rgba(0, 255, 255, 0.8) 0%, + rgba(0, 200, 200, 0.8) 100%); + padding: 1rem 1.5rem; + border-radius: 50px; + border: 2px solid rgba(0, 255, 255, 0.6); + letter-spacing: 1px; + text-shadow: 0 0 10px rgba(0, 0, 0, 0.8); + box-shadow: + 0 0 20px rgba(0, 255, 255, 0.3), + inset 0 0 20px rgba(255, 255, 255, 0.1); + } + + .stage-description { + font-size: 0.75rem; + color: rgba(0, 255, 255, 0.8); + margin-top: 0.5rem; + font-weight: 500; + } + + .pipeline-flow { + width: 60px; + height: 2px; + background: linear-gradient(90deg, + transparent 0%, + rgba(0, 255, 255, 0.6) 50%, + transparent 100%); + position: relative; + margin: 0 1rem; + } + + .flow-particle { + position: absolute; + top: -2px; + left: 0; + width: 6px; + height: 6px; + background: #00ffff; + border-radius: 50%; + box-shadow: 0 0 10px rgba(0, 255, 255, 0.8); + animation: particle-flow 2s linear infinite; + } + + @keyframes particle-flow { + 0% { + left: 0; + opacity: 0; + } + 10% { + opacity: 1; + } + 90% { + opacity: 1; + } + 100% { + left: 100%; + opacity: 0; + } + } + + .system-status { + text-align: center; + margin-top: 2rem; + } + + .status-indicator { + display: inline-flex; + align-items: center; + gap: 0.8rem; + background: rgba(0, 0, 0, 0.6); + padding: 1rem 2rem; + border-radius: 25px; + border: 1px solid rgba(0, 255, 0, 0.4); + } + + .status-pulse { + width: 10px; + height: 10px; + background: #00ff00; + border-radius: 50%; + animation: status-heartbeat 1.5s ease-in-out infinite; + box-shadow: 0 0 15px rgba(0, 255, 0, 0.6); + } + + @keyframes status-heartbeat { + 0%, 100% { + transform: scale(1); + opacity: 1; + } + 50% { + transform: scale(1.2); + opacity: 0.8; + } + } + + .status-text { + font-family: 'JetBrains Mono', monospace; + font-size: 0.9rem; + font-weight: 600; + color: #00ff00; + letter-spacing: 1px; + text-shadow: 0 0 10px rgba(0, 255, 0, 0.3); + } + + /* Responsive design for AI Nexus */ + @media (max-width: 768px) { + .ai-nexus-container { + padding: 2rem 1rem; + } + + .capability-matrix { + grid-template-columns: 1fr; + gap: 1.5rem; + } + + .processing-pipeline { + flex-direction: column; + gap: 1.5rem; + } + + .pipeline-flow { + transform: rotate(90deg); + width: 30px; + } + + .nexus-title { + font-size: 1.8rem; + letter-spacing: 2px; + } + } + + /* LEGACY FEATURES COMPACT */ + .features-compact-container { + background: linear-gradient(135deg, rgba(45, 55, 72, 0.1) 0%, rgba(26, 31, 58, 0.1) 100%); + border-radius: 20px; + padding: 2rem; + margin: 1.5rem 0; + backdrop-filter: blur(10px); + border: 1px solid rgba(100, 181, 246, 0.1); + position: relative; + overflow: hidden; + } + + .features-compact-container::before { + content: ''; + position: absolute; + top: 0; + left: 0; + right: 0; + bottom: 0; + background: linear-gradient(45deg, + rgba(100, 181, 246, 0.05) 0%, + rgba(77, 208, 225, 0.05) 50%, + rgba(129, 199, 132, 0.05) 100%); + z-index: -1; + } + + .features-header { + text-align: 
center; + margin-bottom: 2rem; + position: relative; + } + + .neural-pulse { + position: relative; + display: inline-block; + margin-bottom: 1rem; + } + + .pulse-dot { + width: 12px; + height: 12px; + background: var(--neon-cyan); + border-radius: 50%; + position: relative; + z-index: 2; + box-shadow: 0 0 20px var(--neon-cyan); + } + + .pulse-ring { + position: absolute; + top: 50%; + left: 50%; + transform: translate(-50%, -50%); + width: 40px; + height: 40px; + border: 2px solid var(--neon-cyan); + border-radius: 50%; + opacity: 0.6; + animation: pulse-expand 2s infinite ease-out; + } + + @keyframes pulse-expand { + 0% { + transform: translate(-50%, -50%) scale(0.5); + opacity: 0.8; + } + 100% { + transform: translate(-50%, -50%) scale(2); + opacity: 0; + } + } + + .platform-title { + font-size: 2rem; + font-weight: 700; + background: linear-gradient(135deg, var(--neon-cyan), var(--neon-blue)); + -webkit-background-clip: text; + -webkit-text-fill-color: transparent; + background-clip: text; + margin: 0.5rem 0; + text-shadow: 0 0 30px rgba(77, 208, 225, 0.3); + } + + .platform-subtitle { + font-size: 1.1rem; + color: var(--text-secondary); + margin: 0; + opacity: 0.9; + } + + .features-grid { + display: grid; + grid-template-columns: repeat(auto-fit, minmax(280px, 1fr)); + gap: 1.5rem; + margin: 2rem 0; + } + + .feature-pill { + background: rgba(255, 255, 255, 0.05); + border: 1px solid rgba(255, 255, 255, 0.1); + border-radius: 16px; + padding: 1.5rem; + position: relative; + overflow: hidden; + transition: all 0.3s ease; + backdrop-filter: blur(10px); + cursor: pointer; + } + + .feature-pill:hover { + transform: translateY(-5px); + box-shadow: 0 15px 40px rgba(0, 0, 0, 0.2); + border-color: rgba(255, 255, 255, 0.2); + } + + .pill-glow { + position: absolute; + top: 0; + left: 0; + right: 0; + bottom: 0; + background: linear-gradient(135deg, + rgba(100, 181, 246, 0.1) 0%, + rgba(77, 208, 225, 0.05) 100%); + opacity: 0; + transition: opacity 0.3s ease; + border-radius: 16px; + } + + .feature-pill:hover .pill-glow { + opacity: 1; + } + + .feature-pill.paper2code .pill-glow { + background: linear-gradient(135deg, rgba(100, 181, 246, 0.15), rgba(77, 208, 225, 0.1)); + } + + .feature-pill.text2web .pill-glow { + background: linear-gradient(135deg, rgba(129, 199, 132, 0.15), rgba(186, 104, 200, 0.1)); + } + + .feature-pill.text2backend .pill-glow { + background: linear-gradient(135deg, rgba(255, 193, 7, 0.15), rgba(255, 87, 34, 0.1)); + } + + .feature-pill.coderag .pill-glow { + background: linear-gradient(135deg, rgba(186, 104, 200, 0.15), rgba(156, 39, 176, 0.1)); + } + + .feature-icon { + font-size: 2.5rem; + margin-bottom: 1rem; + display: block; + text-align: center; + filter: drop-shadow(0 0 10px rgba(255, 255, 255, 0.3)); + } + + .feature-info h3 { + font-size: 1.3rem; + font-weight: 600; + color: var(--text-primary); + margin: 0.5rem 0; + text-align: center; + } + + .feature-info p { + font-size: 0.9rem; + color: var(--text-secondary); + margin: 0; + text-align: center; + opacity: 0.8; + } + + .feature-status { + position: absolute; + top: 1rem; + right: 1rem; + background: linear-gradient(135deg, var(--neon-cyan), var(--neon-blue)); + color: white; + padding: 0.3rem 0.8rem; + border-radius: 12px; + font-size: 0.7rem; + font-weight: 600; + text-transform: uppercase; + letter-spacing: 0.5px; + box-shadow: 0 4px 15px rgba(77, 208, 225, 0.3); + } + + .feature-pill.text2web .feature-status { + background: linear-gradient(135deg, var(--neon-green), var(--neon-purple)); + } + + 
.feature-pill.text2backend .feature-status { + background: linear-gradient(135deg, #ffc107, #ff5722); + } + + .feature-pill.coderag .feature-status { + background: linear-gradient(135deg, var(--neon-purple), #9c27b0); + } + + .workflow-indicator { + display: flex; + justify-content: center; + align-items: center; + gap: 1rem; + margin-top: 2rem; + padding: 1rem; + background: rgba(255, 255, 255, 0.02); + border-radius: 12px; + border: 1px solid rgba(255, 255, 255, 0.05); + } + + .workflow-step { + display: flex; + flex-direction: column; + align-items: center; + gap: 0.5rem; + padding: 0.8rem; + border-radius: 10px; + transition: all 0.3s ease; + position: relative; + } + + .workflow-step.active { + background: rgba(77, 208, 225, 0.1); + border: 1px solid rgba(77, 208, 225, 0.3); + } + + .workflow-step:hover { + background: rgba(255, 255, 255, 0.05); + transform: translateY(-2px); + } + + .step-icon { + font-size: 1.5rem; + filter: drop-shadow(0 0 8px rgba(255, 255, 255, 0.2)); + } + + .step-label { + font-size: 0.8rem; + color: var(--text-secondary); + font-weight: 500; + } + + .workflow-arrow { + color: var(--neon-cyan); + font-size: 1.2rem; + opacity: 0.6; + } + + /* Responsive adjustments for compact features */ + @media (max-width: 768px) { + .features-grid { + grid-template-columns: 1fr; + gap: 1rem; + } + + .workflow-indicator { + flex-wrap: wrap; + gap: 0.5rem; + } + + .workflow-arrow { + display: none; + } + + .platform-title { + font-size: 1.5rem; + } + } + + /* NEW VERTICAL LAYOUT FEATURE CARDS (LEGACY) */ + .feature-card-vertical { + position: relative; + background: linear-gradient(135deg, var(--card-bg) 0%, rgba(45, 55, 72, 0.8) 100%); + backdrop-filter: blur(25px); + border: 1px solid var(--border-color); + padding: 0; + border-radius: 24px; + margin: 2.5rem 0; + transition: all 0.5s cubic-bezier(0.175, 0.885, 0.32, 1.275); + box-shadow: 0 12px 60px rgba(0, 0, 0, 0.4); + overflow: hidden; + min-height: 500px; + } + + .feature-card-vertical:hover { + transform: translateY(-8px) scale(1.01); + box-shadow: 0 20px 80px rgba(0, 0, 0, 0.5); + } + + /* Card glow effect for vertical cards */ + .card-glow-vertical { + position: absolute; + top: -50%; + left: -50%; + width: 200%; + height: 200%; + background: radial-gradient(circle, transparent 30%, rgba(77, 208, 225, 0.03) 60%, transparent 80%); + opacity: 0; + transition: opacity 0.5s ease; + pointer-events: none; + animation: verticalGlowPulse 8s ease-in-out infinite; + } + + .feature-card-vertical:hover .card-glow-vertical { + opacity: 1; + } + + @keyframes verticalGlowPulse { + 0%, 100% { + transform: rotate(0deg) scale(1); + opacity: 0.3; + } + 50% { + transform: rotate(180deg) scale(1.1); + opacity: 0.7; + } + } + + /* Feature header section */ + .feature-header { + display: flex; + align-items: center; + padding: 2.5rem 3rem 1.5rem 3rem; + background: linear-gradient(135deg, rgba(77, 208, 225, 0.08) 0%, rgba(186, 104, 200, 0.06) 100%); + border-bottom: 1px solid rgba(255, 255, 255, 0.1); + gap: 2rem; + } + + .feature-logo-container { + position: relative; + display: flex; + align-items: center; + justify-content: center; + width: 80px; + height: 80px; + flex-shrink: 0; + } + + .feature-icon-large { + font-size: 3.5rem; + z-index: 2; + filter: drop-shadow(0 0 15px rgba(77, 208, 225, 0.5)); + } + + .feature-header-content { + flex: 1; + } + + .feature-title-large { + font-family: 'Inter', sans-serif !important; + color: var(--text-primary) !important; + font-size: 2rem !important; + font-weight: 700 !important; + margin-bottom: 
0.5rem !important; + text-shadow: 0 0 20px rgba(255, 255, 255, 0.3); + background: linear-gradient(135deg, var(--neon-cyan), var(--neon-blue)); + background-clip: text; + -webkit-background-clip: text; + -webkit-text-fill-color: transparent; + } + + .feature-subtitle { + color: var(--text-secondary) !important; + font-size: 1rem !important; + font-weight: 500 !important; + opacity: 0.9; + } + + .feature-stats { + display: flex; + flex-direction: column; gap: 1rem; align-items: flex-end; } @@ -2586,5 +3566,344 @@ def get_main_styles() -> str: color: var(--text-primary) !important; } + /* ==================================================================== + MODERN HEADER DESIGN - MINIMALIST & PROFESSIONAL + ==================================================================== */ + + .main-header-modern { + text-align: center; + padding: 2rem 1rem 1.5rem 1rem; + margin-bottom: 1rem; + background: linear-gradient(135deg, rgba(255, 255, 255, 0.02) 0%, rgba(255, 255, 255, 0.08) 100%); + border-radius: 16px; + border: 1px solid rgba(255, 255, 255, 0.1); + backdrop-filter: blur(10px); + } + + .header-logo { + display: flex; + align-items: center; + justify-content: center; + gap: 1rem; + margin-bottom: 1rem; + } + + .logo-icon { + font-size: 3rem; + filter: drop-shadow(0 0 20px rgba(100, 181, 246, 0.6)); + animation: float 3s ease-in-out infinite; + } + + @keyframes float { + 0%, 100% { transform: translateY(0px); } + 50% { transform: translateY(-5px); } + } + + .logo-text h1 { + margin: 0; + font-size: 2.5rem; + font-weight: 700; + background: linear-gradient(135deg, var(--neon-blue), var(--neon-cyan), var(--neon-purple)); + -webkit-background-clip: text; + -webkit-text-fill-color: transparent; + background-clip: text; + text-shadow: 0 0 30px rgba(100, 181, 246, 0.3); + } + + .logo-tagline { + display: block; + font-size: 0.9rem; + color: var(--text-muted); + font-weight: 400; + opacity: 0.8; + margin-top: 0.3rem; + } + + .header-subtitle { + display: flex; + flex-direction: column; + gap: 0.3rem; + opacity: 0.7; + } + + .org-info { + font-size: 0.8rem; + color: var(--neon-cyan); + font-weight: 600; + letter-spacing: 1px; + text-transform: uppercase; + } + + .mission { + font-size: 0.85rem; + color: var(--text-muted); + font-weight: 400; + font-style: italic; + } + + @media (max-width: 768px) { + .main-header-modern { + padding: 1.5rem 1rem; + } + + .header-logo { + flex-direction: column; + gap: 0.5rem; + } + + .logo-icon { + font-size: 2.5rem; + } + + .logo-text h1 { + font-size: 2rem; + } + + .header-subtitle { + gap: 0.2rem; + } + + .org-info { + font-size: 0.75rem; + } + + .mission { + font-size: 0.8rem; + } + } + + /* ==================================================================== + COMPACT FEATURES SECTION - MODERN MINIMALIST DESIGN + ==================================================================== */ + + .features-compact-header { + text-align: center; + margin: 2rem 0 1.5rem 0; + padding: 0; + } + + .features-title { + display: flex; + align-items: center; + justify-content: center; + gap: 1rem; + flex-wrap: wrap; + } + + .features-icon { + font-size: 2rem; + filter: drop-shadow(0 0 10px rgba(100, 181, 246, 0.6)); + } + + .features-title h3 { + margin: 0; + font-size: 1.8rem; + font-weight: 600; + color: var(--text-primary); + background: linear-gradient(135deg, var(--neon-blue), var(--neon-cyan)); + -webkit-background-clip: text; + -webkit-text-fill-color: transparent; + background-clip: text; + } + + .features-subtitle { + font-size: 0.95rem; + color: var(--text-muted); + 
font-weight: 400; + opacity: 0.8; + } + + .features-row { + display: grid; + grid-template-columns: repeat(auto-fit, minmax(280px, 1fr)); + gap: 1rem; + margin: 1rem 0; + padding: 0; + } + + @media (max-width: 768px) { + .features-row { + grid-template-columns: 1fr; + gap: 0.8rem; + } + } + + .feature-card-compact { + display: flex; + align-items: center; + gap: 1rem; + padding: 1.2rem; + background: rgba(255, 255, 255, 0.05); + backdrop-filter: blur(10px); + border: 1px solid rgba(255, 255, 255, 0.1); + border-radius: 12px; + transition: all 0.3s cubic-bezier(0.4, 0, 0.2, 1); + position: relative; + overflow: hidden; + cursor: pointer; + } + + .feature-card-compact::before { + content: ''; + position: absolute; + top: 0; + left: -100%; + width: 100%; + height: 100%; + background: linear-gradient(90deg, transparent, rgba(255, 255, 255, 0.08), transparent); + transition: left 0.6s ease; + } + + .feature-card-compact:hover { + transform: translateY(-2px); + border-color: rgba(255, 255, 255, 0.2); + box-shadow: 0 8px 25px rgba(0, 0, 0, 0.15); + background: rgba(255, 255, 255, 0.08); + } + + .feature-card-compact:hover::before { + left: 100%; + } + + .feature-card-compact .feature-icon { + flex-shrink: 0; + width: 48px; + height: 48px; + display: flex; + align-items: center; + justify-content: center; + font-size: 1.5rem; + background: rgba(255, 255, 255, 0.1); + border-radius: 10px; + transition: all 0.3s ease; + } + + .feature-card-compact:hover .feature-icon { + transform: scale(1.1); + background: rgba(255, 255, 255, 0.15); + } + + /* Specific color themes for each card */ + .feature-card-compact.paper2code .feature-icon { + background: linear-gradient(135deg, #667eea, #764ba2); + } + + .feature-card-compact.text2web .feature-icon { + background: linear-gradient(135deg, #4facfe, #00f2fe); + } + + .feature-card-compact.text2backend .feature-icon { + background: linear-gradient(135deg, #fa709a, #fee140); + } + + .feature-card-compact.coderag .feature-icon { + background: linear-gradient(135deg, #a8edea, #fed6e3); + } + + .feature-content { + flex-grow: 1; + min-width: 0; + } + + .feature-content h4 { + margin: 0 0 0.3rem 0; + font-size: 1.1rem; + font-weight: 600; + color: var(--text-primary); + line-height: 1.2; + } + + .feature-content p { + margin: 0 0 0.8rem 0; + font-size: 0.85rem; + color: var(--text-muted); + line-height: 1.3; + opacity: 0.9; + } + + .feature-tags { + display: flex; + gap: 0.4rem; + flex-wrap: wrap; + } + + .feature-tags .tag { + padding: 0.2rem 0.6rem; + font-size: 0.75rem; + background: rgba(100, 181, 246, 0.15); + color: var(--neon-blue); + border: 1px solid rgba(100, 181, 246, 0.3); + border-radius: 20px; + font-weight: 500; + transition: all 0.2s ease; + } + + .feature-card-compact:hover .feature-tags .tag { + background: rgba(100, 181, 246, 0.25); + border-color: rgba(100, 181, 246, 0.5); + } + + .feature-status { + flex-shrink: 0; + display: flex; + align-items: center; + } + + .status-indicator { + width: 8px; + height: 8px; + border-radius: 50%; + background: var(--neon-green); + box-shadow: 0 0 10px rgba(129, 199, 132, 0.6); + animation: pulse-status 2s infinite; + } + + @keyframes pulse-status { + 0%, 100% { + opacity: 1; + transform: scale(1); + } + 50% { + opacity: 0.7; + transform: scale(1.2); + } + } + + /* Responsive adjustments */ + @media (max-width: 640px) { + .features-title { + flex-direction: column; + gap: 0.5rem; + } + + .features-title h3 { + font-size: 1.5rem; + } + + .feature-card-compact { + padding: 1rem; + gap: 0.8rem; + } + + 
.feature-card-compact .feature-icon { + width: 40px; + height: 40px; + font-size: 1.3rem; + } + + .feature-content h4 { + font-size: 1rem; + } + + .feature-content p { + font-size: 0.8rem; + } + + .feature-tags .tag { + font-size: 0.7rem; + padding: 0.15rem 0.5rem; + } + } + + """
diff --git a/utils/file_processor.py b/utils/file_processor.py index 1d1bcd14..99632861 100644 --- a/utils/file_processor.py +++ b/utils/file_processor.py @@ -187,6 +187,14 @@ async def read_file_content(file_path: str) -> str: if not os.path.exists(file_path): raise FileNotFoundError(f"File not found: {file_path}") + # Check if file is actually a PDF by reading the first few bytes + with open(file_path, "rb") as f: + header = f.read(8) + if header.startswith(b"%PDF"): + raise IOError( + f"File {file_path} is a PDF file, not a text file. Please convert it to markdown format or use PDF processing tools." + ) + # Read file content # Note: Using async with would be better for large files # but for simplicity and compatibility, using regular file reading @@ -195,6 +203,10 @@ return content + except UnicodeDecodeError as e: + raise IOError( + f"Error reading file {file_path}: File encoding is not UTF-8. Original error: {str(e)}" + ) except Exception as e: raise IOError(f"Error reading file {file_path}: {str(e)}")
diff --git a/utils/llm_utils.py b/utils/llm_utils.py index 558ed0fb..ff135977 100644 --- a/utils/llm_utils.py +++ b/utils/llm_utils.py @@ -187,6 +187,12 @@ def get_adaptive_agent_config( config["algorithm_analysis"].append("document-segmentation") if "document-segmentation" not in config["code_planner"]: config["code_planner"].append("document-segmentation") + else: + config["concept_analysis"] = ["filesystem"] + if "filesystem" not in config["algorithm_analysis"]: + config["algorithm_analysis"].append("filesystem") + if "filesystem" not in config["code_planner"]: + config["code_planner"].append("filesystem") return config
diff --git a/workflows/agent_orchestration_engine.py b/workflows/agent_orchestration_engine.py index 02926110..f8cf57c1 100644 --- a/workflows/agent_orchestration_engine.py +++ b/workflows/agent_orchestration_engine.py @@ -62,6 +62,125 @@ os.environ["PYTHONDONTWRITEBYTECODE"] = "1" # Prevent .pyc file generation + + +def _assess_output_completeness(text: str) -> float: + """ + Heuristic assessment of output completeness + + Uses several heuristics to detect whether the output was truncated: + 1. Structural marker completeness check + 2. Sentence completeness analysis + 3. Code block completeness verification + 4. Expected content element check + + Returns: + float: Completeness score (0.0-1.0); higher means more complete + """ + if not text or len(text.strip()) < 100: + return 0.0 + + score = 0.0 + factors = 0 + + # 1. Basic length check (weight: 0.2) + if len(text) > 5000: # Expected minimum output length + score += 0.2 + elif len(text) > 2000: + score += 0.1 + factors += 1 + + # 2. Structural completeness check (weight: 0.3) + structure_indicators = [ + "## 1.", + "## 2.", + "## 3.", # Section headings + "```", + "file_structure", + "implementation", + "algorithm", + "method", + "function", + ] + structure_count = sum( + 1 for indicator in structure_indicators if indicator.lower() in text.lower() + ) + if structure_count >= 6: + score += 0.3 + elif structure_count >= 3: + score += 0.15 + factors += 1 + + # 3. Sentence completeness check (weight: 0.2) + lines = text.strip().split("\n") + if lines: + last_line = lines[-1].strip() + # Check whether the last line is a complete sentence or structured content + if ( + last_line.endswith((".", ":", "```", "!", "?")) + or last_line.startswith(("##", "-", "*", "`")) + or len(last_line) < 10 + ): # Very short lines are likely list items + score += 0.2 + elif len(last_line) > 50 and not last_line.endswith( + (".", ":", "```", "!", "?") + ): + # A long line with no proper ending was likely truncated + score += 0.05 + factors += 1 + + # 4. Code implementation plan completeness (weight: 0.3) + implementation_keywords = [ + "file structure", + "architecture", + "implementation", + "requirements", + "dependencies", + "setup", + "main", + "class", + "function", + "method", + "algorithm", + ] + impl_count = sum( + 1 for keyword in implementation_keywords if keyword.lower() in text.lower() + ) + if impl_count >= 8: + score += 0.3 + elif impl_count >= 4: + score += 0.15 + factors += 1 + + return min(score, 1.0) # Ensure the score does not exceed 1.0
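The scorer above is dominated by two signals: overall length and the shape of the final line. As a standalone illustration of the last-line signal (not part of the patch; the helper name and the two sample strings below are invented), here is a minimal sketch:

```python
# Minimal sketch of the last-line truncation signal used by
# _assess_output_completeness above. Illustrative only: the helper
# name and the sample strings are invented for this example.
def last_line_looks_complete(text: str) -> bool:
    last = text.strip().split("\n")[-1].strip()
    return (
        last.endswith((".", ":", "```", "!", "?"))  # proper sentence/structure ending
        or last.startswith(("##", "-", "*", "`"))   # headings and list items
        or len(last) < 10                           # very short lines pass as list items
    )

complete = "## 3. Validation\nAll planned files are implemented."
truncated = "The training loop proceeds by first initializing the optimizer and then"
print(last_line_looks_complete(complete))   # True
print(last_line_looks_complete(truncated))  # False
```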
+ + +def _adjust_params_for_retry(params: RequestParams, retry_count: int) -> RequestParams: + """ + Dynamically adjust request parameters to improve the success rate + + Adjusts parameters based on the retry count: + - Increase the token limit + - Lower the temperature + - Tune other parameters + """ + # Base token increment: each retry adds more tokens + token_increment = 4096 * (retry_count + 1) + new_max_tokens = min( + params.max_tokens + token_increment, 32768 + ) # Stay within a reasonable 32K cap + + # Lower the temperature as retries increase for more consistent output + new_temperature = max(params.temperature - (retry_count * 0.1), 0.1) + + print(f"🔧 Adjusting parameters for retry {retry_count + 1}:") + print(f" Token limit: {params.max_tokens} → {new_max_tokens}") + print(f" Temperature: {params.temperature} → {new_temperature}") + + return RequestParams( + max_tokens=new_max_tokens, + temperature=new_temperature, + ) + + def get_default_search_server(config_path: str = "mcp_agent.config.yaml"): """ Get the default search server from configuration. @@ -382,10 +501,24 @@ async def run_code_analyzer( llm_factory=get_preferred_llm_class(), ) - # Set appropriate token output limit for Claude models (max 8192) + # Advanced token management system with dynamic scaling + # Check whether segmentation mode is in use to adjust the token limit dynamically + if use_segmentation: + # Segmented mode: a higher token limit is safe because the input has already been optimized + max_tokens_limit = 16384 # Higher limit, since segmentation reduces input complexity + temperature = 0.2 # Slightly lower temperature for more consistent output + print( + "🧠 Using SEGMENTED mode: Higher token limit (16384) with optimized inputs" + ) + else: + # Traditional mode: use a conservative token limit and enable incremental generation + max_tokens_limit = 12288 # Moderate limit, leaving room for aggregated output + temperature = 0.3 + print("🧠 Using TRADITIONAL mode: Moderate token limit (12288)") + enhanced_params = RequestParams( - max_tokens=8192, # Adjusted to Claude 3.5 Sonnet's actual limit - temperature=0.3, + max_tokens=max_tokens_limit, + temperature=temperature, ) # Concise message for multi-agent paper analysis and code planning @@ -399,10 +532,44 @@ The goal is to create a reproduction plan detailed enough for independent implementation.""" - result = await code_aggregator_agent.generate_str( - message=message, request_params=enhanced_params - ) - print(f"Code analysis result: {result}") + # Output completeness check with intelligent retry mechanism + max_retries = 3 + retry_count = 0 + + while retry_count < max_retries: + try: + print( + f"🚀 Attempting code analysis (attempt {retry_count + 1}/{max_retries})" + ) + result = await code_aggregator_agent.generate_str( + message=message, request_params=enhanced_params + ) + + # Check advanced indicators of output completeness + completeness_score = _assess_output_completeness(result) + print(f"📊 Output completeness score: {completeness_score:.2f}/1.0") + + if completeness_score >= 0.8: # Output is considered complete + print( + f"✅ Code analysis completed successfully (length: {len(result)} chars)" + ) + return result + else: + print( + f"⚠️ Output appears truncated (score: {completeness_score:.2f}), retrying with enhanced parameters..." + ) + # Dynamically adjust parameters and retry + enhanced_params = _adjust_params_for_retry(enhanced_params, retry_count) + retry_count += 1 + + except Exception as e: + print(f"❌ Error in code analysis attempt {retry_count + 1}: {e}") + retry_count += 1 + if retry_count >= max_retries: + raise + + # If all retries fail, return the result from the last attempt + print(f"⚠️ Returning potentially incomplete result after {max_retries} attempts") return result
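The escalate-and-retry pattern above (widen the token budget, cool the temperature, re-generate until the completeness score clears 0.8) can be summarized in a standalone sketch. The `generate` stub and the numeric thresholds here are invented stand-ins for the real agent call and scorer:

```python
# Standalone sketch of the escalate-and-retry pattern above.
# generate() is a hypothetical stand-in for the LLM call, and the
# length threshold stands in for the completeness score >= 0.8 check.
def generate(prompt: str, max_tokens: int, temperature: float) -> str:
    return "x" * min(max_tokens, 10_000)  # pretend larger budgets yield longer output

def generate_with_retries(prompt: str, max_retries: int = 3) -> str:
    max_tokens, temperature = 8192, 0.3
    result = ""
    for attempt in range(max_retries):
        result = generate(prompt, max_tokens, temperature)
        if len(result) >= 9_000:  # stand-in for "output judged complete"
            return result
        # Escalate in the spirit of _adjust_params_for_retry: more tokens, cooler sampling
        max_tokens = min(max_tokens + 4096 * (attempt + 1), 32_768)
        temperature = max(temperature - 0.1, 0.1)
    return result  # potentially incomplete after all retries

print(len(generate_with_retries("reproduce the paper")))  # 10000
```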
@@ -661,6 +828,14 @@ async def orchestrate_document_preprocessing_agent( # Step 2: Read document content to determine size md_path = os.path.join(dir_info["paper_dir"], md_files[0]) try: + # Check if file is actually a PDF by reading the first few bytes + with open(md_path, "rb") as f: + header = f.read(8) + if header.startswith(b"%PDF"): + raise IOError( + f"File {md_path} is a PDF file, not a text file. Please convert it to markdown format or use PDF processing tools." + ) + with open(md_path, "r", encoding="utf-8") as f: document_content = f.read() except Exception as e: @@ -1562,3 +1737,130 @@ async def execute_chat_based_planning_pipeline( except Exception as e: print(f"Error in execute_chat_based_planning_pipeline: {e}") raise e + + +async def run_requirement_analysis_agent( + user_input: str, + analysis_mode: str, + user_answers: Dict[str, str] = None, + logger=None, +) -> str: + """ + Run the requirement analysis Agent for question generation or requirement summarization + + Args: + user_input: User's initial requirement description + analysis_mode: Analysis mode ("generate_questions" or "summarize_requirements") + user_answers: User's answer dictionary for questions (only used in summarize_requirements mode) + logger: Logger instance + + Returns: + str: Generated question JSON string or detailed requirement document + """ + try: + print(f"🧠 Starting requirement analysis Agent, mode: {analysis_mode}") + print(f"Input length: {len(user_input) if user_input else 0}") + + if not user_input or user_input.strip() == "": + raise ValueError("User input cannot be empty") + + # Import requirement analysis Agent + from workflows.agents.requirement_analysis_agent import RequirementAnalysisAgent + + # Create requirement analysis Agent instance + async with RequirementAnalysisAgent(logger=logger) as req_agent: + if analysis_mode == "generate_questions": + # Generate guiding questions + print("📝 Generating guiding questions...") + questions = await req_agent.generate_guiding_questions(user_input) + return json.dumps(questions, ensure_ascii=False, indent=2) + + elif analysis_mode == "summarize_requirements": + # Summarize detailed requirements + print("📋 Summarizing detailed requirements...") + if user_answers is None: + user_answers = {} + summary = await req_agent.summarize_detailed_requirements( + user_input, user_answers + ) + return summary + + else: + raise ValueError(f"Unsupported analysis mode: {analysis_mode}") + + except Exception as e: + print(f"❌ Requirement analysis Agent execution failed: {e}") + print(f"Exception details: {type(e).__name__}: {str(e)}") + raise
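Taken together, the two modes of `run_requirement_analysis_agent` form a question-then-summarize loop: ask clarifying questions first, then fold the answers into a requirement document. A hypothetical usage sketch, with the agent call stubbed out and the questions and answers invented:

```python
# Hypothetical usage sketch of the two-phase flow. The agent call is
# stubbed here; the real function drives RequirementAnalysisAgent via an LLM.
import asyncio
import json

async def run_requirement_analysis_agent(user_input, analysis_mode, user_answers=None):
    if analysis_mode == "generate_questions":
        return json.dumps(["Which web framework?", "Which storage backend?"])
    return f"Detailed requirements for '{user_input}' given answers: {user_answers}"

async def main():
    # Phase 1: turn a vague request into guiding questions
    raw = await run_requirement_analysis_agent("build me a todo app", "generate_questions")
    questions = json.loads(raw)
    # Phase 2: fold the user's answers into a detailed requirement document
    answers = {q: "user's answer here" for q in questions}
    doc = await run_requirement_analysis_agent(
        "build me a todo app", "summarize_requirements", user_answers=answers
    )
    print(doc)

asyncio.run(main())
```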
+ + +async def execute_requirement_analysis_workflow( + user_input: str, + analysis_mode: str, + user_answers: Dict[str, str] = None, + logger=None, + progress_callback: Optional[Callable] = None, +) -> Dict[str, Any]: + """ + Execute the user requirement analysis workflow + + This function supports two modes: + 1. generate_questions: Generate guiding questions based on the user's initial requirements + 2. summarize_requirements: Generate a detailed requirement document based on the user's answers + + Args: + user_input: User's initial requirement description + analysis_mode: Analysis mode ("generate_questions" or "summarize_requirements") + user_answers: User's answer dictionary for questions + logger: Logger instance + progress_callback: Progress callback function + + Returns: + Dict[str, Any]: Dictionary containing analysis results + """ + try: + print(f"🧠 Starting requirement analysis workflow, mode: {analysis_mode}") + + if progress_callback: + if analysis_mode == "generate_questions": + progress_callback( + 10, + "🤔 Analyzing user requirements, generating guiding questions...", + ) + else: + progress_callback( + 10, + "📝 Integrating user answers, generating a detailed requirement document...", + ) + + # Call the requirement analysis Agent + result = await run_requirement_analysis_agent( + user_input=user_input, + analysis_mode=analysis_mode, + user_answers=user_answers, + logger=logger, + ) + + if progress_callback: + progress_callback(100, "✅ Requirement analysis completed!") + + return { + "status": "success", + "mode": analysis_mode, + "result": result, + "message": f"Requirement analysis ({analysis_mode}) executed successfully", + } + + except Exception as e: + error_msg = f"Requirement analysis workflow execution failed: {str(e)}" + print(f"❌ {error_msg}") + + if progress_callback: + progress_callback(0, f"❌ {error_msg}") + + return { + "status": "error", + "mode": analysis_mode, + "error": error_msg, + "message": "Requirement analysis workflow execution failed", + }
diff --git a/workflows/agents/memory_agent_concise.py b/workflows/agents/memory_agent_concise.py index 5448e7fd..4b38ea92 100644 --- a/workflows/agents/memory_agent_concise.py +++ b/workflows/agents/memory_agent_concise.py @@ -215,32 +215,50 @@ def _extract_from_tree_structure(self, lines: List[str]) -> List[str]: if ":" in line and not ("."
in line and "/" in line): continue - # Only process lines that look like file tree structure - if not any( - char in line for char in ["├", "└", "│", "-"] - ) and not line.startswith(" "): + stripped_line = line.strip() + + # Detect root directory (directory name ending with / at minimal indentation) + if ( + stripped_line.endswith("/") + and len(line) - len(line.lstrip()) + <= 4 # Minimal indentation (0-4 spaces) + and not any(char in line for char in ["├", "└", "│", "─"]) + ): # No tree characters + root_directory = stripped_line.rstrip("/") + path_stack = [root_directory] + continue + + # Only process lines that have tree structure + if not any(char in line for char in ["├", "└", "│", "─"]): continue - # Remove tree characters and get the clean name + # Parse tree structure depth by analyzing the line structure + # Count │ characters before the actual item, or use indentation as fallback + pipe_count = 0 + + for i, char in enumerate(line): + if char == "│": + pipe_count += 1 + elif char in ["├", "└"]: + break + + # Calculate depth: use pipe count if available, otherwise use indentation + if pipe_count > 0: + depth = pipe_count + 1 # +1 because the actual item is one level deeper + else: + # Use indentation to determine depth (every 4 spaces = 1 level) + indent_spaces = len(line) - len(line.lstrip()) + depth = max(1, indent_spaces // 4) # At least depth 1 + + # Clean the line to get the item name clean_line = line - for char in ["├──", "└──", "│", "─", "├", "└"]: + for char in ["├──", "└──", "├", "└", "│", "─"]: clean_line = clean_line.replace(char, "") clean_line = clean_line.strip() if not clean_line or ":" in clean_line: continue - # Calculate indentation level by counting spaces/tree chars - indent_level = 0 - for char in line: - if char in [" ", "\t", "│", "├", "└", "─"]: - indent_level += 1 - else: - break - indent_level = max( - 0, (indent_level - 4) // 4 - ) # Normalize to directory levels - # Extract filename (remove comments) if "#" in clean_line: filename = clean_line.split("#")[0].strip() @@ -251,12 +269,34 @@ def _extract_from_tree_structure(self, lines: List[str]) -> List[str]: if not filename: continue - # Update path stack based on indentation - if indent_level < len(path_stack): - path_stack = path_stack[:indent_level] + # Adjust path stack to current depth + while len(path_stack) < depth: + path_stack.append("") + path_stack = path_stack[:depth] + + # Determine if it's a directory or file + is_directory = ( + filename.endswith("/") + or ( + "." not in filename + and filename not in ["README", "requirements.txt", "setup.py"] + ) + or filename + in [ + "core", + "networks", + "environments", + "baselines", + "evaluation", + "experiments", + "utils", + "src", + "lib", + "app", + ] + ) - # If it's a directory (ends with / or no extension), add to path stack - if filename.endswith("/") or ("." 
not in filename and filename != ""): + if is_directory: directory_name = filename.rstrip("/") if directory_name and ":" not in directory_name: path_stack.append(directory_name) @@ -473,10 +513,16 @@ async def create_code_implementation_summary( # Format summary with only Implementation Progress and Dependencies for file saving file_summary_content = "" - if sections.get("implementation_progress"): - file_summary_content += sections["implementation_progress"] + "\n\n" - if sections.get("dependencies"): - file_summary_content += sections["dependencies"] + "\n\n" + if sections.get("core_purpose"): + file_summary_content += sections["core_purpose"] + "\n\n" + if sections.get("public_interface"): + file_summary_content += sections["public_interface"] + "\n\n" + if sections.get("internal_dependencies"): + file_summary_content += sections["internal_dependencies"] + "\n\n" + if sections.get("external_dependencies"): + file_summary_content += sections["external_dependencies"] + "\n\n" + if sections.get("implementation_notes"): + file_summary_content += sections["implementation_notes"] + "\n\n" # Create the formatted summary for file saving (without Next Steps) formatted_summary = self._format_code_implementation_summary( @@ -545,12 +591,25 @@ def _create_code_summary_prompt( **Required Summary Format:** -**Implementation Progress**: List the code file completed in current round and core implementation ideas - Format: {{file_path}}: {{core implementation ideas}} +**Core Purpose** (provide a general overview of the file's main responsibility): +- {{1-2 sentence description of file's main responsibility}} + +**Public Interface** (what other files can use, if any): +- Class {{ClassName}}: {{purpose}} | Key methods: {{method_names}} | Constructor params: {{params}} +- Function {{function_name}}({{params}}): {{purpose}} -> {{return_type}}: {{purpose}} +- Constants/Types: {{name}}: {{value/description}} -**Dependencies**: According to the File Structure and initial plan, list functions that may be called by other files - Format: {{file_path}}: Function {{function_name}}: core ideas--{{ideas}}; Required parameters--{{params}}; Return parameters--{{returns}} - Required packages: {{packages}} +**Internal Dependencies** (what this file imports/requires, if any): +- From {{module/file}}: {{specific_imports}} +- External packages: {{package_name}} - {{usage_context}} + +**External Dependencies** (what depends on this file, if any): +- Expected to be imported by: {{likely_consumer_files}} +- Key exports used elsewhere: {{main_interfaces}} + +**Implementation Notes**: (if any) +- Architecture decisions: {{key_choices_made}} +- Cross-File Relationships: {{how_files_work_together}} **Next Steps**: List the code file (ONLY ONE) that will be implemented in the next round (MUST choose from "Remaining Unimplemented Files" above) Format: Code will be implemented: {{file_path}} @@ -569,6 +628,14 @@ def _create_code_summary_prompt( return prompt + # TODO: The prompt is not good, need to be improved + # **Implementation Progress**: List the code file completed in current round and core implementation ideas + # Format: {{file_path}}: {{core implementation ideas}} + + # **Dependencies**: According to the File Structure and initial plan, list functions that may be called by other files + # Format: {{file_path}}: Function {{function_name}}: core ideas--{{ideas}}; Required parameters--{{params}}; Return parameters--{{returns}} + # Required packages: {{packages}} + def _extract_summary_sections(self, llm_summary: str) -> 
Dict[str, str]: """ Extract different sections from LLM-generated summary @@ -577,9 +644,17 @@ def _extract_summary_sections(self, llm_summary: str) -> Dict[str, str]: llm_summary: Raw LLM-generated summary text Returns: - Dictionary with extracted sections: implementation_progress, dependencies, next_steps + Dictionary with extracted sections: core_purpose, public_interface, internal_dependencies, + external_dependencies, implementation_notes, next_steps """ - sections = {"implementation_progress": "", "dependencies": "", "next_steps": ""} + sections = { + "core_purpose": "", + "public_interface": "", + "internal_dependencies": "", + "external_dependencies": "", + "implementation_notes": "", + "next_steps": "", + } try: lines = llm_summary.split("\n") @@ -590,17 +665,30 @@ def _extract_summary_sections(self, llm_summary: str) -> Dict[str, str]: line_lower = line.lower().strip() # Check for section headers - if "implementation progress" in line_lower: + if "core purpose" in line_lower: + if current_section and current_content: + sections[current_section] = "\n".join(current_content).strip() + current_section = "core_purpose" + current_content = [line] # Include the header + elif "public interface" in line_lower: if current_section and current_content: sections[current_section] = "\n".join(current_content).strip() - current_section = "implementation_progress" + current_section = "public_interface" current_content = [line] # Include the header - elif ( - "dependencies" in line_lower and "implementation" not in line_lower - ): + elif "internal dependencies" in line_lower: if current_section and current_content: sections[current_section] = "\n".join(current_content).strip() - current_section = "dependencies" + current_section = "internal_dependencies" + current_content = [line] # Include the header + elif "external dependencies" in line_lower: + if current_section and current_content: + sections[current_section] = "\n".join(current_content).strip() + current_section = "external_dependencies" + current_content = [line] # Include the header + elif "implementation notes" in line_lower: + if current_section and current_content: + sections[current_section] = "\n".join(current_content).strip() + current_section = "implementation_notes" current_content = [line] # Include the header elif "next steps" in line_lower: if current_section and current_content: @@ -620,8 +708,8 @@ def _extract_summary_sections(self, llm_summary: str) -> Dict[str, str]: except Exception as e: self.logger.error(f"Failed to extract summary sections: {e}") - # Fallback: put everything in implementation_progress - sections["implementation_progress"] = llm_summary + # Fallback: put everything in core_purpose + sections["core_purpose"] = llm_summary return sections @@ -984,10 +1072,10 @@ def create_concise_messages( **For NEW file implementation:** 1. **You need to call read_code_mem(already_implemented_file_path)** to understand existing implementations and dependencies - agent should choose relevant ALREADY IMPLEMENTED file paths for reference, NOT the new file you want to create 2. Write_file can be used to implement the new component -3. Finally: Use execute_python or execute_bash for testing (if needed) +3. OPTIONALLY: Use execute_python or execute_bash if meet some specific requirements (if needed) **When all files implemented:** -1. **Use execute_python or execute_bash** to test the complete implementation""" +1. 
**Use execute_python or execute_bash** to test the complete implementation (if needed)""" # # Append Next Steps information if available (even when no tool results) # if self.current_next_steps.strip(): diff --git a/workflows/agents/memory_agent_concise_index.py b/workflows/agents/memory_agent_concise_index.py index e19cb870..8409adfc 100644 --- a/workflows/agents/memory_agent_concise_index.py +++ b/workflows/agents/memory_agent_concise_index.py @@ -215,32 +215,50 @@ def _extract_from_tree_structure(self, lines: List[str]) -> List[str]: if ":" in line and not ("." in line and "/" in line): continue - # Only process lines that look like file tree structure - if not any( - char in line for char in ["├", "└", "│", "-"] - ) and not line.startswith(" "): + stripped_line = line.strip() + + # Detect root directory (directory name ending with / at minimal indentation) + if ( + stripped_line.endswith("/") + and len(line) - len(line.lstrip()) + <= 4 # Minimal indentation (0-4 spaces) + and not any(char in line for char in ["├", "└", "│", "─"]) + ): # No tree characters + root_directory = stripped_line.rstrip("/") + path_stack = [root_directory] + continue + + # Only process lines that have tree structure + if not any(char in line for char in ["├", "└", "│", "─"]): continue - # Remove tree characters and get the clean name + # Parse tree structure depth by analyzing the line structure + # Count │ characters before the actual item, or use indentation as fallback + pipe_count = 0 + + for i, char in enumerate(line): + if char == "│": + pipe_count += 1 + elif char in ["├", "└"]: + break + + # Calculate depth: use pipe count if available, otherwise use indentation + if pipe_count > 0: + depth = pipe_count + 1 # +1 because the actual item is one level deeper + else: + # Use indentation to determine depth (every 4 spaces = 1 level) + indent_spaces = len(line) - len(line.lstrip()) + depth = max(1, indent_spaces // 4) # At least depth 1 + + # Clean the line to get the item name clean_line = line - for char in ["├──", "└──", "│", "─", "├", "└"]: + for char in ["├──", "└──", "├", "└", "│", "─"]: clean_line = clean_line.replace(char, "") clean_line = clean_line.strip() if not clean_line or ":" in clean_line: continue - # Calculate indentation level by counting spaces/tree chars - indent_level = 0 - for char in line: - if char in [" ", "\t", "│", "├", "└", "─"]: - indent_level += 1 - else: - break - indent_level = max( - 0, (indent_level - 4) // 4 - ) # Normalize to directory levels - # Extract filename (remove comments) if "#" in clean_line: filename = clean_line.split("#")[0].strip() @@ -251,12 +269,34 @@ def _extract_from_tree_structure(self, lines: List[str]) -> List[str]: if not filename: continue - # Update path stack based on indentation - if indent_level < len(path_stack): - path_stack = path_stack[:indent_level] + # Adjust path stack to current depth + while len(path_stack) < depth: + path_stack.append("") + path_stack = path_stack[:depth] + + # Determine if it's a directory or file + is_directory = ( + filename.endswith("/") + or ( + "." not in filename + and filename not in ["README", "requirements.txt", "setup.py"] + ) + or filename + in [ + "core", + "networks", + "environments", + "baselines", + "evaluation", + "experiments", + "utils", + "src", + "lib", + "app", + ] + ) - # If it's a directory (ends with / or no extension), add to path stack - if filename.endswith("/") or ("." 
not in filename and filename != ""): + if is_directory: directory_name = filename.rstrip("/") if directory_name and ":" not in directory_name: path_stack.append(directory_name) @@ -473,10 +513,16 @@ async def create_code_implementation_summary( # Format summary with only Implementation Progress and Dependencies for file saving file_summary_content = "" - if sections.get("implementation_progress"): - file_summary_content += sections["implementation_progress"] + "\n\n" - if sections.get("dependencies"): - file_summary_content += sections["dependencies"] + "\n\n" + if sections.get("core_purpose"): + file_summary_content += sections["core_purpose"] + "\n\n" + if sections.get("public_interface"): + file_summary_content += sections["public_interface"] + "\n\n" + if sections.get("internal_dependencies"): + file_summary_content += sections["internal_dependencies"] + "\n\n" + if sections.get("external_dependencies"): + file_summary_content += sections["external_dependencies"] + "\n\n" + if sections.get("implementation_notes"): + file_summary_content += sections["implementation_notes"] + "\n\n" # Create the formatted summary for file saving (without Next Steps) formatted_summary = self._format_code_implementation_summary( @@ -545,12 +591,25 @@ def _create_code_summary_prompt( **Required Summary Format:** -**Implementation Progress**: List the code file completed in current round and core implementation ideas - Format: {{file_path}}: {{core implementation ideas}} +**Core Purpose** (provide a general overview of the file's main responsibility): +- {{1-2 sentence description of file's main responsibility}} + +**Public Interface** (what other files can use, if any): +- Class {{ClassName}}: {{purpose}} | Key methods: {{method_names}} | Constructor params: {{params}} +- Function {{function_name}}({{params}}): {{purpose}} -> {{return_type}}: {{purpose}} +- Constants/Types: {{name}}: {{value/description}} -**Dependencies**: According to the File Structure and initial plan, list functions that may be called by other files - Format: {{file_path}}: Function {{function_name}}: core ideas--{{ideas}}; Required parameters--{{params}}; Return parameters--{{returns}} - Required packages: {{packages}} +**Internal Dependencies** (what this file imports/requires, if any): +- From {{module/file}}: {{specific_imports}} +- External packages: {{package_name}} - {{usage_context}} + +**External Dependencies** (what depends on this file, if any): +- Expected to be imported by: {{likely_consumer_files}} +- Key exports used elsewhere: {{main_interfaces}} + +**Implementation Notes**: (if any) +- Architecture decisions: {{key_choices_made}} +- Cross-File Relationships: {{how_files_work_together}} **Next Steps**: List the code file (ONLY ONE) that will be implemented in the next round (MUST choose from "Remaining Unimplemented Files" above) Format: Code will be implemented: {{file_path}} @@ -560,7 +619,6 @@ def _create_code_summary_prompt( - Be precise and concise - Focus on function interfaces that other files will need - Extract actual function signatures from the code -- **CRITICAL: ONLY choose from the "Remaining Unimplemented Files" list above!** - **CRITICAL: For Next Steps, ONLY choose ONE file from the "Remaining Unimplemented Files" list above** - **NEVER suggest implementing a file that is already in the implemented files list** - Choose the next file based on logical dependencies and implementation order @@ -570,6 +628,14 @@ def _create_code_summary_prompt( return prompt + # TODO: The prompt is not good, need to be 
improved + # **Implementation Progress**: List the code file completed in current round and core implementation ideas + # Format: {{file_path}}: {{core implementation ideas}} + + # **Dependencies**: According to the File Structure and initial plan, list functions that may be called by other files + # Format: {{file_path}}: Function {{function_name}}: core ideas--{{ideas}}; Required parameters--{{params}}; Return parameters--{{returns}} + # Required packages: {{packages}} + def _extract_summary_sections(self, llm_summary: str) -> Dict[str, str]: """ Extract different sections from LLM-generated summary @@ -578,9 +644,17 @@ def _extract_summary_sections(self, llm_summary: str) -> Dict[str, str]: llm_summary: Raw LLM-generated summary text Returns: - Dictionary with extracted sections: implementation_progress, dependencies, next_steps + Dictionary with extracted sections: core_purpose, public_interface, internal_dependencies, + external_dependencies, implementation_notes, next_steps """ - sections = {"implementation_progress": "", "dependencies": "", "next_steps": ""} + sections = { + "core_purpose": "", + "public_interface": "", + "internal_dependencies": "", + "external_dependencies": "", + "implementation_notes": "", + "next_steps": "", + } try: lines = llm_summary.split("\n") @@ -591,17 +665,30 @@ def _extract_summary_sections(self, llm_summary: str) -> Dict[str, str]: line_lower = line.lower().strip() # Check for section headers - if "implementation progress" in line_lower: + if "core purpose" in line_lower: + if current_section and current_content: + sections[current_section] = "\n".join(current_content).strip() + current_section = "core_purpose" + current_content = [line] # Include the header + elif "public interface" in line_lower: + if current_section and current_content: + sections[current_section] = "\n".join(current_content).strip() + current_section = "public_interface" + current_content = [line] # Include the header + elif "internal dependencies" in line_lower: + if current_section and current_content: + sections[current_section] = "\n".join(current_content).strip() + current_section = "internal_dependencies" + current_content = [line] # Include the header + elif "external dependencies" in line_lower: if current_section and current_content: sections[current_section] = "\n".join(current_content).strip() - current_section = "implementation_progress" + current_section = "external_dependencies" current_content = [line] # Include the header - elif ( - "dependencies" in line_lower and "implementation" not in line_lower - ): + elif "implementation notes" in line_lower: if current_section and current_content: sections[current_section] = "\n".join(current_content).strip() - current_section = "dependencies" + current_section = "implementation_notes" current_content = [line] # Include the header elif "next steps" in line_lower: if current_section and current_content: @@ -621,8 +708,8 @@ def _extract_summary_sections(self, llm_summary: str) -> Dict[str, str]: except Exception as e: self.logger.error(f"Failed to extract summary sections: {e}") - # Fallback: put everything in implementation_progress - sections["implementation_progress"] = llm_summary + # Fallback: put everything in core_purpose + sections["core_purpose"] = llm_summary return sections @@ -642,6 +729,25 @@ def _format_code_implementation_summary( """ timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S") + # # Create formatted list of implemented files + # implemented_files_list = ( + # "\n".join([f"- {file}" for file in 
self.implemented_files]) + # if self.implemented_files + # else "- None yet" + # ) + + # formatted_summary = f"""# Code Implementation Summary + # **All Previously Implemented Files:** + # {implemented_files_list} + # **Generated**: {timestamp} + # **File Implemented**: {file_path} + # **Total Files Implemented**: {files_implemented} + + # {llm_summary} + + # --- + # *Auto-generated by Memory Agent* + # """ formatted_summary = f"""# Code Implementation Summary **Generated**: {timestamp} **File Implemented**: {file_path} @@ -821,6 +927,7 @@ def record_tool_result( if tool_name == "write_file": self.last_write_file_detected = True self.should_clear_memory_next = True + # self.logger.info(f"🔄 WRITE_FILE DETECTED: {file_path} - Memory will be cleared in next round") # Only record specific tools that provide essential information @@ -929,56 +1036,56 @@ def create_concise_messages( "role": "user", "content": f"""**Below is the Knowledge Base of the LATEST implemented code file:** {self._read_code_knowledge_base()} -""", + +**Development Cycle - START HERE:** + +**For NEW file implementation:** +1. **You need to call read_code_mem(already_implemented_file_path)** to understand existing implementations and dependencies - agent should choose relevant ALREADY IMPLEMENTED file paths for reference, NOT the new file you want to create +2. `search_code_references` → OPTIONALLY search reference patterns for inspiration (use for reference only, original paper specs take priority) +3. `write_file` → Create the complete code implementation based on original paper requirements +4. `execute_python` or `execute_bash` → Test the partial implementation if needed + +**When all files implemented:** +**Use execute_python or execute_bash** to test the complete implementation""", } concise_messages.append(knowledge_base_message) # 3. Add current tool results (essential information for next file generation) if self.current_round_tool_results: tool_results_content = self._format_tool_results() + + # # Append Next Steps information if available + # if self.current_next_steps.strip(): + # tool_results_content += f"\n\n**Next Steps (from previous analysis):**\n{self.current_next_steps}" + tool_results_message = { "role": "user", "content": f"""**Current Tool Results:** -{tool_results_content} - -**🚨 NEXT STEP: First determine if ALL files from the reproduction plan have been implemented:** - -**If ALL files are implemented (reproduction plan complete):** -- Use `execute_python` or `execute_bash` to test the complete implementation -- If testing successful, respond with "**implementation complete**" to end the conversation -- Only use `read_code_mem` if debugging is needed during testing - -**If MORE files need to be implemented:** -- #1. `read_code_mem` → Query summaries of relevant **already-implemented** files (agent should choose which implemented file paths to reference)(important!!!) -- #2. `search_code_references` → OPTIONALLY search reference patterns for inspiration (use for reference only, original paper specs take priority) -- #3. `write_file` → Create the complete code implementation based on original paper requirements -- #4. 
`execute_python` or `execute_bash` → Test the partial implementation if needed - -**Remember:** Always check if all planned files are implemented before continuing with new file creation.""", +{tool_results_content}""", } concise_messages.append(tool_results_message) else: # If no tool results yet, add guidance for next steps - guidance_message = { - "role": "user", - "content": f"""**Current Round:** {self.current_round} + guidance_content = f"""**Current Round:** {self.current_round} **Development Cycle - START HERE:** -**FIRST: Check if ALL files from the reproduction plan are implemented** -- If YES: Use `execute_python` or `execute_bash` for testing, then respond "**implementation complete**" -- If NO: Continue with file implementation cycle below - **For NEW file implementation:** -1. **You can call read_code_mem(*already_implemented_file_path*)** to understand existing implementations and dependencies - agent should choose relevant ALREADY IMPLEMENTED file paths for reference, NOT the new file you want to create -2. **Optionally use search_code_references** for reference patterns (OPTIONAL - for inspiration only, original paper specs take priority) -3. Write_file can be used to implement the new component based on original paper requirements +1. **You need to call read_code_mem(already_implemented_file_path)** to understand existing implementations and dependencies - agent should choose relevant ALREADY IMPLEMENTED file paths for reference, NOT the new file you want to create +2. `search_code_references` → OPTIONALLY search reference patterns for inspiration (use for reference only, original paper specs take priority) +3. Write_file can be used to implement the new component 4. Finally: Use execute_python or execute_bash for testing (if needed) -**For TESTING/COMPLETION phase (when all files implemented):** -1. **➡️ FIRST: Use execute_python or execute_bash** to test the complete implementation -2. **If successful: Respond with "implementation complete"** to end the conversation -3. Only use read_code_mem if debugging is needed during testing""", +**When all files implemented:** +1. **Use execute_python or execute_bash** to test the complete implementation""" + + # # Append Next Steps information if available (even when no tool results) + # if self.current_next_steps.strip(): + # guidance_content += f"\n\n**Next Steps (from previous analysis):**\n{self.current_next_steps}" + + guidance_message = { + "role": "user", + "content": guidance_content, } concise_messages.append(guidance_message) # **Available Essential Tools:** read_code_mem, write_file, execute_python, execute_bash diff --git a/workflows/agents/memory_agent_concise_multi.py b/workflows/agents/memory_agent_concise_multi.py new file mode 100644 index 00000000..50cc7cb3 --- /dev/null +++ b/workflows/agents/memory_agent_concise_multi.py @@ -0,0 +1,1659 @@ +""" +Concise Memory Agent for Code Implementation Workflow - Multi-File Only Support + +This memory agent implements a focused approach with ONLY multi-file capabilities: +1. Before first batch: Normal conversation flow +2. After first batch: Keep only system_prompt + initial_plan + current round tool results +3. Clean slate for each new code batch generation +4. 
MULTI-FILE ONLY: Support for summarizing multiple files simultaneously (max 5) + +Key Features: +- Preserves system prompt and initial plan always +- After first batch generation, discards previous conversation history +- Keeps only current round tool results from essential tools: + * read_multiple_files, write_multiple_files + * execute_python, execute_bash + * search_code, search_reference_code, get_file_structure +- Provides clean, focused input for next write_multiple_files operation +- MULTI-FILE ONLY: No single file support +- FILE TRACKING: Gets ALL file information from workflow, no internal tracking +""" + +import json +import logging +import os +import time +from datetime import datetime +from typing import Dict, Any, List, Optional + + +class ConciseMemoryAgent: + """ + Concise Memory Agent - Focused Information Retention with MULTI-FILE ONLY Support + + Core Philosophy: + - Preserve essential context (system prompt + initial plan) + - After first batch generation, use clean slate approach + - Keep only current round tool results from multi-file MCP tools + - Remove conversational clutter and previous tool calls + - MULTI-FILE ONLY: Support for multiple file implementations in single operation + - FILE TRACKING: Receives ALL file information from workflow (no internal tracking) + + Essential Tools Tracked: + - Multi-File Operations: read_multiple_files, write_multiple_files + - Code Analysis: search_code, search_reference_code, get_file_structure + - Execution: execute_python, execute_bash + """ + + def __init__( + self, + initial_plan_content: str, + logger: Optional[logging.Logger] = None, + target_directory: Optional[str] = None, + default_models: Optional[Dict[str, str]] = None, + max_files_per_batch: int = 3, + ): + """ + Initialize Concise Memory Agent with MULTI-FILE ONLY support + + Args: + initial_plan_content: Content of initial_plan.txt + logger: Logger instance + target_directory: Target directory for saving summaries + default_models: Default models configuration from workflow + max_files_per_batch: Maximum number of files to implement simultaneously (default: 3) + """ + self.logger = logger or self._create_default_logger() + self.initial_plan = initial_plan_content + self.max_files_per_batch = max_files_per_batch + + # Store default models configuration + self.default_models = default_models or { + "anthropic": "claude-sonnet-4-20250514", + "openai": "gpt-4o", + } + + # Memory state tracking - new logic: trigger after each write_multiple_files + self.last_write_multiple_files_detected = ( + False # Track if write_multiple_files was called in current iteration + ) + self.should_clear_memory_next = False # Flag to clear memory in next round + self.current_round = 0 + + # self.phase_structure = self._parse_phase_structure() + + # Memory configuration + if target_directory: + self.save_path = target_directory + else: + self.save_path = "./deepcode_lab/papers/1/" + + # Code summary file path + self.code_summary_path = os.path.join( + self.save_path, "implement_code_summary.md" + ) + + # Current round tool results storage + self.current_round_tool_results = [] + + self.logger.info( + f"Concise Memory Agent initialized with target directory: {self.save_path}" + ) + self.logger.info(f"Code summary will be saved to: {self.code_summary_path}") + self.logger.info(f"Max files per batch: {self.max_files_per_batch}") + self.logger.info( + "📝 MULTI-FILE LOGIC: Memory clearing triggered after each write_multiple_files call" + ) + self.logger.info( + "🆕 MULTI-FILE ONLY: No single 
file support - batch operations only" + ) + self.logger.info( + "📊 FILE TRACKING: ALL file information received from workflow (no internal tracking)" + ) + + def _create_default_logger(self) -> logging.Logger: + """Create default logger""" + logger = logging.getLogger(f"{__name__}.ConciseMemoryAgent") + logger.setLevel(logging.INFO) + return logger + + async def create_multi_code_implementation_summary( + self, + client, + client_type: str, + file_implementations: Dict[str, str], + files_implemented: int, + implemented_files: List[str], # Receive from workflow + ) -> str: + """ + Create LLM-based code implementation summary for multiple files + ONLY AVAILABLE METHOD: Handles multiple files simultaneously with separate summaries for each + + Args: + client: LLM client instance + client_type: Type of LLM client ("anthropic" or "openai") + file_implementations: Dictionary mapping file_path to implementation_content + files_implemented: Number of files implemented so far + implemented_files: List of all implemented files (from workflow) + + Returns: + LLM-generated formatted code implementation summaries for all files + """ + try: + # Validate input + if not file_implementations: + raise ValueError("No file implementations provided") + + if len(file_implementations) > self.max_files_per_batch: + raise ValueError( + f"Too many files provided ({len(file_implementations)}), max is {self.max_files_per_batch}" + ) + + # Create prompt for LLM summary of multiple files + summary_prompt = self._create_multi_code_summary_prompt( + file_implementations, files_implemented, implemented_files + ) + summary_messages = [{"role": "user", "content": summary_prompt}] + + # Get LLM-generated summary + llm_response = await self._call_llm_for_summary( + client, client_type, summary_messages + ) + llm_summary = llm_response.get("content", "") + + # Extract sections for each file and next steps + multi_sections = self._extract_multi_summary_sections( + llm_summary, file_implementations.keys() + ) + + # Format and save summary for each file (WITHOUT Next Steps) + all_formatted_summaries = [] + + for file_path in file_implementations.keys(): + file_sections = multi_sections.get("files", {}).get(file_path, {}) + + # Format summary with ONLY Implementation Progress and Dependencies for file saving + file_summary_content = "" + if file_sections.get("core_purpose"): + file_summary_content += file_sections["core_purpose"] + "\n\n" + if file_sections.get("public_interface"): + file_summary_content += file_sections["public_interface"] + "\n\n" + if file_sections.get("internal_dependencies"): + file_summary_content += ( + file_sections["internal_dependencies"] + "\n\n" + ) + if file_sections.get("external_dependencies"): + file_summary_content += ( + file_sections["external_dependencies"] + "\n\n" + ) + if file_sections.get("implementation_notes"): + file_summary_content += ( + file_sections["implementation_notes"] + "\n\n" + ) + + # Create the formatted summary for file saving (WITHOUT Next Steps) + formatted_summary = self._format_code_implementation_summary( + file_path, file_summary_content.strip(), files_implemented + ) + + all_formatted_summaries.append(formatted_summary) + + # Save to implement_code_summary.md (append mode) - ONLY Implementation Progress and Dependencies + await self._save_code_summary_to_file(formatted_summary, file_path) + + # Combine all summaries for return + combined_summary = "\n".join(all_formatted_summaries) + + self.logger.info( + f"Created and saved multi-file code summaries for 
{len(file_implementations)} files" + ) + + return combined_summary + + except Exception as e: + self.logger.error( + f"Failed to create LLM-based multi-file code implementation summary: {e}" + ) + # Fallback to simple summary for each file + return self._create_fallback_multi_code_summary( + file_implementations, files_implemented + ) + + def _create_multi_code_summary_prompt( + self, + file_implementations: Dict[str, str], + files_implemented: int, + implemented_files: List[str], + ) -> str: + """ + Create prompt for LLM to generate multi-file code implementation summary + + Args: + file_implementations: Dictionary mapping file_path to implementation_content + files_implemented: Number of files implemented so far + implemented_files: List of all implemented files (from workflow) + + Returns: + Prompt for LLM multi-file summarization + """ + + # Format file lists using workflow data + implemented_files_list = ( + "\n".join([f"- {file}" for file in implemented_files]) + if implemented_files + else "- None yet" + ) + + # Note: We don't have unimplemented files list anymore - workflow will provide when needed + + # Format file implementations for the prompt + implementation_sections = [] + for file_path, content in file_implementations.items(): + implementation_sections.append(f""" + **File: {file_path}** + {content} + """) + + files_list = list(file_implementations.keys()) + files_count = len(files_list) + + prompt = f"""You are an expert code implementation summarizer. Analyze the {files_count} implemented code files and create structured summaries for each. + +**All Previously Implemented Files:** +{implemented_files_list} + +**Current Implementation Context:** +- **Files Implemented**: {', '.join(files_list)} +- **Total Files Implemented**: {files_implemented} +- **Files in This Batch**: {files_count} + +**Initial Plan Reference:** +{self.initial_plan[:]} + +**Implemented Code Content:** +{''.join(implementation_sections)} + +**Required Summary Format:** + +**FOR EACH FILE, provide separate sections:** + +**File: {{file_path}}** +**Core Purpose** (provide a general overview of the file's main responsibility): +- {{1-2 sentence description of file's main responsibility}} + +**Public Interface** (what other files can use, if any): +- Class {{ClassName}}: {{purpose}} | Key methods: {{method_names}} | Constructor params: {{params}} +- Function {{function_name}}({{params}}): {{purpose}} -> {{return_type}}: {{purpose}} +- Constants/Types: {{name}}: {{value/description}} + +**Internal Dependencies** (what this file imports/requires, if any): +- From {{module/file}}: {{specific_imports}} +- External packages: {{package_name}} - {{usage_context}} + +**External Dependencies** (what depends on this file, if any): +- Expected to be imported by: {{likely_consumer_files}} +- Key exports used elsewhere: {{main_interfaces}} + +**Implementation Notes**: (if any) +- Architecture decisions: {{key_choices_made}} +- Cross-File Relationships: {{how_files_work_together}} + +[Repeat for all {files_count} files...] 
+ +**Instructions:** +- Provide separate Implementation Progress and Dependencies sections for each of the {files_count} files +- Be precise and concise for each file +- Focus on function interfaces that other files will need +- Extract actual function signatures from the code +- Use the exact format specified above + +**Summary:**""" + + return prompt + + def _extract_multi_summary_sections( + self, llm_summary: str, file_paths: List[str] + ) -> Dict[str, Any]: + """ + Extract different sections from LLM-generated multi-file summary + """ + result = { + "files": {}, + } + + try: + # Convert dict_keys to list if needed + if hasattr(file_paths, "keys"): + file_paths = list(file_paths) + elif not isinstance(file_paths, list): + file_paths = list(file_paths) + + lines = llm_summary.split("\n") + current_file = None + current_section = None + current_content = [] + file_sections = {} + + for i, line in enumerate(lines): + line_lower = line.lower().strip() + original_line = line.strip() + + # Skip empty lines + if not original_line: + if current_section: + current_content.append(line) + continue + + # File header detection + if ( + "**file:" in line_lower or "file:" in line_lower + ) and "**" in original_line: + # Save previous section + if current_file and current_section and current_content: + if current_file not in file_sections: + file_sections[current_file] = {} + file_sections[current_file][current_section] = "\n".join( + current_content + ).strip() + + # Extract file path + file_header = original_line.lower() + if "**file:" in file_header: + file_header = original_line[ + original_line.lower().find("file:") + 5 : + ] + if "**" in file_header: + file_header = file_header[: file_header.find("**")] + else: + file_header = original_line[ + original_line.lower().find("file:") + 5 : + ] + + file_header = file_header.strip() + current_file = None + + # File matching + for file_path in file_paths: + file_name = file_path.split("/")[-1] + if ( + file_path in file_header + or file_header in file_path + or file_name in file_header + or file_header in file_name + ): + current_file = file_path + break + + current_section = None + current_content = [] + continue + + # Section detection within files + if current_file: + section_matched = False + + if "core purpose" in line_lower and "**" in original_line: + if current_section and current_content: + if current_file not in file_sections: + file_sections[current_file] = {} + file_sections[current_file][current_section] = "\n".join( + current_content + ).strip() + current_section = "core_purpose" + current_content = [] + section_matched = True + elif "public interface" in line_lower and "**" in original_line: + if current_section and current_content: + if current_file not in file_sections: + file_sections[current_file] = {} + file_sections[current_file][current_section] = "\n".join( + current_content + ).strip() + current_section = "public_interface" + current_content = [] + section_matched = True + elif ( + "internal dependencies" in line_lower and "**" in original_line + ): + if current_section and current_content: + if current_file not in file_sections: + file_sections[current_file] = {} + file_sections[current_file][current_section] = "\n".join( + current_content + ).strip() + current_section = "internal_dependencies" + current_content = [] + section_matched = True + elif ( + "external dependencies" in line_lower and "**" in original_line + ): + if current_section and current_content: + if current_file not in file_sections: + 
file_sections[current_file] = {}
+                        file_sections[current_file][current_section] = "\n".join(
+                            current_content
+                        ).strip()
+                    current_section = "external_dependencies"
+                    current_content = []
+                    section_matched = True
+                elif "implementation notes" in line_lower and "**" in original_line:
+                    if current_section and current_content:
+                        if current_file not in file_sections:
+                            file_sections[current_file] = {}
+                        file_sections[current_file][current_section] = "\n".join(
+                            current_content
+                        ).strip()
+                    current_section = "implementation_notes"
+                    current_content = []
+                    section_matched = True
+
+                # If no section header matched, add to current content
+                if not section_matched and current_section:
+                    current_content.append(line)
+
+            # Save the final section
+            if current_file and current_section and current_content:
+                if current_file not in file_sections:
+                    file_sections[current_file] = {}
+                file_sections[current_file][current_section] = "\n".join(
+                    current_content
+                ).strip()
+
+            # Build final result
+            for file_path in file_paths:
+                sections = file_sections.get(file_path, {})
+                result["files"][file_path] = {}
+                if "core_purpose" in sections:
+                    result["files"][file_path]["core_purpose"] = (
+                        "**Core Purpose**:\n" + sections["core_purpose"]
+                    )
+                if "public_interface" in sections:
+                    result["files"][file_path]["public_interface"] = (
+                        "**Public Interface**:\n" + sections["public_interface"]
+                    )
+                if "implementation_notes" in sections:
+                    result["files"][file_path]["implementation_notes"] = (
+                        "**Implementation Notes**:\n" + sections["implementation_notes"]
+                    )
+                if "internal_dependencies" in sections:
+                    result["files"][file_path]["internal_dependencies"] = (
+                        "**Internal Dependencies**:\n"
+                        + sections["internal_dependencies"]
+                    )
+                if "external_dependencies" in sections:
+                    result["files"][file_path]["external_dependencies"] = (
+                        "**External Dependencies**:\n"
+                        + sections["external_dependencies"]
+                    )
+
+            self.logger.info(
+                f"📋 Extracted multi-file sections for {len(result['files'])} files"
+            )
+
+        except Exception as e:
+            self.logger.error(f"Failed to extract multi-file summary sections: {e}")
+            self.logger.error(f"📋 file_paths type: {type(file_paths)}")
+            self.logger.error(f"📋 file_paths value: {file_paths}")
+            self.logger.error(f"📋 file_paths length: {len(file_paths)}")
+            for file_path in file_paths:
+                result["files"][file_path] = {
+                    "core_purpose": f"**Core Purpose**: {file_path} completed.",
+                    "public_interface": "**Public Interface**: Public interface needs manual review.",
+                    "internal_dependencies": "**Internal Dependencies**: Internal dependencies need manual review.",
+                    "external_dependencies": "**External Dependencies**: External dependencies need manual review.",
+                    "implementation_notes": "**Implementation Notes**: Implementation notes need manual review.",
+                }
+
+        return result
+
+    def _format_code_implementation_summary(
+        self, file_path: str, llm_summary: str, files_implemented: int
+    ) -> str:
+        """
+        Format the LLM-generated summary into the final structure
+
+        Args:
+            file_path: Path of the implemented file
+            llm_summary: LLM-generated summary content
+            files_implemented: Number of files implemented so far
+
+        Returns:
+            Formatted summary
+        """
+        timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+
+        formatted_summary = f"""# Code Implementation Summary
+**Generated**: {timestamp}
+**File Implemented**: {file_path}
+
+{llm_summary}
+
+---
+*Auto-generated by Memory Agent*
+"""
+        return formatted_summary
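+
+    # Illustrative sketch (not executed): _extract_multi_summary_sections above
+    # expects LLM output shaped like the prompt template. For one hypothetical
+    # file "utils/math_ops.py", input and output would look roughly like:
+    #
+    #     llm_summary = (
+    #         "**File: utils/math_ops.py**\n"
+    #         "**Core Purpose**:\n- Small math helpers\n"
+    #         "**Public Interface**:\n- Function add(a, b): sum -> int\n"
+    #     )
+    #     sections = agent._extract_multi_summary_sections(
+    #         llm_summary, ["utils/math_ops.py"]
+    #     )
+    #     sections["files"]["utils/math_ops.py"]["core_purpose"]
+    #     # -> "**Core Purpose**:\n- Small math helpers"
+
+    def _create_fallback_multi_code_summary(
+        self, file_implementations: 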
Dict[str, str], files_implemented: int + ) -> str: + """ + Create fallback multi-file summary when LLM is unavailable + + Args: + file_implementations: Dictionary mapping file_path to implementation_content + files_implemented: Number of files implemented so far + + Returns: + Fallback multi-file summary + """ + # Create fallback summaries for each file + fallback_summaries = [] + timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S") + + for file_path in file_implementations.keys(): + fallback_summary = f"""# Code Implementation Summary +**Generated**: {timestamp} +**File Implemented**: {file_path} +**Multi-file batch summary failed to generate.** + +--- +*Auto-generated by Concise Memory Agent (Multi-File Fallback Mode)* +""" + fallback_summaries.append(fallback_summary) + + return "\n".join(fallback_summaries) + + async def _save_code_summary_to_file(self, new_summary: str, file_path: str): + """ + Append code implementation summary to implement_code_summary.md + Accumulates all implementations with clear separators + + Args: + new_summary: New summary content to append + file_path: Path of the file for which the summary was generated + """ + try: + # Create directory if it doesn't exist + os.makedirs(os.path.dirname(self.code_summary_path), exist_ok=True) + + # Check if file exists to determine if we need header + file_exists = os.path.exists(self.code_summary_path) + + # Open in append mode to accumulate all implementations + with open(self.code_summary_path, "a", encoding="utf-8") as f: + if not file_exists: + # Write header for new file + f.write("# Code Implementation Progress Summary\n") + f.write("*Accumulated implementation progress for all files*\n\n") + + # Add clear separator between implementations + f.write("\n" + "=" * 80 + "\n") + f.write(f"## IMPLEMENTATION File {file_path}\n") + f.write("=" * 80 + "\n\n") + + # Write the new summary + f.write(new_summary) + f.write("\n\n") + + self.logger.info( + f"Appended LLM-based code implementation summary to: {self.code_summary_path}" + ) + + except Exception as e: + self.logger.error(f"Failed to save code implementation summary: {e}") + + async def _call_llm_for_summary( + self, client, client_type: str, summary_messages: List[Dict] + ) -> Dict[str, Any]: + """ + Call LLM for code implementation summary generation ONLY + + This method is used only for creating code implementation summaries, + NOT for conversation summarization which has been removed. + """ + if client_type == "anthropic": + response = await client.messages.create( + model=self.default_models["anthropic"], + system="You are an expert code implementation summarizer. Create structured summaries of implemented code files that preserve essential information about functions, dependencies, and implementation approaches.", + messages=summary_messages, + max_tokens=8000, # Increased for multi-file support + temperature=0.2, + ) + + content = "" + for block in response.content: + if block.type == "text": + content += block.text + + return {"content": content} + + elif client_type == "openai": + openai_messages = [ + { + "role": "system", + "content": "You are an expert code implementation summarizer. 
Create structured summaries of implemented code files that preserve essential information about functions, dependencies, and implementation approaches.", + } + ] + openai_messages.extend(summary_messages) + + # Try max_tokens and temperature first, fallback to max_completion_tokens without temperature if unsupported + try: + response = await client.chat.completions.create( + model=self.default_models["openai"], + messages=openai_messages, + max_tokens=8000, # Increased for multi-file support + temperature=0.2, + ) + except Exception as e: + if "max_tokens" in str(e) and "max_completion_tokens" in str(e): + # Retry with max_completion_tokens and no temperature for models that require it + response = await client.chat.completions.create( + model=self.default_models["openai"], + messages=openai_messages, + max_completion_tokens=8000, # Increased for multi-file support + ) + else: + raise + + return {"content": response.choices[0].message.content or ""} + + else: + raise ValueError(f"Unsupported client type: {client_type}") + + def start_new_round(self, iteration: Optional[int] = None): + """Start a new dialogue round and reset tool results + + Args: + iteration: Optional iteration number from workflow to sync with current_round + """ + if iteration is not None: + # Sync with workflow iteration + self.current_round = iteration + else: + # Default behavior: increment round counter + self.current_round += 1 + self.logger.info(f"🔄 Started new round {self.current_round}") + + self.current_round_tool_results = [] # Clear previous round results + + def record_tool_result( + self, tool_name: str, tool_input: Dict[str, Any], tool_result: Any + ): + """ + Record tool result for current round and detect write_multiple_files calls + + Args: + tool_name: Name of the tool called + tool_input: Input parameters for the tool + tool_result: Result returned by the tool + """ + # Detect write_multiple_files calls to trigger memory clearing + if tool_name == "write_multiple_files": + self.last_write_multiple_files_detected = True + self.should_clear_memory_next = True + + # Only record specific tools that provide essential information + essential_tools = [ + "read_multiple_files", # Read multiple file contents + "write_multiple_files", # Write multiple file contents (important for tracking implementations) + "execute_python", # Execute Python code (for testing/validation) + "execute_bash", # Execute bash commands (for build/execution) + "search_code", # Search code patterns + "search_reference_code", # Search reference code (if available) + "get_file_structure", # Get file structure (for understanding project layout) + ] + + if tool_name in essential_tools: + tool_record = { + "tool_name": tool_name, + "tool_input": tool_input, + "tool_result": tool_result, + "timestamp": time.time(), + } + self.current_round_tool_results.append(tool_record) + + def should_use_concise_mode(self) -> bool: + """ + Check if concise memory mode should be used + + Returns: + True if first batch has been generated and concise mode should be active + """ + return self.last_write_multiple_files_detected + + def create_concise_messages_revise( + self, + system_prompt: str, + messages: List[Dict[str, Any]], + files_implemented: int, + task_description: str, + file_batch: List[str], + is_first_batch: bool = True, + implemented_files: List[str] = None, # Receive from workflow + all_files: List[str] = None, # NEW: Receive all files from workflow + ) -> List[Dict[str, Any]]: + """ + Create concise message list for LLM input specifically for 
revision execution + ALIGNED with _execute_multi_file_batch_revision in code_evaluation_workflow + + Args: + system_prompt: Current system prompt + messages: Original message list + files_implemented: Number of files implemented so far + task_description: Description of the current task + file_batch: Files to implement in this batch + is_first_batch: Whether this is the first batch (use file_batch) or subsequent + implemented_files: List of all implemented files (from workflow) + all_files: List of all files that should be implemented (from workflow) + + Returns: + Concise message list containing only essential information for revision + """ + # Use empty lists if not provided + if implemented_files is None: + implemented_files = [] + if all_files is None: + all_files = [] + + self.logger.info( + "🎯 Using CONCISE memory mode for revision - Clear slate after write_multiple_files" + ) + + concise_messages = [] + + # Format file lists using workflow data + implemented_files_list = ( + "\n".join([f"- {file}" for file in implemented_files]) + if implemented_files + else "- None yet" + ) + + # Calculate unimplemented files from workflow data + + # Read initial plan and memory content + initial_plan_content = self.initial_plan + memory_content = ( + self._read_code_knowledge_base() + or "No previous implementation memory available" + ) + + files_to_implement = file_batch + file_list = "\n".join([f"- {file_path}" for file_path in files_to_implement]) + + # Create revision-specific task message + task_message = f"""Task: {task_description} + + Files to implement in this batch ({len(files_to_implement)} files): + {file_list} + + MANDATORY JSON FORMAT REQUIREMENTS: + 1. Use write_multiple_files tool + 2. Parameter name: "file_implementations" + 3. Value must be a VALID JSON string with ESCAPED newlines + 4. Use \\n for newlines, \\t for tabs, \\" for quotes + 5. 
NO literal newlines in the JSON string + + CORRECT JSON FORMAT EXAMPLE: + {{ + "file1.py": "# Comment\\nclass MyClass:\\n def __init__(self):\\n pass\\n", + "file2.py": "import os\\n\\ndef main():\\n print('Hello')\\n" + }} + + Initial Implementation Plan Context: + {initial_plan_content} + + Previous Implementation Memory: + {memory_content} + + **All Previously Implemented Files:** + {implemented_files_list} + + **Current Status:** {files_implemented} files implemented + + IMPLEMENTATION REQUIREMENTS: + - Create functional code for each file + - Use proper Python syntax and imports + - Include docstrings and comments + - Follow the existing patterns from memory + + Files to implement: {files_to_implement} + + Call write_multiple_files NOW with PROPERLY ESCAPED JSON containing all {len(files_to_implement)} files.""" + + concise_messages.append({"role": "user", "content": task_message}) + + # Debug output for files to implement + print("✅ Files to implement:") + for file_path in files_to_implement: + print(f"{file_path}") + + return concise_messages + + def _calculate_message_statistics( + self, messages: List[Dict[str, Any]], label: str + ) -> Dict[str, Any]: + """ + Calculate statistics for a message list + + Args: + messages: List of messages to analyze + label: Label for logging + + Returns: + Dictionary with statistics + """ + total_chars = 0 + total_words = 0 + + for msg in messages: + content = msg.get("content", "") + total_chars += len(content) + total_words += len(content.split()) + + # Estimate tokens (rough approximation: ~4 characters per token) + estimated_tokens = total_chars // 4 + + stats = { + "message_count": len(messages), + "total_characters": total_chars, + "total_words": total_words, + "estimated_tokens": estimated_tokens, + "summary": f"{len(messages)} msgs, {total_chars:,} chars, ~{estimated_tokens:,} tokens", + } + + return stats + + def _calculate_memory_savings( + self, original_stats: Dict[str, Any], optimized_stats: Dict[str, Any] + ) -> Dict[str, Any]: + """ + Calculate memory savings between original and optimized messages + + Args: + original_stats: Statistics for original messages + optimized_stats: Statistics for optimized messages + + Returns: + Dictionary with savings calculations + """ + messages_saved = ( + original_stats["message_count"] - optimized_stats["message_count"] + ) + chars_saved = ( + original_stats["total_characters"] - optimized_stats["total_characters"] + ) + tokens_saved_estimate = ( + original_stats["estimated_tokens"] - optimized_stats["estimated_tokens"] + ) + + # Calculate percentages (avoid division by zero) + messages_saved_percent = ( + messages_saved / max(original_stats["message_count"], 1) + ) * 100 + chars_saved_percent = ( + chars_saved / max(original_stats["total_characters"], 1) + ) * 100 + tokens_saved_percent = ( + tokens_saved_estimate / max(original_stats["estimated_tokens"], 1) + ) * 100 + + return { + "messages_saved": messages_saved, + "chars_saved": chars_saved, + "tokens_saved_estimate": tokens_saved_estimate, + "messages_saved_percent": messages_saved_percent, + "chars_saved_percent": chars_saved_percent, + "tokens_saved_percent": tokens_saved_percent, + } + + def _read_code_knowledge_base(self) -> Optional[str]: + """ + Read the implement_code_summary.md file as code knowledge base + Returns only the final/latest implementation entry, not all historical entries + + Returns: + Content of the latest implementation entry if it exists, None otherwise + """ + try: + if os.path.exists(self.code_summary_path): + with 
open(self.code_summary_path, "r", encoding="utf-8") as f:
+                    content = f.read().strip()
+                    # Keep only the latest entry, as documented above
+                    return self._extract_latest_implementation_entry(content)
+            else:
+                return None
+
+        except Exception as e:
+            self.logger.error(f"Failed to read code knowledge base: {e}")
+            return None
+
+    def _extract_latest_implementation_entry(self, content: str) -> Optional[str]:
+        """
+        Extract the latest/final implementation entry from the implement_code_summary.md content
+        Uses a simpler approach to find the last implementation section
+
+        Args:
+            content: Full content of implement_code_summary.md
+
+        Returns:
+            Latest implementation entry content, or None if not found
+        """
+        try:
+            import re
+
+            # Pattern to match the start of implementation sections
+            section_pattern = r"={80}\s*\n## IMPLEMENTATION File .+?"
+
+            # Find all implementation section starts
+            matches = list(re.finditer(section_pattern, content))
+
+            if not matches:
+                # No implementation sections found
+                lines = content.split("\n")
+                fallback_content = (
+                    "\n".join(lines[:10]) + "\n... (truncated for brevity)"
+                    if len(lines) > 10
+                    else content
+                )
+                self.logger.info(
+                    "📖 No implementation sections found, using fallback content"
+                )
+                return fallback_content
+
+            # Get the start position of the last implementation section
+            last_match = matches[-1]
+            start_pos = last_match.start()
+
+            # Take everything from the last section start to the end of content
+            latest_entry = content[start_pos:].strip()
+
+            return latest_entry
+
+        except Exception as e:
+            self.logger.error(f"Failed to extract latest implementation entry: {e}")
+            # Return the last 500 characters as a fallback
+            return content[-500:] if len(content) > 500 else content
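+
+    # Illustrative sketch (not executed): given a summary file holding two
+    # accumulated entries, only the last block is returned. The sample content
+    # below is hypothetical:
+    #
+    #     content = (
+    #         "# Code Implementation Progress Summary\n"
+    #         + "=" * 80 + "\n## IMPLEMENTATION File a.py\n" + "=" * 80 + "\n(old)\n"
+    #         + "=" * 80 + "\n## IMPLEMENTATION File b.py\n" + "=" * 80 + "\n(new)\n"
+    #     )
+    #     agent._extract_latest_implementation_entry(content)
+    #     # -> everything from the separator before "## IMPLEMENTATION File b.py"
+    #     #    through the end of the string
+
+    def _format_tool_results(self) -> str:
+        """
+        Format current round tool results for LLM input
+
+        Returns:
+            Formatted string of tool results
+        """
+        if not self.current_round_tool_results:
+            return "No tool results in current round."
+
+        formatted_results = []
+
+        for result in self.current_round_tool_results:
+            tool_name = result["tool_name"]
+            tool_input = result["tool_input"]
+            tool_result = result["tool_result"]
+
+            # Format based on tool type
+            if tool_name == "read_multiple_files":
+                file_requests = tool_input.get("file_requests", "unknown")
+                formatted_results.append(f"""
+**read_multiple_files Result for {file_requests}:**
+{self._format_tool_result_content(tool_result)}
+""")
+            elif tool_name == "write_multiple_files":
+                formatted_results.append(f"""
+**write_multiple_files Result for batch:**
+{self._format_tool_result_content(tool_result)}
+""")
+            elif tool_name == "execute_python":
+                code_snippet = (
+                    tool_input.get("code", "")[:50] + "..." 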
+                    if len(tool_input.get("code", "")) > 50
+                    else tool_input.get("code", "")
+                )
+                formatted_results.append(f"""
+**execute_python Result (code: {code_snippet}):**
+{self._format_tool_result_content(tool_result)}
+""")
+            elif tool_name == "execute_bash":
+                command = tool_input.get("command", "unknown")
+                formatted_results.append(f"""
+**execute_bash Result (command: {command}):**
+{self._format_tool_result_content(tool_result)}
+""")
+            elif tool_name == "search_code":
+                pattern = tool_input.get("pattern", "unknown")
+                file_pattern = tool_input.get("file_pattern", "")
+                formatted_results.append(f"""
+**search_code Result (pattern: {pattern}, files: {file_pattern}):**
+{self._format_tool_result_content(tool_result)}
+""")
+            elif tool_name == "search_reference_code":
+                target_file = tool_input.get("target_file", "unknown")
+                keywords = tool_input.get("keywords", "")
+                formatted_results.append(f"""
+**search_reference_code Result for {target_file} (keywords: {keywords}):**
+{self._format_tool_result_content(tool_result)}
+""")
+            elif tool_name == "get_file_structure":
+                directory = tool_input.get(
+                    "directory_path", tool_input.get("path", "current")
+                )
+                formatted_results.append(f"""
+**get_file_structure Result for {directory}:**
+{self._format_tool_result_content(tool_result)}
+""")
+
+        return "\n".join(formatted_results)
+
+    def _format_tool_result_content(self, tool_result: Any) -> str:
+        """
+        Format tool result content for display
+
+        Args:
+            tool_result: Tool result to format
+
+        Returns:
+            Formatted string representation
+        """
+        if isinstance(tool_result, str):
+            # Try to parse as JSON for better formatting
+            try:
+                result_data = json.loads(tool_result)
+                if isinstance(result_data, dict):
+                    # Pretty-print dict payloads (success and error alike)
+                    return json.dumps(result_data, indent=2)
+                else:
+                    return str(result_data)
+            except json.JSONDecodeError:
+                return tool_result
+        else:
+            return str(tool_result)
+
+    def get_memory_statistics(
+        self, all_files: List[str] = None, implemented_files: List[str] = None
+    ) -> Dict[str, Any]:
+        """
+        Get memory agent statistics for multi-file operations
+
+        Args:
+            all_files: List of all files that should be implemented (from workflow)
+            implemented_files: List of all implemented files (from workflow)
+        """
+        if all_files is None:
+            all_files = []
+        if implemented_files is None:
+            implemented_files = []
+
+        # Calculate unimplemented files from workflow data
+        unimplemented_files = [f for f in all_files if f not in implemented_files]
+
+        return {
+            "last_write_multiple_files_detected": self.last_write_multiple_files_detected,
+            "should_clear_memory_next": self.should_clear_memory_next,
+            "current_round": self.current_round,
+            "concise_mode_active": self.should_use_concise_mode(),
+            "current_round_tool_results": len(self.current_round_tool_results),
+            "essential_tools_recorded": [
+                r["tool_name"] for r in self.current_round_tool_results
+            ],
+            # File tracking statistics (from workflow)
+            "total_files_in_plan": len(all_files),
+            "files_implemented_count": len(implemented_files),
+            "files_remaining_count": len(unimplemented_files),
+            "all_files_list": all_files.copy(),
+            "implemented_files_list": implemented_files.copy(),
+            "unimplemented_files_list": unimplemented_files,
+            "implementation_progress_percent": (
+                len(implemented_files) / len(all_files) * 100
+            )
+            if all_files
+            else 0,
+            # Multi-file support statistics
+            "max_files_per_batch": self.max_files_per_batch,
+            
"multi_file_support": True, + "single_file_support": False, # Explicitly disabled + } + + def record_multi_file_implementation(self, file_implementations: Dict[str, str]): + """ + Record multi-file implementation (for compatibility with workflow) + NOTE: This method doesn't track files internally - workflow manages file tracking + + Args: + file_implementations: Dictionary mapping file_path to implementation_content + """ + self.logger.info( + f"📝 Recorded multi-file implementation batch: {len(file_implementations)} files" + ) + # Note: We don't track files internally anymore - workflow handles this + + # ===== ENHANCED MEMORY SYNCHRONIZATION METHODS (Phase 4+) ===== + + async def synchronize_revised_file_memory( + self, + client, + client_type: str, + revised_file_path: str, + diff_content: str, + new_content: str, + revision_type: str = "targeted_fix", + ) -> str: + """ + Synchronize memory for a single revised file with diff information + + Args: + client: LLM client instance + client_type: Type of LLM client ("anthropic" or "openai") + revised_file_path: Path of the revised file + diff_content: Unified diff showing changes made + new_content: Complete new content of the file + revision_type: Type of revision ("targeted_fix", "comprehensive_revision", etc.) + + Returns: + Updated memory summary for the revised file + """ + try: + self.logger.info( + f"🔄 Synchronizing memory for revised file: {revised_file_path}" + ) + + # Create revision-specific summary prompt + revision_prompt = self._create_file_revision_summary_prompt( + revised_file_path, diff_content, new_content, revision_type + ) + + summary_messages = [{"role": "user", "content": revision_prompt}] + + # Get LLM-generated revision summary + llm_response = await self._call_llm_for_summary( + client, client_type, summary_messages + ) + llm_summary = llm_response.get("content", "") + + # Extract summary sections + revision_sections = self._extract_revision_summary_sections(llm_summary) + + # Format revision summary + formatted_summary = self._format_file_revision_summary( + revised_file_path, revision_sections, diff_content, revision_type + ) + + # Save the revision summary (replace old summary) + await self._save_revised_file_summary(formatted_summary, revised_file_path) + + self.logger.info( + f"✅ Memory synchronized for revised file: {revised_file_path}" + ) + + return formatted_summary + + except Exception as e: + self.logger.error( + f"Failed to synchronize memory for revised file {revised_file_path}: {e}" + ) + + # Fallback to simple revision summary + return self._create_fallback_revision_summary( + revised_file_path, revision_type + ) + + async def synchronize_multiple_revised_files( + self, client, client_type: str, revision_results: List[Dict[str, Any]] + ) -> Dict[str, str]: + """ + Synchronize memory for multiple revised files based on revision results + + Args: + client: LLM client instance + client_type: Type of LLM client + revision_results: List of revision results with file paths, diffs, and new content + + Returns: + Dictionary mapping file paths to updated memory summaries + """ + try: + self.logger.info( + f"🔄 Synchronizing memory for {len(revision_results)} revised files" + ) + + synchronized_summaries = {} + + for revision_result in revision_results: + file_path = revision_result.get("file_path", "") + diff_content = revision_result.get("diff", "") + new_content = revision_result.get("new_content", "") + revision_type = revision_result.get("revision_type", "targeted_fix") + + if file_path and 
revision_result.get("success", False): + summary = await self.synchronize_revised_file_memory( + client, + client_type, + file_path, + diff_content, + new_content, + revision_type, + ) + synchronized_summaries[file_path] = summary + else: + self.logger.warning( + f"⚠️ Skipping memory sync for failed revision: {file_path}" + ) + + self.logger.info( + f"✅ Memory synchronized for {len(synchronized_summaries)} successfully revised files" + ) + + return synchronized_summaries + + except Exception as e: + self.logger.error( + f"Failed to synchronize memory for multiple revised files: {e}" + ) + return {} + + def _create_file_revision_summary_prompt( + self, file_path: str, diff_content: str, new_content: str, revision_type: str + ) -> str: + """ + Create prompt for LLM to generate file revision summary + + Args: + file_path: Path of the revised file + diff_content: Unified diff showing changes + new_content: Complete new content of the file + revision_type: Type of revision performed + + Returns: + Prompt for LLM revision summarization + """ + # Truncate content if too long for prompt + content_preview = ( + new_content[:2000] + "..." if len(new_content) > 2000 else new_content + ) + diff_preview = ( + diff_content[:1000] + "..." if len(diff_content) > 1000 else diff_content + ) + + prompt = f"""You are an expert code revision summarizer. A file has been REVISED with targeted changes. Create a structured summary of the revision. + +**File Revised**: {file_path} +**Revision Type**: {revision_type} + +**Changes Made (Diff):** +```diff +{diff_preview} +``` + +**Updated File Content:** +```python +{content_preview} +``` + +**Required Summary Format:** + +**Revision Summary**: +- Brief description of what was changed and why + +**Changes Made**: +- Specific modifications applied (line-level changes) +- Functions/classes affected +- New functionality added or bugs fixed + +**Impact Assessment**: +- How the changes affect the file's behavior +- Dependencies that might be affected +- Integration points that need attention + +**Quality Improvements**: +- Code quality enhancements made +- Error handling improvements +- Performance or maintainability gains + +**Post-Revision Status**: +- Current functionality of the file +- Key interfaces and exports +- Dependencies and imports + +**Instructions:** +- Focus on the CHANGES made, not just the final state +- Highlight the specific improvements and fixes applied +- Be concise but comprehensive about the revision impact +- Use the exact format specified above + +**Summary:**""" + + return prompt + + def _extract_revision_summary_sections(self, llm_summary: str) -> Dict[str, str]: + """ + Extract different sections from LLM-generated revision summary + + Args: + llm_summary: Raw LLM response containing revision summary + + Returns: + Dictionary with extracted sections + """ + sections = { + "revision_summary": "", + "changes_made": "", + "impact_assessment": "", + "quality_improvements": "", + "post_revision_status": "", + } + + try: + lines = llm_summary.split("\n") + current_section = None + current_content = [] + + for line in lines: + line_lower = line.lower().strip() + original_line = line.strip() + + # Skip empty lines + if not original_line: + if current_section: + current_content.append(line) + continue + + # Section detection + section_matched = False + + if "revision summary" in line_lower and "**" in original_line: + if current_section and current_content: + sections[current_section] = "\n".join(current_content).strip() + current_section = 
"revision_summary" + current_content = [] + section_matched = True + elif "changes made" in line_lower and "**" in original_line: + if current_section and current_content: + sections[current_section] = "\n".join(current_content).strip() + current_section = "changes_made" + current_content = [] + section_matched = True + elif "impact assessment" in line_lower and "**" in original_line: + if current_section and current_content: + sections[current_section] = "\n".join(current_content).strip() + current_section = "impact_assessment" + current_content = [] + section_matched = True + elif "quality improvements" in line_lower and "**" in original_line: + if current_section and current_content: + sections[current_section] = "\n".join(current_content).strip() + current_section = "quality_improvements" + current_content = [] + section_matched = True + elif "post-revision status" in line_lower and "**" in original_line: + if current_section and current_content: + sections[current_section] = "\n".join(current_content).strip() + current_section = "post_revision_status" + current_content = [] + section_matched = True + + # If no section header matched, add to current content + if not section_matched and current_section: + current_content.append(line) + + # Save the final section + if current_section and current_content: + sections[current_section] = "\n".join(current_content).strip() + + self.logger.info( + f"📋 Extracted {len([s for s in sections.values() if s])} revision summary sections" + ) + + except Exception as e: + self.logger.error(f"Failed to extract revision summary sections: {e}") + # Provide fallback content + sections["revision_summary"] = "File revision completed" + sections["changes_made"] = ( + "Targeted changes applied based on error analysis" + ) + sections["impact_assessment"] = ( + "Changes should improve code functionality and reduce errors" + ) + sections["quality_improvements"] = ( + "Code quality enhanced through targeted fixes" + ) + sections["post_revision_status"] = "File functionality updated and improved" + + return sections + + def _format_file_revision_summary( + self, + file_path: str, + revision_sections: Dict[str, str], + diff_content: str, + revision_type: str, + ) -> str: + """ + Format the revision summary into the final structure + + Args: + file_path: Path of the revised file + revision_sections: Extracted sections from LLM summary + diff_content: Unified diff content + revision_type: Type of revision performed + + Returns: + Formatted revision summary + """ + timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S") + + # Format sections with fallbacks + revision_summary = revision_sections.get( + "revision_summary", "File revision completed" + ) + changes_made = revision_sections.get("changes_made", "Targeted changes applied") + impact_assessment = revision_sections.get( + "impact_assessment", "Changes should improve functionality" + ) + quality_improvements = revision_sections.get( + "quality_improvements", "Code quality enhanced" + ) + post_revision_status = revision_sections.get( + "post_revision_status", "File updated successfully" + ) + + formatted_summary = f"""# File Revision Summary (UPDATED) +**Generated**: {timestamp} +**File Revised**: {file_path} +**Revision Type**: {revision_type} + +## Revision Summary +{revision_summary} + +## Changes Made +{changes_made} + +## Impact Assessment +{impact_assessment} + +## Quality Improvements +{quality_improvements} + +## Post-Revision Status +{post_revision_status} + +## Technical Details +**Diff Applied:** 
+```diff
+{diff_content[:500]}{"..." if len(diff_content) > 500 else ""}
+```
+
+---
+*Auto-generated by Enhanced Memory Agent (Revision Mode)*
+"""
+        return formatted_summary
+
+    def _create_fallback_revision_summary(
+        self, file_path: str, revision_type: str
+    ) -> str:
+        """
+        Create fallback revision summary when LLM is unavailable
+
+        Args:
+            file_path: Path of the revised file
+            revision_type: Type of revision performed
+
+        Returns:
+            Fallback revision summary
+        """
+        timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+
+        fallback_summary = f"""# File Revision Summary (UPDATED)
+**Generated**: {timestamp}
+**File Revised**: {file_path}
+**Revision Type**: {revision_type}
+
+## Revision Summary
+File has been revised with targeted changes. LLM summary generation failed.
+
+## Changes Made
+- Targeted modifications applied based on error analysis
+- Specific line-level changes implemented
+- Code functionality updated
+
+## Impact Assessment
+- File behavior should be improved
+- Error conditions addressed
+- Integration points maintained
+
+## Quality Improvements
+- Code quality enhanced through precise fixes
+- Error handling improved
+- Maintainability increased
+
+## Post-Revision Status
+- File successfully updated
+- Functionality preserved and enhanced
+- Ready for integration testing
+
+---
+*Auto-generated by Enhanced Memory Agent (Revision Fallback Mode)*
+"""
+        return fallback_summary
+
+    async def _save_revised_file_summary(self, revision_summary: str, file_path: str):
+        """
+        Save or update the revision summary for a file (replaces old summary)
+
+        Args:
+            revision_summary: New revision summary content
+            file_path: Path of the file for which the summary was generated
+        """
+        try:
+            # For revised files, we replace the existing summary rather than append
+            # Read existing content to find and replace the specific file's summary
+            file_exists = os.path.exists(self.code_summary_path)
+
+            if file_exists:
+                with open(self.code_summary_path, "r", encoding="utf-8") as f:
+                    existing_content = f.read()
+
+                # Look for existing summary for this file and replace it
+                import re
+
+                # Pattern to match existing implementation section for this file.
+                # NOTE: inside an f-string the regex quantifier must be written as
+                # {{80}} so the regex engine receives {80}; the optional
+                # " (REVISED)" suffix lets repeated revisions match the header
+                # this method writes instead of appending duplicate sections.
+                file_pattern = re.escape(file_path)
+                section_pattern = rf"={{80}}\s*\n## IMPLEMENTATION File {file_pattern}(?: \(REVISED\))?\n={{80}}.*?(?=\n={{80}}|\Z)"
+
+                # Check if this file already has a summary
+                if re.search(section_pattern, existing_content, re.DOTALL):
+                    # Replace existing summary
+                    new_section = f"\n{'=' * 80}\n## IMPLEMENTATION File {file_path} (REVISED)\n{'=' * 80}\n\n{revision_summary}\n\n"
+                    updated_content = re.sub(
+                        section_pattern,
+                        new_section.strip(),
+                        existing_content,
+                        flags=re.DOTALL,
+                    )
+
+                    with open(self.code_summary_path, "w", encoding="utf-8") as f:
+                        f.write(updated_content)
+
+                    self.logger.info(
+                        f"Updated existing summary for revised file: {file_path}"
+                    )
+                else:
+                    # Append new summary for this file
+                    with open(self.code_summary_path, "a", encoding="utf-8") as f:
+                        f.write("\n" + "=" * 80 + "\n")
+                        f.write(f"## IMPLEMENTATION File {file_path} (REVISED)\n")
+                        f.write("=" * 80 + "\n\n")
+                        f.write(revision_summary)
+                        f.write("\n\n")
+
+                    self.logger.info(
+                        f"Appended new summary for revised file: {file_path}"
+                    )
+            else:
+                # Create new file with header
+                os.makedirs(os.path.dirname(self.code_summary_path), exist_ok=True)
+
+                with open(self.code_summary_path, "w", encoding="utf-8") as f:
+                    f.write("# Code Implementation Progress Summary\n")
+                    f.write("*Accumulated implementation progress for all files*\n\n")
+                    f.write("\n" + 
"=" * 80 + "\n") + f.write(f"## IMPLEMENTATION File {file_path} (REVISED)\n") + f.write("=" * 80 + "\n\n") + f.write(revision_summary) + f.write("\n\n") + + self.logger.info( + f"Created new summary file with revised file: {file_path}" + ) + + except Exception as e: + self.logger.error( + f"Failed to save revised file summary for {file_path}: {e}" + ) + + def get_revision_memory_statistics( + self, revised_files: List[str] + ) -> Dict[str, Any]: + """ + Get memory statistics for revised files + + Args: + revised_files: List of file paths that have been revised + + Returns: + Dictionary with revision memory statistics + """ + try: + total_revisions = len(revised_files) + + # Count how many files have updated summaries + summaries_updated = 0 + if os.path.exists(self.code_summary_path): + with open(self.code_summary_path, "r", encoding="utf-8") as f: + content = f.read() + + for file_path in revised_files: + if f"File {file_path} (REVISED)" in content: + summaries_updated += 1 + + return { + "total_revised_files": total_revisions, + "summaries_updated": summaries_updated, + "memory_sync_rate": (summaries_updated / total_revisions * 100) + if total_revisions > 0 + else 0, + "revised_files_list": revised_files.copy(), + "memory_summary_path": self.code_summary_path, + "revision_memory_mode": "active", + } + + except Exception as e: + self.logger.error(f"Failed to get revision memory statistics: {e}") + return { + "total_revised_files": len(revised_files), + "summaries_updated": 0, + "memory_sync_rate": 0, + "revised_files_list": revised_files.copy(), + "memory_summary_path": self.code_summary_path, + "revision_memory_mode": "error", + } diff --git a/workflows/agents/requirement_analysis_agent.py b/workflows/agents/requirement_analysis_agent.py new file mode 100644 index 00000000..79bab67c --- /dev/null +++ b/workflows/agents/requirement_analysis_agent.py @@ -0,0 +1,410 @@ +""" +User Requirement Analysis Agent + +Responsible for analyzing user initial requirements, generating guiding questions, +and summarizing detailed requirement documents based on user responses. +This Agent seamlessly integrates with existing chat workflows to provide more precise requirement understanding. +""" + +import json +import logging +from typing import Dict, List, Optional + +from mcp_agent.agents.agent import Agent +from utils.llm_utils import get_preferred_llm_class + + +class RequirementAnalysisAgent: + """ + User Requirement Analysis Agent + + Core Functions: + 1. Generate 5-8 guiding questions based on user initial requirements + 2. Collect user responses and analyze requirement completeness + 3. Generate detailed requirement documents for subsequent workflows + 4. 
Support skipping questions to directly enter implementation process
+
+    Design Philosophy:
+    - Intelligent question generation covering functionality, technology, performance, UI, deployment dimensions
+    - Flexible user interaction supporting partial answers or complete skipping
+    - Structured requirement output for easy understanding by code generation agents
+    """
+
+    def __init__(self, logger: Optional[logging.Logger] = None):
+        """
+        Initialize requirement analysis agent
+        Args:
+            logger: Logger instance
+        """
+        self.logger = logger or self._create_default_logger()
+        self.mcp_agent = None
+        self.llm = None
+
+    def _create_default_logger(self) -> logging.Logger:
+        """Create default logger"""
+        logger = logging.getLogger(f"{__name__}.RequirementAnalysisAgent")
+        logger.setLevel(logging.INFO)
+        return logger
+
+    async def __aenter__(self):
+        """Async context manager entry"""
+        await self.initialize()
+        return self
+
+    async def __aexit__(self, exc_type, exc_val, exc_tb):
+        """Async context manager exit"""
+        await self.cleanup()
+
+    async def initialize(self):
+        """Initialize MCP Agent connection and LLM"""
+        try:
+            self.mcp_agent = Agent(
+                name="RequirementAnalysisAgent",
+                instruction="""You are a professional requirement analysis expert, skilled at guiding users to provide more detailed project requirements through precise questions.
+
+Your core capabilities:
+1. **Intelligent Question Generation**: Based on user initial descriptions, generate 5-8 key questions covering functional requirements, technology selection, performance requirements, user interface, deployment environment, etc.
+2. **Requirement Understanding Analysis**: Deep analysis of user's real intentions and implicit requirements
+3. **Structured Requirement Output**: Integrate scattered requirement information into clear technical specification documents
+
+Question Generation Principles:
+- Questions should be specific and clear, avoiding overly broad scope
+- Cover key decision points for technical implementation
+- Consider project feasibility and complexity
+- Help users think about important details they might have missed
+
+Requirement Summary Principles:
+- Maintain user's original intent unchanged
+- Supplement key information for technical implementation
+- Provide clear functional module division
+- Give reasonable technical architecture suggestions""",
+                server_names=[],  # No MCP servers needed, only use LLM
+            )
+
+            # Initialize agent context
+            await self.mcp_agent.__aenter__()
+
+            # Attach LLM
+            self.llm = await self.mcp_agent.attach_llm(get_preferred_llm_class())
+
+            self.logger.info("RequirementAnalysisAgent initialized successfully")
+
+        except Exception as e:
+            self.logger.error(f"RequirementAnalysisAgent initialization failed: {e}")
+            raise
+
+    async def cleanup(self):
+        """Clean up resources"""
+        if self.mcp_agent:
+            try:
+                await self.mcp_agent.__aexit__(None, None, None)
+            except Exception as e:
+                self.logger.warning(f"Error during resource cleanup: {e}")
+
+    async def generate_guiding_questions(self, user_input: str) -> List[Dict[str, str]]:
+        """
+        Generate guiding questions based on user initial requirements
+
+        Args:
+            user_input: User's initial requirement description
+
+        Returns:
+            List[Dict]: Question list, each question contains category, question, importance and other fields
+        """
+        try:
+            self.logger.info("Starting to generate AI precise guiding questions")
+
+            # Build more precise prompt
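+            # Illustrative sketch (not executed): the JSON extraction further
+            # below expects the model to return a bare array in this shape
+            # (values are hypothetical):
+            #
+            #     [
+            #         {
+            #             "category": "Functional Requirements",
+            #             "question": "Does the tool need offline support?",
+            #             "importance": "High",
+            #             "hint": "Drives storage and sync design"
+            #         }
+            #     ]
+            prompt = f"""Based on user's project requirements, generate precise guiding questions to help refine 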
requirements. + +User Requirements: {user_input} + +Please analyze user requirements and generate 1-3 most critical targeted questions focusing on the most important aspects for this specific project + +Return format (pure JSON array, no other text): +[ + {{ + "category": "Functional Requirements", + "question": "Specific question content", + "importance": "High", + "hint": "Question hint" + }} +] + +Requirements: Questions should be specific and practical, avoiding general discussions.""" + + from mcp_agent.workflows.llm.augmented_llm import RequestParams + + params = RequestParams( + max_tokens=3000, + temperature=0.5, # Lower temperature for more stable JSON output + ) + + self.logger.info( + f"Calling LLM to generate precise questions, input length: {len(user_input)}" + ) + + result = await self.llm.generate_str(message=prompt, request_params=params) + + self.logger.info( + f"LLM returned result length: {len(result) if result else 0}" + ) + + if not result or not result.strip(): + self.logger.error("LLM returned empty result") + raise ValueError("LLM returned empty result") + + self.logger.info(f"LLM returned result: {result[:500]}...") + + # Clean result and extract JSON part + result_cleaned = result.strip() + + # Try to find JSON array + import re + + json_pattern = r"\[\s*\{.*?\}\s*\]" + json_match = re.search(json_pattern, result_cleaned, re.DOTALL) + + if json_match: + json_str = json_match.group() + self.logger.info(f"Extracted JSON: {json_str[:200]}...") + else: + # If complete JSON not found, try direct parsing + json_str = result_cleaned + + # Parse JSON result + try: + questions = json.loads(json_str) + if isinstance(questions, list) and len(questions) > 0: + self.logger.info( + f"✅ Successfully generated {len(questions)} AI precise guiding questions" + ) + return questions + else: + raise ValueError("Returned result is not a valid question list") + + except json.JSONDecodeError as e: + self.logger.error(f"JSON parsing failed: {e}") + self.logger.error(f"Original result: {result}") + + # Try more lenient JSON extraction + lines = result.split("\n") + json_lines = [] + in_json = False + + for line in lines: + if "[" in line: + in_json = True + if in_json: + json_lines.append(line) + if "]" in line and in_json: + break + + if json_lines: + try: + json_attempt = "\n".join(json_lines) + questions = json.loads(json_attempt) + if isinstance(questions, list) and len(questions) > 0: + self.logger.info( + f"✅ Generated {len(questions)} questions through lenient parsing" + ) + return questions + except Exception: + pass + + # If JSON parsing fails, raise an error + self.logger.error("JSON parsing completely failed") + raise ValueError("Failed to parse AI generated questions") + + except Exception as e: + self.logger.error(f"Failed to generate guiding questions: {e}") + # Re-raise the exception instead of falling back to default questions + raise + + async def summarize_detailed_requirements( + self, initial_input: str, answers: Dict[str, str] + ) -> str: + """ + Generate detailed requirement document based on initial input and user answers + + Args: + initial_input: User's initial requirement description + answers: User's answer dictionary {question_id: answer} + + Returns: + str: Detailed requirement document + """ + try: + self.logger.info("Starting to generate AI detailed requirement summary") + + # Build answer content + answers_text = "" + if answers: + for question_id, answer in answers.items(): + if answer and answer.strip(): + answers_text += f"• {answer}\n" + + if not 
answers_text: + answers_text = "User chose to skip questions, generating based on initial requirements" + + prompt = f"""Based on user requirements and responses, generate a concise project requirement document. + +Initial Requirements: {initial_input} + +Additional Information: +{answers_text} + +Please generate a focused requirement document including: + +## Project Overview +Brief description of project's core goals and value proposition + +## Functional Requirements +Detailed list of required features and functional modules: +- Core functionalities +- User interactions and workflows +- Data processing requirements +- Integration needs + +## Technical Architecture +Recommended technical design including: +- Technology stack and frameworks +- System architecture design +- Database and data storage solutions +- API design considerations +- Security requirements + +## Performance & Scalability +- Expected user scale and performance requirements +- Scalability considerations and constraints + +Requirements: Focus on what needs to be built and how to build it technically. Be concise but comprehensive - avoid unnecessary implementation details.""" + + from mcp_agent.workflows.llm.augmented_llm import RequestParams + + params = RequestParams(max_tokens=4000, temperature=0.3) + + self.logger.info( + f"Calling LLM to generate requirement summary, initial requirement length: {len(initial_input)}" + ) + + result = await self.llm.generate_str(message=prompt, request_params=params) + + if not result or not result.strip(): + self.logger.error("LLM returned empty requirement summary") + raise ValueError("LLM returned empty requirement summary") + + self.logger.info( + f"✅ Requirement summary generation completed, length: {len(result)}" + ) + return result.strip() + + except Exception as e: + self.logger.error(f"Requirement summary failed: {e}") + # Return basic requirement document + return f"""## Project Overview +Based on user requirements: {initial_input} + +## Functional Requirements +Core functionality needed: {initial_input} + +## Technical Architecture +- Select appropriate technology stack based on project requirements +- Adopt modular architecture design +- Consider database and data storage solutions +- Implement necessary security measures + +## Performance & Scalability +- Design for expected user scale +- Consider scalability and performance requirements + +Note: Due to technical issues, this is a simplified requirement document. Manual supplementation of detailed information is recommended.""" + + async def modify_requirements( + self, current_requirements: str, modification_feedback: str + ) -> str: + """ + Modify existing requirement document based on user feedback + + Args: + current_requirements: Current requirement document content + modification_feedback: User's modification requests and feedback + + Returns: + str: Modified requirement document + """ + try: + self.logger.info("Starting to modify requirements based on user feedback") + + # Build modification prompt + prompt = f"""Based on the current requirement document and user's modification requests, generate an updated requirement document. + +Current Requirements Document: +{current_requirements} + +User's Modification Requests: +{modification_feedback} + +CRITICAL REQUIREMENT: You MUST generate a complete, well-structured requirement document regardless of how complete or incomplete the user's modification requests are. 
Even if the user only provides minimal or unclear feedback, you must still produce a comprehensive requirement document following the exact format below. + +Generate an updated requirement document that incorporates any reasonable interpretation of the user's requested changes while maintaining the EXACT structure and format: + +## Project Overview +Brief description of project's core goals and value proposition + +## Functional Requirements +Detailed list of required features and functional modules: +- Core functionalities +- User interactions and workflows +- Data processing requirements +- Integration needs + +## Technical Architecture +Recommended technical design including: +- Technology stack and frameworks +- System architecture design +- Database and data storage solutions +- API design considerations +- Security requirements + +## Performance & Scalability +- Expected user scale and performance requirements +- Scalability considerations and constraints + +MANDATORY REQUIREMENTS: +1. ALWAYS return a complete document with ALL sections above, regardless of user input completeness +2. If user feedback is unclear or incomplete, make reasonable assumptions based on the current requirements +3. Incorporate any clear user requests while filling in missing details intelligently +4. Maintain consistency and coherence throughout the document +5. Ensure all technical suggestions are feasible and practical +6. NEVER return an incomplete or partial document - always provide full sections +7. Keep the same professional structure and format in all cases""" + + from mcp_agent.workflows.llm.augmented_llm import RequestParams + + params = RequestParams(max_tokens=4000, temperature=0.3) + + self.logger.info( + f"Calling LLM to modify requirements, feedback length: {len(modification_feedback)}" + ) + + result = await self.llm.generate_str(message=prompt, request_params=params) + + if not result or not result.strip(): + self.logger.error("LLM returned empty modified requirements") + raise ValueError("LLM returned empty modified requirements") + + self.logger.info( + f"✅ Requirements modification completed, length: {len(result)}" + ) + return result.strip() + + except Exception as e: + self.logger.error(f"Requirements modification failed: {e}") + # Return current requirements with a note about the modification attempt + return f"""{current_requirements} + +--- +**Note:** Automatic modification failed due to technical issues. The original requirements are shown above. 
Please manually incorporate the following requested changes: + +{modification_feedback}""" diff --git a/workflows/code_implementation_workflow.py b/workflows/code_implementation_workflow.py index c8daecc2..a38f5c96 100644 --- a/workflows/code_implementation_workflow.py +++ b/workflows/code_implementation_workflow.py @@ -11,6 +11,7 @@ - Configuration: mcp_agent.config.yaml """ +import asyncio import json import logging import os @@ -288,7 +289,7 @@ async def _pure_code_implementation_loop( target_directory, ): """Pure code implementation loop with memory optimization and phase consistency""" - max_iterations = 100 + max_iterations = 500 iteration = 0 start_time = time.time() max_time = 2400 # 40 minutes @@ -419,9 +420,9 @@ async def _pure_code_implementation_loop( keyword in response_content.lower() for keyword in [ "all files implemented", - "implementation complete", "all phases completed", "reproduction plan fully implemented", + "all code of repo implementation complete", ] ): self.logger.info("Code implementation declared complete") @@ -877,3 +878,114 @@ async def _generate_pure_code_final_report_with_concise_agents( except Exception as e: self.logger.error(f"Failed to generate final report: {e}") return f"Failed to generate final report: {str(e)}" + + +async def main(): + """Main function for running the workflow""" + # Configure root logger carefully to avoid duplicates + root_logger = logging.getLogger() + if not root_logger.handlers: + handler = logging.StreamHandler() + formatter = logging.Formatter("%(levelname)s:%(name)s:%(message)s") + handler.setFormatter(formatter) + root_logger.addHandler(handler) + root_logger.setLevel(logging.INFO) + + workflow = CodeImplementationWorkflow() + + print("=" * 60) + print("Code Implementation Workflow with UNIFIED Reference Indexer") + print("=" * 60) + print("Select mode:") + print("1. Test Code Reference Indexer Integration") + print("2. Run Full Implementation Workflow") + print("3. Run Implementation with Pure Code Mode") + print("4. 
Test Read Tools Configuration")
+
+    # mode_choice = input("Enter choice (1-4, default: 3): ").strip()
+
+    # The interactive mode selection and read-tools test are disabled for now;
+    # uncomment this block (and the input() above) to run them manually.
+    # if mode_choice == "4":
+    #     print("Testing Read Tools Configuration...")
+
+    #     # Create a test workflow normally
+    #     test_workflow = CodeImplementationWorkflow()
+
+    #     # Create a mock code agent for testing
+    #     print("\n🧪 Testing with read tools DISABLED:")
+    #     test_agent_disabled = CodeImplementationAgent(None, enable_read_tools=False)
+    #     await test_agent_disabled.test_read_tools_configuration()
+
+    #     print("\n🧪 Testing with read tools ENABLED:")
+    #     test_agent_enabled = CodeImplementationAgent(None, enable_read_tools=True)
+    #     await test_agent_enabled.test_read_tools_configuration()
+
+    #     print("✅ Read tools configuration testing completed!")
+    #     return
+
+    # print("Running Code Reference Indexer Integration Test...")
+
+    test_success = True  # Stubbed: the integration test above is disabled, so assume it passed
+    if test_success:
+        print("\n" + "=" * 60)
+        print("🎉 UNIFIED Code Reference Indexer Integration Test PASSED!")
+        print("🔧 Three-step process successfully merged into ONE tool")
+        print("=" * 60)
+
+        # Proceed directly with the actual workflow run
+        print("\nContinuing with workflow execution...")
+
+        # Hardcoded local paths - adjust these to your own checkout before running
+        plan_file = "/Users/lizongwei/Reasearch/DeepCode_Base/DeepCode_eval_init/deepcode_lab/papers/1/initial_plan.txt"
+        # plan_file = "/data2/bjdwhzzh/project-hku/Code-Agent2.0/Code-Agent/deepcode-mcp/agent_folders/papers/1/initial_plan.txt"
+        target_directory = "/Users/lizongwei/Reasearch/DeepCode_Base/DeepCode_eval_init/deepcode_lab/papers/1/"
+        print("Implementation Mode Selection:")
+        print("1. Pure Code Implementation Mode (Recommended)")
+        print("2. Iterative Implementation Mode")
+
+        pure_code_mode = True
+        mode_name = "Pure Code Implementation Mode with Memory Agent Architecture + Code Reference Indexer"
+        print(f"Using: {mode_name}")
+
+        # Configure read tools - modify this parameter to enable/disable read tools
+        enable_read_tools = (
+            True  # Set to False to disable read_file and read_code_mem tools
+        )
+        read_tools_status = "ENABLED" if enable_read_tools else "DISABLED"
+        print(f"🔧 Read tools (read_file, read_code_mem): {read_tools_status}")
+
+        # NOTE: To test without read tools, change the line above to:
+        # enable_read_tools = False
+
+        result = await workflow.run_workflow(
+            plan_file,
+            target_directory=target_directory,
+            pure_code_mode=pure_code_mode,
+            enable_read_tools=enable_read_tools,
+        )
+
+        print("=" * 60)
+        print("Workflow Execution Results:")
+        print(f"Status: {result['status']}")
+        print(f"Mode: {mode_name}")
+
+        if result["status"] == "success":
+            print(f"Code Directory: {result['code_directory']}")
+            print(f"MCP Architecture: {result.get('mcp_architecture', 'unknown')}")
+            print("Execution completed!")
+        else:
+            print(f"Error Message: {result['message']}")
+
+        print("=" * 60)
+        print(
+            "✅ Using Standard MCP Architecture with Memory Agent + Code Reference Indexer"
+        )
+
+    else:
+        print("\n" + "=" * 60)
+        print("❌ Code Reference Indexer Integration Test FAILED!")
+        print("Please check the configuration and try again.")
+        print("=" * 60)
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
diff --git a/workflows/code_implementation_workflow_index.py b/workflows/code_implementation_workflow_index.py
index 9cd3b261..e782ecc2 100644
--- a/workflows/code_implementation_workflow_index.py
+++ b/workflows/code_implementation_workflow_index.py
@@ -11,6 +11,7 @@
 - Configuration: mcp_agent.config.yaml
 """
+import asyncio
 import json
 import logging
 import os
@@ -289,7 +290,7 @@ 
async def _pure_code_implementation_loop( target_directory, ): """Pure code implementation loop with memory optimization and phase consistency""" - max_iterations = 100 + max_iterations = 500 iteration = 0 start_time = time.time() max_time = 2400 # 40 minutes @@ -381,6 +382,8 @@ async def _pure_code_implementation_loop( if memory_agent.should_trigger_memory_optimization( messages, code_agent.get_files_implemented_count() ): + # Memory optimization triggered + # Apply concise memory optimization files_implemented_count = code_agent.get_files_implemented_count() current_system_message = code_agent.get_system_prompt() @@ -418,9 +421,9 @@ async def _pure_code_implementation_loop( keyword in response_content.lower() for keyword in [ "all files implemented", - "implementation complete", "all phases completed", "reproduction plan fully implemented", + "all code of repo implementation complete", ] ): self.logger.info("Code implementation declared complete") @@ -502,7 +505,7 @@ async def _initialize_llm_client(self): # Test connection with default model from config await client.messages.create( model=self.default_models["anthropic"], - max_tokens=10, + max_tokens=20, messages=[{"role": "user", "content": "test"}], ) self.logger.info( @@ -721,7 +724,7 @@ def _generate_success_guidance(self, files_count: int) -> str: 2. **If MORE files need implementation:** Continue with dependency-aware workflow: - **Start with `read_code_mem`** to understand existing implementations and dependencies - **Optionally use `search_code_references`** for reference patterns (OPTIONAL - use for inspiration only, original paper specs take priority) - - **Then `write_file`** to implement the new component based on original paper requirements + - **Then `write_file`** to implement the new component - **Finally: Test** if needed 💡 **Key Point:** Always verify completion status before continuing with new file creation.""" @@ -738,7 +741,7 @@ def _generate_error_guidance(self) -> str: - **If NO:** Continue with proper development cycle for next file: - **Start with `read_code_mem`** to understand existing implementations - **Optionally use `search_code_references`** for reference patterns (OPTIONAL - for inspiration only) - - **Then `write_file`** to implement properly based on original paper requirements + - **Then `write_file`** to implement properly - **Test** if needed 4. Ensure proper error handling in future implementations @@ -757,7 +760,7 @@ def _generate_no_tools_guidance(self, files_count: int) -> str: 2. **If MORE files need implementation:** Follow the development cycle: - **Start with `read_code_mem`** to understand existing implementations - **Optionally use `search_code_references`** for reference patterns (OPTIONAL - for inspiration only) - - **Then `write_file`** to implement the new component based on original paper requirements + - **Then `write_file`** to implement the new component - **Finally: Test** if needed 🚨 **Critical:** Always verify completion status first, then use appropriate tools - not just explanations!""" @@ -880,4 +883,115 @@ async def _generate_pure_code_final_report_with_concise_agents( self.logger.error(f"Failed to generate final report: {e}") return f"Failed to generate final report: {str(e)}" - # ==================== 8. 
Testing and Debugging (Testing Layer) ====================
+
+async def main():
+    """Main function for running the workflow"""
+    # Configure root logger carefully to avoid duplicates
+    root_logger = logging.getLogger()
+    if not root_logger.handlers:
+        handler = logging.StreamHandler()
+        formatter = logging.Formatter("%(levelname)s:%(name)s:%(message)s")
+        handler.setFormatter(formatter)
+        root_logger.addHandler(handler)
+    root_logger.setLevel(logging.INFO)
+
+    workflow = CodeImplementationWorkflowWithIndex()
+
+    print("=" * 60)
+    print("Code Implementation Workflow with UNIFIED Reference Indexer")
+    print("=" * 60)
+    print("Select mode:")
+    print("1. Test Code Reference Indexer Integration")
+    print("2. Run Full Implementation Workflow")
+    print("3. Run Implementation with Pure Code Mode")
+    print("4. Test Read Tools Configuration")
+
+    # mode_choice = input("Enter choice (1-4, default: 3): ").strip()
+
+    # The interactive mode selection and read-tools test are disabled for now;
+    # uncomment this block (and the input() above) to run them manually.
+    # if mode_choice == "4":
+    #     print("Testing Read Tools Configuration...")
+
+    #     # Create a test workflow normally
+    #     test_workflow = CodeImplementationWorkflowWithIndex()
+
+    #     # Create a mock code agent for testing
+    #     print("\n🧪 Testing with read tools DISABLED:")
+    #     test_agent_disabled = CodeImplementationAgent(None, enable_read_tools=False)
+    #     await test_agent_disabled.test_read_tools_configuration()
+
+    #     print("\n🧪 Testing with read tools ENABLED:")
+    #     test_agent_enabled = CodeImplementationAgent(None, enable_read_tools=True)
+    #     await test_agent_enabled.test_read_tools_configuration()
+
+    #     print("✅ Read tools configuration testing completed!")
+    #     return
+
+    # print("Running Code Reference Indexer Integration Test...")
+
+    test_success = True  # Stubbed: the integration test above is disabled, so assume it passed
+    if test_success:
+        print("\n" + "=" * 60)
+        print("🎉 UNIFIED Code Reference Indexer Integration Test PASSED!")
+        print("🔧 Three-step process successfully merged into ONE tool")
+        print("=" * 60)
+
+        # Proceed directly with the actual workflow run
+        print("\nContinuing with workflow execution...")
+
+        # Hardcoded local paths - adjust these to your own checkout before running
+        plan_file = "/Users/lizongwei/Reasearch/DeepCode_Base/DeepCode/deepcode_lab/papers/1/initial_plan.txt"
+        # plan_file = "/data2/bjdwhzzh/project-hku/Code-Agent2.0/Code-Agent/deepcode-mcp/agent_folders/papers/1/initial_plan.txt"
+        target_directory = (
+            "/Users/lizongwei/Reasearch/DeepCode_Base/DeepCode/deepcode_lab/papers/1/"
+        )
+        print("Implementation Mode Selection:")
+        print("1. Pure Code Implementation Mode (Recommended)")
+        print("2. 
Iterative Implementation Mode") + + pure_code_mode = True + mode_name = "Pure Code Implementation Mode with Memory Agent Architecture + Code Reference Indexer" + print(f"Using: {mode_name}") + + # Configure read tools - modify this parameter to enable/disable read tools + enable_read_tools = ( + True # Set to False to disable read_file and read_code_mem tools + ) + read_tools_status = "ENABLED" if enable_read_tools else "DISABLED" + print(f"🔧 Read tools (read_file, read_code_mem): {read_tools_status}") + + # NOTE: To test without read tools, change the line above to: + # enable_read_tools = False + + result = await workflow.run_workflow( + plan_file, + target_directory=target_directory, + pure_code_mode=pure_code_mode, + enable_read_tools=enable_read_tools, + ) + + print("=" * 60) + print("Workflow Execution Results:") + print(f"Status: {result['status']}") + print(f"Mode: {mode_name}") + + if result["status"] == "success": + print(f"Code Directory: {result['code_directory']}") + print(f"MCP Architecture: {result.get('mcp_architecture', 'unknown')}") + print("Execution completed!") + else: + print(f"Error Message: {result['message']}") + + print("=" * 60) + print( + "✅ Using Standard MCP Architecture with Memory Agent + Code Reference Indexer" + ) + + else: + print("\n" + "=" * 60) + print("❌ Code Reference Indexer Integration Test FAILED!") + print("Please check the configuration and try again.") + print("=" * 60) + + +if __name__ == "__main__": + asyncio.run(main())
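
For reference, here is a minimal sketch of driving the workflow programmatically instead of through the `main()` entry points added above. It assumes only what this diff already shows: `run_workflow(plan_file, target_directory=..., pure_code_mode=..., enable_read_tools=...)` returns a dict containing at least `status`, plus `code_directory` on success and `message` on failure. The import path mirrors the file locations in this diff, and the paths below are placeholders to point at your own plan file and output directory.

```python
import asyncio

# Import path assumed from the file layout in this diff (workflows/...).
from workflows.code_implementation_workflow import CodeImplementationWorkflow


async def run_once() -> None:
    # Placeholder paths - replace with your own plan file and output directory.
    plan_file = "deepcode_lab/papers/1/initial_plan.txt"
    target_directory = "deepcode_lab/papers/1/"

    workflow = CodeImplementationWorkflow()
    # Same call shape as in the main() functions above: pure-code mode with
    # the read_file / read_code_mem tools enabled.
    result = await workflow.run_workflow(
        plan_file,
        target_directory=target_directory,
        pure_code_mode=True,
        enable_read_tools=True,
    )

    if result["status"] == "success":
        print(f"Generated code in: {result['code_directory']}")
    else:
        print(f"Workflow failed: {result['message']}")


if __name__ == "__main__":
    asyncio.run(run_once())
```

To exercise the indexer variant instead, swap in `CodeImplementationWorkflowWithIndex` from `workflows.code_implementation_workflow_index`; the diff gives both classes the same `run_workflow` call shape.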