docs(i18n): update English blog posts
- Updated the translation and content of three English blog posts
- Improved article structure by adding subheadings and lists
- Fixed grammar and spelling errors
- Unified formatting and style
abel committed Apr 23, 2025
commit bbd5a03036c75f9de7688ecf582af562351faa23
124 changes: 63 additions & 61 deletions i18n/en/docusaurus-plugin-content-blog/2023-07-23-autodev-0-7-0.mdx
@@ -1,112 +1,114 @@


---
title: AutoDev 0.7.0 - Generating Standardized Code, Deep Integration into Developers' Daily Work
slug: autodev-0-7-0
hide_table_of_contents: false
---

# Open-source AI Programming Assistant AutoDev 0.7 Released - Generating Standardized Code, Deep Integration into Developers' Daily Work

A few months ago, we set out to explore the question: **how can AIGC be combined with the goal of improving R&D efficiency?** To that end, we open-sourced AutoDev, introduced on GitHub as follows:

> AutoDev is an LLM/AI-assisted programming plugin for JetBrains IDEs. AutoDev can integrate directly with your requirement management system (e.g., Jira, Trello, GitHub Issues). Within the IDE, with a few clicks, AutoDev automatically generates code based on your requirements. All you need to do is review the quality of the generated code.

As we explored the boundaries of LLM capabilities, we discovered some more interesting patterns, and these have been incorporated into AutoDev.

PS: Search for `AutoDev` in the JetBrains plugin marketplace and install it, then configure your LLM (e.g., OpenAI or one of its proxies, an open-source LLM, etc.) to start using it.

## WHY AutoDev? Understanding the Combination of GenAI + Software Development

Regarding generative AI, we still hold views similar to those we shared previously:

1. GenAI can improve efficiency in almost every phase of the R&D process.
2. The gains are most obvious for standardized processes; small teams with loose practices see limited improvement.
3. Because writing prompts takes time, the efficiency gains need to be embedded in tooling.

Therefore, when designing AutoDev, our goals were:

1. End-to-end integration to reduce interaction costs: from writing the prompt, to interacting with the LLM, to copying the result back into the tool.
2. Automatic collection of prompt context for generating content and code.
3. Leaving it to humans to review and fix the AI-generated code at the end.

Thus, manually curated specifications and automatically collected context, both aimed at improving the quality of generated content, are what we set out to explore in the tool.

## New Features in AutoDev 0.7

From the big demo in April to today's new version, we have kept studying the code and implementation logic of GitHub Copilot, JetBrains AI Assistant, Cursor, Bloop, and other IDEs/editors. Each tool has its own unique selling points. Combining them with my daily development habits, we added a series of exploratory new features.

Details on GitHub: https://github.com/unit-mesh/auto-dev

### Feature 1: Built-in Architectural Specifications & **Code Standards**

An LLM's "parrot mode" (its generation mechanism) reproduces code that matches the programming habits in the current context. When using an AI code generation feature such as GitHub Copilot, it generates new API code based on how we handle existing APIs. If our code uses Swagger annotations to define APIs, similar code will be generated in the same Controller.

This also implies a problem: if the existing code was written in a non-standard way, the generated code will be non-standard too. Therefore, we added support in AutoDev for configuring specifications for CRUD template code:

```json
{
  "spec": {
    "controller": "- Use BeanUtils.copyProperties in Controllers to convert DTOs to Entities",
    "service": "- The Service layer should use constructor or setter injection; do not inject with the @Autowired annotation",
    "entity": "- Entity classes should use JPA annotations for database mapping",
    "repository": "- Repository interfaces should extend JpaRepository to get basic CRUD operations",
    "ddl": "- Fields should use NOT NULL constraints to ensure data integrity"
  }
}
```

In some special scenarios, the specification alone is not enough and sample code also needs to be configured. With this configuration in place, the above specifications are applied directly when generating Controller, Service, and other code.
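
For illustration, code shaped by the spec above might look roughly like the following. This is a hand-written sketch, not actual AutoDev output, and the names (`BlogController`, `BlogService`, `BlogEntity`, `CreateBlogDto`) are invented for the example:

```kotlin
import org.springframework.beans.BeanUtils
import org.springframework.stereotype.Service
import org.springframework.web.bind.annotation.PostMapping
import org.springframework.web.bind.annotation.RequestBody
import org.springframework.web.bind.annotation.RequestMapping
import org.springframework.web.bind.annotation.RestController

class CreateBlogDto(var title: String = "", var content: String = "")

// In a real project this class would carry JPA annotations, per the "entity" rule.
class BlogEntity(var title: String = "", var content: String = "")

@Service
class BlogService {
    private val store = mutableListOf<BlogEntity>()   // stand-in for a real repository

    fun save(entity: BlogEntity): BlogEntity {
        store.add(entity)
        return entity
    }
}

@RestController
@RequestMapping("/blog")
class BlogController(private val blogService: BlogService) {   // constructor injection, no @Autowired
    @PostMapping
    fun createBlog(@RequestBody dto: CreateBlogDto): BlogEntity {
        // Per the "controller" rule: convert the DTO into an Entity with BeanUtils.copyProperties.
        val entity = BlogEntity()
        BeanUtils.copyProperties(dto, entity)
        return blogService.save(entity)
    }
}
```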

### Feature 2: Deep Integration into Developers' Daily Programming Activities

In the April release, AutoDev integrated basic programming activities: AI code completion, comment generation, code refactoring, code explanation, and so on.

While developing AutoDev itself, we discovered some more interesting needs and integrated them into the IDE as well:

- One-click commit message generation. When writing a commit message in IDEA's commit UI, a suggested message can be generated with one click.
- One-click changelog generation. Select multiple commits in the history and a CHANGELOG is generated from their messages.
- One-click error analysis. When debugging hits an error, select the error message and it is sent to the LLM together with the offending code for analysis.
- Test code generation.

Together with what AutoDev does best, pulling requirements and performing automatic CRUD, the feature set is now more complete.

### Feature 3: **Multi-language AI Support**

In April, we found that LLMs excel at CRUD, so we chose Java as the test language and scenario and only built automatic CRUD for Java. Languages I have used heavily in recent years, such as Kotlin, Rust, and TypeScript, were not supported, which was not friendly to me.

So, taking the modular structure of IntelliJ Rust as a reference, we reorganized the layers and modules and rebuilt the foundation of the whole application around IntelliJ Plugin extension points (XML + Java).

The new architecture introduces the following extension points (a conceptual sketch follows the list):

- Language data-structure extension point. In the original design this was used to express the code as UML when tokens ran short. We later referenced (copied) the language extension point of JetBrains AI Assistant, so that each language's data structures are implemented in its own extension.
- Language prompt extension point. Languages also differ in their prompts; these differences have been moved into each language's own module.
- Custom CRUD workflow. The existing CRUD implementation was bound to Java language features; since every language has its own way of doing things, this is now left to each language to implement.
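
To make the idea concrete, here is a conceptual sketch of what a per-language extension might look like. It is not AutoDev's actual extension point API; the interface and class names are invented for illustration:

```kotlin
// Conceptual sketch only: per-language extensions for context building and prompts.
interface LanguageContextProvider {
    /** Language ID this extension handles, e.g. "java", "kotlin", "python". */
    val languageId: String

    /** Summarize the data structures around the cursor (e.g. as UML-like text) when tokens are tight. */
    fun structureSummary(filePath: String, source: String): String

    /** Language-specific prompt fragments, kept out of the core module. */
    fun promptPrefix(task: String): String
}

class KotlinContextProvider : LanguageContextProvider {
    override val languageId = "kotlin"

    override fun structureSummary(filePath: String, source: String) =
        "// classes and functions of $filePath, compressed into a UML-like outline"

    override fun promptPrefix(task: String) =
        "You are writing idiomatic Kotlin. Task: $task"
}
```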

Of course, Java/Kotlin support is still the most complete at the moment.

### Feature 4: Broader LLM Support

AutoDev's original design was also built around our second hypothesis: every major company will launch its own LLM. Each LLM has its own characteristics, so we need to support more of them.

- OpenAI and its proxies. Currently the most tested and the most complete.
- Azure OpenAI. As a legal channel for using OpenAI in China, we added preliminary support in an earlier version and have gradually improved it.
- Other LLMs. We have not yet found a suitable domestic LLM API to adapt, but the interfaces already provide for such integration (see the sketch below).
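
As a rough illustration of what such a pluggable provider abstraction can look like, here is a minimal sketch. It is not AutoDev's real interface; the names and the endpoint are assumptions for the example:

```kotlin
// Illustrative sketch of a pluggable LLM provider abstraction.
interface LlmProvider {
    fun complete(prompt: String): String
}

class OpenAiLikeProvider(
    private val baseUrl: String,   // an official endpoint or a proxy, e.g. an Azure OpenAI deployment
    private val apiKey: String,
) : LlmProvider {
    override fun complete(prompt: String): String {
        // A real implementation would POST the prompt to the configured endpoint with the API key;
        // the HTTP call is omitted to keep this sketch self-contained.
        return "completion for: ${prompt.take(40)}..."
    }
}

fun main() {
    val llm: LlmProvider = OpenAiLikeProvider("https://api.openai.com/v1", "sk-...")
    println(llm.complete("Generate a Spring Controller for blog posts"))
}
```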

You are welcome to try it with your own LLM.

### Feature 5: Smarter Prompt Strategies

Back in our May article **[Context Engineering: Real-time Capability Analysis and Reflections Based on GitHub Copilot](https://www.phodal.com/blog/llm-context-engineering/)**, we analyzed GitHub Copilot's prompt strategy in detail. That strategy is built around basic promptElements such as `BeforeCursor`, `AfterCursor`, `SimilarFile`, `ImportedFile`, `LanguageMarker`, `PathMarker`, `RetrievalSnippet`, and so on.

After finding that JetBrains AI Assistant was also trying to build its prompt strategy in a similar way, we took it as a further reference and refined AutoDev's prompt strategy to make it smarter:

- Code context strategies:
  - In Java + CRUD mode, the context is built from related code (BeforeCursor), all the methods it calls, the calling lines, and UML of the related code.
  - In other Java modes, a DtModel is used to build UML-like comments as a reference for the task.
  - For Python, similar code snippets found via imports are turned into comments in the prompt as a reference for the LLM.
- Token allocation strategy: the rest is about allocating a suitable amount of context depending on whether the token limit would be exceeded, as sketched below.
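
Below is a minimal sketch of the token-budget idea: candidate context elements are ranked by priority and added until the budget runs out. The element names echo the promptElements above, but the code, the priorities, and the rough 4-characters-per-token estimate are assumptions for the example, not AutoDev's actual implementation:

```kotlin
// Greedy, priority-ordered context assembly under a token budget.
data class ContextElement(val name: String, val text: String, val priority: Int)

fun estimateTokens(text: String): Int = (text.length + 3) / 4   // rough heuristic

fun buildContext(elements: List<ContextElement>, tokenBudget: Int): String {
    var remaining = tokenBudget
    val selected = mutableListOf<String>()
    for (element in elements.sortedByDescending { it.priority }) {
        val cost = estimateTokens(element.text)
        if (cost <= remaining) {
            selected += element.text
            remaining -= cost
        }
    }
    return selected.joinToString("\n")
}

fun main() {
    val prompt = buildContext(
        listOf(
            ContextElement("BeforeCursor", "fun save(entity: BlogEntity) { ... }", priority = 3),
            ContextElement("SimilarFile", "// BlogController.kt ...", priority = 2),
            ContextElement("LanguageMarker", "// language: kotlin", priority = 1),
        ),
        tokenBudget = 64,
    )
    println(prompt)
}
```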

As a so-called "smart context" strategy, the current implementation still needs further optimization.

## Others

If you are interested, feel free to come and discuss the code on GitHub: https://github.com/unit-mesh/auto-dev