A comprehensive system for monitoring and evaluating GitHub Copilot code suggestions to detect potential hallucinations, recursive behaviors, and other issues. This project consists of several components that work together to provide a seamless development experience with AI code assistants.
- MCP Server: Model Context Protocol server for structured AI-to-AI communication
- Web Interface: Real-time visualization of communication logs
- Monitor Agent: Intelligent evaluation of code suggestions using LLM
- TDD Framework: Test-Driven Development support with automated test generation
- Test Execution: JSON-based API for executing tests on code suggestions
- C++23 Support: Enhanced test generation for modern C++ features
- Pydantic v2 Support: Modern data validation with backward compatibility
- Multiple LLM Support: Backend compatible with both local and API-based LLMs
- Copilot Integration: Captures and monitors GitHub Copilot suggestions
- Copilot Chat Integration: Interact with GitHub Copilot Chat programmatically
- Test Execution: Run tests on GitHub Copilot Chat code suggestions
- TDD Dashboard: Visualize test execution results with GitHub Copilot integration
- Chat Interaction Commands: Send "Continue" or request changes via command palette
- MCP Client: Communicates with the MCP server via WebSockets
- Hugging Face Integration: Connect to Hugging Face API-hosted models as an alternative to Ollama
- Model Provider Service: Seamlessly switch between local and cloud-hosted LLMs
- Evaluation UI: Shows risk scores and recommendations in VS Code
- TDD Support: Enables Test-Driven Development workflows with Copilot suggestions
- Dedicated Panel: Rich visual interface for detailed evaluation results
- Smart Notifications: Configurable notification system with multiple verbosity levels
- Context Manager: Improved context sharing between components
- Python 3.10 or higher
- Node.js 16 or higher
- VS Code 1.85.0 or higher
- GitHub Copilot extension
- Ollama running in the background (for LLM-based code evaluation)
- Clone the repository:

  git clone https://github.com/username/ai-development-monitor.git
  cd ai-development-monitor

- Set up the Python environment:

  python -m venv venv
  source venv/bin/activate
  pip install -r requirements.txt

- Install the VS Code extension:

  cd vscode-extension
  npm install
  vsce package
  code --install-extension ai-development-monitor-0.3.1.vsix
For the best experience, start both the MCP server and web interface:
- Start the MCP server:

  ./start_mcp_server.sh

- Start the web interface server:

  ./start_web_server.sh

- Open VS Code and start using GitHub Copilot
- Suggestions will be automatically evaluated using the MCP protocol
- View communication logs at http://localhost:5002
- Run TDD cycles by using the diagnostic test or invoking TDD commands
Alternatively, you can use just the REST API server:
./start_server.sh

The system uses a multi-component architecture:
- GitHub Copilot generates code suggestions in VS Code
- VS Code Extension captures these suggestions and sends them to the MCP server
- MCP Server routes messages between components using a structured protocol
- Monitor Agent evaluates code for risks using an LLM
- Web Interface visualizes the communication with colorful logs and emoticons
- TDD Framework generates tests and manages test-driven development cycles
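The pipeline above can be sketched in miniature. This is an illustrative stand-in, not the project's actual code: `Suggestion`, `Evaluation`, `evaluate`, and `process` are hypothetical names, and the repetition heuristic is only a toy proxy for the LLM-backed Monitor Agent's risk analysis (in the real system these steps are exchanged as MCP messages rather than direct function calls).

```python
# Toy sketch of the suggestion -> evaluation pipeline described above.
# All names here are illustrative; the real Monitor Agent uses an LLM.
from dataclasses import dataclass
from typing import List

@dataclass
class Suggestion:
    file: str
    code: str

@dataclass
class Evaluation:
    risk_score: float  # 0.0 (safe) .. 1.0 (high risk)
    accept: bool

def evaluate(suggestion: Suggestion) -> Evaluation:
    # Stand-in heuristic: highly repetitive output is a simple proxy
    # for the recursive behaviour the real agent tries to detect.
    lines = suggestion.code.splitlines()
    unique_ratio = len(set(lines)) / max(len(lines), 1)
    risk = 1.0 - unique_ratio
    return Evaluation(risk_score=risk, accept=risk < 0.5)

log: List[str] = []  # mirrors the web interface's communication log

def process(suggestion: Suggestion) -> Evaluation:
    result = evaluate(suggestion)
    log.append(f"{suggestion.file}: risk={result.risk_score:.2f}")
    return result
```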
The following UML diagrams provide a detailed view of the system architecture:
- Backend Components UML - Class diagram showing the backend components
- VS Code Extension Components UML - Class diagram showing the VS Code extension components
- Component Interaction UML - Sequence diagram showing communication flow
- Package Dependency UML - Diagram showing dependencies between modules
The Model Context Protocol (MCP) enables structured communication between AI systems:
- Suggestions: Code proposals from GitHub Copilot
- Evaluations: Risk assessments from the Monitor Agent
- Continuations: Follow-up requests when suggestions are incomplete
- TDD Requests: Requests for test generation in TDD workflow
- TDD Tests: Generated test code with validation suggestions
Each message includes context tracking, allowing for threaded conversations between AI systems.
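A minimal sketch of such an envelope, assuming field names (`message_type`, `message_id`, `context`, `conversation_id`) that are not confirmed by the source — the actual MCP schema may differ:

```python
# Hypothetical MCP message envelope with context tracking.
import uuid
from typing import Any, Dict, Optional

def make_message(message_type: str,
                 content: Dict[str, Any],
                 conversation_id: Optional[str] = None) -> Dict[str, Any]:
    """Wrap content in an envelope; reusing the same conversation_id
    threads related messages (suggestion -> evaluation -> continuation)."""
    return {
        "message_type": message_type,  # e.g. "suggestion", "evaluation"
        "message_id": str(uuid.uuid4()),
        "context": {
            "conversation_id": conversation_id or str(uuid.uuid4()),
        },
        "content": content,
    }

# An evaluation replies within the suggestion's conversation:
suggestion = make_message("suggestion", {"code": "def add(a, b): return a + b"})
evaluation = make_message(
    "evaluation",
    {"risk_score": 0.1, "recommendation": "accept"},
    conversation_id=suggestion["context"]["conversation_id"],
)
```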
The TDD functionality follows a 5-iteration cycle:
- Basic Testing: Tests for basic functionality and simple edge cases
- Extended Coverage: More comprehensive tests for normal use cases
- Error Handling: Tests for invalid inputs and boundary conditions
- Performance Testing: Tests for optimization and large inputs
- Comprehensive Review: Final assessment and improvement suggestions
Each iteration improves both the test suite and the implementation, progressively enhancing code quality.
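The cycle above can be expressed as a simple loop. The phase names come from the list; `run_iteration` is a hypothetical stand-in for the real test-generation and execution round trip:

```python
# Sketch of the 5-iteration TDD cycle; run_iteration is a placeholder
# for generating tests, running them, and refining the implementation.
from typing import Callable, Dict, List

PHASES: List[str] = [
    "basic_testing",
    "extended_coverage",
    "error_handling",
    "performance_testing",
    "comprehensive_review",
]

def run_tdd_cycle(run_iteration: Callable[[str], bool]) -> Dict[str, bool]:
    """Run each phase once, recording whether its tests passed."""
    results: Dict[str, bool] = {}
    for phase in PHASES:
        results[phase] = run_iteration(phase)
    return results

# Example: a fake iteration runner that passes every phase.
results = run_tdd_cycle(lambda phase: True)
```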
The AI Development Monitor supports several programming languages with language-specific test templates:
- Python: pytest-based testing with fixtures and parametrized tests
- JavaScript: Jest testing framework with modern JS capabilities
- TypeScript: Type-aware testing with Jest and TypeScript features
- Java: JUnit testing with Java-specific patterns
- C#: NUnit/xUnit testing for .NET applications
- C++: GTest framework with comprehensive C++23 feature support
- Error handling with `std::expected<T, E>`
- Formatted output with `std::print` and `std::format`
- Modern modules system testing
- Advanced threading tests with `std::barrier`
- Other C++23 features like `auto(x)` lambdas and the spaceship operator
See the C++23 Support Documentation for more details on the C++23 feature support.
- GitHub Copilot does not provide a public API, so the extension uses heuristic methods to detect suggestions
- WebSocket connections may require reconnection in unstable network environments
- TDD workflow currently only supports Python and JavaScript code
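For the WebSocket limitation above, a client can schedule reconnects with capped exponential backoff. This is a generic sketch, not the extension's actual reconnect logic; the base delay and cap are illustrative:

```python
# Capped exponential backoff for WebSocket reconnection attempts.
def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Delay in seconds before reconnect attempt `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))

# Delays grow 0.5, 1.0, 2.0, ... and saturate at the 30 s cap.
delays = [backoff_delay(n) for n in range(8)]
```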
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
See CHANGELOG.md for version history and release notes.

