MCCE is a multi-objective optimization framework powered by Large Language Models (LLMs), supporting molecular optimization, Multi-Objective Traveling Salesman Problem (MOTSP), Multi-Objective Capacitated Vehicle Routing Problem (MOCVRP), and circle packing problems.
- Model Collaboration: Supports collaboration between API models (GPT, Claude, Gemini) and local Qwen models
- DPO Training: Integrated Direct Preference Optimization (DPO) training with automatic data generation and model fine-tuning
- Multi-Problem Support: Molecular optimization, MOTSP, MOCVRP, and circle packing
- Self-Contained: All dependencies and data files are included within the project, no external path dependencies
MCCE/
├── algorithm/ # Core algorithm implementation
│ ├── MOO.py # Multi-objective optimization algorithm
│ ├── base.py # Base class definitions
│ └── PromptTemplate.py # Prompt templates
├── model/ # Model implementations
│ ├── MOLLM.py # Main model class
│ ├── LLM.py # LLM interface
│ └── util.py # Utility functions
├── problem/ # Problem definitions
│ ├── molecules/ # Molecular optimization
│ ├── motsp/ # Multi-Objective TSP
│ ├── mocvrp/ # Multi-Objective CVRP
│ └── circlepacking/ # Circle packing
├── tools/ # Data generation tools
│ ├── makerldata_dpov3.py # Molecular DPO data
│ ├── makerldata_motsp_embed.py # MOTSP DPO data
│ ├── makerldata_mocvrp_embed.py # MOCVRP DPO data
│ └── makerldata_circle_embed.py # Circle packing DPO data
├── training/ # Training scripts
│ └── train_dpo.py # DPO training implementation
├── data/ # Data directory
│ ├── problems/ # Problem data files
│ ├── dpo_training/ # DPO training data (auto-generated)
│ └── dpo_models/ # DPO trained models (auto-generated)
├── oracle/ # Molecular evaluation data
├── eval.py # Evaluation module
└── main.py # Main entry point
This project requires two conda environments:
conda create -n moorl python=3.10
conda activate moorl
pip install -r requirements_moorl.txt
# For molecular optimization, install rdkit:
conda install -c conda-forge rdkit

conda create -n verl python=3.10
conda activate verl
pip install -r requirements_verl.txt
# Install PyTorch with CUDA support:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

For detailed environment setup instructions, please refer to ENVIRONMENT_SETUP.md.
conda activate moorl
python main.py problem/molecules/config.yaml

conda activate moorl
python main.py problem/motsp/config.yaml

conda activate moorl
python main.py problem/mocvrp/config.yaml

conda activate moorl
python main.py problem/circlepacking/config.yaml

Each problem has its own configuration file with key parameters:
- `max_generation`: Maximum number of iterations
- `pop_size`: Population size
- `model_collaboration`: Enable model collaboration
- `use_dpo`: Enable DPO training
- `model_name`: API model name (e.g., `gemini-2.5-flash-nothinking`)
- `local_model_path`: Local Qwen model path
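As an illustration, a config.yaml combining these parameters might look like the following sketch; the values shown are examples, and any fields or paths beyond the parameters listed above are hypothetical:

```yaml
# Illustrative config sketch; the authoritative schema is each problem's
# own config.yaml in its problem directory.
max_generation: 50                       # maximum number of iterations
pop_size: 100                            # population size
model_collaboration: true                # enable API/local model collaboration
use_dpo: true                            # enable DPO training
model_name: gemini-2.5-flash-nothinking  # API model name
local_model_path: ./models/qwen          # hypothetical local Qwen model path
```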
To define a new optimization problem, create the following files:
- `config.yaml` - Algorithm parameter configuration
- `{problem}.yaml` - Problem description and objective definitions
- `evaluator.py` - Evaluation function implementation
Refer to the example files in each problem directory for detailed tutorials.
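As a rough sketch of what an `evaluator.py` might contain, the following defines a two-objective evaluation function for a hypothetical continuous problem. The function name `evaluate` and its return convention are assumptions here; the authoritative interface is whatever the example evaluators in each problem directory implement.

```python
# Minimal evaluator sketch for a hypothetical two-objective problem.
# The real interface is defined by the example evaluator.py files in
# each problem directory; names and conventions here are illustrative.

def evaluate(solution: list[float]) -> tuple[float, float]:
    """Return both objective values for a candidate solution."""
    # Objective 1: sum of squares (to be minimized).
    f1 = sum(x * x for x in solution)
    # Objective 2: squared distance from the all-ones point.
    f2 = sum((x - 1.0) ** 2 for x in solution)
    return f1, f2

if __name__ == "__main__":
    print(evaluate([0.0, 0.5, 1.0]))
```

A multi-objective algorithm would call such a function on every candidate in the population and use the resulting objective vectors for Pareto ranking.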
MCCE automatically:
- Collects optimization data (chosen/rejected sample pairs)
- Generates DPO training datasets
- Launches DPO training (using verl environment)
- Updates model weights
Training data and models are saved in data/dpo_training/ and data/dpo_models/ directories.
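Chosen/rejected pairs follow the usual DPO data layout: a prompt, a preferred ("chosen") response, and a dispreferred ("rejected") one. A sketch of serializing one such record as a JSONL line is below; the exact field names and content produced by the `makerldata_*` scripts in `tools/` may differ.

```python
import json

# Illustrative DPO training record. The exact schema emitted by the
# tools/makerldata_* scripts may differ; this shows only the common
# prompt/chosen/rejected layout.
record = {
    "prompt": "Propose a molecule improving QED while keeping SA low.",
    "chosen": "CC(=O)Oc1ccccc1C(=O)O",  # higher-scoring candidate
    "rejected": "C1CCCCC1",             # lower-scoring candidate
}

line = json.dumps(record)  # one line of a JSONL training file
print(line)
```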
- First run will download local Qwen models, which may take considerable time
- DPO training requires GPU support, CUDA environment recommended
- API models require appropriate API key configuration
- The optimization process generates extensive logs and data files
conda activate moorl
python -c "from algorithm.MOO import MOO; print('✓ moorl environment OK')"
python -c "from model.MOLLM import MOLLM; print('✓ MOLLM import OK')"

conda activate verl
python -c "import trl; import swanlab; print('✓ verl environment OK')"
python -c "import torch; print('CUDA available:', torch.cuda.is_available())"

- Python: 3.10
- GPU: Recommended (24GB+ VRAM for DPO training)
- Disk Space: ~30GB (environments + models + data)
- RAM: 16GB minimum, 32GB+ recommended
If you use this project, please cite the relevant paper.
This project is licensed under the MIT License.