This is a research project, NOT a commercial product. Users are granted the freedom to create images using this tool, but they are expected to comply with local laws and utilize it in a responsible manner. The developers do NOT assume any responsibility for potential misuse by users.
- 🚨 Released the implementation code of GeoDrag.
- 🚨 Paper Portal for Top Conferences in the Field of Artificial Intelligence: CV_Paper_Portal
- 🚨 Abstract Paper Portal of ICLR 2026
Recommended environment: Linux system with an NVIDIA GPU.
The project has not been fully tested on other operating systems or hardware configurations.
Running our method currently needs around 6 GB of GPU memory. We will continue to optimize memory efficiency.
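The memory figure above can be checked before launching anything. The following is a minimal sketch and not part of the GeoDrag code base: `has_enough_gpu_memory` and `check_gpu` are hypothetical helper names, the 6 GB figure is interpreted as GiB, and the sketch assumes the installed PyTorch is recent enough to provide `torch.cuda.mem_get_info`.

```python
def has_enough_gpu_memory(free_bytes: int, required_gb: float = 6.0) -> bool:
    """Return True if free_bytes covers the required amount (treated as GiB)."""
    return free_bytes >= required_gb * 1024 ** 3


def check_gpu(required_gb: float = 6.0) -> str:
    """Return a one-line report on whether the current CUDA device looks large enough."""
    try:
        import torch  # imported lazily so the helper above works without PyTorch
    except ImportError:
        return "PyTorch is not installed; cannot query the GPU."
    if not torch.cuda.is_available():
        return "No CUDA device is visible to PyTorch."
    # mem_get_info returns (free_bytes, total_bytes) for the current device
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    verdict = "enough" if has_enough_gpu_memory(free_bytes, required_gb) else "NOT enough"
    return (
        f"{free_bytes / 1024 ** 3:.1f} GiB free of {total_bytes / 1024 ** 3:.1f} GiB: "
        f"{verdict} for roughly {required_gb:g} GiB"
    )
```

Calling `print(check_gpu())` inside the activated environment prints a one-line verdict. Since the 6 GB requirement is approximate, a result close to the threshold should be read with some margin.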
To install the required libraries, simply run the following command:
conda env create -f environment.yaml
conda activate geodrag
Next, download the pretrained Depth Anything V2 weights by running:
bash download.sh model
From the command line, run the following to launch the Gradio user interface:
python3 app.py
You can refer to the GIF above for a step-by-step demonstration of how to use the UI.
To run GeoDrag directly from the command line, use:
python3 inference_drag.py \
-c path/to/config \
-i path/to/image \
-m path/to/meta-data \
-o path/to/output-dir
# example
# python3 inference_drag.py \
# -c configs/base_configs.yaml \
# -i datasets/DragBench/animals/JH_2023-09-14-1820-16/original_image.png \
# -m datasets/DragBench/animals/JH_2023-09-14-1820-16/meta_data.pkl \
# -o output
# -c: config file path
# -i: input image
# -m: metadata (drag points, mask, and prompt)
# -o: output directory
Download DragBench into the folder "datasets" by running:
bash download.sh dragbench
To evaluate GeoDrag on the DRAGBENCH benchmark, run the following command:
python3 inference_dragbench.py \
-c path/to/config \
-b path/to/benchmark \
-o path/to/output
# example
# python3 inference_dragbench.py \
# -c configs/base_configs.yaml \
# -b datasets/DragBench \
# -o outputs
To compute quantitative evaluation metrics on a single result, run:
python3 -m evaluation single \
-s '["MD", "IF", "DAI"]' \
-i path/to/image \
-m path/to/meta-data \
-e path/to/edited-image
# The `-s` argument specifies which scores to compute.
# You can choose any combination of the following:
# MD – Mean Distance
# IF – Image Fidelity
# DAI – Dragging Accuracy Index
# For example:
# -s '["MD"]'
# -s '["MD", "IF"]'
# -s '["MD", "IF", "DAI"]'
To directly compute quantitative results on the DRAGBENCH benchmark, run:
python3 -m evaluation benchmark \
-b path/to/benchmark-results
Results will be summarized and printed in the terminal.
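The `-s` value in the single-result command is written as a JSON-style list, which is easy to misquote in a shell. Assuming the evaluation module parses it as JSON (this README only shows the quoted form), a small helper can build the argument list from Python so that no shell quoting is involved. The function name `build_single_eval_command` and its validation step are hypothetical and not part of the repository; the flags are the ones shown above.

```python
import json

# The three score names listed in this README.
VALID_SCORES = ("MD", "IF", "DAI")


def build_single_eval_command(image, meta, edited, scores=VALID_SCORES):
    """Build the argv list for `python3 -m evaluation single` (hypothetical helper)."""
    unknown = [s for s in scores if s not in VALID_SCORES]
    if unknown:
        raise ValueError(f"Unknown score name(s): {unknown}")
    return [
        "python3", "-m", "evaluation", "single",
        "-s", json.dumps(list(scores)),  # e.g. '["MD", "IF"]'
        "-i", str(image),
        "-m", str(meta),
        "-e", str(edited),
    ]
```

Passing the returned list to `subprocess.run(cmd, check=True)` from the repository root would then run the evaluation without going through a shell.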
See parameters.md for additional information.
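Both the inference command and the single-result evaluation command take a metadata file through `-m`, described above as holding the drag points, mask, and prompt. To see what such a file contains before running anything, it can be opened with Python's `pickle` module. This is a rough sketch that assumes the file unpickles to a dictionary; `summarize_meta` is a hypothetical helper and no key names are assumed. Loading a pickle can execute code, so only open files from a source you trust.

```python
import pickle
from pathlib import Path


def summarize_meta(path):
    """Load a metadata pickle and map each top-level key to a short type description."""
    with Path(path).open("rb") as f:
        meta = pickle.load(f)  # only unpickle files from a trusted source
    if not isinstance(meta, dict):
        # Assumption violated: report the root type instead of failing.
        return {"<root>": type(meta).__name__}
    summary = {}
    for key, value in meta.items():
        shape = getattr(value, "shape", None)  # array-like values expose .shape
        if shape is not None:
            summary[key] = f"{type(value).__name__} with shape {tuple(shape)}"
        else:
            summary[key] = type(value).__name__
    return summary
```

For example, `print(summarize_meta("path/to/meta-data"))` lists each stored entry with its type. If the file stores array objects, the library that defines them has to be importable in the current environment.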
If you find our repo helpful, please consider leaving a star or citing our paper :)
@article{pu2025geodrag,
title={Dragging with Geometry: From Pixels to Geometry-Guided Image Editing},
author={Xinyu Pu and Hongsong Wang and Jie Gui and Pan Zhou},
journal={arXiv preprint arXiv:2509.25740},
year={2025}
}
For any questions on this project, please contact Xinyu (xinyupu@seu.edu.cn)
The code builds upon DragDiffusion, FastDrag, and diffusers. Thanks for their outstanding work!
We’d also like to express our appreciation to the amazing open-source community behind diffusion models, libraries, and research that inspired this work.
- Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold
- DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing
- Emergent Correspondence from Image Diffusion
- DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models
- FreeDrag: Feature Dragging for Reliable Point-based Image Editing
- FastDrag: Manipulate Anything in One Step
- Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation
- Lightning fast text-guided image editing via one-step diffusion
