Panwang Pan†, Chenguo Lin†, Jingjing Zhao, Chenxin Li, Yuchen Lin, Haopeng Li, Honglei Yan, Kairun Wen, Yunlong Lin, Yixuan Yuan, Yadong Mu
Diff4Splat is a feed-forward method that synthesizes controllable and explicit 4D scenes from a single image. Our approach unifies the generative priors of video diffusion models with geometry and motion constraints learned from large-scale 4D datasets.
Here is our Project Page.
Feel free to contact us or open an issue if you have any questions or suggestions.
You may also be interested in our other works:
- [CVPR 2026] MoVieS: a feed-forward model for 4D dynamic reconstruction from monocular videos.
- 2026-02-21: The paper is accepted to CVPR 2026.
- 2025-11-01: Diff4Splat is released on arXiv.
- 2025-10-15: Initial codebase structure established.
- 2025-10-01: Project development started.
This section clarifies the differences between the paper description and the current released codebase, to help set expectations for reproducibility.
- Video Backbone: CogVideoX-style Video DiT with 32-channel 3D Causal VAE (4×8×8 compression)
- LDRM Input: The video latent tensor (z) from the diffusion model, together with camera information, is processed by the LDRM Transformer to output deformable 3D Gaussians
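As a rough sketch of what the latent input implies dimensionally: with a 32-channel 3D causal VAE at 4×8×8 (temporal × spatial) compression, the latent shape for a clip can be estimated as below. The helper name and the causal first-frame handling are our assumptions for illustration, not taken from the released code:

```python
# Hypothetical helper: latent tensor shape implied by a 32-channel
# 3D causal VAE with 4x temporal and 8x8 spatial compression.
# Causal VAEs typically keep the first frame and compress the rest 4x.
def latent_shape(num_frames: int, height: int, width: int,
                 latent_channels: int = 32,
                 t_stride: int = 4, s_stride: int = 8) -> tuple:
    t = 1 + (num_frames - 1) // t_stride  # first frame kept as-is
    return (latent_channels, t, height // s_stride, width // s_stride)

# e.g. a 49-frame 480x720 clip -> (32, 13, 60, 90)
print(latent_shape(49, 480, 720))
```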
We are actively working on:
- Paper-faithful implementation: A version closer to the CogVideoX + latent-input LDRM stack described in the paper
- Complete training/inference scripts: Exact scripts to reproduce the paper's results
- Pretrained checkpoints: Both the paper's setup and this repository's engineering variant
- Inference code released
- Training code and data preprocessing scripts released
- Pretrained checkpoints (coming soon)
- HuggingFace demo (coming soon)
- Preprocessed dataset (coming soon)
- Python >= 3.10
- PyTorch >= 2.0 (with CUDA support)
- CUDA >= 11.8
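If you want to sanity-check your environment against these minimums, dotted version strings should be compared numerically rather than lexically (e.g. "11.10" > "11.8"). A minimal, repo-independent sketch (the function name is ours):

```python
# Compare dotted version strings numerically, e.g. '2.1.0' vs '2.0'.
def meets_minimum(version: str, minimum: str) -> bool:
    def parts(v: str) -> list:
        # Drop local suffixes like '+cu118' and non-numeric segments.
        return [int(p) for p in v.split("+")[0].split(".") if p.isdigit()]
    a, b = parts(version), parts(minimum)
    n = max(len(a), len(b))
    a += [0] * (n - len(a))  # pad to equal length so '2.0' == '2.0.0'
    b += [0] * (n - len(b))
    return a >= b

print(meets_minimum("2.1.0+cu118", "2.0"))  # True
print(meets_minimum("11.7", "11.8"))        # False
```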
```shell
# Clone the repository
git clone https://github.com/paulpanwang/Diff4Splat.git
cd Diff4Splat

# Install required packages
pip install -r settings/requirements.txt
```

Configure your dataset root path in `src/options.py` or via the `DATASET_ROOT` environment variable.
The following datasets are supported:
- RealEstate10K (`re10k`) - Static scenes
- TartanAir (`tartanair`) - Static scenes
- MatrixCity (`matrixcity`) - Static scenes
- DL3DV (`dl3dv`) - Static scenes
- DynamicReplica (`dynamicreplica`) - Dynamic scenes
- PointOdyssey (`pointodyssey`) - Dynamic scenes
- VKITTI2 (`vkitti2`) - Dynamic scenes
- Spring (`spring`) - Dynamic scenes
- Stereo4D (`stereo4d`) - Dynamic scenes
Dataset paths can be configured in `src/options.py`.
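A minimal sketch of how such a lookup might work, assuming the documented `DATASET_ROOT` environment variable takes precedence over a default. The function name and default path are hypothetical; the actual `src/options.py` may differ:

```python
import os
from pathlib import Path

# Placeholder default, not taken from the repository.
DEFAULT_DATASET_ROOT = "./data"

def resolve_dataset_root() -> Path:
    """Prefer the DATASET_ROOT env var; fall back to the default path."""
    root = os.environ.get("DATASET_ROOT", DEFAULT_DATASET_ROOT)
    return Path(root).expanduser().resolve()

print(resolve_dataset_root())
```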
If you have questions about reproducibility or comparisons, please open an issue or contact the authors. We appreciate your understanding as we continue to improve and complete this codebase!
If you find our work helpful, please consider citing:
```bibtex
@inproceedings{pan2025diff4splat,
  title={Diff4Splat: Controllable 4D Scene Generation with Latent Dynamic Reconstruction Models},
  author={Pan, Panwang and Lin, Chenguo and Zhao, Jingjing and Li, Chenxin and Lin, Yuchen and Li, Haopeng and Yan, Honglei and Wen, Kairun and Lin, Yunlong and Yuan, Yixuan and Mu, Yadong},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
  year={2026}
}
```

This project is licensed under the MIT License - see the LICENSE file for details.
We would like to thank the authors of MoVieS, PartCrafter, DiffSplat, and other related works for their inspiring research and open-source contributions that helped shape this project.
