
⚠️ Note: This repository currently contains only the initial code. Full code and pretrained weights will be updated soon.

[CVPR 2026] 🌀Diff4Splat

Diff4Splat is a feed-forward method that synthesizes controllable and explicit 4D scenes from a single image. Our approach unifies the generative priors of video diffusion models with geometry and motion constraints learned from large-scale 4D datasets.

Here is our Project Page.

Feel free to contact us or open an issue if you have any questions or suggestions.

🔥 See Also

You may also be interested in our other works:

  • [CVPR 2026] MoVieS: a feed-forward model for 4D dynamic reconstruction from monocular videos.

📢 News

  • 2026-02-21: The paper is accepted to CVPR 2026.
  • 2025-11-01: Diff4Splat is released on arXiv.
  • 2025-10-15: Initial codebase structure established.
  • 2025-10-01: Project development started.

📝 Paper vs Released Code

This section clarifies the differences between the paper and the currently released codebase, to set expectations for reproducibility.

What the Paper Describes

  • Video Backbone: CogVideoX-style Video DiT with 32-channel 3D Causal VAE (4×8×8 compression)
  • LDRM Input: The video latent tensor z from the diffusion model, together with camera information, is processed by the LDRM Transformer to output deformable 3D Gaussians
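To make the stated compression concrete, here is a minimal sketch of the latent tensor shape implied by a 32-channel 3D Causal VAE with 4× temporal and 8×8 spatial compression. The frame count, resolution, and exact boundary rounding below are illustrative assumptions, not values taken from the released code.

```python
# Latent shape implied by the paper's 4x (temporal) x 8x8 (spatial)
# compression with 32 latent channels. Boundary handling (e.g. how a
# first frame is treated by a causal VAE) is simplified to plain
# integer division here.

def latent_shape(num_frames: int, height: int, width: int,
                 channels: int = 32,
                 t_down: int = 4, s_down: int = 8) -> tuple:
    """Return (C, T', H', W') for a video latent under the stated compression."""
    return (channels,
            num_frames // t_down,
            height // s_down,
            width // s_down)

print(latent_shape(48, 480, 720))  # -> (32, 12, 60, 90)
```

For a hypothetical 48-frame 480×720 clip, the latent z would be roughly (32, 12, 60, 90): this 4D tensor, plus camera information, is what the LDRM Transformer consumes.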

Release Roadmap

We are actively working on:

  1. Paper-faithful implementation: A version closer to the CogVideoX + latent-input LDRM stack described in the paper
  2. Complete training/inference scripts: Exact scripts to reproduce the paper's results
  3. Pretrained checkpoints: Both the paper's setup and this repository's engineering variant

📋 Project Status

  • Inference code released
  • Training code and data preprocessing scripts released
  • Pretrained checkpoints (coming soon)
  • HuggingFace demo (coming soon)
  • Preprocessed dataset (coming soon)

🔧 Installation

Requirements

  • Python >= 3.10
  • PyTorch >= 2.0 (with CUDA support)
  • CUDA >= 11.8
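Before installing, you can sanity-check your environment against these requirements. This is a small sketch (not part of the repository) that reports the Python version and, if PyTorch is installed, its version and CUDA availability:

```python
# Quick environment report against the stated requirements:
# Python >= 3.10 and PyTorch >= 2.0 with CUDA support.
import sys


def check_env(min_python=(3, 10)) -> dict:
    """Report whether the current environment meets the requirements."""
    report = {"python_ok": sys.version_info >= min_python}
    try:
        import torch  # only checked if already installed
        report["torch_version"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    except ImportError:
        report["torch_version"] = None
        report["cuda_available"] = False
    return report


print(check_env())
```

If `cuda_available` is False on a GPU machine, reinstall PyTorch with a CUDA-enabled build before proceeding.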

Install Dependencies

# Clone the repository
git clone https://github.com/paulpanwang/Diff4Splat.git
cd Diff4Splat

# Install required packages
pip install -r settings/requirements.txt

📊 Datasets

Configure your dataset root path in src/options.py or via the DATASET_ROOT environment variable.

The following datasets are supported:

  • RealEstate10K (re10k) - Static scenes
  • TartanAir (tartanair) - Static scenes
  • MatrixCity (matrixcity) - Static scenes
  • DL3DV (dl3dv) - Static scenes
  • DynamicReplica (dynamicreplica) - Dynamic scenes
  • PointOdyssey (pointodyssey) - Dynamic scenes
  • VKITTI2 (vkitti2) - Dynamic scenes
  • Spring (spring) - Dynamic scenes
  • Stereo4D (stereo4d) - Dynamic scenes

Dataset paths can also be configured directly in src/options.py.
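As a minimal sketch of the precedence described above, the DATASET_ROOT environment variable can override a default path. The default path and helper name below are illustrative placeholders, not the repository's actual code:

```python
# Resolve the dataset root as the README describes: the DATASET_ROOT
# environment variable takes precedence over a default. The default
# "./data" is a hypothetical placeholder.
import os


def resolve_dataset_root(default: str = "./data") -> str:
    """Return DATASET_ROOT from the environment, falling back to a default."""
    return os.environ.get("DATASET_ROOT", default)


os.environ["DATASET_ROOT"] = "/path/to/datasets"  # example override
print(resolve_dataset_root())  # -> /path/to/datasets
```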

Questions & Feedback

If you have questions about reproducibility or comparisons, please open an issue or contact the authors. We appreciate your understanding as we continue to improve and complete this codebase!

📚 Citation

If you find our work helpful, please consider citing:

@inproceedings{pan2025diff4splat,
  title={Diff4Splat: Controllable 4D Scene Generation with Latent Dynamic Reconstruction Models},
  author={Pan, Panwang and Lin, Chenguo and Zhao, Jingjing and Li, Chenxin and Lin, Yuchen and Li, Haopeng and Yan, Honglei and Wen, Kairun and Lin, Yunlong and Yuan, Yixuan and Mu, Yadong},
  booktitle={Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
  year={2026}
}

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

😊 Acknowledgement

We would like to thank the authors of MoVieS, PartCrafter, DiffSplat, and other related works for their inspiring research and open-source contributions that helped shape this project.
