
Amodal3R: Amodal 3D Reconstruction from Occluded 2D Images

Tianhao Wu · Chuanxia Zheng · Frank Guan · Andrea Vedaldi · Tat-Jen Cham

ICCV 2025

Demo Video

Setup

This code has been tested on Ubuntu 22.04 with torch 2.4.0 and CUDA 11.8. We sincerely thank TRELLIS for providing the environment setup; we follow their instructions exactly in this work.

Create a new conda environment named amodal3r and install the dependencies:

. ./setup.sh --new-env --basic --xformers --flash-attn --diffoctreerast --spconv --mipgaussian --kaolin --nvdiffrast

The detailed usage of setup.sh can be found by running . ./setup.sh --help.

Usage: setup.sh [OPTIONS]
Options:
    -h, --help              Display this help message
    --new-env               Create a new conda environment
    --basic                 Install basic dependencies
    --train                 Install training dependencies
    --xformers              Install xformers
    --flash-attn            Install flash-attn
    --diffoctreerast        Install diffoctreerast
    --vox2seq               Install vox2seq
    --spconv                Install spconv
    --mipgaussian           Install mip-splatting
    --kaolin                Install kaolin
    --nvdiffrast            Install nvdiffrast
    --demo                  Install all dependencies for demo

Pretrained models

We have provided our pretrained weights of both sparse structure module and SLAT module on HuggingFace.

Data Preprocessing

Training Data

We use three datasets for training: ABO, 3D-FUTURE, and HSSD. To obtain the training data, please also refer to TRELLIS. Thanks to them for the amazing work!

When the data is ready, combine the datasets and put them under ./dataset/abo_3dfuture_hssd. If you want to train on a single dataset, feel free to modify the dataloader. For training, rendered images, Sparse Structure, and SLAT are required for each object.
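One way to combine the processed datasets into the expected folder is to symlink every entry into ./dataset/abo_3dfuture_hssd. The sketch below assumes each dataset lives in its own folder (the source folder names here are hypothetical; adjust them to your actual layout):

```python
import os

# Hypothetical source folders; point these at your processed datasets.
SOURCES = ["./dataset/ABO", "./dataset/3D-FUTURE", "./dataset/HSSD"]
TARGET = "./dataset/abo_3dfuture_hssd"

def combine(sources, target):
    """Symlink every entry of each source dataset into one combined folder."""
    os.makedirs(target, exist_ok=True)
    for src in sources:
        for name in sorted(os.listdir(src)):
            link = os.path.join(target, name)
            if not os.path.exists(link):
                os.symlink(os.path.abspath(os.path.join(src, name)), link)

if __name__ == "__main__":
    combine(SOURCES, TARGET)
```

Symlinking avoids duplicating the (large) rendered data on disk; copying with shutil would work just as well if your filesystem does not support symlinks.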

Training

To train your own model, you can start from either our weights or the original TRELLIS weights. Please download the weights and put them under ./ckpts.

To train the sparse structure module with our designed mask-weighted cross-attention and occlusion-aware attention, please run:

. ./train_ss.sh

To train the SLAT module with our designed mask-weighted cross-attention and occlusion-aware attention, please run:

. ./train_slat.sh

The output folder where the model is saved can be changed by modifying the --vis parameter in the script.

Inference

We have prepared examples under the ./example folder. Both single and multiple images are supported as input. For inference, please run:

python ./inference.py

If you want to try your own data, you should prepare: 1) the original image, and 2) a mask image, where the background is white (255,255,255), the visible area is gray (188,188,188), and the occluded area is black (0,0,0).
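Given binary masks of the visible and occluded regions, the three-level mask image can be composed with a few lines of numpy. This is a minimal sketch; the helper name and its boolean inputs are our own, not part of the repo:

```python
import numpy as np

def build_amodal_mask(visible, occluded):
    """Compose the three-level mask described above:
    background -> white (255,255,255), visible area -> gray (188,188,188),
    occluded area -> black (0,0,0). Inputs are boolean HxW arrays."""
    mask = np.full(visible.shape + (3,), 255, dtype=np.uint8)  # white background
    mask[visible] = 188   # gray: visible part of the object
    mask[occluded] = 0    # black: occluded part of the object
    return mask

# Toy example: a 4x4 image with one visible and one occluded pixel.
vis = np.zeros((4, 4), dtype=bool); vis[1, 1] = True
occ = np.zeros((4, 4), dtype=bool); occ[2, 2] = True
m = build_amodal_mask(vis, occ)
```

The result can then be saved as a PNG (e.g. with PIL's `Image.fromarray`) next to the original image.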

You can use Segment Anything to obtain the corresponding mask; we used it for the in-the-wild examples in the paper and in our demo.

Evaluation

We render Toys4K and GSO in exactly the same way as the training data. To obtain the evaluation dataset, please modify the directory in 3d_mask_render.py and run:

python ./3d_mask_render.py

It will create a renders_mask folder containing the 3D-consistent masks.
