ParticleSfM

Paper | Video | Project Page

Code release for our ECCV 2022 paper “ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild” by Wang Zhao, Shaohui Liu, Hengkai Guo, Wenping Wang and Yong-Jin Liu.

[Introduction] ParticleSfM is an offline structure-from-motion system for videos (image sequences). Inspired by Particle Video, our method connects pairwise optical flows and optimizes dense point trajectories as long-range video correspondences, which are then used in a customized global structure-from-motion framework with similarity averaging and global bundle adjustment. For dynamic scenes in particular, the dense point trajectories can be fed into a specially designed trajectory-based motion segmentation module that selects static point tracks, enabling the system to produce reliable camera trajectories on in-the-wild sequences with complex foreground motion.
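
To make the idea of connecting pairwise optical flows concrete, here is a minimal Python sketch that chains consecutive flow fields into one point trajectory. It is only an illustration, not the repository's implementation: the function name chain_flows, the nested-list flow layout and the nearest-pixel lookup are assumptions of this sketch, and the trajectory optimization step is not included.

```python
from typing import List, Tuple

# Assumed layout: flow[y][x] = (dx, dy), the motion of pixel (x, y) from frame t to t+1.
Flow = List[List[Tuple[float, float]]]


def chain_flows(flows: List[Flow], start: Tuple[float, float]) -> List[Tuple[float, float]]:
    """Follow one point through consecutive flow fields.

    Returns the position of the point in every frame it stays inside the image.
    """
    x, y = start
    track = [(x, y)]
    for flow in flows:
        height, width = len(flow), len(flow[0])
        # Nearest-pixel lookup; a real system would interpolate the flow bilinearly.
        col, row = int(round(x)), int(round(y))
        if not (0 <= col < width and 0 <= row < height):
            break  # the point left the image, so the trajectory ends here
        dx, dy = flow[row][col]
        x, y = x + dx, y + dy
        track.append((x, y))
    return track


# Toy example: two 4x4 flow fields that shift every pixel by one pixel to the right.
shift_right = [[(1.0, 0.0)] * 4 for _ in range(4)]
track = chain_flows([shift_right, shift_right], start=(0.0, 2.0))
print(track)
```

Chaining alone accumulates drift and ignores occlusions, which is why long-range trajectories are optimized rather than used directly after chaining.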

Teaser

Contact Wang Zhao (thuzhaowang@163.com), Shaohui Liu (b1ueber2y@gmail.com) and Hengkai Guo (guohengkai@bytedance.com) with questions, comments and bug reports.

If you are interested in potential collaboration or internship at ByteDance, please feel free to contact Hengkai Guo (guohengkai@bytedance.com).

Update by 2025.02.05

The pipeline now supports GLOMAP, which gives more accurate results than gcolmap (Theia) on 13 sequences of the Sintel dataset, at about twice the SfM runtime:

| Method | ATE (m) | RPE trans (m) | RPE rot (deg) | SfM runtime (min) | #Frames |
|---|---|---|---|---|---|
| Global SfM - Ours w/ gcolmap (Theia) | 0.104 | 0.054 | 0.414 | 3.35 | 45.6 |
| Global SfM - Ours w/ GLOMAP | 0.057 | 0.031 | 0.201 | 6.97 | 45.6 |
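
As a quick reading aid, the snippet below recomputes the relative changes between the two rows of the table. The numbers are copied from the table; the dictionary and variable names are only illustrative.

```python
# Values copied from the table above (Sintel, 13 sequences).
theia = {"ATE (m)": 0.104, "RPE trans (m)": 0.054, "RPE rot (deg)": 0.414}
glomap = {"ATE (m)": 0.057, "RPE trans (m)": 0.031, "RPE rot (deg)": 0.201}
runtime_min = {"theia": 3.35, "glomap": 6.97}

# Relative error reduction when switching from gcolmap (Theia) to GLOMAP.
reduction = {k: (theia[k] - glomap[k]) / theia[k] for k in theia}
for metric, value in reduction.items():
    print(f"{metric}: {100 * value:.1f}% lower with GLOMAP")

# Runtime ratio (GLOMAP / Theia).
slowdown = runtime_min["glomap"] / runtime_min["theia"]
print(f"SfM runtime: {slowdown:.2f}x")
```

In this comparison GLOMAP lowers each error metric by more than 40 percent while roughly doubling the SfM runtime.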

To try it, set sfm_type to global_glomap:

python run_particlesfm.py --image_dir /path/to/the/image/folder/ \
                          --output_dir /path/to/output/workspace/ \
                          --sfm_type global_glomap  # "global_theia" for the paper version

Installation

Install dependencies:

Ceres >= 2.0.0

To use gcolmap (Theia), as in the original ParticleSfM paper:

COLMAP <= 3.8 [Guide]
Theia SfM (customized version) [Guide]

Alternatively, if you want to use our latest GLOMAP support:

GLOMAP

Set up Python environment with Conda:

conda env create -f particlesfm_env.yaml
conda activate particlesfm

Build our point trajectory optimizer and global structure-from-motion module.

The path to your customized python executable should be set here.

(Optional) Add another gcc search path (e.g. gcc 9) here to compile gmapper correctly.

git submodule update --init --recursive
sudo apt-get install libhdf5-dev
bash scripts/build_all.sh

Download pretrained models for MiDaS, RAFT and our motion segmentation module (download script).
```
bash scripts/download_all_models.sh
```

Quickstart Demo

Download two example in-the-wild sequences [Google Drive] from DAVIS: snowboard and train:
```
bash ./scripts/download_examples.sh
```
Example command to run the reconstruction (e.g. on snowboard):
```
python run_particlesfm.py --image_dir ./example/snowboard/images --output_dir ./outputs/snowboard/
```
Alternatively, pass a workspace directory that contains the images folder. With this option all outputs are written into the same workspace.
```
python run_particlesfm.py --workspace_dir ./example/snowboard/
```
Visualize the outputs with either the COLMAP GUI or your own visualizer. We also provide a visualization script:
```
python -m pip install open3d pycolmap
python visualize.py --input_dir ./outputs/snowboard/sfm/model --visualize
```
The results below are expected (left: snowboard; right: train):

Usage

Put all images of a sequence in a single folder. The sorted order of the file names must match the frame order of the sequence.
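
File names with unpadded numbers are a common way to violate this requirement, because "frame10.png" sorts before "frame2.png" as a string. The sketch below checks a list of names and proposes zero-padded replacements. It is not part of the repository: the helper names are hypothetical, and it assumes that the names are sorted as plain strings and that the numbers in the names encode the frame order.

```python
import os
import re
from typing import List, Tuple


def natural_key(name: str) -> list:
    """Split a name into text and integer chunks so that 'frame10' sorts after 'frame2'."""
    parts = re.split(r"(\d+)", name)
    # With one capture group, odd positions always hold the digit runs.
    return [int(part) if i % 2 else part for i, part in enumerate(parts)]


def is_sort_safe(names: List[str]) -> bool:
    """True if plain string sorting already yields the numeric frame order."""
    return sorted(names) == sorted(names, key=natural_key)


def rename_plan(names: List[str], width: int = 5) -> List[Tuple[str, str]]:
    """Pair each file with a zero-padded name that sorts correctly as a string."""
    ordered = sorted(names, key=natural_key)
    return [(old, f"{i:0{width}d}{os.path.splitext(old)[1]}") for i, old in enumerate(ordered)]


frames = ["frame1.png", "frame2.png", "frame10.png"]
print(is_sort_safe(frames))
print(rename_plan(frames))
```

The plan is only printed here; applying it (for example with os.rename on a copy of the folder) is left to the reader.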
Use the following command to run our whole pipeline:
```
python run_particlesfm.py --image_dir /path/to/the/image/folder/ \
                          --output_dir /path/to/output/workspace/
```
This runs four stages in sequence: optical flow -> point trajectory -> motion segmentation -> SfM. The final results are saved inside the image data folder in COLMAP output format.
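
If you want to consume the estimated poses in your own code, the following sketch reads camera centers from a COLMAP images.txt file. It assumes the model is available in COLMAP's text format (a binary model can be converted, e.g. with COLMAP's model_converter); the function names are hypothetical and the example input is made up for illustration.

```python
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


def quat_to_rot(qw: float, qx: float, qy: float, qz: float) -> list:
    """Rotation matrix of a unit quaternion given as (w, x, y, z)."""
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qw * qz), 2 * (qx * qz + qw * qy)],
        [2 * (qx * qy + qw * qz), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qw * qx)],
        [2 * (qx * qz - qw * qy), 2 * (qy * qz + qw * qx), 1 - 2 * (qx * qx + qy * qy)],
    ]


def read_camera_centers(images_txt: str) -> Dict[str, Vec3]:
    """Parse the content of images.txt and return the camera center per image name."""
    # Drop comment lines but keep empty ones: an image without 2D points
    # still occupies two lines, the second of which is empty.
    lines = [ln for ln in images_txt.splitlines() if not ln.startswith("#")]
    centers = {}
    for pose_line in lines[0::2]:
        # IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME
        fields = pose_line.strip().split(None, 9)
        if len(fields) < 10:
            continue
        qw, qx, qy, qz, tx, ty, tz = map(float, fields[1:8])
        rot = quat_to_rot(qw, qx, qy, qz)
        t = (tx, ty, tz)
        # COLMAP stores world-to-camera poses, so the camera center is C = -R^T t.
        centers[fields[9]] = tuple(-sum(rot[r][c] * t[r] for r in range(3)) for c in range(3))
    return centers


example = """# Image list with two lines of data per image:
#   IMAGE_ID, QW, QX, QY, QZ, TX, TY, TZ, CAMERA_ID, NAME
#   POINTS2D[] as (X, Y, POINT3D_ID)
1 1 0 0 0 1 2 3 1 00000.png
10.0 20.0 -1
2 1 0 0 0 0 0 -1 1 00001.png

"""
print(read_camera_centers(example))
```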

If you know in advance that the scene is fully static, you can skip the motion segmentation module with --assume_static. Conversely, if you only want to run the motion segmentation, add --skip_sfm to the command.
To speed up the pipeline:
- Use --skip_path_consistency to skip the path consistency optimization of point trajectories
- Try a higher down-sampling ratio for optimizing point trajectories, e.g. --sample_ratio 4
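
When running many sequences with different option sets, it can help to assemble the command in a script. The sketch below builds the argument list from the flags mentioned in this section. The helper build_command is hypothetical and not part of the repository; it only uses flags that appear in this README and does not validate their combinations.

```python
import shlex
from typing import List, Optional


def build_command(image_dir: str, output_dir: str, sfm_type: Optional[str] = None,
                  assume_static: bool = False, skip_sfm: bool = False,
                  skip_path_consistency: bool = False,
                  sample_ratio: Optional[int] = None) -> List[str]:
    """Assemble a run_particlesfm.py call from the options described above."""
    cmd = ["python", "run_particlesfm.py",
           "--image_dir", image_dir, "--output_dir", output_dir]
    if sfm_type is not None:
        cmd += ["--sfm_type", sfm_type]
    if assume_static:
        cmd.append("--assume_static")  # skip the motion segmentation module
    if skip_sfm:
        cmd.append("--skip_sfm")  # stop before the SfM stage
    if skip_path_consistency:
        cmd.append("--skip_path_consistency")
    if sample_ratio is not None:
        cmd += ["--sample_ratio", str(sample_ratio)]
    return cmd


cmd = build_command("./example/snowboard/images", "./outputs/snowboard/",
                    skip_path_consistency=True, sample_ratio=4)
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root.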
Visualize the outputs with the COLMAP GUI (download the COLMAP binary and import the data sequence directory) or with your own visualizer.

Evaluation

MPI Sintel dataset

Download the Sintel dataset. To evaluate the poses and the motion segmentation, you also need to download the ground-truth camera motion data and the generated motion masks.

Prepare the sequences:

python scripts/prepare_sintel.py --src_dir /path/to/your/sintel/training/final/ \
                              --tgt_dir /path/to/the/data/root/dir/want/to/save/

Run ParticleSfM reconstructions:

python run_particlesfm.py --root_dir /path/to/the/data/root/dir/

To evaluate the camera poses:

python ./evaluation_evo/eval_sintel.py --input_dir /path/to/the/data/root/dir/ \
                                    --gt_dir /path/to/the/sintel/training/data/camdata_left/ \
                                    --dataset sintel

This will output a txt file with detailed error metrics. Also, the camera trajectories are plotted and saved inside each data sequence folder.

To evaluate the motion segmentation:

python ./motion_seg/eval_traj_iou.py --root_dir /path/to/the/data/root/dir/ \
                                  --gt_dir /path/to/the/sintel/rigidity/

ScanNet dataset

Download the test split of the ScanNet dataset and extract the data from the .sens files using the official script.

Prepare the sequences:

python scripts/prepare_scannet.py --src_dir /path/to/your/scannet/test/scans_test/ \
                               --tgt_dir /path/to/the/data/root/dir/want/to/save/

We use the first 20 sequences of the test split, downsample them with stride 3, and resize the images to 640x480.
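
The selection rule can be sketched in a few lines of Python. This is only an illustration of the subsampling described above, not the code in scripts/prepare_scannet.py: the function name, the sequence names and the frame names are hypothetical, "first" is assumed to mean first by sorted name, the stride is assumed to start at frame 0, and the resize step is not shown.

```python
from typing import Dict, List


def select_frames(sequences: Dict[str, List[str]], num_sequences: int = 20,
                  stride: int = 3) -> Dict[str, List[str]]:
    """Keep the first `num_sequences` sequences (by sorted name) and every `stride`-th frame."""
    kept = sorted(sequences)[:num_sequences]
    return {name: sequences[name][::stride] for name in kept}


# Hypothetical input: 25 sequences with 10 frames each, frames listed in temporal order.
sequences = {f"scene{i:04d}_00": [f"{j}.jpg" for j in range(10)] for i in range(707, 732)}
subset = select_frames(sequences)
print(len(subset), subset["scene0707_00"])
```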

Run ParticleSfM reconstructions:

python run_particlesfm.py --root_dir /path/to/the/data/root/dir/ \
                       --flow_check_thres 3.0 --assume_static

To evaluate the camera poses:

python ./evaluation_evo/eval_scannet.py --input_dir /path/to/the/data/root/dir/ \
                                     --gt_dir /path/to/the/scannet/test/scans_test/ \
                                     --dataset scannet

This will output a txt file with detailed error metrics. Also, the camera trajectories are plotted and saved inside each data sequence folder.

Training

Download the FlyingThings3D dataset from the official website. We need the RGB images (finalpass) and the optical flow data.
Download the generated binary motion labels from here or Google Drive, and unpack the archive into the root directory of the FlyingThings3D dataset. We thank the authors of MPNet for kindly sharing it.

Prepare the training data:

python ./scripts/prepare_flyingthings3d.py --src_dir /path/to/your/flyingthings3d/data/root/

To launch the training, configure your config file inside ./motion_seg/configs/ and then run:
```
cd ./motion_seg/
python train_seq.py ./configs/your-config-file
cd ..
```

Applications

Motion Control for Video Generation: MotionCtrl, CamCo
Motion Evaluation for Video Generation: AnimateAnything, AC3D
Kinematic Control Annotation: EgoVid-5M

Citation

@inproceedings{zhao2022particlesfm,
      author    = {Zhao, Wang and Liu, Shaohui and Guo, Hengkai and Wang, Wenping and Liu, Yong-Jin},
      title     = {ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild},
      booktitle = {European Conference on Computer Vision (ECCV)},
      year      = {2022}
  }

Acknowledgements

This project would not have been possible without the great open-source work of COLMAP, Theia, hloc, RAFT, MiDaS and OANet. We sincerely thank them all.