Combined Skeleton and Skin Prediction Model

This project implements a combined model that predicts both skeleton joint locations and skinning weights from 3D point cloud data. The model uses a shared transformer backbone and conditions skin weight prediction on the predicted skeleton, so both tasks can be trained end to end.

Features

  • Combined Learning: Jointly trains for skeleton prediction and skin weight estimation.
  • Shared Backbone: Employs a shared Point_Transformer2 as a feature extractor for both tasks.
  • Dependent Skin Prediction: Skin weight prediction leverages the predicted skeleton joints, ensuring consistency.
  • Flexible Prediction Modes: Supports predicting only skeleton, only skin, or both.
  • Evaluation Metrics: Includes Mean Squared Error (MSE), L1 Loss, and Joint-to-Joint (J2J) distance for evaluation.
  • Visualization: Provides utilities for rendering and visualizing predicted skeletons, point clouds, and skin weights.

Model Architecture

The core of the model is the CombinedSkeletonSkinModel defined in models/combined.py.

  • shared_transformer: A Point_Transformer2 (from PCT.networks.cls.pct) extracts shared features from the input vertices.
  • skeleton_head: A multi-layer perceptron (MLP) branch takes the shared features and predicts the flattened 3D coordinates of the skeleton joints.
  • joint_mlp and vertex_mlp: joint_mlp takes the predicted joint locations and vertex_mlp takes the input vertices. Each input is concatenated with the shared features to produce latent representations for joints and vertices, respectively.
  • Skin Weight Calculation: Skin weights are computed by performing a scaled dot-product attention-like operation between the vertices_latent and joints_latent, followed by a softmax activation to ensure weights sum to 1 for each vertex across all joints.
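
The skin weight step above can be sketched in plain Python. This is a minimal illustration, not the code in models/combined.py: the function name skin_weights, the 1/sqrt(dim) scaling factor, and the toy latent vectors are assumptions made for the example.

```python
import math

def skin_weights(vertices_latent, joints_latent):
    """Softmax over joints of scaled dot products (illustrative sketch)."""
    dim = len(joints_latent[0])
    scale = 1.0 / math.sqrt(dim)  # assumed scaling, as in standard attention
    weights = []
    for v in vertices_latent:
        # One score per joint: scaled dot product of vertex and joint latents.
        scores = [scale * sum(a * b for a, b in zip(v, j)) for j in joints_latent]
        # Subtract the max before exponentiating for numerical stability.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Toy data: 3 vertices, 2 joints, latent dimension 2.
vertices_latent = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
joints_latent = [[2.0, 0.0], [0.0, 2.0]]
w = skin_weights(vertices_latent, joints_latent)
```

Each row of w holds one vertex's weights over all joints and sums to 1, which is the property the softmax is there to guarantee.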

Getting Started

Prerequisites

  • Jittor (0.1.18 or newer recommended)
  • NumPy
  • SciPy
  • tqdm

You can install Jittor by following the instructions on its official GitHub page or by running:

pip install jittor

Other dependencies can be installed via pip:

pip install numpy scipy tqdm

Data Preparation

The model expects data to be organized in a specific structure. The data_root argument points to the root directory containing your dataset. Data lists (e.g., train_data_list.txt, predict_data_list.txt) should specify the paths to individual data samples relative to data_root.

Each data sample should ideally contain:

  • vertices: Point cloud data.
  • joints: Ground truth skeleton joint locations (for training).
  • skin: Ground truth skinning weights (for training).
  • origin_vertices: Original untransformed vertices (for prediction output).

The dataset/dataset.py and dataset/sampler.py scripts handle data loading and sampling.
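
As a rough sketch of how a data list relates to data_root, the snippet below resolves each entry against the root directory. It assumes one relative path per line and skips blank lines; the helper name resolve_data_list and the sample paths are hypothetical, and the real loading logic lives in dataset/dataset.py.

```python
import os

def resolve_data_list(list_text, data_root):
    """Turn the contents of a data list file into paths under data_root."""
    paths = []
    for line in list_text.splitlines():
        rel = line.strip()
        if not rel:
            continue  # ignore blank lines
        paths.append(os.path.join(data_root, rel))
    return paths

# Hypothetical list contents; real entries depend on your dataset layout.
example_list = "class_a/0001.npz\n\nclass_b/0002.npz\n"
paths = resolve_data_list(example_list, "/path/to/your/data")
```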

Training

To train the combined model, use the launch/train_combined.sh script.

./launch/train_combined.sh

Before running, ensure you edit launch/train_combined.sh to configure your desired training parameters, such as data paths, output directory, epochs, batch size, learning rate, and loss weights.

Example parameters you might set in the script:

  • --train_data_list data/train_list.txt
  • --val_data_list data/val_list.txt
  • --data_root /path/to/your/data
  • --output_dir output/combined_training
  • --epochs 500
  • --batch_size 16
  • --learning_rate 1e-4
  • --optimizer adamw
  • --skeleton_weight 1.0
  • --skin_weight 1.0
  • --feat_dim 256
  • --apply_rotation False
  • --apply_z_scaling False

Training Arguments (configured in launch/train_combined.sh):

  • --train_data_list (required): Path to the list file containing training data samples.
  • --val_data_list: Path to the list file containing validation data samples. (Optional)
  • --data_root: Root directory of your dataset.
  • --output_dir: Directory to save models, logs, and visualizations.
  • --epochs: Number of training epochs.
  • --batch_size: Batch size for training.
  • --learning_rate: Initial learning rate.
  • --optimizer: Optimizer to use (sgd, adam, adamw).
  • --skeleton_weight: Weight for the skeleton loss in the total loss.
  • --skin_weight: Weight for the skin loss in the total loss.
  • --feat_dim: Feature dimension for the shared transformer.
  • --pretrained_model: Path to a pre-trained model checkpoint to resume training or fine-tune.
  • --apply_z_scaling: Whether to apply z-axis scaling during data augmentation.
  • --apply_rotation: Whether to apply random rotations during data augmentation.
  • --print_freq: How often to print training progress (batches).
  • --save_freq: How often to save model checkpoints (epochs).
  • --val_freq: How often to run validation (epochs).
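
The role of --skeleton_weight and --skin_weight can be illustrated with a small sketch. It assumes the total loss is a weighted sum of the two task losses, with MSE for the skeleton and L1 for the skin; that pairing and the function names are assumptions for the example, so check train_combined.py for the loss terms actually used.

```python
def mse(pred, target):
    """Mean squared error over flat lists of numbers."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def l1(pred, target):
    """Mean absolute error over flat lists of numbers."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def combined_loss(skeleton_loss, skin_loss, skeleton_weight=1.0, skin_weight=1.0):
    """Weighted sum of the two task losses (assumed form of the total loss)."""
    return skeleton_weight * skeleton_loss + skin_weight * skin_loss

# Toy values: flattened joint coordinates and one vertex's skin weights.
skeleton_loss = mse([0.0, 1.0, 2.0], [0.0, 1.0, 4.0])
skin_loss = l1([0.5, 0.5], [1.0, 0.0])
total = combined_loss(skeleton_loss, skin_loss, skeleton_weight=1.0, skin_weight=2.0)
```

Raising one weight relative to the other shifts the optimization toward that task without changing the model.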

Prediction

To make predictions using a trained model, use the launch/predict_combined.sh script.

./launch/predict_combined.sh

Before running, ensure you edit launch/predict_combined.sh to configure your prediction parameters, including data paths, the path to your trained model, output directory, and prediction mode.

Example parameters you might set in the script:

  • --predict_data_list data/predict_list.txt
  • --data_root /path/to/your/data
  • --pretrained_model output/combined_training/best_combined_model.pkl
  • --predict_output_dir prediction_results
  • --mode both
  • --feat_dim 256

Prediction Arguments (configured in launch/predict_combined.sh):

  • --predict_data_list (required): Path to the list file containing data samples for prediction.
  • --data_root: Root directory of your dataset.
  • --pretrained_model (required): Path to the pre-trained model checkpoint for prediction.
  • --predict_output_dir (required): Directory to save prediction results.
  • --mode: Prediction mode (skeleton, skin, or both). Determines what outputs are saved.
  • --feat_dim: Feature dimension used during training (must match the trained model).
  • --batch_size: Batch size for prediction. Note: Currently must be 1 due to unpadded origin_vertices.

Output

Training Output:

  • training_log.txt: A log file detailing training progress, losses, and validation results.
  • best_skeleton_model.pkl: Model checkpoint with the best skeleton (J2J) loss on the validation set.
  • best_skin_model.pkl: Model checkpoint with the best skin (L1) loss on the validation set.
  • best_combined_model.pkl: Model checkpoint with the best combined loss on the validation set.
  • checkpoint_epoch_X.pkl: Periodic model checkpoints.
  • final_combined_model.pkl: The model saved at the end of training.
  • tmp/combined/epoch_X/: Directory containing visualization outputs (skeleton renderings, point clouds, skin heatmaps) during validation.

Prediction Output:

For each sample in your predict_data_list, a directory will be created under predict_output_dir (e.g., prediction_results/<class_id>/<sample_id>). This directory will contain:

  • predict_skeleton.npy: Predicted skeleton joint locations (NumPy array, if mode is ‘skeleton’ or ‘both’).
  • predict_skin.npy: Resampled skinning weights for the original vertices (NumPy array, if mode is ‘skin’ or ‘both’).
  • transformed_vertices.npy: The original (untransformed) vertices used for prediction, despite the file name (NumPy array).
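
The mapping from prediction mode to output files can be summarized in a short helper. The function expected_outputs is hypothetical and exists only to illustrate the layout described above. It also assumes transformed_vertices.npy is written in every mode, since the list does not attach a condition to it.

```python
import os

def expected_outputs(predict_output_dir, class_id, sample_id, mode):
    """List the files one sample should produce for a given mode (sketch)."""
    if mode not in ("skeleton", "skin", "both"):
        raise ValueError(f"unknown mode: {mode}")
    sample_dir = os.path.join(predict_output_dir, class_id, sample_id)
    names = ["transformed_vertices.npy"]  # assumed to be saved in every mode
    if mode in ("skeleton", "both"):
        names.append("predict_skeleton.npy")
    if mode in ("skin", "both"):
        names.append("predict_skin.npy")
    return [os.path.join(sample_dir, name) for name in names]

# Hypothetical class and sample identifiers.
skin_only = expected_outputs("prediction_results", "class_a", "0001", "skin")
everything = expected_outputs("prediction_results", "class_a", "0001", "both")
```

Once a run has finished, each of these files can be read back with numpy.load.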

Code Structure

  • launch/train_combined.sh: Shell script to execute the training process.
  • launch/predict_combined.sh: Shell script to execute the prediction process.
  • train_combined.py: Python script containing the main training logic.
  • predict_combined.py: Python script containing the main prediction logic.
  • models/combined.py: Defines the CombinedSkeletonSkinModel architecture.
  • dataset/dataset.py: Handles data loading and batching.
  • dataset/sampler.py: Defines the SamplerMix for data sampling.
  • dataset/asset.py: Likely contains asset-related utilities (e.g., for loading specific data formats).
  • dataset/format.py: Defines id_to_name and parents for skeleton structure.
  • dataset/exporter.py: Contains utilities for exporting and visualizing results.
  • models/metrics.py: Defines evaluation metrics like J2J.
  • models/basics.py: Contains basic neural network modules like MLP.
  • models/transformers.py: Might contain custom transformer implementations, if any.
  • PCT/networks/cls/pct.py: Contains the Point_Transformer and Point_Transformer2 used as the backbone.

Notes

  • Batch Size in Prediction: The predict_combined.py script currently forces batch_size=1 during prediction because origin_vertices is not padded, so point clouds with different vertex counts cannot be stacked into a single batch.
  • Jittor Flags: jt.flags.use_cuda = 1 is set to ensure GPU utilization if CUDA is available.
  • Random Seeds: seed_all(123) is used to ensure reproducibility.
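
To make the batch size note concrete, the sketch below shows why unpadded point clouds of different lengths cannot be stacked, and what padding with a validity mask might look like. Neither helper exists in this repository. Both are hypothetical illustrations of the constraint, not a description of planned behavior.

```python
def can_stack(point_clouds):
    """True if every point cloud has the same number of vertices."""
    return len({len(pc) for pc in point_clouds}) <= 1

def pad_point_clouds(point_clouds, pad_value=(0.0, 0.0, 0.0)):
    """Pad each cloud to the longest length and return a validity mask."""
    max_len = max(len(pc) for pc in point_clouds)
    padded = [list(pc) + [pad_value] * (max_len - len(pc)) for pc in point_clouds]
    mask = [[True] * len(pc) + [False] * (max_len - len(pc)) for pc in point_clouds]
    return padded, mask

# Two toy point clouds with 2 and 3 vertices.
cloud_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
cloud_b = [(0.0, 1.0, 0.0), (0.0, 2.0, 0.0), (0.0, 3.0, 0.0)]
stackable_before = can_stack([cloud_a, cloud_b])
padded, mask = pad_point_clouds([cloud_a, cloud_b])
stackable_after = can_stack(padded)
```

With batch_size=1 there is only one cloud per batch, so the length mismatch never arises and no mask is needed.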