Combined Skeleton and Skin Prediction Model
This project implements a combined model that predicts both skeleton joint locations and skinning weights from 3D point cloud data. The model uses a shared transformer backbone and conditions the skinning weights on the predicted skeleton, so both tasks can be trained end to end.
Features
Combined Learning: Jointly trains for skeleton prediction and skin weight estimation.
Shared Backbone: Employs a shared Point_Transformer2 as a feature extractor for both tasks.
Flexible Prediction Modes: Supports predicting only skeleton, only skin, or both.
Evaluation Metrics: Includes Mean Squared Error (MSE), L1 Loss, and Joint-to-Joint (J2J) distance for evaluation.
Visualization: Provides utilities for rendering and visualizing predicted skeletons, point clouds, and skin weights.
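The evaluation metrics listed above can be sketched in NumPy as follows. MSE and L1 are standard; the J2J definition shown here (a symmetric nearest-joint distance) is an assumption for illustration, and the function names are hypothetical. The project's own definition lives in models/metrics.py and may differ.

```python
import numpy as np

def mse(pred, target):
    """Mean squared error over all elements."""
    return float(np.mean((pred - target) ** 2))

def l1(pred, target):
    """Mean absolute error over all elements."""
    return float(np.mean(np.abs(pred - target)))

def j2j(joints_a, joints_b):
    """Assumed J2J: symmetric mean nearest-joint distance.

    joints_a has shape (Ja, 3) and joints_b has shape (Jb, 3).
    """
    # Pairwise Euclidean distances, shape (Ja, Jb).
    diff = joints_a[:, None, :] - joints_b[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    a_to_b = dist.min(axis=1).mean()  # each joint in A to its nearest in B
    b_to_a = dist.min(axis=0).mean()  # each joint in B to its nearest in A
    return float((a_to_b + b_to_a) / 2.0)
```

All three functions return 0.0 when the prediction equals the target.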
Model Architecture
The core of the model is the CombinedSkeletonSkinModel defined in models/combined.py.
shared_transformer: A Point_Transformer2 (from PCT.networks.cls.pct) extracts shared features from the input vertices.
skeleton_head: A multi-layer perceptron (MLP) branch takes the shared features and predicts the flattened 3D coordinates of the skeleton joints.
joint_mlp and vertex_mlp: joint_mlp takes the predicted joint locations and vertex_mlp takes the input vertices, each concatenated with the shared features, to produce latent representations for joints and vertices, respectively.
Skin Weight Calculation: Skin weights are computed by performing a scaled dot-product attention-like operation between the vertices_latent and joints_latent, followed by a softmax activation to ensure weights sum to 1 for each vertex across all joints.
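The skin weight step described above can be sketched in NumPy. This is a minimal illustration, not the code in models/combined.py: the function name and the scaling by the square root of the latent dimension are assumptions based on the usual scaled dot-product form.

```python
import numpy as np

def skin_weights(vertices_latent, joints_latent):
    """Scaled dot-product scores followed by a softmax over joints.

    vertices_latent: (N, D) latent vectors, one per vertex.
    joints_latent:   (J, D) latent vectors, one per joint.
    Returns an (N, J) array whose rows sum to 1.
    """
    d = vertices_latent.shape[-1]
    scores = vertices_latent @ joints_latent.T / np.sqrt(d)
    # Subtract the row maximum for numerical stability before exponentiating.
    scores = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(scores)
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
w = skin_weights(rng.normal(size=(5, 8)), rng.normal(size=(3, 8)))
print(w.shape)  # one row per vertex, one column per joint
```

Because the softmax runs over the joint axis, every vertex receives a non-negative weight for each joint and the weights for that vertex sum to 1.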
Getting Started
Prerequisites
Jittor (0.1.18 or newer recommended)
NumPy
SciPy
tqdm
You can install Jittor by following the instructions on their official GitHub page or by running:
pip install jittor
Other dependencies can be installed via pip:
pip install numpy scipy tqdm
Data Preparation
The model expects data to be organized in a specific structure. The data_root argument points to the root directory containing your dataset. Data lists (e.g., train_data_list.txt, predict_data_list.txt) should specify the paths to individual data samples relative to data_root.
Each data sample should ideally contain:
vertices: Point cloud data.
joints: Ground truth skeleton joint locations (for training).
skin: Ground truth skinning weights (for training).
origin_vertices: Original untransformed vertices (for prediction output).
The dataset/dataset.py and dataset/sampler.py scripts handle data loading and sampling.
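As a rough illustration of how a data list relates to data_root, the sketch below resolves each entry against the root directory. The helper name read_data_list, the one-path-per-line layout, and the skipping of blank lines are assumptions; the actual loading logic is in dataset/dataset.py.

```python
from pathlib import Path

def read_data_list(list_file, data_root):
    """Return the full path of every sample named in list_file.

    Assumes one relative path per line; blank lines are skipped.
    """
    root = Path(data_root)
    paths = []
    with open(list_file, "r", encoding="utf-8") as f:
        for line in f:
            entry = line.strip()
            if entry:
                paths.append(root / entry)
    return paths
```

For example, a line reading some_class/some_sample in the list file would resolve to data_root/some_class/some_sample.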
Training
To train the combined model, use the launch/train_combined.sh script.
./launch/train_combined.sh
Before running, ensure you edit launch/train_combined.sh to configure your desired training parameters, such as data paths, output directory, epochs, batch size, learning rate, and loss weights.
Example parameters you might set in the script:
--train_data_list data/train_list.txt
--val_data_list data/val_list.txt
--data_root /path/to/your/data
--output_dir output/combined_training
--epochs 500
--batch_size 16
--learning_rate 1e-4
--optimizer adamw
--skeleton_weight 1.0
--skin_weight 1.0
--feat_dim 256
--apply_rotation False
--apply_z_scaling False
Training Arguments (configured in launch/train_combined.sh):
--train_data_list (required): Path to the list file containing training data samples.
--val_data_list: Path to the list file containing validation data samples. (Optional)
--data_root: Root directory of your dataset.
--output_dir: Directory to save models, logs, and visualizations.
--epochs: Number of training epochs.
--batch_size: Batch size for training.
--learning_rate: Initial learning rate.
--optimizer: Optimizer to use (sgd, adam, adamw).
--skeleton_weight: Weight for the skeleton loss in the total loss.
--skin_weight: Weight for the skin loss in the total loss.
--feat_dim: Feature dimension for the shared transformer.
--pretrained_model: Path to a pre-trained model checkpoint to resume training or fine-tune.
--apply_z_scaling: Whether to apply z-axis scaling during data augmentation.
--apply_rotation: Whether to apply random rotations during data augmentation.
--print_freq: How often to print training progress (batches).
--save_freq: How often to save model checkpoints (epochs).
--val_freq: How often to run validation (epochs).
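The sketch below shows how a subset of these arguments could be parsed with argparse and how the two loss weights might combine into a total loss. The defaults are copied from the example values above, and the helpers str2bool, build_parser, and total_loss are hypothetical; train_combined.py may define its arguments differently.

```python
import argparse

def str2bool(value):
    """Parse 'True'/'False' style strings, as used by --apply_rotation False."""
    if value.lower() in ("true", "1", "yes"):
        return True
    if value.lower() in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")

def build_parser():
    p = argparse.ArgumentParser(description="Combined skeleton/skin training (sketch)")
    p.add_argument("--train_data_list", required=True)
    p.add_argument("--val_data_list", default=None)
    p.add_argument("--epochs", type=int, default=500)
    p.add_argument("--batch_size", type=int, default=16)
    p.add_argument("--learning_rate", type=float, default=1e-4)
    p.add_argument("--optimizer", choices=["sgd", "adam", "adamw"], default="adamw")
    p.add_argument("--skeleton_weight", type=float, default=1.0)
    p.add_argument("--skin_weight", type=float, default=1.0)
    p.add_argument("--apply_rotation", type=str2bool, default=False)
    p.add_argument("--apply_z_scaling", type=str2bool, default=False)
    return p

def total_loss(skeleton_loss, skin_loss, args):
    """Weighted sum of the two task losses (assumed form of the total loss)."""
    return args.skeleton_weight * skeleton_loss + args.skin_weight * skin_loss

args = build_parser().parse_args(
    ["--train_data_list", "data/train_list.txt", "--skin_weight", "0.5", "--apply_rotation", "False"]
)
print(total_loss(2.0, 4.0, args))
```

Using a converter such as str2bool matters for flags written as --apply_rotation False: with type=bool, argparse would treat the non-empty string "False" as true.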
Prediction
To make predictions using a trained model, use the launch/predict_combined.sh script.
./launch/predict_combined.sh
Before running, ensure you edit launch/predict_combined.sh to configure your prediction parameters, including data paths, the path to your trained model, output directory, and prediction mode.
Prediction Arguments (configured in launch/predict_combined.sh):
--predict_data_list (required): Path to the list file containing data samples for prediction.
--data_root: Root directory of your dataset.
--pretrained_model (required): Path to the pre-trained model checkpoint for prediction.
--predict_output_dir (required): Directory to save prediction results.
--mode: Prediction mode (skeleton, skin, or both). Determines what outputs are saved.
--feat_dim: Feature dimension used during training (must match the trained model).
--batch_size: Batch size for prediction. Note: Currently must be 1 due to unpadded origin_vertices.
Output
Training Output:
training_log.txt: A log file detailing training progress, losses, and validation results.
best_skeleton_model.pkl: Model checkpoint with the best skeleton (J2J) loss on the validation set.
best_skin_model.pkl: Model checkpoint with the best skin (L1) loss on the validation set.
best_combined_model.pkl: Model checkpoint with the best combined loss on the validation set.
checkpoint_epoch_X.pkl: Periodic model checkpoints.
final_combined_model.pkl: The model saved at the end of training.
tmp/combined/epoch_X/: Directory containing visualization outputs (skeleton renderings, point clouds, skin heatmaps) during validation.
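One way to decide which of the best_*.pkl files to write after a validation pass is sketched below. The metric keys and the update_best helper are hypothetical, and lower values are assumed to be better for all three metrics; the actual bookkeeping in train_combined.py may differ.

```python
BEST_FILES = {
    "skeleton": "best_skeleton_model.pkl",  # validation J2J
    "skin": "best_skin_model.pkl",          # validation L1
    "combined": "best_combined_model.pkl",  # validation combined loss
}

def update_best(best, val_metrics):
    """Update the running best values and return the checkpoint files to save."""
    to_save = []
    for key, filename in BEST_FILES.items():
        if val_metrics[key] < best.get(key, float("inf")):
            best[key] = val_metrics[key]
            to_save.append(filename)
    return to_save

best = {}
first = update_best(best, {"skeleton": 0.5, "skin": 0.2, "combined": 0.7})
second = update_best(best, {"skeleton": 0.6, "skin": 0.1, "combined": 0.7})
print(first, second)
```

In this sketch a tie does not trigger a new save, so each checkpoint is only overwritten when its metric strictly improves.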
Prediction Output:
For each sample in your predict_data_list, a directory will be created under predict_output_dir (e.g., prediction_results/<class_id>/<sample_id>). This directory will contain:
predict_skeleton.npy: Predicted skeleton joint locations (NumPy array, if mode is ‘skeleton’ or ‘both’).
predict_skin.npy: Resampled skinning weights for the original vertices (NumPy array, if mode is ‘skin’ or ‘both’).
transformed_vertices.npy: The original (untransformed) vertices used for prediction (NumPy array).
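The README does not say how skin weights are resampled onto the original vertices. A common approach, shown here purely as an assumption, is to copy each original vertex's weights from its nearest sampled point. The function name resample_skin is hypothetical.

```python
import numpy as np

def resample_skin(sampled_points, sampled_skin, origin_vertices):
    """Nearest-neighbor transfer of skin weights (assumed method).

    sampled_points:  (N, 3) points the model was evaluated on.
    sampled_skin:    (N, J) predicted weights for those points.
    origin_vertices: (V, 3) original mesh vertices.
    Returns a (V, J) array of weights for the original vertices.
    """
    # Squared distances between every original vertex and every sampled point.
    diff = origin_vertices[:, None, :] - sampled_points[None, :, :]
    nearest = np.argmin(np.sum(diff ** 2, axis=-1), axis=1)
    return sampled_skin[nearest]

points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
skin = np.array([[1.0, 0.0], [0.0, 1.0]])
origin = np.array([[0.1, 0.0, 0.0], [0.9, 0.0, 0.0], [0.2, 0.0, 0.0]])
resampled = resample_skin(points, skin, origin)
```

The dense distance matrix costs O(V * N) memory; for large meshes a spatial index such as scipy.spatial.cKDTree would be a more economical choice.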
Code Structure
launch/train_combined.sh: Shell script to execute the training process.
launch/predict_combined.sh: Shell script to execute the prediction process.
train_combined.py: Python script containing the main training logic.
predict_combined.py: Python script containing the main prediction logic.
models/combined.py: Defines the CombinedSkeletonSkinModel architecture.
dataset/dataset.py: Handles data loading and batching.
dataset/sampler.py: Defines the SamplerMix for data sampling.
dataset/asset.py: Likely contains asset-related utilities (e.g., for loading specific data formats).
dataset/format.py: Defines id_to_name and parents for skeleton structure.
dataset/exporter.py: Contains utilities for exporting and visualizing results.
models/metrics.py: Defines evaluation metrics like J2J.
models/basics.py: Contains basic neural network modules like MLP.
models/transformers.py: Might contain custom transformer implementations, if any.
PCT/networks/cls/pct.py: Contains the Point_Transformer and Point_Transformer2 used as the backbone.
Notes
Batch Size in Prediction: The predict_combined.py script currently forces batch_size=1 during prediction because origin_vertices are not padded, which would cause issues with variable-length point clouds in a batch.
Jittor Flags: jt.flags.use_cuda = 1 is set to ensure GPU utilization if CUDA is available.
Random Seeds: seed_all(123) is used to ensure reproducibility.
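The batch size restriction can be illustrated with NumPy: point clouds with different vertex counts cannot be stacked into one array. The pad_batch helper below is a hypothetical sketch of the padding that would be needed to lift the restriction; it is not part of predict_combined.py.

```python
import numpy as np

def pad_batch(clouds):
    """Zero-pad variable-length (Ni, 3) clouds to a common length.

    Returns the padded (B, max_len, 3) batch and a boolean mask that marks
    which rows hold real vertices.
    """
    max_len = max(c.shape[0] for c in clouds)
    batch = np.zeros((len(clouds), max_len, 3), dtype=clouds[0].dtype)
    mask = np.zeros((len(clouds), max_len), dtype=bool)
    for i, c in enumerate(clouds):
        batch[i, : c.shape[0]] = c
        mask[i, : c.shape[0]] = True
    return batch, mask

clouds = [np.ones((4, 3)), np.ones((6, 3))]
try:
    np.stack(clouds)  # fails: the two clouds have different vertex counts
except ValueError as err:
    print("cannot stack:", err)

batch, mask = pad_batch(clouds)
print(batch.shape, mask.sum(axis=1))
```

With a mask like this, per-sample outputs could be trimmed back to their true length after a batched forward pass.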
Example parameters you might set in launch/predict_combined.sh:
--predict_data_list data/predict_list.txt
--data_root /path/to/your/data
--pretrained_model output/combined_training/best_combined_model.pkl
--predict_output_dir prediction_results
--mode both
--feat_dim 256