# PCT_Jittor

A Jittor implementation of Point Cloud Transformer (PCT) for ModelNet40 classification.

## Overview

This project implements a Point Cloud Transformer (PCT) model using the Jittor deep learning framework. The model is designed for 3D point cloud classification on the ModelNet40 dataset.

Given an input point cloud, the network extracts point-wise features, models global geometric relationships through self-attention layers, and predicts one of 40 object categories. This repository was created as part of a computer graphics programming assignment on point cloud classification with Jittor.
## Features

- Implements a PCT-based classification model in Jittor.
- Supports ModelNet40 point cloud classification.
- Includes a custom `ModelNet40Dataset` class for loading preprocessed `.npy` files.
- Randomly samples a fixed number of points from each point cloud.
- Applies simple training-time data augmentation through random rotation.
- Uses four self-attention layers to model global point relationships.
- Uses global max pooling to obtain a point cloud-level feature vector.
- Trains with cross-entropy loss.
- Uses an SGD optimizer with cosine annealing learning-rate scheduling.
- Exports model weights to `pct_model.pkl`.
- Writes test predictions to `result.json`.
## Project Structure

```
PCT_Jittor/
├── pct.py        # Main implementation file
├── README.md     # Project documentation
├── .gitignore    # Git ignore rules
└── data/         # Dataset folder, not tracked by Git
```

The following files may be generated during training or evaluation, but they are not intended to be tracked by Git:

```
pct_model.pkl     # Saved model weights
result.json       # Test prediction output
```
## Requirements

This project requires Python and Jittor.

Recommended environment:

- Python >= 3.11
- Jittor
- NumPy

Jittor may require a working C++ compiler such as `g++` or `clang`. A Linux or WSL environment is recommended.
## Installation

Clone the repository:

```
git clone https://gitlink.org.cn/excalibur_64/pct-jittor.git
cd pct-jittor
```
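A typical way to create a virtual environment and install the dependencies listed under Requirements (the exact package versions are not pinned by this README, so the names below are assumptions):

```shell
# Create and activate a virtual environment (Linux/WSL assumed)
python3 -m venv .venv
. .venv/bin/activate

# Install the dependencies listed under Requirements
pip install jittor numpy
```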
If Jittor compilation fails, make sure that the required compiler and build tools are installed on your system.
## Dataset

This project expects the ModelNet40 dataset to be stored in the `data/` directory. The dataset can be downloaded from here.

Expected file structure:

```
data/
├── train_points.npy    # Training point clouds
├── train_labels.npy    # Training labels
└── test_points.npy     # Test point clouds
```

The training point cloud file is expected to contain point cloud samples with a shape similar to:

```
(N, 2048, 3)
```

During training and testing, the program randomly samples a fixed number of points from each point cloud. The default number of sampled points is 1024.

The dataset is not included in this repository because dataset files are usually large and should not be pushed to Git.
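The random sampling described above, together with the random-rotation augmentation listed under Features, can be sketched in NumPy as follows. The function names are illustrative and may not match the helpers in `pct.py`, and the rotation axis (vertical) is an assumption of this sketch:

```python
import numpy as np

def sample_points(points, n_samples=1024, rng=None):
    # Randomly pick a fixed number of points from one (P, 3) cloud.
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(points.shape[0], size=n_samples, replace=False)
    return points[idx]

def random_rotate(points, rng=None):
    # Training-time augmentation: rotate the cloud by a random angle
    # about the z-axis (axis choice is an assumption of this sketch).
    rng = np.random.default_rng() if rng is None else rng
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return points @ rot.T

cloud = np.random.rand(2048, 3)                  # one sample from train_points.npy
sampled = random_rotate(sample_points(cloud))
print(sampled.shape)                             # (1024, 3)
```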
## Usage

Run the training and prediction pipeline with the default settings.
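A minimal invocation sketch, assuming `pct.py` (listed under Project Structure) is the entry point; the exact flags and defaults are not reproduced in this copy of the README:

```shell
# Train with default settings, then predict on the test set
python pct.py
```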
After training, the program saves the trained model as:

```
pct_model.pkl
```

It then performs prediction on the test set and saves the results as:

```
result.json
```

The `result.json` file stores a mapping from sample index to predicted class label, for example:

```json
{
  "0": 12,
  "1": 5,
  "2": 31
}
```
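For downstream use, a mapping in this format can be written and read back with the standard `json` module (the file name below is a stand-in, not the repository's actual output file):

```python
import json

# Write a small mapping in the same format as result.json
predictions = {"0": 12, "1": 5, "2": 31}   # sample index -> predicted class id
with open("result_example.json", "w") as f:
    json.dump(predictions, f)

# Read it back and look up one sample's predicted class
with open("result_example.json") as f:
    loaded = json.load(f)
print(loaded["0"])  # 12
```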
## Model Architecture

This project implements a Point Cloud Transformer (PCT) model for ModelNet40 classification. The model takes a point cloud as input and predicts one of 40 object categories.

The network consists of the following main components:

- Point-wise feature embedding: two `Conv1d` layers map the input 3D point coordinates into a higher-dimensional feature space.
- Self-attention layers: four self-attention layers model global relationships between points in the point cloud.
- Feature fusion: the outputs of the self-attention layers are concatenated and fused through another `Conv1d` layer.
- Global pooling: max pooling over the point dimension yields a global feature representation of the whole point cloud.
- Classification head: the global feature is passed through fully connected layers to produce logits for the 40 ModelNet40 classes.
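The per-layer attention computation can be illustrated in NumPy. This is plain scaled dot-product self-attention; the PCT paper's offset-attention variant normalizes differently, and the projection matrices here are random stand-ins rather than learned weights:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    # x: (N, C) point features; w_*: (C, C) projection matrices.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.T) / np.sqrt(k.shape[1])     # (N, N) point-to-point scores
    scores -= scores.max(axis=1, keepdims=True)  # numerically stable softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ v                              # (N, C) attended features

rng = np.random.default_rng(0)
x = rng.normal(size=(1024, 128))                 # 1024 points, 128-dim features
w = [rng.normal(size=(128, 128)) * 0.01 for _ in range(3)]
out = self_attention(x, *w)
print(out.shape)  # (1024, 128)
```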
## Training Details

The training loop uses:

- Loss function: cross-entropy loss
- Optimizer: SGD
- Momentum: 0.9
- Weight decay: 1e-4
- Scheduler: cosine annealing learning-rate scheduler

During each epoch, the model computes classification logits, evaluates the cross-entropy loss, updates the parameters, and reports training loss and accuracy.
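The cosine annealing schedule listed above can be written out explicitly. The values `lr_max=0.1` and `total_epochs=200` are illustrative, not the repository's actual hyperparameters:

```python
import math

def cosine_annealed_lr(epoch, total_epochs, lr_max=0.1, lr_min=0.0):
    # Decay the learning rate from lr_max down to lr_min along a half cosine.
    t = epoch / total_epochs
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t))

schedule = [cosine_annealed_lr(e, 200) for e in range(201)]
print(schedule[0], schedule[100], schedule[200])
```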
## Git Ignore Policy

This repository only tracks the essential source code and documentation files.

It also ignores all Python files by default, except:

- pct.py

This keeps the repository clean by excluding local environments, datasets, generated files, model weights, prediction outputs, images, and temporary files.
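A `.gitignore` matching the policy described above might look like this (the repository's actual file may differ):

```
# Ignore all Python files except the main implementation
*.py
!pct.py

# Datasets, model weights, and prediction outputs
data/
pct_model.pkl
result.json

# Local environments and temporary files
.venv/
__pycache__/
```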
## Notes

- The dataset should be placed under `data/` before running the program.
- The current implementation uses CUDA by default through `jt.flags.use_cuda = 1`. If CUDA is unavailable, modify the code accordingly before running on the CPU.
- Model weights and generated prediction files are intentionally not tracked by Git.
- The repository is intended to provide a clean open-source version of the main implementation file and README documentation.
## License

This project is licensed under the MIT License. See the LICENSE file for details.

## Acknowledgements

This project is based on the Point Cloud Transformer architecture and uses the Jittor deep learning framework for point cloud classification on ModelNet40.