# PCT_Jittor

A Jittor implementation of Point Cloud Transformer (PCT) for ModelNet40 classification.

## Overview

This project implements a Point Cloud Transformer (PCT) model using the Jittor deep learning framework. The model is designed for 3D point cloud classification on the ModelNet40 dataset.

Given an input point cloud, the network extracts point-wise features, models global geometric relationships through self-attention layers, and predicts one of 40 object categories. This repository was created as part of a computer graphics programming assignment on point cloud classification with Jittor.
## Features

- Implements a PCT-based classification model in Jittor.
- Supports ModelNet40 point cloud classification.
- Includes a custom `ModelNet40Dataset` class for loading preprocessed `.npy` files.
- Randomly samples a fixed number of points from each point cloud.
- Applies simple training-time data augmentation through random rotation.
- Uses four self-attention layers to model global point relationships.
- Uses global max pooling to obtain a point cloud-level feature vector.
- Trains with cross-entropy loss.
- Uses an SGD optimizer with cosine annealing learning-rate scheduling.
- Exports model weights to `pct_model.pkl`.
- Writes test predictions to `result.json`.
## Project Structure

```
PCT_Jittor/
├── pct.py        # Main implementation file
├── README.md     # Project documentation
├── .gitignore    # Git ignore rules
└── data/         # Dataset folder, not tracked by Git
```

The following files may be generated during training or evaluation, but they are not intended to be tracked by Git:

```
pct_model.pkl     # Saved model weights
result.json       # Test prediction output
```
## Requirements

This project requires Python and Jittor.

Recommended environment:

- Python >= 3.11
- Jittor
- NumPy

Jittor may require a working C++ compiler such as `g++` or `clang`. A Linux or WSL environment is recommended.
## Installation

Clone the repository:

```
git clone https://gitlink.org.cn/excalibur_64/pct-jittor.git
cd pct-jittor
```
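A typical way to create a virtual environment and install the dependencies listed under Requirements (the exact package versions are not pinned by this README, so the names below are assumptions):

```shell
# Create and activate a virtual environment (Linux/WSL assumed)
python3 -m venv .venv
. .venv/bin/activate

# Install the dependencies listed under Requirements
pip install jittor numpy
```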
If Jittor compilation fails, make sure that the required compiler and build tools are installed on your system.
## Dataset

This project expects the ModelNet40 dataset to be stored in the `data/` directory. The dataset can be downloaded from here.

Expected file structure:

```
data/
├── train_points.npy    # Training point clouds
├── train_labels.npy    # Training labels
└── test_points.npy     # Test point clouds
```

The training point cloud file is expected to contain point cloud samples with a shape similar to:

```
(N, 2048, 3)
```

During training and testing, the program randomly samples a fixed number of points from each point cloud. The default number of sampled points is 1024.

The dataset is not included in this repository because dataset files are usually large and should not be pushed to Git.
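The random sampling described above, together with the random-rotation augmentation listed under Features, can be sketched in NumPy as follows. The function names are illustrative and may not match the helpers in `pct.py`, and the rotation axis (vertical) is an assumption of this sketch:

```python
import numpy as np

def sample_points(points, n_samples=1024, rng=None):
    # Randomly pick a fixed number of points from one (P, 3) cloud.
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(points.shape[0], size=n_samples, replace=False)
    return points[idx]

def random_rotate(points, rng=None):
    # Training-time augmentation: rotate the cloud by a random angle
    # about the z-axis (axis choice is an assumption of this sketch).
    rng = np.random.default_rng() if rng is None else rng
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return points @ rot.T

cloud = np.random.rand(2048, 3)                  # one sample from train_points.npy
sampled = random_rotate(sample_points(cloud))
print(sampled.shape)                             # (1024, 3)
```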
## Usage

Run the training and prediction pipeline with the default settings.
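A minimal invocation sketch, assuming `pct.py` (listed under Project Structure) is the entry point; the exact flags and defaults are not reproduced in this copy of the README:

```shell
# Train with default settings, then predict on the test set
python pct.py
```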
After training, the program saves the trained model as:

```
pct_model.pkl
```

It then performs prediction on the test set and saves the results as:

```
result.json
```

The `result.json` file stores a mapping from sample index to predicted class label, for example:

```json
{
  "0": 12,
  "1": 5,
  "2": 31
}
```
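For downstream use, a mapping in this format can be written and read back with the standard `json` module (the file name below is a stand-in, not the repository's actual output file):

```python
import json

# Write a small mapping in the same format as result.json
predictions = {"0": 12, "1": 5, "2": 31}   # sample index -> predicted class id
with open("result_example.json", "w") as f:
    json.dump(predictions, f)

# Read it back and look up one sample's predicted class
with open("result_example.json") as f:
    loaded = json.load(f)
print(loaded["0"])  # 12
```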
## Model Architecture

This project implements a Point Cloud Transformer (PCT) model for ModelNet40 classification. The model takes a point cloud as input and predicts one of 40 object categories.

The network consists of the following main components:

- Point-wise feature embedding: two `Conv1d` layers map the input 3D point coordinates into a higher-dimensional feature space.
- Self-attention layers: four self-attention layers model global relationships between points in the point cloud.
- Feature fusion: the outputs of the self-attention layers are concatenated and fused through another `Conv1d` layer.
- Global pooling: max pooling over the point dimension yields a global feature representation of the whole point cloud.
- Classification head: the global feature is passed through fully connected layers to produce logits for the 40 ModelNet40 classes.
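The per-layer attention computation can be illustrated in NumPy. This is plain scaled dot-product self-attention; the PCT paper's offset-attention variant normalizes differently, and the projection matrices here are random stand-ins rather than learned weights:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    # x: (N, C) point features; w_*: (C, C) projection matrices.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.T) / np.sqrt(k.shape[1])     # (N, N) point-to-point scores
    scores -= scores.max(axis=1, keepdims=True)  # numerically stable softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ v                              # (N, C) attended features

rng = np.random.default_rng(0)
x = rng.normal(size=(1024, 128))                 # 1024 points, 128-dim features
w = [rng.normal(size=(128, 128)) * 0.01 for _ in range(3)]
out = self_attention(x, *w)
print(out.shape)  # (1024, 128)
```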
## Training Details

The training loop uses:

- Loss function: cross-entropy loss
- Optimizer: SGD
- Momentum: 0.9
- Weight decay: 1e-4
- Scheduler: cosine annealing learning-rate scheduler

During each epoch, the model computes classification logits, evaluates the cross-entropy loss, updates the parameters, and reports training loss and accuracy.
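The cosine annealing schedule listed above can be written out explicitly. The values `lr_max=0.1` and `total_epochs=200` are illustrative, not the repository's actual hyperparameters:

```python
import math

def cosine_annealed_lr(epoch, total_epochs, lr_max=0.1, lr_min=0.0):
    # Decay the learning rate from lr_max down to lr_min along a half cosine.
    t = epoch / total_epochs
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * t))

schedule = [cosine_annealed_lr(e, 200) for e in range(201)]
print(schedule[0], schedule[100], schedule[200])
```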
## Git Ignore Policy

This repository only tracks the essential source code and documentation files.

It also ignores all Python files by default, except:

- pct.py

This keeps the repository clean by excluding local environments, datasets, generated files, model weights, prediction outputs, images, and temporary files.
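A `.gitignore` matching the policy described above might look like this (the repository's actual file may differ):

```
# Ignore all Python files except the main implementation
*.py
!pct.py

# Datasets, model weights, and prediction outputs
data/
pct_model.pkl
result.json

# Local environments and temporary files
.venv/
__pycache__/
```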
## Notes

- The dataset should be placed under `data/` before running the program.
- The current implementation uses CUDA by default through `jt.flags.use_cuda = 1`. If CUDA is unavailable, modify the code accordingly before running on the CPU.
- Model weights and generated prediction files are intentionally not tracked by Git.
- The repository is intended to provide a clean open-source version of the main implementation file and README documentation.
## License

This project is licensed under the MIT License. See the LICENSE file for details.

## Acknowledgements

This project is based on the Point Cloud Transformer architecture and uses the Jittor deep learning framework for point cloud classification on ModelNet40.