PA3_oyx - Conditional GAN (CGAN) Implementation
This project implements a Conditional Generative Adversarial Network (CGAN) using the Jittor deep learning framework. The CGAN is trained on the MNIST dataset to generate handwritten digit images conditioned on specific class labels.
Overview
Conditional GANs extend the original GAN architecture by conditioning both the generator and discriminator on additional information (class labels). This allows for controlled generation of samples from specific classes.
Features
- Generator: Takes random noise and class labels as input to generate realistic digit images
- Discriminator: Classifies images as real/fake while considering class labels
- Conditional Training: Both networks are trained with class label information
- MNIST Dataset: Trained on 32x32 grayscale handwritten digit images
- GPU Support: Automatically uses CUDA if available (see the snippet after this list)
- Model Persistence: Saves trained models every 10 epochs
- Sample Generation: Generates sample images during training
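The CUDA check mentioned above typically follows the standard Jittor pattern shown here (a sketch; CGAN.py may word it slightly differently):

```python
import jittor as jt

# Enable GPU execution when CUDA is available; otherwise Jittor runs on CPU.
if jt.has_cuda:
    jt.flags.use_cuda = 1
```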
Requirements
- jittor
- numpy
- PIL (Pillow)
- argparse (Python standard library; no install needed)
Installation
Install Jittor:
pip install jittor
Install other dependencies:
pip install pillow numpy
Usage
Basic Training
python CGAN.py
Custom Parameters
python CGAN.py --n_epochs 200 --batch_size 128 --lr 0.0001
Available Arguments
- `--n_epochs`: Number of training epochs (default: 100)
- `--batch_size`: Batch size for training (default: 64)
- `--lr`: Learning rate for the Adam optimizer (default: 0.0002)
- `--b1`: Beta1 parameter for Adam (default: 0.5)
- `--b2`: Beta2 parameter for Adam (default: 0.999)
- `--n_cpu`: Number of CPU threads (default: 8)
- `--latent_dim`: Dimensionality of the latent space (default: 100)
- `--n_classes`: Number of classes (default: 10)
- `--img_size`: Size of generated images (default: 32)
- `--channels`: Number of image channels (default: 1)
- `--sample_interval`: Interval (in batches) for saving sample images (default: 1000)
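An argparse setup matching the defaults above might look like this (a sketch; the actual parser in CGAN.py may differ in help strings or ordering):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--n_epochs", type=int, default=100, help="number of training epochs")
parser.add_argument("--batch_size", type=int, default=64, help="batch size")
parser.add_argument("--lr", type=float, default=0.0002, help="Adam learning rate")
parser.add_argument("--b1", type=float, default=0.5, help="Adam beta1")
parser.add_argument("--b2", type=float, default=0.999, help="Adam beta2")
parser.add_argument("--n_cpu", type=int, default=8, help="number of CPU threads")
parser.add_argument("--latent_dim", type=int, default=100, help="latent space dimensionality")
parser.add_argument("--n_classes", type=int, default=10, help="number of classes")
parser.add_argument("--img_size", type=int, default=32, help="size of generated images")
parser.add_argument("--channels", type=int, default=1, help="number of image channels")
parser.add_argument("--sample_interval", type=int, default=1000, help="batches between sample saves")
opt = parser.parse_args()
```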
Model Architecture
Generator
- Input: Random noise (100D) + Class label embedding (10D)
- Architecture:
  - 4 fully connected layers (110→128→256→512→1024)
  - Batch normalization and LeakyReLU activations
  - Final linear layer mapping to the image dimensions (1024 = 32×32)
- Output: 32x32 grayscale images with Tanh activation
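A minimal Jittor sketch of this generator, assuming standard `jittor.nn` layers (the exact block structure in CGAN.py may differ):

```python
import jittor as jt
from jittor import nn

class Generator(nn.Module):
    # Label embedding (10D) concatenated with noise (100D),
    # four FC blocks, then a linear layer to 1024 pixels with Tanh.
    def __init__(self, latent_dim=100, n_classes=10, img_size=32):
        super().__init__()
        self.label_emb = nn.Embedding(n_classes, n_classes)
        self.img_size = img_size

        def block(in_feat, out_feat, normalize=True):
            layers = [nn.Linear(in_feat, out_feat)]
            if normalize:
                layers.append(nn.BatchNorm1d(out_feat))
            layers.append(nn.LeakyReLU(0.2))
            return layers

        self.model = nn.Sequential(
            *block(latent_dim + n_classes, 128, normalize=False),
            *block(128, 256),
            *block(256, 512),
            *block(512, 1024),
            nn.Linear(1024, img_size * img_size),
            nn.Tanh(),
        )

    def execute(self, noise, labels):
        # Condition the noise on the class label by concatenation.
        x = jt.concat([self.label_emb(labels), noise], dim=1)
        return self.model(x).reshape(-1, 1, self.img_size, self.img_size)
```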
Discriminator
- Input: Flattened image (1024D) + Class label embedding (10D)
- Architecture:
  - 4 fully connected layers with dropout (0.4) and LeakyReLU activations
  - Input dimension: 1034 (1024 + 10)
  - Hidden layers: 512 neurons each
- Output: Single real/fake probability
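A matching discriminator sketch under the same assumptions (flattened image plus label embedding, 1034D input, 512-unit hidden layers):

```python
import jittor as jt
from jittor import nn

class Discriminator(nn.Module):
    # Flattened image (1024D) + label embedding (10D) -> 1034D input,
    # dropout-regularized FC stack ending in a single real/fake score.
    def __init__(self, n_classes=10, img_size=32):
        super().__init__()
        self.label_emb = nn.Embedding(n_classes, n_classes)
        in_dim = img_size * img_size + n_classes  # 1034

        self.model = nn.Sequential(
            nn.Linear(in_dim, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 512),
            nn.Dropout(0.4),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 512),
            nn.Dropout(0.4),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 1),  # single real/fake output
        )

    def execute(self, img, labels):
        flat = img.reshape(img.shape[0], -1)
        x = jt.concat([flat, self.label_emb(labels)], dim=1)
        return self.model(x)
```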
Training Process
- Generator Training: Learn to generate realistic images that fool the discriminator
- Discriminator Training: Learn to distinguish between real and generated images
- Conditional Loss: Both networks consider class label information
- Adversarial Loss: Mean Squared Error between discriminator predictions and the real/fake targets
The training alternates between:
- Training generator to maximize discriminator error
- Training discriminator to correctly classify real vs fake images
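A condensed sketch of one alternating step in Jittor, reusing the Generator and Discriminator sketches above (variable names are illustrative):

```python
import jittor as jt
from jittor import nn

adversarial_loss = nn.MSELoss()
generator, discriminator = Generator(), Discriminator()
opt_G = nn.Adam(generator.parameters(), lr=0.0002, betas=(0.5, 0.999))
opt_D = nn.Adam(discriminator.parameters(), lr=0.0002, betas=(0.5, 0.999))

def train_step(real_imgs, labels, latent_dim=100, n_classes=10):
    batch = real_imgs.shape[0]
    valid = jt.ones((batch, 1))   # target for real images
    fake = jt.zeros((batch, 1))   # target for generated images

    # Generator step: make generated images score as "real".
    z = jt.randn(batch, latent_dim)
    gen_labels = jt.randint(0, n_classes, (batch,))
    gen_imgs = generator(z, gen_labels)
    g_loss = adversarial_loss(discriminator(gen_imgs, gen_labels), valid)
    opt_G.step(g_loss)  # Jittor: backward + parameter update in one call

    # Discriminator step: separate real from fake.
    real_loss = adversarial_loss(discriminator(real_imgs, labels), valid)
    fake_loss = adversarial_loss(discriminator(gen_imgs.detach(), gen_labels), fake)
    d_loss = (real_loss + fake_loss) / 2
    opt_D.step(d_loss)
    return g_loss, d_loss
```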
Output Files
Generated During Training
- `{batch_number}.png`: Sample images generated every `sample_interval` batches
- Images are arranged in a 10x10 grid showing digits 0-9
Model Checkpoints
- `generator_last.pkl`: Saved generator model (every 10 epochs)
- `discriminator_last.pkl`: Saved discriminator model (every 10 epochs)
Final Output
- `result.png`: Final generated image sequence for the specified number string
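One way `result.png` could be produced from a trained generator is sketched below; the function name and layout are hypothetical, not taken from CGAN.py:

```python
import numpy as np
import jittor as jt
from PIL import Image

def generate_sequence(generator, digits="2023010788", latent_dim=100):
    # One noise vector per digit, conditioned on that digit's class label.
    labels = jt.array([int(d) for d in digits])
    z = jt.randn(len(digits), latent_dim)
    imgs = generator(z, labels).numpy()              # (N, 1, 32, 32), Tanh range [-1, 1]
    imgs = ((imgs + 1) / 2 * 255).astype(np.uint8)   # rescale to [0, 255]
    strip = np.concatenate([im[0] for im in imgs], axis=1)  # digits side by side
    Image.fromarray(strip).save("result.png")
```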
Project Structure
PA3_oyx/
├── CGAN.py # Main implementation file
├── README.md # This file
├── .gitignore # Git ignore rules
├── generator_last.pkl # Trained generator model
├── discriminator_last.pkl # Trained discriminator model
├── result.png # Final generated sequence
├── *.png # Training sample images
└── .git/ # Git repository
Key Implementation Details
Label Embedding
Both generator and discriminator use embedding layers to convert class labels into dense vectors that are concatenated with image data.
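In Jittor this amounts to an `nn.Embedding` lookup followed by `jt.concat`, as in this sketch:

```python
import jittor as jt
from jittor import nn

# A 10-class embedding maps each integer label to a 10D dense vector,
# which is concatenated with the noise (generator) or flattened image
# (discriminator).
label_emb = nn.Embedding(10, 10)
labels = jt.array([3, 7])            # two sample class labels
z = jt.randn(2, 100)                 # noise batch
gen_input = jt.concat([label_emb(labels), z], dim=1)  # shape (2, 110)
```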
Loss Function
Uses Mean Squared Error (MSE) loss for the adversarial training:
- Generator loss: MSE between discriminator output and “real” labels
- Discriminator loss: Average of MSE for real images (target=1) and fake images (target=0)
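In equation form, the bullets above correspond to a least-squares-GAN-style objective (a reconstruction from the description, with x a real image, y its class label, and z the noise vector):

```latex
L_D = \tfrac{1}{2}\,\mathrm{MSE}\big(D(x, y),\, 1\big) + \tfrac{1}{2}\,\mathrm{MSE}\big(D(G(z, y), y),\, 0\big)
L_G = \mathrm{MSE}\big(D(G(z, y), y),\, 1\big)
```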
Training Schedule
- Models are saved every 10 epochs
- Sample images are generated every 1000 batches
- Training progress is printed every 50 batches
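The checkpointing described above maps onto Jittor's built-in `Module.save`/`Module.load`; a sketch using the filenames listed earlier:

```python
def maybe_checkpoint(epoch, generator, discriminator):
    # Save both models every 10 epochs, overwriting the "*_last.pkl" files.
    if (epoch + 1) % 10 == 0:
        generator.save("generator_last.pkl")
        discriminator.save("discriminator_last.pkl")

# Later, to resume training or run inference:
# generator.load("generator_last.pkl")
```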
Results
The trained model generates digit images conditioned on a specified class label, so you can choose which digit (0-9) to produce. The final output demonstrates this by generating images for the sequence "2023010788".
Student Information
- Student ID: 2023010788
- Project: PA3 Assignment
- Framework: Jittor
- Model: Conditional GAN
License
This project is for educational purposes as part of PA3 assignment.