import jittor as jt
from jittor_models.conv2former import conv2former_n
# create a Conv2Former model with 1000 classes
model = conv2former_n(num_classes=1000)
Training
bash distributed_train.sh 8 $DATA_DIR --model $MODEL -b $BS --lr $LR --drop-path $DROP_PATH
# DATA_DIR: path to the dataset
# MODEL: name of the model
# BS: batch size
# LR: learning rate
# DROP_PATH: drop path rate
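The --drop-path flag sets the stochastic-depth rate: during training, each residual branch is skipped for a random subset of samples in the batch, and the surviving samples are rescaled so the expected value is unchanged. A minimal NumPy sketch of the idea (an illustrative re-implementation, not the repo's code):

```python
import numpy as np

def drop_path(x, drop_rate, rng, training=True):
    """Stochastic depth: zero out the residual branch for whole samples
    and rescale the survivors to preserve the expected value."""
    if not training or drop_rate == 0.0:
        return x
    keep_prob = 1.0 - drop_rate
    # One Bernoulli draw per sample, broadcast over all other dims (B, 1, 1, 1).
    mask = rng.binomial(1, keep_prob, size=(x.shape[0],) + (1,) * (x.ndim - 1))
    return x * mask / keep_prob

rng = np.random.default_rng(0)
x = np.ones((4, 3, 2, 2))          # a batch of 4 "residual branch" outputs
out = drop_path(x, drop_rate=0.5, rng=rng)
# Each sample is either zeroed or scaled by 1 / keep_prob = 2.0.
```

At evaluation time the function is the identity, so no rescaling is needed at inference.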
Validation
python validate.py $DATA_DIR --model $MODEL --checkpoint $CHECKPOINT
# DATA_DIR: path to the dataset
# MODEL: name of the model
# CHECKPOINT: path to the saved checkpoint
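validate.py reports top-1 (and top-5) accuracy over the validation set. Conceptually, the metric reduces to checking whether the true label appears among the k largest logits per sample, as in this standalone sketch (not the repo's code):

```python
import numpy as np

def top_k_accuracy(logits, labels, k=1):
    """Fraction of samples whose true label is among the k highest logits."""
    topk = np.argsort(logits, axis=1)[:, -k:]      # indices of the k largest logits
    hits = (topk == labels[:, None]).any(axis=1)   # did any of them match the label?
    return hits.mean()

logits = np.array([[0.1, 0.7, 0.2],
                   [0.9, 0.05, 0.05],
                   [0.2, 0.3, 0.5]])
labels = np.array([1, 0, 1])
acc1 = top_k_accuracy(logits, labels, k=1)  # 2 of 3 top-1 predictions are correct
```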
If you find this work or code helpful in your research, please cite:
@article{hou2024conv2former,
title={Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition},
author={Hou, Qibin and Lu, Cheng-Ze and Cheng, Ming-Ming and Feng, Jiashi},
journal={IEEE TPAMI},
year={2024},
doi={10.1109/TPAMI.2024.3401450},
}
Reference
You may want to cite:
@inproceedings{liu2022convnet,
title={A ConvNet for the 2020s},
author={Zhuang Liu and Hanzi Mao and Chao-Yuan Wu and Christoph Feichtenhofer and Trevor Darrell and Saining Xie},
booktitle={CVPR},
year={2022}
}
@inproceedings{liu2021swin,
title={Swin transformer: Hierarchical vision transformer using shifted windows},
author={Liu, Ze and Lin, Yutong and Cao, Yue and Hu, Han and Wei, Yixuan and Zhang, Zheng and Lin, Stephen and Guo, Baining},
booktitle={ICCV},
year={2021}
}
@inproceedings{tan2021efficientnetv2,
title={Efficientnetv2: Smaller models and faster training},
author={Tan, Mingxing and Le, Quoc},
booktitle={ICML},
pages={10096--10106},
year={2021},
organization={PMLR}
}
@misc{focalmnet,
author = {Yang, Jianwei and Li, Chunyuan and Gao, Jianfeng},
title = {Focal Modulation Networks},
publisher = {arXiv},
year = {2022},
}
@article{dai2021coatnet,
title={Coatnet: Marrying convolution and attention for all data sizes},
author={Dai, Zihang and Liu, Hanxiao and Le, Quoc and Tan, Mingxing},
journal={NeurIPS},
volume={34},
year={2021}
}
@inproceedings{replknet,
author = {Ding, Xiaohan and Zhang, Xiangyu and Zhou, Yizhuang and Han, Jungong and Ding, Guiguang and Sun, Jian},
title = {Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs},
booktitle={CVPR},
year = {2022},
}
Conv2Former
The official implementation of the paper “Conv2Former: A Simple Transformer-Style ConvNet for Visual Recognition”. Our code is based on timm and ConvNeXt.
Our paper is accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).
Usage
Requirements
Examples
Input image should be normalized as follows:
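The exact transform is not reproduced here; timm/ConvNeXt pipelines typically use the standard ImageNet channel statistics, so assuming those (check the repo's data config for the authoritative values), the normalization looks like:

```python
import numpy as np

# Standard ImageNet statistics (assumed; timm's default for most ConvNets).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def normalize(image):
    """Normalize an HxWxC float image in [0, 1] channel-wise."""
    return (image - IMAGENET_MEAN) / IMAGENET_STD

img = np.full((2, 2, 3), 0.5)  # uniform gray test image
out = normalize(img)
```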
We also provide the Jittor version of the model:
Training
Validation
Results
Training on ImageNet-1k
Pre-training on ImageNet-22k and Fine-tuning on ImageNet-1k
Citation
Reference