E2STR
The official implementation of E2STR: Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer (CVPR-2024) PDF
environment
Install MMOCR 1.0.0.
Install the remaining dependencies listed in requirements.txt.
data & model
Download Union14M-L from Union14M-L.
Download the MAE-pretrained ViT weights from MAERec.
Download OPT-125M.
Download all the test datasets (listed in Table 1 and Table 2) and update every data_root in configs/textrecog/base/datasets accordingly. We may upload these datasets later.
The 600k training set with character-wise annotations will be released later. The repository also runs without it: you can perform in-context training with the Transform Strategy only by setting ‘JSON FILE FOR CHARACTER-WISE POSITION INFORMATION’ to None. See also Table 4.
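A dataset config under configs/textrecog/base/datasets typically follows the MMOCR pattern sketched below; the dataset name and annotation file here are assumptions for illustration, only the data_root field matters:

```python
# Illustrative MMOCR-style dataset config; the dataset name and ann_file
# are assumptions. Point data_root at the directory where you unpacked
# the downloaded test set.
cute80_data_root = 'data/cute80'

cute80_textrecog_test = dict(
    type='OCRDataset',
    data_root=cute80_data_root,
    ann_file='textrecog_test.json',
    test_mode=True,
    pipeline=None)
```

Changing cute80_data_root is all that is needed when the data lives elsewhere on disk.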
train

stage1: vanilla STR training
Modify ‘MAE PRETRAIN WEIGHT PATH’ / ‘LM WEIGHT PATH’ / ‘CHECKPOINT SAVE PATH’ / ‘SAVE_NAME’ in configs/textrecog/icl_ocr/stage1.py, then run:
sh run_stage1.sh

stage2: in-context training
Modify ‘STAGE-1 WEIGHT PATH’ / ‘LM WEIGHT PATH’ / ‘JSON FILE FOR CHARACTER-WISE POSITION INFORMATION’ / ‘CHECKPOINT SAVE PATH’ / ‘SAVE_NAME’ in configs/textrecog/icl_ocr/stage2.py, then run:
sh run_stage2.sh
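The stage-2 placeholders map to plain Python values in the config. A sketch of what the filled-in values might look like; every variable name and path below is illustrative, so match them against the placeholder strings in the actual stage2.py:

```python
# Hypothetical filled-in placeholders for configs/textrecog/icl_ocr/stage2.py.
# The real variable names in the config may differ; locate them by the
# placeholder strings the instructions above ask you to replace.
stage1_weight = '/path/to/stage1/epoch_latest.pth'  # 'STAGE-1 WEIGHT PATH'
lm_weight = '/path/to/opt-125m'                     # 'LM WEIGHT PATH'
char_pos_json = None  # 'JSON FILE FOR CHARACTER-WISE POSITION INFORMATION'
                      # None => in-context training with Transform Strategy only
ckpt_dir = '/path/to/checkpoints'                   # 'CHECKPOINT SAVE PATH'
save_name = 'e2str_stage2'                          # 'SAVE_NAME'
```

Setting char_pos_json to None is the switch that lets stage 2 run without the 600k character-wise annotated data.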
evaluate

Construct the in-context pool (a JSON file) by randomly sampling data from any target training set. The JSON file should be structured as follows:
[
    {
        'img_path': ,
        'gt_text':
    }
]
Then modify ‘JSON FILE FOR IN-CONTEXT POOL’ in configs/textrecog/icl_ocr/stage2.py
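The in-context pool is a JSON list of {'img_path', 'gt_text'} records randomly sampled from a training set. A short script to build one; build_incontext_pool is a helper name of my own, not part of the repository:

```python
import json
import random

def build_incontext_pool(samples, pool_size, out_path, seed=0):
    """Randomly sample (img_path, gt_text) pairs from a training set
    into a JSON in-context pool."""
    rng = random.Random(seed)
    chosen = rng.sample(samples, k=min(pool_size, len(samples)))
    pool = [{'img_path': img, 'gt_text': text} for img, text in chosen]
    with open(out_path, 'w') as f:
        json.dump(pool, f, indent=2)
    return pool

# Example: build a 2-sample pool from a toy training list.
train_set = [('imgs/0001.jpg', 'HELLO'),
             ('imgs/0002.jpg', 'WORLD'),
             ('imgs/0003.jpg', 'TEXT')]
pool = build_incontext_pool(train_set, pool_size=2, out_path='icl_pool.json')
```

Point ‘JSON FILE FOR IN-CONTEXT POOL’ at the resulting file.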
Citation
If you find our models / code / papers useful in your research, please consider giving a star ⭐ and a citation 📝
@article{zhao2023multi,
  title={Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer},
  author={Zhao, Zhen and Huang, Can and Wu, Binghong and Lin, Chunhui and Liu, Hao and Zhang, Zhizhong and Tan, Xin and Tang, Jingqun and Xie, Yuan},
  journal={CVPR},
  year={2024}
}