Note that we use [space] to represent the masked characters. For customized datasets, you may modify the code in src/training/data.py and adjust your data annotations accordingly.
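As a rough illustration of this masking scheme, masked characters in a word transcription can be replaced by the [space] placeholder. The helper below is hypothetical — the actual annotation format is defined in src/training/data.py:

```python
import random

MASK_TOKEN = "[space]"  # placeholder for masked characters, per the note above


def mask_characters(word, mask_indices):
    """Return the character sequence of `word` with the characters at
    `mask_indices` replaced by MASK_TOKEN (hypothetical helper)."""
    masked = set(mask_indices)
    return [MASK_TOKEN if i in masked else ch for i, ch in enumerate(word)]


def random_mask(word, n_mask=1, rng=None):
    """Randomly pick n_mask distinct character positions to mask."""
    rng = rng or random.Random(0)
    idx = rng.sample(range(len(word)), k=min(n_mask, len(word)))
    return mask_characters(word, idx)


print(mask_characters("STREET", [2]))
# ['S', 'T', '[space]', 'E', 'E', 'T']
```

The real data loader may mask characters differently (e.g. per-word sampling ratios); this only shows how the [space] token stands in for hidden characters.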
We provide a script for converting the model parameter names so that the pre-trained model can be used in the dev-1.x branch of MMOCR:
# first modify the model_path and save_path in tools/convert2mmocr.py
python tools/convert2mmocr.py
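Conceptually, the conversion renames the keys of the pre-trained checkpoint's state dict to the names the downstream model expects. The prefix mapping below is a hypothetical sketch, not the actual table used by tools/convert2mmocr.py:

```python
# Hypothetical prefix mapping; the real one lives in tools/convert2mmocr.py.
PREFIX_MAP = {
    "visual.": "backbone.",  # image-encoder weights -> detector backbone (assumed)
}


def convert_state_dict(state_dict, prefix_map=None):
    """Rename parameter keys by prefix; keys with no matching prefix are kept."""
    prefix_map = prefix_map or PREFIX_MAP
    converted = {}
    for key, value in state_dict.items():
        new_key = key
        for old, new in prefix_map.items():
            if key.startswith(old):
                new_key = new + key[len(old):]
                break
        converted[new_key] = value
    return converted


ckpt = {"visual.conv1.weight": 0, "logit_scale": 1}
print(convert_state_dict(ckpt))
# {'backbone.conv1.weight': 0, 'logit_scale': 1}
```

In practice the script would load the checkpoint with torch, apply a mapping like this, and save the result to save_path for MMOCR to load.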
Citation
@inproceedings{xue2022language,
title={Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting},
author={Xue, Chuhui and Zhang, Wenqing and Hao, Yu and Lu, Shijian and Torr, Philip and Bai, Song},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
year={2022}
}
oCLIP
This repository is the official implementation for the following paper:
Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting
Chuhui Xue, Wenqing Zhang, Yu Hao, Shijian Lu, Philip Torr, Song Bai, ECCV 2022 (Oral)
Part of the code is adapted from open_clip.
Models
Training oCLIP
Conda
Data
Download SynthText and place it in ./data.
You may use the provided script to generate the annotations for pre-training:
Train
Sample running code for training:
Visualization
We also provide a script for visualizing the attention maps of the pre-trained model.
Download the pre-trained model to ./pretrained.
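The visualization script overlays per-patch attention scores on the input image. A minimal sketch of the normalization step is shown below, assuming a flat list of attention scores over an image patch grid (function name and grid layout are illustrative, not the repository's actual API):

```python
def attention_to_heatmap(attn, grid_h, grid_w):
    """Reshape a flat list of patch attention scores into a grid_h x grid_w
    map and min-max normalize it to [0, 1] for overlaying on the image."""
    assert len(attn) == grid_h * grid_w, "scores must cover the full patch grid"
    lo, hi = min(attn), max(attn)
    scale = (hi - lo) or 1.0  # avoid division by zero for uniform attention
    norm = [(a - lo) / scale for a in attn]
    return [norm[r * grid_w:(r + 1) * grid_w] for r in range(grid_h)]


heat = attention_to_heatmap([0.1, 0.3, 0.2, 0.9], 2, 2)
print([[round(v, 3) for v in row] for row in heat])
# [[0.0, 0.25], [0.125, 1.0]]
```

The normalized grid would then be resized to the image resolution and blended with the input as a heatmap.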
Fine-tune in MMOCR