[AAAI 2026] Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection
If you find our work helpful, please consider giving us a ⭐!
Official implementation of “Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection”.
We also provide our configs in https://github.com/zcablii/LSKNet and https://github.com/YXB-NKU/Strip-R-CNN.
If you encounter any issues while using our code, please check the relevant repositories for related issues first. If none exist, feel free to open an issue!
Update [22/02/2025]
A Jittor implementation is available at github.com/NK-JittorCV/nk-remote.
Blog (in Chinese)
Abstract
Although remote sensing object detection has witnessed rapid development, detecting objects with high aspect ratios remains challenging. This paper shows that large strip convolutions are good feature representation learners for remote sensing object detection and can detect objects of various aspect ratios well. Based on large strip convolutions, we build a new network architecture called Strip R-CNN, which is simple, efficient, and powerful. Unlike recent remote sensing object detectors that leverage large-kernel convolutions with square shapes, our Strip R-CNN takes advantage of sequential orthogonal large strip convolutions to capture spatial information. In addition, we enhance the localization capability of remote sensing object detectors by decoupling the detection heads and equipping the localization head with strip convolutions to better localize target objects. Extensive experiments on several benchmarks, e.g., DOTA, FAIR1M, HRSC2016, and DIOR, show that Strip R-CNN greatly improves over previous work. In particular, our 30M-parameter model achieves 82.75% mAP on DOTA-v1.0, setting a new state-of-the-art record.
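To make the core idea concrete, the sequential orthogonal large strip convolutions described above can be sketched in PyTorch as a 1 x k horizontal strip convolution followed by a k x 1 vertical one, approximating a large k x k receptive field at a fraction of the cost. This is only an illustrative sketch: the kernel size, depthwise grouping, and attention-style modulation are assumptions, not the paper's exact module.

```python
import torch
import torch.nn as nn

class StripBlock(nn.Module):
    """Sketch of sequential orthogonal large strip convolutions.

    A horizontal (1 x k) depthwise conv followed by a vertical (k x 1)
    depthwise conv covers a k x k region with only 2k weights per channel
    instead of k*k. Kernel size k=19 is an assumption for illustration.
    """

    def __init__(self, channels: int, k: int = 19):
        super().__init__()
        # Depthwise strip convs; padding keeps the spatial size unchanged.
        self.conv_h = nn.Conv2d(channels, channels, (1, k),
                                padding=(0, k // 2), groups=channels)
        self.conv_v = nn.Conv2d(channels, channels, (k, 1),
                                padding=(k // 2, 0), groups=channels)
        self.proj = nn.Conv2d(channels, channels, 1)  # channel mixing

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.proj(self.conv_v(self.conv_h(x)))
        return x * attn  # attention-style modulation, common in large-kernel designs

x = torch.randn(1, 32, 64, 64)
y = StripBlock(32)(x)
print(y.shape)  # torch.Size([1, 32, 64, 64])
```

The elongated kernels make the receptive field anisotropic, which is what helps with high-aspect-ratio objects such as ships or bridges.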
Introduction
This repository is the official implementation of “Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection” (paper on arXiv).
The master branch is built on MMRotate, which works with PyTorch 1.6+.
The StripNet backbone code is placed under mmrotate/models/backbones/, and the train/test configuration files are placed under configs/strip_rcnn/.
Results and models
ImageNet 300-epoch pre-trained Strip R-CNN-T backbone: Download
ImageNet 300-epoch pre-trained Strip R-CNN-S backbone: Download
Please note that the Exponential Moving Average (EMA) strategy was not utilized during the ImageNet pretraining stage.

DOTA-v1.0
FAIR1M-1.0

| Model | mAP | Angle | lr schd | Batch Size | Config | Download |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Strip R-CNN-S | 48.26 | le90 | 1x | 1\*8 | strip_rcnn_s_fpn_1x_dota_le90 | model |
HRSC2016
Installation
MMRotate depends on PyTorch, MMCV, and MMDetection. Below are quick steps for installation. Please refer to the Install Guide for more detailed instructions.
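For an MMRotate-based repository, the quick steps typically look like the following sketch; the environment name is just an example, and exact version pins should be taken from the Install Guide:

```shell
# Create and activate a fresh environment (name is an example).
conda create -n strip_rcnn python=3.8 -y
conda activate strip_rcnn

# Install PyTorch first, choosing the build that matches your CUDA driver.

# Install MMCV and MMDetection via OpenMMLab's package manager.
pip install -U openmim
mim install mmcv-full
mim install mmdet

# Clone this repository and install it in editable mode.
git clone https://github.com/YXB-NKU/Strip-R-CNN.git
cd Strip-R-CNN
pip install -v -e .
```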
Get Started
Please see get_started.md for the basic usage of MMRotate. We provide a Colab tutorial as well as other tutorials.
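Training and testing follow the standard MMRotate entry points. The commands below are a sketch using the config name from the table above; the checkpoint path under work_dirs/ is a placeholder:

```shell
# Single-GPU training with a Strip R-CNN config.
python tools/train.py configs/strip_rcnn/strip_rcnn_s_fpn_1x_dota_le90.py

# Multi-GPU training (here: 8 GPUs).
bash tools/dist_train.sh configs/strip_rcnn/strip_rcnn_s_fpn_1x_dota_le90.py 8

# Evaluate a trained checkpoint (path is a placeholder).
python tools/test.py configs/strip_rcnn/strip_rcnn_s_fpn_1x_dota_le90.py \
    work_dirs/strip_rcnn_s_fpn_1x_dota_le90/latest.pth --eval mAP
```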
Acknowledgement
MMRotate is an open-source project contributed to by researchers and engineers from various colleges and companies. We appreciate all the contributors who implement their methods or add new features, as well as users who provide valuable feedback. We hope the toolbox and benchmark serve the growing research community by providing a flexible toolkit for reimplementing existing methods and developing new ones.
Citation
If you use this toolbox or benchmark in your research, please cite this project:

@article{yuan2025strip,
  title={Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection},
  author={Yuan, Xinbin and Zheng, ZhaoHui and Li, Yuxuan and Liu, Xialei and Liu, Li and Li, Xiang and Hou, Qibin and Cheng, Ming-Ming},
  journal={arXiv preprint arXiv:2501.03775},
  year={2025}
}

If you like our work, don't hesitate to reach out! Let's work together and see how far it can go!
Star History
License
Licensed under the Creative Commons Attribution-NonCommercial 4.0 International license, for non-commercial use only. Any commercial use requires formal permission in advance.