SO-Det: A Cross-Layer Weighted Architecture with Channel-Optimized Downsampling and Enhanced Attention Fusion of Small Object Detector
Official implementation of SO-Det, a novel architecture for small object detection featuring:
- Cross-Layer Weighted Architecture (CLWA)
- Channel-Optimized Downsampling (CDown)
- Enhanced Attention Fusion (EAFusion)
A full runnable version will be released when the paper is published.
--
Detection (VisDrone)
See VisDrone Dataset for details about this 10-class dataset.
| Model | Size (pixels) | mAP<sup>val</sup> 50 | mAP<sup>val</sup> 50-95 | Precision | Recall | Params (M) | FLOPs (B) |
|---|---|---|---|---|---|---|---|
| SO-Det-s | 640 | 40.2 | 23.7 | 49.9 | 38.9 | 1.1 | 16.3 |
| SO-Det-m | 640 | 46.8 | 28.0 | 53.7 | 46.2 | 3.5 | 50.2 |
| SO-Det-l | 640 | 51.7 | 32.0 | 59.6 | 50.0 | 13.1 | 182.1 |
- Metrics measured on VisDrone val set with input resolution 640x640.
- Reproduce by `python val.py --data visdrone.yaml --weights so-det-s.pt --img 640`
Detection (TinyPerson)
See TinyPerson Dataset for details.
| Model | Size (pixels) | mAP<sup>val</sup> 50 | mAP<sup>val</sup> 50-95 | Precision | Recall | Params (M) | FLOPs (B) |
|---|---|---|---|---|---|---|---|
| SO-Det-s | 640 | 19.3 | 6.3 | 32.6 | 26.0 | 1.1 | 16.3 |
| SO-Det-m | 640 | 24.7 | 7.7 | 37.4 | 28.8 | 3.5 | 50.2 |
| SO-Det-l | 640 | 26.1 | 8.3 | 41.5 | 30.6 | 13.1 | 182.1 |
- Metrics measured on TinyPerson val set with input resolution 640x640.
- Reproduce by `python val.py --data tinyperson.yaml --weights so-det-s.pt --img 640`
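The reproduction commands above differ only in the dataset config and the weights file, so the full set of validation runs can be generated programmatically. The following is a minimal sketch, not part of the released code. It assumes that the `m` and `l` variants ship as `so-det-m.pt` and `so-det-l.pt` (only `so-det-s.pt` appears in this README) and that the same weights naming is used for both datasets. The helper names `build_val_command` and `all_val_commands` are hypothetical.

```python
from itertools import product

# Dataset configs named in this README.
DATASETS = ["visdrone.yaml", "tinyperson.yaml"]
# Model variants from the results tables. The "m" and "l" weight file
# names are an assumption extrapolated from "so-det-s.pt".
VARIANTS = ["s", "m", "l"]


def build_val_command(data, variant, img=640):
    """Return the val.py invocation as an argument list (hypothetical helper)."""
    return [
        "python", "val.py",
        "--data", data,
        "--weights", f"so-det-{variant}.pt",
        "--img", str(img),
    ]


def all_val_commands():
    """One command per (dataset, variant) pair, in table order."""
    return [build_val_command(d, v) for d, v in product(DATASETS, VARIANTS)]


if __name__ == "__main__":
    # Print the commands only; pass each list to subprocess.run to execute
    # them once val.py and the weights are available.
    for cmd in all_val_commands():
        print(" ".join(cmd))
```

Printing the commands rather than executing them keeps the sketch usable before the runnable version is released.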
--
