Recently, a team from the Institute of Intelligent Machines, Hefei Institutes of Physical Science (HFIPS) of the Chinese Academy of Sciences (CAS), proposed a new artificial intelligence framework for target detection, offering a new solution for fast, high-precision, real-time online target detection.
Relevant work was published in Expert Systems With Applications.
In recent years, deep learning theory has driven the rapid development of artificial intelligence technology, and object detection based on deep learning has been successful in many industrial applications. Current research mainly improves either the speed or the accuracy of target detection, but fails to take both into account. How to achieve object detection that is both fast and accurate has become an important challenge in the field of artificial intelligence.
In this research, the team found that one of the main defects of target detection technology based on deep learning lay in the repeated feature extraction and fusion performed by deep network structures, which results in unnecessary computational cost.
Therefore, they proposed a multi-input single-output target recognition framework (MiSo). Unlike the traditional multi-input, multi-output model, it reduces both model complexity and inference time overhead.
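The multi-input single-output idea can be illustrated with a minimal sketch. This is not the team's encoder, whose details are not given here: the function names, the nearest-neighbour resizing, and the plain averaging used for fusion are all assumptions chosen only to show several feature maps of different resolutions being merged into one map on which detection would then run.

```python
from typing import List

FeatureMap = List[List[float]]


def resize_nearest(fmap: FeatureMap, out_h: int, out_w: int) -> FeatureMap:
    """Resize a 2-D feature map to (out_h, out_w) by nearest-neighbour sampling."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def fuse_multi_in_single_out(levels: List[FeatureMap], target: int) -> FeatureMap:
    """Fuse several feature levels into ONE map at the resolution of levels[target].

    Averaging is a stand-in for a learned fusion step (an assumption of this sketch).
    """
    out_h, out_w = len(levels[target]), len(levels[target][0])
    resized = [resize_nearest(f, out_h, out_w) for f in levels]
    return [
        [sum(f[r][c] for f in resized) / len(resized) for c in range(out_w)]
        for r in range(out_h)
    ]


# Three hypothetical backbone levels at 8x8, 4x4 and 2x2 resolution.
c3 = [[1.0] * 8 for _ in range(8)]
c4 = [[2.0] * 4 for _ in range(4)]
c5 = [[3.0] * 2 for _ in range(2)]

# Multiple inputs, a single 4x4 output map.
fused = fuse_multi_in_single_out([c3, c4, c5], target=1)
print(len(fused), len(fused[0]), fused[0][0])
```

Because only one fused map is produced, a detection head would run once rather than once per level, which is the source of the reduced inference cost described above.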
Furthermore, building on the eRF (effective receptive field) detection theory the team proposed earlier, they designed three new learning mechanisms under this framework to extract hot-spot feature information more accurately and efficiently: a receptive field adjustment mechanism, a residual attention self-learning mechanism, and an eRF-based dynamic balance sampling strategy.
"We named it M2YOLOF," said WANG Hongqiang, who led the team. "It detects objects on one feature map and performs well on small objects. It's as fast as YOLOF (You Only Look One-level Feature), but more accurate."
They tested it on a standard benchmark dataset, where it achieved 39.2 AP at a speed of 29 FPS, 2.6 AP higher than the existing state-of-the-art TridentNet-R50.
This method provides a new approach for both research on and industrial application of target detection.
The research was supported by the National Natural Science Foundation of China, an equipment development program of the Chinese Academy of Sciences, the Key Research and Development Plan of Anhui Province, and enterprise-commissioned development projects.
Figure 1: Network Structure Framework (Image by CHEN Kun)
Figure 2: Object Detection Examples (Image by CHEN Kun)
M2YOLOF: Based on effective receptive fields and multiple-in-single-out encoder for object detection