EMPIRICAL ANALYSIS OF RGB-IR FEATURE FUSION FOR UAV-BASED OBJECT DETECTION
DOI: https://doi.org/10.56651/lqdtu.jst.v14.n01.1047.ict

Keywords: UAV-based object detection, feature-level fusion, deep learning, RGB-IR fusion

Abstract
Object detection based on Unmanned Aerial Vehicles (UAVs) plays a crucial role in applications such as surveillance, disaster management, and military operations. However, traditional methods relying solely on visible Red-Green-Blue (RGB) imagery often perform poorly under low-light conditions and occlusions. To overcome these challenges, recent studies have explored the fusion of RGB and infrared (IR) images, leveraging their complementary properties. Among various fusion strategies, feature-level fusion has gained increasing attention due to its flexibility and superior performance compared to pixel-level and decision-level approaches. Despite its potential, the impact of the specific stage within the network where modality-specific features are integrated remains insufficiently investigated. This study focuses on feature-level fusion and conducts a comprehensive empirical analysis within a unified dual-stream detection framework to examine how fusion at different depths (early, middle, and late) affects detection performance. Additionally, we evaluate multi-position fusion schemes by combining features from multiple levels. Experimental results on the DroneVehicle dataset reveal that middle fusion achieves the best balance between detection accuracy and efficiency among single-layer fusion configurations. Furthermore, early-middle multi-position fusion further improves localization precision, albeit with moderate computational overhead. These findings offer practical insights into designing more effective and efficient RGB-IR fusion networks for UAV-based object detection systems.
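To make the notion of a fusion position concrete, the following is a minimal, framework-free sketch of a dual-stream pipeline in which the two modality streams are merged after a chosen backbone stage. All names (`stage`, `dual_stream_detect`, the stage tags) are illustrative stand-ins, not the paper's actual architecture, and list concatenation stands in for the real feature-fusion operator:

```python
# Illustrative sketch only: stage functions, names, and the concatenation
# "fusion" are hypothetical stand-ins for the paper's dual-stream network.

def stage(features, tag):
    # Stand-in for one backbone stage: record which stage processed the input.
    return features + [tag]

def dual_stream_detect(rgb, ir, fuse_at, num_stages=3):
    """Run two modality streams and fuse them (here: list concatenation)
    after stage index `fuse_at` (0 = early, 1 = middle, 2 = late)."""
    for i in range(num_stages):
        rgb = stage(rgb, f"rgb_s{i}")
        ir = stage(ir, f"ir_s{i}")
        if i == fuse_at:
            fused = rgb + ir  # fusion point: merge modality-specific features
            # After the fusion point, a single shared branch continues.
            for j in range(i + 1, num_stages):
                fused = stage(fused, f"shared_s{j}")
            return fused
    # If no fusion position inside the backbone, merge only at the end
    # (analogous to decision-level fusion).
    return rgb + ir
```

For example, `fuse_at=1` (middle fusion) runs two modality-specific stages per stream, concatenates, and finishes with one shared stage, while `fuse_at=0` (early fusion) shares all but the first stage. The sketch shows why deeper fusion positions keep more modality-specific computation and earlier ones share more of the network.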
