106 IEEE LATIN AMERICA TRANSACTIONS, VOL. 22, NO. 2, FEBRUARY 2024
DyFusion: Cross-Attention 3D Object Detection
with Dynamic Fusion
Jiangfeng Bi , Haiyue Wei , Guoxin Zhang , Kuihe Yang , and Ziying Song
Abstract—In the realm of autonomous driving, LiDAR and
camera sensors play an indispensable role, furnishing pivotal
observational data for the critical task of precise 3D object
detection. Existing fusion algorithms effectively utilize the
complementary data from both sensors. However, these methods
typically concatenate the raw point cloud data with pixel-level
image features, a process that unfortunately introduces errors
and loses critical information embedded in each
modality. To mitigate the problem of lost feature information,
this paper proposes a Cross-Attention Dynamic Fusion (CADF)
strategy that dynamically fuses the two heterogeneous data
sources. In addition, we acknowledge the issue of insufficient
data augmentation for these two diverse modalities. To combat
this, we propose a Synchronous Data Augmentation (SDA)
strategy designed to enhance training efficiency. We have tested
our method using the KITTI and nuScenes datasets, and the
results have been promising. Remarkably, our top-performing
model attained an 82.52% mAP on the KITTI test benchmark,
outperforming other state-of-the-art methods.
Link to graphical and video abstracts, and to code:
https://latamt.ieeer9.org/index.php/transactions/article/view/8434
Index Terms—Cross-Attention Dynamic Fusion, Synchronous
Data Augmentation, 3D object detection
I. INTRODUCTION
Against the backdrop of thriving autonomous driving ad-
vances, 3D object detection has arisen as an imperative
task to equip unmanned vehicles with precise environmental
cognition [1]. Pioneers in 3D object detection have carried out
significant research, demonstrating excellent performance on
public datasets such as KITTI [2] and nuScenes [3]. As a
pioneering example, Qi et al. [4] devised PointNet, an
innovative deep neural architecture that directly learns global
features from point cloud data. Zhou et al. [5] proposed
VoxelNet with a Voxel Feature Encoder, which transforms
raw point clouds into voxel-wise features containing spatial
and physical information. These pioneering methods have laid
a strong foundation and provided inspiration for subsequent
research in the realm of 3D object detection.
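To make the voxel-wise encoding idea concrete, the following is a
much-simplified, dependency-free sketch of the grouping step that a
Voxel Feature Encoder builds on: raw points are binned into voxels
by quantized coordinates, and each voxel is summarized by its
centroid and point count. This is an illustrative stand-in only;
VoxelNet itself learns per-voxel features with a neural network
rather than computing hand-crafted statistics.

```python
import math
from collections import defaultdict

def voxelize(points, voxel_size=0.5):
    """Group raw (x, y, z) points into voxels and summarize each
    voxel by its centroid and occupancy count."""
    voxels = defaultdict(list)
    for x, y, z in points:
        # Quantize continuous coordinates into an integer voxel index.
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        voxels[key].append((x, y, z))
    features = {}
    for key, pts in voxels.items():
        n = len(pts)
        cx = sum(p[0] for p in pts) / n
        cy = sum(p[1] for p in pts) / n
        cz = sum(p[2] for p in pts) / n
        features[key] = (cx, cy, cz, n)  # centroid + point count
    return features

# Three points: two fall in the same voxel, one in a neighbor.
feats = voxelize([(0.1, 0.1, 0.0), (0.2, 0.3, 0.1), (1.1, 0.0, 0.0)],
                 voxel_size=0.5)
```

In a learned encoder, the per-voxel statistic above would be
replaced by a small point-wise network followed by max pooling, but
the spatial grouping is the same.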
This work was supported by the Natural Science Foundation of Hebei
Province, Grant No. F2013208105. (Corresponding author: Ziying Song.)
Jiangfeng Bi, Haiyue Wei, Guoxin Zhang, and Kuihe Yang are with
the School of Information Science and Engineering, Hebei University
of Science and Technology, Shijiazhuang 050018, China (e-mail:
Ziying Song is with the School of Computer and Information Technology,
Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong

Fig. 1. Quantitative Analysis of Camera, LiDAR, Fusion and Proposed
Methods.

TABLE I
COMPARISON OF ADVANTAGES AND DISADVANTAGES OF CAMERA AND LIDAR

Sensor | Advantages                   | Disadvantages
Camera | High resolution, color       | Sensitive to lighting and
       | information.                 | weather; difficult to handle
       |                              | reflective surfaces.
LiDAR  | Distance information, no     | Lower resolution; difficulty
       | need for lighting.           | in recognizing color and
       |                              | texture.

Within the realm of autonomous driving, LiDAR and camera serve as
key sensors, providing rich data for 3D object detection [1]. These
two sensors produce highly complementary output data, with their
respective advantages and disadvantages summarized in Table I.
Fusion-based algorithms tested on the KITTI dataset demonstrate
better detection performance compared to using only camera or
LiDAR alone, as shown
in Fig. 1. Consequently, fusion-based technology has attracted
significant research interest. Frustum PointNets [6], an end-
to-end 3D object detection method devised by Qi et al.,
integrates 2D object detection outputs with point cloud data.
Building on the foundation of MV3D [7], Ku et al. [8]
introduced AVOD, a fusion-based approach utilizing both
image and point cloud features to achieve more accurate 3D
object detection.
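The cross-attention mechanism underlying fusion strategies such as
the proposed CADF can be illustrated with a minimal, dependency-free
sketch: each query feature (e.g. a LiDAR voxel feature) attends over
a set of key/value features (e.g. image features) and receives a
weighted mix of the values. This is a generic scaled dot-product
attention sketch under assumed toy dimensions, not the authors'
implementation of CADF.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention: each query attends over
    all keys and returns a convex combination of the values."""
    d = len(keys[0])  # key dimensionality, used for scaling
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        w = softmax(scores)  # attention weights sum to 1
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out

# A single query that aligns with the first of two key/value pairs.
fused = cross_attention([[1.0, 0.0]],
                        [[1.0, 0.0], [0.0, 1.0]],
                        [[1.0, 0.0], [0.0, 1.0]])
```

Because the attention weights are computed per query, such a module
can in principle re-weight image evidence for each point cloud
feature dynamically, rather than concatenating the two modalities
with fixed proportions.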
Point cloud data contains the coordinate information of
3D spatial points in a scene, providing high precision and
reliability. Image data provides high-resolution information