106 IEEE LATIN AMERICA TRANSACTIONS, VOL. 22, NO. 2, FEBRUARY 2024
DyFusion: Cross-Attention 3D Object Detection
with Dynamic Fusion
Jiangfeng Bi , Haiyue Wei , Guoxin Zhang , Kuihe Yang , and Ziying Song
Abstract—In the realm of autonomous driving, LiDAR and
camera sensors play an indispensable role, furnishing pivotal
observational data for the critical task of precise 3D object
detection. Existing fusion algorithms effectively utilize the
complementary data from both sensors. However, these methods
typically concatenate the raw point cloud data with pixel-level
image features, a process that unfortunately introduces errors
and loses critical information embedded in each
modality. To mitigate the problem of lost feature information,
this paper proposes a Cross-Attention Dynamic Fusion (CADF)
strategy that dynamically fuses the two heterogeneous data
sources. In addition, we acknowledge the issue of insufficient
data augmentation for these two diverse modalities. To combat
this, we propose a Synchronous Data Augmentation (SDA)
strategy designed to enhance training efficiency. We have tested
our method using the KITTI and nuScenes datasets, and the
results have been promising. Remarkably, our top-performing
model attained an 82.52% mAP on the KITTI test benchmark,
outperforming other state-of-the-art methods.
Link to graphical and video abstracts, and to code:
https://latamt.ieeer9.org/index.php/transactions/article/view/8434
Index Terms—Cross-Attention Dynamic Fusion, Synchronous
Data Augmentation, 3D object detection
I. INTRODUCTION
Against the backdrop of thriving autonomous driving ad-
vances, 3D object detection has arisen as an imperative
task to equip unmanned vehicles with precise environmental
cognition [1]. Pioneers in 3D object detection have carried out
significant research, demonstrating excellent performance on
public datasets such as KITTI [2] and nuScenes [3]. As a
pioneering example, Qi et al. [4] devised PointNet, an
innovative deep neural architecture that directly learns global
features from point cloud data. Zhou et al. [5] proposed
VoxelNet with a Voxel Feature Encoder, which transforms
raw point clouds into voxel-wise features containing spatial
and physical information. These pioneering methods have laid
a strong foundation and provided inspiration for subsequent
research in the realm of 3D object detection.
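To make the voxel-wise encoding idea concrete, the following is a
much-simplified, dependency-free sketch of the grouping step that a
Voxel Feature Encoder builds on: raw points are binned into voxels
by quantized coordinates, and each voxel is summarized by its
centroid and point count. This is an illustrative stand-in only;
VoxelNet itself learns per-voxel features with a neural network
rather than computing hand-crafted statistics.

```python
import math
from collections import defaultdict

def voxelize(points, voxel_size=0.5):
    """Group raw (x, y, z) points into voxels and summarize each
    voxel by its centroid and occupancy count."""
    voxels = defaultdict(list)
    for x, y, z in points:
        # Quantize continuous coordinates into an integer voxel index.
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        voxels[key].append((x, y, z))
    features = {}
    for key, pts in voxels.items():
        n = len(pts)
        cx = sum(p[0] for p in pts) / n
        cy = sum(p[1] for p in pts) / n
        cz = sum(p[2] for p in pts) / n
        features[key] = (cx, cy, cz, n)  # centroid + point count
    return features

# Three points: two fall in the same voxel, one in a neighbor.
feats = voxelize([(0.1, 0.1, 0.0), (0.2, 0.3, 0.1), (1.1, 0.0, 0.0)],
                 voxel_size=0.5)
```

In a learned encoder, the per-voxel statistic above would be
replaced by a small point-wise network followed by max pooling, but
the spatial grouping is the same.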
This work was supported by the Natural Science Foundation of Hebei
Province, Grant No. F2013208105. (Corresponding author: Ziying Song.)
Jiangfeng Bi, Haiyue Wei, Guoxin Zhang, and Kuihe Yang are with
the School of Information Science and Engineering, Hebei University
of Science and Technology, Shijiazhuang 050018, China (e-mail:
Ziying Song is with the School of Computer and Information Technology,
Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong

Fig. 1. Quantitative Analysis of Camera, LiDAR, Fusion and Proposed
Methods.

TABLE I
COMPARISON OF ADVANTAGES AND DISADVANTAGES OF CAMERA AND LIDAR

Sensor | Advantages                   | Disadvantages
Camera | High resolution, color       | Sensitive to lighting and
       | information.                 | weather; difficult to handle
       |                              | reflective surfaces.
LiDAR  | Distance information, no     | Lower resolution; difficulty
       | need for lighting.           | in recognizing color and
       |                              | texture.

Within the realm of autonomous driving, LiDAR and camera serve as
key sensors, providing rich data for 3D object detection [1]. These
two sensors produce highly complementary output data, with their
respective advantages and disadvantages summarized in Table I.
Fusion-based algorithms tested on the KITTI dataset demonstrate
better detection performance compared to using only camera or
LiDAR alone, as shown
in Fig. 1. Consequently, fusion-based technology has attracted
significant research interest. Frustum PointNets [6], an end-
to-end 3D object detection method devised by Qi et al.,
integrates 2D object detection outputs with point cloud data.
Building on the foundation of MV3D [7], Ku et al. [8]
introduced AVOD, a fusion-based approach utilizing both
image and point cloud features to achieve more accurate 3D
object detection.
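The cross-attention mechanism underlying fusion strategies such as
the proposed CADF can be illustrated with a minimal, dependency-free
sketch: each query feature (e.g. a LiDAR voxel feature) attends over
a set of key/value features (e.g. image features) and receives a
weighted mix of the values. This is a generic scaled dot-product
attention sketch under assumed toy dimensions, not the authors'
implementation of CADF.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention: each query attends over
    all keys and returns a convex combination of the values."""
    d = len(keys[0])  # key dimensionality, used for scaling
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        w = softmax(scores)  # attention weights sum to 1
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out

# A single query that aligns with the first of two key/value pairs.
fused = cross_attention([[1.0, 0.0]],
                        [[1.0, 0.0], [0.0, 1.0]],
                        [[1.0, 0.0], [0.0, 1.0]])
```

Because the attention weights are computed per query, such a module
can in principle re-weight image evidence for each point cloud
feature dynamically, rather than concatenating the two modalities
with fixed proportions.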
Point cloud data contains the coordinate information of
3D spatial points in a scene, providing high precision and
reliability. Image data provides high-resolution information