Residential College: false
Status: Published
Multi-Sem Fusion: Multimodal Semantic Fusion for 3-D Object Detection
Xu, Shaoqing (1); Li, Fang (2); Song, Ziying (3); Fang, Jin (4); Wang, Sifen (5); Yang, Zhi Xin (1)
2024
Source Publication: IEEE Transactions on Geoscience and Remote Sensing
ISSN: 0196-2892
Volume: 62, Pages: 5703114
Abstract

LiDAR and camera fusion techniques are promising for achieving 3-D object detection in autonomous driving (AD). Most multimodal 3-D object detection frameworks integrate semantic knowledge from 2-D images into 3-D LiDAR point clouds to enhance detection accuracy. Nevertheless, the restricted resolution of 2-D feature maps impedes accurate reprojection and often induces a pronounced boundary-blurring effect, which is primarily attributed to erroneous semantic segmentation. To address these limitations, we present the Multi-Sem Fusion (MSF) framework, a versatile multimodal fusion approach that employs 2-D and 3-D semantic segmentation methods to generate parsing results for both modalities. The 2-D semantic information is then reprojected into the 3-D point cloud using the calibration parameters. To tackle misalignment between the 2-D and 3-D parsing results, we introduce an adaptive attention-based fusion (AAF) module that fuses them by learning an adaptive fusion score. The point cloud with the fused semantic labels is then passed to the downstream 3-D object detector. Furthermore, we propose a deep feature fusion (DFF) module that aggregates deep features at different levels to boost the final detection performance. The effectiveness of the framework has been verified on two public large-scale 3-D object detection benchmarks by comparing it against different baselines. The experimental results show that the proposed fusion strategies significantly improve detection performance compared with methods that use only point clouds and methods that use only 2-D semantic information. Moreover, our approach can be integrated as a plug-in into any detection framework.
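The reprojection and fusion steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the pinhole camera model, and the simple sigmoid gate standing in for the learned AAF module are all assumptions made for illustration. In the paper the fusion score is learned by an attention-based module; here the gate parameters `gate_w` and `gate_b` are supplied by hand.

```python
import numpy as np


def project_to_image(points_xyz, lidar_to_cam, intrinsics):
    """Project N x 3 LiDAR points to pixel coordinates.

    lidar_to_cam: 4 x 4 extrinsic matrix (assumed calibration input).
    intrinsics:   3 x 3 pinhole camera matrix (assumed calibration input).
    Returns (uv, depth): N x 2 pixel coordinates and N camera-frame depths.
    """
    n = points_xyz.shape[0]
    homo = np.hstack([points_xyz, np.ones((n, 1))])   # N x 4 homogeneous points
    cam = (lidar_to_cam @ homo.T).T[:, :3]            # N x 3 in the camera frame
    depth = cam[:, 2]
    # Guard the division; points with depth <= 0 are masked out by the caller.
    safe_depth = np.where(depth > 0, depth, 1.0)
    pix = (intrinsics @ cam.T).T                      # N x 3
    uv = pix[:, :2] / safe_depth[:, None]
    return uv, depth


def paint_points(points_xyz, sem_2d, lidar_to_cam, intrinsics):
    """Look up 2-D semantic scores (H x W x C) for each projected point.

    Returns (scores, valid): N x C scores and a boolean mask of points that
    land inside the image and in front of the camera.
    """
    h, w, c = sem_2d.shape
    uv, depth = project_to_image(points_xyz, lidar_to_cam, intrinsics)
    cols = np.floor(uv[:, 0]).astype(int)
    rows = np.floor(uv[:, 1]).astype(int)
    valid = (depth > 0) & (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)
    scores = np.zeros((points_xyz.shape[0], c))
    scores[valid] = sem_2d[rows[valid], cols[valid]]
    return scores, valid


def fuse_scores(scores_2d, scores_3d, valid, gate_w, gate_b):
    """Blend 2-D and 3-D scores with a per-point weight in [0, 1].

    The weight comes from a sigmoid over a linear function of the concatenated
    scores (a hypothetical stand-in for a learned fusion score). Points without
    a valid image projection fall back to the 3-D scores alone.
    """
    feats = np.concatenate([scores_2d, scores_3d], axis=1)   # N x 2C
    alpha = 1.0 / (1.0 + np.exp(-(feats @ gate_w + gate_b)))  # N
    alpha = np.where(valid, alpha, 0.0)
    return alpha[:, None] * scores_2d + (1.0 - alpha[:, None]) * scores_3d
```

The fused per-point scores would then be appended to the point coordinates before being passed to a 3-D detector. The fallback to 3-D scores for points outside the camera view is one reasonable design choice, not something the abstract specifies.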

Keyword: 3-D Object Detection; Multimodal Fusion; Self-attention
DOI: 10.1109/TGRS.2024.3387732
Indexed By: SCIE
Language: English
WOS Research Area: Geochemistry & Geophysics; Engineering; Remote Sensing; Imaging Science & Photographic Technology
WOS Subject: Geochemistry & Geophysics; Engineering, Electrical & Electronic; Remote Sensing; Imaging Science & Photographic Technology
WOS ID: WOS:001225945800020
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Scopus ID: 2-s2.0-85190329974
Document Type: Journal article
Collection: Faculty of Science and Technology; The State Key Laboratory of Internet of Things for Smart City (University of Macau); Department of Electromechanical Engineering
Corresponding Author: Yang, Zhi Xin
Affiliation:
1. University of Macau, State Key Laboratory of Internet of Things for Smart City, Department of Electromechanical Engineering, Macau, Macao
2. Beijing Institute of Technology, School of Mechanical Engineering, Beijing, 100811, China
3. Beijing Jiaotong University, School of Computer and Information Technology, Beijing, 100044, China
4. University of Macau, State Key Laboratory of Internet of Things for Smart City, Macau, Macao
5. Beihang University, School of Transportation Science and Engineering, Beijing, 100083, China
First Author Affiliation: University of Macau
Corresponding Author Affiliation: University of Macau
Recommended Citation
GB/T 7714: Xu, Shaoqing, Li, Fang, Song, Ziying, et al. Multi-Sem Fusion: Multimodal Semantic Fusion for 3-D Object Detection[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62, 5703114.
APA: Xu, Shaoqing, Li, Fang, Song, Ziying, Fang, Jin, Wang, Sifen, & Yang, Zhi Xin (2024). Multi-Sem Fusion: Multimodal Semantic Fusion for 3-D Object Detection. IEEE Transactions on Geoscience and Remote Sensing, 62, 5703114.
MLA: Xu, Shaoqing, et al. "Multi-Sem Fusion: Multimodal Semantic Fusion for 3-D Object Detection." IEEE Transactions on Geoscience and Remote Sensing 62 (2024): 5703114.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.