Status: Published
Instance Tracking in 3D Scenes from Egocentric Videos
Zhao, Yunhan [1]; Haoyu Ma [1]; Shu Kong [2,3]; Charless Fowlkes [1]
2024-09
Conference Name: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Source Publication: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Pages: 21933-21944
Conference Date: 16-22 June 2024
Conference Place: Seattle, WA, USA
Country: USA
Publisher: IEEE Computer Society
Abstract

Egocentric sensors such as AR/VR devices capture human-object interactions and offer the potential to provide task assistance by recalling 3D locations of objects of interest in the surrounding environment. This capability requires instance tracking in real-world 3D scenes from egocentric videos (IT3DEgo). We explore this problem by first introducing a new benchmark dataset, consisting of RGB and depth videos, per-frame camera pose, and instance-level annotations in both 2D camera and 3D world coordinates. We present an evaluation protocol which evaluates tracking performance in 3D coordinates with two settings for enrolling instances to track: (1) single-view online enrollment, where an instance is specified on the fly based on the human wearer's interactions, and (2) multi-view pre-enrollment, where images of an instance to be tracked are stored in memory ahead of time. To address IT3DEgo, we first re-purpose methods from relevant areas, e.g., single object tracking (SOT), running SOT methods to track instances in 2D frames and lifting them to 3D using camera pose and depth. We also present a simple method that leverages pretrained segmentation and detection models to generate proposals from RGB frames and match proposals with enrolled instance images. Our experiments show that our method (with no finetuning) significantly outperforms SOT-based approaches in the egocentric setting. We conclude by arguing that the problem of egocentric instance tracking is made easier by leveraging camera pose and using a 3D allocentric (world) coordinate representation. Dataset and open-source code: https://github.com/IT3DEgo/IT3DEgo.
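
The abstract mentions lifting 2D tracks to 3D using camera pose and depth. As a rough illustration only, and not the authors' implementation, the sketch below back-projects one tracked pixel into world coordinates. It assumes a pinhole camera with intrinsics K, metric depth along the optical axis, and a 4x4 camera-to-world pose matrix; the function name lift_pixel_to_world and all numeric values are hypothetical.

```python
import numpy as np


def lift_pixel_to_world(u, v, depth, K, T_world_cam):
    """Back-project pixel (u, v) with metric depth into world coordinates.

    Assumptions (not taken from the paper): K is a 3x3 pinhole intrinsics
    matrix, depth is measured along the camera z-axis, and T_world_cam is a
    4x4 camera-to-world rigid transform.
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    # Pinhole back-projection into the camera frame.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    p_cam = np.array([x, y, depth, 1.0])  # homogeneous point
    # Move the point into the allocentric (world) frame.
    return (T_world_cam @ p_cam)[:3]


# Hypothetical intrinsics and pose, for illustration only.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
T_world_cam = np.eye(4)
T_world_cam[:3, 3] = [1.0, 0.0, 0.0]  # camera sits 1 m along world x

center = lift_pixel_to_world(320.0, 240.0, 2.0, K, T_world_cam)
offset = lift_pixel_to_world(420.0, 240.0, 2.0, K, T_world_cam)
print(center, offset)
```

Because the result lives in a fixed world frame, an instance that leaves the field of view keeps its last estimated 3D location, which is one way to read the abstract's closing argument for an allocentric representation.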

Keyword: Three-dimensional Displays; Protocols; Annotations; Benchmark Testing; Cameras; Sensors; Pattern Recognition; Egocentric Videos; Instance Tracking in 3D
DOI: 10.1109/CVPR52733.2024.02071
Language: English
Scopus ID: 2-s2.0-85199613330
Document Type: Conference paper
Collection: Faculty of Science and Technology
Institute of Collaborative Innovation
Department of Computer and Information Science
Affiliation: 1. University of California Irvine
2. Texas A&M University
3. Institute of Collaborative Innovation, University of Macau
Recommended Citation
GB/T 7714
Zhao, Yunhan, Haoyu Ma, Shu Kong, et al. Instance Tracking in 3D Scenes from Egocentric Videos[C]: IEEE Computer Society, 2024, 21933-21944.
APA: Zhao, Yunhan, Haoyu Ma, Shu Kong, & Charless Fowlkes (2024). Instance Tracking in 3D Scenes from Egocentric Videos. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 21933-21944.
Files in This Item:
InstanceTrack.pdf (16581KB); Conference paper; Open access; License: CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.