Status | Published |
The Image Data and Backbone in Weakly Supervised Fine-Grained Visual Categorization: A Revisit and Further Thinking | |
Ye, Shuo [1]; Wang, Yu [1]; Peng, Qinmu [1]; You, Xinge [1]; Philip Chen, C. L. [2] | |
2024-01 | |
Source Publication | IEEE Transactions on Circuits and Systems for Video Technology |
ISSN | 1051-8215 |
Volume | 34, Issue: 1, Pages: 2-16 |
Abstract | Weakly-supervised fine-grained visual categorization (FGVC) aims to achieve subclass classification within the same large class using only label information. Compared to general images, fine-grained images have similar appearances and features, and are often affected by disturbances such as viewpoint, lighting, and occlusion during data collection, resulting in significant intra-class variance and small inter-class variance. To achieve FGVC, carefully designed models are often needed to explore the locally discriminative regions of the image. This paper revisits high-quality FGVC publications based on deep learning and analyzes them from two new perspectives: fine-grained image data and backbone. We address two overlooked but interesting problems in FGVC. First, we argue that the factors that exacerbate intra-class variance differ across animal, plant, and commodity data, and that it is necessary to consider the effects of posture, covariate shift, and structural changes. Additionally, the “soft boundary” between subclasses intensifies the difficulty of classification. Second, we highlight that convolutional networks and self-attention networks have different receptive fields and shape biases, leading to performance differences when processing different types of fine-grained data. Overall, our analysis provides new insights into recent advances, challenges, and future directions for FGVC based on deep learning, which can help researchers develop more effective models for FGVC. |
Keyword | Fine-grained Visual Categorization; Deep Learning; Weakly Supervised Learning |
DOI | 10.1109/TCSVT.2023.3284405 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Engineering |
WOS Subject | Engineering, Electrical & Electronic |
WOS ID | WOS:001138814400011 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141 |
Scopus ID | 2-s2.0-85162676585 |
Document Type | Journal article |
Collection | Faculty of Science and Technology; Department of Computer and Information Science |
Corresponding Author | Peng,Qinmu |
Affiliation | 1. School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China; 2. Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Macau, China |
Recommended Citation GB/T 7714 | Ye,Shuo,Wang,Yu,Peng,Qinmu,et al. The Image Data and Backbone in Weakly Supervised Fine-Grained Visual Categorization: A Revisit and Further Thinking[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34(1), 2-16. |
APA | Ye,Shuo., Wang,Yu., Peng,Qinmu., You,Xinge., & Philip Chen,C. L. (2024). The Image Data and Backbone in Weakly Supervised Fine-Grained Visual Categorization: A Revisit and Further Thinking. IEEE Transactions on Circuits and Systems for Video Technology, 34(1), 2-16. |
MLA | Ye,Shuo,et al."The Image Data and Backbone in Weakly Supervised Fine-Grained Visual Categorization: A Revisit and Further Thinking".IEEE Transactions on Circuits and Systems for Video Technology 34.1(2024):2-16. |