Residential College: false
Status: Published
AlignVE: Visual Entailment Recognition Based on Alignment Relations
Cao, Biwei1,2; Cao, Jiuxin1,2; Gui, Jie1,2; Shen, Jiayun1,2; Liu, Bo3; He, Lei4; Tang, Yuan Yan5; Kwok, James Tin Yau6
2023-12
Source Publication: IEEE TRANSACTIONS ON MULTIMEDIA
ISSN: 1520-9210
Volume: 25
Pages: 7378-7387
Abstract

Visual entailment (VE) is the task of recognizing whether the semantics of a hypothesis text can be inferred from a given premise image; it is one of the recently emerged vision-and-language understanding tasks. Most existing VE approaches are derived from visual question answering methods. They recognize visual entailment by quantifying the similarity between the hypothesis and the premise in terms of content semantic features from multiple modalities. Such approaches, however, ignore VE's distinctive nature of relation inference between the premise and the hypothesis. This paper therefore proposes a new architecture, AlignVE, which solves the visual entailment problem with a relation interaction method. It models the relation between the premise and the hypothesis as an alignment matrix, applies a pooling operation to obtain fixed-size feature vectors, and then passes them through a fully connected layer and a normalization layer to complete the classification. Experiments show that the alignment-based architecture reaches 72.45% accuracy on the SNLI-VE dataset, outperforming previous content-based models under the same settings.
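
The pipeline outlined in the abstract (alignment matrix, pooling to a fixed size, fully connected layer, normalization) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity function, the pooling scheme, or the normalization, so the dot-product alignment, the adaptive max pooling, the softmax output, the random features and weights, and all function names below are assumptions made for illustration only. The three output classes correspond to the SNLI-VE labels (entailment, neutral, contradiction).

```python
import math
import random


def alignment_matrix(hyp, prem):
    """Dot-product score between every hypothesis token and premise region.

    Assumption: dot product is used only as a stand-in similarity function.
    """
    return [[sum(h * p for h, p in zip(hv, pv)) for pv in prem] for hv in hyp]


def adaptive_max_pool(matrix, out_rows, out_cols):
    """Max-pool a matrix of any size to a flat vector of out_rows * out_cols."""
    rows, cols = len(matrix), len(matrix[0])
    pooled = []
    for i in range(out_rows):
        r0 = (i * rows) // out_rows
        r1 = max(r0 + 1, -(-((i + 1) * rows) // out_rows))  # integer ceiling
        for j in range(out_cols):
            c0 = (j * cols) // out_cols
            c1 = max(c0 + 1, -(-((j + 1) * cols) // out_cols))
            pooled.append(
                max(matrix[r][c] for r in range(r0, r1) for c in range(c0, c1))
            )
    return pooled


def linear(x, weights, bias):
    """Fully connected layer: one weight row and one bias per output class."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Normalize logits into a probability distribution (assumed normalization)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(hyp, prem, weights, bias, grid=(2, 2)):
    """Alignment matrix -> fixed-size pooling -> fully connected -> softmax."""
    pooled = adaptive_max_pool(alignment_matrix(hyp, prem), *grid)
    return softmax(linear(pooled, weights, bias))


# Hypothetical inputs: 5 text tokens and 7 image regions, each a 4-dim feature.
rng = random.Random(0)
hyp = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]
prem = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(7)]
weights = [[rng.gauss(0, 0.1) for _ in range(4)] for _ in range(3)]  # untrained
bias = [0.0, 0.0, 0.0]

probs = classify(hyp, prem, weights, bias)
print(dict(zip(["entailment", "neutral", "contradiction"], probs)))
```

The point of the pooling step in this sketch is that the alignment matrix has one row per hypothesis token and one column per image region, so its shape varies between examples, whereas the fully connected layer needs an input of constant length.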

Keyword: Computer Vision; Alignment Relation; Visual Entailment
DOI: 10.1109/TMM.2022.3222118
Indexed By: SCIE
Language: English
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141
Scopus ID: 2-s2.0-85142806996
Document Type: Journal article
Collection: Faculty of Science and Technology; DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Corresponding Author: Cao, Jiuxin; Gui, Jie
Affiliation:
1. School of Cyber Science and Engineering, Southeast University, Nanjing 211189, China
2. Key Laboratory of Computer Network and Information of Ministry of Education of China, Nanjing 211189, China
3. School of Computer Science and Engineering, Southeast University, Nanjing 211189, China
4. Information Engineering University, China and Purple Mountain Laboratories, Nanjing 210000, China
5. Department of Computer and Information Science, University of Macao, Macao 999078, China
6. Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
Recommended Citation
GB/T 7714: Cao, Biwei, Cao, Jiuxin, Gui, Jie, et al. AlignVE: Visual Entailment Recognition Based on Alignment Relations[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25, 7378-7387.
APA: Cao, Biwei., Cao, Jiuxin., Gui, Jie., Shen, Jiayun., Liu, Bo., He, Lei., Tang, Yuan Yan., & Kwok, James Tin Yau (2023). AlignVE: Visual Entailment Recognition Based on Alignment Relations. IEEE TRANSACTIONS ON MULTIMEDIA, 25, 7378-7387.
MLA: Cao, Biwei, et al. "AlignVE: Visual Entailment Recognition Based on Alignment Relations". IEEE TRANSACTIONS ON MULTIMEDIA 25 (2023): 7378-7387.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.