Residential College: false
Status: Published
Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation
Suo, Yucheng (1); Zheng, Zhedong (2); Wang, Xiaohan (1); Zhang, Bang (3); Yang, Yi (1)
2024-03
Source Publication: ACM Transactions on Multimedia Computing, Communications and Applications
ISSN: 1551-6857
Volume: 20; Issue: 6; Pages: 185
Abstract

Sign language provides a way for differently-abled individuals to express their feelings and emotions. However, learning sign language can be challenging and time consuming. An alternative approach is to animate user photos using sign language videos of specific words, which can be achieved using existing image animation methods. However, the finger motions in the generated videos are often not ideal. To address this issue, we propose the Structure-aware Temporal Consistency Network (STCNet), which jointly optimizes the prior structure of humans with temporal consistency to produce sign language videos. We use a fine-grained skeleton detector to acquire knowledge of body structure and introduce both short- and long-term cycle loss to ensure the continuity of the generated video. The two losses and keypoint detector network are optimized in an end-to-end manner. Quantitative and qualitative evaluations on three widely used datasets, namely LSA64, Phoenix-2014T, and WLASL-2000, demonstrate the effectiveness of the proposed method. It is our hope that this work can contribute to future studies on sign language production.

Keyword: Sign Language; Jointly Training; Motion Transfer; Video Generation
DOI: 10.1145/3648368
Indexed By: SCIE
Language: English
WOS Research Area: Computer Science
WOS Subject: Computer Science, Information Systems; Computer Science, Software Engineering; Computer Science, Theory & Methods
WOS ID: WOS:001208681800035
Publisher: ASSOC COMPUTING MACHINERY, 1601 Broadway, 10th Floor, NEW YORK, NY 10019-7434
Scopus ID: 2-s2.0-85189454373
Document Type: Journal article
Collection: Faculty of Science and Technology; Institute of Collaborative Innovation; Department of Computer and Information Science
Corresponding Author: Yang, Yi
Affiliation:
1. College of Computer Science and Technology, Zhejiang University, Hangzhou, 38 Zheda Road, Xihu District, Zhejiang, 310027, China
2. Faculty of Science and Technology, Institute of Collaborative Innovation, University of Macau, Taipa University Boulevard, 999078, Macao
3. DAMO Academy, Alibaba Group, Hangzhou, 969 Wenyi West Road, Yuhang District, Zhejiang, 311121, China
Recommended Citation
GB/T 7714: Suo, Yucheng, Zheng, Zhedong, Wang, Xiaohan, et al. Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation[J]. ACM Transactions on Multimedia Computing, Communications and Applications, 2024, 20(6), 185.
APA: Suo, Y., Zheng, Z., Wang, X., Zhang, B., & Yang, Y. (2024). Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation. ACM Transactions on Multimedia Computing, Communications and Applications, 20(6), 185.
MLA: Suo, Yucheng, et al. "Jointly Harnessing Prior Structures and Temporal Consistency for Sign Language Video Generation." ACM Transactions on Multimedia Computing, Communications and Applications 20.6 (2024): 185.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.