Status: Published
Comparison of ONT and CCS sequencing technologies on the polyploid genome of a medicinal plant showed that high error rate of ONT reads are not suitable for self-correction
Zeng, Peng1; Tian, Zunzhe2; Han, Yuwei2; Zhang, Weixiong1; Zhou, Tinggan2; Peng, Yingmei2; Hu, Hao1; Cai, Jing2
2022-08-09
Source Publication: Chinese Medicine
ISSN: 1749-8546
Volume: 17, Issue: 1, Pages: 94
Abstract

Background: Many medicinal plants have complex genomes with high ploidy, heterozygosity, and repetitive content, which pose severe challenges for genome sequencing. Long reads from Oxford Nanopore Technologies (ONT) sequencing or Pacific Biosciences Single Molecule, Real-Time (SMRT) sequencing offer great advantages in de novo genome assembly, especially for complex genomes with high heterozygosity and repetitive content. The genomes of multiple allotetraploid species have now been sequenced with long reads. However, we found that a considerable proportion of these genomes (7.9% on average, maximum 23.7%) could not be covered by next-generation sequencing (NGS) reads. These uncovered regions (UCRs) are either questionable, low-quality sequences or genomic regions that cannot be sequenced by NGS because of sequencing bias. The underlying causes of UCRs in genome assemblies, and solutions to the problem, have not previously been studied.

Methods: In this study, we sequenced the tetraploid genome of Veratrum dahuricum (Turcz.) O. Loes (VDL), a Chinese medicinal plant, on the ONT platform and assembled the genome with three strategies in parallel. To explore the cause of the UCRs, we compared the quality, coverage, and heterozygosity of the three ONT assemblies with those of a previously released assembly of the same individual built from PacBio circular consensus sequencing (CCS) reads.

Results: By mapping the NGS reads against the three ONT assemblies and the CCS assembly, we found that NGS reads covered 49.15% to 76.31% of the ONT assemblies, much less than the 99.53% covered in the CCS assembly. Alignment between the ONT assemblies and the CCS assembly showed that most UCRs can be aligned with the CCS assembly. We therefore conclude that the UCRs in the ONT assemblies are low-quality sequences with a high error rate that cannot be aligned with short reads, rather than genomic regions that cannot be sequenced by NGS. Further comparison among intermediate versions of the ONT assemblies showed that the most probable origin of those errors is a combination of artificial errors introduced by “self-correction” and the initial sequencing errors in long reads. We also found that polishing the ONT assembly with CCS reads corrects those errors efficiently.

Conclusions: By analyzing genome features and read alignments, we found that the high proportion of UCRs in the ONT assembly of VDL is caused by sequencing errors and by additional errors introduced by self-correction. The high error rate of raw ONT reads makes them unsuitable for self-correction prior to allotetraploid genome assembly, because self-correction introduces artificial errors into > 5% of the UCR sequences. We suggest that high-precision CCS reads be used to polish the assembly and correct those errors effectively for polyploid genomes.
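The coverage figures reported in the abstract are fractions of assembly bases covered by NGS reads. The sketch below illustrates how such a fraction, and the UCR intervals themselves, could be derived from per-base read depth. It is not the authors' pipeline: the function names, the `min_depth` threshold, and the toy depth values are all hypothetical, and the per-base depths are assumed to have been produced beforehand by a depth tool (for example `samtools depth -a`).

```python
from typing import Dict, List, Tuple


def uncovered_regions(depth: List[int], min_depth: int = 1) -> List[Tuple[int, int]]:
    """Return half-open [start, end) intervals whose depth is below min_depth."""
    regions: List[Tuple[int, int]] = []
    start = None  # start of the uncovered run currently being scanned, if any
    for pos, d in enumerate(depth):
        if d < min_depth:
            if start is None:
                start = pos
        elif start is not None:
            regions.append((start, pos))
            start = None
    if start is not None:  # close a run that reaches the end of the contig
        regions.append((start, len(depth)))
    return regions


def covered_fraction(depth_by_contig: Dict[str, List[int]], min_depth: int = 1) -> float:
    """Fraction of all assembly bases with depth >= min_depth."""
    total = sum(len(d) for d in depth_by_contig.values())
    if total == 0:
        raise ValueError("assembly has no bases")
    uncovered = sum(
        end - start
        for d in depth_by_contig.values()
        for start, end in uncovered_regions(d, min_depth)
    )
    return (total - uncovered) / total


# Toy per-base depths for two hypothetical contigs (not data from the study).
toy_depth = {
    "contig1": [3, 0, 0, 5, 2, 0],
    "contig2": [0, 1, 1, 1],
}

print(uncovered_regions(toy_depth["contig1"]))
print(f"{covered_fraction(toy_depth):.2%} of bases covered")
```

Raising `min_depth` above 1 gives a stricter definition of "covered"; the abstract does not state which threshold the study used, so the default here is only an assumption.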

Keywords: ONT-based Assembly; Allotetraploid; Veratrum dahuricum; Low-quality Sequences; Homozygous Variants
DOI: 10.1186/s13020-022-00644-1
Indexed By: SCIE
Language: English
WOS Research Area: Integrative & Complementary Medicine; Pharmacology & Pharmacy
WOS Subject: Integrative & Complementary Medicine; Pharmacology & Pharmacy
WOS ID: WOS:000838069800001
Publisher: BMC, CAMPUS, 4 CRINAN ST, LONDON N1 9XW, ENGLAND
Scopus ID: 2-s2.0-85135845341
Document Type: Journal article
Collection: THE STATE KEY LABORATORY OF QUALITY RESEARCH IN CHINESE MEDICINE (UNIVERSITY OF MACAU); Institute of Chinese Medical Sciences
Co-First Author: Zeng, Peng
Corresponding Author: Hu, Hao; Cai, Jing
Affiliation:
1. State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences, University of Macau, Macao
2. School of Ecology and Environment, Northwestern Polytechnical University, Xi’an, China
First Author Affiliation: Institute of Chinese Medical Sciences
Corresponding Author Affiliation: Institute of Chinese Medical Sciences
Recommended Citation
GB/T 7714: Zeng, Peng, Tian, Zunzhe, Han, Yuwei, et al. Comparison of ONT and CCS sequencing technologies on the polyploid genome of a medicinal plant showed that high error rate of ONT reads are not suitable for self-correction[J]. Chinese Medicine, 2022, 17(1), 94.
APA: Zeng, P., Tian, Z., Han, Y., Zhang, W., Zhou, T., Peng, Y., Hu, H., & Cai, J. (2022). Comparison of ONT and CCS sequencing technologies on the polyploid genome of a medicinal plant showed that high error rate of ONT reads are not suitable for self-correction. Chinese Medicine, 17(1), 94.
MLA: Zeng, Peng, et al. "Comparison of ONT and CCS sequencing technologies on the polyploid genome of a medicinal plant showed that high error rate of ONT reads are not suitable for self-correction." Chinese Medicine 17.1 (2022): 94.