proovread: large-scale high-accuracy PacBio correction through iterative short read consensus

T. Hackl, R. Hedrich, J. Schultz, and F. Förster. Bioinformatics, 30 (21): 3004-3011 (2014). At0sg. Times Cited: 350. Cited References Count: 26.
DOI: 10.1093/bioinformatics/btu392

Abstract

Motivation: Today, the base code of DNA is mostly determined through sequencing by synthesis as provided by the Illumina sequencers. Although highly accurate, resulting reads are short, making their analyses challenging. Recently, a new technology, single molecule real-time (SMRT) sequencing, was developed that could address these challenges, as it generates reads of several thousand bases. But, their broad application has been hampered by a high error rate. Therefore, hybrid approaches that use high-quality short reads to correct erroneous SMRT long reads have been developed. Still, current implementations have great demands on hardware, work only in well-defined computing infrastructures and reject a substantial amount of reads. This limits their usability considerably, especially in the case of large sequencing projects. Results: Here we present proovread, a hybrid correction pipeline for SMRT reads, which can be flexibly adapted on existing hardware and infrastructure from a laptop to a high-performance computing cluster. On genomic and transcriptomic test cases covering Escherichia coli, Arabidopsis thaliana and human, proovread achieved accuracies up to 99.9% and outperformed the existing hybrid correction programs. Furthermore, proovread-corrected sequences were longer and the throughput was higher. Thus, proovread combines the most accurate correction results with an excellent adaptability to the available hardware. It will therefore increase the applicability and value of SMRT sequencing.

Cite this publication

@article{RN1080,
  abstract = {Motivation: Today, the base code of DNA is mostly determined through sequencing by synthesis as provided by the Illumina sequencers. Although highly accurate, resulting reads are short, making their analyses challenging. Recently, a new technology, single molecule real-time (SMRT) sequencing, was developed that could address these challenges, as it generates reads of several thousand bases. But, their broad application has been hampered by a high error rate. Therefore, hybrid approaches that use high-quality short reads to correct erroneous SMRT long reads have been developed. Still, current implementations have great demands on hardware, work only in well-defined computing infrastructures and reject a substantial amount of reads. This limits their usability considerably, especially in the case of large sequencing projects. Results: Here we present proovread, a hybrid correction pipeline for SMRT reads, which can be flexibly adapted on existing hardware and infrastructure from a laptop to a high-performance computing cluster. On genomic and transcriptomic test cases covering Escherichia coli, Arabidopsis thaliana and human, proovread achieved accuracies up to 99.9\% and outperformed the existing hybrid correction programs. Furthermore, proovread-corrected sequences were longer and the throughput was higher. Thus, proovread combines the most accurate correction results with an excellent adaptability to the available hardware. It will therefore increase the applicability and value of SMRT sequencing.},
  added-at = {2024-02-14T14:38:32.000+0100},
  author = {Hackl, T. and Hedrich, R. and Schultz, J. and Förster, F.},
  biburl = {https://www.bibsonomy.org/bibtex/24ce2825ac50fa28a9b32e1f22e2b0be7/rainerhedrich_2},
  doi = {10.1093/bioinformatics/btu392},
  interhash = {9df331a3c77e17db35c64eecbb58cbf2},
  intrahash = {4ce2825ac50fa28a9b32e1f22e2b0be7},
  issn = {1367-4803},
  journal = {Bioinformatics},
  keywords = {myOwn sequence},
  note = {At0sg Times Cited:350 Cited References Count:26},
  number = 21,
  pages = {3004--3011},
  timestamp = {2024-02-14T14:38:32.000+0100},
  title = {proovread: large-scale high-accuracy PacBio correction through iterative short read consensus},
  type = {Journal Article},
  url = {/brokenurl#<Go to ISI>://WOS:000344644600003},
  volume = 30,
  year = 2014
}
