Alignment-free estimation of nucleotide diversity

B. Haubold, F. Reed, and P. Pfaffelhuber.
Bioinformatics, 27 (4): 449-455 (February 2011)
DOI: 10.1093/bioinformatics/btq689

Abstract

Sequencing capacity is currently growing more rapidly than CPU speed, leading to an analysis bottleneck in many genome projects. Alignment-free sequence analysis methods tend to be more efficient than their alignment-based counterparts. They may, therefore, be important in the long run for keeping sequence analysis abreast with sequencing. We derive and implement an alignment-free estimator of the number of pairwise mismatches, π_m. Our implementation of π_m, pim, is based on an enhanced suffix array and inherits the superior time and memory efficiency of this data structure. Simulations demonstrate that π_m is accurate if mutations are distributed randomly along the chromosome. While real data often deviate from this ideal, π_m remains useful for identifying regions of low genetic diversity using a sliding window approach. We demonstrate this by applying it to the complete genomes of 37 strains of Drosophila melanogaster, and to the genomes of two closely related Drosophila species, D. simulans and D. sechellia. In both cases, we detect the diversity minimum and discuss its biological implications.
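
To make the sliding window idea in the abstract concrete, here is a minimal Python sketch. It is not the paper's alignment-free estimator and does not use a suffix array: it assumes two sequences that are already aligned to the same length and simply counts mismatches per site in each window, which is the alignment-based quantity the estimator targets. The function names, the window size, and the step size are hypothetical choices made for illustration only.

```python
from typing import List, Tuple


def sliding_window_mismatches(
    a: str, b: str, window: int, step: int
) -> List[Tuple[int, float]]:
    """Return (window start, mismatches per site) for two aligned sequences.

    Assumption: a and b are already aligned and have equal length.
    This is the naive alignment-based count, not the paper's estimator.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")

    profile = []
    for start in range(0, len(a) - window + 1, step):
        end = start + window
        # Count positions in this window where the two sequences differ.
        mismatches = sum(1 for x, y in zip(a[start:end], b[start:end]) if x != y)
        profile.append((start, mismatches / window))
    return profile


def diversity_minimum(profile: List[Tuple[int, float]]) -> Tuple[int, float]:
    """Return the first window with the lowest mismatch rate."""
    return min(profile, key=lambda item: item[1])
```

Calling `diversity_minimum(sliding_window_mismatches(a, b, window, step))` on two aligned sequences returns the start position and mismatch rate of the least diverse window, which is the kind of region the abstract describes searching for.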

BibTeX key: haubold2011alignmentfree
entry type: article
year: 2011
month: feb
journal: Bioinformatics
number: 4
pages: 449-455
volume: 27
pmid: 21156730
DOI: 10.1093/bioinformatics/btq689
url: http://www.ncbi.nlm.nih.gov/pubmed/21156730?dopt=Abstract
