
Improving the estimation of genetic distances from Next-Generation Sequencing data

Biological Journal of the Linnean Society (2015)
DOI: 10.1111/bij.12511

Abstract

Next-Generation Sequencing (NGS) technologies have revolutionized research in evolutionary biology by increasing sequencing speed and reducing experimental costs. However, sequencing error rates are higher than in traditional technologies and, furthermore, many studies rely on low-depth sequencing. Under these circumstances, the use of standard methods for inferring genotypes leads to biased estimates of nucleotide variation, which can bias all downstream analyses. Through simulations, we assessed the bias in estimating genetic distances under several different scenarios. The results indicate that naive methods for assigning individual genotypes greatly overestimate genetic distances. We propose a novel method to estimate genetic distances that is suitable for low-depth NGS data and takes the statistical uncertainty of genotype calls into account. We applied this method to investigate the genetic structure of domesticated and wild strains of rice. We implemented this approach in open-source software and discuss further directions of phylogenetic analyses within this novel probabilistic framework. © 2015 The Linnean Society of London, Biological Journal of the Linnean Society, 2015, ●●, ●●–●●.
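
The contrast the abstract draws, between calling a single genotype per individual and carrying genotype uncertainty through to the distance, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes biallelic sites, genotypes coded as 0, 1, or 2 copies of one allele, per-site genotype posterior probabilities already available for each individual, and the absolute genotype difference as the per-site distance. The function names `called_distance` and `expected_distance` are hypothetical.

```python
# Illustrative sketch only: all names and the per-site distance |g_a - g_b|
# are assumptions, not taken from the paper or its software.

GENOTYPES = (0, 1, 2)  # copies of one allele at a biallelic site


def called_distance(post_a, post_b):
    """Naive approach: call the most probable genotype at each site,
    then average the absolute genotype difference over sites."""
    total = 0
    for pa, pb in zip(post_a, post_b):
        ga = max(GENOTYPES, key=lambda g: pa[g])  # hard genotype call
        gb = max(GENOTYPES, key=lambda g: pb[g])
        total += abs(ga - gb)
    return total / len(post_a)


def expected_distance(post_a, post_b):
    """Uncertainty-aware approach: average over sites of the expected
    absolute genotype difference, weighting every genotype pair by the
    product of the two individuals' posterior probabilities."""
    total = 0.0
    for pa, pb in zip(post_a, post_b):
        total += sum(
            pa[i] * pb[j] * abs(i - j) for i in GENOTYPES for j in GENOTYPES
        )
    return total / len(post_a)


# Two individuals, two sites; each tuple is (P(g=0), P(g=1), P(g=2)).
# The posteriors at the second site are deliberately vague, as might
# happen when only a few reads cover the site.
ind_a = [(1.0, 0.0, 0.0), (0.6, 0.4, 0.0)]
ind_b = [(0.0, 0.0, 1.0), (0.4, 0.6, 0.0)]

print(called_distance(ind_a, ind_b))
print(expected_distance(ind_a, ind_b))
```

When every posterior puts all its mass on one genotype, the two functions return the same value. They diverge only where the data leave the genotype uncertain, which is the low-depth situation the abstract is concerned with.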
