Abstract

Many approaches to studying divergence with gene flow rely upon data from many individuals at few loci. Such data can be useful for inferring recent population history, but they are unlikely to contain sufficient information about older events. However, the growing availability of genome sequences suggests a different kind of sampling scheme, one that may be better suited to studying more ancient divergence. Data sets extracted from whole-genome alignments may represent very few individuals but contain a very large number of loci. To take advantage of such data, we developed a new maximum likelihood method for genomic data under the Isolation-with-Migration model. Unlike many coalescent-based likelihood methods, our method does not rely on Monte Carlo sampling of genealogies, but rather provides a precise calculation of the likelihood by numerical integration over all genealogies. We demonstrate that the method works well on simulated data sets. We also consider two models for accommodating mutation rate variation among loci and find that the model that treats mutation rates as random variables leads to better estimates. We applied the method to the divergence of Drosophila melanogaster and D. simulans, and detected a low, but statistically significant, signal of gene flow from D. simulans to D. melanogaster.
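To give a feel for what computing a likelihood by numerical integration over genealogies means, the sketch below treats a deliberately simplified case and is not the method described above. It assumes one sequence sampled from each of two populations, a split at time T, and no migration at all, so the genealogy at a locus reduces to a single coalescence time. Further assumptions: time is measured in units of 2N generations, the number of differences at a locus is Poisson with mean theta times the coalescence time, and all loci share the same theta. The function names and the Simpson's rule settings are hypothetical choices made for illustration.

```python
import math


def poisson_pmf(k, lam):
    """P(K = k) for K ~ Poisson(lam), computed in log space for stability."""
    if lam == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def locus_likelihood(k, theta, T, upper=40.0, n=4000):
    """Likelihood of k pairwise differences at one locus.

    Toy isolation model (assumption: no migration). The two lineages cannot
    coalesce before the split, so the coalescence time is t = T + s, where
    s ~ Exponential(1) is the waiting time in the ancestral population.
    The likelihood integrates the Poisson mutation probability over s,
    here with composite Simpson's rule on [0, upper] (n must be even).
    """
    h = upper / n

    def integrand(s):
        return math.exp(-s) * poisson_pmf(k, theta * (T + s))

    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def log_likelihood(counts, theta, T):
    """Multilocus log-likelihood, assuming loci are independent."""
    return sum(math.log(locus_likelihood(k, theta, T)) for k in counts)


if __name__ == "__main__":
    # Hypothetical difference counts at four loci, for illustration only.
    print(log_likelihood([3, 0, 5, 2], theta=2.0, T=0.5))
```

With T = 0 the integral has the closed form theta^k / (1 + theta)^(k + 1), which offers a quick check on the numerical integration. Adding migration means a lineage can move between populations before the split, so the integral then ranges over many more genealogical histories than the single coalescence time used here.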
