Abstract

If the population is large and the sampling mechanism is random, the coalescent is commonly used to model the haplotypes in the sample. Ordered genotypes can then be formed by random matching of the derived haplotypes. However, this approach is not realistic when (1) there is departure from random mating (e.g., dominant individuals in breeding populations or monogamy in humans), or (2) the population is small and/or the individuals in the sample are ascertained by some particular non-random sampling scheme, as is usually the case in the statistical modeling and analysis of pedigree data. For such situations, we present here a data generation method in which an ancestral graph with non-overlapping generations is first generated backwards in time, using ideas from coalescent theory. Alleles are randomly assigned to the founders, and subsequently the gene flow over the entire genome is simulated forwards in time by dropping alleles down the graph according to a recombination model without interference. The parameters controlling the mating behavior of generated individuals in the graph (degree of monogamy) can be tuned to match a particular demographic situation, without restriction to simple random mating. The performance of the approach is illustrated with a simulation example. The software (written in C) is freely available for research purposes at http://www.rni.helsinki.fi/~dag/.
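
The forward step, dropping alleles down the graph under a recombination model without interference, can be illustrated with a minimal sketch. It is not taken from the distributed software: the `Individual` record, `make_gamete`, `gene_drop`, the fixed `N_LOCI`, and the single recombination fraction `theta` between adjacent loci are all hypothetical names and simplifications. The sketch also assumes that individuals are stored with parents before children, and it gives each founder two unique allele labels rather than alleles drawn at random from population frequencies.

```c
#include <assert.h>
#include <stdlib.h>

#define N_LOCI 5  /* assumed number of marker loci on one chromosome */

/* Hypothetical pedigree record; parent indices are -1 for founders. */
typedef struct {
    int father;
    int mother;
    int pat[N_LOCI];  /* paternally inherited haplotype */
    int mat[N_LOCI];  /* maternally inherited haplotype */
} Individual;

/* Uniform draw in [0, 1). */
static double unif(void)
{
    return rand() / ((double)RAND_MAX + 1.0);
}

/* Form one gamete from a parent. The starting strand is chosen at random,
 * and the strand switches between adjacent loci with probability theta,
 * independently in every interval (that is, no interference). */
static void make_gamete(const Individual *parent, double theta, int *out)
{
    int strand = unif() < 0.5;
    for (int k = 0; k < N_LOCI; k++) {
        if (k > 0 && unif() < theta)
            strand = !strand;
        out[k] = strand ? parent->pat[k] : parent->mat[k];
    }
}

/* Drop alleles down a pedigree of n individuals, parents listed first.
 * Founder i receives the labels 2*i and 2*i+1 (an assumption made here
 * so that every transmitted allele can be traced to its origin). */
void gene_drop(Individual *ped, int n, double theta)
{
    assert(theta >= 0.0 && theta <= 0.5);
    for (int i = 0; i < n; i++) {
        if (ped[i].father < 0) {
            for (int k = 0; k < N_LOCI; k++) {
                ped[i].pat[k] = 2 * i;
                ped[i].mat[k] = 2 * i + 1;
            }
        } else {
            assert(ped[i].father < i && ped[i].mother < i);
            make_gamete(&ped[ped[i].father], theta, ped[i].pat);
            make_gamete(&ped[ped[i].mother], theta, ped[i].mat);
        }
    }
}
```

Calling `gene_drop` on a three-member array (two founders followed by their child) fills in the child's two haplotypes. With `theta` set to 0 each haplotype is a copy of a single founder strand, and larger values produce mosaics of the two strands of the same parent.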
