MORGAN Tutorial: LM-sampler and LMM-sampler

8.4 LM-sampler and LMM-sampler

MORGAN's Autozyg and Lodscore programs use MCMC to estimate ibd probabilities and multilocus LOD scores, respectively, in pedigrees. The latent (unobserved) parameters of interest in MCMC estimation of ibd probabilities and LOD scores are the meiosis indicators at marker and/or trait loci for each non-founder in the pedigree. Observed data are trait values and unphased marker genotypes for some or all pedigree members. With unphased genotypes, it may or may not be possible to determine the grandparental source (i.e. the inheritance indicator) of each allele unambiguously. MORGAN uses MCMC to sample meiosis indicators (S) conditional on observed data (Y).

MORGAN implements two different block Gibbs samplers, a locus- and a meiosis-sampler, for sampling from S conditional on Y. Each method updates a subset, S_u, of S conditional on Y and on the rest of S (S_f). The difference between the two methods is the choice of S_u.

The locus-sampler (or L-sampler) chooses S_u to be S.j for some j. In other words, a single locus is selected and inheritance indicators at that locus are updated based on the genotype data at all loci and on the current realization of inheritance indicators at all loci other than j. The MORGAN user can determine whether a locus is to be selected at random each time or if loci are taken in a pre-determined random order, as described in the next section. This method is a modification of the Elston-Stewart algorithm (Elston RC and Stewart J (1971) Human Heredity 21:523-542) and it can be used whenever single locus pedigree peeling is possible. If inter-locus recombination fractions are strictly positive, the L-sampler is irreducible. On the downside, mixing is poor if loci are tightly linked.

The meiosis-sampler (or M-sampler) chooses S_u to be Si. for some i. It is, in a sense, perpendicular to the L-sampler in that at each iteration a single meiosis is selected and inheritance indicators for that meiosis are updated conditional on the genotype data at all loci and the current realization of inheritance indicators for all other meioses. The M-sampler is a modification of the Lander-Green algorithm (Lander ES and Green P (1987) PNAS 84:2363-2367) for peeling along a chromosome using the Baum algorithm (Baum LE (1972) in O. Shea, ed., `Inequalities-III; Proceedings of the Third Symposium on Inequalities, UCLA, 1969', Academic Press, NY pp. 1-8). At each iteration, a single meiosis is randomly selected or meioses can be updated sequentially. As with locus selection in the L-sampler, MORGAN allows the user to choose the meiosis selection The M-sampler mixes well in the presence of tightly linked loci, but it can perform poorly in large pedigrees with missing data.

MORGAN's Autozyg and Lodscore programs use a combination of the L- and M-samplers, referred to as the LM-sampler. The user may choose the fraction of updates that are of each type; the default is 20% L-sampler, 80% M-sampler. The recommendation is 20/80, 50/50 or 80/20, depending on which sampler is, in any particular example, the more computationally intensive.

For mathematical details on the L-, M- and LM-samplers, see Thompson EA (2000) Statistical Inference from Genetic Data on Pedigrees, in `NSF-CBMS Regional Conference Series in Probability and Statistics, Volume 6. Institute of Mathematical Sciences, Beachwood OH and American Statistical Association, Alexandria VA.

Multiple meiosis (MM) sampler updates multiple meioses jointly and is therefore a generalization of the meiosis sampler (M-sampler). There are four types of update in MM-sampler: random meiosis update, individual update, sib update and 3-generation update. This is based on work by Liping Tong (Tong L and Thompson EA (2008) Human Heredity 65: 152-163). The LMM-sampler is a combination of L-sampler and MM-sampler, analogous to the LM-sampler. The LMM-sampler is implemented in the program lm_multiple, and from MORGAN 2.8.3, lm_markers is compiled as a special case of lm_multiple. The LMM-sampler can also be used in the program gl_auto, a MORGAN 3.0 program to sample inheritance patterns conditional on marker data.

From MORGAN V2.8.2, Gold standards for lm_multiple are added in the Lodscore/Gold2 subdirectory.

The lm_haplotype program is a generalization of lm_multiple in which haplotypes of key individuals dividing the pedigree are sampled in addition to meiosis indicators. To facilitate efficient implementation of this algorithm, new peeling-by-component routines need to be implemented and checked. This program is also the work of Liping Tong. This program is not yet released.

Up to MORGAN V2.8.2, MCMC was performed globally over pedigree components (except those small enough for exact computation). The L-sampler peeling and lod score estimation could be done either by component (using "set peeling by component") or globally (the default).

With MORGAN V2.8.3, and specifically to accommodate the new lm_haplotype program, the preferred option is to do both MCMC and pedigree peeling (lod score estimation) by component, and to use exact computation on all sufficiently small component pedigrees. The alternative, retained so that older data sets can be rerun, is to use "set global MCMC", in which case no exact computation will be done, and MCMC will be done globally over all component pedigrees. In this case, the "set peeling by component" option is retained.

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

This document was generated by Elizabeth Thompson on September, 10 2010 using texi2html