MORGAN Tutorial: Introduction to lm_lods and lm_markers and lm_multiple and lm_bayes and lm

11.1 Introduction to `lm_lods`, `lm_markers` and `lm_multiple`, `lm_bayes` and `lm_schnell`

The programs lm_lods, lm_markers, lm_multiple, lm_bayes and lm_schnell are referred to as the "Lodscore" programs. They use MCMC to perform multipoint linkage analysis and trait mapping on large pedigrees where many individuals may be unobserved and exact computation is infeasible. The data are the genotypes of observed individuals in the pedigree at marker loci, together with discrete or continuous trait data. As with exact methods of computing lod scores, the genetic model is assumed known; the only unknown parameter is the location of the trait locus. The user is therefore required to specify the marker locations, the trait and marker allele frequencies, and the penetrance function. Presently, users are very limited in their choice of penetrance function, but this is under revision and will change in future releases of MORGAN.

lm_lods estimates location LOD scores for genotypic or discrete traits by working along the chromosome, estimating likelihood ratios between adjacent locations of the trait locus, starting from unlinked and proceeding through the linkage group to unlinked again. There are three methods of combining these local likelihood ratios into an overall LOD score. One reduces to an eigenvalue method used by Thompson (2000: sec. 9.2, p. 118). The other two simply combine the ratios from the left or from the right. Weighted combinations do a better job (William Stewart), but we do not pursue this here, as better methods are available in lm_bayes and lm_markers.
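The left and right combinations described above can be illustrated with a minimal sketch. This is not MORGAN code: the function names and the input layout (a list of estimated log10 likelihood ratios between adjacent trait positions, running from unlinked through the linkage group to unlinked again) are assumptions made for illustration, and the eigenvalue method is not shown.

```python
def lods_from_left(local_log10_ratios):
    """Accumulate local log10 ratios starting from the first (unlinked) position.

    local_log10_ratios[i] is the estimated log10 of L(position i+1) / L(position i).
    The returned list has one LOD per position, with the first fixed at 0.
    """
    lods = [0.0]
    for r in local_log10_ratios:
        lods.append(lods[-1] + r)
    return lods


def lods_from_right(local_log10_ratios):
    """Same chain of ratios, but anchored at the final (unlinked) position."""
    lods = [0.0]
    for r in reversed(local_log10_ratios):
        lods.append(lods[-1] - r)
    lods.reverse()
    return lods


# Hypothetical ratios: unlinked -> x1 -> x2 -> x3 -> unlinked.
ratios = [0.5, 0.25, -0.25, -0.75]
left = lods_from_left(ratios)    # [0.0, 0.5, 0.75, 0.5, -0.25]
right = lods_from_right(ratios)  # [0.25, 0.75, 1.0, 0.75, 0.0]
```

If the local ratios were exact, they would sum to zero and the two curves would coincide. With Monte Carlo error they differ by a constant equal to the sum of all the ratios (here -0.25), which gives a quick check on how well the chain of local estimates closes.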

lm_markers and lm_multiple are implementations of the Lange-Sobel estimator, using our LM-sampler and the new LMM-sampler respectively. The program lm_markers is so named because only the meiosis indicators at marker loci are sampled, and only conditional on the marker data. The Lange-Sobel estimate works reasonably well in reasonable time, provided a good MCMC sampler is used, and provided the trait data do not have a strong impact on the conditional distribution of meiosis indicators. Because the method samples meiosis indicators conditionally only on the marker data, it can produce quite accurate LOD scores in the absence of linkage, but can be inaccurate in estimating the strength of linkage signals. As well as producing the LOD score, our current method provides a batch-means pointwise estimate of the Monte Carlo standard error of the LOD-score estimate. lm_markers can work with genotypic, discrete or quantitative traits.
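As a rough illustration of the estimate and its batch-means standard error at a single trait position, the sketch below assumes that each MCMC realization of the marker meiosis indicators yields one likelihood-ratio value (the trait-data probability at the hypothesized position, given the sampled indicators, divided by the trait-data probability under no linkage). The function name, the input list, and the delta-method conversion of the standard error to the log10 scale are assumptions of this sketch, not a description of MORGAN's code.

```python
import math
import statistics


def lod_with_batch_means(ratios, n_batches=10):
    """Return (LOD estimate, Monte Carlo standard error) at one position.

    ratios: per-iteration likelihood-ratio values from the MCMC run.
    The run is cut into n_batches consecutive batches of equal size;
    any leftover iterations at the end are dropped.
    """
    size = len(ratios) // n_batches
    if n_batches < 2 or size == 0:
        raise ValueError("need at least two non-empty batches")
    batch_means = [
        sum(ratios[b * size:(b + 1) * size]) / size
        for b in range(n_batches)
    ]
    mean = sum(batch_means) / n_batches
    # Standard error of the overall mean, estimated from batch-to-batch spread.
    se_mean = statistics.stdev(batch_means) / math.sqrt(n_batches)
    lod = math.log10(mean)
    # Delta method: d/dm log10(m) = 1 / (m * ln 10).
    se_lod = se_mean / (mean * math.log(10))
    return lod, se_lod
```

Batching consecutive iterations is what makes the spread of batch means a usable error estimate when successive MCMC realizations are correlated; the batches need to be long relative to the autocorrelation time for this to hold.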

lm_multiple generalizes the lm_markers program in various ways. In fact, from MORGAN 2.8.2 (Spring 2006), the executable lm_markers is compiled as a special case of the more general lm_multiple program. As well as including better exact computation and pedigree peeling options for use in the lod score estimator (see Exact HMM computations), lm_multiple uses the new multiple-meiosis (MM) sampler in conjunction with the L-sampler. The lm_multiple program and MM-sampler are the work of Liping Tong (Tong & Thompson, 2008, Human Heredity 65: 142-153). Both lm_markers and lm_multiple can optionally perform exact lod score computations on small pedigree components.

lm_bayes is an alternative method implemented for genotypic or discrete traits. The MCMC performance is better than for lm_markers, but it has other computational overheads. lm_bayes samples trait locations from a posterior distribution, and then divides the estimated posterior by the prior to produce the likelihood and hence the LOD score. Estimation is in two phases. A preliminary run with a discrete uniform prior gives order-of-magnitude relative likelihoods. Then, using the inverse of these likelihoods as prior weights (to produce an approximately uniform posterior), a second run is made to estimate the likelihood. It is important that the initial run is long enough for all points to be sampled, and for the unlinked trait position to have a reasonable number of realizations. This can be problematic for locations at which LOD scores are very negative, or for the unlinked position when some location has a strong positive LOD score.
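The arithmetic of the two phases can be sketched as follows. This is a minimal illustration under stated assumptions, not lm_bayes itself: the function names, the representation of the run as a list of visit counts per trait position, and the choice of index 0 for the unlinked position are all hypothetical.

```python
import math


def relative_likelihood(counts, prior):
    """Posterior frequency divided by prior weight, per trait position."""
    if any(c == 0 for c in counts):
        # A position that was never sampled gives no usable estimate,
        # which is why the preliminary run must visit every position.
        raise ValueError("every position must be sampled at least once")
    total = sum(counts)
    return [(c / total) / p for c, p in zip(counts, prior)]


def second_run_prior(rel_lik):
    """Prior weights proportional to 1 / likelihood, normalized to sum to 1."""
    inverse = [1.0 / l for l in rel_lik]
    s = sum(inverse)
    return [w / s for w in inverse]


def lod_scores(rel_lik, unlinked_index=0):
    """log10 likelihood ratio of each position against the unlinked position."""
    return [math.log10(l / rel_lik[unlinked_index]) for l in rel_lik]


# Hypothetical preliminary run: uniform prior over three positions.
counts = [10, 100, 1000]
uniform = [1.0 / 3] * 3
rel = relative_likelihood(counts, uniform)
new_prior = second_run_prior(rel)   # favours rarely visited positions
prelim_lods = lod_scores(rel)       # approximately [0, 1, 2]
```

With the reweighted prior, the product of prior and likelihood is roughly constant across positions, so the second run spends comparable time at each position, including the unlinked one.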

Our current implementation of lm_bayes provides two LOD score estimates. The first is a crude estimate that uses counts of the sampled locations to estimate the posterior; as can be seen from the output, this can be quite erratic. The second, a Rao-Blackwellized estimator, is much preferred and produces good estimates in reasonable time.
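The difference between the two estimates can be shown with a generic sketch of Rao-Blackwellization. It assumes, hypothetically, that at each iteration the sampler can report the full conditional probability of every trait position given the current latent variables; how lm_bayes obtains these values internally is not described here, and the function names are illustrative only.

```python
def crude_posterior(sampled_locations, n_locations):
    """Estimate the posterior by the fraction of iterations at each position."""
    counts = [0] * n_locations
    for x in sampled_locations:
        counts[x] += 1
    n = len(sampled_locations)
    return [c / n for c in counts]


def rao_blackwell_posterior(conditional_probs):
    """Estimate the posterior by averaging per-iteration conditional probabilities.

    conditional_probs[t][j] is the conditional probability of position j
    at iteration t, given the other sampled variables at that iteration.
    """
    n = len(conditional_probs)
    k = len(conditional_probs[0])
    return [sum(row[j] for row in conditional_probs) / n for j in range(k)]


# Two hypothetical iterations over two positions.
rb = rao_blackwell_posterior([[0.5, 0.5], [0.25, 0.75]])  # [0.375, 0.625]
crude = crude_posterior([0, 1, 1, 1], n_locations=2)      # [0.25, 0.75]
```

Averaging probabilities uses information about every position at every iteration, whereas counting records only the single position visited, which is why a count-based estimate tends to be noisier for rarely visited positions.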

lm_schnell uses MCMC realizations of segregation indicators, conditional on marker and quantitative trait data, to estimate local likelihood ratios between alternative hypothesized trait locations. It is based on the program SCHNELL (Single CHromosome Non-Exponential Linkage Likelihoods), originally written by Greg Snow. Because lm_schnell uses the same local-likelihood-ratio method of lod score estimation as lm_lods, it suffers from the same disadvantages, namely extensive MCMC requirements and frequent difficulty estimating local likelihood ratios across the positions of highly polymorphic markers. However, because lm_schnell models a quantitative rather than a qualitative trait, MCMC mixing performance should be better. Also, uniquely among our currently released programs, lm_schnell models a polygenic component in addition to the major trait locus. This component is sampled by single-site updating, and testing of this feature has been limited. Joint updating of polygenic values is implemented in programs under development, and lm_schnell will be improved or replaced in future releases.

This document was generated by Elizabeth Thompson on September, 10 2010 using texi2html
