MORGAN Tutorial: MCMC parameters and options

8.5 MCMC parameters and options

MORGAN can obtain a starting configuration for S in one of two ways. The default method is sequential imputation. The alternative is to construct an L-sampler realization independently for each locus, conditional on the genotype data at that locus only (the locus-by-locus option). Sequential imputation tends to produce initial configurations with higher conditional probabilities, but locus-by-locus sampling can sometimes reveal other modes in the complex space of S values. The user can select the locus-by-locus setup method by including the `use locus-by-locus for setup' statement. If sequential imputation is used, the user can specify the number of sequential imputation samples from which the starting configuration of meiosis indicators is selected, using the `use I sequential imputation realizations for setup' statement. The default is 10% of the total MC iterations.

At each MCMC iteration, MORGAN selects a locus (with the L-sampler) or a meiosis (with the M-sampler) to update. Two selection methods are available: sample by scan and sample by step. If `sample by scan' is chosen, all loci or meioses are updated one at a time in a predetermined random order. This option is the default. If `sample by step' is chosen, a single locus or meiosis is randomly selected for updating at each iteration. The selected sampling method applies to the entire MCMC run, including burn-in, pseudo-prior computation and main iterations.
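The difference between the two selection methods can be illustrated with a minimal Python sketch. This is not MORGAN code: the function names and the `rng` argument are hypothetical, and the sketch assumes that the "predetermined random order" of a scan is drawn once and then reused for every scan, which the text above does not state explicitly.

```python
import random

def sample_by_scan(n_units, n_iterations, rng):
    """Yield, per iteration, the list of units (loci or meioses) to update.

    Assumption: the random scan order is drawn once up front and reused.
    """
    order = list(range(n_units))
    rng.shuffle(order)
    for _ in range(n_iterations):
        # Every unit is updated exactly once per iteration, one at a time.
        yield list(order)

def sample_by_step(n_units, n_iterations, rng):
    """Yield, per iteration, a single randomly chosen unit to update."""
    for _ in range(n_iterations):
        yield [rng.randrange(n_units)]

if __name__ == "__main__":
    rng = random.Random(2010)
    for units in sample_by_scan(4, 2, rng):
        print("scan:", units)
    for units in sample_by_step(4, 2, rng):
        print("step:", units)
```

Under these assumptions, one scan iteration performs as many single-unit updates as there are loci or meioses, whereas one step iteration performs only one, so iteration counts are not directly comparable between the two methods.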

When running a MORGAN MCMC program, the user must specify the number of iterations of each of several types. All programs first perform some number of burn-in iterations. These realizations are discarded and, if the burn-in period is sufficiently long, subsequent points will be dependent samples from the desired stationary distribution. The `set burn-in iterations' statement specifies the number of burn-in iterations; the default value varies by program. The number of "main" iterations must be specified using the `set MC iterations' statement; there is no default. The recommended number of main iterations is on the order of 10^5. lm_bayes performs a third type of iteration to calculate pseudo-priors, which encourage the MC sampler to visit test positions of low conditional probability. Alternatively, pseudo-priors can be read from an input file. The number of iterations for calculating pseudo-priors is set using the `set pseudo-prior iterations' statement; otherwise the default of 50% of the number of main iterations is used. Specific Autozyg and Lodscore programs have additional parameters and options that are described in the relevant sections of the next two chapters of the tutorial.
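As a rough illustration of how the percentage defaults relate to the number of main iterations, the following Python sketch computes an iteration budget. The function and its argument names are hypothetical and not part of MORGAN; integer rounding of the 10% and 50% defaults is an assumption, and the burn-in count is a required argument here because its default varies by program.

```python
def iteration_plan(mc_iterations, burn_in, seq_imp=None, pseudo_prior=None):
    """Return the iteration counts implied by the defaults described above.

    mc_iterations: main iterations (no default in MORGAN, so required here)
    burn_in:       burn-in iterations (program-specific default, so required)
    seq_imp:       sequential imputation realizations; default 10% of main
    pseudo_prior:  pseudo-prior iterations (lm_bayes only); default 50% of main
    """
    if mc_iterations <= 0:
        raise ValueError("the number of main iterations must be positive")
    if seq_imp is None:
        seq_imp = mc_iterations // 10   # assumed rounding: floor
    if pseudo_prior is None:
        pseudo_prior = mc_iterations // 2
    return {
        "sequential_imputation": seq_imp,
        "burn_in": burn_in,
        "pseudo_prior": pseudo_prior,
        "main": mc_iterations,
    }

if __name__ == "__main__":
    print(iteration_plan(mc_iterations=100_000, burn_in=1_000))
```

With 10^5 main iterations, the defaults would give 10,000 sequential imputation realizations and, for lm_bayes, 50,000 pseudo-prior iterations.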

In addition to the main program-specific outputs described in the following chapters, the MCMC process accumulates diagnostic counts. The configuration of inheritance indicators is scored at intervals determined by the `compute scores every I iterations' statement, the same statement that controls scoring for the primary output. (By default, scores and diagnostic output are computed every iteration.)

There are three components to this diagnostic output:

Average total log-probability of segregations:
This is the average (over the scored iterations) of the total (over meioses) of the log-probability of the meiosis indicators. For the first locus this is simply the marginal log-probability log((1/2)^m) for m meioses, and for each successive locus j it is log P(S.j | S.(j-1)), the log-probability at locus j conditional on locus (j-1).
Average total log-probability of penetrances, by locus:
This is the average (over the scored iterations) of the combined (over observed individuals) log-probability of the observed data at the locus given the inheritance configuration (S.j).
Recombination counts for map intervals:
This is the total count, over (male and female) meioses and over MCMC iterations, of realized configurations of inheritance indicators that are recombinant and non-recombinant in each interval of the map.
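The segregation log-probabilities and recombination counts described above can be sketched for a single scored configuration as follows. This is an illustrative Python sketch, not MORGAN's implementation: the data layout (`S[j][i]` as the 0/1 indicator of meiosis i at locus j) and the function names are hypothetical, and it assumes a single recombination fraction per map interval, strictly between 0 and 1, with meioses segregating independently. The program averages or totals such values over all scored iterations.

```python
import math

def segregation_log_probs(S, thetas):
    """Per-locus total (over meioses) log-probability of meiosis indicators.

    S[j][i]     : 0/1 indicator of meiosis i at locus j
    thetas[j-1] : assumed recombination fraction between loci j-1 and j
    """
    m = len(S[0])
    per_locus = [m * math.log(0.5)]           # first locus: log((1/2)^m)
    for j in range(1, len(S)):
        theta = thetas[j - 1]
        lp = 0.0
        for prev, cur in zip(S[j - 1], S[j]):
            # A change of indicator between adjacent loci is a recombination.
            lp += math.log(theta if prev != cur else 1.0 - theta)
        per_locus.append(lp)                   # log P(S.j | S.(j-1))
    return per_locus

def recombination_counts(S):
    """(recombinant, non-recombinant) meiosis counts for each map interval."""
    counts = []
    for j in range(1, len(S)):
        rec = sum(p != c for p, c in zip(S[j - 1], S[j]))
        counts.append((rec, len(S[j]) - rec))
    return counts

if __name__ == "__main__":
    S = [[0, 1, 0], [0, 0, 0], [1, 0, 0]]      # 3 loci, 3 meioses
    print(segregation_log_probs(S, thetas=[0.1, 0.2]))
    print(recombination_counts(S))
```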

For the programs lm_pval, lm_markers and lm_multiple, only marker loci and marker map intervals are included in these diagnostic scores. For lm_auto, the trait locus (designated `0') is included in the correct position, if it is included in the MCMC. For the programs lm_schnell and lm_lods, the trait locus (designated `0') is included in its position in that cycle of MCMC. If poor MCMC mixing is suspected, it can be useful to check whether these diagnostic probabilities and counts differ significantly among MCMC runs.

This document was generated by Elizabeth Thompson on September, 10 2010 using texi2html