
13.2 Sample lm_map parameter file

Two sample parameter files for lm_map, `map_G.par' and `map_P.par', can be found in the directory `/MORGAN_Examples/Map', along with the corresponding marker data files `map_G.markers' and `map_P.markers'. There are thus two examples, one for genotypic markers (G) and one for phenotypic markers (P). "G" denotes that marker genotypes are observed without error. "P" denotes the possibility of error, so that the observed marker phenotype may differ from the underlying true marker genotype. Both examples use the pedigree file `map.ped'.

`map_G.par' and `map_P.par' have the following statements in common:

 
input pedigree file './map.ped'
input marker data file './map_[G|P].markers'

select all markers
set marker 1 2 3 freqs  .2 .2 .2 .2 .2 
set marker names DS123 DS456 DS789

map gender F marker recomb fract .18 .18  # true F map (cM): 20 20 
map gender M marker recomb fract .08 .08  # true M map (cM): 10 10

limit recomb fracts .001

use sequential imputation for setup
use 100 sequential imputation realizations for setup
set burn-in iterations 100
sample by scan
set L-sampler probability .8
set MC iterations 50 	 # The initial number of MCMC scans per step
limit EM iterations 10   # The total number of MCEM steps 

As seen in previous examples, the `select all markers' statement instructs the program to use all markers on the chromosome for computation. The alternative is to use only selected markers for computation, which can be achieved by using the `select markers' statement (see Autozyg computing requests). The `set marker 1 2 3 freqs .2 .2 .2 .2 .2' statement specifies the marker allele frequencies for markers 1, 2, and 3. This statement, as constructed, requires markers 1, 2, and 3 to each have five alleles, each with frequency 0.2. If the number of alleles or the allele frequencies vary from marker to marker, a separate `set marker freqs' statement is needed for each marker (see markerdrop population model parameters). The `set marker names' statement overrides the default behavior, which labels markers consecutively: marker-1, marker-2, etc.
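The grouping rule described above (one statement for markers that share the same allele frequencies, separate statements otherwise) can be sketched in Python. This is only an illustration: the helper `marker_freq_statements` is hypothetical and not part of MORGAN, the frequencies for marker 3 in the second call are made up, and the single-marker form `set marker 3 freqs ...` is assumed by analogy with the statement in the example file.

```python
import math

def marker_freq_statements(freqs_by_marker):
    """Build `set marker ... freqs` lines (hypothetical helper, not part of MORGAN).

    freqs_by_marker maps a marker number to its list of allele frequencies.
    Markers with an identical frequency list are grouped into one statement.
    """
    groups = {}  # frequency tuple -> list of marker numbers
    for marker, freqs in sorted(freqs_by_marker.items()):
        # Allele frequencies for one marker should sum to 1.
        if not math.isclose(sum(freqs), 1.0, abs_tol=1e-6):
            raise ValueError(f"marker {marker}: frequencies sum to {sum(freqs)}")
        groups.setdefault(tuple(freqs), []).append(marker)
    lines = []
    for freqs, markers in groups.items():
        marker_part = " ".join(str(m) for m in markers)
        freq_part = " ".join(f"{f:g}" for f in freqs)
        lines.append(f"set marker {marker_part} freqs {freq_part}")
    return lines

# The case in this example: three markers, five equally frequent alleles each.
same = marker_freq_statements({1: [0.2] * 5, 2: [0.2] * 5, 3: [0.2] * 5})

# A made-up case: marker 3 has only two alleles, so it gets its own statement.
mixed = marker_freq_statements({1: [0.2] * 5, 2: [0.2] * 5, 3: [0.4, 0.6]})
```

In the first call all three markers share one statement; in the second, marker 3 is split off because its allele count differs.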

The two `map gender [] marker recomb fract' statements specify the marker map in terms of recombination fractions.

The `limit recomb fracts .001' statement is optional and places lower and upper bounds on the estimated recombination fractions of the map. For markers that are separated by little or no recombination, the MCEM algorithm may yield estimated recombination fractions of zero, which could lead to severe bias in the results. As a safeguard against such events, this statement places a lower bound of 0.001 and an upper bound of 0.5 - 0.001 = 0.499 on the estimated recombination fractions of the map.
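The effect of these bounds can be illustrated with a short Python sketch. The function name `limit_recomb_fracts` is hypothetical, and the sketch assumes the bounds act as a simple clamp on each estimate; how lm_map applies them internally is not described here.

```python
def limit_recomb_fracts(fracts, limit=0.001):
    """Clamp each recombination fraction to [limit, 0.5 - limit].

    Hypothetical helper: assumes the bounds are applied as a plain clamp.
    """
    lower, upper = limit, 0.5 - limit
    return [min(max(r, lower), upper) for r in fracts]

# An estimate of zero is raised to the lower bound, an estimate of 0.5 is
# lowered to the upper bound, and values inside the interval are unchanged.
bounded = limit_recomb_fracts([0.0, 0.18, 0.5])
```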

The statement `use sequential imputation for setup' instructs lm_map to initialize the set of maternal and paternal meiosis indicators for all members of the pedigree who are not founders; this is done prior to the Monte Carlo simulation. This is the default behavior; the alternative is to `use locus-by-locus sampling for setup'. The statement `use 100 sequential imputation realizations for setup' is optional and modifies the default number of realizations for setup by sequential imputation (which is 10% of the MC iterations). The next three lines in the parameter files contain statements introduced in the Autozyg examples of this tutorial. For explanation of `set burn-in iterations', `sample by scan', and `set L-sampler probability' see Autozyg MCMC parameters and options. The statement `set MC iterations 50' sets the initial number of MCMC scans to be performed at each MCEM step. The statement `limit EM iterations' was introduced in the multivar example and puts an upper bound on the total number of MCEM steps (10 in this example).

Now we'll take a look at the remaining statements in `map_G.par':

 
output maps gender averaged specific
set map estimation model with no mistyping
set EM convergence .01

use MCEM and SA for maximization   
set SA curvature iterations 10	
set SA ascent iterations 10	
set SA gradient iterations 10 
set SA convergence .001

The `output maps gender averaged specific' statement specifies the type of map to be estimated by lm_map. In this example, the default behavior is specified, which instructs lm_map to automatically compute the likelihood ratio test statistic for testing the null hypothesis of a sex-averaged map. The statement `set map estimation model with no mistyping' instructs lm_map to assume that the genotypes are observed without error. The `set EM convergence' statement instructs lm_map to stop the MCEM algorithm if all recombination fraction updates are within 0.01 of their previous values.
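The interplay of the `set EM convergence' and `limit EM iterations' statements can be sketched as follows. This is a minimal illustration under stated assumptions, not lm_map code: `em_step` is a placeholder for one MCEM update, the toy update rule is invented, and "within 0.01" is taken to mean an absolute difference of at most 0.01.

```python
def em_converged(previous, updated, tolerance=0.01):
    """True if every recombination fraction moved by at most `tolerance`."""
    return all(abs(new - old) <= tolerance
               for old, new in zip(previous, updated))

def run_em(start, em_step, max_steps=10, tolerance=0.01):
    """Iterate `em_step` until convergence or until `max_steps` is reached.

    `em_step` stands in for one MCEM update (placeholder, not lm_map code).
    Returns the final map and the number of steps actually performed.
    """
    current = list(start)
    for step in range(1, max_steps + 1):
        updated = em_step(current)
        if em_converged(current, updated, tolerance):
            return updated, step
        current = updated
    return current, max_steps

def toy_step(fracts):
    # Invented update: move each value halfway toward 0.2.
    return [r + 0.5 * (0.2 - r) for r in fracts]

final_map, steps = run_em([0.18, 0.08], toy_step)
```

With this toy update the loop stops well before the limit of 10 steps, because the convergence test is met first; with a slower update the step limit would end the run instead.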

The statement `use MCEM and SA for maximization' instructs lm_map to attempt to refine its MCEM-based estimate of the MLE by performing additional stochastic approximation (SA) steps. The alternative is to `use MCEM only for maximization', with no further refinement. The map estimate obtained from the final step of the MCEM algorithm is used to seed the SA algorithm, and several statements allow additional control of it. First, an estimate of the curvature of the likelihood is needed to initiate the SA algorithm. The statement `set SA curvature iterations 10' instructs lm_map to use at least 10 MCMC realizations to estimate this curvature. Second, lm_map will not initiate the SA algorithm with a step that decreases the likelihood, so the statement `set SA ascent iterations 10' instructs lm_map to use at least 10 MCMC realizations to determine whether a proposed first step increases the likelihood. Third, the SA algorithm requires an estimate of the gradient of the likelihood at each SA step. The statement `set SA gradient iterations 10' instructs lm_map to use at least 10 MCMC realizations to estimate the gradient. Finally, the `set SA convergence .001' statement instructs lm_map to terminate the SA algorithm when the absolute change between successive map estimates is less than 0.001 for every recombination fraction in the map.
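To give a feel for the roles of the curvature, the gradient, and the convergence threshold, here is a toy sketch of a textbook Robbins-Monro style iteration in Python. It is not lm_map's algorithm: the objective is an invented one-parameter function, the gradient noise is simulated rather than estimated from MCMC realizations, the curvature is assumed known, and the ascent check on the first step is not modeled.

```python
import random

def stochastic_approximation(start, noisy_gradient, curvature,
                             tolerance=0.001, max_steps=10_000):
    """Toy Robbins-Monro ascent: theta += gradient / (k * curvature).

    Stops when the absolute change in theta falls below `tolerance`,
    or after `max_steps` steps.
    """
    theta = start
    for k in range(1, max_steps + 1):
        step = noisy_gradient(theta) / (k * curvature)
        theta += step
        if abs(step) < tolerance:
            break
    return theta

rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_MAX = 0.1           # maximizer of the invented objective

def noisy_gradient(theta):
    # Gradient of -(theta - TRUE_MAX)**2 plus bounded simulated noise.
    return -2.0 * (theta - TRUE_MAX) + rng.uniform(-0.05, 0.05)

# Seed the iteration away from the maximum; curvature 2.0 matches the toy objective.
estimate = stochastic_approximation(0.3, noisy_gradient, curvature=2.0)
```

Because the step size shrinks as 1/k, the noise in individual gradient estimates is averaged out over successive steps, which is why a modest number of realizations per step can suffice in a scheme of this kind.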

Now we'll take a look at the remaining statements in `map_P.par':

 
output maps gender averaged
set map estimation model with mistyping
set genotyping error rate .02
use MCEM only for maximization

In this parameter file, a gender-averaged map is requested by using the `output maps gender averaged' statement. Unlike the previous parameter file, `map_P.par' does not assume the genotypes are recorded without error; this is indicated by the statement `set map estimation model with mistyping'. When `with mistyping' is chosen, one has the option of specifying an estimate of the error rate with the statement `set genotyping error rate E'. In this example, the error rate is set at 0.02. Finally, the statement `use MCEM only for maximization' instructs lm_map not to use the SA algorithm to further refine the MCEM-based estimate of the MLE. Since the SA algorithm will not be used, none of the `SA' statements appear in `map_P.par'.



This document was generated by Elizabeth Thompson on September, 10 2010 using texi2html