ABSTRACT; Period 2007-2011
Techniques will be developed for genetic analysis of complex diseases
segregating in
extended pedigrees of arbitrary structure.
Cardiovascular, neurological and behavioral traits are among those
having both environmental and genetic
components. However, identification of genes contributing to increased
risk of related disorders has been limited by both computational and
statistical constraints.
The development of Markov chain Monte Carlo
(MCMC) methods has helped to overcome these limitations,
providing information for gene localization, trait model
estimation, haplotype analysis, and genetic map analyses, using data on
extended pedigrees.
The research now proposed is concerned with the
extension of MCMC methods in several areas.
Improved methods will be developed for the MCMC analysis of gene
descent in extended pedigrees, given data at a dense genome screen of
markers, together with the use of these gene descent patterns in
joint multilocus linkage and segregation analysis of complex traits.
Models for discrete and
quantitative trait phenotypes in these analyses will be extended to
include epistasis and pleiotropy. MCMC-based likelihood-ratio
methods for assessment of trait-model robustness
will be incorporated into our toolkit.
Further, methods will be developed
for assessment of the statistical
significance of linkage findings, including correction for multiple testing
at linked genome locations. Conditional on marker data,
trait-data resimulation and permutation
will be used to develop measures of significance.
Also, the conditional distributions of gene descent
at marker locations will be used to provide both measures of significance and
confidence sets for trait-locus locations.
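
As a rough illustration of permutation-based significance with correction for testing at many linked locations, the sketch below holds marker-derived scores fixed, permutes the trait values, and compares the observed maximum statistic over locations with its permutation distribution. Everything in it is an assumption made for illustration: the function names, the covariance statistic, and above all the free shuffling of trait values, which is only valid when the units being permuted are exchangeable and generally is not for individuals within an extended pedigree.

```python
import random


def location_stats(trait, sharing):
    """Covariance between trait values and a marker-derived score, per location.

    trait   : list of n trait values, one per unit (hypothetical layout).
    sharing : n rows, each a list of scores at the same genome locations.
    """
    n = len(trait)
    t_mean = sum(trait) / n
    stats = []
    for loc in range(len(sharing[0])):
        col = [row[loc] for row in sharing]
        c_mean = sum(col) / n
        cov = sum((t - t_mean) * (c - c_mean) for t, c in zip(trait, col)) / n
        stats.append(cov)
    return stats


def max_stat_pvalue(trait, sharing, n_perm=1000, seed=0):
    """Empirical p-value for the maximum statistic over all locations.

    Marker-derived scores stay fixed; only trait values are permuted, so the
    dependence among linked locations is preserved in every replicate.
    """
    rng = random.Random(seed)
    observed = max(location_stats(trait, sharing))
    shuffled = list(trait)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # assumes exchangeable units (see lead-in)
        if max(location_stats(shuffled, sharing)) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

Taking the maximum over locations within each replicate is what gives a single p-value adjusted for all locations tested, without assuming that they are independent.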
Methods will also be developed
for using patterns of gene descent,
realized conditional on marker data, in the analysis of genetic
maps and marker models, including multi-SNP haplotypes
and copy-number variants.
The impact of marker map and model uncertainty on linkage findings will
be investigated.
In using multiple dense SNPs as markers, methods
for incorporating linkage disequilibrium (LD) into the analysis of
family data will be developed. The impact of LD on MCMC-based
methods of haplotype inference, estimation of identity by descent (ibd)
over regions, and lod score
analyses will be investigated.
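
For readers unfamiliar with the lod score, the sketch below computes it in the simplest textbook setting: a two-point analysis in which every meiosis is fully informative and phase-known, so that the likelihood depends only on the count of recombinants. This setting, and the function name and arguments, are assumptions for illustration. The MCMC-based multipoint analyses described here are considerably more general, and the sketch says nothing about LD.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point lod score log10[L(theta) / L(1/2)] for phase-known meioses.

    recombinants : number of recombinant meioses observed (k)
    meioses      : total number of informative meioses (n)
    theta        : recombination fraction, 0 < theta <= 0.5
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must satisfy 0 < theta <= 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    k, n = recombinants, meioses
    # L(theta) = theta^k * (1 - theta)^(n - k); work on the log10 scale.
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1 - theta)
    log_l_null = n * math.log10(0.5)  # free recombination, theta = 1/2
    return log_l_theta - log_l_null
```

For example, `lod_score(1, 10, 0.1)` evaluates the evidence for linkage at a recombination fraction of 0.1 given one recombinant in ten meioses, and the score is zero at `theta = 0.5` by construction.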
Methods will be evaluated by
analyses of several simulated and real data sets, including pedigrees
segregating cardiovascular disease or
behavioral disorders.
The real data sets include several for which
genome-wide marker screens or more localized multigene haplotypes
are available.
Finally, software implementing these methods will be developed,
documented, and released for use by practitioners.