When scientists are interested in knowing the values of, or inferring causal
relations between, variables that they cannot directly measure, they typically
record several survey or test "items" that are thought to be indicators of
latent variables of interest, e.g. math ability or impulsiveness. Although it
is rare that a latent variable is measured perfectly by any single indicator,
estimates of the latent variables, and their associations with other latents,
can be obtained by employing multiple indicators for each latent variable
(multiple indicator models). If the multiple indicator model is correctly
specified, then in a wide range of cases the estimates obtained are consistent
and have desirable statistical properties, and these estimates can be used to
search for causal models among the latents. If the model is mis-specified, then
estimators are typically biased.
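
As a rough illustration of why multiple indicators help, the sketch below
simulates a toy linear model with two correlated latent variables, each
measured by two noisy items. Everything in it is an assumption made for
illustration and is not taken from the work described here: the loadings, the
unit-variance Gaussian noise, the function names (simulate, naive_corr,
latent_corr), and the estimator itself, which is the classical correction for
attenuation based on ratios of covariances. It also assumes the measurement
model is correctly specified (linear, positive loadings, no extra latent
common causes among items).

```python
import math
import random


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)


def simulate(n, rho, seed=0):
    """Draw n samples from a toy model: latents l1, l2 with correlation rho,
    items x1, x2 measuring l1 and items y1, y2 measuring l2."""
    rng = random.Random(seed)
    # Hypothetical positive loadings; each item = loading * latent + noise.
    a1, a2, b1, b2 = 1.0, 0.8, 0.9, 0.7
    x1, x2, y1, y2 = [], [], [], []
    for _ in range(n):
        l1 = rng.gauss(0, 1)
        l2 = rho * l1 + math.sqrt(1 - rho**2) * rng.gauss(0, 1)
        x1.append(a1 * l1 + rng.gauss(0, 1))
        x2.append(a2 * l1 + rng.gauss(0, 1))
        y1.append(b1 * l2 + rng.gauss(0, 1))
        y2.append(b2 * l2 + rng.gauss(0, 1))
    return x1, x2, y1, y2


def naive_corr(x, y):
    """Correlation between one item per latent; attenuated by item noise."""
    return cov(x, y) / math.sqrt(cov(x, x) * cov(y, y))


def latent_corr(x1, x2, y1, y2):
    """Estimate the latent correlation from two items per latent.

    Under the toy model, cov(x1, y1) * cov(x2, y2) equals
    rho^2 * cov(x1, x2) * cov(y1, y2), so the loadings cancel in the ratio.
    """
    ratio = cov(x1, y1) * cov(x2, y2) / (cov(x1, x2) * cov(y1, y2))
    # The sign is taken from cov(x1, y1), which is valid for positive loadings.
    return math.copysign(math.sqrt(abs(ratio)), cov(x1, y1))


x1, x2, y1, y2 = simulate(20000, rho=0.6)
print("single-indicator correlation:", round(naive_corr(x1, y1), 3))
print("multiple-indicator estimate: ", round(latent_corr(x1, x2, y1, y2), 3))
```

In this toy setting the single-indicator correlation understates the latent
correlation, while the two-indicator estimate is close to the value used in
the simulation. If the items shared an additional unmodeled common cause, the
covariance ratio would no longer isolate the latent correlation, which is the
kind of mis-specification discussed next.
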
A number of problems make it difficult to find a correctly specified
measurement model: a) associations among items are often confounded by
additional unknown latent common causes, b) there is often a plethora of
alternative models that are consistent with the data and with the prior
knowledge of domain experts, c) there may be non-linear dependencies among
latent variables, or linear relationships among non-Gaussian latent variables,
and d) there may be feedback relationships among latent variables. This work
will generalize previous work by Silva et al. (2006) in JMLR in a way that
overcomes many of these difficulties. In particular, I will outline an
approach that leverages algebraic work by Sullivant et al. (2010) in the
Annals of Statistics. The approach allows for models that have several latent
factors underlying each pair of items, that involve non-linearities and
non-Gaussian variables (in parts of the model), and that involve feedback
between latent variables of interest. I will also describe the open problems
for this approach.