Among the Blind the One-Eyed is King: A Decision-Tree Model for Dealing with Incomplete Information in Environmental Policy

Tanja Srebotnjak, Yale University

 

 

 

Abstract

 

Incomplete data are pervasive in environmental data collection and pose considerable problems for environmental policy-making. Although careful planning and execution of the collection protocol can considerably reduce the number of missing values, analysts must be prepared to deal appropriately with data gaps, because most statistical inference techniques require complete data and software applications can at best address the most standard missing data scenarios. Over the years, and especially since the 1970s, statisticians have developed a multitude of methods for analyzing incomplete data. Yet ad hoc and naïve treatments such as listwise deletion and mean imputation remain in widespread use despite their limitations for obtaining valid statistical inference. This paper integrates existing missing data methods into a decision-tree framework that enables environmental practitioners and non-experts in the statistical modeling of incomplete data to identify suitable methods for their specific problem. Several taxonomies of incomplete data are presented and subsequently unified in the decision-tree framework. The utility of the framework is tested using real data with artificially created missing values. An overview of currently available software for missing data analysis further facilitates application of the framework by helping users find computational solutions. In conclusion, the paper discusses the strengths and weaknesses of the unified decision-tree framework and outlines the scope for future research.

 

Keywords: incomplete data, missing data, missing data pattern, missing data generating mechanism, MCAR, MAR, MNAR, imputation, complete-data likelihood, posterior predictive distribution