Descriptive Statistics (univariate)
- qualitative/quantitative variables
- continuous/discrete variables
- random variable
- sample mean
- sample variance (defining formula)
- sample variance (computational formula)
- histograms
- histogram area = probability
- transformations for changing histogram shape
- comparative boxplots
- quantile-quantile plots

Descriptive Statistics (bivariate)
- scatterplot (constant variance)
- histograms on scatterplots
- contingency tables

Distributions (Calculus required)
- probability density function (pdf)
- probability mass function (pmf)
- Bernoulli
- Binomial
- Derivation of Binomial
- Poisson
- Derivation of Poisson
- Uniform
- Normal/Gaussian
- Exponential
- Power-law
- Expected Value (E)
- Variance (V)

Probability
- Standardization
- Derivation of Sampling Distribution
- Derivation of E[sample mean]
- Derivation of V[sample mean]
- Central Limit Theorem
- prob(a < sample mean < b) = integral/sum of pdf/pmf
- prob(observed sample mean) = nonsense
- prob(population parameter) = nonsense
- Basic set/event theory, Venn diagrams
- Basic axioms of prob, conditional prob
- Independence, and Bayes' theorem
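
For reference, the two sample-mean derivations listed under Probability can be sketched as follows. This is a minimal outline, not necessarily the presentation used in the course. The notation is an assumption: X_1, ..., X_n are taken to be independent draws from one population with mean \mu and variance \sigma^2, and \bar{X} denotes the sample mean.

```latex
\begin{align*}
\bar{X} &= \frac{1}{n}\sum_{i=1}^{n} X_i \\
E[\bar{X}] &= \frac{1}{n}\sum_{i=1}^{n} E[X_i]
            = \frac{1}{n}\, n\mu = \mu \\
V[\bar{X}] &= \frac{1}{n^{2}}\sum_{i=1}^{n} V[X_i]
            = \frac{1}{n^{2}}\, n\sigma^{2} = \frac{\sigma^{2}}{n}
\end{align*}
```

The expected-value step uses only linearity of E. The variance step also uses independence, so that the variance of the sum equals the sum of the variances.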

Bivariate (Calculus required)
- correlation coefficient (r)
- invariance of r under shift/scale transformations
- sensitivity of r to clusters/outliers
- Ordinary Least-Squares Regression (OLS)
- meaning of regression coefficients
- Derivation of OLS estimates of regression coefficients
- Model versus Data
- Extrapolation
- transformations
- polynomial regression
- multiple regression
- overfitting
- collinearity
- interaction
- residual plots
- Analysis of Variance (ANOVA) in regression
- Derivation of ANOVA in regression
- R-squared
- standard deviation of errors

Inference with Confidence Interval (CI)
- Derivation of CI
- 2-sided CI for 1 mean
- 2-sided CI for 2 means
- 1-sided CI (upper and lower) for 1 mean
  (All 1-sided CIs are skipped this quarter.)
- 1-sided CI (upper and lower) for 2 means
- 2-sided CI for 1 proportion
- 2-sided CI for 2 proportions
- 1-sided CI (upper and lower) for 1 proportion
- 1-sided CI (upper and lower) for 2 proportions
- CI for paired data
- determination of upper vs. lower confidence bound
- random vs. observed CI
- interpretation of CI in terms of confidence
- interpretation of CI in terms of probability
- coverage property of CIs
- determination of minimum sample size
- unknown vs. known population variance
- t-distribution
- CIs based on t-distribution

Hypothesis Testing (HT)
- HT with p-value
- HT with rejection region
- 2-sided 1-sample HT for mean
- 2-sided 2-sample HT for means
- 1-sided 1-sample HT for mean
- 1-sided 2-sample HT for means
- 2-sided 1-sample HT for proportion
- 2-sided 2-sample HT for proportions
- 1-sided 1-sample HT for proportion
- 1-sided 2-sample HT for proportions
- HT for paired data
- Determination of 2-sided vs. 1-sided
- Significance level as prob(Type I)
- chi-squared HT of multiple proportions in 1 population
- chi-squared HT of multiple proportions in multiple populations (i.e., test of homogeneity)
- power
- 1-way ANOVA F-test

Inference in Regression
- probability model for regression
- predicted response = conditional mean
- prob(a < random y < b)
- CI and HT of regression parameters in simple regression
- CI and HT of regression parameters in polynomial regression
- CI and HT of regression parameters in multiple regression
- F-test of model utility in multiple regression
- HT of correlation coefficient
- Confidence Interval (CI) for the true mean y(x)
- Prediction Interval (PI) for a single y, at a given x
- Derivation of CI and PI for prediction
- Classification and Regression problems (e.g., neural nets)

Statistical Software
- R