STATISTICS 593:
Machine Learning, Spring 2000
Instructors: Alejandro Murua and Jon A. Wellner
Tentative Time: T-Th 9:30 - 11:00
First Meeting: Tuesday, March 28
This special topics course will deal with statistical issues arising
from "machine learning". The course will begin with a discussion of the
concept of "learning" and its connection to some classical statistical
problems and terminology.
Although emphasis will be placed on classification error bounds and
model selection techniques, the course will cover both applied and
theoretical aspects of machine learning. Examples of applications will
be drawn from speech and object recognition, document classification,
and other fields.
The course is offered credit/non-credit. Student participation will be
encouraged through special reading assignments, short presentations, or
short term projects.
Tentative Topics / Outline

I. Introduction (3 weeks)
   A. Problems: classification and discrimination (1.5 weeks)
      - Supervised learning:
        - Linear Discriminant Analysis.
        - Logistic Discrimination.
        - Flexible Discriminant Analysis.
      - Instance-based learning: K-Nearest-Neighbors (see the sketch
        following this outline).
      - PAC-learnability; the special case of a finite family of
        classifiers (see the bound following this outline).
   B. Methods: Vapnik-Chervonenkis classes and empirical process
      theory (1.5 weeks)
      - VC-classes, exponential bounds, and PAC-learnability.
      - Glivenko-Cantelli classes of functions.
      - "Universal" Glivenko-Cantelli classes of functions.
      - An introduction to "uniform" exponential bounds.
II. Classification and Decision Trees (2 weeks)
   A. Classification and decision trees (1 week)
      - Bagging (see the sketch following this outline).
      - Boosting.
      - Weakly dependent classifiers.
   B. Exponential bounds, Part II (1 week)
      - Isoperimetric inequalities (Talagrand).
      - Massart's inequalities.
III. Model Complexity: Penalization and Model Selection (3.5 weeks)
   A. Approaches to model selection (1 week)
      - Pruning decision trees.
      - AIC, BIC (see the formulas following this outline).
      - Vapnik's structural risk minimization.
   B. Empirical penalization and exponential bounds, Part III (2.5 weeks)
      - Koltchinskii's approach: Rademacher penalties (see the bound
        following this outline).
      - Birgé and Massart: model selection and "oracle inequalities".
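
The examples below are illustrative sketches only, not course
materials; all function names and data are ours. First, a minimal
K-nearest-neighbors classifier, the instance-based learner of Unit I.A:

    import numpy as np

    def knn_predict(X_train, y_train, X_test, k=3):
        # Classify each test point by majority vote among the labels
        # of its k nearest training points (Euclidean distance).
        predictions = []
        for x in X_test:
            distances = np.linalg.norm(X_train - x, axis=1)
            nearest = np.argsort(distances)[:k]
            labels, counts = np.unique(y_train[nearest], return_counts=True)
            predictions.append(labels[np.argmax(counts)])
        return np.array(predictions)

    # Toy data: two well-separated clusters, labeled 0 and 1.
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [5., 5.], [5., 6.], [6., 5.]])
    y = np.array([0, 0, 0, 1, 1, 1])
    print(knn_predict(X, y, np.array([[0.5, 0.5], [5.5, 5.5]])))  # expected: [0 1]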
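
For the finite-family special case of PAC-learnability (Unit I.A), the
standard starting point is Hoeffding's inequality combined with a union
bound: for 0-1 loss and n i.i.d. training examples, with probability at
least 1 - \delta, every classifier h in a finite family \mathcal{H}
satisfies

    \[
    L(h) \le \hat L_n(h)
      + \sqrt{\frac{\log|\mathcal{H}| + \log(1/\delta)}{2n}},
    \]

where L(h) is the true classification error and \hat L_n(h) the
empirical error. The VC theory of Unit I.B replaces \log|\mathcal{H}|
by a combinatorial measure of the richness of an infinite family.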
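
Next, a sketch of bagging (Unit II.A) under simplifying assumptions
chosen for brevity: binary 0/1 labels and exhaustively fitted decision
stumps as the base classifier:

    import numpy as np

    def fit_stump(X, y):
        # One-split "decision stump": pick the feature, threshold, and
        # label assignment with the smallest 0-1 training error.
        best, best_err = None, np.inf
        for j in range(X.shape[1]):
            for t in np.unique(X[:, j]):
                for l_lab, r_lab in ((0, 1), (1, 0)):
                    err = np.mean(np.where(X[:, j] <= t, l_lab, r_lab) != y)
                    if err < best_err:
                        best_err, best = err, (j, t, l_lab, r_lab)
        return best

    def stump_predict(stump, X):
        j, t, l_lab, r_lab = stump
        return np.where(X[:, j] <= t, l_lab, r_lab)

    def bagging_fit(X, y, n_estimators=25, seed=0):
        # Train one stump per bootstrap resample (sampling rows with
        # replacement); the aggregated classifier takes a majority vote.
        rng = np.random.default_rng(seed)
        n = len(y)
        stumps = [fit_stump(X[idx], y[idx])
                  for idx in (rng.integers(0, n, size=n)
                              for _ in range(n_estimators))]
        def predict(X_new):
            votes = np.mean([stump_predict(s, X_new) for s in stumps], axis=0)
            return (votes >= 0.5).astype(int)  # fraction of stumps voting 1
        return predict

    predict = bagging_fit(X, y)  # X, y from the K-NN example above
    print(predict(np.array([[0.5, 0.5], [5.5, 5.5]])))  # expected: [0 1]

Averaging over bootstrap resamples stabilizes a high-variance base
classifier; the exponential bounds of Unit II.B are one route to making
that intuition precise.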
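
For Unit III.A, the two classical penalized-likelihood criteria, with
\hat L the maximized likelihood, k the number of fitted parameters, and
n the sample size (one selects the model minimizing the criterion):

    \[
    \mathrm{AIC} = -2\log\hat L + 2k, \qquad
    \mathrm{BIC} = -2\log\hat L + k\log n.
    \]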
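
Finally, the shape of the Rademacher-penalty bounds of Unit III.B
(stated here in expectation form for a class \mathcal{F} of [0,1]-valued
functions; constants differ across references): with probability at
least 1 - \delta,

    \[
    \sup_{f \in \mathcal{F}} \bigl(Pf - P_n f\bigr)
      \le 2\,\mathbb{E}\,R_n(\mathcal{F})
      + \sqrt{\frac{\log(1/\delta)}{2n}},
    \qquad
    R_n(\mathcal{F}) = \mathbb{E}_\sigma \sup_{f \in \mathcal{F}}
      \frac{1}{n}\sum_{i=1}^n \sigma_i f(X_i),
    \]

where the \sigma_i are independent Rademacher (\pm 1) signs.
Koltchinskii's penalization replaces the expectation by its empirical,
data-dependent version.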
Some Reference Books:
- Devroye, L., Györfi, L., and Lugosi, G. (1996). A Probabilistic Theory
  of Pattern Recognition. Springer, New York.
- Mitchell, T. (1997). Machine Learning. McGraw-Hill, New York.
- Ripley, B. D. (1996). Pattern Recognition and Neural Networks.
  Cambridge University Press, Cambridge.
- Vapnik, V. (1998). Statistical Learning Theory. Wiley, New York.
- Vidyasagar, M. (1997). A Theory of Learning and Generalization.
  Springer, New York.