Monday February 14 2011, 14.00 – 16.00 Aula Bianchi

Tuesday February 15 2011, 9.00 – 11.00 Aula Fermi

Thursday February 17 2011, 14.00 – 16.00 Aula Fermi

Tuesday February 22 2011, 14.00 – 16.00 Aula Dini

Wednesday February 23 2011, 9.00 – 11.00 Aula Bianchi

Scuola Normale Superiore

**SYLVAIN ARLOT**

CNRS-INRIA and ENS, Paris

**Advanced Course on Statistics**

**Lecture 1. (Monday February 14) Statistical learning**

- the statistical learning problem
- examples: prediction, regression, classification, density estimation
- estimators: definition, consistency, examples
- universal learning rates and No Free Lunch Theorems [1]
- the estimator selection paradigm, bias-variance decomposition of the risk
- data-driven selection procedures and the unbiased risk estimation principle

**Lecture 2. (Tuesday February 15) Model selection for least-squares regression**

- ideal penalty, Mallows’ Cp
- oracle inequality for Cp (i.e., non-asymptotic optimality of the corresponding model selection procedure), corresponding learning rates [2]
- the variance estimation problem
- minimal penalties and data-driven calibration of penalties: the slope heuristics [3,4]
- algorithmic and other practical issues [5]
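
As a rough illustration of the Cp criterion listed above, the sketch below selects the number of bins of a regressogram (piecewise-constant least-squares fit) by minimizing the empirical risk plus the penalty 2σ²D/n. Everything specific in it is an assumption made for the example, not course material: the simulated sine signal, the function names (`regressogram_fit`, `mallows_cp`, `select_dimension`), and above all the noise variance σ², which is taken as known here although estimating it is precisely one of the topics of the lecture.

```python
import math
import random


def regressogram_fit(x, y, d):
    """Least-squares fit over piecewise-constant functions on d equal bins of [0, 1).

    Returns the fitted value at each design point (the mean of y in its bin).
    """
    sums = [0.0] * d
    counts = [0] * d
    for xi, yi in zip(x, y):
        b = min(int(xi * d), d - 1)
        sums[b] += yi
        counts[b] += 1
    # Empty bins contain no design point, so their value never enters the fit.
    means = [s / c if c > 0 else 0.0 for s, c in zip(sums, counts)]
    return [means[min(int(xi * d), d - 1)] for xi in x]


def empirical_risk(y, fitted):
    """Mean squared residual on the sample."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / len(y)


def mallows_cp(x, y, d, sigma2):
    """Empirical risk plus the Cp penalty 2 * sigma2 * d / n (sigma2 assumed known)."""
    n = len(y)
    return empirical_risk(y, regressogram_fit(x, y, d)) + 2.0 * sigma2 * d / n


def select_dimension(x, y, dims, sigma2):
    """Return the dimension in dims that minimizes the Cp criterion."""
    return min(dims, key=lambda d: mallows_cp(x, y, d, sigma2))


def simulate(n=500, sigma=0.5, seed=0):
    """Hypothetical data: a sine signal observed with homoscedastic Gaussian noise."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    y = [math.sin(2 * math.pi * xi) + rng.gauss(0.0, sigma) for xi in x]
    return x, y


x, y = simulate()
dims = list(range(1, 51))
best = select_dimension(x, y, dims, sigma2=0.25)
print("selected number of bins:", best)
```

Without the penalty, the empirical risk alone tends to favour the finest partitions, since it can only decrease along nested models; the penalty term is what makes the criterion trade bias against variance.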

**Lecture 3. (Thursday February 17) Linear estimator selection for least-squares regression [6]**

- linear estimators: (kernel) ridge regression, smoothing splines, k-nearest neighbours, Nadaraya-Watson estimators
- bias-variance decomposition of the risk
- the linear estimator selection problem: CL penalty
- oracle inequality for CL
- data-driven calibration of penalties: a new light on the slope heuristics

**Lecture 4. (Tuesday February 22) Resampling and model selection**

- regressograms in heteroscedastic regression: the penalty cannot be a function of the dimensionality of the models [7]
- resampling in statistics: general heuristics, the bootstrap, exchangeable weighted bootstrap [8]
- study of a case example: estimating the variance by resampling
- resampling penalties: why do they work for heteroscedastic regression? Oracle inequality. Comparison of the resampling weights [9]

**Lecture 5. (Wednesday February 23) Cross-validation and model/estimator selection [10]**

- cross-validation: principle, main examples
- cross-validation for estimating the prediction risk: bias, variance
- cross-validation for selecting among a family of estimators: main properties, how should the splits be chosen?
- illustration of the robustness of cross-validation: detecting changes in the mean of a signal with unknown and non-constant variance [11]
- correcting the bias of cross-validation: V-fold penalization. Oracle inequality [12]
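
To make the principle of V-fold cross-validation concrete, here is a minimal sketch that selects the number of bins of a regressogram by minimizing the V-fold estimate of the prediction risk. It only illustrates the plain procedure, not V-fold penalization or any result from the references. The simulated heteroscedastic data, the helper names (`fit_regressogram`, `make_folds`, `vfold_risk`), the choice V = 5, and the convention used for empty bins are all assumptions made for the example.

```python
import math
import random


def fit_regressogram(x, y, d):
    """Bin means of a piecewise-constant least-squares fit on d equal bins of [0, 1)."""
    sums, counts = [0.0] * d, [0] * d
    for xi, yi in zip(x, y):
        b = min(int(xi * d), d - 1)
        sums[b] += yi
        counts[b] += 1
    overall = sum(y) / len(y)  # convention chosen here for bins with no training point
    return [s / c if c else overall for s, c in zip(sums, counts)]


def predict(means, xi):
    """Value of the fitted piecewise-constant function at the point xi."""
    d = len(means)
    return means[min(int(xi * d), d - 1)]


def make_folds(n, v, seed=0):
    """Split the indices 0..n-1 into v disjoint blocks after a random shuffle."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[k::v] for k in range(v)]


def vfold_risk(x, y, d, folds):
    """Average over folds of the squared error on the held-out block."""
    total = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in range(len(x)) if i not in held_out]
        means = fit_regressogram([x[i] for i in train], [y[i] for i in train], d)
        total += sum((y[i] - predict(means, x[i])) ** 2 for i in fold) / len(fold)
    return total / len(folds)


# Hypothetical data: sine signal with noise level growing with x (heteroscedastic).
rng = random.Random(1)
n = 300
x = [rng.random() for _ in range(n)]
y = [math.sin(2 * math.pi * xi) + rng.gauss(0.0, 0.2 + 0.6 * xi) for xi in x]

folds = make_folds(n, v=5)
dims = list(range(1, 31))
cv = {d: vfold_risk(x, y, d, folds) for d in dims}
best_d = min(cv, key=cv.get)
print("number of bins selected by 5-fold cross-validation:", best_d)
```

The same folds are reused for every candidate, so all estimators are compared on identical splits; how the number and the structure of the splits should be chosen is one of the questions addressed in the lecture.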