Yves Rozenholc (Université Paris Descartes)
Statistical Base Jumping: A simple and fully data-driven answer to penalized model selection
In the context of model selection by minimization of a penalized contrast, a question has remained open since the original work of Akaike (1974), followed by that of Barron, Birgé and Massart (1999): how can one build a data-driven penalized procedure that offers both good theoretical properties and good empirical behavior?
Comte and Rozenholc (2002, 2004) already proposed a plug-in approach to answer this question. Several further attempts have been made in recent years, following two main approaches, "dimension jump" and "slope heuristic"; see, non-exhaustively, Birgé and Massart (2006), Lebarbier (2005), Arlot and Massart (2009), Baudry et al. (2008), Baudry et al. (2012), and also Baraud, Giraud and Huet (2009). Unfortunately, such approaches yield unstable solutions that must be stabilized before they can be used in practice, and/or they are not easy to implement.
In this talk, I will present, in the fixed-design homoscedastic regression setting, a new fully data-driven model selection procedure called "Statistical Base Jumping", which has the benefit of being both stable and easy to implement. Several heuristics justifying this construction will be provided, and simulations will show that it is safely controlled in terms of risk. Finally, I will show how this construction can be extended to other models.
In the same section:
- Ségolen Geffray (IRMA, UMR 7501, Université de Strasbourg)
- Bertrand Michel (LSTA, Université Pierre et Marie Curie)
- Van Hanh Nguyen (Laboratoire Statistique et Génome, Université d’Evry and Université Paris-Sud 11)
- Tristan Mary-Huard (AgroParisTech, UMR INRA/AgroParisTech MIA 518)
- Vittorio Perduca (MAP5, Université Paris Descartes)
- Sébastien Gerchinovitz (DMA, Ecole normale supérieure and Université Paris-Sud)
- Maud Delattre (Laboratoire de Mathématiques, Université Paris Sud)
- Serge Cohen (CNRS/UPS3352 IPANEMA / Synchrotron SOLEIL)
- Julien Stirnemann (MAP5, Maternité et médecine materno-foetale, GHU Necker-Enfants Malades, Université Paris Descartes and CNRS)
- Laureen Ribassin-Majed (MAP5, Université Paris Descartes and CNRS)
- Aurélie Fischer (MAP5 and LSTA, Universités Paris Descartes and Pierre et Marie Curie)
- Anne-Cécile Dragon (CEBC and MAP5, Université Paris Descartes and CNRS)
- Niels Keiding (Department of Biostatistics, University of Copenhagen)
- Christophe Pouzat (Laboratoire de Physiologie Cérébrale, Université Paris Descartes)
- Gaëlle Chagny (MAP5, Université Paris Descartes)
- Marc Vincent (Bases moléculaires de la réponse aux xénobiotiques, UMR-S775, Université Paris Descartes)
- Aurélien Garivier (LTCI Telecom ParisTech, CNRS UMR 5141)
- Pierre Neuvial (Laboratoire Statistique et Génome, Évry, UMR CNRS 8071/Université d’Evry/INRA)
- Simon Cauchemez (School of Public Health and Imperial College, London)
- Meïli Baragatti (IML, Université de la Méditerranée and Ipsogen)
