2010-07-28

cross-validation demos

I am giving a talk on model selection on Friday. Today I made a demo that compares AIC, BIC, and leave-one-out cross-validation. All of these are frequentist model-selection criteria (despite the B in BIC). Of them, cross-validation is by far the best, for several reasons: it is based on predicting data, not on theory; it is (relatively) insensitive to bad uncertainty estimates; and it has an interpretation in terms of a sensible utility. One of the oddities of model selection, not often appreciated, is that no principled frequentist or Bayesian result tells you which model to choose; data analysis just puts probabilities on models. If you want to remain principled, you either never choose and just propagate all models (best practice for Bayesians), or else you explicitly state your utility function and make the decision that maximizes your expected utility. Any time you choose a model without specifying or calculating a utility, you have made a mistake, I think. Perhaps not a serious mistake, but a mistake nonetheless.
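
For concreteness, here is a minimal sketch of what such a comparison might look like; it is not the demo from the talk. Everything in it is an assumption made for illustration: the toy data (a quadratic plus Gaussian noise), the polynomial model family, the known homoscedastic noise level `sigma`, and the helper names `chi2`, `aic`, `bic`, `loo_cv`, and `select`. With Gaussian noise of known variance, -2 ln(likelihood) equals chi-squared up to a constant, so AIC and BIC are written here as chi-squared plus a penalty.

```python
import numpy as np

def chi2(x, y, sigma, coeffs):
    """Chi-squared of a polynomial (coeffs, highest power first) against data."""
    resid = (y - np.polyval(coeffs, x)) / sigma
    return float(np.sum(resid ** 2))

def aic(x, y, sigma, degree):
    """AIC up to an additive constant: chi2 + 2 k, with k free parameters."""
    k = degree + 1
    return chi2(x, y, sigma, np.polyfit(x, y, degree)) + 2.0 * k

def bic(x, y, sigma, degree):
    """BIC up to an additive constant: chi2 + k ln(n)."""
    k = degree + 1
    return chi2(x, y, sigma, np.polyfit(x, y, degree)) + k * np.log(len(x))

def loo_cv(x, y, sigma, degree):
    """Leave-one-out score: summed squared, scaled error on each held-out point."""
    total = 0.0
    for i in range(len(x)):
        keep = np.arange(len(x)) != i                    # drop point i
        coeffs = np.polyfit(x[keep], y[keep], degree)    # fit on the rest
        total += ((y[i] - np.polyval(coeffs, x[i])) / sigma) ** 2
    return float(total)

def select(x, y, sigma, degrees):
    """Return the degree that minimizes each criterion (lower is better)."""
    criteria = {"AIC": aic, "BIC": bic, "LOO": loo_cv}
    return {name: min(degrees, key=lambda d: f(x, y, sigma, d))
            for name, f in criteria.items()}

# Toy data (assumed for illustration): a quadratic with Gaussian noise.
rng = np.random.default_rng(42)
sigma = 0.2
x = np.linspace(-1.0, 1.0, 30)
y = 1.0 + 0.5 * x - 2.0 * x ** 2 + sigma * rng.normal(size=x.size)

degrees = list(range(7))
best = select(x, y, sigma, degrees)
print(best)
```

In this setup, calling `select` again with a wrong noise level (say `3 * sigma`) rescales every leave-one-out score by the same factor, so the cross-validation ranking does not move, while the AIC and BIC rankings can change because their penalties do not rescale. That is one simple sense in which cross-validation is insensitive to bad uncertainty estimates. Note that none of these scores is a utility calculation; taking the minimum is exactly the kind of unstated choice the post warns about.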
