Model selection as an approach
3.6 Explanation
3.6.1 The basics
Model selection compares a set of candidate models that differ in the suites of variables they include but are otherwise comparable.
- The models are compared by a single number (such as AIC, AICc, or BIC, defined later) and ranked. The “best” (more accurately, the highest-ranked) models are the best of the set you tested, not the best of all possible models. You may not have included every variable needed to explain your data.
- This approach can be extended with “model averaging”, which lets you combine several of the highest-ranked models.
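As a minimal sketch of the ranking step, the snippet below scores a small candidate set and sorts it. It uses the standard textbook formulas for AIC, AICc, and BIC. The model names, log-likelihoods, parameter counts, and sample size are all made up for illustration; in practice these come from your fitted models.

```python
import math

def aic(log_lik, k):
    """AIC = -2 * log-likelihood + 2 * (number of estimated parameters)."""
    return -2.0 * log_lik + 2.0 * k

def aicc(log_lik, k, n):
    """AIC with the small-sample correction; only defined when n > k + 1."""
    return aic(log_lik, k) + (2.0 * k * (k + 1)) / (n - k - 1)

def bic(log_lik, k, n):
    """BIC = -2 * log-likelihood + k * ln(n)."""
    return -2.0 * log_lik + k * math.log(n)

# Hypothetical candidate set: name -> (maximized log-likelihood, parameter count)
candidates = {
    "habitat": (-120.4, 3),
    "habitat + climate": (-115.2, 5),
    "climate": (-118.9, 4),
}
n_obs = 60  # hypothetical sample size

# Score every candidate with AICc, then rank from lowest (best) to highest
scores = {name: aicc(ll, k, n_obs) for name, (ll, k) in candidates.items()}
best = min(scores.values())
ranking = sorted(scores.items(), key=lambda item: item[1])

for name, score in ranking:
    # delta is the distance from the top-ranked model in this set
    print(f"{name:20s} AICc = {score:7.2f}  delta = {score - best:5.2f}")
```

The ranking only orders the models you supplied; adding or removing a candidate changes every delta.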
Read chapter 8 of the foundational book first to see if the approach is right for you (Kenneth P. Burnham and Anderson 2008).
3.6.2 More technical
3.6.2.1 Questions and data types
- Any model that generates an AIC can work, including linear regression, generalized linear models, ANOVA, ANCOVA, and more
3.6.2.2 Key assumptions
- You provide selected, well-thought-out hypotheses and compare them
- This is best suited for comparing suites of variables instead of all possible combinations
- The highest-ranked model may not be the best possible model, only the best of the models you’ve tested
- You give up significance (p-values) with this approach
- To gauge how well a model is supported by your data relative to the other candidates, you can look at the model weights
- You don’t get to interpret the coefficients by significance like you do with frequentist statistics
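The weights mentioned above are usually Akaike weights, computed from each model's difference to the top-ranked model. A minimal sketch follows, assuming you already have an AIC-type score per model; the model names and scores are hypothetical.

```python
import math

def akaike_weights(scores):
    """Turn AIC-type scores into weights that sum to 1 across the candidate set."""
    best = min(scores.values())
    # Relative likelihood of each model given the data: exp(-delta / 2)
    rel_lik = {name: math.exp(-0.5 * (s - best)) for name, s in scores.items()}
    total = sum(rel_lik.values())
    return {name: rl / total for name, rl in rel_lik.items()}

# Hypothetical AICc scores for three candidate models
scores = {"model_a": 241.5, "model_b": 243.5, "model_c": 251.5}
weights = akaike_weights(scores)

for name, w in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{name}: weight = {w:.3f}")
```

A weight is relative to the set: it says how much support one model has compared with the other candidates, not how much of the variation in your data the model explains.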
3.6.2.4 Common terminology confusions
Some papers describe the approach as “information-theoretic” (Anderson and Burnham 2002), while others refer to the methods more generally as “model selection” (Kenneth P. Burnham and Anderson 2008). Model selection is one type of information-theoretic approach, but “information-theoretic approach” is often used as shorthand for model selection in casual writing or conversation.
Model averaging is a type of multi-model inference but the terms are sometimes used interchangeably in more casual writing or conversation.
3.6.2.5 Cite these
Kenneth P. Burnham and Anderson (2008) is the classic baseline. The summary chapter (8) is a good starting place to understand if the approach is right for your data and problem. If you plan to use the models, then go ahead and read the whole book, followed by the implementation papers listed in the next section.
3.6.2.6 Implementations and controversies
The AICcmodavg R package has a vignette with examples and details (Mazerolle 2023). Open software.
Basic difficulties are covered in Anderson and Burnham (2002).
Kenneth P. Burnham, Anderson, and Huyvaert (2011) provides a basic summary as well as how to use multi-model inference (also called model averaging).
Harrison et al. (2018) covers multi-model inference as well as using mixed models as the base models to compare. Open access.
How to implement model averaging correctly is still debated (Cade 2015).
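To make the idea concrete, here is a minimal sketch of one form of model averaging: a weighted average of each model's prediction for the same observation, using Akaike weights. It averages predictions rather than regression coefficients, since averaging coefficients is one of the debated points. This is an illustration under made-up numbers, not the procedure of any particular paper or package; the function names, scores, and predictions are hypothetical.

```python
import math

def akaike_weights(scores):
    """Turn AIC-type scores into weights that sum to 1 across the candidate set."""
    best = min(scores.values())
    rel_lik = {name: math.exp(-0.5 * (s - best)) for name, s in scores.items()}
    total = sum(rel_lik.values())
    return {name: rl / total for name, rl in rel_lik.items()}

def model_averaged_prediction(scores, predictions):
    """Weight each model's prediction by its Akaike weight and sum the results."""
    weights = akaike_weights(scores)
    return sum(weights[name] * predictions[name] for name in scores)

# Hypothetical AICc scores and each model's prediction for one new observation
scores = {"model_a": 241.5, "model_b": 243.5, "model_c": 251.5}
predictions = {"model_a": 12.0, "model_b": 14.0, "model_c": 20.0}

averaged = model_averaged_prediction(scores, predictions)
print(f"model-averaged prediction: {averaged:.2f}")
```

The averaged value always falls between the smallest and largest individual predictions and is pulled toward the predictions of the highest-ranked models.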
3.8 Reporting results
Anderson et al. (2001) compares how to present typical frequentist (p-value) analyses with how to present information-theoretic (model selection) and Bayesian results. The paper focuses mainly on the advantages of model selection and on how to present its methods and results. The authors strongly advise against mixing frequentist (p-value and test statistic reporting) and information-theoretic (model selection) results.