Friday, September 6, 2013

Fixed or Mixed: The Dilemma of Mixed Models versus Linear Models

We all know what a linear (regression) model is, but mixed models are less well known (read more about them here). Linear models only model fixed effects, while mixed models model both fixed and random effects.
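To make the distinction concrete, here is one common way to write the two models. This is a sketch of the simplest case (a single predictor and a random intercept per group), and the symbols are my own notation rather than anything taken from the tutorial: $y$ is the response, $x$ the predictor, $\beta_0, \beta_1$ the fixed-effect coefficients, $u_j$ the random intercept of group $j$ (for example, a subject), and $\varepsilon$ the residual error.

```latex
% Linear model: fixed effects only
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)

% Mixed model: the same fixed effects plus a random intercept u_j for group j
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

The only difference is the extra term $u_j$: it is not estimated as a separate coefficient for each group, but treated as a draw from a distribution whose variance $\sigma_u^2$ is estimated.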

One of the questions that many people initially ask is: How do we know if a factor is random or fixed? Bodo Winter has a nice tutorial on linear mixed models, and I summarize the main points here:
  • if the factor has a systematic relation with the predicted value (the class label, in machine learning terms), it is a fixed effect (gender has a systematic relation with voice pitch: pitch is typically higher in females and lower in males)
  • if the factor has a non-systematic relation with the predicted value, it is a random effect (subject ID has no systematic relation with voice pitch: any given subject might have a high or a low pitch)
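
The two cases above can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the mean pitches (220 Hz and 120 Hz), the standard deviations, and the function name `simulate_pitch` are all hypothetical values chosen for illustration, not figures from the tutorial. Gender shifts pitch in a predictable direction, while each subject gets an offset drawn at random.

```python
import random
from statistics import mean


def simulate_pitch(n_subjects=200, n_trials=10, seed=0):
    """Simulate pitch with a fixed gender effect and a random subject effect."""
    rng = random.Random(seed)
    # Assumed mean pitch in Hz per gender: the systematic (fixed) part.
    base = {"female": 220.0, "male": 120.0}
    rows = []
    for subject_id in range(n_subjects):
        gender = "female" if subject_id % 2 == 0 else "male"
        # Each subject gets an unpredictable offset: the random part.
        subject_offset = rng.gauss(0.0, 20.0)
        for _ in range(n_trials):
            noise = rng.gauss(0.0, 5.0)  # trial-to-trial measurement noise
            rows.append((subject_id, gender, base[gender] + subject_offset + noise))
    return rows


rows = simulate_pitch()

# Fixed effect: the gap between genders is large and has a consistent sign.
gender_mean = {
    g: mean(p for _, gg, p in rows if gg == g) for g in ("female", "male")
}
gender_gap = gender_mean["female"] - gender_mean["male"]

# Random effect: per-subject deviations from the gender mean go both ways.
subject_deviation = {}
for subject_id in {s for s, _, _ in rows}:
    pitches = [(g, p) for s, g, p in rows if s == subject_id]
    g = pitches[0][0]
    subject_deviation[subject_id] = mean(p for _, p in pitches) - gender_mean[g]

print(f"gender gap: {gender_gap:.1f} Hz")
print(f"subject deviations range from {min(subject_deviation.values()):.1f} "
      f"to {max(subject_deviation.values()):.1f} Hz")
```

Knowing a subject's gender tells you which way the pitch will shift; knowing the subject ID tells you nothing in advance, which is why it is modelled as a random effect.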

There is also another criterion (again, the credit goes to Bodo): fixed effects “exhaust the population of interest”, or they exhaust “the levels of a factor”. For example, for a “gender” attribute, there are only “male” and “female”, so these are the only two levels of this factor. In contrast, random effects generally sample from the population of interest. That means they are far from “exhausting the population”, because there are usually many more subjects or items that you could have tested.