Comments
Take Away the Mystery of Regression vs. ANOVA
Well Written, But Not Quite Accurate
  • 12/23/2016 5:59:06 PM

I really enjoyed your writing style, but this isn't quite an accurate breakdown of the difference between Regression and ANOVA.

Essentially, Regression is very broad and involves an incredible amount of math. ANOVA is a simplified form of Regression that was created in a time before computers to help researchers quickly and easily check whether their intervention groups were different.

I like to compare ANOVA to a point-and-shoot camera, and Regression to a pro's DSLR camera on a manual setting. You can do a LOT more with Regression, but only if you know how to use it, so a lot of people stick to the point-and-shoot ANOVA, and for a lot of 'pictures', the outcome is the same.

There is nothing that ANOVA can tell us that Regression can't; that doesn't make sense. That is like saying that there is something that a car can do that vehicles can't. Cars are just a type of vehicle, so by definition, everything that cars can do, vehicles can do. ANOVAs are a special type of regression. Everything an ANOVA can do, a Regression can do, but ANOVA can't always do everything Regression can do. 

ANOVA and Regression will yield the same results if all of your independent variables are categorical. The two are identical in this case. They answer the exact same questions. They both will tell you "do categories have an effect?", "how is the effect different across categories?" and "is this significant?". This makes sense, because ANOVA is Regression specifically for categorical variables.
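A minimal sketch of that equivalence, assuming NumPy is available and using made-up scores for three hypothetical groups (the group names and numbers are illustrative, not from the article). It computes the one-way ANOVA F statistic from sums of squares, then the overall F statistic of a regression on dummy variables, and the two should agree up to floating-point error:

```python
import numpy as np

# Made-up scores for three hypothetical intervention groups (illustrative only).
groups = {
    "control": [4.1, 5.0, 4.6, 5.3],
    "drug_a": [6.2, 5.9, 6.8, 6.4],
    "drug_b": [5.1, 5.6, 4.9, 5.8],
}

def anova_f(groups):
    """Classic one-way ANOVA F: between-group mean square over within-group mean square."""
    all_scores = np.concatenate([np.asarray(v, float) for v in groups.values()])
    grand_mean = all_scores.mean()
    ss_between = sum(len(v) * (np.mean(v) - grand_mean) ** 2 for v in groups.values())
    ss_within = sum(((np.asarray(v, float) - np.mean(v)) ** 2).sum() for v in groups.values())
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def regression_f(groups):
    """Overall F test of a least-squares fit on an intercept plus 0/1 dummy variables."""
    labels = list(groups)
    y = np.concatenate([np.asarray(groups[g], float) for g in labels])
    group_index = np.concatenate([np.full(len(groups[g]), i) for i, g in enumerate(labels)])
    # The first group is the reference level; one dummy column per remaining group.
    dummies = [(group_index == i).astype(float) for i in range(1, len(labels))]
    X = np.column_stack([np.ones(len(y))] + dummies)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    ss_model = ((fitted - y.mean()) ** 2).sum()
    ss_resid = ((y - fitted) ** 2).sum()
    df_model = X.shape[1] - 1
    df_resid = len(y) - X.shape[1]
    return (ss_model / df_model) / (ss_resid / df_resid)

f_anova = anova_f(groups)
f_reg = regression_f(groups)
print(f"ANOVA F = {f_anova:.6f}, regression F = {f_reg:.6f}")
```

The fitted values of the dummy-coded regression are just the group means, which is why the model and residual sums of squares match ANOVA's between-group and within-group sums of squares.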

If you are interested in an IV that is NOT categorical (like age, which could range from 0-100, or height), then ANOVA can't be used; it only works with categorical variables, and doesn't know what to do with continuous ones. That is the trade-off you accept for its simplicity. Some people who do not know how to use Regression will try to push continuous variables into a framework an ANOVA can use, by sorting people into artificial categories that they create (like "low age" and "high age"). This is generally not considered 'best practice' for a variety of reasons that I won't go into.
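One commonly cited reason is loss of information. The sketch below is only an illustration under stated assumptions: it uses NumPy, a hypothetical outcome that rises steadily with age plus a small deterministic wiggle, and a median split into "low age" and "high age". It compares how much outcome variance each version of the predictor explains:

```python
import numpy as np

def r_squared(X, y):
    """Share of variance in y explained by a least-squares fit on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1 - (resid ** 2).sum() / ((y - y.mean()) ** 2).sum()

# Hypothetical data: ages 20..79 and an outcome that increases with age.
age = np.arange(20, 80, dtype=float)
outcome = 50 + 0.8 * age + 2 * np.sin(age)

ones = np.ones_like(age)

# Regression on age as a continuous variable.
X_continuous = np.column_stack([ones, age])

# Median split: 0 = "low age", 1 = "high age" (the artificial categories).
high_age = (age > np.median(age)).astype(float)
X_split = np.column_stack([ones, high_age])

r2_continuous = r_squared(X_continuous, outcome)
r2_split = r_squared(X_split, outcome)
print(f"R^2 with continuous age: {r2_continuous:.3f}")
print(f"R^2 with low/high split: {r2_split:.3f}")
```

On this made-up data the split version explains noticeably less variance, because everyone inside a category is treated as if they were the same age.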

Regression, however, can still be used even if you have continuous IVs. Regression will tell you "for each additional year of age, your score on the outcome should change by this much", and it can ALSO tell you how this interacts with categories, like interaction effects in an ANOVA, and whether or not it is significant.
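As a rough sketch of what that looks like, assuming NumPy and noise-free hypothetical data built from known coefficients (the variable names `age`, `treated`, and `score` are made up for illustration), a regression with a continuous IV, a categorical IV, and their interaction recovers a per-year slope and how that slope differs between groups. Significance tests are left out to keep the example short:

```python
import numpy as np

# Hypothetical, noise-free data generated from known coefficients so the fit is easy to check.
age = np.array([20, 30, 40, 50, 60, 70, 20, 30, 40, 50, 60, 70], dtype=float)
treated = np.array([0] * 6 + [1] * 6, dtype=float)  # 0 = control, 1 = treated
score = 10 + 0.5 * age + 3 * treated + 0.2 * age * treated

# Design matrix: intercept, continuous age, group dummy, and age-by-group interaction.
X = np.column_stack([np.ones_like(age), age, treated, age * treated])
coef, *_ = np.linalg.lstsq(X, score, rcond=None)
intercept, age_slope, group_shift, interaction = coef

print(f"Change in score per year of age (control): {age_slope:.2f}")
print(f"Change in score per year of age (treated): {age_slope + interaction:.2f}")
print(f"Group difference at age 0: {group_shift:.2f}")
```

The interaction coefficient plays the same role as an interaction effect in an ANOVA: it says how much the age slope changes from one category to the other.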

Mainly, people choose what they are comfortable with. If people are comfortable with Regression, they will typically use Regression, because it can be used for all sorts of analyses (not just the ones done by ANOVA). If people are not comfortable with Regression, they tend to use ANOVA, mainly because most statistics courses teach it first, it is available in more easy-to-use software, and the output is much easier to understand.  This is fine, because ANOVA comparing groups is identical to Regression comparing groups. 

 

Re: This is excellent!
  • 11/23/2015 8:53:39 PM


John writes:


Use regression when you aren't sure whether the independent categorical variables have any effect at all. Use ANOVA when you want to see whether particular categories have different effects.


 

Perhaps the clearest and most succinct explanation of the difference between these similar statistical methods I've ever read. Thanks, and congratulations...

 

Re: This is excellent!
  • 11/21/2015 11:34:21 AM

Thanks for the nice article...excellent simplified explanation

Re: This is excellent!
  • 10/24/2012 2:13:39 PM

Mnorth, thanks for the vote of confidence -- and it's always nice to know a few dozen people who have never met me now hate me! (Haven't had so much fun since I wrote the applied problems for a math text back in the early 90s....)

This is excellent!
  • 10/24/2012 1:50:30 PM

Thank you for this post, John. I'm teaching regression in my Intro to Data Mining course this week and next, and your blog post just became an assigned reading for my class!


