# ANOVA

**ANOVA** (analysis of variance) is a quantitative research method for testing hypotheses about differences between two or more means. Provided that independent estimates of variance can be obtained from the data, ANOVA compares the means of different groups by comparing those variance estimates. There are two models for ANOVA: the fixed effects model and the random effects model (in the latter, the treatments are not fixed but are regarded as a random sample from a larger set of possible treatments).

## Quantitative technique: One-Way Analysis of Variance (ANOVA)

The analysis of variance is a partitioning of the total variance in a set of data into a number of component parts, so that the relative contributions of identifiable sources of variation to the total variation in measured responses can be determined. From this partition, suitable F-tests can be derived that allow differences between sets of means to be assessed.[1]
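For the one-way case, the partition described above can be written out as follows. The notation is an assumption introduced here for illustration and is not taken from the cited sources: $k$ groups, $n_i$ observations in group $i$, $N$ observations in total, $x_{ij}$ the $j$-th observation in group $i$, $\bar{x}_i$ the mean of group $i$, and $\bar{x}$ the grand mean.

$$
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}\right)^2}_{SS_{\text{total}}}
=
\underbrace{\sum_{i=1}^{k} n_i\left(\bar{x}_i-\bar{x}\right)^2}_{SS_{\text{between}}}
+
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}_i\right)^2}_{SS_{\text{within}}}
$$

$$
F = \frac{SS_{\text{between}}/(k-1)}{SS_{\text{within}}/(N-k)}
$$

Under the null hypothesis of equal population means, both the numerator and the denominator estimate the same variance, so values of $F$ well above 1 suggest that the group means differ.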

Thus, ANOVA is a biostatistical method for determining whether a difference exists among the means of three or more independent populations. Expressed mathematically, for three groups it tests the null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3$. The one-way ANOVA parametric test results in either rejecting or failing to reject this null hypothesis. If we reject the null hypothesis, we can conclude that the population means are not all equal. We do not know, however, whether all the means differ from one another or only some of them do. This additional specificity is determined by conducting multiple comparison procedures, i.e. additional statistical tests.[2]
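As a rough illustration of how the test statistic for this hypothesis is obtained, the sketch below computes the one-way F statistic by hand using only the Python standard library. The function name `one_way_anova_f` and the sample data are hypothetical and chosen for illustration; converting F to a p-value requires the F-distribution, which is left to a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples.

    Illustrative helper, not taken from the cited sources.
    """
    k = len(groups)                              # number of groups
    n_total = sum(len(g) for g in groups)        # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up data: three groups of three observations each
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f_stat, df_b, df_w)
```

A large F relative to the F-distribution with `df_b` and `df_w` degrees of freedom leads to rejecting the null hypothesis of equal means.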

## ANOVA assumptions

- Cases are independent
- Distributions are normal
- Variance of data in groups is homogeneous
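The homogeneity assumption in the list above can be screened informally by comparing sample variances across groups. The sketch below is only a rough check, not a formal test (formal procedures such as Levene's test exist for this purpose); the helper name `variance_ratio` and the data are hypothetical.

```python
from statistics import variance


def variance_ratio(groups):
    """Ratio of the largest to the smallest sample variance (illustrative helper)."""
    variances = [variance(g) for g in groups]   # sample variance, n - 1 denominator
    return max(variances) / min(variances)


# Made-up data: the middle group is more spread out than the others
groups = [[1, 2, 3], [2, 4, 6], [5, 6, 7]]
ratio = variance_ratio(groups)
print(ratio)
```

A ratio close to 1 is consistent with homogeneous variances; a large ratio suggests that the assumption deserves a closer look before the ANOVA result is relied on.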

The one-way ANOVA test compares several groups of observations, all of which are independent but possibly with different group means. Two-way ANOVA studies the effects of two factors separately (their main effects) and together (their interaction effect).

## Statistical vocabulary

- The t-test is a powerful statistical test that can be used to test the difference between two means.
- The null hypothesis claims that there is no difference between the groups we are comparing.
- The object of our testing is to either reject or fail to reject the null hypothesis.
- The p-value is the probability of obtaining a result at least as extreme as the one actually observed, assuming the null hypothesis is true.
- A Type I error occurs when we reject a null hypothesis that is actually true.

## History

ANOVA was initially suggested by the British statistician Sir Ronald Aylmer Fisher in the 1920s. He coined the phrase "analysis of variance," defined as "the separation of variance ascribable to one group of causes from the variance ascribable to the other groups."[1]

Fisher was very interested in genetics. ANOVA uses Fisher's F-distribution as part of the test of statistical significance. Some of his famous papers include "On the mathematical foundations of theoretical statistics", published in the Philosophical Transactions of the Royal Society in 1922, and "Applications of Student's distribution", published in 1925.

## Purpose

It is possible to compare more than two means by running a t-test on each pair of groups, but doing so inflates the overall Type I error rate. ANOVA (analysis of variance) tests for differences among multiple means with a single test, without increasing the Type I error rate.

As the number of groups increases, the number of pairwise comparisons increases substantially (k groups give k(k − 1)/2 pairs) and the calculations become overwhelming very quickly. If we test enough pairs, some comparisons will appear statistically significant by chance alone, even when no real difference exists. ANOVA puts all the data into one F statistic and gives us one p-value with which to test the null hypothesis.
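The growth in pairwise comparisons, and the resulting inflation of the Type I error rate, can be made concrete with a short calculation. The sketch below assumes that the pairwise tests are independent and that each is run at a significance level of 0.05; both are simplifying assumptions made here for illustration, and the function names are hypothetical.

```python
from math import comb


def pairwise_comparisons(k):
    """Number of distinct pairs among k groups: k(k - 1)/2."""
    return comb(k, 2)


def familywise_error_rate(m, alpha=0.05):
    """Chance of at least one false positive in m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m


for k in (3, 5, 10):
    m = pairwise_comparisons(k)
    print(f"{k} groups: {m} pairwise tests, "
          f"familywise error rate ~ {familywise_error_rate(m):.3f}")
```

Under these assumptions, three groups already push the chance of at least one false positive to about 14%, and five groups (ten comparisons) to about 40%, which is the motivation for a single overall test.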

### Advantages

- robust design
- increases statistical power

In addition, a two-way ANOVA:

- looks at interaction between factors
- reduces random variability
- can look at the effect of the second factor after controlling for the first

### Disadvantages

- if the null hypothesis is rejected, we know that at least one group differs from the others, but a one-way ANOVA with multiple groups does not by itself show which group is different; multiple comparison procedures are needed for that
- the assumptions need to be fulfilled

## Examples

1. Rennie CA, Hannan S, Maycock N, Kang C. Age-related macular degeneration: what do patients find on the internet? J R Soc Med. 2007 Oct;100(10):473-7.

Internet sites were scored for technical information, quality, and SMOG (Simple Measure of Gobbledygook) readability, and the scores were analyzed using one-way ANOVA tests.

2. Petrovecki M, Rahelic D, Bilic-Zulle L, Jelec V. Factors influencing medical informatics examination grade--can biorhythm, astrological sign, seasonal aspect, or bad statistics predict outcome? Croat Med J. 2003 Feb;44(1):69-74.

This is an interesting study (though probably one with limited academic value). It looked at how "pseudoscientific variables" such as zodiac sign or biorhythm cycles affected a medical informatics exam grade.

A total of 382 second-year undergraduate students at the Rijeka University School of Medicine, from the 1996/97 to the 2000/01 academic year, were asked to fill out an anonymous questionnaire about their attitude toward learning medical informatics after taking a Medical Informatics exam.

The answer: general learning capacity and computer habits correlated with exam grades, but there was no correlation between grades and zodiac signs, biorhythms, students' sex, or the time of year when the exam was taken (so I guess my zodiac sign and the fact that I once lived in Finchley, London, the same place where R.A. Fisher was born, had nothing to do with my selection of this study). However, the authors also came up with this masterfully understated statement: "Inadequate statistical analysis can always confirm false conclusions".

## Principal use

One-way ANOVA is used when the researcher is comparing multiple groups (more than two) because it can control the overall Type I error rate.

Advantages:

- It provides the overall test of equality of group means
- It can control the overall Type I error rate (i.e. the rate of false positive findings)
- It is a parametric test so it is more powerful, if normality assumptions hold true

Shortcomings:

- Requires that the population distributions are normal
- It assumes equality of variances for each group

## Sources

- Landau S, Everitt BS. A Handbook of Statistical Analyses Using SPSS, Chapman & Hall/CRC, 2004.
- Pagano M, Gauvreau K. Principles of Biostatistics, 2nd Edition, Duxbury Press, Pacific Grove, CA, 2000.

## Related topics

- Evidence-based medicine (EBM)