Tests for differences in variances
All data must have been sampled on a random basis.
Data sets must be normally distributed.
These tests are adversely affected by non-normality and should not be used unless you can confidently show that your data does not depart from normality.
If you do not get a significant difference between your variances you cannot conclude that they are equal, only that no significant difference has been detected.
You may have wondered why the ANalysis Of VAriance is so named when what it compares are the means. Surely it should be called the ANOME? Both sound good, but ANOVA really does compare the variances in your data sets. This is another point at which the underlying requirements are very important. One of the requirements for ANOVA is that there should be no difference between the variances of the data sets. With that requirement met, the test compares the variance between the groups to the variance within the groups; if a difference is found, it can be concluded that the means of the data sets are different. It is the same with the equal-variance t-test.
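The point that ANOVA is a comparison of variances can be sketched in a few lines of Python. The data below are hypothetical, and SciPy's `f_oneway` is used only as a cross-check: the F statistic it reports is simply the between-group variance divided by the within-group variance.

```python
# Sketch: ANOVA's F statistic is a ratio of variances, not a direct
# comparison of means. Hypothetical data for three groups.
from scipy import stats

a = [4.1, 5.2, 3.9, 4.8]
b = [6.0, 5.5, 6.3, 5.8]
c = [4.9, 5.1, 5.0, 4.7]
groups = [a, b, c]

k = len(groups)                          # number of groups
n = sum(len(g) for g in groups)          # total number of observations
grand_mean = sum(sum(g) for g in groups) / n

# Between-group and within-group sums of squares
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

# F = between-group variance / within-group variance
f_manual = (ss_between / (k - 1)) / (ss_within / (n - k))
f_scipy, p_value = stats.f_oneway(a, b, c)
print(f"F = {f_manual:.3f} (scipy: {f_scipy:.3f}), P = {p_value:.4f}")
```

The hand calculation and `f_oneway` agree, which is the point: the test on the means is carried out entirely through variances.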
In addition to the term "equal variance" you may also have heard "homogeneity of variance" or "homoscedasticity". They are all synonymous. There are of course the opposites, "heterogeneity of variance" and "heteroscedasticity".
How are the variances compared? You could make a mental comparison, but the variance is a squared value and therefore difficult to assess using brainpower alone. Alternatively, and more reliably, you could use one of the three tests available, depending on which test you anticipate using to compare the means and which statistical package you use. They are explained below:
The variance ratio (the larger variance divided by the smaller) is easily calculated on a calculator or in Excel. It is the most basic of these checks and can be good as a ready reckoner, but one of the other tests should be used for presenting results.
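As a minimal sketch with hypothetical data, the ready reckoner is just the larger sample variance divided by the smaller; a ratio close to 1 suggests the variances are similar, while a large ratio is a warning sign worth following up with a formal test.

```python
# Quick "ready reckoner": larger sample variance / smaller sample variance.
# Hypothetical data sets; values close to 1 suggest similar variances.
from statistics import variance

sample_1 = [12.1, 11.8, 12.5, 12.0, 11.9]
sample_2 = [12.3, 11.5, 13.0, 11.2, 12.6]

s2_1, s2_2 = variance(sample_1), variance(sample_2)
ratio = max(s2_1, s2_2) / min(s2_1, s2_2)
print(f"variance ratio = {ratio:.2f}")
```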
The F-test makes a statistical comparison between the variances of two data sets. Although it can be applied to more than two data sets by taking the highest and lowest variances, one of the homogeneity of variance tests explained below would be more appropriate. The hypotheses for the F-test are:
H0: there is no difference between the two variances
HA: the larger variance (s₁²) is significantly greater than the smaller variance (s₂²)
Notice that the HA is directional, or one-tailed.
The data can be in columns or rows in Excel, but only in columns in the statistical packages. Be aware that Excel requires the data set with the larger s² to be entered first. The F-test will provide you with two or three values: the calculated F-value (1.106), the critical F-value (1.822) and the P-value (0.39). Output from Excel is shown below.
If the calculated value is greater than the critical value, reject the H0 at the chosen significance level. If this is the case at the 0.05 level, then look at the P-value to see the exact probability. The above example shows no significant difference between the variances, and so the data can be used for parametric statistics.
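The same decision rule can be sketched with SciPy's F distribution. The data below are hypothetical; the three quantities computed mirror the calculated F-value, critical F-value and one-tailed P-value described above, with the larger-variance data set taken first as Excel requires.

```python
# Sketch of the one-tailed F-test decision rule (hypothetical data).
from statistics import variance
from scipy import stats

sample_1 = [23.1, 21.8, 24.5, 22.0, 23.9, 22.7]   # data set with larger s2 first
sample_2 = [22.3, 22.9, 23.0, 22.2, 22.6, 23.2]

s2_large, s2_small = variance(sample_1), variance(sample_2)
df1 = len(sample_1) - 1        # degrees of freedom, numerator
df2 = len(sample_2) - 1        # degrees of freedom, denominator

f_calc = s2_large / s2_small                  # calculated F-value
f_crit = stats.f.ppf(0.95, df1, df2)          # critical F-value at the 0.05 level
p_value = stats.f.sf(f_calc, df1, df2)        # one-tailed P-value

# Reject H0 only if the calculated F exceeds the critical F
reject_h0 = f_calc > f_crit
print(f"F = {f_calc:.3f}, critical F = {f_crit:.3f}, P = {p_value:.4f}")
```

In this made-up example the calculated F exceeds the critical F, so H0 would be rejected and the variances treated as unequal.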
Where more than two variances are being compared, as is most often the case with ANOVA, one of these tests should be employed. The most commonly quoted is the Bartlett-Box. Excel does not support these tests, but they are found in SPSS, Minitab and UNISTAT. You will have to arrange the data in stacked columns, as you do with ANOVA. The hypotheses will be similar to those of the F-test:
H0: there are no differences between the two or more variances
HA: there are differences between the two or more variances
Output from the test will look something like this:
The one value to look out for is the significance value (0.0053 for the Bartlett-Box F-test). If this value is less than 0.05, reject the H0 and accept the HA. If you can find a reference to the other tests and have a good reason to use them, please do.
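For a minimal sketch of the same comparison, SciPy provides `scipy.stats.bartlett`, which takes the groups as separate sequences rather than the stacked columns the packages above expect. The data are hypothetical; like the other tests in this section, Bartlett's test is sensitive to non-normality (SciPy also offers the more robust `scipy.stats.levene`).

```python
# Bartlett's test across three hypothetical groups; the significance
# value is the one to look out for, just as in the package output above.
from scipy import stats

group_a = [5.1, 4.9, 5.3, 5.0, 4.8]
group_b = [5.6, 4.2, 6.1, 3.9, 5.9]   # deliberately more variable
group_c = [5.0, 5.2, 4.9, 5.1, 5.0]

stat, p_value = stats.bartlett(group_a, group_b, group_c)
if p_value < 0.05:
    print("reject H0: the variances differ")
else:
    print("no significant difference between the variances")
```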
NOTE: Do remember that all the above tests are extremely sensitive to non-normality; if in doubt, do not rely on the results. However, this is offset somewhat by the robustness of ANOVA and t-tests.