## On-line statistics

Z-tests and t-tests

Data types that can be analysed with z-tests

data points should be independent from each other

z-test is preferable when n is greater than 30.

the distributions should be normal if n is low; if n>30, however, the distribution of the data does not have to be normal

the variances of the samples should be the same (F-test)

all individuals must be selected at random from the population

all individuals must have equal chance of being selected

sample sizes should be as equal as possible but some differences are allowed

Data types that can be analysed with t-tests

data sets should be independent from each other except in the case of the paired-sample t-test

where n<30 the t-tests should be used

the distributions should be normal for the equal and unequal variance t-tests (K-S test or Shapiro-Wilk test)

the variances of the samples should be the same (F-test) for the equal variance t-test

all individuals must be selected at random from the population

all individuals must have equal chance of being selected

sample sizes should be as equal as possible but some differences are allowed
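The equal-variance requirement above can be checked with a variance ratio. The following is a minimal sketch, assuming the usual form of the F-test in which the larger sample variance is divided by the smaller; the function name and the data are hypothetical, and the resulting F value still has to be compared with a critical value from F tables or from your statistics package.

```python
from statistics import variance


def variance_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator), larger variance on top."""
    var_a = variance(sample_a)  # sample variance (n - 1 in the denominator)
    var_b = variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical measurements, for illustration only.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]

f_ratio, df_num, df_den = variance_ratio(group_a, group_b)
print(f_ratio, df_num, df_den)
```

If the ratio is larger than the critical F value, the variances differ and the unequal variance t-test is the safer choice.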

Limitations of the tests

if you do not find a significant difference in your data, you cannot say that the samples are the same

Introduction to the z and t-tests

The z-test and t-test are basically the same: each compares two means to suggest whether both samples come from the same population. There are, however, variations on the theme for the t-test. If you have a sample and wish to compare it with a known mean (e.g. a national average), the single-sample t-test is available. If your two samples are not independent of each other and have some factor in common, e.g. geographical location or before/after treatment, the paired-sample t-test can be applied. There are also two variations on the two-sample t-test: the first uses samples that do not have equal variances and the second uses samples whose variances are equal.
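To make the single-sample and paired-sample variations concrete, here is a minimal sketch using the textbook formulas. The function names and the data are hypothetical and are not taken from any package's output. The point it illustrates is that the paired test is simply a single-sample test applied to the differences within each pair.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, known_mean):
    """t statistic for a sample mean against a known mean (df = n - 1)."""
    standard_error = stdev(sample) / sqrt(len(sample))
    return (mean(sample) - known_mean) / standard_error


def paired_t(before, after):
    """Paired-sample t: a single-sample test on the pairwise differences."""
    differences = [a - b for a, b in zip(after, before)]
    return one_sample_t(differences, 0.0)


# Hypothetical data, for illustration only.
t_single = one_sample_t([1.0, 2.0, 3.0, 4.0, 5.0], known_mean=2.0)
t_paired = paired_t(before=[1.0, 2.0, 3.0, 4.0], after=[2.0, 4.0, 5.0, 8.0])
print(t_single, t_paired)
```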

It is well publicised that female students are currently doing better than male students! Could this be due to differences in brain size? To assess differences between a set of male students' brains and a set of female students' brains, a z-test or t-test could be used. This is an important issue (as I'm sure you'll realise, lads) and we should use a substantial number of measurements. Several universities and colleges are visited and a set of male brain volumes and a set of female brain volumes are gathered (I leave it to your imagination how the brain sizes are obtained!).

Hypotheses

Data arrangement

Excel can apply the z-test or t-tests to data arranged in rows or in columns, but statistical packages nearly always require the data in columns, with the two samples placed side by side.
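If your data happen to be stored in rows, they can be turned into side-by-side columns before being passed to a package. The sketch below assumes each row starts with a label followed by the measurements; the helper name and the values are hypothetical.

```python
from itertools import zip_longest


def rows_to_columns(rows):
    """Transpose samples stored as rows into side-by-side columns.

    Shorter samples are padded with empty strings so the columns line up.
    """
    return [list(column) for column in zip_longest(*rows, fillvalue="")]


# Hypothetical samples of unequal size, one sample per row.
samples_in_rows = [
    ["sample A", 1.2, 1.4, 1.1],
    ["sample B", 0.9, 1.0],
]

columns = rows_to_columns(samples_in_rows)
for line in columns:
    print(line)
```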

Results and interpretation

Degrees of freedom:

For the z-test degrees of freedom are not required, since the critical z-scores are fixed: 1.96 and 2.58 for the 5% and 1% significance levels respectively (two-tailed).
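The values 1.96 and 2.58 come straight from the standard normal distribution, which is why no degrees of freedom are needed. The following is a minimal sketch using Python's standard library; the function names are hypothetical, and the two-sample z statistic uses the common large-sample form with the sample variances standing in for the population variances, which is an assumption rather than something stated on this page.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def two_tailed_z_critical(alpha):
    """Critical z-score for a two-tailed test at significance level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def two_sample_z(sample_a, sample_b):
    """Large-sample z statistic for the difference between two means."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


z_5_percent = two_tailed_z_critical(0.05)
z_1_percent = two_tailed_z_critical(0.01)
print(round(z_5_percent, 2), round(z_1_percent, 2))
```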

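For the equal variance t-test = (n1 + n2) - 2

For the unequal variance t-test the degrees of freedom are approximated from the sample variances (the Welch-Satterthwaite formula) and are never greater than (n1 + n2) - 2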

For paired sample t-test = number of pairs - 1
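The degrees of freedom can be worked out by hand as a check on the output. The following is a minimal sketch with hypothetical function names. Note that for the unequal variance test many packages report the Welch-Satterthwaite approximation, which depends on the sample variances and is never larger than (n1 + n2) - 2; it is included here for comparison.

```python
def two_sample_df(n1, n2):
    """Degrees of freedom for the two-sample t-test: (n1 + n2) - 2."""
    return n1 + n2 - 2


def paired_df(number_of_pairs):
    """Degrees of freedom for the paired-sample t-test: pairs - 1."""
    return number_of_pairs - 1


def welch_df(var1, n1, var2, n2):
    """Welch-Satterthwaite approximation for unequal variances."""
    a = var1 / n1
    b = var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Two samples of 32 observations each give 62 degrees of freedom.
print(two_sample_df(32, 32))
print(paired_df(32))
# With equal variances and equal n, the approximation also gives 62.
print(welch_df(1.0, 32, 1.0, 32))
```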

The output from the z-test and the t-tests is always similar, and there are several values you need to look for:

You can check that the program has used the right data by making sure that the means (1.81 and 1.66 for the t-test), the number of observations (32, 32) and the degrees of freedom (62) are correct. The information you then need in order to reject or accept your H0 is in the bottom five values. The t Stat value is the value calculated from your data. It must be compared with one of the two t Critical values, depending on whether you have decided on a one-tailed or a two-tailed test (do not confuse these terms with the one-way or two-way ANOVA). If the calculated value exceeds the critical value, the H0 must be rejected at the significance level you selected before the test was executed. Here both the one-tailed and two-tailed results confirm that the H0 must be rejected and the HA accepted.
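The decision rule can be written out as a short sketch. The function name and the t Stat value below are hypothetical; the critical values of 2.00 (two-tailed) and 1.67 (one-tailed) are rounded table values for 62 degrees of freedom at the 5% level, and the one-tailed branch assumes the HA predicts a positive t Stat.

```python
def reject_null(t_stat, t_critical, two_tailed=True):
    """Return True when H0 should be rejected.

    t_critical is the positive critical value from the output or a table.
    """
    if two_tailed:
        # A difference in either direction counts, so ignore the sign.
        return abs(t_stat) > t_critical
    # One-tailed: this sketch assumes HA predicts a positive t Stat.
    return t_stat > t_critical


# Hypothetical calculated value, for illustration only.
t_stat = 7.0
print(reject_null(t_stat, 2.00, two_tailed=True))
print(reject_null(t_stat, 1.67, two_tailed=False))
```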

We can also use the P(T<=t) values to find the precise probability, rather than relying on the level specified beforehand. For the results of the t-test above, the probability of the difference occurring by chance for the one-tail test is 2.3 × 10⁻⁹% (from 2.3E-11 x 100). All the above P-values denote very highly significant differences.
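The arithmetic behind the percentage is simply the P-value multiplied by 100. The following is a minimal sketch with hypothetical function names; the 5% default for the significance level is an assumption, so substitute whichever level you chose before running the test.

```python
def p_as_percent(p_value):
    """Express a probability as a percentage."""
    return p_value * 100


def is_significant(p_value, alpha=0.05):
    """True when the P-value falls below the chosen significance level."""
    return p_value < alpha


p_one_tail = 2.3e-11  # the one-tail P(T<=t) value quoted above

print(p_as_percent(p_one_tail))
print(is_significant(p_one_tail))
print(is_significant(p_one_tail, alpha=0.01))
```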

Graphical output


Ted Gaten  Department of Biology  gat@le.ac.uk
Entry approved by the Head of Department. Last Updated: May 2000