
**Data transformations**

One advantage of parametric statistics is that they make your data much easier to describe. If you have established that a set of measurements follows a normal distribution, it can be properly described by its mean and standard deviation. If your data are not normally distributed, you cannot use any of the tests that assume they are (e.g. ANOVA, t test, regression analysis). However, it is often possible to normalise such data by transforming them.
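As a rough illustration of what "normalising" means in practice, the sketch below compares the skewness of some right-skewed values before and after a log transform. The data are made up, the helper name `sample_skewness` is hypothetical, and skewness is only an informal indicator of asymmetry, not a formal test of normality.

```python
import math
import statistics


def sample_skewness(values):
    """Moment-based skewness: near 0 for symmetric data, > 0 for a long right tail."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Made-up, right-skewed measurements (illustrative only).
raw = [1.2, 1.5, 1.9, 2.4, 3.1, 4.0, 5.6, 8.3, 13.0, 27.5]
logged = [math.log10(v) for v in raw]

print(f"skewness, raw data:        {sample_skewness(raw):.2f}")
print(f"skewness, log10 transform: {sample_skewness(logged):.2f}")
```

For these values the skewness is smaller after the transform, which is the sense in which the transformed data are closer to normal. A proper normality test from a statistics package would still be needed before relying on a parametric test.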

Transforming data to allow you to use parametric statistics is completely legitimate. People often feel uncomfortable transforming data because it seems to improve their results artificially, but this is only because they feel happiest with linear or arithmetic scales. There is no reason not to use other scales (e.g. logarithms, square roots, reciprocals or angles) where appropriate (Sokal & Rohlf, 1995; see pages 411-422).

Different transformations work for different data types:

**Logarithms:** Growth rates are often exponential and log transforms will often normalise them. Log transforms are particularly appropriate if the variance increases with the mean.

**Reciprocal:** If a log transform does not normalise your data you could try a reciprocal (1/x) transformation. This is often used for enzyme reaction rate data (see Fowler & Cohen, 1990).

**Square root:** This transform is often of value when the data are counts, e.g. blood cells on a haemocytometer or woodlice in a garden. A square root transform brings data with a Poisson distribution closer to a normal distribution and makes the variance roughly independent of the mean.

**Arcsine:** This transformation (the arcsine of the square root of the proportion) is also known as the angular transformation and is especially useful for percentages and proportions.
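The four transformations above can be sketched in a few lines of Python. The function names are hypothetical, the choice of base 10 for the logarithm and of degrees for the angular transform are assumptions made for the example, and the input values are made up.

```python
import math


def log_transform(x):
    """Base-10 logarithm; only defined for x > 0."""
    return math.log10(x)


def reciprocal_transform(x):
    """Reciprocal 1/x; only defined for x != 0."""
    return 1.0 / x


def sqrt_transform(count):
    """Square root, typically applied to counts (count >= 0)."""
    return math.sqrt(count)


def arcsine_transform(p):
    """Angular transform asin(sqrt(p)), in degrees, for a proportion 0 <= p <= 1."""
    return math.degrees(math.asin(math.sqrt(p)))


# Made-up example values.
growth = [10.0, 100.0, 1000.0]
counts = [0, 4, 9, 16]
percentages = [5.0, 50.0, 95.0]

print([log_transform(x) for x in growth])
print([sqrt_transform(c) for c in counts])
# Percentages must be converted to proportions (divide by 100) first.
print([round(arcsine_transform(pct / 100.0), 2) for pct in percentages])
```

Note that the log and reciprocal transforms cannot be applied directly to zero values; how to handle zeros is a separate decision that this sketch does not address.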

To present a mean on the original (linear) scale, back-transform the mean of the transformed data. The standard deviation of the transformed data cannot usefully be converted back in the same way; instead, compute confidence limits on the transformed scale and then convert these to the linear scale.
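A minimal sketch of this back-conversion for log-transformed data follows. The function name `back_transformed_mean_ci` and the data are hypothetical, a base-10 log is assumed, and the critical value of t is passed in by hand (2.262 is the two-tailed 5% value for 9 degrees of freedom) because the Python standard library has no t distribution.

```python
import math
import statistics


def back_transformed_mean_ci(values, t_crit):
    """Return (lower, mean, upper) on the linear scale via a log10 transform.

    t_crit is the two-tailed critical value of t for len(values) - 1
    degrees of freedom, taken from tables or a statistics package.
    """
    logs = [math.log10(v) for v in values]
    n = len(logs)
    mean_log = statistics.fmean(logs)
    se_log = statistics.stdev(logs) / math.sqrt(n)  # standard error on the log scale
    lower_log = mean_log - t_crit * se_log
    upper_log = mean_log + t_crit * se_log
    # Convert the mean and both limits back to the linear scale.
    return 10 ** lower_log, 10 ** mean_log, 10 ** upper_log


# Made-up measurements, n = 10, so 9 degrees of freedom.
data = [1.2, 1.5, 1.9, 2.4, 3.1, 4.0, 5.6, 8.3, 13.0, 27.5]
lower, mean, upper = back_transformed_mean_ci(data, t_crit=2.262)
print(f"back-transformed mean {mean:.2f}, 95% limits {lower:.2f} to {upper:.2f}")
```

For a log transform the back-transformed mean is the geometric mean of the original values, and the converted confidence limits are not symmetrical about it, which is why a single standard deviation is not reported on the linear scale.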

The product moment correlation coefficient as generated by most statistics packages can be artificially affected by transformation of data. Care should be taken in this situation to make sure that the particular correlation coefficient you use is robust. See, for example, Kvalseth, T.O. (1985). Cautionary note about r squared. *Amer. Stat.*, **39(4):** 279-285, and Scott, A. and Wild, C. (1991). Transformations and r-squared. *Amer. Stat.*, **45(2):** 127-128.
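One simple way to see the effect is to compute the coefficient on both scales. The sketch below is only an illustration with made-up, exactly exponential data and a hypothetical helper `pearson_r`; it does not reproduce the analyses in the papers cited above.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Product moment correlation coefficient of two equal-length sequences."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: y doubles with every unit increase in x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.0 ** xi for xi in x]
log_y = [math.log10(yi) for yi in y]

print(f"r on the linear scale: {pearson_r(x, y):.3f}")
print(f"r after log transform: {pearson_r(x, log_y):.3f}")
```

The two coefficients differ even though they come from the same observations, so a value of r (or r squared) obtained on the transformed scale should not be quoted as if it described the fit on the linear scale.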


Ted Gaten Department of Biology gat@le.ac.uk Entry approved by the Head of Department. Last Updated: May 2000