

Chapter 8

8. Design and analysis of experiments

This chapter was contributed by Andrew Speedy, University of Oxford, UK. The objective is to assist researchers to compile and analyze data. To this end, one of the simpler statistics programs (MINITAB, Minitab Inc., Philadelphia, USA) is used as the model. More powerful statistical packages may be required for studies in plant and animal genetics and agricultural economics. However, in line with the general philosophy of this manual, simplicity and ease of understanding are considered the principal attributes required of a computer program; in this respect Minitab has much to commend it and is therefore selected as the example. Various other, often more sophisticated, statistical software packages are available on the market.

THE OVERALL APPROACH

The objectives

The most important aspect of conducting good research is the definition of the objective(s). No matter how good the design of the experiment, how sophisticated the methods used or how clever the statistical treatment of results, the work is of little value if it does not answer a question of scientific importance and practical relevance. Studying the literature, thinking about the questions and discussing them with colleagues, and especially with the farmers who will ultimately apply the technology, is the most important part of planning a research programme. Research must be oriented to solving farmers' problems.

The methodology

Once the objectives are clear, the methodology can be considered. This should be planned to provide the data to answer the questions raised and to satisfy the needs of the researcher and also others who may wish to adopt the findings and apply them in other situations. It must also be possible within the confines of the resources available (land, animals, buildings, pens, laboratory equipment, etc.). Some of these problems (such as numbers of replicates and land resources) may be overcome by conducting the research ‘on-farm’, which also has important implications for short-cutting the process of research application or technology transfer.

Analysis of the data

When the data are finally collected, they must be analyzed in a way that will provide meaningful conclusions.

Planning the analysis of the data is part of the initial process of setting up the research programme. Knowing how the data can be correctly analyzed and interpreted will affect how the data are collected and the numbers of observations required. It is often valuable to produce a ‘dummy’ set of data, calculated on the computer, to test the statistical method.
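A ‘dummy’ data set of this kind might be generated as in the following Python sketch. The treatment means, the common standard deviation and the number of replicates are hypothetical planning values, not results from this chapter; the resulting table can then be passed to whatever analysis is planned for the real data.

```python
import random
import statistics

random.seed(1)  # fixed seed so the dummy data can be regenerated

# Hypothetical planning values (assumptions, in arbitrary units):
ASSUMED_MEANS = {"T1": 500.0, "T2": 575.0, "T3": 650.0}
ASSUMED_SD = 60.0   # common standard deviation assumed for every treatment
REPLICATES = 10     # experimental units per treatment

# Draw normally distributed dummy observations for each treatment.
dummy = {
    treat: [random.gauss(mean, ASSUMED_SD) for _ in range(REPLICATES)]
    for treat, mean in ASSUMED_MEANS.items()
}

# Summarize the dummy data in the same way the real data will be summarized.
for treat, values in dummy.items():
    print(treat, len(values),
          round(statistics.mean(values), 1),
          round(statistics.stdev(values), 1))
```

Running the intended analysis on such data before the experiment starts shows whether the planned layout of columns and the chosen model actually work together.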

The following section describes the rules and basic methods for planning, analyzing and interpreting data relating to feed resources and their use by animals.

PLANNING, ANALYZING AND INTERPRETING DATA

Statistical programs

There are many computer packages available for statistical analysis. Throughout this chapter, examples will be given from data analyzed using the package MINITAB, which is available for IBM-compatible, Apple Macintosh and mainframe computers. The necessary inputs and outputs for this package will be shown. It is taken as an example of a simple yet accurate system for the research worker as well as the student.

Management of experimental data

Collection of data on a daily or weekly basis will yield results that must be used to calculate the variables required for analysis: average daily gain (kg) for each animal, average daily food intake, etc. Such initial calculations (although they may be managed with MINITAB) are best stored and manipulated with a spreadsheet package such as LOTUS 1-2-3. These data can be read into the MINITAB worksheet by, e.g.

RETRIEVE ‘WEIGHTS.WK1’;
LOTUS.
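The initial calculation of average daily gain from periodic weighings can be sketched as follows. This is a minimal Python illustration rather than a spreadsheet; the animal names, the weights and the 7-day weighing interval are hypothetical.

```python
# Hypothetical weekly live weights (kg) for each animal; illustrative values only.
weekly_weights = {
    "animal_1": [20.0, 23.5, 27.0, 30.5],
    "animal_2": [18.0, 20.8, 23.6, 26.4],
}
DAYS_BETWEEN_WEIGHINGS = 7  # assumed weighing interval


def average_daily_gain(weights, days_between=DAYS_BETWEEN_WEIGHINGS):
    """Gain from first to last weighing divided by the days elapsed."""
    days = days_between * (len(weights) - 1)
    return (weights[-1] - weights[0]) / days


# One derived value per animal: this is the variable that goes into the analysis.
adg = {animal: average_daily_gain(w) for animal, w in weekly_weights.items()}
for animal, gain in adg.items():
    print(f"{animal}: {gain:.3f} kg/day")
```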

Types of data

Many of the measurements made in this type of work will be of the kind that are called ‘continuous variables’: weight, food intake, blood levels, etc. The pattern of variation of such variables conforms to the ‘normal distribution’. These can be analyzed by a range of tools called parametric statistics, including regression analysis and analysis of variance.

Certain variables of the type ‘success/failure’, ‘germinated/not germinated’, ‘conceived/not conceived’ are ‘discontinuous variables’ and the variation conforms to the binomial distribution. Also amongst this type may be records of the type ‘class 1, class 2 or class 3’, where a measurement 1.1 or 1.2 is not possible. These conform to the Poisson distribution. These data cannot be analyzed by techniques like analysis of variance but require ‘non-parametric statistics’. However, in many cases, such data can be ‘transformed’ using mathematical devices (e.g. logarithm, square root, etc.) to make them conform to a normal distribution. Percentage data should also be transformed.

Types of analysis

Although ANOVAR and regression are well-known techniques, both may be analyzed using a newer method called the Generalized Linear Model (GLM) which has a number of advantages. It fits a ‘model’ to the data and predicts the means and variance from the model. Equivalent examples of MINITAB instructions are:

Regression or ANOVA command     Equivalent GLM command

REGRESS C1 1 C2                 GLM C1=C2;
                                 COVARIATE C2.
REGRESS C1 2 C2 C3              GLM C1=C2+C3;
                                 COVARIATE C2 C3.
ANOVA C1=C5                     GLM C1=C5
ANOVA C1=C5+C6                  GLM C1=C5+C6
No equivalent                   GLM C1=C2+C5;
                                 COVARIATE C2.
No equivalent                   GLM C1=C2+C3+C5+C6;
                                 COVARIATE C2 C3.

Types of variables

From the above, it is clear that some variables are suited to ‘regression analysis’ and some to ANOVA. The former are called continuous variables and the latter are discrete variables. Levels of fertilizer, levels of feed protein, etc. are continuous variables and can be analyzed in GLM with the ‘COVAR’ subcommand. Discrete variables like variety of crop, breed of livestock, etc. are analyzed with ANOVA or the equivalent GLM command.

Number of treatments

The number of treatments which can be applied may depend on what is available and the amount of experimental resources. However, with continuous variables, it is always better to have more levels of a factor where possible. For example, with 30 experimental units, more information on the type of response to a factor will be obtained with 5 levels and 6 replicates than with 3 levels and 10 replicates. Very little is lost in precision whereas much is gained in knowledge about the shape of the response (linear, quadratic or cubic). Thus we can find the maximum or optimum response level of a factor.

Numbers of replicates

An experiment uses a sample of a population as the experimental unit. In general, the more replicates that are used, the smaller the difference that can be detected. However, experimental facilities are always limited and therefore it is important to be economical with the use of resources. The overriding rule is never to have fewer than 3 replicates per (sub-) treatment.

A more precise estimate of the numbers required to detect the desired percentage difference with a t-test is given by the formula:

where t = Student's t (at given treatment and error degrees of freedom); CV = coefficient of variation; and r = number of replicates.
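A commonly used relationship between these quantities, assumed here because it is consistent with the symbols just defined, is d = t × CV × √(2/r), where d is the smallest detectable difference expressed as a percentage of the mean. The Python sketch below evaluates it in both directions; the function names are hypothetical, and t is taken as approximately 2 rather than looked up in tables.

```python
import math


def detectable_difference(t, cv, r):
    """Smallest detectable difference (% of mean), assuming d = t * CV * sqrt(2 / r)."""
    return t * cv * math.sqrt(2.0 / r)


def replicates_needed(d, cv, t=2.0):
    """Replicates per treatment needed to detect a difference of d %, same assumption."""
    return math.ceil(2.0 * (t * cv / d) ** 2)


# With CV = 10 % and 10 replicates, differences of about 9 % can be detected.
print(round(detectable_difference(2.0, 10.0, 10), 2))

# To detect a 10 % difference when CV = 10 %, about 8 replicates are needed.
print(replicates_needed(10.0, 10.0))
```

Because t itself depends on the error degrees of freedom, the result should be treated as a first estimate and rechecked with the tabulated t for the design finally chosen.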

The appropriate CV can be found from comparable experiments in published articles. It may vary from 3 to 25% in this type of work.

The actual size of the experiment will vary with both the number of replicates and the number of treatments. Fewer replicates are needed in factorial experiments where the overall total is greater. Again, as a general rule, ensure that the design has at least 15 degrees of freedom for error (residual degrees of freedom).
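The rule of 15 error degrees of freedom can be checked before the experiment starts. The sketch below assumes a design in which only main effects (treatments, blocks, rows, columns) are fitted, so that the error degrees of freedom are the total degrees of freedom minus (levels - 1) for each factor; the function name is hypothetical. The three calls reproduce the error degrees of freedom of analyses shown later in this chapter.

```python
def error_df(n_units, *factor_levels):
    """Residual df = (units - 1) minus (levels - 1) for each main-effect factor."""
    return (n_units - 1) - sum(levels - 1 for levels in factor_levels)


MINIMUM_ERROR_DF = 15  # general rule quoted in the text

designs = {
    "3 treatments x 10 replicates": error_df(30, 3),
    "3 blocks, 3 treatments, 36 units": error_df(36, 3, 3),
    "4 x 4 Latin square": error_df(16, 4, 4, 4),
}

for name, df in designs.items():
    verdict = "adequate" if df >= MINIMUM_ERROR_DF else "below the general rule"
    print(f"{name}: error df = {df} ({verdict})")
```

A single 4 × 4 Latin square falls short of the rule, which is one reason larger or repeated squares are often preferred.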

Blocks

Blocking is a way to deal with known sources of variation which may be sites on a gradient of fertility down a slope, different litters of pigs, different farms, etc. Each block contains all treatments with replicates. The analysis enables the variable ‘block’ to be measured and removed from the error variation, eg:

GLM output = block + treatment

It is good to block experiments wherever a known source of variation occurs. There is little point in including an interaction between treatment and block because this will be difficult to interpret even if it is significant.

Covariates

Inclusion of covariates in an analysis is another way of taking out known variation. Covariates are continuous variables such as initial weight, initial milk yield, etc. Their use is vital in experiments involving dairy production where it is normal for animals of different ages, stages of lactation and potential yield to be used. The command is:

GLM yield = initial + treatment;
covariate initial.

ANALYSIS OF CONTINUOUS DATA

Experiments with two treatments

The simplest experiment compares the results of two treatments. We may wish to compare two or more populations (breeds of animal or varieties of plant) and take samples from each. Our samples must be taken at random and must represent the populations and their variation.

Experiments often involve applying some action or actions to a sample of the population to measure its effect. The sample of the population is divided and the treatment(s) applied to part(s) of the sample. If we want to know how the treated sample differs from the untreated one, we need to keep the untreated ones as a ‘control’. Treatments must be applied at random.

When the data have been collected, we want to analyze the results to compare the two (or more) samples or the treated groups with the control. This is done by calculating the variance and partitioning it between that due to treatment and the natural (‘residual’ or ‘error’) variance. The process is called an ‘analysis of variance’.

An example involves two treatments (or a treated group and control) with 10 replicates of each. The means of the two treatments are 10 and 11.

MTB > PRINT C1-C2

ROW     C1     C2
  1   11.5    9.7
  2   11.6   11.6
  3   10.5   10.9
  4   10.1   10.8
  5   10.2    9.6
  6    9.0   10.7
  7    9.1   12.5
  8    8.5   11.2
  9   10.2   12.6
 10    9.3   11.5

MTB > TWOSAMPLE C2 C1;
SUBC> POOLED.

TWOSAMPLE T FOR C2 VS C1

      N    MEAN   STDEV   SE MEAN
C2   10   11.11    1.01      0.32
C1   10   10.00    1.04      0.33

95 PCT CI FOR MU C2 - MU C1: (0.15, 2.07)
TTEST MU C2 = MU C1 (VS NE): T= 2.43 P=0.026 DF= 18
POOLED STDEV = 1.02

Explanation:

The data consist of two sets of values (two treatments) stored in C1 and C2. These are listed with the MINITAB command ‘PRINT’. Then the data are compared using a ‘t-test’ with the command ‘TWOSAMPLE’. The printout shows the means, standard deviations and standard errors of the means and the calculated t value. The probability value of 0.026 is less than 0.05 and therefore the null hypothesis that C2 is NOT different from C1 is rejected, i.e. C2 is significantly greater than C1 (P<0.05).
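The pooled t-test in the printout can be retraced by hand. The following Python sketch uses the ten values of C1 and C2 listed above and only the standard library; the function name is hypothetical, and the P value is not computed but replaced by a comparison with the tabulated 5% value of t for 18 degrees of freedom (2.101).

```python
import math
import statistics

c1 = [11.5, 11.6, 10.5, 10.1, 10.2, 9.0, 9.1, 8.5, 10.2, 9.3]
c2 = [9.7, 11.6, 10.9, 10.8, 9.6, 10.7, 12.5, 11.2, 12.6, 11.5]


def pooled_t(a, b):
    """Two-sample t statistic with a pooled estimate of the variance."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / se_diff
    return t, df, math.sqrt(pooled_var)


t_value, df, pooled_sd = pooled_t(c2, c1)
T_CRIT_18DF = 2.101  # tabulated two-sided 5 % point of Student's t, 18 df

print(f"t = {t_value:.2f}, df = {df}, pooled SD = {pooled_sd:.2f}")
print("significant at P<0.05" if abs(t_value) > T_CRIT_18DF else "not significant")
```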

Relationships between variables

In some types of data, the objective is to test the relationship between two variables and to produce an equation which describes this relationship. This is frequently done by regression analysis. In the example here, the alternative ‘GLM’ command (Generalized Linear Model) is used to perform the regression analysis. The example is to test the relationship between OIL and ENERGY in feed samples:

MTB > brief 3
MTB > glm Energy=Oil;
SUBC> covariate Oil.

Analysis of Variance for Energy

Source   DF    Seq SS    Adj SS    Adj MS      F      P
Oil       1   0.26167   0.26167   0.26167   7.94  0.011
Error    18   0.59334   0.59334   0.03296
Total    19   0.85501


Term         Coeff     Sdev   t-value      P
Constant   10.9324   0.9046     12.09  0.000
Oil         0.5116   0.1816      2.82  0.011

Unusual Observations for Energy

Obs.    Energy       Fit   Sdev.Fit   Residual   Sd.Resid
   5   13.9461   13.4981     0.0412     0.4480     2.53R
  17   13.6340   13.7359     0.1000    -0.1020    -0.67 X

R denotes an obs. with a large Sd.Resid.
X denotes an obs. whose X value gives it large influence.

Explanation:

The two variables are stored in columns C1-C2 and labelled Energy and Oil. The GLM model to test is C1=C2 and the subcommand COVARIATE C2 (abbreviated to ‘cova C2’) tells MINITAB to treat C2 as a continuous variable and not a discrete series of treatments. The probability value (P=0.011) tells us that there IS a significant relationship between Energy and Oil (P<0.05). Badly fitting data are also indicated. The constant and coefficient of the regression equation are given and the equation can be derived as:

Energy = 10.9324 (± 0.9046) + 0.5116 (± 0.1816) Oil

When more than two variables are involved, these may be included in the model to give a multiple regression analysis. Only significant factors should be included in the equation. The COEFFICIENT OF DETERMINATION (r2) is found by dividing the SSx by the SStotal. In this case:

r2 = 0.26167/0.85501 = 0.306 (30.6%)

(Equations with r2 less than 70% should not be used for prediction).
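The coefficient of determination and the F ratio can both be recomputed from the sums of squares in the analysis of variance table. The Python sketch below uses the printed values for the Energy and Oil example; `predict_energy` is a hypothetical helper that applies the fitted equation.

```python
# Sums of squares and degrees of freedom copied from the printout above.
ss_oil, ss_error, ss_total = 0.26167, 0.59334, 0.85501
df_oil, df_error = 1, 18

r_squared = ss_oil / ss_total                 # proportion of variation explained
ms_error = ss_error / df_error                # error mean square
f_ratio = (ss_oil / df_oil) / ms_error        # F for the regression


def predict_energy(oil):
    """Fitted equation from the printout: Energy = 10.9324 + 0.5116 * Oil."""
    return 10.9324 + 0.5116 * oil


print(f"r2 = {r_squared:.3f}, F = {f_ratio:.2f}")
# With r2 of about 31 %, the equation falls below the 70 % guideline for prediction.
print("usable for prediction" if r_squared >= 0.70 else "too weak for prediction")
```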

Experiments with more than two treatments

When more than two treatments are involved in an experiment, the technique of ANALYSIS OF VARIANCE can be used. This can also be carried out using GLM with the appropriate model. The following is a simple experiment with three treatments and their effect on live weight gain (LWG). The model is: LWG = Treat

MTB > table ‘Treat’;
SUBC> stats ‘LWG’.

ROWS: Treat

        LWG      LWG      LWG
          N     MEAN  STD DEV
1        10   507.38    53.45
2        10   576.98    48.31
3        10   656.45    71.04
ALL      30   580.27    83.75

MTB > glm LWG=Treat;
SUBC> means Treat.

Factor   Levels   Values
Treat         3   1  2  3

Analysis of Variance for LWG

Source   DF   Seq SS   Adj SS   Adj MS       F      P
Treat     2   111286   111286    55643   16.30   0.00
Error    27    92145    92145     3413
Total    29   203431

Unusual Observations for LWG

Obs.       LWG       Fit   Stdev.Fit   Residual   St.Resid
  25   775.515   656.455      18.474    119.060     2.15R

R denotes an obs. with a large st. resid.

Means for LWG

Treat    Mean   Stdev
1       507.4   18.47
2       577.0   18.47
3       656.5   18.47

Explanation:

The TABLE command is used to give the means and standard deviations of the treatments. These are the figures that should be presented in a published paper. Then the GLM test is used as shown to produce the analysis of variance table.

From the results shown it can be seen that the effect of treatment is highly significant (P<0.001). A significant F test must be obtained before it is valid to compare treatments by a t-test.

The final table lists the means and the pooled standard error of each mean (printed under ‘Stdev’). This is used to test for differences. With equal replication, the standard error of the difference between two means is √2 × SE(mean), and the least significant difference is t × SE(difference). In cases where there are a reasonable number of replicates, t will be approximately 2. Therefore differences between means greater than 2 × SE(difference) are significant. In this example, there are significant differences between all treatments.
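The comparison of means can be sketched in Python as follows. It assumes equal replication, so that the standard error of a difference is √2 times the standard error of a mean, and it uses the approximation t = 2 from the text rather than a tabulated value.

```python
import math
from itertools import combinations

# Treatment means and pooled standard error of a mean, from 'Means for LWG'.
means = {1: 507.4, 2: 577.0, 3: 656.5}
SE_MEAN = 18.47
T_APPROX = 2.0  # approximation used in the text; assumption, not a tabulated value

se_difference = SE_MEAN * math.sqrt(2)   # valid for equal replication only
lsd = T_APPROX * se_difference           # least significant difference

significant = {}
for a, b in combinations(sorted(means), 2):
    difference = abs(means[a] - means[b])
    significant[(a, b)] = difference > lsd
    print(f"T{a} vs T{b}: difference = {difference:.1f}, "
          f"{'significant' if significant[(a, b)] else 'not significant'}")

print(f"LSD = {lsd:.1f}")
```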

Experiments with blocks

As was explained earlier, where there is a known source of variation (such as site, farm, litter of pigs, etc.) the treatments should be applied equally to each block and the block taken account of in the analysis. The following example gives the MINITAB printout for an experiment with three blocks and three treatments:

MTB > glm LWG=block treat;
SUBC> means treat.

Factor   Levels   Values
block         3   1  2  3
treat         3   1  2  3

Analysis of Variance for LWG

Source   DF   Seq SS   Adj SS   Adj MS      F      P
block     2    19211    19211     9605   3.47  0.044
treat     2    28331    28331    14165   5.11  0.012
Error    31    85916    85916     2771
Total    35   133457

Unusual Observations for LWG

Obs.       LWG       Fit   Stdev.Fit   Residual   Sd.Resid
  27   627.328   517.695      19.620    109.633     2.24R

R denotes an obs. with a large sd. resid.

Means for LWG

treat    Mean   Stdev
1       546.7   15.20
2       604.6   15.20
3       607.7   15.20

In this example, both block and treatment are significant (P<0.05). There are significant differences between treatment 1 and both the other two treatments but not between T2 and T3.

Latin square design

A Latin Square is a special sort of block design with symmetrical arrangement of treatments in two directions. It is particularly useful in experiments where numbers are restricted by facilities. Take an animal experiment to measure protein degradability by the nylon bag technique, using 4 fistulated animals. Four feeds (A, B, C, D) are studied and each feed is incubated in the rumen of each animal in turn. The design looks as follows:

              Animal
Period    1    2    3    4
1         B    A    D    C
2         A    D    C    B
3         C    B    A    D
4         D    C    B    A

The analysis would appear as follows:

MTB> table c1 c2;
SUBC> means c4.

ROWS: Row     COLUMNS: Column

            1        2        3        4      ALL
1      42.800   42.900   69.100   49.400   51.050
2      47.400   53.500   47.100   56.800   51.200
3      52.300   61.200   40.500   51.800   51.400
4      61.300   51.200   54.000   39.700   51.550
ALL    50.950   52.200   52.625   49.425   51.300

CELL CONTENTS -
C4: MEAN

MTB> table c3;
SUBC> stats c4.
ROWS: Feed

         Dg       Dg       Dg
          N     MEAN  STD DEV
1         4   42.575    3.504
2         4   48.525    4.176
3         4   52.300    2.061
4         4   62.100    5.117
ALL      16   51.300    8.138

MTB> glm dg = row column feed

Factor   Levels   Values
Row           4   1  2  3  4
Column        4   1  2  3  4
Feed          4   1  2  3  4

Analysis of Variance for Dg

Source   DF   Seq SS   Adj SS   Adj MS       F      P
Row       3     0.58     0.58     0.19    0.01  0.999
Column    3    24.81    24.81     8.27    0.32  0.811
Feed      3   812.88   812.88   270.96   10.49  0.008
Error     6   155.04   155.04    25.84
Total    15   993.32

Explanation:

The analysis shows a significant effect of feed (P<0.01); the table of means is given above, together with their standard deviations.

In general, a 4×4 (or better, a 6×6) Latin square is suitable for this type of experiment. The design can be chosen at random from lists of Latin square designs in statistical textbooks.
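Where a textbook list is not to hand, a randomized square can also be produced by computer. The Python sketch below is one simple way to do this, not the method used for the design in this chapter: it starts from a cyclic square and shuffles rows, columns and treatment labels. This does not sample every possible Latin square with equal probability. The function names are hypothetical. The checking function is also applied to the 4 × 4 design shown earlier.

```python
import random


def random_latin_square(treatments, rng=random):
    """Randomize a cyclic Latin square by shuffling rows, columns and labels."""
    n = len(treatments)
    square = [[(r + c) % n for c in range(n)] for r in range(n)]  # cyclic start
    rng.shuffle(square)                                           # shuffle rows
    col_order = list(range(n))
    rng.shuffle(col_order)                                        # shuffle columns
    square = [[row[c] for c in col_order] for row in square]
    labels = list(treatments)
    rng.shuffle(labels)                                           # shuffle labels
    return [[labels[i] for i in row] for row in square]


def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return len(symbols) == n and rows_ok and cols_ok


# Design from the text: rows are periods, columns are animals.
chapter_design = ["BADC", "ADCB", "CBAD", "DCBA"]
print(is_latin_square(chapter_design))

for row in random_latin_square("ABCD"):
    print(" ".join(row))
```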

Experiments with interactions

When there are two factors in an experiment, we need to know not only whether there is an effect of each factor alone but also whether there is an INTERACTION between them (one factor affects the response of the animal to the other). This can be analyzed using GLM by specifying the terms:

MTB> GLM Y = A B A * B

Alternatively, the above expression can be abbreviated to:

MTB> GLM Y = A ! B

The following example refers to an experiment with three energy treatments and three protein treatments.

MTB> table ‘energy’ ‘protein’;
SUBC> stats ‘LWG’.

ROWS: Energy     COLUMNS: Protein

            1        2        3      ALL
1           3        3        3        9
       513.11   636.95   650.69   600.25
        50.23    63.41    35.10    79.06
2           3        3        3        9
       640.75   650.09   711.54   667.46
        65.69    18.29    22.36    48.96
3           3        3        3        9
       649.58   737.08   627.28   671.32
        40.65    59.81    54.66    67.68
ALL         9        9        9       27
       601.15   674.71   663.17   646.34
        80.60    64.84    50.98    71.94

CELL CONTENTS --
LWG:N
MEAN
STD DEV

MTB > glm LWG = energy ! protein;
SUBC> means energy ! protein.

Factor    Levels   Values
Energy         3   1  2  3
Protein        3   1  2  3

Analysis of Variance for LWG

Source           DF   Seq SS   Adj SS   Adj MS      F      P
Energy            2    28746    28746    14373   6.12  0.009
Protein           2    28172    28172    14086   6.00  0.010
Energy*Protein    4    35364    35364     8841   3.76  0.021
Error            18    42287    42287     2349
Total            26   134569

Means for LWG

Energy             Mean   Stdev
1                 600.3   16.16
2                 667.5   16.16
3                 671.3   16.16

Protein
1                 601.1   16.16
2                 674.7   16.16
3                 663.2   16.16

Energy*Protein
1  1              513.1   27.98
1  2              636.9   27.98
1  3              650.7   27.98
2  1              640.7   27.98
2  2              650.1   27.98
2  3              711.5   27.98
3  1              649.6   27.98
3  2              737.1   27.98
3  3              627.3   27.98

Explanation:

Both energy and protein have significant effects. In addition there is a significant interaction between energy and protein, that is, the effect of one depends on the level of the other.

Introducing covariates

Another known source of variation may be a continuous variable such as previous milk yield, starting weight, previous performance, etc. This is very often the case in milking experiments with cows or goats when the experimental animals will almost certainly have different yields and be at different stages of lactation. The following example is an experiment with three treatments to measure the effect on the milk yield of cows. Initial yield is stored in the data table as the variable ‘init’ and the analysis is as follows:

MTB > glm yield=treat;
SUBC> cova init;
SUBC> means treat.

Factor   Levels   Values
treat         3   1  2  3

Analysis of Variance for yield

Source   DF   Seq SS   Adj SS   Adj MS       F      P
init      1   274.56   250.30   250.30   55.75  0.000
treat     2    44.08    44.08    22.04    4.91  0.014
Error    32   143.66   143.66     4.49
Total    35   462.31

Term        Coeff    Stdev   t-value      p
Constant   2.3086   0.9055      2.55  0.016
init       0.9670   0.1295      7.47  0.000

Means for Covariates

Covariate    Mean   Stdev
init        6.438   2.789

Adjusted Means for yield

treat    Mean    Stdev
1       7.100   0.6121
2       8.695   0.6130
3       9.808   0.6150

Explanation:

If the analysis had been performed without including initial milk yield as a covariate, no significant differences between treatments would have been found. However, with the inclusion of the term ‘init’ as a covariate, there is a significant effect of treatment (P<0.05).

The final table of means shows the values for each treatment adjusted for initial milk yield. Treatment 1 again differs significantly from the other 2.

Better experiments (more levels of the treatments)

In many feeding experiments we need to test the effects of the level of feed inclusion or of the level of a nutrient such as energy or protein. All the experiments described above have 3 treatments and the treatments are treated as DISCRETE variables. However, we often want to know the SHAPE of the response to a treatment, whether there is a maximum or optimum level. With 2 or 3 treatments we can only see if there is a response. By including more levels of the treatment we can test the linear, quadratic and cubic effects. That is, we can see if the response is curved. We can also find the equation which describes the curve. To do this, we treat the factors as CONTINUOUS variables. As a general rule, it is better to include more levels of treatments in this type of experiment as we obtain more information about the response. We make better use of the available experimental material and, provided we have a reasonable number, we lose very little in precision (only 1 degree of freedom for each level). The following is an experiment with 5 levels of energy and 5 levels of protein. We can test for the response to both and also for the interaction between energy and protein.

MTB > table ‘Energy’ ‘Protein’;
SUBC> stats ‘LWG’.

ROWS: Energy     COLUMNS: Protein

            1        2        3        4        5      ALL
1           3        3        3        3        3       15
       522.31   522.62   565.98   535.65   596.00   548.51
        20.46    32.52    18.15    29.77    48.59    39.96
2           3        3        3        3        3       15
       555.63   564.93   631.32   638.35   643.72   606.79
        28.50    23.54    19.19    29.58    18.51    44.64
3           3        3        3        3        3       15
       558.41   620.30   638.36   646.82   641.30   621.04
         5.65    25.63    41.76     8.58    27.69    40.03
4           3        3        3        3        3       15
       621.73   664.42   673.26   666.18   667.57   658.63
        30.48    34.30    36.86    26.19    36.15    33.97
5           3        3        3        3        3       15
       616.58   667.53   690.55   698.05   691.20   672.78
        17.21     4.94    14.17     9.19    36.06    35.10
ALL        15       15       15       15       15       75
       574.93   607.96   639.89   637.01   647.96   621.55
        43.91    62.68    50.47    59.79    44.08    58.05

CELL CONTENTS --
LWG:N
MEAN
STD DEV

MTB > glm LWG=Energy Protein Energy*Energy Protein*Protein Energy*Protein;
SUBC> cova Energy Protein;
SUBC> test Energy Protein Energy*Energy Protein*Protein Energy*Protein/error.

Analysis of Variance for LWG

Source            DF   Seq SS   Adj SS   Adj MS       F      P
Energy             1   135343    18105    18105   22.13  0.000
Protein            1    45991    14473    14473   17.69  0.000
Energy*Energy      1     4514     4514     4514    5.52  0.022
Protein*Protein    1     6682     6682     6682    8.17  0.006
Energy*Protein     1      414      414      414    0.51  0.479
Error             69    56438    56438      818
Total             74   249382

Term                Coeff   Stdev   t-value      P
Constant           396.39   26.68     14.86  0.000
Energy              61.38   13.05      4.70  0.000
Protein             54.88   13.05      4.21  0.000
Energy*Energy      -4.636   1.974     -2.35  0.022
Protein*Protein    -5.641   1.974     -2.86  0.006
Energy*Protein     -1.175   1.651     -0.71  0.479

F-test with denominator: Error
Denominator MS = 817.94 with 69 degrees of freedom

Numerator         DF   Seq MS        F      P
Energy             1   135343   165.47  0.000
Protein            1    45991    56.23  0.000
Energy*Energy      1     4514     5.52  0.022
Protein*Protein    1     6682     8.17  0.006
Energy*Protein     1      414     0.51  0.479

Explanation:

The first TABLE gives the means for each sub-treatment with standard deviations. The mean for each main treatment is shown at the right hand side and bottom of the table. Then the analysis of variance is performed. Notice that both Energy and Protein are set as continuous variables with the subcommand COVA Energy Protein. Notice also an additional subcommand TEST. This requires some explanation.

TEST is used as a sub-command to GLM to force MINITAB to use the sequential sums-of-squares and consequent mean squares in the test of significance, rather than the adjusted sums-of-squares and mean squares, which is the default action. The difference between them is that the adjusted sum-of-squares refers to each factor when all the others have been accounted for; the sequential sum-of-squares is calculated sequentially from the top so that each factor is taken out in turn.

The TEST sub-command should always be used when the factors are NOT independent, as is inevitably the case with linear, quadratic and cubic effects (X, X*X, X*X*X). In other experiments where the sequential sums-of-squares and adjusted sums-of-squares are very different, non-independence is implied and the TEST sub-command should be used to force the use of the sequential sums-of-squares. The factors tested by the above commands are:

Energy:linear effect of energy
Protein:linear effect of protein
Energy*Energy:quadratic effect of energy
Protein*Protein:quadratic effect of protein
Energy*Protein:Energy x Protein interaction.

In assessing significance, the LAST table should be used (F test with denominator: Error). In the example, the linear and quadratic effects for both Energy and Protein are significant but there is no interaction (NS). This shows that the effects of Energy and Protein are curvilinear (diminishing response in this case as the quadratic coefficients are negative).
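The F values in the last table are simply the sequential mean squares divided by the error mean square, as the following Python sketch shows. The figures are copied from the printout; each term has 1 degree of freedom, so its sequential mean square equals its sequential sum of squares.

```python
ERROR_MS = 817.94  # denominator mean square, 69 degrees of freedom

# Sequential mean squares in the order in which the terms were fitted.
seq_ms = {
    "Energy": 135343.0,
    "Protein": 45991.0,
    "Energy*Energy": 4514.0,
    "Protein*Protein": 6682.0,
    "Energy*Protein": 414.0,
}

# F ratio for each term, tested against the error mean square.
f_ratios = {term: ms / ERROR_MS for term, ms in seq_ms.items()}

for term, f in f_ratios.items():
    print(f"{term:<16} F = {f:7.2f}")
```

The linear terms give much larger F values here (165.47 and 56.23) than in the default adjusted analysis (22.13 and 17.69), because the sequential test credits each linear effect before the quadratic and interaction terms are removed.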

There is little reason for a farmer to increase either energy or protein above the third level in both cases. An accurate equation can be obtained by rerunning the analysis with the interaction removed (because it was not significant) and using the constant and coefficients to construct the equation.

Note that in experiments with two factors where we wish to test the interaction, the model can be abbreviated to:

MTB> GLM LWG = FEED ! SYSTEM

This will test the main effects and the interaction (FEED, SYSTEM and FEED*SYSTEM). This could not be used in the above example because we excluded some of the more complex interactions.

Dealing with unbalanced designs

Particularly in on-farm research, we may not be able to apply all of the treatments, all of the time. With ANOVA, this presented serious problems and necessitated calculating ‘missing plots’. However, GLM is a powerful tool for dealing with unbalanced designs and has fewer limitations. A fuller explanation of the use of GLM for unbalanced designs is given below.

Some Restrictions on Models in GLM

(Minitab Reference Manual 1991, 8–28)

“Although models can be unbalanced in GLM, they must be “full rank.” Thus, there must be enough data to estimate all the terms in your model. For example, suppose you have a two factor crossed model with one empty cell. Then you can fit the model GLM Y = A B, but not GLM Y = A B A*B. Don't worry about figuring out whether or not your model is of full rank. Minitab will tell you if it is not. In most cases, eliminating some of the high order interactions in your model (assuming, of course, they are not important) will solve your problem.

There is another restriction: nesting must be balanced. Suppose A has 3 levels, and B is nested within A. If B has 4 levels within the first level of A, it must have 4 levels within the second and third levels of A also. Minitab will tell you if you have unbalanced nesting.

In addition, the subscripts used to indicate the 4 levels of B within each level of A must be the same. Thus, you cannot use (1 2 3 4 ) for the levels of B within level 1 of A, and (5 6 7 8 ) for the levels of B within level 2 of A.”

ANALYZING EXPERIMENTS WITH DISCONTINUOUS VARIABLES

Chi-squared analysis

The use of chi-squared analysis enables the analysis of experiments involving data of the yes/no type or when the results are counts. Note that in the latter case the absolute data should be used and the results should NOT be converted into percentages.

The data are arranged in ‘contingency tables’ of the type:

                Germinated   Not germinated
Control                149               51
Treated seed           182               18

The chi-squared statistic is rather like the SS in that it is based on the square of the difference between the observed result and the expected result (the result expected if the results were averaged between the two treatments), divided by the expected result. We compute a value for each cell, then sum the values for all the cells and compare the total with the value in tables.

If the total chi-squared value is GREATER than the tabulated value, then there is a significant difference between the rows or treatments.

The data should be entered into MINITAB in two columns and the MINITAB command CHISQUARE used as follows:

MTB > chis c1 c2

Expected counts are printed below observed counts

             C1       C2    Total
1           149       51      200
         165.50    34.50
2           182       18      200
         165.50    34.50
Total       331       69      400

ChiSq = 1.645 + 7.891 + 1.645 + 7.891 = 19.073
df = 1

Note that for a 2×2 table there is one degree of freedom (only one comparison possible). Look up the tables on the line for 1 d.f.
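The calculation behind the CHISQUARE printout can be retraced with a short Python sketch. The function name is hypothetical; the expected count in each cell is taken as row total × column total / grand total, and the counts are those of the germination table.

```python
def chi_squared(table):
    """Chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


germination = [[149, 51],   # control: germinated, not germinated
               [182, 18]]   # treated seed

chi2, df = chi_squared(germination)
CRITICAL_5PCT_1DF = 3.84    # tabulated 5 % value for 1 df
print(f"chi-squared = {chi2:.3f} with {df} df")
print("significant" if chi2 > CRITICAL_5PCT_1DF else "not significant")
```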

Chi-squared analysis may equally be used for simple experiments with more than two treatments. A 3 × 2 table has 2 d.f. The degrees of freedom is calculated as: (rows-1) × (columns-1)

It is also possible to have 3 × 3, 3 × 4, 4 × 4… etc. tables.

The chi-squared statistic behaves like normal variance in that it may be partitioned between several factors and the interaction may also be calculated. Take the following example:

Four treatments are applied to 100 cows each and the results measured as ‘conceived’ or ‘failed’ to conceive:

Treatment                    Conceived   Failed
High energy - high protein          81       19
High energy - low protein           88       12
Low energy - high protein           75       25
Low energy - low protein            43       57

First, compute the chi-squared value for the whole table (3 d.f.):

Total treatment effect (3 df)   chi2 = 58.549 > 11.3   significant (P<0.01)

Now combine rows 1+2 and 3+4 into a 2×2 table and calculate chi-squared (1 df) to test the energy effect, and combine rows 1+3 and 2+4 into another 2×2 table and calculate chi-squared to test the protein effect:

Energy effect (1 df)    chi2 = 32.080 > 6.63   significant (P<0.01)
Protein effect (1 df)   chi2 = 7.709 > 6.63    significant (P<0.01)

Subtract the energy and protein chi-squared values from the total chi-squared to get the remaining effect, which is due to the interaction:

Energy x protein (1 df)   chi2 = 58.549 - 32.080 - 7.709 = 18.760 > 6.63   significant (P<0.01)

There is a significant effect of energy and protein, and there is also an interaction between energy and protein. Note how the chi-squared values are additive and we partition the original 3 df into 1 for each main effect and 1 for the interaction.
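The partition can be sketched in Python using the conception counts from the table above. The helper names are hypothetical; the interaction is obtained by subtraction, exactly as described in the text.

```python
def chi_squared(table):
    """Chi-squared statistic for a table of counts (rows x columns)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / grand_total) ** 2
        / (row_totals[i] * col_totals[j] / grand_total)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )


# (energy, protein) -> (conceived, failed)
cells = {
    ("high", "high"): (81, 19),
    ("high", "low"): (88, 12),
    ("low", "high"): (75, 25),
    ("low", "low"): (43, 57),
}


def pooled(factor_index, level):
    """Add up the rows that share one level of energy (0) or protein (1)."""
    rows = [counts for key, counts in cells.items() if key[factor_index] == level]
    return [sum(values) for values in zip(*rows)]


total = chi_squared([list(counts) for counts in cells.values()])    # 3 df
energy = chi_squared([pooled(0, "high"), pooled(0, "low")])         # 1 df
protein = chi_squared([pooled(1, "high"), pooled(1, "low")])        # 1 df
interaction = total - energy - protein                              # 1 df

print(f"total {total:.3f}, energy {energy:.3f}, "
      f"protein {protein:.3f}, interaction {interaction:.3f}")
```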

Numbers required for chi-squared analysis

(eg: animal reproductive performance)

The numbers required to obtain significant differences in this type of analysis are usually greater than with measurements such as growth or yield. Consider the results of chi-squared analysis where there is a difference of 10% in fertility of cows:

25 cows per treatment

conceived   failed
       20        5     chi2 = 0.439
       18        7

50 cows per treatment

conceived   failed
       40       10     chi2 = 0.877
       36       14

100 cows per treatment

conceived   failed
       80       20     chi2 = 1.754
       72       28

150 cows per treatment

conceived   failed
      120       30     chi2 = 2.632
      108       42

225 cows per treatment

conceived   failed
      180       45     chi2 = 3.947
      162       63

It is only when we have 225 cows per treatment that we can detect the 10% difference in fertility (P<0.05), which is an important practical difference.
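The series of tables above can be regenerated with a short Python sketch, which makes it easy to try other group sizes or conception rates. The conception rates of 80% and 72% are those implied by the tables; the function name is hypothetical, and the shortcut formula used applies to 2 × 2 tables only.

```python
def chi_squared_2x2(a, b, c, d):
    """Chi-squared for the 2 x 2 table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


CRITICAL_5PCT_1DF = 3.84  # tabulated 5 % value of chi-squared for 1 df

results = {}
for cows in (25, 50, 100, 150, 225):
    conceived_a = cows * 80 // 100      # 80 % conception in the first group
    conceived_b = cows * 72 // 100      # 72 % conception in the second group
    chi2 = chi_squared_2x2(conceived_a, cows - conceived_a,
                           conceived_b, cows - conceived_b)
    results[cows] = chi2
    verdict = "significant" if chi2 > CRITICAL_5PCT_1DF else "not significant"
    print(f"{cows:>3} cows per treatment: chi2 = {chi2:.3f} ({verdict})")
```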

Limitations of chi-squared analysis

Certain rules must be considered when applying chi-squared analysis. One of these is that all cells should contain values greater than 5 (Snedecor). Otherwise, chi-squared is unreliable particularly with only 1 df.

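As an improvement, Yates (1939) proposed an adjustment known as ‘Yates' Correction Factor’. This is simply an adjustment of the formula in which 0.5 is subtracted from the absolute difference between the observed and expected value in each cell before squaring:

$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$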
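As a sketch of the correction in its usual form (0.5 subtracted from each absolute difference between observed and expected counts), the Python function below recomputes chi-squared for a 2 × 2 table. The function name is hypothetical, and the germination counts from earlier in the chapter are reused purely as an illustration.

```python
def chi_squared_yates(table):
    """Chi-squared for a 2 x 2 table with the continuity correction applied."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            # Subtract 0.5 from the absolute difference before squaring.
            chi2 += (abs(observed - expected) - 0.5) ** 2 / expected
    return chi2


germination = [[149, 51], [182, 18]]
corrected = chi_squared_yates(germination)
print(f"corrected chi-squared = {corrected:.3f}")  # uncorrected value is 19.073
```

With large counts the correction changes the result little; it matters most when the cell values are small.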

Exact probabilities

Occasionally it is possible to obtain only limited amounts of data, for example, if to obtain data would destroy experimental units. When the numbers in a 2 × 2 table are very small, it may be best to compute exact probabilities rather than to rely on the chi-squared approximation.

Example:

            Have   Have not   Total
Standard       5          2       7
Treatment      3          3       6
Total          8          5      13

We compute the probability of obtaining the observed distribution or a more extreme one, the more extreme ones being:

               6          1       7
               2          4       6
               8          5      13

and

               7          0       7
               1          5       6
               8          5      13

We require the sum of the probabilities associated with the three distributions. Marginal totals are the same for all three tables. The sum of the probabilities will be used in judging significance. The probability associated with the distribution:

n11n12n1.
n21n22n2.
n.1n.2n..

where nij is defined as

n! = n(n-1)…1 and 0! = 1

Read n! as ‘n factorial’.

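The probabilities for the three tables above are

$$P_1 = \frac{7!\,6!\,8!\,5!}{13!\,5!\,2!\,3!\,3!} = 0.3263, \qquad P_2 = \frac{7!\,6!\,8!\,5!}{13!\,6!\,1!\,2!\,4!} = 0.0816, \qquad P_3 = \frac{7!\,6!\,8!\,5!}{13!\,7!\,0!\,1!\,5!} = 0.0047$$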

The sum of the probabilities is 0.4126 (not significant, P>0.05). It is clear that the computation of the first probability alone was sufficient to answer the question of significance, since the probability of the observed table by itself already exceeds 0.05. In practice, one uses this approach by computing the largest individual probability first, and so on.
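The exact probabilities can be computed directly from the factorial expression. The following is a minimal sketch in Python; `table_probability` is an illustrative name, and the calculation is written with binomial coefficients, which is the same expression rearranged.

```python
from math import comb


def table_probability(n11, n12, n21, n22):
    """Exact probability of one 2 x 2 table when all marginal totals are fixed."""
    row1, row2 = n11 + n12, n21 + n22
    col1 = n11 + n21
    # Ways of filling the first column from each row, over all ways of choosing it
    return comb(row1, n11) * comb(row2, n21) / comb(row1 + row2, col1)


# The observed table followed by the two more extreme ones
tables = [(5, 2, 3, 3), (6, 1, 2, 4), (7, 0, 1, 5)]
probabilities = [table_probability(*t) for t in tables]

for t, p in zip(tables, probabilities):
    print(t, f"P = {p:.4f}")
print(f"sum = {sum(probabilities):.4f}")
```

The three tables are listed in order of decreasing probability, so the running total can be compared with 0.05 after each one and the calculation stopped as soon as the total exceeds it.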

Other non-parametric tests

Chi-squared analysis can also be used to test whether a distribution conforms to a particular type such as a binomial or Poisson. The calculated distribution is tested against the observed one. Other non-parametric tests which may be required are:

The sign test - for comparing medians.
Wilcoxon's signed rank test - an improvement on the above.
Friedman's test - for randomized complete block design.
Wilcoxon's test - for completely random design, two populations.
Mann-Whitney test - for the same but with unequal samples.
Kruskal-Wallis test - for completely random design, any number of populations.
Spearman's coefficient of rank correlation.
Tukey's test of association.

All the above can be used as quick tests without having to make assumptions about the nature of the population, its type of distribution and variance. However, where it is possible to make the necessary assumptions for the use of anovar, etc., more information (on means, variance, etc.) will be obtained.

Transforming non-normal data for analysis

To use the analysis of variance, we have to confirm the assumptions that:

1. Treatment and environmental effects are additive.
2. Experimental errors are random, independently and normally distributed about zero mean and with a common variance (i.e. the data are of the ‘height’ or ‘weight’ type).

Violation of these assumptions may result in unreliable statistical tests and the unacceptability of the conclusions (particularly for publication). Data which consist of counts and percentages, in particular, do not conform to these requirements. We can use the non-parametric tests (such as chi-squared) but these give us less information on treatment effects. A solution is often to ‘transform’ the data to conform to a normal probability distribution. For this, we take the original data, apply a formula and carry out anovar on the transformed data. We do NOT convert the data back to present the statistics but state that the data were transformed before analysis. The following techniques apply:

Square root transformation

When data consist of small whole numbers, e.g. number of plants or insects of a stated species in a given area, they often conform to the ‘Poisson distribution’, for which the mean and variance are equal. The analysis of such numbers is often best done by first taking the square root of each observation (√x) before carrying out the anovar.

Percentage data based on counts and a common denominator, where the range of percentages is 0–20% or 80–100% (but not both), may also be analyzed using √x. Percentages between 80–100 should be subtracted from 100 before the transformation is made.

It can be seen that when there are mostly low counts with a few very high ones, the distribution will be skewed and taking the square root will pull in the high ‘tail’. Notice also that this type of data has a fixed end at 0 (or at 100% in the case of high percentages), which prevents it from showing the two-sided shape of a normal distribution.

When very small values are involved, √x tends to overcorrect and √(x+0.5) should be used when some of the values are <10 and especially when zeros are present.
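The rules above can be sketched in a few lines of Python. The function names are illustrative, and the choice to add 0.5 automatically whenever any value is below 10 is one reading of the guidance given here, not a fixed rule.

```python
import math


def sqrt_transform(counts):
    """Square root transformation, using sqrt(x + 0.5) if any value is below 10."""
    offset = 0.5 if any(x < 10 for x in counts) else 0.0
    return [math.sqrt(x + offset) for x in counts]


def reflect_high_percentages(percentages):
    """Subtract percentages in the 80-100 range from 100 before transforming."""
    return [100 - p for p in percentages]


insect_counts = [0, 3, 4, 7, 12, 25]      # hypothetical counts per plot
high_percentages = [84, 91, 96, 100]      # hypothetical percentages, all 80-100

print(sqrt_transform(insect_counts))
print(sqrt_transform(reflect_high_percentages(high_percentages)))
```

The anovar is then carried out on the transformed column in place of the original one.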

The logarithmic transformation

The logarithmic transformation (log10 x) is used with positive integers which cover a wide range. This will again pull in a high ‘tail’ particularly when the high values are 100's or 1000's. When values are low (and obviously with 0), log(x+1) should be used. (The log transformation is also appropriate in experiments in which the variable is the variance.)
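A minimal sketch of the logarithmic transformation in Python follows. The function name and the `add_one` switch are illustrative; what counts as a ‘low’ value is left to the user, apart from zeros, which cannot be transformed without the +1.

```python
import math


def log10_transform(values, add_one=False):
    """Return log10(x), or log10(x + 1) when add_one is set."""
    if not add_one and any(v <= 0 for v in values):
        raise ValueError("zero or negative values present: use add_one=True")
    shift = 1 if add_one else 0
    return [math.log10(v + shift) for v in values]


egg_counts = [0, 9, 99, 999, 4200]   # hypothetical counts covering a wide range
print(log10_transform(egg_counts, add_one=True))
```

A range of several thousand in the raw counts is reduced to a range of less than four units on the transformed scale, which is the ‘pulling in’ of the high tail described above.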

The angular transformation

The angular transformation (arcsin √x or sin⁻¹ √x) is applicable to binomial data expressed as a decimal fraction or percentages when the percentages cover a wide range. (√x was recommended for percentages 0–20 and 80–100. For percentages 30–70 it is doubtful if any transformation is required.) Counts may need to be divided by the denominator, or percentages by 100, to produce the decimal fractions required.

Classical binomial data are the ‘success or failure’ type variables - conception rate, germination rate, etc. When given as a proportion, the angular transformation is appropriate. The square root may be applied when they are given as percentages (80–100%).
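The angular transformation can be sketched in Python as follows. The function names are illustrative, and expressing the result in degrees rather than radians is an assumption made here; either unit can be used provided it is applied consistently.

```python
import math


def angular_transform(proportions):
    """arcsin(sqrt(p)) in degrees for proportions between 0 and 1."""
    return [math.degrees(math.asin(math.sqrt(p))) for p in proportions]


def percentages_to_proportions(percentages):
    """Divide percentages by 100 to give the decimal fractions required."""
    return [p / 100 for p in percentages]


conception_rates = [35, 52, 68, 80, 94]   # hypothetical percentages, wide range
print(angular_transform(percentages_to_proportions(conception_rates)))
```

The transformation stretches the scale near 0 and 1 and leaves the middle of the range almost unchanged, which is why it matters most when the percentages cover a wide range.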

It is not always obvious which type of transformation is required. It may be helpful to plot the data and data transformed by various methods to check the effect on the shape of the curve.
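Where plotting is not convenient, a coefficient of skewness gives a rough numerical check of the same thing. The sketch below, in Python, compares hypothetical raw counts with two transformed versions; the `skewness` function is a simple moment coefficient written for this illustration, and a value near zero indicates a roughly symmetrical shape.

```python
import math


def skewness(values):
    """Moment coefficient of skewness (0 for a symmetrical set of values)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5  # undefined if all values are identical


counts = [1, 2, 2, 3, 3, 4, 5, 8, 20, 60]   # hypothetical, with a long high tail
candidates = {
    "raw": counts,
    "sqrt(x + 0.5)": [math.sqrt(x + 0.5) for x in counts],
    "log10(x + 1)": [math.log10(x + 1) for x in counts],
}

for name, data in candidates.items():
    print(f"{name:>14}: skewness = {skewness(data):.2f}")
```

The transformation giving the skewness closest to zero is a reasonable first candidate, but the choice should still be checked against the type of data as described above.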

SIMULATING EXPERIMENTAL DATA

The data referred to here are not real. They were produced by simulation, using MINITAB to produce sets of random data conforming to normally distributed probabilities and with appropriate variance.

This can be a very useful technique, used to run the experiment in a theoretical way (using appropriate means and SE's obtained from previous experience or the literature). We can then try the statistical analysis before the experiment starts and identify any limitations in the design.

The appropriate MINITAB commands to create a set of 20 normally distributed data, with mean 10 and SD±1 in column 1, are:

MTB> RANDOM 20 C1;
SUBC> NORMAL 10 1.

We might do the same in C2, with mean 12 and SD±1 and perform an ANOVAR on the two columns.
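For readers without access to MINITAB, the same exercise can be sketched in Python. This is an illustration only: `simulate_column` and `one_way_f` are names chosen here, the seed is fixed simply to make the run repeatable, and the variance ratio is computed by hand rather than taken from a statistics package.

```python
import random
from statistics import mean


def simulate_column(n, mu, sd, rng):
    """n random values from a normal distribution with mean mu and SD sd."""
    return [rng.gauss(mu, sd) for _ in range(n)]


def one_way_f(groups):
    """Variance ratio (F) and degrees of freedom for a one-way anovar."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


rng = random.Random(1)                 # fixed seed, so the run can be repeated
c1 = simulate_column(20, 10, 1, rng)   # column 1: mean 10, SD 1
c2 = simulate_column(20, 12, 1, rng)   # column 2: mean 12, SD 1

f_ratio, df_between, df_within = one_way_f([c1, c2])
print(f"F = {f_ratio:.2f} with {df_between} and {df_within} df")
```

Repeating the run with smaller numbers per column, or with a larger SD, shows how quickly a real difference between the means can be lost.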

The technique can be used to simulate factorial experiments, randomized block designs, latin squares, etc., using appropriate columns for different effects and variances. These can be summed to produce the simulated values for the data column and the appropriate analysis performed.
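The idea of summing separate effects can be sketched for a randomized block design as follows. This is a Python illustration with hypothetical effects and an illustrative function name; each simulated value is the grand mean plus a treatment effect, a block effect and a random error.

```python
import random


def simulate_randomized_blocks(grand_mean, treatment_effects, block_effects,
                               error_sd, rng):
    """One simulated value per block (rows) and treatment (columns)."""
    return [
        [grand_mean + t + b + rng.gauss(0, error_sd) for t in treatment_effects]
        for b in block_effects
    ]


rng = random.Random(2)                      # fixed seed for a repeatable run
treatment_effects = [0.0, 1.5, 3.0]         # hypothetical treatment effects
block_effects = [-1.0, -0.5, 0.5, 1.0]      # hypothetical block effects

data = simulate_randomized_blocks(10.0, treatment_effects, block_effects, 1.0, rng)
for block_number, row in enumerate(data, start=1):
    print(f"block {block_number}:", [round(v, 2) for v in row])
```

The resulting table can be analysed in the same way as real data from the design, and the error SD can be varied to see how much variation the design will tolerate.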

It is a good method to ‘practise’ statistics, while gaining an appreciation of the effects of numbers, different levels of variation and different methods of analysis.

