- record sheets for measurements and analyses of variance
Page 210 is a blank assessment sheet for measuring trees.
Page 211 is the same sheet, with a simple worked example.
Page 212 is a blank analysis of variance sheet.
Page 213 is the same sheet, with a worked analysis of the figures on page 211.
|SPECIES:||EXPERIMENT NUMBER:||DATE OF TREATMENT:|
|IDENTITY NUMBER:||DATES OF ASSESSMENT:|
|DETAILS OF MEASUREMENTS MADE:|
|DATES →||OBSERVATIONS AND NOTES|
ASSESSMENT OF Gain in height growth
|SPECIES: Ceiba pentandra||EXPERIMENT NUMBER: 3/97||DATE OF TREATMENT:||15/12/97|
|IDENTITY NUMBER: 96/12||DATES OF ASSESSMENT:||15/12/97|
|DETAILS OF MEASUREMENTS MADE:|
|Height in cm, measured from the top of a peg in the soil to the estimated position of the main shoot tip|
|DATES→||15/12/97||5/1/98||OBSERVATIONS AND NOTES|
|8||(22)||-||-||attacked by aphids|
|Date of treatment:||Date of assessment:||Assessment number:|
|Treatment number→||OVERALL TOTALS|
|Source of Variation||Sums of Squares||Df||Variance Estimate|
|Error||(total - treatment)|
|Total||(A - CF)|
|Coefficient of variation:||%|
|Standard error of the mean:|
|Species: Ceiba pentandra||Experiment number:|
|Effect of: Pot size|
|Assessment of: gain in height||Units: cm|
|Date of treatment: 15/12/97 Date of assessment: 5/1/98||Assessment number: 1|
|Differences:||1.7 4.8||Treatment 3 = 2.1 × Treatment 1||C.F. =|
|Source of Variation||Sums of Squares||Df||Variance Estimate|
|Treatment||(B - CF)||56.83||2||28.42||3.47 n.s.||5.14||(5%)|
|Error||(total - treatment)||49.17||6||8.194|
|Coefficient of variation: 35.8%|
|Standard error of the mean:|
|at the 5% level = 6.1||for 4 replicates = ± 1.4|
|at the 1% level = 9.2||for 3 replicates = ± 1.7|
|||for 2 replicates = ± 2.0|
CONCLUSIONS: Overall treatment effect not significant. Growth in large pots was more than twice that in small pots; this probably significant difference needs further study, with many more trees in each treatment.
- assessment by scoring
(A) Need for scoring methods:
Scoring is a valuable way of getting a rapid, general view of a situation in biology. It can take one further than the recording of an observation (C 55), without embarking on long and detailed measurements that may or may not be appropriate and productive.
Scoring is especially useful when the features to be assessed are difficult or impossible to record by measurement or counting; for example when differences are:
Scoring can also be helpful later on, when the trees are too big for easy measurement, or there are too many items to count.
(B) How to score:
Suggestions on scoring leaf colour are given in sheet C 55.
(C) Some weaknesses of scoring methods:
(D) Hints on scoring:
The main aims when choosing a scoring method are to minimise these weaknesses, and to achieve a valid, useful assessment simply and promptly. Some hints are:
(E) Analysis of scored data:
(1) Chi-square (χ2) tests are especially appropriate for comparing categories.
The 2×2 χ2 test is quick to calculate, and gives a simple estimate of the significance (see C 69-E, G, H, I) of qualitative differences such as the presence or absence of something. For example:
Comparing the number of cuttings that rooted in two different rooting media:
|Treatment||Number rooted||Number not rooted||Total number||Percentage rooted|
|sand||19 (a)||46 (b)||65 (g)||29 %|
|sawdust/sand||12 (c)||3 (d)||15 (h)||80 %|
|both||31 (e)||49 (f)||80 (N)|
This is the sum for calculating the chi-square:
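The formula itself has not survived in this copy; the standard form for a 2×2 table with Yates's Correction, consistent with the cell letters in the table above and with the result χ2 = 11.18 quoted below, is:

```latex
\chi^2 = \frac{N\,\bigl(\,\lvert ad - bc\rvert - \tfrac{N}{2}\,\bigr)^{2}}{e \times f \times g \times h}
```

Here |ad − bc| = |19 × 3 − 46 × 12| = 495, N/2 = 40, and the denominator is 31 × 49 × 65 × 15, giving χ2 = 11.18.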
Note: when the numbers are small (some totals less than 30), three points apply:
Result of the chi-square analysis (using Yates's Correction): χ2 = 11.18 ***.
With one degree of freedom (see C 69-I) between the two treatments, the values of chi-square to be exceeded are 3.84 (5% level); 6.64 (1%); and 10.83 (0.1%). In this example, the difference in rooting percent is highly significant (C 69-H), indicated by ***.
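As a check, the calculation can be reproduced in a few lines (a sketch in Python, not part of the original sheet; the cell counts are those from the rooting-media table above):

```python
# 2x2 chi-square test with Yates's Correction,
# using the rooting-media counts from the table above.
a, b = 19, 46   # sand: rooted, not rooted
c, d = 12, 3    # sawdust/sand: rooted, not rooted

g, h = a + b, c + d    # row totals (65, 15)
e, f = a + c, b + d    # column totals (31, 49)
N = g + h              # grand total (80)

# Yates's Correction subtracts N/2 from |ad - bc| before squaring.
chi2 = N * (abs(a * d - b * c) - N / 2) ** 2 / (e * f * g * h)
print(round(chi2, 2))  # 11.18, as quoted in the sheet
```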
When features have been scored into several categories, larger tables can be constructed, and the combined chi-square calculated. If such a test would be invalid because of low ‘expected frequencies’ in some boxes in the table, then categories can be amalgamated and a simpler table prepared. (For instance, categories 1+2 and 3–5 might be put together and the larger groups compared in a 2×2 chi-square test.)
(2) Analysis of variance can also be applied to scored data, and the variation between independent observers included in the analysis (see C 69-F), provided that:
If there are many zero values, you could compare the presence or absence of the feature by a chi-square test, and confine the analysis of variance to the cases where the feature is present.
Used with judgement, scoring methods can provide a rapid and useful complement to more precise and fully quantitative measurements. They are especially valuable when time is short and the features do not lend themselves to easy measurement.
Although the data obtained are only semi-quantitative, it may be possible to carry out valid statistical tests of significance.
- analysing the results of experiments
(A) Why experimental results generally need analysing:
Looking carefully at what happened in your experiment can clarify:
Statistical analyses are particularly important, helping one to avoid being misled when drawing conclusions about the results. They indicate how likely it is that any differences between the growth of various groups of young trees in your experiment are due just to chance, rather than to the conditions being studied (C 62-F).
(B) Two questions before starting a statistical analysis:
(1) Is it unnecessary? A formal analysis may not be needed for example when:
(2) Would the analysis be valid? It may not be, if, for instance:
(C) Steps in analysing the results:
(D) Which figures to analyse?
This depends on the circumstances, but it may often be best to start with:
A decision can then be taken about which other sets of figures might be worth analysing.
(E) Tests of significance for ‘Yes/No’ situations:
If the difference between two sets of experimental trees is qualitative - that is, a simple choice
between damaged/undamaged; alive/dead; leafy/leafless; terminal bud sprouting/not sprouting
- then the Chi-square test (χ2) is a straightforward one to use.
(see worked example of a 2×2 χ2 test in sheet C 68-E.)
Chi-square tests can also be performed with more than two samples.
(F) Tests of significance for ‘More/Less’ situations:
Where the difference between various sets of experimental trees is quantitative, several kinds of tests of significance are available. One of the most adaptable and widely used is the Analysis of Variance (ANOVA) - see the blank sheet and worked example in C 67. What this does is to estimate:
A simpler version is the t-test, but this can only handle a single comparison at a time, so is generally less informative and useful. (See also J - standard error of the mean.)
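As an illustration (not part of the original text), a one-way ANOVA can be computed directly from the sums of squares defined in C 67: the correction factor (CF), the total sum of squares (A − CF), the treatment sum of squares (B − CF), and the F ratio. The figures below are hypothetical data for three treatments of three trees each:

```python
# One-way ANOVA from first principles (sums of squares as in sheet C 67).
# The data are hypothetical figures, for illustration only:
# three treatments, three trees per treatment.
treatments = {
    "small pots":  [3.0, 4.5, 2.5],
    "medium pots": [5.0, 6.5, 6.0],
    "large pots":  [8.0, 7.5, 9.5],
}

values = [x for group in treatments.values() for x in group]
N = len(values)
CF = sum(values) ** 2 / N                  # correction factor
total_ss = sum(x * x for x in values) - CF # A - CF
treatment_ss = sum(sum(g) ** 2 / len(g) for g in treatments.values()) - CF  # B - CF
error_ss = total_ss - treatment_ss         # total - treatment

treatment_df = len(treatments) - 1
error_df = N - len(treatments)
F = (treatment_ss / treatment_df) / (error_ss / error_df)
# Compare F with the tabulated value for (2, 6) d.f.,
# which is 5.14 at the 5% level (as in the worked example in C 67).
print(round(F, 2))
```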
(G) Significant and non-significant effects.
The starting assumption (‘null hypothesis’) on which tests of significance are based is that there are not any real differences between the various groups of plants in the experiment - they just show chance variation around the overall mean. However, if it turns out that considerably more of the variation is assigned to treatment than to error, the assumption is found to be false and the treatment is said to have had a significant effect. If, on the other hand, roughly similar variation is assigned to treatment and to error, the original assumption stands, and the overall treatment difference is said to be ‘not significant’ (n.s.). The same applies to Blocks, genetic origins, and other factors.
(H) Levels of significance.
If a test of significance gives a number that is larger than the value given in the relevant table for the 5% level of probability (p = 0.05), this means that variation like this might happen anyway by chance in one out of more than 20 such trials. We say that such a difference is “probably significant” (and it is usually given one *). If the test gives a number that is bigger than the value in the table for the 1% level (p = 0.01), such a difference would only be likely to happen by chance once in more than 100 trials, and it is called “significant” (**). If the value for the 0.1% level (p = 0.001) is exceeded, the difference would probably only occur by chance once in more than 1000 trials, and is called “highly significant” (***).
(I) Degrees of freedom and significance in statistical tables.
Between two treatments, there is only one comparison to be made; between ten seed-lots, only nine independent comparisons. The number of degrees of freedom (d.f.) is the number of trees, replicates, Blocks, treatments, seed lots, clones, and so on that are involved, minus one. So when looking up tables for:
Chi-square tests: With one d.f. between ‘yes’ or ‘no’, the values of chi-square to be exceeded are 3.84 (5% level); 6.64 (1%); and 10.83 (0.1%).
ANOVA (See worked example in C 67):
(J) Calculating the Standard Errors of the Means:
The standard error of the mean (S.E.) is the simplest estimate of how reliable an average is. It can be calculated for any set of figures by dividing the standard deviation (s) by the square root of the number of trees (n): S.E. = s/√n.
After an ANOVA, a more accurate S.E. is calculated by dividing the error mean square (residual variance estimate - see I-2) by the number of values that have been averaged in a particular treatment, and then taking the square root.
The average (mean) is then written for example as 5.6±1.2, and on a graph or histogram the S.E. is usually shown to scale, as a vertical bar above and below the average value.
If their ‘error bars’ do not overlap, this is commonly taken as an indication that two means are probably significantly different from each other. If they do overlap, any differences may just be due to chance.
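These calculations can be checked against the worked example in C 67, where the error mean square was 8.194 (a sketch in Python, not part of the original sheet):

```python
import math

# Standard error of a treatment mean after an ANOVA:
# the square root of (error mean square / number of values averaged).
error_mean_square = 8.194   # from the worked example in C 67

for n in (4, 3, 2):
    se = math.sqrt(error_mean_square / n)
    print(f"for {n} replicates = +/- {se:.1f}")
```

The printed values (± 1.4, ± 1.7 and ± 2.0) match those on the worked analysis sheet.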
(K) Interactions:
Consider a 2 × 2 trial with a control, mulch only, fertiliser only, and mulch plus fertiliser (D 6 and D 55 in Manual 4). If, for example, the effects of fertiliser depended on whether the plants were mulched or not, then an interaction is occurring (Manual 5): the two factors, mulch and fertiliser, are not acting independently of each other. Interactions are important in understanding more about growth, because they suggest that the two factors are acting upon the same process. If, on the other hand, there is no interaction, the separate effects of the two factors will simply be added together, or the one subtracted from the other. Interactions:
(L) Examining the significance of differences between individual pairs:
Calculate a value called the least significant difference (L.S.D.):
L.S.D. = t × the standard error of the difference, where the standard error of the difference = √[s² × (1/n1 + 1/n2)] and s² is the error mean square from the ANOVA.
The value ‘t’ is taken from tables at the 5%, 1% and 0.1% levels of probability, using the error degrees of freedom; n1 and n2 are the numbers of plants in the two groups being compared (for example, treatment 1 and the control).
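Reusing the figures from the worked example in C 67 (error mean square 8.194 with 6 error d.f., for which the tabulated t at the 5% level is 2.447), the calculation runs as follows (a sketch, not part of the original sheet; the replicate numbers are illustrative):

```python
import math

# Least significant difference between two treatment means:
#   L.S.D. = t * standard error of the difference,
# where the S.E. of the difference = sqrt(s2 * (1/n1 + 1/n2)).
s2 = 8.194      # error mean square from the worked example in C 67
n1, n2 = 4, 4   # replicates in the two treatments being compared
t = 2.447       # t-table value for 6 error d.f. at the 5% level

se_difference = math.sqrt(s2 * (1 / n1 + 1 / n2))
lsd = t * se_difference
# Any pair of means differing by more than this is
# probably significantly different at the 5% level.
print(round(lsd, 2))
```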
This is the simplest method. Although various authors suggest alternatives (C 62-F), these are more complicated to calculate. The L.S.D. can be a useful guide, provided the following points are remembered:
(M) Various reasons for lack of significance:
If a test does not show significance, this merely means that the null hypothesis stands (see G). It does not prove that the treatment is ineffective.
Significant effects might not have been found because:
(N) Reducing variability:
Some computer programmes (see R-2) can be set to ignore data points that are further from the mean than a set distance. This is risky when dealing with variable species and environments, as these points may well be true values.
Only after several experiments would you conclude that the treatment probably has little or no effect on those aspects of growth of that tree species.
(O) Transformations:
These are sometimes needed in order to put the figures into a form where a valid analysis can be done (see B-2). Here are some examples:
Chi-square tests with small numbers - apply Yates's Correction (see C 68-E).
ANOVA with a non-normal distribution - if the mean value lies well towards the low end of the distribution, transforming all the original data (‘x’) may make the distribution reasonably normal. If so, do the ANOVA on the transformed figures (‘z’). Some common transformations include:
log transformation: z = log10 (x + 0.375); or z = ln (x + 0.375) (0.375 is added to each number to avoid problems with values of zero and one.)
ANOVA of percentages - use the arcsin transformation: z = arcsin √(p/100), where p is the percentage.
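Both transformations are easy to apply before the ANOVA (a sketch in Python, not part of the original sheet; the arcsin form z = arcsin √(p/100), given here in degrees, is the standard one for percentages and is an assumption, since the formula has not survived in this copy):

```python
import math

def log_transform(x):
    """Log transformation as given above: z = log10(x + 0.375)."""
    return math.log10(x + 0.375)

def arcsin_transform(p):
    """Standard arcsin transformation for a percentage p:
    z = arcsin(sqrt(p/100)), returned here in degrees."""
    return math.degrees(math.asin(math.sqrt(p / 100)))

# Transform a set of counts and a set of percentages before ANOVA.
counts = [0, 1, 4, 12]
percents = [29, 80]   # the rooting percentages from the table in C 68-E
print([round(log_transform(x), 3) for x in counts])
print([round(arcsin_transform(p), 1) for p in percents])
```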
Note: if you want to de-transform the results before presenting them, the standard error of the mean (see J) and the least significant difference (see L) require care. Because transformations (2) and (3) above are not linear ones, the S.E. bars will be of unequal length above and below a mean.
(P) Missing plants or readings.
It is still possible to do ANOVAs when there are different numbers of readings in the various treatments or genetic origins, for example because:
If the ANOVA has only one factor (see K), then calculate as in C 67. If it has more than one factor, you could analyse them separately, though without being able to look at any interactions. Alternatively, see statistical textbooks (C 62-F) for how to estimate missing values, noting that for each of them one d.f. is deducted before calculating the error mean square.
(Q) Correlation and regression.
These are ways of examining how closely two sets of readings may be connected. For example, height and diameter growth in a set of young trees might often (though not always) be closely linked, with the shorter trees thinner, and the taller ones thicker. Moreover, you might expect that the growth of the trees could be linked with soil depth, moisture or fertility, or with an aspect of the weather.
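Such a relationship can be put on a numerical footing with a correlation coefficient, calculated without special software (a sketch in Python, not part of the original sheet; the height and diameter figures are hypothetical):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sets."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical heights (cm) and stem diameters (mm) of six young trees.
heights = [22, 30, 41, 48, 60, 75]
diameters = [2.1, 2.8, 3.5, 4.4, 5.0, 6.3]
r = pearson_r(heights, diameters)
# r close to +1 when the shorter trees are also the thinner ones.
print(round(r, 3))
```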
When correlations or regressions show a close relationship, significance values are often given to them. But here it is particularly important not to be misled, because:
(R) Aids to calculation.
(1) Calculators: These have the advantages of being small, portable, robust and relatively cheap, and of working reliably from long-lasting batteries or solar energy. They are invaluable for transformations (see O), to obtain and check totals and averages, and for other simple calculations (C 63).
Some types contain programmes that automatically calculate the standard deviation and standard error of the mean when a set of figures is totalled. Others will perform more detailed analyses, or allow you to write a programme yourself.
(2) Computers: These offer opportunities for storing large amounts of data, doing complex calculations and analyses, and almost limitless possibilities for displaying the results. They can also be programmed to accept electronically recorded information about the environment. However, computers are relatively expensive, require a steady and reliable electricity supply, and need to be kept free of dust and high humidity. Some types can operate from rechargeable batteries, but they are too delicate to be really portable.
(S) A final hint:
Check at each stage for errors in recording numbers, and in calculations. If you do not, sooner or later you will find yourself having to start back at the beginning, re-analysing and re-drawing graphs (and perhaps even changing slides and the proofs of publications). Just because a computer has done the analysis does not mean that there cannot be errors.