A Statistical Manual For Forestry Research

4.4. Factorial experiments

The response variable(s) in an experiment may be affected by a number of factors in the overall system, some of which are controlled or maintained at desired levels in the experiment. An experiment in which the treatments consist of all possible combinations of the selected levels of two or more factors is referred to as a factorial experiment. For example, an experiment on rooting of cuttings involving two factors, each at two levels, such as two hormones at two doses, is referred to as a 2 x 2 or a 2² factorial experiment. Its treatments consist of the following four possible combinations of the two levels of each of the two factors.

	Treatment combination
Treatment number	Hormone	Dose (ppm)
1	NAA	10
2	NAA	20
3	IBA	10
4	IBA	20

The term complete factorial experiment is sometimes used when the treatments include all combinations of the selected levels of the factors. In contrast, the term fractional factorial experiment is used when only a fraction of all the combinations is tested. Throughout this manual, however, complete factorial experiments are referred to simply as factorial experiments. Note that the term factorial describes a specific way in which the treatments are formed and does not, in any way, refer to the design used for laying out the experiment. For example, if the foregoing 2² factorial experiment is laid out in a randomized complete block design, then the correct description of the experiment would be a 2² factorial experiment in randomized complete block design.

The total number of treatments in a factorial experiment is the product of the numbers of levels of the factors; in the 2² factorial example, the number of treatments is 2 x 2 = 4, and in a 2³ factorial, the number of treatments is 2 x 2 x 2 = 8. The number of treatments increases rapidly with an increase in the number of factors or in the number of levels of each factor. For a factorial experiment involving 5 clones, 4 espacements, and 3 weed-control methods, the total number of treatments would be 5 x 4 x 3 = 60. Thus, indiscriminate use of factorial experiments has to be avoided because of their large size, complexity, and cost. Furthermore, it is not wise to commit oneself to a large experiment at the beginning of an investigation when several small preliminary experiments may offer promising results. For example, suppose a tree breeder has collected 30 new clones from a neighbouring country and wants to assess their reaction to the local environment. Because the environment is expected to vary in terms of soil fertility, moisture levels, and so on, the ideal experiment would be one that tests the 30 clones in a factorial experiment involving such other variable factors as fertilizer, moisture level, and population density. Such an experiment, however, becomes extremely large as factors other than clones are added. Even if only one factor, say nitrogen or fertilizer with three levels, were included, the number of treatments would increase from 30 to 90. Such a large experiment would mean difficulties in financing, in obtaining an adequate experimental area, in controlling soil heterogeneity, and so on. Thus, the more practical approach would be to test the 30 clones first in a single-factor experiment, and then use the results to select a few clones for more detailed study. For example, the initial single-factor experiment may show that only five clones are outstanding enough to warrant further testing. These five clones could then be put into a factorial experiment with three levels of nitrogen, resulting in an experiment with 15 treatments rather than the 90 treatments needed with a factorial experiment with 30 clones.
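The way factorial treatments are formed, and how quickly their number grows, can be sketched in a few lines of Python. This is only an illustration: the variable names and the helper `n_treatments` are hypothetical, and the factor levels are those of the examples above.

```python
from itertools import product
from math import prod

# Factor levels of the rooting example: 2 hormones x 2 doses.
factors = {
    "hormone": ["NAA", "IBA"],
    "dose_ppm": [10, 20],
}

# Each treatment is one combination of one level from every factor.
treatments = list(product(*factors.values()))
for number, combo in enumerate(treatments, start=1):
    print(number, dict(zip(factors, combo)))


def n_treatments(levels_per_factor):
    """Number of treatments = product of the numbers of levels."""
    return prod(levels_per_factor)


print(n_treatments([2, 2]))     # 2 x 2 factorial
print(n_treatments([5, 4, 3]))  # clones x espacements x weed-control methods
print(n_treatments([30, 3]))    # 30 clones x 3 nitrogen levels
print(n_treatments([5, 3]))     # 5 selected clones x 3 nitrogen levels
```

The enumeration reproduces the four treatment combinations of the rooting example, and the counts are the 4, 60, 90 and 15 treatments quoted in the text.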

The effect of a factor is defined to be the average change in response produced by a change in the level of that factor. This is frequently called the main effect. For example, consider the data in Table 4.12.

Table 4.12. Data from a 2x2 factorial experiment

		Factor B
	Level	b₁	b₂

	a₁	20	30
Factor A
	a₂	40	52

The main effect of factor A could be thought of as the difference between the average response at the second level of A and the average response at the first level of A. Numerically, this is

$A = \frac{40 + 52}{2} - \frac{20 + 30}{2} = 21$

That is, increasing factor A from level 1 to level 2 causes an average increase in the response of 21 units. Similarly, the main effect of B is

$B = \frac{30 + 52}{2} - \frac{20 + 40}{2} = 11$

If the factors appear at more than two levels, the above procedure must be modified since there are many ways to express the differences between the average responses.
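For two-level factors, the main-effect calculation can be written out as a small Python sketch. The function names are hypothetical; the cell values are those of Table 4.12, and the effect is taken as the average response at the second level minus the average response at the first level.

```python
# Cell responses from Table 4.12, keyed by (level of A, level of B).
table_4_12 = {
    ("a1", "b1"): 20, ("a1", "b2"): 30,
    ("a2", "b1"): 40, ("a2", "b2"): 52,
}


def level_mean(cells, factor_index, level):
    """Average response over all cells in which the given factor is at `level`."""
    values = [y for key, y in cells.items() if key[factor_index] == level]
    return sum(values) / len(values)


def main_effect(cells, factor_index, low, high):
    """Average response at the high level minus average response at the low level."""
    return level_mean(cells, factor_index, high) - level_mean(cells, factor_index, low)


effect_a = main_effect(table_4_12, 0, "a1", "a2")
effect_b = main_effect(table_4_12, 1, "b1", "b2")
print("Main effect of A:", effect_a)
print("Main effect of B:", effect_b)
```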

The major advantage of conducting a factorial experiment is the gain in information on interaction between factors. In some experiments, we may find that the difference in response between the levels of one factor is not the same at all levels of the other factors. When this occurs, there is an interaction between the factors. For example, consider the data in Table 4.13.

Table 4.13. Data from a 2x2 factorial experiment

		Factor B
	Levels	b₁	b₂

	a₁	20	40
Factor A
	a₂	50	12

At the first level of factor B, the factor A effect is

A = 50-20 = 30

and at the second level of factor B, the factor A effect is

A = 12-40 = -28

Since the effect of A depends on the level chosen for factor B, we see that there is interaction between A and B.
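The same comparison can be sketched in Python. The names below are hypothetical, and the "difference between simple effects" is used here only as an informal indicator of interaction, not as a test of significance.

```python
# Cell responses from Table 4.13, keyed by (level of A, level of B).
table_4_13 = {
    ("a1", "b1"): 20, ("a1", "b2"): 40,
    ("a2", "b1"): 50, ("a2", "b2"): 12,
}


def simple_effect_of_a(cells, b_level):
    """Effect of moving from a1 to a2 at one fixed level of factor B."""
    return cells[("a2", b_level)] - cells[("a1", b_level)]


a_at_b1 = simple_effect_of_a(table_4_13, "b1")  # 50 - 20
a_at_b2 = simple_effect_of_a(table_4_13, "b2")  # 12 - 40

# Averaging the two simple effects gives the main effect of A.
main_effect_a = (a_at_b1 + a_at_b2) / 2

# With no interaction the two simple effects would be equal,
# so their difference would be zero.
difference_between_simple_effects = a_at_b2 - a_at_b1

print(a_at_b1, a_at_b2, main_effect_a, difference_between_simple_effects)
```

The two simple effects are large and of opposite sign, yet their average is close to zero, which is why the averaged effect alone says little when the factors interact strongly.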

These ideas may be illustrated graphically. Figure 4.5 plots the response data in Table 4.12 against factor A for both levels of factor B.

Figure 4.5. Graphical representation of lack of interaction between factors.

Note that the b₁ and b₂ lines are approximately parallel, indicating a lack of interaction between factors A and B.

Similarly, Figure 4.6 plots the response data in Table 4.13. Here we see that the b₁ and b₂ lines are not parallel. This indicates an interaction between factors A and B. Graphs such as these are frequently very useful in interpreting significant interactions and in reporting the results to nonstatistically trained management. However, they should not be utilized as the sole technique of data analysis because their interpretation is subjective and their appearance is often misleading.

Figure 4.6. Graphical representation of interaction between factors.

Note that when an interaction is large, the corresponding main effects have little practical meaning. For the data of Table 4.13, we would estimate the main effect of A to be

$A = \frac{50 + 12}{2} - \frac{20 + 40}{2} = 1$

which is very small, and we are tempted to conclude that there is no effect due to A. However, when we examine the effects of A at different levels of factor B, we see that this is not the case. Factor A has an effect, but it depends on the level of factor B; that is, a significant interaction will often mask the significance of main effects. In the presence of significant interaction, the experimenter must usually examine the levels of one factor, say A, with the levels of the other factors fixed, to draw conclusions about the main effect of A.

For most factorial experiments, the number of treatments is usually too large for an efficient use of a complete block design. There are, however, special types of designs developed specifically for large factorial experiments, such as confounded designs. Descriptions of the use of such designs can be found in Das and Giri (1980).

4.4.1. Analysis of variance

Any of the complete block designs discussed in sections 4.2 and 4.3 for single-factor experiments is applicable to a factorial experiment. The procedures for randomization and layout of the individual designs are directly applicable by simply ignoring the factor composition of the factorial treatments and considering all the treatments as if they were unrelated. For the analysis of variance, the computations discussed for individual designs are also directly applicable. However, additional computational steps are required to partition the treatment sum of squares into factorial components corresponding to the main effects of individual factors and to their interactions. The procedure for such partitioning is the same for all complete block designs and is, therefore, illustrated for only one case, namely, that of RCBD.

The step-by-step procedure for the analysis of variance of a two-factor experiment on bamboo involving two levels of spacing (Factor A) and three levels of age at planting (Factor B) laid out in RCBD with three replications is illustrated here. The list of the six factorial treatment combinations is shown in Table 4.14, the experimental layout in Figure 4.7, and the data in Table 4.15.

Table 4.14. The 2 x 3 factorial treatment combinations of two levels of spacing and three levels of age.

Age at planting	Spacing (m)
(month)	10 m x 10 m	12 m x 12 m
	(a₁)	(a₂)
6 (b₁)	a₁b₁	a₂b₁
12 (b₂)	a₁b₂	a₂b₂
24 (b₃)	a₁b₃	a₂b₃

Replication I	Replication II	Replication III
a₂b₃	a₂b₃	a₁b₂
a₁b₃	a₁b₂	a₁b₁
a₁b₂	a₁b₃	a₂b₂
a₂b₁	a₂b₁	a₁b₃
a₁b₁	a₂b₂	a₂b₁
a₂b₂	a₁b₁	a₂b₃

Figure 4.7. A sample layout of a 2 x 3 factorial experiment involving two levels of spacing and three levels of age in an RCBD with 3 replications.

Table 4.15. Mean maximum culm height of Bambusa arundinacea tested with three age levels and two levels of spacing in an RCBD.

Treatment	Maximum culm height of a clump (cm)			Treatment
combination	Rep. I	Rep. II	Rep. III	total (T_ij)
a₁b₁	46.50	55.90	78.70	181.10
a₁b₂	49.50	59.50	78.70	187.70
a₁b₃	127.70	134.10	137.10	398.90
a₂b₁	49.30	53.20	65.30	167.80
a₂b₂	65.50	65.00	74.00	204.50
a₂b₃	67.90	112.70	129.00	309.60
Replication total (R_k)	406.40	480.40	562.80	G = 1449.60

Step 1. Denote the number of replications by r, the number of levels of factor A (i.e., spacing) by a, and that of factor B (i.e., age) by b. Construct the outline of the analysis of variance as follows:

Table 4.16. Schematic representation of ANOVA of a factorial experiment with two levels of factor A, three levels of factor B and with three replications in RCBD.

Source of variation	Degrees of freedom (df)	Sum of squares (SS)	Mean square (MS)	Computed F
Replication	r-1	SSR	MSR	MSR/MSE
Treatment	ab-1	SST	MST	MST/MSE
A	a-1	SSA	MSA	MSA/MSE
B	b-1	SSB	MSB	MSB/MSE
AB	(a-1)(b-1)	SSAB	MSAB	MSAB/MSE
Error	(r-1)(ab-1)	SSE	MSE
Total	rab-1	SSTO
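As a quick check on the outline, the degrees of freedom can be computed for any r, a and b. The function name `anova_df` is hypothetical; the formulas are those of Table 4.16.

```python
def anova_df(r, a, b):
    """Degrees of freedom for a two-factor factorial experiment in RCBD."""
    df = {
        "Replication": r - 1,
        "Treatment": a * b - 1,
        "A": a - 1,
        "B": b - 1,
        "AB": (a - 1) * (b - 1),
        "Error": (r - 1) * (a * b - 1),
        "Total": r * a * b - 1,
    }
    # The treatment df splits into A, B and AB, and the
    # replication, treatment and error df add up to the total df.
    assert df["A"] + df["B"] + df["AB"] == df["Treatment"]
    assert df["Replication"] + df["Treatment"] + df["Error"] == df["Total"]
    return df


df = anova_df(r=3, a=2, b=3)
for source, value in df.items():
    print(f"{source:12s} {value}")
```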

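Step 2. Compute the treatment totals (T_ij), replication totals (R_k), and the grand total (G), as shown in Table 4.15, and compute SSTO, SSR, SST and SSE following the procedure described in Section 4.3.3. Let y_ijk refer to the observation corresponding to the ith level of factor A and the jth level of factor B in the kth replication.

$C.F. = \frac{G^2}{rab} = \frac{(1449.60)^2}{(3)(2)(3)} = 116741.12$ (4.22)

$SSTO = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{r} y_{ijk}^2 - C.F.$ (4.23)

= 17479.10

$SSR = \frac{\sum_{k=1}^{r} R_k^2}{ab} - C.F.$ (4.24)

= 2040.37

$SST = \frac{\sum_{i=1}^{a} \sum_{j=1}^{b} T_{ij}^2}{r} - C.F.$ (4.25)

= 14251.87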

SSE = SSTO - SSR - SST (4.26)

= 17479.10 - 2040.37 - 14251.87

= 1186.86
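The computations of Step 2 can be reproduced with a short Python sketch. It assumes the usual correction-factor form of the sums of squares (C.F. = G²/rab); the variable names are hypothetical and the data are those of Table 4.15.

```python
# Culm heights (cm) from Table 4.15: treatment -> [Rep. I, Rep. II, Rep. III].
data = {
    ("a1", "b1"): [46.50, 55.90, 78.70],
    ("a1", "b2"): [49.50, 59.50, 78.70],
    ("a1", "b3"): [127.70, 134.10, 137.10],
    ("a2", "b1"): [49.30, 53.20, 65.30],
    ("a2", "b2"): [65.50, 65.00, 74.00],
    ("a2", "b3"): [67.90, 112.70, 129.00],
}
r = 3        # replications
a, b = 2, 3  # levels of spacing (A) and age (B)

treatment_totals = {t: sum(ys) for t, ys in data.items()}            # T_ij
rep_totals = [sum(ys[k] for ys in data.values()) for k in range(r)]   # R_k
G = sum(rep_totals)                                                   # grand total

cf = G ** 2 / (r * a * b)                                             # correction factor
ssto = sum(y ** 2 for ys in data.values() for y in ys) - cf
ssr = sum(R ** 2 for R in rep_totals) / (a * b) - cf
sst = sum(T ** 2 for T in treatment_totals.values()) / r - cf
sse = ssto - ssr - sst

for name, value in [("C.F.", cf), ("SSTO", ssto), ("SSR", ssr),
                    ("SST", sst), ("SSE", sse)]:
    print(f"{name:5s} {value:12.2f}")
```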

The preliminary analysis of variance is shown in Table 4.17.

Table 4.17. Preliminary analysis of variance for data in Table 4.15.

Source of variation	Degrees of freedom	Sum of squares	Mean square	Computed F	Tabular F 5%
Replication	2	2040.37	1020.187	8.60*	4.10
Treatment	5	14251.87	2850.373	24.02*	3.33
Error	10	1186.86	118.686
Total	17	17479.10

*Significant at 5% level.

Step 3. Construct the factor A x factor B two-way table of totals, with factor A totals and factor B totals computed. For our example, the Spacing x Age table of totals (AB), with Spacing totals (A) and Age totals (B) computed, is shown in Table 4.18.

Table 4.18. The Spacing x Age table of totals for the data in Table 4.15.

Age	Spacing			Total
	a₁	a₂	(B_j)
b₁	181.10	167.80	348.90
b₂	187.70	204.50	392.20
b₃	398.90	309.60	708.50
Total (A_i)	767.70	681.90	G = 1449.60

Step 4. Compute the three factorial components of the treatment sum of squares as:

$SSA = \frac{\sum_{i=1}^{a} A_i^2}{rb} - C.F.$ (4.27)

= 408.98

$SSB = \frac{\sum_{j=1}^{b} B_j^2}{ra} - C.F.$ (4.28)

= 12846.26

SSAB = SST - SSA - SSB (4.29)

= 14251.87 - 408.98 - 12846.26

= 996.62
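The partitioning in Step 4 can be sketched in Python starting from the two-way table of totals. The sketch assumes the correction-factor form of the sums of squares, with C.F. = G²/rab; variable names are hypothetical and the totals are those of Table 4.18.

```python
# Treatment totals T_ij from Table 4.18:
# rows = age levels b1..b3, columns = spacing levels a1, a2.
totals = [
    [181.10, 167.80],
    [187.70, 204.50],
    [398.90, 309.60],
]
r = 3                 # replications
b = len(totals)       # levels of age (B)
a = len(totals[0])    # levels of spacing (A)

A = [sum(row[i] for row in totals) for i in range(a)]  # spacing totals A_i
B = [sum(row) for row in totals]                       # age totals B_j
G = sum(B)                                             # grand total
cf = G ** 2 / (r * a * b)                              # correction factor

sst = sum(T ** 2 for row in totals for T in row) / r - cf
ssa = sum(x ** 2 for x in A) / (r * b) - cf
ssb = sum(x ** 2 for x in B) / (r * a) - cf
ssab = sst - ssa - ssb                                 # interaction by subtraction

print(f"SSA  = {ssa:.2f}")
print(f"SSB  = {ssb:.2f}")
print(f"SSAB = {ssab:.2f}")
```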

Step 5. Compute the mean square for each source of variation by dividing each sum of squares by its corresponding degrees of freedom, and obtain the F ratios for each of the three factorial components as per the scheme given in Table 4.16.

Step 6. Enter all values obtained in Steps 3 to 5 in the preliminary analysis of variance of Step 2, as shown in Table 4.19.

Table 4.19. ANOVA of data in Table 4.15 from a 2 x 3 factorial experiment in RCBD.

Source of variation	Degrees of freedom	Sum of squares	Mean square	Computed F	Tabular F 5%
Replication	2	2040.37	1020.187	8.60*	4.10
Treatment	5	14251.87	2850.373	24.02*	3.33
A	1	408.98	408.980	3.45	4.96
B	2	12846.26	6423.132	54.12*	4.10
AB	2	996.62	498.312	4.20*	4.10
Error	10	1186.86	118.686
Total	17	17479.10

*Significant at 5% level.

Step 7. Compare each of the computed F values with the tabular F value obtained from Appendix 3, with f₁ = df of the numerator MS and f₂ = df of the denominator MS, at the desired level of significance. For example, the computed F value for the main effect of factor A is compared with the tabular F value (with f₁ = 1 and f₂ = 10 degrees of freedom) of 4.96 at the 5% level of significance. The result indicates that the main effect of factor A (spacing) is not significant at the 5% level of significance.
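Steps 5 to 7 can be sketched as follows. The sums of squares are those computed in Steps 2 and 4, and the tabular F values (4.96 for 1 and 10 df, 4.10 for 2 and 10 df) are the 5% values quoted in this section; the dictionary layout and names are hypothetical.

```python
sse, error_df = 1186.86, 10
error_ms = sse / error_df

# source -> (sum of squares, degrees of freedom, tabular F at the 5% level)
components = {
    "A (spacing)": (408.98, 1, 4.96),
    "B (age)": (12846.26, 2, 4.10),
    "AB": (996.62, 2, 4.10),
}

results = {}
for source, (ss, df, f_tab) in components.items():
    ms = ss / df                 # mean square
    f = ms / error_ms            # computed F ratio against the error mean square
    significant = f > f_tab      # compare with the tabular value
    results[source] = (round(f, 2), significant)
    flag = "significant" if significant else "not significant"
    print(f"{source:12s} MS = {ms:9.3f}  F = {f:6.2f}  ({flag} at 5%)")
```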

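Step 8. Compute the coefficient of variation as:

$cv = \frac{\sqrt{\text{Error MS}}}{\text{Grand mean}} \times 100$ (4.30)

$= \frac{\sqrt{118.686}}{80.53} \times 100 = 13.53\%$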
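A minimal numerical sketch of Step 8, assuming the usual definition of the coefficient of variation as the square root of the error mean square expressed as a percentage of the grand mean. The names are hypothetical; the inputs come from Tables 4.15 and 4.19.

```python
from math import sqrt

error_ms = 118.686       # Error MS from the analysis of variance
grand_total = 1449.60    # G from Table 4.15
n_observations = 18      # r x a x b = 3 x 2 x 3 plots

grand_mean = grand_total / n_observations
cv = sqrt(error_ms) / grand_mean * 100

print(f"Grand mean = {grand_mean:.2f} cm, cv = {cv:.2f}%")
```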

Comparison of means

In a factorial experiment, comparisons of means are of different types. For example, a 2 x 3 factorial experiment has four types of means that can be compared.

Type-(1) The two A means, averaged over all three levels of factor B
Type-(2) The three B means, averaged over both levels of factor A
Type-(3) The six A means, two means at each of the three levels of factor B
Type-(4) The six B means, three means at each of the two levels of factor A

The Type-(1) mean is an average of 3r observations, the Type-(2) mean is an average of 2r observations, and the Type-(3) or Type-(4) mean is an average of r observations. Thus, the formula $s_d = \sqrt{2s^2/r}$ for the standard error of the difference between two means, where $s^2$ is the error mean square, is appropriate only for mean differences involving either Type-(3) or Type-(4) means. For Type-(1) and Type-(2) means, the divisor r in the formula should be replaced by 3r and 2r respectively. That is, to compare two A means averaged over all levels of factor B, the $s_d$ value is computed as $\sqrt{2s^2/(3r)}$, and to compare any pair of B means averaged over all levels of factor A, the $s_d$ value is computed as $\sqrt{2s^2/(2r)}$ or simply $\sqrt{s^2/r}$.
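The effect of the divisor can be seen in a small Python sketch. It assumes the standard error of a difference between two means takes the form √(2s²/n), with n the number of observations behind each mean; the function name is hypothetical, and the error mean square of 118.686 is the one obtained for the bamboo example in Table 4.19.

```python
from math import sqrt


def se_mean_difference(error_ms, n_per_mean):
    """Standard error of the difference between two means,
    each of which is an average of n_per_mean observations."""
    return sqrt(2 * error_ms / n_per_mean)


error_ms, r, a, b = 118.686, 3, 2, 3

sd_type1 = se_mean_difference(error_ms, b * r)   # A means, each over 3r plots
sd_type2 = se_mean_difference(error_ms, a * r)   # B means, each over 2r plots
sd_type34 = se_mean_difference(error_ms, r)      # cell means, each over r plots

print(f"Type-(1): {sd_type1:.3f}")
print(f"Type-(2): {sd_type2:.3f}")
print(f"Type-(3)/(4): {sd_type34:.3f}")
```

Means based on more observations are compared with a smaller standard error, which is why the divisor must match the type of mean being compared.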

As an example, consider the 2 x 3 factorial experiment whose data are shown in Table 4.15. The analysis of variance shows a significant interaction between spacing and age, indicating that the effect of age varies with the change in spacing. Hence, comparison between age means averaged over all levels of spacing, or between spacing means averaged over all age levels, is not useful. The more appropriate mean comparisons are those between age means under the same level of spacing or between spacing means at the same level of age. The comparison between spacing means at the same age level is illustrated in the following. The steps involved in the computation of the LSD for comparing two spacing means at the same age level are:

Step 1. Compute the standard error of the mean difference following the formula for comparison Type-(3) as

$s_d = \sqrt{\frac{2\,\text{Error MS}}{r}}$ (4.31)

$= \sqrt{\frac{2(118.686)}{3}} = 8.895$ cm

where the Error MS value of 118.686 is obtained from the analysis of variance of Table 4.19.

Step 2. From Appendix 2, obtain the tabular t value for error df (10 df), which is 2.23 at the 5% level of significance, and compute the LSD as

$LSD = t \times s_d = (2.23)(8.895) = 19.84$ cm

Step 3. Construct the Spacing x Age two-way table of means as shown in Table 4.20. For each pair of spacing levels to be compared at the same age level, compute the mean difference and compare it with the LSD value obtained at Step 2. For example, the mean difference in culm height between the two spacing levels at the age level of 12 months at planting is 5.6 cm. Because this mean difference is smaller than the LSD value at the 5% level of significance, it is not significant.
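The three steps can be sketched together in Python. The sketch assumes LSD = t × s_d with s_d = √(2 × Error MS / r); the variable names are hypothetical, the treatment totals are those of Table 4.15, and the tabular t value of 2.23 is the one quoted in Step 2.

```python
from math import sqrt

error_ms, r = 118.686, 3
t_tab = 2.23                      # tabular t, 10 error df, 5% level

sd = sqrt(2 * error_ms / r)       # standard error of the mean difference
lsd = t_tab * sd

# Treatment totals: age at planting (months) -> (10 m x 10 m, 12 m x 12 m).
totals = {
    6: (181.10, 167.80),
    12: (187.70, 204.50),
    24: (398.90, 309.60),
}

comparisons = {}
for age, (total_a1, total_a2) in totals.items():
    # Difference between the two spacing means at this age level.
    difference = abs(total_a1 - total_a2) / r
    comparisons[age] = (round(difference, 2), difference > lsd)

print(f"s_d = {sd:.3f} cm, LSD(5%) = {lsd:.2f} cm")
for age, (difference, significant) in comparisons.items():
    verdict = "significant" if significant else "not significant"
    print(f"Age {age:2d} months: difference = {difference:.2f} cm ({verdict})")
```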

Table 4.20. The Spacing x Age table of means of culm height based on data in Table 4.15.

Age at planting	Spacing (m)
(month)	10 m x 10 m	12 m x 12 m
	Mean culm height (cm)
6	60.37	55.93
12	62.57	68.17
24	132.97	103.20

4.5. Fractional factorial design

In a factorial experiment, as the number of factors to be tested increases, the complete set of factorial treatments may become too large to be tested simultaneously in a single experiment. A logical alternative is an experimental design that allows testing of only a fraction of the total number of treatments. A design uniquely suited for experiments involving a large number of factors is the fractional factorial design (FFD). It provides a systematic way of selecting and testing only a fraction of the complete set of factorial treatment combinations. In exchange, however, there is a loss of information on some pre-selected effects. Although this information loss may be serious in experiments with one or two factors, such a loss becomes more tolerable with a large number of factors. The number of interaction effects increases rapidly with the number of factors involved, which allows flexibility in the choice of the particular effects to be sacrificed. In fact, in cases where some specific effects are known beforehand to be small or unimportant, use of the FFD results in minimal loss of information.

In practice, the effects most commonly sacrificed by use of the FFD are high-order interactions: the four-factor or five-factor interactions and, at times, even the three-factor interactions. In almost all cases, unless the researcher has prior information to indicate otherwise, the set of treatments to be tested should be selected so that all main effects and two-factor interactions can be estimated. In forestry research, the FFD is to be used in exploratory trials where the main objective is to examine the interactions between factors. For such trials, the most appropriate FFDs are those that sacrifice only those interactions that involve more than two factors.

With the FFD, the number of effects that can be measured decreases rapidly with the reduction in the number of treatments to be tested. Thus, when the number of effects to be measured is large, the number of treatments to be tested, even with the use of FFD, may still be too large. In such cases, further reduction in the size of the experiment can be achieved by reducing the number of replications. Although the use of FFD without replication is uncommon in forestry experiments, when FFD is applied to exploratory trials, the number of replications required can be reduced to the minimum.

Another desirable feature of the FFD is that it allows reduced block size by not requiring a block to contain all treatments to be tested. In this way, the homogeneity of experimental units within the same block can be improved. A reduction in block size is, however, accompanied by loss of information in addition to that already lost through the reduction in the number of treatments. Although the FFD can thus be tailor-made to fit most factorial experiments, the procedure for doing so is complex, and so only a particular class of FFD that is suited for exploratory trials in forestry research is described here. The major features of these selected designs are that they (i) apply only to 2ⁿ factorial experiments where n, the number of factors, is at least 5, (ii) involve only one half of the complete set of factorial treatment combinations, denoted by 2ⁿ⁻¹, and (iii) allow all main effects and two-factor interactions to be estimated. For more complex plans, reference may be made to Das and Giri (1980).

The procedure for layout and analysis of variance of a 2⁵⁻¹ FFD with a field experiment involving five factors A, B, C, D and E is illustrated in the following. In the designation of the various treatment combinations, the letters a, b, c,…, are used to denote the presence (or high level) of factors A, B, C,… Thus the treatment combination ab in a 2⁵ factorial experiment refers to the treatment combination that contains the high level (or presence) of factors A and B and the low level (or absence) of factors C, D and E, but this same notation (ab) in a 2⁶ factorial experiment would refer to the treatment combination that contains the high level of factors A and B and the low level of factors C, D, E, and F. In all cases, the treatment combination that consists of the low level of all factors is denoted by the symbol (1).
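The notation can be made concrete with a small Python sketch that expands a treatment label into the level of every factor. The function name `decode` and the "high"/"low" strings are hypothetical choices made only for illustration.

```python
def decode(label, factors):
    """Map a treatment label such as 'ab' to the level of every factor.

    `factors` lists the lower-case letters of all factors in the experiment;
    the label '1' or '(1)' means every factor is at its low level.
    """
    present = set() if label in ("1", "(1)") else set(label)
    return {f.upper(): ("high" if f in present else "low") for f in factors}


print(decode("ab", "abcde"))    # 2^5 experiment: C, D and E at the low level
print(decode("ab", "abcdef"))   # 2^6 experiment: C, D, E and F at the low level
print(decode("(1)", "abcde"))   # all five factors at the low level
```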

4.5.1. Construction of the design and layout

One simple way to arrive at the desired fraction of factorial combinations in a 2⁵⁻¹ FFD is to utilize the finding that in a 2⁵ factorial trial, the effect ABCDE can be estimated from the expression arising from the expansion of the term (a-1)(b-1)(c-1)(d-1)(e-1), which is

(a-1)(b-1)(c-1)(d-1)(e-1) = abcde - acde - bcde + cde - abde + ade + bde - de

- abce + ace + bce - ce + abe - ae - be + e

- abcd + acd + bcd - cd + abd - ad - bd + d

+ abc - ac - bc + c - ab + a + b - 1

Based on the signs (positive or negative) attached to the treatments in this expression, two groups of treatments can be formed out of the complete factorial set. Retaining only one set with either negative or positive signs, we get a half fraction of the 2⁵ factorial experiment. The two sets of treatments are shown below.

Treatments with negative signs	Treatments with positive signs
acde, bcde, abde, de, abce, ce, ae, be,	abcde, cde, ade, bde, ace, bce, abe, e,
abcd, cd, ad, bd, ac, bc, ab, 1	acd, bcd, abd, d, abc, c, a, b

As a consequence of the reduction in the number of treatments included in the experiment, we shall not be able to estimate the effect ABCDE using the fractional set. All main effects and two-factor interactions can be estimated under the assumption that all three-factor and higher order interactions are negligible. The procedure is generalizable in the sense that in a 2⁶ experiment, a half fraction can be taken by retaining the treatments with either negative or positive signs in the expansion of (a-1)(b-1)(c-1)(d-1)(e-1)(f-1).
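The splitting rule can be sketched in Python. In the expansion of (a-1)(b-1)…, a term containing k of the n letters takes "-1" from the remaining n - k brackets, so its sign is (-1)^(n-k). The function name `half_fractions` is hypothetical.

```python
from itertools import combinations


def half_fractions(letters):
    """Split the 2^n treatment combinations by their sign in the
    expansion of (a-1)(b-1)... over the given factor letters."""
    n = len(letters)
    negative, positive = [], []
    for k in range(n + 1):
        for combo in combinations(letters, k):
            label = "".join(combo) or "1"   # "1": all factors at the low level
            sign = (-1) ** (n - k)          # "-1" taken from n - k brackets
            (positive if sign > 0 else negative).append(label)
    return negative, positive


negative, positive = half_fractions("abcde")
print("Negative signs:", ", ".join(negative))
print("Positive signs:", ", ".join(positive))

# The same rule gives two half fractions of 32 treatments each for 2^6.
neg6, pos6 = half_fractions("abcdef")
print(len(neg6), len(pos6))
```

For five factors, the negative-sign set consists of the combinations with four, two or no letters, and the positive-sign set of those with five, three or one letter.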

The FFD refers only to a way of selecting treatments with a factorial structure, and the resulting factorial combinations can be taken as a set of treatments for the physical experiment to be laid out in any standard design like CRD or RCBD. A sample randomized layout for a 2⁵⁻¹ FFD under RCBD with two replications is shown in Figure 4.8.

Figure 4.8. A sample layout of a 2⁵⁻¹ FFD with two replications (Replication I and Replication II) under RCBD.
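A randomized layout of this kind can be generated with a short Python sketch, in which each replication is treated as a complete block containing all 16 retained treatments in random order. The function name `rcbd_layout` and the seed are hypothetical, and the negative-sign half fraction is assumed to be the set retained.

```python
import random

# Half fraction retained for the experiment (treatments with negative signs).
half_fraction = ["acde", "bcde", "abde", "de", "abce", "ce", "ae", "be",
                 "abcd", "cd", "ad", "bd", "ac", "bc", "ab", "1"]


def rcbd_layout(treatments, n_replications, seed=None):
    """Randomize the full treatment set independently within each replication."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_replications):
        block = list(treatments)
        rng.shuffle(block)          # separate randomization for every block
        layout.append(block)
    return layout


layout = rcbd_layout(half_fraction, n_replications=2, seed=1)
for number, block in enumerate(layout, start=1):
    print(f"Replication {number}:", " ".join(block))
```

Fixing the seed only makes the sketch reproducible; in practice a fresh randomization would be drawn for each experiment.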

4.5.2. Analysis of variance

The analysis of variance procedure for a 2⁵⁻¹ FFD with 2 replications is illustrated using Yates' method for the computation of sums of squares. This is a method suitable for manual computation of large factorial experiments. Alternatively, the standard rules for the computation of sums of squares in the analysis of variance, by constructing one-way tables of totals for computing main effects, two-way tables of totals for two-factor interactions, and so on, as illustrated in Section 4.4.1, can also be adopted in this case.

The analysis of a 2⁵⁻¹ FFD is illustrated using hypothetical data from a trial whose layout is shown in Figure 4.8, which conforms to that of an RCBD. The response obtained in terms of fodder yield (t/ha) under the different treatment combinations is given in Table 4.21. The five factors were related to different components of a soil management scheme involving application of organic matter, fertilizers, herbicides, water, and lime.

Table 4.21. Fodder yield data from a 2⁵⁻¹ factorial experiment

Treatment combination	Fodder yield (t/ha)		Treatment total (T_i)
	Replication I	Replication II
acde	1.01	1.04	2.06
bcde	1.01	0.96	1.98
abde	0.97	0.94	1.92
de	0.82	0.75	1.58
abce	0.92	0.95	1.88
ce	0.77	0.75	1.53
ae	0.77	0.77	1.55
be	0.76	0.80	1.57
abcd	0.97	0.99	1.97
cd	0.92	0.88	1.80
ad	0.80	0.87	1.68
bd	0.82	0.80	1.63
ac	0.91	0.87	1.79
bc	0.79	0.76	1.55
ab	0.86	0.87	1.74
1	0.73	0.69	1.42
Replication total (R_j)	13.83	13.69
Grand total (G)			27.52
