Population improvement

CHAPTER 3
Selection Indices for Population Improvement Programmes

Isaias Olivio Geraldi

Department of Genetics, Escola Superior de Agricultura ‘Luiz de Queiroz’ (ESALQ), Universidade de São
Paulo (USP), Caixa Postal 83, 13400-970 Piracicaba, SP, Brazil.
E-mail: [email protected]

Abstract

In population improvement programmes, several traits are usually considered at the same time. Because genetic correlation occurs, traits cannot be selected on an individual basis, as improving one may change the means of the others in an undesirable way. The alternative is to use selection indices, which evaluate the total genotypic value of individuals (or families) for several traits. In the past, plant breeders used selection indices for allogamous species and, only recently, have been using them for autogamous species. To obtain selection indices, families from the base population must be evaluated to estimate genetic and phenotypic parameters such as heritability and genetic correlation coefficients for the set of traits being considered. For example, in an experiment based on progeny from half-sib families of maize, the traits tassel branch number (NB) and grain yield (GY) were evaluated and found to be highly and negatively correlated (r_G = -0.719). Results showed that response to selection based on indices would be about 30% higher than for selection based on individual traits (selection for higher GY or lower NB) for yield improvement, thus demonstrating the advantage of selection indices over individual-trait selection. We strongly recommend that rice breeders (1) consider selection indices in breeding programmes for traits such as yield, grain quality and disease resistance; and (2) carry out basic studies to estimate genetic and phenotypic parameters, and thus discover the relationships among traits of agronomic and economic importance.

Resumen

In population improvement programmes, several traits are generally considered at the same time. Because genetic correlation occurs, traits cannot be selected individually, since improving one could change the means of others in an undesirable way. The alternative is to use selection indices, which evaluate the total genotypic value of individuals (families) for several traits. Selection indices were used in the past in breeding programmes for allogamous species, but only recently has their use been considered for autogamous species. To obtain selection indices, progenies from the base population must be evaluated in order to estimate genetic and phenotypic parameters such as heritability and genetic correlation coefficients for the set of traits considered. As an example, an experiment based on progenies of half-sib families of maize was used; these were evaluated for tassel branch number (NB) and grain yield (GY), which were highly and negatively correlated (r_G = -0.719). The results show that the response to selection based on indices would be about 30% higher than that based on the individual traits (selection for higher GY or lower NB) for improving grain yield, clearly showing the advantage of selection indices over individual-trait selection. It is strongly recommended that rice breeders consider selection indices in breeding programmes for traits such as grain yield, grain quality and disease resistance. Breeders are also advised to carry out basic studies to estimate genetic and phenotypic parameters in order to understand the relationships among traits of agronomic and economic importance in rice.

Introduction

In population improvement programmes, several traits with varying degrees of agronomic and economic importance are usually considered at the same time. Because genetic correlations exist between traits, they cannot be selected independently of one another: improving one trait changes the others, often in the direction opposite to that desired. In rice, for example, if selection is based only on the number of grains per panicle, the best families will tend to have smaller panicles or lighter grains.

The alternative methodology available to plant breeders to counteract this type of problem comprises selection indices (Hazel, 1943). The importance of these lies in evaluating the merits of individuals or families in terms of several traits. Selection based on indices permits maximizing the response to selection for one or a group of traits. In reality, selection based on indices reflects not only the response with direct selection, but also the correlated response as selection is practised for other traits.

In this chapter, we describe the basic principles for constructing and using selection indices and how to calculate the expected response to selection when these principles are applied. A practical example of their use in population improvement is also presented, together with some general considerations for plant breeders who want to exploit the genetic resources of rice, using population improvement methods.

Basic principles of using selection indices

To construct the indices, we must first know the magnitude of the genetic and phenotypic correlations between the traits. As a result, we must use some type of genetic design (e.g. family) and experimental design that permit obtaining estimates of genetic and phenotypic variances and covariances among the traits and, consequently, their coefficients of heritability and genetic and phenotypic correlations. For the specific case of population improvement through recurrent selection, we need to estimate the additive components of genetic variance and covariance because these are directly related to the response to selection. The commonest genetic design used to obtain such estimates comprises the families of half-sibs, which permit estimating the parameters just mentioned.

In population improvement programmes, selection is based on the phenotypic evaluation of several traits, usually expressed as means over several replications. Replication minimizes the influence of experimental error and thus increases the precision of those means. That is, one can say that the mean error tends to zero ($\bar{e} \to 0$) and that, as a result, the phenotypic mean approaches the genotypic value.

When considering several traits at the same time, the genotypic value (G_j) of each family for the various traits should be obtained (Baker, 1986), that is:

$$G_j = \sum_{i=1}^{t} a_i g_{ij} = a_1 g_{1j} + a_2 g_{2j} + \dots + a_t g_{tj}$$

where:

a_i corresponds to the relative weights attributed to the various traits (i = 1, 2, ..., t) according to their agronomic or economic importance

g_ij corresponds to the genotypic value of family j for trait i

G_j corresponds to the total genotypic value of family j (j = 1, 2, ..., n)

The problem is that the values of G_j are unknown; only their approximate values, the phenotypic values, are known. Hence, for each individual (or family), a selection index based on phenotypic values is obtained, that is:

$$I_j = \sum_{i=1}^{t} b_i \bar{P}_{ij} = b_1 \bar{P}_{1j} + b_2 \bar{P}_{2j} + \dots + b_t \bar{P}_{tj}$$

where:

b_i corresponds to the weights of the phenotypic values (that will be estimated)

$\bar{P}_{ij}$ corresponds to the mean phenotypic value of family j (j = 1, 2, ..., n) for trait i

Once the initial weights (a_i) are attributed to each trait, the estimates for b_i are obtained so that the correlation between G_j and I_j is maximized. That is, the value r(G_j, I_j) should be the maximum possible. Under such conditions, we have the following matrix representation:

GA = PB

where:

G is the matrix of the genetic variances and covariances

P is the matrix of the phenotypic variances and covariances

A is the vector of the weights attributed to the genotypic values

B is the vector of the weights of the phenotypic values to be estimated

The solution of this matrix system is given as:

$$B = P^{-1} G A$$

where:

P^-1 is the inverse matrix of P.

As a result, the estimated vector B generates the relative weights of the phenotypic values thus permitting us to obtain each family’s indices.
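As a minimal sketch of how the system GA = PB might be solved in practice, the Python snippet below uses Gaussian elimination instead of forming P^-1 explicitly. The function name `index_weights`, the list-of-lists layout and the three-trait matrices are all hypothetical and serve only to illustrate the calculation; they are not taken from the chapter.

```python
def index_weights(G, P, a):
    """Solve P b = G a for the phenotypic weights b (illustrative helper)."""
    n = len(a)
    # Right-hand side of the system: the vector G a.
    rhs = [sum(G[i][k] * a[k] for k in range(n)) for i in range(n)]
    # Augmented matrix [P | G a], copied so the inputs are left untouched.
    M = [list(P[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        # Partial pivoting: move the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    b = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * b[k] for k in range(i + 1, n))
        b[i] = (M[i][n] - s) / M[i][i]
    return b


# Hypothetical three-trait example (values invented for illustration only).
G = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 2.0]]   # genetic (co)variances
P = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 4.0]]   # phenotypic (co)variances
a = [1.0, 1.0, 1.0]                                       # weights a_i
b = index_weights(G, P, a)
print([round(x, 4) for x in b])
```

Solving the system directly avoids computing the inverse matrix, which is usually preferable numerically; the resulting vector is the same B as in the matrix solution.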

Response to selection based on indices

The equation for the response to selection (Rs) for a trait is given as follows (Falconer and Mackay, 1996):

$$Rs = ds \cdot \frac{\mathrm{Cov}_{PO}}{\sigma^2_{P}}$$

where:

Rs is the expected response to selection

ds is the selection differential related to the intensity of selection applied to the trait

Cov_PO is the covariance of parents and offspring, which is genetic in nature because parents and offspring are evaluated in different environments and their environmental effects are therefore not correlated. Moreover, this can be expressed as a genetic variance as it concerns selection and response in the same trait (X or Y)

$\sigma^2_{P}$ is the phenotypic variance of the selection unit (parents)

Consequently, for selection based on family means, the response is expressed as:

$$Rs = ds \cdot \frac{\sigma^2_{G}}{\sigma^2_{\bar{P}}}$$

where:

$\sigma^2_{G}$ is the genetic variance of the trait

$\sigma^2_{\bar{P}}$ is the phenotypic variance of the selection unit (parents), comprising the family means

In an analogous way, when selection is practised for trait X but the response is evaluated in another, correlated trait Y, a correlated response to selection (CRs) is obtained, that is:

$$CRs = ds \cdot \frac{\mathrm{Cov}_{PO}}{\sigma^2_{P}}$$

where:

Cov_PO is the covariance of parents and offspring, which is also genetic. This, however, corresponds to a genetic covariance, as it refers to the response in trait Y due to selection in another trait (X). Thus, we have:

$$CRs = ds \cdot \frac{\mathrm{Cov}_{G}}{\sigma^2_{P}}$$

Because this response involves two traits, and for selection based on family means, it is more correctly represented as follows:

$$Rs_{Y/X} = ds_X \cdot \frac{\mathrm{Cov}_{G_{XY}}}{\sigma^2_{\bar{P}_X}}$$

where:

Rs_Y/X is the response to selection for trait Y due to selection carried out for trait X (correlated response)

ds_X is the selection differential for trait X

$\mathrm{Cov}_{G_{XY}}$ is the genetic covariance between traits X and Y

$\sigma^2_{\bar{P}_X}$ is the phenotypic variance of the selection unit (parents), comprising the family means for trait X

However, when selection is based on indices, this equation is more complex because a set of traits is under selection (Henderson, 1963). Continuing with the same reasoning, the response to selection for trait Y, because of selection based on an index (I), is:

$$Rs_{Y/I} = ds_I \cdot \frac{\mathrm{Cov}_{G_{IY}}}{\sigma^2_{\bar{P}_I}}$$

where:

Rs_Y/I is the response to selection for trait Y due to selection for index I (correlated response)

ds_I is the selection differential for the index

$\mathrm{Cov}_{G_{IY}}$ is the genetic covariance between index I and trait Y

$\sigma^2_{\bar{P}_I}$ is the phenotypic variance of the selection unit (parents), comprising the family means for index I

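To illustrate, we will assume an index of only two traits, X and Y. The equation previously deduced for the index, with $\bar{P}_{ij}$ the mean phenotypic value of family j for trait i, is:

$$I_j = \sum_{i} b_i \bar{P}_{ij}$$

For this case of only two traits (X and Y), the equation is reduced to:

$$I_j = b_1 \bar{X}_j + b_2 \bar{Y}_j$$

where:

I_j is the index (phenotypic value) of family j

b₁ and b₂ are the weights of traits X and Y to be estimated

$\bar{X}_j$ and $\bar{Y}_j$ are the phenotypic means of traits X and Y of family j

Substituting the value of the index in the equations for $\mathrm{Cov}_{G_{IY}}$ and $\sigma^2_{\bar{P}_I}$, we have:

$$\mathrm{Cov}_{G_{IY}} = b_1 \mathrm{Cov}_{G_{XY}} + b_2 \sigma^2_{G_Y}$$

$$\sigma^2_{\bar{P}_I} = b_1^2 \sigma^2_{\bar{P}_X} + b_2^2 \sigma^2_{\bar{P}_Y} + 2 b_1 b_2 \mathrm{Cov}_{\bar{P}_{XY}}$$

and, as a result:

$$Rs_{Y/I} = ds_I \cdot \frac{b_1 \mathrm{Cov}_{G_{XY}} + b_2 \sigma^2_{G_Y}}{b_1^2 \sigma^2_{\bar{P}_X} + b_2^2 \sigma^2_{\bar{P}_Y} + 2 b_1 b_2 \mathrm{Cov}_{\bar{P}_{XY}}}$$

where:

Rs_Y/I is the response to selection for trait Y due to the selection practised for the index

ds_I is the selection differential for the index

$\mathrm{Cov}_{G_{XY}}$ is the genetic covariance between traits X and Y

$\sigma^2_{G_Y}$ is the genetic variance of trait Y

$\sigma^2_{\bar{P}_X}$ is the phenotypic variance of family means for trait X

$\sigma^2_{\bar{P}_Y}$ is the phenotypic variance of family means for trait Y

$\mathrm{Cov}_{\bar{P}_{XY}}$ is the phenotypic covariance of family means between traits X and Y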

We should take into account that, in population improvement programmes, genetic variance and covariance are components of the numerator of the response to selection. They are therefore functions of the additive genetic variances and covariances of the base population. When considering two traits, this means: $\sigma^2_{A_X}$, $\sigma^2_{A_Y}$ and $\mathrm{Cov}_{A_{XY}}$.

Thus, only the additive genetic variance and covariance are related to the response to selection. This is because the covariance of parents and offspring is a function of the additive genetic variance for direct selection, and of the additive genetic covariance for indirect selection (correlated response).

For different selection methods, we have different coefficients of parental control (c) that relate the genetic variances and covariances of families to the additive variances and covariances of the base population (Table 1).

Thus, for selection based on half-sib family means with recombination of remnant seeds, c is equal to 1/4 and, as a result:

$$Rs = ds \cdot \frac{(1/4)\,\sigma^2_{A}}{\sigma^2_{\bar{P}}}$$

For the same scheme, but with the recombination of selfed progenies (S₁), c is equal to 1/2 and, as a result:

$$Rs = ds \cdot \frac{(1/2)\,\sigma^2_{A}}{\sigma^2_{\bar{P}}}$$

and, thus, the response to selection is twice as much.

Note that, for half-sib families, the estimate of genetic variance is a direct estimate of $(1/4)\,\sigma^2_{A}$. The same is true for covariance, that is: $\mathrm{Cov}_{G} = (1/4)\,\mathrm{Cov}_{A}$.

Table 1. Coefficients of parental control (c) associated with some selection methods based on the evaluation of families with selection in the two sexes.

Family type	SU^a	RU^b	c^c
Half-sib (HS)	HS	HS	1/4
Half-sib (HS)	HS	S₁	1/2
Full-sib (FS)	FS	FS	1/2
Self-pollination (F₃)	F₃	F₃	1

a. Selection unit.
b. Recombination unit.
c. Relates the numerator of 'response to selection' to the additive variances and covariances of the base population.

All equations of response to selection were presented using the selection differential in trait units (ds). This can be substituted by the standardized selection differential (k), where $k = ds / \sigma_{\bar{P}}$ and $\sigma_{\bar{P}}$ is the standard deviation of the selection unit, which appears in the denominator. The advantage is that coefficient k can be obtained from tables for a given selection intensity and, thus, the response to selection for a trait is:

$$Rs = k \cdot \frac{\sigma^2_{G}}{\sigma_{\bar{P}}}$$

The same reasoning is valid for the other equations of response to selection already presented.
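To make the roles of k and c concrete, the sketch below computes k for truncation selection and the corresponding response under the two half-sib recombination schemes. It rests on assumptions that are not in the chapter: k is derived from the standard normal distribution for a large population (published tables for small samples give somewhat smaller values), and the helper names are hypothetical. The variance figures are the grain yield estimates the chapter reports in Table 2.

```python
from math import sqrt
from statistics import NormalDist


def standardized_differential(p):
    """k for truncation selection of the best fraction p (large normal population)."""
    nd = NormalDist()
    z = nd.inv_cdf(1.0 - p)      # truncation point on the standard normal scale
    return nd.pdf(z) / p         # mean of the selected tail, in standard deviations


def response(k, c, var_a, var_p_mean):
    """Rs = k * c * sigma2_A / sigma_Pbar."""
    return k * c * var_a / sqrt(var_p_mean)


k = standardized_differential(0.20)   # 20% of the families selected
var_a = 4 * 119.7227                  # half-sib family variance estimates (1/4) sigma2_A
var_p_mean = 315.7038                 # phenotypic variance of family means (GY)

rs_remnant = response(k, 0.25, var_a, var_p_mean)   # recombination of remnant seed
rs_selfed = response(k, 0.50, var_a, var_p_mean)    # recombination of S1 progenies
print(round(k, 2), round(rs_remnant, 2), round(rs_selfed, 2))
```

Whatever value of k is used, doubling c from 1/4 to 1/2 doubles the expected response, which is the point made in the text about recombining selfed progenies.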

An example of using an index

The data presented in the following example originate from an experiment with maize, using 40 half-sib families, but the same reasoning is valid for two traits in rice. The materials were evaluated in an experiment carried out in a randomized complete block design with three replications and linear plots of 10 m each, spaced at 1 m apart, with 50 plants after thinning. The traits evaluated were tassel branch number (NB, i.e. trait X) and grain yield (GY, i.e. trait Y). The parameters estimated from the analyses of variance for the two traits and for the analysis of covariance between them are presented in Table 2.

Table 2 shows that NB had a higher heritability coefficient (79.9%) than did GY (37.9%). The two traits were also highly and negatively correlated (r_G = - 0.719). This indicates that many loci in common are controlling these traits (i.e. pleiotropy) and that, consequently, most of the genetic variation of one is explained by variation of the other.

For breeding purposes, NB has no agronomic or economic importance and, as such, yield in the population can be increased through selection for GY (direct selection) or through selection for NB (indirect selection). Given that the two traits are highly correlated, both processes should increase GY. However, a selection index involving the two traits should be more efficient for this purpose, as the results that follow show.

Table 2. Estimates of population mean, genetic variance of families ($\hat{\sigma}^2_{G}$), phenotypic variance of family means ($\hat{\sigma}^2_{\bar{P}}$), coefficient of heritability on a half-sib family mean basis ($\hat{h}^2_{m}$, %), genetic covariance of families (Côv_G), phenotypic covariance of family means (Côv_P̄), genetic correlation (r_G) and phenotypic correlation of family means (r_P) between the traits tassel branch number (NB) and grain yield (GY).

Parameter	NB	GY	NB × GY
Mean	18.7	134.4
$\hat{\sigma}^2_{G}$	3.7568	119.7227
$\hat{\sigma}^2_{\bar{P}}$	4.6982	315.7038
$\hat{h}^2_{m}$ (%)	79.9	37.9
Côv_G			-15.2518
Côv_P̄			-9.3004
r_G			-0.719
r_P			-0.241
Unit	branches plant^-1	g plant^-1	branches × g plant^-1
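The derived parameters in Table 2 follow directly from the variance and covariance estimates. As a small check, assuming only the standard definitions (heritability as the ratio of genetic to phenotypic variance of family means, correlation as covariance over the product of standard deviations), the snippet below recomputes them; the variable names are illustrative.

```python
from math import sqrt

# Estimates copied from Table 2.
var_g = {"NB": 3.7568, "GY": 119.7227}    # genetic variance of families
var_p = {"NB": 4.6982, "GY": 315.7038}    # phenotypic variance of family means
cov_g, cov_p = -15.2518, -9.3004          # genetic and phenotypic covariances

# Heritability on a family-mean basis, in percent.
h2 = {trait: 100 * var_g[trait] / var_p[trait] for trait in var_g}

# Genetic and phenotypic correlations between NB and GY.
r_g = cov_g / sqrt(var_g["NB"] * var_g["GY"])
r_p = cov_p / sqrt(var_p["NB"] * var_p["GY"])

print({t: round(v, 2) for t, v in h2.items()}, round(r_g, 3), round(r_p, 3))
```

The recomputed values agree with the table to within rounding of the last reported digit.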

In this example, NB is considered not to have agronomic or economic importance and, thus, a weight of zero can be attributed to it. Hence, the following initial weights that will form part of the vector of genotypic values are attributed according to the procedure already presented:

a₁ = 0(NB) and a₂ = 1(GY)

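On considering the estimates of the parameters in Table 2, we have the following matrix representation of the system (GA = PB):

$$\begin{bmatrix} 3.7568 & -15.2518 \\ -15.2518 & 119.7227 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 4.6982 & -9.3004 \\ -9.3004 & 315.7038 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$

By solving the system B = P^-1 GA, we obtain:

$$\begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} 4.6982 & -9.3004 \\ -9.3004 & 315.7038 \end{bmatrix}^{-1} \begin{bmatrix} -15.2518 \\ 119.7227 \end{bmatrix} = \begin{bmatrix} -2.6501 \\ 0.3012 \end{bmatrix}$$

and thus b₁ = -2.6501 and b₂ = 0.3012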

The index (I_j) as a result is:

I_j = -2.6501 X + 0.3012 Y or

I_j = -2.6501 NB + 0.3012 GY

An interesting fact: although, initially, a null weight for NB was attributed (a₁=0), this trait appears in the composition of the index because it is correlated with the trait of interest (GY).
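The role of the correlation can be checked numerically. The sketch below solves the two-trait system with the Table 2 estimates and then repeats the calculation with both covariances set to zero, a purely hypothetical case introduced here for contrast. The function name `weights_two_traits` is illustrative.

```python
def weights_two_traits(var_g_x, var_g_y, cov_g, var_p_x, var_p_y, cov_p, a1, a2):
    """Solve P b = G a for two traits by Cramer's rule."""
    r1 = var_g_x * a1 + cov_g * a2          # first element of G a
    r2 = cov_g * a1 + var_g_y * a2          # second element of G a
    det = var_p_x * var_p_y - cov_p ** 2    # determinant of P
    b1 = (r1 * var_p_y - cov_p * r2) / det
    b2 = (var_p_x * r2 - cov_p * r1) / det
    return b1, b2


# Table 2 estimates (X = NB, Y = GY) with a1 = 0 and a2 = 1.
with_cov = weights_two_traits(3.7568, 119.7227, -15.2518,
                              4.6982, 315.7038, -9.3004, 0.0, 1.0)

# Same variances, covariances set to zero (hypothetical uncorrelated traits).
without_cov = weights_two_traits(3.7568, 119.7227, 0.0,
                                 4.6982, 315.7038, 0.0, 0.0, 1.0)

print([round(v, 4) for v in with_cov])
print([round(v, 4) for v in without_cov])
```

In the uncorrelated case the weight for NB is zero and the weight for GY reduces to its heritability, so NB would contribute nothing to the index; it enters only through its covariances with GY.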

Table 3 presents both the means of the two traits (NB and GY) for the 40 half-sib families and the index of each, obtained with the previous equation. The reason for most of the indices being negative is that b₁ is negative. This table reveals some interesting aspects: the families with higher index values are those that present lower NB values and higher GY values (e.g. families 11, 17 and 36); and the families with lower index values are those that present higher NB values and lower GY values (e.g. families 12, 13 and 20). The index for family 29, the most productive, falls in fourth place because it presents an NB value similar to the overall mean. Of the two families classified as having the worst index, one presents the highest NB value (family 12), whereas the other presents the lowest GY value (family 20).

Table 3. Means for the traits tassel branch number (NB, branches plant^-1) and grain yield (GY, g plant^-1), and the corresponding indices (I_j = -2.6502 NB + 0.3012 GY), for 40 half-sib families of maize.

Family (j)	NB	GY	Index
1	16.6	120.6	-7.673
2	16.4	129.2	-4.554
3	18.5	123.3	-11.896
4	18.1	149.6	-2.915
5	18.1	130.8	-8.577
6	18.2	137.1	-6.945
7	18.9	99.0	-20.274
8	21.0	145.1	-11.956
9	17.9	124.7	-9.884
10	17.7	127.4	-8.541
11	14.7	150.2	6.276
12	24.9	120.6	-29.670
13	21.0	101.7	-25.026
14	17.1	132.1	-5.535
15	19.6	146.1	-7.945
16	16.5	141.5	-1.114
17	16.9	162.0	3.999
18	17.1	149.3	-0.355
19	19.7	133.7	-11.944
20	20.7	86.5	-28.808
21	21.0	152.7	-9.667
22	16.7	138.7	-2.488
23	20.8	138.7	-13.353
24	20.5	129.0	-15.479
25	19.4	137.0	-10.155
26	19.1	97.1	-21.376
27	18.0	132.2	-7.890
28	20.4	150.0	-8.890
29	19.0	176.0	2.650
30	20.2	135.3	-12.787
31	17.8	163.6	2.096
32	21.7	130.9	-18.087
33	19.8	134.7	-11.908
34	15.4	116.0	-5.879
35	23.2	134.9	-20.858
36	15.8	156.7	5.318
37	15.8	141.0	0.590
38	18.1	132.1	-8.186
39	18.1	133.9	-7.643
40	17.6	136.4	-5.565
Mean	18.7	134.4	-9.072

We now have three selection criteria for increasing GY:

(1) Selection for lower NB values (indirect selection)
(2) Selection for higher GY values (direct selection)
(3) Selection for higher I_j values (combined selection).

Table 4 presents the selected families, considering a selection intensity of 20% (i.e. 8 families) per criterion. The mean of the selected families is also presented for each of the three criteria. Of the 8 families selected by the index (criterion 3), 4 are also selected by criterion 1 (lower NB) and 5 by criterion 2 (higher GY). Only two families (11 and 36) were selected by all three criteria. One family with a high index (family 18) would not have been selected by either single-trait criterion. It was selected by the index because, although it is not among the eight best for either trait, it showed its superiority when the two traits were combined.
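The three selections can be reproduced from the family means in Table 3. The sketch below is one way to do it, assuming the rounded weights reported in the text (b₁ = -2.6501, b₂ = 0.3012); the helper names are hypothetical, and index values computed with rounded weights may differ from the tabulated ones in the last decimal without changing the ranking.

```python
# (family, NB, GY) means copied from Table 3.
DATA = [
    (1, 16.6, 120.6), (2, 16.4, 129.2), (3, 18.5, 123.3), (4, 18.1, 149.6),
    (5, 18.1, 130.8), (6, 18.2, 137.1), (7, 18.9, 99.0), (8, 21.0, 145.1),
    (9, 17.9, 124.7), (10, 17.7, 127.4), (11, 14.7, 150.2), (12, 24.9, 120.6),
    (13, 21.0, 101.7), (14, 17.1, 132.1), (15, 19.6, 146.1), (16, 16.5, 141.5),
    (17, 16.9, 162.0), (18, 17.1, 149.3), (19, 19.7, 133.7), (20, 20.7, 86.5),
    (21, 21.0, 152.7), (22, 16.7, 138.7), (23, 20.8, 138.7), (24, 20.5, 129.0),
    (25, 19.4, 137.0), (26, 19.1, 97.1), (27, 18.0, 132.2), (28, 20.4, 150.0),
    (29, 19.0, 176.0), (30, 20.2, 135.3), (31, 17.8, 163.6), (32, 21.7, 130.9),
    (33, 19.8, 134.7), (34, 15.4, 116.0), (35, 23.2, 134.9), (36, 15.8, 156.7),
    (37, 15.8, 141.0), (38, 18.1, 132.1), (39, 18.1, 133.9), (40, 17.6, 136.4),
]
B1, B2 = -2.6501, 0.3012   # index weights as reported in the text
N_SELECTED = 8             # 20% of 40 families


def index_value(nb, gy):
    """Selection index I_j = b1 * NB + b2 * GY."""
    return B1 * nb + B2 * gy


def select(key, reverse):
    """Return the sorted codes of the N_SELECTED best families under `key`."""
    ranked = sorted(DATA, key=key, reverse=reverse)
    return sorted(row[0] for row in ranked[:N_SELECTED])


low_nb = select(key=lambda r: r[1], reverse=False)
high_gy = select(key=lambda r: r[2], reverse=True)
high_index = select(key=lambda r: index_value(r[1], r[2]), reverse=True)
common = sorted(set(low_nb) & set(high_gy) & set(high_index))

print("lower NB    :", low_nb)
print("higher GY   :", high_gy)
print("higher index:", high_index)
print("selected by all three criteria:", common)
```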

Table 4. Selected maize families and their means for three selection criteria.

Selection criterion	Selected families (code no.)	Mean
< Tassel branch number	01 02 11 16 22 34 36 37	16.0 branches plant^-1
> Grain yield	04 11 17 21 28 29 31 36	157.6 g plant^-1
> Index	11 16 17 18 29 31 36 37	2.432 branches x g plant^-1

Table 5. Expected responses to selection (Rs%) for grain yield (GY) in maize for three selection criteria.

Selection criterion	Rs% in GY
< Tassel branch number	6.52
> Grain yield	6.55
> index	8.56

Table 5 presents the estimates of the expected responses for GY with three selection criteria. The method of selecting among means of half-sib families with recombination of remnant seed is taken into account, and the estimates listed in Table 2 are used.
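As a worked check, the expected responses can be recomputed from the Table 2 estimates and the Table 4 means. The sketch assumes that each selection differential is the mean of the selected families minus the overall mean, and that the half-sib family variances and covariances are used directly in the numerator (c = 1/4, remnant seed); variable names are illustrative.

```python
MEAN_NB, MEAN_GY, MEAN_INDEX = 18.7, 134.4, -9.072   # overall means (Table 3)
VAR_G_GY = 119.7227                                  # genetic variance, GY
VAR_P_NB, VAR_P_GY = 4.6982, 315.7038                # phenotypic variances of means
COV_G, COV_P = -15.2518, -9.3004                     # covariances NB x GY
B1, B2 = -2.6501, 0.3012                             # index weights from the text

# Selection differentials: mean of the selected families (Table 4) minus overall mean.
ds_nb = 16.0 - MEAN_NB
ds_gy = 157.6 - MEAN_GY
ds_index = 2.432 - MEAN_INDEX

# (1) Indirect selection for lower NB: correlated response in GY.
rs_indirect = ds_nb * COV_G / VAR_P_NB
# (2) Direct selection for higher GY.
rs_direct = ds_gy * VAR_G_GY / VAR_P_GY
# (3) Selection on the index: response in GY.
cov_g_iy = B1 * COV_G + B2 * VAR_G_GY
var_p_i = B1 ** 2 * VAR_P_NB + B2 ** 2 * VAR_P_GY + 2 * B1 * B2 * COV_P
rs_index = ds_index * cov_g_iy / var_p_i

pct = {name: 100 * rs / MEAN_GY
       for name, rs in [("lower NB", rs_indirect),
                        ("higher GY", rs_direct),
                        ("higher index", rs_index)]}
for name, value in pct.items():
    print(f"{name}: {value:.2f}% of the GY mean")
```

Under these assumptions the three percentages agree with the values listed in Table 5.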

The first two criteria gave very similar expected responses for GY. That is, selection for lower NB values is about as efficient as selection for higher GY values (6.52% and 6.55%, respectively). This is possible whenever the two traits are genetically well correlated and the trait used for indirect selection has a higher heritability. According to Falconer and Mackay (1996), for the same intensity of selection, indirect selection will produce better results whenever

$|r_G| \, h_X$

is greater than

$h_Y$

where h_X and h_Y are the square roots of the heritabilities of traits X and Y. In the example, $|r_G| \, h_X = 0.719 \times \sqrt{0.799} \approx 0.643$ and $h_Y = \sqrt{0.379} \approx 0.616$ and, as such, Rs_Y/X should be slightly higher than Rs_Y.
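The comparison attributed to Falconer and Mackay (|r_G| h_X against h_Y) holds for equal standardized selection differentials. The sketch below evaluates it with the Table 2 estimates and also computes the realized standardized differentials from the Table 4 means. Reading the slightly lower figure for indirect selection in Table 5 as a consequence of those unequal differentials is an interpretation offered here as an assumption, not a statement from the source.

```python
from math import sqrt

R_G = -0.719              # genetic correlation (Table 2)
H_X = sqrt(0.799)         # NB: square root of the family-mean heritability
H_Y = sqrt(0.379)         # GY: square root of the family-mean heritability

# Merit of indirect selection, to be compared with H_Y.
indirect_merit = abs(R_G) * H_X

# Realized standardized differentials: |selected mean - overall mean| / sigma_Pbar.
k_x = abs(16.0 - 18.7) / sqrt(4.6982)
k_y = (157.6 - 134.4) / sqrt(315.7038)

# Ratio of indirect to direct response: (k_x / k_y) * |r_G| * h_X / h_Y.
ratio = (k_x / k_y) * indirect_merit / H_Y

print(round(indirect_merit, 3), round(H_Y, 3))
print(round(k_x, 2), round(k_y, 2), round(ratio, 3))
```

With equal differentials the condition favours indirect selection, while the smaller realized differential for NB in this sample brings the ratio of the two responses to just under one.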

However, the expected response for GY by selecting through the index was 30.9% higher than the mean of the expected responses with selection based on individual traits (8.56% versus 6.54%). This result could be expected as the indices identify the families that best combine the two traits (low NB and high GY) in the sense of maximizing response in GY.

Final comments

The example described above clearly illustrates the advantage of using selection indices when precise estimates of genetic and phenotypic parameters of a reference population are available. Likewise, little difficulty was encountered in attributing weights to genotypic values (a_i), as only two traits were considered, one of which had economic importance and the other not. Even so, we should remember that the index will frequently consist of more than two traits having some type of agronomic or economic importance. In rice, for example, yield, grain quality and disease resistance are routinely considered together. In such cases, attributing the initial weights is difficult, as this is done somewhat arbitrarily. Computers now greatly ease this task: we can quickly try various combinations of weights for the different traits, construct the corresponding indices and estimate the responses to selection for each trait. The most appropriate index can then be chosen.
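A minimal sketch of such a scan over weights is given below for the two-trait maize example. The candidate weights other than (0, 1) are hypothetical, responses are expressed per unit of standardized selection differential (K = 1), and extending the chapter's response formula to NB by symmetry is an assumption made for illustration.

```python
from math import sqrt

# Table 2 estimates, trait order: NB, GY.
G = [[3.7568, -15.2518], [-15.2518, 119.7227]]
P = [[4.6982, -9.3004], [-9.3004, 315.7038]]
K = 1.0   # responses per unit of standardized selection differential


def weights(a):
    """Solve the 2 x 2 system P b = G a by Cramer's rule."""
    r0 = G[0][0] * a[0] + G[0][1] * a[1]
    r1 = G[1][0] * a[0] + G[1][1] * a[1]
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [(r0 * P[1][1] - P[0][1] * r1) / det,
            (P[0][0] * r1 - P[1][0] * r0) / det]


def responses(b):
    """Expected responses in NB and GY: K * Cov_G(I, trait) / sigma_I."""
    var_i = sum(b[i] * P[i][j] * b[j] for i in range(2) for j in range(2))
    sd_i = sqrt(var_i)
    return [K * (G[t][0] * b[0] + G[t][1] * b[1]) / sd_i for t in range(2)]


# Candidate weights (a_NB, a_GY); only (0, 1) is used in the chapter.
CANDIDATES = [(0.0, 1.0), (-2.0, 1.0), (-10.0, 1.0)]
results = {a: responses(weights(a)) for a in CANDIDATES}

for a, (d_nb, d_gy) in results.items():
    print(a, "NB:", round(d_nb, 3), "GY:", round(d_gy, 3))
```

Comparing the rows shows the trade-off a breeder faces: putting negative weight on NB buys a larger reduction in branch number at the cost of some of the gain in yield.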

As was demonstrated, to obtain the indices, we had to estimate the components of variance of each trait and of covariance between the traits. To do that, we had to use a genetic design such as half-sib families, full-sib families or another family type. This is not always easy, particularly for autogamous species in which crossing is more difficult, such as rice, soybean, common bean or wheat. Even so, because of the importance given to population improvement in recent years, greater emphasis is placed on obtaining populations in equilibrium, which permits obtaining such estimates. Thus, selection indices, so far used mostly for allogamous plant species and for animals, will be used more widely for autogamous species in the future. Population improvement programmes will therefore be executed more efficiently.

Rice breeders who use population improvement through recurrent selection frequently make decisions on the families to select to develop a new and improved population. Usually, such decisions are based on experience with the crop and with the performance of the different traits being selected. Gains for specific traits have been reported by Guimarães and Correa-Victoria (1997) for blast and by Borrero et al. (2000) for white leaf virus (rice ‘hoja blanca’ virus).

This chapter aims to encourage such plant breeders to consider using selection indices as an integral part of their work, as these will surely help increase the response to selection for more than one trait, such as when developing populations resistant to three important insect pests of the rice crop (Ferreira et al., 2000). Another possibility would be to use selection indices to improve yield and simultaneously maintain or increase levels of disease resistance and grain quality.

We also hope to encourage rice breeders who work with population improvement to conduct research on estimating genetic and phenotypic parameters of populations to obtain selection indices. Researchers should remember that they do not always have to conduct research specifically for this purpose, but can carry it out in parallel with population improvement. Thus, when families are evaluated in experiments with replications, the estimates of genetic and phenotypic parameters can be easily obtained, based on evaluating traits of importance. As already observed, the best type of family for this purpose is the half-sib, as it permits obtaining estimates of additive genetic variance and covariance.

Because obtaining half-sib families is not always easy in autogamous species, an option may be, for example, using F_2:3 families, that is, F₃ families derived from F₂ plants, which are very common in autogamous species. These, even though they do not offer direct estimates, permit obtaining approximate estimates of additive genetic variance and covariance. It should be emphasized that this type of work is highly appropriate for thesis preparation by postgraduate students. Likewise, those programmes using population improvement methods and associated with postgraduate programmes should encourage students to conduct their research in this area.

Finally, alternatives should be suggested for those programmes that do not have the conditions for conducting research on estimating genetic and phenotypic parameters. In such cases, the literature can provide information on the magnitudes of heritabilities and of the genetic correlations between traits, and, consequently, assist decision making on the simultaneous selection for several traits.

With such data, plant breeders can practise selection by using a subjective index, also based on their experience with populations. Such information is particularly useful when the direction of the correlation is contrary to the breeders' interest, for example, when the correlation between two traits is negative but the interest is to select them both in the same direction (to either increase or diminish); or when the correlation is positive but the interest is to select the two traits in opposite directions (to increase one and diminish the other). In such cases, however subjective the index, breeders should select according to some index; otherwise, improving one trait will inevitably shift the other in an undesired direction. In rice, as for most crops, the principal objective is to increase yield. However, selecting only for that trait often leads to increased susceptibility to diseases and/or problems in grain quality.

References

Baker, R.J. 1986. Selection indices in plant breeding. Boca Raton, FL, CRC Press. 218 pp.

Borrero, J.; Châtel, M. & Triana, M. 2000. Mejoramiento poblacional del arroz irrigado con énfasis en el virus de la hoja blanca. In E.P. Guimarães, ed. Avances en el mejoramiento poblacional en arroz, pp. 105-118. Santo Antônio de Goiás, Brazil, Embrapa Arroz e Feijão.

Falconer, D.S. & Mackay, T.F. 1996. Introduction to quantitative genetics. 4th ed. New York, Longman Scientific & Technical, 464 pp.

Ferreira, E.; Breseghello, F. & Castro, E. da M. de. 2000. CNA-8: población de arroz de tierras altas bajo mejoramiento poblacional para resistência a plagas iniciales. In E.P. Guimarães, ed. Avances en el mejoramiento poblacional en arroz, pp. 255-268. Santo Antônio de Goiás, Brazil, Embrapa Arroz e Feijão.

Guimarães, E.P. & Correa-Victoria, F. 1997. Utilización de la selección recurrente para desarrollar resistencia a Pyricularia grisea Sacc. en arroz. In E.P. Guimarães, ed. Selección recurrente en arroz, pp. 165-175. Cali, Colombia, CIAT.

Hazel, L.N. 1943. The genetic basis for constructing selection indexes. Genetics, 28: 476-490.

Henderson, C.R. 1963. Selection index and expected genetic advance. In W.D. Hanson & H.F. Robinson, eds. Statistical genetics and plant breeding, pp. 141-163. Publication 982. Washington, DC, National Academy of Sciences.
