Catalina Ramis[5]
Ana Cláudia de Carvalho Badan[6]
Elcio Perpétuo Guimarães[7]
Antonio Díaz[5]
Carlos Eduardo Gamboa[8]
Abstract
Programmes for the genetic improvement of rice populations through recurrent selection have made little use of molecular markers, even though they could be very helpful in answering certain methodological questions. This chapter describes the major types of molecular markers, and briefly reviews techniques for their use, comparing their respective advantages and disadvantages. Molecular markers can be used in genetic improvement programmes to study genetic diversity, or for mapping or marker-assisted selection. Detailed examples are given of the possible uses of markers in a recurrent selection programme for rice, such as (1) improving the selection of parents constituting the population by maximizing their genetic diversity; (2) optimizing the number of recombination cycles by assessing the evolution of deviation from panmixia in populations of successive cycles; (3) assessing the effect of population improvement on genetic diversity by comparing cycles of recurrent selection with each other or with reference sets of known diversity; (4) helping to confirm the location of the male-sterility gene used for intercrossing and assessing the extent and nature of linkage drag accompanying its use; and (5) detecting quantitative trait loci (QTLs) and manipulating the alleles at these QTLs through marker-assisted selection.
Summary
Rice genetic improvement programmes based on recurrent selection have made little use of molecular markers, which could be very useful for answering certain methodological questions. This chapter describes the main types of molecular markers, briefly describes the techniques used with them, and compares their respective advantages and disadvantages. Molecular markers can be used in genetic improvement programmes to study genetic diversity, or for mapping and marker-assisted selection. More detailed examples are given of possible uses of markers in a recurrent selection programme for rice. A first use would be to improve the choice of the parents that will constitute the population, by maximizing the genetic diversity among them. Another would be to optimize the number of recombination cycles by assessing how deviation from panmixia evolves in populations of successive cycles. Markers could be used to assess the effect of population improvement on genetic diversity by comparing the recurrent selection cycles with each other or with reference sets of known diversity. Markers could help confirm the location of the male-sterility gene used for intercrossing, and assess the extent and nature of the linkage drag that may accompany its use. Markers are also used to detect quantitative trait loci (QTLs) and to manipulate alleles at these QTLs through marker-assisted selection.
Recurrent selection is a cyclic and gradual procedure. Its goal is to increase the frequency of favourable alleles within a population. High variability is maintained in such a way that plant breeders can extract promising plants at different cycles of the programme. The success of this type of programme depends on various factors:
An initial broad variability that permits the desired advance
Evaluation of traits of interest that permits discerning the genotypic effect, thus identifying superior materials
Occurrence of random crosses directed towards free combination of alleles within genes (Hardy-Weinberg equilibrium) and between genes (linkage equilibrium)
Hence, breeders would expect:
Where selection was efficient, an increased frequency of genotypes with the desirable traits for which the selection was being conducted
Maintenance of high variability for other traits, permitting the use of these populations as base for obtaining new genotypes. Hence, the base population needs to have a broad genetic base
Opportunities of new combinations appearing that would enhance the possibility of selection over time. Such possibility would be fulfilled in the recombination stage when conditions of panmixia should be ensured.
This type of improvement programme was first designed for maize where cross-pollination is ensured through the flowering mechanism. For autogamous plants, like rice and soybean, cross-pollination must be carried out manually, which makes executing this type of programme difficult.
For rice, incorporating a male-sterility gene into base populations means that recurrent selection schemes can be applied to population breeding programmes. Thus, cross-pollination can be ensured by harvesting, during recombination, only the male-sterile plants that had received pollen from neighbouring fertile plants and formed seeds.
However, the novelty of these schemes being applied to an autogamous crop like rice has created concern as to, first, their adequate development and, consequently, their likely future success in terms of being potential sources of improved cultivars. The following concerns in particular are being studied by the Centre for Research in Agricultural Biotechnology (CIBA, its Spanish acronym) of the Universidad Central de Venezuela, Maracay:
The Centre's goal is to provide answers for plant breeders who want to conduct successful programmes in Latin America. The research uses biotechnological tools (molecular markers of the isoenzyme and microsatellite type), with population PFD-1 as a model.
For this study, 300 plants from cycle 0 of population PFD-1 were planted in trays under screen-house conditions. Eight days after planting, a sample of leaf tissue was taken from each plant.
Figure 1. Different electrophoretic patterns (Pi) observed for isoenzymes isocitrate dehydrogenase (IDH), phospho-glucoisomerase (PGI), β-esterase (β-est I and β-est II) and acid phosphatase (ACP) in cycle 0 of rice population PFD-1, and four commercial rice cultivars: Araure 4, Cimarrón, Fonaiap 1 and Palmar.
The electrophoretic patterns of the isoenzymes β-esterase (β-est) and acid phosphatase (ACP) were determined in 10% polyacrylamide gels (Velásquez, 2001). Eighteen days after planting, samples of leaf tissue were again taken from each plant. This time, the electrophoretic patterns of the enzymes phospho-glucoisomerase (PGI) and isocitrate dehydrogenase (IDH) were determined in 12% starch gels, following the methodology of Rodríguez (2001) and Ortiz et al. (2002). Figure 1 illustrates the distinct patterns observed for each isoenzymatic system. For isoenzymes IDH, PGI and ACP, one zone of activity can be seen and, for β-est, two are seen, identified as β-est I and β-est II.
Because the genetic control of these isoenzymes is not known, a genotype could not be assigned to each plant for each isoenzyme, and the diversity study was therefore based on the electrophoretic patterns of each isoenzyme. To do so, the different band patterns for each isoenzymatic system were identified and their frequencies counted. From these data, the Shannon and Weaver diversity index (H), the uniformity index (E) and the complement of Simpson's index (D) were estimated (Sneath and Sokal, 1973).
The Shannon and Weaver index (H) measures the degree of uncertainty in finding a given electrophoretic pattern in any individual of a population. It is affected as much by the number of patterns as by their relative frequencies. The index is calculated according to the following formula:
$$H = -\sum_{i=1}^{S} p_i \ln p_i$$
where:
$p_i$ represents the frequency of the i-th electrophoretic pattern, and S is the number of patterns in the isoenzymatic system
The highest values of H are expected in populations with the most patterns (i.e. with the richest variation) and with similar frequencies for each pattern (i.e. greatest uniformity of frequencies).
Index E measures the uniformity of frequency of distinct patterns. It represents the proportion of diversity observed (H) with respect to the maximum theoretical value. The index is calculated on the basis of the Shannon and Weaver index (H) as:
E = H/lnS
where:
S corresponds to the number of patterns per isoenzymatic system
Hence, E reaches its maximum value of 1 when all the observed patterns have equal frequencies.
The complement of Simpson's index (D) estimates the probability that two individuals taken at random from the population show distinct patterns, and is calculated according to the formula:
$$D = 1 - \sum_{i=1}^{S} p_i^2$$
where:
$p_i$ is the frequency of the i-th electrophoretic pattern
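The three indices defined above can be computed directly from a list of observed patterns. The following is a minimal sketch, not the authors' procedure: the function name `diversity_indices` and the sample data are hypothetical. The logarithm is left as a parameter because E does not depend on the base, whereas H does; the values in Table 1 appear consistent with base-10 logarithms (for example, 0.469/0.985 ≈ 0.476 ≈ log10 3), although this is an inference and is not stated in the text.

```python
import math
from collections import Counter

def diversity_indices(patterns, log=math.log):
    """Return (H, E, D) for a list of observed electrophoretic patterns.

    `log` is the logarithm used for H and E (natural log by default;
    the choice of base is an assumption, see the note above).
    """
    counts = Counter(patterns)
    n = sum(counts.values())
    freqs = [c / n for c in counts.values()]     # relative frequencies p_i
    s = len(freqs)                               # number of distinct patterns, S
    h = -sum(p * log(p) for p in freqs)          # Shannon and Weaver index
    e = h / log(s) if s > 1 else 0.0             # uniformity: H relative to its maximum
    d = 1.0 - sum(p * p for p in freqs)          # complement of Simpson's index
    return h, e, d

# Hypothetical sample: three patterns at frequencies 0.5, 0.3 and 0.2
sample = ["P1"] * 5 + ["P2"] * 3 + ["P3"] * 2
h, e, d = diversity_indices(sample)
print(f"H = {h:.3f}, E = {e:.3f}, D = {d:.3f}")
```

Calling `diversity_indices(sample, log=math.log10)` changes H but leaves E and D unchanged, which is why E values can be compared regardless of the logarithm used.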
Table 1. Shannon-Weaver index (H), uniformity index (E) and the complement of Simpson's index (D) in five isoenzymatic systems in cycle 0 of rice population PFD-1
Isoenzyme | H | E | D
Acid phosphatase | 0.469 | 0.985 | 0.656
Isocitrate dehydrogenase | 0.465 | 0.976 | 0.649
Phospho-glucoisomerase | 0.538 | 0.894 | 0.678
β-esterase, type II | 0.624 | 0.739 | 0.694
β-esterase, type I | 0.445 | 0.933 | 0.617
Average | 0.508 | 0.905 | 0.659
Index D is larger when the frequencies of the patterns are similar. In contrast, when one electrophoretic pattern predominates, index D diminishes (Sneath and Sokal, 1973).
Table 1 presents the values obtained for H, E and D for each isoenzymatic system and its average value.
The highest values for index H indicate that the most diverse isoenzymatic systems were β-est II and PGI. The results for the uniformity index E indicated that this population presented almost 91% (0.905) of the maximum theoretical variability for the number of patterns observed, which would occur if all patterns presented similar frequencies. This, in turn, indicated that none of these patterns predominated in the population. That is, the danger of genetic drift is reduced and variability is maintained in the improvement programme. The average value of D shows that a random comparison between two plants would have a 65.9% probability of finding distinct patterns.
Hence, by considering these indices, we can answer our first question by affirming that cycle 0 of population PFD-1 presents high genetic diversity, as shown by the number of distinct genotypes present and similar frequencies, with none being predominant.
However, we continued with our study to determine if the variability present in population PFD-1 is different to that of the country's traditional cultivars. We first ascertained the electrophoretic patterns of the same isoenzymes in the commercial cultivars Araure 4, Cimarrón, Fonaiap 1 and Palmar. We compared them with 136 individuals from cycle 0 of population PFD-1. A matrix of genetic distances was prepared with these data, based on simple matching coefficients (1-SM), where SM is the proportion of coinciding patterns out of all the isoenzymatic systems compared. From the matrix, a dendrogram (Figure 2) was constructed, using the UPGMA clustering method (Sneath and Sokal, 1973).
Figure 2. Dendrogram constructed on the basis of estimates of genetic similarity, according to Jaccard's coefficient and the UPGMA clustering method, of 136 plants from recurrent selection cycle 0 of rice population PFD-1 and four Venezuelan rice varieties (Araure 4, Palmar, Cimarrón and Fonaiap 1)
The dendrogram showed eight large clusters, two of which held the commercial varieties: Cluster II had varieties Araure 4 and Palmar, and Cluster VIII had Cimarrón and Fonaiap 1. This latter cluster stood out for being very different to most of the sample. The fact that six clusters were detected as different to the commercial varieties suggested that new variability existed in this base population. The variability of population PFD-1 also overlapped that existing in the commercial varieties as some plants of population PFD-1 grouped with Clusters II and VIII.
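The simple matching distance (1-SM) used to build the distance matrix can be illustrated with the five-digit pattern codes of the four commercial varieties listed in Table 2. This is a sketch under assumptions: it treats each of the five isoenzymatic systems as one character to be compared, which may not be exactly how the authors coded their data, and the function name `simple_matching_distance` is hypothetical. It follows the simple matching definition given in the text rather than the Jaccard coefficient named in the caption of Figure 2.

```python
def simple_matching_distance(a, b):
    """Return 1 - SM, where SM is the share of systems with the same pattern."""
    if len(a) != len(b):
        raise ValueError("pattern codes must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 1.0 - matches / len(a)

# Pattern codes of the commercial varieties, as listed in Table 2
varieties = {
    "Araure 4": "21373",
    "Cimarrón": "23144",
    "Fonaiap 1": "21144",
    "Palmar": "21373",
}

# Pairwise distance matrix stored as a dictionary keyed by variety pairs
names = list(varieties)
matrix = {
    (p, q): simple_matching_distance(varieties[p], varieties[q])
    for p in names
    for q in names
}

for p in names:
    print(p, [round(matrix[(p, q)], 2) for q in names])
```

A matrix of this kind is the input that an average-linkage (UPGMA) clustering routine would take to produce a dendrogram.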
Table 2. Isoenzymatic patterns found in 136 individuals from cycle 0 of population PFD-1 of the rice recurrent selection programme, together with four Venezuelan commercial varieties.
Indiv. | Pattern (a) | Indiv. | Pattern (a) | Indiv. | Pattern (a) | Indiv. | Pattern (a)
1 | 11222 | 38 | 21141* | 75 | 23323* | 112 | 32311
2 | 11222 | 39 | 21231 | 76 | 23333* | 113 | 32331*
3 | 11223* | 40 | 21231 | 77 | 23341* | 114 | 33121
4 | 11242* | 41 | 21242* | 78 | 23422* | 115 | 33121
5 | 12131* | 42 | 21253* | 79 | 23442* | 116 | 33122*
6 | 12141* | 43 | 21273* | 80 | 31121 | 117 | 33162*
7 | 12211* | 44 | 21323* | 81 | 31121 | 118 | 33221
8 | 12212 | 45 | 21422 | 82 | 31122 | 119 | 33221
9 | 12212 | 46 | 21422 | 83 | 31122 | 120 | 33122*
10 | 12221* | 47 | 22122 | 84 | 31122 | 121 | 33162*
11 | 12222 | 48 | 22122 | 85 | 31131* | 122 | 33221
12 | 12222 | 49 | 22122 | 86 | 31133* | 123 | 33221
13 | 12232* | 50 | 22131* | 87 | 31142* | 124 | 33222*
14 | 12322* | 51 | 22132* | 88 | 31171* | 125 | 33223
15 | 12342* | 52 | 22133* | 89 | 31222* | 126 | 33223
16 | 12421* | 53 | 22141* | 90 | 31232* | 127 | 33231
17 | 12422* | 54 | 22161* | 91 | 31242* | 128 | 33231
18 | 12432* | 55 | 22163* | 92 | 31462* | 129 | 33232
19 | 13121* | 56 | 22212* | 93 | 32122 | 130 | 33232
20 | 13212* | 57 | 22221* | 94 | 32122 | 131 | 33242*
21 | 13221 | 58 | 22222* | 95 | 32122 | 132 | 33263*
22 | 13221 | 59 | 22223* | 96 | 32123 | 133 | 33321*
23 | 13231* | 60 | 22233* | 97 | 32123 | 134 | 33422*
24 | 13232 | 61 | 22252* | 98 | 32132* | 135 | 33431*
25 | 13232 | 62 | 22322* | 99 | 32133 | 136 | 33442*
26 | 13242* | 63 | 22433* | 100 | 32133 | Araure 4 | 21373
27 | 13321* | 64 | 23132* | 101 | 32152 | Cimarrón | 23144
28 | 13322 | 65 | 23132* | 102 | 32152 | Fonaiap 1 | 21144
29 | 13322 | 66 | 23221* | 103 | 32223* | Palmar | 21373
30 | 13323 | 67 | 23222 | 104 | 32232 | |
31 | 13323 | 68 | 23222 | 105 | 32232 | |
32 | 13331* | 69 | 23222 | 106 | 32233* | |
33 | 13442* | 70 | 23232 | 107 | 32241* | |
34 | 21121* | 71 | 23232 | 108 | 32243* | |
35 | 21123* | 72 | 23321* | 109 | 32262 | |
36 | 21131* | 73 | 23322 | 110 | 32262 | |
37 | 21132* | 74 | 23322 | 111 | 32311 | |
a. The first digit of the number corresponds to acid phosphatase, the second to isocitrate dehydrogenase, the third to phospho-glucoisomerase, the fourth to β-esterase, type II, and the fifth to β-esterase, type I. The asterisk indicates unique combinations.
Table 3. Number of alleles at each SSR locus analysed in cycles 0 and 1 of rice population PFD-1, compared with the results of Temnykh et al. (2000).
Chromosome no. (SSR) - locus | No. of alleles, cycle 0 | No. of alleles, cycle 1 | No. of alleles, Temnykh et al. (2000)
1 (RM102) - A | 3 | 2 | 3
2 (RM263) - B | 8 | 5 | 5
3 (RM143) - C | 5 | 3 | 3
4 (RM185) - D | 3 | 3 | 2
6 (RM111) - E | 5 | 4 | 3
9 (RM296) - F | 4 | 4 | 2
10 (RM228) - G | 7 | 8 | 7
11 (RM224) - H | 8 | 7 | 9
12 (RM235) - I | 6 | 6 | 6
12 (RM17) - J | 3 | 3 | 4
Total | 52 | 45 | 44
Alleles per locus | 5.2 | 4.5 | 4.4
Another important aspect is the presence of not only different variability but also of new combinations between the distinct isoenzymatic systems. To evaluate this point, we compared the plants of population PFD-1 with each other for combinations of electrophoretic patterns of the five isoenzymatic systems (Table 2). By direct observation, we obtained 102 distinct combinations. Of these, 76 corresponded to plants that presented unique combinations, which corroborated the broad genetic variability of cycle 0 of population PFD-1.
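The count of distinct and unique combinations can be reproduced by tallying the five-digit codes. The sketch below uses only the first 12 individuals of Table 2 as an illustration, so its totals are not those of the full sample; the variable names are hypothetical.

```python
from collections import Counter

# Pattern codes of individuals 1 to 12 in Table 2 (asterisks omitted)
patterns = [
    "11222", "11222", "11223", "11242", "12131", "12141",
    "12211", "12212", "12212", "12221", "12222", "12222",
]

counts = Counter(patterns)

# A combination is "distinct" if it occurs at least once,
# and "unique" if exactly one plant carries it.
distinct = len(counts)
unique = sorted(code for code, c in counts.items() if c == 1)

print(distinct, "distinct combinations")
print(len(unique), "unique combinations:", unique)
```

For this subset, the unique combinations found are the same codes that carry an asterisk in Table 2 for individuals 1 to 12.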
When the combinations of patterns from the commercial varieties were also included for comparison, we found combinations in population PFD-1 that differed from those of the varieties. This finding could result from the presence of new alleles and from recombination produced by the crosses made in forming the base population.
On the basis of the results obtained, we may conclude that (1) cycle 0 of population PFD-1 showed broad genetic variability, and (2) this variability can be taken advantage of in breeding programmes. Furthermore, this variability only partially, and to a lesser extent, included that of the commercial varieties.
The second concern and requirement for a recurrent selection programme to be successful over time is the maintenance of genetic variability across recurrent cycles. To study this aspect, we had to determine:
Variation in allelic richness. This would include the number of different alleles at various loci for two or more cycles of a population in a recurrent selection programme and detecting possible disappearance of alleles
Variation in allelic uniformity across the cycles under study. This would include the relative frequency of each allele for two or more cycles of a population in a recurrent selection programme.
With these objectives, we set up the study using, as a biotechnological tool, microsatellite markers (SSRs) already known in rice (see Gramene at www.gramene.org). The idea was to obtain information from one SSR per chromosome pair. From a total of 27 SSRs tested, we selected 10 SSRs of good resolution. These were RM102, RM263, RM143, RM185, RM111, RM296, RM228, RM224, RM235 and RM17, located on chromosomes 1, 2, 3, 4, 6, 9, 10, 11 and the last two on 12, respectively.
For plant material, we used cycles 0 and 1 of population PFD-1, taking a sample of 96 plants per cycle. We planted seeds in trays containing soil that was kept permanently moist, and we maintained the seedlings under screen-house conditions.
On day 21 after planting, leaf tissue was taken from each plant for genomic DNA purification, using the cationic hexadecyl trimethyl ammonium bromide (CTAB) protocol described by Ferreira and Grattapaglia (1998). After the quality of the DNA was verified, each SSR was amplified, using the PCR protocol specific for each microsatellite (Brondani et al. 2001). The SSRs were then separated by electrophoresis on 6% polyacrylamide gels, and the bands visualized using the silver nitrate methodology (Panaud et al. 1996).
Variation in allelic richness
To measure the allelic richness or qualitative diversity present in each cycle, the number of different alleles was counted for each SSR in cycles 0 and 1 of population PFD-1 (Table 3). On average, 5.2 alleles per locus were obtained in cycle 0 and 4.5 in cycle 1. The table includes the number of alleles obtained for a sample of 14 rice varieties representing cultivated rice germplasm for the SSRs used in the present study (Temnykh et al. 2000). The similar or even greater variability found in population PFD-1 agreed with the results obtained with the isoenzymes, indicating that the population under study presented a broad genetic variability available for continuing the breeding programme. Moreover, these results are also comparable with those found in two populations for recurrent selection in maize, where an average of 3.9 and 3.7 alleles per locus was found, using 30 molecular markers (Pinto, 2001).
In contrast, regarding alleles that disappeared between cycles, we saw a decline of 13.5% between cycles 0 and 1. The importance of this value depends on the frequencies that the disappeared alleles had in cycle 0. Had their frequencies been very low, then most probably they would have been lost, regardless of the selection method applied.
Variation in allelic uniformity across the cycles
Another way of determining whether variability was maintained from cycle to cycle is to compare the relative allelic frequencies and carry out exact tests of allelic and genotypic differentiation (Raymond and Rousset, 1995) for each SSR locus in cycles 0 and 1 of population PFD-1. The results, summarized in Table 4, showed differences between the two cycles in eight of the 10 loci for allelic differentiation and in five for genotypic differentiation. The global test showed highly significant differences for both allelic and genotypic differentiation.
Detailed analysis of the allelic frequencies showed that, in some cases, when allelic frequency was very low in cycle 0, the allele did not appear in cycle 1. This occurred for alleles A3, B6, B7, B8, C4, C5, E5 and H8. This corroborates the point mentioned above: when any allele has a low frequency in the initial population, the probability that it will disappear in the next generation is very high. In contrast, allele G8 was observed in cycle 1 but not in cycle 0, suggesting that the sample was not large enough to detect its presence in cycle 0.
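The comparison of allele lists between cycles, and the reason rare alleles are easily missed, can be sketched as follows. The frequencies are a subset of those reported in Table 4; the function names `allele_turnover` and `prob_absent` are hypothetical. The probability calculation is a simplification that assumes the 2n gene copies in a sample of n diploid plants are drawn independently, ignoring selection and inbreeding, so it should be read only as an order of magnitude.

```python
def allele_turnover(before, after):
    """Return (lost, gained) allele names between two cycles."""
    lost = sorted(a for a, p in before.items()
                  if p > 0 and after.get(a, 0.0) == 0)
    gained = sorted(a for a, p in after.items()
                    if p > 0 and before.get(a, 0.0) == 0)
    return lost, gained

def prob_absent(p, n):
    """Chance that an allele of frequency p is absent from n diploid plants,
    assuming independent sampling of the 2n gene copies (an assumption)."""
    return (1.0 - p) ** (2 * n)

# Subset of the allelic frequencies in Table 4
cycle0 = {"A1": 0.0978, "A2": 0.8750, "A3": 0.0271, "G7": 0.0120, "G8": 0.0}
cycle1 = {"A1": 0.0647, "A2": 0.9352, "A3": 0.0, "G7": 0.0513, "G8": 0.3077}

lost, gained = allele_turnover(cycle0, cycle1)
print("lost:", lost, "gained:", gained)

# Allele C5 had a frequency of 0.0064 in cycle 0; 96 plants were sampled per cycle
print(round(prob_absent(0.0064, 96), 2))
```

Under these assumptions, an allele as rare as C5 would be missing from a sample of 96 plants roughly three times in ten, even without any selection against it.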
However, the selection process is expected to favour the presence of some alleles over others. Thus, some allelic frequencies diminish from cycle to cycle, while others at the same locus increase, for example, alleles B4 and B1; E1 and E3; and G6 and G8. However, what is important for this type of breeding programme is that genetic variability is maintained at a sufficiently high level to permit advances with selection. This condition is being fulfilled. The Nei diversity indices (Nei, 1978) calculated were:
0.6045 (CI(95%) = 0.4771 to 0.7319)
for cycle 0 and:
0.5742 (CI(95%) = 0.4099 to 0.7435)
for cycle 1. The overlapping confidence intervals indicate that these values do not differ statistically.
Hence, our results showed that broad genetic variability is maintained across cycles 0 and 1 of population PFD-1. The results positively answer our second question: is genetic variability maintained after a recurrent selection cycle?
The stage of recombination in a recurrent selection programme requires fulfilling the condition of panmixia. This guarantees the free combination of alleles of a given locus and the recombination between genes to establish the requisites for a population in equilibrium (Falconer, 1989).
Table 4. Allelic frequencies (pi) for each allele and exact tests for allelic and genotypic differentiation for each microsatellite (SSR) locus between cycles 0 and 1 of rice population PFD-1. ns is not significant; * is significant at 5%; and ** at 1%.
SSR | Chromosome | Allele | Allelic frequency, cycle 0 | Allelic frequency, cycle 1 | Allelic differentiation test (a) | Genotypic differentiation test (a)
RM102 | 1 | A1 | 0.0978 | 0.0647 | * | ns
 | | A2 | 0.8750 | 0.9352 | |
 | | A3 | 0.0271 | 0 | |
RM263 | 2 | B1 | 0.0256 | 0.1333 | ** | **
 | | B2 | 0.0512 | 0.1000 | |
 | | B3 | 0.8590 | 0.3000 | |
 | | B4 | 0.3974 | 0.0557 | |
 | | B5 | 0.2948 | 0.3111 | |
 | | B6 | 0.0128 | 0 | |
 | | B7 | 0.0192 | 0 | |
 | | B8 | 0.0128 | 0 | |
RM143 | 3 | C1 | 0.0577 | 0.0446 | ** | **
 | | C2 | 0.7692 | 0.9375 | |
 | | C3 | 0.0961 | 0.0179 | |
 | | C4 | 0.0705 | 0 | |
 | | C5 | 0.0064 | 0 | |
RM185 | 4 | D1 | 0.1547 | 0.1444 | * | ns
 | | D2 | 0.7559 | 0.8277 | |
 | | D3 | 0.0893 | 0.0277 | |
RM111 | 6 | E1 | 0.3188 | 0.1020 | ** | **
 | | E2 | 0.2898 | 0.3877 | |
 | | E3 | 0.1232 | 0.3367 | |
 | | E4 | 0.2464 | 0.1735 | |
 | | E5 | 0.0217 | 0 | |
RM296 | 9 | F1 | 0.1053 | 0.1000 | ns | ns
 | | F2 | 0.1644 | 0.2765 | |
 | | F3 | 0.2171 | 0.1588 | |
 | | F4 | 0.5131 | 0.4647 | |
RM228 | 10 | G1 | 0.1279 | 0.1410 | ** | **
 | | G2 | 0.0639 | 0.1026 | |
 | | G3 | 0.0232 | 0.0641 | |
 | | G4 | 0.0930 | 0.0641 | |
 | | G5 | 0.1919 | 0.1025 | |
 | | G6 | 0.4884 | 0.1666 | |
 | | G7 | 0.0120 | 0.0513 | |
 | | G8 | 0 | 0.3077 | |
RM224 | 11 | H1 | 0.0291 | 0.0500 | ** | *
 | | H2 | 0.0523 | 0.0200 | |
 | | H3 | 0.1977 | 0.2333 | |
 | | H4 | 0.1860 | 0.1666 | |
 | | H5 | 0.2035 | 0.1666 | |
 | | H6 | 0.0581 | 0.1166 | |
 | | H7 | 0.2267 | 0.0666 | |
 | | H8 | 0.0465 | 0 | |
RM235 | 12 | I1 | 0.0937 | 0.1948 | * | ns
 | | I2 | 0.2500 | 0.3312 | |
 | | I3 | 0.2187 | 0.1169 | |
 | | I4 | 0.1094 | 0.1558 | |
 | | I5 | 0.1094 | 0.1104 | |
 | | I6 | 0.2187 | 0.0909 | |
RM17 | 12 | J1 | 0.3563 | 0.3224 | ns | ns
 | | J2 | 0.0977 | 0.1118 | |
 | | J3 | 0.5460 | 0.5658 | |
Global test | | | | | ** | **
Nei's diversity index | | | 0.6045 | 0.5742 | |
Confidence interval (CI) (95%) | | | 0.4771 to 0.7319 | 0.4099 to 0.7435 | |
a. Where ns is not significant; * is significant at 5%; and ** is significant at 1%. The test result shown on the first row of each locus applies to the locus as a whole.
Random combination of alleles at a given locus
Under conditions of panmixia and according to the Hardy-Weinberg law, a locus with two alleles, A and a, has three possible genotypes: AA, Aa and aa. If the frequencies of these alleles are, respectively, p and q, then the frequencies of the possible genotypes in the population are given by the expansion of the binomial $(p+q)^2$, as follows (Falconer, 1989):
$$p^2\,(AA) + 2pq\,(Aa) + q^2\,(aa) = 1$$
Thus, the proportion of heterozygotes H would be 2pq or, equivalently, 1 minus the proportion of homozygotes, that is:
$$H = 1 - (p^2 + q^2)$$
However, where there are multiple alleles, as in the case of microsatellites, the proportion of heterozygotes would be:
$$H = 1 - \sum_{i} p_i^2$$
where:
$p_i$ represents the relative frequency of the i-th allele
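The multi-allele formula can be checked numerically. The sketch below applies it to the three allelic frequencies reported for RM17 in cycle 0 (Table 4). The optional small-sample correction, 2n/(2n - 1), is the unbiased estimator of Nei (1978); applying it with n = 87 plants gives a value close to the He reported for this locus in Table 5, which suggests, but does not prove, that the authors used this estimator. The function name is hypothetical.

```python
def expected_heterozygosity(freqs, n=None):
    """Expected proportion of heterozygotes, 1 - sum(p_i^2).

    If n (number of diploid individuals) is given, apply the
    small-sample correction 2n / (2n - 1) of Nei (1978).
    """
    h = 1.0 - sum(p * p for p in freqs)
    if n is not None:
        h *= 2 * n / (2 * n - 1)
    return h

# Allelic frequencies of RM17 in cycle 0 (Table 4)
rm17_cycle0 = [0.3563, 0.0977, 0.5460]

print(round(expected_heterozygosity(rm17_cycle0), 4))        # uncorrected
print(round(expected_heterozygosity(rm17_cycle0, n=87), 4))  # with correction

# With two alleles the formula reduces to 2pq
p, q = 0.3, 0.7
print(round(expected_heterozygosity([p, q]), 4), round(2 * p * q, 4))
```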
For recurrent selection programmes for rice, the harvesting of only male-sterile plants in the recombination stage is expected to induce panmixia. If, for some reason, this does not occur, then, because rice is an autogamous plant, we should observe an increase in homozygous genotypes at the expense of heterozygous ones. To evaluate whether this assumption was fulfilled, we used Wright's inbreeding coefficient f (Hartl, 1987), which measures the reduction in the relative frequency of heterozygotes compared with that expected under panmixia. It is calculated as:
f = (He - Ho)/He
where:
the expected heterozygosity (He) is calculated on the basis of allelic frequencies, and
the observed heterozygosity (Ho) is determined by direct counting of heterozygotes in the population
If Ho equals He, as expected under panmixia, the coefficient f takes the value 0, indicating that the population is in equilibrium. The value of f can lie between -1 and +1. An excess of heterozygotes gives a value of f less than 0, whereas a deficit of heterozygotes gives a value of f greater than 0. These last two cases indicate an absence of random crossing.
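As a worked illustration of the formula, the sketch below computes f from the He and Ho values reported in Table 5 for two loci in cycle 0. The function name is hypothetical. The results are close to, but not identical with, the f values in Table 5, presumably because the software used a slightly different estimator; that explanation is an assumption.

```python
def inbreeding_coefficient(h_exp, h_obs):
    """Wright's f = (He - Ho) / He."""
    if h_exp <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return (h_exp - h_obs) / h_exp

# He and Ho for cycle 0, taken from Table 5
f_rm17 = inbreeding_coefficient(0.5687, 0.4942)
f_rm102 = inbreeding_coefficient(0.2252, 0.0869)

print(round(f_rm17, 3))   # small deficit of heterozygotes
print(round(f_rm102, 3))  # large deficit of heterozygotes

# Reference points: f = 0 when Ho equals He, and f < 0 with excess heterozygotes
print(inbreeding_coefficient(0.5, 0.5), inbreeding_coefficient(0.5, 1.0))
```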
Table 5. Expected (He) and observed heterozygosity (Ho), inbreeding coefficient (f) and fulfilment of the Hardy-Weinberg equilibrium (H-W) for 10 microsatellite markers (SSRs) in cycles 0 and 1 of rice population PFD-1.
SSR | Cycle 0: n (a) | Cycle 0: Ho | Cycle 0: He | Cycle 0: f | Cycle 0: H-W (b) | Cycle 1: n (a) | Cycle 1: Ho | Cycle 1: He | Cycle 1: f | Cycle 1: H-W (b)
RM102 | 92 | 0.0869 | 0.2252 | 0.6153 | ** | 85 | 0.0352 | 0.1217 | 0.7113 | **
RM263 | 78 | 0.3333 | 0.7212 | 0.5394 | ** | 90 | 0.3444 | 0.7655 | 0.5514 | **
RM143 | 78 | 0.2564 | 0.3932 | 0.3494 | ** | 56 | 0.0714 | 0.1198 | 0.4062 | *
RM185 | 84 | 0.2976 | 0.3990 | 0.2552 | ** | 90 | 0.2000 | 0.2948 | 0.3227 | *
RM111 | 69 | 0.5072 | 0.7534 | 0.3192 | ** | 49 | 0.4082 | 0.7029 | 0.4219 | **
RM296 | 76 | 0.3552 | 0.6557 | 0.4598 | ** | 85 | 0.4588 | 0.6764 | 0.3229 | **
RM228 | 86 | 0.5232 | 0.6989 | 0.2525 | ** | 39 | 0.6154 | 0.8365 | 0.2669 | **
RM224 | 86 | 0.2674 | 0.8292 | 0.6787 | ** | 30 | 0.7333 | 0.8435 | 0.1325 | *
RM235 | 32 | 0.5000 | 0.8219 | 0.3955 | ** | 77 | 0.5065 | 0.7992 | 0.3677 | **
RM17 | 87 | 0.4942 | 0.5687 | 0.1315 | * | 76 | 0.5000 | 0.5672 | 0.1191 | ns
Average | 76.8 | 0.3621 | 0.6056 | 0.4038 | ** (c) | 67 | 0.3873 | 0.5727 | 0.3623 | ** (c)
CI (95%) | | | | 0.3030 to 0.5122 | | | | | 0.2300 to 0.4310 |
a. The number of individuals evaluated.
b. Where ns is not significant; * is significant at 5%; and ** is significant at 1%.
c. Global test.
From the above, we can assume that, to determine the inbreeding coefficient, we would need a methodology to determine the allelic frequencies of a given pair of genes, and thus identify the heterozygous and homozygous genotypes. The SSR markers offer this possibility.
We examined whether the procedure followed for the recombination stage permitted random crossing between plants of population PFD-1. With the results described above, the frequency of each allele was estimated for each SSR in cycles 0 and 1, as was the frequency of heterozygotes observed at each locus. The results presented in Table 5 show that only for RM17 in cycle 1 was the value of Ho statistically equal to that of He, and its value for f was the smallest and closest to the expected value of 0.
For the other SSR loci, fewer heterozygotes than expected were observed. For cycle 0, the average observed heterozygosity was 0.3621, well below the average expected heterozygosity (0.6056).
In cycle 1, the value of Ho was a little higher (0.3873) than in cycle 0 but still less than for He (0.5727). The reasons why both values of Ho were different to those of He are the same for both cycles. The values for f in both cycles were statistically similar, as indicated by the confidence interval (CI) generated by bootstrapping (Lewis and Zaykin, 2001).
To determine if these differences were statistically significant, a U exact test was carried out on the deficit of heterozygotes (Raymond and Rousset, 1995), using the GENEPOP software program. The null hypothesis was the fulfilment of the Hardy-Weinberg equilibrium, and the alternative hypothesis, a deficit of heterozygotes.
The results obtained (Table 5) corroborated the occurrence of a deficit of heterozygotes in both cycles and at every locus except RM17 in cycle 1. The deficit of observed heterozygotes may have been a consequence of one or several factors occurring simultaneously, that is:
Existence of preferential crosses
Few recombination plantings, as in the case of population PFD-1, which would need to be measured to discover if more recombination cycles are needed
Existence of a certain degree of endogamy, which itself could be caused by:
- Errors in identifying male-sterile plants in the field. Such identification depends on observing panicle exsertion and anther colouring at the beginning of flowering.
- Presence of grains of viable pollen in the male-sterile plant. Singh and Ikehashi (1981), when identifying and characterizing the male-sterility gene in IR36, mentioned an occurrence of 4% of fertile pollen. No other known studies establish the possible effect of environment on the expression of the male-sterility gene. However, the value reported is low and would not, in itself, explain the deficit of observed heterozygotes.
- A problem of pollination, which is reflected in panicles with few seeds and which could be easily improved by using field techniques such as shaking fertile plants over sterile ones once flowering begins.
Recombination between genes
Another consequence of panmixia is the random recombination between genes. That is, at the moment gametes are formed, the alleles of a gene could appear in distinct combinations with the alleles of another gene in proportions given by the product of the allelic frequencies of the loci. The genes, in this case, would be considered as being in linkage equilibrium.
Such random recombination is reached more slowly where genes are linked, or where physical or physiological conditions favour some gametes over others. In both cases, some gametes will occur in higher proportions than expected from the allelic frequencies at each locus, and the genes are then considered to be in linkage disequilibrium. In contrast to the Hardy-Weinberg equilibrium, which can be obtained within one generation of panmixia, linkage equilibrium requires numerous generations, depending on the type and intensity of factors causing the original linkage disequilibrium in the population (Hartl, 1987).
Table 6. Test for detecting linkage disequilibrium for 10 SSRs in cycle 0 (grey background) and cycle 1 (white background) of rice population PFD-1. (a)
Chrom. | SSR | RM102 (1) | RM263 (2) | RM143 (3) | RM185 (4) | RM111 (6) | RM296 (9) | RM228 (10) | RM224 (11) | RM235 (12) | RM17 (12)
1 | RM102 | - | ns | ns | ns | ns | ns | ns | ns | ns | ns
2 | RM263 | ns | - | ns | ns | ** | ns | ns | ns | ns | ns
3 | RM143 | ns | ns | - | ns | ns | ns | ns | ns | ns | ns
4 | RM185 | ns | ns | ns | - | ns | ns | * | ns | ns | ns
6 | RM111 | ns | * | ns | ns | - | ns | ns | ns | ns | ns
9 | RM296 | ns | ns | ns | ns | ns | - | ns | * | ** | ns
10 | RM228 | ns | ns | ns | ns | ns | ns | - | ns | ns | ns
11 | RM224 | ns | ns | ns | ns | ns | ns | ns | - | ns | ns
12 | RM235 | ns | ** | ns | ns | ns | ns | ns | ns | - | ns
12 | RM17 | ns | ns | * | ns | ns | ns | ns | ns | ns | -
a. Statistical values are ns for not significant; * for significant at 5%; and ** for significant at 1%. Chromosome numbers are given in parentheses in the column headings.
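Theoretically, linkage disequilibrium for two given loci (A and B), with two alleles each (A, a; B, b), is quantified through the D index, which is calculated as the difference between the product of the frequencies of the parental gametes (AB, ab) and the product of the frequencies of the recombinants (Ab, aB), as follows:
$D = P_{11}P_{22} - P_{12}P_{21}$
where:
$D$ is the linkage disequilibrium
$P_{11}$ is the frequency of gamete AB, which equals the product $f_A f_B$ under linkage equilibrium
$P_{22}$ is the frequency of gamete ab, which equals the product $f_a f_b$ under linkage equilibrium
$P_{12}$ is the frequency of gamete Ab, which equals the product $f_A f_b$ under linkage equilibrium
$P_{21}$ is the frequency of gamete aB, which equals the product $f_a f_B$ under linkage equilibrium
and $f$ denotes the frequency of the allele indicated. The value of D will tend to be 0 where the product of the frequencies of the parental gametes is similar to that of the recombinants.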
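As a minimal worked illustration of the D index, the sketch below computes D from four gametic frequencies. The function name, argument names and the numerical frequencies are hypothetical and chosen only for the example; they are not data from population PFD-1.

```python
def linkage_disequilibrium(p_AB: float, p_Ab: float, p_aB: float, p_ab: float) -> float:
    """Return D = P(AB) * P(ab) - P(Ab) * P(aB) for two biallelic loci."""
    total = p_AB + p_Ab + p_aB + p_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("gametic frequencies must sum to 1")
    return p_AB * p_ab - p_Ab * p_aB


# Case 1: gametes formed in proportion to allelic frequencies
# (f_A = 0.6, f_B = 0.3), so D should be essentially zero.
d_equilibrium = linkage_disequilibrium(0.18, 0.42, 0.12, 0.28)

# Case 2: an excess of parental gametes AB and ab (f_A = f_B = 0.5),
# so D is clearly positive.
d_excess = linkage_disequilibrium(0.40, 0.10, 0.10, 0.40)

print(f"D at equilibrium:       {d_equilibrium:.4f}")
print(f"D with parental excess: {d_excess:.4f}")
```

In the second case D is 0.15, which is also the amount by which the frequency of gamete AB (0.40) exceeds the product of the allelic frequencies (0.5 x 0.5 = 0.25).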
Linkage disequilibrium implies a greater proportion of some gametic combinations, which in turn modifies the genotypic classes and frequencies observed within a population. However, even if all the expected gametes are formed, non-random crossing may still disturb linkage equilibrium, whether through preferential crosses (possibly caused by incompatibility) or through lack of opportunity where more recombination plantings would have been needed. In either case, certain genotypes would appear more frequently in the population.
To detect whether linkage disequilibrium occurred in cycles 0 and 1 of population PFD-1, we used the genotypic frequencies obtained for each SSR to test for linkage disequilibrium in each pairwise combination of loci through Fisher's exact tests, using the GENEPOP software program (Raymond and Rousset, 1995). The program starts from the premise that, if the genotypic frequencies at one locus are independent of those at the other locus, then both loci are in linkage equilibrium. Accordingly, the program generated a contingency table for every possible pair of loci and, through a Fisher's exact test, tested whether the two loci were independent (null hypothesis) or not (alternative hypothesis).
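The logic of an exact test of independence can be sketched for the simplest case, a 2 x 2 contingency table. This is a simplified illustration only: GENEPOP works on full genotypic contingency tables, which are generally larger than 2 x 2, and its algorithm is not reproduced here. The function name and the counts used are hypothetical.

```python
from math import comb


def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (a small tolerance guards against floating-point ties).
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Hypothetical counts: rows are genotype classes at locus 1,
# columns are genotype classes at locus 2.
print(f"weak association:   p = {fisher_exact_2x2(3, 1, 1, 3):.4f}")
print(f"strong association: p = {fisher_exact_2x2(10, 0, 0, 10):.6f}")
```

A large p-value means that independence of the two loci (the null hypothesis) is not rejected, which corresponds to an "ns" entry in a table of pairwise tests.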
Table 6 gives the statistical significance of the detection test, showing that, for the great majority of locus combinations, no linkage disequilibrium was detected in either cycle. This indicates that the alleles of the different loci appeared in gametic combinations in proportions determined by their allelic frequencies. Notably, SSRs RM235 and RM17, both located on chromosome 12, showed no linkage disequilibrium in either cycle. These results indicate that all alleles of a given locus had equal opportunities of combining with the alleles of the other locus. Most of the SSRs used are located on different chromosomes, so physical linkage had no effect. Hence, we rejected the idea that, during recombination planting, some matings did not occur, whether through incompatibility or lack of opportunity because of a low number of recombination generations.
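One way to read a table of many pairwise tests is to compare the number of significant results with the number expected by chance alone. The sketch below transcribes the entries of Table 6 (with "-" marking the diagonal) and makes that comparison; it is an illustrative check added here, not an analysis reported by the authors, and it does not distinguish between the two cycles.

```python
ssrs = ["RM102", "RM263", "RM143", "RM185", "RM111",
        "RM296", "RM228", "RM224", "RM235", "RM17"]

# Rows and columns follow the order of `ssrs`; "-" marks the diagonal.
rows = [
    "-  ns ns ns ns ns ns ns ns ns",
    "ns -  ns ns ** ns ns ns ns ns",
    "ns ns -  ns ns ns ns ns ns ns",
    "ns ns ns -  ns ns *  ns ns ns",
    "ns *  ns ns -  ns ns ns ns ns",
    "ns ns ns ns ns -  ns *  ** ns",
    "ns ns ns ns ns ns -  ns ns ns",
    "ns ns ns ns ns ns ns -  ns ns",
    "ns ** ns ns ns ns ns ns -  ns",
    "ns ns *  ns ns ns ns ns ns -",
]
cells = [row.split() for row in rows]

# Every off-diagonal cell is one test (45 locus pairs x 2 cycles).
tests = [
    (ssrs[i], ssrs[j], cells[i][j])
    for i in range(len(ssrs))
    for j in range(len(ssrs))
    if i != j
]
significant = [t for t in tests if t[2] in ("*", "**")]
expected_by_chance = 0.05 * len(tests)

print(f"tests performed:            {len(tests)}")
print(f"significant at 5% or 1%:    {len(significant)}")
print(f"expected by chance at 5%:   {expected_by_chance:.1f}")
```

Seven of the 90 tests are significant, against about 4.5 expected by chance at the 5% level if no disequilibrium existed. Because the tests are not fully independent, this comparison is only a rough guide, but it is in line with the conclusion that there is no general linkage disequilibrium.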
Recurrent selection applied to autogamous crops such as rice poses a challenge to plant breeders. The future success of population improvement depends on the fulfilment of certain conditions. For this reason, and in view of the medium- and long-term benefits of recurrent selection programmes, plant breeders must diligently fulfil these conditions: high initial genetic variability, maintenance of this variability over time, and panmixia in recombination plantings. Molecular markers provide a useful and appropriate tool for verifying them.
This study verified the high genetic variability of cycle 0 of population PFD-1 and the maintenance of this variability after one selection cycle. However, the study also revealed a drawback in the recombination stage: the apparent occurrence of some degree of autogamy. Such results call for a study of the possible causes so that plant breeders in charge of improving population PFD-1 through recurrent selection may take the necessary measures to correct the situation.
References
Brondani, C.; Brondani, R.P.V.; Rangel, P.H.N. & Ferreira, M.E. 2001. Development and mapping of Oryza glumaepatula - derived microsatellite markers in the interspecific cross Oryza glumaepatula x O. sativa. Hereditas, 134: 59-71.
Falconer, D.S. 1989. Introduction to quantitative genetics. 3rd ed. New York, Longman Scientific & Technical. 438 pp.
Ferreira, M.E. & Grattapaglia, D. 1998. Introdução ao uso de marcadores moleculares em análise genética. 3rd ed. Brasília, EMBRAPA. 220 pp.
Hartl, D. 1987. A primer of population genetics. Sunderland, MA, USA. Sinauer Associates Publishers. 305 pp.
Lewis, P. & Zaykin, D. 2001. Genetic data analysis: computer program for the analysis of allelic data, version 1.0 (1d16c). (Available free at http://lewis.eeb.uconn.edu/lewishome/software.html)
Nei, M. 1978. Estimation of average heterozygosity and genetic distance from small number of individuals. Genetics, 89: 583-590.
Ortiz, A.; Ramis, C.; Parra, P.; Díaz, A. & López, L. 2002. Patrones isoenzimáticos de variedades de arroz y arroces rojos en Venezuela. Rev. Fac. Agr. (Maracay), 28: 117-130.
Panaud, O.; Chen, X. & McCouch, S.R. 1996. Development of microsatellite markers and characterization of simple sequence length polymorphism (SSLP) in rice (Oryza sativa L.). Mol. Gen. Genet., 252: 597-607.
Pinto, L.R. 2001. Genetic structure of maize (Zea mays L.) populations BR-105 and BR-106 and their synthetics IG-3 and IG-4 by microsatellite markers. Piracicaba, SP, Brazil, Escola Superior de Agricultura Luiz de Queiroz. 142 pp. (PhD dissertation)
Raymond, M. & Rousset, F. 1995. Genepop (version 1.2): population genetics software for exact tests and ecumenicism. J. Hered., 86: 243-249.
Rodríguez, N. 2001. Evaluación de la erosión cualitativa de la semilla de arroz (Oryza sativa L.) en el sistema de producción de semillas certificadas en Portuguesa. Maracay, Facultad de Agronomía, Universidad Central de Venezuela. 65 pp. (MSc thesis)
Singh, R.J. & Ikehashi, H.I. 1981. Monogenic male sterility in rice: induction, identification and inheritance. Crop Sci., 21(2): 286-289.
Sneath, P. & Sokal, R. 1973. Numerical taxonomy. San Francisco, CA, USA, Freeman and Company. 573 pp.
Temnykh, S.; Park, W.D.; Ayres, N.; Cartinhour, S.; Hauck, N.; Lipovich, L.; Cho, Y.G.; Ishii, T. & McCouch, S.R. 2000. Mapping and genome organization of microsatellite sequences in rice (Oryza sativa L.). Theor. Appl. Genet., 100: 697-712.
Velásquez, R. 2001. Control genético de tres sistemas isoenzimáticos en arroz (Oryza sativa L.). Maracay, Facultad de Agronomía, Universidad Central de Venezuela. 85 pp. (Promotion paper)
[5] Faculty of Agronomy, Universidad Central de Venezuela.
[6] Department of Genetics and Molecular Biology, Universidade Estadual de Campinas (UNICAMP), currently at Universidade Federal Rural de Pernambuco, Rua Dom Manoel de Medeiros, s/n - Dois Irmãos, Fortaleza, Pernambuco, Brazil.
[7] Embrapa Arroz e Feijão, currently at FAO, Viale delle Terme di Caracalla, 00100 Rome.
[8] Fundación DANAC, Apdo. Postal 182, San Felipe, Estado Yaracuy, Venezuela (R.I.P.).