

MODELLING TSETSE AND TRYPANOSOMIASIS IN AFRICA

David J. Rogers

INTRODUCTION

There are two quite different problems and two quite different solutions in developing models for tsetse and trypanosomiasis in Africa.

The different problems are those of distribution on the one hand and abundance (or prevalence) on the other. For some activities, such as planning a control campaign where the same solution is to be applied throughout the affected area, only accurate information on distribution is required. For others, such as determining the amount of control or intervention required on a regional basis, abundance or prevalence information is also required. We will see that it is generally easier to model distribution than it is to model abundance.

The different solutions involve either a statistical or a biological approach. The statistical approach is familiar to epidemiologists concerned with modelling diseases whose aetiology is poorly understood. Factors assumed to be risk indicators are monitored (e.g. proximity to a nuclear power station may be a risk indicator of childhood leukaemia) and statistical analysis later looks for correlations between the disease and the risk indicators. Such an approach was eventually responsible for establishing the causal links between smoking and cancer, although the jump from ‘x is correlated with y’ to ‘x causes y’ needs a great deal of experimental evidence, which is often lacking.

The biological approach is adopted when the determinants of disease risk, and their interactions, are clearly understood. This approach is always preferable to the statistical approach because a clear understanding of disease transmission, within a model framework, is able to predict changes in disease distribution and abundance much more accurately than the statistical approach. Furthermore whilst the statistical approach may give satisfactory predictions of the future when the future is the same as the past, it becomes progressively less satisfactory when the future is different from the past. Given the increasingly rapid rate of environmental change the statistical approach's predictions are likely to be less and less satisfactory.

Unfortunately we seldom have enough information, except from very few sites, on which to base a biological model and so we are forced to adopt a statistical approach. In doing so we may choose from a range of statistical methodologies that differ in their accuracy and the amount of light they throw on the biological problems. In general we find that the most accurate statistical methods are often the least illuminating biologically, whilst the biologically illuminating methodologies do not give quite so good a fit to the field data. It is up to the end-user of the information to decide whether he or she wants effective support for a control operation (in which case the most accurate prediction is required) or further insight into the dynamics of disease transmission (in which case biologically useful predictions are required). We must always guard against the problem that there probably exists a statistical method which will predict with 100% accuracy a map we know to be at least partly inaccurate. Statistical accuracy should not therefore be used as the sole criterion for choosing one analytical method over another. The role of the biologist here remains crucial.

DISTRIBUTION AND ABUNDANCE

All biological systems consist of a series of input processes and output processes. We can regard the input processes (e.g. birth, immigration or infection) as contributing to an increase (in number per unit area, or in disease prevalence) and output processes (death, emigration, recovery) as contributing to a decrease. It is unlikely in the extreme that the sum total of all the input processes, by chance, will equal the sum total of all the output processes. Therefore there will always be a net tendency for change (for decrease or increase). If the net change is always negative the population, or disease prevalence, will fall to extinction. We can therefore define the limits of a vector, or the disease it transmits, as the points at which, on an annual basis, the net change is, on average, zero. This is an extremely powerful approach, firmly based on a biological interpretation of events, which we have used in the past to define the pan-African distribution of tsetse flies from biological field data collected at only two field sites, in Nigeria and Zambia (Rogers 1979).

If the net change is positive, however, the vector population, or the disease, will increase. Clearly this increase is not without limits and so some other factor or factors must come into play to change a net tendency to increase at low densities to one of no net change at high densities. These factors are therefore introducing regulation or negative feed-back into the system and this feed-back stabilises the components of the biological system around its characteristic equilibrium level. Negative feed-back processes in ecology are sometimes called ‘density dependent’ and they are as much a part of disease transmission models (Dietz 1988) as they are of vector population models (Rogers 1990). As explained elsewhere, the characteristics of the negative feed-back processes determine both the equilibrium level itself and the stability of that equilibrium in rather subtle ways (Rogers 1983).

It is clearly vitally important to include these regulatory factors in biological models of both vectors and diseases, but it is also essential to try to incorporate them into the statistical analyses of vector and disease abundance. If, for example, a predatory insect is regulating the abundance of tsetse over part of its range, the statistical approach requires some estimate of predator presence and/or abundance as one of its input predictor variables. Failing this it needs another variable as a surrogate for the predator variable. Since we so rarely know precisely what is regulating vector abundance in the wild, and since we even more rarely measure the causative agents, we can only guess at an appropriate surrogate variable within our data-base. This means that predictions of vector abundance, or disease prevalence, are going to be much more difficult than predictions of presence or absence of either. This is especially the case when the regulatory factors themselves change throughout the distributional range of the vectors and diseases.
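The argument about net change, range limits and regulation can be written compactly in the k-value notation used later in the Fig. 2 legend. The block below is only a sketch under stated assumptions: the symbols N_t, \lambda, k_{di}, k_{dd}, b and T are introduced here for illustration, and the linear form of the density dependent term is an assumed simplification, not necessarily the form used in the cited models.

```latex
% Assumed notation (not from the original text):
%   N_t      density (or prevalence) in month t
%   \lambda  maximum monthly rate of increase (the input processes)
%   k_{di}   density independent mortality as a k-value (log10 units)
%   k_{dd}   density dependent mortality as a k-value
%   T, b     threshold density and slope of the density dependent relationship
\begin{align}
\log_{10} N_{t+1} &= \log_{10} N_t + \log_{10}\lambda - k_{di} - k_{dd}(N_t) \\
\overline{\log_{10}\lambda - k_{di}} &= 0
  \quad \text{(annual average at the distributional limit, with } k_{dd} = 0\text{)} \\
k_{dd}(N) &= b\,(\log_{10} N - \log_{10} T) \quad \text{for } N > T \\
\log_{10} N^{*} &= \log_{10} T + \frac{\log_{10}\lambda - k_{di}}{b}
\end{align}
```

Under these assumptions the last line shows why abundance is harder to predict than presence or absence: the limit of distribution depends only on the density independent terms, whereas the equilibrium level N* also depends on the threshold T and slope b of a regulatory process that is rarely measured. In this simplified linear form a small departure from N* shrinks each month only if 0 < b < 2.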

In order to illustrate these points we give four examples, two statistical and two biological: of the distribution of Glossina morsitans in Kenya and Tanzania, of the abundance of G. palpalis in Nigeria, of the prevalence of trypanosomiasis among cattle in Togo and of the incidence and prevalence of trypanosomiasis in cattle in The Gambia.

THE DISTRIBUTION OF GLOSSINA MORSITANS IN KENYA AND TANZANIA

This example uses a statistical approach to model the distribution of G. morsitans in part of its range. We use the standard statistical technique of linear discriminant analysis (available in many multi-variate software packages, e.g. SPSS) and a training set of between 1 and 5% of the data set to ‘train’ the analysis to distinguish between sites that are suitable and unsuitable for this species. The data set contains a number of data layers: the distributional data from Ford & Katondo's tsetse distribution map (Ford and Katondo 1977); elevation; ground-based meteorological records (temperature means, minima, and maxima); and finally the annual mean, minimum and maximum Normalised Difference Vegetation Indices (NDVI) derived from the Advanced Very High Resolution Radiometer on board the NOAA series of meteorological satellites. Certain variables we might wish, from the biological point of view, to be in this analysis (e.g. rainfall, saturation deficit) were not available in the data set.

The analysis decides which is the single most important variable by determining how well each variable in turn is able to distinguish between areas of fly presence and absence. The variable that gives the best separation (within the analysis this is indicated by the Mahalanobis distance, which allows for the unequal variances of the variables, and their co-variances) is chosen as being the most important. The second most important variable is chosen from the remaining variables in a similar way; and so on.

At each stage of the analysis it is possible to produce a map of the prediction of areas of suitability and unsuitability, and this is colour-coded on the computer screen. The % correct predictions (i.e. both of presence and absence), the % false positives (an incorrect prediction of presence), the % false negatives (an incorrect prediction of absence), the sensitivity (ability to predict positives correctly) and the specificity (ability to predict negatives correctly) are calculated for each map.

It is then possible to compare the increased discriminating ability of the statistical analysis (i.e. as indicated by the increasing value of the Mahalanobis distance) with the result of its spatial predictions in map form. It is important to realise that the analysis itself is not spatial in character, i.e. the fate of any particular pixel depends only upon the values of its predictor variables and not upon the fate of adjacent pixels. What is therefore quite remarkable is that the maps that are produced from the analysis make a great deal of sense spatially. The best fit to the distributional data is shown here as Fig. 1.
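The stepwise choice of variables described above can be sketched in a few lines of Python. This is an illustration only, not the software used for the analysis: the data are synthetic, the three predictor columns are hypothetical stand-ins, and the helper names (`mahalanobis_sq`, `forward_select`) are invented here. At each step the sketch adds the variable that gives the largest squared Mahalanobis distance between the presence and absence groups, using a pooled covariance matrix.

```python
import random

def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def scatter_matrix(rows, mean):
    """Sum of outer products of deviations from the group mean."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - mean[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                s[a][b] += d[a] * d[b]
    return s

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def mahalanobis_sq(present, absent):
    """Squared Mahalanobis distance between group means (pooled covariance)."""
    m1, m2 = mean_vector(present), mean_vector(absent)
    s1, s2 = scatter_matrix(present, m1), scatter_matrix(absent, m2)
    dof = len(present) + len(absent) - 2
    p = len(m1)
    pooled = [[(s1[a][b] + s2[a][b]) / dof for b in range(p)] for a in range(p)]
    diff = [u - v for u, v in zip(m1, m2)]
    return sum(d * w for d, w in zip(diff, solve(pooled, diff)))

def forward_select(present, absent, n_vars):
    """Add, one at a time, the variable giving the largest distance so far."""
    chosen, steps = [], []
    remaining = list(range(n_vars))
    while remaining:
        def score(j):
            cols = chosen + [j]
            return mahalanobis_sq([[r[c] for c in cols] for r in present],
                                  [[r[c] for c in cols] for r in absent])
        best = max(remaining, key=score)
        steps.append((best, score(best)))
        chosen.append(best)
        remaining.remove(best)
    return steps

# Synthetic training pixels (hypothetical): column 0 separates the groups
# strongly, column 1 weakly, column 2 is pure noise.
rng = random.Random(0)
present = [[rng.gauss(2.0, 1.0), rng.gauss(0.5, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
absent = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
steps = forward_select(present, absent, 3)
for variable, distance in steps:
    print(variable, round(distance, 3))
```

The distance can only stay the same or grow as variables are added, which is why it has to be read alongside the map and the error rates rather than on its own. Note that each pixel is scored from its own predictor values only, so nothing in the calculation is spatial.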

We have now applied this method of analysis to several examples, including both tsetse and tick species, and can draw the following conclusions:

(1) Vector distributions are rather sensitive to even small changes in environmental conditions. The mean temperature difference between areas of tsetse presence and absence in Kenya and Tanzania is only about 0.3 to 0.4 °C. Analyses of this sort will be able to define the degree of accuracy required of Global Climate Models (GCMs) if they are to be of any use to biologists attempting to make predictions of the impact of global change on vector distributions. Other (non-temperature) limiting factors are likely to have similarly dramatic effects on other vector species, requiring a degree of sophistication of GCM predictions which is at present lacking.

(2) Environmental conditions vary geographically and the limiting factor for any particular vector species may change from place to place. Whilst the major limiting factor for G. morsitans in Kenya and Tanzania is a vegetation index, in Zimbabwe it is a temperature variable (in Zimbabwe low temperatures are known to limit fly distribution).

(3) The analysis suggests that near the edge of the range of G. morsitans (i.e. in Zimbabwe) a single environmental variable (temperature) appears to determine fly distribution and the other predictor variables do not significantly improve the fit to the data set. Within the distributional limits of the same species (i.e. in Kenya and Tanzania), however, more than one variable (NDVI, temperature and elevation) makes a major contribution to the observed distribution. Thus it appears that distributional limits are characterised by single variables whilst patches within distributional limits are characterised by several variables (the same conclusions apply to the brown ear tick Rhipicephalus appendiculatus in the same three countries (Rogers and Randolph 1993)).

(4) There are generally far fewer false negative than false positive results, suggesting that whilst the analysis has correctly identified the major environmental constraints, the present distribution maps may inadequately represent the actual distribution of vectors. False positive areas should be carefully investigated in the future, and may reveal the presence of vectors at low density, or in previously unexplored areas. At the very least, they represent ‘ecological corridors’ along which vectors might be expected to move into new areas. A good example of this occurred when a recent mapping exercise with Dr. Tim Robinson of the IPMI, Zambia, indicated the suitability for G. m. morsitans of part of the Zambezi Valley in Mozambique that is shown to be morsitans-free on the Ford & Katondo map. William Shereni of the Zimbabwe Tsetse Control Department informs us that G. morsitans is now to be found in this area.

(5) The technique of discriminant analysis may easily be applied to sub-species of tsetse, to determine whether each sub-species is responding to environmental conditions in the same way. Preliminary results with Tim Robinson's data from the RTTCP common fly belt show quite clear separation between areas infested by G. m. centralis and G. m. morsitans, suggesting a real difference between these two sub-species that is sometimes denied by taxonomists.

(6) Linear discriminant analysis is a simple multi-variate technique whose statistical procedures and biological significance are transparent to the user in ways that those of other models for analysing vector distributions are not. It is also just as accurate. When the technique is compared with CLIMEX (a detailed climate matching model that requires for each species the estimation of one growth index and four stress indices (Sutherst and Maywald 1985)) discriminant analysis gives predictions at least as good as those of CLIMEX (Rogers and Randolph 1993). The rather arbitrary nature of the stress indices in CLIMEX, and their combination to provide a single ecoclimatic index of suitability, make the interpretation of CLIMEX's predictions somewhat obscure. Linear discriminant analysis may be improved upon by adopting non-linear discriminant criteria, tree-based classification or other more sophisticated techniques (G.J. Staton et al., unpublished), each requiring a combination of the skills of the mathematician and the insight and common sense of the biologist.

THE POPULATION DYNAMICS OF GLOSSINA PALPALIS IN NIGERIA

G. palpalis was sampled continuously in Katabu, Nigeria, for very many years until its eventual decline to extinction was brought about by habitat destruction by humans (Onyiah 1978). An analysis of this data set before the decline set in allows us to build a biological model for this tsetse species that shows the relative importance of density dependent and density independent mortalities (Rogers 1990). We adopt an approach using Moran curve analysis of the fly-round data (for details see Rogers 1979), which allows us to extract from the data an estimate of the density independent mortality operating on this species each month. This density independent mortality is usually due to climate killing either puparia or adults or both, and causes a reduction in the population rate of increase from one month to the next.

One might imagine that if we put into a tsetse population model a constant rate of increase and the variable, density independent mortality calculated from the data set we would expect to be able to model the population satisfactorily. Fig. 2a shows that this is not the case. The population begins to increase from a low level and, although it shows seasonal changes in the rate of increase (this is best seen on the graph of puparial numbers in Fig. 2), it continues to increase well beyond the observed equilibrium level. The reason for this is that we have, as yet, not included any negative feed-back (i.e. density dependence) on either the adult or puparial populations. The effects of doing this are shown in Figs. 2b, 2c and 2d. Fig. 2b shows the best fit to the data, with the parameter values given in the legend. In Fig. 2c the amount of negative feed-back on puparial numbers has been increased, whilst in Fig. 2d the amount of density dependence on the adults is increased. Notice that increased density dependence stabilises that section of the population (puparia or adults) that it acts on directly, and all other stages indirectly. Clearly biological models can have too much density dependence (Fig. 2c and 2d), or too little of it (Fig. 2a), or just about the right amount (Fig. 2b). We do not yet know if the amount of density dependence in the real population at Katabu was about the same as the model predicts, and we shall now never know. But this approach shows how a biological model, in much the same way as the statistical model outlined in the previous section, can provide ‘guesstimates’ of crucial parameter values that can be tested in future field work.
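The contrast between an unregulated and a regulated population can be reproduced with a deliberately simplified, single-stage sketch. It is not the model of Rogers (1990), which treats puparia and adults separately. The monthly rate of increase, the seasonal k-values, the threshold and the slope below are illustrative assumptions, and the threshold-plus-slope form of the density dependent k-value is assumed from the description in the Fig. 2 legend.

```python
import math

def density_dependent_k(n, threshold, slope, minimum=0.0):
    """k-value (log10 mortality) that rises with log10 density above a threshold."""
    if n <= threshold:
        return minimum
    return minimum + slope * (math.log10(n) - math.log10(threshold))

def simulate(months, n0, log_rate, seasonal_k, threshold, slope):
    """Iterate log10 N(t+1) = log10 N(t) + log_rate - k_di(month) - k_dd(N)."""
    n = n0
    series = [n]
    for t in range(months):
        k_di = seasonal_k[t % len(seasonal_k)]          # climate-driven mortality
        k_dd = density_dependent_k(n, threshold, slope)  # negative feed-back
        n = 10 ** (math.log10(n) + log_rate - k_di - k_dd)
        series.append(n)
    return series

# Illustrative values only (not the fitted Katabu parameters).
SEASONAL_K = [0.02, 0.03, 0.04, 0.05, 0.06, 0.07,
              0.08, 0.07, 0.06, 0.05, 0.04, 0.03]
LOG_RATE = 0.10  # assumed constant monthly rate of increase, log10 units

# slope = 0 switches the density dependence off (compare Fig. 2a)
unregulated = simulate(120, 10.0, LOG_RATE, SEASONAL_K, threshold=200.0, slope=0.0)
# slope > 0 adds negative feed-back above the threshold (compare Fig. 2b)
regulated = simulate(120, 10.0, LOG_RATE, SEASONAL_K, threshold=200.0, slope=0.5)

print(round(unregulated[-1]), round(regulated[-1]))
```

With these assumed values the unregulated run keeps growing, with only seasonal changes in its rate of increase, while the regulated run settles into a seasonal cycle just above the threshold. Raising or lowering `slope` is the single-stage analogue of the comparisons in Figs. 2c and 2d.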

TRYPANOSOME INFECTIONS IN TOGO

The project “Lutte contre la trypanosomiase en vue du developpment agropastoral des zones liberees de l'onchocercose” (Project GCP/TOG/013/BEL) has sampled tsetse and trypanosomiasis in cattle in Togo on a one eighth of a degree grid square basis, and I am most grateful to Dr. Guy Hendrickx for allowing me to use the project data for analysis. Fig. 3 shows areas in Togo where the prevalence of Trypanosoma (= Duttonella) vivax in cattle exceeds 5%. These data were subjected to discriminant analysis in exactly the same way as the G. morsitans distribution data of the first example. In this case, however, the entire data set was included in the training sample. Fig. 3 shows areas where disease prevalence is predicted to be greater than the 5% threshold, using as predictors the suite of variables listed in the figure. Temperature, rainfall and a vegetation index derived measure of seasonality (NDampl, the amplitude of the annual cycle of vegetation growth) are the three most important predictor variables, followed by the numbers of G. morsitans and G. tachinoides in each grid square. The result, using all the listed variables, is a prediction which is 83% correct with a sensitivity and specificity each exceeding 80%. Thus the same method of statistical analysis applies equally well to both vectors and disease.
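The summary statistics printed on Fig. 3 (% false positives = 13.83, % false negatives = 2.89, sensitivity = 0.862, specificity = 0.825) can be checked against one another. The sketch below back-calculates the full classification table as percentages of all grid squares; the function name is invented for illustration and the only inputs are the figures quoted on the map.

```python
def table_from_summary(pct_false_pos, pct_false_neg, sensitivity):
    """Back-calculate the classification table (as % of all grid squares)."""
    # sensitivity = TP / (TP + FN)  =>  TP = sensitivity * FN / (1 - sensitivity)
    true_pos = sensitivity * pct_false_neg / (1.0 - sensitivity)
    # the four cells of the table must sum to 100%
    true_neg = 100.0 - true_pos - pct_false_pos - pct_false_neg
    specificity = true_neg / (true_neg + pct_false_pos)
    pct_correct = true_pos + true_neg
    return true_pos, true_neg, specificity, pct_correct

true_pos, true_neg, specificity, pct_correct = table_from_summary(13.83, 2.89, 0.862)
print(round(true_pos, 1), round(true_neg, 1), round(specificity, 3), round(pct_correct, 2))
```

The implied specificity agrees with the value printed on the figure, and the calculation shows that only about a fifth of the grid squares were above the 5% threshold. A high overall % correct therefore owes a good deal to the correctly predicted absences, which is why sensitivity and specificity are reported separately.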

SEASONAL TRYPANOSOME INFECTION IN N'DAMA IN THE GAMBIA

The final example is taken from published information on the incidence and prevalence of trypanosomiasis among both zebu and N'Dama cattle in The Gambia (Claxton, Leperre et al. 1992). This study has collected the most comprehensive data set on both the vectors and disease, and has been at pains also to estimate the degree of seasonal contact between vectors and hosts.

In one series of experiments herds of zebu and N'Dama cattle were grazed in the same areas and each animal was treated with Berenil on being clinically diagnosed as positive for trypanosomiasis. These data were interpreted as recording the incidence of disease in animals at risk. At the same time village herds of N'Dama cattle in the same general areas were surveyed and the trypanosome prevalence in them recorded. These animals were not routinely treated for trypanosomiasis. One striking observation from the comparison that can be made between these data sets is that the incidence of infection is greater than the prevalence - the reverse of the usual situation. One reason for this could be that the village herds developed and maintained a high degree of natural immunity whilst the experimental herds (both zebu and N'Dama) did not, because of their drug treatment. Results similar to these from the rest of The Gambia were used by Rawlings, Dwinger et al. (1991) to question the applicability of theoretical models of trypanosome epidemiology to field situations, and to point out that there are obviously additional, and perhaps more important, components in transmission in the case of trypanotolerant animals. The paper by Rawlings et al. is an excellent example of how we should use field data to confront our existing biological models, and to improve upon them.

Inspection of the results for Fugga (Claxton, Leperre et al. 1992) shows that whilst the incidence of trypanosomiasis in zebu peaks about one month after the peak in fly challenge (i.e. more or less as transmission models predict (Rogers 1988)) the incidence in N'Dama reaches a peak two months after peak challenge, whilst the prevalence in village herds of N'Dama peaks three months after peak challenge. The correlations between these variables, lagged by the appropriate numbers of months, are shown in Fig. 4.

The results suggest an incubation period of the disease within the N'Dama considerably longer than that in zebu, perhaps arising from the trypanotolerant animals' ability to depress the rate of increase of the trypanosome parasitaemias within their blood stream (a characteristic of this breed of animals). This result therefore suggests that the first modification required of a biological model developed for zebu and adapted for N'Dama should be the incorporation of a much longer incubation period of infection in the N'Dama hosts. A more general comment is that we may only need to tinker with parameter values in existing models to describe results such as those in The Gambia, rather than imagine we need entirely new models.
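The lagged comparison of challenge and infection can be sketched as follows. The monthly series are synthetic and hypothetical, not the Gambian data: the response is constructed to follow challenge by exactly two months, so the sketch only illustrates how a lag is recovered from paired monthly records.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lagged_correlations(challenge, response, max_lag):
    """Correlate response in month t with challenge in month t - lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = challenge[:len(challenge) - lag]  # drop the last `lag` months
        y = response[lag:]                    # drop the first `lag` months
        result[lag] = pearson(x, y)
    return result

# Hypothetical seasonal fly challenge over 26 months (arbitrary units).
base = [10.0 + 8.0 * math.sin(2.0 * math.pi * t / 12.0) for t in range(26)]
challenge = base[2:]                   # 24 months of challenge
response = [0.3 * v for v in base[:24]]  # infection index trailing by 2 months

correlations = lagged_correlations(challenge, response, max_lag=4)
best_lag = max(correlations, key=correlations.get)
print(best_lag, {lag: round(r, 3) for lag, r in correlations.items()})
```

With real field records the peak correlation is much less sharp than in this idealised case, and the number of paired months falls as the lag increases, so the lag with the highest coefficient should be treated as an indication of the delay, not a precise estimate.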

CONCLUSIONS

From our present state of semi-ignorance it seems that we should plan our future progress as follows:

  1. Establish well-documented sets of data for what might be the important predictor variables in statistical analyses. These data sets need to be chosen with extreme care and a certain degree of foresight. For example whilst soil type cannot be imagined to affect tsetse or trypanosomiasis directly, nevertheless soil type might be used to remove confusing variability from another variable (e.g. vegetation index) that is more directly related to the vectors or diseases. Statistical analyses will generally perform better on extensive data sets, and it is important that such extensive data are equally accurate across the regions considered.

  2. Examine in detail data sets gathered in restricted places, generally during short- or medium-term projects, and attempt to construct biological models that may inform future statistical analyses. It is rarely the case that sufficient biological data are gathered from all components in any one place (the vectors, the parasites and the hosts), so that the prospects for a full biological model of vector numbers and disease transmission are somewhat bleak. Nevertheless the individual components may be modelled quite satisfactorily using different data sets, in order to attempt to construct some chimaera of the complete system.

  3. Use a combination of biological models and statistical analyses to begin to make predictive maps of both vectors and diseases. Initially the aim should be to predict the spatial variability of disease risk. Temporal variability will probably be more difficult to model, requiring a good deal more biological insight than most statistical models contain.

  4. Incorporate into future projects field assessments of the predictions made under step 3 and measurements of those variables identified as important by the analyses. The biological models should help to highlight the likely impacts of these variables on the biological system, whilst the statistical analyses will indicate the degree of precision required of the measurements.

REFERENCES

Claxton, J. R., P. Leperre, et al. (1992). “Trypanosomiasis in cattle in Gambia: incidence, prevalence and tsetse challenge.” Acta Tropica 50: 219–225.

Dietz, K. (1988). “Density-dependence in parasite transmission dynamics.” Parasitology Today 4(4): 91–97.

Ford, J. and K. M. Katondo (1977). The distribution of tsetse flies in Africa. Nairobi, OAU Cook, Hammond & Kell.

Onyiah, J. A. (1978). “Fluctuations in numbers and eventual collapse of a Glossina palpalis (R.-D.) population in Anara Forest Reserve of Nigeria.” Acta Tropica 35: 253–261.

Rawlings, P., R. H. Dwinger, et al. (1991). “An analysis of survey measurements of tsetse challenge to trypanotolerant cattle in relation to aspects of analytical models of transmission.” Parasitology 102: 371–377.

Rogers, D. J. (1979). “Tsetse population dynamics and distribution: a new analytical approach.” J. Anim Ecol 48: 825–849.

Rogers, D. J. (1983). Interpretation of sample data. Pest and vector management in the tropics Eds. A. Youdeowei and M. W. Service. London, New York, Longman. 139–160.

Rogers, D. J. (1988). “A general model for the African trypanosomiases.” Parasitology 97: 193–212.

Rogers, D. J. (1990). “A general model for tsetse populations.” Insect Sci Applic 11(3): 331–346.

Rogers, D. J. and S. E. Randolph (1993). “Distribution of tsetse and ticks in Africa: past, present and future.” Parasitology Today 9(7): 266–271.

Sutherst, R. W. and G. F. Maywald (1985). “A computerised system for matching climates in ecology.” Agric Ecosyst Environ 13: 281–299.

Varley, G. C., G. R. Gradwell and M. P. Hassell (1973). Insect Population Ecology. Oxford, Blackwell.

Figure legends

Fig. 1. The result of applying linear discriminant analysis techniques to the problem of the distribution of the tsetse Glossina morsitans in Kenya & Tanzania is shown here as a probability map (the grey scale) super-imposed on which is the observed distribution of this species (the horizontal lines). Important predictor variables were annual mean NDVI, temperature and elevation.

Fig. 2. Output of a population model for G. palpalis in Nigeria. a) model with no density dependence and the observed density independent seasonal mortality acting on adults, b) minimum, threshold and slope of density dependent relationship for puparia were 0.005, 10, 0.05 respectively; threshold and slope of adult density dependence were 200 and 0.01 (minimum values are logarithmic, i.e. k-values (Varley et al 1973); thresholds are numbers of flies per unit area; slopes are regression coefficients relating monthly k-values to the logarithm of population size), c) As for b) but with the slope of puparial density dependence increased from 0.05 to 0.25, d) as for b) but with the threshold for puparial density dependence raised from 10 to 100 and the slope of adult density dependence raised from 0.01 to 0.50 (from Rogers 1990).

Fig. 3. Result of applying linear discriminant analysis to describing areas in Togo where T. vivax prevalence in cattle exceeds 5%. Predictor variables are listed in their order of importance in the top left of the Figure. Areas where observed prevalence exceeded the threshold value are indicated by small circles (original data from Dr. Guy Hendrickx, Project GCP/TOG/013/BEL, with permission).

Fig. 4. Relationships between the Berenil Index (a measure of disease incidence) in zebu (a) and N'Dama (b) cattle in The Gambia and tsetse fly challenge and between the prevalence of infection in untreated village herds of N'Dama (c) and fly challenge, each lagged by the indicated number of months (original data from Claxton et al 1992).

FIG. 1

FIG. 2a

FIG. 2b

FIG. 2c

FIG. 2d

FIG. 3

Predictions of areas in Togo with PPRUA T. vivax > 5

Using variables: Tmm, RAINM, NDamp1, LLUAGm, LLUAGt, %AGRIC, NDVIx, Tmn, ELEV., NDph2

Probability scale: 0.90 – 0.99, 0.80 – 0.89, 0.70 – 0.79, 0.60 – 0.69, 0.50 – 0.59, 0.40 – 0.49, 0.30 – 0.39, 0.20 – 0.29, 0.10 – 0.19, 0.00 – 0.09; Observed

% Correct = 83.28
% False +ve = 13.83
% False -ve = 2.89
Sensitivity = .862
Specificity = .825

FIG. 4

Fig. 4a
Fig. 4b
Fig. 4c
