Dan Osgood and Leslie Lipper
Dan Osgood is an Assistant Professor and Assistant Extension Specialist in the Department of Agricultural and Resource Economics, College of Agriculture, University of Arizona.
Leslie Lipper is a Staff Economist in the Economic and Social Department, Agriculture and Economic Development Analysis Division of the Food and Agriculture Organization of the United Nations.
This paper discusses some key methodological issues involved in analysing relationships between socio-economic conditions and environmental degradation, using the case of soil degradation as an example. A major theme of the paper is the potential for using spatially referenced geographical information system (GIS) data in conjunction with socio-economic data in statistical analyses of causal relationships. In addition, the problem of endogeneity among dependent and independent variables is discussed, and means are suggested for addressing this concern in estimations.
As this study is focussed on methodological issues, the interactions between socio-economic factors and soil degradation are addressed at only a superficial level, in order to provide a basis from which to start more complete and robust studies. The intention of this study, then, is not to provide research results, but rather to discuss in a concrete fashion the issues that need to be addressed in obtaining such results.
In conducting statistical analyses of the links between socio-economic conditions and soil degradation, the researcher is confronted with the problem of disentangling causal relationships. Yields depend on the quality of the soil, while the quality of the soil is influenced by agricultural practices. As agricultural practices are a function of the socio-economic situation of the growers, and the socio-economic situation of growers depends on crop output, the two processes combine to form a complicated feedback loop. In this system, the causal relationships between welfare and soil degradation are difficult to ascertain. A key requirement for such an analysis is to distinguish between factors that are exogenous to the system being analysed, versus those that are generated within the system, or endogenous factors. Without taking some sort of corrective measures, the use of endogenous explanatory variables in a regression analysis leads to biased results.
Recent improvements in information systems provide new avenues to address this problem. Comprehensive data have been systematically gathered through regional surveys, national censuses and global satellite observation. Extensive simulation and analysis have been performed for many of the subsystems involved. GIS software and hardware allow the researcher to link and reference these datasets as well as display the results. These resources may play an important role in understanding the links between socio-economic conditions of farming populations and environmental processes such as soil degradation.
However, several difficulties confront such an effort. The data have been gathered on a vast array of scales and encompass dramatic differences in precision, scope and levels of processing. Some datasets are in time series format, while others are cross-sectional - spatially explicit but for only one time period. With future data gathering efforts, these problems may be resolved.
This concept paper demonstrates one way in which the datasets currently available can be utilized to separate out and measure the effects of soil quality and socio-economic status independently. The effects of soil degradation on socio-economic variables are estimated, as is the influence of socio-economic forces on soil degradation through careful database integration and the use of instrumental exogenous variables. In addition, a demonstration of how the integrated data can be combined with regression results to map effects across regions is given.
The issues concerning modelling, data sources, scale, integration, estimation and graphical presentation are addressed through two examples. The first is a study of the causal links between soil degradation and socio-economic factors for the continent of Africa, using national level socio-economic data combined with spatially disaggregated, remotely sensed grids. The second study uses subnational data from Ghana, combining district level administrative data with the remote sensing information, highlighting issues of scale and scope.
The paper begins with an initial discussion to motivate important concepts in soil degradation and socio-economic forces. Next, the data are described as well as the GIS-based integration strategy. The methodology of estimation of the endogenous system is presented. Finally the paper finishes with a discussion of the results, techniques and issues.
Soil quality can be thought of as a stock of capital, which provides goods and services to farmers in the form of agricultural production outcomes. Soils generate agricultural outcomes through biological and chemical processes that are affected by the quality of the soil. Although there are various aspects of soil quality that can affect agricultural outcomes, for the purposes of this paper they are considered in the aggregate, measured as one indicator.
Through their agricultural production decisions, farmers can deplete, maintain or augment the stock of soil quality. Land degradation is defined as the depletion of soil quality. At the same time, the quality of the soil present on a farmer's field in any one production period is a determinant of the yield outcomes. Generally, more highly degraded lands result in lower productivity, although the impacts vary across production conditions and the production technologies employed. Lower productivity can be due either to decreasing yields or increased production costs associated with decreased input efficiency. Land degradation may also result in greater yield variability, and thus greater costs to risk-averse farmers.
It is assumed that the grower maximizes his/her expected profits discounted over time, through production decisions such as the method and intensity of cultivation, and the application of inputs such as fertilizer, pesticides and labour. Farm profits will be based upon yield outcomes, which in turn will be related to the level of soil quality. The income (or poverty) of the farmer is thus affected by the soil degradation through its impact on yields. Income is also affected by a multitude of other economic, social and natural factors, such as prices, management practices, plot characteristics, other income sources, institutional arrangements and weather. These constitute a mix of exogenous and endogenous factors, but for the purposes of this study it is assumed they are exogenous. It is also assumed that growers will adapt their behaviour to function as well as they can, given the situations they face. Their response to the incentives and constraints that they are confronted with will, in turn, determine their decisions on the application of inputs, intensification and management practices. These, in turn, will influence the degree to which soil degradation takes place.
Soil quality is therefore a function of the grower's response to a multitude of exogenous factors as well as the grower's socio-economic situation. Likewise the grower's socio-economic situation is a function of a series of exogenous factors and soil degradation. These relationships constitute a dynamic stochastic system, so a time series econometric study would be the most appropriate course of action for its analysis. However, in this particular study the data source for soil degradation represents a single point in time, so cross-sectional variations are what must be studied. Needless to say, time series data on soil degradation would lead to greatly improved empirical results.
The estimation task is twofold: to measure the effects of soil degradation on the grower's socio-economic situation and to measure the effect of the grower's socio-economic situation on soil degradation. Because these two forces influence each other, a problem of endogeneity arises, complicating the estimation process.
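The consequence of this feedback can be illustrated with a small simulation. The following is a minimal sketch, not the estimator used in this paper: the two-equation system, the parameter values and the variable names (soil, income, instrument) are all assumptions made for illustration. It compares an ordinary least squares slope with a simple instrumental variable ratio that relies on an exogenous variable entering only the soil equation.

```python
import random


def covariance(a, b):
    """Population covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def simulate(n=20000, beta=0.5, gamma=0.5, seed=0):
    """Draw n observations from a hypothetical feedback system.

    income = beta * soil + u          (soil quality raises income)
    soil   = gamma * income + z + v   (income feeds back into soil quality)
    z is exogenous and enters only the soil equation.
    """
    rng = random.Random(seed)
    soil, income, instrument = [], [], []
    for _ in range(n):
        z, u, v = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        # Reduced form for soil, obtained by solving the two equations jointly.
        s = (gamma * u + z + v) / (1 - gamma * beta)
        soil.append(s)
        income.append(beta * s + u)
        instrument.append(z)
    return soil, income, instrument


soil, income, instrument = simulate()

# Naive regression slope of income on soil: contaminated by the feedback.
ols_slope = covariance(soil, income) / covariance(soil, soil)

# Instrumental variable ratio: uses only the variation in soil driven by z.
iv_slope = covariance(instrument, income) / covariance(instrument, soil)

print(f"true effect 0.50, OLS {ols_slope:.2f}, IV {iv_slope:.2f}")
```

In this artificial setting the naive slope overstates the effect of soil quality on income, because the income shock also moves soil quality, while the instrumental ratio stays close to the value used to generate the data.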
One approach to this problem is to assume a causal direction. In Kirschke, Morgenroth and Franke (1999), an assumption is made that soil degradation is a one-way function of socio-economic and other factors. These factors are assumed to be effectively uninfluenced by the soil quality. An extensive dataset is gathered and a regression is performed on the Global Assessment of Soil Degradation (GLASOD) dataset (described in detail below) using a nation as an observation. In this regression analysis, the data were not able to confirm a link between poverty and soil degradation, but intensification was found to be a strong contributor to soil degradation. The unidirectional causal assumption precludes the consideration of the effects of soil degradation on the farmers' socio-economic situation in the analysis. Although the causal assumption is a practical and valid course of action, it must be kept in mind when interpreting the results of the study. To the extent that the causal assumption is incorrect, the regression results must be interpreted as evidence of a connection of which the causative direction cannot be known.
A different approach is taken for the current work. Variations in exogenous parameters are used to gauge causative relations, requiring careful selection of a small group of exogenous variables and use of statistical techniques that remove endogenous effects. This estimation strategy is presented in section 5. However, before proceeding to that stage, the data must be described.
In this section the data used in the study are described, as well as the data sources and information that could be relevant for related work. In this presentation, the availability and potential for geo-referencing of economic data are also addressed. In addition, issues of scale and scope are discussed, weighing the relative benefits and availability of global, national, subnational and local datasets.
This section begins with a description of the geophysical and climatic datasets used, data concerning physical phenomena that are manifest independently of national and administrative boundaries. These datasets are relied on for the Africa-wide analysis and used to supplement the data for the Ghana subnational analysis. The presentation then proceeds to describe the socio-economic data, which are mostly in tabular form, and must be referenced to geographic locations using the GIS borders and associated names of national or administrative units. Finally it presents the primary data source for the subnational analysis of Ghana.
One must always be wary that analysis on large-scale datasets may not accurately reflect local processes. Satellite-based remote sensing data must be combined with practical field experience when applied for policy use. The analysis does, however, allow the identification of regional differences and vulnerabilities. It also provides a method to identify features to be investigated in more depth.
The primary GIS dataset used in this study is GLASOD. It was developed by the International Soil Reference and Information Centre for the United Nations Environment Programme (UNEP) in close cooperation with FAO and was made available as a world soil chart at a scale of 1:10 million, published in 1990. The data are available both in polygon and grid form. The form of the coverage used in this study was the grid, which has a resolution of 0.04 degrees per grid cell.
The GLASOD dataset was developed from information provided by over 250 experts worldwide. Geographic areas were divided into map units - or polygons - by experts according to physical geographic criteria. Soil experts around the world were then asked to respond to a questionnaire about soil degradation within the map unit. One item on the questionnaire was the degree of degradation in relation to changes in agricultural suitability, to declined productivity and to biotic functions of the soil. Five degrees of degradation were distinguished: none, light, moderate, strong and extreme. In addition, data on the extent of degradation, classified by the percentage area of the map unit affected, were gathered in the questionnaire. These two variables were combined into one aggregate measure of degradation severity for each map unit that indicates the overall importance of degradation in the unit. This variable can take a value from zero to four corresponding to the categories of degree of degradation. The data were obtained from the Environment and Natural Resources Service (SDRN) of FAO, but are also available on the UNEP Global Resources Information Database (GRID) Web site.
This dataset represents the only global scale soil degradation study currently available and, as such, is of critical importance. It does, however, have several fundamental characteristics that must be taken into account when it is used for data analysis. The first is the scale of the coverage. Although the grid size used was relatively fine, the information that the grid was derived from was from much larger polygons. Each of these is described as a single level of degradation for a large area. The scale of these polygons is of a detail that is useful for global or continent-wide studies, but the dataset is pushed to its limits for most national level analysis, and is of inadequate detail for studies of smaller scope. Figure 1 presents the GLASOD data for the African continent illustrating degradation levels as well as the level of detail of the dataset.
Perhaps even more important for the purposes of this study is the fact that GLASOD is a dataset that contains data that may have been derived through interpretation of other datasets. A significant amount of modelling and expert input has been involved in this process. Therefore, statistical analysis performed on the dataset may recover true relationships between explanatory variables and soil degradation, but it is also possible that it will simply recover the modelling and assumptions that were put into GLASOD development. It is important to know and to communicate clearly whether project results are based on observed data or modelling products. As the current project is refined this issue will be investigated through further communication with those involved in the GLASOD development process.
The other disadvantage of the GLASOD dataset for the purposes of this study is that it is for only one period in time, preventing time series analysis. The GLASOD data represent the cumulative amount of soil degradation that was present at the time the data were collected in the late 1980s. Since soil degradation is a dynamic process, maps of changes in degradation over time would be invaluable in separating out the relative causes and effects. Statistical tools are available to perform this analysis, and many of the other socio-economic datasets have a great deal of time series information.
However, a study of the interactions between soil degradation and socio-economic factors is not impossible using the current GLASOD dataset. Spatial variations do exist and can be used to recover relationships. For the current study, differences between regions are compared in order to estimate statistical relationships. If time series soil degradation data become available at a scope commensurate with that of GLASOD, they will greatly enhance the already significant contributions to the study of global soil degradation that the original GLASOD dataset has made.
Maps of rainfall and windspeed from the Agricultural Support Systems Division of FAO have been rasterized by UNEP/GRID to the resolution of 3.7 kilometre square grid cells. These data are of approximately the same scale as the GLASOD dataset and represent average values instead of a time series. They provide a relatively detailed measure of regional climate patterns. The coverages used were downloaded from the UNEP/GRID Web site. The average windspeed is in units of metres per second and the average annual rainfall is in millimetres per year. These databases are described in the final report of the UNEP/FAO World and Africa GIS database, December 1984.
Figure 2 illustrates the magnitude of windspeed as well as the resolution of the dataset. The data are presented in a "false" three-dimensional form. In this figure, windspeed is used as the vertical axis. The view is what an observer flying above the southern tip of Africa would see when looking north over the windspeed dataset. Light sources and shading are applied to the image to make differences in windspeed more visible. Because the viewer is directly above southern Africa, the heights of features in southern Africa are less visible than those of northern Africa. Although in some ways this presentation is more difficult to read than a two-dimensional image, it is very effective at communicating the magnitude and scale of the information contained in the windspeed coverage. It is especially useful for revealing dataset problems and the driving forces behind unanticipated regression results.
An additional measure of climate - rainfall variability - is available through the RainSySm dataset obtained from FAO/SDRN. It was developed through spatial interpolation of approximately 10 800 stations averaged over time. The standard deviation over months of rainfall series from these weather stations was calculated and the values were replaced by their centiles (0-100). The resolution of the grid is 0.072 decimal degrees, or a grid cell size of approximately eight by eight kilometres. FAO/SDRN also supplied a grid representing slope categories at a slightly larger cell size of 0.083 decimal degrees. This coverage categorizes slope into eight increasing categories. Both of these grids provide essential information on physical and climatic characteristics that may affect soil degradation.
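The transformation of rainfall variability into centiles can be sketched as follows. This is only an illustration under stated assumptions: the station names and rainfall figures are invented, and the exact centile convention used in the dataset is not documented here, so the ranking rule below (lowest value maps to 0, highest to 100) is an assumption.

```python
from statistics import pstdev


def to_centiles(values):
    """Replace each value by its rank on a 0-100 scale (assumed convention)."""
    n = len(values)
    if n == 1:
        return [0.0]
    # Share of the other values that lie strictly below each value.
    return [100.0 * sum(other < v for other in values) / (n - 1) for v in values]


# Hypothetical monthly rainfall series (millimetres) for three stations.
monthly_rainfall = {
    "station_a": [10, 12, 11, 9, 10, 11],   # steady rainfall
    "station_b": [0, 50, 5, 80, 0, 20],     # highly variable rainfall
    "station_c": [30, 35, 25, 30, 40, 20],  # moderately variable rainfall
}

# Standard deviation over months for each station.
variability = {name: pstdev(series) for name, series in monthly_rainfall.items()}

# Replace the standard deviations by their centile ranks.
centiles = dict(zip(variability, to_centiles(list(variability.values()))))
print(centiles)
```

Working with centiles rather than raw standard deviations makes the variable insensitive to the units of the underlying series, at the cost of discarding information about the absolute size of the differences between stations.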
Remotely sensed surface temperature was used as a proxy for additional climatic variables. Derived from the Nimbus satellite HIRS/MSU sensor (National Aeronautics and Space Administration [NASA]), it consists of the monthly mean January 1979 temperature for a grid of 0.125-degree resolution (approximately 14 by 14 kilometres). The temperature is in degrees Kelvin - 100 and the dataset was downloaded from the UNEP/GRID Web site.
In order to use tabular socio-economic data together with spatial geophysical and climatic data, the tables must be linked to the region they describe. The procedure is to obtain georeferenced polygons representing national boundaries that are identified by country name. It is then possible to take non-GIS tabular data and associate the data with the appropriate polygon using the country name. The ESRI ARCWORLD polygon coverage is used as a source of national boundaries. It was obtained from the Map library at the University of California at Berkeley, the United States. The original source of the data was the 1973 World Data Bank II (WDBII) database produced by the United States State Department. The borders were updated in 1992. National level socio-economic data were joined to the national polygons to form the ARCWORLD CD-ROM. The sources of socio-economic data used by ESRI were the World Bank Social Indicators of Development (SID) 1990 database and the World Resources Institute 1992 database. Figure 3 presents an example of this coverage. It shows the 1985 illiteracy rates of African countries for which data are available.
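The name-based linking step can be sketched in a few lines. This is a simplified illustration, not the procedure used to build the ARCWORLD coverage: the field names (name, polygon_id, country, illiteracy_rate), the placeholder countries and the numbers are all hypothetical. The sketch normalizes spelling before matching and reports polygons for which no tabular record was found, since silent mismatches in country names are an easy way to lose observations.

```python
def normalise(name):
    """Lower-case a country name and collapse stray whitespace."""
    return " ".join(name.strip().lower().split())


def join_by_name(polygons, table):
    """Attach tabular records to polygons using the normalised country name."""
    lookup = {normalise(row["country"]): row for row in table}
    joined, unmatched = [], []
    for poly in polygons:
        row = lookup.get(normalise(poly["name"]))
        if row is None:
            unmatched.append(poly["name"])  # keep track of failed matches
        else:
            merged = dict(poly)
            merged.update({k: v for k, v in row.items() if k != "country"})
            joined.append(merged)
    return joined, unmatched


# Hypothetical polygon attribute table and socio-economic table.
polygons = [
    {"name": "Country A", "polygon_id": 1},
    {"name": "COUNTRY B ", "polygon_id": 2},
    {"name": "Country C", "polygon_id": 3},
]
table = [
    {"country": "country a", "illiteracy_rate": 40.0},
    {"country": "Country  B", "illiteracy_rate": 55.0},
]

joined, unmatched = join_by_name(polygons, table)
print(joined)
print("no tabular data for:", unmatched)
```

In practice the list of unmatched names would be inspected by hand, because differences such as alternative spellings or changed country names cannot be resolved by simple normalization.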
For the current study, one additional socio-economic dataset was needed. The average annual percentage change in the national terms of trade between 1975 and 1984 was obtained from the World Bank, downloaded from their Web site at http://www.worldbank.org. These data were linked by country name to the ARCWORLD national polygons. Figure 4 presents these data for the countries for which they were available. This variable represents the percentage change in the terms of trade of the nation, or the percentage change in the weighted price of exports divided by the weighted price of imports.
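The arithmetic behind this variable can be illustrated as follows. The sketch assumes that the average is a simple mean of year-on-year percentage changes; the source does not state whether the World Bank figure is computed this way or as a compound rate, and the price indices below are invented.

```python
def terms_of_trade(export_price_index, import_price_index):
    """Terms of trade: weighted export prices relative to weighted import prices."""
    return [100.0 * x / m for x, m in zip(export_price_index, import_price_index)]


def average_annual_pct_change(series):
    """Mean of the year-on-year percentage changes (assumed definition)."""
    changes = [100.0 * (b - a) / a for a, b in zip(series, series[1:])]
    return sum(changes) / len(changes)


# Hypothetical price indices for three consecutive years.
exports = [100.0, 110.0, 121.0]
imports = [100.0, 100.0, 100.0]

tot = terms_of_trade(exports, imports)
avg_change = average_annual_pct_change(tot)
print(tot, avg_change)
```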
Ghana was chosen for a subnational level case study in this analysis because soil degradation and poverty are major problems in the country, and a high quality subnational georeferenced database including socio-economic data was readily available due to a GIS-based study on aquaculture in Ghana (FAO, 1990; 1991). The existence of these data was critical as it is difficult to obtain comprehensive GIS data at a subnational level for most nations.
The aquaculture dataset was obtained from FAO/SDRN and contains district level information on socio-economic, geophysical and agricultural production variables. Figure 5 shows the poverty rate for each district in the Ghana dataset. Rainfall is presented in Figure 6 and road density is shown in Figure 7. The geophysical coverages described in earlier sections were used to supplement this dataset to form a complete basis for analysis.
In order to facilitate related work, sources of data are listed. In addition, useful sources of data that were not used in this report are also included.
There are several institutions that operate Web pages from which data can be downloaded. Some of these institutions collect the data while others merely distribute data collected by others.
The primary source of data used for this report was the UNEP/GRID Web site. It can be reached on the World Wide Web at http://grid2.cr.usgs.gov, and is a clearinghouse of georeferenced geophysical and socio-economic datasets. Another clearinghouse is ESRI, the company that produces the ARCINFO and ARCVIEW software used to perform the GIS analysis for the current study. Their homepage is www.esri.com and the data are located at www.esri.com/data/online/index.html. The Center for International Earth Science Information Network Data Clearinghouse has downloadable data at www.ciesin.org, as does the International Soil Reference and Information Centre, at www.isric.nl.
Country level agricultural, nutrition, fisheries, forestry and food quality control time series databases are available from the FAOSTAT database, which can be accessed through www.fao.org. Two other sources for data are the World Resources Institute at http://data.wri.org, and the World Bank, at www.worldbank.org. These sites provide downloadable socio-economic national level data that are not georeferenced, but can be linked by country name to a coverage such as ARCWORLD.
The Global Land Cover Characteristics Data Base at http://edcdaac.usgs.gov provides high resolution land use and land cover data developed from the AVHRR satellite. Although provided at a detailed one-kilometre resolution, the data are intended for continent wide or higher level studies. Researchers may have data quality problems if the data are used for national or subnational studies unless the dataset has been validated for the particular region.
Local research projects that have collected data to study particular issues in subnational regions provide a very high quality source of data. Those performing coordinated surveys to investigate specific problems spend a great deal of energy ensuring that the data are collected consistently and contain as much information as possible that is relevant to the particular research problem. Perhaps most importantly, the authors of the studies have usually spent significant time in the research area, gaining essential insight and experience concerning the region.
There are several problems that arise when using these data. First, the data as such are often difficult to locate and time consuming to obtain. Usually a formal proposal must be submitted to the appropriate ministry of the national government in question. Because the data can be at the level of the individual farm, privacy issues can also prevent their dissemination. It is also time consuming to learn if the dataset contains information relevant to a particular use. With a downloadable dataset, one can download the data, find out if they contain the relevant information, and discard them if they do not. If months must be taken to obtain access to a study, much more is at stake. Also, very few of the global and national databases discussed so far are at a resolution that is sufficiently high to supplement variables to the individual level surveys, so the researcher must rely on the information in the survey, data that may have been obtained with a different problem in mind. Finally, it can be difficult to project the results of studies of small regions to national, continental or global scales, making it unclear how to use them for large-scale planning or mapping projects.
Nevertheless, the high level of quality of these databases means that it is often worthwhile to investigate their availability. The focus that they have can clarify issues that become confused in a larger-scale study. For example, in Goldstein and Udry (1999) socio-economic factors and soil degradation for three villages that fall within a single district of Ghana are investigated. The survey included a GIS-based mapping of the study area. An additional feature of this study is that the data are available for downloading at www.econ.yale.edu/~udry/ghanadata.html.
Because it can be difficult to discover the existence of microlevel datasets, survey papers describing available data are extremely valuable. One such paper (Henninger, 1998) lists subnational datasets in its discussion of GIS issues of poverty. Another (FAO, 1999) enumerates spatial data sources through its investigation of the issues for targeting agricultural development activities.
As motivated in the discussion of the GLASOD dataset, simulation models may be useful in disentangling the effects of endogenous variables. Using simulation models of production along with estimated economic parameters allows one to study how production and yields respond to driving socio-economic forces and policy options. These models do embody a great deal of information, although their use changes the nature of the regression from a data driven exercise to a summary of the current state of knowledge and expertise.
The most obvious potential candidate among simulation efforts for relevance to the current work is the agro-ecological zone (AEZ) framework developed by FAO (FAO 1978; 1998). A great deal of AEZ data and information was obtained from the Land and Plant Nutrition Management Service at FAO in the initial phases of this study. However, the work involved in integrating the AEZ model simulations proved to be beyond the scope of the current project.
In the future, the quality of regressions of the type performed in this study could be improved through integration of the AEZ model simulations. Including agro-ecological zones into the regression would enable separation of the different types of phenomena that contribute to soil degradation, and would provide a systematic method to control for environmental factors. Additionally, the data and coefficients generated from specific studies utilizing the AEZ simulations can also be quite useful for econometric investigations of socio-economic and agro-ecological interactions (FAO 1994a; 1996a).
AEZ simulations themselves can be used to separate out socio-economic from environmental phenomena in explaining agricultural outcomes. For example, AEZ simulations on potential crop yields may be overlaid with GIS coverages of measured data on actual yields in order to identify the spatial distribution of systematic deviations. Socio-economic phenomena could then be investigated to explain these gaps. If the AEZ yield simulations are used as the production function for economic models, socio-economic and agro-ecological influences can be simultaneously estimated. This would allow the researcher to distinguish the relative agro-ecological and socio-economic causes and spatial distribution of yield changes and degradation, and could be used to simulate policy impacts. Method of Moments (Greene, 1990) and Maximum Entropy (Golan, Judge and Miller, 1996) estimators are well suited to this type of estimation.
Through these estimation techniques, packaging AEZ GIS data with automated yield simulation models could greatly enhance the understanding of the relationship among agricultural production, soil degradation and socio-economic effects. It could be implemented through distribution of an AEZ computer code or even through a Web-based interface that could service automated queries. Elements of AEZ simulations are already on line, such as the AEZWIN software at www.fao.org/WAICENT/FAOINFO/AGRICULT/AGL/agll/aez.htm.
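The overlay of simulated potential yields on measured actual yields mentioned above can be sketched as a cell-by-cell comparison. This is an illustration only: it assumes both grids have already been brought to the same resolution and extent, it uses invented yield figures, and it does not call any AEZ software.

```python
def yield_gap(potential, actual):
    """Relative shortfall of actual yield below simulated potential, per cell."""
    gaps = []
    for pot_row, act_row in zip(potential, actual):
        gaps.append([
            # Cells with missing data or zero potential are left undefined.
            None if p is None or a is None or p == 0 else (p - a) / p
            for p, a in zip(pot_row, act_row)
        ])
    return gaps


# Hypothetical grids of potential and actual yields (tonnes per hectare).
potential = [[4.0, 2.0], [None, 5.0]]
actual = [[3.0, 2.0], [1.0, 2.5]]

gaps = yield_gap(potential, actual)
print(gaps)
```

Cells with large gaps are the ones where socio-economic explanations would be sought, since the agro-ecological potential is already accounted for in the simulated yield.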
Other yield simulation systems also exist, and could provide alternatives. The FAOSCAT and ARTEMIS projects are developing yield simulations based on remote sensing data. In addition it should be noted that these projects are valuable sources of remote sensing data, for example, having compiled a dataset of weekly soil moisture measurements for parts of eastern Europe and West Africa (see the NEO website at www.neo.nl).
To perform statistical analysis, the selected datasets must be integrated into a form that allows the variables to be associated with the appropriate observations, and then exported to a statistical package. In the process of integration, two final datasets are created. The first is the dataset for the statistical analysis and will be referred to as the regression dataset. The integration technique must be conservatively and carefully performed so that causality can be established and estimated parameters are unbiased. Information that may be valid but that cannot be distinguished from confounding data must either be filtered or not used. This makes confidence intervals more conservative but means that, when a variable is found to be significant, this will be due to unambiguous statistical inference.
The second final dataset, the interpolation dataset, should be of the finest detail practicable. Through it, the parameters that were derived from estimations using the conservative dataset are used to interpolate results for detailed maps. Because this process occurs after the estimation has been performed, all available data can be used to visualize effects occurring at fine geographic scales.
The general strategy to be used is a geographical one. Each dataset is layered on top of the geographic area with which it is associated. The final integrated dataset is a table which has each item linked to a geographical region. A column contains summaries of all the information that falls within that region. For this project, the ARCINFO and ARCVIEW software packages from ESRI are used to perform the integration, data presentation and map-making. A library of ArcInfo AML scripts written by the lead author is used to perform the actual data integration. PERL and MATLAB codes, also written by the lead author, are used to transfer the GIS data to the statistical software and to perform statistical analysis. To reduce computational burdens, all grids were aggregated to a common scale of 0.083 decimal degrees, or a grid cell size of approximately ten by ten kilometres.
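Aggregation of a fine grid to a coarser common scale can be sketched as block averaging. The sketch below is a simplification and not the AML code used in the study: it assumes the coarse cell size is an exact integer multiple of the fine cell size, which does not hold for all of the grid resolutions described earlier, and it treats missing cells as None.

```python
def block_mean(grid, factor, nodata=None):
    """Average non-missing cells in each factor-by-factor block of a grid."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, rows, factor):
        out_row = []
        for c0 in range(0, cols, factor):
            cells = [
                grid[r][c]
                for r in range(r0, min(r0 + factor, rows))
                for c in range(c0, min(c0 + factor, cols))
                if grid[r][c] != nodata
            ]
            # A block with no valid cells stays missing in the output.
            out_row.append(sum(cells) / len(cells) if cells else nodata)
        out.append(out_row)
    return out


# Hypothetical 4 x 4 fine grid aggregated to 2 x 2.
fine = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, None, None],
    [3, 3, None, None],
]
coarse = block_mean(fine, 2)
print(coarse)
```

Averaging is appropriate for continuous variables such as rainfall; for categorical grids such as slope classes or degradation severity, a different summary (for example the most frequent class) may be preferable.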
For the regression dataset, administrative regions are used as the basic unit of observation. All data are aggregated to this level. For the analysis performed for the African continent, the nation was used as the unit of analysis. The polygon representing each nation is superimposed over the other data layers. The other grids and polygons are cut to national borders and the national data are averaged for each variable. The average values are exported to a tabular dataset, with one row for each nation and one column for each variable. The area of the country is also included for statistical weighting purposes to adjust for the level of aggregation. Country level tabular data from non-GIS sources are linked to this table using the country name. The same process is repeated at the district level for the Ghana case study.
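The cutting and averaging step can be sketched as a zonal mean. The example is an assumption-laden illustration rather than the authors' AML implementation: it supposes that national polygons have already been rasterized to the same grid as the geophysical layer, uses invented values, and uses the number of cells as a rough stand-in for country area (cells of equal angular size do not have equal ground area).

```python
from collections import defaultdict


def zonal_means(value_grid, zone_grid, nodata=None):
    """Average a gridded variable within each zone (for example, each nation)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value_row, zone_row in zip(value_grid, zone_grid):
        for value, zone in zip(value_row, zone_row):
            if value == nodata or zone == nodata:
                continue  # skip missing cells and cells outside any zone
            totals[zone] += value
            counts[zone] += 1
    return {
        zone: {"mean": totals[zone] / counts[zone], "cells": counts[zone]}
        for zone in counts
    }


# Hypothetical degradation severity grid and matching grid of country codes.
degradation = [[0, 1, 2], [2, 3, 4]]
countries = [["A", "A", "B"], ["A", "B", "B"]]

regression_rows = zonal_means(degradation, countries)
print(regression_rows)
```

Each entry corresponds to one row of the regression dataset; repeating the call for every gridded variable fills in the remaining columns, and the cell count can serve as the weighting variable.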
The country is used as the unit of observation because it is important that the analysis weights the national level economic indicators appropriately. If the national data are disaggregated to the level of the geophysical data, the regression would interpret the single national statistic as if it were disaggregated, individual measurements. Therefore much of the information contained in disaggregated physical data is not incorporated into the final dataset, and only the information summarized in the average is used. It is desirable and possible to perform a more sophisticated integration that incorporates information on the variance and correlation of the geophysical datasets, but this is a nontrivial task that is beyond the scale of the current work.
The purpose of these data is to map predicted effects based on the statistical parameters derived from estimations made with the regression dataset. For this dataset, the integration strategy is quite different from that used in creating the regression dataset. Here the finest spatial scale must be preserved, and used as the basic unit, linking it to the variables of the coarser polygons it falls within. Again, a table is generated, but this table is now referenced to the smallest subpolygons of the union of all the datasets.
For example, slope is available as a remotely sensed grid at a resolution of five arc-minutes per grid cell, which is approximately ten by ten kilometres. This cell becomes a basic data point, and a row of the dataset is linked to its location. Within this row, the slope of the cell is recorded as well as the socio-economic variables associated with the country that it falls within. Other geophysical data on a coarser scale that cover the cell are also put in the table. If the border of a country or that of another geophysical dataset crosses the slope cell, the cell is split along the border, and each subcell becomes the basic database element. This process is accomplished by repeated application of the UNION operation within an ARCINFO AML script written by the authors.
The parameters recovered from the estimation are applied to the tabular data of the disaggregated dataset to generate an interpolated value. The calculated value is geographically referenced to the map location of the database row and is then used to generate maps. In this way, the information captured during the statistical procedure using the data aggregated by national borders can be used to predict the variations that occur within nations. This SAR methodology is described further by Ghosh and Rao (1994).
One important data processing step that should also be performed, but has not been in this case study due to time constraints, is the cleaning of the data. Data cleaning in which inappropriate regions are removed from the regression data before the aggregation occurs is important to remove possible sources of estimation error. The most obvious items to be addressed are bodies of water and the Sahara desert. Because this step has not been taken, it must be remembered that the regression results may be inaccurate.
In an ideal situation, researchers concerned with explaining causal relationships between soil degradation and socio-economic phenomena would conduct an experiment, exogenously setting economic variables to unique values for each region and measuring the effect of the exogenous changes on the soil quality in each of the regions. For another experiment, the researcher would manipulate the soil quality and measure the changes in economic variables. This would provide a dataset in which the effects could be disentangled. Of course, these experiments are not possible, and so other ways must be sought to solve the problem of endogeneity.
Instead one must search for a "natural experiment". This is a case in which exogenous forces have performed the manipulation of variables through events over which the agents that are being observed have negligible control. It is important that this experiment varies across observations so that the effects of different situations can be observed. Through the natural experiment, the researcher benefits from observing how different agents react to the different situations they are faced with.
There are two methods to use the information derived from a natural experiment: direct inclusion of the exogenously driven variable, or using the exogenously driven variable to instrument the phenomena that are being studied through a two-stage regression procedure. When it is possible to use the first method, its simplicity and power are preferable. However, the instrumental variable procedure is not overly difficult and allows a much greater range of phenomena to be studied. Choice of proper instruments benefits a great deal from detailed knowledge of the history and characteristics of the study and is the key to successful use of the technique. These two procedures are presented through demonstration in the following study of the links between soil degradation and socio-economic variables in Africa.
In the attempt to construct a regression model investigating the causal relationship between the socio-economic status of the population and soil degradation, it is critical to evaluate potential explanatory variables for their degree of endogeneity to the dependent variable of soil degradation. It is tempting to use the wealth of relevant, but endogenous, variables such as yields, fertilizer imports and traction measures. Also, it is difficult not to use endogenous indicators of poverty and socio-economic variables that are more direct measures of human welfare than those concerning the global economy. However, inclusion of any of these variables without careful instrumenting will confound statistical tests and lead to ambiguous results, preventing an understanding of the causality of the questioned relationship.
One way to approach the problem is to look for variables which proxy the effect of the endogenous variable of interest, but which are exogenous to the system. In the case of analysing the effects of the socio-economic situation of the population on the incidence of soil degradation, the use of direct measures of welfare, which to some extent already reflect the impacts of soil degradation, is avoided. Instead, changes in the global economy are studied that directly affect the socio-economic situation of people in Africa but which they cannot significantly influence.
An example of such a variable is the change in terms of trade of the nation. The terms of trade is the ratio of the average cost of a nation's export goods to the average cost of that nation's import goods. As export prices rise, the terms of trade increase. As import prices become greater, the terms of trade decrease. Assuming that an individual African nation has negligible effect on world prices, a somewhat reasonable assumption, the terms of trade provide an exogenous economic variable for which the effects on soil quality can be observed. To the extent that the changes in the terms of trade of a nation reflect a change in the welfare of the population, the observed effects in a regression analysis can be used to approximate the impacts of socio-economic conditions on soil degradation.
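As a small worked illustration of the definition, the ratio and its percentage change over a period can be computed as follows. The index values are invented for the example and are not World Bank figures, and the function names are hypothetical.

```python
def terms_of_trade(export_price_index, import_price_index):
    """Ratio of average export prices to average import prices."""
    return export_price_index / import_price_index

def percent_change(start, end):
    """Percentage change from `start` to `end`."""
    return 100.0 * (end - start) / start

# Illustrative (made-up) price indices for the start and end of a period.
tot_start = terms_of_trade(export_price_index=100.0, import_price_index=100.0)
tot_end = terms_of_trade(export_price_index=120.0, import_price_index=150.0)
change = percent_change(tot_start, tot_end)
```

In this example import prices rise faster than export prices, so the terms of trade fall over the period and the percentage change is negative.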
The change in terms of trade variable was obtained from statistics provided by the World Bank and represents the percentage change over the period from 1975 to 1984. The GLASOD data on soil degradation represent the state of degradation in the late 1980s, which raises another potential issue in the selection of appropriate variables - the length of time that occurs between a change in welfare and its manifestation in soil degradation outcomes, or more broadly the appropriate temporal scale at which variables should be measured. There is a lag time between the impact of a change in terms of trade on welfare, and in turn between a change in welfare and soil degradation. This issue of lag time and temporal scale is one which needs to be investigated better and used to select better proxy variables in future studies.
The problem with using proxy variables is that they often do not capture the same effect as the original variable over varying circumstances and conditions. For example, while changes in the terms of trade of a country certainly have implications for the socio-economic status of its population, they are problematic as a proxy for welfare, as their effects on welfare will vary with the types and amounts of goods exchanged on international markets, the degree to which domestically produced substitutes are available, and the distribution of the population engaged in market transactions affected by international prices.
Once the choice of a proxy variable is made, and the regression dataset prepared, the analysis is straightforward. In this example, soil degradation is regressed as a function of terms of trade, controlling for the effects of exogenous weather and geophysical variables. Heteroskedasticity which can arise from the aggregation by national border is to be addressed by weighting the regression by the square root of the area of the nation (Greene, 1990). The ordinary least squares (OLS) estimation technique is used to recover the parameters of the linear equation.
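A minimal sketch of such an area-weighted regression is given below, using NumPy. It assumes that "weighting by the square root of the area" means scaling each observation's row by the square root of its area before applying OLS, which is one common reading and not necessarily the exact procedure used for Table 1. The data are synthetic and the variable roles are hypothetical.

```python
import numpy as np

def weighted_ols(y, X, area):
    """OLS after scaling every observation by sqrt(area).

    Assumption: this is equivalent to weighted least squares with
    weights proportional to the area of each nation.
    Returns the coefficients, intercept first.
    """
    w = np.sqrt(np.asarray(area, dtype=float))
    Xc = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    beta, *_ = np.linalg.lstsq(Xc * w[:, None], np.asarray(y) * w, rcond=None)
    return beta

# Synthetic, noise-free data so that the known coefficients are recovered.
rng = np.random.default_rng(0)
n = 40
X = rng.normal(size=(n, 2))            # e.g. change in terms of trade, slope
area = rng.uniform(1.0, 10.0, size=n)  # hypothetical national areas
y = 1.0 + 0.5 * X[:, 0] + 2.0 * X[:, 1]
beta = weighted_ols(y, X, area)
```

With real data the weights matter only when the error variance differs across observations; in this noise-free example any positive weights return the same coefficients.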
There may be some problems with the efficiency of the estimators using the OLS technique, as the dependent variable in the regression is an ordered ranking (e.g. the GLASOD soil degradation categories). By using the OLS technique, a normal approximation is assumed for the five ranked values of the soil degradation variable, which will result in more conservative confidence intervals and thus less efficient estimators. In future studies, use of a maximum likelihood approach such as the ordered probit or tobit models may be more appropriate with a dependent variable of this type.
The results are presented in Table 1. The R squared statistic demonstrates a rather low but respectable quality of fit. The P values for the parameters are low, with all the parameters being statistically significant beyond the ten percent level. The temperature coefficient, although significant, is difficult to interpret since it is included to provide a control variable for desert areas. Rainfall and the variability of rainfall are both significant at between the five and ten percent levels, with the amount of rainfall increasing degradation and the variability in rainfall decreasing degradation. The decrease may be due to the variability of rainfall, or to the effects of other unobserved variables that are correlated with rainfall.
The slope and wind coefficients are highly significant. As might be expected, higher levels of either of these variables contributed to increased degradation. Highly significant, well beyond the one percent level, the sign on the terms of trade is positive, showing that increases in the terms of trade tend to increase soil degradation. This is consistent with the theoretical findings of LaFrance (1992), in which it was shown that increased commodity prices can lead to higher soil degradation. However, the terms of trade variable does not allow one to know what the actual socio-economic forces driving degradation are. The increased degradation could be due to intensification, or through increased access to inputs from lower relative import prices. It could also be due to farmers mining the nutrients of their soils to take advantage of high export prices. Future work utilizing other carefully chosen exogenous variables may be able to distinguish between these effects.
The parameters presented in Table 1 can be applied to the interpolation dataset to generate maps of the relative effects of different forces on soil degradation. For each element of the interpolation dataset, the data associated with that element are multiplied by the parameters recovered from the regression. The factors contributing to degradation are summed and the fraction of the degradation due to the contribution of windspeed and of the change in terms of trade is calculated. These calculated values are presented in Figure 8, which maps the fraction of predicted soil degradation that is due to wind, and Figure 10, which presents the fraction of soil degradation due to changes in the terms of trade. Figure 9 presents a closer view of the interpolated data for Ghana, Burkina Faso, Togo and the Côte d'Ivoire, demonstrating how the parameters from a regression of cross-sectional information of the entire African continent can be used to interpolate into a much smaller region.
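The calculation for a single element of the interpolation dataset can be sketched as follows. The parameter values and cell values are invented for illustration and are not those of Table 1, and the function name is hypothetical. The sketch also assumes that all contributions are positive; the treatment of negative contributions is not specified here.

```python
def contribution_shares(values, params):
    """Share of predicted degradation attributable to each variable.

    Each variable's contribution is its parameter times its value in the
    cell; shares are contributions divided by their sum.
    """
    contributions = {name: params[name] * values[name] for name in params}
    total = sum(contributions.values())
    return {name: c / total for name, c in contributions.items()}

# Hypothetical regression parameters and one hypothetical cell of data.
params = {"wind": 0.5, "slope": 0.25, "terms_of_trade": 1.0}
cell = {"wind": 4.0, "slope": 8.0, "terms_of_trade": 4.0}
shares = contribution_shares(cell, params)
```

Repeating the calculation for every row of the interpolation dataset yields one share per variable per map element, which is the quantity that a map of this kind shades.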
Although it is clear in Figure 10 that the terms of trade difference across countries dominates the shading of the map, it can be seen that the contribution of terms of trade to soil degradation varies within countries due to changes in the other important determinants. The dramatic differences in shading across national borders are exaggerated if unmeasured trade and movement across borders are non-negligible. Nevertheless, the maps do provide an excellent tool for understanding the nature of the forces affecting soil degradation in different areas. They also allow communication and feedback with those who have a great deal of expertise with the regions involved but who do not have experience with the mathematics involved. Additionally, they provide a tool for identifying unexpected results and finding what drives those results for a particular area.
When performing regressions on GIS data, spatial autocorrelation is frequently included in the statistical model (Anselin 1988). Spatial autocorrelation occurs when the error terms for each observation are correlated with the error terms of their neighbours, usually due to the effects of unobserved parameters that vary over space. The consequence of ignoring spatial autocorrelation in a linear regression is inefficient parameter estimation. For the same amount of data, there will be a lower quality of fit when spatial autocorrelation is ignored, with higher standard errors and greater chance of erroneously rejecting a significant variable. Fortunately, although it does lower the power of the estimation, it does not bias the parameter estimates (Dubin, 1998).
To adjust for this spatial autocorrelation, a spatial weighting matrix is generated based on the distance between observations. Often an inverse square weighting is used. The spatial weighting matrix is included in the statistical model. Specialized estimation techniques are used to recover the parameters for the regression variables and a spatial autocorrelation term simultaneously.
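An inverse square weighting matrix of this kind can be sketched as follows. The sketch assumes planar coordinates with Euclidean distances, distinct locations, and row standardization (each row scaled to sum to one), which is a common convention but an assumption here. The coordinates are made up.

```python
import math

def inverse_square_weights(coords):
    """Row-standardized inverse-square-distance spatial weights.

    `coords` is a list of (x, y) tuples, one per observation. The diagonal
    is zero so that an observation is not its own neighbour.
    Assumes no two observations share the same location.
    """
    n = len(coords)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = math.dist(coords[i], coords[j])
                W[i][j] = 1.0 / d ** 2
        row_sum = sum(W[i])
        if row_sum > 0:
            W[i] = [w / row_sum for w in W[i]]
    return W

# Three hypothetical observation centroids on a line.
coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
W = inverse_square_weights(coords)
```

For geographic coordinates in degrees, a great-circle distance would be a more appropriate choice than the Euclidean distance used in this sketch.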
Although commonly applied, spatial autocorrelation models can confound statistical tests in which endogeneity must be eliminated. If the spatial autocorrelation is due to unobserved exogenous variables, then this is a valuable technique to improve parameter estimates. However, if the spatial autocorrelation is due to endogenous variables, then the technique re-introduces the endogeneity that the researcher has been so carefully avoiding. In this case, the spatial autocorrelation corrections should not be used because they confound the hypothesis tests with the endogenous variables. Because it is impossible to know for sure what the cause of spatial autocorrelation is, it is best not to use it in the hypothesis tests, preventing endogeneity at the cost of less efficient parameter estimation. However, as a robustness check, a regression correcting for spatial autocorrelation can be performed.
A regression that includes spatial autocorrelation is done using an Estimated Moments (EM) technique (Roberts and Osgood, 2000). The results are presented in Table 2. Comparing Tables 1 and 2, it can be seen that although the autocorrelation is statistically significant, its inclusion influences the parameter estimates very little, providing the reassurance that spatial autocorrelation issues do not greatly affect the results. Had the parameter estimates been dramatically different, that would not necessarily mean that the regression is flawed, but it would be wise for the researcher to investigate possible causes of the differences. As expected, the regression that adjusts for autocorrelation has lower standard errors and P values, demonstrating its higher efficiency. This reminds the researcher that rejection of the significance of a parameter may be due to a lack of data and conservative estimation techniques.
In order to estimate the effect of soil degradation on economic variables, a two-stage least squares estimation procedure with instrumental variables is used. As in the case of using proxies, the variables exogenous to the system must be found, but these are expected to exert a similar pattern of influence as soil degradation on socio-economic status. For soil degradation, weather is a natural source of instruments. The two-stage procedure separates the weather induced changes in soil degradation from those due to economic forces. Through the weather driven changes in soil degradation, the estimation can recover the effects of degradation on socio-economic phenomena.
A two-step procedure is used. In the first step, soil degradation is regressed as a function of weather variables. A new measure of soil degradation is forecast by multiplying the estimated parameters by the associated weather variables for each observation. Because socio-economic variables were not included in the first estimation, their confounding effects have been removed from the forecast soil degradation variable. Thus, some information is purposefully ignored in order to perform a conservative regression that will only recognize information that is unambiguously due to exogenous forces.
For the second step, the economic variable is regressed as a function of explanatory variables and the forecast soil degradation variable. The parameter recovered for the forecast soil variable represents the effects of soil degradation on the economic variable with the confounding effect of economic forces on soil degradation eliminated. Once the standard errors and P values are adjusted for the two-stage process, they too represent the effects of exogenous forces. The adjustment is straightforward. Standard errors and associated P values are calculated using the original soil degradation variable instead of the forecast variable used in the regression (Greene, 1990).
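The two steps and the standard error adjustment can be sketched with NumPy as follows. This is a textbook two-stage least squares sketch on synthetic data, not the code used for Tables 3 and 4. As an assumption, the first stage here includes the exogenous control variables alongside the weather instruments, which is the standard form; all names and coefficient values are hypothetical.

```python
import numpy as np

def two_stage_least_squares(y, d, controls, instruments):
    """2SLS with one endogenous regressor `d` (e.g. soil degradation).

    Returns (beta, se) for the columns [intercept, d, controls...].
    """
    n = len(y)
    ones = np.ones((n, 1))
    # Stage 1: regress d on the instruments and exogenous controls.
    Z = np.hstack([ones, instruments, controls])
    gamma, *_ = np.linalg.lstsq(Z, d, rcond=None)
    d_hat = Z @ gamma  # forecast of the endogenous variable
    # Stage 2: regress y on the forecast and the controls.
    X_hat = np.hstack([ones, d_hat[:, None], controls])
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    # Adjustment: residuals use the ORIGINAL d, not the forecast.
    X = np.hstack([ones, d[:, None], controls])
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    cov = sigma2 * np.linalg.inv(X_hat.T @ X_hat)
    return beta, np.sqrt(np.diag(cov))

# Synthetic data: an unobserved shock u drives both d and y (endogeneity).
rng = np.random.default_rng(1)
n = 5000
weather = rng.normal(size=(n, 2))   # instruments
controls = rng.normal(size=(n, 1))  # exogenous control
u = rng.normal(size=n)
d = weather @ np.array([1.0, 0.5]) + 0.3 * controls[:, 0] + u + rng.normal(size=n)
y = 2.0 - 1.5 * d + 0.7 * controls[:, 0] + 2.0 * u + rng.normal(size=n)
beta, se = two_stage_least_squares(y, d, controls, weather)
```

In this simulated setting the true effect of `d` on `y` is -1.5. A direct OLS regression of `y` on `d` would be biased by the shared shock, whereas the second element of `beta` should lie close to the true value.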
Table 3 presents the results of the first stage of the estimation process. Soil degradation is regressed as a function of weather variables. Increased slope and windspeed are the dominant significant natural forces in soil degradation. These results are consistent with the current understanding of soil degradation, in which soil is more vulnerable to degradation where there are steep slopes and high winds.
A predicted measure of soil degradation based on climatic variables is derived from the first regression and then used as the explanatory variable in the second stage regression. Per capita GNP was used as the measure of welfare in this analysis. This variable was used because of the availability of data and the short time frame of this report, not because it was the best measure of welfare or poverty. Again, future research efforts in this area will require much more sophisticated analysis of the appropriate measure of welfare to be used. Coordination with other ongoing efforts to map poverty indicators will constitute an important part of such a future effort.
The final results of the two-stage process are presented in Table 4. Per capita income is regressed as a function of the "instrumented" soil degradation, weather, geophysical and economic variables. The regression has an acceptable R squared. The soil degradation variable is highly significant, negative, and approximately ten times the magnitude of the other parameters. This indicates that increased soil degradation significantly decreases per capita income and therefore is an important determinant of poverty.
Illiteracy rates and population density are assumed to be exogenous, or at least to have a negligible contribution from soil degradation, and thus allowable as explanatory variables in the regression. Population density is insignificant in the regression. As mentioned earlier, this may be due to noise in the data or due to an actual lack of a contribution. Illiteracy rates, however, are significant and negatively correlated with per capita income. Rainfall is also significantly negatively correlated with income. Coefficients recovered through regressions based on national level data are far removed from direct analysis of the subsistence farmer so it is difficult to know exactly what drives these relationships.
Perhaps the understanding can be improved by masking out non-agricultural areas from the regression. However, this process must be performed carefully to prevent statistical bias. At some level, national statistics will always be less informative of individual agricultural performance than disaggregated surveys of agriculture. Since the regression must rely on the average national variables, it only has information on the average land that the average citizen with average characteristics has available to potentially farm. The actual quality of the land of an individual with certain characteristics is not in the data. Development of techniques that incorporate the correlation between the disaggregated grids within a nation will greatly improve the regression quality. Whatever techniques are employed to increase the efficiency of the data use, one must keep in mind that national level data do not contain as much information as more disaggregated datasets and will be clouded with the multitude of different factors in a complex national economy. In the future, if additional disaggregated data become available, estimations will benefit from the improvement in information. At the current state of data availability, though, national level data serve as one of the few sources of comprehensive and consistent information across large areas and over long periods of time, providing a scope essential to policy relevant research.
As mentioned earlier, GLASOD, the soil degradation dataset used, includes a certain amount of modelling in its generation, which potentially has involved some of the data that have been used in the regression. Statistical methods are traditionally applied to observed data instead of modelling results. The modelling element complicates the interpretation of the regression. However, there is potential for the use of simulation models to deal with endogeneity in regressions. The process relies heavily on the quality of the modelling. If fundamental problems exist in the modelling framework, these problems will manifest themselves through the regressions and into the final results. If the information provided by GLASOD is not sufficiently linked to actual soil degradation, then the regression results describing the link between soil degradation and socio-economic variables will be merely an artefact of the GLASOD construction.
As the source datasets for the regressions become further distanced from the raw data, the results depend on the expert opinion involved in the modelling. In essence, the estimation procedure is a way of summarizing the current state of knowledge. Thus, the estimation procedure becomes more of an expert system than a revelation of what is implied by pure data. There is unquestionable value in this alternative, but results must be presented honestly so that their meaning is not misinterpreted or overstated.
In order to provide an example of a regression that is of a finer scale of resolution, the connection between soil quality and poverty is estimated in Ghana, using the district as the unit of observation. The soil and weather coverages were used from the Africa-wide estimations, except when there was more detailed information available. The benefits of reducing the scope of the estimation are that the aggregation problems evident using national averages are less pronounced. These aggregation problems still do exist to a certain extent, however, and will always be present unless the study is based on a survey of individual growers.
Along with the benefits of decreasing the scope there are disadvantages. As the area of the study decreases, finer resolution datasets are required to maintain quality results. Many of the global level coverages are pushed to the limits of their resolution for subnational level work. Figure 15 illustrates that there is little variation in the global scale soil degradation coverage within the borders of Ghana.
Another problem occurs as the scope of the regression is decreased. The regression can only capture the experience embodied in the dataset. In order to be able to estimate parameters, observations must span a range of situations. If there is not enough local variation in exogenous variables, it will be impossible to measure their effects with statistical significance. If the variables are collinear (for example, if the only places in which there are high levels of rainfall are the same places where there are steep slopes or dense roads) then the effects of these variables cannot be separated. As the scope of the study increases, there is a higher chance of variation across all of the variables. Thus the researcher is faced with a trade-off between the problems of highly aggregated data and the limited variation within a smaller study area.
As evidence of this trade-off, the terms of trade variable cannot be used to estimate its effects on the soil of Ghana. It is a nationwide foreign trade statistic and does not vary across the country. It is difficult to arrive at an exogenous natural experiment that provides the data necessary to estimate the effects of economic variables on soil degradation. Given time-series data, one could measure how the history of variations among the terms of trade affected soil degradation. With an intimate knowledge of the history and situation within Ghana it may be possible to find a natural experiment to measure the socio-economic effects on soil. However, discovery of natural experiments is difficult and requires both insight and luck. As a result, estimation of the effects of socio-economic variables on soil degradation is left for future work.
Fortunately for these research purposes, weather does vary across Ghana, and allows the Africa-wide instrumental variable estimation to be repeated at the subnational scale. Again the procedure involves a two-stage process, with the first stage involving an estimate of predicted soil degradation based on exogenous factors. Table 5 presents the first-stage results, with a high R squared. Irrigation intensity is associated with increased soil degradation. The parameter for wind is not significant, however, perhaps because of a lack of variation in the windspeed dataset within Ghana. Surprisingly, degradation decreases with slope. This is opposite to the result for the continent-wide regression. This could be due to degradation preventing phenomena correlated with slope that are common in Ghana but not common across Africa. It could also be due to the coarseness of the GLASOD dataset. If the second possibility is believed to be driving the parameters, one option would be to estimate the first stage of the procedure for all of Africa and then to apply the recovered coefficients to forecast the soil degradation variable to be used in the second stage, adjusting the quality of its parameters if necessary.
For the second stage, the fraction of people who fell below the poverty line was regressed as a function of the other variables, including the predicted soil degradation estimated in the first equation. The results of the two-stage regression are presented in Table 6. The explanatory power of the regression is high, as evidenced by the large R squared statistic. Population density is the only variable that is not significant. Increased road density is associated with a lower poverty rate, as is increased irrigation project density. The coefficient for the instrumented soil degradation variable is positive and significant, indicating that increased soil degradation is associated with a higher incidence of poverty. Analogous to the maps generated for the Africa-wide regression, Figure 16 presents the share of soil degradation among the poverty inducing factors in Ghana.
One further point of clarification: the sign of the poverty parameter can be confusing when compared with the Africa-wide regression because the variable used in the Africa-wide regression was income, while it was the poverty rate for the Ghana regression. This means that a positive coefficient for the Ghana regression implies that increased soil degradation increases poverty while a negative coefficient for the Africa regression implies that increased soil degradation increases poverty.
This concept paper has demonstrated some of the key issues that must be addressed in analysing causal relationships between poverty and soil degradation with the use of GIS databases. The analysis has indicated that there is considerable potential for integrating current and emerging GIS-based datasets on environmental characteristics with socio-economic data to analyse interactions between poverty and environmental degradation, although several problems arise that need careful attention in order to resolve.
Clearly, one main issue that arises is that of integrating datasets at different scales of resolution, with trade-offs existing between the level of detail captured in the analysis and the extent to which datasets generated at a larger resolution can be used for analyses at a more detailed level. Several factors need to be considered here. First, the level of analysis that is appropriate for analysing the problem under consideration must be addressed. The degree to which the variables under consideration show significant patterns of variation is one important consideration in determining appropriate scales of analysis. The relevant spatial scale of the problem under consideration may vary from the farm to the watershed, at national and global levels. At this point, it is frequently the case that GIS data are not readily available at the finer scales of resolution, so the question then becomes to what extent datasets at a larger scale of resolution can be used to recover relationships generated at a finer scale, and the pitfalls in doing so. For example, a farm-level, time-series dataset which includes environmental characteristics would be ideal for conducting the type of analyses attempted in this paper; however, in the absence of such data, datasets at a much larger scale of resolution with several caveats on the implications of using averaged values across large spatial scales have been utilized.
Another important issue is the degree to which information is lost (or gained) in the process of integrating datasets at different scales of resolution. In describing the creation of the national level dataset for the regressions in this paper, it was discussed how some of the detail contained in the physical datasets was lost in the process of creating average national level values. It was also noted that it is desirable and possible to perform a more sophisticated integration of data than that used in this study, one which incorporates information on the variance and correlation of the geophysical datasets.
The extent to which the datasets being employed in an analysis are generated by observed data versus data derived from model simulations is an important factor that must be considered in interpreting results. Statistical analysis performed on datasets that incorporate the products from simulation models may simply recover the modelling and assumptions that were put into the dataset development. It is therefore critical to be aware of how the data used in the analysis were generated and take this into consideration when conducting analyses with them.
This paper has discussed at length the problems that arise in conducting statistical analyses of systems characterized by feedback loops among variables, which generate endogeneity between explanatory and dependent variables. One important means of controlling for this effect, which has been noted but not employed in this study, is to use simulation models to separate the different types of phenomena that contribute to soil degradation. This would provide a systematic method of controlling for environmental factors. One example of how this technique has been used, and which could serve as a basis for future research, is a study conducted under the auspices of the International Soil Reference and Information Centre (ISRIC) on the impact of land degradation on food productivity (Mantel and van Engelen, 1997). In this study, data on terrain and soils at a scale of 1:1 M for Uruguay, northern Argentina and Kenya were used together with simulation models for erosion risk and crop production in order to derive a map of the potential yield impacts of erosion over a 20-year period. However, it is important to note that using simulations instead of observed data changes the nature of the regression from a data-driven exercise to a summary of the current state of knowledge and expertise.
Adopting methods that are being used for poverty mapping can be an important way to overcome problems of the availability of data measuring welfare, particularly at finer scales of resolution. Essentially, there are two techniques, and the choice between them depends on the type of data available. Both require a combination of census and sample survey data. The first technique requires more variables in the census dataset, as well as access to household level data from the census. A set of variables from the sample survey dataset is selected as explanatory variables in an estimation of consumption or some other measure of welfare. The variables selected are limited to those found in both datasets, as the second step of the process involves applying the parameter estimates from the sample survey estimation to the census data. The household level values of the explanatory variables are multiplied by the corresponding parameter estimates and summed, which gives a predicted value of the welfare indicator for each household in the study area. These predicted values can then be aggregated at varying scales.
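The two steps of this first technique can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any particular poverty-mapping study: the survey and census records, the variable names (household size, years of schooling) and the helper `ols` are all hypothetical, and the survey data are constructed without noise so that the arithmetic is easy to follow.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical sample survey: (household_size, years_schooling, consumption)
survey = [(2, 4, 130.0), (4, 2, 100.0), (6, 6, 130.0), (3, 8, 165.0), (5, 0, 75.0)]
# Hypothetical census households: (area, household_size, years_schooling)
census = [("north", 2, 6), ("north", 4, 4), ("south", 6, 2), ("south", 5, 1)]

# Step 1: estimate consumption on variables found in both datasets
X = [[1.0, size, school] for size, school, _ in survey]
y = [consumption for _, _, consumption in survey]
beta = ols(X, y)

# Step 2: apply the survey estimates to each census household, then aggregate
predicted = {}
for area, size, school in census:
    welfare = beta[0] + beta[1] * size + beta[2] * school
    predicted.setdefault(area, []).append(welfare)
area_mean = {area: sum(values) / len(values) for area, values in predicted.items()}
print(area_mean)
```

Because the predictions are made household by household, they can be aggregated to any spatial unit for which the census records carry an identifier.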
The second technique also involves a combination of census and sample survey data, but requires fewer variables from the census dataset and access to data at a community level of aggregation only. In this technique, a welfare or poverty measure is estimated at the community level by applying the results of a regression on the sample survey data to the average census values for each community. An important assumption made in this estimation process is that the level of variability in welfare is not high within the community, but rather that the significant variation occurs among communities. Either of these techniques can be used to derive a measure of welfare at varying scales of resolution, which could then be overlaid with environmental data or simulation results in order to analyse poverty and land degradation relationships.
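A minimal sketch of this community-level technique is given below. The coefficients stand in for estimates from a survey regression and, like the community averages and variable names, are hypothetical values chosen for illustration. Only community means are needed, which is why the approach says nothing about the spread of welfare within a community.

```python
# Hypothetical coefficients, as if estimated from sample survey data:
# intercept, household size, years of schooling
beta = [100.0, -5.0, 10.0]

# Hypothetical community-level census averages (no household records required)
community_means = {
    "north": {"household_size": 3.0, "years_schooling": 5.0},
    "south": {"household_size": 5.5, "years_schooling": 1.5},
}

def community_welfare(means, coefs):
    """Predicted welfare for a community from its average characteristics."""
    return (
        coefs[0]
        + coefs[1] * means["household_size"]
        + coefs[2] * means["years_schooling"]
    )

welfare_by_community = {
    name: community_welfare(means, beta) for name, means in community_means.items()
}
print(welfare_by_community)
```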
It has been demonstrated how instrumental variables can be used in situations where endogeneity exists between the variables being analysed. The choice of a good proxy that is exogenous to the system being considered, but that has the same effects as the variable or variables it is intended to reflect, is critical but difficult, and requires careful analysis and justification. Such an analysis was outside the scope of the present study, so the techniques involved have been illustrated using variables for which data were readily available, although these may not be the optimal choice of instruments. In particular, identifying appropriate instruments for socio-economic status is fraught with difficulty and requires considerably more in-depth analysis than was possible for this study.
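The logic of the instrumental variable correction can be shown with a small constructed example. The numbers are hypothetical and were chosen so that the instrument z is exactly uncorrelated with the unobserved disturbance u; x plays the role of an endogenous regressor (for instance a welfare measure) and y the outcome. With a single instrument, the estimator reduces to the ratio of two covariances. This is a sketch of the textbook estimator, not a reproduction of the estimations in this paper.

```python
def covariance(a, b):
    """Sample covariance using the 1/n convention (the scale cancels in ratios)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n

TRUE_BETA = 2.0  # hypothetical causal effect of x on y

z = [1.0, -1.0, 1.0, -1.0]   # instrument: exogenous, shifts x
u = [1.0, 1.0, -1.0, -1.0]   # unobserved disturbance, uncorrelated with z
x = [zi + ui for zi, ui in zip(z, u)]               # x depends on u: endogenous
y = [TRUE_BETA * xi + ui for xi, ui in zip(x, u)]   # outcome

ols_estimate = covariance(x, y) / covariance(x, x)  # biased, since x moves with u
iv_estimate = covariance(z, y) / covariance(z, x)   # uses only variation due to z

print(ols_estimate, iv_estimate)
```

In this constructed case ordinary least squares overstates the effect, while the instrumental variable estimate recovers the assumed coefficient. With real data the result depends entirely on whether the chosen instrument is in fact exogenous.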
Another issue that arose in the analysis concerns the use of a variable that consists of ordered ranking values (e.g. the GLASOD soil degradation variable) in a statistical analysis, and the choice of an appropriate estimation technique. In future studies, it may be preferable to use a simulation model to estimate soil degradation based on disaggregated exogenous environmental factors, rather than the GLASOD data. Future work with GLASOD, or with other measures of soil degradation recorded as ordered values that become available, should include the use of alternative estimation techniques, such as an ordered probit or logit model. Use of such techniques may provide more significant results than those obtained with the OLS method; however, more rigorous work is needed on the appropriate explanatory variables and the underlying theoretical model before the results of such an estimation could be considered valid.
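For reference, the structure of an ordered probit model can be sketched in a few lines. The sketch assumes a single explanatory variable, four ordered classes coded 0 to 3, and hypothetical cutpoints and data; none of these are taken from GLASOD. It evaluates the class probabilities and the log-likelihood only. In practice the coefficient and cutpoints would be estimated by maximizing this log-likelihood with a numerical optimizer or a statistical package.

```python
from math import log
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function

def category_probs(xb, cutpoints):
    """P(y = j) for linear index xb and strictly increasing cutpoints."""
    cumulative = [0.0] + [PHI(c - xb) for c in cutpoints] + [1.0]
    return [hi - lo for lo, hi in zip(cumulative, cumulative[1:])]

def log_likelihood(beta, cutpoints, xs, ys):
    """Sum of log P(y_i | x_i); ys are class indices 0..len(cutpoints)."""
    return sum(
        log(category_probs(beta * x, cutpoints)[y]) for x, y in zip(xs, ys)
    )

cutpoints = [-1.0, 0.0, 1.0]   # hypothetical thresholds between four classes
xs = [-2.0, -1.0, 0.5, 2.0]    # hypothetical explanatory variable
ys = [0, 1, 2, 3]              # hypothetical ordered degradation classes

probs_at_zero = category_probs(0.0, cutpoints)
ll_positive = log_likelihood(1.0, cutpoints, xs, ys)
ll_negative = log_likelihood(-1.0, cutpoints, xs, ys)
print(probs_at_zero, ll_positive, ll_negative)
```

Unlike OLS, this formulation uses only the ordering of the classes and does not treat the distance between adjacent ranks as equal.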
Despite the problems involved with the data and variables used in the analyses presented in this paper, the results obtained from the statistical analyses provide some interesting food for thought. The national level regressions for the continent of Africa using changes in the terms of trade indicated that countries that had experienced improvements in the terms of trade exhibited higher levels of soil degradation. No conclusive relationship can be established on the basis of this analysis, not least because the level of degradation derived from GLASOD data represents a cumulative effect up to the late 1980s, when the data were collected, while changes in the terms of trade represent only a fairly recent phenomenon, occurring between 1979 and 1984. The degree to which soil degradation occurred earlier than 1979, and the lag time between changes in economic variables and the manifestation of soil degradation, are obviously critical factors which will have an impact on the degree to which the relationships between the two are meaningful. However, these results indicate that investigations into the relationship between global economic phenomena and the local management of natural resources are quite important, and suggest that improvements in some measures of economic welfare may be positively associated with increased environmental degradation, a position that is frequently cited in the debate over the implications of globalization for resource management patterns.
The subnational analysis conducted for Ghana also provides some suggestive results for future research. This analysis suggests that soil degradation is a significant determinant of poverty, which implies that policies that lead to improvements in soil quality can be an important means of poverty reduction. Again, the analysis is not at a stage where definitive conclusions can be drawn; however, it does provoke interest in pursuing this line of research for its potential implications for both poverty alleviation and improvements in the management of environmental resources.
Finally, it has been illustrated how mapping techniques can be used to present the results derived from statistical analyses, providing an important means of communicating fairly complicated relationships. Because the heterogeneity of regions is of key importance in the understanding and management of these issues, the results are projected into maps. These maps illustrate the relative contribution of each factor at each location and allow the differences between regions to be understood through visual means. They provide a graphic presentation of statistical analyses in a form that facilitates communication with policy-makers as well as with other field-based practitioners and researchers.
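One simple way of preparing such a map layer is to multiply each estimated coefficient by the local value of its variable, which gives the contribution of that factor to the predicted outcome at each location. The sketch below assumes a linear model; the coefficients, factor names and regional values are hypothetical and are not the estimates reported in this paper.

```python
# Hypothetical coefficients from a fitted linear regression
coefficients = {"rainfall": 0.25, "slope": 0.5}

# Hypothetical explanatory-variable values for each mapped region
regions = {
    "region_1": {"rainfall": 8.0, "slope": 2.0},
    "region_2": {"rainfall": 2.0, "slope": 6.0},
}

def contributions(values, coefs):
    """Contribution of each factor (coefficient x local value) to the prediction."""
    return {name: coefs[name] * values[name] for name in coefs}

def dominant_factor(values, coefs):
    """Factor with the largest absolute contribution at a location."""
    contrib = contributions(values, coefs)
    return max(contrib, key=lambda name: abs(contrib[name]))

map_layer = {name: contributions(vals, coefficients) for name, vals in regions.items()}
dominant = {name: dominant_factor(vals, coefficients) for name, vals in regions.items()}
print(map_layer)
print(dominant)
```

Each entry of the resulting layer can be joined to region boundaries in a GIS and shaded by contribution or by dominant factor.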
Anselin, L. 1988. Spatial econometrics: methods and models. Amsterdam, Kluwer.
Ardila, S. & Innes, R. 1993. Risk, risk aversion, and on-farm soil depletion. J. Environmental Economics and Management, 25: 27-45.
Dubin, R.A. 1998. Spatial autocorrelation: a primer. J. Housing Economics, 7: 304-327.
Ekbom, A. & Bojo, J. 1999. Poverty and environment: evidence of links and integration into the country assistance strategy process. Discussion Paper No. 4. Environment Group, Africa Region, Washington, DC, The World Bank.
FAO. 1978. Report on the Agro-ecological Zones Project. Methodology and Results for Africa, Vol. 1. World Soil Resources Report No. 48. Rome.
FAO. 1990. Where are the best opportunities for fish farming in Ghana? The Ghana Aquaculture Geographical Information System as a decision-making tool, by J.M. Kapetsky, U. Wijkstrom, N. MacPherson, M. Vinke, E. Ataman and F. Caponera. Field Technical Report 5. FI:TCP/GHA/0051. Rome.
FAO. 1991. Geographical information systems and remote sensing in inland fisheries and aquaculture, by G.J. Meaden and J.M. Kapetsky. Technical Fisheries Paper No. 318. Rome.
FAO. 1994. Land degradation in South Asia: its severity, causes, and effects upon the people. World Soil Resources Report No. 78. Rome.
FAO. 1996a. Preliminary results and conclusions on population distribution in relation to agro-ecological zones, by F. Nachtergaele, L. Jansen and M. Zanetti. AGLS Working Paper. Rome.
FAO. 1996b. Agro-ecological zoning guidelines. FAO Soils Bulletin No. 73. Rome.
FAO. 1999. Spatial aspects of the design and targeting of agricultural development strategies, by S. Wood, K. Sebastian, F. Nachtergaele, D. Nielsen and A. Dai. EPTD Discussion Paper 44. Rome.
FAO/IIASA. 1994. Agro-ecological land resources assessment for agricultural development planning, by G. Fischer and J. Antoine. World Soil Resources Report No. 71/9. FAO, Rome and the International Institute for Applied Systems Analysis, Laxenburg, Austria.
Ghosh, M. & Rao, J. 1994. Small area estimation: an appraisal. Statistical Science, 9(1): 55-93.
Golan, A., Judge, G. & Miller, D. 1996. Maximum entropy econometrics. New York, John Wiley.
Goldstein, M. & Udry, C. 1999. Agricultural innovation and resource management in Ghana. Final Report under MP17. Washington, DC, International Food Policy Research Institute.
Greene, W.H. 1990. Econometric analysis. Englewood Cliffs, NJ, USA, Prentice Hall.
Grepperud, S. 1997. Soil depletion choices under production and price uncertainty. Discussion Paper No. 186. Oslo, Statistics Norway.
Henninger, N. 1998. Mapping and geographic analysis of human welfare and poverty - review and assessment. Washington, DC, World Resources Institute.
IFPRI. 2000. Global study reveals new warning signals: degraded agricultural lands threaten world's food production capacity. IFPRI News Release.
Kirschke, D., Morgenroth, S. & Franke, C. 1999. How do human-induced factors influence soil erosion in developing countries? Paper presented at the International Workshop on Addressing the Impact of Agricultural Research on Poverty Alleviation, San Jose, Costa Rica.
LaFrance, J. 1992. Do increased commodity prices lead to more or less soil degradation? Australian J. Agricultural Economics, (1): 57-82.
Mantel, S. & van Engelen, V.W.P. 1997. The impact of land degradation on food productivity - case studies of Uruguay, Argentina and Kenya. Volume 1: Main Report. Report 97/01. Wageningen, the Netherlands, International Soil Reference and Information Centre.
Roberts, M. & Osgood, D. 2000. Estimating multinomial choice models with spatial autocorrelation. Draft paper, University of California at Berkeley, USA, Department of Agricultural and Resource Economics.