# 3. Methods of poverty mapping

A variety of methods for locating the poor spatially have been put forward in the literature and in practice; most are under continuing development. This section describes the major methods in use around the globe and the context in which each has been employed.

## Small-area estimation

Small-area estimation is a statistical technique that combines survey and census data to estimate welfare or other indicators for disaggregated geographical units such as municipalities or rural communities. Small-area estimation applies parameters from a predictive model to identical variables in a census or auxiliary database; the assumption is that the relationship defined by the model holds for the larger population as well as the original sample. This technique has been used by the United States government for planning and targeting purposes (Ghosh and Rao, 1994).

Small-area estimation has more recently been extended to developing countries for poverty mapping. Two principal methods have emerged. The first uses census data on household units. It was developed principally by staff at the World Bank and is the main methodology used and promoted by the Bank's new poverty-mapping group (World Bank, 2000). The second uses community-level averages instead of data on household units and has been employed by researchers at the World Bank and various centres of the CGIAR system. These econometric models are not causal: they do not seek to explain the determinants of poverty, but to maximize precision in identifying the poor. This is an important distinction in terms of the kinds of explanatory variables that are utilized.

### Household-level method

This method was developed in Hentschel et al. (2000) and Elbers, Lanjouw and Lanjouw (2001); it is presented in Deichmann (1999) and World Bank (2000), from which the following discussion is derived.

The method requires a minimum of two sets of data: household-level census data and a representative household survey corresponding approximately to the same period as the census. In Nicaragua, for example, poverty maps have been made using data from a 1995 population census and a 1998 Living Standards Measurement Study (LSMS) survey; in Ecuador, data from a 1990 population census were used with 1994 survey data. The maximum allowable time difference varies with the rate of economic change in a given country. Most efforts have used a population census with data on household units, but an agricultural census that includes basic demographic information, such as the 1997 Chinese agricultural census, or another sufficiently representative large-scale survey could be used instead. Elbers et al. (2001) provide an example of small-area estimation in Brazil using a large-scale household survey instead of a population census; Minot and Baulch (2002a) use a 3 percent sample of the 1999 Vietnamese population and housing census. Efforts are currently underway to test the use of the standardized demographic and health surveys (DHS) on health and nutrition in small-area estimation (Macro International, 2002).

The first step is to estimate a model of consumption-based household welfare2 using data from the household survey. This model should be estimated by statistically representative regions, or urban or rural areas, with explanatory variables limited to those found in both data sets.

The following equation is estimated using ordinary least squares:

(1)    lnC = α + β1X + β2V + ε

where C is total per-capita consumption, or another food-security or poverty proxy, X is a matrix of household-level characteristics and V is a matrix of geographical-level characteristics.

The resulting parameter estimates are applied to the census data. For each household, the estimated parameters from the regression are used to compute the probability of each household in the census living in poverty. Household-level results can then be aggregated by the geographical region concerned by taking the mean of the probabilities for the chosen geographical entities.

For each household, the household-level value of the explanatory variable is multiplied by the corresponding parameter estimate, which in this case gives a predicted value of the log of total per-capita consumption for each household in the study area. The estimated value of the benchmark indicator is then used to determine the probability of a household being food-insecure or poor in terms of a given threshold below which a household is food-insecure or poor whether based on consumption, caloric intake or anthropometric measures. Here:

(2)   Fij = 1 if lnCij < lnz;
Fij = 0 otherwise

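with the corollary in poverty analysis being the headcount index. Following Hentschel et al. (2000) and using the model of consumption from equation (1) but with only one vector of explanatory variables for exposition purposes, the expected food-security status of household i is:

(3)   $E[F_i \mid X_i, \beta, \sigma] = \Phi\left(\dfrac{\ln z - X_i'\beta}{\sigma}\right)$

where Φ is the cumulative standard normal distribution and σ is the standard deviation of the error term ε. This equation gives the probability that a household is food insecure. Estimates of β and σ are obtained from the model of the benchmark indicator, providing the following estimator of the expected food insecurity of household i in the census:

(4)   $F_i^* = E[F_i \mid X_i, \hat{\beta}, \hat{\sigma}] = \Phi\left(\dfrac{\ln z - X_i'\hat{\beta}}{\hat{\sigma}}\right)$

Regional food insecurity, F, is found with:

(5)   $F = \dfrac{1}{N}\sum_{i=1}^{N} F_i$

where N is the number of households in a specific region or geographical unit. Expected food insecurity is found with:

(6)   $E[F \mid X, \beta, \sigma] = \dfrac{1}{N}\sum_{i=1}^{N} E[F_i \mid X_i, \beta, \sigma]$

The incidence of food insecurity is calculated as the mean of the probability of households being food-insecure:3

(7)   $F^* = \dfrac{1}{N}\sum_{i=1}^{N} \Phi\left(\dfrac{\ln z - X_i'\hat{\beta}}{\hat{\sigma}}\right)$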

F* can be calculated for different levels of food insecurity. Food-insecurity measures comparable to the depth and severity of poverty (Foster, Greer and Thorbecke, 1984) and any number of the standard poverty measures can be constructed.
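The two-stage procedure can be illustrated with a minimal sketch in Python, using only the standard library. It fits the log-consumption model on a survey by ordinary least squares, applies the estimated parameters to census households, converts each prediction into a probability of falling below the threshold using the standard normal distribution, and averages those probabilities by region. All data values, variable choices (years of schooling and household size), region names and the threshold are hypothetical, and the sketch ignores the complications discussed in the literature cited here, such as heteroscedasticity and spatial autocorrelation.

```python
import math
from statistics import NormalDist

def ols(X, y):
    """Fit y = X b by least squares via the normal equations (X'X) b = X'y.
    Each row of X starts with 1.0 for the intercept."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(A[row][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    sigma = math.sqrt(sum(e * e for e in resid) / (len(y) - k))
    return beta, sigma

def prob_below_threshold(x_row, beta, sigma, ln_z):
    """Probability that predicted log consumption falls below ln z."""
    predicted = sum(b * x for b, x in zip(beta, x_row))
    return NormalDist().cdf((ln_z - predicted) / sigma)

def regional_incidence(census, beta, sigma, ln_z):
    """Mean of the household probabilities within each region."""
    sums, counts = {}, {}
    for region, x_row in census:
        p = prob_below_threshold(x_row, beta, sigma, ln_z)
        sums[region] = sums.get(region, 0.0) + p
        counts[region] = counts.get(region, 0) + 1
    return {region: sums[region] / counts[region] for region in sums}

# Hypothetical survey: [intercept, years of schooling, household size].
survey_X = [[1, 2, 6], [1, 4, 5], [1, 6, 5], [1, 8, 4],
            [1, 10, 3], [1, 12, 4], [1, 3, 7], [1, 9, 2]]
survey_lnC = [5.3, 5.4, 6.1, 6.2, 7.1, 6.8, 5.0, 6.9]

# Hypothetical census households carrying the same variables.
census = [("North", [1, 3, 6]), ("North", [1, 5, 5]),
          ("South", [1, 10, 3]), ("South", [1, 11, 4])]

beta, sigma = ols(survey_X, survey_lnC)
incidence = regional_incidence(census, beta, sigma, ln_z=math.log(400))
print(incidence)
```

The census itself never needs to record consumption: only the explanatory variables shared with the survey are required, which is why the first-stage model is restricted to variables found in both data sets.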

Although the concept is straightforward, application in practice presents a number of econometric and computational challenges, including the large size of census data sets, non-normality, spatial autocorrelation and heteroscedasticity; these are discussed in detail in Elbers, Lanjouw and Lanjouw (2001). One virtue of this methodology is the relative ease of checking the reliability of estimates, a facility that is built into the programmes provided by the World Bank to national poverty-mapping analysts. The size of standard errors in these estimates depends largely on the degree of disaggregation sought and the explanatory power of the exogenous variables in the first-stage model. Demombynes et al. (2002), for example, show that relatively precise poverty estimates can be made at the third administrative level, which for Ecuador and Madagascar means approximately 1 000-2 000 households and for South Africa 20 000 households. The optimal degree of disaggregation will depend on:

• the purpose of the map;
• the sampling properties of the household data;
• trade-offs between the size of standard error and policy needs.

The other virtue of this approach is that it has the institutional backing of the World Bank and a team of researchers concerned with developing methodology and training. It is the only method where statistical properties have been and continue to be thoroughly investigated.

The Nicaraguan government and in particular the Fondo de Inversión Social de Emergencia (FISE), with support from the World Bank, have adopted and applied the household-unit data method for creating poverty maps for planning purposes and future targeted programmes such as the Red de Protección Social (RPS; Social Protection Network) anti-poverty programme (Government of Nicaragua, 2001). This method was pioneered in Ecuador (see Map 1) and has been used to create poverty maps for targeting and policy-making in Panama (World Bank, 2000) and South Africa (Alderman et al., 2000); the World Bank and the International Food Policy Research Institute (IFPRI) are supporting efforts in Cambodia, Guatemala, Kazakhstan, Kenya, Madagascar, Malawi, Mozambique, United Republic of Tanzania, Uganda and Viet Nam (J. Lanjouw, personal communication, 2001; N. Minot, personal communication, 2001; S. Wood, personal communication, 2001; Snel and Henninger, 2002). In their case studies, Snel and Henninger (2002) provide detail on ways in which these poverty maps have been put to use in different countries.

Researchers from IFPRI are designing the national maps of Malawi and Mozambique with the aim of building a regional poverty map that could be expanded to include other East African countries. Such an effort means that the challenge of constructing comparable poverty lines and indices over two or more countries will have to be overcome. (T. Benson, personal communication, 2001; K. Simler, personal communication, 2001; S. Wood, personal communication, 2001.)

### Community-level data method

An alternative small-area estimation method uses average values from disaggregated geographical units such as communities or small towns instead of household-unit data. This has the advantages that data requirements are less stringent and national statistical agencies may be more likely to release community averages on request; indeed, these data may already be published. This is particularly important for researchers who, unlike the World Bank researchers, do not have the institutional backing or resources to establish formal collaborative arrangements with national statistical agencies. Apart from the difference in the scale of the predictive model, the two small-area estimation methods follow essentially the same steps. The first step is to estimate a model of consumption-based household welfare using the household survey data, as shown in equation 1. The resultant parameters are then used to predict the expected level of well-being for communities.4

Map 1
Ecuador poverty map, small-area estimation method

Source: U. Deichmann, personal communication, 2002.

Predicted mean consumption in a community is not necessarily a good proxy for poverty, however, because poverty measures are functions of mean consumption and the distribution of consumption in a community. Bigman uses a Taylor expansion of the head count to obtain an expression for the measure of poverty as a function of mean consumption and spread parameters such as the standard error of the regression.

The expected head-count measure of poverty P0 in a community j will be equal to:

(8)   E(P0j ) = E(Prob [lnyij < 0]) = E(Φ(-Xijβ/ρj))

where lny is the log of the level of consumption per adult in household i, Xij is a matrix of individual and community level variables, Φ is the standard cumulative normal distribution and ρj is the standard error of uj. Xij, however, is not observed outside the household survey sample, and even within the sample the number of observations per community is usually too small. Estimates of the means of all variables in each community Xj are available, however. Since this expression is non-linear, Xj cannot be substituted for Xij, though an approximation may be obtained using Taylor expansions.
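The consequence of the non-linearity can be seen in a short numerical sketch. Write a for the argument of Φ in equation (8). Evaluating Φ at the community mean of a differs from averaging Φ over households, and a generic second-order Taylor expansion around the mean, Φ(μ) + ½Φ''(μ)Var(a) with Φ''(μ) = -μφ(μ), recovers part of the gap. This is an illustration of the idea under assumed values only; it is not claimed to be the exact expansion used in the studies cited, and the household values of a below are hypothetical.

```python
from statistics import NormalDist, fmean, pvariance

N01 = NormalDist()  # standard normal distribution

def exact_headcount(a_values):
    """Average of Phi(a) over households (needs household-level data)."""
    return fmean([N01.cdf(a) for a in a_values])

def naive_headcount(a_values):
    """Phi evaluated at the community mean of a (ignores the spread)."""
    return N01.cdf(fmean(a_values))

def taylor_headcount(a_values):
    """Second-order Taylor approximation around the community mean.
    Uses Phi''(mu) = -mu * phi(mu), where phi is the normal density."""
    mu = fmean(a_values)
    var = pvariance(a_values, mu)
    return N01.cdf(mu) - 0.5 * mu * N01.pdf(mu) * var

# Hypothetical household-level values of a within one community.
a_values = [-1.6, -1.2, -0.9, -0.5, -0.2, 0.3]

exact = exact_headcount(a_values)
naive = naive_headcount(a_values)
taylor = taylor_headcount(a_values)
print(round(exact, 4), round(naive, 4), round(taylor, 4))
```

The correction term needs only the community mean and a measure of spread, which is why the approach remains feasible when household-level values are not observed outside the survey sample.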

This process relies on a series of assumptions. First, the variance around mean consumption within each village must be assumed to be constant. Second, in the consumption equation, the behavioural model inside and outside the household sample must be assumed to be constant. This may be a problem in a country where the geographical dimension is important and has not been taken into account in the sampling design. Within the sample, the problem can be addressed econometrically by testing the stability of the estimates between urban and rural areas, across regions or across other spatial units. Third, only limited information may be available at the community level for some variables, even in terms of means, which could lead to a problem of a missing variable.

This method has been frequently employed. Minot (2000) utilizes Viet Nam's 1994 agricultural census and the 1993 LSMS to create a national poverty map, relying on district-level averages to predict district-level poverty rates. Bigman et al. (2000) use a population census and household survey for a similar purpose in Burkina Faso. Bigman and Srinivasan (2001) likewise use a population census and household survey in India. Bigman and Huang (2000) have proposed a similar approach using data from the 1997 China agricultural census. Using data from Kenya, Bigman and Loevinsohn (1999) show how the community-level data method can be used in targeting agricultural research and development for poverty reduction. Godilano et al. (2000) have done preliminary work in linking disaggregated poverty incidence to environmental risk such as flooding and suitability for rice production in Bangladesh.

Easier access to data makes this method attractive, but the error associated with estimation for units of different population sizes has not been thoroughly investigated. To date, only one study, Minot and Baulch (2002b), has looked into how much precision is lost when using census data aggregated to community level or any other level. They find that the greater the disaggregation of the data, the more precise the estimates; errors in estimates based on census enumeration areas average approximately two percentage points. From another perspective, 98 percent of provincial poverty estimates had errors of less than five percentage points. Using census data aggregated to province level resulted in almost one third of provincial poverty estimates having errors of less than five percentage points. The study found that the magnitude of error varies with the estimated incidence of poverty, with error at its smallest when the poverty rate is close to zero, 50 percent or 100 percent. The authors conclude that the best option is to use household-level data; if these are unavailable, community-level census data can be used to generate reasonably accurate poverty estimates.

## Multivariate weighted basic-needs index

Various basic-needs indices are used for disaggregated poverty mapping. They differ among themselves in terms of the choice of variables and weighting schemes. This section focuses on an assortment of weighting schemes. Three are based on multivariate statistical techniques - principal components, factor analysis and ordinary least squares. The others have no weighting scheme; all components are valued equally.

### Principal components

An alternative method of disaggregating poverty measures to the community level is that used by the Mexican Government. This methodology was first utilized to create a marginality index for policy-planning purposes and then as part of the targeting mechanism of the PROGRESA anti-poverty programme.5 Localities were deemed eligible for the programme in terms of a ranking of the marginality index. Selection of households was then based on the results of a census administered in the communities.

This US\$1 billion programme provides bimonthly cash transfers to over three million rural households, in exchange for which the children are sent to school and given medical examinations. The marginality index was developed using the method of principal components, based on seven community level variables from a combination of the 1990 and 1995 population census. In this case, four variables came from the 1995 Conteo, or population count:6

1. share of illiterate adults (persons over 14 years old) in the locality;
2. share of dwellings without water;
3. share of dwellings without drainage;
4. share of dwellings without electricity.

Three variables came from the 1990 population census:

1. average number of occupants per room;
2. share of dwellings with dirt floor;
3. share of population working in the primary sector.

The principal components statistical technique reduces a given number of variables by extracting linear combinations that best describe the variables, in this case transforming seven variables into one index. The first principal component, the linear combination capturing the greatest variance, can be converted into factor scores that serve as weights for the creation of the marginality index.7 The marginality index was then divided into five groups based on the degree of marginality. The cutoff points were determined by the Dalenius-Hodges statistical procedure.8
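A minimal sketch of how a first-principal-component index can be built is given below, using only the Python standard library. Each variable is standardized, the leading eigenvector of the correlation matrix is found by power iteration, and the index for each locality is the weighted sum of its standardized values. The locality data and the use of four rather than seven variables are hypothetical, and the subsequent stratification of the index into five groups is not shown.

```python
import math
from statistics import fmean, pstdev

def standardize(rows):
    """Convert each column to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    means = [fmean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

def first_component(z_rows, iterations=500):
    """Leading eigenvector of the correlation matrix, by power iteration."""
    k, n = len(z_rows[0]), len(z_rows)
    corr = [[sum(r[i] * r[j] for r in z_rows) / n for j in range(k)]
            for i in range(k)]
    w = [1.0] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    return w

# Hypothetical localities: shares of illiterate adults and of dwellings
# without water, without drainage and without electricity.
localities = [
    [0.10, 0.05, 0.20, 0.02],
    [0.15, 0.10, 0.30, 0.05],
    [0.30, 0.40, 0.55, 0.20],
    [0.35, 0.50, 0.70, 0.30],
    [0.50, 0.80, 0.90, 0.60],
    [0.20, 0.25, 0.35, 0.10],
]

z = standardize(localities)
weights = first_component(z)
index = [sum(w * v for w, v in zip(weights, row)) for row in z]
print([round(w, 3) for w in weights])
print([round(v, 3) for v in index])
```

Because all four deprivation shares move together in this example, every weight is positive and a higher index value indicates greater marginality.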

Of 105 749 localities with a population greater than 50 individuals, only 74 994 - accounting for 97 percent of the population - had data on all seven variables. For the remaining 29 698 localities missing one or more of the seven variables, regression techniques were used to estimate the marginality index. A different equation was used to estimate the marginality index for 1 720 localities in Chiapas for which no data were collected in 1995 because of social unrest. Over 99 000 localities with fewer than two households, accounting for 585 944 inhabitants, or 0.64 percent of the population, were not included in the calculations. These households were initially excluded from the index and the programme.

For logistical, financial and programmatic reasons, the index was then crossed with other spatially based criteria - geographical location, distance between localities and access to health and school infrastructure - in order to determine inclusion in the programme. Data from other ministries were combined with GIS, and service zones were established by a process of characterizing localities according to their access to these services, taking into account the quality of roads when public services were not located in the same community. Another statistical routine was then used to choose household beneficiaries within these communities. The statistical properties of this index have not been determined. The sampling error associated with the marginality index is therefore not known. An evaluation of the PROGRESA method that compares the allocation of localities with a method similar to community-level small-area estimation is discussed in the section Method matters.

### Principal components over time

FAO and Columbia University are using principal components in ongoing joint work to construct a poverty map for Costa Rica. The map is for use in analysing the relationship between poverty and deforestation over time. The principal components technique was chosen in preference to small-area estimation methods for two reasons: first, poverty maps were to be constructed over time for four decades, with one observation per decade, corresponding to deforestation data, but household-survey data are available only for the last two decades; second, it is feared that income data are biased.9

The principal-components methodology is similar to that used by PROGRESA, but in Costa Rica a comparable index over time was required. In order to construct time-series indices in the same scale, community-level averages at the district level were pooled over census years. Maps are constructed for each year, or for differences between censuses, to show which districts have improved most over time. The basic assumption made in pooling over time is that the impact of the included variables over the four decades is averaged. Change in the marginality index is thus limited to changes in the levels of variables, not changes in the relative importance or impact of each variable in determining the index. Changes in social or economic structure, for example, may alter the importance of education over the period 1963 to 2000, but these changes are averaged over the four decades (Cavatassi, Davis and Lipper, 2002).
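The effect of pooling can be sketched as follows, again in standard-library Python with hypothetical data. District observations from all census years are stacked, the means, standard deviations and first-component weights are computed once from the stacked data, and the same weights are then applied to every year. Changes in a district's index therefore reflect only changes in the levels of its variables. The district names, years and values are illustrative assumptions.

```python
import math
from statistics import fmean, pstdev

def pooled_weights(rows, iterations=500):
    """Means, standard deviations and first-component weights
    estimated once from rows pooled over all census years."""
    cols = list(zip(*rows))
    means = [fmean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    z = [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]
    k, n = len(cols), len(rows)
    corr = [[sum(r[i] * r[j] for r in z) / n for j in range(k)]
            for i in range(k)]
    w = [1.0] * k
    for _ in range(iterations):  # power iteration for the leading eigenvector
        w = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    return means, sds, w

# Hypothetical district averages: illiteracy, no electricity, dirt floor.
census = {
    1963: {"A": [0.40, 0.70, 0.50], "B": [0.60, 0.90, 0.80]},
    1984: {"A": [0.20, 0.30, 0.25], "B": [0.45, 0.60, 0.55]},
    2000: {"A": [0.08, 0.05, 0.10], "B": [0.30, 0.35, 0.30]},
}

pooled = [row for year in census.values() for row in year.values()]
means, sds, w = pooled_weights(pooled)

# Apply the single set of pooled weights to every year.
index = {
    year: {d: sum(wi * (v - m) / s for wi, v, m, s in zip(w, row, means, sds))
           for d, row in districts.items()}
    for year, districts in census.items()
}
change_A = index[2000]["A"] - index[1963]["A"]
print(round(change_A, 3))
```

Estimating separate weights for each census year would instead let the relative importance of each variable shift between years, so that index values from different years would no longer be on the same scale.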

### Factor analysis

The South African government has created development indices based on factor analysis, a statistical technique similar to principal components. The primary purpose of factor analysis is to describe the relationships among many variables in terms of a few underlying but unobservable factors. Factor analysis is similar to principal-components analysis in that both are attempts to approximate the covariance matrix. Factor analysis, however, is more elaborate. The primary question it seeks to ask is whether data are consistent with some underlying structure (Johnson and Wichern, 1988). In factor analysis, sets of variables are grouped by their correlations; each group of variables represents a single underlying construct or factor. Although factor analysis does assist in identifying underlying factors represented by a set of variables, the method is subjective: the factors have to be interpreted to give them meaning. This interpretation relies on previous knowledge and intuition about underlying relationships.

Factor analysis with rotation was applied to 1996 population-census data in South Africa by Hirschowitz, Orkin and Alberts (2000), with the aim of providing information for allocation of public development funds. The first component, interpreted as a household infrastructure index, explained 57 percent of the variance; the second component, interpreted as the household circumstances index, explained 17 percent of the variance. The variables in each factor can be seen in Table 1.

Creating the indices required the following steps. Variables were given equal weight in both indices, which was justifiable because all factor loadings were considered relatively high. In order to put variables into comparable units, each variable in each index was arranged from high to low values and then divided into three categories - high, medium and low development. Based on its value for each variable, each province was allocated to one of the categories. The category scores were then summed for each province - with eight variables, the possible sums range from 8 to 24 - and adjusted by population size in order to provide a relative ranking of provinces by development.
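The scoring step can be sketched in a few lines of Python. Each variable is ranked across provinces and split into three equal groups scored 1, 2 and 3, and the scores are summed with equal weight. The scoring of categories as 1 to 3 is inferred from the stated range of 8 to 24 for eight variables; the province labels, the three variables and their values are hypothetical, and the population adjustment is not shown because the source does not specify it.

```python
def tercile_scores(values):
    """Score each unit 1 (low), 2 (medium) or 3 (high) by rank tercile."""
    ordered = sorted(values, key=values.get)
    n = len(ordered)
    return {unit: 1 + (rank * 3) // n for rank, unit in enumerate(ordered)}

def development_index(indicators):
    """Equal-weight sum of tercile scores over all variables."""
    totals = {}
    for values in indicators.values():
        for unit, score in tercile_scores(values).items():
            totals[unit] = totals.get(unit, 0) + score
    return totals

# Hypothetical shares of households with each amenity, nine provinces.
indicators = {
    "electricity": {"P1": 0.90, "P2": 0.80, "P3": 0.75, "P4": 0.60, "P5": 0.55,
                    "P6": 0.50, "P7": 0.40, "P8": 0.30, "P9": 0.20},
    "tap_water": {"P1": 0.85, "P2": 0.70, "P3": 0.72, "P4": 0.50, "P5": 0.58,
                  "P6": 0.45, "P7": 0.35, "P8": 0.25, "P9": 0.10},
    "formal_housing": {"P1": 0.95, "P2": 0.88, "P3": 0.60, "P4": 0.65, "P5": 0.62,
                       "P6": 0.70, "P7": 0.30, "P8": 0.40, "P9": 0.15},
}

totals = development_index(indicators)
print(sorted(totals.items(), key=lambda item: -item[1]))
```

With k variables the totals always lie between k and 3k, which reproduces the 8 to 24 range when eight variables are used.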

TABLE 1
Factor analysis in South Africa.

| Variables | Infrastructure | Circumstances |
|---|---|---|
| Living in formal housing | .65 | -.01 |
| Access to electricity for lighting | .78 | .07 |
| Tap water inside the dwelling | .83 | .12 |
| A flush or chemical toilet | .84 | .19 |
| A telephone in dwelling or cellular phone | .77 | .05 |
| Refuse removal at least once a week | .74 | .19 |
| Level of education of household head | .60 | .25 |
| Monthly household expenditure | .84 | -.08 |
| Unemployment rate | .39 | |
| Average household size | -.02 | |
| Children under five years old | .05 | |

Source: Hirschowitz, Orkin and Alberts, 2000.

### Ordinary least squares

The Nicaraguan RPS anti-poverty programme used poverty mapping to target census segments for intervention. The pilot for this programme, which began in August 2000, currently reaches approximately 10 000 geographically targeted rural households in northern Nicaragua. The programme is similar to PROGRESA: the woman heading a household receives a maximum of US\$336 per year in cash transfers in exchange for sending the children to school, attending health examinations and participating in public-health presentations. In contrast to PROGRESA, all households in the census segments were included in the programme (Government of Nicaragua, 2000).

For programmatic reasons, the pilot was located in six municipalities in two northern departments. A marginality index was required to rank census segments for targeting purposes. The index was composed of four variables - household size, percentage of households without potable water, percentage of households without latrines and percentage of illiterate adults - which were weighted by the coefficients derived from ordinary least squares regression analysis of the determinants of extreme poverty, using household data and a larger group of variables (Arcia, 1999). No evaluation of this targeting method has been conducted. The ongoing expansion phase of the RPS will utilize poverty maps based on the household-unit level small-area estimation strategy.
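A regression-weighted index of this kind amounts to a weighted sum of the four variables for each census segment, followed by a ranking. The sketch below shows the arithmetic in Python. The coefficient values, segment names and segment data are hypothetical placeholders and are not the estimates reported by Arcia (1999).

```python
# Hypothetical regression coefficients used as weights (not the published ones).
WEIGHTS = {
    "household_size": 0.04,
    "share_without_potable_water": 0.30,
    "share_without_latrine": 0.25,
    "share_illiterate_adults": 0.35,
}

def marginality(segment):
    """Weighted sum of the four segment-level variables."""
    return sum(WEIGHTS[name] * segment[name] for name in WEIGHTS)

# Hypothetical census segments.
segments = {
    "segment_A": {"household_size": 6.5, "share_without_potable_water": 0.8,
                  "share_without_latrine": 0.7, "share_illiterate_adults": 0.5},
    "segment_B": {"household_size": 5.0, "share_without_potable_water": 0.3,
                  "share_without_latrine": 0.2, "share_illiterate_adults": 0.2},
    "segment_C": {"household_size": 5.8, "share_without_potable_water": 0.6,
                  "share_without_latrine": 0.5, "share_illiterate_adults": 0.4},
}

scores = {name: marginality(data) for name, data in segments.items()}
ranking = sorted(scores, key=scores.get, reverse=True)  # most marginal first
print(ranking)
```

Because household size is measured in persons and the other variables are shares, the scale of each coefficient matters; the weights are meaningful only in the units used when the regression was estimated.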

## Combination of qualitative information and secondary data

A number of organizations use various combinations of qualitative and secondary data to create poverty maps, focusing on food security rather than poverty. These instruments tend to focus on the determinants of food security, in most cases revolving around the concept of livelihood strategies, but collect and utilize data in different ways.

### Primarily qualitative

Two variants of the livelihood approach are employed that use primary data in field vulnerability assessments. The first, the household-economy approach (HEA), developed by the Save the Children Fund in collaboration with the FAO Global Information and Early Warning System (GIEWS), has also been used by the World Food Programme (WFP) Vulnerability Analysis and Mapping System (VAM). The method has five steps (Seaman et al., 2000; Boudreau, 1998).

1. Define food-economy zones for a given region. A food-economy zone is defined as a geographical area where most households obtain food and cash income by a similar combination of economic activities.
2. In each zone, define different wealth categories. These are based on indicators of wealth identified by the people themselves, and thus relate only to categories in each zone.
3. Collect livelihood information on a typical household for a normal year in each of these categories.
4. Describe the economic context in which households live.
5. Use the above characterization as a baseline from which to hypothesize the possible impact of economic change on household income and food supply in each zone.

The main sources of data for constructing the food-economy zones and livelihood strategies are rapid rural appraisal techniques, semi-structured group interviews and interviews with key informants, supplemented by secondary data. Since food-economy zones are based on geographical areas, vulnerability and risk maps can then be constructed.

The second variant is the vulnerable-group profiles developed by FAO as part of the Food Insecurity and Vulnerability Information and Mapping Systems (FIVIMS) initiative (Huddleston and Pittaluga, 2000), which identify mutually exclusive livelihood-strategy groups first through brainstorming sessions with experts. The primary source of livelihood typically serves as the principal means of classification. These groups are further refined through participatory fieldwork techniques and secondary data, and linked with geographical areas. Each profile contains information on the factors that influence livelihoods, including asset ownership and access, mediating factors such as laws, politics and culture, external factors such as demographics, the natural resource base and macroeconomic context, and vulnerability to economic and natural shocks. Emphasis is placed on understanding the determinants of food security or poverty. Group sizes are calculated, when possible using population-census data linked through occupation codes. Vulnerability maps are then constructed.

### Primarily secondary

The “indicator” approach employed by the United States Agency for International Development (USAID) famine early-warning system (FEWS) - which is also geared to vulnerability assessment, though with a focus on identification of households rather than causality - is based mainly on secondary evidence, and less on field work (FEWS, 1999a). Stratification is by administrative unit and within administrative units, in some cases by household production strategies. These strategies may derive from information provided by NGOs, key informants or livelihood-system approaches such as HEA, described above. Food access and availability per person are then calculated at the administrative or group level (see FEWS, 1999b and 2000). These secondary data range from tables to statistical procedures to qualitative information when data are missing. Multiple vulnerability indicators are commonly combined into a single index with which areas and groups can be ranked. These indices are built using the following steps:

• determination of the primary dimensions of vulnerability such as agro-ecological, infrastructure, economic resources and coping ability;
• selection and transformation of comparable indicators;
• weighting of indicators, typically based on best judgment or expert opinion;
• ranking according to summed scores.
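The steps above can be sketched as a simple composite index in Python. Each indicator is rescaled to a common 0 to 1 range so that higher always means more vulnerable, the rescaled values are multiplied by judgement-based weights, and areas are ranked by the summed score. Min-max rescaling is one possible transformation chosen here for illustration; the indicator names, weights and area values are hypothetical.

```python
def min_max(values, higher_is_worse=True):
    """Rescale an indicator to [0, 1]; flip it if higher values are better."""
    lo, hi = min(values.values()), max(values.values())
    scaled = {area: (v - lo) / (hi - lo) for area, v in values.items()}
    if higher_is_worse:
        return scaled
    return {area: 1.0 - s for area, s in scaled.items()}

# Hypothetical indicators: (values by area, higher_is_worse, expert weight).
indicators = {
    "drought_risk": ({"X": 0.8, "Y": 0.2, "Z": 0.5}, True, 0.5),
    "road_density": ({"X": 0.1, "Y": 0.9, "Z": 0.5}, False, 0.2),
    "cereal_kg_per_person": ({"X": 120, "Y": 300, "Z": 210}, False, 0.3),
}

scores = {}
for values, higher_is_worse, weight in indicators.values():
    for area, s in min_max(values, higher_is_worse).items():
        scores[area] = scores.get(area, 0.0) + weight * s

ranking = sorted(scores, key=scores.get, reverse=True)  # most vulnerable first
print(ranking)
```

The ranking depends directly on the chosen weights, so it is worth recomputing it under alternative weightings to see how sensitive the ordering of areas is to expert judgement.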

This information is linked to geographic area and thus is commonly included in vulnerability maps.

### Statistical analysis of qualitative information combined with secondary data

This methodology can be combined with other statistical techniques. In a FEWS exercise in Malawi, for example, secondary information was collected from a variety of sources. Using statistical and conceptual cluster analysis, 154 geographical units, the extension planning areas (EPAs), were allocated to five “sphere of influence” clusters that best portrayed significant common factors influencing household food-security behaviour. Each cluster is defined by the major factor influencing food-security decisions made by the majority of households in a given area: maize, mixed agriculture, large-estate influence, non-agricultural income-generating activity or urban influence. These methods parallel the HEA method described above (WFP, 1996; FEWS, 1997).

A principal-components analysis was then conducted on outcome indicators. The results produced three main components of vulnerability - poverty, food deficiency and malnutrition - which were mapped at the EPA level. A composite vulnerability tool was then constructed, based on the weighted distance of the three principal components in the EPAs. This index was eventually discarded, however, because of a perception that it condensed so much information as to be meaningless. Regression analysis by cluster was then used to discern which factors were associated with each of the three components, so that they could be used as policy levers. Finally, a time-series analysis of vulnerability was constructed, based on a regression analysis by cluster of the opinions of eight experts as to the evolution of vulnerability in 1992-1996.

## Extrapolation of participatory approaches

Participatory assessments measure poverty in terms of local perceptions of poverty, which are identified and then extrapolated and quantified in order to construct regional poverty measures. Proponents argue that such a poverty measure is more comprehensive and represents the multidimensional nature of poverty and the processes that create and maintain it. With this indicator, poverty is defined locally in terms of perceptions of well-being and how neighbouring informants rank this perception. Utilization of this measure is thus limited to areas where people know about their neighbours, usually rural communities (Ravnborg, 1999).10

The process, described in Ravnborg (1999), is as follows. The number and location of communities in a chosen area are selected using a maximum-variation sampling strategy, taking into account factors that may explain expected variation in perceptions of well-being in the area of study. Following site selection, local perceptions are gathered in each site from community informants, who provide definitions of poverty and rank neighbours in terms of well-being. A well-being index is created and extrapolated to other communities in a region through a questionnaire put to a random sample of communities using standard sampling and survey procedures. At this point, the procedure resembles a classic proxy means analysis (Grosh and Baker, 1995), but instead of identifying key variables by multivariate regression, variables are identified by local informants and homogenized across sample sites. Leclerc, Nelson and Knapp (2000) compare Ravnborg's well-being index with a more traditional basic-needs index for the communities in the study area and find some correlation between the measures.

Leclerc, Nelson and Knapp (2000) extend the extrapolation of the Ravnborg index to the rest of the communities that make up rural Honduras. Using neural net software, artificial-intelligence techniques are utilized to link the 11 variables from the Ravnborg index to nine proxy variables from the most recent population and agricultural census and to calibrate the neural net on Ravnborg's original 12 communities. Once calibrated, the neural net is applied to the remaining rural Honduran villages for which data on all nine variables were available. In another paper, Leclerc (2002) directly matches 9 of the 11 variables from the Ravnborg index with the nine proxy census variables and computes village-level well-being indexes based on the average of well-being indexes for each household of a given village. The results of this exercise are shown in Maps 2A-D. The first column of maps refers to the well-being index under different levels of aggregation, while the second column of star plots corresponds to the average levels of the household variables from the census aggregated at the same levels.

## Direct measurement of household-survey data

Survey data have served as the basis for a number of statistics-based poverty-mapping exercises, though their sampling properties sometimes present difficult statistical challenges. Household-survey data are often clustered and collected at too aggregate a level to be of much help in constructing disaggregated poverty maps; this limitation motivated the development of the small-area estimation strategies discussed earlier. Many different kinds of survey data exist. Many countries have comprehensive household surveys with detailed consumption modules, such as the LSMS surveys described in the section on small-area estimation. Some surveys, such as the annual basic-grains survey in Nicaragua, are large and representative enough to serve as a census in small-area estimation. The light-monitoring survey (LMS) is shorter, collecting information on a series of socio-economic proxies, thus allowing a larger sample size. In comparison with LSMS, however, LMS has been shown to underestimate income and consumption, which biases both the estimated magnitude of poverty and its spatial distribution (Fofack, 2000).

Georeferenced household surveys have the potential to be re-aggregated into new units of analysis and thus help to create novel poverty maps; DHS on health and nutrition are an example. DHS does not include consumption, but proxies income by creating asset indices (Filmer and Pritchett, 1998). Henninger (1998), for example, describes how survey data were aggregated to new units of analysis - aridity zones - within which the distribution of anthropometric indicators was analysed. Macro International, the firm that carries out DHS, is starting to map a wealth index using data from the survey in Egypt (L. Montana, personal communication, 2001).
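An asset index of the kind attributed to Filmer and Pritchett (1998) is commonly built from the first principal component of household asset indicators. The sketch below illustrates that idea under stated assumptions: the three assets and five households are hypothetical, every asset is assumed to vary across households, and a plain power iteration stands in for a library principal-components routine.

```python
import math


def standardize(columns):
    """Scale each asset column to mean 0 and standard deviation 1.

    Assumes every column varies across households (non-zero deviation).
    """
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        out.append([(v - mean) / sd for v in col])
    return out


def first_component(z_columns, iterations=200):
    """Leading eigenvector of the correlation matrix, by power iteration."""
    k, n = len(z_columns), len(z_columns[0])
    corr = [[sum(a * b for a, b in zip(z_columns[i], z_columns[j])) / n
             for j in range(k)] for i in range(k)]
    vec = [1.0] * k
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return vec


def asset_index(columns):
    """Score each household on the first principal component of its assets."""
    z = standardize(columns)
    weights = first_component(z)
    n = len(z[0])
    return [sum(w * col[i] for w, col in zip(weights, z)) for i in range(n)]


# Hypothetical ownership indicators (1 = owns) for five households.
radio = [0, 1, 1, 1, 1]
bicycle = [0, 0, 1, 1, 1]
television = [0, 0, 0, 1, 1]
scores = asset_index([radio, bicycle, television])
```

Households can then be ranked or grouped into quantiles by `scores`. The index is relative: it orders households within the sample and carries no monetary interpretation.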

UNEP (1997) studied the relationship of rural poverty and land-use potential using DHS data from West Africa. DHS variables such as literacy, child mortality and school enrolment were used to build a human-development index (HDI) to serve as a proxy for poverty, which was then crossed with the aridity zones described above and land degradation. This information was then included in poverty maps for the West African region.

McGuire (2000), focusing on food security vulnerability instead of poverty, uses a composite HDI as well as principal components to analyse the relationship between food security and biophysical parameters. Spatial filtering techniques are then used to extend the correspondence of the DHS data from the first to the second administrative level. The HDI, an arbitrarily weighted index composed of biophysical, educational, demographic, nutritional, health-access and income proxies, is then mapped at the second administrative level.

Rogers (2000) uses DHS data as an input in an impact evaluation of USAID programmes in Africa based on GIS techniques. The premise is that although the welfare estimates of these types of surveys are not representative at the cluster level, covariance analysis across clusters can be conducted provided that the estimates are unbiased. These cluster-welfare indicators serve as dependent variables and are linked through GIS with a series of explanatory variables.

Maps 2A-D
Honduras, participatory approach

Source: Leclerc, 2002.

## Direct measurement of census data

### Income data

Many countries collect information on income in population censuses. This information, typically based on only one or a few questions about cash income, has been used to create disaggregated poverty maps, without data from other sources. The Brazilian hunger map, for example, is based on direct measurement of household income reported in the 1991 population census. Household income was compared to a food-based extreme poverty line and a non-food moderate poverty line. The headcount index was then calculated for each municipality. Regional and state measures were based on a 1990 household survey (Peliano, 1993). Similar exercises have been conducted in South Africa and elsewhere. Some studies use direct measurement of census-based household income in multivariate analysis. Osgood and Lipper (2001), for example, link subnational proxies for poverty with soil-degradation measures in Ghana.
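The headcount calculation behind a map such as the Brazilian hunger map can be sketched as the share of households in each municipality whose reported income falls below a poverty line. This is a minimal illustration: the income figures, municipality labels and poverty line are hypothetical, and households are counted without population or sampling weights, which is a simplifying assumption.

```python
def headcount_by_area(records, poverty_line):
    """Share of households in each area with income below the poverty line.

    `records` is an iterable of (area_id, household_income) pairs.
    Households are unweighted here, which is a simplifying assumption.
    """
    poor, total = {}, {}
    for area, income in records:
        total[area] = total.get(area, 0) + 1
        if income < poverty_line:
            poor[area] = poor.get(area, 0) + 1
    return {area: poor.get(area, 0) / total[area] for area in total}


# Hypothetical census incomes for two municipalities.
sample = [("M1", 50), ("M1", 150), ("M2", 80), ("M2", 90), ("M2", 300)]
rates = headcount_by_area(sample, poverty_line=100)
```

Running the function once per poverty line (for example, an extreme and a moderate line) yields one headcount map per line.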

Recent analysis of South African data (Alderman et al., 2000), however, shows that census income variables, which are necessarily limited, given the extremely large number of observations collected, are systematically biased. Census income data under-represent levels of well-being as compared to expenditure data from a nationwide household survey, thus giving higher rates of poverty - almost 80 percent in this case. Differences were correlated with urban/rural location, suggesting that for rural households with a higher share of non-cash income, well-being is under-represented by census data. Census income data are thus a poor targeting tool in countries with a large share of non-monetarized or informal income.

### Basic-needs index

A number of countries have used household-unit data from a census to create poverty maps based on basic-needs indices. In Honduras, for example, researchers from the Centro Internacional de Agricultura Tropical (CIAT) created a series of basic-needs indices for poverty-mapping purposes (Leclerc, Nelson and Knapp, 2000 and Leclerc, 2002). These were based on access to household-unit data from the 1988 population and housing census and the 1993 agricultural census. In basic-needs approaches, poverty is typically related to deprivation or lack of the goods and services necessary to sustain life. The indices are calculated at household level, then aggregated by geographical or administrative grouping by counting the fraction of the population in a particular basic-needs stratum. The process is the following: for each variable x, a minimum acceptable value x* must be defined, which in this case is the corresponding national average value; cx is an indicator of failure in obtaining x*. For household i, this is calculated as:

(X)

which is normalized by maximum and minimum values over all households in order to allow comparison and aggregation among variables. Thus cxi lies between -1 and 1.
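A deprivation indicator of this kind can be sketched as follows. The functional form used here, (x* - x_i) / (x_max - x_min), is an assumption chosen to be consistent with the description (a shortfall from x*, normalized by the range over all households, bounded by -1 and 1); it is not taken from the source. The sample values are hypothetical.

```python
def deprivation_scores(values, threshold=None):
    """Normalized shortfall of each household from the threshold x*.

    Assumed form: (x* - x_i) / (max(x) - min(x)). Positive scores mean the
    household falls short of x*; negative scores mean it exceeds x*.
    """
    if threshold is None:
        # Default x* to the average over all households, as in the text.
        threshold = sum(values) / len(values)
    spread = max(values) - min(values)
    if spread == 0:
        # No variation across households: nobody is relatively deprived.
        return [0.0 for _ in values]
    return [(threshold - v) / spread for v in values]


# Hypothetical values of one variable (e.g. rooms per household).
rooms = [1, 2, 3, 6]
c_rooms = deprivation_scores(rooms)
```

Because x* lies between the minimum and maximum when it is set to the average, every score under this assumed form falls between -1 and 1.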

Two aggregate indices were constructed for each household. The first included indices on small size and low quality of housing, lack of basic services and energy, lack of non-land assets and lack of education. The second was composed of the same indices, with the exception of education. Variables were weighted equally in almost all the individual indices that made up the aggregate indices, though different weights could be introduced. The two indices were aggregated to three levels: village, municipality and department. An administrative entity was considered poor if the proportion of poor households was greater than 0.4.
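The aggregation and classification steps can be sketched as below. The 0.4 threshold on the proportion of poor households comes from the text; the rule for deciding that an individual household is poor (aggregate index above a cutoff of 0) is a hypothetical placeholder, as are the component values and village labels.

```python
def aggregate_index(components, weights=None):
    """Weighted mean of a household's component indices (equal by default)."""
    if weights is None:
        weights = [1.0] * len(components)
    return sum(w * c for w, c in zip(weights, components)) / sum(weights)


def classify_units(households, cutoff=0.0, share=0.4):
    """Flag each administrative unit as poor or not.

    `households` is an iterable of (unit_id, component_indices) pairs.
    A household counts as poor when its aggregate index exceeds `cutoff`
    (hypothetical rule); a unit is poor when more than `share` of its
    households are poor (threshold from the text).
    """
    poor, total = {}, {}
    for unit, components in households:
        total[unit] = total.get(unit, 0) + 1
        if aggregate_index(components) > cutoff:
            poor[unit] = poor.get(unit, 0) + 1
    return {unit: poor.get(unit, 0) / total[unit] > share for unit in total}


# Hypothetical component indices for households in two villages.
sample = [
    ("V1", [0.5, 0.3]), ("V1", [-0.2, -0.4]), ("V1", [0.1, 0.1]),
    ("V2", [-0.5, -0.1]), ("V2", [-0.3, 0.1]), ("V2", [0.2, 0.0]),
]
village_is_poor = classify_units(sample)
```

Passing municipality or department identifiers instead of village identifiers gives the classification at the other two levels of aggregation.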

The Andean Network of Spatial Data (REDANDA) has brought together statistical agencies and universities in Bolivia, Colombia, Ecuador, Peru and Venezuela. They have created disaggregated municipal-level regional maps of development indicators from population-census data. This network achieved homogenization of standards among the five countries for the 2000 census, which will be jointly analysed in 2002-2003 (REDANDA, 2001).

In Brazil, 38 georeferenced variables including two composite indices from the 1970, 1980 and 1991 population censuses make up the Atlas of Human Development. The two composite indices, following United Nations Development Programme (UNDP) methodology, are the human-development index and the life-conditions index. The Atlas of Human Development has been a tremendous success as the basis for decision-making with regard to public investment and targeting of social programmes worth billions of dollars (Snel and Henninger, 2002).

Similarly, a basic-needs index based on 1993 census data was used by the Peruvian social fund (FONCODES) to distribute over US\$500 million during the 1990s (Snel and Henninger, 2002).

Another project in Honduras developed and mapped a series of disaster-vulnerability indices by municipality, using census and other data sources. The indices had the following dimensions:

1. environmental: flood and landslide risk area;
2. population: total population at risk of flooding and landslide;
3. social: percentage of very poor people at risk;
4. infrastructure: roads and electricity lines at risk.

These indices were weighted and aggregated into an overall vulnerability index that allowed identification of municipalities for priority intervention (Segnestam, Winograd and Farrow, 2000).
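A weighted aggregation of the four dimensions can be sketched as below. The equal weights, the 0-1 scaling of each dimension and the municipality scores are all hypothetical; the source does not report the weights actually used.

```python
# Hypothetical equal weights for the four dimensions listed above.
WEIGHTS = {
    "environmental": 0.25,
    "population": 0.25,
    "social": 0.25,
    "infrastructure": 0.25,
}


def vulnerability(scores, weights=WEIGHTS):
    """Weighted sum of dimension scores, each assumed scaled to 0-1."""
    return sum(weights[dim] * scores[dim] for dim in weights)


def prioritize(municipalities, top_n):
    """Return the `top_n` municipality ids with the highest overall index."""
    ranked = sorted(municipalities,
                    key=lambda m: vulnerability(municipalities[m]),
                    reverse=True)
    return ranked[:top_n]


# Hypothetical dimension scores for three municipalities.
sample = {
    "A": {"environmental": 1.0, "population": 0.5,
          "social": 0.5, "infrastructure": 0.0},
    "B": {"environmental": 0.25, "population": 0.25,
          "social": 0.25, "infrastructure": 0.25},
    "C": {"environmental": 1.0, "population": 1.0,
          "social": 0.5, "infrastructure": 0.5},
}
priority = prioritize(sample, top_n=2)
```

Changing the weights changes the ranking, so any such index is worth checking for sensitivity to the weighting scheme before it is used for targeting.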

### Z scores

The Honduran Programa de Asignación Familiar, Fase 2 (PRAF-II), which began disbursements in 2001, is conceptually similar to PROGRESA and the Nicaraguan RPS. Beneficiaries receive US\$58 per year per child for attendance at school, and another US\$46 per year per family to cover the opportunity cost of complying with health-care attendance requirements. The programme provided funds to schools and health centres to improve the supply of services commensurate with increased demand (UCP-IFPRI, 2000).

Researchers at IFPRI who were assisting PRAF with design and development of the programme utilized yet another instrument for targeting. The second phase of PRAF was limited to the 80 poorest municipalities in the country, provided funds were available. Municipalities were selected on the basis of average height-for-age Z scores, that is, the number of standard deviations by which a child's height departs from the mean. Height-for-age is considered a good indication of chronic malnutrition and thus serves as a proxy for poverty and food insecurity. Data were taken from the 1997 census of first-grade schoolchildren's height; Z scores were standardized by regressing on sex and age. Municipalities were ranked by Z scores and randomly allocated into three treatment and control groups for evaluation purposes.
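Ranking municipalities by average height-for-age Z score can be sketched as below. For simplicity the sketch standardizes each height against a reference mean and standard deviation for the child's sex and age, rather than applying the adjustment described in the text; the reference values, municipality labels and heights are hypothetical and are not taken from any published growth standard.

```python
from statistics import mean

# Hypothetical reference (mean, standard deviation) of height in cm,
# keyed by (sex, age in years). Not a published growth standard.
REFERENCE = {("F", 7): (120.0, 5.0), ("M", 7): (121.0, 5.0)}


def height_for_age_z(height_cm, sex, age):
    """Standard deviations between a child's height and the reference mean."""
    ref_mean, ref_sd = REFERENCE[(sex, age)]
    return (height_cm - ref_mean) / ref_sd


def rank_municipalities(children):
    """Order municipalities from lowest to highest average Z score.

    `children` is an iterable of (municipality, sex, age, height_cm) tuples.
    """
    by_municipality = {}
    for municipality, sex, age, height in children:
        z = height_for_age_z(height, sex, age)
        by_municipality.setdefault(municipality, []).append(z)
    averages = {m: mean(zs) for m, zs in by_municipality.items()}
    return sorted(averages, key=averages.get)


# Hypothetical height records for two municipalities.
sample = [
    ("X", "F", 7, 110.0), ("X", "M", 7, 116.0),
    ("Y", "F", 7, 120.0), ("Y", "M", 7, 126.0),
]
ranking = rank_municipalities(sample)
```

Taking the first N entries of `ranking` gives the N municipalities with the lowest average Z scores, which is the selection rule described above.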

2 Other well-being or food security indicators may also be used.
3 Simply counting households with expected values below the food security line gives biased estimates of poverty rates as a result of inequality in the intra-household distribution.
4 The following discussion is based on Bigman et al. (2000).
5 See PROGRESA, 1998, CONAPO-PROGRESA, 1998, and Skoufias, Davis, and de la Vega, 2001.
6 For the first round of PROGRESA in 1996, the Conteo data was not yet available. All seven variables came from the 1990 census.
7 For a description of this procedure, applied to poverty analysis, see Filmer and Pritchett (1998).
8 For details of this application, see de la Vega (1994).
9 Community-level small-area estimation will be used, however, to check the principal component results.
10 See Ravnborg (1999) for an application in Honduras, Narayan (1997) in Tanzania and Turk (2000) in Viet Nam.