This chapter describes the method used by FAO/SDRN to develop gridded urban and rural population databases for inclusion in the FIVIMS Global GIS Database (FGGD) (Huddleston et al., 2005). The main difference between this method and those reviewed in Chapter 3 is that it allows the production of rural population maps in which pixel values reflect variations not only between subnational units, but also within them. The method is based on detecting and masking out urban areas in the LandScan Global Population Database in order to produce a global rural population grid at the same resolution as LandScan, that is, 30 arc-seconds.
This task has been carried out as part of a larger effort within the context of a Poverty Mapping Project, implemented jointly by FAO, UNEP and CGIAR to promote the use of poverty maps in policy-making and in targeting assistance, particularly in the areas of food security and environmental management (Web site ref. 19).
Poverty mapping, defined as the spatial representation and analysis of indicators of human well-being and poverty, provides a means for integrating biophysical and geophysical information with socio-economic indicators to provide a more systematic and analytic picture of human well-being and equity (Henninger and Snel, 2002).
GIS-based analysis of links between environment and poverty would not be possible without gridded databases and maps showing the spatial distribution of the world's rural and urban populations at a very high resolution. The gridded rural population database developed by FAO/SDRN is particularly useful for comparing the distribution of rural populations with available natural resources and other environmental and geophysical indicators of the degree of vulnerability of rural livelihood systems in developing countries. In this context, the aim is to identify the spatial distribution of rural population globally, so that reasonable estimates of the numbers of people living in different rural environments around the world and within regions and countries can be generated, such as in different agro-ecological zones, farming systems or crop zones.
Besides describing the method developed to detect the urban population grid cells in the LandScan global population database and create the urban area mask, this chapter also presents results in map and table formats and compares them with other similar databases.
The task of detecting the urban areas was not straightforward because, as discussed in section 2.1, there is no commonly accepted definition of what constitutes an urban area. Indeed, since most humans tend to congregate in settlements, by some definitions almost all people could be said to live in urban areas. But generally, human settlements occurring in areas that are largely agricultural are considered rural, even though the size of their population may sometimes be quite large. In this report these are referred to as ‘rural settlements’ and the population living in these settlements is excluded from analyses reporting rural population. On the other hand, in some countries, particularly those where total population density is not very large, even some small settlements are considered urban.
To create the gridded urban, rural and rural settlements population databases, four primary sources were used. LandScan 2002 was used as the reference database for population distribution. Nighttime Lights of the World 2000 was used to identify the extent of urban areas. UN population data for each country for the year 2000 were taken as the reference point for urban/rural population and for overall totals. Detailed information about these three sources was given in previous chapters. In addition, the UN (DPKO/UNCS) International Boundaries/Coastlines map for 2004 was used to delineate the country boundaries and coastlines (Web site ref. 20).
The reasons behind the choice of LandScan as the reference database for global population distribution, and the technical steps for generating the urban mask, are given in sections 4.2.1 and 4.2.2 respectively.
Three global population datasets - GPW, GRUMP and LandScan - were evaluated in order to choose the most suitable one for this study. GPW, as described in section 3.1, has a fairly coarse resolution and the population is uniformly distributed within any given administrative unit. The GRUMP database has better spatial resolution (30 arc-seconds) and superior differentiation of urban and rural populations, but both are still uniformly distributed within any given administrative unit, and in any case the database was not available at the time of this study. Therefore, LandScan was chosen as the source database for global population distribution, because of its high resolution and its depiction of variation in population counts also within each administrative unit, rather than showing only their averages. An additional advantage of the LandScan database is that, although it does not provide direct information about urban and rural areas, its population model distinguishes urban and rural populations and their distribution.
The ORNL has released five versions of LandScan (see section 3.2). Each version has included new refinements, reflecting improvements in the quality of the data sources as well as improved data manipulation.
The 2002 version of LandScan has been used as the source for the spatial distribution of the world's population because the most recent version, for 2003, was not released until near the end of this study. Since the LandScan database does not contain administrative boundaries, it was overlaid with the standard UN International Boundaries map in order to delineate national boundaries and populations.
A comparison of UN population database figures for 2002 with LandScan 2002 showed that, at the global level and for most of the countries, the differences were insignificant. As the year 2000 was selected as the reference year for other time-sensitive variables analysed in this report series, the LandScan 2002 had to be adjusted to year 2000 population estimates. This was deemed to be a more accurate representation than using the LandScan 2000 database itself, which used a less refined population distribution model. To reconcile the two sets of data, FAO used the UN 2000 population data for the country totals and LandScan 2002 for the distribution of the population within each country. In other words, for every country included in the database, the total population numbers derived from LandScan 2002 were adjusted to the UN figure for 2000. The new totals were then distributed across the pixels in the same proportion as in the original LandScan 2002 database. The adjustment coefficient is calculated for each country and is the ratio between the total population from the UN database and the total population from the LandScan database using the UN International Boundaries. The result is a 30 arc-second grid of population distribution that is matched to the UN figures in terms of total population for each country. From here on, we will refer to this modified LandScan global population database for year 2000 as LandScan-a.
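The adjustment described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure actually implemented by FAO: the function name `rescale_to_un_totals`, the use of nested lists for the grids and the country-code raster are all hypothetical, and pixels outside any listed country are simply left unchanged.

```python
def rescale_to_un_totals(pop, country, un_totals):
    """Scale pixel counts so that each country's sum matches its UN total.

    pop       -- 2-D list of population counts per pixel (e.g. LandScan 2002)
    country   -- 2-D list of country codes per pixel (None = no country)
    un_totals -- dict mapping country code to the UN total population
    """
    # Sum the gridded population inside each country's boundary.
    grid_totals = {}
    for pop_row, code_row in zip(pop, country):
        for value, code in zip(pop_row, code_row):
            if code is not None:
                grid_totals[code] = grid_totals.get(code, 0.0) + value

    # Adjustment coefficient per country: UN total / gridded total.
    coeff = {
        code: un_totals[code] / total
        for code, total in grid_totals.items()
        if total > 0 and code in un_totals
    }

    # Apply the coefficient pixel by pixel; the within-country proportions
    # are preserved. Pixels without a coefficient are left unchanged
    # (an assumption of this sketch).
    return [
        [value * coeff.get(code, 1.0) for value, code in zip(pop_row, code_row)]
        for pop_row, code_row in zip(pop, country)
    ]
```

Because every pixel in a country is multiplied by the same coefficient, the relative distribution within the country is identical to that of the source grid, while the country sum equals the UN figure.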
The FAO/SDRN rural and urban population distribution grids have been generated at 30 arc-seconds on LandScan-a. Because these databases were developed to analyse the distribution of rural population in relation to environmental and geophysical factors which were available only at 5 arc-minute resolution, it was necessary to convert the rural population grid from 30 arc-seconds to 5 arc-minutes. However, an analysis of the country area calculations at 5 arc-minute resolution indicated that at that resolution, GIS analysis in countries with areas less than 3 000 square kilometres would not be sufficiently accurate. Therefore such countries were not included in the analysis, nor were the countries with a UN total population figure less than 500 000. Table 4.1 lists the 154 countries included in the analysis.
Several methods were explored for determining urban area boundaries and extents that return urban population counts consistent with UN population data for each country. The simplest method is to classify all the pixels in LandScan-a with population density above a certain threshold as urban, for instance all pixels with greater than 1 000 persons per square kilometre. A variant of this method is to establish a unique threshold for each region or country. However, this concept was found to be too simplistic for discriminating urban and rural populations as it produces a very fragmented urban mask.
Another method considered was to apply a threshold to the gradient of the population density, rather than to the population density itself. This method seemed very promising, as differences in population density between many urban and rural areas were quite easily detected. However, this method, even in combination with the population density threshold method described above, was not sufficiently accurate in some countries and was not pursued.
List of the 154 countries included in the urban and rural databases
|Albania||Côte d'Ivoire||Iran, Islamic Rep of||Mozambique||South Africa|
|Azerbaijan, Republic of||Dominican Republic||Jordan||Niger||Syrian Arab Republic|
|Belarus||Egypt||Kenya||Norway||Tanzania, United Rep of|
|Belgium||El Salvador||Korea, Dem People's Rep||Oman||Thailand|
|Benin||Eritrea||Korea, Republic of||Pakistan||Timor-Leste|
|Bolivia||Ethiopia||Kyrgyzstan||Papua New Guinea||Trinidad and Tobago|
|Bosnia and Herzegovina||Finland||Laos||Paraguay||Tunisia|
|Burundi||Germany||Libyan Arab Jamahiriya||Puerto Rico||United Arab Emirates|
|Cameroon||Greece||Macedonia, The Fmr Yug Rp||Romania||United States of America|
|Central African Republic||Guinea||Malawi||Rwanda||Uzbekistan|
|Chad||Guinea-Bissau||Malaysia||Saudi Arabia||Venezuela, Bolivar Rep of|
|China||Haiti||Mauritania||Serbia and Montenegro||Yemen|
|Congo, Dem Republic of||Hungary||Moldova, Republic of||Slovakia||Zimbabwe|
|Congo, Republic of||India||Mongolia||Slovenia|
The third method, and the one selected for producing the urban mask, delineates urban boundaries according to the intensity of the lights from populated areas. In satellite images of the globe taken at night, urban areas appear brightly lit. The correlation between these lit zones and urban areas had already been explored by other researchers (Imhoff et al., 1997; Sutton, 1997; Elvidge et al., 1997). More recently, research has been carried out on metrics for quantifying relationships within geospatial datasets. A spatial cross-correlation between population counts in the LandScan database and the Nighttime Lights was computed, and the two were found to be highly correlated (Ganguly A., personal communication).
The main idea is to determine a light intensity threshold (LT) for delineating urban areas using the human settlements dataset of the Nighttime Lights (NTL) of the World for the year 2000. In this dataset the values indicate the Digital Number (DN) generated by the Operational Linescan System (OLS) satellite monitoring, where the value of DN correlates with the light intensity on the ground (see section 2.3.2). DN values range from 0 to 63 and are annual averages. The minimum value indicates no lights; the maximum indicates saturated lights.
Initially, it was found that the images generated from the NTL database did not have sufficiently high positional accuracy for a global analysis. There were considerable non-systematic positional shifts, sometimes as large as 15 km, when compared to accurate reference maps.
Once the NTL images were geometrically corrected, there was sufficiently good registration between the NTL images and the coastlines map used. Figure 4.1 depicts the uncorrected and the corrected images of a very highly populated metropolitan area - the city of Istanbul, Turkey. The waterway (Bosphorus) in the centre of the image has a width of approximately 1 000 metres at the narrowest point and was covered with light in the uncorrected image but not in the corrected one. This is indicative of the positional accuracy achieved by the geometric correction applied.
It should be noted that in order to use the NTL images for the delineation of the urban mask two problems needed to be resolved. First, since the lights are more linearly correlated with GDP and electrification than with population density (Doll et al., 2000), a given urban population density will produce lower light intensity where GDP and electrification are low than where they are high. Second, as mentioned in section 2.3.2, the lights tend to overestimate the actual extents of the urban areas because of the blooming effect (Elvidge et al., 2004).
The solution to the first problem required the determination of a specific LT value for each country. To determine the LT value for a given country, first a histogram and then a cumulative distribution of the population in that country by DN value were generated. The UN rural population figure was then located on the y-axis of the cumulative distribution (see Figure 4.2), and the DN value on the x-axis corresponding most closely to it was identified. This DN value is taken as the LT for the country, since it accounts for the urban population given by the UN figures.
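The threshold search can be illustrated with a short sketch. It is not the code used in the study. The function names are hypothetical, and one detail is assumed because the text does not state it: the cumulative value at a given DN is taken to include the pixels with exactly that DN, so that pixels with DN less than or equal to LT count as rural and pixels with DN greater than LT count as urban.

```python
def population_by_dn(pop_pixels, dn_pixels, levels=64):
    """Histogram: total population on pixels of each DN value (0..levels-1)."""
    hist = [0.0] * levels
    for pop, dn in zip(pop_pixels, dn_pixels):
        hist[dn] += pop
    return hist


def light_threshold(pop_by_dn, un_rural):
    """Return (LT, rural population implied by that LT).

    pop_by_dn -- population per DN value, as returned by population_by_dn
    un_rural  -- UN rural population figure for the country
    """
    best_dn, best_rural = None, None
    cumulative = 0.0
    for dn, pop in enumerate(pop_by_dn):
        cumulative += pop  # population on pixels with DN <= dn (assumed rural)
        if best_rural is None or abs(cumulative - un_rural) < abs(best_rural - un_rural):
            best_dn, best_rural = dn, cumulative
    return best_dn, best_rural
```

Because DN takes only integer values, the cumulative curve moves in steps, so the implied rural population will rarely match the UN figure exactly.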
Geometric correction of Nighttime Lights of the World 2000 to UN International Coastline Map: Istanbul area
|Nighttime Lights 2000 (NOAA) before the geometric correction to UN International Coastline Map|
|Nighttime Lights 2000 (NOAA) after the geometric correction to UN International Coastline Map|
Figure 4.2 depicts the procedure above using the UN population figures for Italy as an example. In the UN figures the total population for Italy is 57 536 000 and the total rural and urban populations are 19 020 000 and 38 516 000 respectively.
The DN value of 44 is the value on the x-axis that comes closest to the UN figure for rural population on the y-axis. It identifies 18 772 729 people as rural, a difference of 1.3 percent from the UN figure. Equivalently, the urban population is estimated at 38 763 271, a difference of less than one percent.
These differences are due to the rather coarse representation of the DN values, which take only 64 integer values. In most countries the difference was less than ten percent. As explained in section 2.1, the definition of urban and rural areas is controversial and therefore the UN urban and rural population figures could also be considered controversial. Nevertheless, it was considered essential to use some urban and rural population figures as a benchmark and the UN figures were chosen because of their international acceptance.
LT values vary widely between countries. Figure 4.3 depicts the average LT for all the UN regions. All of Africa except the North required the lowest DN values to detect the urban population. Western Asia, Northern America and Japan required higher DN values on average.
Cumulative distribution of population versus DN value in Italy
Average light threshold (LT) value by UN region
* See section 4.3 for the description of the UN regions.
During the analysis of the NTL images, it was noted that in 50 countries there were not sufficient lights to account for UN urban population figures.
Even taking the lowest DN value as the LT, in these 50 countries the estimates of urban population were far lower than the UN figures and had an error greater than ten percent. Not surprisingly 49 of them are developing countries and only one is developed, Australia (-11 percent). Of the 49 countries, 69 percent are in Africa, 18 percent in Latin America, ten percent in Asia and two percent in Oceania.
Also, in these countries, there appear to be a number of high-population pixels in LandScan-a that could not be detected by the lights. The reason for this could not be explained, as the LandScan population distribution model is not available in the public domain. In most cases they are isolated pixels with high population density outside the lights (see blue boxes in Figure 4.4), but in some other cases (see circles in Figure 4.4) they appear to form agglomerations. In these 50 countries such pixels were classified as urban if their population values were equal to or greater than the urban population density of the country.
Regarding the problem of the blooming effect of the lights for the actual extents of the urban areas, even if the urban population of a country were close to the UN figure, there could be some local overestimation of extents. That is, some scarcely populated pixels within or near urban areas can have high nighttime lights. In order to eliminate these pixels, for each country where there was not underestimation, the urban population density was calculated and all the pixels in which the number of people was less than ten percent of this value were reclassified as rural. Finally, a 3×3 majority filter was applied to the urban mask to reduce the fragmentation. The procedure for generating the urban mask is illustrated in Figure 4.5 for the urban agglomeration of Johannesburg. The red pixels (a) indicate the areas detected by LT for South Africa. Pink areas (b) show the agglomeration after removing the pixels with population less than ten percent of the urban density of the country. The blooming effect is reduced considerably, but some areas were too fragmented. The blue pixels (c) indicate the final result after the application of the majority filter.
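The two clean-up steps described above can be sketched as follows. This is a simplified illustration, not the GIS procedure used in the study, and it rests on several assumptions: the function names are hypothetical, the ten percent test is applied to each pixel's density (population divided by pixel area) against the mean urban density of the country, and the majority filter marks a pixel as urban when more than half of the cells in its 3×3 window, including the pixel itself and only cells inside the grid, are urban. GIS packages implement majority filters with differing neighbourhood and tie rules.

```python
def remove_blooming(urban, pop, area_km2, fraction=0.10):
    """Reclassify sparsely populated urban pixels as rural.

    urban    -- 2-D list of booleans (True = detected as urban by LT)
    pop      -- 2-D list of population counts per pixel
    area_km2 -- 2-D list of pixel areas in square kilometres
    """
    urban_pop = 0.0
    urban_area = 0.0
    for u_row, p_row, a_row in zip(urban, pop, area_km2):
        for u, p, a in zip(u_row, p_row, a_row):
            if u:
                urban_pop += p
                urban_area += a
    if urban_area == 0:
        return [list(row) for row in urban]
    # Cutoff: a fraction of the country's mean urban population density.
    cutoff = fraction * urban_pop / urban_area
    return [
        [bool(u) and (p / a) >= cutoff for u, p, a in zip(u_row, p_row, a_row)]
        for u_row, p_row, a_row in zip(urban, pop, area_km2)
    ]


def majority_filter_3x3(mask):
    """Set each pixel to the majority class of its 3x3 neighbourhood."""
    rows, cols = len(mask), len(mask[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Window is clipped at the grid edges.
            window = [
                mask[i][j]
                for i in range(max(0, r - 1), min(rows, r + 2))
                for j in range(max(0, c - 1), min(cols, c + 2))
            ]
            out_row.append(2 * sum(window) > len(window))
        out.append(out_row)
    return out
```

With these rules, an isolated urban pixel surrounded by rural pixels is removed and a single rural hole inside an urban block is filled, which is the kind of defragmentation the filter is meant to achieve.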
Cameroon: isolated pixels and very small agglomerations not detectable by LT
Application of the procedures described above generated an urban mask, called Poverty Mapping Urban extents (PMUe), which was then used as a tool for deriving the rural and urban population distribution grids.
All the pixels of LandScan-a corresponding to the urban extents grid generated the Poverty Mapping Urban population (PMUp) distribution grid at 30 arc-seconds. The Poverty Mapping Rural population (PMRp) distribution grid at 30 arc-seconds was defined by masking out the detected urban extents from LandScan-a. These are population distribution grids, where the value for each pixel represents the number of persons found on that pixel. It should be noted that this number does not represent persons per square kilometre, as pixel areas vary by latitude. Urban and rural population density grids, called PMUd and PMRd, were also computed by dividing each pixel value by the pixel area.
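A minimal sketch of the masking and density steps follows. It is illustrative only: the function names are hypothetical, the Earth is assumed to be a sphere of radius 6 371 km, and each grid row is assumed to span exactly 30 arc-seconds of latitude. Under those assumptions the area of a cell depends only on its latitude.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed spherical Earth


def pixel_area_km2(north_edge_lat_deg, cell_deg=1 / 120):
    """Area of a cell_deg x cell_deg cell whose northern edge is at the given latitude."""
    lat_n = math.radians(north_edge_lat_deg)
    lat_s = math.radians(north_edge_lat_deg - cell_deg)
    return EARTH_RADIUS_KM ** 2 * math.radians(cell_deg) * (math.sin(lat_n) - math.sin(lat_s))


def split_urban_rural(pop, urban_mask):
    """Return (urban population grid, rural population grid)."""
    urban = [[p if u else 0 for p, u in zip(pr, ur)] for pr, ur in zip(pop, urban_mask)]
    rural = [[0 if u else p for p, u in zip(pr, ur)] for pr, ur in zip(pop, urban_mask)]
    return urban, rural


def density_grid(pop, north_edge_lat_deg, cell_deg=1 / 120):
    """Persons per square kilometre; the first row starts at north_edge_lat_deg."""
    out = []
    for r, row in enumerate(pop):
        area = pixel_area_km2(north_edge_lat_deg - r * cell_deg, cell_deg)
        out.append([p / area for p in row])
    return out
```

Under these assumptions a 30 arc-second cell covers roughly 0.86 square kilometres at the equator and about half that at 60 degrees latitude, which is why pixel counts cannot be read directly as densities.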
Different stages in computing the urban mask for Johannesburg and vicinity
Since almost all the maps of the Poverty Mapping project are at 5 arc-minute resolution, PMRe and PMRp were also converted to this lower resolution. These grids are denoted with the acronyms PMRe5 and PMRp5. PMRe5 values indicate the percent area occupied in each 5 arc-minute pixel by the rural pixels from the 30 arc-second grid. PMRp5 values are the sum of the rural population numbers on 30 arc-seconds pixels in each 5 arc-minute pixel.
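The conversion to 5 arc-minutes amounts to aggregating blocks of 10×10 pixels. The sketch below illustrates this; the function names are hypothetical, the grid dimensions are assumed to be exact multiples of the block size, and the percent-area value treats all 30 arc-second pixels within a block as having equal area, which is an approximation.

```python
def block_sum(grid, factor=10):
    """Sum the values in each factor x factor block of the grid."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    return [
        [
            sum(grid[r + i][c + j] for i in range(factor) for j in range(factor))
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


def rural_percent(rural_mask, factor=10):
    """Percent of each block occupied by rural pixels (equal-area assumption)."""
    counts = block_sum([[1 if v else 0 for v in row] for row in rural_mask], factor)
    cells = factor * factor
    return [[100.0 * n / cells for n in row] for row in counts]
```

Here `block_sum` applied to the rural population grid would give values analogous to PMRp5, and `rural_percent` applied to the rural mask would give values analogous to PMRe5.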
In some countries, rural pixels exhibit very high density. Such pixels were classified differently based on a study carried out by the International Institute for Applied Systems Analysis (IIASA) for the identification of cultivated areas. Based on data from China and Bangladesh, two countries with very high population density in certain areas, the relationship between population density and the land area required for buildings and infrastructure was estimated. It was determined that almost negligible land area would be left for agriculture in areas with population density greater than 2 000 persons per square kilometre. Therefore, besides the rural and urban classes, a third class called ‘rural settlement’ was created, and rural pixels with population density values greater than 2 000 persons per square kilometre were assigned to it. A new mask called Poverty Mapping Rural Settlements extents (PMRSe) was generated for this class. The acronyms PMRSp and PMRSd denote the population distribution and population density grids, respectively, corresponding to PMRSe. Subsequently a new grid, PMURRS, was generated in which each pixel belongs to one of three classes: Urban, Rural or Rural Settlement.
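The three-way classification can be expressed compactly. The sketch below is illustrative; the numeric class codes and the function name are hypothetical, and the density grid is assumed to be in persons per square kilometre.

```python
URBAN, RURAL, RURAL_SETTLEMENT = 1, 2, 3  # hypothetical class codes


def classify_pixels(urban_mask, density, threshold=2000.0):
    """Assign each pixel to Urban, Rural or Rural Settlement.

    urban_mask -- 2-D list of booleans (True = inside the urban extents)
    density    -- 2-D list of population density, persons per km2
    threshold  -- density above which a non-urban pixel is a rural settlement
    """
    return [
        [
            URBAN if u else (RURAL_SETTLEMENT if d > threshold else RURAL)
            for u, d in zip(u_row, d_row)
        ]
        for u_row, d_row in zip(urban_mask, density)
    ]
```

The urban mask takes precedence, so the density threshold is applied only to pixels that were not already classified as urban.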
All the grids described in this section are part of the Poverty Mapping Urban Rural (PMUR) database.
As noted above, the UN urban population figures (UNup) were used as a benchmark for computing the PMUp grid. A comparison shows the PMUp results to be within plus or minus ten percent of the UN urban population figures in 125 countries, about 81 percent of the countries included in the analysis. Of these, 32 percent are developed countries, and their differences are within plus or minus five percent, with the exception of Australia. The remaining 68 percent are developing countries; almost half of these are in Asia, and the rest are distributed equally between Africa and Central/South America.
The 29 countries that are not within plus or minus ten percent are all developing countries. In 25 of them, PMUp underestimates UNup. Most of these are in Africa (76 percent), with a further 12 percent in Latin America; the remainder are Mongolia and Timor-Leste in Asia and Papua New Guinea in Oceania (Table 4.2).
List of the countries with underestimated urban populations
|Country||UNup (thousands)||PMUp (thousands)||Percent difference, PMUp - UNup (%)|
|Papua New Guinea||928||758||-18.3|
|Tanzania, United Rep of||11,236||9,019||-19.7|
|Congo, Republic of||2,254||1,648||-26.9|
|Central African Republic||1,530||876||-42.8|
PMUp overestimates UNup in only four countries: Niger, Haiti, Kenya and Afghanistan (Table 4.3).
List of the countries with overestimated urban populations
|Country||UNup (thousands)||PMUp (thousands)||Percent difference, PMUp - UNup (%)|
Map 4.1 shows the geographic distribution of the countries for which the PMUp - UNup difference is more than ten percent.
In general, the countries where there is under- or overestimation are more rural. In the countries where the PMUp estimates are within ten percent of UNup, the average urban population share is about 75 percent for developed countries and about 40 percent for developing ones. In the countries where the differences are greater than ten percent, the average urban population share is just under 40 percent for the underestimated ones and 33 percent for the overestimated ones.
Spatial distribution of the difference in urban population figure by country
In this section, the PMUe results are compared with similar data from other sources for the land area of urban extents in square kilometres and for the geographic coordinates of urban centres. This cannot be considered a measure of the accuracy of the PMUe results because of the conceptual problems in defining urban areas noted earlier. First, there is no clear and unique definition of an urban area that is applicable to all countries (see section 2.1). Furthermore, there is a lack of non-controversial global statistical data aggregated at sufficiently high spatial resolution with accurate geographic coordinates and population figures (see section 2.2). However, the comparisons do generally confirm the validity of the PMUe.
The remainder of this section describes the results of the comparisons, aggregated by UN region and/or continent.
UN classification of regions (UN, 2002) is depicted in Map 4.2.
UN classification of the world in regions
Table 4.4 compares the urban land area results of the PMUe, GRUMP and Boston University Urban Area (BUUA) databases by UN region. Generally the GRUMP data produced by CIESIN yielded the largest extents. However, in 49 of the countries analysed the PMUe results were greater than the GRUMP urban extents, and in seven of twenty regions (three in Africa, one in America, one in Europe and two in Oceania) the PMUe identified a larger urban area than GRUMP. The land cover class called ‘built-up area’ detected by BUUA was the smallest of the three databases, except in Japan.
Comparison of urban area by UN regions (km2)
|Continent||Region||PMUe||GRUMP extents||BUUA built-up area||Database with greater urban area|
|Oceania||Australia and New Zealand||42,123||44,601||8,990||GRUMP|
Histogram of the share of urban area in total area, by UN region
Comparison of the share of urban area in total area, by continent
Comparison of the share of urban area in total area by developed/developing country
Figures 4.6, 4.7 and 4.8 compare the share of urban area in total area by UN region, by continent and by developed/developing country. As expected, developed countries are more urbanized than developing countries, with the largest difference in GRUMP. Overall, PMUe and GRUMP estimate the urban share of the world's land area at 1.7 and 2.7 percent respectively. The BUUA figure is much smaller, only 0.5 percent.
Figure 4.9 compares boundaries of four urban extents defined by PMUe, GRUMP and BUUA. Three of these examples show the generally larger extents generated by GRUMP compared to PMUe and BUUA for most urban areas, both the agglomerations and the smaller settlements. One example from South America shows a larger extent estimated by PMUe, as was typical for that region.
Visual comparison of urban extents
|(a) Atlanta, Georgia, USA||(b) Mexico City, Mexico|
|(c) Santiago, Chile||(d) New Delhi, India|
|Global Rural Urban Mapping Project (GRUMP)|
|Poverty Mapping Urban extents (PMUe)|
|Boston University Urban Area (BUUA)|
As noted in section 3.3, the GRUMP human settlements database contains the geographic coordinates and the estimated population for the years 1990, 1995 and 2000 for each human settlement. Therefore it could be used for comparing the geographic coordinates of the human settlements detected in PMUe.
Using the original version of the PMUe (i.e. without a buffer around the human settlements), it was possible to detect about 72 percent of the GRUMP settlements globally; these held 88 percent of the population living in human settlements in 2000.
The lowest detection percentage was again in Africa, where about 60 percent of the settlements were detected, containing 86.5 percent of the population.
In order to ascertain whether this relatively poor detection performance was merely the result of a slight difference in the positioning of the urban area and the points, or was caused by a more serious problem, a buffer of one kilometre around the PMUe human settlements was generated. With this buffer the detection results improved considerably. Globally, 92 percent of the GRUMP human settlements were captured, corresponding to 97 percent of the population (Table 4.5).
With the buffer, the improvement in terms of the detected points was almost 30 percent, but in terms of population, the increment was only nine percent, as the buffer pixels were not highly populated. However the size of the global urban population not captured was very small - around three percent.
The human settlements in GRUMP database detected by PMUe with one kilometre buffer
|Continent||Region||Number of human settlements in GRUMP||Estimated population of the human settlements in 2000||Percentage of human settlements detected with PMUe||Percentage of population detected with PMUe|
|Oceania||Australia and New Zealand||324||21,387,165||97.5||99.7|