

PART TWO. INVENTORY OF CGDB-TF CORE DATA LAYERS


CHAPTER 4. BOUNDARIES: COASTAL, ADMINISTRATIVE, AND AREAS OF SPECIAL INTEREST

Under this first topical classification of the inventory, four types of boundaries are presented: Coastal and Oceanic; Political and Administrative; Areas of Dispute or Conflict; and Parks and Protected Areas. Statistical boundary areas, such as agricultural divisions, census enumeration areas, and various aggregations of these units over time, are not covered in this inventory. However, such units may represent an area of future collaboration that FAO-PMPG and UNGIWG may wish to explore with the NMAs, Agricultural Ministries, and Central Statistical Offices of UN Member States to ensure that appropriate comparative baselines and time-series analysis can be facilitated in the future. The Second Administrative Level Boundaries data set (SALB) collaborative effort discussed in Section 4.2.2 could provide a "best-practices" approach to the determination and potential harmonization of such areas. Although also considered boundary data layers, Health Areas and Districts - or the lack thereof - are covered under a separate topical heading of this report.

4.1 COASTLINE AND RELATED OCEANIC DATABASES OR LAYERS

4.1.1 Coastline polygonal data

Based in all likelihood on the same baseline source, i.e. NGA's 1:250 000 World Vector Shoreline (WVS+), four public domain and three commercial products have been identified for inclusion in the inventory. Although samples of commercial data covering the Namibian AOI were provided by ADC and EuropaTech, neither of these samples included specific coastline data layers for review other than those in conjunction with country boundaries. The seven data layers identified for the inventory are summarized in Table 4.1.1 and are further evaluated in the textual discussions immediately following the table.

Table 4.1.1
Vector based coastline data layers

| Data Type/Source | URL | Extent | Scale | Availability [1] | Notes |
|---|---|---|---|---|---|
| PUBLIC DOMAIN DATA SOURCES | | | | | |
| NGA-World Vector Shoreline Plus (WVS+), 3rd Edition 2004 | http://store.usgs.gov/ | Global | 1:250 K - 1:120 M | PD | NGA product containing six scales of coastline data in VPF libraries |
| NOAA-Global Self-consistent, Hierarchical, High-resolution Shoreline Database-GSHHS, 1999 | www.ngdc.noaa.gov/mgg/shorelines/gshhs.html | Global | 1:250 K - 1:120 M | PD | Enhanced derivative of original NGA-WVS+; inconsistencies in source data harmonized for a polygonal result; six layers of data |
| NOAA-Online Coastline Extractor, WVS+/GSHHS, 1994 | http://rimmer.ngdc.noaa.gov/mgg/coast/getcoast.html | Global | 1:250 K - 1:5 M | PD | Earlier version of GSHHS and other shoreline sources such as water bodies from WDBII; 2 layers |
| British Oceanographic Data Centre, General Bathymetric Chart of Oceans (BODC-GEBCO) | www.bodc.ac.uk/projects/gebco/gebco_content.html | Global | 1:250 K - 1:43 M | Likely PD | A version of perhaps the original WVS+. Lineage of source data could not be determined from on-line source, nor number of layers |
| COMMERCIAL DATA SOURCES | | | | | |
| American Digital Cartography - WorldMap-Continental 1:250 000 Shoreline data layer | www.adci.com | Global - Continental | 1:250 K | C, CR, LF, RD | Only coastline layer found was based on the AD1 country data; 1:250 000 data layer not provided |
| East View Cartographic - WVS+ Database | www.cartographic.com | Global | 1:250 K - 1:5 M | C, CR, LF, RD | Updated to latest edition of NGA-WVS+, possibly some VMap1 coastlines; number of layers not listed |
| Veridian - Global Shoreline Database | www.veridian.com/offerings/suboffering.asp?offeringID=528 | Global | 1:250 K - 1:120 M | C, CR, LF, RD | A derivative WVS+ database with all six scale layers, and possible updates for smaller reefs & shoals |

[1] C=Commercial; CR=Copyright; LF=License/Fee; PD=Public Domain; NC=Non-Commercial; FQ=Fair Quotation; RA=Restricted Access; RD=Registered Distribution

The most recent revision of public domain polygonal coastline data identified in Table 4.1.1 is the Third (3rd) Edition of the NGA's WVS+ in VPF format. This edition was released in the first quarter of 2004. In addition to the global 1:250 000 baseline, the WVS+ library also includes five generalized coastline data layers at the scales of 1:1 000 000, 1:3 000 000, 1:12 000 000, 1:40 000 000, and 1:120 000 000. The NGA distributes the WVS+ at a cost of US$270 via the USGS Store. In conjunction with the preparation of this report, the 1:250 000 data layers of the WVS+ Second (2nd) Edition were processed for Africa. The African sample dataset is derived from the base polygonal 1:250 000 coastline layer of the WVS+ library. This WVS+ data layer also contains first order country boundaries and includes some disputed area boundaries globally.

In direct response to queries sent out with regard to this inventory on behalf of FAO and UNGIWG, scientists at NOAA undertook the processing of the 1999 version of the 1:250 000 GSHHS data layer listed in Table 4.1.1 from their binary format into the ESRI Shapefile format. This dataset is now available for download on-line via NOAA's GSHHS Web site listed in Table 4.1.1.

According to Internet based references, the GSHHS data have undergone extensive modification from the original NGA-WVS+ 1992 baseline, including the better integration of political boundaries and the addition of major inland water bodies. The 1994 version of these data available from NOAA's on-line coastline extractor should be considered superseded by the 1999 GSHHS data processed into ESRI Shapefile format. It is likely that the GEBCO database listed in Table 4.1.1 is based either on the 1994 GSHHS data or the original 1992 WVS+ data. A copy of the GEBCO database was not available for comparison during the preparation of this report.

A direct comparison of the WVS+ derivative data available from NOAA with the source data from the NGA's 2nd Edition WVS+ is shown in Figure 4.1.1 below.

Figure 4.1.1
Comparison of derivative GSHHS/WVS+ databases

Figure 4.1.1 depicts a comparison of coastlines based on: the VMap0; the VMap1; the "original" 1994 WVS+/GSHHS as available from NOAA's coastline extractor; the 1999 version of the GSHHS translated into ESRI's Shapefile format by NOAA; and lastly, the WVS+ 2nd Edition processed for Africa.

The area shown on this graphic covers Walvis Bay, which lies at the extreme south-west of the Namibian AOI. As can be seen, at least for this area, the WVS+ data compare favourably with the cartographically based VMap1 coastline and - given the scale differential - the 1:1 million scale VMap0 coastline based on political boundaries. The nominal 90 m stepping of the original DLMB raster source for the WVS+ is apparent in the 1:15 000 scale view shown at the right in Figure 4.1.1, as are differences between the two versions of the GSHHS database and NGA's "source" database.

The differences between the coastline-extractor sample of WVS+ data and the GSHHS Shapefile version provided by NOAA cannot currently be explained. However, as the coastline-extractor sample was imported locally using double precision tolerances, the differences may be due to single precision processing used in producing the Shapefile version at NOAA. This could not be confirmed, however, and reference to the NGA 2nd Edition WVS+ could not be used to determine the relative accuracy of data from either GSHHS source.
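To give a sense of the magnitudes involved, the following minimal Python sketch rounds a coordinate to IEEE 754 single precision and reports the resulting positional error. The sample longitude is a hypothetical value chosen for illustration, not one taken from either dataset, and the metres-per-degree constant is an equatorial approximation. How NOAA actually stored or processed the coordinates is not known.

```python
import struct

# Approximate length of one degree of longitude at the equator (assumption).
METRES_PER_DEGREE = 111_320.0


def to_single_precision(value: float) -> float:
    """Round a double precision float to the nearest single precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def rounding_error_metres(coord_deg: float) -> float:
    """Positional error, in metres at the equator, caused by single precision storage."""
    return abs(coord_deg - to_single_precision(coord_deg)) * METRES_PER_DEGREE


# Hypothetical longitude near Walvis Bay, in decimal degrees (illustrative only).
sample_lon = 14.5052871
print(f"{rounding_error_metres(sample_lon):.4f} m")

# A larger magnitude value, such as occurs in a 0..360 longitude space.
print(f"{rounding_error_metres(sample_lon + 180.0):.4f} m")
```

Single precision carries a 24-bit significand, so the worst-case rounding error under these assumptions is about 0.05 m for coordinates between 8 and 16 degrees and about 0.85 m for coordinates between 128 and 256 degrees.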

In fact, as can also be seen in Figure 4.1.1, the level of generalization exhibited within the WVS+ 2nd Edition is disturbing, as it is also evident in the NGA’s more recent 3rd Edition. Given that there are other potential problems with the Shapefile version of the GSHHS translated by NOAA, i.e. the lack of polygon topology and the use of a totally positive coordinate space for the x-axis, there may be a number of issues which need to be addressed if a GSHHS derivative is selected as a potential CGDB coastal baseline over the WVS+ 3rd Edition. NOAA appears to be willing to cooperate in such an effort. However, given the relatively low LOE estimated for processing 1:250 000 coastlines globally directly from the NGA VPF source data library, the WVS+ 3rd Edition is suggested as the baseline to consider.
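One of the issues noted above, the wholly positive x-axis, can be illustrated with a short sketch. It assumes that the positive coordinate space means longitudes stored in the range 0 to 360 degrees rather than -180 to 180. The function name is hypothetical and the code is not taken from any NOAA or NGA tool.

```python
def wrap_longitude(lon_deg: float) -> float:
    """Map a longitude from the 0..360 convention to the -180..180 convention."""
    return ((lon_deg + 180.0) % 360.0) - 180.0


# Hypothetical vertices (longitude, latitude) in a 0..360 coordinate space.
vertices = [(14.5, -22.95), (194.5, -17.0), (359.0, 5.0)]

# Latitudes are unaffected; only the x-axis values are remapped.
wrapped = [(wrap_longitude(lon), lat) for lon, lat in vertices]
print(wrapped)
```

Remapping vertex by vertex does not rebuild polygon topology, and polygons that straddle the 180 degree meridian after remapping would still need to be split separately.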

Similarly, although no samples of the commercial coastline data offerings were obtained, the incumbent copyright and licensing restrictions of such data would likely outweigh the cost of any LOE associated with processing the data directly from the WVS+, GSHHS, or GEBCO sources.

ADC includes its 1:250 000 coastline data as part of the various 1:1 million scale continental product offerings. However, these data were again not included in the Namibian AOI data sample provided by the company. East View Cartographic's data product is licensed at a base cost of US$165 per licence. The on-line reference material for this product gives a good indication of the size of any database which may result from processing the NGA-WVS+ source database directly, i.e. 250 MB. The base price for Veridian's WVS+ based product is US$250 per single user licence. A "Length of Shoreline" database developed by Veridian was located via UNEP-GRID's on-line metadata and can be distributed without restriction. These data remain unevaluated, however, and likely represent only linear summaries of the WVS+ baseline.

Based on the WVS+ sample derived for Africa, the LOE for processing polygonal coastlines from the 3rd Edition of WVS+ into a seamless global data layer would be seven days. This processing should be conducted in conjunction with the AD1 boundary layer presented later in Section 4.2.1 of this report to support coastal and country masking of the near global 3 arc second (90 m) digital elevation data discussed under Section 8.4.3.
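The correspondence between 3 arc seconds and the nominal 90 m figure can be checked with a short calculation. The sketch below assumes a spherical Earth with the WGS84 equatorial radius; the function name is hypothetical and the result is an approximation only.

```python
import math

# WGS84 equatorial radius in metres, used here with a spherical approximation.
EQUATORIAL_RADIUS_M = 6_378_137.0


def arcsec_to_metres(arcsec: float, latitude_deg: float = 0.0) -> float:
    """East-west ground distance spanned by an angle given in arc seconds.

    The distance shrinks with the cosine of latitude; at the equator the
    same value also approximates the north-south distance.
    """
    metres_per_degree = 2.0 * math.pi * EQUATORIAL_RADIUS_M / 360.0
    return (arcsec / 3600.0) * metres_per_degree * math.cos(math.radians(latitude_deg))


print(f"{arcsec_to_metres(3.0):.1f} m at the equator")
print(f"{arcsec_to_metres(3.0, -23.0):.1f} m east-west at 23 degrees south")
```

At the equator the spacing comes to roughly 93 m, which is the basis of the nominal 90 m figure; the east-west spacing narrows toward the poles.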

4.1.2 Polygonal oceanic water bodies

Starting with the third and fourth editions of the VMap0, a polygonal data layer demarcating the extents of marine water bodies such as oceans and seas based on the 1986 Fourth Edition of the International Hydrographic Organization's (IHO) standard was included. This data layer is fairly basic and only includes a name attribute for the various water bodies. Additionally, the demarcations of the water bodies are fairly generalized and some gaps have been noted between adjacent polygonal boundaries. For this reason the LOE associated with the processing of this data layer would be 1.5 days. The majority of this time would be spent correcting slivers or small gaps between adjacent boundaries. A further 0.5 days would be required to assign International Hydrographic Organization encoding for the Limits of Oceans and Seas to each water body as this encoding is not included in the VMap0 data layer as an attribute.

4.1.3 Point data layer of oceanic islands

Both the DCW and then the VMap0 data libraries included point data layers of major oceanic islands which were too small to include as polygons within the source political boundary layer(s). While the LOE for processing and harmonizing these feature layers from the two libraries is estimated at only 0.5 days, the additional harmonization and correction of these features with point features derived from the WVS databases to produce a more complete but still generalized data layer would require a further one day.

4.1.4 Linear oceanic features

The RWDBII-Sv1.1 data library provides an interesting Law of the Sea linear data layer that contains a variety of feature classes which may be of potential interest. These feature classes include: Generalized continental margins; Sea level lines on land; Generalized 100, 200, 350 nautical-mile limits; Agreed upon territorial sea, continental shelf, and oceanic boundaries; Min-Max Arctic Polar/Pack Ice limits; The maximum Antarctic polar ice limit; Pacific Nuclear Free Zone lines; Joint development, fishing, and protected zone lines; Exclusive economic zones; Fishery conservation zones; and lastly, Geophysical zones. There is no LOE associated with the processing of these linear feature classes. However, as many of these classes contain features which could be better represented as polygons, the estimated LOE for processing these data into polygons would be three days.
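The LOE estimates quoted in Sections 4.1.1 to 4.1.4 can be tallied as follows. This is an illustrative sum only: it assumes the tasks are independent and carried out sequentially, which the report does not state, and the task labels are paraphrased.

```python
# LOE estimates in days, transcribed from Sections 4.1.1 to 4.1.4.
loe_days = {
    "4.1.1 polygonal coastlines (WVS+ 3rd Edition)": 7.0,
    "4.1.2 oceanic water bodies: gap and sliver correction": 1.5,
    "4.1.2 oceanic water bodies: IHO encoding": 0.5,
    "4.1.3 oceanic islands: DCW/VMap0 harmonization": 0.5,
    "4.1.3 oceanic islands: harmonization with WVS points": 1.0,
    "4.1.4 linear oceanic features converted to polygons": 3.0,
}


def total_loe(estimates: dict) -> float:
    """Sum the individual estimates, assuming no overlap between tasks."""
    return sum(estimates.values())


total_days = total_loe(loe_days)
for task, days in loe_days.items():
    print(f"{days:4.1f}  {task}")
print(f"{total_days:4.1f}  total")
```

Under these assumptions the coastal and oceanic layers of Section 4.1 sum to 13.5 days.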

4.2 POLITICAL COUNTRY, ADMINISTRATIVE, DISPUTED AREAS, AND PARK POLYGONAL BOUNDARIES

Under this topical section of the inventory, data layers which demarcate administrative: Level one country, territories, protectorates, and special interest areas; Level 2 boundaries for provinces, states, etc.; and, Level 3 boundaries which may potentially include districts, divisions, and counties, are discussed. For the purposes of this inventory these levels will be designated as the AD1, AD2, and AD3 data layers respectively. Source databases providing coverage below the AD2 level were very limited and no consistent administrative AD3 or AD4 boundary datasets could be identified for the inventory.

It should be noted that in general there exists an extensive overlap between both the sources of data and the levels of administrative coverage provided by the data layers presented. For example, since AD3 or AD2 data layers will by definition include AD1 level boundaries, listings for these data are not duplicated in the country versus administrative subdivisions of the tables presented below. Similarly, despite the fact that the UNCS/FAO 1:1 million database of political and disputed area boundaries is discussed under a separate topical heading, these data likely represent the most up-to-date political boundaries dataset reviewed.

Lastly, the lineage metadata associated with the VMap0 Ed.5 states that a "small number of international boundary updates were made based on digital boundary files supplied from NIMA's Digital International Boundaries database (as of March, 2000)" (NGA-VMap0, 2000). Unfortunately, no Internet based or other citations were available for what might be an important database of AD1 country boundaries. Similarly, the Cartographic Section of the U.S. Department of State, in conjunction with that of another USG agency, is also producing AD1 level databases with updates scheduled for completion in late 2004. However, like the Digital International Boundaries database, access to these databases - even for restricted distribution - may require higher level requests from within the UN.

4.2.1 First Order/AD1 Level country and political boundary data layers

Six databases or library data layers representing country or other international political boundaries were identified for the inventory. The DCW and later editions of the VMap0 provide the baseline source of AD1 political boundaries for all but two of the datasets inventoried. Table 4.2.1 presents the data layers identified for this topical subsection of the inventory.

Table 4.2.1
First Order/AD1 Level country and political boundary data layers

| Data Type/Source | URL | Extent | Scale | Availability [1] | Notes |
|---|---|---|---|---|---|
| AD1 POLITICAL/COUNTRY BOUNDARIES | | | | | |
| NGA-WVS+, six layers of coastal, international, & selected disputed area boundaries, 2004 | http://store.usgs.gov/ | Global | 1:250 K - 1:120 M | PD | Seamless processing from VPF format required and harmonization with ISO and other non-FIPS codes |
| WHO-EIP AD1 Database, circa 1992 but possibly updated selectively 1998 | www.who.int/vaccines-surveillance/digitalmaplibrary/librarymapintro.htm, [email protected] | Global | 1:1 M | PD, RA, RD | Harmonized vectors from DCW with ISO, WHO, etc. encoding and some updated boundaries |
| FAO-AWRD World political boundaries, circa 1997 some AD2 as delimited in VMap0 | [email protected], pending distribution via CD-ROM and FAO-GeoNetwork | Global | 1:1 M | PD, CR, FQ, NC | Corrected/harmonized coastal & AD1 layers based on VMap0.Ed.4/5, w/FIPS, ISO, etc. encoding |
| UNCS/FAO Level 1 database of countries, territories, and disputed areas, circa 2000/01 | www.ungiwg.org, log-on required | Global | 1:1 M | PD, CR, FQ, NC | Harmonized VMap0 updated with coastlines, selected disputed area & AD1 boundaries, and ISO codes |
| ESRI AD1 International boundaries, latest revision 2002 | ESRI Data & Maps CD ROMs | Global | 1:3 M - 1:10 M | C, RD, FQ | Data distributed commercially or as a loss leader data product to GIS software sales |
| UNCS Quick Impact Database, circa late 1980s-mid 1990s | UNCS data portal not currently available | Global | 1:5 M - 1:10 M | CR, FQ | Based on Russian military maps, data available for evaluation sometime 2004 |

[1] C=Commercial; CR=Copyright; LF=License/Fee; PD=Public Domain; NC=Non-Commercial; FQ=Fair Quotation; RA=Restricted Access; RD=Registered Distribution

WVS+ AD1 1:250 000 international boundaries

Given the vested interest in the VMap0 as the de-facto standard baseline for AD1 boundaries, it is likely that the 1:250 000 WVS+ Coastline/AD1 boundaries data layer can only be used as a reference for confirming the location of island features and possibly for selective updates or additions to political/disputed area boundaries. Other than specifying changes to the former Yugoslavia, the WVS+ metadata does not provide a date or other information concerning the source of the AD1 boundaries in the database. However, as there is a fair correspondence to circa 1997-2000 AD1 boundaries demarcated in the VMap0, taking into account the 1:250 000 versus 1:1 million differences in the base scales, the source for the original vectors appears to be dependent on a separate digitisation process using the ONC Charts as a baseline. The 2004 3rd Edition of these data does include at least some localized updates for post-2000 boundaries and areas of dispute.

The estimated LOE to process these data into a seamless global database is again seven days and includes the coastline processing discussed under Section 4.1.1. The harmonization of the FIPS encoding with ISO and UN codes contained in the other international datasets discussed below, would likely require a further LOE of 1.5 days. Given the vested interest of the UN in the VMap0 as an AD1 baseline, the WVS+ administrative data are not recommended as a potential replacement to the UNCS/FAO effort described below. Rather, these data should again be considered as a baseline for the masking of high resolution digital elevation and satellite imagery.

DCW and VMap0 international boundaries and derivative data layers

The DCW and later editions of the VMap0 provide the baseline source of AD1 political boundaries for all but two of the datasets inventoried in Table 4.2.1. Due to the single precision vector geometry of the DCW, some localized distortion or differences in relation to the 0.000005 "accuracy" of the VMap0 will be evident when similar features are compared between the two baseline sources. Temporal updates to the currency of the political boundaries represented in the VMap0, together with selective corrections made to account for features that were truncated or identified as missing between the DCW and later editions of the VMap0, favour the utilization of the VMap0.Ed5 data as a more consistent baseline source for political AD1 boundaries.

The more recent 2002 revision of ESRI's International boundary data layer listed in Table 4.2.1 was not available for review. However, based on the 1998 edition, the AD1 boundaries of these data are highly generalized in comparison to the VMap0. The source data scale listed in the metadata supplied with these data is 1:3 million and is based on ESRI's ArcWorld data product. However, additional textual descriptions indicate that these data have been further generalized for use at the scale of 1:10 million. The baseline political reference source for ArcWorld and its supplements which have been updated over time was the DCW. ESRI's 1:3 million ArcWorld data product was not evaluated with regard to its suitability as a potential CGDB data source. Although these data are distributed as part of ESRI Data & Maps CD-ROMs, ESRI has a fairly liberal EULA stipulating that the data can be used more broadly or partially integrated into derivative data as long as fair quotation is provided to ESRI. Such fair quotation will include the direct citation of ESRI as a data source on any hardcopy or digital graphics where these data are used in either whole or part.

The WHO-EIP database is likely the oldest data layer listed in Table 4.2.1. This dataset is based primarily on the DCW, but includes data accessed at least in part from ESRI sources. These data are nominally in the public domain, but WHO still maintains some limited restrictions to access and distribution. Two comparatively more up-to-date and robust databases, derived primarily from the VMap0, are the FAO-AWRD and UNCS/FAO political and international boundaries datasets listed in Table 4.2.1. The latter dataset is based on the collaborative work of the UNGIWG International Boundaries Task Group.

The creation of an effective polygonal baseline from the VMap0 requires both the seamless reprocessing of these data to remove tile boundaries and artefacts and the integration of linear coastline features to account for features missing from the polygonal layer. Further, as the primary encoding of the VMap0 political boundaries layer is based on the older FIPS standard, the addition of ISO, UN, and other potential attributes is also required to create the most complete polygonal baseline. The creation of such polygonal datasets is necessary to facilitate the efficient query based retrieval and thematic representation of country units using standardized FIPS, ISO and UN encoding, as well as unique island and country names contained in the source data library.
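The attribute side of this harmonization can be sketched as a simple code crosswalk. The example below is a minimal illustration only: the field names are assumptions rather than actual VMap0 column names, the crosswalk lists just a handful of southern African countries, and it does not represent the procedure used for the FAO-AWRD or UNCS/FAO datasets.

```python
# Partial crosswalk from FIPS 10-4 country codes to ISO 3166-1 alpha-3 codes.
# Only a handful of southern African countries are listed for illustration.
FIPS_TO_ISO3 = {
    "AO": "AGO",  # Angola
    "BC": "BWA",  # Botswana
    "SF": "ZAF",  # South Africa
    "WA": "NAM",  # Namibia
    "ZA": "ZMB",  # Zambia
}


def add_iso3(records, fips_field="fips", iso_field="iso3"):
    """Return copies of the records with an ISO alpha-3 code added.

    Records whose FIPS code is not in the crosswalk get None, so that
    unmatched features can be reviewed by hand rather than silently dropped.
    """
    harmonized = []
    for record in records:
        updated = dict(record)
        updated[iso_field] = FIPS_TO_ISO3.get(record.get(fips_field))
        harmonized.append(updated)
    return harmonized


# Hypothetical attribute rows; field names are assumptions, not VMap0 column names.
rows = [
    {"fips": "WA", "name": "Namibia"},
    {"fips": "ZA", "name": "Zambia"},
    {"fips": "XX", "name": "Unmatched feature"},
]
for row in add_iso3(rows):
    print(row)
```

The Zambia entry shows why a crosswalk is needed: the FIPS code ZA denotes Zambia, whereas the ISO alpha-2 code ZA denotes South Africa.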

The primary difference between the two datasets is that the UNCS/FAO dataset contains international boundaries and coastlines for countries and disputed areas, while the AWRD-VMap0 contains political boundaries including in certain cases subcountry administrative boundaries, disputed areas, and protectorates. Further, although the AWRD-VMap0 provides a robust harmonization of three VMap0.Ed3/4 source layers, other than for seven variations, the derivative remains "true" to VMap0 source(s). Within the UNCS/FAO dataset, any VMap0 subcountry boundaries were aggregated to the national level and a number of additional "corrections" were undertaken based on boundaries from sources other than the VMap0.

The LOE required to produce a direct VMap0 political boundary baseline from the linear coastline and political boundary layers, integrated with a seamless derivative of the polygonal political boundary features, is seven days. This estimate is based on the processing of the FAO-AWRD dataset, and includes the encoding of ISO, UN, FIPS and other attributes. Given the corrections undertaken based on the existing 1997 VMap0.Ed3/4 derivatives, the LOE associated with updating either the AWRD or UNCS/FAO datasets to the year 2000 fifth edition VMap0 - or any subsequent VMap0 edition - is estimated at five days.

Such processing would also need to consider the correction of island features displaced in the VMap0, which have been updated in the UNCS/FAO dataset. The NOAA GSHHS coastline data have already confirmed a number of such corrections, and the WVS+ data should be used as the arbiter of any inconsistencies. In addition to the above tasks, the integration of the extant linear encoding attributes of the VMap0 Ed.5 detailing the accuracy and status of each boundary should be considered as important to the UNCS/FAO data effort. This latter task would not only provide a robust linear feature set for base and other mapping purposes, but also a means to track the source and lineage of features within these data.

UNCS QID 1:5 million - 1:10 million scale international boundary data layers

The UNCS/FAO database is discussed again under the "Disputed Areas" section of this inventory; however, it is possible that this dataset may also provide the 1:1 million scale baseline used for the UNCS Quick Impact Database(s). The specifics of the AD1 boundaries associated with the QID 1:5 and 1:10 million scale data efforts could not be reviewed in relation to this inventory. The QID 1:5 and 1:10 million specifications state that "Boundaries, coastline, countries, subnational administrative units, islands and the ocean will be captured by UN at 1:1 million scale and supplied as reference material to assist in the capture of international boundaries at scales of 5 million and smaller" (UNCS, 2001). It is, however, again uncertain if the UNCS/FAO dataset represents this baseline for the representation of AD1 political boundaries. As discussed previously, it could not be determined for this report whether either or both of the 1:5 million and 1:10 million scale QID international boundaries will be generalized from a composite DCW/VMap0 baseline or based on a direct digitization from the Russian source maps.

4.2.2 Combined international AD1 and subnational AD2 administrative boundary data layers

The lack of a common geographic database demarcating subnational boundaries which can be used throughout the UN, donor agencies, NGOs, and nationally focused development efforts has been cited over time as a factor limiting the comparison and more directed targeting of development and relief efforts (SALB, 2003a). It is therefore appropriate that one of the first synergistic data development efforts sponsored under the UNGIWG umbrella has been focused on the codification and production of a database addressing this issue. Accordingly, an effort entitled the Second Administrative Level Boundaries data set (SALB) was developed to establish global AD2 level administrative boundaries as of the year 2000. This effort is being coordinated by the WHO.

The datasets discussed under this topical heading do not include either census or agricultural division data layers which are country specific and may differ from administrative boundaries.

The Second Administrative Level Boundaries data effort[8]

The Second Administrative Level Boundaries data set (SALB) effort is focused on the improvement and availability of information concerning administrative boundaries down to the second subnational level. The second order administrative boundary datasets which have resulted from the SALB effort form part of the UN geographic database and have been developed within the context of the United Nations Geographic Information Working Group (UNGIWG).

The SALB represents a global data initiative consisting of digital maps and tables of codes which can be downloaded on a country by country basis. The first objective and output of the SALB was the creation of a database representative of the situation observed in January 2000. However, the information finally collected has allowed the determination of historic changes which have occurred since 1990 at the country or AD1 1st order level and since 2000 subnationally for the AD2 second (2nd) order level. Additionally, a process has also been put in place to update and maintain the SALB database in regard to future changes. Given the above development efforts, the results of the SALB are currently more indicative of a data library than any single dataset.

In order to ensure the consistency of boundaries between countries, the SALB employs an international border standard, i.e. the UNCS/FAO Level 1 database detailed in sections 4.2.1 and 4.3, and includes an encoding schema recently recognized by UNGIWG for adoption by the UN and potentially for consideration by the ISO. For countries sharing disputed areas, within the SALB, the area of concern will be integrated into the data representing both countries. Those portions of the SALB library approved for distribution can be downloaded at no cost via the WHO reference Web site at: http://www3.who.int/whosis/gis/salb/salb_home.htm. Due to differences in the quality of the documents and data compiled for the SALB, the spatial data layers of the database are more suitable for thematic mapping than for either precise representation or modelling. It is therefore recommended that these data not be used at scales larger than 1:1 million (SALB, 2003b).

As of June 2004, of the 191 countries slated for inclusion in the dataset at the inception of the SALB effort, direct contact with 184 NMAs has been established and institutionalized. This has resulted in the acquisition of maps demarcating administrative boundaries for 145 countries, based upon which digital boundaries have been finalized for 23 countries with another 35 currently undergoing validation. The finalized datasets can be directly downloaded from the SALB Web site. Additionally, tabular listings of historic boundary changes to administrative units have been codified and include changes for: 97 countries since 1990; 151 countries as of the year 2000; and for 105 countries after 2000. Discussions with the seven remaining countries for which the participation of the NMAs in the SALB effort has not been formalized are still ongoing.

Summary of combined international and subnational administrative boundaries

The VMap0.Ed5 for the first time includes subnational boundaries within the library and likely represents the highest resolution source of subnational data nominally in the public domain. However, as presented in section 3.6, a careful review of the metadata associated with the VMap0.Ed5 indicates that some or all of the subnational boundary features of the VMap0 are based on commercially copyrighted data from ESRI and Global Mapping International (GMI). Although subject to a liberal fair-quotation EULA which would not negatively impact the integration of these data for UN purposes, the existence of this copyright should be noted and its potential implications for derivative data once or twice removed from source recognized.

Table 4.2.2 lists the global or continentally specific AD2 and AD3 subnational administrative data layers identified for the inventory. Datasets covering specific countries or regional entities, such as the European Union, the IGADD/Horn region of Africa, or those covered by the Mesoamerican and Caribbean Geospatial Alliance (MACGA), have not been evaluated for inclusion in the inventory.

Table 4.2.2
AD2/AD3 Subnational administrative boundary data layers

| Data Type/Source | URL | Extent | Scale | Availability [1] | Notes |
|---|---|---|---|---|---|
| AD2/AD3 SUBNATIONAL ADMINISTRATIVE BOUNDARIES | | | | | |
| NGA-VMap0 Ed.5 AD1 & AD2 Second order political boundaries, circa 2000 | http://store.usgs.gov/ | Global | 1:1 M | PD, CR, FQ | Seamless processing from VPF format required and harmonization with ISO and other non-FIPS codes |
| WHO/UNCS SALB AD1 & AD2 Second order political boundaries, circa 2000 | www3.who.int/whosis/gis/salb/salb_home.htm | Global - Country | nominal 1:1 M | PD, CR, NC | Dataset pending, based on NMA AD2 data integrated with UNCS-FAO AD1. Some NMA data available |
| FAO Subnational boundaries for Africa, AD2/AD3 circa 2000 | www.fao.org/geonetwork/srv/en/main.search | Africa | nominal 1:1 M | PD, CR, FQ | Revision of 1993 dataset, updated to denote year 2000 AD3 boundaries |
| RWDBII-Sv1.1-AD2 Political boundaries with AD1 circa 2001 and AD2 spanning 1997-2000 | CD available from WHO, [email protected] | Global | 1:3 M | PD | Translation of the CIA’s revised Relational World Data Bank II. Also includes some AD3 & AD4 |
| ESRI Admin98, AD2 boundaries circa 1998, possibly revised in 2002 | ESRI Data & Maps CD ROMs | Global | 1:3 M - 1:10 M | C, CR, FQ | Data distributed commercially under ArcWorld or as a loss leader data product to GIS sales |
| American Digital Cartography - WorldMap | www.adci.com | Global - Continental | 1:1 M | C, CR, LF, RD | The AD1/AD2 specific layers of ADC’s global or continental products |
| Europa-Technologies - Global Insight | www.europa-tech.com | Global | 1:1 M, 1:3 M | C, CR, LF, RD | AD1, some AD2 for admin specific layers of Europa’s global products |
| Global Mapping International - Seamless Digital Chart of the World Base Map | www.gmi.org/wlms/dcw.htm | Global | 1:1 M | C, CR, LF, RD | Only source of updated AD2 boundaries identified. Highly restrictive copyright & licensing |

[1] C=Commercial; CR=Copyright; LF=License/Fee; PD=Public Domain; NC=Non-Commercial; FQ=Fair Quotation; RA=Restricted Access; RD=Registered Distribution

As indicated in Table 4.2.2, there is a relative paucity of data below the AD2 level. In fact, the only consistent sub-AD2 level database identified is the FAO Subnational Boundaries of Africa 2000 dataset, which provides coverage of continental Africa and island states. This dataset uses the original DCW coastline and political boundary data layers as a structured baseline for harmonising subnational boundaries from various national and regional datasets. The subnational boundaries were updated extensively in 1999, including some selective integration of VMap0 coastlines, etc., and the dataset was further amended between 1999 and 2001. Although the dataset contains harmonized AD2 level boundaries for all countries, the number of levels demarcated varies between countries: AD3 boundaries are omitted for six continental countries, and AD4 level boundaries are provided for six countries.

Public domain subnational administrative boundaries

Without doubt, the SALB effort represents the only rigorous public domain effort addressing subnational boundaries globally. However, because the codification and validation of AD2 level boundaries has so far been completed for only 24 countries, which prohibits the generation of a complete global coverage, three public domain global data sources containing AD2 boundaries were identified for the inventory. These sources are: the AD2 level linear and polygonal divisions available from the VMap0 Ed.5 Political Boundary layers, which are ultimately based on the RWDBII-Sv1.1 subnational boundaries; the linear and polygonal administrative boundary layers of the source RWDBII-Sv1.1 library itself; and ESRI's Admin98 dataset. The RWDBII-Sv1.1 library lies clearly within the public domain, whereas portions of the VMap0 Ed.5 AD2 layer and the whole of ESRI's Admin98 dataset do not, with ESRI and GMI holding a joint copyright. The VMap0 Ed.5 AD2 layer and the Admin98 dataset are, however, subject to a fairly liberal fair-quotation EULA when utilized for non-commercial purposes.

In addition to providing both AD1 and AD2 political boundaries, the RWDBII-Sv1.1 also provides some limited AD3 and AD4 level subnational administrative boundaries as both polygonal and linear features. The RWDBII-Sv1.1 is also the only data library reviewed which demarcates at least some maritime AD1 boundaries for island states or countries such as the Mariana Islands. Based on a rapid review, however, the coastline linear layer is incomplete when compared to the AD1 polygonal layer, and the maritime AD1 boundaries may not be complete. Within the linear national AD1 and subnational AD2, AD3, and AD4 administrative boundary data layers of the RWDBII-Sv1.1, the feature attributes contain left/right encoding of any adjacent country or subnational unit. In contrast to the VMap0 Ed.5 library, where the encoding for the AD2 subnational units uses names derived from the FIPS code, the RWDBII-Sv1.1 attributes for AD1 boundaries and subnational units use the standard FIPS alphanumerical codes.
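The left/right encoding described above can be used to derive which units border one another without any polygon processing. The following is a minimal sketch only: the record layout, the field names `left` and `right`, and the sample segments are hypothetical and are not taken from the RWDBII-Sv1.1 attribute tables. The codes shown are FIPS 10-4 country codes.

```python
from collections import defaultdict

# Hypothetical boundary line segments. Each carries the code of the unit on
# its left and on its right side; None marks a side with no unit (e.g. sea).
segments = [
    {"id": 1, "left": "WA", "right": "AO"},  # Namibia / Angola
    {"id": 2, "left": "WA", "right": "BC"},  # Namibia / Botswana
    {"id": 3, "left": "SF", "right": "WA"},  # South Africa / Namibia
    {"id": 4, "left": "WA", "right": None},  # coastline segment
]


def build_adjacency(lines):
    """Return {unit: set of neighbouring units} from left/right attributes."""
    neighbours = defaultdict(set)
    for line in lines:
        left, right = line["left"], line["right"]
        # Skip segments that have a unit on one side only, or the same unit
        # on both sides (an internal line, not a boundary between units).
        if left and right and left != right:
            neighbours[left].add(right)
            neighbours[right].add(left)
    return dict(neighbours)


adjacency = build_adjacency(segments)
print(sorted(adjacency["WA"]))
```

A table of this kind is also a quick consistency check: a unit that never appears on either side of a line is a candidate for the missing or incomplete boundaries noted in the rapid review.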

All three of the above data layers/datasets provide valuable contextual resources which can be used for base mapping, with caveats regarding the timeliness of the subnational boundaries demarcated. The inset graphic at the right depicts subnational boundaries for two of these datasets covering the Namibian AOI, compared against the SALB 1:1 million scale verified boundaries for the country. These latter boundaries are consistent with AD3 administrative units.

Based on processing for continental Africa, the estimated levels of effort (LOE) required to process consistent global subnational administrative linear and polygonal datasets from the VMap0 and RWDBII-Sv1.1 are seven and twenty-one days respectively. While the processing of each library should be accomplished from the ground up, initially consolidating diverse sets of linear and polygonal layers as outlined in Section 3.7, the LOE for RWDBII-Sv1.1 is anticipated to be much greater based on two factors. The first is the need to harmonize SWB and potentially river feature layers before administrative layers are considered. The second is the need both to correct gross AD2 topological errors and, in cases where multiple historical boundaries are represented, to select boundaries manually. A further combined LOE of seven days should also be anticipated for the VMap0 and the RWDBII-Sv1.1 to ensure the harmonization of FIPS encoding between these libraries and then with various other attribute encoding standards such as the ISO, UN and SALB[9].
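The need for code harmonization can be illustrated with a small crosswalk. FIPS 10-4 and ISO 3166-1 two-letter codes frequently differ, and the same letters can denote different countries: "ZA" is Zambia under FIPS but South Africa under ISO. The sketch below is illustrative only; the function name, record layout, and the unmatched code "XX" are hypothetical, and only a handful of countries are shown.

```python
# FIPS 10-4 to ISO 3166-1 alpha-2 crosswalk for a few southern African
# countries. A production crosswalk would need every country and territory.
FIPS_TO_ISO2 = {
    "AO": "AO",  # Angola (codes happen to agree)
    "BC": "BW",  # Botswana
    "SF": "ZA",  # South Africa
    "WA": "NA",  # Namibia
    "ZA": "ZM",  # Zambia (FIPS "ZA" is NOT ISO "ZA")
}


def harmonize(records, crosswalk):
    """Add an 'iso2' field to each record; return FIPS codes with no match."""
    unmatched = []
    for rec in records:
        iso = crosswalk.get(rec["fips"])
        rec["iso2"] = iso
        if iso is None:
            unmatched.append(rec["fips"])
    return records, unmatched


# Hypothetical administrative unit records keyed by FIPS country code.
units = [{"fips": "WA"}, {"fips": "ZA"}, {"fips": "XX"}]
units, missing = harmonize(units, FIPS_TO_ISO2)
print(units, missing)
```

Reporting unmatched codes separately, rather than dropping them, keeps the manual review list explicit.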

Commercial subnational boundary data layers

The commercial data products offered by ADC and Europa Technologies are also listed in Table 4.2.2. Based on the product descriptions from both companies, these data layers purportedly contain updated AD2 and possibly AD3 subnational boundaries. However, at least for the Namibian AOI, the product samples provided by these companies did not include any subnational boundaries. Further, because the ADC sample only covered the Namibian AOI, the full global extent of such boundaries in ADC's products could not be determined. For the EuropaTech product, a review of the global boundaries layer determined that only an incomplete coverage of AD2 or lower level boundaries was available globally (e.g. AD3 level features were limited to the US), and that the available boundary layers are represented by linear rather than polygonal features.

Although not specifically reviewed in Section 3.9 on commercial data products, the AD2-administrative division layer contained in GMI's Seamless Digital Chart of the World Base Map product likely represents the most up-to-date source of AD2 boundaries identified for the inventory. Working from the original WDBII/RWDBII-Sv1.1 subnational boundaries, GMI has historically provided the subnational boundaries integrated into ESRI's ArcWorld and Data & Maps on CD products. Further, as referenced in Section 3.6, ESRI and GMI hold the copyright for at least a major portion of the AD2 boundaries included in the VMap0 Ed.5. However, unlike any AD2 features derived from the VMap0 Ed.5, which are subject to a more liberal fair quotation EULA under the ESRI/GMI copyright, all of the layers comprising GMI's Seamless Digital Chart of the World Base Map product are subject to a stringent copyright and restrictive distribution licensing.

4.3 AREAS OF DISPUTE, CONFLICT, AND LANDMINE DISPERSAL

Source data identified under this topical heading are extremely limited and do not warrant a tabular listing. In summary, the only specific "area of conflict" database identified was the spatial database, available via links from the UNCS Web site, delimiting areas where UN Peace Keeping Forces are or have been deployed. In addition, a review of the WHO's data holdings identified two "landmine dispersal" spatial datasets, covering a portion of Angola and cleared/uncleared fields in Afghanistan. Further sources of such data might be obtained from country or regionally specific donor and NGO projects, indicating that further research is necessary. It is also likely, however, that any data sources identified through further research would be highly localized in extent and differ widely regarding the accuracy and completeness of data attribution.

Discussed briefly in conjunction with Section 4.2.1 covering AD1 international/political boundaries, the UNCS/FAO-UNGIWG International Boundaries Task Group (IBTG) dataset represents the sole spatial data source of border and consistent international areas of dispute that was identified. The WVS+, VMap0 and derivative VMap0 datasets such as the FAO-AWRD global political dataset also contain at least a limited coverage of disputed areas, and the data layers included in the RWDBII-Sv1.1 and ESRI's Admin98 dataset also potentially demarcate others. However, only the IBTG effort consistently attempts to encompass such areas globally using a 1:1 million scale spatial reference. This said, the IBTG should consider integrating the accuracy and status attributes of the VMap0 Ed.5 AD1 linear data layer as a means of promoting an assessment of data sources and the tracking of lineage within the effort.

The name associated with the IBTG data effort has been applied in this report to indicate the main participants associated with this effort, i.e. the UNCS and FAO. This proved necessary because no document describing the responsibilities associated with the effort could be obtained at the time this effort was inventoried in March 2004. Based on e-mail communications and the limited information available from the UNGIWG reference Web site, the main participants in the IBTG effort include: the UNCS, FAO, and WHO. However, other members of the UNGIWG International Boundaries Task Group are also likely contributing to this effort[10].

Based on on-line reference material (UNGIWG, 2003), the first version of a standardized GIS data layer was produced by the UNCS in a linear format, followed by the creation of a polygonal version by FAO. Based on beta samples of these data obtained from FAO, both the corrective measures applied and the inclusiveness of the effort are extensive and ongoing. FAO cites the WHO as the potential provider of at least the spatial baseline for this effort.

By inference, it appears likely that the UNCS/FAO dataset will at some point provide the international and disputed area baselines for the SALB, the generalized QID 1:5 million - 1:10 million, and an FAO specific database of such boundaries. However, the utilization of this database as a potential baseline for inclusion in other datasets could not be directly ascertained. E-mails forwarded in conjunction with the ongoing refinement of this database indicate that issues associated with the spatial demarcation of the Eritrean/Ethiopian border dispute have recently been spatially codified by the UNCS. The paper submitted by FAO at the 4th Annual UNGIWG meeting in Nairobi in 2003 indicates that FAO has in the past been dependent on a 1996 fax from the UNCS as the basis for representing Disputed Areas, and that some input from the UN Legal Office is required to approve some boundaries and take the effort forward to completion (FAO, 2003).

The UNCS/FAO dataset associated with efforts of the UNGIWG International Boundaries Task Group is currently in a limited distribution beta edition and may be released into the public domain subject to a non-commercial EULA sometime in early 2005.

4.4 PARKS, CONSERVANCIES, AND PROTECTED AREAS[11]

The UNEP World Conservation Monitoring Centre (UNEP-WCMC) is the custodian and manager of the World Database on Protected Areas (WDPA) in collaboration with the IUCN World Commission on Protected Areas (WCPA). The process of gathering and reviewing the WDPA is undertaken through a number of partnership arrangements with other organizations through the WDPA Consortium and includes agreements with intergovernmental organizations such as the European Environment Agency and the ASEAN Regional Centre for Biodiversity Conservation. By becoming part of the WDPA Consortium, members agree to provide data concerning country and regional protected areas which they may hold.

In 2002/2003 UNEP-WCMC undertook a major revision of the WDPA database by sending requests to 183 countries and organizations for updates and verification regarding current data holdings. Currently, the WDPA geospatial data layers comprise four point and two polygon layers. These layers can be further categorized and evenly split based on whether each protected area has been recognized under international law and/or designated under national legislation. For example, areas protected under international law, including those sites which have been designated by international conventions such as the World Heritage Convention or the RAMSAR Convention on Wetlands of International Importance, and also sites defined using European designations and UNESCO MAB Biosphere Reserves, are contained in one data layer, while areas such as National Parks or Nature Reserves which have been designated under national legislation and recognized as protected by individual countries are contained in another.

Because the WDPA is comprised of data from multiple sources which have been captured at a variety of scales, it is difficult to assign a precise scale for the entire dataset. However, the polygon layers represent the best information available on park boundaries and are in general suitable for map scales ranging from 1:1 million to 1:5 million. For similar reasons, the four point layers of the WDPA provide only the approximate location for each protected area. Within the internationally and nationally recognized subdatasets of protected areas, these point layers can be further classified based on whether a data layer represents all of the protected areas for which spatial referencing information has been identified, or only those areas for which a coincident polygon protected area feature could not be defined. The most up-to-date version of the WDPA spatial data layers is available via an IMS server at http://deben.unep-wcmc.org/imaps/ipieca/world.
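One way to read the layer structure described above is as a two-by-three grid: two designation types, each with one polygon layer and two point layers. The sketch below simply enumerates that grid to confirm it yields four point and two polygon layers. The labels are hypothetical shorthand and are not the WDPA's own layer names.

```python
from itertools import product

# Hypothetical labels for the two designation types and three geometry types.
designations = ["international", "national"]
geometries = [
    "polygon",           # boundaries where these could be defined
    "point_all",         # every area with any spatial reference
    "point_no_polygon",  # only areas lacking a coincident polygon
]

layers = list(product(designations, geometries))
point_layers = [layer for layer in layers if layer[1].startswith("point")]
polygon_layers = [layer for layer in layers if layer[1] == "polygon"]

for designation, geometry in layers:
    print(designation, geometry)
print(len(point_layers), "point layers;", len(polygon_layers), "polygon layers")
```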

In addition to the spatial data layers, the WDPA includes an aspatial relational database, i.e. a tabular database, containing information on individual protected areas, their size, IUCN category, history, and a number of other attributes. The aspatial database is searchable via the Internet at http://sea.unep-wcmc.org/wdbpa/.

The WDPA database is constantly being updated as new sites are designated and more accurate information is made available. The WDPA data are copyrighted and for general purposes are made available at no charge subject to a fair quotation EULA. However, for commercial use, a licence is required and can be obtained from the UNEP-WCMC.

CHAPTER 5. HUMAN HEALTH: BOUNDARIES AND FACILITIES

5.1 HUMAN HEALTH INFRASTRUCTURE AND STATISTICAL DATABASES

No consistent global databases providing coverage of either health statistical-enumeration boundaries or facilities were identified. Similarly, no consistent global databases providing subnational distribution estimates of mortality, morbidity, or prevalence rates associated with diseases such as HIV-AIDS/STDs, Malaria, Tuberculosis; or access to Potable Water could be identified.

This finding was based on a concerted review of the metadata listings provided by the Evidence and Information for Policy (EIP) and Communicable Disease Surveillance and Response (CSR) Units of the WHO and fairly extensive reviews of on-line reference material available from: the Health Studies Branch of U.S. Bureau of Census; the U.S. Center for Disease Control; The World Bank (WB); UNDP; UNICEF; the UNAIDS/WHO Global HIV/AIDS On-line Database; USAID; and other related resources. Although it does not provide as complete a listing of data sources as the extensive metadata holding provided by both EIP and CSR, a document entitled "WHO-UNAIDS_GIS_Inventory.pdf" prepared by the CSR can be downloaded via links to CSR's on-line data server, www.unaids.org, as discussed in Section 5.3 below.

5.2 HEALTH FACILITIES

A review of the metadata listings available from the WHO's EIP Unit indicates that health facility data are available for 72 countries. Further, the metadata indicate that georeferencing information is available for at least some, if not all, of the health facilities within 65 of these countries.

A similar review of the metadata holdings for the WHO's CSR Unit shows that georeferenced features representing health facilities are available for some 59 countries. The CSR metadata further classify available health facility references into nine separate "Health-Type" (HT) subclasses based on the level of health services available. Globally, data providing national coverage of facilities include: 47 countries with HT-1 locations; 44 with HT-2; 40 with HT-3; 36 with HT-4; 35 with HT-5; 27 with HT-6; 16 with HT-7; 11 with HT-8; and 6 countries with HT-9 facilities.

Overall, however, it should be recognized that the effort by the WHO and collaborators to both categorize and georeference health facilities is an ongoing one. Further, the number of facilities which have been inventoried for countries by no means represents either the total number of facilities in the country or those which can be georeferenced subnationally. According to WHO-EIP, the number of georeferenced facilities is roughly 20 times smaller than the number of health facilities which have been inventoried and categorized to date: WHO currently holds georeferencing and other attribute information for 54 420 health facilities in 73 countries, while the national level database includes 1 172 598 health facilities for 87 countries (WHO 2004b).
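The "roughly 20 times" figure can be checked directly from the two totals quoted above. This is simple arithmetic on the reported counts; the variable names are illustrative.

```python
# Facility counts as quoted in the text (WHO 2004b).
georeferenced = 54_420    # facilities with georeferencing, 73 countries
inventoried = 1_172_598   # facilities in the national level database, 87 countries

ratio = inventoried / georeferenced   # inventoried facilities per georeferenced one
share = georeferenced / inventoried   # fraction of inventoried facilities georeferenced

print(f"ratio: {ratio:.1f}")          # ratio: 21.5
print(f"share: {share * 100:.1f}%")   # share: 4.6%
```

Note that the two totals cover different numbers of countries, so the ratio is an overall indication rather than a like-for-like comparison.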

A graphical overview of the current global coverage of health facilities within each of these HT subclasses - or for all nine classes - can be obtained via the UNAIDS/WHO Global HIV/AIDS On-line Database, www.who.int/GlobalAtlas/InteractiveMap/MainFrame2.asp.

5.3 HEALTH STATISTICS AND EPIDEMIOLOGICAL DATABASES

The UNAIDS/WHO Global HIV/AIDS On-line Database also provides access to health indicator data either spatially via its IMS server www.who.int/GlobalAtlas/InteractiveMap or in tabular format via its Query based interface. Unfortunately, like most such interfaces reviewed in conjunction with this report, the process of drilling-down to access either spatial data presented graphically or tabular data in MS-Excel format proved to be both cumbersome and slow, yielding neither preliminary graphical outputs suitable for inclusion in reports nor any tabular data which could be similarly integrated into reports or referenced for independent statistical or spatial analysis.

This finding was unexpected given that the site design is not complicated, contains accessible help information, and is apparently well maintained. With regard to health statistical information, this on-line database provides a range of information covering infectious diseases, including: cholera (African countries only); HIV-AIDS/STIs-STDs; malaria; polio; rabies; and tuberculosis. The level of disaggregation available for these data varies with the country of interest, and the linkage to the IMS interface was straightforward with regard to selecting either a spatial AOI or country, and then potential sentinel site survey locations. Using HIV-AIDS as an example of disease specific data availability, the interface for the on-line database returned: HIV prevalence; AIDS cases; sexually transmitted infections; demography; socio-economic factors; risk behaviour; health sector responses; health sector characteristics; and virology as potential topical classes for either thematic map representation via the IMS server or the tabular retrieval of data. Each of these topical classes also potentially contains sub-data references, which in the case of HIV Prevalence included both Estimates and Sentinel Surveillance statistics.

However, two on-line sessions spanning almost 1.5 hours, focused on the retrieval of thematic maps suitable for reports/publication and representative tabular information providing coverage of the Namibian AOI, were not productive. Again, since the UNAIDS/WHO Global HIV/AIDS On-line Database-IMS server is both well designed and maintained, the lack of useful results indicates either that no viable information was available for the Namibian AOI specified, or that IMS based data retrieval interfaces have yet to reach their potential for data intensive, as opposed to low resolution graphic, output.

CHAPTER 6. HUMAN POPULATION: POPULATION CENTRES AND DISTRIBUTION

6.1 POPULATION CENTRES AND DISTRIBUTION DATABASES

Grouped under this topical subsection is a discussion of databases which provide either the location of population centres or the distribution of population globally based on continental landmasses. Datasets which do not contain at a minimum an attribute for a name or a population variable, i.e. the Night-Time or Stable Lights databases, are not discussed. Nor, for the most part, are databases available from on-line gazetteers such as: www.alexandria.ucsb.edu, www.calle.com, or www.world-gazetteer.com.

6.1.1 Public domain datasets of populated places containing name attributes

Four public domain databases or data layers contained in broader libraries providing either point or polygonal representations of populated places were identified for the inventory. Each of these databases/layers contains an attribute providing a name for at least some of the populated places represented. These databases are differentiated from others discussed under a separate topical subsection that, in addition to a name attribute, also contain population census estimates. Table 6.1.1 below contains a summary of these four databases or data layers.

For Namibia as a whole, the NGA-GEOnet database contained 2 456 named locations, including alternate names, associated with 2 096 unique populated place locations. Of the approximately 2 500 names, 75 are classified as localities representing potential suburbs of a larger city. In comparison, the VMap0 Ed.4/5 contained 178 unique point and 11 polygonal populated place locations, 133 of which contained a valid name attribute[12].

Based on Table 6.1.1, the listing for the NGA-GEOnet Gazetteer contains the greatest number of named features globally. The number of unique features, i.e. those not based on an alternate name, was not evaluated, so the actual number of populated places represented will be somewhat less than the 2 million plus names listed. Additionally, locations for the United States and territories are not covered by the GEOnet database and the attribute encoding a census population estimate was found to be invariably blank or null.

Table 6.1.1
Public domain datasets of populated places containing name attributes

| Data Type/Source | URL | Extent | Scale | Availability | Notes |
|---|---|---|---|---|---|
| POPULATION CENTRES BASED ON PRIMARY GAZETTEERS | | | | | |
| NGA-GNS/GEOnet Gazetteer | http://164.214.2.59/gns/html/index.html | Global | variable; 1’-1" dms; 1:250 000 | Public Domain | Database contains over 2 million population centers and is the basis for many gazetteers, including planned UNGEGN gazetteer |
| NGA-DCW Gazetteer | Available from DCW CD-ROM or via jose.aguilarbanjarrez@fao.org, pending distribution via FAO-GeoNetwork | Global | 1:1 M; rounded to ±100 metres | Public Domain | Database contains 84 209 named population centers captured from the original ONC reference charts |
| POPULATION CENTRES BASED ON VECTOR DATA LIBRARIES | | | | | |
| NGA-VMap0 Ed.5 Built-Up Area polygonal data layer of populated places | http://geoengine.nima.mil or available as a set of CD-ROMs via USGS-Store | Global | 1:1 M | Public Domain | ~36 500 urban areas globally, Ed.5 corrected some 90% of the missing ~18 000 name attributes in Ed.3/4 |
| NGA-VMap0 Built-Up and Miscellaneous Populated Place point layer | http://geoengine.nima.mil or available as a set of CD-ROMs via USGS-Store | Global | 1:1 M | Public Domain | Globally 201 520 populated place point locations for which roughly 65 000 will contain a valid name |

Given potential rounding to the nearest minute of latitude/longitude noted in previous versions of the database, the maximum displacement for any GEOnet named location will be ±0.9 km. With regard to facilitating base mapping, this renders these data more suitable for spatial referencing or centring when used in conjunction with satellite imagery, composite map layers derived from, for example, the VMap0, or Virtual Base Maps. A good example of such referencing can again be seen at the Alexandria Digital Library Gazetteer Server, www.alexandria.ucsb.edu.
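The ±0.9 km figure matches what a simple spherical-earth calculation gives for half of one arc-minute of latitude. The sketch below assumes coordinates rounded to the nearest arc-minute, which is the coarser end of the 1'-1" precision range listed for GEOnet in Table 6.1.1. The function name and the mean-radius approximation are choices made for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def max_rounding_offset_km(unit_arcsec, lat_deg=0.0):
    """Largest north-south and east-west offsets caused by rounding a
    coordinate to the nearest `unit_arcsec` arc-seconds at latitude lat_deg."""
    # Rounding to the nearest unit displaces a value by at most half a unit.
    half_unit_rad = math.radians(unit_arcsec / 3600.0 / 2.0)
    north_south = EARTH_RADIUS_KM * half_unit_rad
    # Meridians converge towards the poles, so east-west offsets shrink.
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


ns_minute, ew_minute = max_rounding_offset_km(60)  # nearest arc-minute
ns_second, _ = max_rounding_offset_km(1)           # nearest arc-second
print(f"arc-minute: {ns_minute:.2f} km, arc-second: {ns_second * 1000:.0f} m")
```

The per-axis offset is the value comparable to the quoted figure; a point displaced in both latitude and longitude at once would be offset by somewhat more along the diagonal.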

The LOE associated with processing these features from the NGA source database would again be seven days for an initial processing of the total database rather than a subset of populated place locations. Such processing should again include the attribution of an accuracy assessment based on whether rounding to the nearest minute of latitude/longitude has occurred.

The LOE required to process the VMap0 source polygonal and point data layers globally would be three days. It should be anticipated that the majority of this time would be spent correcting segmented polygons and/or slivers for built-up areas containing the same name attribute. Such processing should also include: the consolidation of the base VPF encoding; additional attribute encoding based on whether a valid name is currently encoded; a sub-classification indicating relative size based on whether a name, which is based on the original map annotation, is upper or lowercase; the addition of a proper name attribute versus the extant uppercase name; and lastly, the creation of a point subset of capital cities referenced against the nominal VMap0 capital cities point data layer.
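Three of the attribute steps listed above (flagging valid names, sub-classifying by letter case, and deriving a proper-case name) can be sketched as a single function. This is an illustration under stated assumptions: the placeholder values treated as "no name", the class labels "major" and "minor", and the function name are all hypothetical and would need to be checked against the actual VMap0 attribute tables.

```python
# Assumed placeholder strings that mean "no name recorded" (hypothetical set).
NULL_NAMES = {"", "UNK", "UNKNOWN", "N/A"}


def classify_name(raw):
    """Return (has_valid_name, size_class, proper_name) for a raw name value.

    size_class is 'major' when the source annotation is all uppercase and
    'minor' otherwise; both labels are illustrative only.
    """
    name = (raw or "").strip()
    if name.upper() in NULL_NAMES:
        return False, None, None
    size_class = "major" if name.isupper() else "minor"
    # str.title() is a simple approximation of proper case; names containing
    # apostrophes or particles (e.g. "de", "van") would need manual review.
    return True, size_class, name.title()


for raw in ["WALVIS BAY", "Swakopmund", "UNK", None]:
    print(repr(raw), classify_name(raw))
```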

6.1.2 Commercial and other populated place databases providing population estimates

Six public domain or commercial database layers of populated places containing both a name and a population census attribute were identified for inclusion in the inventory. As these databases either supersede or possibly include data from the Birkbeck College/UNEP databases of approximately 2 800 global population centres greater than 100 000 people and capital cities, or of approximately 500 African cities greater than 20 000 people, the latter databases are not listed[13]. Table 6.1.2 below summarizes the available databases.

Table 6.1.2
Commercial and other place name data containing population estimates

| Data Type/Source | URL | Extent | Scale | Availability¹ | Notes |
|---|---|---|---|---|---|
| PUBLIC DOMAIN NAMED POPULATION CENTRES WITH CENSUS ESTIMATES | | | | | |
| UNCS Quick Impact Database, four populated place point layers from each scalar library | UNCS data portal not currently available, data available for distribution & evaluation sometime 2004 | Global; Global; Africa-ME | 1:10 M; 1:5 M; 1:1 M | PD, CR, FQ | Six classes of population range estimates from 10 000 to 1 million, source or date not specified for population estimates |
| RWDBII-Sv1.1-Cities Point Data Layer | CD available from WHO, [email protected], or FAO | Global | 1:3 M | PD | 55 038 features: named, with Pop. est. in 1 000s, and various administrative encoding |
| CIESIN, Global Rural Urban Mapping Project (GRUMP), settlement points and urban extents databases | http://beta.sedac.ciesin.columbia.edu/gpw | Global | nominal 1:1 M | CR, RD | Three databases available: Settlement points data include names and coordinates for 70 560 places with estimates of population for 1990, 1995, and 2000. Urban extents data include spatial extents, georeferenced to settlement database for names and to 1 km grid for population estimates, see also Table 6.2 |
| COMMERCIAL NAMED POPULATION CENTRES WITH CENSUS ESTIMATES | | | | | |
| ALLM GeoData’s Global Gazetteer | www.allm-geodata.com/field_stats.htm | Global | Variable | C, CR, LF | 297 727 named populated places containing a population estimate, 219 000 of which are geocoded |
| Europa Technologies Populated Places layer from Global Insight Plus & Discovery product data libraries | www.europa-tech.com/prodcomp.htm | Global | nominal 1:1 M | C, CR, LF, RD and FQ on maps for internal use | 601 595 populated places, 49 127 of which contain population estimates, and 1 547 variant names |
| ADC WorldMap Digital Atlas v.4 Capitals, populated place, & small city library layers | www.adci.com/products/ | Global | nominal 1:1 M | C, CR, LF, RD and FQ on maps for internal use | Based on literature: roughly 429 000 place names containing seven possible range classifications of population |

¹ C=Commercial; CR=Copyright; LF=License/Fee; PD=Public Domain; NC=Non-Commercial; FQ=Fair Quotation; RA=Restricted Access; RD=Registered Distribution

Public Domain Databases: UNCS-Quick Impact Database

As discussed earlier, data layers from the UNCS Quick Impact Database (QID) 1:10 million and 1:5 million global, and the 1:1 million African/Middle Eastern data libraries were not available for actual evaluation and possible comparison. According to specifications, these data libraries will each contain four layers of populated places: national capitals, administrative capitals, major cities, and other cities. The features comprising each of these layers will contain an attribute ranking the relative population of each location into one of six categories ranging from 10 000 to those greater than 1 000 000. The date for the population estimates used in the ranking is not specified, but is assumed to be circa 2000.

RWDBII-Sv1.1 Cities data layer

The RWDBII-Sv1.1 Cities data is comprised of two layers representing first national-administrative capitals and then general population centres. The attributes for both layers are the same, facilitating their integration into a single data layer, and include: a proper name using a standard ASCII diacritic set; general feature encoding or ranking; a population estimate in 1 000s for some 27 000 of the city features; a source/date reference which may be difficult to decipher; and a relative population ranking classification that may no longer be validated. With the above caveats, these data layers still provide a fairly good medium scale populated place reference for both base and thematic mapping. The LOE associated with the processing of these data would only be two days, based primarily on the need to clean up the diacritical name attribute encoding.

Center for International Earth Science Information Network urban extents

A preliminary version of the urban extents database for Africa by the Center for International Earth Science Information Network (CIESIN), Columbia University and colleagues (International Food Policy Research Institute (IFPRI), the World Bank (WB), and the International Center for Tropical Agriculture (CIAT)) represents one of the most promising data libraries of populated places which can be used for both analytical and base mapping purposes. The effort is global in extent, although only Africa was available for review here.

The library is comprised of three datasets capturing population centres greater than 5 000 people as: points based on named locations from a variety of gazetteers; an areal representation of urban extents based mostly on boundaries derived from night-time lights satellite imagery (but also those captured from the TPC and ONC Charts in Africa); and lastly, a 1 kilometre grid capturing the urban/rural distribution of settlement. As of November 2004, the urban mask was only available as a grid, but a polygonal layer is also planned.

The urban extents database is available as a mask of urban areas, with rural areas denoted as the remaining land-area. Oceans and large water bodies have been masked out of the database. Population values for 1990, 1995, and 2000 are associated with this database, although at the present time they are not included with the mask; the underlying point data and the associated 1 km gridded population data supply the population values. In the inset graphic at the right, the CIESIN urban-rural extents of Swakopmund and Walvis Bay within the Namibian AOI are shown in white and the city extents based on the VMap0 are shown in red.

Unlike any of the other point databases discussed in this section, the one by CIESIN and partners identifies the origins of the information for each settlement’s estimation of population, geographical location, and year of estimation. The population values are then converted into estimates of population in the target years of 1990, 1995, and 2000, using standard interpolation techniques described in the documentation accompanying the datasets.
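The conversion of dated population figures to common target years can be illustrated with a simple growth-rate calculation. The actual techniques are those described in the documentation accompanying the CIESIN datasets and are not reproduced here. The sketch below assumes a constant exponential growth rate between two dated observations, which is one common approach; the function name and the sample settlement figures are hypothetical.

```python
import math


def estimate_population(observations, target_year):
    """Estimate population in target_year from two (year, population)
    observations, assuming constant exponential growth between them and
    extending the same rate outside the observed interval."""
    (y0, p0), (y1, p1) = sorted(observations)
    rate = math.log(p1 / p0) / (y1 - y0)  # annual growth rate
    return p0 * math.exp(rate * (target_year - y0))


# Hypothetical settlement with census counts in 1991 and 2001.
census = [(1991, 10_000), (2001, 20_000)]
estimates = {
    year: round(estimate_population(census, year))
    for year in (1990, 1995, 2000)
}
print(estimates)
```

Recording the source year alongside each estimate, as the CIESIN database does, allows the length of the interpolation or extrapolation interval to be used as a rough indicator of reliability.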

Commercial databases

Populated place layers from two commercial data libraries and one gazetteer are also listed in Table 6.1.2. Of these three data layers, the ALLM gazetteer contains the greatest number of features for Namibia, i.e. 2 681 named locations, 39 of which included population estimates. In comparison, the product from EuropaTech included 1 178 named populated places for Namibia, 29 of which contained an estimate of population. As a further comparison, the publicly available CIESIN Urban-Rural Extent data layer contained 32 named polygons with population estimates covering the country. The commercial data products do not include a date attribute in association with any population estimates.

A direct numerical comparison for Namibia based on the WorldMap product was not possible because the sample requested from ADC covered only that portion of the country contained within the AOI. However, these data contain population range estimates and not "head-counts". Additionally, contrary to expectations, the ADC library sample did not contain any built-up area polygon data for the Namibian AOI. Because the ADC sample data does contain point references for larger cities within the Namibian AOI such as Walvis Bay or Swakopmund, which in the VMap0 are represented as polygons, it is highly likely that this omission was due to an error in the "Built-Up Area" sample data layer provided. Both ADC's Capital and Populated Place layers contain six classes of population estimates purportedly based on the most recent census estimates ranging from 10 000 to 2.5 million or greater. The Small Cities data layer contains only the one population range, i.e. 5 000 to 10 000, for locations within the Namibian AOI. A null listing for the population range attribute in these data may indicate either a smaller population or potentially a suburb of a larger city for which an estimate could not be differentiated.

The ADC data library includes a separate data layer for capital cities, while EuropaTech includes such features under its general populated place point data layer. For some reason, the name attribute for EuropaTech's Urban Sprawl polygonal layer was not carried over from the VMap0 source data. Similar to ADC's library, a named point representation of these polygonal features is again included under the separate populated places point layer.

Although not presented in Section 3.9 on commercial databases, a tabular product was identified during the inventory that contains subnational population estimates for AD2 and some limited AD3 level administrative areas. This product is named Primary Administrative Subdivisions, and is available for US$150 with an optional twelve month subscription available for an additional US$100 per year; see www.statoids.com/statoids.html. The product builds on a 1999 hardcopy publication entitled Administrative Subdivisions of Countries, authored by Gwillim Law. The author also maintains an extensive on-line resource or version of the above product via the same URL. This resource is termed Statoids, and in addition to population information derived from national census organizations, provides detailed information on AD2 and some AD3 level encoding and historical changes in FIPS/ISO standards over time by country. For this reason, in addition to providing a possible source of population data, Statoids also represents an important resource to be used during any ancillary codification of the RWDBII-Sv1.1 AD2 level boundary layer and its subsequent harmonization with the relevant VMap0.Ed5 data layers as discussed in Section 4.2.2.

The Statoids, Primary Administrative Subdivisions, tabular product is subject to a non-specific single entity licence, with a rather unique EULA stipulating that any public display of the data by the licensee be limited to a single country at a time.

Comparison of spatial populated place databases

Figure 6.1.2 depicts a comparison of the public domain, publicly available, and commercial populated place data products discussed above.

Based on this comparison, it can be seen that there is a fair amount of variation in both the relative number of populated place features between the available databases and layers, and then those containing actual population estimates. Specific to the commercial data products reviewed, all three provide a fairly comprehensive inclusion of the populated point features contained within the DCW/VMap0 libraries and include additional populated point references which do not exist in these NGA sources. The ADC library also includes a name attribute for its railway station and port data layers, which is not contained in the DCW/VMap0 source library; these features are derived and maintained based on ALLM's Global Gazetteer.

A railway station layer does not exist in EuropaTech's library; however, the "Place" point data layer of this library appears to have been supplemented by the addition of locations from NGA's GEOnet GNS gazetteer database. Again, EuropaTech's product documentation indicates that the company does not use the GNS data due to errors associated with the potential rounding to the nearest minute. However, a direct spatial comparison of the GNS to EuropaTech's point data layer indicated that a significant number of features bearing the same name attribute lay within 100 m of each other between these data sources[14]. In addition to the general Place data layer, EuropaTech's library also includes greatly reduced subsets of this point data layer, which provide alternate spelling and six language renditions of the name attribute. However, Walvis Bay was the only alternate name feature contained within the Namibian AOI.

Figure 6.1.2
Comparison of populated place datasets

Based on the above comparison, the ALLM gazetteer provides the most robust product, but one that is followed fairly closely by EuropaTech's populated "Place" layer. Either of these products could provide valuable references for facilitating, for example, CIESIN's efforts to expand the urban-rural extents and population database globally. Issues related to the use of EuropaTech's product with regard to rights for any derivative data would, however, need to be negotiated. Although the ALLM product is the more expensive of the two - ALLM quoted a price of US$11 300 for these data - as discussed in the company's product overview, the utilization of these data for updating, or as an input into, say, the CIESIN Urban-Rural extents data library may be covered under the copyright. In e-mail communications, ALLM states that population estimates are updated based on regular contacts and searches of census sites, for at least every census, and generally at least once during intercensal periods, (ALLM 2004a).

In practice, however, great care must be taken concerning the incorporation of data from proprietary sources. For example, although such data are only used as an input into the urban mask, since the settlement database by CIESIN and colleagues is disseminated freely, care must be taken to ensure that the inclusion of proprietary data in one data layer does not limit the degree to which derivative data layers can also be distributed without possible commercial copyright encumbrance.

6.2 POPULATION CENSUS AND DISTRIBUTION DATABASES

Although numerous global census databases containing mainly aggregated estimates of national population were identified during the on-line review, no consistent global databases of subnational census vector boundaries and data covering even more recent intercensal timeframes were identified. Given the wide variation of the time and periodicity between national censuses conducted globally - much less the inherent difficulties of integrating even more basic aggregations of enumeration boundaries intercensally at the national level over time - this finding was not unexpected. In particular, as even simple gerrymandering or other adjustments to enumeration boundaries intercensally can create difficulties in the harmonization of vector data features, the production of raster population distribution or density databases likely represents the most analytically robust avenue of approach for producing consistent international or global datasets.

Additionally, while a number of global raster distribution databases of population were downloaded and evaluated in conjunction with the inventory review, there was found to be a wide variation as to the resolution and methodologies used to process these data. This variation exists both between each of the three population modelling efforts which have been made available over time, as well as in the methodologies used to distribute population between the individual "censuses" produced by each modelling effort. Three principal population distribution or density databases in raster format have been produced during the last 10-15 years: the CGEIC 1990 Global Population Distribution Database; the various versions of the CIESIN Gridded Population of the World; and the ORNL LandScan databases.

6.2.1 CGEIC 1990 global population distribution database

The Canadian Global Emissions Inventory Centre (CGEIC) produced what was possibly the first publicly available global population distribution database. This database captured population circa 1990 worldwide using a one by one degree cell size. The effort was funded by Environment Canada and the United Nations Environment Programme (UNEP). This database uses relevant percentage values to aggregate national census data and to then apportion population distributions across each cell of the grid. This work has been superseded in usage by the CIESIN and ORNL population modelling efforts and is therefore not recommended for consideration as a CGDB core data layer.

6.2.2 CIESIN and colleagues[15] Gridded Population of the World (GPW3)[16]

The GPW3 provides global and continental subsets of population distribution currently containing estimates for three time periods: 1990, 1995, and 2000. Although attributed to CIESIN in this report, the GPW data modelling effort, now in beta for version 3, represents a fairly complex set of organizations, participants and potential sources of funding that are difficult to document in brief. Further, due to the evolution of population modelling via the GPW effort, there have been variations in the processing methodologies used between the GPW1, GPW2, and GPW3 editions both globally as well as for the various continental subsets released over time. These variations are also difficult to document concisely. The differences primarily reflect the degree of modelling used to allocate and distribute population. GPW versions 1, 2, and 3 use no formal modelling, whereas the interim continental scale efforts - undertaken largely by UNEP and CIAT - as well as the newer population grid effort undertaken by CIESIN in collaboration with IFPRI, the WB and CIAT, all lightly model population.

Since GPW3 is methodologically similar to its predecessors, it is perhaps sufficient to concentrate on the GPW3 descriptively, rather than the earlier GPW1 or GPW2 data efforts. The GPW3 effort utilizes what is most commonly known as a proportional allocation or areal weighting approach to distribute population based on administrative polygons to a grid, rather than any relative density based on central place or routing infrastructure algorithms. In contrast, the GPW1 used a pycnophylactic interpolation to create a smoother surface between discontinuous input data rather than an areal weighting approach, but like GPW versions 2 and 3, did not reallocate population using information about other features such as roads, urban areas, or land cover. A 2.5 minute latitude/longitude cell size is used for the standard GPW global, continental and country-level efforts. This cell size equates to a nominal pixel size of 4.8 km and, given the use of “geographic” pixels, there will exist some variability in the nominal extent comprising each grid cell.
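The areal weighting idea can be sketched in a few lines. This is only an illustration of proportional allocation under simplified assumptions, not the GPW processing chain: the function name `areal_weighting`, the unit and cell identifiers, and the precomputed overlap areas are hypothetical, and in practice the overlaps would come from a GIS intersection of administrative polygons with the grid.

```python
from collections import defaultdict


def areal_weighting(unit_population, overlap_area):
    """Allocate each unit's population to grid cells in proportion to area.

    unit_population: {unit_id: population count}
    overlap_area:    {(unit_id, cell_id): area of the unit inside the cell}
    Every unit is assumed to have a positive total overlap area.
    """
    # Total area of each administrative unit, summed over the cells it touches.
    unit_area = defaultdict(float)
    for (unit_id, _cell_id), area in overlap_area.items():
        unit_area[unit_id] += area

    # Each cell receives the unit's population times its share of the unit's area.
    cell_population = defaultdict(float)
    for (unit_id, cell_id), area in overlap_area.items():
        share = area / unit_area[unit_id]
        cell_population[cell_id] += unit_population[unit_id] * share
    return dict(cell_population)


# Hypothetical example: two units spread over three cells (areas in km2).
unit_population = {"A": 1000.0, "B": 300.0}
overlap_area = {
    ("A", "c1"): 30.0,
    ("A", "c2"): 10.0,
    ("B", "c2"): 5.0,
    ("B", "c3"): 15.0,
}
cells = areal_weighting(unit_population, overlap_area)
```

Because only area is used as the weight, population is spread uniformly within each unit and the total across all cells equals the total of the input units.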

The number of administrative units used as inputs for the GPW process is constantly improving. Some 128 000 administrative units, roughly half of which covered the US, were used as inputs in the GPW2. For the GPW3, this number was substantially increased, with more than 350 000 polygonal input units going into the construction of the grid. The output GPW3 grid datasets contain both raw population counts and population density per pixel. In addition to the outputs based on population data from national statistical offices, estimates of national totals adjusted to the United Nations population figures are also available as outputs. Because many of the underlying vector boundary layers used in the GPW process were obtained via proprietary and copyrighted distribution channels, these layers are not available for distribution. However, a national level boundary file and centroids for each grid cell will be released to overcome some of the analytic limitations arising from the inability to disseminate subnational boundaries. The GPW raster data are available on-line in a number of formats including ASCII, BIL, or ESRI's compressed .E00 format for grids. The LOE necessary to process these data is negligible. Note that at the time of review the GPW3 had only been distributed in beta; the final version is expected to be released by the middle of 2005.

In addition to GPW’s global heuristic approach, other GPW based modelling efforts have also been undertaken, e.g. UNEP and CIAT respectively disseminate population grids for Africa and Latin America. In comparison, these databases lightly modelled population distribution in 1960, 1970, 1980, 1990, 1995 and in some cases 2000, and unlike the GPW took access to road infrastructure into account in addition to population counts. CIESIN and colleagues have similarly lightly modelled population based on their urban extent database and GPW3 inputs into a 30 arc second grid, for their Global Rural-Urban Mapping Project (GRUMP) population effort. Like GPW3, GRUMP uses the best or lowest order population input data available as well as the urban-rural mask discussed previously, to effectively increase the number of input units by an order of magnitude. Like GPW3, population counts and density grids are available, both with and without a UN adjustment factor.

In regard to availability, the GPW and GRUMP databases have been listed in Table 6.2 below as copyrighted with restricted distribution. This was done to illustrate that no third-party redistribution of the GPW is allowed. However, the GPW-GRUMP data are available at no cost for public non-commercial use, subject to a simple registration process requiring only a name and e-mail address.

6.2.3 Oak Ridge National Laboratory's (ORNL) LandScan population databases[17]

Unlike the various GPW modelling efforts which have essentially distributed human population onto grids based on the relative area of a grid cell in - or across - particular administrative units, ORNL's LandScan effort employs an extensive dasymetric spatial model to distribute global population. The LandScan model produces population grids by utilizing weighting criteria based on known population centres, the radiance calibrated night-time lights, distance to transportation infrastructure such as roads, elevation, slope suitability categories, land cover and other socio-environmental parameters. In addition, LandScan utilizes very high resolution 1 m to 5 m global satellite imagery databases in both the modelling, e.g. for delineating the extents of urban areas, as well as for verification and validation processes.
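The difference from plain areal weighting can be shown with a minimal dasymetric sketch. This is not the LandScan algorithm: the function name `dasymetric_allocation`, the cell labels, and the weights are hypothetical, and a single precomputed weight per cell stands in for whatever combination of lights, roads, slope and land cover a real model would derive.

```python
def dasymetric_allocation(unit_population, cell_weights):
    """Allocate each unit's population to its cells in proportion to a weight.

    unit_population: {unit_id: population count}
    cell_weights:    {unit_id: {cell_id: non-negative likelihood weight}}
    The weights are placeholders for a combined score from ancillary data.
    """
    cell_population = {}
    for unit_id, weights in cell_weights.items():
        total_weight = sum(weights.values())
        for cell_id, weight in weights.items():
            if total_weight > 0:
                share = weight / total_weight
            else:
                # No ancillary signal at all: fall back to equal shares.
                share = 1.0 / len(weights)
            allocated = unit_population[unit_id] * share
            cell_population[cell_id] = cell_population.get(cell_id, 0.0) + allocated
    return cell_population


# Hypothetical unit of 1 000 people covering three cells of equal area.
unit_population = {"A": 1000.0}
cell_weights = {"A": {"urban": 8.0, "roadside": 2.0, "desert": 0.0}}
allocated = dasymetric_allocation(unit_population, cell_weights)
```

With equal cell areas, areal weighting would place one third of the population in each cell, whereas the weighted allocation concentrates it in the cells the ancillary data mark as more likely to be settled, while still preserving the unit total.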

The cell size of the LandScan data outputs is 30 arc seconds (as). In “true” units of measurement, this cell size equates to 928 by 928 metres at the Equator and decreases towards the poles. The database is updated every year based upon the availability of new census figures, refinements in the associated input databases, and enhancements in the spatial modelling algorithms. Changes or differences in the LandScan outputs result from one or more of these factors. However, specific details of the nature of these changes are not reported to users. Thus, even though each revision date of LandScan represents the adjusted midyear-July population estimates for that year, comparatively, the available 1998, 2000, 2001, 2002, and 2003 releases of these data do not represent a time-series that can be used for pixel by pixel analyses or comparisons.
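The relationship between an angular cell size and its ground dimensions can be checked with a short calculation. The sketch below assumes a spherical Earth with the WGS84 equatorial radius, which is a simplification chosen for illustration; the function name `cell_size_m` is hypothetical.

```python
import math

# WGS84 equatorial radius in metres, used here as a spherical approximation.
EARTH_RADIUS_M = 6_378_137.0


def cell_size_m(cell_arcsec, latitude_deg):
    """Return (east_west, north_south) cell dimensions in metres.

    On a sphere the north-south extent is constant, while the east-west
    extent shrinks with the cosine of the latitude.
    """
    angle_rad = math.radians(cell_arcsec / 3600.0)
    north_south = EARTH_RADIUS_M * angle_rad
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return east_west, north_south


# 30 arc second cell at the Equator and at 22.5 degrees south,
# a latitude chosen only as a round value within Namibia.
equator_ew, equator_ns = cell_size_m(30, 0.0)
namibia_ew, namibia_ns = cell_size_m(30, -22.5)

# 2.5 arc minute (150 arc second) cell at the Equator.
gpw_ew, gpw_ns = cell_size_m(150, 0.0)
```

Under these assumptions a 30 arc second cell is roughly 928 m wide at the Equator and narrows to about 857 m at 22.5 degrees south. The same calculation gives roughly 4.6 km for a 2.5 arc minute cell at the Equator, somewhat below the nominal 4.8 km figure quoted in this report.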

The population estimates used as inputs into the LandScan effort are based primarily on aggregate second order administrative units compiled by the International Programs Center of the U.S. Bureau of Census to represent the most recent census information for each country. As with the GPW effort, the number of disaggregate subnational population estimates is constantly being improved upon. For example, an additional 700 international second order administrative boundaries were used for the LandScan 2002 as compared to the LandScan 2001. Overall, the LandScan 2003 outputs will require some 239 000 estimates of population based on second and third order administrative boundaries as inputs.

ORNL has copyrighted the 2001, 2002 and 2003 revisions of the LandScan database, and made them subject to a non-commercial use EULA. ORNL has also stipulated that no third party distribution of the copyrighted LandScan databases is authorised, and potential users must complete a basic registration form before downloading these data. The LandScan databases are distributed as zipped compression files using either the base ESRI grids directory structure or in ESRI's binary E00 grid export format. Both global and continental subsets of the database are available for download after registration. The input spatial data and attributes used for producing the LandScan grid data are not available for distribution outside of ORNL.

Post processing tasks associated with these data are again negligible, and would in the main be confined to the rebuilding of the attribute and statistical tables of the base raster data grids.

6.2.4 Comparison of available population distribution databases

Table 6.2 provides a summary of the database outputs attributed to the CIESIN and ORNL population efforts for the purposes of this report.

Table 6.2
Public domain population distribution databases

| Data Type/Source | URL | Extent | Scale | Availability¹ | Notes |
|---|---|---|---|---|---|
| HUMAN POPULATION DISTRIBUTION AND DENSITY | | | | | |
| CIESIN/CIAT Gridded Population of the World, GPW Version 3 | http://sedac.ciesin.columbia.edu/gpw | Global | 2.5' (~4.8 km) | CR, RD, publicly available | Population distribution time-series 1990, 1995, 2000 based on best available subnational census units |
| UNEP/CIAT Population Distribution Accessibility Model | http://grid2.cr.usgs.gov/globalpop/africa/, http://gisweb.ciat.cgiar.org/population/asia | Africa, Asia, Latin America | 2.5' (~4.8 km) | CR, publicly available | Population distribution time-series 1960-1990 based on best available subnational census units and roads data |
| CIESIN et al. Global Rural-Urban Mapping Project (GRUMP): Gridded Population of the World, version 3, with Urban Reallocation (GPW-UR) | http://beta.sedac.ciesin.columbia.edu/gpw | Global | 30as (~1 km) | CR, RD | Population distribution time-series 1990, 1995, 2000 based on best available subnational census units and GRUMP urban extents data |
| ORNL-LANDSCAN Population distribution | http://web.ornl.gov/sci/gist | Global | 30as (~1 km) | PD, NC, RD | Annual revisions 1998, 2000-2003 based on weighted density for ~2nd order U.S. BuCen global population estimates using a number of factors |

¹ C=Commercial; CR=Copyright; LF=License/Fee; PD=Public Domain; NC=Non-Commercial; FQ=Fair Quotation; RA=Restricted Access; RD=Registered Distribution

In comparing the potential utility of outputs resulting from the LandScan and GPW population efforts, the selection of any one population database over the other should be dictated by the individual user’s application. For instance, in order to avoid the introduction of inaccuracies and/or circularity into any analysis, in certain cases users may need to evaluate the relationship between the methods used to distribute population employed for each effort. This may include both the type and quality of the administrative boundaries employed as original inputs and then the type of spatial and environmental variables that may already be represented in the population grids. Given this, it may be important for potential users to consider the following.

1. The GPW effort uses a much simpler and perhaps more transparent methodology, whereas LandScan reallocates population based on a dasymetric spatial model that integrates lights at night, road networks, land cover classifications, high resolution satellite image derived urban boundaries, and slope.[18]

2. Caution should also be used when attempting to employ data from various revisions and/or editions of the outputs resulting from the GPW population effort. Similarly, because the LandScan effort employs an evolving and spatially heterogeneous set of inputs, and the weighting schemes are not reported to users, these data should be used with caution for any time-series analyses, in particular those based on pixel by pixel comparisons.

3. Finally, although LandScan relies heavily on many types of inputs, it currently uses fewer population polygons as the basis for the reallocation of population. In comparison, the GPW v.3 uses just over one third more input polygons than LandScan, particularly if the number of input polygons outside of the USA is compared.

In summary, the choice of which database to use largely comes down to the application. Researchers doing analyses which potentially consider the relationship between environmental variables and population distribution should understand the implications of using databases which already include environmental information in the model to reallocate population, as this may introduce some circularity into their analysis, (CIESIN 2004). On the other hand, researchers should also be aware that the issue of the spatial accuracy of delineated boundaries in databases depending on more disaggregate polygonal boundaries representing lower order administrative hierarchies has not been fully documented and in some instances may introduce unacceptable errors in population modelling, (ORNL 2004). Lastly, although there have been studies which have demonstrated how to isolate the environmental effect of variables from dasymetric models (ORNL 2004), recent unpublished studies have explored the impact of the various population distribution surfaces, including those integrating environmental factors, in the context of public health, (CIESIN 2004). Similar types of studies, when published, should help users to intelligently select the most appropriate dataset for their needs.

Figure 6.2 below provides a graphical comparison of the GPW3 and the LandScan-2002 databases for the Namibian AOI[19]. It should be noted that the LandScan database represented on this graphic has been adjusted to reflect mid-2002 population estimates and not the year 2000 estimates depicted for the GPW v.3.

Figure 6.2
Comparison of raster population distribution databases

Based on a comparison of the two graphics shown in Figure 6.2 the following differences should be evident:

1. The relative difference between the 2.5' and 30as pixel sizes used in each effort.

2. The use in the LandScan effort of VMap1/VMap0 roads infrastructure, as well as the weighting factors used for urban areas.

3. The difference in the application of null/no data areas and water masks between the two data efforts. Whereas in the GPW water bodies are assigned null values, e.g. Etosha Pan can be distinctly identified in the figure, in LandScan large water bodies are assigned a zero population value. In the LandScan effort, this is done in order to avoid the masking of potentially populated cells in water, e.g. marinas or ports, which can be differentiated based on access to higher resolution satellite imagery used as inputs, (ORNL 2004).


[8] Based on a preliminary review of the inventory, the text in this section has been revised and enhanced by WHO, (2004).
[9] A potentially important on-line resource was identified during the inventory that can be used during any ancillary codification of the RWDBII-Sv1.1 AD2 level boundary layer and its subsequent harmonization with the relevant VMap0.Ed5 data layers. This resource is the Administrative Divisions of Countries ("Statoids") website located at http://www.statoids.com/statoids.html. Statoids provides detailed information on AD2 and some AD3 level encoding and historical changes in FIPS/ISO standards over time. Statoids is also available as a tabular commercial product updated monthly and may represent an important resource for SALB managers to consider licensing. Since the Statoids product contains population estimates, it is discussed briefly in Section 6.1.2.
[10] Any erroneous attribution regarding either the participants or contributors to the IBTG effort are regretted.
[11] Based on a review of the inventory, this section was extensively modified by UNEP-WCMC (2004).
[12] Subsequent processing of the VMap0 Ed.5 library indicates that all 189 populated place features falling within the AOI would now contain a valid name attribute.
[13] A UN-Habitat dataset entitled “All Cities over 100 000 Inhabitants” in ESRI Shapefile format was noted as missing from the inventory during the peer review process, (UNCS, 2004a). This and possibly similar datasets are available from UNEP-Nairobi.
[14] An independent assessment of EuropaTech’s Populated Place data layers made by the UNCS, found that in certain areas a location may be covered by duplicate point features, (UNCS 2004a).
[15] Any incorrect attribution of the GPW-1, GPW-2, and GPW-3 population modeling effort to CIESIN alone is regretted. Such an attribution is made in the interest of brevity, and neither the notable contributions of individual scientists such as Tobler, Deichmann and others, nor the relevant contributions made by organizations such as the NCGIA, UNEP, CIAT, WRI, the WB, etc. are meant to be given short shrift. The full attribution of the GPW can be found at http://sedac.ciesin.columbia.edu/plue/gpw/credits.html.
[16] The text in this section has been revised and enhanced based on CIESIN’s review of a preliminary draft, (CIESIN 2004).
[17] The text in this section has been revised and enhanced based on ORNL’s review of a preliminary draft, (ORNL 2004).
[18] The generic algorithm for the LandScan model is available (Dobson et al., 2000).
[19] Both CIESIN and ORNL commented that Namibia is likely an unrepresentative country for the comparison of population data due to its low population, aridity, and the sparse settlement occurring within the AOI.
