

3 Methodology

Mapping global forest cover, even for only a limited number of classes, is a large undertaking with many challenges that are unique to the scale of the effort (Cihlar 2000). For a given class, the many different forest types and physiognomies in various geographic regions of the world give rise to great variations in spectral or seasonal properties. This requires extensive study of regional forest distribution and the use of flexible stratification in the mapping. For a mapping area as large as a continent or the world, land cover conditions and patterns in many regions are unfamiliar to the mapping analysts. Persistent clouds, atmospheric contamination, and variations in sensor viewing angle and solar illumination lead to varied data quality in global image data sets. Under these circumstances, which are inherent in the scale and pixel sizes of large-area mapping, many large-area land cover mapping efforts have relied on temporal composite data, stratification, unsupervised classification, and interpreter skills to compensate for these scale-dependent factors. The USGS global land cover characterization project also adopted these measures, along with a flexible land cover description based on vegetation seasonality as measured by a time series of AVHRR NDVI (normalized difference vegetation index). While the USGS global land cover database does not differentiate open from closed forest lands, its descriptions for the other three land cover types required under the FAO legend (other wooded land, other land cover, and water) were compatible with that legend.

Still, the requirements for forest canopy density mapping (the first step leading to the first two classes) clearly called for a flexible methodology that allowed interactive derivation and correction of the FAO classes. The methodology used for the FAO global forest cover mapping consisted of temporal compositing, estimating percent forest cover, linking percent forest cover to the USGS global land cover classes, and validation.

3.1 Temporal image compositing

Source data used for the FAO forest cover mapping were drawn from AVHRR NDVI composites produced for February 1995-January 1996, the latest data year available when the project began. The temporal composites, consisting of five calibrated AVHRR bands and an NDVI band, were initially computed at EDC using the maximum NDVI value protocol (to minimize the amount of cloud in the imagery) over 10-day compositing periods (Eidenshink and Faundeen 1994). Bands 1 and 2 were further corrected for atmospheric ozone and Rayleigh scattering effects. For this study, bands 1 and 2 (red and infrared) and NDVI of the 10-day composites were processed into monthly composites (three 10-day composites for each monthly composite), using a rule of minimum band 1 (red band) value (Waller and Zhu, 1999). Studies have shown that, in the temporal compositing process, the maximum NDVI algorithm tends to retain large off-nadir pixels in the backscatter direction (Cihlar et al. 1994, Qi et al. 1991), resulting in varying pixel sizes and additional noise in the spectral bands. Indeed, visual examination of the source data showed that large backscatter view angles were associated with raised red and IR reflectance in forestland, leading to confusion with vigorous agriculture and added noise in open or fragmented forest regions. As a partial correction for the bi-directional effect, the minimum red band compositing rule used for this study was found to better preserve pixels near nadir or in the fore-scatter direction and to maintain the integrity of forest land patterns. There are also geometric concerns associated with the fore-scatter view angles selected by the minimum red compositing, but these are thought to be of lesser importance than the reflectance concerns (Waller and Zhu, 1999).
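
The minimum red compositing rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the processing code used at EDC: the function name and the dictionary layout are hypothetical, each composite is assumed to hold one value per pixel for each band, and the infrared and NDVI values are assumed to be carried over from the same 10-day composite that supplied the minimum red value. Missing-data and cloud handling are left out.

```python
def monthly_min_red_composite(dekads):
    """Combine 10-day composites into one monthly composite.

    dekads: list of composites (hypothetical layout), each a dict with
    equal-length lists under the keys "red", "nir", and "ndvi".
    For every pixel, all three values are taken from the composite
    whose red (band 1) value is lowest.
    """
    n_pixels = len(dekads[0]["red"])
    out = {"red": [], "nir": [], "ndvi": []}
    for i in range(n_pixels):
        # Pick the 10-day composite with the minimum red value at pixel i.
        best = min(dekads, key=lambda d: d["red"][i])
        for band in out:
            out[band].append(best[band][i])
    return out
```

Selecting whole pixels by date, rather than taking per-band minima independently, keeps the red, infrared, and NDVI values of a pixel consistent with a single observation.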

The final image data consisted of 12 monthly composites of the first two AVHRR bands (red and infrared) and the NDVI band for each of five major land masses: Africa, Australia and tropical Asia (see footnote 2), Eurasia (Europe and temperate/subtropical Asia), North America, and South America. Lack of good quality image data prevented several important tropical island chains in the Pacific, including Hawaii, Micronesia, and Polynesia, from being mapped. For each major land mass, initial image data preparation and subsequent mapping were conducted using an equal-area Interrupted Goode Homolosine projection (Steinwand 1994). Finished map products were re-projected to two map projections: Lambert equal-area projection for all continental masses and a global Robinson projection for global presentation.

3.2 Estimating percent forest cover with stratification

The concept of spectral mixture analysis (SMA) quantifies pixels as fractions of basic surface components, or “endmembers,” such as green vegetation, soil, and shade (Smith et al., 1990, Wessman et al. 1996). Mixture analysis provides a means to estimate the amount of each endmember within target pixels, and the results serve as one additional layer of information about land cover (Zhu and Evans, 1994). It is generally understood that, in relatively small study areas and with sufficient spectral information, unique and representative endmembers can be identified to produce reasonable results. Spectral mixture analysis is nevertheless important to large-area land cover mapping conducted with coarse-resolution satellite imagery as the primary data set, since coarse pixels (e.g. 1-km) are more likely than fine-resolution pixels (e.g. 30-m) to contain mixed land cover types. Several recent large-area land cover studies approached the problem of quantifying mixed pixels with different algorithms. For instance, DeFries et al. (2000) used a multivariate, multitemporal regression model to estimate 8-km global land cover mixtures. For the tropical forest TREES project, a spatial regression estimator was used to improve coarse-resolution area estimates of deforestation on the basis of a fine-resolution sample (Mayaux and Lambin 1995). These algorithms reduced the need for a time-consuming refinement process in classification (Cihlar 2000) and proved to be a key element in achieving the overall objectives of these studies.

To estimate the fraction of forest cover in 1-km pixels, we adopted a methodology that combined linear mixture modeling with scaling of NDVI and the visible band based on pixel positions along the infrared band (Figure 1). This methodology grew out of our observation that it would be impractical to obtain, in a global framework, the sufficiently large sample of high-resolution reference data required to run a TM-guided approach. Moreover, a simple mixture analysis would usually not produce satisfactory results for forest cover, particularly over a large mapping area, since endmember fractions might not directly correspond to forest fractions (Roberts et al., 1993). In spectral space, forest is often a mixture of vegetation and shade (traditional endmembers) and its spectral position can vary according to a variety of factors such as leaf type, structure, productivity, or topography (Waller and Zhu, 1999). View angle and solar illumination problems, as well as open woodland with varying soil background, would complicate the analysis even further. These issues necessitated the use of geographic stratification and flexible modeling.

In the combined model, pixels were modeled depending on their relative positions along the infrared band (Y-axis). Pixels with low IR reflectance contained new forests in burned areas, woody wetlands, and other dark land cover such as shadow or water, and these pixels could be best scaled with their NDVI values because NDVI was considered insensitive to illumination variance (Holben et al. 1986). It may be noted that, in red-IR space, NDVI values for green vegetation range from 0 along the diagonal line to near 1 at the boundary close to the Y-axis. It was found that NDVI values varied between 0.3 and 0.8 from open woodland to closed forest in the low IR range. Scaling was set flexibly for different forest cover types.
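
A minimal sketch of NDVI scaling for low-IR pixels is given below. NDVI is computed with its standard definition, (IR - red) / (IR + red). The 0.3 and 0.8 input bounds come from the range reported above, but the function names, the clamping behavior, and the output percentages assigned to those bounds are assumptions made for illustration; the text notes only that scaling was set flexibly by forest cover type.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index from red and infrared reflectance."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def scale_linear(value, in_low, in_high, out_low, out_high):
    """Linearly map value from [in_low, in_high] to [out_low, out_high].

    Values outside the input range are clamped to the nearest bound
    (an assumption; the source does not describe out-of-range handling).
    """
    frac = (value - in_low) / (in_high - in_low)
    frac = min(1.0, max(0.0, frac))
    return out_low + frac * (out_high - out_low)


# Hypothetical usage: NDVI 0.3 -> 10 percent cover, NDVI 0.8 -> 100 percent.
percent_cover = scale_linear(ndvi(0.05, 0.25), 0.3, 0.8, 10.0, 100.0)
```

Because the function accepts `in_low` greater than `in_high`, the same helper could express an inverse relationship, such as lower red reflectance corresponding to denser canopy.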

Pixels with high IR reflectance were treated with a two-band unmixing method using the three endmembers shown in Figure 1: A as forest (usually healthy deciduous forest), B as agriculture (bright green), and C as non-vegetated land. Linear combinations of the three land cover types were then solved for individual fractions with the least-squares fit technique (Smith et al. 1990, Roberts et al. 1993). For pixels with mid-range IR reflectance for green vegetation, which represented conifer or mixed forests, fragmented forest (forest mixed with agriculture), open woodland, shrubland and grassland, a linear scaling of the AVHRR red band was applied. Previous studies have shown that, with careful application of stratification to both land cover and physiography, forest cover density was highly correlated with red band reflectance in this spectral range (Yang and Prince, 1997).
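
The two-band, three-endmember unmixing can be illustrated with the sketch below. It assumes the fractions sum to one, which is a common constraint in linear mixture models but is not stated explicitly in the text. Under that assumption, two bands and three endmembers give an exactly determined system, so the least-squares solution coincides with the direct solution computed here. The function name is hypothetical, and negative fractions (pixels outside the endmember triangle) are returned as-is rather than corrected.

```python
def unmix_three_endmembers(pixel, a, b, c):
    """Solve for the fractions of endmembers A, B, C in a pixel.

    pixel, a, b, c: (red, nir) reflectance pairs.
    Assumes pixel = f_a*A + f_b*B + f_c*C with f_a + f_b + f_c = 1.
    Returns (f_a, f_b, f_c).
    """
    # Substitute f_c = 1 - f_a - f_b to obtain a 2x2 linear system.
    m11, m12 = a[0] - c[0], b[0] - c[0]
    m21, m22 = a[1] - c[1], b[1] - c[1]
    r1, r2 = pixel[0] - c[0], pixel[1] - c[1]
    det = m11 * m22 - m12 * m21
    if abs(det) < 1e-12:
        raise ValueError("endmembers are collinear in red-IR space")
    # Cramer's rule for the two unknown fractions.
    f_a = (r1 * m22 - m12 * r2) / det
    f_b = (m11 * r2 - r1 * m21) / det
    return f_a, f_b, 1.0 - f_a - f_b
```

In a stratified workflow, the endmember reflectance pairs passed as `a`, `b`, and `c` would be reset for each geographic stratum rather than fixed globally.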

Even with the three methods in the mixture analysis model, significant regional variations in climate, topography, and forest types dictated that geographic stratification be used to ensure that the same canopy definitions were mapped across varying regional conditions. For example, dry miombo woodland in southern Africa would display different spectral properties from those of fragmented tropical moist forest in the Congo Basin, even if they had a similar range of canopy openness. Geographic stratification allowed regional variations to be treated separately by resetting threshold or endmember values flexibly to fit the vegetation conditions being mapped. Geographic stratification for each continent was based on digitized lines following combinations of ecoregions, physiography, vegetation types, and imagery conditions.

The three methods in the combined model for forest canopy mapping were applied to each monthly AVHRR composite. To provide the least atmospherically affected results, final percent forest cover for a continent was determined over the course of the year (February 1995-January 1996) on the basis of the maximum monthly forest cover value achieved, regardless of the method chosen. The maximum forest composite was compared with the average forest canopy cover over the course of the year; areas with a high ratio were examined to prevent overestimation from anomalous data. The two FAO forest classes, closed forest and open or fragmented forest, were then derived on the basis of the FAO definitions: 40-100 percent and 10-40 percent canopy cover, respectively.
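
The annual compositing and class thresholds can be sketched as below. The function names are hypothetical, and the ratio threshold of 3.0 is a placeholder, since the text says only that areas of high ratio were examined. Assigning exactly 40 percent to closed forest is also an assumption, because the stated ranges overlap at that boundary.

```python
def annual_forest_cover(monthly_percent, ratio_threshold=3.0):
    """Return (peak percent cover, flag for review) for one pixel.

    monthly_percent: non-negative monthly percent forest cover estimates.
    The flag is True when the peak is large relative to the annual mean,
    which may indicate an anomalous month (threshold is hypothetical).
    """
    peak = max(monthly_percent)
    mean = sum(monthly_percent) / len(monthly_percent)
    flagged = mean > 0 and (peak / mean) > ratio_threshold
    return peak, flagged


def fao_forest_class(percent):
    """Map percent forest cover to an FAO forest class, or None if below 10."""
    if percent >= 40:
        return "closed forest"
    if percent >= 10:
        return "open or fragmented forest"
    return None
```

Pixels that return `None` here are the ones left to be filled from the adapted USGS land cover classes.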

3.3 Adapting the USGS seasonal land cover classes to the FAO classification

After the first two FAO classes (closed forest, open or fragmented forest) were mapped using the mixture analysis model described above, the existing USGS global land cover database was adapted to derive the next three classes: other wooded land, other land cover, and water (Table 1). Vegetation classification and descriptions in the USGS global land cover database were built on characteristics of vegetation seasonality, productivity and leaf types determined in terms of 1992-93 AVHRR NDVI and other attributes such as general topography and climate (Loveland et al. 1999, Reed et al. 1994). The magnitude of integrated NDVI over the length of the temporal period helped separate successively decreasing vegetation primary production, ranging from forested land to open woodland, shrub and grass, and sparse land cover (Reed et al. 1994). Additionally, seasonal variations were investigated to partially support identification of agricultural land uses (e.g., single versus multiple cropping). At the most basic level of the database is the flexible classification of seasonal land cover regions, with the number of classes at this level varying among continents as a function of land cover complexity and land use history.

To derive the three FAO classes, the USGS seasonal land cover classes were used as the baseline data on a continent-by-continent basis. The refinement methods used to fit USGS classes to FAO definitions were similar to the methods used in producing the USGS classes: refinements depended on local land cover conditions and relied on a careful study of all available evidence. Loveland et al. (1999) describe the overall approach in detail. The “other wooded land” class consisted primarily of open or closed shrub land cover in tropical and subtropical regions as well as low density tree cover in northern boreal zones near the polar regions. The “other land cover” class contained mostly barren land, grassland, and cropland from the USGS database. Class merges and splits were aided by ancillary data sets, such as ecoregions and digital elevation models. Spectral reclustering, as well as user-defined polygon splits, was also used. In areas of disagreement, the initially defined first two FAO classes took priority. The three FAO classes were merged with the first two forest classes for each continent, and all continents were combined into a single global forest cover map.
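
The priority rule for merging the two sources of class labels can be illustrated with a very small sketch. It assumes, hypothetically, that both layers are stored as per-pixel lists, that pixels without a forest class are marked `None`, and that the USGS-derived layer has already been reduced to the three non-forest FAO classes. The actual refinement relied on analyst judgment and ancillary data, which this sketch does not attempt to reproduce.

```python
def merge_with_priority(forest_classes, usgs_derived_classes):
    """Merge per-pixel labels, letting the forest layer take priority.

    forest_classes: labels from the canopy model, None where no forest
    class was assigned.
    usgs_derived_classes: labels adapted from the USGS database.
    """
    return [
        forest if forest is not None else other
        for forest, other in zip(forest_classes, usgs_derived_classes)
    ]
```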

3.4 Validation – a pragmatic plan

Assessing the accuracy of large-area land cover mapping results is a necessary yet difficult task, largely because of potentially high costs in both resources and time. Moreover, because the mapping area is large and land cover in coarse-resolution pixels is highly variable, requirements on sampling and data collection may be more stringent than for small, less heterogeneous mapping areas. As Cihlar (2000) noted, these constraints limit accuracy assessment for large-area land cover to a matter of compromise between what is desired and what is available. For this study, an independent, global-scale, sampling-based reference data set was not available. Instead, the validation plan centered on a high-resolution interpreted data set used for validating the IGBP global land cover, and two national land cover maps, for China and the U.S.

The IGBP validation data set (Scepan, 1999) contained 312 interpreted sample sites from a randomly selected Landsat-5 Thematic Mapper (TM) sample, acquired in the early 1990s and stratified by the IGBP land cover classes (Belward, 1996). The sample sites were 1 square kilometer in size in the Landsat scenes, and were interpreted by a team of regional land cover experts (Scepan, 1999). The Landsat sample was stratified by class for 16 classes of the 17-class IGBP legend; water was not sampled. To assess the accuracy of the FAO global forest cover map, the IGBP validation data set was re-coded from the 16-class IGBP legend to the first 4 FAO classes, and then to forest or nonforest (Appendix 1). This re-coded data set was overlaid on the global FAO map to derive a set of accuracy statistics, including overall accuracy and user’s and producer’s accuracy (Stehman and Czaplewski, 1998), for the first four classes in the FAO legend. The accuracy assessment was conducted for each continent rather than for the land masses used as separate mapping areas, since the IGBP validation data set was coded by continent.
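
For readers unfamiliar with these statistics, the sketch below shows how overall, user’s, and producer’s accuracy are conventionally computed from paired map and reference labels. The function name is hypothetical, and the calculation uses simple unweighted counts; any weighting that a stratified sample design might call for is not included.

```python
def accuracy_stats(map_labels, ref_labels):
    """Compute overall, user's, and producer's accuracy from paired labels.

    User's accuracy: share of pixels mapped as a class that the reference
    confirms. Producer's accuracy: share of reference pixels of a class
    that the map labels correctly.
    """
    classes = sorted(set(map_labels) | set(ref_labels))
    counts = {(m, r): 0 for m in classes for r in classes}
    for m, r in zip(map_labels, ref_labels):
        counts[(m, r)] += 1

    overall = sum(counts[(c, c)] for c in classes) / len(map_labels)
    users, producers = {}, {}
    for c in classes:
        mapped = sum(counts[(c, r)] for r in classes)      # row total
        referenced = sum(counts[(m, c)] for m in classes)  # column total
        users[c] = counts[(c, c)] / mapped if mapped else None
        producers[c] = counts[(c, c)] / referenced if referenced else None
    return overall, users, producers
```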

Two special cases in the overall validation effort were the two existing wall-to-wall land cover maps produced with 30-meter Landsat-5 TM data acquired in the early 1990s. For China, the Chinese Academy of Sciences (CAS) developed a wall-to-wall land cover classification using Landsat-5 data from the early 1990s (Liu and Buheaosier, 2000). Similarly, early 1990s Landsat-5 data were used by the Multi-Resolution Land Characterization (MRLC) Consortium to produce a conterminous U.S. land cover map (Vogelmann et al. 1998). Reclassifications of the two land cover maps to the FAO legend are given in Appendix 1. Because the legends of the existing data sets did not match the full FAO map legend (4 classes, excluding water), we derived accuracy figures from the forest/nonforest reclassification only. The China and U.S. land cover maps were resampled to a 1 square km pixel size and overlaid on the FAO global forest cover map at the same locations. Again, the commonly used accuracy parameters (overall accuracy and user’s and producer’s accuracy) were derived from the overlay.

All three reference data sets were produced with different mapping techniques, and their reclassifications to the FAO legend were approximate. Still, taken together, the three data sets give users an indication of the consistency of thematic accuracy at different levels (global vs. China and the U.S.). It may be noted that, while other land cover maps existed, only the two data sets from China and the U.S. were used. This is because 1) the data sets covered sufficiently large areas, 2) the dates of the image data were close (early 1990s), and 3) the primary image data were 30-meter Landsat data, following the convention of using finer resolution data to validate products from coarser source data (e.g., Kloditz et al. 1998).


2 Grouping of tropical Asia with Australia was primarily for convenient image processing and mapping.
