Global Forum on Food Security and Nutrition (FSN Forum)

Member profile

Federico Adolfo Kindgard

Organization: FAO
Country: Argentina

Adolfo is a Remote Sensing Specialist and Agricultural Engineer, with a Master's degree in environmental science from the University of Buenos Aires. He is currently coordinating the FRA 2020 RSS and other remote sensing activities within the FRA team. He joined FAO in 2012 as a consultant, supporting the implementation of national forest monitoring systems in many countries of the Latin American region with the UN-REDD team, and joined the FRA team at headquarters in January 2017. He has previous experience in remote sensing research and mapping solutions across a wide range of agro-ecosystems and scales, as well as in teaching, research, field surveys and the development of on-demand solutions based on geospatial technologies.

This member contributed to:

    • Dear Mila, thank you for your interesting and detailed comments. I have pasted them below in italics, with my comments after each one, so that all interested colleagues can follow the discussion.

      The detection or attribution of forest recuperation or other changes in condition can be challenging in the sense of visually assessing the year that the condition changed (i.e. visually assessing the year that shrubland converted to forest). We might need some kind of remote sensing-based thresholding approach to guide decisions that a country could use as part of their documentation of how they did this.

      I agree with you: recovery is quite a difficult attribute to detect, for two main reasons: the availability of maps to stratify gains, and the difficulty of photo-interpreting gains. Regarding the first problem, there are no good indicators of where gain is supposed to be happening. If we are not able to identify, more or less, where forest might be expanding, it is very unlikely that we will have enough samples to detect the phenomenon, because this area is a very small share of the total map: the FRA 2020 RSS estimated that gains occurred on just 0.5% of the land area between 2000 and 2018.
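
      As a rough illustration of why such a rare class is hard to capture, a minimal binomial sketch (the sample sizes and the threshold of 30 gain plots below are hypothetical, not FRA figures):

        from math import comb

        P_GAIN = 0.005  # share of land with forest gain, per the FRA 2020 RSS figure above

        def prob_at_least_k_gain_plots(n: int, k: int, p: float = P_GAIN) -> float:
            """P(at least k sample plots fall on gain areas), binomial model."""
            return 1.0 - sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))

        # expected gain plots = n * p; sample sizes are hypothetical
        for n in (1000, 10000, 100000):
            print(n, n * P_GAIN, round(prob_at_least_k_gain_plots(n, 30), 3))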

      The second challenge is the difficulty of assessing forest area gains through visual interpretation, largely due to the slow growth/recovery of trees before they become visible in satellite imagery. In that sense, providing complementary information, such as spectral coefficients from satellite time-series algorithms like CCDC (Continuous Change Detection and Classification), could be very useful.
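
      For readers unfamiliar with CCDC, here is a minimal sketch of the kind of per-pixel harmonic model it fits; this is not the actual CCDC implementation, only the simplified trend-plus-annual-harmonic form:

        import numpy as np

        def harmonic_coefficients(t, y):
            """Least-squares fit of y ~ a0 + a1*t + b1*cos(2*pi*t) + c1*sin(2*pi*t).

            t : observation times in fractional years
            y : surface reflectance values for one spectral band
            """
            t = np.asarray(t, dtype=float)
            X = np.column_stack([
                np.ones_like(t),          # intercept
                t,                        # linear trend (slow change, e.g. regrowth)
                np.cos(2 * np.pi * t),    # annual seasonality
                np.sin(2 * np.pi * t),
            ])
            coefs, *_ = np.linalg.lstsq(X, np.asarray(y, float), rcond=None)
            return coefs  # coefficients an interpreter could consult for gradual gain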

      What is the population?

      If the population is a polygon layer that represents “the area of the country”, then every location in the sampling frame should have some known, positive chance of being selected for a plot. When hexagons are overlaid on a country boundary, we need a criterion to deal with partial hexagons. Consider treating the hexagons not as the population boundary, but rather as a tool to assign plots to the boundary.

      For example, intersect the hexagons with the “countries” polygon, and any hexagon that is within or touching the country boundary gets a plot randomly assigned to it. If the plot happens to fall in the ocean for a given coastal hexagon, it is not in the sample; it is only in the sample if it falls on the piece of the hexagon that is within the boundary. That way, you are not over- or under-sampling coastlines. Each piece of land along the coastline has some probability of having a plot fall in it, just like an internal hexagon: the same chance of being chosen.
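
      A minimal sketch of the rule described here, assuming shapely geometries for the hexagon and the country (the function and inputs are illustrative, not part of the FRA workflow):

        import random
        from shapely.geometry import Point, Polygon

        def assign_plot(hexagon: Polygon, country: Polygon, rng=random):
            """Drop one random plot centre in the hexagon; keep it only if it
            falls on the part of the hexagon inside the country boundary."""
            minx, miny, maxx, maxy = hexagon.bounds
            while True:  # rejection-sample a uniform point inside the hexagon
                p = Point(rng.uniform(minx, maxx), rng.uniform(miny, maxy))
                if hexagon.contains(p):
                    return p if country.contains(p) else None  # None = not in sample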


      The target population was the entire tessellation of the world, divided into equal-area hexagons of 39.6 ha; countries were not considered at any stage of the stratification. As for the coastline issue, you are right that some land area could have been excluded. The criterion applied was that all hexagons with more than 10% land (the minimum detectable area with our scheme) were included, so some hexagons that touch only a small corner of land, less than 10%, were excluded. The big problem behind this is the vector file of the coastline. We used a combination of GAUL country borders, Hansen and GEZ data, but I still think a globally accurate coastline is missing, or could be improved.
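
      A one-function sketch of that inclusion criterion, again assuming shapely polygons (illustrative only, not the actual FRA code):

        from shapely.geometry import Polygon

        def include_hexagon(hexagon: Polygon, land: Polygon, min_share: float = 0.10) -> bool:
            """Keep a hexagon if more than 10% of its area intersects land."""
            return hexagon.intersection(land).area / hexagon.area > min_share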


      Could you have obtained the same precision as with the hexagons by using more, smaller centroid plots? Doing just centroid plots would have made the overall inventory cheaper.


      The same design with more plots would have generated more costs, because it implies more clicks. Smaller plots also interfere with the FRA forest definition, considering that one of its parameters is a minimum size, which must be 0.5 ha or more. The 1 ha centroid and the majority criterion allow us to detect this minimum unit. If instead you mean a grid of subpoints, then points alone tend to confuse tree cover with forest land use, or at least it is more difficult to derive one of these two possibilities directly from them.
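
      A minimal sketch of how a majority criterion on a 1 ha plot implies a 0.5 ha minimum detectable unit (the subpoint labelling scheme is our illustrative assumption, not the exact FRA protocol):

        def plot_is_forest(subpoint_labels: list[str]) -> bool:
            """Forest if at least half of the 1 ha plot is forest land use,
            i.e. at least 0.5 ha -- matching the FRA minimum-area parameter."""
            n_forest = sum(label == "forest" for label in subpoint_labels)
            return n_forest >= len(subpoint_labels) / 2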


      This is the crux of the whole centroid vs hexagon dichotomy – the scale of the phenomenon being measured.

      If you carefully measure a large plot, you will get better precision. But it might take a long time to label a big plot, while in the meantime you could have done 10 little plots.

      Another issue to consider is the likelihood of over-generalizing – i.e., small patches on the landscape are very likely to be overlooked when the interpreter’s eye is dominated by the larger, more abundant patches. This could lead to an underestimate of the area of those classes that occur in smaller patches.

      The scale of the phenomenon varies broadly at the global level: for instance, deforestation driven by livestock grazing in South America, subsistence agriculture in the DRC, or palm oil in Malaysia. Originally we kept the hexagons with the intention of better capturing the fragmented patterns in Africa and Central America. In the end, the variance of the same estimator was in all cases lower at the hexagon level than at the centroid level, so we think that for the shared parameters (forest area, forest area change, OWL and other lands) the hexagons performed better.

      Regarding the time that the hexagon takes, it is a bit difficult to separate the time spent on each unit, mainly because of the technique we asked the experts to apply. The idea was first to gain a landscape understanding of the sample, in space and time; this is the most time-consuming task. After the interpreter has identified the trends and characteristics of the region, our measurements showed that the centroid took about half the time and the hexagon the other half.
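
      A hedged sketch of the comparison behind that claim: given per-unit interpreted forest shares, the standard error of the estimated mean can be compared between hexagon-level and centroid-level labels (the arrays below are placeholders, not FRA 2020 RSS data):

        import numpy as np

        def se_of_mean(shares: np.ndarray) -> float:
            """Standard error of the mean forest share across sample units."""
            return shares.std(ddof=1) / np.sqrt(shares.size)

        hexagon_shares  = np.array([0.80, 0.75, 0.10, 0.00, 0.55])  # placeholder
        centroid_shares = np.array([1.00, 1.00, 0.00, 0.00, 1.00])  # placeholder
        print(se_of_mean(hexagon_shares), se_of_mean(centroid_shares))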


      We understand it is hard to abandon the hexagon idea, but there are many advantages to small, cheaper plots. You can do more of them, and the scale of the plot is closer to the scale of the phenomenon being measured.

      We are ready and open to change/adjust the design as much as needed. During the next months we will be testing all the alternatives that could later be implemented. At the same time, I have to say that the evidence we presented, that a larger sample-unit size captures our target parameters (forest area / forest area changes) more efficiently, could indicate that keeping the 40 ha sample unit is a good idea. It would be very important to have a deeper discussion with you about this topic, considering that you think the opposite. If you also include in the analysis the interpreter-subjectivity component of the variance, which we found can be between 10 and 50 times the sampling-error variance, the results suggest that rather than collecting more samples, it could be better to have fewer samples, better assessed. An open and important discussion about this is ongoing.
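
      A back-of-the-envelope sketch of that trade-off, assuming the interpreter component behaves like a systematic error that extra plots do not average away (the parameterisation and numbers are ours, purely illustrative):

        def total_variance(n: int, s2_sampling: float, interp_ratio: float,
                           quality: float = 1.0) -> float:
            """Sampling variance shrinks with n; the interpreter component does
            not. quality < 1 models better-assessed plots (hypothetical)."""
            return s2_sampling / n + interp_ratio * s2_sampling * quality

        # many plots, standard assessment vs. half the plots, better assessed
        print(total_variance(n=400, s2_sampling=1.0, interp_ratio=10.0))
        print(total_variance(n=200, s2_sampling=1.0, interp_ratio=10.0, quality=0.4))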


      Reducing the sample size by half and reallocating some samples by creating combinations of the old and some new strata seems somewhat arbitrary. Consider the following alternative approach:

      With the existing strata and results per stratum, calculate the sample variance for the area of forest loss. You can then calculate a weighted variance and a weighted mean.

      From that, calculate a global standard deviation of the sample (the square root of the global variance of the sample), and use it in an equation like: global required n = (global CV% * t / E%)^2, where CV is the global SD / global mean (as a %), t is the t-value (about 2), and E is the % sampling error you are seeking (for example, +/- 10% of the estimate). That is the required global n you will need to allocate. Then, use Neyman allocation to distribute that number to the strata (taking into account per-stratum sample SDs and areas), and compare the required n per stratum to what you had in 2020 to see if you need more or fewer plots per stratum (a sketch of this calculation follows after these steps).

      Construct your new strata (such that there is equal probability of selection within each new stratum, through the intersecting scheme described) and repeat, to see if mixing and matching the strata yields a lower required n.
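
      A sketch of the calculation outlined above, with illustrative inputs (the stratum weights, means and SDs are hypothetical, and the area-weighted variance form is one reasonable reading of the description):

        import numpy as np

        def required_global_n(weights, means, sds, t: float = 2.0, e_pct: float = 10.0) -> int:
            """n = (global CV% * t / E%)^2, with area-weighted mean and SD."""
            w = np.asarray(weights, float) / np.sum(weights)
            global_mean = np.sum(w * np.asarray(means, float))
            global_sd = np.sqrt(np.sum(w * np.asarray(sds, float) ** 2))
            cv_pct = 100.0 * global_sd / global_mean
            return int(np.ceil((cv_pct * t / e_pct) ** 2))

        def neyman_allocation(n: int, weights, sds):
            """Allocate n to strata proportionally to W_h * S_h (Neyman)."""
            w = np.asarray(weights, float) / np.sum(weights)
            share = w * np.asarray(sds, float)
            return np.round(n * share / share.sum()).astype(int)

        # hypothetical three-stratum example: area shares, loss means, loss SDs
        areas, means, sds = [0.5, 0.3, 0.2], [0.02, 0.05, 0.10], [0.05, 0.10, 0.20]
        n = required_global_n(areas, means, sds)
        print(n, neyman_allocation(n, areas, sds))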


      The number of samples we will collect in the next FRA cycle is still not defined, and we are open to discussing and evaluating alternatives. This was just a simulation, considering only the sampling error in the optimization.

      I agree that combining new and old strata will generate problems and looks arbitrary. On the other hand, we must try to explore statistically valid alternatives in order to reuse the work already done. This is very important from a logistical/collaboration point of view, as we have to approach the same national experts, and starting from scratch would not sound very reasonable to them.

      What you propose is interesting; we have been thinking of something similar, so we will certainly test it and share.