3. Sample size

Determination of sample size is one of the most important steps in constructing a sample design. If the sample is too small, uncertainty will be great; if it is too large, the cost will be unnecessarily high. It is possible to quantify the expected confidence in future estimates made from a valid probability sample. As the number of sample plots increases, the variance of the estimation error decreases, the precision of the estimate increases, and more confidence can be placed in the estimate. Usually, the exact value of the estimate is known, but the true condition of the forest is not. With probability samples, the probability that an estimate is within a specified distance of the true value can be determined. These are the roles of the "confidence interval," an estimated range of proportions that is likely to include the true, but unknown, proportion of forest, and the "confidence coefficient," the probability that the confidence interval will contain the true proportion of forest.

The simplest case is estimating proportions with a simple random sample; for example, estimating the proportion of a nation that is forested. Suppose an NFA covers a sampled population of 5 million ha and that, in a simple random sample of n = 1 000 plots, 400 are forested. The estimated proportion of forest is 40%, but what level of confidence can be placed in this estimate? Suppose a confidence coefficient of 80% is acceptable. This means that for 80 out of 100 samples, the true, but unknown, percentage of forest is within the confidence interval. From available tables and figures (Czaplewski 2003), with n = 1 000 and an estimate of 40% forest, the confidence interval is 38.0% to 42.0%. As another example, assume a rare forest type exists in the population, but the exact amount is not known. None of this rare forest was observed in the simple random sample of n = 1 000 plots, so the estimated percentage of the nation in this rare forest condition is 0%. For the same 80% confidence coefficient, the confidence interval for this estimate is 0.0% to 0.2%. Thus, the estimate of the area of this rare forest type in the entire 5 million ha nation is 0 to 10 000 ha. The final example is a 100 000 ha municipality that contains n = 20 of the 1 000 plots, of which 18 are forested. The estimate for this municipality is 90% forest cover, with a confidence interval of 75.5% to 97.3%, or 75 500 to 97 300 ha. Other calculations of sample sizes are possible with interactive "sample size calculators" that are available on the Internet. These examples demonstrate that precise national estimates for common types of forest cover are possible with relatively few sample plots. However, larger sample sizes are often needed if the NFA requires estimates for rare forest types or small portions of the nation. It is the sample size that determines the precision of estimates in an NFA, not the size of the entire sampled population.
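The three intervals above can be approximated without tables. The sketch below assumes the tabulated values come from an "exact" binomial (Clopper-Pearson) interval; the text does not state which method the tables in Czaplewski (2003) use, so this is an assumption, although it gives values close to the quoted figures. The function names are illustrative only.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with 0 < p < 1 (log-space terms)."""
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)

def exact_interval(x, n, confidence):
    """Two-sided Clopper-Pearson interval for x forested plots out of n.

    Assumption: equal tail probability (1 - confidence) / 2 on each side.
    Each bound is found by bisection on the binomial distribution.
    """
    tail = (1.0 - confidence) / 2.0

    # Lower bound: the p at which P(X >= x) equals the tail probability.
    if x == 0:
        lower = 0.0
    else:
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2.0
            if 1.0 - binom_cdf(x - 1, n, mid) < tail:
                lo = mid   # P(X >= x) increases with p, so move up
            else:
                hi = mid
        lower = (lo + hi) / 2.0

    # Upper bound: the p at which P(X <= x) equals the tail probability.
    if x == n:
        upper = 1.0
    else:
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2.0
            if binom_cdf(x, n, mid) > tail:
                lo = mid   # P(X <= x) decreases with p, so move up
            else:
                hi = mid
        upper = (lo + hi) / 2.0

    return lower, upper

# The three cases from the text: (forested plots, sample size).
for x, n in [(400, 1000), (0, 1000), (18, 20)]:
    low, high = exact_interval(x, n, 0.80)
    print(f"{x}/{n}: {100 * low:.1f}% to {100 * high:.1f}%")
```

Multiplying each bound by the area of the sampled population (5 million ha for the nation, 100 000 ha for the municipality) converts the interval from a proportion to hectares.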

Determining the required sample size requires an estimate of the standard deviation of the differences between individual plot-level values and their average value. This standard deviation may be estimated with a pilot study or inventory that measures a small sample of forest plots to determine the variability among them. For example, assume the pilot inventory includes 60 plots, and wood volume is measured on each plot. Further, suppose that the mean volume is 100 m3/ha, the variance among plots is 2 500 m6/ha2, and the standard deviation is 50 m3/ha. If observations from the pilot plots are normally distributed, about 1/6th of the 60 plots will have 100-50=50 m3/ha or less, and another 1/6th will have 100+50=150 m3/ha or more. Assume the precision requirement for the NFA is to estimate mean wood volume per hectare to within a ±5% "tolerance" or "maximum allowable difference" (Dmax=0.05) with a 66% confidence coefficient. The required sample size n is approximately 100 sample plots.
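The figure of about 100 plots is consistent with the common large-sample formula n ≈ (z · CV / Dmax)², where CV is the standard deviation divided by the mean and z is the standard normal quantile for the chosen confidence coefficient. The formula actually used by the source is not reproduced in this section, so the sketch below is an assumption, as are the function names. It also assumes z is rounded to 1 for a 66% confidence coefficient; the unrounded quantile is slightly below 1 and gives a somewhat smaller n.

```python
import math
from statistics import NormalDist

def z_for_confidence(confidence):
    """Two-sided standard normal quantile for a confidence coefficient in (0, 1)."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)

def required_sample_size(mean, sd, tolerance, z):
    """Plots needed so the estimated mean is within +/- tolerance of the true mean.

    tolerance is a fraction of the mean (0.05 for +/-5%).
    Assumes a simple random sample and a normal approximation.
    """
    cv = sd / mean                  # coefficient of variation from the pilot
    n = (z * cv / tolerance) ** 2
    return math.ceil(n - 1e-9)      # round up; the epsilon absorbs float noise

# Pilot values from the text: mean 100 m3/ha, standard deviation 50 m3/ha.
n_rounded_z = required_sample_size(100.0, 50.0, 0.05, z=1.0)
n_exact_z = required_sample_size(100.0, 50.0, 0.05, z=z_for_confidence(0.66))
print(n_rounded_z, n_exact_z)
```

Passing z explicitly keeps the rounding choice visible: the same function can be called with a rounded quantile, as the text appears to do, or with the value returned by `z_for_confidence`.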

If this NFA precision requirement is for the entire nation, then 100 sample plots are sufficient. If the same precision requirement applies to each of 10 sub-national units, then a total of 1 000 sample plots is necessary. Sample sizes increase greatly as the acceptable tolerance becomes smaller. In this example, a tolerance of ±1% would require the sample size to increase from n = 100 to n = 2 500 sample plots (Equation 15). The required sample size also increases for larger confidence coefficients. For example, it takes four times as many sample plots to raise the confidence coefficient from 66% to the 95% level at the same tolerance. More exact and detailed calculations of required sample sizes are possible with the interactive "sample size calculators" that are available on the Internet: http://calculators.stat.ucla.edu/.
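The scaling described above can be checked by hand. The worked arithmetic below assumes the common large-sample formula, which may differ in notation from Equation 15 (not reproduced in this section), and uses the rounded quantiles z ≈ 1 for a 66% confidence coefficient and z ≈ 2 for 95%. The symbols CV, z and D_max follow that assumption rather than the source.

```latex
n \approx \left(\frac{z \cdot \mathrm{CV}}{D_{\max}}\right)^{2},
\qquad \mathrm{CV} = \frac{50}{100} = 0.5

% Tolerance of +/-5% at 66% confidence (z = 1):
n \approx \left(\frac{1 \times 0.5}{0.05}\right)^{2} = 100

% Tolerance of +/-1% at 66% confidence (z = 1):
n \approx \left(\frac{1 \times 0.5}{0.01}\right)^{2} = 2\,500

% Tolerance of +/-5% at 95% confidence (z = 2):
n \approx \left(\frac{2 \times 0.5}{0.05}\right)^{2} = 400

% Ten sub-national units, each at +/-5% and 66% confidence:
10 \times 100 = 1\,000
```

Because n grows with the square of z / D_max, cutting the tolerance to one fifth multiplies the sample size by 25, and doubling z multiplies it by 4.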