Knowledge reference for national forest assessments - modeling for estimation and monitoring
7. Model error contribution to total error
Methods of estimating the precision of inventory estimates depend on the sampling design used to collect the data. These methods generally assume that the individual observations are measured without error. Model-based estimates such as volume, biomass, and carbon, however, also carry model error. There are therefore three main sources of error: measurement error, model error, and sampling error. For derived variables such as volume, biomass, or carbon content, sample-based precision estimates should consequently be regarded as underestimating the variance or, equivalently, as implying confidence intervals that are too narrow. Similarly, methods of estimating the sample size required to achieve a desired level of precision will indicate fewer samples than are really needed unless model error is taken into account in addition to sampling error. For more information see sampling design.
Inventory models are never perfect. The discrepancy between the actual (unknown) value (YA) and the value predicted by a model (YP) is called the model error (EP). In equation form, this becomes:
YA = YP + EP
This simple equation also implies that the variance of a series of predicted values is less than or equal to the variance of the actual values. Provided the model errors are uncorrelated with the predictions, as they are for the fitting data of a least-squares model, Var(YA) = Var(YP) + Var(EP), so equality holds only for perfect models with no error variance. For example, if we predict the volume of the trees in a plot from a suitable volume equation, the calculated variance of the volume predictions will be less than the actual variance of the tree volumes in the plot. Consequently, the standard error of a predicted mean volume for a plot will be biased downwards. The variance of the prediction errors must be included to obtain an unbiased estimate of the total error.
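The relationship between the three variances can be checked numerically. The following is a minimal sketch in Python with NumPy, not a prescribed inventory method: the data are simulated, the predictor and coefficients are hypothetical, and the "volume equation" is assumed to be a straight line fitted by ordinary least squares with an intercept. Under those assumptions the variance of the actual values splits exactly into the variance of the predictions plus the variance of the errors on the fitting data.

```python
import numpy as np


def variance_decomposition(x, y_actual):
    """Fit y = b0 + b1 * x by ordinary least squares and return the
    sample variances of the actual values, the predictions, and the errors."""
    x = np.asarray(x, dtype=float)
    y_actual = np.asarray(y_actual, dtype=float)
    # np.polyfit returns coefficients from the highest power down.
    b1, b0 = np.polyfit(x, y_actual, deg=1)
    y_pred = b0 + b1 * x
    err = y_actual - y_pred  # EP = YA - YP
    return {
        "var_actual": y_actual.var(ddof=1),
        "var_pred": y_pred.var(ddof=1),
        "var_error": err.var(ddof=1),
    }


# Simulated trees (assumption): x is a hypothetical size predictor and
# y is the "actual" volume, generated with random noise around a line.
rng = np.random.default_rng(42)
x = rng.uniform(0.5, 3.0, size=200)
y = 0.05 + 0.4 * x + rng.normal(0.0, 0.1, size=200)

parts = variance_decomposition(x, y)
print(parts)
```

The variance of the predictions alone falls short of the variance of the actual values by exactly the error variance, which is the amount omitted when predictions are treated as error-free observations.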
The variance of prediction errors may be substantially larger than the residual variance obtained during model fitting, especially when the means and covariances of the input variables differ from those of the data used for model fitting. Applying the model outside its recommended application domain risks serious additional underreporting of error.
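For the simplest case, the growth of prediction error away from the fitting data can be illustrated with the textbook prediction variance for a straight-line regression with one predictor: s^2 * (1 + 1/n + (x0 - mean(x))^2 / Sxx), where s^2 is the residual variance, n is the number of fitting observations, and Sxx is the sum of squared deviations of the fitting predictor values. The sketch below assumes independent errors with constant variance; the function name, the diameters, and the residual variance are hypothetical values chosen for illustration, and real inventory models with several inputs would need the multivariate equivalent.

```python
import numpy as np


def prediction_variance(x_fit, residual_var, x_new):
    """Variance of the prediction error for a new observation at x_new,
    for simple linear regression with independent, constant-variance errors."""
    x_fit = np.asarray(x_fit, dtype=float)
    x_new = np.asarray(x_new, dtype=float)
    n = x_fit.size
    x_bar = x_fit.mean()
    sxx = np.sum((x_fit - x_bar) ** 2)
    return residual_var * (1.0 + 1.0 / n + (x_new - x_bar) ** 2 / sxx)


# Hypothetical fitting data: tree diameters (cm) used to fit the model.
x_fit = np.array([10.0, 15.0, 20.0, 25.0, 30.0])
s2 = 0.04  # hypothetical residual variance from model fitting

inside = prediction_variance(x_fit, s2, 20.0)   # at the mean of the fitting data
outside = prediction_variance(x_fit, s2, 60.0)  # far outside the fitted range
print(inside, outside)
```

Even at the mean of the fitting data the prediction variance exceeds the residual variance, and in this example it is several times larger for a tree well beyond the fitted diameter range.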