6. Judging model quality

A model summarizes a conceptual relationship between a dependent variable and one or more predictors. Models are mostly used to predict the value of the dependent variable for an unobserved entity from available predictors. The volume, biomass and carbon equations given above provide examples. A model may be formulated through subject knowledge, adopted from other studies, or suggested by apparent trends in observed data.

In forest inventory and the biological sciences, data exhibit a large amount of natural variation, and models are limited to predicting the expected value of the dependent variable given the input data. The quality of a model is therefore judged by its ability to provide unbiased estimates of these expectations and by the precision of its predictions.

When fitting a model to data, a comparison of the values predicted by the model with the actual values of the dependent variable provides an initial assessment of model quality. It is generally desirable for departures from model predictions (residuals) to average to zero for any input and to be distributed tightly around the predicted values.
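The residual check described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names, the class breaks and the diameter and volume figures are all hypothetical. It computes the overall mean residual, the residual standard deviation, and the mean residual within classes of a predictor, so that a bias confined to part of the input range becomes visible.

```python
from statistics import mean, stdev

def residuals(observed, predicted):
    """Residual = observed value minus model prediction."""
    return [o - p for o, p in zip(observed, predicted)]

def mean_residual_by_class(x, res, breaks):
    """Mean residual within predictor classes defined by ascending breaks.

    Class (lo, hi) holds the observations with lo <= x < hi.
    Empty classes are omitted from the result.
    """
    out = {}
    for lo, hi in zip(breaks[:-1], breaks[1:]):
        in_class = [r for xi, r in zip(x, res) if lo <= xi < hi]
        if in_class:
            out[(lo, hi)] = mean(in_class)
    return out

# Hypothetical data: diameter (cm), observed and predicted volume (m^3)
dbh = [12, 15, 22, 27, 33, 38]
obs = [0.08, 0.13, 0.35, 0.52, 0.90, 1.20]
pred = [0.07, 0.14, 0.33, 0.55, 0.85, 1.25]

res = residuals(obs, pred)
bias = mean(res)      # should be close to zero
spread = stdev(res)   # smaller means tighter scatter around predictions
by_class = mean_residual_by_class(dbh, res, [10, 20, 30, 40])

print(f"mean residual: {bias:.4f}, residual s.d.: {spread:.4f}")
for (lo, hi), m in by_class.items():
    print(f"diameter {lo}-{hi} cm: mean residual {m:.4f}")
```

A mean residual near zero overall but clearly non-zero within one diameter class would suggest that the model is biased for that part of the range.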

The quality of a model for prediction purposes is assessed by comparing its predictions for new (unseen) observations, not used in model development, with the actual values of those observations. Common criteria for assessing model quality include a t-test of the hypothesis of a zero mean prediction error, the absolute deviation, the sign test for non-random trends in errors along a gradient of predictor values, normality of the prediction errors or residuals, linear regression analysis of the errors, and homogeneity of error variances across the range of input. Reynolds (1984) provides a basic approach that can be used in model evaluation. Vanclay and Skovsgaard (1997) provide a brief overview and an operational framework for judging model quality.
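Several of these criteria can be computed directly from a set of prediction errors on validation data. The sketch below is illustrative only and uses hypothetical function names and data; it is not the procedure of Reynolds (1984) or Vanclay and Skovsgaard (1997). It assumes errors are defined as observed minus predicted. The t statistic must still be compared with a t distribution with n - 1 degrees of freedom (from tables or a statistics library), and the sign test shown is the plain version that checks only whether positive and negative errors are balanced, not their order along a predictor gradient.

```python
from math import comb, sqrt
from statistics import mean, stdev

def t_statistic_zero_mean(errors):
    """t statistic for H0: mean prediction error is zero (n - 1 d.f.)."""
    n = len(errors)
    return mean(errors) / (stdev(errors) / sqrt(n))

def mean_absolute_deviation(errors):
    """Average size of the prediction errors, ignoring sign."""
    return mean(abs(e) for e in errors)

def sign_test_p_value(errors):
    """Exact two-sided sign test p-value.

    H0: positive and negative errors are equally likely.
    Errors equal to zero are dropped, as is conventional.
    """
    pos = sum(1 for e in errors if e > 0)
    neg = sum(1 for e in errors if e < 0)
    n = pos + neg
    k = min(pos, neg)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def error_slope(x, errors):
    """Least-squares slope of errors regressed on a predictor.

    A slope near zero indicates no linear trend in the errors.
    """
    mx, me = mean(x), mean(errors)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxe = sum((xi - mx) * (e - me) for xi, e in zip(x, errors))
    return sxe / sxx

# Hypothetical validation data: predictor values and prediction errors
x_new = [10, 15, 20, 25, 30, 35]
errors = [1, -1, 2, -2, 1, 1]

print("t statistic:", round(t_statistic_zero_mean(errors), 3))
print("mean absolute deviation:", round(mean_absolute_deviation(errors), 3))
print("sign test p-value:", sign_test_p_value(errors))
print("slope of errors on predictor:", round(error_slope(x_new, errors), 4))
```

No single criterion is sufficient on its own: a model can pass the test of zero mean error while still showing a clear trend in errors along the predictor.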

When applying a general model, such as the volume and biomass equations given earlier, or a model developed for a given species in a different geographic area, it is important to assess model quality before application. This may require the collection of new field data, or it may be possible to use existing data for this purpose. Failure to assess model quality forces the user to make an untested, implicit assumption that the model is appropriate for the species and geographic area to which it is applied, which may or may not be true. Users of models should always keep in mind that a model may generate unusual predictions. It is the user's responsibility to check model assumptions and model predictions.