Model validation can be defined as demonstrating the accuracy of the model for a specified use. Within this context, accuracy is the absence of systematic and random error; in metrology, the corresponding properties are commonly known as trueness and precision, respectively. All models are, by their nature, incomplete representations of the system they are intended to model but, in spite of this limitation, models can be useful. General information on working with mathematical models can be found in various theoretical and applied textbooks. Doucet and Sloep (1992) give a very good introduction to model testing. These authors distinguish between model confirmation (i.e. the model is shown to be worthy of our belief; plausible) and model verification (i.e. the model is shown to be true). McCullagh and Nelder's book on linear models (1989) is a valuable resource on statistical modelling methods. It describes some general principles of applying mathematical models and underlines three principles for the modeller:
1. All models are wrong, but some are more useful than others.
2. Do not fall in love with one model to the exclusion of others.
3. Thoroughly check the fit of a model to the data.
In addition, Law and Kelton (2000), in addressing the issue of building valid, credible and appropriately detailed simulation models, consider techniques for increasing model validity and credibility. It is worth noting, however, that some models cannot be fully validated as a whole, although their components or modules can be validated individually. Dee (1995) has identified four major aspects associated with model validation, as follows:
1. Conceptual validation
2. Validation of algorithms
3. Validation of software code
4. Functional validation
These are described below. The issue of validation is further addressed in the FAO/WHO guidelines for Exposure assessment of microbiological hazards in foods and Risk characterization of microbiological hazards in foods.
Conceptual validation concerns the question of whether the model accurately represents the system under study. Is the simplification of the underlying biological process into model steps realistic, i.e. are the model assumptions credible? Usually, conceptual validation is largely qualitative and is best tested against the opinion of experts with different scientific backgrounds. Models with different conceptual bases can also be compared quantitatively, either within a Bayesian framework using Bayes factors, or by means of an information criterion. Experimental or observational data in support of the principles and assumptions should be presented and discussed. The modelling concepts described in Chapter 6 are a minimum set of assumptions representing the consensus opinion of a broad group of experts who contributed to these guidelines. These are based on mechanistic reasoning, and are supported by some experimental evidence. As such, they are considered to be currently the best basis for dose-response modelling studies.
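As an illustration of comparing candidate models with an information criterion, the following minimal sketch computes the Akaike information criterion (AIC) and Akaike weights for two models. The model names, the maximized log-likelihood values and the parameter counts are hypothetical placeholders chosen only for illustration; they are not results from these guidelines. AIC is one possible criterion, and the guidelines do not prescribe a particular one.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Relative support for each model, normalized to sum to 1."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


# HYPOTHETICAL maximized log-likelihoods and parameter counts,
# as if two dose-response models had been fitted to the same dataset.
candidates = {
    "exponential": (-15.2, 1),
    "beta_poisson": (-12.9, 2),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
weights = dict(zip(scores, akaike_weights(list(scores.values()))))

for name in scores:
    print(f"{name}: AIC = {scores[name]:.1f}, weight = {weights[name]:.2f}")
```

A comparison of this kind only ranks the candidate models against each other on the same data; it does not show that any of them is conceptually valid.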
Algorithm validation concerns the translation of model concepts into mathematical formulae. It addresses questions such as: Do the model equations represent the conceptual model? Under which conditions can simplifying assumptions be justified? What effect does the choice of numerical methods for solving the model have on the results? Do different methods of solving the model give results that agree? For microbiological dose-response models, these questions relate both to the adequacy of the models (discussed in Section 6.3) for describing the infection and illness processes, and to the various choices outlined here. Is it valid to assume a constant probability of infection for each pathogen in each host? Can the approximate Beta-Poisson model be used or is it necessary to use the exact hypergeometric model? A powerful method to evaluate the effects of numerical procedures is to compare the results of different methods used to estimate parameter uncertainty, for example by overlaying parameter samples obtained by Monte Carlo or bootstrap procedures on likelihood-based confidence intervals. Graphical representation of the results can be useful, but must be used with care.
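The question of whether the approximate Beta-Poisson model is adequate can be examined numerically. The sketch below assumes the usual single-hit formulation, in which the exact model is P = 1 - 1F1(alpha; alpha + beta; -D), with 1F1 the Kummer confluent hypergeometric function, and the approximation is P = 1 - (1 + D/beta)^(-alpha). The function names, the series-based evaluation and the parameter values are illustrative assumptions, not part of these guidelines. The series is only intended for moderate doses (roughly D below a few hundred), because exp(-D) underflows for very large D.

```python
import math


def beta_poisson_approx(dose, alpha, beta):
    """Approximate Beta-Poisson: 1 - (1 + D/beta)^(-alpha)."""
    return 1.0 - (1.0 + dose / beta) ** (-alpha)


def beta_poisson_exact(dose, alpha, beta, tol=1e-16, max_terms=100000):
    """Exact model: 1 - 1F1(alpha; alpha + beta; -D).

    Uses Kummer's transformation
        1F1(alpha; alpha+beta; -D) = exp(-D) * 1F1(beta; alpha+beta; D)
    so that every term of the power series is positive (no cancellation).
    Intended for moderate doses only.
    """
    term = 1.0   # k = 0 term of the series
    total = 1.0
    for k in range(max_terms):
        # ratio of successive terms of 1F1(beta; alpha+beta; D)
        term *= (beta + k) / (alpha + beta + k) * dose / (k + 1)
        total += term
        if k > dose and term < tol * total:
            break
    return 1.0 - math.exp(-dose) * total


# Illustrative parameter sets (hypothetical values).
for alpha, beta, dose in [(0.2, 50.0, 10.0), (0.5, 0.01, 0.1)]:
    exact = beta_poisson_exact(dose, alpha, beta)
    approx = beta_poisson_approx(dose, alpha, beta)
    print(f"alpha={alpha}, beta={beta}, D={dose}: "
          f"exact={exact:.5f}, approx={approx:.5f}")
```

With beta much larger than both alpha and 1, the two curves nearly coincide. With a small beta, the approximation can exceed 1 - exp(-D), the risk that would apply if every ingested organism were certain to cause infection, which indicates that the exact model is needed in that parameter region.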
Software code validation concerns the implementation of mathematical formulae in computer language. Good programming practice (i.e. modular and fully documented code) is an essential prerequisite. Specific points for attention are the possible effects of machine precision and software-specific factors on the model output. Internal error reports from the software, and evaluation of intermediate output, are important sources of information. For dose-response models, it is advisable to check the results of a new software implementation against previously published results.
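The effect of machine precision can be illustrated with the exponential dose-response model, P = 1 - exp(-r D), taken here as an assumed example. When r D is very small, a direct implementation subtracts two nearly equal numbers and returns zero, whereas an implementation based on the standard library function math.expm1 retains the small risk. The function names and parameter values below are hypothetical.

```python
import math


def exponential_risk_naive(dose, r):
    """Direct translation of P = 1 - exp(-r * D)."""
    return 1.0 - math.exp(-r * dose)


def exponential_risk_stable(dose, r):
    """Same formula, written with expm1 to avoid cancellation
    when r * D is close to zero."""
    return -math.expm1(-r * dose)


# Very low dose and per-organism infection probability (hypothetical values).
low = (1.0, 1e-18)
print(exponential_risk_naive(*low))   # cancellation: the small risk is lost
print(exponential_risk_stable(*low))  # small risk is preserved

# At moderate r * D the two implementations agree.
mid = (100.0, 0.01)
print(exponential_risk_naive(*mid), exponential_risk_stable(*mid))
```

Comparing two implementations of the same formula in this way, and inspecting intermediate output at the extremes of the dose range, is one simple means of detecting precision problems before results are reported.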
Functional validation concerns checking the model against independently obtained observations. Ideally, it is carried out by obtaining pertinent real-world data and performing a statistical comparison of simulated outcomes and observations. This requires more detailed information than is usually available. It may be possible to compare results from risk assessment studies with independently obtained epidemiological estimates of disease incidence. Such data cannot validate the dose-response model per se, but may produce valuable insights. Most studies to date have considered a range check of estimated risks against observed incidences to be sufficient "validation" of the model. However, because risk estimates are estimated probabilities, they can be used in a likelihood function to perform a more formal test of adequacy.
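One way to make such a formal test concrete is a binomial likelihood-ratio (deviance) comparison between the risk predicted by the model and the attack rate observed in an independent dataset. The sketch below is a simplified illustration: the counts and predicted risks are hypothetical, the predicted risk is treated as a fixed number (its uncertainty is ignored), and the threshold 3.84 is the 95th percentile of the chi-square distribution with one degree of freedom, which is only an asymptotic reference.

```python
import math


def _xlogy(x, y):
    """x * ln(y), with the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def binomial_log_likelihood(cases, exposed, p):
    """Binomial log-likelihood without the constant binomial coefficient.
    Requires 0 < p < 1 unless the corresponding count is zero."""
    return _xlogy(cases, p) + _xlogy(exposed - cases, 1.0 - p)


def deviance(cases, exposed, predicted_risk):
    """Twice the log-likelihood difference between the saturated model
    (risk = observed attack rate) and the model prediction."""
    observed_rate = cases / exposed
    ll_saturated = binomial_log_likelihood(cases, exposed, observed_rate)
    ll_model = binomial_log_likelihood(cases, exposed, predicted_risk)
    return 2.0 * (ll_saturated - ll_model)


CHI2_95_1DF = 3.84  # approximate 95th percentile, chi-square with 1 d.f.

# HYPOTHETICAL observation: 12 cases among 1000 exposed persons.
cases, exposed = 12, 1000
for predicted in (0.010, 0.030):
    d = deviance(cases, exposed, predicted)
    verdict = "consistent" if d < CHI2_95_1DF else "inconsistent"
    print(f"predicted risk {predicted}: deviance = {d:.2f} ({verdict})")
```

A fuller analysis would propagate the uncertainty in the predicted risk, for example by averaging the likelihood over Monte Carlo samples of the model parameters, rather than using a single point estimate.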
Credibility of results can also be established by demonstrating that dose-response relationships computed from different data sources are consistent. For example, a dose-response relationship developed from a human feeding study may be validated against outbreak data or data from national surveillance of foodborne diseases. When making such comparisons, differences in hosts, pathogens and matrices must be accounted for. Thus, different sources of data may be useful either for model validation or for providing a better basis for generalization.
The process used to develop the results can improve the credibility of hazard characterization results. Peer and public review of results is a fundamental part of the process. Interdisciplinary interaction is essential to the process of risk assessment, and should be extended to the review process. Experts in the biological processes involved should review the basic concepts and underlying assumptions used in a hazard characterization. Furthermore, statistical experts should review the data analysis and presentation of model fitting results. A critical factor in obtaining a good peer review is to provide an accessible explanation of the mathematics, such that non-mathematicians can work their way through the concepts and details of the assessment.
Critical evaluation of a hazard characterization process is a demanding task that requires highly specialized experts. Therefore, adequate resources for the peer review process should be made available as an integral part of the project plan. The results of the peer review process should be accessible to all interested parties, including a statement on how comments were incorporated in the final version of the document and, if relevant, reasons why specific comments were not accepted.
The public review process serves two main purposes. First, it allows all stakeholders in a hazard characterization to critically review the assumptions made, and their effect on the risk assessment results. Second, it allows for evaluation of the completeness of the descriptive information and datasets used for the hazard characterization.