Concurrently with the descriptive analysis of clinical or epidemiological information or data, mathematical modelling has been advocated to provide assistance in developing a dose-response relationship, in particular when extrapolation to low doses is necessary. Mathematical models have been used for several decades in the field of toxicology. In the field of water and food microbiology, it is currently recognized that mathematical models may facilitate the dose-response assessment exercise, and provide useful information while accounting for variability and uncertainty. The assumptions on which current models are based, their use and possible limitations are carefully considered in the following sections.
The focus of these sections is on infectious and toxico-infectious pathogens, as this has been the area of most development. Some attention is given to other pathogens at the end of the chapter.
The biological basis for dose-response models derives from major steps in the disease process as they result from the interactions between the pathogen, the host and the matrix. Figure 4 illustrates the major steps in the overall process, with each step being composed of many biological events. Infection and illness can be seen as resulting from the pathogen successfully passing multiple barriers in the host. These barriers are not all equally effective in eliminating or inactivating pathogens and may have a range of effects, depending on the pathogen and the individual. Each individual pathogen has some particular probability to overcome a barrier, which is conditional on the previous step(s) being completed successfully. The disease process as a whole and each of the component steps may vary by pathogen and by host. Pathogens and hosts can be grouped with regard to one or more components, but this should be done cautiously and transparently.
Figure 4. The major steps in the foodborne infectious disease process.
A dose-response model describes the probability of a specified response from exposure to a specified pathogen in a specified population, as a function of the dose. This function is based on empirical data, and will usually be given in the form of a mathematical relationship. The use of mathematical models is needed because:
contamination of food and water usually occurs with low numbers or under exceptional circumstances; the occurrence of effects cannot usually be measured by observational methods in the dose range needed, and hence models are needed to extrapolate from high doses or frequent events to actual exposure situations;
pathogens in food and water are usually not randomly dispersed but appear in distinct clumps or clusters, which must be taken into account when estimating health risks; and
experimental group sizes are limited, and models are needed, even in well controlled experiments, to distinguish random variation from true biological effects.
Plots of empirical datasets relating the response of a group of exposed individuals to the dose (often expressed as a logarithm) frequently show a sigmoid shape, and can be fitted by a large number of mathematical functions. However, when extrapolating outside the region of observed data, these models may predict widely differing results (cf. Coleman and Marks, 1998; Holcomb et al., 1999). It is therefore necessary to select between the many possible dose-response functions. In setting out to generate a dose-response model, the biological aspects of the pathogen-host-matrix interaction should be considered carefully. The model functions derived from this conceptual information should then be treated as a priori information. For more details, see Section 6.2.
In general, biologically plausible dose-response models for microbial pathogens should consider the discrete (particulate) nature of organisms and should be based on the concept of infection from one or more "survivors" from an initial dose. Before proceeding, however, it is necessary to carefully consider the concept of "dose".
The concentration of pathogens in the inoculum is usually analysed by some microbiological, biochemical, chemical or physical method. Ideally, such methods would have 100% sensitivity and specificity for the target organism, but this is rarely the case. Therefore it may be necessary to correct the measured concentration for the sensitivity and specificity of the measurement method to provide a realistic estimate of the number of viable, infectious agents. The result may be greater or smaller than the measured concentration. Note that, in general, the measurement methods used to characterize the inoculum in a data set used for dose-response modelling will differ from the methods used to characterize exposure in a risk assessment model. These differences need to be accounted for in the risk assessment.
Multiplying the concentration of pathogens in the inoculum by the volume ingested, the mean number of pathogens ingested by a large group of individuals can be calculated. The actual number ingested by any exposed individual is not equal to this mean, but is a variable number that can be characterized by a probability distribution. It is commonly assumed that the pathogens are randomly distributed in the inoculum, but this is rarely the case. Compound distribution (or over-dispersion) may result from two different mechanisms:
A "unit" as detected by the measurement process (e.g. a colony-forming unit (CFU), a tissue culture infectious dose, or a Polymerase Chain Reaction (PCR)-detectable unit) may, due to aggregation, consist of more than one viable, infectious particle. This is commonly observed for viruses, but may also be the case for other pathogens. The degree of clumping strongly depends on the methods used for preparing the inoculum.
In a well-homogenized liquid suspension, unit doses will be more or less randomly distributed. If the inoculum consists of a solid or semisolid food matrix, however, spatial clustering may occur and result in over-dispersion of the inoculum. This aspect may differ between the data underlying the dose-response model and the actual exposure scenario.
The Poisson distribution is generally used to characterize the variability of the individual doses when pathogens are randomly distributed. Microorganisms have a tendency to aggregate in aqueous suspensions. In such cases, the number of "units" counted is not equal to the number of infectious particles but to the number of aggregates containing one or more infectious particles. In such cases, it is important to know whether the aggregates remain intact during inoculum preparation or in the gastrointestinal tract. Also, different levels of aggregation in experimental samples and in actual water or food products need to be accounted for.
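The effect of dispersion on individual doses can be illustrated numerically. The sketch below (a minimal Python illustration with hypothetical parameter values) compares the probability of ingesting at least one organism under a Poisson (random) distribution and under a negative binomial (clustered) distribution with the same mean dose:

```python
import math

def p_nonzero_poisson(mean_dose):
    """Probability of ingesting at least one organism when organisms
    are randomly (Poisson) distributed over portions."""
    return 1.0 - math.exp(-mean_dose)

def p_nonzero_negbinom(mean_dose, k):
    """The same probability under a negative binomial (clustered)
    distribution; a smaller dispersion parameter k means stronger
    clustering."""
    return 1.0 - (1.0 + mean_dose / k) ** (-k)

# At the same mean dose, aggregation concentrates organisms in fewer
# portions, so fewer individuals ingest any organism at all.
print(p_nonzero_poisson(1.0))        # ~0.63
print(p_nonzero_negbinom(1.0, 0.5))  # ~0.42
```

This is why different levels of aggregation in experimental inocula and in actual food or water products must be accounted for: at identical mean doses, the distribution of organisms over individuals differs.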
Each individual organism in the ingested dose is assumed to have a distinct probability of surviving all barriers to reach a target site for colonization. The relation between the actual number of surviving organisms (the effective dose) and the probability of colonization of the host is a key concept in the derivation of dose-response models, as will be discussed later.
Infection is most commonly defined as a situation in which the pathogen, after ingestion and surviving all barriers, actively grows at its target site (Last, 1995). Infection may be measured by different methods, such as faecal excretion or immunological response. Apparent infection rates may differ from actual infection rates, depending on the sensitivity and specificity of the diagnostic assays. Infection is usually measured as a quantal response (presence or absence of infection by some criterion). The use of continuous-response variables (e.g. an antibody titre) may be useful for further development of dose-response models. Infections may be asymptomatic, where the host does not develop any adverse reactions to the infection, and clears the pathogens within a limited period of time, but infection may also lead to symptomatic illness.
Microbial pathogens have a wide range of virulence factors, and may elicit a wide spectrum of adverse responses, which may be acute, chronic or intermittent. In general, disease symptoms may result from either the action of toxins or damage to the host tissue. Toxins may have been preformed in the food or water matrix ("intoxication") or may be produced in vivo by microorganisms in the gut ("toxico-infection"), and may operate by different pathogenic mechanisms (e.g. Granum, Tomas and Alouf, 1995). Tissue damage may also result from a wide range of mechanisms, including destruction of host cells, invasion and inflammatory responses. For many foodborne pathogens, the precise pathogenic sequence of events is unknown, and is likely to be complex. Note that health risks of toxins in water (e.g. cyanobacterial toxins) usually relate to repeated exposures, and these require another approach, which resembles hazard characterization of chemicals.
Illness can basically be considered as a process of cumulative damage to the host, leading to adverse reactions. There are usually many different and simultaneous signs and symptoms of illness in any individual, and the severity of symptoms varies among pathogens and among hosts infected with the same pathogen. Illness is therefore a process that is best measured on a multidimensional, quantitative, continuous scale (number of stools passed per day, body temperature, laboratory measurements, etc.). In contrast, in risk assessment studies, illness is usually interpreted as a quantal response (presence or absence of illness), implying that the results depend strongly on the case definition. A wide variety of case definitions for gastrointestinal illness are used in the literature, based on a variable list of symptoms, with or without a specified time window, and sometimes including laboratory confirmation of etiological agents. This lack of standardization severely hampers integration of data from different sources.
In a small fraction of ill persons, chronic infection or sequelae may occur. Some pathogens, such as Salmonella enterica serotype Typhi, are invasive and may cause bacteraemia and generalized infections. Other pathogens produce toxins that may result not only in enteric disease but also in severe damage in susceptible organs. An example is haemolytic uraemic syndrome, caused by damage to the kidneys from Shiga-like toxins of some Escherichia coli strains. Complications may also arise by immune-mediated reactions: the immune response to the pathogen is then also directed against the host tissues. Reactive arthritis (including Reiter's syndrome) and Guillain-Barré syndrome are well known examples of such diseases. The complications from gastroenteritis normally require medical care, and frequently result in hospitalization. There may be a substantial risk of mortality in relation to sequelae, and patients who survive may not recover fully, suffering from residual symptoms that may last a lifetime. Therefore, despite the low probability of complications, the public health burden may be significant. Also, there is a direct risk of mortality related to acute disease, in particular in the elderly, neonates and severely immunocompromised.
Several key concepts are required for the formulation of biologically plausible dose-response models. These relate to:
threshold vs non-threshold mechanisms;
independent vs synergistic action; and
the particulate nature of the inoculum.
Each of these concepts will be discussed below in relation to the different stages of the infection and disease process. Ideally, the dose-response models should represent the following series of conditional events: the probability of infection given exposure; the probability of acute illness given infection; and the probability of sequelae or mortality given acute illness.
In reality, however, the necessary data and concepts are not yet available for this approach. Therefore models are also discussed that directly quantify the probability of illness or mortality given exposure.
The traditional interpretation of dose-response information was to assume the existence of a threshold level of pathogens that must be ingested in order for the microorganism to produce infection or disease. A threshold exists if there is no effect below some exposure level, but above that level the effect is certain to occur. Attempts to define the numerical value of such thresholds in test populations have typically been unsuccessful, although the concept is widely referred to in the literature as the "minimal infectious dose".
An alternative hypothesis is that, due to the potential for microorganisms to multiply within the host, infection may result from the survival of a single, viable, infectious pathogenic organism ("single-hit concept"). This implies that, no matter how low the dose, there is always a non-zero (though possibly very small) probability of infection and illness, at least in a mathematical sense. Obviously, this probability increases with the dose.
Note that the existence or absence of a threshold, at both the individual and population levels, cannot be demonstrated experimentally. Experimental data are always subject to an observational threshold (the experimental detection limit): an infinitely small response cannot be observed. Therefore, the question of whether a minimal infectious dose truly exists or merely results from the limitations of the data tends to be academic. A practical solution is to fit dose-response models that have no threshold (no mathematical discontinuity), but are flexible enough to allow for strong curvature at low doses so as to mimic a threshold-like dose-response.
The probability of illness given infection depends on the degree of host damage that results in the development of clinical symptoms. For such mechanisms, it seems to be reasonable to assume that the pathogens that have developed in vivo must exceed a certain minimum number. A non-linear relation may be enforced because the interaction between pathogens may depend on their numbers in vivo, and high numbers are required to switch on virulence genes (e.g. density dependent quorum-sensing effects). This concept, however, is distinct from a threshold for administered dose, because of the possibility, however small, that a single ingested organism may survive the multiple barriers in the gut to become established and reproduce.
The hypothesis of independent action postulates that the mean probability p per inoculated pathogen to cause (or help cause) an infection (symptomatic or fatal) is independent of the number of pathogens inoculated, and for a partially resistant host it is less than unity. In contrast, the hypotheses of maximum and of partial synergism postulate that inoculated pathogens cooperate so that the value of p increases as the size of the dose increases (Meynell and Stocker, 1957). Several experimental studies have attempted to test these hypotheses and the results have generally been consistent with the hypothesis of independent action (for a review, see Rubin, 1987).
Quorum sensing is a new area of research that is clearly of importance in relation to the virulence of some bacteria. It means that some phenotypic characteristics, such as specific virulence genes, are not expressed constitutively but are cell-density dependent, using a variety of small molecules for cell-to-cell signalling, and are only expressed once a bacterial population has reached a certain density (De Kievit and Iglewski, 2000). While the biology of quorum sensing and response is still being explored, the potential implication is clear: some virulence factors may only be expressed once the bacterial population reaches a certain size. The role of quorum sensing in the early stages of the infectious process has not been investigated in detail, and no conclusion can be drawn about the significance of quorum sensing in relation to the hypothesis of independent action. In particular, the role of interspecies and intraspecies communication is an important aspect. Sperandio et al. (1999) have demonstrated that intestinal colonization by enteropathogenic E. coli could be induced by quorum sensing of signals produced by non-pathogenic E. coli of the normal intestinal flora.
Specific properties in the data become meaningful only within the context of a model. Different models may, however, lead to different interpretations of the same data, and so a rational basis for model selection is needed. Different criteria may be applied when selecting mathematical models. For any model to be acceptable, it should satisfy the statistical criteria for goodness of fit. However, many different models will usually fit a given data set (for example, see Holcomb et al., 1999) and therefore goodness of fit is not a sufficient criterion for model selection. Additional criteria that might be used are conservativeness and flexibility.
Conservativeness can be approached in many different ways: "Is the model structure conservative?" "Are parameter estimates conservative?" "Are specific properties of the model conservative?" and so forth. It is not recommended to build conservativeness into the model structure itself. From a risk assessment perspective, a model should be restricted to describing the data and trying to discriminate the biological signal from the noise. Adding parameters usually improves the goodness of fit of a model, but using a flexible model with many parameters may result in greater uncertainty of estimates, especially for extrapolated doses. Flexible models and sparse datasets may lead to overestimation of the uncertainty, while a model based on strong assumptions may be too restrictive and lead to underestimation of the uncertainty in risk estimates.
It is recommended that dose-response models be developed based on a set of biologically plausible, mechanistic assumptions, and then to perform statistical analysis with those models that are considered plausible. Note that it is generally not possible to "work back", i.e. to deduce the assumptions underlying a given model formula. There is a problem of identifiability: the same functional form may result from different assumptions, while two (or more) different functional forms (based on different assumptions) may describe the same dose-response data equally well. This may result either in very different fitted curves if the data contain little information, or virtually the same curves if the data contain strong information. However, even in the latter case, the model extrapolations may be very different. This means that a choice between different models or assumptions cannot be made on the basis of data alone.
The foregoing considerations lead us to the working hypothesis that, for microbial pathogens, dose-infection models based on the concepts of single-hit and independent action are regarded as scientifically most plausible and defensible. When the discrete nature of pathogens is also taken into account, these concepts lead to the single-hit family of models, as detailed in Box 1.
The single-hit models are a specific set of models in a broader class of mechanistic models. Haas, Rose and Gerba (1999) describe models that assume the existence of thresholds - whether constant or variable - for infection, i.e. some minimum number of surviving organisms larger than 1 is required for the infection to occur. Empirical (or tolerance distribution) models, such as the log-logistic, log-probit and Weibull(-Gamma) models, have also been proposed for dose-response modelling. The use of these alternative models is often motivated by the intuitive argument that single-hit models overestimate risks at low doses.
Currently, infection-illness models have received little attention and data available are extremely limited. Experimental observations show that the probability of acute illness among infected subjects may increase with ingested dose, but a decrease has also been found (Teunis, Nagelkerke and Haas, 1999), and often the data do not allow conclusions about dose dependence, because of the small numbers involved. Given this situation, constant probability (i.e. independent of the ingested dose) models, possibly stratified for subgroups in the population with different susceptibilities, seem to be a reasonable default. Together with ingested dose, illness models should take into account the information available on incubation times, duration of illness and timing of immune response, and should preferably measure illness as a multidimensional concept on continuous scales. There is no basis yet to model the probability of illness as a function of the numbers of pathogens that have developed in the host.
Box 1. Hit-theory models

Consider a host that ingests exactly one cell of a pathogenic microorganism. According to the single-hit hypothesis, the probability that this pathogen will survive all barriers and colonize the host has a non-zero value p_{m}. Thus, the probability of the host not being infected is 1 - p_{m}. If a second cell of the pathogen is ingested, and the hypothesis of independent action is valid, then the probability of the host not being infected is (1 - p_{m})^{2}. For n pathogens, the probability of not being infected is (1 - p_{m})^{n}. Hence, the probability of infection of a host that ingests exactly n pathogens can be expressed as:

P_{inf}(n) = 1 - (1 - p_{m})^{n}

Starting from this basic function, a broad family of dose-response models (hit-theory models) can be derived. The most frequently used models are the exponential and the Beta-Poisson models, which are based on further assumptions on the distribution of pathogens in the inoculum, and on the value of p_{m}. When the distribution of the organisms in the inoculum is assumed to be random, and characterized by a Poisson distribution, it can be shown (e.g. Teunis and Havelaar, 2000) that the probability of infection as a function of the dose is given by:

P_{inf}(D) = 1 - exp(-p_{m}D)

where D is the mean ingested dose. If p_{m} is assumed to have a constant value r for any given host and any given pathogen, the simple exponential model results:

P_{inf}(D) = 1 - exp(-rD)

When rD << 1, this formula is approximated by:

P_{inf}(D) ≈ rD

If the probability of starting an infection differs for any organism in any host, and is assumed to follow a beta-distribution with parameters a and b, then:

P_{inf}(D) = 1 - 1F1(a, a+b, -D)

where 1F1 is the Kummer confluent hypergeometric function. For b >> 1 and b >> a, the Kummer confluent hypergeometric function is approximately equal to the Beta-Poisson formula:

P_{inf}(D) ≈ 1 - (1 + D/b)^{-a}

When D << b, this formula is approximated by P_{inf}(D) ≈ (a/b)D. For both a → ∞ and b → ∞, while a/b remains constant, the Beta-Poisson formula converts into the exponential model. Other assumptions for n or p_{m} lead to other models. For example, spatial clustering of cells in the inoculum can be represented by a negative binomial distribution or any other contagious distribution.
However, this has little effect on the shape of the dose-response relationship (Haas, Rose and Gerba, 1999), although the limiting curve for the confidence interval is affected (Teunis and Havelaar, 2000). It is also possible to model p_{m} as a function of covariables, such as immune status or age.
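The exponential and approximate Beta-Poisson models of Box 1 can be sketched in a few lines of code (a minimal Python illustration; all parameter values are hypothetical). It also demonstrates numerically that when both Beta-Poisson parameters grow large while their ratio is held constant, the Beta-Poisson formula converges to the exponential model:

```python
import math

def p_inf_exponential(dose, r):
    """Exponential single-hit model: P_inf(D) = 1 - exp(-r*D)."""
    return 1.0 - math.exp(-r * dose)

def p_inf_beta_poisson(dose, alpha, beta):
    """Approximate Beta-Poisson model: P_inf(D) = 1 - (1 + D/beta)**(-alpha),
    valid for beta >> 1 and beta >> alpha."""
    return 1.0 - (1.0 + dose / beta) ** (-alpha)

# With alpha and beta both large while alpha/beta is held at r,
# the Beta-Poisson formula approaches the exponential model.
r = 0.01
print(p_inf_exponential(100.0, r))            # ~0.632
print(p_inf_beta_poisson(100.0, 1e5, 1e7))    # ~0.632 (alpha/beta = 0.01)
```

This limiting behaviour corresponds to a beta-distribution that collapses onto a single point, i.e. a constant per-organism probability of infection.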
The default assumption of constant-probability models for illness given infection leads to the conclusion that the only difference between dose-infection and dose-illness models is that the dose-illness models reach an asymptote not of 1 but of P(ill|inf). They would essentially still belong to the family of hit-theory models.
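Under this default assumption, a dose-illness model is simply a hit-theory dose-infection model scaled by the constant P(ill|inf). A minimal sketch, using the exponential model and hypothetical parameter values:

```python
import math

def p_ill(dose, r, p_ill_given_inf):
    """Dose-illness model under the constant-probability default:
    an exponential dose-infection model scaled by P(ill | inf).
    The curve reaches an asymptote of p_ill_given_inf, not 1."""
    return p_ill_given_inf * (1.0 - math.exp(-r * dose))

# Hypothetical values: at a very high dose the probability of illness
# approaches the asymptote P(ill | inf) = 0.3, not 1.
print(p_ill(1e6, 0.01, 0.3))   # ~0.3
```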
Given illness, the probability of sequelae or mortality, or both, depends of course on the characteristics of the pathogen, but more importantly on the characteristics of the host. Sequelae or mortality are usually rare events that affect specific subpopulations. These may be identified by factors such as age or immune status, but increasingly genetic factors are being recognized as important determinants. As above, the current possibilities are mainly restricted to constant probability models. Stratification appears to be necessary in almost all cases where an acceptable description of risk grouping is available.
Dose-response information is usually obtained in the range where the probability of observable effects is relatively high. In experimental studies using human or animal subjects, this is related to financial, ethical and logistical restrictions on group size. In observational studies, such as outbreak studies, low dose effects can potentially be observed directly, but in these studies only major effects can be distinguished from background variation. Because risk assessment models often include scenarios with low dose exposures, it is usually necessary to extrapolate beyond the range of observed data. Mathematical models are indispensable tools for such extrapolations, and many different functional forms have been applied. Selection of models for extrapolation should primarily be driven by biological considerations, and only subsequently by the available data and their quality. The working hypotheses of no-threshold and independent action lead to a family of models that is characterized by linear low dose extrapolations on the log/log scale, or even on the arithmetic scale. That is, in the low dose range, the probability of infection or disease increases linearly with the dose. On the log-log scale, these models have a slope of 1 at low doses. Some examples include:
The exponential model: P = r.D
The Beta-Poisson model: P = (a/b).D
The hypergeometric model: P = {a/(a+b)}.D
where D = mean ingested dose and r, a and b are model parameters. Note that if a > b, the risk of infection predicted by the Beta-Poisson model is larger than the probability of ingesting one or more organisms (approximately D at low doses), which is not biologically plausible. This highlights the need to carefully evaluate the appropriateness of using this simplified model for analysing dose-response data.
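The low-dose linearity of these models can be verified numerically. For the exponential model (with a hypothetical r), the ratio of risk to dose approaches the constant r as the dose decreases, which is the slope-1 behaviour on a log-log plot:

```python
import math

def p_exponential(dose, r=0.001):
    """Exponential dose-infection model with a hypothetical r."""
    return 1.0 - math.exp(-r * dose)

# At low doses the risk is linear in dose: P/D approaches r,
# i.e. the dose-response curve has slope 1 on a log-log plot.
for d in (1.0, 0.1, 0.01):
    print(d, p_exponential(d) / d)
```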
Experimental datasets are usually obtained under carefully controlled conditions, and the data apply to a specific combination of pathogen, host and matrix. In actual exposure situations, there is more variability in each of these factors, and dose-response models need to be generalized. Assessing such variability requires the use of multiple datasets that capture the diversity of human populations, pathogen strains and matrices. Failure to take such variation into account may lead to underestimation of the actual uncertainty of risks.
When developing dose-response models from multiple datasets, one should use all of the data that is pertinent. There is currently no way of determining which data source is best. This requires that the risk assessor make choices. Such choices should be based on objective scientific arguments to the maximum possible extent, but will inevitably include subjective arguments. Such arguments should be discussed with the risk manager and their significance and impact for risk management considered. The credibility of dose-response models increases significantly if dose-response relations derived from different data sources are consistent, especially if the data are of varying types.
When combining data from different sources, a common scale on both axes is needed. This often requires adjusting the reported data to make them comparable. For dose, test sensitivity, test specificity, sample size, etc., need to be taken into account. For response, a consistent case definition is needed or the reported response needs to be adjusted to a common denominator (e.g. infection × conditional probability of illness given infection). Combining data from different sources within a single (multilevel) dose-response model requires thorough statistical skills and detailed insight into the biological processes that generated the data. An example is the multilevel dose-response model that has been developed for different isolates of Cryptosporidium parvum (Teunis et al., 2002a). The issue of combining data from different outbreak studies is discussed in the FAO/WHO risk assessments of Salmonella in eggs and broiler chickens (FAO/WHO, 2002a).
Dose-response relations where an agent only affects a portion of the population may require that subpopulation to be separated from the general population in order to generate meaningful results. Using such stratified dose-response models in actual risk assessment studies requires that the percentage of the population that is actually susceptible can be estimated. Consideration of such subpopulations appears to be particularly important when attempting to develop dose-response relations for serious infections or mortality. However, it would also be pertinent when considering an agent for which only a portion of the population can become infected.
Stratified analysis can also be useful when dealing with seemingly outlying results, which may actually indicate a subpopulation with a different response. Removal of one or more outliers corresponds to removing (or separately analysing) the complete group from which the outlying results originated. Where a specific reason for the separation cannot be identified, there should be a bias toward being inclusive in relation to the data considered. Any elimination of the data should be clearly communicated to ensure the transparency of the assessment.
A particular and highly relevant aspect of microbial dose-response models is the development of specific immunity in the host. Most volunteer experiments have been conducted with test subjects selected for absence of any previous contact with the pathogen, usually demonstrated by absence of specific antibodies. The actual population exposed to foodborne and waterborne pathogens will usually be a mixture of totally naive persons and persons with varying degrees of protective immunity. No general statements can be made on the impact of these factors. This is strongly dependent on the pathogen and the host population. Some pathogens, such as the agents of many childhood diseases and the hepatitis A virus, will confer lifelong immunity upon first infection, whether clinical or subclinical, whereas immunity to other pathogens may wane within a few months to a few years, or may be evaded by antigenic drift. At the same time, exposure to non-pathogenic strains may also protect against virulent variants. This principle is the basis for vaccination, but has also been demonstrated for natural exposure, e.g. to non-pathogenic strains of Listeria monocytogenes (Notermans et al., 1998). The degree to which the population is protected by immunity depends to a large extent on the general hygienic situation. In many developing countries, large parts of the population have built up high levels of immunity, and this is thought to be responsible for lower incidence or less serious forms of illness. Some examples are the predominantly watery form of diarrhoea caused by Campylobacter spp. infections in children and the lack of illness from this organism in young adults in developing countries. The apparent lack of E. coli O157:H7-related illness in Mexico has been explained as the result of cross-immunity following infections with other E. coli, such as enteropathogenic E. coli strains that are common there.
In contrast, in the industrialized world, contact with enteropathogens is less frequent and a larger part of the population is susceptible. Obviously, age is an important factor in this respect.
Incorporating the effect of immunity in dose-response models has as yet received little attention. Failure to account for immunity in dose-response models may complicate interpretation, and comparisons among places. This is particularly likely to be a problem with common infections such as Campylobacter spp., Salmonella spp. and E. coli. Immunity may affect the probability of infection, the probability of illness given infection, or the severity of illness. There are currently only a few data available on which to base model development. Where such data are available, a simple and possibly effective option would be to resort to stratified analysis and divide the population into groups with different susceptibility (see, for example, FDA/USDA/CDC, 2001). Recently, experimental work on infection of volunteers having different levels of acquired immunity to Cryptosporidium parvum was analysed with a dose-response model that includes the effects of immunity (Teunis et al., 2002b).
First and foremost, like other parts of the risk assessment process, model-fitting procedures should be reported clearly and unambiguously, for transparency and to allow reproduction.
Likelihood-based methods are preferable. The approach taken depends on the kinds of data that are available, and the presumed stochastic variation present. For instance, for binary data, model fitting should be performed by writing down the appropriate binomial likelihood function. For a dose-response function P(D; \theta) with parameter vector \theta, the likelihood function for a set of observations is:

\[ L(\theta) = \prod_{i} \binom{n_i}{k_i} \, P(D_i; \theta)^{k_i} \, \left[ 1 - P(D_i; \theta) \right]^{n_i - k_i} \]
where the product is taken over all dose groups, with index i. At dose D_{i}, a number n_{i} of subjects is exposed, and k_{i} are infected. Fitting consists of finding parameter values that maximize this function, hence the term maximum likelihood parameter values. Optimization may require special care, since many dose-response models are essentially non-linear. Most technical mathematics systems, such as Matlab, Mathematica or Gauss, and statistical systems, such as SAS, S-Plus or R, provide procedures for non-linear optimization.
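As an illustration, the maximization can be sketched with a general-purpose numerical optimizer. The example below uses the one-parameter exponential dose-response model, P(D) = 1 - exp(-rD); the doses, numbers of exposed subjects and numbers of infected subjects are hypothetical values invented for illustration only:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom

# Hypothetical feeding-trial data (for illustration only):
doses = np.array([1e2, 1e3, 1e4, 1e5])   # administered doses D_i
n = np.array([10, 10, 10, 10])           # subjects exposed, n_i
k = np.array([1, 3, 7, 9])               # subjects infected, k_i

def p_exp(dose, r):
    """Exponential dose-response model: P(infection) = 1 - exp(-r * dose)."""
    return 1.0 - np.exp(-r * dose)

def neg_log_lik(theta):
    # Optimize on the log scale so that r stays positive
    r = np.exp(theta[0])
    p = p_exp(doses, r)
    # Binomial log-likelihood, summed over dose groups
    return -np.sum(binom.logpmf(k, n, p))

fit = minimize(neg_log_lik, x0=[np.log(1e-4)], method="Nelder-Mead")
r_hat = np.exp(fit.x[0])
print(f"maximum likelihood estimate of r: {r_hat:.2e}")
```

For a two-parameter model such as the beta-Poisson, the same structure applies with a two-element parameter vector; optimizing transformed (log-scale) parameters often improves the stability of the non-linear search.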
Haas (1983) and Haas, Rose and Gerba (1999) provide specific technical information on how to fit dose-response models. A general overview can be found in any textbook on mathematical statistics, such as Hogg and Craig (1994). McCullagh and Nelder (1989) is the definitive source for the statistical methods involved, and many dose-response models can be written as generalized linear models (but not the exact single-hit model - see Teunis and Havelaar, 2000). Vose (2000) is a valuable resource for a general description of mathematical and statistical methods in risk assessment.
When the likelihood function of a model is available, model testing can be done by calculating likelihood ratios. Goodness of fit may be assessed against a likelihood supremum: the likelihood of a saturated model, with as many degrees of freedom as there are data (i.e. dose groups). For instance, for binary responses, the likelihood supremum may be calculated by inserting the ratios of positive responses to numbers of exposed subjects into the binomial likelihood (McCullagh and Nelder, 1989):

\[ L_{\sup} = \prod_{i} \binom{n_i}{k_i} \left( \frac{k_i}{n_i} \right)^{k_i} \left( 1 - \frac{k_i}{n_i} \right)^{n_i - k_i} \]
The deviance, -2 × the difference between the log-likelihood of the fitted model and the log-likelihood supremum, is approximately distributed as a chi-square variate, with degrees of freedom equal to the number of dose groups minus the number of model parameters.
The same method can be used for model ranking. To compare two models, one starts by calculating maximum likelihoods for both models, and then determining their deviance (-2 × the difference in log-likelihoods). This deviance can now be tested against chi-square with degrees of freedom equal to the difference in numbers of parameters of the two models, at the desired level of significance (Hogg and Craig, 1994).
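Both calculations can be sketched as follows, again with hypothetical data. The exponential model (one parameter) is compared with the approximate beta-Poisson model (two parameters); the exponential model is the limiting case of the beta-Poisson, so the two may be treated as nested for the likelihood ratio test:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom, chi2

# Hypothetical feeding-trial data (for illustration only):
doses = np.array([1e2, 1e3, 1e4, 1e5])
n = np.array([10, 10, 10, 10])
k = np.array([1, 3, 7, 9])

def nll_exp(theta):
    # Exponential model: P = 1 - exp(-r D), r = exp(theta[0])
    p = 1.0 - np.exp(-np.exp(theta[0]) * doses)
    return -np.sum(binom.logpmf(k, n, p))

def nll_bp(theta):
    # Approximate beta-Poisson model: P = 1 - (1 + D/beta)^(-alpha)
    alpha, beta = np.exp(theta)
    p = 1.0 - (1.0 + doses / beta) ** (-alpha)
    return -np.sum(binom.logpmf(k, n, p))

ll_exp = -minimize(nll_exp, [np.log(1e-4)], method="Nelder-Mead").fun
ll_bp = -minimize(nll_bp, [0.0, np.log(1e3)], method="Nelder-Mead").fun

# Likelihood supremum: observed fractions inserted into the binomial likelihood
ll_sup = np.sum(binom.logpmf(k, n, k / n))

# Goodness of fit of the exponential model against the saturated model
dev_exp = 2.0 * (ll_sup - ll_exp)
p_gof = chi2.sf(dev_exp, df=len(doses) - 1)   # 4 dose groups - 1 parameter

# Model ranking: deviance between the two fitted models, 1 df difference
dev_cmp = 2.0 * (ll_bp - ll_exp)
p_cmp = chi2.sf(dev_cmp, df=1)
print(f"GOF p-value (exponential): {p_gof:.3f}; LRT p-value: {p_cmp:.3f}")
```

A small goodness-of-fit p-value indicates that the model is inconsistent with the data; a small likelihood ratio p-value indicates that the extra parameter of the beta-Poisson model significantly improves the fit.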
The chi-square approximation is only asymptotically correct, i.e. for large samples. In addition, the likelihood ratio test is only valid for models that are hierarchically nested, meaning that the more general model can be converted to the less general one by parameter manipulation. Model complexity may be addressed by using an information criterion, such as Akaike's Information Criterion (AIC), instead of the likelihood ratio. This penalizes parameter abundance, balancing goodness of fit against parameter parsimony (i.e. the minimum number of parameters necessary).
Bayesian methods are more generally valid, allowing comparison among any set of models, not only nested ones. Goodness of fit can be compared with Bayes factors, and there is a corresponding information criterion: the Bayesian Information Criterion (BIC) (Carlin and Louis, 1996).
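The two criteria are simple functions of the maximized log-likelihood and the number of parameters; a minimal sketch follows, in which the log-likelihood values are hypothetical, chosen only to illustrate the trade-off:

```python
import math

def aic(loglik, n_params):
    """Akaike's Information Criterion: -2 log L + 2 p."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion: -2 log L + p log N."""
    return -2.0 * loglik + n_params * math.log(n_obs)

# Hypothetical maximized log-likelihoods: exponential model (1 parameter)
# vs beta-Poisson (2 parameters), fitted to data on 40 exposed subjects
aic_exp, aic_bp = aic(-18.2, 1), aic(-15.9, 2)
bic_exp, bic_bp = bic(-18.2, 1, 40), bic(-15.9, 2, 40)
print(aic_exp, aic_bp, bic_exp, bic_bp)   # lower values indicate the preferred model
```

Note that BIC penalizes extra parameters more heavily than AIC once the number of observations exceeds about seven, so the two criteria may rank models differently.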
Determination of parameter uncertainty is indispensable. Categories of methods that can be applied include:
Likelihood-based methods. Since the deviance is approximately chi-square distributed, the log-likelihood function can be used to construct confidence intervals for individual parameters. When there is more than one parameter, however, the resulting uncertainty in the dose-response relation cannot be calculated in a straightforward manner (Haas, Rose and Gerba, 1999).
Bootstrapping. Bootstrapping involves the generation of replicate data sets by means of re-sampling (Efron and Tibshirani, 1993). For instance, for binary data, replicates can be generated by random sampling from a binomial distribution at each dose, with the number of trials equal to the number of exposed subjects and the success probability equal to the observed fraction of infected subjects (Haas, Rose and Gerba, 1999; Medema et al., 1996). The model is then fitted to each of these replicate data sets, producing a random sample of parameter estimates, one for each replicate. These may subsequently be employed to construct a confidence range for the dose-response relation, or to assess the uncertainty in the response at any given dose.
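The parametric bootstrap described here can be sketched as follows, using the exponential model and hypothetical trial data; each replicate resamples the counts from a binomial distribution and refits the model:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom

rng = np.random.default_rng(1)

# Hypothetical feeding-trial data (for illustration only):
doses = np.array([1e2, 1e3, 1e4, 1e5])
n = np.array([10, 10, 10, 10])
k = np.array([1, 3, 7, 9])
p_obs = k / n   # observed fractions infected

def fit_r(k_rep):
    """ML estimate of r for the exponential model, given replicate counts."""
    def nll(theta):
        p = 1.0 - np.exp(-np.exp(theta[0]) * doses)
        return -np.sum(binom.logpmf(k_rep, n, p))
    res = minimize(nll, [np.log(1e-4)], method="Nelder-Mead")
    return np.exp(res.x[0])

# Generate 200 replicate data sets: Binomial(n_i, k_i/n_i) at each dose,
# and refit the model to each replicate
r_boot = np.array([fit_r(rng.binomial(n, p_obs)) for _ in range(200)])

# 95% uncertainty interval for the probability of infection at, e.g., dose 500
p_500 = 1.0 - np.exp(-r_boot * 500.0)
lo, hi = np.percentile(p_500, [2.5, 97.5])
print(f"P(infection) at dose 500: 95% interval ({lo:.3f}, {hi:.3f})")
```

In practice, more replicates (typically 1000 or more) would be used, and the interval can be evaluated over a grid of doses to trace out a confidence band around the fitted dose-response curve.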
Markov chain Monte Carlo (MCMC) methods. MCMC methods, such as adaptive rejection sampling, are a powerful and efficient means of sampling from posterior distributions, especially when models with many parameters need to be analysed. Working within a Bayesian framework avoids many of the implicit assumptions that restrict the validity of classical likelihood methods: for instance, most data sets used for dose-response analysis are very small, containing only a few dose groups with a few exposed subjects each, so that large-sample approximations may be unreliable. MCMC methods are therefore rapidly becoming more common, and current interest has also increased the availability of ready-to-use tools (Gilks, Richardson and Spiegelhalter, 1996).
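A minimal MCMC sketch for the exponential model is given below. It uses a random-walk Metropolis sampler rather than adaptive rejection sampling, a deliberately vague prior, and hypothetical data; in practice, the ready-to-use tools referred to above would normally be preferred:

```python
import numpy as np
from scipy.stats import binom, norm

rng = np.random.default_rng(2)

# Hypothetical feeding-trial data (for illustration only):
doses = np.array([1e2, 1e3, 1e4, 1e5])
n = np.array([10, 10, 10, 10])
k = np.array([1, 3, 7, 9])

def log_post(log_r):
    """Log-posterior for the exponential model, with a vague normal prior on log r."""
    p = 1.0 - np.exp(-np.exp(log_r) * doses)
    return np.sum(binom.logpmf(k, n, p)) + norm.logpdf(log_r, loc=np.log(1e-4), scale=10.0)

# Random-walk Metropolis sampler on log r
samples = []
x = np.log(1e-4)
lp = log_post(x)
for _ in range(5000):
    prop = x + rng.normal(scale=0.5)          # symmetric proposal
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:  # Metropolis acceptance step
        x, lp = prop, lp_prop
    samples.append(x)

r_post = np.exp(np.array(samples[1000:]))     # discard burn-in
print(f"posterior median of r: {np.median(r_post):.2e}")
```

The posterior sample can be propagated directly through the dose-response function to yield uncertainty intervals for the probability of infection at any dose, which is one reason the Bayesian approach combines naturally with probabilistic exposure assessment.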
Most dose-response analyses for pathogenic microorganisms to date have only considered binary responses (infected or not; ill or not). Since, in such a context, each dose group may contain a mixture of responses, analysis of the heterogeneity in the response (segregation of uncertainty and variation) is not possible. Modelling infection as the amount of pathogens excreted, or elevation of one or more immune variables, or combinations of these, provides better opportunities for addressing heterogeneity within the host population and the pathogen population, and their segregation.