Fish Disease Group, Department of Microbiology,
National University of Ireland, Galway, Galway City, Ireland.
Quality assurance and validation
Quality assurance. For diagnostic assays of any type, quality assurance strives to ensure that the results are repeatable and reproducible i.e. that the assay has an acceptable level of precision (Welac Working Group, 1993). The precision of results is assured by stipulation of the source of reagents and disposable materials, acceptability limits for instrument calibration, the generation of standard operating procedures (SOPs) for the performance of the assay and the incorporation of appropriate positive and negative controls into the performance of the assay. These controls should, if appropriately chosen, function as both safeguard and check against deviation in the performance of the assay that might be generated as a consequence of inhibition and contamination, sample to sample variation and variability arising from the use of the technique by different operators or laboratories. Standardisation also requires description of other parameters of the assay including specification of the sample type to be analysed and how that sample should be collected, stored and processed. It is assumed, for the purpose of this paper, that all the facets of quality assurance of molecular-based detection techniques would already be in place prior to the application of the technique in the field.
Validation. Quality assurance is clearly vital in the performance of diagnostic assays upon which important management or regulatory decisions may be made. However, quality assurance cannot provide information on how to interpret assay results. This information can only be obtained by the process of validation, defined here as "investigation of the extent to which a technique can legitimately be used for a particular purpose". Validation does not demonstrate that an assay will have standardised performance, but rather that it is appropriate for a given application. Therefore, validation is essential to determine the extent to which the potential of PCR-based techniques for detection of aquatic animal pathogens will be realised in practice.
Hiney and Smith (1999) have provided a framework for the validation of PCR-based detection techniques which outlines three major criteria (quantitivity, qualitivity and reliability) which should, where possible, be evaluated sequentially at four levels of increasing experimental complexity (in vitro, seeded samples, incurred samples and field samples). In vitro studies and studies in seeded and incurred samples can be performed in the laboratory. Although these laboratory-based validation studies form a vital part of the validation process, they can only provide information on the performance of the technique in the sample type to which it will be applied. In order to ascribe meaning to the results generated in the field, the last level of validation (i.e. field application) is the most important.
Ascribing meaning to results. When assessing the data generated by a non-culture-based detection technique, either DNA- or immunological-based, the critical question to be addressed is: "what can the results validly be taken to mean?" The meaning that can be attributed to the results generated by any non-culture-based detection technique will be dependent on the application for which that technique is validated, and also on the context in which the results will be interpreted.
With regard to the intended application of a detection technique, the sample type being examined may be either CLINICAL or ENVIRONMENTAL. Laboratory validation should, therefore, have demonstrated that the detection technique performs reliably in the sample type in which that technique will be used. In addition, the type of study being conducted will normally be designed to answer questions of either ECOLOGICAL or EPIZOOTIOLOGICAL relevance. Hiney (1997) outlined the difference in emphasis of the questions being asked in these two types of studies and the importance of designing a validation programme suitable for the study type.
The meaning that will be attributed to results will also depend on the context of the interpretation. Contexts include research, disease diagnostics and regulation of fisheries operations, each of which will require a different level of validity (Fig. 1). In the research context, the required level of validity need not be high provided the technique generates information of use for the formulation or confirmation of models of disease. On the other hand, the use of the technique for disease diagnosis necessitates a much higher level of validity because of the therapeutic and/or management decisions that may rest on the outcome of the application of that technique. At the highest level of validity, the interpretation of results in a regulatory context may have far-reaching and serious implications for a fish farmer, region or country.
Figure 1. Required levels of validity of detection techniques in the context of their
The selectivity of non-culture-based detection techniques. Regardless of the application intended for a non-culture-based detection technique or the context of that application, the most important question that must be asked of any detection technique is whether the technique provides THE TRUTH, THE WHOLE TRUTH, AND NOTHING BUT THE TRUTH. In other words, does the technique detect the species (viral, bacterial or protozoan) of interest, does the technique detect all members of that species and is there cross-reaction with members of any non-target species?
The property of 'truth' is often referred to as 'specificity' in scientific reports on detection techniques. However, in veterinary terms 'specificity' has another meaning, that is, the proportion of uninfected animals in a population that correctly test negative (a measure of false positives at the population level), which is clearly not the property we wish to describe. More correctly, the property of truth should be called 'selectivity'. The Welac Working Group (1993) defined selectivity as "the extent to which a technique can detect a particular analyte in a complex mixture without interference from other components in the mixture". In the case of microbial detection techniques, the mixture will be a sample, either clinical or environmental, and potential interfering components would include non-target species and chemical components in that mixture which might generate false positive or negative responses.
Non-culture-based techniques are proxy measurements. In determining the selectivity of non-culture-based detection techniques it must be borne in mind that these techniques provide a proxy measurement of the presence of the target species, i.e. they detect only a fragment of the target cell. The use of proxy measurements is based on two important assumptions about the meaning of signals generated by non-culture-based detection techniques: i) that the SIGNAL = TARGET; and ii) that the SIGNAL = DISEASE RISK. The purpose of validation is to establish that these assumptions are true.
Assumption 1: signal = target
The selectivity of non-culture-based detection techniques is established in the laboratory by the use of CONTROL PANELS i.e. collections of strains of the same species (the truth) that should be representative of all members of the target species (the whole truth), and collections of closely related and non-related strains (nothing but the truth).
However, a number of problems can be identified for control panels. The first is that many of these control panels are badly designed, containing too few target organisms (especially in the case of heterogeneous species), inappropriate representatives of the target species or too few or badly chosen closely related non-target organisms. A second problem frequently observed in control panels is that they do not contain appropriate application-dependent non-target organisms. Application dependent panels should take account of the sample type in which the technique will be applied and should, therefore, contain the non-target organisms most likely to be present in that sample type, be they other pathogenic species that infect the same host or species indigenous to that environment.
Even where these problems have been addressed, there still remains a fundamental problem with the design of control panels, especially for environmental applications (although clinical applications will also be affected), namely that ONLY ORGANISMS THAT CAN BE CULTURED CAN BE INCLUDED IN CONTROL PANELS. It has been variously estimated that only 0.1 - 1% of all organisms present in the environment have been cultured in the laboratory. This leaves a vast and unknowable reservoir of organisms whose potential for cross-reaction cannot be assessed.
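The control-panel check described under Assumption 1 can be summarised as two proportions: the fraction of target strains that give a signal (the whole truth) and the fraction of non-target strains that give none (nothing but the truth). The following is a minimal sketch only; the strain labels, the panel contents and the terms "inclusivity" and "exclusivity" are illustrative assumptions and are not taken from any published panel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PanelStrain:
    name: str        # hypothetical strain label
    is_target: bool  # True if the strain belongs to the target species
    signal: bool     # True if the assay gave a positive response


def panel_summary(panel):
    """Summarise assay performance against a control panel."""
    targets = [s for s in panel if s.is_target]
    non_targets = [s for s in panel if not s.is_target]
    if not targets or not non_targets:
        raise ValueError("panel needs both target and non-target strains")
    return {
        # Proportion of target strains detected ("the whole truth").
        "inclusivity": sum(s.signal for s in targets) / len(targets),
        # Proportion of non-target strains not detected ("nothing but the truth").
        "exclusivity": sum(not s.signal for s in non_targets) / len(non_targets),
        "missed_targets": [s.name for s in targets if not s.signal],
        "cross_reactors": [s.name for s in non_targets if s.signal],
    }


# Entirely hypothetical panel: four target and four non-target strains.
example_panel = [
    PanelStrain("target-1", True, True),
    PanelStrain("target-2", True, True),
    PanelStrain("target-3", True, True),
    PanelStrain("target-4", True, False),   # a target strain the assay misses
    PanelStrain("related-1", False, False),
    PanelStrain("related-2", False, True),  # a cross-reacting close relative
    PanelStrain("unrelated-1", False, False),
    PanelStrain("unrelated-2", False, False),
]

summary = panel_summary(example_panel)
print(summary)
```

Such a summary describes only the strains that were available for inclusion; it says nothing about organisms that have never been cultured.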
In reality, no amount of internal or laboratory validation or standardisation can tell us how we should interpret results generated in the field. To do this we need to use external validation techniques, of which there are two main types, comparative validation and predictive validation.
Comparative validation involves the comparison of results generated by two or more methods targeted against the same organism. There are a number of approaches that can be taken to comparative validation.
a). Compare against method previously validated for the same application
As there are currently no non-culture-based detection techniques adequately validated for diagnosis of fish diseases this option will for the moment remain theoretical.
b). Compare against another unvalidated non-culture-based method
The comparison of two methods based on the same detection principle, such as two PCR-based assays, is not ideal. This type of comparison cannot allow for inherent flaws (e.g. inhibition, matrix interaction) in the technique. Ideally the methods being compared should be based on different detection principles (genetic, immunological or culture-based) and should not, in theory, be inhibited by the same components of the test matrix or generate the same false positives from that matrix. If the degree of concordance in the results generated by these techniques is high when applied to the same samples, then the methods can be said to mutually co-validate each other.
What results may comparative validation generate?
There are a number of possible outcomes to a comparative validation programme:
a). Both techniques valid
A comparison should generate full concordance if the techniques have the same lower detection limit and are both valid for the application for which they have been developed.
b). Both techniques valid but lower detection limits different
In the case of the comparison of two or more valid techniques whose lower detection limits are different we would expect to see asymmetric concordance. A percentage of the samples will be positive by both techniques, but additional samples will also be positive by the assay with the better lower detection limit.
c). One technique invalid
When one of the methods being compared is invalid for the intended application, that is, generates either false positive or negative signals, we would expect low concordance in any comparative study. Unfortunately, it may not be possible to distinguish which method is invalid unless the results generated by it are at odds with more than one other method.
d). Both techniques invalid
If all of the methods being compared are invalid for a particular application then the results generated will have low concordance.
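The outcomes listed above can be told apart, in part, by tabulating paired results from the same samples. The sketch below is illustrative only: the function name, the data and the choice of simple percentage agreement as the measure of concordance are assumptions, not a prescribed method.

```python
def compare_assays(results_a, results_b):
    """Tabulate paired positive/negative results from two assays.

    Both arguments are sequences of booleans (True = positive),
    one entry per sample, in the same sample order.
    """
    if len(results_a) != len(results_b):
        raise ValueError("both assays must be applied to the same samples")
    if not results_a:
        raise ValueError("no samples supplied")
    pairs = list(zip(results_a, results_b))
    both_pos = sum(a and b for a, b in pairs)
    both_neg = sum((not a) and (not b) for a, b in pairs)
    only_a = sum(a and (not b) for a, b in pairs)
    only_b = sum((not a) and b for a, b in pairs)
    return {
        "both_positive": both_pos,
        "both_negative": both_neg,
        "only_a_positive": only_a,
        "only_b_positive": only_b,
        # Simple agreement: proportion of samples given the same result.
        "concordance": (both_pos + both_neg) / len(pairs),
    }


# Hypothetical data for ten samples; assay A is assumed to have the
# better (lower) detection limit.
assay_a = [True, True, True, True, True, False, False, False, False, False]
assay_b = [True, True, True, False, False, False, False, False, False, False]

table = compare_assays(assay_a, assay_b)
print(table)
```

In this invented data set all of the disagreement falls on one side (samples positive by A only), the pattern described above as asymmetric concordance. Disagreement spread across both sides would be more consistent with one or both techniques being invalid, although the table alone cannot show which.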
Regardless of the results generated by a comparative validation, this approach can only provide us with information on the presence of the target per se (Assumption 1). It is still possible that what we are detecting are cell fragments or dead cells. Therefore, comparative validation cannot give us any information about what the presence of that target means in terms of disease (Assumption 2). So how could we interpret these results in any meaningful way?
One possible means of interpreting the results generated in the field by a non-culture-based detection technique is through predictive validation (i.e. 'ESTABLISHING THE ABILITY OF A TECHNIQUE TO PREDICT A DISEASE EVENT').
Clearly, with regard to diagnostic techniques the most important event that can be predicted would be the occurrence of a disease episode in the host following the detection of a positive response through application of a non-culture-based detection technique to either host tissue or the environment of that host. However, 'disease' is a rather loose concept, defined by the World Health Organisation as "any divergence from a healthy state". Therefore, the event to be predicted must be capable of being established empirically (e.g. that the detection of positive responses by a PCR assay would predict the future isolation of the pathogen of interest bacteriologically from host tissue). Equally, the prediction could be that the absence of a positive response would predict a reduction in the requirement for antibiotic therapy in the host population.
Regardless of the predicted event, the ultimate objective of a predictive validation programme is to establish that the detection of a positive signal by a non-culture-based technique has meaning in terms of disease, that is, that the SIGNAL = DISEASE RISK (Assumption 2).
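One way to express the outcome of a predictive validation study is as the proportion of positive signals that were followed by the predicted event, and the proportion of negative results that were not. The sketch below is a hedged illustration: the record layout, the function name and the counts are assumptions, and the "event" stands for whatever empirically established outcome the study has chosen (e.g. later bacteriological isolation of the pathogen).

```python
def predictive_values(records):
    """Compute predictive values from (signal, event) pairs.

    Each record is a tuple of two booleans:
      signal - the assay gave a positive response at sampling
      event  - the predicted event was later observed
    Returns None for a value whose denominator is zero.
    """
    tp = fp = fn = tn = 0
    for signal, event in records:
        if signal and event:
            tp += 1
        elif signal and not event:
            fp += 1
        elif not signal and event:
            fn += 1
        else:
            tn += 1
    return {
        # Proportion of positive signals followed by the event.
        "positive_predictive_value": tp / (tp + fp) if (tp + fp) else None,
        # Proportion of negative results not followed by the event.
        "negative_predictive_value": tn / (tn + fn) if (tn + fn) else None,
    }


# Hypothetical study of twenty fish populations.
study = (
    [(True, True)] * 8      # signal, event later observed
    + [(True, False)] * 2   # signal, no event
    + [(False, True)] * 1   # no signal, event observed anyway
    + [(False, False)] * 9  # no signal, no event
)

values = predictive_values(study)
print(values)
```

Values obtained in this way apply only to the sample type, conditions and host population in which the study was carried out.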
Meaning in context
Returning to the relevance of context in the interpretation of results, a number of observations can be made. In terms of research, a great deal of interesting and useful data can be generated by the use of techniques of poor validity or whose validity has not been adequately established through either comparative or predictive validation studies. However, as diagnostic tools, assays with poor validity or whose validity has not been adequately established may generate completely misleading data and should not be considered for application in this context until sufficient data on their performance in the field are available. Most seriously, from the regulatory viewpoint, interpreting the data obtained by such assays as indicating a disease risk and warranting regulatory sanctions would be completely invalid. Therefore, the first priority of any programme that hopes to introduce non-culture-based detection techniques for detection of aquatic animal pathogens must be to establish an adequate validation programme that includes both comparative and predictive validation and takes cognisance of the intended application (i.e. sample type, conditions, context). Only by such an approach can we have confidence that we are ascribing the correct meaning to the results we generate.
References
Hiney, M. (1997). How to test a test: Methods of field validation for non-culture-based detection techniques. Bulletin of the European Association of Fish Pathologists 17, 245-250.
Hiney, M.P. and Smith, P.R. (1999). Validation of polymerase chain reaction-based techniques for proxy detection of bacterial fish pathogens: Framework, problems and possible solutions for environmental applications. Aquaculture 162, 41-68.
Welac Working Group (1993). Welac Eurachem Guidance Doc. No.1. Accreditation for chemical laboratories: Guidance on the interpretation of the EN45000 series of standards and ISO/IEC guide 25. Edition 1. April.