5. CONCLUDING REMARKS

5.1 Sensitivity and confidence limits

When any of the methods described in the previous sections are used to provide an estimate of, for example, Z/K = 1.7, one thing is certain. This is that the true value of Z/K is not precisely 1.7. It may be close - 1.68 or 1.6995 - but it could, if the input data are not good, be quite different - 0.9 or 2.5. If we are to use the result, for example, to judge how heavily the stock is exploited by comparing our estimate of Z/K to likely values of M/K, it is important to know how close our estimate is likely to be to the true value.

The standard way of doing this is to recognize that, given the variability of the input data, the estimate will depart from the true value by a greater or lesser extent, depending on the chance effects of sampling. In favourable circumstances, if the estimation procedure fits a simple statistical model, it is possible to calculate, based on the variability of the data, confidence limits about the “best” estimate, i.e., to calculate limits such that if the true value does lie outside these limits we would obtain the observed estimate by chance only extremely rarely (once in twenty times or once in a hundred times for the 5% and 1% confidence limits respectively).

In simple analyses these confidence limits can be readily calculated. Thus if we are interested in the mean length of a sample of N fish, the variance of the mean is s²/N, where s² is the sample variance (calculated with divisor N-1), and the 5% confidence limits are about two standard errors (the standard error being the square root of this variance) each side of the mean. Unfortunately most of the methods described here are less simple, and do not allow such simple calculations of confidence limits. For example the ELEFAN method provides a “score” - the ESP/ASP ratio - which describes how well alternative sets of growth parameters fit the observations - or rather the set of peaks obtained after restructuring the length data. What ELEFAN does not tell us is how high or how low a value of the ESP/ASP ratio would be expected by chance if the true parameter set is used. If the highest ratio is 0.67, obtained with K = 0.42 and L_inf = 54, and with K = 0.50 and L_inf = 48 the ratio is only 0.53, we can say the first set is in some sense better. What we cannot say is whether the second set is wrong, and can be rejected at the 1% or 5% level of significance.
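The simple case of the mean length can be sketched as follows. This is only an illustration: the function name and the sample of lengths are invented, the sample variance is taken with divisor N-1, and the multiplier 1.96 is the usual large-sample value (for a handful of fish a larger multiplier from Student's t would be more appropriate).

```python
import math


def mean_with_limits(lengths, multiplier=1.96):
    """Return (mean, lower limit, upper limit) for a simple random sample."""
    n = len(lengths)
    if n < 2:
        raise ValueError("need at least two fish to estimate a variance")
    mean = sum(lengths) / n
    # Sample variance, divisor n - 1.
    s2 = sum((x - mean) ** 2 for x in lengths) / (n - 1)
    # Standard error of the mean: square root of s2 / n.
    se = math.sqrt(s2 / n)
    return mean, mean - multiplier * se, mean + multiplier * se


# Invented lengths (cm), for illustration only.
sample_cm = [31.0, 29.5, 33.2, 30.1, 28.7, 32.4, 30.9, 29.8]
mean, lower, upper = mean_with_limits(sample_cm)
print(f"mean = {mean:.2f} cm, approximate limits {lower:.2f} to {upper:.2f} cm")
```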

Apart from uncertainty caused by variability in the data, which in favourable circumstances can be expressed as confidence limits, there are two other important ways - wrong models, and wrong input parameters - in which the estimates obtained may differ from the true values. The extent of the uncertainties introduced by these effects is not always easy to determine, but in many cases they may be larger than the random errors introduced by variable input data. For various reasons, therefore, it has become commonplace for estimates from a stock assessment analysis to be presented as a single figure - K is 0.426, Z is 1.34, etc. - without attempting to say how accurate these figures are.

This is really not satisfactory, and can risk seriously misleading the users of the analysis. Some excuse can be made for those advising managers on matters such as the level of the TAC for the coming year, where it can be argued that if the manager is given a choice of figures for the TAC, he will always take the highest. This is only partly true; giving a single figure may simplify things in the short run, but it should be better in the long run if the managers are aware of the uncertainties in the scientific advice. Their chances of making sensible decisions will be increased if they are also informed of the risks involved if they always act on the most optimistic of a range of figures. With this possible exception any estimates should be accompanied by some indication of their accuracy.

The effect of possible errors in the model, and failures in the assumptions implicit in the model, can sometimes be dealt with by using more detailed models, with less demanding assumptions or by simulation, i.e., by applying the methods to artificially generated data, for example, length-frequencies for which the input values of growth, etc., are known. Thus Rosenberg and Beddington (1988) have examined the performance of ELEFAN 1, and Basson et al. (1988) the performance of Shepherd's length composition analysis (SLCA) and the projection method (Rosenberg et al., 1986). It appears that there will be some bias in the estimates of K and L_inf produced by these methods under all but the most favourable circumstances. The bias is least if rough estimates of the quantities are known before the analysis starts. The reliability of the methods is not the same for all, with SLCA and the projection method generally performing better, the latter being robust to high levels of variation in length at age. The ELEFAN method has been amended to reduce some of the causes of bias, particularly in the potential double-scoring of peaks at high lengths. The important message is probably that bias is always likely to occur, not only in these particular methods of estimating growth, but in other methods where the statistical behaviour of the estimation process has not been looked at. This means most of the methods described here. This is perhaps not a very helpful conclusion, but does serve as a general warning to treat all results with care.

The question of wrong input parameters is more easily dealt with. A full examination of the sensitivity of estimates to errors in the input parameters may involve moderately complex mathematics (e.g., Laurec and Mesnil, 1987), but it is not necessary to go as far as this. All that is needed is to repeat the calculations for different values of the input parameters, e.g., the values of M, K and L_inf in length cohort analysis. In this case it is probably sufficient to test the sensitivity for each parameter separately rather than all together (though Table 5 of Laurec and Mesnil suggests it would be desirable to test simultaneous changes in both K and L_inf). This will cut down the number of calculations needed.
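A minimal sketch of such a one-at-a-time test is given below. It assumes the commonly quoted form of Jones' length cohort analysis, N_i = (N_{i+1} X_i + C_i) X_i with X_i = ((L_inf - L_i)/(L_inf - L_{i+1}))^(M/2K), which may differ in detail from the version used in a particular assessment. The catches, class limits, base parameter values, terminal F/Z and function names are all invented for illustration.

```python
def length_cohort_analysis(lower_bounds, catches, l_inf, k, m,
                           terminal_f_over_z=0.5):
    """Numbers entering each length class (last class is a plus group)."""
    n_classes = len(catches)
    numbers = [0.0] * n_classes
    # Plus group: catch divided by an assumed exploitation rate F/Z.
    numbers[-1] = catches[-1] / terminal_f_over_z
    # Work backwards from the largest to the smallest fish.
    for i in range(n_classes - 2, -1, -1):
        ratio = (l_inf - lower_bounds[i]) / (l_inf - lower_bounds[i + 1])
        x = ratio ** (m / (2.0 * k))
        numbers[i] = (numbers[i + 1] * x + catches[i]) * x
    return numbers


def one_at_a_time(lower_bounds, catches, base, change=0.10):
    """Percentage change in numbers entering the first class when each
    input parameter in turn is raised by the fraction `change`."""
    reference = length_cohort_analysis(lower_bounds, catches, **base)[0]
    result = {}
    for name in base:
        trial = dict(base)
        trial[name] = base[name] * (1.0 + change)
        estimate = length_cohort_analysis(lower_bounds, catches, **trial)[0]
        result[name] = 100.0 * (estimate - reference) / reference
    return result


# Invented data: lower class limits (cm) and catch in numbers.
LOWER = [20.0, 25.0, 30.0, 35.0, 40.0]
CATCH = [500.0, 800.0, 600.0, 300.0, 100.0]
BASE = {"l_inf": 50.0, "k": 0.4, "m": 0.6}

for name, pct in one_at_a_time(LOWER, CATCH, BASE).items():
    print(f"+10% in {name}: {pct:+.1f}% in numbers entering first class")
```

Only three extra runs are needed here, one per parameter. Testing K and L_inf together would need one further run with both changed.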

A point to be remembered in examining the results of calculating sensitivity, or looking at published estimates of sensitivity (i.e., the percentage change in the estimate caused by a 1% change in the input parameter), is that not all parameters are estimated equally well. In general L_inf will be relatively well known, especially when there are plenty of large old fish in the samples, and M poorly known. It is not unreasonable to hope to estimate L_inf to within 10 or 15%, but the confidence limits of Pauly's regression of M against growth parameters are a factor of 3 each side of the central estimate. It should also be remembered that what matters is usually the effect on the final output, e.g., the advice given to managers. It will often happen that certain intermediate values are very sensitive to variations in the inputs, but that these have little effect on the final result. For example many assessments, e.g., of the optimum fishing mortality, are very sensitive to the value of M, but uncertainties in M have little effect on others, including the calculations of status quo catches (Shepherd, 1988).

A powerful method for determining the effect of chance variations in the input data, which is not so demanding as a full statistical examination of the analysis, is the jack-knife method (Miller, 1974; Tukey, 1977). Supposing the input is made up of n sets of data (e.g., n length compositions), this consists essentially of looking at the n different estimates which can be obtained by omitting each of the sets in turn and repeating the calculations. A simple examination of these values will give some insight into how the final estimate depends on individual samples. A more precise estimate is provided by calculating “pseudo-values” (Pauly, 1984) A_i' = nA - (n-1)A_i, where A is the estimate based on all samples, and A_i the estimate based on all but the ith sample. The best revised estimate can then be expressed as the mean of the “pseudo-values” A_i', and this will have a variance equal to 1/n times the sample variance of the pseudo-values (calculated with divisor n-1).
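The jack-knife calculation can be sketched as below. The function names and data are invented, and the estimator used (the pooled mean length) is only a stand-in: in practice it would be a complete run of whatever method is being examined, e.g., an ELEFAN fit returning K. The variance of the revised estimate is computed as the sum of squared deviations of the pseudo-values from their mean, divided by n(n-1).

```python
import math


def jackknife(samples, estimator):
    """Leave-one-sample-out jack-knife.

    samples: list of n data sets (e.g., n length compositions).
    estimator: function taking a list of data sets, returning one number.
    Returns (revised estimate, its standard error, leave-one-out estimates).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    full = estimator(samples)                       # A, from all samples
    leave_one_out = [estimator(samples[:i] + samples[i + 1:])
                     for i in range(n)]             # A_i
    pseudo = [n * full - (n - 1) * a_i for a_i in leave_one_out]
    estimate = sum(pseudo) / n
    variance = sum((p - estimate) ** 2 for p in pseudo) / (n * (n - 1))
    return estimate, math.sqrt(variance), leave_one_out


def pooled_mean_length(samples):
    """Stand-in estimator: mean length over all fish in all samples."""
    fish = [length for sample in samples for length in sample]
    return sum(fish) / len(fish)


# Invented length samples (cm), for illustration only.
monthly_samples = [
    [21.0, 23.5, 22.0, 24.0],
    [24.5, 25.0, 26.5],
    [27.0, 28.5, 26.0, 29.0],
    [30.0, 29.5, 31.0],
]
estimate, se, loo = jackknife(monthly_samples, pooled_mean_length)
print(f"jack-knife estimate {estimate:.2f}, standard error {se:.2f}")
print("leave-one-out estimates:", [round(a, 2) for a in loo])
```

Printing the leave-one-out estimates alongside the revised estimate shows at a glance whether any single sample dominates the result.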

This jack-knife procedure seems applicable to most of the methods described, especially those such as ELEFAN which look at a limited number of individual samples, and should be much more widely used than it is at present.

In the absence of any better analysis a simple commonsense approach is to recognize that though computer-processing can nearly always produce a result (e.g., values of K and L_inf that give the highest ESP/ASP score), it cannot produce a sensible result unless there is a signal in the data. Thus a simple rule-of-thumb when estimating growth is to look at graphical presentations of the length frequencies. Only if there are identifiable modes, and several of these modes can be followed from month to month to give a first estimate of a growth-curve, should one proceed to use computer methods.

5.2 What method to use

The previous sections have described a great variety of techniques that can be applied to length-data. In this section we attempt to review these, and give some guidance as to what techniques should be used in different situations. The first point is that there is no one method that is best in all situations, and the most appropriate method will depend on the characteristics of the fish stock and the fishery, and on the purpose of the assessment and the resources available. It is not possible in a short section to cover all situations, and here we will concentrate on what may be one typical user of FAO publications of this kind. This is a scientist in a developing country who has received training in the standard methods of fish stock assessment somewhere bordering the North Atlantic, and has returned home to be given responsibility for assessing some multi-species tropical fishery and for giving advice on its management. For this, many of the age-based methods he has been taught are not immediately applicable and he turns to length methods for help. It may be assumed that he has access to some resources for data collection of whatever type he believes necessary, that sampling of commercial catches for length is one possibility, but that these funds are very limited. What should he do?

Enough is now known about the potential value of length methods to be sure that, in the absence of any strong counter-indication, high priority should be given to looking at length information. The first step, before setting up any long-term programme of sampling or analysis should therefore be to collect length samples from the fishery, spread as widely as possible in both space and time. In section 2 it was suggested that before setting up a large-scale routine scheme for sampling of length-composition it would be desirable to collect a moderate number of samples spread through the fishery to establish the pattern of variability within the fishery. The same principle applies to a preliminary examination to determine what priority, if any, should be given to length-analysis. Half a dozen samples, each of perhaps some 200 fish, should provide guidance on which way to go.

Having collected a few samples, the next step is to look at them to see what modes are present, and how these vary from month to month. Following Shepherd et al. (1987) this pattern of modes may be placed, to a reasonable approximation, in one of four classes (see Figure 5.1). This classification then determines the attention that should be given to length information in future research and the methods of analysis that are likely to be most useful.

Figure 5.1 Typical length-frequencies observed in different types of fishery. A. Highly selective fishery, no movement in mode detectable. B. Fishery on a very short-lived fish with one annual pulse of recruitment. C. Fishery on a longer-lived fish, with clear modes among younger fish. D. Fishery on longer-lived fish, with much overlap among adjacent age-groups

Type A - A single mode, always in the same position

The observed length-frequency is almost certainly due to high selectivity in the sampling gear; it is typical of a gillnet fishery, and can also occur when a migratory fish is sampled at one point in its movements. In these circumstances the length-composition of the samples gives little information about the length-composition of the population. If this pattern is observed, check whether the gear is likely to be selective. If it is, then there is little point in using length-based methods, unless other sources of information on the length-composition of the population, e.g., from the use of a non-selective gear operated from a research vessel, are available. Length-sampling can be reduced to a bare minimum, sufficient to check that no significant change in the pattern has occurred.

The same effect can also be produced, not by simple gear selection, but by the behaviour of the fish (e.g., a strong size-determined migration) such that only a narrow band of sizes is available to the fishery. The chances of this happening may be judged by looking at what is known about the biology of the fish and the sizes of the same species taken in other fisheries. This is particularly likely to happen in highly migratory fish like tuna. Samples in the bait-boat fisheries for yellowfin tuna often consist only of small fish quite untypical of the population as a whole.

If the gear used does not appear to be selective, and there is little probability of selection due to behaviour, then it may be that the classification is wrong and that we are really seeing a Type D pattern but with an unusually short right-hand limb, perhaps due to the fish not recruiting to the fishery until they have nearly stopped growing (i.e., near L_inf). In that case some of the remarks under D would apply, but because of the small growth the size of the signal that can be produced (e.g., a steepening of the right-hand limb following an increase in fishing mortality) will be weak. This makes it less likely that convincing results will be produced by length methods, and therefore reduces the priority they should be given.

Type B - A single mode moving steadily upwards

This is typical of some intense fisheries on tropical shrimps, and other short-lived species (see Figure 3.5). This is an ideal situation with which to deal. It appears that we are seeing the growth of a single cohort, and the first thing to do is to look for any information confirming this. If samples are available round the year, there should be signs of the next cohort appearing (at a fairly consistent place in the length frequency) before the previous one finally disappears, i.e., for a short time there are two modes. If each mode is indeed a single cohort we can proceed to using age-based methods without the need of looking at scales or otoliths. Each cohort can be ascribed an arbitrary but convenient birthday - perhaps the beginning of the year - and on this arbitrary scale its age is known.

Type C - Several modes, most distinguishable among the smaller fish

This is the classical picture of a length distribution suitable for analysis by length-based methods. The presumption is that the visible modes correspond to the two or three youngest recruited age-groups, and that among the older fish adjacent age-groups are overlapping so that modes are not apparent. The hope with a fishery showing this sort of distribution is that most of the information can be derived from the length-data.

Type D - Only one mode, but with an extended right-hand limb

The presumption is that this is an extreme form of a Type C distribution, but with such an overlap between adjacent age-groups - perhaps because of a very extended spawning season, high variation in individual growth rates, or because the fish are long-lived, with many age-groups present - that they cannot be separated, even among the youngest fish. With little signal, difficulties in applying most length methods can be expected. However, the slope of the right-hand limb should give information on total mortality, and, especially if the fish are long-lived, changes in this slope over time might provide good information on the intensity of exploitation.

In detail the analyses that might be done for the different types, and the sampling strategy, are as follows:

Type A

Little or no information can be extracted by length-based methods. Sampling should be cut to a very low level - just sufficient to check that no changes have occurred. The frequency of sampling should be tied to other information on the fishery. If, for example, there are changes in the type of gear or in the area or season fished, additional sampling should be done for a period to establish whether or not the sizes of fish have changed.

Type B

Sampling. Since the sizes of fish change appreciably from month to month, it is important to have samples from each month - or possibly from smaller time periods - and for each period to be kept separate in the preliminary stages of summarizing the data. On the other hand because the pattern of sizes within any one month is likely to be similar from one year to the next, the intensity of sampling can often be cut down to a relatively low level once the general pattern has been established. The exception is the period when two modes are present, i.e., just after the pulse of recruitment occurs, when there are still survivors from the previous batch of recruits. The ratio of large to small at this time gives valuable information on mortalities and on relative recruitment magnitudes, and is likely to vary from year to year, especially as the amount of fishing changes. More intense sampling at this time is desirable.

Growth. This is very easy to estimate. The progression of modes will give the growth pattern almost immediately without using advanced techniques, though it is better to use one or other of the more objective methods e.g., ELEFAN. Two problems should be noted. When a new batch of recruits first enters the fishery we may be seeing, in the first month or so, only the larger fish. This incomplete recruitment can lead to over-estimation of the modal size of the youngest fish, and hence an under-estimation of growth rate. Second, if there is a strong seasonal pattern in recruitment, there may well be a seasonal pattern in growth. A seasonal slowing-down in growth may be confused with a slowing-down because the fish are approaching their limiting size. L_inf can thus be underestimated. For this type of fishery, much more than the others, it is important to consider using a seasonally varying pattern of growth (e.g., Pauly and Gaschutz, 1979).

Mortality. This is much less easy to estimate. In most months (those when only one mode is present) the size composition has virtually no information on mortality. Methods such as the length versions of cohort analysis or catch-curves cannot be applied to the data of individual months. They must be applied to length-composition of the population as a whole, taken over the year. To obtain this without risk of bias it is necessary to weight monthly samples by the abundance in the month. If estimates of relative abundance in each month are available, e.g., as cpue, it would be better to use this information to calculate month-to-month mortalities directly as the changes in numbers of an identifiable batch of fish (the main mode).
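Both calculations mentioned above can be sketched briefly. The sketch assumes that cpue (in numbers) is proportional to the abundance of the batch of fish concerned, and that no recruitment or migration occurs between the months compared. The function names and figures are invented.

```python
import math


def monthly_z(cpue):
    """Total mortality (per month) between successive months, from the
    cpue in numbers of one identifiable batch of fish (the main mode)."""
    return [math.log(cpue[i] / cpue[i + 1]) for i in range(len(cpue) - 1)]


def weighted_annual_composition(monthly_freqs, abundance):
    """Annual length composition: each month's sample is converted to
    proportions and weighted by that month's abundance index."""
    total = [0.0] * len(monthly_freqs[0])
    for freq, weight in zip(monthly_freqs, abundance):
        sample_size = sum(freq)
        for j, count in enumerate(freq):
            total[j] += weight * count / sample_size
    return total


# Invented cpue of the main mode in five successive months.
cpue_main_mode = [120.0, 95.0, 70.0, 56.0, 41.0]
for month, z in enumerate(monthly_z(cpue_main_mode), start=1):
    print(f"month {month} to {month + 1}: Z = {z:.2f} per month")

# Invented samples (numbers per length class) for three months.
samples = [[40, 50, 10, 0], [10, 60, 25, 5], [0, 30, 50, 20]]
print(weighted_annual_composition(samples, abundance=[120.0, 95.0, 70.0]))
```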

It is possible, under favourable circumstances, that the method of Ebert (1987) can be applied. A better and simpler approach uses the period when two modes are present and the length-composition of the catches can be clearly split into new recruits and survivors from the previous recruitment. Total mortality can then be expressed through the ratio of these two groups of fish, i.e., Z is estimated as -ln(r), where r is the ratio of the number of survivors to the number of new recruits. This assumes that the two recruitments were of similar strength and that both groups are fully represented in the samples.
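As a worked illustration of the two-mode calculation, the sketch below takes the fraction surviving as the number of survivors divided by the number of new recruits. It assumes that successive recruitments are of similar size and that both groups are equally well sampled. The resulting Z refers to the interval between recruitment pulses (one year for an annual pulse). The function name and counts are invented.

```python
import math


def z_from_two_modes(n_survivors, n_recruits):
    """Total mortality over the interval between two recruitment pulses,
    from the numbers in the old mode and the new mode of one sample."""
    if n_survivors <= 0 or n_recruits <= 0:
        raise ValueError("both groups must be present in the sample")
    return -math.log(n_survivors / n_recruits)


# Invented sample: 60 survivors from last year's recruitment, 440 new recruits.
print(f"Z = {z_from_two_modes(60, 440):.2f} per year")
```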

Assessment and management. Since it is difficult to estimate total mortality it is also difficult to assess the effect of fishing. In many months, when only one mode is present, the size-composition will be almost independent of the amount of fishing. Some indication will be given by the extent to which the proportion of larger fish surviving into their second year has changed with increasing fishing. More often the actual assessment will have to rely on other data sources, such as the changes in cpue, especially towards the end of the season (i.e., when mostly large fish are being caught).

On the other hand, once it has been established that the stock is being heavily fished, the length data are very useful in assessing the potential benefits from different management measures. An obvious measure is to close the fishery at times when very small fish are abundant. Possible closure dates are immediately obvious from the length data. Simple simulation modelling with an age-structured model, perhaps on a spreadsheet, in which the length-data are used to establish the mean weight of individuals each month, can give quantitative estimates of the benefits from alternative opening dates (Gulland and Gibson, in press).
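A simulation of this kind might look like the sketch below. It is not the published model cited above, merely a minimal single-cohort calculation under stated assumptions: constant monthly natural and fishing mortality, no fishing before the opening month, and a monthly mean weight read from the length data. The weights, mortality rates and function name are invented.

```python
import math


def yield_per_recruit(opening_month, mean_weight_g, m_month, f_month):
    """Yield in weight (g) per recruit over one season when the fishery
    opens at `opening_month` (0 = first month of the series)."""
    alive = 1.0
    total_yield = 0.0
    for month, weight in enumerate(mean_weight_g):
        f = f_month if month >= opening_month else 0.0
        z = m_month + f
        deaths = alive * (1.0 - math.exp(-z))
        if f > 0.0:
            # Share of the deaths taken by the fishery, converted to weight.
            total_yield += (f / z) * deaths * weight
        alive -= deaths
    return total_yield


# Invented monthly mean weights (g) of one cohort, from length samples.
MEAN_WEIGHT_G = [2, 4, 7, 11, 15, 19, 22, 24, 26, 27, 28, 28]

for opening in range(6):
    y = yield_per_recruit(opening, MEAN_WEIGHT_G, m_month=0.2, f_month=0.3)
    print(f"opening in month {opening + 1}: {y:.2f} g per recruit")
```

Comparing the printed yields across opening months gives the kind of quantitative comparison referred to in the text. The result depends heavily on the assumed natural mortality, which should itself be varied.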

Type C

Sampling. Since conditions are highly favourable to the use of all kinds of length-based methods, high priority needs to be given to regular collection of length samples from fisheries of this type. Though Shepherd et al. (1987) suggested that sampling should be “high, intensive for one year” implying that sampling could be cut down after one year, this is only partly true. While estimates of the current values of most parameters can be obtained from one year's intensive sampling, a reliable assessment can only be obtained from data over a period of years, during which changes in the amount of fishing can be related to changes in the population, especially total mortality. The priority given to regular sampling, after a period of intensive work, will depend on the extent to which the amount of fishing is changing. The greater the changes, the more important it is to monitor the effect of those changes, and hence more sampling is needed.

Growth. Most methods are applicable, and all will give good estimates of the growth rate (cm/year, and thus in terms of the von Bertalanffy equation, the product K L_inf) among the smaller fish where the modes are clear. The problem is to extract the most information from the larger fish, where modes are indistinct. Where data exist for many years, the method of anomalies should be considered. Attempts should also be made to obtain independent estimates of growth among the older fish, e.g., by examination of otoliths. Even a few age-determinations of large fish can help greatly. Among the estimation methods those that do not rely on identifying modes (e.g., SLCA) are likely to work better than those that use modes. Thus, the restructuring method of ELEFAN 1 risks identifying false modes among the larger fish, and generating wrong answers.

Mortality. Most methods work well, including those based on mean length, and length-corrected catch curves. Where the modes are particularly clear, methods such as that of Sparre that simultaneously estimate growth and mortality should be considered.

Assessment and management. The greatest need is for data from a long enough period to cover appreciable changes in the amount of fishing. Given that, assessment of the impact of fishing from relating changes in estimates of Z or Z/K to changes in the amount of fishing should be straightforward. Methods such as those of Jones can be used to provide managers with advice on the effect of changes in the amount of fishing or in selectivity.

Type D

Sampling. Not much information can be obtained from a single sample or from samples in a single year, but data from this type of fishery over a long period can, if fishing changes much in this period, provide good assessments. Sampling of these fisheries should therefore be established as a long-term programme.

Growth. There is likely to be little time signal, i.e., length-frequencies taken at different seasons may not differ much, so there is little chance of estimating growth. If there are differences from one season to the next, it is possible that they reflect differences in availability, or in the distribution of fish or fishing, and they should therefore be interpreted with caution. It can be particularly dangerous to apply to this sort of distribution computer-based methods (e.g., ELEFAN) which will always give a “best” estimate of growth parameters, but do not provide tests of significance or confidence limits for the results. The rule-of-thumb that such methods should only be used after a set of modes, and a growth curve through them, have been identified by eye is particularly applicable here.

Mortality. Though it is difficult to estimate absolute mortality rates until a time scale has been established, i.e., until growth is known, relative mortality, i.e., Z/K (assuming a von Bertalanffy growth pattern), can readily be estimated. Methods to estimate Z/K based on mean length work well, and several will also provide approximate estimates of L_inf.
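One mean-length estimator of this kind is the Beverton and Holt relation Z/K = (L_inf - Lmean)/(Lmean - L'), where Lmean is the mean length of all fish at or above a cut-off length L' above which fish are taken to be fully recruited. The sketch below assumes that L_inf is supplied from elsewhere. The function name, cut-off and data are invented.

```python
def z_over_k(lengths, l_inf, l_cut):
    """Z/K from the mean length of fish at or above the cut-off length."""
    selected = [x for x in lengths if x >= l_cut]
    if not selected:
        raise ValueError("no fish at or above the cut-off length")
    mean_length = sum(selected) / len(selected)
    if not (l_cut < mean_length < l_inf):
        raise ValueError("mean length must lie between l_cut and l_inf")
    return (l_inf - mean_length) / (mean_length - l_cut)


# Invented lengths (cm) from a single-mode sample with a long right-hand limb.
sample_cm = [22, 24, 25, 25, 26, 27, 28, 30, 31, 33, 36, 40, 45]
print(f"Z/K = {z_over_k(sample_cm, l_inf=55.0, l_cut=25.0):.2f}")
```

Repeating the calculation for samples from different years shows whether Z/K has changed with the amount of fishing, even though Z itself remains unknown.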

Assessment and management. These can be done in a similar way to Type C distributions.

5.3 Area differences; migration, stock separation, etc.

In this review and in many discussions about length-based methods, little or nothing has been said about possible area differences. It has been implied that any sample of fish taken at a particular time will, apart from the possible effects of gear selection, have the same length-composition. This is highly unlikely. Common experience is that the sizes of fish taken do vary, sometimes considerably, from place to place. To ignore area differences also means that some of the most important problems in stock assessment - the extent to which the stock migrates, the degree of uniformity within a stock, and the degree of separation between stocks - are not tackled. Also, as Sparre et al. (1989) point out, if a length-based method (and indeed nearly any assessment method) is applied to a migratory stock without taking the effects of migration into account, the results can be highly biased.

Sparre et al. (1989) describe methods by which, provided that enough is known about the migration patterns (e.g., from tagging), samples taken at different times can be matched together so that one can be sure that they come from the same cohort of fish, and thus an unbiased estimate of growth or mortality can be obtained.

In practice it is doubtful that sufficient knowledge of the migration pattern will exist for this approach to be used with much confidence. On the other hand the length samples themselves can often give good indications of the general pattern of migration, and, perhaps less often, of stock separation. For example, in the earlier discussions of different types of length frequency, it was noted that Type A (a single mode always in the same place) was a strong indication that the samples came from only a part of a migratory stock, with migration determined largely by size (though there is also a possibility of strong gear selection). In that case it was suggested that the sizes of fish of the same species caught in other fisheries should be examined.

This suggestion should in practice be applied widely in any fishery where stock assessment work is being started, and in which not much is known about stock separation or about possible migration patterns. Migration is interpreted here in a wide sense, to cover not only seasonal movements between nursery, feeding and spawning grounds (see Harden Jones, 1968) but also diel movements, and the gradual movement of larger demersal fish into deeper waters.

As pointed out above, collection of length-frequency data is usually one of the simplest and cheapest operations in fishery research, and priority in the early stages of an investigation should be given to collecting as many samples as practicable, well spread in space and time. This information can be used in designing an efficient sampling system for regular use.

In addition it would be useful to plot, for suitable time periods (e.g., each month) simple indices of each length sample (e.g., mean length or the main modes) on a chart of the area of investigation (which at this stage should be as wide as possible).
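Before plotting, the indices have to be tabulated by time period and position. A minimal sketch, assuming each measured fish is recorded with its month and a named area (the record layout, area names and data are invented), is:

```python
from collections import defaultdict


def mean_length_by_cell(records):
    """Mean length for each (month, area) combination.

    records: iterable of (month, area, length_cm) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for month, area, length in records:
        cell = sums[(month, area)]
        cell[0] += length   # running total of lengths
        cell[1] += 1        # number of fish
    return {key: total / count for key, (total, count) in sums.items()}


# Invented records, for illustration only.
records = [
    (1, "inshore", 18.0), (1, "inshore", 20.0), (1, "offshore", 31.0),
    (2, "inshore", 19.5), (2, "offshore", 33.0), (2, "offshore", 35.0),
]
for (month, area), mean in sorted(mean_length_by_cell(records).items()):
    print(f"month {month}, {area}: mean length {mean:.1f} cm")
```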

In the most favourable case the length-compositions within a month are the same, regardless of the origin of the samples, and the geographical spread is similar between months. This would suggest a single well-mixed stock, and indicate that analysis can proceed without much concern about the possible effects of migration. Another simple case is when there are big month-to-month differences in the location of samples, possibly with differences in the positions of the modes between samples in the same month. In that case the stock is probably migratory; by following the geographical position and modal lengths of samples in successive months it may be possible to make approximate estimates of the pattern of migration and of the growth.

Another possibility is of consistent differences between areas, e.g., smaller fish have always been found in one area, and larger in another. In this case the most likely explanation is of a migration from one area to the other, especially if there are few, if any, small fish in the area where the larger fish are found. If, on the other hand, small fish are being found in both areas, it may be that one is seeing the effect of stock separation, with two more or less independent stocks, and one stock (that with fewer large fish) experiencing a high intensity of fishing.

The point being made here is NOT that firm conclusions about migration or stock separation can be immediately drawn from a simple examination of length data. It is that length data do very often contain relevant information about these matters. A simple examination can therefore give valuable clues as to what is happening, and the results of such an examination should be used, first to give some guidance on what further, more reliable and more intensive, studies should be made of migration, etc., and second, to indicate the extent of possible migration or other area differences, and the degree to which their effects should be taken into account in using various length-based methods.