5. ESTIMATING LONGLINE AND GILLNET SELECTION CURVES FROM SELECTION EXPERIMENTS

The catch taken by a fishing gear depends on the amount of fish that is available to the gear, their catchability, the selectivity and fishing power of the gear, and the effort deployed. These factors are related by the selection equation:

$$C_{l,g} = q_{l,g}\, N_l\, P_g\, S_{l,g}\, E_g$$

where subscript g refers to gear and l refers to fish size. In principle, it is possible to observe the catch (C) and the effort (E) directly. The stock size (N) can be estimated from stock assessments. The other parameters, catchability (q), fishing power (P) and fishing selectivity (S) must be estimated through designed experiments. This chapter discusses estimation methods to obtain these parameters.

The selection equation requires normalisation: the fishing power is measured relative to a standard gear with fishing power = 1, and the selectivity is restricted to the interval [0,1] and is assumed to be 1 for one or more length classes. Because the model is multiplicative, not all components can be estimated from catch and effort data alone. In particular, a gear-dependent catchability effect (more fish are available to the gear) cannot be distinguished from a difference in fishing power (the gear is more effective in retaining the fish). For example, a particular longline bait may attract more fish or alternatively ensure more successful hooking. Both mechanisms will be observed as an increase in the catch.
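The confounding of catchability and fishing power can be illustrated with a minimal sketch. The function name and all numerical values below are hypothetical and chosen only for illustration; they are not taken from any experiment.

```python
def expected_catch(q, n, p, s, e):
    """Selection equation for one length class and gear: C = q * N * P * S * E."""
    return q * n * p * s * e

# Hypothetical bait A: doubles the proportion of fish encountering the gear (q).
catch_more_attraction = expected_catch(q=0.02, n=10_000, p=1.0, s=0.8, e=5)

# Hypothetical bait B: same encounter rate as the standard bait,
# but twice the hooking success (P).
catch_better_hooking = expected_catch(q=0.01, n=10_000, p=2.0, s=0.8, e=5)

# Both mechanisms predict the same catch, so catch and effort data
# alone cannot tell them apart.
print(catch_more_attraction, catch_better_hooking)
```

Only the product q × P enters the predicted catch, which is why one of the two must be fixed by assumption or by experimental design.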

C_l,g: Catch

The catch by length class taken by a fishing gear is an observed variable when conducting an experiment.

q_l,g: Catchability

The catchability measures the proportion of the population that is available to the gear. The selection equation shows that the catchability is obtained directly for the length group with maximum selectivity (=1) when fishing with one unit of effort. The catchability depends, among other things, on the reactions of the fish to the gear. For gears of the same type (e.g. two different gillnets) it is commonly assumed that q depends only on length, i.e. q_l,g ≡ q_l.

N_l: Number of fish available

q_l,g N_l is the number of fish encountering the gear. As noted by Hamley (1975), q_l,g S_l,g is the desired selection measure in several applications as it expresses the selection relative to the stock rather than the fish encountering the gear. This product is estimated when experimental fisheries are used for providing absolute density estimates of fish abundance (see Section 6). In some cases, reduced gillnet sizes matching particular abundant size classes of fish have been used (e.g. Regier and Robson 1966).

P_g: Fishing power of the gear

The fishing power is a measure of the efficiency of that particular gear in retaining fish. Many methods developed for estimating selectivity include the assumption that the gears being compared have identical fishing power, i.e. P_g = 1. The model ignores gear saturation, which would make the fishing power a function of the amount of fish available to the gear.

S_l,g: Selectivity of the gear

Most gears are only effective at catching certain fish and shellfish species and even then usually only over a limited size range. Traditionally this size dependence has attracted much scientific attention while other parameters in the selection equation have largely been ignored.

E_g: The effort deployed

Effort is measured by the duration of fishing and the number of gears (number of nets or hooks). Experiments with several gears of the same type deployed concurrently often use the same effort for each gear (i.e. same numbers of hooks or number of nets of similar size used for the same duration).

Methods for estimating selectivity are usually classified into two main groups based on whether information on the length distribution of the fish available to the gear exists or not.

Direct methods: Methods that require an external estimate of the fish abundance per size class, i.e. N_l is known.
Indirect methods: Methods that do not require such information, i.e. N_l is unknown.

In most cases, N_l is unknown and hence indirect methods are those used most often. An indirect method estimates size selection and the length distribution of the fish encountering the gear concurrently based on the selection equation. This requires additional assumptions on the form of the selection curve.

5.1 USING LENGTH FREQUENCIES OF CATCHES AS THE SELECTION CURVE

The catch size distribution is sometimes assumed to represent the selection curve directly, ignoring the stock size distribution. Hamley (1975) briefly discussed this case and noted that the result is a rough proxy as gillnets are usually very size selective. Baranov (1948) suggested as a simple rule of thumb that fish more than 20% smaller or larger than the optimal selection size would occur only rarely in the catches.

Figure 5.1 Using observed length frequencies of the catches as a proxy for selection. The upper panel shows a stock length distribution and the middle panel a selection curve. The catch size distribution, which is the product of stock and selection, is shown in the lower panel. When the selection range is narrower than the stock size distribution the size distribution of the catches may be used as a proxy for selection. Redrawn from Baranov (1948).

The selection equation stipulates that when the length frequency distribution of the stock (q_lN_l) is relatively uniform then the length frequency distribution of the catch will provide a proxy of the selection (Fig. 5.1).
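The effect of the stock length distribution on this proxy can be sketched numerically. The normal selection curve, the two stock distributions and all parameter values below are hypothetical; the sketch only illustrates that the catch distribution is the product of stock and selection.

```python
import math

def normal_selection(length, mode, sd):
    """Hypothetical bell-shaped selection curve with maximum 1 at the mode."""
    return math.exp(-((length - mode) ** 2) / (2 * sd ** 2))

lengths = [40 + 2 * i for i in range(21)]          # 40, 42, ..., 80 cm
selection = [normal_selection(l, 60.0, 5.0) for l in lengths]

def catch_shape(stock):
    """Catch length distribution (stock * selection), scaled to a maximum of 1."""
    catch = [n * s for n, s in zip(stock, selection)]
    peak = max(catch)
    return [c / peak for c in catch]

# A uniform stock and a stock declining steeply with length (both made up).
uniform_stock = [1000.0] * len(lengths)
declining_stock = [1000.0 * math.exp(-0.1 * (l - 40)) for l in lengths]

proxy_uniform = catch_shape(uniform_stock)
proxy_declining = catch_shape(declining_stock)

mode_uniform = lengths[proxy_uniform.index(1.0)]
mode_declining = lengths[proxy_declining.index(1.0)]
print(mode_uniform, mode_declining)
```

With the uniform stock the scaled catch distribution reproduces the selection curve; with the declining stock the apparent mode is shifted towards smaller fish.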

Using the catch length distribution directly to represent the selection curve may be useful when the gear is only effective over a very narrow size range, but the approach has only limited use. The assumption is sometimes useful for exploratory purposes but final selection estimates should be obtained by one of the methods that explicitly account for the length composition of the population, N_l.

When the catches are obtained from fisheries covering different areas and periods, and are of some magnitude, the observed catch distributions may provide a fairly accurate estimate of the selection curve (see for instance Hovgård 1996a). Length frequency information from gillnet catches may serve as guidance when planning gillnet experiments. In recent years more elaborate methods based on interpretations of the length frequencies have been suggested (e.g. Henderson and Wong 1991, Helser et al. 1991).

5.2 DIRECT SELECTION ESTIMATES: THE POPULATION AVAILABLE TO THE GEAR IS KNOWN

Direct estimation has been used infrequently for pure selectivity studies as the population size for each length group is rarely known. When the population length distribution (N_l) is known, selection can be directly estimated from the selection equation as

$$\frac{C_{l,g}}{N_l\, E_g} = q_{l,g}\, S_{l,g}\, P_g$$

Only the product of the catchability, the fishing power and the selection can be estimated. Fujimori et al. (1996) noted in direct experiments that more shrimps were taken in the larger mesh-sizes and interpreted this as being due to an increased fishing power (a P_g effect) of the larger mesh-sizes. However, the observation may equally well be due to an increase in swimming distances with increasing shrimp size changing the catchability (a q_l,g effect). Such ambiguity can only be resolved by an appropriate experimental design. Inclusion of a ‘non-mesh selective’ gear (e.g. a small meshed trap) in the experimental design may allow an evaluation of a difference in the encounter probability by shrimp size.

The population size structure may be known if fishing takes place on a stocked population in an experimental enclosure (e.g. Fujimori et al. 1996) or by fishing on a tagged population in a natural environment (e.g. Hamley and Regier 1973). Both procedures require large resources, and for tagging experiments, extensive fisheries are needed to ensure a sufficient number of recaptures (Ricker 1975).

The population size structure has been inferred using various stock assessment techniques such as mark-recapture experiments (e.g. Hamley and Regier 1973, Borgstrøm 1989, 1992), acoustic surveys (e.g. Rudstram et al. 1987) or from fishing using non-mesh selective gears (e.g. Borgstrøm 1989, Winters and Wheeler 1990). However, all such assessment methods possess various weaknesses or limitations. For example, no gear may be assumed to be ‘non-selective’ towards the stock even when no-mesh selection is assumed (e.g. for fish above a certain size the mesh selection is 100% in trawls and purse seines). Comparing the catches of such sizes from a gillnet to the catches from a purse seine leads to:

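$$\frac{C_{l,GN}}{C_{l,PS}} = k\,\frac{q_{l,GN}\, S_{l,GN}}{q_{l,PS}}$$

where GN denotes the gillnet and PS the purse seine (whose mesh selection is taken as 1 for these sizes), and k is a constant absorbing the effects of effort and fishing power. The expression shows that the catch ratio between gears only reflects selection if q_l,GN = q_l,PS, i.e. when the probabilities of encounter for all length groups are the same for both gears.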

The uncertainty in the stock estimates influences the selection estimates. The relative uncertainty of the selectivity estimates may be written as

$$\mathrm{CV}^2(\hat{S}) \approx \mathrm{CV}^2(C/E) + \mathrm{CV}^2(N)$$

assuming that catch per unit effort (C/E) and stock (N) are estimated independently. This implies that any imprecision in the population size distribution is carried directly over into the selection estimates. The uncertainty in stock size estimates is usually difficult to assess and is rarely reported in direct selection studies. Borgstrøm (1989) lists confidence intervals for mark-recapture based stock estimates.
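A direct estimate and its approximate relative uncertainty can be sketched as follows. The function names and the numbers are hypothetical, and the uncertainty formula is the usual first-order approximation for a ratio of two independent estimates, used here as an assumption rather than as the exact expression of any cited study.

```python
import math

def direct_selection_product(catch, stock, effort):
    """Estimate q*S*P for one length class as catch / (stock * effort)."""
    return catch / (stock * effort)

def relative_cv(cv_cpue, cv_stock):
    """First-order CV of a ratio of two independently estimated quantities."""
    return math.sqrt(cv_cpue ** 2 + cv_stock ** 2)

# Hypothetical example: 120 fish of one length class caught with 10 units of
# effort from a population estimated at 4 000 fish of that length class.
qsp = direct_selection_product(catch=120, stock=4000, effort=10)

# Hypothetical CVs: 10 % on catch per unit effort, 30 % on the stock estimate.
cv = relative_cv(cv_cpue=0.10, cv_stock=0.30)
print(qsp, cv)
```

In this made-up case the combined CV is dominated by the stock estimate, which is the point made in the text: imprecise stock estimates translate directly into imprecise selection estimates.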

5.3 INDIRECT SELECTION ESTIMATIONS: THE POPULATION AVAILABLE TO THE GEAR IS UNKNOWN

Information on the size distribution of the population being fished is not available in most gillnet selectivity experiments and the selectivity must be estimated indirectly. Therefore, the size frequency distribution of the population and the selectivity parameters are estimated simultaneously. This is possible if a) fishes of a given size are equally available to nets of different mesh size fishing concurrently and, b) the selectivity depends on the fish size and mesh size only. The first assumption stipulates that the catchabilities are equal between the gears compared (q_l,G1 = q_l,G2); the second may, for instance, be met using Baranov's principle of geometric similarity.

Three different approaches have been used to obtain the selection curve using various manipulations of the selection equation.

Methods that use the ratio of the catches of a given fish size taken in different gears to reflect the difference in selection. For instance, when comparing the catches of two gears and assuming equal fishing power and effort, then $C_{l,1}/C_{l,2} = S_{l,1}/S_{l,2}$. This approach is used by the various catch-relative-to-best-catch methods (e.g. Gulland and Harding 1961, Jensen 1973) and by Holt's (1963) method.
Methods that use the ratio of the catches between gears at sizes where selection is equal to reflect the difference in available fish (again assuming equal fishing power and effort), i.e. when $S_{l_1,1} = S_{l_2,2}$ then $C_{l_1,1}/C_{l_2,2} = q_{l_1}N_{l_1}/(q_{l_2}N_{l_2})$. This approach is used in the graphical methods of McCombie and Fry (1960) and Kitahara (1971).
Methods where q_lN_l and S_l,g are estimated simultaneously. This approach is used in the statistics-based methods proposed in recent years (e.g. Kirkwood and Walker 1986, Hovgård 1996a, Millar and Holst 1997, Hovgård et al. 1999).

5.3.1 Methods Based on Catch Relative to Best Catch

These methods are based on the catch ratio of two gears being fished with the same effort:

$$\frac{C_{l,g_1}}{C_{l,g_2}} = \frac{S_{l,g_1}}{S_{l,g_2}}$$

under the assumption that the fishing power of the two gears is equal (e.g. two different mesh sizes of otherwise identical gillnets). There are two unknowns, S_l,g1 and S_l,g2, but only one can be estimated. This problem is most easily circumvented by assuming that the gear catching the most fish in a particular size group is selecting 100% of that size. The catch comparisons are made for each size group separately.

The simple procedure suggested by Jensen (1973) has been used as an example of this class of methods. The calculation scheme is illustrated in Example 5.1 where Baranov's principle of geometric similarity (see Section 3) has been used to combine the information from the different mesh sizes.

Example 5.1 Approximated selection estimates from comparing catch to best catch.

Table 5.1 shows the gillnet catches of sockeye salmon reported by Peterson (1954). The catch amounts to 6 333 salmon caught in eight different mesh sizes between 13.5 and 19 cm.

Table 5.1 Catches of sockeye salmon taken in a gillnet experiment reported by mesh size and fish size by Peterson, 1954.

Length (cm)	Mesh size (cm)
	13.5	14.0	14.8	15.4	15.9	16.6	17.8	19.0
52.5	52	11	1	1	0	0	0	0
54.5	102	91	16	4	4	2	0	3
56.5	295	232	131	61	17	13	3	1
58.5	309	318	362	243	95	26	4	3
60.5	118	173	326	342	199	100	10	11
62.5	79	87	191	239	202	201	39	15
64.5	27	48	111	143	133	185	72	25
66.5	14	17	44	51	52	122	74	41
68.5	8	6	14	23	25	59	65	76
70.5	7	3	8	14	15	16	34	33
72.5	0	3	1	2	5	4	6	15

The best catches for each size group are compiled in Table 5.2, first column. These are used in the calculation of the selection proxy, which is derived as S_l,g = C_l,g / C_l,best (Table 5.2). For fish below 56.5 cm the smallest mesh size always provides the best catches, as do the two largest meshes for sizes above 68.5 cm. For these extreme size groups the best catch can hardly be expected to be 100% selected and therefore it is reasonable to restrict the data to be used to the interval 56.5 to 68.5 cm.

Table 5.2 Jensen's selection proxies (catch divided by best catch) using the data provided in Table 5.1. Only the selection proxies for lengths 56.5 to 68.5 cm (shaded grey in the original table) are considered useful.

Length midpoint (cm)	Best catch (see Table 5.1)	Mesh size (cm)
		13.5	14.0	14.8	15.4	15.9	16.6	17.8	19.0
52.5	52	1.00	0.21	0.02	0.02	0.00	0.00	0.00	0.00
54.5	102	1.00	0.89	0.16	0.04	0.04	0.02	0.00	0.03
56.5	295	1.00	0.79	0.44	0.21	0.06	0.04	0.01	0.00
58.5	362	0.85	0.88	1.00	0.67	0.26	0.07	0.01	0.01
60.5	342	0.35	0.51	0.95	1.00	0.58	0.29	0.03	0.03
62.5	239	0.33	0.36	0.80	1.00	0.85	0.84	0.16	0.06
64.5	185	0.15	0.26	0.60	0.77	0.72	1.00	0.39	0.14
66.5	122	0.11	0.14	0.36	0.42	0.43	1.00	0.61	0.34
68.5	76	0.11	0.08	0.18	0.30	0.33	0.78	0.86	1.00
70.5	34	0.21	0.09	0.24	0.41	0.44	0.47	1.00	0.97
72.5	15	0.00	0.20	0.07	0.13	0.33	0.27	0.40	1.00
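The calculation behind Table 5.2 can be sketched in a few lines. The catches are those of Table 5.1; the variable and function names are illustrative assumptions, and the restriction to 56.5 to 68.5 cm follows the reasoning given in the text.

```python
MESHES = [13.5, 14.0, 14.8, 15.4, 15.9, 16.6, 17.8, 19.0]  # cm

# Catches by length class (rows) and mesh size (columns), from Table 5.1.
CATCHES = {
    52.5: [52, 11, 1, 1, 0, 0, 0, 0],
    54.5: [102, 91, 16, 4, 4, 2, 0, 3],
    56.5: [295, 232, 131, 61, 17, 13, 3, 1],
    58.5: [309, 318, 362, 243, 95, 26, 4, 3],
    60.5: [118, 173, 326, 342, 199, 100, 10, 11],
    62.5: [79, 87, 191, 239, 202, 201, 39, 15],
    64.5: [27, 48, 111, 143, 133, 185, 72, 25],
    66.5: [14, 17, 44, 51, 52, 122, 74, 41],
    68.5: [8, 6, 14, 23, 25, 59, 65, 76],
    70.5: [7, 3, 8, 14, 15, 16, 34, 33],
    72.5: [0, 3, 1, 2, 5, 4, 6, 15],
}

def catch_over_best_catch(catches):
    """Selection proxy per length class: each catch divided by the best catch."""
    proxies = {}
    for length, row in catches.items():
        best = max(row)
        proxies[length] = [round(c / best, 2) for c in row]
    return proxies

PROXIES = catch_over_best_catch(CATCHES)

# Points (length / mesh size, proxy) for the size range considered reliable;
# these are the points one would plot to draw a curve such as Figure 5.2.
POINTS = sorted(
    (length / mesh, proxy)
    for length, row in PROXIES.items()
    if 56.5 <= length <= 68.5
    for mesh, proxy in zip(MESHES, row)
)
print(PROXIES[56.5])
```

Plotting POINTS gives the scatter to which a selection curve is fitted by eye in this class of methods.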

Figure 5.2 A selection curve for the size range 56.5–68.5cm describing Peterson's (1954) sockeye salmon data using the catch / best catch approach suggested by Jensen (1973). The selection is plotted versus length/mesh-size. Curve fitted by eye.

As noted the use of the catch / best catch approach includes uncertainty on whether the best catch corresponds to 100% selection. Normally the fitted selection curve actually negates this assumption and Gulland and Harding (1961) therefore suggested an iterative procedure. Their procedure includes the assumption that the observed catches can be corrected by the estimated selection, i.e.

$$C^{*}_{l,g} = \frac{C_{l,g}}{S_{l,g}}$$

where S_l,g is the selection read from the selection curve. The analysis is then repeated with the new corrected set of catches (C* values) until the estimated selection curve remains constant between two iterations. A similar procedure has recently been used by Hansen et al. (1997). However, as the graphical methods are mainly useful for data exploration, it is probably not worth improving the estimate. Instead, one of the statistical methods suggested later in the chapter should be used.

5.3.2 Holt's Method

Holt's method (Holt 1963) is one of the most commonly used methods for estimating gillnet selectivity. The method has also been used to estimate longline selection (e.g. Cortes-Zaragoza et al. 1989). The method is based upon standard linear regression and can be carried out using a pocket calculator. However, it is restrictive as it assumes the normal curve as the selection model. This selection model does not conform to the principle of geometric similarity.

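The method compares the catch in the same length group taken by two gillnets that have nearly the same mesh size. The selection model is the normal curve, i.e.

$$S_{l,m} = \exp\left(-\frac{(l - k\,m)^2}{2\sigma^2}\right)$$

where k is the selection factor and σ² a measure of the width of the selection curve. Holt (1963) proposed to use linear regression on the logarithmic ratio between catches from the same fish length group.

If the same population is fished with two gillnets with the same effort (e.g. two nets of identical power but having different mesh sizes m₁ < m₂) the relative selection is:

$$\frac{C_{l,m_1}}{C_{l,m_2}} = \frac{S_{l,m_1}}{S_{l,m_2}} = \exp\left(\frac{(l - k\,m_2)^2 - (l - k\,m_1)^2}{2\sigma^2}\right)$$

and

$$\ln\frac{C_{l,m_1}}{C_{l,m_2}} = \alpha + \beta\, l$$

with regression constants (α, β) derived from the normal distribution:

$$\alpha = \frac{k^2\,(m_2^2 - m_1^2)}{2\sigma^2}, \qquad \beta = -\frac{k\,(m_2 - m_1)}{\sigma^2}$$

The selection can therefore be estimated by linear regression giving α and β. Solving for the selection parameters k and σ² gives

$$k = -\frac{2\alpha}{\beta\,(m_1 + m_2)}$$

and

$$\sigma^2 = -\frac{k\,(m_2 - m_1)}{\beta} = \frac{2\alpha\,(m_2 - m_1)}{\beta^2\,(m_1 + m_2)}$$

Re-introduction of the estimated selectivity in the selection model allows estimation of the relative population size:

$$q_l N_l \propto \frac{C_{l,m}}{S_{l,m}}$$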

Example 5.2 Estimation of selection using Holt's method

The example is based on data in Peterson (1954), see Table 5.1. The logarithmic catch ratios are given in Table 5.3.

Table 5.3 Logarithmic catch ratios, ln(C_l,m1/C_l,m2), for neighbouring combinations of mesh sizes. The data have been curtailed, as only ratios where at least 5 fish were caught in both meshes have been included in the calculations.

Meshes being compared (upper row: smaller mesh m1; lower row: larger mesh m2)

Mid length of fish (cm)	13.5	14.0	14.8	15.4	15.9	16.8	17.8
	14.0	14.8	15.4	15.9	16.8	17.8	19.0
52.5	1.5533
54.5	0.1141	1.7383
56.5	0.2402	0.5715	0.7643	1.2777	0.2683
58.5	-0.0287	-0.1296	0.3986	0.9392	1.2958
60.5	-0.3826	-0.6336	-0.0479	0.5415	0.6881	2.3026	-0.0953
62.5	-0.0965	-0.7864	-0.2242	0.1682	0.0050	1.6397	0.9555
64.5	-0.5754	-0.8383	-0.2533	0.0725	-0.3300	0.9437	1.0578
66.5	-0.1942	-0.9510	-0.1476	-0.0194	-0.8528	0.5000	0.5905
68.5	0.2877	-0.8473	-0.4964	-0.0834	-0.8587	-0.0969	-0.1564
70.5			-0.5596	-0.0690	-0.0645	-0.7538	0.0299
72.5							-0.9163

Regressing the data in Table 5.3 linearly with the fish length as the independent variable gives:

Smallest mesh	13.5	14	14.8	15.4	15.9	16.8	17.8
Largest mesh	14	14.8	15.4	15.9	16.8	17.8	19
Intercept α	3.977	10.012	5.248	6.521	6.845	20.345	6.7744
Slope β	-0.06405	-0.16661	-0.08376	-0.09713	-0.1075	-0.29908	-0.09872

This leads to the following estimates of the selection parameters for each pair of mesh sizes:

Mesh difference	0.5	0.8	0.6	0.5	0.9	1	1.2	Overall mean
Mean mesh	13.75	14.4	15.1	15.65	16.35	17.3	18.4
k	4.52	4.17	4.15	4.29	3.89	3.93	3.73	4.10
σ²	35.25	20.04	29.72	22.08	32.61	13.15	45.33	28.31
σ	5.9	4.5	5.5	4.7	5.7	3.6	6.7	5.3

The model formulation implies that the mode of the selection curve is proportional to mesh-size (i.e. the mode equals k multiplied by mesh-size), but that the spread of the selection curve (σ) is constant for all mesh-sizes. Using the overall mean values allows the selection curves to be drawn (Fig. 5.3).

Figure 5.3 Selection curves from Peterson's data using the Holt (1963) method.
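The regression step of Example 5.2 can be sketched for a single pair of meshes. The sketch below treats the two smallest meshes (13.5 and 14.0 cm) of Table 5.1 and assumes the normal selection model with the catch ratio taken as smaller mesh over larger mesh, as in Table 5.3; the function and variable names are illustrative only.

```python
import math

LENGTHS = [52.5, 54.5, 56.5, 58.5, 60.5, 62.5, 64.5, 66.5, 68.5, 70.5, 72.5]
CATCH_13_5 = [52, 102, 295, 309, 118, 79, 27, 14, 8, 7, 0]   # Table 5.1
CATCH_14_0 = [11, 91, 232, 318, 173, 87, 48, 17, 6, 3, 3]    # Table 5.1

def holt_estimates(lengths, catch_small, catch_large, m_small, m_large,
                   min_catch=5):
    """Regress ln(C_small / C_large) on length and solve for k and sigma^2."""
    # Keep only length classes with at least `min_catch` fish in both meshes.
    points = [
        (l, math.log(c1 / c2))
        for l, c1, c2 in zip(lengths, catch_small, catch_large)
        if c1 >= min_catch and c2 >= min_catch
    ]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    beta = sxy / sxx                      # slope
    alpha = mean_y - beta * mean_x        # intercept
    # Solve the normal-curve relations for the selection parameters.
    k = -2.0 * alpha / (beta * (m_small + m_large))
    sigma2 = k * (m_small - m_large) / beta
    return alpha, beta, k, sigma2

alpha, beta, k, sigma2 = holt_estimates(LENGTHS, CATCH_13_5, CATCH_14_0,
                                        13.5, 14.0)
print(round(alpha, 3), round(beta, 5), round(k, 2), round(sigma2, 2))
```

Repeating the call for each neighbouring pair of meshes and averaging k and σ² gives the overall means used to draw the selection curves.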

5.3.3 Graphical Methods Using Baranov's Principle of Geometric Similarity

The methods described in the earlier sections compared catches in the same length group taken by different gears. In the class of methods discussed in this section we compare catches in different length classes that are exposed to the same size selectivity. This class of methods is typically used for gillnets as the methods are based on Baranov's principle of geometric similarity (selection is described as a function of length/mesh-size) and assume that the fishing power is the same for all mesh-sizes.

The selection equation is now used to compare catches of two gillnets with different mesh sizes m₁ and m₂ in two different length groups l₁ and l₂:

$$\frac{C_{l_1,m_1}/E_{m_1}}{C_{l_2,m_2}/E_{m_2}} = \frac{q_{l_1} N_{l_1}\, S_{l_1,m_1}}{q_{l_2} N_{l_2}\, S_{l_2,m_2}}$$

According to Baranov's principle of geometric similarity, equal selection is found for constant values of fish-size/mesh-size (i.e. l₁/m₁=l₂/m₂). The selection equation can then be written as

$$\frac{C_{l,m}}{E_m} = q_l N_l\, S(l/m)$$

For each particular length class, q_lN_l is the same across all mesh sizes used. This implies that C_l,m/E_m is proportional to S(l/m) for a given length, l. A plot of C_l,m/E_m against l/m for each length class therefore provides a proxy of the selection curve. Curves for different length classes have the same shape, but different amplitudes because the q_lN_l values differ.

Estimation of the selection curve is done as follows:

Plot C_l,m/E_m against (l/m) by length class.
Choose an appropriate model for describing the scaled points (e.g. choose a normal distribution if the points appear as a bell-shaped distribution).
Scale the points for the individual length classes to a common magnitude. This implies ‘guessing an appropriate scaling factor’ for each length class. These scaling factors are implicit measures of 1/ q_lN_l.

When effort is identical between mesh sizes, as when fishing with groups of gillnets with different mesh sizes, the catches can be used directly instead of catch per effort for the analyses. This was done in the graphical estimations presented by McCombie and Fry (1960) and Kitahara (1971), whose methods are essentially similar. Example 5.3 illustrates the graphical approach.

Example 5.3 Estimation of the selection curve by the McCombie and Fry (1960) method

Figure 5.4 shows Peterson's gillnet salmon data (Table 5.1), where the catches per size class are plotted against length class/mesh-size on the abscissa. The curves for the individual length classes are of the same shape except for a scaling factor. Table 5.4 shows scaling values that bring the various length-classes to common amplitudes.

Figure 5.5 shows the catches scaled. The scaling factors are equivalent to 1/ q_lN_l, i.e. the reciprocal of the numbers of fish encountering the gear.

Table 5.4 Factors used for scaling the individual size classes shown on Figure 5.5 to common amplitude.

Length	52.5	54.5	56.5	58.5	60.5	62.5	64.5	66.5	68.5	70.5	72.5
Guess	52	101	260	381	320	238	169	95	63	33	5
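The scaling step can be sketched as follows, using the catches of Table 5.1 and the values of Table 5.4. The sketch assumes that each catch is divided by the tabulated value for its length class (i.e. that the tabulated values act as amplitudes); the names are illustrative.

```python
MESHES = [13.5, 14.0, 14.8, 15.4, 15.9, 16.6, 17.8, 19.0]  # cm

CATCHES = {  # Table 5.1: length class -> catches by mesh size
    52.5: [52, 11, 1, 1, 0, 0, 0, 0],
    54.5: [102, 91, 16, 4, 4, 2, 0, 3],
    56.5: [295, 232, 131, 61, 17, 13, 3, 1],
    58.5: [309, 318, 362, 243, 95, 26, 4, 3],
    60.5: [118, 173, 326, 342, 199, 100, 10, 11],
    62.5: [79, 87, 191, 239, 202, 201, 39, 15],
    64.5: [27, 48, 111, 143, 133, 185, 72, 25],
    66.5: [14, 17, 44, 51, 52, 122, 74, 41],
    68.5: [8, 6, 14, 23, 25, 59, 65, 76],
    70.5: [7, 3, 8, 14, 15, 16, 34, 33],
    72.5: [0, 3, 1, 2, 5, 4, 6, 15],
}

SCALING = {  # Table 5.4: guessed amplitude per length class
    52.5: 52, 54.5: 101, 56.5: 260, 58.5: 381, 60.5: 320, 62.5: 238,
    64.5: 169, 66.5: 95, 68.5: 63, 70.5: 33, 72.5: 5,
}

def scaled_points(catches, meshes, scaling):
    """Return (length / mesh, catch / amplitude) for every length-mesh cell."""
    points = []
    for length, row in catches.items():
        for mesh, catch in zip(meshes, row):
            points.append((length / mesh, catch / scaling[length]))
    return sorted(points)

POINTS = scaled_points(CATCHES, MESHES, SCALING)
print(len(POINTS), POINTS[0], POINTS[-1])
```

Plotting POINTS corresponds to the scaled scatter of Figure 5.5; adjusting the guessed amplitudes until the length classes overlap is the manual part of the method.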

Graphical methods are simple means to show the catch information split into selection and stock size. However, this split is conditioned by the use of some predetermined selection curve. The graphical approaches have been relatively little used in recent years (e.g. Spangler and Collins 1992, Fujimori et al. 1996).

Figure 5.4 Peterson's (1954) sockeye salmon data plotted per size class vs. transformed length. For three size classes the points have been joined to indicate that the different size classes follow similarly shaped curves differing only in amplitude.

Figure 5.5 Gillnet selection curve derived by McCombie and Fry's method, by scaling the catch data shown in Figure 5.4 to equal height.

5.4 COMPUTER BASED STATISTICAL METHODS

Several rigorous statistical methods to estimate the selection curve from indirect selection experiments have been proposed starting with Kirkwood and Walker (1987). These methods use established statistical methods for optimising a fit between a specified model and the observed catches.

The selection equation is typically simplified as most methods ignore the effort and fishing power terms (both assumed to be the same for all mesh sizes). The selection equation is, on the other hand, extended to include a specific model for the noise in the observation. The selection equation now reads:

$$C_{l,m} = qN_l\, S_{l,m} + \text{Noise}$$

The noise term includes several mechanisms such as patchy distribution of fish, variability of fish behaviour to the gear, gear performance variability and sampling variance. Note that representing noise as an additive random term is very natural for some probability distributions, such as the normal distribution, but not for others, such as the Poisson distribution. Nevertheless, here we use it to represent all random factors for consistency.

Normally the estimation problem is simplified by assuming that the observation in each length class is independent of the observation in another length class. Also for simplification, the trivial summation over several experiments is usually ignored and the problem is only formulated for a single experiment using several “gears” (e.g. panels with different mesh sizes, or sections of longline using different hooks).

Estimation may be done by either maximising the log-likelihood function (Maximum Likelihood [ML] methods, e.g. Kirkwood and Walker 1987, Millar and Holst 1997) or by minimising least squares (Least Squares [LSQ] methods, e.g. Hovgård 1996a and Hovgård et al. 1999). ML methods require an explicit assumption of the error structure in the observations. Millar and Holst (1997) suggest from experience that although the error structure is over-dispersed, Poisson distributed errors generally provide an adequate description.

The LSQ methods rely on weaker assumptions, i.e. that the statistical noise is symmetrically distributed around the expected values with a common variance. This usually cannot be assumed and therefore it is common practice to transform the data. Elliott (1983) suggests power transformation as a flexible tool useful for normalising the most commonly occurring error structures. The LSQ and the ML estimates are identical when the errors are normally distributed. This means that the LSQ estimates of logarithmically transformed observations correspond to ML estimates assuming log-normally distributed errors. Erzini and Castro (1998) found little difference between estimates derived by ML or LSQ methods applied to the same data.

5.4.1 Maximum Likelihood Methods for the Estimation of Selection Curves

These methods are based on the standard statistical estimation technique of maximum likelihood (e.g. see Lehmann 1983) that was introduced by R.A. Fisher in the 1920s. He proposed that the best estimate of the unknown parameters is the set of values under which the observed data are most probable.

Kirkwood and Walker (1987) applied the method assuming Poisson distributed errors. Millar (1992) investigated the methods based on the general exponential family of error distributions applied to trawl selection models. Millar and Holst (1997) investigated the maximum likelihood methods for gillnet selection models.

Assuming that the experiments are independent, the logarithm of the observation probability is the sum of the log-probabilities of the observed catches given a particular set of parameters. The maximum likelihood principle requires that this expression is maximised by varying the parameter values.

5.4.1.1 Poisson Distributed Observation Variance

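The procedure using Poisson distributed errors is presented below following Kirkwood and Walker (1987). Assuming Poisson errors (i.e. that the variance equals the mean) is in many ways the simplest case. The stochasticity in the observations is in this formulation:

$$P(C_{l,m}) = \frac{(qN_l\, S_{l,m})^{C_{l,m}}\; e^{-qN_l S_{l,m}}}{C_{l,m}!}$$

The log-likelihood function is

$$\log L = \sum_l \sum_m \left[ C_{l,m} \ln(qN_l\, S_{l,m}) - qN_l\, S_{l,m} - \ln(C_{l,m}!) \right]$$

and the estimation equations become for a parameter θ in the selection model

$$\frac{\partial \log L}{\partial \theta} = \sum_l \sum_m \left( \frac{C_{l,m}}{S_{l,m}} - qN_l \right) \frac{\partial S_{l,m}}{\partial \theta}$$

The estimation equations are equal to zero at the maximum, which can be used to find the maximum likelihood parameters.

The parameters are log(qN_l) and the selection parameters φ. Because of the multiplicative structure of the selection equation, it is possible to derive the estimate for the population (qN_l) for all selection models as the summed catch divided by the summed selection, $qN_l = \sum_m C_{l,m} / \sum_m S_{l,m}$ (see Appendix 1 for the derivations).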

The estimation equations for the selection parameters can only be specified when the selection model has been decided upon. There are numerous selection formulations available, see Table 3.1. The method is then to calculate logL as a function of the selection parameters and maximise that sum.

Example 5.4 Maximum likelihood estimation assuming Poisson distributed errors.

The example uses data from Peterson (1954) presented in Table 5.1. Selection is assumed to follow the log-normal distribution:

$$S_{l,m} = \exp\left(-\frac{(\ln(l/m) - \ln k)^2}{2\Psi^2}\right)$$

where k and Ψ are the mode and the spread of the log-normal selection curve, respectively.

The estimates can be obtained in EXCEL using the SOLVER add-in, which requires initial input parameters (guesses) for the two parameters in the log-normal distribution to start the fitting process. Setting the calculations up in an EXCEL worksheet requires 3 matrices with the dimensions:

columns = numbers of mesh-sizes,

rows = numbers of size classes.

Matrix 1: contains the catches by mesh size and length class.

Matrix 2: contains the calculated selection using the two input parameters. Three extra columns are added:

A: the summed catch,

B: the summed selection, and

C: the ratio A/B, which expresses the number of fish in each length class encountering the nets (q_lN_l).

Matrix 3: contains the log probabilities = C*log(q*N*S) - q*N*S - log(C!), where C is the observed catch in the cell.

The term log(C!) is calculated using the log-gamma function, called GAMMALN in EXCEL: log(C!) = GAMMALN(C+1). As this term does not depend on the parameters, it can be omitted when locating the maximum.

The sum of the values in Matrix 3 is stored in a cell that is used as the target cell in SOLVER. SOLVER is then run to maximise the value stored in this cell by changing the two parameter values of the log-normal selection model: log k and Ψ².

The selection parameters are then obtained as

Parameters

log k 1.401307

Ψ² 0.00659

Matrix 4: contains the residuals (C_l,m - q_lN_l*S_l,m). This matrix is not used in the estimation but is required for inspection of the goodness of fit.

The format of the worksheet is shown in Fig. 5.6

Figure 5.6 An example worksheet for estimating selectivity of gillnets or longlines using the maximum likelihood procedure suggested by Kirkwood and Walker (1987). Data input is required in the cells marked grey.

Choose a selection curve formulation, e.g. the log-normal, which has two parameters, ln k and sigma:

S(l,m) = exp(-0.5 (ln(l/m) - ln k)^2 / sigma^2)

Input parameters (initial guesses): ln k, sigma

Matrix 1 Observed catches (C) by mesh-size (m1 to mk) and length (L1 to Lj)

Length	m1	m2	…	mk
L1
L2
…
Lj

Matrix 2 Calculated selections (S) using the selection curve formulation with the parameters given as input. qN, the population encountering the gear, is derived as the values in column A divided by the values in column B.

Length	m1	m2	…	mk	A: Sum catch	B: Sum selection	A/B: qN
L1
L2
…
Lj

Matrix 3 Log probabilities = C*ln(qN*S) - qN*S - ln(C!). The C values are taken from the cells in Matrix 1, the S and qN values from Matrix 2. ln(C!) is derived by the EXCEL function GAMMALN(C+1). The optimisation cell holds the sum of all cells in Matrix 3.

Length	m1	m2	…	mk
L1
L2
…
Lj

Matrix 4 Residuals, r = C - qN*S, with C taken from Matrix 1 and S and qN from Matrix 2

Length	m1	m2	…	mk
L1
L2
…
Lj
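The worksheet calculation can also be sketched outside a spreadsheet. The code below evaluates the Poisson log-likelihood of Example 5.4 for given values of ln k and Ψ², with qN_l obtained as summed catch over summed selection (column A / column B). It is a minimal sketch with illustrative names, not the original worksheet, and it only evaluates the likelihood; any general-purpose optimiser could be used in place of SOLVER to maximise it.

```python
import math

MESHES = [13.5, 14.0, 14.8, 15.4, 15.9, 16.6, 17.8, 19.0]  # cm

CATCHES = {  # Table 5.1: length class -> catches by mesh size
    52.5: [52, 11, 1, 1, 0, 0, 0, 0],
    54.5: [102, 91, 16, 4, 4, 2, 0, 3],
    56.5: [295, 232, 131, 61, 17, 13, 3, 1],
    58.5: [309, 318, 362, 243, 95, 26, 4, 3],
    60.5: [118, 173, 326, 342, 199, 100, 10, 11],
    62.5: [79, 87, 191, 239, 202, 201, 39, 15],
    64.5: [27, 48, 111, 143, 133, 185, 72, 25],
    66.5: [14, 17, 44, 51, 52, 122, 74, 41],
    68.5: [8, 6, 14, 23, 25, 59, 65, 76],
    70.5: [7, 3, 8, 14, 15, 16, 34, 33],
    72.5: [0, 3, 1, 2, 5, 4, 6, 15],
}

def lognormal_selection(length, mesh, ln_k, psi2):
    """Log-normal selection curve as a function of length / mesh size."""
    return math.exp(-((math.log(length / mesh) - ln_k) ** 2) / (2.0 * psi2))

def profile_loglik(ln_k, psi2):
    """Poisson log-likelihood with qN_l replaced by sum(C) / sum(S)."""
    total = 0.0
    qn_by_length = {}
    for length, row in CATCHES.items():
        sel = [lognormal_selection(length, m, ln_k, psi2) for m in MESHES]
        qn = sum(row) / max(sum(sel), 1e-300)      # Matrix 2, column A / B
        qn_by_length[length] = qn
        for catch, s in zip(row, sel):
            mu = max(qn * s, 1e-300)               # guard against log(0)
            # Matrix 3 cell: C*ln(qN*S) - qN*S - ln(C!)
            total += catch * math.log(mu) - mu - math.lgamma(catch + 1)
    return total, qn_by_length

# Log-likelihood at the parameter values reported in Example 5.4 ...
ll_reported, qn_reported = profile_loglik(1.401307, 0.00659)
# ... and at an arbitrary, clearly shifted value of ln k for comparison.
ll_shifted, _ = profile_loglik(1.2, 0.00659)
print(ll_reported, ll_shifted)
```

Residuals for Matrix 4 follow directly as the observed catch minus qn * s for each cell.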

5.4.1.2 Log-Normal Distributed Observation Variance

Most authors have estimated gillnet selection curves by assuming Poisson distributed errors, either in ML estimations (e.g. Kirkwood and Walker 1987, Millar and Holst 1997) or implicitly by choosing a data transformation matching this assumption (e.g. Hovgård et al. 1999). Poisson distributed errors correspond to fish being caught at random. However, catches should a priori be expected to be overdispersed as fish are usually patchily distributed. The log-normal error structure where the variance is estimated independently of the mean allows for overdispersed observations.

The stochastic noise in the observations is now:

where Δl is the width of the length class. The logarithmic likelihood function now becomes

Again, as was the case for the Poisson distributed errors, it is possible to estimate the population as a function of the selection model. Since the error structure has now changed the estimator also changes (see Appendix 1 for derivation). The catch rate term can be derived directly, based on a defined selectivity model:

Again, a shape of the selection curve must be chosen; the log-normal selection model is again chosen as the example. Note that there are two “log-normal” models involved here. The first, discussed above, is the account of the noise in the observations and reflects the stochastic element of the observations. The second “log-normal” refers to the shape of the selection curve:

$$S_{l,m} = \exp\left(-\frac{(\ln(l/m) - \ln k)^2}{2\psi^2}\right)$$

where k and ψ² are the mode and the width of the log-normal selection curve.

Example 5.5 Maximum likelihood estimation assuming log-normal distributed errors. Selection is assumed to follow the log-normal distribution.

An EXCEL worksheet may be constructed following principles similar to those used in Example 5.4; however, Matrix 3 in that example is now calculated in EXCEL as:

Ln[[LOGNORMDIST(Catch+0.5, log(qNS), ψ²) - LOGNORMDIST(Catch-0.5, log(qNS), ψ²)]/Catch]

This is the log-likelihood for log-normal selection with log-normal errors. The results of the estimations are:

Log k 1.4145

ψ² 0.01128

Compared to Example 5.4 where Poisson distributed errors were assumed, the log k values are quite similar, but the width of the selection curve (ψ²) differs more. Figure 5.7A shows the estimated population sizes and Figure 5.7B shows the selection curves obtained by the two assumptions - i.e. Poisson observation noise and log-normal observation noise. The observation variance estimated from the log-normal error assumption (CV approximately 200 %) is substantially higher than what is assumed in the Poisson model (CV < 20 % for those combinations of length and mesh with significant catches).

The difference between the two selection curves may be interpreted as an uncertainty due to the true error structure being unknown.
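The Matrix 3 cell formula of Example 5.5 can be sketched in a few lines. This is an illustration under stated assumptions rather than a reproduction of the worksheet: `lognormal_cdf` stands in for the cumulative LOGNORMDIST function, "log" in the cell formula is read as the natural logarithm, the spread is passed in the position of LOGNORMDIST's standard deviation argument, and the function names are hypothetical. The sketch only handles catches of at least 1, since the cell formula divides by the catch.

```python
import math

def lognormal_cdf(x, mu, spread):
    """P(X <= x) when ln(X) is normal with mean mu and std. dev. spread."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / (spread * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

def lognormal_cell(catch, expected, spread):
    """ln[(F(C+0.5) - F(C-0.5)) / C] for one length/mesh cell.

    expected is the predicted catch qN*S and must be positive.
    """
    if catch < 1:
        raise ValueError("this sketch needs catch >= 1")
    mu = math.log(expected)  # assumption: natural log of qN*S
    prob = (lognormal_cdf(catch + 0.5, mu, spread)
            - lognormal_cdf(catch - 0.5, mu, spread))
    # Fails if prob underflows to zero, i.e. for a catch very far
    # from the expected catch relative to the spread.
    return math.log(prob / catch)

# A catch close to its expectation scores higher than one far from it.
near = lognormal_cell(5, 5.0, 0.5)
far = lognormal_cell(5, 50.0, 0.5)
```

Summing such cell values over all lengths and mesh-sizes gives the quantity that is maximised in the worksheet.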

Figure 5.7 Estimated population length distribution and selection curve using Maximum Likelihood based on Poisson and Log-normal error distributions. Considering that population length distribution estimates from indirect methods are only relative, the shape of the two length distributions, rather than the actual numbers, should be compared.

5.4.2 Regression Framework

The regression framework presented here is a generalisation of the approach used by Hovgård (1996a) and Hovgård et al. (1999). The framework allows the researcher to account for the error structure by using a power transformation of the catch data and least-squares estimation. The regression minimises the difference between the observed and the predicted catches on the transformed scale, i.e. min Σ(C^β - Ĉ^β)², where β takes a value between zero and one. For β=0.5 the transformation is approximately equivalent to assuming a random distribution of catches (Poisson distribution); β>0.5 signifies a distribution with low contagion and β<0.5 a contagious distribution. Elliot (1983) provides a short and very readable introduction to the distribution patterns of animals and the statistical treatment of survey data. The derivations of the equations given below are supplied in Appendix 1.

The method estimates the population per size class as:

This is the least-squares estimate for the qN_l's. The remaining parameters may need to be estimated by minimising the least squares sum:

The effort term (E_m) may be ignored if the gears compared are of equal size and operated for identical duration. Similarly, if the fishing power is assumed to be equal, this term may be ignored.
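The two estimation steps can be sketched as follows. The closed form used for qN_l is an assumption on our part: it is what minimising the transformed sum of squares gives for a single length class when the predicted catch is qN_l times the product of fishing power, selection and effort, and it should be checked against Appendix 1. The log-normal selection curve is the one used in the worksheet example, and all function names and numbers are hypothetical.

```python
import math

def lognormal_selection(length, mesh, ln_k, sigma):
    """S(l,m) = exp(-0.5 * (ln(l/m) - ln k)^2 / sigma^2)."""
    return math.exp(-0.5 * (math.log(length / mesh) - ln_k) ** 2 / sigma ** 2)

def estimate_qn(catch_row, weight_row, beta):
    """Least-squares qN_l for one length class on the C**beta scale.

    weight_row holds fishing power * selection * effort per mesh-size.
    """
    num = sum(c ** beta * w ** beta for c, w in zip(catch_row, weight_row))
    den = sum(w ** (2 * beta) for w in weight_row)
    return (num / den) ** (1.0 / beta)

def ssq(catches, lengths, meshes, ln_k, sigma, beta, power=None, effort=None):
    """Sum of squared transformed residuals over all lengths and meshes."""
    power = power or [1.0] * len(meshes)    # equal fishing power assumed
    effort = effort or [1.0] * len(meshes)  # equal effort assumed
    total = 0.0
    for length, c_row in zip(lengths, catches):
        w_row = [p * e * lognormal_selection(length, m, ln_k, sigma)
                 for m, p, e in zip(meshes, power, effort)]
        qn = estimate_qn(c_row, w_row, beta)
        total += sum((c ** beta - (qn * w) ** beta) ** 2
                     for c, w in zip(c_row, w_row))
    return total

# Noise-free, made-up catches generated from known parameters.
lengths, meshes = [50.0, 60.0], [10.0, 12.0]
ln_k, sigma, true_qn = math.log(5.0), 0.1, [100.0, 80.0]
catches = [[q * lognormal_selection(l, m, ln_k, sigma) for m in meshes]
           for l, q in zip(lengths, true_qn)]
ssq_at_truth = ssq(catches, lengths, meshes, ln_k, sigma, beta=0.5)
```

With noise-free data the sum of squares is essentially zero at the generating parameters and grows as ln k or sigma move away from them, which is what the minimisation exploits.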

5.4.2.1 Formulation in an Excel Spreadsheet

The spreadsheet (Figure 5.8) contains four matrices and supplementary cells for parameters. Data inputs are required in the cells marked grey. These include:

The power used in the transformation; setting the power at 0.5 is approximately the same as assuming Poisson distributed errors.
The initial parameters for the particular model chosen. The parameter values must be initially guessed.

Matrix 1: The catch per length and mesh-size.

Matrix 2: The fishing power of the various nets in the first row if such estimates are available, otherwise all power values are set at 1.0. The subsequent rows contain the calculated selections derived using the guessed selection input parameters. The last column in Matrix 2 contains the estimated qN's taken from the qN_l equation above.

Matrix 3: The estimated catches, which are calculated as the Fishing-power * Selection * estimated-qN.

Matrix 4: The transformed residuals, i.e. C^β_obs - Ĉ^β. The residuals are squared and summed in the SSQ cell.

Estimations are carried out using Excel's Solver facility. The target cell is the SSQ cell, which is to be minimised, by changing the values in the parameter cells. Solver iteratively adjusts the parameter values until a minimum is found. The output then contains the estimated parameters, the estimated selections per mesh-size, the estimated qN's and the transformed residuals.
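For readers working outside Excel, the iterative search can be imitated with a simple derivative-free routine. The sketch below is not Solver's algorithm: it is a basic pattern (compass) search that tries a step up and down in each parameter and halves the steps when nothing improves. The objective is a compact version of the SSQ cell with fishing power and effort set to 1; the closed form used for the qN's, the function names and the noise-free data are all assumptions made for illustration.

```python
import math

def selection(length, mesh, ln_k, sigma):
    """Log-normal selection curve with parameters ln k and sigma."""
    return math.exp(-0.5 * ((math.log(length / mesh) - ln_k) / sigma) ** 2)

def ssq(params, catches, lengths, meshes, beta):
    """SSQ cell: squared transformed residuals, qN fitted per length."""
    ln_k, sigma = params
    if sigma <= 0:
        return float("inf")  # keep the search away from invalid widths
    total = 0.0
    for length, row in zip(lengths, catches):
        s = [selection(length, m, ln_k, sigma) for m in meshes]
        den = sum(x ** (2 * beta) for x in s)
        if den == 0.0:
            return float("inf")  # selection underflowed for every mesh
        a = sum(c ** beta * x ** beta for c, x in zip(row, s)) / den
        total += sum((c ** beta - a * x ** beta) ** 2 for c, x in zip(row, s))
    return total

def pattern_search(objective, start, steps, tol=1e-8, max_iter=10000):
    """Minimise objective by +/- steps on each parameter in turn."""
    best, best_val, steps = list(start), objective(list(start)), list(steps)
    for _ in range(max_iter):
        improved = False
        for i in range(len(best)):
            for sign in (1.0, -1.0):
                trial = list(best)
                trial[i] += sign * steps[i]
                val = objective(trial)
                if val < best_val:
                    best, best_val, improved = trial, val, True
                    break
        if not improved:
            steps = [s / 2.0 for s in steps]  # refine the step sizes
            if max(steps) < tol:
                break
    return best, best_val

# Made-up, noise-free catches from known parameters (illustration only).
lengths = [40.0, 45.0, 50.0, 55.0, 60.0, 65.0, 70.0]
meshes = [8.0, 10.0, 12.0, 14.0]
true_qn = [20.0, 60.0, 100.0, 90.0, 50.0, 25.0, 10.0]
catches = [[q * selection(l, m, math.log(5.0), 0.15) for m in meshes]
           for l, q in zip(lengths, true_qn)]

objective = lambda p: ssq(p, catches, lengths, meshes, beta=0.5)
start = [1.5, 0.2]  # initial guesses for ln k and sigma
start_ssq = objective(start)
fitted, fitted_ssq = pattern_search(objective, start, steps=[0.1, 0.05])
```

As with Solver, the result can depend on the initial guesses, so it is prudent to repeat the search from several starting values and compare the resulting SSQ.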

Example 5.6 shows the analysis of Peterson's Fraser River salmon data (Table 5.1) using four different selection formulations: following Baranov's principle of geometric similarity they are evaluated as in the model of Holt, where all selection curves follow a normal distribution with identical spreads.

Figure 5.8 Example worksheet for estimating selectivity of gillnets or longlines using the regression framework described in section 5.4.2. Input parameters are in gray cells.

Choose selection curve formulation, e.g. log-normal, which has two parameters, ln k and sigma, i.e. S(l,m) = exp(-0.5(ln(l/m)-ln k)^2/sigma^2)	Input parameters
		Guess
	Ln k
	sigma
Choose the error structure (stabilise variance) assumed by setting the power of Beta, e.g. Beta=0.5 for Poisson distributed errors
	Beta

Matrix 1 Observed catches (C) by mesh-size (m1-mk) and length (L1-Lj)

Mesh-size	m1	m2	mk
Length	m1	m2	mk
L1
L2


Lj

Matrix 2 Calculated selection (S) using the selection curve formulation with the parameters given as input. qN, the population encountering the gear, is derived by the expression given in the text. If available, fishing power (FP) may be supplied (see section 5.5).

Mesh-size	m1	m2	mk	qN
FP	p1	p2	pk
Length
L1
L2


Lj

Matrix 3 Predicted catch = qN*S*FP. Values are taken from Matrix 2.

Mesh-size	m1	m2	mk
Length	m1	m2	mk
L1
L2


Lj

Matrix 4 Transformed residuals = catch^beta - predicted catch^beta. Catch taken from Matrix 1, predicted catch from Matrix 3.

Mesh-size	m1	m2	mk
Length	m1	m2	mk	Optimization cell
L1
L2
				(sum of squares of all cells in Matrix 4)
Lj