# 4 THE POPULATION MODEL

4.1 THE COHORT MODEL
4.2 THE CATCH MODEL
4.3 COHORT PROJECTIONS
4.4 SEPARABLE VPA
4.5 POPE’S APPROXIMATION
4.6 CONVERGENCE BEHAVIOUR OF THE VPA PROCEDURE
4.7 SOLVING THE VPA EQUATIONS
4.8 MULTISPECIES VPA
4.9 COMPARISON WITH OTHER STOCK ASSESSMENT MODELS

The state of the stock at a specific point in time is described by:

· Stock in numbers by age group (cohort)

· Mean weight per individual in the stock by age group

· Mean weight per individual in the catch by age (sometimes these weights are specified for a number of fleets)

· Maturity proportion by age group

The stock development is based on a number of processes:
· Growth. The classical description of growth is the von Bertalanffy model, but it is often described without a model, simply as an age group vector of mean weight per individual.

· Recruitment. This is often described as a function of the spawning stock biomass only: R = R(SSB) or as a simple white noise random process.

· Survival. Nage+Dt = Nage e-Z Dt. The total mortality Z is decomposed into fishing mortality (F) and natural mortality (M): Z = F + M

· Fishing mortality. (F) is a process independent of natural mortality (M). Fishing mortality is usually estimated for each year and age group combination separately, i.e. fishing mortality and the exploitation pattern vary between years.

· Natural mortality. (M) is described either as an age dependent fixed parameter M= M(age) or as a multispecies interactive process: M = M(age, other populations). Natural mortality can implicitly account for emigration. As natural mortality is often assumed at least to be a stationary process (i.e. independent of time) then emigration is assumed to have similar characteristics. The emigration rate therefore is usually assumed to depend on age only, without year-to-year variation.

## 4.1 THE COHORT MODEL

The analytical model follows a cohort from year to year. A cohort is subject to total mortality (Z), which defines the rate of decrease of the population size (P). In the fisheries version of the cohort model, this mortality is decomposed into a fisheries contribution (F), and contribution from all other sources, the natural mortality (M). These processes are assumed to operate without interference. (1) (2)

The interpretation of these mortalities is as instantaneous mortality rates and within a time interval where these rates are constant, we get an exponential decrease in numbers: (3)

or, in logarithmic form: (4)

It is in the logarithmic form that we shall use in the cohort model.

Often the final age group accumulates survivors. The minimum age for this group is often chosen either because it is not possible to distinguish individuals by size at this age as they have reached their growth limit or the numbers in this group is very small relative to others in the catch. This “plus” group consists of members surviving from the previous year as well as adding additional cohorts as they reach the group’s minimum age: (5)

where PAy = the population size in the plus group with minimum age A.

## 4.2 THE CATCH MODEL

The decomposition of the total mortality into natural and fishing mortality is required because we only observe the catch, C, and not the other deaths. The solution is derived using a similar argument as above: (6)

which, when F is constant, has the solution: (7)

or, in logarithmic terms: (8)

where Pay is the population size at the beginning of the time period y, Cay is the catch during the period, and Fay and Ma are assumed to be constant during the period. Equation 7, sometimes known as the Baranov catch equation, cannot be solved for Pay or Fay directly, as it is transcendental. It can only be solved through numerical techniques such as Newton-Raphson Iteration (Section 4.7.1) and Functional Iteration (Section 4.7.2). However, any method for solving non-linear equations can serve.

Equation 7 is, in essence, a deterministic time series population model. This can be seen more clearly by rewriting it in terms of the population sizes rather than fishing mortality: (9)

Given values for Cay, Ma and either one of the two population sizes, Pay or Pa+1,y+1, or Fay, the remaining variables can be calculated. In other words, there is only one exact solution. This contrasts with a statistical model where there is no exact solution, but we identify one that is the ‘best fit’.

As has been noted, the same model can be rewritten in a number of different algebraic forms. This is done for mathematical convenience, but makes no difference to the results. For example, we can either have the initial population, Pay, and calculate forwards or the final population, Pa+1,y+1, and calculate backwards. In either case, there is only one identical solution.

## 4.3 COHORT PROJECTIONS

By solving a series of Equation 7, the sequence of F’s and P’s for each cohort can be defined. However, to start this process, at least one F or P is needed for each cohort. This is usually the terminal F. Given the terminal F, the cohort size at the beginning of the last period can be calculated as: (10)

where the subscript ay refers to the last age or year, as appropriate. In classical VPA these parameters are pre-defined, but in tuned VPA, they are estimated from other data.

In the model, each cohort is separate and therefore can be calculated independently. In practice, data often extend across cohorts, so this independence is lost when fitting the model to real data. The cohorts are therefore usually jointly represented in a matrix form, with years forming the columns and ages forming the rows (Figure 4.1).

Figure 4.1 The age*year matrix illustrating cohorts. The arrows, moving diagonally through the matrix, indicate the relationships defined by Equation 7. Shaded cells are the terminal age group/year which require a parameter each to complete the cohort calculations.

As in the case of the single cohort, the population equations do not fully determine the entire matrices {P}ay. There is a set of A+Y-1 F’s (or P’s) that together with the natural mortality Ma cannot be estimated as there are an infinite number of solutions that fulfil the criteria set up by the equations. In classical VPA, these are provided as external parameters.

The external parameters required for the solution of these equations include the fishing mortality for the terminal year and the fishing mortality for the oldest age group for all years. An equivalent set of external parameters that are used in some formulations is the number of survivors of the terminal year and of the terminal age group. This is sometimes more convenient in arranging the calculations. It also has the advantage that it is the number of survivors of the terminal year that is essential to the projections on which the biological advice is based. It is therefore useful that the variance of survivor estimates is derived directly in the estimation procedure.

## 4.4 SEPARABLE VPA

Under normal VPA the fishing mortality for each age in each year is calculated, still leaving us short of the terminal F’s, which must be provided. In essence, the Fay are parameters which equal the number of catch-at-age data. Given the end (or start) population sizes, the full model has one solution. Introducing an additional model describing the relationship between fishing mortalities effectively can reduce this number of parameters. The separable model separates the exploitation rate from the selectivity pattern. The assumption is that the exploitation pattern remains constant over some significant period of time and only the overall level of exploitation varies between years, so that: (11)

The fishing mortality is separated into an age-dependent exploitation pattern (Sa) and an age-independent exploitation level (Ey), which may be parameters or a fishing effort data series. Now instead of estimating the individual Fay’s, we use the catch data to estimate the Sa and Ey. This can be done by substituting SaEy in Equation 11 for Fay in the population model (Equation 7) and the terminal F equation defining the final population size (Equation 10).

As well as reducing the number of parameters, the separable VPA model has the advantage in projecting the effects of future fishing activities. For example, very often the selectivity pattern is gear-dependent and it changes only slowly compared to the level of exploitation. This allows projections to based upon the variation of one parameter over time which is much easier to understand and carry out.

The separable VPA model has advantages only for longer time series with several age classes. With every one year’s data that are added, we only add one parameter to the model, but a number of data points equaling the number of age classes. In other words, the number of data points increases relative to the number of parameters leading to a more reliable fit. When this model has fewer parameters than data, there is no exact solution and it is necessary to fit the model using principles such as least-squares (Section 4.7.3).

Pope and Shepherd (1982) proposed the formal analysis as separable VPA. They discuss the simple situation when there is only catch-at-age in numbers available. This method was extended by Deriso et al. (1985) and Patterson and Melvin (1995) to include tuning data (i.e. abundance indices from research vessel surveys or commercial CPUE data).

## 4.5 POPE’S APPROXIMATION

Pope (1972) proposed a very simple approximate solution to the Baranov equation: (12)

This is, essentially, a linear approximation to the transcendental Equation 7. It depends mathematically on the approximation: (13)

In modelling terms, it assumes that all fishing takes place instantaneously in the middle of the year. This removes the effect of a declining catch rate within the year as the cohort declines. In cases of low fishing mortality, it is often a good approximation (Figure 4.2).

Figure 4.2 A comparison between Pope’s cohort approximation and the exact solution to the Baranov equation (M = 0.5). Although Pope’s model is usually thought of as an approximation to the solution to the Baranov equation, it may actually also be considered as a model in its own right. The Baranov equation assumes that fishing with constant fishing mortality is spread throughout the time period considered, e.g. a year. Pope’s cohort model describes “pulse fishing”, where the fishing takes place in a single moment in the middle of the year. Based on that model the constant, ½, would be replaced with some other value if the pulse fishing takes place at a different time of the year. For example, if fishing occurred on the 1st April, the constant would be ¼. In both these models, the natural mortality M is assumed constant over the year, perhaps unlikely because of the annual cycle in the biological productivity. For a further discussion of the interpretation of these models, see MacCall (1986).

## 4.6 CONVERGENCE BEHAVIOUR OF THE VPA PROCEDURE

The VPA procedure is simply using the summed catches weighted by the natural mortality as the stock estimate. This can be seen more clearly from Pope’s cohort model. This suggests that when fishing mortality is high the recruitment estimate is insensitive to the initial guess on survivors or fishing mortality of the oldest age group. This is because remaining fish that are assumed to be still alive will be only a small fraction of the total recruitment. The number of fish actually observed in the catches will dominate the total recruitment. This is the “convergence behaviour” of VPA (Figure 4.3).

Figure 4.3 Age 12 fish as the proportion of the cohort size at each age, illustrating the influence of terminal survivors on the VPA stock estimate. Where F/Z=0.2, the percentage of the cohort dependent on the survivors remains high, making up 22% of the original recruits. Where F is much higher, survivors contribute very little to recruitment estimate, which now statistically depends much more on the sum of past catches. If the fishing mortality is low then survey results and other “tuning” data will have a major impact on the fitted historical pattern of stock sizes and fishing mortalities, while if fishing mortality is high (compared to the natural mortality) then this influence is much less. The terminal year is of special interest for fisheries management applications. The estimate of the state of the stock in this year is the starting point for projections that form the basis for calculation of TAC’s and other possible regulatory measures that are required to meet the management objectives. So, while the historical pattern may be well estimated in a situation with high fishing mortality, this may not be the case for state of the stock in the terminal year. We shall return to this problem later.

## 4.7 SOLVING THE VPA EQUATIONS

4.7.1 Newton-Raphson Iteration
4.7.2 Functional Iteration
4.7.3 Least-Squares

### 4.7.1 Newton-Raphson Iteration

The Newton-Raphson method works efficiently and the procedure can be extracted from any computer program library on numerical methods (e.g. Press et al. 1989). The first derivative can be found directly or, as with most modern numerical methods, through a numerical approximation to the derivative.

The simplest method uses the population Equation 9, rearranged into a form where the evaluation as a function of Na is zero. (14)

and its derivative with respect to Na is: (15)

These functions are equivalent to Equation 7, but in a form to solve for Na. To obtain the value of F requires the additional calculation once all Na’s for the cohort have been obtained: (16)

The initial value for the iteration can easily be obtained from Pope’s approximation, which leads to rapid discovery of the solution (Macro 4.1).

Macro 4.1 The following procedure, written in Visual Basic, but easily translated into other languages, uses Equations 14 and 15 to solve the Baranov equation. It has three parameters, natural mortality, the catch and cohort size one age unit later.

 Function SolBaranov(M as Double, Ca as Double, Na_1 As Double) As Double Dim fx as Double, dfx as Double, Na as Double, DeltaNa as Double, Z As Double Na = Na_1 * Exp(M) ‘Calculate Na with no fishing DeltaNa = Ca * Exp(M / 2) ‘Calculate Pope’s fishing mortality approximation Do While Abs(DeltaNa) > 0.1 ‘Test current accuracy Na = Na + DeltaNa ‘Add correction to Na Z = Log(Na) - Log(Na_1) ‘Calculate total mortality fx = (1 - M / Z) * (Na - Na_1) - Ca ‘Calculate the function dfx = 1 - (Z - (Na - Na_1) / Na) * M / (Z * Z) ‘And its derivative DeltaNa = -fx / dfx ‘Calculate the new correction factor Loop SolBaranov = Na ‘Return solution End Function

The equation can be solved for F rather than Na in much the same way. However, the iteration may in some cases run wild and give negative results. It is therefore advised that instead of solving the Baranov equation directly for the fishing mortality it is rather the logarithm of the fishing mortality that should be found. Hence, (17)

The iteration scheme then becomes:

(i) Provide an initial estimate for the log fishing mortality parameter (ln F), usually around F = 0.5, ln F = D = -0.693.

(ii) Calculate an estimate of the expected catch that would have been taken with this mortality: (18)

(iii) Adjust the D parameter by the ratio between the function, f(D) and its derivative f’(D) (Equation 17): (19)

(iv) Check for convergence by seeing if new D parameter is significantly different from the previous one. If its is go back to step, else end the procedure.

Although the scheme requires an initial guess of the logarithm of the fishing mortality, the method is robust to this initial value. Even poor guesses usually converge to the correct solution. Therefore, most stock assessment software uses a fixed initial guess on the fishing mortality such as 0.5. For very large fishing mortalities (>1.5) this procedure may not converge and it would probably be better to initiate the iteration with a large value of fishing mortality, perhaps 1 to 2. Again, using Pope’s cohort approximation (Section 4.5) for a starting value would avoid this problem with large F’s and reduce the number of iterations (see Macro 10.2 for an implementation).

### 4.7.2 Functional Iteration

Solving the Virtual Population equations is often done through “functional iteration”. Functional iteration is a method that may work in specific situations when either the problem has been formulated so that functional iterations become a natural approach or where a specific functional iteration scheme has been found.

Functional iteration is the special case where it is possible to find a function so a sequence of evaluations b, g(b), g(g(b)), g(g(g(b))), ... converge towards a desired solution. The following algorithm illustrates how functional iteration might be used to solve for fishing mortality in VPA.

(i) Provide the terminal fishing mortality and calculate the populations for each cohort for the oldest age group: (20)

(ii) With the next oldest age group, calculate the start cohort size, using Pope’s approximation: (21)

(iii) Calculate the fishing mortality using the new population size estimate: (22)

(iv) Calculate a new population size estimate based on this fishing mortality: (23)

(v) Check for convergence (i.e. current Pa or Fa evaluations have not changed much from the last iteration, less than say 0.1%). If the function has not converged, go back to step (iii).

(vi) Go to step (ii) unless all ages are complete.

Macro 4.2 The following Visual Basic procedure solves the Baranov equation using functional iteration. This function requires the same parameters as Macro 4.1, but finds the solution in a different way. Functional iteration appears slower than the Newton-Raphson procedures, but avoids the need to calculate the derivative, which may be useful in some circumstances.

 Function SolVPAFI(M as Double, Ca as Double, Na_1 As Double) As Double Dim Na as Double, PrevNa as Double, F as Double, Z As Double Na = Na_1 * Exp(M) + Ca * Exp(M / 2) ‘Calculate first approximation to Na Do F = Ca * (Log(Na) - Log(Na_1)) / (Na - Na_1) ‘Calculate a new F Z = F + M ‘Calculate new Z PrevNa = Na ‘Record previous result Na = Ca * Z / ((1 - Exp(-Z)) * F) ‘Calculate a new population size Loop While Abs(Na - PrevNa) > 0.1 ‘Test current accuracy SolVPAFI = Na ‘Return solution End Function

The procedure may, however, not always converge dependent on how close the initial guess of the population size is to the solution. The iteration can diverge to ±¥ or oscillate between two non-solution values. A procedure fully implementing this technique would need to avoid these problems.

The reason why functional iteration works with the VPA equations can be shown by considering the dynamic behaviour of the function around its solution. A convergent sequence forms for the VPA equation if it is solved for fishing mortality, such that: (24)

Let F* be the solution, and let F = F*- DF be the initial guess. By substitution, this becomes: (25)

This can be approximated using Taylor expansion around the true value F*. (26)

As the factor on DF lies between 0 and 1, the sequence must converge to F* if DF is sufficiently small such that higher terms of the Taylor series can be neglected. Hence, by sequentially replacing P and F, the function will converge.

In the case of the VPA equations, functional iteration is not necessary, so why use it? The functional iteration scheme avoids having to find the derivatives of first or second order required by the Newton-Raphson iteration. Where parameters are related through the functions, it is necessary to construct matrices of the first or second order derivatives (see Section 6.2). This can be costly in computer time, if carried out numerically and or in researchers’ time if carried out analytically. Furthermore this matrix (although symmetrical) needs to be inverted, another costly numerical operation for large matrices. The functional iteration methods were therefore derived to limit the number of evaluations and inversions of the derivative matrix required for obtaining the solution.

### 4.7.3 Least-Squares

The alternative method is to fit the expected to the observed catches. Following the least-squares approach, the estimation function to be minimised is: (27)

The model involved for the catch in numbers, Cay, is the Baranov equation: (28)

Notice now that the catch variable on the left-hand side is the model catch, not observed catch. The result is that the model has become a statistical description rather than a deterministic population model. The cohort model links the populations at different times together in the normal way (Equation 4).

The catches in the population model are the expected rather than observed catches used in the classical VPA population model. However, if only catch data are available the model remains over-specified and will fit exactly in any case. So, if the terminal F parameters are fixed, the errors can all be reduced to zero, the expected and observed catches are the same and the model is exactly equivalent to a classical VPA. In practice, this method depends on how much weight is given to the catch series compared to other tuning indices (see Section 6.2).

## 4.8 MULTISPECIES VPA

Multispecies VPA (MSVPA) includes estimation of some part of the natural mortality from additional information on predator consumption. In normal VPA, natural mortality parameters are assumed to be constant from year to year, although age dependent natural mortality is sometimes included in the assessment. MSVPA relaxes this assumption and includes estimation of some part of the natural mortality as both age and year-to-year dependent. The proportion included is the predation part of the natural mortality, which may lead to assessments that are more accurate if it is the dominating component of natural mortality. MSVPA uses the usual catch-at-age data, but supplemented with data on stomach contents and annual feeding rates for the predators. MSVPA formulates the catch equation and the predation equations in parallel and solves the resulting set of coupled equations. This set of equations becomes quite large. Where the single species approach solves separate sets of equations for each stock, the multispecies approach solves all these equations simultaneously. Sparre (1991) and Magnusson (1995) present the procedure in detail.

The basis of MSVPA is the Baranov catch equation: C = F´N, where C is the catch rate in numbers from a cohort, F is the fishing mortality (instantaneous rate) and´N is the average number of survivors in that cohort. The same mathematical expression applies in the case of the number of deaths caused by natural causes including predation: D = M´N, where D is number of fish which are dying from natural causes and M is the natural mortality (instantaneous rate). D is not observed directly, but it is assumed that an important proportion of fish that die from natural causes are eaten by other fish. This proportion can be calculated based on observations of predators’ food and feeding rate.

Natural mortality is separated into two components M = Mp + Mo. MSVPA assumes that Mo is a (small) constant while, Mp is calculated from the estimated stock sizes and stomach contents. MSVPA calculates the number of prey species k eaten by each predator i as: (29)

The model for Mpki is based on the population size of prey and predators and on “suitabilities” (Uki), which measure how suitable prey k is as food for a predator i. These suitabilities are proportions and add up to 1 for each predator species. The MSVPA defines a “suitable food biomass” of prey k for predator i as´Nk wk Uki. Here, wk is the average body weight of prey species k. This quantity of suitable biomass is linked to the stomach content data by: (30)

where Ski is the weight of prey k in the stomach of predator i as a proportion of the total weight of stomach contents of predator i. This is assumed to be equal to the suitable biomass of prey k as a proportion of the total suitable biomass available to the predator.

Then finally introducing the total annual food consumption of each predator i (Ri), we can calculate the total number of prey k eaten by predator i: (31)

from which the predation part of the natural mortality can be calculated as: (32)

and finally the total predation mortality is found by summing over all predators: (33)

The MSVPA assumes that food preference is well defined and remains constant between years. Therefore, it is possible to apply stomach content data from one period to the entire time series. This is equivalent to assuming that the suitabilities Uki are time independent.

The calculations for solving these sets of equations are organised as a functional iteration, but consist of two nested iterations. The inner iteration solves the catch equations for given M, i.e. find F and N in The outer iteration finds Mpki. The procedure works year-by-year starting from the most recent year and progressing backwards in time. As opposed to the ordinary VPA, the calculations have to be done year-by-year rather than cohort-by-cohort.

In the calculation of Mpki, it is possible to calculate the suitabilities directly from the stomach content data and the average population sizes; the formula is (34)

## 4.9 COMPARISON WITH OTHER STOCK ASSESSMENT MODELS

The cohort model is very similar to most other time series ‘depletion model’ techniques, such as the Leslie, Delury and Biomass Dynamics models. All these models rely on fishing to decrease the stock size, so that an index can be matched against removals of fish and the impact of fishing measured. Where catch-at-age methods do much better than these models, which describe entire undifferentiated populations, is that they remove the parameter confounding with recruitment (other than by assuming no recruitment). By following each age class, and assuming no immigration, a much better estimate can be obtained of fishing mortality even with little change in fishing effort.

VPA is closely related to other catch-at-age methods. Statistical catch-at-age methods, such as catch curves and Doubleday’s (1976) method also rely on modelling cohort mortality in much the same way (as in the least-squares approach of Section 4.7.3). The crucial difference is that in Doubleday’s method the catches used in the population model are not the observed catches, but expected catches from the model itself. This has the effect of smoothing the population model through the data. A possible advantage is that the model can be fitted to data in its entire form. The disadvantage is that it may not take full account or advantage of perturbations induced by random changes in catches, so that this source of process error is not accounted for. An extreme example of the error this might incur is indicated by the fact that it is possible to fit statistical models where the cohort size is smaller than the observed catch, the difference being treated as “observation error”. In VPA, this is not possible. This is not a disadvantage where catches are estimated and the error might be genuine. Then the issue becomes the relative weighting to give catch data over other data types.