M. Rout

Freshwater Aquaculture Research & Training Centre

Dhauli

Composite fish culture, or polyculture of indigenous carps together with the Chinese carps and common carp, has been found to give a high production of 9 to 10 tons/ha in a year at a stocking density of 7500 to 10,000 and with inputs of feed, fertilizers, organic manures, lime etc. Statistical evaluation of the causative factors, i.e. ecological conditions, feed utilization, applied and available nutrients etc., will help in optimizing the production process. This lecture identifies such functional relationships between the response variable and its causative variables, and describes the procedure for estimating the parameters of the resulting equations.

The parameters are estimated from a sample taken
from the population. Consider the problem of predicting a
variable Y from p variables X_{1}, X_{2} … X_{p} (p > 1). The
variable Y is called the dependent variable, while the variables
X_{1}, X_{2} … X_{p} are called independent variables. The model
equation expressing the relationship between the dependent
and independent variables is given by

Y = f (x_{1}, … x_{p}; β_{1}, β_{2} … β_{m}) + e where

β_{1}, β_{2} … β_{m} are unknown parameters and e is an error
variable representing the error incurred in approximating y
by the regression function. If f (x_{1}, … x_{p}; β_{0}, β_{1} .. β_{p})
= β_{0} + β_{1} x_{1} + … + β_{p}x_{p}, then the multiple linear regression
model is

Y = β_{0} + β_{1} x_{1} + β_{2} x_{2} + … + β_{p}x_{p} + e

and the polynomial regression model is

Y = β_{0} + β_{1} x + β_{2} x^{2} + … + β_{p}x^{p} + e.

The least squares method is employed to estimate the parameters such that

S = Σ (Y_{i} - β_{0} - β_{1}x_{1i} - … - β_{p}x_{pi})^{2} is a
minimum. The sum of squares of deviations ‘S’ is a measure
of the error incurred in fitting the sample with the model. It
has been observed that the higher the number of independent
variables, the better the prediction is likely to be.
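The least squares estimation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy is available; the function name `fit_linear_model` and the demonstration data are hypothetical and are not taken from any culture experiment.

```python
import numpy as np

def fit_linear_model(X, y):
    """Least squares estimates (b0, b1, ..., bp) and the minimised sum of squares S."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # A leading column of ones carries the intercept b0.
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return beta, float(residuals @ residuals)

# Hypothetical, noise-free data generated from Y = 2 + 3*x1 - 1*x2.
X_demo = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]])
y_demo = 2.0 + 3.0 * X_demo[:, 0] - 1.0 * X_demo[:, 1]
beta_demo, S_demo = fit_linear_model(X_demo, y_demo)
print(beta_demo, S_demo)
```

Because the demonstration data contain no error term, the fitted coefficients recover the generating values and S is essentially zero; with real pond data S would be positive.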

As an illustration, a simple model can be written
for composite culture experiments as

W = a+f+F

Where ‘a’ is the mean natural production of pond, ‘f’ is the
component of production added to it by fertilizers and ‘F’
is the component added to it by feed. W represents the
expected production. A model of the above type can be
expressed as

W_{1} = a + f + F + e, where ‘W_{1}’ is the observed production
and ‘e’ is the departure of the observed production ‘W_{1}’ from
the model-predicted value ‘W’.

Consider, for example, the relation between the fertilizer applied and the yield of a fish crop. If applying 100 kg of NPK to a fish pond gives an increase in yield of 200 kg/ha, it does not follow that 500 kg would give an increase of 1000 kg/ha, or that 10 kg would give an additional yield of 20 kg/ha. With higher and higher applications of fertilizer or feed, production may cease to respond linearly and may instead approach an ultimate weight asymptotically.

The natural parameters and other major parameters for achieving production in culture experiments have been summarised in subsequent chapters.

In culture experiments, the standing crop of fish must be estimated from time to time in order to adopt judicious management measures for higher production as well as a lucrative profit. The inaccessibility of fishes to direct observation in their natural habitat raises problems for the biologist. A basic understanding of population dynamics, production and yield is therefore essential for monitoring the numerical changes which occur in a population in the course of time.

This estimate is based on drawing a sample of marked and unmarked
fishes from a population with a known number of marked
and an unknown number of unmarked fishes. The ratio of marked
fish recaptured to the total number of fish caught is assumed
to be equal to the ratio of the total marked fish in the
pond to the total population, i.e.

r / c = m / N̂, or N̂ = mc / r

where N̂ = estimated population, m = total no. of marked
fish in the population, c = total no. of fish in the sample,
r = number of marked fish recaptured.

The estimated population and production of fish are to be calculated from time to time to study the survival and the quantity of feed required for the fish. In a culture experiment, if m = 400, c = 600 and r = 100, then the estimated population is N̂ = (400 × 600) / 100 = 2400.
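The marking and recapture calculation can be checked with a short sketch. It assumes only the ratio stated in the text (marked fish recaptured to sample size equals marked fish to total population); the function name `petersen_estimate` is an illustrative choice.

```python
def petersen_estimate(m, c, r):
    """Estimated population N = m * c / r.

    m: marked fish in the pond, c: fish in the sample,
    r: marked fish recaptured in that sample.
    """
    if r <= 0:
        raise ValueError("at least one marked fish must be recaptured")
    return m * c / r

# Values from the example in the text.
n_hat = petersen_estimate(m=400, c=600, r=100)
print(n_hat)  # 2400.0
```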

In a short-term sequence of identical fishing operations
in a fish pond, the successive catches decline in a
regular manner. If two hauls (C_{1} and C_{2}) are taken, an
estimate (N̂) based on just two trials can be expressed
in the form

N̂ = C_{1}^{2} / (C_{1} - C_{2})

If there are three hauls C_{1}, C_{2} and C_{3}, each catch
not allowed back to pond, then the estimated population

The sequence of expected catches E(C_{1}), E(C_{2}), E(C_{3})
thus follows the pattern Nq', Nq'(1-q'), Nq'(1-q')^{2},
where q' is the probability of a fish being caught in each
trial. The estimate N̂ is based on this catch model.
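For two hauls, the catch pattern above can be solved directly. Setting the observed catches equal to their expectations, C_{1} = Nq' and C_{2} = Nq'(1-q'), gives q' = (C_{1} - C_{2}) / C_{1} and N = C_{1}^{2} / (C_{1} - C_{2}). The sketch below implements that two-haul solution only; the catches of 120 and 80 fish are hypothetical.

```python
def two_catch_removal(c1, c2):
    """Population and catchability from two successive removal hauls.

    Solves C1 = N*q and C2 = N*q*(1 - q) for N and q.
    """
    if c1 <= c2:
        raise ValueError("the second catch must be smaller than the first")
    q = (c1 - c2) / c1
    n_hat = c1 ** 2 / (c1 - c2)
    return n_hat, q

# Hypothetical hauls: 120 fish, then 80 fish.
removal_n, removal_q = two_catch_removal(120, 80)
print(removal_n, removal_q)
```

The estimate breaks down when the second haul is not smaller than the first, which is why the function rejects that case.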

Rate of daily feeding is calculated either as a percentage of the total weight of the standing crop or on the basis of the daily increase in total weight of standing crop. Both the methods are subject to two sources of error. The first is the possible error in estimating the average weight of the population and the second is the error in estimating the number of individuals in a population at one time.

Estimate of standing crop may be made from total amount
of daily feed which is expressed as W = IR^{N} or logW = log I +
N logR.

W = total weight of standing crop at the end of N intervals of feeding, N = number of ‘T’ intervals during which fish have been fed at rate ‘P’, I = initial weight of fish and R = rate of change in weight during interval T.

Example: Suppose 40 kg of silver carp are stocked and given supplemental feed six days a week at 3 per cent of the weight of the standing crop for a period of 30 weeks, and R = 1.106. Then I = 40, N = 30, T = 1 week, P = 0.03, and log W = log 40 + 30 log 1.106 ≈ 2.9147, so W ≈ 821.7 kg.
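The standing crop formula W = IR^{N} is easy to evaluate directly, which avoids rounding in logarithm tables. A minimal sketch, with the illustrative function name `standing_crop`:

```python
import math

def standing_crop(initial_weight, rate, n_intervals):
    """W = I * R**N: total weight after N feeding intervals."""
    return initial_weight * rate ** n_intervals

# Silver carp example: I = 40 kg, R = 1.106 per week, N = 30 weeks.
crop_w = standing_crop(40.0, 1.106, 30)
crop_log_w = math.log10(crop_w)  # equals log I + N log R
print(round(crop_w, 1), round(crop_log_w, 4))
```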

Production is defined as the total elaboration of fish tissue during any time interval. It may be measured in terms of wet weight, dry weight, nitrogen content or energy content. Production studies are based either on measurements of weight or energy contents.

The average size or average weight of fish is calculated when n fishes are weighed in one bulk; the mean is then the sum of the individual values divided by the total number of individuals.

Suppose there are n different values x_{1}, x_{2}, x_{3} … x_{n}
which occur respectively f_{1}, f_{2} … f_{n} times, so that
N = f_{1} + f_{2} + … + f_{n}. Then the mean x̄ of the
N values of the variable is given by

x̄ = (1/N) Σ f_{i} x_{i}

since the value x_{i} occurs f_{i} times. This is also called
the weighted mean of the different values x_{i}, whose weights
are their frequencies f_{i}.

The variance of the weight distribution is estimated as s^{2} = Σ f_{i} (x_{i} - x̄)^{2} / (N - 1), which may be taken as an indication of the extent to which the values of x are scattered.
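The weighted mean and the variance of a frequency table can be computed as follows. This sketch assumes the usual sample variance with divisor N - 1, which is an assumption rather than something the text states; the weights and frequencies are hypothetical.

```python
def weighted_mean_and_variance(values, freqs):
    """Mean and sample variance of values x_i occurring f_i times."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    # Assumed divisor: N - 1 (sample variance).
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / (n - 1)
    return mean, variance

# Hypothetical sample: individual weights (g) and how often each was recorded.
weights_g = [40, 50, 60]
counts = [2, 5, 3]
mean_w, var_w = weighted_mean_and_variance(weights_g, counts)
print(mean_w, var_w)
```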

Production can be estimated for each interval in a well arranged composite culture pond where natural mortality is almost absent except under severely adverse conditions such as depletion of oxygen, disease and poaching.

When the stock size remains constant, i.e. there is no mortality during the period of culture, the production is N × (W_{2} - W_{1}), where the W's are the average weights at the start (W_{1}) and end (W_{2}) of the interval. When there is mortality during the period of culture and its time is not known, the production is given by N_{1}(W̄ - W_{1}) + N_{2}(W_{2} - W̄), where 1 and 2 are suffixes for the start and end points of the time interval and W̄ = antilog [(log W_{1} + log W_{2}) / 2].

For periods in which sudden mortality is observed, and the average weight (W) and number (N = N_{2}) are recorded immediately after the mortality, production is given by N_{1}(W - W_{1}) + N_{2}(W_{2} - W).

Example:- Suppose in a composite culture experiment, the growth and survival of fishes are as follows:

| Time | Number present | Average weight (g) |
|---|---|---|
| Stocking | 1000 | 10 |
| After one month | 600 | 40 |
| After six months | 600 | 800 |
| Harvest after 10 months | 500 | 1000 |

Production can be estimated in two ways:

i) General method:

Production = N_{2} × final weight - N_{1} × initial weight

= 500 × 1000 - 1000 × 10 = 490,000 g = 490 kg.

ii) Suggested method:

Production for the 1st month = N_{1}(W̄ - W_{1}) + N_{2}(W_{2} - W̄), with W̄ = 20 g, the geometric mean of 10 g and 40 g

= 1000(20 - 10) + 600(40 - 20) = 22,000 g = 22 kg

Production for 2nd to 6th month = 600 (800 - 40) = 456 kg

Production for 7th to 10th month = 500 (1000 - 800) = 100 kg

Total production = 22 + 456 + 100 = 578 kg

The difference of 578 - 490 = 88 kg is production that is lost from the estimate when only the overall calculation is used.
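The two calculations can be reproduced with a short sketch. It uses the interval formula N_{1}(W - W_{1}) + N_{2}(W_{2} - W) throughout. For the first month W is taken as the geometric mean of the start and end weights. For the last interval W is set equal to the starting weight of 800 g, which is an assumption made here because it reproduces the 100 kg figure in the example; the function names are illustrative.

```python
import math

def interval_production(n1, w1, n2, w2, w_mid):
    """Production (g) over one interval: N1*(W - W1) + N2*(W2 - W)."""
    return n1 * (w_mid - w1) + n2 * (w2 - w_mid)

def geometric_mean(w1, w2):
    """antilog of the mean of log W1 and log W2."""
    return math.sqrt(w1 * w2)

# General method: final biomass minus initial biomass, in kg.
general_kg = (500 * 1000 - 1000 * 10) / 1000.0

# Suggested method, interval by interval (weights in g).
first = interval_production(1000, 10, 600, 40, geometric_mean(10, 40))
second = interval_production(600, 40, 600, 800, geometric_mean(40, 800))
third = interval_production(600, 800, 500, 1000, 800)  # assumed W = W1
stepwise_kg = (first + second + third) / 1000.0

print(general_kg, stepwise_kg, stepwise_kg - general_kg)
```

When the number of fish does not change over an interval, the choice of W has no effect and the formula reduces to N × (W_{2} - W_{1}), as in the second interval.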

As an individual fish in a stock increases in size, the numbers in the stock are at the same time reduced by mortality. The mass of a whole stock at a given time is determined by the resultant of the forces of growth and of mortality.

In practice, neither growth nor mortality is likely to be exponential for any long period of time, but these expressions can be used as approximations, which are better the shorter the interval. Any growth curve can be treated in this way if it is divided up into short segments.

When growth is considered to be exponential, its
instantaneous coefficient G is estimated by

G = (log_{e} W̄_{2} - log_{e} W̄_{1}) / (t_{2} - t_{1})

where W̄_{1}, W̄_{2} are the mean weights
of fish at times t_{1} and t_{2}, respectively.

Let W_{j}(t) be the body weight of an average
carp of species j at time t. Then the growth of the fish per time
unit is described by a model which is also called an INPUT/OUTPUT
model, i.e. dW_{j}(t)/dt = INPUT - OUTPUT

Example: A fish in a culture experiment grew from 0.1 to 1.5 kg
in unit time, say a year. The absolute growth is then 1.5 - 0.1
= 1.4 kg per year. Its relative growth or annual growth rate
is 1.4 / 0.1 = 14.0, i.e. 1400% per year. The instantaneous rate of
growth is log_{e} (1.5 / 0.1) = log_{e} 15 ≈ 2.71.
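The three growth measures in the example can be computed together. A minimal sketch, assuming exponential growth over the interval for the instantaneous rate; the function name `growth_rates` is illustrative.

```python
import math

def growth_rates(w1, w2, t1=0.0, t2=1.0):
    """Absolute, relative and instantaneous growth between times t1 and t2."""
    absolute = (w2 - w1) / (t2 - t1)          # weight gained per unit time
    relative = (w2 - w1) / w1                 # gain as a fraction of initial weight
    instantaneous = (math.log(w2) - math.log(w1)) / (t2 - t1)
    return absolute, relative, instantaneous

# Example from the text: 0.1 kg to 1.5 kg in one year.
abs_g, rel_g, inst_g = growth_rates(0.1, 1.5)
print(round(abs_g, 3), round(rel_g, 3), round(inst_g, 3))
```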

The most useful manner of expressing the decay (decrease) of an age group of fishes through time is by means of exponential rates, expressed as follows:

N_{t} = N_{0} e^{-Zt}, where ‘N_{0}’ is the initial number of
fishes at time t = 0, ‘N_{t}’ is the number of fishes remaining
at the end of time ‘t’ and ‘Z’ is the instantaneous
rate of total mortality.

Taking logarithms, Z = (log_{e} N_{0} - log_{e} N_{t}) / (t_{2} - t_{1}), where N_{0} and N_{t} are the numbers of fishes present at times t_{1} and t_{2}.

The difference G - Z is the net rate of increase in biomass
during the time t_{2} - t_{1}. Also Z = M + F, where M is the instantaneous
rate of natural mortality and F is the instantaneous
rate of fishing mortality. When F = 0, Z = M, which means that
total mortality and natural mortality have the same value
where there is no fishing. Modelling population decay in
culture fishery by an exponential N_{t} = N_{0} e^{-Zt} is hardly
valid except in the early phase of stocking, when sizeable
mortalities have been reported due to weak or undersized fish
at stocking.
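The link between G, Z and the change in biomass can be illustrated numerically. The sketch below assumes exponential growth and mortality within the interval and uses the first month of the earlier production example (1000 fish of 10 g falling to 600 fish of 40 g); the function name is illustrative.

```python
import math

def instantaneous_rate(start, end, duration):
    """(ln end - ln start) / duration; negative when the quantity declines."""
    return (math.log(end) - math.log(start)) / duration

# First month of the production example: numbers and mean weights (g).
n0, n1 = 1000, 600
w0, w1 = 10.0, 40.0

Z_rate = -instantaneous_rate(n0, n1, 1.0)   # total mortality per month
G_rate = instantaneous_rate(w0, w1, 1.0)    # growth per month
net_rate = G_rate - Z_rate                  # net rate of biomass increase

# exp(G - Z) should equal the ratio of end biomass to start biomass.
biomass_ratio = (n1 * w1) / (n0 * w0)
print(round(Z_rate, 4), round(G_rate, 4), round(math.exp(net_rate), 4), biomass_ratio)
```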

Production may be estimated graphically when data on growth and survivorship of fish are available over the time span of interest. In this method, the number of individuals (N) in the population at successive instants of time is plotted against the mean weight (W̄) of an individual at the same instants.

A hypothetical example from an experimental pond is illustrated in Fig.-1.

Fig. 1. Seasonal changes in numbers (N) and average weight (W̄) of fish in a pond.

Stocking was done on June 1, and sampling was carried out on the 1st of every month. The estimates of population number and mean size were plotted on the graph. Production in August is equal to the shaded area beneath the curve. Biomass on August 1 is shown on the graph. Annual production is the entire area beneath the curve from June 1 to June 1, a year later. Areas can be measured on the graph by using a planimeter. If the curve is plotted on graph paper, areas can be measured by counting squares.
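Where the sampling data are held in a computer, the area beneath the curve of numbers against mean weight can be approximated numerically instead of with a planimeter. The sketch below uses the trapezoidal rule, which is a substitute for the graphical measurement and not the method of the text; the data points are hypothetical and are not read from Fig. 1.

```python
def area_under_curve(mean_weights, numbers):
    """Trapezoidal area under N plotted against mean weight (same units as N * W)."""
    if len(mean_weights) != len(numbers):
        raise ValueError("each mean weight needs a matching number")
    area = 0.0
    for i in range(1, len(mean_weights)):
        width = mean_weights[i] - mean_weights[i - 1]
        average_height = (numbers[i] + numbers[i - 1]) / 2.0
        area += width * average_height
    return area

# Hypothetical sampling results: mean weight (g) and number of fish present.
sample_weights = [10, 40, 800, 1000]
sample_numbers = [1000, 600, 600, 500]
production_g = area_under_curve(sample_weights, sample_numbers)
print(production_g / 1000.0, "kg")
```

The trapezoidal rule joins successive sampling points with straight lines, so more frequent sampling gives a closer approximation to the true curve.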

Techniques of multivariate data analysis relate to the study of interrelationships between inputs and output. Ecological variables and input variables of an environment, which are mostly interrelated, can be evaluated. Use of different inputs (independent variables) in ponds may enable us to predict the output (dependent variable). This requires the application of multiple regression techniques. The more common data interpretation techniques are covered here.

Multiple regression provides an analysis of the relations among two or more predictor variables and a single criterion variable. One result of the analysis is an equation for predicting the criterion score of a subject from its known set of predictor scores. If Y is the criterion variable, the relationship between Y and the X's is formulated as a linear model, i.e.

Y = a_{o} + b_{1}x_{1} + b_{2}x_{2} + … + b_{m} x_{m} + e

or Y = a_{o} + ∑ b_{i} x_{i} + e

where a_{o} is the overall mean corresponding to mean levels of
the x's, the b's are regression coefficients and e is the residual
“error” term, the departure of individual Y values from the model.

To illustrate the model in composite culture experiments,
we can treat fish growth (or production) Y over a
period as criterion or response variable determined by the
joint effects of predictor variables such as area of pond (x_{1}),
number of fish stocked (x_{2}), fertilizers applied (x_{3}), manures
put (x_{4}), feed given (x_{5}). Then the multiple regression
equation is Y = a+b_{1}x_{1}+b_{2}x_{2} + b_{3}x_{3} + b_{4}x_{4} + b_{5}x_{5} + e

It is assumed that this regression equation provides an acceptable approximation to the true relationship between Y and the x's. In other words, Y is approximately a linear function of the x's, and e measures the discrepancy in that approximation.

In the stepwise approach, aquaculture production or growth is related to various factors in a sequential fashion, with the sequence being directly related to the explanatory power of each factor. The procedure allows one to observe the partial effect of a factor while at the same time considering only those factors for which the data suggest a relationship. It is conceivable that, after considering four or five of the factors according to their magnitude of correlation, the remaining factors do not add much to the explanation of production or growth.

This method is highly laborious and is generally handled on computer. Statistical software package on stepwise regression is available mostly in FORTRAN language.
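A simple forward version of the stepwise idea can be sketched in Python. This is an illustration under stated assumptions, not a reproduction of any FORTRAN package: predictors are added one at a time according to the reduction in the residual sum of squares, and the stopping rule (a gain in R^{2} below `min_gain`) is a hypothetical choice. The data are simulated.

```python
import numpy as np

def residual_ss(X, y):
    """Residual sum of squares of a least squares fit with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)

def forward_selection(X, y, min_gain=0.01):
    """Add the predictor that most reduces the residual sum of squares,
    stopping when the gain in R^2 falls below min_gain."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    total_ss = float(((y - y.mean()) ** 2).sum())
    chosen, remaining, current_ss = [], list(range(X.shape[1])), total_ss
    while remaining:
        best = min(remaining, key=lambda j: residual_ss(X[:, chosen + [j]], y))
        new_ss = residual_ss(X[:, chosen + [best]], y)
        if (current_ss - new_ss) / total_ss < min_gain:
            break
        chosen.append(best)
        remaining.remove(best)
        current_ss = new_ss
    return chosen, 1.0 - current_ss / total_ss

# Simulated data: only columns 1 and 3 actually influence y.
rng = np.random.default_rng(0)
X_sim = rng.normal(size=(30, 4))
y_sim = 5.0 + 2.0 * X_sim[:, 1] - 3.0 * X_sim[:, 3] + 0.01 * rng.normal(size=30)
selected, r_squared = forward_selection(X_sim, y_sim)
print(selected, round(r_squared, 4))
```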

Regression analysis by least squares technique implicitly assumes that the predictor variables are uncorrelated with each other. With this assumption in mind, it is usual to interpret a regression coefficient as measuring the change in the response variable when the corresponding predictor variable is increased by one unit and all other predictor variables are held constant.

If there are p variables, i.e. 1 response and p - 1 predictors, then the data set involves p means, p variances and p(p - 1)/2 correlations to estimate and interpret. One way out of this is to search for uncorrelated predictor variables. Since non-zero correlation coefficients only add to the number of parameters and to the interpretation problems, we seek variables that have zero correlations.

Let the regression model stated in terms of original standardized variables be

Y = β_{1}X_{1} + β_{2}X_{2} + … + β_{p}X_{p} + V

The above equation may be written in terms of the principal
components as Y = α_{1}Z_{1} + α_{2}Z_{2} + … + α_{p}Z_{p} + V.

The α's and β's may be obtained by the regression of Y against the principal components Z's or against the original standardized variables X's. Principal components regression is used as a means of detecting and analysing the problem of multicollinearity. The final regression estimates are always restated in terms of the β's for interpretation.

A suitable computer program for regression analysis by the principal components approach has been developed. It can perform regression analysis for a set of up to 40 variables, including the response variable.
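The principal components approach can be sketched with NumPy. This is an illustrative outline and not the program referred to above: the variables are standardized, the components are taken from the eigenvectors of the correlation matrix, Y is regressed on the retained components to obtain the α's, and the β's are recovered as the eigenvector matrix times α. Function names and the simulated data are hypothetical.

```python
import numpy as np

def standardize(a):
    """Centre to zero mean and scale to unit sample standard deviation."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def pcr(X, y, n_components):
    """Regress standardized y on the leading principal components of X.

    Returns (alpha, beta, eigenvalues): alpha for the components,
    beta restated for the standardized original variables.
    """
    Xs, ys = standardize(X), standardize(y)
    R = np.corrcoef(Xs, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    V = eigvecs[:, order[:n_components]]
    Z = Xs @ V                                # uncorrelated components
    alpha, *_ = np.linalg.lstsq(Z, ys, rcond=None)
    return alpha, V @ alpha, eigvals[order]

# Simulated data with two nearly collinear predictors (columns 0 and 2).
rng = np.random.default_rng(1)
X_pc = rng.normal(size=(40, 3))
X_pc[:, 2] = X_pc[:, 0] + 0.05 * rng.normal(size=40)
y_pc = X_pc[:, 0] + 2.0 * X_pc[:, 1] + 0.1 * rng.normal(size=40)

alpha_full, beta_full, eig_pc = pcr(X_pc, y_pc, 3)
alpha_two, beta_two, _ = pcr(X_pc, y_pc, 2)
print(np.round(eig_pc, 4), np.round(beta_two, 3))
```

A very small eigenvalue signals near collinearity among the predictors; dropping the corresponding component, as in the two-component fit, stabilises the β's at the cost of some bias.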

The interrelationships between two sets of measurements made on the same subject can be studied by canonical correlation method. The canonical correlation is the maximum correlation between linear functions of the two vector variables. That is, each pair of functions is so determined as to maximize the correlation between the new pair of canonical variates, subject to restriction that they may be entirely orthogonal to all previously derived linear combinations.

For example, in a composite culture experiment, if there
is production of six species from eight inputs applied to
the pond, then there are 48 correlations between production
and inputs. If there are P_{1} elements in Z_{1} and P_{2} elements
in Z_{2} and P = P_{1} + P_{2}, then the square, symmetric correlation
matrix R of order P is sub-divided into

R = | R_{11}  R_{12} |
    | R_{21}  R_{22} |

such that R_{11} contains the intercorrelations
among the elements of Z_{1} and is of order P_{1}, and R_{22} contains the
intercorrelations among the elements of Z_{2} and is of order P_{2}.
R_{12}, which is the transpose of R_{21}, contains the cross correlations
between the elements of Z_{1} and Z_{2} and is of order P_{1} × P_{2}.
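The partitioned correlation matrix leads directly to the canonical correlations: their squares are the eigenvalues of R_{11}^{-1} R_{12} R_{22}^{-1} R_{21}. The sketch below applies this standard result with NumPy; the function name and the simulated "inputs" and "production" arrays are hypothetical.

```python
import numpy as np

def canonical_correlations(Z1, Z2):
    """Canonical correlations between two sets of variables measured on the same units."""
    Z1 = np.asarray(Z1, dtype=float)
    Z2 = np.asarray(Z2, dtype=float)
    p1, p2 = Z1.shape[1], Z2.shape[1]
    R = np.corrcoef(np.column_stack([Z1, Z2]), rowvar=False)
    R11, R12 = R[:p1, :p1], R[:p1, p1:]
    R21, R22 = R[p1:, :p1], R[p1:, p1:]
    # Eigenvalues of R11^-1 R12 R22^-1 R21 are the squared canonical correlations.
    M = np.linalg.solve(R11, R12) @ np.linalg.solve(R22, R21)
    eigvals = np.sort(np.linalg.eigvals(M).real)[::-1]
    eigvals = np.clip(eigvals, 0.0, 1.0)      # guard against rounding error
    return np.sqrt(eigvals[:min(p1, p2)])

# Simulated data: the first production variable is an exact linear function
# of the first input, so the leading canonical correlation should be 1.
rng = np.random.default_rng(2)
inputs_sim = rng.normal(size=(50, 3))
production_sim = np.column_stack([
    2.0 * inputs_sim[:, 0] + 1.0,
    rng.normal(size=50),
])
canon = canonical_correlations(inputs_sim, production_sim)
print(np.round(canon, 4))
```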

This method extends the regression approach with greater soundness and has the direct property of immediately identifying the correlations of the responses with the predictors. Computer software programmes are available for it.

Composite culture of indigenous and exotic species with balanced fish food, judicious fertilization and elimination of accumulated metabolites has shown the practicability of enhancing fish production. But at higher levels of manure application and higher rates of feeding, the physico-chemical characters change, natural primary productivity is affected and large quantities of food remain unutilized. Soil and water characteristics, plankton density and plankton composition in terms of species and size do not remain the same, and the resultant variation is very high. These systematic causes of variation and the inherent variability can be studied by different statistical tests, to identify the degree of error variation and the corrections needed to achieve the goal within reasonable limits. To estimate the population parameters affecting growth and production, the sample functions are to be built by applying proper sampling techniques.

The use of advanced statistical techniques has
increased significantly since the introduction of computers.
Agricultural statisticians, economists, soil scientists and
fishery biologists need to work in full cooperation,
using more sophisticated methods of analysis. In culture
experiments, any number of variables, limited only by the capacity
of the computer, whether chemical parameters or biological indicators
affecting production, can be analysed and interpreted. Techniques
of linear programming, multivariate least squares
analysis, factor and path analysis and D^{2} statistics are also
among the useful and important methods for analysing data.
Multivariate analysis presents no problem for any number of
variables once it is computerised. Greater coordination
is therefore needed between the aquaculture researcher and the
applied statistician for in-depth analysis of various problems
in aquaculture.
