In the two-stage sampling design the population is partitioned into groups, as in cluster sampling, but here a further sample is taken within each sampled cluster. The clusters are the first-stage units to be sampled, called primary or first sampling units and denoted by SU1. The second-stage units are the elements of those clusters, called sub-units, secondary or second sampling units, and denoted by SU2.
Two-stage sampling is used when the sizes of the clusters are large, making it difficult or expensive to observe all the units inside them. This is, for example, the situation when one wishes to estimate total landing per trip of a fishery with many landing sites and also with a large number of vessels.
Sometimes, in order to decrease the sizes of the primary sampling units, one can first stratify the population and then apply two-stage sampling within each stratum.
It is possible to extend the two-stage sampling design to three or more stages. A short reference will be made to a three-stage sampling design, using a case where the procedure to estimate errors is simple.
Most of the population parameters of interest to fisheries research in the two-stage sampling design are the same as in cluster sampling. These are summarised below.
Table 6.1
Main population parameters of interest to fisheries research in two-stage sampling design
| $N$ | Number of clusters (SU1) in the population |
| $M_i$ | Number of elements (SU2) in cluster (SU1) i |
| $M_0 = \sum_{i=1}^{N} M_i$ | Total number of elements (SU2) in the population |
| $\bar{M} = M_0 / N$ | Mean number of elements (SU2) per cluster (SU1). This is useful when a population has clusters (SU1) of unequal size. |
| $Y_{ij}$ | Value of the chosen characteristic of element (SU2) j in cluster (SU1) i |
| $Y_i = \sum_{j=1}^{M_i} Y_{ij}$ | Total value of the chosen characteristic in cluster (SU1) i |
| $\bar{Y}_i = Y_i / M_i$ | Mean value of the characteristic Y in the elements (SU2) of cluster (SU1) i |
| $Y = \sum_{i=1}^{N} Y_i$ | Total value of the characteristic Y in the population |
| $\bar{Y} = Y / N$ | Mean value of the characteristic Y per cluster (SU1) |
| $\bar{\bar{Y}} = Y / M_0$ | Mean value of the characteristic Y per element (SU2) |
| $\bar{\bar{Y}} = \frac{1}{N}\sum_{i=1}^{N} \bar{Y}_i$ | Mean value of the characteristic Y per element (SU2) if $M_i$ = constant = $M$ |
| Variances | |
| $S_1^2 = \frac{1}{N-1}\sum_{i=1}^{N} (Y_i - \bar{Y})^2$ | Variance between total values of the characteristic Y per cluster (SU1) |
| $S_1^{*2}$ | Variance between mean values of the characteristic Y per cluster (SU1). The asterisk in the symbol $S_1^{*2}$ distinguishes the variance between mean values of the characteristic per cluster (SU1) from the variance between total values of the characteristic per cluster (SU1). |
| $S_{2i}^2 = \frac{1}{M_i-1}\sum_{j=1}^{M_i} (Y_{ij} - \bar{Y}_i)^2$ | Variance between values of the characteristic Y in the elements (SU2) within cluster (SU1) i |
| $S_2^2$ | Variance between values of the characteristic Y in the elements (SU2) within all clusters (SU1) |
In this design, as opposed to cluster sampling, the numbers $m_i$ of elements sampled in the second sampling stage are not equal to the sizes of the corresponding clusters. The sample statistics of this design that are most important to fisheries research are summarised below.
Table 6.2
Main sample statistics of interest to fisheries research in two-stage sampling design
| $n$ | Number of clusters (SU1) sampled |
| $m_i$ | Number of elements (SU2) sampled from cluster (SU1) i |
| $m_0 = \sum_{i=1}^{n} m_i$ | Total number of elements (SU2) sampled |
| $\bar{m} = m_0 / n$ | Mean number of elements (SU2) sampled per cluster (SU1) |
| $y_{ij}$ | Value of the characteristic Y in element (SU2) j of cluster (SU1) i |
| $y_i = \sum_{j=1}^{m_i} y_{ij}$ | Total value of the characteristic Y in the elements (SU2) sampled from cluster (SU1) i |
| $\bar{y}_i = y_i / m_i$ | Mean value of the characteristic Y in the elements (SU2) sampled from cluster (SU1) i |
| $y = \sum_{i=1}^{n} y_i$ | Total value of the characteristic Y in the clusters (SU1) sampled |
| $\bar{y} = y / n$ | Sample mean value of the characteristic Y per cluster (SU1) |
| $\bar{\bar{y}} = y / m_0$ | Sample mean value of the characteristic Y per element (SU2) |
| $\bar{\bar{y}} = \frac{1}{n}\sum_{i=1}^{n} \bar{y}_i$ | Sample mean value of the characteristic Y per element (SU2) if $m_i$ = constant = $m$ |
| $s_1^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$ | Sample variance between total values of the characteristic Y per cluster (SU1) |
| $s_1^{*2}$ | Sample variance between mean values of the characteristic Y per cluster (SU1) |
| $s_{2i}^2 = \frac{1}{m_i-1}\sum_{j=1}^{m_i} (y_{ij} - \bar{y}_i)^2$ | Sample variance between values of the characteristic Y in the elements (SU2) sampled within cluster (SU1) i |
| $s_2^2 = \frac{1}{n}\sum_{i=1}^{n} s_{2i}^2$ | Sample variance between values of the characteristic Y in the elements (SU2) sampled within all clusters (SU1), if $m_i$ = constant = $m$ |
In two-stage sampling design, the expected values and the variances of an estimator in the sampling world are calculated taking into consideration the two stages.
Let the sub-indices 1 and 2 refer respectively to the first and to the second sampling stages.
First sampling stage
$E_1$ refers to the expected value of the estimator among all possible first-stage samples to be selected from the population.
$V_1$ refers to the sampling variance of the estimator among all possible first-stage samples to be selected from the population.
Second sampling stage
E2 refers to the expected value of the estimator among all possible second-stage samples to be selected from the first-stage clusters already sampled, that is, conditional on the SU1 sampled.
V2 refers to the sampling variance of the estimator among all possible second-stage samples to be selected from the first stage clusters already sampled, that is, conditional on the SU1 sampled.
Using these definitions it can be demonstrated that, if $\hat{\theta}$ is an estimator of the population parameter $\theta$, the expected value of the estimator is:
$$E[\hat{\theta}] = E_1[E_2(\hat{\theta})]$$
and its sampling variance is:
$$V[\hat{\theta}] = V_1[E_2(\hat{\theta})] + E_1[V_2(\hat{\theta})]$$
The first term relates to the sampling variance of the estimator between the clusters (SU1) and the second term relates to the sampling variance between the elements (SU2) within the clusters (SU1).
This basic theorem is valid for the sampling distribution of any estimator and it is valid for any two-stage sampling design. These results can also be extended to sampling designs with more stages.
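The two-stage decomposition of the expected value and of the sampling variance can be checked numerically on a very small population. The sketch below is only an illustration under stated assumptions: the population values are hypothetical, simple random sampling without replacement is assumed at both stages, and the estimator used is the expansion estimator of the total, (N/n) times the sum of M times the sample mean of each sampled cluster. All two-stage samples are enumerated, so the comparison is exact up to rounding.

```python
from itertools import combinations, product
from statistics import fmean, pvariance

# Hypothetical toy population: 3 clusters (SU1), each with 3 elements (SU2).
# Equal cluster sizes make every two-stage outcome equally likely.
population = [[2.0, 4.0, 9.0], [1.0, 5.0, 6.0], [3.0, 3.0, 7.0]]
N, M = len(population), 3
n, m = 2, 2  # clusters sampled, elements sampled per cluster


def estimate_total(subsamples):
    """Expansion estimator: (N/n) * sum of M * (mean of the sampled elements)."""
    return N / n * sum(M * fmean(values) for values in subsamples)


cond_means, cond_vars, all_estimates = [], [], []
for clusters in combinations(range(N), n):  # every first-stage sample
    # every second-stage sample, given the clusters already selected
    second_stage = product(*(combinations(population[i], m) for i in clusters))
    estimates = [estimate_total(sub) for sub in second_stage]
    cond_means.append(fmean(estimates))     # E2, conditional on the SU1 sampled
    cond_vars.append(pvariance(estimates))  # V2, conditional on the SU1 sampled
    all_estimates.extend(estimates)

expected_value = fmean(all_estimates)      # E[estimator]
total_variance = pvariance(all_estimates)  # V[estimator]
between_term = pvariance(cond_means)       # V1[E2(estimator)]
within_term = fmean(cond_vars)             # E1[V2(estimator)]
true_total = sum(sum(cluster) for cluster in population)

print(expected_value, true_total)
print(total_variance, between_term + within_term)
```

The two numbers on each printed line should agree: the first pair shows that this estimator is unbiased for the toy population, and the second pair shows the sampling variance splitting into its between-cluster and within-cluster terms.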
There are different methods of selecting the sampling units in the two-stage sampling design. The units can be selected with simple random sampling, or with different probabilities in one or both stages. These choices will affect the sampling distribution of the estimators, and correspondingly the choice of estimators to use for any particular purpose.
In this document the following methods will be analysed:
In two-stage sampling applied to fisheries science, the population total value, $Y$, and the mean per element, $\bar{\bar{Y}}$, are often the parameters to be estimated.
In this two-stage sampling design, an unbiased estimator of the total value of the population is:
$$\hat{Y} = \frac{N}{n}\sum_{i=1}^{n} \hat{Y}_i$$
where $\hat{Y}_i$ is an estimator of the total value of the characteristic in cluster (SU1) i. Taking into consideration that simple random sampling is adopted in the second sampling stage, the estimator $\hat{Y}_i$ would be:
$$\hat{Y}_i = M_i\,\bar{y}_i \quad \text{or} \quad \hat{Y}_i = \frac{M_i}{m_i}\sum_{j=1}^{m_i} y_{ij}$$
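Applying the general theorem from section 6.4.2, it can be proven that the estimator is unbiased, with sampling variance equal to:
$$V(\hat{Y}) = N^2 (1 - f_1)\frac{S_1^2}{n} + \frac{N}{n}\sum_{i=1}^{N} M_i^2 (1 - f_{2i})\frac{S_{2i}^2}{m_i}$$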
In this expression, the sampling fractions at the first and second stages are $f_1 = n/N$ and $f_{2i} = m_i/M_i$ respectively, $S_1^2$ is the population variance between total values of the characteristic of clusters (SU1) and $S_{2i}^2$ is the population variance between the values of the characteristic of the elements (SU2) within cluster (SU1) i.
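An estimate of the variance, $V(\hat{Y})$, is obtained by replacing the population variance $S_1^2$ with a sample estimate, $\hat{S}_1^2$, and the population variances $S_{2i}^2$ with the sample variances $s_{2i}^2$:
$$\hat{V}(\hat{Y}) = N^2 (1 - f_1)\frac{\hat{S}_1^2}{n} + \frac{N}{n}\sum_{i=1}^{n} M_i^2 (1 - f_{2i})\frac{s_{2i}^2}{m_i}$$
where $\hat{S}_1^2 = \frac{1}{n-1}\sum_{i=1}^{n} (\hat{Y}_i - \hat{Y}/N)^2$ is the sample variance of the estimated cluster totals, with $f_1 = n/N$ and $f_{2i} = m_i/M_i$.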
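The calculation of the estimated total and of its estimated sampling variance can be sketched in a few lines. This is a minimal illustration, assuming simple random sampling without replacement at both stages; the function name `two_stage_total` and the landing figures are hypothetical, and the between-cluster variance is taken as the sample variance of the estimated cluster totals M_i times the cluster sample mean.

```python
from statistics import fmean, variance


def two_stage_total(samples, cluster_sizes, N):
    """Estimate the population total and its sampling variance.

    samples: one list of observed values y_ij per sampled cluster (SU1)
    cluster_sizes: the sizes M_i of the sampled clusters, in the same order
    N: number of clusters in the population
    Needs at least 2 sampled clusters and 2 sampled elements per cluster.
    """
    n = len(samples)
    f1 = n / N
    # Estimated cluster totals: M_i * (sample mean of cluster i)
    totals = [M_i * fmean(values) for M_i, values in zip(cluster_sizes, samples)]
    total_hat = N / n * sum(totals)
    # First term: variance between the estimated cluster totals
    between = N**2 * (1 - f1) * variance(totals) / n
    # Second term: variance between elements within each sampled cluster
    within = N / n * sum(
        M_i**2 * (1 - len(values) / M_i) * variance(values) / len(values)
        for M_i, values in zip(cluster_sizes, samples)
    )
    return total_hat, between + within


# Hypothetical example: 3 landing sites sampled out of N = 10,
# with 20, 30 and 25 vessels, of which 2, 3 and 2 were sampled.
samples = [[1.0, 3.0], [2.0, 2.0, 5.0], [4.0, 6.0]]
total_hat, var_hat = two_stage_total(samples, cluster_sizes=[20, 30, 25], N=10)
print(total_hat, var_hat, var_hat**0.5)
```

With these illustrative figures the estimated total is about 850, with an estimated standard error of about 220.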
Particular case of SU1s with equal sizes
When the SU1 sampling units have equal sizes and both selections are made with equal probabilities (sampling fractions $f_1 = n/N$ for the first stage and $f_2 = m/M$ for the second stage), the two-stage sampling design becomes a very simple particular case.
Let M be the constant number of second-stage sampling units, SU2, in any of the N clusters (SU1) of the population and m the constant number of SU2 sampled from any SU1 of the sample.
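Replacing $M_i$ with $M$ and $m_i$ with $m$ in the previous case (so that $f_{2i}$ becomes $f_2 = m/M$), the above sampling variance becomes:
$$V(\hat{Y}) = N^2 (1 - f_1)\frac{S_1^2}{n} + \frac{N M^2}{n m}(1 - f_2)\sum_{i=1}^{N} S_{2i}^2$$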
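In this case, the estimator of the mean per element, $\bar{\bar{Y}}$, is given by $\hat{\bar{\bar{Y}}} = \hat{Y}/(NM)$. The sampling variance of this estimator is given by $V(\hat{\bar{\bar{Y}}}) = V(\hat{Y})/(NM)^2$ and hence, from the previous expression of the sampling variance, this will be:
$$V(\hat{\bar{\bar{Y}}}) = (1 - f_1)\frac{S_1^2}{n M^2} + \frac{1 - f_2}{n m N}\sum_{i=1}^{N} S_{2i}^2$$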
It was seen earlier that, for SU1s of equal size and SU2 samples of equal size, the variance between mean values of the characteristic Y in the SU1s and the variance between values of the characteristic Y in the SU2s can be written as $S_1^{*2} = S_1^2 / M^2$ and $S_2^2 = \frac{1}{N}\sum_{i=1}^{N} S_{2i}^2$ respectively.
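Then the sampling variance of $\hat{\bar{\bar{Y}}}$ will take the form:
$$V(\hat{\bar{\bar{Y}}}) = (1 - f_1)\frac{S_1^{*2}}{n} + (1 - f_2)\frac{S_2^2}{n m}$$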
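An estimate of the sampling variance is:
$$\hat{V}(\hat{\bar{\bar{Y}}}) = (1 - f_1)\frac{s_1^{*2}}{n} + f_1 (1 - f_2)\frac{s_2^2}{n m}$$
where $s_1^{*2} = \frac{1}{n-1}\sum_{i=1}^{n} (\bar{y}_i - \bar{\bar{y}})^2$ and $s_2^2 = \frac{1}{n}\sum_{i=1}^{n} s_{2i}^2$.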
The second term of this variance is negligible when f1 is small.
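For equal-sized clusters the whole calculation reduces to the variance between the cluster sample means and the average of the within-cluster sample variances. The sketch below is a hedged illustration only: the function name `two_stage_mean_equal_sizes` and the data are hypothetical, and simple random sampling without replacement is assumed at both stages. It computes the mean of the cluster sample means and the variance estimate (1 - f1) s1*²/n + f1 (1 - f2) s2²/(nm).

```python
from statistics import fmean, variance


def two_stage_mean_equal_sizes(samples, N, M):
    """Estimate the mean per element and its sampling variance.

    samples: one list of observed values per sampled cluster, all of length m
    N: number of clusters in the population
    M: number of elements in every cluster
    Needs at least 2 sampled clusters and 2 sampled elements per cluster.
    """
    n, m = len(samples), len(samples[0])
    if any(len(values) != m for values in samples):
        raise ValueError("every sampled cluster must contribute m elements")
    f1, f2 = n / N, m / M
    cluster_means = [fmean(values) for values in samples]
    mean_hat = fmean(cluster_means)
    s1_star_sq = variance(cluster_means)                     # between cluster means
    s2_sq = fmean([variance(values) for values in samples])  # within clusters
    var_hat = (1 - f1) * s1_star_sq / n + f1 * (1 - f2) * s2_sq / (n * m)
    return mean_hat, var_hat


# Hypothetical example: n = 3 clusters out of N = 30, m = 2 elements out of M = 10.
samples = [[2.0, 4.0], [3.0, 5.0], [7.0, 9.0]]
mean_hat, var_hat = two_stage_mean_equal_sizes(samples, N=30, M=10)
print(mean_hat, var_hat)
```

In this example f1 = 0.1, and the within-cluster term contributes only about 0.03 to an estimated variance of about 2.13, which illustrates why it is often neglected when f1 is small.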
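In the case of three-stage (or more stages) sampling with sampling units of equal sizes, the extension of the expressions above is simple. For example, considering three-stage sampling, the estimator of the population mean per element is:
$$\hat{\bar{\bar{Y}}} = \frac{1}{n}\sum_{i=1}^{n} \bar{y}_i$$
where $\bar{y}_i = \frac{1}{m}\sum_{j=1}^{m} \bar{y}_{ij}$ and $\bar{y}_{ij} = \frac{1}{k}\sum_{u=1}^{k} y_{iju}$, with $y_{iju}$ the value observed in the uth third-stage unit (SU3) sampled from the jth SU2 of the ith SU1. Here $K$ denotes the number of SU3 in each SU2 and $k$ the number of SU3 sampled from each sampled SU2.
The sampling variance of this estimator will be:
$$V(\hat{\bar{\bar{Y}}}) = (1 - f_1)\frac{S_1^{*2}}{n} + (1 - f_2)\frac{S_2^{*2}}{n m} + (1 - f_3)\frac{S_3^2}{n m k}$$
where:
$$f_1 = \frac{n}{N}, \quad f_2 = \frac{m}{M}, \quad f_3 = \frac{k}{K}$$
with:
$$S_1^{*2} = \frac{\sum_{i=1}^{N} (\bar{Y}_i - \bar{\bar{Y}})^2}{N - 1}, \quad S_2^{*2} = \frac{\sum_{i=1}^{N}\sum_{j=1}^{M} (\bar{Y}_{ij} - \bar{Y}_i)^2}{N (M - 1)}, \quad S_3^2 = \frac{\sum_{i=1}^{N}\sum_{j=1}^{M}\sum_{u=1}^{K} (Y_{iju} - \bar{Y}_{ij})^2}{N M (K - 1)}$$
An estimator of the sampling variance can be obtained from:
$$\hat{V}(\hat{\bar{\bar{Y}}}) = (1 - f_1)\frac{s_1^{*2}}{n} + f_1 (1 - f_2)\frac{s_2^{*2}}{n m} + f_1 f_2 (1 - f_3)\frac{s_3^2}{n m k}$$
where:
$$s_1^{*2} = \frac{\sum_{i=1}^{n} (\bar{y}_i - \hat{\bar{\bar{Y}}})^2}{n - 1}, \quad s_2^{*2} = \frac{\sum_{i=1}^{n}\sum_{j=1}^{m} (\bar{y}_{ij} - \bar{y}_i)^2}{n (m - 1)}$$
and
$$s_3^2 = \frac{\sum_{i=1}^{n}\sum_{j=1}^{m}\sum_{u=1}^{k} (y_{iju} - \bar{y}_{ij})^2}{n m (k - 1)}$$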
In the case of two-stage sampling with equal-sized SU1s, it is also simple to estimate the proportion of elements of the population belonging to a given category.
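A proportion in a sample of size n can be regarded as the mean of n Bernoulli variables, each taking the value 1 when the element belongs to the category of interest and 0 otherwise. Then, the proportion $p_i$ in the ith sampled cluster (SU1) is:
$$p_i = \bar{y}_i$$
Then the sample mean per element is the overall proportion:
$$\bar{\bar{y}} = \frac{1}{n}\sum_{i=1}^{n} \bar{y}_i = \frac{1}{n}\sum_{i=1}^{n} p_i$$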
Therefore, the estimator of the overall proportion of the elements belonging to the category of interest can be the average, $\bar{p}$, of the proportions, $p_i$, of the clusters sampled:
$$\hat{P} = \bar{p} = \frac{1}{n}\sum_{i=1}^{n} p_i$$
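This estimator is unbiased, and an estimate of its sampling variance is given by:
$$\hat{V}(\bar{p}) = (1 - f_1)\frac{s_1^{*2}}{n} + f_1 (1 - f_2)\frac{s_2^2}{n m}$$
where $s_1^{*2} = \frac{1}{n-1}\sum_{i=1}^{n} (p_i - \bar{p})^2$ and $s_2^2 = \frac{1}{n}\sum_{i=1}^{n} s_{2i}^2$, with $s_{2i}^2 = \frac{m}{m-1}\,p_i (1 - p_i)$.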
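The proportion case can be sketched in the same way, working directly from the number of elements found in the category in each sampled cluster. This is an illustration under assumptions: the function name `two_stage_proportion` and the counts are hypothetical, clusters have equal size M, and m elements are taken by simple random sampling from each of the n sampled clusters. The within-cluster variance of a 0/1 variable is computed as m/(m - 1) times p_i (1 - p_i).

```python
def two_stage_proportion(counts, m, N, M):
    """Estimate an overall proportion and its sampling variance.

    counts: number of sampled elements in the category, one per sampled cluster
    m: number of elements sampled in every cluster (m >= 2)
    N: number of clusters in the population
    M: number of elements in every cluster
    """
    n = len(counts)
    f1, f2 = n / N, m / M
    p = [count / m for count in counts]  # cluster proportions p_i
    p_bar = sum(p) / n                   # estimator of the overall proportion
    # Variance between the cluster proportions
    s1_star_sq = sum((p_i - p_bar) ** 2 for p_i in p) / (n - 1)
    # Average within-cluster variance of the 0/1 variable
    s2_sq = sum(m / (m - 1) * p_i * (1 - p_i) for p_i in p) / n
    var_hat = (1 - f1) * s1_star_sq / n + f1 * (1 - f2) * s2_sq / (n * m)
    return p_bar, var_hat


# Hypothetical example: 3 clusters out of N = 20, m = 10 elements out of M = 50,
# with 2, 5 and 8 sampled elements belonging to the category of interest.
p_bar, var_hat = two_stage_proportion([2, 5, 8], m=10, N=20, M=50)
print(p_bar, var_hat)
```

Here the estimated proportion is 0.5 and the estimated sampling variance is about 0.026, almost all of it coming from the differences between clusters.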
To analyse this design, let $P_i$ be the known probability of selecting the ith cluster (SU1) in one extraction, with $\sum_{i=1}^{N} P_i = 1$.
An unbiased estimator of the population total, $Y$, is:
$$\hat{Y} = \frac{1}{n}\sum_{i=1}^{n} \frac{\hat{Y}_i}{P_i}$$
where $\hat{Y}_i = M_i\,\bar{y}_i$ is an estimator of the total value, $Y_i$, of the ith SU1.
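The sampling variance of this estimator will depend on the sampling design of the first stage. However, if independent estimates, $\hat{Y}_i$, of the total values, $Y_i$, are available, an unbiased estimator of the sampling variance will be:
$$\hat{V}(\hat{Y}) = \frac{1}{n (n - 1)}\sum_{i=1}^{n} \left(\frac{\hat{Y}_i}{P_i} - \hat{Y}\right)^2$$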
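With unequal selection probabilities, each estimated cluster total is first divided by the probability of selecting that cluster, and the estimate of the population total is the average of these ratios. The following sketch assumes independent estimates of the cluster totals; the function name `unequal_probability_total` and the figures are hypothetical. The variance estimate is the sum of squared deviations of the ratios from their mean, divided by n(n - 1).

```python
def unequal_probability_total(estimated_totals, probabilities):
    """Estimate the population total from clusters drawn with probabilities P_i.

    estimated_totals: estimated totals of the n sampled clusters
    probabilities: probability P_i of selecting each sampled cluster in one
        extraction, in the same order (n >= 2)
    """
    n = len(estimated_totals)
    # Each ratio (estimated cluster total / P_i) estimates the population total
    ratios = [t / p for t, p in zip(estimated_totals, probabilities)]
    total_hat = sum(ratios) / n
    var_hat = sum((r - total_hat) ** 2 for r in ratios) / (n * (n - 1))
    return total_hat, var_hat


# Hypothetical example: estimated totals of 3 sampled landing sites and
# their selection probabilities.
total_hat, var_hat = unequal_probability_total(
    [40.0, 90.0, 125.0], [0.05, 0.10, 0.125]
)
print(total_hat, var_hat)
```

In this example the three ratios are about 800, 900 and 1000, so the estimated total is about 900 and the estimated variance about 3333.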
The estimators presented for multi-stage sampling are in general efficient, but they suffer from the handicap of having complicated expressions. This complication arises from the unequal selection probabilities, requiring the calculation of weights for each cluster. Under some conditions of sample allocation, however, this estimator can be self-weighting.
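In fact, the estimator of the total value in the population can be rewritten as:
$$\hat{Y} = \sum_{i=1}^{n} w_i\,y_i \quad \text{with} \quad w_i = \frac{M_i}{n\,P_i\,m_i}$$
where $y_i = \sum_{j=1}^{m_i} y_{ij}$ is the sample total of cluster i. If the sample allocation is such that $\frac{M_i}{P_i\,m_i}$ is the same constant for all clusters, then $w_i = w_0$ for every i, and the estimator can be rewritten:
$$\hat{Y} = w_0 \sum_{i=1}^{n}\sum_{j=1}^{m_i} y_{ij} = w_0\,y$$
showing that the sample is self-weighted.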
Self-weighting samples are valuable because the constant weight factor simplifies the calculation of the estimators and of their estimated sampling variances.
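Under these conditions, an estimate of the sampling variance is:
$$\hat{V}(\hat{Y}) = \frac{n\,w_0^2}{n - 1}\sum_{i=1}^{n} (y_i - \bar{y})^2 = n\,w_0^2\,s_1^2$$
where $y_i$ is the sample total of cluster i, $\bar{y}$ is the mean of the $y_i$ and $w_0$ is the constant weight $M_i/(n\,P_i\,m_i)$.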
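The population mean per element, $\bar{\bar{Y}}$, can be estimated using the estimator $\hat{\bar{\bar{Y}}} = \hat{Y}/M_0$.
The sampling variance of this estimator is:
$$V(\hat{\bar{\bar{Y}}}) = \frac{V(\hat{Y})}{M_0^2}$$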
As in the previous cases, this estimator requires that M0, the total number of elements of the population, is known.
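The effect of a self-weighting allocation can be seen by computing the estimate in two ways. The sketch below uses hypothetical figures in which the number of elements sampled per cluster is proportional to M_i / P_i, so that the weight M_i / (n P_i m_i) is the same for every cluster; the value of M0 is likewise assumed for illustration. It compares the general form (average of estimated cluster total over P_i) with the self-weighted form (constant weight times the sum of all sampled values).

```python
from statistics import fmean, variance

# Hypothetical sample of n = 3 clusters; all numbers are illustrative.
M = [20, 30, 40]        # cluster sizes M_i
P = [0.10, 0.10, 0.10]  # selection probabilities P_i in one extraction
samples = [[1.0, 3.0], [2.0, 2.0, 5.0], [4.0, 6.0, 5.0, 5.0]]
M0 = 300                # assumed known total number of elements
n = len(samples)

# Weights w_i = M_i / (n * P_i * m_i); here m_i is proportional to M_i / P_i.
weights = [M_i / (n * P_i * len(ys)) for M_i, P_i, ys in zip(M, P, samples)]
w0 = weights[0]
is_self_weighting = all(abs(w - w0) < 1e-9 for w in weights)

# General form: average of (estimated cluster total / selection probability).
ratios = [M_i * fmean(ys) / P_i for M_i, P_i, ys in zip(M, P, samples)]
total_general = fmean(ratios)
var_general = sum((r - total_general) ** 2 for r in ratios) / (n * (n - 1))

# Self-weighted form: one constant weight applied to the sample totals.
cluster_sums = [sum(ys) for ys in samples]
total_self_weighted = w0 * sum(cluster_sums)
var_self_weighted = n * w0**2 * variance(cluster_sums)

# Mean per element and its estimated variance, using the assumed M0.
mean_per_element = total_self_weighted / M0
var_mean_per_element = var_self_weighted / M0**2

print(is_self_weighting, total_general, total_self_weighted)
print(var_general, var_self_weighted)
print(mean_per_element, var_mean_per_element)
```

Both forms give the same estimated total (about 1100) and the same estimated variance, but the self-weighted form needs only the sample totals of the clusters and one constant.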