THEORETICAL JUSTIFICATION OF THE NEGATIVE BINOMIAL DISTRIBUTION
We have derived the Poisson Distribution from the Binomial Distribution, and the necessary condition for the Binomial Distribution to hold is that the probability, p, of an event E shall remain constant for all occurrences of its context-events. Thus, this condition must also hold for the Poisson Distribution.
If, however, it is known that p is not constant in its context-events, another distribution known as the Negative Binomial Distribution (N.B.D.) may provide an even closer “fit”.
Suppose we have a Binomial Distribution for which the variance V(x) = s^{2} = npq is greater than the mean m = np.
In such a case the following relations hold:
(i) npq > np, so that q > 1; and
(ii) since p + q = 1, p must be negative, i.e. p = 1 - q < 0.
But np being positive, n must be negative also (writing n = -k).
The trouble about this type of distribution lies in the interpretation, for we have defined probability in such a way that its measure must always be a number lying between 0 and 1 and so, essentially positive. Again, since n (= -k) is the number of context-events, how can it possibly be negative?
It is often found that observed frequency distributions are represented by Negative Binomial Distributions. This is theoretically justified when in frequency distributions the variance is greater than the mean.
This often arises when the probability of an event E does not remain constant for all occurrences of its context-events.^{1}
^{1} The concentration of units varies between different parts of the population (the units are nonrandomly distributed throughout the whole population).
From the above (ii) we have,
q = 1 + p and m = kp, where p now denotes the positive quantity obtained on changing the sign of the negative p;
substituting p = m/k we get
q = 1 + m/k
The parameters of the distribution are the arithmetic mean (m) and the exponent k.
Since the variance of the population is,
s^{2} = kpq,
substituting
p = m/k and q = 1 + m/k
we get,
s^{2} = m + m^{2}/k     (iii)
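The probability series of the N.B.D. is given by the expansion
(q - p)^{-k}
The individual terms of (q - p)^{-k} are given by
P(x) = [k(k + 1)...(k + x - 1)/x!] (p/q)^{x} q^{-k},   x = 0, 1, 2, ...
By using the recurrence formula the individual terms of the series are,
P(0) = q^{-k}
and
P(x) = [(k + x - 1)/x] (p/q) P(x - 1),   x = 1, 2, 3, ...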
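The recurrence lends itself to a short computation. The following is a minimal sketch in Python, assuming the parameterisation p = m/k and q = 1 + p, the starting term P(0) = q^{-k} and the step P(x) = [(k + x - 1)/x] (p/q) P(x - 1). The function names and the values m = 2.0 and k = 1.5 are hypothetical, chosen only for illustration; the closed-form term is included solely as a cross-check on the recurrence.

```python
import math


def nbd_probabilities(m, k, x_max):
    """Return [P(0), ..., P(x_max)] for an N.B.D. with mean m and exponent k."""
    p = m / k        # assumed parameterisation: p = m/k
    q = 1.0 + p      # q = 1 + p, so that q - p = 1
    ratio = p / q    # factor common to every step of the recurrence
    probs = [q ** (-k)]  # P(0) = q^(-k)
    for x in range(1, x_max + 1):
        # P(x) = [(k + x - 1)/x] (p/q) P(x - 1)
        probs.append(probs[-1] * (k + x - 1) / x * ratio)
    return probs


def nbd_term(m, k, x):
    """Closed-form term; the gamma function allows a non-integer k."""
    p = m / k
    q = 1.0 + p
    log_coeff = math.lgamma(k + x) - math.lgamma(x + 1) - math.lgamma(k)
    return math.exp(log_coeff + x * math.log(p / q) - k * math.log(q))


# Hypothetical parameters, for illustration only.
example_probs = nbd_probabilities(2.0, 1.5, 10)
```

The recurrence avoids evaluating factorials or gamma functions for each term, which is why it is convenient for hand calculation as well.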
Note that k is no longer the maximum possible number of individuals a sampling unit could contain, but is related to the spatial distribution of the surveyed population (k is a measure of the heterogeneity of the distribution). Unlike the positive Binomial, k is not necessarily an integer in the Negative Binomial Distribution.
From above (iii) we have,
1/k = (s^{2} - m)/m^{2}
The above formula indicates that the reciprocal of the exponent k, i.e. 1/k, is a measure of the excess of variance or clumping of the individuals in the population. Specifically, as 1/k approaches zero and k approaches infinity, the distribution converges to the Poisson series (s^{2} → m). Conversely, if clumping increases, 1/k approaches infinity (k → 0) and the distribution converges to the Logarithmic Series.
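The approach to the Poisson series can be checked numerically. The sketch below assumes the forms s^{2} = m + m^{2}/k for the variance and P(0) = (1 + m/k)^{-k} for the zero class, and compares them with the Poisson values s^{2} = m and P(0) = e^{-m}. The function names, the mean m = 1.0 and the chosen values of k are hypothetical.

```python
import math


def nbd_variance(m, k):
    """Variance of the N.B.D., assuming s^2 = m + m^2/k."""
    return m + m * m / k


def nbd_zero_class(m, k):
    """Probability of a zero count, assuming P(0) = (1 + m/k)^(-k)."""
    return (1.0 + m / k) ** (-k)


m = 1.0  # hypothetical mean, for illustration only
for k in (0.5, 5.0, 50.0, 5000.0):
    # As k grows, the variance falls towards m and P(0) towards exp(-m).
    print(k, nbd_variance(m, k), nbd_zero_class(m, k), math.exp(-m))
```

For small k the variance is far above the mean and the zero class is inflated, which is the clumping described in the text.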
Example:
The Table below gives the number of aquatic invertebrates on the bottom in 400 square units. Fit a Negative Binomial Distribution to the empirical data.
| Number of aquatic invertebrates (x) | 0 | 1 | 2 | 3 | 4 | 5 | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Frequency (f) | 213 | 128 | 37 | 18 | 3 | 1 | 400 |
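Estimated mean and variance:
m = Σfx/N = 273/400 = 0.68
s^{2} = [Σfx^{2} - (Σfx)^{2}/N]/(N - 1) = (511 - 273^{2}/400)/399 = 0.81
Calculated q:
s^{2} = mq
or
0.81 = 0.68q, and q = 1.19,
Calculated p:
p = q - 1 = 0.19
and
Estimated k:
k = m/p = 0.68/0.19 = 3.58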
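The estimation steps of this example can be reproduced in a few lines. This is a sketch only: it assumes the relations s^{2} = mq, p = q - 1 and k = m/p, and it assumes the sample variance is taken with the divisor N - 1 (the divisor N gives the same value to two decimals). The function name is hypothetical.

```python
def estimate_nbd_parameters(counts, freqs):
    """Moment estimates of m, s^2, q, p and k from a frequency table."""
    n_units = sum(freqs)
    sum_x = sum(x * f for x, f in zip(counts, freqs))
    sum_x2 = sum(x * x * f for x, f in zip(counts, freqs))
    m = sum_x / n_units
    # Assumption: sample variance with divisor N - 1.
    s2 = (sum_x2 - sum_x ** 2 / n_units) / (n_units - 1)
    q = s2 / m   # from s^2 = m q
    p = q - 1.0  # from q - p = 1
    k = m / p    # from m = k p
    return {"m": m, "s2": s2, "q": q, "p": p, "k": k}


# Frequency table of the example: 400 square units.
counts = [0, 1, 2, 3, 4, 5]
freqs = [213, 128, 37, 18, 3, 1]
estimates = estimate_nbd_parameters(counts, freqs)
```

Carried at full precision, k comes out near 3.55 rather than 3.58; the difference arises only from rounding m, s^{2} and q to two decimals at each step of the hand calculation.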
Estimated probabilities:
P(x=0) = q^{-k} = (1.19)^{-3.58} = 0.5365
Recurrence formula:
P(x) = [(k + x - 1)/x] (p/q) P(x - 1)
Therefore,
P(x=1) = 0.3065, P(x=2) = 0.1120, P(x=3) = 0.0332, P(x=4) = 0.0087, P(x=5) = 0.0022
Estimated theoretical frequencies (N.B.D.):
N_{x=0} = 400 × P(x=0) = 400 × 0.5365 = 214
N_{x=1} = 400 × P(x=1) = 400 × 0.3065 = 123
N_{x=2} = 400 × P(x=2) = 400 × 0.1120 = 45
N_{x=3} = 400 × P(x=3) = 400 × 0.0332 = 13
N_{x=4} = 400 × P(x=4) = 400 × 0.0087 = 4
N_{x=5} = 400 × P(x=5) = 400 × 0.0022 = 1
Testing goodness of fit:
A problem that arises frequently in statistical work is the testing of comparability of a set of observed (empirical) and theoretical (N.B.D.) frequencies.
To test the hypothesis of goodness of fit of the N.B.D. to the empirical frequency distribution we calculate the value of
X^{2} = Σ (f_{i} - q_{i})^{2}/q_{i}
where
f_{i} = empirical frequencies
q_{i} = theoretical frequencies
The estimated X^{2} value is compared^{2} with the tabulated value X^{2}_{0.05}. The hypothesis is valid if X^{2} < X^{2}_{0.05}; the hypothesis is discredited if X^{2} > X^{2}_{0.05}.
^{2} It should be noted that, since the X^{2} curve is an approximation to the discrete X^{2} frequency function, care must be exercised that the X^{2} test is used only when the approximation is good. Experience and theoretical investigations indicate that the approximation is usually satisfactory provided that the frequencies of the class intervals are usually ≥ 5 and that the number of classes in the frequency distribution is ≥ 5.
The following Table gives the empirical and theoretical frequencies of the previous example and the estimated X^{2} value.
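Table. X^{2} test of goodness of fit of the N.B.D. to the spatial distribution of aquatic invertebrates
| Number of aquatic invert. (x) | Empirical frequencies (f_{i}), number of squares | Theoretical frequencies (q_{i}), number of squares | (f_{i} - q_{i}) | (f_{i} - q_{i})^{2}/q_{i} | Remarks |
| --- | --- | --- | --- | --- | --- |
| 0 | 213 | 214 | -1 | 0.0047 | |
| 1 | 128 | 123 | +5 | 0.2033 | |
| 2 | 37 | 45 | -8 | 1.4222 | |
| 3 | 18 | 13 | +5 | 1.9231 | |
| 4 | 4 | 5 | -1 | 0.2000 | classes x = 4 and x = 5 combined |
| | | | | X^{2} = 3.7533 | |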
(n - 3 degrees of freedom; n = 5 classes, less (2 estimated parameters + 1), giving 2 degrees of freedom)
Since X^{2}_{0.05} = 5.991 for 2 degrees of freedom and 3.7533 < 5.991, the hypothesis of goodness of fit is valid.
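The test statistic can be recomputed directly from the two frequency columns. The sketch below assumes the usual form X^{2} = Σ (f_{i} - q_{i})^{2}/q_{i}, takes the frequencies with the last two classes already combined, and hard-codes the tabulated 5% value for 2 degrees of freedom (5.991) rather than looking it up. The function and variable names are hypothetical.

```python
def chi_square(observed, expected):
    """Sum of (f_i - q_i)^2 / q_i over all classes."""
    return sum((f - q) ** 2 / q for f, q in zip(observed, expected))


# Frequencies of the example, with classes x = 4 and x = 5 combined.
observed = [213, 128, 37, 18, 4]
expected = [214, 123, 45, 13, 5]

x2 = chi_square(observed, expected)

# Degrees of freedom: classes less (2 estimated parameters + 1).
degrees_of_freedom = len(observed) - (2 + 1)

# Tabulated 5% point of chi-square for 2 degrees of freedom.
critical_value = 5.991
fit_is_acceptable = x2 < critical_value
```

The value obtained is close to the 3.7533 of the Table; the small difference in the last decimal comes from summing unrounded contributions.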
Note: A second estimate of k
From the above (iii) we have (see also Appendix II), and
A second estimate of k is given by
In the above example,
Transformations
Analysis of variance, correlation analysis, testing hypothesis and other statistical methods of analysis of data associated with the normal distribution are performed on the transformed counts (see Table below).
Transformations
| Original distribution | Special conditions | Estimated parameters | Transformation |
| --- | --- | --- | --- |
| 1. Poisson | 1. No counts less than 10 | | Replace x by |
| | 2. Some counts less than 10 | | Replace x by |
| 2. Negative Binomial | 1. k greater than 5 | | Replace x by |
| | 2. k between 2 and 5 | | Replace x by y = log(x + k/2) |
| | 3. No zero counts | | Replace x by y = log x |
| | 4. Some zero counts | | Replace x by y = log(x + 1) |
As the derived mean is smaller than the arithmetic mean of the original counts before transformation, it is not comparable with the arithmetic mean obtained by direct averaging. Therefore small adjustments have to be made to the derived means (see section 10.4.4).
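The effect described here can be seen on a small set of counts. The sketch below applies two of the logarithmic transformations listed above, y = log(x + 1) and y = log(x + k/2), and compares the derived (back-transformed) mean with the arithmetic mean. It assumes logarithms to base 10 and that the derived mean is obtained by reversing the transformation on the mean of y; the counts and function names are hypothetical.

```python
import math


def log_plus_one(counts):
    """y = log(x + 1), for counts that include zeros (base 10 assumed)."""
    return [math.log10(x + 1) for x in counts]


def log_plus_half_k(counts, k):
    """y = log(x + k/2), listed for k between 2 and 5 (base 10 assumed)."""
    return [math.log10(x + k / 2) for x in counts]


def derived_mean_log_plus_one(counts):
    """Mean of y = log(x + 1), transformed back to the original scale."""
    y = log_plus_one(counts)
    mean_y = sum(y) / len(y)
    return 10 ** mean_y - 1


sample = [0, 1, 1, 2, 5, 12]  # hypothetical counts, for illustration only
arithmetic_mean = sum(sample) / len(sample)
derived_mean = derived_mean_log_plus_one(sample)
```

With these counts the derived mean falls below the arithmetic mean, since reversing a logarithmic transformation yields a geometric rather than an arithmetic average.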