Chapter Objectives
Structure Of The Chapter
Random sampling
Systematic sampling
Stratified samples
Sample sizes within strata
Quota sampling
Cluster and multistage sampling
Area sampling
Sampling and statistical testing
The null hypothesis
Type I errors and type II errors
Example calculations of sample size
Chapter Summary
Key Terms
Review Questions
Chapter References
Following decisions about how data is to be collected, the next consideration is how to select a sample of the population of interest that is truly representative. At the same time, the requirement that samples be representative of the population from which they are drawn has to be balanced against time and other resource considerations. This being the case, choices have to be made between the mathematically superior probabilistic sampling methods and the more pragmatic nonprobability sampling methods.
This chapter serves to teach the reader to:
· Distinguish between probabilistic and nonprobabilistic sampling methods
· Understand the bases for stratifying samples
· Make an informed choice between random and quota samples
· Comprehend multistage sampling, and
· Appreciate the use of area or aerial sampling.
The early part of the chapter outlines the probabilistic sampling methods. These include simple random sampling, systematic sampling, stratified sampling and cluster sampling. Thereafter, the principal nonprobability method, quota sampling, is explained and its strengths and weaknesses outlined. The statistical aspects of sampling are then explored. A number of illustrative calculations are presented.
Two major principles underlie all sample design. The first is the desire to avoid bias in the selection procedure; the second is to achieve the maximum precision for a given outlay of resources. Bias in the selection can arise:
· if the selection of the sample is done by some nonrandom method, i.e. selection is consciously or unconsciously influenced by human choice
· if the sampling frame (i.e. list, index, population record) does not adequately cover the target population
· if some sections of the population are impossible to find or refuse to cooperate.
These cause selection or sample bias and can only be avoided if a random method is used. Other designs, to be described shortly, can retain the essential element of randomness but manage to increase precision by incorporating various restrictions and refinements. Figure 7.1 gives an overview of the sampling methods that are either explained within this chapter or are explored in the exercises which accompany this textbook.
Figure 7.1 Methods of sampling
It can be seen that there is a dichotomy between probability and nonprobability sampling methods. The text which follows explains these methods in some detail, and highlights the advantages and disadvantages of each method.
Random, or probability sampling, gives each member of the target population a known and equal probability of selection. The two basic procedures are:
1 the lottery method, e.g. picking numbers out of a hat or bag
2 the use of a table of random numbers.
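The lottery method can be mimicked in software. The following is a minimal sketch in Python, assuming the sampling frame is simply a list of population members numbered 1 to 100; the function name `simple_random_sample`, the frame and the seed are illustrative assumptions rather than anything prescribed in this chapter.

```python
import random

def simple_random_sample(frame, sample_size, seed=None):
    """Draw a simple random sample, without replacement, from a sampling frame."""
    rng = random.Random(seed)  # seeded generator so that the draw can be reproduced
    return rng.sample(frame, sample_size)

# Hypothetical sampling frame: population members numbered 1 to 100.
frame = list(range(1, 101))
sample = simple_random_sample(frame, 20, seed=1)
print(sorted(sample))
```

Every member of the frame has the same chance (here 20 in 100) of appearing in the sample, which is the defining property described above.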
Systematic sampling is a modification of random sampling. To arrive at a systematic sample we simply calculate the desired sampling interval, e.g. if there are 100 distributors of a particular product in which we are interested and our budget allows us to sample say 20 of them, then we divide 100 by 20 and get a sampling interval of 5 (a sampling fraction of 1 in 5). Thereafter we pick a random starting point among the first 5 entries and go through our sampling frame selecting every 5th distributor. In the purest sense this does not give rise to a true random sample, since some systematic arrangement is used in listing and, once the starting point has been fixed, not every distributor has a chance of being selected. However, because there is no conscious control of precisely which distributors are selected, all but the most pedantic of practitioners would treat a systematic sample as though it were a true random sample.
Figure 7.2 Systematic sampling as applied to a survey of retailers
Systematic sampling
Population = 100 Food Stores
Sample desired = 20 Food Stores
a. Draw a random number from 1 to 5.
b. Sample every Xth store.
Sample    Numbered Stores
1         1, 6, 11, 16, 21 ... 96
2         2, 7, 12, 17, 22 ... 97
3         3, 8, 13, 18, 23 ... 98
4         4, 9, 14, 19, 24 ... 99
5         5, 10, 15, 20, 25 ... 100
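The procedure in Figure 7.2 can be sketched in a few lines of Python. This is an illustration only: the function name `systematic_sample` and the seed are assumptions, and the frame is taken to be the 100 stores numbered in list order.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Select every k-th unit after a random start, where k = len(frame) // sample_size."""
    interval = len(frame) // sample_size
    rng = random.Random(seed)
    start = rng.randint(1, interval)  # random start between 1 and k inclusive
    # Positions are 1-based to match the numbered stores in Figure 7.2.
    positions = range(start, len(frame) + 1, interval)
    return [frame[i - 1] for i in positions][:sample_size]

stores = list(range(1, 101))  # 100 food stores, numbered 1 to 100
chosen = systematic_sample(stores, 20, seed=3)
print(chosen)
```

Whichever start is drawn, the result is one of the five rows of Figure 7.2, so only five distinct samples are possible.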
Stratification increases precision without increasing sample size. Stratification does not imply any departure from the principles of randomness; it merely denotes that, before any selection takes place, the population is divided into a number of strata, and random samples are then taken within each stratum. It is only possible to do this if the distribution of the population with respect to a particular factor is known, and if it is also known to which stratum each member of the population belongs. Examples of characteristics which could be used in marketing to stratify a population include: income, age, sex, race, geographical region, possession of a particular commodity.
Stratification can also occur after selection of individuals. For example, if one wanted to stratify a sample of individuals in a town by age, one could easily get figures of the age distribution, but if there is no general population list showing the age of each individual, prior stratification would not be possible. What might have to be done in this case is to weight the results at the analysis stage to restore proportional representation. Weighting can easily destroy the assumptions one is able to make when interpreting data gathered from a random sample and so stratification prior to selection is advisable. Random stratified sampling is more precise and more convenient than simple random sampling.
When stratified sampling designs are to be employed, there are 3 key questions which have to be immediately addressed:
1 The bases of stratification, i.e. what characteristics should be used to subdivide the universe/population into strata?
2 The number of strata, i.e. how many strata should be constructed and what stratum boundaries should be used?
3 Sample sizes within strata, i.e. how many observations should be taken in each stratum?
Bases of stratification
Intuitively, it seems clear that the best basis would be the frequency distribution of the principal variable being studied. For example, in a study of coffee consumption we may believe that behavioural patterns will vary according to whether a particular respondent drinks a lot of coffee, only a moderate amount of coffee or drinks coffee very occasionally. Thus we may consider that to stratify according to "heavy users", "moderate users" and "light users" would provide an optimum stratification. However, two difficulties may arise in attempting to proceed in this way. First, there is usually interest in many variables, not just one, and stratification on the basis of one may not provide the best stratification for the others. Secondly, even if one survey variable is of primary importance, current data on its frequency is unlikely to be available. The latter complaint can be attended to, since it is possible to stratify after the data has been collected and before the analysis is undertaken. Where stratification is to be done before selection, the only approach is to create strata on the basis of variables for which information is, or can be made, available and which are believed to be highly correlated with the principal survey characteristics of interest, e.g. age, socioeconomic group, sex, farm size, firm size, etc.
In general, it is desirable to make up strata in such a way that the sampling units within strata are as similar as possible. In this way a relatively limited sample within each stratum will provide a generally precise estimate of the mean of that stratum. Similarly it is important to maximise differences in stratum means for the key survey variables of interest. This is desirable since stratification has the effect of removing differences between stratum means from the sampling error.
Total variance within a population has two types of natural variation: between-strata variance and within-strata variance. Stratification removes the first type of variance from the calculation of the standard error. Suppose, for example, we stratified students in a particular university by subject speciality (marketing, engineering, chemistry, computer science, mathematics, history, geography etc.) and questioned them about the distinctions between training and education. The theory goes that without stratification we would expect variation both in the views expressed by students within, say, the marketing speciality and between the views of marketing students as a whole and engineering students as a whole. Stratification ensures that variation between strata does not enter into the standard error, by taking account of this source in drawing the sample.
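The decomposition described above can be written out as follows. This is a sketch in standard notation that does not appear in the original text: $W_h$ is the share of the population in stratum $h$, $\mu_h$ and $\sigma_h^2$ are the mean and variance within that stratum, $\mu$ is the overall population mean, and $n_h$ is the sample size taken in stratum $h$ (finite population corrections are ignored).

$$\sigma^2 = \underbrace{\sum_h W_h \sigma_h^2}_{\text{within strata}} + \underbrace{\sum_h W_h (\mu_h - \mu)^2}_{\text{between strata}}$$

$$\mathrm{Var}(\bar{x}_{sr}) = \sum_h W_h^2 \, \frac{\sigma_h^2}{n_h}$$

Only the within-strata variances appear in the second expression, which is why strata that are internally homogeneous but differ widely from one another give the greatest gain in precision.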
Number of strata
The next question is that of the number of strata and the construction of stratum boundaries. As regards number of strata, as many as possible should be used. If each stratum could be made as homogeneous as possible, its mean could be estimated with high reliability and, in turn, the population mean could be estimated with high precision. However, some practical problems limit the desirability of a large number of strata:
1 No stratification scheme will completely "explain" the variability among a set of observations. Past a certain point, the "residual" or "unexplained" variation will dominate, and little improvement will be effected by creating more strata.
2 Depending on the costs of stratification, a point may be reached quickly where creation of additional strata is economically unproductive.
If a single overall estimate is to be made (e.g. the average per capita consumption of coffee) we would normally use no more than about 6 strata. If estimates are required for population subgroups (e.g. by region and/or age group), then more strata may be justified.
Proportional allocation: Once strata have been established, the question becomes, "How big a sample must be drawn from each?" Consider a situation where a survey of a two-stratum population is to be carried out:
Stratum    Number of Items in Stratum
A          10,000
B          90,000
Suppose the budget is fixed at $3000 and the cost per observation is known to be $6 in each stratum, so that the available total sample size is 500. The most common approach would be to sample the same proportion of items in each stratum. This is termed proportional allocation. In this example, the overall sampling fraction is:
$$\frac{500}{100{,}000} = 0.005 = 0.5\%$$
Thus, this method of allocation would result in:
Stratum A (10,000 × 0.5%) = 50
Stratum B (90,000 × 0.5%) = 450
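The major practical advantage of proportional allocation is that it leads to estimates which are computationally simple. In general, the overall mean of a stratified sample is the weighted sum of the stratum means:
$$\bar{x}_{sr} = W_1\bar{x}_1 + W_2\bar{x}_2 + W_3\bar{x}_3 + \dots + W_k\bar{x}_k$$
where $\bar{x}_h$ is the sample mean of stratum $h$ and $W_h$ is the proportion of the population falling in that stratum. Where proportional sampling has been employed, each stratum's share of the sample equals $W_h$, so the weighted sum is simply the mean of all the sampled observations and we do not need to weight the means of the individual strata separately.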
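The arithmetic of the two-stratum example can be checked with a short Python sketch. The function name `proportional_allocation` and its parameter names are illustrative assumptions; the figures are those given in the example above.

```python
def proportional_allocation(stratum_sizes, budget, cost_per_observation):
    """Split the affordable sample across strata in proportion to stratum size."""
    total_sample = budget // cost_per_observation
    population = sum(stratum_sizes.values())
    fraction = total_sample / population  # overall sampling fraction
    allocation = {
        name: round(size * total_sample / population)
        for name, size in stratum_sizes.items()
    }
    return total_sample, fraction, allocation

total, fraction, allocation = proportional_allocation(
    {"A": 10_000, "B": 90_000}, budget=3000, cost_per_observation=6
)
print(total, fraction, allocation)  # 500 0.005 {'A': 50, 'B': 450}
```

With less convenient figures the rounded stratum samples may not add up exactly to the total, in which case one or two observations have to be moved between strata by hand.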
Optimum allocation: Proportional allocation is advisable when all we know of the strata is their sizes. In situations where the standard deviations of the strata are known it may be advantageous to make a disproportionate allocation.
Suppose that, once again, we had stratum A and stratum B, but we know that the individuals assigned to stratum A were more varied with respect to their opinions than those assigned to stratum B. Optimum allocation minimises the standard error of the estimated mean by ensuring that more respondents are assigned to the stratum within which there is greatest variation.
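The text does not give a formula for optimum allocation. One common form, usually called Neyman allocation, assumes an equal cost per observation in every stratum and makes each stratum's sample proportional to its size multiplied by its standard deviation. The sketch below uses that form; the standard deviations of 30 and 10 are purely hypothetical values chosen for illustration, as is the function name `optimum_allocation`.

```python
def optimum_allocation(stratum_sizes, stratum_sds, total_sample):
    """Neyman allocation: n_h proportional to N_h * S_h (equal unit costs assumed)."""
    weights = {h: stratum_sizes[h] * stratum_sds[h] for h in stratum_sizes}
    total_weight = sum(weights.values())
    return {h: round(total_sample * w / total_weight) for h, w in weights.items()}

sizes = {"A": 10_000, "B": 90_000}
sds = {"A": 30.0, "B": 10.0}  # HYPOTHETICAL standard deviations, not from the text
allocation = optimum_allocation(sizes, sds, total_sample=500)
print(allocation)  # {'A': 125, 'B': 375}
```

Compared with the proportional split of 50 and 450, the more variable stratum A receives two and a half times as many respondents under these assumed standard deviations.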
Quota sampling is a method of stratified sampling in which the selection within strata is nonrandom. Selection is normally left to the discretion of the interviewer and it is this characteristic which destroys any pretensions towards randomness.
Quota v random sampling
The advantages and disadvantages of quota versus probability samples have been a subject of controversy for many years. Some practitioners hold the quota sample method to be so unreliable and prone to bias as to be almost worthless. Others think that although it is clearly less sound theoretically than probability sampling, it can be used safely in certain circumstances. Still others believe that with adequate safeguards quota sampling can be made highly reliable and that the extra cost of probability sampling is not worthwhile.
Generally, statisticians criticise the method for its theoretical weakness while market researchers defend it for its cheapness and administrative convenience.
Main arguments against: Quota sampling
1 It is not possible to estimate sampling errors with quota sampling because of the absence of randomness.
Some people argue that sampling errors are so small compared with all the other errors and biases that enter into a survey that not being able to estimate them is no great disadvantage. One does not have the security, though, of being able to measure and control these errors.
2 The interviewer may fail to secure a representative sample of respondents in quota sampling. For example, are those in the over 65 age group spread over all the age range or clustered around 65 and 66?
3 Social class controls leave a lot to the interviewer's judgement.
4 Strict control of fieldwork is more difficult, e.g. did interviewers place respondents in groups where cases are needed rather than in those to which they belong?
Main arguments for: quota sampling
1 Quota sampling is less costly. A quota interview on average costs only half or a third as much as a random interview, but we must remember that precision is lost.
2 It is easy administratively. The labour of random selection is avoided, and so are the headaches of noncontact and callbacks.
3 If fieldwork has to be done quickly, perhaps to reduce memory errors, quota sampling may be the only possibility, e.g. to obtain immediate public reaction to some event.
4 Quota sampling is independent of the existence of sampling frames.
Cluster sampling: The process of sampling complete groups or units is called cluster sampling; situations where there is any subsampling within the clusters chosen at the first stage are covered by the term multistage sampling. For example, suppose that a survey is to be done in a large town and that the unit of inquiry (i.e. the unit from which data are to be gathered) is the individual household. Suppose further that the town contains 20,000 households, all of them listed on convenient records, and that a sample of 200 households is to be selected. One approach would be to pick the 200 by some random method. However, this would spread the sample over the whole town, with consequent high fieldwork costs and much inconvenience. (All the more so if the survey were to be conducted in rural areas, especially in developing countries where rural areas are sparsely populated and access is difficult.) One might decide therefore to concentrate the sample in a few parts of the town, and it may be assumed for simplicity that the town is divided into 400 areas with 50 households in each. A simple course would be to select say 4 areas at random (i.e. 1 in 100) and include all the households within these areas in our sample. The overall probability of selection is unchanged, but by selecting clusters of households, one has materially simplified the fieldwork and made it cheaper.
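The town example can be sketched in Python as follows. The household labels, the seed and the function name `cluster_sample` are hypothetical; only the figures (400 areas of 50 households, 4 areas selected) come from the example above.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters at random and keep every unit inside them."""
    rng = random.Random(seed)
    chosen_ids = rng.sample(sorted(clusters), n_clusters)
    units = [unit for cid in chosen_ids for unit in clusters[cid]]
    return chosen_ids, units

# Hypothetical frame: 400 areas, each listing its 50 households.
clusters = {
    area: [f"area{area}-household{i}" for i in range(1, 51)]
    for area in range(1, 401)
}
areas, households = cluster_sample(clusters, 4, seed=7)
print(areas, len(households))
```

Each household still has a 1 in 100 chance of selection, because its area has a 4 in 400 chance of being drawn and every household in a drawn area is included.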
A large number of small clusters is better, all other things being equal, than a small number of large clusters. Whether single stage cluster sampling proves to be as statistically efficient as simple random sampling depends upon the degree of homogeneity within clusters. If respondents within clusters are homogeneous with respect to such things as income, socioeconomic class etc., they do not fully represent the population and will, therefore, provide larger standard errors. On the other hand, the lower cost of cluster sampling often outweighs the disadvantages of statistical inefficiency. In short, cluster sampling tends to offer greater reliability for a given cost rather than greater reliability for a given sample size.
Multistage sampling: The population is regarded as being composed of a number of first stage or primary sampling units (PSU's), each of them being made up of a number of second stage units. A sample of PSU's is selected first, then a sample of second stage units is drawn in each selected PSU, and so the procedure continues down to the final sampling unit, with the sampling ideally being random at each stage.
The necessity of multistage sampling is easily established. PSU's for national surveys are often administrative districts, urban districts or parliamentary constituencies. Within the selected PSU one may go direct to the final sampling units, such as individuals, households or addresses, in which case we have a two-stage sample. It would be more usual to introduce intermediate sampling stages, i.e. administrative districts are subdivided into wards, then polling districts.
Area sampling is basically multistage sampling in which maps, rather than lists or registers, serve as the sampling frame. This is the main method of sampling in developing countries where adequate population lists are rare. The area to be covered is divided into a number of smaller subareas from which a sample is selected at random; within these areas, either a complete enumeration is taken or a further subsample is drawn.
Figure 7.3 Aerial sampling
A grid, such as that shown above, is drawn and superimposed on a map of the area of concern. Sampling points are selected on the basis of numbers drawn at random that equate to the numbered columns and rows of the grid.
If the area is large, it can be subdivided into subareas and a grid overlaid on these. Figure 7.4 depicts the procedures involved. As in figure 7.3 the columns and rows are given numbers. Then, each square in the grid is allocated numbers to define grid lines. Using random numbers, sampling points are chosen within each square. Figure 7.4 gives an impression of the pattern of sampling which emerges.
Figure 7.4 Multistage aerial sampling
Suppose that a survey of agricultural machinery/implement ownership is to be made in a sample of rural households and that no comprehensive list of such dwellings is available to serve as a sampling frame. If there is an accurate map of the area we can superimpose vertical and horizontal lines on it, number these and use them as a reference grid. Using random numbers points can be placed on the map and data collected from households either on or nearest to those points. A variation is to divide the area into "parcels" of land. These "parcels" (the equivalent of city blocks) can be formed using natural boundaries e.g. hills or mountains, canals, rivers, railways, roads, etc. If sufficient information is known about an area then it is permissible to construct the "parcels" on the basis of agroecosystems.
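Choosing sampling points from a numbered reference grid can be sketched as below. The 10 by 10 grid, the number of points, the seed and the function name `grid_sampling_points` are all hypothetical; in practice the grid dimensions would be set by the map in use.

```python
import random

def grid_sampling_points(n_columns, n_rows, n_points, seed=None):
    """Pick distinct (column, row) grid references at random."""
    rng = random.Random(seed)
    cells = [
        (col, row)
        for col in range(1, n_columns + 1)
        for row in range(1, n_rows + 1)
    ]
    return rng.sample(cells, n_points)

# Hypothetical map grid with 10 numbered columns and 10 numbered rows.
points = grid_sampling_points(10, 10, 15, seed=11)
print(points)
```

Each pair identifies a point on the map; data would then be collected from the household on, or nearest to, that point.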
Alternatively, if the survey is of urban households then clusters of dwellings such as blocks bounded by streets can be identified. This can serve as a convenient sampling frame. The town area is then divided into blocks and these blocks are numbered and a random sample of them is selected. The boundaries of the blocks must be well defined, easily identifiable by field workers and every dwelling must be clearly located in only one block. Streets, railway lines and rivers make good boundaries.
Research is conducted in order to determine the acceptability (or otherwise) of hypotheses. Having set up a hypothesis, we collect data which should yield direct information on the acceptability of that hypothesis. This empirical data needs to be organised in such a fashion as to make it meaningful. To this end, we organise it into frequency distributions and calculate averages or percentages. But often, these statistics on their own mean very little. The data we collect often needs to be compared and, when comparisons have to be made, we must take into account the fact that our data is collected from a sample of the population and is subject to sampling and other errors. The remainder of this chapter is concerned with the statistical testing of sample data. One assumption which is made is that the survey results are based on random probability samples.
The first step in evaluating sample results is to set up a null hypothesis (H0). The null hypothesis is a hypothesis of no differences. We formulate it for the express purpose of rejecting it. It is formulated before we collect the data (a priori). For example, we may wish to know whether a particular promotional campaign has succeeded in increasing awareness amongst housewives of a certain brand of biscuit. Before the campaign we have a certain measure of awareness, say x%. After the campaign we obtain another measure of the awareness, say y%. The null hypothesis in this case would be that "there is no difference between the proportions aware of the brand, before and after the campaign".
Since we are dealing with sample results, we would expect some differences; and we must try and establish whether these differences are real (i.e. statistically significant) or whether they are due to random error or chance.
If the null hypothesis is rejected, then the alternative hypothesis may be accepted. The alternative hypothesis (H1) is a statement relating to the researchers' original hypothesis. Thus, in the above example, the alternative hypothesis could either be:
a. H1: There is a difference between the proportions of housewives aware of the brand, before and after the campaign,
or
b. H1: There is an increase in the proportion of housewives aware of the brand, after the promotional campaign.
Note that these are clearly two different and distinct hypotheses. Case (a) does not indicate the direction of change and requires a two-tailed test. Case (b), on the other hand, indicates the predicted direction of the difference and a one-tailed test is called for. Situations in which a one-tailed test is used include:
(a) comparing an experimental product with a currently marketed one
(b) comparing a cheaper product which will be marketed only if it is not inferior to a current product.
Parametric tests and nonparametric tests
The next step is that of choosing the appropriate statistical test. There are basically two types of statistical test, parametric and nonparametric. Parametric tests are those which make assumptions about the nature of the population from which the scores were drawn (i.e. population values are "parameters", e.g. means and standard deviations). If we assume, for example, that the distribution of the sample means is normal, then we require to use a parametric test. Nonparametric tests do not require this type of assumption and relate mainly to that branch of statistics known as "order statistics". We discard actual numerical values and focus on the way in which things are ranked or classed. Thereafter, the choice between alternative types of test is determined by 3 factors: (1) whether we are working with dependent or independent samples, (2) whether we have more or less than two levels of the independent variable, and (3) the mathematical properties of the scale which we have used, i.e. ratio, interval, ordinal or nominal. (These issues are covered extensively in the data analysis course notes).
We will reject H0, our null hypothesis, if a statistical test yields a value whose associated probability of occurrence is equal to or less than some small probability, known as the significance level (or critical level). Common values of this level are 0.05 and 0.01. Referring back to our example, if we had set our significance level in advance at 0.05 and had found that the observed difference between the percentage of housewives aware of the brand pre- and post-campaign could have arisen by chance with probability 0.1, then we would accept H0. If, on the other hand, we found the probability of this difference occurring was 0.02, then we would reject the null hypothesis and accept our alternative hypothesis.
The choice of significance level affects the ratio of correct and incorrect conclusions which will be drawn. Given a significance level there are four alternatives to consider:
Figure 7.5 Type I and type II errors
Correct Conclusion                Incorrect Conclusion
Accept a correct hypothesis       Reject a correct hypothesis (type I error)
Reject an incorrect hypothesis    Accept an incorrect hypothesis (type II error)
Consider the following example. In a straightforward test of two products, we may decide to market product A if, and only if, 60% of the population prefer the product. Clearly we can set a sample size, so as to reject the null hypothesis of A = B = 50% at, say, a 5% significance level. If we get a sample which yields 62% (and there will be 5 chances in a 100 that we get a figure greater than 60%) and the null hypothesis is in fact true, then we make what is known as a Type I error.
If however, the real population is A = 62%, then we shall accept the null hypothesis A = 50% on nearly half the occasions as shown in the diagram overleaf. In this situation we shall be saying "do not market A" when in fact there is a market for A. This is the type II error. We can of course increase the chance of making a type I error which will automatically decrease the chance of making a type II error.
Obviously some sort of compromise is required. This depends on the relative importance of the two types of error. If it is more important to avoid rejecting a true hypothesis (type I error), a high confidence coefficient (a low value of α, the significance level) will be used. If it is more important to avoid accepting a false hypothesis, a low confidence coefficient may be used. An analogy with the legal profession may help to clarify the matter. Under our system of law, a man is presumed innocent of murder until proved otherwise. Now, if a jury convicts a man when he is, in fact, innocent, a type I error will have been made: the jury has rejected the null hypothesis of innocence although it is actually true. If the jury absolves the man when he is, in fact, guilty, a type II error will have been made: the jury has accepted the null hypothesis of innocence when the man is really guilty. Most people will agree that in this case a type I error, convicting an innocent man, is the more serious.
In practice, of course, researchers rarely base their decisions on a single significance test. Significance tests may be applied to the answers to every question in a survey but the results will only be convincing if consistent patterns emerge. For example, we may conduct a product test to find out consumers' preferences. We do not usually base our conclusions on the results of one particular question, but we ask several, make statistical tests on the key questions and look for consistent significances. We must remember that when one makes a series of tests, some of the correct hypotheses will be rejected by chance. For example, if 20 questions were asked in our "before" and "after" survey and we test each question at the 5% level, then on average one of the differences is likely to give a significant result, even if there is no real difference in the population.
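The 20-question example can be quantified with a short sketch. It assumes, as a simplification not stated in the text, that the 20 tests are independent of one another; the function names are illustrative.

```python
def expected_false_positives(n_tests, alpha):
    """Average number of true null hypotheses rejected by chance."""
    return n_tests * alpha

def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance of one or more spurious 'significant' results (independent tests assumed)."""
    return 1 - (1 - alpha) ** n_tests

expected = expected_false_positives(20, 0.05)
at_least_one = prob_at_least_one_false_positive(20, 0.05)
print(expected, round(at_least_one, 3))
```

Under these assumptions one spurious result is expected on average, and the chance of seeing at least one is roughly 64%, which is why consistent patterns across questions carry more weight than any single significant difference.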
No mention is made in these notes of considerations of the costs of incorrect decisions. Statistical significance is not always the only criterion on which to base action. Economic consideration of alternative actions is often just as important.
These, therefore, are the basic steps in the statistical testing procedure. The majority of tests are likely to be parametric tests where researchers assume some underlying distribution like the normal or binomial distribution. Researchers will obtain a result, say a difference between two means, calculate the standard error of the difference and then ask "How far away from the zero difference hypothesis is the difference we have found from our samples?"
To enable researchers to answer this question, they convert their actual difference into "standard errors" by dividing it by its standard error (the standard deviation of its sampling distribution), then refer to a chart to ascertain the probability of such a difference occurring.
1. Suppose a researcher wishes to measure a population with respect to the percentage of persons owning a maize sheller. He/she may have a rough idea of the likely percentage, and wishes the sample to be accurate to within 5% points and to be 95% confident of this accuracy.
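2. Consider the standard error of a percentage:
$$SE(p) = \sqrt{\frac{pq}{n}}, \quad \text{where } q = 100 - p$$
Assume that the researcher hazards a guess that the likely percentage of ownership is 30%.
Then,
$$SE(p) = \sqrt{\frac{30 \times 70}{n}} = \sqrt{\frac{2100}{n}}$$
But $2 \times SE(p)$ must equal 5% (the level of accuracy required)
i.e.
$$2\sqrt{\frac{2100}{n}} = 5$$
$$\frac{2100}{n} = 6.25 \quad \text{i.e.} \quad n = 336$$
It is necessary to take a sample of, say, 340 (rounding up).
Generally, then, for percentages, the sample size may be calculated using:
$$n = \frac{4pq}{(\text{required accuracy})^2}$$
for accuracy at the 95% level.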
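The sample size rule can be wrapped in a small function. This is a sketch under the same approximation as the text, namely that 95% confidence corresponds to two standard errors; the function and parameter names are illustrative.

```python
import math

def sample_size_for_percentage(expected_pct, accuracy_pct, z=2.0):
    """Sample size so that z standard errors equal the required accuracy.

    expected_pct: rough guess at the percentage being estimated (0 to 100)
    accuracy_pct: required accuracy in percentage points
    z: number of standard errors; 2.0 approximates the 95% level
    """
    q = 100.0 - expected_pct
    return math.ceil(z ** 2 * expected_pct * q / accuracy_pct ** 2)

n = sample_size_for_percentage(30, 5)
print(n)  # 336
```

Because the product pq is largest when p = 50%, guessing 50% gives the most cautious sample size (400 for the same 5 point accuracy) when no prior estimate is available.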
Case 1: In a census taken 6 years ago, 60% of farms were found to be selling horticultural produce direct to urban markets. Recently a sample survey has been carried out on 1000 farms and found 70% of them were selling their horticultural produce to urban centres direct.
Situation: Population statistics (P = 60%) are known
Question: Has there been a change in 6 years or is the higher percentage (p = 70%) found due to sampling error?
When the population value is known, the sampling error can be calculated directly and we use this error for the purpose of our statistical test. The standard error of a percentage is always $\sqrt{pq/n}$, but in this case the researcher puts P, the population value, in the formula and uses the size of the sample, n, to ascertain the standard error of the estimate, p = 70%.
The null hypothesis for this case is: "There is no difference between the sample percentage of farms selling direct to urban areas and the population percentage of farms found to be selling direct 6 years ago" (i.e. the sample we have drawn comes from the population on which the census was carried out and there has been no change in the 6 years).
This must be a two-tailed test, as it could not be assumed in advance whether there would be more or fewer farms selling produce direct six years later.
Standard error $= \sqrt{\frac{PQ}{n}}$ where $Q = 100 - P$
$$= \sqrt{\frac{60 \times 40}{1000}} = \sqrt{2.4} = 1.55\%$$
Statistical test:
$$t = \frac{p - P}{S.E.} = \frac{70 - 60}{1.55} = 6.45$$
N.B. This has infinite degrees of freedom.
If reference is made to the table for a two-tailed test with infinite degrees of freedom, it can be seen that the value of t at the 1/1000 level of significance is 3.29. Since 6.45 > 3.29, the probability of our result (p = 70%) having arisen through sampling error alone must be even smaller than 1/1000. Researchers are therefore able to say that the percentage of farms selling direct has changed, and that the null hypothesis is refuted at beyond the 1/1000 level of significance. If researchers claim this, they shall be wrong less than 1 in 1000 times.
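The Case 1 calculation can be reproduced with the sketch below. The function name is an illustrative assumption; the inputs are the census percentage, the survey percentage and the survey sample size given in the case.

```python
import math

def one_sample_percentage_test(population_pct, sample_pct, n):
    """Compare a sample percentage with a known population percentage."""
    q = 100 - population_pct
    se = math.sqrt(population_pct * q / n)  # standard error uses the population value
    t = (sample_pct - population_pct) / se
    return se, t

se, t = one_sample_percentage_test(population_pct=60, sample_pct=70, n=1000)
print(round(se, 2), round(t, 2))  # 1.55 6.45
```

The resulting statistic is then compared with the tabulated two-tailed value for the chosen level of significance (3.29 at the 1/1000 level).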
Case 2: Six months ago, it was found from a sample survey of 200 respondents that 20% of shoppers in a certain urban area buy fresh fruit and vegetables from street vendors rather than established shops or supermarkets. A second survey, independent of the earlier one, is carried out on 500 respondents and it is found that 24% of them buy fresh fruit and vegetables regularly from street vendors. Is there any real difference?
Situation: The two surveys are carried out on different occasions, so the two samples may well be subject to different amounts of error. Because of this, researchers use both estimates of error.
Question: Has the percentage of shoppers buying from street vendors changed?
Null hypothesis: There is no difference in the percentages of housewives buying from street vendors six months ago and now. This is a two-tailed test.
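Six months ago                                   Now
P1 = 20%                                         P2 = 24%
n1 = 200                                         n2 = 500
S.E. = $\sqrt{\frac{20 \times 80}{200}} = 2.83\%$          S.E. = $\sqrt{\frac{24 \times 76}{500}} = 1.91\%$
Standard error of $(P_1 - P_2)$
Since P1 is independent of P2
$$S.E.(P_1 - P_2) = \sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}} = \sqrt{\frac{20 \times 80}{200} + \frac{24 \times 76}{500}} = \sqrt{8 + 3.65}$$
= 3.4%
Test of significance
$$t = \frac{P_2 - P_1}{S.E.(P_1 - P_2)} = \frac{24 - 20}{3.4} = 1.18$$
N.B. This has infinite degrees of freedom.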
Since 1.18 < 1.64, the difference is not significant at even 1/10 (10%) level, so the null hypothesis is not refuted and researchers do not accept that there is any significant change in the percentage of women buying fresh fruit and vegetables from street vendors.
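The Case 2 calculation can be sketched as follows, with the two samples treated as independent so that their error variances add. The function name is an illustrative assumption.

```python
import math

def independent_samples_test(p1, n1, p2, n2):
    """Test the difference between percentages from two independent samples."""
    variance = p1 * (100 - p1) / n1 + p2 * (100 - p2) / n2
    se = math.sqrt(variance)
    return se, (p2 - p1) / se

se, t = independent_samples_test(p1=20, n1=200, p2=24, n2=500)
print(round(se, 2), round(t, 2))
```

Carried through without intermediate rounding, the standard error is about 3.41 percentage points and the test statistic about 1.17, which is below the 1.64 needed for significance at the 10% level.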
Case 3: In a sample of 200 rural housewives, 54% are found to include fish in their family's weekly diet. However, in a sample of 100 urban housewives only 33% said that fish was a regular part of their diet.
Situation: The same commodity is being investigated on the same occasion by sampling two parts of a population.
Question: Is there any difference between rural and urban housewives in their regular consumption of fish?
Null hypothesis: There is no difference between rural and urban housewives in their regular consumption of fish. This is a two-tailed test.
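| | Urban | Rural |
|---|---|---|
| Number eating fish regularly | $c_1 = 33$ | $c_2 = 108$ |
| Percentage | $p_1 = 33\%$ | $p_2 = 54\%$ |
| Sample size | $n_1 = 100$ | $n_2 = 200$ |

Standard error of the difference $(p_2 - p_1)$
$$\text{S.E.}(p_2 - p_1) = \sqrt{p\,q\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where
$$p = \frac{c_1 + c_2}{n_1 + n_2} \times 100 = \frac{33 + 108}{100 + 200} \times 100 = 47\%, \qquad q = 100 - p = 53\%$$
N.B. Researchers take an average value of p, since under the null hypothesis the rural and urban families are assumed to be alike and the circumstances of measurement of $p_1$ and $p_2$ are exactly the same.
So
$$\text{S.E.}(p_2 - p_1) = \sqrt{47 \times 53 \left(\frac{1}{100} + \frac{1}{200}\right)} = \sqrt{37.37}$$
= 6.1
Significance test
$$t = \frac{p_2 - p_1}{\text{S.E.}(p_2 - p_1)} = \frac{54 - 33}{6.1}$$
t = 3.44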
(N.B. This has infinite degrees of freedom).
Since 3.44 > 3.29, the two-tailed t-value for the 1/1000 level of significance with infinite degrees of freedom, the null hypothesis is refuted at beyond the 1/1000 level. Thus the difference in fish consumption between rural and urban housewives is significant at beyond the 1/1000 level.
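The pooled calculation in Case 3 can be sketched in Python as follows. The sketch assumes the counts and sample sizes given in the example (33 of 100 urban and 108 of 200 rural housewives) and works in percentages on a 0 to 100 scale; the function name is hypothetical. It should give a standard error of roughly 6.1 and t of roughly 3.44.

```python
import math


def se_difference_pooled(c1, n1, c2, n2):
    """Standard error (percentage points) of the difference between two
    sample percentages, using a pooled p as in Case 3."""
    p = 100.0 * (c1 + c2) / (n1 + n2)  # pooled percentage across both samples
    q = 100.0 - p
    return math.sqrt(p * q * (1.0 / n1 + 1.0 / n2))


# Figures from Case 3
c1, n1 = 33, 100    # urban housewives eating fish regularly
c2, n2 = 108, 200   # rural housewives eating fish regularly

p1 = 100.0 * c1 / n1
p2 = 100.0 * c2 / n2

se = se_difference_pooled(c1, n1, c2, n2)
t = (p2 - p1) / se

# 3.29 is the two-tailed critical value at the 1/1000 level quoted in the text
significant_at_0_1pct = abs(t) > 3.29

print(f"pooled S.E. = {se:.2f}, t = {t:.2f}, "
      f"significant at 0.1%: {significant_at_0_1pct}")
```

Pooling is appropriate here only because the null hypothesis treats the two groups as coming from one population measured under the same conditions; in Case 2 that assumption was not made, so the two variances were kept separate.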
Case 4: 200 housewives are interviewed in June to determine their purchases of a canned fruit juice. In September, after an intensive promotional campaign, they are reinterviewed with the same object.
Situation: The same sample is interviewed on two different occasions (or assessing two different products).
Question: Is there any difference in purchases of the product between June and September?
Null hypothesis: There is no difference in purchases of the product between June and September. This is a two-tailed test.

| | June | September |
|---|---|---|
| Purchases (%) | 20 | 32 |

Sample size = n = 200
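Standard error of the difference $(p_2 - p_1)$
$$\text{S.E.}(p_2 - p_1) = \sqrt{\frac{p_1 q_1}{n} + \frac{p_2 q_2}{n} - 2\,\text{Cov}(p_1, p_2)}$$
The last term under the square root sign = 2 × Covariance of the two assessments, the term which takes into consideration how each person behaves both in June and September.
Substituting the survey figures:
$$\text{S.E.}(p_2 - p_1) = 3.54$$
Significance test
$$t = \frac{p_2 - p_1}{\text{S.E.}(p_2 - p_1)} = \frac{32 - 20}{3.54} = 3.39$$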
This has infinite degrees of freedom.
Since 3.39 > 3.29, the two-tailed value with infinite degrees of freedom, the difference between the June and September purchases is significant at beyond the 1/1000 or 0.1% level (i.e. the null hypothesis is refuted at this level).
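A minimal Python sketch of the Case 4 calculation is given below. The covariance figure used in the worked example is not reproduced in the text, so the sketch takes it as an input rather than guessing it; the function name and the `covariance` parameter are illustrative assumptions. Setting the covariance to zero shows what the standard error would have been had the two sets of interviews been independent.

```python
import math


def se_difference_paired(p1, p2, n, covariance):
    """Standard error (percentage points) of p2 - p1 when the SAME n
    respondents are measured twice. `covariance` is the covariance of
    the two assessments, on the same scale as the two variance terms."""
    var1 = p1 * (100 - p1) / n
    var2 = p2 * (100 - p2) / n
    total = var1 + var2 - 2 * covariance
    if total < 0:
        raise ValueError("covariance is too large for these variances")
    return math.sqrt(total)


# Case 4 figures: 20% in June, 32% in September, n = 200
se_if_independent = se_difference_paired(20.0, 32.0, 200, covariance=0.0)

# 3.54 is the standard error quoted in the worked example
t_quoted = (32.0 - 20.0) / 3.54

print(f"S.E. with zero covariance = {se_if_independent:.2f}")
print(f"t using the quoted S.E. of 3.54 = {t_quoted:.2f}")
```

With zero covariance the standard error is about 4.35, larger than the 3.54 quoted above. The positive covariance between a respondent's June and September answers is what makes a reinterviewed sample a more sensitive test of change than two independent samples of the same size.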
Confidence intervals for the mean
Sometimes the task is one of estimating a population value from a sample mean, rather than testing hypotheses. For example, suppose that in a sample of 100 farmers the average monthly purchase of the insecticide Bugdeath is found to be 10.5 litres. It cannot be assumed that, simply because the sample mean was 10.5 litres, this is necessarily a good estimate of the average purchases of all farmers in the population. A point estimate such as 10.5 litres says nothing about its own precision; what a sample can properly give is a range within which it is thought the true population value lies. To calculate this range researchers need to know the standard deviation as well as the mean. The standard deviation is calculated as follows:
Suppose a small sample of, say, 8 farmers is taken and each is asked how much Bugdeath he or she buys each month. Their responses appear in table 7.1 below. Their mean consumption is 10.5 litres per month. In the middle column each of the individual values has been subtracted from the mean. In the end column these differences have been squared, and the squares summed to give the total variance (the sum of squared deviations).
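Table 7.1 Calculating the mean and standard deviation

| X (consumption in litres) | $\bar{X} - X$ | $(\bar{X} - X)^2$ |
|---|---|---|
| 5 | 5.5 | 30.25 |
| 8 | 2.5 | 6.25 |
| 8 | 2.5 | 6.25 |
| 11 | -0.5 | 0.25 |
| 11 | -0.5 | 0.25 |
| 11 | -0.5 | 0.25 |
| 14 | -3.5 | 12.25 |
| 16 | -5.5 | 30.25 |
| $\bar{X} = 10.5$ | | Total variance = 86.00 |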
To calculate the standard deviation, researchers divide the total variance (the sum of squared deviations) by the sample size and take the square root, i.e.
$$s = \sqrt{\frac{\sum (\bar{X} - X)^2}{n}} = \sqrt{\frac{86}{8}} = \sqrt{10.75} = 3.28$$
From the standard deviation researchers must now calculate the standard error if they are to project from what are sample figures to the population. The standard error is calculated by dividing the standard deviation by the square root of the sample size, viz:
$$\text{S.E.} = \frac{s}{\sqrt{n}} = \frac{3.28}{\sqrt{8}} = \frac{3.28}{2.83} = 1.16$$
Thus the estimate is that the average consumption is 10.5 litres plus or minus 1.16 litres, i.e. it is estimated that the average monthly purchase lies somewhere between 9.34 litres and 11.66 litres. This is the best estimate that can be given on the basis of such a small sample.
As those who have studied elementary statistics will know, only 68% of the values under a normal distribution curve lie within ±1 standard deviation of the mean. In other words, researchers can only be 68% sure that the true consumption level is between 9.34 and 11.66 litres. If researchers want to be 95% sure of a correct prediction then they must multiply their standard error by 1.96. (Students may have to be reminded that if they look up their statistical tables they will see that 95% of the area under the curve equates to a Z value of 1.96.)
Thus, the calculation becomes:
$$\text{Confidence interval} = \bar{X} \pm 1.96 \times (\text{Standard Error})$$
= 10.5 ± 1.96 × 1.16
= 10.5 ± 2.3
= 8.2 to 12.8 litres
So, researchers are 95% confident that the true average usage of Bugdeath among farmers is between 8.2 and 12.8 litres. This example serves to show the mechanics of the confidence interval calculation and how wide the estimates obtained from small samples are.
Students who have had a basic training in statistics will also know that if they wanted to be 99% confident then the Z value would be 2.58 (more precisely 2.576) rather than 1.96.
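The confidence interval recipe described above can be followed step by step in Python. This is a sketch under the text's own assumptions: the sum of squared deviations is divided by n (not n - 1), and normal Z values are used even though the sample has only 8 farmers. The variable and function names are illustrative. Dividing the standard deviation (about 3.28) by the square root of 8 (about 2.83) gives a standard error of about 1.16 litres.

```python
import math

# Monthly purchases (litres) of the 8 farmers in Table 7.1
purchases = [5, 8, 8, 11, 11, 11, 14, 16]

n = len(purchases)
mean = sum(purchases) / n

# Sum of squared deviations from the mean ("total variance" in Table 7.1)
sum_sq_dev = sum((mean - x) ** 2 for x in purchases)

# Standard deviation: divide by n, then take the square root
sd = math.sqrt(sum_sq_dev / n)

# Standard error of the mean: standard deviation over the square root of n
se = sd / math.sqrt(n)


def confidence_interval(mean, se, z):
    """Return (lower, upper) limits of mean +/- z standard errors."""
    half_width = z * se
    return mean - half_width, mean + half_width


ci95 = confidence_interval(mean, se, 1.96)    # Z for 95% confidence
ci99 = confidence_interval(mean, se, 2.576)   # Z for 99% confidence

print(f"mean = {mean}, sum of squares = {sum_sq_dev}")
print(f"sd = {sd:.2f}, se = {se:.2f}")
print(f"95% interval: {ci95[0]:.1f} to {ci95[1]:.1f} litres")
print(f"99% interval: {ci99[0]:.1f} to {ci99[1]:.1f} litres")
```

Raising the confidence level widens the interval, while quadrupling the sample size would halve the standard error, since n enters through its square root.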
Chapter Summary
Two major principles underlie all sample design: the desire to avoid bias in the selection procedure and to achieve the maximum precision for a given outlay of resources. Sampling bias arises when selection is consciously or unconsciously influenced by human choice, when the sampling frame inadequately covers the target population, or when some sections of the population cannot be found or refuse to cooperate.
Random, or probability sampling, gives each member of the target population a known and equal probability of selection. Systematic sampling is a modification of random sampling. To arrive at a systematic sample we simply calculate the desired sampling fraction and take every nth case.
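The "every nth case" rule can be sketched in a few lines of Python. The sketch assumes the sampling frame is an ordinary list and that the starting point is chosen at random within the first interval, a common convention that is treated here as an assumption; the function name and the frame of 1,000 numbered households are hypothetical.

```python
import random


def systematic_sample(frame, sample_size, rng=None):
    """Select `sample_size` items from `frame` by taking every nth case
    after a random start within the first interval."""
    rng = rng or random.Random()
    if not 0 < sample_size <= len(frame):
        raise ValueError("sample_size must be between 1 and len(frame)")
    interval = len(frame) // sample_size   # n in "every nth case"
    start = rng.randrange(interval)        # random start: 0 .. interval - 1
    return [frame[start + i * interval] for i in range(sample_size)]


# Hypothetical frame: households numbered 1 to 1000, sampling fraction 1 in 20
households = list(range(1, 1001))
sample = systematic_sample(households, 50, rng=random.Random(0))

print(len(sample), sample[:3])
```

If the frame is ordered in a way that is cyclical with the same period as the sampling interval, every selected case may share a characteristic, so the ordering of the list is worth checking before the interval is fixed.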
Stratification increases precision without increasing sample size. There is no departure from the principles of randomness. It merely denotes that before any selection takes place, the population is divided into a number of strata, then a random sample is taken within each stratum. It is only possible to stratify if the distribution of the population with respect to a particular factor is known, and if it is also known to which stratum each member of the population belongs. Random stratified sampling is more precise and more convenient than simple random sampling. Stratification has the effect of removing differences between stratum means from the sampling error. The best basis for stratification would be the frequency distribution of the principal variable being studied. Two practical problems limit the desirability of a large number of strata: (1) past a certain point, the "residual" variation will dominate, and little improvement will be effected by creating more strata; and (2) a point may be reached where the creation of additional strata is economically unproductive. Sample sizes within strata are determined either on a proportional allocation or optimum allocation basis.
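Proportional allocation, in which each stratum receives the same share of the sample as it has of the population, can be illustrated with a short Python sketch. The strata and their sizes below are invented purely for illustration, and the rule used to hand out leftover interviews after rounding down (largest fractional remainder first) is one reasonable choice among several, not a method prescribed by the text.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Split `total_sample` across strata in proportion to stratum size.
    `stratum_sizes` maps stratum name -> number of population members."""
    population = sum(stratum_sizes.values())
    exact = {name: total_sample * size / population
             for name, size in stratum_sizes.items()}

    # Round every share down, then distribute what is left over to the
    # strata with the largest fractional remainders
    allocation = {name: int(share) for name, share in exact.items()}
    remaining = total_sample - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - allocation[name],
                          reverse=True)
    for name in by_remainder[:remaining]:
        allocation[name] += 1
    return allocation


# Hypothetical population of 10,000 farms in three size strata
farms = {"small": 6000, "medium": 3000, "large": 1000}
allocation = proportional_allocation(farms, 200)

print(allocation)
```

Optimum allocation would instead weight each stratum by its variability as well as its size, which requires estimates of the within-stratum standard deviations that this sketch does not attempt.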
Quota sampling is a method of stratified sampling in which the selection within strata is nonrandom. Therefore, it is not possible to estimate sampling errors. Some argue that sampling errors are so small compared with all the other errors and biases that not being able to estimate standard errors is no great disadvantage. The interviewer may fail to secure a representative sample of respondents in quota sampling: for example, are those in the over-65 age group spread over the whole age range or clustered around 65 and 66? Social class controls leave a lot to the interviewer's judgment. Strict control of fieldwork is also more difficult: did interviewers place respondents in groups where cases are needed rather than in those to which they belong?
A quota interview on average costs only a half or a third as much as a random interview. The labour of random selection is avoided, and so are the headaches of non-contact and call-backs. If fieldwork has to be quick, perhaps to reduce memory errors, quota sampling may be the only possibility. Quota sampling is also independent of the existence of sampling frames.
The process of sampling complete groups or units is called cluster sampling. Where there is subsampling within the clusters chosen at the first stage, the term multistage sampling applies. The population is regarded as being composed of a number of first stage or primary sampling units (PSUs), each of them being made up of a number of second stage units. A sample of PSUs is selected, then a sample of second stage units within each selected PSU, and so the procedure continues down to the final sampling unit, with the sampling ideally being random at each stage. Cluster sampling materially simplifies fieldwork and makes it cheaper. That is, cluster sampling tends to offer greater reliability for a given cost rather than greater reliability for a given sample size. With respect to statistical efficiency, a larger number of small clusters is better, all other things being equal, than a small number of large clusters.
Multistage sampling involves first selecting the PSUs, then the final sampling units such as individuals, households or addresses.
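A two-stage version of this procedure might be sketched in Python as follows. The sketch assumes simple random selection at both stages and a frame held as a dictionary mapping each PSU to a list of its final sampling units; the village and household names are invented for illustration, and the function name is hypothetical.

```python
import random


def two_stage_sample(psus, n_psus, n_units_per_psu, rng=None):
    """Stage 1: randomly select `n_psus` primary sampling units.
    Stage 2: randomly select up to `n_units_per_psu` final units
    within each selected PSU."""
    rng = rng or random.Random()
    chosen_psus = rng.sample(sorted(psus), n_psus)
    return {
        psu: rng.sample(psus[psu], min(n_units_per_psu, len(psus[psu])))
        for psu in chosen_psus
    }


# Hypothetical frame: 10 villages (PSUs), each with 30 households
villages = {
    f"village_{i}": [f"village_{i}_household_{j}" for j in range(30)]
    for i in range(10)
}

selected = two_stage_sample(villages, n_psus=3, n_units_per_psu=5,
                            rng=random.Random(1))

for village, households in selected.items():
    print(village, households)
```

Only the three selected villages need to be visited, which is where the saving in fieldwork cost comes from; the price is that households within a village tend to resemble one another, so 15 clustered interviews usually carry less information than 15 spread across all ten villages.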
Area sampling is basically multistage sampling in which maps, rather than lists or registers, serve as the sampling frame. This is the main method of sampling in developing countries where adequate population lists are rare.
Key Terms
Area sampling
Cluster sampling
Confidence intervals
Degrees of freedom
Multistage sampling
Nonparametric tests
Null hypothesis
Parametric tests
Proportional allocation
Quota sampling
Random number
Random sampling
Sample mean
Significance test
Standard errors
Stratified samples
Systematic sampling
Type I errors and type II errors
Review Questions
1. Define the term 'random sampling'.
2. Name the 3 nonprobability sampling methods shown in the opening section of the chapter.
3. What are the 3 key questions to be posed when employing stratified sampling?
4. Explain the term 'proportional allocation'.
5. Outline the arguments against quota sampling.
6. Explain the term 'primary sampling units' (PSUs).
7. Define the term 'null hypothesis'.
8. What are the 2 types of statistical tests?
9. Explain the meaning of a 'type I error'.
10. Which Z value equates to a 95% confidence level?
Chapter References
1. Crawford, I. M. (1990), Marketing Research, Centre and Network for Agricultural Marketing Training in Eastern and Southern Africa, Harare, pp. 36-48.