

Data issues

5. Data sets have several characteristics. Firstly, a given piece of information has traits that can conflict with one another. For example, a livestock price such as that of beef can be reported at a pre-determined frequency (giving it a semblance of timeliness) and yet be inaccurate. Conversely, the most precise information may no longer be relevant by the time a decision is made. Hence, general livestock statistics cannot satisfy all the data quality demands of the various users.

6. Secondly, the accuracy of a data item depends on both the quality of the estimates of its components and the interrelationship between them. Consider the simple relationship: S = P + M + I

Where

S is livestock supply level
P is domestic livestock production
M is livestock imports
I is the herd inventory.

Assume further that the following hold:

(i) The relative errors (defined as the difference between the actual data and its estimate divided by the actual data) are: P (20%); M (30%); and I (10%).

(ii) The ratios P/S, M/S, I/S are 0.4, 0.4, 0.2 respectively.

7. The relative error of "S" will then be 22% (20% × 0.4 + 30% × 0.4 + 10% × 0.2). This means that the quality of the estimate of S depends on the quality of the estimates of its components, each weighted by its share of S. The nature of the relationship between the data also affects the accuracy of estimates. For example, however accurately "S", "P" or "M" are measured, "I" (i.e. inventory) will be subject to large relative errors because the I/S ratio is relatively small.
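The arithmetic behind the 22% figure can be reproduced with a short sketch. It uses only the relative errors and shares given above; the function and variable names are illustrative, and the calculation assumes that the component errors all run in the same direction, so that none of them cancel out.

```python
# Relative errors of the components and their shares in supply S,
# taken from assumptions (i) and (ii) in the text.
relative_error = {"P": 0.20, "M": 0.30, "I": 0.10}
share_of_supply = {"P": 0.4, "M": 0.4, "I": 0.2}


def combined_relative_error(errors, shares):
    """Share-weighted sum of the component relative errors.

    Assumes all errors have the same sign (no cancellation).
    """
    return sum(errors[name] * shares[name] for name in errors)


error_S = combined_relative_error(relative_error, share_of_supply)
print(f"Relative error of S: {error_S:.0%}")
```

Changing any one share or error in the two dictionaries shows how strongly the components with the largest shares (here P and M) drive the error of S.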

8. Thirdly, data use makes sense only if one selects the appropriate statistical index. For example, casual analysis often leads one to assume that an average is representative of the population being scrutinized. If the population is the liveweight price of cattle with the following values (Naira/head): 1,000, 2,000, 2,000, and 5,000, the average price is 2,500, which is not the price of any single animal. A more representative value in this example is the mode, which is 2,000 (i.e., the price observed most frequently).
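The cattle price example can be checked with Python's standard `statistics` module. This is only a sketch of the comparison described above; the variable names are illustrative.

```python
from statistics import mean, mode

# Liveweight cattle prices in Naira/head, as given in the text.
prices = [1000, 2000, 2000, 5000]

average_price = mean(prices)  # arithmetic mean of the four prices
modal_price = mode(prices)    # most frequently observed price

print(f"Average price: {average_price:,.0f}")
print(f"Modal price:   {modal_price:,.0f}")
print("Average is an observed price:", average_price in prices)
```

The single high price of 5,000 pulls the average above three of the four observations, whereas the mode stays at a price that was actually paid.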

9. Fourthly, data cost money. An aerial national livestock census for Nigeria can easily cost US$ 2-3 million. If one is undertaking a survey (i.e. analysing a subset of a given population) and the accuracy of the sample estimates is to be improved by 50% (i.e. the standard error is to be halved), then the sample size will need to be increased fourfold. This simply means that instead of interviewing 100 farmers, 400 will need to be covered. The additional 300 respondents will obviously cost additional money (e.g. higher travelling costs). The magnitude of a data activity will therefore be partly determined by its value relative to its cost. It should further be noted that one should not seek comfort in a larger sample size as an assurance of statistical accuracy. Measurement errors can easily creep in with a large sample: interviewers can become bored while conducting the survey and hence careless in accepting responses.
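The link between sample size and standard error can be sketched as follows. The sketch assumes simple random sampling, where the standard error of a sample mean is the population standard deviation divided by the square root of the sample size; the text does not state the sampling design, and the standard deviation of 1.0 is a hypothetical placeholder that does not affect the ratio.

```python
import math


def standard_error(sigma, n):
    """Standard error of a sample mean under simple random sampling."""
    return sigma / math.sqrt(n)


def required_sample_size(current_n, reduction_factor):
    """Sample size needed to divide the standard error by reduction_factor."""
    return math.ceil(current_n * reduction_factor ** 2)


sigma = 1.0  # hypothetical population standard deviation

se_100 = standard_error(sigma, 100)
se_400 = standard_error(sigma, 400)
n_needed = required_sample_size(100, 2)  # halve the standard error

print(f"Standard error with 100 farmers: {se_100:.3f}")
print(f"Standard error with 400 farmers: {se_400:.3f}")
print(f"Farmers needed to halve the standard error: {n_needed}")
```

Because the required sample size grows with the square of the desired improvement, each further gain in precision costs proportionally more than the last.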

10. Fifthly, the end-use of a data set can influence its quality. A case in point is livestock trade data. If a country has stringent tariff and non-tariff sanctions, then animal imports will likely be understated. It is natural for an entrepreneur to dodge government regulations if they will affect his profits. A livestock trade policy analyst must use his ingenuity to validate the data, checking it against the corresponding records of major trade partners and against other information correlated with the live animal trade, in order to detect the magnitude of the bias in his data.

