This chapter gives guidelines for extracting data on breed characteristics and for assembling them in an appropriate fashion for subsequent compilation into the approved Descriptor List. The person preparing data (compiler) is reminded of the role of the Data Bank (DB) and urged to keep in mind its value as a pool of information on breed characteristics within defined environments. The compiler should also keep in mind the needs of users for information relevant to the future utilization of animal genetic resources in other similar or dissimilar environments. Thus, this exercise of data extraction and presentation must include an exhaustive search of the published literature and other unpublished data sources, the evaluation of these sources and the extraction of valid genetic and associated environmental information and preparation of this information in a form suitable for entry into the Descriptor Lists.
The data for the Data Bank will be derived from various published or unpublished sources. A Source is defined here as any document having authentic data which would add to the sum of knowledge about the genetic characteristics of a breed. The Source could have been written in any language. The likely types of Sources are listed below.
The Data Bank does not include individual animal records but performance statistics of groups of animals of known breed type, together with the conditions under which these statistics were measured. The statistics should be entered in English, using the Descriptor Lists in this publication. Similar Descriptor Lists in French and Spanish are available.
All the persons involved should understand the background objectives and the basic principles of data handling. The team leader must have the following qualifications:
The assisting members of the team should preferably have a degree in Animal Science, Veterinary Science or Biological Sciences. Non-professional members could assist in restricted areas such as compilation of data on rainfall, environmental temperatures etc. for the various stations covered by the Sources. It is emphasised that the team leader should be closely involved in training the team members and in all stages of the data extraction.
The Descriptor List is comprehensive, covering all aspects of the breed characteristics and almost all classes of livestock. It was derived from trials in different countries in Africa, Asia and Latin America, and covers all possible traits of interest and occurrence. As a result it is massive. The compiler should therefore first study the general pattern and contents of the Descriptor List. The working method is then to search each Source for data on genetic characterization, not to go through the Descriptor List item by item looking for corresponding data in the Source. Past experience shows that each Source is likely to provide data for only 5 to 40 percent of the options listed in the Descriptor List.
The Descriptor List should serve as a dictionary of genetic characteristics and should be used as a format for the layout of the Source Data Sheet, which the compiler prepares before the data are entered into the system (see item 10 of these guidelines).
The Descriptor List is divided into two components.
Master Record. This record refers to physical characteristics of the breed within the species. Descriptive features have been categorised and may require the compiler to make decisions, for instance on hump size (large, medium or small) or the proportion of a colour. Each species will have one Master Record for each of its breeds or strains. The record for a strain need not be derived from a single Source; it may draw on a number of Sources and may also include additional information supplied by the compiler. This allows the compiled Master Record to consist of one complete set of information on the physical characteristics of the strain.
Slave Record. This consists of performance characteristics of a group of animals of a breed or strain within a species. It also contains provisions for entering environmental characteristics if such details are given in the Source. Every Source will result in one Slave Record. But if the Source has performance characteristics of more than one breed, then this Source will provide one Slave Record for each breed; in this case environmental details are repeated for each of these Slave Records, unless of course the breeds were raised differently. In exceptional circumstances, an author may have published two or more papers, each covering different traits but all derived from the same group of animals maintained over the same time period. The information from these Sources could be pooled into a single Slave Record. If these papers compared several breeds, then the resulting number of Slave Records will correspond to the total number of breeds in all these papers.
After a complete exercise, the end result is one Master Record for each breed or crossbred and a larger number of Slave Records for each breed or crossbred. Each Slave Record derives from one Source (or from several only in exceptional circumstances, when several Sources report on the same animals). Conversely, each Source contributes one Slave Record for each breed or crossbred type it reports.
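The counting rule described above can be illustrated with a minimal sketch. It assumes each Source is represented as a small dictionary; the keys `id`, `breeds` and `group`, and the function name `count_slave_records`, are hypothetical and are not part of the Descriptor Lists. Sources that share a `group` label stand for the exceptional case of several papers reporting on the same animals over the same period.

```python
from collections import defaultdict

def count_slave_records(sources):
    """Count the Slave Records implied by a list of Sources.

    Each source is a dict with hypothetical keys:
      'id'     - any label identifying the Source
      'breeds' - breeds or crossbred types reported in the Source
      'group'  - optional label shared by Sources that report on the
                 same animals over the same period (pooled case)
    """
    breeds_per_unit = defaultdict(set)
    for src in sources:
        # A Source without a shared group stands alone, keyed by its own id.
        if src.get("group"):
            unit = ("group", src["group"])
        else:
            unit = ("source", src["id"])
        breeds_per_unit[unit].update(src["breeds"])
    # One Slave Record per breed within each Source (or pooled group).
    return sum(len(breeds) for breeds in breeds_per_unit.values())

# Illustrative input only: one two-breed Source plus two pooled papers.
sources = [
    {"id": "A", "breeds": ["Friesian", "Brown Swiss"]},
    {"id": "B", "breeds": ["Friesian"], "group": "herd-1"},
    {"id": "C", "breeds": ["Friesian"], "group": "herd-1"},
]
print(count_slave_records(sources))  # 3
```

Source A yields two Slave Records (one per breed), while Sources B and C are pooled into a single record because they describe the same animals.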
The Master Record is made up of breed descriptive data and is qualitative in nature. Attempts have been made in the Descriptor Lists to categorise descriptors such as body colours, horn shape and size, temperament and belly shape into fixed format alternatives (e.g. straight vs. curved; short, medium or long; and colour percent). Compilers need to be consistent in their subjective evaluations. For other traits, for example resistance to diseases and parasites, format free fields for word description are allowed. Such descriptions should be precise and short.
Usually very few publications are available which describe the physical features of a breed. Therefore the Master Record, in spite of the lack of published data, should be completed as far as possible with added information based on personal experience. Visual examination of the animals will usually be necessary to reduce unfilled gaps in the record.
As some of the data in the Master Record are subjective measures, it is recommended that all Master Records for a group of breeds or crosses be completed within an uninterrupted period of time so as to ensure uniformity.
Experience shows that about three man-days are normally necessary to complete one Master Record for a breed if the breed is available in the station where the geneticist who is compiling the data is working.
All Sources published after 1960 should be used to develop the Data Bank. Exceptionally, Sources published before 1960 may be considered valuable, but it is normally recommended not to search for them. Each Source should first be reviewed. If it is found to be suitable, information can then be extracted for Data Bank use.
Review of Source: Each source needs to be studied carefully and the following points noted.
Johnson, S.A., T. Killer and A. Victor. 1981. The relative performance of Friesian and Brown Swiss cattle in Nigeria. J. Anim. Sci. 51: 2222-2275.
Nanda, K. and S. Singam. 1972. Growth rate and milk yield of Selembu cattle in Malaysia. Proc. Malaysian Society of Animal Production, 8th Ann. Conf., p. 197-200.
Black, T. and M. White. 1965. Performance of Black and White cattle in South Africa. Ann. Rpt. No. 32. 1970, Agric. Res. Inst., London.
Mahendra, M. and V. Buva. 1982. Factors affecting performance of Friesian crossbred cattle in Sri Lanka. Ministry of Agriculture, Sri Lanka, No. 3, 56 pp.
Hoest, R. and M.E. Berg. 1985. Unpublished data. Livestock Department, Ministry of Agriculture, Kuala Lumpur, Malaysia.
Extraction of data: As much relevant information as possible must be extracted from the Sources. The Slave Record descriptor list needs to be referred to constantly, especially during the early stages. Generally, the extraction of data from the Sources may not be straightforward. Often a considerable amount of data editing is necessary, and the following is a brief summary of the types of data:
In the case of 'idle' data, the compiler is expected to conduct some minimum statistical analysis as required by the Slave Record. Environmental data with relevant and reliable details should also be provided.
All statistics should be given in the metric system. Conversions from inches, lb and Fahrenheit to cm, kg and Celsius respectively are given in Appendix 1.
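The three conversions named above can be sketched as follows. The factors used are the standard definitions (1 inch = 2.54 cm, 1 lb = 0.45359237 kg, and Celsius = (Fahrenheit - 32) x 5/9); they are not copied from Appendix 1, and the function names are hypothetical.

```python
def inches_to_cm(inches):
    """Convert a length in inches to centimetres (1 in = 2.54 cm)."""
    return inches * 2.54

def lb_to_kg(pounds):
    """Convert a weight in pounds to kilograms (1 lb = 0.45359237 kg)."""
    return pounds * 0.45359237

def fahrenheit_to_celsius(deg_f):
    """Convert a temperature in degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0

# Illustrative values only, not taken from any Source.
print(round(inches_to_cm(50), 1))            # height at withers of 50 inches
print(round(lb_to_kg(900), 1))               # body weight of 900 lb
print(round(fahrenheit_to_celsius(86), 1))   # ambient temperature of 86 F
```

Rounding is applied only when printing, so that the converted figures can be recorded to whatever precision the Source itself justifies.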
During the process of data extraction, some common problems may be encountered, as follows:
The compiling geneticist is encouraged to be specific and accurate while transcribing data from Sources for the Data Bank. For example, if yields of a dairy herd were given and during the period of data recording the cows were herded for some days and strip grazed on other days, both of these should be indicated in Section 8.1.1 of Slave Record of Cattle Descriptors. In addition, if details are given, the compiler should include the proportion of time for each, e.g.
strip grazed (80%)
The Master and Slave Records should be prepared separately. Any one Source will usually cover less than 40 percent of the characteristics listed in the Descriptor Lists. Completing a full set of Descriptor Lists for each Source would therefore produce bulky copies in which many items are left blank or only partially filled. Further, direct entry of data from the Source into the computer system is not possible, because of the size of the Descriptor List, the necessity of reviewing the Sources before extracting the relevant data, the need to process some of the data, and the time that must be allowed for collecting data on climate. It is therefore suggested that the extracted data be written on a sheet of paper, the Source Data Sheet. Relevant climatic details are added to the sheet as they come in. To maintain the link between each item of data and its heading, the corresponding descriptor number that appears on the left of the Descriptor List (e.g. 18.104.22.168.2) is written alongside the data on the Source Data Sheet as a tag number. The resulting Source Data Sheets derived from the various Sources are then ready for entry into the system. An example of a Source Data Sheet for a cattle Slave Record is given below.
|Tag number||Source Data Sheet for a Source|
|4||800112 - 830531|
|6||Mahatir, M. and S. Velu. 1970. Performance of Kedah-Kelantan cattle in Malaysia. J. Animal Sc. 32: 1-20.|
|18.3.2||4 kg per day per head for two weeks before calving, 3 kg per day per head from calving to end of 100 days and 1 kg per day per head until end of lactation.|
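For compilers who later transfer such a sheet to a computer, the Source Data Sheet above can be thought of as a simple mapping from tag number to extracted value. The sketch below is an illustration under that assumption only; the names `source_data_sheet`, `tag_sort_key` and `format_sheet` are hypothetical and do not describe the actual Data Bank entry system.

```python
# Tag numbers are kept as strings so that dotted descriptor numbers
# such as "18.3.2" survive unchanged.
source_data_sheet = {
    "18.3.2": "4 kg per day per head for two weeks before calving, "
              "3 kg per day per head from calving to end of 100 days and "
              "1 kg per day per head until end of lactation.",
    "4": "800112 - 830531",
    "6": "Mahatir, M. and S. Velu. 1970. Performance of Kedah-Kelantan "
         "cattle in Malaysia. J. Animal Sc. 32: 1-20.",
}

def tag_sort_key(tag):
    """Order tags numerically part by part, so '6' sorts before '18.3.2'."""
    return tuple(int(part) for part in tag.split("."))

def format_sheet(sheet):
    """Return the sheet as lines of 'tag<TAB>value' in descriptor order."""
    ordered = sorted(sheet, key=tag_sort_key)
    return "\n".join(f"{tag}\t{sheet[tag]}" for tag in ordered)

print(format_sheet(source_data_sheet))
```

Sorting on the numeric parts rather than on the raw text keeps the entries in the same order as the Descriptor List, which a plain alphabetical sort would not do.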
As a guide to compilers, a brief time framework is given in Appendix 2 for the various steps in the data search, extraction and presentation. This is based upon the experiences in the two-year trials held in different countries in Africa, Asia, and Latin America from 1983-85.
Various source materials published after 1960 will be scanned and breed or strain characteristics extracted and presented in a format (free as well as fixed) that could be easily entered into a computer system. The presentation will be separate for physical characteristics (in Master Records) and performance and environmental characteristics (in Slave Records). A summarised flow chart is given below for the data extraction and presentation. For each breed/strain represented in the country, there will be one Master Record and several Slave Records. The latter will depend on the number of publications available.
A. To summarise data
1. Calculate overall total, T
2. Calculate total number of observations, N, and overall mean, M
3. Calculate annual variance, s²
4. Calculate annual totals, t
5. Calculate overall sum of squares, S
6. Calculate overall variance, V and standard deviation, SD
Thus the overall number of observations, mean and standard deviation are 101, 20.6 and 4.2 respectively, and the range is 17.0 to 24.3.
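The steps listed under A can be sketched in code as follows. The worked data behind the figures quoted above are not reproduced in this text, so the formulas here are an assumption: the usual way of pooling annual summaries, where each year supplies a number of observations n, a mean and a standard deviation. The function name `pool_annual_statistics` and the input values are hypothetical.

```python
import math

def pool_annual_statistics(annual):
    """Pool annual (n, mean, sd) summaries into overall N, M and SD.

    Assumed formulas (standard pooling, not taken from the Source):
      t = n * mean                               annual total
      T = sum of t                               overall total
      M = T / N                                  overall mean
      S = sum((n - 1) * s2 + t**2 / n) - T**2 / N   overall sum of squares
      V = S / (N - 1)                            overall variance
    """
    N = sum(n for n, _, _ in annual)
    T = sum(n * mean for n, mean, _ in annual)
    M = T / N
    raw_sum_of_squares = 0.0
    for n, mean, sd in annual:
        s2 = sd ** 2          # annual variance
        t = n * mean          # annual total
        # Recover the year's raw sum of squared observations.
        raw_sum_of_squares += (n - 1) * s2 + t ** 2 / n
    S = raw_sum_of_squares - T ** 2 / N
    V = S / (N - 1)
    return N, M, math.sqrt(V)

# Hypothetical two-year example: year 1 holds the values 1, 2, 3 and
# year 2 holds the values 4, 6, given here only as summaries.
annual = [(3, 2.0, 1.0), (2, 5.0, math.sqrt(2.0))]
N, M, SD = pool_annual_statistics(annual)
print(N, round(M, 2), round(SD, 2))
```

Pooling in this way gives the same mean and variance as would be obtained from the five individual observations, which is why the annual variances and totals are both needed.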
B. Metric Conversion
Guideline of time required
|Step||Activity||Approximate %||Approximate man-days/source|
|i)||Search for source||40||variable|
|ii)||Collect data for Master Record||15||3|
|iii)||Review each Source for Slave Record||15||1/2|
|iv)||Data extraction||25||1/2 to 7|
|v)||Presentation for data entry||5||1|