Loganaden Naiken
FAO (retired)
Rome, Italy
Executive summary
This paper gives a broad account of the methodology and data used by FAO for estimating the prevalence of undernourishment. Following a short introduction, the basic methodological framework is reviewed, which consists of a frequency distribution of food consumption (expressed in terms of dietary energy) and a cutoff point for intake inadequacy defined on the basis of minimum energy requirement norms. Subsequently, the data and procedures used for estimating the frequency distribution of food consumption and the cutoff point are described. The meaning and significance of the FAO measure of food deprivation in the light of the limitations placed by the data and procedures used are then explored. A section is devoted to a brief description of similar measures produced by other organizations or authors and their relationship with the FAO measure. The strengths and weaknesses of the FAO estimates, the feasibility of their improvement in the future, and issues relating to the feasibility of disaggregating the estimates by sex-age or subnational groups are also discussed. The paper includes four technical appendices. Appendices A, B and C deal with certain issues raised in the paper, in particular the role of the bivariate probability distribution and the expectation of a correlation between energy intake and requirement in justifying the methodology for estimating the prevalence of undernourishment. Appendix D illustrates the application of the FAO methodology in a hypothetical country.
Introduction
The FAO measure of food deprivation, which is referred to as the prevalence of undernourishment, is based on a comparison of usual food consumption expressed in terms of dietary energy (kcal) with certain energy requirement norms. The part of the population with food consumption below the energy requirement norm is considered undernourished ("underfed").
The focus on dietary energy in assessing food insufficiency or deprivation is justified from two perspectives. First, a minimum amount of dietary energy intake is essential for body weight maintenance and work performance. Second, increased dietary energy, if derived from normal staple foods, brings with it more protein and other nutrients as well, while raising intakes of the latter nutrients without ensuring a minimum level of dietary energy is unlikely to be of much benefit in terms of improving nutritional status. Nevertheless, by focusing on dietary energy intake, the measure is attempting to capture those whose food consumption level is insufficient for body weight maintenance and work performance. It follows from this that the FAO measure is focusing on the phenomenon of hunger rather than undernutrition (or malnutrition), which has a broader nutritional connotation.
FAO has traditionally prepared estimates of the prevalence of undernourishment in connection with its World Food Survey reports, the last being The Sixth World Food Survey (FAO, 1996). The principal aim of the estimates in this context has been to provide information on the broad dimension of the hunger problem in the developing world. In fact, although the estimates have been worked out on a country-by-country basis, only the global and regional aggregates have been published. Furthermore, the focus has been on the long-term trends, as the World Food Surveys were issued at intervals of roughly ten years. However, following the World Food Summit in 1996, the situation has changed. At this Summit, the signatories of the Rome Declaration on World Food Security made the following pledge:
We pledge our political will and our common and national commitment to achieving food security for all and to an ongoing effort to eradicate hunger in all countries, with an immediate view to reducing the number of undernourished people to half their present level no later than 2015.
Hence, for the purpose of monitoring progress towards the target of halving the number of undernourished, the need had arisen to prepare and regularly update such estimates at the global as well as country level. FAO has been undertaking this task in its annual report on "The State of Food Insecurity in the World" (SOFI), which was first issued in 1999. SOFI 2002, which is the latest report, was issued in October 2002.
In the following sections, the basic methodological framework, data and procedures used by FAO for deriving the country estimates are described and discussed. The relationship with similar measures used by other organizations is also discussed. The strengths and weaknesses of the estimates, as well as the improvement that could be envisaged in the future, are highlighted. Lastly, the feasibility of disaggregating the estimates by demographic or subnational breakdowns is discussed. Appendices A, B and C discuss in more detail certain issues referred to in the paper, and Appendix D illustrates the application of the methodology in a hypothetical country.
Basic methodological framework
In developing the methodology for estimating the prevalence of undernourishment, a basic problem concerns the use of the available energy requirement norms. These norms are usually specified as the average for groups of individuals of the same age, sex, body weight and activity. This means that even after taking into account the most influential factors such as age, sex, body weight and activity, differences exist in the energy requirement of individuals. This variation has been attributed mainly to differences in the efficiency of energy utilization between individuals. As it is not feasible to determine the efficiency of energy utilization of each individual, the departure of the specific energy requirement of an individual from the average is not known. In view of this, the estimate of the proportion of individuals having inadequate energy intake has been defined within a probability distribution framework. In this context, Sukhatme, in a pioneering study on the application of distribution analysis in estimating the extent of hunger in the World (Sukhatme, 1961), had indicated that if information in the form of a bivariate frequency distribution of intake x and requirement r, referring to individuals of the same age, sex, body weight and activity, was available, the proportion of individuals with intake below requirement could be formulated as follows:
$$P(U) = P(x < r) = \iint_{x < r} f(x, r)\, dx\, dr \qquad (1)$$
However, as indicated above, even if the actual intake of each individual can be measured, the actual individual requirement is unknown. Therefore, information on the joint distribution f(x,r) cannot be obtained empirically. Nevertheless, as some information pertaining to the distribution of requirement was available, Sukhatme (1961) suggested the following formula for estimating the proportion of undernourished in the group of individuals with the same age, sex, body weight and activity:
$$P(U) = P(x < r_{L}) = \int_{x < r_{L}} f(x)\, dx \qquad (2)$$
where f(x) is the marginal frequency distribution of dietary energy intake, and r_{L} is a cutoff point reflecting the lower limit of the marginal distribution of energy requirement (and hence also referred to as the minimum energy requirement). Expression (2) has been referred to as the cutoff point formula to distinguish it from the bivariate formula given by expression (1). Initially, assuming that the distribution of requirement is normal, Sukhatme (1961) had taken the cutoff point (minimum energy requirement) as corresponding to the lower limit of the 99 percent confidence interval, i.e.:
$$r_{L} = m_{r} - 3 s_{r}$$
where m_{r} represents the mean and s_{r} the standard deviation of the requirement distribution. In subsequent studies, however, he indicated that
$$r_{L} = m_{r} - 2 s_{r}$$
may be more appropriate (Sukhatme, 1982).
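As a minimal sketch of how the cutoff point formula operates, the proportion falling below a cutoff can be computed directly from a sample of intakes. The function names, the multiplier `z` (left as a parameter) and all figures are hypothetical illustrations, not FAO data or FAO code.

```python
def cutoff_point(mean_req, sd_req, z):
    """Lower limit of the requirement distribution: mean minus z standard deviations."""
    return mean_req - z * sd_req


def proportion_below(intakes, cutoff):
    """Empirical counterpart of the cutoff point formula: share of intakes below the cutoff."""
    return sum(1 for x in intakes if x < cutoff) / len(intakes)


# Hypothetical intakes (kcal/person/day) for a group of similar individuals.
intakes = [1650, 1800, 1950, 2100, 2250, 2400, 2550, 2700, 2900, 3200]

# Hypothetical requirement mean and standard deviation; z is an assumption.
r_low = cutoff_point(mean_req=2300, sd_req=200, z=2)

print(r_low)                             # 1900
print(proportion_below(intakes, r_low))  # 0.2
```

Only the marginal distribution of intake and a single cutoff value are needed here, which is what makes the formula usable when individual requirements are unknown.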
FAO has traditionally applied the cutoff point formula. In the past, this was considered to be an approximation of the bivariate formula necessitated by the absence of adequate information pertaining to the bivariate distribution. A key problem in attempting to model and apply the latter distribution has been how to handle the effect of correlation that is expected to exist between energy intake and requirement. However, a study undertaken in FAO during the preparation of the Sixth World Food Survey showed that if intake is correlated with requirement, the bivariate formula reduces to the cutoff point formula (Naiken, 1998). In other words, expression (1) is a general formula, which, under the condition of correlation, reduces to expression (2). In Appendix A, this issue and the derivation of the cutoff point formula are discussed in detail.
Another point to be considered is that the Sukhatme cutoff point formula given by expression (2) refers to a group consisting of individuals of given age, sex, body weight and activity so that the variation in requirement (i.e. s_{r}) taken into account in determining the cutoff point is due to differences in the efficiency of energy utilization. However, a population in general consists of individuals who differ with respect to age, sex, body weight and activity. Thus, a correct application of the approach would require the stratification of the population into groups of similar individuals with respect to such characteristics and having information on the usual food consumption of the individuals in the different groups taken separately. This approach is, however, not feasible for the simple reason that the required data are not available. In the best of circumstances, what may be available refers to information on the food consumption of a representative sample of households in the population. The information on the food consumption and the size of the households sampled enables the derivation of the frequency distribution of the household per capita energy consumption. This means that what is in practice feasible is the specification of the distribution of energy consumption referring to households adjusted for differences in size.
It follows from the above that in applying the cutoff point formula, f(x) is taken to reflect the frequency distribution of household per capita dietary energy consumption, and consequently, the cutoff point, r_{L}, refers to the household per capita minimum energy requirement. By implication, the variation in energy requirement to be considered in arriving at the cutoff point reflects the composite effect of the differences in the composition of the households with respect to the age, sex, body weight, activity and efficiency of energy utilization of their members. The next two sections discuss the estimation of f(x) and r_{L}.
Estimation of the distribution of energy consumption
Source of data: household surveys
Surveys that collect data on the quantities of food products consumed by individuals in a representative sample of households in the population are the only source of basic data pertaining to f(x). Such surveys provide data on household size and food consumption of the households surveyed, thus leading to household level data on per capita dietary energy consumption, which can be used to estimate f(x).
A well-known type of survey in this context is the specialized food consumption or dietary survey. The information collected in these surveys normally refers to the quantities of food items actually consumed by the household members, which are converted into nutritive values by applying the appropriate nutritive factors. However, since the main objective of these surveys is to obtain a close approximation of the total amount of food eaten (food intake) by members of the household, the data collection procedures are usually rather complicated (e.g. weighing the food items used for the preparation of each meal). They are therefore costly to implement on a national scale. As a result, only a few countries have attempted to carry out such surveys. Even in these cases, the sample size is often so small that their usefulness for distributional analysis is questionable. Furthermore, the food eaten away from home (street foods, food consumed in restaurants and other establishments, etc.) is often not taken into account.
The more commonly and regularly undertaken type of survey is that usually called the Household Income/Expenditure Survey (HIES). The HIES, which also includes food consumption data as an integral part of its broader enquiry on household consumption expenditures, has been conducted in many countries. Frequently, information not only on the monetary expenditures but also on the quantities relating to food items purchased or acquired for consumption by households has been collected.^{[1]} In addition, the food items recorded are often sufficiently detailed to enable the conversion of food quantities into nutritive values and the estimation of household per capita energy consumption. The surveys also provide data on household income/expenditure as well as a number of other socioeconomic characteristics, so they permit an analysis of the interrelationship between food consumption and certain socioeconomic variables. Furthermore, to the extent that these surveys may have been designed to yield subnational estimates, they may permit mapping the subnational variations in food consumption. These surveys are in fact the only existing source of data for distribution analysis of both income and food consumption and hence for the estimation of the prevalence of both poverty and food deprivation.
Within FAO, the need for a better and enhanced use of the food consumption data from the existing HIESs for the purpose at hand has been recognized for some time (Naiken and Becker, 1990). The problem is that the data that are normally processed and tabulated by the respective national statistical organizations refer to the monetary values of the food consumed, i.e. food expenditure. In this form, the information is not directly usable as input for estimating the frequency distribution of dietary energy consumption. For this purpose, there is a need, as indicated above, to convert the quantities of the various items of food expenditure corresponding to each household into their dietary energy (calorie) equivalents and then derive the appropriate tabulations referring to dietary energy consumption distribution. However, the process of converting the food consumption data from the HIES type surveys into dietary energy equivalents and tabulating the results has been undertaken in a few countries only.
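The conversion step described above can be sketched as follows. The nutritive factors, food items and household figures are hypothetical placeholders, and the function name is illustrative; an actual application would use the country's food composition tables.

```python
# Hypothetical nutritive factors (kcal per kilogram of food as acquired).
KCAL_PER_KG = {"rice": 3600, "beans": 3400, "oil": 8800}


def household_per_capita_kcal(quantities_kg, household_size, reference_days):
    """Daily per capita dietary energy from quantities recorded over the reference period."""
    total_kcal = sum(KCAL_PER_KG[item] * kg for item, kg in quantities_kg.items())
    return total_kcal / (household_size * reference_days)


# Hypothetical household: five members, one-week reference period.
household = {"rice": 14.0, "beans": 3.5, "oil": 1.0}
value = household_per_capita_kcal(household, household_size=5, reference_days=7)
print(round(value))  # daily kcal per person for this household
```

Repeating this calculation for every sampled household yields the household-level values from which the distribution of per capita dietary energy consumption can be tabulated.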
Problems in the use of the distribution data from household surveys
Even when survey data pertaining to food consumption are available, considerable problems are encountered in using such data for estimating the distribution of dietary energy consumption. These problems are generally of two kinds: one concerning the precision of the household level data and the other referring to the reliability of the estimated frequency distribution owing to sample design.
PRECISION OF THE HOUSEHOLD LEVEL DATA
The methodology and concepts applied in the surveys are usually not sufficiently precise to provide an accurate estimate of the usual consumption at the household level.
In some cases, certain contributions to the consumption of the household members are excluded; in others, certain contributions of consumption that should be excluded are included. Generally, the HIES has a questionnaire format that refers to food purchased or acquired during the reference period, making no distinction between the consumers so that food given to guests, visitors or tenants as well as residual household wastages are included. Food transfers to and from household stocks may not be adequately taken into account. Food consumed away from home by the household members may be recorded, but these usually correspond to prepared food and are expressed in monetary terms, which present difficulties for conversion into nutritive values. Furthermore, irrespective of the questionnaire format, the precision of the information collected depends on the ability of the respondents to recall the quantities of the different food items that have been consumed, thus resulting in measurement errors. For this reason, the reference period is chosen to be sufficiently short (one day, one week or two weeks) to facilitate recall, but this increases the risk of not reflecting the usual consumption of the household owing to the effect of seasonal variations, etc.
As a result of the above problems, the household per capita dietary energy consumption figures derived from the food consumption data collected at the household level are imprecise and, in many cases, found to be unrealistically high or low.
SAMPLE DESIGN
Many of the problems mentioned above stem from the fact that the sample surveys are designed to provide reliable estimates of the population means or totals rather than the individual household per capita values and hence the frequency distribution. This is evident if we take into account the points below.
Data collection is usually undertaken in rounds by spreading the sample over the survey period of one year. This ensures that the sample means or the means of households corresponding to certain socioeconomic groups and/or subnational categories are free from the effect of seasonal variations and any random errors in the individual household measurement.
For the purpose of administrative convenience and the aim of minimizing the variance in order to increase the precision of the population mean or total, the sample design is not usually implemented according to the Equal Probability Selection Method (EPSEM). In other words, the different households in the population do not have the same probability for being selected in the sample. As a consequence, even if the individual household data reflect usual consumption, the resulting sample frequency distribution is not an unbiased estimate of the distribution in the population.
It follows from the above that to the extent that the surveys have not been designed to yield reliable estimates of the usual consumption at the household level, and the selection of the households in the sample has not been implemented with equal probability, the resulting sample distribution is subject to significant errors. This problem applies not only to the food consumption data but also to the total income/expenditure data that are used to estimate the prevalence of poverty.
In view of the problems relating to the distribution data from the existing sample surveys, FAO has continued to rely on a theoretical model to represent the distribution of household per capita dietary energy consumption. Furthermore, as will be seen later, special care is taken to avoid the effects of the irrelevant factors influencing survey data in estimating the measure of inequality in distribution.
The choice of theoretical distribution
As the frequency distribution depicted by the tabulated survey data is generally unimodal, only such kinds of theoretical distributions have been considered for application. In connection with the estimates of undernourishment prepared for the Fourth World Food Survey, the Beta distribution was applied (FAO, 1977). This distribution was chosen because it enabled fixing the lower and upper limits of the range as determined by the physiological lower and upper limits of intake in individuals. However, beginning with the Fifth World Food Survey (FAO, 1987), the Beta distribution was abandoned in favour of the two-parameter lognormal distribution.
The idea of fixing the lower and upper limits of the distribution (based on knowledge/assumptions about the physiological limits of intake) is appropriate when dealing with the true intake of individuals, but not when dealing with the kind of household level data emanating from the existing surveys. In most of these surveys, the data refer to the food available to, or acquired by, the household and thus include household wastages, food fed to pets, etc. In this context, the lognormal distribution with its short lower tail and long upper tail is considered to reflect better the fact that wastages, food fed to pets, etc. are likely to be confined to the upper tail representing the richer and more affluent households.
The approach used to specify the parameters of the distribution of dietary energy consumption
As indicated earlier, household survey data that have been processed to yield information on the distribution of dietary energy consumption still cover only a limited number of countries. Even in these cases, the time reference of the surveys differs from country to country. However, information on the national per capita dietary energy supply (DES) is available for nearly all countries. The per capita DES, which is regularly prepared and updated by FAO, is widely used to reflect the levels and trends of the average food consumption in the different countries of the world. In view of this, FAO has used the per capita DES to represent the mean, \bar{x}, of the distribution for each country in preparing estimates of the prevalence of undernourishment at the national, regional and global levels.
Thus, as the distribution is assumed to be lognormal, the only other information required in order to specify f(x) for each country is the coefficient of variation CV(x). This parameter reflects the inequality in the distribution and (under the assumption of log normality) can be easily expressed in terms of the well-known Gini coefficient. It is estimated as far as possible on the basis of survey data. Given \bar{x} and CV(x), the two parameters of the lognormal distribution, m and s^{2}, can be determined as follows:
$$m = \ln(\bar{x}) - \frac{s^{2}}{2}$$
and
$$s^{2} = \ln\left(CV^{2}(x) + 1\right)$$
The estimation of the mean, \bar{x}, and the inequality in distribution parameter, CV(x), is described below.
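The following sketch shows how a mean and a CV determine a lognormal distribution and, with a cutoff point, a prevalence estimate. It relies on the standard lognormal relations s^2 = ln(CV^2 + 1) and m = ln(mean) - s^2/2; the function names and the numerical inputs (mean 2400 kcal, CV 0.29, cutoff 1800 kcal) are hypothetical and not taken from any FAO estimate.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def lognormal_parameters(mean_x, cv_x):
    """Return (m, s2), the mean and variance of ln(x), from the mean and CV of x."""
    s2 = log(cv_x ** 2 + 1.0)
    m = log(mean_x) - s2 / 2.0
    return m, s2


def prevalence_below(cutoff, mean_x, cv_x):
    """Proportion of a lognormal distribution lying below the cutoff point."""
    m, s2 = lognormal_parameters(mean_x, cv_x)
    return NormalDist().cdf((log(cutoff) - m) / sqrt(s2))


# Hypothetical country: per capita DES of 2400 kcal, CV(x) of 0.29, cutoff of 1800 kcal.
m, s2 = lognormal_parameters(2400.0, 0.29)
print(exp(m))                                   # median of the distribution, below the mean
print(prevalence_below(1800.0, 2400.0, 0.29))   # estimated proportion below the cutoff
```

Because the lognormal distribution is skewed to the right, its median exp(m) lies below the mean, so for a given mean a larger CV pushes more of the population below a fixed cutoff.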
ESTIMATION OF THE MEAN
The mean represented by the per capita DES does not refer directly to energy intake but refers to the energy available for human consumption during the course of the reference period, expressed in kilocalories (kcal) per person. It is assumed that the latter is a close approximation of energy consumption, at least for developing countries. The available data are derived from the food balance sheets compiled every year by FAO on the basis of data on the production and trade of food commodities. Using these data and the available information on seed rates, waste coefficients, stock changes and types of utilization (feed, food, other uses), a supply/utilization account is prepared for each commodity in weight terms. The food component, usually derived as a balancing item, refers to the total amount of the commodity available for human consumption during the year. The total DES is obtained by aggregating the food component of all commodities after conversion into energy values.
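The supply/utilization accounting described above can be sketched as follows. All commodity figures, the energy conversion factor and the population are hypothetical; the function and parameter names are illustrative, and the treatment of stock changes as a withdrawal added to supply is an assumption about sign convention.

```python
def food_component(production, imports, exports, stock_withdrawal,
                   feed, seed, waste, other_uses):
    """Food available for human consumption, derived as the balancing item (tonnes)."""
    supply = production + imports - exports + stock_withdrawal
    return supply - (feed + seed + waste + other_uses)


def per_capita_des(food_tonnes, kcal_per_kg, population, days=365):
    """Per capita dietary energy supply in kcal/person/day across all commodities."""
    total_kcal = sum(tonnes * 1000 * kcal_per_kg[c] for c, tonnes in food_tonnes.items())
    return total_kcal / (population * days)


# Hypothetical single-commodity account (tonnes per year).
wheat_food = food_component(production=900_000, imports=200_000, exports=50_000,
                            stock_withdrawal=0, feed=100_000, seed=40_000,
                            waste=30_000, other_uses=10_000)
des = per_capita_des({"wheat": wheat_food}, {"wheat": 3400}, population=10_000_000)
print(wheat_food)  # 870000
print(des)         # kcal/person/day contributed by this commodity
```

In a real food balance sheet the same account is compiled for every commodity, and the energy values are summed before dividing by population and the number of days in the year.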
ESTIMATION OF THE CV
As indicated earlier, the household level data from surveys on dietary energy consumption are not sufficiently precise to yield reliable estimates of the usual consumption. However, given the HIES sample design, the means for groups of households classified by the variable determining household food consumption, i.e. household per capita income or total expenditure, provide reliable estimates of annual average consumption. On the basis of such a classification, it is possible to estimate the CV of household per capita dietary energy consumption owing to income. However, household dietary energy consumption is likely to vary also owing to factors such as the sex-age composition, body weight and activity level of the household members, i.e. the factors determining dietary energy requirements. The composite effect of these factors can be approximated by the variation of the dietary energy requirements. This variation corresponds to a CV of about 0.20 (see Appendix B). Thus, by only considering these two sources of variation, the CV of the household per capita dietary energy consumption distribution is estimated as follows:
$$CV(x) = \sqrt{CV^{2}(x|v) + CV^{2}(x|r)}$$
where CV(x) is the total CV of the household per capita dietary energy consumption, CV(x|v) is the component owing to household per capita income (v), and CV(x|r) is the component owing to energy requirement (r).
The variation in household per capita dietary energy consumption owing to differences in the energy requirements will always exist and is not expected to vary significantly between countries and over time. In view of this, CV(x|r) can be considered to be a fixed component of CV(x). It follows from this that the survey data are used to estimate only CV(x|v).
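A short sketch of how the two components combine, assuming they are independent so that the total CV is the square root of the sum of the squared components. The value 0.20 for the requirement component is the one quoted in the text; the income component of 0.15 and the function name are hypothetical.

```python
from math import sqrt

CV_REQUIREMENT = 0.20  # fixed component owing to energy requirement, as quoted in the text


def total_cv(cv_income, cv_requirement=CV_REQUIREMENT):
    """Total CV of per capita dietary energy consumption from its two components."""
    return sqrt(cv_income ** 2 + cv_requirement ** 2)


# Hypothetical income component estimated from survey data.
print(total_cv(0.15))  # approximately 0.25
```

Since the requirement component is fixed, the total CV can never fall below 0.20, and differences between countries arise only through the income component.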
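CV(x|v) is estimated using the averages of household per capita dietary energy consumption by household per capita income (or total expenditure) classes from n households as CV(x|v) = \sigma(x|v) / \mu(x), with
$$\sigma(x|v) = \sqrt{\frac{\sum_{j=1}^{k} f_{j}\,\bar{x}_{j}^{2} - \left(\sum_{j=1}^{k} f_{j}\,\bar{x}_{j}\right)^{2} / n}{n - 1}}$$
where k is the number of income classes, f_{j} is the number of sampled households, and \bar{x}_{j} is the average household per capita dietary energy consumption of the jth income class.
The denominator
$$\mu(x) = \frac{\sum_{j=1}^{k} f_{j}\,\bar{x}_{j}}{n}$$
is the estimated overall average household per capita dietary energy consumption.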
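A minimal sketch of this calculation from tabulated class averages is given below. It treats each income class average as applying to all sampled households in that class and uses an n - 1 divisor for the variance, which is an assumption; the class counts and averages are hypothetical.

```python
from math import sqrt


def cv_due_to_income(class_counts, class_means):
    """CV of per capita energy consumption owing to income, from income-class averages."""
    n = sum(class_counts)
    weighted_sum = sum(f * x for f, x in zip(class_counts, class_means))
    weighted_sq = sum(f * x * x for f, x in zip(class_counts, class_means))
    overall_mean = weighted_sum / n
    # Guard against tiny negative values caused by floating-point rounding.
    variance = max(0.0, (weighted_sq - weighted_sum ** 2 / n) / (n - 1))
    return sqrt(variance) / overall_mean


# Hypothetical survey tabulation: four income classes, 1000 sampled households.
counts = [200, 300, 300, 200]
means = [1800.0, 2100.0, 2400.0, 2900.0]  # kcal/person/day by income class
print(cv_due_to_income(counts, means))
```

Because only class averages enter the calculation, within-class variation (including measurement error and seasonal effects at household level) does not inflate the resulting CV.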
It follows from the above that household survey data are used to estimate the CV of household per capita dietary energy consumption owing to income only. Thus, the effect of differences in the sex-age composition of the households is not taken into account in estimating the CV; consequently, the distribution derived by linking this CV to the mean given by the per capita DES refers to a population composed of average individuals in so far as sex-age composition is concerned. In other words, the unit of the distribution is what is sometimes referred to as the national per capita unit. For this reason, the derived distribution will be referred to henceforth as "distribution of per capita dietary energy consumption".
In the large majority of countries, the appropriate data for directly estimating CV(x|v) are not readily available. Hence, for these countries, depending on the available data from the survey reports, indirect procedures are used by FAO to arrive at approximate values of this parameter. As far as possible, information pertaining to the distribution of closely related variables such as food expenditure and income (or its proxy total expenditure) is used. When even such information is not available, other information that is readily available and can be linked to CV(x|v) is used. These indirect procedures are briefly described below.
Procedure 1:
Countries where the available survey data refer to the average food expenditure of households grouped according to income classes
In these countries, the available data enable the calculation of the CV of food expenditure owing to income. Consequently, this CV is used to estimate CV(x|v) through the following equation:
where the explanatory variable is the CV of food expenditure owing to income/expenditure. The parameters a and b are estimated by fitting the following linearized equation to a group of countries where the survey reports provided tabulations enabling the calculation of both CV(x|v) and the CV of food expenditure owing to income:
Procedure 2:
Countries where the available survey data refer to the distribution of household income
In these countries, the data on the distribution of household income enable the estimation of the inequality in the distribution as expressed by the Gini coefficient. Thus, in order to derive an estimate of CV(x|v), the Gini coefficient of income, G(v), is first converted into the Gini coefficient of dietary energy consumption owing to income, G(x|v). Then, given the assumption of a lognormal distribution, the latter Gini coefficient is easily expressed as CV(x|v).
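The link between the Gini coefficient and the CV under log normality can be sketched with the standard lognormal identities G = 2Φ(s/√2) - 1 and CV = √(exp(s^2) - 1), where Φ is the standard normal distribution function and s the standard deviation of ln(x). The function names and the input value are hypothetical, and the code is an illustration rather than FAO's implementation.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def cv_from_gini(gini):
    """CV of a lognormal distribution with the given Gini coefficient (0 <= gini < 1)."""
    s = sqrt(2.0) * NormalDist().inv_cdf((gini + 1.0) / 2.0)
    return sqrt(exp(s * s) - 1.0)


def gini_from_cv(cv):
    """Gini coefficient of a lognormal distribution with the given CV."""
    s = sqrt(log(cv * cv + 1.0))
    return 2.0 * NormalDist().cdf(s / sqrt(2.0)) - 1.0


# Hypothetical Gini coefficient of dietary energy consumption owing to income.
cv = cv_from_gini(0.12)
print(cv)
print(gini_from_cv(cv))  # recovers the input Gini coefficient
```

The mapping is one-to-one and increasing, so a ranking of countries by Gini coefficient is preserved when the coefficients are converted to CVs.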
The conversion of the Gini coefficient of income into the Gini coefficient of dietary energy consumption owing to income is undertaken through a factor defined as follows:
$$k = \frac{G(x|v)}{G(v)}$$
where G(x|v) is the Gini coefficient of dietary energy consumption owing to income, and G(v) is the Gini coefficient of income.
The factor k is estimated for each country through the following equation
where GDP refers to per capita gross domestic product (GDP) expressed in purchasing power parity. The parameters a and b are estimated by fitting the following linearized equation to the countries where GDP data are available, and survey data enable estimation of both G(x|v) and G(v):
Procedure 3:
Countries where not even survey data referring to income distribution are available
For these countries, rough estimates of CV(x|v) have been obtained by linking it to infant mortality, through the following general equation: ^{[2]}
In the above equation, IMR is the infant mortality rate, and the dummy indicators TR, AI, AC, AF, OC and AS correspond to country regional locations. The parameters a, b_{1} to b_{7} are obtained by fitting the equation to a group of countries having data on IMR and CV(x|v). Given IMR and the parameters, the CV of dietary energy consumption owing to income is derived as
where m refers to the evaluation of the general equation.
CHANGES IN THE INEQUALITY IN DISTRIBUTION OVER TIME
Estimates of the prevalence of undernourishment in the developing countries are prepared for the benchmark periods 1968-71, 1979-81, 1990-92 and the most recent period for which estimates of per capita DES are available (which currently refer to 1997-99). This means that f(x) needs to be specified for each of those four periods for a given country, and therefore, estimates of \bar{x} and CV(x) are required for each of these periods. In this connection, the means for the different periods are taken from the most recently available series of per capita DES, but the same CV(x) is used. This implies that the inequality in distribution, as measured by CV(x), is assumed to have remained constant over the last three decades. This approach has been dictated by the fact that the available survey data referring to dietary energy consumption, food expenditure and income/expenditure that are used for estimating CV(x|v) in the different countries are almost never available for all four periods. In other words, there is no possibility of taking into account any change in the inequality that may have occurred in the individual countries. However, evidence provided by the available series of Gini coefficients of income/expenditure compiled for a number of developing countries suggests that there has been little, if any, change in the inequality of income/expenditure (see Appendix C).
The Gini coefficient is a measure of inequality in access to certain goods or services (income/expenditure in our case) among the units under consideration (households in our case). It ranges from zero to unity, with zero implying an equal distribution among the units and unity implying absolute inequality or concentration in a single unit. Thus, the more the coefficient departs from zero, the more unequal is the distribution.
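As an illustration of the two extremes, the Gini coefficient of a small set of values can be computed with the standard rank-based formula. The function name and the data are hypothetical; note that with a finite number of units n, complete concentration in one unit gives (n - 1)/n, which approaches unity as n grows.

```python
def gini(values):
    """Gini coefficient of non-negative values using the rank-based formula."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


print(gini([5, 5, 5, 5]))    # 0.0  (equal distribution)
print(gini([0, 0, 0, 10]))   # 0.75 (all concentrated in one of four units)
```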
Definition of energy requirement and estimation of the cutoff point
Definition of energy requirement
The human body requires dietary energy intake for its expenditure of energy, which in turn is composed of several components: the basal metabolic rate (BMR), i.e. the energy expended for the functioning of an individual in a state of complete rest; the energy needed for digesting food, metabolizing food and storing an increased food intake; and the energy required for performing physical activities, both work and nonwork. For children, the energy required for growth should be taken into account. Similarly, for women during pregnancy and lactation, the energy required for the deposition of tissue and secretion of milk needs to be considered.
The energy requirement norms or standards, adopted at the international level, are periodically reviewed by expert groups and consultations. The report of the FAO/WHO/UNU Expert Consultation on Energy and Protein Requirements (FAO/WHO/UNU, 1985), has defined energy requirements as follows:
The energy requirement of an individual is the level of energy intake from food that will balance energy expenditure when an individual has a body size and composition and level of physical activity, consistent with long-term good health; and that will allow for the maintenance of economically necessary and socially desirable physical activity. In children and pregnant or lactating women the energy requirement includes the energy needs associated with the deposition of tissues or the secretion of milk at rates consistent with good health.
Specification of energy requirement
Energy requirements are specified by sex and age groups. As per the recommendations of the FAO/WHO/UNU Expert Consultation, the procedure for deriving the sex-age-specific energy requirement differs between adults and adolescents on the one hand and children below age ten on the other.
For adults and adolescents, the specification of energy requirement begins with the BMR. This is derived on the basis of body weight through the use of a set of sex-age-specific regression equations linking the BMR with body weight. The energy needed for activity is expressed in terms of the BMR so that the energy requirement for a given sex-age group is finally expressed as a multiple of the BMR. The BMR multiple is referred to as the physical activity level (PAL) index.
For example, the energy requirement for adult males involved in light activity is given as 1.55 BMR, whereas for females, it is 1.56 BMR. The component involved in digesting and metabolizing food is difficult to measure in isolation of activity since the very act of eating involves activity. In view of this, it is allowed for in the PAL value.
For children below age ten, the above component approach is not applied, and the sexagespecific energy requirements are expressed as fixed amounts of energy per kilogram of body weight. In addition, for children below age two from developing countries, an allowance is made for the energy needed to recover from frequent attacks of infection.
It follows from the above that the key parameters to be specified for deriving the energy requirements are body weight and the PAL index for adults and adolescents, and only body weight for children below age ten.
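The component approach for adults and adolescents can be sketched as follows. The regression coefficients and the body weight passed to the functions are placeholders chosen only for illustration, not the published sex-age-specific equations; the PAL of 1.55 is the light-activity value for males quoted above.

```python
def bmr_from_weight(weight_kg, slope, intercept):
    """BMR (kcal/day) from a linear regression on body weight."""
    return slope * weight_kg + intercept


def energy_requirement(weight_kg, pal, slope, intercept):
    """Energy requirement (kcal/day) expressed as a PAL multiple of the BMR."""
    return pal * bmr_from_weight(weight_kg, slope, intercept)


# Hypothetical coefficients and body weight, for illustration only.
requirement = energy_requirement(weight_kg=60.0, pal=1.55, slope=15.0, intercept=700.0)
print(round(requirement))
```

Only body weight and the PAL index vary between sex-age groups in this sketch, which mirrors the statement that these are the key parameters for adults and adolescents.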
Definition of the minimum energy requirement by sex-age groups
It may be recalled that the FAO/WHO/UNU Expert Consultation has defined requirement as the level of intake "that will balance energy expenditure when the individual has a body size and composition and level of physical activity consistent with good health and that will allow for the maintenance of economically necessary and socially desirable activity". This definition implies that energy requirement should be derived on the basis of normatively specified body weight and PAL rather than the actual body weight and activity level of the individual.
However, the Expert Consultation has recognized that for a given height, there is a range of body weights that are consistent with good health. Similarly, there is a range of PALs that are consistent with the performance of economically necessary and socially desirable activity and may therefore be considered acceptable. In view of this, the range of variation in requirement for adults and adolescents has been defined in terms of the range of energy expenditure resulting from the application of the different combinations of acceptable weight-for-height and PAL. Accordingly, the lower limit of the range of variation of requirement is reflected by the energy expenditure corresponding to the lowest acceptable weight-for-height and the lowest acceptable activity allowance, and the upper limit by the energy expenditure corresponding to the highest acceptable weight-for-height and the highest acceptable activity allowance.
Thus, as the cutoff point should be based on the lower limit of the range of variation, the sex-age-specific minimum energy requirements for adults and adolescents have been specified on the basis of the lowest acceptable body weight and the lowest acceptable PAL index. The lowest acceptable body weight for a given height has been estimated on the basis of the fifth percentile of the body mass index^{[3]} (BMI) (WHO, 1995), and the PAL index corresponding to light activity (1.55 for males and 1.56 for females) has been taken to reflect the lowest acceptable activity level.
It follows from the above that the sex-age-specific minimum energy requirements have been derived not by the m - 2s formula but by directly considering the energy expenditure corresponding to the lowest acceptable weight-for-height and the lowest acceptable activity level.
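A minimal sketch of this direct derivation is given below. It relies on the definition of the BMI as weight (kg) divided by the square of height (m), so that a lower-limit BMI implies a lowest acceptable weight for a given height. The BMI value, height and regression coefficients are hypothetical placeholders; the fifth-percentile figure actually used is not reproduced here.

```python
def minimum_weight_kg(height_m, bmi_lower):
    """Lowest acceptable body weight implied by a lower-limit BMI (kg/m^2)."""
    return bmi_lower * height_m ** 2


def minimum_requirement(height_m, bmi_lower, pal_light, slope, intercept):
    """Minimum requirement (kcal/day): light-activity PAL times the BMR
    predicted at the lowest acceptable body weight."""
    weight = minimum_weight_kg(height_m, bmi_lower)
    return pal_light * (slope * weight + intercept)


# Hypothetical inputs, for illustration only.
weight = minimum_weight_kg(height_m=1.70, bmi_lower=18.5)
print(round(weight, 1))
print(round(minimum_requirement(1.70, 18.5, 1.55, 15.0, 700.0)))
```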
As regards children below ten years of age, the body weight figures required are fixed at the median of the range of weight-for-height given by the WHO reference data (WHO, 1983) rather than a lower limit as in the case of adolescents and adults. This is due to the lack of recommendations by the FAO/WHO/UNU Expert Consultation concerning the range within which weight-for-height for a given sex-age group may be regarded as satisfactory. The energy requirement was estimated on the basis of the specified weight and the energy requirement per kilogram of body weight recommendations provided by FAO/WHO/UNU (1985). However, the energy requirements per kilogram of body weight recommended by the Expert Consultation include a five percent allowance to account for the fact that the energy intakes of the reference groups on which they were based do not reflect the optimum activity levels for children. This extra allowance has been removed for the purpose of deriving the minimum requirement.
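The calculation for children can be sketched as below. The text does not state exactly how the five percent allowance was removed; the sketch assumes that the recommended per-kilogram figure equals a base figure multiplied by 1.05, so that removal amounts to dividing by 1.05. The weight and per-kilogram values are hypothetical.

```python
def child_minimum_requirement(median_weight_kg, kcal_per_kg, allowance=0.05):
    """Minimum requirement (kcal/day) for a child sex-age group.

    Assumption: the recommended per-kg figure is base * (1 + allowance),
    so the allowance is removed by dividing by (1 + allowance).
    """
    return median_weight_kg * kcal_per_kg / (1.0 + allowance)


# Hypothetical median weight and per-kg recommendation, for illustration only.
print(round(child_minimum_requirement(median_weight_kg=20.0, kcal_per_kg=84.0)))
```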
In all sex-age groups, the height figures needed for determining the minimum body weight norms were obtained from the tables given by James and Schofield (1990) and other sources.
The overall minimum per capita energy requirement (the cutoff point)
The minimum per capita dietary energy requirement used as the cutoff point for estimating the prevalence of undernourishment is derived by aggregating the estimated sex-age-specific minimum dietary energy requirements, using the relative proportion of the population in the corresponding sex-age groups as weights. Thus, as the sex-age distribution of the population changes over time, the cutoff point has to be adjusted to reflect this change in demographic structure.
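The aggregation is a population-weighted average, as the following sketch shows. The three groups, their minimum requirements and their population shares are hypothetical and far coarser than the sex-age breakdown actually used.

```python
def cutoff_point(min_requirements, population_shares):
    """Population-weighted average of group-specific minimum requirements."""
    if min_requirements.keys() != population_shares.keys():
        raise ValueError("both inputs must cover the same groups")
    total_share = sum(population_shares.values())
    weighted = sum(min_requirements[g] * population_shares[g] for g in min_requirements)
    return weighted / total_share


# Hypothetical minimum requirements (kcal/person/day) and population shares.
requirements = {"children": 1300.0, "adult_males": 2200.0, "adult_females": 1900.0}
young_population = {"children": 0.40, "adult_males": 0.30, "adult_females": 0.30}
older_population = {"children": 0.30, "adult_males": 0.35, "adult_females": 0.35}

print(round(cutoff_point(requirements, young_population)))
print(round(cutoff_point(requirements, older_population)))
```

With the requirements held fixed, the shift towards a smaller share of children raises the cutoff point, which is why the cutoff has to follow changes in the demographic structure.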
FIGURE 1. FRAMEWORK FOR THE CALCULATION OF THE PROPORTION OF THE POPULATION UNDERNOURISHED
It may be recalled that the frequency distribution of intake, f(x), refers to the average individual or the national per capita unit. This means that with respect to the variation in requirement and the cutoff point, the differences in sex-age composition of households are not considered. Nevertheless, the variation in requirement to be taken into account in defining the cutoff point should reflect the composite effect of not only the acceptable differences in body weight and activity as discussed above in the section "Definition of the minimum energy requirement by sex-age groups" but also the differences in the efficiency of energy utilization of the individuals. However, there is likely to be significant covariance between the latter factor on the one hand and the weight and activity factors on the other. As the extent of the covariance is not known, it was not felt prudent to unduly reduce the energy requirements further in order specifically to take into account the effect of the differences in efficiency of energy utilization in arriving at the cutoff point.
Calculation of the prevalence of undernourishment
Figure 1 graphically illustrates the framework for the calculation of the prevalence of undernourishment. The curve f(x) depicts the proportion of the population corresponding to different per capita dietary energy consumption levels (x) represented by the horizontal line. The cumulative proportion of the population up to the cutoff point, r_{L}, on the horizontal line represents the proportion of the population undernourished.
The distribution, f(x), is assumed to be lognormal, and so the parameters of the distribution of the logarithm of x (i.e. m and s^{2}) can be derived on the basis of the mean, \bar{x}, and the coefficient of variation, CV(x). The section on "Estimation of the distribution of energy consumption" discussed the derivation of the two latter measures on the basis of the available data, and the section on "Definition of energy requirement and estimation of the cutoff point" discussed the derivation of the cutoff point, r_{L}. A detailed illustration of the derivation of these three measures and the calculation of the prevalence of undernourishment for a hypothetical country is given in Appendix D. Here, only a summarized version limited to the procedure for calculating the prevalence of undernourishment on the basis of \bar{x}, CV(x) and r_{L} for the same hypothetical country is given.
The values of the three input measures for the calculation of the prevalence of undernourishment are as follows:
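The first step following the specification of the above is to derive the two parameters of the lognormal distribution, i.e. the mean (m) and the variance (s^{2}) of the logarithm of x, as follows, with \bar{x} denoting the mean of x:
s^{2} = \ln(CV^{2}(x) + 1)
m = \ln(\bar{x}) - s^{2}/2
The next step is to construct the standard normal deviate corresponding to the cutoff point:
Z = (\ln(r_{L}) - m) / s
Finally, the proportion of the population below the cutoff point is obtained as:
F(Z) = P(x < r_{L}),
where F denotes the cumulative standard normal distribution function.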
The prevalence of undernourishment is therefore estimated to be 23 percent. The number of undernourished is obtained by multiplying F(Z) by the total population in the country. As the total population is 11 million, the number of undernourished is estimated as 2.6 million.
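The three steps can be sketched in code, assuming the standard relations between the mean and CV of a lognormal variable and the parameters of its logarithm. The inputs below (mean 2 450, CV 0.35, cutoff 1 800 kcal/person/day) are taken from one cell of Table 1 for illustration; they are not necessarily the values used for the hypothetical country, and the function names are illustrative only.

```python
import math


def lognormal_parameters(mean, cv):
    """Mean m and standard deviation s of ln(x) for a lognormal x."""
    s2 = math.log(cv ** 2 + 1.0)
    m = math.log(mean) - s2 / 2.0
    return m, math.sqrt(s2)


def standard_normal_cdf(z):
    """Cumulative standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prevalence_of_undernourishment(mean, cv, cutoff):
    """Proportion of the population with consumption below the cutoff."""
    m, s = lognormal_parameters(mean, cv)
    z = (math.log(cutoff) - m) / s  # standard normal deviate at the cutoff
    return standard_normal_cdf(z)


p = prevalence_of_undernourishment(mean=2450.0, cv=0.35, cutoff=1800.0)
print(f"{100 * p:.1f} percent")  # about 23 percent
```

Multiplying the resulting proportion by the total population gives the number of undernourished.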
Meaning and significance of the resulting estimate of the prevalence of undernourishment
The data and approximations used to derive the distribution of dietary energy consumption and the cutoff point have implications on the precise meaning and significance of the resulting estimate of the prevalence of undernourishment. These are discussed below.
Concept of food consumption
It was noted that the per capita DES is used as the mean of the frequency distribution, f(x). This means that the distribution refers to food acquired by (or available to) the households rather than the actual food intake of the individual household members. As a consequence, the resulting estimate refers to food availability rather than food intake.
Unit of analysis
The unit of analysis is the household per capita average rather than the individual within the household. This means that any inequity that may exist in the intrahousehold access to food is ignored so that, conceptually speaking, the measure of undernourishment refers to the proportion of households in the population whose food availability is below requirement.
Time reference
The per capita DES used as the mean of f(x) corresponds to a three-year rather than annual average in order to even out the effect of errors in the annual food stocks data used in preparing the food balance sheets. Furthermore, for the purpose of deriving the CV(x | v), only household survey data grouped according to income/expenditure classes are used, thus removing the effect of seasonal and other short-term variation that the household level data are subject to. As a consequence of these, the estimate refers to the average condition during the given three-year period, and the effects of seasonal and other short-term variations in food availability are not considered.
Use of concept of minimum energy requirement as cutoff point
The cutoff point is derived by aggregating the sex-age-specific minimum energy requirements using the proportion of the population in the different sex-age groups as weights. The sex-age-specific minimum energy requirements for at least adults and adolescents are based on the energy expenditure corresponding to the lower limit of the range of acceptable body weight for a given height and the light activity norm. This approach of arriving at the cutoff point might give the impression that undernourishment is operationally defined as the state of having a food consumption level that is below that needed by an average individual for maintaining minimum acceptable body weight and performing light activity. However, this is, strictly speaking, not so. The minimal approach in establishing the cutoff point is a consequence of the consideration, owing to the effect of the correlation between energy intake and requirement, that when an individual's consumption falls within the range of variation of requirements, it is likely to be very close, if not equal, to the individual's own energy requirement. In other words, the energy shortfalls or excesses of such individuals are likely to be small if not exactly zero.
Similar measures by other organizations/authors
The approach of using information pertaining to food availability, income distribution and energy requirement for estimating the global prevalence of food deprivation has been followed by other organizations or authors. Two recent studies can be referred to in this context: one by the US Department of Agriculture in its annual assessment of food security in developing countries (US Department of Agriculture, 2000) and the other by Senauer and Sur (2001).
Both approaches rely on a basic methodological framework that was first applied by Reutlinger and Selowsky (1976). In this section, following a description of this basic methodological framework, the essential differences in the implementation of the two approaches are noted. Then, the relationship with the FAO measure is highlighted.
Basic methodological framework
The methodology relies on three components:
a relationship between food consumption and income;
a distribution of income;
a cutoff point for defining undernourishment.
The relationship between food consumption and income is represented by the Engel function
where x and v represent food consumption and income, respectively, and a and b are parameters. The latter are estimated by cross-country regression using the per capita food availability derived from food balance sheets as x and the per capita GDP as v.
The distribution of income is derived on the basis of World Bank data on income, and the cutoff point for undernourishment is based on a certain energy requirement level.
The relationship between x and v enables the derivation of the per capita food availability corresponding to a given per capita income level or vice versa. Hence, once the cutoff point defining undernourishment is specified, the population in the income groups with per capita food availability below this level is taken as the undernourished.
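The logic of the framework can be sketched as below. The functional form is an assumption made purely for illustration (a semi-log Engel function, x = a + b ln v), as are the parameter values, the group incomes and the cutoff; none of these are taken from the studies cited.

```python
import math


def food_availability(income, a, b):
    """Assumed semi-log Engel function: x = a + b * ln(v)."""
    return a + b * math.log(income)


def income_at_cutoff(cutoff, a, b):
    """Income level at which predicted food availability equals the cutoff."""
    return math.exp((cutoff - a) / b)


def share_undernourished(group_incomes, group_shares, cutoff, a, b):
    """Population share of income groups whose predicted food availability
    falls below the cutoff."""
    return sum(
        share
        for income, share in zip(group_incomes, group_shares)
        if food_availability(income, a, b) < cutoff
    )


# Hypothetical parameters, quintile incomes and cutoff, for illustration only.
A, B, CUTOFF = 500.0, 300.0, 2100.0
incomes = [150.0, 300.0, 500.0, 900.0, 2500.0]
shares = [0.2, 0.2, 0.2, 0.2, 0.2]

print(share_undernourished(incomes, shares, CUTOFF, A, B))
print(round(income_at_cutoff(CUTOFF, A, B), 1))
```

The first function maps income to food availability and the second inverts the relationship, corresponding to the two directions ("or vice versa") in which the Engel function can be read.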
Differences between the USDA and Senauer/Sur approaches
While the two approaches share the above basic methodological framework, certain differences are noted in actual implementation as stated below.
UNIT OF MEASUREMENT OF FOOD AVAILABILITY
The USDA expresses food availability in terms of grain equivalents, whereas the Senauer/Sur study retains the FAO dietary energy supply approach.
INCOME DISTRIBUTION
In the USDA approach, the population is divided into five income groups using the World Bank data on income quintiles, and the Engel function is used to derive the per capita food availability corresponding to each income group. The population in the income quintiles with food availability below the specified cutoff point is taken as the undernourished.
In the Senauer/Sur approach, the World Bank data on income quintiles are used to generate a cumulative density function with the per capita GDP as the mean. In view of this density function approach, the Engel function is used to derive the income level corresponding to the food consumption level implied by the specified energy requirement level. The proportion of the population below this income level, read off the generated cumulative density function, is taken as the prevalence of undernourishment.
THE CUTOFF POINT FOR DEFINING UNDERNOURISHMENT
The USDA uses the concept of average energy requirement for each country, which averages to about 2 100 kcal/person/day for the 67 developing countries covered in their study, whereas the Senauer/Sur study adopted the FAO regional minimum energy requirements derived in connection with the estimates published in the Agriculture: Towards 2010 report (FAO, 1995), which are on average about 300 kcal/person/day below the level used by USDA.
Nevertheless, it may be pointed out that the differences with respect to the unit of measurement for food availability and the degree of detail in representing income distribution are not likely to have any systematic effects on the results. What is likely to have a systematic effect is the difference with respect to the cutoff point. The lower cutoff point used by Senauer/Sur implies that their estimates are likely to be lower than those of the USDA.
Relationship with the FAO approach
The two approaches described above essentially rely on data relating to the distribution of household income and its effect on the variation in per capita food consumption. The FAO approach also relies on such data but with the aim of deriving the distribution of per capita food consumption. The key distinction between the USDA and Senauer/Sur approaches and the FAO approach is that the latter also considers the effect of certain nonincome factors that determine the inequality in the distribution of food consumption. This suggests a difference in the focus of the two kinds of measures. In the case of the USDA and Senauer/Sur approaches, the focus is on the part of the population whose income level is insufficient to ensure a minimum food consumption level, as determined by the specified requirement level. In the FAO case, the focus is on the part of the population whose actual food consumption level is below the specified minimum energy requirement level.
Strengths and weaknesses of the FAO estimates
Strengths
The main strength of the FAO estimates lies in the fact that the distribution of household per capita dietary energy consumption is directly linked to the per capita DES derived from the food balance sheet. This procedure is useful for the reasons given below.
TABLE 1. PERCENTAGE OF UNDERNOURISHED CORRESPONDING TO DIFFERENT COMBINATIONS OF MEAN AND CV OF DIETARY ENERGY CONSUMPTION DISTRIBUTION

Mean food consumption      CV
(kcal/person/day)          0.20    0.24    0.29    0.35

1 700                        65      64      63      63
2 040                        30      34      38      42
2 450                         7      12      17      23
2 940                         1       2       6      10
The FAO per capita DES database, which covers practically all countries of the world, is regularly revised and updated in connection with FAO's continuous work programme on supply/utilization accounts and food balance sheets. As a result, the database represents a readily available source of information for the assessment and monitoring of the prevalence of undernourishment at the global, regional and country levels.
The linkage of the per capita DES with a measure of inequality within a theoretical distribution framework provides a mechanism for assessing the effect of short-term changes in aggregate food availability as well as its components (production, import, etc.) on the distribution of dietary energy consumption and hence the prevalence of undernourishment.
Weaknesses
It is evident there are several weaknesses that affect the validity of the estimates. These are primarily associated with the fact that the frequency distribution of per capita dietary energy consumption is not based on observed data but is derived using a model whose parameters are estimated on data or measures that are subject to errors of unknown magnitude and direction. However, the derivation of the cutoff point also has certain weaknesses.
Assuming that the distribution model used (lognormal) is correct, the validity of the estimates depends on the reliability of the three elements involved in the estimation process, i.e. the mean (represented by per capita DES), the CV and the cutoff point. Of these, the last is obviously of crucial importance as its location within the range of variation of the consumption distribution directly determines the proportion of the population undernourished. In other words, the result is quite sensitive to the cutoff point. The impacts of the other two elements are through their diverse effects on the consumption distribution. The sensitivity of the result with respect to these elements can be assessed on the basis of Table 1, which shows the percentage of the population undernourished corresponding to different combinations of the mean and CV with the cutoff point kept fixed at 1 800 kcal/person/day.
In Table 1, the mean and the CV are given initial values of 1 700 kcal/person/day and 0.20, respectively, and these are then successively increased by 20 percent over the previous values in three steps to finally reach 2 940 kcal/person/day and 0.35, respectively. Accordingly, the absolute change in the percentage as one moves from one cell to the other along the rows indicates the sensitivity of the result to the CV, while the movement along the columns indicates the sensitivity to the mean.
It is noted that the percentage is quite sensitive to the mean but much less so with respect to the CV. In fact, when the mean is low and close to the cutoff point, the percentage is practically insensitive to the CV. Sensitivity to the CV tends to increase as the mean increases to higher levels. In the present analysis, which assumes a cutoff point of 1 800 kcal/person/day, the sensitivity to the CV appears to reach a peak when the mean is about 2 500 kcal/person/day. However, even then, the sensitivity to the mean is greater.
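The sensitivity analysis can be reproduced approximately with the sketch below, which evaluates the lognormal model over the same grid: a cutoff of 1 800 kcal/person/day, and a mean and CV starting at 1 700 and 0.20 and increased by 20 percent in three steps. The function and variable names are illustrative, and individual cells may differ marginally from the published figures because of rounding of the inputs.

```python
import math


def prevalence(mean, cv, cutoff):
    """Proportion below the cutoff under a lognormal distribution."""
    s2 = math.log(cv ** 2 + 1.0)
    m = math.log(mean) - s2 / 2.0
    z = (math.log(cutoff) - m) / math.sqrt(s2)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


CUTOFF = 1800.0
means = [1700.0 * 1.2 ** k for k in range(4)]  # 1 700 up to about 2 940
cvs = [0.20 * 1.2 ** k for k in range(4)]      # 0.20 up to about 0.35

# grid[i][j]: percentage undernourished for means[i] and cvs[j]
grid = [[100.0 * prevalence(mean, cv, CUTOFF) for cv in cvs] for mean in means]

for mean, row in zip(means, grid):
    print(f"{mean:7.0f}", " ".join(f"{pct:5.1f}" for pct in row))
```

Reading across a row shows the effect of the CV at a fixed mean, and reading down a column shows the effect of the mean at a fixed CV.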
It follows from the above that errors in the estimation of the CV are of less consequence to the result as compared with errors in the cutoff point and the mean. Nevertheless, for the sake of precision of the result, it is important that all three elements are sufficiently accurate. The weaknesses having a bearing on the accuracy of these elements are indicated below.
THE MEAN
The per capita DES, which is taken to represent the mean of the distribution of per capita energy consumption, is derived as the ratio of total food supply to population. The total food supply is based on information relating to food production, food products traded, wastages, stock changes and nonfood uses of food products. While data on production and trade are available for most countries, it is well known that they are subject to errors and there are many gaps in the information reported by countries. As regards information on stocks and nonfood utilization, comprehensive and regular statistics are not normally available. There is therefore a need to rely on estimates based on fragmentary data or assumptions to fill in the data gaps in many instances.
The population estimates used as the denominator of the ratio are based on the global series updated biennially by the UN Population Division. Although most of the developing countries have carried out population censuses, these invariably suffer from errors of overestimation or underestimation. The UN Population Division undertakes evaluation and adjustment of the census data in deriving the series of population estimates, but the significant revisions that are made in the series when they are updated indicate that the estimates are not accurate for many countries. It is therefore evident that the per capita DES estimates resulting from the ratio of total food supply to population are subject to significant errors, particularly where the data problems are severe, for example in Africa.
THE CV
Several problems have been noted with the specification of this coefficient, as indicated below.
The coefficient cannot be completely specified, even when the appropriate survey data are available, owing to problems associated with survey practices, measurement errors and sample design. It is in fact only the component of the CV owing to income that could be estimated with a certain degree of reliability from the survey data. However, owing to the lack of the appropriate survey data for many countries, even this component of the CV had to be estimated indirectly and assumed to remain constant over time.
It is uncertain whether the method adopted for estimating the nonincome component of the CV is adequate and reliable.
THE CUTOFF POINT
The weaknesses relating to the cutoff point mainly refer to the specification of the sex-age-specific energy requirement. In this context, three points need to be mentioned.
EQUATIONS TO ESTIMATE THE BMR: The regression equations used to estimate the BMR on the basis of the body weight are those recommended in the report of the FAO/WHO/UNU Expert Consultation on Energy and Protein Requirements. These equations (known as the Schofield equations) are believed to be less accurate in predicting BMR in populations of the tropics and North America and appear to overestimate the BMR in many populations (Hayter and Henry, 1994).
BODY WEIGHT NORMS FOR CHILDREN: It was noted that the minimum energy requirements for children below age ten were based on weights corresponding to the median of the range of acceptable weight-for-height. This approach is in fact inconsistent with the general approach applied to adults and adolescents of basing the minimum energy requirements on the lower limit of the range of acceptable weight-for-height. As a consequence, the per capita minimum energy requirement used as the cutoff point may have been systematically overestimated.
AVERAGE HEIGHT BY SEX-AGE GROUPS: The height figures used for fixing the weight-for-height norms are in many instances based on fragmentary data or analogy with countries having a similar racial/ethnic background. Hence, these may be subject to significant errors.
Feasibility of improving the FAO estimates
It is evident that future attempts to improve the FAO estimates should focus on the weaknesses relating to the specification of both the distribution of household per capita dietary energy consumption and the cutoff point. The actions that could be considered are briefly discussed below.
The mean and the CV of household per capita dietary energy consumption
The food balance sheet estimate of the per capita DES, which is used as the mean of the distribution, can be improved by improving the basic statistics on which it is based. The figure should also be reconciled with the corresponding national level estimates derived from the aggregated HIES data. Although the estimates of per capita food consumption from these two sources are expected to differ, the food consumption data from the HIES can be used to improve the food balance sheet estimates, particularly with respect to the consumption of minor food crops and the self-produced and consumed food.
There is also scope for improving the estimate of the CV of household per capita dietary energy consumption on the basis of the available distribution data from the existing household income and expenditure surveys, through further research and analysis. However, for the purpose of improving the per capita DES, there is a need for access to the national per capita quantities corresponding to the various items of food expenditure. On the other hand, for the purpose of the research and analysis relating to the estimation of the CV of household per capita dietary energy consumption, the quantity data need to be converted into the energy equivalents and tabulated. Yet, despite FAO's promotional efforts in this direction, few countries have actually undertaken the related work systematically in connection with their existing HIESs. Consequently, a potential source of data for improving the methodology remains largely unexploited. As the data and research will benefit not only FAO's work but also the individual countries' information system for assessing and monitoring the prevalence of undernourishment, it is important that the data processing and tabulation work to be undertaken with respect to the HIES be considered as a key component of national FIVIMS programmes being implemented by FAO and the Interagency Working Group on FIVIMS.
Surveys designed to provide reliable estimates of the distributions
In the long term, it is desirable to avoid the use of a theoretical model and different data sources for the distribution of dietary energy consumption and to rely solely on household surveys, at least in the context of national FIVIMS. However, for this purpose, it is necessary to undertake research focusing on the development of sample survey designs that will yield reliable estimates of the distribution at minimal costs.
The cutoff point
With respect to the cutoff point, the weaknesses noted refer to the regression equations used to predict the sex-age-specific BMR, the body weight norms used for deriving the minimum energy requirements for children below age ten and the height figures used for determining the weight-for-height norms.
A considerable amount of new BMR data referring to populations in the tropical areas has become available since the publication of the report of the FAO/WHO/UNU Expert Consultation on Energy and Protein Requirements. Therefore, there is scope for a more extensive analysis in order to arrive at equations that can predict more accurately the BMR for populations in the different parts of the world.
As regards the body weight norms for children below age ten, the report of the FAO/WHO/UNU Expert Consultation did not contain any recommendation regarding the range of acceptable weight-for-height. However, given the practice of using the cutoff point of median - 2 SD in anthropometric assessment of undernutrition among children, it may be logical to adopt this practice also for the purpose of fixing the minimum energy requirements. In this way, the approach taken with respect to children may be brought in line with that taken for adults and adolescents.
The height figures used for deriving the body weight norms are largely based on the data published by James and Schofield in 1990. There is therefore scope for updating this dataset on the basis of the new data available since 1990.
Feasibility of disaggregating the estimates by sex-age and subnational groups
There is of course an interest in obtaining information on the differences that may exist in the prevalence of undernourishment among individuals in different sex-age groups and among those living in different areas within a country. The feasibility of undertaking these within the framework of the FAO methodology is discussed below.
Disaggregation by sex-age groups
Although the minimum energy requirements are first specified by sex-age groups and then aggregated as a weighted average for all sex-age groups, the consumption data refer to household per capita averages that do not permit breakdowns according to the sex and age of the household member. The available consumption data therefore do not allow for disaggregation of the estimates by sex and age groups.
Disaggregation by subnational areas
With respect to global assessment, which relies on the per capita DES from the food balance sheet as the mean of the distribution of consumption, it is not possible to disaggregate the national estimate by subnational areas as the food balance sheet approach is not applicable at the subnational level. However, it may be feasible to apply the FAO methodology separately to the different areas and thus derive subnational estimates of the prevalence of undernourishment if the mean and CV of the distribution of dietary energy consumption by subnational areas are available from household survey data of specific countries. In this context, the national estimate could be obtained as an aggregation over the subnational areas.
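One way to carry out such an aggregation is sketched below, assuming that the lognormal model is applied separately within each area and that the national figure is the population-weighted average of the area estimates. The areas, their means, CVs and populations are hypothetical, and a single cutoff is used for all areas as a simplification; in practice the cutoff could differ with the sex-age structure of each area.

```python
import math


def prevalence(mean, cv, cutoff):
    """Proportion below the cutoff under a lognormal distribution."""
    s2 = math.log(cv ** 2 + 1.0)
    m = math.log(mean) - s2 / 2.0
    z = (math.log(cutoff) - m) / math.sqrt(s2)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def national_prevalence(areas, cutoff):
    """Population-weighted aggregation of subnational prevalence estimates."""
    total_population = sum(area["population"] for area in areas)
    undernourished = sum(
        area["population"] * prevalence(area["mean"], area["cv"], cutoff)
        for area in areas
    )
    return undernourished / total_population


# Hypothetical survey-based inputs for two areas, for illustration only.
areas = [
    {"name": "urban", "mean": 2450.0, "cv": 0.35, "population": 4_000_000},
    {"name": "rural", "mean": 2040.0, "cv": 0.29, "population": 7_000_000},
]

for area in areas:
    print(area["name"], round(100 * prevalence(area["mean"], area["cv"], 1800.0), 1))
print("national", round(100 * national_prevalence(areas, 1800.0), 1))
```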
Contribution to vulnerability assessment
The FAO country level estimates can be used as an indicator reflecting the extent of chronic food insecurity in vulnerability studies.
Frequency and cost of updating the estimates
As the per capita DES, which is a key element in the estimation process, is updated for nearly all countries on an annual basis by FAO, it is possible to derive annual updates of the estimate of the prevalence of undernourishment with minimum costs.
Acknowledgements
The author wishes to acknowledge the valuable inputs and comments provided by Messrs J. Mernies and R. Sibrian, his former colleagues in the Statistical Analysis Service, Statistics Division, FAO. In addition, the assistance in typing the drafts provided by Ms G. MarcianiPoliti, also in the Statistical Analysis Service, is much appreciated.
References
FAO. 1977. The fourth world food survey. Rome.
FAO. 1987. The fifth world food survey. Rome.
FAO. 1995. Agriculture: towards 2010. Rome.
FAO. 1996. The sixth world food survey. Rome.
FAO/WHO/UNU. 1985. Energy and protein requirements. Report of a joint FAO/WHO/UNU ad hoc Expert Consultation. WHO Technical Report Series, No. 724, Geneva.
Hayter, J.E. & Henry, C.J. 1994. A re-examination of basal metabolic rate predictive equations: the importance of geographical origin of subjects in sample selection. Eur. J. Clin. Nutr., 48(10): 702-707.
James, W.P.T. & Schofield, E.C. 1990. Human energy requirements. Oxford, Oxford University Press.
Kakwani, N.C. 1992. Measuring undernutrition with variable calorie requirements. In S.R. Osmani, ed. Nutrition and poverty, pp. 165-185. Oxford, Clarendon Press. 336 pp.
Lorstad, M. 1974. On estimating incidence of undernutrition. FAO Nutrition Newsletter, 12(1): 1-11. Rome.
Naiken, L. 1998. On certain statistical issues arising from the use of energy requirements in estimating the prevalence of energy inadequacy (undernutrition). J. Indian Soc. Agric. Stat., L1(2.3): 113-128.
Naiken, L. & Becker, K. 1990. Food consumption statistics - the potential role of data from household income/expenditure surveys. FAO Quarterly Bulletin of Statistics, 3(4). Rome.
Reutlinger, S. & Selowsky, M. 1976. Malnutrition and poverty: magnitude and policy options. World Bank Staff Occasional Papers No. 23. Baltimore, MD, Johns Hopkins University Press.
Reutlinger, S. & Alderman, H. 1980. The prevalence of calorie-deficient diets in developing countries. World Dev., 8: 399-411.
Scrimshaw, N.S., Waterlow, J.C. & Schurch, B., eds. 1994. Energy and protein requirements. Proceedings of an International Dietary Energy Consultative Group Workshop, London.
Senauer, B. & Sur, M. 2001. Ending global hunger in the 21st century: projections of the number of food insecure people. Rev. Agric. Econ., 23(1): 51-68.
Sukhatme, P.V. 1961. The world's hunger and future needs in food supplies. J. R. Stat. Soc. [Ser A], 124: 463-585.
Sukhatme, P.V. 1982. Poverty and malnutrition. In P.V. Sukhatme, ed. Newer concepts in nutrition and their implications for policy, pp. 1152. Pune, India, Maharashtra Association for the Cultivation of Science.
Svedberg, P. 2001. Undernutrition overestimated. IIES Seminar Paper No. 693, Stockholm, Stockholm University.
US Department of Agriculture. 2000. Food security assessment. Situation and Outlook Series, International Agricultural and Trade Reports. Washington, DC.
WHO. 1983. Measuring change in nutritional status. Geneva.
WHO. 1995. Physical status: the use and interpretation of anthropometry. WHO Technical Report Series No. 854. Geneva.
Introduction
As energy requirements are normatively specified as averages corresponding to groups of individuals of given age, sex, body weight and activity, the implied variation within the group needs to be taken into account in determining whether an individual's intake is below, equal to or above their requirement. The variation within the group of such similar individuals is believed to be due to differences in the efficiency of energy utilization. In the context of the FAO methodology, where the unit of analysis is the household per capita intake, the variation reflects the net effects of differences in the composition of the households with respect to sex, age, efficiency of energy utilization, body weight and activity of the household members. In order to take into account this variation, it is necessary to rely on a bivariate distribution model referring to energy intake and energy requirement. The estimate of the prevalence of undernourishment based on such a model has been referred to as the bivariate formula. However, a key issue in evaluating this formula has been how to incorporate the effect of the positive correlation that is believed to exist between energy intake and requirement. Because of a lack of sufficient information on this matter, FAO has traditionally adopted the Sukhatme cutoff point formula that relies on the frequency distribution of household per capita energy intake and a cutoff point reflecting the lower limit of the range of variation of energy requirement. In the past, this approach has been justified as a means of avoiding the risk of overestimating the prevalence of undernourishment, but a study undertaken during the preparation of The Sixth World Food Survey showed that under the condition of a correlation between energy intake and requirement, the bivariate formula reduces to the cutoff point formula (Naiken, 1998).
However, recently, Svedberg (2001) argued that the FAO cutoff point formula yields an estimate that is "biased" downwards as it appears to ignore the risk of energy insufficiency among the individuals with an intake falling within the range of variation of requirement. His argument is based on a comparison of the estimate obtained through the cutoff point formula with that obtained using the bivariate formula. However, in evaluating the bivariate formula, he, just as others who have applied it before (e.g. Kakwani, 1992; Lorstad, 1974; Reutlinger and Alderman, 1980), has not correctly interpreted the effect of correlation with the consequence that the prevalence of undernourishment is overestimated. This issue, as well as the derivation of the cutoff point formula, is discussed in detail in this appendix.
The flaw in the approach taken to incorporate the effect of correlation in evaluating the bivariate formula
In the subsection "Basic methodological framework", the bivariate formula for estimating the prevalence of undernourishment is expressed as follows:
$$P(x<r) = \iint_{x<r} f(x,r)\,dx\,dr \qquad \text{(A1)}$$
Needless to say, the above statement refers to the probability that the intake of a randomly selected individual from the population is below his or her requirement.
In evaluating the formula given by expression (A1), f(x,r) has been assumed to be bivariate normal with the related parameters given as follows:
m_{x} representing the mean of intake x;
m_{r} representing the mean of requirement r;
s^{2}_{x} representing the variance of intake x;
s^{2}_{r} representing the variance of requirement r; and
s_{xr} representing the covariance of x and r.
The covariance reflects the presence of correlation. As the coefficient of correlation, ρ, is given by the ratio s_{xr}/(s_{x}s_{r}), the analysts (including Svedberg) who have attempted to apply the bivariate formula have expressed the covariance as ρs_{x}s_{r}. Thus, given s_{x} and s_{r}, the effect of correlation has been taken into account by simply introducing ρ as an additional parameter of the joint distribution of intake and requirement. The flaw resulting from the introduction of the effect of correlation in this manner is discussed below.
Since P(x<r) can also be written as P[(x - r) < 0], the double integral given by expression (A1) can be evaluated by considering the distribution of (x - r), which is expected to be normal (as the joint distribution is assumed to be normal) with mean m_{(x-r)} and variance s^{2}_{(x-r)}. Thus, using this distribution, the prevalence of undernourishment is expressed as follows:
$$P(x<r) = F\left[\frac{0 - m_{(x-r)}}{s_{(x-r)}}\right] \qquad \text{(A2)}$$
where the mean and variance of (x - r) are given as follows:
$$m_{(x-r)} = m_{x} - m_{r} \qquad \text{(A3)}$$
$$s^{2}_{(x-r)} = s^{2}_{x} + s^{2}_{r} - 2\rho s_{x}s_{r} \qquad \text{(A4)}$$
with ρ denoting the coefficient of correlation between x and r, and F[Z] is the area under the standard normal curve to the left of the point Z. Needless to say, P(x>r) can be derived as 1 - F[Z].
Actually, Svedberg has assumed f(x,r) to be bivariate lognormal, which implies that the probability statement is expressed as P[(x/r) < 1] instead of P[(x - r) < 0]. As this boils down to the same probability statement expressed in relative rather than absolute terms, it does not affect the argument made here, which is based on the assumption that f(x,r) is bivariate normal. Furthermore, Svedberg has evaluated P(x<r) by operating directly on the joint frequency distribution rather than the distribution of (x - r), but again, this does not affect the argument as the two approaches lead to the same result.
Thus, it is evident that by keeping m_{x}, m_{r}, s_{x} and s_{r} constant and increasing the correlation coefficient ρ from zero, s^{2}_{(x-r)} is decreased so that P(x<r) given by expression (A2) is marginally reduced. However, it is also evident that when m_{x} = m_{r}, the numerator of the ratio within the square brackets on the right-hand side of expression (A2) becomes 0, so that, irrespective of the value of ρ in the denominator, P(x<r), and consequently P(x>r), will remain 0.5. The only exception is when s_{x} = s_{r} and ρ is assumed to be exactly 1.
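The behaviour described above can be checked numerically. The sketch below evaluates P(x<r) through the distribution of (x - r), as in expressions (A2) to (A4), under the bivariate normal assumption. The means and standard deviations are hypothetical values chosen only for illustration, and `rho` stands for the coefficient of correlation.

```python
from math import sqrt
from statistics import NormalDist

def p_intake_below_requirement(m_x, m_r, s_x, s_r, rho):
    """P(x < r) for bivariate normal (x, r), via the distribution of (x - r)."""
    mean_diff = m_x - m_r                                    # mean of (x - r)
    var_diff = s_x ** 2 + s_r ** 2 - 2.0 * rho * s_x * s_r   # variance of (x - r)
    return NormalDist().cdf((0.0 - mean_diff) / sqrt(var_diff))

# Hypothetical parameters (kcal/person/day); only rho is varied.
for rho in (0.0, 0.3, 0.6, 0.9):
    equal_means = p_intake_below_requirement(2200.0, 2200.0, 500.0, 300.0, rho)
    higher_intake = p_intake_below_requirement(2400.0, 2200.0, 500.0, 300.0, rho)
    print(rho, round(equal_means, 3), round(higher_intake, 3))
```

With equal means the result stays at 0.5 whatever value is assumed for `rho`, while with mean intake above mean requirement it declines as `rho` rises.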
The high and constant value of 0.5 for P(x<r) and P(x>r) when m_{x} = m_{r} and the correlation coefficient ρ is increased from 0 to nearly 1 (i.e. the results remain the same as under the assumption of no correlation) is due to the fact that the effect of correlation has not been correctly introduced in evaluating the bivariate formula. In this connection, the following statement from the report of the FAO/WHO/UNU Expert Consultation on Energy and Protein Requirements is worth noting:
Most people have the ability to select their food intake in accordance to their requirement over the long term, since it is believed that regulatory mechanisms operate to maintain balance between energy intake and requirement over long periods. This implies that one expects there to be a correlation between energy intake and energy requirement among individuals if sufficient food is available in the absence of interfering factors .... If self-selection is allowed to operate, it is to be expected that individuals will make selection according to the energy need and the probability of inadequacy or excess will be low across the whole range of requirement (emphasis mine) ... If the average intake of a class were equal to the average requirement of the class almost all individuals would be at low risk because of processes regulating energy balance and the resultant correlation between intake and requirement (FAO/WHO/UNU, 1985).
The above statement clearly indicates that correlation is meant to reflect the effect of "regulatory mechanisms" that "operate to maintain balance between energy intake and energy requirement over long periods". This implies that correlation needs to be interpreted as the existence of a probability for intake to match requirement, i.e. P(x=r).
The interpretation of correlation in the above manner is actually a consequence of the fact that, in the present context where energy requirement is fixed, the joint distribution is such that the range of variation of requirement lies within the range of variation of intake, and hence s_{x} includes s_{r}. In this situation, intake cannot be linked to requirement by the usual regression equation but by the following expression:
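$$x = m_{r} + (r - m_{r}) + (x - r) \qquad \text{(A5)}$$
where (r - m_{r}) refers to the deviation of the individual's requirement from the mean requirement, and (x - r) refers to the deviation of the individual's intake from their requirement. Thus, as (r - m_{r}) and (x - r) can be assumed to be uncorrelated, the variance of intake may be expressed as
$$s^{2}_{x} = s^{2}_{r} + s^{2}_{(x-r)} \qquad \text{(A6)}$$
Now, given that x is correlated with r, with ρ denoting the coefficient of correlation, the second term on the right-hand side of expression (A6) may be expressed as
$$s^{2}_{(x-r)} = s^{2}_{x} + s^{2}_{r} - 2\rho s_{x}s_{r} \qquad \text{(A7)}$$
so that
$$s^{2}_{x} = s^{2}_{r} + s^{2}_{x} + s^{2}_{r} - 2\rho s_{x}s_{r} \qquad \text{(A8)}$$
Upon elimination of s_{x}^{2} on both sides, it will be noted that
$$\rho s_{x}s_{r} = s^{2}_{r}.$$
Hence
$$\rho = \frac{s_{r}}{s_{x}}$$
and
$$s^{2}_{(x-r)} = s^{2}_{x} - s^{2}_{r}.$$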
Thus, if s_{x} = s_{r}, then the correlation coefficient ρ = 1 and s^{2}_{(x-r)} = 0, which implies that P(x=r) = 1 for all intakes that fall within the range of variation of requirement. This means that the condition for x to be correlated with r is that P(x=r) = 1 for the intakes that lie within the range of variation of requirement. By implication, the condition for the existence of P(x<r) or P(x>r) is that s_{x} > s_{r} so that P(x<r) = 1 for the intakes that consequently will lie below the range of variation of requirement, and P(x>r) = 1 for the intakes that will lie above the range of variation of requirement. In view of this, given the frequency distribution of intake, P(x<r) for the population is derived by integrating the distribution over the part that lies below the range of variation of requirement; P(x=r) by integration over the part that coincides with the range of variation of requirement; and P(x>r) by integration over the part that lies above the range of variation of requirement.
It follows from the above discussion that the analysts who have attempted to evaluate the bivariate formula have ignored the fact that, as the range of variation of requirement lies within the range of variation of intake, the covariance is given by s^{2}_{r}, and hence ρ (which is given by the ratio s_{r}/s_{x}) reflects the existence of P(x=r) in addition to P(x<r) and P(x>r) in the bivariate probability space. Consequently, P(x<r) cannot be evaluated on the basis of the parameters of the two marginal distributions and independently assumed values for ρ. By doing this, the whole probability space is divided into P(x<r) and P(x>r), irrespective of the assumed level of ρ, so that P(x=r) is absorbed by P(x<r) and P(x>r). As a result, the effect of correlation is blunted, and both P(x<r) and P(x>r) are overestimated.
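The relationship between the covariance, s^{2}_{r} and the ratio s_{r}/s_{x} can be illustrated with a small simulation. The sketch below is not part of the original analysis: it assumes, as in the decomposition of intake discussed above, that the deviation (x - r) is uncorrelated with requirement, and it draws both components from normal distributions with hypothetical parameters.

```python
import random
from math import sqrt

def simulate(n=50_000, m_r=2200.0, s_r=200.0, s_e=450.0, seed=1):
    """Draw requirement r and an independent deviation e = (x - r); set x = r + e."""
    rng = random.Random(seed)
    r = [rng.gauss(m_r, s_r) for _ in range(n)]
    e = [rng.gauss(0.0, s_e) for _ in range(n)]   # hypothetical spread of (x - r)
    x = [ri + ei for ri, ei in zip(r, e)]

    mean_x = sum(x) / n
    mean_r = sum(r) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x) / n
    var_r = sum((ri - mean_r) ** 2 for ri in r) / n
    cov_xr = sum((xi - mean_x) * (ri - mean_r) for xi, ri in zip(x, r)) / n

    return {
        "rho": cov_xr / sqrt(var_x * var_r),      # sample correlation of x and r
        "s_r_over_s_x": sqrt(var_r / var_x),      # ratio of standard deviations
        "cov_xr": cov_xr,
        "var_r": var_r,
    }

result = simulate()
print(result)
```

Up to sampling noise, the sample covariance is close to the variance of requirement and the sample correlation is close to the ratio of the two standard deviations, so the correlation is not a free parameter once s_{x} and s_{r} are fixed under this assumption.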
Derivation of the cutoff point formula within the bivariate distribution framework
In the previous section, the effect of correlation on P(x<r) has been discussed under the assumption that the joint distribution is bivariate normal. This is mainly because nearly all attempts made to evaluate the bivariate formula have been based on this assumption. However, as will be shown below, the argument justifying the use of the cutoff point formula is not bound by this assumption and can be made without specifying the type of theoretical distribution that is relevant in the present context. It suffices to assume that the two marginal distributions are continuous and unimodal, and to consider the condition for the existence of a dependent relationship (correlation) between energy intake and energy requirement.
As illustrated in Figure A1, the range of variation of the marginal distribution of requirement is expected to be located within the range of variation of the marginal distribution of intake.^{[4]} Consequently, the probability space can be divided into three parts as follows:
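$$1 = \int_{x<r_{L}} f(x)\,dx + \int_{r_{L}\le x\le r_{U}} f(x)\,dx + \int_{x>r_{U}} f(x)\,dx \qquad \text{(A9)}$$
where f(x) and f(r) are the marginal density functions of intake and requirement, respectively, and r_{L} and r_{U} represent the lower and upper limits, respectively, of the distribution of requirement.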
FIGURE A1. FREQUENCY DISTRIBUTION OF DIETARY ENERGY INTAKE AND DIETARY ENERGY REQUIREMENT
The area defined by the first integral represents the proportion of the population whose intakes are below the lowest requirement and are therefore in all probability below their respective requirements, while the area defined by the third integral represents the proportion whose intakes are above the highest requirement and are therefore in all probability above their respective requirements. The area defined by the second integral represents the proportion of the population whose intake status requires evaluation of the joint distribution of intake and requirement over the requirement range. As a consequence, the bivariate formula given by expression (A1) can be written as follows:
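$$P(x<r) = \int_{x<r_{L}} f(x)\,dx + \int_{r_{L}}^{r_{U}}\int_{r_{L}}^{r} f(x,r)\,dx\,dr. \qquad \text{(A10)}$$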
The evaluation of the double integral over the range of requirement depends on whether intake is considered to be independent of requirement (uncorrelated) or dependent on requirement (correlated). The condition for independence in this context implies that the individuals' intakes are either below or above their respective requirements, i.e. the intakes do not match the respective requirements. The condition for dependence is the contrary, i.e. the intakes match the respective requirements.
Thus, if intake is assumed to be uncorrelated with requirement (i.e. intake is independent of requirement) the double integral involving the joint distribution on the right-hand side of expression (A10) can be expressed in the form of two simple integrals involving the marginal distributions of intake and requirement so that the bivariate formula may be written as follows:
$$P(x<r) = \int_{x<r_{L}} f(x)\,dx + \int_{r_{L}}^{r_{U}} f(r)\left[\int_{r_{L}}^{r} f(x)\,dx\right]dr \qquad \text{(A11)}$$
This implies that under the assumption of no correlation, P(x<r) will include part of the intakes that fall within the range of variation of requirement.
However, if intake is considered to be correlated with requirement (i.e. intake is dependent on requirement), all the intakes falling within the range of variation of requirement are expected to match requirements [i.e. x = r] so that the double integral [the second term on the right-hand side of expression (A10)] becomes zero. As a consequence, the intakes that fall within the range of variation of requirement need to be excluded in evaluating P(x<r) so that the bivariate formula reduces to the cutoff point formula as follows:
$$P(x<r) = \int_{x<r_{L}} f(x)\,dx \qquad \text{(A12)}$$
It therefore follows that the bivariate formula given by expression (A1) is a general formula for the prevalence of undernourishment that, under the assumptions that the marginal distributions are unimodal and a correlation exists between energy intake and requirement, reduces to the cutoff point formula given by expression (A12).
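The difference between the estimate under independence and the cutoff point estimate can be illustrated numerically. The sketch below is a hedged illustration only: the intake distribution is taken to be normal and the requirement distribution uniform between r_{L} and r_{U}, both of which are assumptions made here for simplicity, and all parameter values are hypothetical.

```python
from statistics import NormalDist

def cutoff_estimate(intake, r_low):
    """Cutoff point estimate: share of intakes below the lowest requirement."""
    return intake.cdf(r_low)

def independence_estimate(intake, r_low, r_up, steps=2000):
    """Estimate when intake is treated as independent of requirement.

    Adds, for each requirement level r in [r_low, r_up], the share of
    intakes lying between r_low and r, weighted by the requirement density.
    """
    width = (r_up - r_low) / steps
    density_r = 1.0 / (r_up - r_low)           # uniform requirement density (assumption)
    extra = 0.0
    for i in range(steps):
        r = r_low + (i + 0.5) * width          # midpoint rule
        extra += density_r * (intake.cdf(r) - intake.cdf(r_low)) * width
    return intake.cdf(r_low) + extra

intake = NormalDist(mu=2300.0, sigma=600.0)    # hypothetical intake distribution
print(round(cutoff_estimate(intake, 1800.0), 3),
      round(independence_estimate(intake, 1800.0, 2600.0), 3))
```

The independence estimate always exceeds the cutoff point estimate, because it counts part of the intakes that fall within the range of variation of requirement as inadequate.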
Practical significance of the estimate resulting from the application of the cutoff point formula
In the introduction of this appendix, it was indicated that the prevalence of undernourishment is formulated as a probability measure because energy requirement is specified as the average for a group of individuals, and consequently, the actual requirement of each individual in the group is not known. In other words, the probability measure is used because of the uncertainty regarding the actual requirement level of the individuals. This uncertainty, however, concerns only the group of individuals with intakes that fall within the range of variation of requirement. In making inference regarding this group, a key issue is whether intake is correlated with requirement (i.e. whether intake is dependent on requirement). It was noted that if, as generally accepted, intake is correlated with requirement, this group has to be considered to be in the adequate category, i.e. their intakes match their respective requirements. Consequently, the bivariate formula reduces to the cutoff point formula with the cutoff point given by the lower limit of the distribution of requirement.
The above result is actually theoretical for two obvious reasons. First, it is based on the assumption that the marginal distributions of intake and requirement are continuous. Second, it takes into account the theoretical condition for the existence of statistical dependence or correlation between the two variables. However, in reality, the distributions are likely to contain certain irregularities and are considered to be continuous only approximately, so that the theoretical condition cannot be met exactly. In other words, even if a correlation exists, the intakes falling within the range of variation of requirements may still lie either below or above the individuals' respective requirements (i.e. the intakes do not exactly match requirements). It is therefore necessary to explain the significance of the theoretical condition in this context. This is undertaken below by considering a bivariate plot resulting from a hypothetical situation where intake is correlated with requirement.
In Figure A2, requirement (r) is shown on the horizontal axis and intake (x) on the vertical axis. Requirement is assumed to vary within the range of r_{L} and r_{U}. The point x_{1} on the vertical axis represents the level of intake that equals r_{L}, i.e. the cutoff point, and x_{2} represents the level of intake that equals r_{U}. The 45 degree line is where intake exactly matches requirement, i.e. x=r. Thus, the area below the 45 degree line represents the space where the individuals whose intakes are below their respective requirements will be located, while that above the 45 degree line is the space where the individuals whose intakes are above their respective requirements will be found. It is evident that as r_{L} and r_{U} are the lowest and highest requirements, respectively, there can be no individuals in the area to the left of the vertical line drawn from r_{L} as well as that to the right of the vertical line from r_{U}. Thus, all the individuals will be located between the vertical lines rising from r_{L} and r_{U}.
FIGURE A2. BIVARIATE PLOT OF HYPOTHETICAL DATA ON INTAKE AND REQUIREMENT OF INDIVIDUALS
It follows from the above that the individuals with intakes that fall within the range of variation of requirements will be located in the area ABCD. In fact, the tendency for intake to match requirement (correlation) is possible only for this group of individuals. However, because of the fact that, in reality, the frequency distributions are not likely to be exactly continuous, the points corresponding to this group of individuals are not likely to be exactly on the 45 degree line, as stipulated by the condition for the existence of a correlation. Hence, the points corresponding to this group are shown to cluster closely around the 45 degree line. Those whose intakes are below r_{L} will lie in the area below BC, while those whose intakes are above r_{U} will be in the area above AD. The points in these two groups (not shown in the graph) are obviously the outliers reflecting the effect of factors that are uncorrelated with requirement. It is actually their presence that makes the correlation between intake and requirement in the population not perfect. The greater their proportion in the population, the less perfect is the correlation.
All individuals in the area below BC are clearly in the inadequate category, while those in the area above AD are clearly in the excess category. As regards those in the area ABCD, the theoretical argument is that although the corresponding points may be either below or above the 45 degree line, the extent of the departure from the line cannot be significant for a correlation to exist between intake and requirement. Hence, these individuals as a group are considered to be in the state of energy adequacy in the probability sense. In other words, the corresponding points are in theory considered to be on the 45 degree line. A situation where the points depart significantly from the 45 degree line would obviously imply that a correlation does not exist. In such a situation, some individuals in this group also are likely to be in the inadequate category, implying that the cutoff point formula will underestimate the prevalence of undernourishment.
Thus, the use of the lower limit of the range of variation of requirement as the cutoff point on the distribution of intake for estimating the prevalence of undernourishment does not imply a deliberate disregard of undernourishment among the individuals with intakes falling within the range of variation of requirement, but is a recognition of the fact that, owing to the effect of correlation, these intakes are likely to be close to, if not exactly matching, the respective requirements.
In the subsection "Estimation of the CV" of the main paper, it was indicated that the CV(x|r) needed to specify CV(x) is estimated to be about 0.20. The procedure used to arrive at this figure is described in this appendix.
The first step in the procedure was to derive estimates of the minimum and maximum per capita energy requirements. Then, assuming that the distribution is normal, the CV implied by this range was derived.
The minimum and maximum per capita energy requirements have been estimated on the basis of the same principles adopted for deriving the sex-age-specific minimum energy requirements for the purpose of the cutoff point, i.e. by considering the ranges of acceptable body weight for given height and physical activity as described in the subsection "Definition of the minimum energy requirement by sex-age groups". However, there are two exceptions, as indicated below:
Variation in the BMR for adults and adolescents
The regression equations used for estimating the BMR given body weight are subject to a prediction error corresponding to a CV of about 0.08 (Scrimshaw, Waterlow and Schurch, 1994). As this variation is of a random nature, it was not considered in deriving the minimum energy requirements for the purpose of the cutoff point. But in the present context where this variation in energy requirement is to be used for estimating the variation in energy intake, the variation owing to error in estimating the BMR is taken into account.
Range of acceptable body weight for given height for children
In defining the sex-age-specific minimum energy requirements for children below age ten for the purpose of the cutoff point, no allowance was made for variation in the weight-for-height norms. However, in the present context, the approach was made consistent with that taken for adults and adolescents. Accordingly, the range of acceptable weight-for-height used in anthropometric assessments of nutritional status was adopted to arrive at the sex-age-specific minimum and maximum energy requirements.
The body weight and activity specifications used for defining the minimum and maximum energy requirements are summarized in Table B1.
As in the case of the cutoff point, the derived sex-age-specific minimum and maximum energy requirements are aggregated by using the proportion of the population in the relevant sex-age groups as weights to arrive at the country minimum and maximum per capita energy requirements. The CVs based on the range of energy requirements thus derived have been calculated for all countries for 1960 and 2000. The resulting values were around 0.20. Both the intercountry and the interperiod differences, which reflect the intercountry and temporal differences in the sex-age composition of the population, were small, corresponding to a standard deviation of less than 0.005. In view of this, a fixed value of 0.20 has been used for all countries.
TABLE B1. BODY WEIGHT AND ACTIVITY SPECIFICATIONS FOR DEFINING MINIMUM AND MAXIMUM ENERGY REQUIREMENTS
Row and column labels: Body weight; Minimum; Maximum; Physical activity level.
The available series referring to the Gini coefficient of the distribution of household income/expenditure for a number of developing countries in Asia, Africa and Latin America are given in Table C1. Only the countries with data referring to more than one time period are shown in the table. In referring to this table, it should be borne in mind that, as the share of food in household expenditure declines with rising income and there is an upper limit to food consumption levels, the inequality in the distribution of the household per capita food consumption is much smaller than the inequality in the distribution of household income. In view of this, the focus should be on the changes rather than the actual levels observed.
In interpreting the changes in the Gini coefficient from survey to survey over time, account must be taken of the fact that the coefficients are based on distributions derived from data collected in sample surveys, which are normally designed to provide valid estimates of the means rather than the distribution. Furthermore, the means as well as the variances derived through these surveys are subject to sampling errors. As the distribution of income is known to be positively skewed, the sampling error is in fact larger than what would be expected in the case of a normal (symmetric) distribution. The effect of these errors, which are common to all socioeconomic surveys based on samples, implies that the estimated variances, and hence the Gini coefficient, are not likely to be stable, even if there is no true change in the inequality. Thus, considering these issues associated with the precision of the measures based on sample surveys, the period-to-period change observed cannot be taken to reflect a true change unless it is very large.
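The instability of the Gini coefficient under repeated sampling can be illustrated with a small simulation. The sketch below is an illustration under assumptions not drawn from the surveys discussed here: incomes are taken to be lognormal with a hypothetical and unchanging dispersion parameter, and each "survey" is a simple random sample of hypothetical size.

```python
import random

def gini(values):
    """Gini coefficient of a list of non-negative values (rank-based formula)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    weighted = sum((i + 1) * x for i, x in enumerate(xs))   # rank-weighted sum
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n

def repeated_survey_ginis(n_surveys=10, sample_size=1000, sigma=0.7, seed=7):
    """Gini estimates from repeated samples of one unchanged income distribution."""
    rng = random.Random(seed)
    return [
        gini([rng.lognormvariate(0.0, sigma) for _ in range(sample_size)])
        for _ in range(n_surveys)
    ]

estimates = repeated_survey_ginis()
print([round(g, 3) for g in estimates])
print("spread:", round(max(estimates) - min(estimates), 3))
```

Although the underlying inequality is identical in every simulated survey, the estimates differ from one sample to the next, which is why small period-to-period changes should not be read as true changes.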
Table C1 shows that wherever the number of observations is sufficient for drawing conclusions, the changes over time in the different countries are rather small with no clear indication of either a decreasing or increasing trend.
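TABLE C1. GINI COEFFICIENT OF DISTRIBUTION OF HOUSEHOLD INCOME/EXPENDITURE IN DEVELOPING COUNTRIES, 1970-1993

| Region/country | Gini coefficient (year: value) |
|---|---|
| LATIN AMERICA AND THE CARIBBEAN | |
| Brazil^{a} | 1970: 0.58; 1980: 0.58; 1981: 0.55; 1986: 0.55; 1987: 0.56 |
| Colombia^{a} | 1971: 0.52; 1972: 0.53; 1978: 0.55; 1988: 0.51 |
| Guatemala^{b} | 1987: 0.58; 1989: 0.59 |
| Jamaica^{c} | 1988: 0.43; 1990: 0.43; 1991: 0.41; 1992: 0.38; 1993: 0.38 |
| Mexico^{b} | 1984: 0.51; 1989: 0.55 |
| Trinidad and Tobago^{a} | 1971: 0.51; 1981: 0.42 |
| Venezuela^{a} | 1976: 0.44; 1977: 0.42; 1978: 0.41; 1979: 0.39 |
| SUB-SAHARAN AFRICA | |
| Ivory Coast^{c} | 1985: 0.41; 1986: 0.39; 1987: 0.40; 1988: 0.37 |
| Gabon^{d} | 1975: 0.59; 1977: 0.63 |
| Ghana^{c} | 1988: 0.36; 1989: 0.37 |
| Mauritius^{c} | 1986: 0.40; 1991: 0.37 |
| Nigeria^{c} | 1986: 0.37; 1992: 0.41 |
| ASIA | |
| Bangladesh^{a} | 1973: 0.36; 1977: 0.33; 1978: 0.35; 1981: 0.39; 1983: 0.36; 1986: 0.37 |
| China^{b} | 1980: 0.32; 1982: 0.29; 1983: 0.27; 1984: 0.26; 1985: 0.31; 1986: 0.33; 1987: 0.34; 1988: 0.35; 1989: 0.36; 1990: 0.35; 1991: 0.36; 1992: 0.38 |
| China, Hong Kong^{a} | 1971: 0.41; 1976: 0.41; 1980: 0.37 |
| India^{c} | 1970: 0.30; 1972: 0.32; 1973: 0.29; 1977: 0.32; 1983: 0.31; 1986: 0.32; 1987: 0.32; 1988: 0.31; 1989: 0.30; 1990: 0.30; 1991: 0.33; 1992: 0.32 |
| Indonesia^{c} | 1976: 0.35; 1978: 0.39; 1980: 0.36; 1981: 0.34; 1984: 0.32; 1987: 0.32; 1990: 0.33 |
| South Korea^{a} | 1980: 0.39; 1985: 0.35; 1988: 0.34 |
| Malaysia^{a} | 1976: 0.53; 1984: 0.48 |
| Pakistan^{e} | 1971: 0.31; 1979: 0.32; 1985: 0.32; 1987: 0.32; 1988: 0.31 |
| Philippines^{a} | 1971: 0.49; 1985: 0.45; 1988: 0.45; 1991: 0.48 |
| Sri Lanka | 1973: 0.35; 1979: 0.44; 1981: 0.45; 1987: 0.47 |
| Thailand^{a} | 1975: 0.42; 1981: 0.43; 1986: 0.47; 1988: 0.47; 1990: 0.49; 1992: 0.52 |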
^{a} Gini coefficients based on the distribution of households by household gross income.
^{[1]} Especially where home-produced food is an important part of food consumption or where data on quantities are a prerequisite to arrive at the expenditure.
^{[2]} This general equation defines six regional dummies with industrialized countries as the reference group. The regional dummy term in the general equation for countries in transition economies is TR. For the developing countries, it is as follows: AI in American Islands, AC in Continental America, AF in Africa, OC in Oceania and AS in Asia. Each dummy takes the value of unity for countries in the region concerned and zero otherwise. For example, the equation for countries in transition will include a, b_{1} and b_{7}, while for developing countries in American Islands, it will include a, b_{2} and b_{7}, and so on. The equation for industrialized countries, as the reference group, will include a and b_{7} only.
^{[3]} The BMI refers to weight (kilograms) divided by height² (metres).
^{[4]} It must be pointed out that, in theory, the area below f(x) should be the same as that below f(r). This is, however, not so in Figure A1, as the two curves have not been drawn to scale.