Part II: Methods for the measurement of food deprivation and undernutrition

Keynote Paper: FAO methodology for estimating the prevalence of undernourishment

Loganaden Naiken
FAO (retired)
Rome, Italy

Executive summary

This paper gives a broad account of the methodology and data used by FAO for estimating the prevalence of undernourishment. Following a short introduction, the basic methodological framework is reviewed, which consists of a frequency distribution of food consumption (expressed in terms of dietary energy) and a cutoff point for intake inadequacy defined on the basis of minimum energy requirement norms. Subsequently, the data and procedures used for estimating the frequency distribution of food consumption and the cutoff point are described. The meaning and significance of the FAO measure of food deprivation in the light of the limitations placed by the data and procedures used are then explored. A section is devoted to a brief description of similar measures produced by other organizations or authors and their relationship with the FAO measure. The strengths and weaknesses of the FAO estimates, the feasibility of their improvement in the future, and issues relating to the feasibility of disaggregating the estimates by sex-age or subnational groups are also discussed. The paper includes four technical appendices. Appendices A, B and C deal with certain issues raised in the paper, in particular the role of the bivariate probability distribution and the expectation of a correlation between energy intake and requirement in justifying the methodology for estimating the prevalence of undernourishment. Appendix D illustrates the application of the FAO methodology in a hypothetical country.

Introduction

The FAO measure of food deprivation, which is referred to as the prevalence of undernourishment, is based on a comparison of usual food consumption expressed in terms of dietary energy (kcal) with certain energy requirement norms. The part of the population with food consumption below the energy requirement norm is considered undernourished ("underfed").

The focus on dietary energy in assessing food insufficiency or deprivation is justified from two perspectives. First, a minimum amount of dietary energy intake is essential for body weight maintenance and work performance. Second, increased dietary energy, if derived from normal staple foods, brings with it more protein and other nutrients as well, while raising intakes of the latter nutrients without ensuring a minimum level of dietary energy is unlikely to be of much benefit in terms of improving nutritional status. Nevertheless, by focusing on dietary energy intake, the measure is attempting to capture those whose food consumption level is insufficient for body weight maintenance and work performance. It follows from this that the FAO measure is focusing on the phenomenon of hunger rather than undernutrition (or malnutrition), which has a broader nutritional connotation.

FAO has traditionally prepared estimates of the prevalence of undernourishment in connection with its World Food Survey reports, the most recent being The Sixth World Food Survey (FAO, 1996). The principal aim of the estimates in this context has been to provide information on the broad dimension of the hunger problem in the developing world. In fact, although the estimates have been worked out on a country-by-country basis, only the global and regional aggregates have been published. Furthermore, the focus has been on the long-term trends, as the World Food Surveys were issued at intervals of roughly ten years. However, the situation changed following the World Food Summit in 1996. At this Summit, the signatories of the Rome Declaration on World Food Security made the following pledge:

We pledge our political will and our common and national commitment to achieving food security for all and to an ongoing effort to eradicate hunger in all countries, with an immediate view to reducing the number of undernourished people to half their present level no later than 2015.

Hence, for the purpose of monitoring progress towards the target of halving the number of undernourished, the need arose to prepare and regularly update such estimates at the global as well as country level. FAO has been undertaking this task in its annual report on "The State of Food Insecurity in the World" (SOFI), which was first issued in 1999. SOFI 2002, which is the latest report, was issued in October 2002.

In the following sections, the basic methodological framework, data and procedures used by FAO for deriving the country estimates are described and discussed. The relationship with similar measures used by other organizations is also discussed. The strengths and weaknesses of the estimates, as well as the improvement that could be envisaged in the future, are highlighted. Lastly, the feasibility of disaggregating the estimates by demographic or subnational breakdowns is discussed. Appendices A, B and C discuss in more detail certain issues referred to in the paper, and Appendix D illustrates the application of the methodology in a hypothetical country.

Basic methodological framework

In developing the methodology for estimating the prevalence of undernourishment, a basic problem concerns the use of the available energy requirement norms. These norms are usually specified as the average for groups of individuals of the same age, sex, body weight and activity. This means that even after taking into account the most influential factors such as age, sex, body weight and activity, differences exist in the energy requirement of individuals. This variation has been attributed mainly to differences in the efficiency of energy utilization between individuals. As it is not feasible to determine the efficiency of energy utilization of each individual, the departure of the specific energy requirement of an individual from the average is not known. In view of this, the estimate of the proportion of individuals having inadequate energy intake has been defined within a probability distribution framework. In this context, Sukhatme, in a pioneering study on the application of distribution analysis in estimating the extent of hunger in the World (Sukhatme, 1961), had indicated that if information in the form of a bivariate frequency distribution of intake x and requirement r, referring to individuals of the same age, sex, body weight and activity, was available, the proportion of individuals with intake below requirement could be formulated as follows:

P(x < r) = ∫∫_{x<r} f(x,r) dx dr    (1)

However, as indicated above, even if the actual intake of each individual can be measured, the actual individual requirement is unknown. Therefore, information on the joint distribution f(x,r) cannot be obtained empirically. Nevertheless, as some information pertaining to the distribution of requirement was available, Sukhatme (1961) suggested the following formula for estimating the proportion of undernourished in the group of individuals with the same age, sex, body weight and activity:

P(x < r_L) = ∫_{x<r_L} f(x) dx    (2)

where f(x) is the marginal frequency distribution of dietary energy intake, and r_L is a cutoff point reflecting the lower limit of the marginal distribution of energy requirement (and hence also referred to as the minimum energy requirement). Expression (2) has been referred to as the cutoff point formula to distinguish it from the bivariate formula given by expression (1). Initially, assuming that the distribution of requirement is normal, Sukhatme (1961) had taken the cutoff point (minimum energy requirement) as corresponding to the lower limit of the 99 percent confidence interval, i.e.:

where m_r represents the mean and s_r the standard deviation of the requirement distribution. In subsequent studies, he has, however, indicated that

may be more appropriate (Sukhatme, 1982).

FAO has traditionally applied the cutoff point formula. In the past, this was considered to be an approximation of the bivariate formula necessitated by the absence of adequate information pertaining to the bivariate distribution. A key problem in attempting to model and apply the latter distribution has been how to handle the effect of correlation that is expected to exist between energy intake and requirement. However, a study undertaken in FAO during the preparation of the Sixth World Food Survey showed that if intake is correlated with requirement, the bivariate formula reduces to the cutoff point formula (Naiken, 1998). In other words, expression (1) is a general formula, which, under the condition of correlation, reduces to expression (2). In Appendix A, this issue and the derivation of the cutoff point formula are discussed in detail.

Another point to be considered is that the Sukhatme cutoff point formula given by expression (2) refers to a group consisting of individuals of given age, sex, body weight and activity so that the variation in requirement (i.e. s_r) taken into account in determining the cutoff point is due to differences in the efficiency of energy utilization. However, a population in general consists of individuals who differ with respect to age, sex, body weight and activity. Thus, a correct application of the approach would require the stratification of the population into groups of similar individuals with respect to such characteristics and having information on the usual food consumption of the individuals in the different groups taken separately. This approach is, however, not feasible for the simple reason that the required data are not available. In the best of circumstances, what may be available refers to information on the food consumption of a representative sample of households in the population. The information on the food consumption and the size of the households sampled enables the derivation of the frequency distribution of the household per capita energy consumption. This means that what is in practice feasible is the specification of the distribution of energy consumption referring to households adjusted for differences in size.

It follows from the above that in applying the cutoff point formula, f(x) is taken to reflect the frequency distribution of household per capita dietary energy consumption, and consequently, the cutoff point, r_L, refers to the household per capita minimum energy requirement. By implication, the variation in energy requirement to be considered in arriving at the cutoff point reflects the composite effect of the differences in the composition of the households with respect to the age, sex, body weight, activity and efficiency of energy utilization of their members. The next two sections discuss the estimation of f(x) and r_L.

Estimation of the distribution of energy consumption

Source of data - household surveys

Surveys that collect data on the quantities of food products consumed by individuals in a representative sample of households in the population are the only source of basic data pertaining to f(x). Such surveys provide data on household size and food consumption of the households surveyed, thus leading to household level data on per capita dietary energy consumption, which can be used to estimate f(x).

A well-known type of survey in this context is the specialized food consumption or dietary survey. The information collected in these surveys normally refers to the quantities of food items actually consumed by the household members, which are converted into nutritive values by applying the appropriate nutritive factors. However, since the main objective of these surveys is to obtain a close approximation of the total amount of food eaten (food intake) by members of the household, the data collection procedures are usually rather complicated (e.g. weighing the food items used for the preparation of each meal). They are therefore costly to implement on a national scale. As a result, only a few countries have attempted to carry out such surveys. Even in these cases, the sample size is often so small that their usefulness for distributional analysis is questionable. Furthermore, the food eaten away from home (street foods, food consumed in restaurants and other establishments, etc.) is often not taken into account.

The more commonly and regularly undertaken type of survey is that usually called the Household Income/Expenditure Survey (HIES). The HIES, which also includes food consumption data as an integral part of its broader enquiry on household consumption expenditures, has been conducted in many countries. Frequently, information not only on the monetary expenditures but also on the quantities relating to food items purchased or acquired for consumption by households has been collected.^[1] In addition, the food items recorded are often sufficiently detailed to enable the conversion of food quantities into nutritive values and the estimation of household per capita energy consumption. The surveys also provide data on household income/expenditure as well as a number of other socio-economic characteristics, so they permit an analysis of the inter-relationship between food consumption and certain socio-economic variables. Furthermore, to the extent that these surveys may have been designed to yield subnational estimates, they may permit mapping the subnational variations in food consumption. These surveys are in fact the only existing source of data for distribution analysis of both income and food consumption and hence for the estimation of the prevalence of both poverty and food deprivation.

Within FAO, the need for a better and enhanced use of the food consumption data from the existing HIESs for the purpose at hand has been recognized for some time (Naiken and Becker, 1990). The problem is that the data that are normally processed and tabulated by the respective national statistical organizations refer to the monetary values of the food consumed, i.e. food expenditure. In this form, the information is not directly usable as input for estimating the frequency distribution of dietary energy consumption. For this purpose, there is a need, as indicated above, to convert the quantities of the various items of food expenditure corresponding to each household into their dietary energy (calorie) equivalents and then derive the appropriate tabulations referring to dietary energy consumption distribution. However, the process of converting the food consumption data from the HIES type surveys into dietary energy equivalents and tabulating the results has been undertaken in a few countries only.

Problems in the use of the distribution data from household surveys

Even when survey data pertaining to food consumption are available, considerable problems are encountered in using such data for estimating the distribution of dietary energy consumption. These problems are generally of two kinds: one concerning the precision of the household level data and the other referring to the reliability of the estimated frequency distribution owing to sample design.

PRECISION OF THE HOUSEHOLD LEVEL DATA

The methodology and concepts applied in the surveys are usually not sufficiently precise to provide an accurate estimate of the usual consumption at the household level.

In some cases, certain contributions to the consumption of the household members are excluded; in others, certain contributions of consumption that should be excluded are included. Generally, the HIES has a questionnaire format that refers to food purchased or acquired during the reference period, making no distinction between the consumers so that food given to guests, visitors or tenants as well as residual household wastages are included. Food transfers to and from household stocks may not be adequately taken into account. Food consumed away from home by the household members may be recorded, but these usually correspond to prepared food and are expressed in monetary terms, which present difficulties for conversion into nutritive values. Furthermore, irrespective of the questionnaire format, the precision of the information collected depends on the ability of the respondents to recall the quantities of the different food items that have been consumed, thus resulting in measurement errors. For this reason, the reference period is chosen to be sufficiently short (one day, one week or two weeks) to facilitate recall, but this increases the risk of not reflecting the usual consumption of the household owing to the effect of seasonal variations, etc.

As a result of the above problems, the household per capita dietary energy consumption figures derived from the food consumption data collected at the household level are imprecise and, in many cases, found to be unrealistically high or low.

SAMPLE DESIGN

Many of the problems mentioned above stem from the fact that the sample surveys are designed to provide reliable estimates of the population means or totals rather than the individual household per capita values and hence the frequency distribution. This is evident if we take into account the points below.

Data collection is usually undertaken in rounds by spreading the sample over the survey period of one year. This ensures that the sample means or the means of households corresponding to certain socio-economic groups and/or subnational categories are free from the effect of seasonal variations and any random errors in the individual household measurement.

For administrative convenience, and in order to minimize the variance and thereby increase the precision of the population mean or total, the sample design is not usually implemented according to the Equal Probability Selection Method (EPSEM). In other words, the different households in the population do not have the same probability of being selected in the sample. As a consequence, even if the individual household data reflect usual consumption, the resulting sample frequency distribution is not an unbiased estimate of the distribution in the population.

It follows from the above that to the extent that the surveys have not been designed to yield reliable estimates of the usual consumption at the household level, and the selection of the households in the sample has not been implemented with equal probability, the resulting sample distribution is subject to significant errors. This problem applies not only to the food consumption data but also to the total income/expenditure data that are used to estimate the prevalence of poverty.

In view of the problems relating to the distribution data from the existing sample surveys, FAO has continued to rely on a theoretical model to represent the distribution of household per capita dietary energy consumption. Furthermore, as will be seen later, special care is taken to avoid the effects of the irrelevant factors influencing survey data in estimating the measure of inequality in distribution.

The choice of theoretical distribution

As the frequency distribution depicted by the tabulated survey data is generally unimodal, only such kinds of theoretical distributions have been considered for application. In connection with the estimates of undernourishment prepared for the Fourth World Food Survey, the Beta distribution was applied (FAO, 1977). This distribution was chosen because it enabled fixing the lower and upper limits of the range as determined by the physiological lower and upper limits of intake in individuals. However, beginning with the Fifth World Food Survey (FAO, 1987), the Beta distribution was abandoned in favour of the two-parameter log-normal distribution.

The idea of fixing the lower and upper limits of the distribution (based on knowledge/assumptions about the physiological limits of intake) is appropriate when dealing with the true intake of individuals, but not when dealing with the kind of household level data emanating from the existing surveys. In most of these surveys, the data refer to the food available to, or acquired by, the household and thus include household wastages, food fed to pets, etc. In this context, the log-normal distribution with its short lower tail and long upper tail is considered to reflect better the fact that wastages, food fed to pets, etc. are likely to be confined to the upper tail representing the richer and more affluent households.

The approach used to specify the parameters of the distribution of dietary energy consumption

As indicated earlier, household survey data that have been processed to yield information on the distribution of dietary energy consumption still cover only a limited number of countries. Even in these cases, the time reference of the surveys differs from country to country. However, information on the national per capita dietary energy supply (DES) is available for nearly all countries. The per capita DES, which is regularly prepared and updated by FAO, is widely used to reflect the levels and trends of the average food consumption in the different countries of the world. In view of this, FAO has used the per capita DES to represent the mean, x̄, of the distribution for each country in preparing estimates of the prevalence of undernourishment at the national, regional and global levels.

Thus, as the distribution is assumed to be log-normal, the only other information required in order to specify f(x) for each country is the coefficient of variation CV(x). This parameter reflects the inequality in the distribution and (under the assumption of log normality) can be easily expressed in terms of the well-known Gini coefficient. It is estimated as far as possible on the basis of survey data. Given x̄ and CV(x), the two parameters of the log-normal distribution, m and s², can be determined as follows:

s² = ln[CV²(x) + 1]

and

m = ln(x̄) − s²/2

The estimation of the mean, x̄, and the inequality in distribution parameter, CV(x), is described below.
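As an illustration only, the sketch below shows how the two log-normal parameters could be obtained from a mean and a CV using the standard log-normal identities, and how the cutoff point formula then gives the proportion of the distribution lying below r_L. The function names and the input values (a mean of 2400 kcal, a CV of 0.25 and a cutoff of 1800 kcal) are hypothetical and are not taken from any FAO estimate.

```python
import math
from statistics import NormalDist


def lognormal_params(mean_x, cv_x):
    """Return (m, s2) of a log-normal distribution with the given mean and CV."""
    s2 = math.log(cv_x ** 2 + 1.0)
    m = math.log(mean_x) - s2 / 2.0
    return m, s2


def prevalence_below_cutoff(mean_x, cv_x, r_l):
    """Proportion of the log-normal distribution lying below the cutoff r_l."""
    m, s2 = lognormal_params(mean_x, cv_x)
    z = (math.log(r_l) - m) / math.sqrt(s2)
    return NormalDist().cdf(z)


# Hypothetical inputs, chosen only to exercise the functions.
m, s2 = lognormal_params(2400.0, 0.25)
share = prevalence_below_cutoff(2400.0, 0.25, 1800.0)
print(f"m = {m:.4f}, s2 = {s2:.4f}, share below cutoff = {share:.3f}")
```

Because the log-normal distribution is fully determined by these two parameters, the mean and the CV are the only inputs needed besides the cutoff point itself.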

ESTIMATION OF THE MEAN

The mean represented by the per capita DES does not refer directly to energy intake but refers to the energy available for human consumption during the course of the reference period, expressed in kilocalories (kcal) per person. It is assumed that the latter is a close approximation of energy consumption, at least for developing countries. The available data are derived from the food balance sheets compiled every year by FAO on the basis of data on the production and trade of food commodities. Using these data and the available information on seed rates, waste coefficients, stock changes and types of utilization (feed, food, other uses), a supply/utilization account is prepared for each commodity in weight terms. The food component, usually derived as a balancing item, refers to the total amount of the commodity available for human consumption during the year. The total DES is obtained by aggregating the food component of all commodities after conversion into energy values.
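The accounting described above can be sketched roughly as follows. This is a simplified illustration, not FAO's food balance sheet procedure: the account items, the function names, the energy conversion factor and all figures are hypothetical, and the per capita result is expressed per day purely as an assumption of this sketch.

```python
def food_available(production, imports, exports, stock_increase,
                   feed, seed, waste, other_uses):
    """Food component of one commodity account, as the balancing item (tonnes)."""
    supply = production + imports - exports - stock_increase
    return supply - feed - seed - waste - other_uses


def per_capita_des(accounts, kcal_per_kg, population, days=365):
    """Aggregate the food component of all commodities into kcal/person/day."""
    total_kcal = 0.0
    for commodity, account in accounts.items():
        tonnes = food_available(**account)
        total_kcal += tonnes * 1000.0 * kcal_per_kg[commodity]
    return total_kcal / population / days


# Hypothetical single-commodity country (all numbers invented).
accounts = {
    "wheat": dict(production=900_000, imports=300_000, exports=50_000,
                  stock_increase=20_000, feed=80_000, seed=40_000,
                  waste=30_000, other_uses=10_000),
}
kcal_per_kg = {"wheat": 3340.0}  # assumed conversion factor
print(round(per_capita_des(accounts, kcal_per_kg, population=10_000_000)))
```

Treating food as the residual means that any error in the other items of the account is carried directly into the DES.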

ESTIMATION OF THE CV

As indicated earlier, the household level data from surveys on dietary energy consumption are not sufficiently precise to yield reliable information on the usual consumption. However, given the HIES sample design, the means for groups of households classified by the variable determining household food consumption, i.e. household per capita income or total expenditure, provide reliable estimates of annual average consumption. On the basis of such a classification, it is possible to estimate the CV of household per capita dietary energy consumption owing to income. However, household dietary energy consumption is likely to vary also owing to factors such as the sex-age composition, body weight and activity level of the household members, i.e. the factors determining dietary energy requirements. The composite effect of these factors can be approximated by the variation of the dietary energy requirements. This variation corresponds to a CV of about 0.20 (see Appendix B). Thus, by only considering these two sources of variation, the CV of the household per capita dietary energy consumption distribution is estimated as follows:

CV(x) = √[CV²(x|v) + CV²(x|r)]

where CV(x) is the total CV of the household per capita dietary energy consumption, CV(x|v) is the component owing to household per capita income (v), and CV(x|r) is the component owing to energy requirement (r).

The variation in household per capita dietary energy consumption owing to differences in the energy requirements will always exist and is not expected to vary significantly between countries and over time. In view of this, CV(x|r) can be considered to be a fixed component of CV(x). It follows from this that the survey data are used to estimate only CV(x|v).
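A minimal sketch of how the two components might be combined is given below, assuming that they are independent so that their squares add. The function name and the income component of 0.15 are hypothetical; the value 0.20 for the requirement component is the one cited in the text.

```python
import math

CV_REQUIREMENT = 0.20  # fixed component cited in the text (see Appendix B)


def total_cv(cv_income, cv_requirement=CV_REQUIREMENT):
    """Combine the income and requirement components into the total CV."""
    return math.sqrt(cv_income ** 2 + cv_requirement ** 2)


# Hypothetical income component of 0.15.
print(round(total_cv(0.15), 4))
```

Since the requirement component is held fixed, the total CV can never fall below 0.20 under this formulation, whatever the survey data show.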

CV(x|v) is estimated using the averages of household per capita dietary energy consumption by household per capita income (or total expenditure) classes from n households as CV(x|v) = s(x|v) / x̄_v, with

s(x|v) = √{[Σ_j f_j x̄_j² − (Σ_j f_j x̄_j)²/n] / (n − 1)}

where k is the number of income classes (so that j runs from 1 to k), f_j is the number of sampled households in the jth income class, and x̄_j is the average household per capita dietary energy consumption of the jth income class.

The denominator

x̄_v = (Σ_j f_j x̄_j) / n

is the estimated overall average household per capita dietary energy consumption.
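The calculation from income-class averages can be sketched as a weighted standard deviation of the class means divided by the overall weighted mean. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the divisor n − 1 is an assumption (set ddof=0 to divide by n instead), and the class counts and means are invented.

```python
import math


def cv_due_to_income(class_counts, class_means, ddof=1):
    """CV of class-average energy consumption, weighted by households per class."""
    n = sum(class_counts)
    weighted_sum = sum(f * x for f, x in zip(class_counts, class_means))
    weighted_sq = sum(f * x * x for f, x in zip(class_counts, class_means))
    overall_mean = weighted_sum / n
    variance = (weighted_sq - weighted_sum ** 2 / n) / (n - ddof)
    return math.sqrt(max(variance, 0.0)) / overall_mean


# Hypothetical tabulation: five income classes of sampled households.
counts = [200, 300, 250, 150, 100]
means_kcal = [1700.0, 2000.0, 2300.0, 2600.0, 3000.0]
print(round(cv_due_to_income(counts, means_kcal), 4))
```

Only the between-class variation enters the result, so variation among households within the same income class, including measurement error and seasonal effects, is excluded.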

It follows from the above that household survey data are used to estimate the CV of household per capita dietary energy consumption owing to income only. Thus, the effect of differences in the sex-age composition of the households is not taken into account in estimating the CV; consequently, the distribution derived by linking this CV to the mean given by the per capita DES refers to a population composed of average individuals in so far as sex-age composition is concerned. In other words, the unit of the distribution is what is sometimes referred to as the national per capita unit. For this reason, the derived distribution will be referred to henceforth as "distribution of per capita dietary energy consumption".

In the large majority of countries, the appropriate data for directly estimating CV(x|v) are not readily available. Hence, for these countries, depending on the available data from the survey reports, indirect procedures are used by FAO to arrive at approximate values of this parameter. As far as possible, information pertaining to the distribution of closely related variables such as food expenditure and income (or its proxy total expenditure) is used. When even such information is not available, other information that is readily available and can be linked to CV(x|v) is used. These indirect procedures are briefly described below.

Procedure 1:

Countries where the available survey data refer to the average food expenditure of households grouped according to income classes

In these countries, the available data enable the calculation of the CV of food expenditure owing to income. Consequently, this CV is used to estimate CV(x|v) through the following equation:

where the explanatory variable is the CV of food expenditure owing to income/expenditure. The parameters a and b are estimated by fitting the following linearized equation to a group of countries where the survey reports provided tabulations enabling the calculation of both CV(x|v) and the CV of food expenditure owing to income:

Procedure 2:

Countries where the available survey data refer to the distribution of household income

In these countries, the data on the distribution of household income enable the estimation of the inequality in the distribution as expressed by the Gini coefficient. Thus, in order to derive an estimate of CV(x|v), the Gini coefficient of income, G(v), is first converted into the Gini coefficient of dietary energy consumption owing to income, G(x|v). Then, given the assumption of a log-normal distribution, the latter Gini coefficient is easily converted into CV(x|v).
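The link between a Gini coefficient and a CV under log-normality can be sketched with the standard result that a log-normal distribution with log-scale standard deviation s has a Gini coefficient of 2Φ(s/√2) − 1, where Φ is the standard normal distribution function. The function names and the input value are hypothetical, and the sketch is not necessarily the computation FAO uses.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def gini_to_cv(gini):
    """CV of a log-normal distribution with the given Gini (0 <= gini < 1)."""
    s = math.sqrt(2.0) * _STD_NORMAL.inv_cdf((gini + 1.0) / 2.0)
    return math.sqrt(math.exp(s * s) - 1.0)


def cv_to_gini(cv):
    """Gini coefficient of a log-normal distribution with the given CV."""
    s = math.sqrt(math.log(cv * cv + 1.0))
    return 2.0 * _STD_NORMAL.cdf(s / math.sqrt(2.0)) - 1.0


# Hypothetical Gini coefficient of dietary energy consumption owing to income.
cv = gini_to_cv(0.10)
print(round(cv, 4), round(cv_to_gini(cv), 4))
```

The two functions are inverses of each other, which is what allows either inequality measure to be used as the input.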

The conversion of the Gini coefficient of income into the Gini coefficient of dietary energy consumption owing to income is undertaken through a factor defined as follows:

k = G(x|v) / G(v)

where G(x|v) is the Gini coefficient of dietary energy consumption owing to income, and G(v) is the Gini coefficient of income.

The factor k is estimated for each country through the following equation:

where GDP refers to per capita gross domestic product (GDP) expressed in purchasing power parity. The parameters a and b are estimated by fitting the following linearized equation to the countries where GDP data are available, and survey data enable estimation of both G(x|v) and G(v):

Procedure 3:

Countries where not even survey data referring to income distribution are available

For these countries, rough estimates of CV(x|v) have been obtained by linking it to infant mortality, through the following general equation: ^[2]

In the above equation, IMR is the infant mortality rate, and the dummy indicators TR, AI, AC, AF, OC and AS correspond to country regional locations. The parameters a, b₁ to b₇ are obtained by fitting the equation to a group of countries having data on IMR and CV(x|v). Given IMR and the parameters, the CV of dietary energy consumption owing to income is derived as

where m refers to the evaluation of the general equation.

CHANGES IN THE INEQUALITY IN DISTRIBUTION OVER TIME

Estimates of the prevalence of undernourishment in the developing countries are prepared for the benchmark periods 1968-71, 1979-81, 1990-92 and the most recent period for which estimates of per capita DES are available (which currently refer to 1997-99). This means that f(x) needs to be specified for each of those four periods for a given country, and therefore, estimates of x̄ and CV(x) are required for each of these periods. In this connection, the means for the different periods are taken from the most recently available series of per capita DES, but the same CV(x) is used. This implies that the inequality in distribution, as measured by CV(x), is assumed to have remained constant over the last three decades. This approach has been dictated by the fact that the available survey data referring to dietary energy consumption, food expenditure and income/expenditure that are used for estimating CV(x|v) in the different countries are almost never available for all four periods. In other words, there is no possibility of taking into account any change in the inequality that may have occurred in the individual countries. However, evidence provided by the available series of Gini coefficients of income/expenditure compiled for a number of developing countries suggests that there has been little, if any, change in the inequality of income/expenditure (see Appendix C).

The Gini coefficient is a measure of inequality in access to certain goods or services (income/expenditure in our case) among the units under consideration (households in our case). It ranges from zero to unity - zero implying an equal distribution among the units and unity implying absolute inequality or concentration in a single unit. Thus, the more the coefficient departs from zero, the more unequal is the distribution.
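For readers who wish to see the measure computed, the sketch below gives one common discrete formula for the Gini coefficient of a list of unit-level values. The function name and the data are hypothetical, and the source does not state which computational formula is used in practice.

```python
def gini(values):
    """Gini coefficient of non-negative values, one value per unit."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one unit and a positive total")
    # Rank-weighted sum over the values sorted in ascending order.
    ranked = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * ranked / (n * total) - (n + 1.0) / n


# Hypothetical household incomes.
print(gini([5, 5, 5, 5]))       # equal distribution among the units
print(gini([0, 0, 0, 10]))      # everything concentrated in a single unit
```

With a finite number of units, this formula gives (n − 1)/n rather than exactly unity under complete concentration; the value approaches unity as the number of units grows.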

Definition of energy requirement and estimation of the cutoff point

Definition of energy requirement

The human body requires dietary energy intake for its expenditure of energy, which in turn is composed of several components: the basal metabolic rate (BMR), i.e. the energy expended for the functioning of an individual in a state of complete rest; the energy needed for digesting food, metabolizing food and storing an increased food intake; and the energy required for performing physical activities, both work and non-work. For children, the energy required for growth should be taken into account. Similarly, for women during pregnancy and lactation, the energy required for the deposition of tissue and secretion of milk needs to be considered.

The energy requirement norms or standards, adopted at the international level, are periodically reviewed by expert groups and consultations. The report of the FAO/WHO/UNU Expert Consultation on Energy and Protein Requirements (FAO/WHO/UNU, 1985), has defined energy requirements as follows:

The energy requirement of an individual is the level of energy intake from food that will balance energy expenditure when an individual has a body size and composition and level of physical activity, consistent with long-term good health; and that will allow for the maintenance of economically necessary and socially desirable physical activity. In children and pregnant or lactating women the energy requirement includes the energy needs associated with the deposition of tissues or the secretion of milk at rates consistent with good health.

Specification of energy requirement

Energy requirements are specified by sex and age groups. As per the recommendations of the FAO/WHO/UNU Expert Consultation, the procedure for deriving the sex-age-specific energy requirement differs between adults and adolescents on the one hand and children below age ten on the other.

For adults and adolescents, the specification of energy requirement begins with the BMR. This is derived on the basis of body weight through the use of a set of sex-age-specific regression equations linking the BMR with body weight. The energy needed for activity is expressed in terms of the BMR so that the energy requirement for a given sex-age group is finally expressed as a multiple of the BMR. The BMR multiple is referred to as the physical activity level (PAL) index.

For example, the energy requirement for adult males involved in light activity is given as 1.55 BMR, whereas for females it is 1.56 BMR. The component involved in digesting and metabolizing food is difficult to measure in isolation from activity, since the very act of eating involves activity. In view of this, it is allowed for in the PAL value.
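The BMR-multiple approach for adults and adolescents can be sketched as follows. This is an illustration under stated assumptions, not the procedure as implemented by FAO: the linear coefficients (15.3 and 679) are those commonly quoted for men aged 18 to 30 in connection with the 1985 report and should be verified against the source, and the 60 kg body weight is hypothetical.

```python
def bmr_from_weight(weight_kg, slope, intercept):
    """Predict BMR (kcal/day) from body weight with a linear regression equation."""
    return slope * weight_kg + intercept


def energy_requirement(weight_kg, slope, intercept, pal):
    """Energy requirement (kcal/day) expressed as a PAL multiple of the BMR."""
    return pal * bmr_from_weight(weight_kg, slope, intercept)


# Assumed coefficients for men aged 18-30 (to be checked against the report)
# and a hypothetical body weight of 60 kg, with the light-activity PAL of 1.55.
requirement = energy_requirement(weight_kg=60, slope=15.3, intercept=679, pal=1.55)
print(round(requirement))
```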

For children below age ten, the above component approach is not applied, and the sex-age-specific energy requirements are expressed as fixed amounts of energy per kilogram of body weight. In addition, for children below age two from developing countries, an allowance is made for the energy needed to recover from frequent attacks of infection.

It follows from the above that the key parameters to be specified for deriving the energy requirements are body weight and the PAL index for adults and adolescents, and only body weight for children below age ten.

Definition of the minimum energy requirement by sex-age groups

It may be recalled that the FAO/WHO/UNU Expert Consultation has defined requirement as the level of intake "that will balance energy expenditure when the individual has a body-size and composition and level of physical activity consistent with good health and that will allow for the maintenance of economically necessary and socially desirable activity". This definition implies that energy requirement should be derived on the basis of normatively specified body weight and PAL rather than the actual body weight and activity level of the individual.

However, the Expert Consultation has recognized that for a given height, there is a range of body weights that are consistent with good health. Similarly, there is a range of PALs that are consistent with the performance of economically necessary and socially desirable activity and that may therefore be considered acceptable. In view of this, the range of variation in requirement for adults and adolescents has been defined in terms of the range of energy expenditure resulting from the application of the different combinations of acceptable weight-for-height and PAL. Accordingly, the lower limit of the range of variation of requirement is reflected by the energy expenditure corresponding to the lowest acceptable weight-for-height and the lowest acceptable activity allowance, and the upper limit by the energy expenditure corresponding to the highest acceptable weight-for-height and the highest acceptable activity allowance.

Thus, as the cutoff point should be based on the lower limit of the range of variation, the sex-age-specific minimum energy requirements for adults and adolescents have been specified on the basis of the lowest acceptable body weight and the lowest acceptable PAL index. The lowest acceptable body weight for a given height has been estimated on the basis of the fifth percentile of the body mass index^[3] (BMI) (WHO, 1995), and the PAL index corresponding to light activity (1.55 for males and 1.56 for females) has been taken to reflect the lowest acceptable activity level.
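Since the BMI is body weight in kilograms divided by the square of height in metres, the lowest acceptable body weight for a given height follows directly from the lower BMI limit. A minimal sketch is given below; the BMI value of 18.5 and the height of 1.65 m are hypothetical placeholders, not the fifth-percentile values actually used by FAO.

```python
def minimum_acceptable_weight(height_m, bmi_lower_limit):
    """Lowest acceptable body weight (kg) for a given height, from a lower BMI limit.

    BMI = weight / height^2, hence weight = BMI * height^2.
    """
    return bmi_lower_limit * height_m ** 2


# Hypothetical inputs: average height 1.65 m, assumed lower BMI limit 18.5.
weight = minimum_acceptable_weight(height_m=1.65, bmi_lower_limit=18.5)
print(round(weight, 1))
```

The resulting weight would then enter the BMR regression equation, and the BMR would be multiplied by the light-activity PAL to give the minimum requirement for the sex-age group.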

It follows from the above that the sex-age-specific minimum energy requirements have been derived not by the m - 2s formula but by directly considering the energy expenditure corresponding to the lowest acceptable weight-for-height and the lowest acceptable activity level.

As regards children below ten years of age, the body weight figures required are fixed at the median of the range of weight-for-height given by the WHO reference data (WHO, 1983) rather than a lower limit as in the case of adolescents and adults. This is due to the lack of recommendations by the FAO/WHO/UNU Expert Consultation concerning the range within which weight-for-height for a given sex-age group may be regarded as satisfactory. The energy requirement was estimated on the basis of the specified weight and the energy requirement per kilogram of body weight recommendations provided by FAO/WHO/UNU (1985). However, the energy requirements per kilogram of body weight recommended by the Expert Consultation include a five percent allowance to account for the fact that the energy intakes of the reference groups on which they were based do not reflect the optimum activity levels for children. This extra allowance has been removed for the purpose of deriving the minimum requirement.

In all sex-age groups, the height figures needed for determining the minimum body weight norms were obtained from the tables given by James and Schofield (1990) and other sources.

The overall minimum per capita energy requirement (the cutoff point)

The minimum per capita dietary energy requirement used as the cutoff point for estimating the prevalence of undernourishment is derived by aggregating the estimated sex-age-specific minimum dietary energy requirements, using the relative proportion of the population in the corresponding sex-age groups as weights. Thus, as the sex-age distribution of the population changes over time, the cutoff point has to be adjusted to reflect this change in demographic structure.
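The aggregation described above is a population-weighted average. The sketch below illustrates it with four broad groups; the group labels, the minimum requirements and the population shares are all hypothetical and much coarser than the sex-age groups used in practice.

```python
def per_capita_cutoff(min_requirements, population_shares):
    """Weighted average of sex-age-specific minimum requirements (kcal/person/day).

    Both arguments are dicts keyed by sex-age group; the shares are
    normalised in case they do not sum exactly to one.
    """
    total_share = sum(population_shares[g] for g in min_requirements)
    weighted = sum(min_requirements[g] * population_shares[g] for g in min_requirements)
    return weighted / total_share


# Hypothetical minimum requirements and population structure.
min_req = {"children": 1300, "adolescents": 2000, "adult_males": 2400, "adult_females": 1900}
shares = {"children": 0.30, "adolescents": 0.20, "adult_males": 0.25, "adult_females": 0.25}
cutoff = per_capita_cutoff(min_req, shares)
print(round(cutoff))

# An older population structure raises the cutoff even though the
# group-specific requirements are unchanged.
older = {"children": 0.20, "adolescents": 0.20, "adult_males": 0.30, "adult_females": 0.30}
print(round(per_capita_cutoff(min_req, older)))
```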

FIGURE 1. FRAMEWORK FOR THE CALCULATION OF THE PROPORTION OF THE POPULATION UNDERNOURISHED

It may be recalled that the frequency distribution of intake, f(x), refers to the average individual or the national per capita unit. This means that with respect to the variation in requirement and the cutoff point, the differences in sex-age composition of households are not considered. Nevertheless, the variation in requirement to be taken into account in defining the cutoff point should reflect the composite effect of not only the acceptable differences in body weight and activity as discussed above in the section "Definition of the minimum energy requirement by sex-age groups" but also the differences in the efficiency of energy utilization of the individuals. However, there is likely to be significant covariance between the latter factor on the one hand and the weight and activity factors on the other. As the extent of the covariance is not known, it was not felt prudent to unduly reduce the energy requirements further in order specifically to take into account the effect of the differences in efficiency of energy utilization in arriving at the cutoff point.

Calculation of the prevalence of undernourishment

Figure 1 graphically illustrates the framework for the calculation of the prevalence of undernourishment. The curve f(x) depicts the proportion of the population corresponding to different per capita dietary energy consumption levels (x) represented by the horizontal line. The cumulative proportion of the population up to the cutoff point, r_L, on the horizontal line represents the proportion of the population undernourished.

The distribution, f(x), is assumed to be log-normal, and so the parameters of the distribution of the logarithm of x (i.e. m and s²) can be derived on the basis of the mean, x̄, and the coefficient of variation, CV(x). The section on "Estimation of the distribution of energy consumption" discussed the derivation of the two latter measures on the basis of the available data, and the section on "Definition of energy requirement and estimation of the cutoff point" discussed the derivation of the cutoff point, r_L. A detailed illustration of the derivation of these three measures and the calculation of the prevalence of undernourishment for a hypothetical country is given in Appendix D. Here, only a summarized version limited to the procedure for calculating the prevalence of undernourishment on the basis of x̄, CV(x) and r_L for the same hypothetical country is given.

The values of the three input measures for the calculation of the prevalence of undernourishment are as follows:

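The first step following the specification of the above is to derive the two parameters of the log-normal distribution as follows, where x̄ denotes the mean of x:

s² = ln[CV(x)² + 1]
m = ln(x̄) - s²/2

The next step is to construct the standard normal deviate corresponding to the cutoff point:

Z = [ln(r_L) - m] / s

Finally, the proportion of the population below the cutoff point is obtained as F(Z), the value of the cumulative standard normal distribution function at Z.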

The prevalence of undernourishment is therefore estimated to be 23 percent. The number of undernourished is obtained by multiplying F(Z) by the total population in the country. As the total population is 11 million, the number of undernourished is estimated as 2.6 million.
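The three steps can be sketched in a few lines. Because the input values for the hypothetical country are not reproduced in this section, the sketch uses instead one of the combinations shown in Table 1 (mean 2 450 kcal/person/day, CV 0.35, cutoff 1 800 kcal/person/day), which also gives roughly 23 percent. The function names are illustrative only.

```python
import math


def lognormal_parameters(mean, cv):
    """Return (m, s2): mean and variance of ln(x) for a log-normal x."""
    s2 = math.log(cv ** 2 + 1.0)
    m = math.log(mean) - s2 / 2.0
    return m, s2


def standard_normal_cdf(z):
    """Cumulative standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prevalence_of_undernourishment(mean, cv, cutoff):
    """Proportion of the population with consumption below the cutoff point."""
    m, s2 = lognormal_parameters(mean, cv)
    z = (math.log(cutoff) - m) / math.sqrt(s2)  # standard normal deviate
    return standard_normal_cdf(z)


# Inputs taken from one cell of Table 1, not from the hypothetical country.
p = prevalence_of_undernourishment(mean=2450, cv=0.35, cutoff=1800)
print(f"{100 * p:.1f} percent")  # roughly 23 percent
```

Multiplying the resulting proportion by the total population gives the number of undernourished.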

Meaning and significance of the resulting estimate of the prevalence of undernourishment

The data and approximations used to derive the distribution of dietary energy consumption and the cutoff point have implications on the precise meaning and significance of the resulting estimate of the prevalence of undernourishment. These are discussed below.

Concept of food consumption

It was noted that the per capita DES is used as the mean of the frequency distribution, f(x). This means that the distribution refers to food acquired by (or available to) the households rather than the actual food intake of the individual household members. As a consequence, the resulting estimate refers to food availability rather than food intake.

Unit of analysis

The unit of analysis is the household per capita average rather than the individual within the household. This means that any inequity that may exist in the intra-household access to food is ignored so that, conceptually speaking, the measure of undernourishment refers to the proportion of households in the population whose food availability is below requirement.

Time reference

The per capita DES used as the mean of f(x) corresponds to a three-year rather than annual average in order to even out the effect of errors in the annual food stocks data used in preparing the food balance sheets. Furthermore, for the purpose of deriving the CV(x | v), only household survey data grouped according to income/expenditure classes are used, thus removing the effect of seasonal and other short-term variation that the household level data are subject to. As a consequence of these, the estimate refers to the average condition during the given three-year period, and the effects of seasonal and other short-term variations in food availability are not considered.

Use of concept of minimum energy requirement as cutoff point

The cutoff point is derived by aggregating the sex-age-specific minimum energy requirements using the proportion of the population in the different sex-age groups as weights. The sex-age-specific minimum energy requirements, at least for adults and adolescents, are based on the energy expenditure corresponding to the lower limit of the range of acceptable body weight for a given height and the light activity norm. This approach of arriving at the cutoff point might give the impression that undernourishment is operationally defined as the state of having a food consumption level that is below that needed by an average individual for maintaining minimum acceptable body weight and performing light activity. However, strictly speaking, this is not so. The minimal approach to establishing the cutoff point follows from the consideration that, because energy intake and requirement are correlated, an individual whose consumption falls within the range of variation of requirements is likely to be consuming an amount very close, if not equal, to his or her own energy requirement. In other words, for such individuals any energy shortfall or excess is likely to be small, if not exactly zero.

Similar measures by other organizations/authors

The approach of using information pertaining to food availability, income distribution and energy requirement for estimating the global prevalence of food deprivation has been followed by other organizations or authors. Two recent studies can be referred to in this context: one by the US Department of Agriculture in its annual assessment of food security in developing countries (US Department of Agriculture, 2000) and the other by Senauer and Sur (2001).

Both approaches rely on a basic methodological framework that was first applied by Reutlinger and Selowsky (1976). In this section, following a description of this basic methodological framework, the essential differences in the implementation of the two approaches are noted. Then, the relationship with the FAO measure is highlighted.

Basic methodological framework

The methodology relies on three components:

a relationship between food consumption and income;
a distribution of income;
a cutoff point for defining undernourishment.

The relationship between food consumption and income is represented by the Engel function

where x and v represent food consumption and income, respectively, and a and b are parameters. The latter are estimated by cross-country regression using the per capita food availability derived from food balance sheets as x and the per capita GDP as v.

The distribution of income is derived on the basis of World Bank data on income, and the cutoff point for undernourishment is based on a certain energy requirement level.

The relationship between x and v enables the derivation of the per capita food availability corresponding to a given per capita income level or vice versa. Hence, once the cutoff point defining undernourishment is specified, the population in the income groups with per capita food availability below this level is taken as the undernourished.

Differences between the USDA and Senauer/Sur approaches

While the two approaches share the above basic methodological framework, certain differences are noted in actual implementation as stated below.

UNIT OF MEASUREMENT OF FOOD AVAILABILITY

The USDA expresses food availability in terms of grain equivalents, whereas the Senauer/Sur study retains the FAO dietary energy supply approach.

INCOME DISTRIBUTION

In the USDA approach, the population is divided into five income groups using the World Bank data on income quintiles, and the Engel function is used to derive the per capita food availability corresponding to each income group. The population in the income quintiles with food availability below the specified cut-off point is taken as the undernourished.

In the Senauer/Sur approach, the World Bank data on income quintiles are used to generate a cumulative density function with the per capita GDP as the mean. In view of this density function approach, the Engel function is used to derive the income level corresponding to the food consumption level implied by the specified energy requirement level. The proportion of the population below this income level, read off the generated cumulative density function, is taken as the prevalence of undernourishment.
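The two ways of using the Engel function can be contrasted in a short sketch. Everything specific in it is an assumption made for illustration: the semi-logarithmic form x = a + b ln(v) is assumed rather than taken from this section, and the parameter values, quintile incomes and cutoff are hypothetical.

```python
import math


def food_availability(income, a, b):
    """Per capita food availability implied by an assumed semi-log Engel function."""
    return a + b * math.log(income)


def income_at_cutoff(cutoff, a, b):
    """Invert the assumed Engel function: income at which availability equals the cutoff."""
    return math.exp((cutoff - a) / b)


def quintile_share_below_cutoff(quintile_incomes, a, b, cutoff):
    """Quintile-style calculation: share of income groups whose availability is below the cutoff."""
    below = sum(1 for v in quintile_incomes if food_availability(v, a, b) < cutoff)
    return below / len(quintile_incomes)


# Hypothetical parameters, per capita incomes of the five quintiles, and cutoff.
A, B, CUTOFF = 500.0, 250.0, 2000.0
incomes = [300.0, 600.0, 1000.0, 1800.0, 4000.0]

share = quintile_share_below_cutoff(incomes, A, B, CUTOFF)
threshold_income = income_at_cutoff(CUTOFF, A, B)
print(share, round(threshold_income, 1))
```

In the first calculation whole quintiles are counted as undernourished or not, so the result moves in steps of 20 percent. In the second, the threshold income would be compared with a continuous income distribution, which allows any proportion to be read off.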

THE CUTOFF POINT FOR DEFINING UNDERNOURISHMENT

The USDA uses the concept of average energy requirement for each country, which averages to about 2 100 kcal/person/day for the 67 developing countries covered in their study, whereas the Senauer/Sur study adopted the FAO regional minimum energy requirements derived in connection with the estimates published in the Agriculture: Towards 2010 report (FAO, 1995), which are on average about 300 kcal/person/day below the level used by USDA.

Nevertheless, it may be pointed out that the differences with respect to the unit of measurement for food availability and the degree of detail in representing income distribution are not likely to have any systematic effects on the results. What is likely to have a systematic effect is the difference with respect to the cutoff point. The lower cutoff point used by Senauer/Sur implies that their estimates are likely to be lower than those of the USDA.

Relationship with the FAO approach

The two approaches described above essentially rely on data relating to the distribution of household income and its effect on the variation in per capita food consumption. The FAO approach also relies on such data but with the aim of deriving the distribution of per capita food consumption. The key distinction between the USDA and Senauer/Sur approaches and the FAO approach is that the latter also considers the effect of certain non-income factors that determine the inequality in the distribution of food consumption. This suggests a difference in the focus of the two kinds of measures. In the case of the USDA and Senauer/Sur approaches, the focus is on the part of the population whose income level is insufficient to ensure a minimum food consumption level, as determined by the specified requirement level. In the FAO case, the focus is on the part of the population whose actual food consumption level is below the specified minimum energy requirement level.

Strengths and weaknesses of the FAO estimates

Strengths

The main strength of the FAO estimates lies in the fact that the distribution of household per capita dietary energy consumption is directly linked to the per capita DES derived from the food balance sheet. This procedure is useful for the reasons given below.

TABLE 1. PERCENTAGE OF UNDERNOURISHED CORRESPONDING TO DIFFERENT COMBINATIONS OF MEAN AND CV OF DIETARY ENERGY CONSUMPTION DISTRIBUTION
Mean food consumption (kcal/person/day)	CV = 0.20	CV = 0.24	CV = 0.29	CV = 0.35
1 700	65	64	63	63
2 040	30	34	38	42
2 450	7	12	17	23
2 940	1	2	6	10

The FAO per capita DES database, which covers practically all countries of the world, is regularly revised and updated in connection with FAO's continuous work programme on supply/utilization accounts and food balance sheets. As a result, the database represents a readily available source of information for the assessment and monitoring of the prevalence of undernourishment at the global, regional and country levels.

The linkage of the per capita DES with a measure of inequality within a theoretical distribution framework provides a mechanism for assessing the effect of short-term changes in aggregate food availability as well as its components (production, import, etc.) on the distribution of dietary energy consumption and hence the prevalence of undernourishment.

Weaknesses

It is evident that there are several weaknesses that affect the validity of the estimates. These are primarily associated with the fact that the frequency distribution of per capita dietary energy consumption is not based on observed data but is derived using a model whose parameters are estimated on data or measures that are subject to errors of unknown magnitude and direction. However, the derivation of the cutoff point also has certain weaknesses.

Assuming that the distribution model used (log-normal) is correct, the validity of the estimates depends on the reliability of the three elements involved in the estimation process, i.e. the mean (represented by per capita DES), the CV and the cutoff point. Of these, the last is obviously of crucial importance as its location within the range of variation of the consumption distribution directly determines the proportion of the population undernourished. In other words, the result is quite sensitive to the cutoff point. The impacts of the other two elements are through their diverse effects on the consumption distribution. The sensitivity of the result with respect to these elements can be assessed on the basis of Table 1, which shows the percentage of the population undernourished corresponding to different combinations of the mean and CV with the cutoff point kept fixed at 1 800 kcal/person/day.

In Table 1, the mean and the CV are given initial values of 1 700 kcal/person/day and 0.20, respectively, and these are then successively increased by 20 percent over the previous values in three steps to finally reach 2 940 kcal/person/day and 0.35, respectively. Accordingly, the absolute change in the percentage as one moves from one cell to the other along the rows indicates the sensitivity of the result to the CV, while the movement along the columns indicates the sensitivity to the mean.

It is noted that the percentage is quite sensitive to the mean but much less so with respect to the CV. In fact, when the mean is low and close to the cutoff point, the percentage is practically insensitive to the CV. Sensitivity to the CV tends to increase as the mean increases to higher levels. In the present analysis, which assumes a cutoff point of 1 800 kcal/person/day, the sensitivity to the CV appears to reach a peak when the mean is about 2 500 kcal/person/day. However, even then, the sensitivity to the mean is greater.
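The sensitivity pattern can be checked by recomputing the grid of Table 1 under the log-normal assumption with the cutoff fixed at 1 800 kcal/person/day. The sketch below is an illustration, not the original computation; it uses the rounded means and CVs shown in the table, so individual cells may differ from the published figures by up to about one percentage point.

```python
import math
from statistics import NormalDist

CUTOFF = 1800  # kcal/person/day, as in Table 1
MEANS = [1700, 2040, 2450, 2940]
CVS = [0.20, 0.24, 0.29, 0.35]


def percent_undernourished(mean, cv, cutoff=CUTOFF):
    """Percentage of a log-normal consumption distribution lying below the cutoff."""
    s = math.sqrt(math.log(cv ** 2 + 1.0))
    m = math.log(mean) - s ** 2 / 2.0
    return 100.0 * NormalDist().cdf((math.log(cutoff) - m) / s)


# One row per mean, one column per CV.
table = {mean: [percent_undernourished(mean, cv) for cv in CVS] for mean in MEANS}

for mean, row in table.items():
    spread = max(row) - min(row)  # change across CVs at a fixed mean
    print(mean, [round(p, 1) for p in row], "spread across CVs:", round(spread, 1))
```

The spread across CVs printed for each row gives a simple numerical summary of the sensitivity to the CV at that level of the mean, while comparing rows within a column shows the sensitivity to the mean.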

It follows from the above that errors in the estimation of the CV are of less consequence to the result as compared with errors in the cutoff point and the mean. Nevertheless, for the sake of precision of the result, it is important that all three elements are sufficiently accurate. The weaknesses having a bearing on the accuracy of these elements are indicated below.

THE MEAN

The per capita DES, which is taken to represent the mean of the distribution of per capita energy consumption, is derived as the ratio of total food supply to population. The total food supply is based on information relating to food production, food products traded, wastages, stock changes and non-food uses of food products. While data on production and trade are available for most countries, it is well known that they are subject to errors and there are many gaps in the information reported by countries. As regards information on stocks and non-food utilization, comprehensive and regular statistics are not normally available. There is therefore a need to rely on estimates based on fragmentary data or assumptions to fill in the data gaps in many instances.

The population estimates used as the denominator of the ratio are based on the global series updated biennially by the UN Population Division. Although most of the developing countries have carried out population censuses, these invariably suffer from errors of overestimation or underestimation. The UN Population Division undertakes evaluation and adjustment of the census data in deriving the series of population estimates, but the significant revisions that are made in the series when they are updated indicate that the estimates are not accurate for many countries. It is therefore evident that the per capita DES estimates resulting from the ratio of total food supply to population are subject to significant errors, particularly where the data problems are severe, for example in Africa.

THE CV

Several problems have been noted with the specification of this coefficient, as indicated below.

The coefficient cannot be completely specified, even when the appropriate survey data are available, owing to problems associated with survey practices, measurement errors and sample design. It is in fact only the component of the CV owing to income that could be estimated with a certain degree of reliability from the survey data. However, owing to the lack of appropriate survey data for many countries, even this component of the CV had to be estimated indirectly and assumed to remain constant over time.

It is uncertain whether the method adopted for estimating the non-income component of the CV is adequate and reliable.

THE CUTOFF POINT

The weaknesses relating to the cutoff point mainly refer to the specification of the sex-age-specific energy requirement. In this context, three points need to be mentioned.

EQUATIONS TO ESTIMATE THE BMR: The regression equations used to estimate the BMR on the basis of body weight are those recommended in the report of the FAO/WHO/UNU Expert Consultation on Energy and Protein Requirements. These equations (known as the Schofield equations) are believed to be less accurate in predicting the BMR in populations of the tropics and North America and appear to overestimate the BMR in many populations (Hayter and Henry, 1994).

BODY WEIGHT NORMS FOR CHILDREN: It was noted that the minimum energy requirements for children below age ten were based on weights corresponding to the median of the range of acceptable weight-for-height. This approach is in fact inconsistent with the general approach applied to adults and adolescents of basing the minimum energy requirements on the lower limit of the range of acceptable weight-for-height. As a consequence, the per capita minimum energy requirement used as the cutoff point may have been systematically overestimated.

AVERAGE HEIGHT BY SEX-AGE GROUPS: The height figures used for fixing the weight-for-height norms are in many instances based on fragmentary data or analogy with countries having a similar racial/ethnic background. Hence, these may be subject to significant errors.

Feasibility of improving the FAO estimates

It is evident that future attempts to improve the FAO estimates should focus on the weaknesses relating to the specification of both the distribution of household per capita dietary energy consumption as well as the cutoff point. The actions that could be considered are briefly discussed below.

The mean and the CV of household per capita dietary energy consumption

The food balance sheet estimate of the per capita DES, which is used as the mean of the distribution, can be improved by improving the basic statistics on which it is based. The figure should also be reconciled with the corresponding national level estimates derived from the aggregated HIES data. Although the estimates of per capita food consumption from these two sources are expected to differ, the food consumption data from the HIES can be used to improve the food balance sheet estimates, particularly with respect to the consumption of minor food crops and the self-produced and consumed food.

There is also scope for improving the estimate of the CV of household per capita dietary energy consumption on the basis of the available distribution data from the existing household income and expenditure surveys, through further research and analysis. However, for the purpose of improving the per capita DES, there is a need for access to the national per capita quantities corresponding to the various items of food expenditure. On the other hand, for the purpose of the research and analysis relating to the estimation of the CV of household per capita dietary energy consumption, the quantity data need to be converted into the energy equivalents and tabulated. Yet, despite FAO's promotional efforts in this direction, few countries have actually undertaken the related work systematically in connection with their existing HIESs. Consequently, a potential source of data for improving the methodology remains largely unexploited. As the data and research will benefit not only FAO's work but also the individual countries' information system for assessing and monitoring the prevalence of undernourishment, it is important that the data processing and tabulation work to be undertaken with respect to the HIES be considered as a key component of national FIVIMS programmes being implemented by FAO and the Inter-agency Working Group on FIVIMS.

Surveys designed to provide reliable estimates of the distributions

In the long term, it is desirable to avoid the use of a theoretical model and different data sources for the distribution of dietary energy consumption and to rely solely on household surveys, at least in the context of national FIVIMS. However, for this purpose, it is necessary to undertake research focusing on the development of sample survey designs that will yield reliable estimates of the distribution at minimal costs.

The cutoff point

With respect to the cutoff point, the weaknesses noted refer to the regression equations used to predict the sex-age-specific BMR, the body weight norms used for deriving the minimum energy requirements for children below age ten and the height figures used for determining the weight-for-height norms.

A considerable amount of new BMR data referring to populations in the tropical areas has become available since the publication of the report of the FAO/WHO/UNU Expert Consultation on Energy and Protein Requirements. Therefore, there is scope for a more extensive analysis in order to arrive at equations that can predict more accurately the BMR for populations in the different parts of the world.

As regards the body weight norms for children below age ten, the report of the FAO/WHO/UNU Expert Consultation did not contain any recommendation regarding the range of acceptable weight-for-height. However, given the practice of using the cutoff point of median - 2 SD in anthropometric assessment of undernutrition among children, it may be logical to adopt this practice also for the purpose of fixing the minimum energy requirements. In this way, the approach taken with respect to children may be brought in line with that taken for adults and adolescents.

The height figures used for deriving the body weight norms are largely based on the data published by James and Schofield in 1990. There is therefore scope for updating this dataset on the basis of the new data available since 1990.

Feasibility of disaggregating the estimates by sex-age and subnational groups

There is of course an interest in obtaining information on the differences that may exist in the prevalence of undernourishment among individuals in different sex-age groups and among those living in different areas within a country. The feasibility of undertaking these within the framework of the FAO methodology is discussed below.

Disaggregation by sex-age groups

Although the minimum energy requirements are first specified by sex-age groups and then aggregated as a weighted average for all sex-age groups, the consumption data refer to household per capita averages that do not permit breakdowns according to the sex and age of the household member. The available consumption data therefore do not allow for disaggregation of the estimates by sex and age groups.

Disaggregation by subnational areas

With respect to global assessment, which relies on the per capita DES from the food balance sheet as the mean of the distribution of consumption, it is not possible to disaggregate the national estimate by subnational areas as the food balance sheet approach is not applicable at the subnational level. However, it may be feasible to apply the FAO methodology separately to the different areas and thus derive subnational estimates of the prevalence of undernourishment if the mean and CV of the distribution of dietary energy consumption by subnational areas are available from household survey data of specific countries. In this context, the national estimate could be obtained as an aggregation over the subnational areas.
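The aggregation over subnational areas mentioned above amounts to weighting each area's prevalence by its population. A minimal sketch follows; the area names, prevalences and populations are hypothetical, and the area prevalences are assumed to have been obtained already by applying the methodology to each area's own mean and CV.

```python
def national_prevalence(area_prevalence, area_population):
    """Population-weighted aggregation of subnational prevalence estimates."""
    total_population = sum(area_population.values())
    undernourished = sum(area_prevalence[a] * area_population[a] for a in area_population)
    return undernourished / total_population


# Hypothetical areas: prevalence as a proportion, population in millions.
prevalence_by_area = {"north": 0.30, "south": 0.15}
population_by_area = {"north": 4.0, "south": 7.0}

national = national_prevalence(prevalence_by_area, population_by_area)
print(f"{100 * national:.1f} percent")
```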

Contribution to vulnerability assessment

The FAO country level estimates can be used as an indicator reflecting the extent of chronic food insecurity in vulnerability studies.

Frequency and cost of updating the estimates

As the per capita DES, which is a key element in the estimation process, is updated for nearly all countries on an annual basis by FAO, it is possible to derive annual updates of the estimate of the prevalence of undernourishment at minimal cost.

Acknowledgements

The author wishes to acknowledge the valuable inputs and comments provided by Messrs J. Mernies and R. Sibrian, his former colleagues in the Statistical Analysis Service, Statistics Division, FAO. In addition, the assistance in typing the drafts provided by Ms G. Marciani-Politi, also in the Statistical Analysis Service, is much appreciated.

References

FAO. 1977. The fourth world food survey. Rome.

FAO. 1987. The fifth world food survey. Rome.

FAO. 1995. Agriculture: towards 2010. Rome.

FAO. 1996. The sixth world food survey. Rome.

FAO/WHO/UNU. 1985. Energy and protein requirements. Report of a joint FAO/WHO/UNU ad hoc Expert Consultation. WHO Technical Report Series No. 724. Geneva.

Hayter, J.E. & Henry, C.J. 1994. A reexamination of basal metabolic rate predictive equations: the importance of geographical origin of subjects in sample selection. Eur. J. Clin. Nutr., 48(10): 702-707.

James, W.P.T. & Schofield, E.C. 1990. Human energy requirements. Oxford, Oxford University Press.

Kakwani, N.C. 1992. Measuring undernutrition with variable calorie requirements. In S. R. Osmani, ed. Nutrition and poverty, pp. 165-185. Oxford, Clarendon Press. 336 pp.

Lorstad, M. 1974. On estimating incidence of undernutrition. FAO Nutrition Newsletter, 12(1): 1-11. Rome.

Naiken, L. 1998. On certain statistical issues arising from the use of energy requirements in estimating the prevalence of energy inadequacy (undernutrition). J. Indian Soc. Agric. Stat., L1(2.3): 113-128.

Naiken, L. & Becker, K. 1990. Food consumption statistics - the potential role of data from household income/expenditure surveys. FAO Quarterly Bulletin of Statistics, 3(4). Rome.

Reutlinger, S. & Selowsky, M. 1976. Malnutrition and poverty: magnitude and policy options. World Bank Staff Occasional Papers No. 23. Baltimore, MD, Johns Hopkins University Press.

Reutlinger, S. & Alderman, H. 1980. The prevalence of calorie-deficient diets in developing countries. World Dev., 8: 399-411.

Scrimshaw, N.S., Waterlow, J.C. & Schurch, B., eds. 1994. Energy and protein requirements. Proceedings of an International Dietary Energy Consultative Group Workshop, London.

Senauer, B. & Sur, M. 2001. Ending global hunger in the 21st century: projections of the number of food insecure people. Rev. Agric. Econ., 23(1): 51-68.

Sukhatme, P.V. 1961. The world's hunger and future needs in food supplies. J. R. Stat. Soc. [Ser A], 124: 463-585.

Sukhatme, P.V. 1982. Poverty and malnutrition. In P.V. Sukhatme, ed. Newer concepts in nutrition and their implications for policy, pp. 11-52. Pune, India, Maharashtra Association for the Cultivation of Science.

Svedberg, P. 2001. Undernutrition overestimated. IIES Seminar Paper No. 693, Stockholm, Stockholm University.

US Department of Agriculture. 2000. Food security assessment. Situation and Outlook Series, International Agricultural and Trade Reports. Washington, DC.

WHO. 1983. Measuring change in nutritional status. Geneva.

WHO. 1995. Physical status: the use and interpretation of anthropometry. WHO Technical Report Series No. 854. Geneva.

Appendix A

On the use of the bivariate distribution of intake and requirement in estimating the prevalence of undernourishment

Introduction

As energy requirements are normatively specified as averages corresponding to groups of individuals of given age, sex, body weight and activity, the implied variation within the group needs to be taken into account in determining whether an individual's intake is below, equal to or above their requirement. The variation within the group of such similar individuals is believed to be due to differences in the efficiency of energy utilization. In the context of the FAO methodology, where the unit of analysis is the household per capita intake, the variation reflects the net effects of differences in the composition of the households with respect to sex, age, efficiency of energy utilization, body weight and activity of the household members. In order to take into account this variation, it is necessary to rely on a bivariate distribution model referring to energy intake and energy requirement. The estimate of the prevalence of undernourishment based on such a model has been referred to as the bivariate formula. However, a key issue in evaluating this formula has been how to incorporate the effect of the positive correlation that is believed to exist between energy intake and requirement. Because of a lack of sufficient information on this matter, FAO has traditionally adopted the Sukhatme cutoff point formula that relies on the frequency distribution of household per capita energy intake and a cutoff point reflecting the lower limit of the range of variation of energy requirement. In the past, this approach has been justified as a means of avoiding the risk of overestimating the prevalence of undernourishment, but a study undertaken during the preparation of The Sixth World Food Survey showed that under the condition of a correlation between energy intake and requirement, the bivariate formula reduces to the cutoff point formula (Naiken, 1998).

More recently, however, Svedberg (2001) argued that the FAO cutoff point formula yields an estimate that is "biased" downwards, as it appears to ignore the risk of energy insufficiency among the individuals with an intake falling within the range of variation of requirement. His argument is based on a comparison of the estimate obtained through the cutoff point formula with that obtained using the bivariate formula. In evaluating the bivariate formula, however, he, like others who have applied it before (e.g. Kakwani, 1992; Lorstad, 1974; Reutlinger and Alderman, 1980), has not correctly interpreted the effect of correlation, with the consequence that the prevalence of undernourishment is overestimated. This issue, as well as the derivation of the cutoff point formula, is discussed in detail in this appendix.

The flaw in the approach taken to incorporate the effect of correlation in evaluating the bivariate formula

In the subsection "Basic methodological framework", the bivariate formula for estimating the prevalence of undernourishment is expressed as follows:

$$P(x<r) = \iint_{x<r} f(x,r)\,dx\,dr \tag{A1}$$

The above statement refers to the probability that the intake of a randomly selected individual from the population is below his or her requirement.

In evaluating the formula given by expression (A1), f(x,r) has been assumed to be bivariate normal with the related parameters given as follows:

m_x representing the mean of intake x;
m_r representing the mean of requirement r;
s²_x representing the variance of intake x;
s²_r representing the variance of requirement r; and
s_xr representing the covariance of x and r.

The covariance reflects the presence of correlation. As the coefficient of correlation, r, is given by the ratio s_xr/(s_x s_r), the analysts (including Svedberg) who have attempted to apply the bivariate formula have expressed the covariance as r s_x s_r. Thus, given s_x and s_r, the effect of correlation has been taken into account by simply introducing the correlation coefficient r as an additional parameter of the joint distribution of intake and requirement. The flaw resulting from the introduction of the effect of correlation in this manner is discussed below.

Since P(x<r) can also be written as P[(x-r) < 0], the double integral given by expression (A1) can be evaluated by considering the distribution of (x-r), which is expected to be normal (as the joint distribution is assumed to be normal) with mean m_(x-r) and variance s²_(x-r). Thus, using this distribution, the prevalence of undernourishment is expressed as follows:

$$P(x<r) = F\left[\frac{-m_{(x-r)}}{s_{(x-r)}}\right] \tag{A2}$$

where the mean and variance of (x-r) are given as follows:

$$m_{(x-r)} = m_x - m_r \tag{A3}$$

$$s^2_{(x-r)} = s^2_x + s^2_r - 2\,r\,s_x s_r \tag{A4}$$

and F[Z] is the area under the standard normal curve to the left of the point Z. Needless to say, P(x>r) can be derived as 1 - F[Z].

Actually, Svedberg has assumed f(x,r) to be bivariate log-normal, which implies that the probability statement is expressed as P[(x/r) <1] instead of P[(x-r) <0]. As this boils down to the same probability statement expressed in relative rather than absolute terms, it does not affect the argument made here, which is based on the assumption that f(x,r) is bivariate normal. Furthermore, Svedberg has evaluated P(x<r) by operating directly on the joint frequency distribution rather than the distribution of (x-r), but again, this does not affect the argument as the two approaches lead to the same result.

Thus, it is evident that by keeping m_x, m_r, s_x and s_r constant and increasing the correlation coefficient r from zero, s²_(x-r) is decreased, so that, when m_x exceeds m_r, P(x<r) given by expression (A2) is marginally reduced. However, it is also evident that when m_x = m_r, the numerator of the ratio within the square brackets on the right-hand side of expression (A2) becomes 0, so that, irrespective of the value of r in the denominator, P(x<r), and consequently P(x>r), will remain 0.5. The only exception is when s_x = s_r and r is assumed to be exactly 1, in which case s²_(x-r) is itself 0.
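The behaviour described above can be checked numerically. The following Python sketch evaluates P(x<r) from the distribution of (x-r), taking its mean as m_x - m_r and its variance as s_x² + s_r² - 2·r·s_x·s_r (the standard result for the difference of two jointly normal variables), with the correlation coefficient supplied as a free parameter in the way criticized in this appendix. The function name and all numerical values are hypothetical.

```python
import math


def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_intake_below_requirement(m_x, m_r, s_x, s_r, corr):
    """P(x<r) under bivariate normality with `corr` as a free parameter."""
    var_diff = s_x ** 2 + s_r ** 2 - 2.0 * corr * s_x * s_r
    if var_diff <= 0.0:
        raise ValueError("the variance of (x-r) must be positive")
    z = -(m_x - m_r) / math.sqrt(var_diff)
    return normal_cdf(z)


# Hypothetical means and standard deviations (kcal/person/day).
for corr in (0.0, 0.3, 0.6, 0.9):
    equal_means = p_intake_below_requirement(2200.0, 2200.0, 500.0, 250.0, corr)
    higher_intake = p_intake_below_requirement(2400.0, 2200.0, 500.0, 250.0, corr)
    print(f"corr={corr:.1f}  equal means: {equal_means:.3f}  "
          f"higher mean intake: {higher_intake:.3f}")
```

With equal means the result stays at 0.5 for every value of the correlation coefficient, whereas with a higher mean intake it declines only gradually as the correlation coefficient rises.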

The high and constant value of 0.5 for P(x<r) and P(x>r) when m_x = m_r and r is increased from 0 to nearly 1 (i.e. the results remain the same as under the assumption of no correlation) is due to the fact that the effect of correlation has not been correctly introduced in evaluating the bivariate formula. In this connection, the following statement from the report of the FAO/WHO/UNU Expert Consultation on Energy and Protein Requirements is worth noting:

Most people have the ability to select their food intake in accordance to their requirement over the long term, since it is believed that regulatory mechanisms operate to maintain balance between energy intake and requirement over long periods. This implies that one expects there to be a correlation between energy intake and energy requirement among individuals if sufficient food is available in the absence of interfering factors .... If self-selection is allowed to operate, it is to be expected that individuals will make selection according to the energy need and the probability of inadequacy or excess will be low across the whole range of requirement (emphasis mine) ... If the average intake of a class were equal to the average requirement of the class almost all individuals would be at low risk because of processes regulating energy balance and the resultant correlation between intake and requirement (FAO/WHO/UNU, 1985).

The above statement clearly indicates that correlation is meant to reflect the effect of "regulatory mechanisms" that "operate to maintain balance between energy intake and energy requirement over long periods". This implies that correlation needs to be interpreted as the existence of a probability for intake to match requirement, i.e. P(x=r).

The interpretation of correlation in the above manner is actually a consequence of the fact that, in the present context where energy requirement is fixed, the joint distribution is such that the range of variation of requirement lies within the range of variation of intake, and hence s_x includes s_r. In this situation, intake cannot be linked to requirement by the usual regression equation but by the following expression:

$$x = m_r + (r - m_r) + (x - r) \tag{A5}$$

where (r - m_r) refers to the deviation of the individual's requirement from the mean requirement, and (x-r) refers to the deviation of the individual's intake from their requirement. Thus, as (r - m_r) and (x-r) can be assumed to be uncorrelated, the variance of intake may be expressed as

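$$s^2_x = s^2_r + s^2_{(x-r)} \tag{A6}$$

Now, given that x is correlated with r, the second term on the right-hand side of expression (A6) may be expressed as

$$s^2_{(x-r)} = s^2_x + s^2_r - 2 s_{xr} \tag{A7}$$

so that

$$s^2_x = s^2_r + s^2_x + s^2_r - 2 s_{xr} \tag{A8}$$

Upon elimination of s_x² on both sides, it will be noted that

$$s_{xr} = s^2_r.$$

Hence

$$r = \frac{s_{xr}}{s_x s_r} = \frac{s_r}{s_x}$$

and

$$s^2_{(x-r)} = s^2_x - s^2_r = s^2_x (1 - r^2).$$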

Thus, if s_x = s_r, then r = 1 and s²_(x-r) = 0, which implies that P(x=r) = 1 for all intakes that fall within the range of variation of requirement. This means that the condition for x to be correlated with r is that P(x=r) = 1 for the intakes that lie within the range of variation of requirement. By implication, the condition for the existence of P(x<r) or P(x>r) is that s_x > s_r, so that P(x<r) = 1 for the intakes that consequently will lie below the range of variation of requirement, and P(x>r) = 1 for the intakes that will lie above the range of variation of requirement. In view of this, given the frequency distribution of intake, P(x<r) for the population is derived by integrating the distribution over the part that lies below the range of variation of requirement; P(x=r) by integration over the part that coincides with the range of variation of requirement; and P(x>r) by integration over the part that lies above the range of variation of requirement.

It follows from the above discussion that the analysts who have attempted to evaluate the bivariate formula have ignored the fact that, as the range of variation of requirement lies within the range of variation of intake, the covariance is given by s²_r, and hence the correlation coefficient r (which is then given by the ratio s_r/s_x) reflects the existence of P(x=r) in addition to P(x<r) and P(x>r) in the bivariate probability space. Consequently, P(x<r) cannot be evaluated on the basis of the parameters of the two marginal distributions and independently assumed values for r. By doing this, the whole probability space is divided into P(x<r) and P(x>r), irrespective of the assumed level of r, so that P(x=r) is absorbed by P(x<r) and P(x>r). As a result, the effect of correlation is blunted, and both P(x<r) and P(x>r) are overestimated.
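The relationship between the covariance, the variance of requirement and the correlation coefficient stated above can be illustrated with a small simulation. The Python sketch below assumes, as in the text, that the deviation of intake from requirement is uncorrelated with requirement; the normal distributions, the parameter values, the sample size and the function name are illustrative assumptions only.

```python
import math
import random


def simulate_intake_requirement(m_r=2200.0, s_r=200.0, s_dev=400.0,
                                n=50_000, seed=1):
    """Simulate x = r + (x - r) with (x - r) drawn independently of r."""
    rng = random.Random(seed)
    r = [rng.gauss(m_r, s_r) for _ in range(n)]
    x = [ri + rng.gauss(0.0, s_dev) for ri in r]

    mean_r = sum(r) / n
    mean_x = sum(x) / n
    var_r = sum((ri - mean_r) ** 2 for ri in r) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x) / n
    cov_xr = sum((xi - mean_x) * (ri - mean_r) for xi, ri in zip(x, r)) / n

    return {
        "var_r": var_r,
        "cov_xr": cov_xr,                              # expected to be close to var_r
        "corr": cov_xr / math.sqrt(var_x * var_r),     # sample correlation
        "sr_over_sx": math.sqrt(var_r / var_x),        # ratio s_r / s_x
    }


result = simulate_intake_requirement()
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

In such a simulation the sample covariance is close to the variance of requirement, and the sample correlation is close to the ratio s_r/s_x, so the correlation cannot be varied independently of the two standard deviations.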

Derivation of the cutoff point formula within the bivariate distribution framework

In the previous section, the effect of correlation on P(x<r) has been discussed under the assumption that the joint distribution is bivariate normal. This is mainly because nearly all attempts made to evaluate the bivariate formula have been based on this assumption. However, as will be shown below, the argument justifying the use of the cutoff point formula is not bound by this assumption and can be made without specifying the type of theoretical distribution that is relevant in the present context. It suffices to assume that the two marginal distributions are continuous and unimodal, and to consider the condition for the existence of a dependent relationship (correlation) between energy intake and energy requirement.

As illustrated in Figure A1, the range of variation of the marginal distribution of requirement is expected to be located within the range of variation of the marginal distribution of intake.^[4] Consequently, the probability space can be divided into three parts as follows:

$$\int_{x<r_L} f(x)\,dx + \int_{r_L}^{r_U} f(x)\,dx + \int_{x>r_U} f(x)\,dx = 1 \tag{A9}$$

where f(x) and f(r) are the marginal density functions of intake and requirement, respectively, and r_L and r_U represent the lower and upper limits, respectively, of the distribution of requirement.

FIGURE A1. FREQUENCY DISTRIBUTION OF DIETARY ENERGY INTAKE AND DIETARY ENERGY REQUIREMENT

The area defined by the first integral represents the proportion of the population whose intakes are below the lowest requirement and are therefore in all probability below their respective requirements, while the area defined by the third integral represents the proportion whose intakes are above the highest requirement and are therefore in all probability above their respective requirements. The area defined by the second integral represents the proportion of the population whose intake status requires evaluation of the joint distribution of intake and requirement over the requirement range. As a consequence, the bivariate formula given by expression (A1) can be written as follows:

$$P(x<r) = \int_{x<r_L} f(x)\,dx + \int_{r_L}^{r_U}\int_{r_L}^{r} f(x,r)\,dx\,dr \tag{A10}$$

The evaluation of the double integral over the range of requirement depends on whether intake is considered to be independent of requirement (uncorrelated) or dependent on requirement (correlated). The condition for independence in this context implies that the individuals' intakes are either below or above their respective requirements, i.e. the intakes do not match the respective requirements. The condition for dependence is the contrary, i.e. the intakes match the respective requirements.

Thus, if intake is assumed to be uncorrelated with requirement (i.e. intake is independent of requirement) the double integral involving the joint distribution on the right-hand side of expression (A10) can be expressed in the form of two simple integrals involving the marginal distributions of intake and requirement so that the bivariate formula may be written as follows:

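$$P(x<r) = \int_{x<r_L} f(x)\,dx + \int_{r_L}^{r_U} f(r)\left[\int_{r_L}^{r} f(x)\,dx\right]dr \tag{A11}$$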

This implies that under the assumption of no correlation, P(x<r) will include part of the intakes that fall within the range of variation of requirement.

However, if intake is considered to be correlated with requirement (i.e. intake is dependent on requirement), all the intakes falling within the range of variation of requirement are expected to match requirements [i.e. x = r], so that the double integral [the second term on the right-hand side of expression (A10)] becomes zero. As a consequence, the intakes that fall within the range of variation of requirement need to be excluded in evaluating P(x<r), so that the bivariate formula reduces to the cutoff point formula as follows:

$$P(x<r) = \int_{x<r_L} f(x)\,dx \tag{A12}$$

It therefore follows that the bivariate formula given by expression (A1) is a general formula for the prevalence of undernourishment that, under the assumptions that the marginal distributions are unimodal and a correlation exists between energy intake and requirement, reduces to the cutoff point formula given by expression (A12).
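A numerical comparison may help to show how the two cases differ. The Python sketch below assumes a normal marginal distribution of intake and a uniform distribution of requirement between r_L and r_U; both choices, the function names and the parameter values are hypothetical. The independence case is written here as the cutoff term plus the integral, over the requirement range, of f(r) multiplied by the share of intakes lying between r_L and r.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative normal distribution with the given mean and sd."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def cutoff_estimate(m_x, s_x, r_low):
    """Share of intakes below the lower limit of requirement."""
    return normal_cdf(r_low, m_x, s_x)


def independence_estimate(m_x, s_x, r_low, r_high, steps=2000):
    """Cutoff term plus the contribution of intakes within the requirement
    range when intake is taken to be independent of requirement.

    Requirement is assumed uniform on [r_low, r_high] (an assumption)."""
    base = normal_cdf(r_low, m_x, s_x)
    width = (r_high - r_low) / steps
    density_r = 1.0 / (r_high - r_low)
    extra = 0.0
    for i in range(steps):
        r = r_low + (i + 0.5) * width          # midpoint rule
        extra += density_r * (normal_cdf(r, m_x, s_x) - base) * width
    return base + extra


# Hypothetical values (kcal/person/day).
m_x, s_x, r_low, r_high = 2300.0, 600.0, 1800.0, 2600.0
print(f"cutoff point formula:  {cutoff_estimate(m_x, s_x, r_low):.3f}")
print(f"independence assumed:  {independence_estimate(m_x, s_x, r_low, r_high):.3f}")
```

The estimate under independence is necessarily the larger of the two, because it adds part of the intakes that fall within the range of variation of requirement.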

Practical significance of the estimate resulting from the application of the cutoff point formula

In the introduction of this appendix, it was indicated that the prevalence of undernourishment is formulated as a probability measure because energy requirement is specified as the average for a group of individuals, and consequently, the actual requirement of each individual in the group is not known. In other words, the probability measure is used because of the uncertainty regarding the actual requirement level of the individuals. This uncertainty, however, concerns only the group of individuals with intakes that fall within the range of variation of requirement. In making inference regarding this group, a key issue is whether intake is correlated with requirement (i.e. whether intake is dependent on requirement). It was noted that if, as generally accepted, intake is correlated with requirement, this group has to be considered to be in the adequate category, i.e. their intakes match their respective requirements. Consequently, the bivariate formula reduces to the cutoff point formula with the cutoff point given by the lower limit of the distribution of requirement.

The above result is actually theoretical for two obvious reasons. First, it is based on the assumption that the marginal distributions of intake and requirement are continuous. Second, it takes into account the theoretical condition for the existence of statistical dependence or correlation between the two variables. However, in reality, the distributions are likely to contain certain irregularities and are considered to be continuous only approximately, so that the theoretical condition cannot be met exactly. In other words, even if a correlation exists, the intakes falling within the range of variation of requirements may still lie either below or above the individuals' respective requirements (i.e. the intakes do not exactly match requirements). It is therefore necessary to explain the significance of the theoretical condition in this context. This is undertaken below by considering a bivariate plot resulting from a hypothetical situation where intake is correlated with requirement.

In Figure A2, requirement (r) is shown on the horizontal axis and intake (x) on the vertical axis. Requirement is assumed to vary within the range of r_L and r_U. The point x₁ on the vertical axis represents the level of intake that equals r_L, i.e. the cutoff point, and x₂ represents the level of intake that equals r_U. The 45 degree line is where intake exactly matches requirement, i.e. x=r. Thus, the area below the 45 degree line represents the space where the individuals whose intakes are below their respective requirements will be located, while that above the 45 degree line is the space where the individuals whose intakes are above their respective requirements will be found. It is evident that as r_L and r_U are the lowest and highest requirements, respectively, there can be no individuals in the area to the left of the vertical line drawn from r_L as well as that to the right of the vertical line from r_U. Thus, all the individuals will be located between the vertical lines rising from r_L and r_U.

FIGURE A2. BIVARIATE PLOT OF HYPOTHETICAL DATA ON INTAKE AND REQUIREMENT OF INDIVIDUALS

It follows from the above that the individuals with intakes that fall within the range of variation of requirements will be located in the area ABCD. In fact, the tendency for intake to match requirement (correlation) is possible only for this group of individuals. However, because the frequency distributions are, in reality, not likely to be exactly continuous, the points corresponding to this group of individuals are not likely to be exactly on the 45 degree line, as stipulated by the condition for the existence of a correlation. Hence, the points corresponding to this group are shown to cluster closely around the 45 degree line. Those whose intakes are below r_L will lie in the area below BC, while those whose intakes are above r_U will be in the area above AD. The points in these two groups (not shown in the graph) are obviously the outliers reflecting the effect of factors that are uncorrelated with requirement. It is actually their presence that makes the correlation between intake and requirement in the population not perfect. The greater their proportion in the population, the less perfect is the correlation.

All individuals in the area below BC are clearly in the inadequate category, while those in the area above AD are clearly in the excess category. As regards those in the area ABCD, the theoretical argument is that although the corresponding points may be either below or above the 45 degree line, the extent of the departure from the line cannot be significant for a correlation to exist between intake and requirement. Hence, these individuals as a group are considered to be in the state of energy adequacy in the probability sense. In other words, the corresponding points are in theory considered to be on the 45 degree line. A situation where the points depart significantly from the 45 degree line would obviously imply that a correlation does not exist. In such a situation, some individuals in this group are also likely to be in the inadequate category, implying that the cutoff point formula will underestimate the prevalence of undernourishment.

Thus, the use of the lower limit of the range of variation of requirement as the cutoff point on the distribution of intake for estimating the prevalence of undernourishment does not imply a deliberate disregard of undernourishment among the individuals with intakes falling within the range of variation of requirement, but is a recognition of the fact that, owing to the effect of correlation, these intakes are likely to be close to, if not exactly matching, the respective requirements.

Appendix B

Estimation of the CV of energy requirement

In the subsection "Estimation of the CV" of the main paper, it was indicated that the CV(x|r) needed to specify CV(x) is estimated to be about 0.20. The procedure used to arrive at this figure is described in this appendix.

The first step in the procedure was to derive estimates of the minimum and maximum per capita energy requirements. Then, assuming that the distribution is normal, the CV implied by this range was derived.

The minimum and maximum per capita energy requirements have been estimated on the basis of the same principles adopted for deriving the sex-age-specific minimum energy requirements for the purpose of the cutoff point, i.e. by considering the ranges of acceptable body weight for given height and physical activity as described in the subsection "Definition of the minimum energy requirement by sex-age groups". However, there are two exceptions, as indicated below:

Variation in the BMR for adults and adolescents

The regression equations used for estimating the BMR given body weight are subject to a prediction error corresponding to a CV of about 0.08 (Scrimshaw, Waterlow and Schurch, 1994). As this variation is of a random nature, it was not considered in deriving the minimum energy requirements for the purpose of the cutoff point. But in the present context where this variation in energy requirement is to be used for estimating the variation in energy intake, the variation owing to error in estimating the BMR is taken into account.

Range of acceptable body weight for given height for children

In defining the sex-age-specific minimum energy requirements for children below age ten for the purpose of the cutoff point, no allowance was made for variation in the weight-for-height norms. However, in the present context, the approach was made consistent with that taken for adults and adolescents. Accordingly, the range of acceptable weight-for-height used in anthropometric assessments of nutritional status was adopted to arrive at the sex-age-specific minimum and maximum energy requirements.

The body weight and activity specifications used for defining the minimum and maximum energy requirements are summarized in Table B1.

As in the case of the cutoff point, the derived sex-age-specific minimum and maximum energy requirements are aggregated by using the proportion of the population in the relevant sex-age groups as weights to arrive at the country minimum and maximum per capita energy requirements. The CVs based on the range of energy requirements thus derived have been calculated for all countries for 1960 and 2000. The resulting values were around 0.20. Both the intercountry and the interperiod differences, which reflect the intercountry and temporal differences in the sex-age composition of the population, were small, corresponding to a standard deviation of less than 0.005. In view of this, a fixed value of 0.20 has been used for all countries.
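The two steps of the procedure (weighted aggregation, then conversion of the range into a CV) can be sketched in Python as follows. The appendix does not state how many standard deviations the range is taken to span under the normality assumption, so the parameter k below (range = mean ± k standard deviations, with the mean taken as the midpoint) is purely an assumption of this sketch, as are the sex-age groups, population shares and requirement figures.

```python
def per_capita_requirement(groups, key):
    """Weighted average of sex-age-specific requirements, using the
    population shares of the groups as weights."""
    total_share = sum(g["share"] for g in groups)
    return sum(g["share"] * g[key] for g in groups) / total_share


def cv_from_range(r_min, r_max, k=2.0):
    """CV implied by a range under a normal distribution.

    Assumption of this sketch: the range covers the midpoint plus or
    minus k standard deviations; k is not taken from the paper."""
    midpoint = (r_min + r_max) / 2.0
    sd = (r_max - r_min) / (2.0 * k)
    return sd / midpoint


# Hypothetical sex-age groups (shares of population; kcal/person/day).
groups = [
    {"group": "children 0-9", "share": 0.25, "min": 1100.0, "max": 1900.0},
    {"group": "males 10+", "share": 0.37, "min": 2000.0, "max": 3700.0},
    {"group": "females 10+", "share": 0.38, "min": 1700.0, "max": 2900.0},
]

r_min = per_capita_requirement(groups, "min")
r_max = per_capita_requirement(groups, "max")
print(f"per capita minimum: {r_min:.0f}, maximum: {r_max:.0f}")
print(f"implied CV (k=2): {cv_from_range(r_min, r_max):.3f}")
```

The resulting CV depends directly on the choice of k, so the figures printed by this sketch should not be read as a reproduction of the values reported above.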

TABLE B1. BODY WEIGHT AND ACTIVITY SPECIFICATIONS FOR DEFINING MINIMUM AND MAXIMUM ENERGY REQUIREMENTS
Specification	Minimum	Maximum
Body weight
Children (ages 0-9)	Lower limit of the range of acceptable weight-for-height	Upper limit of the range of acceptable weight-for-height
Adolescents and adults (ages 10 and above)	Weight for given height based on the lower limit of the range of acceptable BMI	Weight for given height based on the upper limit of the range of acceptable BMI
Physical activity level
Children (ages 0-9)	Energy requirement corresponding to given weight with no allowance for desirable activity	Energy requirement corresponding to given weight plus 5% for desirable activity
Adolescents and adults (ages 10 and above)	Males: PAL factor of 1.55; females: PAL factor of 1.56	Males: PAL factor of 2.10; females: PAL factor of 1.82

Appendix C

Available data on changes in the inequality of income distribution

The available series referring to the Gini coefficient of the distribution of household income/expenditure for a number of developing countries in Asia, Africa and Latin America are given in Table C1. Only the countries with data referring to more than one time period are shown in the table. In referring to this table, it should be borne in mind that, as the share of food in household expenditure declines with rising income and there is an upper limit to food consumption levels, the inequality in the distribution of the household per capita food consumption is much smaller than the inequality in the distribution of household income. In view of this, the focus should be on the changes rather than the actual levels observed.

In interpreting the changes in the Gini coefficient from survey to survey over time, account must be taken of the fact that the coefficients are based on distributions derived from data collected in sample surveys, which are normally designed to provide valid estimates of the means rather than the distribution. Furthermore, the means as well as the variances derived through these surveys are subject to sampling errors. As the distribution of income is known to be positively skewed, the sampling error is in fact larger than what would be expected in the case of a normal (symmetric) distribution. The effect of these errors, which are common to all socio-economic surveys based on samples, implies that the estimated variances, and hence Gini coefficient, are not likely to be stable, even if there is no true change in the inequality. Thus, considering these issues associated with the precision of the measures based on sample surveys, the period-to-period change observed cannot be taken to reflect a true change unless it is very large.
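The instability of survey-based Gini coefficients can be illustrated with a short simulation. The Python sketch below draws repeated samples from one and the same positively skewed income distribution and computes the Gini coefficient of each sample; the log-normal form, its parameter, the sample size and the number of simulated surveys are hypothetical choices made only for illustration.

```python
import random


def gini(values):
    """Gini coefficient of a list of non-negative values."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive value")
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


def simulate_gini_estimates(n_surveys=20, sample_size=500, sigma=0.8, seed=7):
    """Gini coefficients from repeated samples of the same log-normal
    income distribution, i.e. with no true change in inequality."""
    rng = random.Random(seed)
    return [
        gini([rng.lognormvariate(0.0, sigma) for _ in range(sample_size)])
        for _ in range(n_surveys)
    ]


estimates = simulate_gini_estimates()
print(f"lowest estimate:  {min(estimates):.3f}")
print(f"highest estimate: {max(estimates):.3f}")
```

The spread between the lowest and highest estimates arises from sampling alone, since the underlying distribution is the same in every simulated survey.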

Table C1 shows that wherever the number of observations is sufficient for drawing conclusions, the changes over time in the different countries are rather small with no clear indication of either a decreasing or increasing trend.

TABLE C1. GINI COEFFICIENT OF DISTRIBUTION OF HOUSEHOLD INCOME/EXPENDITURE IN DEVELOPING COUNTRIES, 1970-1993
Region/country	YEAR

	70	71	72	73	74	75	76	77	78	79	80	81	82	83	84	85	86	87	88	89	90	91	92	93
LATIN AMERICA AND THE CARIBBEAN
Brazil^a	0.58										0.58	0.55					0.55	0.56
Colombia^a		0.52	0.53						0.55										0.51
Guatemala^b																		0.58		0.59
Jamaica^c																			0.43		0.43	0.41	0.38	0.38
Mexico^b															0.51					0.55
Trinidad and Tobago^a		0.51										0.42
Venezuela^a							0.44	0.42	0.41	0.39
SUB-SAHARAN AFRICA
Ivory Coast^c																0.41	0.39	0.40	0.37
Gabon^d						0.59		0.63
Ghana^c																			0.36	0.37
Mauritius^c																	0.40					0.37
Nigeria^c																	0.37						0.41
ASIA
Bangladesh^a				0.36				0.33	0.35			0.39		0.36			0.37
China^b											0.32		0.29	0.27	0.26	0.31	0.33	0.34	0.35	0.36	0.35	0.36	0.38
China, Hong Kong^a		0.41					0.41				0.37
India^c	0.30		0.32	0.29				0.32						0.31			0.32	0.32	0.31	0.30	0.30	0.33	0.32
Indonesia^c							0.35		0.39		0.36	0.34			0.32			0.32			0.33
South Korea^a											0.39					0.35			0.34
Malaysia^a							0.53								0.48
Pakistan^e		0.31								0.32						0.32		0.32	0.31
Philippines^a		0.49														0.45			0.45			0.48
Sri Lanka				0.35						0.44		0.45						0.47
Thailand^a						0.42						0.43					0.47		0.47		0.49		0.52
^a Gini coefficients based on the distribution of households by household gross income.
^b Gini coefficients based on the distribution of persons by household gross income.
^c Gini coefficients based on the distribution of persons by household net expenditure.
^d Gini coefficients based on the distribution of households by household net income.
^e Gini coefficients based on the distribution of households by household net expenditure.

^[1] Especially where home-produced food is an important part of food consumption or where data on quantities are a prerequisite to arrive at the expenditure.
^[2] This general equation defines six regional dummies with industrialized countries as the reference group. The regional dummy term in the general equation for countries in transition economies is TR. For the developing countries, it is as follows: AI in American Islands, AC in Continental America, AF in Africa, OC in Oceania and AS in Asia. Each dummy takes the value of unity for countries in the group concerned and zero otherwise. For example, the equation for countries in transition will include a, b₁ and b₇, while for developing countries in American Islands, it will include a, b₂ and b₇, and so on. The equation for industrialized countries, as the reference group, will include a and b₇ only.
^[3] The BMI refers to weight (in kilograms) divided by the square of height (in metres).
^[4] It must be pointed out that, in theory, the area below f(x) should be the same as that below f(r). This is, however, not so in Figure A1, as the two curves have not been drawn to scale.