CHAPTER TWO
General Principles

This chapter focuses particularly on the possibilities of use and the limitations of indicators, the process of collecting and analysing them, the method of selection, and, in particular, on the necessary trade-offs between the benefit of having the information and the cost or difficulty of collecting it.

The challenge of complexity: making choices

A high prevalence of low hemoglobin levels, together with a low amount of bioavailable iron in diets, may constitute the basic indicators of iron deficiency anemia in a population, provided there is a consensus on their meaning and on the cut-off values used. If a strategy of fortifying a vector food with iron is adopted, in view of the extent and homogeneity of the phenomenon in the population, regardless of age or sex, repeatedly measuring the same indicators and comparing them to previous values or to an international reference will make it possible to evaluate the effectiveness of the strategy.

Nevertheless, it will probably also be necessary to include a whole series of indicators of health status, of the use of the health care system, of dietary patterns and food availability, or perhaps of manufacturing and distribution channels, in order to provide an overall picture of the problem, its causes and possible solutions and to allow evaluation of actions undertaken.

Nature of indicators

Indicators for assessing and analyzing the nutritional situation

Status indicators:

Choosing and prioritizing actions to combat undernutrition will depend primarily on the information regarding the nutritional status of the population.

Such information will be provided by indicators of status, which make it possible to characterize the nature of the undernutrition problem. They will then be linked to the characteristics of persons, times and places, in order to obtain an indication of the distribution of the problem in the population, and thus reach an overall picture of the situation.

The nutrition situation: defining priorities for action

Who suffers from undernutrition? (in terms of age, sex, socio-occupational category, etc.)
What is the type of undernutrition? (global energy deficiency, deficiencies in particular nutrients, severity of the situation, etc.),
When? (temporary, seasonal or annual; recurring or not, chronic),
Where are these malnourished individuals? (agro-ecological zones or administrative areas most at risk: districts, regions, etc.).

Precisely defining the nutritional status of a person, and even more so of a population, is difficult. It is a global concept which can only be grasped through a set of clinical, physical or functional characteristics, each of which could become an indicator if it were given a cut-off value separating malnourished individuals from well-nourished ones. This task has been carried out - and there has been a consensus on it - mainly in the fields of child and adult malnutrition and of deficiency in three micronutrients which are widespread and have serious consequences in terms of public health (vitamin A, iodine and iron).

First of all, measurements or corresponding indices are collected at individual level (for example, weight, arm circumference, hemoglobin level, etc.). This information is then expressed at the level of the population group concerned, in the form of prevalence rates, in other words, percentages of individuals who are well- or malnourished with respect to the form of malnutrition considered, in accordance with the cut-off values chosen. For example, % of children of pre-school age with a ‘weight for age’ index of <-3 Z-scores or <-2 Z-scores; or % of adults having a body mass index of <18.5 or <16.0 kg/m², etc.
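The step from individual measurements to a prevalence rate can be sketched as follows. This is a minimal illustration only: the function name `prevalence` and the ten Z-score values are hypothetical and are not data from any survey.

```python
def prevalence(z_scores, cutoff):
    """Percentage of individuals whose index falls below the cut-off value."""
    if not z_scores:
        raise ValueError("at least one measurement is required")
    below = sum(1 for z in z_scores if z < cutoff)
    return 100.0 * below / len(z_scores)


# Hypothetical weight-for-age Z-scores for ten pre-school children.
sample_z = [-3.4, -2.6, -2.1, -1.8, -1.2, -0.9, -0.4, 0.1, 0.6, 1.3]

below_minus_2 = prevalence(sample_z, -2.0)  # children below -2 Z-scores
below_minus_3 = prevalence(sample_z, -3.0)  # children below -3 Z-scores
print(f"<-2 Z-scores: {below_minus_2:.0f}%   <-3 Z-scores: {below_minus_3:.0f}%")
```

The same measurements give different prevalence rates depending on the cut-off value chosen, which is why the cut-off must always be stated alongside the indicator.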

The use and interpretation of these indicators of status are presently well-established. Nevertheless, it is useful to consult a specialist for selecting and interpreting them, as these indicators can reflect, for example, either a likely risk (simple deviation from a norm) or a real risk of nutrient deficiency (recognised functional deficit), either a recent or old, acute or chronic history of undernutrition (wasting, stunting in the young child). Some indicators are useful at population level rather than at individual level. Finally, some will be more useful than others for anticipating the benefit of a possible intervention.

Indicators of causes:

Once the nutritional status of the population and its geographical or socio-economic distribution are known, and goals for improvement have been set, information is needed on the determinants of the situation; in other words, on the factors, events or characteristics which are likely to affect the nutritional status of individuals within the population at different levels. It will then be possible to define a strategy seeking to alter a number of these factors to improve the situation as reflected in the stated objectives.

Most of the major international meetings convened since the 1990s have referred to the same general framework for the analysis of the different types of causes of malnutrition and mortality, and to a classification depending on the level of intervention (ACC/SCN, 2000). While the immediate causes are generally a quantitative or qualitative inadequacy of the food ration or a disease, usually of infectious origin, these events are quite obviously themselves linked to a cause, and there are ‘chains of causality’ which are gradually brought to light.

Classification of these chains of causes can be simplified into three major categories:

a) food insecurity

This first category includes food supply problems at national, regional and household levels, as well as problems of households’ and communities’ access to foods with a good nutritional value, especially in terms of purchasing power. This sector includes a wide range of potential indicators covering agricultural production, food marketing and food consumption. A number of them are regularly collected by the information systems operated by Ministries of Agriculture and Trade.

b) environmental hygiene, access to health services

Environmental hygiene aspects encompass water supply, and supply of healthy food products, sanitation in a broad sense, and the life-styles of the populations themselves; health-related aspects include the sphere of infectious and parasitic diseases on the one hand, and that of health care systems, their coverage and utilization, on the other. In general, relevant departments of the Ministries of Health collect the corresponding indicators; a number of them have formed the basis for the health information systems launched in connection with the implementation of the policy of primary health care in the 1980s, which was updated in 1996 (WHO 1981; 1996a).

c) care and caring practices

The concept of "caring" relates to both caring at family level and broader aspects of social solidarity and protection at the community or national level. It thus covers the whole range of mother-and-child caring practices, since mothers and infants are the main groups at risk, but also includes attitudes and practices of other household or community members towards those most vulnerable socially (regarding time available, food distribution, emotional and material support) and the level of education of care providers in general. Indicators of this type are seldom collected regularly; when they do exist, they tend not to be easily accessible at a clearly identified central level. Thus the available information usually has to be complemented through specific community surveys, focusing especially on qualitative aspects.

Yet the most fundamental causes of malnutrition and mortality very often lie outside the field of nutrition and the chains of causes briefly reviewed above: they are naturally linked to potential resources of countries (energy resources, climate), but also to all factors which govern their use, such as management of population density in relation to available resources, poverty, social inequalities, secondary effects of macro-economic growth or structural adjustment policies, urban migration, etc. Fundamental agro-ecological and socio-economic indicators therefore also need to be included in any causal analysis of a nutrition situation at national level. They are generally available from the major Ministries, particularly those in charge of planning.

Indicators for monitoring and evaluating nutrition programmes

On the basis of an up-to-date evaluation of the country’s nutrition situation and in view of the different causes of malnutrition identified at various levels, the role of nutrition policies is to set priorities, translate them into general goals, then into strategies and programmes, each with specific objectives.

Conceptual framework of causes of malnutrition and mortality

The above diagram, initially developed by the United Nations Children’s Fund (UNICEF, 1990), then endorsed by numerous experts and by international organizations, schematically illustrates the framework of sets of causes which generally underlie the conceptual analysis of ‘nutrition security’ (in other words, securing a good nutritional level by controlling the different causes of malnutrition, particularly underlying causes); this analysis must, however, be conducted in a more specific way for each local situation, and appropriate indicators must be chosen corresponding to each level of causes and to each sector.

Designing a programme consists of defining material and human resources to be mobilised, in what way, for what purpose, and how, ultimately, this will alter the initial situation. Monitoring these policies and programmes will therefore require three different types of evaluation, namely monitoring implementation of programmes, evaluation of programme impact, and keeping track of general trends in the nutritional situation.

a) monitoring implementation of programmes

This deals with the assessment of programme activities, in other words the extent to which operational objectives are met. Indeed, in order to make sure that the programme contributed to changing the situation, we must first know whether it was implemented according to plans. This assessment is based on indicators of programme implementation developed from the conception of the programme and monitored for partial or full achievement at each stage of the programme.

Programmes are composed of a series of operations, each with a specific goal. To each operation corresponds a set of indicators whereby the quantity or quality of the operation can be assessed.

Indicators of the implementation of an education programme

Under a programme to promote healthier life-styles and eating habits, a country has decided to implement activities to produce training material and to carry out educational campaigns. The implementation indicators that were adopted focused on the number and quality of educational materials produced, the number of training workshops held and teachers thus recruited, and the number of promotion campaigns carried out, associations set up and situation reports produced by those in charge throughout implementation of the programme, etc.

These indicators may concern the extent to which the target population is covered by the programme, the number of training sessions organized, the percentage of households who benefited from access to the various services set up for them, etc.

In general, these indicators are specific and easy to identify, if the activities to be accomplished, which they should reflect, have been correctly defined; they are completely dependent on the specific operational aspects of the programme and therefore cannot be defined independently, in advance, based on a general framework. Extensive use is therefore made of qualitative indicators inasmuch as the quality of activities is measured as well as their level of implementation. This type of assessment and the corresponding implementation indicators are outside the scope of this guide.

b) evaluation of programme impact

Indicators of outcomes and of impact are used here in order to measure the effectiveness of the programme - its ability to modify the situation at the beneficiary level - as well as any possible undesired effects, whether anticipated or not. These may be intermediate or short-term outcomes affecting beneficiaries’ capabilities, or the final longer term impact of the programme on the nutritional well-being of beneficiaries.

The evaluation of a programme is commonly based on a longitudinal comparison of indicators before and after implementation of the programme (before-after comparison). However, unless the programmes are highly specific and narrowly targeted, interpretation may be difficult, since factors other than those introduced or changed by the programme (known as confounding factors) may have varied at the same time and contributed to the apparent effect of the programme.

If conditions fluctuate over time (change of climatic conditions, food production varying from one year to another), if the measurements are carried out at very long intervals, or if the planned intervention is general in nature, attributing the effects observed to the programme alone becomes increasingly difficult.

Outcome and impact indicators

In the framework of a programme aimed at reducing the prevalence of undernutrition, analysis of the context revealed that diarrheal diseases were one of the main associated factors. A sub-programme was therefore set up to reduce the incidence of diarrheal diseases among young children. One of its components was the use of oral rehydration solution (ORS), and the other involved an information campaign on how to improve environmental hygiene.

One of the undesirable effects that the programme had to assess was the risk that the rehydration solutions would be prepared incorrectly or unhygienically. Indicators of outcome selected were the level of ORS use and the rate of incorrectly prepared ORS. Concerning improvements in environmental hygiene, the programme recorded indicators relating to: better knowledge of the relationship between environmental hygiene and diarrheal diseases, of ways of improving the environment and corresponding behavioural changes (cemented courtyards, containers with taps, use of soap, number of latrines built, etc). Changes achieved in terms of health status (reduction in the incidence of diarrhea per child per year, improvement in the nutritional status of young children) were selected as final impact indicators.

If the programme consists of scaling up an intervention that has proved effective elsewhere, at experimental level, the causal interpretation is simplified. If it is based on strong, but as yet unverified, hypotheses, it is more difficult to automatically attribute the observed effects to the intervention^[4]. Insofar, however, as indicators of different confounding factors likely to influence the situation were recorded before and after the implementation of the programme, statistical adjustments may be used during the analysis to improve interpretation - hence the importance of collecting these additional indicators.

A with-without comparison can then be made between two areas, one benefiting from the programme and the other not (external control group). This poses the problem of initial comparability of the two areas: here too, it will be useful to collect a number of indicators of level of risk to verify this comparability. Alternatively, two areas may be compared with an unequal level of implementation of the programme (internal control group) or, more simply still, groups of individuals or households may be compared which have not benefited from the programme to the same degree, since the level to which target individuals are reached by programmes is generally variable.

Ideally, the impact evaluation should follow an experimental design, with individuals or areas randomly assigned to receive the intervention or not. This is the most rigorous way to proceed in order to draw conclusions about the actual impact of the intervention. However, it is generally impossible to use such an experimental design, owing to the heterogeneity and size of the target population, the complexity of the project, ethical reasons, time or money constraints, or more simply because of high risks of ‘contamination’ by elements of the programme between areas which are very close.

In most cases, an evaluation of the crude effect, i.e. of whether the programme has fulfilled its goals, will be quite acceptable; those in charge of the programme can regard this as sufficient. Elements suggesting a cause and effect relationship can be put forward without seeking absolute proof, if the plausibility of the programme's effectiveness appears sufficient to those in charge. From this point of view, an evaluation based on repeated measurements will be more convincing than a before/after evaluation based on only two measurements.

Specific interventions versus general programmes

In 1992, Vietnam implemented a national strategy of supplementation with vitamin A capsules through health centres to combat xerophthalmia. Three years later, an evaluation recorded a very high coverage of the populations at risk by the programme and, in addition, did not observe any clinical case of xerophthalmia based on a nationally representative sample of pre-school children. In this case, there is little doubt that the result is directly linked to the programme, even if the evaluation cannot formally prove it: there do not appear to be any other factors that could have led to this result in such a short period of time and in such a specific field where spontaneous improvement is unlikely. Plausibility of the link is very strong here.

On the other hand, during the same period another country launched a programme to improve household food security, encompassing a certain number of measures such as the support to farm-gate prices for food crops and a reorganization of local markets on the basis of previously identified weaknesses. The evaluation of the programme after several years of operation showed a slight improvement in the situation. However, many other indicators had also progressed during the same period, following an improvement in the country’s general economic situation. Without a rigorous evaluation design, it is impossible to evaluate the relative share of improvement due to the programme or to other factors.

These elements will be useful each time it has to be decided whether the programme should be continued or not. A group of convergent elements based on the available indicators will be established in order to reach a conclusion on its likely effectiveness.

Often, for financial reasons, a programme cannot be implemented straight away in all the targeted areas; these will be incorporated into the programme gradually. However, the necessary indicators can usefully be collected in all the zones from the start, for this will provide elements for comparisons between zones with and without the programme and before and after the programme, which will in turn be useful to document the plausibility of effectiveness of the intervention. This will make it easier to evaluate the sustainability of the programme (by measuring the effect simultaneously in areas where the programme has been in operation for increasing durations).
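One simple way of combining the before-after and with-without comparisons is to subtract the change observed in a comparison zone from the change observed in the programme zone (often called a difference-in-differences calculation; the name and the sketch below are not taken from this guide). The prevalence figures are hypothetical, and the result is only meaningful on the assumption that the two zones would otherwise have followed similar trends.

```python
def difference_in_differences(prog_before, prog_after, comp_before, comp_after):
    """Change in the programme zone minus change in the comparison zone."""
    change_programme = prog_after - prog_before
    change_comparison = comp_after - comp_before
    return change_programme - change_comparison


# Hypothetical prevalence of underweight (%) measured in two zones.
programme_zone = {"before": 40.0, "after": 30.0}
comparison_zone = {"before": 41.0, "after": 37.0}

net_change = difference_in_differences(
    programme_zone["before"], programme_zone["after"],
    comparison_zone["before"], comparison_zone["after"],
)
print(f"Net change in prevalence: {net_change:+.1f} percentage points")
```

Such a figure documents plausibility rather than proof: confounding factors that differ between the zones are not removed by the subtraction.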

The purpose of an evaluation is not only to measure impact, but also to allow the programme to be adapted to changing conditions.

Reorienting programmes

An early warning system will be evaluated primarily on its ability to foresee any worsening in the consequences of food crises among the groups most at risk; it will thus comprise a number of indicators on the strategies implemented according to the degree of vulnerability, on the levels of food consumption and on the nutritional status of these groups, for example. However, it will also involve indicators to assess whether the situation is evolving towards greater stability (improvement of climatic conditions or of food production, for example) so that the primary objective of the programme can be refocused if the initial goal has become obsolete.

c) monitoring general trends in the nutrition situation

When evaluating programmes, a distinction is made in practice between impact which is the direct result of the programme, and longer term benefits, which encompass the indirect effects of the programme on the target population, or indeed the whole population, in terms of health, economic and social situation.

In the case of an isolated programme, attention may be focused on its specific impact, but in the context of overall monitoring of a policy or group of programmes, the impact of the complete set of strategies will be the subject of regular evaluation - which will aim not so much at providing evidence of the effectiveness of one or another programme, but rather at verifying whether the situation is evolving in the desired direction, taking into account external circumstances and the programmes in operation.

Apart from regular measurement of progress, this will also provide an opportunity to check that the conceptual analysis on which the choice of different strategies was based is still relevant, or to see whether activities need refocusing.

The aim is to examine changes in the situation in terms of the general objectives of the policy adopted, implying regular collection of a certain number of indicators of risk and of causes, as well as major basic indicators, to be used by country planners and by international agencies or donors, and assessment of trends. This corresponds to one of the nine strategies proposed in 1992 by the ICN Plan of Action - which has been taken up since then by a number of countries for their national action plan - that of "assessing, analysing and monitoring nutrition situations". This implies setting up a proper nutrition surveillance system applied to planning.^[5]

These national plans have explicit general goals with an order of magnitude for expected reductions in malnutrition levels or improvements in various sectors.

National plans of action: quantified objectives

As a result of its plan, Ecuador, like other countries, anticipates fulfilling the following objectives in terms of improvements in the nutritional status of the population:

reducing the current prevalence of low birth weight (<2500 g) by 50% in urban areas and by 30% in rural areas;
eliminating almost entirely severe underweight (weight-for-age <-3 Z-scores), reducing by 50% the current prevalence of moderate forms (weight-for-age between <-1 and -3 Z-scores) and reducing by 80% marginal to moderate forms in children under five years who are treated in outpatient nutrition rehabilitation centres.
reducing by 80% the current prevalence of nutritional anemia in pregnant women and children under two years attending public health services; keeping the prevalence of iodine deficiency disorders below 10%; virtually eliminating vitamin A deficiency in children under five years;
promoting and ensuring a sufficient intake of calcium in pregnant women attending pre-natal consultations and improving the attention given to the feeding and nutrition of hospitalised persons.

Objectives will be all the more explicit and realistic if there is a recent "baseline" and an idea of trends in the past or in neighbouring countries or in countries with similar constraints.

However, waiting for a complete baseline to be available would not be reasonable; one can start with existing data from the various services, or with rapid surveys carried out on a one-off basis when there are no data for a specific problem deemed to be important.

Yet implementing a policy must be an opportunity for also setting up a monitoring system - covering at least the main indicators of status and causes of malnutrition, which will be put in perspective with major agro-ecological and socio-economic indicators - in order to have an ongoing "log-book" of the situation and of time trends.

A nutritional "log-book"

After analysis, a country considers that the prevalence of low birthweight is too high and that the goal of reducing it implies (i) strengthening the performance of pre-natal health care services, (ii) promoting a better diet for mothers-to-be, either through better use of local food or the specific distribution of food supplements, and (iii) encouraging a reduction in the workload of pregnant women through various measures.

The precise actions to be undertaken and any precise quantification in terms of intermediate objectives depend of course on the specific country situation. Monitoring implementation of these actions will be based on a quantitative and qualitative assessment of the performance level of the units concerned (number of rations distributed or number of persons who have used the services, percentage of services which have given advice and care of adequate quality to pregnant women, quality of rations distributed, level of use of the advice and care by the beneficiaries, etc).

At programme evaluation, outcomes and impact indicators can be based on changes in the frequency of consumption of certain foods by the women attending the units, or on changes in average birth weight and prevalence of low birth weight in the target population.

General monitoring of demographic, health, food and nutritional changes in the population will allow an evaluation of the overall impact of the programme and of the need to continue it, adjust it or change it completely in order to have some chances of reaching the general objective of reducing prevalence of low birthweight by 50%, over a period of five years, for example, in the overall context of the country’s development.
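For a quantified objective of this kind, a monitoring "log-book" can compare each new measurement with the level expected if progress were steady. The sketch below assumes a constant relative reduction each year, which is only one possible trajectory; the function names and the baseline prevalence of 16% are hypothetical.

```python
def annual_reduction_rate(total_reduction, years):
    """Constant yearly relative reduction that yields the total reduction."""
    return 1.0 - (1.0 - total_reduction) ** (1.0 / years)


def expected_prevalence(baseline, total_reduction, years, elapsed):
    """Prevalence expected after `elapsed` years on a steady trajectory."""
    rate = annual_reduction_rate(total_reduction, years)
    return baseline * (1.0 - rate) ** elapsed


# Objective: reduce prevalence of low birthweight by 50% over five years.
rate = annual_reduction_rate(0.5, 5)
print(f"Required yearly reduction: {100 * rate:.1f}%")
for year in range(6):
    level = expected_prevalence(16.0, 0.5, 5, year)  # hypothetical baseline: 16%
    print(f"year {year}: expected prevalence {level:.1f}%")
```

A measured prevalence well above the expected level for a given year would suggest that the programme needs to be adjusted if the general objective is to be reached.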

Characteristics of indicators

Indicators do not all have the same value. In theory this depends on their ability to best reflect a sometimes complex reality, but a trade-off will have to be found given the level of difficulty in collecting them.

Therefore, indicators are traditionally defined according to a certain number of properties that allow their value to be assessed, at least in a given context. Obviously they do not all present all the characteristics of a good indicator, so that it will have to be decided which characteristics are to be given priority when selecting indicators.

Basic characteristics

· Validity is obviously the most important. It means that the indicator does indeed offer a true measurement, as direct as possible, of the phenomenon considered.

At conceptual level, it depends first of all on how clearly the phenomenon to be measured has been defined and also on the ability to measure it directly. A consensus among users as to what an indicator can or cannot be ‘made to say’ is essential. This poses a problem where the phenomenon to be measured is linked to a multidimensional concept, and is thus difficult to measure in a global way.

There must, in particular, be a consensus on the level and significance of cut-off points for classification. A major standardization effort has for example been made in the field of measuring nutritional status and recommended dietary intakes, and this has helped give a more precise framework for use of the corresponding indicators. This is not always the case in other sectors, either because the indicators lend themselves less to quantification, or because such quantification depends very much on local circumstances. Relevance in the context of planned use must, in this case, be based on a local analysis shared among the different stakeholders, as we will see below.

Moreover, even if the indicator correctly describes a phenomenon, any systematic bias in collecting the corresponding information due to measurement methods or instruments will affect its validity.

There is no overall indicator to provide a picture of "nutritional status"; a decision therefore has to be made on which specific aspect of nutritional status is to be characterized: energy status, protein status, iron status, vitamin A status, etc. Even in the case of energy status, for example, no overall indicator is available; the indicator which is the most relevant for the aspect one wishes to prioritise - physical, biochemical, functional, etc. - will therefore have to be selected. For assessing the nutritional situation of a population, a set of individual anthropometric measurements has been adopted which, when compared to reference values, makes it possible to assess the status of individuals or populations; they constitute the corpus of relevant indicators to be used in preference to any other. However, when using these indicators, one should be aware of limitations to their validity: they provide general information on nutritional status but do not represent all its aspects.

In the field of "food security" - again a very broad concept that is difficult to translate into simple terms - there is a considerable number of indicators, each reflecting a specific aspect and thus only relevant for that aspect. For example, in order to describe the level of food insecurity of a household, an indicator based on a quantitative criterion of food consumption or a qualitative criterion of the perception by the household of its own food insecurity situation will be more relevant than an indicator of prices of foodstuffs on the local market.

· Reproducibility corresponds to the indicator’s ability not to be influenced by the person or instrument measuring the data, so that the value obtained will be the same whatever the operator, the place or the measurement instrument. Imprecision due to measurement methods and variability from one day to another may limit the reproducibility of the indicator. This causes an increase in variance and implies that larger samples will be needed in order to assess correctly the level of the indicator and its variations over time.

Subjectivity bias is a frequent risk with indicators deriving from qualitative surveys, as they describe behaviours or opinions of households, for example, since the personality or technique of the person conducting the survey may influence the nature of responses. Moreover, respondents to a questionnaire or subjects under observation can modify their responses or behaviour in a normative way. People who are overweight, for example, often minimise their actual food intake when interviewed for a food consumption survey.

Reproducibility guarantees that an indicator can be measured at repeated intervals in a comparable manner - a quality which is crucial when using the indicator to assess and monitor the situation.
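The link between poor reproducibility and sample size can be illustrated with the usual formula for estimating a mean under simple random sampling, n = z² × variance / margin². The sketch assumes that measurement error is independent of the true value, so that its variance simply adds to the true variance; the function name and the numbers are hypothetical and are not taken from this guide.

```python
import math


def sample_size_for_mean(sd_true, sd_measurement, margin, z=1.96):
    """Sample size needed to estimate a mean within +/- margin (95% by default)."""
    # Assumption: measurement error is independent, so variances add.
    total_variance = sd_true ** 2 + sd_measurement ** 2
    return math.ceil(z ** 2 * total_variance / margin ** 2)


# Mean Z-score to be estimated within +/- 0.1, true standard deviation 1.0.
n_perfect = sample_size_for_mean(sd_true=1.0, sd_measurement=0.0, margin=0.1)
n_noisy = sample_size_for_mean(sd_true=1.0, sd_measurement=0.5, margin=0.1)
print(f"no measurement error: n = {n_perfect}")
print(f"measurement error with SD 0.5: n = {n_noisy}")
```

Under these assumptions, a measurement error half as large as the true variation raises the required sample by about a quarter.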

· Sensitivity refers to the ability to identify all the individuals or groups affected by a given risk or characteristic. A complementary characteristic is specificity, which refers to the ability to identify those not affected by the risk or characteristic.

Sensitivity is measured in practice by the ratio of the number of individuals correctly identified by the indicator as being at risk (or as having the characteristic) to the number of individuals who are actually at risk (or have the characteristic). Specificity is the ratio of the number of individuals correctly identified by the indicator as not being at risk to the number of individuals who are actually not at risk (or do not possess the characteristic).

Sensitivity thus gives an idea of the degree of correct or misclassification linked to the use of an indicator. Not all indicators lend themselves to an assessment of sensitivity. Sensitivity applies essentially to indicators with cut-off values. A reference indicator (‘gold standard’) is needed to assess it.

For example, one talks about the sensitivity of anthropometric indicators for screening wasted children or thin adults (weight-for-height <-2 Z-scores or body mass index <18.5 kg/m², respectively) or about the sensitivity of a socio-economic indicator in distinguishing those most at risk of food insecurity (like a wage < 3rd tercile of the distribution of wages in the region), etc.

	RISK OR CHARACTERISTIC:
	PRESENT	ABSENT	CALCULATION:
indicator +	a	b	sensitivity = a/(a+c)
indicator -	c	d	specificity = d/(b+d)
Indicator + or -: the value of the indicator is above or below the cut-off value set to define the risk.
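The two ratios in the table can be computed directly from the four cells. The following is a minimal sketch in Python; the function name and the counts are hypothetical and serve only to illustrate the arithmetic.

```python
def sensitivity_specificity(a, b, c, d):
    """Return (sensitivity, specificity) from the cells of the 2x2 table.

    a: indicator +, risk present    b: indicator +, risk absent
    c: indicator -, risk present    d: indicator -, risk absent
    """
    sensitivity = a / (a + c)
    specificity = d / (b + d)
    return sensitivity, specificity


# Hypothetical counts, not taken from any survey.
se, sp = sensitivity_specificity(a=40, b=30, c=10, d=120)
print(f"sensitivity = {se:.2f}, specificity = {sp:.2f}")
```

With these made-up counts, 40 of the 50 individuals at risk and 120 of the 150 individuals not at risk are classified correctly, so both ratios come to 0.80.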

The sensitivity of an anthropometric indicator such as BMI (body mass index, or weight/height²) to detect adults who are actually thin varies in accordance with the cut-off value adopted: the higher the value, the better the sensitivity (although specificity, by contrast, will diminish). Moreover, sensitivity is measured with respect to a given goal; the sensitivity of an indicator such as weight-for-height at a given cut-off value will not be the same, depending on whether the goal is to identify children who are wasted or those who are at risk of dying in coming months.
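The trade-off between sensitivity and specificity as the cut-off moves can be illustrated with a small sketch. All BMI values below are invented, and the "thin" status is assumed to come from some reference method; only the direction of the effect matters here.

```python
# Hypothetical BMI values (kg/m2); "thin" status is assumed to come
# from a reference ("gold standard") method.
THIN_BMIS = [15.8, 16.5, 17.2, 17.9, 18.3, 18.8, 19.4]
NOT_THIN_BMIS = [17.6, 18.2, 19.0, 19.7, 20.5, 21.3, 22.0, 23.4, 24.1, 25.0]


def rates_at_cutoff(cutoff):
    """Indicator is positive when BMI < cutoff; return (sensitivity, specificity)."""
    true_pos = sum(1 for bmi in THIN_BMIS if bmi < cutoff)
    true_neg = sum(1 for bmi in NOT_THIN_BMIS if bmi >= cutoff)
    return true_pos / len(THIN_BMIS), true_neg / len(NOT_THIN_BMIS)


for cutoff in (17.0, 18.5, 20.0):
    se, sp = rates_at_cutoff(cutoff)
    print(f"cut-off {cutoff:.1f}: sensitivity = {se:.2f}, specificity = {sp:.2f}")
```

On this invented data set, raising the cut-off from 17.0 to 20.0 captures more of the thin adults while misclassifying more of the others, which is the pattern described above.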

Furthermore, performance, i.e., the ability to detect a significant percentage of malnourished individuals for a given level of sensitivity of the indicator, depends on the prevalence of undernutrition in the population.
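One common way of expressing this dependence on prevalence is the positive predictive value, i.e. the share of individuals flagged by the indicator who are actually malnourished. The sketch below assumes that reading of "performance"; the sensitivity, specificity and prevalence figures are hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of indicator-positive individuals who are truly affected."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Same hypothetical indicator (sensitivity = specificity = 0.80)
# applied to populations with different prevalences of undernutrition.
for prevalence in (0.05, 0.20, 0.40):
    ppv = positive_predictive_value(0.80, 0.80, prevalence)
    print(f"prevalence {prevalence:.0%}: predictive value = {ppv:.2f}")
```

For a fixed sensitivity and specificity, the proportion of correctly flagged individuals rises with prevalence, so the same indicator performs quite differently from one population to another.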

Data for quick computation of these parameters (sensitivity, specificity) are not always available, so in practice, reference is made to existing data from the literature to find those closest to the chosen cut-off values and expected prevalences.

One particular aspect of sensitivity is the ability of an indicator to measure change: the aim is no longer to identify or target a particular category of individuals, as above, but to detect the smallest possible significant change in the phenomenon described. While sensitivity in general is important when establishing a baseline and when defining the target groups to which activities will be directed, this ability to measure change is crucial for assessing or monitoring trends, in particular for detecting changes in the situation during implementation of the programme.

Sensitivity to change

Mid-upper arm circumference of young children is sometimes used in place of weight-for-height as an indicator of a population’s nutritional status since it is faster to measure, easier to interpret and, although less precise, has a similar sensitivity in this descriptive framework. However, it is relatively inert when assessing small progressive changes in nutritional status over time, and the weight-for-height indicator will be preferred in this case, since it is more sensitive to change. Similarly, urinary iodine will respond to the introduction of salt iodization in a region more quickly than the prevalence of goitre, which will decline only slowly.

In addition to these inherent characteristics of indicators, their operational value should be examined; it will be essential when the choice of indicators is made, especially in terms of speed and cost of collecting data for producing these indicators.

Operational characteristics

· Availability should be considered first of all. It represents the practical possibility of making the indicator in question available. It implies the feasibility of collecting the corresponding data by whatever means. There are indicators described as "ideal" which nobody is in practice able to collect. As a result of major international conferences and of programmes that have followed them during the last two decades, many of the required indicators are already systematically and regularly collected within the framework of such programmes and are thus very easily available.

· Dependability focuses on the quality of sources of information, in other words on the accuracy of the data and their representativeness (sampling) in terms of the target population. It affects use of the indicator not only at the descriptive stage, but also when monitoring the situation. An indication of the quality of the measurements, of sampling and of the confidence interval of the result is essential here to assess dependability.

Occasionally, it has been observed that the number of malnourished children estimated by nutritional surveys carried out by various organizations on identical populations and during the same periods, differed substantially; using the results for targeting purposes or for monitoring the situation is ruled out in this case. The reason was usually the lack of precision of the anthropometric measurements or of the definition of age, and occasionally a sampling problem.

Data on food consumption obtained by weighing food are more precise than those obtained with the "recall" technique, although the former implies technical constraints and can therefore only apply to small samples, so that there is a broad confidence interval in the results. Recall techniques, by contrast, can easily be applied to a large sample, with a correspondingly smaller confidence interval. The various available data must therefore be carefully examined before using them for monitoring purposes, and a choice will sometimes have to be made between data collected with a higher level of accuracy but lower power at the level of the target population, or the opposite.
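The effect of sample size on the confidence interval can be sketched with the usual normal approximation for a sample mean. The standard deviations and sample sizes below are hypothetical, and simple random sampling is assumed; real surveys with cluster designs would need a design effect on top of this.

```python
import math


def ci_half_width(std_dev, n, z=1.96):
    """Approximate 95% confidence-interval half-width for a sample mean."""
    return z * std_dev / math.sqrt(n)


# Hypothetical energy-intake surveys (kcal/day):
# weighed records are assumed less variable but feasible only on a small sample,
# recall data are assumed noisier but collected on a large sample.
weighed = ci_half_width(std_dev=300.0, n=50)
recall = ci_half_width(std_dev=450.0, n=1000)
print(f"weighed: +/- {weighed:.0f} kcal, recall: +/- {recall:.0f} kcal")
```

Under these assumptions the larger recall sample gives the narrower interval despite its noisier individual measurements, which is the trade-off between accuracy and power mentioned above.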

· Simplicity of collecting the data needed to obtain the indicator should also be considered. On this depends, in part, the speed and frequency with which the indicator can be regularly measured.

· The problem of cost does not really arise if the corresponding data are routinely available from a service. When the data necessary for the construction of the indicator need to be collected specifically for evaluation or monitoring, cost should be considered; it depends on the difficulty and sophistication of the measurements, the accessibility of the objects or people to be measured, the frequency of collection and the complexity of the subsequent analysis. The relevance of an indicator will need to be considered before deciding on the regular collection of associated data and accepting the related cost; however, there is also a cost of ‘non-collection’ for the programme.

The cost of ‘non-collection’: a neglected aspect

In the case of a food subsidy programme, for example, the cost of non-collection may be measured as the difference between two figures: the cost of the programme carried out without particular targeting, because no indicator is available to allow it, and the cost of the programme directed at a high-risk group only, plus the cost of targeting.
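This difference is simple arithmetic and can be written out as follows. The function name and the monetary figures are hypothetical and are given only to make the comparison concrete.

```python
def cost_of_non_collection(untargeted_cost, targeted_cost, targeting_cost):
    """Extra cost borne when no indicator is available to allow targeting."""
    return untargeted_cost - (targeted_cost + targeting_cost)


# Hypothetical food subsidy programme (arbitrary currency units).
extra = cost_of_non_collection(
    untargeted_cost=1_000_000,  # programme delivered to the whole population
    targeted_cost=400_000,      # programme delivered to the high-risk group only
    targeting_cost=50_000,      # collecting and using the targeting indicator
)
print(f"cost of non-collection: {extra}")
```

A positive result means that collecting the indicator would pay for itself; a negative one means that targeting would cost more than it saves in this particular case.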

Nevertheless, information on the cost of collecting an indicator for each situation is seldom available. It is difficult to measure, and estimates are generally based on the cost of different types of survey within the country, taking account of the fact that several indicators are collected at the same time.

Sources of information

Indicators can be categorized schematically in the following way according to the level at which they are produced or made available:

Indicators available centrally

· These centrally available indicators may firstly be data collected routinely on an ongoing basis by the various administrative or technical departments for their own use or within the framework of agricultural or health information systems. These indicators are constructed on the basis of data routinely transmitted to this level; they are often representative at national level, but are only slightly disaggregated - generally at best by region, by urban/rural sector, or by gender (food production at the Ministry of Agriculture, food imports at the Ministry of Trade, distribution and level of wages at the Ministry of Employment, figures of mortality by causes at the Ministry of Health, etc.).

They include both indicators regarding the implementation of services as well as indicators regarding the situation or the impact of actions under way. It is generally easy to obtain them from the departments concerned, which usually have time series that are very useful in distinguishing medium- and long-term trends. Even so, it is not always possible to cross-tabulate these indicators, since they do not necessarily come from the same databases and are accessible only in a relatively aggregated form. It is also difficult to verify the quality of the original data. Lastly, even if the data are collected on a frequent basis (monthly reports, for example), recovery and analysis may take too long.

· They may be data collected periodically, either exhaustively or over a broad representative sample (for example, population census, national surveys on the nutritional situation, national household income and expenditure surveys, etc.).

Such data tend not to be immediately accessible except in summary form, although it is easy to organize new analyses with the departments in charge of them. These data allow statistical cross-tabulation to be made between the many variables collected simultaneously on the sample. Although the surveys are carried out at best at very long intervals, the data can be updated with reasonable projections, especially if information on trends in the fields of interest, based on routinely collected data, is also available. These data are often kept together in national statistical offices.

· There are also specific information systems for food and nutrition, such as early warning systems to forecast and monitor shortages (see Eele, 1994; Chopak, 2000; Djaby et al., 2000; FAO/FIVIMS and FAO, 2000), or nutritional and food surveillance systems with a view to longer term planning (Soekirman & Karyadi, 1995). They consist of a regular collection of information based on a small number of selected indicators. The systems vary by country; those that perform best are based on an explicit conceptual framework and are linked to a clear decision-making mechanism. They can represent a sound basis for central monitoring.

A particular category is derived from surveys conducted by international bodies for various purposes: Demographic and Health Surveys (DHS, ORC Macro), surveys of household living standards (LSMS/World Bank) and surveys of the social dimensions of adjustment (SDA, World Bank; cf. Delaine et al., 1992), and monitoring of the World Summit for Children (Multiple Indicators Cluster Survey, MICS/UNICEF), etc. These cross-sectional surveys are conducted directly at household level on samples which are representative at national level but of variable size; they include a wide variety of indicators (in number, goals and qualities) and are now frequently repeated. Although conducted peripherally, they are generally available and used centrally. These sources, which are in principle fairly reliable, benefit from an advanced level of analysis that allows inferences to be drawn about causal relationships among various household indicators, and between these and individual indicators such as nutritional status. They represent a precious source when establishing a baseline and when analysing causes prior to launching an intervention.

Indicators available at intermediate level

These are constructed primarily on the basis of routinely collected data (from local government offices, community-based authorities). They are usually passed on as indicators or raw data to the central level, and then sent back to the decentralized levels, with varying degrees of regularity, after analysis. They are often disaggregated by district or locality, but are not always representative, since they often refer only to users of the services under consideration. They are generally grouped together at the central administrations of regions or administrative centres.

The indicators relate primarily to activities that lend themselves to regular observation, either because they record activities (indicators of operation or delivery of services) or because they are necessary for decision-making (crop forecasts, unemployment rates) or for monitoring purposes (market prices of staples, number of cases of diseases, etc.). They do not necessarily include indicators of the causes of the phenomena recorded and are not in principle qualitative indicators.

Indicators collected at decentralized levels should meet both the needs of users on these levels and also those of users on the central level for the implementation and monitoring of programmes. If these regularly compiled indicators do not have any real use at the local level and are intended only for the national central level, there is a danger that their quality will drop over time, for lack of sufficient motivation of those responsible for collection and transmission - and gaps are therefore often found in available data sets. Nevertheless, they are invaluable in giving a clear picture of the situation on the regional or district level, together with medium-term trends. Generally speaking, their limitation is the low level of integration of data from different sectors.

Indicators available only at peripheral level

A certain number of indicators, particularly those concerning the life of communities or households and not touching on the activities of the various government departments, are not routinely collected by such departments and are in any case not handed on to the regional or central offices. They are sometimes collected at irregular intervals by local authorities, but most often by non-governmental organizations for specific purposes connected with their spheres of activity - health, hygiene, welfare, agricultural extension, etc.

Analytical capabilities are often lacking at this level, and the available raw data may not have led to the production of useful indicators. Action therefore should be taken to enhance analytical capacities or else sample surveys will have to be carried out periodically on these data in order to produce indicators. A sound knowledge of local records and their quality is needed to avoid wasting time.

New collection procedures often have to be introduced for use by local units, while being careful not to overload them or divert them from their own work. Otherwise a specific collection has to be carried out by surveying village communities targeted for analysis or intervention. These surveys are vital for a knowledge of the situation and behaviours of individuals and households and an evaluation of their relationship with the policies introduced. In general, they offer an integrated view of the issues concerned.

They may have the aim of supplying elements concerning the local situation and local analysis, in order to confirm the consensus of the population and of those in charge as to the situation and interventions to be carried out, and also to allow an evaluation of the impact of such interventions. The participatory aspect should be emphasized rather than the precision or sophistication of data. An FAO work on participatory projects illustrates issues of evaluation, and especially the choice of indicators in the context of such projects (FAO 1994).

If data already collected are used or if a new survey is carried out for use on a higher level, the size and representativeness of the sample must be checked, and it must be ensured that the data can be linked to a more general set on the basis of common indicators collected under the same conditions (method, period, etc.) and that they will allow regular monitoring (feasibility of collection, regular transmission of data). Verification of the quality of the data is crucial.

Before undertaking a specific data collection, a list of indicators (and of corresponding raw data) should be developed which can be used by services at all levels; it is not unusual to find that surveys could have been avoided by a better knowledge of the data available from different sources. To track down these useful sources and judge the quality of the data available and their level of aggregation, a good understanding is needed of the goals and procedures of the underlying information system.

Analyzing existing information systems

We can take the example of an information system on food production used in Brazil for many years (Von Braun & Puetz, 1993). The country had set up a monthly national information system on production estimates for 35 crops, covering information on crop intentions, areas actually planted, crop yields and quantities harvested in each state.

The information was obtained during monthly meetings of experts at various levels - local, regional and national. Participants at the local level could be agricultural extension workers, bank representatives, heads of cooperatives and farmers’ associations, and the sellers of farm inputs or the purchasers of farm produce. Account could equally well be taken of precise data such as areas under cultivation financed by the agricultural credit system or sales figures for seed, but in a certain number of cases the elements taken into account were the participants’ experience or their field observations.

The information was then put together at the state level, and then at the national level, reviewed by a national committee of experts, and sent on to the central statistics office. The different levels thus had some rich information at their disposal, coming from a range of local-level sources. Although it was certainly fairly reliable, being confirmed by a large number of stakeholders and experts, its precision could not be defined, in view of its diversity.

The usefulness of such data varies depending on information needs and thus on the quality of the data required. Data concentrated at the central level are probably useful primarily for analysing trends. On the other hand, apart from the figures, more general information on production systems exists at local level, and this can be useful for identifying relevant indicators of causes, or for simplifying monitoring of the situation.

Choosing indicators

We have seen that there is a great number of indicators which differ widely in quality; the availability of corresponding data is variable, and any active collection will be subject to constraints. Therefore the choice of indicators must be restricted to the real needs of decision makers or programme planners. This implies that a method is needed for guiding the choice.

The main elements that will guide choice are: (i) the use of a reference conceptual framework of the programme that links the situation, lines of action and expected impact; (ii) the availability of a "baseline"; and (iii) the required characteristics of the indicators.

The need for a reference conceptual framework

Any intervention is based on an analysis of the situation, an understanding of the factors that determine this situation, and the formulation of hypotheses regarding programmes able to improve the situation. A general framework was presented earlier (see Figure), representing a holistic model of causes of malnutrition and mortality, which was endorsed by most international organizations and nutrition planners. However, the convenient classification that it implies, for instance into levels of immediate, underlying or basic causes, needs to be operationalized through further elaboration in context.

The benefit of constructing such a framework, over and above the complete review of the chain of events which determine the nutritional situation, is to allow the expression, in measurable terms, of general concepts which, because of their complexity, are not always well defined. For example, it is not enough to refer to "food security"; one should state which of the existing definitions is to be used, on which dimensions of food security the focus is placed and the corresponding indicators.

The use of conceptual frameworks when implementing programmes or planning food and nutrition is not new. Many examples have been developed, focusing on different aspects. The most traditional approach is that of placing the food supply chain on a flowchart, highlighting the necessary information and indicators corresponding to each stage (FAO, UNICEF & WHO, 1976); there are many examples of this (FAO 1996; Von Braun & Puetz 1993; Maxwell & Frankenberger 1992; FAO 1984a, 1984b, 1985).

The concept of food security is generally perceived as that of sufficient availability of food for all. However, several dozen different definitions have been proposed over these last 15 years! This concept may, for example, comprise different aspects depending on the level considered: overall satisfaction of needs at country level or the actual satisfaction of needs among all the individuals in the community; similarly, among individuals, reference may be made to the concept of adequate ‘availability’ or to that of ‘access’ for all to food resources. In the first case, analysis will focus on agricultural production, and in the second the emphasis will be on improving the resources of those who lack access to a correct diet.

This preliminary brainstorming exercise will allow a better definition of the perceived chain of causes (production shortfall, excessive market prices, defective marketing infrastructures, low minimum wage, low level of education, etc.) as well as the programmes needed to remedy the situation. It will then be easier to consider potential indicators of the situation and its causes, or potential indicators of programme impact.

Obviously it is not so much the final diagram which is of importance as the process through which it was developed. Insofar as the relations between all the links of the chain of events (or flow data, depending on the type of representation) have been discussed step by step and argued with supporting facts, the framework will be adapted to the local situation and will become operational.

Methodologies have been developed for making this process effective in the context of planning, for example with the method of "planning by objectives" (see ZOPP), which comprises several phases: analysis of the issues on the basis of previous studies, in order to obtain a clear picture of the original context before any programme; identification of possible interventions; definition of the more specific objectives of programmes; and, lastly, final development of an overall logical framework that will be used as a reference "model" by all stakeholders. During this planning process, all programme activities, corresponding partners, necessary inputs and resulting outputs as well as indicators for both monitoring implementation and evaluating impact of the programme will be successively identified. The method acts as a guide for team work, encouraging intersectoral analysis and offering a simplified picture of the situation, so that the results of discussions are clear to all in the team.

Let us again take the example of a problem of food security. It can be broken down into three determining sectors: that of food production, that of the processing and sale of produce, and that of food consumption. A series of structural elements can be defined for each sector: for production, for instance, one should consider capital in terms of natural resources, land-tenure structure, farming techniques, level of training of farm workers, etc. These elements affect both production levels and operation of markets. A certain number of macro-economic or specific policies will affect one or all the elements in this block. Each block can be considered in a similar way, and this will provide the groundwork for a theoretical model of how the system works (see C. Mueller in Von Braun & Puetz, 1993).

The final steps in order to operationalize the model are (i) that of defining indicators that will, in the specific context of the country, reflect the key elements of the system, and (ii), once policies and programmes have been chosen, that of identifying which of these indicators are useful for monitoring trends and evaluating programme impact. This will be the basis for an information system reflecting the overall framework of the programme and how it should work.

Another method has been proposed by researchers from the Institute of Tropical Medicine in Antwerp based on their field experience in collaboration with different partners (Lefèvre et al., 2001). Basically, it stresses the participatory aspect, with the aim of obtaining a true consensus on the local situation, the rationality of interventions in view of the situation, and the choice of indicators.

It includes first a phase in which a causal framework is developed with the aim of providing an understanding of the mechanisms leading to undernutrition in the context under consideration. The framework is constructed in the form of a schematic, hierarchized diagram of causal hypotheses formulated after discussions among all stakeholders. The way it is built tends to favour a clear, "vertical" visualization of series of causal relationships, eliminating the lateral links or loops that are often the source of confusion in other representations.

In a second phase, a framework is developed linking the human or material resources available at the onset (inputs), the procedures envisaged (activities), the corresponding results of implementation (outputs), and the anticipated intermediate outcomes or final impact of each activity or of the programme. This tool is very useful for defining all the necessary indicators.

Finally, the cohesion of all the procedures defined is conveyed by a ‘dynamic model’ intended to visualise the organization of the hypotheses on which the programmes are based and to highlight the convergent programme elements which make it possible to forecast a positive impact. This represents the formalisation of a real conceptual scheme.

Specific and dynamic models for each situation

While many representations of conceptual models comprise comparable elements, it is essential that a model should never be considered as directly transposable, since it must absolutely apply to the local context. A direct transposition would therefore be totally counter-productive. While it is obvious that the conceptual analysis must ideally be carried out before the programmes are launched, it can be done or updated at any time, leading to greater coherence and a consensus on current and anticipated actions; this applies even more in a long-term perspective of sustainability.

In operational terms, establishing a conceptual framework makes it possible to define in a coherent way the various types of indicators to be used at each level. After defining the activities to be undertaken, status indicators referring to the target group will be identified, as well as indicators of causes that will or will not be modified by these activities, and indicators that will reflect the level or quality of the activities performed. Lastly, indicators will be chosen to reflect the changes obtained, whether or not these are a result of the programme. Identification of precise objectives makes it possible to monitor changes in impact indicators not only vis-à-vis the original situation but also in terms of fulfilment of the objectives adopted.

During this initial phase, existing indicators are assessed, as well as those that will be taken from records or collected through specific surveys. It should be specified who needs this information, as well as who collects the data. In fact, it is important that this choice should be demand-driven, in order to be sure that the information selected is then actually used. One might be dealing with several groups of users who do not exactly have the same needs: political leaders and their advisers, officials at different decision-making levels, including at province and district level, local administrative authorities, donors, academics, etc. In this way, foundations can be laid for an information system essential for monitoring and evaluation.

Required characteristics of indicators

· Validity is the first characteristic to which attention should be paid. Quite often the ‘ideal’ indicator from this point of view is not available or is difficult to collect. A proximate, often indirect, indicator will then have to be sought, and the limitations to its validity in the context considered, which depend on the precise objective, will have to be verified carefully. For example, can a measurement of food stocks at a given moment be validly replaced in the context under consideration with a measurement of food consumption in order to assess the food insecurity situation of a target group? Is a measurement of food diversity a good proximate indicator for micronutrient intake? Does it at least consistently classify consumers into strong and weak consumers? Does it make it possible to define an acceptable level of consumption vis-à-vis recommendations? Will it allow children to be classified correctly vis-à-vis a goal of improved growth?

Validity studies are sometimes available locally, otherwise specific studies can be carried out; hence the usefulness of collaborating with research groups - for example from universities - who will be able to carry out this type of validation study under good conditions. A recent work published by IFPRI (Chung et al. 1997) offers a practical and very detailed illustration of the issues.

The relationship between two variables, making them interchangeable for defining an indicator, may vary over time as a result of implementation of a programme, and this must be taken into account. For example, if there is a clear link between family size and food insecurity in a given context, the criterion of family size can simply be taken as a basis for identifying families at risk. However, if a specific programme has been successfully carried out among these families, this indicator could lose its validity.

· Aspects of comparability have to be taken into account. The ideal would be to use the same indicators in all places and at all times in order to have the benefit of common experience regarding collection and analysis, so that direct comparisons can be made. In practice, however, concepts of indicators evolve steadily with the progress of knowledge, leading to the dilemma of being unable to carry out comparisons either with older series of indicators or with what is being done elsewhere. Comparability over time is obviously a priority in the case of monitoring. Preference will thus be given to indicators that, although not necessarily identical, are comparable, in other words give a similar type of information. The issue of the comparability of data from different sources has been the subject of studies especially in the field of health indicators.^[6]

Whenever traditional indicators seem inadequate or insufficient in capturing the phenomenon or situation under consideration, the value of "innovative" and potentially promising indicators with excellent basic characteristics should not be neglected - although it is important to make sure that they have been validated for circumstances similar to those under study. Since such innovative indicators usually have to be collected "actively", especially at the community level, the decision often depends on their technical feasibility as a guarantee of the sustainability of collection.

· Dynamic rather than static indicators are sought, in other words, indicators that are sensitive to change and capable of recording phenomena which are likely to evolve more or less rapidly as a result of socio-economic change or intervention programmes - particularly if they are predictive in nature.

In a context of dietary transition, an indicator expressing the structure of food consumption (for example the percent of energy from fat) is more subject to major changes than the average consumption level expressed in calories, while also providing important information on the future health of the population considered. In contrast, data on food habits tend not to change rapidly, unless an education programme is specifically developed for this purpose; the repeated collection of the corresponding indicators is thus of little use for purposes of short- or medium-term monitoring of the situation.

· Finally, operational qualities, particularly simplicity and low cost of collection, largely determine what choice is made. Slowness in collection and in getting the data back to user level is a key factor to be considered, for many information systems are paralyzed by this problem, while timely information is often needed for decision-making or for adjusting the programme or the intervention (as is the case with early warning systems).

From this point of view, the nature of potential sources of data for these indicators or the direct availability of these indicators at the level where they are needed can be decisive for their selection.

Usefulness of a baseline

In practice, data collected to produce indicators need to be compared to a reference or to a "cut-off value". These can be based on an international consensus within the scientific community or the political world, thus avoiding disagreement on interpretation and allowing comparisons between countries and regional extrapolations. Even so, the information is still sometimes insufficient; moreover, there are no international references for several categories of indicators. In such cases, the value of the same variable at a previous date will be taken as a point of reference. Interpretation of changes in an indicator can be carried out only on the basis of our knowledge of the original situation; knowing a baseline therefore forms part of the information value of a number of indicators.

Prevalence of wasting (weight-for-height <-2 Z-scores) among young children in a certain context provides only an imperfect assessment of the situation. For instance, was it better or worse before? The only information it supplies as such is the difference from a reference situation in a country without any major problem of undernutrition (defined as a prevalence of 2.27%). The impact of a programme cannot be measured without knowledge of the situation at baseline.
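The reference prevalence quoted above appears to correspond to the share of a normally distributed reference population lying below -2 Z-scores. A minimal sketch of that calculation follows; the observed prevalence is a hypothetical figure used only to show the comparison.

```python
import math


def normal_cdf(z):
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Share of a normally distributed reference population below -2 Z-scores.
expected_pct = 100.0 * normal_cdf(-2.0)

# Hypothetical observed prevalence of wasting (%), for illustration only.
observed_pct = 9.0
excess_pct = observed_pct - expected_pct

print(f"Expected below -2 Z in the reference: {expected_pct:.3f}%")
print(f"Excess over the reference: {excess_pct:.1f} percentage points")
```

The excess over the reference describes the distance from a population without a major problem; it says nothing about whether the situation was better or worse before, which still requires a baseline.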

The existence of chronological series for an indicator will be considered when choosing among several indicators, because such series allow a rapid interpretation of impact in terms of trends.

When previous data are old, an effort is made to assess their present level by projection, as is usually done for major demographic or economic indicators.

In a certain number of cases, a preliminary survey is needed in order to establish the present level of various indicators. Many countries undertook national surveys of their nutritional situation prior to establishing their policies and programmes, so that they could decide on the type or scope of the programme, and could subsequently evaluate the impact. Such surveys are not cheap, but their cost must be examined in regard to that of the programme to be developed, and of the potential cost linked to the lack of evaluation of a programme that fails to yield the expected results.

Collection and analysis

Collection methods

When passive collection of data from existing sources does not provide the necessary indicators in an appropriate form, active collection should be considered through surveys among the population with an appropriate level of disaggregation. This may also be needed when the administrative coverage of the population, particularly of groups at risk, is insufficient.

Types of surveys

Firstly, it is important to consider that the preferred level of expression of the indicators varies by discipline (individuals for the expression of epidemiological risks, households for the level of food security, administrative units for an economist, etc.). The statistical units of measurement vary accordingly.

Conversely, indicators can be expressed on the same scale while data are being measured based on different statistical units; for example, food consumption data expressed in kilocalories/person/day may be constructed from national data divided by the number of inhabitants of the country as well as from data measured among households and divided by the number of individuals in the household, or assessed at individual level. These three expressions of the same situation cannot be treated in the same way statistically. Data that have been collected at different levels must be analyzed accordingly.
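The three routes to a figure in kilocalories/person/day can be sketched as follows. All numbers and function names are hypothetical and serve only to show that an identical value on the same scale can rest on different statistical units.

```python
import statistics


def national_kcal_per_capita(total_kcal_per_year, population, days=365):
    """National supply divided by the number of inhabitants and days."""
    return total_kcal_per_year / population / days


def household_kcal_per_capita(household_kcal_per_day, n_members):
    """Household consumption divided by the number of household members."""
    return household_kcal_per_day / n_members


# Hypothetical national data: one figure for the whole country.
national = national_kcal_per_capita(8.03e12, 10_000_000)

# Hypothetical household measurement: one figure per household.
household = household_kcal_per_capita(11_000, 5)

# Hypothetical individual intakes within that same household (kcal/day).
individuals = [3000, 2600, 2200, 1800, 1400]
individual_mean = statistics.mean(individuals)
individual_sd = statistics.pstdev(individuals)

print(f"National: {national:.0f}, household: {household:.0f}, "
      f"individual mean: {individual_mean:.0f} (SD {individual_sd:.0f})")
```

All three results equal 2200 kcal/person/day here, yet only the individual-level data carry information on variation between persons; the national and household figures cannot support the same statistical treatment.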

Depending on the type of indicator required, quantitative or qualitative survey techniques will be used, each based on specific methodologies.

· Quantitative surveys follow a certain number of precise collection rules in each of the sectors considered. A good understanding of the limitations of the data thus collected in terms of their interpretation, representativeness, accuracy and precision is crucial. Well-known guides written by specialists in each sphere are generally available.^[7] Issues of representativeness and confidence intervals in situations where there is no sampling frame are quite well codified (stratification, cluster sampling, etc.), and these aspects will not be detailed here.

For the collection of data on the nutritional status of a population, for example, the WHO and FAO have published guides describing the procedures to be followed for sampling, collecting and interpreting anthropometric measurements in the context of cross-sectional surveys (WHO, 1983; FAO, 1992). The LSMS project showed how to assess the quality of the data thus collected (Kostermans, 1994). There is also a guide for the main types of surveys on food consumption (Cameron and van Staveren, 1988) and publications on household food security indicators and how to measure them (Maxwell and Frankenberger, 1992; Delaine et al., 1992). Appropriate methods have also been developed in the fields of demographics, health (WHO, 1981) and economics, in order to establish rough indicators when most of the usual sources are lacking.

· Community surveys based on a qualitative approach can be useful in order to collect information on indicators that do not lend themselves easily to quantification but are nonetheless useful for an overall analysis. These qualitative methods, developed and commonly used in the social sciences, especially anthropology, are now widely used in economics and agronomy (Chambers, 1992) in combination with more traditional quantitative surveys, but those working in the food and nutrition sector are not always familiar with them.

These methods, difficult to use in their original form, have been simplified and various ‘rapid’ survey methodologies have been developed: RAP (‘rapid assessment procedures’), PRA (‘participatory rural appraisal’) or RRA (‘rapid rural appraisal’). The latter type, for example, can focus on food preferences, modes of food storage or complementary feeding practices, and is especially good at highlighting regional differences. On the other hand, the PRA methods are more suited to investigations at village level, inasmuch as they allow detailed description of perceptions and attitudes within a community, but they serve above all to reinforce the community’s capacities for analysis and action.

A description of these methodologies, adapted to different uses, can be found in various publications (Maxwell and Frankenberger, 1992; Chambers, 1992; Den Hartog and van Staveren, 1985; Kidima, Scrimshaw and Hurtado, 1990). Examples of application and comments on limits of interpretation also appear in the work by IFPRI already cited (Von Braun and Puetz, 1993) and in Scrimshaw and Gleason (1992). Finally, a recent study presents an analysis of a substantial number of experiences in various fields (Cornwall and Pratt, 2003).

These surveys are based on observations or interviews, either open or structured and of varying lengths, concerning beliefs, perceptions, knowledge, behaviours or practices of individuals or social groups, with varying degrees of precision, triangulation or participation, and with results expressed in various forms (diagrams, maps, calendars, case studies). The main difficulty is to synthesize the information in order to reach a conclusion, so that the information collected can be used, without converting it inappropriately into reductive numerical data.

However, despite obvious limitations (representativeness, questionnaire bias, problems of reproducibility in case of monitoring), particularly when ‘rapid’ surveys are conducted, which is often the case, these surveys provide invaluable information for understanding the linkages between various processes and, subsequently, the impact of programmes. A number of ‘innovative’ indicators have arisen from these types of survey, which adequately complement quantitative surveys.

· It is often worthwhile - sometimes necessary - to group together the collection of a number of indicators of different types, in order to reduce cost and to be able subsequently to cross-tabulate various indicators. Not every survey can, however, deal satisfactorily with everything. It is therefore important in this case to check that periodicity, level of collection, representativeness and confidence intervals are relevant for each indicator; otherwise it is better to undertake a separate survey suited to the indicator considered. Surveys on sub-samples often save time and resources. It must also be ensured that the results can later be aggregated at an adequate level and in a coherent manner.

Quantity versus variety?

Two classical approaches are inevitably opposed when survey procedures are defined: surveys that favour representativeness and sample size, and hence the precision and statistical power of conclusions at the expense of variety of information, and surveys that sacrifice large samples to a closer control of the quality or variety of data. Should a food consumption survey be based on the "frequency method" allowing a large sample to be surveyed, or should the "weighing technique" be used, providing more precise estimates but on a smaller sample because the method is cumbersome? These choices, which reflect initial objectives, must be made at the conception stage of the programme and not later, on the basis of resources available and skills of local field-workers.

Large surveys often have to be combined with lighter ones in order to document particular points and work at different levels of representativeness. It is therefore essential to define a survey strategy that will ensure that methods of investigation focusing on different statistical units (national surveys of a representative sample of individuals from different age groups, light surveys in particular communities at household level based on a convenience sample, etc.) are coordinated in time and space. In the context of large-scale assessment or monitoring, that may become difficult to manage if one is not organized, as results will be arriving out of order, at the wrong moment or without the required level of breakdown or representativeness. Collection therefore needs to be structured in line with the information required and the corresponding levels of analysis, right from the initial design stage.

Monitoring and evaluation

In the context of monitoring and evaluation of programmes or in situations where comparisons are made between regions and times, the first issue is that of the sample: whether it should be a representative (random) sample, a convenience sample (with a risk of bias) or a sample based on sentinel sites. The latter choice certainly has a practical advantage, but requires some periodic assessment of what the sentinel sites represent with respect to the overall population. This choice remains generally valid if trends are of greater interest than a strictly representative value.

There is then the issue of longitudinal collection versus collection through repeated cross-sectional surveys. "Longitudinal" refers to the collection of data on the same sample of individuals or households each time, while "cross-sectional" refers to a resampling with each collection. Statistically speaking, it is clearly best to keep the same sample from one collection to the next, for this helps to reduce sample variance and to estimate the share of variance due to the intervention or to outside phenomena. However, this aspect becomes irrelevant if the individuals cannot be traced from one survey round to the next.
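The statistical advantage of keeping the same sample can be sketched with the standard error of a change in means between two rounds. This is a simplified illustration under stated assumptions: simple random sampling, the same sample size and standard deviation at both rounds, and a hypothetical correlation `rho` between the two measurements on the same unit.

```python
import math


def se_change_independent(sd, n):
    """Standard error of the change in means with a new sample of size n each round."""
    return math.sqrt(2.0 * sd ** 2 / n)


def se_change_paired(sd, n, rho):
    """Standard error of the mean change when the same n units are measured twice.

    rho is the correlation between the two measurements on the same unit.
    """
    return math.sqrt(2.0 * sd ** 2 * (1.0 - rho) / n)


# Hypothetical values: SD of 1 Z-score, 200 units, correlation of 0.75.
se_cross_sectional = se_change_independent(sd=1.0, n=200)
se_longitudinal = se_change_paired(sd=1.0, n=200, rho=0.75)

print(f"Repeated cross-sectional: {se_cross_sectional:.3f}")
print(f"Longitudinal (same sample): {se_longitudinal:.3f}")
```

Under these assumptions the longitudinal design halves the standard error of the estimated change. The gain shrinks as the correlation falls, and it disappears in practice if units cannot be traced from one round to the next.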

Longitudinal versus repeated cross-sectional

Repeated sample-based nutrition surveys conducted among young children require a new sample each time, since the children from the previous sample will have grown in the meantime and the age distribution will no longer be comparable; and since this distribution must remain constant, a longitudinal survey is unsuitable. On the other hand, the problem does not arise in the case of adults, since their nutritional status varies little with age, even over fairly long periods; by using the same sample each time it is easier to observe a change linked to an intervention or to other circumstances.

A longitudinal survey is useful - and sometimes necessary - in order to assess specific intervention programmes, but is less useful for overall monitoring of a situation or in the case of more general socio-economic development programmes; in this case interest is focused more on changes among the population as a whole than on the impact on individuals (or other units).

One of the drawbacks of longitudinal surveys is the inevitable bias caused by the loss of part of the sample from one collection round to the next (through migration, death, people’s refusal to take part, loss of documents, neglected collection, etc.). This factor is all the greater if measurements are taken at longer intervals or when the duration of the programme is particularly long.

The frequency of collection is more difficult to determine; it depends on a combination of several parameters:

first, the event monitored: linear growth retardation, for example, develops more slowly than wasting; measurement of corresponding indicators will not necessarily be made at the same frequency to assess the effect of a programme;
the needs of the programmes (a simple before/after comparison or a constant rapid alert);
sensitivity of the indicator to change (in the case of acute malnutrition, for example, the weight-for-height index is more sensitive to change than mid-upper arm circumference);
the extent and rate of change expected (although a frequency chosen initially may be modified according to actual change or occurrence of unforeseen events, such as a severe economic crisis);
the usual variance of the phenomenon measured and sample sizes;
the ease and cost of collecting the indicator.

When the parameter studied can normally fluctuate around average values from one collection to the next, measurements will be taken more often in order to give a clearer picture of significant trends. However, it is important to beware of the deceptive nature of over-frequent measurements in the case of major cyclic (for example seasonal) variations.
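One common way of separating a trend from cyclic variation is a moving average spanning exactly one full cycle. The sketch below is one possible approach, not a prescribed method; the quarterly series is hypothetical, built from a linear trend plus a seasonal pattern that sums to zero over the year.

```python
def centered_moving_average(values, period):
    """Centered moving average covering one full cycle of the given period.

    For an even period the two end points of the window receive half weight,
    so that every season contributes equally to each average.
    """
    half = period // 2
    smoothed = []
    for i in range(half, len(values) - half):
        window = values[i - half: i + half + 1]
        if period % 2 == 1:
            total = sum(window)
        else:
            total = sum(window) - 0.5 * (window[0] + window[-1])
        smoothed.append(total / period)
    return smoothed


# Hypothetical quarterly indicator: linear trend plus a seasonal effect.
seasonal = [3.0, -1.0, -3.0, 1.0]
series = [10.0 + 0.5 * t + seasonal[t % 4] for t in range(12)]

trend = centered_moving_average(series, period=4)
print([round(v, 2) for v in trend])
```

In this constructed example the smoothed values recover the underlying trend, whereas comparing two consecutive raw measurements would mostly reflect the season in which each was taken.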

Finally, a relative homogeneity of the key indicators should be maintained between the different points of collection and over time so that comparisons continue to be meaningful.

Principles of analysis

Before starting the analysis, data from all sources need to be brought together and cross-checked. A local analysis of data is always possible and desirable, and is often a guarantee of obtaining the necessary indicators in time for decision-making. However, at a certain point the data deemed indispensable for evaluating activities or monitoring the situation more consistently and at a more centralized level need to be assembled. If the data are on different computer platforms or in different formats, the services of a systems data analyst can be useful in organizing this stage logically and ensuring its sustainability.

Most countries have statistical bureaus capable of doing this type of work. Constant coordination and consultation at all stages of the collection and analysis chain will greatly facilitate the overall efficiency of operations.

An analytical strategy based on the initial conceptual framework will be preferred, rather than carrying out a large number of time-consuming analyses - which also entail the risk of finding random associations. Although analyses may not be very complex, it is helpful to have a competent statistician on hand, who can help formulate the questions correctly in terms of analysis and choose the most suitable methods. A proper plan of analysis should thus be drawn up, usually encompassing several stages.

Measurements, indices, indicators

In the case of anthropometric measurements, an adjustment is often made for age, sex and type of measurement. Raw measurements (e.g. mid-upper arm circumference) or calculated indices (weight-for-age, weight-for-height or height-for-age) expressed against a reference value and a cut-off may be summarized in the form of averages, standard deviations and confidence intervals (for example, the height-for-age average expressed as Z-scores of the reference population), of percentages of individuals below a critical cut-off value (% of children <-2 Z-scores of height-for-age), of continuous distributions (curves) or of categories of nutritional status. All these forms of expression provide corresponding indicators of nutritional status (wasting, stunting, etc.).
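The passage from measurement to index to indicator can be sketched as follows. This is a simplified illustration: the Z-score is computed as (value - reference median) / reference SD with hypothetical reference values, whereas actual growth references publish age- and sex-specific parameters, and the confidence interval uses a normal approximation that assumes simple random sampling.

```python
import math
import statistics


def z_score(value, ref_median, ref_sd):
    """Simplified Z-score of a measurement against a reference median and SD."""
    return (value - ref_median) / ref_sd


def prevalence_below(z_scores, cutoff=-2.0):
    """Proportion of Z-scores below the cut-off, with an approximate 95% CI."""
    n = len(z_scores)
    p = sum(1 for z in z_scores if z < cutoff) / n
    se = math.sqrt(p * (1.0 - p) / n)
    return p, (max(0.0, p - 1.96 * se), min(1.0, p + 1.96 * se))


# Hypothetical height-for-age Z-scores for ten children.
haz = [-2.5, -2.1, -1.0, 0.0, 0.5, -3.0, -1.9, 1.2, -0.3, -2.0]

mean_haz = statistics.mean(haz)
stunting, stunting_ci = prevalence_below(haz)

print(f"Mean height-for-age Z-score: {mean_haz:.2f}")
print(f"Stunting (<-2 Z): {100 * stunting:.0f}% "
      f"(95% CI {100 * stunting_ci[0]:.0f}-{100 * stunting_ci[1]:.0f}%)")
```

The very wide interval obtained with ten children shows why sample size matters for prevalence indicators; for cluster samples the interval would also need to account for the design effect.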

Software applications (e.g. Epi-Info/Epinut^[8]) carry out all these computations automatically for data of this type collected in a standardized way - and also for other types of transformations (survey data on food consumption) or computation of economic or demographic indices, for example.

A first stage will be that of checking the raw data, if necessary calculating various rates or indices, and presenting descriptive statistics in order to establish numerical summaries (averages, medians, standard deviations) and relevant tables or charts, and to apply some smoothing (moving averages).

The next step is to study links between variables according to the hypotheses of causality adopted, using inferential statistics. It is important to take into account the variability linked to sampling. The choice of techniques will be affected by the nature of the variables under study, the type of sampling, the period and level of collection, the aims of the study, etc.

for continuous quantitative variables: comparison of means, analysis of variance, and use of the general linear model to take account of confounding variables;
in the case of qualitative variables (or discretized quantitative variables) expressed as frequencies: comparisons of proportions with the Chi-Square test and Chi-Square for trends. Confounding factors can be controlled for through various techniques of multivariate analysis (for example logistic regression).
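As an illustration of the comparison of proportions mentioned above, the sketch below computes the Chi-Square statistic for a 2 x 2 table (without continuity correction) and its p-value for one degree of freedom. The counts are hypothetical, and the calculation assumes simple random samples.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Chi-Square test for a 2 x 2 table, without continuity correction.

    a, b: affected and unaffected individuals in group 1
    c, d: affected and unaffected individuals in group 2
    Returns the statistic and the p-value for one degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For one degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 30 of 100 children stunted in one area, 15 of 100 in another.
chi2, p_value = chi_square_2x2(30, 70, 15, 85)
print(f"Chi-Square = {chi2:.2f}, p = {p_value:.3f}")
```

This simple test does not control for confounding factors; as noted above, multivariate techniques such as logistic regression are needed for that purpose.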

Where longitudinal data are analyzed, bearing in mind that the same individuals have been surveyed on each occasion, techniques of analysis of variance with repeated measurements, based on a generalisation of Student’s paired t test, will be used.

Similarly, in the case of major series of observations, analytical techniques for chronological series can be used (seasonal adjustment, trends, model-building, etc.).

All these points are detailed extensively in current statistical publications and in the following references: Schlach, 1992; Mascie-Taylor, 1994; Watier, 1995; they will not be developed further here. Similarly, other publications may be consulted such as Analysis of health surveys, by Korn and Graubard (1999), for problems associated with data collection, capture, management and analysis, and the small study by Juul (2001), which can be accessed on the internet.

When assessing programmes, it is crucial to verify on the one hand the initial degree of comparability between groups, and on the other hand the degree of implementation of the programmes to the target population. In the latter case, this must be taken into account during analysis, distinguishing the units that have effectively benefited from the programme. The confidence interval will give an idea of the magnitude of the effect observed (including when the difference is not significant).
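The role of the confidence interval when a difference is not significant can be sketched with a comparison of two prevalences. The figures are hypothetical, and the normal-approximation interval assumes two independent simple random samples.

```python
import math


def difference_in_proportions(p1, n1, p2, n2, z=1.96):
    """Difference p1 - p2 with an approximate 95% confidence interval."""
    diff = p1 - p2
    se = math.sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)
    return diff, (diff - z * se, diff + z * se)


# Hypothetical prevalence of underweight: 30% in the comparison group,
# 22% in the programme group, 100 children in each.
diff, (lower, upper) = difference_in_proportions(0.30, 100, 0.22, 100)

print(f"Difference: {100 * diff:.0f} points "
      f"(95% CI {100 * lower:.0f} to {100 * upper:.0f})")
```

The interval includes zero, so the difference is not significant at the 5% level, yet it also shows that the data are compatible with a reduction of up to about 20 percentage points; a larger sample would be needed to settle the question.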

^[4]In theory, three stages should be followed: (a) demonstrating the theoretical efficacy of an intervention through rigorous experimental studies (randomized controlled trials), (b) undertaking implementation on a wider scale in context but in a controlled way (control versus experimental group), and (c) scaling up the intervention at population level while assessing its overall effectiveness. However, such studies are rarely available for all types of interventions (Habicht et al., 1999).
^[5]"Nutritional surveillance is an ongoing activity aiming to provide up-to-date information on the nutrition situation of the population and factors affecting it, in order to assist policy decision-makers, planners and those in charge of managing programmes to improve food consumption and nutritional status." Joint FAO, UNICEF and WHO Expert Committee (FAO, UNICEF & WHO 1976). See also Mason et al., 1987; Maire et al., 1999; Bloem et al., 2003.
^[6]The EUROHIS project has developed a range of common instruments for the measurement of eight health indicators. This methodology may be useful as an example, for harmonizing analysis of data from surveys conducted among different communities which do not all use an identical definition of the indicators (Nosikov & Gudex, 2003).
^[7]Regarding sampling aspects, see Levy and Lemeshow (1999).
^[8] Available from CDC, Atlanta (USA): http://www.cdc.gov/epiinfo/
