Chapter 11 - Evaluating extension programmes

David Deshler

David Deshler is an Associate Professor of extension education, Cornell University, Ithaca, New York.

Getting started with evaluation: Avoiding a passive sabotage of evaluation efforts
Selecting evaluation purposes: Unclear purposes ensure unsatisfying evaluations
Recognizing the politics of evaluation: Know the stakeholders or you will be sorry
Selecting alternative approaches and models: Which model for which purpose?
Focusing the evaluation effort: You can't evaluate everything, so let's set limits
Selecting methods for programme evaluation: Choosing the right tool for the task
Selecting methods for evaluation of teaching and learning: How do we know that learning has happened?
Interpreting findings and data: To what do we compare the findings?
Managing the work: Who will be responsible for what?
Using evaluation findings for improving extension: Who should be told what?
References

Getting started with evaluation: Avoiding a passive sabotage of evaluation efforts

Although extension educators, funders, and administrators are in favour of evaluating extension programmes, honesty requires us to admit that most of us are not overly enthusiastic about undertaking it. There are many reasons for our resistance. However, if we are to provide evaluation leadership, we must recognize forms of organizational and personal resistances, or these will sabotage our evaluation efforts. It is not uncommon to use the occasion of evaluation to mask our insecurity with distrust or criticism of others' intentions, motivations, competency, or adequacy of effort. We are sometimes caught in a bind between our own reluctance to ask ourselves hard evaluation questions and our resentment of outsiders asking questions.

The second form of resistance is fear, particularly the fear of change that evaluation might precipitate. It is the nature of organizations to be self-protective and defensive. Evaluation and organizational comfort appear to be somewhat incompatible because evaluation challenges organizations to change. We all resist change to some extent. Although evaluation is not revolutionary, it is a handmaiden to gradual change, and we have to recognize that our reluctance to participate in evaluation is partially a reluctance to embrace needed change.

Another resistance to evaluation comes from our need to avoid embarrassment about potential bad news. We need to know the quality of our efforts, but we also have a fear of finding out the truth about our achievements, particularly if we lack confidence. Most of us avoid tests for the same reason. We need to face up to our personal and organizational ambiguity regarding our need to know and our fear of knowing.

Still another real resistance results from the fact that evaluation is often an additional task to an already impossible workload. Those whose job description includes evaluation may need only to be reminded of this. However, potential benefits may have to be discussed and identified if collaboration is to become a reality among those who do not have a formal professional responsibility for undertaking it. Benefits include recognition for achievements, opportunities to improve practice, establishment of accountability, and learning new lessons about our efforts.

Creating Positive Evaluation Metaphors

The term "evaluation" can be threatening. If evaluations are to be positively embraced by extension organizations, it may be necessary for those initiating them to be positive in the language they use to describe the proposed effort. Negative metaphors that are sometimes used for evaluation, such as "under the gun," "a mine field," "an invasion of inspectors," "last rites before hanging," and "defend the fortress" reflect past negative experiences with evaluation. Why not create more positive metaphors for the evaluation task? For instance, we could describe evaluation in medical metaphors such as "complete programme physical" to identify physical soundness and potential programme health risks. The term "auto-diagnostico" has been used in the Dominican Republic and "self-strengthening" in Sri Lanka. Or we could use a business metaphor, such as "taking stock" of profits and liabilities. We could use sports metaphors, such as "watching video tapes" of yesterday's game to improve our "game plan." We can use "mirrors" to see ourselves as others see us. We could consider evaluation to be a "learning process" or a "sabbath" day of reflection, renewal, gratitude, and celebration. Being self-conscious and deliberate about the language that is used regarding evaluation may help reduce unnecessary resistance to the effort and may build positive support.

Exploding Myths about Evaluation

Several evaluation myths have often discouraged extension managers from engaging in useful evaluation.

Myth #1: Evaluate only when mandated. Many funded programmes require evaluation as a form of accountability. However, it is a myth that evaluation should occur only if it is mandated. On the contrary, evaluations that are self-initiated are more likely to be taken seriously for immediate programme improvement. Programmes become responsible and excellent just as often through self-initiated evaluation.

Myth #2: Evaluation is an add-on. It is a myth that evaluation is an add-on activity or at most a pretest with a posttest. It is most meaningful when it is integrated with decision making at every stage of programme planning and operation (Patton, 1991).

Myth #3: Evaluation is an activity for experts. It is a myth that evaluation should be undertaken only by technical experts. Yes, complex methods can be used; however, systematic evaluation can be undertaken by inexperienced managers, and specialists and educators themselves can be helped to critique their own work. This chapter is intended to help managers and specialists to do evaluation themselves.

Myth #4: Outside evaluators are best. It also is a myth that evaluation should be done only by external, outside, objective evaluators. Yes, external evaluators are often useful in challenging insiders to address what they have overlooked because of their nearsightedness. However, internal, self-initiated, and subjectively oriented evaluations also can be rigorous and valuable. In fact, because they often are participatory in generating, analysing, and interpreting data, they may result in greater acceptability of the findings and recommendations.

Myth #5: There is one best evaluation approach. Still another myth is that there is one best way to conduct an extension evaluation. Some approaches are probably better than others for addressing particular types of questions or concerns. However, the many types of evaluation approaches have their own strengths and limitations. Some situations require quantification and measurement, while others require qualitative, descriptive, and subjective data. Alternative approaches will be briefly described later in this chapter.

Myth #6: Quantitative data are best. It is a myth that quantitative data are always the best evidence. A mixed-methods approach combining qualitative and quantitative methods can lead to better understanding and appreciation of phenomena under evaluation and provide triangulation, convergence, and corroboration of results from different methods. Qualitative methods are best for understanding the nature of something, while quantitative methods help in appreciating its extent. If we do not know the nature of something, we should conduct qualitative studies. After measuring something, we may still need to use qualitative methods to learn about variations and unique forms. The use of mixed methods can increase user responsiveness to evaluation information (Greene, Caracelli, & Graham, 1989). We should ask which combinations of methods are best for answering which evaluation questions, rather than deciding on a method and then forcing or changing our questions to fit that method.

Major Elements in Evaluation

There are at least five major elements in most evaluations: (1) focus questions, (2) objects or events to be evaluated, (3) data or evidence, (4) analysis and interpretation using judgement perspectives, and (5) judgements, conclusions, or findings. Purposes and approaches or models may vary, but these elements will be present in one form or another.

Selecting evaluation purposes: Unclear purposes ensure unsatisfying evaluations

There are many purposes for undertaking evaluations in any particular situation. It cannot be assumed that all stakeholders (farmers, extension staff, administrators, funders) share common purposes. Although people may never completely agree about the purposes, one of the tasks at the outset is to seek clarity about evaluation purposes through discussions with major stakeholders. Appreciating different purposes at the outset can reduce conflicts and disappointments down the road. Consider the following purposes.

Pseudo Self-Serving Purposes

Since organizations, including extension systems, have a self-serving tendency, it is not unreasonable to expect that some staff members, especially those in the highest places, may want a pseudo evaluation that will postpone, buy time, or avoid threatening change. In these cases, evaluators are not taken seriously, and the evaluation becomes a meaningless political diversion. In other cases, some members of organizations want evaluations as excuses for evading or avoiding administrative responsibility or to provide a scapegoat for criticism. Evaluations that are undertaken only to make the programme look good ("whitewash job") or to make someone or some aspect of a programme look bad ("hatchet job") are pseudo and illegitimate.

Enhance Accountability Purposes

It is quite common for external donors to expect that evaluation will provide accountability through evidence of impact, or to document cost-benefits, or to measure efficiency-effectiveness. In some cases, this evaluative evidence is considered in decisions to continue the programme; or to propose change, expansion, or reduction of a programme; or to change a policy, organizational structure, philosophy, or design. The potential for negative findings and the threat of discontinuing funding has led to "hiding the mistake," a dysfunctional practice. However, evaluations rarely provide a single basis for political decisions. They often are used by funders, administrators, or policy makers to justify their decisions even when the evidence of benefits is weak.

Improve Performance Purposes

This purpose of evaluation is sometimes called "formative" because the results are intended to help improve the programme during its formative stages. This is in contrast to "summative evaluations" when the purpose is to sum up or summarize the accomplishments at a point in time. When evaluations are to improve programmes, lessons learned about strengths and limitations of the programme are mined from the data so that changes can be made immediately. Sometimes the intent is to discover new approaches and alternatives or to adjust the programme to changing situations or client groups. Evaluation also is used to understand multiple reasons for apparent failure or to improve the management or operation of a programme.

Social Learning and Communication Purposes

Sometimes evaluations are intended to stimulate political dialogue or to resolve political conflicts intelligently. For example, an evaluation of extension in a country could provide an opportunity to debate the need to hire more women agents to respond to an increase of women in small-scale agriculture or to extend the extension network to subsistence farmers not being served. Often the most significant contribution of an evaluation is the creation of new expectations, new organizational arrangements, new linkages, and new purposes and goals. Evaluation may give visibility to a good idea and new language that can communicate new ways of viewing extension to others who also may want to share an experiment.

Evaluation purposes tend to vary, depending upon where one stands within a system like extension. External funders often want an accountability purpose, while field staff are more likely to favour a programme improvement purpose. Policy makers and programme administrators can often appreciate an evaluation that contributes to new ways of thinking about extension or new forms of extension. Farmers want an evaluation to improve the benefits they may receive from extension staff.

Recognizing the politics of evaluation: Know the stakeholders or you will be sorry

Evaluations are never value neutral. Political implications are always present because stakeholders have interests to defend. In this sense, all evaluations are political. Therefore, it is very important to know the values, expectations, and interests of the stakeholders at the outset. Failure to understand these interests may make the findings irrelevant or may cause unnecessary conflict and rejection of the findings. Knowing the interests also helps in the focus and design of the evaluation.

Values, expectations, and interests are reflected in stakeholders' visions regarding what "good" looks like in extension. Some may favour an extension system that empowers the poorest of farmers in their struggle for more land and a larger share of market benefits. Multinational agribusinesses may view the success of extension as increasing productivity for export. What a "good" or "excellent" extension system should look like will vary according to these different values and criteria. What different funders want may differ from what various groups of farmers may want in an extension system. Assumptions of stakeholders also vary regarding the relationship of extension education to larger social and political visions. For some, the extension system is intended to reproduce existing power relationships between government and farmers. For others, the extension system is viewed as an instrument to transform those relationships. For some, extension is primarily a technology transfer system from researchers to farmers. For others, extension is a communication network among farmers, researchers, credit institutions, market organizers, consumers, and government policy makers. These expectations, values, and interests are reflected in the criteria that various stakeholders put forth as central for judging extension, regardless of the stated objectives in programme documents.

Figure 1. Alternative locus of evaluators and criteria for planning extension evaluations.

External Evaluator
External Criteria

Internal Evaluator
External Criteria

External Evaluator
Internal Criteria

Internal Evaluator
Internal Criteria

Political interests are usually behind the debate over external versus internal evaluations. Some external stakeholders favour external evaluators because they want an evaluation to serve the direction of change, and they suspect that internal evaluators will be self-protective and not face up to the necessary reforms. Local programme staff sometimes fear external evaluators who impose criteria or external visions through evaluation judgements without understanding the situation on the front lines. They do not want to be evaluated against unreasonable external criteria. The matrix in Figure 1 regarding locus of evaluators and criteria may help clarify conflicting viewpoints over external versus internal evaluation or external versus internal criteria for judging what is "good" or "excellent." One way to accommodate these interests is to employ evaluation teams of external and internal evaluators and to negotiate combinations of external and internal criteria.

Politics is about power and power relationships. Education, including extension education, also is about power, because one way of increasing power is through increasing knowledge. Extension is political in that it can either reproduce existing access to knowledge or privilege or provide a redistribution of power in a society through increasing knowledge access. Probably the most crucial political question that evaluation can raise is, "Who benefits most from extension programmes?" This question is political, because the answer often reflects class and economic privilege, ethnic or racial dominance, and gender interests and relationships between staff and various sectors of the population of farmers. How does the extension system privilege some and not others? Where are extension staff located in relationship to the political, economic, social, and cultural structures of a country? How is extension affected by relationships that governments have to various sectors of agriculture? These values, expectations, and interests provide the background for negotiating the purpose, focus, and design, as well as the interpretation of the findings of evaluations.

Selecting alternative approaches and models: Which model for which purpose?

What are several alternative approaches to programme evaluation? Choosing among these approaches is important because for each there are different assumptions about what data to collect, how to collect them, and how to make judgements about success. The following seven major approaches will provide a sufficient choice for most extension evaluation situations: (1) expert model, (2) goal-free model, (3) attainment of objectives model, (4) management decision model, (5) naturalistic model, (6) experimental model, and (7) participatory evaluation model.

Major Models for Programme Evaluation

Expert Model. This approach relies on expert judgement (Eisner, 1983). Usually, documentation is prepared in advance of experts' visits. The experts then interview, analyse documents, and make judgements using their own judgement perspectives or those set as standards by the outside organizations or stakeholders. Typically this type of evaluation brings in a team of experts from FAO or extension systems from several countries to make judgements and comparisons regarding strengths and limitations.

Goal-Free Model. This approach assumes that outside evaluators do not know, or need to know, what the programme has intended to accomplish, but that it is the task of the evaluators to uncover what is actually happening relative to farmers' interests regardless of stated goals and intentions. The focus point is to identify environmental and farming conditions and then to compare these needs with what people are actually experiencing as a result of the extension programme. The gap is then viewed as a starting point for making changes in the programme. An example is an evaluation that describes conditions of indigenous farming groups cultivating fragile hillside soils and compares these conditions with access to and appropriate content of knowledge from existing extension services. This approach relies heavily on open-ended interviewing and observation by persons who do not have a vested interest in the programme (Scriven, 1972).

Attainment of Objectives Model. This approach assumes that the success of a programme can be determined by measuring a programme's outcomes against its own goals and objectives. This type of evaluation begins with clarifying measurable objectives and then gathering data that validate the extent to which these objectives have been met. For this model to be credible, an essential feature should be added, namely, the evaluation of the appropriateness of goals and objectives, given the circumstances and needs of farmers. For example, an extension system may have adequately met its objectives of increasing production of maize among large landholders, but at the same time it may have neglected to question its lack of commitment to small landholders or tenant farmers. If an attainment of objectives evaluation is anticipated, programmes are often tempted to set goals quite low so that outcomes will be met easily, thus appearing to be successful while ignoring major challenges. This model also has a "black box" limitation in that it tends to ignore the extension process, thereby failing to provide explanations for outcomes (Provus, 1971).

Management Decision Model. The purpose of this model is to provide relevant information as a management tool to decision makers. It assumes that evaluation should be geared to decisions during programme initiation and operation stages to make results more relevant at each particular stage. Participation of stakeholders is central to the process because evaluation should serve their decisions. Sometimes cost effectiveness and operations monitoring are included (Stufflebeam, 1971; Tripodi, Pellin, & Epstein, 1971; Gold, 1988). One limitation of this model is the tendency for the decisions of major stakeholders to be viewed as more important than those of various types of farmers, especially women in agriculture who may not benefit directly from such an evaluation unless care is taken.

Naturalistic Model. This model assumes that a programme is a natural experiment and that the purpose of evaluation is to understand how the programme is operating in its natural environment. There is an assumption that programmes are negotiated realities among the significant stakeholders and that evaluation serves this value-laden negotiation (Cronbach, 1981; Guba & Lincoln, 1989). Data should be collected and analysed from multiple perspectives. The outcome of the evaluation is dialogue concerning disagreements about objectives, expectations, problems, opportunities, policies, procedures, and suggested changes in methods or activities. Many positive collaborative changes can be made through this model of evaluation if conflict resolution skills are combined with evaluation. Another purpose of this model is to diagnose or to identify the causes for certain behaviour on the part of some farmers, agency staff, or other development actors (Murphy & Marchant, 1988).

Figure 2. Ladder of farmer participation in extension evaluation. Adapted from Arnstein (1969).

Level 5: Farmers conduct their own evaluation of extension independently of extension and report their findings to policy makers.
Level 4: Farmers carry out evaluation of extension in cooperation with extension managers and make decisions regarding changes in providing extension services.
Level 3: Farmers receive evaluation results and other information from extension staff and are asked to give reactions and recommendations for improving extension processes and resources.
Level 2: Farmers receive information, evaluation summaries, feedback on extension performance from extension staff, but are not asked to react.
Level 1: Farmers provide data and evidence of their achievements along with their reactions to extension without being involved in planning evaluation efforts.

Experimental Model. The purpose of this approach is to determine whether changes in programme outcomes (learning accomplishments) were due to the contributions of the programme and not just to life's experiences or from other influences (Goldstein, 1986). This model asks the question, "Were differences in sustainable agriculture practice attributable to the programme?" The simplest way to determine causality between the programme inputs and programme consequences is to compare at least two comparable groups, a group that received the educational treatment and a group that did not. This means that programme accessibility, at least during the experiment, is withheld from those learners who serve as a control group. Because of the nature of human subjects, the ethics of withholding educational services, and the difficulty of controlling for external influences, it is extremely difficult and costly to operationalize this model. It is recommended that this model be used only when major changes are expected or when a major failure is anticipated in pilot efforts where causal claims are central to making major programme investments (Rossi & Freeman, 1982).
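The two-group comparison at the heart of the experimental model can be illustrated with a minimal sketch. The scores below are invented for illustration (a hypothetical 0 to 10 rating of sustainable practice adoption), and the function name is an assumption, not part of any established evaluation toolkit. A raw difference in means is only a starting point: it supports a causal claim only when the two groups are genuinely comparable, for example through random assignment, and are large enough for the difference to be reliable.

```python
from statistics import mean


def programme_effect(treated_scores, control_scores):
    """Return the difference in mean outcome between the group that
    received the programme and the control group that did not."""
    return mean(treated_scores) - mean(control_scores)


# Hypothetical adoption scores, invented for this sketch only.
treated = [6, 7, 5, 8, 6]   # farmers who took part in the programme
control = [4, 5, 5, 4, 6]   # comparable farmers who did not

effect = programme_effect(treated, control)
print(f"Estimated programme effect: {effect:.1f} points")
```

A positive value suggests that the treated group scored higher on average; whether that difference can be attributed to the programme still depends on how well other influences were controlled.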

Participatory Evaluation Model. The purpose of this model is for extension educators and farmers themselves to initiate a critical reflection process focussed on their own activities. This is done through identifying a persistent major situation, such as extension's neglect of women in agriculture; subjecting it to critical reflection on underlying assumptions, habits of mind, and cause and effect expectations; and then, after creating new assumptions, changing practices and validating or invalidating the results. The model assumes a democratic participatory process along with autonomy on the part of educators and learners at the local level (Brunner & Guzman, 1989; Greene, 1988). This is a form of what is usually called "participatory action research."

Benefits and Limitations of Participation in Evaluation

Why use participatory approaches to evaluation? Involving people who are on the receiving end is likely to assure the most efficient allocation of scarce resources and the early identification of ineffective or wasteful use of resources. People on the receiving end are ultimately the best judges of impact, whether benefits have been produced or not (Uphoff, 1989, 1992). Being included in planning, implementation, and evaluation will show farmers that they are regarded as responsible, capable individuals and not simply passive "beneficiaries" or a "target group." Participatory evaluation is self-educating. It can encourage the development of human capacities among farmers. Without participation in evaluation, sustainable agriculture is unlikely. Participatory evaluation also can decrease the paternalistic, directive, impatient, or insensitive relationships among officials, technicians, and farmers by improving staff attitudes toward working with persons having less status and education. Although an initial cost of time is required, participation in evaluation can speed implementation when participants take greater ownership of efforts.

Farmers' lack of experience is sometimes said to be an impediment to participatory evaluation, but this need not be an obstacle if the process whereby people gain experience can be planned for and invested in (Uphoff, 1992, p. 5). Lack of resources and organizational skills can be overcome. Power differentials, social stratification and cleavages, and personal conflicts do present limitations. However, these obstacles can be reduced by outside actors who can provide spaces for subgroups to express their views for the sake of broader community interests. Officials or NGO staff themselves may present the most formidable obstacle by their paternalism and preoccupation with control. They often feel threatened by people's participation. Organizations tend to replicate in their environments the same attitudes, values, and social relations they exhibit internally. Practising participatory evaluation internally will provide a model for participatory evaluation that includes farmers. Positive experiences with participatory evaluation can help overcome staff anxiety.

How much participation should the farmers have in evaluating extension programmes and their own learning? Farmers, in many extension programmes, have participated in evaluation of extension through farmer associations and committees. However, as Arnstein (1969) has pointed out in a much quoted article regarding a ladder of participation, there are various levels of involvement and participation. In her model, the bottom rungs provide for merely token participation, while the top rungs allow for participant control. A ladder adapted from Arnstein that provides levels of farmer participation in evaluation of extension is depicted in Figure 2. The practice of levels 3 through 5 will increase authenticity of data, reduce paternalism in relationships, assure relevance, and make possible a collaborative commitment to positive change.

Levels 1 and 2 can be characterized as pseudo participation because they represent paternalism on the part of extension. Levels 3 and above can be characterized as genuine participation because they represent collaborative or empowering relationships.

Farmers are by nature involved in informal evaluation of their own practice. What is at issue is the degree to which extension managers will deliberately collaborate and share control with them in collecting, analysing, and reflecting on evidence of their achievements and judgements in regard to the helpfulness of events, technologies, inputs, processes, and learning experience.

Focusing the evaluation effort: You can't evaluate everything, so let's set limits

What should be the focus of programme evaluation? This question raises the spectre of evaluating everything, which is an impossibility. Choices and priorities among many possible questions have to be made. A full range of possible focusses can be represented at least in part by the levels represented in Figure 3 (modified from Summers, 1977), which shows eight major areas of focus for programme evaluation: (1) inputs, (2) activities, (3) participation, (4) reactions, (5) individual change, (6) organizational change, (7) community change, and (8) national change. Evaluations rarely cover all items of these areas. Most limit themselves to a combination of items that serve the evaluation model or the concerns of stakeholders.

Narrowing the focus usually begins during planning with the major stakeholders in a programme effort (farmers, extension staff, administrators, and funders). Interviews with these stakeholders are usually conducted at the outset of evaluation efforts, often during the programme-planning process, to identify the programme model or approach and to determine the questions that are central. Stakeholders often do not agree, so priorities, methods, and costs must be negotiated so that the task can be "reality based" and "doable." The key here is to determine the decisions that stakeholders intend to make based on the evaluation findings so that immediate use will be made of the information generated. Evaluation use is not based on the quantity of data, but on its timeliness and relevance to decisions and purpose.

Figure 3. Focus for programme evaluation.

8. National impacts: Political stability, economic fairness, agricultural environmental sustainability
7. Community change: Changes in administration of justice; health, welfare, and quality of life; fairness in the marketplace; change in human rights, status of women; change in economic and social indicators for poor; change of indicators of sustainable agriculture and natural resource management; change in communication patterns and access to education and news; public opinion change; fairer distribution of land and other resources; improved interorganization relations; evidence of conflict resolution; and cultural practice change
6. Organization change: Group operation and management; economic performance; technical operation and management; financial operation; group institutionalization and self-reliance, new groups of farmers included, new organization linkages; change in staff performance, new service delivery, new methods used, additional facilities and equipment; cost-benefits improved; new philosophy, purposes, and goals; improved organizational culture
5. Individual change: Changes in knowledge, attitudes, skills; sustainable agricultural practice; change in aspirations, self-image, perspectives; expenditure of effort and money; use of methods, services; invention of appropriate technology; increased production or use of tools; compliance with or opposition to public policy; patterns of communication, career directions, and family relationships
4. Reactions: Testimonials; reactions to the relevance of content; appropriateness of technology, helpfulness, perceived value of educational experience; reputation of the extension provider
3. Participation: Farmer access to extension services by social class, gender, and ethnic groups; intensity of face-to-face contacts; extent of media-assisted contacts; type of participation (volunteering, planning, recruiting, learning, experimenting, evaluating); indicators of commitment (attendance, continuity, frequency)
2. Activities: Participatory rural appraisal; planning; local knowledge documentation; farmer experimentation; farmer-to-farmer knowledge sharing; farm tours; farmer organizing; master farmer leadership training; farm demonstrations; exhibitions and fairs; residential workshops; marketing analysis; farm policy education
1. Inputs-resources: Organizational sponsorship and networks; funds; organizational design, facilities, equipment; philosophy, mission, goals, objectives; staff, resource people, volunteers; local and external research knowledge and relevance; cultural, economic, and political context

The seven evaluation models described above tend to focus on the different levels shown in Figure 3. For instance, the expert model most often focusses on data from inputs, activities, and participation, while the goal-free model tends to focus on individual change, organization change, or community change, ignoring the inputs and activities. The attainment of objectives model usually compares the philosophy, goals, and objectives of inputs to the extent of individual or organizational change outcomes. The naturalistic model emphasizes understanding activities, participation, and reactions as processes that occur within cultural, economic, and political contexts. The experimental model emphasizes causal relationships between inputs and individual or organizational change. The participatory evaluation model emphasizes activities and their relationship to benefits and values to farmers. It also emphasizes participation of farmers themselves in planning the focus, data collection, interpretation, and implementation of action that emerges from the evaluation process.

Selecting methods for programme evaluation: Choosing the right tool for the task

Which methods are right for the task? The basic rule here is that the selection of methods follows the selection of focus, not the other way around. Each evaluation question must be examined in relationship to what would constitute evidence for answering it. The following brief descriptions of data collection methods, although by no means exhaustive, can be used as a toolkit for a variety of circumstances. The list includes document analysis, observations, interviews, surveys, focus group committees, field visits and tours, village drama and role plays, maps, case studies, field trial documentation by farmers, remote sensing, and aerial photography.

Document Analysis. Examples include minutes of meetings, correspondence, budget records, workshop notes, participant papers, and newspaper reports, to name a few. These can be treated as data, analysed for content, and summarized in relation to questions, including extent of inputs into the programme, levels of participation, nature of goals and activities, and themes regarding problems, concerns, expectations, and new directions. Themes from documents can be a credible source of information. Documents usually do not reveal participants' motivations or subjective experiences, but they often reveal difficulties of programme operation.

Observations. Observers can be outsiders or persons who are involved in learning activities. Observers are usually given a short list of items that may include extent of participation and personal interaction, nonverbal indicators of interest or inattention, leadership roles, performance levels, and conflict indicators, to name a few. Both qualitative and quantitative data can be collected.

Findings can be reported to the learners as a whole to start a reflective process about what may need to be changed, or they can be used as evidence of successful methods or learning outcomes (Worden & Neumaier, 1987). Observations of process and outcomes can be recorded by video or photo documentation. These data are very powerful graphic ways of communicating the nature of a programme and its outcomes to sponsors as well as to potential learners. Video records of learners' local knowledge also can be used to help them reflect on strengths and limitations of knowledge.

Interviews (Key Informants, Oral Histories, Storytelling). Interviews are probably the most widely used method for programme evaluation, including the evaluation of learning. Interviews with key informants and representative farmers are suited to in-depth explorations of issues. If the questions are standardized, responses can be tabulated numerically to indicate item strength. If questions are open-ended, in-depth unique responses can be generated, which in turn can provide information regarding reasons why the activities are viewed differently by diverse groups of participants. Individual oral histories can inform patterns of practice and the use of extension resources. Storytelling is one of the oldest forms of human social life. Evaluative judgements are usually embedded in stories. Interviews, oral histories, and storytelling with farmers who are not served are essential to a full picture of extension. Illiterate members of the community can participate fully through interviews, oral histories, and storytelling, demonstrating that illiteracy does not rule out participatory evaluation.
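
The numerical tabulation of standardized responses mentioned above can be illustrated with a short sketch. It is only an illustration: the five-point agreement scale, the sample responses, and the function name tabulate_item are assumptions made for the example and do not come from any particular evaluation.

```python
from collections import Counter

# Hypothetical responses to one standardized interview item, coded on an
# assumed scale from 1 (strongly disagree) to 5 (strongly agree).
responses = [5, 4, 4, 2, 5, 3, 4, 1, 4, 5]


def tabulate_item(scores, scale=(1, 2, 3, 4, 5)):
    """Return the count for each scale point and the mean score for one item."""
    counts = Counter(scores)
    table = {point: counts.get(point, 0) for point in scale}
    mean = sum(scores) / len(scores)
    return table, mean


table, mean = tabulate_item(responses)
for point, count in table.items():
    print(f"score {point}: {count} respondents")
print(f"mean score: {mean:.2f}")
```

A tally of this kind shows how strongly respondents lean on one item. It says nothing about why they answered as they did, which is where the open-ended questions come in.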

Group interviews (sometimes called focus groups) can be formed according to geographic locations or by farm type to discuss specific evaluation questions. Sometimes the groups may already exist. At other times new groups may be formed just for the evaluation. The purpose is not only to generate judgements using agreed upon criteria, but also to uncover unanticipated outcomes, applications, opportunities, and problems to inform future extension efforts. A community can reconstruct its history, chronology of events, crises, turning points, accomplishments, and so forth. Chalkboards, pictures, and drawings can represent milestones or decisions. The art of conducting group interviews can be learned. Many community members have a talent for this and are more likely than agency staff to evoke authentic responses. Subgroups are often necessary to listen to voices that are unlikely to be heard in groups dominated by those with status.

Surveys. The survey is a more standardized form of data collection that incorporates a prepared questionnaire. In most parts of the world, surveys of farmers in rural areas must be conducted by interview using interviewers who know the territory, language, and culture of potential respondents. Sometimes surveys can be administered at meetings or public gatherings; however, responses have to be treated as an "opportunity sample," rather than as a "random sample," and therefore generalization of the findings to a larger population is limited. Surveys are often used to evaluate the extent of practice, estimations of production yields, preferences for appropriate technology, and expectations regarding the future. They are best used with homogeneous populations rather than with quite diverse populations because standardized questionnaires are less likely to be sensitive to diversity. Evaluation of unique practices and adaptations of diverse farmer types is best done through interviews and observations.

Field Visits and Tours. There is no real substitute for field visits and tours to provide authenticity and reality to the conditions, limitations, and impacts of extension programmes. Evaluation teams composed of local farmers, extension staff, administrators, funders, and external evaluators provide balance and interactive learning regarding different perspectives. Team members can undertake observations and interviews and learn from each other about their findings during travel. Assignments can be distributed so that special knowledge regarding specific aspects of situations can be gathered. Some team members can focus on economic, social, and cultural aspects, while others focus on technical aspects. Comparing data, analysing and reflecting on findings, and sharing insights after field visits can generate a more holistic, balanced evaluation.

Documentation of Farmer Demonstrations. Farm visits, which have been used frequently by extension staff for teaching and the transfer of technology, also can be used for evaluation of appropriate technology. Farmers and local leaders can be taught to conduct their own field trials, thus fostering the pride and dignity of local people who can transfer appropriate technologies through their local languages. When farmers choose their own focus of inquiry, collect and analyse their own data, and come to their own evaluative judgement, they are more likely to adopt and pass on relevant and effective appropriate technologies. World Neighbors NGOs in Bolivia and Peru have developed a standardized means by which farmers engage in site-specific experimental documentation and reporting of crop yield comparisons, using simple calculators (Ruddell, 1994).
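
The kind of crop yield comparison a farmer might work out on a simple calculator can be sketched as below. This is not the World Neighbors procedure described by Ruddell (1994), whose details are not given here. It is a minimal arithmetic illustration, and the plot sizes, harvest weights, and function names are all hypothetical.

```python
def yield_per_hectare(harvest_kg, plot_area_m2):
    """Convert a plot harvest to kilograms per hectare (1 ha = 10,000 m2)."""
    return harvest_kg * 10_000 / plot_area_m2


def percent_change(control, trial):
    """Percentage difference of the trial yield relative to the control yield."""
    return (trial - control) / control * 100


# Hypothetical figures: two 100 m2 plots on the same farm, one under the
# farmer's usual practice (control) and one under the practice being tried.
control = yield_per_hectare(harvest_kg=18.0, plot_area_m2=100)
trial = yield_per_hectare(harvest_kg=22.5, plot_area_m2=100)
change = percent_change(control, trial)

print(f"control: {control:.0f} kg/ha")
print(f"trial:   {trial:.0f} kg/ha")
print(f"change:  {change:+.1f}%")
```

Expressing both plots per hectare lets plots of unequal size be compared. A single pair of plots is still only one observation, so farmers usually need to compare results across several sites or seasons before drawing conclusions.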

Village Drama and Role Plays. Asking farmers in farmer association meetings or in village gatherings to create a drama or role play that describes the interaction process of extension staff with the village on a specific practice will reveal a variety of evaluative data on social relationships, relevance of extension knowledge to local knowledge, and historical events that have affected the solutions to farmer problems.

Maps. The generation of maps can provide a basis for making judgements about access to extension resources by showing where farmer contacts have been made. Maps also can be made to show the location of sustainable agricultural practices. These facts can be used to evaluate the scope and effectiveness of extension efforts on practices in a region, watershed, or geographic area. When access maps are overlaid with social ranking maps, judgements can be made regarding benefits by social class. Concept maps created collaboratively among farmers and extension staff can provide explanations regarding success and failure of specific programme efforts. Reflections on these maps, which can be drawn on chalkboards or in the sand, can reveal contradictions in underlying assumptions and expectations, which in turn can lead to new experiments.
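
Once an access map and a social ranking map have been overlaid, the comparison can be reduced to a simple table. The sketch below assumes that each household marked on the map has been recorded with its social rank and whether extension staff have contacted it. The records, the rank labels, and the function name contact_rate_by_rank are hypothetical.

```python
from collections import defaultdict

# Hypothetical records read off the overlaid maps:
# (social rank from a ranking exercise, contacted by extension staff?)
households = [
    ("poorer", False),
    ("better-off", True),
    ("poorer", True),
    ("poorer", False),
    ("better-off", True),
    ("poorer", False),
]


def contact_rate_by_rank(records):
    """Return the share of households contacted within each social rank."""
    totals = defaultdict(int)
    contacted = defaultdict(int)
    for rank, was_contacted in records:
        totals[rank] += 1
        if was_contacted:
            contacted[rank] += 1
    return {rank: contacted[rank] / totals[rank] for rank in totals}


rates = contact_rate_by_rank(households)
for rank, rate in rates.items():
    print(f"{rank}: {rate:.0%} of households contacted")
```

A large gap between the groups is a prompt for discussion with farmers and staff rather than a verdict, since the reasons for unequal contact do not appear in the counts.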

Maps showing before-and-after photographic and remote sensing images can be the basis for evaluative discussion regarding sustainability of farmer and extension staff practices. This is especially relevant to desertification, deforestation, soil erosion, and status of wetlands and wildlife habitat. These data can help direct extension activity to environments that are most at risk, as well as provide evidence regarding extension's positive impacts from collaboration with farmers on sustainable agricultural and natural resource practices.

Case Studies. In order to understand motivation of farmers or potential extension contacts, case studies of specific types of farmers or farming practices can be undertaken. Comparisons between farmers who have used extension technologies and those who have not are common types of case studies. The typology may be based on geographic regions, soil types, and cultural, age, gender, and economic differences. Case studies are best constructed through repeated interviews over time and often include, in addition to self-report, data from persons who know the subjects well. Oral histories, logs, and journals can also contribute to case study data if farmers collaborate in producing these case studies. Evaluators should guarantee the right to privacy and confidentiality of their sources. Case studies often reveal deterrents to participation, as well as ways participants have overcome deterrents to practice through extension contacts.

Selecting methods for evaluation of teaching and learning: How do we know that learning has happened?

The Perspective of Learners

We have briefly considered methods for programme evaluation at the macro level. Let us now consider the evaluation of learning at the micro level from the perspective of the two most important stakeholders, the extension educators and the farmers or programme volunteers as learners in specific educational events. When learning is evaluated, there are many questions to answer. Central of course is how learners experience the learning process and what they actually learn (the outcome of learning), their knowledge, attitudes, skills, and aspirations, and their behaviour change. An evaluation also can focus on the extension educator as learner and the content, processes, and resources that are used. Because learning is always a social phenomenon, an evaluation can focus on the social environment, organizational context, and the relevance of language, culture, and sometimes public policy to learning. These underlying cultural assumptions often explain resistance to learning, as well as the way learning either reproduces existing racial, gender, and economic power relationships or challenges these relationships. Not all evaluations include all of these questions. Educators tend to focus on questions that serve their own perspectives. Learners, including farmers, likewise may be interested in questions that serve their perspectives. Often adult learners are eager to reflect critically on their past and present learning contexts in order to overcome socially constructed deterrents to their learning.

When farmers consider evaluation of their own learning, they may ask themselves a broad range of questions. For example, have they, as learners,

· Gained knowledge or problem-solving skills that are useful to them?
· Increased their hopes and aspirations regarding the future?
· Learned how to learn better or gain access to more knowledge?
· Changed their assumptions, habits of mind, priorities?
· Gained confidence in taking leadership and presenting their ideas?
· Increased their commitment to experiment or take direct action?

The Perspective of Extension Educators

When extension educators consider evaluation of learning, they usually want to know how the learners perceive the process of learning, especially how they, as educators, have been helpful to the learning process. Educators ask themselves - and ask the learners to indicate - whether they have,

· Negotiated expectations and objectives?
· Introduced a variety of useful methods and materials?
· Encouraged the use of examples to illustrate concepts or practice?
· Given step-by-step instructions?
· Summarized the material presented?
· Related theory to practice?
· Shown concern about learners as human beings?
· Promoted discussion and learner interaction?
· Encouraged silent learners to participate?
· Used understandable vocabulary?
· Respected racial, ethnic, and gender differences and their unique contributions to learning?
· Appreciated learning handicaps and disabilities?
· Helped learners reflect critically on how they learn?
· Appreciated local knowledge of learners and made use of it?
· Helped learners learn from each other during learning activities?

Asking learners to reflect on educator behaviours also encourages critical reflection on their own learning.

The purpose of listing all of these questions from both the learners' and educators' perspectives is not that one should ask them all, but to stimulate discussion about which are essential items for a specific evaluation effort. The methods that follow can be used to gather evidence of learning through relationships.

Documentation of Local and External Knowledge. A basic principle of adult education is that learning should begin with prior knowledge. How can we appreciate what has been learned if educators do not know what participants already know? For centuries, farmers have passed on their indigenous or local knowledge. Extension educators can assist farmers in documenting this knowledge either in written, photographic, audio, or video forms. This is essential, since evaluation of new learning should be compared with what farmers already know. Also, farmer acknowledgment of the limitations of their local knowledge can form the basis for collaborative inquiry and the linkage of external knowledge with their local knowledge.

Rating Scales and Checklists. Educators can use these in checking the performance of learners. Learners can use these to check the performance of educators. They can be administered in group or field settings and can be easily revised (Worden & Neumaier, 1987). Learners can use them to judge their own performance, current knowledge, or educational expectations. Rating scales and checklists are not very useful in measuring attitudes or consequences of performance.

Feedback Committees. When residential training for extension staff and farmers takes place, participants can elect a feedback committee to provide evaluation observations to the leadership during the event. The feedback committee should be open to any complaints participants may have about the event, ranging from relevance of content, adequacy of facilities, or effectiveness of leadership, to the involvement of learners in discussion or activities. The committee can bring items to the attention of the instructor or to the group as a whole through written or oral form (Apps, 1991).

Group Discussion Assessment. This method can be incorporated into an ongoing group or meeting. It is relatively efficient in terms of costs and time use. The discussion is usually focussed on several open-ended questions, including those listed above. Groups can create their own questions and then make recommendations for changes.

Peer Review Panels. Farmers can become involved in evaluating one another's work through peer review panels. Panels can be taught to use standards and rating scales. Their evaluative judgements can be made with or without identification of reviewers. When peer review panels are used, it is important to establish a positive climate of constructive criticism.

End-of-Event Analysis. This can be done in several ways. The most frequently used is an evaluation form that is administered at the end of workshop sessions. Another way is to have these forms distributed, collected, and summarized by a feedback committee. The committee can then report the findings and lead a discussion on the overall strengths of the workshop or training event. After discussing what should not be changed, the group can then discuss what specific modifications should be made (Apps, 1991).

Testimonials and Stories. Testimonials and stories can provide subjective records of educational experiences and activities from the perspective of the learners. They are a form of results data and can qualitatively describe the nature and process of educational change. These stories also can be easily understood by others outside the programme as illustrations of types of outcomes and can lead to ideas for future programming. Disadvantages of this method include social desirability bias, nongeneralizability beyond the person giving the testimony, and difficulty in determining what happened as a result of the programme versus other influences on the person. Stories can be either written by the learners or created as a result of an audio-taped interview.

Pretests, Posttests, and Quizzes. In spite of the negative attitudes associated with tests and quizzes, they can be useful for diagnosing learner proficiencies, documenting prior knowledge, projecting learning achievements, and understanding learner attitudes. Repeating the quiz at the end of a learning event can document change that also can demonstrate to learners themselves that they have learned (Jacobs & Chase, 1992).
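
Documenting change by repeating a quiz can be sketched as a comparison of each learner's two scores. The learner labels, the scores, and the function name score_gains are hypothetical, and the sketch assumes the same quiz was given before and after the learning event.

```python
# Hypothetical quiz scores (number of correct answers out of 10).
pretest = {"learner_a": 4, "learner_b": 6, "learner_c": 5}
posttest = {"learner_a": 7, "learner_b": 6, "learner_c": 9}


def score_gains(before, after):
    """Return posttest minus pretest for each learner who took both quizzes."""
    return {name: after[name] - before[name] for name in before if name in after}


gains = score_gains(pretest, posttest)
mean_gain = sum(gains.values()) / len(gains)

for name, gain in gains.items():
    print(f"{name}: {gain:+d}")
print(f"mean gain: {mean_gain:.2f}")
```

Showing each learner their own gain serves the purpose noted above, that of demonstrating to learners that they have learned. The mean gain gives the educator a rough group-level summary, although it cannot by itself separate the effect of the event from other influences.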

Interpreting findings and data: To what do we compare the findings?

In some evaluations, the findings consist of narratives that describe the history of events. In other cases, evaluation findings are descriptions of the way things are without explanations or judgements. In still other cases, evaluations seek to answer questions of cause and effect or the relationships between methods and outcomes. As useful as these findings may be, they do not in themselves help us to judge the worth of programme efforts. How do we know that a programme deserves praise? This can occur only when the findings are judged in relationship to several judgement perspectives, five of which are briefly discussed below.

Standard-Referenced Judgement Perspective

Findings regarding a particular extension programme can be compared with examples of excellent achievement by using agreed-upon criteria established by experts as "state of the art." For instance, such experts could describe excellence as (1) promotion of sustainable agricultural practice; (2) recognition of local knowledge and cultural practices; (3) high participation of farmers in appropriate technology generation; (4) high inclusion of farmers from all economic, social, and cultural groups; (5) access of women farmers to extension; and (6) promotion of justice in agricultural policies.

Cohort-Referenced Judgement Perspective

Findings from an extension programme evaluation can be compared with similar programmes. For this judgement, we ask how a particular programme or aspect of a programme would be ranked with extension programmes elsewhere. External evaluators are most prepared to do this since they may be knowledgeable about extension practices and impacts elsewhere. This approach is limited because few places are truly similar in extension resources, political and economic contexts, and other handicaps that can affect programme outcomes. However, a failure to make judgements using external cohort comparisons may result in a false sense of achievement and self-satisfaction. Such programmes, when compared, may be mediocre, outdated, poorly conceptualized, and wasteful while being considered excellent or acceptable by local practitioners and their funders.

Difficulty-Referenced Judgement Perspective

This perspective takes into consideration the difficulty of what is being attempted when making judgements regarding programme achievement. For example, an extension programme that is addressing sustainable agriculture in highland areas with few staff members in the midst of revolutionary unrest must be given extra credit for achievements under these difficult circumstances compared with programmes that have a large staff who are helping farmers with relatively rich soil increase their crop production. Difficulty-referenced judgement is the basis for giving higher scores for more difficult dives in Olympic competition. Achievement must be judged relative to difficulty and conditions.

Progress-Referenced Judgement Perspective

This perspective on the interpretation of findings gives credit and recognition to progress from past to present. Before-and-after descriptions are essential to making these judgements. Those who emphasize "base line" data usually want to use it to make progress-referenced judgements.

Alternative-Referenced Judgement Perspective

This perspective considers present descriptions of an extension programme in comparison with what could have been accomplished with the same resources used in alternative ways and places or with different people. This perspective asks, "How else could these resources have been spent with better or different results?" For example, would sustainable agricultural and environmental practices be more evident today if funds had been given directly to village groups or farmers' associations for conducting their own participatory action research? Is the present farm visit system better than a distance education system using radio and audio tapes? What would have been the results if all areas of extension programmes had worked more directly with women through their traditional organizations? Sometimes answers to these questions can be generated by projecting results based on pilot efforts or by alternative approaches that have already been tried in limited form in parts of an extension system.

These judgement perspectives also are helpful in evaluation of learning by individual participants themselves. Learners can judge their learning in comparison with expert standards, performance by their peers, their learning handicaps and external difficulties, their own progress from past to present, and what they would have found more important to have learned or done with their time. Often a combination of these judgements provides a balanced evaluation.

Managing the work: Who will be responsible for what?

All too often decisions are made to undertake an extensive evaluation without counting the cost and without people agreeing to do the work. These efforts are doomed to failure. Negotiating what is possible, given resource limitations and reasonable time expectations, is an essential collaborative task if people at all levels are to own the evaluation and act on the findings. Involving learners and educators together in critically reflecting on and interpreting the evidence will go a long way toward guaranteeing that evaluation itself is a major learning process.

Using evaluation findings for improving extension: Who should be told what?

Typically, a single report provided to stakeholders neglects the interests of many other participant groups. One solution is to consider the findings, interpretations, and judgements as constituting a pool from which a variety of reports and styles of reporting can be fashioned to serve specific purposes and different users. Forms of reporting can include formal written reports, written executive summaries, letters to individuals and organizations, exhibits and pictorial displays, video narratives, news reports, news conferences, and public meetings. Stakeholders can then be given the information to which they are entitled in the form that best suits their purposes and best encourages learning.

Evaluation is a process of individual and collective learning (Choudhary & Tandon, n.d.). We can learn from our successes, but especially from our failures. Korton (1980) calls this "embracing error." Evaluation provides that occasion.

References

Apps, J. W. (1991). Mastering the teaching of adults. Malabar, FL: Krieger.

Arnstein, S. (1969). The ladder of citizen participation. Journal of American Institute of Planners, 35, 221.

Brunner, I., & Guzman, A. (1989). Participatory evaluation: A tool to assess projects and empower people. In R. F. Conner & M. Hendricks (Eds.), International innovations in evaluation methodology (pp. 9-18). New Directions for Program Evaluation, No. 42. San Francisco: Jossey-Bass.

Choudhary, A., & Tandon, R. (n.d.). Participatory evaluation: Issues and concerns. New Delhi, India: Society for Participatory Research in Asia.

Cronbach, L. J., & Associates. (1981). Toward reform of program evaluation: Aims, methods, and institutional arrangements. San Francisco: Jossey-Bass.

Deshler, D. (1990). Concept mapping. In J. Mezirow & Associates (Eds.), Fostering critical reflection in adulthood: A guide to transformative learning. San Francisco: Jossey-Bass.

Eisner, E. (1983). Educational connoisseurship and criticism: Their form and function in educational evaluation. In G. F. Madaus, M. Scriven, & D. Stufflebeam (Eds.), Evaluation models. Boston: Kluwer-Nijhoff.

FAO (1990). Participation in practice: Lessons learned from the FAO People's Participation Programme. Rome: Food and Agriculture Organization, People's Participation Programme.

Gold, N. (1988). Stakeholders and program evaluation: Characterizations and reflections. In A. S. Bryk (Ed.), Stakeholder-based evaluation. New Directions for Program Evaluation, No. 17 (pp. 62-63). San Francisco: Jossey-Bass.

Goldstein, I. L. (1986). Evaluation procedures. In I. L. Goldstein, Training in organizations: Needs assessment, development and evaluation (2nd ed.). Monterey, CA: Brooks/Cole.

Greene, J. C. (1988). Stakeholder participation and utilization in program evaluation. Evaluation Review, 12, 91-116.

Greene, J. C., Caracelli, V. J., & Graham, W. F. (1989). Toward a conceptual framework for mixed-method evaluation designs. Educational Evaluation and Policy Analysis, 11, 255-274.

Guba, E. G., & Lincoln, Y. S. (1989). Fourth-generation evaluation. Newbury Park, CA: Sage.

Jacobs, L. C., & Chase, C. I. (1992). Developing and using tests effectively: A guide for faculty. San Francisco: Jossey-Bass.

Korton, D. C. (1980). Community organizations and rural development: A learning process approach. Public Administration Review, 40 (5), 480-511.

Murphy, J., & Marchant, T. (1988). Monitoring and evaluation in extension agencies. Washington, DC: The International Bank for Reconstruction and Development/The World Bank.

Patton, M. Q. (1991). Beyond evaluation myths. Adult Learning (October), 9-10, 28.

Provus, M. M. (1971). Discrepancy evaluation for education program improvement and assessment. Berkeley, CA: McCutchan.

Ruddell, E. D. (1994). A simplified methodology for training peasant farmers how to conduct site-specific scientific field trials in deprived rural areas. 1994 Conference Refereed Papers. Association for International Agricultural and Extension Education. Tenth Annual Conference, Arlington, VA.

Rossi, P. H., & Freeman, H. E. (1982). Evaluation: A systematic approach. Beverly Hills, CA: Sage.

Scriven, M. (1972). Pros and cons about goal-free evaluation. Evaluation Comment, 3, 1-5.

Stufflebeam, D. L. (1971). Educational evaluation and decision making. Itasca, IL: F. E. Peacock.

Summers, J. C. (1977). Dimensions of program effectiveness and accountability. Morgantown: West Virginia University, Office of Research and Development.

Tripodi, T., Fellin, P., & Epstein, I. (1971). Social program evaluation: Guidelines for health education and welfare administration. Itasca, IL: F. E. Peacock.

Uphoff, N. (1989). A field methodology for participatory self-evaluation of P.P.P. group and inter-group association performance. Ithaca, NY: People's Participation Programme of the U.N. Food and Agriculture Organization.

Uphoff, N. (1992). Participatory evaluation of rural development. New York: IFAD Monitoring and Evaluation Division for the Monitoring and Evaluation Panel.

Worden, P. E., & Neumaier, P. (1987). A source book for program evaluation and accountability. Fort Collins: Colorado State University, State Cooperative Extension.

[Figure text: a two-by-two matrix crossing evaluator (external or internal) with criteria (external or internal).]