Steps in Program Evaluation
The framework emphasizes six connected steps that together can be used as
a starting point to tailor an evaluation for a particular public health
effort, at a particular point in time. Because the steps are all
interdependent, they might be encountered in a nonlinear sequence; however,
an order exists for fulfilling each — earlier steps provide the foundation
for subsequent progress. Thus, decisions regarding how to execute a step
should not be finalized until previous steps have been thoroughly addressed.
The steps are as follows:
- Engage stakeholders
- Describe the program
- Focus the evaluation design
- Gather credible evidence
- Justify conclusions
- Ensure use and share lessons learned
Adhering to these six steps fosters an understanding of each program's
context (e.g., the program's history, setting, and organization) and
improves how most evaluations are conceived and conducted.
Engage Stakeholders
The evaluation cycle begins by engaging stakeholders (i.e.,
the persons or organizations having an investment in what will be learned
from an evaluation and what will be done with the knowledge). Public health
work involves partnerships; therefore, any assessment of a public health
program requires considering the value systems of the partners. Stakeholders
must be engaged in the inquiry to ensure that their perspectives are
understood. When stakeholders are not engaged, evaluation findings might be
ignored, criticized, or resisted because they do not address the
stakeholders' questions or values. After becoming involved, stakeholders
help to execute the other steps. Identifying and engaging the following
three groups are critical:
- Those involved in program operations (e.g., sponsors, collaborators, coalition partners, funding officials, administrators, managers, and staff)
- Those served or affected by the program (e.g., clients, family members, neighborhood organizations, academic institutions, elected officials, advocacy groups, professional associations, skeptics, opponents, and staff of related or competing organizations)
- Primary users of the evaluation (e.g., the specific persons who are in a position to do or decide something regarding the program). In practice, primary users will be a subset of all stakeholders identified. A successful evaluation will designate primary users early in its development and maintain frequent interaction with them so that the evaluation addresses their values and satisfies their unique information needs.
For additional details, see "Engaging
Stakeholders".
Describe the Program
Program descriptions convey the mission and objectives of the
program being evaluated. Descriptions should be sufficiently detailed to
ensure understanding of program goals and strategies. The description should
discuss the program's capacity to effect change, its stage of development,
and how it fits into the larger organization and community. Program
descriptions set the frame of reference for all subsequent decisions in an
evaluation. The description enables comparisons with similar programs and
facilitates attempts to connect program components to their effects.
Moreover, stakeholders might have differing ideas regarding program goals
and purposes. Evaluations done without agreement on the program definition
are likely to be of limited use. Sometimes, negotiating with stakeholders to
formulate a clear and logical description will bring benefits before data
are available to evaluate program effectiveness. Aspects to include in a
program description are:
Need:
A statement of need describes the problem or
opportunity that the program addresses and implies how the program will
respond.
Expected effects:
Descriptions of expected effects convey what the program must accomplish
to be considered successful.
Activities:
Describing program activities (i.e., what the program does to effect
change) permits specific steps, strategies, or actions to be arrayed in
logical sequence. This demonstrates how each program activity relates to
another and clarifies the program’s hypothesized mechanism or theory of
change.
Resources:
Resources include the time, talent,
technology, information, money, and other assets available to
conduct program activities.
Stage of development:
Public health programs mature and change over time; therefore, a program’s
stage of development reflects its maturity. A minimum of three
stages of development must be
recognized: planning, implementation, and effects.
During planning, program activities are untested, and the goal of
evaluation is to refine plans. During implementation, program activities
are being field-tested and modified; the goal of evaluation is to
characterize real, as opposed to ideal, program activities and to improve
operations, perhaps by revising plans. During the last stage, enough time
has passed for the program’s effects to emerge; the goal of evaluation is
to identify and account for both intended and unintended effects.
Context:
Descriptions of the program’s context should include the setting and
environmental influences (e.g., history, geography, politics, social and
economic conditions, and efforts of related or competing organizations)
within which the program operates. Understanding these environmental
influences is required to design a context-sensitive evaluation and will
aid users in interpreting findings accurately and assessing the
generalizability of the findings.
Logic model:
A logic model, which describes the sequence of events for bringing about
change, synthesizes the main program elements into a picture of how the
program is supposed to work. Often, this model is displayed in a flow
chart, map, or table to portray the sequence of steps leading to program
results.
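As a simple illustration, the sketch below (written in Python) lays out a hypothetical logic model as an ordered mapping from stage to elements; the program, stages, and entries are invented for demonstration and are not prescribed by the framework.

    # A minimal, hypothetical logic model for an illustrative smoking-cessation
    # program; the stages and entries are invented for demonstration only.
    logic_model = {
        "inputs": ["funding", "trained counselors", "clinic partnerships"],
        "activities": ["community outreach", "counseling sessions"],
        "outputs": ["clients enrolled", "sessions delivered"],
        "short-term outcomes": ["increased knowledge", "quit attempts"],
        "long-term outcomes": ["reduced smoking prevalence"],
    }

    # Print the stages in order to show the hypothesized sequence of change.
    for stage, elements in logic_model.items():
        print(f"{stage}: {', '.join(elements)}")

Displayed this way, or as an equivalent flow chart or table, the model makes the hypothesized chain from resources to long-term outcomes easy to review with stakeholders.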
For additional details, see
"Describing the Program".
Focus the Evaluation Design
The direction and process of the evaluation
must be focused to assess the issues of greatest concern to stakeholders
while using time and resources as efficiently as possible. Not all design
options are equally well-suited to meeting the information needs of
stakeholders. After data collection begins, changing procedures might be
difficult or impossible, even if better methods become obvious. A thorough
plan anticipates intended uses and creates an evaluation strategy with the
greatest chance of being useful, feasible, ethical, and accurate. Among the
items to consider when focusing an evaluation are the following:
Purpose:
Articulating an evaluation’s purpose (i.e., intent) will prevent premature
decision-making regarding how the evaluation should be conducted.
Characteristics of the program, particularly its stage of development and
context, will influence the evaluation’s purpose. Four general purposes
exist for conducting evaluations in public health practice.
- Gain insight -- evaluations done for this purpose provide the
necessary insight to clarify how program activities should be designed
to bring about expected changes.
- Change practice -- evaluations done for this purpose include
efforts to improve the quality, effectiveness, or efficiency of program
activities.
- Assess effects -- evaluations done for this purpose examine
the relationship between program activities and observed consequences.
- Affect participants -- evaluations done for this purpose use
the processes of evaluation to affect those who participate in the
inquiry. The logic and systematic reflection required of
stakeholders who participate in an evaluation can be a catalyst for
self-directed change. An evaluation can be initiated with the intent
that the evaluation procedures themselves will generate a positive
influence.
Users:
Users are the specific persons who will receive evaluation findings. Because
intended users directly experience the consequences of inevitable design
trade-offs, they should participate in choosing the evaluation focus.
User involvement is required for clarifying intended uses, prioritizing
questions and methods, and preventing the evaluation from becoming a
misguided or irrelevant exercise.
Uses:
Uses are the specific ways in which information generated from the
evaluation will be applied. Several uses exist for program evaluation.
Uses should be planned and prioritized with input from stakeholders and
with regard for the program's stage of development and current context.
All uses must be linked to one or more specific users.
Questions:
Questions establish boundaries for the evaluation by stating what aspects
of the program will be addressed. Negotiating and prioritizing questions
among stakeholders further refines a viable focus for the evaluation. The
question-development phase might also expose differing opinions regarding
the best unit of analysis. Certain stakeholders might want to study how
programs operate together as a system of interventions to effect change
within a community. Other stakeholders might have questions concerning the
performance of a single program or a local project within that program.
Still others might want to concentrate on specific subcomponents or
processes of a project.
Methods:
The methods for an evaluation are
drawn from scientific research options, particularly those developed in
the social, behavioral, and health sciences. A basic classification of
design types includes experimental, quasi-experimental, and observational
designs. No design is intrinsically better than another under all
circumstances. Evaluation methods should be selected to provide the
appropriate information to address stakeholders’ questions (i.e., methods
should be matched to the primary users, uses, and questions).
Methodology decisions also raise questions regarding
how the evaluation will operate (e.g., to what extent program participants
will be involved; how information sources will be selected; what data
collection instruments will be used; who will collect the data; what data
management systems will be needed; and what methods of analysis,
synthesis, interpretation, and presentation are appropriate). Because each
method option has its own bias and limitations, evaluations that mix
methods are generally more effective.
Agreements:
Agreements summarize the evaluation
procedures and clarify roles and responsibilities among those who will
execute the plan. Agreements describe how the evaluation plan will be
implemented by using available resources (e.g., money, personnel, time,
and information). Agreements also state what safeguards are in place to
protect human subjects and, where appropriate, what ethical (e.g.,
institutional review board) or administrative (e.g., paperwork reduction)
approvals have been obtained. Creating an explicit agreement verifies the
mutual understanding needed for a successful evaluation. It also provides
a basis for modifying or renegotiating procedures if necessary.
For additional details, see
"Focusing the
Evaluation Design".
Gather Credible Evidence
Persons involved in an evaluation should strive to
collect information that will convey a well-rounded picture of the program
and be seen as credible by the evaluation’s primary users. Information
(i.e., evidence) should be perceived by stakeholders as believable and
relevant for answering their questions. Such decisions depend on the
evaluation questions being posed and the motives for asking them. Having
credible evidence strengthens evaluation judgments and the recommendations
that follow from them. Although all types of data have limitations, an
evaluation’s overall credibility can be improved by using multiple
procedures for gathering, analyzing, and interpreting data. Encouraging
participation by stakeholders can also enhance perceived credibility. When
stakeholders are involved in defining and gathering data that they find
credible, they will be more likely to accept the evaluation’s conclusions
and to act on its recommendations. The following aspects of evidence
gathering typically affect perceptions of credibility:
Indicators:
Indicators
define the program attributes that pertain
to the evaluation’s focus and questions. Because indicators translate
general concepts regarding the program, its context, and its expected
effects into specific measures that can be interpreted, they provide a
basis for collecting evidence that is valid and reliable for the
evaluation’s intended uses. Indicators address criteria that will be used
to judge the program; they therefore highlight aspects of the program that
are meaningful for monitoring.
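For example, indicators can be recorded in a simple structure that ties each general concept to a specific measure, a data source, and a target. The sketch below (Python) uses an invented cessation program; its concepts, measures, sources, and targets are assumptions made only for illustration.

    # Hypothetical indicators for an illustrative cessation program; the
    # concepts, measures, sources, and targets are invented, not prescribed.
    indicators = [
        {"concept": "program reach",
         "measure": "number of clients enrolled per quarter",
         "source": "enrollment records",
         "target": 200},
        {"concept": "behavior change",
         "measure": "proportion of clients abstinent at 6 months",
         "source": "follow-up survey",
         "target": 0.20},
    ]

    # List each indicator alongside the source and target used to judge it.
    for item in indicators:
        print(f"{item['concept']}: {item['measure']} "
              f"(source: {item['source']}; target: {item['target']})")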
Sources:
Sources of evidence in an evaluation
are the persons, documents, or
observations that provide information for the inquiry.
More than one source might be used to gather evidence for each indicator
to be measured. Selecting multiple sources provides an opportunity to
include different perspectives regarding the program and thus enhances the
evaluation’s credibility. The criteria used for selecting sources
should be stated clearly so that users and other stakeholders can
interpret the evidence accurately and assess whether it might be biased. In
addition, some sources are narrative in form and others are numeric. The
integration of qualitative and quantitative information can increase the
chances that the evidence base will be balanced, thereby meeting the needs
and expectations of diverse users. Finally, in certain cases, separate
evaluations might be selected as sources for conducting a larger synthesis
evaluation.
Quality:
Quality refers to the appropriateness
and integrity of information used in an evaluation. High-quality data are
reliable, valid, and informative for their intended use. Well-defined
indicators enable easier collection of quality data. Other factors
affecting quality include instrument design, data-collection procedures,
training of data collectors, source selection, coding, data management,
and routine error checking. Obtaining quality data will entail trade-offs
(e.g., breadth versus depth) that should be negotiated among stakeholders.
Because all data have limitations, the intent of a practical evaluation is
to strive for a level of quality that meets the stakeholders’ threshold
for credibility.
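Routine error checking, for instance, can be a short script that flags missing or out-of-range values before analysis. In the sketch below (Python), the field names, valid ranges, and sample records are illustrative assumptions rather than prescribed checks.

    # Flag common data-quality problems in survey records. The field names and
    # valid ranges are hypothetical and would be defined per instrument.
    def check_record(record):
        problems = []
        age = record.get("age")
        if age is None or not (0 <= age <= 120):
            problems.append("age missing or out of range")
        if not record.get("visit_date"):
            problems.append("visit date missing")
        if record.get("satisfaction") not in {1, 2, 3, 4, 5}:
            problems.append("satisfaction score not on the 1-5 scale")
        return problems

    records = [
        {"age": 34, "visit_date": "2024-02-01", "satisfaction": 4},
        {"age": 210, "visit_date": None, "satisfaction": 9},
    ]
    for i, record in enumerate(records):
        for problem in check_record(record):
            print(f"record {i}: {problem}")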
Quantity:
Quantity refers
to the amount of evidence gathered in an evaluation. The amount of
information required should be estimated in advance, or where evolving
processes are used, criteria should be set for deciding when to stop
collecting data. Quantity affects the potential confidence level or
precision of the evaluation’s conclusions. It also partly determines
whether the evaluation will have sufficient power to detect effects. All
evidence collected should have a clear, anticipated use. Correspondingly,
only a minimal burden should be placed on respondents for providing
information.
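To illustrate the link between quantity and precision, the sketch below (Python) applies the standard sample-size formula for estimating a proportion within a chosen margin of error; the default values (a 50% proportion and 95% confidence) are assumptions chosen only to make the arithmetic concrete.

    import math

    def sample_size_for_proportion(p=0.5, margin=0.05, z=1.96):
        """Approximate respondents needed to estimate a proportion p within
        +/- margin at ~95% confidence: n = z**2 * p * (1 - p) / margin**2."""
        return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

    print(sample_size_for_proportion())             # about 385 respondents
    print(sample_size_for_proportion(margin=0.10))  # about 97 respondents

Because the required sample grows with the inverse square of the margin of error, halving the margin roughly quadruples the number of respondents, which is one reason quantity decisions involve trade-offs with respondent burden.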
Logistics:
Logistics encompass the methods,
timing, and physical infrastructure for gathering and handling evidence.
Each
technique for gathering evidence must be suited to the
source(s), analysis plan, and strategy for communicating findings. Persons
and organizations also have cultural preferences that dictate acceptable
ways of asking questions and collecting information, including who would
be perceived as an appropriate person to ask the questions. The techniques
for gathering evidence in an evaluation must be aligned with the cultural
conditions in each setting of the project. Data-collection procedures
should also be scrutinized to ensure that the privacy and confidentiality
of the information and sources are protected.
For additional details, see
"Gathering Credible
Evidence".
Justify Conclusions
Evaluation conclusions are justified when they are linked to
the evidence gathered and judged against agreed-upon values or standards set
by the stakeholders. Stakeholders must agree that conclusions are justified
before they will use the evaluation results with confidence. Justifying
conclusions on the basis of evidence includes the following five elements:
- Standards:
Standards reflect the values held by stakeholders and provide the
basis for forming judgments concerning program performance. Using explicit
standards for judgment is fundamental for effective evaluation because it
distinguishes evaluation from other approaches to strategic management in
which priorities are set without reference to explicit values. In
practice, when stakeholders articulate and negotiate their values, these
become the standards for judging whether a given program’s performance
will, for example, be considered successful, adequate, or unsuccessful. An
array of value systems might serve as sources of standards.
When operationalized, these standards establish a comparison by which the
program can be judged.
- Analysis and synthesis:
Analysis and synthesis are methods for examining and summarizing
an evaluation’s findings. They detect patterns in evidence, either by
isolating important findings (analysis) or by combining sources of
information to reach a larger understanding (synthesis). Mixed method
evaluations require the separate analysis of each evidence element and a
synthesis of all sources for examining patterns of agreement, convergence,
or complexity. Deciphering facts from a body of evidence involves deciding
how to organize, classify, interrelate, compare, and display information.
These decisions are guided by the questions being asked, the types of data
available, and by input from stakeholders and primary users.
- Interpretation:
Interpretation is the effort of figuring out what the findings mean and is
part of the overall effort to
make
sense of the
evidence gathered in an evaluation. Uncovering facts regarding a program’s
performance is not sufficient to draw evaluative conclusions. Evaluation
evidence must be interpreted to appreciate the practical significance of
what has been learned. Interpretations draw on information and
perspectives that stakeholders bring to the evaluation inquiry and can be
strengthened through active participation or interaction.
- Judgment:
Judgments are statements concerning the merit, worth, or significance of
the program. They are formed by comparing the findings and interpretations
regarding the program against one or more selected standards. Because
multiple standards can be applied to a given program, stakeholders might
reach different or even conflicting judgments. Conflicting claims
regarding a program’s quality, value, or importance often indicate that
stakeholders are using different standards for judgment. In the context of
an evaluation, such disagreement can be a catalyst for clarifying relevant
values and for negotiating the appropriate bases on which the program
should be judged.
- Recommendations:
Recommendations are actions for consideration resulting
from the evaluation. Forming recommendations is a distinct element of
program evaluation that requires information beyond what is necessary to
form judgments regarding program performance. Knowing that a program is
able to reduce the risk of disease does not necessarily translate into a
recommendation to continue the effort, particularly when competing
priorities or other effective alternatives exist. Thus, recommendations
for continuing, expanding, redesigning, or terminating a program are
separate from judgments regarding a program’s effectiveness. Making
recommendations requires information concerning the context,
particularly the organizational
context, in which programmatic decisions will be made.
For additional details, see
"Justifying Conclusions".
Ensure Use and Share Lessons Learned
Assuming that lessons learned in the course of an evaluation
will automatically translate into informed decision-making and appropriate
action would be naive. Deliberate effort is needed to ensure that the
evaluation processes and findings are used and disseminated appropriately.
Preparing for use involves strategic thinking and continued vigilance, both
of which begin in the earliest stages of stakeholder engagement and continue
throughout the evaluation process. The following five elements are critical
for ensuring use of an evaluation:
- Design:
Design refers to how the evaluation’s questions, methods, and
overall processes are constructed. As discussed in the third step of this
framework, the design should be organized from the start to achieve
intended uses by primary users. Having a clear design that is focused on
use helps persons who will conduct the evaluation to know precisely who
will do what with the findings and who will benefit from being a part of
the evaluation.
- Preparation:
Preparation refers to the steps taken to rehearse
eventual use of the evaluation findings. The ability to translate new
knowledge into appropriate action is a skill that can be strengthened
through practice. Building this skill can itself be a useful benefit of
the evaluation. Rehearsing how potential findings (particularly negative
findings) might affect decision-making will prepare stakeholders for
eventually using the evidence. Preparing for use also gives stakeholders
time to explore positive and negative implications of potential results
and time to identify options for program improvement.
- Feedback:
Feedback is the communication that occurs among all
parties to the evaluation. Giving and receiving feedback creates an
atmosphere of trust among stakeholders; it keeps an evaluation on track by
letting those involved stay informed regarding how the evaluation is
proceeding.
- Follow-up:
Follow-up refers to the technical and emotional
support that users need during the evaluation and after they receive
evaluation findings. Because of the effort required, reaching justified
conclusions in an evaluation can seem like an end in itself; however,
active follow-up might be necessary to remind intended users of their
planned uses. Follow-up might also be required to prevent lessons learned
from becoming lost or ignored in the process of making complex or
politically sensitive decisions. Facilitating use of evaluation findings
also carries with it the responsibility for preventing misuse. Active
follow-up can help prevent misuse by ensuring that evidence is not
misinterpreted and is not applied to
questions other than those that were the central focus of the evaluation.
- Dissemination:
Dissemination is the process of communicating either the
procedures or the lessons learned from an evaluation to relevant audiences
in a timely, unbiased, and consistent fashion.
Although documentation of the evaluation is needed, a formal report
is not always the best or even a necessary product. Planning effective
communication also requires considering the timing, style, tone, message
source, vehicle, and format of information products. Regardless of how
communications are constructed, the goal for dissemination is to achieve
full disclosure and impartial reporting. A
checklist of items to
consider when developing evaluation reports includes tailoring the report
content for the audience, explaining the focus of the evaluation and its
limitations, and listing both the strengths and weaknesses of the
evaluation.
For additional details, see
"Ensuring Use and Sharing Lessons
Learned".