Module_3

Uploaded by

minoxlive72
Copyright
© All Rights Reserved
DIPLOMA IN MONITORING AND EVALUATION
MODULE 3
Module three of the Diploma in Monitoring and Evaluation

CATI
CAPACITY AFRICA TRAINING INSTITUTE
TABLE OF CONTENTS
Choosing questions and planning for Evaluation
Information Gathering and Synthesis
Qualitative and Quantitative Evaluation Design
Selecting appropriate Design
Collecting and Analyzing Data
Collecting and use of Archival Data
Refining Project based on Evaluation Research

MODULE 3 MONITORING AND EVALUATION
Chapter 1

CHOOSING QUESTIONS AND PLANNING FOR THE EVALUATION

In this Module, we'll discuss the first, and perhaps most important, step in evaluation research: deciding

exactly what to evaluate. Each of the rest of the sections in the chapter will deal in detail with one of the

steps you'll need to take to design, implement, and use the evaluation. The goal of the chapter is to provide

guidelines that are useful to grassroots or community-based organizations as well as students or academic

researchers.

WHAT DO WE MEAN BY CHOOSING QUESTIONS, AND WHY IS IT NECESSARY?

Every evaluation, like any other research, starts with one or more questions. Sometimes, the

questions are simple and easy to answer. (Will we serve something close to the 50 people we

expect to?) Often, however, the questions can be complex and the answers less easy to find.

(Which, or which combination, of the three parts of our intervention will affect which of the two

behavior changes we seek within participants?) The questions you ask will guide not only your

evaluation, but your program as well. By your choice of questions, you're defining what it is you're

trying to change.

For example, what's the real goal of a program to introduce healthier foods in school lunches? It

could be simply to convince children to eat more fruits, vegetables, and whole grains. It could be

to get them to eat less junk food. It could be to encourage weight loss in kids who are overweight

or obese. It could be to educate them about healthy eating, and to persuade them to be more

adventurous eaters.

The evaluation questions you ask both reflect and determine your goals for the program. If you

don't measure weight loss, for instance, then clearly that's not what you're aiming for. If you only

look at an increase in children's consumption of healthy foods, you're ignoring the fact that if they

don't cut down on something else (junk food, for instance), they'll simply gain weight. Is that still

better than not eating the healthy foods? You answer that question by what you choose to examine

- if it is better, you may not care what else the children are eating; if it's not, then you will care.

You choose your evaluation questions by analyzing the community problem or issue you're

addressing, and deciding how you want to affect it. Why do you want to ask this particular

question in relation to your evaluation? What is it about the issue that is the most pressing to

change? What indicators will tell you whether that change is taking place? Is that all you're

concerned with? The answer to each of these and other questions helps to define what it is you're

trying to do, and, by extension, how you'll try to do it.

THINGS TO CONSIDER WHEN CHOOSING EVALUATION QUESTIONS

WHAT DO YOU WANT TO KNOW?

Academics and other researchers may approach choosing research questions differently from

those involved in community programs. In addition to their practical and social applications,

they may choose problems to research simply because they are interesting, or because they tie

into other work that they or their colleagues are doing. Community service workers and others

directly involved in programs, on the other hand, are concerned specifically with improving what

they're doing so they can help to enhance the quality of life for the participants in their programs,

and often for the community as a whole. Since we assume that most people using this chapter of

the Tool Box are likely to be practitioners in the community, let's look at some of the reasons

they might pick a particular area to evaluate.

If you're running, or about to run, a program to affect a community issue or problem, you

might want to know one or more of the following:

• Is there a cause-and-effect relationship (i.e., does one action or condition directly cause

another) between a particular action and a particular change? Usually, you'll be concerned with

this in terms of your program. (Does our smoking-cessation support group help members to

quit smoking?) Sometimes, however, it might be important to look at it in terms of the

community. (Does a smoking ban in public buildings, bars, and restaurants lead to a decrease in

the number of community residents who smoke?)

• If we try this new method, what will happen?

• Will the program that worked in the next town, or the one that we read about in a professional

journal, work with our population, or with our issue?

WHY ARE YOU INTERESTED?

Some of the same differences between the concerns of researchers and the concerns of

practitioners may hold here. Those interested primarily in research may simply be moved by

curiosity or by the urge to solve a difficult problem. As a practitioner, on the other hand, you'll

want to know the effects of what you're doing on the lives of participants or the community.

Your interest, therefore, might grow from:

• Your experience with an issue and its consequences in a particular population or

community

• Your knowledge of promising interventions and their effects on similar issues

• The uniqueness of the issue to your particular community or population

• The similarity of the issue to other issues in your community, or the issue's interaction with

other issues

Your interest as a community worker has to be considered in relation to your evaluation and the

purpose of your program. Your basic intent is probably to improve things for the population or

the community, but in what ways and by what means? Are you trying out some new things in the

hope of making an already-successful program more successful? Are you importing a promising

practice to see if it works with your population? Are you trying to solve a particularly difficult

professional problem?

A community mediation program found that it was having little success in cases involving

adolescents. After conferring with other similar programs - all of which were struggling with the

same issue - mediators in the program devised a number of strategies to try to reach youth. The

overall question they were concerned with - "Will these strategies make it possible to mediate

successfully where teens are involved?" - was one with real consequences.

IS THE ISSUE YOU'RE ADDRESSING IMPORTANT TO THE COMMUNITY OR TO THE

SOCIETY?

Media reports about or community attempts to address the issue are clear indicators that it is

socially important. If it affects a particular group - violence in a given neighborhood, a high rate

of heart disease among middle-aged black males - it has an obvious impact on the community

and society. If your program or intervention has the potential to help resolve the issue in other

places, to be used by community workers in other fields, or to be applied in a number of ways,

the importance of your analysis increases even further. If addressing the issue can lead to

long-term positive social change, then the analysis is vitally important.

All of this affects your evaluation and the questions you ask. If the issue is one of social

importance, then your evaluation of your work is socially important as well. Are you addressing

the aspects of your program or intervention that are of the greatest value to participants, the

community, and society? If not, how might you begin to do so?

HOW DOES THE ISSUE RELATE TO THE FIELD?

The real question here is not whether the issue is important to the field - if it's important to the

community, that's what matters. However, you should explore whether there's evidence from the

field to apply to the issue. Is what you're doing likely to be more effective than other approaches

that have been tried? If your approach isn't effective, are there other approaches out there that hold

more promise? Can the published material about the issue help you understand it better, and give

you better ideas about how to address it?

IS THE ISSUE GENERAL, RATHER THAN SPECIFIC TO YOUR POPULATION OR

COMMUNITY?

Consider whether there is evidence that the issue occurs with a variety of populations and under

a range of conditions. Also consider whether the observations or methods used to determine the

issue's existence are accurate and whether they can be used in different situations and with

different groups. Your evaluation may give you valuable information to pass on to practitioners

in different fields or different circumstances.

WHO MIGHT USE THE RESULTS OF YOUR EVALUATION?

If evaluation shows that your program or intervention is successful, that's obviously valuable

information, especially if what you're evaluating is innovative and hasn't been tried before. Even

if the evaluation turns up major problems with the intervention, that's still important information

for others - it tells them what won't work, or what barriers have to be overcome in order to make

it work.

Some of those who might use your results include individuals and groups affected by the issue;

service providers and others who have to deal with the problem (in the case of youth violence,

for instance, this last group might include police, school officials, small business owners,

parents, and medical personnel, among others); advocates and community activists; and public

officials and other policy makers.

WHOSE ISSUE IS IT?

Who has to change in order to address the issue? The focus of the intervention will tell you

whom the evaluation should focus on.

Some possibilities:

• Those directly affected by the problem

• Those in direct personal contact with those directly affected: parents, spouses and children,

other relatives, friends, neighbors, coworkers

• Those who serve or otherwise deal with those directly affected: medical professionals,

police, teachers, social workers, therapists, etc.

• Administrators and others who serve or deal with those indirectly affected: hospital or

clinic directors, police chiefs, school principals, agency directors, etc.

• Appointed or elected officials and other policy makers

WHY IS IT NECESSARY TO CHOOSE EVALUATION QUESTIONS CAREFULLY?

You know why you're running your program. Evaluating it should just be a matter of deciding

whether things are better when you evaluate than they were before you started, right? Well,

actually...wrong. It's not that simple. First of all, you need to determine what "things" you are

actually looking at (remember the school lunch example?) Second, you will need to consider

how you will determine what you're doing right, and what you need to change. Here's a partial

list of reasons why choosing questions beforehand is important.

• It helps you understand what effects different parts of your effort are having. By

framing questions carefully, you can evaluate different parts of your effort. If you add an

element after the start of the program, for instance, you may be able to see its effect

separate from that of the rest of the program...if you focus on examining it. By the same

token, you can look at different possible effects of the program as a whole. (Do adult basic

education learners read more as a result of being in a program? Are they more likely to

register to vote? Do their children improve their school performance?)

• It makes you clearly define what it is you're trying to do. What you decide to evaluate

defines what you hope to accomplish. Choosing evaluation questions at the start of a

program or effort makes clear what you're trying to change, and what you want your results

to be.

• It shows you where you need to make changes. Carefully choosing questions and making

them specific to your real objectives should tell you exactly where the program is doing

well and where the program isn't having the intended effect.

• It highlights unintended consequences. When you find unusual answers to the questions

you choose, it often means that your program has had some effects you didn't expect.

Sometimes these effects are positive - not only did people in the heart-healthy exercise

program gain in fitness, but a majority of them report changing their diet for the better and

losing weight as well - sometimes negative - obese children in a healthy eating program

actually gained weight, even though they were eating a healthier diet - and sometimes

neither. Like the side effects of medication, the unintended consequences of a program can

be as important as the program itself. (In the case of the exercise program,

the changes in diet might do as much as or more than the exercise to maintain heart

health, for instance, and may point toward changing the focus of the program in some

way.)

• It guides your future choices. If you find that your program is particularly successful in

certain ways and not in others, for example, you may decide to emphasize the successful

areas more, or to completely change your approach in the unsuccessful areas. That, in turn,

will change the emphasis of future evaluation as well.

• It involves stakeholders in setting the course of the program. In participatory evaluations, this

makes it more likely that the program will meet community needs.

• It provides focus for the evaluation and the program. Choosing evaluation questions

carefully keeps you from becoming scattered and trying to do too many things at once,

thereby diluting your effectiveness at all of them.

• It determines what needs to be recorded in order to gather data for evaluation. A clear

choice of evaluation questions makes the actual gathering of data much easier, since it

usually makes obvious what kinds of records must be kept and what areas need to be

examined.

WHEN SHOULD YOU CHOOSE QUESTIONS AND PLAN THE EVALUATION?

Evaluation questions, since they help shape your work, should be chosen and the evaluation

planned when planning the overall program or effort. That gives you time and room for a

participatory process, and gives you the chance to use the evaluation as an integral part of the

program. As the program unfolds, you might find yourself adjusting or adding questions to reflect

the reality of what is happening, but unless your original questions were misguided (you were

wrong about what behavior had to change in order to produce certain results, for instance), they

should serve you well.

Now let's discuss the reality for many community-based and grassroots programs. They're often

understaffed and underfunded. Staff members may be underpaid, and may often work many

more hours a week than they're paid for, because of their dedication to social justice and social

change. Most or all program staff may even be volunteers, with full-time jobs and family

responsibilities aside from their work in the program. Initial evaluation in these circumstances is

often anecdotal - i.e., based on participants' comments and stories about their progress and staff

members' personal, informal observations. A formal evaluation will probably wait until there's

funding for it, or until someone has the time to coordinate or take charge of it.

In that case, the "when" becomes "as soon as you can." You may be dealing with a program that

has just started, or with one that's been operating for a long time. You may know that changes

need to be made, or it may seem that the program is in fact meeting its goals. Whatever the

situation, evaluation questions need to be chosen, and an evaluation planned that will give you

the information you need to improve your work. Even with a program that's been going on for a

while, the questions can still help you define or redefine your work, and will certainly help you

improve it over the long term.

WHO SHOULD BE INVOLVED IN CHOOSING QUESTIONS AND PLANNING THE

EVALUATION?

If you've consulted other sections of the Tool Box concerned with evaluation, you probably

know that we advocate that all stakeholders be involved in planning the evaluation. We believe

that the best evaluation is participatory. That means that there is representation of the views and

knowledge of people affected by the issue to be addressed. The list of potential participants is

essentially the same as that under "Whose issue is it?" in the first part of this section: those

directly affected and their close contacts; those who work with those directly affected, or who

deal directly or indirectly with them and the issue; and public officials. To these groups, we

might add other concerned citizens, and those indirectly affected by the issue. (A shop owner

may not be a victim of neighborhood violence, but fear of that violence might nonetheless keep

customers away from his shop, for instance.)

Evaluations that involve all stakeholders have a number of advantages over those conducted in a

vacuum by outside evaluators or agency or program staff. They're more likely to reflect the real

needs of the community, and they bring to bear the community's knowledge of its own context -

history, relationships, culture, etc. - without which a program and its evaluation can go astray.

Participation can range from simple consultation before the fact to complete involvement in

every aspect of an evaluation - assessment, planning, data gathering, analysis, and passing on the

information. In general, the greater the involvement of stakeholders, the better, but in-depth

involvement of the stakeholders may not always be possible. There are time disadvantages to

participatory evaluation - it takes longer - and there are logistical concerns, as well. Participants

may have nothing in their backgrounds to prepare them for research, so training in a number of

areas may be necessary, requiring skill, careful planning, and yet more time. The level of

participation your evaluation can sustain, therefore, relies to some extent on your time

constraints and your capacity to train and support participants.

HOW DO YOU CHOOSE QUESTIONS AND PLAN THE EVALUATION?

Choosing questions

When you choose evaluation questions, you're really choosing a research problem - what you

want to examine with your research. (Evaluation, whether formal or informal, is in fact research.)

You have to analyze the issue and your program, consider various ways they can be looked at,

and choose the one(s) that most nearly tell you what you want to know about what you're doing.

Are you just trying to determine whether you're reaching the right people in sufficient numbers

with your program? Do you want to know how well an intervention is working with specific

populations? What kinds of behavior changes, if any, are taking place as a result? What are the

actual outcomes for the community? Each of these - as well as each of the many other things

you might want to know - implies a different set of evaluation questions. To find the questions

that best suit your evaluation, there is a series of steps you can follow.

Describe the issue or problem you're addressing

A problem is a difference between some ideal condition (all people 10 years of age or older

should be able to read; people should be able to find a decent job) and some actual condition in

the community or society (a 25% illiteracy rate among those attending a particular high school;

50% unemployment among minority youths in a particular city). This may mean the absence of

some positive factor (qualified teachers and adequate educational facilities; entry-level jobs that

are reachable from minority neighborhoods) or the presence of some negative factor (students'

difficulty with English; discrimination against minority job applicants), or some combination of

these.
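
The definition above can be made concrete with a small calculation. The sketch below is illustrative only: it treats a problem as the gap between an ideal rate and an actual rate, using the illiteracy and unemployment figures from the examples above. The ideal rates of 0% are an assumption made for the illustration, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    """A problem expressed as the gap between an ideal and an actual condition."""
    description: str
    ideal_rate: float   # assumed ideal, as a fraction (0.0 means "no one affected")
    actual_rate: float  # observed rate, as a fraction

    def gap(self) -> float:
        # The size of the discrepancy between ideal and actual conditions.
        return abs(self.actual_rate - self.ideal_rate)


problems = [
    Problem("illiteracy among those attending a particular high school", 0.0, 0.25),
    Problem("unemployment among minority youths in a particular city", 0.0, 0.50),
]

for p in problems:
    print(f"{p.description}: gap of {p.gap():.0%}")
```

A number like this only describes the size of the discrepancy. Whether the discrepancy is important still depends on the questions about consequences and the people affected that follow.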

To describe the issue or problem:

• Describe the ideal condition, including the positive factors present and the negative factors

absent. What would it look like if everything were as you'd want it to be?

• Describe the actual conditions that constitute the problem of interest, including the negative

conditions present and the positive conditions absent. What are conditions really like?

• Describe the actual problem in terms of what you're hoping to change. What positive

factors do you want to produce and/or what negative factors do you want to eliminate?

Describe the importance of the problem

To be sure that this is a problem you really should be addressing, consider its importance

to those affected and to the community.

• Is the discrepancy between ideal and actual conditions of the kind and size to be considered

important?

• What are the consequences (positive and negative) of the problem?

• Who experiences these consequences (e.g., program participants; their families, friends, and

peers; service providers, policymakers, and others)? How many people are affected?

• How often and for how long are they affected? What is the intensity of the effect?

• How much does the fact that the problem is experienced to this degree by these people

matter to them?

You might also ask whether the effects of the problem matter to society, but in fact, that

shouldn't make a difference. If they matter to the people who experience them, they're

important. Society doesn't always consider a problem important if it's only a problem for a

minority, or for a group that's generally ignored (the poor, the homeless).

In light of these factors, decide whether the problem is important to the evaluation.

Describe those who contribute to the problem

Whose behavior, by its presence or absence, contributes to the problem? Are they in the

program participants' personal environment (participants themselves, family, friends), service

environment (teachers, police), or broader environment (policymakers, media, general public)?

For each of them, consider the types of behavior that, by their presence or absence, contribute to

the discrepancy that constitutes the problem.

Assess the importance and feasibility of changing those behaviors

• How important is each of these behaviors to solving the problem?

• What are the chances that your effort can have any effect on each of them?

Describe the change objective

Based on the above analysis, choose behavior changes to target in specific people. Where you

can, specify the desired levels of change in targeted behaviors and outcomes (those changes in

conditions that should occur if the problem were to be solved).

For example, a behavior change goal might be an increase in pre-employment capacity -

self-presentation, job-seeking, interview skills, interpersonal competence, resume

writing, basic skills, etc. - for minority job seekers aged 18-24. Or you might instead or

in addition target policy makers, with the goal of having them offer tax incentives to

businesses that locate in or close to minority communities.

This is a way of defining your work. If you're planning the evaluation as you plan the program -

as you would in the ideal situation - then the questions you're asking the evaluation to examine

reflect the problems you're trying to solve, and this kind of analysis is important. If you're

starting an evaluation of a program that has been in place for some time, then you're going to

have to do some figuring after the fact about what consequences you think (hope) the program is

having, and what they will lead to. You may be talking about changes in specific participant

behaviors, about behaviors that act as indicators of other changes, or about results of another sort

(participants gaining employment, for instance, which may have a direct relationship to

participant behavior or may have more to do with local economic conditions).

Make sure that the expected changes would constitute a solution or substantial

contribution to the problem

If you conclude that they would not result in a substantial contribution, revise your choice of

problem and/or your selection of targeted people and actions as necessary. If you think that what

you're looking at in an evaluation doesn't address the problem, then you should be looking at

something else. If the objectives you've chosen do constitute all or a substantial part of a

solution, you've found your questions.

SETTING

Now that you've chosen your questions, there may be other factors to consider, such as the

settings in which the evaluation will be conducted. If your program is relatively small and/or has

only one site, this wouldn't be an issue. However, if you don't have the resources - whether

finances, time, or personnel - to evaluate the whole program, there are some situations in which

the choice of setting may be important:

• If your program is very large and/or has multiple sites

• If different sites provide different services, activities, or conditions, or use different

methods

Multiple sites

Multiple sites can present a challenge for an evaluation, because, although every effort may be made to make

the program at all sites exactly the same, it will seldom be so. If the program relies on human

interaction - teacher/learner, counselor/counselee, trainer/trainee, doctor/patient, etc. - there will

be differences from site to site depending on the people staffing each. (The exception is when the

same people staff all sites, providing the same services at each site at different times or on

different days.) Even if all are equally competent, no two staff members or teams will do things

in exactly the same way or relate to participants in exactly the same way, and the differences can

be reflected in differences in outcomes. If methods or other factors vary from site to site, that

will further complicate the situation.

Furthermore, the physical character of a site can influence not only program effectiveness, but

also the recruitment of participants and whether or not they remain in the program long enough

for it to have some effect (often called "retention"). The site's layout, comfort, apparent safety

and security, and - often most important - how easy it is to get to, all affect whether participants

enroll and stay in the program.

Where you do have the capacity to evaluate all sites, it will be helpful to build into the evaluation a

method of comparing them. This will allow you to identify and adopt at all sites methods,

conditions, or activities that seem to make one site particularly successful, and to identify and

change at all sites methods, conditions, or activities that seem to create barriers to success at

others.

If you can't evaluate each site separately, you'll have to decide which one(s) will give you the

information that will most help in adjusting and improving your program. If you're most

concerned with assessing your overall effectiveness, this may mean evaluating the site(s) closest

to the program norm, in terms of methods, conditions, activities, goals, participant/staff

interaction, etc. If, on the other hand, your chief consideration is learning whether a particular

new or unusual method or situation is working, you may find yourself evaluating the site(s) least

like the others.
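
One way to picture "closest to the program norm" and "least like the others" is to compare each site with the program-wide average on a single indicator. The sketch below is only an illustration under stated assumptions: the site names and the contact-hour figures are hypothetical, and a real comparison would weigh several factors (methods, conditions, activities, goals, participant/staff interaction) rather than one number.

```python
from statistics import mean

# Hypothetical figures, for illustration only:
# average weekly contact hours per participant at each site.
contact_hours = {"Site A": 6.0, "Site B": 4.0, "Site C": 5.0, "Site D": 9.0}

# The "program norm" is taken here to be the simple average across sites.
norm = mean(list(contact_hours.values()))


def distance_from_norm(site: str) -> float:
    """How far a site's figure lies from the program-wide average."""
    return abs(contact_hours[site] - norm)


# Candidate site if the aim is to assess overall effectiveness.
most_typical = min(contact_hours, key=distance_from_norm)

# Candidate site if the aim is to study a new or unusual method or situation.
least_typical = max(contact_hours, key=distance_from_norm)

print(f"Program norm: {norm} hours")
print(f"Closest to the norm: {most_typical}")
print(f"Least like the others: {least_typical}")
```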

If sites appear only minimally different, some other considerations that may come into play

are:

• The number and character of participants at the site. Participants at a particular site may be

experiencing the effects of the issue more severely, or may have a particularly important

characteristic, such as a language barrier.

• The ability and willingness of participants and staff to support the evaluation research. If

staff at a particular site are unable or unwilling to record observations, attendance, and

other key information, or if site participants are unable or unwilling to be interviewed or

monitored, evaluation at that site might be difficult.

• The stability of the population at the site. If participants at a site come and go at a rapid rate

- unless that's the program's intent - it can be difficult to gain information that contributes to

an accurate evaluation. An exception, of course, occurs if one point of the evaluation is to find

out why participants stay for so short a time, and to try to develop methods or create conditions

to assist them to remain in the program long enough to reach their goals.

Sites with different methods, conditions, activities, or services

Programs sometimes are organized so that different methods are used or different services

provided at different sites. In other cases, conditions may vary from site to site because of the sites'

geographical locations or the available space. The ideal situation is to evaluate all sites and

compare the effects of the different methods, conditions, or services. When that's not possible,

you'll have to decide what's most important to find out.

If the methods, services, or conditions at a particular site are new or innovative, you may want to

evaluate them, rather than those that have a track record. There may be a particular method or

service that you want to evaluate, in which case the decision about which site to choose is

obvious. The decision should be based on what makes the most sense for your program, and

what will give you the best information to improve its effectiveness.

When you have the capacity to choose more than one site to evaluate, it often makes sense to

choose two or three sites that are different - especially if each is representative of other sites in

the program or of program initiatives - so that you can compare their effectiveness. Even where

sites are essentially similar, you'll get more information by evaluating as many as you can.

PARTICIPANTS

Another factor to consider is the participants whose behavior, activity, or circumstances will be

evaluated. If your program is relatively small this might not be an issue - the participants will

simply be all those in the program. However, if you don't have the resources - whether finances,

time, or personnel - to evaluate the whole program, there are some situations in which the choice

of participants may be important:

• If your program includes different groups of participants (groups that are in different

stages of the program, or that are exposed to different methods or services).

• If groups of participants belong to populations with distinctly different cultures, stemming

from race, ethnicity, class, religion, or other factors.

Multiple groups

There are a number of reasons why there might be multiple groups of participants in a

program. You might start different groups at different times, either because the program has a

rolling start schedule (when there are enough people for a class/training group, one will begin),

or because the program is aimed at different groups (for example, 5-year-olds, 8-year-olds, and

14-year-olds). You might also be trying different strategies with different groups.

The Brookline Early Education Project (BEEP), a program aimed at school readiness for children

aged pre-birth through 5, recruited expectant families in three cohorts over the course of three

years. In addition, families in each cohort were assigned to one of three levels of service. Thus,

there were actually nine different groups among BEEP participants, even though, by the third year,

all were receiving services at the same time.

Once again, if there's no problem in evaluating the whole program, participants will simply include

everyone. If that's not possible, there are a number of potential choices:

Evaluate your work with only one group, with the expectation that work with the others will be

evaluated in the future. In this case, you'd probably want to choose the one for whom you

consider the program most crucial. They might be at greater risk (of heart attack, of school

failure, of homelessness, etc.) or might be experiencing the issue at a high level of intensity

(daily shooting incidents in the neighborhood, high rates of teen pregnancy, massive

unemployment).

Include a small number (2-4) of groups in your evaluation. You might want to choose groups

with contrasting characteristics (different ages, for example, or addressed by different

strategies). On the other hand, depending on the focus of your evaluation, you might want

groups that are essentially similar, to see whether your work is consistent in its effects.

Choose a few participants from each group to focus your evaluation on. While this won't give

you a complete picture, it should give you enough information to tell where your program is

accomplishing its goals and where it needs improvement. The differences in the ways

participants in different groups respond to the program (assuming there are differences) can also

give you ideas for ways to change what you're doing.
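
As a rough illustration of the third option, the sketch below draws a few participants from each group. It assumes a hypothetical roster of (participant ID, group label) pairs; the function name sample_per_group, the group labels, and the fixed random seed are illustrative assumptions, not a prescribed method. Random selection is only one way to choose; you might instead pick participants deliberately for particular characteristics.

```python
import random
from collections import defaultdict


def sample_per_group(participants, per_group, seed=0):
    """Pick up to `per_group` participants at random from each group."""
    rng = random.Random(seed)  # fixed seed so the selection can be reproduced
    by_group = defaultdict(list)
    for participant_id, group in participants:
        by_group[group].append(participant_id)

    chosen = {}
    for group, ids in sorted(by_group.items()):
        k = min(per_group, len(ids))  # small groups contribute everyone
        chosen[group] = sorted(rng.sample(ids, k))
    return chosen


# Hypothetical roster: (participant ID, group label)
roster = [
    ("P01", "age 5"), ("P02", "age 5"), ("P03", "age 5"),
    ("P04", "age 8"), ("P05", "age 8"),
    ("P06", "age 14"),
]

selection = sample_per_group(roster, per_group=2)
for group, ids in selection.items():
    print(group, ids)
```

Recording the seed and the roster used means someone else can check exactly who was selected and why.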

Participants from different populations and cultures

Cultural factors can have an enormous effect on participants' responses to a program. They can

govern conceptions of social roles, family responsibilities, acceptable and unacceptable behavior,

attitudes toward authority (and who constitutes authority), allowable topics of conversation,

morality, the role of religion - the list goes on and on. In planning a program that involves

members of different populations and cultures, you essentially have three choices:

• Plan your program and implement it in the same way for everyone. If the program involves

groups - classes, support groups, etc. - participants' membership is determined not by

population group but by when they sign up, what time of day they can attend, what they

sign up for, or whatever other criteria make sense logistically.

• Plan your program to be as culturally sensitive as possible, and try to screen out anything

that might be offensive to or difficult for any group. In this instance, you might be prepared

to respond if participants from a particular population requested a group of their own.

• Divide participants by cultural group and plan different culturally sensitive approaches for

each. Your overall approach might be the same for everyone, but the way you apply it

might differ by culture.

In any of these instances, it would probably be important to understand how well your approach

is working with members of the various populations. If you can evaluate the whole program,

make sure that you include enough members of each group so that you can compare results (and

their opinions of the program) among them. If your evaluation possibilities are limited, then your

choices are similar to those for multiple groups of other kinds, and will depend on what exactly is

most useful for you.

There are interactions between the choice of sites and the choice of participants here. You may

be concerned about the effects of your program on a particular population, which may be largely

concentrated at one site. In that case, if you have limited resources, you may want to evaluate

only that site, or that site and one other.

Regardless of other considerations, you may want to set some guidelines about whom you

include in the evaluation. How long do people have to be in the program, for instance, before

they're included? In other words, what constitutes participation? (This also sets a criterion for

who should be counted as a drop-out: anyone who starts, but leaves before meeting the standard

for participation.) What about those whose attendance is spotty - a few days here, a few days

there, sometimes with weeks in between? Do they have to have attended a certain number of

hours to be considered participants?

These issues can be more complex than they seem. People may start and drop out of a program

numerous times, and then finally come back and complete it. Many others start programs

numerous times, and never complete them. It's usually impossible to tell the difference until

someone actually gets to the point of completion, whatever that means for the particular

program.

In a reversal of the start-many-times-before-completing scenario, there can be a few people who

stay in a program right up till the end and then drop out. This may have to do with the fear of

having to cope with success and a change in self-image, or it may simply be a pattern the person

has learned to follow, and will have to unlearn before being able to complete the program.

Should any or all of these people be included in or excluded from an evaluation, either before

(because of their history in the program) or after the fact? That's a decision you'll have to make,

based on what their inclusion or exclusion will tell you. Just be sure that your evaluation clearly

describes the criteria that you decide to use for your participants.
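
One way to make such criteria explicit is to write them down as a simple rule that is applied the same way to every attendance record. The sketch below is only an illustration under stated assumptions: the 20-hour threshold, the Attendance record, and the category labels are hypothetical, and your program would substitute whatever standard for participation it decides on.

```python
from dataclasses import dataclass

# Assumed criterion: 20 attended hours counts as participation.
MIN_HOURS = 20.0


@dataclass
class Attendance:
    participant_id: str
    hours_attended: float
    still_enrolled: bool


def classify(record: Attendance, min_hours: float = MIN_HOURS) -> str:
    """Apply one stated participation criterion to an attendance record."""
    if record.hours_attended >= min_hours:
        return "participant"
    if record.still_enrolled:
        # Spotty attendance so far, but the person has not left the program.
        return "below threshold, still enrolled"
    # Started, but left before meeting the standard for participation.
    return "drop-out"


# Hypothetical attendance records
records = [
    Attendance("A", 35.0, True),
    Attendance("B", 6.0, False),
    Attendance("C", 12.5, True),
]

status = {r.participant_id: classify(r) for r in records}
for participant_id, label in status.items():
    print(participant_id, label)
```

Whatever rule you choose, stating it this plainly makes it easy to report in the evaluation and to apply consistently after the fact.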

IF YOU'RE AN OUTSIDE EVALUATOR OR ACADEMIC OR OTHER INDEPENDENT

RESEARCHER

Up to this point, we've largely ignored the evaluation difficulties faced by evaluators not directly

connected with the organization or institution running the program they're evaluating. If you've

been hired or designated by the organization or a funder to evaluate the program, you have to

establish trust, both with the organization and its staff and with participants, if you hope to get

accurate information to work with. You also have to learn enough in a short period about the

community, the organization, the program, and the participants to devise a good evaluation plan,

and to analyze the data you and others gather.

If you're an independent researcher - a graduate student, an academic, or a journalist - you face

even greater obstacles. First, you have to find a place to conduct your research - a program to

evaluate - that fits in with your research interests. Then, you have to convince the organization

running that program to allow you to do the research. Once you've jumped that hurdle, you're

still faced with all the same tasks as an outside evaluator: establishing trust, understanding the

context, etc.

Let's look first at the process you as an independent researcher might follow in order to choose and

gain access to a setting appropriate to your interests. Once you've gained that access, you've

become an outside evaluator, so from that point on, the course of preparing for the evaluation will

be the same for both.

CHOOSE A SETTING

If you're an academic or student, you can probably find an appropriate program by asking

colleagues, professors, and other researchers at your institution. If none of them knows of one

offhand, someone can almost certainly put you in touch with human service agencies and

others who will. Other possible sources of information include the Internet, funders, professional

associations, health and human service coalitions, and community organizations. Public funding

information is often available on the web, in libraries, or in newspaper archives. The wider you

spread your net, the more likely you are to find the program you're looking for.

The right program will obviously vary depending on your research interests, but some

questions that will inform your choice include:

• Does the setting include people who are actually experiencing the problem that is of

interest to you?

• Is the setting similar to others of this type? (If not, its program might not be useful to

others dealing with the issue, even if it works well in its own context.)

• Does the setting provide support for the research? Will staff, participants, and others help

with data gathering, be forthcoming about context questions, and cooperate with you?

• Does the setting have the resources to maintain the program after your evaluation is done?

• Does the setting permit the changes in operation required by the research? If the planning

of the evaluation and choosing of questions point to doing things differently, can and will

the program make the necessary changes?

• Is the setting accessible? Accessibility includes not only handicap accessibility, but whether a site is in a

neighborhood that feels welcoming or safe to participants, whether it is easily reachable by

public transportation or on foot from the areas from which participants are drawn, and

whether it is in a building or institution that doesn't feel intimidating or strange (a

university campus or building can seem as threatening as a fortress to someone who is

insecure about his educational background, for example). Accessibility can be the

determining factor in whether participants consider a program, or whether they stay in it.

• Is the setting stable? Are the program and organization stable enough that you know they'll

be able to support their work at the current level, at least until the evaluation is completed?

Once you've found an appropriate setting, you'll have to convince the organization to collaborate

with you on an evaluation. The next three steps are directed toward that goal.

LEARN AS MUCH AS YOU CAN ABOUT THE ORGANIZATION YOU'VE CHOSEN

Just as you wouldn't go to a job interview without doing some research about the employer, you

shouldn't try to gain the cooperation of an organization without knowing something about it - its

mission, its goals, whom it serves, who the director and board members are, etc. If someone told

you about the organization, she may have, or may know someone who has, much of the

information you need. If the organization maintains a website, much of that information will be

available there. If it's incorporated, the office of the Secretary of State in the state of incorporation and/or

other state offices will have information about the officers (i.e., the Board of Directors) and other

aspects of the organization. Funding agencies may also have information that's a matter of public

record, including proposals.

CONTACT THE APPROPRIATE PERSON(S) AND REQUEST AN INTERVIEW

Find out whom (by name as well as position) you should talk to about conducting a

research project in the organization you've chosen.

Depending on the organization, this could be the board president, the executive director, or the

program director (if the program you're interested in is only part of a larger organization). In any

case, it might be wise to involve the program director even if he's not the final decision-maker,

since his cooperation will be crucial for the completion of your research.

• If you can, get a personal introduction. It's always best if you come recommended by

someone familiar with the person you need to speak with.

• If you can't get a personal introduction, it's usually best to send a letter requesting a meeting

and explaining why, and follow it up with a phone call.

• Before the meeting, send a proposal outlining what you want to do. This should be

substantive enough to help the organization decide whether it wants to work with you, but

not so specific that it doesn't allow for collaborative planning of the evaluation.

PLAN AND PREPARE FOR THE INITIAL MEETING

There are several purposes for this meeting, besides the ultimate one of getting permission

and support for your project (or at least an agreement to continue to discuss the

possibility). They include:

• Establishing your credentials - the experience, educational background, and any other

factors that equip you to conduct this evaluation. This might include references from

colleagues, professors, or other organizations you've worked with.

• Learning more about the program and the organization

• Explaining what you want to do and why, what form the evaluation results are likely to

take, what you'll do with them, who'll have access, etc. This explanation should also cover

issues of confidentiality and permission of participants.

• Explaining what you need from the organization and/or program - participation of

participants and staff, for instance, any logistical support, access to records, or access to

program activities

• Explaining what you're offering in return - your services for a comprehensive formal

evaluation, any stipends, equipment or materials, other support services, or whatever else

you may have to offer

• Clarifying the organization's needs, and discussing how they fit with your own - and how

both can be satisfied

Assuming that your presentation has been convincing, and you're now the program evaluator, the

rest of the steps here apply to both independent researchers and outside evaluators.

FIND OUT ALL YOU CAN ABOUT THE CONTEXT

This may play out differently for outside evaluators than it does for independent researchers, but

it's equally important for both. It means finding out all you can about the community, the

organization, the program, and the participants beforehand - the social structure of the

community and where participants fit in it, the history of the issue in question, how the

organization is viewed, relationships among groups and individuals, community politics, etc.

If you're an outside evaluator, you can pick the brains of program administrators, staff, and

participants about the community, the organization, and the issue. Ask them to steer you to

others - community leaders, officials, longtime residents, clergy, and trusted members of

particular groups - who can give you their perspectives as well. If possible, get to know the

community physically: walk and/or drive around it, visit businesses, parks, restaurants, the

library. Understanding how the issue plays out in the community, the nature of relationships

among groups and individuals, and what life is like in the neighborhoods where participants live

will help a great deal in interpreting the results of the evaluation.

If you're an independent researcher, learn as much about the context as you can before you

contact the program. Websites (for the organization and/or the community) and libraries are two

possible sources of information, as are community and organization literature and people who

know the community. Learning about the community, the organization, and the participants

beforehand will both help you determine whether this program fits with your research and help

you advocate for its cooperation with your project. Once you have that cooperation, you can

follow the same path as an outside evaluator (since that's what you are) to learn as much about the

context of the program as you can.

ESTABLISH TRUST WITH PROGRAM ADMINISTRATORS, STAFF, AND

PARTICIPANTS

This can be the most difficult part of an evaluation for someone from outside the organization.

There's no magic bullet or predictable timeline, but there are several things you can do:

• Be yourself. Don't feel you have to act a certain way: deal with people in the program as

you do with friends and acquaintances in other circumstances. People can tell when you're

being false, and are unlikely to trust you if you are.

• Treat everyone with equal respect, as colleagues in a research project.

• Don't assume you know more than anyone else just because you're the professional.

• Share freely what you do know, but don't tie yourself to any one process or method,

especially when a key individual takes an opposing stance.

• Ask administrators, staff, and participants what they want from the evaluation, and discuss

how the evaluation could provide it.

• Don't be afraid to say "I don't know, but I'll find out," and then do so.

• Follow through on whatever you say you'll do. Don't promise anything you can't deliver

on, and make deadlines reasonable, so you can meet them.

GENERAL TIPS FOR ALL EVALUATORS

These steps apply to everyone, internal evaluators as well as external.

Aim for a participatory evaluation

We've discussed above the involvement of all stakeholders to the extent possible. Involving

participants, program staff, and other stakeholders in participatory planning and research can

often get you the most accurate data, and may give you entry to people and places you normally

might not have. On the other hand, participatory planning and research, as we've explained, take

time and energy. If you have limited time, you may not be able to set up a fully participatory

project. You can, however, still consult with stakeholders, and involve them in ways that don't

necessarily involve training or large amounts of your time. They can help you line up interviews

with participants or other important informants, for instance, and/or act as informants themselves

about community conditions and relationships.

At least the people in charge of the program, and probably those implementing it as well, will

expect to be part of the planning of the evaluation. They are, after all, the ones who need to know

whether their work is effective, and how to improve it. Involving participants as well, in roles

ranging from informants about context to actual researchers, is likely to enrich the quantity and

quality of the information you can obtain.

Plan the evaluation in collaboration with stakeholders

That collaboration should be at the highest level of participation possible, given the nature of the

program, the time available, and the capacity of those involved (if program participants are

five-year-olds, they probably have relatively little to contribute to evaluation planning, but their

parents might want to be involved).

The actual planning involves ten different areas, each of which will be the subject of one of

the remaining sections in this chapter:

• Information gathering and synthesis

• Designing an observational system

• Developing and testing a prototype intervention

• Selecting an appropriate experimental design

• Collecting and analyzing data

• Gathering and interpreting ethnographic information

• Collecting and using archival data

• Encouraging participation throughout the research

• Refining the intervention based on the evaluation

• Preparing the evaluation results for dissemination

Once the planning is done, it's time to get started on conducting the evaluation. And when you're

finished - having analyzed the information and planned and made the changes that were needed -

it's time to start the process again, so that you can determine whether those changes had the

effects you intended. Evaluation, like so much of community work, is a process that goes on as

long as the work itself does. It's absolutely essential to the continued improvement of your

program.

IN SUMMARY

Choosing evaluation questions - the areas in your work you'll examine as part of your evaluation

of your program - is key to defining exactly what it is you're trying to accomplish. For that

reason, those questions should be chosen carefully as part of the planning process for the

program itself, so that the questions can guide your work as well as your evaluation of it. The

more stakeholders can be involved in that choice and planning, the more likely you are to

create a program that successfully meets its goals in serving the community.

Choosing those questions well entails understanding the context of the program - the community,

participants, the culture of any groups involved, the history of the issue and of the social

structure of the community and the organization - and (if you're an outside evaluator without ties

to the program) establishing trust with administrators, staff members, and participants. That trust

will enable you to conduct a participatory evaluation that draws on the knowledge and talents of

all stakeholders, and to plan an evaluation that fits the goals of the program and accurately

analyzes its strengths and weaknesses. With that analysis in hand, you'll be able to make changes

to improve the program. Then you're ready to start the whole process again, so you can evaluate

the effects of the changes you've made.

Chapter 2

INFORMATION GATHERING AND SYNTHESIS

Suppose you wanted to design a house that used very little energy, took few resources to build

and maintain, and was affordable for most families. You might have some original ideas about

how this could be done, but you’d want to find out what ideas others had as well. You’d

probably read about earth-bermed houses (houses that are built into a hillside or earth mound),

solar panels or windmills for producing electricity, efficient insulating windows, waste-water

recycling, and non-toxic building materials that reuse waste wood and metal. You’d talk to

people who built or owned energy-efficient houses, to hear about the realities of living green.

You’d learn about the barriers to some environmentally-friendly strategies, as well as ways to get

around those barriers. There’s a huge amount of information out there, and it would make sense

to gather as much of it as possible, so that you could put together the information, incorporate

appropriate elements into your design, and get new ideas based on what’s already been done.

The same is true if you’re designing an intervention or program to deal with a community health

or other issue, or an evaluation of that program. Others have also undoubtedly tried to address

that issue, some with success and some without. Knowing what they did, how they did it, and

what the results were can help you decide how to design your effort. You might be able to find a

method here and a technique there that fit together into exactly the program that will suit

the people and conditions in your community. Or you might realize that something you’d

intended to do simply hasn’t worked in a number of other instances, and so wouldn’t be likely to

work for you, either.

Gathering and using others’ ideas doesn’t mean that you can’t use your own or come up with

something new. New ideas tend to come out of what others have attempted. Most artists start out

imitating others before they develop their own styles. Einstein didn’t just chance on relativity; he

built on ideas and problems that other physicists had already worked on. You can usually innovate more effectively

if you know what’s been tried.

This section looks at gathering all the information you can about your community issue and

about attempts to address it, and putting that information together to design an evaluation to

address your questions. Although this chapter is about evaluation, much of the material in these

sections applies to planning both the intervention (or program) and the evaluation: the two really can’t

be separated.

An evaluation is a research project: we are trying to discover what works and under what

conditions. The steps for designing and using an evaluation – the subject of this chapter – are

essentially the same as those for designing the program you’re evaluating. The elements that you

borrow from others’ successful efforts, and those that you create yourself, will give you an

intervention and related evaluation questions. Although this section talks about program design, it

also applies to the design of the evaluation.

WHAT DO WE MEAN BY INFORMATION GATHERING AND SYNTHESIS?

Information gathering refers to gathering information about the issue you’re facing and the ways

other organizations and communities have addressed it. The more information you have about the

issue itself and the ways it has been approached, the more likely you are to be able to devise an

effective program or intervention of your own.

There are obviously many sources of information, and they vary depending on what you’re looking

for. In general, you can consult existing sources or look at “natural examples,” examples of actual

programs and interventions that have addressed the issue. We’ll touch on where to find both here,

and then go into more detail about them later in the section.

• Existing sources. This term refers to published material of various kinds that might shed

light either on the issue or on attempts to deal with it. These can be conveniently divided

into scholarly publications, aimed primarily at researchers and the academic community;

mass-market sources, written in a popular style and aimed at the general public; and

statistical and demographic information published by various research organizations and

government agencies.

• Natural examples. These are programs or interventions developed and tried in

communities that have addressed your issue. Studying them can tell you what worked for

them and what didn’t, and why. By giving you insight into how issues play out in your or

other communities, they can provide nuts-and-bolts ideas about how to (or how not to)

conduct a successful program or intervention. For the most part, information sources here

are the people who are involved in efforts to address issues similar to yours, or those who

can steer you to them. Additionally, there are a number of natural examples (such as single

case studies) that have been described in the literature of community

psychology or public health that may be relevant to your work.

Synthesis is from the Greek; it means putting together. Its English meaning is the same: the

putting together of something out of two or more different sources. Synthetic fabrics, for

instance, are called that because they’re constructed from a number of different chemical

building blocks.

In this section, we’re talking about ideas. Synthesis here refers to analyzing what you’ve learned

from your information gathering, and constructing a coherent program or approach by taking

ideas from a number of sources and putting them together to create something new that meets the

needs of the community and population you’re working with.

Synthesizing in this way requires identifying the functional elements of each idea or program

that you’ve looked at that seems to hold lessons for your work. Functional elements are the core

components of each program – the methods, framework, activities, techniques, and other aspects

– that make up the specific program you’re examining. Once you’ve separated these parts out,

you can put those that meet your needs together with what you’ve learned about the issue and

your own ideas to build a program that speaks specifically to your situation.

As we’ve mentioned, the activities of information gathering and synthesis are needed

both to create the original program and to develop an evaluation of it that will help you

maintain and improve it. The two really start in the same place, with what you think

will address the issue – what shape the program or intervention should take, with whom

it should be applied, and what behaviors or conditions it aims to change. This also

informs what its short- and long-term goals should be, and by what means you’ll try to

achieve those goals. Once these are determined, they in turn determine your evaluation

questions. You can’t construct an evaluation without knowing exactly what you’re

trying to evaluate.

WHY GATHER AND SYNTHESIZE INFORMATION?

If you’re in the process of starting a program to address a community issue, such as violence or

early childhood education, you probably know quite a bit about that issue already. You’ve dealt

with it, perhaps, in a variety of ways, and you have some pretty good ideas about what kind of

program would work. Why take the time and trouble, for you and for others engaged in a

participatory planning effort, to read a lot of material written by others and to track down people

who’ve run programs? If you’re inclined to think this way, there are a lot of good reasons why

you should think again. Gathering information beforehand and putting together what you’ve

learned could be the most important things you do to make your program effective. Here’s why:

• It will help you avoid reinventing the wheel. A lot of different organizations have likely

approached this issue before you. Some might have been successful and some might not

have, but all of them have probably learned something that would be useful to you in the

process. You don’t have to make the same mistakes someone else did if you know about

them, and you don’t have to make up something from scratch that may or may not work,

when you have a model that has worked.

It’s certainly not a bad thing if you have some of the same good ideas that others have had, but it

helps to know that they are good ideas. And there’s a chance that you might have some of the same

bad ideas others have had, in which case it helps even more to know that they’re bad ideas. It will

save you a huge amount of trouble, and perhaps be the difference between creating a program that

does its job well and one that fails miserably and disappears. Square wheels don’t roll – someone

could have told you that.

• It will help you to gain a deep understanding of the issue so that you can address it

properly. The first step in figuring out how to deal with an issue is to know what you’re

dealing with. The better you understand it – its causes, how it occurs, how people react

when they’re affected by it, what its consequences are for individuals and the community,

and who can influence it – the more likely it is that you’ll be able to determine how to

approach it.

• You need all the tools possible to create the best program you can. Foremost among the

tools you need to plan and implement a program or intervention are information,

information, and information. Just as with the issue itself, the more you know about what

works for whom, how to make things happen, and how to establish or eliminate certain

conditions, the more likely that you’ll be able to plan a successful program that addresses

all aspects of the issue and leaves nothing to chance. Various kinds of professional and

interpersonal skills may help you implement a program, but if what you’re implementing

isn’t effective, it doesn’t matter how skillfully you carry it out.

• It’s likely that most solutions aren’t one size fits all. The more information you gather,

the greater the variety of approaches, methods, and frameworks you’ll have to choose from.

Putting together the right combination will help you to successfully address the particular

needs of your community and population.

• It can help you to be culturally sensitive. Not only can you learn more about the

culture(s) of the people you’re working with, but you can probably find a number of

approaches that have worked with the cultural group you hope will benefit. Perhaps even

more important, you can learn to avoid costly mistakes that may take a lot of time and

effort – or be impossible – to repair.

• Knowing what’s been done in a variety of other circumstances and understanding the

issue from a number of different viewpoints may give you new insights and new ideas

for your program. As we discussed at the beginning of this section, new ideas seldom

spring from nowhere. They’re stimulated by your own experience and the ideas and

experience – good and bad, positive and negative – of others. Look to the experience of

other fields, communities, and countries. The more different ideas you’re exposed to, and

the more ways you can put them together, the greater chance there is that you’ll come up

with something new that’s more effective than what’s gone before.

WHEN SHOULD YOU GATHER AND SYNTHESIZE INFORMATION?

Information gathering and synthesis is crucial to the success of the program and to the relevance

and effectiveness of the evaluation. It should start at the beginning of any effort, and contribute

to the initial planning. It should also go on throughout the life of the program, so that you can

continue to adjust by adding or changing program elements to enhance outcomes, and to

generate new ideas.

Major adjustments should generally come at the end of an evaluation cycle, when you have solid

information about what worked and what didn’t. That doesn’t mean that you can’t make smaller

adjustments in the course of the program to improve results along the way.

There’s a tension here between continually changing a program to make it better and obtaining

accurate evaluation results. If you change a method or activity in midstream, your evaluation will

not be able to give you a clear assessment of its effectiveness.

How much changing you do in the course of a program depends on your intent. If your first

responsibility is to find out what works best, so you can pass it on, then it’s important not to make

changes until an evaluation has been completed. If your primary responsibility is to the current

participants in the program, then you should make whatever changes are necessary whenever

they’re necessary to ensure the best outcome for them.

There can be ethical issues involved here. In medical experiments with new therapies or drugs,

for example, some participants are given the new treatment and others aren’t (all participants

consent to this arrangement, and to not knowing which group they’ll be assigned to). If the new

treatment proves to be harmful, there is an ethical obligation for the researchers to stop

administering it. If, on the other hand, it quickly proves remarkably effective, researchers usually

feel ethically bound to extend it to others in the study as soon as they can prove its positive

effects. Not all programs necessarily pose ethical problems that are as clear-cut as those

encountered in medical studies, but ethical issues should always be considered.

WHO SHOULD GATHER AND SYNTHESIZE INFORMATION?

The assumption throughout this chapter is that the whole process – planning, design,

implementation, and evaluation – involves multiple stakeholders.

Typical stakeholders in a community program or intervention might include:

• Program participants or beneficiaries

• Program staff and administrators

• Others affected by the program – police, medical staff, teachers, etc.

• Academics or other researchers

• Local officials

• Community activists

INFORMATION GATHERING

In a participatory process, information gathering can be enhanced by a division of labor

determined by the skills and experience of the participants. If there are academics or other

professional researchers involved, it would probably make the most sense for them – or others

with research experience – to review the evaluation literature. Members of the affected

population might be the best ones to collect information about the history of the issue in the

community, and about how it currently affects people. Program directors and staff would

probably have the best contacts in the field, and thus the best chance to find information about

other similar programs. Those with Internet access and computer experience might be the logical

on-line searchers, or might act as technical support for others to help them find what they’re

looking for. Those with knowledge in the law and legislation might be the ones to examine

policies.

There’s also the possibility that training could be provided to the whole group, or to various

individuals to allow them to pursue various lines of inquiry. There’s no reason, for instance, that

people without research experience couldn’t learn to understand and interpret demographic

information or contact programs in other places. (There are some limitations here: levels of

related education, access to materials or computers, and the ability to connect with other people

might all figure into what kind of research it makes sense to ask others to do.)

SYNTHESIS

It is especially important that all participants in the process be involved in putting together the

information. Training new participants to synthesize information will pay dividends in the end,

because they may be able to see things in the information that aren’t obvious to experienced

researchers. They may know things about the community that shed light on which elements of

other programs might be appropriate and which might not.

In any case, information gathering and synthesis, like any other part of the process, should reflect

the needs, interests, and abilities of all stakeholders.

HOW DO YOU GATHER AND SYNTHESIZE INFORMATION?

There are a number of steps to gathering and putting together the information you need. Most of

these can be group activities, part of the participatory process. The actual information gathering

can be parceled out to specific individuals or sub-groups.

DECIDE WHAT YOU NEED TO KNOW

Not surprisingly, the first step in gathering information is determining what information to

gather. There are a number of areas to explore:

• Details about the issue. These might include its immediate and root causes; its general

effects on individuals and communities; its consequences; its development through

different stages; its history; and the history of attempts to address it.

• How the issue has been dealt with elsewhere. Best practices or approaches for which there

is an evidence base; other approaches that have been at least partially effective; and what

hasn’t worked, which may give you at least as much important information as what has.

• People who can help. This category encompasses experts in the field and people or

organizations that have run or been involved in successful attempts to address the issue.

• Who is affected locally, and how? This really comprises two questions: a) what population

groups – geographical, ethnic, cultural, racial, class, etc. – are particularly affected by this

issue? And b) what other groups are affected, but less visibly? These might include those

who work with the first group(s) in the community (teachers, for example, or social

workers), those who depend on them, and those on whom they depend.

• The importance of the issue to the community. Again, this implies a double question:

o How important does the community perceive the issue to be? and

o How much and in what ways does the issue actually affect the community as a

whole?

• Community needs related to the issue. What has to be added to or removed from the

community in order to improve the situation? What kinds of approaches will the

community respond to or reject?

• Other context information. Community history, relationships among groups and

individuals that might be relevant to your work, community culture, etc.

• Who, if anyone, has some influence or control over changing the situation? Public officials

and other policymakers are often in this position. Business leaders, landlords, government

enforcement agencies, schools, employers, hospitals and health personnel, and members of

the affected group itself might also be in the position to change the situation (by learning

new skills or changing practices).

DETERMINE YOUR LIKELY INFORMATION SOURCES

As mentioned above, these encompass existing (i.e., published) sources and natural (i.e.,

experiential) examples. Published sources can be divided into scholarly, mass-market, and

statistical, each of which can provide different information and a different perspective on the

issue and attempts to address it. Depending on what you decide you’re looking for, you might

use all, or any combination, of these sources.

The single largest storehouse of information available is the Internet. Many scholarly articles are

published online and accessible – often free, sometimes for a fee – to anyone who’s interested.

Virtually all U.S. laws and regulations at every level of government are easily found, most on

several websites. General knowledge on just about anything is widely available, as are lists of

best practices and successful organizations and the websites of those organizations. Census data

and other similar statistical information are also on view. Add to these the information provided

by such all-encompassing sources as Wikipedia (recently, for all its quirks, found to be just about

as accurate across its million-plus entries as the Encyclopedia Britannica), and you have a nearly-

bottomless well of fact and opinion to draw from.

As always, you have to be cautious: most of cyberspace is unedited, and the quality of

information varies. If you stick to reasonably reliable sites, you’re likely to find almost whatever

you need, or at least directions to it.

EXISTING SOURCES

Scholarly sources might include:

• Academic and some professional journals

• Books written for the academic market

• Doctoral dissertations - these are accessible to researchers through university libraries and

some Internet sources

• Papers and reports delivered at academic and professional conferences - these are often

available online, either on the authors’ websites or in e-published conference proceedings

• Occasional articles in respected mass-market scientific magazines, such as Nature or

Scientific American

• Newspaper archives

• Direct contact with academics and other researchers who’ve done work on the issue you’re

interested in, or who have conducted studies of attempts to deal with it

• Internet listservs and newsgroups relating to the issue or the field in question

Mass-market sources of information:

• Widely available books, often marketed as “self-help” or “life-changing,” to the public at

large

• Articles in popular magazines, both those devoted to science or behavior and those of

general interest

• Newspaper stories, often in Sunday magazine sections

Where to find statistical and demographic information:

• Census data - available on the web and at many libraries

• Community reports, such as community report cards, self-studies, and needs assessments,

all of which should be obtainable through the appropriate municipal offices, and sometimes

on the web as well

• Organizational and agency data, usually a matter of public record if the agency is public or

publicly funded

In addition to these sources, the broadcast media often present stories about critical issues or

about successful efforts to address them. In most cases, such stories only skim the surface, since

they have to fit into short time slots (public broadcasting, on both radio and TV, breaks this mold

more than other media outlets). They can, however, serve as introductions to further research,

raising the importance of one or more aspects of an issue, or providing information about

effective programs that you can then contact.

Natural examples

Some of the more likely sources of natural examples:

• Program directors

• Friends or colleagues in the field

• Funders (particularly public agencies, because their transactions, including whom they

fund and why, are a matter of public record)

• Leaders and members of community coalitions or partnerships

• Officials who coordinate community-wide efforts

• Members of the population most directly affected by the issue at hand

• Current or former participants in or beneficiaries of effective programs

• People who work in collaboration with programs – police, medical staff, teachers, etc.

• Key informants in the community

• Experts – some of them the same academics and other researchers referenced under

scholarly sources – who have experience with your issue and efforts to address it

• Your own experience in the community

Don’t be afraid to range far and wide in your search for successful models or new ideas. Step
outside your own field and your own region, and see what’s been done elsewhere. A model from
social work or urban design might work in public health, or vice-versa. There’s enough overlap
among fields that deal with human health and development that you can often find exactly what
you need in seemingly odd places.

DEVISE A PLAN FOR COLLECTING INFORMATION

There are a number of considerations here:

• Who will gather what information? As we’ve discussed, the ideal group is multi-sectoral

and diverse in backgrounds and skills. Information gathering should be assigned according

to participants’ skills, interests, and contacts in the community. We’ve suggested, for

instance, that scholarly sources might be mined by academics or other experienced

researchers, while members of the affected population might be more successful in

approaching key informants in the community. This doesn’t mean, however, that in a given

group, these and other apparently logical roles couldn’t or shouldn’t be varied, depending

on the individuals involved.

• How will the information be gathered? Another issue is just how the information will be

gathered. Finding and reading written material is relatively straightforward: it’s in the

library or on the web, and you can read and take notes on the relevant parts of it. Getting

information directly from other people, however, can be more complicated. Will you

engage in formal or informal interviews? In observation? Will you conduct surveys or

public meetings? How will you contact people you don’t know – by letter, by phone,

through mutual acquaintances? Your information-gathering methods will be determined by

how much time you have, exactly what information you need, the depth of the information

you need, and the abilities of the participants.

• What adjustments will be made for particular gaps in experience or skills? People who

don’t read, write, and/or speak the language proficiently may have to devise imaginative

ways of recording information. Experienced researchers may have to translate scholarly

writing for just about everyone in the group who isn’t an academic. In many cases, most or

all of the group may need orientation or training before information gathering can begin.

You’ll need to work out what the needs are as a group, and devise ways to meet them.

• What’s the timeline for information gathering? While information gathering should

continue throughout the life of a project, the initial phase should have a time limit, so that

action isn’t delayed for too long. The time limit depends on your time constraints, the

seriousness and intensity of the issue, the community’s perception of urgency, and whether

there are external time restrictions (student interns who are only available until the end of

the summer, for instance). Having a clear deadline will focus the group’s activities, and

boost its efficiency.
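
The considerations above (who gathers what, by which method, and by when) can be written down as a simple shared plan. The Python sketch below is only an illustration: the topics, people, methods, and dates are hypothetical, and a paper chart or spreadsheet would serve the same purpose.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class GatheringTask:
    """One line of an information-gathering plan (all values are illustrative)."""
    topic: str        # what information is needed
    assignee: str     # who will gather it
    method: str       # e.g. "literature review", "interviews", "survey"
    deadline: date    # time limit for the initial gathering phase
    done: bool = False


def overdue(tasks, today):
    """Return the unfinished tasks whose deadline has already passed."""
    return [t for t in tasks if not t.done and t.deadline < today]


# Hypothetical plan for the initial phase
plan = [
    GatheringTask("Evaluation literature", "Researcher",
                  "literature review", date(2024, 6, 30)),
    GatheringTask("Local history of the issue", "Community member",
                  "interviews", date(2024, 7, 15)),
    GatheringTask("Similar programs elsewhere", "Program director",
                  "phone contacts", date(2024, 7, 15), done=True),
]

# Review progress at a regular group meeting
for task in overdue(plan, today=date(2024, 7, 1)):
    print(f"Overdue: {task.topic} ({task.assignee})")
```

Keeping the deadline on each task, rather than only on the plan as a whole, makes it easier to see at a group meeting which lines of inquiry are holding up the initial phase.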

COLLECT INFORMATION

When your plan is completed, it’s time to put it into practice. You’ll have to conduct any training

that is necessary, and make sure that all the relevant tasks are assigned appropriately. You may

also want to set up regular meetings throughout the information-gathering process, in order to

give the group the chance to review progress, make suggestions, and report on what they’ve been

finding. In addition to providing support for those new to research, these meetings, by providing

a preview of the results of the process, will save everyone having to digest an overwhelming

amount of information all at once.

SYNTHESIZE: TAKE IT ALL APART

The process of synthesis involves breaking the information down into its component parts,

sifting through those parts to see which fit together best for your situation, and then integrating

them into an approach that is likely to work in your community.

There are usually three major areas to be considered:

• What’s known about the issue itself? What personal and environmental factors contribute

to the problem? What are its root causes? Do you have the resources to address them, or

are they beyond your scope (e.g., global economic forces or climate change)? Does the

issue have a number of different effects, and if so, what are they? What are the likely

consequences for the community as a whole if the issue is not resolved? (An

environmental health risk can not only kill or sicken individuals, but might also affect

business productivity, insurance availability and rates, hospital costs, the housing market,

or even – as in the case of the Love Canal neighborhood in Niagara Falls, NY – the

existence of a neighborhood or community itself.)

• The community context of the issue. What are the specific local effects of the issue?

Exactly who is affected? Exactly how are they affected? What are the consequences for

those individuals? For their families, friends, neighbors, and others they have dealings

with? For the community as a whole? What has been the community’s experience with

this issue in the past? How, if at all, has it been addressed? What local conditions would

change if the issue were addressed, and how would they change? Are there underlying

conditions that have to change before the issue can be addressed? Whose attitudes and/or

behaviors need to change to have an effect on the issue (for example, among policy

makers, those affected or specific officials)?

• Successful and unsuccessful attempts to address the issue. These may have been gleaned

both from the literature on best practices, and directly or at second hand from those

involved in them. Here, it’s important to separate out the elements of various approaches.

What specific procedures – methods and intervention components – were used? What

kinds of training – feedback, role play, modeling, etc. – were provided to participants?

Was information provided to participants about when, why, and how to act? Were there

positive or negative consequences that helped to establish or maintain change (or its

opposite)? Were environmental barriers, policies, or regulations put in place or

removed? What was the overall philosophy behind the approach? What aspects of the

issue did it address? What kind(s) of community was it tried in? What population groups

(in terms of culture, age, social class, etc.) were involved? Who was the approach to

benefit? What were the specific results in the short term? In the long term? What makes

a particular program, policy, or practice successful or unsuccessful? What events, if any,

were critical to success (or failure)? What conditions – organizational features,

participant characteristics, and broader environmental factors – were critical? Is there a

model successful program? Is there a model unsuccessful program?

The existence of a model unsuccessful program doesn’t indicate that if you do the opposite of

everything that program did, you’ll be successful. Even if it failed spectacularly, much of the

program may have been potentially effective, but one or two elements – the way participants

were approached, recruited, or treated, a particular method – negated what could have worked.

By the same token, most elements of the program may have been fine, but its basic premise

might have been mistaken or ineffective – “Just say no” as a way of preventing AIDS among

teens, for instance. It’s important to try to figure out why the program was unsuccessful. A true

model unsuccessful program is one that did everything wrong, but those are few and far between.

Lisbeth Schorr (Common Purpose) makes a useful distinction between “what works” and

conditions under which what works actually works. Sometimes the presence of a charismatic

leader or champion motivates staff and/or participants to succeed. When the leader or original

staff members leave, some such programs collapse, while others are able to renew themselves by

careful hiring and a faithful implementation of what made them work.

In looking for programs to draw from, you need to understand the intervention components and

elements that make those programs work. Also, try to understand the conditions that allow an

intervention to be successful.

SYNTHESIZE: PUT IT BACK TOGETHER

Analyze the elements you’ve found to determine which of them would be appropriate for

the situation and group you’re working with.

• What has been used specifically with your population in your circumstances? Have the

successful programs you’ve looked at been context-specific (i.e., intended for their specific

communities and populations)? Can they be adapted to your context if they weren’t

intended for it?

• What can be adapted, if it wasn’t originally aimed at your population? (Techniques used

with children or adolescents that could be modified for use with adults, for instance, or

vice-versa.)

• What’s missing? What aspects of the issue in your community are not addressed by what

you’ve found? Are they important enough that they need to be addressed?

• Does what you’ve found confirm or contradict what you thought you already knew?

• Are there factors in your particular situation that make the issue substantially different for

you and your participants than for any other programs or approaches you’ve found out

about? How will you deal with that?

• What does your information tell you about the possibility of successfully addressing the

issue’s root causes (e.g., income inequality, social exclusion, lack of power)?

• In general, did most or all successful programs direct their change efforts at the same group

of people (policy makers, for example), or was there a variety? If the latter, what do you

think is most likely to work in your community?

• Perhaps most important, what’s your definition of success, and which of the programs you

learned about came closest to achieving it? What components and elements of those

programs addressed what’s needed in your community?

Answering these questions will give you a good sense of which components of other programs

may work for you. Combined with what you already know, the answers may give you ideas for new

elements that you can add, or confirm (or warn you away from) ideas for new elements that you

had already.

We don’t want to imply that simply taking a lot of different program components and playing

mix-and-match will provide you with an effective way to address a community issue. You

have to start with a clear framework informed by your vision and mission, and put together a

program that’s coherent and makes sense. All the elements have to fit; if they fit well enough,

you’ll end up with a whole that’s greater than the sum of its parts. If the elements don’t fit

together, or aren’t part of a program with a well-defined framework, the chances are you’ll end

up with a mess.

KEEP AT IT

Information gathering and knowledge synthesis should continue throughout the course of the

program. While you may wait for the results of an initial evaluation to change something, you

should always be looking for improvements and better approaches. No program or effort is perfect:

everything can be improved. As long as you keep trying to learn more and grow in your

understanding of your work, it will continue to get better. If you become complacent (i.e., you feel

you know what you’re doing and can relax), your program may start to lose its effectiveness.

IN SUMMARY

Gathering the information that already exists about your issue and attempts to address it is one of

the most important aspects of planning a program or evaluation. By putting together what’s

known about the issue and the history of the successes and failures of various approaches to it,

you can build a program structure that includes your own innovations and elements that have

worked for others in similar situations. This synthesis also allows you to avoid ineffective

approaches and to incorporate ideas and methods that have been particularly appropriate,

culturally or otherwise, to the population and community you’re working with.

Information gathering and synthesis should continue throughout the life of the program. The

more information you have, and the more carefully you put it together, the better your chances of

implementing a successful program.

Chapter 3

QUALITATIVE AND QUANTITATIVE EVALUATION DESIGNS

DATA COLLECTION: DESIGNING AN OBSERVATIONAL SYSTEM

The local community health center was starting a program to encourage regular physical activity

among people with high blood pressure. The program had one simple objective: to engage

participants in 45 minutes of regular, moderate aerobic exercise at least four times a week over

the course of six months. The hope was that this regimen would lower participants’ blood

pressure, and lead to weight loss and an overall sense of greater well-being. A related aim was

that participants would continue the exercise on their own after the program ended.

The Center had gathered a group of 50 people who were willing to take part. After physical

checkups, all the participants attended workshops on diet, the mechanics and dangers of high

blood pressure, and how to start and maintain injury-free regular exercise. They also received

counseling about the kinds of exercise they might undertake – walking, bicycling, swimming,

etc. – and about ways to make exercising pleasant. To help participants to integrate exercise into

their lives, the center decided to ask them to exercise at their own convenience, using whatever

activities they chose, as long as they maintained the 45-minute, four-day-a-week pattern. They

were also asked to keep journals of the frequency and nature of their exercise, and to meet, in

groups of five, with Health Center counselors once a month for checks on blood pressure,

weight, and progress, as well as support, encouragement, and advice.

The center then had to decide how to evaluate the program. The performance goal of the

program – what participants were actually supposed to do – was maintaining the exercise

schedule over the six-month period. Since each participant was tending to that individually, it

would be hard to actually watch them all exercise whenever they did so. But the center needed to

know whether or not they had. The other program goals – lower blood pressure, good weight

loss – also had to be observed somehow. How could the center design a system to find out

whether the behavior of physical activity was occurring and whether this resulted in lower blood

pressure and weight loss?
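
One way to approach the center's question is to turn each participant's journal into a week-by-week check against the 45-minute, four-times-a-week target. The Python sketch below assumes a hypothetical journal format (one date and a number of minutes per entry) and invented sample data; it illustrates the idea and is not the center's actual system.

```python
from datetime import date

MIN_MINUTES = 45    # minimum length of a qualifying session
MIN_SESSIONS = 4    # qualifying sessions required per week


def weekly_adherence(entries):
    """Map each ISO (year, week) to True if that week met the exercise target.

    `entries` is a list of (date, minutes) pairs from one participant's
    journal (hypothetical format). Weeks with no journal entry are not listed.
    """
    sessions = {}
    for day, minutes in entries:
        iso_year, iso_week, _ = day.isocalendar()
        key = (iso_year, iso_week)
        sessions.setdefault(key, 0)       # every journaled week appears
        if minutes >= MIN_MINUTES:
            sessions[key] += 1            # count only full-length sessions
    return {week: count >= MIN_SESSIONS for week, count in sessions.items()}


# Invented sample journal covering two weeks
journal = [
    (date(2024, 1, 1), 45), (date(2024, 1, 3), 50),
    (date(2024, 1, 4), 60), (date(2024, 1, 6), 45),   # four full sessions
    (date(2024, 1, 8), 45), (date(2024, 1, 10), 30),
    (date(2024, 1, 12), 45),                          # only two full sessions
]

for week, met_target in sorted(weekly_adherence(journal).items()):
    print(week, "target met" if met_target else "target not met")
```

Because journals are self-reported, a summary like this is best read alongside the monthly blood pressure and weight checks rather than on its own.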

Once you’ve determined your evaluation questions and gathered information about what to look

for, you have to find a way to look for it. That’s what observation systems are all about. Like the

center in the example, you’ll need to find ways to observe the behavior, the conditions, and the

changes – or lack of changes – in them that will answer your evaluation questions. This section is

about setting up observation systems to do just that.

WHAT DO WE MEAN BY DESIGNING AN OBSERVATIONAL SYSTEM?

An observational system is the way you get information about your program – what it and its

participants and implementers are actually doing, and what seems to be occurring as a result.

“Observation” here may mean actual observation – watching people, conditions, activity, or

results to see what happens – but it may also refer to less direct ways of monitoring a program’s

operation and outcomes. Its varieties include monitoring the behavior of individuals and groups

to see the results at different levels. Some methods of observation that might prove useful in

different evaluation situations:

• Direct observation. This is the purest and most verifiable form – watching people or

observing conditions or situations firsthand. If you’re involved in an effort to increase the

use and neighborhood sense of ownership of a public park, for instance, you might

directly observe how much and how people use the park by visiting and observing on

different days, in different types of weather, and under different circumstances over a

substantial period of time. Direct observers may be “invisible,” as an observer of park

activity would probably be, or they may be staff members who work with participants,

recording what happens. In either case, they are taking measures as outside observers, not

as participants themselves.

• Participant observation. A participant observer becomes part of the action, and observes

as an insider. In the case of the park, a participant observer might be a neighborhood

resident directly involved in the effort, or might be someone who becomes part of the life

of the park for the purposes of observation. He might jog there daily, or join a weekly

volleyball game and get to know others who use the park on a regular basis. His own

notes about what is observed in the park might also become part of his recording.

• Self-reports. Some of what you’re trying to achieve may simply not be visible at all, at

least not to you. Changes in what people do in private, such as their use of contraceptives,

may not be (or should not be) observed directly by an outsider. Similarly, when the goal

is to effect changes in the behavior of large numbers of people, such as to promote

healthy eating in the community, it will not be feasible to directly observe this for

everyone. In such situations, we ask people to report on their own behavior. Thus, an

observational system may include interviews, journals, surveys, or other means of

first-person reporting. Since such reporting may be subject to bias, we usually try to also

use other forms of evidence (e.g., observing weight loss as a product of the behaviors of

healthy nutrition and physical activity).

• Second-hand reports. An observational system may include or depend on the reports of

others who have direct experience with the people or conditions you’re concerned with.

Teachers, probation officers, park rangers, public health nurses, social workers – even

bartenders or hairdressers – might be valuable sources of second-hand information. These

reports, like self-reports, may be gathered by interviews, journals, surveys, checklists, and

the like.

• Electronic or mechanical observation. The observer in this case isn’t a person (although

ultimately people would review its information), but an automatically-operated or always-

on camera, audio recorder, heart monitor, pedometer, GPS (global positioning system)

tracker, or other piece of equipment. A camera operated by a tripwire is often used, for

example, to study the density of an animal population in a particular area, or along a

particular path.

• Tests of various kinds. Depending on what you’re measuring, this category could cover

everything from pencil-and-paper tests of academic learning to hands-on skills tests to

blood tests and the like. They might also include tests of new program methods and

procedures to see if they work before putting them into practice.

• Public and other records. Police reports, census data, employment statistics, public

health information – all of these and more could give you information on community-level

indicators that will help you determine the outcomes of your work.

• Products or results of behavior. Sometimes it is more practical to observe the product or

result of a behavior, rather than the behavior itself. For instance, if interested in

environmental pollution, we might observe the amount of debris or toxins on the ground

or in the water, rather than the behavior of illegal dumping of toxins or materials. Similarly,

an initiative interested in preventing childhood obesity might use school records of height

and weight to measure obesity – in addition to direct observations of school lunches and

what youth report on eating surveys.

In addition to specifying what kinds of observation you’ll use, the design of an observational

system should also cover when, where, how often, by whom, and under what circumstances

observations will take place, as well as just what will be looked at. All of these depend on what

you’re observing and what information you hope to gain from your evaluation (back to those

evaluation questions again).

Among other considerations, will you look at the process of your effort or program – the steps

you took in setting it up, and whether they were faithful to what you intended? Will you look at

what you actually did – the number of participants you had, the methods you used, the time

everything took, how long participants stayed, etc.? Are you interested in which parts of what

you did were successful and which were not? And what do you want to know about outcomes –

the results of your program?

Designing an observational system entails thinking carefully about what you need to know, and

creating a system that is most likely to get you that information as accurately and easily as

possible. We’ll discuss the design process in detail, including the issues mentioned in the last

two paragraphs, in the “how-to” part of this section.

WHY DESIGN AN OBSERVATIONAL SYSTEM?

IF YOU’RE SERIOUS ABOUT EVALUATION, THERE ARE A NUMBER OF REASONS

FOR DESIGNING A GOOD OBSERVATIONAL SYSTEM:

• It can help you get reliable information. Designing a system that standardizes the

methods, times, and other aspects of the observation will mean that the information you

get from different observers and places is likely to be accurate and consistent, and

therefore more useful to an overall evaluation.

• It can help you find out exactly what you need to know, without wasted effort. You

can design a system that examines what you’re interested in, and ignores what you’re not.

That means that you don’t have to sort out unnecessary data, and that you’ll have the right

means of collecting the data you need to address your evaluation questions.

• It can ensure that observations are made. A consistent system that’s designed and

accepted by those who will do the observing, whatever form it takes, makes it far more

likely that observations will be made when, where, and how they’re supposed to.

• It can make it easier to analyze your data. A consistent, rational system of observation

can give you good information for scientific analysis, whether that analysis is quantitative

(based on numbers and statistics) or qualitative (based on narrative and interpretation of

the meaning of behavior and events).

• It can help you avoid haphazard evaluation. A well-designed observational system will

allow you to collect information systematically, and not leave you with a mass of

disconnected data that are not necessarily related to what you want to know.

• It will make it easier to justify your findings. The more accurate your information, the

more reliable the conclusions that can be drawn from it. If your observational system is

designed and implemented well, it’s much easier to argue that your information is reliable

and accurate, and a good base for the conclusions you reach.

• It can help you gain credibility with funders and policymakers. The people who

control funds and policy are particularly concerned with accountability. If you can present

them with a useful evaluation based on data collected through a well-designed and

reliable observational system, they’ll be more inclined to treat you as a knowledgeable

voice in the field.

• It can let you pass on your practices with confidence. A well-designed observational

system makes it possible to feel that your evaluation results tell the truth. If those results

show that your program is highly effective, you can pass on what you do as a best

practice to colleagues in the field and others, without worrying that you may be urging

them to use methods or assumptions that might not work very well.

• It can give you the best information possible about what’s working in your program,

and what you need to adjust.

WHEN SHOULD YOU DESIGN AN OBSERVATIONAL SYSTEM?

As we’ve discussed, an observational system refers not only to direct observation, but to any

method of examining and recording the process, activities, and outcomes of your program. An

observational system is intrinsic to your evaluation, since that is what will tell you what actually

went on. Therefore, the ideal is to design that system before you actually start implementing the

program, so that you can monitor the program throughout its existence.

That’s the ideal. The reality for many community workers – especially those who work in small,

community-based organizations – is that evaluation begins whenever the time, energy, and

resources are available, which is often months or years after the program has started. Whenever

it begins, the observational system should be designed to fit the evaluation questions you’re

asking. Because the observations must be consistent and reliable, it’s well worth taking the time

to design an effective observational system.

It’s best if you can observe through a whole program cycle, from beginning to end. Some

programs don’t have a cycle, and the observation may focus on the behavior or results of

individual participants rather than the program as a whole. In these situations, evaluation may

begin as new participants begin the program and are observed from the beginning.

If you’ve been recording events, keeping journals, etc., before you start evaluating, you may have

information that you can incorporate into the results of your observation. If your design calls for

specific firsthand observations, their definition may be precise enough that similar observations

recorded in journals kept by staff may not meet the criteria to be included. If staff journals or

records are part of the system you’re putting in place, however, you may be able to use all or

most of the information you have (and, indeed, you could design the observational system so you

can.)

The real danger here is that you’ve missed something important already by the time your

observational system is operational. It may be that the early part of the program is crucial, at

least for some participants or for some changes, and you’ll be starting to observe after those

changes have been made. Start your observations early so that your system can pick them up.

WHO SHOULD DESIGN AN OBSERVATIONAL SYSTEM?

We’ve stated many times the Community Tool Box bias toward participatory process, and

particularly toward participatory research and evaluation. In the case of an observational

system, a system will function best if it’s designed by a group that includes those who will

actually be the observers. If they’re part of the planning, they’ll be familiar with the system,

know exactly what information it’s meant to observe, and understand their roles with the

observational system.

In a community-based or smaller organization, it’s likely that time will be a factor – there are

probably too few people already doing too many jobs. If that’s the case, the level and nature of

the observational system has to be one that the staff can actually handle, whether they’re the

observers, or whether they’re facilitating the observation for outside evaluators or volunteers. If

they help to plan and set up the system, they’ll have far more incentive to make sure it works

than if it’s imposed on them.

The design of the observational system should specifically include the people who will actually

do the observing, who are often either staff members or members of the group that will benefit

from the program. In addition to these, it helps to include researchers or others who understand

observational systems, and can help to design a system that specifically meets the needs of the

evaluation or research project. It might also be beneficial to include individuals from the

group(s) that will be observed, to help with cultural issues and provide feedback on their

response to the design. The observational design team, therefore, might consist both of members

of the overall evaluation planning group, and others specifically recruited to work on an

observational system.

The actual short list includes:

• Program staff and administrators

• Support staff (who often do recording or data entry)

• Outside evaluators or research consultants

• Participants or beneficiaries

• Volunteer observers

If, for some reason, the design group doesn’t include anyone with research experience, a

training that includes information about different methods of observation, and about which

methods are likely to produce which kinds of information, could help greatly to inform the

design process. If the group does include researchers, that information could be presented as

part of the discussion about design and the various possibilities, rather than in a training or

workshop format.

HOW DO YOU DESIGN AN OBSERVATIONAL SYSTEM?

So, you’ve decided on evaluation questions and planned an evaluation. Now it’s time to

determine how you’ll get the data you need to answer your evaluation questions.

REVIEW YOUR EVALUATION QUESTIONS

Remember these? You decided what it was you wanted to know, in order to determine whether

your program was effective. Let’s go back to the example at the beginning of this section, the local

community health center program. The center was starting a physical activity program for people

with high blood pressure. Its objective was to have the participants engage in 45 minutes of

moderate exercise at least four times a week over six months.

It hoped for several outcomes from this activity:

• That participants’ blood pressure would decrease

• That participants who needed to would lose weight

• That participants would experience a sense of greater well-being

• That participants would continue the exercise routine after the six-month program ended.

Some evaluation questions, therefore, are:

• Did participants engage in the recommended exercise routine for the period of the

program?

• Did participants’ blood pressure decrease by the end of the six months?

• Did those participants who needed to lose weight do so by the end of six months?

There could easily be many other questions, a few examples being:

• How well attended were the workshops? Did participants find them helpful? Did

participants who attended all or most of the workshops achieve better results than those

who didn’t?

• Did participants experience a sense of greater well-being by the end of six months?

• Did participants continue the exercise routine (and maintain their lower blood pressure)

after the program ended?

The answers to these questions might be only a few of those that the center wanted, but let’s

stick with them for now, and use them as examples as we go through this part of the section.

You may be concerned about your own process – how well you actually plan and implement

your program. You may also, like the center, have a specific time frame in which you hope for

results. You might also have benchmarks – smaller achievements along the way to a larger goal

– that you’re concerned with recording. All of this should figure into your observational design.

DECIDE WHAT YOU NEED TO OBSERVE TO ANSWER YOUR QUESTIONS

Depending on the kind of program or effort you’re engaged in, and the nature of your evaluation

questions, there’s a broad range of choices here.

Some of the most common:

• Participants’ behavior. This could be anything from the aggressive behaviors of children

in a schoolyard or play setting to the degree of welding skill exhibited by participants in

an employment-training program to the social interactions of the users of the

neighborhood park we discussed earlier. The possibilities are nearly endless.

• Someone else’s behavior. The ultimate test of whether a high school peer mediation

training program is working, for instance, may not be the behavior of the mediators on

whose training the program focuses, but the behavior of the students with whom they

work. There’s also a possibility here of looking at the ways in which participants are

treated by program staff and vice versa.

• Conditions. An initiative may aim to change conditions directly – eliminating a crack

house where drug dealing occurs, building affordable housing, cleaning up a polluted

river – or may be meant to influence those changes by implementing programs or

environmental or policy changes.

• Observations of products or results of behavior. When the behavior or event or

condition itself isn’t visible or observable, either because it’s private, or because it takes

place on a level that can’t be observed directly, you may have to measure its products or

results. It would be virtually impossible to observe directly the rate at which adolescents

practiced safe sex, but it would be possible to learn the rate of STD infections among

them, and the number of teen pregnancies before, during, and after a safe-sex peer

education program.

When products or effects are all you can observe, you have to be sure that you’ve chosen the

right ones to look at. They should be, to the extent possible, obvious results of the behavior or

condition you’re interested in, and you should take into account – and try to correct for –

any other factors that might have caused them.

• Participants’ knowledge or attitudes. Like participants’ behavior, the possible range here

is enormous, from scores on a knowledge test to nearly anything else you might think of.

• Someone else’s knowledge or attitudes. For instance, an advocacy program would be

concerned with changing the attitudes of legislators and the public, but might not have

direct contact with those whom it hoped to influence. Such a program might use repeated public

opinion surveys to assess willingness to support a particular policy change.

• Goal attainment. Some programs have a particular aim that is their only reason for

existence. This might be the passage or repeal of a law, the building of a school, the

freedom of one or more political prisoners, etc. The only evaluation question in that case

may be whether the goal has been achieved (or to what degree). In this situation, a goal

attainment scale can be used to assess the degree of attainment (e.g., from 5 = most

favorable outcome, to 1 = least favorable outcome).

• Interactions. The focus of an evaluation might be on the nature of interactions, or on

whether particular individuals or groups interact or engage each other. For instance, if

the goal is increasing parent-child interaction, each party’s talking to and responding to

the other might be measured. As above, interactions among program participants, staff, or

between participants and staff might be the focus of observations.
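
The goal attainment scale mentioned in the list above can be sketched in a few lines of Python. This is only an illustration, assuming a five-point scale: the wording of each level and the names `GOAL_ATTAINMENT_SCALE`, `rate_goal`, and `mean_rating` are hypothetical, not part of any standard instrument.

```python
# Hypothetical five-point goal attainment scale (5 = most favorable outcome,
# 1 = least favorable outcome). The level descriptions are illustrative only.
GOAL_ATTAINMENT_SCALE = {
    5: "Most favorable outcome",
    4: "More than expected",
    3: "Expected outcome",
    2: "Less than expected",
    1: "Least favorable outcome",
}


def rate_goal(score: int) -> str:
    """Return the description for a rating, rejecting values off the scale."""
    if score not in GOAL_ATTAINMENT_SCALE:
        raise ValueError(f"score must be between 1 and 5, got {score}")
    return GOAL_ATTAINMENT_SCALE[score]


def mean_rating(scores: list[int]) -> float:
    """Average the ratings several raters gave to the same goal."""
    for s in scores:
        rate_goal(s)  # validates each score before averaging
    return sum(scores) / len(scores)
```

Writing the levels down before the effort ends keeps raters from redefining success after the fact.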

All of the above possibilities might have to do with either program goals – i.e., what a program

wants to accomplish – or process and implementation – how a program goes about setting up

and carrying out its work.

There are also some areas of observation that relate specifically to program process and

implementation:

• Planning. Measurement here may focus on who was involved in planning what parts of

the program, how the plan was developed, what its content was, satisfaction, etc.

• Timeline. When did the planning, implementation, and evaluation of the program each

begin? How long did each take? Were deadlines met, and, if not, why not?

• Numbers of participants. How many participants did you have? What was the average

amount of time they spent in the program? How many dropped out before completing

the program? How did those numbers compare to what you expected?

• Methods. What methods did you use in the program or intervention? How were they

used?

• Program implementation. What did you actually do? This would include the program

activity, its frequency and duration, the number of participants it served, where it took

place (if that’s relevant), and how it was conducted.
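
The “numbers of participants” questions above lend themselves to a simple tally. The sketch below assumes each participant has one enrollment record. The field names (`weeks_attended`, `completed`) and the `summarize` function are assumptions made for this example, not a prescribed record format.

```python
from dataclasses import dataclass


@dataclass
class Enrollment:
    """One participant's record; field names are assumptions for this sketch."""
    participant_id: str
    weeks_attended: int
    completed: bool


def summarize(enrollments: list[Enrollment], expected: int) -> dict:
    """Answer the 'numbers of participants' questions from enrollment records."""
    total = len(enrollments)
    dropped = sum(1 for e in enrollments if not e.completed)
    avg_weeks = sum(e.weeks_attended for e in enrollments) / total if total else 0.0
    return {
        "participants": total,
        "dropped_out": dropped,
        "average_weeks": avg_weeks,
        # Negative means fewer participants than the program expected.
        "difference_from_expected": total - expected,
    }
```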

In addition to identifying what you want to look at, you’ll have to define it carefully, so that

observers know exactly what to look for. You have to be certain that all observations of a

particular behavior, for instance, refer to the same phenomena (e.g., specific features that define

whether the behavior occurred), even if the observations are made by different people. If they don’t

use the same measurement, you can’t really count on the information you get. Setting the limits

of observations in each category – what’s included, what’s excluded, and where the boundaries

are – will help to eliminate disagreement and make the observation more reliable.
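
One way to check whether definitions are clear enough is to have two observers code the same intervals independently and compare their records. The sketch below computes simple percent agreement and Cohen's kappa, a commonly used chance-corrected statistic that this module does not itself name. The function names and the idea of one code per interval are assumptions for illustration.

```python
from collections import Counter


def percent_agreement(a: list[str], b: list[str]) -> float:
    """Share of observation intervals on which two observers gave the same code."""
    if len(a) != len(b) or not a:
        raise ValueError("both observers must code the same, non-empty set of intervals")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a: list[str], b: list[str]) -> float:
    """Agreement corrected for the agreement expected by chance alone."""
    observed = percent_agreement(a, b)
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: probability both observers pick the same code at random,
    # given how often each of them used each code.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used one identical code throughout
    return (observed - expected) / (1 - expected)
```

Low agreement usually points to a definition that needs tighter boundaries or to observers who need more training, rather than to a problem with the program itself.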

To continue the health center example, let’s look at what the Center needs to observe. In order

to determine whether participants are actually doing their exercise regularly, the center has to

find a way to observe people’s behavior (e.g., activity logs, self-reports on how often they

engaged in physical activity). To find out whether they’ve lost weight, the center has to observe

an outcome of behavior (e.g., weight, body mass index). To learn whether they’re experiencing

an increased sense of well-being, the center has to obtain self-reports. And to learn whether

they continue their exercise past the end of the program, it has to find a way to observe behavior

after the program ends.
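
To make the first of these concrete, an activity log can be checked against the center's objective of 45 minutes of moderate exercise at least four times a week. The sketch below assumes each log entry is a (date, minutes) pair and groups entries by ISO calendar week. The function name and the log format are hypothetical.

```python
from collections import defaultdict
from datetime import date


def weeks_meeting_target(log: list[tuple[date, int]],
                         min_minutes: int = 45,
                         min_sessions: int = 4) -> dict[tuple[int, int], bool]:
    """Map each (ISO year, ISO week) in the log to whether the target was met."""
    sessions = defaultdict(int)
    for day, minutes in log:
        year, week, _ = day.isocalendar()
        # Count only sessions long enough to qualify, but still register the week.
        sessions[(year, week)] += 1 if minutes >= min_minutes else 0
    return {wk: count >= min_sessions for wk, count in sessions.items()}
```

Weeks with no log entries at all do not appear in the result, so a fuller version would need the list of program weeks to tell “no exercise” apart from “no report.”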

DECIDE HOW THE OBSERVATIONS WILL BE CONDUCTED

Earlier, we discussed some methods of observation. We’ll return to those here, and examine

them in greater detail.

• Direct observation. Direct observation takes one of two forms. In the “fly on the wall”

approach, the observer is anonymous and generally unnoticed. More often, in service

organizations of various kinds, the observer is either an (identified) outside evaluator or a staff

member who works with participants and records, sometimes with their help, their

behaviors and aspects of the situation. Anonymous observers are particularly good in

situations where any people being observed are equally anonymous – conditions, large

events, or activity like the use of that neighborhood park we talked about earlier, where

the people involved could be anyone.

One common method of direct observation – whether the observer is a program staffer or a

program participant – is through keeping a journal or activity log. The observer writes down or

otherwise records, soon after the occurrence, an account of what happened and events related to

it and often reactions to those experiences as well. The journal or activity log then becomes a

picture of the flow of the program, detailing the progress through it of particular participants,

adjustments made, and satisfaction.

The nature of journals or report logs will obviously vary with the nature of the program, and not

all programs or efforts lend themselves to this kind of observation. But, especially in situations

where several people have written journals or logs that cover the same period and events, they

can be a very powerful and revealing means of observation.

• Participant observation. As we’ve explained, participant observers become part of the

event, activity, culture, etc. that’s being observed, and experience it firsthand. Thus, in a

health or human service program, the observer might be an actual participant (i.e., a

member of the group at whom the program is aimed), or an evaluator who joins

participants in their activities. In the case of the park, for instance, a neighborhood

resident who already visits the park regularly might volunteer to track how he and others

actually use it – when various people or groups come, what they do once they’re there,

which parts of the park they frequent, who interacts with whom, etc.

A micro-grant program was designed by a non-governmental organization in a rural village to

increase income through creation of small businesses by participants. Members of the staff took

part in each workshop, and participated in such activities as training in lending and loans, all the

while observing the activities and other participants. At the end of each day, the staff conducted

group discussions where they relayed what they had seen, and asked participants to analyze

what they had done. The staff in this instance functioned as participant observers, and their

participation added greatly to their ability to help their clients, low-income women from small

villages, understand and use their experiences.

• Self-reports. When the object of your observation is participants’ behavior that takes

place away from the program (the amount of time participants spend reading to their

children, for example), you often have to rely on the observations of participants

themselves. This reliance has, as you might expect, both advantages and disadvantages.

On the one hand, participants obviously know their own actions well. On the other, they

can also leave out things that might be embarrassing or report in ways that they think others

want to see. In addition, since they haven’t had prior experience as or been trained as

observers, they might miss, or dismiss as unimportant and not report, behaviors or

conditions that would be valuable for evaluation purposes.

An obvious remedy would be to train self-reporters as observers, and in some cases – medical

trials, for instance – that’s both reasonable and common. In other situations, however, it would

go too far toward telling participants “the right answers,” and thereby possibly changing what

they report toward what they think you want to hear. There is also a huge advantage to self-

reports: when they’re honest and represent a real change in behavior or experience for the

reporters, they’re far more powerful than anything another person could say about their

experience.

Self-reports, at least as defined here, imply an array of possible techniques for data collection.

Individual and group interviews, focus groups, public meetings, surveys and questionnaires,

journals, checklists, and even casual conversation might all be ways for participants to convey

information for an evaluation.

• Second-hand reports. These are reports about participant behavior or about conditions

that come from people associated with those participants or conditions, but not directly

connected to your program or effort. They might be service workers, teachers, health

professionals, youth workers, family members (particularly parents of young children),

employers – almost anyone. Second-hand observations have to be viewed with some of

the same cautions as those from people who work closely with participants.

Observer-participant relationships, sympathy or empathy for participants, or observers’

personal biases can all keep reports from being objective. These observers may also need

training.

• Electronic or mechanical observation. There are some circumstances where human

observation is impossible or impractical. Observations of speeding are often made

electronically, as are observations of health conditions inside the human body (using

X-rays, CT or MRI scans, electrocardiographs, etc.). In these situations, objectivity is no

problem, but you have to be sure that whatever equipment you use is working properly,

set up correctly, well-maintained, and protected against possible damage. It is also

important that whoever interprets the information the equipment provides is trained to do

so, and understands the limits and appropriate uses of that information.

• Tests or other similar observation tools. Education and health organizations often use

various kinds of tests as observation tools. In a human service context, they are generally

used to observe progress in skills, competency levels, or development. In public health or

medicine, tests may be used to observe health status (e.g., screenings for elevated blood

cholesterol) and the effects of treatment. They can be very useful in all these

circumstances, but they’re also very specific, and don’t allow much room for intuition. In

addition, the results of tests of skills, knowledge, or intellectual ability may be influenced

by nervousness, lack of sleep, personal problems, or other factors that have little to do

with actual competency.

• Public records and the like. If you’re using community-level indicators, such as rates of

infant mortality or injuries due to motor vehicle crashes, as one way of looking at

outcomes, you’ll have to use records, census data, and other similar material to get the

data you need.

To continue with our previous example, the local health center would have to use a variety of

these observation methods. The beginning and ongoing observations of blood pressure and

weight at the monthly counseling sessions would take place with the use of instruments – a blood

pressure cuff and a weighing scale – as well as by direct visual observation. (While an obvious

reduction in fat may not indicate weight loss if the fat is replaced by muscle, it does indicate an

increase in fitness, which may be equally beneficial. Fitness levels could also be mechanically

and electronically measured, if the program chose.)

The amount and type of exercise each participant engaged in would be self-observed and

self-reported through journals and interviews. Feelings of well-being would be self-reported, but

could also be observed by counselors trained to look for changes over time in posture,

self-presentation, and other observable indicators. Finally, observations of continued exercise

would come through one or more follow-up visits some time after the program ended, with

interviews, blood pressure and weight measurements, and more direct visual observation of

fitness levels. (Participants might also agree to continue keeping journals for a set period of time

after the program ended, thus providing a self-report of their ongoing levels of physical activity.)

DECIDE WHEN YOU NEED TO OBSERVE

The question here is whether you need to start observing at the very beginning of the program

(you almost always do), and how often you should observe throughout the course of the

evaluation.

Some of the possibilities:

• Pre- and post- observation. This means making your observations at the beginning and

the end of the evaluation period or the program. It’s the equivalent of what many schools

do with standardized testing. They test reading scores at the beginning and end of each

year, and then compare the two to determine how much the students have advanced.

Although this type of observation may tell you whether anything changed during the

program, it won’t give you strong evidence of how the change took place, what caused it, or

how effective your methods were.

This explanation assumes only before-and-after observation. Most of the possibilities here

include before-and-after observation, but add other observations to it. For most kinds of

evaluation, you should start observing at the beginning, or even well before the beginning (to

understand whether any changes may in fact be part of an already-existing trend). If you

conduct your first observation partway, you won’t know if changes occurred before then. A

major change may occur toward the beginning in some interventions, toward the end in others,

steadily throughout the intervention in still others, and in a few not till after the intervention is

over. It’s important to know just where you started from in order to fully understand what

you’re seeing. It may be that a long intervention is no more effective than a short one, or that a

short one makes no difference at all. You can only tell by knowing where you started from and

through repeated or continuous measurement.

If your program or effort is one with a specific, one-time goal – for example, the passage of a

law, or the clean-up of a particular space – the temptation may be to evaluate it only by whether

you reach your goal or not (i.e., a single observation at the end of the effort). This would be a

mistake, because it wouldn’t take into account the parts of your effort that were successful and

why, whether or not you reached your goal. That’s a piece of information you’ll need the next

time – and there will be a next time – you or others in the community take on a similar effort.

• At regular intervals during the evaluation period. You might choose any period from

once an hour to once a month or more, depending on what you’re observing. The

regularity makes observations easy to schedule, and gives an interval-by-interval picture

of what’s going on.

• At irregular intervals during the evaluation period. The reason for this schedule might

be logistical (you observe when you can); might have to do with making sure that

observations aren’t expected, so that you get a true picture of what you’re looking at; or

might be an attempt to look at the program or effort randomly, again to try to get an

accurate picture.

• At specific times during the evaluation period. In this case, you might be concerned to

see what happens or what participants are doing at different, identifiable times that imply

different, identifiable conditions. In observing the use of that neighborhood park we

mentioned earlier, you might want to be sure to go on weekdays, on weekends, in the

morning, afternoon, and evening, at each of the four seasons, in rain, clouds, sun, and

snow, and on days when there were special events in the park, to see who uses the park

and how under different conditions and at different times. If you’re monitoring the

process and progress of the program, it’s important to make sure you observe each stage

of it – the planning, the preparation, the implementation, the evaluation, and any follow-up

– to make sure you get a full picture of what you did and how you did it. This will give

you the information you need to analyze in order to make adjustments in how you conduct

your work.

• Continuously. When the observer is a staff member working with program participants

(or one or more of the participants themselves), it may be possible to make ongoing

observations. The observer in this situation might observe directly using checklists, keep

a journal, ask participants to keep records, video- or audiotape sessions, or record what

happens in some other way, so that there’s an ongoing, day-to-day account of the

behavior and what is happening in the environment of the program.
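
Regular and irregular schedules like those described above can be drawn up in advance. The following is a minimal sketch that assumes observations are planned by calendar date. The function names are hypothetical, and the random draw is just one way of making sure observations aren't expected.

```python
import random
from datetime import date, timedelta
from typing import Optional


def regular_schedule(start: date, end: date, every_days: int) -> list[date]:
    """Observation dates at a fixed interval, starting on the first day."""
    days = []
    current = start
    while current <= end:
        days.append(current)
        current += timedelta(days=every_days)
    return days


def random_schedule(start: date, end: date, n: int,
                    seed: Optional[int] = None) -> list[date]:
    """n distinct observation dates drawn at random between start and end."""
    span = (end - start).days + 1
    rng = random.Random(seed)  # pass a seed to reproduce the same schedule
    offsets = rng.sample(range(span), n)  # raises ValueError if n > span
    return sorted(start + timedelta(days=o) for o in offsets)
```

A schedule for specific times (weekdays and weekends, mornings and evenings) would need a list of conditions to cover rather than a date range, so it is not shown here.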

At the local health center, some of the observation – particularly that of monitoring participants’

blood pressure and weight – would be done at regular intervals, during the monthly meetings.

There would also be some continuous observation – that of participants keeping track of their

exercise programs in journals. And finding out whether participants continued with their

routines would be a one- or two-time follow-up – perhaps six months and a year after the

program ended.
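
The difference between pre- and post- observation and observation at regular intervals shows up clearly in a small calculation. The sketch below uses made-up monthly systolic readings for one hypothetical participant. The numbers and function names are illustrative only and do not come from any real program.

```python
def pre_post_change(readings: list[float]) -> float:
    """Difference between the last and first observation (negative = decrease)."""
    return readings[-1] - readings[0]


def interval_changes(readings: list[float]) -> list[float]:
    """Change between each pair of consecutive observations."""
    return [later - earlier for earlier, later in zip(readings, readings[1:])]


# Hypothetical systolic readings (mmHg): a baseline followed by six monthly
# sessions. These values are invented purely to illustrate the comparison.
systolic = [150.0, 142.0, 138.0, 137.0, 137.0, 136.0, 136.0]

overall = pre_post_change(systolic)      # one number for the whole program
by_month = interval_changes(systolic)    # shows when the change happened
```

With these invented values, the pre/post figure reports only the overall drop, while the monthly changes show that most of it came in the first two months. That is the kind of information a before-and-after comparison alone cannot provide.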

DEFINE AND DESCRIBE THE BEHAVIORS, PRODUCTS, CONDITIONS, AND/OR

EVENTS THAT OBSERVERS SHOULD BE CONCERNED WITH

If you want to be sure you know what observers are referring to in their reports, you have to be

specific about what you want them to look for. The planning group, or a subgroup of it – the

ideal would be a group that included a high proportion of people who will actually be researchers

and observers – should set out identification standards for each element to be observed. These

would explain what it looked like, when it was likely to occur, who would probably be involved,

etc. For instance, to observe bullying or interpersonal violence on a playground would require

clear definitions of this behavior, examples and non-examples, and scoring instructions. As a

result, observers could all start with the same guidance about what they were looking for.

DESIGN TRAINING FOR OBSERVERS

Unless all the observation is to be done by those directly involved in the planning (not

impossible in a small organization), and depending on their previous experience, observers

might need to be trained in one or more areas:

• What it’s important to record, and why. People who have no acquaintance with

research might not realize how important it is to record such details as the date, time,

evaluation length, place, and circumstances of any observation, a description of who was

involved and for how long, whether there were unexpected people or conditions present,

etc. An early morning observation might provide a different set of observations than a

late afternoon or evening one, for example. The presence of other people in a situation or

interview – relatives, friends, and program staff – can change the character of the

behaviors displayed or information offered. It’s crucial, therefore, that observers

understand that the context of the observation can be as important as its content.

• The definitions and descriptions of the behaviors, conditions, events, or situations to

be observed. Careful definitions and descriptions of what’s to be observed won’t make

much difference unless those who’ll do the observations are familiar with them.

• Effects of observation. In some cases, the behavior of participants might change as a

result of their reactions to being observed. Observers have to be aware of that possibility,

and make their own behavior as invisible as they can, so as not to influence participants’

behaviors.

In addition to human observers, the presence of audio or video equipment can often have an

effect. One way to offset it is to wait to start collecting data until participants are used to the

presence of the equipment. It’s also important to get participants’ permission beforehand to use

recording equipment.

• Observer bias. Especially in situations where the observers are also program staff, their

relationships with participants, or simply with the effort as a whole, may affect their

reports or observations. If they particularly like or dislike a participant, that may have

some influence on how they interpret or describe that person’s behavior. If they’re

heavily invested in the success of the program being evaluated, they may – intentionally

or unintentionally – put the best possible light on what they see. Whether or not they’re

program staff, observers can also be influenced by their personal assumptions, their

cultural, religious, or educational background, or their current psychological states or life

circumstances. If they can be helped to recognize these biases and understand why and

how they should be acknowledged or eliminated, there’s a better chance that they’ll

conduct reliable observations.

• Observer drift. Sometimes, after people have been observing for a while, their

observations tend to take on a regularity based on the rules they make up rather than

shared definitions based on a standard. They might tend to rate the behavior of certain

participants in ways based more on past experience than on what they see, for instance.

You may also have to correct for observer effects, bias, or drift over the course of the

evaluation. That’s part of devising checks for accuracy and reliability based on a standard.
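One way to make sure that context gets recorded along with content is to give observers a fixed record to fill in for every observation. The sketch here is only an illustration, written in Python; the field names and the sample entry are assumptions made for the example, not part of any standard form.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ObservationRecord:
    """One observation plus the context observers are asked to note."""
    observer: str
    started_at: datetime          # date and time of the observation
    minutes_observed: int         # how long the observation lasted
    place: str
    behavior_code: str            # a code taken from the shared definitions
    people_present: list = field(default_factory=list)
    unexpected_conditions: str = ""
    recording_equipment_used: bool = False

    def context_is_complete(self) -> bool:
        """True when the basic context fields have all been filled in."""
        return bool(self.observer and self.place and self.minutes_observed > 0)


# Hypothetical sample entry.
record = ObservationRecord(
    observer="Staff member A",
    started_at=datetime(2015, 3, 2, 8, 30),
    minutes_observed=20,
    place="North playground",
    behavior_code="no_bullying_observed",
    people_present=["program staff"],
)
print(record.context_is_complete())
```

A paper form with the same headings would serve just as well; the point is that every observer records the same context items every time.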

DEVISE CHECKS FOR RELIABILITY AND ACCURACY

If your information is to be reliable, it’s important that when two observers record a particular

behavior, they mean exactly the same thing. This is a matter of training (see above), and also one

of checking, either at the beginning or periodically, to make sure that all observers are seeing

things in the same way. A participatory design of the system will help here. If the observers are

involved in defining what they all will be observing, there’s a much better chance they’ll all see

it similarly.

Some ways you can try to ensure agreement among observers:

• Use an external standard. One way to define what you’re looking for is to use a

standard that’s used and accepted by all observers. “Behavior X looks like this, occurs in

these circumstances, lasts for this long, and has these after-effects or results.” The use of

external standards often employs a checklist or something similar. The observer checks

off components of a behavior or condition, thus documenting what he sees in a way that

matches how a different observer would score the same situation. Such

standards help assure the continued accuracy of the observational system.

Research teams and laboratories commonly use standards to assure agreement in identifying

various conditions. Each condition is described in detail, with various possible markers, such as

measures of blood pressure or environmental toxins.

• Check for inter-rater reliability. Inter-rater reliability is the research term for assessing

whether all observers interpret the same things – behaviors, conditions, events – in the

same way. One way to address it is to check observers against one another. Two or more

are exposed to the same situations or information, and then their scores, such as counts of

instances of bullying, are compared. If they all say essentially the same thing, then

inter-rater reliability is high, and everything’s fine. Typically, in research, if observers

agree 80% or more of the time, observations are deemed reliable. If they disagree about

what they saw, then you have to find the source of the disagreement. They may define

terms differently, or their backgrounds may bias them toward seeing the same thing in

different ways. Whatever the case, you have to uncover differences, and find a way to

help all observers see things similarly and accurately.

• Use random third-party checks. A researcher, program director, or someone else who

has a clear idea of what information is important and what various conditions or

behaviors look like can observe in randomly chosen situations along with a regular

observer to see if their observations match reasonably well. If they disagree once, it may

not mean much; but if they consistently disagree, there’s a problem.
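The 80% rule of thumb for inter-rater reliability can be checked with simple arithmetic: count the observations on which two observers gave the same code, and divide by the total number of observations. Here is a minimal sketch in Python. The function name, the codes ("B" for bullying, "N" for none), and the ten sample ratings are all made up for illustration.

```python
def percent_agreement(ratings_a, ratings_b):
    """Share of observations on which two observers gave the same code."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("Both observers must rate the same observations")
    if not ratings_a:
        raise ValueError("At least one observation is required")
    matches = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return matches / len(ratings_a)


# Hypothetical codes for ten playground intervals.
observer_1 = ["B", "N", "N", "B", "N", "N", "B", "N", "N", "N"]
observer_2 = ["B", "N", "B", "B", "N", "N", "N", "N", "N", "N"]

THRESHOLD = 0.80  # the rule of thumb quoted in the text
agreement = percent_agreement(observer_1, observer_2)
print(f"Agreement: {agreement:.0%}")
if agreement >= THRESHOLD:
    print("Observations can be treated as reliable")
else:
    print("Look for the source of the disagreement")
```

In this made-up case the observers differ on two of the ten intervals, so they agree 80% of the time and just meet the threshold. Simple percent agreement does not correct for agreement that happens by chance, so treat it as a first check rather than a final verdict.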

DETERMINE HOW TO REVIEW AND ADJUST YOUR OBSERVATIONAL SYSTEM FOR

THE NEXT EVALUATION

Here’s this section’s version of “keep at it.” Just like your program, your evaluation, including

your observational system, should be evaluated and adjusted to be as effective as possible.

Now you’re ready to start collecting and analyzing data. With careful planning and good training,

you should be able to get the information you need for your evaluation.

IN SUMMARY

In order to conduct an evaluation that allows you to see your program or effort clearly and to

adjust and improve it, you have to have a way of collecting accurate and useful information

about it. The observational system you use is the way you look at what you’re doing – at your

own process, at participants’ behavior and their progress and results, at conditions that affect

your effort or that your effort is trying to change – to gain the information that you’ll analyze to

evaluate your work. That system has to be feasible within your resources, and has to fit with the

nature of your program, so designing it is an important part of your evaluation.

The design of observational systems is best carried out as a participatory process, particularly

one involving both researchers or evaluators and those who’ll do the actual data collection. That

involvement will give them a clear understanding of the system itself, of what information is

needed, and of the pitfalls to data collection that they might encounter along the way. The result

should be a more reliable system, and, ultimately, more accurate data for your evaluation.

Chapter 4

SELECTING AN APPROPRIATE DESIGN

When you hear the word “experiment,” it may call up pictures of people in long white lab coats

peering through microscopes. In reality, an experiment is just trying something out to see how or

why or whether it works. It can be as simple as putting a different spice in your favorite dish, or

as complex as developing and testing a comprehensive effort to improve child health outcomes

in a city or state.

Academics and other researchers in public health and the social sciences conduct experiments to

understand how environments affect behavior and outcomes, so their experiments usually

involve people and aspects of the environment. A new community program or intervention is an

experiment, too, one that a governmental or community organization engages in to find out a

better way to address a community issue. It usually starts with an assumption about what will

work – sometimes called a theory of change – but that assumption is no guarantee. Like any

experiment, a program or intervention has to be evaluated to see whether it works and under

what conditions.

In this section, we’ll look at some of the ways you might structure an evaluation to examine

whether your program is working, and explore how to choose the one that best meets your needs.

These arrangements for discovery are known as experimental (or evaluation) designs.

WHAT DO WE MEAN BY A DESIGN FOR THE EVALUATION?

Every evaluation is essentially a research or discovery project. Your research may be about

determining how effective your program or effort is overall, which parts of it are working well

and which need adjusting, or whether some participants respond to certain methods or conditions

differently from others. If your results are to be reliable, you have to give the evaluation a

structure that will tell you what you want to know. That structure – the arrangement of

discovery – is the evaluation’s design.

THE DESIGN DEPENDS ON WHAT KINDS OF QUESTIONS YOUR EVALUATION IS

MEANT TO ANSWER.

Some of the most common evaluation (research) questions:

• Does a particular program or intervention – whether an instructional or motivational

program, improving access and opportunities, or a policy change – cause a particular

change in participants’ or others’ behavior, in physical or social conditions, health or

development outcomes, or other indicators of success?

• What component(s) and element(s) of the program or intervention were responsible for

the change?

• What are the unintended effects of an intervention, and how did they influence the

outcomes?

• If you try a new method or activity, what happens?

• Will the program that worked in another context, or the one that you read about in a

professional journal, work in your community, or with your population, or with your

issue?

If you want reliable answers to evaluation questions like these, you have to ask them in a way

that will show you whether you actually got results, and whether those results were in fact due to

your actions or the circumstances you created, or to other factors. In other words, you have to

create a design for your research – or evaluation – to give you clear answers to your questions.

We’ll discuss how to do that later in the section.

WHY SHOULD YOU CHOOSE A DESIGN FOR YOUR EVALUATION?

An evaluation may seem simple: if you can see progress toward your goal by the end of the

evaluation period, you’re doing OK; if you can’t, you need to change. Unfortunately, it’s not that

simple at all. First, how do you measure progress? Second, if there seems to be none, how do you

know what you should change in order to increase your effectiveness? Third, if there is progress,

how do you know it was caused by (or contributed to by) your program, and not by something else?

And finally, even if you’re doing well, how will you decide what you could do better and what

elements of your program can be changed or eliminated without affecting success? A good

design for your evaluation will help you answer important questions like these.

Some specific reasons for spending the time to design your evaluation carefully include:

• So your evaluation will be reliable. A good design will give you accurate results. If you

design your evaluation well, you can trust it to tell you whether you’re actually having an

effect, and why. Understanding your program to this extent makes it easier to achieve and

maintain success.

• So you can pinpoint areas you need to work on, as well as those that are successful.

A good design can help you understand exactly where the strong and weak points of your

program or intervention are, and give you clues as to how they can be further

strengthened or changed for the greatest impact.

• So your results are credible. If your evaluation is designed properly, others will take

your results seriously. If a well-designed evaluation shows that your program is effective,

you’re much more likely to be able to convince others to use similar methods, and to

convince funders that your organization is a good investment.

• So you can identify factors unrelated to what you’re doing that have an effect –

positive or negative – on your results and on the lives of participants. Participants’

histories, crucial local or national events, the passage of time, personal crises, and many

other factors can influence the outcome of a program or intervention for better or worse.

A good evaluation design can help you to identify these, and either correct for them if

you can, or devise methods to deal with or incorporate them.

• So you can identify unintended consequences (both positive and negative) and

correct for them. A good design can show you all of what resulted from your program

or intervention, not just what you expected. If you understand that your work has

consequences that are negative as well as positive, or that it has more and/or different

positive consequences than you anticipated, you can adjust accordingly.

• So you’ll have a coherent plan and organizing structure for your evaluation. It will

be much easier to conduct your evaluation if it has an appropriate design. You’ll know

better what you need to do in order to get the information you need. Spending the time to

choose and organize an evaluation design will pay off in the time you save later and in

the quality of the information you get.

WHEN SHOULD YOU CHOOSE A DESIGN FOR YOUR EVALUATION?

Once you’ve determined your evaluation questions and gathered and organized all the

information you can about the issue and ways to approach it, the next step is choosing a design

for the evaluation. Ideally, this all takes place at the beginning of the process of putting together

a program or intervention. Your evaluation should be an integral part of your program, and its

planning should therefore be an integral part of the program planning.

That’s the ideal; now let’s talk about reality. If you’re reading this, the chances are probably at

least 50-50 that you’re connected to an underfunded government agency or to a community-based

or non-governmental organization, and that you’re planning an evaluation of a program or

intervention that’s been running for some time – months or even years.

Even if that’s true, the same guidelines apply. Choose your questions, gather information, choose

a design, and then go on through the steps presented in this chapter. Evaluation is important

enough that you won’t really be accomplishing anything by taking shortcuts in planning it. If

your program has a cycle, then it probably makes sense to start your evaluation at the beginning

of it – the beginning of a year or a program phase, where all participants are starting from the

same place, or from the beginning of their involvement.

If that’s not possible – if your program has a rolling admissions policy, or provides a service

whenever people need it – and participants are all at different points, that can sometimes present

research problems. You may want to evaluate the program’s effects only with new participants,

or with another specific group. On the other hand, if your program operates without a particular

beginning and end, you may get the best picture of its effectiveness by evaluating it as it is,

starting whenever you’re ready. Whatever the case, your design should follow your information

gathering and synthesis.

WHO SHOULD BE INVOLVED IN CHOOSING A DESIGN?

If you’re a regular Tool Box user, and particularly if you’ve been reading this chapter, you know

that the Tool Box team generally recommends a participatory process – involving both research

and community partners, including all those with an interest in or who are affected by the

program in planning and implementation. Choosing a design for evaluation presents somewhat

of an exception to this policy, since scientific or evaluation partners may have a much clearer

understanding of what is required to conduct research, and of the factors that may interfere with

it.

As we’ll see in the “how-to” part of this section, there are a number of considerations that have

to be taken into account to gain accurate information that actually tells you what you want to

know. Graduate students generally take courses to gain the knowledge they need to conduct

research well, and even some veteran researchers have difficulty setting up an appropriate

research design. That doesn’t mean a community group can’t learn to do it, but rather that the

time they would have to spend on acquiring background knowledge might be too great. Thus, it

makes the most sense to assign this task (or at the very least its coordination) to an individual or

small group with experience in research and evaluation design. Such a person can not only help

you choose among possible designs, but explain what each design entails, in time, resources, and

necessary skills, so that you can judge its appropriateness and feasibility for your context.

HOW DO YOU CHOOSE A DESIGN FOR YOUR EVALUATION?

HOW DO YOU GO ABOUT DECIDING WHAT KIND OF RESEARCH DESIGN WILL

BEST SERVE THE PURPOSES OF YOUR EVALUATION?

The answer to that question involves an examination of four areas:

• The nature of the research questions you are trying to answer

• The challenges to the research, and the ways they can be resolved or reduced

• The kinds of research designs that are generally used, and what each design entails

• The possibility of adapting a particular research design to your program or situation –

what the structure of your program will support, what participants will consent to, and

what your resources and time constraints are

We’ll begin this part of the section with an examination of the concerns research designs should

address, go on to considering some common designs and how well they address those concerns,

and end with some guidelines for choosing a design that will both be possible to implement and

give you the information you need about your program.

Note: in this part of the section, we’re looking at evaluation as a research project. As a result,

we’ll use the term “research” in many places where we could just as easily have said, for the

purposes of this section, “evaluation.” Research is more general, and some users of this section

may be more concerned with research in general than evaluation in particular.

CONCERNS RESEARCH DESIGNS SHOULD ADDRESS

The most important consideration in designing a research project – except perhaps for the value

of the research itself – is whether your arrangement will provide you with valid information. If

you don’t design and set up your research project properly, your findings won’t give you

information that is accurate and likely to hold true with other situations. In the case of an

evaluation, that means that you won’t have a basis for adjusting what you do to strengthen and

improve it.

Here’s a far-fetched example that illustrates this point. If you took children’s heights at age six,

then fed them large amounts of a specific food for three years – say carrots – and measured them

again at the end of the period, you’d probably find that most of them were considerably taller at

nine years than at six. You might conclude that it was eating carrots that made the children

taller because your research design gave you no basis for comparing these children’s growth to

that of other children.
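A comparison group supplies the missing basis. As a rough sketch, suppose you had also measured a group of children who ate their usual diet. The Python example here compares the average growth of the two groups; every height in it (in centimeters) is invented for illustration, and the function name is our own.

```python
from statistics import mean


def mean_change(before, after):
    """Average individual change between two sets of paired measurements."""
    return mean([a - b for b, a in zip(before, after)])


# Hypothetical heights in centimeters at age six and age nine.
carrot_before = [115, 118, 112, 120]
carrot_after = [133, 137, 130, 138]
comparison_before = [116, 117, 113, 119]
comparison_after = [134, 135, 131, 138]

carrot_gain = mean_change(carrot_before, carrot_after)
comparison_gain = mean_change(comparison_before, comparison_after)

print(f"Carrot group grew {carrot_gain} cm on average")
print(f"Comparison group grew {comparison_gain} cm on average")
print(f"Growth attributable to carrots: {carrot_gain - comparison_gain} cm")
```

With these made-up numbers both groups grow by the same average amount, so nothing is left over to credit to the carrots. Without the second group, all of the growth would have looked like a program effect.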

There are two kinds of threats to the validity of a piece of research. They are usually referred

to as threats to internal validity (whether the intervention produced the change) and threats to

external validity (whether the results are likely to apply to other people and situations).

Threats to internal validity

These are threats (or alternative explanations) to your claim that what you did caused changes in

the direction you were aiming for. They are generally posed by factors operating at the same

time as your program or intervention that might have an effect on the issue you’re trying to

address. If you don’t have a way of separating their effects from those of your program, you

can’t tell whether the observed changes were caused by your work, or by one or more of these

other factors. They’re called threats to internal validity because they’re internal to the study –

they have to do with whether your intervention – and not something else – accounted for the

difference.

There are several kinds of threats to internal validity:

• History. Both participants’ personal histories – their backgrounds, cultures, experiences,

education, etc. – and external events that occur during the research period – a disaster, an

election, conflict in the community, a new law – may influence whether or not there’s

any change in the outcomes you’re concerned with.

• Maturation. This refers to the natural physical, psychological, and social processes that

take place as time goes by. The growth of the carrot-eating children in the example above

is a result of maturation, for instance, as might be a decline in risky behavior as someone

passed from adolescence to adulthood, the development of arthritis in older people, or

participants becoming tired during learning activities towards the end of the day.

• The effects of testing or observation on participants. The mere fact of a program’s

existence, or of their taking part in it, may affect participants’ behavior or attitudes, as

may the experience of being tested, videotaped, or otherwise observed or measured.

• Changes in measurement. An instrument – a blood pressure cuff or a scale, for instance

– can change over time, or different ones may not give the same results. By the same

token, observers – those gathering information – may change their standards over time, or

two or more observers may disagree on the observations.

• Regression toward the mean. This is a statistical term that refers to the fact that, over

time, the very high and very low scores on a measure (a test, for instance) often tend to

drift back toward the average for the group. If you start a program with participants who,

by definition, have very low or high levels of whatever you’re measuring – reading skill,

exposure to domestic violence, particular behavior toward people of other races or

backgrounds, etc. – their scores may end up closer to the average over the course of the

evaluation period even without any program.

• The selection of participants. Those who choose participants may slant their selection

toward a particular group that is more or less likely to change than a cross-section of the

population from which the group was selected. (A good example is that of employment

training programs that get paid according to the number of people they place in jobs.

They’re more likely to select participants who already have all or most of the skills they

need to become employed, and neglect those who have fewer skills... and who therefore

most need the service.) Selection can play a part when participants themselves choose to

enroll in a program (self-selection), since those who decide to participate are probably

already motivated to make changes. It may also be a matter of chance: members of a

particular group may, simply by coincidence, share a characteristic that will set their

results on your measures apart from the norm of the population you’re drawing from.

Selection can also be a problem when two groups being compared are chosen by different

standards. We’ll discuss this further below when we deal with control or comparison groups.

• The loss of data or participants. If too little information is collected about participants,

or if too many drop out well before the research period is over, your results may be based

on too little data to be reliable. This also arises when two groups are being compared. If

their losses of data or participants are significantly different, comparing them may no

longer give you valid information.

• The nature of change. Often, change isn’t steady and even. It can involve leaps forward

and leaps backward before it gets to a stable place – if it ever does. (Think of looking at

the performance of a sports team halfway through the season. No matter what its record is

at that moment, you won’t know how well it will finish until the season is over.) Your

measurements may take place over too short a period or come at the wrong times to track

the true course of the change or lack of change that’s occurring.

• A combination of the effects of two or more of these. Two or more of these factors

may combine to produce or prevent the changes your program aims to produce. A

language-study curriculum that is tested only on students who already speak two or more

languages runs into problems with both participants’ history – all the students have

experience learning languages other than their own – and selection – you’ve chosen

students who are very likely to be successful at language learning.
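Of the threats listed above, regression toward the mean is often the hardest to picture, so a small simulation may help. The Python sketch here invents a population in which each person has a stable "true" level, and each test score is that level plus random luck on the day. It then picks the lowest scorers on a first test and retests them with no program at all. The population size, score scale, and amount of luck are assumptions chosen only for illustration.

```python
import random
import statistics


def simulate_regression_to_mean(n_people=2000, share_selected=0.10, seed=0):
    """Return (first-test mean, retest mean) for the lowest first-test scorers."""
    rng = random.Random(seed)
    # Each person's stable level, unaffected by any program.
    true_level = [rng.gauss(50, 10) for _ in range(n_people)]
    # Two test scores per person: stable level plus independent luck.
    first_test = [level + rng.gauss(0, 10) for level in true_level]
    retest = [level + rng.gauss(0, 10) for level in true_level]

    # Select the people with the lowest first-test scores.
    n_selected = int(n_people * share_selected)
    ranked = sorted(range(n_people), key=lambda i: first_test[i])
    lowest = ranked[:n_selected]

    first_mean = statistics.mean([first_test[i] for i in lowest])
    retest_mean = statistics.mean([retest[i] for i in lowest])
    return first_mean, retest_mean


first_mean, retest_mean = simulate_regression_to_mean()
print(f"Lowest scorers, first test: {first_mean:.1f}")
print(f"Same people, retest:        {retest_mean:.1f}")
```

The selected group's average rises at retest even though nobody received any service, because some of their low first scores were bad luck that did not repeat. A program that enrolls only the lowest scorers would see this same apparent improvement, which is one reason a comparison group matters.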

Threats to external validity

These are factors that affect your ability to apply your research results in other circumstances – to

increase the chances that your program and its results can be reproduced elsewhere or with other

populations. If, for instance, you offer parenting classes only to single mothers, you can’t

assume, no matter how successful they appear to be, that the same classes will work as well with

men.

Threats to external validity (or generalizability) may be the result of the interactions of other

factors with the program or intervention itself, or may be due to particular conditions of the

program.

Some examples:

• Interaction of testing or data collection and the program or intervention. An initial test

or observation might change the way participants react to the program, making a

difference in final outcomes. Since you can’t assume that another group will have the

same reaction or achieve similar final outcomes as a result, external validity or

generalizability of the findings becomes questionable.

• Interaction of selection procedures and the program or intervention. If the participants

selected or self-selected are particularly sensitive to the methods or purpose of the

program, it can’t be assumed to be effective with participants who are less sensitive or

ready for the program.

Parents who’ve been threatened by the government with the loss of their children due to child

abuse may be more receptive to learning techniques for improving their parenting, for example,

than parents who are under no such pressure.

• The effects of the research arrangements. Participants may change behavior as a result of

being observed, or may react to particular individuals in ways they would be unlikely to react to others.

A classic example here is that of a famous baboon researcher, Irven DeVore, who after years of

observing troops of baboons, realized that they behaved differently when he was there than

when he wasn’t. Although his intent was to observe their natural behavior, his presence itself

constituted an intervention, making the behavior of the baboons he was observing different

from that of a troop that was not observed.

• The interference of multiple treatments or interventions. The effects of a particular

program can be changed when participants are exposed to it beforehand in a different

context, or are exposed to another before or at the same time as the one being evaluated.

This may occur when participants are receiving services from different sources, or being

treated simultaneously for two or more health issues or other conditions.

Given the range of community programs that exist, there are many possibilities here. Adults

might be members of a high school completion class while participating in a substance abuse

recovery program. A diabetic might be treated with a new drug while at the same time

participating in a nutrition and physical activity program to deal with obesity. Sometimes,

the sequence of treatments or services in a single program can have the same effect, with one

influencing how participants respond to those that follow, even though each treatment is

being evaluated separately.

COMMON RESEARCH DESIGNS

Many books have been written on the subject of research design. While they contain too much

material to summarize here, there are some basic designs that we can introduce. The important

differences among them come down to how many measurements you’ll take, when you will take

them, and how many groups of what kind will be involved.

Program evaluations generally look for the answers to three basic questions:

• Was there any change – in participants’ or others’ behavior, in physical or social

conditions, or in outcomes or indicators of success– during the evaluation period?

• Was whatever change took place – or the lack of change – caused by your program,

intervention, or effort?

• What, in your program or outside it, actually caused or prevented the change?

As we’ve discussed, changes and improvement in outcomes may have been caused by some or

all of your intervention, or by external factors. Participants’ or the community’s history might

have been crucial. Participants may have changed as a result of simply getting older and more

mature or more experienced in the world – often an issue when working with children or

adolescents. Environmental factors – events, policy change, or conditions in participants’ lives –

can often facilitate or prevent change as well. Understanding exactly where the change came

from or where the barriers to change reside, gives you the opportunity to adjust your program to

take advantage of or combat those factors.

If all you had to do was to measure whatever behavior or condition you wanted to influence at

the beginning and end of the evaluation, choosing a design would be an easy task. Unfortunately,

it’s not quite that simple – there are those nasty threats to validity to worry about. We have to

keep them in mind as we look at some common research designs.

Research designs, in general, differ in one or both of two ways: the number and timing of the

measurements they use; and whether they look at single or multiple groups. We’ll look at

single-group designs first, and then go on to multiple groups.

Researchers usually refer to your first measurement(s) or observation(s) – the ones you take before you

start your program or intervention – as a baseline measure or baseline observation, because it establishes

a baseline – a known level – to which you compare future measurements or observations.

Some other important research terms:

• Independent variables are the program itself and/or the methods or conditions that the researcher

– in this case, you – wants to evaluate. They’re called variables because they can change – you

might have chosen (and might still choose) other methods. They’re independent because their

existence doesn’t depend on whether something else occurs: you’ve chosen them,

and they’ll stay consistent throughout the evaluation period.

• Dependent variables are whatever may or may not change as a result of the presence of the

independent variable(s). In an evaluation, your program or intervention is the independent

variable. (If you’re evaluating a number of different methods or conditions, each of them is an

independent variable.) Whatever you’re trying to change is the dependent variable. (If you’re

aiming at change in more than one behavior or outcome, each type of change is a different

dependent variable.) They’re called dependent variables because changes in them depend on

the action of the independent variable...or something else.

• Measures are just that – measurements of the dependent variables. They usually refer to
procedures that have results that can be translated into numbers, and may take the form of
community assessments, observations, surveys, interviews, or tests. They may also count
incidents or measure the amount of the dependent variable (number or percentage of children
who are overweight or obese, violent crimes per 100,000 population, etc.)

Observations might involve measurement, or they might simply record what happens in specific
circumstances: the ways in which people use a space, the kinds of interactions children have in
a classroom, the character of the interactions during an assessment. For convenience,
researchers often use “observation” to refer to any kind of measurement and we’ll use the same
convention here.

The basic research terms defined above are the ones we will be using in the discussion of

research designs that follows.

Pre- and post- single-group design

The simplest design is also probably the least accurate and desirable: the pre (before) and post

(after) measurement or observation. This consists of simply measuring whatever you’re

concerned with in one group – the infant mortality rate, unemployment, water pollution –

applying your intervention to that group or community, and then observing again. This type of

design assumes that a difference in the two observations will tell you whether there was a change

over the period between them, and also assumes that any positive change was caused by the

intervention.

In most cases, a pre-post design won’t tell you much, because it doesn’t really address any of the

research concerns we’ve discussed. It doesn’t account for the influence of other factors on the

dependent variable, and it doesn’t tell you anything about trends of change or the progress of

change during the evaluation period – only where participants were at the beginning and where

they were at the end. It can help you determine whether certain kinds of things have happened –

whether there’s been a change in the level of educational attainment or the amount of

environmental pollution in a river, for instance – but it won’t tell you why. Despite its

limitations, taking measures before and after the intervention is far better than no measures.

Even looking at something as seemingly simple to measure pre and post as blood pressure (in a

heart disease prevention program) is questionable. Blood pressure may be lower at the final

observation than at the initial one, but that tells you nothing about how much it may have gone

up and down in between. If the readings were taken by different people, the change may be due

in part to differences in their skill, or to how relaxed each was able to make participants feel.

Familiarity with the program could also have reduced most participants’ blood pressure from the

pre- to the post-measurement, as could some other factor that wasn’t specifically part of the

independent variable being evaluated.
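The blood pressure example can be made concrete with a small sketch. The readings below are invented for illustration (they are not from any real program), and the function names are hypothetical. The point is only that the first and last observations can show a small change while the readings in between swing much more widely.

```python
# Hypothetical weekly systolic blood pressure readings (mmHg) for one
# participant. The values are invented for illustration only.
readings = [150, 162, 141, 158, 139, 144]


def pre_post_change(values):
    """Return the change from the first observation to the last."""
    return values[-1] - values[0]


def observed_range(values):
    """Return the spread between the lowest and highest observation."""
    return max(values) - min(values)


change = pre_post_change(readings)  # all that a pre-post design sees
spread = observed_range(readings)   # variation a pre-post design misses

print(f"Pre-post change: {change} mmHg")
print(f"Range across all readings: {spread} mmHg")
```

With these made-up numbers, the pre-post comparison alone would suggest a modest drop, while the full set of readings shows that the measure moves around by far more than that from week to week.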

Interrupted time series design with a single group (simple time series)

An interrupted time series uses repeated measures before and after delayed implementation of

the independent variable (e.g., the program, etc.) to help rule out other explanations. This

relatively strong design – with comparisons within the group – addresses most threats to internal

validity.

The simplest form of this design is to take repeated observations, implement the program or

intervention, and observe a number of times during the evaluation period, including at the end.

This method is a great improvement over the pre- and post- design in that it tracks the trend of

change, and can therefore help show whether it was actually the independent variable that caused

any change. It can also help to identify the influence of external factors such as when the

dependent variable shows significant change before the intervention is implemented.
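As a rough sketch of how repeated observations might be summarized, the code below compares a baseline phase with an intervention phase. The monthly counts are hypothetical, and the function names are our own. It reports the baseline trend (slope) as well as the change in average level, because a baseline that is already moving before the program starts points to an outside influence.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of values against observation number 0, 1, 2, ..."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def summarize_phases(baseline, intervention):
    """Compare the baseline phase with the intervention phase."""
    return {
        "baseline_mean": mean(baseline),
        "baseline_slope": slope(baseline),  # trend before the program began
        "intervention_mean": mean(intervention),
        "level_change": mean(intervention) - mean(baseline),
    }


# Hypothetical monthly incident counts, invented for illustration only.
baseline = [20, 21, 19, 20]
intervention = [15, 14, 13, 14]

summary = summarize_phases(baseline, intervention)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A nearly flat baseline followed by a clear drop in level is the pattern that supports the program as the cause. This is only a descriptive summary; a formal analysis would usually need more observations and statistical help.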

Another possibility for this design is to implement more than one independent variable, either by

trying two or more, one after another (often with a break in between), or by adding each to what

came before. This gives a picture not only of the progress of change, but can show very clearly

what causes change. That gives an evaluator the opportunity not only to adjust the program, but

to drop elements that have no effect.

There are a number of variations on the interrupted time series theme, including varying the

observation times; implementing the independent variable repeatedly; and implementing one

independent variable, then another, then both together to evaluate their interaction.

In any variety of interrupted time series design, it’s important to know what you’re looking for.

In an evaluation of a traffic fatality control program in the United Kingdom that focused on

reducing drunk driving, monthly measurements seemed to show only a small decline in fatal

accidents. When the statistics for weekends, when there were most likely to be drunk drivers on

the road, were separated out, however, they showed that the weekend fatality rate dropped

sharply with the implementation of the program, and stayed low thereafter. Had the researchers

not realized that that might be the case, the program might have been stopped, and the weekend

accident rate would not have been reduced.
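The idea of separating out the weekend figures can be sketched in a few lines. The counts below are invented for illustration and are not the figures from the United Kingdom evaluation. They simply show how a small overall change can hide a large change in one subgroup.

```python
from statistics import mean

# Hypothetical monthly counts of fatal accidents, invented for illustration.
before = [{"weekday": 180, "weekend": 40}, {"weekday": 182, "weekend": 38}]
after = [{"weekday": 181, "weekend": 22}, {"weekday": 179, "weekend": 20}]


def percent_change(before_values, after_values):
    """Percentage change in the mean from the before period to the after period."""
    b, a = mean(before_values), mean(after_values)
    return 100 * (a - b) / b


# All accidents together: the weekend effect is diluted by weekday counts.
total_change = percent_change(
    [m["weekday"] + m["weekend"] for m in before],
    [m["weekday"] + m["weekend"] for m in after],
)

# Weekends only: the period the program was expected to affect most.
weekend_change = percent_change(
    [m["weekend"] for m in before],
    [m["weekend"] for m in after],
)

print(f"All accidents: {total_change:.1f}%")
print(f"Weekend accidents: {weekend_change:.1f}%")
```

Deciding in advance which subgroups or time periods the program should affect tells you which breakdowns of the data to examine.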

Interrupted time series design with multiple groups (multiple baseline/time series)

This has the same possibilities as the single time series design, with the added wrinkle of using

repeated measures with one or more other groups (so-called multiple baselines). By using

multiple baselines (groups), the external validity or generality of the findings is enhanced – we

can see if the effects occur with different groups or under different conditions.

This multiple time series design – typically staggered introduction of the intervention with

different groups or communities – gives the researcher more opportunities:

• You can try a method or program with two or more groups from the same population

• You can try a particular method or program with different populations, to see if it’s

effective with others

• You can vary the timing or intensity of an intervention with different groups

• You can test different interventions at the same time

• You can try the same two or more interventions with each of two groups, but reverse

their order to see if sequencing makes any difference

Again, there are more variations possible here.

Control group design

A common way to evaluate the effects of an independent variable is to use a control group. This

group is usually similar to the participant group, but either receives no intervention at all, or

receives a different intervention with the same goal as that offered to the participant group. A

control group design is usually the most difficult to set up – you have to find appropriate groups,

observe both on a regular basis, etc. – but is generally considered to be the most reliable.

The term control group comes from the attempt to control outside and other influences on the

dependent variable. If everything about the two groups except their exposure to the program

being evaluated averages out to be the same, then any differences in results must be due to that

exposure. The term comparison group is more modest; it typically refers to a community

matched for similar levels of the problem/goal and relevant characteristics of the community

or population (e.g., education, poverty).

The gold standard here is the randomized control group, one that is selected totally at random,

either from among the population the program or intervention is concerned with – those at risk

for heart disease, unemployed males, young parents – or, if appropriate, the population at large.

A random group eliminates the problems of selection we discussed above, as well as issues that

might arise from differences in culture, race, or other factors.

A control group that’s carefully chosen will have the same characteristics as the intervention

group (the focus of the evaluation). If, for instance, the two groups come from the same pool of

people with a particular health condition, and are chosen at random either to be treated in the

conventional way or to try a new approach, it can be assumed that – since they were chosen at

random from the same population – both groups will be subject, on average, to the same outside

influences, and will have the same diversity of backgrounds. Thus, if there is a significant

difference in their results, it is fairly safe to assume that the difference comes from the

independent variable – the type of intervention, and not something else.
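Random assignment from a single pool can be sketched as follows. The participant IDs and the function name are hypothetical, and an even split is an assumption made for simplicity. The essential point is that chance alone, not staff judgment or participant choice, decides who ends up in which group.

```python
import random


def assign_groups(participant_ids, seed=None):
    """Randomly split participants into intervention and control groups."""
    rng = random.Random(seed)  # a fixed seed makes the assignment repeatable
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"intervention": shuffled[:half], "control": shuffled[half:]}


# Twenty hypothetical participant IDs drawn from the same pool.
pool = [f"P{n:02d}" for n in range(1, 21)]
groups = assign_groups(pool, seed=42)

print("Intervention:", sorted(groups["intervention"]))
print("Control:", sorted(groups["control"]))
```

Recording the seed (or the resulting lists) lets you show later that the assignment was made at random and not adjusted afterward.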

The difficulty for governmental and community-based organizations is to find or create a

randomized control group. If the program has a long waiting list, it may be able to create a

control group by selecting at random those who will receive the intervention first. That in itself creates

problems, in that people often drop off waiting lists out of frustration or other reasons. Being

included in the evaluation may help to keep them, on the other hand, by giving them a closer

connection to the program and making them feel valued.

An ESOL (English as a Second or Other Language) program in Boston with a three-year

waiting list addressed the problem by offering those on the waiting list a different option.

They received videotapes to use at home, along with biweekly tutoring by advanced students

and graduates of the program. Thus, they became a comparison group with a somewhat

different intervention that, as expected, was less effective than the program itself, but was more

effective than none, and kept them on the waiting list. It also gave them a head start once they

got into the classes, with many starting at a middle rather than a beginning level.

When there’s no waiting list or similar group to draw from, community organizations often end

up using a comparison group - one composed of participants in another place or program and

whose members’ characteristics, backgrounds, and experience may or may not be similar to

those of the participant group. That circumstance can raise some of the same problems related to

selection seen when there is no control group. If the only potential comparisons involve very

different groups, it may be better to use a design, such as an interrupted time series design, that

doesn’t involve a control group at all, where the comparison is within (not between) groups.

Groups may look similar, but may differ in an important way. Two groups of participants in a

substance abuse intervention program, for instance, may have similar histories, but if one

program is voluntary and the other is not, the results aren’t likely to be comparable. One group

will probably be more motivated and less resentful than the other, and composed of people who

already know they have a potential problem. The motivation and determination of their

participants, rather than the effectiveness of the two programs, may influence the amount of

change observed.

This issue may come up in a single-group design as well. A program that may, on average,

seem to be relatively ineffective may prove, on close inspection, to be quite effective with

certain participants – those of a specific educational background, for instance, or with particular

life experiences. Looking at results with this in mind can be an important part of an evaluation,

and give you valuable and usable information.

CHOOSING A DESIGN

This section’s discussion of research designs is in no way complete. It’s meant to provide an

introduction to what’s available. There are literally thousands of books and articles written on

this topic, and you’ll probably want more information. There are, for instance, a number of

statistical methods that can compensate for less-than-perfect designs, which is useful because few community groups

have the resources to assemble a randomized control group, or to implement two or more similar

programs to see which works better.

Given this, the material that follows is meant only as broad guidelines. We don’t attempt to be

specific about what kind of design you need in what circumstances, but only try to suggest some

things to think about in different situations. Help is available from a number of directions: Much

can be found on the Internet (see the “Resources” part of this section for a few sites); there are

numerous books and articles (the classic text on research design is also cited in “Resources”);

and universities are a great resource, both through their libraries and through faculty and

graduate students who might be interested in what you’re doing, and be willing to help with

your evaluation. Use any and all of these to find what will work best for you. Funders may also

be willing either to provide technical assistance for evaluations, or to include money in your

grant or contract specifically to pay for a professional evaluation.

Your goal in evaluating your effort is to get the most reliable and accurate information possible,

given your evaluation questions, the nature of your program, what your participants will consent

to, your time constraints, and your resources. The important thing here is not to set up a perfect

research study, but to design your evaluation to get real information, and to be able to separate

the effects of external factors from the effects of your program. So how do you go about

choosing the best design that will be workable for you? The steps are in the first sentence of this

paragraph.

Consider your evaluation questions

What do you need to know? If the intent of your evaluation is simply to see whether something

specific happened, it’s possible that a simple pre-post design will do. If, as is more likely, you

want to know both whether change has occurred, and if it has, whether it has in fact been caused

by your program, you’ll need a design that helps to screen out the effects of external influences

and participants’ backgrounds.

For many community programs, a control or comparison group is helpful, but not absolutely

necessary. Think carefully about the frequency and timing of your observations and the amount

of different kinds of information you can collect. With repeated measures, you can get quite

an accurate picture of the effectiveness of your program from a simple time series design. Single

group interrupted time series designs, which are often the most workable for small organizations,

can give you a very reliable evaluation if they’re structured well. That generally means obtaining

multiple baseline observations (enough to set a trend) before the program begins; observing often

and documenting your observations carefully (often with both quantitative – expressed in

numbers – and qualitative – expressed in records of incidents and of what participants did and

said – data); and including during-intervention and follow-up observations to see whether

effects are maintained.

In many of these situations, a multiple-group interrupted time series design is quite possible, in

the form of a “naturally occurring” experiment. If your program includes two or more groups or classes,

each working toward the same goals, you have the opportunity to stagger the introduction of the

intervention across the groups. This comparison within (and across) groups allows you to screen

out such factors as the facilitator’s ability and community influences (assuming all participants

come from the same general population). You could also try different methods or time

sequences, to see which works best.

In some cases, the real question is not whether your method or program works, but whether it

works better than other methods or programs you could be using. Teaching a skill – for instance,

employment training, parenting, diabetes management, or conflict resolution – often falls into

this category. Here, you need a comparison of some sort. While evaluations of some of these –

medical treatment, for example – may require a control group, others can be compared to data

from the field, to published results of other programs, or, by using community-level indicators,

to measurements in other communities.

There are community programs where the bottom line is very simple. If you’re working to

control water pollution, your main concern may be the amount of pollution coming out of

effluent pipes, or the amount found in the river. Your only measure of success may be keeping

pollution below a certain level, which means that regular monitoring of water quality is the only

evaluation you need. There are probably relatively few community programs where evaluation

is this easy – you might, for instance, want to know which of your pollution-control activities is

most effective – but if yours is one, a simple design may be all you need.

Consider the nature of your program

What does your program look like and what is it meant to do? Does it work with participants in

groups, or individually, for instance? Does it run in cycles – classes or workshops that begin and

end on certain dates, or a time-limited program that participants go through only once? Or can

participants enter whenever they are ready and stay until they reach their goals? How much of

the work of the program is dependent on staff, and how much do participants do on their own?

How important is the program context – the way staff, participants, and others treat one another,

the general philosophy of the program, the physical setting, the organizational culture? (The

culture of an organization consists of accepted and traditional ways of doing things, patterns of

relationships, how people dress, how they act toward and communicate with one another, etc.)

• If you work with participants in groups, a multiple-group design – either interrupted time

series or control group – might be easier to use. If you work with participants

individually, perhaps a simple time series or a single group design would be appropriate.

• If your program is time-limited – either one-time-only, or with sessions that follow one

another – you’ll want a design that fits into the schedule, and that can give you reliable

results in the time you have. One possibility is to use a multiple group design, with

groups following one another session by session. The program for each group might be

adjusted, based on the results for the group before, so that you could test new ideas each

session.

• If your program has no clear beginning and end, you’re more likely to need a single

group design that considers participants individually, or by the level of their baseline

performance. You may also have to compensate for the fact that participants may be

entering the program at different levels, or with different goals.

A proverb says that you never step in the same river twice, because the water that flows past a

fixed point is always changing. The same is true of most community programs. Someone

coming into a program at a particular time may have a totally different experience than a similar

person entering at a different time, even though the operation of the program is the same for

both. A particular participant may encourage everyone around her, and create an

overwhelmingly positive atmosphere different from that experienced by participants who enter

the program after she has left, for example. It’s very difficult to control for this kind of

difference over time, but it’s important to be aware that it can, and often does, exist, and may

affect the results of a program evaluation.

If the organizational or program context and culture are important, then you’ll probably

want to compare your results with participants to those in a control group in a similar

situation where those factors are different, or are ignored.

There is, of course, a huge range of possibilities here: nearly any design can be adapted to nearly

any situation in the right circumstances. This material is meant only to give you a sense of how

to start thinking about the issue of design for an evaluation.

Consider what your participants (and staff) will consent to

In addition to the effect that it might have on the results of your evaluation, you might find that a

lot of observation can raise protests from participants who feel their privacy is threatened, or

from already-overworked staff members who see adding evaluation to their job as just another

burden. You may be able to overcome these obstacles, or you may have to compromise – fewer

or different kinds of observations, a less intrusive design – in order to be able to conduct the

evaluation at all.

There are other reasons that participants might object to observation, or at least intense

observation. Potential for embarrassment, a desire for secrecy (to keep their participation in the

program from family members or others), even self-protection (in the case of domestic violence,

for instance) can contribute to unwillingness to be a participant in the evaluation. Staff members

may have some of the same concerns.

There are ways to deal with these issues, but there’s no guarantee that they’ll work. One is to

inform participants at the beginning about exactly what you’re hoping to do, listen to their

objections, and meet with them (more than once, if necessary) to come up with a satisfactory

approach. Staff members are less likely to complain if they’re involved in planning the

evaluation, and thus have some say over the frequency and nature of observations. The same is

true for participants. Treating everyone’s concerns seriously and including them in the planning

process can go a long way toward assuring cooperation.

Consider your time constraints

As we mentioned above, the important thing here is to choose a design that will give you

reasonably reliable information. In general, your design doesn’t have to be perfect, but it does

have to be good enough to give you a reasonably good indication that changes are actually taking

place, and that they are the result of your program. Just how precise you can be is at least

partially controlled by the limits on your time placed by funding, program considerations, and

other factors.

Time constraints may also be imposed. Some of the most common:

• Program structure. An evaluation may make the most sense if it’s conducted to

correspond with a regular program cycle.

• Funding. If you are funded only for a pilot project, for example, you’ll have to conduct

your evaluation within the time span of the funding, and soon enough to show that your

program is successful enough to be re-funded. A time schedule for evaluation may be part

of your grant or contract, especially if the funder is paying for it.

• Participants’ schedules. A rural education program may need to stop for several months a

year to allow participants to plant and tend crops, for instance.

• The seriousness of the issue. A delay in understanding whether a violence prevention

program is effective may cost lives.

• The availability of professional evaluators. Perhaps the evaluation team can only work

during a particular time frame.

Consider your resources

Strategic planners often advise that groups and organizations consider resources last: otherwise

they’ll reject many good ideas because they’re too expensive or difficult, rather than trying to

find ways to make them work with the resources at hand. Resources include not only money, but

also space, materials and equipment, personnel, and skills and expertise. Often, one of these can

substitute for another: a staff person with experience in research can take the place of money that

would be used to pay a consultant, for example. A partnership with a nearby university could get

you not only expertise, but perhaps needed equipment as well.

The lesson here is to begin by determining the best design possible for your purposes, without

regard to resources. You may have to settle for somewhat less, but if you start by aiming for

what you want, you’re likely to get a lot closer to it than if you assume you can’t possibly get it.

IN SUMMARY

The way you design your evaluation research will have a lot to do with how accurate and reliable

your results are, and how well you can use them to improve your program or intervention. The

design should be one that best addresses key threats to internal validity (whether the intervention

caused the change) and external validity (the ability to generalize your results to other situations,

communities, and populations).

Common research designs – such as interrupted time series or control group designs – can be

adapted to various situations, and combined in various ways to create a design that is both

appropriate and feasible for your program. It may be necessary to seek help from a consultant, a

university partner, or simply someone with research experience to help identify a design that fits

your needs.

A good design will address your evaluation questions, and take into consideration the nature of

your program, what program participants and staff will agree to, your time constraints, and the

resources you have available for evaluation. It often makes sense to consider resources last, so

that you won’t reject good ideas because they seem too expensive or difficult. Once you’ve

chosen a design, you can often find a way around a lack of resources to make it a reality.

Chapter 5

COLLECTING AND ANALYZING DATA

In previous sections of this chapter, we’ve discussed studying the issue, deciding on a research

design, and creating an observational system for gathering information for your evaluation. Now

it’s time to collect your data and analyze it – figuring out what it means – so that you can use it

to draw some conclusions about your work. In this section, we’ll examine how to do just that.

WHAT DO WE MEAN BY COLLECTING DATA?

Essentially, collecting data means putting your design for collecting information into operation.

You’ve decided how you’re going to get information – whether by direct observation,

interviews, surveys, experiments and testing, or other methods – and now you and/or other

observers have to implement your plan. There’s a bit more to collecting data, however. If you are

conducting observations, for example, you’ll have to define what you’re observing and arrange

to make observations at the right times, so you actually observe what you need to. You’ll have to

record the observations in appropriate ways and organize them so they’re optimally useful.

Recording and organizing data may take different forms, depending on the kind of information

you’re collecting. The way you collect your data should relate to how you’re planning to analyze

and use it. Regardless of what method you decide to use, recording should be done concurrently

with data collection if possible, or soon afterwards, so that nothing gets lost and memory doesn’t

fade.

Some of the things you might do with the information you collect include:

• Gathering together information from all sources and observations

• Making copies (photocopies or backups) of all recording forms, records, audio or video recordings, and any

other collected materials, to guard against loss, accidental erasure, or other problems

• Entering narratives, numbers, and other information into a computer program, where they

can be arranged and/or worked on in various ways

• Performing any mathematical or similar operations needed to get quantitative information

ready for analysis. These might, for instance, include entering numerical observations

into a chart, table, or spreadsheet, or figuring the mean (average), median (midpoint),

and/or mode (most frequently occurring) of a set of numbers.

• Transcribing (making an exact, word-for-word text version of) the contents of audio or

video recordings

• Coding data (translating data, particularly qualitative data that isn’t expressed in

numbers, into a form that allows it to be processed by a specific software program or

subjected to statistical analysis)

• Organizing data in ways that make them easier to work with. How you do this will

depend on your research design and your evaluation questions. You might group

observations by the dependent variable (indicator of success) they relate to, by

individuals or groups of participants, by time, by activity, etc. You might also want to

group observations in several different ways, so that you can study interactions among

different variables.
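Two of the steps above, figuring the mean, median, and mode and grouping observations in more than one way, can be sketched with Python's standard library. The records, field layout, and function name here are hypothetical, chosen only to illustrate the idea.

```python
from collections import defaultdict
from statistics import mean, median, mode

# Hypothetical observation records: (participant group, week, score).
observations = [
    ("A", 1, 4), ("A", 1, 6), ("A", 2, 7),
    ("B", 1, 3), ("B", 2, 3), ("B", 2, 5),
]

# Summary statistics for all scores taken together.
scores = [score for _, _, score in observations]
summary = {"mean": mean(scores), "median": median(scores), "mode": mode(scores)}


def group_by(records, key_index):
    """Collect scores under whichever field of the record is chosen as the key."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record[key_index]].append(record[2])
    return dict(grouped)


by_group = group_by(observations, 0)  # scores organized by participant group
by_week = group_by(observations, 1)   # the same scores organized by time

print(summary)
print(by_group)
print(by_week)
```

Keeping the raw records in one place and regrouping them as needed makes it easier to look at the same data by participant, by time, or by activity without re-entering anything.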

There are two kinds of variables in research. An independent variable (the intervention) is a

condition implemented by the researcher or community to see if it will create change and

improvement. This could be a program, method, system, or other action. A dependent variable

is what may change as a result of the independent variable or intervention. A dependent

variable could be a behavior, outcome, or other condition. A smoking cessation program, for

example, is an independent variable that may change group members’ smoking behavior, the

primary dependent variable.

WHAT DO WE MEAN BY ANALYZING DATA?

Analyzing information involves examining it in ways that reveal the relationships, patterns,

trends, etc. that can be found within it. That may mean subjecting it to statistical operations that

can tell you not only what kinds of relationships seem to exist among variables, but also to what

level you can trust the answers you’re getting. It may mean comparing your information to that

from other groups (a control or comparison group, statewide figures, etc.), to help draw some

conclusions from the data. The point, in terms of your evaluation, is to get an accurate

assessment in order to better understand your work and its effects on those you’re concerned

with, or in order to better understand the overall situation.

There are two kinds of data you’re apt to be working with, although not all evaluations will

necessarily include both. Quantitative data refer to the information that is collected as, or can

be translated into, numbers, which can then be displayed and analyzed mathematically.

Qualitative data are collected as descriptions, anecdotes, opinions, quotes, interpretations, etc.,

and are generally either not able to be reduced to numbers, or are considered more valuable or

informative if left as narratives. As you might expect, quantitative and qualitative information

needs to be analyzed differently.

QUANTITATIVE DATA

Quantitative data are typically collected directly as numbers. Some examples include:

• The frequency (rate, duration) of specific behaviors or conditions

• Test scores (e.g., scores/levels of knowledge, skill, etc.)

• Survey results (e.g., reported behavior, or outcomes to environmental conditions; ratings

of satisfaction, stress, etc.)

• Numbers or percentages of people with certain characteristics in a population (diagnosed

with diabetes, unemployed, Spanish-speaking, under age 14, grade of school completed,

etc.)

Data can also be collected in forms other than numbers, and turned into quantitative data for

analysis. Researchers can count the number of times an event is documented in interviews or

records, for instance, or assign numbers to the levels of intensity of an observed event or

behavior. For instance, community initiatives often want to document the amount and intensity

of environmental changes they bring about – the new programs and policies that result from their

efforts. Whether or not this kind of translation is necessary or useful depends on the nature of

what you’re observing and on the kinds of questions your evaluation is meant to answer.

Quantitative data are usually subjected to statistical procedures such as calculating the mean or

average number of times an event or behavior occurs (per day, month, or year). These

operations, because numbers are “hard” data and not interpretation, can give definitive, or nearly

definitive, answers to different questions. Various kinds of quantitative analysis can indicate

changes in a dependent variable, such as its frequency, duration, timing (when particular things

happen), intensity, level, etc. They can allow you to compare those changes to one another, to

changes in another variable, or to changes in another population. They might be able to tell you,

at a particular degree of reliability, whether those changes are likely to have been caused by your

intervention or program, or by another factor, known or unknown. And they can identify

relationships among different variables, which may or may not mean that one causes another.

QUALITATIVE DATA

Unlike numbers or “hard data,” qualitative information tends to be “soft,” meaning it can’t

always be reduced to something definite. That is in some ways a weakness, but it’s also a strength.

A number may tell you how well a student did on a test; the look on her face after seeing her

grade, however, may tell you even more about the effect of that result on her. That look can’t be

translated to a number, nor can a teacher’s knowledge of that student’s history, progress, and

experience, all of which go into the teacher’s interpretation of that look. And that interpretation

may be far more valuable in helping that student succeed than knowing her grade or numerical

score on the test.

Qualitative data can sometimes be changed into numbers, usually by counting the number of

times specific things occur in the course of observations or interviews, or by assigning numbers

or ratings to dimensions (e.g., importance, satisfaction, ease of use).

The challenges of translating qualitative into quantitative data have to do with the human factor.

Even if most people agree on what 1 (lowest) or 5 (highest) means in regard to rating

“satisfaction” with a program, ratings of 2, 3, and 4 may be very different for different people.

Furthermore, the numbers say nothing about why people reported the way they did. One may

dislike the program because of the content, the facilitator, the time of day, etc. The same may be

true when you’re counting mentions of an event, such as the onset of a new

policy or program in a community, in interviews or archival records. Where one person

might mention a program change because he considers it important, another may omit it because

she considers it unimportant.

Qualitative data can sometimes tell you things that quantitative data can’t. It may reveal why

certain methods are working or not working, whether part of what you’re doing conflicts with

participants’ culture, what participants see as important, etc. It may also show you patterns – in

behavior, physical or social environment, or other factors – that the numbers in your quantitative

data don’t, and occasionally even identify variables that researchers weren’t aware of.

It is often helpful to collect both quantitative and qualitative information.

Quantitative analysis is considered to be objective – without any human bias attached to it –

because it depends on the comparison of numbers according to mathematical computations.

Analysis of qualitative data is generally accomplished by methods more subjective – dependent

on people’s opinions, knowledge, assumptions, and inferences (and therefore biases) – than that

of quantitative data. The identification of patterns, the interpretation of people’s statements or

other communication, the spotting of trends – all of these can be influenced by the way the

researcher sees the world. Be aware, however, that quantitative analysis is influenced by a

number of subjective factors as well. What the researcher chooses to measure, the accuracy of

the observations, and the way the research is structured to ask only particular questions can all

influence the results, as can the researcher’s understanding and interpretation of the subsequent

analyses.

WHY SHOULD YOU COLLECT AND ANALYZE DATA FOR YOUR EVALUATION?

Part of the answer here is that not every organization – particularly small community-based or

non-governmental ones – will necessarily have extensive resources to conduct a formal

evaluation. They may have to be content with less formal evaluations, which can still be

extremely helpful in providing direction for a program or intervention. An informal evaluation

will involve some data gathering and analysis. This data collection and sense making is critical to

an initiative and its future success, and has a number of advantages.

• The data can show whether there was any significant change in the dependent

variable(s) you hoped to influence. Collecting and analyzing data helps you see whether

your intervention brought about the desired results

The term “significance” has a specific meaning when you’re discussing statistics. A result is
statistically significant when it would be unlikely to occur by chance alone if the program
actually had no effect. Generally, researchers don’t consider a result significant unless the
probability of seeing a difference at least that large purely by chance is 5% or less (called the
.05 level of significance). That probability is calculated by the statistical procedure: once you
get a mathematical result, a table (or the software you’re using) will tell you the level of
significance.

Thus, if data analysis finds that the independent variable (the intervention) influenced the
dependent variable at the .05 level of significance, it means that a change this large would turn
up less than 5% of the time if the intervention made no difference. The .05 level is generally
considered a reasonable standard, and the .01 level (less than a 1% chance) is considered strong
evidence. Significance at the .05 level doesn’t mean that the program works on 95% of
participants, or that it will work 95% of the time. Strictly speaking, it also doesn’t mean
there’s a 95% probability that the program caused the change. It tells you that chance alone is
an unlikely explanation; how confidently you can credit the program still depends on how the
evaluation was designed (for example, whether there was a comparable control group).

• They can uncover factors that may be associated with changes in the dependent

variable(s). Data analyses may help discover unexpected influences; for instance, that

the effect was twice as large for those participants who also were a part of a support

group. This can be used to identify key aspects of implementation.

• They can show connections between or among various factors that may have an

effect on the results of your evaluation. Some types of statistical procedures look for

connections (“correlations” is the research term) among variables. Certain dependent

variables may change when others do. These changes may be similar – i.e., both

variables increase or decrease (e.g., as children’s proficiency at reading increases, the

amount of reading they do also increases). Or the opposite may be observed – i.e. the

two variables change in opposite directions (as the amount of exercise they engage in

increases, people’s weight decreases). Correlations don’t mean that one variable causes

another or that they both have the same cause, but they can provide valuable information

about associations to expect in an evaluation.

• They can help shed light on the reasons that your work was effective or, perhaps,

less effective than you’d hoped. By combining quantitative and qualitative analysis,

you can often determine not only what worked or didn’t, but why. The effect of cultural

issues, how well methods are used, and the appropriateness of your approach for the

population – these as well as other factors that influence success can be highlighted by

careful data collection and analysis. This knowledge gives you a basis for adapting and

changing what you do to make it more likely you’ll achieve the desired outcomes in the

future.

• They can provide you with credible evidence to show stakeholders that your

program is successful, or that you’ve uncovered, and are addressing, limitations.

Stakeholders, such as funders and community boards, want to know their investments are

well spent. Showing evidence of intermediate outcomes (e.g. new programs and policies)

and longer-term outcomes (e.g., improvements in education or health indicators) is

becoming increasingly important to receiving – and retaining – funding.

• Their use shows that you’re serious about evaluation and about improving your

work. Being a good trustee or steward of community investment includes regular review

of data regarding progress and improvement.

• They can show the field what you’re learning, and thus pave the way for others to

implement successful methods and approaches. In that way, you’ll be helping to

improve community efforts and, ultimately, quality of life for people who benefit.

WHEN AND BY WHOM SHOULD DATA BE COLLECTED AND ANALYZED?

As far as data collection goes, the “when” part of this question is relatively simple: data

collection should start no later than when you begin your work – or before you begin in order to

establish a baseline or starting point – and continue throughout. Ideally, you should collect data

for a period of time before you start your program or intervention in order to determine if there

are any trends in the data before the onset of the intervention. Additionally, in order to gauge

your program’s longer-term effects, you should collect follow-up data for a period of time

following the conclusion of the program.
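
A minimal sketch of how baseline, program, and follow-up measurements might be compared. The weekly counts below are invented for illustration, and the `slope` helper is a hypothetical name, not part of any library. A baseline slope near zero suggests there was no trend before the program started.

```python
from statistics import mean

def slope(values):
    """Least-squares slope of values against their position (change per period)."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den

# Hypothetical weekly counts of an observed behavior (illustrative only).
baseline = [12, 11, 13, 12, 12]   # before the program starts
program = [10, 8, 7, 6, 5]        # while the program runs
follow_up = [5, 6, 5, 6]          # after the program ends

summary = {
    "baseline_mean": mean(baseline),
    "baseline_slope": slope(baseline),
    "program_mean": mean(program),
    "follow_up_mean": mean(follow_up),
}
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

If the baseline is already falling before you begin, a lower average during the program may simply continue that trend, which is why the pre-program data matter.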

The timing of analysis can be looked at in at least two ways: One is that it’s best to analyze your

information when you’ve collected all of it, so you can look at it as a whole. The other is that if

you analyze it as you go along, you’ll be able to adjust your thinking about what information you

actually need, and to adjust your program to respond to the information you’re getting. Which of

these approaches you take depends on your research purposes. If you’re more concerned with a

summative evaluation – finding out whether your approach was effective – you might be more

inclined toward the first. If you’re oriented toward improvement – a formative evaluation – we

recommend analyzing information as you gather it. Both approaches are legitimate, but ongoing

data collection and review can particularly lead to improvements in your work.

The “who” question can be more complex. If you’re reasonably familiar with statistics and

statistical procedures, and you have the resources in time, money, and personnel, it’s likely that

you’ll do a somewhat formal study, using standard statistical tests. (There’s a great deal of

software – both for sale and free or open-source – available to help you.)

If that’s not the case, you have some choices:

• You can hire or find a volunteer outside evaluator, such as from a nearby college or

university, to take care of data collection and/or analysis for you.

• You can conduct a less formal evaluation. Your results may not be as sophisticated as if

you subjected them to rigorous statistical procedures, but they can still tell you a lot about

your program. Just the numbers – the number of dropouts (and when most dropped out),

for instance, or the characteristics of the people you serve – can give you important and

usable information.

• You can try to learn enough about statistics and statistical software to conduct a formal

evaluation yourself. (Take a course, for example.)

• You can collect the data and then send it off to someone – a university program, a friendly

statistician or researcher, or someone you hire – to process it for you.

• You can collect and rely largely on qualitative data. Whether this is an option depends to

a large extent on what your program is about. You wouldn’t want to conduct a formal

evaluation of effectiveness of a new medication using only qualitative data, but you might

be able to draw some reasonable conclusions about use or compliance patterns from

qualitative information.

• If possible, use a randomized or closely matched control group for comparison. If your

control is properly structured, you can draw some fairly reliable conclusions simply by

comparing its results to those of your intervention group. Again, these results won’t be as

reliable as if the comparison were made using statistical procedures, but they can point

you in the right direction. It’s fairly easy to tell whether or not there’s a major difference

between the numbers for the two or more groups. If 95% of the students in your class

passed the test, and only 60% of those in a similar but uninstructed control group did, you

can be pretty sure that your class made a difference in some way, although you may not

be able to tell exactly what it was that mattered. By the same token, if 72% of your

students passed and 70% of the control group did as well, it seems pretty clear that your

instruction had essentially no effect, if the groups were starting from approximately the

same place.
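
If you want to go one step beyond eyeballing the two pass rates, the sketch below applies a standard two-proportion z-test to the examples above. The text gives only percentages, so the group size of 100 students per group is an assumption, and `two_proportion_p_value` is a hypothetical helper name. The approximation is reasonable only for fairly large groups.

```python
import math

def two_proportion_p_value(passed_a, n_a, passed_b, n_b):
    """Two-sided p-value for the difference between two pass rates (z-test)."""
    p_a, p_b = passed_a / n_a, passed_b / n_b
    pooled = (passed_a + passed_b) / (n_a + n_b)
    # The standard error is zero if everyone (or no one) passed in both groups;
    # this sketch does not handle that case.
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), the two-sided tail area.
    return math.erfc(abs(z) / math.sqrt(2))

# Assumed group sizes: 100 students in the class and 100 in the control group.
big_gap = two_proportion_p_value(95, 100, 60, 100)
small_gap = two_proportion_p_value(72, 100, 70, 100)
print(f"95% vs 60%: p = {big_gap:.4f}")
print(f"72% vs 70%: p = {small_gap:.4f}")
```

A small p-value for the first comparison supports the conclusion that the difference is unlikely to be chance. A large p-value for the second does not prove there was no effect; it only shows that a gap this small could easily arise by chance.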

Who should actually collect and analyze data also depends on the form of your evaluation. If

you’re doing a participatory evaluation, much of the data collection and analysis will be

done by community members or program participants themselves. If you’re conducting an

evaluation in which the observation is specialized, the data collectors may be staff members,

professionals, highly trained volunteers, or others with specific skills or training (graduate

students, for example). Analysis also could be accomplished by a participatory process. Even

where complicated statistical procedures are necessary, participants and/or community members

might be involved in sorting out what those results actually mean once the math is done and the

results are in. Another way analysis can be accomplished is by professionals or other trained

individuals, depending upon the nature of the data to be analyzed, the methods of analysis, and

the level of sophistication aimed at in the conclusions.

HOW DO YOU COLLECT AND ANALYZE DATA?

Whether your evaluation includes formal or informal research procedures, you’ll still have to

collect and analyze data, and there are some basic steps you can take to do so.

IMPLEMENT YOUR MEASUREMENT SYSTEM

We've previously discussed designing an observational system to gather information. Now

it’s time to put that system in place.

• Clearly define and describe what measurements or observations are needed. The

definition and description should be clear enough to enable observers to agree on what

they’re observing and reliably record data in the same way.

• Select and train observers. Particularly if this is part of a participatory process, observers

need training to know what to record; to recognize key behaviors, events, and conditions;

and to reach an acceptable level of inter-rater reliability (agreement among observers).

• Conduct observations at the appropriate times for the appropriate period of time. This

may include reviewing archival material; conducting interviews, surveys, or focus groups;

engaging in direct observation; etc.

• Record data in the agreed-upon ways. These may include pencil and paper, computer

(using a laptop or handheld device in the field, entering numbers into a program, etc.),

audio or video, journals, etc.
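
Inter-rater reliability, mentioned in the steps above, can be checked with simple arithmetic. The sketch below computes percent agreement and Cohen's kappa (agreement corrected for chance) for two observers. The observation codes and both function names are made up for illustration.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Share of observations that both raters coded the same way."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement beyond what chance alone would produce (1.0 = perfect)."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same code independently.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    # Undefined when expected == 1 (both raters used a single code throughout).
    return (observed - expected) / (1 - expected)

# Hypothetical codes from two observers watching the same ten intervals.
observer_1 = ["on", "on", "off", "on", "off", "off", "on", "on", "off", "on"]
observer_2 = ["on", "on", "off", "off", "off", "off", "on", "on", "on", "on"]

print(f"agreement: {percent_agreement(observer_1, observer_2):.0%}")
print(f"kappa: {cohens_kappa(observer_1, observer_2):.2f}")
```

What counts as an acceptable level is a judgment call for your evaluation. If agreement is low, the definitions of what to observe probably need tightening, or observers need more training.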

ORGANIZE THE DATA YOU’VE COLLECTED

How you do this depends on what you’re planning to do with it, and on what you’re

interested in.

• Enter any necessary data into the computer. This may mean simply typing comments,

descriptions, etc., into a word processing program, or entering various kinds of

information (possibly including audio and video) into a database, spreadsheet, a GIS

(Geographic Information Systems) program, or some other type of software or file.

• Transcribe any audio- or videotapes. This makes them easier to work with and copy, and

allows the opportunity to clarify any hard-to-understand passages of speech.

• Score any tests and record the scores appropriately.

• Sort your information in ways appropriate to your interest. This may include sorting by

category of observation, by event, by place, by individual, by group, by the time of

observation, or by a combination or some other standard.

• When possible, necessary, and appropriate, transform qualitative into quantitative data.

This might involve, for example, counting the number of times specific issues were

mentioned in interviews, or how often certain behaviors were observed.
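
As one illustration of turning qualitative into quantitative data, the sketch below counts how many interviews mention each theme. It assumes someone has already read the interviews and tagged them; the themes and the `count_mentions` helper are hypothetical.

```python
from collections import Counter

# Hypothetical coded interviews: each entry holds the themes a reviewer tagged.
coded_interviews = [
    {"transportation", "cost"},
    {"cost", "childcare"},
    {"transportation"},
    {"cost", "schedule"},
]

def count_mentions(interviews):
    """Count how many interviews mention each theme (once per interview at most)."""
    counts = Counter()
    for themes in interviews:
        counts.update(set(themes))
    return counts

mentions = count_mentions(coded_interviews)
for theme, n in mentions.most_common():
    share = n / len(coded_interviews)
    print(f"{theme}: {n} of {len(coded_interviews)} interviews ({share:.0%})")
```

Counting each theme once per interview keeps one talkative respondent from dominating the totals. Whether that is the right choice depends on the question your evaluation is asking.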

CONDUCT DATA GRAPHING, VISUAL INSPECTION, STATISTICAL ANALYSIS, OR

OTHER OPERATIONS ON THE DATA AS APPROPRIATE

We’ve referred several times to statistical procedures that you can apply to quantitative data. If

you have the right numbers, you can find out a great deal about whether your program is causing

or contributing to change and improvement, what that change is, whether there are any expected

or unexpected connections among variables, how your group compares to another you’re

measuring, etc.

There are other excellent possibilities for analysis besides statistical procedures,

however. A few include:

• Simple counting, graphing and visual inspection of frequency or rates of behavior, events,

etc., over time.

• Using visual inspection of patterns over time to identify discontinuities (marked increases,

decreases) in the measures over time (sessions, weeks, months).

• Calculating the mean (average), median (midpoint), and/or mode (most frequent) of a

series of measurements or observations. What was the average blood pressure, for

instance, of people who exercised 30 minutes a day at least five days a week, as opposed

to that of people who exercised two days a week or less?

• Using qualitative interviews, conversations, and participant observation to observe (and

track changes in) the people or situation. Journals can be particularly revealing in this

area because they record people’s experiences and reflections over time.

• Finding patterns in qualitative data. If many people refer to similar problems or barriers,

these may be important in understanding the issue, determining what works or doesn’t

work and why, or more.

• Comparing actual results to previously determined goals or benchmarks. One measure of

success might be meeting a goal for planning or program implementation, for example.
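
The mean, median, and mode described above can be computed with Python's standard `statistics` module. The blood pressure readings, the benchmark value, and the `summarize` helper below are invented for illustration only.

```python
from statistics import mean, median, mode

# Hypothetical systolic blood pressure readings (mmHg), illustrative only.
frequent_exercisers = [118, 122, 120, 125, 120, 130]    # 5+ days a week
infrequent_exercisers = [135, 128, 140, 135, 150, 132]  # 2 days a week or less

def summarize(values):
    """Return the three common measures of central tendency."""
    return {"mean": mean(values), "median": median(values), "mode": mode(values)}

frequent = summarize(frequent_exercisers)
infrequent = summarize(infrequent_exercisers)
print("frequent:  ", frequent)
print("infrequent:", infrequent)

# Comparing a result to a previously set benchmark (assumed goal: mean of 125 or lower).
benchmark = 125
print("frequent group meets benchmark:", frequent["mean"] <= benchmark)
```

The median is less affected than the mean by one or two extreme readings, so reporting both can show whether a few unusual participants are pulling the average.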

TAKE NOTE OF ANY SIGNIFICANT OR INTERESTING RESULTS

Depending on the nature of your research, results may be statistically significant (at the .05

level or better, as discussed earlier), or simply important or unusual. They may or may not

be socially significant (i.e., large enough to solve the problem).

There are a number of different kinds of results you might be looking for.

• Differences within people or groups. If you have repeated measurements for

individuals/groups over time, you can see if there are marked increases/decreases in the

frequency or rate of behavior, events, etc. following introduction of the program or

intervention. When the effects are seen when and only when the intervention is introduced

– and if the intervention is staggered (delayed) across people or groups – this increases

your confidence that the intervention, and not something else, is producing the observed

effects.

• Differences between or among two or more groups. If you have one or more randomized

control groups in a formal study (groups that are drawn at random from the same

population as the group in your program, but are not getting the same program or

intervention, or are getting none at all), then the statistical significance of differences

between or among the groups should tell you whether your program has any more

influence on the dependent variable(s) than what’s experienced by the other groups.

• Results that show statistically significant changes. With or without a control or

comparison group, many statistical procedures can tell you whether changes in dependent

variables are statistically significant (that is, unlikely to be due to chance alone). These results may say nothing

about the causes of the change (or they may, depending on how you’ve structured your

evaluation), but they do tell you what’s happening, and give you a place to start.

Correlation between variables doesn’t tell you that one necessarily causes the other, but simply

that changes in one have a relationship to changes in the other. Among American teenagers,

for instance, there is probably a fairly high correlation between an increase in body size and an

understanding of algebra. This is not because one causes the other, but rather the result of the

fact that American schools tend to begin teaching algebra in the seventh, eighth, or ninth grades,

a time when many 12-, 13-, and 14-year-olds are naturally experiencing a growth spurt.

On the other hand, correlations can reveal important connections. A very high correlation

between, for instance, the use of a particular medication and the onset of depression might lead

to the withdrawal of that medication, or at least a study of its side effects, and increased

awareness and caution among doctors who prescribe it. A very high correlation between gang

membership and having a parent with a substance abuse problem may not reveal a direct

cause-and-effect relationship, but may tell you something important about who is more at risk

for substance abuse.

• Correlations. Correlation means that there are connections between or among two or

more variables. Correlations can sometimes point to important relationships you might

not have predicted. Sometimes they can shed light on the issue itself, and sometimes on

the effects of a group’s cultural practices. In some cases, they can highlight potential

causes of an issue or condition, and thus pave the way for future interventions.

• Patterns. In both quantitative and qualitative information, patterns often emerge: certain

health conditions seem to cluster in particular geographical areas; people from a particular

group behave in similar ways; etc. These patterns may not be specifically what you were

looking for or expected to find, but they may either be important in themselves or shed

light on the areas you’re interested in. In some cases, you may need to subject them to

statistical procedures (regression analysis, for example) to see if, in fact, they’re random,

or if they constitute actual patterns.

• Obvious important findings. Whether as a result of statistical analysis, or of examination

of your data and application of logic, some findings may stand out. If 70% of a group of

overweight participants in a healthy eating and physical activity program lowered their

weight and blood pressure significantly, compared to only 20% of a similar group not in

the program, you can probably assume that the program was effective. If there’s

no change whatsoever in education outcomes after two years of your education program,

then you’re either running an ineffective program, or you’re simply not reaching those

who are most likely to have poorer outcomes (which can also be interpreted to mean

you’re running an ineffective program).

Not all important findings will necessarily tell you whether your program worked, or

what the most effective method is. It might be obvious from your data collection, for

instance, that, while violence or roadway injuries may not be seen as a problem

citywide, they are much higher in one or more particular areas, or that the rates of

diabetes are markedly higher for particular groups or those living in areas with greater

disparities of income. If you have the resources, it’s wise to look at the results of your

research in a number of different ways, both to find out how to improve your program,

and to learn what else you might do to affect the issue.
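
One way to look at results by area is to convert raw counts into rates, so that areas with different populations can be compared fairly. The sketch below uses invented figures, and the "twice the overall rate" cutoff is an arbitrary assumption; both helper functions are hypothetical.

```python
# Hypothetical records: (area, population, roadway injuries in the past year).
records = [
    ("North", 5000, 10),
    ("South", 4000, 36),
    ("East", 6000, 12),
    ("West", 5000, 12),
]

def rates_per_1000(rows):
    """Events per 1,000 residents for each area."""
    return {area: 1000 * events / population for area, population, events in rows}

def flag_high_areas(rows, ratio=2.0):
    """Return the overall rate and the areas at or above ratio times that rate."""
    total_population = sum(population for _, population, _ in rows)
    total_events = sum(events for _, _, events in rows)
    overall = 1000 * total_events / total_population
    rates = rates_per_1000(rows)
    flagged = {area: rate for area, rate in rates.items() if rate >= ratio * overall}
    return overall, flagged

overall_rate, high_areas = flag_high_areas(records)
print(f"overall rate per 1,000: {overall_rate:.1f}")
print("areas at twice the overall rate or more:", high_areas)
```

A citywide figure can look unremarkable while one area carries most of the burden, which is the kind of finding the paragraph above describes.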

INTERPRET THE RESULTS

Once you’ve organized your results and run them through whatever statistical or other analysis

you’ve planned for, it’s time to figure out what they mean for your evaluation. Probably the most

common question that evaluation research is directed toward is whether the program being

evaluated works or makes a difference. In research terms, that often translates to “What were the

effects of the independent variable (the program, intervention, etc.) on the dependent variable(s)

(the behavior, conditions, or other factors it was meant to change)?” There are a number of

possible answers to this question:

• Your program had exactly the effects on the dependent variable(s) you expected and

hoped it would. Statistics or other analysis showed clear positive effects at a high level of

significance for the people in your program and – if you used a multiple-group design –

none, or far fewer, of the same effects for a similar control group and/or for a group that

received a different intervention with the same purpose. Your early childhood education

program, for instance, greatly increased development outcomes for children in the

community, and also contributed to an increase in the percentage of children succeeding

in school.

• Your program had no effect. Your program produced no significant results on the

dependent variable, whether alone or compared to other groups. This would mean no

change as a result of your program or intervention.

• Your program had a negative effect. For instance, intimate partner violence increased (or

at least appeared to) as a result of your intervention. (It is relatively common for reported

events, such as violence or injury, to increase when the intervention results in improved

surveillance and ease of reporting).

• Your program had the effects you hoped for and other effects as well.

o These effects might be positive. Your youth violence prevention program, for

instance, might have resulted in greatly reduced violence among teens, and might

also have resulted in significantly improved academic performance for the kids

involved.

o These effects might be neutral. The same youth violence prevention program

might somehow result in youth watching TV more often after school.

o These effects might be negative. (These effects are usually called unintended

consequences.) Youth violence might decrease significantly, but the incidence of

teen pregnancies or alcohol consumption among youth in the program might

increase significantly at the same time.

o These effects might be multiple, or mixed. For instance, a program to reduce

HIV/AIDS might lower rates of unprotected sex but might also increase conflict

and instances of partner violence.

• Your program had no effect or a negative effect and other effects as well. As with

programs with positive effects, these might be positive, neutral, or negative; single or

multiple; or consistent or mixed.

If your analysis gives you a clear indication that what you’re doing is accomplishing your

purposes, interpretation is relatively simple: You should keep doing it, while trying out ways to

make it even more effective, or while aiming at other related issues as well.

If your analysis shows that your program is ineffective or negative, however – or, for that matter,

if a positive analysis leaves you wondering how to make your successful efforts still more

successful – interpretation becomes more complex. Are you using an absolutely wrong

approach? Are you using an approach that could be effective, but is poorly implemented? Is there a

particular contributing factor you’re failing to take into account? Are there barriers to success –

of culture, experience, personal characteristics, and systematic discrimination – present in the

population from which participants are drawn? Are there particular components or elements you

can change to make your program more effective, or should you start again from scratch? What

should you address to make a good program better?

Careful and insightful interpretation of your data may allow you to answer questions like these.

You may be able to use correlations, for instance, to generate hypotheses about your results. If

positive or negative changes in particular variables are consistently associated with positive or

negative changes in other variables, the two may be connected. (The word “may” is important

here. The two may be connected, but they may not, or both may be related to a third variable

that you’re not aware of or that you consider trivial.) Such a connection can point the way

toward a factor (e.g., access to support) that is causing the changes in both variables, and that

must be addressed to make your program successful. Correlations may also indicate patterns in

your data, or may lead to an unexpected way of looking at the issue you’re addressing.
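
A minimal sketch of how a correlation might be computed, using the standard Pearson formula written out by hand. The attendance and outcome figures are invented, and `pearson_r` is a hypothetical helper name. A value near +1 or -1 indicates a strong association; it says nothing by itself about cause.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    # Undefined if either variable never changes (spread of zero).
    return covariation / (spread_x * spread_y)

# Hypothetical data: sessions attended and an outcome score per participant.
sessions_attended = [2, 4, 5, 7, 8, 10]
outcome_score = [50, 55, 60, 64, 70, 78]

r = pearson_r(sessions_attended, outcome_score)
print(f"correlation between attendance and outcome: {r:.2f}")
```

Even a very high value here would not show that attendance caused the better scores. A third factor, such as access to support, could make people both attend more often and score higher.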

You can often use qualitative data to understand the meaning of an intervention, and people’s

reactions to the results. The observation that participants are continually suffering from a variety

of health problems may be traced, through qualitative data, to nutrition problems (due either to

poverty or ignorance) or to lack of access to health services, or to cultural restrictions (some

Muslim women may be unwilling – or unable because of family prohibition – to accept care and

treatment from male doctors, for example).

Once you have organized your data, both statistical results and anything that can’t be analyzed

statistically need to be analyzed logically. This may not give you convincing information but it

will almost certainly give you some ideas to follow up on, and some indications of

connections and avenues you might not yet have considered. It will also show you some

additional results – people reacting differently than before to the program, for example. The

numbers can tell you whether there is change, but they can’t always tell you what causes it or

why (although they sometimes can), or why some people benefit while others don’t. Those are

often matters for logical analysis, or critical thinking.

Analyzing and interpreting the data you’ve collected brings you, in a sense, back to the

beginning. You can use the information you’ve gained to adjust and improve your program or

intervention, evaluate it again, and use that information to adjust and improve it further, for as

long as it runs. You have to keep up the process to ensure that you’re doing the best work you

can and encouraging changes in individuals, systems, and policies that make for a better and

healthier community.

You have to become a cultural detective to understand your initiative, and, in some ways, every

evaluation is an anthropological study.

IN SUMMARY

The heart of evaluation research is gathering information about the program or intervention

you’re evaluating and analyzing it to determine what it tells you about the effectiveness of what

you’re doing, as well as about how you can maintain and improve that effectiveness.

Collecting quantitative data – information expressed in numbers – and subjecting it to a visual

inspection or formal statistical analysis can tell you whether your work is having the desired

effect, and may be able to tell you why or why not as well. It can also highlight connections

(correlations) among variables, and call attention to factors you may not have considered.

Collecting and analyzing qualitative data – interviews, descriptions of environmental factors, or

events, and circumstances – can provide insight into how participants experience the issue you’re

addressing, what barriers and advantages they experience, and what you might change or add to

improve what you do.

Once you’ve gained the knowledge that your information provides, it’s time to start the process

again. Use what you’ve learned to continue to evaluate what you do by collecting and analyzing

data, and continually improve your program.

Chapter 7

COLLECTING AND USING ARCHIVAL DATA

You’re evaluating your teen pregnancy prevention program, and you’d like to know whether it

will result in a reduction in pregnancy rates among young girls in the community. There are

statistics on pregnancy rates for the state and county collected by the Public Health Service, but

none at the community level, and none that separate the rates for girls under 16, the population

you’re most concerned with. You could do a community survey to try to find out the local rate,

but there are two problems with that idea. The first is that you have neither the time nor the

resources to conduct the survey, and the second is that you’re unlikely to get an accurate picture:

the questions may cause embarrassment, or people may have other reasons not to answer candidly.

There might be another way of getting the information, however. A number of other agencies in

the community work with youth, and one of them, or a combination, might have the figures

you’re looking for. Rather than generating or collecting it yourself, you can save a great deal of

time and trouble by using data that already exist.

We have previously discussed ways to find existing information to help you conduct a

community assessment of assets and needs, but now our goal is somewhat different, because

we’re seeking data that you can use to evaluate your work. This means that it needs to be in a

form that can be analyzed, and may have to refer to a very specific population, issue, and/or

method. As a result, it may be harder to find, and may have to be converted in some way once

you do find it. In many situations, however, using existing information still can be much easier

than collecting the data yourself. In this section, we’ll try to help you make the use of archival

data as easy as possible.

WHAT ARE ARCHIVAL DATA?

Archival data refer to information that already exists in someone else’s files. Originally generated

for reporting or research purposes, it’s often kept because of legal requirements, for reference, or

as an internal record. In general, because it’s the result of completed activities, it’s not subject to

change and is therefore sometimes known as fixed data.

Some researchers make a distinction between archival and secondary data. They see archival

data as information specifically collected for bureaucratic procedures and the like –

applications, reports, etc. – that can then be made usable for other purposes. Secondary data

refer to research information, collected as a result of studies and similar efforts that can then be

used by others either as comparison data or as part of new research. For the purposes of this

section, we’ll include both of these types of data in our discussion, and not distinguish between

them.

SOURCES OF ARCHIVAL DATA

Archival data can exist almost anywhere that information is collected.

Some of the most common sources (we’ll look at these and others later in more detail) are:

• Public records from governmental agencies

• Research organizations

• Health and human service organizations

• Schools and education departments

• Academic and similar institutions

• Business and industry

Archives are often stored as paper files or on electronic storage – computer disks, CDs, DVDs,

etc. – and may include photographs and audio and video recordings as well. They may also take the

form of encoded information expressed in numbers, or in computer language. Computer files, of

course, may include various media and text, all in the same place.

Many organizations have archives so large that they store most of the material off-site, either

with a data storage firm, or in their own or a rented facility. Some archives are made available

on a website maintained by the government or other organization.

TYPES OF ARCHIVAL DATA YOU MIGHT LOOK FOR

As explained above, much of the data you’re likely to use for evaluation purposes will probably

be more focused than data you’d use for an assessment of the level of a problem. Evaluation

information would be more likely than assessment data to come in the form of study results, for

example, rather than as narrative history or original documents. There’s a good deal of overlap: census

data, for instance, could be used in both assessment and evaluation. In general, however, the

types of data you might look for include information for a specific population or geographic

area on:

• Knowledge and awareness of issues

• Demographics of the population (e.g., age, education, income)

• Behavior

• Health and development outcomes

• Environmental conditions or risk/protection factors affecting the population

WHY COLLECT AND USE ARCHIVAL DATA?

There are sometimes good reasons for using original data, including that the information you

need just isn’t available elsewhere. Additionally, if a researcher collects original data, he or she

has more control over what data are collected.

On the other hand, if the information you need, or something very close to it, already

exists, there are several good reasons to find and use it.

• It’s easier and less time-consuming than collecting all the data yourself. This is

probably the most obvious and most common reason for taking advantage of archival

data. Especially if you’re looking for a large amount of information or information about

a large group of people, you may be able to save yourself an enormous amount of time

and trouble by using archival data.

• Archival data may have already been processed by people with more statistical

expertise. Unless you’re a statistician or a health or human service researcher with an

advanced degree (and often not even then), the chances are that you don’t have a

flawless grasp of data analysis. You can hire someone or find a volunteer to help you,

but if the hard work has already been done, it will make your work that much easier.

Even with raw data, the basic organization and preparation (transcription of

interviews, entry of numbers into a spreadsheet or specific software, etc.) may have

already been done, again saving time and resources.

• It’s quite possible that you can find more information than you’d be able to gather

if you did it yourself. The archival data you find may be more sweeping or more

specific than what you’d be able to gather. It may involve more people than you’d be able to

reach, cover a larger geographic area, or provide more detail.

• Archival data could touch on important areas you have not considered, or identify

patterns or relationships you wouldn’t have looked for. In cases like these, the use of

pre-existing data might change your whole view of your work, and help bring you to a

level of effectiveness you wouldn’t have reached otherwise.

• It may eliminate the need to correct for problems, such as improper sampling, lack of

inter-rater reliability, or observer bias.

• Archival data allows the possibility of looking at the effects of your work over time.

Is the change in your population part of a trend that seems to be reflected in data from a

similar population or the entire state or nation? You may not have the capacity to collect

data over a long enough period to answer such questions, but if the data already exist, it

makes longer-term analysis possible.

• Archival data can make it possible for small organizations with limited resources to

conduct thorough evaluation studies. Most small community-based organizations

simply have neither the money nor the personnel to gather large amounts of data, but

there’s no need to when the data you need exists elsewhere.

WHEN SHOULD YOU COLLECT AND USE ARCHIVAL DATA?

• When it’s available. This is the key question. If you know the data exist and you can get

access, use it. If the data doesn’t exist, if finding it would take more time and effort than

it’s worth, or if you have no access to it, then using archival data isn’t an option.

• When it’s relevant. As with its availability, the relevance of the data to what you’re

trying to find out is a key issue. All the archival data in the world won’t do you any good

if it doesn’t help you answer your evaluation questions.

• When you don’t have the time and/or resources to collect the data yourself. Whether

it’s a matter of the size and scope of your organization, time pressure from a funder to

produce an evaluation, or some other factor, archival data may be the only source of the

information you need.

• When it can inform your evaluation. There are large amounts of archival data

available almost everywhere. The mere fact that it exists doesn’t mean that it will do you

any good. You have to be selective about what you gather and use. Make sure it is

actually what you need, that it refers to the population and/or other elements of your

program that will make it truly useful to you. If not, you’ve not only wasted your effort,

but the resulting evaluation won’t give you a realistic picture of your work and how to

improve it.

HOW DO YOU COLLECT AND USE ARCHIVAL DATA?

As you search out and collect archival data, there are several questions you should ask.

WHAT INFORMATION ARE YOU LOOKING FOR AND WHY?

To answer this question, you first might think about what information you need for your

evaluation that doesn’t necessarily require gathering data on current participants.

Some possibilities include:

• Data on past participants. You may want to compare the results for current participants

with data on past participants, especially if you’ve changed your methods or the

population has changed significantly.

Community-level indicators show trends for the community as a whole. You can often use them

to find out whether your efforts have had any effect in the community. If, for example, you are

conducting a program to reduce alcohol use among youth, one indicator of success might be a

reduction in the number of weekend and nighttime one-car crashes involving teens.

In many cases, you might choose community-level indicators according to the data that’s

available. For example, if there are reliable figures for exactly the kinds of car crashes

described, then using those figures as a community-level indicator would probably make sense.

• General information on the population and/or the community you’re working

with. You may want to see how well various characteristics of your participants match

those of the general population or you may simply want to understand the context of the

evaluation better. You may also be looking for information to choose community-level

indicators. Community-level indicators are specific, measurable units that help us

determine success of an initiative or intervention in the community; examples of

community-level indicators include program participation rates, services delivered,

levels of crime, or new cases of HIV/AIDS.

• Specific information on appropriate characteristics of the population you’re

working with. This may be used to compare your participants to the population they’re

part of, as well as to track specific differences that might be a result of your program.

Both general and specific information might include several categories to choose from. These

categories include:

• Demographics:

o Demographic statistics – age, race, gender, ethnicity

o Geographical location and distribution, population density, etc.

o Economics – income, employment, living conditions

o Housing – household size and ownership

o Education level

o Other characteristics – primary language spoken, etc.

• Behavior:

o Health-related – tobacco use, physical activity, diet, etc.

o Substance use and abuse

o Sexual behavior

o Various other behaviors – work habits, consumer patterns, etc.

• Health and Development Outcomes:

o Incidence (new cases) and prevalence (existing cases) of specific health conditions

(e.g., diabetes, injuries, infant mortality)

o Development outcomes (e.g. those completing primary education, high school; those with

disabilities)

o General health and well-being characteristics of the population or community

o Access to health and human services (e.g. those with access to clean water and sanitation)

o Knowledge and awareness of issues (e.g. survey data on public concern with violence)

o Environmental conditions or risk/protective factors (e.g. exposures to pollution or toxins)

• Cultural information. Norms, customs, celebrations, beliefs, and practices related to

culture. If you can see whether and where your group’s efforts are fitting with

participants’ cultures, it will help you to determine whether that’s an issue, and where

you might need to make changes.

• Data on a similar group that can be used as a control or comparison. This might be a

group from the same population that signed up for but did not experience the program, or

another comparable community in a different place.

• Results of previous studies. You’d probably be most interested in studies that looked at
the same issue and population group you’re addressing. These can provide a standard of
comparison, as well as some sense of what kinds of results might be reasonable to
expect.

Suppression of Statistics

An issue that you should be aware of and prepared to encounter during your research is suppression

of statistics. When research deals with small populations or data pools, in order to protect the

privacy of individuals, it is sometimes necessary to suppress data. In other words, when the number

of cases in a category (e.g., females in Wyandotte County who died from lung cancer in 2004) is

small enough that disclosing the data might allow a specific individual to be identified, steps are

taken to protect the privacy of individuals.

The most common method of preventing the identification of specific individuals is through cell

suppression. This means not providing counts in individual cells where doing so would potentially

allow identification of a specific person. Cell suppression can also be done by combining cells from

different small groups to create larger groupings that reduce the risk of identifying individuals.
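The two approaches to cell suppression described above can be sketched in a few lines of Python. This is only an illustration: the group names and counts are hypothetical, and the cutoff of 6 is an assumption chosen to match the "numbers below 6" rule used in the Kansas tables in this section. Agencies set their own thresholds.

```python
# Illustrative sketch of cell suppression; all counts below are hypothetical.
SUPPRESSION_THRESHOLD = 6  # assumed cutoff: counts below this are hidden


def suppress_cells(counts, threshold=SUPPRESSION_THRESHOLD, marker="#"):
    """Replace each count below the threshold with a marker."""
    return {
        group: (count if count >= threshold else marker)
        for group, count in counts.items()
    }


def combine_small_cells(counts, threshold=SUPPRESSION_THRESHOLD, label="Combined"):
    """Merge all groups whose count is below the threshold into one larger group."""
    combined = {}
    small_total = 0
    for group, count in counts.items():
        if count < threshold:
            small_total += count
        else:
            combined[group] = count
    if small_total:
        combined[label] = small_total
    return combined


hypothetical_counts = {"Group A": 14, "Group B": 4, "Group C": 2}

print(suppress_cells(hypothetical_counts))
print(combine_small_cells(hypothetical_counts))
```

If the merged group is still below the cutoff, it would need to be suppressed as well. Keep in mind, too, that when a total is published alongside the cells, hiding a single cell is not enough, because the hidden value can be recovered by subtraction.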

The table below, from the Kansas Department of Health and Environment's Bureau of

Epidemiology and Public Health Informatics, shows a break-down by race of deaths due to chronic
liver disease and cirrhosis in Wyandotte County, KS in 2010. Because the numbers for
African-American and Other are small enough that it might be possible to identify individuals from
those statistics, the data is suppressed, as indicated by the #.

Death Statistics for Chronic liver disease & cirrhosis for Wyandotte County, 2010

White    Black/African-American    Other    All races
14       #                         #        20

# indicates numbers below 6

The National Association of County and City Health Officials (NACCHO) has a useful tip sheet

that explores this and other challenges of data collection and analysis in jurisdictions with small

populations and provides useful information for overcoming these challenges.

In addition to the question of confidentiality, low numbers in a given category can also be an issue
when considering the stability of data. In other words, when there are low numbers or incidences in
the data you are researching, it is more difficult to accurately calculate rates and it can give an
inaccurate picture of the categories you are researching. For instance, if the number of lung cancer
deaths in 2004 was 20, and in 2005 it was 30, statistically that is a 50% rise over one year, which is
quite a substantial fluctuation; however, it may be that it is simply a normal variation in reporting.
Because the numbers reported are so small, even minor changes can seem substantial, and this can
result in unreliable or unstable data. The table below, from the Kansas Department of Health and
Environment's Bureau of Epidemiology and Public Health Informatics, shows a break-down by race
of deaths due to breast cancer in 2010 in Wyandotte County, KS. Because the numbers available for
White, African-American, and Other are too small to allow for an accurate, reliable calculation of
the rates for that year, the information is suppressed, indicated by the @.@ symbols.

Death statistics for malignant neoplasms of breast, Wyandotte County, 2010

          White    Black/African-American    Other    All races
Number    15       6                         #        23
Rate      @.@      @.@                       @.@      15.5

@.@ indicates numerator too small for rate calculation

# indicates numbers below 6
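To see why small counts are unstable, it can help to work through the lung cancer example (20 deaths one year, 30 the next) numerically. The sketch below reproduces the 50% rise and then applies two common rules of thumb. Both rest on an assumption that is not part of the original example: that each count behaves like a Poisson variable (so its variance equals the count) and that the population at risk was the same in both years.

```python
import math

# Counts taken from the lung cancer example in the text: 20 deaths, then 30.
deaths_year_1 = 20
deaths_year_2 = 30


def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


def relative_standard_error(count):
    """Approximate relative standard error (%) of a count,
    ASSUMING the count behaves like a Poisson variable."""
    return 100 / math.sqrt(count)


def z_for_difference(count_a, count_b):
    """Rough z-score for the difference between two counts,
    under the same Poisson assumption (variance equals the count)."""
    return (count_b - count_a) / math.sqrt(count_a + count_b)


print(f"Percent change: {percent_change(deaths_year_1, deaths_year_2):.0f}%")
print(f"Relative standard error of 20: {relative_standard_error(deaths_year_1):.1f}%")
print(f"z for the difference: {z_for_difference(deaths_year_1, deaths_year_2):.2f}")
```

Under these assumptions the z-score comes out at about 1.41, below the conventional 1.96 cutoff, so a jump that looks like a 50% rise could plausibly be ordinary year-to-year variation.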

However, there are a couple of strategies that can be used to help avoid or address these problems

of instability.

One way to increase the reliability of data where you are dealing with small data sets is to combine

multi-year data (for instance, results of cancer deaths in a community for three years instead of

one). A drawback to this option is that looking at multi-year data limits the ability to monitor

program interventions and identify new trends. Rolling year averages (e.g., looking at data for

1997-2000 one year, and 1998-2001 the following year) may overcome this drawback and are an

option that should be considered.
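A rolling average of this kind is simple to compute. The sketch below uses the same four-year windows as the example (1997-2000, then 1998-2001); the annual counts themselves are invented for illustration, and the function assumes there is one count for every consecutive year with no gaps.

```python
# Hypothetical annual counts for a small community (invented for illustration).
annual_counts = {1997: 4, 1998: 7, 1999: 3, 2000: 6, 2001: 5}


def rolling_average(counts_by_year, window):
    """Return {(first_year, last_year): mean annual count} for each
    run of `window` consecutive years."""
    years = sorted(counts_by_year)
    averages = {}
    for start in range(len(years) - window + 1):
        span = years[start:start + window]
        total = sum(counts_by_year[year] for year in span)
        averages[(span[0], span[-1])] = total / window
    return averages


for (first, last), mean in rolling_average(annual_counts, window=4).items():
    print(f"{first}-{last}: {mean:.2f} per year")
```

Because each window shares most of its years with the one before it, the averages move more smoothly than the single-year counts, while a new figure can still be reported every year.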

Another way to decrease the possibility of statistical instability is to expand the geographic area
you are investigating by looking at regional health assessments conducted by collaborating
neighboring jurisdictions, or in the example above, expanding from county to state. A drawback to
this option is that you may then be examining results for a geographical area that does not
necessarily apply to your assessment. Analyzing data at the regional level may also mask
interesting local variations in the data.

WHO IS LIKELY TO HAVE COLLECTED THAT INFORMATION?

In some cases, you might know for certain that it exists; in others, you’ll have to search

around. Some places to start:

Public records.

These include government records at all levels: federal, state, county, and local. Copies of

publicly funded studies (after publication), financial information, crime statistics, demographic

information, and much more are available in public records.

Some you might be most interested in:

• Census Bureau. In most developed countries, the census covers a broad range of

demographic, economic, and geographic information.

• Federal and state departments and ministries. From environmental data to farming

practices and subsidies to poverty statistics to public health issues, the federal

government is a vast storehouse of information.

• Various levels of the court system. In the U.S., where civil and criminal trials and their

results are public, their records are public also.

• Police records. Arrests, domestic disputes, injury reports, and other information can be

found in police reports.

• Securities and Exchange Commission and other business regulators. The SEC and other

regulators require businesses to file various information, usually annually, including

annual financial reports and environmental statements, all of it public.

• County commissions, agencies, and authorities. County Extension Services in the U.S.

(affiliated with the U.S. Department of Agriculture) can be particularly helpful.

• City and town clerks’ offices.

Sometimes, government agencies are reluctant to share information, even though it’s public.

The Freedom of Information Act (FOIA) deals with this issue in the U.S. It allows for access

to a wide range of federal government records. Similar laws at the state level do the same for

state documents.

• Research organizations. Think tanks, independent oversight organizations, and research

organizations all issue reports on various topics, often backed up by studies.

Some of these organizations aren’t, and don’t pretend to be, politically neutral. They have

agendas, conservative or liberal, and some of them interpret their research in light of those

agendas. It’s important to be aware of the bias of any archival data that you use if you want

reliable data. However, many organizations with a political stance nonetheless try to make

their studies as objective as possible.

• Academia. Much research in health, human services, social issues, education, the

environment, and the sciences is conducted by universities and institutions connected to

them. This includes theses and dissertations for advanced degrees, as well as the results

of funded research. Web search engines, such as Google Scholar, can help locate research

information.

• News media. Newspapers, magazines, and radio and TV outlets all keep archives, often

going back to the founding of the publication or station. These are often available to the

public – sometimes online – either free or for a fee. Although they are unlikely to

contain detailed study results, they often have summaries of important studies, and may

serve to point you in the right direction to find what you need.

• Foundations and other private funders. These organizations fund studies of all kinds,

and many publish or otherwise make available the results as a condition of funding.

• Hospitals and other health care providers are sometimes university-related, and may

conduct studies of various health issues. They also may collect, as an administrative

necessity, demographic and other statistics on their patients, as well as information on

the frequency, geographical location, and intensity of various medical conditions.

• Mental health providers may have data on particular types of conditions, or on who is

most at risk for particular behaviors or conditions (e.g. depression).

• Human service and other non-governmental community-based organizations. The

information most likely to be gleaned from these organizations is administrative, covering

such areas as demographics and the location and character of community issues.

Depending on its nature, some of the research carried out or administrative data gathered by

universities, health and mental health providers, and human service organizations may have

some restrictions on them because of confidentiality. These restrictions usually only cover

access to individual records and identification of study participants, and generally don’t pose a

barrier to obtaining aggregate results of studies, assessments, or surveys with no identification

of individuals.

• Advocates and watchdog organizations may collect data (either locally, statewide, or

nationally) on businesses, on the environment, and on other particular issues – nearly

anything that pertains to their causes – and they’re usually willing to share it.

• Community activists. These folks tend to focus on specific issues, but if your issues are

similar to theirs, they may have a great deal of information that’s useful to you.

• Community economic development organizations are likely to have economic data,

land-use maps and patterns (perhaps including population distribution by race, ethnicity,

age, etc.), environmental information, and other similar material you might find useful.

• Businesses and corporations, particularly large ones, often collect information on their

workforces, economics and economic trends, and similar topics.

WHERE SHOULD YOU LOOK FOR ARCHIVAL DATA?

The question here is not only where to find archival information, but where to find it most

quickly and easily. Some of this material will be published, some only available from the

organizations that collected it. Looking in the right place first can save you a lot of time and

trouble.

Your own archives

Unless it’s brand new, your own organization should have an archive of administrative records,

past evaluations, assessments, and other data that might be helpful to you. Don’t ignore this

obvious and easily accessible source of information.

The Internet

Most public documents are either on the Web or can be found and/or ordered through a

website. The place to start is usually the website of the government agency most likely to have

collected the data. The Resources portion of this section contains a list of U.S. government

websites. In the U.S., states and most cities and towns have websites as well, with links to state

or municipal agencies and departments. (The URLs for most state websites take the same form:

http://www.[state abbreviation].gov. Municipal websites can easily be found by searching the

name and state of the community.) States or provinces and communities in most of the

developed, and much of the developing, world have websites as well.

Many of the other sources of information mentioned above are likely to have websites also.

Whether their data is available on those sites is another matter, and depends to some extent on

what kind of information you’re seeking. Watchdog organizations and some think tanks are

likely to post at least some of the results of their research on websites because they want it to be

as public as possible. Community economic development organizations likewise usually have

informative websites, since they’re trying to attract businesses and residents to an area.

Health providers and academics, on the other hand, may post their research on a website, but

only after it’s been published in a journal or book, or presented at a conference. That means that

you’re not apt to find very recent data (from the past year, for example). Local health and

human service providers and schools rarely conduct formal research, and rarely post any

administrative data on their websites, for two reasons: confidentiality, to which we’ve already

referred, and the fact that most of that data are intended for internal use, and therefore not seen

as useful to anyone outside the organization. Business websites generally include material only

of interest to potential customers. Community activists may or may not have websites at all.

As always when using the Internet, you should be cautious about where you find your

information. There are enormous numbers of reliable websites...and huge numbers of unreliable

ones as well. If you’re not sure of a website or of the information you get from it, try to find

that information elsewhere as well. In general, you can rely on websites when you know where

they get their information, and when you trust the reputation and integrity of the site’s owner.

Go directly to the source

Often, the best way to find information from health and human service organizations, schools,

and businesses, as well as from advocates and community activists, is to go to them directly. If

you do, be prepared to explain exactly what you’re looking for, what you plan to use it for, and

what you can offer in return. Unless the organization is willing to let you comb through its files

– confidentiality is often a barrier to that – someone will have to spend some time finding what

you need. It’s only fair to offer something in return, whether it’s payment, data analysis

services, advocacy for the cause, or something else the other organization needs.

If you’re asking another similar organization for data so you can use it as a comparison or

control group, the request has to be extremely tactful. In a sense, you’ll be telling the staff of

that organization that you expect your results to be better than theirs. Depending upon how they

see their work – and how they perceive you and your organization – they may take this as an

opportunity to find better methods to serve their participants, or as a grave insult. If it’s the

latter, they’re hardly likely to agree to the use of their data. You’ll have to frame the request in

the right way, and offer a good exchange as well. It will help if you’re dealing with an

organization with which you already have a good relationship of mutual respect.

Libraries

Librarians have always been world-class experts at finding what library users needed. With

current technology, they’ve become even better. Many have an encyclopedic knowledge of not

only what’s available in the library itself, but what’s on the web as well. They may be familiar

with sources of archival information you’d never think of, and be able to help you find what you

need quickly and with minimum effort. When in doubt, head to (or communicate with) an

available library.

WHAT ARE YOU PLANNING TO DO WITH THE DATA ONCE YOU HAVE IT?

This question has to do not only with what form you need the data in, but also just what data

you actually need. If you’re planning to use it as a comparison to the group participating in the

program you’re evaluating – whether as a formal control group or as baseline data – you’ll need

information on the variables you’re planning to look at, as they relate to the population you’re

working with, or at least a population that’s reasonably similar. If, for example, you’re

evaluating a chronic disease prevention program intended to benefit Latinos, and you’ve found

archival data on physical activity and nutrition among Native Americans, you can’t compare

your results with those of the archival data because the groups are likely to be too different.

If you’re planning to subject your data to statistical analysis, you’ll want information that either

is, or can be made, quantitative. If the information you’re collecting on your participants is

largely qualitative, then the archival data should be qualitative as well. Furthermore, the

information you get either should determine or should match the way you collect your own data,

so that there’s a reasonable comparison, assuming a comparison is what you’re intending.

USING ARCHIVAL DATA

It’s difficult to imagine evaluating a program or approach without actually collecting your own

data on participants. You might be able to find data on those participants from an earlier time,

which you can then use as a baseline. You might be able to find appropriate data on a similar

group that you can use as a comparison or control. But you can’t find data elsewhere on what

those participants are currently experiencing, and that’s what you’re evaluating in almost every

case.

The “almost” here refers to a situation where you’re evaluating a program in retrospect –

looking back at it after it’s underway or been completed. It may be possible in that case to find

archival data that will allow you to determine the program’s effectiveness in terms of process,

outcomes, or both.

Although you’ll probably collect information on the participants in the program you’re

evaluating, there are a number of ways you might use archival data:

• To better understand the context of your evaluation. These might be ethnographic data

(see Section 6 of this chapter), oral histories, assessment information, interviews, etc.

You’d use it to get a clearer picture of the community in a number of ways, and to help

you interpret the results of your evaluation. It might, for instance, give you insight into

why a particular approach did or didn’t work, or why some participants stayed in the

program while others didn’t.

• To identify areas to address. Along with a clearer picture of the community goes a

deeper understanding of the community’s needs and concerns.

• To establish a baseline against which to measure your results. For this purpose, you’d

need recent information about where the population you’re working with stands on the

dependent variables or outcomes you’re concerned with. That would tell you where the

participants started from (on average), so that you could see from the measures you used

in your evaluation whether and how much they might have improved as a result of your

work.

There are two kinds of variables (things that may change) in research. An independent

variable or intervention is a program, treatment, method, system, or other action or condition

set up by the researcher to see if it will create change and improvement. A dependent variable

is a behavior, condition, or other element that may change as a result of the independent

variable. A violence prevention program, for example, is an independent variable that may

change community members’ engagement in violent behavior and associated injuries, the

dependent variables.

• To identify already-existing trends that may affect the results of your evaluation study.

The fact that there’s been a change in participants between the beginning and end of your

evaluation doesn’t necessarily mean that you’ve caused it. Among other things, it may

be part of an ongoing trend toward change that started well before your program did, and

may continue after it. Archival data might show such a trend over a number of measures

of your dependent variable in the population your participants come from.

• To establish a standard of comparison against which to measure your efforts. There are

two ways that you could use archival data for this purpose. One is to use census,

statewide, and/or community-wide data to compare with that of the population you’re

working with. That comparison can give you a sense of how serious the issue is for your

group, compared to the general public. The second way is to use similar data to compare

your outcomes with the data on the larger population. This might work especially well

when you’re using community-level indicators (e.g., rate of injuries, percentage of girls

completing different education levels).

You might find, for example, that even though community-level indicators moved in the right

direction – the sale of tobacco products went down, say – they still compared unfavorably with

the state or national averages for the same indicators. That knowledge might be important in

future goal-setting and in using your evaluation results to gain community support or funding.

• To act as a control or comparison group. One of the best ways to learn whether or not

your program had an effect is to compare the participants you’re working with to those

in another group that received no program or a different one. The best alternative here is

to create a group from the same population as participants – so that all participants will

have approximately the same background, environmental influences, cultural norms, etc.

– and to conduct the same observations on both groups at the same times, so that the

only difference between them is the program that one of them is exposed to. In practice,

creating or finding a perfect control group is often difficult. Archival data may be able

to provide a reasonable alternative, in the form of data collected on a comparison group

or population similar to that of participants in your program.

Often, the most likely possibility is a group that was part of another program with the same

goal as yours, but using different methods. This has the advantage not only of providing a

control, but of letting you infer whether your approach works as well as, not as well as, or

better than that of the comparison group.

• To provide data for a longitudinal study. If you think your program might have a

long-term effect, or if you think it will interact with the effects of past events,

circumstances, or programs, you might want to conduct a longitudinal study – one that

looks at participants over a longer period of time – for your evaluation. You may not

have the time or resources to collect data over a period of years, but you may be able to

find archival information that allows you to draw some conclusions about long-term

effects.

There are at least two circumstances where you might be able to use archival data for a

longitudinal perspective. The first is one in which you’re looking at the effect of an issue on

the population for a length of time before your program began. This might make it easier to see

program results in context, and to understand whether the program broke a cycle and started

real change. The second circumstance is when you’re looking back at the effects of a program

that was completed some time ago. In some circumstances, the effects of a program multiply

or accelerate over time. Particularly if your program was aimed at changes throughout the

community (reducing intimate partner violence, for instance), you may be able to find archival

data that tells you whether the effects of your program continued, kept growing, or trailed off.
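
The baseline, trend, and comparison-group uses of archival data described above can be combined in one simple calculation. The sketch below is illustrative only: every figure is an invented placeholder rather than real data, and the function names are hypothetical. It compares the change among participants with the change in an archival comparison group over the same period, an approach often called a difference-in-differences estimate.

```python
from statistics import mean


def change_from_baseline(baseline, followup):
    """Average follow-up value minus average baseline value."""
    return mean(followup) - mean(baseline)


def difference_in_differences(part_base, part_follow, comp_base, comp_follow):
    """Participants' change minus the comparison group's change."""
    return (change_from_baseline(part_base, part_follow)
            - change_from_baseline(comp_base, comp_follow))


# Hypothetical placeholder values (e.g., injuries per 1,000 residents).
participants_baseline = [12.0, 14.0, 13.0]  # archival data, before the program
participants_followup = [9.0, 10.0, 8.0]    # your own evaluation data
comparison_baseline = [12.0, 13.0, 14.0]    # archival data, similar population
comparison_followup = [11.0, 12.0, 13.0]    # archival data, same later period

raw_change = change_from_baseline(participants_baseline, participants_followup)
trend_change = change_from_baseline(comparison_baseline, comparison_followup)
net_change = difference_in_differences(
    participants_baseline, participants_followup,
    comparison_baseline, comparison_followup,
)

print(f"Participants changed by {raw_change:+.1f}")
print(f"Comparison group changed by {trend_change:+.1f}")
print(f"Change beyond the existing trend: {net_change:+.1f}")
```

A result like this is only as trustworthy as the match between the two groups. If the archival comparison group differs from your participants in important ways, the net change may reflect those differences rather than the program.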

IN SUMMARY

Most government agencies and departments, community-based health and human service

providers, advocacy organizations, universities, and many other entities keep archival records of

information. You may be able to use these as part of the data for your evaluation, saving time

and trouble. Especially for small organizations with limited resources, the use of archival data

can make it possible to produce an evaluation that provides the information needed to accurately

assess a program’s effectiveness and make the changes necessary to improve it.

Chapter 8

REFINING THE PROGRAMME OR INTERVENTION BASED ON EVALUATION

RESEARCH

A Community Health Center conducted an evaluation of its program to promote physical

activity among those with higher risk for heart disease. The evaluation showed mixed results. A

small number of participants (15%) had very good outcomes. They had marked increases in

physical activity and improved nutrition. Their fitness improved and they lost weight. As

predicted, their blood pressure dropped, their pulse rates went down, and they reported feeling

more energized.

They reported high levels of satisfaction with the program and results.

A large majority of the original group (70%) exercised, but not as regularly as hoped. The health

benefits for this group varied, with several reducing blood pressure at least slightly, and the rest

maintaining the levels they had entered with.

A final group (15%) consisted of dropouts – several participants left the program, most within a

short time – and other people who simply never managed to exercise on any schedule at all.

There was virtually no change in their weight, blood pressure, or sense of well-being, except

for a small number who had relatively positive results.

What could the Community Health Center do with these results? It knew that, while the

intervention apparently worked if people stuck with it, the program was only partially

successful. How could it use the evaluation to improve the program, and so improve the health

of those it served?

This chapter so far has discussed the elements of conducting a research-based evaluation. But

evaluation itself is only a means to an end: a tool to help you see what is happening so you can

improve the effectiveness of your work. In this section, we’ll examine how you can use your

research – the results of your evaluation – to do just that.

WHAT DO WE MEAN BY REFINING THE INTERVENTION?

Data allow you and other group members to critically reflect on your work and look for opportunities to improve.

Some key reflection questions that you and your group might consider:

• What are we seeing? (e.g., amount and kind of activities implemented; results shown –

increases, decreases, trends)

• What does it mean? (e.g., was the introduction of the intervention associated with

changes)

• What are the implications for improvement? (e.g., do the results suggest that the

intervention should be sustained, altered, discontinued; what changes are suggested)

The reflection questions you ask will depend on the nature of your intervention, but the above

set of questions is a good starting point. Consider holding a meeting or brief retreat where the

evaluation results can be presented through graphs and charts, and key questions can be

discussed. Such a meeting might benefit from an experienced facilitator to keep the process

moving toward consensus for specific recommendations on how to improve.

Refining the intervention is the process of making your work more effective by using data

collected from your evaluation.

Depending on what you’ve learned from this data, you might want to:

• Increase or strengthen your intervention in certain areas or with particular groups

• Change or eliminate elements of the intervention that didn’t work well

• Adjust your intervention to changing conditions or needs in the community

To continue with our example from above, the Community Health Center staff and selected

participants met to review the results. They felt that the evaluation had shown that if people

exercised regularly, they could lower their blood pressure, lose weight, and improve their overall

health.

A key implication of the findings was how to help people establish and stay with an exercise

routine.

Further dialogue about results of the evaluation left the Community Health Center with additional

questions:

• How can we increase the number of participants who actually adopt and continue regular

exercise and other healthy behaviors?

• Why did some people who didn’t exercise regularly reduce their blood pressure, and should

we add another component (e.g., healthy nutrition) to our program?

• What other factors, if any, besides exercise seem to help participants exercise regularly and

lower their blood pressure (e.g., wellness group, medication)?

By focusing on the key reflection questions – What are we seeing? What does it mean? What are
the implications for improvement? – the Center should be able to refine its program to get even
better results for more participants.
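
One way to begin answering the first reflection question, "What are we seeing?", is to tabulate results by how fully participants took part. The following is a minimal sketch; the records are invented placeholders and not the Health Center's data, and the group labels and the `summarize` function are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (participation level, change in systolic blood pressure).
records = [
    ("regular", -12), ("regular", -8),
    ("irregular", -3), ("irregular", 0), ("irregular", -1), ("irregular", 0),
    ("none", 0), ("none", 1),
]


def summarize(records):
    """Return {level: (share of all participants, mean change)}."""
    by_level = defaultdict(list)
    for level, change in records:
        by_level[level].append(change)
    total = len(records)
    return {
        level: (len(changes) / total, mean(changes))
        for level, changes in by_level.items()
    }


summary = summarize(records)
for level, (share, avg_change) in summary.items():
    print(f"{level:>9}: {share:.0%} of participants, mean change {avg_change:+.1f}")
```

A table like this describes what happened but does not explain it. The second and third reflection questions still need discussion with participants and staff.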

It will be important for you to meet with other members of your group to review the data,

identify key areas for improvement, and brainstorm and come to consensus on how to address

issues that have been raised. Careful attention to your evaluation results can help inform which

courses of action you should take to improve your efforts.

WHY SHOULD YOU USE YOUR EVALUATION RESEARCH TO REFINE THE

INTERVENTION?

Refining the intervention is the primary purpose of an evaluation. If you find out that your

intervention wasn’t effective, you have three choices: you can quit; you can blindly try another

approach; or you can use your evaluation research to guide you towards a more effective

intervention.

Using evaluation results is vital: it points you in the direction that your research tells you is apt

to be most helpful. Using research to help you choose your course of action also establishes you

as a credible and practical organization, one that’s concerned with what works. That kind of

reputation is likely to increase your opportunities for getting funding and other resources, and to

gain and sustain your community support. Most importantly, it helps the group succeed in

addressing the important problems or goals of your community.

WHEN SHOULD YOU REFINE THE INTERVENTION?

The short answer to this question is “constantly.” Monitoring and evaluation should go on

throughout the life of the program or project, and should be used to adapt and adjust what you

do on an ongoing basis. In practical terms, it’s wise to reevaluate your work regularly – once a

year is typical – and make any major changes at that time. Of course, you can and should make

minor adjustments throughout the year, based on your monitoring and on feedback from

participants, staff, and others who implement or experience the intervention.

There are, in addition, some specific times when adjusting your work can be especially

helpful:

• When what you’re doing isn’t working. If it’s obvious that your work isn’t having the

desired effect, it’s time to consider what you need to change.

Make sure that you allow enough time for a program or intervention to have an effect before

you make a judgment that it isn’t working. Nothing happens overnight, and the more difficult

the issue you’re addressing, the longer it’s likely to take to influence intended outcomes. You

have to walk a line between cutting a program off before it’s had time to work and letting it go

on after it’s shown itself to be ineffective.

• When participants are dropping out at a high rate. What are you doing – or what are

the external factors – that might be causing participants to leave your program? How can

you change the intervention to assure that people experience it long enough for them to

benefit?

• Between sessions of a time-limited or sequential program. Some programs – like the

exercise program used as an example – are only designed to run for a limited period, but

may run again and again, with new participants each time. If such a program is

continually evaluated, you’ll get – and should use – information each time that will help

you make the next round of the program better.

• When funders or participants ask you to adjust some aspect(s) of your program.

Your evaluation research should be helpful in determining how to respond to the

funder’s or participants’ requests.

• When funding or other resources are reduced. You may be faced with eliminating

parts of your program, cutting numbers of participants, or other unpleasant choices. Your

evaluation research can help you find the best way to make cuts without losing your

effectiveness, by keeping intact the elements of the program that make the most

difference.

• When the issue or goal changes. Sometimes there is a shift in priority issues for the

community following a rise in unemployment or violence. Your research can tell you

that, and suggest ways of dealing with the change in conditions.

WHO SHOULD BE INVOLVED IN REFINING THE INTERVENTION?

The best plan here is to involve a number of stakeholders, depending to some extent on who has

been involved in the planning and evaluation of the effort.

Some people who definitely should take part:

• Participants. These are the folks who experience both the intervention itself and its

effects, and they are likely to have ideas about what would make it better, easier for them

to participate, or more relevant for them. Participants should be your partners in refining

programs and interventions, since they have an inside perspective on whether they are

working.

• Staff members, paid or volunteer. Like participants, staff members have a unique

perspective on the intervention. Not only do they see the way it works every day, but

they’ll also have to carry out any changes. If they can claim ownership of those changes

by participating in the planning process for them, they’re far more likely to understand

them properly and to be eager to make them work.

• People who are directly or indirectly involved in supporting the work. Depending

upon the nature of your issue, these might include educators, government officials,

health professionals, employers, funders, or others. Since their contribution is needed to

make any changes successful, it’s important that they have input into the planning of

those changes. They’ll need to understand and support them if the adjusted intervention

is to go well.

• Those who led and participated in the evaluation. They’ll have a good handle on what

the evaluation showed, and a grasp of what might need changing and how.

For example, the Community Health Center put together a team to look at the evaluation results

and make some recommendations for changes in the program. The team included a variety of

participants who had experienced different outcomes, a health care provider, a Center board

member, and a staff member from the university that conducted the evaluation. They went over

some of the research that the Center had used in developing the program, and carefully studied

participant interviews and other evaluation material, as well as the records kept by program

staff.

HOW DO YOU REFINE AN INTERVENTION BASED ON RESEARCH?

Changes in interventions should be focused on one or more of the three aspects of evaluation:

Process (both your process – activities implemented, doing what you intended, etc. – and

participants’ process – what did they actually do?), impact, and outcomes. You have to

examine each of these separately, and ultimately integrate them to decide what adjustments you

need to make in your intervention.

Each aspect of the evaluation builds on what comes before. In order to have the impact you

want, you have to put together and run your program well, and that’s a matter of process. If your

process didn’t go properly, then you haven’t really conducted the program you planned for. If

you didn’t get the impact you hoped for, it may be due to the fact that you simply didn’t do what

you planned, and the first adjustments should be to the process, to ensure that the intervention is

implemented as intended.

Similarly, to get the outcomes you intend, the program has to have an impact on the appropriate

risk and protective factors or other environmental conditions. If the program had the impact you

envisioned, but not the outcomes, then adjustments need to take place at the impact level,

perhaps in the risk and protective factors and/or conditions that influence outcomes.
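
The reasoning above, in which each aspect builds on the one before it, can be written as a short decision rule. This is a simplified sketch rather than a prescribed procedure: the function name and the four labels are hypothetical, and the "methods" branch (process went as planned but there was no impact) is an assumption about where to look next.

```python
def first_adjustment(process_as_planned, impact_achieved, outcomes_achieved):
    """Suggest where to look first; a simplification of the reasoning above."""
    if outcomes_achieved:
        return "refine"    # goals met; look for ways to improve further
    if impact_achieved:
        return "impact"    # targeted factors changed, but outcomes did not
    if not process_as_planned:
        return "process"   # the program was not run as intended
    return "methods"       # run as intended, yet no impact (assumed label)


# Each tuple: (process as planned, impact achieved, outcomes achieved)
cases = [
    (False, False, False),
    (True, False, False),
    (True, True, False),
    (True, True, True),
]
for case in cases:
    print(case, "->", first_adjustment(*case))
```

Real evaluations rarely reduce to three yes-or-no answers, so treat the result as a starting point for discussion and not as a verdict.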

PROCESS

An evaluation of the process of your effort compares what you planned to do with what you

actually did.

Process has a number of elements to which evaluation might be applied. They encompass both

logistics (the handling of details, such as finding space and buying materials) and program

implementation (methods, program structure, etc.).

These elements can include:

• Community participation. Were you able to involve members and sectors of the

community that you intended to? Were you able to make good contacts and establish

relationships within the priority population?

• Community assessment. Did you conduct an assessment of the situation in the way you

planned? Did it give you the information you needed?

• Program planning. Was the planning participatory? Did it include research into best

practices and successful interventions? Did it result in an approach that everyone felt

would work?

• Staff hiring and/or volunteer recruitment. Did you hire staff and/or recruit volunteers

that were the right people for the jobs?

• Staff and/or volunteer training. Were staff and/or volunteers oriented and trained

before they started, so that they knew what they were doing when they began work? Was

there ongoing training?

• Outreach to and recruitment of potential participants. Was outreach successful to

engage those from the groups intended? Were you able to recruit the number and type of

participants intended?

• Implementation strategy. Here, you’re determining both what you actually did in

implementing the program, and what participants actually did. Did you structure the

program as planned? Did you use the methods you intended to? Did you arrange the

amount and intensity of services, other activities, or conditions as intended? Did you

obtain and use the materials and equipment you expected to? Did relationships develop

as envisioned? Did participants actually do or experience what you had intended?

• Evaluation strategy. Did you conduct the evaluation as planned? Did you gather data

related to process, impact, and outcome?

• Timelines and benchmarks. Did you complete or start each of these elements in the

time you planned for? Did you complete key milestones or accomplishments as planned?

If all or most things went as planned, and any that didn’t were trivial, you’ve essentially

done what you set out to do. If they didn’t, there are a number of possible reasons for changes

in the intended process:

• It took more time than you expected to complete one or more important tasks (finding

and hiring key staff is a typical one here)

• It was harder than you expected to accomplish a particular task. This may be a matter of

time spent, but it may also mean that you simply didn’t have the skills or personnel to do

what you needed to

• Something you had good reason to expect didn’t happen (e.g., funding or support that

you expected didn’t come through)

• Someone or some organization you depended on didn’t come through (e.g., a hired staff

member became ill and did not finish the work on time)

• More participants dropped out than you anticipated

• More people participated than you anticipated

• Partway through, you found that the methods you had planned didn’t work well, and you

had to make adjustments

• A funder or community advisory board asked you to change some of what you were

doing

• Partway through, you became aware of a new method that seemed to be extremely

effective, and you switched to implement it

• You discovered a more successful way of doing things in the course of the work, and

adopted it

• You underestimated the resources necessary to carry out your original plan, and had to

scale back (or look for more funding/volunteer help/space/other support)

• You encountered opposition

• You encountered unexpected difficulties (someone quit, materials/equipment weren’t

available from the supplier)

• You encountered disaster (e.g., the site burned down, the program coordinator became

ill, a staff member got arrested or misused your funds)

• You simply didn’t pay attention to following the plan, and/or didn’t do your job as an

organization

The deviation from your plan may have made very little difference at all, or it may have

made all the difference. Some differences might be positive – a delay might make it possible to

find a more stable funding source; a change in method might make for a more effective program

– but they’re still differences. It’s worthwhile to understand what changed to make sense of the

evaluation results and to make any needed adjustments.
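
Since a process evaluation compares what you planned with what you actually did, one practical starting point is a side-by-side record of planned and actual milestone dates. The sketch below assumes such a record exists; the milestone names and dates are invented placeholders, and `compare_plan` is a hypothetical helper.

```python
from datetime import date

# Hypothetical plan and record of what happened (placeholder names and dates).
planned = {
    "hire staff": date(2024, 1, 15),
    "train volunteers": date(2024, 2, 1),
    "begin sessions": date(2024, 3, 1),
}
actual = {
    "hire staff": date(2024, 2, 20),
    "train volunteers": date(2024, 2, 1),
    # "begin sessions" has no entry: it was never completed.
}


def compare_plan(planned, actual):
    """Return {milestone: days late, or None if not completed}."""
    report = {}
    for milestone, due in planned.items():
        done = actual.get(milestone)
        report[milestone] = None if done is None else (done - due).days
    return report


report = compare_plan(planned, actual)
for milestone, delay in report.items():
    if delay is None:
        status = "not completed"
    elif delay > 0:
        status = f"{delay} days late"
    else:
        status = "on time or early"
    print(f"{milestone}: {status}")
```

The list of deviations is only the first step. For each one, note the reason (from the possibilities listed above) and whether it helped or hurt the program.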

Perhaps you implemented your process according to plan, and your program ran as intended. By

contrast, the process may have been filled with difficulties – opposition, no community support,

difficulty recruiting participants, missed deadlines. Does that mean that all the work you put

into planning was unnecessary?

It’s likely that the answer is the opposite. If you were able to carry off implementing a program

regardless of the fact that your plans were disrupted, it’s a good bet that the clear vision of what

you wanted to do kept you on track.

Taking a close look at how you managed to overcome the obstacles in your way will help you

understand how to avoid them in the future. (Avoiding all obstacles is unusual in any

community work. The key is learning how to anticipate and overcome them.)

It’s also possible that the process leading up to the program went as planned, but the

implementation didn’t turn out as expected. In that case, it was probably your plan that was at

fault.

Some possible problems:

• You didn’t assess the situation or take some important aspects of preparation into

account

• You didn’t properly understand some aspect(s) of what you had to do to be successful

• You didn’t properly communicate some aspect(s) of what you had to do to staff,

participants, funders, or the community

• You underestimated the amount of money or other resources you would need

• You didn’t have proper fiscal control

• You ignored something important (treating participants with respect, for instance)

• You didn’t involve the community enough

• You didn’t factor in enough time for some aspect(s) of what you had to do (i.e., you

planned for a given time period and carried that out, but it was too short)

• You didn’t provide some important support for participants (travel, child care, stipend),

and a large number dropped out as a result

Finding out why your plan didn’t produce the intervention you expected can be helpful.
Understanding what you need to plan for, and how to do it, can make your future work both
more efficient and more effective.

IMPACT

Your program or initiative’s impact is the effect it had on the environmental conditions, events, or

behaviors that it aimed to change (increase, decrease, or sustain).

In most – but not all – cases, the immediate impact of the program is not the same as the eventual

intended results. Generally, a program aims only to influence one or more particular behaviors or

conditions – risk or protective factors. The assumption is that such influence will then lead to a

longer-term change, which is the ultimate goal of the program.

The intended impact of the Health Center’s exercise program, for example, is the adoption by

participants of regular exercise, a protective factor in reducing risk for chronic diseases. The goals

of the program, however, are actually better heart health, and, ultimately, a longer and

higher-quality life. Impact is the intermediate step – the influence you have on a behavior or other

factor that will in turn lead to the intended results.

Your process might have gone perfectly – you might have done exactly what you set out to do –

and might still have had no impact on the risk and protective factors you targeted. By the same

token, you may have ended up running a program markedly different from the one you planned,

and still have had the impact you hoped for. The results of the process evaluation will tell you

how closely you stuck to your plan in setting up and running your program. The results of your

impact evaluation will tell you whether your program produced the changes you intended.

In all these cases, evaluation should involve feedback from both participants who had good

results and those who didn’t. What worked particularly well for those who had success?

What were barriers to those for whom the program didn’t work well? It’s not always easy to

get participants to describe the positives and negatives – but it’s the best way to find out.

Your program worked as you planned if the behaviors and risk and/or protective factors changed

in the ways you intended. The big question that remains in this case is whether the changes your

program influenced led to the ultimate outcomes you were working toward. We’ll consider that

when we look at outcomes a little later in the section.

If your program actually had a negative impact on the targeted behaviors or risk and/or

protective factors – the intervention aimed to increase childhood immunizations, and fewer

children were immunized, for example – it is important to look more deeply into what is

happening.

Some possibilities:

• You failed to communicate your message, or its importance

• You underestimated or ignored cultural influences that were powerful enough that your

methods failed to overcome them

• You didn’t take into account other influences in participants’ lives that made it difficult

to achieve intended results; these factors could include poverty or competing demands

for time, among many others

• The cultural incompetence of the organization or some staff members worked against your

goals

• The structure and/or methods of the program led to unanticipated negative consequences

• The program was seen by participants as something that was being imposed on them, and

they had little influence on its design or implementation

Just as you might find that your process went well and your program still didn’t influence the

risk and protective factors you meant to, it’s possible that you created exactly the changes you

intended in risk and protective factors, and the program still didn’t achieve the outcomes

intended. We’ll look at outcomes to consider that situation.

OUTCOMES

The outcomes of an intervention are the changes that actually took place as a result of it. The goal

of an intervention is usually not just a change in behavior or circumstances, but the changes in

community health and development that occur as a result of that immediate change. A tobacco

control program, for instance, aims to help participants avoid or quit smoking: that’s its impact.

Its real goals – the hoped-for outcomes of the program – are reduced rates of heart disease, lung

cancer, and other smoking-related diseases for participants and their family members.

The ultimate outcomes may take years to assess, but others – like the blood pressure goals of the

Health Center exercise program, or the results of a job training course – can be determined at or

soon after the end of the intervention. Outcomes are the true measure of the success of the

intervention, because they are the reason it was conducted in the first place. However, the impact

made – such as changes in community programs or policies – can be an important intermediate

outcome since it can take years to see changes in longer-term outcomes.

The program produced the intended outcomes

If the program produced the outcomes you intended, congratulations: you’ve achieved the goals

of your effort. This isn’t the time to consider your work complete, however. How can you make

the intervention even better and more effective?

• Can you expand or strengthen parts of the program that worked particularly well?

• Are there evidence-based methods or best practices out there that could make your work

even more effective?

• Would targeting more or different behaviors or risk and protective factors lead to greater

success?

• How can you reach people who dropped out early or who didn’t really benefit from your

work?

• How can you improve your outreach? Are there marginalized or other groups you’re not

reaching?

• Can you add services – either directly aimed at program outcomes or related services such

as transportation – that would improve results for participants?

• Can you improve the efficiency of your process, saving time and/or money without

compromising your effectiveness or sacrificing important elements of your program?

Good interventions are dynamic: they keep changing and experimenting, always reaching for

something better. Programs can always be improved.

The program only produced some of the intended outcomes

If the intervention produced only some, or some lower level, of the desired outcomes, you may be

headed in the right direction. The program may also have greater effects in the long run, as

participants incorporate the changes they’ve made into their everyday lives.

Some possible reasons for the program’s effect not being as great as planned:

• You didn’t target sufficient risk and/or protective factors

• The program’s message didn’t reach participants or speak to them in a powerful way

• There were intervening factors – attendance or lack of support services – that made the

program less effective than it could’ve been

• Particular parts of the program didn’t work well

• Particular parts of the program weren’t implemented well

• You overestimated what was possible in the time available

• The program didn’t approach participants in the right way – it was too formal, the

language used posed barriers for some, etc.

• The program wasn’t culturally adapted for the population

• There were conflicts among participants or between participants and staff

For example, let's say that the Health Center’s exercise program wasn’t by any means a failure,

but it was only modestly successful. How might the Health Center use its evaluation information

to improve the results for program participants?

First, the Center could examine what participants said about the program. What enabled the

members of the most successful group to exercise? Why weren’t members of the much larger

group able to establish regular effective exercise routines? And for members of the third group –

those who didn’t exercise at all or dropped out quickly – what might have gotten them more

motivated?

Perhaps those in the first group attended all the sessions and found exercise partners who

challenged one another to do a little more (or to eat a little better). Perhaps those in the other

groups did not locate partners.

Based on the evaluation, the program’s designers decided that they should arrange for exercise

partners or groups for everyone. It seemed from the evaluation that both the social situation and

the challenge that exercising with others presented made exercise more likely and more fun, and

promoted a more vigorous workout. They also decided to develop a much more formal nutrition

component to the program, and to incorporate a buddy system into that component as well, in the

hopes that participants could help one another develop recipes and stick to a reasonable eating

plan.

The program produced no outcomes

If the program produced no outcomes at all, you may have to make big changes.

It can be very difficult to admit that you’ve been taking the wrong direction, especially after

investing a lot of time and effort in planning and implementing a program. It’s tempting to

believe that if you just work harder, or recruit different participants, or use better materials,

you’ll get the results you want. It takes courage to conclude that the results call for a major redesign

in the effort.

The program may have produced unintended outcomes, either positive or negative. If they’re

positive, you might want to understand how they came about so that you can continue to produce

them. If they’re negative, you’ll probably want to learn more so you can seek to eliminate them.

Most of the reasons for unintended outcomes are similar to those for lack of outcomes.

A positive unintended outcome in a youth violence prevention program, for example, might be

better school performance; a negative example in the same program might be an increase in

school dropout. Teens in the program might improve their school performance because they

admire a staff member with a college education, and want either to be like him or to impress him.

Or they may see college and an escape from the neighborhood as their best way out of the cycle

of violence.

Those who drop out of school as a result of the program may also do so because they see it as a

way to avoid violence: school – or the trip to and from school – may be especially dangerous

because of the presence of youth from other neighborhoods or rival gangs. Alternatively, they may

see dropping out of school in favor of work as a non-violent road to financial success, as opposed

to dealing drugs or other similar violence-prone activities.

Given all this, how do you approach your evaluation research to decide what you need to refine

and how? A good general approach is to work backward from outcomes, asking “but why?” at

each step to learn why each previous phase failed to produce the results you wanted.

Using the "But why?" Method to examine outcomes

• Examine the outcomes. If your intervention achieved the intended outcomes, it has done

its job. Now you can consider how to maintain these effects or refine your program (see

above). You should still examine the results for process and impact, and make changes

where they’ll gain you greater effectiveness or efficiency. But chances are the program

doesn’t need major changes, unless you want to enlarge your goals, or unless you’ve

found an alternative approach that could lead to even more impressive outcomes.

• Examine the impact. If your evaluation research shows no outcomes, or outcomes that fall

short of what you intended, the next area to examine is the impact of your program on the

targeted behaviors and risk/protective factors.

If the program had the impact you expected, but no outcomes, perhaps you’ve chosen the wrong

behaviors or factors to target, and need to rethink your problem analysis and related intervention.

There are other plausible explanations: your intervention wasn’t in place long enough, the effects

are delayed, your measures are insensitive to what is being achieved, etc.

• Examine the process. The next step here is to understand how well you planned, prepared

for, and implemented your intervention. If the reasoning and assumptions behind your

planning were accurate, and if you set up and implemented your program based on them,

you should have the impact you were aiming for, and that impact should lead to the

outcomes you intended. If your program didn’t go as planned, that could be a good part,

if not all, of the reason for your lack of outcomes. Your process evaluation can show you

where you need to adjust and improve your implementation to have a better chance to get

the intended results.

If your program did go as planned – you met your deadlines and did what you intended to do in

the way you intended to do it – and you failed to achieve your goals, there’s a good chance that

your planning was the problem. You may have aimed at insufficient risk and/or protective

factors, as mentioned above, or you may have chosen ineffective methods to influence the right

ones...or both. There are other possibilities that could be picked up by a process evaluation as

well, many of which have already been suggested – treatment of participants, language or other

communication issues, lack of cultural competence, etc. Identifying and correcting such

problems can help a program reach success.

• Keep making adjustments. Make your adjustments and refinements, run and evaluate the

intervention, and make further adjustments and refinements to improve your work. This

should be a continual cycle for the life of your program.
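
The steps above can be sketched as a small decision routine. This is only an illustration of the "But why?" reasoning, not a prescribed tool: the class name Findings, its three yes/no fields, and the wording of the suggestions are all assumptions made for the example. A real evaluation would rest on much richer evidence than three flags.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Findings:
    """Hypothetical summary of one evaluation cycle (field names are illustrative)."""
    outcomes_achieved: bool       # did the intended outcomes occur?
    impact_achieved: bool         # did the targeted behaviors/factors change as expected?
    implemented_as_planned: bool  # did the process evaluation show delivery as planned?


def but_why(f: Findings) -> Tuple[str, str]:
    """Work backward from outcomes; return (area to examine, suggested next step)."""
    if f.outcomes_achieved:
        # Outcomes were met: no major change needed, only maintenance and refinement.
        return ("outcomes", "Maintain the effects and refine for effectiveness or efficiency.")
    if f.impact_achieved:
        # Expected impact but no outcomes: the targets or the measurement may be off.
        return ("impact", "Rethink the targeted behaviors and factors; also consider "
                          "delayed effects, too short a duration, or insensitive measures.")
    if not f.implemented_as_planned:
        # No impact and the program did not go as planned: look at implementation.
        return ("process", "Adjust and improve implementation, then run and evaluate again.")
    # No impact even though the program went as planned: planning is the likely problem.
    return ("planning", "Revisit the planning: the risk/protective factors chosen "
                        "and the methods used to influence them.")


# Example: the program changed the targeted behaviors but outcomes did not follow.
area, suggestion = but_why(Findings(outcomes_achieved=False,
                                    impact_achieved=True,
                                    implemented_as_planned=True))
print(area, "->", suggestion)
```

The order of the checks mirrors the order in the text: outcomes first, then impact, then process. Whatever the result, the last bullet still applies, so the routine would be run again after each round of adjustments.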

IN SUMMARY

The purpose of an evaluation and the research that goes into it is not just to tell you whether or

not your intervention has been a success. The real value of evaluation research lies in its ability

to help you identify and correct problems – as well as to celebrate progress. Evaluation can

pinpoint the strengths of your program, and help you to protect and enhance those strengths and

make them even stronger.

By examining the three elements of an intervention – process, impact, and outcomes – your

evaluation can tell you whether you did what you had planned; whether what you did had the

influence you expected on the behaviors and factors you intended to influence; and whether the

changes in those factors led to the intended outcomes. That knowledge can show you what you

might change to improve your program, as well as the overall effectiveness of the intervention.

And, the information can be used to celebrate the accomplishments you are making along the

way.

ASSIGNMENT:

1. Why is choosing the right question important in Monitoring and Evaluation?

2. Using archival data has its own bottlenecks. Name five and explain how to overcome them.

3. Why is research an important component of Monitoring and Evaluation? Give and explain four reasons.
