How to develop a program evaluation plan
This resource has been designed specifically for Communities for Children (CfC) service providers to assist them in meeting the evaluation criterion for the 50% evidence-based program requirement. However, it may be used by anyone who wants to evaluate the outcomes of a program. This resource is provided as a guide only.
Focused on evaluating outcomes, this resource will enable you to measure what the effect of your program has been on participants. The final section provides some additional information on process evaluation, which you may also wish to use if you want to understand why you are getting these outcomes and how your program could be improved.
What is evaluation?
Evaluation is something that we all do informally every day. We ask questions and make judgements and decisions based on the information we receive. Evaluation of a program simply formalises this through a systematic process of collection and analysis of information about the activities, characteristics and outcomes of a program with the purpose of making judgements about the program (Zint, no date, citing Patton 1987). Evaluation can take place at different times during a project for different purposes.
We do evaluation for two main reasons. Firstly, to determine whether or not our programs are making a positive difference and, secondly, to understand how and why a program has worked (or not) and how it can be improved.
How is evaluation used?
Evaluation can be used to improve programs and make decisions about the allocation of resources and the continuation of programs. Evaluation can also be used to advocate for the continuation or expansion of a program or for accountability to funders, community members, and program participants (Flinders University, 2013).
Are there different types of evaluation?
Yes, there are many different types of evaluation. For a program to meet the 50% evidence-based program criterion within the CfC activity, it needs an outcome evaluation. CfC service providers may wish to conduct other types of evaluation in addition to an outcome evaluation; however, these are not required by CFCA to assess your program.
Broadly speaking, evaluations fall into the following categories:
A needs assessment is often undertaken as part of program planning to determine what issues exist in an area or with a particular group and what the priorities for action are.
Process evaluation is undertaken while a program is being implemented or at certain milestones within program implementation. Process evaluation provides information on how a program is working. It is usually used to improve a program and to see if a program is being delivered as intended (and, as such, if it is on track to achieve its outcomes).
An outcome evaluation looks at what has changed because of the program. It can be undertaken at the end of a program or at a particular stage (e.g., on conclusion of a 12-week program or annually). The time frame will depend on the program (World Health Organization [WHO], 2013).
Impact or summative evaluation
Impact evaluation is usually undertaken at the conclusion of a program and looks at the total effect of a program.
Conducting an evaluation
There are four phases to an evaluation:
- planning;
- data collection;
- data analysis; and
- write up and dissemination.
This resource focuses on the planning phase to support CfC service providers to meet the evaluation criterion of the 50% evidence-based program requirements. An evaluation plan template has been provided. Filling out this template will help you to meet the requirement for an evaluation plan under the Emerging Program category of the CfC evidence-based program requirements.
A note on terminology
Evaluation terminology can be confusing, as people sometimes use the same terms to mean different things. We use “outcomes” to describe the short-, medium- and long-term changes that occur as a result of the program; however, some people use the term “impact” to describe these changes, or use the two terms interchangeably. It is always useful to clarify terms to ensure everyone is on the same page.
Developing an evaluation plan for an outcome evaluation
An evaluation plan is an essential first step in undertaking an evaluation. An evaluation plan simply outlines what you want to evaluate, what information you need to collect, who will collect it and how it will be done. It is best to prepare your evaluation plan when you are planning your program, although you can do it later if you need to (Pope & Jolly, 2008).
Before developing an evaluation plan, you must have defined the goals and objectives of the program and the strategies or activities it will use to achieve those goals. A clear and well-developed program logic that is based on the evidence and has a strong and coherent “theory of change” should do this.
It is also important to include all key stakeholders in the development of your evaluation plan. This will ensure that you get different perspectives and that everyone is on board and willing to do the work of evaluation. Developing an evaluation plan collaboratively is particularly important if you are working in partnership with community members or with another service.
There are some key steps and considerations when planning for an evaluation (see Figure 1). These are outlined in more detail below.
Figure 1: Key steps in planning an evaluation
1. Why do I need to evaluate?
Identifying evaluation purpose and audience
The first thing to do when developing your evaluation plan is to consider how the evaluation will be used. Who is the audience for the evaluation? It could be your funders, program staff, managers who make decisions about the future of programs, or it could be community members who have been involved in the program. Each of these different groups is likely to want to know different things about the program. For example, program staff may want to know whether program participants are enjoying the activities, and program funders may want to know whether the program is achieving its intended outcomes.
For CfC service providers one of the key purposes of undertaking an outcome evaluation will be to meet the 50% evidence-based program requirement, and the audience will include the Department of Social Services (DSS) and Expert Panel staff from the Australian Institute of Family Studies (AIFS) as part of the program assessment process.
2. What do I need to find out?
Deciding on evaluation design
It is not possible to recommend a single evaluation design, as the most appropriate evaluation design depends on the purpose of the evaluation and the program being evaluated. Some factors to consider are:
- the type of program or project you are seeking to evaluate;
- the questions you want to answer;
- your target group;
- the purpose of your evaluation;
- your resources; and
- whether you will conduct an evaluation internally or hire an evaluator.
For the CfC evidence-based program requirement, the design must include pre- and post-testing as a minimum - this means collecting data (pre-testing) from participants at the commencement of the program that can be compared with data collected at the end of the program (post-testing). Only collecting data at the end of the program is not enough to meet the CfC evidence-based program requirement. The evaluation will be strengthened further if you are able to use a comparison or control group (people who did not receive the program). These types of evaluation designs are discussed further in the CFCA Resource Sheet Planning for Evaluation I.
There are a range of other evaluation methodologies and approaches such as empowerment evaluation, most significant change and developmental evaluation that can be used in appropriate and justifiable circumstances. These are approaches to evaluation rather than a design and, by themselves, are unlikely to contribute to the CfC evidence-based program requirements.
Pre- and post-testing
A pre- and post-test evaluation design cannot determine whether a program has caused a change; it can only measure whether or not a change has taken place.
For more information, see the CFCA resource Planning for Evaluation.
Additional resources on evaluation design:
Better Evaluation website, information on various approaches to evaluation
3. What will I measure?
If you have already identified the short, medium and long-term outcomes in your program logic, this makes doing an outcome evaluation much easier, as you will have outlined what you need to measure and you may have already specified the time frames in which you expect those outcomes to occur.
Figure 2 demonstrates how a program logic can be used to develop key evaluation questions and indicators.
Figure 2: Developing key evaluation questions and indicators
Source: Taylor-Powell, E., Jones, L., & Henert, E. (2003), p.181
Although you may have identified multiple outcomes for your program in your program logic, evaluation requires time and resources, and it may be more realistic to evaluate a few of your outcomes rather than all of them. When selecting which outcomes to measure, there are a number of factors to take into consideration. The following questions, adapted from the Ontario Centre of Excellence for Child and Youth Mental Health (2013), will help you to make these decisions:
- Is this outcome important to our stakeholders? Different outcomes may have different levels of importance to different stakeholders. It will be important to arrive at some consensus.
- Does this outcome align with the intended objectives of CfC according to the operational guidelines? All evaluated outcomes must meet this requirement.
- Is this outcome within our sphere of influence? If the focus of a program is on parenting skills, it is unreasonable to expect that it would contribute to parental employment outcomes because some parents were referred to an employment program.
- Is this a core outcome to your program? A program may have a range of outcomes, but you should aim to measure those which are directly related to your goal and objectives.
- Will the program be at the right stage of delivery to produce the particular outcome? Ensure that the outcomes are achievable within the timelines of the evaluation. For example, it would not be appropriate to measure a long-term outcome immediately after the end of the program.
- Will we be able to measure this outcome? There are many standardised measures with strong validity and reliability that are designed to measure specific outcomes (see the Outcomes measurement matrix). The challenge is to ensure that the selected measure is appropriate for and easy to administer to the target population (e.g., not too time-consuming or complex).
- Will measuring this outcome give us useful information about whether the program is effective or not? Evaluation findings should help you to make decisions about the program, so if measuring an outcome gives you interesting, but not useful, information it is probably not a priority. For example, if your program is designed to improve parenting skills, measuring parental employment outcomes will not tell you whether or not your program is being effective.
Once you have determined which outcome/s you will focus your evaluation on, you should form these into evaluation questions. An evaluation question is specific, measurable and targeted, to ensure that you get useful information and you don’t have too much data to analyse (Taylor-Powell, Jones, & Henert, 2003; WHO, 2013). When you are developing your evaluation questions, you should ensure that you will be able to find or collect data to answer the question without too much difficulty.
For example, if the outcome is improved child development, it may be useful to narrow this to a specific domain of child development that can be measured more easily through pre- and post-testing. Your evaluation question might be “To what extent have children in the program improved their social skills?”
It is important when conducting an outcome evaluation to include a question about “unintended impacts”. Unintended impacts are things that happened because of your program that you didn’t anticipate; they can be positive or negative. Finding negative outcomes can be just as important as positive outcomes, as it gives you some idea about what is working in your program and what isn’t, and whether you need to make any changes. For an example of this, see the completed evaluation plan template.
An indicator is the information you need to collect to answer your evaluation question. For example, if your evaluation question is about how much literacy has increased, you might use NAPLAN scores as an indicator of changes in literacy. Outcomes and indicators are sometimes confused; the outcomes are the changes that occur as a result of your program and the indicators are the things that you see, hear or read that provide you with the information to know what and how much has changed (Ontario Centre of Excellence for Child and Youth Mental Health, 2013).
Some outcomes and evaluation questions might be best measured by more than one indicator. For example, increased parental involvement in school could be measured by attendance at school meetings, participation in parent-school organisations, attendance at school functions, and calls made to the school (Taylor-Powell et al., 2003).
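The relationship between an outcome, its evaluation question and its indicators can be written down as a simple record. The sketch below is illustrative only: the `PlanRow` class and its field names are hypothetical and are not part of the CfC evaluation plan template, and the wording of the evaluation question is an assumption. The outcome and indicators are taken from the parental involvement example above.

```python
from dataclasses import dataclass, field


@dataclass
class PlanRow:
    """One row of a hypothetical evaluation plan (not the official template)."""
    outcome: str
    evaluation_question: str
    indicators: list[str] = field(default_factory=list)

    def is_measurable(self) -> bool:
        # A row is only usable once at least one indicator has been chosen.
        return len(self.indicators) > 0


row = PlanRow(
    outcome="Increased parental involvement in school",
    # Assumed wording, for illustration only.
    evaluation_question="To what extent has parental involvement in school increased?",
    indicators=[
        "attendance at school meetings",
        "participation in parent-school organisations",
        "attendance at school functions",
        "calls made to the school",
    ],
)

print(f"{row.outcome}: {len(row.indicators)} indicators")
```

Writing each outcome this way makes it easy to spot an evaluation question that does not yet have an indicator attached.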
Additional resources on evaluation questions, outcomes and indicators
Community Tool Box website, page on choosing evaluation questions
Centre for Social Impact, The compass: Your guide to social impact measurement (chapter 5)
FRIENDS (Family Resource Information, Education, and Network Development Service), Menu of outcomes and possible indicators
4. How will I measure it?
Types of data
The methods and tools you will use to collect data depend on what type of data you are collecting. Data can be quantitative (numbers) or qualitative (words). The type of data you want to collect is often determined by your evaluation question. Many evaluations are “mixed methods” evaluations where a combination of quantitative and qualitative data is collected.
Quantitative data will tell you how many, how much or how often something has occurred. Quantitative data are often expressed as percentages, ratios or rates. Quantitative data collection methods include outcomes measurement tools, surveys with ratings scales, or observation methods that count how many times something happened (Muir & Bennett, 2014; WHO, 2013).
Qualitative data will tell you why or how something happened and are useful for understanding attitudes, beliefs and behaviours. Qualitative data collection methods include interviews, focus groups, observation and open-ended surveys (WHO, 2013). It is difficult to demonstrate how a program has had a positive effect on participants through qualitative data alone. For more information about qualitative methods and how to ensure they are useful and of high quality, see CFCA, Using Qualitative Methods in Program Evaluation.
To meet the CfC 50% evidence-based requirement you need to demonstrate that your program has achieved some of its intended outcomes. If you are using a pre- and post-testing design, it is usually more straightforward to do this by collecting quantitative data, or a mix of both qualitative and quantitative.
Before you decide on your data collection methods, you should consider how the data will be analysed (see below).
Data collection methods
As discussed above, the data collection methods you select will be determined partly by the type of data you want to collect and the question you are seeking to answer. There are also a number of other considerations when selecting your methods:
- The needs of the target group. For example, having a written survey or an outcomes measurement tool may not be suitable for groups with low literacy. It is best to check with the target group about what methods they prefer.
- Timing. Consider both the time that you and your team have to collect and analyse the data, and also the amount of time your participants will have to contribute to evaluation measures.
- Evaluation capacity within your team. Evaluation takes time and resources and requires skills and/or training. Developing surveys or observation checklists, conducting interviews and analysing qualitative data are all specialist skills. If you don’t have these skills, you may wish to undertake some training or contract an external evaluator through the Industry List.
- Access to tools. Developing data collection tools (such as your own survey or observation checklist) that collect good quality data is difficult. Using validated outcomes measurement tools is recommended, but these may not be suitable for your group or may need to be purchased. For more information see this article on how to select an outcomes measurement tool and this resource to assist you to find an appropriate outcomes measurement tool.
- Practicality. There is no point collecting lots of valuable data if you don’t have the time or skills to analyse them; in fact, this would be unethical (see Ethics below) as it would be an unnecessary invasion of your participants’ privacy and a waste of their time. You need to make sure that the amount of data you’re collecting and the methods you’re using to collect the data are proportionate to the requirements of the evaluation and the needs of the target group.
Data collection quality
The quality of data is determined by two main criteria: validity and reliability. Essentially, to ensure that you have quality data, you need to ensure that your data collection tools (such as a questionnaire) are taking an accurate measurement and are producing consistent results. The World Health Organization (2013) recommends three strategies to improve validity and reliability in evaluation:
- Improve the quality of sampling - have a bigger sample size, ensure that your sample is representative of the population you are sampling from.
- Improve the quality of data collection - ensure data collection tools have been piloted, that people administering the tools (e.g., staff doing interviews or overseeing the surveys) have been trained and that the data are reviewed for consistency and accuracy.
- Use mixed methods of data collection - use qualitative and quantitative and have multiple sources of data to verify results.
Some other important considerations to ensure your evaluation is good quality:
- Cultural appropriateness. If you are working with Aboriginal and Torres Strait Islander people, or people from culturally and linguistically diverse (CALD) backgrounds, make sure your evaluation methods and the tools you are using are relevant and appropriate. The best way to do this is through discussion and pilot testing. See the CFCA resource on evaluation with Indigenous families and communities for more information.
- Thinking about who is asking the questions. If you have the same person running the program and conducting the evaluation, people may feel pressure to share only positive feedback. Consider how you can ensure anonymity for people participating in the evaluation, or have someone external run an evaluation session.
Timing data collection
To measure change, you need to allow enough time for the change to take place. For example, if you are measuring a change in participant outcomes using quantitative methods, you will need to collect baseline data (pre-testing) from participants before your program starts that you can compare with data collected at the end of your program (post-testing), or use an evaluation design with a control or comparison group (more information about control and comparison groups). Pre- and post-testing (as is necessary for CfC programs to meet the 50% evidence-based program requirement) uses the same measurement tool and the difference between the results gives you information about the change that has occurred. CFCA resource Planning for Evaluation I has more information on pre- and post-testing.
The outcomes you can measure depend on the time between your pre- and post-testing. For example, if you are collecting data at the end of a six-week program, you will be able to measure changes in knowledge or attitudes, but six weeks is probably not long enough to measure changes in behaviour. If one of your key outcomes is a change in behaviour, it might be a good idea to follow up with people at a certain time after the program has finished (e.g., three months) to see if there have been sustained changes in their behaviour.
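As a rough illustration of how pre- and post-test results can be compared, the sketch below subtracts each participant's pre-test score from their post-test score and averages the differences. The participant IDs, the scores and the function name `pre_post_change` are all invented for illustration; a real evaluation would use the scoring rules of the chosen measurement tool and may need statistical analysis beyond a simple mean.

```python
from statistics import mean


def pre_post_change(pre_scores, post_scores):
    """Return per-participant change and the mean change for matched participants."""
    # Only participants who completed both tests can be compared.
    matched = sorted(set(pre_scores) & set(post_scores))
    changes = {pid: post_scores[pid] - pre_scores[pid] for pid in matched}
    mean_change = mean(changes.values()) if changes else None
    return changes, mean_change


# Invented scores on the same (hypothetical) tool, where higher means better.
pre = {"p01": 10, "p02": 14, "p03": 9, "p04": 12}
post = {"p01": 13, "p02": 15, "p03": 9}  # p04 did not complete the post-test

changes, mean_change = pre_post_change(pre, post)
print(changes)
print(round(mean_change, 2))
```

A positive mean change shows that scores rose between the two tests. Without a control or comparison group, it does not show that the program caused the rise.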
Additional resources on data collection
FRIENDS (Family Resource Information, Education, and Network Development Service), Using qualitative data in program evaluation
5. Who will I collect data from?
Choosing who to include in your evaluation is called “sampling”. Generally, most research and evaluation takes a sample from the population. For CfC service providers, the population is most likely all participants in the program. If you are not including the whole population in your evaluation, then who you include and who you don’t can affect the results of your evaluation, and this is called “sampling bias” or “selection bias”. For example:
- If you collect data from people who finished your program, but only 30 out of 100 people finished, you are missing important information from people who didn’t finish the program - your results might show that the program is effective for 100% of people, when, in fact, you don’t know about the other 70 people.
- Similarly, if you provide a written survey to people doing your program, but a number of them have poor literacy or speak English as an additional language, they may not complete it. Your evaluation then gives an incomplete picture of the program: it only tells you whether it was effective for people with English literacy skills.
The number of people you include in your evaluation is called the “sample size”. The bigger the sample size, the more confident you can be about the results of your evaluation (assuming you have minimised sampling bias). Of course, you need to work within the resources that you have; it may not be practical to sample everyone.
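The arithmetic behind the first example can be sketched as follows. The function name `coverage` and its parameters are hypothetical, and the figures are those of the example (100 people enrolled, 30 finished and provided data, all 30 improved).

```python
def coverage(enrolled, respondents, improved):
    """Compare the apparent success rate with what is known about everyone enrolled."""
    if respondents <= 0 or enrolled < respondents or improved > respondents:
        raise ValueError("expected improved <= respondents <= enrolled, respondents > 0")
    apparent_rate = improved / respondents  # success among people who provided data
    known_rate = improved / enrolled        # confirmed success among everyone enrolled
    unknown = enrolled - respondents        # people with no outcome data at all
    return apparent_rate, known_rate, unknown


apparent, known, unknown = coverage(enrolled=100, respondents=30, improved=30)
print(f"Apparent success rate: {apparent:.0%}")
print(f"Confirmed success rate across all enrolled: {known:.0%}")
print(f"Participants with unknown outcomes: {unknown}")
```

Reporting the second and third figures alongside the first makes the limits of the sample visible to the reader.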
For CfC programs working to meet the 50% evidence-based program requirement, the minimum sample size is 20 people.
While it may not be necessary that everyone eligible participate in your evaluation, you need to consider who is being asked to participate and who is not, and the effect this might have on the data that you collect and the conclusions you can draw from them.
You need to describe in your evaluation report who was invited to participate in the evaluation, why they were invited to participate, and how many people actually participated. If you think that there are implications from this (e.g., nobody with English as an additional language participated in the evaluation, but they make up half of your program participants) you need to explain this in the evaluation report.
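The participation figures for the report can be summarised with a few lines of code. This is a minimal sketch: the function name, the dictionary keys and the example numbers (40 invited, 25 participated) are assumptions, and the only value taken from this resource is the minimum sample size of 20.

```python
MIN_SAMPLE_SIZE = 20  # minimum stated in this resource for the CfC requirement


def participation_summary(invited, participated):
    """Summarise who was invited and who took part, for the evaluation report."""
    if invited <= 0 or not 0 <= participated <= invited:
        raise ValueError("expected 0 <= participated <= invited and invited > 0")
    return {
        "invited": invited,
        "participated": participated,
        "response_rate": participated / invited,
        "meets_minimum": participated >= MIN_SAMPLE_SIZE,
    }


# Invented numbers for illustration.
summary = participation_summary(invited=40, participated=25)
print(summary)
```

A summary like this does not replace the written explanation of why people were invited and what the implications of any gaps are.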
The Research Methods Knowledge Base has more information on sampling.
Ethics
All research and evaluation needs to be ethical. There are a number of elements to consider when ensuring that your evaluation is ethical.
Risk and benefit: Before asking people to participate in your evaluation, you must consider the potential risks. Is there any possibility that people could be caused harm, discomfort or inconvenience by participating in this evaluation? For practitioners working with children and families, any potential for harm or discomfort is most likely to arise from the personal nature of the questions. How can you ensure that this potential for harm or discomfort is minimised or avoided? For example, could questions be asked in a private space if necessary, and could a list of services or counsellors be provided that people can be referred to if required? The potential for harm must be weighed against the potential for benefit - often there are unlikely to be individual benefits for participants, but their participation in the evaluation may contribute to improving your program.
Consent: Participation in research and evaluation must be completely voluntary. You must ensure that people are fully informed about the evaluation: how it will be done, what topics the questions will cover, potential risks or harms, potential benefits, how their information will be used and how their privacy and confidentiality will be protected. You must also make clear to people that if they choose not to participate in the evaluation, this will not compromise their ability to use your service now or in the future. Written consent forms are the most common way of getting consent, but they may not be suitable for all participants. Other ways that people can express consent are outlined in the National Health and Medical Research Council guidelines. For children to participate, parental consent is nearly always required.
There are additional considerations when conducting research and evaluation with Aboriginal and Torres Strait Islander people. For more information, see the CFCA resource Evaluating the Outcomes of Programs for Indigenous Families and Communities.
Additional resources on sampling, research ethics and collecting data from participants
National Health and Medical Research Council, National statement on ethical conduct in human research
Australian Institute of Aboriginal and Torres Strait Islander Studies, Guidelines for Ethical Research in Australian Indigenous Studies
6. How long will this take?
You may need to allow longer for evaluation than you anticipate. Planning for evaluation, ensuring enough time to recruit participants to your study, allowing time for outcomes to emerge, data analysis and writing up the report all take time. To undertake pre-testing with participants before they begin a program you need to have everything organised.
7. What do I do with the data?
You need to ensure you have time, skills and resources to analyse the data you collect.
If you are collecting qualitative data (e.g., through interviews), these will need to be analysed and sorted into themes. This can be very time consuming if you have done a lot of interviews. Statistical analysis may need to be undertaken with quantitative data, depending on how much data you have and what you want to do with them.
For a good discussion of data analysis and the steps to data analysis and synthesis, see the World Health Organization’s Evaluation Practice Handbook, page 54.
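Once interview excerpts have been read and assigned to themes by a person, a simple tally can show which themes come up most often. The sketch below assumes that coding has already been done by hand; the interview labels and theme names are invented for illustration, and counting themes is only one small step in qualitative analysis.

```python
from collections import Counter

# Invented, hand-coded excerpts: (interview, theme assigned by the analyst).
coded_excerpts = [
    ("interview_1", "confidence"),
    ("interview_1", "social connection"),
    ("interview_2", "confidence"),
    ("interview_3", "social connection"),
    ("interview_3", "confidence"),
]

# Count how many coded excerpts fall under each theme.
theme_counts = Counter(theme for _, theme in coded_excerpts)

for theme, count in theme_counts.most_common():
    print(f"{theme}: {count}")
```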
Writing up the evaluation
Writing up the evaluation and disseminating your findings is an important step in the evaluation and it is important that you ensure adequate time and resources. You may be preparing different products for different stakeholders (e.g., a plain English summary of findings for participants), but it is likely that you will need to produce an evaluation report. CfC service providers are required to submit an evaluation report as part of the 50% evidence-based program assessment process.
An evaluation report should include the following:
- the need or problem addressed by the program;
- the purpose and objectives of the program;
- a clear description of how the program is organised and its activities;
- the methodology - how the evaluation was conducted and an explanation of why it was done this way;
- the sample - how many people participated in the evaluation, who they were and how they were recruited;
- evaluation tools - what tools were used and when and how they were delivered (a copy of the tools should be included in the appendix);
- data analysis - a description of how data were analysed;
- ethics - a description of how consent was obtained and how ethical obligations to participants were met;
- findings - what did you learn from the evaluation and how do the results compare with your objectives and outcomes; and
- any limitations to the evaluation, as well as how future evaluations will overcome these limitations.
The World Health Organization provides a sample evaluation report structure in its Evaluation Practice Handbook, page 62.
Dissemination is an often-neglected part of evaluation but there are many benefits from sharing evaluation findings. Sharing the findings of your evaluation can provide valuable information for other services that may be implementing similar programs or working with similar groups of people. It is also important to report back on your evaluation findings to the people who participated in the evaluation. The way you present and share information should be appropriate to the audience. For example, a full evaluation report is required to meet the 50% evidence-based program requirement, but this report is unlikely to be read by service users who participated in the evaluation and a short summary or verbal discussion might be more appropriate. Where relevant, evaluation findings may also be disseminated through the CFCA information exchange or written up for publication in academic journals.
Additional resources on analysis, write-up and dissemination
World Health Organization, Evaluation practice handbook (esp. chapters 5 and 6)
CfC service providers are only required to conduct an outcome evaluation to meet the 50% evidence-based program requirement, but conducting a process evaluation can also provide useful information.
Designing and conducting a process evaluation involves considerations very similar to those of an outcome evaluation; however, the focus of the evaluation questions is different. Whereas an outcome evaluation is focused on what changed as a result of the program, a process evaluation focuses on whether the program was, or is being, delivered as intended.
The evaluation questions in a process evaluation are linked to the activities you have delivered or are delivering, and indicators are likely to be in the areas of implementation (what has been done), reach and scope (how many people have been involved and from what groups), and quality (how well things have been done) (Flinders University, 2013).
Many of these resources are also listed above.
Better Evaluation, online resources to support evaluation
The Community Tool Box, evaluation toolkit
Centre for Social Impact, The compass: Your guide to social impact measurement
MEERA (My Environmental Education Evaluation Resource Assistant), online “evaluation consultant”
Ontario Centre of Excellence for Child and Youth Mental Health, evaluation learning modules
FRIENDS (Family Resource Information, Education, and Network Development Service), evaluation planning resources
Flinders University, Planning and evaluation wizard
University of Wisconsin-Extension, templates, resources and a free online course (with a focus on program logic)
World Health Organization, Evaluation practice handbook
Flinders University. (2013). Planning and evaluation wizard. Retrieved from <som.flinders.edu.au/FUSA/SACHRU/PEW/pep_eval_zone.htm>
Muir, K., & Bennett, S. (2014). The compass: Your guide to social impact measurement. Sydney, Australia: The Centre for Social Impact.
Pope, J., & Jolly, P. (2008). Evaluation step-by-step guide. Melbourne, Victoria: Department of Planning and Community Development.
Ontario Centre of Excellence for Child and Youth Mental Health. (2013). Program evaluation toolkit. Ottawa, Ontario: Ontario Centre of Excellence for Child and Youth Mental Health. Retrieved from <www.excellenceforchildandyouth.ca/sites/default/files/docs/program-evaluation-toolkit.pdf>
Taylor-Powell, E., Jones, L., & Henert, E. (2003). Enhancing program performance with logic models. Lancaster, WI: University of Wisconsin-Extension. Retrieved from <www.uwex.edu/ces/lmcourse/>.
World Health Organization (WHO). (2013). Evaluation practice handbook. Geneva: World Health Organization.
Zint, M. (no date). My Environmental Education Evaluation Resource Assistant (MEERA). Retrieved from <meera.snre.umich.edu/home>.
Authors and Acknowledgements
This resource was developed by Jessica Smart, Senior Research Officer with the Child Family Community Australia information exchange at the Australian Institute of Family Studies.
Published by the Australian Institute of Family Studies, January 2017.