Robyn Parker and Alister Lamont
This Resource Sheet provides program practitioners and providers with a range of resources relating to conducting program or practice evaluation. It is based upon Evaluating Child Abuse and Neglect Intervention Programs (Lamont, 2009) and Evaluation in Family Support Services (Parker, 2010). Where possible, online resources have been selected; however, some references will require library access. The resources have primarily been selected for their suitability for use by practitioners with little or no experience of conducting evaluations. While some of these resources have been prepared for use in disciplines outside of the communities and families sector, they have been included here because they are applicable to providers and practitioners across a range of sectors.
Evaluation is more than just the collection of information and data; it involves making a judgement about the "worth, merit or value" of a program or practice (Scriven, 2003-04). Therefore it is about quality assurance - is your program or practice effective? There are several other reasons to evaluate a program or practice and clarifying these can contribute to the scoping and management of an evaluation. Evaluation can:
- support applications for new or continued funding;
- secure support from stakeholders and the community;
- improve staff performance and management; and
- contribute to the broader evidence-base about what does and what does not work for clients and providers with regard to specific issues and experiences.
Evaluation can also provide information about how services are used, profile service users, assist in the ongoing improvement and refinement of program content, and provide informed cost-benefit analysis (Tomison, 2000). Furthermore, families and children have the right to access services that are based on the best known evidence about what will help them manage their lives well and will do them no harm. Evaluation findings are of use for those planning a new service, for practitioners providing services and for policy-makers making decisions about public policy and program funding.
The following articles and web pages highlight a range of reasons why conducting an evaluation is an important and valuable activity. Although they relate to conducting evaluations in a range of sectors, the basic principles are applicable to evaluation of programs and practices in the communities and families sector.
Evaluation Principles and Frameworks. (2008). South Australian Council of Social Service. Unley, South Australia: Author.
This document notes a number of reasons for evaluation in the community sector and proposes an evaluation framework that can be applied across the sector.
The Program Manager's Guide to Evaluation. (2003). U.S. Department of Health & Human Services Administration for Children and Families. <http://tinyurl.com/2ar8kx5> Chapter 1: Why evaluate your program?
The first section of this comprehensive evaluation guide focuses on why an evaluation might be required and discusses several concerns managers may have about engaging in evaluation activities.
Approaching an evaluation - Ten issues to consider <http://www.bradroseconsulting.com/Approaching_an_Evaluation.html>
This resource sets out common reasons to evaluate and poses several additional questions that will help clarify the purpose and scope of a particular evaluation.
Evaluating Community Programs and Initiatives <http://ctb.ku.edu/>
The Community Tool Box is a free resource providing information and practical guidance in creating and evaluating change and improvement. The toolbox comprises an extensive range of information about community building. Sections relevant to developing a plan for evaluation, methods for evaluation, and using evaluation to understand and improve your program or practice can be found in "Part J. Evaluating Community Programs and Initiatives" (Chapters 36-39).
Types of evaluations
There are a number of evaluation types and methods. The question(s) you wish to answer will determine the type(s) of evaluation you employ. The three main types of evaluations relevant to practitioners are: process, impact and outcome evaluations. It should be noted that different types of evaluations are intrinsically linked and can be used either independently or together (Tomison, 2000). Evaluation literature discusses several different types of evaluation, with at times contradictory definitions and descriptions.
Process evaluation is concerned with how a service is delivered. Process evaluations may assess the timing of an intervention, where it is occurring, the costs involved, the services offered, who is participating, how clients enter and progress through the program, how many sessions they attend, and who is facilitating the intervention (Hall, 2009). The purpose of process evaluation is to identify areas that are working well and areas that may require or benefit from change to enhance service delivery.
Process evaluations may be used to answer key questions such as:
- What are the "active ingredients" of your program?
- What were the demographic and clinical characteristics of clients?
- Are all service providers administering the program in the same way?
- Has the program or training been implemented as intended/planned? (i.e., model fidelity)
- Is the program reaching its target population effectively?
- Have collaborative links with other programs or service providers been successfully established?
- Is the program being delivered in the most effective way?
Process evaluations cannot answer questions about the effect or impact of a program on participants.
Impact and outcome evaluation
Impact and outcome evaluations measure whether an intervention has an effect on participants in accordance with the intervention's aims and objectives. These evaluations are tied to the program objectives - what the program is intended to change or influence. For clients this might be their knowledge, skills, behaviours, mental health, coping strategies, parenting capacity. For the agency or organisation, the objectives might relate to the type and mix of clients, or the way staff are managed or supervised.
These types of evaluations can answer questions such as:
- Do parents report better parent-child relationships after participating in a parenting education program, compared to before?
- Do parents report fewer instances of defiant behaviour in their child after participating in a parenting support program, compared to before?
- Do adolescents report higher levels of self-esteem after attending an adolescent resilience workshop?
- Do couples report using more constructive communication techniques after attending a couple relationship education program?
- Does your service have more Indigenous and/or CALD clients following the implementation of a new engagement initiative?
Essentially, impact and outcome evaluations tell you what has changed for participants, how much change there has been, and the degree to which your program was responsible for the reported or observed changes.
The following resource sets out the various types of evaluation that might be appropriate for your needs.
The type of evaluation you conduct is determined by the questions you need to answer. This page describes the key types of evaluations and the questions they can be used to answer. The resource is available on the Learning for Sustainability website <http://www.learningforsustainability.net/evaluation/questions.php>.
Process evaluations are useful for understanding the strengths and limits of your program or service and how it is delivered. However, only an impact or outcome evaluation will tell you if children and families have better outcomes as a result of participating in your intervention.
Planning is a critical part of the evaluation process. Investing time and resources in this phase will contribute to the smooth running of the evaluation and help evaluators plan for and allocate resources, anticipate and overcome obstacles, and collect and interpret appropriate data.
The following resources provide an introduction to the evaluation process, or present overviews and straightforward models or frameworks of the process that can be adapted to evaluate a range of program or practice types.
The effectiveness of parent education and home visiting child maltreatment prevention programs. (2006). Holzer, P., Higgins, J., Bromfield, L., Richardson, N., & Higgins, D. (Child Abuse Prevention Issues No. 24).
This paper contains a concise discussion of the key issues to think about in designing rigorous parenting program evaluations.
First steps in monitoring and evaluation. Charities Evaluation Services, London, UK. <http://www.ces-vol.org.uk/index.cfm?pg=160>
The resources held on this site are aimed at small agencies or voluntary organisations that are new to evaluation. Plain language is used throughout.
W. K. Kellogg Foundation evaluation handbook. <http://www.wkkf.org/knowledge-center/resources/2010/W-K-Kellogg-Foundation-Evaluation-Handbook.aspx>
This handbook provides a comprehensive framework for thinking about evaluation as a relevant and useful program tool. It is written primarily for project directors who have responsibility for the ongoing evaluation of W. K. Kellogg Foundation-funded projects, which support activities aimed at strengthening children, families and communities.
Quick tips for program development and evaluation (PDF 15 KB). Program Development and Evaluation Unit at the University of Wisconsin-Extension. <http://www.uwex.edu/ces/pdande/resources/pdf/Tipsheet17.pdf>
These quick tips are brief, easy-to-use and practical suggestions for improving program development and evaluation practices. They cover planning the evaluation, collecting data, analysis, interpretation and communication of results, and the retrospective post-test design. References for further reading are also provided.
Program logic models
Understanding the logic underpinning programs and practices - the relationships between the program goals and activities, operational and organisational resources, the techniques and practices, and the expected effects of your program or practice - is a critical step in designing an evaluation. The relationships among these aspects of service provision are often represented systematically in visual form as program logic models. Program logic models identify the parts of the program or practice that would be appropriate and useful to measure, and the order in which they should be measured.
Constructing a program logic model requires detailed examination of a program or practice, including the resources available to it, and the assumptions underlying the various steps or activities that lead to the intended short-, medium- and long-term outcomes. It is important to closely examine and question these assumptions so that any unintended or unforeseen consequences can be anticipated. Programs that on the surface would appear to have sound logic have, on occasion, backfired. An example is the Scared Straight juvenile delinquency prevention program in the United States in which young offenders visit prisons in an attempt to deter future offending. Reviews of this and similar programs found that they have no value as a deterrent and may actually lead to more offending behaviour for some young people (Petrosino, Turpin-Petrosino, & Buehler, 2002).
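As an illustration only, the components of a program logic model can be written down as a simple data structure before they are drawn as a diagram. The sketch below assumes a hypothetical parenting program; the class name `LogicModel`, its fields and the example entries are invented for this example and do not come from any of the resources listed here.

```python
from dataclasses import dataclass, field


@dataclass
class LogicModel:
    """A program logic model as ordered stages, from resources to outcomes."""
    inputs: list[str]
    activities: list[str]
    outputs: list[str]
    short_term_outcomes: list[str]
    medium_term_outcomes: list[str]
    long_term_outcomes: list[str]
    # Assumptions are recorded so they can be examined and questioned.
    assumptions: list[str] = field(default_factory=list)

    def measurement_order(self) -> list[tuple[str, str]]:
        """Return (stage, item) pairs in the order they would be measured."""
        stages = [
            ("output", self.outputs),
            ("short-term outcome", self.short_term_outcomes),
            ("medium-term outcome", self.medium_term_outcomes),
            ("long-term outcome", self.long_term_outcomes),
        ]
        return [(stage, item) for stage, items in stages for item in items]


# Hypothetical example program.
model = LogicModel(
    inputs=["two trained facilitators", "community venue"],
    activities=["six weekly parenting sessions"],
    outputs=["number of sessions attended per parent"],
    short_term_outcomes=["parents' knowledge of positive discipline"],
    medium_term_outcomes=["parents' reported use of positive discipline"],
    long_term_outcomes=["parent-child relationship quality"],
    assumptions=["parents can attend weekly sessions"],
)

for stage, item in model.measurement_order():
    print(f"{stage}: {item}")
```

Listing the assumptions alongside the stages keeps them visible when the model is reviewed, which is where unintended consequences are most likely to be spotted.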
Just as program logic models themselves can vary in form and complexity, the following resources cover the creation of program logic models in varying degrees of detail.
Understanding Program Logic (PDF 388 KB). (2009). L. Holt, Department of Human Services, Victoria. <http://www.health.vic.gov.au/healthpromotion/downloads/understanding_program_logic.pdf>
This resource provides an introductory guide to developing and representing your program logic using examples from a range of service types.
McCawley, P. F. (undated). The logic model for program planning and evaluation (PDF 86 KB). University of Idaho-Extension.
This brief article provides a good introduction to developing program logic models.
Introduction to program evaluation for public health programs: A self-study guide. Centers for Disease Control and Prevention, U.S. Department of Health and Human Services.
This section of this comprehensive guide covers the process of describing the program and developing a logic model, starting simply and building to more complicated forms.
Mapping Change. Using a theory of change to guide planning and evaluation. <http://www.grantcraft.org/index.cfm?pageId=808>
GrantCraft provides resources based on the practical wisdom of those who make grants for community-based projects. This resource, while written for grant-makers, is also useful for those in receipt of those funds to prompt clear thinking and articulation of programs and practices. Download requires free registration.
Collecting data: What is "good" evidence?
The aim of any evaluation is to demonstrate that program participants have benefited in measurable and hopefully lasting ways, and that those benefits are (at least partially, if not solely) attributable to the program. The data that are actually collected to assess a program are critical to determining its effectiveness, so it is important to carefully consider the type of evidence to be collected. It is often difficult to conduct large scale, complex evaluations in a service environment. However, an understanding of the principles underlying discussions of quality of evidence is important for anyone evaluating their programs or practices, as it contributes to the strength of the evaluation, and ultimately the conclusions about the effectiveness of the program being evaluated.
The following resources provide information about how the type of evidence and the way it is collected affect the strength of your evaluation. They are also intended to prompt thinking and discussion that promote more considered and rigorous evaluation.
Evaluation of income management in the Northern Territory by the Australian Institute of Health and Welfare.
The Evaluation Approach and Methodology chapter includes a brief but useful discussion of the levels or hierarchy of evidence. It outlines and reviews the evaluation method and the sources of data collected to demonstrate the tension that often occurs in social science evaluation between what evidence would be ideally collected, and what evidence is actually available to evaluators.
Introduction to program evaluation for public health programs: A self-study guide by the Centers for Disease Control and Prevention, U.S. Department of Health and Human Services. <http://www.cdc.gov/eval/guide/index.htm>
The third and fourth sections of this guide cover issues and challenges related to the design of the evaluation and the quality of the evidence collected.
Designing a rigorous impact or outcome evaluation
As the main goal of an evaluation is to indicate whether a program is effective or not, it is important that an evaluation is conducted appropriately. There are three elements that represent the gold standard for a rigorous evaluation: pre- and post-test designs, a comparison or control group, and follow-up testing.
Pre- and post-test designs
Pre- and post-test designs assess participants "before" and "after" a program in order to ascertain whether participants have changed according to program goals (Chalk & King, 1998). Testing participants' skills before an intervention (pre-test) allows for comparison with tests made after participation (post-test) and can demonstrate how much change has occurred during that period. On its own, however, this design cannot determine whether the change was a result of the intervention.
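A minimal sketch of how pre- and post-test scores might be compared is shown here. The participant IDs, the scores and the function name `change_scores` are all hypothetical, and the scale is assumed to be one where a higher score is better.

```python
from statistics import mean

# Hypothetical scores on a parenting-confidence scale (higher is better).
pre_scores = {"p01": 12, "p02": 15, "p03": 9, "p04": 14}
post_scores = {"p01": 16, "p02": 15, "p03": 13, "p04": 18}


def change_scores(pre, post):
    """Post-test minus pre-test for each participant measured at both points."""
    return {pid: post[pid] - pre[pid] for pid in pre if pid in post}


changes = change_scores(pre_scores, post_scores)
mean_change = mean(changes.values())

print(changes)
print("Mean change:", mean_change)
```

Only participants with both a pre-test and a post-test score are included, so anyone who dropped out before the post-test does not distort the change scores.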
Comparison and control groups
A comparison or control group is one that is not involved in the intervention program being evaluated. Comparison and control groups are used to compare the outcomes of program participants with non-participants. If the outcomes for the intervention group are significantly better than those for the comparison group, then you can have some measure of confidence that the program/intervention is responsible for the differences. For example, pre- and post-testing of participants in a grief and loss counselling program might show change over time with a decrease in depression at the end of the intervention. Without a comparison group however, it is not known if their improvement was a result of the program or whether their depression would have gotten better over time without any intervention. Waiting lists are often used as a source of comparison group participants.
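The grief and loss counselling example can be sketched with made-up numbers. The depression scores below are hypothetical (lower is better), as are the variable and function names. The sketch only compares average change in the two groups; deciding whether a difference is statistically significant would require an inferential test, which is not shown.

```python
from statistics import mean

# Hypothetical (pre, post) depression scores; lower scores are better.
program_group = [(22, 14), (25, 18), (19, 13)]
waitlist_group = [(23, 21), (24, 21), (20, 19)]


def mean_change(pairs):
    """Average of post minus pre across a group."""
    return mean(post - pre for pre, post in pairs)


program_change = mean_change(program_group)
waitlist_change = mean_change(waitlist_group)

# How much more the program group changed than the waiting-list group.
difference = program_change - waitlist_change

print("Program group change:", program_change)
print("Waiting-list group change:", waitlist_change)
print("Difference between groups:", difference)
```

In this invented data both groups improve, which is exactly the situation a comparison group is meant to reveal: only the additional improvement in the program group can plausibly be linked to the program.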
When participants are randomly assigned to a non-intervention group, the group is called a control group. Because participants are randomly assigned, the two groups of participants are likely to have similar characteristics - or at least, not be significantly different to each other. This is important because, as noted above, the point of an evaluation is to assess whether the program or intervention is the likely reason for the changes in participants. Random allocation of participants to the groups minimises the likelihood of the two groups being significantly different in important ways before the program begins.
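One simple way to allocate participants at random is to shuffle the list of participants and split it in half. The sketch below uses Python's standard `random` module; the function name `randomly_assign` and the participant IDs are hypothetical, and real evaluations may use more elaborate allocation procedures.

```python
import random


def randomly_assign(participant_ids, seed=None):
    """Shuffle participants and split them into two groups of (near) equal size."""
    rng = random.Random(seed)  # a fixed seed makes the allocation repeatable
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"intervention": shuffled[:half], "control": shuffled[half:]}


# Hypothetical participant IDs.
participants = ["p01", "p02", "p03", "p04", "p05", "p06", "p07", "p08"]
groups = randomly_assign(participants, seed=2010)

print("Intervention:", groups["intervention"])
print("Control:", groups["control"])
```

Recording the seed means the allocation can be reproduced later if the evaluation is audited or repeated.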
Follow-up testing
Follow-up testing is needed for assessing whether successful outcomes of an intervention extend beyond the short-term. For example, participants may have increased knowledge about parenting strategies at the end of a 2-day course. If, after 1 month, they have not retained the skills learned at the course, can it be said that the program was effective in enhancing parenting skills? To determine whether an intervention has a lasting effect, an evaluation will need to conduct follow-up assessments on the same outcome measures.
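To illustrate, the same outcome measure can be compared at three time points. The scores below are hypothetical parenting-skills scores (higher is better) for the same four participants, and the names `mean_gain` and `retained` are invented for this sketch.

```python
from statistics import mean

# Hypothetical scores for the same participants, in the same order each time.
scores = {
    "pre": [10, 12, 11, 9],
    "post": [16, 17, 15, 14],
    "follow_up": [11, 13, 12, 10],
}


def mean_gain(baseline, later):
    """Average improvement from baseline to a later time point."""
    return mean(after - before for before, after in zip(baseline, later))


gain_at_post = mean_gain(scores["pre"], scores["post"])
gain_at_follow_up = mean_gain(scores["pre"], scores["follow_up"])

# Share of the end-of-course gain still present at follow-up.
retained = gain_at_follow_up / gain_at_post

print("Gain at end of course:", gain_at_post)
print("Gain at follow-up:", gain_at_follow_up)
print("Proportion of gain retained:", retained)
```

In this invented data most of the gain seen at the end of the course has faded by follow-up, which a post-test alone would not have shown.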
Avoiding design flaws
To avoid research design flaws, it is important that evaluations only measure the influence of the intervention program. As well as taking steps such as including comparison and control groups in your evaluation, it is important to identify other factors that might influence the findings, such as other events or circumstances in participants' lives. Key questions to consider include:
- Are the participants also attending another program or service?
- Do participants differ in their social supports or family arrangements?
- Do participants have a greater motivation to change than non-program participants in the comparison or control group?
- Are there differences between program participants and non-program participants in terms of the severity of problems?
Common measures and indicators
Once the program objectives have been articulated and the overall design of the evaluation has been determined, the critical task of selecting the instruments can begin. Having clear, measurable objectives will help to identify what needs to be measured. How the data will be collected is a key issue, and this will also inform decisions about the specific instruments to be used.
The first decision relates to whether quantitative or qualitative data will be collected. Quantitative data are numeric data typically collected through methods such as surveys, questionnaires or other instruments that use numerical scales. They are analysed via descriptive and inferential statistics. Descriptive statistics include counts (for example, number of "yes" and "no" responses), percentages and averages, while inferential statistics are used to identify statistically significant relationships (correlational and/or causal) between the different indicators. The strength of quantitative data is that they can be generalised outside of the group of participants in your program. It can be useful to set up your case file system to record quantitative data so that they can be easily accessed for evaluation.
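The descriptive statistics mentioned above (counts, percentages and averages) can be calculated with very little code. The sketch below uses only Python's standard library; the survey responses and attendance figures are hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical answers to "Did you find the program helpful?"
responses = ["yes", "no", "yes", "yes", "no", "yes", "yes", "no"]

# Hypothetical number of sessions attended by the same eight participants.
sessions_attended = [6, 4, 8, 8, 3, 7, 8, 4]

counts = Counter(responses)  # number of "yes" and "no" responses
percent_yes = 100 * counts["yes"] / len(responses)
average_sessions = mean(sessions_attended)

print("Counts:", dict(counts))
print("Percentage answering yes:", percent_yes)
print("Average sessions attended:", average_sessions)
```

Inferential statistics are not shown here; they usually call for a dedicated statistics package or specialist advice.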
Qualitative data are collected via techniques such as interviews, focus groups, observations, documents, case reports, and written or verbal responses. Qualitative data can reveal aspects of the participants' experiences of the program beyond the more narrowly focused survey or questionnaire items. The strength of qualitative data is in the rich descriptive detail they provide.
Gathering different types of evidence by collecting both qualitative and quantitative data from various sources and combining different designs can improve the depth, scope and validity of the findings of the evaluation.
Each of the data collection methods noted above requires some specific knowledge and expertise to ensure they are properly and effectively applied. The following resources can help you make decisions about the kind of data that will provide you with the information you need to answer questions about your program objectives.
Research for Social Workers (2nd edition). (2003). Alston, M. & Bowles, W. Crows Nest, NSW: Allen and Unwin.
This book discusses a range of research and evaluation issues in an accessible, user-friendly manner. Plenty of examples, tips and summaries covering the spectrum of evaluation tasks are provided throughout the book.
Research Methods Knowledge Base (2nd edition). (2006). Trochim, W. M. <http://www.socialresearchmethods.net/kb/survey.php>
This is part of a comprehensive, web-based introductory social research methods textbook with a section specifically focused on evaluation. It is written in an informal style, with links to separate pages explaining key terms and concepts, including an accessible section on survey methods.
Making measures work for you. Outcomes and evaluation. GrantCraft. <http://www.grantcraft.org/index.cfm?fuseaction=Page.viewPage&pageID=835>
GrantCraft provides resources based on the practical wisdom of those who make grants for projects that strengthen communities and networks. This resource, while written for grant makers, is also useful for those in receipt of those funds to prompt clear thinking and articulation of what needs to be measured to identify the benefits or otherwise of programs and practices. Registration is required for download, but there is no charge.
Thames Valley University Dissertation Guide
This guide has been developed for Open University students undertaking a dissertation. While the terminology is focused on 'research' the principles can also be applied to an evaluation project. This section of the guide discusses various methods of collecting data, highlighting the key components, advantages and disadvantages, and practical issues relating to each.
In many cases, an instrument will have already been created that measures your program objectives. It is worth investing time in a search for an existing measure, or one that can be adapted with minimal effort. The following websites contain a wide range of measures that can be used in your evaluation.
Positive indicators of child wellbeing: A conceptual framework, measures and methodological issues (PDF 915 KB). UNICEF Innocenti Research Centre.
This report discusses the need for positive measures of children's wellbeing, identifies and reviews existing and potential indicators of positive wellbeing, and discusses methodological issues in developing positive wellbeing indicators. It includes the data sources for the examples of measures included in the report, an annotated bibliography, and an extensive list of references.
Instrument Collections. The American Evaluation Association.
This web page hosts a comprehensive list of links to sites and resources for evaluation tools and instruments recommended by members of the American Evaluation Association. Most are free to access; some instruments may require registration or purchase.
Early childhood measures profiles (PDF 3.6 MB). Child Trends. <http://aspe.hhs.gov/hsp/ECMeasures04/report.pdf>
This very large, comprehensive collection of early childhood measures across several domains (including social-emotional, language, literacy, and cognition) includes background information, reviews, and citations for each measure.
Handbook of Family Measurement Techniques. (2001). J. Touliatos, B. F. Perlmutter & M. A. Straus (Eds). California: Sage Publications (3 volumes).
This reference contains the items, scoring information, respondent instructions, references and psychometric properties of 976 instruments. Measures are grouped into marital and family interaction; intimacy and family values; parent-child relations; family adjustment, health and wellbeing; and family problems.
Not every evaluator will want or need to conduct their own statistical analysis. If the resources are available, data entry and statistical analysis can be outsourced to those with relevant expertise. Nevertheless, a basic knowledge of key statistical terms and concepts will enhance your understanding of the outcomes of your evaluation. The following sites offer straightforward instruction on the range of statistics that can be expected to appear in an evaluation report.
The Magenta Book. UK Government Social Research Unit.
Comprising a series of individual topics and arranged around common questions, the Magenta Book offers guidance for policy evaluators and analysts, and people who use and commission policy evaluation. While the book is focused on policy evaluation, it contains a straightforward guide to understanding statistics, as well as sections on principles underlying a range of research designs, and the role and value of collecting qualitative data. (Note: the guide does include some of the more advanced statistical techniques.)
Research Methods Knowledge Base (2nd edition). Trochim, W. M. <http://www.socialresearchmethods.net/kb/analysis.php>
This is part of a comprehensive, web-based introductory social research methods textbook with a section specifically focused on evaluation. It is written in an informal style, with links to separate pages explaining key terms and concepts. The analysis section describes the key concepts related to data analysis, including descriptions of statistical techniques (such as correlation) and concepts (for example, statistical power).
Evaluation in Indigenous contexts
Programs for Indigenous families and communities present particular challenges for evaluators. It is critical that the evaluation component of the program is seen as part of the program package from the earliest discussions with the Indigenous community in which it will be run, and explained to Indigenous participants when joining the program.
When talking with participants about the evaluation, be upfront about what is involved, what participants may be asked to do or say, what will be done with the findings, and so on. Schedule feedback sessions for participating families or communities to tell them about what the evaluation found and what it means for them and their family and community. Issues you will need to think through very carefully include:
- the type of data that can be collected - qualitative methods might be more appropriate and easier to use;
- how the data is collected - observational techniques and talking with participants may be more effective than questionnaires;
- who does the data collection - an Indigenous evaluator or someone already accepted by the community may be more appropriate to gather the data;
- the time frame in which the evaluation can be done - you will need to build in sufficient time to allow for relationships to be built with participants and their families; and
- using culturally-appropriate measures - make sure your instruments have been adapted for use with Indigenous participants in the local region, and spend some time testing and reviewing them before the actual data collection takes place.
Engaging the services of a "cultural broker" may be helpful. A cultural broker can act as an interpreter, but their role can be quite broad and include facilitating two-way interactions between evaluators and Indigenous participants and their families (Michie, 2003).
Closing the Gap Clearinghouse Assessment Checklist <http://www.aihw.gov.au/closingthegap/>
The Closing the Gap Clearinghouse Quality Assessment Checklist provides a systematic means of assessing the quality of a range of different research and for summarising research findings.
National Health and Medical Research Council - Values and Ethics
The National Health and Medical Research Council provides ethical guidelines for conducting research with Indigenous people that apply equally to evaluation. See the publication Values and Ethics: Guidelines for Ethical Conduct in Aboriginal and Torres Strait Islander Health Research. <http://www.nhmrc.gov.au/publications/synopses/e52syn.htm>
Program evaluation can be challenging for service-based managers and practitioners where resources are stretched, but it is also an important tool for ensuring that programs and services are effective and of high quality. Evaluation evidence can also contribute to funding submissions and help gain the support of stakeholders and the community. It does not necessarily require specialist training or skills, but to be done well it does require good planning and the allocation of resources.
There is no perfect way to conduct a program evaluation in the real world. The information in this Resource Sheet aims to give broad guidance in the key steps and issues involved in developing an evaluation, and direct providers to useful resources to further develop their understanding of and capacity to engage in evaluation activities within their agency.
- Chalk, R., & King, P. A. (1998). Violence in families: Assessing prevention and treatment programs. Washington DC: National Academy Press.
- Hall, R. (2009). Evaluation principles and practice. In G. Argyrous (Ed.), Evidence for policy and decision-making. Sydney: UNSW Press.
- Holzer, P. J., Higgins, J., Bromfield, L. M., Richardson, N., & Higgins, D. J. (2006). The effectiveness of parent education and home visiting child maltreatment prevention programs (Child Abuse Prevention Issues No. 24). Melbourne: National Child Protection Clearinghouse.
- Lamont, A. (2010). Evaluating child abuse and neglect intervention programs (NCPC Resource Sheet). Melbourne: National Child Protection Clearinghouse.
- Michie, M. (2003, July). The role of culture brokers in intercultural science education: A research proposal. Paper presented at the Australasian Science Education Research Association conference, Melbourne, Australia.
- Parker, R. (2010). Evaluation in family support services (AFRC Issues Paper 6). Melbourne: Australian Family Relationships Clearinghouse.
- Petrosino, A., Turpin-Petrosino, C., & Buehler, J. (2002). "Scared Straight" and other juvenile awareness programs for preventing juvenile delinquency. Cochrane Database of Systematic Reviews, 2. DOI: 10.1002/14651858.CD002796.
- Scriven, M. (2003-04). Michael Scriven on the differences between evaluation and social science research. The Evaluation Exchange, IX(4). <http://tinyurl.com/yggukga>
- Tomison, A. (2000). Evaluating child abuse prevention programs (Child Abuse Prevention Issues No. 12). Melbourne: National Child Protection Clearinghouse.