Evidence-based practice and service-based evaluation


You are in an archived section of the AIFS website 


Content type
Practice guide

November 2013

Evidence-based practice

The call for services to ensure that their programs and practices are informed by, if not grounded in, well-conducted, relevant, scientific research evidence has grown louder in recent years. The term "evidence-based practice" is now in common usage, and the impact of using evidence-based practice principles in guiding service development and delivery is reflected in higher standards of service provision (Midgley, 2009).

Types of evidence

"Evidence-based practice" is a term that was originally coined in the field of medicine in the 1990s. Since then, it has been increasingly adopted by the helping professions and defined in this context as the point of overlap between best evidence, practitioner expertise and client values (Gibbs, 2003). Some professionals consider "evidence-informed practice" a more accurate term, as it better reflects that decisions are informed or guided by evidence, rather than determined solely by it (Shlonsky & Ballan, 2011).

In order to both inform and evaluate what they do, practitioners need to understand the various types of evidence that can be used to test their program objectives. The different types of evidence allow stronger or weaker conclusions to be drawn - the better the evidence, the more confidence you can have in your conclusions.

Within the field of evaluation, there are established criteria by which evidence of the effectiveness or impact of a program or practice is assessed. These "hierarchies" or levels of evidence1 indicate the relative power of the different types of evaluation designs to demonstrate program effectiveness. If you are planning an evaluation, you can use these hierarchies to guide your decisions about which evaluation method to use. The hierarchies can vary, but will generally place randomised controlled trials (RCTs) or experimental designs at the top, followed by quasi-experiments, and then pre- and post-test studies. (The pros and cons of these designs are also discussed in Planning for Evaluation I: Basic Principles.) So what do these designs look like?2

Randomised controlled trials

What is a randomised controlled trial?

A randomised controlled trial (RCT) is a method of scientifically testing for differences between two or more groups of participants: one group does not receive the intervention (the control group), while the others do (the experimental groups).

The main feature of RCTs is the random allocation of participants to control and intervention groups - hence the word "randomised". Randomisation gives each participant an equal chance of being allocated to receive or not receive the intervention, which means there is a greater chance that the groups will be similar on factors that may influence their response to the program, such as gender, attitudes, past history and current life events. In other words, systematic bias is reduced. An example of systematic bias would be if women were assigned to one group and men to the other, or if people in high-conflict relationships were allocated to the control group and those in low-conflict relationships to the couple anger management intervention.3

Data are collected from participants both before and after the program. If there is no bias in the way individuals are allocated to the groups, you can reasonably conclude that any differences between the groups after completing the program are due to the intervention rather than any pre-existing differences among participants. Since RCTs are typically conducted under conditions that provide a high degree of control over factors that might provide alternative explanations for findings, RCTs can allow you to make statements to the effect that the outcomes for participants are directly attributable to the program; that is, that the program caused the changes. Therein lies the holy grail of program development!
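For services that keep participant lists electronically, the random allocation step can be sketched in a few lines of Python. The participant identifiers and the even two-group split below are purely illustrative:

```python
import random

def randomise(participants, seed=None):
    """Randomly allocate participants to control and intervention groups.

    Shuffling gives each participant an equal chance of ending up in
    either group, which reduces systematic bias between the groups.
    """
    rng = random.Random(seed)
    shuffled = participants[:]  # copy, so the original list is untouched
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return {"control": shuffled[:midpoint],
            "intervention": shuffled[midpoint:]}

# Hypothetical participant IDs; a fixed seed makes the allocation repeatable.
groups = randomise(["P01", "P02", "P03", "P04", "P05", "P06"], seed=42)
```

In practice, allocation would usually be carried out (or at least documented) by someone independent of program delivery, so that staff cannot influence who ends up in which group.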

However, RCTs are not without their problems and reliance on RCTs has been criticised (see Bickman & Reich, 2009; and the American Evaluation Association [AEA], 2003, and European Evaluation Society, 2007, both cited in Donaldson et al., 2009). While recognising the value of applying the best available evidence to their practice, many practitioners (and evaluators) have responded negatively to the prevailing view that the only evidence worthy of such application comes from RCTs reported in systematic reviews and meta-analyses (see box below). From a practitioner perspective, RCTs may represent the "gold standard" design for research and evaluation, but they cannot accommodate the complex and challenging nature of service delivery in some sectors. In order to attribute outcomes to a program, RCTs need to be conducted under tightly controlled conditions, and this can make it difficult to apply the evidence they produce to everyday practice. It can be equally difficult to conduct an RCT in a service environment (Simons & Parker, 2002; Tomison, 2000).

What is a systematic review?

Systematic reviews are critical syntheses of a number of studies of similar programs or practices (such as parenting education programs, or programs aimed at building adolescent resilience) that have been evaluated in ways designed to maximise their internal validity,* typically via RCTs. Where there are many RCTs on a particular subject, systematic reviews provide a valuable summary of the evidence and can highlight the effective characteristics and components of interventions. Inclusion of a study in a systematic review is often based on the quality of the study. Since RCTs are considered the preferred design, where practicable, reviews may be confined to these types of studies.

What is a meta-analysis?

A meta-analysis is a systematic review that includes a statistical analysis of the combined effects of several studies or randomised controlled trials. A statistic that quantifies the amount of change in particular variables across a group of studies is calculated and used to indicate whether the effect of a type of program on an outcome (e.g., parenting competence, adolescent resilience) is small, medium or large. The larger the combined effect, the more effective the program.

* Internal validity is critical to evaluations in which you are trying to determine whether your program caused the effects you have recorded. Internal validity means that the way in which you designed and conducted your evaluation allows you to say that what you did (i.e., the program) caused what you observed (i.e., the outcome) to happen (Trochim, 2006).
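As a rough illustration of the "combined effect" described in the box above, the Python sketch below calculates a standardised mean difference (Cohen's d) for three hypothetical studies and averages them, weighted by sample size. All figures are invented, and real meta-analyses use more sophisticated inverse-variance weighting and tests for heterogeneity:

```python
def cohens_d(mean_treat, mean_control, pooled_sd):
    """Standardised mean difference (effect size) for one study."""
    return (mean_treat - mean_control) / pooled_sd

# Hypothetical results from three evaluations of similar programs:
# (treatment mean, control mean, pooled SD, sample size)
studies = [
    (24.0, 20.0, 8.0, 60),
    (30.0, 27.0, 10.0, 120),
    (18.5, 16.0, 5.0, 40),
]

effects = [(cohens_d(mt, mc, sd), n) for mt, mc, sd, n in studies]

# Simple sample-size-weighted average of the per-study effect sizes.
combined = sum(d * n for d, n in effects) / sum(n for _, n in effects)
```

By the conventional benchmarks (0.2 small, 0.5 medium, 0.8 large), the combined effect here would be read as small to medium.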

Problems with the RCT method when conducted in a service environment are well recognised and include:

  • participants dropping out of each group at different rates (i.e., if there are a lot of drop-outs from the intervention group, but only a few from the control group);
  • unexpected differences between the control and intervention groups (so that assumptions of pre-test similarity don't hold);
  • use of inadequate instruments (Margison et al., 2000); and
  • ethical issues related to withholding an intervention from participants in control groups.

Notably, these problems can also plague non-RCT designs. Furthermore, when RCTs do not take place in "real-world" conditions (e.g., in a laboratory or a university rather than in a service environment), they may have limited external validity; that is, the findings may not be generalisable to the "real-world" service environment and "real-world" participants (Metz, Espiritu, & Moore, 2007).

Since RCTs are not always practicable, evaluators and researchers often employ the next best thing - quasi-experiments.


Quasi-experiments

When the random allocation of program participants to control and experimental groups is not possible for practical or ethical reasons, naturally occurring comparison groups can be used. Participants on your waiting list are a good source of comparison data, because (a) they are available to you, and (b) you can collect the same data from them as from those participating in the program. The two groups are also likely to be reasonably well matched in terms of demographic characteristics.

Comparison data can also be obtained by offering a similar group of participants an intervention that differs somewhat from the program you want to evaluate. These might be participants in:

  • a briefer version of the program (e.g., a single 2-hour session versus 10 2-hour sessions over a period of 10 weeks);
  • a less interactive form of the program (e.g., being given reading materials or workbooks); or
  • a different program that has similar aims.

Again, participants provide data before and after the program.

Evidence of greater benefits to those who participated in the intervention compared to the comparison groups can suggest the program is effective, but you could not say the program caused the improvements. This is because there has been no random assignment of participants, meaning that the benefits might also be explained by pre-program differences between the groups of participants rather than by their participation in the program. Nonetheless, if consistent results are found in repeated studies of a given type of program using a variety of quasi-experimental (and other, non-experimental) methods, then it is possible to have greater confidence in the effectiveness of a program - at least relative to the comparison groups employed.
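The comparison-group logic above can be sketched in a few lines of Python. The change scores below are invented for illustration:

```python
import statistics

# Hypothetical change scores (post-program minus pre-program) for the
# program group and a waiting-list comparison group:
program_change = [4, 3, 5, 2, 4, 3]
waitlist_change = [1, 0, 2, 1, 1, 0]

# The difference in average improvement between the groups. Without
# random assignment, a larger improvement in the program group suggests,
# but cannot prove, that the program is effective.
difference = statistics.mean(program_change) - statistics.mean(waitlist_change)
```

Pre-existing differences between the groups (for example, waiting-list clients being in more difficult circumstances) remain an alternative explanation for any gap.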

Pre- and post-test studies

When employing control or comparison groups is not feasible, a simpler design can be used. Pre- and post-test studies examine the effect of a program without the use of either a comparison or a control group. The group participating in the program provides data on relevant measures immediately before the program and again at its completion, and the degree of change is measured. If the program works, the program logic would lead you to expect any changes recorded to be in the direction that supports the program objectives; for example, levels of adolescent participants' self-esteem may increase, or the number of child behaviour problems may decrease. In these studies, even if there are significant differences between the pre- and post-program measures, no real conclusions can be drawn as to whether the effects are due to participation in the program, because we cannot know whether similar changes might have occurred had the program not been run. All that can safely be said is that some aspect of this group's behaviour (or attitudes, knowledge, skills etc.) changed in the period between the start of the program and its conclusion.
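As a rough illustration of a pre- and post-test analysis, the following Python sketch computes the average change for a single group and a simple paired t statistic. The scores are hypothetical, and, as noted above, even a statistically reliable change does not show that the program caused it:

```python
import statistics

# Hypothetical behaviour-problem scores for the same eight participants,
# measured immediately before and after the program (lower is better):
pre = [12, 15, 11, 14, 10, 13, 16, 12]
post = [10, 13, 11, 12, 9, 11, 14, 10]

changes = [after - before for before, after in zip(pre, post)]
mean_change = statistics.mean(changes)

# A paired t statistic on the change scores; compared against a t table
# with df = n - 1, it indicates whether the change is statistically
# reliable - though not whether the program produced it.
n = len(changes)
t = mean_change / (statistics.stdev(changes) / n ** 0.5)
```

A negative mean change here would indicate fewer behaviour problems after the program, the direction the program logic predicts.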

Service-based program evaluation

As noted earlier, some commentators view evidence-based practice as being removed from the complex real world in which programs and practices are implemented (AEA Statement on Scientifically Based Evaluation Methods, 2003, cited in Donaldson et al., 2009; Charman & Barkham, 2005) and reliant on "very restrictive definitions of 'evidence'" (Midgley, 2009, p. 323). Indeed, a systematic review based on only a few studies (even if they were high quality RCTs) would have limited value if you were looking for evidence to help you develop or modify a program or practice for a particular context. On the other hand, you may learn a good deal from a larger number of smaller scale, less sophisticated evaluations conducted in similar circumstances to those in which your program operates. Clarke (2006) suggested that the nature, complexity and context in which the intervention is to be implemented need to be considered when deciding which evaluation method to use. The weaknesses of RCTs4 are highlighted when they are applied to social programs, where it is evident that RCTs are more suited to situations in which a small number of clearly specified, uncontested (that is, not easily contaminated by external factors) and easily measured variables are to be examined.

Evidence should not be discounted just because a study does not use one of the designs listed in the hierarchies of evidence. Indeed, in some quarters it is considered good practice to gather data using multiple methods and triangulate the findings (European Evaluation Society, 2007, cited in Donaldson et al., 2009). These more pragmatic views of evidence and evaluation reflect the realities of conducting evaluations in a service environment.

Providers as evaluators

Providers are uniquely placed to evaluate their programs and services, and the knowledge, skills and expertise to develop and run an evaluation often exist among their staff. Practitioners apply evidence to their work with clients, drawn from a range of sources, including scientific research in clinical or other settings. They also incorporate evidence gathered over time in their daily activities, from supervision and case reviews, and various types of feedback provided by clients. There can, however, be great diversity in the types and quality of data gathered by services (Simons & Parker, 2002; Tomison, 2000), and constraints on time and resources may limit their capacity to actually carry out the evaluation.

One solution might be to use an external evaluator to conduct the evaluation. This might play out in two ways: the evaluation is conducted entirely by the external evaluator, or the external evaluator and the providers pool their knowledge, expertise and resources and work together to design and implement an evaluation plan.

External evaluation

Drawing on external assistance provides an independent perspective on the program, and allows program staff to concentrate on running the program. External evaluators can, however, be expensive and they will require time to get up to speed with the program and the organisation, and to develop trust with staff and participants (South Australian Community Health Research Unit, 2008). A collaboration or partnership approach may help to offset the costs and possible delays involved in hiring a consultant (McNamara, n. d.), and means that ownership of the process remains to a greater degree with the agency and its staff, rather than fostering a perception of the consultant being brought in to "do" the evaluation "to" them.

Figure 1 is aimed at helping providers decide whether to conduct their evaluation in-house, or whether it would be sensible to work in partnership with an evaluation consultant. There are, of course, advantages and disadvantages to each option. But even if you do engage an external evaluation consultant, it is important that program managers and staff have a sound understanding of what is involved in evaluation and why the process needs to unfold in particular ways.

Figure 1. Decision tree: Who should conduct the evaluation?


"How we do things here": Developing a culture of evaluation

Regardless of who conducts the evaluation or how well it is designed, its implementation will be problematic if its purpose, process and value to clients, practitioners and the organisation are not understood and accepted by those involved in or affected by the collection of evaluation data - whether directly as practitioners engaged with clients or program participants, or indirectly as managers of programs.

An agency with an evaluation culture is one that is committed to using evaluation findings to inform its decisions (Owen, 2003). Incorporating evaluation into the culture of the organisation means that:

  • conducting evaluations becomes accepted and is understood by its members;
  • members can design their own evaluations, or get help and guidance from others within the organisation; and
  • evaluation is used by members in their daily practice (Murphy, 1999).

Some attitudes that may make program evaluation challenging, and suggestions for counteracting these attitudes, are provided in the box below.

Challenging attitudes to evaluation

Misunderstandings about or organisational conflict regarding the need for program evaluation can undermine the best efforts at using evaluation for good in an organisation. Posavac (2011) outlined some "dysfunctional" attitudes towards evaluation and ways of dealing with these attitudes, and some of these are below:

The program is exemplary

  • Most program planners are enthusiastic and confident about the successes that their programs will bring, and may feel betrayed when evaluation fails to show the same level of impact.
  • Suggestion: Help the planners/organisation to understand what level of improvement is reasonable to expect. A small improvement experienced by many people may be as valuable as, if not more valuable than, a large improvement in a few people. Dramatic program effects are unlikely to occur, as any comparison groups are likely to be receiving some form of help or support also.

Evaluation will offend the program staff

  • No one likes to be judged, particularly if any outcomes become publicly available.
  • Suggestion: Remind program staff that it is an evaluation of the program, not of the personnel. Evaluators need to recognise that making program staff defensive is unlikely to lead to a good evaluation - focusing on discussions around improvements may help.

The program will be terminated

  • If a program is shown not to be working, there is a risk it will be terminated.
  • Suggestion: Often funders will be under pressure to find a replacement for a program that is shown to not work as well as expected. An unfavourable outcome, for this reason, is more likely to result in program refinement rather than program elimination.

Evaluation drains program resources

  • There may be objections to directing project money to evaluation at the expense of direct service provision.
  • Suggestion: It may be true that program money is redirected from service provision. Yet the alternative may be that money is spent on services and programs that aren't working or are inefficient. An evaluated program is also more likely to attract additional support and resources.

Why develop a culture of evaluation?

A culture of evaluation helps to reinforce reflective thinking and practice by highlighting the potential contribution that understanding "what works" can bring. Having evaluation processes in place and a workforce that understands and can readily absorb evaluation activities into their daily practice also helps them to be responsive to external demands for accountability (US General Accounting Office, 2003). With the mechanisms already in place, evidence to demonstrate the effectiveness of a program can be gathered at short notice.

Further benefits of a culture of evaluation include:

  • staff becoming more focused on and interested in what they do and how they do it;
  • increased confidence in having evidence that a program or service is having a positive impact on clients;
  • improved relationships among staff, through increased interaction and teamwork; and
  • staff gaining new skills and knowledge (Murphy, 1999).

How to create a culture of evaluation

The process of embedding evaluation into the way in which an agency operates on a daily basis may require organisational change, planning and management. Including practitioners on the evaluation development team and working through exactly what a culture of evaluation means for them in their daily practice helps to promote ownership of the process and reduce the two-cultures mindset (the "evaluators" and "us") (Owen & McDonald, 1999).

It may be useful to start on a small scale by evaluating one or two programs in a year, to help staff see how the process works and begin to foster their acceptance. Starting small means staff can experience the flow-on effects of the application of evaluation findings and the benefits for service provision and client outcomes. Easing into the process also means that programs can be gradually modified to accommodate the needs of the evaluation. For example, evaluation materials (such as questionnaire items) and documents (such as those for informed consent) can be created, tested and revised, or program schedules adapted to allow for data collection (Owen, 2003).

Developing a culture of evaluation - A case study from Queensland

A key factor in developing a culture of evaluation is the overall strategy and the way it is introduced (Murphy, 1999). An example of a large-scale rollout of an evaluation strategy was provided by the former Queensland Department of Education, Training and the Arts (DETA; Hanwright & Makinson, 2008), where the implementation of the strategy comprised six primary activities:

  • A program evaluation strategy document, setting out:
    • what the strategy aimed to achieve;
    • a definition of evaluation, the main principles of good evaluation and why evaluation is important; and
    • the key deliverables and indicators of the strategy.
  • A program evaluation manual for staff that included a guide to the evaluation process and a suite of tools on specific topics, such as developing evaluation questions. Feedback from staff members indicated that these guides had helped to ease their concerns about the complexity of evaluation.
  • A rolling schedule of program evaluations in which the programs to be evaluated over the next few years were identified.
  • The development of simple program logic tools to help program managers identify inputs, outputs and outcomes, measures etc. These tools also helped to de-mystify evaluation and reduce anxiety by increasing staff familiarity, knowledge, expertise and acceptance of evaluation. The tools encouraged staff to focus on smaller, achievable parts of the evaluation process rather than tackling the entire evaluation at once.
  • Regular evaluation training workshops were held, conducted by an external evaluation professional.
  • Lunchtime evaluation forums were run by internal and external evaluators to foster discussion and skills development. Staff across the agency could also discuss evaluation experiences in an informal learning community.

Staff acceptance of the strategy grew to the extent that program logic models and evaluation principles were increasingly applied to other planning activities. A critical factor in the successful implementation of the strategy (also noted by Owen, 2003, and Murphy, 1999) was the commitment from senior management. In the DETA case, the department built capacity within the organisation rather than creating an "evaluation branch" that would essentially function as independent consultants. There were indications that middle-level managers were absorbing evaluation activities into their work areas, although the time required to incorporate evaluation into planning seemed to be a barrier for more highly placed managers.

Just as evaluation is itself an ongoing process, so too is the process of embedding evaluation into the culture of an organisation.

Ethics in evaluation

Engagement and casework with clients are guided by respect and concern for their wellbeing, and this is equally important when clients are involved in evaluation activities. The creation of documents, the protection of privacy and data, and how much information you can reasonably ask clients to contribute are also ethical considerations. The research literature is replete with discussion of the principles underlying the ethical conduct of research, and these can be used as a basis for the ethical conduct of evaluation.

Notwithstanding the differences between evaluation and research outlined in Evaluation and Innovation in Family Support Services, the obligation to participants in your evaluation is exactly the same as for participants in any research project. Many agencies and organisations have ethics committees that consider or review ethical aspects of service provision and practice, and these could also be called upon to consider ethical issues in evaluating practice. Since all academics must obtain approval from their university ethics committee, partnering or collaborating with an academic institution can provide a mechanism for considering the ethical aspects of your evaluation. Nevertheless, the appropriate committee or personnel within the agency or organisation should also appraise evaluations of programs or practices. The National Health and Medical Research Council (NHMRC) has a register of Human Research Ethics Committees (HREC) if an internal committee is unavailable.

Do I need to get ethics approval for my evaluation?

Some research projects and evaluations may not require a full ethical review, as they rely solely on data that have already been collected for other purposes. For example, an original analysis of previously collected, publicly available statistical data would not necessarily require full ethical review. Similarly, organisations often choose to exempt from their formal HREC process any research or evaluation projects that rely on "Quality Assurance" data (e.g., pre- and post-program surveys collected as part of program implementation). Projects that aim to monitor, evaluate or improve a provider's services may be deemed to be Quality Assurance if they:

  • do not impose any risk on participants;
  • use existing data already collected by that organisation in the conduct of its work;
  • ensure analysis is conducted by either members of that organisation or someone working with the organisation who is bound by a professional code of ethics;
  • do not infringe the rights or reputation of the carers, providers or institution; and
  • do not violate the confidentiality of the client.5

If you are unsure whether you will need ethics approval for your evaluation, you can contact the CFCA information exchange helpdesk.

Where ethics approval is still needed, some of the key ethical issues relating to the practicalities of conducting an evaluation are outlined below.

Values and principles of ethical conduct

Four basic principles form the framework for conducting research - and therefore evaluation - that involves humans: respect, merit and integrity, justice, and beneficence. These underpin the relationship between evaluator and participant, and should inform the way in which you set up and carry out your evaluation plan. They are discussed in Section 1 of the National Statement on Ethical Conduct in Human Research (NHMRC, Australian Research Council [ARC], & Australian Vice-Chancellors' Committee [AVCC], 2007).

Harm to participants

A key consideration in gathering information from clients or program participants is the potential for the experience to cause the individual some distress. You may need to ask about aspects of the program or therapy in which they participated that are associated with sensitive or distressing experiences, so the need for this data must be weighed against concern for the individual. You will need to put in place procedures to deal with a situation in which a participant or client becomes distressed, such as having the data collected by an appropriately trained staff member.

Obtaining informed consent

Before collecting evaluation data from participants, informed, voluntary consent must be obtained. Clients should know if the information you collect - whether at intake or registration, before, during and after a program, in counselling or therapy sessions - is to be used in the evaluation of the program or service.

Two key documents are required: an information sheet that participants can keep, and a record of their informed consent. The information sheet must be written in plain language and be sufficiently detailed (but not overwhelmingly so) for the participant to make an informed choice about providing evaluation information. It should explain:

  • what the evaluation is for (e.g., to get participant feedback on how a program works);
  • what is asked of participants (e.g., complete a 20-minute survey, have their counselling session recorded, participate in a focus group);
  • who will have access to their data (i.e., other than the evaluation team);
  • what will be done with the results (e.g., identify how the program helps participants, identify which parts are effective, improve the way in which the program is delivered, identify staff training needs, decide whether to keep the program going);
  • their right to withdraw from the evaluation at any time and to have any data already collected deleted, and that their access to services will not be negatively affected if they choose to withdraw;
  • how their privacy will be protected to ensure either confidentiality or anonymity (e.g., coding of personal information; secure storage and disposal of data, whether in an electronic or physical space), and the limits of that protection (e.g., in the event of disclosure of abuse or illegal activity); and
  • the contact information of the appropriate person to whom enquiries or complaints can be directed.

You may also indicate how the findings will be reported; for example, through internal reports for management and staff, reports to funding bodies, or publication in journals.

It is good practice to clarify with participants their understanding of the information you have given them. When they have read and understood the information, their consent to participate can be indicated by signing an informed consent form (see sample informed consent document). They may wish to keep a copy of this form. If collecting data by phone, it will be useful to prepare a script for the interviewer to follow closely that clearly sets out the information and asks the participant to indicate consent, after which the interview or survey can proceed.

Consideration will need to be given to obtaining consent from people who, for whatever reason, may be less able to freely and autonomously provide consent. These may include:

  • children;
  • people with certain developmental disabilities;
  • people with certain psychiatric or medical conditions; and
  • people experiencing early or later stages of dementia.

Protection of privacy and confidentiality

The issue of protection of privacy and confidentiality may have implications for parts of the agency or organisation not directly concerned with service delivery; in particular, those responsible for data storage: do you have systems in place to protect the data, whether in hard copy or electronic files? This means having secure computer systems, secure filing or storage areas, restricted access to data files, and administration systems for storing contact details separately from participants' data and for de-identifying data (e.g., assigning codes so that names are not recorded on questionnaires). Addressing these concerns may not be especially onerous - adequate systems and procedures may already be in place.

Particular issues arise if your evaluation involves Indigenous people. Guidelines for evaluation (and research) with and about Indigenous peoples (PDF 96.5 KB) are available from the Australian Institute of Aboriginal and Torres Strait Islander Studies (n.d.).

Further reading in ethics

Evaluation examples

Within the family relationships sector, a number of examples of the application of the principles of evidence-based practice are to be found, particularly in parent education. In Australia, the authors of the Triple-P, Tuning Into Kids, and Parents Under Pressure parenting programs have exposed their programs to rigorous testing employing RCTs.


  • Australian Institute of Aboriginal and Torres Strait Islander Studies. (n. d.). Guidelines for ethical research in Indigenous studies (PDF 96.5 KB). Canberra: AIATSIS. Retrieved from <www.aiatsis.gov.au/research/docs/ethics.pdf>.
  • Bickman, L., & Reich, S. M. (2009). Randomised controlled trials: A gold standard with feet of clay? In S. I. Donaldson, C. A. Christie, & M. M. Mark (Eds.), What counts as credible evidence in applied research and evaluation practice? Thousand Oaks, CA: Sage.
  • Charman, D., & Barkham, M. (2005). Psychological treatments: Evidence-based practice and practice-based evidence. InPsych, December. Retrieved from <www.psychology.org.au/publications/inpsych/treatments>.
  • Clarke, A. (2006). Evidence-based evaluation in different professional domains: Similarities, differences and challenges. In I. F. Shaw, J. C. Greene, & M. M. Mark (Eds.), The Sage handbook of evaluation (Chap. 25). London: Sage Publications.
  • Donaldson, S. I., Christie, C. A., & Mark, M. M. (Eds.). (2009). What counts as credible evidence in applied research and evaluation practice? Thousand Oaks, CA: Sage Publications.
  • Fitzpatrick, J., Sanders, J., & Worthen, B. (2011). Program evaluation: Alternative approaches and practical guidelines (4th ed.). New Jersey: Pearson Education.
  • Gibbs, L. (2003). Evidence-based practice for the helping professions: A practical guide with integrated multimedia. Pacific Grove, CA: Brooks/Cole-Thomson Learning.
  • Hanwright, J., & Makinson, S. (2008). Promoting evaluation culture: The development and implementation of an evaluation strategy in the Queensland Department of Education, Training and the Arts. Evaluation Journal of Australasia, 8(1), 20-25.
  • Margison, F., Barkham, M., Evans, C., McGrath, G., Mellor-Clark, J., Audin, K., & Connell, J. (2000). Measurement and psychotherapy: Evidence-based practice and practice-based evidence. British Journal of Psychiatry, 177, 123-130.
  • McNamara, C. (n. d.). Basic guide to program evaluation. Minneapolis, MN: Authenticity Consulting. Retrieved from <www.managementhelp.org/evaluatn/fnl_eval.htm>.
  • Metz, A. J. R., Espiritu, R., & Moore, K. A. (2007). What is evidence-based practice? (Child Trends Research-to-Results Brief No. 14). Retrieved from <www.childtrends.org/?publications=what-is-evidence-based-practice>.
  • Midgley, N. (2009). Editorial: Improvers, adapters and rejecters. The link between "evidence-based practice" and "evidence-based practitioners". Clinical Child Psychology and Psychiatry, 14(3), 323-327.
  • Murphy, D. (1999). Developing a culture of evaluation. Paris: TESOL France. Retrieved from <www.tesol-france.org/articles/murphy.pdf>.
  • National Health and Medical Research Council, Australian Research Council, & Australian Vice-Chancellors' Committee. (2007). National Statement on Ethical Conduct in Human Research. Canberra: NHMRC. Retrieved from <www.nhmrc.gov.au/guidelines/publications/e72>.
  • Owen, J. (2003). Evaluation culture: A definition and analysis of its development within organisations. Evaluation Journal of Australasia, 3(1), 43-47.
  • Owen, J. M., & McDonald, D. E. (1999). Creating an evaluation culture in international development cooperation agencies. Journal of International Cooperation in Education, 2(2), 41-53.
  • Posavac, E. (2011). Program evaluation: Methods and case studies (8th ed.). New Jersey: Prentice Hall.
  • Shlonsky, A., & Ballan, M. (2011). Evidence-informed practice in child welfare: Definitions, challenges and strategies. Developing Practice, 29, 25-42.
  • Simons, H. (2006). Ethics in evaluation. In I. F. Shaw, J. C. Greene, & M. Mark (Eds.), Handbook of evaluation: Policies, programs and practices (Chapter 11). London: Sage Publications.
  • Simons, M., & Parker, R. (2002). A study of Australian relationship education activities. Melbourne: Australian Institute of Family Studies.
  • Smith, G., & Pell, J. (2003). Parachute use to prevent death and major trauma related to gravitational challenge: Systematic review of randomised controlled trials. British Medical Journal, 327, 1459-1461. Retrieved from <www.bmj.com/cgi/content/abstract/327/7429/1459>.
  • South Australian Community Health Research Unit. (2008). Planning and evaluation wizard. Adelaide: Flinders University. Retrieved from <www.flinders.edu.au/medicine/sites/pew/pew_home.cfm>.
  • Tomison, A. (2000). Evaluating child abuse protection programs (Issues in Child Abuse Prevention, No. 12). Melbourne: National Child Protection Clearinghouse. Retrieved from <www.aifs.gov.au/nch/pubs/issues/issues12/issues12.html>.
  • Trochim, W. M. K. (2006). Research methods knowledge base. Ithaca, NY: Web Center for Social Research Methods. Retrieved from <www.socialresearchmethods.net/kb>.
  • US General Accounting Office. (2003). Program evaluation: An evaluation culture and collaborative partnerships help build agency capacity (PDF 443 KB) (GAO-03-454). Washington, DC: GAO. Retrieved from <www.gao.gov/new.items/d03454.pdf>.
  • What Works for Children Group. (2003). Evidence guide: An introduction to finding, judging and using research findings on what works for children and young people (PDF 269 KB). Retrieved from <www.whatworksforchildren.org.uk/docs/tools/evidenceguide%20june2006.pdf>.


1. These have been the subject of much debate in recent years. See Donaldson, Christie, and Mark (2009) for a comprehensive overview of the debate.

2. These descriptions are largely based on material from Fitzpatrick, Sanders, and Worthen (2011) and the What Works for Children Group (2003).

3. Of course, this could happen by chance, but statistically it is highly unlikely.

4. For a humorous and provocative take on the use of randomised controlled trials, see Smith and Pell (2003).

5. Further detail on Quality Assurance and ethical review is available from the NHMRC <http://www.nhmrc.gov.au/guidelines/publications/e46>.


This paper was first developed and written by Robyn Parker, and published in the Issues Paper series for the Australian Family Relationships Clearinghouse (now part of CFCA Information Exchange). This paper has been updated by Elly Robinson, Manager of the Child Family Community Australia information exchange at the Australian Institute of Family Studies.