The efficacy of early childhood interventions

Content type

Research report

Published

July 2005

Overview

This report works towards producing an evidence base concerning the efficacy of early childhood interventions in Australia.

Thirty-two Australian and international programs were evaluated on their design, implementation, effectiveness, and cost effectiveness, where possible. Cost-benefit analysis is discussed in more detail, as well as the lessons for early childhood intervention policy in Australia. The details of the 108 programs initially identified and the 32 selected for review are included in appendixes.

Director's foreword

I am delighted that the Australian Institute of Family Studies could be involved in the research resulting in this report. Prepared cooperatively with the Melbourne Institute of Applied Economic and Social Research, it represents an important step forward in establishing an evidence base concerning the efficacy of early childhood interventions in the current Australian context.

Although it is widely acknowledged that early childhood provides a unique window of opportunity for optimising children's capacity for learning, as well as a period where adverse experiences can have serious long-term effects, much less is known about how to transform this knowledge into effective interventions, nor how much investment should be made in these initiatives.

Information about effectiveness of programs currently operating in Australia is especially thin on the ground. One cannot assume that any type of intervention in early childhood will pay long-term dividends. Some interventions are more effective than others but, importantly, some are more cost-effective. It is necessary to scrutinise the evidence about cost effectiveness. As such, the report is especially timely, given the widespread interest in early intervention and prevention, not only across the nation but also internationally.

Cost-effectiveness has not been a particular focus in Australia. Such information is necessary to distinguish those initiatives that are worthy of investment and those that are not.

It is prudent that government is focusing on this issue, and appropriate that the Australian Institute of Family Studies and the Melbourne Institute should be supporting the Australian Government endeavours. The Institute is actively researching across areas such as crime prevention, prevention of drug and alcohol misuse, child abuse prevention, family and relationship support. These are important contributions to the Australian knowledge base. The availability of the Melbourne Institute's economic expertise has made for a very productive and complementary collaboration.

It is clear that early childhood interventions are generally worthy investments. It is my hope that governments and other stakeholders will accept the guidance contained in this report about how to produce knowledge about the returns on public investment that different programs produce. Children, families, and ultimately society, can all benefit from this knowledge.

I am grateful to the authors of this report, Sarah Wise and Lisa da Silva from the Australian Institute of Family Studies, Elizabeth Webster from the Melbourne Institute, and Ann Sanson, now at the University of Melbourne, for their valuable contribution to the literature on early intervention and prevention. I am also grateful to the Family and Children's Policy Branch of the Australian Government Department of Family and Community Services for its support in commissioning this work.

Professor Alan Hayes
Director
Australian Institute of Family Studies

Executive summary

Interventions in early childhood aimed at improving psychosocial conditions linked to child development have a long history. Evaluations of the impact of early childhood interventions on child and parenting outcomes indicate that they yield positive and substantial short-term effects, but the long-term outcomes have rarely been studied. Studies that have followed children longitudinally have found that cognitive effects tend to diminish over time, but that the interventions have positive long-term effects on crime and delinquency.

Long-term benefits (including cost-savings) of interventions in early childhood continue to be asserted in broad public debates, despite limited empirical support. More extensive examination of the cost effectiveness, or costs and benefits, of early childhood interventions is needed to substantiate claims of effectiveness.

The Australian Government Department of Family and Community Services approached the Australian Institute of Family Studies (Institute or AIFS) and the Melbourne Institute of Applied Economic and Social Research (Melbourne Institute) to conduct the Effectiveness of Early Childhood Interventions (EECI) project. Broadly, the goal was to conduct a review of selected early childhood interventions and provide further information about cost-benefit evaluations in a way that is relevant to Australian policy makers. Although there have been substantive reviews of early childhood intervention (for example, Karoly, Greenwood, Everingham, Hoube, Kilburn, Rydell, Sanders and Chiesa 1998; Mrazek and Brown 2002), there has not been an extensive review of the costs and benefits of early childhood interventions.

For the purpose of the EECI project, early childhood interventions were defined as public programs that attempt to improve child health and development during the period from conception to six years of age. In the current review, 108 national and international interventions with published evaluation data were identified from a systematic search of relevant electronic databases, from which 32 were selected for review.

In selecting these 32 programs, priority was given to programs that were well researched, or where a cost-benefit analysis had been conducted. Large-scale, well-established programs were also given priority, as were programs where the ultimate target population was the child. The 32 selected programs were classified into five clusters according to type of program, foci, location and focal child age.

The adequacy of design and implementation of the programs was reviewed according to four criteria: dosage/intensity, participation, implementation and drop-out rates. These reviews indicated that the adequacy of design and implementation was highly variable.

The adequacy of evaluation design was also reviewed according to a number of criteria. The majority of evaluations were at least adequate and often good or excellent in design.

The effectiveness of interventions was reviewed by an examination of effect sizes. Effect sizes were grouped according to their value into one of four categories: negligible, small, medium and large. Of the interventions that provided effect sizes, many had immediate and short-term, albeit often small, effects. As mentioned previously, very few programs have examined long-term effects. The effect sizes found in interventions that did examine long-term effects indicated that cognitive effects diminished over time, but that interventions had positive effects on some late adolescent and adult outcomes.
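The grouping of effect sizes into four categories can be sketched in a few lines of Python. This is an illustration only: the cut-offs of 0.2, 0.5 and 0.8 are Cohen's conventional thresholds for a standardised mean difference and are assumed here, since this summary does not state the boundaries the review used. The function names and sample scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treated, control):
    """Standardised mean difference between two groups, using a pooled SD."""
    n1, n2 = len(treated), len(control)
    pooled_var = ((n1 - 1) * variance(treated) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treated) - mean(control)) / sqrt(pooled_var)


def categorise(d, small=0.2, medium=0.5, large=0.8):
    """Map an effect size to one of four labels (cut-offs are assumptions)."""
    size = abs(d)  # the label reflects magnitude, not direction
    if size < small:
        return "negligible"
    if size < medium:
        return "small"
    if size < large:
        return "medium"
    return "large"


# Hypothetical outcome scores for an intervention group and a comparison group.
treated_scores = [2.0, 4.0, 6.0]
control_scores = [1.0, 3.0, 5.0]
effect = cohens_d(treated_scores, control_scores)
label = categorise(effect)
```

With these made-up scores the two groups differ by half a pooled standard deviation, which falls in the "medium" band under the assumed cut-offs.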

While these findings provide some basis for estimating likely future benefits of early intervention programs, missing data on the restricted set of programs included in this review means that it is inappropriate to comment on the utility of early childhood interventions as a general strategy to sustain improvements for children in the long-term.

Moreover, of the 108 interventions that were initially identified, only eight programs included a cost-benefit study. There have been no cost-benefit analyses undertaken of Australian programs.¹

A discussion of the purpose of cost-benefit analyses and the process of conducting a cost-benefit analysis follows the evaluation of the effects of early childhood interventions. A critique of the three main formulae or methodologies used in cost-benefit analyses (Net Present Value, Rate of Return and Cost Effectiveness) and a discussion of ways of valuing non-market costs and benefits are also provided. Finally, the eight early childhood interventions with a cost-benefit component are critically reviewed according to the three main steps in conducting a cost-benefit analysis: estimating the net impact of the program, estimating costs and benefits, and calculating net effects.

Overall, this review of early childhood interventions highlights a definite need for more data on early childhood interventions before conclusions regarding cost-benefits are made. It is recommended that evaluations are planned at the same time as programs are designed, to ensure they are set up to enable cost-benefit analyses. This involves random assignment of the target population, as well as collection of data on participant characteristics, program costs and program effects.

1. However, it is worth noting that a cost-effectiveness study has been conducted on the Positive Parenting Program (Triple P: Turner, Mihalopoulos, Murphy-Brennan and Sanders 2004).

1. Background and purpose of the project

Interventions to promote positive early childhood environments and optimal development are not new. Intensive pilot interventions such as the Perry Preschool Project, which ran between 1962 and 1967 (Schweinhart, Barnes and Weikart 1993), and large-scale ongoing interventions such as Head Start (FACES 2003) are explicitly aimed at improving psychosocial conditions linked to child development in the pre-school years. Developmental gains are also expected to carry over into later stages of development, resulting in fewer problems and better functioning into middle childhood, adolescence and beyond.

The advent of a new knowledge base from developmental neuroscience, and growing evidence from longitudinal studies, has strengthened the argument for expenditure on interventions in early childhood.

Early childhood is now understood to be a 'sensitive' period for brain development (also sometimes referred to as a 'critical period', but see Bailey (2002) for a critique). There is a proven relationship between the quality of early childhood experiences - that is, the amount of positive stimulation and sensitive, responsive caring by familiar adults - and the developing capabilities of the brain (Shonkoff and Phillips 2000). Negative experiences, such as exposure to a violent home environment, are also linked to sustained, harmful effects on brain function, and, in turn, negative effects on behaviour, cognition and emotional wellbeing (Schorr 1997). Poor environmental circumstances, such as low family income, have particularly negative effects on children's cognitive development, behaviour and school achievement (Bailey 2002; Brooks-Gunn 2003).

It is generally accepted that experiences in the early years provide a foundation for future development. This of course does not preclude the possibility of change in developmental pathways depending on later experiences (Bailey 2002; Brooks-Gunn 2003; Shonkoff and Phillips 2000). Some commentators have concluded that experiences and circumstances from conception to age six, and particularly in the first three years, affect brain development in a way that 'will affect learning, health and behaviour throughout life' (McCain and Mustard 1999: 5).

The promise of diminishing the burden of disease and dysfunction across the lifespan has encouraged governments and other agencies to invest more heavily in children before they enter formal schooling. This has involved a specific focus on targeted early childhood interventions to assist children from disadvantaged backgrounds to enter school on a more equal footing with more advantaged children (Brooks-Gunn 2003).

There has also been a diversification of early childhood interventions in step with theoretical shifts in developmental science. The evolution of comprehensive, holistic or 'multilevel' interventions, which employ programs, services and benefits that target outcomes across child, parent and community domains, reflects ecologically based models of child development, wherein the child is viewed in the context of the family, the family in the context of the community, and the community in the context of society at large.

The aims of early childhood interventions have also broadened. A new body of literature emphasises the importance of focusing on non-cognitive skills as a critical component of child success. If early childhood interventions can avoid the need for special education services at school, and help children get along better with peers, then they are deemed successful, despite their lack of long-term improvements in cognitive skills (Currie 2003).

Despite a strong theoretical base for establishing a foundation of optimal early childhood experiences, it is clear that without appropriate interventions at other crucial developmental stages, children will not be safeguarded from problems in the years to come (Bacharach 2002; Zigler and Styfco 1996; Brooks-Gunn 2003). Brooks-Gunn (2003: 1) even suggests: 'It is magical thinking to expect that if we intervene in the early years, no further help will be needed by children in the elementary school years and beyond'.

A small collection of systematic reviews provides a central source of information about the effectiveness of early childhood interventions. The RAND report, entitled 'Investing in Our Children: What We Know and Don't Know about the Costs and Benefits of Early Childhood Interventions' (Karoly et al. 1998), provides an independent, objective review of the state of knowledge on early childhood interventions at the time the report was produced in 1997. Similarly, the 'Invest in Kids' project (Russell 2002) provides a summary of the outcomes of a large number of early childhood interventions, categorised according to the strength of the evaluation design and intervention type. Undertakings such as these show that a significant proportion of well designed early childhood interventions yield positive and substantial short-term outcomes, with cognitive effects typically diminishing over time but positive effects on crime rates and employment being evident.²

The Perry Preschool Project is one of only a handful of interventions with a longitudinal evaluation component, following children to the age of 27 years. It found that short-term improvements in cognitive outcomes weakened over time. However, the intervention did show a reduction in crime rates and better employment outcomes during late adolescence and early adulthood (Schweinhart, Barnes, and Weikart 1993). Barnett's (1995: 43) renowned review of the long-term effects of early childhood interventions also concluded that they can produce 'sizable persistent effects on achievement, grade retention, special education, high school graduation and socialization'. Further, there is considerable research suggesting that the effects of interventions in early childhood can be sustained over time if subsequent schooling is of high quality (for example, Currie and Thomas 2000).³

In sum, many early childhood interventions have demonstrated positive, and often quite strong, short-term effects, but further longitudinal research is needed to confirm mid- to long-term effects (Emde 2003; Reynolds 1994). Further, program evaluations are limited mainly to measuring the effects the intervention has had on characteristics of the sample, or outcomes (such as parent employment or child literacy and numeracy), without taking the extra steps associated with cost-benefit analyses.

Despite this, the long-term benefits (including cost-savings) of interventions in early childhood are continually communicated in broad public debates, with the 'seven dollar return to every dollar spent' finding of the Perry Preschool Project (Barnett 1993b; Weikart 1996) receiving a regular airing. Just how generalisable the Perry finding is to other interventions in early childhood is actually a matter for debate, as very few interventions have collected the data needed to perform cost-savings estimations.

The Australian Government Department of Family and Community Services approached the Australian Institute of Family Studies (Institute or AIFS) and the Melbourne Institute of Applied Economic and Social Research (Melbourne Institute) to conduct the Effectiveness of Early Childhood Interventions (EECI) project. Broadly, the goal was to conduct a review of select early childhood interventions and provide further information about cost-benefit evaluations. This includes an attempt to model likely savings from early childhood interventions in a way that is relevant to Australian policy makers.

Specifically, the EECI project aims to report objectively on the cost savings potential of early childhood interventions, and what further information is required to assemble an evidence base on cost-benefits of early childhood interventions in Australia. This should assist future judgements about investments in early childhood interventions in Australia.

The key objectives of the project are to: evaluate methodologies for producing cost-benefit analyses of early childhood interventions; describe the appropriateness of existing evaluation data for conducting cost-benefit analyses of early childhood interventions; and evaluate the extent to which Australian evaluations provide the necessary parameters for cost-benefit analyses.

To meet these goals, national and international early childhood interventions were identified through a systematic search of the available literature. Characteristics of the interventions, including the type of intervention, the intervention received, the subject population, the evaluation methodology, program costs, and anticipated and actual benefits, were documented. Programs meeting predetermined criteria were then selected for a more detailed assessment.

A select subset of the initial sample of interventions was then classified into one of five 'clusters' according to key program components. This enabled easy interpretation of the adequacy of the design, implementation and evaluation of interventions included in the review and provided a background for the review of the rigour of cost-benefit studies that follows.

Discussion of the parameters necessary to perform cost-benefit analyses, including ways of quantifying the benefits of interventions that do not have a 'price', such as child behaviours, mother's social support and parenting, precedes an evaluation of the cost-benefit methodologies undertaken on interventions in this review. Finally, recommendations are made as to the appropriate model to determine the cost-benefits of early childhood interventions in Australia.

2. See also two recent Australian reviews of early childhood interventions. First, Bowes (2000) provides a review of parent education and support programs in the United States. Second, the report titled 'A Head Start for Australia: An early years framework' (NSW and QLD Commissions for Children and Young People) provides an examination of the research findings on early intervention programs and their application to the Australian context. Recent commentaries by Brooks-Gunn (2003) and Anderson, Shinn, Fullilove, Scrimshaw, Fielding, Normand, Carande-Kulis and the Task Force on Community Preventive Services (2003) are also relevant.

3. It bears noting that the positive effects of targeted early childhood interventions, while substantial, have not raised outcomes among children from disadvantaged backgrounds to the same level as their more advantaged peers (see Zigler 2003).

2. Terminology and scope of the review

Definition of early childhood interventions

For the purpose of the EECI project, early childhood interventions were defined as programs that attempt to improve child health and development during the period from conception to six years of age with the expectation that these improvements will have long-term consequences for child development and wellbeing.

Early childhood interventions include:

Programs that focus on 'health promotion', or the prevention of onset of mental, social and behavioural problems by encouraging positive development and resiliency.⁴ Early childhood health promotion interventions may be universally accessible, targeting the general public or an entire population (also referred to as primary prevention programs), or tailored towards children or families believed to be at high risk of problems developing (also referred to as targeted or selective programs, or secondary prevention programs).
Programs that focus on preventing the progression of problems that have already surfaced - also known as indicated programs, early childhood early intervention programs, or tertiary prevention programs.

Early childhood interventions may have one or more of the following outcome objectives or foci - parent-child relationships, parental knowledge, parenting skills, social support, the child's cognitive, language and social development, school performance, and broader community and social conditions (for example, economic circumstances of the family) that interact with parental functioning and the child's wellbeing.

Definition of cost-benefit analyses

In the current report, cost-benefit 'methodologies' are analyses that include three estimations - present value, rate of return, and cost effectiveness. (There are several other common estimations such as the cut-off period and the pay-off period, but as these are mainly used in business where debt financing is used and risk of bankruptcy is an issue, they are not reviewed here.)
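As a rough illustration of the three estimations named above, the following Python sketch computes a net present value, a rate of return and a cost-effectiveness ratio for a stream of yearly costs and benefits. It uses textbook definitions rather than the report's own formulae, which are not reproduced in this section. The function names, the bisection search and all dollar figures are hypothetical and are not taken from any program in the review.

```python
from typing import Sequence


def present_value(amounts: Sequence[float], rate: float) -> float:
    """Discount yearly amounts (index 0 = program start) to a present value."""
    return sum(a / (1.0 + rate) ** year for year, a in enumerate(amounts))


def net_present_value(benefits: Sequence[float], costs: Sequence[float], rate: float) -> float:
    """Discounted benefits minus discounted costs."""
    return present_value(benefits, rate) - present_value(costs, rate)


def rate_of_return(benefits, costs, low=0.0, high=1.0, tol=1e-9) -> float:
    """Discount rate at which net present value is zero, found by bisection.

    Assumes NPV is positive at `low` and negative at `high`.
    """
    if net_present_value(benefits, costs, low) <= 0 or net_present_value(benefits, costs, high) >= 0:
        raise ValueError("NPV must change sign between low and high")
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_present_value(benefits, costs, mid) > 0:
            low = mid   # still profitable at this rate: the root is higher
        else:
            high = mid  # no longer profitable: the root is lower
    return (low + high) / 2.0


def cost_effectiveness(costs: Sequence[float], outcome_units: float, rate: float) -> float:
    """Discounted cost per unit of an outcome that has not been given a dollar value."""
    return present_value(costs, rate) / outcome_units


# Hypothetical program: $1000 spent per child up front, benefits arriving later.
costs = [1000.0, 0.0, 0.0]
benefits = [0.0, 600.0, 720.0]

npv_undiscounted = net_present_value(benefits, costs, rate=0.0)
internal_rate = rate_of_return(benefits, costs)
cost_per_unit = cost_effectiveness(costs, outcome_units=4, rate=0.05)
```

In this made-up example the benefits exceed the costs by $320 when no discounting is applied, and the net present value falls to zero at a discount rate of 20 per cent. The cost-effectiveness ratio is useful where an outcome cannot easily be priced: here each of four outcome units costs $250.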

Scope of the review

The EECI project did not aim to provide an exhaustive review of early childhood interventions. Rather, its aim was to identify a combination of different types of early childhood intervention programs, where program efficacy had been well researched, or where a cost-benefit analysis had been undertaken. Large scale, well established public programs suggested by respected authorities were given priority. Clinical programs, case identification and treatment programs were beyond the ambit of the review.

Further, the project emphasised programs where the ultimate target population was the child, and, preferably, where the intervention was child-focused or oriented to child outcomes, such as the child's cognitive, language and social development and school performance. A smaller percentage of the interventions were parent or family oriented. These types of interventions focus on positive changes for the parent (such as parenting knowledge or health and wellbeing) or family (such as economic self sufficiency), on the assumption that these benefits will have an indirect impact on the child.

The review also focused on interventions based on an 'ecological' model of child development, that is, interventions that are multidimensional, or seek to intervene at the level of the child, the parent or family and the community (sometimes referred to as two generation programs in the United States literature), or comprehensive interventions, which seek to promote a range of positive outcomes, such as enhanced health, school readiness and emotional and social wellbeing.

Selection of early childhood interventions

The first stage in selecting early childhood interventions for the current project involved a systematic literature review of early childhood interventions with published evaluation data. The World Wide Web, relevant electronic databases (for example, PsycINFO, ERIC), English language peer reviewed journals, and expert referrals formed the basis for the search.

This process yielded 108 early childhood interventions. A full summary of these programs appears in Appendix 1. These interventions were largely situated in the United States, and included interventions that are no longer running such as the Perry Preschool Project (1962-1967, Ypsilanti, US) and the Elmira Prenatal/Early Infancy Project (1978-1982, Elmira, US), but have followed up participants over a number of years, as well as interventions that are currently operating (such as Head Start and Early Head Start, both operated at multiple sites in the US).

Effort was also made to obtain information about interventions running in non-English-speaking countries. Information was gathered on interventions such as the Wasi Wasi Home Child care program (Peru), the Early Enrichment Project (Turkey), and the Colombia Promesa Program (Colombia). These interventions, most often run in developing countries, differ from interventions in Western countries in a number of ways. First, expenditure per child is lower; second, staff are generally less well trained; and third, nutrition and physical health are the primary focus, as opposed to developmental health more broadly (Behrman, Cheng and Todd 2004).

A number of Australian interventions were also found via the initial search, such as Best Start (DHS Victoria 2001), Good Beginnings (www.goodbeginnings.net.au) and Families First (Fisher, Kemp and Tudball 2002). Some interventions that were developed in other countries and later taken up in Australia were also found, such as the Home Instruction for Parents of Preschool Youngsters (HIPPY) program, developed in Israel.

Of these 108 interventions, many were demonstration or pilot programs that involved only small sample sizes. In addition, very few involved longitudinal follow-up. The majority of interventions were targeted, typically at children and families from disadvantaged backgrounds - for example, low socio-economic status, children at risk for child abuse and neglect, adolescent mothers, children with behavioural problems.

A variety of intervention strategies were represented - for example, home visiting, centre-based services, group meetings and workshops. Some of the interventions were operated at a single location, while others involved coordination of multiple initiatives at multiple sites. Evaluations of the interventions ranged from weak (small, qualitative studies without a comparison sample) to rigorous (large, randomised, longitudinal studies).

The outcomes measured included child outcomes (typically cognitive, behavioural and social), parent outcomes (typically parenting and parental wellbeing) and family outcomes (typically family relationships and economic self-sufficiency). The longitudinal outcomes examined were most often child outcomes and included crime and delinquency, education, employment and income.

The sample of 108 interventions was then reduced to a smaller subset of interventions that met the essential criteria for inclusion; as previously discussed, a strong evaluation component was essential. This process also aimed to identify a range of different programs in order to provide a broad representation of early childhood interventions. All but three of the interventions operating in Australia (the Positive Parenting Program (Triple P), Baby Happiness, Understanding, Giving and Sharing (Baby HUGS) and Support at Home for Early Language and Literacy (SHELLS)) were 'screened-out' on this criterion. Interventions that were not appropriately 'child focused', or did not meet the definition of an early childhood intervention outlined above were also eliminated, as well as interventions that were determined to have inadequate evaluations due to very small sample sizes or inappropriate designs.

4. Resiliency refers to the ability to recover quickly from and adapt successfully to adversity ('What is resiliency?').

3. Classification of interventions

A total of 32 interventions were determined to meet our criteria for selection. They are listed below, along with the country and year(s) of operation. Interventions with a cost-benefit component are highlighted with an asterisk (*).

Selected interventions

Perry Preschool Project, Ypsilanti, US 1962-1967.*
Elmira Prenatal/Early Infancy Project, Elmira, US 1978-1982.*
Head Start, multiple sites, US 1965-current.
Early Head Start, multiple sites, US 1995-current.
Carolina Abecedarian Project, Carolina, US 1972-1985.*
Infant Health and Development Project, 8 sites, US 1985-2000.
Chicago Child-Parent Center, Chicago, US 1967-current.*
Syracuse Family Development Research Program, Syracuse, US 1969-1975.
High/Scope Preschool Curriculum Study, Ypsilanti, US 1967-1970.
New Hope Project, Milwaukee, US 1994-1998.
Parent-Child Development Centers, multiple sites, US 1970-1980.
Starting Early Starting Smart, 12 sites, US 1997-2001.*
Better Beginnings, Better Futures, Canada 1991-current.
Sure Start, multiple sites, United Kingdom 1999-current.
Positive Parenting Program, multiple sites, Australia, ongoing.*
Support at Home for Early Language and Literacy, NSW, Australia 1997-current.
Baby Happiness, Understanding, Giving and Sharing Program, Australia, current.
Parents as Teachers, Massachusetts, US 1984-current.
Home Instruction for Parents of Preschool Youngsters Program, multiple sites, international, ongoing.
New Parent Infant Network, United Kingdom 1980-current.
Project 12-Ways, Southern Illinois, US 1979-1985.
Even Start, multiple sites, US 1989-current.
Comprehensive Child Development Program, multiple sites, US 1990-1995.
Hawaii Healthy Start Program, Hawaii, US 1985-1988.
Florida Family Transition Program, Florida, US 1994-2000.*
Teenage Parent Demonstration Program, 3 sites, US 1986-1998.
Cuyahoga County Early Childhood Initiative, Ohio, US 2000-2002.
Saginaw Prekindergarten Program, Michigan, US 1960-current.
Bolivia Integrated Child Development Project, Bolivia, ongoing.*
Early Enrichment Project, Turkey 1982-1986.
Incredible Years, US and United Kingdom 1982-current.
Early Childhood Education and Assistance Program, Washington, US 1985-current.

These 32 early childhood interventions selected for further evaluation (referred to as 'the interventions' from this point) are more fully documented in Appendix 2. Interventions are documented by name of the intervention, where the program was located (for example, country), target population, sample size, intervention strategy, where the intervention took place (for example, home visitation, community centre), anticipated outcomes/benefits, time frame for anticipated benefits and details of the intervention evaluation, including findings related to the impact of the intervention on outcomes and whether a cost-benefit analysis had been conducted.

The interventions were grouped to allow for ease of reporting and interpretation. Interventions have been classified in a number of ways in the early intervention/prevention literature (for example, Mrazek and Brown 2002; Brooks-Gunn 2003). They have been classified according to location, target, timing, intensity, extensiveness and curriculum. The EECI project attempted to incorporate some of these approaches, resulting in interventions being classified in terms of four key intervention components:

Type of intervention: universal, targeted or indicated.
Foci/benefit: child outcome, parent/family outcome or child and parent/family/community outcomes (multi-level programs).
Intervention location: home visit, clinic-based or child care centre/preschool.
Focal age of children: prenatal, infancy, toddler, pre-school or early school aged.
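The four components listed above amount to a small classification scheme, which can be sketched as a data structure. The Python below is illustrative only: the class names and the grouping helper are hypothetical, and the example profile is read off the report's description of cluster 1 (targeted, child focused, centre based, preschool age) rather than taken from the appendix tables.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, FrozenSet, List


class InterventionType(Enum):
    UNIVERSAL = "universal"
    TARGETED = "targeted"
    INDICATED = "indicated"


class Focus(Enum):
    CHILD = "child outcome"
    PARENT_FAMILY = "parent/family outcome"
    MULTI_LEVEL = "child and parent/family/community outcomes"


class Location(Enum):
    HOME_VISIT = "home visit"
    CLINIC = "clinic-based"
    CENTRE = "child care centre/preschool"


class FocalAge(Enum):
    PRENATAL = "prenatal"
    INFANCY = "infancy"
    TODDLER = "toddler"
    PRESCHOOL = "pre-school"
    EARLY_SCHOOL = "early school aged"


@dataclass(frozen=True)
class Profile:
    """One intervention described on the four components."""
    type: InterventionType
    focus: Focus
    location: Location
    focal_ages: FrozenSet[FocalAge]  # a program may span several age bands


def group_by_profile(programs: Dict[str, Profile]) -> Dict[Profile, List[str]]:
    """Collect program names that share an identical profile."""
    groups: Dict[Profile, List[str]] = {}
    for name, profile in programs.items():
        groups.setdefault(profile, []).append(name)
    return groups


# Profile matching the report's description of cluster 1.
cluster_1 = Profile(
    InterventionType.TARGETED,
    Focus.CHILD,
    Location.CENTRE,
    frozenset({FocalAge.PRESCHOOL}),
)
groups = group_by_profile({"Perry": cluster_1, "Head Start": cluster_1})
```

Grouping on exact profile equality is stricter than the report's clusters, which describe what programs in a cluster are 'most likely' to look like, so a real grouping would need some tolerance for programs that differ on one component.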

Intervention 'clusters'

The interventions were examined in terms of the four key intervention components described above and determined to fall into one of five intervention types or 'clusters'. These groupings are described below.

Cluster 1: targeted, child focused, centre based, preschool age

Interventions grouped in 'cluster 1' are targeted interventions that aim to improve child development directly - that is, through interventions involving children as participants. The target population is primarily pre-school aged children from low-income or 'at-risk' neighbourhoods. These programs are most likely to be delivered in a child care or pre-school facility. The six interventions included in this cluster are:

Perry Preschool Project (Perry)*
Head Start
High/Scope Preschool Curriculum Comparison Study (High/Scope)
Saginaw Pre-Kindergarten Project (Saginaw)
Bolivia Integrated Child Development Program (PIDI)*
Chicago Child-Parent Center (CPC)*

Cluster 2: targeted, parent focused, home visits, all ages

Interventions that are grouped in 'cluster 2' are targeted interventions usually aimed at improving parent outcomes, such as parenting skills or social support. The target population receiving the interventions are parents with children across the early childhood age range from low-income backgrounds, or parents associated with some other risk factor (for example, depression, parent of a low birth-weight baby). The interventions typically include a strong home-visitation component. The eight interventions included in this cluster are:

Elmira Pre-natal and Early Infancy Project (PEIP)*
Houston Parent-Child Development Centre (PCDC)
Home Instruction for Parents of Pre-school Youngsters (HIPPY)
Hawaii's Healthy Start Program (Healthy Start)
Early Enrichment Project (EEP)
Support at Home for Early Language and Literacy (SHELLS)
Baby Happiness, Understanding, Giving and Sharing Program (Baby HUGS)
Project 12-ways

Cluster 3: targeted, family economic/welfare focused, all ages

Interventions that are grouped in 'cluster 3' are targeted interventions aimed at improving familial economic self-sufficiency or parental employment. The target population is parents of children across the early childhood age range from poor backgrounds or welfare recipients. These interventions include case management components, financial aid and additional services. The three interventions included in this cluster are:

New Hope Child and Family Study (New Hope)
Florida Family Transition Project (FTP)*
Teenage Parent Demonstration Project (TPDP)

Cluster 4: targeted, holistic, various locations, all ages

Interventions that are grouped in 'cluster 4' are targeted interventions. They are holistic interventions, that is, they aim to improve outcomes for both children across the early childhood age range and their parents. Thus, the intervention is targeted at both parents and children. These programs typically involve parent skills training and a child education component, and are delivered in various locations. The twelve interventions included in this cluster are:

Early Head Start
Carolina Abecedarian Project (Abecedarian)*
Infant Health and Development Project (IHDP)
Syracuse Family Development Research Program (Syracuse)
Starting Early Starting Smart (SESS)*
Even Start
Comprehensive Child Development Program (CCDP)
Incredible Years
Early Childhood Education and Assistance Program (ECEAP)
Better Beginnings Better Futures (BBBF)
Sure Start
New Parent Infant Network (NEWPIN)

Cluster 5: universal, various foci, various locations, all ages

Interventions that are grouped in 'cluster 5' are universal interventions. They are focused variously at children only, parents only or children and their parents, and apply to children across the early childhood age range. The intervention strategy and program location vary. The three interventions included in this cluster are:

Positive Parenting Program (Triple P)*
Parents as Teachers (PAT)
Cuyahoga County Early Childhood Initiative (Cuyahoga)

4. Adequacy of intervention design and implementation

It is important to interpret evaluation findings in the context of the strengths and weaknesses of the intervention design and implementation, as well as the strengths and weaknesses of the evaluation methodology. If an intervention has not been designed well or implemented according to plan, any evaluation of the program will be misrepresentative of the program it purports to be evaluating (Mrazek and Brown 2002).

Dryfoos (1990) suggests that effective programs are 'high-dose' and involve a structured curriculum. These issues, as well as participation rates, program integrity (the extent to which the program was delivered as intended) and drop-out rates/attrition are discussed in relation to the five intervention clusters below.

Dosage of programs

Dosage refers to the amount of intervention provided. Encompassed within the term dosage are the concepts of intensity and duration. For example, participants may receive the same 'dosage' from an intensive intervention implemented over a short duration, and a less intensive intervention over a longer duration.
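The trade-off between intensity and duration can be sketched with simple arithmetic. The snippet below is an illustration only: it assumes dosage can be approximated as contact hours per week multiplied by the number of weeks, which is not a formula given in this report. The function name `total_dose` and the example figures are hypothetical.

```python
def total_dose(hours_per_week, weeks):
    """Hypothetical dosage measure: intensity (hours/week) x duration (weeks)."""
    return hours_per_week * weeks


# Illustrative figures only; they are not taken from any reviewed program.
intensive_short = total_dose(hours_per_week=10, weeks=26)
less_intensive_long = total_dose(hours_per_week=5, weeks=52)

# Both delivery patterns give the same total under this assumption.
print(intensive_short, less_intensive_long)
```

Under this assumption the two delivery patterns yield the same total, although the literature cited below suggests that equal totals need not mean equal effects.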

In early childhood interventions, the evidence supports the notion that 'more is better' (Berlin, O'Neal and Brooks-Gunn 1998: 8). Interventions that are high on intensity and duration are thought to be more effective than those that are less intense and run for a shorter duration. For example, Reynolds (1994) suggests that intervention effects are stronger and more lasting for programs of three to four years' duration than for those of only one year's duration. Other researchers have also suggested that programs of short duration have limited effects and that effects are more likely to be sustained for programs that are intensive and continue into the school-age years (Brooks-Gunn 2003; Fonagy 2001). The dosage of programs included in the current report was highly variable.

Cluster 1: targeted, child focused, centre based, preschool age. Children in cluster 1 interventions typically received a high dose of intervention, receiving at least part-day child care or education five days a week for the most part of a year. The Perry Preschool Project and the High/Scope preschool curriculum study also included weekly or fortnightly home visits. Interventions were offered for at least one year, with some children receiving the intervention for up to six years.

Cluster 2: targeted, parent focused, home visits, all ages. Parents in cluster 2 interventions also received a reasonably high dose of intervention. Contact with parents was at least weekly, although this varied from phase to phase. For example, the Elmira PEIP began with weekly home visits, reduced to fortnightly visits, went back to weekly visits during the six weeks after birth and then gradually became less frequent over time. In terms of duration, programs ranged from weekly sessions over 16 weeks to regular contact over three years.

Cluster 3: targeted, family economic/welfare focused, all ages. Interventions in cluster 3 were typically medium-dose. Although the intensity was low (actual person-to-person contact was minimal, and took the form of a meeting with a case manager who provided support and assistance in finding employment), programs ran for approximately two years.

Cluster 4: targeted, holistic, various locations, all ages. Participants in cluster 4 typically received a high-dose intervention, although the nature of the intervention (for example, centre based and home visiting), varied from family to family according to need (for example, Early Head Start, Sure Start and NEWPIN). Services targeted at children were usually the most intensive (often five days a week), while parent services were less intensive (weekly to fortnightly). The programs operated from 22 weeks to five years.

Cluster 5: universal, various foci, various locations, all ages. The intensity of programs in intervention cluster 5 varied widely, even within the same program. For example, the Triple P program ranged from very low dose (parenting information communicated via the media) to weekly parent training sessions over ten weeks, while the PAT program lasted three years.

Participants in clusters 1 and 4 interventions received the highest dose of intervention, suggesting that these interventions may be the most effective (Reynolds 1994; Berlin, O'Neal and Brooks-Gunn 1998).

Participation rates

Involvement in early childhood interventions is usually voluntary, so participation rates (either full or part-participation) can vary dramatically from program to program. Low participation among those expected to benefit the most from an intervention (children from low-income families, for example) will most likely result in negatively skewed effects (that is, effects will not be as positive as expected), whereas higher participation rates are often associated with better outcomes (see Berlin, O'Neal and Brooks-Gunn 1998). Although what constitutes low participation is not made explicit in the literature, low participation of a target group poses a participation threat to the evaluation design (Mrazek and Brown 2002; Berlin, O'Neal and Brooks-Gunn 1998). Participation rates for programs included in the current report are discussed below.

Cluster 1: targeted, child focused, centre based, preschool age. Participation in cluster 1 interventions was not always reported. Among programs reporting this information, participation rates were reasonably high. The Perry Preschool Project reported a 69 per cent full attendance rate and the High/Scope Preschool Curriculum study reported an 80 per cent participation rate in home visits.

Cluster 2: targeted, parent focused, home visits, all ages. Of the interventions in cluster 2 that reported participation rates, participation was quite low. For example, the Elmira PEIP reported that an average of 23 home visits were conducted between birth and two years, with a range of 0 to 59 (59 visits were specified by the program), while Hawaii's Healthy Start Program reported that very few families were visited weekly, as intended. Other programs, for example HIPPY, had difficulty determining participation rates.

Cluster 3: targeted, family economic/welfare focused, all ages. Given that financial incentives were used to encourage participation, the rates of participation in the programs in cluster 3 were typically near 100 per cent.

Cluster 4: targeted, holistic, various locations, all ages. Participation rates were not often reported for cluster 4. The available information indicated extreme variation in participation rates.

Cluster 5: universal, various foci, various locations, all ages. Participation rates in cluster 5 interventions were not easy to measure (for the media communication strategy of Triple P, for example). However, problems with non-attendance at groups and lack of success in phone contacts were documented. Participation in the Cuyahoga program was very high during the first three months; however no additional information was available. Participation information was not available for PAT.

In all clusters except cluster 3, participation was variable. The higher rates of participation in cluster 3 are most likely due to the low intensity of the interventions and the financial incentives for participation. The evaluations of those interventions with very low participation rates need to be interpreted with caution, as they are based on participants who did not receive the full intervention or did not participate in the intervention at all. In addition, there may be systematic differences between families who participated in the intervention and those who did not, and these differences could bias evaluation findings.

Drop-out rates

'Drop-out rates' refer to the proportion of participants who began but did not complete the full intervention. High drop-out rates pose an attrition threat to the evaluation design (Mrazek and Brown 2002), and can result in positively skewed findings, as participants who drop out may do so because they have not found the intervention acceptable or useful. However, it is often possible to statistically account for drop-out in an evaluation by comparing the characteristics of participants who dropped out with those of participants who continued with the intervention, and controlling for any differences between the two groups. Drop-out rates for the programs reviewed in the current report are outlined below.

Cluster 1: targeted, child focused, centre based, preschool age. Drop-out rates were not reported for interventions in cluster 1.

Cluster 2: targeted, parent focused, home visits, all ages. Of the cluster 2 interventions that reported drop-out rates, the rates were quite high, ranging from 40 per cent to 69 per cent.

Cluster 3: targeted, family economic/welfare focused, all ages. Given the financial incentives, as well as the mandatory nature of some of the programs, drop-out rates in cluster 3 were close to zero.

Cluster 4: targeted, holistic, various locations, all ages. Drop-out rates in the interventions in cluster 4 were high, ranging from 24 per cent to 67 per cent.

Cluster 5: universal, various foci, various locations, all ages. Drop-out rates were not reported for any of the interventions in cluster 5.

In summary, drop-out rates when reported were generally high across all clusters, except for those interventions in cluster 3. This reflects the fact that these interventions were often mandatory, or involved some type of financial incentive for participation.

Program integrity

Program integrity refers to the implementation of the program according to its design (that is, the same content, delivered in the same way). The quality of program implementation is perceived to influence program effectiveness (Shonkoff and Phillips 2000), and poor implementation leads to an implementation threat to the evaluation design (Mrazek and Brown 2002). Staff qualifications, staff to child ratios and staff turnover are aspects of program implementation that can affect program integrity. Schorr (1997), for example, suggests that higher qualified and more experienced staff result in greater program effectiveness (see also Berlin, O'Neal and Brooks-Gunn 1998). Tomison and Wise (1999) suggest that professional staff are particularly necessary when dealing with very vulnerable families, or where there is a risk of child maltreatment.

Although it is far more difficult to evaluate programs that do not follow a strict curriculum (such as programs in cluster 5, which are tailored to family and community needs), flexibility in program delivery may be necessary to meet the specific needs of individuals and families. The integrity of programs under review in this report is discussed below.

Cluster 1: targeted, child focused, centre based, preschool age. Cluster 1 interventions were generally highly standardised and of high quality. Low staff to child ratios (ranging from 1:5 to 1:8 depending upon children's ages) were employed, and staff were highly qualified. Staff turnover rates, however, could have been improved upon. The Perry Preschool Project reported that ten teachers occupied four positions over five years and the High/Scope Preschool Curriculum study reported that new teachers were appointed in the second year of the study. Three of the six interventions involved set programs. The remaining three, although guided by strict protocols, were not implemented in the same manner from site to site, meaning that there was not one consistent program to evaluate.

Cluster 2: targeted, parent focused, home visits, all ages. In most cases, the staffing of programs in cluster 2 involved paraprofessionals (lay people trained specifically to implement the program), often from the same community as participants. Three programs were staffed by professionals: the Elmira PEIP, Baby HUGS and Project 12-ways. Although most interventions had some set guidelines, most were not standardised but administered according to individual or community need.

Cluster 3: targeted, family economic/welfare focused, all ages. Interventions in cluster 3 were staffed by employees of the relevant social services department, and were typically trained social workers. Program content varied from participant to participant, with some participants attending very few sessions and others attending quite a number. However, implementation was successful in terms of applying the specified financial incentives or disincentives.

Cluster 4: targeted, holistic, various locations, all ages. Cluster 4 interventions ranged from highly structured and standardised (for example, Incredible Years) to highly unstructured and non-standardised (for example, Even Start and SESS). Staff to child ratios in child care centres were good to very good, usually about 1:3 for infants and 1:6 for preschool aged children. Professionals or paraprofessionals most often implemented the programs and typically received ongoing training and/or supervision. For example, all staff (including drivers and cooks) of the Syracuse FDRP received two weeks of intensive training each year.

Cluster 5: universal, various foci, various locations, all ages. The interventions in cluster 5 were typically administered by professionals, although, with the exception of Triple P, the program content varied from participant to participant (for example, Cuyahoga).

In summary, most interventions were administered by professional or paraprofessional staff, which is likely to enhance intervention effectiveness (Schorr 1997). Most interventions had some form of flexibility in-built, although most followed some form of protocol. The most standardised interventions were found in cluster 1. Evaluations of the interventions that were not standardised need to be interpreted with care, as an implementation threat to the design may be present (Mrazek and Brown 2002).

Summary

Overall, very few of the programs reviewed in this report were adequate in all areas of design and implementation. As suggested above, although the adequacy of interventions within clusters was variable, interventions in cluster 1 appear to be the most adequate across all aspects of design and implementation. For cluster 2 interventions, dosage levels were high but other aspects of design and implementation were quite poor. For interventions in cluster 3, most aspects of design and implementation were adequate but intensity levels were low. Given the great variability in cluster 4, it is difficult to draw any conclusions about the adequacy of targeted, holistic interventions. Similarly, it is difficult to draw conclusions about the adequacy of universal interventions in cluster 5 because of the limited information available about their design and implementation.

5. Adequacy of evaluation design

Ideally, evaluations of interventions should be systematic, comprehensive and use rigorous scientific controls, such as randomised trials and sufficient statistical power, to find meaningful program effects (Sanders 2003). Some existing reviews of program evaluations have developed standards, grades or levels of evidence for early childhood interventions, based on certain criteria. These categories are used as a means of reporting the rigour of the evaluation design (for example, Mrazek and Brown 2002).

Evidence rating system

The evidence rating system adopted in this report aims to provide information on a number of fundamental research design elements. The elements included in this review are:

Appropriate evaluation design methodology. Evaluations (including cost-benefit analyses) require an appropriate control or comparison group. This can be achieved either by randomly assigning participants to the intervention or control group, or by selecting a group of participants who are matched to the intervention group on a number of characteristics such as gender and age (matched comparison group).
Pre-intervention data. For matching intervention and control groups, and to detect change as a result of implementation, it is necessary to collect baseline information.
Intermediate follow-up and long-term follow-up. To determine whether the intervention has had any short-term and/or long-term effects, outcome data should be regularly collected on the intervention and comparison groups. Ideally, follow-up should continue for a number of years.
Representative sample of participants in the evaluation. To ensure that an evaluation is representative of the intervention it is evaluating, the evaluation sample must be representative of the whole sample that received the intervention.
Low attrition at follow-up, with attrition not systematic. Attrition in regard to evaluation integrity refers to the number of participants who could not be included in the immediate or long-term follow-up. Attrition is generally deemed to be acceptable if it is no more than 10 per cent per follow-up time point. Therefore, in a sample of 100, no more than ten participants could be lost at each follow-up time point.
Adequate statistical power. To ensure that an evaluation is statistically adequate, the case-to-variable ratio used in an analysis needs to be considered. A minimum of five participants for every one characteristic measured is standard.
Reliable measures. The integrity of an evaluation is enhanced if the tools used to measure outcomes are standardised (that is, have known psychometric properties) and widely used.
Appropriate choice of measures. In making decisions about how outcomes are to be measured, serious consideration must be given to the measures used. A measure that does not adequately assess what evaluators want it to assess will compromise the integrity of the evaluation.
Appropriate analytic approach. This criterion refers to the use of appropriate statistical techniques. This is necessary to ensure that the findings are reliable.
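The two numeric thresholds in the list above (attrition of no more than 10 per cent per follow-up point, and at least five participants per characteristic measured) can be expressed as simple checks. The following is a minimal sketch, not the procedure used by the reviewers. The function names are hypothetical, and it assumes that loss at each follow-up is measured against the original sample size, as in the report's example of a sample of 100. It does not test whether attrition is systematic.

```python
from typing import Sequence


def attrition_acceptable(baseline_n: int,
                         followup_ns: Sequence[int],
                         max_loss_percent: int = 10) -> bool:
    """True if no follow-up point loses more than max_loss_percent
    of the original sample (assumed reading of the 10 per cent rule)."""
    previous = baseline_n
    for n in followup_ns:
        lost = previous - n
        # Integer comparison avoids floating-point rounding at the threshold.
        if lost * 100 > max_loss_percent * baseline_n:
            return False
        previous = n
    return True


def power_adequate(n_participants: int, n_variables: int,
                   min_ratio: int = 5) -> bool:
    """True if there are at least min_ratio participants per variable."""
    return n_participants >= min_ratio * n_variables


# Sample of 100: losing ten at each of two follow-ups is acceptable.
print(attrition_acceptable(100, [90, 80]))
# Losing fifteen at the first follow-up is not.
print(attrition_acceptable(100, [85]))
# 100 participants support up to 20 measured characteristics.
print(power_adequate(100, 20), power_adequate(100, 21))
```

If the 10 per cent rule were instead read as a share of the participants remaining at the previous time point, the comparison inside the loop would need to use `previous` rather than `baseline_n`.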

The presence or absence of each design element is recorded in Tables 1-5 below. Full details of the intervention evaluations and outcomes are provided in Appendix 2.
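The totals in the final row of each table can be reproduced by counting the elements marked as present. The sketch below uses the Perry and Saginaw columns from Table 1. It assumes, based on how the published totals work out, that NA and NR entries are not counted as present and that the denominator stays at ten. The constant and function names are hypothetical.

```python
PRESENT, ABSENT, NA, NR = "present", "not present", "NA", "NR"


def count_elements(ratings):
    """Return 'present/total' for one evaluation's list of ratings."""
    present = sum(1 for r in ratings if r == PRESENT)
    return f"{present}/{len(ratings)}"


# Ratings in table row order, transcribed from Table 1.
perry = [PRESENT] * 6 + [ABSENT] + [PRESENT] * 3
saginaw = [ABSENT] * 4 + [PRESENT, NA, PRESENT, ABSENT, ABSENT, ABSENT]

print(count_elements(perry))    # 9/10
print(count_elements(saginaw))  # 2/10
```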

Adequacy of cluster 1 evaluations

All evaluations in cluster 1 included a representative sample of participants. Most used reliable measures, made appropriate choices about measures and used appropriate analytic approaches. Four of the six interventions (Perry, CPC, High/Scope and PIDI) included an appropriate control or comparison group and four (Perry, Head Start, High/Scope, PIDI) collected pre-intervention data. Half of the interventions had follow-up data (Perry, CPC, High/Scope).

The evaluation integrity of three interventions in cluster 1 was very good, with all three interventions containing nine of the ten research design elements (Perry, CPC, High/Scope). The evaluation integrity of one intervention (Saginaw) was very poor, containing only two of the research design elements; while the evaluation integrity of the remaining two interventions (Head Start, PIDI) was moderate (six design elements). These details are illustrated in Table 1.

**Table 1:** Adequacy of cluster 1 evaluations
	Perry	Head Start¹	CPC	High/Scope	Saginaw²	PIDI
Includes appropriately-matched comparison group or randomised control design methodology	design element present	design element not present	design element present³	design element present	design element not present	design element present⁴
Pre-intervention (baseline) data available	design element present	design element present	design element not present	design element present	design element not present	design element present⁵
Intermediate follow-up (i.e. collected up to two years after the intervention period)	design element present	design element not present	design element present	design element present	design element not present	design element not present
Long-term follow-up (i.e. collected more than 2 years after the intervention period)	design element present	design element not present	design element present	design element present	design element not present	design element not present
Representative sample of participants included in the evaluation⁶	design element present	design element present	design element present	design element present	design element present	design element present
Low attrition at longitudinal follow-up (not more than 10 per cent per data point) and attrition not systematic	design element present	NA	design element present	design element present	NA	NA
Adequate statistical power for analyses	design element not present⁷	design element present	design element present	design element not present⁸	design element present	design element present
Reliable measures	design element present	design element present	design element present	design element present	design element not present	NR
Appropriate choice of outcome measures	design element present	design element present	design element present	design element present	design element not present	design element present
Appropriate analytic approach	design element present	design element present	design element present	design element present	design element not present	design element present
Number of evaluation design elements present	9/10	6/10	9/10	9/10	2/10	6/10

NA = not applicable (for example, no longitudinal follow-up)
NR = not reported (insufficient information published to determine whether design element present/absent)
1 Numerous evaluations of Head Start have been conducted. Given limited time frames, this review focuses on a large-scale national evaluation; however, it must be noted that this is not necessarily representative of all evaluations of Head Start.
2 Evaluations of this program examine whether or not the intervention group has achieved the objectives set out by the program.
3 Although participants in the Chicago CPC were self-selected, the intervention and control groups did not differ on a number of characteristics at the beginning of the intervention.
4 Participants in the program were self-selected.
5 On most measures.
6 However, those receiving the program were often not representative of the general population (i.e., mostly African American children).
7 Numerous analyses were conducted on a small sample, meaning that some findings may be significant due to chance.
8 Numerous analyses were conducted on a small sample.

Adequacy of cluster 2 evaluations

All but one of the evaluations in cluster 2 (SHELLS) contained an appropriate control or comparison group. All of the evaluations included pre-intervention measures. SHELLS and Baby HUGS did not collect follow-up data, while the remaining evaluations included at least intermediate follow-up data. Half of the evaluations did not have adequate statistical power and half did not use reliable measures.

The evaluation integrity of one intervention (Elmira PEIP) was excellent, reflecting all ten of the design elements. One intervention (SHELLS) had very poor evaluation integrity (one design element present) while the evaluation integrity of the remaining six interventions was moderate to good. These details are illustrated in Table 2.

**Table 2:** Adequacy of cluster 2 evaluations
	PEIP	PCDC	HIPPY⁹	Healthy Start	EEP	SHELLS	Baby HUGS	Project 12-ways
Includes appropriately-matched comparison group or randomised control design methodology	design element present	design element present	design element present	design element present	design element present	design element not present	design element present	design element present¹⁰
Pre-intervention (baseline) data available	design element present	design element present	design element present	design element present	design element present	design element present	design element present	design element present
Intermediate follow-up (i.e. collected up to two years after the intervention period)	design element present	design element present	design element present	design element present	design element not present	design element not present	design element not present	design element present
Long-term follow-up (i.e. collected more than 2 years after the intervention period)	design element present	design element present	design element not present	design element not present	design element present	design element not present	design element not present	design element not present
Representative sample of participants included in the evaluation	design element present	NR	design element present	design element present	design element present	design element not present	design element not present	design element present
Low attrition at longitudinal follow-up (not more than 10 per cent per data point) and attrition not systematic	design element present	design element not present	design element present	design element not present	design element present	NA	design element not present	NA¹¹
Adequate statistical power for analyses	design element present	design element not present	design element not present	design element present	design element present	design element not present	design element not present	design element present
Reliable measures	design element present¹²	design element present	design element not present	design element not present	design element present	design element not present¹³	design element present	design element not present
Appropriate choice of outcome measures	design element present	design element present	design element present	design element not present	design element present	design element not present¹⁴	design element present	design element not present
Appropriate analytic approach	design element present	design element present	design element present	design element present	NR	design element not present	design element present	design element present
Number of evaluation design elements present	10/10	7/10	7/10	6/10	8/10	1/10	5/10	6/10

NA = not applicable (for example, no longitudinal follow-up)
NR = not reported (insufficient information published to determine whether design element present/absent)
9 Not all of the evaluations reviewed were adequate.
10 Participation in the program was not random.
11 The evaluation was conducted by examining Department of Children and Family Services files only.
12 Although some of the measures used are questionable as they rely solely on maternal report.
13 Outcomes were assessed largely by parent report only.
14 Child outcomes were not assessed (but are planned for future evaluations).

Adequacy of cluster 3 evaluations

All of the evaluations of interventions in cluster 3 included appropriate control or comparison groups, a representative sample, adequate statistical power, reliable measures and chose appropriate outcome measures.

Table 3 shows that the evaluation integrity of two of the interventions was very good, with both evaluations containing nine of the ten design elements (New Hope, FTP). The evaluation integrity of the remaining intervention (TPDP) was good, containing seven design elements.

**Table 3:** Adequacy of cluster 3 evaluations
	New Hope	FTP	TPDP
Includes appropriately-matched comparison group or randomised control design methodology	design element present	design element present	design element present
Pre-intervention (baseline) data available	design element present	design element present	design element not present
Intermediate follow-up (i.e. collected up to two years after the intervention period)	design element present	design element present	design element not present
Long-term follow-up (i.e. collected more than 2 years after the intervention period)	design element not present	design element present	design element present
Representative sample of participants included in the evaluation	design element present	design element present	design element present
Low attrition at longitudinal follow-up (not more than 10 per cent per data point) and attrition not systematic	design element present	design element not present	design element present
Adequate statistical power for analyses	design element present	design element present	design element present
Reliable measures	design element present	design element present	design element present
Appropriate choice of outcome measures	design element present	design element present	design element present
Appropriate analytic approach	design element present	design element present	design element not present¹⁵
Number of evaluation design elements present	9/10	9/10	7/10

NA = not applicable (for example, no longitudinal follow-up)
NR = not reported (insufficient information published to determine whether design element present/absent)
15 The significance level used was 0.10.

Adequacy of cluster 4 evaluations

Most of the evaluations in cluster 4 included a representative sample and chose appropriate outcome measures, while two-thirds of the evaluations included an appropriate control or comparison group and two-thirds used reliable measures. Each of the other design elements was present in approximately half of the evaluations. Attrition was acceptable in only four of the evaluations (Abecedarian, IHDP, Incredible Years, ECEAP) and was not applicable in half of the interventions due to the lack of longitudinal follow-up.

The evaluation integrity of three interventions was very good, with all evaluations containing nine of the ten design elements (Abecedarian, IHDP, Incredible Years). Two interventions (Sure Start and NEWPIN) had very poor evaluation integrity, with each intervention containing only one design element. However, more comprehensive evaluations of Sure Start are pending. The evaluation integrity of the remaining seven evaluations was moderate to good (five to seven design elements). These details are illustrated in Table 4.

The table can also be viewed on page 16 of the PDF.

**Table 4:** Adequacy of cluster 4 evaluations
	Early Head Start	Abecedarian	IHDP	Syracuse	SESS	Even Start	CCDP	Incredible Years	ECEAP	BBBF	Sure Start¹⁶	NEWPIN
Includes appropriately-matched comparison group or randomised control design methodology	design element present	design element present	design element present	design element present	design element present	design element present	design element not present	design element present	design element not present¹⁷	design element present	design element not present	design element not present
Pre-intervention (baseline) data available	design element not present¹⁸	design element present	design element not present	design element not present	design element present	design element present	design element present	design element present	design element not present	design element present	design element not present	design element not present
Intermediate follow-up (i.e. collected up to two years after the intervention period)	design element not present¹⁹	design element present	design element present	design element not present	design element not present	design element present	design element present	design element present	design element present	design element not present²⁰	design element not present	design element not present
Long-term follow-up (i.e. collected more than 2 years after the intervention period)	design element not present	design element present	design element present	design element present	design element not present	design element not present	design element present	design element not present	design element present	design element not present²¹	design element not present	design element not present
Representative sample of participants included in the evaluation	design element present	design element present	design element present	design element present	design element present	design element present	design element present	design element present	design element present	design element present	design element not present	design element not present
Low attrition at longitudinal follow-up (not more than 10 per cent per data point) and attrition not systematic	NA	design element present	design element present	design element not present	NA	design element not present	design element not present	design element present	design element present	NR	NA	NA
Adequate statistical power for analyses	design element present	design element not present	design element present	design element not present	design element present	design element not present	design element present	design element present	design element present	design element present	design element not present	design element not present²²
Reliable measures	design element present	design element present	design element present	design element present	design element present	design element present	design element present	design element present	design element not present	NR²³	design element not present	NR
Appropriate choice of outcome measures	design element present	design element present	design element present	design element present	design element not present	design element present	design element present	design element present	design element present	design element present	design element present²⁴	design element present
Appropriate analytic approach	design element present	design element present	design element present	NR	design element present	design element present	design element not present	design element present	design element not present²⁵	design element present	design element not present	NR
Number of evaluation design elements present	6/10	9/10	9/10	5/10	6/10	7/10	7/10	9/10	6/10	6/10	1/10	1/10

NA= not applicable (for example, no longitudinal follow-up)
NR=not reported (insufficient information published to determine whether design element present/absent)
16 A comprehensive evaluation is pending.
17 The intervention and comparison groups differed quite dramatically on level of poverty.
18 Minimal baseline data was collected.
19 An intermediate follow-up is planned.
20 Longitudinal analyses are planned.
21 Longitudinal analyses are planned.
22 One evaluation did; however, this evaluation focused primarily on service use rather than outcomes.
23 The measures were not described adequately enough to make a judgement.
24 The planned outcome measures are appropriate.
25 Much of the evaluation focused on comparing groups within the intervention group, rather than comparing the intervention and comparison groups.

Adequacy of cluster 5 evaluations

All three of the evaluations in cluster 5 contained an intermediate follow-up and a representative sample; however, none of them contained a long-term follow-up. In addition, attrition was high in all but one evaluation (Cuyahoga), and only Triple P included an appropriate control group and used an appropriate analytic approach.

As shown in Table 5, the evaluation integrity of Triple P was good (seven design elements); the evaluation integrity of PAT was poor (four design elements); and the evaluation integrity of Cuyahoga was moderate (five design elements).

**Table 5:** Adequacy of cluster 5 evaluations
	Triple P	PAT	Cuyahoga
Includes appropriately-matched comparison group or randomised control design methodology	design element present	design element not present	design element not present
Pre-intervention (baseline) data available	design element present	design element not present	design element present
Intermediate follow-up (i.e. collected up to two years after the intervention period)	design element present	design element present	design element present
Long-term follow-up (i.e. collected more than 2 years after the intervention period)	design element not present	design element not present	design element not present
Representative sample of participants included in the evaluation	design element present	design element present	design element present
Low attrition at longitudinal follow-up (not more than 10 per cent per data point) and attrition not systematic	design element not present	design element not present	design element present
Adequate statistical power for analyses	design element present	design element not present²⁶	design element present
Reliable measures	design element not present²⁷	design element present	design element not present
Appropriate choice of outcome measures	design element present	design element present	design element not present
Appropriate analytic approach	design element present	design element not present	design element not present
Number of evaluation design elements present	7/10	4/10	5/10

NA= not applicable (for example, no longitudinal follow-up)
NR=not reported (insufficient information published to determine whether design element present/absent)
26 Not for all of the analytic procedures used.
27 The measures used were largely parent report. Some observations were also conducted.
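The final row of each adequacy table is a simple tally, which can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and it assumes that entries marked NA or NR are not counted as present, which is consistent with the totals shown in Table 4. The ratings listed are those in the Triple P column of Table 5.

```python
# Triple P column of Table 5, read top to bottom (ten design elements).
TRIPLE_P = [
    "present",      # comparison group or randomised control design
    "present",      # baseline data
    "present",      # intermediate follow-up
    "not present",  # long-term follow-up
    "present",      # representative sample
    "not present",  # low attrition
    "present",      # adequate statistical power
    "not present",  # reliable measures
    "present",      # appropriate outcome measures
    "present",      # appropriate analytic approach
]


def count_design_elements(ratings):
    """Count elements rated 'present'.

    Assumption: 'not present', 'NA' and 'NR' all contribute nothing.
    """
    return sum(1 for rating in ratings if rating == "present")


def format_score(ratings):
    """Return the tally in the 'x/10' style used in the tables."""
    return f"{count_design_elements(ratings)}/{len(ratings)}"


print(format_score(TRIPLE_P))
```

Under these assumptions the tally for Triple P matches the seven design elements reported in Table 5.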

Relative adequacy of evaluations across clusters

It is difficult to make any firm distinctions between clusters, given the great variability in evaluation integrity within clusters. With the exception of cluster 5, each cluster contained evaluations with very good integrity, while all clusters except cluster 3 contained evaluations with very poor to poor integrity.

One design element that warrants further discussion is the use of reliable measures. Regardless of cluster, most of the evaluations included some objective measures, as well as parental reports. Although parent-reported measures have their merit, and are usually the most expedient means of data collection, they are subjective by nature. Objective measures are therefore needed to corroborate parental reports.

6. Effects of early childhood interventions

Assessing program outcomes

The available information on significant effects was examined to assess program outcomes. While non-significant trends may be part of a larger pattern, in and of themselves they cannot be interpreted reliably. Thus, only effects that were significant at the .05 level were included. The .05 cut-off is commonly used in statistical analyses and was therefore adopted for consistency with scientific reports.

Long-running interventions have often been researched at different time points. Effect sizes are thus reported in terms of the timing of effects - whether the reported outcome effect was: short term (data collected during or immediately after the intervention); intermediate (data collected up to two years after the intervention period); or long-term (data collected more than two years after the intervention period).⁵

For ease of reporting and interpretation, effects are grouped by magnitude into four categories: negligible (Neg) (effect size under 0.20); small (Sm) (effect size 0.20-0.49); medium (Med) (effect size 0.50-0.79); or large (Lg) (effect size 0.80 or greater).

A negative effect size (-) indicates that effects were in the opposite direction to that which was expected. That is, the control group performed better on a measured outcome than the intervention group.
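The grouping rule above can be expressed as a short function. This is a sketch only, not the procedure used by the evaluators: the function name is hypothetical, and it assumes that the bands are applied to the absolute value of the effect size and that values falling between the stated bands (for example, 0.495) belong to the lower band.

```python
def classify_effect_size(effect_size: float) -> str:
    """Label an effect size using the four magnitude bands in the text."""
    magnitude = abs(effect_size)
    if magnitude < 0.20:
        label = "Neg"
    elif magnitude < 0.50:
        label = "Sm"
    elif magnitude < 0.80:
        label = "Med"
    else:
        label = "Lg"
    # A negative sign marks an effect in the opposite direction to that
    # expected, shown in the tables with a trailing "(-)".
    return f"{label} (-)" if effect_size < 0 else label


for value in (0.10, 0.35, 0.65, 0.90, -0.30):
    print(value, classify_effect_size(value))
```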

Tables 6-10 summarise the available data on intervention effect sizes. Missing cells indicate that effect sizes were either not calculated, or not reported in the evaluation material reviewed.⁶ The number of evaluation design elements present for each intervention is also included in these tables for reference.

Effects of cluster 1 interventions

The outcomes measured in the interventions in cluster 1 were all child outcomes and effect sizes were available for three of the six interventions in this cluster. Most of the effects were negligible to small, and very few effects were large. In addition, for the Perry project (which reported short-term, intermediate and long-term effect sizes), the positive effects on cognitive outcomes tended to diminish over time. In terms of specific outcomes, the effects of intervention on child cognitive abilities varied, with the Perry project reporting large short-term gains and PIDI reporting negligible short-term gains. Similarly, children's academic gains reported by the Perry project were larger than those reported by CPC. (See Table 6.)

**Table 6:** Effects of cluster 1 interventions
Intervention	Outcome	Effect size Short-term	Effect size Intermediate	Effect size Long-term	Evaluation design
Perry	IQ	Lg	Sm	Neg	9/10 design elements were present in the Perry evaluation.
	Nonverbal intellectual performance	Med	Sm	Neg
	Vocabulary	Lg	Sm	Neg
	Psycholinguistic ability	Lg	Sm	Sm
	Achievement tests	-	Sm	Med
	Literacy	-	-	Sm
	High school completion	-	-	Sm
	Mean number of criminal arrests	-	-	Med
	Monthly earnings	-	-	Med
	Yearly earnings	-	-	Sm
	Employment status	-	-	Sm
	Welfare use	-	-	Sm
	Teen pregnancy	-	-	Med
Head Start	no data available
CPC	Reading	-	Neg	Neg	9/10 design elements were present in the CPC evaluation.
	Mathematics	-	Neg	Sm
	Teacher ratings	-	Neg	Neg
	Grade repetition	-	Neg (-)	Neg (-)
	Parent involvement in education	-	Sm	Neg
	Special education	-	Sm (-)	Sm (-)
	High school completion	-	-	-
	Crime and delinquency	-	-	-
High/Scope	no data available
Saginaw	no data available
PIDI	Cognitive development (under 3 years)	Neg	-	-	6/10 design elements were present in the PIDI evaluation.
	Cognitive development (3 to 5 years)	Sm to Med	-	-
	Motor development (under 3 years)	Neg	-	-
	Motor development (3 to 5 years)	Sm to Med	-	-
	Health (under 3 years)	Med to Lg	-	-
	Health (3 to 5 years)	Sm to Med	-	-

Effects of cluster 2 interventions

Effect sizes were available for child and family outcomes for three of the interventions in cluster 2. Again, most of the effects were negligible to small, with some effects diminishing over time and others remaining stable in the intermediate term.

In terms of specific outcomes, the effects on child cognitive and academic outcomes were quite varied between, and even within, interventions. The HIPPY program found positive short and intermediate term effects on children's cognitive skills and academic skills that ranged from negligible to medium, while the PCDC found mainly negligible long-term effects, although academic effects were medium for boys. (See Table 7.)

**Table 7:** Effects of cluster 2 interventions
Intervention	Outcome	Effect size Short-term	Effect size Intermediate	Effect size Long-term	Evaluation design
PEIP	Home safety	Sm	Med	-	10/10 design elements were present in the PEIP evaluation.
	Child hospitalisation	-	Sm	-
	Home environment	Neg	Neg	-
	Maternal education	Lg	-	-
	Subsequent pregnancies	Neg	Neg
	Child abuse and neglect	-	-	-
	Maternal and child drug use	-	-	-
	Maternal and child crime	-	-	-
PCDC	Child - Cognitive	-	-	Neg (-) to Neg	7/10 design elements were present in the PCDC evaluation.
	Child - School performance (males and females)	-	-	Neg
	Child - School performance (males only)	-	-	Med
	Child-Temperament (males and females)	-	-	Neg
	Child-Temperament (males only)	-	-	Sm to Lg
HIPPY	Cognitive skills	Neg to Med	-	-	7/10 design elements were present in the HIPPY evaluation.
	Reading	Neg to Sm	Neg to Med	-
	Mathematics	Sm	Neg to Sm	-
	Classroom Adaptation	Sm to Med	Neg to Med	-
	School readiness	Sm		-
Healthy Start	no data available
EEP	no data available
SHELLS	no data available
Baby HUGS	no data available
Project 12-ways	no data available

Effects of cluster 3 interventions

Only one of the three evaluations in intervention cluster 3 reported effect sizes. Both child and family outcomes were measured. Outcomes were measured in the intermediate term and most of the effect sizes were small. In terms of specific outcomes, the effects on child behaviour ranged from small to medium, while the effects on parent outcomes (including income, child care use and support) were small. (See Table 8.)

**Table 8:** Effects of cluster 3 interventions
Intervention	Outcome	Effect size Short-term	Effect size Intermediate	Effect size Long-term	Evaluation design
New Hope	Child - Social Skill (males only)	-	Sm	-	9/10 design elements were present in the New Hope evaluation.
	Child - Improved classroom behaviour (males only)	-	Sm to Med	-
	Child - Lower externalising behaviour (males only)	-	Sm	-
	Child - Lower internalising behaviour (males only)	-	Med	-
	Child - Classroom behaviour (males only)	-	Sm	-
	Child - Increased educational aspirations (males only)	-	Sm	-
	Parent - Higher income	-	Sm	-
	Parent - Higher child care use	-	Sm	-
	Parent - Perceived social support	-	Sm	-
FTP	no data available
TPDP	no data available

Effects of cluster 4 interventions

Evaluations in intervention cluster 4 measured child and family outcomes. Effect sizes were available for four of the twelve evaluations; most effect sizes were for short-term outcomes only. Again, most of the effect sizes were negligible to small.

In terms of specific outcomes, the effects of interventions in cluster 4 on child cognitive outcomes were varied. Early Head Start found negligible effects, while BBBF found effects that ranged from negligible to medium, and IHDP found large effects. Effects on child emotional and behavioural outcomes were generally negligible, although Incredible Years and BBBF found some small to medium effects. Effects on parenting were contrasting, with Early Head Start reporting negligible effects, Incredible Years reporting small to medium effects and BBBF reporting effects that ranged from medium (but in the opposite direction) to large. (See Table 9.)

**Table 9:** Effects of cluster 4 interventions
Intervention	Outcome	Effect size Short-term	Effect size Intermediate	Effect size Long-term	Evaluation design
Early Head Start	Cognitive and language development	Neg	-	-	6/10 design elements were present in the Early Head Start evaluation.
	Social-emotional development	Neg to Sm	-	-
	Parenting behaviour	Neg	-	-
	Parent knowledge and discipline strategies	Neg	-	-
	Parent health and family functioning	Neg	-	-
	Parent self-sufficiency	Neg	-	-
Abecedarian	no data available
IHDP	IQ (heavier group)	Lg	-	-	9/10 design elements were present in the IHDP evaluation.
	IQ (lighter group)	Sm	-	-
	Behaviour	Neg	-	-
	Morbidity (heavier group)	Neg	-	-
	Morbidity (lighter group)	Sm	-	-
	Serious morbidity	Neg (-)	-	-
	Functional Status	Neg	-	-
	Height	Neg	-	-
	Body mass (heavier group)	Neg	-	-
	Body mass (lighter group)	Neg	-	-
	General health	Neg	-	-
Syracuse	no data available
SESS	no data available
Even Start	no data available
CCDP	no data available
Incredible Years	Child - Non-compliance	Neg	NS	-	9/10 design elements were present in the Incredible Years evaluation.
	Child - Conduct problems	Neg	NS	-
	Parent - Harsh parenting	Neg	Neg	-
	Parent - Positive interactions	Neg to Sm	Neg	-
	Child - Non-compliance at home	Med	-	-
	Child - Behaviour at school	Sm to Med	-	-
	Child - Peer interactions	Sm	-	-
	Parent - Parenting skills and interactions	Sm to Med	-	-
ECEAP	no data available
BBBF	Emotional and behavioural problems	Neg to Med	-	-	6/10 design elements were present in the BBBF evaluation.
	Language development	Neg to Med	-	-
	Motor development	Sm to Med	-	-
	Attention and Memory	Neg to Sm	-	-
	Increase in breastfeeding	Sm (-) to Med (-)	-	-
	Child nutrition	Neg to Med	-	-
	Immunisation	Neg	-	-
	Parent encouragement for bike helmet use	Lg (-)	-	-
	Use of child professionals	Neg to Sm	-	-
	Health pre and post natal	Sm (-) to Sm	-	-
	Parenting	Med (-) to Lg	-	-
	Reduced domestic violence	Sm	-	-
	Sense of community cohesion	Med(-) to Med	-	-
Sure Start	no data available
NEWPIN	no data available

Effects of cluster 5 interventions

Triple P was the only evaluation in intervention cluster 5 that reported effect sizes. This evaluation found three large short-term effects on parent and child outcomes (child behaviour, parenting style and parent conflict over child rearing). Two of these large effects diminished at intermediate follow-up to small (child behaviour) and medium (parenting style). Effects on parent-child relationships were negligible, and small program effects were found on parent mental health, both of which were maintained at the same level at intermediate follow-up. (See Table 10.)

**Table 10:** Effects of cluster 5 interventions
Intervention	Outcome	Effect size Short-term	Effect size Intermediate	Effect size Long-term	Evaluation design
Triple P	Child behaviour	Lg	Sm	-	7/10 design elements were present in the Triple P evaluation.
	Parenting style	Lg	Med	-
	Parent conflict over child rearing	Lg	Lg	-
	Parent-child relationship	Neg	Neg	-
	Parent mental health	Sm	Sm	-
PAT	no data available
Cuyahoga	no data available

Summary of intervention effects

In summary, effect sizes indicated that child cognitive outcomes demonstrated the greatest change in the short-term; however, the size of these effects diminished over time. The more enduring effects were found on acts of delinquency and crime, with lower incidences of crime and delinquency among intervention participants. Most of the available effects on parent and family outcomes were negligible to small. However, it should be noted that few of the evaluations reporting effect sizes measured parent outcomes, and, in contrast to these findings, the Triple P program found large effects on parent outcomes.

Consistent with Benasich, Brooks-Gunn and Clewell (1992) and Brooks-Gunn (2003), who reported that most of the positive effects of interventions on child outcomes are the result of centre-based interventions, as opposed to home-visiting or case management interventions, the largest effects on child outcomes were found for intervention cluster 1 (where all programs were centre-based).

However, it is difficult to draw any firm conclusions about the effects of the early intervention programs under review in this report, as many of the evaluations did not provide effect sizes, and the programs included in this review are not intended to be representative. In addition, it is acknowledged that the assignment of the terms 'negligible', 'small', 'medium' and 'large' to the size of effects follows standard definitions, but does not equal actual impact or value, and can ignore the inherent worth of particular outcomes and their value to society (see reviews by Boyle and Hertzman in Russell 2002). A focus only on medium or large effect sizes might miss a valuable outcome with a small effect size, when that is all that is needed to 'tip the balance' of health of a population (see Russell 2002).

Importantly, Brooks-Gunn (2003) notes that even a small effect size at a primary school aged follow-up is impressive, and an effect at adulthood even more so. The long-term follow-up of the Perry Preschool project reported here was conducted when participants were aged 27 years, approximately 22 years after participants had completed the intervention. To retain an effect, albeit negligible, for such a period of time is extremely impressive.

5. This definition of long-term effects was adopted given the paucity of longitudinal follow-up in the early childhood intervention literature. However, our interests are really in effects that last throughout childhood and adolescence into adulthood.

6. Although a comprehensive search was undertaken, it is possible that some effect sizes are available that are not included in this report.

7. Cost-benefit analyses: purposes and principles

Cost-benefit analyses are investment decision tools, which set out to resolve whether certain investment projects should be undertaken, and, if resources are limited, the relative ranking of these investments. The aim of a cost-benefit analysis of an early childhood intervention is to assess its social benefits relative to its social costs with a view to replicating the program, in either a similar or in a broader context, and achieving similar outcomes.

Social returns include the benefits to the program recipient and his or her family as well as returns to society more broadly. Social costs include the benefits foregone from not using the resources for some other use.

Most cost-benefit analyses count the loss or gain of satisfaction to the members of society, that is, the welfare and happiness of each and every citizen. Institutions such as the criminal justice system are not human entities and do not enter into the calculation. However, the welfare of the people who work in the justice system, and of criminals and victims, is counted. A reduction in crime may therefore have an ambiguous impact on society if the losses incurred by the police, criminal lawyers and others who subsequently lose employment are valued more than the gains for the victims of crime plus the gains from transferring the government money saved to alternative uses. This is an extreme example, but serves to illustrate the point that, to avoid double counting, only the impacts on final householders (as consumers and workers) are considered, and not those on intermediary institutions such as the government or a business.

In some cases, governments are interested in the losses and gains to government budgets from operating a project. In this case, the analysis is much simpler as non-pecuniary costs and benefits (costs and benefits that are not monetary) do not need to be monetised. It is a more straightforward accounting exercise. In the case of a federal early childhood intervention, only commonwealth payments, such as commonwealth education subsidies, federal court expenditures, social security payments, and income taxes, need be counted. This is not a real cost-benefit analysis, and this type of metric is excluded from the current discussion.

The underlying premise of cost-benefit decision-making techniques is that all investments entail costs, and that all such costs are benefits foregone by another party. Accordingly, the costs of implementing an early childhood intervention will be the benefits foregone by not using these funds in an alternative way, which, for example, may be extending aged care or improving public safety.

In the case of early childhood interventions, unless the type of people who are required to run the program (social workers, child psychologists, teachers) are unemployed, then running a program will be at the expense of running other types of activities. The essence of a cost-benefit analysis is to recognise this trade-off. Unless there are idle resources - workers as well as equipment and materials - other parts of the welfare economy will be affected. Properly performed, the cost side of the cost-benefit calculation should capture the benefits foregone from not using the workers, equipment and materials in an alternative use. It is misleading to imply that the value of one program can be assessed in isolation from another. If there are no benefits foregone, then there are no costs.

In principle, cost-benefit analyses are concerned with the benefits and benefits foregone (that is, costs) to the whole of society, not just those affecting the directly implicated parties. These analyses take care to consider the hidden, implicit and indirect benefits on secondary parties and also try to include estimates of costs and benefits into the future.

Fundamental to this method is the assumption that benefits from a diverse range of activities, affecting different members of society, can be quantified and compared. In almost every case, this means monetisation. Most cost-benefit analyses rely on the assumption that the goods and services involved in the investment process are traded and valued in monetary terms 'correctly' through the market (that is, the average price consumers are willing to pay varies directly, and in proportion, with their assessment of the benefits from consuming it). Householders will only pay more for costly goods and services if they value them more highly than cheaper goods. Hence, the price of a colour television set is higher than a black and white set since consumers' valuation of the former is higher than the latter. In addition, in order for the good to appear on the market, its valuation by consumers must be at least as high as the average costs of production.⁷ In this way, price may be a reasonable way to quantify relative benefits of goods and services.

This assumption is clearly not strictly valid in many cases, especially those relating to welfare investment projects. In particular, the market price of a good or service may not be a reasonable guide to the societal value of that good or service when:

There are significant parties, other than the purchaser and supplier of the good or service, who benefit or are disadvantaged by its consumption or production (these effects are called spillovers or externalities). For example, a vaccination program may not only benefit the recipient of the vaccination, but also people who would otherwise be exposed to a disease outbreak. A reduction in the crime rate may primarily benefit potential victims rather than the would-be criminal. Because these 'other' parties have no say in the market transaction of the product, there is no market record of their valuation of the activity.
The good or service is not traditionally traded in the market. A classic example is the consumption of services such as libraries, parks and gardens, crime prevention and social cohesion. While these products can be traded for a price,⁸ traditionally they are not commercially traded in Australia.
The good or service cannot be traded in the market because it is not possible to exclude consumers and thus demand a price. Free-to-air broadcast television is not excludable and we cannot assess how much householders value the service. In addition, the good or service may represent an innovation, and since it does not currently exist, it cannot be objectively valued.
The good or service is primarily purchased by low-income households. The market valuation of benefits can be weighted towards the preferences of high-income households, if the price is demand sensitive.⁹ In other words, the preferences of high-income households have greater weight in determining the market price of a product than those of low-income households. Furthermore, products that are only valued by low-income consumers may not be produced at all if their willingness-to-pay does not exceed the average cost of production. These are not issues for cost-benefit analysis if the purchasing patterns, or consumption preferences, of householders with respect to the good or service in question do not vary by income (that is, they may instead be related to the presence of children or location). However, it will matter if preferences vary by income and either prices are demand sensitive or costs of production are prohibitive.

Where these limitations constitute a significant part of the benefits, and benefits foregone, of the potential investment projects, some attempt is usually made to account for them in a way to derive acceptable quantitative estimates. Techniques for dealing with spillovers and non-market transactions are reviewed in Section 10.

Finally, while the valuation of the separate elements of the cost-benefit formulae is not always straightforward, the manner in which they are combined to produce a single investment decision index is also not always clear-cut and objective. In particular, the way in which costs and benefits are weighted between individuals and over time across generations is a considerable source of controversy in the literature.

7. Strictly, the market price reflects the valuation 'at the margin', or the value to consumers of the last unit of the good consumed. It does not give the total valuation from consuming all units of the good - hence the paradox of water and diamonds. While the value of an extra diamond is higher than the value of an extra litre of water, the sum of the value of all diamonds in the world is considerably less than the sum of the value of all water.

8. The private provision of private estates and compounds in some societies may be regarded as an example of the market provision of reduced crime and social cohesion.

9. That is, when the producer charges a price above the costs of production because he or she has some level of market power.

8. Steps in a cost-benefit analysis

A cost-benefit analysis of an early childhood intervention involves three steps: (1) an estimation of the net impact of the intervention; (2) an estimation of the social costs and the social benefits of the intervention in monetary terms; and (3) a calculation of the cost-benefit of the intervention.

Data for steps 1 and 2 can be collected from repeat surveys and linkages to administrative records of program participants. It is important that only the net costs and benefits are counted. In the case of costs, these are the costs incurred by society in the program scenario which are over and above the costs of the counterfactual scenario (usually a no program situation). The same applies to benefits. For example, if a program leads to the completion of Year 11 at secondary school for participants compared with Year 10 for non-participants, then only the value of the extra year needs to be monetised.
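The netting-out principle, and one possible form of the step 3 calculation, can be sketched as follows. This is an illustration under stated assumptions rather than the method of any evaluation reviewed here: the function names and all dollar figures are hypothetical, and the net social benefit and benefit-cost ratio are used only as two common summary indices.

```python
def net_of_counterfactual(program_value: float, counterfactual_value: float) -> float:
    """Return the amount over and above the counterfactual scenario."""
    return program_value - counterfactual_value


# Hypothetical per-participant figures, in dollars.
cost_with_program = 12_000.0
cost_without_program = 2_000.0      # costs incurred even with no program
benefit_with_program = 40_000.0
benefit_without_program = 25_000.0  # benefits that would have arisen anyway

net_cost = net_of_counterfactual(cost_with_program, cost_without_program)
net_benefit = net_of_counterfactual(benefit_with_program, benefit_without_program)

# Two summary indices (an assumption; other indices are possible).
net_social_benefit = net_benefit - net_cost
benefit_cost_ratio = net_benefit / net_cost

print(net_cost, net_benefit, net_social_benefit, benefit_cost_ratio)
```

Counting the gross figures instead of the net figures would overstate both sides of the calculation, since costs and benefits that arise in the no program situation are not attributable to the intervention.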

Pecuniary values for non-pecuniary costs and benefits (crime, loss of health, unemployment) are usually derived from secondary literature which has made these estimates. Most intervention costs are pecuniary in nature and include spending by government on resources such as buildings, equipment and facilities, and on the wages of social, health and education workers. Usually the non-pecuniary costs of a program, such as the loss of time incurred by the child and his or her family as a result of participating in the intervention, are not counted. This is usually because evaluators either do not believe they are large or believe they are benefits, not costs. For prudential reasons, most evaluations err on the side of understating net benefits, especially when they have a speculative component, and consequently these costs are ignored.

By contrast, many net benefits of early childhood interventions are non-pecuniary. The obvious example is a reduction in the crime rate, but other benefits include the increased satisfaction associated with the schooling years, eventual employment and family functioning. Most evaluations do not seek to monetise changes to life satisfaction but they do enumerate the effects of reduced crime on prospective victims. Again, the dominant reason for ignoring the net changes to satisfaction resides in the conservative nature of cost-benefit analyses on the one hand and the more speculative conversion of some benefits into money equivalents on the other.

Many of the conjectural estimates in a cost-benefit analysis arise from making projections of pecuniary benefits, not from monetising non-pecuniary benefits. With respect to early childhood interventions, this usually involves making projections of, for example, how improvement in school retention will convert into more wages for the participant in adult life. Because the anticipated benefits from these interventions can have very long horizons, policy makers who wish to discern whether an intervention is making a positive social contribution before the full extent of benefits is 'known' have to rely on linking intermediate program net impacts with findings from other studies. In the case of the increment to labour incomes, this can mean linking changes in primary school achievement (however measured) with changes to subsequent educational attainment, occupational attainment and associated wages. Using the law of statistical average, the evaluator can then project from the age of five through to 65 years. However, the longer the projection period, the wider the band of error around the estimate. For example, we may say that a program leads to an increase in the present value of wages of, say, $5000 ± 10 per cent up to the age of 30, but $30,000 ± 25 per cent up to the age of 65.¹⁰
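
As a minimal illustration of the example above, the following Python sketch converts each point estimate and its relative error into a range. The function name error_band is hypothetical; the dollar figures and percentages are those quoted in the text.

```python
def error_band(point_estimate, relative_error):
    """Return (low, high) bounds for a point estimate with a +/- relative error."""
    margin = point_estimate * relative_error
    return point_estimate - margin, point_estimate + margin


# Figures from the example in the text: present value of extra wages.
to_age_30 = error_band(5000, 0.10)
to_age_65 = error_band(30000, 0.25)

print("Up to age 30:", to_age_30)
print("Up to age 65:", to_age_65)
```

The projection to age 30 spans $4500 to $5500, a width of $1000, whereas the projection to age 65 spans $22,500 to $37,500, a width of $15,000. The longer horizon is wider in both relative and absolute terms.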

Step 1: Estimating the net impact of the intervention

The first step in conducting a cost-benefit analysis requires answers to two questions. First, what outcomes have the participants achieved in comparison to what they would have achieved if they had not participated in the intervention (called the net impact of the intervention)? Second, if this intervention is extended to other children, will it have the same net impact?

Tests of significance are a measure of the confidence we have in the estimated net impact. Given a large representative sample, tests of significance indicate whether another group of children, randomly chosen from the population, is likely to experience the same effects from undertaking the intervention. However, statistical significance is distinct from the actual size of the net impact (referred to as the effect size in the previous sections). An estimate of a net impact, while significant, may be very small in absolute terms. On the other hand, the net impact may be very large but statistically insignificant. In the latter case, this is usually because there was a large variation in the net impacts of individuals in the intervention group, perhaps because relevant co-variables have not been modelled.
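
The distinction between the size of a net impact and its significance can be sketched numerically. The Python example below is illustrative only: it computes a two-sample t-statistic (unequal variances) from summary statistics, and all of the means, standard deviations and sample sizes are hypothetical. The threshold of 1.96 is the conventional large-sample cut-off for a two-sided test at the 5 per cent level and is used here as a rough guide.

```python
import math


def welch_t(mean_treat, sd_treat, n_treat, mean_ctrl, sd_ctrl, n_ctrl):
    """t-statistic for the difference in group means (unequal variances)."""
    standard_error = math.sqrt(sd_treat ** 2 / n_treat + sd_ctrl ** 2 / n_ctrl)
    return (mean_treat - mean_ctrl) / standard_error


# Hypothetical case 1: a tiny net impact (0.5 points) in a very large sample.
small_but_significant = welch_t(100.5, 15, 20000, 100.0, 15, 20000)

# Hypothetical case 2: a large net impact (10 points) in a small, noisy sample.
large_but_insignificant = welch_t(110.0, 30, 8, 100.0, 30, 8)

print(f"small impact, large sample: t = {small_but_significant:.2f}")
print(f"large impact, small sample: t = {large_but_insignificant:.2f}")
```

In the first case the t-statistic exceeds 1.96 even though the impact is only half a point; in the second it falls well short despite a ten-point difference, because the variation between individuals is large relative to the sample size.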

To obtain information on the net impact of an intervention requires either the selection of a control group who are similar in every respect except for their inclusion in the intervention; or a large enough sample that allows us to generalise the effects and abstract from random external factors that may have an effect on an individual's performance.

These issues are discussed in detail below.

Selection of a control group

The difference in outcomes for the intervention and control groups that can be ascribed to the intervention itself is called the net impact.¹¹

The ideal way to avoid the possibility that particular types of parents (for example, the more motivated or persistent parents) enrol for the intervention is to randomly assign children to the intervention and control groups before the intervention commences. This requires the evaluation to be designed as an integral part of the intervention. By using the law of large numbers, random assignment ensures that there is no systematic tendency for either group to have more or less favourable characteristics, either observable or unobservable.
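
A minimal sketch of random assignment is given below, assuming a list of enrolled children identified by number. The function name randomly_assign and the identifiers are hypothetical; a real evaluation would typically also stratify by site or other characteristics.

```python
import random


def randomly_assign(child_ids, seed=None):
    """Shuffle the enrolled children and split them into two equal-sized groups."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible for audit
    shuffled = list(child_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# 200 hypothetical children, numbered 1 to 200.
intervention, control = randomly_assign(range(1, 201), seed=42)
print(len(intervention), len(control))
```

Because assignment depends only on the shuffle, neither parents nor program staff can influence which group a child joins.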

However, random assignment does not totally eliminate systematic differences between the two groups, even if large samples are involved. Because it is not possible to force families to participate in an intervention, some self-selection out of the intervention is expected. This will occur among families who disapprove of or do not see value in the program, or who, because of contemporaneous difficulties in their lives, are not in a position to offer their time to the program. Participation rates as a threat to evaluation design are discussed in detail in Section 4.2. This effect will tend to overstate the net impact of the intervention.

In cases where random assignment is not possible, the evaluator should try to construct a control group from children who match the intervention group on relevant observable characteristics. Usually, this means characteristics such as parents' socio-economic background, ethnicity, pre-program IQ and so on. If pertinent unobservables, such as parental motivation, could be assessed, then these could be used for matching as well. Choosing which characteristics to match on is usually informed by existing literature. In cases where the intervention is large and pervasive, such as universal maternal and child health or preschool programs, it may be difficult to find a population of children who have not received any intervention services.
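
One simple way to construct a matched control group is nearest-neighbour matching on observable characteristics. The Python sketch below is an illustration under stated assumptions, not a recommended procedure: the function name match_controls, the variable names ses and iq, and all data values are hypothetical, and the distance is left unstandardised for brevity (in practice the characteristics would normally be rescaled so that no single one dominates).

```python
def match_controls(participants, candidates, keys):
    """Greedy one-to-one nearest-neighbour matching without replacement."""
    available = list(candidates)
    matches = []
    for person in participants:
        if not available:
            break  # no unmatched candidates left
        # Squared distance over the chosen observable characteristics.
        best = min(
            available,
            key=lambda c: sum((person[k] - c[k]) ** 2 for k in keys),
        )
        available.remove(best)
        matches.append((person["id"], best["id"]))
    return matches


# Hypothetical data: socio-economic index and pre-program IQ.
participants = [
    {"id": "p1", "ses": 2, "iq": 95},
    {"id": "p2", "ses": 5, "iq": 110},
]
candidates = [
    {"id": "c1", "ses": 5, "iq": 108},
    {"id": "c2", "ses": 2, "iq": 97},
    {"id": "c3", "ses": 9, "iq": 130},
]

pairs = match_controls(participants, candidates, keys=["ses", "iq"])
print(pairs)
```

Matching of this kind can only balance the characteristics that are measured; unobservables such as parental motivation remain uncontrolled unless they can be assessed.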

Random assignment and matching methods do not in themselves ensure that, on average, the non-program characteristics of the two groups are the same for any selected sample. If the samples are small, then the chance that any one pair of program and control groups will not be similar is high. Evaluators will often control for other factors in analyses of intervention effects (for example, by using multiple regression), even when random assignment or ex ante matching has been undertaken, to iron out any remaining differences between the two groups.

Generalisability

The second question is whether the same net impacts will occur if the intervention is extended to other groups of children. This requires the evaluator to discuss the sub-groups that can potentially benefit from the intervention, and to assess whether the evaluation results can be generalised to these other groups. For example, an evaluation may only consider children from low-income refugee families or mothers, and the same net impact may or may not accrue if the intervention were applied to low-income indigenous families or fathers.

Step 2: Estimating the social costs and benefits of the intervention in monetary terms

The second step in conducting a cost-benefit analysis involves an estimation of the social costs and social benefits of the intervention in monetary terms.

Estimating benefits

There are two philosophical considerations that should be made explicit with regard to estimating the benefits of an intervention. The first relates to how extensively the measured benefits of an intervention are defined, or how many people who are indirectly affected by the intervention are counted. The second consideration is whether the outcome is real, or merely an intermediate result which is valued for its potential to affect 'real' outcomes.

It is possible to argue that most people in a society can be affected by an intervention, but clearly a line must be drawn, otherwise the evaluator will spend excessive time making calculations of secondary and uncertain detail.

Determining at what point a result, especially an ephemeral one, is a benefit in itself or an indicator of a potential benefit is a rather more difficult question to answer. For example, is higher educational attainment valued in itself, or only as an indicator that the person will go on to find more rewarding and stable employment? Further, if benefits are not sustained over time, are these early benefits 'real' or just promises that amounted to nothing? There may be a case for arguing that an intervention that only affects educational outcomes for a few years post-intervention, has zero benefits since the higher attainment in these years may have had no long term effect on the happiness of children, or their families, who participated in the intervention.

Without a clear notion of what defines a benefit for the participant, evaluations are reduced to measuring the effects on broader society. If higher educational attainment is valued only because it leads to a higher paying job, then it is not appropriate to count higher net years of education as a benefit as well as the additional wages. If, however, more schooling is considered to have intrinsic value to the child, then some estimate should be made of this additional value at the time that the education was received.

In the cost-benefit evaluations considered later in Section 11 the final chosen outcomes are usually the impact on future wages and crime rates. If this is what the program designers intended, then it is appropriate for the evaluator to concentrate on recording these benefits. However, intermediate milestones, such as educational attainment, are worth measuring if it is intended to project final outcomes before enough time has elapsed to allow the full extent of the final outcomes to be apparent.

These decisions over the scope of the population and the designation of ultimate benefits should be made before the evaluation is designed. Not only do they determine what sort of benefits to measure, but also the length of time the children should be followed.

The ideal length of time allowed for data collection is determined by the balance between the costs of data collection and the attrition of participants from the evaluation study. In practice, evaluators typically have to work with short- and medium-term data, using projections to complete the cost-benefit analysis. This means drawing on existing studies on the relationship between anticipated benefits such as health and educational attainment, and longer-term outcomes such as earnings. Such projections reduce confidence in the final result, but this is an acceptable method and possibly the only way to conduct a cost-benefit analysis in the years immediately following an intervention.

Scope of included benefits
It is not necessary to know at the commencement of the intervention all types of potential benefit, as survey items can be decided at a later stage. If however, the intervention expects a change in the families' status (such as a change to mothers' education), then this does have to be considered ex ante. It is easy for older interventions that have extensive data on adult experiences to estimate final pecuniary benefits from an intervention but considerably harder for interventions that are more recent (for example, Bolivia PIDI) and only have intermediate outcomes, such as cognitive development and need for remedial schooling. Interpreting the results from these newer interventions requires a more flexible and sophisticated approach.

Scope of potential beneficiaries
Earlier studies only considered monetising the effects on the child, victims of crime and the taxpayer (for example, savings in remedial education, higher earnings and lowered crime rates). More recent studies have also considered the broader effects on the mother (for example, better health and less substance use). In most cases, a qualitative discussion is made of the effects on people who are likely to be affected. Groups for whom quantitative measures are required generally need to be known ex ante.

Estimating costs

Scope of included costs
Deciding which costs to include is usually the least controversial part of a cost-benefit analysis, and is usually limited to the direct government costs of running the intervention. Recall that all costs are benefits foregone, or rather, represent the money value of benefits society would have otherwise gained had the government spent the money on another program. This may be another type of child-related intervention, a welfare program or simply reduced taxes. Theoretically, changes incurred by staff in the counterfactual program should also be considered, but this is generally too much detail for most evaluations.

With respect to early childhood interventions, most of the resources used are the labour services of professional or paraprofessional staff who deliver health, psychological and educational services to young children and their families.

Some evaluations attempt to distinguish between fixed and variable costs, but it is unclear how valuable this information is to the overall study, relative to the cost of collating it, since most fixed costs are only fixed over limited ranges of production. In-kind resources, such as free rent or facilities, should be included as well as budgeted items. Often, no account is taken of the costs to families, as it is often assumed that the time they provide has no opportunity cost. This may or may not be true. The skill in measuring costs is usually to avoid double counting. Karoly, Kilburn, Bigelow, Caulkins, Cannon and Chiesa (2001) provide further details of hazards to watch for when measuring costs.

Fixed costs associated with the establishment of the intervention are only relevant if they will be incurred every time the intervention is extended. Fixed costs associated with the initial design of the intervention that are one-off are not relevant.

Most costs are part of the funding costs and are already monetised. However, in the case where idle resources are used (such as unemployed professionals), the costs of employing the resource will not be true costs and should be excluded. In cases where there are indirect costs, such as a cost of the families' time and costs to the schools that are not directly funded, then some estimate of these should also be made, but only if it is believed they are significant. If they only represent minor costs, then they should be just referred to in the text.

It is rare for the costs of an intervention to extend beyond the intervention period. If they occur, they will in most cases be revealed as negative benefits and will be monitored through the benefits section.

Step 3: Calculating the cost-benefit

Cost-benefit analyses produce a single metric, which is a summary measure of the difference between the costs and benefits of an intervention. In most cases, however, a sensitivity analysis is done, and this can produce a range of figures for one intervention. A sensitivity analysis involves inputting different assumptions about, for example, the value of crime, or the rate of discount, to see how much the overall figure changes.
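
A sensitivity analysis of this kind can be sketched as a small grid of assumptions. In the Python example below, every figure (program cost, annual crime saving, horizon, discount rates) is hypothetical, as are the function names; the point is only to show how one intervention yields a range of summary figures.

```python
from itertools import product


def net_present_value(net_flows, rate):
    """Discount a list of yearly net benefits (benefit minus cost), year 0 first."""
    return sum(flow / (1 + rate) ** year for year, flow in enumerate(net_flows))


def sensitivity_table(crime_values, discount_rates, program_cost=10_000, years=20):
    """NPV for every combination of assumed crime saving and discount rate."""
    table = {}
    for crime_value, rate in product(crime_values, discount_rates):
        # Year 0 carries the program cost; each later year carries the crime saving.
        flows = [-program_cost] + [crime_value] * years
        table[(crime_value, rate)] = net_present_value(flows, rate)
    return table


table = sensitivity_table(crime_values=[500, 1000], discount_rates=[0.03, 0.07])
for (crime_value, rate), npv in sorted(table.items()):
    print(f"crime saving ${crime_value}/yr, r = {rate:.0%}: NPV = {npv:,.0f}")
```

Under these assumptions the sign of the result depends on the valuation of crime: the lower valuation gives a negative figure at both rates, while the higher valuation gives a positive one.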

Net present value (NPV) is the preferred method to calculate the net impact of an intervention (see Section 9 for a full discussion of cost-benefit calculations), although many studies convert this to the ratio of net present benefits to net present costs. Present value (PV) calculations can be made at any time after the intervention has ended, although (as noted above) the earlier the evaluation, the less clear the results and the more conjectures are needed about how early indicators map into later outcomes. It is advisable to calculate net present values for several rates of discount (rates of interest) to make the sensitivity of the result to variation in the rates apparent.

10. Karoly et al. (2001) provide a very good summary of the practical matters to be considered before attempting a cost-benefit analysis.

11. It is also possible to control for unobservable characteristics using the difference-in-difference method. However, this method requires measures of outcomes before the intervention. This can be reasonable for adult interventions. For example, in employment programs, measures are taken of the outcome variable, employment status, both before and after the program. It is less likely to be suitable for child-orientated interventions where the main outcomes are not observable before the intervention has begun.
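
The difference-in-difference calculation mentioned in this note can be sketched in a few lines of Python. The employment rates below are hypothetical and the function name is illustrative.

```python
def difference_in_differences(treat_before, treat_after, ctrl_before, ctrl_after):
    """Change in the program group minus the change in the comparison group."""
    return (treat_after - treat_before) - (ctrl_after - ctrl_before)


# Hypothetical employment rates before and after an adult employment program.
net_impact = difference_in_differences(
    treat_before=0.40, treat_after=0.55, ctrl_before=0.42, ctrl_after=0.47
)
print(f"Estimated net impact: {net_impact:.2f}")
```

Here the program group improves by 15 percentage points and the comparison group by 5, so the estimated net impact is 10 percentage points. The calculation is only possible because the outcome is observed before the program begins.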

9. Overview of cost-benefit calculations: present value, rate of return and cost-effectiveness methods

This section examines the main formulae used for investment decision rules and how they can accommodate the problem of interpersonal comparisons. As mentioned previously, the essence of cost-benefit analysis is to make clear to policy makers that expenditure on one program is always at the expense of expenditure on an alternative program.¹²

When one person benefits from program A but another person benefits from program B, a value judgement must be made as to which program is preferred. Questions to consider are: Is program A preferred only if the monetary benefit received by its beneficiaries is greater than that received by the beneficiaries of program B? If the beneficiaries of program A are wealthier, or have more promising lifetime prospects than the beneficiaries of program B, is program A still preferred?

Cost-benefit analysis formulae

There are three main formulae used in cost-benefit analysis: net present value, rate of return and cost-effectiveness calculations. This section provides overviews of these methods and the subsequent section deals with the issues of the societal distribution of costs and benefits and the choice of discount rate.

Net present value

The net present value (NPV) is an overall measure of the difference between the costs and benefits of an intervention. Intuitively, if the intervention contributes more benefits to members of society than the costs it imposes on society, then there is an argument for implementing the intervention. However, the time period in which the costs and benefits are generated or received can vary, so the costs and benefits need to be reduced to a single comparable time period by some method. For example, if an early childhood intervention promises to produce a benefit of $100,000 in 50 years' time through reduced crime, it is debatable whether this is of equal value to society as a benefit of $100,000 in one year's time.

It is commonly assumed that more distant benefits and costs are of less value than near ones. Because the value of costs and benefits is assumed to differ according to the time period in which they are incurred or received, they are weighted according to the period in which they fall. Thus, a single overall summary figure can be derived for the intervention. The formula for net present value, discussed below, assumes that each additional year into the future is discounted at a constant rate. However, this need not be the case: the rate of discount can be negative (implying that more distant costs and benefits are valued more than near costs and benefits) or can differ for each selected year.

In general terms, given a stream of benefits, B0, B1, B2... and costs C0, C1, C2..., the formula for the net present value (or NPV) is:

$$NPV = (B_0 - C_0) + \frac{B_1 - C_1}{1 + r} + \frac{B_2 - C_2}{(1 + r)^2} + \dots + \frac{B_n - C_n}{(1 + r)^n}$$

or, more briefly,

$$NPV = \sum_{i=0}^{n} \frac{B_i - C_i}{(1 + r)^i}$$

where r is the rate of discount, n is the final time period, and the subscripts 0, 1, 2... refer to each time period with 0 representing the start of the intervention. When comparing projects, r should include a premium for the risk and uncertainty associated with predicted future benefits and costs.

If costs are one-off and concentrated in the initial time period (such that C1 = C2 = ... = Cn = 0) and the stream of benefits (B) is constant and infinite (n = ∞) then, taking the benefit stream to begin in period 1,

$$NPV = \frac{B}{r} - C_0$$
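
The special case of a one-off cost and a constant, infinite benefit stream can be checked numerically. The Python sketch below assumes that benefits begin in period 1, in which case the standard perpetuity result gives a net present value of B/r less the initial cost; it compares that closed form with a direct sum over a very long (but finite) horizon. The figures and function names are hypothetical.

```python
def npv(benefits, costs, rate):
    """Sum of discounted (benefit - cost) over periods 0, 1, 2, ..."""
    return sum(
        (b - c) / (1 + rate) ** period
        for period, (b, c) in enumerate(zip(benefits, costs))
    )


def perpetuity_npv(annual_benefit, initial_cost, rate):
    """Closed form: constant benefits from period 1 onwards, one-off cost in period 0."""
    return annual_benefit / rate - initial_cost


# Hypothetical figures: $15,000 outlay, $1,000 benefit every year thereafter.
horizon = 2000  # long enough that the discounted tail is negligible at 5 per cent
benefits = [0] + [1000] * horizon
costs = [15000] + [0] * horizon

approximate = npv(benefits, costs, 0.05)
exact = perpetuity_npv(1000, 15000, 0.05)
print(round(approximate, 2), round(exact, 2))
```

With these figures both approaches give a net present value of about $5000.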

Both benefits and costs need to be reduced to a common denominator, usually money. The investment decision rule is either to invest in all interventions that have a net present value greater than zero, or alternatively to rank interventions according to their net present value. However, the net present value is sensitive to the chosen rate of discount. In the example below, intervention A has a higher net present value at low rates of interest while intervention B dominates at higher rates of interest. This will occur because the benefits arising from A are from a more distant time period than B.
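
The reversal of rankings as the discount rate changes can be illustrated with a hypothetical pair of interventions. The figures below are invented for the purpose and are not drawn from any evaluation: both interventions cost $1000 at the outset, A pays a larger benefit in year 20 and B pays a smaller benefit in year 5.

```python
def npv(net_flows, rate):
    """Net present value of yearly net benefits, year 0 first."""
    return sum(flow / (1 + rate) ** year for year, flow in enumerate(net_flows))


# Hypothetical net benefit streams (benefit minus cost in each year).
intervention_a = [-1000] + [0] * 19 + [4000]  # $4000 received in year 20
intervention_b = [-1000] + [0] * 4 + [2000]   # $2000 received in year 5

for rate in (0.02, 0.08):
    npv_a = npv(intervention_a, rate)
    npv_b = npv(intervention_b, rate)
    preferred = "A" if npv_a > npv_b else "B"
    print(f"r = {rate:.0%}: NPV(A) = {npv_a:,.0f}, NPV(B) = {npv_b:,.0f}, prefer {preferred}")
```

At 2 per cent A has the higher net present value; at 8 per cent B does, and A's net present value is negative. The more distant benefit is penalised more heavily as the rate rises.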

Net present value (NPV) can only be used in circumstances where the main costs and benefits of an intervention can be reduced to a common unit of account, usually money. This ensures that the value to the participant and society of higher wages, a more rewarding job, less crime and less social dislocation can be monetised in a meaningful way. If the methods used to monetise these effects are not well accepted by policy makers then this method of deciding between interventions should not be used.

Rate of return

The rate of return formula uses many of the same assumptions as the net present value referred to above, but instead of calculating a single measure of net benefits at a given discount rate, it estimates the discount rate that is required to produce a single net benefit measure of zero. The rate of return, λ, is derived from the formula:

$$0 = (B_0 - C_0) + \frac{B_1 - C_1}{1 + \lambda} + \frac{B_2 - C_2}{(1 + \lambda)^2} + \dots + \frac{B_t - C_t}{(1 + \lambda)^t}$$

or more briefly,

$$\sum_{i=0}^{t} \frac{B_i - C_i}{(1 + \lambda)^i} = 0$$

where t is the time horizon for the intervention. The investment decision rule is to invest in all interventions with an internal rate of return greater than the societal rate of time preference. The latter is the rate at which the average member of society is prepared to forego benefits in the current period, in order to receive benefits in a later period. If, for example, the average citizen is prepared to forego $100 worth of consumption today only if he or she is certain to receive at least $105 next year, then the rate of time preference is 5 per cent. Rates of time preference can be negative. A person may be willing to give up $100 today in order to be certain to receive $95 next year (possibly because their income from other sources is expected to fall), in which case the rate of time preference is -5 per cent.
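
The rate of return generally has to be found numerically. The Python sketch below uses bisection, which is one simple approach among several, and assumes that the net present value changes sign exactly once between the lower and upper bounds. The cash flows, the 5 per cent rate of time preference and the function names are hypothetical.

```python
def npv(net_flows, rate):
    """Net present value of yearly net benefits, year 0 first."""
    return sum(flow / (1 + rate) ** year for year, flow in enumerate(net_flows))


def internal_rate_of_return(net_flows, low=0.0, high=1.0, tolerance=1e-9):
    """Find a rate in [low, high] at which NPV is zero, by bisection.

    Assumes NPV changes sign exactly once within the bracket.
    """
    npv_low = npv(net_flows, low)
    if npv_low * npv(net_flows, high) > 0:
        raise ValueError("NPV does not change sign between low and high")
    while high - low > tolerance:
        mid = (low + high) / 2
        npv_mid = npv(net_flows, mid)
        if npv_low * npv_mid <= 0:
            high = mid  # the root lies in the lower half
        else:
            low, npv_low = mid, npv_mid  # the root lies in the upper half
    return (low + high) / 2


# Hypothetical intervention: $1000 outlay, $2000 benefit five years later.
flows = [-1000, 0, 0, 0, 0, 2000]
rate_of_return = internal_rate_of_return(flows)

rate_of_time_preference = 0.05  # assumed societal rate
invest = rate_of_return > rate_of_time_preference
print(f"rate of return = {rate_of_return:.4f}, invest = {invest}")
```

For this stream the rate of return solves (1 + λ)^5 = 2, or roughly 14.9 per cent, which exceeds the assumed rate of time preference, so the decision rule favours investing.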

Cost effectiveness

Cost effectiveness approaches are used when it is not considered meaningful to monetise the benefit streams, and the investment criterion is reduced to ranking the costs of achieving the same goals through different interventions. For example, raising the school retention rate for a target population may be achieved by preschool programs, parent education and awareness programs or direct financial incentives to families. In this case, the investment decision criterion would be to minimise the net present value of costs:

$$NPC = \sum_{i=0}^{n} \frac{C_i}{(1 + r)^i}$$

where NPC is net present costs.
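
A cost effectiveness comparison can be sketched as follows, assuming that each option achieves the same rise in school retention. The cost streams and the 5 per cent discount rate are hypothetical, as is the function name.

```python
def net_present_cost(costs, rate):
    """Discounted sum of yearly costs, year 0 first."""
    return sum(cost / (1 + rate) ** year for year, cost in enumerate(costs))


# Hypothetical yearly costs per child for three ways of reaching the same goal.
options = {
    "preschool program": [5000, 5000, 0, 0],
    "parent education": [3000, 3000, 3000, 3000],
    "financial incentives": [2000, 2500, 3000, 3500],
}

rate = 0.05
ranked = sorted(options, key=lambda name: net_present_cost(options[name], rate))
for name in ranked:
    print(f"{name}: NPC = {net_present_cost(options[name], rate):,.0f}")
```

Under these assumptions the preschool program has the lowest net present cost and would be chosen, even though its costs are the most front-loaded.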

Which measure is superior?

The cost effectiveness rule is a superior decision rule only when the benefits are homogeneous and thus quantifiable across alternative interventions. It would be appropriate then to use this rule when comparing two or more interventions to increase school retention or reduce the number of criminal assaults. As soon as interventions have more than one type of benefit, or the benefits vary in quality to such an extent that they cannot be quantified in a meaningful way, then the cost effectiveness approach cannot be used. In this instance, the net present value and rate of return formulae should be used.

In general, these two approaches will give different rankings depending on either the chosen rate of discount or the chosen time period, leading to some ambiguity in the investment decision-making instrument. The calculation of the internal rate of return is, however, sensitive to the chosen time horizon (t). Layard (1972: 51-52) argues that there are three main reasons for preferring the net present value as a decision rule:

The net present value can accommodate variations over time in the discount rate which the rate of return approach cannot.
The rate of return approach incorrectly ranks interventions of different size or interventions of different time horizons. This is not an issue if the projects are completely divisible and duplicable (maintaining the same stream of costs and benefits pari passu), but in this case the rate of return approach will give the same answer as the net present value. Thus the rate of return metric is equivalent but not superior to the net present value metric.
The rate of return calculations may not give a unique answer and may give many solutions, as shown in the diagram below. A given project has a unique net present value for each rate of discount, but net present value may equal zero at two rates.

In short, the rate of return provides a less general approach than the better defined net present value (or NPV).
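
The possibility of more than one rate of return can be demonstrated with a hypothetical net benefit stream that changes sign twice: an outlay, a benefit, then a later cost. The figures are chosen purely so that the arithmetic is easy to verify.

```python
def npv(net_flows, rate):
    """Net present value of yearly net benefits, year 0 first."""
    return sum(flow / (1 + rate) ** year for year, flow in enumerate(net_flows))


# Hypothetical stream: -100 in year 0, +230 in year 1, -132 in year 2.
flows = [-100, 230, -132]

for rate in (0.05, 0.10, 0.15, 0.20, 0.25):
    print(f"r = {rate:.0%}: NPV = {npv(flows, rate):.3f}")
```

The net present value is zero at both 10 per cent and 20 per cent, positive between them and negative outside them, so the rate of return rule has no single answer, whereas the net present value at any chosen rate is unambiguous.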

Societal distribution of costs and benefits

Cost-benefit analysis aims to produce an index for the net societal benefit from a given investment project to enable decision makers to decide either whether a project should proceed, or how to rank projects by value. It calculates a single figure by simply summing costs and benefits across individuals and is therefore neutral with respect to the types of individuals who will benefit the most from one intervention or the other. However, it may so happen that intervention A may benefit (or disadvantage) community Y the most, while intervention B benefits (or disadvantages) community Z more. It almost always happens that intervention A will affect distinct individuals in a different way from intervention B, whether or not they belong to the same community or group.

In the example of early childhood interventions, there are very clear potential income distributional effects. Many of the proposed benefits from running early childhood interventions are intended to directly affect the participating child in the form of better health, higher wages and a more rewarding labour market experience. As such, society as a whole, through the payment of taxes, has made the investment for the localised benefits of selected groups in society (but see the discussion of spillover effects in Section 10). This is a subjective policy decision. To make this subjective process more transparent, cost-benefit analysts may choose to weight the benefits and costs according to a set of subjective distributional values (see Weisbrod 1968: 814). In this way, a series of indices, based on different subjective weights, may be calculated, and the decision maker can see how sensitive the project rankings are to the subjective weights. Generally, however, distributional differences have no formal place in the cost-benefit formulae, but since they are clearly relevant decision-making criteria for public policy, they are treated discursively in the text that accompanies the evaluation.
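
The use of distributional weights can be sketched as follows. The groups, net benefits and weights in this Python example are all hypothetical and carry no empirical content; the sketch only shows how a ranking can change with the weights chosen.

```python
def weighted_net_benefit(net_benefits_by_group, weights):
    """Sum each group's net benefit multiplied by its distributional weight."""
    return sum(
        weights[group] * net_benefit
        for group, net_benefit in net_benefits_by_group.items()
    )


# Hypothetical present values of net benefits accruing to each group.
intervention_a = {"low income": 400, "high income": 900}
intervention_b = {"low income": 800, "high income": 300}

weight_sets = {
    "unweighted": {"low income": 1.0, "high income": 1.0},
    "pro-poor": {"low income": 2.0, "high income": 1.0},
}

preferred = {}
for label, weights in weight_sets.items():
    index_a = weighted_net_benefit(intervention_a, weights)
    index_b = weighted_net_benefit(intervention_b, weights)
    preferred[label] = "A" if index_a > index_b else "B"
    print(f"{label}: A = {index_a:.0f}, B = {index_b:.0f}, prefer {preferred[label]}")
```

With equal weights A ranks first (1300 against 1100); when benefits to low-income families are given double weight, B ranks first (1900 against 1700).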

Intertemporal discount rate

An investment is, by definition, a current outlay made in the expectation of a future return. The decision maker therefore is always comparing values over time and must make some choice about whether to discount, or appreciate, future dollar values relative to today's dollar. For business, this is straightforward. Since businesses must borrow money for investment, either from a financial intermediary or their shareholders, the cost of having funds tied up in an investment project is the market rate of interest plus an allowance for the risk and uncertainty of the project.

For public policy makers, the issue is about the cost of deferring today's consumption (this includes consumption of welfare products) until some time in the future. Clearly, if the labour and resources used for an investment are currently unemployed or not used, then there is no deferral of today's consumption and the discount rate is zero.

However, investments that only involve otherwise unemployed resources are rare, and the general case will be when the project(s) involve scarce labour who would otherwise be employed in other welfare enhancing work. The appropriate rate of discount represents how much of today's consumption is foregone in order to consume tomorrow. For example, if 91 cents is foregone today in order to consume $1.00 tomorrow, then the rate of discount is approximately 10 per cent (since $1.00 / $0.91 ≈ 1.10).

While rates can be estimated by surveying people and asking them what they personally would give up today in order to get a specified amount in the future, interpersonal comparisons cannot be estimated this way. In particular, this method cannot be used to estimate inter-generational discount rates. Today's citizens should not be entitled to put a maximum rate on how much of today's consumption they should forego in order to benefit, or not benefit, future unborn generations, or decide if future generations' welfare should be discounted at all, especially when much of today's consumption is derived from the natural endowment. This is not a question of today's parents deciding on how much to invest for their own children. In matters of public policy, it is today's members of society collectively determining the discount rate for the collective population of tomorrow.

With respect to early childhood interventions, the choice of discount rate will affect the ranking and net present value calculation of interventions where the benefits are concentrated in the school age period, early adulthood or late adulthood.

While there is no certain correct answer to the question of the appropriate rate of time preference, positive discount rates are generally used for the simple practical reason that projects with zero or negative discount rates and infinitely lived benefit streams do not converge to a present value.
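
The convergence problem can be seen by computing the present value of a constant benefit stream over ever longer horizons. The Python sketch below assumes a benefit of $1 per year starting in year 1; the function name and the horizons are illustrative.

```python
def present_value_of_constant_stream(annual_benefit, rate, years):
    """Present value of a constant yearly benefit received in years 1..years."""
    return sum(annual_benefit / (1 + rate) ** year for year in range(1, years + 1))


for rate in (0.05, 0.0, -0.01):
    values = [
        present_value_of_constant_stream(1, rate, years) for years in (100, 500, 2000)
    ]
    print(f"r = {rate:+.0%}: " + ", ".join(f"{value:,.1f}" for value in values))
```

At 5 per cent the present value settles near 1/0.05 = 20 however long the horizon. At a zero rate it simply equals the number of years, and at a negative rate it grows faster still, so neither approaches a finite value as the horizon lengthens.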

12. Here the term 'expenditure' is used in a broad sense to include a program of tax reduction.

10. Valuing spillovers and other non-market transactions

There are four major reasons why a particular intervention benefit will not be recorded through the market. The first is because it affects a party other than the intervention provider and recipient (that is, someone not directly involved in the transaction for which a price is struck). The second, third and fourth reasons arise because the consumer of the service does not voluntarily purchase the service through the market. In this case, while the transaction may occur because of compulsion or funding from a third party, such as the government, there is no consumer price and hence no way to evaluate how much the consumer values the service.

These three reasons for non-market consumption of the service are: first, because either the purchaser or supplier of the service is myopic and does not appreciate the benefits to be had from engaging in the transaction; second, because the benefits, while statistically significant on average, are too uncertain at the individual level for the individual to have the confidence to invest¹³; and third, because although both parties know of the benefits, the potential purchaser cannot afford to buy (or borrow money for) the service even though he or she realises that over the longer term there will be a net return.

While there are undoubted potential spillovers or benefits to third parties from investing in early childhood interventions (for example, from reduced crime rates and fewer welfare dependants), there is an implied view in the literature that the reason the target groups, predominantly low income families, are not already investing more intensively in their children's education and social development is because of the other three factors discussed above. In short, low income families either are not apprised of the benefits of positive early childhood experiences, are not convinced that these benefits will apply to their individual circumstances, or do not have the funds to pay for the extra professional assistance and materials.

The following section discusses the nature of spillovers and why they are not captured by the market. This is followed by consideration of ways that have been devised to monetise major non-pecuniary spillovers and other benefits to consumers that are not transacted through the market because the consumer does not voluntarily purchase them from the market.

The nature of spillovers (externalities)

A complete cost-benefit analysis should count the expected costs and benefits of a program to all parties in society, regardless of whether these are transacted through the market, or whether they are directly transacted through contact with another person or another activity.

The first and second party to a transaction are the producer (supplier) and the consumer of the good or service. In the case of a market transaction, the price paid for the good or service is taken to reflect both the producer's (maximum) cost of production (or benefits foregone) and the consumer's (minimum) valuation of consuming the product. Because both the producer and consumer are willing parties to a market transaction, it is assumed that the agreed price is above the maximum cost and below the minimum benefit.

Spillovers, or externalities, are unintended effects of such a transaction on a third party. Generally, the third party has no power over the absorption of these effects.¹⁴ Examples of spillovers include changes to the environment resulting from a higher level of production and consumption and thus a change in the ability of third parties to derive satisfaction from the environment, or a change in the welfare of a whole community resulting from more education or health programs for a specific sub-group.

Spillovers are always outside the market and are thus not measured through the price mechanism. Taken to the extreme, the number and quantity of spillovers is unlimited, as the actions of one party can have infinite possible effects on the welfare of proximate parties. However, the cost-benefit analyst must, for practical reasons, limit the scope of measured costs and benefits to those that are significant in size and that should reasonably, and ethically, enter a societal welfare function.

Valuation of non-market costs and benefits

How far the analyst should go to impute the value of non-marketed costs and benefits depends on the estimated size of these effects, relative to market transactions, and how much information needs to be collected. As a minimum, the analyst should mention and describe the principal spillovers.

Where commercial operations co-exist with public provision, the former may be used to impute values, after adjusting for differences in quality. In Australia, this will include health, educational, aged-care and recreational services. In some cases, there may be no market for the good or service in Australia, for cultural or institutional reasons, but commercial operations may exist overseas (for example, commercial city parks, commercial beaches) from which to draw prices. The difficulty here is finding examples that are close enough in their characteristics for a parallel to be drawn.

Clearly, there are private markets in Australia and other countries for intensive childhood services of the type considered in this report. Parents can buy extra kindergarten services, and extended professional assistance for social and educational needs. The prices of these services can be used as a shadow price for the value to families that are receiving these services free through a government or welfare agency program.

However, there remains the critical issue of how much foresight parents who currently use the private market have, and thus how well the price they are prepared to pay encapsulates the present value of long-term benefits to the child, and second, whether the impact on children who use the private market will be of the same proportion to the impact on children who do not. Children who use the private market are more likely to belong to high income and well educated families than children who do not use the private market. In essence, the impact of the program may depend on selection effects. It is not clear whether these qualifications imply a systematic over- or under-estimate of the 'true' value of the benefits of the intervention services to the target group.

If the impact on the target group is higher than for the population currently purchasing the services privately, then the present value of benefits will be higher than the present value of the costs of running the intervention. In addition, if it can be convincingly argued that there are spillover benefits from the intervention, such as reduced crime rate and better social cohesion, then the present value of benefits will be accordingly higher still. The aim of the evaluations is to argue that either or both of these additional sources of benefits are present, and to define, and possibly monetise the size of these benefits.

Two generic rules - the principles of exact compensation and of opportunity costs - are used to quantify non-market transactions.

Principle of exact compensation

In theory, the pecuniary measure of the effect of an externality is the amount of income that a person would have to receive, or forego, in order to maintain his or her level of satisfaction (utility) at the same pre-effects level. In order to assess this, economists use the principle of revealed preferences. They look at how much people are prepared to pay in order to avoid a negative spillover, or to come in contact with a positive spillover.

Principle of opportunity cost

In many cases, explicit or implicit program costs represent transfer payments and are not true societal costs since there is no opportunity cost (foregone benefit) from using the designated resource or labour. The classic example is when a program uses unemployed labour. Even though there is a program cost to employing people who would otherwise not be employed (wages + on-costs + capital costs), the true cost is their loss from their alternative use, which is the loss of leisure. The same reasoning applies to otherwise unused facilities and resources. However, the cost of scarce (or already employed) labour should be counted as a program cost, because outputs from alternative employments are being foregone.

In the remaining sections, divergent approaches to valuing changes to one person's wellbeing through change to their health or socio-physical environment are considered. Estimates of the number of people affected by each type of cost and benefit also have to enter the cost-benefit formula. In addition, if the analyst wants to conduct a sensitivity analysis of the distributional consequences, the types of people affected also need to be enumerated.

Life and health

One of the common benefits to monetise is a reduction in death and ill health. To the extent that early childhood interventions reduce anti-social behaviour and improve health, and thus reduce ill effects on the participants and the people they interact with over the course of their life, these factors may be included in an evaluation.

There are four common ways that the literature uses to monetise the value of life and health. Evaluators who require these values for an evaluation will not make these estimates themselves but will draw upon an existing study.

The first method, a simple but rather narrow way to calculate the societal loss from one person's life, is to equate it to the present value of that person's future earnings, Y (the PVE).
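
One standard way to write this present value is sketched below. It assumes a constant annual discount rate r, a symbol introduced here for illustration only, and writes earnings in year t as Y_t:

```latex
PVE = \sum_{t=c}^{d} \frac{Y_t}{(1+r)^{t-c}}
```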

where t is time, c is the current year, and d is the expected year of retirement.

In some cases, account is also taken of the bereavement of the family and loss of enjoyment by the individual (Mishan 1975: 299). Changes to transfer payments are not measured as these represent a transfer between members of society and not a net loss to society. Transfer payments, such as an orphan's payment, would only be included (as a gain to the family and a loss to the rest of the community) if the present value calculation included weights for income distribution factors.

This method is not widely accepted, since by extension, it implies that the goal of an economy is to maximise GDP (which by extension is achieved through unlimited immigration) (see Mishan 1975: 301).

The second method, a more advanced version of the PV earnings, is the PV of losses affecting other parties only. This deducts personal expenses, C, from gross income, Y. This essentially excludes the loss of utility to the dead person.
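
Using the same time indices as the first method (t for time, c for the current year, d for the expected year of retirement), and again assuming a constant annual discount rate r that is introduced here for illustration only, this second measure can be sketched as:

```latex
PV_{\text{others}} = \sum_{t=c}^{d} \frac{Y_t - C_t}{(1+r)^{t-c}}
```

Under this form, a person who consumes their whole income each year (C_t = Y_t) contributes a value of zero.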

This method is also not well accepted as it implies that there is no loss associated with a person, such as a low-income recipient, who consumes their whole income.

The third method is to look at revealed preferences of individuals or the government. Expenditures, such as installing seat belts, improving occupational health and safety or improving medical equipment, that lower the probability of death, can be used to calculate a dollar figure for a reduction of x per cent in the death rate of y people over a given year. For example, a new diagnostic machine in a hospital is reasonably expected, through earlier detection, to reduce the death rate from that disease by 1 per cent. If 1,000 people are treated each year, the machine costs $10 million, and has a life-time of ten years, then the value of saving a life is $100,000.

Similarly, the wages associated with more risky jobs compared with less risky jobs may be used to calculate a person's pecuniary assessment of the risk differential. If one job earns $100 per week more than other comparable jobs but has a 1 percentage point greater chance of serious injury in any given year, then the value of losses due to injury is equal to $100 x 52 x 100 = $520,000.
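
The arithmetic behind the two revealed-preference examples above can be sketched as follows. The function names are illustrative, and the sketch assumes that the "1 per cent" in the hospital example means a one percentage point fall in the annual death rate among the people treated, which is the reading that reproduces the $100,000 figure.

```python
def value_per_life_saved(equipment_cost, patients_per_year,
                         death_rate_reduction, lifetime_years):
    """Spending divided by the expected number of deaths averted
    over the working life of the equipment."""
    lives_saved = patients_per_year * death_rate_reduction * lifetime_years
    return equipment_cost / lives_saved


def implied_value_of_injury(weekly_wage_premium, extra_annual_risk,
                            weeks_per_year=52):
    """Annual wage premium divided by the extra annual probability
    of serious injury that the premium compensates for."""
    annual_premium = weekly_wage_premium * weeks_per_year
    return annual_premium / extra_annual_risk


# Hospital example: $10 million machine, 1,000 patients a year,
# death rate lower by 0.01, ten-year life.
machine_value = value_per_life_saved(10_000_000, 1_000, 0.01, 10)

# Wage example: $100 a week more for a 0.01 higher annual injury risk.
injury_value = implied_value_of_injury(100, 0.01)

print(round(machine_value), round(injury_value))
```

Both calculations divide a dollar amount by a change in probability, which is why the resulting estimates are sensitive to how the risk change is measured.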

This method does rely upon the assumption, in the case of government expenditures, that decisions on these matters reflect societal preferences. In addition, it is assumed that individuals make reasonable, informed choices and are not unduly myopic. One disadvantage of this revealed preference method is that as many different estimates as there are examples will be produced. There is likely to be a different trade-off in jobs between the risk of injury and wages for many jobs. An average, or weighted average for certain demographic groups, offers the best solution.

With the fourth method, the amount for which a person is prepared to insure their life may indicate how much he or she believes their life is worth to their beneficiaries. Similar to the second method above, this method assumes that the life has no intrinsic value to the potential loser of life.

In a similar way to the calculation for loss of life, calculation can be made for the loss of limb or health.

Location effects

Local amenity - arising from pollution, traffic congestion, crime rates, access to good schools and facilities - can be measured through the analysis of property prices. To estimate the effects of one type of amenity, such as crime rate, the analysis would need to be multi-variate and involve large amounts of data to enable one effect to be separated from the other. This requires obtaining data from locations with a large variation in the effect under consideration and being able to control for all the other major characteristics that affect housing prices. The argument is that the difference in average rental (or rental imputed from price) between similar houses, but one group located in a high crime area and the other in a low crime area, represents how much people are willing to pay to avoid the negative externality of a greater risk of being the victim of crime; or in other words, how much the difference in crime rate is worth to the people.¹⁵

Life satisfaction

Job, social and family satisfaction are commonly cited benefits from many programs and there is a tradition of measuring changes in satisfaction in questionnaires using ordinal scales such as the Likert scale. While these scales are regarded as acceptable ways to rank, and sometimes compare,¹⁶ levels of satisfaction, they do not easily translate into a pecuniary value.

13. While mathematical probabilities can be calculated for a large group of individuals, other factors dominate outcomes for single individuals.

14. Usually this is because they are intrinsically non-excludable (for example, air pollution) or not excluded in practice (for example, gardens).

15. Due to the psychological and pecuniary cost of relocation, only the rents or housing prices of people recently moving in to a neighbourhood should be included. Someone who values a reduction in the crime rate more than the rent differential between neighbourhoods, will not move if the cost of moving exceeds the present value of the net gain.

16. This implies that the scales are cardinal, not just ordinal.

11. Evaluations of cost-benefit studies of early childhood interventions

Eight of the 32 early childhood interventions reviewed in this report contained a cost-benefit analysis. These studies are assessed here, in the light of how well they serve as a template for an Australian evaluation of an early childhood intervention. Each evaluation was assessed against ten major criteria that relate to the main steps in conducting a cost-benefit analysis: estimating the net impact of the program; treatment of costs and benefits; and formulae used for calculating net effects.

Perry Preschool Project (Perry)

General comments

This is a well-executed and internally coherent cost-benefit analysis. The analysis relies upon estimates of deterministic relationships between variables such as years of education and earnings from secondary sources. These are not critically examined or qualified. It has a sensible treatment of which people to include as beneficiaries and the scope of costs and benefits included. It uses a discursive method to recognise and treat relevant but imprecise costs and benefits.

The main weakness of the analysis is the very small and non-representative sample and thus the very much reduced value for policy makers who wish to generalise about the effects of extending the intervention. This is not a trivial issue since most of the benefits arise from the reduced crime rate for which we are given little information (that is, the number of criminal acts committed by the evaluation sample). There are no considerations of the effects on the family of participants and other members of the community other than in their role as tax-payers.

Nonetheless, it serves as a good template for future cost-benefit analyses.

Estimating the net impact of the intervention

Controlling for selection bias. Program participation was allocated on the basis of random assignment. There is no discussion of whether there is any selection bias ex post although it is indicated that parents could, and did, opt out of the parental meetings.

The size of the participation and control groups. Only 58 participants were included in the intervention group. This is too small to draw many inferences from.

Tests of statistical significance. Tests for whether the net impacts are significant are given, but it is not clear whether these have controlled for differences in mental retardation between the two groups.

Treatment of costs and benefits

Scope of included costs. This is well done and includes all major program and instruction costs.

Scope of included benefits. The main benefits are the material enhancement of the participants, the reduced costs to society from less remedial schooling, and the reduced crime rate. A discussion is made of other life satisfaction benefits, but these are not used in the cost-benefit calculation. This seems a valid list.

Scope of potential beneficiaries. This is limited to the participants, the counterfactual victims of crime and the taxpayer. This seems appropriate.

Time scale considered. Real data is used to age 27 and estimates are extrapolated to retirement age. This is a valid method.

Method used to monetise non-pecuniary benefits. The only non-pecuniary benefit which has been monetised, and which contributes over half of the net present value of benefits, is the effects on the potential victims of crime. The monetary value of the pain and suffering etc that a victim of crime incurs is cited in a footnote and not discussed or rationalised. This is a weakness of the analysis.

Formula used for calculation

Method employed. The analysis uses the net present value (NPV) method, at a 3 per cent rate of time preference (discounted back to the start of the program).

Use of sensitivity analysis. A reasonable use of sensitivity analysis is made with respect to the rate of time preference and the effects of crime on victims.
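
As a minimal sketch of the NPV method and of a sensitivity analysis on the rate of time preference, the following discounts a stream of yearly net benefits back to the start of a program at several rates. The cash flows are hypothetical and chosen only for illustration; they are not taken from the Perry evaluation.

```python
def npv(net_flows, rate):
    """Net present value of yearly net benefits, discounted to year 0
    (the start of the program). net_flows[0] is the year-0 amount."""
    return sum(flow / (1 + rate) ** year
               for year, flow in enumerate(net_flows))


# Hypothetical stream: a cost of 12,000 in year 0,
# followed by benefits of 1,000 a year for 25 years.
flows = [-12_000] + [1_000] * 25

# Sensitivity analysis: recompute the NPV at several discount rates.
for rate in (0.0, 0.03, 0.05, 0.07):
    print(f"rate {rate:.0%}: NPV = {npv(flows, rate):,.0f}")
```

Because the benefits in this illustration arrive long after the costs, the NPV falls as the rate rises and turns negative at the highest rate shown, which is why evaluations with long follow-up periods report results at more than one rate.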

Cost-benefit findings

The cost-benefit analysis indicated that the benefits totalled $108,002 per child while costs totalled $12,356 per child. This is equal to a saving of $8.74 for every $1 spent. The cost-benefit analysis also indicated that the net benefits remained large even when any one of the benefits was excluded, or if all benefits were reduced by half.
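
The ratio reported above can be checked directly from the per-child totals. This is a small sketch using only the two figures quoted in the text; the function name is illustrative, and the halving of benefits mirrors the robustness check described in the findings.

```python
def benefit_cost_ratio(benefits, costs):
    """Dollars of benefit returned per dollar of cost."""
    return benefits / costs


perry_benefits = 108_002  # benefits per child, as reported
perry_costs = 12_356      # costs per child, as reported

ratio = benefit_cost_ratio(perry_benefits, perry_costs)
net_benefit = perry_benefits - perry_costs

# Robustness check: cut all benefits by half and recompute the ratio.
halved_ratio = benefit_cost_ratio(perry_benefits / 2, perry_costs)

print(f"{ratio:.2f} {net_benefit} {halved_ratio:.2f}")
```

The halved ratio stays well above 1, consistent with the statement that net benefits remain large when all benefits are reduced by half.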

Bolivia Integrated Child Development Program (PIDI)

General comments

Participants in this intervention, which aims to increase the cognitive and physical development of undernourished children in Bolivia, were self-selected. Most of the analysis is devoted to the treatment of the selection issues based on observable characteristics, as the data do not permit the evaluators to control for unobservables. There is no data on the pre-intervention cognitive and physical condition of the children. It does not appear that the intervention was set up with evaluation in mind and this compromises the robustness of the findings.

The cost-benefit analysis is rather scantily done and some of the assumptions are weak. In particular, the assumption that four years of program provision will have a permanent effect on cognitive and physical development has not been justified. The exposition does not make it clear how many observations are included in each regression used to derive the net impacts.

This is not a good example of a cost-benefit analysis.

Estimating the net impact of the intervention

Controlling for selection bias. The authors use a Heckman-type estimation to control for selection issues, but this can only be conditioned on observables.

The size of the participation and control groups. The description of the sample size is confusing and it is not clear which size has been used for each regression.

Tests of statistical significance. Tests of significance are provided.

Treatment of costs and benefits

Scope of included costs. Costs are limited appropriately to the intervention costs for the four years.

Scope of included benefits. Benefits are limited to the net effect on earnings. This is narrow. It is assumed that changes in height affect earnings even though the net impact on height was not significant at the 10 per cent level.

Scope of potential beneficiaries. Only the intervention participants are included - this is narrow.

Time scale considered. The time scale is limited to two rounds of data gathering. It is not clear what the time period was between these measurement points and the commencement or termination of the intervention.

Method used to monetise non-pecuniary benefits. All intervention data give intermediate outcomes (measures of cognitive and physical development) and these are converted into future earnings using the assumption that all changes are permanent and by reference to secondary data linking cognition and physique to earnings. This is a reasonable method.

Formula used for calculation

Method employed. NPV.

Use of sensitivity analysis. Different levels of time preference and base level educational attainment are used to calculate the cost-benefit ratios.

Cost-benefit findings

The cost-benefit analysis estimated that the program cost approximately $43 per child per month (per capita annual GDP is $800). Forty per cent of this cost is consumed by providing children with their nutritional needs. Benefit to cost ratios were found to range from 1.7:1 to 3.7:1 (where benefits in terms of future earnings were the focus).

Chicago Child-Parent Center (CPC)

General comments

This analysis follows the format established by Barnett (1993a, 1993b) in the Perry Preschool Project evaluations. The major differences are, on the plus side, a considerably larger sample (989 in the program and 550 in the control group) but, on the negative side, non-randomised selection into the program. This means that the two comparison groups can vary in unobservable ways. In particular, it is expected that more motivated and determined parents will enrol in the program. This will tend to overstate the benefits from the intervention as control and intervention groups are not alike except for their program participation. Similar methods to the Barnett studies are used to measure costs and benefits.

Estimating the net impact of the intervention

Controlling for selection bias. Selection into the intervention is not random and it is not clear whether the determinants are limited to parental interest, other than the normal income and disadvantage requirements. Unlike the cost-benefit analysis for PIDI, no attempt is made in the regression analysis to adjust for selection bias.

The size of the participation and control groups. Over 1000 cases are included in the evaluation. This is a reasonable sample.

Tests of statistical significance. Tests of significance are reported and the significant results are used in the cost-benefit analysis.

Treatment of costs and benefits

Scope of included costs. The same set of costs as in Barnett above is included. This is adequate.

Scope of included benefits. The same set of benefits is included as in Barnett and, as in Barnett, some discussion is made of the unmeasured benefits. Similar to other studies, seemingly reputable secondary sources are used to convert intermediate outcomes, such as education, into earnings.

Scope of potential beneficiaries. Same set of beneficiaries are included as Barnett.

Time scale considered. Evaluation data appears to extend to the age of 21 years which is a reasonable length.

Method used to monetise non-pecuniary benefits. Only the public expenditure costs associated with criminal activity, policing, judicial and property costs etc, are included in the analysis so there are no monetary values of non-pecuniary benefits. A similar treatment is made of the costs of child abuse - these benefits are limited to public expenditure savings.

Formula used for calculation

Method employed. NPV.

Use of sensitivity analysis. Several different rates of time preference are used.

Cost-benefit findings

The cost-benefit analysis estimated the cost of the program to be US$6,730 (1998 dollars) for one and a half years, with a return of US$47,759 per child. Overall, $7.10 was returned to society for every dollar spent (benefits to society were $3.83 for every dollar and government savings were $2.88 per dollar).

Carolina Abecedarian Project (Abecedarian)

General comments

A more expensive and intensive intervention than Perry and Head Start, the Abecedarian involved full-time care for children aged up to five years. The number of children involved in the intervention was very small (104) and the analysis thus suffers from the same difficulties as Barnett's other studies on the Perry Preschool Program. Unlike other studies, this evaluation extends to more speculative benefits such as the effects on future generations and life expectancy of the participants. This is not entirely wrong, but because there is a large variance on the size of the monetised estimates of these benefits, it is good practice to present both conservative as well as the more far-reaching estimates.

Estimating the net impact of the intervention

Controlling for selection bias. Participation was by random assignment, but given the considerable commitment required from parents (relinquishing their child for 40 hours a week from birth), there are likely to be considerable ex post selection issues. This is not discussed.

The size of the participation and control groups. At 104 in both intervention and control groups combined, the numbers are too small for strong generalisations.

Tests of statistical significance. These are presented.

Treatment of costs and benefits

Scope of included costs. Usual program costs are included in addition to the costs of child care for the control group.

Scope of included benefits. These are considerably broader than other studies as they evaluate, using secondary sources, the effects on the education of the next generation and the effects of the significant reduction in smoking on the value of additional years of life to the participant. Estimates are used from the secondary literature to estimate earnings to the age of retirement.

Scope of potential beneficiaries. Unlike other studies, this study includes the effects on the mothers' educational attainment during the intervention period.

Time scale considered. Survey data and school data has been used up to the age of 21 years. This is a reasonable period.

Method used to monetise non-pecuniary benefits. The main non-pecuniary benefit monetised is the value of life. The analysis refers to value of life literature that makes this assessment based on how much people are willing to spend in order to reduce their risk of death by a designated percentage. This is a common, but controversial, method used in economics.

Formula used for calculation

Method employed. NPV, but rates of return are also discussed.

Use of sensitivity analysis. Discount rates from 0 to 7 per cent are used. Net present benefits can be decomposed to exclude benefits affecting future generations and those that arise from reduced smoking.

Cost-benefit findings

The cost-benefit analysis found that the average annual cost was about US$13,900 (2002 dollars) per child. The final cost-benefit findings were that the benefits outweighed the costs by $4 to every $1 spent.

Starting Early Starting Smart (SESS)

General comments

This intervention integrates preschool education with programs to reduce parents' mental illness and substance abuse. It began in 1997, and the evaluation is still in the early data collection phase. While there has only been partial random assignment, the number of children involved in the evaluation is large. The evaluators are intending to collect extensive detail on program costs and cover a broad scope of potential benefits. No outcomes are presented in the article.

Estimating the net impact of the intervention

Controlling for selection bias. Participation was by random assignment in some centres but in others a control group was constructed. A broad range of control characteristics has been collected. These range from parent-child interaction, home environment, child behaviour as well as the more traditional individual and family characteristics.

The size of the participation and control groups. There are 1900 children in the sample, which is large.

Tests of statistical significance. Not applicable.

Treatment of costs and benefits

Scope of included costs. Unusual detail is provided on program costs, including classifications into fixed and variable, consumable and non-consumable, investment and operating, and stakeholder group costs. It is not clear what can be gained from this level of detail.

Scope of included benefits. An extensive set of benefits is proposed to be collected. In addition to the usual educational intermediate outcomes, there will be data on welfare dependence, arrests, emergency room visits, family violence and mental illness.

Scope of potential beneficiaries. Unlike other studies, this study includes the effects on the participants' families.

Time scale considered. Survey data and school data will be collected through to adulthood.

Method used to monetise non-pecuniary benefits. Not stated.

Formula used for calculation

Method employed. Not decided yet.

Use of sensitivity analysis. Not decided yet.

Cost-benefit findings

The cost-benefit analyses have not yet been conducted.

Florida Family Transition Project (FTP)

General comments

This family based intervention delivers health, education and social services to adults with the aim of reducing welfare dependence and increasing employment income. There is random assignment and a large evaluation sample. Conservative measures are taken on costs and benefits, which are limited to government payments and earnings.

Estimating the net impact of the intervention

Controlling for selection bias. Participants were allocated through random assignment and participation was deemed compulsory. There was considerable attrition over the evaluation period, but unless there are reasons to believe that there is bias in the attrition, this is not a problem.

The size of the participation and control groups. There were 2800 people in the evaluation, split evenly between the intervention and control group. This is a large number.

Tests of statistical significance. Not performed.

Treatment of costs and benefits

Scope of included costs. Limited to government welfare costs, and the cost of delivering the program services.

Scope of included benefits. Limited to earnings and fringe benefit differentials (adjusted for tax payments). Estimated by regression analysis (based on some individual characteristics).

Scope of potential beneficiaries. The participant and the taxpayer.

Time scale considered. Five years post intervention. This is a long time period for an employment program evaluation.

Method used to monetise non-pecuniary benefits. No non-pecuniary benefits are monetised but some are discussed.

Formula used for calculation

Method employed. NPV.

Use of sensitivity analysis. Not clear.

Cost-benefit findings

The cost-benefit analysis indicated that the program costs were approximately US$12,500 per family member over the five-year period. The net costs, over and above what was spent on the usual welfare program were US$8,000 per family. The cost-benefit analysis found that the FTP produced a net loss to the government of US$6,300 per family.

Triple-P Positive Parenting Program (Triple P)

General comments

This is a well-conducted and thorough evaluation with reasonably sized intervention and control groups. Unlike other analyses, it uses a cost effectiveness method, but in order to capture some of the wider benefits caused by an intervention it recasts them as cost savings. The short time frame for measuring the benefits of the intervention is a limitation.

Estimating the net impact of the intervention

Controlling for selection bias. Participants were allocated through referral and there were several levels of intensity of treatment. In most cases the control group was drawn from families on the waiting list for the intervention. However, it was not clear how long the waiting lists were. There would have to be very long waiting lists for good counterfactual data to be collected. There was considerable attrition over the evaluation period, but unless there are reasons to believe that there is bias in the attrition, this is not a problem. Children with additional developmental or health problems were excluded. A comparison of the family background characteristics showed that the intervention and control groups were not significantly different.

The size of the participation and control groups. Five separate 'randomised' trials were conducted involving 567 children. This is a reasonable number if the random assignment is truly random. No details are given on refusal rates for parents who did not want to participate, but it is implied that all participants and the control groups are self selected. This may be appropriate for this type of intervention, which requires a high level of parental cooperation. Attrition rates are provided.

Tests of statistical significance. Provided.

Treatment of costs and benefits

Scope of included costs. Program costs such as fees for counselling sessions and the prices of workbooks and materials are included, but the time and medical costs (which can be negative) affecting the family are excluded. Because of the limited scope for including benefits in a cost effectiveness formula, some benefits are included as cost savings, such as education, health, foster care and crime costs associated with fewer people, up to the age of 28, with conduct disorders. These estimates are derived from secondary literature.

Scope of included benefits. Whether the child developed or maintained a conduct disorder was the only benefit. Only one benefit is permitted in cost effectiveness studies, but one way around this is to include other benefits on the cost side as cost savings.

Scope of potential beneficiaries. The child.

Time scale considered. Six-month to three-year follow-ups were undertaken, depending on the trial.

Method used to monetise non-pecuniary benefits. Not applicable, as a cost-effectiveness method was used.

Formula used for calculation

Method employed. Cost effectiveness. The benefit of the intervention is the number of child conduct disorders averted. The evaluators considered the number of disorders that need to be prevented in order for the intervention to cover costs. No account taken of reduced mental health problems in parents (considered to be a secondary benefit of the intervention).

Use of sensitivity analysis. Conducted.

Cost-benefit findings

Triple P costs range from 75c at Level 1 to $422.45 at Level 4 (individual) in Australian 2003 dollars. The cost-effectiveness analysis indicated that the intervention would pay for itself if it averted less than 1.5 per cent of conduct disorder cases and that an aversion rate of 7 per cent or more would result in a cost saving.
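
The break-even reasoning behind a threshold of this kind can be illustrated with a minimal sketch. It assumes a simple model in which the program pays for itself when the savings from averted cases equal the program cost per child. The function name and all input values are hypothetical and chosen for illustration only; they are not the figures or the method used in the Triple P evaluation.

```python
def break_even_share_averted(cost_per_child, baseline_prevalence, saving_per_case):
    """Return the share of expected cases that must be averted to cover costs.

    Assumed model (not taken from the evaluation):
      savings per child = share_averted * baseline_prevalence * saving_per_case
    Setting savings per child equal to cost_per_child and solving for
    share_averted gives the break-even share.
    """
    if baseline_prevalence <= 0 or saving_per_case <= 0:
        raise ValueError("prevalence and saving per case must be positive")
    saving_if_all_cases_averted = baseline_prevalence * saving_per_case
    return cost_per_child / saving_if_all_cases_averted


# Hypothetical inputs for illustration only.
example_share = break_even_share_averted(
    cost_per_child=50.0,        # assumed program cost per child
    baseline_prevalence=0.10,   # assumed share of children expected to develop the disorder
    saving_per_case=20_000.0,   # assumed lifetime cost saving per case averted
)
print(f"Break-even share of cases averted: {example_share:.1%}")
```

Under these assumed inputs, any aversion rate above the break-even share would produce a net saving, which is the sense in which a threshold analysis separates 'covering costs' from 'cost saving'.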

Elmira Prenatal and Early Infant Project (PEIP)

General comments

The cost-benefit analysis appears to include both costs and benefits affecting the mother and child as well as transfer payments between the family and the government. This appears to involve double counting. Ideally, a cost-benefit analysis should be separated from a statement of government accounts.

Estimating the net impact of the intervention

Controlling for selection bias. It has been described as a randomised trial, but few details are provided.

The size of the participation and control groups. Three hundred in combined intervention and control groups.

Tests of statistical significance. No.

Treatment of costs and benefits

Scope of included costs. Program costs and other savings to government are included.

Scope of included benefits. Mother behaviours including smoking, attendance at child related classes, nutrition, child abuse and neglect rates, education, substance abuse and criminal activity. Child effects included IQ, and criminal activity.

Scope of potential beneficiaries. Includes both mother and child.

Time scale considered. Benefits monitored up until the age of 15.

Method used to monetise non-pecuniary benefits. Only a subset of the benefits were monetised, including emergency room visits, use of welfare, and use of the criminal justice system.

Formula used for calculation

Method employed. This is an unusual combination of costs and benefits to the child, mother and family, as well as measures of transfer payments between the family and government (taxes and welfare payments).

Use of sensitivity analysis. No.

Cost-benefit findings

The cost-benefit analysis estimated that the costs of the program were US$3,300 in 1980 dollars and US$6,700 in 1997 dollars per child for two and a half years of service. The analysis also indicated that investment was recovered before the children turned four years old and the intervention saved US$4 for every US$1 spent. However, the benefits exceeded the costs only for families where the mother was of low income and unmarried.

Effectiveness of early childhood interventions with a cost-benefit analysis

Cost-benefit analyses were available for at least one program from each of the five intervention clusters specified in this report. This section summarises the interventions in each of the clusters according to the adequacy of the evaluation design, program efficacy and cost effectiveness.

Cluster 1: targeted, child focused, centre based, preschool age

Three of the interventions in cluster 1 included a cost-benefit analysis in their evaluation. Two of these cost-benefit analyses (for Perry and CPC) were appropriately executed. However, the cost-benefit findings of PIDI need to be interpreted with caution due to a number of issues discussed above.

A saving of US$8.74 was found for every dollar spent on Perry, a saving of US$7.10 for every dollar spent on CPC and cost-benefit ratios ranging from 1.7:1 to 3.7:1 for PIDI. This suggests that interventions in cluster 1 provide a good return on investment. In addition, Perry and CPC were well implemented and the evaluations of both were well-designed. Effect sizes for Perry ranged from large in the short-term to medium in the intermediate term, while effect sizes for CPC ranged from small in the intermediate term to small in the long-term. This suggests that the cost-benefit analyses of Perry and CPC adequately represent the intended interventions.

Cluster 2: targeted, parent focused, home visits, all ages

A cost-benefit analysis was conducted for the Elmira PEIP intervention. This analysis indicated a saving of US$4 for every dollar spent - however, this only applied to low-income and single parent families. In addition, the calculation of the costs and benefits was unconventional. The evaluation design of the PEIP was excellent and effect sizes ranged from large in the short-term to medium in the intermediate term. Although the PEIP demonstrated good return on investment for a select group, it is not possible to generalise this finding to other parent focused, home visitation-type interventions.

Cluster 3: targeted, family economic/welfare focused, all ages

A cost-benefit analysis was conducted on the FTP and found a loss, as opposed to a saving. Although the evaluation of FTP was well designed, effect sizes were not published. It is therefore inappropriate to comment on whether the poor return on investment demonstrated by the FTP generalises to other interventions with an economic or welfare focus.

Cluster 4: targeted, holistic, various locations, all ages

A cost-benefit analysis was conducted on the Abecedarian project. A saving of US$4 for every dollar spent was found. Adding further strength to this finding are the adequate cost-benefit techniques, good implementation of the intervention and well-designed evaluation. Again, it is difficult to generalise this finding to a range of targeted, holistic early childhood interventions.

Cluster 5: universal, various foci, various locations, all ages

Triple P was the only intervention in cluster 5 that produced a cost analysis; however, this was only a cost-effectiveness analysis, which does not include an extensive calculation of program benefits. It is therefore difficult to draw any conclusions about return on investment for universal interventions.

Summary

Focusing narrowly on the limited cost-benefit data for early childhood interventions reviewed here, there is some indication that interventions that involve children as participants, or that focus on enhancing parenting efficacy, and that are intensive in nature, have greater cost savings potential than interventions that focus solely on familial economic circumstances. However, reliable conclusions about the relative cost savings of early childhood interventions require additional cost-benefit data on a more representative sample of programs.

12. Lessons for Australia

New findings from developmental neuroscience and growing evidence from longitudinal studies have indicated that children's experiences in early childhood provide an important foundation for subsequent development. There has thus been increased interest in the potential for early childhood interventions to ensure children start life on a positive developmental pathway, particularly those children whose family background might indicate problems in the sensitive formative years.

This report focuses attention on the potential for early childhood interventions to produce returns on public investment in the long run. It reviews selected early childhood interventions to examine the effect of these programs, carefully considering intervention design, implementation and evaluation rigour. It establishes the conceptual framework within which program costs and outcomes can be understood, evaluates cost-benefit methodologies, and reviews published estimates of costs and benefits of applicable early childhood interventions.

This section summarises findings about the efficacy of early childhood interventions for improving outcomes for children and the relative cost-savings potential of different early childhood intervention programs. It concludes with recommendations for conducting cost-benefit analyses of early childhood interventions in Australia.

Are early childhood interventions efficacious?

While this review provides a basis for estimating likely future benefits of early childhood interventions, it is not a comprehensive study. The dearth of evaluation data on interventions generally, and missing data on the restricted and unrepresentative number of interventions in this review, make it impossible to comment on the usefulness of early childhood interventions as a general strategy to sustain improvements for children in the long-term.

Examination of 108 large-scale, public early childhood interventions from around the world revealed relatively little empirical data on program effectiveness. Indeed, of the 108 interventions identified in the current review, only 32 interventions had a strong evaluation component, including only three interventions developed and currently operating in Australia.

In an attempt to identify the most effective 'type' of early childhood intervention, programs were grouped into five clusters according to the availability of the intervention, the intended effects of the intervention, where the intervention took place, and the focal age of children targeted for the intervention.

On balance, the interventions produced a number of important improvements across a wide range of outcome domains. The greatest improvements were observed in respect to children's cognitive skills, and child outcomes in general, with parent-related outcomes showing the least improvement (studies reporting effect sizes on parent and family outcomes were in the negligible to small range, although the Triple P program was an exception).¹⁷

Most of the positive effects on child outcomes were the result of centre-based interventions, as opposed to 'home-visiting' or 'case management' interventions. These interventions were grouped in cluster 1, which included programs like the Perry Preschool Project and Head Start. This is most likely a testament to the fact that cluster 1 interventions were consistently superior in terms of key elements of design and implementation quality such as dosage, intensity, participation rates, 'drop-out' rates and program integrity. By contrast, there was great variability in design and implementation adequacy within cluster 4 (targeted, holistic interventions, such as Sure Start) and very little information was available on interventions in cluster 5 (universally available programs, such as Triple P), which made it difficult to comment definitively on these interventions as a group. It may also be true that more intensive effort is required to achieve substantive change in parent-related outcomes, such as parenting skills and social support, than was offered by the current interventions.

Although the set of early childhood interventions reviewed here is not representative, it supports the case for well-designed, well-executed and high quality interventions. Differences in benefits observed across the programs reviewed here may in fact relate to differences in program quality and funding.

The measured effects of early childhood interventions were mostly limited to the immediate and short-term. Reductions in acts of delinquency and crime (which are easily measured) were the most enduring intervention effects reported. However, only 13 of the 32 reviewed interventions (40.6 per cent) followed up participants for more than two years, and the Perry Preschool Project stands out as the only intervention to collect comprehensive evaluation data on participants into adulthood. Impressively, the adult follow-up of participants in the Perry Preschool Project, collected after 22 years when participants were aged 27 years, showed positive effects on aspects of intellectual ability as well as income and employment outcomes in adulthood.

It is also possible that interventions produce different effects at different developmental stages. Effects that disappear after a few short years may in fact re-emerge at a later developmental stage, showing what is known as a 'sleeper effect' (for example, initial gains in cognitive and language performance following experience in centre-based child care may 'fade out', only to reappear at entry to school). Interventions that do not conduct lengthy follow-ups could in fact be underestimating intervention effects, or incorrectly reporting diminishing effects over time.

Although it is natural to consider benefits in terms of outcomes that an intervention was designed to produce, gains from early childhood intervention may also occur beyond the domains measured in an evaluation. It is instructive that when Head Start began, for example, it was primarily concerned with enhancing cognitive performance. Later evaluations have seen this intervention as also contributing to positive early moral development and language regulation (Emde 2003: 8). Moreover, program effectiveness is often determined in terms of outcomes that are easily measured, such as acts of crime. Less tangible effects - the capacity to sustain functional relationships, as one example - may fall off an evaluator's radar simply because of the complexity (and potential cost) of measurement.

The need for longitudinal study after an early childhood intervention is clear. This is important to understand what is needed to sustain and enhance intervention effects, how long programs should last, and to appreciate possible influences of program participation on later stages of development.

Do early childhood interventions have long-term payoffs?

Very few sound cost-benefit and cost-savings analyses of early childhood intervention programs with long-term follow-ups have been conducted. Of the 108 interventions that were initially identified, only eight interventions included a cost-benefit study. With the exception of a cost-effectiveness study of Triple P, there have been no cost-benefit analyses undertaken of Australian interventions, making it difficult for government to decide objectively on how much funding to allocate to these interventions vis-à-vis other social and economic expenditures.

There is evidence, however, that early childhood interventions can produce potential returns in public investment. Although it is not possible to generalise these findings, among the early childhood interventions with a cost-benefit analysis reviewed here, programs that involved children as program participants, or that focused on improving parenting skills or levels of parenting support, produced a greater return on investment than interventions that focused on family economic circumstances.

Planning a cost-benefit analysis of an Australian early childhood intervention

Clearly, much more Australian data is needed on interventions in early childhood to determine their effects and benefits in this context. The review of cost-benefit studies provided in section 11, combined with information about the process for undertaking a cost-benefit analysis and estimating program costs and benefits is instructive in this regard. What follows is a summary of the important steps in planning a cost-benefit analysis of an Australian early childhood intervention.

There are a number of general principles that may be used by decision-makers considering cost-benefit analysis of an early childhood intervention program, which may need to be tailored to the specific circumstances of a given intervention and its evaluation design.

Ideally, an evaluation should be planned at the same time as the intervention is designed to enable random assignment and the cheapest form of data collection. There are four parts to a cost-benefit analysis that are to some extent conducted separately:

estimating the net impact;
measuring the benefits - pecuniary and non-pecuniary;
measuring the costs - pecuniary and non-pecuniary; and
combining costs and benefits into a present value (PV) calculation.

Before the evaluation is designed three things need to be established: first, the intended benefits of the intervention (for example, educational, crime related, employment, social engagement); second, the target population; and third, additional factors that may affect outcomes other than the characteristics of the target population (for example, if the target population is children from low income households, then other correlated but intrinsically different factors may be the parents' refugee status, parents' history of substance abuse, age of parents, relationship status of parents, parents' criminal record, peer group etc.).

Data can be collected from repeat surveys and administrative records on participants. Pecuniary values for non-pecuniary costs and benefits (crime, loss of health, unemployment) are usually derived from estimates published in the secondary literature.

Selecting the intervention and comparison groups

Setting up the evaluation ideally requires the inclusion of a comparison group that is similar to the program group demographically and/or on relevant pre-tests. There are many ways to build a comparison group, with random assignment¹⁸ of the target population (children from low-income households, parents with substance abuse, parents with long term unemployment etc.) typically viewed as the best way of ensuring that intervention and comparison groups are equivalent initially.
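
As an illustration of the random assignment step, the following minimal sketch splits a list of consenting families into intervention and comparison groups of near-equal size. The function name, the family identifiers and the seed are all hypothetical; a real evaluation would follow its own documented randomisation protocol, which might involve stratification or unequal group sizes.

```python
import random


def randomly_assign(family_ids, seed):
    """Shuffle the families and split them into two groups.

    A fixed seed makes the allocation reproducible and auditable.
    Returns (intervention_group, comparison_group).
    """
    rng = random.Random(seed)       # independent generator, does not touch global state
    shuffled = list(family_ids)     # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]


# Hypothetical identifiers for 20 consenting families from the target population.
consenting_families = [f"family_{i:03d}" for i in range(1, 21)]
intervention_group, comparison_group = randomly_assign(consenting_families, seed=2005)
print(len(intervention_group), len(comparison_group))
```

Because allocation depends only on the shuffle, measured and unmeasured family characteristics should balance across the two groups on average, although any single allocation can still differ by chance.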

In the case of child participants, parental consent is required for both participation in the intervention and for ethics approval for the collection of administrative data (such as school and government records). This requirement for approval will introduce a bias in the selection of children into the intervention, which ideally should be accounted for through the regression analysis.

After random assignment, including parental consent, children and parents should be surveyed to ascertain: first, the background characteristics of the family with respect to different types of disadvantage that may impact on the child's social, educational and psychological development; and second, depending on the age of the child, any developmental assessments of the child before the intervention has begun.

Rather than defining comparison groups through random assignment or some other means, researchers can attempt to estimate the effects of participation in an early childhood intervention through the use of random surveys. Ideally, specific information about program participation should be collected. In the absence of specific program participation information, random surveys can be used to compare how well program participants fared compared to a population group.

Collecting data on intervention costs

Usually only the running costs need to be recorded. Fixed costs associated with the establishment of the intervention are only relevant if they will be incurred every time the intervention is extended. Fixed costs associated with the design of the intervention that are one-off are not relevant.

It is rare for the costs of the intervention to extend beyond the intervention period. If they occur, they will in most cases be revealed as negative benefits and will be monitored through the benefits section.

Collecting data on intervention benefits

Benefits should be measured over time through surveys and the collection of administrative data. Relevant survey measures include educational achievement records, school retention, employment history, incidence of criminal record, social and health problems (substance abuse, social dysfunction). Administrative records can supplement survey data (for example, school achievement records, social security records, and so on).

The frequency of the surveys and administrative data collection depends on funding for the evaluation and the occurrence of critical milestones in the child's development. The latter may include the start of primary, secondary and tertiary education, age 18 or age 21 years.

Conducting PV calculations

Present value calculations can be carried out at any time after the intervention has ended, although the earlier the evaluation, the less clear the results and the more we are required to rely upon conjectures about how early lifecycle indicators map into later outcomes.

It is advisable to calculate present value for several rates of discount to make the sensitivity of the result to variation in the rates apparent.
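
The following minimal sketch shows a present value calculation repeated over several discount rates. It uses the standard discounting formula, in which a net flow received t years from now is divided by (1 + r) raised to the power t. The stream of costs and benefits and the three discount rates are hypothetical and are not drawn from any program reviewed in this report.

```python
def present_value(net_flows, discount_rate):
    """Discount a stream of yearly net flows (benefits minus costs) to year 0.

    net_flows[t] is the net flow in year t, with t = 0 being the present.
    """
    return sum(
        flow / (1.0 + discount_rate) ** year
        for year, flow in enumerate(net_flows)
    )


# Hypothetical stream: a cost of 1,000 now, then benefits of 300 a year for five years.
hypothetical_flows = [-1000.0] + [300.0] * 5

# Assumed discount rates for the sensitivity check.
rates = [0.03, 0.05, 0.07]
npv_by_rate = {rate: present_value(hypothetical_flows, rate) for rate in rates}

for rate, npv in npv_by_rate.items():
    print(f"discount rate {rate:.0%}: net present value {npv:,.2f}")
```

Because the benefits in this example arrive after the costs, the net present value falls as the discount rate rises. Programs whose benefits accrue decades after the intervention will be more sensitive to the chosen rate than this short example suggests.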

Conclusion

Most of the evaluations summarised in this report are of good quality, although weaknesses were noted across a number of aspects of program design and implementation (notably attrition from the program), thus some interpretation of evaluation findings is required. Nevertheless, evaluation findings suggest that early childhood interventions can produce improvements across a wide range of outcome domains. There is also some limited evidence that early childhood interventions can produce potential returns in public investment.

Unfortunately, however, no evaluation can demonstrate that a program that worked well in one setting will have similar positive results when adopted in a new location. Thus, evaluations that are conducted in the Australian context are essential to understand the potential benefits of early childhood interventions undertaken here. Ideally, an evaluation should be planned at the same time as the intervention is designed to ensure methodologically strong evaluations that will support cost-benefit analyses and other evaluative endeavours.

17. Evaluation findings need to be interpreted with the consistency and dependability of the measurement in mind. The reliability of evaluation measures is presented in the review of programs in Section 5 of this report.

18. Random assignment and matching methods do not in themselves ensure that families in the intervention and comparison groups do not differ from one another in unmeasured ways.

References

Anderson, L., Shinn, C., Fullilove, M., Scrimshaw, S., Fielding, J., Normand, J., Carande-Kulis, V. & the Task Force on Community Preventive Services (2003), 'The effectiveness of early childhood development programs: A systematic review', American Journal of Preventive Medicine, vol. 24, no. 3, pp. 32-46.
Bailey, D. (2002), 'Are critical periods critical for early childhood education? The role of timing in early childhood pedagogy', Early Childhood Research Quarterly, vol. 17, pp. 281-294.
Bacharach, V. (2002), 'Can science justify preschool?', Education Reporter, no. 202, Online at www.eagleforum.org/educate/2002/nov02/focus.shtml (accessed May 2004).
Barnett, W.S. (1993a), 'Cost-benefit analysis', in Schweinhart, L., Barnes, H. & Weikart, D., Significant benefits: The High/Scope Perry Preschool Study through age 27, High/Scope Educational Research Foundation, Ypsilanti, Michigan.
Barnett, W.S. (1993b), 'Benefit-cost analysis of preschool education: Findings from a 25-year follow-up', American Journal of Orthopsychiatry, vol. 63, no. 4, pp. 500-508.
Barnett, W.S. (1995), 'Long-term effects of early childhood programs on cognitive and social outcomes', The Future of Children, vol. 5, no. 3, pp. 25-50.
Benasich, A., Brooks-Gunn, J. & Clewell, B. (1992), 'How do mothers benefit from early intervention programs?', Journal of Applied Developmental Psychology, vol. 13, pp. 311-362.
Berlin, L.J., O'Neal, C.R. & Brooks-Gunn J. (1998), 'What makes early intervention programs work? The program, its participants, and their interaction', Zero to Three, vol. 18, pp. 4-15.
Bowes, J. (2000), Parents' responses to parent education: A review of selected parent education and support programs in the USA, Institute of Early Childhood, Macquarie University, Sydney.
Brooks-Gunn, J. (2003), 'Do you believe in magic? What we can expect from early childhood intervention programs', Social Policy Report, vol. XVII, no. 1, pp. 3-14.
Currie, J. (2003), 'What we can expect from early childhood intervention programs', Social Policy Report, vol. XVII, no. 1, p. 5.
Currie, J. & Thomas, D. (2000), 'Does Head Start make a difference?', American Economic Review, vol. 85, no. 3, pp. 341-364.
DHS Victoria (2001), The 'Best Start' indicators project, Department of Human Services, Victoria, Australia.
Dryfoos, J. (1990), Adolescents at risk: Prevalence and prevention, Oxford University Press, UK.
Emde, R. (2003), 'Charting intervention effects over time', Social Policy Report, vol. XVII, no. 1, p. 8.
FACES (2003), Head Start FACES 2000: A whole-child perspective on program performance, US Department of Health and Human Services, USA.
Fisher, K., Kemp, L. & Tudball, J. (2002), Families First outcomes evaluation framework: Prepared for the Cabinet Office of New South Wales, Social Policy Research Centre, Sydney.
Fonagy, P. (2001), 'On the significance of outcome studies: A review of clinical relevance and research implications', Swiss Archives of Neurology and Psychiatry, vol. 152, no. 5, pp. 208-216.
Karoly, L., Greenwood, P., Everingham, S., Hoube, J., Kilburn, M.R., Rydell, C., Sanders, M. & Chiesa, J. (1998), Investing in our children: What we know and don't know about the costs and benefits of early childhood interventions, RAND, USA.
Karoly, L., Kilburn, M.R., Bigelow, J., Caulkins, J., Cannon, J. & Chiesa, J. (2001), Assessing costs and benefits of early childhood intervention programs: Overview and application to the Starting Early Starting Smart program, Casey Family Programs, Seattle, WA.
Layard, R. (1972), Cost-benefit analysis, Penguin Education, Melbourne.
McCain, M. & Mustard, J. (1999), Reversing the real brain drain: Early Years Study final report, The Founders' Network of the Canadian Institute for Advanced Research, Toronto.
Mishan, E.J. (1975), Cost-benefit analysis, Allen & Unwin, Boston and London.
Mrazek, P.J. & Brown, C.H. (2002), 'An evidence-based literature review regarding outcomes in psychosocial prevention and early intervention in young children: Final report', in Russell, C. (ed.) The state of knowledge about prevention/early intervention, Invest in Kids, Canada.
NSW Commission for Children and Young People & Commission for Children and Young People (Qld) (2004), A Head Start for Australia: An early years framework, NSW Commission for Children and Young People and Commission for Children and Young People (Qld), Australia.
Reynolds, A.J. (1994), 'Effects of a preschool plus follow-on intervention for children at risk', Developmental Psychology, vol. 30, no. 6, pp. 787-804.
Russell, C. (ed.) (2002), The state of knowledge about prevention/early intervention, Invest in Kids, Canada.
Sanders, M.R. (2003), 'Triple P - Positive Parenting Program: A population approach to promoting competent parenting', Australian e-Journal for the Advancement of Mental Health (AeJAMH), vol. 2, no. 3 (accessed 13 April 2004).
Schorr, L.B. (1997), Common purpose: Strengthening families and neighbourhoods to rebuild America, Anchor Books, Doubleday, New York.
Shonkoff, J.P. & Phillips, D.A. (eds) (2000), From neurons to neighbourhoods: The science of early childhood development, Committee on Integrating the Science of Early Childhood Development, National Research Council and Institute of Medicine, National Academy Press, Washington DC, USA.
Schweinhart, L., Barnes, H. & Weikart, D. (1993), Significant benefits: The High/Scope Perry Preschool Study through age 27, High/Scope Press, Michigan.
Tomison, A. & Wise, S. (1999), 'Community-based approaches in preventing child maltreatment', Issues in Child Abuse Prevention, no. 11, Autumn 1999, National Child Protection Clearinghouse, Australian Institute of Family Studies, Melbourne.
Turner, K., Mihalopoulos, C., Murphy-Brennan, M. & Sanders, M. (2004), Triple-P Positive Parenting Program, Submission for Technology Appraisal by the National Institute for Clinical Excellence and Social Care Institute for Excellence.
Weikart, D. (1996), 'High-quality preschool programs found to improve adult status', Childhood, vol. 3, pp. 117-120.
Weisbrod, B. (1968), 'Deriving an implicit set of governmental weights for income classes', in Layard, R. (ed.) Cost-benefit analysis, Penguin Education, Harmondsworth UK.
Zigler, E. (2003), 'Forty years of believing in magic is enough', Social Policy Report, vol. XVII, no. 1, p. 10.
Zigler, E. & Styfco, S. (1993), Head Start and beyond: A national plan for extended early childhood intervention, Yale University Press, New Haven, Connecticut.

Acknowledgements

Ms Sarah Wise is a Principal Research Fellow at the Australian Institute of Family Studies. She has a background in developmental psychology, and has a research interest in child welfare, non-parental child care and parent-child relationships. As an Institute researcher, Sarah has managed research projects in the areas of foster care, family support, child protection, child care and early childhood interventions.

Dr Lisa da Silva is a Senior Research Officer at the Australian Institute of Family Studies. Lisa has a background in clinical psychology and has completed a Doctor of Clinical Psychology (Child, Adolescent and Family). Her research interests are in early childhood and non-parental child care. At the Institute, she has worked on projects relating to child care, early childhood interventions and adolescence.

Dr Elizabeth Webster is Director of the Applied Microeconomics section of the Melbourne Institute of Applied Economic and Social Research, and Associate Director of the Intellectual Property Research Institute of Australia. She has completed a PhD and a Master of Economics. Beth has undertaken research on the economics of innovation, intellectual property, training and human capital formation, occupational change and labour market programs in Australia.

Associate Professor Ann Sanson is an Associate Professor in the Department of Psychology at the University of Melbourne, where her teaching and research focus on child and adolescent development in the context of families and communities. She was formerly Acting Director of the Australian Institute of Family Studies and is the Project Director for Growing Up in Australia (the Longitudinal Study of Australian Children).

The authors are grateful to Peter Dawkins and Roger Wilkins of the Melbourne Institute of Applied Economic and Social Research for their advice and detailed comments on drafts of this report. Thanks are also due to Louise Hayes of the Australian Institute of Family Studies, who helped document and assess a portion of the early childhood interventions included in the evaluation.

This report was commissioned by the Australian Government Department of Family and Community Services. It is the product of the collaboration between the Australian Institute of Family Studies and the Melbourne Institute of Applied Economic and Social Research.

Australian Institute of Family Studies
The Australian Institute of Family Studies is Australia's national centre for research and information on families. Now in its 25th year, the Institute's research on issues that affect family stability and wellbeing plays a key role in the development of family policy and informed debate in Australia. The Institute is a statutory authority established by the Australian Government in February 1980.

Melbourne Institute of Applied Economic and Social Research
The Melbourne Institute of Applied Economic and Social Research is Australia's leading economic and social research institute. For 40 years it has carried out economic and social research and has developed a reputation as one of Australia's foremost social science research centres. It publishes the Australian Economic Review, the Quarterly Bulletin of Economic Trends, and the Australian Social Monitor in addition to regular reports on economic indicators.

Citation

Wise, S., da Silva, L., Webster, E., & Sanson, A. (2006). The efficacy of early childhood interventions (Research Report No. 14). Melbourne: Australian Institute of Family Studies.

ISBN

0 642 39527 6
