Head Start
Further Development Could Allow Results of New Test to Be Used for Decision Making
GAO ID: GAO-05-343 May 17, 2005
GAO-05-343, Head Start: Further Development Could Allow Results of New Test to Be Used for Decision Making
This is the accessible text file for GAO report number GAO-05-343
entitled 'Head Start: Further Development Could Allow Results of New
Test to Be Used for Decision Making' which was released on May 17,
2005.
This text file was formatted by the U.S. Government Accountability
Office (GAO) to be accessible to users with visual impairments, as part
of a longer term project to improve GAO products' accessibility. Every
attempt has been made to maintain the structural and data integrity of
the original printed product. Accessibility features, such as text
descriptions of tables, consecutively numbered footnotes placed at the
end of the file, and the text of agency comment letters, are provided
but may not exactly duplicate the presentation or format of the printed
version. The portable document format (PDF) file is an exact electronic
replica of the printed version. We welcome your feedback. Please E-mail
your comments regarding the contents or accessibility features of this
document to Webmaster@gao.gov.
This is a work of the U.S. government and is not subject to copyright
protection in the United States. It may be reproduced and distributed
in its entirety without further permission from GAO. Because this work
may contain copyrighted images or other material, permission from the
copyright holder may be necessary if you wish to reproduce this
material separately.
Report to Congressional Requesters:
United States Government Accountability Office:
GAO:
May 2005:
Head Start:
Further Development Could Allow Results of New Test to Be Used for
Decision Making:
GAO-05-343:
GAO Highlights:
Highlights of GAO-05-343, a report to congressional requesters:
Why GAO Did This Study:
In September 2003, the Head Start Bureau, in the Department of Health
and Human Services (HHS) Administration for Children and Families
(ACF), implemented the National Reporting System (NRS), the first
nationwide skills test of over 400,000 4- and 5-year-old children. The
NRS is intended to provide information on how well Head Start grantees
are helping children progress.
Given the importance of the NRS, this report examines: what information
the NRS is designed to provide; how the Head Start Bureau has responded
to concerns raised by grantees and experts during the first year of
implementation; and whether the NRS provides the Head Start Bureau with
quality information.
What GAO Found:
The Head Start Bureau developed the NRS to gauge the extent to which
Head Start grantees help children progress in specific skill areas,
including understanding spoken English, recognizing letters,
vocabulary, and early math. Due to time constraints and technical
matters, the Head Start Bureau adapted portions of other assessments
for use in the NRS.
Head Start Bureau officials have responded to some concerns raised
during the first year of NRS implementation, but other issues remain.
For example, the Head Start Bureau has modified training materials and
is exploring the feasibility of sampling. However, it is not monitoring
whether grantees are inappropriately changing instruction to emphasize
areas covered in the NRS.
Head Start Bureau officials have said NRS results will eventually be
used for program improvement, targeting training and technical
assistance, and program accountability; however, the Head Start Bureau
has not stated how NRS results will be used to realize these purposes.
Currently, results from the first year of the NRS are of limited value
for accountability purposes because the Head Start Bureau has not shown
that the NRS meets professional standards for such uses, namely that
(1) the NRS provides reliable information on children's progress during
the Head Start program year, especially for Spanish-speaking children,
and (2) its results are valid measures of the learning that takes
place. The NRS also may not provide sufficient information to target
technical assistance to the Head Start centers and classrooms that need
it most.
An Assessor and Head Start Student Demonstrate the NRS Assessment:
[See PDF for image]
[End of figure]
What GAO Recommends:
GAO recommends the HHS Assistant Secretary for ACF, in collaboration
with the Head Start Bureau, determine how NRS data will be used for
accountability and targeting technical assistance; monitor the effects
of the NRS on local Head Start practices; use first year NRS results to
conduct further study of the reliability and validity of the NRS;
compile a detailed, well-organized document on the technical quality of
the NRS; improve management of its data on NRS participation; and study
the costs and benefits of sampling in administering the NRS. ACF
generally agreed with our recommendations.
www.gao.gov/cgi-bin/getrpt?GAO-05-343.
To view the full product, including the scope and methodology, click on
the link above. For more information, contact Marnie S. Shaul at (202)
512-7215 or shaulm@gao.gov.
[End of section]
Contents:
Letter:
Results in Brief:
Background:
NRS Assesses Selected Skills Using Adaptations of Other Assessments:
The Head Start Bureau Has Been Responsive to Some Implementation Issues
Raised during First Year of NRS, but Others Remain:
The Head Start Bureau Has Not Specified How NRS Results Will Be Used
and Important Analyses Remain to Be Done:
Conclusions:
Recommendations for Executive Action:
Agency Comments and Our Evaluation:
Appendix I: Objectives, Scope and Methodology:
Appendix II: Survey Instrument:
Appendix III: Comments from the Department of Health and Human
Services:
Appendix IV: GAO Contacts and Staff Acknowledgments:
Tables:
Table 1: Examples of Information Included in Computer-Based Reporting
System (CBRS):
Table 2: Description of NRS Components and Their Modifications:
Table 3: Sample Disposition:
Figures:
Figure 1: Head Start Grantees, Delegate Agencies, and Centers:
Figure 2: Timeline of Events Leading to Implementation of NRS:
Figure 3: Example of NRS Letter Naming Instructions and Task:
Figure 4: Example of NRS Early Math Skills Instructions and Task:
Figure 5: Example of Type of Vocabulary Instructions and Task Used in
the NRS:
Abbreviations:
ACF: Administration for Children and Families:
CBRS: Computer-Based Reporting System:
ECLS-K: Early Childhood Longitudinal Study of a Kindergarten cohort:
HHS: U.S. Department of Health and Human Services:
HSB: Head Start Bureau:
NAEYC: National Association for the Education of Young Children:
NAS: National Academy of Sciences:
NHSA: National Head Start Association:
NRS: National Reporting System:
OLDS: Oral Language Development Scale:
PPVT: Peabody Picture Vocabulary Test:
Pre-LAS 2000: Pre-Language Assessment Scale 2000:
QRC: Head Start Quality Research Centers:
TWG: Technical Work Group:
United States Government Accountability Office:
Washington, DC 20548:
May 17, 2005:
The Honorable Edward M. Kennedy:
Ranking Minority Member:
Committee on Health, Education, Labor and Pensions:
United States Senate:
The Honorable Christopher J. Dodd:
Ranking Minority Member:
Subcommittee on Education and Early Childhood Development:
Committee on Health, Education, Labor and Pensions:
United States Senate:
In fall 2003, the federal Head Start program initiated a nationwide
skills test of over 400,000 4- and 5-year-old children. This test,
called the Head Start National Reporting System (NRS), is intended to
meet a long-standing need for systematic information on how well
specific Head Start grantees are helping children learn. Head Start is
designed to promote school readiness and healthy development among poor
preschool children and provides services to nearly 1 million children,
generally between the ages of 3 and 5, through nearly 1,700 grantees.
These grantees or their delegates provide services at about 19,000 Head
Start centers nationally, with each grantee having from 1 to over 100
centers. For nearly a decade the Head Start Bureau (HSB) and the U.S.
Department of Health and Human Services (HHS) have been engaged in
promoting accountability and moving toward a results-oriented
evaluation of Head Start. The NRS builds on this work. The NRS was
developed in response to President Bush's April 2002 announcement of
the "Good Start, Grow Smart" early childhood initiative that directed
HHS to develop a national accountability system to ensure that every
Head Start grantee will assess the progress made by children in early
literacy, language, and numeracy skills.
Head Start teachers, or others trained as NRS assessors, administer the
NRS to children individually in the fall and spring of the Head Start
year. The NRS begins with a game of "Simon Says," lasts about 15
minutes, and includes four sub-tests designed to screen for
understanding of spoken English and to assess skills in recognizing
letters, vocabulary, and early math. During the test, an assessor sits
across from a child at a table and asks scripted questions of the
child, and the child responds by verbally identifying or pointing to
pictures, numbers, or letters that are contained in a 3-ring binder.
The assessor marks the child's responses on a computer-readable scoring
sheet. While all of the children are given at least the portion of the
English-language assessment that screens for understanding of spoken
English, children whose primary language is Spanish are also assessed
using a Spanish version of the NRS. Children who speak both English and
Spanish are given both versions of the NRS and scores from both tests
are reported separately.
Although other evaluations of children's skills and Head Start
performance exist, the NRS differs from them in its scale, type, and
purpose. The NRS is a standardized test intended for all
prekindergarten Head Start children. It represents the first time that
HSB will use children's performance on a standardized test to measure
how well specific Head Start grantees are helping children progress.
Many in the Head Start community and beyond agree that it is a laudable
goal to look at Head Start at the national and grantee levels to
determine whether Head Start achieves its stated objectives. However,
there have been significant concerns about whether the NRS, as
currently composed, is the right way to accomplish this goal.
Given the importance HSB places on measuring Head Start performance and
the concerns about the NRS, we examined (1) what information the NRS is
designed to provide, (2) how HSB has responded to implementation issues
raised by the Head Start grantees and experts during the first year of
NRS implementation, and what issues remain to be addressed, and (3)
whether the NRS provides HSB with the quality of information it needs
to meet its purposes.
To answer these questions, we collected and analyzed information from
multiple sources. To determine what information the NRS is designed to
provide, we interviewed representatives from HSB, its contractors, and
early childhood professional organizations and we reviewed documents
chronicling the steps HSB took in developing the NRS. To examine how
HSB responded to implementation issues raised by Head Start grantees
and experts during the first year of NRS implementation and what issues
remain to be addressed, we interviewed representatives from HSB and
randomly sampled Head Start grantees and delegates from the population
of all Head Start grantees and delegates during the 2003-2004 school
year. We received responses from 80 percent of the grantees and
delegates we surveyed. We also visited 12 Head Start grantees in 5
states (Colorado, Maryland, Massachusetts, Rhode Island, and Virginia),
to interview staff who conducted the assessments and to observe them
administering the NRS to children. The states and grantees chosen for
site visits were judgmentally selected to include a range of enrollment
sizes, types of program, rural and urban locations, and linguistic
populations. Finally, to examine whether the NRS provides HSB with the
quality of information it needs to meet its goals, we reviewed the
professionally accepted standards for test development, interviewed all
of the members of the Technical Work Group--a team of experts convened
to assist HSB and its contractors in the design and implementation of
the NRS--and consulted with individuals recommended by the National
Academy of Sciences as experts in the areas of test design and the
educational testing of Spanish-speaking and bilingual children. These
independent experts reviewed documents provided by HSB and its
contractors pertaining to the adequacy and appropriateness of the
assessment. See appendix I for additional information on our scope and
methodology. We conducted our work between May 2004 and February 2005
in accordance with generally accepted government auditing standards.
Results in Brief:
HSB developed the NRS to gauge the extent to which Head Start grantees
help children progress in specific academic skill areas. The NRS
includes materials adapted from other tests and is designed to provide
information on selected academic skills of children in Head Start.
Specifically, the NRS probes children's understanding of spoken English
and skills in vocabulary, letter recognition, and simple math through
the use of pictures, letters, and numbers. For example, children are
asked to count marbles pictured on a page and identify the height of a
teddy bear pictured beside a simple ruler. Children's skills in the
selected areas are assessed to determine how well participating
children, as a group, are learning and to identify grantees where
children are not making the expected progress.
In response to concerns raised during the first year of NRS
implementation, HSB has made changes to how the NRS is implemented and
is considering other changes, although other concerns have not yet been
addressed. In response to assessors' feedback that the initial training
instructed assessors to follow the assessment script too rigidly, HSB
modified some of its training materials to better prepare assessors for
the situations they encountered when implementing the test. In
addition, in response to suggestions by Technical Work Group members,
HSB changed the order in which the Spanish and English assessments are
administered. HSB is also considering substantive changes like
requiring only a sample of children to take the NRS and adding a
social-emotional development component to the NRS. According to our survey,
over 60 percent of grantees found it at least moderately challenging to
find time to assess all children, and sampling may help to minimize
this burden. Adding a measure of social-emotional development would
help to address concerns about the narrow range of skills that the NRS
tests. While these changes demonstrate HSB's responsiveness to some
concerns raised, the Bureau has yet to address other potential
implementation problems, such as whether all 4- and 5-year-olds
eligible to participate in the NRS are assessed and whether assessors
have narrowed the curriculum they teach in response to the NRS.
Analysis of the NRS is currently too incomplete to support its use for the
purposes of accountability and targeting training and technical
assistance. First, HSB has not articulated a strategy for how it will
use information from the NRS to meet its purposes. For example, it has
not articulated what level of progress is expected, how it will use NRS
scores to target training and technical assistance, or how it will hold
grantees accountable for achieving results. Such decisions are
important first steps in any test development process. Further, results
from the first year of the NRS currently cannot be used to hold
grantees accountable or to target training and technical assistance
because HSB analyses have not yet shown that the NRS provides the scope
and quality of assessment information needed for these purposes. The
usefulness of educational tests is dependent on their consistency of
measurement (their reliability), along with whether they measure what
they are designed to measure (their validity). HSB has asserted that
the NRS meets these criteria because it borrows certain material from
existing tests that have met them, but the agency has not shown the NRS
itself to be valid and reliable over time. Test developers generally
use a pilot test to establish reliability and validity, but due to time
constraints, HSB did not conduct a full pilot test. In addition,
language experts advising HSB have raised serious concerns about
whether the Spanish version of the NRS adequately measures the skills
of Spanish-speaking children and whether results from the English and
Spanish versions are comparable. Responding in part to these concerns,
HSB has not yet used first year results of the NRS for accountability
decisions and has stated that future accountability decisions will not
be based solely on NRS results, but will reflect other grantee
information as well. The NRS also may not provide sufficient
information to target training and technical assistance to the centers
and classrooms that need it most. NRS results are aggregated across the
many classrooms and centers that a grantee may operate and results are
reported only at the grantee and delegate levels, because results are
more reliable at these levels than at lower levels. However, a
grantee's average score could mask variability among the multiple
classrooms or centers and limit information on where technical
assistance would be most effectively targeted. Furthermore, NRS results
alone do not indicate why results may be high or low, or what type of
training or technical assistance would be appropriate.
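The masking effect described above can be illustrated with a minimal
sketch. The classroom names and gain values below are invented for
illustration and are not NRS data; the sketch only assumes that a
grantee-level result is an average over all assessed children.

```python
from statistics import mean

# Hypothetical fall-to-spring gains (score points) for one grantee,
# grouped by classroom. All values are invented for illustration.
classroom_gains = {
    "classroom_a": [9, 10, 11],
    "classroom_b": [1, 2, 3],
    "classroom_c": [5, 6, 7],
}


def grantee_average(gains_by_classroom):
    """Average gain across every assessed child the grantee serves."""
    all_gains = [g for gains in gains_by_classroom.values() for g in gains]
    return mean(all_gains)


def classroom_averages(gains_by_classroom):
    """Average gain within each classroom."""
    return {name: mean(gains) for name, gains in gains_by_classroom.items()}


grantee_avg = grantee_average(classroom_gains)
by_classroom = classroom_averages(classroom_gains)

# Gap between the strongest and weakest classroom averages.
spread = max(by_classroom.values()) - min(by_classroom.values())

print("Grantee average gain:", grantee_avg)
print("Classroom averages:", by_classroom)
print("Spread between classrooms:", spread)
```

In this invented case the single grantee-level figure sits in the middle
of the range and gives no sign that one classroom gained far less than
the others.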
To help ensure that the NRS successfully and efficiently achieves its
purposes, we are recommending that the HHS Assistant Secretary for the
Administration for Children and Families (ACF) take several actions,
including articulating plans for use of the NRS results, providing
additional technical information on the test results, and conducting
additional study of unintended effects and alternative ways for
improving the test. ACF generally agreed with GAO's recommendations and
described some of the actions it has already begun. In addition, ACF
submitted detailed comments on certain aspects of the draft report,
including comments concerning the level of evidence for the validity of
the NRS.
Background:
Established in 1965, Head Start is a federally funded early childhood
development program that served over 900,000 children at a cost of $6.8
billion in 2004. Head Start offers low-income children a broad range of
services, including educational, medical, dental, mental health,
nutritional, and social services.[Footnote 1] Children enrolled in Head
Start are generally between the ages of 3 and 5 and come from varying
ethnic and racial backgrounds. Head Start is administered by HSB within
ACF. HSB awards Head Start grants directly to local grantees. Grantees
may develop or adopt their own curricula and practices within federal
guidelines. Grantees may contract with other organizations--called
delegate agencies--to run all or part of their local Head Start
programs. Each grantee or delegate agency may have one or more centers,
each containing one or more classrooms. In this report, the term
"grantee" is used to refer to both grantees and delegate agencies.
Figure 1 provides information on the numbers of Head Start grantees,
delegate agencies, centers and classrooms.
Figure 1: Head Start Grantees, Delegate Agencies, and Centers:
[See PDF for image]
[End of figure]
Since the inception of Head Start, questions have been raised about the
effectiveness of the program. In 1998, we reported that Head Start
lacked objective information on performance of individual grantees, and
Congress enacted legislation requiring HSB to establish specific
educational standards applicable to all Head Start programs and allowing
development of local assessments to measure whether the standards are
met.[Footnote 2] HSB implemented this legislation by developing the
Child Outcomes Framework to guide Head Start grantees in their ongoing
assessment of the progress of children. The Framework covers a broad
range of child skill and development areas and incorporates each of the
legislatively mandated goals, such as that children "use and understand
an increasingly complex and varied vocabulary" and "identify at least
10 letters of the alphabet."
Since 2000, HSB has required every Head Start grantee to include each
of the areas in the Framework in the child assessments that each
grantee adopts and implements. The eight broad areas included in the
Framework are language development, literacy, mathematics, science,
creative arts, social and emotional development, approaches to
learning, and physical health and development. Grantees are permitted
to determine how to assess children's progress in these areas. These
assessments are to align with the grantee's curriculum; as a result the
specific assessments vary across the grantees. The assessments occur 3
times each year and generally involve observing the children during
normal classroom activities.[Footnote 3] The results of the assessments
are used for the purposes of individual program improvement and
instructional support and are not aggregated across grantees or
systematically shared with federal officials. The NRS, prompted by the
April 2002 announcement of President Bush's Good Start, Grow Smart
initiative, builds on the 1998 legislation by requiring all Head Start
programs to administer the same assessment, twice a year, to all 4- and
5-year-old Head Start participants who will attend kindergarten the
following year.
When President Bush announced this initiative in April 2002, it called
for full implementation in fall 2003; as a result the NRS was developed
and preparations for implementation occurred within an 18-month period.
See figure 2. Shortly after the President announced this initiative,
HSB hired a contractor to assist it in developing and implementing the
NRS. The contractor, working closely with HSB, was responsible for the
design and field testing of the NRS, including developing training
materials to support national implementation of the reporting system by
grantees.[Footnote 4] HSB also worked with the Technical Work Group and
others throughout implementation of the NRS. The Technical Work Group
includes 16 experts in such areas as child development, educational
testing, and bilingual education. They advised HSB on the selection of
assessments, the appropriateness of the assessments in addressing the
mandated indicators, the technical merit of the assessments, and the
overall design of the NRS. While the Technical Work Group members
offered advice, the group members were not always in agreement with
each other and HSB was not obligated to act on any of the advice it
received. A list of the Technical Work Group members and their
professional affiliations is included in appendix I.
Figure 2: Timeline of Events Leading to Implementation of NRS:
[See PDF for image]
[End of figure]
Through focus groups, teleconferences, and various correspondences, HSB
officials communicated to Head Start grantees the purpose of the NRS
and their plans for administering the assessment. Focus groups and
discussions were held with various interested parties, including Head
Start managers and directors and experts from universities and the
public sector, on issues ranging from strengths and limitations of
various assessment tools to strategies for assessing non-English
speaking children. HSB also received input through a 60-day public
comment period, from mid-April to June 2003.
Another contractor developed a Computer-Based Reporting System (CBRS)
for the NRS. Local Head Start staff use the CBRS to enter descriptive
information about their grantees, centers, classrooms, teachers, and
children, as shown in table 1, as well as to keep track of which
children have been assessed. HSB analyzes the descriptive information
from the CBRS in conjunction with the child assessment data to develop
reports on the progress of specific subgroups of children. For example,
HSB can report separately on the average scores of children enrolled in
part-day programs and those enrolled in full-day programs.
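As a rough sketch of this kind of subgroup reporting, descriptive fields
can be joined to scores and averaged by group. The field names
(`day_option`, `score`) and the values are hypothetical and do not
reflect the actual CBRS schema or any NRS results.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: a descriptive field paired with a child's score.
# Field names and values are assumptions made for illustration.
records = [
    {"day_option": "part-day", "score": 10},
    {"day_option": "part-day", "score": 14},
    {"day_option": "full-day", "score": 12},
    {"day_option": "full-day", "score": 18},
]


def average_by_subgroup(rows, key):
    """Group rows by the given descriptive field and average the scores."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["score"])
    return {label: mean(scores) for label, scores in groups.items()}


subgroup_means = average_by_subgroup(records, "day_option")
print(subgroup_means)
```

The same function could group by any other descriptive field, such as
classroom type, provided that field is recorded for every child.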
Table 1: Examples of Information Included in Computer-Based Reporting
System (CBRS):
Program information:
* Program name;
* Director name;
* Number of delegates;
* Number of centers;
* Number of family day care centers;
* NRS lead for program;
Center information:
* Center name;
* Center type;
* Enrollment year start date;
* Enrollment year end date;
* NRS center lead name;
Classroom level information:
* Teacher name;
* Classroom type;
* Day option;
* Total enrollment;
* Number of additional teaching staff;
* Teacher entry date to classroom;
Assessor information:
* Name;
* Highest grade or year of school completed;
* Highest degree held in Early Childhood Education or related field;
Teacher information:
* Teacher name;
* In what languages is teacher fluent?
* Total years teaching;
* How many years teaching Head Start?
* Highest grade or year of school completed;
* Child Development Associate credential;
Child information:
* Child name;
* DOB;
* Date of entry into classroom;
* Child unique ID from center;
* Years in preschool Head Start;
* Does child have a disability?
* Does child speak a language other than English at home?
* If yes, how well does child speak English?
* If yes, what is primary language?
* Ethnicity/race.
Source: Head Start National Reporting System, Computer-Based Reporting
System Train-the-Trainer Manual, Prepared by Xtria, LLC, February 2004.
[End of table]
HSB, with assistance from the contractors, worked to ensure local staff
received adequate training on administering the assessment and using
the CBRS, and provided guidance on how to obtain consent from parents.
Training and certification of all assessors was required so that all
assessors would administer the NRS in the same way. Two-and-a-half day
training sessions were held at eight sites throughout the U.S. and
Puerto Rico during July and August 2003. Roughly 2,800 individuals
completed the training, of which 484 were certified in both English and
Spanish. In turn, these certified trainers held training sessions
locally to train and certify additional staff who would be able to
administer assessments.
The development of educational tests is a science in itself, to which
university departments, professional organizations, and private
companies are devoted. Among the most important concepts in test
development are validity and reliability. Validity refers to whether
the test results mean what they are expected to mean and whether
evidence supports the intended interpretations of test scores for a
particular purpose. Reliability refers to whether or not a test yields
consistent results. Validity and reliability are not properties of
tests; rather, they are characteristics of the results obtained using
the tests. For example, even if a test designed for 4th graders were
shown to produce meaningful measures of their understanding of
geometry, this would not necessarily mean that it would do so when
administered to 2nd or 6th graders or with a change in directions
allowing use of a compass and ruler. Test developers typically
implement "pilot" tests that represent the actual testing population
and conditions and they use data from the pilot to evaluate the
reliability and validity of a test. This process generally takes more
than 1 year, especially if the test is designed to measure changes in
performance.
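To make the idea of consistency concrete, one widely used index of
internal consistency is Cronbach's alpha, which compares the variance of
individual items with the variance of total scores. The sketch below is
illustrative only: the report does not say which statistics HSB or its
contractors computed, and the item responses are invented.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-child item-score lists.

    item_scores[i][j] is child i's score on item j. Population variance
    is used for both items and totals, so the ratio is consistent.
    """
    k = len(item_scores[0])                      # number of items
    columns = list(zip(*item_scores))            # scores grouped by item
    item_var_sum = sum(pvariance(col) for col in columns)
    total_var = pvariance([sum(child) for child in item_scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Invented responses (1 = correct, 0 = incorrect) for four children on
# three items. Every child answers all items the same way, so the items
# agree perfectly with one another.
responses = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

A high value indicates only that the items behave consistently for the
group tested. It does not establish validity, and, as noted above, it
would not automatically carry over to a different population or to
different testing conditions.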
In the remainder of the report, we will discuss how the focus of the
NRS was determined and the assessment was developed, HSB's response to
problems in initial implementation as well as some implementation
issues that remain unaddressed, and the extent to which the assessment
meets the professional and technical standards to support specific
purposes identified by HSB.
NRS Assesses Selected Skills Using Adaptations of Other Assessments:
The NRS assesses vocabulary, letter recognition, and simple math skills
and screens for understanding of spoken English. As initially conceived
by HSB, the NRS was to gauge the progress of Head Start children in 13
congressionally mandated indicators of learning. However, time
constraints and technical matters precluded HSB from assessing children
on all of the indicators and led HSB to consider, and eventually adopt,
portions of other assessments for use in the NRS.
The 18 months from announcing the Good Start, Grow Smart initiative, of
which the NRS is a part, to implementing the assessment was not enough
time for HSB to develop a completely new assessment. Therefore, HSB,
with the advice of its contractor and the Technical Work Group, chose
to borrow material from existing assessments. Concerns raised by
Technical Work Group members and the contractor about the length and
complexity of the assessment and the technical adequacy of individual
components eventually led to limiting the areas assessed in the NRS,
from 13 skills to 6. The six legislatively mandated skills that HSB
targeted included whether children in Head Start:
* use increasingly complex and varied spoken vocabulary;
* understand increasingly complex and varied vocabulary;
* identify at least 10 letters of the alphabet;
* know numbers and simple math operations, such as addition and
subtraction;
* for non-English speaking children, demonstrate progress in listening
to and understanding English; and
* for non-English speaking children, show progress in speaking English.
In April and May of 2003 an assessment that included 5 components
covering the 6 skills was field tested with 36 Head Start programs to
examine the basic adequacy of the NRS, as well as the method for
training assessors, and the use of the CBRS. The field test also
included a Spanish version of the NRS. Based on the field test, one
component--phonological awareness, or one's ability to hear, identify,
and manipulate sounds--was eliminated. While this component examined an
area that experts have linked to prevention of reading difficulties,
the test used to assess it was problematic. HSB moved forward with the
other components of the NRS. The four components of the NRS each
measure one or more of the six legislatively-mandated indicators.
The four components that comprise the NRS are from the following tests:
* Oral Language Development Scale (OLDS) of the Pre-Language Assessment
Scale 2000 (Pre-LAS 2000),
* Third Edition of the Peabody Picture Vocabulary Test (PPVT-III),
* Head Start Quality Research Centers (QRC) letter-naming exercise,
and
* Early Childhood Longitudinal Study of a kindergarten cohort (ECLS-K)
math assessment.
Some or all of each test was previously used for other studies, and the
PPVT and letter naming were previously used in studies of Head Start
children.[Footnote 5] Three of the four tests were modified from their
original version, as shown in table 2. Figures 3 and 4 are examples
from the letter naming and early math skills components of the NRS.
Figure 5 is an example of the type of item used in the vocabulary
(PPVT) component of the NRS.
Table 2: Description of NRS Components and Their Modifications:
NRS components: Oral Language Development Scale (OLDS) of the PreLAS
2000 (comprehension of spoken English);
Modifications to components: NRS includes two subtests from the
original assessment;
Description of components:
* Simon Says: The child is asked to follow the instructions that "Simon
says," such as "Simon says, 'Touch your toes.'";
* Art Show: The child is presented with a series of 10 pictures and
asked to name or explain what is in each picture;
Legislatively-mandated skill measured by component:
* Use increasingly complex and varied spoken vocabulary;
* For non-English speaking children, demonstrate progress in listening
to and understanding English;
* For non-English speaking children, show progress in speaking English.

NRS components: Third Edition of the Peabody Picture Vocabulary Test
(PPVT-III);
Modifications to components: NRS includes 24 items from what was
originally a 144-item test;
Description of components: The child is asked to point to pictures to
demonstrate understanding of words representing parts of the human body
or their functions, activities of daily living, emotions and feelings,
work/career-related activities, and plants, animals, and their
habitats;
Legislatively-mandated skill measured by component: Understand
increasingly complex and varied vocabulary.

NRS components: Head Start Quality Research Centers (QRC) letter-naming
exercise;
Modifications to components: None;
Description of components: The child is shown all 26 letters of the
alphabet, divided into three groups of 8, 9, and 9 letters, and
arranged in approximate order of item difficulty, and is asked to
identify the letters they know by name;
Legislatively-mandated skill measured by component: Identify at least
10 letters of the alphabet.

NRS components: Early Childhood Longitudinal Study of a kindergarten
cohort (ECLS-K) math assessment;
Modifications to components: NRS includes items in the easier range of
the original assessment;
Description of components: Using pictures, the child is asked about a
range of math skills: number recognition of 1-digit numerals, basic
geometric shapes, matching number names with objects, counting, simple
addition and subtraction, and interpreting simple measurements and
graphic representations;
Legislatively-mandated skill measured by component: Know numbers and
operations.
Source: GAO analysis of HHS documentation.
[End of table]
Figure 3: Example of NRS Letter Naming Instructions and Task:
[See PDF for image]
[End of figure]
Figure 4: Example of NRS Early Math Skills Instructions and Task:
[See PDF for image]
[End of figure]
Figure 5: Example of Type of Vocabulary Instructions and Task Used in
the NRS:
[See PDF for image]
[End of figure]
The Head Start Bureau Has Been Responsive to Some Implementation Issues
Raised during First Year of NRS, but Others Remain:
HSB has been responsive to some specific implementation concerns about
the NRS, but other issues remain that might pose problems in the
future. HSB already has made modifications to NRS training materials,
the CBRS, and how the Spanish NRS is administered. In addition, HSB is
working with the Technical Work Group to explore the feasibility of
adopting a sampling strategy and including a measure of social-
emotional development in the NRS. HSB has told grantees not to make
changes to their programs based on the first year of the NRS, but our
survey found that some grantees have changed instruction to emphasize
areas covered in the test.[Footnote 6] While some such change may be
appropriate, HSB currently is not monitoring whether grantees are
changing the content of instruction to de-emphasize areas not tested or
adopting inappropriate styles of teaching.
HSB Has Responded to Some Implementation Issues That Arose during the
First Year of NRS:
Based on grantee feedback about their experiences during the first year
of NRS implementation, HSB has already responded to some concerns by
providing additional guidance on handling children's behavior, making
it easier for Head Start staff to use the CBRS, and changing the order
in which the Spanish and English versions of the NRS are administered
to Spanish-speaking children. These changes are, in part, a response to
feedback from local assessors and concerns raised by Technical Work
Group members. During our site visits, some assessors described the
2003 NRS training as rigid, with a lot of emphasis placed on following
the script. HSB addressed these concerns in the 2004 spring refresher
training video. Assessors agreed that this video better reflected the
situations they encountered when assessing young children, such as a
child who fidgets, has to go to the bathroom or wants a drink of water
during an assessment.
In addition to changing training material, HSB added several new
features to the CBRS in response to information contractors gleaned
while fielding assessors' phone calls for technical assistance. For
example, the CBRS initially required local Head Start staff to type in
all necessary information about their students, but the fall 2004
version of the CBRS allowed local staff to update information about
their children using information from the previous year or by
transferring information from other computer systems.
Another change to the NRS is the order in which the Spanish and English
assessments are administered to Spanish-speaking children. Some
Technical Work Group members noted that when the NRS is administered
first in English and then in Spanish to Spanish-speaking children with
limited English proficiency, the children will already have experienced
difficulty and frustration during the English test. These feelings of
frustration or
failure could affect a child's disposition--and a child's responses--
when later taking the Spanish version. Thus, the validity of the
Spanish assessment might be compromised. During summer 2004, Migrant
and Seasonal Head Start Programs administered the assessment in Spanish
first. Based on the positive response they received from local
assessors, HSB instructed all programs to follow this format in fall of
2004.
HSB Is Considering Sampling Strategies and Broadening NRS to Include a
Measure of Social-Emotional Development:
HSB is considering ways to deal with two issues raised during the first
year of implementation: the burden on grantees in dedicating staff for
the assessments and the limited range of skills that were assessed in
the NRS. In particular, HSB is considering the feasibility of sampling
to minimize the burden that grantees experienced in assessing all 4- and
5-year-old Head Start participants who will attend kindergarten the
following year. According to our survey, finding time to conduct
assessments presented at least a moderate challenge to an estimated 63
percent of grantees and allocating staff to administer the NRS
presented at least a moderate challenge for an estimated 42 percent of
grantees during the first year of the NRS. According to most of the
assessors we spoke to (8 of 12) during our site visits, local staff
neglected other tasks, juggled tasks, or took work home because they
were occupied with administering the NRS. Assessors also mentioned
having to reschedule training and reallocate staff because of the NRS.
Several Technical Work Group members and grantees have suggested
sampling as a way for the NRS to provide better information while
reducing the burden on grantees. Sampling would allow staff to spend
more time in the classroom and would cost less. Responding to these
suggestions, HSB is working with some members of the Technical Work
Group to identify various sampling strategies and their practical
implications. These sampling strategies include matrix sampling, which
involves taking a subset of items from the larger assessment and
randomly assigning them to test takers, thereby avoiding the need to
administer all items to all test takers. Matrix sampling would allow
for more items to be included and, therefore, more in-depth assessment
of the subjects covered by the test. Drawing an appropriate sample is
complicated, however, and it might be difficult to learn how subgroups
are doing, by region or subpopulation, using sampling or matrix
sampling.
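The mechanics of matrix sampling described above can be illustrated
with a minimal sketch. The item pool size, the number of children, and
the number of items per child are hypothetical values chosen for
illustration; they are not NRS parameters, and this is not a design
that HSB or the Technical Work Group has proposed.

```python
import random

def matrix_sample(item_ids, child_ids, items_per_child, seed=0):
    """Randomly assign each child a subset of items from the full pool.

    Each child answers only `items_per_child` items, so the pool can be
    larger than any one child could reasonably complete.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return {
        child: sorted(rng.sample(item_ids, items_per_child))
        for child in child_ids
    }

# Hypothetical values: a 40-item pool, 200 children, 10 items each.
items = list(range(1, 41))
children = [f"child_{i}" for i in range(1, 201)]
assignment = matrix_sample(items, children, items_per_child=10)

# Items that at least one child was asked to answer.
covered = set().union(*assignment.values())
```

In this sketch no child sees more than a quarter of the pool. The
trade-off noted above also shows up here: each item is answered by only
a fraction of the children, so estimates for small subgroups rest on
fewer responses.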
In addition to studying the feasibility of sampling, HSB is actively
exploring ways to incorporate a measure of social-emotional development
into the NRS. Technical Work Group members have argued that social-
emotional development is critical to kindergarten success and adding a
measure of social-emotional development would begin to address
criticisms that the scope of the NRS currently is too narrow. A
Technical Work Group subcommittee has identified eight measures of
social-emotional development for possible field-testing. In addition,
HSB has directed its contractor to conduct a small pilot to assess the
feasibility of these measures and to conduct focus groups to obtain
teacher feedback on the measures. Following the pilot test and focus
groups, the contractor will conduct a field test with 30 Head Start
programs to determine the appropriateness and technical adequacy of the
measures.
HSB Has Not Yet Addressed Some Concerns:
While HSB is addressing some issues associated with the NRS, additional
implementation concerns have yet to be addressed. HSB currently lacks
independent information to verify that grantees are assessing all of
the children eligible to participate in the NRS. Thus, the potential
exists for undetected errors or exclusion of children HSB intends to be
assessed. HSB attempts to ensure it has accurate information in several
ways. For example, HSB compares the number of 4- and 5-year-olds
reported in the current year with information from the previous year
and it analyzes the data for inconsistencies and
discrepancies.[Footnote 7] However, beyond these checks, HSB does not
have an independent way to confirm the number of children eligible to
participate in the NRS.
There is also a concern that local Head Start programs will alter their
teaching practices and curricula based on their participation in the
NRS. These alterations, whether intended or unintended, might have
positive and negative consequences. Local assessors are generally Head
Start staff and it is expected that they want their children to perform
well on the NRS and that they will teach their children the specific
skills measured in the NRS. An increased focus on teaching these skills
could be positive to the extent they have been neglected. However, this
focus would be detrimental if it resulted in narrowing the curriculum
to exclude skills that are not measured on the NRS but that experts
believe are equally important for children's development. HSB
specifically told grantees not to make changes to their programs based
on their initial NRS results and has provided guidance on appropriate
instruction. Nonetheless, according to our survey of assessors, at
least an estimated 18 percent of grantees changed instruction during
the first year of NRS implementation to emphasize areas covered in the
NRS. One assessor we interviewed explained that despite being told
during NRS training that programs should not adjust their curricula, it
is human nature to try to correct areas in need of improvement. Without
additional information, it is not possible to determine whether changes
in instruction are positive or negative.
Despite HSB's assurances that it intends to use the NRS results only in
the context of other information on performance, experts state that
grantees' perception of the NRS as a "high stakes" test could
compromise the test within a few years. Assessors are very involved in
the scoring of the NRS, yet the NRS is evaluating the grantees that
employ them; thus, they are not independent. Assessors' input and
interpretations could make the grantee appear to accomplish its goals,
whether it actually does or not. For example, one assessor commented
that participating in the NRS had planted a seed that perhaps she
should teach her children particular words that appear in the NRS, such
as the word "altogether," which appears in the instructions. It is also
worth noting that the words used to screen for understanding of English
were exactly the same in fall 2003 and spring 2004, so that learning
particular words would make a large difference. An independent expert
argued that there needs to be continuous monitoring and retraining of
NRS assessors, as there was during the first year of NRS
implementation, to maintain quality control over the testing process.
For the second year of the NRS, HSB has extended its effort to review
the quality of assessment administration, but these efforts do not
include monitoring of changes in classroom practices.
Additionally, in the absence of clear direction from HSB, local Head
Start staff might misinterpret the results and use them
inappropriately. The Technical Work Group has been clear that NRS
scores for classrooms and individual children are not reliable and
should not be used at the classroom level or for individual child
evaluation or instruction. Yet, two of the Head Start grantees we
visited stated that they photocopied each child's responses before
returning the completed scoring sheets and one stated that the grantee
intended to use the individual test results to evaluate its own
performance at the classroom level. Technical Work Group members have
argued that local Head Start programs should be given clear information
on how to interpret the NRS results and how to improve their programs
if they are unhappy with their NRS scores; however, the Technical Work
Group members themselves have expressed confusion about how to
interpret NRS scores, given the technical issues that are discussed in
detail in the next section.
The Head Start Bureau Has Not Specified How NRS Results Will Be Used
and Important Analyses Remain to Be Done:
HSB has not said specifically how it will use the NRS results and HSB
currently lacks analyses showing that the NRS provides the scope and
quality of information needed to hold Head Start grantees accountable
or target training and technical assistance. To support these purposes,
the NRS must produce valid and reliable results on children's
performance that would allow for clear conclusions about Head Start
grantees' effectiveness in improving the academic performance of
children. Due to time constraints, HSB did not conduct a pilot test
that could have provided information to establish the reliability and
validity of changes in the NRS results over time. Experts have also
questioned the technical merit of the Spanish-language NRS. Apart from
these concerns, the NRS results alone do not provide enough contextual
information to support accountability decisions. Acknowledging some of
these issues, HSB has stated that accountability decisions will not be
based solely on NRS results, and it will consider other grantee
information, though it has not explicitly described how NRS results
will be interpreted. Finally, because multiple classrooms are averaged
to produce grantee results and this average may mask variability among
different classrooms, NRS results are of limited use to target training
and technical assistance to the classrooms where assistance is needed
most.
Head Start Bureau Has Not Stated How It Will Use NRS Results to Achieve
Its Purposes:
Head Start Bureau officials have stated in general terms that they will
use NRS results to improve program performance, target training and
technical assistance and hold Head Start grantees accountable; however,
it remains unclear whether the NRS' purposes will be realized because
HSB has not explained how assessment results will be used. For example,
as of February 2005, HSB had not specified what grantee scoring level
constitutes adequate performance. In addition, it had not indicated
whether HSB would adjust scores to account for age or other differences
among the children grantees serve, how it would account for students
with disabilities, or whether adequate performance would be measured in
absolute terms (e.g., the average score or the percentage of children
that score above a certain level) or by growth in performance
(performance change from fall to spring assessment).
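The difference between these ways of defining adequate performance can
be shown with a small sketch. The scores, the cutoff, and the two
grantees below are invented for illustration only; HSB has not
published any scoring rule of this kind.

```python
from statistics import mean

def mean_score(spring):
    """Absolute measure: average spring score."""
    return mean(spring)

def share_at_or_above(spring, cutoff):
    """Absolute measure: share of children scoring at or above a cutoff."""
    return sum(score >= cutoff for score in spring) / len(spring)

def mean_gain(fall, spring):
    """Growth measure: average fall-to-spring change per child."""
    return mean(s - f for f, s in zip(fall, spring))

# Hypothetical grantee A: children start high and gain little.
fall_a, spring_a = [10, 12, 14, 16], [12, 14, 16, 18]
# Hypothetical grantee B: children start low and gain more.
fall_b, spring_b = [2, 4, 6, 8], [8, 10, 12, 14]

CUTOFF = 12  # hypothetical "adequate" score

summary = {
    "A": (mean_score(spring_a), share_at_or_above(spring_a, CUTOFF),
          mean_gain(fall_a, spring_a)),
    "B": (mean_score(spring_b), share_at_or_above(spring_b, CUTOFF),
          mean_gain(fall_b, spring_b)),
}
```

With these invented numbers, grantee A looks stronger on both absolute
measures while grantee B looks stronger on growth, which is why the
choice of measure needs to be stated before results are interpreted.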
Professional standards for educational testing require that test
developers specify how results will be used prior to developing a test
so that judgments can be made about the appropriateness of the test.
The specific uses of the NRS dictate the specific technical criteria it
should meet. For example, if HSB intends to hold grantees accountable
for increasing their assessment scores by a particular percentage, the
NRS would need to be sensitive enough to reliably measure increases of
that size. Several Technical Work Group members have emphasized the
point that HSB should have determined exactly how it intended to use
the NRS as a first step in the development of the NRS. As of February
2005, HSB officials had not indicated when they would make decisions
about the specific uses of the NRS data or when they would provide this
information to grantees.
This ambiguity has left some grantees wondering what the consequences
of their assessment results could be. Assessors from 6 of the 12 Head
Start grantees we visited said they were concerned about how HSB would
use the NRS. Assessors from two grantees expressed apprehension that
the results would be misinterpreted as evidence regarding the
effectiveness of the program. One assessor suggested that HSB should
share with local Head Start staff how it plans to use the data because
it would generate greater support for the NRS among staff. These
findings are consistent with a quality assurance study, commissioned by
HSB, that recommended HSB provide more
information on how it will use the results of the NRS assessments,
especially with respect to implications for training and technical
assistance, program improvement, and funding, to alleviate the concerns
of grantees.[Footnote 8] HSB has stated that it is focusing on how to
work with grantees on understanding NRS results and how to use the
information to make improvements through training and technical
assistance.
Results from First Year Cannot Be Used to Hold Grantees Accountable
Because Important Analyses Have yet to Be Completed or Documented:
In order to use the NRS for the purpose of holding grantees accountable
for children's progress, HSB needs to demonstrate that the NRS will
provide reliable and valid information. As of February 2005, HSB had
not, however, conducted certain analyses on NRS results to establish
the validity and some aspects of the reliability of the assessment. A
test is considered valid when it measures what it is supposed to
measure and evidence supports the intended interpretations of test
scores for a particular purpose. Reliability refers to whether a test
yields consistent results, meaning that if a child in Head Start took
the NRS on, say, a different day, his or her score would be similar.
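One common way to quantify this kind of consistency is a test-retest
correlation: the same children are tested on two occasions and the two
sets of scores are correlated. The sketch below assumes a plain Pearson
correlation and uses invented scores. It is only one of several
reliability statistics, and it is not drawn from HSB's analyses.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical scores for the same five children on two different days.
day_1 = [5, 9, 12, 15, 20]
day_2 = [6, 8, 13, 14, 21]

retest_r = pearson_r(day_1, day_2)
```

A value close to 1 means children keep roughly the same rank order from
one administration to the next. A high value at a single point in time
does not by itself show that fall-to-spring change scores are reliable.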
HSB tested the reliability of particular NRS items through a short
field test, but given the time constraints on the development of the
NRS, HSB did not run a more extensive "pilot" test prior to full
implementation. The field test results provided some information on the
reliability of the NRS components for one point in time, which
generally was strong at the grantee level. However, HSB lacked
information on the range of growth that children might experience over
the course of a year and--consequently--did not have the data to show
that the test produces valid and reliable results on change from fall
to spring. Some assessors also have expressed doubt about whether the
NRS accurately measures change over time. According to our survey of
NRS assessors, about a quarter of assessors agree that the NRS
accurately measures the progress of their Head Start children from fall
to spring. Further, without additional data from a pilot test, HSB
could not fully validate the NRS and ensure that its use for the
intended purposes was appropriate.
Despite not conducting a pilot test, HSB stated that the NRS was
technically sound in large part because it borrowed sections from tests
that produced valid and reliable results in previous studies. Relying
on this past work instead of conducting a new pilot test allowed HSB to
develop the NRS within a very short time frame, but there are problems
with this approach. The samples of children in these past studies are
not always comparable to Head Start children with regard to age, home
language, culture, or range of socio-economic status. Moreover, some of
the tests used in the past were modified for use in the NRS by either
limiting the questions asked or modifying the instructions. Without
further analyses of the actual NRS implementation data, it is
impossible to determine whether interpretations of the NRS results for
the purpose of accountability are valid. Data from the first year of
implementation could now be used to conduct some of these analyses and
make determinations. For this reason, some Technical Work Group members
have suggested that the first year of NRS implementation should have
been considered a pilot test. HSB officials stated recently that they
would be working with the Technical Work Group and a new advisory
committee to continue to review the quality, reliability, and validity
of the NRS assessment.
Technical Work Group members have noted specific concerns with the
approach and format of the NRS that may be threats to its validity. For
example, Technical Work Group members have criticized the math section
for asking children to refer to items pictured on a page rather than
providing physical items (e.g., blocks) to handle and have argued that
the instructions are complicated for 4- and 5-year-old children. They
argue that children might fail items not because they lack math skills,
but because they do not understand the instructions or cannot perform
the math operations without items that can be manipulated. Technical
Work Group members also questioned whether the
letter-naming task is a valid measure of how many letters the children
know. Given the layout of the letters on the page, a child can miss
letters even if he or she actually knows the names of the letters, or
may tire of naming them and seek to see what is on the next page.
Several of the assessors we interviewed echoed these concerns and also
raised concerns about the quality of the pictures and choice of
vocabulary used in the PPVT component of the NRS. Due in part to these
concerns, only about half of lead assessors believe that the NRS
accurately portrays the majority of their children's abilities.
Currently, HSB cannot use the results from the Spanish version of the
NRS for accountability purposes because it has not been demonstrated
that this version produces reliable and valid results or that its
results are comparable to those from children tested in English. While
it is important that a Spanish version was developed, because 20
percent of Head Start children speak Spanish, experts have
questioned the reliability of the Spanish NRS results and criticized
other aspects of this version. First, the Spanish version of the NRS
was not standardized for the Spanish-speaking Head Start population.
Because the country of origin and class of a child's family affect the
Spanish dialect he or she speaks, there are important language
differences among subpopulations, making such standardization
important. For example, the Spanish spoken in Puerto Rico differs from
that in Mexico and children from these countries are likely to
recognize and use different words in test questions and answers. A
number of NRS assessors commented to us that the Spanish terms used in
the NRS were unfamiliar to their children and, in some cases,
unfamiliar to the staff as well. A second problem with the Spanish NRS
is that the English and Spanish versions are scored differently in that
English answers are acceptable on the Spanish version, but not vice
versa. This presents a problem because bilingual children may know some
things in English and other things in Spanish. For example, a child
might know the Spanish words for household items and the English words
for numbers and math concepts. As an indication of this, one-third of
Spanish-language NRS assessors found that on the Spanish version of the
NRS many of their children responded correctly in English, but not in
Spanish.
Members of the Technical Work Group and experts in bilingual testing
have also questioned whether the Simon Says and Art Show components of
the NRS can be used appropriately to track children's progress in
English, as HSB intends. They express concerns that these components,
designed simply as a screener to identify children who might have
difficulty understanding English, do not provide useful information on
the extent of English understood.
In addition to addressing concerns about the reliability and validity
of the NRS directly, it is important that HSB's analyses and results
are easy for other knowledgeable people to understand and use.
Professional standards call for a technical manual addressing issues
such as reliability and validity, as well as clearly specifying the
intended uses and interpretations of the tests and cautioning against
unintended misuses. According to all three of the independent experts
who reviewed the technical aspects of the NRS at our request, the
documentation of the reliability and validity of the NRS is not as well
organized as would be desirable.[Footnote 9] They stated that given the
importance of the validity of the NRS, a technical manual that brings
all the evidence together in one place would be valuable. The expert
reviewers reported that, in some cases, relevant material for
evaluating the procedures and evidence to support the reliability and
validity was provided, but was not organized in one place. For other
areas, especially concerning the empirical work related to the Spanish
version, documentation was not provided. For example, the information
on the Spanish version of the test was limited to descriptions of
procedures and summaries (e.g., "reliabilities were in the moderate to
high range") and did not include documentation that would have made it
possible for the reviewers to confirm the findings.
HSB Acknowledges that NRS Alone Does Not Provide Range of Information
and Context Needed for Making Accountability Decisions:
The NRS by itself does not provide sufficient information to draw
conclusions about the effects of Head Start grantees on children's
outcomes--information that would support use of the NRS for Head Start
grantee accountability. The NRS does not measure all aspects of Head
Start, but only a limited range of the areas on which Head Start
focuses and which contribute to children's school readiness. For
example, the NRS does not include measures related to science, creative
arts, approaches to learning, physical health and development, or
social and emotional development, areas on which all Head Start
programs are required to focus. Further, the cognitive areas included
in the NRS are measured using a very narrow source of data that is not
sufficient to evaluate the effects of Head Start grantees on the full
range of child outcomes. For the area of literacy, the test measures
how well children can identify letters, but not whether they can
recognize rhymes or understand that letters make sounds--both aspects
of "phonemic awareness," which is believed to be an area critical for
preventing reading difficulties. For the area of language development,
the test measures how well children can identify pictures by name, but
not grammar, usage, or expressive speech.
The Head Start Bureau has acknowledged the limited scope of the NRS and
has expressly urged Head Start grantees to continue implementing their
local assessments of the broader range of Head Start activities. The
Associate Commissioner for the Head Start Bureau has stated that the
Bureau does not intend to make decisions about grantees based solely on
NRS data. Rather, the NRS information will be combined with
comprehensive program level data collected on program designs and staff
patterns; funded and actual enrollment; health, education, disability,
and family services delivered; and demographic, social, and other
trends.[Footnote 10] Many Technical Work Group members have stated that
this type of contextual information is necessary for the NRS to be a
useful part of an overall program evaluation design.
In addition to measuring a limited range of the areas on which Head
Start focuses, the NRS does not include all of the 4-year-old children
who participate in Head Start. Most notably, children who speak neither
English nor Spanish, about 4 percent of Head Start children otherwise
eligible to participate in the NRS, are excluded from the NRS. Some
grantees do not have such children in their classrooms while others may
include many such children. In addition, a number of children are
excluded from the NRS due to prolonged absence and the scores of some
children who do participate in the NRS are later excluded due to
administrative reporting errors.
Application of NRS in Targeting Training and Technical Assistance
Requires Further Development:
NRS results are most reliable at the grantee level, but results at the
grantee level are not the most useful for identifying where training
and technical assistance should be targeted because some grantees
include a large number of locations and classrooms. Using average
scores at the grantee level to target training and technical assistance
can mask the variability that underlies them. An average score gain for
a grantee may reflect high gains only among children in particular
classrooms, while the scores of children in other classrooms did not
change or actually declined. The NRS data would allow for
more effective targeting of training and technical assistance if the
data could be used at the center and classroom levels, but currently
the NRS cannot be used in this way. Given this limitation, HSB has
stated that it might use NRS results to target training to a particular
region of the country or to support a national training initiative in a
particular skill area rather than to target specific grantees.
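The masking effect described above can be illustrated with a minimal sketch. The classroom names and score gains below are hypothetical values invented for illustration; they are not NRS data, and the NRS does not currently report results at the classroom level.

```python
from statistics import mean

# Hypothetical fall-to-spring score gains for children in three classrooms
# of a single grantee (illustrative numbers only, not NRS data).
classroom_gains = {
    "classroom_A": [9, 11, 10, 12],
    "classroom_B": [0, 1, -1, 0],
    "classroom_C": [-2, -1, 0, -1],
}

# Grantee-level average pools every child, regardless of classroom.
all_gains = [g for gains in classroom_gains.values() for g in gains]
grantee_average = mean(all_gains)

# Classroom-level averages show where the gains actually occurred.
classroom_averages = {
    name: mean(gains) for name, gains in classroom_gains.items()
}

print(f"Grantee average gain: {grantee_average:.2f}")
for name, avg in classroom_averages.items():
    print(f"{name}: {avg:.2f}")
```

In this sketch the grantee shows a positive average gain even though two of the three classrooms show no gain or a loss, which is the pattern that grantee-level reporting would not reveal.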
The NRS, by itself, cannot identify which particular aspects of the
Head Start program, if any, contributed to a grantee's particular NRS
results and this imposes some limitations on its utility for targeting
training and technical assistance. The NRS does not directly assess the
performance of Head Start grantees, such as by assessing the quality of
the classroom environment or teacher-child interactions. Rather, the
NRS assesses children's performance as an indirect measure of grantee
performance. To ensure that the NRS can be used as a valid indicator of
grantee performance (rather than of variations in student age or other
characteristics), experts believe it would be important to link NRS
data to other observations known to distinguish more and less
successful programs. In its quality assurance study of the NRS, HSB
found that local Head Start staff were not sure how to use the fall
2003 results that were reported at the grantee level. Likewise, in our
survey of NRS assessors we found that almost one-third of assessors
believed the NRS did not provide useful information for their programs.
Some members of the Technical Work Group have suggested that HSB
further investigate the assumption that targeting training and
technical assistance at the grantee or broader level can affect the
progress made by children on certain academic skills. They argue that,
if it is found that the classroom level matters, then the focus of
analysis and reporting should be redirected and efforts could be made
to increase the reliability of the scores at the classroom level.
Conclusions:
The NRS is an important step toward meeting a long-standing need for
systematic data on children's progress in Head Start and grantees'
performance. Developing such a system is a challenging endeavor and
considerable care and resources have gone into the project so far. At
the same time, the technical standards applicable to HSB's planned uses
for the assessment results need to be met. In addition, the system
should be implemented with the greatest efficiency and caution against
unintended negative consequences. The current NRS has strengths as well
as areas in need of refinement, further investigation, and development.
While the NRS provides some information on child outcomes among Head
Start grantees, HSB has not yet articulated how it intends to interpret
and use this information for the purposes of informing decisions about
Head Start accountability and targeting training and technical
assistance. Without further guidance, there is confusion among Head
Start grantees about what level of performance is expected of them and
how NRS results from their programs might be used to hold them
accountable. Out of anxiety about potential uses of the test, grantees
may be inappropriately narrowing the educational activities provided
through Head Start to match those included in the NRS, even though
instructed not to do so. Thus far, HSB has not established an ongoing
mechanism for monitoring the extent to which the NRS has such effects
on instruction.
Other key steps that HSB has not taken include validating component
tests and determining the reliability and validity of the NRS results
across time. In addition, it has not compiled complete, well-organized
documentation on the analyses conducted during test development and
implementation, making it difficult for independent experts to evaluate
the full technical merits of the English and Spanish versions of the
NRS. Further, HSB lacks a mechanism for ensuring that all English- and
Spanish-speaking Head Start children who are eligible to participate in
the NRS are assessed. Without such a mechanism and additional analyses,
and the assurances they provide, the potential exists that the NRS will
produce results that are not useful for program evaluation. Moreover,
without further work on test validation, HSB cannot use the NRS for
making decisions about grantees.
Finally, HSB's decision to assess all children with the full NRS
assessment, rather than assessing a sample of children with a sample of
items, has created a logistical challenge for many local Head Start
grantees who must conduct the assessments, and limited the depth of
information the NRS can provide about the learning of Head Start
children in particular skill areas. At the same time, developing a
sampling or matrix sampling strategy is complicated, especially for
gathering information on the performance of subgroups of grantees, such
as by region.
Recommendations for Executive Action:
To help ensure that the NRS successfully and efficiently achieves its
purposes, we are recommending that the HHS Assistant Secretary for ACF
take steps to better monitor some aspects of NRS implementation and
examine means of improving its efficiency, including steps to:
* monitor the effects of the NRS on local Head Start instructional
practices;
* improve the management and accuracy of its data on the number of
children eligible for and participating in the NRS; and:
* work with the Technical Work Group to determine the feasibility of
sampling options for administering the NRS, including documentation of
their costs and benefits.
In addition, we are recommending that the Assistant Secretary for ACF
reduce uncertainty about the appropriate uses of the NRS by taking
additional steps to:
* determine how the NRS data will be used for the purposes of
accountability and targeting training and technical assistance, and
clearly communicate this information to grantees;
* use the first year of NRS results to conduct further study to ensure
that the results are reliable and valid for both the English and
Spanish versions and that the results are appropriate for the intended
purposes; and:
* compile detailed technical information on the NRS, including
appropriate uses, in a single, well-organized document and make this
information publicly available.
Agency Comments and Our Evaluation:
ACF provided written comments on a draft of this report, which are
reprinted in appendix III. ACF generally agreed with GAO's
recommendations and stated that it had taken the following actions:
* ACF's contractors are conducting additional analyses of the first year
NRS results to ensure that future results are reliable and valid.
* ACF's contractors are preparing a detailed technical report.
* ACF has engaged its contractors and TWG in the preparation of an
options paper with recommendations for sampling.
* ACF is examining changes that occur in local curriculum implementation
and teaching practices.
Further, ACF indicated that it will examine ways to improve the
management and accuracy of its data on the number of children eligible
for and participating in the NRS.
ACF's positions regarding the NRS evolved over the course of our
review, as evidenced by ACF's decision not to include the 2003-2004 NRS
results in the 2004-2005 program monitoring process, its modification
of training materials, and changes ACF made to the CBRS. ACF expressed
in its comments a continued willingness to receive recommendations and
advice.
While generally agreeing with our recommendations, ACF also submitted
detailed comments on certain aspects of the draft report. Several of
these comments concerned the level of evidence for the validity of the
NRS. For example, ACF cited ongoing analyses of validity and noted that
most of the tests in the NRS have been used in other studies. However,
while further evidence of validity may be forthcoming, the data
available at the time of our review did not fully document that the
tests provide for valid inferences about program performance or
children's progress from fall to spring. If the test is to be used as a
measure of program performance or to assess changes in child outcomes,
it is important to ensure that it is sensitive to the range of
development typically demonstrated in Head Start. Based on our analysis
and that of the TWG and independent experts, we continue to believe
that further study is necessary to ensure that the NRS results are
reliable and valid and that the results are appropriate for the
intended purposes.
ACF also commented at length on our finding that, according to our
survey of assessors, at least an estimated 18 percent of grantees
"changed instruction during the first year of NRS implementation to
emphasize areas covered in the NRS." ACF does not dispute that such
changes were made, but suggests they may be appropriate, which we had
noted in the draft report. In addition, ACF made a number of technical
comments that we have incorporated as appropriate.
We are sending copies of this report to the Assistant Secretary for
ACF, appropriate congressional committees, and other interested
parties. We will also make copies available to others upon request. In
addition, the report will be available at no charge on GAO's Web site
at http://www.gao.gov. Please contact me at (202) 512-7215 if you or
your staff have any questions about this report. Other major
contributors to this report are listed in appendix IV.
Signed by:
Marnie S. Shaul:
Director, Education, Workforce and Income Security Issues:
[End of section]
Appendix I: Objectives, Scope and Methodology:
We designed our study to examine (1) what information the National
Reporting System (NRS) is designed to provide, (2) how the Head Start
Bureau (HSB) has responded to implementation issues raised by the Head
Start grantees and experts during the first year of NRS implementation,
and what issues remain to be addressed, and (3) whether the NRS
provides HSB with the quality of information it needs to meet its
goals. We obtained information about these objectives through the
following methods:
* Conducted in-person interviews with representatives from HSB, its
contractors, and early childhood professional organizations.
* Reviewed documents chronicling the steps HSB took in developing and
implementing the NRS and delineating the professionally accepted
standards for test development.
* Conducted a mail survey of a nationally representative sample of Head
Start grantees and delegates.
* Conducted in-person interviews with staff at 12 Head Start programs
in 5 states.
* Conducted interviews with all of the members of the Technical Work
Group.
* Contracted with individuals recommended by the National Academy of
Sciences as experts in the areas of psychometrics and the educational
testing of Spanish-speaking and bilingual children.
We conducted our work between May 2004 and February 2005 in accordance
with generally accepted government auditing standards.
Interviews with Head Start Bureau and Relevant Parties:
To obtain information on the steps HSB took in developing and
implementing the NRS, we conducted in-person and/or telephone
interviews with HSB and its contractors or subcontractors (Westat,
Mathematica, and Xtria), using semi-structured interview protocols. A
representative of HSB was present at each of the interviews with its
contractors. We asked HSB officials questions about the purpose of the
NRS, reporting NRS results, revisions and updates to the NRS, reactions
to NRS critics, and other related matters. We asked Westat staff
questions regarding: (1) the validity, reliability, and other analyses
of NRS results; (2) test development and revision; (3) test
administration, scoring, and reporting; (4) testing individuals of
diverse linguistic backgrounds; and (5) testing individuals with
disabilities. We asked Xtria staff about focus groups they conducted,
Computer-Based Reporting System (CBRS) training, and the CBRS itself.
We asked Mathematica staff about their Quality Assurance Study
methodology and findings.
We interviewed representatives of the National Head Start Association
(NHSA) to obtain information on what NHSA staff and their members
learned from the first year of NRS implementation and to obtain their
opinion on the extent to which the NRS comports with professional
standards. We interviewed representatives of the National Association
for the Education of Young Children (NAEYC) to learn how the NRS
comports with their recommendations for assessing young children.
Review of Documents:
To obtain information chronicling the steps HSB took in developing and
implementing the NRS and information about the quality of the NRS
results, we reviewed documents provided by HSB and its contractor.
These documents included, for example, minutes from meetings with the
Technical Work Group and others, minutes from focus groups, copies of
informational memos to Head Start grantees on the implementation of the
NRS, reports of results from field testing, and reports of fall 2003
NRS results.
To obtain information on the professionally accepted standards for test
development, we reviewed the Standards for Educational and
Psychological Testing, which is sponsored and published jointly by the
American Educational Research Association, the American Psychological
Association, and the National Council on Measurement in Education. That
document provides the preeminent, universally accepted guidance for
the development and evaluation of high-quality, psychometrically robust
assessment instruments.
Survey of NRS Lead Assessors:
To obtain information on implementation issues raised by the Head Start
grantees during the first year of NRS implementation, we drew a
stratified random probability sample of 472 grantees or delegates from
a study population of 1,820 grantees or delegates of Head Start
Programs during the 2003-2004 school year. We selected our sample from
six strata defined by the total number of Head Start tests administered
and the number of Head Start tests administered in Spanish in the
2003-2004 school year. Ultimately, we received 376 completed questionnaires,
for an overall response rate of 80 percent. The division of the
population, the division of the sample, and the division of the
respondents across the six strata can be found in table 3. Each sampled
grantee or delegate was subsequently weighted in the analysis to
represent all the members of the population.
Table 3: Sample Disposition:
Stratum number  Stratum description                                     Total population size  Total sample size  Number of respondents
1               At least 200 tests and at least 100 Spanish tests                         180                125                     98
2               Less than 200 tests and at least 100 Spanish tests                         22                 22                     17
3               At least 200 tests and between 1 and 99 Spanish tests                     327                 90                     80
4               Less than 200 tests and between 1 and 99 Spanish tests                    575                 98                     77
5               At least 200 tests and no Spanish tests                                   171                 48                     39
6               Less than 200 tests and no Spanish tests                                  545                 89                     65
Total                                                                                   1,820                472                    376
Source: GAO.
[End of table]
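The response rate and the weighting step can be checked with a short sketch using the counts in table 3. The weight formula shown (stratum population divided by stratum respondents) is an assumption made for illustration; the report states only that each sampled grantee or delegate was weighted to represent the population, not how the weights were computed.

```python
# (population size, sample size, respondents) per stratum, from table 3.
strata = {
    1: (180, 125, 98),
    2: (22, 22, 17),
    3: (327, 90, 80),
    4: (575, 98, 77),
    5: (171, 48, 39),
    6: (545, 89, 65),
}

total_pop = sum(pop for pop, _, _ in strata.values())
total_sample = sum(samp for _, samp, _ in strata.values())
total_resp = sum(resp for _, _, resp in strata.values())

# Overall response rate: completed questionnaires over sampled units.
overall_response_rate = total_resp / total_sample

# Assumed weight: each respondent stands in for pop / resp members of
# its stratum (a simple nonresponse-adjusted design weight).
weights = {s: pop / resp for s, (pop, _, resp) in strata.items()}

print(f"Overall response rate: {overall_response_rate:.1%}")
for s, w in weights.items():
    print(f"Stratum {s}: each respondent represents about {w:.2f} units")
```

Under this assumed scheme, the weighted respondents in each stratum sum back to that stratum's population, so the weighted total equals the study population of 1,820.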
We developed the survey questionnaire and pretested the content and
format of this questionnaire five times with NRS lead assessors, either
in-person or on the telephone. During these pretests, we asked the NRS
assessors whether the questions were clear and unbiased and whether the
terms contained in the questionnaire were accurate and precise. We made
changes to the questionnaire based on the pretest results.
Questionnaires were mailed to the sample of NRS lead assessors in
August 2004 and follow-up calls were made to those assessors whose
responses were not received within 2 weeks.
Because we followed a probability procedure based on random selections,
our sample of delegates and grantees is only one of a large number of
samples that we might have drawn. Because each sample could have
provided different estimates, we express our confidence in the
precision of our particular sample's results as 95 percent confidence
intervals. These are intervals that would contain the actual population
values for 95 percent of the samples we could have drawn. As a result,
we are 95 percent confident that each of the confidence intervals in
this report will include the true values in the study population. All
percentage estimates from our sample have margins of error (that is,
half-widths of confidence intervals) of plus or minus 6 percentage points or
less, at the 95 percent confidence level, unless otherwise noted.
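A minimal sketch of how a percentage estimate and its 95 percent margin of error might be computed from a stratified sample follows. The population and respondent counts come from table 3; the "yes" counts are hypothetical, and the estimator (a standard stratified proportion with a finite population correction, treating respondents as the sample in each stratum) is an assumption, since the report does not describe the variance method used.

```python
import math

# (population size, respondents, hypothetical "yes" answers) per stratum.
# Population and respondent counts are from table 3; the "yes" counts are
# invented for illustration and are not survey results.
strata = [
    (180, 98, 20),
    (22, 17, 4),
    (327, 80, 16),
    (575, 77, 15),
    (171, 39, 8),
    (545, 65, 13),
]


def stratified_proportion_ci(strata, z=1.96):
    """Return (estimate, margin of error) for a stratified proportion."""
    total_pop = sum(pop for pop, _, _ in strata)
    estimate = 0.0
    variance = 0.0
    for pop, n, yes in strata:
        share = pop / total_pop   # stratum's share of the population
        p = yes / n               # proportion observed in the stratum
        fpc = 1 - n / pop         # finite population correction
        estimate += share * p
        variance += share**2 * fpc * p * (1 - p) / (n - 1)
    margin = z * math.sqrt(variance)
    return estimate, margin


estimate, margin = stratified_proportion_ci(strata)
print(f"Estimate: {estimate:.1%}, plus or minus {margin:.1%}")
```

The interval is the estimate plus or minus the margin. Strata sampled at a higher rate contribute less variance because of the finite population correction.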
In addition to sampling errors, the practical difficulties of
conducting any survey may introduce other types of errors, commonly
referred to as non-sampling errors. For example, differences in how a
question is interpreted, the sources of information available to
respondents, or the characteristics of people who do not respond can
introduce unwanted variability into the survey results. We included
steps in both the data collection and data analysis stage to minimize
such non-sampling errors. For example, a survey specialist in
combination with subject matter experts designed our questionnaire; the
questionnaire was pretested with NRS assessors; data entry was verified
to ensure accuracy; and another computer programmer verified the
computer programs used for analysis.
A copy of the survey questionnaire, including overall responses, is
included in appendix II.
Site Visits to Head Start Grantees:
To obtain information on implementation issues raised by the Head Start
grantees during the first year of NRS implementation, we also conducted
site visits to 12 Head Start programs in 5 states (Colorado, Maryland,
Massachusetts, Rhode Island, and Virginia), where we interviewed staff
who conducted the assessments and, in some cases, observed them
administering the NRS to children. The states and grantees chosen for
site visits were judgmentally selected to include a range of enrollment
sizes, types of program, rural and urban locations, and ethnic and
racial populations.
The interviews were conducted using a semi-structured interview guide
that included questions about preparation for and logistics of
administering the assessment; experiences of conducting the
assessments; effects of the NRS on the children and program; reactions
to the NRS results; use of the CBRS; other assessment measures in use
at the program; and contextual information about the program and
community. During our site visits, we spoke with the lead assessor and,
in some cases, other Head Start staff, including other assessors,
staff, and managers. With the exception of sites in Colorado, we
conducted our site visits during May and June of 2004. We conducted our
Colorado site visits during September 2004. In all cases, we asked the
staff to refer to experiences during the 2003-2004 school year. We
cannot generalize our site visit findings beyond the 12 sites we
visited, but we have used these data for illustrative purposes in
conjunction with our survey.
Interviews with Technical Work Group:
To obtain information on whether the NRS provides HSB with the quality
of information it needs to meet its goals, we conducted telephone
interviews with each of the 16 members of the Technical Work Group,
using a semi-structured interview protocol. We asked the members about
their professional backgrounds and involvement on the Technical Work
Group; their understandings of the purpose of the NRS; their
assessments of the completeness of the steps HSB took in developing and
implementing the NRS; their assessments of the extent to which the NRS
is reliable, valid, and consistent with professional standards;
specific concerns about the NRS that members had raised during
Technical Work Group meetings; and their opinions on how HSB should
proceed with regard to the NRS. Each of the members stated that he or
she could be candid in discussing these issues with GAO. We also
observed two meetings of the Technical Work Group in May and October
2004.
Technical Work Group Members:
Craig Ramey, Ph.D., Chairman:
Distinguished Professor of Health Studies and Director, Georgetown
University Center for Health Education:
School of Nursing and Health Studies:
Georgetown University:
Washington, D.C.
Clancy Blair, Ph.D., Co-Chairman:
Assistant Professor:
Human Development and Family Studies:
Pennsylvania State University:
University Park, Pa.
Jason L. Anthony, Ph.D., Ed.S.:
Research Assistant Professor:
Texas Institute for Measurement, Evaluation, and Statistics:
Department of Psychology:
University of Houston:
Houston, Tex.
Margaret Burchinal, Ph.D.:
Senior Scientist:
Frank Porter Graham Child Development Institute:
The University of North Carolina at Chapel Hill:
Chapel Hill, N.C.
Richard Clifford, Ph.D.:
Senior Scientist:
Frank Porter Graham Child Development Institute:
The University of North Carolina at Chapel Hill:
Chapel Hill, N.C.
Linda Espinosa, Ph.D.:
Associate Professor:
311D Townsend Hall:
College of Education:
University of Missouri-Columbia:
Columbia, Mo.
Nicholas Ialongo, Ph.D.:
Associate Professor:
Bloomberg School of Public Health:
Johns Hopkins University:
Baltimore, Md.
Graciela Italiano-Thomas, Ed.D.:
CEO:
Centro de la Familia de Utah:
South Salt Lake, Utah:
Jacqueline Jones, Ph.D.:
Director, Initiatives in Early Childhood and Literacy Education:
Educational Testing Service:
Princeton, N.J.
Ann P. Kaiser, Ph.D.:
Professor of Psychology and Human Development:
Director, Research Program on Communication, Cognitive, and Emotional
Development:
Vanderbilt University:
Nashville, Tenn.
Samuel J. Meisels, Ed.D.:
President:
Erikson Institute:
Chicago, Ill.
Fred Morrison, Ph.D.:
Professor:
Department of Psychology:
University of Michigan:
Ann Arbor, Mich.
Robert C. Pianta, Ph.D.:
Professor, William Clay Parrish, Jr. Chair in Education:
Curry Programs in Clinical and School Psychology:
University of Virginia:
Charlottesville, Va.
Kyle Snow, Ph.D.:
National Institute of Child Health and Human Development:
National Institutes of Health:
U.S. Department of Health and Human Services:
Bethesda, Md.
W. Douglas Tynan, Ph.D., ABPP:
Associate Professor of Pediatrics:
Alfred I. duPont Hospital for Children:
Jefferson Medical College:
Wilmington, Del.
Jane Wiechel, Ph.D.:
Associate Superintendent:
Center for Students, Families and Communities:
Ohio Department of Education:
Columbus, Ohio:
Expert Reviews:
To obtain information on whether the NRS provides HSB with the quality
of information it needs to meet its goals, we contracted with
individuals recommended by the National Academy of Sciences (NAS) as
experts in the areas of psychometrics and the educational testing of
Spanish-speaking and bilingual children. These independent experts
reviewed documents provided by HSB and its contractors and provided
written comments on the adequacy and appropriateness of the assessment.
We also conducted follow-up telephone interviews with each of the three
experts to reconcile variations in their written reviews. We developed
our own conclusions based on the information provided by these experts.
The three experts are listed below.
Ronald K. Hambleton, Ph.D.:
Distinguished University Professor for Research and Evaluation Methods:
University of Massachusetts at Amherst:
School of Education:
Center for Educational Assessment:
Amherst, Mass.
Luis M. Laosa, Ph.D.:
Principal Research Scientist, Emeritus:
Educational Testing Service:
Center for Education Policy and Research:
Princeton, N.J.
Robert L. Linn, Ph.D.:
Professor:
University of Colorado:
Department of Education:
Boulder, Colo.
[End of section]
Appendix II: Survey Instrument:
The survey instrument displayed here includes the population estimates
for grantees overall. The confidence intervals for these estimates do
not exceed plus or minus 6 percentage points.
[See PDF for image]
[End of survey]
[End of section]
Appendix III: Comments from the Department of Health and Human
Services:
DEPARTMENT OF HEALTH AND HUMAN SERVICES:
ADMINISTRATION FOR CHILDREN AND FAMILIES:
Office of the Assistant Secretary,
Suite 600:
370 L'Enfant Promenade, S.W.
Washington, D.C. 20447:
APR 20 2005:
Ms. Marnie S. Shaul:
Director, Education, Workforce and Income Security Issues:
U.S. Government Accountability Office:
441 G Street, N.W.
Washington, D.C. 20548:
Dear Ms. Shaul:
The Administration for Children and Families appreciates the
opportunity to provide comments on recommendations in the U.S.
Government Accountability Office's draft report entitled, "Head Start:
Further Development Could Allow Results of New Test to be Used for
Decisionmaking" (GAO-05-343).
Should you have questions regarding our comments, please contact Windy
Hill, Associate Commissioner of the Head Start Bureau, Administration
on Children, Youth and Families, at (202) 205-8573.
Sincerely,
Signed by:
Wade F. Horn, Ph.D. Assistant Secretary for Children and Families:
Attachment:
COMMENTS OF THE ADMINISTRATION FOR CHILDREN AND FAMILIES ON THE
GOVERNMENT ACCOUNTABILITY OFFICE'S DRAFT REPORT TITLED, "HEAD START:
FURTHER DEVELOPMENT COULD ALLOW RESULTS OF NEW TEST TO BE USED FOR
DECISIONMAKING" (GAO-05-343):
The Administration for Children and Families (ACF) appreciates the
opportunity to comment on this Government Accountability Office (GAO)
draft report. We appreciate the breadth of contact made in the
preparation of this report.
GAO Recommendations:
To help ensure that the NRS successfully and efficiently achieves its
purposes, we are recommending that the HHS Assistant Secretary for ACF
take steps to better monitor some aspects of NRS implementation and
examine means of improving its efficiency, including steps to:
* monitor the effects of the NRS on local Head Start instructional
practices;
* improve the management and accuracy of its data on the number of
children eligible for and participating in the NRS; and:
* work with the Technical Work Group to determine the feasibility of
sampling options for administering the NRS, including documentation of
their costs and benefits.
In addition, we are recommending that the Assistant Secretary for ACF
reduce uncertainty about the appropriate uses of the NRS by taking
additional steps to:
* determine how the NRS data will be used for the purposes of
accountability and targeting training and technical assistance, and
clearly communicate this information to grantees;
* use the first year of NRS results to conduct further study to ensure
that the results are reliable and valid for both the English and
Spanish versions and that the results are appropriate for the intended
purposes; and:
* compile detailed technical information on the NRS, including
appropriate uses, in a single, well-organized document and make this
information publicly available.
ACF Comments:
ACF has widely publicized its commitment, need and intent for
improvements in the implementation of the National Reporting System
(NRS), including child assessment. We believe that the GAO
recommendations mirror many of ACF's public statements, as well as
accurately describe some of the action steps that are already in
process.
The remaining GAO recommendations are also in keeping with those
arising from our internal planning with the NRS contractors, the local
programs and the Technical Work Group (TWG). Additionally, the
Secretary of HHS will also be receiving recommendations from the newly
formed Secretary's Advisory Committee (SAC) on Head Start
Accountability and Educational Performance Standards, which will begin
meeting this summer.
Specific comments related to the recommendations:
* ACF has already included a scheduled deliverable within the scope of
work of the NRS contractors. Additional analyses are continuing to be
conducted with the first year NRS results in order to ensure that
future results are reliable and valid, and in order to be confident
that the results are appropriate for the interim and final intended
purposes. TWG and SAC will both assist ACF in the review of these
analyses.
* ACF has included tasks that will result in the NRS contractors
preparing a detailed technical report to expand beyond what is already
included in the recently distributed "Report to Congress on Head Start
Assessment." The new work is already in progress. We will make some
version of the new document available to the public when it is cleared
by ACF.
* ACF will examine ways to improve management regarding NRS
participation. We believe that we can achieve this through the existing
Computer-Based Reporting System data collection, data management, the
quality assurance site visits, and as part of our overall
responsibility for program monitoring.
* Prior to the release of the GAO report, ACF had engaged the NRS
contractors and TWG in the preparation of an options paper with
recommendations for sampling, including not only the benefits and cost
implications for each approach but also what could or must be "given
up" under the implementation of each approach. TWG and SAC will have a
role in reviewing these recommendations and further advising ACF and
HHS, respectively.
* ACF is examining and will continue to examine changes that occur in
local curriculum implementation and teaching practices through at least
three primary methods: on-site federal reviews, regular periodic
contact of an assigned technical assistance liaison and the NRS quality
assurance site visits.
Other Comments:
* ACF would like the title as well as pertinent references throughout
the document to refer to the NRS rather than "the test." The child
assessment alone is not synonymous with NRS.
* Though the Year One Quality Assurance Study is mentioned, ACF believes
it did not receive sufficient attention in this report.
Page 4, first full paragraph, and page 23, third paragraph - GAO states
that HSB has asserted the validity and reliability of NRS measures
because NRS borrows certain materials from existing tests that have met
the validity and reliability criteria, but the agency has not shown NRS
itself to be valid or reliable over time. Reliability and concurrent
and predictive validity of the Head Start NRS measures were calculated
using the Family and Child Experiences Survey (FACES) and other data on
Head Start children. These results were included in the package of
materials provided to GAO.
Ongoing analyses are being conducted to further demonstrate the
reliability and validity of the NRS assessment data. For example,
analyses comparing matched FACES data with NRS data are being conducted
to validate the parallel assessment data collected by locally trained
NRS assessors against those collected by trained, experienced,
professional FACES data collectors. Preliminary analyses indicate that
little difference is found between the two data sets.
Most of the subtests in the NRS battery have been used extensively in
the Head Start FACES study, in the National Head Start Impact Study or
in the Head Start Quality Research Center intervention studies
involving more than 10,000 Head Start children, as well as in other
major studies of low-income preschoolers. These measures have been used
in the National Institute of Child Health and Human Development
studies, the "Mother & Child Supplement" to the National Longitudinal
Survey of Youth and in the "Child Development Supplement" to the Panel
Study of Income Dynamics. The results of these assessments have proved
to be highly stable from cohort to cohort, not only in terms of the
level of achievement with which children enter or leave the Head Start
program, but also in terms of their growth trajectories.
Analysis of longitudinal data from the Head Start FACES study has shown
that vocabulary and letter-recognition assessments given in Head Start
can account for nearly half of the variance in children's tested
reading skills at the end of kindergarten, and 66 percent of the
variance when tested in general knowledge at the end of kindergarten.
Also, scores gained from vocabulary and letter-recognition assessments
account for almost one-third of the variance in kindergarten reading
skills and over one-quarter of the variance in kindergarten general
knowledge.
Page 9, Figure 2 - ACF would like to see the report contain both a
narrative and a timeline on NRS for the year 2004, not just for 2002
and 2003 as is currently in the report. The activities of the GAO
occurred during 2004, as did the first full year of ACF's
implementation of NRS.
Page 11, first paragraph - GAO indicates that a true "pilot," rather
than the summer field test of NRS, would take about a year to complete.
ACF believes that by further examination of the Year I data, we will
have data even beyond the scope of a one-year pilot effort.
The GAO report also states that HSB did not conduct a "full pilot
test." The Head Start Bureau (HSB) conducted a field test of the NRS
child assessment in the spring of 2003 with a national probability
sample of 36 Head Start programs, including two migrant programs and
two American Indian programs, resulting in a field test sample of over
1,430 kindergarten-eligible English- and Spanish-speaking children. The
results of the field test showed that the measures were appropriate for
the Head Start population, capturing a range of ability levels in the
assessment domains. Year I implementation results will add
significantly to this information and what we know about the properties
of the assessment over time.
Page 21, first paragraph - Though GAO has included a footnote to
explain, "...actions taken by the Head Start Bureau's contractors are
attributed to the Head Start Bureau itself," this note appears on this
page long after readers can attribute actions to HSB. Since the report
is written without disclosing what actions were taken or advised by
whom, ACF would like the footnote to be moved to the beginning of the
report or described in the opening narrative.
Page 26, third paragraph - GAO uses a figure of 13 percent to describe
the number of children who speak neither English nor Spanish. Aggregate
Program Information Report data indicate that programs reported 95
percent of the children enrolled last year spoke either English or
Spanish, leaving 5 percent who speak other languages. The number of
children in NRS who spoke a language other than English or Spanish at
home, as reported in the Computer-Based Reporting System, was
approximately 4 percent or 19,000 in the fall of 2003.
HSB has two other concerns with the report. Our responses to these two
are rather lengthy to help clarify them:
1. Page 17, first paragraph - The program office is concerned with the
following statement in the report "... some grantees have changed
instruction to emphasize areas covered in the test." The manner in
which it is stated implies that this can only be negative and that it
can only be attributable to NRS in any program in which it occurs. On
the contrary, we believe this illustrates a powerful positive change,
inasmuch as Head Start's heavy emphasis on instructional and curricular
changes pre-dates the implementation of NRS by several years. We explain
our concern in detail.
As this country's largest and only federally funded, comprehensive
early childhood program, we have learned a great deal from research-
based practices that enhance young children's learning and development.
Unless we ensure that programs are providing meaningful and challenging
learning experiences through ongoing observation and assessment of
children's progress as required by the Program Performance Standards,
participation will have little value for children. Therefore, we are
not surprised to learn that local programs reported to GAO that they
are making changes in their curriculum and in their teaching practices.
We believe that NRS may be giving them additional data upon which they
are making such local
decisions, rather than NRS serving as the sole source of such
information upon which to base change decisions. We have, through
various methods, specifically cautioned programs not to take actions of
this nature. We believe that most programs are not using NRS Year I
reporting in inappropriate ways.
The GAO report acknowledges in a small way that prior work has occurred
in this area, yet GAO does not acknowledge that the prior work, rather
than NRS alone, may be producing changes in curriculum and instruction.
Prior to the NRS, the Head Start Child Outcomes Framework (Framework)
defined the comprehensive nature of child development and early
childhood education in Head Start by including the domains of language
development, literacy, mathematics, science, creative arts, social and
emotional development, approaches to learning, and physical
development. This focus across all domains must remain within the local
curriculum and within the local ongoing assessment.
Additionally, the Head Start Program Performance Standards require that
all of these areas of development be supported through age-appropriate
curriculum delivered through classroom or home-based programming with
the integral involvement of parents. Therefore, the focus across all
domains must remain within the local curriculum and within the local
ongoing assessment.
ACF has been offering and continues to offer training, technical
assistance and other resources to help programs look more closely at
their local implementation and to make necessary changes. Additionally,
some programs have made and others are actively engaged in making these
types of changes as a result of either their required program self-
assessment or local aggregation of child outcome data, and/or as a
result of noncompliance or deficiencies identified and reported in the
process of triennial monitoring. We recognize and applaud programs that
are actively engaged in making appropriate changes in the areas of
curriculum, ongoing assessment of child progress and early childhood
instruction across domains.
Another example of our work that is influencing changes in local
programs is the Head Start Leaders Guide to Positive Child Outcomes.
This resource is based on the requirements of the Head Start Program
Performance Standards and the Framework. This important document has
been the basis of Head Start training, providing staff with specific
strategies to strengthen curriculum and to foster children's progress
in each of the identified domains. These strategies assist program
staff in strengthening curriculum planning and implementation
regardless of the specific curriculum used in individual programs.
Both ACF's regulations and resource materials provide examples of
educational quality based on:
* intentional teaching;
* outcomes-oriented learning experiences;
* child engagement; and
* challenging learning opportunities for small groups of children and
for individual children.
2. Page 7, second paragraph - The GAO report states of non-NRS
assessments, "The assessments occur 3 times each year and generally
involve observing the children during normal classroom activities."
This statement, though perhaps stated by one or more local programs,
inaccurately describes grantee actions as related to two existing Head
Start requirements. The first is the long-standing requirement for
ongoing observations and ongoing assessment of each child's progress.
Therefore, observing or assessing progress only three times a year
would be a significant area of noncompliance, and more likely, a
deficiency in that program. The statement on page seven further
represents a misunderstanding and, therefore, an inappropriate
implementation of the existing requirement. Three times per year each
agency is required to aggregate, report and examine data from its
locally designed and locally administered ongoing assessment of child
progress. This is different from assessing children three times a year.
Head Start standards do not allow for "assessing three times per year";
rather, teachers must observe and record examples of children's
development and learning on an ongoing basis throughout the year.
Management requirements have programs review aggregate data from the
assessment at three points in time during the year--the beginning,
midpoint and the end. The information is reviewed program-wide, in
aggregate, to assess children's status and progress on a wide range of
areas identified in the Framework. This information is used to continue
to plan the educational program for children as well as to inform the
overall program assessment and planning process.
We are aware that NRS is providing an additional way for programs to
look at children's progress over the course of a Head Start year. This
may be contributing to a renewed focus on becoming more intentional and
more deliberate regarding the early childhood educational services in
local Head Start programs--the learning content, intentional teaching,
and children's school readiness in the areas of both the Framework and
the 1998 Congressionally mandated child outcomes.
As we look more closely at this type of change in local programs, we
hope that we will be able to conclude that NRS is not currently the
"cause" of the more intentional focus on school readiness, but rather
that necessary changes are the result of a number of other factors,
including:
* The 1998 Congressional mandate, specifying additional Program
Performance Standards in language, literacy and numeracy/early
mathematics and the subsequent Framework;
* The increased qualifications of teachers and the significant number
with degrees;
* The increased focus on intentional teaching strategies shared through
training based on research;
* The appropriate use of local outcomes data (not the NRS data);
* The appropriate use of the required program self-assessment;
* Information from research, including the finding that children's
pre-school vocabulary is the best predictor of school success; and
* Individual agency and grantee responses to findings from federal,
on-site and triennial monitoring of compliance with all applicable laws
and regulations.
HSB's emphasis on instructional change clearly pre-dates NRS, which was
launched in 2002.
As stated earlier, separate from and prior to NRS, the Framework
defined the comprehensive nature of child development and early
childhood education in Head Start. Additionally, the Head Start Program
Performance Standards require that areas of development be supported
through age-appropriate curriculum delivered through classroom or home-
based programming with the integral involvement of parents.
It is important to recognize that the Head Start Program Performance
Standards, which were initially issued in 1972 and revised in 1996, and
the Framework, issued in 2000, both pre-date NRS.
The 1998 reauthorization of the Head Start Act (The Act) requires the
Secretary of HHS to establish "education performance standards to
ensure the school readiness of children participating in Head Start,"
including assurances that children develop phonemic, print and
numeracy/early mathematics skills; understand and use language to
communicate; understand and use increasingly complex and varied
vocabulary; develop and demonstrate an appreciation of books; and for
English language learners, progress toward acquisition of the English
language. The Act also required that the Head Start teacher
qualifications be raised because of evidence that links classroom and
teaching quality to the skills, knowledge and formal education of
teachers.
Therefore, the Act, the Head Start Leaders Guide to Positive Child
Outcomes, the Framework and the Program Performance Standards, as well
as professional development experiences such as Mentor Coaching, all
hold programs and local staff accountable for use of specific
strategies to strengthen curriculum content, learning outcomes and
intentional teaching, and to foster children's progress in each child
development domain of the comprehensive Head Start program.
Ensuring developmentally appropriate programming provides a meaningful
basis for observing and assessing children's progress and promoting and
individualizing learning and development. NRS is providing an
additional form of assessment reporting and an additional and renewed
focus on local programs becoming more intentional and more deliberate
regarding curriculum content, intentional teaching and children's
school readiness, and is not the sole source or a source to replace
existing requirements for local Head Start agencies.
ACF looks forward to additional recommendations as we move toward the
use of NRS data and as we inform grantees and others about the use of
the NRS data as another tool for accountability and providing training
and technical assistance.
[End of section]
Appendix IV: GAO Contacts and Staff Acknowledgments:
GAO Contacts:
Betty Ward-Zukerman (202) 512-2732, wardzukermanb@gao.gov;
Heather McCallum Hahn (202) 512-2890, mccallumh@gao.gov
Staff Acknowledgments:
Ramona Burton, Scott Heacock, Kathryn Rooney, Carolyn Boyce, Curtis
Groves, Stu Kaufman, Joan Vogel, and Sid Schwartz made significant
contributions to this report.
FOOTNOTES
[1] Head Start regulations require that at least 90 percent of the
children enrolled in Head Start come from families with incomes at or
below the federal poverty guidelines, receiving public assistance, or
caring for a foster child. In 2004, the federal poverty guideline for a
family of four in the 48 contiguous states and the District of Columbia
was $18,850.
[2] See GAO, Head Start: Challenges in Monitoring Program Quality and
Demonstrating Results, GAO/HEHS-98-186 (Washington, D.C.: June 1998),
and Head Start: Curriculum Use and Individual Child Assessment in
Cognitive and Language Development, GAO-03-1049 (Washington, D.C.:
September 2003).
[3] According to ACF officials, in addition to the assessments
conducted as part of the Head Start Child Outcomes Framework, Head
Start teachers must observe and record examples of children's
development and learning on an ongoing basis throughout the year.
[4] Analyses and actions taken by the Head Start Bureau's contractors
are attributed to the Head Start Bureau itself.
[5] Both the OLDS and the math assessment were used in the ECLS-K, and
the PPVT-III was used with two cohorts of the Head Start Family and
Child Experiences Survey (FACES). The Head Start Quality Research
Centers letter-naming exercise was developed for use in Head Start
curriculum studies. The ECLS-K is an ongoing study that focuses on
children's early school experiences beginning with kindergarten and
following children through fifth grade. FACES is a national
longitudinal study of the development of Head Start children, their
families, and Head Start programs and staff in a small sample of
programs.
[6] We use the terms "the test" and "the assessment" to make shortened
reference to the NRS test battery. The NRS also incorporates a support
infrastructure for the test battery, including a system for training
staff to conduct the assessments and a computer-based reporting system.
While the NRS may eventually be expanded to incorporate additional
components, we examined it as implemented through spring 2004.
[7] The current year's data are not available until December.
[8] The Head Start Bureau awarded a contract to Mathematica Policy
Research, Inc., to conduct an implementation study of the NRS in a
randomly-selected set of 35 Head Start programs. The research team
observed a total of 119 local assessors, interviewed Head Start
directors, NRS trainers, and data managers, and held focus groups with
staff conducting the assessments to learn about their experiences.
Mathematica also planned to visit four Migrant and Seasonal Head Start
programs during spring 2004 and fall 2005.
[9] See appendix I for a list of the expert reviewers and their
affiliations.
[10] See GAO, Head Start: Comprehensive Approach to Identifying and
Addressing Risks Could Help Prevent Grantee Financial Management
Weaknesses, GAO-05-176 (Washington, D.C.: Feb. 28, 2005).