Title I
Characteristics of Tests Will Influence Expenses; Information Sharing May Help States Realize Efficiencies
GAO ID: GAO-03-389 May 8, 2003
This is the accessible text file for GAO report number GAO-03-389
entitled 'Title I: Characteristics of Tests Will Influence Expenses;
Information Sharing May Help States Realize Efficiencies' which was
released on May 08, 2003.
This text file was formatted by the U.S. General Accounting Office
(GAO) to be accessible to users with visual impairments, as part of a
longer term project to improve GAO products' accessibility. Every
attempt has been made to maintain the structural and data integrity of
the original printed product. Accessibility features, such as text
descriptions of tables, consecutively numbered footnotes placed at the
end of the file, and the text of agency comment letters, are provided
but may not exactly duplicate the presentation or format of the printed
version. The portable document format (PDF) file is an exact electronic
replica of the printed version. We welcome your feedback. Please E-mail
your comments regarding the contents or accessibility features of this
document to Webmaster@gao.gov.
This is a work of the U.S. government and is not subject to copyright
protection in the United States. It may be reproduced and distributed
in its entirety without further permission from GAO. Because this work
may contain copyrighted images or other material, permission from the
copyright holder may be necessary if you wish to reproduce this
material separately.
Report to Congressional Requesters:
United States General Accounting Office:
GAO:
May 2003:
TITLE I:
Characteristics of Tests Will Influence Expenses; Information Sharing
May Help States Realize Efficiencies:
GAO-03-389:
GAO Highlights:
Highlights of GAO-03-389, a report to Congressional Requesters
Why GAO Did This Study:
The No Child Left Behind Act of 2001 (NCLBA) reauthorized the
$10 billion Title I program, which seeks to improve the educational
achievement of 12.5 million students at risk. In passing the
legislation, Congress increased the frequency with which states are to
measure student achievement in mathematics and reading and added
science as another subject. Congress also authorized funding to
support state efforts to develop and implement tests for this purpose.
Congress mandated that GAO study the costs of implementing the required
tests. This report describes characteristics of states' Title I tests,
provides estimates of what states may spend to implement the required
tests, and identifies factors that explain variation in expenses.
What GAO Found:
The majority of states administer statewide tests and customize
questions to measure student learning against their state standards.
These states differ along other characteristics, however, including the
types of questions on their tests and how they are scored, the extent
to which actual test questions are released to the public following
the tests, and the number of new tests they need to develop to comply
with the NCLBA.
GAO provides three estimates of total expenditures between fiscal year
2002 and 2008, based on different assumptions about the types of test
questions states may choose to implement and how they are scored. The
method by which tests are scored largely explains the differences in
GAO's estimates.
If all states use tests with multiple-choice questions, which are
machine scored, GAO estimates that the total state expenditures will be
about $1.9 billion. If all states use tests with a mixture of multiple-
choice questions and a limited number of open-ended questions that
require students to write their response, such as an essay, which are
hand scored, GAO estimates spending to be about $5.3 billion. GAO
estimates that spending will be about $3.9 billion if states keep the
mix of question types they reported to GAO. In general, hand scoring is
more expensive and more time- and labor-intensive than machine scoring.
Benchmark funding for assessments as specified in NCLBA will cover a
larger percentage of estimated expenditures for tests composed of
multiple-choice questions and a smaller percentage of estimated
expenditures for tests composed of a mixture of multiple-choice and
open-ended questions. Several states are exploring ways to reduce
assessment expenses, but information on their experiences is not
broadly shared among states.
What GAO Recommends:
Given that significant expenses may be associated with testing, GAO is
recommending that Education facilitate the sharing of information on
states' experiences in attempting to reduce expenses. Education agreed
with GAO's recommendation but raised concerns about GAO's methodology
for estimating expenditures.
www.gao.gov/cgi-bin/getrpt?GAO-03-389.
To view the full report, including the scope
and methodology, click on the link above.
For more information, contact Marnie S. Shaul at (202) 512-7215 or
shaulm@gao.gov.
[End of section]
Contents:
Letter:
Results in Brief:
Background:
States Generally Report Administering Statewide Assessments Developed
to Measure Their State Standards, but Differ Along Other
Characteristics:
Estimates of Spending Driven Largely by Scoring Expenditures:
Conclusions:
Recommendation:
Agency Comments:
Appendix I: Objectives, Scope, and Methodology:
Appendix II: Accountability and Assessment Requirements under
the 1994 and 2001 Reauthorizations of Title I:
Appendix III: Number of Tests States Reported They Need to
Develop or Augment to Comply with NCLBA (as of March 2003):
Appendix IV: Estimates of Assessment Expenditures NCLBA Required, but Not
in Place at the Time of Our Survey, FY 2002-08:
Appendix V: State Development and Nondevelopment Estimates:
Appendix VI: Fiscal Years 2002-08 Estimated Expenditures for
Each Question Type:
Appendix VII: Comments from the Department of Education:
Appendix VIII: GAO Contacts and Staff Acknowledgments:
GAO Contacts:
Staff Acknowledgments:
Tables:
Table 1: Number of Assessments and Subject Areas Required by the 1994
and 2001 ESEA Reauthorizations:
Table 2: Assessment Minimum Amounts under NCLBA:
Table 3: The Number of Tests States Reported Needing to Develop or
Augment Varies:
Table 4: Estimated Expenditures by States for Title I Assessments,
Fiscal Years 2002-08:
Table 5: Estimated Total Expenditures for Test Development Are Lower
Than for Test Administration, Scoring, and Reporting:
Table 6: Total Estimated Expenditures by States for Title I
Assessments, Fiscal Years 2002-08:
Table 7: States Selected for Study:
Table 8: Examples of Assessment Expenditures:
Table 9: Average Annual Expenditures for the 7 States (adjusted to 2003
dollars):
Table 10: Estimated Expenditures to Implement Title I Assessments in a
Given Year:
Table 11: Estimates of Expenditures for the Assessments Required by
NCLBA That Were Not in Place at the Time of Our Survey, Fiscal Years
2002-08:
Table 12: Estimates by State, Development, and Nondevelopment
Expenditures:
Table 13: Estimated Expenditures for Each Question Type, Fiscal Years
2002-08:
Figures:
Figure 1: The Majority of States Report They Currently Use Statewide
Tests and Plan to Continue to Do So:
Figure 2: The Majority of States Reported That They Currently Use and
Plan to Develop New Tests That Are Customized to Measure Their State's
Standards:
Figure 3: The Majority of States Reported They Use a Combination of
Multiple-choice and Open-ended Questions on Their Tests, but Many
States Are Uncertain about Question Type on Future Tests:
Figure 4: States Split in Decision to Release Test Questions to the
Public Following Tests:
Figure 5: Estimated Scoring Expenditures Per Assessment Taken for
Selected States, Fiscal Year 2002:
Figure 6: Various Factors Are Likely to Affect What States Spend on
Title I Assessments:
Figure 7: Total Expenditures Likely to Be Lower in First Few Years and
Benchmark Funding in NCLBA Estimated to Cover Most of Expenditures in
First Few Years:
Abbreviations:
ESEA: Elementary and Secondary Education Act
LEA: local educational agency
NASBE: National Association of State Boards of Education
NCLBA: No Child Left Behind Act
United States General Accounting Office:
Washington, DC 20548:
May 8, 2003:
The Honorable Judd Gregg
Chairman, Committee on Health, Education,
Labor, and Pensions
United States Senate:
The Honorable Edward M. Kennedy
Ranking Minority Member, Committee on
Health, Education, Labor, and Pensions
United States Senate:
The Honorable John A. Boehner
Chairman, Committee on Education
and the Workforce
House of Representatives:
The Honorable George Miller
Ranking Minority Member, Committee on
Education and the Workforce
House of Representatives:
Title I, the largest source of federal funding for primary and
secondary education, provided states $10.3 billion in fiscal year 2002
to improve the educational achievement of 12.5 million students at
risk. In passing the No Child Left Behind Act of 2001 (NCLBA), Congress
increased funding for Title I and placed additional requirements on
states and schools for improving student performance. To provide an
additional basis for making judgments about student progress, NCLBA
increased the frequency with which states are to assess students in
mathematics and reading and added science as another subject. Under
NCLBA, states can choose to administer statewide, local, or a
combination of state and local assessments, but these assessments must
measure states' content standards for learning. If a state fails to
fulfill NCLBA requirements, the Department of Education (Education) can
withhold federal funds designated for state administration until the
requirements have been fulfilled. To support states in developing and
implementing their assessments, Congress authorized specific funding to
be allocated to the states between fiscal year 2002 and 2007.
NCLBA requires that states test all students annually in grades 3
through 8 in mathematics and reading or language arts and at least
once in one
of the high school grades by the 2005-06 school year. It also requires
that states test students in science at least once in elementary,
middle, and high school by 2007-08. Some states have already developed
assessments in many of the required subjects and grades.
In the conference report accompanying passage of the NCLBA, Congress
mandated that we do a study of the anticipated aggregate cost to
states, between fiscal year 2002 and 2008, for developing and
administering the mathematics, reading or language arts, and science
assessments required under section 1111(b) of the act. As agreed with
your offices, this report (1) describes characteristics of states'
Title I assessments and (2) provides estimates of what states may spend
to implement the required assessments between fiscal year 2002 and 2008
and identifies factors that explain variation in expenses.[Footnote 1]
To determine the characteristics of states' Title I assessments, we
collected information through a survey sent to the 50 states, the
District of Columbia, and Puerto Rico; all 52 responded to our survey.
We also reviewed published studies detailing the characteristics of
states' assessments. To estimate projected expenditures all states are
expected to incur, we reviewed 7 states' expenditures--all of which had
implemented the 6 assessments required by the 1994 Elementary and
Secondary Education Act (ESEA) reauthorization and were testing
students in many of the additional subjects and grades required by
NCLBA. The 7 states were Colorado, Delaware, Maine, Massachusetts,
North Carolina, Texas, and Virginia. To estimate projected expenditure
ranges for all states, we used expenditures from these 7 states coupled
with key information gathered through a survey completed by each
state's assessment director. We estimated projected state expenditures
for test development, administration, scoring, and reporting results
for both assessments that states need and assessments that states
currently have in place. Our methodology for estimating expenditures
was reviewed by several internal and external experts and their
suggestions have been incorporated as appropriate. Education officials
were also briefed on our methodology and raised no substantial
concerns. As agreed with your offices, we did not determine
expenditures for alternate assessments for students with disabilities
nor expenditures for English language proficiency testing. In addition,
we did not determine the expenditures local school districts may incur
with respect to these assessments. To determine what factors account
for variation in projected expenditures, we reviewed the 7 states'
expenditures, noting the test characteristics that were associated with
specific types and levels of expenditure. We supplemented our
examination of state expenditures with interviews of test publishers
and contractors and state assessment officials in these states
regarding the factors that account for price and expenditure variation.
The expenditure data that we received were not audited. Actual
expenditures may vary from projected amounts, particularly when events
or circumstances are different from those assumed. All estimates are
reported in nominal dollars unless otherwise noted.
We conducted our work in accordance with generally accepted government
auditing standards between April 2002 and March 2003.
(See app. I for more details about our scope and methodology.):
Results in Brief:
The majority of states share two characteristics--they administer
statewide assessments rather than individual local assessments and use
customized questions to measure the content taught in the state schools
rather than questions from commercially available tests. However,
states differ in many other respects. For example, some states use
assessments that include multiple-choice questions and other states
include a mixture of multiple-choice questions and a limited number of
questions that require students to write their response, such as an
essay. Many states that use questions that require students to write
their response believe that such questions enable them to more
effectively measure certain skills, such as writing. However, others
believe that multiple-choice questions also allow them to assess such
skills. In addition, some states make actual test questions available
to the public after testing but differ with respect to the percentage
of test questions they publicly release and consequently, the number of
questions they will need to replace. States also vary in the number of
new tests they reported needing to develop to comply with the NCLBA,
which ranged from 0 to 17.
We provide three estimates--$1.9, $3.9, and $5.3 billion--of total
spending by states between fiscal year 2002 and 2008, with the method
by which assessments are scored largely explaining the differences in
our estimates. These estimates are based on expenditures associated
with new assessments as well as existing assessments. The $1.9 billion
estimate is based on the assumption that all states will use multiple-
choice questions, which are machine scored. The $3.9 billion estimate
is based on the assumption that all states keep the mix of question
types--whether multiple-choice or a combination of multiple-choice and
open-ended--states reported to us. The $5.3 billion estimate is based
on the assumption that all states will use a combination of multiple-
choice questions and questions that require students to write their
response, such as an essay, which are hand scored. Several states are
exploring ways to reduce assessment expenses. This information could be
beneficial to others; however, it is currently not being broadly
shared. Given that significant expenses may be associated with testing,
we are recommending that Education facilitate the sharing of
information on states' experiences as they attempt to reduce expenses.
Education agreed with our recommendation, but raised concerns about our
methodology for estimating expenditures.
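The spread among these estimates can be viewed as a function of how much hand scoring states adopt. The sketch below is purely illustrative: the linear interpolation between the two bounding estimates is our simplifying assumption, not GAO's actual estimating method.

```python
# Illustrative only: GAO's bounding estimates for total state spending
# on Title I assessments, fiscal years 2002-08.
LOW = 1.9e9   # all states use machine-scored, multiple-choice tests
HIGH = 5.3e9  # all states use hand-scored, mixed-format tests

def blended_estimate(hand_scored_share):
    """Hypothetical linear blend between the two bounding estimates.

    hand_scored_share is the assumed fraction of assessment volume
    that is hand scored; this interpolation is an assumption for
    illustration, not GAO's estimating model.
    """
    if not 0.0 <= hand_scored_share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return LOW + hand_scored_share * (HIGH - LOW)

# GAO's status-quo estimate falls between the bounds, closer to the
# high end, consistent with most states reporting a mix that includes
# hand-scored, open-ended questions.
status_quo = 3.9e9
```

Under this assumed blend, the $3.9 billion status-quo estimate would correspond to roughly 59 percent of assessment volume being hand scored, which illustrates why the scoring method dominates the differences among the estimates.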
Background:
Enacted as part of President Johnson's War on Poverty, the original
Title I program was created in 1965, but the 1994 and, most recently,
the 2001 reauthorizations of ESEA mandated fundamental changes to
Title I. The 1994 ESEA reauthorization required states to develop state
standards and assessments to ensure that students served by Title I
were held to the same standards of achievement as other students. Some
states had already implemented assessments prior to 1994, but they
tended to be norm referenced--a student's performance was compared to
the performance of all students nationally. The 1994 ESEA
reauthorization required assessments that were criterion referenced--
students' performance was to be judged against the state standards for
what children should know and be able to do.[Footnote 2] In passing the
NCLBA, Congress built on the 1994 requirements by, among other things,
increasing the number of grades and subject areas in which states were
required to assess students, as shown in table 1. NCLBA requires annual
testing of students in third through eighth grades, in mathematics and
reading or language arts. It also requires mathematics and reading or
language arts testing in one of the high school grades (10-12). States
must also assess students in science at least once in elementary (3-5),
middle (6-9), and high school (10-12). NCLBA gives the states until the
2005-06 school year to administer the additional mathematics and
reading or language arts assessments and until the 2007-08 school year
to administer the science assessments (see app. II for a summary of
Title I assessment requirements).
Table 1: Number of Assessments and Subject Areas Required by the 1994
and 2001 ESEA Reauthorizations:
Subject: Reading or language arts; Number of required assessments: 1994
ESEA reauthorization: 3; Number of required assessments: 2001 ESEA
reauthorization: 7.
Subject: Mathematics; Number of required assessments: 1994 ESEA
reauthorization: 3; Number of required assessments: 2001 ESEA
reauthorization: 7.
Subject: Science; Number of required assessments: 1994 ESEA
reauthorization: 0; Number of required assessments: 2001 ESEA
reauthorization: 3.
Subject: Total; Number of required assessments: 1994 ESEA
reauthorization: 6; Number of required assessments: 2001 ESEA
reauthorization: 17.
Source: P.L. No. 103-382 (1994) and P.L. No. 107-110 (2001).
[End of table]
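The totals in table 1 follow directly from the grade-span rules described above: annual testing in grades 3 through 8 (six tests) plus one high school test yields seven assessments each for reading or language arts and mathematics, and one test in each of the elementary, middle, and high school spans yields three for science. A quick sketch of that arithmetic:

```python
# Assessment counts required by the 2001 reauthorization (NCLBA),
# derived from the grade-span rules stated in the report text.
READING_MATH_GRADES = list(range(3, 9))  # grades 3 through 8, tested annually
HIGH_SCHOOL_TESTS = 1                    # at least once in grades 10-12
SCIENCE_SPANS = ["elementary", "middle", "high"]  # one science test per span

per_subject = len(READING_MATH_GRADES) + HIGH_SCHOOL_TESTS  # 7 each for
                                                            # reading and math
total_2001 = 2 * per_subject + len(SCIENCE_SPANS)           # 17 in all
```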
Unlike the 1994 ESEA reauthorization, NCLBA does not generally permit
Education to allow states additional time to implement these
assessments beyond the stated time frames.[Footnote 3] Under the 1994
ESEA reauthorization, Congress allowed states to phase in the 1994 ESEA
assessment requirements over time, giving states until the beginning of
the 2000-01 school year to fully implement them with the possibility of
limited time extensions. In April 2002, we reported that the majority
of states were not in compliance with the Title I accountability and
assessment provisions required by the 1994 law.[Footnote 4]
Every state applying for Title I funds must agree to implement the
changes described in the 2001 act, including those related to the
additional assessments. In addition to the regular Title I state grant,
NCLBA authorizes additional funding to states for these assessments
between fiscal year 2002 and 2007.[Footnote 5] These funds are to be
allocated each year to states, with each state receiving $3 million,
regardless of its size, plus an amount authorized based on its share of
the nation's school age population. States must use the funds to pay
the cost of developing the additional state standards and assessments.
If a state has already developed the required standards and
assessments, it may use these funds to, among other things, develop
challenging state academic content and student academic achievement
standards in subject areas other than those required under Title I and
to ensure the validity and reliability of state assessments. NCLBA
authorized $490 million for fiscal year 2002 for state assessments and
such funds as may be necessary through fiscal year
2007. However, if in any year Congress appropriates less than the
amounts shown in table 2, states may defer or suspend testing, although
they are still required to develop the assessments. In fiscal year
2002, states received $387 million for assessments.
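The allocation described above (a flat $3 million base per state plus a share proportional to school-age population) can be sketched as follows. The state names and population figures are hypothetical, and the formula is simplified for illustration; the statute's actual allocation has details not covered here.

```python
# Sketch of the NCLBA assessment-grant allocation described above:
# each state receives a $3 million base, and the remainder of the
# appropriation is divided in proportion to each state's share of the
# national school-age population. Populations below are hypothetical.

BASE_GRANT = 3_000_000

def allocate(appropriation, school_age_population):
    """Return a dict of state -> estimated grant under this formula."""
    total_pop = sum(school_age_population.values())
    # Pool remaining after every state takes its flat base grant.
    pool = appropriation - BASE_GRANT * len(school_age_population)
    return {
        state: BASE_GRANT + pool * pop / total_pop
        for state, pop in school_age_population.items()
    }

# Hypothetical three-state example using the $387 million states
# actually received for assessments in fiscal year 2002:
grants = allocate(387_000_000,
                  {"A": 1_000_000, "B": 2_000_000, "C": 500_000})
```

Because the base grant is fixed regardless of size, small states receive proportionally more per pupil than large states under this structure.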
Table 2: Assessment Minimum Amounts under NCLBA:
Fiscal year: 2002; Appropriation benchmark: $370,000,000.
Fiscal year: 2003; Appropriation benchmark: 380,000,000.
Fiscal year: 2004; Appropriation benchmark: 390,000,000.
Fiscal year: 2005; Appropriation benchmark: 400,000,000.
Fiscal year: 2006; Appropriation benchmark: 400,000,000.
Fiscal year: 2007; Appropriation benchmark: 400,000,000.
Fiscal year: Total; Appropriation benchmark: $2.34 billion.
Source: P.L. No. 107-110 (2001).
[End of table]:
Other organizations have provided cost estimates of implementing the
required assessments. The National Association of State Boards of
Education (NASBE) estimated that states would spend between
$2.7 billion and $7 billion to implement the required assessments.
AccountabilityWorks estimated that states would spend about $2.1
billion.[Footnote 6]
States can choose to use statewide assessments, local assessments, or
both to comply with NCLBA. States can also choose to develop their own
test questions or augment commercially available tests with questions
so that they measure what students are actually taught in school.
However, NCLBA does not permit states to use commercially available
tests that have not been augmented.
NCLBA provides Education a varied role with respect to these
assessments. Education is responsible for determining whether or not
states' assessments comply with Title I requirements. States submit
evidence to Education showing that their systems for assessing students
and holding schools accountable meet Title I requirements, and
Education contracts with individuals who have expertise in assessments
and Title I to review this evidence. The experts provide Education with
a report on the status of each state regarding the degree to which a
state's system for assessing students meets the requirements and,
therefore, warrants approval. Under NCLBA, Education can withhold
federal funds provided for state administration until Education
determines that the state has fulfilled those requirements.[Footnote 7]
Education's role also includes reporting to Congress on states'
progress in developing and implementing academic assessments, and
providing states, at the state's request, with technical assistance in
meeting the academic assessment requirements. It also includes
disseminating information to states on best practices.
States Generally Report Administering Statewide Assessments Developed
to Measure Their State Standards, but Differ Along Other
Characteristics:
The majority of states report using statewide assessments developed to
measure student learning against the content they are taught in the
states' schools, but their assessments differ in many other ways. For
example, some states use assessments that include multiple-choice
questions, while others include a mixture of multiple-choice questions
and questions that require students to write their answer by composing
an essay or showing how they calculated a math answer. In addition,
some states make actual test questions available to the public but
differ with respect to the percentage of test questions they publicly
release. Nearly all states provide accommodations for students with
disabilities and some states report offering their assessments in
languages other than English. States also vary in the number of new
tests they will need to develop to comply with the NCLBA.
The Majority of States Use Statewide Tests That They Report Are Written
to Their State Standards:
Forty-six states currently administer statewide tests to students and
44 plan to continue using statewide tests for future tests NCLBA
requires them to add.[Footnote 8] (See fig. 1.) Only 4 states--Idaho,
Kansas, Pennsylvania, and Nebraska--currently use a combination of
state and local assessments and only Iowa currently uses all local
assessments.
Figure 1: The Majority of States Report They Currently Use Statewide
Tests and Plan to Continue to Do So:
[See PDF for image]
Note: Percentages do not add to 100 because of rounding.
[End of figure]
The majority of states (31) report that all of the tests they currently
use consist of questions customized, that is, developed specifically to
assess student progress against their state's standards for learning
for every grade and subject tested. (See fig. 2.) Many of the remaining
states are using different types of tests for different grades and
subjects. For example, some states are using customized tests for some
grades and subjects and commercially available tests for other grades
and subjects. Seven states reported using only commercially available
tests in all the grades and subjects they tested.
In the future, the majority of states (33) report that all of their
tests will consist of customized questions for every subject and grade.
Moreover, those states that currently use commercially available tests
report plans to replace these tests with customized tests or augment
commercially available tests with additional questions to measure what
students are taught in schools, as required by NCLBA.
Figure 2: The Majority of States Reported That They Currently Use and
Plan to Develop New Tests That Are Customized to Measure Their State's
Standards:
[See PDF for image]
Note: Percentages do not add to 100 due to rounding. In the current
period, "other" includes states that reported using commercially
available tests for all grades and subjects tested that had not been
augmented with additional questions to measure state standards. These
states reported plans to augment these tests with additional questions
or replace them with customized tests.
[End of figure]
States Vary in Approach to Specific Accommodations:
In developing their assessments, nearly all states (50) reported
providing specific accommodations for students with
disabilities.[Footnote 9] These often include Braille, large print, and
audiotape versions of their assessments for visually impaired students,
as well as additional time and oral administration.
About a quarter of the states (12) report offering these assessments in
languages other than English, typically Spanish. Both small and larger
states scattered across the United States offer assessments in
languages besides English. For example, states such as Wyoming and
Delaware and large states such as Texas and New York offer Spanish
language versions of their assessments. New York and Minnesota offer
their assessments in as many as four other languages besides
English.[Footnote 10] While a quarter of the states currently translate
or offer assessments in languages other than English, additional states
may provide other accommodations for students with limited English
proficiency, such as additional time to take the test, use of bilingual
dictionaries, or versions of the test that limit use of idiomatic
expressions.
States Are Using Different Types of Questions to Assess Students:
Thirty-six states report they currently use a combination of multiple-
choice and a limited number of open-ended questions for at least some
of the assessments they give their students. (See fig. 3.) For example,
in Florida, third grade students' math skills are assessed using
multiple-choice questions, while fifth grade students' math skills are
assessed using a combination of multiple-choice and open-ended
questions. Twelve states reported having tests that consist entirely of
multiple-choice questions. For example, all of Georgia's and Virginia's
tests are multiple-choice. Almost half of the states reported that they
had not made a decision about the ratio of multiple-choice to open-
ended questions on future tests. Of the states that had made a
decision, most reported plans to develop assessments using the same
types of questions they currently use.
Figure 3: The Majority of States Reported They Use a Combination of
Multiple-choice and Open-ended Questions on Their Tests, but Many
States Are Uncertain about Question Type on Future Tests:
[See PDF for image]
[End of figure]
States choose to use a mixture of question types on their tests for
varying reasons. For example, some officials believe that open-ended
questions, requiring both short and long student responses, more
effectively measure certain skills such as writing or math computation
than multiple-choice questions. Further, they believe that different
question types will render a more complete measure of student knowledge
and skills. In addition, state laws sometimes require test designers to
use more than one type of question. In Maine, for example, state law
requires that all state and local assessments employ multiple measures
of student performance.
States Split as to Whether They Make Actual Test Questions Available to
the Public Following Tests:
Slightly over half of the states currently release actual test
questions to the public but differ in the percentage of questions they
release. (See fig. 4.) Texas, Massachusetts, Maine, and Ohio release
their entire tests to the public following the tests, allowing parents
and other interested parties to see every question their children were
asked. Other states, such as New Jersey and Michigan, release only a
portion of their tests. Moreover, even states that do not release
questions to the general public may release a portion of the questions
to teachers, as North Carolina does, so that teachers can better
understand the areas where students are having the most difficulty and
improve instruction. States that release questions must typically
replace them with new questions.
Figure 4: States Split in Decision to Release Test Questions to the
Public Following Tests:
[See PDF for image]
[End of figure]
States often replenish their tests with new questions to improve test
security. For example, Florida, Kentucky, Maryland, and South Carolina,
which do not release test questions, nevertheless replenish or replace
questions periodically.
In addition to replenishing test items, many states use more than one
version for each of their tests and do so for various reasons. For
example, Virginia gives a different version of its test to students who
may have been absent. Some states use multiple versions of their high
school tests so that students who do not pass can retake them. Still
other states, such as Massachusetts and Maine,
use multiple versions to enable the field testing of future test
questions.
States Vary in the Number of Additional Tests They Reported They Need
to Develop or Augment:
States differ in the number of additional tests they reported needing
to meet the NCLBA requirement of 17 tests in total: some already have
all of the tests needed, while others will need to develop new tests or
augment commercially available tests with additional questions. (See
table 3.) Appendix III has
information on the number of tests each state needs to develop or
augment to comply with NCLBA.
The majority of states (32) report they will need to develop or augment
9 or fewer tests and the rest (20) will need to develop or augment 10
or more tests. Eight states--Alabama, New Mexico, Montana, South
Dakota, Idaho, West Virginia, Wisconsin, and the District of Columbia--
report that they need to develop or augment all 17 tests. Maryland is
also replacing a large number of its tests (15); although its
assessments were certified as compliant with the 1994 law, the tests
did not provide scores for individual students. Although Education
waived the requirement that Maryland's tests provide student level
data, Maryland is in the process of replacing them so that it can
provide such data, enabling parents to know how well their children are
performing on state tests.
Table 3: The Number of Tests States Reported Needing to Develop or
Augment Varies:

Range in number of tests states need to comply with NCLBA:  Number of states:
None                                                         5
1-3                                                          4
4-6                                                          6
7-9                                                         17
10-12                                                       10
13 or more                                                  10

Source: GAO survey.
[End of table]
Most states reported plans to begin developing the tests immediately;
according to many of the assessment directors we spoke with, tests
typically take 2 to 3 years to develop. For example, most states
reported that by 2003 they will have developed or will begin developing
the reading and mathematics tests that must be administered by the
2005-06 school year. Similarly, most states reported that by 2005 they
will have developed or will begin developing the science tests that
must be administered by the 2007-08 school year.
To develop these tests, most states report using one or more outside
contractors to manage their testing programs. Nearly all states
report that developing, administering, scoring, and reporting will be a
collaborative effort involving contractors and state and local
education agencies. However, while states report that contractors and
state education agencies will share the primary role in developing,
scoring, and reporting new assessments, local education agencies will
have the primary role in administering the assessments.
Estimates of Spending Driven Largely by Scoring Expenditures:
We provide three estimates--$1.9, $3.9, and $5.3 billion--of total
state spending between fiscal years 2002 and 2008 for test development,
administration, scoring, and test reporting. These figures include
estimated expenses for assessments states will need to add as well as
continuing expenditures associated with assessments they currently have
in place. The method of scoring largely explains the differences in the
estimates. However, various other factors, such as the extent to which
states release assessment questions to the public after testing and
therefore need to replace them, also affect expenditures. Between
states, however, the number of students assessed will largely explain
variation in expenditures. Moreover, because expenditures for test
development are small in relation to test administration, scoring, and
reporting (nondevelopment expenditures), we estimate that state
expenditures may be lower in the first few years when states are
developing their assessments and higher in subsequent years as states
begin to administer and score them and report the results.
Different Estimates Primarily Reflect Differences in How Assessments
Are Scored:
We estimate that states may spend $1.9, $3.9, or $5.3 billion on
Title I assessments between fiscal years 2002 and 2008, with
scoring expenditures largely accounting for differences in our
estimates.
Table 4 shows our estimates of total state expenditures for the 17
tests required by Title I. In appendix IV, we also provide separate
estimates for
expenses associated with the subset of the 17 assessments that states
reported they did not have in place at the time of our survey but are
newly required by NCLBA.
Table 4: Estimated Expenditures by States for Title I Assessments,
Fiscal Years 2002-08:

Question type                    Estimate       Questions and scoring methods assumed
Multiple-choice                  $1.9 billion   All states use machine-scored multiple-choice questions.
Current question type            $3.9 billion   States use the mix of question types reported in our survey.
Multiple-choice and open-ended   $5.3 billion   All states use machine-scored multiple-choice questions and some hand-scored open-ended questions.

Source: GAO projections based on state assessment plans and
characteristics and expenditure data gathered from 7 states.
[End of table]
The $1.9 billion estimate assumes that all states will use multiple-
choice questions on their assessments. Multiple-choice questions can be
scored by scanning machines, making them relatively inexpensive to
score. For instance, North Carolina, which uses multiple-choice
questions on all of its assessments and machine scores them, spends
approximately $0.60 to score each assessment.
The $3.9 billion estimate assumes that states will implement
assessments with questions like the ones they currently use or plan to
use based on state education agency officials' responses to our survey.
However, 25 states reported that they had not made final decisions about
question type for future assessments. Thus, the types of questions
states ultimately use may be different from the assessments they
currently use or plan to use.
Finally, the $5.3 billion estimate assumes that all states will
implement assessments with both multiple-choice and open-ended
questions. Answers to open-ended questions, where students write out
their responses, are typically read and scored by people rather than by
machines, making them much more expensive to score than answers to
multiple-choice questions. We found that states using open-ended
questions had much higher scoring expenditures per student than states
using multiple-choice questions, as figure 5 shows for the states we
visited.[Footnote 11] For example, Massachusetts, which
uses many open-ended questions on its Title I assessments, spends about
$7.00 to score each assessment. Scoring students' answers to open-ended
questions in Massachusetts involves selecting and training people to
read and score the answers, assigning other people to supervise the
readers, and providing a facility where the scoring can take place. In
cases where graduation decisions depend in part on a student's score on
the assessment, the state requires that two or three individuals read
and score the student's answer. By using more than one reader to score
answers, officials ensure consistency between scorers and are able to
resolve disagreements about how well the student performed.
Figure 5: Estimated Scoring Expenditures Per Assessment Taken for
Selected States, Fiscal Year 2002:
[See PDF for image]
[End of figure]
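In general, the cost gap between machine and hand scoring grows with the number of assessments scored. The following sketch is our illustration, not GAO's cost model: it simply scales the per-assessment figures cited above, and the 500,000-assessment volume is hypothetical.

```python
# Per-assessment scoring costs cited in this report (fiscal year 2002):
MACHINE_SCORED = 0.60  # North Carolina: all multiple-choice, machine scored
HAND_SCORED = 7.00     # Massachusetts: many open-ended questions, hand scored

def scoring_cost(assessments_scored: int, cost_per_assessment: float) -> float:
    """Total scoring cost in dollars for a given number of assessments."""
    return assessments_scored * cost_per_assessment

# A hypothetical state scoring 500,000 assessments a year:
machine_total = scoring_cost(500_000, MACHINE_SCORED)  # about $300,000
hand_total = scoring_cost(500_000, HAND_SCORED)        # $3,500,000
```

At this volume, hand scoring costs more than 11 times as much, before counting reader recruitment, training, supervision, and facilities.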
We estimate that, for most states, much of the expense associated with
assessments will be related to test scoring, administration, and
reporting, not test development, which includes such expenses as
question development and field testing.[Footnote 12] (See table 5.) In
Colorado, for example, test administration, scoring, and reporting
expenditures comprised 89 percent of total expenditures, while test
development expenditures comprised only 11 percent. (See app. V for our
estimates of development and nondevelopment expenditures by state.)
Table 5: Estimated Total Expenditures for Test Development Are Lower
Than for Test Administration, Scoring, and Reporting:

In millions.

                                         Multiple-choice   Current question type   Multiple-choice and open-ended
Development                              $668              $706                    $724
Administration, scoring, and reporting   1,233             3,237                   4,590
Total                                    $1,901            $3,944                  $5,313

Source: GAO projections based on state assessment plans and
characteristics and expenditure data gathered from 7 states.
[End of table]
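Development's modest share of the totals in table 5 can be checked with simple arithmetic. The sketch below is our illustration, using the current-question-type column ($706 million for development, $3,237 million for administration, scoring, and reporting):

```python
def development_share(development: float, nondevelopment: float) -> float:
    """Test development as a percent of total estimated expenditures."""
    return development / (development + nondevelopment) * 100

# Current-question-type totals from table 5, in millions of dollars:
share = development_share(706, 3237)  # about 18 percent
```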
Various Factors Are Likely to Affect Expenditures for Title I
Assessments:
While the scoring method explains a great deal of the variation in
expenditures among states, other factors are likely to affect
expenditures. These factors include the number of different test
versions used, the extent to which the state releases assessment
questions to the public after testing, fees for using copyrighted
material, and factors unique to the state. (See fig. 6.) For example,
states that use multiple test versions will have higher expenditures
than those that use one. Massachusetts used 24 different test versions
for many of its assessments and spent approximately $200,000 to develop
each assessment. Texas used only 1 version for its assessments and
spent approximately $60,000 per assessment. In addition, states that
release test items to the public
or require rapid reporting of student test scores are likely to have
higher expenditures than states that do not because they need to
replace these items with new ones to protect the integrity of the tests
and assign additional staff to more rapidly score the assessments by
the specified time frame. States that customize their assessments may
have higher expenditures than states that augment commercially
available tests. Moreover, factors unique to the state may affect
expenditures. Maine, which had one of the lowest assessment development
expenses of all of the states we visited (about $22,000 per
assessment), has a contract with a nonprofit testing company.
Between states, the number of students tested generally explains much
of the variation in expenditures, particularly when question types are
similar. States with large numbers of students tested will generally
have higher expenditures than states with fewer students.
Figure 6: Various Factors Are Likely to Affect What States Spend on
Title I Assessments:
[See PDF for image]
[End of figure]
Benchmark Amounts in NCLBA Will Cover Varying Portions of States'
Estimated Expenditures, and the Amount Covered Will Vary Primarily by
the Type of Test Questions States Use:
Using the benchmark funding levels specified in NCLBA, we estimate that
these amounts would cover varying portions of estimated expenditures.
(See table 6.) In general, these benchmark amounts would cover a larger
percentage of the estimated expenditures for states that choose to use
multiple-choice tests. To illustrate, we estimated that Alabama would
spend $30 million if it continued to use primarily multiple-choice
questions, but $73 million if the state used assessments with both
multiple-choice and open-ended questions. The specified amount would
cover 151 percent of Alabama's estimated expenditures if it chose to
use all multiple-choice questions, but 62 percent if the state chose to
use both multiple-choice and open-ended questions.
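The coverage percentages reported in table 6 follow from a simple ratio of the NCLBA benchmark amount to our expenditure estimate. The sketch below is our illustration; because GAO computed the published percentages from unrounded figures, recomputing from the rounded millions shown in table 6 gives slightly different values.

```python
def benchmark_coverage(benchmark_millions: float, estimate_millions: float) -> float:
    """NCLBA benchmark as a percent of estimated expenditures."""
    return benchmark_millions / estimate_millions * 100

# Alabama, from the rounded figures in table 6 ($46 million benchmark):
benchmark_coverage(46, 30)  # about 153 (published figure: 151 percent)
benchmark_coverage(46, 73)  # about 63 (published figure: 62 percent)
```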
Table 6: Total Estimated Expenditures by States for Title I
Assessments, Fiscal Years 2002-08:

Estimates (MC = multiple-choice; Current = current question type;
MC & OE = multiple-choice and open-ended) and the appropriation
benchmark[A] are in millions of dollars; the last three columns show
the benchmark as a percent of each estimate.

                        Estimates (in millions)    Benchmark       Benchmark as percent of estimates
State                   MC    Current  MC & OE     (in millions)[A]  MC    Current  MC & OE
Alabama                 $30   $30      $73         $46               151%  151%     62%
Alaska                  17    25       28          26                154   106      93
Arizona                 39    108      108         51                132   47       47
Arkansas                23    42       53          37                158   88       70
California              178   235      632         219               123   93       35
Colorado                32    87       87          46                145   53       53
Connecticut             28    68       68          41                147   59       59
Delaware                14    24       24          26                183   106      106
District of Columbia    13    13       17          24                184   184      144
Florida                 83    211      281         102               123   48       36
Georgia                 54    54       174         67                124   124      39
Hawaii                  17    31       31          28                162   91       91
Idaho                   18    23       30          30                167   131      98
Illinois                65    164      211         92                141   56       44
Indiana                 40    113      113         56                140   49       49
Iowa                    24    62       62          38                158   62       62
Kansas                  23    36       51          38                164   106      73
Kentucky                28    62       71          43                155   70       61
Louisiana               31    81       81          49                158   60       60
Maine                   18    33       33          29                159   86       86
Maryland                35    91       91          51                146   56       56
Massachusetts           38    109      109         55                144   50       50
Michigan                57    177      177         80                140   45       45
Minnesota               34    91       91          51                149   56       56
Mississippi             25    63       63          39                154   61       61
Missouri                36    99       99          54                150   54       54
Montana                 18    28       29          27                149   97       94
Nebraska                18    34       34          32                177   93       93
Nevada                  21    26       45          33                152   125      72
New Hampshire           17    32       32          29                168   92       92
New Jersey              43    127      127         67                153   53       53
New Mexico              21    39       41          33                155   84       81
New York                83    276      276         121               146   44       44
North Carolina          49    49       152         65                132   132      43
North Dakota            16    23       23          26                162   109      109
Ohio                    55    171      171         86                158   50       50
Oklahoma                27    37       66          42                156   114      63
Oregon                  28    28       70          40                145   145      57
Pennsylvania            58    162      181         87                150   54       48
Puerto Rico             28    28       70          47                167   167      67
Rhode Island            17    28       28          27                161   98       98
South Carolina          31    82       85          43                139   53       51
South Dakota            18    18       27          26                145   145      97
Tennessee; Estimates (in millions): Multiple-choice: 33; Estimates (in
millions): Current question type: 33; Estimates (in millions):
Multiple-choice and open-ended: 85; Appropriation benchmark
(in millions)[A]: 52; Appropriation benchmark as percent of
estimated expenses: Multiple-choice: 158; Appropriation benchmark as
percent of estimated expenses: Current question type: 158;
Appropriation benchmark as percent of estimated expenses: Multiple-
choice and open-ended: 61.
Texas; Estimates (in millions): Multiple-choice: 126; Estimates (in
millions): Current question type: 232; Estimates (in millions):
Multiple-choice and open-ended: 441; Appropriation benchmark
(in millions)[A]: 147; Appropriation benchmark as percent of
estimated expenses: Multiple-choice: 116; Appropriation benchmark as
percent of estimated expenses: Current question type: 63; Appropriation
benchmark as percent of estimated expenses: Multiple-choice and open-
ended: 33.
Utah; Estimates (in millions): Multiple-choice: 24; Estimates (in
millions): Current question type: 44; Estimates (in millions):
Multiple-choice and open-ended: 61; Appropriation benchmark
(in millions)[A]: 37; Appropriation benchmark as percent of
estimated expenses: Multiple-choice: 154; Appropriation benchmark as
percent of estimated expenses: Current question type: 84; Appropriation
benchmark as percent of estimated expenses: Multiple-choice and open-
ended: 60.
Vermont; Estimates (in millions): Multiple-choice: 16; Estimates (in
millions): Current question type: 25; Estimates (in millions):
Multiple-choice and open-ended: 25; Appropriation benchmark
(in millions)[A]: 25; Appropriation benchmark as percent of
estimated expenses: Multiple-choice: 155; Appropriation benchmark as
percent of estimated expenses: Current question type: 102;
Appropriation benchmark as percent of estimated expenses: Multiple-
choice and open-ended: 102.
Virginia; Estimates (in millions): Multiple-choice: 43; Estimates (in
millions): Current question type: 60; Estimates (in millions):
Multiple-choice and open-ended: 129; Appropriation benchmark
(in millions)[A]: 59; Appropriation benchmark as percent of
estimated expenses: Multiple-choice: 136; Appropriation benchmark as
percent of estimated expenses: Current question type: 99; Appropriation
benchmark as percent of estimated expenses: Multiple-choice and open-
ended: 46.
Washington; Estimates (in millions): Multiple-choice: 41; Estimates (in
millions): Current question type: 118; Estimates (in millions):
Multiple-choice and open-ended: 118; Appropriation benchmark
(in millions)[A]: 55; Appropriation benchmark as percent of
estimated expenses: Multiple-choice: 135; Appropriation benchmark as
percent of estimated expenses: Current question type: 47; Appropriation
benchmark as percent of estimated expenses: Multiple-choice and open-
ended: 47.
West Virginia; Estimates (in millions): Multiple-choice: 23; Estimates
(in millions): Current question type: 23; Estimates (in millions):
Multiple-choice and open-ended: 43; Appropriation benchmark
(in millions)[A]: 31; Appropriation benchmark as percent of
estimated expenses: Multiple-choice: 135; Appropriation benchmark as
percent of estimated expenses: Current question type: 135;
Appropriation benchmark as percent of estimated expenses: Multiple-
choice and open-ended: 72.
Wisconsin; Estimates (in millions): Multiple-choice: 29; Estimates (in
millions): Current question type: 66; Estimates (in millions):
Multiple-choice and open-ended: 72; Appropriation benchmark
(in millions)[A]: 53; Appropriation benchmark as percent of
estimated expenses: Multiple-choice: 180; Appropriation benchmark as
percent of estimated expenses: Current question type: 80; Appropriation
benchmark as percent of estimated expenses: Multiple-choice and open-
ended: 73.
Wyoming; Estimates (in millions): Multiple-choice: 15; Estimates (in
millions): Current question type: 21; Estimates (in millions):
Multiple-choice and open-ended: 21; Appropriation benchmark
(in millions)[A]: 25; Appropriation benchmark as percent of
estimated expenses: Multiple-choice: 171; Appropriation benchmark as
percent of estimated expenses: Current question type: 119;
Appropriation benchmark as percent of estimated expenses: Multiple-
choice and open-ended: 119.
Total; Estimates (in millions): Multiple-choice: $1,901; Estimates (in
millions): Current question type: $3,944; Estimates (in millions):
Multiple-choice and open-ended: $5,313; Appropriation
benchmark (in millions)[A]: $2,733; Appropriation benchmark as percent
of estimated expenses: Multiple-choice: 144%; Appropriation benchmark
as percent of estimated expenses: Current question type: 69%;
Appropriation benchmark as percent of estimated expenses: Multiple-
choice and open-ended: 51%.
Source: GAO analysis.
[A] Figures in these columns are based largely on benchmark funding
levels in NCLBA. If Congress appropriates less than the benchmark
amounts, states may defer test administration. For fiscal years 2002
and 2003, however, we used the actual appropriation. In addition,
because we were mandated to estimate spending for fiscal year 2008, for
purposes of this analysis, we assumed a fiscal year 2008 benchmark of
$400 million, the same amount as for fiscal years 2005, 2006, and 2007.
It should be noted, however, that Congress has not authorized funding
past fiscal year 2007, when Title I would be reauthorized. Benchmarks
by state were calculated based on the formula in NCLBA for allocating
assessment funds to the states.
[End of table]
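The coverage percentages in the table's total row follow directly from dividing the total appropriation benchmark by each scenario's estimated expenses. A minimal sketch of that arithmetic, using the totals reported above (the variable names are ours):

```python
# Totals from the last row of the table, in millions of dollars.
benchmark = 2733  # total appropriation benchmark, fiscal years 2002-2008

estimates = {
    "multiple-choice": 1901,
    "current question type": 3944,
    "multiple-choice and open-ended": 5313,
}

# Benchmark expressed as a percent of each scenario's estimated
# expenses, rounded to whole percents as in the table.
coverage = {scenario: round(100 * benchmark / total)
            for scenario, total in estimates.items()}
# coverage -> {"multiple-choice": 144, "current question type": 69,
#              "multiple-choice and open-ended": 51}
```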
Total Expenditures Likely to Be Lower in the First Few Years,
Increasing Over Time as States Begin to Administer, Score, and Report
Additional Assessments:
Estimated expenditures are likely to be lower in the first few years
when tests are being developed and increase in later years when greater
numbers of tests are administered, scored, and reported. As a result,
the benchmark funding amounts in NCLBA would cover a larger percentage
of estimated expenditures in the first few years. Under some
circumstances, the funding benchmarks in NCLBA exceed estimated state
expenditures. For example, as shown in figure 7, the fiscal year 2002
allocation would more than cover all of the estimated expenses if all
states were to use multiple-choice questions or continue with the types
of questions they currently use. If all states were to choose to use a
mixture of multiple-choice and open-ended questions, the most expensive
option, fiscal year 2002 funding would cover 84 percent of states'
total expenditures. We estimate a similar pattern for fiscal year 2003.
(See app. VI for fiscal year 2002 through 2008 estimated expenditures
for each question type.):
In fiscal years 2007 and 2008, benchmark funding would continue to cover
all of the estimated expenditures if all states were to use all
multiple-choice questions, about two-thirds of estimated expenditures
if all states continued using their current mix of questions, and a
little over 50 percent of estimated expenditures if all states were to
use a mixture of question types, the most expensive option.
Figure 7: Total Expenditures Likely to Be Lower in First Few Years and
Benchmark Funding in NCLBA Estimated to Cover Most of Expenditures in
First Few Years:
[See PDF for image]
[End of figure]
Opportunities May Exist to Share Information on Efforts to Reduce
Testing Expenditures:
Some states are exploring ways to control expenses related to
assessments and their experiences may provide useful information to
other states about the value of various methods for controlling
expenditures. Recently, several states, in conjunction with testing
industry representatives, met to discuss a range of options for reducing
test expenditures, including computer-administered tests; commercially
available tests that can be customized to state standards by adding
questions; computerized scoring of written responses; and computer
scanning of students' written responses. Information about individual
states' experiences as they attempt to reduce expenses could benefit
other states, but such information is currently not systematically
shared.
Conclusions:
The 1994 and 2001 ESEA reauthorizations raised student assessments to a
new level of importance. These assessments are intended to help ensure
that all students are meeting state standards. Congress has authorized
funding to assist states in developing and implementing these
assessments. We estimate that federal funding benchmarks in NCLBA will
cover a larger percentage of expenses in the first few years when
states are developing their assessments, with the covered percentage
decreasing as states begin to administer, score, and report the full
complement of assessments. Moreover, the choices states make about how
they will assess students will influence expenditures. Some states are
investigating ways to reduce these expenses, but information on their
experiences is currently not broadly shared. We believe states could
benefit from such information sharing.
Recommendation:
Given the large federal investment in testing and the potential for
reducing test expenditures, we recommend that Education use its
existing mechanisms to facilitate the sharing of information on states'
experiences as they attempt to reduce expenses.
Agency Comments:
The Department of Education provided written comments on a draft of
this report, which we have summarized below and incorporated in the
report as appropriate. (See app. VII for agency comments.) Education
agreed with our recommendation, stating that it looks forward to
continuing and enhancing its efforts to facilitate information sharing
that might help states contain expenses. However, Education raised
concerns about our methodology, noted the availability of additional
federal resources under ESEA that might support states' assessment
efforts, and pointed out that not all state assessment costs are
generated by NCLBA.
With regard to our estimates, we have confidence that our methodology
is reasonable and provides results that fairly represent potential
expenditures based on the best available information. Education's
comments focus on the uncertainties that are inherent in estimation of
any kind--the necessity of assumptions, the possibility of events or
trends not readily predicted, and other potential sources of error that
are acknowledged in the report--without proposing an alternative
methodology. Because of the uncertainty, we produced three estimates
instead of one. In developing our approach, we solicited comments from
experts in the area and incorporated their suggestions as appropriate.
We also discussed our estimation procedures with Education staff, who
raised no significant concerns. Regarding Education's second concern,
the department cites various other sources of funds that states might
use to finance assessments. While
other sources may be available, we focused primarily on the amounts
specifically authorized for assessments in order to facilitate their
comparison to estimated expenses and because they are the minimum
amounts that Congress must appropriate to ensure that states continue
to develop as well as implement the required assessments.
We are sending copies of this report to the Secretary of Education,
relevant congressional committees, and other interested parties. Please
contact me on (202) 512-7215 or Betty Ward-Zukerman on (202) 512-2732
if you or your staff have any questions about this report. In addition,
the report will be available at no charge on GAO's Web site at http://
www.gao.gov. Other GAO contacts and staff acknowledgments are listed in
appendix VIII.
Marnie S. Shaul, Director
Education, Workforce and Income Security Issues:
Signed by Marnie S. Shaul:
[End of section]
Appendix I: Objectives, Scope, and Methodology:
The objectives of this study were to (1) provide information on the
basic characteristics of Title I assessments, (2) estimate what states
would likely spend on Title I assessments between fiscal years 2002 and
2008, and (3) identify factors that explain variation in estimated
expenditures. To address the first objective, we collected information
from a survey sent to the 50 states, the District of Columbia, and
Puerto Rico, and reviewed documentation from state education agencies
and from published studies detailing the characteristics of states'
assessments. To address the second objective, we collected detailed
assessment expenditure information from 7 states, interviewed officials
at state education agencies, discussed cost factors with assessment
contractors, and estimated assessment expenditures under three
different scenarios. The methods we used to address the objectives were
reviewed by several external reviewers, and we incorporated their
comments as appropriate. This appendix discusses the scope of the
study, the survey, and the methods we used to estimate assessment
expenditures.
Providing Information on the Basic Characteristics of Title I
Assessments:
We surveyed all 50 states, the District of Columbia, and Puerto Rico,
all of which responded to our survey. We asked them to provide
information about their Title I assessments, including the
characteristics of current and planned assessments, the number and
types of new tests they needed to develop to satisfy No Child Left
Behind Act (NCLBA) requirements, when they planned to begin developing
the new assessments, the types of questions on their assessments, and
their use of contractors. We also reviewed documentation from several
states about their assessment programs and published studies detailing
the characteristics of states' assessments.
Estimating Assessment Expenditures and Explaining Variation in the
Estimates:
This study estimates likely expenditures on Title I assessments by
states between fiscal years 2002 and 2008 and identifies factors that
may explain variation in the estimates. It does not estimate
expenditures for alternate assessments for students with disabilities,
for English language proficiency testing, or expenditures incurred by
school districts.[Footnote 13] Instead, we estimated expenses states
are expected to incur based on expenditure data obtained for this
purpose from 7 states combined with data on these and other states'
assessment plans and characteristics obtained through a
survey.[Footnote 14] In the 7 states, we requested information and
documentation on expenditures in a standard set of areas, met with
state officials to discuss the information and asked that they review
our subsequent analysis of information regarding their state. The
expenditure data that we received from the 7 states were not audited.
Moreover, actual expenditures may vary from projected amounts,
particularly when events or circumstances are different from those
assumed, such as changes in the competitiveness of the market for
student assessment or changes in assessment technology.
Selection of 7 States:
From the 17 states whose assessment systems had been certified by
Education as in compliance with the requirements of the Improving
America's Schools Act of 1994 when we began our work, we selected 7
states that had assessments in place in many of the grades and subjects
required by NCLBA. We included states with varying student enrollments,
including 2 states with relatively small numbers of students. The states
we selected were Colorado, Delaware, Maine, Massachusetts, North
Carolina, Texas, and Virginia. (See table 7 for information about the
selected states.):
Table 7: States Selected for Study:
State: Colorado; Date approved by Education: July 2001; Number of
students: 724,508; Number of assessments: Reading
(out of 7): 7; Number of assessments: Math
(out of 7): 5; Number of assessments: Science
(out of 3): 1; Total: 13.
State: Delaware; Date approved by Education: December 2000; Number of
students: 114,676; Number of assessments: Reading
(out of 7): 7; Number of assessments: Math
(out of 7): 7; Number of assessments: Science
(out of 3): 3; Total: 17.
State: Maine; Date approved by Education: February 2002; Number of
students: 207,037; Number of assessments: Reading
(out of 7): 3; Number of assessments: Math
(out of 7): 3; Number of assessments: Science
(out of 3): 3; Total: 9.
State: Massachusetts; Date approved by Education: January 2001; Number
of students: 975,150; Number of assessments: Reading
(out of 7): 5; Number of assessments: Math
(out of 7): 4; Number of assessments: Science
(out of 3): 3; Total: 12.
State: North Carolina; Date approved by Education: June 2001; Number of
students: 1,293,638; Number of assessments: Reading
(out of 7): 7; Number of assessments: Math
(out of 7): 7; Number of assessments: Science
(out of 3): 0; Total: 14.
State: Texas; Date approved by Education: March 2001; Number of
students: 4,059,619; Number of assessments: Reading
(out of 7): 7; Number of assessments: Math
(out of 7): 7; Number of assessments: Science
(out of 3): 2; Total: 16.
State: Virginia; Date approved by Education: January 2001; Number of
students: 1,144,915; Number of assessments: Reading
(out of 7): 4; Number of assessments: Math
(out of 7): 4; Number of assessments: Science
(out of 3): 3; Total: 11.
Source: U.S. Department of Education, National Center for Education
Statistics, and state education agencies.
[End of table]:
Collection of Expenditure Information from 7 States:
We collected detailed assessment expenditure information from officials
in the 7 states. We obtained actual expenditures on contracts and state
assessment office budget expenditures for fiscal year 2002 for all 7
states and for previous years in 4 states.[Footnote 15] In site visits
to the 7 states, we interviewed state education agency officials who
explained various elements of their contracts with assessment
publishing firms and the budget for the state's assessment office. To
the extent possible, we collected expenditure data, distinguishing
expenditures for assessment development from expenditures for
assessment administration, scoring, and reporting, because the two
categories are driven by different factors: development expenditures
vary with the number of assessments, while administration, scoring, and
reporting expenditures vary with the number of students taking the
assessments. (See table 8 for examples of
expenditures.):
Table 8: Examples of Assessment Expenditures:
Type of expenditure: Development; Example of expenditure: Question
writing; Question review (e.g., for bias).
Type of expenditure: Administration; Example of expenditure: Printing
and delivering assessment booklets.
Type of expenditure: Scoring; Example of expenditure: Scanning
completed booklets into scoring machines.
Type of expenditure: Reporting; Example of expenditure: Producing
individual score reports.
Source: State education agencies.
[End of table]
Calculation of Averages for Development and for Administration,
Scoring, and Reporting:
Using annual assessment expenditures for all 7 states, the number of
assessments developed and implemented, and the number of students who
took the assessments, we calculated average expenditures for ongoing
development (assessments past their second year of development) and
average expenditures for administration, scoring, and reporting for
each state. (See table 9.):
Table 9: Average Annual Expenditures for the 7 States (adjusted to 2003
dollars):
State: Colorado; Average development expenditures (per ongoing
assessment): $72,889; Average expenditures for administration,
scoring, and reporting (per assessment taken): $10.35; Both multiple-
choice and open-ended questions: Yes; Multiple-choice questions: [Empty].
State: Delaware; Average development expenditures (per ongoing
assessment): $66,592; Average expenditures for administration,
scoring, and reporting (per assessment taken): $8.78; Both multiple-
choice and open-ended questions: Yes; Multiple-choice questions: [Empty].
State: Maine; Average development expenditures (per ongoing
assessment): $22,295; Average expenditures for administration,
scoring, and reporting (per assessment taken): $9.96; Both multiple-
choice and open-ended questions: Yes; Multiple-choice questions: [Empty].
State: Massachusetts; Average development expenditures (per ongoing
assessment): $190,870; Average expenditures for administration,
scoring, and reporting (per assessment taken): $12.45; Both multiple-
choice and open-ended questions: Yes; Multiple-choice questions: [Empty].
State: North Carolina; Average development expenditures (per ongoing
assessment): $104,181; Average expenditures for administration,
scoring, and reporting (per assessment taken): $1.85; Both multiple-
choice and open-ended questions: Multiple-choice questions: Yes.
State: Texas; Average development expenditures (per ongoing
assessment): $61,453; Average expenditures for administration,
scoring, and reporting (per assessment taken): $4.72; Both multiple-
choice and open-ended questions: Multiple-choice questions: Yes.
State: Virginia; Average development expenditures (per ongoing
assessment): $78,489; Average expenditures for administration,
scoring, and reporting (per assessment taken): $1.80; Both multiple-
choice and open-ended questions: Multiple-choice questions: Yes.
Source: GAO analysis of state education agency information.
Note: We were able to obtain data for more than 1 year for Colorado,
Delaware, Maine, Massachusetts, and Texas. For these states, we
adjusted their average expenditures to 2003 dollars and then averaged
these adjusted expenditures across the years that data were collected.
North Carolina did not distinguish Title I assessments from other
assessments it offers.
[End of table]
Estimating States' Likely Expenditures for 17 Title I Assessments:
We provide three estimates of what all states are likely to spend on
all of the required 17 assessments using the average development
expenditure and average expenditures for administration, scoring, and
reporting by question type (multiple-choice or multiple-choice with
some open-ended questions). One estimate assumes that all states use
only multiple-choice questions, the second assumes that states will use
the types of questions state officials reported they use or planned to
use, and the third assumes that all states will use both multiple-
choice and a limited number of long and short open-ended questions. All
estimates reflect states' timing of their assessments (for example,
that science assessments are generally planned to be developed and
administered later than assessments for reading and mathematics).
To estimate what states would spend under the assumption that they use
only multiple-choice questions, we took the mean of the average annual
expenditures per assessment for North Carolina, Texas, and Virginia,
states that use multiple-choice assessments. To compute an estimate
that reflected the types of questions states used or planned to use, we
used the appropriate averages. To illustrate, California reported 15
multiple-choice tests and 2 tests that include a combination of
multiple-choice and open-ended questions. For the 15 multiple-choice
tests, we used the mean from the multiple-choice states (North
Carolina, Texas, and Virginia). For the
2 multiple-choice and open-ended tests, we used the mean from the
states that had both question types (Colorado, Delaware, Maine, and
Massachusetts). To estimate what states would spend, assuming that all
states use both multiple-choice and open-ended questions, we used the
mean of the average annual expenditures for Colorado, Delaware, Maine,
and Massachusetts, states that use both types of questions.
Estimating Development Expenditures:
To estimate development expenditures, we obtained information from each
state regarding the number of assessments it needed to develop, the
year in which it planned to begin development of each new assessment,
and the number of assessments it already had. For each assessment the
state indicated it needed to develop, we estimated initial development
expenditures beginning in the year the state said it would begin
development and also for the following year because interviews with
officials revealed that developing an entirely new assessment takes
approximately 2 to 3 years. For the 7 states that provided data, we
were typically not able to separate expenditures for new test
development from expenditures for ongoing test development. Where such
data were available, we determined that development expenses for new
assessments were approximately three times those for ongoing
assessments, and we used that approximation in our estimates.
For each state each year, we multiplied the number of tests in initial
development by three times the average ongoing development expenditure
to reflect that initial development of assessments is more expensive
than ongoing development.[Footnote 16] We multiplied the number of
ongoing tests by the average ongoing development expenditure. The sum
of these two products provides a development expenditure for each state
in each year and provides a total development estimate. We calculated
three estimates as follows:
* using the expenditure information from states that use multiple-
choice questions, we produced a lower estimate;
* using the information from the state survey on the types of tests
they planned to develop (some indicated both open-ended/multiple-choice
tests and some multiple-choice), we produced a middle
estimate;[Footnote 17] and:
* using the expenditure information from the states that use open-ended
and multiple-choice questions, we produced the higher estimate.
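Under these assumptions, a state's development expenditure for a given year reduces to a simple formula. A minimal sketch (the function name and illustrative figures are ours, not drawn from the report's data):

```python
def annual_development_estimate(new_tests, ongoing_tests, avg_ongoing_cost):
    """Estimated development spending for one state in one year.

    Tests in initial development are costed at three times the average
    ongoing development expenditure, per the approximation above.
    """
    return new_tests * 3 * avg_ongoing_cost + ongoing_tests * avg_ongoing_cost

# Hypothetical example: 2 tests in initial development, 10 in ongoing
# development, at an average ongoing cost of $80,000 per assessment.
example = annual_development_estimate(2, 10, 80_000)  # -> 1,280,000
```

Summing this quantity over states and years yields the total development estimate under each of the three scenarios.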
Estimating Administration, Scoring, and Reporting Expenditures:
To produce an estimate for administration, scoring, and reporting, we
used three variables: the average number of students in a grade; the
number of administered assessments; and the average administration,
scoring, and reporting expenditure per assessment taken. We calculated
the average number of students in a grade in each year using data from
the National Center for Education Statistics' Common Core of Data for
2000-01 and their Projection of Education Statistics to 2011. We
obtained data on the number of administered assessments from our state
education agency survey. Data on average expenditures come from the
states in which we collected detailed expenditure information.
For each state in each year, we multiplied the average number of
students in a grade by the number of administered assessments and by
the appropriate average assessment expenditure. Summing over states and
years provided a total estimate for administration, scoring, and
reporting. As above, we performed these calculations, using the
expenditure information from multiple-choice states to produce the
lower estimate, using the information from the state survey and
expenditure information from both combination and multiple-choice
states to produce a middle estimate, and using the expenditure
information from the combination states to produce the higher estimate.
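The administration, scoring, and reporting calculation for one state in one year is likewise a product of the three variables named above. A minimal sketch (the names and illustrative figures are ours):

```python
def annual_admin_estimate(avg_students_per_grade, assessments_administered,
                          cost_per_assessment_taken):
    """Estimated administration, scoring, and reporting spending for
    one state in one year."""
    return (avg_students_per_grade * assessments_administered
            * cost_per_assessment_taken)

# Hypothetical example: 60,000 students per grade, 14 assessments
# administered, at $2.79 per assessment taken.
example = annual_admin_estimate(60_000, 14, 2.79)  # about $2.34 million
```

Summing this product over states and years gives the total administration, scoring, and reporting estimate for each scenario.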
We also estimated what states are likely to spend on the assessments
that states did not have in place at the time of our survey, but are
required by NCLBA, using the same basic methodology. Table 10 provides
an overview of our approach to estimating states' likely expenditures
on Title I assessments.
Table 10: Estimated Expenditures to Implement Title I Assessments in a
Given Year:
[See PDF for image]
Source: GAO analysis.
[End of table]
We conducted our work in accordance with generally accepted government
auditing standards between April 2002 and March 2003.
[End of section]
Appendix II: Accountability and Assessment Requirements under the 1994
and 2001 Reauthorizations of Title I:
Developing standards for content and performance:
Requirements for 1994: Develop challenging standards for what students
should know in mathematics and reading or language arts. In addition,
for each of these standards, states should develop performance
standards representing three levels: partially proficient, proficient,
and advanced. The standards must be the same for all children. If the
state does not have standards for all children, it must develop
standards for Title I children that incorporate the same skills,
knowledge, and performance expected of other children; Requirements
for 2001: In addition, develop standards for science content by 2005-
06. The same standards must be used for all children.
Implementing and administering assessments:
Requirements for 1994: Develop and implement assessments aligned with
the content and performance standards in at least mathematics and
reading or language arts; Requirements for 2001: Add assessments
aligned with the content and performance standards in science by the
2007-08 school year. These science assessments must be administered at
some time in each of the following grade ranges: grades 3 through 5, 6
through 9, and 10 through 12.
Requirements for 1994: Use the same assessment system to measure Title
I students as the state uses to measure the performance of all other
students. In the absence of a state system, a system that meets Title I
requirements must be developed for use in all Title I schools;
Requirements for 2001: Use the same assessment system to measure Title
I students as the state uses to measure the performance of all other
students. If the state provides evidence to the Secretary that it lacks
authority to adopt a statewide system, it may meet the Title I
requirement by adopting an assessment system on a statewide basis and
limiting its applicability to Title I students or by ensuring that the
Title I local educational agency (LEA) adopts standards and aligned
assessments.
Requirements for 1994: Include in the assessment system multiple
measures of student performance, including measures that assess higher
order thinking skills and understanding; Requirements for 2001:
Unchanged.
Requirements for 1994: Administer assessments for mathematics and
reading in each of the following grade spans: grades 3 through 5, 6
through 9, and 10 through 12; Requirements for 2001: Administer
reading and mathematics tests annually in grades
3 through 8, starting in the 2005-06 school year (in addition to the
assessments previously required sometime within grades 10 through 12);
States do not have to administer mathematics and reading or language
arts tests annually in grades 3 through 8 if Congress does not provide
specified amounts of funds to do so, but states have to continue to
work on the development of the standards and assessments for those
grades; Have students in grades 4 and 8 take the National Assessment
of Educational Progress examinations in reading and mathematics every
other year beginning in 2002-03, as long as the federal government pays
for it.
Requirements for 1994: Assess students with criterion-referenced
assessments, assessments that yield national norms, or both. However,
if the state uses only assessments referenced against national norms at
a particular grade, those assessments must be augmented with additional
items as necessary to accurately measure the depth and breadth of the
state's academic content standards; Requirements for 2001: Unchanged.
Requirements for 1994: Assess students with statewide, local, or a
combination of state and local assessments. However, states that use
all local or a combination of state and local assessments must ensure,
among other things, that such assessments are aligned with the state's
academic content standards, are equivalent to one another, and enable
aggregation to determine whether the state has made adequate yearly
progress; Requirements for 2001: Unchanged.
Requirements for 1994: Implement controls to ensure the quality of the
data collected from the assessments; Requirements for 2001: Unchanged.
Including students with limited English proficiency and with
disabilities in assessments:
Requirements for 1994: Assess students with disabilities and limited
English proficiency according to standards for all other students;
Provide reasonable adaptations and accommodations for students with
disabilities or limited English proficiency to include testing in the
language and form most likely to yield accurate and reliable
information on what they know and can do; Requirements for 2001: By
2002-03 annually assess the language proficiency of students with
limited English proficiency. Students who have attended a U.S. school
for 3 consecutive years must be tested in English unless an individual
assessment by the district shows testing in a native language will be
more reliable.
Reporting data:
Requirements for 1994: Report assessment results according to the
following: by state, local educational agency (LEA), school, gender,
major racial and ethnic groups, English proficiency, migrant status,
disability, and economic disadvantage; Requirements for 2001:
Unchanged.
Requirements for 1994: LEAs must produce for each Title I school a
performance profile with disaggregated results and must publicize and
disseminate these to teachers, parents, students, and the community.
LEAs must also provide individual student reports, including test
scores and other information on the attainment of student performance
standards; Requirements for 2001: Provide annual information on the
test performance of individual students and other indicators included
in the state accountability system by 2002-03. Make this annual
information available to parents and the public and include data on
teacher qualifications. Compare high- and low-poverty schools with
respect to the percentage of classes taught by teachers who are "highly
qualified," as defined in the law, and conduct similar analyses for
subgroups listed in previous law.
Measuring improvement:
Requirements for 1994: Use performance standards to establish a
benchmark for improvement referred to as "adequate yearly progress."
All LEAs and schools must meet the state's adequate yearly progress
standard, for example, having 90 percent of their students performing
at the proficient level in mathematics. LEAs and schools must show
continuous progress toward meeting the adequate yearly progress
standard. The state defines the level of progress a school or LEA must
show. Schools that do not make the required advancement toward the
adequate yearly progress standard can face consequences, such as the
replacement of the existing staff; Requirements for 2001: In addition
to showing gains in the academic achievement of the overall school
population, schools and districts must show that the following
subcategories of students have made gains in their academic
achievement: pupils who are economically disadvantaged, have limited
English proficiency, are disabled, or belong to a major racial or
ethnic group. To demonstrate gains among these subcategories of
students, school districts measure their progress against the state's
definition of adequate yearly progress; States have 12 years to bring
all students to the proficient level.
Consequences for not meeting the adequate yearly
progress standard:
Requirements for 1994: LEAs are required to identify for improvement
any schools that fail to make adequate yearly progress for 2
consecutive years and provide technical assistance to help failing
schools develop and implement required improvement plans. After a
school has failed to meet the adequate yearly progress standard for 3
consecutive years, LEAs must take corrective action to improve the
school; Requirements for 2001: New requirements are more specific as
to what actions an LEA must take to improve failing schools. Actions
are defined for each year the school continues to fail, leading up to
the fifth year of failure, when a school may be restructured by
converting to a charter school, replacing school staff, or undergoing
state takeover of school administration. The new law also provides that
LEAs offer
options to children in failing schools. Depending on the number of
years a school has been designated for improvement, these options may
include going to another public school with transportation paid by the
LEA or using Title I funds to pay for supplemental help.
Source: Pub. L. No. 103-382 (1994) and Pub. L. No. 107-110 (2001).
[End of table]
[End of section]
Appendix III: Number of Tests States Reported They Need to Develop or
Augment to Comply with NCLBA (as of March 2003):
State: Alabama; Number of tests needed: 17.
State: Alaska; Number of tests needed: 9.
State: Arizona; Number of tests needed: 9.
State: Arkansas; Number of tests needed: 9.
State: California; Number of tests needed: 5.
State: Colorado; Number of tests needed: 4.
State: Connecticut; Number of tests needed: 8.
State: Delaware; Number of tests needed: 0.
State: District of Columbia; Number of tests needed: 17.
State: Florida; Number of tests needed: 0.
State: Georgia; Number of tests needed: 0.
State: Hawaii; Number of tests needed: 9.
State: Idaho; Number of tests needed: 17.
State: Illinois; Number of tests needed: 6.
State: Indiana; Number of tests needed: 9.
State: Iowa; Number of tests needed: 0.
State: Kansas; Number of tests needed: 11.
State: Kentucky; Number of tests needed: 8.
State: Louisiana; Number of tests needed: 8.
State: Maine; Number of tests needed: 8.
State: Maryland; Number of tests needed: 15.
State: Massachusetts; Number of tests needed: 6.
State: Michigan; Number of tests needed: 8.
State: Minnesota; Number of tests needed: 11.
State: Mississippi; Number of tests needed: 3.
State: Missouri; Number of tests needed: 8.
State: Montana; Number of tests needed: 17.
State: Nebraska; Number of tests needed: 11.
State: Nevada; Number of tests needed: 11.
State: New Hampshire; Number of tests needed: 9.
State: New Jersey; Number of tests needed: 10.
State: New Mexico; Number of tests needed: 17.
State: New York; Number of tests needed: 8.
State: North Carolina; Number of tests needed: 3.
State: North Dakota; Number of tests needed: 11.
State: Ohio; Number of tests needed: 8.
State: Oklahoma; Number of tests needed: 8.
State: Oregon; Number of tests needed: 6.
State: Pennsylvania; Number of tests needed: 11.
State: Puerto Rico; Number of tests needed: 10.
State: Rhode Island; Number of tests needed: 11.
State: South Carolina; Number of tests needed: 3.
State: South Dakota; Number of tests needed: 17.
State: Tennessee; Number of tests needed: 15.
State: Texas; Number of tests needed: 1.
State: Utah; Number of tests needed: 0.
State: Vermont; Number of tests needed: 9.
State: Virginia; Number of tests needed: 6.
State: Washington; Number of tests needed: 8.
State: West Virginia; Number of tests needed: 17.
State: Wisconsin; Number of tests needed: 17.
State: Wyoming; Number of tests needed: 11.
[End of table]
Source: GAO survey.
[End of section]
Appendix IV: Estimates of Expenditures for Assessments Required by
NCLBA but Not in Place at the Time of Our Survey, FY 2002-08:
Table 11 provides estimates of assessment expenditures states may incur
for grades and subjects they reported they would need to add to meet
the additional assessment requirements under NCLBA. These estimates do
not include any expenditures for continuing development or
administration of assessments in grades and subjects already included
in states' reported assessment programs, unless states indicated plans
to replace their existing assessments. Estimates reflect total
expenditures between fiscal years 2002 and 2008 and are based on the
assumptions we made regarding question types.
Table 11: Estimates of Expenditures for the Assessments Required by
NCLBA That Were Not in Place at the Time of Our Survey, Fiscal Years
2002-08:
Dollars in billions:

Question type                    Estimate   Questions and scoring methods used
Multiple-choice                  $0.8       Estimate assumes that all states use
                                            machine-scored multiple-choice questions.
Current question type            $1.6       Estimate assumes that states use the mix of
                                            question types they reported in our survey.
Multiple-choice and open-ended   $2.0       Estimate assumes that all states use both
                                            machine-scored multiple-choice questions and
                                            some hand-scored open-ended questions.
Source: GAO.
Note: Projections based on state assessment plans and characteristics
and expenditure data gathered from 7 states.
[End of table]
[End of section]
Appendix V: State Development and Nondevelopment Estimates:
Table 12 provides test development and nondevelopment expenditures for
each state between fiscal years 2002 and 2008. Test development
estimates reflect expenditures associated with both new and existing
tests. Nondevelopment expenditures reflect expenditures associated with
administration, scoring, and reporting of results for both new and
existing assessments.
Table 12: Estimates by State, Development, and Nondevelopment
Expenditures:
Dollars in millions:

                        Multiple-choice     Current question    Multiple-choice:
                        and open-ended:     type:
State                   Dev.     Nondev.    Dev.     Nondev.    Dev.     Nondev.
Alabama                 $16      $57        $15      $15        $15      $15
Alaska                  15       14         15       10         13       4
Arizona                 15       93         15       93         14       25
Arkansas                14       39         14       28         13       10
California              13       619        12       223        12       166
Colorado                13       74         13       74         12       20
Connecticut             14       54         14       54         13       15
Delaware                12       13         12       13         11       3
District of Columbia    13       4          12       1          12       1
Florida                 12       269        12       200        11       72
Georgia                 12       162        11       44         11       44
Hawaii                  14       17         14       17         13       5
Idaho                   15       16         14       9          14       4
Illinois                13       198        13       151        12       53
Indiana                 14       99         14       99         13       27
Iowa                    12       50         12       50         11       14
Kansas                  14       37         13       23         13       10
Kentucky                14       58         14       48         13       16
Louisiana               14       67         14       67         13       18
Maine                   14       19         14       19         13       5
Maryland                16       75         16       75         15       20
Massachusetts           13       96         13       96         12       26
Michigan                14       163        14       163        13       44
Minnesota               15       76         15       76         14       20
Mississippi             13       51         13       51         12       14
Missouri                14       85         14       85         13       23
Montana                 16       13         16       12         15       3
Nebraska                13       21         13       21         12       6
Nevada                  14       31         13       13         13       8
New Hampshire           13       18         13       18         12       5
New Jersey              14       113        14       113        13       30
New Mexico              16       25         16       24         15       7
New York                14       262        14       262        13       70
North Carolina          13       139        12       37         12       37
North Dakota            14       9          14       9          13       2
Ohio                    13       158        13       158        12       42
Oklahoma                14       53         13       24         13       14
Oregon                  14       57         13       15         13       15
Pennsylvania            15       166        15       147        14       45
Puerto Rico             14       56         13       15         13       15
Rhode Island            14       13         14       13         13       4
South Carolina          13       73         13       70         12       19
South Dakota            17       10         15       3          15       3
Tennessee               15       70         14       19         14       19
Texas                   12       429        11       221        11       115
Utah                    12       50         12       33         11       13
Vermont                 15       10         15       10         14       3
Virginia                13       116        12       48         12       31
Washington              14       104        14       104        13       28
West Virginia           17       26         16       7          16       7
Wisconsin               15       57         15       51         14       15
Wyoming                 14       7          14       7          13       2
Total                   $724     $4,590     $706     $3,237     $668     $1,233
Source: GAO estimates based on state assessment plans and
characteristics and expenditure data gathered from 7 states.
[End of table]
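The column totals in Table 12 can be reconciled with the report's three headline estimates. The sketch below is illustrative only; the figures (in millions of dollars) are transcribed from the Total row of Table 12, and the scenario labels are taken from the table's column headings.

```python
# Illustrative consistency check: summing the development and
# nondevelopment totals from Table 12 (in millions of dollars)
# reproduces, within rounding, the report's three headline estimates
# of roughly $5.3, $3.9, and $1.9 billion.
table12_totals = {
    "Multiple-choice and open-ended": {"development": 724, "nondevelopment": 4590},
    "Current question type": {"development": 706, "nondevelopment": 3237},
    "Multiple-choice": {"development": 668, "nondevelopment": 1233},
}

for scenario, parts in table12_totals.items():
    grand_total = parts["development"] + parts["nondevelopment"]
    print(f"{scenario}: ${grand_total:,} million (~${grand_total / 1000:.1f} billion)")
```

The sums differ from the totals reported in Table 13 by no more than $2 million, which is consistent with rounding of the underlying figures.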
[End of section]
Appendix VI: Fiscal Years 2002-08 Estimated Expenditures for Each
Question Type:
Table 13 provides estimates for each question type and the benchmark
appropriations by fiscal year from 2002 through 2008. Each estimate
reflects assumptions about the type of questions on the assessments.
For example, the multiple-choice estimate assumes that all states will
use assessments with only multiple-choice questions. These estimates
also assume that states implement the assessment plans reported to us.
The benchmark appropriation is based on actual appropriations in 2002
and 2003 and on the benchmark funding level in NCLBA for 2004-07. We
assumed a benchmark of $400 million in 2008, the same as in 2005, 2006,
and 2007.
Table 13: Estimated Expenditures for Each Question Type, Fiscal Years
2002-08:
Fiscal year (in millions):

Question type                   2002   2003   2004   2005   2006   2007   2008    Total
Multiple-choice                 $165    237    288    291    293    308    318   $1,901
Current question type           $324    442    572    615    633    665    692   $3,944
Multiple-choice and open-ended  $445    586    761    824    855    903    941   $5,313
Benchmark appropriation         $366    376    390    400    400    400    400   $2,733
Source: GAO estimates based on state assessment plans and
characteristics and expenditure data gathered from 7 states.
Note: Fiscal years 2002 through 2008 sums may not equal the total
because of rounding.
[End of table]
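The note to Table 13 states that yearly figures may not sum to the totals because of rounding. A brief sketch (illustrative only; figures in millions, transcribed from the table) makes the size of that rounding difference explicit:

```python
# Illustrative rounding check for Table 13: compare the sum of the
# rounded fiscal year 2002-08 figures (in millions of dollars) with
# the reported total for each row.
yearly = {
    "Multiple-choice": [165, 237, 288, 291, 293, 308, 318],
    "Current question type": [324, 442, 572, 615, 633, 665, 692],
    "Multiple-choice and open-ended": [445, 586, 761, 824, 855, 903, 941],
    "Benchmark appropriation": [366, 376, 390, 400, 400, 400, 400],
}
reported = {
    "Multiple-choice": 1901,
    "Current question type": 3944,
    "Multiple-choice and open-ended": 5313,
    "Benchmark appropriation": 2733,
}

for row, values in yearly.items():
    difference = sum(values) - reported[row]
    print(f"{row}: sum of yearly figures differs from total by {difference} million")
```

For every row, the discrepancy is $2 million or less, as the rounding note implies.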
[End of section]
Appendix VII: Comments from the Department of Education:
UNITED STATES DEPARTMENT OF EDUCATION:
THE UNDER SECRETARY:
April 29, 2003:
Ms. Marnie S. Shaul
Director, Education, Workforce, and Income Security Issues
United States General Accounting Office
Washington, DC 20548
Dear Ms. Shaul:
I am writing in response to the General Accounting Office's (GAO) draft
report, "Title I: Characteristics of Tests Will Influence Expenses;
Guidance May Help States Realize Efficiencies." We appreciate the
opportunity to review and respond.
Problems with Estimating the Costs of Testing:
While the draft report contains some useful information on the
estimated costs of testing in the seven States studied, the report goes
on to project these estimates on to all other States, which makes the
report much less valuable and possibly misleading. We are very
concerned about the inclusion and the weight given to the estimates of
costs for each State based on estimates of the costs in the particular
circumstances of only seven States studied in depth by GAO. In effect,
this section of the draft report uses multiple levels of assumptions,
which results in estimates that have the potential to be substantially
in error. The GAO report ends up with three specific cost estimates for
each State that have a ring of authority that we believe is
significantly out of proportion to the confidence one can place in
them.
While the other forty-five "States" (including the District of Columbia
and Puerto Rico) apparently responded to survey questions, it does not
appear that they provided the level of detailed cost information used
in the draft report on the costs for each of the States. As the study
acknowledges, the factors in computing and estimating costs are very
specific to the circumstances of each State, and cannot be generalized.
Many factors can affect the costs in different States to make the
estimates wrong and misleading. For example, the report cites the types
of questions included in State assessments as one of the main reasons
for different costs; yet 48 percent of the States reported that they
are uncertain about the type of questions they will include on future
tests, thus making projected costs in those States suspect. The report
also cites other factors such as the number of different forms of
assessments used, and the extent of public release of questions. We
believe that there are many other factors that may also be crucial,
such as the scoring of assessments through outside contracts versus the
scoring of assessments by in-house staff, the expertise and experience
of State staff, and many other individual characteristics of a State,
including specific characteristics of its student population.
The number of new tests that each State reported it would need to
develop is found in Table 3 and used subsequently to estimate
development costs by multiplying the number of tests in initial
development by 3 times the average ongoing development expenditure. The
assumption behind this calculation is questionable because: a) the GAO
reports that they were typically not able to separate the costs of new
test development from ongoing test development, and b) the costs of
initial test development will vary tremendously by the nature of the
test. For instance, to develop an 8th-grade reading test when grades 3-
7 already are being tested should be a trivial expense compared to
developing a science assessment when none currently exists. A more
reasonable estimate for test development could be derived from the
total test development expenditure in the seven States surveyed, since
that includes initial and ongoing test development.
We also question the draft report's analysis of question type--i.e.,
multiple-choice versus open-ended questions--as a key determinant of
costs. The draft report fails to differentiate open-ended questions
that involve short factual answers from open-ended questions that
involve lengthy writing samples. The costs of the latter will be quite
high compared to the costs of the former. The draft report also does
not consider the proportion of open-ended questions employed in an
assessment. The functions of open-ended questions can be provided by
relatively small proportions of such questions compared to multiple-
choice questions. By not taking into account the nature of open-ended
questions, and by not adjusting for the ability of States to retain
open-ended questions while lowering costs through reducing the
proportion of such questions, the draft report likely over-estimates
substantially the assessment costs in the upper two of its three
estimates.
The draft report assumes that costs will be lower in the first few
years of test administration and will increase in later years when more
tests are being developed and administered. One could reasonably argue
the opposite, that costs are always greater at the outset and that
States are likely to combine their test development process in a single
content area such as reading across grades 3-8 in the initial years. As
a result, "out-year" costs would be lower. Moreover, GAO projections do
not take into account the results of increasing competition as more
companies enter the burgeoning State assessment market. Likewise, no
provision is made for advances in technology. There are already
companies in the market that are capable of administering State
assessments with handheld computers. Software currently available can
score open-ended questions. The forces of competition and technology
almost surely will drive down costs in the development and
administration of State assessments.
Given that the draft report itself acknowledges, in a footnote on page
26, that the estimates "may be biased," and based on the many problems
noted in this response, we strongly recommend deleting the information
on estimated costs for the States not studied directly.
We believe the report is also misleading in suggesting that all of the
estimated costs are generated by the testing requirements in the No
Child Left Behind Act. Educational assessments are an inherent
responsibility of the States, and many States (and, in many cases,
school districts) have already developed and administered tests that
would meet No Child Left Behind requirements. Many of these costs have
been borne or would be borne by the States irrespective of the No Child
Left Behind Act. We think the report needs to make the point that not
all of these costs are incremental costs generated by the Act.
Problems in Indicating the Sources of Federal Funding:
In addition, the draft report contains information on sources of
Federal funding, but does not, by any means, provide a complete
picture. For example, it appears to focus on the funding under one
Federal program under which testing costs are allowable. But it does
not include other funding sources in the No Child Left Behind Act under
which testing costs would also be allowable, such as Title I, Part A
administrative costs, consolidated administrative costs under Section
9201 of the Elementary and Secondary Education Act (ESEA), Title V of
ESEA, and the additional funds that may be transferred under Title VI
of ESEA to various funding sources. Thus, it seriously understates
available resources provided under Federal programs.
The Recommendation for Sharing Information:
We have no problems with the one recommendation contained in the report
--that the Department of Education use its existing mechanisms to
facilitate the sharing of information among States regarding assessment
development and administration as States attempt to reduce expenses. As
one example of information sharing already undertaken and facilitated
by the Department, we suggest that you include in your report
information on the Enhanced Assessment Grants, as authorized under
Title VI of the No Child Left Behind Act. In February 2003, the
Department awarded $17 million to fund projects aimed at improving the
quality of State assessment instruments, especially for students with
disabilities and students of limited English proficiency. In selecting
grant recipients, the Department awarded priority points to
applications submitted by consortia of States. All nine awards went to
State consortia, ranging from three to fifteen States per consortium.
The Department takes very seriously its commitment to provide technical
assistance to State and local grantees and looks forward to continuing
and enhancing its efforts to share information on assessment
development and administration to help States reduce costs and make the
accountability provisions of the No Child Left Behind Act as effective
and efficient as possible.
We additionally suggest a change in the title of the report to more
accurately reflect the report's content. We suggest substituting
"Information Sharing" for "Guidance" so that the title reads
"Characteristics of Tests Will Influence Expenses: Information Sharing
May Help States Realize Efficiencies."
We appreciate the opportunity to provide comments on this draft report,
and would be glad to work with your office to make the report more
reliable and useful.
Sincerely,
Eugene W. Hickok
Signed by Eugene W. Hickok
[End of section]
Appendix VIII: GAO Contacts and Staff Acknowledgments:
GAO Contacts:
Sherri Doughty (202) 512-7273
Jason Palmer (202) 512-3825
Staff Acknowledgments:
In addition to those named above, Lindsay Bach, Cindy Decker, and
Patrick DiBattista made important contributions to this report.
Theresa Mechem provided assistance with graphics.
FOOTNOTES
[1] NCLBA authorizes funding for assessments through fiscal year 2007.
However, consistent with the mandate for this study, we examined
expenditures for fiscal years 2002 through 2008, enabling us to more
fully capture expenditures associated with the science assessments,
which are required to be administered in school year 2007-08.
[2] A norm-referenced test evaluates an individual's performance in
relation to the performance of a large sample of others, usually
selected to represent all students nationally in the same grade or age
range. Criterion-referenced tests measure the mastery of specific
skills or subject content and focus on an individual's performance as
measured against a standard or criterion rather than against the
performance of others taking the test.
[3] The Secretary of Education may provide states 1 additional year if
the state demonstrates that exceptional or uncontrollable
circumstances, such as a natural disaster or a precipitous and
unforeseen decline in the financial resources of the state, prevented
full implementation of the academic assessments by the deadlines.
[4] U.S. General Accounting Office, Title I: Education Needs to Monitor
States' Scoring of Assessments, GAO-02-393 (Washington, D. C.: Apr. 1,
2002).
[5] According to Education, there are also other sources of funding in
NCLBA that states may draw upon for assessment related expenses.
[6] NASBE and AccountabilityWorks made different assumptions regarding
what costs would vary with the number of students tested and which
would be invariant costs. For example, NASBE assumed that development
costs would vary by the number of students taking the test and
AccountabilityWorks did not. Additionally, AccountabilityWorks reports
having verified its assumptions with officials from two states, while
the authors of the NASBE study do not report having verified their
assumption with state officials.
[7] This amount is generally 1 percent of the amount that states
receive under Title I or $400,000, whichever is greater.
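As a minimal sketch of the benchmark this footnote describes, the amount is the greater of two figures; the Title I receipt amounts in the example are hypothetical:

```python
def minimum_assessment_funds(title_i_amount):
    """Benchmark from the footnote above: the greater of 1 percent of a
    state's Title I amount or $400,000."""
    return max(0.01 * title_i_amount, 400_000)

# Hypothetical Title I receipts:
print(minimum_assessment_funds(100_000_000))  # 1 percent exceeds the $400,000 floor
print(minimum_assessment_funds(10_000_000))   # the $400,000 floor applies
```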
[8] The District of Columbia and Puerto Rico are included in our state
totals.
[9] Two states reported that they did not provide accommodations for
students with disabilities at the state level; however, accommodations
may have been provided at the local school level.
[10] New York offers its assessments in Spanish, Korean, Haitian
Creole, and Russian, and Minnesota offers its mathematics assessments
in Spanish, Hmong, Somali, and Vietnamese.
[11] In Texas and Colorado, we were unable to separate scoring
expenditures from other types of expenditures.
[12] This may not be true for smaller states because they may have
fewer assessments to administer, score, and report.
[13] The study also does not estimate the opportunity costs of
assessments.
[14] Because our expenditure data were limited to 7 states, our
estimates may be biased. For example, if the 7 states we selected had
higher average development expenditures per ongoing assessment than the
average state, then our estimate of development expenditures would be
biased upwards.
[15] We were unable to obtain information on personnel expenditures
from 5 of the 7 states, and so we did not include personnel
expenditures in our analysis. In the 2 states in which we obtained
personnel expenditures, such expenditures were a relatively small part
of the assessment budget.
[16] We found estimates were not sensitive to changes in assumptions
regarding development costs, partly because they proved to be a
generally small portion of overall expenses.
[17] For states that reported that they did not know the kinds of
questions they would use on future tests, we assumed that future tests
would use the same kinds of questions as current tests. Where data were
missing, we assumed that states would use assessments with both
multiple-choice and open-ended questions, potentially biasing our
estimates upward.
GAO's Mission:
The General Accounting Office, the investigative arm of Congress,
exists to support Congress in meeting its constitutional
responsibilities and to help improve the performance and accountability
of the federal government for the American people. GAO examines the use
of public funds; evaluates federal programs and policies; and provides
analyses, recommendations, and other assistance to help Congress make
informed oversight, policy, and funding decisions. GAO's commitment to
good government is reflected in its core values of accountability,
integrity, and reliability.
Obtaining Copies of GAO Reports and Testimony:
The fastest and easiest way to obtain copies of GAO documents at no
cost is through the Internet. GAO's Web site (www.gao.gov) contains
abstracts and full-text files of current reports and testimony and an
expanding archive of older products. The Web site features a search
engine to help you locate documents using key words and phrases. You
can print these documents in their entirety, including charts and other
graphics.
Each day, GAO issues a list of newly released reports, testimony, and
correspondence. GAO posts this list, known as "Today's Reports," on its
Web site daily. The list contains links to the full-text document
files. To have GAO e-mail this list to you every afternoon, go to
www.gao.gov and select "Subscribe to daily E-mail alert for newly
released products" under the GAO Reports heading.
Order by Mail or Phone:
The first copy of each printed report is free. Additional copies are $2
each. A check or money order should be made out to the Superintendent
of Documents. GAO also accepts VISA and Mastercard. Orders for 100 or
more copies mailed to a single address are discounted 25 percent.
Orders should be sent to:
U.S. General Accounting Office
441 G Street NW, Room LM
Washington, D.C. 20548
To order by phone:
Voice: (202) 512-6000
TDD: (202) 512-2537
Fax: (202) 512-6061
To Report Fraud, Waste, and Abuse in Federal Programs:
Contact:
Web site: www.gao.gov/fraudnet/fraudnet.htm
E-mail: fraudnet@gao.gov
Automated answering system: (800) 424-5454 or (202) 512-7470
Public Affairs:
Jeff Nelligan, managing director, NelliganJ@gao.gov, (202) 512-4800
U.S. General Accounting Office, 441 G Street NW, Room 7149
Washington, D.C. 20548