This is the accessible text file for GAO report number GAO-06-775
entitled 'Estimating the Undocumented Population: A "Grouped Answers"
Approach to Surveying Foreign-Born Respondents' which was released on
September 29, 2006.
This text file was formatted by the U.S. Government Accountability
Office (GAO) to be accessible to users with visual impairments, as part
of a longer term project to improve GAO products' accessibility. Every
attempt has been made to maintain the structural and data integrity of
the original printed product. Accessibility features, such as text
descriptions of tables, consecutively numbered footnotes placed at the
end of the file, and the text of agency comment letters, are provided
but may not exactly duplicate the presentation or format of the printed
version. The portable document format (PDF) file is an exact electronic
replica of the printed version. We welcome your feedback. Please E-mail
your comments regarding the contents or accessibility features of this
document to Webmaster@gao.gov.
This is a work of the U.S. government and is not subject to copyright
protection in the United States. It may be reproduced and distributed
in its entirety without further permission from GAO. Because this work
may contain copyrighted images or other material, permission from the
copyright holder may be necessary if you wish to reproduce this
material separately.
United States Government Accountability Office:
GAO:
Report to the Subcommittee on Terrorism, Technology, and Homeland
Security, Committee on the Judiciary, U.S. Senate:
September 2006:
Estimating the Undocumented Population:
A "Grouped Answers" Approach to Surveying Foreign-Born Respondents:
GAO-06-775:
GAO Highlights:
Highlights of GAO-06-775, a report to the Subcommittee on Terrorism,
Technology and Homeland Security, Committee on the Judiciary, U.S.
Senate
Why GAO Did This Study:
As greater numbers of foreign-born persons enter, live, and work in the
United States, policymakers need more information--particularly on the
undocumented population, its size, characteristics, costs, and
contributions. This report reviews the ongoing development of a
potential method for obtaining such information: the "grouped answers"
approach. In 1998, GAO devised the approach and recommended further
study. In response, the Census Bureau tested respondent acceptance and
recently reported results. GAO answers four questions. (1) Is the
grouped answers approach acceptable for use in a national survey of the
foreign-born? (2) What further research may be needed? (3) How large a
survey is needed? (4) Are any ongoing surveys appropriate for inserting
a grouped answers question series (to avoid the cost of a new survey)?
For this study, GAO consulted an independent statistician and other
experts, performed test calculations, obtained documents, and
interviewed officials and staff at federal agencies. The Census Bureau
and DHS agreed with the main findings of this report. HHS agreed that
the National Survey on Drug Use and Health is not an appropriate survey
for inserting a grouped answers question series.
What GAO Found:
The grouped answers approach is designed to ask foreign-born
respondents about their immigration status in a personal-interview
survey. Immigration statuses are grouped in Boxes A, B, and C on two
different flash cards--with the undocumented status in Box B.
Respondents are asked to pick the box that includes their current
status and are told, "If it's in Box B, we don't want to know which
specific category applies to you." A random half of respondents are
shown the card on the left of the figure (Card 1), resulting in
estimates of the percentage of the foreign-born population who are in
each box of that card. The other half of the respondents are shown the
card on the right, resulting in corresponding estimates for slightly
different boxes. (No one sees both cards.) The percentage undocumented
is estimated by subtraction: The percentage of the foreign-born who are
in Box B of one card minus the percentage who are in Box A of the other
card.
Figure: Immigration Status Cards 1 and 2:
[See PDF for Image]
Source: GAO; Corel Draw (flag and suitcase); DHS (resident alien
cards). (The actual size of each card is 8-1/2" by 11").
[End of Figure]
The grouped answers approach is acceptable to many experts and
immigrant advocates--with certain conditions, such as (for some
advocates) private sector data collection. Most respondents tested did
not object to picking a box. Research is needed to assess issues such
as whether respondents pick the correct box. A sizable survey--roughly
6,000 or more respondents--would be needed for 95 percent confidence and
a margin of error of (plus or minus) 3 percentage points. The ongoing
surveys that GAO identified are not appropriate for collecting data on
immigration status. (For example, one survey takes names and Social
Security numbers, which might affect acceptance of immigration status
questions.) Whether further research or implementation in a new survey
would be justified depends on how policymakers weigh the need for such
information against potential costs and the uncertainties of future
research.
What GAO Recommends:
GAO makes no new recommendations in this report.
[Hyperlink, http://www.gao.gov/cgi-bin/getrpt?GAO-06-775].
To view the full product, including the scope and methodology, click on
the link above. For more information, contact Nancy R. Kingsbury at
(202) 512-2700 or kingsburyn@gao.gov.
[end of Section]
Contents:
Letter:
Results in Brief:
Background:
Experts Seem to Accept "Grouped Answers" Questions If Fielded by a
Private Sector Organization:
Various Tests Are or May Be Needed:
Some 6,000 Foreign-Born Respondents Are Needed for "Reasonably Precise"
Estimates of the Undocumented:
The Most Efficient Field Strategy Does Not Seem Feasible:
Observations:
Agency Comments:
Appendix I: Scope and Methodology:
Appendix II: Estimating Characteristics, Costs, and Contributions of
the Undocumented Population:
Appendix III: A Review of Census Bureau and GAO Reports on the Field
Test of the Grouped Answer Method:
Appendix IV: A Brief Examination of Responses Observed while Testing an
Indirect Method for Obtaining Sensitive Information:
Appendix V: The Issue of Informed Consent:
Appendix VI: A Note on Variances and "Mirror Image" Estimates:
Appendix VII: Comments from the Department of Commerce:
Appendix VIII: Comments from the Department of Homeland Security:
Appendix IX: Comments from the Department of Health and Human Services:
Appendix X: GAO Contact and Staff Acknowledgments:
Bibliography:
Tables:
Table 1: Approximate Number of Foreign-Born Respondents Needed to
Estimate Percentage Undocumented within 2, 3, or 4 Percentage Points at
90 Percent Confidence Level, Using Two-Card Grouped Answers Data:
Table 2: Approximate Number of Foreign-Born Respondents Needed to
Estimate Percentage Undocumented, within 2, 3, or 4 Percentage Points,
at 95 Percent Confidence Level, Using Two-Card Grouped Answers Data:
Table 3: Survey Appropriateness: Whether Surveys Meet Criteria Based on
the Grouped Answers Design:
Table 4: Survey Appropriateness: Whether Surveys Meet Table 3 (Design
Based) Criteria and Additional Criteria Based on Immigrant Advocates'
Views:
Table 5: Experts GAO Consulted on Immigration Issues or Immigration
Studies:
Figures:
Figure 1: Immigration Status Card 1, Grouped Answers:
Figure 2: Immigration Status Card 2:
Figure 3: Cards 1 and 2 Compared:
Figure 4: SIPP Flash Card:
Figure 5: Training Card 1:
Figure 6: Training Card 2:
Figure 7: Immigration Status Card Tested in GSS:
Abbreviations
ACS: American Community Survey:
BLS: Bureau of Labor Statistics:
CASI: Computer Assisted Self Interview:
CPS: Current Population Survey:
DHS: Department of Homeland Security:
GSS: General Social Survey:
HHS: Department of Health and Human Services:
INS: Immigration and Naturalization Service:
NAWS: National Agricultural Workers Survey:
NCHS: National Center for Health Statistics:
NHIS: National Health Interview Survey:
NORC: National Opinion Research Center:
NRC: National Research Council:
NSDUH: National Survey on Drug Use and Health:
NSF: National Science Foundation:
OMB: Office of Management and Budget:
SAMHSA: Substance Abuse and Mental Health Services Administration:
SIPP: Survey of Income and Program Participation:
[End of Section]
September 29, 2006:
The Honorable Jon Kyl:
Chairman:
The Honorable Dianne Feinstein:
Ranking Minority Member:
Subcommittee on Terrorism, Technology and Homeland Security:
Committee on the Judiciary:
United States Senate:
As greater numbers of foreign-born persons enter, live, and work in the
United States, policymakers and the general public increasingly place
high priority on issues involving immigrants. Because separate
policies, laws, and programs apply to different immigration statuses,
valid and reliable information is needed for populations defined by
immigration status. However, government statistics generally do not
include such information.
The information most difficult to obtain concerns the size,
characteristics, costs, and contributions of the population referred to
in this report as undocumented or currently undocumented.[Footnote 1]
Such information is needed because, for example, large numbers of
undocumented persons arrive each year, and the Census Bureau has
realized that information on the size of the undocumented population
would help estimate the size of the total U.S. population, especially
for years between decennial censuses.[Footnote 2] More generally,
information about the undocumented population--and about changes in
that population--can contribute to policy-related planning and
evaluation efforts.
As you know, in 1998, we devised an approach to surveying foreign-born
respondents about their immigration status.[Footnote 3] This self-
report, personal-interview approach groups answers so that no
respondent is ever asked whether he, she, or anyone else is
undocumented. In fact, no individual respondent is ever categorized as
undocumented. Logically, however, grouped answers data can provide
indirect estimates of the undocumented population. Generally, grouped
answers questions on immigration status would be asked as part of a
larger survey that includes direct questions on demographic
characteristics and employment and might include questions on school
attendance, use of medical facilities, and so forth; some surveys also
ask specific questions that can help estimate taxes paid. Potentially,
combining the answers to such questions with grouped answers data can
provide further information on the characteristics, costs, and
contributions of the undocumented population.
We reported the first results of preliminary tests of the grouped
answers approach, primarily with Hispanic farmworkers, in 1998 and
1999; the majority of the preliminary test interviews were fielded by
Aguirre International of Burlingame, California.[Footnote 4] We also
recommended that the Immigration and Naturalization Service (INS) and
the Census Bureau further develop and test the method. In response, the
Census Bureau contracted for a test as part of the 2004 General Social
Survey (GSS), which is fielded by the National Opinion Research Center
(NORC) at the University of Chicago, with "core funding" provided by a
grant from the National Science Foundation (NSF).[Footnote 5] The
Census Bureau's analysis of the 2004 GSS data became available in 2006.
In this report, we respond to your request that we review the ongoing
development of the grouped answers approach and related issues. We
address four questions: (1) Is the grouped answers approach
"acceptable" for use in a national survey of the foreign-born
population?[Footnote 6] (2) What kinds of further research are or may
be needed, based on the results of tests conducted thus far and expert
opinion? (3) How large a survey is needed to provide "reasonably
precise" estimates of the undocumented population, using grouped
answers data? (4) Are there appropriate ongoing surveys in which the
grouped answers question series might eventually be inserted (thus
avoiding the costs of fielding a new survey)?
To answer these questions, we:
* consulted private sector experts in immigration issues and studies,
including immigrant advocates, immigration researchers, and
others;[Footnote 7]
* consulted an independent statistical expert, Dr. Alan Zaslavsky, and
other experts in statistics and surveys;[Footnote 8]
* reanalyzed the data from the 2004 GSS test and subjected both our
analysis and the Census Bureau's analysis to review by the independent
statistical expert;
* performed test calculations, using specific assumptions; and:
* identified ongoing surveys that might be candidates for piggybacking
the grouped answers question series, gathered documents on those
surveys, and met with officials and staff at the federal agencies that
conduct or sponsor them.[Footnote 9]
We also met with other relevant federal agencies.[Footnote 10] Appendix
I describes our methodology and the scope of our work in more detail.
We conducted our work in accordance with generally accepted government
auditing standards between July 2005 and September 2006.
Results in Brief:
Acceptance of the grouped answers approach appears to be high among
immigrant advocates and respondents. The advocates we interviewed
generally accepted the approach--with provisos such as fielding by a
university or other private sector organization, appropriate data
protection (including protections against government misuse), and high-
quality survey procedures. The independent statistician, reviewing the
Census Bureau's analysis and our reanalysis of the 2004 GSS test of
respondent acceptance, concluded that the grouped answers approach is
"generally usable" for surveys interviewing foreign-born respondents in
their homes.[Footnote 11]
Based on the results of the GSS test and on consultations and
interviews with varied experts, further work is or may be needed to:
* Expand knowledge about respondent acceptance. For example, the 2004
GSS test did not cover persons who are "linguistically isolated" in the
sense that no member of their household age 14 or older speaks English
"very well".[Footnote 12]
* Test the accuracy of responses or respondents' intent to answer
accurately.[Footnote 13] To date, no tests of response accuracy, or the
intent to answer accurately, have been conducted, although a number of
relevant designs can be identified.
Thousands of foreign-born respondents would be needed to obtain
"reasonably precise" grouped answers estimates of the undocumented
population.[Footnote 14] Our calculations and work with statisticians
showed that while many factors are involved and it is not possible to
guarantee a specific level of precision, roughly 6,000 interviews would
be likely to be sufficient to support estimates of the size of the
undocumented population and major subgroups within it (especially high-
risk subgroups, defined by characteristics such as age 18 to 40,
recently arrived, employed[Footnote 15]). Quantitative estimates are
also possible; for example, major program costs associated with the
undocumented population may also be estimated, given appropriate
program data.
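For a rough sense of scale, the back-of-envelope sketch below computes the margin of error for the grouped answers difference estimate at the 95 percent confidence level, assuming simple random sampling, an even split between the two cards, and hypothetical box percentages. It understates the sample actually required because it ignores design effects, nonresponse, and the other factors noted above; tables 1 and 2 reflect the fuller calculations.

# Back-of-envelope sketch only (not GAO's actual calculation): approximate
# 95 percent margin of error for the grouped answers difference estimate,
# assuming simple random sampling, an even split between Card 1 and Card 2,
# and hypothetical box percentages. Design effects and nonresponse are ignored.
import math

def margin_of_error(n_total, p_box_b_card1=0.62, p_box_a_card2=0.33, z=1.96):
    """Margin of error (in percentage points) for the difference of two
    independent proportions, each estimated from half the sample."""
    n_half = n_total / 2
    variance = (p_box_b_card1 * (1 - p_box_b_card1)
                + p_box_a_card2 * (1 - p_box_a_card2)) / n_half
    return 100 * z * math.sqrt(variance)

for n in (4000, 6000, 8000):
    print(f"{n} respondents: +/- {margin_of_error(n):.1f} percentage points")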
None of the ongoing, large-scale national surveys we identified appear
to be appropriate for piggybacking the grouped answers question series.
One self-report personal interview survey is fielded by a private
sector organization (under a contract with a Department of Health and
Human Services (HHS) agency); however, that survey focuses on the use
of illegal drugs, and we believe that direct questions on drug use
might heighten the sensitivity of the questions on immigration status.
We believe other ongoing surveys to be inappropriate; for example, one
asks other sensitive questions (on HIV status) and takes respondents'
names and Social Security numbers. Additionally, the Census Bureau
fields these surveys.
Whether further research or a new survey would be justified depends on
issues such as how policymakers weigh the need for such information
against potential costs.
We received comments on a draft of this report from the Department of
Commerce (Census Bureau), the Department of Homeland Security (DHS),
and the Department of Health and Human Services (HHS). The Census
Bureau and DHS generally agreed with the main findings of the report,
and HHS agreed that the National Survey on Drug Use and Health would
not be appropriate for "piggybacking" the grouped answers question
series. These agencies also provided other technical comments (see
appendices VII, VIII, and IX).
Background:
Grouped Answers Reduce "Question Threat" and Allow Indirect Estimates
of the Undocumented:
Survey questions about sensitive topics carry a "threat" for some
respondents, because they fear that a truthful answer could result in
some degree of negative consequence (at a minimum, social disapproval).
The grouped answers approach is designed to reduce this threat when
asking about immigration status.
Three key points about the grouped answers approach are that:
1. no respondent is ever asked whether he or she, or anyone else, is
undocumented;
2. two pieces of information are separately provided by two subsamples
of respondents (completely different people--no one is shown both
immigration status cards); and:
3. taking the two pieces of information together--like two different
pieces of a puzzle--allows indirect estimation of the undocumented
population, but no individual respondent (and no piece of data on an
individual respondent) is ever categorized as undocumented.
We discuss each point in some detail.[Footnote 16]
1. No respondent is ever asked whether he or she is in the undocumented
category. Unlike questions that ask respondents to choose among
specific answer categories, the grouped answers approach combines
answer categories in sets or "boxes," as shown in figure 1.
Figure 1: Immigration Status Card 1, Grouped Answers:
[See PDF for image]
Sources: GAO; Corel Draw (flag and suitcase); DHS (resident alien
cards). (The actual size of the card is 8-1/2" by 11").
[End of figure]
Box B includes the sensitive answer category--currently
"undocumented"--along with other categories that are
nonsensitive.[Footnote 17]
Each respondent is asked to "pick the Box"--Box A, Box B, or Box C--
that contains the specific answer category that applies to him or her.
Respondents are told, in effect: If the specific category that applies
to you is in Box B, we don't want to know which one it is, because
right now we are focusing on Box A categories.[Footnote 18]
By using the boxes, the interview avoids "zeroing in" on the sensitive
answer. The specific categories shown in the boxes in figure 1 are
grouped so that:
* one would expect many respondents who are here legally, as well as
those who are undocumented, to choose Box B,[Footnote 19] and:
* there is virtually no possibility of anyone deducing which specific
category within Box B applies to any individual respondent.
2. Two pieces of information are provided separately by two subsamples
of respondents (no one is shown both immigration status cards).
Respondents are divided into two subsamples, based on randomization
procedures or rotation (alternation) procedures conducted outside the
interview process. (For example, a rotation procedure might specify
that within an interviewing area, every other household will be
designated as subsample 1 or subsample 2.)
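The sketch below illustrates the rotation example just described; the household identifiers are hypothetical, and an actual survey would carry out this step in its sample-management procedures rather than in the interview itself.

# Minimal sketch of the rotation (alternation) procedure described above:
# within an interviewing area, every other household is designated as
# subsample 1 (shown Card 1) or subsample 2 (shown Card 2). Household
# identifiers are hypothetical; no household is ever shown both cards.
def assign_by_rotation(household_ids):
    return {hh: 1 if i % 2 == 0 else 2 for i, hh in enumerate(household_ids)}

print(assign_by_rotation(["HH-001", "HH-002", "HH-003", "HH-004"]))
# {'HH-001': 1, 'HH-002': 2, 'HH-003': 1, 'HH-004': 2}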
This "split sample" procedure has been used routinely for many surveys
over the years. As applied to the grouped answers approach, the two
subsamples are shown alternative flash cards. Immigration Status Card
1, described above, represents one way to group immigration statuses in
three boxes. A second immigration status flash card (Immigration Status
Card 2, shown in figure 2) groups the same statuses differently.
Figure 2: Immigration Status Card 2:
[See PDF for image]
Sources: GAO; Corel Draw (flag and suitcase); DHS (resident alien
cards). (The actual size of the card is 8-1/2" by 11").
[End of figure]
The alternative immigration-status cards can be thought of as "mirror
images" in that:
* the two nonsensitive legal statuses in Box A of Card 1 appear in Box
B of Card 2 and:
* the two nonsensitive legal statuses in Box B of Card 1 appear in Box
A of Card 2.
However, the undocumented status always appears in Box B.
Interviewers ask survey respondents in subsample 1 about immigration
status with respect to Card 1. They ask survey respondents in subsample
2 (completely different persons) about immigration status with respect
to Card 2. Each respondent is shown one and only one immigration-status
flash card. There are no highly unusual or complicated interviewing
procedures.[Footnote 20]
Because the two subsamples of respondents are drawn randomly or by
rotation, each subsample represents the foreign-born population and, if
sufficiently large, can provide "reasonably precise" estimates of the
percentages of the foreign-born population in the boxes on one of the
alternative cards.
Incidentally, a respondent picking a box that does not include the
sensitive answer--for example, a respondent picking Box A or Box C in
figure 1--can be asked follow-up questions that pinpoint the specific
answer category that applies to him or her. Thus, direct information is
obtained on all legal immigration statuses. The data on some of the
legal categories can be compared to administrative data to check the
reasonableness of responses. Additionally, these data provide estimates
of legal statuses, which are useful when, for example, policymakers
review legislation on the numbers of foreign-born persons who may be
admitted to this country under specific legal status programs.
3. No individual respondent is ever categorized as undocumented, but
indirect estimates of the undocumented population can be made. Using
two slightly different pieces of information provided by the two
different subsamples allows indirect estimation of the size of the
currently undocumented population--by simple subtraction.
The only difference between Box B of Card 1 and Box A of Card 2 is the
inclusion of the currently "undocumented" category in Box B of Card 1.
Figure 3 shows both cards together for easy comparison.
Figure 3: Cards 1 and 2 Compared:
[See PDF for image]
Sources: GAO; Corel Draw (flag and suitcase); DHS (resident alien
cards). (The actual size of the card is 8-1/2" by 11").
[End of figure]
Thus, the percentage of the foreign-born population who are currently
undocumented can be estimated as follows:
* Start with the percentage of subsample 1 respondents who report that
they are in Box B of Card 1 (hypothetical figure: 62 percent of
subsample 1).
* Subtract from this the percentage of subsample 2 who say they are in
Box A on Card 2 (hypothetical figure: 33 percent of subsample 2).
* Observe the difference (29 percent, based on the hypothetical
figures); this represents an estimate of the percentage of the foreign-
born population who are undocumented.
Alternatively, a "mirror-image" estimate could be calculated, using Box
B of Card 2 and Box A of Card 1.[Footnote 21]
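The following minimal sketch restates the subtraction estimator, using the hypothetical figures above and hypothetical figures for the mirror-image version; only aggregate box percentages enter the calculation.

# Sketch of the grouped answers subtraction estimator. Only aggregate box
# percentages are used; no individual respondent is classified as undocumented.
def pct_undocumented(pct_box_b_one_card, pct_box_a_other_card):
    """Box B of one card minus Box A of the other card; the only category
    that differs between the two boxes is 'currently undocumented'."""
    return pct_box_b_one_card - pct_box_a_other_card

primary = pct_undocumented(62.0, 33.0)  # hypothetical figures from the text: 29
# Mirror-image estimate, from Box B of Card 2 and Box A of Card 1
# (hypothetical figures chosen to be consistent with the same 29 percent):
mirror = pct_undocumented(60.0, 31.0)
print(primary, mirror)  # 29.0 29.0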
To estimate the numerical size of the undocumented population, a
grouped answers estimate of the percentage of the foreign-born who are
undocumented would be combined with a census figure. For example, the
2000 census counted 31 million foreign-born, and the Census Bureau
issued an updated estimate of 35.7 million for 2005. The procedure
would be to simply multiply the percent undocumented (based on the
grouped answers data and the subtraction procedure) by a census count
or an updated estimate for the year in question.
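Completing the illustration with the figures cited above, a hypothetical grouped answers estimate of 29 percent would be applied to the Census Bureau's updated foreign-born estimate as follows.

# Converting the hypothetical percentage into a population count, using the
# Census Bureau's 35.7 million foreign-born estimate for 2005 cited above.
foreign_born_2005 = 35_700_000
pct_undocumented_est = 29.0  # hypothetical grouped answers estimate
undocumented_count = foreign_born_2005 * pct_undocumented_est / 100
print(f"{undocumented_count:,.0f}")  # 10,353,000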
These procedures ensure that no respondents--and no data on any
specific respondent--are ever separated out or categorized as
undocumented, not even during the analytic process of making indirect,
group-level estimates.
To further ensure reduction of "question threat," the grouped answers
question series begins with flash cards that ask about nonsensitive
topics and familiarize respondents with the 3-box approach. For each
nonsensitive-topic card, interviewers ask the respondent which box
applies to him or her, saying: If it's Box B, we do not want to know
which specific category applies to you.
In this way, most respondents should understand the grouped answers
approach before seeing the immigration-status card.
To help ensure accurate responses, respondents who choose Box A can be
asked a series of clarifying questions.[Footnote 22] (No follow-up
questions are addressed to anyone choosing Box B.) The questions for
Box A respondents are designed to prompt them to, essentially,
reclassify themselves in Box B, if that is appropriate.[Footnote 23]
The grouped answers question series can potentially be applied in a
large-scale general population survey, where the questions on
immigration status would be added for the foreign-born respondents--
provided that an appropriate survey can be identified. If a new survey
of the general foreign-born population were planned, it would involve
selecting a general sample of households and then screening out the
households that do not include one or more foreign-born persons.
Finally, we note that while the initial version of the grouped answers
approach involved three alternative flash cards (and was termed the
"three-card method"), we recently devised the version described here,
which uses two cards rather than three. The two-card method is simpler,
is easier to understand, and provides more precise estimates. All cards
are alike in that they feature three boxes in which specific answer
categories are grouped.
Characteristics, Costs, and Contributions Can Potentially Be Estimated:
Generally, grouped answers questions on immigration status would be
asked as part of a larger survey that includes direct questions on
demographic characteristics and employment and might include questions
on school attendance, use of medical facilities, and so forth; some
surveys also ask specific questions that can help estimate taxes paid.
Potentially, combining the answers to such questions with grouped
answers data can be used to provide further information on the
characteristics, costs, and contributions of the undocumented
population.
For example, the numbers of undocumented persons in major subgroups--
such as demographic or employment status subgroups--can be estimated,
provided that the sample of foreign-born persons interviewed is
sufficiently large.
Grouped answers data collected from adult respondents can also be used
to estimate the number of children in various immigration statuses,
including undocumented--provided that an additional question is
asked.[Footnote 24] Additionally, when combined with separate
quantitative data (for example, data on program costs per individual),
grouped answers data can be used to estimate quantitative information
(such as program costs) for the undocumented population as a whole--or,
again, depending on sample size, for specific subgroups.
The procedures for deriving these more complex indirect estimates are
described in appendix II. No grouped answers respondent is ever
categorized as undocumented.
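The sketch below illustrates the general form such subgroup and cost calculations could take, using entirely hypothetical inputs; it is not a reproduction of the appendix II procedures.

# Illustrative sketch only; the actual procedures are described in appendix II.
# All figures below are hypothetical.
def undocumented_in_subgroup(pct_box_b_card1, pct_box_a_card2, subgroup_total):
    """Estimated number of undocumented persons in a subgroup, based on the
    subgroup's own aggregate box percentages and its foreign-born count."""
    return (pct_box_b_card1 - pct_box_a_card2) / 100 * subgroup_total

# Hypothetical subgroup: employed foreign-born persons aged 18 to 40.
n_undoc = undocumented_in_subgroup(70.0, 35.0, 8_000_000)  # 2,800,000
# Hypothetical per-person annual program cost drawn from separate program data.
program_cost = n_undoc * 1_200
print(f"{n_undoc:,.0f} persons; ${program_cost:,.0f} in program costs")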
Statistical Information Is Needed on the Undocumented Population:
The foreign-born population of the United States is large and growing--
as is the undocumented population within it. Congressional
policymakers, the U.S. Commission on Immigration Reform, and the
National Research Council's (NRC) Committee on National Statistics have
indicated a need for statistical information on the undocumented
population, including its size, characteristics, costs, and
contributions.
The Census Bureau estimates that as of 2005, foreign-born residents
(both legally present and undocumented) numbered 35.7 million and
accounted for at least one-tenth of all persons residing in each of 15
states and the District of Columbia.[Footnote 25] These figures
represent substantial increases over the prior 15 years. For example,
in 1990 the foreign-born population totaled fewer than 20 million; only
3 states had a population more than one-tenth foreign-born. One result
is that as the Department of Labor has testified, foreign-born workers
now constitute almost 15 percent of the U.S. labor force, and the
numbers of such workers are growing.[Footnote 26]
A new paper from the Department of Homeland Security (DHS) puts the
"unauthorized" immigrant population at 10.5 million as of January 2005
and indicates that if recent trends continued, the figure for January
2006 would be 11 million.[Footnote 27] The Pew Hispanic Center's
indirect estimate of the undocumented population as of 2006 is 11.5
million to 12 million. These estimates represent roughly one-third of
the entire foreign-born population.[Footnote 28] DHS has variously
estimated the size of the undocumented population as of January 2000 as
7 million and 8.5 million.[Footnote 29] Government and other estimates
for 1990 numbered only 3.5 million.[Footnote 30]
These various indirect estimates of the undocumented population are
based on the "residual method." Residual estimation (1) starts with a
census count or survey estimate of the number of foreign-born residents
who have not become U.S. citizens and (2) subtracts out estimated
numbers of legally present individuals in various categories, based on
administrative data and assumptions (because censuses and surveys do
not ask about legal status). The remainder, or residual, represents an
indirect estimate of the size of the undocumented population.
To illustrate the role of administrative data and assumptions, residual
estimates draw on counts of the number of new green cards issued each
year. But they also require assumptions to account for emigration and
deaths among those who received green cards in earlier years.
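Arithmetically, the residual calculation reduces to a subtraction of aggregates, as sketched below with hypothetical component figures.

# Sketch of the residual method with hypothetical figures (in millions). In
# practice, each legally present component is built from administrative data
# (for example, green cards issued each year) adjusted by assumptions about
# emigration and deaths.
def residual_estimate(foreign_born_noncitizens, legal_components):
    """Undocumented = census or survey count of foreign-born noncitizens
    minus the estimated number of legally present noncitizens."""
    return foreign_born_noncitizens - sum(legal_components.values())

legal_components = {
    "legal permanent residents": 10.5,
    "refugees and asylees": 1.5,
    "temporary legal residents (students, temporary workers, etc.)": 1.3,
}
print(residual_estimate(23.8, legal_components))  # 10.5 (million), hypothetical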
A recent DHS paper providing residual estimates of the undocumented
population includes ranges of estimates based on alternative
assumptions made for two key components.[Footnote 31] For example, "by
lowering or raising the emigration rates 20 percent . . . the estimated
unauthorized immigrant population would range from 10.0 million to 11.0
million."[Footnote 32] The DHS paper also lists assumptions that were
not subjected to alternative specifications. We believe the DHS paper
represents an advance because, up to now, analysts producing residual
estimates have generally not made public statements regarding the
precision of the estimates. (Some critics have, however, indicated that
residual estimates are likely to lack precision.[Footnote 33])
While the residual approach has been used to profile the undocumented
population on two characteristics--age and country of birth--it is
limited with respect to estimating (1) current geographic location and
(2) current employment and benefit use. The reason is that current
characteristics of legally present persons are not maintained in
administrative records; analysts must therefore rely largely on
assumptions.[Footnote 34] In contrast, the grouped answers method does
allow for the possibility of estimating current characteristics based
on current self-reports.
During the mid-1990s, the U.S. Commission on Immigration Reform
determined that better statistical "information on legal status and
type of immigrant [is] crucial" to assessing immigration policy.
Indeed, the Commission called for a variety of improvements in
estimates of the costs and benefits associated with undocumented
immigration.[Footnote 35] NRC's Committee on National Statistics
further emphasized the need for better information on costs, especially
state and local costs.[Footnote 36] (If successfully fielded, the
grouped answers method might help provide general information on such
costs--and, potentially, specific information for large states such as
California. Sample size limitations would be likely to prohibit
separate analyses for specific local areas, small states, and states
with low percentages of foreign-born or undocumented.)
Over the years, we have received numerous congressional requests
related to estimating costs associated with the undocumented
population.[Footnote 37] Recent Census Bureau research and conferences
reflect the realization that undocumented immigration is a key
component of current population growth and that there is a resultant
need for information on this group.[Footnote 38] Additionally, some of
the immigrant advocates we interviewed expressed an interest in being
able to better describe the contributions of the undocumented
population.
Surveys Are a Key Information Source:
Various national surveys ask foreign-born respondents to provide
information about themselves and, in some cases, other persons in their
households. While such surveys provide a wealth of information on a
wide variety of areas, including some sensitive topics, national
surveys generally do not ask about current immigration status--with the
exception of a question on U.S. citizenship, which is included in
several surveys.
As we reported earlier, it is believed that direct questions on
immigration status "are very sensitive, and negative reactions to them
could affect the accuracy of responses to other questions on [a]
survey."[Footnote 39] Two surveys that have asked respondents directly
about immigration status for several years are:
* the National Agricultural Workers Survey (NAWS), an ongoing annual
cross-sectional self-report survey of farmworkers, fielded by Aguirre
International, a private sector firm under contract to the Department
of Labor, since 1988,[Footnote 40] and:
* the Survey of Income and Program Participation (SIPP), a longitudinal
panel survey of the general population, conducted by the Census Bureau,
which has asked immigration status questions since 1996.
Of the two, SIPP is the more relevant, because its immigration status
questions have been administered to a sample of the general foreign-
born population.
SIPP has asked an adult respondent-informant from each household to
provide information about himself or herself and about others in his or
her household, including which immigration-status category applied to
each person when he or she came to this country. Answers are
facilitated by a flash card that lists major legal immigration statuses
(see fig. 4).[Footnote 41] A further question asks whether each person
obtained a green card after arriving in this country. The SIPP
questions come close to asking about--but do not actually allow an
estimate of--the number of foreign-born U.S. residents who are
currently undocumented.[Footnote 42] According to the Census Bureau,
SIPP is now scheduled to be "reengineered," but the full outlines of
the revised effort have not been set.
Figure 4: SIPP Flash Card:
[See PDF for image]
Source: U.S. Bureau of the Census. (The actual size of the card is
8-1/2" by 11".)
[End of figure]
The Grouped Answers Approach Has Been Tested in Surveys Fielded by
Private Sector Organizations:
In the middle to late 1990s, the grouped answers question series was
subjected to preliminary development and testing with Hispanic
respondents, including interviews with farmworkers conducted by Aguirre
International, under contract to GAO.[Footnote 43] In these tests,
every respondent picked a box.[Footnote 44] However, these interviews
were not conducted under conditions of a typical large-scale survey in
which interviewers initiate contact with respondents in their
homes.[Footnote 45]
To further test respondents' acceptance of the grouped answers
approach, the Census Bureau created a question module with 3-box flash
cards and contracted for it to be added to the 2004 GSS. When
presenting the survey to respondents, interviewers explained that NORC
of the University of Chicago fielded the GSS survey, with "core
funding" from an NSF grant.[Footnote 46] The Census Bureau's question
module included cards from the three-card version of the grouped
answers approach--which features only one immigration status category
in Box A. The cards used were:
* the two training cards shown in figures 5 and 6[Footnote 47] and:
* the immigration status card shown in figure 7.[Footnote 48]
Figure 5: Training Card 1:
[See PDF for image]
Sources: GAO; Dominican Republic (illustrations). (The actual size of
the card is 8-1/2" by 11".)
[End of figure]
Figure 6: Training Card 2:
[See PDF for image]
Source: GAO. (The actual size of the card is 8-1/2" by 11".)
[End of figure]
Figure 7: Immigration Status Card Tested in GSS:
[See PDF for image]
Sources: GAO; Corel Draw (flag and suitcase); DHS (resident alien
cards). (The actual size of the card is 8-1/2" by 11".)
[End of figure]
Training card 1 shows different types of houses arranged in three
boxes. Respondents are asked to indicate the type of house they lived
in when in their home country--by picking a box. They are told that if
the answer is in Box B, we don't need to know which specific type
applies to them, because right now we are focusing on Box A.
Training card 2 shows different modes of transportation, again arranged
in three boxes. Respondents are asked to indicate the mode of
transportation they used the most recent time they traveled from their
home country to the United States--again by picking a box. They are
again told that if it's in Box B, we don't need to know which specific
mode applies.
Additionally, the GSS-Census Bureau module asked interviewers to (1)
judge respondents' understanding of the 3-box format, (2) observe
whether respondents objected or "kept silent for a while" when
presented with the immigration status card, and (3) record any comments
that respondents made about the cards. As the Census Bureau has noted,
the module was a partial test because only one immigration status card
was tested.
Data and documentation from this field test became available in late
2005. A Census Bureau analysis of these data (completed in 2006 and
reproduced in full in appendix IV), indicates that of 237 foreign-born
respondents, 216 (roughly 90 percent) chose a box, 4 gave other
answers, and 17 refused or said "don't know." The Census Bureau took
this "as an indication that most foreign-born who are asked about their
migrant status in this format would understand the question, know the
answer, and answer willingly."
Further, the Census Bureau paper stated that:
* the "overwhelming majority of foreign-born respondents" picked a box
on the immigration status card without--according to interviewers--any
objection, hesitation, or periods of silence;
* while some interviewers did not give a judgment or were confused
about rating respondents' understanding, about 80 percent of
respondents were coded as understanding and about 10 percent as
not;[Footnote 49] and:
* some respondents' comments, written in by interviewers, indicated
that although the GSS is a "personal interview" survey, telephone
interviews had been substituted, in some cases, and this meant that
respondents could not see the cards--making the use of the 3-box format
difficult.
The Census Bureau's paper highlighted various limitations of the 2004
GSS test, including (1) testing only one immigration status card, (2)
underrepresenting Hispanics, and (3) in some instances interviewing
over the telephone (instead of in person), so that respondents did not
see the flash cards.[Footnote 50]
Experts Seem to Accept "Grouped Answers" Questions If Fielded by a
Private Sector Organization:
The acceptability of the grouped answers approach appears to be high,
when implemented in surveys fielded by a university or private sector
organization. Many immigration experts, including advocates, accepted
the grouped answers approach, although some conditioned their
acceptance on a quality implementation in a survey fielded by a
university or other private sector organization. An independent
statistical expert believed that the grouped answers approach would be
generally usable with survey respondents.
Keys to Acceptance Are Fielding by a Private Sector Organization, Data
Protections, and Quality Implementation:
Some of the researchers and advocates we contacted were extremely
enthusiastic about the potential for new data. No one objected to
statistical, policy-relevant information being developed on the size,
characteristics, costs, and contributions of the undocumented
population. Overall, the immigration experts we contacted (listed in
appendix I, table 5) accepted the grouped-answers question approach--
although advocates sometimes conditioned their acceptance on, for
example, the questions being asked in a survey fielded by a university
or private sector organization--with data protections built in. Many
also offered suggestions for maximizing cooperation by foreign-born
respondents or ideas about how advocacy organizations might help.
Some advocates indicated that a key condition of their support would be
that (1) the grouped answers question on immigration status be asked by
a university or private sector organization and (2) identifiable data
(that is, respondents' answers linked to personal identifiers) be
maintained by that organization. Two advocate organizations
specifically stated that they "could not endorse," or implied they
would not support, the grouped answers approach, assuming the data were
collected and maintained by, in one case, the Census Bureau and, in the
other case, the government. Many other immigration experts and
advocates preferred that grouped answers data on immigration status be
collected by a university or other reputable private sector
organization pledged to protect the data.
The immigration advocates said that private sector fielding of a
grouped answers survey and protection of such data from nonstatistical
uses that might harm immigrants were key issues because:
* Some foreign-born persons are from countries with repressive regimes
and thus have more fear of (less trust in) government than the typical
U.S.-born person.
* Despite current law protecting individual data from disclosure, some
persons believe that information collected by a government agency such
as the Census Bureau is routinely shared (or that in some circumstances
it might be shared) across government agencies. Further, one advocate
pointed out that the Congress could change the current law, eliminating
that protection. (Although the grouped answers approach does not
identify anyone as undocumented, it does provide some information
regarding each respondent's immigration status.)
* Extremely large-scale data collections--notably, the American
Community Survey (ACS)--can yield estimates for areas small enough that
if the data were publicly available, they could be used for
nonstatistical, nonpolicy purposes. Some advocates referred to the
World War II use of census data to identify the areas where specific
numbers of persons of Japanese origin or descent resided. They also
pointed out that Census Bureau data on ethnicity--including counts of
Arab Americans--are publicly available by zip code. (The Census Bureau,
unlike other government agencies and private sector survey
organizations, is associated with extremely large-scale data
collections, and some persons may not fully differentiate Census Bureau
data collection efforts of different sizes.)
* Hostility to or lack of trust in the Census Bureau might result in
potentially lower response rates for foreign-born persons, based on the
World War II experience of the Japanese or a more recent incident in
which Census Bureau staff helped a DHS enforcement unit access publicly
available data on ethnicity by zip code.[Footnote 51] DHS stated that
it did not use these data and had not requested the information by zip
code.[Footnote 52] The Census Bureau clarified its position on
providing help to others requesting publicly available data.[Footnote
53]
Various advocates saw the issues listed above as linked to their own
acceptance, as well as to respondent acceptance, of a survey. Linking
these issues to respondent acceptance of a survey was, in some cases,
echoed by other immigration experts we consulted.[Footnote 54] Some
immigrant advocates and other immigration experts counseled us that if
there were an increase in enforcement efforts in the interior of the
United States (as opposed to border-crossing areas), foreign-born
respondents' acceptance of the grouped answers questions would be
likely to decrease--at least, if the questions were asked in a survey
fielded by the government.
One advocate expressly stated a preference for a grouped answers survey
with funding by a nongovernment entity, such as a foundation. We
discussed with a number of immigrant advocates who objected to a
government-fielded survey the possibility of a survey fielded by a
private sector organization with government funding. In some cases, we
specifically referred to one or both of the following surveys, which
(1) have been conducted for many years without inappropriate data
disclosures and (2) ask direct sensitive questions:
* the National Survey on Drug Use and Health (NSDUH), fielded by RTI
International under a contract from HHS's Substance Abuse and Mental
Health Services Administration (SAMHSA), and:
* the National Agricultural Workers Survey (NAWS), fielded by Aguirre
International, under a contract from the Department of Labor.[Footnote
55]
The advocates' response was generally to accept the concept of
government funding of a university's or private sector survey
organization's field work, provided that appropriate protections of the
data were built into the funding agreement.
GAO's contract with Aguirre International for early testing of the
grouped answers approach with farmworker respondents specified that
data on respondents' answers would be "stripped of person-identifiers
and related information." Additionally, the GSS "core funding" grant
with NSF and its contractual arrangements with sponsors of question
modules--such as the grouped-answers question insert contracted for by
the Census Bureau--do not involve the transfer of any data other than
publicly available data, stripped of identifiers, and limited so as to
avoid the possibility of "deductive disclosure" with respect to
respondent identities or local areas.[Footnote 56]
Various advocates said that their acceptance was also contingent on
factors such as:
1. high-quality data, including coverage of persons who have limited
English proficiency, with special attempts to reach those who are
linguistically isolated (that is, members of households in which no one
14 or older speaks English "very well") and to overcome other potential
barriers (such as cultural differences);
2. appropriate presentation of the survey, including an appropriate
explanation of its purpose and how respondents were selected for
interview; and:
3. transparency--that is, keeping the immigrant community informed
about or involved in the development and progress of the survey.
One advocate specifically said that her organization's support would be
contingent on both (1) the development of more information on
respondent acceptance within the Asian community--particularly among
Asians who have limited English proficiency or are linguistically
isolated--and (2) a survey implementation that is planned to adequately
communicate with Asian respondents, including those who are
linguistically isolated or have little education.[Footnote 57] Although
one-fourth of the 2004 GSS test respondents were Asian, the test was
conducted in English (allowing help from bilingual household members),
and no other tests have included linguistically isolated
Asians.[Footnote 58]
Advocates and Experts Suggest Ways to Maximize Respondent Cooperation
and Offer Their Assistance:
Advocates and other experts made several suggestions for maximizing
respondent cooperation with a survey using the grouped answers question
series--that is, maximizing response rates for such a survey as well as
maximizing authentic participation.
Advocates suggested that the survey (1) avoid taking names or Social
Security numbers,[Footnote 59] (2) hire interviewers who speak the
respondents' home-country language, (3) let respondents know why the
questions are being asked and how their households came to be selected,
(4) conduct public relations efforts, (5) obtain the support of opinion
leaders, (6) select a survey group from a well-known and trusted
university to collect the data, and (7) ask respondents about their
contributions to the American economy through, for example, working and
paying taxes.
Additionally, survey experts suggested:
* using audio-Computer Assisted Self Interview (audio-CASI),[Footnote
60]
* carefully explaining to respondents how anonymity of response is
protected, and:
* paying respondents $25 or $30 for participating in the interview.
Survey experts viewed these elements as key ways of boosting response
rates or encouraging authentic responses to sensitive questions. For
example, NAWS, which uses respondent incentives, achieves extremely
high response rates within cooperating farms--97 percent in 2002, with
a $20 payment to farmworkers selected.
Some immigrant advocates also offered suggestions for how their
organizations or other advocates might help the effort to develop and
field the grouped answers approach, including:
1. providing contacts at local organizations to help with arrangements
for future research,
2. developing or reviewing Box A follow-up questions, and:
3. serving on an advisory board with other representatives from
immigrant communities.[Footnote 61]
GSS Data and Independent Statistical Consultant Review Show "General
Usability" of the Grouped Answers Approach:
As we report above, the Census Bureau's recent analysis of the 2004 GSS
grouped answers data concluded that the "overwhelming majority of
foreign-born respondents" picked a box without objection, hesitation,
or silence. The Census Bureau reported, more specifically, that roughly
90 percent (216 of 237 respondents) chose a box, 4 gave other answers,
and 17 refused to answer or said "don't know."
Our subsequent analysis excluded 19 of the 237 respondents in the
Census Bureau analysis because:
* 4 were not foreign-born (for example, 1 had been born abroad to
parents who had, by the time he was born, become naturalized U.S.
citizens);
* 1 was not classifiable as either foreign-born or not foreign-born
(because he did not know whether his parents were born in the United
States);
* 4 others were known to have been interviewed on the telephone, based
on written-in interviewers' comments recorded in the computer file (for
example, one wrote that the respondent could not see the cards because
the interview was on the telephone); and:
* 10 others were subsequently found to have been interviewed on the
telephone, based on a special GSS hand check of the interview forms for
respondents who had refused or said "don't know," which was carried out
in response to our request.[Footnote 62]
As a result, in our analysis we found that only 6 personally
interviewed foreign-born GSS respondents refused or said "don't
know."[Footnote 63] One of the 6 was an 18-year-old Mexican who told
the interviewer that he did not know whether or not he was a legal
immigrant. Additionally, we found that the 4 respondents who gave
"other answers" had provided usable information (for example, one
called out that he had a student visa) and thus could be recoded into
an appropriate box.
After reviewing the two analyses of the GSS test data--the one that the
Census Bureau performed and the other we performed--Dr. Zaslavsky
concluded that:
The test confirms the general usability of the [grouped-answers
approach] with subjects similar to the target population for its
potential large-scale use--that is, foreign-born members of the general
population. Out of about 218 respondents meeting eligibility criteria
and who were most likely administered the cards in person (possibly
including a few who had telephone interviews but responded without
problems), only 9 did not respond by checking one of the 3 boxes. Of
these, 3 provided verbal information that allowed coding of a box, and
6 declined to answer the question altogether. Furthermore, several of
these [6] raised similar difficulties with other 3-box questions on
nonsensitive topics (type of house where born, mode of transportation
to enter United States), suggesting that the difficulties with the
question format were at least in part related to the format and not to
the particular content of the answers. Thus, indications were that
there would not be a systematic bias due to respondents whose
immigration status is more sensitive being unwilling to address the 3-
box format.
Dr. Zaslavsky emphasized the importance of minimizing or completely
avoiding telephone interviews when using the grouped answers approach--
or, alternatively, providing advance copies of the cards to
respondents before interviewing over the telephone.[Footnote 64] (Dr.
Zaslavsky's written review is presented in full in appendix III.)
Various Tests Are or May Be Needed:
The findings on respondent acceptance--that is, the GSS test--raised
some unanswered questions about acceptance that experts said should be
addressed. Additionally, the experts said that one or more tests of
response validity are needed to determine whether respondents "pick the
correct box" versus systematically avoiding Box B.
Questions for Further Research Were Suggested by the GSS Test:
The independent reviewer of the GSS analyses (Dr. Zaslavsky) concluded
that:
four issues should be addressed in future field tests:
(a) Equivalent acceptability of all forms of the response card,
(b) Usability with special populations including those with low
literacy, the linguistically isolated, and concentrated immigrant
populations,
(c) Methods that avoid telephone interviews, or reduce bias and
nonresponse due to use of the telephone,
(d) Use of follow-up questions to improve the accuracy of box choices.
As the independent expert explained with respect to point (b), GSS
undercoverage of the foreign-born population occurred at least in part
because interviews were conducted only in English, although household
members could help respondents with limited English.[Footnote 65]
Various colleagues and experts we talked with supported points (a)
through (d). We further note that points (a) and (c) were covered or
touched on in the Census Bureau's paper reporting its analysis of the
2004 GSS data. In our discussions with Census Bureau staff, they also
mentioned that further tests of acceptance should include (d) follow-up
questions for Box A respondents.
Additionally, some advocates and an immigration researcher suggested
improving the cards, which might minimize the potential for "don't
know" or inaccurate answers. A survey expert suggested using focus
groups to further explore respondent perceptions of the cards--and to
potentially improve them.[Footnote 66]
Earlier testing covered a key portion of the populations (Hispanic
farmworkers) cited in (b) above, was conducted in Spanish, and included
Box A follow-up questions as recommended in (d) above.[Footnote 67] In
those interviews, every respondent picked a box. However,
1. No language other than Spanish or English has been used in testing;
thus, as one immigrant advocate pointed out, no testing has focused on
linguistically isolated Asians (those living in households in which no
adult member speaks English).
2. The interviews with Hispanic farmworkers were not conducted under
typical conditions of a household survey.
3. Only one immigration status card was tested with Hispanic
farmworkers and in the GSS.
Therefore, we agree that the acceptance-testing issues the experts
raised should be considered in assessing the grouped answers approach.
Studies Should Test Whether Respondents Pick the Correct Box:
Several experts told us that tests of respondent accuracy--or at least
respondents' intent to respond accurately--should be conducted. These
experts emphasized that grouped answers data would not be useful if
substantial numbers of respondents were to systematically avoid picking
Box B (that is, to not pick the box with the undocumented category).
However, one immigration study expert believed that if a response
validity study would involve lengthy delays, a grouped answers survey
should be fielded before that study.
We agree with the experts' position that tests are needed to determine
whether respondents systematically avoid Box B (even after Box A follow-
up check questions). Tests of response validity would ideally be
conducted with the methods of encouraging truthful answers that experts
mentioned, such as (1) explaining why the survey is being conducted,
how the respondent was selected, and how the anonymity of answers is
ensured, and (2) using audio-CASI and, if appropriate, paying
respondents for participating in the interview. And, as the Census
Bureau pointed out, such a study should include the full grouped
answers question series, including follow-up questions, and it should
test both Card 1 and Card 2. Even if only small numbers of respondents
were to respond inaccurately, it would be helpful to estimate the
extent of that inaccuracy and adjust for any resulting bias.
We discussed various approaches to conducting validity studies with
immigration experts, including immigrant advocates, and with agencies
conducting surveys. In reviewing these approaches, we found that
response validity tests vary according to whether they are conducted
before, during, or after a survey is fielded.
Before a large-scale survey is conducted. The grouped answers question
series could be asked of a special sample of respondents for whom the
answers are known, in advance, by study investigators on an individual-
respondent basis. Such knowledge might be based, for example, on
information that recent applicants for green cards have submitted to
DHS.[Footnote 68] "Firewalls" could be used to prevent survey
information from being given to DHS. We discussed this approach with
DHS; however, experts criticized a DHS-based validity study on both
methodological and public relations grounds.[Footnote 69] An
alternative source of data on individuals' immigration statuses might
avoid these problems, but no alternative source has yet been
identified.
Before or as part of a large-scale survey. In either situation (that
is, in a presurvey study or as part of a survey), respondents could be
asked if they would be willing to participate in special validity-test
activities in return for a payment of, say, $25 or $30 for each
activity. Later, after interviewing had been completed in a given
location--not as part of the interview process--a sample of respondents
who chose Box A (that is, those who claimed to be here legally) could
be asked to:
* participate in a focus group in which respondents would discuss how
they felt answering the grouped answers questions when the interviewer
came to their house and, also, could possibly be asked to fill out a
"secret ballot" indicating whether they had answered authentically in
the earlier home interview;
* give permission for a record check and provide information that could
subsequently be used in a record check (for example, their name, date
of birth, and Social Security number) and permission to check these
data with the Social Security Administration;[Footnote 70] or:
* show his or her documentation (for example, green card) to a
documents expert.[Footnote 71]
These checks would logically be focused on Box A respondents, for most
of whom such checks would be less threatening. We believe that it is
reasonable to assume that most respondents who chose Box B picked the
correct box. Further, because the survey interview states that there
are no more questions on immigration if the respondent picks Box B,
pursuing follow-up validity checks might be deemed inappropriate for
Box B respondents.[Footnote 72]
After data are collected. With a large-scale survey, it would be
possible to conduct comparative analyses after the data were collected.
We provide three examples.[Footnote 73]
1. Grouped answers estimates of the percentage undocumented could be
compared for (a) all foreign-born versus (b) high-risk groups, such as
those who arrived in the United States within the past 5 or 10 years.
The expectation would be that with valid responses, a higher estimate
of the percentage undocumented would be obtained for those who arrived
more recently--because, for example, persons who had arrived recently
were not here during the amnesty in the late 1980s.[Footnote 74]
2. Comparisons could be made of (a) Box A estimates of specific legal
statuses and the approximate dates received--notably, the numbers of
persons claiming to have received valid green cards in 1990 or more
recently--with (b) publicly available DHS reports of the numbers of
green cards issued from 1990 to the survey date.[Footnote 75]
3. Analysts could compare (a) grouped answers estimates of the number
undocumented overall to (b) estimates of total undocumented obtained by
the residual method.[Footnote 76]
Wherever possible, Card 1 and Card 2 should be tested separately for
accuracy of response.
The advantage of conducting a validity study in advance of a survey is
that if significant problems surface, adjustments in the approach can
be made. Or if the problems are substantial and cannot be easily
corrected--and if the anticipated survey were to be fielded mostly or
only to collect grouped answers data--then that survey could be
postponed or canceled. However, the results of validity tests conducted
during or after a survey can be used to interpret the data and,
potentially, to adjust estimates if it appears that, for example, 5 to
10 percent of undocumented respondents had erroneously claimed to be in
Box A of Card 1. As one expert noted, conducting an advance study does
not preclude conducting a subsequent study during or after the survey.
Some 6,000 Foreign-Born Respondents Are Needed for "Reasonably Precise"
Estimates of the Undocumented:
Although several factors are involved, and it is not possible to
guarantee a specific level of precision in advance, we estimate that
roughly 6,000 foreign-born respondents, or more, would be needed for a
grouped answers survey.[Footnote 77] As we explain below, this is based
on (1) a precision requirement (that is, a 95 percent confidence
interval consisting of plus or minus 3 percentage points), (2)
assumptions about the sampling design of the survey in which the
questions are asked, and (3) the assumption that approximately 30
percent of the foreign-born population is currently undocumented.
An indirect grouped answers estimate of the undocumented population
generally requires interviews with more foreign-born respondents than a
corresponding hypothetical direct estimate would--assuming it were
possible to ask such questions directly in a major national survey. One
key reason is that the main sample of foreign-born respondents must be
divided into two subsamples. Half the respondents answer each
immigration status card. On this basis alone, one would have to double
the sample size required for a direct estimate based on a question
asked of all respondents. Further, the estimate of undocumented, which
is achieved by subtraction, combines two separate estimates, each
characterized by some degree of uncertainty.[Footnote 78]
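To illustrate this point, the following sketch shows how the sampling
variance of the subtraction estimate combines the variances of two
proportions, each estimated from half the foreign-born sample. This is a
simplified illustration only (assuming simple random sampling within each
half-sample, with any design effect applied as a multiplier); appendix VI
presents the formula for the variance of a grouped answers estimate.

   # Hypothetical Python sketch; names and simplifying assumptions are ours.
   def variance_of_subtraction_estimate(p_box_b_card1, p_box_a_card2,
                                        n_foreign_born, design_effect=1.0):
       n_half = n_foreign_born / 2.0    # each card is shown to half the sample
       var_card1 = p_box_b_card1 * (1 - p_box_b_card1) / n_half
       var_card2 = p_box_a_card2 * (1 - p_box_a_card2) / n_half
       # The subtraction estimate inherits the uncertainty of both proportions.
       return design_effect * (var_card1 + var_card2)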
Determining the number of respondents required for a "reasonably
precise" estimate of the percentage of the foreign-born population who
are undocumented involves three key factors:
1. specification of a precision level--that is, choice of a 90 percent
or 95 percent confidence level and an interval defined by plus or minus
2, 3, or 4 percentage points;
2. information on (or assumptions about) the sampling design for the
main survey and for subsamples 1 and 2; and:
3. to the extent possible, consideration of the likely distribution of
the foreign-born population across immigration status categories,
including the various legal categories and the undocumented
category.[Footnote 79]
With respect to the first factor involved in determining sample size,
some agencies--for example, the Census Bureau and the Bureau of Labor
Statistics (BLS)--use the 90 percent confidence level. Other agencies
use the 95 percent level.
With respect to the second factor, the sampling design of a large-
scale, nationally representative, personal-interview survey is based on
probabilistic area sampling rather than simple random sampling of
individuals. This often reduces the precision of estimates (relative to
simple random sampling).[Footnote 80] The reason is that persons
selected for interview are clustered in a limited number of areas or
neighborhoods (and residents of a particular neighborhood may tend to
be similar). It is possible that the design for selecting subsamples 1
and 2 could increase precision; however, it is not possible to predict
by how much.[Footnote 81]
With respect to the third factor, existing residual estimates point to
a fairly even split across three main categories--undocumented,
U.S. citizen, and legal permanent resident. However, there is some
uncertainty associated with these estimates, the distribution may vary
across subgroups, and the percentages may change in future.[Footnote
82] Therefore, a range of distributions is relevant.
Taking each of these factors into account (to the extent possible) and
using conservative assumptions, we estimated the approximate numbers of
respondents required for indirect estimates of the undocumented
population that are "reasonably precise."
Table 1 shows required sample sizes for the 90 percent confidence
level, table 2 for the 95 percent level, with precision at plus or
minus 2, 3, and 4 percentage points. In estimating these required
sample sizes, we made conservative assumptions and specified a range of
possibilities for the distribution with respect to the undocumented
category.
To identify a single, rough figure for the sample size needed for
reasonably precise estimates, we focused on:
1. the 95 percent level, which is more certain and, we believe,
preferable;
2. the 30 percent column, because a current residual estimate of the
undocumented population is in this range; and:
3. the middle row (for plus or minus 3 percentage points), which is a
midpoint within the area of "reasonable precision" as defined above.
With this focus, we estimate that roughly 6,000 or more respondents
would be required.[Footnote 83]
Table 1: Approximate Number of Foreign-Born Respondents Needed to
Estimate Percentage Undocumented within 2, 3, or 4 Percentage Points at
90 Percent Confidence Level, Using Two-Card Grouped Answers Data:
Estimate within plus or minus 2, 3, or 4 percentage points: 2;
Percent undocumented foreign-born (range of possibilities): 10%:
10,700;
Percent undocumented foreign-born (range of possibilities): 30%: 9,900;
Percent undocumented foreign-born (range of possibilities): 50%: 8,100;
Percent undocumented foreign-born (range of possibilities): 70%: 5,500;
Percent undocumented foreign-born (range of possibilities): 90%: 2,100.
Estimate within plus or minus 2, 3, or 4 percentage points: 3;
Percent undocumented foreign-born (range of possibilities): 10%: 4,800;
Percent undocumented foreign-born (range of possibilities): 30%: 4,400;
Percent undocumented foreign-born (range of possibilities): 50%: 3,600;
Percent undocumented foreign-born (range of possibilities): 70%: 2,500;
Percent undocumented foreign-born (range of possibilities): 90%: 900.
Estimate within plus or minus 2, 3, or 4 percentage points: 4;
Percent undocumented foreign-born (range of possibilities): 10%: 2,700;
Percent undocumented foreign-born (range of possibilities): 30%: 2,500;
Percent undocumented foreign-born (range of possibilities): 50%: 2,000;
Percent undocumented foreign-born (range of possibilities): 70%: 1,400;
Percent undocumented foreign-born (range of possibilities): 90%: 500.
Source: GAO analysis.
Note: Estimated numbers of respondents were calculated assuming that
(1) foreign-born persons who are not undocumented are evenly split
between the legal statuses in Box A, Card 1, and Box A, Card 2 (a
conservative assumption in that it maximizes the required number of
respondents), (2) sample selection design for the main survey and for
subsamples 1 and 2 increases the variance of an estimate of
undocumented by 1.6 (which does not build in potential reductions in
variance that might occur with a careful design for the selection of
subsamples 1 and 2); and (3) for simplicity, no respondents choose Box
C.
[End of table]
Table 2: Approximate Number of Foreign-Born Respondents Needed to
Estimate Percentage Undocumented, within 2, 3, or 4 Percentage Points,
at 95 Percent Confidence Level, Using Two-Card Grouped Answers Data:
Estimate within plus or minus 2, 3, or 4 percentage points: 2;
Percent undocumented foreign-born (range of possibilities): 10%:
15,200;
Percent undocumented foreign-born (range of possibilities): 30%:
14,000;
Percent undocumented foreign-born (range of possibilities): 50%:
11,500;
Percent undocumented foreign-born (range of possibilities): 70%: 7,800;
Percent undocumented foreign-born (range of possibilities): 90%: 2,900.
Estimate within plus or minus 2, 3, or 4 percentage points: 3;
Percent undocumented foreign-born (range of possibilities): 10%: 6,800;
Percent undocumented foreign-born (range of possibilities): 30%:
6,200[A];
Percent undocumented foreign-born (range of possibilities): 50%: 5,100;
Percent undocumented foreign-born (range of possibilities): 70%: 3,500;
Percent undocumented foreign-born (range of possibilities): 90%: 1,300.
Estimate within plus or minus 2, 3, or 4 percentage points: 4;
Percent undocumented foreign-born (range of possibilities): 10%: 3,800;
Percent undocumented foreign-born (range of possibilities): 30%: 3,500;
Percent undocumented foreign-born (range of possibilities): 50%: 2,900;
Percent undocumented foreign-born (range of possibilities): 70%: 2,000;
Percent undocumented foreign-born (range of possibilities): 90%: 700.
Source: GAO analysis.
Note: Estimated numbers of respondents were calculated assuming that
(1) foreign-born persons who are not undocumented are evenly split
between the legal statuses in Box A, Card 1, and Box A, Card 2 (a
conservative assumption in that it maximizes the required number of
respondents), (2) sample selection design for the main survey and for
subsamples 1 and 2 increases the variance of an estimate of
undocumented by 1.6 (which does not build in potential reductions in
variance that might occur with a careful design for the selection of
subsamples 1 and 2); and (3) for simplicity, no respondents choose Box
C.
[A] This is the approximate number of foreign-born respondents needed
for an overall estimate of the percentage undocumented with a
confidence interval of plus or minus 3 percentage points at the
(preferred) 95% confidence level, assuming that 30% of the foreign-born
are undocumented.
[End of Table]
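To indicate how figures of this general magnitude can be approximated, the
following sketch applies the assumptions stated in the table notes (legal
statuses split evenly across the two Box A's, a design effect of 1.6, and no
Box C choices). It is an illustration only, not the exact computation
underlying tables 1 and 2; the function name and the rounding are ours.

   # Hypothetical Python sketch, using the assumptions in the table notes.
   Z_VALUE = {90: 1.645, 95: 1.960}  # standard normal critical values

   def approximate_sample_size(confidence_level, margin_points,
                               percent_undocumented, design_effect=1.6):
       z = Z_VALUE[confidence_level]
       m = margin_points / 100.0
       u = percent_undocumented / 100.0
       # Under the even-split and no-Box-C assumptions, the variance of the
       # subtraction estimate is roughly design_effect * (1 - u**2) / n, so
       # solve z * sqrt(variance) <= m for n.
       return design_effect * (z ** 2) * (1 - u ** 2) / (m ** 2)

   print(round(approximate_sample_size(95, 3, 30)))  # roughly 6,200 (cf. table 2)
   print(round(approximate_sample_size(90, 2, 10)))  # roughly 10,700 (cf. table 1)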
High-risk subgroups--subgroups with higher percentages of undocumented
(such as adults 18 to 44 and persons who arrived in the United States
within the past 10 years)--would require fewer respondents for the same
level of precision, as illustrated in the tables' middle and right
columns. For example, if about 70 percent of a subgroup were
undocumented, a survey with about 3,500 respondents in that subgroup
would produce an estimate of the percentage of the subgroup that is
undocumented, correct to within approximately plus or minus 3
percentage points at the 95 percent confidence level.
Precision could be low for smaller subgroups in which there are
relatively few undocumented persons (for example, 10 percent or less),
particularly if--as assumed in tables 1 and 2--there is an even split
of legally present foreign-born persons across the Box A categories of
immigration status cards 1 and 2.[Footnote 84]
The independent statistician we consulted indicated that if more than
one grouped answers survey is conducted, combining data across two or
more surveys could help provide larger numbers of respondents for
subgroup analysis. For example, if a large-scale survey were conducted
annually, analysts could combine 2 or 3 years of data to obtain more
precise estimates. (One caveat is that combining data from multiple
survey years reduces the time-specificity associated with the resulting
estimate.)
Finally, we note that to estimate the numerical size of the
undocumented population,
* A grouped answers estimate of the percentage of the foreign-born who
are undocumented would be combined with a census count of the foreign-
born or an updated estimate. For example, the 2000 census counted 31
million foreign-born persons, and the Census Bureau later issued an
updated estimate of 35.7 million for 2005.
* The specific procedure would be to multiply the percentage
undocumented (based on the grouped answers data and the subtraction
procedure) by a census count or an updated estimate of the foreign-born
population for the year in question.
The precision of the resulting estimate of the numerical size of the
undocumented population would be affected by (1) the precision of the
grouped answers percentage estimate, which is closely related to sample
size, as described above, and (2) any bias in the census count or
updated estimate of the foreign-born population.[Footnote 85] The
precision of the grouped answers percentage is taken into account by
using a percentage range (for example, the estimate plus or minus 3
percentage points) when multiplying. Although the amount of bias in a
census count or updated estimate is unknown, we believe that any such
bias would have a proportional impact on the calculated numerical
estimate of the undocumented population.[Footnote 86]
To illustrate the proportional impact, we assume that a census count
for total foreign-born is 5 percent too low. Using that count in the
multiplication process would cause the resulting estimate of the size
of the undocumented population to be 5 percent lower than it should
be.[Footnote 87] The situation is analogous for subgroups.[Footnote 88]
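The arithmetic can be sketched as follows; the percentage, margin, and
undercount figures are illustrative assumptions only, and the foreign-born
figure is the Census Bureau's 2005 estimate cited above.

   # Hypothetical Python sketch; the percentage and margin are illustrative.
   foreign_born_estimate = 35_700_000   # Census Bureau's updated 2005 estimate
   pct_undocumented = 0.30              # illustrative grouped answers estimate
   margin = 0.03                        # plus or minus 3 percentage points

   low_estimate  = (pct_undocumented - margin) * foreign_born_estimate  # ~9.6 million
   best_estimate =  pct_undocumented           * foreign_born_estimate  # ~10.7 million
   high_estimate = (pct_undocumented + margin) * foreign_born_estimate  # ~11.8 million

   # A 5 percent undercount of the foreign-born lowers each figure by 5 percent.
   best_if_5pct_undercount = pct_undocumented * (0.95 * foreign_born_estimate)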
Overall, it seems clear that reasonably precise grouped answers
estimates of the undocumented population and its characteristics
require large-scale data collection efforts but not impossibly large
ones.
The Most Efficient Field Strategy Does Not Seem Feasible:
A low-cost field strategy would be to insert the new question series in
an existing, nationally representative, large-scale survey--that is, to
pose the grouped answers questions to the foreign-born respondents
already being interviewed. However, based on our review of ongoing
large-scale surveys, the insertion strategy does not seem feasible.
Specifically, we identified four potentially relevant surveys, but none
met all of our criteria, which are based partly on the grouped answers
design and partly on immigrant advocates' concerns.
The dollar costs associated with inserting a grouped answers module are
difficult to calculate in advance because many factors are involved.
However, to suggest the ballpark cost of a grouped answers insert, if
an insertion were possible, we present the following two examples.
* The GSS test, in which a grouped answers question module was
inserted, cost approximately $100 per interview (more than 200
interviews were conducted). On average, the question series took 3.25
minutes. Logically, per-interview costs are likely to be higher in
relatively small surveys than in larger surveys with thousands of
foreign-born respondents.
* For the much larger Current Population Survey (CPS), with interviews
covering native-born and foreign-born persons in more than 50,000
households, the Census Bureau and BLS told us that "an average 10-
minute supplement cost $500,000 in 2005."[Footnote 89] This implies $10
per interview at the 50,000 level, but per-interview costs might be
higher when the question series applied to only a portion of the
respondents. Additional costs might apply for flash cards and foreign-
language interviews. BLS noted that still other costs would apply for
advance testing and subsequent analyses requested by the customer.
A more costly option would be to ask the grouped answers question
series in a follow-back survey of foreign-born respondents identified
in interviewing for an existing survey. (In-person self-report
interviews can cost $400 to $600 each.) More costly still would be the
development of a new, personal-interview survey of a representative
sample of the foreign-born population devoted to migration issues; the
main reason is that there would be additional costs in "screening out"
households without foreign-born persons.
We identified four potentially relevant ongoing large-scale surveys.
All have prerequisites and processes for accepting (or not accepting)
new questions. We also developed six criteria for assessing the
appropriateness of each survey as a potential vehicle for fielding the
grouped answers approach. Three criteria are based on design
requirements, and three are based on the views of immigrant advocates.
We found that no ongoing large-scale survey met all criteria.
Four Ongoing Large-Scale Data Collections Sometimes Accept Additional
Questions:
We identified four nationally representative, ongoing large-scale
surveys in which respondents are or could be personally
interviewed.[Footnote 90] Three of these conduct most or all interviews
in person:
1. the Current Population Survey (CPS), sponsored by BLS and the Census
Bureau and fielded by Census;
2. the National Health Interview Survey (NHIS), sponsored by the
National Center for Health Statistics (NCHS) and fielded by the Census
Bureau; and:
3. the National Survey on Drug Use and Health (NSDUH), sponsored by
SAMHSA and fielded by RTI International, a private sector contractor.
The fourth survey is the American Community Survey (ACS), a much larger
survey fielded by the Census Bureau and using "mixed mode" data
collection. The majority of the data are based on mailed questionnaires
or telephone interviews, with the remaining data based on personal
interviews. In addition, there is one personal-interview follow-back
survey that uses the ACS frame and data to draw its sample.[Footnote
91] Other follow-back surveys might eventually be possible.
For any of these four surveys, inserting a new question or set of
questions (or fielding a "follow-back" survey based on respondents'
answers in the main survey) requires approvals by the Office of
Management and Budget (OMB), the agencies that sponsor or field the
surveys, and in cases in which data are collected by a private sector
organization, the organization's institutional review board.
The prerequisites for an ongoing survey's accepting new questions
typically include low anticipated item nonresponse, pretesting and
pilot testing (including debriefing of respondents and interviewers)
that indicate a minimum of problems, review by stakeholders to
determine acceptability, and tests that indicate no effect on either
survey response rates or answers to the main survey's existing
questions.[Footnote 92] Another prerequisite would be the expectation
of response validity.[Footnote 93]
Additionally, multiple agencies mentioned a need for prior "cognitive
interviewing," compatibility with existing items (so that there is no
need to change existing items), and no significant increase in
"respondent burden" (by, for example, substantially lengthening the
interview).[Footnote 94]
Agencies sponsoring or conducting large-scale surveys varied on the
perceived relevance of immigration to the main topic of their survey.
For example, BLS noted that some of its customers would be interested
in data on immigration status by employment status (among the foreign-
born), and the Census Bureau has indicated the relevance of
undocumented immigration to population estimation. But some other
agencies saw little relevance to the large-scale surveys they sponsored
or conducted. Resistance to including a grouped answers question series
might occur where an agency perceives little or no benefit to its
survey or its customers.
Additionally, one agency raised the issue of informed consent, which we
discuss in appendix V.
No Ongoing Large-Scale Data Collection Met Our Criteria:
Based on the design of the grouped answers approach, as tested to date,
two criteria for an appropriate survey are (1) personal interviews in
which respondents can view the 3-box cards and (2) a self-report format
in which questions ask the respondents about their own status (rather
than asking one adult member of a household to report information on
others). A third criterion is that the host survey not include highly
sensitive direct questions that could affect foreign-born respondents'
acceptance of the grouped answers questions.[Footnote 95] We based
these criteria on the results of the GSS test, our knowledge of the
grouped answers approach, and general logic.
As shown in table 3, one of the surveys we reviewed (the CPS) does not
meet the self-report criterion; that is, it accepts proxy responses.
Two other surveys (the NHIS and NSDUH) do not meet the criterion of an
absence of highly sensitive questions, since they include questions on
HIV status (NHIS) and the use of illegal drugs (NSDUH). Conducting a
follow-back survey based on ACS would meet all three criteria.[Footnote
96]
Table 3: Survey Appropriateness: Whether Surveys Meet Criteria Based on
the Grouped Answers Design:
Survey type: Ongoing survey;
Specific survey: Current Population Survey (CPS);
Three design-based criteria: 1. Are the data gathered in personal
interviews?: YES. Mostly, for in-person waves; 16% of foreign-born
interviewed by telephone, in the in-person waves[A];
Three design-based criteria: 2. Are all respondents selected to self-
report?: No. An adult respondent reports on self and provides proxy
responses for others in his or her household. In-person data for 6,744
households with 1 or more foreign-born members (2006);
Three design-based criteria: 3. Are direct questions not highly
sensitive?: YES, not highly sensitive[B].
Survey type: Ongoing survey;
Specific survey: National Health Interview Survey (NHIS);
Three design-based criteria: 1. Are the data gathered in personal
interviews?: YES. Mostly; 17% of foreign-born sample adults interviewed
by telephone;
Three design-based criteria: 2. Are all respondents selected to self-
report?: YES. For some questions, but not all, 4,829 foreign-born
adults self-reported (2004);
Three design-based criteria: 3. Are direct questions not highly
sensitive?: No. There are direct questions on HIV, other STDs[C].
Survey type: Ongoing survey;
Specific survey: National Survey of Drug Use and Health (NSDUH);
Three design-based criteria: 1. Are the data gathered in personal
interviews?: Yes. All interviewed in person;
Three design-based criteria: 2. Are all respondents selected to self-
report?: Yes. 7,364 foreign-born age 12 and older and 4,934 foreign-
born age 18+ self-reported (2004);
Three design-based criteria: 3. Are direct questions not highly
sensitive?: No. There are direct questions on respondent's use and sale
of drugs like marijuana and cocaine.
Survey type: Potential follow-back survey;
Specific survey: Potential American Community Survey (ACS) follow-back
survey, by the Census Bureau--on all or a sample of foreign-born on
whom ACS data were collected;
Three design-based criteria: 1. Are the data gathered in personal
interviews?: YES. A follow-back could specify personal interviews only.
(ACS is mixed mode, mostly mail);
Three design-based criteria: 2. Are all respondents selected to self-
report?: YES. A follow-back could specify self-report only. (ACS data
include both self-report data and proxy data in which one member of a
household provides responses for others);
Three design-based criteria: 3. Are direct questions not highly
sensitive?: Yes, not highly sensitive.
Source: GAO analysis.
[A] The CPS includes successive data collections or "waves" to update
data over time, at selected households. In some waves, interviews are
conducted in person; in others, by telephone.
[B] Based on the core CPS questionnaire. (Different modules or
supplements may be added in particular survey years or CPS waves.)
[C] HIV refers to human immunodeficiency virus. STDs refers to sexually
transmitted diseases.
[End of Table]
The views of immigrant advocates, which were echoed by some other
experts, suggested three additional criteria for a candidate "host"
survey:
1. data collection by a university or private sector organization,
2. no request for the respondent's name or Social Security number, and:
3. protection from possible release of grouped answers survey data for
small geographic areas (to guard against estimates of the undocumented
for such areas).
The experts based their views on (1) methodological grounds (foreign-
born respondents would be more likely to cooperate, and to respond
truthfully, if all or some of these criteria were met) and (2) concerns
about privacy protections at the individual or group levels.[Footnote
97] These criteria are potentially important, in part because the
success of a self-report approach hinges on the cooperation of
individual immigrants and, most likely, also on the support of opinion
leaders in immigrant communities.[Footnote 98] With respect to the
first criterion above, we note that with the exception of initial GAO
pretests, all tests of the grouped answers approach have involved data
collection by a university or private sector organization. Without
further tests, we do not know whether acceptance would be equally high
in a government-fielded survey.
As shown in table 4, an ACS follow-back would potentially not meet any
of the three criteria based on immigrant advocates' views. Only one
survey (NSDUH) met all three criteria based on immigrant advocates'
views--and because of its sensitive questions on drug use, that survey
did not meet the design-based table 3 criteria.
Table 4: Survey Appropriateness: Whether Surveys Meet Table 3 (Design
Based) Criteria and Additional Criteria Based on Immigrant Advocates'
Views:
Survey type: Ongoing survey;
Specific survey: Current Population Survey (CPS);
Meets all table 3 (design based) criteria: No;
Three additional criteria based on immigrant advocates' views: 1. Does
a nongovernment organization conduct field work?: No. The Census Bureau
conducts field work[B];
Three additional criteria based on immigrant advocates' views: 2. Are
interviews anonymous (that is, no names or Social Security Numbers are
taken)?: No. Takes names.
Three additional criteria based on immigrant advocates' views: 3. Is
sample too small for reliable small-area estimates of undocumented?[A]:
YES.
Survey type: Ongoing survey;
Specific survey: National Health Interview Survey (NHIS);
Meets all table 3 (design based) criteria: No;
Three additional criteria based on immigrant advocates' views: 1. Does
a nongovernment organization conduct field work?: No. The Census Bureau
conducts field work[C];
Three additional criteria based on immigrant advocates' views: 2. Are
interviews anonymous (that is, no names or Social Security Numbers are
taken)?: No. Takes both names and Social Security numbers;
Three additional criteria based on immigrant advocates' views: 3. Is
sample too small for reliable small-area estimates of undocumented?[A]:
Yes.
Survey type: Ongoing survey;
Specific survey: National Survey of Drug Use and Health (NSDUH);
Meets all table 3 (design based) criteria: No;
Three additional criteria based on immigrant advocates' views: 1. Does
a nongovernment organization conduct field work?: Yes;
Three additional criteria based on immigrant advocates' views: 2. Are
interviews anonymous (that is, no names or Social Security Numbers are
taken)?: Yes;
Three additional criteria based on immigrant advocates' views: 3. Is
sample too small for reliable small-area estimates of undocumented?[A]:
Yes.
Survey type: Potential follow-back;
Specific survey: Potential American Community Survey (ACS) follow-back
survey by the Census Bureau--on all or a sample of foreign-born on whom
data were collected;
Meets all table 3 (design based) criteria: Yes;
Three additional criteria based on immigrant advocates' views: 1. Does
a nongovernment organization conduct field work?: No. Only the Census
Bureau can conduct field work;
Three additional criteria based on immigrant advocates' views: 2. Are
interviews anonymous (that is, no names or Social Security Numbers are
taken)?: No. Takes names in the initial survey, and a follow-back would
be based on knowing each person's identity;
Three additional criteria based on immigrant advocates' views: 3. Is
sample too small for reliable small-area estimates of undocumented?[A]:
Potentially, no. A follow-back might be extremely large. (Also, small-
area releases are not prohibited by law or policy).
Source: GAO analysis.
Note: Table 3 criteria are personal interviews; respondent reports on
himself or herself; no highly sensitive direct questions.
[A] For this report, we define "small area" as below the county level.
[B] For CPS, only the Census Bureau can conduct a follow-back.
[C] For NHIS, a follow-back by a private sector organization might be
possible.
[End of Table]
In conclusion, we did not find a large-scale survey that would be an
appropriate vehicle for "piggybacking" the grouped answers question
series.
Observations:
For more than a decade, the Congress has recognized the need to obtain
reliable information on the immigration status of foreign-born persons
living in the United States--particularly, information on the
undocumented population--to inform decisions about changing immigration
law and policy, evaluate such changes and their effects, and administer
relevant federal programs.
Until now, reliable data on the undocumented population have seemed
impossible to collect. Because of the "question threat" associated with
directly asking about immigration status, the conventional wisdom was
that foreign-born respondents in a large-scale national survey would
not accept such questions--or would not answer them authentically.
Testing So Far Affirms That the Grouped Answers Approach Is Promising:
Using the grouped answers approach to ask about immigration status
seems promising because it reduces question threat and is statistically
logical. Additionally, this report has established that:
* The grouped answers approach is acceptable to most foreign-born
respondents tested (thus far) in surveys fielded by private sector
organizations; it is also acceptable--with some conditions, such as
private sector fielding of the survey--to the immigrant advocates and
other experts we consulted.
* A variety of research designs are available to help check whether
respondents choose (or intend to choose) the correct box.
* The grouped answers approach requires a fairly large number of
personal interviews with foreign-born persons (we estimate 6,000) to
achieve reasonably precise indirect estimates of the undocumented
population overall and within high-risk subgroups.
However, the most cost-efficient method of fielding a grouped answers
question series--piggybacking on an existing survey--does not seem
feasible. Rather, fielding the grouped answers approach would require a
new survey focused on the foreign-born. This raises two new questions
about "next steps"--and the answers depend, in large part, on
policymaker judgments, as described below.
Two New Questions about "Next Steps":
Question 1: Are the costs of a new survey justified by information
needs? DHS stated (in its comments on a draft of this report) that the
"information on immigration status and the characteristics of those
immigrants potentially available through this method would be useful
for evaluating immigration programs and policies." The Census Bureau
has indicated that information on the undocumented would help estimate
the total population in intercensal years. And an expert reviewer
emphasized that a new survey of the foreign-born would be likely to
help estimate the total population.[Footnote 99]
Additionally, policymakers might deem a new survey of the foreign-born
to be desirable for other reasons than obtaining grouped answers data.
Notably, an immigration expert who reviewed a draft of this report
pointed out that a survey focused on the foreign-born might provide
more in-depth, higher-quality data on that population than existing
surveys that cover both the U.S.-born and foreign-born populations. For
example, more general surveys, such as the ACS and CPS, (1) ask a more
limited set of migration questions than is possible in a survey focused
on the foreign-born, (2) are not designed with a primary goal of
maximizing participation by the foreign-born (for example, are not
conducted by private sector organizations), and (3) as DHS pointed out
in comments on a draft of this report, may not be designed to cover
persons who are only temporarily linked to sampled households, because
such persons may have arrived only recently in the United States and
are temporarily staying with relatives.[Footnote 100]
A new survey aimed at obtaining grouped answers data on immigration
status would require roughly 6,000 (or more) personal, self-report
interviews with foreign-born adults. Other in-person, self-report
interviews in large-scale surveys have cost $400 to $600 each. A major
additional cost would be obtaining a representative sample of foreign-
born persons; this would likely require a much larger survey of the
general population in which "mini-interviews" would screen for
households with one or more foreign-born individuals.
We did not study the likely costs of such a data collection or options
for reducing costs. However, survey costs can be estimated (based on,
for example, the experience of survey organizations), and policymakers
can, in future, weigh those costs against the information need--keeping
in mind the results of research on the grouped answers approach, to
date, and experts' opinions on research needed.
Question 2: What further tests of the grouped answers method, if any,
should be conducted before planning and fielding a new survey? On one
hand, advance testing could:
* assess response validity (that is, whether respondents pick--or
intend to pick--the correct box) before committing funds for a survey
and in time to allow adjustments to the question series;
* further delineate respondent acceptance and explore the impact on
acceptance of factors such as government funding--or funding by a
particular agency--in order to inform decisions about whether or how to
conduct a survey;[Footnote 101] and:
* as suggested in DHS's comments on a draft of this report, help
determine the cost of a full-scale survey.[Footnote 102]
On the other hand, extensive advance testing would likely delay the
survey--and may not be needed because:
* response validity could be assessed--and respondent acceptance could
be further delineated--concurrently with or subsequent to the survey
rather than in advance,[Footnote 103]
* the need for advance testing of response validity would be lessened
if policymakers see a need for more or better survey data on the
foreign-born in addition to the need for grouped answers data on
immigration status (see discussion in question 1, above);
* the value of advance testing would be lessened if changes in
immigration law and policy occurred between the time of an advance test
and the main survey, because such changes could affect the context in
which the survey questions are asked and, hence, change the operant
levels of acceptance and validity; and:
* survey costs can be estimated--albeit more roughly--on the basis of
the experience of survey organizations.
Given the arguments for and against advance testing, it seems
appropriate for policymakers to weigh them.
Agency Comments:
We provided a draft of this report to and received comments from the
Department of Commerce, the Department of Homeland Security, and the
Department of Health and Human Services (see appendices VII, VIII, and
IX, respectively). The Office of Management and Budget provided only
technical comments, and the Department of Labor did not comment.
The Census Bureau agreed with the report's discussion of:
* the grouped answers method, including its strengths and limitations;
* the Census Bureau-GSS evaluation, including the conclusions of the
independent consultant (Alan Zaslavsky); and:
* the need for a "validity study" to determine whether the grouped
answers method can "generate accurate estimates" of the undocumented
population.
The Census Bureau also provided technical comments, which we used to
clarify the report, as appropriate.
The Department of Homeland Security stated that the kinds of
information that the grouped answers approach would provide, if
successfully implemented, would be useful for evaluating immigration
programs and policies. DHS further called for pilot testing by GAO to
assess the reliability of data collection and to help estimate the
costs of an eventual survey.[Footnote 104] As we indicate in the
"observations" section of this report, two key decisions for
policymakers concern:
* whether to invest in a new survey and:
* whether substantial testing is required in advance of planning and
fielding a survey.
We believe that depending on the answers to these questions, another
issue--one we cannot address in this report--would concern identifying
the most appropriate agency for conducting or overseeing (1) tests of
the grouped answers approach and (2) an eventual survey of the foreign-born
population. However, we believe that conducting or overseeing such
tests or surveys is a management responsibility and, accordingly, is
not consistent with GAO's role or authorities. DHS made other technical
comments, which we incorporated in the report where
appropriate.[Footnote 105]
The Department of Health and Human Services (HHS) agreed that the NSDUH
would not be an appropriate vehicle for a grouped answers question
series. Commenting on a draft of this report, HHS said that the report
should include more information on variance calculations and on "mirror-
image" estimates.[Footnote 106] Therefore, we (1) added a footnote
illustrating the variance costs of a grouped answers estimate relative
to a corresponding direct estimate and (2) developed appendix VI, which
gives the formula for calculating the variance of a grouped answers
estimate and discusses "mirror image" estimates.
Additionally, HHS said that interviewers should more accurately
communicate with respondents when presenting the three-box cards. We
believe that the text of appendix V on informed consent, based on our
earlier discussions with privacy experts at the Census Bureau, deals
with this issue appropriately. As we state in appendix V, it would be
possible to explain to respondents that "there will be other interviews
in which other respondents will be asked about some of the Box B
categories or statuses." Finally, HHS made other, technical comments,
which we incorporated in the report, as appropriate.
The Office of Management and Budget provided technical comments. In
addition, our discussions with OMB prompted us to re-order some of the
points in the "observations" section of the report.
The Department of Labor informed us that it had no substantive or
technical comments on the draft of the report.
We are sending copies of this report to the Director of the Census
Bureau, Secretary of Homeland Security, Secretary of Health and Human
Services, Secretary of Labor, Director of the Office of Management and
Budget, and to others who are interested. We will also provide copies
to others on request. In addition, the report will be available at no
charge on GAO's Web site at [Hyperlink, http://www.gao.gov].
If you or your staff have any questions regarding this report, please
call me at (202) 512-2700. Contact points for our Offices of
Congressional Relations and Public Affairs may be found on the last
page of this report. Other key contributors to this assignment were
Judith A. Droitcour, Assistant Director, Eric M. Larson, and Penny
Pickett. Statistical support was provided by Sid Schwartz, Mark Ramage,
and Anna Maria Ortiz.
Signed by:
Nancy R. Kingsbury, Managing Director:
Applied Research and Methods:
[End of section]
Appendix I: Scope and Methodology:
To gain insight into the acceptability of the grouped answers approach,
we discussed the approach with numerous experts in immigration studies
and immigration issues, including immigrant advocates. Table 5 lists
the experts we met with and their organizations.
Table 5: Experts GAO Consulted on Immigration Issues or Immigration
Studies:
Name and Title: Steve A. Camarota, Director of Research;
Organization: Center for Immigration Studies.
Name and Title: Robert Deasy, Director, Liaison and Information;
Crystal Williams, Deputy Director;
Organization: American Immigration Lawyers Association[A].
Name and Title: J. Traci Hong, Director of Immigration Program; Terry
M. Ao, Director of Census and Voting Programs;
Organization: Asian American Justice Center[A].
Name and Title: Guillermina Jasso, Professor of Sociology;
Organization: New York University.
Name and Title: Benjamin E. Johnson, Director of Policy, Immigration
Policy Center;
Organization: American Immigration Law Foundation[A].
Name and Title: John L. (Jack) Martin, Director, Special Projects;
Julie Kirchner, Deputy Director of Government Relations;
Organization: Federation for American Immigration Reform.
Name and Title: Douglas S. Massey, Professor of Sociology and Public
Affairs;
Organization: Princeton University.
Name and Title: Mary Rose Oakar, President; Thomas A. Albert, Director
of Government Relations; Leila Laoudji, Deputy Director of Legal
Advocacy; Kareem W. Shora, Director, Legal Department and Policy;
Organization: American-Arab Anti-Discrimination Committee[A].
Name and Title: Demetrios G. Papademetriou, President;
Organization: Migration Policy Institute.
Name and Title: Jeffrey S. Passel, Senior Research Associate;
Organization: Pew Hispanic Center.
Name and Title: Eric Rodriguez, Director, Policy Analysis Center;
Michele L. Waslin, Director, Immigration Policy Research;
Organization: National Council of La Raza[A].
Name and Title: Helen Hatab Samhan, Executive Director;
Organization: Arab American Institute Foundation[A].
Name and Title: James J. Zogby, President; Rebecca Abou-Chedid,
Government Relations and Policy Analyst; Nidal M. Ibrahim, Executive
Director;
Organization: Arab American Institute[A].
Source: GAO.
Note: Other immigration experts we briefly consulted with by telephone
or e-mail or in conversations at an immigration conference included
George Borjas, Professor of Economics and Public Policy, Harvard
University; Georges Lemaitre, Directorate for Employment, Labour, and
Social Affairs, Organisation for Economic Co-operation and Development,
Paris, France; Enrico Marcelli, Assistant Professor of Economics,
University of Massachusetts at Boston; Randall J. Olson, Director,
Center for Human Resource Research, The Ohio State University; and
Michael S. Teitelbaum, Vice President, Alfred P. Sloan Foundation, New
York.
[A] Organization advocating for immigrants or expressly dedicated to
representing their views. We call such organizations immigrant
advocates, although some may not, for example, lobby for legislation.
[End of table]
To ensure that we identified immigration experts from varied
perspectives, we consulted Demetrios G. Papademetriou, who is among the
immigration experts listed in table 5, and Michael S. Teitelbaum, Vice
President of the Alfred P. Sloan Foundation. With respect to immigrant
advocates, we sought to include advocates who represented (1)
immigrants in general, without respect to ethnicity; (2) Hispanic
immigrants, as these are the largest group of foreign-born residents;
(3) Asian American immigrants, as these are also a large group; and (4)
Arab American immigrants, as these have been the target of interior
(that is, nonborder) enforcement efforts in recent years.
To determine what the 2004 General Social Survey (GSS) test indicated
about the acceptability of grouped answers questions to foreign-born
respondents and its "general usability" in large-scale surveys, we
obtained the Census Bureau's report of its analysis of those data, and
we assessed the reliability of the GSS data through a comparison of
answers to interrelated questions. Then we:
* submitted the Census Bureau's report of its analysis to Dr. Alan
Zaslavsky, an independent expert, for review;
* developed our own analysis of the GSS data and submitted our paper
describing that analysis to the same expert;[Footnote 107] and:
* summarized the expert's conclusions and appended his report and the
Census Bureau's report (reproduced in appendixes III and IV), as well
as summarizing our conclusions.[Footnote 108]
We used these procedures to ensure independence, given that the GSS
test was based on our earlier recommendation that the Census Bureau and
the Department of Homeland Security (DHS) test the grouped answers
approach.[Footnote 109]
To describe additional research that might be needed, we outlined the
grouped answers approach and reviewed the main conclusions of the GSS
test in meetings with the immigration experts listed in table 5 and
with private sector statisticians.[Footnote 110] Additionally, we
discussed the approach with various federal officials and staff at
agencies responsible for fielding large-scale surveys.[Footnote 111]
To assess the precision of indirect estimates, we addressed questions
to Dr. Zaslavsky, developed illustrative tables showing hypothetical
calculations under specified assumptions, and subjected those tables to
review.
To identify and describe candidate surveys for piggybacking the grouped
answers question series, we set minimum criteria for consideration
(nationally representative, mainly or only in-person interviews, and
data on at least 50,000 persons overall, including native-born and
foreign-born). Then we identified surveys that met those criteria,
collected documents concerning the surveys, and interviewed officials
and staff at federal agencies that sponsored or conducted those
surveys. We also talked with experts in immigration about additional
key criteria for selecting an appropriate survey.
The scope of our work had several limitations. We did not attempt to
collect new data from foreign-born respondents in a survey, focus
group, or other format. We did not assess census or survey coverage of
the foreign-born or undocumented populations.[Footnote 112] We did not
assess nonresponse rates among foreign-born or undocumented persons
selected for interview. We did not review alternative methods of
obtaining estimates of the undocumented.
While we consulted a number of private sector experts and sought to
include a range of perspectives, other experts may have other views.
Finally, we do not know to what extent the broad range of persons who
compose immigrant communities share the views of the immigrant
advocates we spoke with.
[End of section]
Appendix II: Estimating Characteristics, Costs, and Contributions of
the Undocumented Population:
Key Characteristics Can Be Estimated:
Logically, grouped answers data can be used to estimate subgroups of
the undocumented population, using the following procedures:
1. isolate survey data for (a) the subsample 1 respondents who are in
the desired subgroup, based on a demographic or other question asked in
the survey (for example, if the survey included a question on each
respondent's employment, data could be isolated for foreign-born who
are employed), and (b) subsample 2 respondents in that subgroup;
2. calculate (a) the percentage of the subsample 1 subgroup respondents
who are in each box of immigration status card 1 and (b) the percentage
of subsample 2 subgroup respondents who are in each box of immigration
status card 2; and:
3. carry out the subtraction procedure (percentage in Box B, Card 1,
minus percentage in Box A, Card 2), thus estimating the percentage of
the subgroup who are undocumented.
The resulting percentage can be multiplied by a census count or an
updated estimate of the foreign-born persons who are in the subgroup
(for example, multiply the estimate of the percentage of employed
foreign-born who are undocumented by the census count or updated
estimate of the number of employed foreign-born).
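A minimal sketch of steps 1 through 3, with hypothetical variable names and
a simplified data layout (each respondent record is assumed to carry the box
chosen and whatever characteristic defines the subgroup), is:

   # Hypothetical Python sketch of steps 1-3; data layout and names are
   # assumed, and survey weights are omitted for simplicity.
   def percent_undocumented_in_subgroup(subsample1, subsample2, in_subgroup):
       s1 = [r for r in subsample1 if in_subgroup(r)]   # step 1(a)
       s2 = [r for r in subsample2 if in_subgroup(r)]   # step 1(b)
       pct_box_b_card1 = 100.0 * sum(r["box"] == "B" for r in s1) / len(s1)  # step 2(a)
       pct_box_a_card2 = 100.0 * sum(r["box"] == "A" for r in s2) / len(s2)  # step 2(b)
       return pct_box_b_card1 - pct_box_a_card2          # step 3 (subtraction)

The result can then be multiplied by a census count or updated estimate of
the foreign-born persons in the subgroup, as described above.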
These steps can be repeated to indirectly estimate the size of the
undocumented population within various subgroups defined by activity,
demographics, and other characteristics (such as those with or without
health insurance) that are asked about in the survey. Without an
extremely large survey, it would be difficult or impossible to derive
reliable estimates for subgroups with few foreign-born persons or few
undocumented persons. Ongoing surveys conducted annually have sometimes
combined 2 or 3 years of data in order to provide more reliable
estimates of low-prevalence groups; however, there is a loss of time-
specificity.
Some Program Costs Can Be Estimated:
Program cost data are sometimes available on an average per-person
basis, and surveys sometimes ask about benefit use. In such cases, the
total costs of a program associated with a certain group can be
estimated. Program costs associated with the undocumented population
might be estimated by either (1) multiplying the estimated numbers of
undocumented persons receiving benefits by average program costs or (2)
performing the following procedures:
1. Isolate survey data for all foreign-born subsample 1 respondents who
said they were in Box B of Card 1 and estimate each individual
respondent's program cost.[Footnote 113] Then aggregate the individual
costs to estimate the total program cost (potentially, millions or
billions of dollars) associated with the population of foreign-born
persons defined by the group of immigration statuses in Box B, Card 1.
2. Isolate data for all foreign-born subsample 2 respondents who said
they were in Box A of Card 2 and, as above, estimate each individual
respondent's program costs, aggregating these to estimate the total
program costs associated with the population of foreign-born persons
defined by the immigration statuses in Box A, Card 2 (again,
potentially millions or billions of dollars).
3. Because the only difference between the immigration statuses in Box
B, Card 1, and Box A, Card 2, is the inclusion of the undocumented
status in Box B, Card 1, start with the total program cost estimate for
all Box B, Card 1, respondents and subtract the corresponding cost
estimate for Box A, Card 2, respondents.
The result of the subtraction procedure represents an indirect estimate
of program costs associated with the undocumented population. A more
precise cost estimate can be obtained by calculating an additional
"mirror image" cost estimate--this time, starting with costs estimated
for respondents in Box B of Card 2 and subtracting costs associated
with respondents in Box A of Card 1. The two "mirror image" estimates
could then be averaged.
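The following Python sketch walks through these cost steps and the "mirror image" check with a few invented respondent records (the box choices and per-person cost figures are hypothetical, and survey weighting and scaling to population totals are omitted for simplicity).

# Toy illustration of the program cost procedure and the "mirror image" estimate.
# Each tuple is (box chosen, estimated per-person program cost in dollars);
# all values are invented for the example.

subsample1 = [("B", 1200.0), ("B", 0.0), ("B", 450.0), ("A", 100.0), ("A", 0.0)]   # card 1
subsample2 = [("A", 800.0), ("A", 0.0), ("A", 100.0), ("B", 250.0), ("B", 150.0)]  # card 2

def total_cost(records, box):
    """Aggregate estimated program costs for respondents who chose the given box."""
    return sum(cost for chosen_box, cost in records if chosen_box == box)

# Steps 1-3: total cost for Box B, card 1, minus total cost for Box A, card 2.
estimate = total_cost(subsample1, "B") - total_cost(subsample2, "A")

# "Mirror image" estimate: total cost for Box B, card 2, minus Box A, card 1.
mirror_image = total_cost(subsample2, "B") - total_cost(subsample1, "A")

# Averaging the two independent estimates may yield a more precise result.
averaged = (estimate + mirror_image) / 2.0

print(estimate, mirror_image, averaged)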
The key limitations on such procedures are sample size and the
representation of key subgroups--for example, foreign-born respondents
residing in small states and local areas. Thus, for example, it is
possible that state-level costs associated with undocumented persons
might be estimated with reasonable precision for a large state or city
with many foreign-born persons and a relatively high percentage of
undocumented (potentially, California or New York City) but not for
many smaller states or areas, unless very large samples (or samples
focused on selected areas of interest) were drawn. Further work could
explore the ways that complex analyses could be conducted to help
delineate costs.
Contributions Might Be Estimated:
Contributions can be conceptualized as contributions to the economy
through work or, potentially, through taxes paid. Such contributions
might be estimated by combining grouped answers data with other survey
questions to estimate relevant subgroups, such as employed undocumented
persons. In complex analyses, these data could potentially be combined
with other data to help estimate taxes paid.
Logically, Estimates Can Be Made of Undocumented Children:
Logically, other quantitative estimates might be obtained through
procedures similar to those outlined above for estimating program
costs. For example, the numbers of children in various immigration
statuses might be estimated by asking an adult respondent how many
foreign-born children (or how many foreign-born school-age children)
reside in the household and then--using the 3-box card assigned to the
adult respondent--asking how many of these children are in Box A, Box
B, and Box C.[Footnote 114] We note that, thus far, testing has not
asked respondents to report children's immigration status with the
grouped answers approach.
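As a purely illustrative sketch of how such child counts might be combined (again, this extension has not been tested; the household reports and counts below are invented), the subtraction procedure could be applied to the average number of children reported in each box:

# Hypothetical household reports: each adult respondent states how many
# foreign-born children in the household fall into each box of the 3-box card
# assigned to that adult. All counts are invented.

card1_households = [{"A": 0, "B": 2, "C": 0}, {"A": 1, "B": 1, "C": 0}]  # adults assigned card 1
card2_households = [{"A": 1, "B": 1, "C": 0}, {"A": 1, "B": 0, "C": 0}]  # adults assigned card 2

def mean_children(households, box):
    """Average number of reported foreign-born children per household in the given box."""
    return sum(h[box] for h in households) / len(households)

# As with adult respondents: Box B, card 1, minus Box A, card 2.
undocumented_children_per_household = (
    mean_children(card1_households, "B") - mean_children(card2_households, "A")
)

# Multiplying by an outside count of such households (hypothetical) would give
# an indirect estimate of the number of undocumented children.
print(undocumented_children_per_household)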
Other Estimates May Be Possible:
If subsamples 1 and 2 are sufficiently large, it might also be possible
to estimate the portion of the undocumented population represented by:
* "overstays" who were legally admitted to this country for a specific
authorized period of time but remained here after that period expired
(without a timely application for extension of stay or change of
status)[Footnote 115] and:
* currently undocumented persons who are applicants for legal status
and are waiting for DHS to approve (or disapprove) their application.
To estimate overstays would require a separate question on whether the
respondent had entered the country on a temporary visa.[Footnote 116]
To estimate undocumented persons with pending applications would
require a separate question concerning pending applications for any
form of legal status (including, for example, applications for U.S.
citizenship as well as applications for legal permanent resident status
and other legal statuses).
The precision of such estimates would depend on factors such as sample
size, the percentages of foreign-born who came in on temporary visas or
who have pending applications of some kind, and the numbers of
undocumented persons within these groups.
[End of section]
Appendix III: A Review of Census Bureau and GAO Reports on the Field
Test of the Grouped Answer Method:
A Review of Census Bureau and GAO Reports on the Field Test of the
Grouped Answer Method:
Alan Zaslavsky:
Harvard Medical School:
July 8, 2006:
A field test of the "Grouped Answer Method" (GAM) for estimating the
number of undocumented immigrants was conducted by the National Opinion
Research Center (NORC) in the context of the 2004 General Social Survey
(GSS). A descriptive report on this test was prepared by the Bureau of
the Census and a further report by the Government Accountability Office
(GAO). This is a review of these two documents, focusing on what is
shown by the analyses and what questions remain to be answered. (The
Census Bureau report refers to the method as the "Three Card Method"
(3CM), but in fact the method could be implemented with two or three
different card forms.)
Major findings:
General usability: The test confirms the general usability of the GAM
with subjects similar to the target population for its potential large-
scale use, that is, foreign-born members of the general population. Out
of about 218 respondents meeting eligibility criteria and who were most
likely administered the cards in person (possibly including a few who
had telephone interviews but responded without problems), only 9 did
not respond by checking one of the 3 boxes. Of these, 3 provided verbal
information that allowed coding of a box, and 6 declined to answer the
question altogether. Furthermore, several of these raised similar
difficulties with other 3-box questions on nonsensitive topics (type of
house where born, mode of transportation used to enter the United
States), suggesting that the difficulties with the
question format were at least in part related to the format and not to
the particular content of the answers. Thus indications were that there
would not be a systematic bias due to respondents whose immigration
status is more sensitive being unwilling to address the 3-box format.
Telephone administration: Of 232 otherwise eligible respondents, 14
were identified as telephone respondents. Of these, 10 were identified
because they were followed up in tracking data after failing to provide
usable information in response to the GAM item. While it is not known
how many interviews were done by telephone altogether, the number is
believed to be only a relatively small fraction of the entire survey.
Thus, item nonresponse was largely a problem of telephone interviewing.
The higher nonresponse rate for telephone interviewees was not
surprising given the complexity of the response format (6 categories
grouped into 3 boxes), the reliance of the item on the visual metaphor
of boxes, the use of graphics to assist in remembering the categories,
and the difficulty of comprehending the categories verbally and
remembering the groupings while answering. In particular, the way in
which the 3-box method conceals the sensitive responses would be much
less obvious in a telephone interview. Unfortunately, at the time of
this review NORC was unable to say exactly how many telephone
interviews were administered altogether, so an item nonresponse rate
among telephone interviews could not be calculated. (NORC plans to
disclose individual data on mode of interview (telephone versus in-
person) by the end of 2006, which will make it possible to calculate
item response rates by response mode, in-person versus telephone.)
However, it seems likely for the
reasons mentioned, as well as from the concentration of problems in
telephone interviews, that the success rate of the method for telephone
respondents would be much lower than for in-person respondents. In
future implementations of this method it would be crucial to address
this issue, either by (1) attaching the question to a survey that makes
relatively little use of telephone interviews, or by (2) sending a card
to the respondent in advance of the interview that could be referred to
for visual cues for the item. If these solutions were not practical,
then it might be possible to develop a verbal form of the item adapted
to telephone use, but this would require some laboratory testing.
Limitations of this study:
Single card form: An important limitation of the NORC field test is
that only one card form was tested. This was very understandable as a
design limitation in the test since implementation of a multiform
protocol adds to the complexity of implementation of a study and might
well be judged to be excessively burdensome for a supplementary item.
Nonetheless this means that this test cannot answer questions about
differential rates of nonresponse or procedural difficulties in
responding to the items. It is also likely that even with multiple
forms, this test would have been underpowered to answer more refined
questions about differential rates of nonresponse. With only 9
nontelephone item nonrespondents, a split sample comparison would have
had power to detect only the most extreme differences in nonresponse
rate. However, it is reasonable to generalize about the
comprehensibility of the items from this test, even with a single form,
since the modification of rearranging the options in boxes would not be
expected to affect the usability of the question.
GSS coverage limitations: GSS coverage had some limitations that made
the test unrepresentative of the target population of foreign-born.
Compared to rates estimated from the Current Population Survey, the
foreign-born are undercovered by the GSS (8.4% in the GSS versus 14.5%
in the CPS), with particular undercoverage of recent immigrants and
those from Latin America. The CPS itself likely undercovers recent
immigrants, particularly the undocumented, so the undercoverage problem
might be even greater than revealed by comparison to the CPS. Of
course, by the same token, the CPS and other existing surveys are
likely to be affected by undercoverage to some extent. Special methods
might be required to cover concentrations of immigrant population that
include high rates of undocumented immigrants. The main concern in
relation to the conclusions of the field test is whether the
performance of the items, that is, their acceptability and
comprehensibility, would be different either in these special
populations or with the special methods used to target these populations.
Within the GSS test, the problem cases were not notably concentrated
among recent immigrants or those with more limited English proficiency.
This suggests that the methods of the GAM did not rely on highly
culturally specific references or potentially confusing language.
However, within a community that is largely made up of undocumented
immigrants, even a "mixed" box might be regarded as more identifying
and therefore sensitive than in a more heterogeneous community. For
example, in a migrant labor camp in which there are few citizens,
identifying oneself as "citizen or undocumented immigrant" (as opposed
to a noncitizen with legal status) might be regarded as tantamount to
admitting illegal status, while this would not be the case in a general
population.
English only: Another concern is the use of English only in the GSS.
Many of the issues here are similar to those identified in relation to
undercoverage of recent immigrants in the preceding paragraph. Indeed
the restriction to English-speaking respondents might explain some of
the undercoverage of recent immigrants noted above. The additional
issue raised specifically by English is whether the instructions are
clear in other languages. It might be expected, however, that because
the format of the item is largely graphical, it would not be highly
sensitive to translation.
Questions for further study:
Equivalence of acceptability of the alternative response cards: As
noted above, only one form of the response card was tested in the GSS
implementation. Future studies should use all (two or three)
alternative versions of the card, to evaluate whether item nonresponse
is equivalent for all of the forms, indicating comparable acceptability
of the forms.
Effects of nonresponse and incorrect responses on estimates: The effect
of problems of nonresponse and noncomprehension on the quality of
estimates from the GAM depends critically on the exact form they take,
not just on the percentage of responses that are missing or invalid. If
the group that does not respond to the item is the same regardless of
which card form is used, then the effect of nonresponse can be
understood as simple undercoverage of that nonrespondent group. Thus
within the respondents the analysis proceeds as if with complete data
and the unknowns only concern the characteristics of the
nonrespondents, a group whose size is known. The effects of nonresponse
can be bounded by assuming alternatively that none or all of the
nonrespondents are undocumented immigrants. These extremes might be
implausible, especially if qualitative information about the
nonrespondents (like that collected in the GSS test, or potentially
relationships of nonresponse to characteristics from larger
implementations) suggests that the nonrespondents do not generally look
like undocumented immigrants. Such an argument could be used to develop
plausible tighter bounds on the fraction of undocumented immigrants
overall. A simple assumption would be that the nonrespondents have a
similar fraction of undocumented immigrants to respondents, which would
allow use of the respondents to make estimates for the entire
population.
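As a simple numeric illustration of these bounding arguments (all counts below are hypothetical and are not drawn from the GSS test), the following Python sketch computes the respondent-based estimate, the bounds obtained by assuming that none or all of the nonrespondents are undocumented, and the estimate under the simple assumption that nonrespondents resemble respondents.

# Hypothetical counts for illustrating bounds under item nonresponse, assuming
# the nonrespondent group is the same regardless of which card form is used
# and that the nonresponse rate is the same in both subsamples.
n1, in_box_b_card1 = 2850, 1700   # subsample 1 respondents; Box B (card 1) choosers
n2, in_box_a_card2 = 2850, 1450   # subsample 2 respondents; Box A (card 2) choosers
nonrespondents_per_subsample = 150

# Estimate among respondents (analysis proceeds as if with complete data).
respondent_estimate = in_box_b_card1 / n1 - in_box_a_card2 / n2

respondent_share = n1 / (n1 + nonrespondents_per_subsample)
nonrespondent_share = 1 - respondent_share

# Bounds: assume alternatively that none or all of the nonrespondents are undocumented.
lower_bound = respondent_estimate * respondent_share
upper_bound = respondent_estimate * respondent_share + nonrespondent_share

# Simple assumption: nonrespondents have a similar fraction undocumented to respondents.
simple_estimate = respondent_estimate

print(round(respondent_estimate, 3), round(lower_bound, 3), round(upper_bound, 3))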
If nonresponse depends on which card is presented, the analysis of the
implications is somewhat more complex, since not only the size of the
nonrespondent group but also its distribution across categories could
depend on the card. Note that the latter effect would not be evident if
nonresponse rates overall are the same across cards. For a simple
example, suppose that 10% of citizens would decline to respond to the
card that groups citizens with undocumented immigrants, but would
respond when citizens are ungrouped. Suppose that legally resident
noncitizens behave similarly. Then the boxes including undocumented
immigrants would be reduced by 10% with either card, reducing the
estimate of undocumented immigrants by the same amount even if all the
undocumented immigrants responded accurately. Many other such scenarios
could be constructed. Thus it would be useful to study in larger
samples the factors associated with refusal to respond, particularly to
investigate whether the reasons given by the respondents seem to be
associated with the grouping on the card. The evidence from the GSS
test, however, does not point in the direction of complex nonresponse
patterns like those hypothesized in this paragraph.
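The hypothesized scenario can be made concrete with a small Python sketch. The population composition below is invented, and the calculation is one reading of the scenario rather than a result from the GSS test.

# Numeric sketch of the differential-nonresponse scenario described above.
# Hypothetical composition per 1,000 foreign-born: 500 citizens, 300 lawful
# permanent residents (LPRs), 100 other legal statuses, 100 undocumented,
# so the true undocumented share is 10 percent.
citizens, lprs, other_legal, undocumented = 500, 300, 100, 100
total = citizens + lprs + other_legal + undocumented

# Card 1: Box B = citizens + other legal + undocumented; Box A = LPRs.
# Suppose 10 percent of citizens decline to answer card 1 (which groups them
# with the undocumented) but would answer card 2.
declining_citizens = 0.10 * citizens
card1_respondents = total - declining_citizens
card1_box_b = (citizens - declining_citizens) + other_legal + undocumented

# Card 2: Box A = citizens + other legal; Box B = LPRs + undocumented.
# Suppose 10 percent of LPRs likewise decline to answer card 2.
declining_lprs = 0.10 * lprs
card2_respondents = total - declining_lprs
card2_box_a = citizens + other_legal

estimate = card1_box_b / card1_respondents - card2_box_a / card2_respondents
print(round(100 * estimate, 1), "percent estimated vs. 10 percent true")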
Finally, similar issues apply with respect to response errors
(responding but checking the wrong box). A number of possibilities must
be considered. If a subgroup of legal immigrants systematically reports
the wrong immigration status (for example, legal immigrants authorized
to work in the United States who check the box for citizens) but this
is unaffected by the grouping of categories, this will have no effect
on the estimates for the undocumented. This might be the case, for
example, if some of these respondents are misinformed about their own
status or confused about the meaning of the categories. However, if
they systematically avoid the box for the undocumented (checking that
for citizens or legal noncitizen immigrants as the case may be), this
will tend toward underestimation of the undocumented. If some
undocumented immigrants systematically misreport their status, this
will also create biases in the estimates, especially if they
systematically avoid the box containing undocumented status. The GSS
study does not address this issue.
Effects of mode and mode alternatives: The GSS results support the view
that the multiple-card items are usable with in-person interviews but
more problematical with telephone interviews. Some questions of
interest include the following:
(1) Can the problems with telephone surveys be remedied by sending a
response card before the interview? What would the effect of such a
card be on rates of difficulties in telephone interviews?
(2) Is there potential for use of mail as a response mode for GAM
surveys? A mail survey would benefit from the same graphical
presentation as with the card used in person, but there would be no
opportunity to explain the question further to respondents who were
confused by the format. However, if the method were workable in a mail
survey, it would open up many more potential applications for the
method.
(3) Computer-aided self-interview (CASI) allows a respondent to provide
answers directly to the computer, without letting them be seen by the
interviewer. CASI has been used to reduce the effect of sensitive items
by giving the respondent a greater sense of privacy. Might CASI have a
similar effect with respect to items about immigration status?
Special populations: non-English speaking (linguistically isolated),
low literacy, high density of (undocumented) immigrants: Tests should
be conducted to evaluate the performance of the items in populations
with these characteristics, each of which was poorly or not at all
represented in the GSS and might have an effect on ability or
willingness to complete the item.
Screening questions: The description of possible citizenship questions
in the GAO report (pages 17-18) suggests the possibility of doing some
further screening for citizenship to improve the precision of the
estimates for the undocumented. To explain this concept, suppose that a
3-box item question is asked in which undocumented immigrant status
appears in a box combined with citizens, and in the alternative card
form the citizens appear alone. The estimate of the undocumented is
obtained by subtracting the percentage in the latter box from the
percentage in the former (based on two distinct halves of the split
sample). If there were no other questions about citizenship, then the
estimate would be subject to large variance because it would be based
on subtracting two large percentages, each subject to sampling
variability, to obtain a small difference. At the other extreme, if
there were another item or set of items on the survey that asked about
citizenship, then all of the citizens could be identified directly and
in the first card form, undocumented status could be deduced for each
respondent. In that case the second form could be dispensed with, and
the precision of estimates using the first form would be the same as
with a direct question on status. (This configuration of items is
described purely to illustrate a statistical principle. It must be
emphasized that a questionnaire set up in this way would be contrary to
the methodological and ethical principles underlying use of the GAM. It
would be unethically deceptive since the implicit promise that
undocumented status is not revealed for individuals would be violated.
It would also be methodologically dubious since at least some
respondents would likely sense the revealing nature of the combination
of items.) The method used in the GSS excludes the native-born from
answering the GAM item, thereby limiting the population for this item
to the foreign-born. This represents a beneficial compromise between
the two extreme options described above because it makes the "citizen"
group smaller and therefore reduces error. Note that although this
exclusion was used as a screener in the GSS (skipping out the native-
born from the 3-box item) to shorten average survey length, this was
not necessary statistically since the native-born could have been
excluded afterwards. This suggests, however, that there might be other
ways of asking additional immigration questions that would not fully
identify the undocumented but would still assist in cutting down the
number of respondents sharing a box with the undocumented. The concerns
in doing this would be the ethical (confidentiality) risk and the
possibility that including too many items on status would interfere
with respondent cooperation, so any changes in this direction should be
considered with the utmost caution to make sure that they are
improvements on the current proposal of using a nativity question as a
screener.
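To illustrate the statistical principle at work, the following Python sketch compares the standard error of the subtraction estimate when the boxes being compared contain many respondents who are not undocumented with the standard error when those boxes are smaller. The proportions and sample sizes are invented and are not drawn from the GSS test.

# Hypothetical illustration of why the size of the group sharing a box with
# the undocumented affects precision. All figures are invented.
import math

def se_difference(p1, n1, p2, n2):
    """Standard error of the difference of two independent proportions."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

n = 3000  # respondents per subsample (hypothetical)

# Large shared box: many citizens and other legal statuses share a box with the
# undocumented, so the two percentages being subtracted are both large.
se_large_box = se_difference(0.65, n, 0.55, n)

# Smaller shared box (for example, after screening out respondents known not to
# share a box with the undocumented): both percentages shrink.
se_small_box = se_difference(0.15, n, 0.05, n)

# The estimated difference (10 percentage points) is the same in both cases,
# but its standard error is smaller when the shared box is smaller.
print(round(se_large_box, 4), round(se_small_box, 4))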
Summary of questions for future field tests: To summarize points
appearing above, the following issues should be addressed in future
field tests:
(a) Equivalent acceptability of all forms of the response card,
(b) Usability with special populations including those with low
literacy, the linguistically isolated, and concentrated immigrant
populations,
(c) Methods that avoid telephone interviews, or reduce bias and
nonresponse due to use of the telephone,
(d) Use of followup questions to improve the accuracy of box choices.
[End of section]
Appendix IV: A Brief Examination of Responses Observed while Testing an
Indirect Method for Obtaining Sensitive Information:
A Brief Examination of Responses Observed While Testing an Indirect
Method for Obtaining Sensitive Information:
March 2, 2006:
Luke J. Larsen:
Immigration Statistics Staff:
U.S. Census Bureau:
The Three-Card Method:
Developed by the U.S. Government Accountability Office (GAO) in the
late 1990s, the three-card method (3CM) is designed to obtain accurate
estimates of the unauthorized foreign-born population in the United
States while accomplishing the following tasks:
* Reducing the psychological stress that stems from asking a question
about such a sensitive topic as illegal immigration and:
* Eliminating the possibility that any one respondent could be
identified as an illegal immigrant[Footnote 117].
This is accomplished by drawing three random sub-samples from the
foreign-born population and administering to each sub-sample a
different variation of the migrant status question (each in the form of
a card that is shown to respondents, hence the name "three-card
method"). For this question, foreign-born respondents are asked to
indicate one of three migrant-status categories to which each of them
belongs:
* A specific status, such as "lawful permanent resident,"
* A collection of four other statuses, including "unauthorized
migrant," or:
* A "catch-all" group for people whose statuses do not fit into the
other two categories.
For each question variant, the status in the first group is swapped
with one of the statuses in the second group, so that each sub-sample
has a different configuration of categories (in no instance is the
unauthorized migrant status listed in the first group). When the data
have been collected, the various migrant status estimates from all
three sub-samples are combined to obtain an indirect estimate of
undocumented migrants in the entire sample.
In a 1998 "recommendations report," GAO requested that the U.S. Census
Bureau conduct a test of the 3CM in a field environment[Footnote 118].
To perform this test, the Census Bureau contracted with the National
Opinion Research Center (NORC) of the University of Chicago to add a
set of 3CM-oriented questions, including one designed to ask about
migrant status, to their 2004 General Social Survey (GSS).
About NORC and the GSS:
Established in 1941, NORC specializes in objective public opinion
research in many areas of public policy interest, including health,
labor, and education. Many survey projects administered by NORC provide
a wealth of social indicators based on the attitudes and opinions of
the public, while other studies focus on program evaluation, social
experiments, needs assessments, and epidemiological case control
designs. NORC has also proven itself to be a pioneer in the growing
field of survey methodology, pushing forward improvements in data
collection through electronic means and emphasizing the importance and
utility of objective public opinion research.
Prominent among survey products administered by NORC is the GSS, a
biennial (since 1994, nearly annual from 1972-1993) survey that
collects data about a number of demographic and attitudinal variables
from a national area probability sample of adult respondents. In
addition to the core demographic and attitudinal variables, the GSS
also implements a series of special interest topical question modules
on a rotational basis and, from time to time, experiments based on
question wording, context effects, validity/reliability assessments,
and other methodological issues. Because of the wide scope of topical
content and the focus on objective data collection, the GSS has become
a popular and valuable resource for academic researchers, policy
makers, and the mass media alike.
Methodology:
The 3CM, as originally developed by GAO, did not conform to the survey
design specifications of the GSS. Therefore, NORC was unable to
administer a different variation of the migrant status question to each
of three separate sub-samples. Instead, NORC used a modified version of the
3CM, wherein only one version of the migrant status question (in which
Box A is for those who are lawful permanent residents) was administered
within the entire GSS sample. Though this modification limited our
ability to analyze the full 3CM and draw conclusions, we can use the
3CM data from the GSS to test how respondents react to the migrant
status question and how well they understand the question format.
NORC did not insert the 3CM questions directly into the core survey
instrument, but instead appended them to the survey in the form of a
question module. This module was not given to all respondents; rather,
it was administered only to those who were born outside the United
States (as determined by their responses to a question in the core
instrument). Thus, while this filtering method was successful in
exposing all foreign-born respondents to the 3CM question module, it
also allowed born-abroad U.S. natives to answer the module. However,
the focus of this analysis is solely on the foreign born.
The 3CM question module in the 2004 GSS consisted of three 3CM-designed
questions to be administered to the respondent and two standard
questions asked of the field representative (FR). The first two 3CM
questions are primer questions that served to familiarize the
respondent with the question format, the visual aids, and expected
response behavior (specifically, indicating to which of the three
groups the respondent belongs). The third question, which asks about
the respondent's migrant status, is the focal point of the question
module. When the respondent has completed these three questions, the FR
was then asked to evaluate whether the respondent appeared to
understand the 3CM question format and whether the respondent objected
or hesitated to answer the migrant status question.
Analysis:
Demographic Characteristics:
The total respondent count - both native and foreign born - of the 2004
GSS was 2,812 people[Footnote 119]; of the total respondent pool, 237
people (8.4 percent) were foreign born[Footnote 120]. The distributions
of the foreign-born-in-sample and the total sample from the 2004 GSS
are shown in Table 1 across six demographic variables: sex, age,
Hispanic origin, marital status, educational attainment, and world
region of birth[Footnote 121]. Additionally, it would be worthwhile to
know how these distributions compare to national estimates produced by
the Census Bureau. We can obtain this information by using estimates
provided by the 2004 Annual Social and Economic Supplement (ASEC) to
the Current Population Survey (CPS). For example, in 2004, the U.S.
adult (aged 18 years and over) foreign-born population[Footnote 122]
31.1 million people represented 14.5 percent of the total adult
population according to the 2004 ASEC, a share that is significantly
larger than the 8.4 percent given by the GSS sample[Footnote 123].
distributions of the foreign-born population and the total population
from the 2004 ASEC across the same demographic variables are also shown
in Table 1.
Comparing the GSS and ASEC distributions revealed some interesting
information about the composition of the GSS sample[Footnote 124]. For
example, the foreign-born and total distributions by age and the
foreign-born distributions by sex were not statistically different
between the two data sources; however, the total GSS sample had a
larger proportion of women than that represented by the ASEC estimates.
Also, foreign-born distributions of world region of birth showed that
the GSS sample has less representation (relative to the point estimates
from the ASEC distributions) of those born in Latin America and more
representation of those born in Europe[Footnote 125].
Responses to the Migrant Status Question:
Among the 237 foreign-born respondents in the GSS sample, 87 people
(36.7 percent) indicated belonging to Box A (lawful permanent
resident), 128 people (54.0 percent) indicated belonging to Box B (U.S.
citizen, student/work/tourist visa, undocumented, or refugee/asylee), 1
person (0.4 percent) indicated belonging to Box C (other category not
in Boxes A or B), 4 people (1.7 percent) gave a response other than Box
A, B, or C, and 17 people (7.2 percent) were non-respondents who either
refused to answer the question or gave a "don't know" response. That
roughly 90 percent of foreign-born respondents gave preferred responses
(Boxes A, B, or C) is an indication that most foreign born who are
asked about their migrant status in this format would understand the
question, know the answer, and answer willingly.
Field Representative Responses to the "Understand" and "Objection"
Questions:
The field representatives reported that 190 of the foreign-born
respondents (80.5 percent) appeared to understand the 3CM question
format, whereas 22 respondents (9.2 percent) appeared not to understand
the format. Also, the field representatives for another 14 respondents
(5.9 percent) gave an "other" response to this question, and 10 more
field representatives (4.2 percent) were non-respondents (of which one
field representative response was missing). It appears that there was
some confusion among the field representatives in how to answer this
question, since all responses should have been "yes" or "no." The
crossed data between the migrant status question and the understanding
question appears to support this statement; for example, of the 14
respondents whose field representatives assigned an "other" response to
the understanding question, 12 gave preferred responses to the migrant
status question. Depending on whether the "other," "refused," and
"don't know" responses are assigned as "yes" or "no," the results
indicate that between 10 and 20 percent of the respondents did not
appear to understand the 3CM question format.
The field representatives also reported that 216 of the foreign-born
respondents (91.5 percent) did not raise an objection, hesitate, or
remain silent when asked the migrant status question. Only 5
respondents (2.1 percent) raised a verbal objection and 4 respondents
(1.7 percent) either hesitated to answer or remained silent. As with
the "understanding" question, there appeared to be a slight issue with
field representatives misunderstanding the "objection" question, as 2
respondents were assigned a response of "other" and 9 were designated
as non-respondents (once again, one field representative response was
missing). Interestingly, 3 respondents who objected to the migrant
status question actually gave a preferred response, as did 3
respondents who hesitated to answer (obviously they did not remain
silent). Also, 3 people who answered the question immediately gave an
"other" response, and 3 more either refused to answer or replied with
"don't know." However, the overwhelming majority of foreign-born
respondents gave a preferred response (Boxes A, B, or C) to the migrant
status question without objection, hesitation, or silence.
Response Patterns to the Migrant Status Question by Characteristic:
Twenty-one foreign-born respondents (8.9 percent) in the survey did not
give a preferred answer to the migrant status question; that is, they
either gave an "other" response (4 people, or 1.7 percent), a "don't
know" response (11 people, or 4.7 percent), or a refusal to answer the
question (6 people, or 2.5 percent). It is important to know whether
these non-preferred responses to the 3CM-based migrant status question
are more likely to occur for certain demographic cohorts among the
foreign-born population. Therefore, we examined the distribution of non-
preferred responses to the migrant status question across dimensions of
age, sex, Hispanic origin, marital status, educational attainment, and
world region of birth. Keeping in mind that there are not enough cases
under consideration to establish that non- preferred responses are
influenced by one or more characteristics, we can study these data for
clues to patterns that might exist, had we a larger response pool with
which to work.
Of the six demographic variables being studied, only age and sex
appeared to show disproportionate distributions of non-preferred
responses. Specifically, the "don't know" responses were more prevalent
among the older foreign born (aged 45 years and over; 7 people) than
the younger foreign born (18 to 44 years old; 4 people), even though
the younger group outnumbered the older group by a strong margin. Also,
refusals were more prevalent among foreign-born females (5 people) than
males (1 person), even though the foreign-born-in-sample were about
equally distributed by sex. Outside of these two instances, the data
suggested no relationship between each of the four remaining
demographic variables and the patterns of non-preferred responses to
the migrant status question. However, the small number of foreign-born-
in-sample - and the consequently smaller number of respondents with non-
preferred responses - makes it difficult to determine whether these
trends are particularly pronounced.
Respondent Comments Regarding the 3CM:
While administering the 3CM question module, field representatives were
instructed to collect verbal comments from the respondents regarding
each question and to submit their own comments for the two
representative-directed questions. They were also instructed to enter
respondents' answers when they did not conform to the 3CM format, thus
comprising the category of responses known as "other." We shifted away
from quantitative analysis to examine this qualitative data in an
attempt to learn more about how respondents and field representatives
perceive and respond to the 3CM questions. One piece of information
gleaned from this analysis is that 25 respondents (10.5 percent) tended
not to simply state to which migrant status group they belonged, but to
state what their status was in both implicit ("been in country since
age 6") and explicit ("I have a visa") terms. This number may actually
be larger, since some field representatives might not have entered the
respondents' comments. However, this raises the issue of how field
representatives handled responses such as these. In some cases, when a
respondent made such a comment, the field representative entered a
response of "other," but in other cases, the response was set to one of
the boxes. This pattern of inconsistent coding suggests that field
representatives may have used their own judgment to set responses
according to respondents' actual answers.
Another useful piece of information is that the 3CM question format
became problematic when attempts were made to administer the survey
over a telephone. As previously stated, the GSS is conducted in a face-
to-face environment in most cases, but in the event that a sampled
person is not available when the field representative comes to the
home, a follow-up attempt is made via telephone. However, since the 3CM
is designed for use in a face-to-face setting, both respondents and
field representatives had trouble with the question module over the
phone. This is evidenced in the comment fields, wherein field
representatives stated in two cases that they were unable to do the
questions over the phone. Because we cannot assume that every field
representative made a note regarding difficulty with administering the
module over the phone, we don't know how many follow-up interviews this
problem affected.
Conclusion:
In compliance with the GAO recommendations, the U.S. Census Bureau was
able to conduct a field test of the three-card method (via NORC and the
GSS) and analyze the results. In summary, we found that nine out of ten
foreign-born respondents to the migrant status question gave format-
appropriate answers (Box A, B, or C), eight out of ten appeared to
understand the format of the 3CM questions, and nine out of ten did not
raise an objection, remain silent, or hesitate to answer when asked the
migrant status question. Furthermore, the non-preferred responses to
the migrant status question ("other," "don't know," or "refusal") did
not appear to be strongly related to any of the six demographic
variables under consideration. We also found a number of operational
issues with the data, such as the tendency of some respondents to
indicate their specific migrant status despite instructions not to do
so, the inconsistent coding of proper responses among field
representatives when given an answer other than a "box" response, and
the difficulty in administering 3CM-designed questions in a situation
other than a face-to-face environment.
Table 1: Comparison of 2004 GSS Sample and 2004 CPS ASEC Estimates by
Nativity and Selected Characteristics (in percent)[1]:
Characteristics: Sex[3]: Male;
2004 GSS: Foreign Born in Sample: 48.9;
2004 GSS: Total Sample: 45.5;
2004 CPS ASEC[2]: Foreign-born Population: 50.3;
2004 CPS ASEC[2]: Total Population: 48.3.
Characteristics: Sex[3]: Female;
2004 GSS: Foreign Born in Sample: 51.1;
2004 GSS: Total Sample: 54.5;
2004 CPS ASEC[2]: Foreign-born Population: 49.7;
2004 CPS ASEC[2]: Total Population: 51.7.
Characteristics: Age[3]: 18-44 years:
2004 GSS: Foreign Born in Sample: 63.6;
2004 GSS: Total Sample: 50.2;
2004 CPS ASEC[2]: Foreign-born Population: 60.9;
2004 CPS ASEC[2]: Total Population: 51.5.
Characteristics: Age[3]: 45 years and over;
2004 GSS: Foreign Born in Sample: 36.4;
2004 GSS: Total Sample: 49.8;
2004 CPS ASEC[2]: Foreign-born Population: 39.1;
2004 CPS ASEC[2]: Total Population: 48.5.
Characteristics: Hispanic Origin[4]: Hispanic (of any race);
2004 GSS: Foreign Born in Sample: 27.4;
2004 GSS: Total Sample: 8.7;
2004 CPS ASEC[2]: Foreign-born Population: 45.2;
2004 CPS ASEC[2]: Total Population: 12.4.
Characteristics: Hispanic Origin[4]: Not Hispanic;
2004 GSS: Foreign Born in Sample: 72.6;
2004 GSS: Total Sample: 91.3;
2004 CPS ASEC[2]: Foreign-born Population: 54.8;
2004 CPS ASEC[2]: Total Population: 87.6.
Characteristics: Marital Status[5]: Currently or previously married;
2004 GSS: Foreign Born in Sample: 80.2;
2004 GSS: Total Sample: 78.0;
2004 CPS ASEC[2]: Foreign-born Population: 74.4;
2004 CPS ASEC[2]: Total Population: 71.0.
Characteristics: Marital status[5]: Never married;
2004 GSS: Foreign Born in Sample: 19.8;
2004 GSS: Total Sample: 22.0;
2004 CPS ASEC[2]: Foreign-born Population: 25.6;
2004 CPS ASEC[2]: Total Population: 29.0.
Characteristics: Educational Attainment[6]: At least high school
diploma;
2004 GSS: Foreign Born in Sample: 53.4;
2004 GSS: Total Sample: 87.0;
2004 CPS ASEC[2]: Foreign-born Population: 67.2;
2004 CPS ASEC[2]: Total Population: 85.6.
Characteristics: Educational Attainment[6]: At least bachelor's degree;
2004 GSS: Foreign Born in Sample: 39.8;
2004 GSS: Total Sample: 28.0;
2004 CPS ASEC[2]: Foreign-born Population: 27.3;
2004 CPS ASEC[2]: Total Population: 26.3.
Characteristics: World Region of Birth[3,7]: Europe;
2004 GSS: Foreign Born in Sample: 23.8;
2004 GSS: Total Sample: X;
2004 CPS ASEC[2]: Foreign-born Population: 13.9;
2004 CPS ASEC[2]: Total Population: X.
Characteristics: World region of Birth[3,7]: Asia;
2004 GSS: Foreign Born in Sample: 28.5;
2004 GSS: Total Sample: X;
2004 CPS ASEC[2]: Foreign-born Population: 25.8;
2004 CPS ASEC[2]: Total Population: X.
Characteristics: World region of Birth[3,7]: Latin America;
2004 GSS: Foreign Born in Sample: 38.3;
2004 GSS: Total Sample: X;
2004 CPS ASEC[2]: Foreign-born Population: 52.9;
2004 CPS ASEC[2]: Total Population: X.
Characteristics: World region of Birth[3,7]: Other Regions[8];
2004 GSS: Foreign Born in Sample: 9.3;
2004 GSS: Total Sample: X;
2004 CPS ASEC[2]: Foreign-born Population: 7.4;
2004 CPS ASEC[2]: Total Population: X.
Characteristics: World region of Birth[3,7]: Other Regions[8]: Africa;
2004 GSS: Foreign Born in Sample: 7.2;
2004 GSS: Total Sample: X;
2004 CPS ASEC[2]: Foreign-born Population: NA;
2004 CPS ASEC[2]: Total Population: X.
Characteristics: World region of Birth[3,7]: Other Regions[8]:
Australia;
2004 GSS: Foreign Born in Sample: 0.4;
2004 GSS: Total Sample: X;
2004 CPS ASEC[2]: Foreign-born Population: NA;
2004 CPS ASEC[2]: Total Population: X.
Characteristics: World region of Birth[3,7]: Other Regions[8]: Canada;
2004 GSS: Foreign Born in Sample: 1.7;
2004 GSS: Total Sample: X;
2004 CPS ASEC[2]: Foreign-born Population: NA;
2004 CPS ASEC[2]: Total Population: X.
Sources: National Opinion Research Center (2004 GSS) and U.S. Census
Bureau (2004 CPS ASEC).
"X" indicates "not applicable"; "NA" indicates "mat available; Some
distributions may not add to 100.0% due to rounding.
[1]- The GSS data cited in this table are based on unweighted counts
and should not be construed as population estimates.
[2] -The population universe of the CPS is restricted to the civilian
non-institutionalized population in the United States, though some
members of the armed forces may be included if they live with family
members in off-post housing. For brevity, this report will refer to
this population as the total population.
[3] -The ASEC-based foreign-born and total population estimates for
age, sex, and world region of birth are for the adult (18 years or
older) population, in order to be more comparable with the adult-only
GSS sample.
[4] -The ASEC-based total population estimates regarding Hispanic
origin are for the adult population, while the foreign-born estimates
regarding Hispanic origin are for those aged 25 years or older. Since
most of the Hispanic foreign born were born in Latin America, and
because most of the foreign-born aged 18 to 24 years were born in Latin
America (66.0 percent, based on 2004 CPS ASEC data), the share of
Hispanic foreign-born adults in the U.S. would likely be more than the
share of Hispanic foreign born aged 25 years or older.
[5] -The ASEC-based foreign-born and total population estimates
regarding marital status are for those who are aged 15 years or older.
Since relatively few people under the age of 18 tend to get married,
the share of currently or previously married people aged 18 and older
among the foreign-born and total populations would likely be greater
than the foreign-born and total population shares of currently or
previously married people aged 15 and over, and the corresponding never
married shares would likely be lower.
[6] -The ASEC-based foreign-born population estimates for educational
attainment are based on those who are aged 25 years or older, while the
total population estimates are based on those who are aged 18 years or
older. Since those aged 18 to 24 years are less likely than older
people in the total population to have attained either at least a high
school diploma (77.9 percent and 85.2 percent, respectively) or at
least a bachelor's degree (8.4 percent and 27.7 percent, respectively),
the shares of adult foreign born who attained at least a high school
diploma or at least a bachelor's degree would likely be smaller than
these shares shown for the foreign born aged 25 years or older,
assuming that educational attainment trends for the total population
aged between 18 and 24 years can be transferred to the foreign-born
population of the same age group.
[7] -Because the focus of this report is upon the foreign-born
population, we chose to examine the world regions of birth only for the
foreign-born.
[8] -"Other Regions" includes Northern America, Africa, and Oceania.
[End of Table]
[End of section]
Appendix V: The Issue of Informed Consent:
Appropriately informing each respondent about what information he or
she is being asked to provide is a key issue. On one hand, the grouped
answers approach logically conveys to each respondent exactly what he
or she is being asked to reveal about himself or herself; no one we
spoke with suggested otherwise. On the other hand, the grouped answers
question series does not indicate that the respondent is being asked to
participate in an effort that will result in estimates of all
immigration statuses. Therefore, a statement is needed to convey this
information.
Officials and staff at the National Center for Health Statistics (NCHS)
were particularly concerned about this issue and believed that failing
to adequately address informed consent issues could be considered
unethical.[Footnote 126]
Privacy protection specialists at the Census Bureau said that:
* An introductory statement before the first immigration-related
question might be phrased, "The next questions are geared to helping us
know more about immigration and the role that it plays in American
life."
* When each respondent is shown the 3-box training cards, it would be
possible to explain to him or her that--while the survey does not ask,
and does not want to know, the specifics of which Box B category
applies to him or her--there will be other interviews in which other
respondents will be asked about some of the Box B categories or
statuses.[Footnote 127]
* Just before showing each respondent the immigration status card, it
should be stated--and, in fact, interviewers stated in the test with
Hispanic farmworkers--that "Using the boxes allows us to obtain the
information we need, without asking you to give us information that you
might not want to." Further: "Because we're using the boxes, we WON'T
'zero in' on anything somebody might not want to tell us."[Footnote
128]
* It may also be possible to explain that the study's goal is to allow
researchers to broadly estimate all categories or statuses on the card
for the population of immigrants--but to indicate that this will be
done without ever asking questions that "zero in" on something that
some respondents might not want to disclose in an interview.
* Neither the estimation method (that is, the two cards) nor the
specific policy relevance of immigration-status estimates would have to
be described to all respondents. However, interviewer statements should
be provided for responding to respondents who have doubts or questions.
[End of section]
Appendix VI: A Note on Variances and "Mirror Image" Estimates:
The statistical expression and variance of a grouped answers estimate
is as follows, with the starting point being the percentage or
proportion of subsample 1 who are in Box B, Card 1, and the procedure
being to subtract from this the proportion of subsample 2 who are in
Box A, Card 2 (with cards and boxes as defined in figure
3):[Footnote 129]
Grouped answers estimate = p1 - p2, where:
p1 = the proportion of subsample 1 who are in Box B, Card 1; and
p2 = the proportion of subsample 2 who are in Box A, Card 2.
Variance (p1 - p2) = [(p1q1/n1) + (p2q2/n2)], where:
q1 = 1 - p1; q2 = 1 - p2; and n1 and n2 are the numbers of respondents
in subsamples 1 and 2, respectively.
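For illustration, the following short Python sketch evaluates these formulas with hypothetical proportions and subsample sizes (none of the figures below are drawn from this report):

# Minimal numeric sketch of the estimate and variance formulas above;
# proportions and subsample sizes are hypothetical.
import math

n1, n2 = 3000, 3000   # subsample sizes
p1 = 0.55             # proportion of subsample 1 in Box B, Card 1
p2 = 0.45             # proportion of subsample 2 in Box A, Card 2
q1, q2 = 1 - p1, 1 - p2

estimate = p1 - p2                              # grouped answers estimate
variance = (p1 * q1 / n1) + (p2 * q2 / n2)      # variance of the difference
margin_of_error = 1.96 * math.sqrt(variance)    # approximate 95 percent margin of error

print(estimate, round(margin_of_error, 4))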
The immigration status cards in figure 3 are designed so that Boxes A
and B include all major immigration statuses. This design ensures that,
on each card, the Box B categories apply to the largest possible number
of legally present respondents. In designing the cards this way, we
reasoned that this should reduce the question threat associated with
choosing Box B.
As a result, few respondents are expected to choose Box C ("some other
category not in Box A or Box B"). For example, in the 2004 GSS test,
only one foreign-born respondent of more than 200 chose Box C.
Therefore, we believe that for purposes of illustrative variance
calculations, it is reasonable to assume that no one chooses Box C.
Under this assumption, the two mirror-image estimates of the percentage
of the foreign-born who are undocumented would necessarily be exactly
the same, as explained below.
Assuming that no respondent chooses Box C, then:
q1 = 1 - p1 = the proportion of subsample 1 in Box A, Card 1; and
q2 = 1 - p2 = the proportion of subsample 2 in Box B, Card 2.
The alternative, mirror-image estimate can then be defined as follows:
Mirror-image estimate = q2 - q1:
As indicated above, q1 and q2 are defined in terms of p1 and p2. Using
algebraic substitution, we have:
Mirror-image estimate = q2 - q1 = (1 - p2) - (1 - p1) = p1 - p2.
In other words, under the assumption that no one chooses Box C, the
mirror-image estimates of the percentage undocumented are, by
definition, identical. Thus, no precision gain follows from combining
them[Footnote 130]. No additional information is provided by a second,
mirror-image estimate.
In contrast, quantitative indirect estimates are based on a combination
of (1) grouped answers data and (2) additional, separate quantitative
data or estimates (for example, per-person estimates of emergency-visit
costs based on respondent reports of number of emergency room visits in
the past year and other information from hospitals on per-visit costs).
If the quantitative data are tallied or totaled for individuals in each
box of each card, the result is four different figures, none of which
can be derived from the others. (There are different respondents in
each box, and each would have separately reported how many emergency
room visits, for example, he or she made in the past year.) Thus, for
quantitative estimates of this type, calculating two independent mirror-
image estimates, and averaging them, may yield a more precise result.
[End of section]
Appendix VII: Comments from the Department of Commerce:
The Deputy Secretary Of Commerce:
Washington, D.C. 20230:
September 19, 2006:
Ms. Judith A. Droitcour:
Assistant Director:
Applied Research and Methods:
United States Government Accountability Office:
Washington, DC 20548-0001:
Dear Ms. Droitcour:
The U.S. Department of Commerce appreciates the opportunity to comment
on the United States Government Accountability Office's draft report
entitled Estimating the Undocumented Population: A "Grouped Answers"
Approach to Surveying Foreign-Born Respondents (GAO-06-775). I enclose
the Department's comments on this report.
Sincerely,
Signed by:
David A. Sampson:
Enclosure:
U.S. Department of Commerce:
Comments on the United States Government Accountability Office Draft
Report Entitled Estimating the Undocumented Population: A "Grouped
Answers " Approach to Surveying Foreign-Born Respondents (GAO-06-775)
September 2006:
The U.S. Census Bureau generally agrees with the observations in this
report but has some comments and clarifications about various
statements.
Regarding footnote 1 on page 1:
GAO Report: "Our previous reports and those of other government
agencies have sometimes used the terms undocumented, illegal aliens,
illegal immigrants, unauthorized immigrants, and not legally present.
We use undocumented here, because this report concerns a technique for
surveying the foreign-born, an ongoing federal survey uses this term as
a response category when asking about legal status, and foreign-born
respondents appear to understand the term. We define undocumented as
foreign-born persons who are illegally present in the United States.
Foreign-born persons (i.e., those not born a U.S. citizen) were born
outside the United States to parents who were both not U.S. citizens at
the time of the birth."
Census Bureau Response: Although the Census Bureau has used the term
"undocumented," we generally prefer the term "unauthorized" rather than
"undocumented." When legal statuses associated with the "unauthorized"
category are not separately estimable or are demographically not
meaningful, we use the term "residual" to describe this group.
Regarding footnote 2 on page 1:
GAO Report: "Most recently, the Census Bureau has stated that among its
"enhancement priorities" to "improve estimates of net international
migration" are efforts to estimate "international migrants by migrant
status (legal migrants, temporary migrants, quasi-legal migrants,
unauthorized migrants, and emigrants)" with the overall purpose being
to produce annual estimates of the U.S. population. ("The U.S. Census
Bureau's Intercensal Population Estimates and Projections Program:
Basic Underlying Principles," paper distributed by the Bureau of the
Census at its conference on "Population Estimates: Meeting User Needs,"
Embassy Suites, Alexandria, Virginia, July 19, 2006.)"
Census Bureau Response: The Census Bureau is researching methods of
estimating the size of the foreign-born population by legal status.
Regarding footnote 51 on page 29:
GAO Report: "We note that these two examples involve agencies that are
apparently viewed neutrally by the immigrant community. Agencies that
are negatively viewed by at least some are the Department of Homeland
Security (DHS) and Census."
Census Bureau Response: We are not aware of empirical evidence that the
Census Bureau is viewed negatively by any specific groups.
Our specific comments about the report are as follows:
Pages 6 to 15: The description of the "grouped response" method is
accurate, including the discussion of strengths and limitations.
Pages 21 to 26 and pages 64 to 68: The discussion of the Census Bureau-
sponsored General Social Survey evaluation, including its strengths and
limitations, and Dr. Zaslavsky's evaluation are accurately described.
Pages 35 to 38: The Census Bureau agrees that a "validity study" is a
good idea. The "validity study" of the grouped response methods would
need to be performed to determine if the "grouped response" method can
be used and will generate accurate estimates.
[End of section]
Appendix VIII: Comments from the Department of Homeland Security:
U.S. Department of Homeland Security:
Washington, DC 20528:
September 12, 2006:
Ms. Nancy R. Kingsbury:
Managing Director:
Applied Research and Methods:
US General Accountability Office:
Washington, DC 20548:
Re: Draft Report GAO-06-775 "Estimating the Undocumented Population: A
"Grouped Answers" Approach to Surveying Foreign-Born Respondents."
Thank you for the opportunity to review the draft report. GAO
demonstrates that the "grouped answers" approach to surveying foreign-
born respondents has the potential to capture information on
unauthorized aliens in the United States that is not available using
existing methods and sources. They also serve notice that there are
significant hurdles to implementing the approach. The Office of
Immigration Statistics (OIS) believes that information on immigration
status and the characteristics of those immigrants potentially
available through this method would be useful for evaluating
immigration programs and policies (e.g., characteristics of
unauthorized aliens, program benefit use, and method of entry). We
therefore recommend that GAO pilot the methodology in a limited
geographic area in order to determine whether the information can be
collected reliably, and to better estimate costs of a national survey.
Our more specific comments to the report are listed below.
If a new survey needs to be developed then it should be designed to
cover all foreign-born persons in the country no matter their time in
the United States. The current, national surveys are limited to those
who have lived here at least 2 months and likely exclude some
unauthorized and temporary migrants.
The GAO report (page 53) suggests that the reliability of lawfully
admitted immigrants' responses could be tested by making comparisons
with publicly available administrative information. The comparisons may
not be made as directly as implied because administrative data on
immigrant flows will have to be adjusted for estimated changes in
population, such as through emigration and mortality.
Sincerely,
Signed by:
Steven J. Pecinovsky:
Director:
Departmental GAO/OIG Liaison Office:
[End of section]
Appendix IX: Comments from the Department of Health and Human Services:
Office of the Assistant Secretary for Legislation:
Department Of Health & Human Services:
Washington, D.C. 20201:
SEP 12 2006:
Nancy R. Kingsbury:
Managing Director, Applied Research and Methods:
U.S. Government Accountability Office:
Washington, DC 20548:
Dear Ms. Kingsbury:
Enclosed are the Department's comments on the U.S. Government
Accountability Office's (GAO) draft report entitled, "Estimating the
Undocumented Population: A Grouped Answers Approach to Surveying
Foreign-Born Respondents" (GAO-06-775), before its publication.
These comments represent the tentative position of the Department of
Health and Human Services and are subject to reevaluation when the
final version of this report is received.
The Department provided several technical comments directly to your
staff.
The Department appreciates the opportunity to comment on this draft
report before its publication.
Sincerely,
Signed by:
Vincent J. Ventimiglia, Jr.
Assistant Secretary for Legislation:
Comments From The Department Of Health And Human Services On Estimating
The Undocumented Population: A Grouped Answers Approach To Surveying
Foreign-Born Respondents GAO-06-775:
HHS Comments:
GAO is correct in their assessment that the National Survey on Drug Use
and Health (NSDUH) is NOT appropriate for collecting data on
immigration status. NSDUH has a large number of sensitive questions on
the use of illicit drugs that may cause persons with undocumented
status to not select the correct box in the "grouped answers" section
out of fear of somehow being identified. Also, the fact that NSDUH is
sponsored by a government agency may not be acceptable to foreign-born
respondents. The report indicated that this population may feel more
comfortable responding to a study sponsored by a university or private
sector organization.
The procedure used to estimate the size of the undocumented population
is provided on page 12; however, it does not indicate that the "mirror-
image" estimate could be used in combination with the other estimate in
an attempt to reduce variance. If there is some variance reduction,
this could mean that a smaller sample size is needed, thus reducing
costs.
Add an appendix where formulas are presented on the estimation of the
undocumented population along with its variance. Include the
combination of the "mirror-image" estimate and its variance. How does
the variance of the "grouped answers" estimate compare to an estimate
based on a question asked directly? Even though asking a direct
question is not feasible, we can get a perspective on how different the
"grouped answers" variance is from a the variance from a more
traditional estimator.
Disclosure of use of data: The respondents are shown three boxes. Each
one lists several possible immigration statuses, including United
States citizen and legal permanent resident, as well as undocumented
resident (See pages 8-9). The undocumented status always appears in Box
B along with other responses. The respondents are asked to choose the
box that contains their immigration status. If they choose the one with
the undocumented status, which is always Box B, they are told, "If the
specific category that applies to you is in Box B, we do not want to
know which one it is because we are focusing on Box A categories."
While it's true that the interviewers do not want to know the specific
immigration status for any specific respondent, it is not true that
they are focusing on Box A categories. In fact, the entire purpose of
the exercise is to estimate how many people are undocumented by
extrapolating from the number that choose Box B.
[End of section]
Appendix X: GAO Contact and Staff Acknowledgments:
GAO Contact:
Nancy R. Kingsbury, (202) 512-2700 or kingsburyn@gao.gov.
Staff Acknowledgments:
Key GAO staff contributing to this report include Judith A. Droitcour,
Eric M. Larson, and Penny Pickett. Statistical support was provided by
Sid Schwartz, Mark Ramage, and Anna Maria Ortiz.
[End of section]
Bibliography:
Bird, Ronald. Statement of Ronald Bird, Chief Economist, Office of the
Assistant Secretary for Policy, U.S. Department of Labor, before the
Committee on the Judiciary, U.S. Senate, July 5, 2006.
Boruch, Robert, and Joe S. Cecil. Assuring the Confidentiality of
Social Research Data. Philadelphia: University of Pennsylvania Press,
1979.
Camarota, Steven A., and Jeffrey Capizzano. "Assessing the Quality of
Data Collected on the Foreign Born: An Evaluation of the American
Community Survey (ACS): Pilot and Full Study Findings," Immigration
Studies White Papers, Sabre Systems Inc., April 2004.
http://www.sabresys.com/whitepapers/CIS_whitepaper.pdf (Sept. 6, 2006).
Costanzo, Joseph, and others, "Evaluating Components of International
Migration: The Residual Foreign-Born," Population Division Working
Paper 61, U.S. Census Bureau, Washington, D.C., June 2002, p. 22.
Droitcour, Judith A., and Eric M. Larson, "An Innovative Technique for
Asking Sensitive Questions: The Three-Card Method," Bulletin de
Méthodologie Sociologique, 75 (July 2002): 5-23.
El-Badry, Samia, and David A. Swanson, "Providing Special Census
Tabulations to Government Security Agencies in the United States: The
Case of Arab-Americans," paper presented at the 25th International
Population Conference of the International Union for the Scientific
Study of Population, Tours, France, July 18-23, 2005.
Hill, Kenneth. "Estimates of Legal and Unauthorized Foreign-Born
Population for the United States and Selected States Based on Census
2000." Presentation at the U.S. Census Bureau Conference, Immigration
Statistics: Methodology and Data Quality, Alexandria, Virginia,
February 13-14, 2006.
Hoefer, Michael, Nancy Rytina, and Christopher Campbell. Estimates of
the Unauthorized Immigrant Population Residing in the United States:
January 2005. Washington, D.C.: Department of Homeland Security, Office
of Immigration Statistics, August 2006.
GAO. Undocumented Aliens: Questions Persist about Their Impact on
Hospitals' Uncompensated Care Costs, GAO-04-472. Washington, D.C.: May
21, 2004.
GAO. Illegal Alien Schoolchildren: Issues in Estimating State-by-State
Costs, GAO-04-733. Washington, D.C.: June 23, 2004.
GAO. Overstay Tracking: A Key Component of Homeland Security and a
Layered Defense, GAO-04-82. Washington, D.C.: May 21, 2004.
GAO. Record Linkage and Privacy: Issues in Creating New Federal
Research and Statistical Information. GAO-01-126SP. Washington, D.C.:
April 2001.
GAO. Survey Methodology: An Innovative Technique for Estimating
Sensitive Survey Items, GAO/GGD-00-30. Washington, D.C.: November 1999.
GAO. Immigration Statistics: Information Gaps, Quality Issues Limit
Utility of Federal Data to Policymakers, GAO/GGD-98-164. Washington,
D.C.: July 31, 1998.
Greenberg, Bernard G., and others. "The Unrelated Questions Randomized
Response Model: Theoretical Framework." Journal of the American
Statistical Association, 64 (1969): 520-39.
Kincannon, Charles Louis, "Procedures for Providing Assistance to
Requestors for Special Data Products Known as Special Tabulations and
Extracts," memorandum to Associate Directors, Division Chiefs, Bureau
of the Census, Washington, D.C., August 26, 2004.
Locander, William, and others. "An Investigation of Interview Method,
Threat, and Response Distortion." Journal of the American Statistical
Association, 71 (1976): 269-75.
National Research Council, Committee on National Statistics, Local
Fiscal Effects of Illegal Immigration: Report of a Workshop.
Washington, D.C.: National Academy Press, 1996.
Passel, Jeffrey S. "The Size and Characteristics of the Unauthorized
Migrant Population in the U.S.: Estimates Based on the March 2005
Current Population Survey." Research Report. Washington, D.C.: Pew
Hispanic Center, March 7, 2006.
Passel, Jeffrey S., Rebecca L. Clark, and Michael Fix. "Naturalization
and Other Current Issues in U.S. Immigration: Intersections of Data and
Policy," In Proceedings of the Social Statistics Section of the
American Statistical Association: 1997. Alexandria, Va.: American
Statistical Association, 1997.
Robinson, J. Gregory. "Memorandum for Donna Kostanich." DSSD A.C.E.
Revision II Memorandum Series No. PP-36. Washington, D.C.: U.S. Bureau
of the Census, December 31, 2002.
Rytina, Nancy F. Estimates of the Legal Permanent Resident Population
and Population Eligible to Naturalize in 2004. Washington, D.C.:
Department of Homeland Security, Office of Immigration Statistics,
February 2006.
Shryock, Henry S., and Jacob S. Siegel and Associates. The Methods and
Materials of Demography. Washington, D.C.: U.S. Government Printing
Office, 1980.
Siegel, Jacob S., and David A. Swanson. The Methods and Materials of
Demography, 2nd ed. San Diego, Calif.: Elsevier Academic Press, 2004.
U.S. Census Bureau, "The U.S. Census Bureau's Intercensal Population
Estimates and Projections Program: Basic Underlying Principles," paper
distributed by the Census Bureau at its conference on Population
Estimates: Meeting User Needs, Alexandria, Virginia, July 19, 2006.
U.S. Commission on Immigration Reform. U.S. Immigration Policy:
Restoring Credibility: 1994 Report to Congress. Washington, D.C.: U.S.
Government Printing Office, 1994.
U.S. Immigration and Naturalization Service, Office of Policy and
Planning. Estimates of the Unauthorized Immigrant Population Residing
in the United States: 1990 to 2000. Washington, D.C.: January 2003.
U.S. Department of Labor, Findings from the National Agricultural
Workers Survey (NAWS) 2000-2002: A Demographic and Employment Profile
of United States Farm Workers. Research Report 9. Washington, D.C.:
March 2005.
Warner, Stanley. "Randomized Response: A Survey Technique for
Eliminating Evasive Answer Bias." Journal of the American Statistical
Association, 60 (1965): 63-69.
Warren, Robert, and Jeffrey S. Passel. "A Count of the Uncountable:
Estimates of Undocumented Aliens Counted in the 1980 Census."
Demography, 24:3 (1987): 375-93.
[End of section]
FOOTNOTES
[1] Our previous reports and those of other government agencies have
sometimes used the terms undocumented, illegal aliens, illegal
immigrants, unauthorized immigrants, and not legally present. We use
undocumented here, because this report concerns a technique for
surveying the foreign-born and an ongoing federally funded survey uses
this term as a response category when asking about legal status. We
define undocumented as foreign-born persons who are illegally present
in the United States. Foreign-born persons (that is, persons not born
as U.S. citizens) were born outside the United States to parents
neither of whom was a U.S. citizen at the time of the birth.
[2] Most recently, the Census Bureau has stated that among its
"enhancement priorities" to "improve estimates of net international
migration" are efforts to research ways of estimating "international
migrants by migrant status (legal migrants, temporary migrants, quasi-
legal migrants, unauthorized migrants, and emigrants)" with the overall
purpose of producing annual estimates of the U.S. population. ("The
U.S. Census Bureau's Intercensal Population Estimates and Projections
Program: Basic Underlying Principles," paper distributed by the Census
Bureau at its conference on Population Estimates: Meeting User Needs,
Alexandria, Virginia, July 19, 2006.)
[3] GAO, Immigration Statistics: Information Gaps, Quality Issues Limit
Utility of Federal Data to Policymakers, GAO/GGD-98-164 (Washington,
D.C.: July 31, 1998), and Survey Methodology: An Innovative Technique
for Estimating Sensitive Survey Items, GAO/GGD-00-30 (Washington, D.C.:
November 1999).
[4] See GAO/GGD-98-164 and GAO/GGD-00-30.
[5] The GSS is a long-standing series of nationally representative
personal-interview self-report surveys, each consisting of a "core"
question series and additional "modules." The funding for fielding the
core question series is provided by a grant from NSF. The modules are
question series added through grants from and contracts with a variety
of sources. The Census Bureau contracted for a grouped answers module
in the 2004 GSS. The bulk of the funding for that Census-GSS contract
had been provided to the Census Bureau by the Department of Homeland
Security (DHS). This test of the grouped answers approach was in
response to our earlier recommendation in GAO/GGD-98-164.
[6] The acceptability of the grouped answers approach for use in a
national survey is defined here primarily in terms of (1) the responses
of immigrant advocates when the grouped answers approach is explained
to them (that is, objecting versus not objecting to or accepting the
method) and (2) respondents' tendency to pick a box when the grouped
answers immigration status question is posed to them (rather than their
refusing or saying that they "don't know"). The opinions of other
experts--for example, those who have conducted studies of immigrants--
are also relevant, as are interviewer judgments about respondent
reactions.
[7] In all, we consulted over 20 private sector immigration experts
(listed in appendix I, table 5). Because of the importance of immigrant
advocates' views on the issues in surveying immigrants, table 5
identifies the experts representing immigrant advocate organizations.
For purposes of this report, we define immigrant advocate organizations
as those whose purpose includes representing the immigrants' point of
view. More generally, in reporting the views of the experts we
consulted, we recognize that in some cases other knowledgeable persons
might have differing views.
[8] Alan Zaslavsky is Professor of Statistics, Department of Health
Care Policy, Harvard Medical School, Boston, Massachusetts. We selected
Dr. Zaslavsky because he (1) is independent with respect to the method
we discuss; (2) is a noted statistician who has received many awards,
has advised multiple executive agencies on the design and analysis of
large-scale surveys, and serves on the National Research Council's
(NRC) Committee for National Statistics at the National Academy of
Sciences; and (3) has developed innovative statistical approaches. We
also sought the advice of two other noted statisticians who had advised
us in earlier work on this method (Dr. Fritz Scheuren and Dr. Mary
Grace Kovar of NORC at the University of Chicago) and GAO colleagues
with expertise in statistics.
[9] We talked with four agencies sponsoring or conducting these
surveys: the Census Bureau in the Department of Commerce, the Bureau of
Labor Statistics (BLS) in the Department of Labor, and the National
Center for Health Statistics (NCHS) and the Substance Abuse and Mental
Health Services Administration (SAMHSA) in the Department of Health and
Human Services (HHS). Survey-related staff at these agencies provided
information on the specific surveys. Additionally, we deemed some staff
at these agencies to be experts in statistics and survey research.
[10] These included the Statistical and Science Policy Branch of the
Office of Information and Regulatory Affairs in the Office of
Management and Budget (OMB), the Employment and Training Administration
in the Department of Labor (DOL), and the Office of Immigration
Statistics within the Policy Directorate and the Research and
Evaluation Division, Office of Policy and Strategy, U.S. Citizenship
and Immigration Services in the Department of Homeland Security (DHS).
[11] Our reanalysis differed from the Census Bureau's in that we
eliminated 19 GSS cases that we deemed ineligible because, for example,
interviewing took place over the telephone rather than in person, as
required by the grouped answers approach; we found that 6 respondents
of more than 200 failed to provide usable, specific answers.
[12] The GSS allowed bilingual household members to help respondents
with limited English skills. Our earlier testing with farmworkers was
conducted in Spanish, but no testing has covered linguistically
isolated non-Hispanic respondents. About 4 percent of the foreign-born
population both (1) does not speak Spanish and (2) is linguistically
isolated (that is, is part of a household in which no member age 14 or
older speaks English "very well"). Although this may seem a small
percentage, it is possible that non-Hispanic undocumented persons are
concentrated in this group.
[13] The distinction between accurate responses and the intent to
answer accurately is necessary because some respondents may mistakenly
think that they are, for example, in a legal status.
[14] We define "reasonably precise" as a 90 percent or 95 percent
confidence interval spanning plus or minus 2 to 4 percentage points. A
90 percent or 95 percent confidence interval is the interval within
which the parameter in question would be expected to fall 90 percent or
95 percent of the time, if the sampling and interval estimation
procedures were repeated in an infinite number of trials.
[15] In many cases, the method would not be suitable for low-risk
subgroups. (High-risk and low-risk refer to subgroups with above-
average and below-average percentages of undocumented persons,
respectively.)
[16] The grouped answers approach derives from (1) the residual method
described by Henry S. Shryock and Jacob S. Siegel and Associates, The
Methods and Materials of Demography (Washington, D.C.: U.S. Government
Printing Office, 1980), and Robert Warren and Jeffrey S. Passel, "A
Count of the Uncountable: Estimates of Undocumented Aliens Counted in
the 1980 Census," Demography, 24:3 (1987): 375-93, and (2) earlier
indirect survey-based techniques, such as "randomized response" (see
Stanley Warner, "A Survey Technique for Eliminating Evasive Answer
Bias," Journal of the American Statistical Association, 60 (1965): 63-
69, and Bernard Greenberg and others, "The Unrelated Questions
Randomized Response Model: Theoretical Framework," Journal of the
American Statistical Association, 64 (1969): 520-39).
[17] Note that Box B in figure 1 uses the term currently
"undocumented"--with quotation marks around undocumented. We believe
this wording may help communicate with undocumented respondents who
either (1) had a legal status in the past (for example, entered with a
temporary visa but have now overstayed and thus lost their legal
status) or (2) are likely to acquire a legal status in the near future
(for example, entered illegally and applied for legal status but have
not yet received it). Potentially, the quotation marks might help
communicate with respondents who have some kind of document (for
example, a "matricula card" issued by the Mexican government) but who
do not have a valid legal immigration status that allows U.S.
residence.
[18] In the test with Hispanic farmworkers, interviewers explained:
"Because we're using the boxes--we WON'T 'zero in' on anything somebody
might not want to tell us."
[19] In the future, changes in the percentages of foreign-born in various
statuses might warrant changes in groupings across the boxes.
Additionally, the specific legal statuses defined by law might change,
requiring a change in the legal statuses shown on the cards.
[20] Unlike some other indirect estimation techniques, the grouped
answers approach does not require unusual stratagems as part of the
survey interview, such as asking respondents to make a secret random
selection of a question.
[21] The result of the subtraction would be the same, either way--
assuming that the same percentage of subsample 1 and subsample 2 chose
Box C.
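The algebra behind footnote 21 can be made explicit in notation we
introduce here for illustration (it is not the report's notation): let
the proportions of subsample s choosing Boxes A, B, and C on its card
sum to one.

% Notation introduced for illustration only; s indexes the two subsamples.
\[
  p_A^{(s)} + p_B^{(s)} + p_C^{(s)} = 1, \qquad s = 1, 2 .
\]
\[
  \bigl(p_B^{(1)} - p_A^{(2)}\bigr) - \bigl(p_B^{(2)} - p_A^{(1)}\bigr)
  = p_C^{(2)} - p_C^{(1)} ,
\]
% so the two subtractions agree exactly when the Box C percentages are equal.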
[22] For example, in the test with Hispanic farmworkers, respondents
who picked Box A and said they were legal permanent residents (they had
a green card) were asked (1) under which program they had applied for a
green card (Family Unity, employer, and so forth), (2) whether they had
received the card (or had applied but not yet received it), (3) how
they received it (in person or by mail), and (4) whether they had then
applied for U.S. citizenship--and if so, whether they had received
citizenship.
[23] If a respondent decides to reclassify himself or herself in Box B,
on the basis of follow-up questions, survey procedures can record only
the Box B classification--and delete the original Box A classification,
as well as any answers to Box A follow-up questions. This prevents
retention of any detailed immigration-status material on respondents in
Box B.
[24] The additional question would ask for the number of foreign-born
children in the household who are in each box of the same immigration
status card that the adult respondent used to report which box he or
she is in. However, this questioning approach has not been tested.
[25] The 15 states and their percentages of foreign-born residents in
2005 were Arizona, 14.5; California, 27.2; Colorado, 10.1; Connecticut,
12.5; Florida, 18.5; Hawaii, 17.2; Illinois, 13.6; Maryland, 11.7;
Massachusetts, 14.4; Nevada, 17.4; New Jersey, 19.5; New York, 21.4;
Rhode Island, 12.6; Texas, 15.9; Washington, 12.2. The percentage in
the District of Columbia was 13.1.
[26] Statement of Ronald Bird, Chief Economist, Office of the Assistant
Secretary for Policy, U.S. Department of Labor, before the Committee on
the Judiciary, U.S. Senate, July 5, 2006.
[27] Michael Hoefer, Nancy Rytina, and Christopher Campbell, Estimates
of the Unauthorized Immigrant Population Residing in the United States:
January 2005 (Washington, D.C.: Department of Homeland Security, Office
of Immigration Statistics, August 2006).
[28] Jeffrey S. Passel, "The Size and Characteristics of the
Unauthorized Migrant Population in the U.S.: Estimates Based on the
March 2005 Current Population Survey," Research Report (Washington,
D.C.: Pew Hispanic Center, Mar. 7, 2006).
[29] The first figure is from U.S. Immigration and Naturalization
Service, Office of Policy and Planning, Estimates of the Unauthorized
Immigrant Population Residing in the United States: 1990 to 2000
(Washington, D.C.: January 2003); the second is from Hoefer, Rytina,
and Campbell.
[30] While different estimates are based on different definitions of
undocumented, and there are questions about data reliability, it seems
clear that the population of undocumented foreign-born persons is large
and has increased rapidly.
[31] The alternative assumptions were made for levels of (1) American
Community Survey (ACS) undercounting of "unauthorized" immigrants and
(2) emigration from the United States on the part of legal immigrants
counted as having been "admitted" between 1980 and 2004.
[32] Hoefer, Rytina, and Campbell, p. 6.
[33] See Kenneth Hill, "Estimates of Legal and Unauthorized Foreign-
Born Population for the United States and Selected States Based on
Census 2000," presentation at the U.S. Census Bureau Conference,
Immigration Statistics: Methodology and Data Quality, Alexandria,
Virginia, February 13-14, 2006. A similar point was made by Jacob S.
Siegel and David A. Swanson, The Methods and Materials of Demography,
2nd ed. (San Diego, Calif.: Elsevier Academic Press, 2004), p. 479.
[34] Administrative records on where legal immigrants live are based on
their residence (or intended residence) at the time when legal
permanent resident status was attained; these records have not been
subsequently updated. There are no administrative records on current
activities of legal permanent residents, such as employment.
[35] See U.S. Commission on Immigration Reform, U.S. Immigration
Policy: Restoring Credibility: 1994 Report to Congress (Washington,
D.C.: U.S. Government Printing Office, 1994), pp. 179-86.
[36] NRC, Committee on National Statistics, Local Fiscal Effects of
Illegal Immigration: Report of a Workshop (Washington, D.C.: National
Academy Press, 1996), pp. 1-2.
[37] See, for example, GAO, Illegal Alien Schoolchildren: Issues in
Estimating State-by-State Costs, GAO-04-733 (Washington, D.C.: June 23,
2004), and Undocumented Aliens: Questions Persist about Their Impact on
Hospitals' Uncompensated Care Costs, GAO-04-472 (Washington, D.C.: May
21, 2004). For a more general discussion, see GAO/GGD-98-164, ch. 2,
"Policy-Related Information Needs."
[38] Census Bureau staff told us that this research includes J. Gregory
Robinson, "Memorandum for Donna Kostanich," DSSD A.C.E. Revision II
Memorandum Series No. PP-36, U.S. Bureau of the Census, Washington,
D.C., December 31, 2002.
[39] GAO/GGD-98-164, p. 3.
[40] While NAWS data collections are fielded annually, results are
generally reported every other year. See U.S. Department of Labor,
Findings from the National Agricultural Workers Survey (NAWS) 2000-
2002: A Demographic and Employment Profile of United States Farm
Workers. Research Report 9 (Washington, D.C.: March 2005).
[41] The SIPP flash card has neither an undocumented category nor an
"other status not listed" category. However, persons reported to have
an immigration status not on the SIPP card--which would logically
include undocumented persons as well as a small number of persons in
various minor legal immigration categories--are tallied separately.
[42] Although NAWS and SIPP have received OMB clearance (under the
Paperwork Reduction Act), and although no special field problems have
emerged, it is difficult to say whether field problems might arise in
future. Reasons include question-threat and related problems depending,
in part, on contextual factors, such as current levels of immigration
enforcement in the nonborder areas of the United States, and the
perceived relevance of the question to the survey.
[43] The contract specified that Aguirre would provide GAO data on
actual responses that had been "stripped of person-identifiers and
related information."
[44] Additionally, GAO conducted cognitive interviews focused on
testing the appropriateness of the icons used on the cards (see GAO/
GGD-00-30, pp. 44-45). Cognitive interviewing focuses on the mental
processes of the respondent while he or she is answering a survey
question. The goals are to find out what each respondent thinks the
question is asking, what the specific words or phrases (or icons on a
card) mean to him or her, and how he or she formulates an answer.
Typically, cognitive interviewing is an iterative process in which the
findings or problems identified in each set of interviews are used to
modify the questions to be tested in the next set of interviews.
[45] GAO/GGD-98-164 and GAO/GGD-00-30.
[46] The GSS consists of a "core" question series and additional
"modules." The funding for fielding the core question series is
provided by a grant from NSF. The modules are question series added
through a variety of grants and contracts.
[47] An expert reviewer of a draft of this report noted that the
housing types on the training card shown in figure 5 are not all
mutually exclusive; that is, a single family house can be located on a
farm.
[48] These cards were initially subjected to 1997-98 developmental
tests conducted with more than 100 Hispanic immigrants who were
farmworkers or in other situations such as applying for aid at a legal
clinic specializing in immigration cases--such that a fair number of
those interviewed seemed relatively likely to be undocumented. See GAO/
GGD-00-30 and GAO/GGD-98-164.
[49] The Census Bureau's paper said that field representatives reported
that the remaining respondents were in doubt and may not have
understood.
[50] The Census Bureau's paper also noted that the nonresponse rate for
the GSS overall (that is, averaged across a combination of U.S.-born
and foreign-born persons selected for the sample) was 29.6 percent.
(Persons who are selected for interview but not interviewed may be
either native-born or foreign-born; because they were never asked and
never reported where they were born, a specific response rate for the
foreign-born cannot be calculated.)
[51] See Samia El-Badry and David A. Swanson, "Providing Special Census
Tabulations to Government Security Agencies in the United States: The
Case of Arab-Americans," paper presented at the 25th International
Population Conference of the International Union for the Scientific
Study of Population, Tours, France, July 18-23, 2005. One advocate was
particularly concerned about the possibility that lower respondent
cooperation might have resulted from these incidents and, if so, might
have led to underrepresentation of these communities in Census Bureau
data. Additionally, one advocate questioned whether local estimates of
the undocumented might, in future, facilitate possible efforts to base
apportionment on population counts that do not include undocumented
residents. We note that most large-scale personal-interview surveys do
not include sufficient numbers of foreign-born respondents to allow
indirect grouped answers estimates of undocumented persons for small
geographic areas, such as zip codes.
[52] See "U.S. Customs and Border Protection Statement on Census Data,"
Department of Homeland Security, Press Office, Washington, D.C., August
13, 2004.
[53] Charles Louis Kincannon, Director, "Procedures for Providing
Assistance to Requestors for Special Data Products Known as Special
Tabulations and Extracts," memorandum to Associate Directors, Division
Chiefs, Bureau of the Census, Washington, D.C., August 26, 2004.
[54] It might be noted that SIPP officials told us that when the Census
Bureau conducted the SIPP survey and asked about immigration status,
interviewers did not experience field problems. However, SIPP asks
about immigration status at the time when respondents came to this
country (and one other question); SIPP stopped short of a specific
question on current undocumented status--and the SIPP data do not allow
indirect estimation of the number who are currently undocumented.
[55] These two examples involve agencies that are viewed neutrally by
the immigrant advocates we talked with. (Agencies that are viewed
negatively by some immigrant advocates are DHS and the Census Bureau.)
[56] GSS receives funding for its core questions through a grant from
NSF. GSS interviewers and advance letters told respondents about the
NSF sponsorship. Additionally, respondents were told that one purpose
of the survey was to inform government officials.
[57] This would mean communication that takes account of cultural as
well as language concerns.
[58] The 2004 GSS was limited to respondents who either were fluent in
English or were helped by a household member who was fluent in English;
some persons with limited English proficiency are likely to have been
reached. The preliminary testing and development of the grouped answers
approach offered a choice of Spanish or English interviews. However,
linguistically isolated non-Hispanics have not yet been included in any
test.
[59] Later in this report, we describe potential ways of testing
whether respondents "pick the correct box"--ways that do not require
routine collection of respondent names and Social Security numbers as
part of the main survey.
[60] CASI, or Computer Assisted Self Interview, means that the
respondent himself or herself uses a laptop to view the questions and
flash cards and to indicate his or her answers. Audio-CASI adds
earphones so that questions and instructions can be spoken to the
respondent while he or she views the questions on the screen. Audio-
CASI programming can be completed in any one of several languages.
Experts told us that studies have shown increased reporting of
sensitive items when audio-CASI is used.
[61] Two advocates mentioned positively the transparency that the
Census Bureau works toward through outreach to immigrant-advocate
organizations. This outreach includes explanation of data collection
goals and policies.
[62] GSS Director Tom Smith graciously arranged for a hand check of
interviews coded refusal or "don't know," thus providing key
information to us in time for this report. (Specific mode-of-interview
data for all 2004 GSS respondents will not be available until the end
of 2006.) The GSS Director also said that, overall, about 10 percent of
the 2004 GSS interviews were conducted over the telephone.
[63] Similar numbers refused or said "don't know" on the two 3-box
training cards. Specifically, 8 respondents refused or said "don't
know" on the housing card, 6 on the transportation card.
[64] Alternatively, we believe that it might be possible to estimate
the bias incurred by including a small number of telephone interviews
in the analysis (or by eliminating them from the analysis).
[65] Questions were asked and answers were apparently given in English.
[66] The pretesting and cognitive testing conducted on the cards so far
has been limited to certain groups of Hispanics. We believe that
testing with other groups, potentially including focus group testing,
could be important before large-scale implementation. It also might be
appropriate to change specific categories and definitions of statuses
on the cards, depending on future changes in laws.
[67] In fact, a key part of the earlier testing focused on the
development of icons to help respondents with limited literacy.
[68] NCHS has suggested that some kind of validity test at the
individual level is needed. Interviewing persons whose status is known
in advance is a classic approach.
[69] One expert scoffed at a validity test limited to persons whose
immigration status is known to DHS. An immigrant advocate pointed to
the issues that arose when the Census Bureau helped DHS obtain publicly
available information on ethnicity by zip code; she indicated that a
public relations problem could result even if only carefully crafted,
carefully protected sharing of information took place.
[70] One immigrant advocacy organization pointed out that it would be
important in such a study to protect the data so that the agency
checking records (in this instance, the Social Security Administration)
could not discover information about any identifiable respondent.
Protective approaches might include (1) using code numbers and a "third
party" model and (2) adding numerous "fake" cases to the checklist and
notifying the agency that this was being done. (See GAO, Record Linkage
and Privacy: Issues in Creating New Federal Research and Statistical
Information. GAO-01-126SP (Washington, D.C.: April 2001).)
[71] The ideas for these approaches are an outgrowth of our discussions
concerning NSDUH with SAMHSA. The NSDUH project officer said that as
part of that survey (which is fielded by RTI International in Research
Triangle Park, N.C., under a contract with SAMHSA), a sample of
respondents were offered $25 for a hair sample and $25 for a urine
sample. Ninety percent of those offered the incentive payments provided
one or both samples.
[72] It would be important to craft such a study so that respondents
would not be tempted to distort information in order to receive
payment. One immigrant advocate suggested asking "what other experience
federal agencies have had with paying a select group of respondents to
participate in a validity test" to determine "whether the payment
approach is considered scientifically sound." One way of addressing
this concern might be to offer all or some Box B respondents a "minimal
threat" follow-up opportunity, such as participating in a focus group,
which could also be associated with a payment.
[73] Other possible comparative analyses might also be useful. DHS
suggested comparisons to results from the Latin American Migration
Project and the New Immigrant Survey.
[74] This is a version of the standard "known groups" validity test--an
approach that NCHS suggested using if it is not possible to conduct
individual checks.
[75] An expert in immigration studies suggested this test. As DHS's
comments indicate, such a test would involve adjusting the DHS figures
on, for example, the number of green cards issued in specific years to
account for subsequent return-migration and mortality, as well as
taking account of survey undercoverage. For information on adjustments
needed in comparisons involving green cards, see Nancy F. Rytina,
Estimates of the Legal Permanent Resident Population and Population
Eligible to Naturalize in 2004 (Washington, D.C.: Department of
Homeland Security, Office of Immigration Statistics, February 2006),
p. 3, table 2. For an analogous comparison for U.S. citizenship, see
Jeffrey S. Passel, Rebecca L. Clark, and Michael Fix, "Naturalization
and Other Current Issues in U.S. Immigration: Intersections of Data and
Policy," in Proceedings of the Social Statistics Section of the
American Statistical Association: 1997 (Alexandria, Va.: American
Statistical Association, 1997).
[76] This test was suggested by another expert in immigration studies.
Residual estimates are based primarily on comparing (1) administrative
data on the number of legal immigrants with (2) census counts or survey
estimates of the number of foreign-born residents who have not become
U.S. citizens.
[77] A sample of foreign-born is contained within a general sample of
the household population. As we explain in a later section of this
report, an efficient way to survey the foreign-born is by piggybacking
on an existing, ongoing large-scale survey of the total household
population, which includes foreign-born persons--if an appropriate
ongoing survey can be identified. A higher-cost alternative would be to
identify a new sample of the total household population and screen (by
mini-interviews conducted by telephone or in person or both) for
households that contain one or more foreign-born persons.
[78] The size of the error associated with a grouped answers estimate
relative to a direct estimate depends on the distribution of
immigration statuses. Assuming that 33.3 percent of foreign-born
persons are in the undocumented category, 33.3 percent are in the set
of legal statuses in Card 1, Box A, and 33.3 percent are in the set in
Card 2, Box A, we would expect the error associated with a grouped
answers estimate of the percentage undocumented to be twice that
associated with a corresponding direct estimate.
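To make footnote 78's factor-of-two claim concrete, the following
sketch (ours, not the report's) computes the two standard errors under
the equal-thirds assumption stated above and simple random sampling;
the sample size and variable names are assumptions introduced here
solely for illustration:

import math

n = 6000                       # total foreign-born sample size (assumed)
p_undoc = 1.0 / 3.0            # share undocumented (footnote 78 assumption)
p_boxA_card2 = 1.0 / 3.0       # share in the Card 2, Box A set of legal statuses

# Direct question (hypothetical benchmark): one proportion from the full sample.
se_direct = math.sqrt(p_undoc * (1.0 - p_undoc) / n)

# Grouped answers: split the sample into two subsamples of n/2.  The estimate is
# (% of subsample 1 in Box B, Card 1) minus (% of subsample 2 in Box A, Card 2);
# Box B of Card 1 holds the undocumented plus the Card 2, Box A statuses.
half = n / 2.0
p_boxB_card1 = p_undoc + p_boxA_card2
var_grouped = (p_boxB_card1 * (1.0 - p_boxB_card1) / half
               + p_boxA_card2 * (1.0 - p_boxA_card2) / half)
se_grouped = math.sqrt(var_grouped)

print(round(se_grouped / se_direct, 2))   # 2.0 -- twice the error of the direct estimate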
[79] If there is no information on the distribution of immigration
status, then a potentially very large sample size would be estimated,
based on a "worst case scenario" distribution. However, if there is
information, this may allow a given level of precision to be attained
with a smaller sample.
[80] To illustrate how this occurs in practice, referring to the
National Health Interview Survey (NHIS), NCHS told us that an estimate
of the percentage of persons who are foreign-born, 18 to 39 years old,
and U.S. citizens is characterized by a variance that is roughly 1.6
times the variance that would be associated with a corresponding
estimate based on simple random sampling. (In theory, a complex
sampling design could reduce the variance rather than increasing it.)
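Taken together, footnotes 78 through 80 imply a rough back-of-the-
envelope sample-size calculation. The sketch below is ours, not the
report's: the 95 percent confidence level and the plus or minus 3
percentage point margin are one illustrative choice within the range
described in footnote 14, the equal-thirds distribution follows
footnote 78, and the design effect of roughly 1.6 follows the NHIS
illustration in footnote 80:

import math

z = 1.96             # 95 percent confidence (assumed choice)
margin = 0.03        # plus or minus 3 percentage points (assumed choice)
p_undoc = 1.0 / 3.0  # equal-thirds assumption (footnote 78)
deff = 1.6           # design effect for a complex sample (footnote 80)

# Benchmark: a direct question under simple random sampling.
n_direct = (z ** 2) * p_undoc * (1.0 - p_undoc) / margin ** 2

# Footnote 78: the grouped answers standard error is roughly twice the direct
# one under these assumptions, so the required n rises about fourfold.
n_grouped = 4.0 * n_direct

# Inflate for the complex sampling design.
n_total = math.ceil(n_grouped * deff)

print(round(n_direct), round(n_grouped), n_total)   # roughly 949, 3794, 6071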
[81] The independent statistical consultant (Dr. Zaslavsky) advised us
that rotating the use of immigration status cards 1 and 2 in every
other household interviewed (balancing the use of alternative cards
within areas or clusters) might increase precision. The logic is that
because some areas are defined by factors such as income and ethnicity-
-which might be related to immigration status--rotation would help
ensure balance on these factors.
[82] For example, it is possible that new immigration laws would allow
large numbers of currently undocumented persons to legalize their
status.
[83] We believe these are reasonable choices but we realize that others
might focus on, for example, more precise estimation (plus or minus 2
percentage points).
[84] However, if the percentage undocumented overall were to sharply
decrease, it might be appropriate to change the groupings on the cards
to mitigate this factor.
[85] Such bias might arise from problems in accurately covering the
foreign-born population. An additional caveat is that coverage of the
undocumented may be lower than coverage of other foreign-born persons.
We examined coverage issues in GAO/GGD-98-164.
[86] This assumes that the census count or updated estimate is a
constant.
[87] Suppose hypothetically that an updated estimate for some future
year estimates the foreign-born population as 40 million and that a
grouped answers estimate of the percentage of foreign-born who are
undocumented is 30 percent. Multiplying 40 million by 30 percent would
yield an estimate of 12 million undocumented (hypothetical data).
Further suppose that the true size of the foreign-born population, in
that future year, were actually 42 million. Multiplying 42 million by
30 percent would yield 12.6 million--a result just 5 percent higher
than 12 million.
[88] In contrast, analysts have pointed to a potentially
disproportionate, magnifying impact of bias in census counts (or error
in updated estimates of the size) of the foreign-born population on
residual estimates of the number who are undocumented. See Kenneth
Hill, "Estimates of Legal and Unauthorized Foreign-Born Population for
the United States and Selected States Based on Census 2000,"
presentation at the U.S. Census Bureau Conference, Immigration
Statistics: Methodology and Data Quality, Alexandria, Virginia,
February 13-14, 2006. Siegel and Swanson (p. 479) make a similar point.
[89] More than 6,000 of these households included one or more foreign-
born persons.
[90] A fifth survey, SIPP, a large-scale in-person survey, is scheduled
to be "reengineered" to provide an "effective alternative to the
current SIPP." It is anticipated that administrative data will be
combined with survey data, although the exact directions that the
revised effort will take are not yet known. (We defined large-scale as
50,000 or more interviews, including native-born and foreign-born
respondents. The foreign-born represent about 12 percent of the
national population, implying that a survey of 50,000 U.S. residents
could be expected to collect data on roughly 6,000 foreign-born
persons.)
[91] This follow-back survey concerns alcohol use and alcoholism; it is
sponsored by the National Institute on Alcohol Abuse and Alcoholism.
OMB told us that, in part because ACS is a new survey, very few other
follow-up efforts, if any, are likely to be approved in the next few
years.
[92] For example, with respect to possible impacts on answers to main-
survey questions, SAMHSA (which sponsors the NSDUH) indicated a concern
that asking about immigration status might make respondents less likely
to provide honest answers to questions about illegal behaviors such as
drug use (potentially because of fear of such actions as deportation).
[93] As we discussed in a previous section, experts told us that it is
important to demonstrate that respondents, especially undocumented
respondents, "pick the correct box"--or at least to demonstrate that
they intend to pick the correct box (rather than avoiding Box B).
[94] Cognitive interviewing focuses on the mental processes of the
respondent while he or she is answering a survey question. The goals
are to find out what each respondent thinks the question is asking,
what the specific words or phrases (or icons on a card) mean to him or
her, and how he or she formulates an answer. Typically, cognitive
interviewing is an iterative process in which the findings or problems
identified in each set of interviews are used to modify the questions
to be tested in the next set of interviews.
[95] For example, if a respondent had already admitted engaging in a
behavior related to illegal activity, he or she might be less likely to
accurately answer a question on immigration status. Of course, if
future testing were to indicate that a particular type of sensitive
item did not affect immigration responses, this criterion would be
dropped.
[96] The ACS is a mixed-mode rather than a solely personal-interview
survey. It gathers information on all members of a household based, in
some cases, on a single adult respondent-informant rather than randomly
selecting one or more respondents in each household and asking them to
provide information about themselves. However, one follow-back personal
interview survey has based its sample selection on the ACS frame and
its data. We further note that if a follow-back survey based on the CPS
could be conducted, then--provided that the follow-back was designed
for self-report personal interviews--it would meet the criteria in
table 3.
[97] With respect to the individual level, Census Bureau staff told us
that they are extremely careful not to disclose information, that such
disclosure is prohibited by law, and that the Census Bureau explains
this to respondents. However, they also said that some respondents
erroneously believe that all government agencies share information with
one another or might do so under certain circumstances.
[98] We note that the relevance of the criteria in table 4 would likely
be heightened if interior enforcement efforts (that is, those conducted
away from border areas) were to sharply increase.
[99] This expert reviewer told us: "One of the biggest issues
surrounding immigration is the scale of in- and out-migration. The
failure to understand this process is one of the biggest reasons that
the population estimates were so far off at the time of the 2000
census. A survey devoted to the foreign-born could be especially
helpful in ensuring that we have the best weights [information on
population] possible, particularly if the survey could accurately
estimate illegal aliens."
[100] The ACS defines residence in a household as living there for 2
months (either completed or ongoing). For a discussion of other quality
issues in the ACS, see Steven A. Camarota and Jeffrey Capizzano,
"Assessing the Quality of Data Collected on the Foreign Born: An
Evaluation of the American Community Survey (ACS): Pilot and Full Study
Findings," Immigration Studies White Papers, Sabre Systems Inc., April
2004. http://www.sabresys.com/whitepapers/CIS_whitepaper.pdf (Sept. 6,
2006).
[101] Potentially, the prospects for private sector funding could be
explored. One question would be whether it is possible to identify a
willing private sector source that is not aligned with a particular
perspective on immigration issues.
[102] Alternatively, survey costs can be estimated--albeit more
roughly--on the basis of the experience of survey organizations.
[103] Validity tests conducted concurrent with the survey and follow-on
checks that compare survey results against (adjusted) administrative
information would seem to be appropriate, if a survey is, in fact,
fielded.
[104] DHS suggested that the pilot testing be conducted within a
limited geographic area.
[105] For example, DHS pointed to the issue of an existing survey (the
American Community Survey) defining residence in a household as living
there for 2 months (either completed or ongoing). DHS said this would
likely exclude some unauthorized and temporary migrants and indicated
that, if a new survey needs to be conducted, it should be designed to
cover all foreign-born persons residing here.
[106] A grouped answers estimate of the percentage of the foreign born
who are undocumented can be defined as the percentage of subsample 1
who are in Box B, Card 1, minus the percentage of subsample 2 who are
in Box A, Card 2. Alternatively, a grouped answers estimate could be
defined as the percentage of subsample 2 who are in Box B, Card 2,
minus the percentage of subsample 1 who are in Box A, Card 1. If both
calculations are performed and two estimates are derived, they might be
termed "mirror image" estimates.
[107] The independent review considered the Census Bureau and GAO
analyses of the GSS data in terms of (1) their overall reasonableness
and thoroughness, given the general objective (describing respondents'
acceptance and understanding), (2) key points of difference (if any)
between the two analyses or differences in conclusions, (3) whether the
analyses raised unanswered questions that should be addressed, and (4)
whether the conclusions appeared to be justified. The reviewer was also
free to comment on other aspects of the analyses.
[108] We believe this report independently addresses respondent
acceptability because we (1) focus on the results of the GSS test
(rather than critiquing the Census Bureau's work), (2) report how the
method performed rather than subjectively assessing its merit, and (3)
relied on an independent expert.
[109] DHS contributed to the funding of the Census Bureau's contract
with the National Opinion Research Center (NORC) for the insertion of a
module (question series) into the GSS.
[110] We consulted with Alan Zaslavsky, Fritz Scheuren, and Mary Grace
Kovar.
[111] In our earlier work, we consulted with numerous other private
sector experts on immigration and statistics. For those experts, see
GAO/GGD-00-30, p. 29.
[112] In 1998, we recommended that the Commissioner of the Immigration
and Naturalization Service (INS) and the Director of the Census Bureau
"devise a plan of joint research for evaluating the quality of census
and survey data on the foreign-born," based on our discussion of the
need to evaluate coverage and possible methods for doing so (see GAO/
GGD-98-164). This recommendation is still open. In 2002, Census Bureau
staff assumed that 15 to 20 percent of the undocumented were not
enumerated in the 1990 census and stated the belief that coverage of
this group improved in the 2000 census. (See Joseph Costanzo and
others, "Evaluating Components of International Migration: The Residual
Foreign-Born," Population Division Working Paper 61, U.S. Census
Bureau, Washington, D.C., June 2002, p. 22.) However, the Census Bureau
has not quantitatively estimated the coverage of either the foreign-
born population overall or the undocumented population.
[113] Estimation of program costs associated with an individual
respondent (or those in very refined subgroups) is sometimes calculated
based on a combination of (1) answers to specific questions (such as
whether the person is attending public school in the school district
where he or she lives or how many emergency room visits he or she made)
and (2) separately available information on program costs per
individual (for example, the per-pupil costs of public education in
specific school districts or the per-visit costs of emergency room
care).
[114] Potentially, based on the location of the responding household,
state and local per-pupil school costs could be obtained. Totaling
state and local school costs for foreign-born children in each box
would be followed by a group-level subtraction. In this way, the costs
of schooling undocumented immigrant children could be estimated--
nationally and potentially for key states--without ever categorizing
any child as undocumented and without ever estimating the number of
undocumented children in any school district.
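As an illustration of the group-level subtraction described in
footnotes 113 and 114, the following sketch uses invented cost totals;
it assumes that both subsamples are weighted to represent the same
population of foreign-born children, and none of the figures comes
from the report:

# Invented totals: (per-pupil cost x number of foreign-born children) summed
# over responding households, with both subsamples weighted to the same
# population.  All figures are hypothetical.
cost_boxB_card1 = 180_000_000   # children reported in Box B, Card 1 (subsample 1)
cost_boxA_card2 = 110_000_000   # children reported in Box A, Card 2 (subsample 2)

# Group-level subtraction: the difference is attributed to undocumented
# children without ever classifying any individual child as undocumented.
cost_undocumented = cost_boxB_card1 - cost_boxA_card2
print(cost_undocumented)        # 70000000 (hypothetical)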
[115] See GAO, Overstay Tracking: A Key Component of Homeland Security
and a Layered Defense, GAO-04-82 (Washington, D.C.: May 21, 2004).
[116] See Judith A. Droitcour and Eric M. Larson, "An Innovative
Technique for Asking Sensitive Questions: The Three-Card Method,"
Bulletin de Mèthodologie Sociologique, 75 (July 2002): 5-23.
[117] The foreign-born population includes anyone who was not a U.S.
citizen or a U.S. national at birth. All others--including those who
were born abroad or at sea of at least one parent who was a U.S.
citizen--belong to the native population.
[118] U.S. Government Accountability Office. Immigration Statistics:
Information Gaps, Quality Issues Limit Utility of Federal Data to
Policymakers. (GAO/GGD-98-164). Washington, D.C.: GAO, July 1998.
[119] This is the number of completed cases and does not include
refusals, break-offs, and other forms of non-response. According to
NORC, the 2004 GSS had a non-response rate of 29.6 percent. For more
details, see Davis, James Allan; Smith, Tom W.; and Marsden, Peter V.
General Social Surveys, 1972-2004: Cumulative Codebook. Chicago:
National Opinion Research Center, 2005. (National Data Program for the
Social Sciences Series, no. 18).
[120] The GSS does not have a variable that directly identifies
respondents as being U.S. natives or foreign born. For this review, the
foreign born were designated as those who reported being born outside
the United States, were not born in Puerto Rico, and reported neither
parent as being born in the United States.
[121] The GSS data cited in this report are unweighted counts and
should not be construed as population estimates.
[122] The population universe of the ASEC is limited to the civilian
non-institutionalized population in the United States, though some
members of the armed forces may be included if they live with family
members in off-post housing; for brevity, this universe will be denoted
in this report as the total population. Likewise, the civilian non-
institutionalized foreign-born population as measured by ASEC will
simply be referred to as the foreign-born population.
[123] All comparison tests presented in this report have taken sampling
error into account and are significant at the 90-percent confidence
level, unless otherwise stated.
[124] Comparisons by marital status, educational attainment, and
Hispanic origin are not described in the text because the population
universes for the GSS data and the publicly available ASEC data lack
comparability. See Table I for further details.
[125] The representation of those born in either Asia or Other Regions
was not significantly different between the GSS sample and the ASEC
estimates. "Other Regions" includes Northern America, Africa, and
Oceania.
[126] None of the immigration experts we interviewed raised this issue,
however.
[127] Thus far, testing has included only one immigration status card,
so test interviewers have not told respondents that other respondents
will be providing information on some of the Box B statuses.
[128] See GAO/GGD-00-30.
[129] For simplicity, the discussion in this appendix assumes simple
random sampling, for both the main sample and the selection of the two
subsamples.
[130] Logically, if very few persons choose Box C, the precision gains
from combining the mirror-image estimates (which would necessarily be
very similar to each other) would be very small.
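One conventional way to combine the two mirror-image estimates, not
prescribed by the report, is inverse-variance weighting. The sketch
below uses hypothetical inputs, assumes simple random sampling as in
footnote 129, and ignores the correlation between the two estimates
(they are computed from the same subsamples), so it overstates the
precision gain, which footnote 130 notes would in any case be small
when few respondents choose Box C:

def var_diff(p_b, p_a, n_sub):
    """Variance of (one subsample's Box B share) minus (the other subsample's
    Box A share), treating the two subsamples as independent."""
    return p_b * (1.0 - p_b) / n_sub + p_a * (1.0 - p_a) / n_sub

n_sub = 3000                        # respondents per subsample (assumed)
est_1, est_2 = 0.25, 0.25           # mirror-image estimates (hypothetical)
v1 = var_diff(0.55, 0.30, n_sub)    # Box B, Card 1 minus Box A, Card 2
v2 = var_diff(0.60, 0.35, n_sub)    # Box B, Card 2 minus Box A, Card 1

w1, w2 = 1.0 / v1, 1.0 / v2
combined = (w1 * est_1 + w2 * est_2) / (w1 + w2)
var_combined = 1.0 / (w1 + w2)      # optimistic: ignores the correlation noted above

print(round(combined, 3), var_combined < min(v1, v2))   # 0.25 True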
GAO's Mission:
The Government Accountability Office, the investigative arm of
Congress, exists to support Congress in meeting its constitutional
responsibilities and to help improve the performance and accountability
of the federal government for the American people. GAO examines the use
of public funds; evaluates federal programs and policies; and provides
analyses, recommendations, and other assistance to help Congress make
informed oversight, policy, and funding decisions. GAO's commitment to
good government is reflected in its core values of accountability,
integrity, and reliability.
Obtaining Copies of GAO Reports and Testimony:
The fastest and easiest way to obtain copies of GAO documents at no
cost is through the Internet. GAO's Web site (www.gao.gov) contains
abstracts and full-text files of current reports and testimony and an
expanding archive of older products. The Web site features a search
engine to help you locate documents using key words and phrases. You
can print these documents in their entirety, including charts and other
graphics.
Each day, GAO issues a list of newly released reports, testimony, and
correspondence. GAO posts this list, known as "Today's Reports," on its
Web site daily. The list contains links to the full-text document
files. To have GAO e-mail this list to you every afternoon, go to
www.gao.gov and select "Subscribe to e-mail alerts" under the "Order
GAO Products" heading.
Order by Mail or Phone:
The first copy of each printed report is free. Additional copies are $2
each. A check or money order should be made out to the Superintendent
of Documents. GAO also accepts VISA and Mastercard. Orders for 100 or
more copies mailed to a single address are discounted 25 percent.
Orders should be sent to:
U.S. Government Accountability Office
441 G Street NW, Room LM
Washington, D.C. 20548:
To order by Phone:
Voice: (202) 512-6000:
TDD: (202) 512-2537:
Fax: (202) 512-6061:
To Report Fraud, Waste, and Abuse in Federal Programs:
Contact:
Web site: www.gao.gov/fraudnet/fraudnet.htm
E-mail: fraudnet@gao.gov
Automated answering system: (800) 424-5454 or (202) 512-7470:
Public Affairs:
Jeff Nelligan, managing director,
NelliganJ@gao.gov
(202) 512-4800
U.S. Government Accountability Office,
441 G Street NW, Room 7149
Washington, D.C. 20548: