Small Business Administration
Model for 7(a) Program Subsidy Had Reasonable Equations, but Inadequate Documentation Hampered External Reviews
GAO ID: GAO-04-9, March 31, 2004
The Small Business Administration (SBA) approved about $8.6 billion in loan guarantees through its 7(a) loan program in fiscal year 2003. SBA must estimate the subsidy cost of this program. Since fiscal year 2003, SBA has been using econometric modeling to estimate the subsidy. This report reviews SBA's estimation methodology and equations, assesses the default and recovery rates the model produced, identifies ways to enhance the estimates' reliability, describes the process for developing the model, and analyzes SBA's data.
From an economics perspective, SBA's econometric equations were reasonable, and its model produced estimated default and recovery rates that were in line with historical experience. However, from an audit perspective, SBA's lack of documentation of the model development process precluded GAO, and others, from independently evaluating the model's development and determining if SBA used a sound and consistently applied method to select and reject model variables.
Taking into account economic reasoning and research, SBA's econometric equations for estimating defaults, prepayments, and recoveries were reasonable. SBA's equations used a limited set of variables; equations using other variables could also be reasonable but would produce different estimates. Since an estimate is an approximation, no one estimate can be considered accurate, and reasonable estimates can fall within a range of values. The model's estimated default and recovery rates were in line with recent historical experience. SBA could improve its estimation methodology by periodically checking for and correcting errors and should consider adding more borrower information, such as credit scores. Some errors in the model resulted in understating the estimated program costs.
SBA used the expertise of other agencies and a contractor to develop its model and worked closely with the Office of Management and Budget (OMB), which must approve the methodology agencies use to estimate subsidies. OMB officially approved the model in the fall of 2002.
SBA did not adequately document its model development process, including alternative variables considered and rejected, to enable external reviewers to assess the process that was used. Further, GAO and two other independent reviewers could not determine whether a bias existed in the model by systematically excluding variables to influence the subsidy rate in a particular direction. Adequate documentation, a key internal control, would enable SBA and other agencies to demonstrate the rationale and basis for key aspects of the model that provide important cost information for budgets, financial statements, and congressional decision makers and facilitate SBA's annual financial statement audit. Current OMB and other guidance is either silent or unclear about the level of documentation necessary for credit subsidy model development.
SBA had a process to help ensure data integrity and data consistency in the equations with the loan-level data in its databases. Although errors existed in SBA's data systems, the magnitude and nature of these errors were not likely to significantly affect the subsidy rate.
GAO-04-9, Small Business Administration: Model for 7(a) Program Subsidy Had Reasonable Equations, but Inadequate Documentation Hampered External Reviews
This is the accessible text file for GAO report number GAO-04-9
entitled 'Small Business Administration: Model for 7(a) Program Subsidy
Had Reasonable Equations, but Inadequate Documentation Hampered
External Reviews' which was released on March 31, 2004.
This text file was formatted by the U.S. General Accounting Office
(GAO) to be accessible to users with visual impairments, as part of a
longer term project to improve GAO products' accessibility. Every
attempt has been made to maintain the structural and data integrity of
the original printed product. Accessibility features, such as text
descriptions of tables, consecutively numbered footnotes placed at the
end of the file, and the text of agency comment letters, are provided
but may not exactly duplicate the presentation or format of the printed
version. The portable document format (PDF) file is an exact electronic
replica of the printed version. We welcome your feedback. Please E-mail
your comments regarding the contents or accessibility features of this
document to Webmaster@gao.gov.
This is a work of the U.S. government and is not subject to copyright
protection in the United States. It may be reproduced and distributed
in its entirety without further permission from GAO. Because this work
may contain copyrighted images or other material, permission from the
copyright holder may be necessary if you wish to reproduce this
material separately.
Report to Congressional Committees:
March 2004:
SMALL BUSINESS ADMINISTRATION:
Model for 7(a) Program Subsidy Had Reasonable Equations, but Inadequate
Documentation Hampered External Reviews:
[Hyperlink, http://www.gao.gov/cgi-bin/getrpt?GAO-04-9]:
GAO Highlights:
Highlights of GAO-04-9, a report to Chairman and Ranking Minority
Member, House Committee on Small Business; Ranking Minority Member,
Senate Committee on Small Business and Entrepreneurship
Why GAO Did This Study:
The Small Business Administration (SBA) approved about $8.6 billion in
loan guarantees through its 7(a) loan program in fiscal year 2003. SBA
must estimate the subsidy cost of this program. Since fiscal year 2003,
SBA has been using econometric modeling to estimate the subsidy. This
report reviews SBA's estimation methodology and equations, assesses the
default and recovery rates the model produced, identifies ways to
enhance the estimates' reliability, describes the process for
developing the model, and analyzes SBA's data.
What GAO Found:
From an economics perspective, SBA's econometric equations were
reasonable, and its model produced estimated default and recovery rates
that were in line with historical experience. However, from an audit
perspective, SBA's lack of documentation of the model development
process precluded GAO, and others, from independently evaluating the
model's development and determining if SBA used a sound and
consistently applied method to select and reject model variables.
Taking into account economic reasoning and research, SBA's econometric
equations for estimating defaults, prepayments, and recoveries were
reasonable. SBA's equations used a limited set of variables; equations
using other variables could also be reasonable but would produce
different estimates. Since an estimate is an approximation, no one
estimate can be considered accurate, and reasonable estimates can fall
within a range of values. The model's estimated default and recovery
rates were in line with recent historical experience. SBA could improve
its estimation methodology by periodically checking for and correcting
errors and should consider adding more borrower information, such as
credit scores. Some errors in the model resulted in understating the
estimated program costs.
SBA used the expertise of other agencies and a contractor to develop
its model and worked closely with the Office of Management and Budget
(OMB), which must approve the methodology agencies use to estimate
subsidies. OMB officially approved the model in the fall of 2002.
SBA did not adequately document its model development process,
including alternative variables considered and rejected, to enable
external reviewers to assess the process that was used. Further, GAO
and two other independent reviewers could not determine whether a bias
existed in the model by systematically excluding variables to influence
the subsidy rate in a particular direction. Adequate documentation, a
key internal control, would enable SBA and other agencies to
demonstrate the rationale and basis for key aspects of the model that
provide important cost information for budgets, financial statements,
and congressional decision makers and facilitate SBA's annual financial
statement audit. Current OMB and other guidance is either silent or
unclear about the level of documentation necessary for credit subsidy
model development.
SBA had a process to help ensure data integrity and data consistency in
the equations with the loan-level data in its databases. Although
errors existed in SBA's data systems, the magnitude and nature of these
errors were not likely to significantly affect the subsidy rate.
What GAO Recommends:
SBA should (1) determine whether to include in the model other
information from its new loan monitoring system, (2) periodically
evaluate and update the model, and (3) document the model development
process. OMB should require agencies to document the basis and process
for developing their credit subsidy models.
SBA agreed with the recommendations to improve the final model, but SBA
and OMB disagreed that the model development was inadequately
documented and disagreed with our recommendations to improve such
documentation and guidance.
However, given the difficulty experienced by reviewers due to
inadequate documentation, we continue to recommend that SBA document
the basis and process for developing its model and that OMB require
this documentation.
www.gao.gov/cgi-bin/getrpt?GAO-04-9.
To view the full product, including the scope and methodology, click on
the link above. For more information, contact Davi D'Agostino at (202)
512-8678 or dagostinod@gao.gov.
[End of section]
Contents:
Letter:
Results in Brief:
Background:
SBA's Equations Were Reasonable, and Estimated Default and Recovery
Rates Were in Line with Historical Experience:
SBA's Model Could Be Enhanced by Adding Information on Borrowers,
Correcting Errors, and Updating Some Data:
SBA Collaborated with OFHEO and OMB to Develop the Model:
Lack of Adequate Model Documentation Hampered Independent Reviews of
SBA's Model:
SBA Had a Process to Help Ensure Data Quality and the Data Used in the
Model and SBA's Loan Level Databases Were Consistent:
Conclusions:
Recommendations for Executive Action:
Agency Comments and Our Evaluation:
Appendixes:
Appendix I: Objectives, Scope, and Methodology:
Assessing the Reasonableness of the Model's Econometric Equations and
Evaluating the Model's Estimated Default, Prepayment, and Recovery
Rates:
Identifying Additional Steps SBA Could Take to Further Enhance the
Reliability of the Model:
Reviewing SBA's Process of Developing the Subsidy Model:
Evaluating the Model's Supporting Documentation, Including Its
Discussion of What Variables Were Tested and Rejected:
Determining What Steps SBA Took to Ensure the Integrity of the Data
Used in the Model and Whether These Data Were Consistent with
Information in Its Databases:
Appendix II: Analysis of Default, Prepayment, and Recoveries
Econometric Equations:
SBA's Default and Prepayment Equations:
Effects of Including Additional Variables:
SBA's Recovery Equation:
Appendix III: Comments from the Small Business Administration:
GAO Comments:
Appendix IV: Comments from the Office of Management and Budget:
GAO Comments:
Appendix V: GAO Contacts and Staff Acknowledgments:
GAO Contacts:
Staff Acknowledgments:
Tables:
Table 1: Variable Names and Descriptions:
Table 2: Multinomial Logistic Regression Coefficient Estimates:
Table 3: Names and Descriptions of Additional Variables:
Table 4: Multinomial Logistic Regression Coefficient Estimates:
Table 5: Distribution of SIC Industry Codes in SBA's Loan Database:
Table 6: Variable Names and Descriptions:
Table 7: Recovery Model:
Figures:
Figure 1: Major Segments of the Model to Estimate 7(a) Subsidy Rate:
Figure 2: Estimated Default Rates Compared with Average Default
Experience from 1992 through 2001:
Figure 3: Estimated Default Rates Compared with Fiscal Year 2001 Actual
Default Experience:
Abbreviations:
CFO: Chief Financial Officer:
FCRA: Federal Credit Reform Act of 1990:
GDP: gross domestic product:
NAIC: North American Industrial Classification:
OFHEO: Office of Federal Housing Enterprise Oversight:
OMB: Office of Management and Budget:
SBA: Small Business Administration:
SIC: Standard Industrial Classification:
Letter March 31, 2004:
The Honorable Donald A. Manzullo:
Chairman, Committee on Small Business:
House of Representatives:
The Honorable Nydia M. Velazquez:
Ranking Minority Member:
Committee on Small Business:
House of Representatives:
The Honorable John F. Kerry:
Ranking Minority Member:
Committee on Small Business and Entrepreneurship:
United States Senate:
The 7(a) program is the Small Business Administration's (SBA) largest
lending program for small businesses. SBA reported that it approved
about $8.6 billion in loan guarantees in fiscal year 2003. The program
provides loan guarantees of up to 85 percent for loans made to small
businesses that are unable to obtain financing on reasonable terms in
the private credit markets. Like most federal loan or loan guarantee
programs, SBA's 7(a) program is subject to the Federal Credit Reform
Act of 1990 (FCRA). FCRA requires most agencies with government lending
programs to estimate annually the cost to the federal government of
extending or guaranteeing credit over the life of the loans (the
subsidy cost). Since an estimate is an approximation, no one estimate
can be considered accurate with certainty, and reasonable estimates can
fall within a range of values. Changes in the estimation methodology,
variables, or data used to calculate an estimate are likely to result
in differences in the estimate. In fiscal year 2003, SBA implemented a
new methodology to estimate the subsidy cost of the 7(a) program that
is based on econometric modeling.[Footnote 1] SBA officials told us
that the new 7(a) model was the first step in a long-term effort to
develop and implement new econometric models for their credit programs.
Although the econometric approach allowed SBA to build a model that is
more sensitive to a wider variety of factors than a model based on
historical averages, SBA believes that this approach may not be
appropriate for all its credit programs.
In order to calculate the subsidy cost of their programs, agencies must
estimate the present value of future cash flows over the life of the
program, which for the 7(a) program are principally affected by
defaulted loans, prepayments of outstanding loans, recoveries on
defaulted loans, and fees. The revised method SBA adopted for the
subsidy calculation has four segments: (1) the econometric equations
that are used to estimate the likelihood of defaults and prepayments,
(2) the equations used to estimate the extent of recoveries, (3) the
cash flow module, and (4) the Office of Management and Budget (OMB)
Credit Subsidy Calculator, as shown in figure 1. The results of the
first and second segments--the econometric equations--are key inputs
into the third. The third segment--the cash flow module--uses these
results, along with OMB forecasts of interest rates, unemployment
rates, and gross domestic product growth rates, to estimate cash inflows
from fees and recoveries on defaulted loans and outflows from claim
payments on defaulted loans. The resulting cash flows are entered into
the fourth segment, the OMB Credit Subsidy Calculator, which calculates
(1) the present values of the cash flows and (2) the subsidy rate.
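The last two segments can be illustrated with a minimal Python sketch. It is not SBA's cash flow module or the OMB Credit Subsidy Calculator. The function names, the single flat discount rate, the example dollar amounts, and the choice to express the subsidy rate as a share of the guaranteed amount are all assumptions made for illustration.

```python
def present_value(cash_flows, discount_rate):
    """Discount yearly cash flows to year 0; cash_flows[0] occurs in year 0."""
    return sum(cf / (1.0 + discount_rate) ** year
               for year, cf in enumerate(cash_flows))


def subsidy_rate(claims, fees, recoveries, discount_rate, guaranteed_amount):
    """Present value of net government outflows as a share of the amount guaranteed.

    Outflows are claim payments on defaulted loans; inflows are fees and
    recoveries. The flat discount rate and the denominator are assumptions.
    """
    net_outflows = [c - f - r for c, f, r in zip(claims, fees, recoveries)]
    return present_value(net_outflows, discount_rate) / guaranteed_amount


# Hypothetical three-year cohort, in dollars (not SBA data).
claims = [0.0, 50.0, 30.0]
fees = [20.0, 5.0, 5.0]
recoveries = [0.0, 0.0, 20.0]
rate = subsidy_rate(claims, fees, recoveries, discount_rate=0.05,
                    guaranteed_amount=1000.0)
print(f"Illustrative subsidy rate: {rate:.4f}")
```

In this sketch a positive rate means that, in present-value terms, expected claim payments exceed expected fees and recoveries.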
Figure 1: Major Segments of the Model to Estimate 7(a) Subsidy Rate:
[See PDF for image]
[End of figure]
This report responds to your November 26, 2002, and December 11, 2002,
requests that we review the methodology that SBA developed to estimate
the subsidy costs of its 7(a) loan program for the fiscal year 2004
budget. As agreed with your staff, we (1) assessed the reasonableness
of the model's econometric equations and evaluated the model's
estimated default and recovery rates based on the 7(a) program's recent
historical loan experience; (2) identified any additional steps SBA
could take to further enhance the reliability of its subsidy estimate
produced by the model; (3) described SBA's process for developing the
subsidy model; (4) evaluated the model's supporting documentation
including its discussion of what variables were tested and rejected;
and (5) determined what steps SBA takes to ensure the integrity of the
data used in the model and determined whether these data are consistent
with information in SBA's databases. We did not, however, validate
SBA's model.
First, to analyze the model, we obtained from SBA copies of the model
as approved by OMB in 2002, along with the loan-level data that were
used to develop the subsidy estimates. We analyzed the econometric
equations to determine whether they were reasonable based on the
variables they included, the statistical techniques used, and the
results obtained. For example, we determined whether the econometric
equations included appropriate variables and whether the variables used
in the equations were statistically significant. To evaluate the
model's estimated default and recovery rates, we compared these rates
with recent historical loan experience of the 7(a) program provided by
SBA. Using SBA's data, we also calculated what SBA would have estimated
for default and recovery rates based on the estimation methodology it
used prior to its fiscal year 2003 budget submission. Second, to
identify any additional steps SBA could take to enhance the reliability
of its model, we considered additional types of data that SBA might
collect and consider including in its econometric equations. As part of
this analysis, we reviewed the academic literature on default modeling
and interviewed officials with several banks engaged in similar
efforts. Third, to describe SBA's process for developing the model we
met with SBA and OMB officials. Fourth, to evaluate the model's
supporting documentation, including its discussion of what variables
were tested and rejected, we obtained and analyzed available relevant
documents and met with SBA officials and their contractor who developed
the model. We compared the information presented in SBA's model
documentation with existing credit subsidy guidance. Finally, to
determine what steps SBA took to ensure the integrity of the data used
by the model and to determine whether these data were consistent with
information in its databases, we assessed SBA's processes for ensuring
data reliability. We examined the type and level of errors and
evaluated the likelihood that they would significantly affect the
credit subsidy estimates. We also compared the loan-level data used in
the model with the data contained in SBA's databases. Appendix I
discusses the details of our methodology.
We conducted our work in Washington, D.C. from December 2002 to March
2004 in accordance with generally accepted government auditing
standards.
Results in Brief:
Overall, we found that from an economics perspective, SBA's econometric
equations were reasonable, and the SBA model produced estimated default
and recovery rates that were in line with historical experience.
However, from an audit perspective, SBA's lack of adequate
documentation of the model development process precluded us from (1)
independently evaluating the model's development; (2) determining
whether SBA used a sound and consistently applied method to select and
reject variables to be included in the model; and (3) determining
whether a bias from selecting variables existed in the model.
We found that SBA's econometric equations for estimating defaults,
prepayments, and recoveries were reasonable. SBA's equations used a
limited set of variables; equations using other variables could also be
reasonable but would produce different estimates. We also found that
the model's estimated default and recovery rates were in line with
recent historical experience. SBA's econometric equations related the
likelihood of defaults and/or prepayments to several variables that
economic reasoning and prior research suggested were appropriate to
this type of model, and, at the time of our review, SBA used
appropriate statistical techniques to identify the nature of these
relationships. In addition, SBA's equations produced estimated
relationships for defaults and prepayments that were consistent with
expectations based on economic reasoning. For example, the likelihood
of default was estimated to be higher when unemployment was higher.
SBA's equations used a limited set of variables, and we found that
equations using additional variables available to SBA that it did not
include, such as measures of interest rates and the businesses'
industry type, would also be reasonable. If SBA had used these
alternative equations, it might have estimated a higher or lower
subsidy rate. SBA did not include any economic variables in its
equation for estimating recoveries, so that forecasted recovery amounts
were not dependent on expected economic conditions. According to
documentation provided by SBA of the work done to develop this
equation, adding economic variables would not have increased the
precision of the recovery rate estimates.
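A multinomial logistic equation of the general kind described above can be sketched as follows. The coefficients and variable names are hypothetical and are not SBA's estimates; the sketch only shows how such an equation turns economic conditions into probabilities that a loan defaults, prepays, or remains active, with "active" assumed as the base outcome.

```python
import math


def outcome_probabilities(features, default_coefs, prepay_coefs):
    """Multinomial logit probabilities, with 'active' as the base outcome."""
    z_default = sum(default_coefs[name] * value for name, value in features.items())
    z_prepay = sum(prepay_coefs[name] * value for name, value in features.items())
    denom = 1.0 + math.exp(z_default) + math.exp(z_prepay)
    return {
        "default": math.exp(z_default) / denom,
        "prepay": math.exp(z_prepay) / denom,
        "active": 1.0 / denom,
    }


# Hypothetical coefficients chosen only so that default risk rises with unemployment.
DEFAULT_COEFS = {"intercept": -4.0, "unemployment_rate": 0.2, "gdp_growth": -0.1}
PREPAY_COEFS = {"intercept": -3.0, "unemployment_rate": -0.05, "gdp_growth": 0.1}

low_unemployment = {"intercept": 1.0, "unemployment_rate": 4.0, "gdp_growth": 3.0}
high_unemployment = {"intercept": 1.0, "unemployment_rate": 8.0, "gdp_growth": 3.0}

p_low = outcome_probabilities(low_unemployment, DEFAULT_COEFS, PREPAY_COEFS)
p_high = outcome_probabilities(high_unemployment, DEFAULT_COEFS, PREPAY_COEFS)
print(p_low["default"], p_high["default"])
```

With these illustrative coefficients, the estimated probability of default is higher in the high-unemployment scenario, which is the direction of the relationship described in the text.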
SBA could enhance the model and the reliability of the subsidy estimate
produced by the model by including additional information that SBA
expects to have in the future and by correcting errors. SBA intends to
collect new business and business-owner information to determine how it
affects loan performance and such information may suggest variables
that can be useful in the model. SBA's econometric equations used
variables from its current databases and economic indicators, such as
gross domestic product (GDP) growth rates and unemployment rates, to
forecast future defaults and prepayments. However, at the time of our
review, SBA's current database did not include other information on
businesses or business owners, such as information on borrowers' credit
that is often used by private sector lenders to determine potential
defaults and losses. Academic literature on default models suggests
that such information is predictive of defaults. SBA has recently
contracted to develop a loan monitoring system that is intended to
track this information and allow the agency to determine how it affects
loan performance. During our review of the model, we identified some
errors that resulted in underestimates of the program costs of around
$6.5 million or about 6.8 percent of the estimated cost of the program
for fiscal year 2004.
To develop its subsidy model, SBA drew on the expertise of other
government agencies and consulted with OMB officials. In February 2002,
SBA entered into an arrangement with the Office of Federal Housing
Enterprise Oversight (OFHEO), which has staff with expertise in
econometric modeling, to assist in the development of the 7(a) subsidy
model.[Footnote 2] OMB also played a key role in the development of the
model because FCRA requires OMB to approve the methodology that each
federal agency uses to estimate the subsidy costs of its loan programs.
Thus, SBA consulted with OMB officials during the model's development,
and OMB officially approved the model in the fall of 2002. OMB
officials said that their role in reviewing the model was primarily to
provide oversight and ensure compliance with the law. Because SBA
routinely had its cash flow models reviewed by an independent third
party at the time of our review, it hired an outside consultant to
conduct limited reviews of the econometric equations and cash flow
segment. The
consultant identified some errors that SBA corrected.
SBA did not prepare adequate supporting documentation to enable us and
other independent reviewers to understand and evaluate the process that
SBA used to develop the model. While SBA provided some general
documentation of its model development process, the documentation
lacked adequate discussion of alternative variables or combinations of
variables that SBA considered, which variables were rejected for which
reasons, and specific examples based on results of earlier regressions.
As a result, we were unable to determine whether a bias in selecting
variables existed in the model. SBA officials told us that they did not
prepare this type of documentation because they believed that there was
no specific requirement to do so. Current guidance is either silent or
unclear about supporting documentation needed to explain the
development of econometric models used to generate credit subsidy
estimates for the budget and financial statements. However, maintaining
adequate documentation on how such models were developed is a sound
internal control practice that would provide SBA and other agencies the
opportunity to more fully demonstrate and explain the rationale and
basis for key aspects of their models that provide important cost
information for budgets, financial statements, and congressional
decision makers. This documentation would also help facilitate SBA's
annual financial statement audit.
SBA hired a private contractor to reconcile the information submitted
to it by 7(a) program lenders with the data stored in SBA's loan-level
databases on a monthly basis and, at the time of our review, had an
ongoing process to correct any errors that were found. Although errors
existed in SBA's data systems at the time of our review, we determined
that the magnitude and nature of these errors were not likely to
significantly affect the subsidy rate. In addition, SBA officials told
us that they performed various ad hoc reviews of the information in
SBA's loan-level databases to assess its accuracy and were currently
assessing various alternatives to further enhance its data integrity.
On the basis of our analysis of a statistical sample of defaulted,
prepaid, and active loans, as well as recoveries from defaulted loans,
we found that the data SBA used to calculate the subsidy costs were
consistent with the loan-level data contained in SBA's databases
at the time of our review.
This report contains three recommendations to SBA and one
recommendation to OMB. We recommend that SBA (1) determine how best to
include in the model borrower-specific information that it intends to
collect in its new loan monitoring system; (2) establish a process for
periodically revising the model to correct errors and to reflect any
changes in the 7(a) program or other factors that could affect the
subsidy estimate; and (3) prepare adequate documentation of the model
development process, including a detailed discussion of the alternative
variables or combinations of variables that were considered, tested,
and rejected and the criteria for doing so. We also recommend that OMB
require that agencies document the basis for credit subsidy estimates
and reestimates, including the process followed for selecting model
methodologies over alternatives and variables tested and rejected with
the basis for excluding them.
We received comments on a preliminary draft of this report from SBA and
OMB. SBA agreed with the findings and the first two recommendations
related to the final model. OMB had no comments. While a draft of this
report was at the agencies for comment, we continued to pursue
additional documentation that SBA had that might further explain its
7(a) model development process, including what variables were selected
and rejected and why. This final report discusses the lack of adequate
documentation and recommends improvements in SBA's documentation of the
development process for its credit subsidy models and in OMB's Circular
A-11 guidance. SBA generally disagreed with our findings and
recommendations related to the lack of adequate documentation
supporting the model's development process. OMB disagreed with our
recommendation that it revise Circular A-11. However, in light of the
consistent difficulty experienced by three independent reviewers of
SBA's 7(a) credit subsidy model, including SBA's financial statement
auditors, we continue to recommend that SBA enhance its credit subsidy
model documentation and that OMB require agencies to document the basis
and process used to develop credit subsidy models, including
understanding the model's basis and the variables that were selected
and rejected.
Background:
FCRA was enacted, among other reasons, to provide more accurate
measures of the costs of federal loan programs and to more accurately
compare costs among credit programs and between credit and noncredit
programs. FCRA requires agencies with loan guarantee programs to
estimate the subsidy cost, or the cost to the government, of their loan
guarantees over the life of the loan. To calculate the subsidy costs,
agencies must calculate, on a cohort[Footnote 3] basis, the net present
value of the forecasted cash flows for the program, which for SBA
included estimated defaults, recoveries, and fees related to the 7(a)
program. In addition, as part of this process, SBA must determine the
effects of loan prepayments on the cash flows. Under FCRA, SBA provides
information that generates a single subsidy rate and does not provide
information about any uncertainty in its estimate of the rate or other
factors affecting the rate, such as prepayments or defaults.
Prior to its 2003 budget submission, SBA's methodology for estimating
the subsidy on its 7(a) loans used historical averages for defaults and
recoveries based on loan data going back to 1986 as the basis for
estimates of future defaults and recoveries. This approach resulted in
fairly stable subsidy estimates on a yearly basis as it included a
sufficient volume of historical information that smoothed out
fluctuations in economic conditions from year to year. However, this
approach resulted in SBA consistently overestimating defaults and
recoveries. In previous work, we found that SBA overestimated defaults
by about $2 billion from fiscal years 1992 to 2000.[Footnote 4]
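The behavior of a historical-average approach can be sketched in a few lines of Python. The yearly default rates below are invented for illustration and are not SBA data, and a simple unweighted mean is an assumption about how such an average might be formed.

```python
def historical_average_rate(yearly_rates):
    """Unweighted mean of past yearly default rates (an assumed averaging rule)."""
    return sum(yearly_rates) / len(yearly_rates)


# Hypothetical default rates, oldest first: higher in early years, lower recently.
yearly_default_rates = [0.060, 0.055, 0.050, 0.030, 0.025, 0.020]

average_rate = historical_average_rate(yearly_default_rates)
most_recent_rate = yearly_default_rates[-1]
print(f"average: {average_rate:.3f}, most recent year: {most_recent_rate:.3f}")
```

Because every past year carries equal weight in this sketch, the average moves slowly from one year to the next, and it stays above recent experience whenever earlier years had higher default rates. That is one way an averaging method could produce estimates that are both stable and consistently too high.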
In an effort to improve the accuracy of its subsidy estimate, SBA
implemented a new methodology based on econometric modeling to estimate
the subsidy cost for the fiscal year 2003 and 2004 budget submissions.
Econometric modeling has advantages over historical averaging. For
example, to the extent that data are available, it can take into
account the effects of changes in such factors as economic conditions,
program rules, and loan types on defaults and prepayments.
All forecasts are uncertain, and this uncertainty has multiple causes.
When relationships among economic variables are estimated, uncertainty
may arise from the choice of variables used in the model, from the
degree of precision with which the strength of the relationships is
estimated, and from uncertainty about the future values of the
independent variables used in the forecasting equation. Excluding a
variable that should be in a forecasting model can reduce the quality
of the model. For example, if some industries have high default rates,
then a model that excludes industry variables will tend to
underestimate default costs in years when many loans go to high-risk
industries and overestimate default costs in years when many loans go
to low-risk industries. The
choice of variables to be used in a model results from a process of
professional judgment and balancing the risks of including too many or
too few variables. Economic theory and statistical tests play an
important role in these decisions. The remaining sources of
uncertainty, the precision of the estimated relationships and
uncertainty about future values of independent variables, are often
beyond the control of those building the model. The precision of the
effects of the independent variables is determined largely by the
amount of data available to the analyst, and uncertainty about future
values of independent variables is inherent in any forecast.
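The industry example above can be made concrete with a short sketch. The default rates and loan mixes below are hypothetical values chosen only for illustration; they are not SBA data, and the function name is an assumption. The sketch shows how a model that ignores industry applies one pooled rate to every cohort and therefore misses in opposite directions as the loan mix shifts.

```python
# Hypothetical default rates by industry (illustrative only, not SBA data).
DEFAULT_RATE = {"high_risk": 0.10, "low_risk": 0.02}


def expected_default_rate(mix):
    """Default rate for a cohort, given the share of loans in each industry."""
    return sum(share * DEFAULT_RATE[industry] for industry, share in mix.items())


# A model without industry variables effectively applies the rate implied
# by the historical mix of loans to every future cohort.
historical_mix = {"high_risk": 0.5, "low_risk": 0.5}
pooled_rate = expected_default_rate(historical_mix)

# Hypothetical cohorts whose industry mix departs from the historical mix.
risky_year = {"high_risk": 0.8, "low_risk": 0.2}
safe_year = {"high_risk": 0.2, "low_risk": 0.8}

for label, mix in [("risky year", risky_year), ("safe year", safe_year)]:
    actual = expected_default_rate(mix)
    error = pooled_rate - actual  # negative: the pooled model underestimates
    print(f"{label}: pooled {pooled_rate:.3f}, actual {actual:.3f}, error {error:+.3f}")
```

Under these assumed values, the pooled rate falls below the cohort rate in the year weighted toward high risk industries and above it in the year weighted toward low risk industries.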
Internal control is a major part of managing an organization, and this
includes controls over the gathering and processing of data, such as
SBA's data on 7(a) loans. As mandated by the Federal Managers' Financial
Integrity Act of 1982, the Comptroller General issues standards for
internal control in the federal government.[Footnote 5] These standards
provide the overall framework for establishing and maintaining internal
control and for identifying and addressing major performance and
management challenges and areas at greatest risk of fraud, waste,
abuse, and mismanagement. According to these standards, internal
control comprises the plans, methods, and procedures used to meet
missions, goals, and objectives. Control activities are the policies,
procedures, techniques, and mechanisms that enforce management's
directives and help ensure that actions are taken to address risks.
Control activities are an integral part of an entity's planning,
implementing, reviewing, and accounting for government resources and
achieving effective results. They include a wide range of diverse
activities including controls over information processing. These
controls are established to ensure that all data inputs are received
and valid and that outputs are correct. Agency management should design and
implement internal control based on the related costs and benefits. No
matter how well designed and operated, internal control cannot provide
absolute assurance that all agency objectives will be met and, thus,
once in place, internal control provides reasonable, not absolute,
assurance of meeting an agency's objectives.
SBA's Equations Were Reasonable, and Estimated Default and Recovery
Rates Were in Line with Historical Experience:
We found that the econometric equations that SBA used to estimate
defaults, prepayments, and recoveries were reasonable, although other
equations could also be reasonable. SBA uses an appropriate statistical
technique for identifying the nature of these relationships. In
addition, SBA's equations produced estimated relationships for defaults
and prepayments that were consistent with expectations based on
economic reasoning. We found that additional variables were available
to SBA that it did not include in its equations, such as measures of
interest rates and the borrower's industry type; equations including
these variables would also be reasonable and would produce different
subsidy rates. In addition, SBA did not include any economic variables
in its equation for estimating recoveries. According to documentation
SBA provided on the development of its recovery equation, adding
economic variables would not have increased the precision of the
recovery rate estimates.
Finally, we found that the new model's estimated default and recovery
rates were in line with recent historical experience.
Variables in SBA's Default and Prepayment Equations Were Appropriate:
The econometric equations that SBA used at the time of our review
related the likelihood that a borrower would either default on or
prepay a loan to several variables that economic reasoning and prior
research suggested were appropriate to include in these types of
equations. These variables included: (1) characteristics of the
borrower's business, such as whether it was a sole proprietorship,
partnership, or corporation; (2) characteristics of the loan, such as
the amount borrowed; and (3) two measures of economic conditions, the
unemployment rate in the state where the loan was made and the GDP
growth rate. Economic reasoning and prior research suggested that
differences in borrower and loan characteristics and economic
conditions were likely to influence defaults and prepayments. For
example, prior research suggested that new businesses were less likely
to survive than were established businesses and thus were more likely
to default.[Footnote 6] Prior research also suggested that the
likelihood of default on loans made to partnerships or corporations
should be less than it was for loans made to sole proprietors, while
the likelihood of prepayment should be greater. Details about SBA's
econometric equations are found in appendix II.
SBA's Statistical Technique and Estimated Relationships for Prepayments
and Defaults Were Appropriate:
At the time of our review, SBA used an appropriate technique known as
multinomial logistic regression[Footnote 7] to identify whether the
variables included in its model were important influences on the
likelihood that a borrower would either default on or prepay a loan and
to estimate the magnitude of these relationships. This technique, which
has been used in other models of this type, was appropriate because it
corresponded to the decision-making process that borrowers faced when
deciding whether to default on the loan, prepay the loan, or keep it
active. Using this technique, SBA produced estimates of both the
probability of default and the probability of prepayment.[Footnote 8]
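As a minimal sketch of how a multinomial logistic model turns variables into probabilities for the three outcomes, consider the following. The coefficients, the choice of inputs, and the function name are hypothetical; they were picked only so that default risk rises with unemployment and falls with GDP growth, and they are not SBA's estimated equations (which are described in appendix II).

```python
import math


def multinomial_logit_probs(x, beta_default, beta_prepay):
    """Return probabilities of (active, default, prepay) for one loan.

    "Active" is the reference outcome, so its linear score is fixed at 0.
    """
    score_default = sum(b * v for b, v in zip(beta_default, x))
    score_prepay = sum(b * v for b, v in zip(beta_prepay, x))
    denom = 1.0 + math.exp(score_default) + math.exp(score_prepay)
    return {
        "active": 1.0 / denom,
        "default": math.exp(score_default) / denom,
        "prepay": math.exp(score_prepay) / denom,
    }


# Inputs: [intercept, state unemployment rate (%), GDP growth rate (%)].
# Hypothetical coefficients, for illustration only.
BETA_DEFAULT = [-4.0, 0.15, -0.10]
BETA_PREPAY = [-3.0, -0.05, 0.08]

weak_economy = multinomial_logit_probs([1.0, 8.0, 0.5], BETA_DEFAULT, BETA_PREPAY)
strong_economy = multinomial_logit_probs([1.0, 4.0, 3.5], BETA_DEFAULT, BETA_PREPAY)

print("weak economy:  ", weak_economy)
print("strong economy:", strong_economy)
```

Because the three probabilities share one denominator, they always sum to 1, which is what allows a single model to produce default and prepayment probabilities jointly rather than from two unrelated equations.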
The relationships that SBA's equations estimated between different
variables and the likelihood of defaults and prepayments were
consistent with economic reasoning. For example, SBA's default equation
suggested that defaults were more likely when unemployment was higher
and the rate of increase in gross domestic product was lower. Both of
these estimated relationships were consistent with economic reasoning
because borrowers were less likely to continue paying their debts when
more people were out of work and the economy was growing less rapidly
or in decline.
SBA's prepayment equation also suggested that prepayments were more
likely when loans were made under the SBA Express Program, for which
SBA guaranteed a smaller percentage of the loan amount than it did
under the regular 7(a) business loan program. This result was
consistent with our expectations because the smaller guarantee was
likely to make lenders more cautious in making lending decisions, such
that firms borrowing through this program may have been more
creditworthy than firms borrowing through the regular program. In turn,
the businesses' enhanced creditworthiness may have led to more
prepayments because these businesses may have been relatively more
financially stable and may have been more likely to pay off their loans
early. The details of SBA's default and prepayment equations, which
show these relationships, are in appendix II.
Other Default and Prepayment Equations Would Also Be Reasonable and
Lead To Different Subsidy Rate Estimates:
We identified additional variables available to SBA, but not included
in the model, that also influenced the likelihood of defaults and
prepayments. The choice of variables included in a model reflects the
modelers' professional judgment and different equations using different
sets of variables can all be considered reasonable. To analyze the
effect of adding additional variables, we tested SBA's model to
estimate the 2003 subsidy cost using additional variables that (1)
measured the current interest rate on 1-year U.S. Treasury bills and
(2) considered the industry in which the borrowing firm operates. The
interest rate could be important as either another measure of general
business conditions or as a specific measure of the cost of capital.
The industry in which the borrowing firm operates could be important if
default and/or prepayment rates vary among industries, and the
distribution of loans among industries varies over time. In addition,
banks have traditionally recognized that the financial performance of a
borrower depends on the nature of the business supporting the loan, the
structure of the loan, and the financial condition of the firm. At the
time of our review, SBA's econometric equations contained information on
the loan and the firm but did not include information on the firm's
business.
The estimates produced by our testing suggest that these variables also
influenced the likelihood of defaults and prepayments occurring and,
therefore, that equations using these variables could also be
reasonable.[Footnote 9] However, there are additional considerations
that could be important in deciding whether to include a measure of
interest rates in the default and prepayment equations. Specifically,
including an interest rate variable would mean that forecasted interest
rates would be used with the results of the econometric equations (and
forecast values of other economic variables) to forecast future
defaults and prepayments. The fact that forecasting interest rates is
difficult may be a reason for not including an interest rate variable,
even if the variable appears to be significantly related to the
historical likelihood of default or prepayment. Furthermore, at
present, forecasted interest rates are low relative to the interest
rates that prevailed over most of the period from which the data were
drawn to develop SBA's equations, potentially limiting the usefulness
of including an interest rate variable.
We found that including either the interest rate on 1-year Treasury
bills or the industry in which the borrowing firm operates as a
variable in the default and prepayment equations changed the estimated
cost of the program. (See app. II.) According to SBA's model, the
estimated subsidy rate for loans disbursed in 2003 was 1.04 percent.
This estimate increased to 1.13 percent with the industry identifiers
included and decreased to 0.76 percent with the inclusion of the
interest rate on 1-year Treasury bills. In addition, when we included
both the interest rate variable and the industry identifiers, we
estimated a subsidy rate of 0.83 percent. Because interest rates are
difficult to predict and have recently been quite low, we conducted
tests to determine how sensitive the estimate was to small changes in
forecasted interest rates. We found that the estimate was not very
sensitive to such changes. For example, when we increased the
forecasted values above those included in the official OMB forecast by
10 percent, we estimated a subsidy rate of 0.80 percent, while when we
decreased the forecasted values by 10 percent, we estimated a subsidy
rate of 0.73 percent.
The range of estimated subsidy rates that result from including
additional variables was roughly comparable to the range that resulted
from using different economic assumptions. We tested the sensitivity of
SBA's estimated subsidy rate to small changes in the forecast values of
the GDP growth rate and the unemployment rate by reestimating the
subsidy rate with SBA's model using both more optimistic and more
pessimistic assumptions about future economic conditions.[Footnote 10]
With the more optimistic assumptions, we estimated that the subsidy
rate decreased to 0.81 percent, while with the more pessimistic assumptions
we estimated that it increased to 1.28 percent.
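The comparison of the two ranges can be checked with simple arithmetic. The sketch below uses only the subsidy rates reported in the text; the variable and function names are illustrative.

```python
# Subsidy rates (percent) reported in the text for loans disbursed in 2003.
baseline = 1.04
alternative_specifications = {
    "industry identifiers added": 1.13,
    "1-year Treasury bill rate added": 0.76,
    "both added": 0.83,
}
economic_assumptions = {
    "more optimistic": 0.81,
    "more pessimistic": 1.28,
}


def spread(rates):
    """Width of the range of rates, in percentage points."""
    values = list(rates.values())
    return round(max(values) - min(values), 2)


specification_spread = spread(alternative_specifications)
assumption_spread = spread(economic_assumptions)

# Show each alternative as a change from SBA's baseline estimate.
for label, rate in {**alternative_specifications, **economic_assumptions}.items():
    print(f"{label}: {rate:.2f} percent ({rate - baseline:+.2f} from baseline)")
print("spread across specifications:", specification_spread)
print("spread across economic assumptions:", assumption_spread)
```

The spread is 0.37 percentage points across the alternative specifications and 0.47 percentage points across the economic assumptions.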
Estimates of Recoveries Depended Only on Age of Loan, Not Economic
Conditions:
SBA's model also included a separate econometric equation for
estimating recoveries, which are the amounts of defaulted loans that
were eventually recouped by collection efforts, such as the liquidation
of assets. In this equation, the cumulative net recovery rate
[Footnote 11] for a cohort of loans was estimated as a function only of the age
of the loans in that cohort. In particular, this equation did not
include any economic variables, so forecasted recovery rates were
estimated to resemble historical recovery rates even though economic
conditions in the future might be quite different from the past.
According to documentation provided by SBA of the work done to develop
this equation, adding economic variables would not have increased the
precision of the recovery rate estimates.[Footnote 12]
The Model's Estimated Default and Recovery Rates Were in Line with
Historical Experience:
Our evaluation of the model's estimated default and recovery rates
found that these rates were in line with historical experience of the
7(a) program. There are some limitations to evaluating expected future
loan performance compared with historical data because over time the
economy changes and underwriting criteria and other factors that affect
loan performance may also change. Therefore, one would not expect the
estimated loan performance to exactly mirror historical experience.
However, these types of comparisons are useful to evaluate the model's
estimated default and recovery cash flows. Because recently issued
loans do not have significant experience and historical data can be
summarized in several ways, we evaluated the new model's estimated
default and recovery rates compared with historical data in two ways to
determine whether the estimates were in line with historical
experience.
In August 2001, we reported that from fiscal year 1992 through fiscal
year 2000, SBA overestimated the cost of the 7(a) program by about $1
billion, primarily because it overestimated defaults by approximately
$2 billion. Over this same period, SBA's estimated recoveries closely
matched actual loan performance. SBA's prior method to estimate costs
was based on averages of historical loan performance. As previously
discussed, SBA's current model estimated defaults significantly
differently than the prior method in that it considered economic
variables and loan specific information. Meanwhile, at the time of our
review, the model continued to estimate recoveries based on historical
patterns.
While it was not possible at the time of our review to determine the
accuracy of the model's estimated default rate, as shown in the
following two figures, the rate appeared to more closely match recent
historical experience than the rate produced by SBA's previous method.
Figure 2 shows how the model's estimated
default rate compared with the estimated default rates calculated with
SBA's previous method and with the average default experience of loans
issued between 1992 and 2001.[Footnote 13] We could have included more
or fewer years of loans in our analysis, but we believe data since 1992
are sufficient to evaluate the model's estimated default rate compared
with historical experience because it included several years of loans
that have been through their peak default period, which for 7(a) loans
is generally between years 2 and 5.
Figure 2: Estimated Default Rates Compared with Average Default
Experience from 1992 through 2001:
[See PDF for image]
[End of figure]
As previously mentioned, since historical data may be summarized
differently, figure 3 shows how the new model's estimated default rate
compared with the estimated default rate calculated with SBA's previous
method and with actual default experience during fiscal year 2001 for
the loans issued since 1986.[Footnote 14] This comparison allowed us to
evaluate the estimated default rate over a longer period of time since
data from older loans that have been outstanding for a longer period of
time were included.[Footnote 15]
Figure 3: Estimated Default Rates Compared with Fiscal Year 2001 Actual
Default Experience:
[See PDF for image]
[End of figure]
SBA's Model Could Be Enhanced by Adding Information on Borrowers,
Correcting Errors, and Updating Some Data:
SBA could enhance the reliability of its model's estimates by adding
information on both the businesses and the owners to the econometric
equations and reestimating the equations and by correcting errors in
the model. The econometric equations SBA used at the time of our review
to predict default and prepayments included some variables describing
the businesses and loans and two economic indicators, GDP and
unemployment rates. But they did not include some variables other
analysts and financial institutions often use that are associated with
businesses and business owners, such as credit scores. In addition,
during our review, we found some errors that resulted in
underestimating the cost of the 7(a) program that was included in the
fiscal year 2004 President's Budget. Correcting these errors would have
increased the estimated cost of the program by about $6.5 million.
Including Additional Information on Businesses and Business Owners
Could Enhance the Model's Reliability:
The quantitative relationships between the default and prepayment rates
and the current independent variables would probably change if new
information were included. In our review of the literature and
discussions with large banks, additional information was mentioned as
having an influence on defaults and prepayments. The information cited
was more detail on the loans, the business, and on business owners,
including credit scores.[Footnote 16]
Our review of the academic literature and discussions with some
commercial lenders indicated that private lenders often include
variables SBA did not consider in forecasting the financial performance
of small businesses.[Footnote 17] At the time of our review, the
current SBA model included loan variables (age and term) and some
business variables (new business indicators, form of ownership, and
loan amount, among others) but was missing detailed information on
businesses that can help predict financial viability. These variables
include earnings, capital, payment records, and available collateral,
all of which have been shown to affect creditworthiness and likelihood
of default. Profit levels, for example, help predict a business's
ability to generate cash internally to cover loan payments. Records of
debt payments help determine whether a business can cover its
obligations, while available collateral tells a lender whether a
business has the resources to cover outstanding debts during a
financial crisis. Adding and periodically updating this information
could enhance the predictive ability of SBA's econometric model by
providing more accurate estimates of potential defaults and
prepayments.
In addition, analysts and banks have found that variables describing
business owners can aid in evaluating credit risk, and many large banks
have started to underwrite and monitor small businesses using credit
scores. Information from business owners' credit records, such as
income, personal debt, employment tenure, homeownership status, and
previous personal defaults or delinquencies, can help predict
delinquencies and defaults in the businesses themselves. Although at
the time of our review SBA's current model did not include variables
that measure these characteristics, the agency was developing a new
loan monitoring system that SBA officials told us was intended to track
this type of information. SBA could then determine whether such
variables also reflect risks in SBA loans and could be used to help
evaluate the costs of SBA loan guarantees. This is an important issue
because, if banks use credit scores and SBA does not, SBA may be left
with riskier loans.
SBA's 2004 Subsidy Rate Estimate Included Errors:
During our review of the model used to generate the cost estimate of
the 7(a) subsidy that was included in the fiscal year 2004 budget, we
found errors that resulted in underestimates of program costs of about
$6.5 million. Based on the estimated subsidy rate and the projected
loan volume included in the fiscal year 2004 President's Budget, the
estimated cost of the program was about $94.9 million. If the errors we
found had been detected and corrected by SBA before the budget was
submitted, the estimated cost of the program with the same projected
loan volume would have increased to about $101.4 million.
These errors related to SBA's method of estimating recoveries, annual
guarantee fee cash flows, and projections of borrower interest rates.
First, the recovery estimates were based on the assumption that loans
would be issued during fiscal year 2003 instead of during fiscal year
2004, although default and prepayment estimates were based on the later
year. As a result, the model estimated that recovery cash flows would
occur 1 year early, affecting the net present value[Footnote 18] of the
cash flows and the subsidy rate. Second, formulas SBA used to summarize
the output of the cash flow segment of the model indicated that the
same annual guarantee fees collected during the first quarter of fiscal
year 2004 would be collected from about years 5-27, even though the
fees would decline as loan balances were paid off. SBA officials
indicated that these two errors would be corrected before the
submission of the 2005 budget. Third, in estimating the cost of loans
issued in the future, SBA assumed the loans would have characteristics
similar to those of loans issued during fiscal year 2001. However, SBA
did not adjust the borrower interest rates to levels that would be more
appropriate for loans to be issued during fiscal year 2004. SBA
officials indicated that this adjustment was not necessary because it
would not significantly affect the cost of the program. However, SBA
had made this adjustment when it calculated the subsidy cost for loans
to be issued during fiscal year 2003. When we corrected the previously
described errors, the estimated cost of the program for fiscal year
2004 increased by $6.5 million. We also found an error related to
estimating prepayment penalties. SBA officials stated that they were
aware of this error but believed that fixing it would be complicated
and that these cash flows would be immaterial to the cost of the
program. In the officials' view, fixing the error would not be cost
beneficial.
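The effect of the first error, recoveries assumed to arrive 1 year early, can be illustrated with a short present value sketch. The recovery amounts and the discount rate below are hypothetical and are not drawn from SBA's model; the sketch shows only the direction of the effect.

```python
def present_value(cash_flows, discount_rate):
    """Discount (year, amount) cash flows back to year 0."""
    return sum(amount / (1.0 + discount_rate) ** year for year, amount in cash_flows)


# Hypothetical recovery stream: (years after origination, dollars recovered).
DISCOUNT_RATE = 0.05  # hypothetical discount rate
recoveries = [(3, 100.0), (4, 150.0), (5, 120.0)]

# The timing error: the same recoveries assumed to arrive 1 year early.
recoveries_one_year_early = [(year - 1, amount) for year, amount in recoveries]

pv_correct = present_value(recoveries, DISCOUNT_RATE)
pv_early = present_value(recoveries_one_year_early, DISCOUNT_RATE)

# Recoveries offset default costs, so overstating their present value
# understates the estimated subsidy cost.
overstatement = pv_early - pv_correct
print(f"correct timing: {pv_correct:.2f}")
print(f"one year early: {pv_early:.2f}")
print(f"overstatement:  {overstatement:.2f}")
```

Shifting every recovery 1 year earlier raises its present value by a factor of one plus the discount rate, so the overstatement of recoveries grows with the discount rate assumed.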
Cohort Data Could Be Updated:
The model could be further enhanced if SBA were to update it to
include new information as it becomes available. For example, SBA used
the 2001 cohort of loans to generate estimates of the 2003 and 2004
subsidy. However, SBA officials were not sure whether they would use
the 2002 cohort of loans for the 2005 estimate because they said
that updating the cohort is complicated as a result of changes in
program policies or in the composition of the 7(a) loan portfolio.
However, the model would likely produce more reliable estimates if the
most recent loan data were being used to generate the forecast rather
than continuing to use an older cohort of loans.
SBA Collaborated with OFHEO and OMB to Develop the Model:
SBA contracted with OFHEO economists, with expertise in econometric
modeling of mortgage defaults and prepayments, to develop its subsidy
model, which included determining the variables to be included in the
econometric equations. SBA consulted with OMB officials, who are
required by FCRA to approve agency subsidy estimates. SBA also hired a
private consulting firm to conduct a limited review of the model as
part of its ongoing review process to minimize errors in estimating the
subsidy.
SBA Entered into an Agreement with OFHEO to Develop the Subsidy Model:
In February 2002, SBA entered into an agreement with OFHEO to assist in
developing the subsidy model. According to SBA staff, they selected
OFHEO because it had staff with expertise and experience in econometric
modeling and was less expensive than a private contractor.[Footnote 19]
According to SBA staff, the OFHEO economists followed a four-step
process to develop the model. The first step was refining and building
the data set that would be used to generate the estimates. The data set
OFHEO used was constructed from the SBA databases that were used to
track loan payment history and personal financial information on
borrowers. The second step was the design and estimation of the
default, prepayment, and recovery equations, including the selection of
variables for these equations. The third step of the process was the
construction of the cash flow module, and the fourth step was the
construction and testing of the model that OFHEO would deliver for use
by SBA.
OMB Officials Approved SBA's Model:
OMB officials also played a key role in the development of the model
because, under FCRA, OMB has final responsibility for approving
estimation methodologies and determining subsidy estimates. SBA
officials said they consulted with OMB during the model's development
until OMB approved it in the fall of 2002. OMB officials told us that
they considered the model to be an improvement over the previous method
that SBA used to calculate the program subsidy rate because it used
better data and the econometric equations allowed for more accurate
estimates of future cash flows. In addition, SBA could now use the
model to consider both programmatic and economic variables in
estimating the subsidy rate. For example, they said SBA could model how
such variables as lender type affected the subsidy rate.[Footnote 20]
In reviewing the model, OMB officials told us that they focused on the
methodology of the model, the cash flow projections, appropriate use of
variables in the econometric equations, and the validity of the data
used to make the calculations. They approved the model in November
2002.
SBA Hired a Private Consulting Firm to Review the Model:
SBA hired a private consulting firm to conduct an independent limited
review of the model from September 2002 to October 2002, as part of its
ongoing process to identify errors before OMB approved the model. The
consulting firm assessed the model conceptually and evaluated its
underlying computer programming--specifically, the key data inputs that
were the primary source of the model's cash flows and the model's
programming specifications (to ensure they were correctly coded and
that the code functioned properly). The firm also assessed the model's
compliance with the relevant statutes and regulations and conducted
scenario testing to evaluate how it performed under different economic
assumptions. The consulting firm concluded that although the model
performed reasonably well in estimating the subsidy cost, SBA had made
errors in estimating loan guaranty and servicing fees, the calculation
of recoveries, and prepayment penalties. SBA made changes to the model
to address the identified discrepancies for fees and recoveries, the
net effect of which was to increase the subsidy rate estimate by about
36 percentage points. The consulting firm also determined that the
model lacked adequate documentation and that it was, therefore, unable
to review the econometric component of the model. However, OFHEO
subsequently provided SBA with a report documenting the model's
development to a limited extent.
Lack of Adequate Model Documentation Hampered Independent Reviews of
SBA's Model:
In developing its new econometric model, SBA did not prepare adequate
supporting documentation to enable independent reviewers to understand
and evaluate the process that was used. For example, the independent
contractor SBA hired to review the 7(a) credit subsidy model was
hampered by the lack of adequate documentation and, as a result, this
team's review of the model's theoretical basis and its working features
was severely limited. While SBA later developed some general
documentation of its model development process, this documentation did
not contain, among other things, an adequate discussion of alternative
variables, or combinations of variables, that it considered, tested,
and rejected, and the reasons for rejecting them. SBA officials told us
that they did not prepare this type of documentation because they
believed that there was no specific requirement to do so. Current
guidance is either silent or unclear about supporting documentation
needed to explain the development of econometric models used to
generate credit subsidy estimates for the budget and financial
statements. Nevertheless, we believe that maintaining adequate
documentation on how such models were developed is a sound internal
control practice that would provide SBA and other agencies the
opportunity to demonstrate and explain the rationale and basis for key
aspects of their models that provide important cost information for
budgets, financial statements, and congressional decision makers.
Moreover, as a practical matter, this documentation would help
facilitate SBA's and other agencies' annual financial statement audits.
SBA's 7(a) Credit Subsidy Model Documentation Was Inadequate for
Outside Reviewers:
BearingPoint, the independent contractor hired to perform an initial
review of the SBA 7(a) credit subsidy model prior to its finalization,
was hampered by the lack of adequate documentation. In response to our
inquiry, the contractor stated that the team did not validate the model
which, from an audit perspective, would have encompassed a more robust
effort. In its final report to SBA, the contractor reported that SBA
lacked sufficient supporting documentation for a "thorough review of
its [the model's] theoretical basis (including alternative modeling
methodologies explored), its working features, or the update and
maintenance procedures necessary to use the model on an ongoing basis.
This lack of adequate documentation severely limited our ability to
assess certain critical parts of the model in detail, including its
econometric components." Further, the contractor recommended that "SBA
develop a robust set of documentation to support this model" including
"the modeling methodology, alternate methodologies considered, data
inputs and outputs, and model maintenance and update requirements."
In its January 30, 2004, audit report, Cotton and Company, the
independent public accounting firm, identified in its internal control
report 9 specific deficiencies in the model's documentation.
[Footnote 21] These
deficiencies included, for example, a lack of technical references for
the statistical method used for the performance of the model, the
absence of mathematical specifications, important variables that were
not clearly identified, and units of measure for key variables that
were not specified. In addition, the audit report stated
that the documentation that was provided was "self-contradictory" about
the quality of the default and prepayment model and lacked a discussion
of the assumptions and limitations of SBA's modeling approach. In
responding to the independent public accountant's internal control
report, SBA's Chief Financial Officer generally agreed with the
report's findings, including the deficiencies in SBA's model
documentation, and stated that the internal control report presented
"fundamentals of good financial management and SBA is committed to
accomplishing as many of these items as possible in the coming year."
In response to BearingPoint's recommendation, SBA's OFHEO contractor
prepared some documentation for the model, but this documentation was
not sufficient to allow us and SBA's financial statement auditor to
gain an adequate understanding of certain key parts of the model
development process. For example, the documentation that SBA provided
included a broad overview of how the model works, a list of the
variables that the final econometric equations included, the estimated
coefficients of the equations, and figures showing how well the
equations fit the data during the historical period. For some
variables, SBA's documentation indicated how the variables were
expected to influence default or prepayment probabilities, but did not
provide any reasons, conceptual justification, or supporting empirical
analysis. Some of these statements seemed intuitive, such as the
statement that when the output of the economy increases, as measured by
the percent change in real GDP, default rates are expected to drop.
However, other
statements were not intuitive. For example, SBA's documentation
indicated that larger loans were expected to default at elevated levels
and did not include any support for this assertion.
Additionally, the model documentation did not explain in sufficient
detail why SBA excluded some variables. Rather, the model documentation
included a table of 29 variables that were tested and rejected and
stated that the information presented was "a list of most variables
tested." The documentation also provided a general overview about why
these 29 variables were excluded. SBA's documentation stated that
"variables were removed for a variety of reasons. Some of the reasons
include--insignificant, highly correlated with other variables, low
economic importance (significant but impact on probabilities was
negligible), inconsistent results (variable was not robust to different
specifications), and incoherent results (results could not be
reconciled with any economic logic)." While the documentation that SBA
provided to us contained acceptable reasons that economists could cite
in rejecting variables, the documentation's lack of specificity did not
allow us to determine which variables were rejected for which reasons.
Further, we were unable to determine whether these were the only
criteria or whether they were consistently applied throughout the model
development process.
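To illustrate the kind of record that would let a reviewer tie each
rejected variable to a specific reason, the following is a minimal
sketch in Python. It is not SBA's or its contractor's format. The
variable mnemonics, descriptions, and test-run references are
hypothetical; only the five rejection reasons are taken from the
documentation quoted above.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Reason(Enum):
    """Rejection reasons quoted in SBA's model documentation."""
    INSIGNIFICANT = "insignificant"
    HIGHLY_CORRELATED = "highly correlated with other variables"
    LOW_ECONOMIC_IMPORTANCE = "low economic importance"
    INCONSISTENT = "inconsistent results"
    INCOHERENT = "incoherent results"


@dataclass(frozen=True)
class RejectedVariable:
    mnemonic: str      # short code as it appears in raw regression output
    description: str   # plain-language meaning, i.e. the crosswalk entry
    reason: Reason     # the single documented reason for rejection
    evidence: str      # pointer to the supporting test output


# Hypothetical entries for illustration; not SBA's actual variables.
REJECTION_LOG = [
    RejectedVariable("VAR_A", "hypothetical variable A",
                     Reason.INSIGNIFICANT, "test run 12"),
    RejectedVariable("VAR_B", "hypothetical variable B",
                     Reason.HIGHLY_CORRELATED, "test run 17"),
    RejectedVariable("VAR_C", "hypothetical variable C",
                     Reason.INSIGNIFICANT, "test run 23"),
]


def count_by_reason(log):
    """Tally how many variables were rejected for each reason."""
    return Counter(entry.reason for entry in log)


def crosswalk(log):
    """Map each mnemonic to its plain-language description."""
    return {entry.mnemonic: entry.description for entry in log}
```

A log of this shape would answer the two questions the documentation
left open: which variable was rejected for which reason, and what each
mnemonic means.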
SBA and the OFHEO contractor told us that, during the model development
process, approximately 800 pages of raw testing information were
generated and retained in an electronic file. They further stated that
these 800 pages were not organized in any fashion and that there was no
summary document or road map with greater detail than the model
documentation provided us that would describe the variable-testing
process or the results of that process in an understandable fashion. In
addition, SBA and the contractor told us that the variables reflected
in the 800 pages were not recorded in English words, but rather in
mnemonics, and that there was no crosswalk or key still in existence to
decode the mnemonics. Based on these representations by SBA and its
contractor, we initially concluded that this information would be of
questionable or no usefulness in assessing SBA's development of the
assumptions and selection of variables used in the modeling process.
SBA eventually provided us access to the 800 pages of material that
contained some information on variables that were considered and
rejected. This document was a partial compilation of analyses conducted
during the model development process with no explanation or discussion
of what was learned from each analysis conducted. Thus, on its own,
this document provided little additional information regarding the
process that SBA's contractor followed in developing the econometric
equations used in the subsidy model. Further, the document was written
in mnemonics and was not organized in any logical manner. In addition,
SBA officials could not identify any specific parts of this
documentation that related to alternative variables that were
considered and rejected during the model development process.
Documenting the basis for selecting and rejecting variables from an
econometric model used to develop credit subsidy estimates is an
important internal control that would also help to provide financial
statement auditors reasonable assurance that a bias was not introduced
into the credit subsidy estimates by systematically excluding variables
to influence the subsidy rate in a particular direction. Statement on
Auditing Standards Number 57, Auditing Accounting Estimates (SAS No.
57), states that "even when management's estimation process involves
competent personnel using relevant and reliable data, there is
potential for bias in the subjective factors." When evaluating the
reasonableness of an estimate, the auditor should concentrate on, among
other things, "key factors and assumptions that are subjective and
susceptible to misstatement and bias." Because of the nature of
econometric models and the effect that variables used have on future
loan default and prepayment projections, auditors need to understand
both what was included and excluded from the model to assess the
reasonableness of the credit subsidy estimate from a financial
accounting perspective.
As our work demonstrated, changing the variables that were included in
the model changed the subsidy rate. Because of the lack of adequate
documentation on SBA's 7(a) model development process, we were unable
to determine whether a bias in selecting variables existed in the
model. Further, SBA's lack of adequate documentation on the 7(a) model
development process could have impeded our ability to reach a
conclusion on SBA's loan accounts in connection with the audit of the
consolidated financial statements of the federal government.
Specific Guidance on Credit Subsidy Model Development Documentation Is
Limited:
Currently, there is limited specific guidance on the nature and extent
of documentation that agencies must prepare related to the development
of models to generate credit subsidy estimates. OMB Circular A-11,
Preparation, Submission, and Execution of the Budget, provides guidance
on how agencies should prepare credit subsidy estimates. Circular A-11
does not include any guidance to the agencies for documenting their
model development process including selection and rejection of
variables for use in the models that generate federal credit subsidy
estimates. However, Federal Financial Accounting and Auditing Technical
Release 6, Preparing Estimates for Direct Loan and Loan Guarantee
Subsidies under the Federal Credit Reform Act Amendments to Technical
Release 3: Preparing and Auditing Direct Loan and Loan Guarantee
Subsidies under the Federal Credit Reform Act,[Footnote 22] provides
some implementation guidance
about the nature and extent of documentation agencies should have for
their models. Technical Release 6 states that agencies should document
the cash flow model(s) used and the rationale for selecting the
specific methodologies. Agencies should also document the sources of
information, the logic flow, and the mechanics of the model(s)
including the formulas and other mathematical functions. In addition,
because the model is the basis for budget and financial statement
credit subsidy estimates, this documentation also facilitates an OMB
budget analyst's review, if the analyst is not involved in the
development process, the external financial statement audit, and other
independent reviews. Technical Release 6 also states that agency
documentation for subsidy estimates and reestimates should be complete
and stand on its own, enabling an independent person to perform the
same steps and replicate the same results with little or no outside
explanation or assistance. In addition, if the documentation were from
a source that would normally be destroyed, then copies should be
maintained in the file for the purposes of reconstructing the estimate.
Technical Release 6 does not specifically address expected
documentation of an agency's model development process, including a
detailed discussion of alternative variables that are considered, the
reasons for their rejection, and specific examples based on results of
earlier regressions. Nevertheless, in our view, the documentation
principles in this Technical Release represent sound internal control
practice that could also be applied to an agency's development of a
model used to generate budget and financial statement credit subsidy
estimates. Such documentation would introduce transparency into an
agency's budget process and enable agencies' models and the resulting
estimates to withstand scrutiny and inquiry from independent reviewers.
For example, such documentation would allow validation of an agency's
model by independent reviewers, and provide reasonable assurance that
the agency selected and rejected assumptions and variables for the
model on a sound basis. Further, this documentation would help
demonstrate to congressional stakeholders sound decision making and
stewardship over millions of dollars in appropriated funds.
SBA Had a Process to Help Ensure Data Quality and the Data Used in the
Model and SBA's Loan Level Databases Were Consistent:
Calculating a reliable credit subsidy estimate requires that the key
cash flow data, such as defaults or recoveries and the timing of these
events, be reliable; otherwise, the credit subsidy estimate could be
affected.
Internal control standards call for agencies to have a process to help
ensure the completeness, accuracy, and validity of all transactions
processed. SBA's monthly reconciliation process, combined with lender
incentives and loan sales, helped ensure the quality of the underlying
data used in its credit subsidy estimation process. Although some
errors existed in SBA's databases at the time of our review, the
nature and magnitude of these errors were unlikely to significantly
alter the subsidy rate. Further, we tested the data used by SBA's new
econometric model and found them to be consistent with the data in
SBA's loan systems at the time of our review.
SBA Had a Process to Identify and Correct Data Errors:
The primary method that SBA used to help ensure the integrity of its
loan data is its Form 1502 reconciliation process. Reconciliations are
an important internal control established to ensure that all data
inputs are received and are valid and all outputs from a particular
system are correct. This process, which has been in effect since
October 1997, utilized an SBA contractor to conduct monthly matches of
borrower data submitted by 7(a) program lenders on SBA's Form 1502 to
the information in the agency's Portfolio Management Query Display
System to help ensure the completeness and accuracy of the agency's
data. The information on the Form 1502 included a wide variety of data
for an individual loan, some of which was used in the credit subsidy
estimation process, and included, among other things, loan
identification number; loan status such as current, past due, or in
liquidation; loan interest rate; the portion of the loan guaranteed by
SBA; and the ending balance of the loan's guaranteed portion. Errors
identified by this match were loaded each month into SBA's Portfolio
Management Guaranty Information System, and it was accessed by the
various district office staff to work with lenders to correct the
erroneous data.
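The monthly match described above can be sketched as a comparison of
two keyed data sets. This is only an illustration of the general
technique, not SBA's contractor's procedure. The field names, loan
numbers, values, and the "NOT_REPORTED" flag are hypothetical stand-ins
for the Form 1502 data elements listed in the text.

```python
# Hypothetical field names standing in for Form 1502 data elements.
FIELDS = ("status", "interest_rate", "guaranteed_pct", "guaranteed_balance")


def reconcile(lender_report, agency_records, fields=FIELDS):
    """Return a list of (loan_id, field, lender_value, agency_value).

    Both arguments map a loan identification number to a dict of field
    values. A loan that is in the agency's records but missing from the
    lender's report is flagged with the pseudo-field "NOT_REPORTED".
    """
    errors = []
    for loan_id in sorted(agency_records):
        agency_row = agency_records[loan_id]
        lender_row = lender_report.get(loan_id)
        if lender_row is None:
            errors.append((loan_id, "NOT_REPORTED", None, None))
            continue
        for field in fields:
            if lender_row.get(field) != agency_row.get(field):
                errors.append((loan_id, field,
                               lender_row.get(field), agency_row.get(field)))
    return errors


# Hypothetical data for three loans.
agency = {
    "1001": {"status": "current", "interest_rate": 7.5,
             "guaranteed_pct": 75, "guaranteed_balance": 90000.0},
    "1002": {"status": "current", "interest_rate": 8.0,
             "guaranteed_pct": 80, "guaranteed_balance": 40000.0},
    "1003": {"status": "past due", "interest_rate": 6.75,
             "guaranteed_pct": 75, "guaranteed_balance": 12000.0},
}
lender = {
    "1001": {"status": "current", "interest_rate": 7.5,
             "guaranteed_pct": 75, "guaranteed_balance": 90000.0},
    "1002": {"status": "liquidation", "interest_rate": 8.0,
             "guaranteed_pct": 80, "guaranteed_balance": 40000.0},
}

monthly_errors = reconcile(lender, agency)
```

In this sketch, loan 1002 is flagged for a status mismatch and loan
1003 is flagged because the lender reported nothing for it.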
Although we did not independently test the data match conducted by
SBA's contractor or the field office staff's correction of identified
errors, we reviewed summary reports of the errors in the Guaranty Loan
Reporting System for each district office over a 4-month period during
fiscal year 2003 and found that most of these reported errors were
resolved during the month the errors were identified. During the months
we reviewed, the percentage of errors resolved ranged from a low of
about 65 percent to a high of nearly 89 percent.[Footnote 23] Although
one month we reviewed
had only a 65 percent resolution rate, leaving 4,860 errors uncorrected
at the end of the month, as explained in the following paragraph, not
all of these errors would affect the subsidy estimate and this number
is relatively small compared to the large volume of loan transaction
level data used in the credit subsidy estimation process. Our review of
the underlying data used in the model showed that about 5.7 million
data records were used to record the quarterly loan performance of
392,315 loans from 1988-2001.
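The scale comparison in the preceding paragraph can be checked with
simple arithmetic. The short Python sketch below uses only the figures
reported above. Because the text gives the resolution rate as "about 65
percent" and the record count as "about 5.7 million," the results are
approximations, and the variable names are ours.

```python
# Figures as reported in the text; two of them are rounded.
unresolved_errors = 4_860
resolution_rate = 0.65        # "about 65 percent"
data_records = 5_700_000      # "about 5.7 million"
loans = 392_315

# If roughly 35 percent of the month's errors remained unresolved,
# the implied number of errors identified that month is:
implied_total_errors = unresolved_errors / (1 - resolution_rate)

# Average number of quarterly performance records per loan.
records_per_loan = data_records / loans

# Unresolved errors relative to the record count (rough scale only).
unresolved_share = unresolved_errors / data_records

print(f"implied errors identified: about {implied_total_errors:,.0f}")
print(f"records per loan: about {records_per_loan:.1f}")
print(f"unresolved errors as share of records: {unresolved_share:.3%}")
```

On these figures, the month's unresolved errors amount to less than
one-tenth of 1 percent of the records used in the model. The monthly
error count and the 1988-2001 record count cover different periods, so
the comparison indicates scale only.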
In order to assess whether the remaining errors in SBA's database
would likely have a significant effect on the credit subsidy estimation
process, we reviewed the 38 different error codes that are reported
monthly by the Guaranty Loan Reporting System and found that less than
half of these error codes were related to data used by the econometric
model and, as a result, could have affected the credit subsidy
estimate. For example, the Guaranty Loan Reporting System identified
errors for lender contact name and phone number--data that were not
used by the new econometric model and would not affect the subsidy
estimate. Other error codes relating to the guaranteed portion
principal balance or whether a loan was in liquidation status could
affect the credit subsidy estimate if the number of errors and their
dollar volume were significant.
We reviewed a 6-month summary error report from the Guaranty Loan
Reporting System for activity between February and July 2003 and found
that, for those error codes that could affect the credit subsidy
estimate, only two of these codes had error rates that exceeded 1
percent of the transactions. One of these codes indicated that the loan
status was not correct because the loan was in liquidation and had an
average error rate of about 1.4 percent for the 6-month period we
reviewed. The other error code indicated that the bank did not report
any information for a particular loan and had an average error rate of
about 2.4 percent for the same time period. The remaining 11 error
codes that could have affected the credit subsidy estimate had rates of
less than 1 percent. We assessed the error rates on this report in
aggregate to determine if these could affect the credit subsidy
estimate and found that the average aggregate error rate was about 6.5
percent during this period. However, given that most of these errors
were corrected in the month the error was identified, it was unlikely
that the remaining uncorrected errors would affect the credit subsidy
estimate at the time of our review.
Lender Incentives and Loan Sales Help Ensure Data Integrity:
In addition to the monthly loan data reconciliation process, lender
incentives also helped ensure the integrity of the underlying data used
in the credit subsidy estimates. In accordance with current SBA policy,
the agency can reduce or completely deny a lender's claim payment if
the defaulted loan data are not correct. According to SBA officials,
this policy gives the 7(a) program lenders an incentive to correct data
errors because it helps ensure they will be paid the full guarantee
amount if the borrower subsequently defaults on the loan. SBA provided
us with repair and denial data for fiscal years 1999 through the first
three quarters of fiscal year 2003 showing that the agency exercised
these options 2,177 times during this time, totaling at least $69.9
million.[Footnote 24]
Further, an ancillary benefit of SBA's loan sales program was to help
ensure data integrity. Prior to a sale, SBA district office staff, as
well as contractors, reviewed loan files as part of the "due diligence"
reviews to provide accurate information about the loans available for
sale to potential investors so that they may make informed bids. SBA
officials told us that prior to selling a loan, discrepancies between
the lenders' data and SBA's data had to be resolved.
Data Used by the Econometric Model Were Consistent with SBA Databases:
In order to assess the consistency between the data used in SBA's
econometric approach and the data in SBA's loan system, we selected a
stratified random sample of 400 items, tested key data that could
affect the credit subsidy estimate, and found no errors.[Footnote 25]
Specifically, we randomly selected 100 default and recovery
transactions and compared the amounts and transaction dates between the
loan system data and loan-level data used for the credit subsidy
estimate. In addition, we randomly selected 100 loans identified by the
model to be prepaid and reviewed the loan histories in SBA's database
and determined that all of these loans were paid off prior to their
scheduled termination date. Further, we tested 100 additional loans and
compared their status such as current, paid off, or default to ensure
their status in the model was correct and found no errors.
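The general approach of drawing a stratified random sample and
comparing each sampled item with the system of record can be sketched
as follows. This is a generic illustration, not the procedure or tools
we used. The record layout, the two strata, the sample size per
stratum, and the data are all hypothetical.

```python
import random


def stratified_sample(items, stratum_of, per_stratum, seed=0):
    """Draw up to per_stratum items at random from each stratum."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    strata = {}
    for item in items:
        strata.setdefault(stratum_of(item), []).append(item)
    return {name: rng.sample(group, min(per_stratum, len(group)))
            for name, group in sorted(strata.items())}


def find_mismatches(sample, loan_system, fields=("amount", "date")):
    """Return IDs of sampled records that disagree with the loan system."""
    mismatches = []
    for group in sample.values():
        for rec in group:
            source = loan_system.get(rec["txn_id"])
            if source is None or any(rec[f] != source[f] for f in fields):
                mismatches.append(rec["txn_id"])
    return mismatches


# Hypothetical model data: 500 default and 500 recovery transactions.
model_data = [
    {"txn_id": i, "kind": "default" if i % 2 else "recovery",
     "amount": 1000.0 + i, "date": "2001-03-31"}
    for i in range(1000)
]
# Hypothetical loan system that agrees with the model data exactly.
loan_system = {r["txn_id"]: {"amount": r["amount"], "date": r["date"]}
               for r in model_data}

sample = stratified_sample(model_data, lambda r: r["kind"], per_stratum=100)
errors = find_mismatches(sample, loan_system)
```

Because the hypothetical loan system matches the model data, the
comparison returns no mismatches; altering any sampled record in
`loan_system` would cause its ID to be returned.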
We also assessed the magnitude of 7(a) loans that were excluded from
the model in order to determine whether excluding these potentially
valid loans would likely affect the credit subsidy estimate. Our
earlier work on SBA's previous 7(a) credit subsidy model that primarily
used historical averages of defaults and recoveries found that
excluding loans from certain years that had higher default rates would
lower the overall average default rate. Excluding large numbers of
loans from this model would likely have a similar effect on the
estimated subsidy rate. To assess the magnitude of excluded loans, we
reviewed the computer coding for the econometric model and found that
SBA excluded loans when critical data for the model were missing such
as the initial disbursement date, the loan amount, or demographic
information on the borrowers. For most of the years between 1988 and
2001, the number of loans excluded because they lacked these essential
data ranged from 1 percent to 2 percent. Overall, we concluded that
the degree of excluded loans was acceptable and would not significantly
affect the credit subsidy estimate calculation at the time of our
review.
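A share of excluded loans by year, of the kind discussed above, could
be computed along the following lines. This is a minimal sketch and
does not reproduce the model's computer coding. The field names,
including the `approval_year` grouping key, and the sample cohort are
hypothetical.

```python
# Hypothetical names for the data the text describes as critical.
REQUIRED_FIELDS = ("disbursement_date", "loan_amount",
                   "borrower_demographics")


def exclusion_rate_by_year(loans, required=REQUIRED_FIELDS):
    """Return {year: share of loans missing any required field}."""
    totals, excluded = {}, {}
    for loan in loans:
        year = loan["approval_year"]
        totals[year] = totals.get(year, 0) + 1
        if any(loan.get(name) is None for name in required):
            excluded[year] = excluded.get(year, 0) + 1
    return {year: excluded.get(year, 0) / totals[year]
            for year in sorted(totals)}


def complete_loan(year):
    """Build a hypothetical loan record with all required data present."""
    return {"approval_year": year, "disbursement_date": "1990-01-15",
            "loan_amount": 150000.0, "borrower_demographics": "reported"}


# Hypothetical cohort: 100 loans, 2 of which lack a loan amount.
loans = [complete_loan(1990) for _ in range(98)]
loans += [dict(complete_loan(1990), loan_amount=None) for _ in range(2)]

rates = exclusion_rate_by_year(loans)
```

For the hypothetical cohort, 2 of 100 loans are excluded, a rate of 2
percent. Reporting the rate by year rather than in total makes it
easier to see whether exclusions cluster in particular cohorts.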
Conclusions:
Overall, we found that from an economics perspective, SBA's econometric
equations for its 7(a) credit subsidy model were reasonable. However,
from an audit perspective, SBA's lack of adequate documentation of the
model development process precluded us from (1) independently
evaluating the model's development; (2) determining whether SBA used a
sound and consistently applied method to select and reject variables to
be included in the model; and (3) determining whether a bias in
selecting variables existed in the model.
Based on our review, SBA's econometric equations for estimating
defaults, prepayments, and recoveries, which were used to derive the
estimate of its fiscal year 2004 subsidy costs, were reasonable. This
model's methodology has the potential to produce more reliable
estimates than the previous method of using historical averaging to
project the estimated program cash flows because this model relies on
economic reasoning in addition to historical program data. However, the
precision of any econometric model is limited because any estimate
produced by such a model should be considered one point in a range
within which the actual subsidy cost will likely fall. Because the
budget process requires agencies to select a specific estimate rather
than project a range, there will likely be some variance between the
forecasted and actual subsidy amounts. Using additional data that SBA
anticipates gathering in its new loan monitoring system, such as
borrower-specific data, could further enhance the reliability of SBA's
estimates of the subsidy cost. Therefore, further enhancements could
produce more reliable results.
Although the errors we identified in the model did not materially
affect the subsidy cost estimate, they did indicate that the process
SBA used to validate the model could be improved. Therefore, it is
important to invest the resources needed to periodically reevaluate the
underlying assumptions of any model to ensure that they are correct and
comprehensive, and that any errors or erroneous assumptions are
corrected so that the model continues to yield reasonable results.
While we found SBA's equations to be reasonable from an economics
perspective, the lack of adequate documentation of the model's
development process hampered three independent reviews of the 7(a)
model. Notwithstanding the current lack of clear OMB Circular A-11
guidance, SBA could benefit from applying the documentation principles
embodied in Technical Release 6 to the development of the 7(a)
econometric model and other credit subsidy estimation models it has
recently developed or is currently developing. Without adequate
documentation, SBA will be unable to transparently demonstrate the
rationale and basis for key aspects of models that provide important
cost information for budgets, financial statements, and congressional
decision makers. Although OMB provides guidance on how agencies should
prepare credit subsidy estimates in Circular A-11, it does not include
any guidance to the agencies for documenting their model development
process including the selection and rejection of variables for use in
the models that generate federal credit subsidy estimates. A lack of
improved OMB guidance for model documentation will continue to hamper
adequate external oversight and validation of models used to generate
credit subsidy estimates.
Recommendations for Executive Action:
We are making three recommendations to SBA and one to OMB. To further
enhance the reliability of SBA's subsidy estimates, we recommend that
the SBA Administrator take the following two actions:
* determine how best to include in future subsidy models
borrower-specific information, such as credit scores and loan-to-value
ratios, to be collected in the new loan monitoring system; and
* ensure that the model remains reasonable by establishing a process
for periodically evaluating the model to correct any errors and
revising it to reflect changes in the 7(a) business loan program or
other factors that could affect the subsidy estimate.
To demonstrate and explain the rationale and basis for the 7(a)
econometric model and all other models developed, we recommend that the
SBA Administrator take the following action:
* prepare and retain adequate documentation of the model development
process including a detailed discussion of the alternative variables or
combinations of variables that were considered, tested, and rejected,
as well as the reasons for rejecting them.
To facilitate (1) validation of models used to generate credit subsidy
estimates, (2) external oversight, and (3) financial statement audits,
we recommend that the Director, OMB, take the following action:
* revise OMB Circular A-11 to require that agencies document the
development of their credit subsidy models, including the process
followed for selecting modeling methodologies over alternatives, and
variables tested and rejected, along with the basis for excluding them.
Agency Comments and Our Evaluation:
We provided an initial draft and a revised draft, based on our review
of additional model documentation, to both SBA and OMB for review and
comment. While our initial draft was at the agencies for comment, we
continued to pursue additional documentation that SBA had to further
explain its 7(a) model development process, including what variables
were selected, rejected, and why. When we eventually obtained access to
the 800 pages of SBA material, we determined that it was not organized
and included no road map to describe the variable testing process or
its results. We concluded that this information was of questionable or
no usefulness to our assessment of SBA's modeling process. We addressed
the weaknesses in SBA's documentation in the revised draft report and
provided it to SBA and OMB for comment. In commenting on the initial
draft, SBA's Chief Financial Officer (CFO) generally agreed with our
findings and the first two recommendations related to actions to
further enhance the reliability of the model's subsidy estimates. OMB
did not provide any comments on the initial draft report. We received
comments on the revised draft from SBA's CFO who generally disagreed
with our findings and recommendations related to the lack of adequate
documentation supporting the model's development process. We also
received comments on the revised draft from the OMB Assistant Director
for Budget and the Controller who disagreed with our recommendation
that OMB revise Circular A-11. Their written comments are reprinted in
appendixes III and IV, respectively, and are summarized below. Both
agencies provided technical comments that we have incorporated into the
report as appropriate.
In commenting on our final draft report, SBA stated that it had
provided us with extensive documentation, briefings, and explanations
about how the model was developed. We met with SBA officials and their
contractor who constructed the model and discussed their methodology,
but we were unable to corroborate this information with the
documentation they subsequently provided. SBA's comment letter stated
that it provided us with 800 pages of material that contained some
information on variables that were considered and rejected. During our
subsequent review of this material, we found that this documentation
was a partial compilation of analyses conducted during the model
development process with no explanation or discussion of what was
learned from each analysis conducted. After reviewing all of this
documentation, as discussed in the report, we concluded that it
provided little additional information to enable us to understand and
corroborate the process and criteria that SBA used to select and reject
variables for its 7(a) model.
Our conclusions regarding the lack of adequate documentation for the
model's development process were consistent with those of both the
independent contractor SBA hired to review the model in 2002 prior to
its implementation and the independent public accounting firm that
audited SBA's fiscal year 2003 financial statements. As part of its
January 30, 2004, audit report, the independent public accounting firm
identified in its internal control report 9 specific deficiencies in
the model's documentation. These deficiencies included, for example, a
lack of technical references for the statistical method used for the
performance of the model, the absence of mathematical specifications,
that important variables were not clearly identified, and that units of
measure for key variables were not specified. In addition, the audit
report stated that the documentation that was provided was
"self-contradictory" about the quality of the default and prepayment model
and lacked a discussion of the assumptions and limitations of SBA's
modeling approach. While SBA's CFO agreed with the independent
accounting firm's findings regarding the lack of adequate documentation
for the credit subsidy model, he disagreed with similar weaknesses
identified in our report.
SBA disagreed that its lack of adequate documentation on the 7(a) model
development process could impede our ability to reach a conclusion
about SBA's loan accounts in connection with the audit of the
consolidated financial statements of the federal government. Instead,
SBA believed mandating additional documentation would establish a new
and unnecessary requirement. Our comment was in regard to our
responsibility as the auditor of the consolidated financial statements
of the federal government and does not establish a new or unnecessary
requirement for SBA. For the consolidated financial statement audit, we
evaluate the reasonableness of credit program estimates based on audit
guidance in SAS No. 57.[Footnote 26] In auditing estimates, SAS No. 57
states that an auditor should consider, among other things, the process
used by management to develop the estimate, including determining
whether or not (1) relevant factors were used, (2) reasonable
assumptions were developed, and (3) biases influenced the factors or
assumptions. SBA's lack of adequate documentation of the 7(a) model
development process impaired our ability to make such an assessment.
OMB disagreed with the recommendation that Circular A-11 should be
revised and believed that the report did not demonstrate that revisions
were needed. OMB officials commented that they worked closely with SBA
during the model development process and believed that the
documentation SBA provided to OMB was adequate for them to determine
that the subsidy estimates and reestimates were reasonable. OMB also
did not concur with our statement that a lack of improved OMB guidance
hampered adequate external oversight. Unlike OMB, in this case, we and
other external reviewers did not have the opportunity to work with SBA
during the model development process and, as a result, relied on oral
explanations and documentation provided by SBA staff and its contractor
who developed the model. Further, we attempted to corroborate SBA's
statements with the documentation that SBA provided. However, as we
reported, three independent external reviews of SBA's 7(a) model were
hampered by a lack of adequate documentation of SBA's model development
process. We reaffirm our conclusion that adequate documentation is
needed for the SBA 7(a) model's development and that independent
external review and oversight will continue to be hampered without a
requirement to provide adequate documentation about how econometric
models are developed.
OMB stated that Ernst and Young was able to independently validate
SBA's 7(a) model with the available documentation. According to OMB,
this firm stated that the 7(a) model assumptions and methodology
appeared to be reasonable and accurate. We obtained and reviewed the
reports OMB cited and found that the firm was not hired to validate or
review the same segments of the model that we reviewed. This series of
reports was related to the cash flow module of the 7(a) model, as well
as the model used to calculate reestimates, but did not review the
econometric equations or the model's development process. In its
report, the firm explicitly stated that it was not reviewing the same
parts of the model that we reviewed. We confirmed this information in
conversations with the accounting firm's engagement partner and
concluded that this firm's work was not relevant to the findings and
conclusions presented in our report.
OMB also commented that SAS No. 57 states that internal controls over
accounting estimates may or may not be documented. While SAS No. 57
does state that the process for preparing accounting estimates may not
be documented, it also states that auditors should assess whether there
are additional key factors or alternative assumptions that need to be
included in the estimate and assess the factors that management used in
developing the assumptions. Further, SAS No. 57 states that auditors
should concentrate on key factors and assumptions that are subjective
and susceptible to misstatement and bias. We believe this includes the
selection and rejection of variables that can be included in the model.
Without adequate documentation on the credit subsidy model development
process, it is difficult for auditors to fulfill their responsibilities
to assess these areas.
OMB also commented that SBA fulfilled the management responsibilities
described in SAS No. 57 regarding internal controls for accounting
estimates. We disagree with this statement and point out that SAS No.
57 provides guidance for auditing accounting estimates as part of
conducting financial statement audits rather than directing agency
management's actions. Management's responsibilities for internal
control are contained in our "Standards for Internal Control in the
Federal Government," which states, among other things, that "internal
control and all transactions and other significant events need to be
clearly documented, and the documentation should be readily available
for examination."[Footnote 27] Further, as previously stated, Cotton
and Company also identified the lack of adequate model documentation as
an internal control weakness. Moreover, SBA's CFO generally agreed with
the independent public accountant's report's findings, including the
deficiencies in SBA's model documentation, and stated that the internal
control report presented "fundamentals of good financial management and
SBA is committed to accomplishing as many of these items as possible in
the coming year."
OMB also stated that requiring agencies to prepare additional
documentation of the variables tested and rejected would be unduly
burdensome. We disagree with this statement and note that this
documentation would only need to be prepared when a model is developed
or when significant updates are implemented. Further, this requirement
would be consistent with other segments of OMB Circular A-11 that
require agencies to provide supporting documentation for their budget
submissions. However, as we mentioned in the report, there is currently
no explicit guidance for agencies to document the development of the
models that are used to generate credit subsidy estimates.
OMB also commented that we received sufficient information to test
alternative variables to measure the reasonableness of the final SBA
credit subsidy model. We note that our work demonstrated that using
additional variables that were also reasonable changed the subsidy
estimate. We believe that this work highlights the need for agencies to
document their basis for rejecting variables or combinations of
variables from their final credit subsidy models. By documenting this
work, agencies will be able to demonstrate to independent reviewers
that a bias from variable selection does not exist in the final model.
Both agencies provided technical comments that we incorporated into the
report as appropriate. The written comments of both agencies are
reprinted in appendixes III and IV.
We are sending copies of this report to the Chair of the Senate
Committee on Small Business and Entrepreneurship, other appropriate
congressional committees, the Administrator of the Small Business
Administration, and the Director of the Office of Management and
Budget. We also will make copies available to others upon request. In
addition, the report will be available at no charge on the GAO Web site
at [Hyperlink, http://www.gao.gov].
If you have any questions about this report, please contact me at (202)
512-8678 or [Hyperlink, dagostinod@gao.gov] or Katie Harris, Assistant
Director, at (202) 512-8415 or [Hyperlink, harrism@gao.gov]. Key
contributors to this report are listed in appendix V.
Signed by:
Davi M. D'Agostino Director, Financial Markets and Community
Investment:
[End of section]
Appendixes:
[End of section]
Appendix I: Objectives, Scope, and Methodology:
As agreed with your staff, we (1) assessed the reasonableness of the
model's econometric equations and evaluated the model's estimated
default, prepayment, and recovery rates based on the 7(a) program's
recent historical loan experience; (2) identified additional steps the
SBA could take to further enhance the reliability of its subsidy
estimate produced by the model; (3) reviewed SBA's process for
developing the subsidy model; (4) evaluated the model's supporting
documentation, including its discussion of what variables were tested
and rejected; and (5) determined what steps SBA has taken to ensure the
integrity of the data used in the model and determined whether these
data are consistent with information in its databases. We did not
validate SBA's model.
Assessing the Reasonableness of the Model's Econometric Equations and
Evaluating the Model's Estimated Default, Prepayment, and Recovery
Rates:
To analyze the model, we obtained from SBA copies of the model as
approved by the Office of Management and Budget (OMB), along with the
loan-level data that were used to develop the subsidy estimates. We
analyzed the econometric equations to determine whether they were
reasonable based on the variables they included, the statistical
techniques used, and the results obtained. For example, we determined
whether the econometric equations included appropriate variables and
whether the variables used in the equations were statistically
significant. To evaluate the model's estimated default and recovery
rates, we compared these rates with recent historical loan experience
of the 7(a) program provided by SBA. Using SBA's data, we also
calculated what SBA would have estimated for default and recovery rates
based on the estimation methodology it used prior to its fiscal year
2003 budget submission. (See app. II for a detailed discussion of our
analysis of the reasonableness of the model's econometric equations.)
Identifying Additional Steps SBA Could Take to Further Enhance the
Reliability of the Model:
To identify additional steps SBA could take to enhance the reliability
of its model, we considered additional types of data that SBA might
collect and consider including in its econometric equations. As part of
this analysis, we reviewed the academic literature on default modeling
and interviewed officials with several banks engaged in similar
efforts.
Reviewing SBA's Process of Developing the Subsidy Model:
To determine SBA's process for developing the model, we met with SBA
officials in the Chief Financial Office who were responsible for
estimating the 7(a) program subsidy costs. We also met with OMB
officials who were responsible for approving the model. Finally, we
reviewed available documentation on the model's development
provided by SBA and the report by the private consultant who reviewed
the model.
Evaluating the Model's Supporting Documentation, Including Its
Discussion of What Variables Were Tested and Rejected:
To evaluate the model's supporting documentation, including its
discussion of what variables were tested and rejected, we obtained and
analyzed available relevant documents and met with SBA officials and
their contractor who developed the model. We compared the information
presented in SBA's model documentation with existing credit subsidy
guidance, including OMB Circular A-11 and Federal Financial Accounting
and Auditing Technical Release 6: Preparing Estimates for Direct Loan
and Loan Guarantee Subsidies under the Federal Credit Reform Act,
Amendments to Technical Release 3: Preparing and Auditing Direct Loan
and Loan Guarantee Subsidies under the Federal Credit Reform Act. We
also assessed the impact the lack of documentation would have on SBA's
financial statement audit by comparing the documentation with Statement
on Auditing Standards Number 57, Auditing Accounting Estimates. SBA and
its contractor told us that 800 pages of raw testing information
contained in an electronic file were not organized in any fashion and
that no summary document or road map existed that described the
variable-testing process or its results in an understandable fashion
and in greater detail than the model documentation provided to us.
In addition, SBA and the contractor told us that the variables
reflected in the 800 pages were not recorded in English words, but
rather in mnemonics, and that there was no crosswalk or key still in
existence to decode the mnemonics. Thus, no documentation existed that
would link the variable names used in the programming to a table of
variable descriptions. We obtained and reviewed a copy of this
documentation and confirmed the representations of SBA and its
contractor.
Determining What Steps SBA Took to Ensure the Integrity of the Data
Used in the Model and Whether These Data Were Consistent with
Information in Its Databases:
To determine what steps SBA took to ensure the integrity of the data
used by the model, we met with SBA officials to gain a general
understanding of the agency's data integrity efforts. We also assessed
the number of errors that were resolved by the district offices each
month by analyzing 4 months of fiscal year 2003 field office activity
from the Form 1502 Guaranty Loan Reporting System. We further assessed
whether the remaining errors at the end of the month would likely
affect the credit subsidy estimate by analyzing the types of errors
tracked by the system and determining which errors affected data used
by the new model. We also assessed the magnitude of these errors by
analyzing 6 months of fiscal year 2003 activity in the Guaranty Loan
Reporting System. To determine whether the data in the new model were
consistent with data in SBA's loan-level databases, we selected and
tested a stratified random sample of 400 key data elements that could
affect the credit subsidy estimate.[Footnote 28] Specifically, we
randomly selected (1) 100 default and 100 recovery transactions and
compared the amounts and transaction dates in the loan system data
with those in the loan-level data used for the credit subsidy estimate;
(2) 100 loans that the model identified as prepaid and reviewed their
loan histories in SBA's database to determine whether all of these
loans were paid off prior to their scheduled termination date; and
(3) 100 additional loans and compared their status, such as current,
paid off, or default, to determine whether their status in the model
agreed with SBA's loan-level databases.
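The sampling design described above can be illustrated with a minimal sketch. It assumes four strata of 100 items each and uses made-up record identifiers and population sizes; the function name, stratum names, and fixed seed are illustrative choices and are not drawn from SBA's systems or from the procedures actually used.

```python
import random


def stratified_sample(strata, n_per_stratum, seed=0):
    """Draw a simple random sample of fixed size from each stratum.

    `strata` maps a stratum name to a list of record identifiers.
    Returns a dict mapping each stratum name to its sampled identifiers.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    sample = {}
    for name, records in strata.items():
        if len(records) < n_per_stratum:
            raise ValueError(f"stratum {name!r} has too few records")
        # random.Random.sample draws without replacement
        sample[name] = rng.sample(records, n_per_stratum)
    return sample


# Hypothetical populations; identifiers and sizes are invented for illustration.
strata = {
    "default_transactions": [f"D{i}" for i in range(5000)],
    "recovery_transactions": [f"R{i}" for i in range(3000)],
    "prepaid_loans": [f"P{i}" for i in range(8000)],
    "other_loans": [f"L{i}" for i in range(20000)],
}
sample = stratified_sample(strata, n_per_stratum=100)
total = sum(len(ids) for ids in sample.values())
```

Sampling a fixed number of items from each stratum keeps every transaction type represented regardless of how common it is in the underlying population.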
[End of section]
Appendix II: Analysis of Default, Prepayment, and Recoveries
Econometric Equations:
This appendix provides more detail on the three econometric equations
that the Small Business Administration (SBA) used to estimate the
subsidy rate for its 7(a) loan guarantee program and the expanded
equations that we developed. These equations are used to forecast
defaults, prepayments, and recoveries. The first section of this
appendix describes the variables that SBA used in the default and
prepayment equations and presents SBA's estimated coefficients. The
second section explains how we created the variable that we used to
represent the borrower's industry and presents the estimated
coefficients from our expanded default and prepayment equations. The
third section describes the equation that SBA used to forecast
recoveries and presents the estimated coefficients from that equation.
SBA's Default and Prepayment Equations:
In its new model for estimating the subsidy rate for the 7(a) loan
program, SBA uses multinomial logistic regression to estimate the
likelihood of defaults and prepayments as functions of a variety of
explanatory variables. Because multinomial regression is a simultaneous
estimation process, the default and prepayment equations are
identically specified (that is, the same explanatory variables are used
in each equation). SBA conducts its analysis at the level of the
individual loan, using loans that were disbursed from 1988 through
2001. For each loan, SBA's data set contains an observation for each
quarter that the loan is active. For example, if a loan prepays at the
end of the third year (counting the disbursement year as the first
year), then it is active during 12 quarters and, therefore, there are
12 observations for that loan in the data set.
For each observation, the dependent variable measures whether in that
quarter the borrower defaults on the loan, prepays the loan, or keeps
it active. As a result, the coefficients in the default or prepayment
equation are estimates of the association of each explanatory variable
with the likelihood of the loan defaulting or prepaying in that
quarter.
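The loan-quarter layout described above can be sketched as follows. This is an illustration only: the field names and the coding of the outcome as the strings "active", "default", and "prepay" are assumptions, not SBA's actual data layout.

```python
def loan_quarter_observations(loan_id, quarters_active, final_outcome):
    """Expand one loan into one observation per quarter it is active.

    final_outcome is "default", "prepay", or "active" (still open when the
    data end). Every quarter before the last is recorded as "active".
    """
    if final_outcome not in {"default", "prepay", "active"}:
        raise ValueError("unknown outcome")
    rows = []
    for age in range(1, quarters_active + 1):
        outcome = final_outcome if age == quarters_active else "active"
        rows.append({"loan_id": loan_id, "age_quarters": age, "outcome": outcome})
    return rows


# The example from the text: a loan that prepays at the end of its third year.
rows = loan_quarter_observations("loan-001", quarters_active=12, final_outcome="prepay")
```

Under this layout the loan contributes 11 observations in which it stays active and a final observation in which it prepays.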
There are several categories of explanatory variables included in the
default and prepayment equations. The first group consists of a set of
dummy variables that indicate the age of the loan. These variables
reflect the fact that prepayment and default behavior changes
as a loan seasons. Specifically, there is a dummy variable for each of
the first ten quarters of the life of a loan. From the eleventh quarter
to the thirty-fourth quarter, there is a dummy variable for each two
consecutive quarters. Finally, if a loan remains active past an age of
thirty-four quarters, there is one more dummy variable.
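As an illustration of the age grouping just described, the following sketch maps a loan's age in quarters to a single age dummy. The variable names follow table 1; the helper functions themselves are hypothetical and are not taken from SBA's programs.

```python
def age_dummy_name(age_quarters):
    """Return the name of the age dummy that equals 1 for a loan of this age."""
    if age_quarters < 1:
        raise ValueError("age must be at least 1 quarter")
    if age_quarters <= 10:
        return f"i{age_quarters}"  # one dummy per quarter for quarters 1-10
    if age_quarters <= 34:
        # quarters 11-34 are paired: (11, 12), (13, 14), ..., (33, 34)
        first = age_quarters if age_quarters % 2 == 1 else age_quarters - 1
        return f"i{first}{first + 1}"
    return "i35p"  # a single dummy for loans older than 34 quarters


# All 23 age dummies, in the order they appear in table 1.
AGE_DUMMIES = (
    [f"i{q}" for q in range(1, 11)]
    + [f"i{q}{q + 1}" for q in range(11, 35, 2)]
    + ["i35p"]
)


def age_dummies(age_quarters):
    """Return a dict with exactly one age dummy set to 1."""
    active = age_dummy_name(age_quarters)
    return {name: int(name == active) for name in AGE_DUMMIES}
```

Pairing the later quarters reduces the number of coefficients to estimate while still letting default and prepayment rates vary with loan age.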
The second set of explanatory variables concerns loan characteristics. A
set of dummy variables indicates the contractual term of the loan at
origination. The categories are less than 5 years, 5 years to less than
10 years, 10 years to less than 15 years, and 15 years or more. Less
than 5 years serves as the omitted category in the regression. Loan amount is
another characteristic and is measured in millions of dollars. SBA also
includes a dummy variable that shows whether a loan was delivered
through the SBA Express Program. Also known as Subprogram 1027, this
program allows lenders to originate a loan using their own loan
documents instead of SBA documents and processing, but the loan
guarantee is only up to 50 percent. By comparison, the typical SBA
guarantee is almost 80 percent. Finally, there is a set of dummy
variables for type of lender: Regular, Preferred, and Certified. In the
regression, the regular type serves as the omitted category.
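The role of an omitted category can be seen in a short sketch of the term and lender dummies. The variable names follow table 1; the function names and the lender-type strings are assumptions made for illustration.

```python
def term_dummies(term_years):
    """Term-of-loan dummies; a term under 5 years is the omitted category."""
    return {
        "t5_10": int(5 <= term_years < 10),
        "t10_15": int(10 <= term_years < 15),
        "t15p": int(term_years >= 15),
    }


def lender_dummies(lender_type):
    """Lender-type dummies; the regular type is the omitted category."""
    if lender_type not in {"regular", "preferred", "certified"}:
        raise ValueError("unknown lender type")
    return {
        "Lender_PLP": int(lender_type == "preferred"),
        "Lender_CLP": int(lender_type == "certified"),
    }


short_term = term_dummies(3)        # omitted category: every dummy is 0
seven_year = term_dummies(7)        # only t5_10 is 1
regular = lender_dummies("regular")  # omitted category: every dummy is 0
```

Because the omitted category is represented by all dummies equaling zero, each estimated coefficient measures the difference relative to that category.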
The next set of explanatory variables provides information on the
borrower. A set of dummy variables identifies ownership structure. The
categories are sole proprietorship, corporation, or partnership. Sole
proprietorship is the omitted category in the regression. An additional
dummy variable indicates whether the borrower is a new business.
Finally, there is a set of dummy variables that indicate the U.S.
Census Bureau region where the borrower is located.
The final set of explanatory variables contains two measures of
economic conditions. The first is the unemployment rate in the state
where the borrower is based. The source for these data is the U.S. Bureau of
Labor Statistics. The second is the quarterly percentage change in
gross domestic product. SBA obtained these data from the U.S. Bureau of
Economic Analysis.[Footnote 29] Table 1 summarizes the explanatory
variables.
Table 1: Variable Names and Descriptions:
Variable name   Variable description
-------------   --------------------
Age dummy variables:
i1              1 if loan is 1 quarter old, else 0.
i2              1 if loan is 2 quarters old, else 0.
i3              1 if loan is 3 quarters old, else 0.
i4              1 if loan is 4 quarters old, else 0.
i5              1 if loan is 5 quarters old, else 0.
i6              1 if loan is 6 quarters old, else 0.
i7              1 if loan is 7 quarters old, else 0.
i8              1 if loan is 8 quarters old, else 0.
i9              1 if loan is 9 quarters old, else 0.
i10             1 if loan is 10 quarters old, else 0.
i1112           1 if loan is 11 or 12 quarters old, else 0.
i1314           1 if loan is 13 or 14 quarters old, else 0.
i1516           1 if loan is 15 or 16 quarters old, else 0.
i1718           1 if loan is 17 or 18 quarters old, else 0.
i1920           1 if loan is 19 or 20 quarters old, else 0.
i2122           1 if loan is 21 or 22 quarters old, else 0.
i2324           1 if loan is 23 or 24 quarters old, else 0.
i2526           1 if loan is 25 or 26 quarters old, else 0.
i2728           1 if loan is 27 or 28 quarters old, else 0.
i2930           1 if loan is 29 or 30 quarters old, else 0.
i3132           1 if loan is 31 or 32 quarters old, else 0.
i3334           1 if loan is 33 or 34 quarters old, else 0.
i35p            1 if loan is older than 34 quarters, else 0.

Loan characteristics:
t5_10           1 if term of loan is at least 5 years but less than 10, else 0.
t10_15          1 if term of loan is at least 10 years but less than 15, else 0.
t15p            1 if term of loan is 15 years or more, else 0.
sub1027         1 if loan delivered through SBA Express Program, else 0.
loan_amt        Gross guaranteed disbursed amount in millions.
Lender_PLP      1 if lender is part of the Preferred Lender Program, else 0.
Lender_CLP      1 if lender is part of the Certified Lender Program, else 0.

Borrower characteristics:
Corporation     1 if borrower is incorporated, else 0.
Partnership     1 if borrower is a partnership, else 0.
NewBusiness     1 if borrower is a new business, else 0.
Northeast       1 if located in U.S. Census Bureau's Northeast Region, else 0.
Midwest         1 if located in U.S. Census Bureau's Midwest Region, else 0.
South           1 if located in U.S. Census Bureau's South Region, else 0.

Economic conditions:
Urate           Unemployment rate in the state where firm is located.
pc_gdp96        Quarterly percent change in constant dollar GDP.
Source: GAO.
[End of table]
The coefficients in the SBA equations indicate that the probabilities of
both defaults and prepayments generally increase and then decline as a
loan seasons. Defaults peak during the eighth quarter while prepayments
peak around quarters 27 and 28. Longer-term loans are less likely to
default or prepay. By comparison, larger loans are more likely to
default or prepay. Good economic conditions, as reflected by the
coefficients on unemployment and the percentage change in gross
domestic product, reduce the chances of default and increase the
likelihood of prepayment. The positive coefficients on the variable for
new business indicate that such firms are more likely to default and
prepay. Corporations and partnerships are less likely to default and
more likely to prepay than sole proprietors. Finally, loans granted
under Subprogram 1027 are less likely to default and more likely to
prepay. Table 2 presents the coefficients in SBA's default and
prepayment equations as well as some summary statistics.
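To show how coefficients of this kind translate into quarterly probabilities, the following sketch applies the standard multinomial logit formula to a hypothetical loan. Several assumptions are made for illustration: "remains active" is treated as the base outcome, the unemployment rate and the change in gross domestic product are entered in percentage points, and the loan's characteristics (eighth quarter, $100,000 guaranteed amount, and every other dummy variable at its omitted category or zero) are invented. The coefficient values are those reported for the base model in table 2. The sketch does not reproduce SBA's subsidy model.

```python
import math

# Base-model coefficients for the variables used in this example.
DEFAULT_COEF = {"constant": -9.7650, "i8": 4.8211, "loan_amt": 0.2578,
                "NewBusiness": 0.2773, "urate": 0.1043, "pc_gdp96": -0.1261}
PREPAY_COEF = {"constant": -5.2762, "i8": 2.7080, "loan_amt": 0.1189,
               "NewBusiness": 0.0678, "urate": -0.0957, "pc_gdp96": 0.0661}


def linear_index(coef, x):
    """Constant plus the sum of coefficient times value for each variable."""
    return coef["constant"] + sum(coef[name] * value for name, value in x.items())


def quarterly_probabilities(x):
    """Multinomial logit probabilities, with 'active' as the assumed base outcome."""
    exp_default = math.exp(linear_index(DEFAULT_COEF, x))
    exp_prepay = math.exp(linear_index(PREPAY_COEF, x))
    denom = 1.0 + exp_default + exp_prepay  # 1.0 is exp(0) for the base outcome
    return {
        "default": exp_default / denom,
        "prepay": exp_prepay / denom,
        "active": 1.0 / denom,
    }


# Hypothetical loan: eighth quarter, $0.1 million, existing business,
# 5.0 percent state unemployment, 0.8 percent quarterly GDP growth (assumed units).
loan = {"i8": 1, "loan_amt": 0.1, "NewBusiness": 0, "urate": 5.0, "pc_gdp96": 0.8}
base = quarterly_probabilities(loan)
new_firm = quarterly_probabilities({**loan, "NewBusiness": 1})
```

Comparing `base` with `new_firm` shows the direction of the new-business effect described above: the positive coefficient raises the estimated quarterly default probability for an otherwise identical loan.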
Table 2: Multinomial Logistic Regression Coefficient Estimates[A]:
                Predicting to     Predicting to
                defaults:         prepayments:
Variables       Base model        Base model
---------       -------------     -------------
Constant        -9.7650           -5.2762
i1               2.1151            1.1203
i2               3.1174            1.6016
i3               3.8158            1.9374
i4               4.2247            2.1063
i5               4.5187            2.2865
i6               4.6659            2.4113
i7               4.7487            2.5805
i8               4.8211            2.7080
i9               4.8068            2.8163
i10              4.8121            2.9133
i1112            4.8033            3.0540
i1314            4.7772            3.1439
i1516            4.7101            3.3111
i1718            4.6214            3.4554
i1920            4.6136            3.6945
i2122            4.5156            3.5201
i2324            4.4297            3.6685
i2526            4.2945            3.8222
i2728            4.3414            4.0106
i2930            4.2515            3.6142
i3132            4.2036            3.7143
i3334            4.1378            3.7914
i35p             4.1027            3.9950
t5_10           -0.0462[A]        -0.6568
t10_15          -0.7596           -1.1013
t15p            -0.7395           -1.1014
sub1027         -0.5800            0.0812
loan_amt         0.2578            0.1189
corporation     -0.0434            0.0989
partnership     -0.1982            0.0211[A]
northeast        0.3612           -0.2054
midwest          0.2184           -0.1869
south            0.4142           -0.0928
Lender_PLP      -0.1761            0.0824
Lender_CLP      -0.1688           -0.0014[B]
NewBusiness      0.2773            0.0678
urate            0.1043           -0.0957
Pc_gdp96        -0.1261            0.0661
Summary statistics for multinomial logistic regression models: N of
Observations;
Predicting to prepayments: Base model: 5,736,628.
Summary statistics for multinomial logistic regression models:
Variables: Likelihood Ratio Chi Sq;
Predicting to prepayments: Base model: 120,478.
Summary statistics for multinomial logistic regression models:
Variables: Degrees of Freedom;
Predicting to prepayments: Base model: 76.
Summary statistics for multinomial logistic regression models:
Variables: Significance levels;
Predicting to prepayments: Base model: