Tax Administration
IRS Needs to Further Refine Its Tax Filing Season Performance Measures
GAO ID: GAO-03-143 November 22, 2002
GAO-03-143, Tax Administration: IRS Needs to Further Refine Its Tax Filing Season Performance Measures
This is the accessible text file for GAO report number GAO-03-143
entitled 'Tax Administration: IRS Needs to Further Refine Its Tax
Filing Season Performance Measures' which was released on November 22,
2002.
This text file was formatted by the U.S. General Accounting Office
(GAO) to be accessible to users with visual impairments, as part of a
longer term project to improve GAO products' accessibility. Every
attempt has been made to maintain the structural and data integrity of
the original printed product. Accessibility features, such as text
descriptions of tables, consecutively numbered footnotes placed at the
end of the file, and the text of agency comment letters, are provided
but may not exactly duplicate the presentation or format of the printed
version. The portable document format (PDF) file is an exact electronic
replica of the printed version. We welcome your feedback. Please E-mail
your comments regarding the contents or accessibility features of this
document to Webmaster@gao.gov.
GAO Highlights:
TAX ADMINISTRATION: IRS Needs to Further Refine Its Tax Filing Season
Performance Measures
Highlights of GAO-03-143, a report to the Subcommittee on Oversight,
Committee on Ways and Means, House of Representatives
Why GAO Did This Study:
The tax filing season, roughly January 1 through April 15, is when
most taxpayers file their returns, receive refunds, and call or visit
IRS offices or the IRS Web site with questions. To provide better
information about the quality of filing season services, IRS is
revamping its suite of filing season performance measures. Because the
new measures are part of a strategy to improve service and because
filing season service affects so many taxpayers, GAO was asked to
assess whether the new measures have the four characteristics of
successful performance measures graphically depicted below.
What GAO Found:
In assessing 53 performance measures across IRS's four program areas,
GAO found that IRS has made significant efforts to improve its
performance measurement system. Many of the measures satisfied some of
the four key characteristics of successful performance measures
established in earlier GAO work. Although improvements are ongoing,
GAO identified instances where measures showed weaknesses, including
the following: (1) The objectivity and reliability of some measures
could be improved so that they will be reasonably free from
significant bias and produce the same result under similar
circumstances. For example, survey administrators may notify Telephone
Assistance's customer service representatives (CSR) too soon that
their call was selected to participate in the customer satisfaction
survey, which could bias CSR behavior towards taxpayers and adversely
affect the measure's objectivity. In addition, the measure Electronic
Filing and Assistance uses to determine the number of Web site hits
was not reliable because it did not represent the actual number of
times the Web site is accessed. (2) The clarity of some performance
information was affected when a measure's definition and formula were
not consistent. For example, the definition for the "CSR response
level" measure is the percentage of callers who receive service from a
CSR within a specified period of time, but the measure did not include
callers who received a busy signal or hung up. (3) Some suites of
measures did not cover governmentwide priorities such as quality,
timeliness, and cost of service. For example, Field Assistance was
missing measures for timeliness and cost of service.
[See PDF for image]
What GAO Recommends:
GAO is making recommendations to the Commissioner of Internal Revenue
directed at taking actions to better ensure that IRS validates the
accuracy of data collection methods for several measures; modifies the
formulas used to compute various measures; and adds certain measures,
such as cost of service, to its suite of measures.
Of GAO's 18 recommendations, IRS agreed with 12 and discussed actions
that had been taken or would be taken to implement them. For 2 of
those 12, the actions discussed by IRS did not fully address GAO's
concerns. IRS did not agree with the other 6 recommendations.
The full report, including GAO's objectives, scope, methodology, and
analysis, is available at www.gao.gov/cgi-bin/getrpt?GAO-03-143. For
additional information about the report, contact James White,
202-512-9110 or WhiteJ@gao.gov.
Report to the Chairman, Subcommittee on Oversight, Committee on Ways
and Means, House of Representatives:
United States General Accounting Office:
GAO:
November 2002:
Tax Administration:
IRS Needs to Further Refine Its Tax Filing Season Performance Measures:
Tax Filing Performance Measures:
GAO-03-143:
Contents:
Letter:
Results in Brief:
Background:
Scope and Methodology:
Filing Season Performance Measures Have Many of the Attributes of
Successful Measures, but Further Enhancements Are Possible:
Conclusions:
Recommendations for Executive Action:
Agency Comments and Our Evaluation:
Appendix I: Expanded Explanation of Our Attributes and Methodology for
Assessing IRS's Performance Measures:
Appendix II: The 53 IRS Performance Measures Reviewed:
Appendix III: Comments from the Internal Revenue Service:
GAO Comments:
Appendix IV: GAO Contacts and Staff Acknowledgments:
GAO Contacts:
Acknowledgments:
Bibliography:
Related Products:
Tables:
Table 1: Key Attributes of Successful Performance Measures:
Table 2: Overview of Our Assessment of Telephone Assistance Measures:
Table 3: Overview of Our Assessment of Electronic Filing and Assistance
Measures:
Table 4: Overview of Our Assessment of Field Assistance Measures:
Table 5: Overview of Our Assessment of Submission Processing Measures:
Table 6: Telephone Assistance Performance Measures:
Table 7: Electronic Filing and Assistance Performance Measures:
Table 8: Field Assistance Performance Measures:
Table 9: Submission Processing Performance Measures:
Figures:
Figure 1: IRS's Mission and the Link between Its Strategic Goals and
the Elements of Its Balanced Measurement System:
Figure 2: Linkage from IRS Mission to Operating Unit Measure and
Target:
Figure 3: Performance Measures Should Have Four Characteristics:
Figure 4: Example of Relationship among Field Assistance Goals and
Measures:
Abbreviations:
CQRS: Centralized Quality Review Site:
CSR: customer service representative:
GPRA: Government Performance and Results Act of 1993:
IRS: Internal Revenue Service:
Q-Matic: Queuing Management System:
TAC: Taxpayer Assistance Center:
W&I: Wage and Investment:
United States General Accounting Office:
Washington, DC 20548:
November 22, 2002:
The Honorable Amo Houghton
Chairman, Subcommittee on Oversight
Committee on Ways and Means
House of Representatives:
Dear Mr. Chairman:
For most taxpayers, their only contacts with the Internal Revenue
Service (IRS) are associated with the filing of their individual income
tax returns. Most taxpayers file their returns between January 1 and
April 15, which is generally referred to as the "filing season."
[Footnote 1] In addition to the filing itself, which can be on
paper or electronic, these contacts generally involve millions of
taxpayers seeking help from IRS by calling one of IRS's toll-free
telephone numbers, visiting one of IRS's field assistance centers, or
accessing IRS's Web site on the Internet (www.irs.gov). Between January
1 and July 13, 2002, for example, IRS received about 105 million calls
for assistance over its toll-free telephone lines.[Footnote 2]
As part of a much larger effort to modernize and become more responsive
to taxpayers, IRS is revamping how it measures and reports its filing
season performance. The new filing season performance measures are to
balance customer satisfaction, employee satisfaction, and business
results, such as the quality of answers to taxpayer inquiries and the
timeliness of refund issuance. IRS intends to use the balanced measures
to make managers and frontline staff more accountable for improving
filing season performance.
Because so many taxpayers are affected by IRS's performance during the
filing season and because the revamped measures are part of a strategy
to improve performance, you asked us to review IRS's new set of filing
season performance measures. Those measures belong to the four program
areas critical to a successful filing season: telephone assistance;
electronic filing and assistance; field assistance; and the processing
of returns, refunds, and remittances (referred to as "submission
processing"). Specifically, our objective was to assess whether the key
performance measures IRS uses to hold managers accountable in the four
program areas had the characteristics of a successful performance
measurement system.
Previous GAO work indicated agencies successful in measuring
performance had performance measures that demonstrate results, are
limited to the vital few, cover multiple priorities, and provide useful
information for decision making.[Footnote 3] To determine whether IRS's
filing season performance measures satisfy these four general
characteristics, we assessed the measures using nine specific
attributes.[Footnote 4] Earlier GAO work cited these specific
attributes as key to successful performance measures. Table 1 is a
summary of the nine attributes, including the potentially adverse
consequences if they are missing. Not all attributes are equal, and
failure to have a particular attribute does not necessarily indicate
that there is a weakness in that area or that the measure is not
useful; rather, it may indicate an opportunity for further refinement.
An expanded explanation of the nine attributes is included in appendix
I.
Table 1: Key Attributes of Successful Performance Measures:
[See PDF for image]
Source: Summary of information in appendix I.
[End of table]
We shared these attributes with various IRS officials, who generally
agreed with their relevance. As discussed in greater detail in the
separate scope and methodology section of this report, we took many
steps to validate and ensure consistency in our application of the
attributes.
We testified before the Subcommittee on Oversight on some of the
interim results of our assessment in April 2002.[Footnote 5]
Results in Brief:
In assessing 53 performance measures across four of IRS's key filing
season program areas, we found that the measures satisfied many of the
nine attributes of successful performance measures previously listed in
table 1. As part of its agencywide reorganization, IRS has made
significant efforts to improve its performance measurement system,
which is to provide useful information about how well IRS performed in
achieving its goals. The improvement of this system is an ongoing
process where, in some cases, IRS is only beginning to collect baseline
information on which to form targets and develop other measures that
would provide better information to evaluate performance results.
Despite IRS‘s progress, we identified instances in all four program
areas where the individual measures or suites of measures did not meet
some of our nine attributes. Some of these instances represent
opportunities for IRS to further refine its measures.
All of the 15 telephone assistance measures had some of the attributes
of successful performance measures. Among the more significant
problems, five measures had either clarity or reliability problems and
one had an objectivity problem. For example,
* five measures did not provide managers and other stakeholders with
clear information about the program‘s performance. For example, the
definition for "customer service representative (CSR) response level"
is the percentage of callers who receive service from a CSR within a
specified period of time, but the formula did not include callers who
received a busy signal or hung up; this limitation could lead managers
and other stakeholders to conclude that IRS is providing significantly
better service than it is.
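To illustrate the arithmetic behind this limitation, the sketch below uses invented call volumes (the figures are hypothetical, not IRS data) to show how excluding callers who received a busy signal or hung up from the formula's denominator inflates the reported response level:

```python
def response_level(answered_in_time, total_callers):
    """Percentage of callers receiving CSR service within the
    specified period of time."""
    return 100.0 * answered_in_time / total_callers

# Hypothetical call volumes (illustrative only, not IRS data)
answered_in_time = 800_000  # calls reaching a CSR within the threshold
answered_late = 200_000     # calls reaching a CSR after the threshold
busy_signals = 300_000      # callers who received a busy signal
abandoned = 100_000         # callers who hung up while waiting

# Formula as implemented: busy signals and abandoned calls excluded
narrow = response_level(answered_in_time,
                        answered_in_time + answered_late)

# Formula matching the measure's definition: every caller counted
broad = response_level(
    answered_in_time,
    answered_in_time + answered_late + busy_signals + abandoned)

print(f"excluding busy/abandoned calls: {narrow:.1f}%")  # 80.0%
print(f"including busy/abandoned calls: {broad:.1f}%")   # 57.1%
```

The gap between the two figures shows how the narrower denominator could lead managers and other stakeholders to conclude that service is significantly better than callers actually experienced.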
All of the 13 electronic filing and assistance performance measures
fulfilled some of the nine attributes. The most significant problems
involved changing targets, objectivity, and missing measures. For
example,
* electronic filing and assistance changed the targets for two of its
measures during fiscal year 2001, which could distort the assessment of
performance because what was to be observed changed. For example, it
changed the target for the "number of 1040 series returns filed
electronically" from 42 million to 40 million because midyear data
indicated that 42 million 1040 series returns were not going to be
filed electronically. Because of the subjective considerations
involved, changing the target in this situation also affected the
measure's objectivity.
All of field assistance's 14 performance measures satisfied some of the
attributes. Many of the more important problems involved clarity and
reliability. In addition, some measures were missing, which could cause
an emphasis on some program goals at the expense of a balance among all
goals. For example,
* the methods used to track workload volume and staff hours expended
required manual input that is subject to errors and inconsistencies,
which could affect data accuracy and thus the reliability of 8 of
field assistance's 14 measures.
* Field assistance did not have timeliness, efficiency, or cost of
service measures.
Many of the 11 submission processing measures had the attributes of
successful performance measures. Some of the more significant problems
related to clarity and reliability. For example,
* one measure--"productivity"--was unclear because it is a compilation
of different types of work IRS performs in processing returns,
remittances, and refunds and issuing notices and letters. Managers told
us that they needed specific information related to their own
operations and that the measure‘s methodology was difficult to
understand.
In all four program areas, we were unable, because of documentation
limitations, to verify the linkages among IRS's goals and measures.
Among other things, such linkages provide managers and staff with a
road map that shows how their day-to-day activities contribute to
attaining agencywide goals.
We are making recommendations to the Commissioner of Internal Revenue
directed at taking actions to better ensure that IRS's filing season
measures have the four characteristics of successful performance
measures. For example, we are recommending that IRS modify the formulas
used to compute various measures; validate the accuracy of data
collection methods for several measures; and add certain measures, such
as cost of service, to its suite of measures.
We requested comments on a draft of this report from the Commissioner
of Internal Revenue. We received written comments, which are reprinted
in appendix III. In his comments, the Commissioner agreed that there
were opportunities to refine some performance measures and said that
our observation about the ongoing nature of the performance measurement
process was on target. The Commissioner agreed with 12 of our 18
recommendations and discussed actions that had been taken or would be
taken to implement them. In 2 of those cases, the actions discussed by
IRS did not fully address our concerns. The Commissioner disagreed with
the other 6 recommendations. We discuss the Commissioner‘s comments in
the "Agency Comments and Our Evaluation" section of the report.
Background:
In keeping with the Government Performance and Results Act of
1993 (GPRA),[Footnote 6] IRS revamped its set of filing season
performance measures as part of a massive, ongoing modernization
effort. Congress mandated the modernization effort in the IRS
Restructuring and Reform Act of 1998[Footnote 7] and intended that IRS
would better balance service to taxpayers with enforcement of the tax
laws. To implement the modernization mandate, the Commissioner of
Internal Revenue developed a strategy composed of five interdependent
components. One of those components is the development of balanced
performance measures.[Footnote 8]
Balanced measures are to emphasize accountability for achieving
specific results and to reflect IRS's priorities, which are articulated
in its mission and its three strategic goals--top quality service to
all taxpayers through fair and uniform application of the law, top
quality service to each taxpayer in every interaction, and productivity
through a quality work environment. IRS has defined three elements of
balanced measures--(1) customer satisfaction, (2) employee
satisfaction, and (3) business results (quality and quantity
measures)--to ensure balance among its priorities. Figure 1 shows IRS's
mission and the link between its strategic goals and the three elements
of IRS's balanced measurement system.
Figure 1: IRS's Mission and the Link between Its Strategic Goals and
the Elements of Its Balanced Measurement System:
[See PDF for image]
Source: GAO depiction of information in IRS Publication 3561 and IRS's
Progress Report (December 2001).
[End of figure]
IRS intends to use the balanced measures to make managers and frontline
staff more accountable for improving filing season performance. We
reviewed the performance measures in the four program areas that
interact with taxpayers the most during the filing season--telephone
assistance, electronic filing and assistance, field assistance, and
submission processing. Each of these program areas is part of IRS's
Wage and Investment (W&I) operating division, which generally serves
taxpayers whose only income is from wages and investments.[Footnote 9]
Although IRS had measures of performance prior to the reorganization,
IRS managers have spent much effort to revamp the filing season
performance measures since that time.
An important aspect of IRS‘s progress in the challenging task of
improving its performance measures was the development of a new
Strategic Planning, Budgeting, and Performance Management process in
2000. As part of that process, IRS prepares an annual Strategy and
Program Plan that communicates some of the various levels of IRS's
goals (e.g., strategic goals, operating division goals) and many
performance measures.[Footnote 10] Although the Strategy and Program
Plan does not document all the linkages among the various goals and
performance measures, figure 2 is an example we developed to
demonstrate the complete relationship from the agency level mission
down to the operating unit's measures and targets.
Figure 2: Linkage from IRS Mission to Operating Unit Measure and
Target:
[See PDF for image]
Source: GAO analysis of IRS's Strategy and Program Plan (October 29,
2001), the W&I Business Performance Review (January 2002), IRS's
Progress Report (December 2001), and IRS Publication 3561.
[End of figure]
The Strategy and Program Plan is an important document because the
Commissioner holds IRS managers accountable for the results of the
performance measures contained within it. In addition, many of the
measures within the document are presented to outside stakeholders,
such as Congress and the public, as key indicators of IRS's
performance. The Strategy and Program Plan is the source of the 53
measures we reviewed in the four programs.
As we discussed in our June 1996 guide on implementing GPRA,[Footnote
11] agencies that were successful in measuring performance strived to
establish performance measures that were based on four general
characteristics. Those four characteristics are shown in figure 3 as
applicable to the four filing season programs we reviewed and are
described in more detail following the figure.
Figure 3: Performance Measures Should Have Four Characteristics:
[See PDF for image]
Source: GAO.
[End of figure]
Demonstrate results. Performance measures should show an organization's
progress towards achieving an intended level of performance or results.
Specifically, performance goals establish intended performance, and
measures can be used to assess progress towards achieving those goals.
Be limited to the vital few. Limiting measures to core program
activities enables managers and other stakeholders to assess
accomplishments, make decisions, realign processes, and assign
accountability without having an excess of data that could obscure
rather than clarify performance issues.
Cover multiple priorities. Performance measures should cover many
governmentwide priorities, such as quality, timeliness, cost of
service, customer satisfaction, employee satisfaction, and outcomes.
Performance measurement systems need to include incentives for managers
to strike the difficult balance among competing interests. One or two
priorities should not be overemphasized at the expense of others. IRS's
history shows why this balance is important. Because of its emphasis on
achieving certain numeric targets, such as the amount of dollars
collected, IRS failed to adequately consider other priorities, such as
the fair treatment of taxpayers.
Provide useful information for decision making. Performance measures
should provide managers and other stakeholders timely, action-oriented
information in a format that helps them make decisions that improve
program performance. Measures that do not provide managers with useful
information will not alert managers and other stakeholders to the
existence of problems nor help them respond when problems arise.
On the basis of these four characteristics of successful performance
measures, we used various performance management literature to develop
a set of nine specific attributes that we used as criteria for
assessing IRS's filing season performance measures. The nine attributes
are linkage, clarity, measurable target, objectivity, reliability, core
program activities, limited overlap, balance, and governmentwide
priorities. Appendix I describes these attributes in more detail.
Scope and Methodology:
As previously mentioned, we focused our work on four key filing season
programs--telephone assistance, electronic filing and assistance,
field assistance, and submission processing--within W&I. IRS officials
identified the performance measures in the Strategy and Program Plan to
be the highest, most comprehensive level of measures for which they are
accountable. After discussions with IRS, we decided to review all
53 measures in the Strategy and Program Plan relating to the four
filing season programs. We used W&I's draft fiscal year 2001-2003
Strategy and Program Plan (dated July 25, 2001) to conduct our review
and updated relevant information with the final plan (dated October 29,
2001). Appendix II describes each measure we reviewed in the four
program areas and provides other relevant information, such as targets
and potential weaknesses.
Our review focused on whether IRS's new set of filing season
performance measures had the characteristics of a successful
performance measurement system (i.e., demonstrated results, were
limited to the vital few, covered multiple priorities, and provided
useful information for decision making). For use as criteria in
assessing the measures, and as detailed in appendix I, we identified
nine attributes of performance measures from various sources, such as
earlier GAO work, Office of Management and Budget Circular No. A-
11,[Footnote 12] GPRA, and IRS's handbook on Managing Statistics in a
Balanced Measures System.[Footnote 13] We shared our attributes with
IRS officials from various organizations that have a role in developing
or monitoring performance measures. Those units included IRS's
Organizational Performance Division and several W&I units, such as
Strategy and Finance; Planning and Analysis; Customer Account Services;
and Communications, Assistance, Research, and Education. Officials in
these units generally agreed with the relevance of our attributes and
our assessment approach.
We applied the 9 attributes to the 53 filing season measures in a
systematic manner, but some judgment was required. To ensure
consistency and reliability in our application of the attributes, we
had one staff person responsible for each of the four areas. That staff
person prepared the initial analysis and at least two other staff
reviewed those detailed results. Several staff reviewed the results for
all four areas. We did not do a detailed assessment of IRS‘s
methodology for calculating the measures, but looked only at
methodological issues as necessary to assess whether a particular
measure met the overall characteristics of a successful performance
measure.
In applying the attributes, we analyzed numerous pieces of
documentation, such as IRS's Congressional Budget Justification, Annual
Performance Plan, and data dictionary,[Footnote 14] and many other
reports and documents dealing with the four IRS programs, goals,
performance measures, and improvement initiatives. We interviewed IRS
officials at various levels within telephone assistance, electronic
filing and assistance, field assistance, and submission processing to
understand the measures, their methodology, and their relationship to
goals, among other things. We also interviewed officials from various
IRS organizations that are involved in managing, collecting, and/or
using performance data, such as the Organizational Performance
Division; Strategy and Finance; Customer Account Services; Statistics
of Income; and the Centralized Quality Review Site; and a
representative of an IRS contractor, Pacific Consulting Group,
responsible for analyzing and reporting the results of telephone
assistance's customer satisfaction survey. Appendix I provides more
detail on the nine attributes we used, including explanations and
examples of each attribute and information on our methodology for
assessing each attribute.
We conducted our review in Atlanta, Ga.; Washington, D.C.; Cincinnati,
Ohio; and Memphis, Tenn., from September 2001 to September 2002 in
accordance with generally accepted government auditing standards.
Filing Season Performance Measures Have Many of the Attributes of
Successful Measures, but Further Enhancements Are Possible:
The 53 filing season performance measures included in our review have
many of the attributes of successful performance measures, as detailed
in appendix I. For example, in all four of the program areas we
reviewed, most measures covered the core activities of each program and
had targets in place. In addition, IRS had several ongoing initiatives
aimed at improving its measures, such as telephone assistance's efforts
to revamp all aspects of its quality measures.
At the same time, however, the measures did not satisfy all the
attributes, indicating the potential for further enhancements. The nine
attributes we used to assess each measure are not equal and failure to
have a particular attribute does not necessarily indicate that there is
a weakness in that area. In some cases, for example, a measure may not
have a particular attribute because benchmarking data are being
collected or a measure is being revised. Likewise, a noted weakness,
such as a measure not having clarity or being reliable, does not mean
that the measure is not useful. For example, telephone assistance's
"CSR level of service" measure does not meet our clarity attribute
because its name and definition indicate that only calls answered by
CSRs are included, but its formula includes some calls answered by
automation. This defect currently does not impair the measure‘s
usefulness because the number of automated calls is fairly
insignificant. Other weaknesses, however, could lead managers or other
stakeholders to draw the wrong conclusions, overlook the existence of
problems, or delay resolving problems. For example, electronic filing
and assistance's "number of IRS digital daily Web site hits" measure
was not considered clear or reliable because it systematically
overstates the number of times the Web site is accessed. In total,
therefore, the weaknesses identified should be considered areas for
further refinement. Such refinements are not expected to be costly or
involve significant additional effort on the part of IRS because in
many instances our recommendations only include modifications or
increased rigor to procedures or processes already in place.
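One reason a raw hit count overstates Web site use is that a single page view typically triggers several server requests: the page itself plus each embedded image or style sheet. The sketch below, which uses an invented miniature server log rather than real IRS data, illustrates the difference between counting every hit and counting actual page accesses:

```python
# Hypothetical server log (illustrative only): each entry is
# (visitor, requested resource). One page view generates extra
# requests for the resources embedded in that page.
log = [
    ("visitor1", "/index.html"),
    ("visitor1", "/logo.gif"),   # embedded image, same page view
    ("visitor1", "/style.css"),  # style sheet, same page view
    ("visitor2", "/index.html"),
    ("visitor2", "/logo.gif"),
    ("visitor2", "/style.css"),
]

PAGE_SUFFIXES = (".html",)

hits = len(log)  # every request counts, so embedded files inflate it
page_views = sum(1 for _, resource in log
                 if resource.endswith(PAGE_SUFFIXES))

print(hits)        # 6 hits recorded ...
print(page_views)  # ... for only 2 actual page accesses
```

Counting only requests for pages, or better still distinct visits, would yield a figure closer to the actual number of times the Web site is accessed.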
The rest of this report discusses the results of our analysis for each
of the four program areas--telephone assistance, electronic filing and
assistance, field assistance, and submission processing.
Telephone Assistance Measures:
As shown in table 2, all 15 of IRS's telephone performance measures
have some of the attributes of successful performance
measures.[Footnote 15] However, as summarized in this section, the
measures have several shortcomings. For example, we identified
opportunities to improve the clarity of five measures and the
reliability of five other measures. Table 6 in appendix II has more
detailed information about each telephone measure, including any
weaknesses we identified and any recommendations for improvement.
Table 2: Overview of Our Assessment of Telephone Assistance Measures:
[See PDF for image]
Note: A check mark denotes that the measure has the attribute.
[A] We were unable to verify the linkages between goals and measures
because of insufficient documentation.
[B] Core program activities of telephone assistance are to provide
timely and accurate assistance to taxpayers with inquiries about the
tax law and their accounts.
[C] IRS also refers to CSRs as assistors.
[D] IRS considers these measures balanced because they address
priorities such as customer and employee satisfaction and business
results. However, including other measures, such as cost of service,
could improve the balance of telephone assistance's program priorities.
Source: GAO analysis.
[End of table]
No Documentation Shows the Complete Linkage between Agencywide Goals
and Telephone Measures:
Although telephone assistance management stated that their goals and
measures generally aligned, we were unable to verify this because no
documentation shows the complete relationship. For example, some
documentation may show a link from a measure to an agencywide goal, but
the operating division level goals were omitted. When we attempted to
create the linkage ourselves, we found it difficult to determine how
some measures related to the different agencywide and operating
division goals. When we asked some IRS officials to describe the
complete link, they too had a difficult time and were uncertain of some
connections.
Telephone assistance managers stated that staff received performance
management training that should help them to understand their role in
helping the organization achieve its goals. However, having clear and
complete documentation would provide evidence that linkages exist and
help prevent misunderstandings. When employees do not understand the
relationship between goals and measures, they may not understand how
their work contributes to agencywide efforts and, thus, goals may not
be achieved.
Most Telephone Measures Have Clarity:
Ten of the 15 measures have clarity (e.g., "automated calls answered"
clearly describes the count of all toll-free calls answered at customer
service sites by automated service). However, five measures contain or
omit certain data elements that can cause managers or other
stakeholders to misunderstand the level of performance. For example,
the "CSR response level" measure is defined as the percentage of
callers who started receiving service from a CSR within a specified
period of time. However, this may not reflect the real customer
experience at IRS because the formula for computing the measure does
not include callers who tried to reach a CSR but did not, such as
callers who (1) hung up while waiting to speak to a CSR, (2) were
provided access only to automated services and hung up, and (3)
received a busy signal.[Footnote 16] (The other four measures, as noted
in table 6 in appendix II, are "CSR level of service," "automated
completion rate," "CSR service provided," and "toll-free customer
satisfaction.")
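To illustrate the point about excluded callers, the following is a
minimal sketch, not IRS's actual formula, and every call count in it is
hypothetical. It shows how a response-level measure whose denominator
counts only callers who reached a CSR can read higher than one that
counts every call attempt:

```python
# Illustrative sketch (not IRS's actual formula): how excluding callers who
# hung up or got a busy signal can inflate a "CSR response level" measure.
# All call counts below are hypothetical.

def response_level(served_within_threshold, total):
    """Percentage of `total` callers served by a CSR within the threshold."""
    return 100.0 * served_within_threshold / total

served_in_30s = 600_000   # callers reaching a CSR within 30 seconds
served_later = 150_000    # callers reaching a CSR after 30 seconds
hang_ups = 100_000        # hung up while waiting
busy_signals = 150_000    # never got into the queue

reached_csr = served_in_30s + served_later
all_attempts = reached_csr + hang_ups + busy_signals

# As defined: denominator counts only callers who reached a CSR.
as_reported = response_level(served_in_30s, reached_csr)    # 80.0

# Customer-experience view: denominator counts every call attempt.
all_callers = response_level(served_in_30s, all_attempts)   # 60.0

print(f"reported: {as_reported:.1f}%  vs  all attempts: {all_callers:.1f}%")
```

Under these invented volumes, the same underlying service produces an
80 percent reported level but a 60 percent level from the caller's
perspective, which is the kind of gap the definition-formula mismatch
can conceal.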
Measures that do not provide clear information about program
performance may affect the validity of managers' and stakeholders'
assessments of IRS's performance, possibly leading to a
misinterpretation of results or a failure to take proper action to
resolve performance problems.
Most Telephone Measures Have Targets:
Eleven of the 15 measures have numerical targets that facilitate the
future assessment of whether overall goals and objectives were
achieved. Of the four measures with no targets, three were measures for
which IRS was collecting data for use in developing first-time targets
and one was a measure ("automated completion rate") that IRS was no
longer tracking in the Strategy and Program Plan. Although we generally
disagree with the removal of the "automated completion rate" measure
from the Strategy and Program Plan, as described in an upcoming
section, not having targets in these instances is reasonable.
Data Collection Methods for Telephone Assistance's Customer
Satisfaction Measure Are Not Always Objective:
IRS determines customer satisfaction with its toll-free telephone
assistance through a survey administered to taxpayers who speak with a
CSR.[Footnote 17] We observed survey collection methods in Atlanta that
were not always objective; that is, the administrators did not always
follow prescribed procedures for selecting calls to participate in the
survey. Not following prescribed procedures produces a systematic bias
that could compromise the randomness of the sample. Also, IRS
procedures do not require that administrators listen to the entire
call. Although administrators are instructed to notify the CSR towards
the end of a call that the call was selected for the survey, this may
not occur. If an administrator begins listening to a call after it has
started, it can be difficult to determine the full nature of the
taxpayer's question and thus whether the conversation is about to end.
As a result, an administrator could prematurely notify a CSR that the
call was selected for the survey, which could change the CSR's behavior
towards the taxpayer and affect the results of the survey and the
measure. In addition, administrators may not be able to correctly
answer certain questions on the survey, which could impair any analysis
of those answers. We discussed these issues with a representative of
the IRS contractor (Pacific Consulting Group) responsible for analyzing
and reporting the survey results who said that (1) he was aware of
these problems and (2) the same problems existed at other locations.
IRS has taken corrective action on one of these weaknesses. Because
management decided that the procedures for selecting calls to
participate in the customer satisfaction survey were too difficult to
follow, it revised them. Sites began using the revised sampling
procedures in July 2002.
Reliability of Five Telephone Quality Measures Is Suspect:
The reliability of telephone assistance's five quality measures ("toll-
free tax law quality," "toll-free accounts quality," "toll-free tax law
correct response rate," "toll-free account correct response rate," and
"toll-free timeliness") is suspect because of potential inconsistencies
in data collection that arise from differences in individual reviewers'
judgments and perceptions.[Footnote 18] Although it is not
certain how much variation among reviewers exists, errors could occur
throughout data collection and could affect the results of the measures
and conclusions about the extent to which performance goals have been
achieved.
Reliability and credibility increase when performance data are checked
or tested for significant errors. IRS has conducted consistency reviews
in the past and found problems. It has taken steps to improve
consistency, the most important of which was the establishment of the
Centralized Quality Review Site (CQRS).[Footnote 19] Among other
controls within CQRS that are designed to enhance consistency,
reviewers are to receive the same training and gather to discuss cases
where the guidance is not clear. IRS has conducted one review to
determine the effectiveness of CQRS and its efforts to improve
consistency since IRS's October 2000 reorganization and continues to
find some problems.
At the time of our review, IRS was reviewing the five quality measures
as part of an ongoing improvement initiative. Since that time, it
redesigned many aspects of the measures, including what is measured,
how the measures are calculated, how data are collected, and how people
are held accountable for quality.[Footnote 20] Changes emanating from
this initiative may further enhance consistency.
Telephone Measures Cover Core Program Activities:
Telephone assistance's core program activities are to provide timely
and accurate assistance to taxpayers with inquiries about the tax law
and their accounts. IRS has at least one measure that directly
addresses each of these core activities. For example, "toll-free
accounts quality" is a measure that shows the percentage of accurate
responses to taxpayers' account-related questions.
Some Overlap Exists between Telephone Assistance Measures:
The amount of overlap that exists between measures is a managerial
decision. Of the 15 telephone measures we reviewed, 10 have at least
partial overlap. For example, both the "CSR response level" and
"average speed of answer" measures attempt to show how long a taxpayer
waited before receiving service, except that the former shows the
number of taxpayers receiving service within 30 seconds while the
latter shows the average wait time for all taxpayers. (Table 6 in
appendix II has information on other overlapping measures.)
IRS officials said that overlapping measures can add value to
management's decision-making process because each measure provides a
nuance that would be missed if the other were absent. For example, the
"CSR calls answered" measure shows the number of taxpayer calls
answered while the "CSR services provided" measure attempts to
account for situations in which more than one CSR was involved in
handling a single call. At the same time, however, overlapping measures
(1) leave managers to sift through redundant, sometimes costly,
information to determine goal achievement and (2) could confuse outside
stakeholders, such as Congress.
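The nuance that two overlapping wait-time measures can each capture is
easy to see with a small numerical sketch. The wait times below are
invented for illustration and do not come from IRS data:

```python
# Hypothetical sketch of why two overlapping wait-time measures differ:
# a "CSR response level"-style measure is a share within a threshold,
# while an "average speed of answer"-style measure is a mean, so each
# can move without the other. Wait times (seconds) are invented.

wait_times = [5, 12, 20, 25, 28, 45, 60, 300]  # seconds until a CSR answered

response_level = 100.0 * sum(1 for w in wait_times if w <= 30) / len(wait_times)
average_speed = sum(wait_times) / len(wait_times)

print(f"share served within 30s: {response_level:.1f}%")  # 62.5%
print(f"average speed of answer: {average_speed:.1f}s")   # 61.9s

# One very long wait barely changes the within-threshold share but
# dominates the average, which is the nuance each measure adds.
```

Dropping the single 300-second call would leave the threshold share
almost unchanged while cutting the average sharply, so a manager
looking at only one of the two figures would see a different story.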
Although we are not suggesting that IRS stop tracking or reporting any
of the overlapping measures, we question whether IRS has limited the
telephone measures included in the Strategy and Program Plan to the
vital few. Telephone officials agreed with this assessment and stated
that some of the overlapping measures will be removed from future
Strategy and Program Plans.
Telephone Measures Do Not Fully Cover Governmentwide Priorities:
When considering governmentwide priorities, such as quality,
timeliness, cost of service, and customer and employee satisfaction,
telephone assistance is missing two measures--(1) cost of service and
(2) a measure of customer satisfaction for automated services, as
described below.
* Cost of Service. According to key legislation[Footnote 21] and
accounting standards,[Footnote 22] agencies should develop and report
cost information. Besides showing financial accountability in the use
of taxpayer dollars, the cost information called for can be used for
various purposes, such as authorizing and modifying programs and
evaluating program performance. IRS does not report the average cost to
answer a taxpayer's inquiry by telephone. A cost-per-call analysis
could provide a link between program goals and costs, as required by
GPRA, and help IRS management and Congress decide about future
investments in telephone assistance. IRS officials said they would like
to develop a cost of service measure and are trying to determine what
information would be meaningful to include or exclude in the
calculation.
* Customer Satisfaction for Automated Services. Although IRS
projections show that about 70 percent of its fiscal year 2002 calls
would be handled by automation, it has no survey mechanism in place to
determine taxpayers‘ satisfaction with these automated services. IRS
officials agreed this would be a meaningful measure and want to develop
one for the future, but no implementation plans have been established.
Also, as previously mentioned, IRS has removed the "automated
completion rate" measure from its Strategy and Program Plan. We
realize, as noted in table 6 in appendix II, that this measure has
limitations that need to be addressed. However, because such a large
percentage of calls are handled by automation and because IRS plans to
serve even more calls with automation in the future, re-inclusion of
that measure in the Strategy and Program Plan may be warranted if the
associated problems can be resolved.
Telephone Measures Are Balanced:
Telephone assistance has measures in place for customer satisfaction,
employee satisfaction, and business results and, therefore, IRS
considers the measures balanced. However, including other measures,
such as a cost of service measure, as previously described, could
further enhance the balance of program priorities.
Electronic Filing and Assistance Measures:
As shown in table 3, all 13 of electronic filing and assistance's
performance measures have some of the attributes of successful
performance measures. However, as summarized in this section, the
measures have some shortcomings. For example, several of the measures
had some overlap and two measures had shortcomings related to the
changing of targets during the fiscal year. Table 7 in appendix II has
more detailed information about each electronic filing and assistance
measure, including any weaknesses we identified and any recommendations
for improvement.
Table 3: Overview of Our Assessment of Electronic Filing and Assistance
Measures:
[See PDF for image]
Note: A check mark denotes that the measure has the attribute.
[A] We were unable to verify the linkages between goals and measures
because of insufficient documentation.
[B] Electronic filing and assistance's core program activities are to
provide individual and business taxpayers with the capability to
transact and communicate electronically with IRS.
[C] Electronic filing and assistance measures address most
governmentwide priorities, such as quantity, customer satisfaction, and
employee satisfaction; however, they do not cover two important
priorities--quality and cost of service.
Source: GAO analysis.
[End of table]
Overall Alignment of Electronic Filing and Assistance's Goals and
Measures Not Fully Documented:
Electronic filing and assistance's 13 performance measures are aligned
with IRS's overall mission and strategic goals. However, we were
unable to validate whether the lower level goals, such as electronic
filing and assistance's operational goals and improvement projects, are
linked to the agencywide strategic level goals and operating division
performance measures because complete documentation showing that
linkage is not available.
Electronic filing and assistance's managers stated that goals and
measures generally align and that employee briefings were held to
communicate their goals to the organization. It is essential that all
staff be familiar with IRS's mission and goals, electronic filing and
assistance's goals and performance measures, and how electronic filing
and assistance determines whether it is achieving its goals so that
staff know how their day-to-day activities contribute to the goals and
IRS's overall mission. When this is lacking, priorities may not be
clear and staff efforts may not be tied to goal achievement.
Most Electronic Filing and Assistance Measures Have Clarity:
All but one of electronic filing and assistance's 13 performance
measures had clarity. The "number of IRS digital daily Web site hits"
measure, which is defined as the number of "hits" to IRS's Web site, is
not clear because its formula counts multiple hits every time a user
accesses the site's home page and counts a hit every time a user moves
to another page on the Web site. The formula is not consistent with the
definition because it does not represent the actual number of times the
Web site is accessed.
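The gap between raw hits and actual site accesses comes from how Web
servers log requests: every file a page pulls in (images, stylesheets,
and so on) is logged as a separate hit. The toy log below is invented
to illustrate the mechanism, not drawn from IRS's actual server data:

```python
# Illustrative sketch of why raw "hits" overstate Web site usage: a server
# logs one hit per file requested, so a single page view that pulls in
# images and a stylesheet registers several hits. Log entries are invented.

log = [
    "/index.html", "/logo.gif", "/banner.gif", "/style.css",   # 1 page view
    "/forms.html", "/logo.gif", "/style.css",                  # 1 page view
    "/index.html", "/logo.gif", "/banner.gif", "/style.css",   # 1 page view
]

hits = len(log)                                           # every request
page_views = sum(1 for path in log if path.endswith(".html"))  # pages only

print(f"hits: {hits}, page views: {page_views}")  # hits: 11, page views: 3
```

In this invented log, three page views register as eleven hits, and a
redesign that trimmed the graphics on each page would cut the hit count
without any change in actual usage, consistent with the limitation IRS
itself acknowledged.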
In its fiscal year 2003 Annual Performance Plan,[Footnote 23] IRS
acknowledged limitations with this measure as follows:
"...changes in the IRS Web design may cause a decrease in the
number of 'hits' recorded in both [fiscal years] 2002 and 2003. This
decrease will be due to improved Web site navigation and search
functions, which may reduce the amount of random exploration by users
to find content. The decrease will also be due to better design of the
Web pages themselves that will reduce the number of graphics and other
items that are used to create the Web page, all of which are counted as
'hits' when a page is accessed."
In our report on IRS's 2001 tax filing season, we recommended that IRS
either discontinue the use of "hits" as a measure of the performance of
its Web site or revise the way "hits" are calculated so that the
measure more accurately reflects usage.[Footnote 24] IRS responded that
it should continue to count "hits" as a measure of the Web site's
performance because "hits" indicate site traffic and can be used to
measure system performance and estimate system needs. However,
officials stated that they could improve their method of counting
"hits" once they had implemented a more sophisticated, comprehensive
Web analytical program. According to electronic filing and assistance
officials, IRS introduced its redesigned Web site in January 2001 and
implemented a new analytical program, but "hits" are still being
calculated the same way.
Two Electronic Filing and Assistance Measures Had Targets Changed and
Lack Objectivity:
Electronic filing and assistance changed the targets for two measures--
"number of 1040 series returns filed electronically"[Footnote 25] and
"total number of returns electronically filed"--during fiscal year
2001. Changing targets could distort the assessment of performance
because what was to be observed changed. No major event (such as
legislation that affected the ability of many taxpayers to file
electronically) happened that warranted changing the targets in the
strategic plan. Instead, electronic filing and assistance changed the
target for the first of those measures from 42 million returns to
40 million returns because IRS's Research Division's midyear data
indicated that 42 million 1040 series returns
were not going to be filed electronically. Because the number of 1040
series returns filed electronically is a subset of the total number of
returns filed electronically, electronic filing and assistance also
reduced the target for total electronic filings. Because of these
subjective considerations, changing the targets in this situation also
affected the objectivity of these measures.
Electronic Filing and Assistance Measures Are Reliable, with One
Exception:
Of electronic filing and assistance's 13 performance measures, we
considered 12 to be reliable because the performance data come from
sources, such as IRS's masterfile[Footnote 26] and computer program
runs, that are subject to validity checks. The one measure we did not
consider reliable was the "number of IRS digital daily Web site hits,"
because it does not represent the actual number of times the Web site
is accessed, as previously described.
Measures Cover Electronic Filing and Assistance's Core Program
Activities:
Electronic filing and assistance's core program activities are to
provide individual and business taxpayers the capability to transact
and communicate electronically with IRS. Electronic filing and
assistance focuses on taxpayers‘ ability to file their returns, pay
their taxes, receive assistance, and obtain information electronically.
These core activities are all covered by the 13 performance measures.
Overlap Exists among Electronic Filing and Assistance Measures:
Seven of the 13 electronic filing and assistance measures had partial
overlap. For example, the "number of 1040 series returns electronically
filed" and "percent of individual returns electronically filed"
measures provide related information on a key program activity. The
difference is that the former is a count of the number filed
electronically while the latter is the percentage of total individual
tax returns filed electronically. (Table 7 in appendix II has
information on other overlapping electronic filing and assistance
measures.)
How much overlap to tolerate among measures is a matter of
management's judgment. Electronic filing and assistance officials told
us that each of the overlapping measures we identified provides
additional information to managers. For example, the "number of 1040
series returns electronically filed" measure provides managers with
information on the size of the electronic return workload whereas the
"percent of individual returns electronically filed" measure tells them
how they are doing in relation to IRS's long-term strategic goal of 80
percent. IRS
officials also pointed out that both number and percent performance
measures exist because external customers, such as the press, like to
use the measures for reporting purposes.
Electronic Filing and Assistance's Measures Do Not Cover Some
Governmentwide Priorities, Thus Hindering Balance:
Although electronic filing and assistance's measures address several
governmentwide priorities, such as quantity, customer satisfaction, and
employee satisfaction, they do not cover two important priorities--
quality and cost of service. As a result, its performance measurement
system is not fully balanced.
Electronic filing and assistance classifies four of its performance
measures as quality measures, but the measures are merely counts of
certain types of electronic transactions (such as "number of payments
received electronically"). On the other hand, it tracks what we
consider to be quality measures (i.e., "processing accuracy"[Footnote
27] and "refund timeliness, electronically filed")[Footnote 28] but
those measures are not in the Strategy and Program Plan. These quality
measures and others, such as one that tracks the number of electronic
returns rejected,[Footnote 29] could be important indicators of program
success or failure. For example, IRS data indicate that many electronic
tax returns are rejected; a measure that captures the volume of rejects
could help to focus management‘s attention on the cause of those
rejects.
Also, similar to our discussion of a cost of service measure in the
telephone section, a "cost-per-electronically filed return" could
provide a link between program goals and costs, as required by GPRA,
and help IRS management and Congress decide about future investments in
electronic filing and assistance.
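The arithmetic behind such a measure is simple: total program cost
divided by output volume. The sketch below illustrates the form only;
both figures are invented and are not actual IRS budget or filing data:

```python
# Minimal sketch of a "cost per electronically filed return" style measure.
# One simple form of the GPRA-style link between goals and costs is total
# program cost divided by output volume. Both figures are hypothetical,
# not actual IRS data.

program_cost = 120_000_000   # annual e-file program cost, dollars (invented)
returns_filed = 46_000_000   # electronically filed returns (invented)

cost_per_return = program_cost / returns_filed
print(f"cost per e-filed return: ${cost_per_return:.2f}")
```

The harder question, which IRS officials themselves raised for the
telephone measure, is deciding which costs belong in the numerator, not
performing the division.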
Field Assistance Measures:
As shown in table 4, all 14 of field assistance's performance measures
have some of the attributes of successful performance measures.
However, as summarized in this section, the measures have several
shortcomings, primarily with respect to clarity, reliability, and
balance. Table 8 in appendix II has more detailed information about
each field assistance measure, including any weaknesses we identified
and any recommendations for improvement.
Table 4: Overview of Our Assessment of Field Assistance Measures:
[See PDF for image]
Note: A check mark denotes that the measure has the attribute.
[A] We were unable to verify the linkages between goals and measures
because of insufficient documentation.
[B] Core program activities of field assistance are to provide face-to-
face assistance, education, and compliance services.
[C] Although field assistance continues to develop its suite of
performance measures, important measures of timeliness, efficiency or
productivity, and cost of service are missing and impair balance.
Source: GAO analysis.
[End of table]
Relationship between Goals and Field Assistance Measures Not Complete:
Field assistance recognizes the importance of creating a clear
relationship between goals and measures and has developed a template
that shows some of that relationship. Figure 4 is an excerpt of the
template, with the completed portions, as of October 2002, shown in
gray.
Figure 4: Example of Relationship among Field Assistance Goals and
Measures:
[See PDF for image]
Source: GAO's analysis of field assistance's business plan template.
[End of figure]
Although the template demonstrates a noteworthy effort to show a clear
link between goals and measures, it omits the link to IRS's mission,
IRS's strategic goals, and field assistance's improvement projects.
These links are important because they serve as the bridge between
long-term strategic goals and short-term daily operational goals, which
can, among other things, be used for holding IRS and the field
assistance program accountable for achieving those goals. Also,
officials told us that the completed template would only cite the type
of performance measure--employee satisfaction, customer satisfaction,
or business results--not the specific measure and target. The link to
the specific measure provides additional information needed to clearly
communicate the alignment of goals and measures throughout the agency,
and the target communicates the level of performance the operating
division hopes to achieve.
Many Field Assistance Measures Lack Clarity:
Many of field assistance's measures lack clarity. For example, the
"geographic coverage" measure is unclear, even to IRS officials,
because it is not evident from its name or definition what is or is not
included in the measure's formula. Specifically, officials debated
whether the measure included alternate sites[Footnote 30] and
kiosks.[Footnote 31] Similarly, the formula only considers the location
of Taxpayer Assistance Centers (TAC), not their hours of operation or
services provided. Although we saw no evidence that this lack of
clarity led to adverse consequences, it could. For example, management
or other stakeholders may determine that TACs are needed in certain
areas of the country to improve geographic coverage when, in fact,
alternate sites and/or kiosks are already serving those areas. IRS
officials said that they have plans to revise the formula to include
alternate sites and kiosks. (The other measures that lack clarity, as
described in table 8 of appendix II, are "return preparation contacts,"
"return preparation units," "TACs total contacts," "forms contact,"
"tax law contacts," "account contacts," "other contacts," "tax law
accuracy," "accounts/notices accuracy," and "return preparation
accuracy.")
All Field Assistance Measures Are Objective and Have Targets That Are
Either in Place or Being Established:
We determined that all of field assistance's 14 performance measures
are objective because, to the greatest extent possible, they are free
of significant bias or manipulation and indicate specifically what is
to be observed, in which population or conditions, and in what
timeframes. Of the 14 measures, 7 have targets in place to help
determine whether overall goals and objectives were achieved. Of the
seven measures without targets, three were being baselined (i.e., IRS
was collecting data for use in setting first-time targets). The
remaining four measures were being designed at the time of our review.
Targets will be set for these measures upon completion of data
collection.
Data Collection Process Affects Reliability of Several Field Assistance
Measures:
Eight of field assistance's 14 performance measures are based on a data
collection process that is subject to inconsistencies and human error,
meaning that the same results may not be produced in similar
circumstances. All TAC employees are to use Form 5311 (Field Assistance
Activity Report) to manually report their daily hours and type of
assistance provided. Supervisors are to review the forms for accuracy
and forward them for manual input into the Resources Management
Information System.[Footnote 32] These layers of manual input are
subject to error and can undermine data reliability, which could (1)
lead managers or other stakeholders to draw inappropriate conclusions
about program performance, (2) fail to alert them to the existence of
problems, or (3) leave them unable to respond when problems arise. For
example, as we
noted in our report on IRS's 2001 tax filing season, our calculations
showed that the data reported by TACs did not account for the wait
times of about 661,000 taxpayers, or about 13 percent of taxpayers
served.[Footnote 33] IRS expects to minimize this human error by
equipping all of its TACs with an on-line automated tracking and
reporting system known as the Queuing Management System (Q-Matic). This
system is expected, among other things, to more efficiently monitor
customer traffic flow and wait times and eliminate staff time spent
completing Form 5311.[Footnote 34]
IRS has taken steps to solve data reliability problems with field
assistance's customer satisfaction measure. In a May 2000 report, the
Treasury Inspector General for Tax Administration concluded that IRS
had not established an adequate management process to ensure that the
survey yielded accurate, reliable, and statistically valid
results.[Footnote 35] To field assistance's credit and with the help of
a vendor, it (1) completed major revisions to the customer satisfaction
survey, such as using a different index scale; (2) included space for
written comments, which were to be provided to managers on a routine
basis; and (3) improved controls to ensure the survey is available to
all taxpayers. However, problems arose regarding the manner in which
the vendor was providing site managers with data containing cumulative
responses and, as of June 2002, the vendor had temporarily stopped
providing feedback to site managers and was in the process of
determining a more usable format to relay information to managers. The
improved data collection method is being implemented and IRS
anticipates an increase in the precision with which it measures field
assistance customer satisfaction.
Field Assistance Measures Cover Core Program Activities with Limited
Overlap:
Field assistance's measures cover its core program activities with
limited overlap. Field assistance identifies its core program
activities as face-to-face assistance, education, and compliance
services, which include such activities as preparing returns, answering
tax law questions, resolving account and notice inquiries, and
supplying forms and publications. For example, field assistance has an
"accounts contact" measure (counts the number of contacts made) and an
"accounts accuracy" measure (measures the accuracy of the responses) to
reflect both the quantity and quality of its accounts-related
assistance.
Field assistance identified some overlap between two measures, "return
preparation contacts" and "return preparation units." It has decided,
for Strategy and Program Plan purposes, to discontinue the "contacts"
measure (which counts the number of customers assisted) and keep the
"units" measure (which counts the number of returns prepared) because
the "units" measure better reflects the amount of return preparation
work done.[Footnote 36] Field assistance will continue tracking the
"contacts" measure outside of the Strategy and Program Plan in order to
determine customer demand for service at particular sites. We concur
with IRS's plans to track the "contacts" measure outside of the
Strategy and Program Plan because it is a diagnostic tool that can be
used for analysis purposes.
Field Assistance Is Missing Some Measures Needed to Balance
Governmentwide Priorities:
Field assistance continues to develop its suite of performance
measures. As part of that effort, it is beginning to deploy important
quality measures, such as "tax law accuracy." However, other important
measures of timeliness, efficiency, and cost of service are missing,
which impairs balance.
* Timeliness. Before fiscal year 2001, field assistance had a
performance measure that officially tracked how long customers waited
to receive service from an employee. According to managers, it was
discontinued because employees were serving taxpayers as quickly as
possible in order to meet timeliness goals, which negatively affected
service quality.[Footnote 37] In March 2002, management went further
and (1) eliminated its requirement for TACs not equipped with Q-Matic
to submit biweekly wait-time reports and (2) doubled, from 15 to 30
minutes, the wait-time interval to be used by TACs with Q-Matic in
computing the percentage of customers served on time. Officials said
that they took these steps because employees continued to feel
pressured to hurry assistance despite the discontinuance of the
official timeliness measure. However, one purpose of balanced measures
is to avoid an inappropriate emphasis on just one aspect of
performance. The presence of a quality measure should provide a
disincentive for employees to ignore quality in favor of timeliness.
Similarly, in the absence of a timeliness performance measure, (1)
field assistance may not be balancing its customers‘ needs for timely
service with their needs for accurate information and (2) IRS is not
held accountable for timeliness to stakeholders, such as the Congress.
* Efficiency. Efficiency, often referred to as productivity,
shows how efficiently IRS's resources are transformed into the
production of field assistance services. Field assistance officials
said they would like to develop an efficiency measure, but no plans are
in place. Among other things, having an efficiency measure would help
managers identify performance strengths and weaknesses.
* Cost of Service. As required by GPRA, agencies should have
performance measures that correlate the level of program activity and
program cost. Without such a measure in field assistance, officials do
not know how much it costs to provide face-to-face service. Field
assistance officials said that they would like to develop a cost of
service measure, but they are not certain how to calculate it.
Submission Processing Measures:
As shown in table 5, all 11 of submission processing's performance
measures have many of the attributes of successful performance
measures. However, as summarized in this section, we identified several
opportunities for improvement, especially in the area of reliability.
Table 9 in appendix II has more detailed information about each
submission processing measure, including any weaknesses we identified
and any recommendations for improvement.
Table 5: Overview of Our Assessment of Submission Processing Measures:
[See PDF for image]
Note: A check mark denotes that the measure has the attribute.
[A] We were unable to verify the linkages between goals and measures
because of insufficient documentation.
[B] Core program activities of submission processing are to efficiently
and accurately process returns, remittances, and refunds and issue
notices and letters.
[C] Submission processing measures cover various governmentwide
priorities, such as efficiency, timeliness, and accuracy; however,
submission processing's measures did not include a measure for customer
satisfaction or for showing how much it costs to process the average
return.
Source: GAO analysis.
[End of table]
Alignment between IRS's Goals and Submission Processing Measures Is
Uncertain:
No formal documentation exists to show how submission processing's
11 measures are aligned with IRS's mission, its agencywide goals, and
its operating division goals. Despite this lack of formal
documentation, submission processing officials said, and we generally
concur, that some linkage does exist. Without complete documentation,
however, we could not verify all the linkages. Submission processing
officials stated that staff and managers are aware of the link between
measures and goals because the submission processing organization has
taken action to help ensure that staff understand the measures and
their role in supporting IRS's overall mission and strategic and
operating goals. For example, according to submission processing
officials, they visited all eight W&I processing centers in 2001 to
talk directly with staff and managers about the importance of balanced
performance measures in ensuring that IRS meets its goals. Complete
documentation of the linkages between goals and measures could further
enhance understanding of those goals and measures with managers and
staff.
Submission Processing Measures Have Clarity, with One Exception:
All but one of the submission processing measures have clarity and
provide information to enable executives, other managers, and outside
stakeholders to properly assess performance against goals. The one
exception is the productivity measure.
Managers in different processing centers told us that they did not use
the productivity measure to provide them with performance information
or to help them assess performance because, among other things, the
measure does not provide specific information about their unit's or
center's performance or their contribution to overall productivity.
This is because the measure, as designed, is a compilation of different
types of work IRS performs in processing returns, remittances, and
refunds and issuing notices and letters. As a result, unit managers
used different productivity measures specific to their own processes to
help them identify how to increase their area‘s productivity. However,
according to IRS officials, the productivity measure is useful and
provides adequate information to some IRS executives.
From our perspective, although the productivity measure may be
meaningful to executives, the fact that field managers use other
measures and profess not to understand the current productivity measure
indicates that the current measure does not provide those managers with
useful information that would alert them to problems and help them
respond when problems arise. In addition, because the measure is
calculated by compiling and weighting different types of processing
work per staff year expended, it may be too confusing to be useful to
outside stakeholders, such as Congress.
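To illustrate why a composite measure of this shape can be opaque to unit managers, a weighted productivity calculation might look like the following sketch; the work categories, weights, and volumes are invented for illustration and are not IRS figures:

```python
# Hypothetical composite productivity: weighted units of work per
# staff year expended. All numbers are illustrative assumptions.

def weighted_productivity(volumes, weights, staff_years):
    """Weighted units of work produced per staff year expended."""
    total_units = sum(volumes[kind] * weights[kind] for kind in volumes)
    return total_units / staff_years

volumes = {"returns": 900_000, "remittances": 400_000, "notices": 250_000}
weights = {"returns": 1.0, "remittances": 0.4, "notices": 0.2}  # relative effort
print(round(weighted_productivity(volumes, weights, staff_years=500)))  # 2220
```

Because the weighted total blends all work types into a single number, a manager of, say, the remittance unit cannot see that unit's contribution in it, which is consistent with the managers' comments above.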
All Submission Processing Measures Have Targets and Most Are Objective:
All 11 of submission processing's measures have measurable targets and
most are objective (i.e., reasonably free of significant bias or
manipulation). For example, the "notice error rate" had a target of
8.1 percent for fiscal year 2001. The "deposit timeliness" measure
appears to be objective, for example, because the Integrated Submission
and Remittance Processing System[Footnote 38] automatically calculates
the data on which the measure is based. However, the "notice error
rate" and "letter error rate" measures are not objective because the
coding required as part of data collection by individual reviewers is
subject to much interpretation that could systematically bias the
results of the measures. In October 2002, the Treasury Inspector
General for Tax Administration reported, based on a review at two
processing centers, that the "deposit error rate" measure was not
objective because the associated sampling plan was not consistently
implemented.[Footnote 39] The Treasury Inspector General for Tax
Administration recommended that IRS take steps to ensure consistent
implementation, and IRS reported that steps have been taken.
Five Submission Processing Measures Lack Reliability:
Five measures are subject to consistency problems that affect their
reliability: "refund timeliness--individual (paper)," "notice error
rate," "refund error rate," "letter error rate," and "deposit error
rate." Specifically, the five measures are based on a data collection
process that, according to the Director of Submission Processing,
involves about 80 staff who identify, interpret, and analyze errors at
the eight W&I processing centers. The "notice error rate" and "letter
error rate" measures also involve coding that is subject to further
interpretation.
Submission processing managers recognized that staff inconsistently
coded notice and letter errors during the 2001 filing season. Neither
IRS nor we know the extent to which such inconsistencies exist because
no routine studies are done to validate the accuracy of data
collection. Reliability and credibility increase when such studies are
done. Submission processing initiated studies beginning in June 2001 to
improve reliability, but has not established any improvement goals.
Submission Processing Measures Cover Core Program Activities without
Overlap:
Each of submission processing's measures directly pertains to one of
the core program activities of submission processing's business
operations--timely, efficiently, and accurately processing returns,
remittances, and refunds and issuing notices and letters--without
redundancy or overlap. For example, the "refund error rate--individual
(paper)" measure directly pertains to one of submission processing's
core program activities, processing refunds, and does not overlap with
any of the other 10 measures.
Unlike the other three program areas we reviewed, submission processing
has two customers--taxpayers, to whom IRS issues refunds and sends
notices, and the Department of the Treasury, for which IRS deposits
remittances. Therefore, for some measures, such as "refund timeliness,"
IRS views taxpayers as the customer, while for other measures, such as
"deposit timeliness," IRS views Treasury as the customer. Submission
processing officials believe that this dual-customer perspective
provides a complete view of their operations and the measures cover all
aspects of their operations while still being limited to a manageable
number.
Submission Processing Measures Cover Various Governmentwide
Priorities, but Are Not Fully Balanced:
Submission processing's measures cover various governmentwide
priorities, such as efficiency, timeliness, and accuracy. However, at
the time of our review, submission processing measures lacked balance
because they did not include a measure for customer satisfaction or a
measure showing how much it costs to process a return.
Although submission processing officials believe that some existing
measures, such as "notice error rate" and "refund timeliness," provide
information related to the customer's experience, they recognize that
directly obtaining customers' perspectives would be more accurate than
assuming their experience based on such measures. Thus, submission
processing is obtaining customer satisfaction information as part of
IRS's corporate customer satisfaction survey, which IRS expects will be
available by the 2003 filing season.
Similar to the other three program areas, submission processing does
not have a cost of service measure.[Footnote 40] Among other things,
not having a cost of service measure affects IRS's ability to
adequately compare different types of processing, such as paper versus
electronic. In our view, because IRS does not take into account the
cost to process a particular type of return, managers cannot fully
understand the effectiveness of their unit.
Conclusions:
Because the filing season affects so many taxpayers, IRS's performance
is important. Having successful performance measures that demonstrate
results, are limited to the vital few, cover multiple program
priorities, and provide useful information to decision makers will help
IRS management and stakeholders, such as Congress, make decisions about
how to fund and improve return processing and assistance to taxpayers.
Despite the challenge of developing a set of 53 measures that satisfy
our criteria, IRS has made significant progress. As developed to date,
the measures satisfy many of our nine attributes for successful
performance measures. For example, in all four of the program areas we
reviewed, most measures covered the core activities of each program and
had targets in place. IRS also has several on-going improvement
initiatives, such as the effort to redesign all aspects of its
telephone assistance quality measures.
Although the measures satisfied many of the nine attributes, our
evaluation also showed that they do not have all the characteristics of
successful performance measures. The most significant weaknesses
include (1) the inability of some measures to provide clear information
to decision makers about program performance, (2) data collection
methods that hamper objectivity and reliability, and (3) measures to
cover governmentwide priorities that are missing from the Strategy and
Program Plan. Although such weaknesses do not mean that the measures
are not useful, IRS risks basing program and resource allocation
decisions on inadequate or incomplete information and is less
accountable until the weaknesses are addressed.
Correcting these weaknesses is important in order to (1) create a
results-oriented environment that demonstrates and tracks how IRS's
programs and activities contribute to achieving its mission and
strategic goals, (2) avoid creating an excess of data that could
obscure key information needed to identify problem areas and assess
goal achievement, (3) form a balanced environment that takes each
program's core activities into account, and (4) provide managers
and other stakeholders with critical information on which to base their
decisions.
Recommendations for Executive Action:
We recommend that the Commissioner of Internal Revenue direct the
appropriate officials to do the following:
Take steps to ensure that agencywide goals clearly align with operating
division goals and performance measures for each of the four areas
reviewed. Specifically, (1) clearly document the relationship among
agencywide goals, operating division goals, and performance measures
(the other three program areas may want to consider developing a
template similar to the one field assistance developed, shown in figure
4) and (2) ensure that the relationship among goals and measures is
communicated to staff at all levels of the organization.
Make the name and definition of several field assistance measures
(i.e., "geographic coverage," "return preparation contacts," "return
preparation units," "TACs total contacts," "forms contacts," "tax law
contacts," "account contacts," "other contacts," "tax law accuracy,"
"accounts/notices accuracy," and "return preparation accuracy") clearer
to indicate what is and is not included in the formula.
As discussed in the body of this report and in appendix II, modify the
formulas used to compute various measures to improve clarity. If
formulas cannot be implemented in time for the next issuance of the
Strategy and Program Plan, then modify the name and definition of the
following measures so it is clearer what is or is not included in the
measure.
* Remove automated calls from the formula for the "CSR level of
service" measure.
* Revise the "CSR response level" measure to include calls from
taxpayers who tried to reach a CSR but did not, such as those who
(1) hung up while waiting to speak to a CSR, (2) were provided access
only to automated services and hung up, and (3) received a busy signal.
* Analyze and use new or existing data to determine why calls are
transferred and use the data to revise the "CSR services provided"
measure so that it only reflects transferred calls in which the caller
received help from more than one CSR (i.e., exclude calls in which a
CSR simply transferred the call and did not provide service).
* Either discontinue use of the "number of IRS digital daily Web site
hits" measure or revise the way "hits" are calculated so that the
measure more accurately reflects usage.
* Revise field assistance's "geographic coverage" measure by ensuring
that the formula better reflects (1) the various types of field
assistance facilities, including alternate sites and kiosks; (2) the
types of services provided by each facility; and (3) the facility's
operating hours.
* Revise submission processing's "productivity" measure so it provides
more meaningful information to users.
Refrain from making changes to official targets, such as electronic
filing and assistance did in fiscal year 2001, unless extenuating
circumstances arise. Disclose any extenuating circumstances in the
Strategy and Program Plan and other key documents.
Modify procedures for the toll-free customer satisfaction survey,
possibly by requiring that administrators listen to the entire call, to
better ensure that administrators (1) notify CSRs that their call was
selected for the survey as close to the end of a call as possible and
(2) can accurately answer the questions they are responsible for on the
survey.
Implement annual effectiveness studies to validate the accuracy of the
data collection methods used for the five telephone measures ("toll-
free tax law quality," "toll-free accounts quality," "toll-free tax law
correct response rate," "toll-free account correct response rate," and
"toll-free timeliness") subject to potential consistency problems. The
studies could determine the extent to which variation exists in
collecting data and recognize the associated impact on the affected
measures. For those measures, and for the five submission processing
measures that already have effectiveness studies in place ("refund
timeliness--individual (paper)," "notice error rate," "refund error
rate--individual (paper)," "letter error rate," and "deposit error
rate"), IRS should establish goals for improving consistency, as
needed.
Ensure that plans to remove overlapping measures in telephone and field
assistance are implemented.
As discussed in the body of this report, include the following missing
measures in the Strategy and Program Plan in order to better cover
governmentwide priorities and achieve balance.
* In the spirit of provisions in the Chief Financial Officers Act of
1990 and Financial Accounting Standards Number 4, develop a cost
of services measure using the best information currently available
for each of the four areas discussed in this report, recognizing data
limitations as prescribed by GPRA. In doing so, adhere to guidance,
such as Office of Management and Budget Circular A-76, and consider
seeking outside counsel to determine best or industry practices.
* Given the importance of automated telephone assistance, develop a
customer satisfaction survey and measure for automated assistance.
* Put the "automated completion rate" measure back in the Strategy and
Program Plan after revising the formula so that calls for recorded tax
law information are not counted as completed when taxpayers hang up
before receiving service.
* Add one or more quality measures to electronic filing and
assistance's suite of measures in the Strategy and Program Plan.
Possible measures include "processing accuracy," "refund timeliness,
electronically filed," and "number of electronic returns rejected."
* Re-implement field assistance's timeliness measure.
* Develop a measure that provides information about field assistance's
efficiency.
Agency Comments and Our Evaluation:
The Commissioner of Internal Revenue provided written comments on a
draft of this report in a letter dated November 1, 2002, which is
reprinted in appendix III. The Commissioner was pleased to see that
many of the measures had the attributes for successful performance and
agreed that others presented opportunities for further refinement. He
stated that the report was objective and balanced and that our
observation of the on-going nature of the performance measurement
process was on point. Furthermore, he noted that the attributes we
developed can be used as a checklist when performance measures are
developed in the future.
Of our 18 recommendations, IRS:
* agreed with 10 and cited planned corrective actions that were
responsive to those recommendations;
* cited actions taken or planned in response to 2 that did not fully
address our concerns; and
* disagreed with 6.
The following discussion focuses on the recommendations with which IRS
disagreed or for which we believe additional action is necessary to
address our concerns.
In response to our recommendation about clarifying the name and
definition of several field assistance measures, IRS said that the
recently updated data dictionary addressed our concerns. We reviewed
the updated data dictionary. The modifications are substantial and
provide significant additional information about the measures. However,
the definitions remain unclear. Specifically, the definitions should
either define a taxpayer assistance center or state whether or not
alternate sites, such as kiosks and mobile sites, are included.
IRS did not agree that automated calls should be removed from the
formula for the "CSR level of service" measure. IRS said that including
the count of callers who choose an automated service while waiting for
CSR service is appropriate. IRS's response does not accurately
characterize all the calls answered by automation that are included in
the "CSR level of service" measure. Rather than choosing an automated
service while waiting for a CSR, some callers complete an automated
service after hearing an announcement that, due to high call volume,
only automated services are available--a choice is not involved. We
believe that the "CSR level of service" measure, because of its name
and the way it is calculated, could be misleading and might
misrepresent taxpayers' access to CSRs. For example, increasing the
percentage of calls served through automation because a CSR was not
available--meaning that CSRs were actually more difficult to reach--
would improve the "CSR level of service" measure, thus giving the
impression that access to CSRs had improved when it had actually
gotten worse. Calls answered through automation, regardless of the type
of assistance (CSR or automation) the caller was originally seeking,
should be reflected in an automated-level-of-service measure, such as
"automated service completion rate."
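The distortion described above can be shown with hypothetical call counts (all numbers are invented for illustration):

```python
# Hypothetical illustration: counting automated answers in the
# "CSR level of service" numerator can raise the measure even as
# access to CSRs worsens.

def level_of_service(csr_answered, auto_answered, attempts, include_auto):
    """Percentage of call attempts counted as served."""
    answered = csr_answered + (auto_answered if include_auto else 0)
    return 100.0 * answered / attempts

# Year 1: 60 of 100 attempts reach a CSR; 10 are answered by automation.
# Year 2: only 50 reach a CSR, but 30 are answered by automation.
print(level_of_service(60, 10, 100, include_auto=True))   # 70.0
print(level_of_service(50, 30, 100, include_auto=True))   # 80.0 (looks better)
print(level_of_service(60, 10, 100, include_auto=False))  # 60.0
print(level_of_service(50, 30, 100, include_auto=False))  # 50.0 (access actually fell)
```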
IRS did not agree that it should modify the "CSR response level"
measure to include calls in which the caller hung up before receiving
service or got a busy signal. IRS said that altering the measure would
deviate from industry standards and hinder IRS's ability to gauge
success in meeting this "world class service" goal. We support IRS's
efforts to gauge its progress toward providing world class customer
service by telephone. However, IRS's use of the same telephone wait-
time measure used by others may actually hinder a meaningful comparison
of IRS with industry leaders. The "CSR response level" measure shows,
for the callers who reached a CSR, the percentage that waited 30
seconds or less. According to IRS officials, when taxpayers call IRS
attempting to reach a CSR, they are much less likely to reach one than
when they call a recognized telephone service leader (i.e., callers to
IRS are more likely to hang up while waiting to speak to a CSR, hang up
after being given access to only automated service because a CSR is not
available, or receive a busy signal). Therefore, when the "CSR response
level" measure (which excludes these hang-ups and busy signals) is used
by IRS, the measure may represent the experience of a significantly
smaller percentage of the total callers who attempted to reach a
CSR than when the same measure is used by industry leaders, thus
potentially overstating the ease with which callers reached IRS CSRs.
Data we obtained from IRS suggest that in 2001 there were about as many
hang-ups and busy signals as calls answered in this measure.
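A hypothetical calculation suggests how much the exclusion can matter; the numbers below are invented, chosen only to mirror a roughly even split between answered calls and hang-ups or busy signals:

```python
# Hypothetical sketch: "CSR response level" as defined (answered calls
# only) versus a version that also counts callers who hung up or got
# a busy signal, as recommended. Counts are illustrative assumptions.

def response_level(answered_within_30s, answered_total,
                   abandoned=0, include_abandoned=False):
    """Percentage of callers served by a CSR within 30 seconds."""
    base = answered_total + (abandoned if include_abandoned else 0)
    return 100.0 * answered_within_30s / base

# 40 of 50 answered calls were served within 30 seconds; another 50
# callers hung up or received a busy signal.
print(response_level(40, 50))                                        # 80.0
print(response_level(40, 50, abandoned=50, include_abandoned=True))  # 40.0
```

Under an even split, the reported level is roughly halved once unserved callers are counted, which is why the two versions of the measure can tell very different stories.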
In response to our recommendation about implementing annual studies to
validate the accuracy of various data collection methods and
establishing goals for improving consistency, IRS said that it (1) has
an ongoing process to ensure proper administration of the collection
methods for the telephone measures cited in our recommendation, (2)
does not agree that an annual independent review by non-CQRS analysts
is merited, and (3) does not agree that it should incorporate
consistency
improvement goals in the Strategy and Program Plan process. As we noted
in our report, telephone assistance's CQRS has some controls in place to
monitor consistency. However, we believe that reliability and
credibility increase when performance data are checked or tested for
significant errors, which IRS currently does not do. We did not
recommend that non-CQRS analysts do these reviews; who does the reviews
is for IRS to decide. Also, we recognized in our report that submission
processing has an on-going process to verify consistency and that it
has found problems. Because that review process has found some
problems, we believe that establishing goals for improving consistency
in submission processing is warranted. Because telephone assistance
does not have a review process in place, we do not know whether
improvement goals are needed, but noted that they could be. We did not
recommend that these goals become a part of the Strategy and Program
Plan process. Instead, these goals should become part of the review
process and be made known to staff who are performing the work.
IRS disagreed with our recommendation that it put the "automated
completion rate" measure back in the Strategy and Program Plan.
Instead, IRS said it would continue to track and monitor that rate as a
diagnostic measure. IRS told us that its decision is based on the fact
that data on automated calls are not good enough to merit the attention
the measure would have at the Strategy and Program Plan level. We
recognize that there are data weaknesses with this measure. That is why
our recommendation calls for IRS to revise the formula before returning
the measure to the Strategy and Program Plan. Because serving more
callers through automation is important to IRS's strategy for improving
taxpayer service, we believe that IRS needs a measure of the level of
service provided by automation in its Strategy and Program Plan to
balance its measure of the level of service provided by CSRs. Other
than counts of the number of calls served, IRS has no measure of its
effectiveness in serving taxpayers through automation. Without such a
measure, IRS risks poorly serving the increasing number of taxpayers
being served through automation while possibly improving access for a
declining number of callers who need to speak with a CSR.
IRS does not believe that adding one or more quality measures to
electronic filing and assistance's suite of measures in the Strategy
and Program Plan would enhance the electronic filing program. It noted
that it tracks the quality of electronic filing outside the Strategy
and Program Plan and that quality has been consistently high. We
recognize that electronic filing and assistance tracks quality outside
the Strategy and Program Plan. However, we disagree with IRS's position
that adding quality measures to that plan would not enhance the
program. According to IRS officials, measures in the Strategy and
Program Plan are the highest, most comprehensive level of measures for
which they are accountable. In addition, many of those measures are
made available to outside stakeholders. By not elevating these measures
of quality to the Strategy and Program Plan, electronic filing and
assistance risks not being held to any quality standards. Furthermore,
not having quality measures hampers balance among electronic filing and
assistance's suite of measures and is not consistent with IRS's
balanced measurement program or the intent of the IRS Restructuring and
Reform Act of 1998.
IRS disagreed with our recommendation that it re-implement field
assistance's timeliness measure. IRS said that although timeliness
goals are important in providing service to taxpayers, they are
detrimental to quality service because field assistance employees tend
to rush customers when traffic is high. This position is inconsistent
with IRS's balanced measurement program and the intent of the IRS
Restructuring and Reform Act of 1998. Although the accuracy of
assistance is an important measure of quality, the timeliness of that
assistance is also an important and balancing aspect of quality.
Without this balancing emphasis, staff could theoretically take
excessive time providing quality tax law assistance to a few taxpayers
regardless of the impact on the wait-time for other taxpayers. We agree
that Q-Matic is the best source of this information and support IRS‘s
plans to implement it nationwide. IRS also stated that it could use
feedback from its customer satisfaction surveys to obtain information
about the "promptness of service." As we noted in our report, problems
arose with the manner in which the vendor provided feedback, and the
vendor had stopped providing feedback to site managers until the
problems could be resolved. Even when those problems are
resolved, a timeliness measure based on actual IRS data versus
taxpayers' perceptions would be meaningful.
Regarding our recommendation about implementing an efficiency measure
in field assistance, IRS said that it will be testing a system for use
as a "diagnostic tool" to monitor and evaluate the strengths and
weaknesses of various productivity measures. However, IRS's response
was silent as to whether or when it would establish a field assistance
productivity measure. Maintaining and enhancing organizational
productivity is a fundamental agency management responsibility. The
extent to which IRS's field assistance organization is meeting this
basic responsibility needs to be visible to IRS, Treasury, and
congressional stakeholders in the form of an organizational performance
measure, rather than a "diagnostic tool," which is generally visible
only to IRS managers.
We are sending copies of this report to the Chairmen and Ranking
Minority Members of the Senate Committee on Finance and the House
Committee on Ways and Means and the Ranking Minority Member of this
Subcommittee. We are also sending copies to the Secretary of the
Treasury; the Commissioner of Internal Revenue; the Director, Office of
Management and Budget; and other interested parties. We will make
copies available to others on request. In addition, the report will be
available at no charge on the GAO Web site at http://www.gao.gov.
This report was prepared under the direction of David J. Attianese,
Assistant Director. Other major contributors are acknowledged in
appendix IV. If you have any questions about this report, contact
Mr. Attianese or me on (202) 512-9110.
Sincerely yours,
James R. White
Director, Tax Issues
[Signed by James R. White]
[End of section]
Appendix I: Expanded Explanation of Our Attributes and Methodology for
Assessing IRS's Performance Measures:
Performance goals and measures that successfully address important and
varied aspects of program performance are key to a results-oriented,
balanced work environment. Measuring performance allows organizations
to track the progress they are making toward their goals and gives
managers critical information on which to base decisions for improving
their programs. Organizations need to have performance measures that
(1) demonstrate results, (2) are limited to the vital few, (3) cover
multiple program priorities, and (4) provide useful information for
decision making in order to track how their programs and activities can
contribute to attaining the organization's goals and mission. These
four characteristics are important to accurately reveal the strengths
and weaknesses of a program since measures are often the key motivators
of performance and goal achievement.
For use as criteria to determine whether the Internal Revenue Service's
(IRS) performance measures in four key program areas--telephone
assistance, electronic filing and assistance, field assistance, and
submission processing--demonstrate results, are limited to the vital
few, cover multiple program priorities, and are useful in decision
making, we developed nine attributes of performance goals and measures
based on previously established GAO criteria. In addition, we
considered key legislation, such as the Government Performance and
Results Act of 1993 (GPRA) and the IRS Restructuring and Reform Act of
and performance management literature cited in the bibliography and
related products sections at the end of this report. Our nine
attributes may not cover all the attributes of successful performance
measures; however, we believe these are some of the most important.
We shared
these attributes with IRS officials responsible for performance
measurement issues, such as the Acting Director of the Organizational
Performance Division; and several officials in the Wage and Investment
(W&I) operating division, such as the Director of Strategy and Finance;
the Chief of Planning and Analysis; the Director of Customer Account
Services; and the Director of Field Assistance. These officials
generally agreed with the relevance of the attributes and our review
approach.
We applied these attributes to the 53 filing season measures in W&I's
fiscal year 2001-2003 Strategy and Program Plan in a systematic manner,
but some judgment was required. To ensure consistency and reliability
in our application of the attributes, we had one staff person
responsible for each of the four areas. That staff person prepared the
initial analysis and at least two other staff reviewed those detailed
results. Several staff reviewed the results for all four areas.
Inherently, the attributes described below are not weighted equally.
Weaknesses identified in a particular attribute do not,
in and of themselves, mean that a measure is ineffective or
meaningless. Instead, weaknesses identified should be considered areas
for further refinement.
Detailed information on each attribute, including an explanation,
examples, and the methodology we used to assess that attribute with
respect to the measures covered by our review, follows.
Attributes of Successful Performance Measures:
1. Is there a relationship between the performance goals and measures
and an agency's goals and mission? (Referred to as "linkage"):
Explanation: Performance goals and measures should align with an
agency's goals and mission. A cascading or hierarchical linkage moving
from top management down to the operational level is important in
setting goals agencywide, and the linkage from the operational level to
the agency level provides managers and staff throughout an agency with
a road map that (1) shows how their day-to-day activities contribute to
attaining agencywide goals and mission and (2) helps define strategies
for achieving strategic and annual performance goals. As agencies
develop annual performance goals as envisioned by GPRA, they can serve
as a bridge that links long-term goals to agencies' daily operations.
For example, an annual goal that is linked to a program and also to a
long-term goal can be used both to (1) hold agencies and program
offices accountable for achieving those goals and (2) assess the
reasonableness and appropriateness of those goals for the agency as a
whole. In addition, annual performance planning can be used to better
define strategies for achieving strategic and annual performance goals.
Linkages between goals and measures are most effective when they are
clearly communicated to all staff within an agency. Communicating
goals and their associated measures is a continuous process that
underpins everything the agency does each day and creates a "line of
sight" throughout the agency so that everyone understands what the
organization is trying to achieve and the goals it seeks to reach.
Example: Submission processing's "notice error rate" measure determines
the percentage of incorrect notices issued to taxpayers by submission
processing employees. The target set for this measure in 2001 was
8.1 percent. This measure could be used to support the "notice
redesign" improvement project as well as the operational priority to
"prioritize notices and monitor and control notice issuance." It also
is used to support one of W&I's goals--"to meet taxpayer demands for
timely, accurate, and efficient services." This W&I strategy aligns
with IRS's strategic goal, "top quality service to all taxpayers
through fair and uniform application of the law," which, in turn,
supports IRS's mission to "provide America's taxpayers top quality
service by helping them understand and meet their tax responsibilities
and by applying the tax law with integrity and fairness to all."
Methodology: We compared IRS's measures with its targets, improvement
projects, operational priorities, operating division goals, and
agencywide goals and mission as documented in the Strategy and Program
Plan. We also interviewed operational/unit managers and managers
responsible for the Strategy and Program Plan about linkages and
reviewed training materials.
2. Are the performance measures clearly stated? (Referred to as
"clarity"):
Explanation: A measure has clarity when it is clearly stated and the
name and definition are consistent with the methodology used for
calculating the measure. A measure that is not clearly stated (i.e.,
contains extraneous data elements or omits key ones) or that has a
name or definition that is inconsistent with how it is calculated can
confuse
users and could cause managers or other stakeholders to think that
performance was better or worse than it actually was.
Example: Telephone assistance's "average handle time" measure shows the
average number of seconds Customer Service Representatives (CSRs) spent
assisting callers. Its definition and formula are consistent with the
name of the measure and clearly note that the measure includes talk and
hold times and the time a CSR spends on work related to a call after
the call is terminated.
Methodology: We compared the name of the measure, the written
definition of the measure, and the formula or methodology for computing
the measure. In several instances, we discussed certain components of
the definition and formula with IRS officials to better understand its
meaning and purpose. For example, we discussed components of telephone
assistance's quality measures with staff in Customer Account Services,
and staff in the Centralized Quality Review Site. We also reviewed on-
line information available to field assistance managers from the
Queuing Management System (Q-Matic).[Footnote 41] We spoke to managers
at different levels within each of the four areas we reviewed and asked
them about the information they received and how they used it. In
addition, we used some of the results of a random telephone survey of
managers we conducted in 2001 at 84 of IRS's 413 Taxpayer Assistance
Centers (TAC) to solicit their views on the services provided at those
offices.
3. Do the performance measures have targets, thus allowing for easier
comparison with actual performance? (Referred to as "measurable
target"):
Explanation: Where appropriate, performance goals and measures should
have quantifiable, numerical targets or other measurable values.
Numerical targets or other measurable values facilitate future
assessments of whether overall goals and objectives were achieved
because comparisons can be easily made between projected performance
and actual results. Some goals are self-measuring (i.e., they are
expressed objectively and are quantifiable) and therefore do not
require additional measures to assess progress. When goals are not
self-measuring, performance measures should translate those goals into
observable conditions that determine what data to collect to learn
whether progress was made toward achieving goals. The measures should
have a clearly apparent or commonly accepted relation to the intended
performance or have been shown to be reasonable predictors of desired
behaviors or events. If a goal cannot be expressed in an objective,
specific, and measurable form, GPRA allows the Office of Management and
Budget to authorize agencies to develop alternative forms of
measurement.[Footnote 42]
Example: Electronic filing and assistance's "percent of individual
returns electronically filed“ had a numerical target of 31 percent in
fiscal year 2001.
Methodology: We determined that a goal or measure had a measurable
target when expected performance could be compared with actual results,
and in general, was not changed during the measurement period. Each of
the measures we reviewed was listed in the Strategy and Program Plan,
which provides projections or targets for the current and two
subsequent fiscal years. We verified that the target was measurable.
When the Strategy and Program Plan did not show a target, we contacted
appropriate IRS officials to determine why.
4. Are the performance goals and measures objective? (Referred to as
"objectivity"):
Explanation: To the greatest extent possible, goals and measures should
be reasonably free of significant bias or manipulation that would
distort the accurate assessment of performance. They should not allow
subjective considerations or judgments to dominate the outcome of the
measurement. To be objective, performance goals and measures should
indicate specifically what is to be observed, in which population or
conditions, and in what timeframe and be free of opinion and judgment.
Objectivity is important because it adds credibility to the performance
goals and measures by ensuring that significant bias or manipulation
will not distort the measure.
Example: The "customer satisfaction" measure for telephone assistance
has the potential for bias and therefore may not be objective. Survey
administrators are instructed to notify the CSR towards the end of the
call that his or her call was selected to participate in the survey. A
potential problem arises because administrators are not required to
listen to the entire call, and it can be difficult to determine when
the call is about to end. Therefore, if a CSR is notified prior to the
end of the call that the call was selected for survey, the CSR could
change behavior towards the taxpayer, thus affecting the results of the
survey and the measure.
Methodology: We reviewed information in IRS guidance or procedures,
data collection instruments, reports, and other documents. We held
discussions about objectivity with various staff and officials, such as
data owners and analysts, within each of the four areas we reviewed.
Because our interviews raised questions about the objectivity of some
measures for telephone assistance, we monitored some taxpayer calls and
interviewed an official from IRS's customer satisfaction survey
contractor, Pacific Consulting Group.
5. To what extent do the performance goals and measures provide a
reliable way to assess progress? (Referred to as "reliability"):
Explanation: Reliability refers to whether measures are amenable to
applying standard procedures for collecting data or calculating results
so that they would be likely to produce the same results if applied
repeatedly to the same situation. Errors can occur at various points in
the collection, maintenance, processing, and reporting of data.
Significant errors would affect conclusions about the extent to which
performance goals have been achieved. Likewise, errors could cause the
measure to report performance at either a higher or lower level than is
actually being attained. Reliability is increased when verification and
validation procedures, such as checking performance data for
significant errors by formal evaluation or audit, exist.
Example: Field assistance's "return preparation contacts" measure
tracks the total number of customers assisted with return preparation
by IRS. This measure may not be reliable because it involves a
significant amount of manual entry on Form 5311 (Field Assistance
Activity Report) even at sites with the Q-Matic system. In addition to
the potential for error associated with manual entry, the instructions
for filing Form 5311 require that service time be recorded in whole
hours, which can misconstrue actual service times and is less exact
than the data in Q-Matic, which records service times in minutes.
Methodology: We looked for weaknesses in IRS's guidance or procedures,
data collection instruments, reports, and other documents that might
cause errors. We discussed potential weaknesses with various officials,
such as account data analysts, within each of the four areas we
reviewed. Because these efforts revealed the potential for errors in
measuring telephone performance, we monitored employees preparing data
collection instruments for assessing telephone quality and customer
satisfaction in Atlanta. Likewise, we monitored field assistance staff
helping taxpayers and reporting their time using both the automated Q-
Matic system and Form 5311.
6. Do the performance measures sufficiently cover a program's core
activities? (Referred to as "core program activities"):
Explanation: Core program activities are the activities that an entity
is expected to perform to support the intent of the program.
Performance measures should be scoped to evaluate the core program
activities. Limiting the number of performance measures to the core
program activities will help identify performance that contributes to
goal achievement. At the same time, however, there should be enough
performance measures to ensure that managers have the information they
need about performance in all the core program activities. Without such
information, the possibility of achieving program goals is less likely.
Example: The core program activities for submission processing include
(1) processing returns, (2) depositing remittances, (3) issuing
refunds, and (4) sending out notices and letters. Each of submission
processing's 11 measures corresponds to one of those core activities.
For example, the "number of individual 1040 series returns filed
(paper)" measure corresponds to processing returns and the "letter
error rate" measure corresponds to sending out notices and letters.
Methodology: We determined the core program activities of each of the
four areas we reviewed based on IRS documentation and discussions with
IRS officials. We reviewed the suite of performance measures for each
of the four areas to determine whether measures existed that covered
each core program activity. We determined whether any measures were
missing or other pieces of information were needed to better manage
programs by using judgment and questioning IRS officials. In addition,
we reviewed the results of a questionnaire that we had used during a
review of IRS's 2001 filing season to ask TAC managers about
information needed to manage their program.
7. Does there appear to be limited overlap among the performance
measures? (Referred to as "limited overlap"):
Explanation: Measures overlap when the results of measures provide
basically the same information. A measure that overlaps with another is
unnecessary and does not benefit program management. Unnecessary or
overlapping measures not only can cost money but also can cloud the
bottom line in a results-oriented environment by making managers or
other stakeholders sift through unnecessary or redundant information.
Some measures, however, may overlap partially and provide stakeholders
some new information. In those cases, management must make a judgment
as to whether having the additional information is worth the cost and
possible confusion it may cause.
Example: Telephone assistance's "toll-free average speed of answer" and
"toll-free CSR response level" measures attempt to show how long a
taxpayer waited before receiving assistance. The difference between the
two measures is that the latter shows the percentage of taxpayers
receiving assistance within 30 seconds while the former shows the
average time taxpayers waited for service. These two measures are
likely to be correlated and thus partially overlap. However, how much
overlap to accept between measures is at management's discretion.
Methodology: Within each of the four areas we reviewed, we looked at
the suite of measures and compared the measures‘ names and definitions.
We also looked at the correlations between measures‘ results. When two
measures seemed similar, we discussed the potential for overlap with
IRS officials.
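As an illustration of the kind of correlation check described above, the following sketch computes a Pearson correlation between two measures' results. The function and the weekly figures are hypothetical examples for illustration only; they are not actual IRS data.

```python
# Illustrative sketch: testing whether two measures partially overlap by
# correlating their results over time. All figures below are hypothetical,
# not actual IRS data.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical weekly results for two related telephone measures:
avg_speed_of_answer = [310, 295, 280, 340, 265, 300]        # seconds
csr_response_level = [38.0, 41.0, 44.5, 33.5, 47.0, 40.5]   # % within 30 sec

r = pearson(avg_speed_of_answer, csr_response_level)
# A correlation near -1 or +1 suggests the two measures largely convey
# the same information: here, longer average waits go with fewer callers
# served within 30 seconds.
print(round(r, 2))
```

A strong correlation alone does not prove redundancy; as the report notes, management would still weigh whether the partially new information a second measure provides is worth its cost.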
8. Does there appear to be a balance among the performance goals and
measures, or is there an emphasis on one or two priorities at the
expense of others? (Referred to as "balance"):
Explanation: Balance exists when a suite of measures ensures that an
organization's various priorities are covered. IRS considers its
measures to be balanced when they address customer satisfaction,
employee satisfaction, and business results (quality and quantity).
Performance measurement efforts that overemphasize one or two
priorities at the expense of others may skew assessments of the
agency's performance and keep its managers from understanding the
effectiveness of their programs in supporting IRS's overall mission
and goals.
Example: Submission processing has an employee satisfaction measure and
several business results measures, such as "deposit timeliness." As of
October 2002, however, it had not fully implemented a customer
satisfaction measure, which resulted in an unbalanced process that can
overlook something as important as the customer‘s perspective.
Methodology: For each of the four areas, we ensured that a measure
existed for each component. If measures did not exist for certain
components, we contacted IRS officials to find out why and to see what
plans IRS has to ensure balance in the future.
9. Does the program or activity have performance goals and measures
that cover governmentwide priorities? (Referred to as "governmentwide
priorities"):
Explanation: Agencies should develop a range of related performance
measures to address governmentwide priorities, such as quality,
timeliness, efficiency, cost of service, and outcome. A range is
important because most program activities require managers to balance
these priorities among other demands. When complex program goals are
broken down into a set of component quantifiable measures, it is
important to ensure that the overall measurement of performance does
not become biased because measures that assess some priorities but
neglect others could place the program's success at risk.
Example: Electronic filing and assistance provides the capability for
taxpayers to transact and communicate electronically with IRS. The
13 measures we reviewed included, for example, the number or percent of
returns filed, the number of hits to or downloads from IRS's Web site,
and employee and customer satisfaction. The Strategy and Program Plan
did not have any measures on the program‘s quality or timeliness. Not
having these measures means that management may not be sufficiently
balancing competing demands.
Methodology: We analyzed the suite of measures in the Strategy and
Program Plan for each of the four areas we reviewed. Based on
discussions with IRS officials and our own judgment, we identified
measures that appeared to be missing. We discussed those identified
with IRS officials.
[End of section]
Appendix II: The 53 IRS Performance Measures Reviewed:
The following four tables provide information on the 53 performance
measures we reviewed in the four program areas within the Internal
Revenue Service's (IRS) Wage and Investment (W&I) operating division
that are critical to a successful filing season. Among other things,
the tables show how each of the 53 measures matched up against the
attributes in appendix I. The attributes not addressed in the tables
are (1) "linkage," because sufficient documentation did not exist to
validate linkages with any of the measures and (2) "balance," because
that attribute does not apply to specific measures but, rather, to a
program's entire suite of measures. When reviewing the suite of
measures, we found some instances where additional measures are
warranted; the additional measures are generally not cited in these
tables.
Telephone Assistance Performance Measures:
Of the 53 performance measures in our review, 15 are for telephone
assistance.[Footnote 43] Table 6 has information about each of the 15
telephone measures.
Table 6: Telephone Assistance Performance Measures:
Measure name and definition[A]: Total automated calls answered; A
count of all toll-free calls answered at telephone assistance centers
by an automated system (e.g., Telephone Routing Interactive System) and
Tele-Tax.[B]; FY 2001 target and actual: Target: 85,000,000 calls
answered; Actual: 104,228,052 calls answered; Weaknesses of measure
and consequences: Some overlap with automated completion rate measure.
Both attempt to show how many automated calls were answered, but the
automated completion rate tries to show the percentage that completed
automated service successfully. Overlap could cloud the bottom line and
obscure performance results; Recommendations: See note 1 to the
table.
Measure name and definition[A]: Customer Service Representative (CSR)
calls answered; The count of all toll-free calls answered at
telephone assistance centers; FY 2001 target and actual: Target:
31,500,000 calls answered; Actual: 32,532,503 calls answered;
Weaknesses of measure and consequences: Some overlap with CSR services
provided measure. Both attempt to show how many calls CSRs answered,
but CSR services provided tries to count calls requiring the help of
more than one CSR as more than one call. Overlap could cloud the bottom
line and obscure performance results; Recommendations: See note 1 to
the table.
Measure name and definition[A]: CSR level of service; The relative
success rate of taxpayers who call for toll-free services reaching a
CSR; FY 2001 target and actual: Target: 55%; Actual: 53.7%; Weaknesses
of measure and consequences: Formula lacks clarity because it includes
some automated calls, which overstates the number of calls answered by
CSRs and thus the level of service being provided by CSRs.[C];
Definition lacks clarity because it does not disclose inclusion of
some automated calls, which could lead to misinterpreted results or a
failure to take proper action to resolve performance problems;
Recommendations: Remove automated calls from the formula.
Measure name and definition[A]: Toll-free customer satisfaction;
Customer's perception of service received, with a rating of "4" being
the best; FY 2001 target and actual: Target: 3.45 average score;
Actual: 3.45 average score; Weaknesses of measure and consequences:
Not clear
because survey only applies to calls handled by CSRs. Satisfaction is
not measured for calls handled by automation, which accounted for 76
percent of all calls in fiscal year 2001; Potential bias exists (not
objective) because administrators are not required to listen to the
entire call: (1) CSRs could be prematurely notified that their call
was selected for the survey, thus changing their behavior towards the
caller and affecting the results of the survey, and (2) administrators
may not be able to correctly answer certain questions on the survey,
which could impair the accuracy of the data; Recommendations: Develop
a customer satisfaction survey for automated assistance; Modify
procedures for the toll-free customer satisfaction survey, possibly by
requiring that administrators listen to the entire call, to better
ensure that administrators (1) notify CSRs that their call was
selected for the survey as close to the end of a call as possible and
(2) can accurately answer the questions they are responsible for on
the survey.
Measure name and definition[A]: Toll-free tax law quality[D];
Evaluates the correctness of answers given by CSRs to callers with tax
law inquiries as well as CSRs' conformance with IRS administrative
procedures, such as whether the CSR gave his or her identification
number to the taxpayer; FY 2001 target and actual: Target: 74%;
Actual: 75.21%; Weaknesses of measure and consequences: A reliability
weakness exists because evaluations are based on judgments that are
potentially inconsistent. No routine studies to determine effectiveness
of procedures to ensure consistency of data collection. Possible
inconsistencies affect the accuracy of the measure and conclusions
about the extent to which performance goals have been achieved; Some
overlap with toll-free tax law correct response rate. Both attempt to
show the percentage of callers receiving accurate responses to tax law
questions, but toll-free tax law quality includes CSR conformance with
administrative procedures in computing that percentage. Overlap could
cloud the bottom line and obscure performance results;
Recommendations: Implement annual effectiveness studies to validate the
accuracy of data collection methods and establish goals for improving
consistency, as needed; See note 1 to the table.
Measure name and definition[A]: Toll-free accounts quality[E];
Evaluates the correctness of answers given by CSRs to callers with
account-related inquiries as well as CSRs' conformance with IRS
administrative procedures, such as whether a CSR gave his or her
identification number to the taxpayer; FY 2001 target and actual:
Target: 67%; Actual: 69.17%; Weaknesses of measure and
consequences: A reliability weakness exists because evaluations are
based on judgments that are potentially inconsistent. No routine
studies to determine effectiveness of procedures to ensure consistency
of data collection. Possible inconsistencies affect the accuracy of the
measure and conclusions about the extent to which performance goals
have been achieved; Some overlap with toll-free account correct
response rate. Both attempt to show the percentage of callers receiving
accurate responses to account questions, but toll-free accounts quality
includes CSR conformance with administrative procedures in computing
that percentage. Overlap could cloud the bottom line and obscure
performance results; Recommendations: Implement annual effectiveness
studies to validate the accuracy of data collection methods and
establish goals for improving consistency, as needed; See note 1 to
the table.
Measure name and definition[A]: Average handle time; The average
number of seconds CSRs spent assisting callers. It includes talk and
hold times and the time a CSR spends on work related to a call after
the call is terminated; FY 2001 target and actual: Target: not
available; Actual: 609 seconds; Weaknesses of measure and
consequences: Target to be set upon completion of baseline data
collection.[F]; Recommendations: None.
Measure name and definition[A]: Automated completion rate; The
percentage of total callers who completed a selected automated
service; FY 2001 target and actual: Target: not available; Actual:
not available; Weaknesses of measure and consequences: Formula lacks
clarity because it assumes that all callers seeking recorded tax law
information, including those who hang up before receiving service,
received the information they needed, which could produce inaccurate or
misleading results; Not clear because definition does not disclose
the previously mentioned assumption, which could lead to misinterpreted
results or a failure to take proper action to resolve performance
problems; Measure removed from the Strategy and Program Plan; target
not available; Some overlap with total automated calls answered.
Both attempt to show how many automated calls were answered, but
automated completion rate tries to show the percentage that completed
an automated service successfully. Overlap could cloud the bottom line
and obscure performance results; Recommendations: Revise the measure
so that calls for recorded tax law information are not counted as
completed when callers hang up before receiving service; Put this
measure back in the Strategy and Program Plan after revising the
formula so that calls for recorded tax law information are not counted
as completed when taxpayers hang up before receiving service; See
note 1 to the table.
Measure name and definition[A]: CSR services provided; The count of
all calls handled by CSRs; FY 2001 target and actual: Target: not
available; Actual: 35,799,122 calls answered; Weaknesses of measure
and consequences: Not clear because definition does not disclose that
IRS counts all calls transferred from one CSR to another as receiving
an additional service, which could lead to misinterpreted results or a
failure to take proper action to resolve performance problems. IRS does
not have complete information on why calls were transferred. Thus, IRS
cannot identify appropriate steps to reduce any inefficiency associated
with transferred calls; Target to be set upon completion of baseline
data collection[F]; Some overlap with CSR calls answered. Both
attempt to show how many calls CSRs answered, but CSR services provided
tries to count calls requiring the help of more than one CSR as more
than one call. Overlap could cloud the bottom line and obscure
performance results; Recommendations: Analyze and use new or existing
data to determine why calls are transferred and use the data to revise
the measure so that it only reflects transferred calls in which the
caller received help from more than one CSR (i.e., exclude calls in
which a CSR simply transferred the call and did not provide service);
See note 1 to the table.
Measure name and definition[A]: Toll-free tax law correct response
rate[G]; Evaluates the correctness of answers given by CSRs to callers
with tax law inquiries; FY 2001 target and actual: Target: 81.6%;
Actual: 79.53%; Weaknesses of measure and consequences: A reliability
weakness exists because evaluations are based on judgments that are
potentially inconsistent. No routine studies to determine
effectiveness of procedures to ensure consistency of data collection.
Possible inconsistencies affect the accuracy of the measure and
conclusions about the extent to which
performance goals have been achieved; Some overlap with toll-free tax
law quality. Both attempt to show the percentage of callers receiving
accurate responses to tax law questions, but toll-free tax law quality
includes CSR conformance with administrative procedures in computing that
percentage. Overlap could cloud the bottom line and obscure performance
results; Recommendations: Implement annual effectiveness studies to
validate the accuracy of data collection methods and establish goals
for improving consistency, as needed; See note 1 to the table.
Measure name and definition[A]: Toll-free account correct response
rate[H]; Evaluates the correctness of answers given by CSRs to callers
with account-related inquiries; FY 2001 target and actual: Target:
90.8%; Actual: 88.72%; Weaknesses of measure and consequences: A
reliability
weakness exists because evaluations are based on judgments that are
potentially inconsistent. No routine studies to determine effectiveness
of procedures to ensure consistency of data collection. Possible
inconsistencies affect the accuracy of the measure and conclusions
about the extent to which performance goals have been achieved;
Some overlap with toll-free accounts quality. Both attempt to show the
percentage of callers receiving accurate responses to account
questions, but toll-free accounts quality includes CSR conformance with
administrative procedures in computing that percentage. Overlap could
cloud the bottom line and obscure performance results; Recommendations:
Implement annual effectiveness studies to validate the accuracy of the
data collection methods and establish goals for improving consistency,
as needed; See note 1 to the table.
Measure name and definition[A]: Toll-free timeliness[I]; The
successful resolution of all issues resulting from the caller's first
inquiry (telephone only); FY 2001 target and actual: Target: 82%;
Actual: 82.8%; Weaknesses of measure and consequences: A reliability
weakness exists because evaluations are based on judgments that are
potentially inconsistent. No routine studies to determine effectiveness
of procedures to ensure consistency of data collection. Possible
inconsistencies affect the accuracy of the measure and conclusions
about the extent to which performance goals have been achieved;
Recommendations: Implement annual effectiveness studies to validate the
accuracy of data collection methods and establish goals for improving
consistency, as needed.
Measure name and definition[A]: Toll-free employee satisfaction; The
percentage of survey participants that answered with a 4 or 5 (two
highest scores possible) to the question "considering everything, how
satisfied are you with your job?"; FY 2001 target and actual: Target:
55%; Actual: 46%; Weaknesses of measure and consequences: None
observed; Recommendations: None.
Measure name and definition[A]: CSR response level; The percentage of
callers who started receiving service from a CSR within a specified
period of time; FY 2001 target and actual: Target: 49%; Actual: 40.8%;
Weaknesses of measure and consequences: Not clear because formula does
not include calls that received a busy signal or resulted in a hang-up
before a CSR came on the line, and the definition does not disclose
that exclusion. Performance may be overstated and the real customer
experience not reflected; Some overlap with average speed of answer.
Both attempt
to show how long callers waited before receiving service, except that
CSR response level shows the number of callers receiving service within
30 seconds. Overlap could cloud the bottom line and obscure performance
results; Recommendations: Revise measure to include calls from
taxpayers who tried to reach a CSR but did not, such as those who (1)
hung up while waiting to speak to a CSR, (2) were provided access only
to automated services and hung up, and (3) received a busy signal;
See note 1 to the table.
Measure name and definition[A]: Average speed of answer; The average
number of seconds callers waited in queue before receiving service from
a CSR; FY 2001 target and actual: Target: not available; Actual:
295 seconds; Weaknesses of measure and consequences: Target to be set
upon completion of baseline data collection.[F]; Some overlap with
toll-free CSR response level. Both attempt to show how long callers
waited before receiving service, except that CSR response level shows
the number of callers receiving service within 30 seconds. Overlap
could cloud the bottom line and obscure performance results;
Recommendations: See note 1 to the table.
Note 1: We identified this measure as having partial overlap with
another measure. Telephone assistance officials generally agreed with
our assessment and stated that some of these overlapping measures will
be removed from future Strategy and Program Plans. The following
recommendation applies to several measures as noted in the table:
"ensure that plans to remove overlapping measures are implemented."
[A] The names of some measures have been modified slightly from the
official names used by IRS for ease of reading and consistency
purposes. For example, we replaced the word "assistor" with CSR. Also,
the definitions of the measures listed in the table come from various
IRS sources, including interviews.
[B] The Telephone Routing Interactive System is an interactive system
that routes callers to CSRs or automated services and provides
interactive services. Tele-Tax is a telephone system that provides
automated services only.
[C] About 780,000 automated calls were included in the formula during
the 2001 filing season. If they had not been included, the CSR level of
service would have decreased by about 1 percentage point. The effect
could be more significant in the future because IRS plans to increase
the number of calls handled through automation.
[D] IRS plans to discontinue the "toll-free tax law quality" measure in
fiscal year 2004.
[E] IRS plans to discontinue the "toll-free accounts quality" measure
in fiscal year 2004.
[F] Although these measures did not have a measurable target in place,
IRS is taking reasonable steps to develop a target.
[G] IRS changed the name of the "toll-free tax law correct response
rate" measure to "customer accuracy for tax law inquiries" beginning in
October 2002.
[H] IRS changed the name of the "toll-free account correct response
rate" measure to "customer accuracy for account inquiries" beginning in
October 2002.
[I] IRS discontinued the "toll-free timeliness" measure beginning in
October 2002, and replaced it with a new "quality timeliness" measure.
Source: GAO comparison of IRS's December 13, 2000, July 25, 2001, and
October 29, 2001, Strategy and Program Plans with the attributes in
appendix I and an Embedded Quality Discussion Document (7/23/02), which
discusses the changes IRS plans for its telephone assistance quality
measures.
[End of table]
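The clarity weakness noted for the "CSR response level" measure above can be illustrated with a small calculation. The sketch below uses invented call counts, not IRS data, to show how excluding busy signals and hang-ups from the denominator overstates the reported percentage:

```python
# Hypothetical illustration of the "CSR response level" clarity weakness.
# All call counts below are invented for illustration only.
def csr_response_level(answered_within_30s, total_attempts):
    """Percentage of call attempts answered by a CSR within 30 seconds."""
    return 100.0 * answered_within_30s / total_attempts

answered_within_30s = 400_000
reached_queue = 1_000_000       # callers who got into the CSR queue
busy_or_hung_up = 250_000       # callers excluded by the current formula

# Current formula: denominator omits busy signals and hang-ups.
reported = csr_response_level(answered_within_30s, reached_queue)
# Revised formula: denominator counts every attempt to reach a CSR.
revised = csr_response_level(answered_within_30s,
                             reached_queue + busy_or_hung_up)

print(f"reported: {reported:.1f}%")  # 40.0%
print(f"revised:  {revised:.1f}%")   # 32.0%
```

With these invented figures the current formula reports 40 percent while counting every attempt yields 32 percent, which is the kind of overstatement the recommendation addresses.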
Electronic Filing and Assistance Performance Measures:
Of the 53 performance measures in our review, 13 are for electronic
filing and assistance.[Footnote 44] Table 7 has information about each
of the 13 measures.
Table 7: Electronic Filing and Assistance Performance Measures:
Measure name and definition[A]: Number of 1040 series returns
electronically filed (millions); The number of Forms 1040, 1040A, and
1040EZ filed electronically; FY 2001 target and actual: Target: 40.0;
Actual: 40.0; Weaknesses of measure and consequences: Target changed
during filing season from 42.0 to 40.0. Changing the target in this
instance was subjective in nature and resulted in an objectivity
weakness as well; Some overlap with percent of individual returns
electronically filed. Both measures show the extent of electronic
filing by individuals--one in absolute numbers, the other as a percent
of total filings. Overlap could cloud the bottom line and obscure
performance results; Recommendations: Refrain from making changes to
official targets unless extenuating circumstances arise; Disclose
any extenuating circumstances in the Strategy and Program Plan and
other key documents; See note 1 to the table.
Measure name and definition[A]: Number of business returns
electronically filed (millions); The number of Forms 941, 1041, and
1065 filed electronically; FY 2001 target and actual: Target: 3.7;
Actual: 1.66; Weaknesses of measure and consequences: None observed;
Recommendations: None.
Measure name and definition[A]: Total number of electronically filed
returns (millions); The number of Forms 1040, 1040A, 1040EZ, 941,
1041 and 1065 filed electronically; FY 2001 target and actual:
Target: 43.7; Actual: 41.7; Weaknesses of measure and consequences:
Target changed during filing season from 45.7 to 43.7. Changing the
target in this instance was subjective in nature and resulted in an
objectivity weakness as well; Recommendations: Refrain from making
changes to official targets unless extenuating circumstances arise.
Disclose any extenuating circumstances in the Strategy and Program Plan
and other key documents.
Measure name and definition[A]: Number of information returns
electronically filed (millions); The total number of information
returns filed electronically. Includes Forms 1098, 1099, 5498, and W-2G
and Schedules K-1. Excludes Forms W-2 and 1099-SSA/RRB received from
the Social Security Administration; FY 2001 target and actual:
Target: 334.0; Actual: 322.8; Weaknesses of measure and
consequences: Some overlap with percent of information returns
electronically filed. Both measures show the extent of electronic
filing--one in absolute numbers, the other as a percent of total
filings. Overlap could cloud the bottom line and obscure performance
results; Recommendations: See note 1 to table.
Measure name and definition[A]: Percent of information returns
electronically filed; The percentage of total information returns
filed electronically; FY 2001 target and actual: Target: 24.4%;
Actual: not available[B]; Weaknesses of measure and consequences: Some
overlap with number of information returns electronically filed. Both
measures show the extent of electronic filing--one in absolute
numbers, the other as a percent of total filings. Overlap could cloud
the bottom line and obscure performance results; Recommendations: See
note 1 to table.
Measure name and definition[A]: Percent of individual returns
electronically filed; The percentage of total 1040 series tax returns
(Forms 1040, 1040A, and 1040EZ) filed electronically; FY 2001 target
and actual: Target: 31%; Actual: 32%; Weaknesses of measure and
consequences: Some overlap with number of 1040 series returns
electronically filed. Both measures show the extent of electronic
filing by individuals--one in absolute numbers, the other as a percent
of total filings. Overlap could cloud the bottom line and obscure
performance results; Recommendations: See note 1 to table.
Measure name and definition[A]: Number of payments received
electronically (millions); All individual and all business tax
payments made through the electronic federal tax payment system
(EFTPS); FY 2001 target and actual: Target: 64.4; Actual: 53.8;
Weaknesses of measure and consequences: Some overlap with percent of
payments received electronically. Both measures show the extent to
which payments are received electronically--one in absolute numbers,
the other as a percent of total receipts. Overlap could cloud the
bottom line and obscure performance results; Recommendations: See note
1 to table.
Measure name and definition[A]: Percent of payments received
electronically; The percentage of all individual and business tax
payments made through EFTPS; FY 2001 target and actual: Target: 30%;
Actual: not available[B]; Weaknesses of measure and consequences:
Some overlap with number of payments received electronically. Both
measures show the extent to which payments are received electronically-
-one in absolute numbers, the other as a percent of total receipts.
Overlap could cloud the bottom line and obscure performance results;
Recommendations: See note 1 to table.
Measure name and definition[A]: Number of electronic funds withdrawals/
credit card transactions (millions); The total number of credit card
and direct debit payments processed through EFTPS; FY 2001 target and
actual: Target: 1.0; Actual: 0.63; Weaknesses of measure and
consequences: Some overlap with number and percent of payments received
electronically. The payments covered by this measure are included in
the universe of payments covered by the other two measures. Overlap
could cloud the bottom line and obscure performance results;
Recommendations: See note 1 to table.
Measure name and definition[A]: Number of IRS digital daily Web site
hits (billions); The number of hits to IRS's Web site; FY 2001
target and actual: Target: 2.0; Actual: 2.3; Weaknesses of measure
and consequences: Measure is not clear and lacks reliability because,
for example, initial access counts as multiple hits and movement
throughout the Web site will count as additional hits;
Recommendations: Either discontinue use of this measure or revise the
way "hits" are calculated so that the measure more accurately reflects
usage.
Measure name and definition[A]: Number of downloads from "IRS.GOV"
(millions); The total number of tax forms downloaded from IRS's Web
site; FY 2001 target and actual: Target: 311; Actual: 309;
Weaknesses of measure and consequences: None observed;
Recommendations: None.
Measure name and definition[A]: Customer satisfaction - individual
taxpayers; The percentage of taxpayers who respond "very satisfied"
with individual E-file products; FY 2001 target and actual: Target:
76%; Actual: 83%; Weaknesses of measure and consequences: None
observed; Recommendations: None.
Measure name and definition[A]: Employee satisfaction - Electronic
filing and assistance; The percentage of survey participants that
answered with a 4 or 5 (two highest scores possible) to the question
"considering everything, how satisfied are you with your job?"; FY 2001
target and actual: Target: 66%; Actual: 38%; Weaknesses of measure
and consequences: None observed; Recommendations: None.
Note 1: We identified this measure as having partial overlap with another
measure. Electronic filing and assistance officials told us that each
of the overlapping measures we identified provides additional
information to managers. Determining whether or not to remove
overlapping measures is at management's discretion.
[A] The names of some measures have been modified slightly from the
official names used by IRS for ease of reading and consistency
purposes. The definitions of the measures listed in the table come from
various IRS sources, including interviews.
[B] Despite setting a target, actual data were not available because
electronic filing and assistance did not begin tracking the measure
until 2002.
Source: GAO comparison of IRS's December 13, 2000, July 25, 2001, and
October 29, 2001, Strategy and Program Plans with the attributes in
appendix I.
[End of table]
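The reliability weakness noted for the Web site hits measure in table 7 stems from how hits are counted: a single page view typically generates one hit for the page itself plus one for each embedded resource, and further navigation adds more. The sketch below uses an invented server log to show how hits can exceed both page views and distinct visitors:

```python
# Hypothetical illustration of why raw "hits" overstate Web site usage.
# The log entries below are invented; each tuple is (visitor, resource).
log = [
    ("userA", "/index.html"), ("userA", "/logo.gif"),
    ("userA", "/style.css"), ("userA", "/forms.html"),
    ("userA", "/logo.gif"),
    ("userB", "/index.html"), ("userB", "/logo.gif"),
    ("userB", "/style.css"),
]

hits = len(log)  # what the hits measure counts: every resource request
page_views = sum(1 for _, path in log if path.endswith(".html"))
visitors = len({user for user, _ in log})

print(hits, page_views, visitors)  # 8 3 2
```

In this invented log, eight hits correspond to only three page views by two visitors, which is why hits do not represent the actual number of times the site is accessed.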
Field Assistance Performance Measures:
Of the 53 performance measures in our review, 14 are for field
assistance. Table 8 has information about each of the 14 field
assistance measures.
Table 8: Field Assistance Performance Measures:
Measure name and definition[A]: Customer satisfaction; From surveys
established in 1998, an index was created to represent overall customer
satisfaction with field assistance services, with a "7" being the
best.[B]; FY 2001 target and actual: Target: 6.5 average score;
Actual: 6.4 average score; Weaknesses of measure and consequences:
None identified; Recommendations: None.
Measure name and definition[A]: Return preparation contacts; Total
number of customers assisted with tax return preparation, including
electronic and non-electronic tax return preparation at taxpayer
assistance centers (TAC); FY 2001 target and actual: Target: 979,206;
Actual: 1,009,387; Weaknesses of measure and consequences: Name,
definition, and formula of measure are not clear; Significant manual
data collection process impedes reliability because of the potential
for errors and inconsistencies that could affect the accuracy of the
measure and conclusions about the extent to which performance goals
have been achieved; Some overlap with return preparation units
measure. Both measures attempt to show number of services provided, but
the contact measure takes the number of taxpayers served into account
and the units measure counts the number of returns prepared for those
taxpayers served. Overlap could cloud the bottom line and obscure
performance results; Recommendations: Make the name and/or definition
of the measure more clear to indicate what is and is not included in
the formula; See note 1 to the table; See note 2 to the table.
Measure name and definition[A]: Geographic coverage; Percentage of
W&I taxpayer population with distinct characteristics, behaviors, and
needs for face-to-face assistance within a 45-minute commuting distance
from a TAC; FY 2001 target and actual: Target: 70%; Actual: 74%;
Weaknesses of measure and consequences: Name, definition, and formula
of measure are not clear; uncertainties exist among IRS officials about
what is and is not included in the measure; The formula does not
include all facilities, which could lead to misinterpreted results or a
failure to properly identify alternative facility types to resolve
access problems; Because the formula does not include all facilities,
it is difficult for decision makers to determine if, when, and where
additional TACs are needed; Recommendations: Make the name and/or
definition of the measure more clear to indicate what is and is not
included in the formula; Revise the formula to better reflect (1)
the various types of field assistance facilities, including alternate
sites and kiosks; (2) the types of services provided by each facility;
and (3) the facility‘s operating hours.
Measure name and definition[A]: Return preparation units; Actual
number of tax returns prepared, in whole or in part, in a TAC or
alternative site. (Multiple returns may be prepared for a single
customer.); FY 2001 target and actual: Target: not available; Actual:
not available; Weaknesses of measure and consequences: Name,
definition, and formula of measure are not clear; Target to be set
upon completion of data collection.[C]; Significant manual data
collection process impedes reliability because of the potential for
errors and inconsistencies that could affect the accuracy of the
measure and conclusions about the extent to which performance goals
have been achieved; Some overlap with return preparation contacts.
Both measures attempt to show number of services provided, but the
contact measure takes the number of taxpayers served into account and
the units measure counts the number of returns prepared for those
taxpayers served. Overlap could cloud the bottom
line and obscure performance results; Recommendations: Make the name
and/or definition of the measure more clear to indicate what is and is
not included in the formula; See note 1 to the table; See note 2
to the table.
Measure name and definition[A]: TACs total contacts; Total number of
customers assisted, including number of customers assisted with tax
return preparation, at TACs and alternate sites and via mobile
services. All face-to-face, telephone, and correspondence contacts are
included; FY 2001 target and actual: Target: 9,116,099; Actual:
9,681,330; Weaknesses of measure and consequences: Name, definition,
and formula of measure are not clear; Significant manual data
collection process impedes reliability because of the potential for
errors and inconsistencies that could affect the accuracy of the
measure and conclusions about the extent to which performance goals
have been achieved; Recommendations: Make the name and/or definition
of the measure more clear to indicate what is and is not included in
the formula; See note 1 to the table.
Measure name and definition[A]: Forms contacts; Total number of
customers actually assisted by employees at TACs, alternate sites, and
via mobile services by (1) providing forms from stock or (2) using a
CD-ROM; FY 2001 target and actual: Target: 2,331,000; Actual:
2,388,039; Weaknesses of measure and consequences: Name, definition,
and formula of measure are not clear; Significant manual data
collection process impedes reliability because of the potential for
errors and inconsistencies that could affect the accuracy of the
measure and conclusions about the extent to which performance goals
have been achieved; Recommendations: Make the name and/or definition
of the measure more clear to indicate what is and is not included in
the formula; See note 1 to the table.
Measure name and definition[A]: Tax law contacts; Total number of
customers assisted in TACs, alternate sites, and via mobile services
with inquiries involving general tax law questions, non-account related
IRS procedures, preparation or review of Forms W-7, Individual Taxpayer
Identification Number documentation verification or rejection, a form
request where probing requiring technical tax law training takes place,
and assisting customers with audit reconsideration; FY 2001 target and
actual: Target: not available; Actual: 1,787,338; Weaknesses of
measure and consequences: Name, definition, and formula of measure are
not clear; Target to be set upon completion of data collection.[C];
Significant manual data collection process impedes reliability
because of the potential for errors and inconsistencies that could
affect the accuracy of the measure and conclusions about the extent to
which performance goals have been achieved; Recommendations: Make the
name and/or definition of the measure more clear to indicate what is
and is not included in the formula; See note 1 to the table.
Measure name and definition[A]: Account contacts; Total number of
customers assisted in TACs, alternate sites, and via mobile services
with inquiries involving account related inquiries including math error
notices, Integrated Data Retrieval System work, payments not attached
to a tax return, CP2000 inquiries, Individual Taxpayer Identification
Number issues requiring account research, the issuance of Form 809
receipts, and account related procedures; FY 2001 target and actual:
Target: not available; Actual: not available; Weaknesses of measure
and consequences: Name, definition, and formula of measure are not
clear; Target to be set upon completion of data collection.[C];
Significant manual data collection process impedes reliability because
of the potential for errors and inconsistencies that could affect the
accuracy of the measure and conclusions about the extent to which
performance goals have been achieved; Recommendations: Make the name
and/or definition of the measure more clear to indicate what is and is
not included in the formula; See note 1 to the table.
Measure name and definition[A]: Other contacts; Total number of
customers assisted in TACs, alternate sites, and via mobile services
with Form 2063, U.S. Departing Alien Income Tax statement, date
stamping tax returns when the customer is present, non-receipt or
incorrect W-2 inquiries, general information such as Service Center
address and directions to other agencies; FY 2001 target and actual:
Target: 3,869,000; Actual: 4,496,566; Weaknesses of measure and
consequences: Name, definition, and formula of measure are not clear;
Significant manual data collection process impedes reliability
because of the potential for errors and inconsistencies that could
affect the accuracy of the measure and conclusions about the extent to
which performance goals have been achieved; Recommendations: Make the
name and/or definition of the measure more clear to indicate what is
and is not included in the formula; See note 1 to the table.
Measure name and definition[A]: Tax law accuracy; The quality of
service provided to TAC customers. Specifically, the accuracy of
responses concerning issues involving tax law; FY 2001 target and
actual: Target: not available; Actual: not available; Weaknesses of
measure and consequences: Name, definition, and formula of measure are
not clear; Target to be set upon completion of data collection.[C];
Recommendations: Make the name and/or definition of the measure more
clear to indicate what is and is not included in the formula.
Measure name and definition[A]: Accounts/notices accuracy; The
quality of service provided to TAC customers. Specifically, the
accuracy of responses and/or IDRS transactions concerning issues
involving account work and notices; FY 2001 target and actual:
Target: not available; Actual: not available; Weaknesses of measure
and consequences: Name, definition, and formula of measure are not
clear; Target to be set upon completion of data collection.[C];
Recommendations: Make the name and/or definition of the measure more
clear to indicate what is and is not included in the formula.
Measure name and definition[A]: Return preparation accuracy; The
quality of service provided to TAC customers. Specifically, the
accuracy of tax returns prepared in a TAC; FY 2001 target and actual:
Target: not available; Actual: not available; Weaknesses of measure
and consequences: Name, definition, and formula of measure are not
clear; Target to be set upon completion of data collection.[C];
Recommendations: Make the name and/or definition of the measure more
clear to indicate what is and is not included in the formula.
Measure name and definition[A]: Employee satisfaction; The percentage
of survey participants that answered with a 4 or 5 (two highest scores
possible) to the question "considering everything, how satisfied are
you with your job?"; FY 2001 target and actual: Target: 62%;
Actual: 51%; Weaknesses of measure and consequences: None observed;
Recommendations: None.
Measure name and definition[A]: Alternate contacts; Total number of
customers assisted at kiosks, mobile units, and alternate sites. It
includes all face-to-face (including return preparation), telephone,
and correspondence contacts; FY 2001 target and actual: Target: not
available; Actual: not available; Weaknesses of measure and
consequences: Target to be set upon completion of data collection.[C];
Significant manual data collection process impedes reliability
because of the potential for errors and inconsistencies that could
affect the accuracy of the measure and conclusions about the extent to
which performance goals have been achieved; Recommendations: See note
1 to the table.
Note 1: IRS expects to minimize this potential for errors and
inconsistency by equipping all of its TACs with an on-line automated
tracking and reporting system known as the Queuing Management System
(Q-Matic). This system is expected, among other things, to more
efficiently monitor customer traffic flow and eliminate staff time
spent completing Form 5311. Because IRS is in the process of
implementing Q-Matic, we are not making any recommendation.
Note 2: We identified this measure as having partial overlap with
another measure. Field assistance officials agreed with our assessment
and stated that they plan to remove the "return preparation contacts"
measure from the Strategy and Program Plan. The following
recommendation applies to two measures, as noted in the table: "ensure
that plans to remove overlapping measures are implemented."
[A] The names of some measures have been modified slightly from the
official names used by IRS for ease of reading and consistency
purposes. The definitions of the measures listed in the table come from
various IRS sources, including interviews.
[B] Field assistance implemented a new customer satisfaction survey in
fiscal year 2002. The index was changed, and a rating of "5" is now
best.
[C] Although these measures did not have a measurable target in place,
IRS is taking reasonable steps to develop a target.
Source: GAO comparison of IRS's December 13, 2000, July 25, 2001, and
October 29, 2001, Strategy and Program Plans with the attributes in
appendix I.
[End of table]
Submission Processing Performance Measures:
Of the 53 performance measures in our review, 11 are for submission
processing.[Footnote 45] Table 9 has information about each of the 11
submission processing performance measures.
Table 9: Submission Processing Performance Measures:
Measure name and definition[A]: Individual 1040 series returns filed
(paper)[B]; The number of Forms 1040, 1040A, and 1040EZ filed at
the eight W&I submission processing centers; FY 2001 target and
actual: Target: 87,869,000; Actual: 74,972,667; Weaknesses of
measure and consequences: None observed; Recommendations: None.
Measure name and definition[A]: Number of individual refunds issued
(paper)[B]; The number of individual refunds issued by the eight
W&I submission processing centers after the initial filing of a
return; FY 2001 target and actual: Target: 48,000,000; Actual:
45,456,534; Weaknesses of measure and consequences: None observed;
Recommendations: None.
Measure name and definition[A]: Employee satisfaction; The
percentage of survey participants that answered with a 4 or 5 (two
highest scores possible) to the question "considering everything, how
satisfied are you with your job?"; FY 2001 target and actual: Target:
60%; Actual: 54%; Weaknesses of measure and consequences: None
observed; Recommendations: None.
Measure name and definition[A]: Refund timeliness - individual
(paper)[B]; The percentage of refunds issued to taxpayers within 40
days of the date IRS received the individual income tax return; FY
2001 target and actual: Target: 96.1%; Actual: 96.75%; Weaknesses
of measure and consequences: Potential reliability weakness because
data collected manually and evaluations of data based on judgment.
Possible inconsistencies affect the accuracy of the measure and
conclusions about the extent to which performance goals have been
achieved; Recommendations: Based on the results of effectiveness
studies, establish goals to improve consistency, as needed.
Measure name and definition[A]: Notice error rate; The percentage of
incorrect submission processing master file notices issued to taxpayers
(includes systemic errors).[C]; FY 2001 target and actual: Target:
8.1%; Actual: 14.84%; Weaknesses of measure and consequences:
Potential reliability weakness because data collected manually and
evaluations of data based on judgment. Possible inconsistencies affect
the objectivity of the measure and conclusions about the extent to
which performance goals have been achieved; Recommendations: Based on
the results of effectiveness studies, establish goals to improve
consistency, as needed.
Measure name and definition[A]: Refund error rate - individual
(paper)[B]; The percentage of refunds that have errors caused by IRS
involving, for example, a person's name or refund amount (includes
systemic errors).[C]; FY 2001 target and actual: Target: 13.6%;
Actual: 9.75%; Weaknesses of measure and consequences: Potential
reliability weakness because data collected manually and evaluations of
data based on judgment. Possible inconsistencies affect the accuracy of
the measure and conclusions about the extent to which performance goals
have been achieved; Recommendations: Based on the results of
effectiveness studies, establish goals to improve consistency, as
needed.
Measure name and definition[A]: Letter error rate; The percentage of
letters with errors issued to taxpayers by submission processing
employees (includes systemic errors).[C]; FY 2001 target and actual:
Target: 11.9%; Actual: 13.10%; Weaknesses of measure and
consequences: Potential reliability weakness because data collected
manually and evaluations of data based on judgment. Possible
inconsistencies affect the objectivity of the measure and conclusions
about the extent to which performance goals have been achieved;
Recommendations: Based on the results of effectiveness studies,
establish goals to improve consistency, as needed.
Measure name and definition[A]: Deposit timeliness (paper)[B]; Lost
opportunity cost of money received by IRS but not deposited in the bank
by the next day, per $1 billion of deposits, using a constant 8% annual
interest rate; FY 2001 target and actual: Target: $746,712;
Actual: $878,867; Weaknesses of measure and consequences: None
observed; Recommendations: None.
Measure name and definition[A]: Deposit error rate; The percentage of
payments misapplied based on the taxpayer's intent; FY 2001 target and
actual: Target: 4.9%; Actual: not available[D]; Weaknesses of
measure and consequences: Objectivity weakness because sampling plan
not consistently implemented; Potential reliability weakness because
data collected manually and evaluations of data based on judgment.
Possible inconsistencies affect the accuracy of the measure and
conclusions about the extent to which performance goals have been
achieved; Recommendations: See note 1 to the table; Based on the
results of effectiveness studies, establish goals to improve
consistency, as needed.
Measure name and definition[A]: Refund interest paid (per $1 million of
refunds); The amount of refund interest paid per $1 million of
refunds issued; FY 2001 target and actual: Target: $112; Actual:
$128.63; Weaknesses of measure and consequences: None observed;
Recommendations: None.
Measure name and definition[A]: Submission processing productivity;
The weighted workload or work units processed per staff year expended;
FY 2001 target and actual: Target: 28,787; Actual: 28,537;
Weaknesses of measure and consequences: Not clear because (1)
definition is not clearly stated, (2) managers do not understand their
unit's contribution to the formula, and (3) unit managers do not use the
measure to assess performance; Recommendations: Revise the measure so
it provides more meaningful information to users.
Note 1: We are not making a recommendation regarding the objectivity
weakness for the "deposit error rate" measure because the Treasury
Inspector General for Tax Administration recommended that IRS take
steps to ensure that the sampling plan is being implemented
consistently, and IRS reported that steps have been taken.
[A] The names of some measures have been modified slightly from the
official names used by IRS for ease of reading and consistency
purposes. The definitions of the measures listed in the table come from
various IRS sources, including interviews.
[B] "Paper" means that returns filed electronically (or their resulting
refunds) are not included in the measure.
[C] A systemic error is an error caused by a computer programming error
as opposed to an IRS employee.
[D] IRS could not provide actual data on this measure due to
discrepancies in its data.
Source: GAO comparison of IRS's December 13, 2000, July 25, 2001, and
October 29, 2001, Strategy and Program Plans with the attributes in
appendix I.
[End of table]
[End of section]
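Two of the table 9 measures are defined as simple normalized ratios: deposit timeliness (interest forgone on late deposits per $1 billion of deposits, at a constant 8 percent annual rate) and refund interest paid per $1 million of refunds. The sketch below is one reading of those definitions; the dollar amounts and day counts are invented, and the exact IRS computations may differ:

```python
# Sketch of two table 9 formulas, based on the definitions in the table.
# All inputs are invented; the actual IRS computations may differ.
ANNUAL_RATE = 0.08  # constant 8% rate specified in the deposit measure

def deposit_timeliness(late_deposits, total_deposits):
    """Interest forgone on deposits not banked by the next day,
    normalized per $1 billion of total deposits.
    late_deposits: list of (amount, days_late) pairs."""
    lost = sum(amount * ANNUAL_RATE / 365 * days
               for amount, days in late_deposits)
    return lost / (total_deposits / 1_000_000_000)

def refund_interest_per_million(interest_paid, total_refunds):
    """Refund interest paid per $1 million of refunds issued."""
    return interest_paid / (total_refunds / 1_000_000)

late = [(10_000_000, 2), (5_000_000, 1)]  # invented late deposits
print(round(deposit_timeliness(late, 50_000_000_000), 2))       # 109.59
print(refund_interest_per_million(6_400_000, 50_000_000_000))   # 128.0
```

Normalizing each measure by volume ($1 billion deposited, $1 million refunded) lets performance be compared across centers and years despite differing workloads.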
Appendix III: Comments from the Internal Revenue Service:
Note: GAO comments supplementing those in the report text appear at the
end of this appendix.
DEPARTMENT OF THE TREASURY
INTERNAL REVENUE SERVICE
WASHINGTON, D.C.
20224:
November 1, 2002:
Mr. James R. White, Director, Tax Issues, U.S. General Accounting
Office
441 G Street, N.W. Washington, D.C. 20548:
Dear Mr. White:
I appreciate your recognition of the substantial progress we have made
in implementing our balanced measures and our strategic planning
process. We issued our first Strategy and Program Plan (SPP) and
performance measures in Fiscal Year (FY) 2000, which I believe was
great progress in a short period of time. We continue to gain
experience and focus on the key attributes of performance measures as
we use them in our day-to-day operations. Your observation that this is
an ongoing process is exactly on point. The observations of your staff
will benefit us as we continue to improve our performance measures. I
believe your report is an insightful review of the measures we
developed for use in FY 2001. We recently completed our SPP and the
related performance measures for FYs 2003-2004. We will consider your
suggestions as we review our current plan and develop plans for our
next SPP cycle.
I am particularly impressed with the detailed definitions,
explanations, and examples your staff developed for the nine attributes
of successful performance measures. I believe the Wage and Investment
(W&I) Division can use these standards as a helpful checklist when they
develop future performance measures.
I also was pleased to note your observation that our measures had many
of the attributes for successful performance. This indicates that we
appropriately developed and properly targeted key performance measures.
I also agree the measures that did not satisfy all of the attributes
will give us opportunities for further refinement rather than
invalidate their overall value. As you noted, we have several
initiatives underway to continue improving these measures. Overall,
your report is objective and balanced.
I want to share some additional points for your consideration:
*Although the filing season is our busiest and most visible period, our
performance measures are for the entire fiscal year.
*The report is focused on the performance measures and their
relationship to the SPP without any mention of the importance of the
Operating Units' Business Plans. The Business Plan is a derivative of
the SPP that is linked to tactical actions, resource allocations, and
performance milestones that drive the day-to-day activities and goals
of the Operating Units. The Business Plan is the primary vehicle for
accountability through the Business Performance Review Process and
individual performance appraisals.
*Few of our performance measures are isolated measures. No individual
measure can adequately reflect the broad range of our responsibilities
and our mission. We manage our programs by reviewing performance
measures, diagnostic measures, performance indicators, and numerous
other data sources to ensure a broad perspective of our service to our
customers.
I have addressed the recommendations in more detail below:
Recommendations for Executive Action:
We recommend that the Commissioner of Internal Revenue Service direct
the appropriate officials to do the following:
Recommendation 1:
Take steps to ensure that the agencywide goals clearly align with the
operating division goals and performance measures for each of the four
areas reviewed. Specifically, (1) clearly document the relationship
among agencywide goals, operating division goals, and performance
measures (the other three program areas may want to consider developing
a template similar to the one Field Assistance developed, shown in
figure 4) and (2) ensure that the relationship among goals and measures
is communicated to staff at all levels of the organization.
Response:
We agree with this recommendation and in the next SPP, we will review
the performance measures for the four W&I areas to ensure that we align
and document their relationship to operating division goals and
agencywide goals. The Operating Units' Business Plans communicate the
relationship of SPP goals and measures throughout the organization.
Staff at all levels should recognize their role in delivering the
Business Plan. The four program areas reviewed in the W&I Division
distributed information on the SPP and Business Plan through annual
leadership conferences at each site for FYs 2002 and 2003.
Recommendation 2:
Make the name and definition of several field assistance measures
(i.e., "geographic coverage," "return preparation contacts," "return
preparation units," "TACs total contacts," "forms contacts," "tax law
contacts," "account contacts," "other contacts," "tax law accuracy,"
"account/notice accuracy," and "return preparation accuracy") clearer
to indicate what is and is not included in the formula.
Response:
Field Assistance recently updated the data dictionary for FY 2003. The
updated dictionary addresses your recommendation on clarity and
specifically identifies what is or is not included in the formulas. We
gave a copy of the updated document to your staff. We have updated the
data dictionary to include the purpose of the performance measurement,
the data limitations associated with data gathering of the measure, and
calculation changes from the prior year. It also provides a complete
description of the methodology used in capturing the data, the critical
path of how the measure originates and moves through the process, and
the level of reviews to ensure quality. Field Assistance uses the
current data dictionary in reporting measures to all levels of the
organization.
Recommendation 3:
As discussed in the body of this report and in appendix II, modify the
formulas used to compute various measures to improve clarity. If
formulas cannot be implemented in time for the next issuance of the
SPP, then modify the name and definition of the following measures so
it is clearer what is or is not included in the measure.
Recommendation 3(a):
Remove automated calls from the formula for the "CSR level of service"
measure.
Response:
We published the definition of this measure in the SPP, Data
Dictionary, Measures Matrix, and numerous other sources. We believe
that including the count of callers who choose an automated service
while waiting for CSR service is appropriate. The formula accurately
reflects the percentage of customers that wanted to speak to a CSR and
subsequently received service. While we are promoting the use of
automated services as an alternative to CSR service, we expect
increases to occur before a customer enters the CSR queue. The growth
in automation service while in queue for CSR service should remain
small or decrease. We do not believe that this measure merits further
change.
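The disagreement turns on whether calls completed in automated services while waiting in the CSR queue count as "served." The following sketch contrasts the two variants; the function names, inputs, and call counts are illustrative assumptions, not IRS's official formula or data:

```python
# Illustrative sketch only: names and volumes below are assumptions for
# exposition, not IRS's published formula or figures.

def csr_level_of_service(answered_by_csr, completed_in_queue_automation,
                         attempts_to_reach_csr):
    """IRS's view: callers who chose an automated service while waiting
    in the CSR queue are counted as having received service."""
    served = answered_by_csr + completed_in_queue_automation
    return 100.0 * served / attempts_to_reach_csr

def csr_level_of_service_excluding_automation(answered_by_csr,
                                              attempts_to_reach_csr):
    """GAO's recommended variant: only calls a CSR actually answered count."""
    return 100.0 * answered_by_csr / attempts_to_reach_csr

# Hypothetical volumes show how the choice moves the measure:
print(csr_level_of_service(6_000_000, 500_000, 10_000_000))              # 65.0
print(csr_level_of_service_excluding_automation(6_000_000, 10_000_000))  # 60.0
```

As long as in-queue automation stays small, as IRS expects, the two variants stay close; the gap widens if that count grows.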
Recommendation 3(b):
Revise the "CSR response level" measure to include calls from taxpayers
who tried to reach a CSR but did not, such as those who (1) hung up
while waiting to speak to a CSR, (2) were provided access only to
automated services and hung up, and (3) received a busy signal.
Response:
We do not agree that we should modify this measure. The methodology and
the 30-second threshold for this measure are in accordance with the
industry standard. This measure applies only to calls answered and
should not include abandoned calls, automated service disconnects, or
busy signals. Altering this measure would deviate from the industry
standards and hinder our ability to gauge success in meeting this
"world class service" goal.
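The two positions above differ only in the denominator: answered calls (the industry convention IRS cites) versus all attempts to reach a CSR (GAO's recommendation). A minimal sketch under assumed, hypothetical call counts:

```python
# Illustrative sketch only: call categories and counts are assumptions,
# not IRS data.

def csr_response_level(served_within_threshold, calls_answered):
    """IRS's view: percentage of answered calls in which the caller reached
    a CSR within the threshold (30 seconds, per the industry standard)."""
    return 100.0 * served_within_threshold / calls_answered

def csr_response_level_all_attempts(served_within_threshold, calls_answered,
                                    abandoned, automated_disconnects, busy):
    """GAO's recommended variant: the denominator also counts callers who
    tried to reach a CSR but did not (hang-ups, disconnects, busy signals)."""
    attempts = calls_answered + abandoned + automated_disconnects + busy
    return 100.0 * served_within_threshold / attempts

print(csr_response_level(8_000_000, 10_000_000))                         # 80.0
print(csr_response_level_all_attempts(8_000_000, 10_000_000,
                                      3_000_000, 1_000_000, 6_000_000))  # 40.0
```

With tens of millions of unanswered attempts in a filing season, the choice of denominator can change the reported level substantially.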
Recommendation 3(c):
Analyze and use new or existing data to determine why calls are
transferred and use the data to revise the "CSR services provided"
measure so that it only reflects transferred calls in which the caller
received help from more than one CSR (i.e., exclude calls in which a
CSR simply transferred the call and did not provide service).
Response:
We agree in concept with your recommendation. We are continuing to
examine previously collected data on transferred calls from FY 2002. We
are also studying the anticipated impact that our new Toll Free
Operating Strategy will have on this measure. We specifically designed
this strategy to simplify the scripts and telephone menus to make the
customer's self-selection process easier and more efficient. After
assessing the impact of the Toll Free Operating Strategy, we will then
review the recommendation for possible change in FY 2004.
Recommendation 3(d):
Either discontinue use of the "number of IRS digital daily Web site
hits" measure or revise the way "hits" are calculated so that the
measure more accurately reflects usage.
Response:
Due to privacy restrictions associated with the use of "cookies," we
cannot track the actual web site use. Instead, for FY 2003, we will
implement three new diagnostic indicators related to the web site.
These indicators (page view, unique visitors, and visits) will give us
additional information to track the system performance and gauge the
traffic on the web site. We will monitor these indicators for a year
and decide whether to include them as performance measures in the 2004-
2005 SPP.
We will also continue to measure the number of hits and downloads to
the web site. However, we will clarify the definition of "hits" to
reflect that each file requested by a visitor registers as a hit and
several hits can occur on each page.
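The distinction among hits, page views, and unique visitors can be illustrated with a toy request log; the log entries and file types below are invented for illustration:

```python
# Toy example: every requested file (page, image, stylesheet) registers as
# one "hit," so hits overstate usage; "page views" count content pages and
# "unique visitors" count distinct clients. All entries are invented.

requests = [
    ("10.0.0.1", "/index.html"),      # page
    ("10.0.0.1", "/logo.gif"),        # embedded image
    ("10.0.0.1", "/style.css"),       # stylesheet
    ("10.0.0.2", "/index.html"),      # page
    ("10.0.0.2", "/forms/1040.pdf"),  # downloaded form
]

hits = len(requests)                                    # every file request
page_views = sum(1 for _, path in requests
                 if path.endswith((".html", ".pdf")))   # content pages only
unique_visitors = len({client for client, _ in requests})

print(hits, page_views, unique_visitors)  # 5 3 2
```

One visit to a single page thus produces several hits, which is why hits alone do not reflect actual usage.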
Recommendation 3(e):
Revise Field Assistance's "geographic coverage" measure by ensuring
that the formula better reflects (1) the various types of field
assistance facilities, including alternate sites and kiosks; (2) the
types of services provided by each facility; and (3) the facility's
operating hours.
Response:
We agree that we should revise the geographic coverage description to
include more than just Taxpayer Assistance Centers (TAC). We are
working with representatives from the Office of Program
Evaluation and Risk Analysis to modify the formula to ensure that the
formula reflects the appropriate elements by June 30, 2003. In
addition, we will use the model to assist in determining the locations
for different delivery options.
Recommendation 3(f):
Revise Submission Processing's "productivity" measure so it provides
more meaningful information to users.
Response:
We recognize that this measure needs improvement. The measure's
complexity is driven by the broad range of returns and documents
processed and by numerous other variables that can affect efficiency. The current
measure seeks to account for those differences to ensure equity and
fairness in the measurement process. We have looked at alternative ways
to measure productivity but have not found a suitable replacement for
this measure. We will continue our efforts to develop a more meaningful
productivity measurement.
Recommendation 4:
Refrain from making changes to official targets, such as Electronic
Filing and Assistance did in FY 2001, unless extenuating circumstances
arise. Disclose any extenuating circumstances in the SPP and other key
documents.
Response:
We agree that we should only make changes to official targets under the
circumstances you describe and that disclosing these changes is
appropriate. This approach is consistent with our overall practice.
Recommendation 5:
Modify procedures for the toll-free customer satisfaction survey,
possibly by requiring that the administrators listen to the entire
call, to better ensure that the administrators (1) notify CSRs that
their call was selected for the survey as close to the end of the call
as possible and (2) can accurately answer the questions they are
responsible for on the survey.
Response:
We agree we can improve this process. We will instruct the
administrators to listen to each call from its beginning to as close to
the conclusion as practical. Formalizing this practice will also enable
the administrators to accurately answer the questions on the survey.
Recommendation 6:
Implement annual effectiveness studies to validate the accuracy of the
data collection methods used for the five telephone measures ("toll-
free tax law quality," "toll-free accounts quality," "toll-free tax law
correct response rate," "toll-free account correct response rate," and
"toll-free timeliness") subject to potential consistency problems. The
studies could determine the extent to which variation exists in
collecting data and recognize the associated impact on the affected
measures. For those measures, and for the five Submission Processing
measures that already have effectiveness studies in place ("refund
timeliness-individual (paper)," "notice error rate," "refund error
rate-individual (paper)," "letter error rate," and "deposit error
rate"), IRS should establish goals for improving consistency, as
needed.
Response:
We have ongoing processes to ensure that we properly administer the
collection methods for the five telephone measures to minimize
potential consistency problems. We do not agree that an annual
independent review by a non-CQRS analyst is merited. Members of the
Treasury Inspector General for Tax Administration (TIGTA) perform
in-depth oversight activities annually covering these collection
methods. While we will work to improve consistency, we do not agree
that we should incorporate a consistency improvement goal in the SPP
process.
Recommendation 7:
Ensure that plans to remove overlapping measures in Telephone and Field
Assistance are implemented.
Response:
We will continue our process of reviewing measures identified as
overlapping and deleting those that are truly redundant.
Recommendation 8:
As discussed in the body of this report, include the following missing
measures in the SPP in order to better cover governmentwide priorities
and achieve balance.
Recommendation 8(a):
In the spirit of provisions in the Chief Financial Officer's Act of
1990 and Financial Accounting Standards Number 4, develop a cost of
services measure using the best information currently available for
each of the four areas discussed in this report, recognizing data
limitations as prescribed by GPRA. In doing so, adhere to guidance,
such as Office of Management and Budget Circular A-76, and consider
seeking outside counsel to determine best or industry practices.
Response:
Development of cost of services measures for Telephone Assistance,
Electronic Filing and Assistance, Field Assistance, and Submission
Processing is dependent on Servicewide deployment of the Integrated
Financial System (IFS). The first release of IFS, scheduled for October
2003, will facilitate financial reporting and financial audits. The
second release of IFS, planned for March 2005, will include Property
and Performance Management. At this time, the development of cost of
services measures is directly linked to having a mechanism that
provides cost information for performance activities. The Service is
moving towards this goal with successful implementation of the IFS
system.
Recommendation 8(b):
Given the importance of automated telephone assistance, develop a
customer satisfaction survey and measure for automated assistance.
Response:
We agree that measuring customer satisfaction with automated services
is important. Our newer interactive Internet services have satisfaction
surveys incorporated in the program. We are continuing to upgrade our
automated services and will be implementing telephone system
architectural changes as part of the Customer Communications
Engineering Study. We will review your recommendation to evaluate the
benefit of programming and implementing a customer satisfaction survey
system based on outdated delivery systems.
Recommendation 8(c):
Put the "automated completion rate" measure back in the SPP after
revising the formula so that calls for recorded tax information are not
counted as completed when taxpayers hang up before receiving service.
Response:
We continue to track and monitor the "automated completion rate" as a
diagnostic measure. We do not plan to modify the formula nor do we
intend to reinstate it as a measure in the SPP.
Recommendation 8(d):
Add one or more quality measures to Electronic Filing and Assistance's
suite of measures in the SPP. Possible measures include "processing
accuracy," "refund timeliness, electronically filed," and "number of
electronic returns rejected."
Response:
The quality of electronic filing has consistently been high due to the
pre-submission checks integrated into the system. We do track and
monitor numerous diagnostic indicators that reflect the quality of
electronic filing. We use this data to determine if there are error
trends that need to be addressed. We do not believe incorporating these
indicators as a performance measure in the SPP would enhance the
electronic filing program.
Recommendation 8(e):
Re-implement Field Assistance‘s timeliness measure.
Response:
Field Assistance agrees that timeliness goals are important in
providing service to taxpayers; however, we found that this is
detrimental to quality service in TACs because the employees tend to
rush the customers when traffic is high. Realistic expectations give
our workers a framework for providing appropriate service to the
taxpayer, with the goal of taking the requisite time to deliver
complete and accurate assistance. We will continue to use positive and
negative feedback from customers responding to the "promptness of
service" section of the satisfaction survey as a gauge of service. In addition,
we are still tracking wait-times in locations equipped with the Queuing
Management System (Q-Matic) System. The Q-Matic System is an on-line
automated tracking and reporting system. We agree errors occur when
manual methods of tracking workload volume and staff hours are used. In
order to minimize reporting errors and better track wait-time, we plan
to equip all of our TACs with this system. We can have Q-Matic installed
and networked at all TACs nationwide by the end of FY 2004 with the
planned funding.
Recommendation 8(j):
Develop a measure that provides information about Field Assistance‘s
efficiency.
Response:
Field Assistance is implementing a performance monitoring system to
monitor productivity measures. We will use this system as a diagnostic
tool to identify strengths and weaknesses in organizational performance
measures, not as an evaluative tool, as we are a Section 1204
organization. We will test the system during FY 2003 to determine the
validity and usefulness of the data captured. At the end of the fiscal
year, we will decide whether to continue with the current system, or
modify it.
Again, I appreciate your observations and recommendations. If you have
questions or comments, please call Floyd Williams, Director,
Legislative Affairs, at (202) 622-3720.
Sincerely,
Charles O. Rossotti
Signed by Charles O. Rossotti
1. We recognize that IRS's performance measures cover entire fiscal
years. We reviewed 53 of the measures for all of fiscal year 2001,
and we reported the full year's results in appendix II.
2. We reviewed the business plans for all four program areas we
reviewed. Although we did not comment specifically about the
business performance review process in the report, we noted in the
background and field assistance sections that the business plans
communicate part of the relationship among the various goals and
measures.
3. Figure 4 shows an excerpt of field assistance's business unit plan.
As noted in the figure, the template used to communicate the
relationship between goals and measures is missing some key
components. Figure 2 is our attempt to show the complete relationship
among IRS's various goals and measures--it is based on multiple
documents.
[End of section]
Appendix IV: GAO Contacts and Staff Acknowledgments:
GAO Contacts:
James White (202) 512-9110:
Dave Attianese (202) 512-9110:
Acknowledgments:
In addition to those named above, Bob Arcenia, Heather Bothwell, Rudy
Chatlos, Grace Coleman, Evan Gilman, Ron Heisterkamp, Ronald Jones,
John Lesser, Allen Lomax, Theresa Mechem, Libby Mixon, Susan Ragland,
Meg Skiba, Joanna Stamatiades, and Caroline Villanueva made key
contributions to this report.
[End of section]
Bibliography:
To determine whether the Internal Revenue Service's (IRS) performance
goals and measures in four key program areas demonstrate results, are
limited to the vital few, cover multiple program priorities, and
provide useful information in decision making, we developed attributes
of performance goals and measures. These attributes were largely based
on previously established criteria found in prior GAO reports; our
review of key legislation, such as the Government Performance and
Results Act of 1993 (GPRA) and the IRS Restructuring and Reform Act of
1998; and other performance management literature. Sources we referred
to for this report follow.
101st Congress. Chief Financial Officer's Act of 1990. P.L. 101-576.
Washington, D.C.: January 23, 1990.
103rd Congress. Government Performance and Results Act of 1993. P.L.
103-62. Washington, D.C.: January 5, 1993.
103rd U.S. Senate. The Senate Committee on Governmental Affairs GPRA
Report. Report 103-58. Washington, D.C.: June 16, 1993.
105th Congress. IRS Restructuring and Reform Act. P.L. 105-206.
Washington, D.C.: July 22, 1998.
Internal Revenue Service. Managing Statistics in a Balanced Measures
System. Handbook 105.4. Washington, D.C.: October 1, 2000.
The National Partnership for Reinventing Government. Balancing
Measures: Best Practices in Performance Management. Washington, D.C.:
August 1, 1999.
Office of Management and Budget, Preparation and Submission of Budget
Estimates. Circular No. A-11, Revised. Transmittal Memorandum No. 72.
Washington, D.C.: July 12, 1999.
Office of Management and Budget. Circular A-76, Revised. Supplemental
Handbook, Performance of Commercial Activities. Washington, D.C.: March
1996 (Revised 1999).
Office of Management and Budget. Managerial Cost Accounting Concepts
and Standards for the Federal Government. Statement of Federal
Financial Accounting Standards, Number 4. Washington, D.C.: July 31,
1995:
[End of section]
Related Products:
U.S. General Accounting Office. Internal Revenue Service: Assessment of
Budget Request for Fiscal Year 2003 and Interim Results of 2002 Tax
Filing Season. (GAO-02-580T). Washington, D.C.: April 9, 2002.
U.S. General Accounting Office. Tax Administration: Assessment of IRS's
2001 Tax Filing Season. (GAO-02-144). Washington, D.C.: December 21,
2001.
U.S. General Accounting Office. Human Capital: Practices That Empowered
and Involved Employees (GAO-01-1070). Washington, D.C.: September 14,
2001.
U.S. General Accounting Office. Managing For Results: Emerging Benefits
From Selected Agencies' Use of Performance Agreements (GAO-01-115).
Washington, D.C.: October 30, 2000.
U.S. General Accounting Office. Agency Performance Plans: Examples of
Practices That Can Improve Usefulness to Decisionmakers
(GAO/GGD/AIMD-99-69). Washington, D.C.: February 26, 1999.
U.S. General Accounting Office. The Results Act: An Evaluator's Guide
to Assessing Agency Annual Performance Plans (GAO/GGD-10.1.20).
Washington, D.C.: April 1, 1998.
U.S. General Accounting Office. Executive Guide: Effectively
Implementing the Government Performance and Results Act (GAO/GGD-96-
118). Washington, D.C.: June 1996.
U.S. General Accounting Office. Executive Guide: Improving Mission
Performance Through Strategic Information Management and Technology
(GAO/AIMD-94-115). Washington, D.C.: May 1, 1994.
FOOTNOTES
[1] Although April 15 is generally considered the end of the filing
season, millions of taxpayers get extensions from IRS that allow them
to delay filing until as late as October 15.
[2] IRS tracks its performance in providing filing season-related
telephone service through mid-July instead of April because it receives
many filing season-related calls after April 15 from taxpayers who are
inquiring about the status of their refunds or responding to notices
they received from IRS related to returns they filed.
[3] Some earlier work includes U.S. General Accounting Office,
Executive Guide: Effectively Implementing the Government Performance
and Results Act, GAO/GGD-96-118 (Washington, D.C.: June 1996) and U.S.
General Accounting Office, The Results Act: An Evaluator's Guide to
Assessing Agency Annual Performance Plans, GAO/GGD-10.1.20
(Washington, D.C.: Apr. 1998).
[4] The four characteristics are overarching, thus there is not
necessarily a direct link between any one attribute and any one
characteristic.
[5] U.S. General Accounting Office, Internal Revenue Service:
Assessment of Budget Request for Fiscal Year 2003 and Interim Results
of 2002 Tax Filing Season, GAO-02-580T (Washington, D.C.: Apr. 9,
2002).
[6] GPRA, P.L. 103-62, was enacted to hold federal agencies accountable
for achieving program results. IRS's balanced measurement system is
consistent with the intent of GPRA.
[7] The IRS Restructuring and Reform Act of 1998, P.L. 105-206, was
enacted on July 22, 1998, and calls for broad reforms in areas such as
the structure and management of IRS, electronic filing, and taxpayer
protection and rights.
[8] The other components include revamped business practices, customer-
focused operating divisions, management roles with clear
responsibility, and new technology.
[9] As part of IRS's reorganization that took effect in October 2000,
IRS established four operating divisions that serve specific groups of
taxpayers. The four divisions are (1) Wage and Investment, (2) Small
Business and Self-Employed, (3) Large and Mid-Size Businesses, and (4)
Tax Exempt and Government Entities.
[10] The Strategy and Program Plans we used in our analysis had actual
performance information for part of the current fiscal year and
planning information for the current and two subsequent fiscal years.
An IRS manager said the agency plans to stop including actual
information in Strategy and Program Plans prepared after fiscal year
2002.
[11] GAO/GGD-96-118.
[12] Office of Management and Budget, Preparation and Submission of
Budget Estimates, Circular No. A-11, Revised. Transmittal Memorandum
No. 72 (Washington, D.C.:
July 12, 1999).
[13] IRS, Managing Statistics in a Balanced Measures System, Handbook
105.4 (Washington, D.C.: Oct. 1, 2000).
[14] The data dictionary is an IRS document that provides information
on performance measures, such as the measure's name, description, and
methodology.
[15] IRS deleted its "automated completion rate" measure in the 2002
Strategy and Program Plan and now has 14 telephone measures. However,
IRS still tracks that measure.
[16] There were about 30 million of these calls in fiscal year
2001, which can have a significant impact on the "CSR response level"
measure.
[17] CSRs answer about 24 percent of all incoming calls.
[18] As of January 2002, there were 53 quality reviewers in the
Centralized Quality Review Site: 26 for tax law inquiries, 20 for
account inquiries, and 7 others.
[19] CQRS is responsible for monitoring the accuracy of telephone
assistance. It produces various reports that show call sites what
errors CSRs are making so site managers can take action to reduce those
errors.
[20] IRS significantly modified its five quality measures beginning in
October 2002 based on the results of its initiative, which was aimed at
redesigning the way IRS measures quality to better capture the
taxpayer's experience. Specifically, IRS renamed the toll-free correct
response rate measures for tax law and account inquiries to "customer
accuracy" for tax law or account inquiries. Plans call for the tax
quality measures for tax law and account inquiries to be discontinued,
but reported in fiscal year 2003 for trending and comparative purposes.
IRS also eliminated the "toll-free timeliness" measure and replaced it
with a new "quality timeliness" measure. Finally, IRS implemented a new
measure called "professionalism."
[21] The Chief Financial Officer's Act, P.L. 101-576, underscores the
importance of improving financial management in the federal government.
Among other things, it calls for developing and reporting cost
information.
[22] Statement of Federal Financial Accounting Standard Number 4,
"Managerial Cost Accounting Concepts and Standards for the Federal
Government," is aimed at providing reliable and timely information on
the full cost of federal programs, their activities, and outputs.
[23] The Annual Performance Plan is a key document IRS produces each
year to comply with the requirements of GPRA. It highlights a limited
number of IRS performance measures.
[24] U.S. General Accounting Office, Tax Administration: Assessment of
IRS‘s 2001 Tax Filing Season, GAO-02-144 (Washington, D.C.: Dec. 21,
2001).
[25] 1040 series returns are individual income tax returns filed on
Forms 1040, 1040A, and 1040EZ.
[26] The masterfile is the system where most of IRS's taxpayer data
resides.
[27] "Processing accuracy" refers to the total number of returns that
do not go to the error resolution system. Transactions that fail
validity checks during processing are corrected through the error
resolution system.
[28] "Refund timeliness, electronically filed" is the amount of time it
takes for taxpayers to receive their refunds when filing
electronically.
[29] Electronic returns can be rejected, for example, if taxpayers fail
to include required Social Security numbers. IRS requires taxpayers to
correct such errors before it will accept their electronic returns.
[30] Alternate sites are staffed with field assistance employees and
offer limited face-to-face services, such as preparing returns and
distributing forms. Field assistance has about
50 alternate sites, such as temporary sites in shopping malls and
libraries. Alternate sites are currently not included in the
"geographic coverage" measure.
[31] Kiosks are automated machines that taxpayers can use to obtain
certain forms, answers to frequently asked questions, and general IRS
information in English and Spanish. Kiosks are currently not included
in the "geographic coverage" measure.
[32] The Resources Management Information System is the primary
management information system that field assistance uses to track
workload volume and staff hour expenditures.
[33] GAO-02-144.
[34] Of about 420 TACs, 123 had Q-Matic as of June 2002. IRS officials
stated that installation and networking of Q-Matic in all offices is
scheduled to be complete by September 30, 2005. In the meantime, IRS
plans to pilot an installed and networked Q-Matic system in all the
TACs that are located in one of IRS‘s seven management areas during the
first quarter of 2003.
[35] Treasury Inspector General for Tax Administration, Walk-in
Customer Satisfaction Survey Results Should Be Qualified If Used for
the GPRA, 2000-10-079 (Washington, D.C.: May 17, 2000).
[36] The number of units would generally be larger than the number of
contacts. For example, if a taxpayer received help in preparing his or
her return and his or her child's return, field assistance would count
that service as one return preparation contact and two return
preparation units.
[37] TACs monitor timeliness, but IRS does not report the measure in
the Strategy and Program Plan.
[38] The Integrated Submission and Remittance Processing System is the
system IRS uses to process tax returns and remittances.
[39] Treasury Inspector General for Tax Administration, The Internal
Revenue Service Needs to Improve Oversight of Remittance Processing
Operations, 2003-40-002 (Washington, D.C.: Oct. 7, 2002).
[40] Submission processing did have some data related to the average
direct labor cost to process some paper returns in 1999.
[41] Q-Matic is an automated tracking and reporting system that is
expected to more efficiently monitor customer traffic flow and wait
times and eliminate staff time completing Form 5311. Of about 420 TACs,
123 had Q-Matic as of June 2002.
[42] An alternative form of measurement may be a separate, descriptive
statement of either (1) a minimally effective program or (2) a
successful program, expressed with sufficient precision and in such
terms as would allow an accurate, independent determination of how
actual performance compares with the stated goals. An example would be
the polio vaccine, whose value to society is judged by experts through
peer review.
[43] IRS deleted its "automated completion rate" measure in the 2002
Strategy and Program Plan and now has only 14 telephone measures.
However, IRS still tracks this measure.
[44] IRS has since added three measures ("number of information returns
filed by magnetic tape," "percent of information returns filed by
magnetic tape," and "customer satisfaction-business") that were not
part of our review. In addition, electronic filing and assistance is
developing new performance measures and goals because it is in the
midst of a major reorganization. When the reorganization is completed,
electronic filing and assistance will no longer be responsible for all
the operational programs for which it was responsible in 2001 and 2002.
Electronic filing and assistance will remain responsible for strategic
services, Internet development services, and development services. The
IRS organizations assuming responsibility for electronic filing and
assistance‘s operational programs will be responsible for the related
performance measures and goals.
[45] IRS is developing a measure of customer satisfaction for
submission processing.
GAO's Mission:
The General Accounting Office, the investigative arm of Congress,
exists to support Congress in meeting its constitutional
responsibilities and to help improve the performance and accountability
of the federal government for the American people. GAO examines the use
of public funds; evaluates federal programs and policies; and provides
analyses, recommendations, and other assistance to help Congress make
informed oversight, policy, and funding decisions. GAO's commitment to
good government is reflected in its core values of accountability,
integrity, and reliability.
Obtaining Copies of GAO Reports and Testimony:
The fastest and easiest way to obtain copies of GAO documents at no
cost is through the Internet. GAO's Web site (www.gao.gov) contains
abstracts and full-text files of current reports and testimony and an
expanding archive of older products. The Web site features a search
engine to help you locate documents using key words and phrases. You
can print these documents in their entirety, including charts and other
graphics.
Each day, GAO issues a list of newly released reports, testimony, and
correspondence. GAO posts this list, known as "Today's Reports," on its
Web site daily. The list contains links to the full-text document
files. To have GAO e-mail this list to you every afternoon, go to
www.gao.gov and select "Subscribe to daily E-mail alert for newly
released products" under the GAO Reports heading.
Order by Mail or Phone:
The first copy of each printed report is free. Additional copies are $2
each. A check or money order should be made out to the Superintendent
of Documents. GAO also accepts VISA and MasterCard. Orders for 100 or
more copies mailed to a single address are discounted 25 percent.
Orders should be sent to:
U.S. General Accounting Office
441 G Street NW, Room LM
Washington, D.C. 20548

To order by phone:
Voice: (202) 512-6000
TDD: (202) 512-2537
Fax: (202) 512-6061
To Report Fraud, Waste, and Abuse in Federal Programs:
Contact:
Web site: www.gao.gov/fraudnet/fraudnet.htm
E-mail: fraudnet@gao.gov
Automated answering system: (800) 424-5454 or (202) 512-7470
Public Affairs:
Jeff Nelligan, Managing Director, NelliganJ@gao.gov, (202) 512-4800
U.S. General Accounting Office, 441 G Street NW, Room 7149
Washington, D.C. 20548