Best Practices

A More Constructive Test Approach Is Key to Better Weapon System Outcomes GAO ID: NSIAD-00-199 July 31, 2000

This report examines (1) how the conduct of testing and evaluation affects commercial and Defense Department (DOD) program outcomes, (2) how best commercial testing and evaluation practices compare with DOD's, and (3) what factors account for the differences in these practices. GAO found that commercial firms use testing to expose problems earlier than the DOD programs GAO visited. Commercial firms' testing and evaluation validates products' maturity based on three levels at specific points in time, which works to preclude "late-cycle churn," or the scramble to fix a significant problem discovered late in development. Late-cycle churn has been a fairly common occurrence on DOD weapon systems, where tests of a full system identify problems that often could have been found earlier. DOD's response to such test results typically is to expend more time and money to solve the problems--only rarely are programs terminated. The differences in testing practices reflect the different demands commercial firms and DOD impose on program managers. Leading commercial firms insist that a product satisfy the customer and make a profit. Success is threatened if unknowns about a product are not resolved early, when costs are low and more options are available. Testing is constructive and eliminates unknowns. Success for a weapon system is centered on providing a superior capability within perceived time and funding limits. Testing plays a less constructive role, because test results often become directly linked to funding and other key decisions and can jeopardize program support. Such a role creates a more adversarial relationship between testers and program managers.

GAO noted that: (1) for the leading commercial firms GAO visited, the proof of testing and evaluation lies in whether a product experiences what one firm called "late-cycle churn," or the scramble to fix a significant problem discovered late in development; (2) late-cycle churn has been a fairly common occurrence on DOD weapon systems; (3) often, tests of a full system identify problems that could have been found earlier; (4) leading commercial firms GAO visited use testing and other techniques to expose problems earlier than the DOD programs GAO reviewed; (5) the firms focus on validating that their products have reached increasing levels of product maturity at given points in time; (6) the firms' products have three maturity levels in common--components work individually, components work together as a system in a controlled setting, and components work together as a system in a realistic setting; (7) the key to minimizing late surprises is to reach the first two levels early, limiting the burden on the third level; (8) by concentrating on validating knowledge rather than the specific technique used, commercial firms avoid skipping key events and holding hollow tests that do not add knowledge; (9) on the weapon programs, system-level testing carried a greater share of the burden; (10) earlier tests were delayed, skipped, or not conducted in a way that advanced knowledge; (11) the differences in testing practices reflect the different demands commercial firms and DOD impose on program managers; (12) leading commercial firms insist that a product satisfy the customer and make a profit; (13) success is threatened if managers are unduly optimistic or if unknowns about a product are not resolved early, when costs are low and more options are available; (14) the role of testing under these circumstances is constructive, for it helps eliminate unknowns; (15) product managers view testers and realistic test plans as contributing to a product's success; (16) success for a weapon system program is different--it centers on attempting to provide a superior capability within perceived time and funding limits; (17) success is influenced by the competition for funding and the quest for top performance--delivering the product late and over cost does not necessarily threaten success; (18) testing plays a less constructive role in DOD because a failure in a key test can jeopardize program support; (19) specifically, test results often become directly linked to funding and other key decisions for programs; and (20) such a role creates a more adversarial relationship between testers and program managers.


