USC Center for Software Engineering


LA SPIN Meeting Announcement
January 28, 1998
Speaker: Sam Keene
Coordinator: John Cosgrove


Predicting Software and System Reliability

Please remember to bring your parking ticket for validation

Abstract

This presentation describes a new reliability assessment technique that can be applied early in the development process to predict software, hardware, and system reliability. The modeling approach was developed jointly with the Reliability Analysis Center with support from Rome Laboratories.

First, hardware-related system failures have been found to come predominantly from causes other than intrinsic part failures, whereas traditional hardware reliability modeling uses a random, or exponential, failure rate model. In our experience the Pareto rule holds for system failures: in the hardware realm, a few percent of all the Field Replaceable Units (FRUs) account for nearly all of the system hardware failures. The majority of failures are not randomly caused but are attributable to specific, resolvable causes. Statisticians call these "special cause" failures. They result from oversights in design, manufacturing, parts, and "system management." The latter term includes requirements deficiencies and interface problems. All of these special cause failures are remediable: the underlying cause can be fixed so that the problem does not recur, and the FRU and system failure rates will be lower thereafter. Software failures parallel this experience.

There is a need to predict the reliability of software earlier than System Test, where it is normally modeled. This need is especially acute during the planning phase of development, before any operational data is available. The new prediction approach uses the CMM process maturity level as a predictor of the latent fault density of the delivered code. The latent fault density (a design measure) is then mapped to a probable field Mean Time Between Failures (a user measure). This modeling is largely empirical, and it provides a practical basis for early predictions of software reliability. A model to project the Mean Time To Recovery (MTTR) of the software is also shown.

Software failures are contrasted with hardware failures. The two failure contributors are then combined to form a model of system reliability and availability.
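As a rough illustration of how two failure contributors can be combined, the sketch below uses the textbook series-system relations. It is not the model presented in the talk. It assumes constant failure rates for hardware and software, which is a simplification given the special cause failures discussed above, and it assumes that either kind of failure takes the system down. The function name `combine_hw_sw` and all input values are hypothetical.

```python
def combine_hw_sw(hw_mtbf, sw_mtbf, hw_mttr, sw_mttr):
    """Combine hardware and software figures (all in hours) for a series system.

    Assumptions (not from the announcement): constant failure rates, and
    any hardware or software failure causes a system outage.
    Returns (system_mtbf, system_mttr, availability).
    """
    hw_rate = 1.0 / hw_mtbf          # hardware failures per hour
    sw_rate = 1.0 / sw_mtbf          # software failures per hour
    system_rate = hw_rate + sw_rate  # rates add for a series system
    system_mtbf = 1.0 / system_rate

    # Mean restore time, weighted by how often each kind of failure occurs.
    system_mttr = (hw_rate * hw_mttr + sw_rate * sw_mttr) / system_rate

    # Steady-state availability: uptime fraction over one failure/repair cycle.
    availability = system_mtbf / (system_mtbf + system_mttr)
    return system_mtbf, system_mttr, availability


# Hypothetical inputs: hardware fails rarely but is slow to repair,
# software fails more often but recovers quickly (e.g. by restart).
mtbf, mttr, avail = combine_hw_sw(hw_mtbf=2000.0, sw_mtbf=500.0,
                                  hw_mttr=4.0, sw_mttr=0.5)
print(f"system MTBF  = {mtbf:.1f} h")
print(f"system MTTR  = {mttr:.2f} h")
print(f"availability = {avail:.5f}")
```

With these made-up inputs, software dominates the failure count while hardware dominates the downtime per failure, which is why the two contributors are kept separate until the final step.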

Biography

Samuel J. Keene is a Fellow of the IEEE. He is a past President of the IEEE Reliability Society and was recognized as "Reliability Engineer of the Year" in 1995. He is a member of ASQC and a general partner in Performance Technology, located in Boulder, Colorado. Dr. Keene received his Ph.D. in Operations Research from the University of Colorado. He is a teacher, practitioner, and researcher in the reliability field, and has published over one hundred articles and book chapters and produced six video tutorials on the subject.