What is software reliability?
"Software reliability means operational reliability. Who cares how many bugs are in the program? We should be concerned with their effect on its operations." - Bev Littlewood
The most widely accepted definition: software reliability is the probability of failure-free operation of a program for a specified time in a specified environment.
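The definition above can be turned into a number once a failure model is chosen. The sketch below assumes a constant failure rate (the exponential model, R(t) = exp(-rate * t)), which the definition itself does not require; the function name and the example values are illustrative only.

```python
import math


def reliability(failure_rate: float, duration: float) -> float:
    """Probability of failure-free operation for `duration` time units.

    Assumes a constant failure rate (exponential model). This is an
    illustrative assumption, not part of the definition of reliability.
    """
    if failure_rate < 0 or duration < 0:
        raise ValueError("failure_rate and duration must be non-negative")
    return math.exp(-failure_rate * duration)


# Example: 0.002 failures per hour over 100 hours of operation.
print(round(reliability(0.002, 100), 3))  # 0.819
```

The "specified environment" part of the definition is not captured by the formula; in practice it shows up in how the failure rate is measured.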
Failures and Faults
A failure corresponds to unexpected run-time behaviour observed by a user of the software.
A fault is a static software characteristic which causes a failure to occur.
Faults need not necessarily cause failures. They only do so if the faulty part of the software is used.
If a user does not notice a failure, is it a failure? Remember that most users don't know the software specification.
Failure classification
Failure class: Description
Transient: Occurs only with certain inputs
Permanent: Occurs with all inputs
Recoverable: System can recover without operator intervention
Unrecoverable: Operator intervention needed to recover from failure
Non-corrupting: Failure does not corrupt system state or data
Corrupting: Failure corrupts system state or data
Input/output mapping
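[Figure: a program maps the input set to the output set. The inputs causing erroneous outputs form the subset Ie of the input set, and the erroneous outputs form the subset Oe of the output set.]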
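The input/output mapping can be illustrated with a small example. The sketch below uses a hypothetical program with a deliberately seeded fault and a reference specification as the oracle; both functions and the input range are assumptions made for illustration. It computes Ie and Oe by comparing the program against the specification over a finite input set.

```python
def program(x: int) -> int:
    """Hypothetical program under test: intended to return abs(x)."""
    if x < -3:
        return x  # seeded fault: this branch forgets to negate
    return x if x >= 0 else -x


def specification(x: int) -> int:
    """Reference behaviour, used here as the oracle."""
    return abs(x)


input_set = range(-5, 6)  # the input set

# Ie: inputs for which the program's output differs from the specification.
erroneous_inputs = [x for x in input_set if program(x) != specification(x)]
# Oe: the outputs produced for those inputs.
erroneous_outputs = [program(x) for x in erroneous_inputs]

print(erroneous_inputs)   # [-5, -4]
print(erroneous_outputs)  # [-5, -4]
```

The fault is always present in the code, but a user who only ever supplies inputs outside Ie never observes a failure.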
Reliability metrics
Probability of failure on demand (POFOD)
A measure of the likelihood that the system will fail when a service request is made.
POFOD = 0.001 means that 1 out of 1000 service requests results in failure.
Relevant for safety-critical or non-stop systems.
Rate of occurrence of failures (ROCOF)
The frequency of occurrence of unexpected behaviour.
A ROCOF of 0.02 means that 2 failures are likely in each 100 operational time units.
Relevant for operating systems and transaction processing systems.
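Both metrics are simple ratios over observed data. The following is a minimal sketch that reproduces the two figures quoted above; the function names and argument names are illustrative, not taken from any standard library.

```python
def pofod(failed_requests: int, total_requests: int) -> float:
    """Probability of failure on demand: failed requests / all requests."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return failed_requests / total_requests


def rocof(failures: int, operational_time: float) -> float:
    """Rate of occurrence of failures: failures per operational time unit."""
    if operational_time <= 0:
        raise ValueError("operational_time must be positive")
    return failures / operational_time


print(pofod(1, 1000))  # 0.001
print(rocof(2, 100))   # 0.02
```

POFOD is counted per request, while ROCOF is counted per unit of operating time, which is why each suits a different kind of system.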
Mean time to failure
A measure of the time between observed failures.
An MTTF of 500 means that the average time between failures is 500 time units.
Relevant for systems with long transactions, e.g. CAD systems.
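MTTF can be estimated from a log of failure times as the mean gap between consecutive failures. The sketch below assumes that observation starts at time 0 and that repair time is negligible; the function name and the sample timestamps are made up for illustration.

```python
from statistics import mean


def mttf(failure_times: list[float]) -> float:
    """Mean gap between consecutive observed failures.

    Assumes the clock starts at time 0 and ignores repair time.
    """
    if not failure_times:
        raise ValueError("at least one failure time is required")
    previous = 0.0
    gaps = []
    for t in sorted(failure_times):
        gaps.append(t - previous)
        previous = t
    return mean(gaps)


# Failures observed at 400, 1000 and 1500 time units: gaps of 400, 600, 500.
print(mttf([400, 1000, 1500]))  # 500.0
```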
Availability
A measure of how likely the system is to be available for use. It takes repair and restart time into account.
An availability of 0.998 means that the software is available for 998 out of every 1000 time units.
Relevant for continuously running systems, e.g. telephone switching systems.
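Availability can be computed as the fraction of total time during which the system is usable. The sketch below is one simple way to express that, with downtime standing for repair and restart time; the function name is illustrative.

```python
def availability(uptime: float, downtime: float) -> float:
    """Fraction of total time during which the system is available.

    `downtime` covers repair and restart time.
    """
    total = uptime + downtime
    if uptime < 0 or downtime < 0 or total <= 0:
        raise ValueError("times must be non-negative with a positive total")
    return uptime / total


print(availability(998, 2))  # 0.998
```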
Costs of increasing reliability
[Figure: cost plotted against reliability level (low, medium, high, very high, ultrahigh).]
Bathtub curve for hardware reliability
Revised bathtub curve for software reliability
Software Reliability Metrics
Product metrics
Project management metrics
Process metrics
Fault and failure metrics
Next topic: Reliability growth models