2017 Reliability Engineering - Theory and Practice PDFDrive
Considering that complex equipment and systems are generally repairable, contain
redundancy and must be safe, the term reliability appears often for reliability, avail-
ability, maintainability, and safety. RAMS is used to point out this wherever neces-
sary in this book. The purpose of reliability engineering is to develop methods and
tools to assess RAMS figures of components, equipment & systems, as well as to
support development and production engineers in building in these characteristics.
In order to be cost and time effective, reliability (RAMS) engineering must be inte-
grated in the project activities, support quality assurance & concurrent engineering
efforts, and be performed without bureaucracy. This chapter introduces basic con-
cepts, shows their relationships, and discusses the tasks necessary to assure quality
and reliability (RAMS) of complex equipment & systems with high quality and reli-
ability (RAMS) requirements. A comprehensive list of definitions with comments is
given in Appendix A1. Standards for quality and reliability (RAMS) assurance are
discussed in Appendix A2. Refinements of management aspects are in Appendices
A3 - A5. Risk management is considered in Sections 1.2.7 and 6.11.
1.1 Introduction
Until the 1950's, quality targets were deemed to have been reached when the item
considered was found to be free of defects and systematic failures at the time it left
the manufacturer. +) The growing complexity of equipment and systems, as well as
the rapidly increasing cost incurred by loss of operation as a consequence of failures,
have brought the aspects of reliability, availability, maintainability, and safety to the
forefront. The expectation today is that complex equipment & systems are not only
free from defects and systematic failures at time t = 0 (when starting operation), but
also perform the required function failure-free for a stated time interval and have a
fail-safe behavior, i. e. preserve safety, in case of critical (or catastrophic) failures.
However, the question of whether a given item will operate without failures during
a stated period of time cannot be answered by yes or no, based on an acceptance test.
__________________
+) No distinction is made in this book between defect and nonconformity (except pp. 388, 394, 395);
however, defect has a legal connotation and has to be used when dealing with product liability.
Experience shows that only a probability for this occurrence can be given. This
probability is a measure of the item’s reliability and can be interpreted as follows,
if n statistically identical and independent items are put into operation at
time t = 0 to perform a given mission and n̄ ≤ n of them accomplish it
successfully, then the ratio n̄ / n is a random variable which converges for
increasing n to the true value of the reliability (Eq. (A6.146) on p. 472).
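The interpretation above can be illustrated with a small simulation. This is only a sketch: the function name `fraction_successful`, the seed, and the value 0.9 used for the true reliability are assumptions made for illustration, not figures from the text.

```python
import random


def fraction_successful(n, reliability, seed=0):
    """Put n independent items into operation and return the ratio n_bar / n.

    Each item accomplishes the mission with probability `reliability`
    (a hypothetical "true" value chosen by the caller).
    """
    rng = random.Random(seed)
    n_bar = sum(1 for _ in range(n) if rng.random() < reliability)
    return n_bar / n


if __name__ == "__main__":
    true_reliability = 0.9  # hypothetical value, for illustration only
    for n in (10, 100, 10_000, 100_000):
        # The ratio is random for small n and settles near 0.9 as n grows.
        print(n, fraction_successful(n, true_reliability))
```

For small n the ratio scatters noticeably around the true value; the scatter shrinks as n increases, which is the convergence the text refers to.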
Performance parameters as well as reliability, availability, maintainability, and safety
have to be built in during design & development and retained during production and
operation of the item. After the introduction of some important concepts in Section
1.2, Section 1.3 gives basic tasks and rules for quality and reliability (RAMS) as-
surance of complex equipment & systems (refinements are in Appendices A1-A5).
1.2.1 Reliability
Reliability is a characteristic of the item, expressed by the probability that it will
perform its required function under given conditions for a stated time interval.
It is generally designated by R . From a qualitative point of view, reliability can
also be defined as the ability of the item to remain functional. Quantitatively,
reliability specifies the probability that no operational interruption will occur
during a stated time interval. This does not mean that redundant parts may not fail;
such parts can fail and be repaired on-line (i. e. without operational interruption at
item (system) level). The concept of reliability thus applies to nonrepairable as well
as to repairable items (Chapters 2 and 6, respectively). To make sense, a numerical
statement on reliability (e. g. R = 0.9 ) must be accompanied by a clear definition of
the required function, environmental, operation & maintenance conditions,
as well as the mission duration and the state of the item at mission begin
(often tacitly assumed new or as-good-as-new).
An item is a functional or structural unit of arbitrary complexity (e. g. component
(part, device), assembly, equipment, subsystem, system) that can be considered as an
entity for investigations. +) It may consist of hardware, software, or both, and may
also include human resources. Assuming ideal human aspects & logistic support,
technical system should be preferred; however, system is often used for simplicity.
__________________
+) System refers in this book (as often in practice) to the highest integration level of the item considered.
1.2 Basic Concepts 3
The required function specifies the item's task; i. e. for given inputs, the item outputs
have to be constrained within specified tolerance bands (performance parameters
should always be given with tolerances). The definition of the required function is
the starting point for any reliability analysis, as it defines failures.
Operating conditions have an important influence on reliability, and must be
specified with care. Experience shows, for instance, that the failure rate of
semiconductor devices will double for an operating temperature increase of 10 to 20 °C.
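As a rough numerical reading of this rule of thumb, the sketch below extrapolates the doubling to larger temperature increases. The exponential extrapolation, the function name `failure_rate_factor`, and the 30 °C example are assumptions added here; the text itself only states the doubling for an increase of 10 to 20 °C.

```python
def failure_rate_factor(delta_temp_c, doubling_step_c):
    """Factor by which the failure rate grows for a temperature increase.

    Assumes (hypothetically) that the rate doubles for every
    `doubling_step_c` degrees Celsius of increase.
    """
    return 2.0 ** (delta_temp_c / doubling_step_c)


if __name__ == "__main__":
    # Bracket the rule of thumb with its two limits, 10 and 20 degrees.
    for step in (10.0, 20.0):
        print(step, failure_rate_factor(30.0, step))
```

The spread between the two results shows why operating conditions have to be specified with care before any reliability figure is quoted.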
The required function and / or operating conditions can be time dependent. In
these cases, a mission profile has to be defined and all reliability figures will be
referred to it. A representative mission profile and the corresponding reliability
targets should be given in the item's specifications.
Often the mission duration is considered as a parameter t, the reliability function
R(t) is then the probability that no failure at item (system) level will occur in (0, t];
however, item's condition at t = 0 influences final results, and to take care of
this, reliability figures at system level will have in this book (starting from
Section 2.2.6, except Chapter 7 & Appendix A6 to simplify notation) indices
Si (e. g. $R_{Si}(t)$), where S stands for system (footnote on p. 2) and i is the
state $Z_i$ entered at t = 0 (up state for reliability), with i = 0 for system new or
as-good-as-new, as often tacitly assumed at t = 0 (see the footnote on p. 512).
A distinction between predicted and estimated reliability is important. The first
is calculated on the basis of the item’s reliability structure and the failure & repair
rates of its components (Chapters 2 & 6); the second is obtained from a statistical
evaluation of reliability tests or from field data under stated conditions (Chapter 7).
The concept of reliability can be extended to processes and services, although
human aspects can lead to modeling difficulties (Sections 1.2.7, 5.2.5, 6.10, 6.11).
1.2.2 Failure
A failure occurs when the item stops performing its required function. As simple as
this definition is, it can become difficult to apply it to complex items. The
failure-free time, used in this book for failure-free operating time,
is generally a random variable. It is often reasonably long; but it can be very short,
for instance because of a failure caused by a transient event at turn-on. A general
assumption in investigating failure-free times is that at t = 0 the item is new or
as-good-as-new and free of defects & systematic failures. Besides their frequency,
failures should be classified according to the mode, cause, effect, and mechanism:
1. Mode: The mode of a failure is the symptom (local effect) by which a failure is
observed; e. g. open, short, drift, functional faults for electronic, and brittle
fracture, creep, buckling, fatigue for mechanical components or parts.
2. Cause: The cause of a failure can be intrinsic, due to weaknesses in the item
4 1 Basic Concepts, Quality and Reliability (RAMS) Assurance of Complex Equipment & Systems
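[Figure residue: plot of ν(t) versus t, with ordinate marks n, n–1, n–2, n–3, ..., 2, 1, 0 and abscissa marks t1, t2, t3; caption not recovered]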
\[ \hat{\lambda}(t) = \frac{\hat{R}(t) - \hat{R}(t + \delta t)}{\delta t \, \hat{R}(t)} . \tag{1.4} \]
For R(t) differentiable, n → ∞ & δt → 0, λ̂(t) converges to the (instantaneous) failure rate
\[ \lambda(t) = \frac{- \, dR(t) / dt}{R(t)} . \tag{1.5} \]
Considering R(0) = 1 (at t = 0 all items are new), Eq. (1.5) leads to
\[ R(t) = e^{- \int_0^t \lambda(x) \, dx} , \qquad (\text{for } R(0) = 1) . \tag{1.6} \]
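Equation (1.6) can be evaluated numerically for any given failure rate. The following is a minimal sketch using the trapezoidal rule; the function names and the two example failure rates (one constant, one linearly increasing) are hypothetical choices for illustration.

```python
import math


def reliability(failure_rate, t, steps=10_000):
    """R(t) = exp(-integral from 0 to t of failure_rate(x) dx), with R(0) = 1.

    The integral is approximated with the trapezoidal rule on `steps` intervals.
    """
    if t == 0:
        return 1.0
    h = t / steps
    integral = 0.5 * (failure_rate(0.0) + failure_rate(t))
    integral += sum(failure_rate(i * h) for i in range(1, steps))
    return math.exp(-integral * h)


def constant_rate(x):
    return 1e-4  # hypothetical constant failure rate, in 1/h


def increasing_rate(x):
    return 1e-4 + 2e-8 * x  # hypothetical, linearly increasing with time


if __name__ == "__main__":
    t = 1000.0  # hours
    print(reliability(constant_rate, t), math.exp(-1e-4 * t))
    print(reliability(increasing_rate, t))
```

For the constant rate the numerical result can be compared directly with the closed form exp(−λt); for a time-dependent rate only the integral changes.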
The failure rate λ( t ) defines thus completely the reliability function R (t ) of a
nonrepairable item. However,
considering Eq. (2.10) on p. 39 or Eq. (A6.25) on p. 441, λ( t ) can also be
defined for repairable items which are as-good-as-new after repair, taking
instead of t the variable x starting by 0 after each repair (renewal), as for
interarrival times (see e. g. pp. 41, 390); extension which is necessary when
investigating repairable systems in Chapter 6 (see e. g. Fig. 6.8 on p. 197).
In this case, for an item (system) with more than one element, as-good-as-new at
item (system) level is only given either if all elements have constant (time inde-
pendent) failure rates or at each failure also all not failed elements with time
dependent failure rate are renewed to as-good-as-new (see pp. 390 - 91, 442, 443).
If a repairable item (system) cannot be restored to be as-good-as-new after repair,
failure intensity z ( t ) must be used (pp. 389, 390 - 91, 540). Use of hazard rate or
force of mortality for λ( t ) should be avoided.
where MTTF stands for mean time to failure. For λ(x) = λ it follows MTTF = 1/λ.
A constant (time independent) failure rate λ is often considered also for repairable
items (pp. 390 - 91, Chapter 6). Assuming item (system) as-good-as-new after
each repair (renewal), consecutive failure-free times are then independent random
variables, exponentially distributed with parameter λ and mean 1/λ. In practice,
\[ \mathrm{MTBF} \equiv 1 / \lambda \qquad (\text{for } \lambda(x) = \lambda , \ x \text{ starting by 0 after each repair / renewal}) \tag{1.9} \]
is often tacitly used, where MTBF stands for mean operating time between failures,
expressing a figure applicable to repairable items (systems). Considering thus
the common usage of MTBF, the statistical estimate $\widehat{\mathrm{MTBF}} = T / k$ used in
practical applications (see e. g. [1.22, A2.5, A2.6 (HDBK-781)]) but valid only
for λ(x) = λ (p. 330), and to avoid misuses, MTBF should be confined to repairable
items with λ(x) = λ, i. e. to MTBF ≡ 1/λ as in this book (pp. 392 - 93).
However, at component level MTBF = $10^{8}$ h for λ = $10^{-8}\,\mathrm{h}^{-1}$ has no practical significance.
Moreover, for systems described by Markov or semi-Markov processes in steady-state
or for t → ∞, $\mathrm{MUT}_S$ (system mean up time) applies, and the use of $\mathrm{MTBF}_S$
instead of $\mathrm{MUT}_S$ should be avoided (see pp. 393, 394 & 279 and e. g. Eq. (6.95)).
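The estimate T / k mentioned above (cumulative operating time divided by the number of failures) can be sketched as follows for the constant failure rate case. The helper names, the seed, and the rate of 10^-3 per hour are hypothetical; the simulated failure-free times stand in for field or test data.

```python
import random


def estimate_mtbf(cumulative_operating_time, failures):
    """Statistical estimate T / k, meaningful only for a constant failure rate."""
    if failures <= 0:
        raise ValueError("at least one failure is needed for the estimate T / k")
    return cumulative_operating_time / failures


def simulate_failure_free_times(rate, count, seed=0):
    """Independent, exponentially distributed failure-free times (parameter `rate`)."""
    rng = random.Random(seed)
    return [rng.expovariate(rate) for _ in range(count)]


if __name__ == "__main__":
    rate = 1e-3  # hypothetical constant failure rate, in 1/h
    times = simulate_failure_free_times(rate, 5000)
    # The estimate should lie close to 1 / rate = 1000 h.
    print(estimate_mtbf(sum(times), len(times)), 1.0 / rate)
```

With a time-dependent failure rate the same ratio can still be computed, but it no longer estimates 1/λ, which is the misuse the text warns against.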
The failure rate of a large population of statistically identical and independent
items often exhibits a typical bathtub curve (Fig. 1.2) with the following 3 phases:
1. Early failures: λ(t) decreases (in general) rapidly with time; failures in this
phase are attributable to randomly distributed weaknesses in materials,
components, or production processes.
2. Failures with constant (or nearly so) failure rate: λ(t) is approximately
constant; failures in this period are Poisson distributed and often sudden.
3. Wear-out failures: λ(t) increases with time; failures in this period are attributable
to aging, wear-out, fatigue, etc. (e. g. corrosion or electromigration).
Early failures are not deterministic and appear in general randomly distributed in
time and over the items. During the early failure period, λ(t) need not necessarily
decrease as in Fig. 1.2; in some cases it can oscillate. To eliminate early failures,
[Figure 1.2: plot of λ(t) versus t with the phases 1., 2., 3. and the constant level λ marked; one curve for θ1 and a dashed curve for θ2 > θ1]
Figure 1.2 Typical shape for the failure rate of a large population of statistically identical and
independent (nonrepairable) items (dashed is a possible shift for a higher stress, e. g. ambient temperature)
1.2.6 Availability
Availability is a broad term, expressing the ratio of delivered to expected service. It
is often designated by A and used for the asymptotic & steady-state value of the
point & average availability (PA = AA). Point availability (PA(t)) is a characteristic
of the item expressed by the probability that the item will perform its required func-
tion under given conditions at a stated instant of time t. From a qualitative point of
view, point availability can also be defined as the ability of the item to perform its
required function under given conditions at a stated instant of time (dependability).
Availability calculation is often difficult, as human aspects & logistic support
have to be considered in addition to reliability and maintainability (Fig. 1.3). Ideal
human aspects & logistic support are often assumed, yielding the intrinsic
availability. In this book, availability is generally used for intrinsic availability.
Further assumptions for calculations are continuous operation and complete renewal
of the repaired element (assumed as-good-as-new after repair, see p. 386). In this
case, the point availability PA( t ) of the one-item structure rapidly converges to an
asymptotic & steady-state value, given by
\[ PA = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}} . \tag{1.10} \]
PA is also the asymptotic & steady-state value of the average availability (AA), giving
the mean percentage time during which the item performs its required function
($PA_S(t)$, $PA_S$ & $AA_S$ are used on p. 185 for the one-item structure, with S for
system as per footnote on p. 2). For systems described by Markov or semi-Markov
processes, $\mathrm{MUT}_S$ & $\mathrm{MDT}_S$ are used instead of MTTF & MTTR (pp. 279, 393 - 94,
516, 525). Other availability measures can be defined, e. g. mission, work-mission,
overall availability (Sections 6.2.1.5, 6.8.2). Application specific figures are also
known, see e. g. [6.12]. In contrast to reliability analyses for which no failure at
system level is allowed (only redundant parts can fail and be repaired on-line),
availability analyses allow failures at system level.
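Equation (1.10) is simple enough to evaluate directly. The sketch below does so for a one-item structure; the function name and the example values (MTTF = 1000 h, MTTR = 4 h) are hypothetical.

```python
def steady_state_availability(mttf, mttr):
    """Asymptotic & steady-state point availability PA = MTTF / (MTTF + MTTR)."""
    return mttf / (mttf + mttr)


if __name__ == "__main__":
    mttf_hours = 1000.0  # hypothetical mean time to failure
    mttr_hours = 4.0     # hypothetical mean time to repair
    pa = steady_state_availability(mttf_hours, mttr_hours)
    # Mean down time per year of continuous operation (8760 h).
    print(pa, (1.0 - pa) * 8760.0)
```

The second printed value uses the interpretation of PA as the mean fraction of time in which the item performs its required function, under the continuous-operation assumption stated in the text.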
safety assurance examines measures which can bring the item in a safe state at
failure (fail-safe procedure), reliability assurance aims to minimize the number of
failures. Moreover, for technical safety, effects of external events (human errors,
natural catastrophes, attacks, etc.) are important and must be considered carefully
(Sections 6.10, 6.11). The safety level of the item also influences the number of
product liability claims. However, an increase in safety can reduce reliability.
Closely related to the concept of safety are those of risk, risk management, and
risk acceptance. Experience shows that risk problems are generally interdisciplinary
and have to be solved in close cooperation between engineers and sociolo-
gists, in particular, to find common solutions to controversial questions.
For risk evaluation, a weighting between probability of occurrence and effect (con-
sequence) of a given accident / disaster is often used, and the multiplicative rule is
one among other possibilities (see e. g. [1.3, 2.82]). Also it is necessary to consider
the different causes (machine, machine & human, human) and effects (location,
time, involved people, effect duration) of an accident / disaster. Statistical tools can
support risk assessment. However, although the behavior of a homogeneous human
population is often known, experience shows that the reaction of a single person can
become unpredictable. Similar difficulties also arise in the evaluation of rare events
in complex systems. Risk analysis and risk mitigation are generally performed with
tools given in Sections 2.6, 6.8, 6.9; see also Sections 6.10 & 6.11 for new models.
Basically, considerations on risk and risk acceptance should take into account
that the probability p1 for a given accident / disaster which can be caused by one of
n statistically identical and independent items (systems), each of them with
occurrence probability p, is for n p small ( n → ∞ , p → 0 ) nearly equal to n p as per
\[ p_1 = n \, p \, (1 - p)^{n-1} \approx n \, p \, e^{- n p} \approx n \, p \, (1 - n \, p) \approx n \, p . \tag{1.11} \]
Equation (1.11) follows from the binomial distribution and Poisson approximation
(Eqs. (A6.120), (A6.129)). It also applies with $n p = \lambda_{tot} T$ to the case in which one
assumes that the accident / disaster occurs randomly in the interval (0, T], caused by
one of n independent items (systems) with failure rates $\lambda_1, \ldots, \lambda_n$ ($\lambda_{tot} = \lambda_1 + \ldots + \lambda_n$).
This is because the sum of n independent Poisson processes is again a Poisson process
(Eq. (7.27)) and the probability $\lambda_{tot} T \, e^{- \lambda_{tot} T}$ for one failure in the interval (0, T]
is nearly equal to $\lambda_{tot} T$. Thus, for $n p \ll 1$ or $\lambda_{tot} T \ll 1$ it holds that
\[ p_1 \approx n \, p \approx (\lambda_1 + \ldots + \lambda_n) \, T . \tag{1.12} \]
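The quality of the successive approximations in Eq. (1.11) can be checked numerically. The sketch below compares the binomial expression with its three approximations; the function names and the values n = 1000, p = 10^-5 are hypothetical.

```python
import math


def probability_exactly_one(n, p):
    """Binomial probability n p (1 - p)^(n - 1) that exactly one of n items causes the event."""
    return n * p * (1.0 - p) ** (n - 1)


def approximations(n, p):
    """The three approximations of Eq. (1.11), valid for n p small."""
    np_product = n * p
    return (
        np_product * math.exp(-np_product),  # Poisson approximation
        np_product * (1.0 - np_product),     # first-order expansion of the exponential
        np_product,                          # leading term
    )


if __name__ == "__main__":
    n, p = 1000, 1e-5  # hypothetical values, giving n p = 0.01
    print(probability_exactly_one(n, p))
    print(approximations(n, p))
```

With n p = 0.01 all four values agree to about one percent; the agreement degrades as n p approaches 1, which is why the condition n p << 1 matters.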
Also by assuming a reduction of the individual occurrence probability p (or of λ i ),
one recognizes that in the future it will be necessary either
to accept greater risks p1 or to keep the spread of high-risk technologies
under tighter control, similar for environmental stresses caused by mankind.
Aspects of ecologically acceptable production, use, disposal & recycling of products
should become subject for international regulations (sustainable development).
1.2.8 Quality
Quality is understood as the degree to which a set of inherent characteristics fulfills
specified or expected requirements. This definition, given also in the ISO 9000:2000
family [A1.6], follows closely the traditional definition of quality, expressed by
fitness for use, and applies to products and services as well.
Example 1.1
An assembly contains n independent components each with a defective probability p . Let ck be
the cost to replace k defective components. Determine (i) the mean (expected value) C(i ) of the
total replacement cost (no defective components are allowed in the assembly) and (ii) the mean
of the total cost (test and replacement) C (ii ) if the components are submitted to an incoming
inspection which reduces defective percentage from p to p0 (test cost ct per component).
Solution
(i) The solution makes use of the binomial distribution (Appendix A6.10.7) and question (i) is
also solved in Example A6.19 on p. 466. The probability of having exactly k defective
components in a sample of size n is given by (Eq. (A6.120))
\[ p_k = \binom{n}{k} \, p^{k} (1 - p)^{n-k} . \tag{1.13} \]
The mean C(i ) of the total cost (deferred cost) caused by the defective components follows
then from the weighted sum
\[ C_{(i)} = \sum_{k=1}^{n} c_k \, p_k = \sum_{k=1}^{n} c_k \binom{n}{k} \, p^{k} (1 - p)^{n-k} . \tag{1.14} \]
(ii) To the cost caused by the defective components, calculated from Eq. (1.14) with p0 instead
of p, one must add the incoming inspection cost n ct
\[ C_{(ii)} = n \, c_t + \sum_{k=1}^{n} c_k \binom{n}{k} \, p_0^{k} (1 - p_0)^{n-k} . \tag{1.15} \]
The difference between C(i ) and C(ii ) gives the gain (or loss) obtained by introducing the incom-
ing inspection, allowing thus a cost optimization (see also Section 8.4 for a deeper discussion).
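The two mean costs of Example 1.1 can be evaluated with a few lines of code. This is a sketch under stated assumptions: the function names, the linear replacement cost c_k = 10 k, and all numerical values are hypothetical and serve only to exercise Eqs. (1.14) and (1.15).

```python
from math import comb


def mean_replacement_cost(n, p, cost_of_k):
    """Eq. (1.14): sum over k = 1..n of c_k * C(n, k) * p^k * (1 - p)^(n - k)."""
    return sum(
        cost_of_k(k) * comb(n, k) * p**k * (1.0 - p) ** (n - k)
        for k in range(1, n + 1)
    )


def mean_total_cost_with_inspection(n, p0, cost_of_k, test_cost):
    """Eq. (1.15): incoming inspection cost n * c_t plus Eq. (1.14) evaluated at p0."""
    return n * test_cost + mean_replacement_cost(n, p0, cost_of_k)


if __name__ == "__main__":
    def linear_cost(k):
        return 10.0 * k  # hypothetical cost to replace k defective components

    n = 50  # hypothetical number of components in the assembly
    cost_without = mean_replacement_cost(n, 0.02, linear_cost)
    cost_with = mean_total_cost_with_inspection(n, 0.001, linear_cost, test_cost=0.05)
    # A positive difference is a gain from introducing the incoming inspection.
    print(cost_without, cost_with, cost_without - cost_with)
```

With a cost proportional to k the sum in Eq. (1.14) reduces to the cost per component times n p, which gives a quick plausibility check on the result.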
[Figure 1.3: tree diagram. Elements named in the upper part: Cost Effectiveness (System Effectiveness); Capability; Life-Cycle Cost (Acquisition, Disposal); Operational Availability (Dependability); Intrinsic Availability; Reliability; Maintainability; Human Factors; Logistic Support; Useful Life; Safety (Injury to Persons, Damage to Property, Damage to Environment). The six activity boxes in the lower part are listed below.]
Capability and Life-Cycle Cost
• Design, development, evaluation
• Production (hardware)
• Cost analyses (Life-cycle costs, VE, VA)
Quality Assurance (Hardw. & Softw.)
• Configuration management
• Quality testing (incl. reliability, maintainability, and safety tests)
• Quality control during production (hardware)
• Quality data reporting system
• Software quality
Reliability Engineering
• Reliability targets
• Required function
• Environm. cond.
• Parts & materials
• Derating
• Screening
• Redundancy
• FMEA, FTA, etc.
• Design guidelines
• Rel. block diagr.
• Rel. prediction
• Design reviews
Maintainability Engineering
• Maintainability targets
• Maintenance concept
• Partitioning in LRUs
• Faults detection and localization
• Design guidelines
• Maintainability analysis
• Design reviews
Safety and Human-Factors Engineering
• Safety targets
• Design guidelines (incl. human aspects)
• Safety analysis (FMEA/FMECA, FTA, etc.)
• Risk management
• Design reviews
Logistic Support
• Maintenance concept
• Customer/User documentation
• Spare parts provisioning
• Tools and test equipment for maintenance
• After sales service
Figure 1.3 Cost Effectiveness (System Effectiveness) for complex equipment & systems with high
quality and reliability (RAMS) requirements (see Appendices A1 - A5 for definitions & management
aspects, and pp. 395 & 404 for the concept of product assurance as used in space & railway fields;
dependability can be used instead of operational availability, for a qualitative meaning)
$\mathrm{MUT}_S$ and $\mathrm{OA}_S$ are the system mean up (operating) time between failures, assumed
here $= 1 / \lambda$, +) and the system steady-state overall availability (Eq. (6.196) with $T_{pm}$
instead of $T_{PM}$). $T$ is the total system operating time (useful life) and $n_d$ the number
of hidden defects discovered in the field. $C_q$, $C_r$, $C_{cm}$, $C_{pm}$ & $C_l$ are the costs for
quality assurance and for the assurance of reliability, repairability, serviceability,
and logistic support, respectively. $c_{cm}$, $c_{off}$, and $c_d$ are the cost per repair, per hour
down time, and per hidden defect (preventive maintenance costs are scheduled costs,
considered here as a part of $C_{pm}$). The first five terms in Eq. (1.17) represent a part
of the acquisition cost, the last three terms are deferred costs occurring during field
operation. ++) A model for investigating the cost $C$ according to Eq. (1.17) was developed
in [2.2 (1986)], by assuming $C_q$, $C_r$, $C_{cm}$, $C_{pm}$, $C_l$, $\mathrm{MUT}_S$, $\mathrm{OA}_S$, $T$, $c_{cm}$,
$c_{off}$, $c_d$, and $n_d$ as parameters and investigating the variation of the total cost given
by Eq. (1.17) as a function of the level of attainment of the specified targets, i. e. by
introducing the variables $g_q = QA / QA_g$, $g_r = \mathrm{MUT}_S / \mathrm{MUT}_{Sg}$, $g_{cm} = \mathrm{MDT}_{Sg} / \mathrm{MDT}_S$,
$g_{pm} = \mathrm{MTTPM}_{Sg} / \mathrm{MTTPM}_S$, and $g_l = \mathrm{MLD}_{Sg} / \mathrm{MLD}_S$, where the subscript g denotes the
specified target for the corresponding quantity. A power relationship
\[ C_i = C_{ig} \, g_i^{m_i} \tag{1.18} \]
was assumed between the actual cost $C_i$, the cost $C_{ig}$ to reach the specified target
(goal) of the considered quantity, and the level of attainment of the specified target
($0 < m_l < 1$, other $m_i > 1$). The following relationship between number of hidden
defects discovered in the field and ratio $C_q / C_{qg}$ was also included in the model
\[ n_d = \frac{1}{(C_q / C_{qg})^{m_d}} - 1 = \frac{1}{g_q^{m_q m_d}} - 1 . \tag{1.19} \]
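The final equation for the cost $C$ as function of the variables $g_q$, $g_r$, $g_{cm}$, $g_{pm}$, and
$g_l$ follows then as (using Eq. (6.196) for $\mathrm{OA}_S$) +)
\[
\begin{aligned}
C ={}& C_{qg} \, g_q^{m_q} + C_{rg} \, g_r^{m_r} + C_{cmg} \, g_{cm}^{m_{cm}} + C_{pmg} \, g_{pm}^{m_{pm}} + C_{lg} \, g_l^{m_l} + \frac{T \, c_{cm}}{g_r \, \mathrm{MUT}_{Sg}} \\
&+ \Bigl( 1 - \frac{1}{1 + \frac{1}{g_r g_{cm}} \cdot \frac{\mathrm{MDT}_{Sg}}{\mathrm{MUT}_{Sg}} + \frac{1}{g_r g_l} \cdot \frac{\mathrm{MLD}_{Sg}}{\mathrm{MUT}_{Sg}} + \frac{\mathrm{MTTPM}_{Sg}}{g_{pm} \, T_{pm}}} \Bigr) \, T \, c_{off} + \Bigl( \frac{1}{g_q^{m_q m_d}} - 1 \Bigr) \, c_d .
\end{aligned}
\tag{1.20}
\] ++)
The relative cost $C / C_g$ given in Fig. 1.4 is obtained by dividing $C$ by the value $C_g$
from Eq. (1.20) with all $g_i = 1$. Extensive analyses with different values for $m_i$, $C_{ig}$,
$\mathrm{MUT}_{Sg}$, $\mathrm{MDT}_{Sg}$, $\mathrm{MLD}_{Sg}$, $\mathrm{MTTPM}_{Sg}$, $T_{pm}$, $T$, $c_{cm}$, $c_{off}$, and $c_d$, have shown that
the value $C / C_g$ is only moderately sensitive to the parameters $m_i$.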
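To see how the relative cost behaves, Eq. (1.20) can be coded directly. The sketch below is an illustration only: the class name `CostModel`, its field names, and every default value are hypothetical placeholders (chosen with 0 < m_l < 1 and the other exponents m_q, m_r, m_cm, m_pm above 1; the value of m_d is likewise assumed), not data from the text or from [2.2].

```python
from dataclasses import dataclass


@dataclass
class CostModel:
    # Every default below is a hypothetical placeholder, not a value from the text.
    c_qg: float = 100.0      # target costs C_ig
    c_rg: float = 100.0
    c_cmg: float = 100.0
    c_pmg: float = 100.0
    c_lg: float = 100.0
    m_q: float = 2.0         # exponents m_i of the power relationship (1.18)
    m_r: float = 2.0
    m_cm: float = 2.0
    m_pm: float = 2.0
    m_l: float = 0.5
    m_d: float = 1.0
    mut_sg: float = 1000.0   # target mean up time, h
    mdt_sg: float = 10.0     # target mean down time, h
    mld_sg: float = 10.0     # target mean logistic delay, h
    mttpm_sg: float = 10.0   # target mean time to preventive maintenance, h
    t_pm: float = 1000.0     # preventive maintenance period, h
    t_total: float = 10000.0 # total operating time T, h
    c_cm: float = 5.0        # cost per repair
    c_off: float = 2.0       # cost per hour down time
    c_d: float = 50.0        # cost per hidden defect

    def total_cost(self, g_q=1.0, g_r=1.0, g_cm=1.0, g_pm=1.0, g_l=1.0):
        """Cost C per Eq. (1.20) for given levels of attainment g_i."""
        assurance = (
            self.c_qg * g_q**self.m_q
            + self.c_rg * g_r**self.m_r
            + self.c_cmg * g_cm**self.m_cm
            + self.c_pmg * g_pm**self.m_pm
            + self.c_lg * g_l**self.m_l
        )
        repairs = self.t_total * self.c_cm / (g_r * self.mut_sg)
        denominator = (
            1.0
            + self.mdt_sg / (g_r * g_cm * self.mut_sg)
            + self.mld_sg / (g_r * g_l * self.mut_sg)
            + self.mttpm_sg / (g_pm * self.t_pm)
        )
        down_time = (1.0 - 1.0 / denominator) * self.t_total * self.c_off
        hidden_defects = (1.0 / g_q ** (self.m_q * self.m_d) - 1.0) * self.c_d
        return assurance + repairs + down_time + hidden_defects

    def relative_cost(self, **levels):
        """C / C_g, with C_g the cost for all g_i = 1."""
        return self.total_cost(**levels) / self.total_cost()


if __name__ == "__main__":
    model = CostModel()
    for g in (0.5, 1.0, 1.5, 2.0):
        print(g, model.relative_cost(g_q=g), model.relative_cost(g_r=g))
```

Varying one g_i at a time, as in the loop, gives curves of the kind sketched in Fig. 1.4; their exact shape depends entirely on the assumed parameter values.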
__________________
+) Equations (1.17) and (1.20) hold for series structures; for systems with redundancy, elaboration is
more laborious and can be performed taking care of the remarks given on pp. 121 - 24.
++) If repair cost differ for each element, $T \sum \lambda_i \, c_{cm_i}$ can be used instead of $T \, c_{cm} / \mathrm{MUT}_S = \lambda \, T \, c_{cm}$.
[Figure 1.4: two panels, each showing the curves C/Cg = f(gq) and C/Cg = f(gr) over gq, gr from 0 to 2, with C/Cg from 0 to 5]
Figure 1.4 Basic shape of the relative cost C / Cg per Eq. (1.20) as function of gq = QA / QAg and
gr = MUTS / MUTSg (quality assurance and reliability assurance as in Fig. 1.3) for two complex
systems with different mission profiles (the specified targets gq = 1 and gr = 1 are dashed)
Table 1.1 Historical development of quality assurance and reliability (RAMS) engineering
before 1940 Quality attributes & characteristics are defined. In-process & final tests are carried out,
usually in the production area. The concept of quality of manufacture is introduced.
1940 - 50 Defects (nonconformities) and failures are systematically collected and analyzed.
Corrective actions are carried out. Statistical quality control is developed. It is re-
cognized that quality must be built into an item, quality of design becomes important.
1950 - 60 Quality assurance is recognized as a means for developing and manufacturing an
item with a specified quality level. Preventive measures (actions) are added to tests
and corrective actions. It is recognized that correct short-term functioning does not
also signify reliability. Design reviews and systematic analysis of failures (failure
data and failure modes / mechanisms), performed often in the research & development
area, lead to important reliability improvements.
1960 - 70 Difficulties with respect to reproducibility and change control, as well as interfacing
problems during the integration phase, require a refinement of the concept of
configuration management. Reliability engineering is recognized as a means for
developing and manufacturing an item with specified reliability. Reliability predic-
tion, estimation & demonstration tools are developed. It is recognized that reliability
cannot easily be demonstrated at an acceptance test. Instead of a reliability figure
( λ or MTBF =1 / λ ) , contractual requirements are for a reliability assurance program.
Maintainability, availability, and logistic support become important.
1970 - 80 Due to the increasing complexity and cost for maintenance of equipment and sys-
tems, the aspects of man-machine interface and life-cycle cost become important.
Customers require demonstration of reliability and maintainability during the
warranty period. Quality and reliability assurance activities are made project specific
and carried out in close cooperation with all engineers involved in a project.
Concepts like product assurance, cost effectiveness and systems engineering are intro-
duced. Product liability and human reliability become important.
1980 - 90 Testability is required. Test and screening strategies are developed to reduce testing
cost and warranty services. Because of the rapid progress in microelectronics, greater
possibilities are available for redundant and fault tolerant structures. Software
quality becomes important.
after 1990 The necessity to further shorten the development time leads to the concept of concur-
rent engineering. Total Quality Management appears as a refinement to Quality As-
surance. RAMS is used for reliability, availability, maintainability & safety, reliability
engineering for RAMS engineering. Performance based contracts are stipulated for
systems with high RAMS requirements. Faced with increasing safety and sustainability
problems, risk management and ethical aspects become important.
1.3 Basic Tasks & Rules for Quality & Reliability (RAMS) Assurance of Complex Eq. & Systems 17
[Figure 1.5: stacked-area diagram, ordinate from 0 to 100, abscissa Year from 1950 to 2010. Areas labeled: Quality testing, Quality control, Quality data reporting system; Configuration management; Quality assurance; Software quality; Reliability (RAMS) analysis; Fault causes / modes / effects / mechanisms analysis; Reliability (RAMS) engineering; Systems engineering (part)]
Figure 1.5 Approximate distribution of the effort between quality assurance and reliability (RAMS)
engineering for complex equipment & systems with high quality and reliability (RAMS) requirements
Table 1.2 Main tasks for quality and reliability (RAMS) assurance of complex equipment & systems
with high quality and reliability requirements (the bar height is a measure of the relative effort)
(see Table A3.2 for greater details and a possible task assignment;
software quality appears in tasks 4, 8-11, 14-16, see also Section 5.3)
Column headings (as recovered): Project-independent; Specific during: Conception, Definition, Evaluation, Production, Use
1. Customer and market requirements
2. Preliminary analyses
[Figure 1.6: outputs of the successive life-cycle phases, listed column by column; of the phase names only Disposal, Recycling is recovered]
• Idea, market requirements; evaluation of delivered equipment and systems; proposal for preliminary study
• Feasibility check; system specifications; interface definition; proposal for the design phase
• Feasibility check; revised system specifications; qualified and released prototypes; technical documentation; proposal for pilot production
• Feasibility check; production documentation; qualified production processes; qualified and released first series item; proposal for series production
• Series item; customer documentation; logistic support plan; spare part provisioning; risk management plan
Figure 1.6 Basic life-cycle phases of complex equipment and systems (the output of a given
phase is the input to the next phase), see Tab. 5.3 on p. 161 for software
Figure 1.7 shows a basic organization which can embody the above rules and sat-
isfy requirements of quality management standards (Appendix A2). As shown in
Table A3.2, the assignment of quality and reliability (RAMS) assurance tasks should
be such that every engineer in a project bears his / her own responsibilities (as per
TQM). So, for instance, a design engineer should be responsible for all aspects of
his / her own product (e. g. an assembly) including reliability, maintainability and
safety aspects, the production department should be able to manufacture and test
such an item within its own competence, and the quality and reliability (RAMS)
assurance department (Q & RA in Fig. 1.7) should be responsible,
• within a project, for the
• formulation of preliminary quality and reliability (RAMS) targets,
• preparation of guidelines & working documents (quality & rel.(RAMS) aspects),
• coordination of activities belonging to quality & reliability (RAMS) assurance,
• reliability (RAMS) analyses at system level (footnote on p. 2),
• planning and evaluation of qualification, testing and screening of components and
material (quality and reliability (RAMS) aspects),
• release of manufacturing processes (quality and reliability (RAMS) aspects),
• operation of the quality data reporting system (Fig. 1.8),
• acceptance testing with customers;
[Figure 1.7: organization chart; recovered box labels: Management, QC]
Figure 1.7 Basic organizational structure for quality and reliability (RAMS) assurance in a company
producing complex equipment and systems with high quality and reliability (RAMS) requirements
(connecting lines indicate close cooperation; A denotes assurance, C control and tests during
production, Q quality, R reliability (RAMS))
[Figure residue (labels only): Defects, Failures; Collection; Compression; Processing; Storage; caption not recovered]
Table 1.3 Example of data reporting sheets for PCBs (populated printed circuit boards) evaluation,
from a quality data reporting system (see also footnote on p. 1)
Component | Manufacturer | No. of components (same type) | No. of components (same application) | Number of faults | % | No. of faults per place of occurrence: incoming inspection | in-process test | final test | warranty period
[Figure 1.9: recovered labels: Basic training, Special training]
Figure 1.9 Example for a practice-oriented training and motivation program in a company
producing complex equipment and systems with high quality and reliability (RAMS) requirements