Failure Modes and Effects Analysis Dr. M. Hodkiewicz
Contents • Definitions and background • Design and Process FMEA • Terminology • The FMEA Process • Discussion • Future developments
Learning Outcomes • After this session you will be able to: – Explain the role of FMEA/FMECA in the AM lifecycle process – Identify key components in the FMEA process – Conduct and report on a simple FMEA exercise – Appreciate challenges with FMEA implementation – Appreciate how FMEA can be updated by integration into the routine maintenance environment
References (1 of 2)
[1] Dailey, K.W., The FMEA Pocket Handbook. 2004: DW Publishing Company.
[2] McDermott, R.E., R.J. Mikulak, and M.R. Beauregard, The basics of FMEA. 1996: Productivity.
[3] SAE J1739: Potential Failure Modes and Effects analysis in Design and Potential Failure Effects in Manufacturing and Assembly Processes Reference Manual - Draft for review. 2005.
[4] MIL-STD-1629A: Procedure for performing a failure mode, effects and criticality analysis. 1980.
References (2 of 2)
[5] Tweeddale, M., Managing Risk and Reliability of Process Plants. 2003: Gulf Publishing.
[6] ISO 14224: Petroleum and natural gas industries - Collection and exchange of reliability and maintenance data for equipment. 1999.
[7] IEC 60050-191: International Electrotechnical Vocabulary - Dependability and Quality of Service. 1990.
[8] MIL-STD-721C: Definitions of terms for reliability and maintainability. 1995.
[9] Macaulay, D., The Way things work. 1988: RD Press.
[10] What's wrong with your existing FMEAs, 24/7 Quality.com.
Software/Internet Resources • FMEA InfoCentre: http://www.fmeainfocentre.com/
• http://www.weibull.com/basics/fmea.htm • On-line paper: B. S. Dhillon, Failure modes and effects analysis-Bibliography, Microelectronics
[Author note on slide 6 (Melinda Hodkiewicz, 2006/07/16): Add from Plant Maintenance web site]
Definitions and background
What is FMEA? • MIL-STD-1629A [4]: “The purpose of FMEA is to study the results or effects of item failure on system operation and to classify each potential failure in terms of its severity” • SAE J1739 [3]: “A FMEA can be described as a systemised group of activities intended to: (a) recognise and evaluate the potential failure of a product/process and its effect, (b) identify actions which could eliminate or reduce the chance of a potential failure occurring, and (c) document the process. It is complementary to the process of defining what a design or process must do to satisfy the customer”.
Informal definition • “FMEA is a non-quantitative analysis that aims to identify the nature of the failures that can occur in a system, machine, or piece of equipment by examining the sub-systems or components in turn, considering for each the full range of possible failure types and the effect on the system of each type of failure. • FMECA is an extension of FMEA that assigns a ranking to both the severity of the possible effects and their likelihood, enabling the risks to be ranked” [From 5]
Philosophy • FMEA is a ‘common sense’ procedure. • The aim is to provide a framework/process to assist the thought process of a competent person engaged in identifying system or design problems. • The process focuses on what we want the equipment to do, not what it actually is. By identifying what functions need to be achieved, we can then identify situations when the equipment does not perform the required function, and focus attention on the related causes and effects.
An example FMEA report
For what activities is FMEA appropriate? • New designs, new technology, or new process • Modifications to existing design or process • Use of existing design or process in a new environment, location or application • Identifying monitoring and inspection practices for equipment • Identifying failure codes for the CMMS system • Part of the RCM process
Design and Process FMEA
Types of FMEA • FMEA can be applied to a physical entity or to a functional entity. • For example, it can be applied to: – a particular piece of equipment (design FMEA), or – a process function (process FMEA).
Example
[Figure: Gas export compression train. Labelled items: anti-surge (LP), anti-surge (HP), PCV 1 (LP), gas export header, LP suction cooler, LP suction drum, LP discharge cooler, discharge cooler, HP suction drum, dehydration pack, gearbox, LP stage compression, HP stage compression, PCV 2, to subsea pipeline; with pressure (P), temperature (T) and flow (m) measurement points.]
Design FMEA (DFMEA) • Identifies functional requirements of a design • Evaluates the initial design for manufacturing, assembly, service and recycling requirements. • Used by Design Team. The customer for the design team may be the end user, the design engineer of the higher level assemblies or the manufacturing process/assembly team.
Design FMEA in the AM context
• If you are the maintenance engineer in an oil and gas or similar facility, it is unlikely that you will be involved in a design FMEA process. • However, if you are (1) troubleshooting equipment, (2) developing failure codes or (3) engaged in RCM, then information from the design FMEA conducted by the manufacturer may be helpful.
Process FMEA (PFMEA)
• Identifies the process functions, process requirements, potential product and process failures and the effects on the customer. • Identifies process/operational variables on which to focus controls. • Traditionally used by Manufacturing/ Assembly/ Process team. The customer can be a downstream team, a service operation, or even government regulations. • In the AM arena, there is some overlap between HAZOP and Process FMEA for operational equipment
Machinery FMEA (MFMEA) • This is a new category in the draft SAE J1739-2005 aimed at Plant Machinery and Tools. • In AM, machinery FMEA may be applied to important maintenance support tools such as lathes, cranes, milling machines etc. • There are similarities in approach between DFMEA, PFMEA and MFMEA.
Relationship
SYSTEM: Components, sub-systems, main systems
DESIGN: Components, sub-systems, main systems
PROCESS: Manpower, Machine, Method, Material, Measurement, Environment
MACHINERY: Tools, work stations, production lines, operator training, processes, gauges
Approaches to FMEA • A FMEA may be based on (a) a hardware/physical approach, or (b) a functional approach. • (a) The hardware approach lists individual hardware items and analyses their possible failure modes. • (b) The functional approach recognises that every item is designed to perform a number of functions that can be classified as outputs. The outputs are listed and their failure modes analysed. • For complex systems, a combination of (a) and (b) may be required.
Maintenance • For maintenance personnel, FMEA is a direct approach to the reduction of maintenance costs through the elimination of faults that give rise to the maintenance task. • FMEA identifies the most critical problems first, paving the way for improved maintenance techniques. • FMEA on installed equipment provides suggestions for redesign and ‘proactive maintenance’ – (From: Hastings, 1998, Reliability and Maintenance Course notes, QUT)
Terminology
Terms & definitions (1) • Failure: Termination of the ability of an item to perform a required function [7] • Required function: Function, or combination of functions, of an item which is considered necessary to provide a given service [7] • Failure mode: The manner by which a failure is observed. Generally describes the way the failure occurs and its impact on equipment operation [4].
Terms & definitions (2) • Failure cause: • (1) Circumstances during design, manufacture or use which have led to failure [7]. • (2) The physical or chemical processes, design defects, quality defects, part misapplication, or other processes which are the basic reason for failure or which initiate the physical process by which deterioration proceeds [4]. • Failure mechanism: Physical, chemical or other process which has led to failure [7]
Terms & definitions (3) • Failure effect: The consequence a failure mode has on the operation, function, or status of an item [4]. • Critical failure: Failure of an equipment unit which causes an immediate cessation of the ability to perform its required function [6] • Non-critical failure: Failure of an equipment unit which does not cause an immediate cessation of the ability to perform its required function [6]
Terms & definitions (4) • Criticality: A relative measure of the consequences of a failure mode and its frequency of occurrence [4] • Severity: The consequence of a failure mode. Severity considers the worst potential consequence of a failure, determined by degree of injury, property damage, or system damage that could ultimately occur.
Terms & definitions (5) • Reliability [8]: The probability that an item will perform its intended function(s) for a specified interval under stated conditions. • Undetectable (Hidden) failure: A postulated failure mode in the FMEA for which there is no failure detection method by which the operator is made aware of the failure [4].
The FMEA Process
FMEA flowsheet
DEFINE SCOPE
DEFINE LEVEL OF ANALYSIS
IDENTIFY FUNCTIONS AND FAILURE MODES
IDENTIFY CAUSES OF FAILURE
IDENTIFY EFFECTS OF FAILURE & SEVERITY RATING
ASSIGN OCCURRENCE (FREQUENCY) RATING
IDENTIFY CONTROLS & ASSIGN DETECTION RATING
CALCULATE RISK PRIORITY NUMBER FOR EACH EFFECT
RANK FAILURE MODES FOR ACTION & ANALYSIS
Steps in the FMEA process (from [3])
1. Define the SCOPE of the study (System boundary)
2. Decide on the LEVEL of analysis (System, sub-system, components)
3. For the selected system or sub-systems, IDENTIFY and list functions and the potential failure modes. Failure modes may be assessed at the hardware or functional level, or a combination of both.
Step 3 in the maintenance context • IDENTIFY and list failure modes …
– Information on what failed, and when, on a specific piece of equipment or in a system should be available in the maintenance management system (CMMS) – Depending on the organization of the system and the data quality processes, there may be a failure code indicating the cause of failure.
Continued … 4. For the selected system or sub-system and for each identified failure mode, identify the POTENTIAL EFFECT(s) on the machine, system or process and the relative importance (SEVERITY) of the effect(s).
Continued … 4. continued. The effects could include: – Injury to people – Damage to the environment – Damage to equipment – Loss of production – Reduced quality of production – Increased cost of operation
Continued … 5. Assign an (OCCURRENCE) ranking to each failure mode 6. For each failure mode for each element, identify CONTROLS – The means of preventing the failure by design, operating and maintenance practices, and management. – The means of detecting the failure and responding effectively to it – The means (if any) of limiting the impact of the failure, particularly by design changes.
Continued … 7. For each of the controls assign a DETECTION ranking 8. Calculate the Risk Priority Number (RPN) for each effect 9. Prioritise the failure modes for action (RANKING) 10. Take ACTION to eliminate or reduce the high risk failure modes 11. Calculate the resulting RPN as the failure modes are reduced or eliminated.
Selecting the team • Have you got representatives from all the stakeholders? • Do you have a facilitator? • Are the team members familiar with the subject but from diverse vantage points?
Setting up the meeting
• Provide advance notice
• Who will record meeting minutes?
• Who will facilitate?
• Establish ground rules
• Provide and follow an agenda
• Evaluate meetings
• Who will you report the results to?
• Allow no interruptions
Brainstorming rules [1] • Participants must be enthusiastic and give their imagination free rein • The recorder must be given time to record ideas • The ideas must be concisely recorded and placed in clear view of participants • Idea evaluation occurs after the session • Set a firm time limit • Clearly define the problem you want solved • The moderator must keep the group on subject and moving • When time is up, the group ranks the ideas
Selecting systems/subsystems and components • It is important to have an agreed taxonomy when breaking systems down into sub-systems and components. • This may be agreed by the team for a specific FMEA, or the team may choose to use a taxonomy described in a Standard, for example [6].
Deciding level for analysis [6]
Subunit: Maintainable items
• Power transmission: Gearbox, Variable drive, Bearings, Seals, Lubrication, Coupling to drive, Coupling to driven unit
• Pump unit: Support, Casing, Impeller, Shaft, Radial bearing, Thrust bearing, Seals, Valves, Piping, Cylinder liner, Piston, Diaphragm
• Control & monitoring: Control, Actuating device, Monitoring, Valves, Internal power supply
• Lubrication: Reservoir, Pump with motor, Filter, Cooler, Valves, Piping, Oil
• Miscellaneous: Purge air, Cooling/heating system, Filter/cyclone, Pulsation damper, Flange joints, Others
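As a rough illustration, the subunit and maintainable-item breakdown above could be held as a simple lookup structure so that a team can check where an item sits before starting the analysis. This is only a sketch: the names `PUMP_TAXONOMY` and `subunits_containing` are hypothetical, and the item lists are transcribed from the slide rather than from the standard itself.

```python
# Sketch only: pump taxonomy transcribed from the slide (after [6]).
# PUMP_TAXONOMY and subunits_containing are illustrative names, not from a standard.
PUMP_TAXONOMY = {
    "Power transmission": [
        "Gearbox", "Variable drive", "Bearings", "Seals", "Lubrication",
        "Coupling to drive", "Coupling to driven unit",
    ],
    "Pump unit": [
        "Support", "Casing", "Impeller", "Shaft", "Radial bearing",
        "Thrust bearing", "Seals", "Valves", "Piping", "Cylinder liner",
        "Piston", "Diaphragm",
    ],
    "Control & monitoring": [
        "Control", "Actuating device", "Monitoring", "Valves",
        "Internal power supply",
    ],
    "Lubrication": [
        "Reservoir", "Pump with motor", "Filter", "Cooler", "Valves",
        "Piping", "Oil",
    ],
    "Miscellaneous": [
        "Purge air", "Cooling/heating system", "Filter/cyclone",
        "Pulsation damper", "Flange joints", "Others",
    ],
}


def subunits_containing(item: str) -> list[str]:
    """Return every subunit that lists the given maintainable item."""
    wanted = item.lower()
    return [
        subunit
        for subunit, items in PUMP_TAXONOMY.items()
        if wanted in (name.lower() for name in items)
    ]


print(subunits_containing("Impeller"))
print(subunits_containing("Valves"))
```

Some items (for example Valves or Seals) appear under more than one subunit, which is why the lookup returns a list: a failure record needs both the subunit and the item to be unambiguous.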
Defining functions and failure modes
• Required function: Function, or combination of functions, of an item which is considered necessary to provide a given service [7]. • Be explicit so it is clear when a functional failure has occurred. • Failure mode: The manner by which a failure is observed. Generally describes the way the failure occurs and its impact on equipment operation [4].
Defining functions and failures • Equipment: Diesel Engine Crankshaft • Function: To convert reciprocating force from pistons and connecting rods into rotational force through the bearings and crankshaft to the drive coupling at a maximum rate of up to ‘x’ kW per cylinder at up to ‘y’ rpm continuously or ‘z’ kW per cylinder at ‘w’ rpm for up to ‘v’ hours in 12. • Question: What are some possible functional failures?
Functional failure • Function: To convert reciprocating force from pistons and connecting rods into rotational force through the bearings and crankshaft to the drive coupling • Functional Failure: Unable to convert and transmit any force from the pistons
Failure mode (1 of 2) • Failure Mode (1): Damaged crankshaft axial alignment bearing (ball race) due to lubrication failure • Failure Effect (1): Crankshaft will float axially and foul on crankcase, misalignment of gear drives. • Existing controls (1): Daily fuel dilution test, weekly oil screen, change oil and filters as required.
Failure mode (2 of 2) • Failure Mode (2): Damaged crankshaft axial alignment bearing (ball race) due to bearing material failure • Failure Effect (2): Same as (1). Crankshaft will float axially and foul on crankcase, misalignment of gear drives. • Existing controls (2): Routine vibration monitoring. Replace bearing as required.
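The crankshaft example above follows a fixed chain: function, functional failure, failure mode, effect, existing controls. A minimal sketch of how that chain might be recorded is shown below. The class and field names are hypothetical, chosen only for illustration; the text values are taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class FailureModeEntry:
    """One failure mode with its effect and existing controls (illustrative fields)."""
    failure_mode: str
    effect: str
    existing_controls: list[str] = field(default_factory=list)


@dataclass
class FunctionRecord:
    """A function, its functional failure, and the failure modes behind it."""
    equipment: str
    function: str
    functional_failure: str
    failure_modes: list[FailureModeEntry] = field(default_factory=list)


EFFECT = ("Crankshaft will float axially and foul on crankcase, "
          "misalignment of gear drives")

crankshaft = FunctionRecord(
    equipment="Diesel engine crankshaft",
    function=("Convert reciprocating force from pistons and connecting rods "
              "into rotational force to the drive coupling"),
    functional_failure="Unable to convert and transmit any force from the pistons",
    failure_modes=[
        FailureModeEntry(
            failure_mode=("Damaged crankshaft axial alignment bearing "
                          "due to lubrication failure"),
            effect=EFFECT,
            existing_controls=["Daily fuel dilution test", "Weekly oil screen",
                               "Change oil and filters as required"],
        ),
        FailureModeEntry(
            failure_mode=("Damaged crankshaft axial alignment bearing "
                          "due to bearing material failure"),
            effect=EFFECT,
            existing_controls=["Routine vibration monitoring",
                               "Replace bearing as required"],
        ),
    ],
)

for number, entry in enumerate(crankshaft.failure_modes, start=1):
    controls = "; ".join(entry.existing_controls)
    print(f"({number}) {entry.failure_mode} -> controls: {controls}")
```

Keeping the two failure modes as separate entries, even though their effect is identical, preserves the point of the example: the same effect can call for different controls depending on the cause.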
Examples of failure modes • For mechanical equipment – Cracked, Loosened, Fractured, Leaking, Oxidised, Loss of structural support, Deformed, Slips, Disengages too fast, Failure to transmit torque. • For electrical equipment – No signal, Intermittent signal, Inadequate signal, Sticking, Drift.
Process Failure modes (from [4])
Failure mode   Definition
FTS            Fail to start on demand
STP            Fail to stop on demand
SPS            Spurious stop
FTC            Fail to close on demand
FTO            Fail to open on demand
FTR            Fail to regulate
DOP            Delayed operation
FTF            Fail to function on demand
AOL            Abnormal output – low
AOH            Abnormal output – high
OWD            Operation without demand
OHE            Overheating
PDE            Parameter deviation
AIR            Abnormal instrument reading
STD            Structural deficiency
BRD            Breakdown
HIO            High output
LOO            Low output
ERO            Erratic output
VIB            Vibration
NOI            Noise
ELU            External leakage lubricant, hydraulic fluid
ELP            External leakage process medium
INL            Internal leakage
LCP            Leakage in closed position
PLU            Plugged/choked
SER            Minor in-service problems
OTH            Other
UNK            Unknown
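One practical use of a fixed code list like the one above is to validate the failure codes entered on work orders. The sketch below shows one possible way to do this; the function names `describe` and `split_valid_invalid` are hypothetical, and the code table is transcribed from the slide.

```python
# Sketch only: failure mode codes transcribed from the slide above.
FAILURE_MODE_CODES = {
    "FTS": "Fail to start on demand",
    "STP": "Fail to stop on demand",
    "SPS": "Spurious stop",
    "FTC": "Fail to close on demand",
    "FTO": "Fail to open on demand",
    "FTR": "Fail to regulate",
    "DOP": "Delayed operation",
    "FTF": "Fail to function on demand",
    "AOL": "Abnormal output - low",
    "AOH": "Abnormal output - high",
    "OWD": "Operation without demand",
    "OHE": "Overheating",
    "PDE": "Parameter deviation",
    "AIR": "Abnormal instrument reading",
    "STD": "Structural deficiency",
    "BRD": "Breakdown",
    "HIO": "High output",
    "LOO": "Low output",
    "ERO": "Erratic output",
    "VIB": "Vibration",
    "NOI": "Noise",
    "ELU": "External leakage lubricant, hydraulic fluid",
    "ELP": "External leakage process medium",
    "INL": "Internal leakage",
    "LCP": "Leakage in closed position",
    "PLU": "Plugged/choked",
    "SER": "Minor in-service problems",
    "OTH": "Other",
    "UNK": "Unknown",
}


def describe(code: str) -> str:
    """Return the definition for a code, ignoring case and surrounding spaces."""
    return FAILURE_MODE_CODES[code.strip().upper()]


def split_valid_invalid(codes):
    """Separate entered codes into those in the table and those that are not."""
    valid, invalid = [], []
    for code in codes:
        normalised = code.strip().upper()
        (valid if normalised in FAILURE_MODE_CODES else invalid).append(normalised)
    return valid, invalid


print(describe("fts"))
print(split_valid_invalid(["VIB", " elp ", "LEAK"]))
```

A high share of OTH and UNK entries, or of codes rejected by a check like this, is a quick indicator of the data quality problems discussed later in the session.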
Examples of design failure causes
• Improper tolerances
• Incorrect (stress or other) calculations
• Wrong assumptions
• Wrong material
• Lower grade components
• Lack of design standards
• Incorrect algorithm
• Insufficient lubrication capability
• Excessive heat
Examples of failure causes in manufacturing & process
1. Skipped steps
2. Processing errors
3. Set up errors
4. Missing parts
5. Wrong parts
6. Processing incorrect work piece
7. Mis-operation
8. Adjustment error
9. Equipment improperly set-up
10. Tools improperly prepared
11. Poor control procedures
12. Improper maintenance
13. Bad ‘recipe’
14. Fatigue
15. Lack of safety
16. Hardware failure
17. Failure to enforce controls
18. Environment
19. Stress connections
20. Poor FMEAs
equipment
Example of design failure mechanisms
• Yield
• Fatigue
• Material instability
• Creep
• Wear
• Corrosion
• Chemical oxidation
Examples of design controls • Prototype testing • Design reviews • Worst case stress analysis • FEA • Fault tree analysis
ACTIVITY • Workshop activity to identify functions and failure modes • The aim of this activity is to see how the functions are broken down and assessed at the different levels.
[from 3] Bicycle for (Male) Commuter • List some design objectives (functions) of a regular commuter bicycle • For two of the functions, identify potential failure modes
Bicycle example continued • Identify some of the sub-systems of the bicycle • For one subsystem: Identify at least two functions and failure modes.
Recording the FMEA process
Severity (S) • A relative ranking, within the scope of the individual FMEA. • A reduction in Severity can be achieved by design change to system, sub-system or component, or a redesign of the process. • The rank depends on the evaluation criteria. Examples of suitable tables are available in the literature. Some companies may have standard tables.
Severity Tables (from [1])
Occurrence (O) • This is the likelihood that a specific cause/mechanism (listed in the previous column) will occur. Occurrence is usually based on ranking charts and is a relative rating within the scope of the FMEA.
Occurrence Tables (from [1])
Controls • (1) prevent to the extent possible the failure mode or cause from occurring or reduce the rate of occurrence, or • (2) detect the cause/mechanism and lead to corrective action, or • (3) detect the failure mode or cause should it occur.
Detection ranking (D) • A rank associated with the best type of control listed in the previous column. Detection is a relative ranking within the scope of the FMEA.
Detection Tables (from [1])
Risk Priority Number • RPN = (S) × (O) × (D) • Within the scope of the individual FMEA, the resulting value (between 1 and 1000) can be used to rank order the concerns identified by the process. This allows the highest ranking items to be identified and addressed.
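The RPN calculation and ranking can be sketched in a few lines. The example below assumes each of S, O and D is rated on a 1 to 10 scale, which is consistent with the stated RPN range of 1 to 1000. The ratings shown are invented purely for illustration, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RatedFailureMode:
    """A failure mode with severity, occurrence and detection ratings (1-10 assumed)."""
    description: str
    severity: int
    occurrence: int
    detection: int

    def __post_init__(self):
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be between 1 and 10, got {value}")

    @property
    def rpn(self) -> int:
        """Risk Priority Number: RPN = S x O x D."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Highest RPN first; ties are broken by higher severity."""
    return sorted(modes, key=lambda m: (m.rpn, m.severity), reverse=True)


# Ratings below are made up for illustration only.
modes = [
    RatedFailureMode("Bearing damage due to bearing material failure", 8, 2, 5),
    RatedFailureMode("External leakage of process medium at seal", 4, 6, 2),
    RatedFailureMode("Bearing damage due to lubrication failure", 8, 4, 3),
]

for mode in rank_by_rpn(modes):
    print(f"RPN {mode.rpn:4d}  {mode.description}")
```

Breaking ties on severity is a design choice rather than part of the RPN definition; it reflects the preference, stated under action plans, for addressing high severity issues first.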
Action plans • Recommended action(s) • Corrective action should be directed at high severity, high RPN issues. The intent of the action is to reduce rankings in the order of preference: severity, occurrence and detection. • Actions taken and resulting revised ratings • After a preventative/corrective action has been identified, estimate and record the resulting S, O and D rankings. All revised rankings should be reviewed to see if further action is necessary.
Design FMEA actions • An increase in design validation/verification actions will result in a reduction of the ‘detection’ ranking only • Occurrence ranking can be affected by removing or controlling the causes or mechanisms through design revision • Design revision can also affect severity ranking
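Recording revised ratings after an action might look like the sketch below. It compares the RPN before and after the action and flags whether further action is needed. The ratings, the target value of 50 and the function names are all assumptions made for illustration; the slides do not prescribe an RPN threshold.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: S x O x D."""
    return severity * occurrence * detection


def review_action(before, after, rpn_target):
    """Compare (S, O, D) ratings before and after an action.

    rpn_target is a hypothetical site-specific threshold, not a standard value.
    """
    rpn_before = rpn(*before)
    rpn_after = rpn(*after)
    return {
        "rpn_before": rpn_before,
        "rpn_after": rpn_after,
        "reduction_pct": round(100 * (rpn_before - rpn_after) / rpn_before, 1),
        "further_action": rpn_after > rpn_target,
    }


# Illustrative ratings: the action lowers occurrence and detection, not severity.
result = review_action(before=(8, 4, 3), after=(8, 2, 2), rpn_target=50)
print(result)
```

In this made-up case the severity rating stays at 8 after the action, so a review might still call for a design change even though the RPN has fallen below the chosen target.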
Discussion
Drawbacks (1 of 2) • 1. The ratings and RPN are subjective • 2. The categorisation into failure mode and cause does not allow for thinking in terms of causal chains, the ‘5 WHYS’ or other processes. For each mode you must have a failure cause. This cause may have a deeper cause. Sometimes the cause and the mode are the same. • 3. It can be difficult to control brainstorming sessions
Drawbacks (2 of 2) • 4. Legal ramifications: if you have identified a failure mode but you have not eliminated it, are you culpable for negligence? • 5. The approach makes it difficult to allow for the interaction of two benign failure modes. • 6. FMEA often assumes that the part is ‘in tolerance’. To assume otherwise expands the scope of FMEA considerably. However, in real life, out-of-spec parts are common.
Common problems with FMEA (from [10])
• Engineers often do not follow a recognised standard and company format • Multiple descriptions of the exact same failure mode, cause or effect • No recommendations or corrective action for high RPN items • Inconsistent documents between parts of the study • No document control or revision control • Engineers don’t see value in a FMEA; they find them a pain to perform and labour intensive • Companies perform a FMEA study when it is too late.
Benefits of design FMEA (1 of 2)
• Aids in objective evaluation of design, including functional requirements and design alternatives • Evaluating the initial design for manufacturing, assembly, service and recycling requirements • Increasing the probability that potential failure modes and their effects on the system have been considered in the design/development process.
Benefits of design FMEA (2 of 2)
• Developing a ranked list of potential failure modes according to their effect on the customer (can be the assembly team), thus establishing a priority system for design improvements, development and validation testing/analysis. • Providing an open-issue format for recommending and tracking risk reduction actions • Providing future reference, e.g. lessons learned, to aid in analysing field concerns, evaluating design changes and developing advanced designs
Benefits of Process FMEA • Identifies the process functions and requirements • Identifies potential product and process related failure modes • Assesses the potential customer effects of the failures • Identifies process variables on which to focus process controls • Develops a ranked list of potential failure modes thus establishing a priority system for preventative/corrective action considerations • Documents the results of the analysis of the manufacturing, assembly or production process
Future developments
FMEA and Failure Analysis: Closing the Loop Between Theory and Practice Dr Joanna Sikorska, Imes Group Ltd Dr Melinda Hodkiewicz, UWA Presented to Engineers Australia Conference, May 2006
Before and after events • FMEA identifies failure modes, causes and effects before they occur. • The Computerized Maintenance Management System (CMMS) records events/failures as/after they occur. • QUESTION: Is there benefit in a feedback loop from the CMMS to update the FMEA failure records and O, D, and S values?
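One possible form of such a feedback loop is sketched below: failure events recorded in the CMMS are counted per failure mode code, converted to a rate, and mapped to an occurrence rating that can be compared with the value assumed in the FMEA. Everything specific here is an assumption for illustration: the rate boundaries in `RATE_BOUNDS`, the equipment tags, the event records and the FMEA ratings are invented, and a real study would take its boundaries from the occurrence table it has adopted.

```python
from bisect import bisect_right
from collections import Counter

# ASSUMPTION: upper bounds (failures per year) separating occurrence ratings 1..10.
# These values are invented; use the occurrence table adopted for the study.
RATE_BOUNDS = [0.01, 0.1, 0.5, 1, 2, 5, 10, 20, 50]


def occurrence_rating(failures_per_year: float) -> int:
    """Map an observed failure rate to an occurrence rating from 1 to 10."""
    return bisect_right(RATE_BOUNDS, failures_per_year) + 1


def observed_occurrence(events, years: float) -> dict:
    """Count events per failure mode code and convert counts to ratings."""
    counts = Counter(code for _equipment, code in events)
    return {code: occurrence_rating(n / years) for code, n in counts.items()}


# Hypothetical CMMS extract: (equipment tag, failure mode code) over two years.
cmms_events = [("P-101", "ELP"), ("P-101", "ELP"), ("P-102", "ELP"), ("P-101", "VIB")]
# Hypothetical occurrence ratings assumed in the original FMEA.
fmea_occurrence = {"ELP": 3, "VIB": 4, "FTS": 2}

observed = observed_occurrence(cmms_events, years=2.0)
for code, assumed in fmea_occurrence.items():
    actual = observed.get(code)
    if actual is None:
        print(f"{code}: no recorded events, FMEA rating {assumed} unconfirmed")
    elif actual != assumed:
        print(f"{code}: FMEA assumed {assumed}, CMMS data suggests {actual}")
    else:
        print(f"{code}: FMEA rating {assumed} consistent with CMMS data")
```

A comparison like this only updates O; severity and detection would still need engineering judgement, and the result is only as good as the failure coding in the CMMS.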
What happens now? • FMEA process: – Proactive but subjective analysis of hypothetical – Integrated into other methodologies – Large upfront costs – Results or benefits rarely substantiated – Static process – Non-inclusive – Completed reports collect dust – Data & process owned by engineering
What happens now? • Failure analysis process & storage in CMMS: – Retrospective & selective view of reality – Hampered by bad/missing data – Evolved functionality – Dictated by accountants – Interfaces ruled by codes & structure – Poor integration with non-financial systems – Widely distributed, used & disliked – Data & process owned by ops/maintenance
What happens now? • FMEA & CMMS data rarely linked &/or integrated • Why? – Different process owners – Non-uniform coding between FMEA & CMMS systems – Reporting may be at different hierarchy levels – Hierarchies may be different – Tradition
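The hierarchy mismatch can be illustrated with a small sketch. Suppose the CMMS records failures against maintainable items while the FMEA was carried out at subunit level; the records then need to be rolled up before the two can be compared. The mapping, the records and the `roll_up` function below are hypothetical examples, not an extract from any real system.

```python
from collections import Counter

# Hypothetical mapping from maintainable item (CMMS level) to subunit (FMEA level).
PARENT_SUBUNIT = {
    "Impeller": "Pump unit",
    "Shaft": "Pump unit",
    "Gearbox": "Power transmission",
    "Reservoir": "Lubrication",
}

# Hypothetical CMMS records: (maintainable item, failure mode code).
cmms_records = [
    ("Impeller", "BRD"),
    ("Shaft", "VIB"),
    ("Gearbox", "NOI"),
    ("Impeller", "LOO"),
    ("Baseplate", "STD"),  # item missing from the mapping
]


def roll_up(records, parent):
    """Count failure records per subunit; unmapped items are kept visible."""
    return Counter(parent.get(item, "UNMAPPED") for item, _code in records)


totals = roll_up(cmms_records, PARENT_SUBUNIT)
for subunit, count in totals.most_common():
    print(f"{subunit}: {count}")
```

Counting unmapped items separately, rather than dropping them, makes coding gaps between the two systems visible instead of hiding them.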
Issues to overcome • Structural issues – Consistency of coding & reporting – Adapt systems to users not administrators
• Data quality issues – Ensure data is fit for purpose – Real-time data verification – Up-skill data collectors
Issues to overcome • Organizational issues – Implement cultural change – Include the disenfranchised – Increase frequency & quality of feedback – Improve status of data collectors • Technical challenges are trivial by comparison
What is our vision? • Living FMEA • Live links between theoretical (FMEA) & actual (CMMS) • Inclusive process, shared ownership for both datasets • Managed, audited & utilized data processes • Live feeds into various business improvement systems
How can we get there? • Living FMEA model • See Handout
Benefits • Aids prioritization & guides business response • Improves reliability analysis from: – Reduced reliance on free text – More consistent failure classification • Facilitates maintenance optimization • Uncovers disparity between theory & reality • Creates knowledge workers • Ensures a recorded, managed & auditable process
More benefits • Reviews & measures success of FMEA process • Maximizes return on FMEA investment • Supplies future FMEAs with objective data • Future studies become easier
End