WHAT IS FMEA OR FMECA?
Failure Mode and Effects Analysis (FMEA) is a bottom-up risk analysis technique used to identify the ways in which each basic component of a system could fail. The effects of these failures are then assessed at higher levels to show the system-level consequences, normally stated as the End Effect.
Adding the concept of Criticality turns the analysis into an FMECA (Failure Mode, Effects and Criticality Analysis), in which the Severity and Occurrence of each failure are also assessed. Severity sets the acceptable probability threshold: the more severe the effect, the lower the probability that can be accepted, as shown in Figure 1. Occurrence is simply the predicted probability of each failure, obtained from previous studies such as a Reliability Prediction Analysis (RPA) or from predefined criteria. Combining both concepts gives the risk associated with each failure.
Figure 1. Probability vs. Severity graph (AMC 25.1309)
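The inverse relationship between severity and acceptable probability can be sketched as a simple lookup. This is a minimal illustration, not a compliance tool: the numeric limits are the commonly quoted orders of magnitude per flight hour and are included here as assumptions to be confirmed against the applicable regulation, and the names `MAX_PROBABILITY_PER_FH` and `is_risk_acceptable` are hypothetical.

```python
from typing import Optional

# Maximum acceptable average probability per flight hour for each severity
# class. ASSUMPTION: commonly quoted orders of magnitude, for illustration
# only; confirm the values against the applicable regulation.
MAX_PROBABILITY_PER_FH: dict[str, Optional[float]] = {
    "No Safety Effect": None,  # no probability requirement
    "Minor": 1e-3,
    "Major": 1e-5,
    "Hazardous": 1e-7,
    "Catastrophic": 1e-9,
}


def is_risk_acceptable(severity: str, probability_per_fh: float) -> bool:
    """Return True if the predicted probability is within the limit for the severity."""
    limit = MAX_PROBABILITY_PER_FH[severity]
    # The more severe the effect, the lower the limit the prediction must meet.
    return limit is None or probability_per_fh <= limit
```

Under these assumed limits, a Major failure condition predicted at 1e-6 per flight hour would pass the check, while a Catastrophic one at 1e-7 would not.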
WHY PERFORM AN FMEA?
FMEA is not strictly a required deliverable for certification purposes, but it is the standard analysis used in the aerospace industry to assess the chain of events leading to failures. FMEA helps systems engineers think about how the analysed product works and, more importantly, how it fails. What parts, sub-systems or systems make up the product? What function does each part perform? How many failure modes does it have? What is the probability of failure for each mode? What end effects do these failures entail? These are fundamental questions that FMEA will answer. FMEA also helps identify single points of failure in a system, which are to be avoided.
All these answers create a full view of the system’s weaknesses and critical points. This information is the essence of FMEA’s purpose, which can be summarised in the following three points:
- Identify and map how failures occur for each system part
- Identify failure causes and their effects
- When using FMECA, also evaluate the criticality of each failure mode
HOW IS FMEA RELATED TO OTHER RAMS ANALYSES?
FMEA is the basis for further analysis in the reliability and safety field. For instance, Fault Tree Analysis (FTA) studies the relationship between different failures that cause an undesired end effect. If an FMEA has been developed beforehand, all those basic events have already been identified, which makes the fault trees easier to build.
Figure 2. Relationship of different elements of FHA, FMEA & FTA
Moreover, if an FMECA is developed, the previously computed Reliability Prediction Analysis (RPA) makes it possible to incorporate quantitative failure rate values for each component. Thus, a preliminary analysis can be done by combining the probabilities of the failures that directly lead to each end effect. The FMES (Failure Mode and Effect Summary) gathers this information, showing the different top failures and their preliminary total probabilities, which will be corroborated in subsequent FTAs.
Figure 2 shows how failure conditions from an FHA are related to FMEA end effects, which can become top-level events in an FTA. Likewise, according to ARP4761, when an FMEA is performed, a traceability relationship should be established to ensure that all significant effects are linked to basic events.
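The FMES roll-up described above can be sketched as a sum of failure rates grouped by end effect. This is only a rough illustration: the component identifiers, end effects and failure rates are invented, the function name `build_fmes` is hypothetical, and simple addition is an assumption that holds only for independent single failures that each lead directly to the end effect.

```python
from collections import defaultdict

# Hypothetical FMEA rows:
# (component_id, failure_mode, end_effect, failure_rate_per_hour)
fmea_rows = [
    ("PSU-01", "Loss of function", "Loss of display", 2.0e-6),
    ("DSP-01", "Loss of function", "Loss of display", 5.0e-7),
    ("SNS-01", "Erroneous function", "Misleading indication", 1.0e-7),
]


def build_fmes(rows):
    """Sum the failure rates of all failure modes that share an end effect."""
    totals = defaultdict(float)
    for _component, _mode, end_effect, rate in rows:
        # ASSUMPTION: each single failure leads directly to the end effect,
        # so the preliminary total is the plain sum of the rates.
        totals[end_effect] += rate
    return dict(totals)


fmes = build_fmes(fmea_rows)
```

Each key of `fmes` is a candidate top event for a later FTA, and its value is the preliminary total that the fault tree is expected to corroborate.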
INPUTS REQUIRED TO PERFORM FMEA
Before diving into an FMEA, it is essential to gather all relevant information in the following areas:
1) Requirements:
The FMEA process begins by clarifying the scope of the analysis, which can vary depending on certification, design and operational requirements. For a quantitative FMECA (see section below), the applicable list of failure rates must be calculated first (RPA). Understanding the intended operating environment and the different phases of operation helps identify potential failure modes and their effects, which is directly linked to the design requirement of having a safe product. For example, an engine failure during take-off has more severe effects than an engine failure during cruise, descent or landing.
2) System structure:
It is important to assign a unique identifier to each component using a systematic approach that takes into account the type, function and location of the component within the system. A reference document that describes each component in detail, including its purpose, specifications and relevant technical details, can serve as a guide throughout the FMEA process. When working in a team, it ensures that everyone involved in the project has a clear understanding of the system components.
3) Technical documentation:
The latest technical documentation for the system design must be available in order to perform an FMECA and correctly understand system functions and operations. The following documents shall be the input list:
- BoM: Bill of Materials, a hierarchical list of systems, subsystems and components for the whole product
- Schematics for electronic systems, technical description of mechanical parts, etc.
- CAD models
- Previous FMEAs, if existing
- Supplier FMEAs for Commercial Off-The-Shelf (COTS) equipment
4) Standard practices:
Develop a preliminary list of failure modes, failure causes and detection means to be used, in line with industry regulations or standard practices. For failure modes, several documents can be used:
- Rome Laboratory “Reliability Engineer’s Toolkit” (1993)
STEPS TO PERFORM FMEA
Once all necessary inputs have been obtained, define the scope of analysis and start developing the FMEA. It is recommended to execute the following steps in this order to ensure a systematic analysis.
1) Identify the functionality of parts:
For each uniquely identified part, describe its functions within the system or subsystem it belongs to. If it performs several tasks, state the most relevant ones related to the item’s possible failure modes.
2) Brainstorm potential failure modes:
Assign the most likely failure modes to each component, for each operational phase where relevant. Be consistent with the failure modes selected for the project and their definitions and, if necessary, add a new type of failure mode and define it to avoid misunderstandings. Failure modes need to be defined with the design team and studied on a case-by-case basis. However, they generally fall into one of three generic failure modes:
- Loss of function: The item, functional block or subsystem is no longer operating and the function previously performed by it is gone.
- Erroneous function: The item or functional block is operating within design thresholds and parameters, but not in the desired way.
- Incorrect function: The item or functional block is operating outside design thresholds and parameters.
Each generic failure mode can carry a different severity. As an example, take an item or functional block that conditions a signal. Loss of function would mean that the signal does not reach its destination. Erroneous function would mean that the signal is within the expected range of values but is an inaccurate reading that does not represent the true value. Finally, incorrect function would mean that the signal is outside the expected range of values, indicating that the item or functional block responsible for conditioning the signal is not functioning correctly.
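The signal-conditioning example can be expressed as a small classifier. This is a sketch under stated assumptions: the function name `classify_signal`, the `tolerance` parameter and the idea that a reference true value is available for comparison are all hypothetical and serve only to make the three generic failure modes concrete.

```python
from typing import Optional, Tuple


def classify_signal(
    reading: Optional[float],
    true_value: float,
    valid_range: Tuple[float, float],
    tolerance: float,
) -> str:
    """Map a conditioned signal reading to one of the generic failure modes."""
    if reading is None:
        # No signal reaches its destination.
        return "Loss of function"
    low, high = valid_range
    if not (low <= reading <= high):
        # Signal is outside the expected range of values.
        return "Incorrect function"
    if abs(reading - true_value) > tolerance:
        # Signal is in range but does not represent the true value.
        return "Erroneous function"
    return "Nominal"
```

For a sensor with a valid range of 0 to 10 and a true value of 5, a reading of 12 would be classified as incorrect function, while a reading of 7 would be classified as erroneous function because it is plausible but wrong.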
3) Assign a cause for each failure:
Assign the potential causes that could trigger each failure. As in the previous step, be consistent and record any change or addition to the failure causes selected for the project.
4) Analyse the effects of each failure:
Investigate the repercussions of the different failure modes through the system. Normally, three different levels are assessed:
- LOCAL EFFECT: Failure effect on the component itself.
- NEXT EFFECT: Failure effect on the next level of the system, such as the parent assembly or subsystem.
- END EFFECT: Failure effect at system level. It corresponds to a Functional Hazard Assessment (FHA) failure condition and will appear as a Fault Tree Analysis (FTA) top-level gate.
Attention! FMEA only accounts for single failure events. While evaluating the effects of a component failure, it is important to assume that all other components are functioning correctly. This also applies when a group of elements, such as screws or bolts, is treated as a single component: the analysis should focus on the failure of a single element within the group.
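One way to picture the worksheet built up in steps 1 to 4 is as one record per failure mode, carrying the three effect levels. The sketch below is illustrative only: the class name `FmeaRow`, its field names and the sample entries are hypothetical, not a prescribed template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FmeaRow:
    """One single-failure entry of a hypothetical FMEA worksheet."""
    component_id: str
    function: str
    failure_mode: str
    cause: str
    local_effect: str  # effect on the component itself
    next_effect: str   # effect on the next level of the system
    end_effect: str    # effect at system level


def end_effects(rows):
    """Return the distinct end effects, in first-seen order."""
    return list(dict.fromkeys(row.end_effect for row in rows))


rows = [
    FmeaRow("PSU-01", "Supply 28 V", "Loss of function", "Capacitor short",
            "No output voltage", "Display unpowered", "Loss of display"),
    FmeaRow("DSP-01", "Show flight data", "Loss of function", "Backlight failure",
            "Screen dark", "No data shown", "Loss of display"),
]
```

Because each row describes exactly one failure with everything else assumed healthy, two rows can share the same end effect without implying a combined failure; `end_effects(rows)` simply lists the system-level effects that the rows trace to.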
5) Assess risk and criticality:
There are several ways to assess the risk associated with each failure mode, depending on the type of analysis and its accuracy:
- FMEA: To find the most hazardous failures, the three values (severity, occurrence and detectability) are multiplied together. This results in a risk priority number (RPN) for each failure mode, and the failure modes with the highest RPN are the most critical. It is therefore important to prioritize these failures for further analysis and to implement appropriate mitigation measures to reduce the associated risk.
- Qualitative FMECA: The severity of each failure is determined using objectively defined scales based on indicators such as crew workload and flight safety, as shown in Table 1. The probability scale, defined for European aviation in terms of occurrences during operational life, can be seen in Table 2. Combining both, as seen before in Figure 1, gives the limits of risk acceptance.
|Severity classification||Effect on Aeroplane||Effect on Occupants excluding Flight Crew||Effect on Flight Crew|
|No Safety Effect||No effect on operational capabilities or safety||Inconveniences||No effect on flight crew|
|Minor||Slight reduction in functional capabilities or safety margins||Physical discomfort||Slight increase in workload|
|Major||Significant reduction in functional capabilities or safety margins||Physical distress, possibly including injuries||Physical discomfort or a significant increase in workload|
|Hazardous||Large reduction in functional capabilities or safety margins||Serious or fatal injury to a small number of passengers or cabin crew||Physical distress or excessive workload impairs ability to perform tasks|
|Catastrophic||Normally with hull loss||Multiple fatalities||Fatalities or incapacitation|
Table 1. Severity classification (AMC 25.1309)
|Qualitative probability||Aircraft operational life||Fleet operational life|
|Probable||One or more times||–|
|Extremely remote||Not anticipated||Few times|
|Extremely improbable||–||Not anticipated|
Table 2. Qualitative probability (AMC 25.1309)
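The RPN calculation mentioned for the plain FMEA case can be sketched as follows. The 1 to 10 rating scale is an assumption (a common convention, not something fixed by this article), and the function name `rpn` and the sample failure modes and ratings are hypothetical.

```python
def rpn(severity: int, occurrence: int, detectability: int) -> int:
    """Risk priority number: the product of the three ratings."""
    for name, value in (("severity", severity),
                        ("occurrence", occurrence),
                        ("detectability", detectability)):
        # ASSUMPTION: each rating uses a 1 (best) to 10 (worst) scale.
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return severity * occurrence * detectability


# Hypothetical failure modes with (severity, occurrence, detectability) ratings.
ratings = {
    "Pump seal leak": (8, 3, 5),
    "Connector corrosion": (4, 6, 2),
    "Sensor drift": (6, 4, 7),
}

# Highest RPN first: these are the failure modes to address first.
ranked = sorted(ratings, key=lambda mode: rpn(*ratings[mode]), reverse=True)
```

With these invented ratings, the RPN values are 120, 48 and 168, so sensor drift would be prioritised ahead of the pump seal leak even though its severity rating is lower.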
6) Allocate detection means to each failure mode:
There are three basic means of detection:
- EVIDENT: The failure is readily detected during operation.
- DORMANT: The failure can be detected when maintenance is performed.
- HIDDEN: The failure is not detected unless intentionally sought, for instance, by testing the system.
Detection refers to the ability to detect a failure. The detection means, despite not affecting the FMEA failure rates, will be relevant when developing the subsequent FTAs. Depending on the type, a specific latency time will be assigned, which modifies the failure probabilities used in the fault trees.
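The effect of latency time on the values used in a fault tree can be sketched with the usual constant-failure-rate model, where the probability of failure over an exposure time T is 1 - exp(-λT). This model is an assumption introduced here for illustration, and the exposure times assigned to each detection type, along with the names `EXPOSURE_TIME_H` and `failure_probability`, are hypothetical.

```python
import math

# ASSUMPTION: illustrative exposure (latency) times in hours.
# EVIDENT: detected in operation, so exposure is one flight.
# DORMANT: detected at maintenance, so exposure is the maintenance interval.
# HIDDEN: detected only by a dedicated test, so exposure is the test interval.
EXPOSURE_TIME_H = {
    "EVIDENT": 2.0,
    "DORMANT": 600.0,
    "HIDDEN": 6000.0,
}


def failure_probability(failure_rate_per_hour: float, detection: str) -> float:
    """Probability that the failure is present within the exposure time."""
    exposure = EXPOSURE_TIME_H[detection]
    # 1 - exp(-lambda * T), written with expm1 for accuracy at small values.
    return -math.expm1(-failure_rate_per_hour * exposure)
```

The same failure rate therefore yields a larger probability the longer a failure can stay undetected, which is why the detection means assigned in the FMEA matter for the subsequent FTAs.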
7) Plan corrective actions:
If the risk derived from the failure mode under analysis is deemed unacceptable (see Figure 1), corrective actions are needed in order to:
- Decrease the failure probability of the component.
- Mitigate the severity of the failure mode.
After implementing corrective actions, the risk assessment is normally re-evaluated to confirm that the risk index has decreased and a safer design has been achieved.
8) Summarize FMEA conclusions:
Once the FMEA tables have been completed, it is important to offer a good overview of the results and conclusions. Those are collected in the main report stating different aspects such as inputs, methodology, etc. It is important to draft the report index to properly structure the information according to the initially defined design, operational and certification requirements for the FMEA at hand.
CONSIDER THIS WHEN DOING FMEA
- Do not disregard preparatory steps. A good understanding of the system, gathering the most up-to-date data and defining the different components, failure modes, etc. in advance will greatly decrease the number of errors and revisions.
- A concept that really needs stressing is that in an FMEA/FMECA you are studying single failure events.
- If the specific nature of a failure mode cannot be identified because the component is complex, you may assume the worst case. Nevertheless, if the risk becomes unacceptable in the FTA, a lower-level analysis will be needed.
- Use international standards! Specific components or functional items might have their failure modes studied and described in international standards, which helps prevent the FMEA from missing failure modes or adding non-applicable ones. Standards like ECSS for space applications or SAE for automotive or aviation, for example, are a very valuable addition to your FMEA.
DID YOU KNOW?
You can perform an FMEA in a word processor table or even a general spreadsheet. However, there are interesting pieces of software on the market that specifically support the development of FMEA and offer engineering aids such as standard effects generation or links to useful content. One example is Robin, a software tool that supports the development and management of Failure Mode and Effects Analysis (FMEA) processes. With Robin FMEA, teams can work collaboratively to identify potential failures in a product or process, analyse the severity and likelihood of each failure, and develop action plans to mitigate the associated risk. Not limited to FMEA, the Robin RAMS suite offers complete Reliability, Availability, Maintainability and Safety solutions.
Some of the advantages of using specialized FMEA software tools include:
A. Standardized templates
B. Automated calculations
C. Collaboration and sharing
D. Organised content from relevant industry standards, regulations, and best practices