As per ICH Q9 “Quality Risk Management is a systematic process for the assessment, control, communication and review of risks to the quality of the product throughout its life cycle.” There are many methods and tools for performing Quality Risk Management. It is important to understand that no single tool or set of tools is sufficient to address every Quality Risk Management situation. Some of the important tools are as follows:

  1. Basic Risk Management Facilitation Methods
  2. Failure Mode and Effects Analysis (FMEA)
  3. Failure Mode, Effects and Criticality Analysis (FMECA)
  4. Fault Tree Analysis (FTA)
  5. Hazard Analysis and Critical Control Points (HACCP)
  6. Hazard Operability Analysis (HAZOP)
  7. Preliminary Hazard Analysis (PHA)
  8. Risk Ranking and Filtering
  9. Supporting Quality and Statistical Tools

Failure Mode, Effects and Criticality Analysis (FMECA) is considered an extension of Failure Mode and Effects Analysis (FMEA). Both are risk analysis methodologies designed to identify the potential failure modes of a process or product and to assess the associated risk. To be effective, the FMECA should be iterative, in step with the iterative nature of the design process. When properly performed, an FMECA is valuable for making program decisions on the adequacy and feasibility of a design approach. Timeliness is an important factor in its effective implementation.

The primary purpose of an FMECA is to identify all critical and catastrophic failure possibilities as early as possible, so that they can be eliminated or minimized through design correction while the impact on the reliability of the process is greatest. FMECA can follow a top-down approach (system level, implemented in the early design phase) or a bottom-up approach (component level, implemented once the system concept has been decided). The FMECA comprises the following two steps:

  1. Failure Mode and Effect Analysis (FMEA)
  2. Criticality Analysis (CA)

As discussed earlier, FMEA is a highly structured, systematic technique to define, identify and eliminate known and/or potential failures that may exist within the system, design or process, leading to an improved product design. In FMEA, risk is determined by calculating the Risk Priority Number (RPN = Severity × Occurrence × Detection).
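
The RPN calculation can be illustrated with a minimal sketch. The 1 to 10 scoring scale, the `FailureMode` class and the example failure modes are assumptions made for illustration; organizations define their own scales and thresholds.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One FMEA row. The 1-10 scales are an assumed, commonly used convention."""
    name: str
    severity: int    # 1 = negligible effect, 10 = most severe effect
    occurrence: int  # 1 = rare, 10 = almost certain
    detection: int   # 1 = almost always detected, 10 = practically undetectable

    def rpn(self) -> int:
        # Reject scores outside the assumed 1-10 scale before multiplying.
        for score in (self.severity, self.occurrence, self.detection):
            if not 1 <= score <= 10:
                raise ValueError("scores must lie between 1 and 10")
        return self.severity * self.occurrence * self.detection


# Hypothetical failure modes for a tablet-coating step (illustrative values only).
modes = [
    FailureMode("Spray nozzle blockage", 7, 4, 3),
    FailureMode("Inlet air temperature drift", 5, 6, 2),
    FailureMode("Wrong coating solution charged", 9, 2, 5),
]

# Rank the failure modes from highest to lowest RPN.
ranked = sorted(modes, key=lambda m: m.rpn(), reverse=True)
for m in ranked:
    print(f"{m.name}: RPN = {m.rpn()}")
```

Sorting by RPN gives a simple priority order for corrective actions; it does not replace the criticality analysis described next.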

Criticality Analysis ranks each potential failure mode identified in the FMEA according to the combined influence of its severity classification and its probability of occurrence. It can be performed in two ways, as described in MIL-STD-1629A (a military standard): qualitative and quantitative.

The qualitative approach is used when failure rate data are not available. In such cases, individual failure modes are grouped into logically defined failure probability levels, which are as follows:

  1. Level A – Frequent: A high probability of occurrence during the item operating time. High probability may be defined as a single failure mode probability greater than 0.20 of the overall probability of failure during the item operating time.
  2. Level B – Reasonably probable: A moderate probability of occurrence during the item operating time. Reasonably probable may be defined as a single failure mode probability of occurrence which is more than 0.10 but less than 0.20 of the overall probability of failure during the item operating time.
  3. Level C – Occasional: An occasional probability of occurrence during the item operating time. Occasional probability may be defined as a single failure mode probability of occurrence which is more than 0.01 but less than 0.10 of the overall probability of failure during the item operating time.
  4. Level D – Remote: An unlikely probability of occurrence during the item operating time. Remote probability may be defined as a single failure mode probability of occurrence which is more than 0.001 but less than 0.01 of the overall probability of failure during the item operating time.
  5. Level E – Extremely unlikely: A probability of occurrence that is essentially zero during the item operating time. Extremely unlikely may be defined as a single failure mode probability of occurrence which is less than 0.001 of the overall probability of failure during the item operating time.
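
The level definitions above can be sketched as a small lookup function. The function name `probability_level` is hypothetical, and because the definitions leave values lying exactly on a boundary (for example 0.20) unassigned, this sketch assumes that such a value falls into the lower level.

```python
def probability_level(fraction: float) -> str:
    """Map a failure mode's share of the overall failure probability to a level.

    `fraction` is the single failure mode probability expressed as a fraction
    of the overall probability of failure during the item operating time.
    Assumption: a value exactly on a boundary is assigned to the lower level.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    if fraction > 0.20:
        return "A"  # Frequent
    if fraction > 0.10:
        return "B"  # Reasonably probable
    if fraction > 0.01:
        return "C"  # Occasional
    if fraction > 0.001:
        return "D"  # Remote
    return "E"      # Extremely unlikely


for value in (0.25, 0.15, 0.05, 0.005, 0.0005):
    print(value, "->", "Level", probability_level(value))
```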

For the qualitative approach, a severity classification is also assigned to the potential consequences of an item failure or error. The categories are as follows:

Category I – Catastrophic: A failure which may cause death or weapon system loss (i.e., aircraft, tank, missile, ship, etc.)

Category II – Critical: A failure which may cause severe injury, major property damage, or major system damage which will result in mission loss.

Category III – Marginal: A failure which may cause minor injury, minor property damage, or minor system damage which will result in delay or loss of availability or mission degradation.

Category IV – Minor: A failure not serious enough to cause injury, property damage, or system damage, but which will result in unscheduled maintenance or repair.

A criticality matrix of the probability levels against the severity classifications is constructed to assess the impact of each failure.
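
As an illustration of how such a matrix might be assembled, the following sketch places failure mode identifiers into cells indexed by probability level and severity category. The identifiers, the `build_matrix` and `render` helpers, and the text layout are all assumptions made for this example; they are not prescribed by the standard.

```python
from collections import defaultdict

LEVELS = ["A", "B", "C", "D", "E"]     # rows, from most to least probable
CATEGORIES = ["IV", "III", "II", "I"]  # columns, in order of increasing severity

# Hypothetical failure modes: (identifier, probability level, severity category)
failure_modes = [
    ("FM-01", "B", "II"),
    ("FM-02", "D", "I"),
    ("FM-03", "A", "IV"),
    ("FM-04", "B", "II"),
]


def build_matrix(modes):
    """Group failure mode identifiers by (probability level, severity category)."""
    cells = defaultdict(list)
    for ident, level, category in modes:
        if level not in LEVELS or category not in CATEGORIES:
            raise ValueError(f"unknown level or category for {ident}")
        cells[(level, category)].append(ident)
    return cells


def render(cells):
    """Return the matrix as plain text, one row per probability level."""
    lines = ["Level | " + " | ".join(f"{c:^12}" for c in CATEGORIES)]
    for level in LEVELS:
        row = [", ".join(cells.get((level, c), [])) for c in CATEGORIES]
        lines.append(f"{level:^5} | " + " | ".join(f"{cell:^12}" for cell in row))
    return "\n".join(lines)


matrix = build_matrix(failure_modes)
print(render(matrix))
```

Failure modes that land towards the high-probability, high-severity corner of the matrix are the ones that most need corrective action.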

[Figure: criticality matrix, with the severity classification of potential consequences as one axis]

When data become available, the actual criticality should be analysed numerically using the quantitative approach.

The quantitative approach is used when actual data, i.e. failure rate data and configuration data for the component, are available. The data used shall be the same as the data applied for other maintainability and reliability analyses. A CA worksheet is used for the quantitative approach and shall be included with the FMECA report. The CA worksheet covers the following information:

  1. Identification number: The number assigned to the entry for traceability of the worksheet. (This is already included in the FMEA worksheet, so it can be transferred to the CA worksheet.)
  2. Item/Functional identification: The name of the system or item to be analyzed. (This is already included in the FMEA worksheet, so it can be transferred to the CA worksheet.)
  3. Function: The tasks that the system, design, process or service is required to perform. (This is already included in the FMEA worksheet, so it can be transferred to the CA worksheet.)
  4. Failure modes and causes: All possible failure modes for every level analyzed must be identified and described. (This is already included in the FMEA worksheet, so it can be transferred to the CA worksheet.)
  5. Mission phase/Operational mode: A statement of the mission phase and the operational mode in which the failure occurs. This is regarded as the timing information of the profile. (This is already included in the FMEA worksheet, so it can be transferred to the CA worksheet.)
  6. Severity classification: The category assigned to every failure mode that is analysed. (This is already included in the FMEA worksheet, so it can be transferred to the CA worksheet.)
  7. Failure probability/failure rate data source: When failure modes are evaluated according to their probability of occurrence, the failure probability level shall be listed. When failure rates are used in the calculation, the failure rates and their data source shall be listed. When a failure probability level has been listed, the remaining columns need not be filled in; only the criticality matrix has to be constructed.
  8. Failure effect probability (β): The conditional probability that the failure effect will result in the identified criticality classification, given that the failure mode has occurred. The β values are generally quantified as follows:

     | Failure effect | β value          |
     |----------------|------------------|
     | Actual loss    | 1.00             |
     | Probable loss  | > 0.10 to < 1.00 |
     | Possible loss  | > 0 to = 0.10    |
     | No effect      | 0                |

  9. Failure mode ratio (α): The probability, expressed as a decimal fraction, that the item or part will fail in the identified mode. The sum of the α values for an item or part equals one when all its potential failure modes are listed. Where α values are not available, the analyst shall assign them based on judgement of the item's functions.
  10. Part failure rate (λp): The number of failures per unit of time, expressed as failures per million hours (failures/10^6 hours).
  11. Operating time (t): The operating time in hours, or the number of operating cycles of the item per mission, derived from the system definition.
  12. Failure mode criticality number (Cm): A relative measure of the frequency of a failure mode. The equation used to derive Cm is as follows:

                        Cm = β × α × λp × t

  13. Item criticality number (Cr): A relative measure of the frequency and consequences of specific system failures. Cr is calculated by totaling all the failure mode criticality numbers of the item that have the same severity level (as determined by the FMEA):

                        Cr = ∑ (Cm)

  14. Remarks: Any remarks relating to the worksheet entry, such as a suggestion or a recommended improvement, shall be noted.
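
The Cm and Cr calculations from the worksheet can be sketched as follows. The `FailureModeEntry` class, the `item_criticality` helper and all numerical values are hypothetical. The sketch assumes λp is entered in failures per million hours and t in hours, so λp is multiplied by 10^-6 to keep the units consistent; some worksheets instead report Cm scaled by 10^6.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FailureModeEntry:
    """One CA worksheet row for a single failure mode of an item."""
    failure_mode: str
    severity: str    # severity category: "I", "II", "III" or "IV"
    beta: float      # failure effect probability (β)
    alpha: float     # failure mode ratio (α)
    lambda_p: float  # part failure rate (λp), failures per million hours
    t: float         # operating time (t), hours

    def cm(self) -> float:
        # Cm = β × α × λp × t, with λp converted to failures per hour.
        return self.beta * self.alpha * (self.lambda_p * 1e-6) * self.t


def item_criticality(entries):
    """Return Cr per severity category: the sum of Cm over modes in that category."""
    # The α values of one item should sum to one when every mode is listed.
    if not math.isclose(sum(e.alpha for e in entries), 1.0, abs_tol=1e-9):
        raise ValueError("failure mode ratios (alpha) must sum to 1")
    cr = defaultdict(float)
    for e in entries:
        cr[e.severity] += e.cm()
    return dict(cr)


# Hypothetical item with λp = 50 failures per million hours and a 100-hour mission.
entries = [
    FailureModeEntry("Seal leak", "II", beta=1.0, alpha=0.5, lambda_p=50, t=100),
    FailureModeEntry("Shaft seizure", "II", beta=0.5, alpha=0.3, lambda_p=50, t=100),
    FailureModeEntry("Noisy bearing", "III", beta=0.1, alpha=0.2, lambda_p=50, t=100),
]

for e in entries:
    print(f"{e.failure_mode}: Cm = {e.cm():.6f}")
for category, value in sorted(item_criticality(entries).items()):
    print(f"Category {category}: Cr = {value:.6f}")
```

Keeping Cr as one value per severity category mirrors the definition above, in which only failure mode criticality numbers with the same severity level are added together.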

Fig 2

The critical items determined during the analysis shall be documented. Risk mitigation (strategies include design changes, selecting components with a lower failure rate, use of warning systems, and testing and inspection) and maintainability analysis (determining which component is likely to fail first) are required to reduce the likelihood of failures. The corrective actions implemented shall be followed up to assess their effectiveness.

Part of the information on the FMEA worksheet is transferred to the FMECA worksheet. Not all of the FMEA data are carried over, both because of space constraints and to keep the worksheet clear. The FMEA details that are not transferred serve as reference material during the criticality analysis. Some organizations add further information to these worksheets, depending on the level of analysis required.

Failure Mode, Effects and Criticality Analysis provides information for maintainability, survivability, safety analysis, maintenance plan analysis, and failure detection and isolation in system design. Its main advantage is its ability to identify failures well in advance. The following are some of the benefits of FMECA:

  • Identifies potential failure modes of a process/product
  • Contributes to improving the design of the process/product
  • Assesses the risks associated with failure modes
  • Reduces development time
  • Increases throughput
  • Provides a basis for availability and quantitative reliability analysis
  • Increases safety
  • Reduces non-value-added operations
  • Decreases warranty costs
  • Reduces re-design costs
  • Provides a basis for maintenance planning
  • Helps in prioritizing corrective actions for detected failures
  • Enhances customer satisfaction
  • Supports cost savings

FMEA and FMECA can be considered complementary to each other. However, there are certain differences between the two processes:

| S. No. | FMEA | FMECA |
|--------|------|-------|
| 1 | Primary step for developing FMECA | Details from FMEA are incorporated in the CA worksheet |
| 2 | FMEA is used for process | FMECA is used for system |
| 3 | Criticality analysis is not performed | Criticality analysis is performed |
| 4 | Problem prevention is the major aim of FMEA | Detecting and controlling the factors that contribute to failure modes, by providing management information, is the major aim of FMECA |
| 5 | Criticality matrix is not used | Criticality matrix is used, which helps in identifying and comparing the failure modes based on their severity |
| 6 | Process/product quality and reliability are improved | Analysis of the repair level, logistics support, system safety, maintenance planning and production planning is supported by FMECA |
| 7 | Human errors are examined to a certain limit and the output is dependent on the operation mode | Human errors are not considered |
| 8 | Less time consuming compared to FMECA | FMECA is more time consuming |

FMECA is a flexible methodology that can be adapted to any organization. Its main objective is to provide an accurate analysis of the potential failure modes and to identify the most critical ones, in order to increase the reliability of the product/process.