Risk Assessment and FMEA-Based Analysis

Risk Assessment is a structured evaluation of potential failure modes associated with GMP-relevant systems, equipment, utilities, and processes that may impact product quality, patient safety, data integrity, or regulatory compliance. Within a validation lifecycle, risk assessment is used to define validation scope, testing depth, and control strategy, ensuring that validation effort is proportionate to system criticality and intended use.

In this article, Risk Assessment refers primarily to an FMEA-based analysis. Supporting tools such as flow diagrams, checklists, or Fault Tree Analysis may be used to inform the assessment, but they do not replace FMEA as the primary risk evaluation method.

A risk assessment is recommended for all validation activities. Risk assessments previously performed for existing equipment may be applied to functionally equivalent new equipment when equivalency is justified and documented.

Failure Mode and Effects Analysis Process Flow

Purpose of Risk Assessment in Validation

The objective of risk assessment is to understand and manage risk, not to eliminate it entirely. By identifying credible failure modes and evaluating their potential impact, organizations can focus validation activities on what truly matters to product quality and patient safety.

Risk assessment supports:

  • Selection of full versus lean validation approaches
  • Identification of critical components, parameters, and controls
  • Definition of qualification and validation requirements
  • Integration of risk controls into lifecycle management

When Risk Analysis Is Required

A formal risk analysis shall be performed when:

  • A lean validation approach is planned
  • The quality or business impact of a new system or utility is unknown
  • Changes are introduced under change control and the impact to quality, compliance, or business continuity is unclear

In these cases, documented risk analysis provides the justification for validation strategy and downstream controls.


FMEA as the Primary Risk Analysis Method

Failure Mode and Effects Analysis (FMEA) is the primary method used to evaluate risk during validation. FMEA systematically identifies potential failure modes, their causes, and their effects on system performance and product quality. It relies on sound understanding of system design, intended use, operating conditions, and historical performance.

Once failure modes are identified, risk reduction measures may be applied to eliminate, reduce, contain, or control risk through engineering controls, procedural controls, monitoring, maintenance, training, or validation activities.


Risk Assessment Methodology

The Risk Assessment identifies, analyzes, and evaluates parameters that are critical to GMP compliance and system performance.

The assessment typically begins with system requirements derived from the User Requirements Specification (URS), when available. Each GMP-critical requirement is evaluated, and one or more risk scenarios are defined.

The Risk Assessment team is responsible for:

  • Identifying potential failure modes
  • Assigning Severity (S), Probability (P), and Detectability (D) scores
  • Applying the defined scoring model consistently
  • Documenting rationale and assumptions
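The team's responsibilities above map naturally onto a structured record. The following is a minimal sketch in Python, not a prescribed format: the class name `RiskScenario`, its field names, and the sample requirement ID `URS-012` are hypothetical, and the example scores are illustrative only. It shows one way to enforce the 1 to 5 scoring ranges and require a documented rationale at the point of data entry.

```python
from dataclasses import dataclass

SCORE_MIN, SCORE_MAX = 1, 5  # the 1-5 scales defined in this article


@dataclass(frozen=True)
class RiskScenario:
    """One failure mode evaluated against a GMP-critical requirement (hypothetical layout)."""
    requirement_id: str   # hypothetical URS reference
    failure_mode: str
    severity: int         # S
    probability: int      # P
    detectability: int    # D
    rationale: str        # documented justification for the scores

    def __post_init__(self):
        # Reject scores outside the defined scoring model.
        for name in ("severity", "probability", "detectability"):
            value = getattr(self, name)
            if not isinstance(value, int) or not SCORE_MIN <= value <= SCORE_MAX:
                raise ValueError(f"{name} must be an integer from 1 to 5, got {value!r}")
        # Scores without a rationale are not accepted.
        if not self.rationale.strip():
            raise ValueError("rationale must not be empty")


# Illustrative values only; not taken from a real assessment.
scenario = RiskScenario(
    requirement_id="URS-012",
    failure_mode="WFI loop temperature sensor drift",
    severity=4,
    probability=2,
    detectability=3,
    rationale="Sensor is calibrated periodically; drift is reviewed in trending.",
)
```

Validating at construction time is one way to support consistent application of the scoring model; a paper or spreadsheet template with restricted entry fields would serve the same purpose.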

Risk Scoring Model

Severity (S)

Score | Classification | Description
1 | Negligible | Failure causes no impact on product quality, no interruption to manufacturing, and no compliance risk (e.g., non-critical display light failure).
2 | Minor | Failure affects a non-GMP utility or secondary function. No direct product or data impact, minimal downtime (e.g., minor HVAC fluctuation outside manufacturing area).
3 | Moderate | Failure impacts a non-critical step but could lead to deviation or rework. Possible short production delay (e.g., buffer tank temperature drift detected and corrected before batch impact).
4 | Major | Failure affects a GMP-critical utility (WFI loop, clean steam) or key equipment control, likely to lead to batch rejection or process deviation requiring investigation.
5 | Critical | Failure results in confirmed product contamination, sterility breach, or major data integrity issue. Could trigger regulatory action or product recall.

Severity rates the potential impact of the failure on product quality, patient safety, or regulatory compliance.


Probability of Occurrence (P)

Score | Classification | Description
1 | Remote | Failure has never been observed; robust preventive maintenance (PM) and monitoring in place (e.g., validated UPS for control systems).
2 | Unlikely | Failure could occur due to unusual conditions; historical data shows rare events (e.g., filter housing gasket failure once in 5 years).
3 | Possible | Occasional failure modes documented; some known wear points (e.g., autoclave door seal replacement required once or twice a year).
4 | Likely | Frequent operational issues; dependent on manual intervention or aging components (e.g., recurring chiller trips, known PLC faults).
5 | Frequent | High likelihood of recurring failure unless mitigated (e.g., known history of valve sticking or flow meter sensor drift impacting batches).

Probability rates the likelihood that a specific failure mode will occur under normal operating conditions.


Detectability (D)

Score | Classification | Description
1 | High Detectability | Automatic alarms, interlocks, or monitoring systems reliably catch failures (e.g., system temperature deviation alarms).
2 | Good | Single automated or manual control that typically detects failure (e.g., in-process pH checks).
3 | Moderate | Failure might go unnoticed until later QA checks or trending (e.g., pressure drop across filters reviewed post-run).
4 | Low | Detection is only possible via operator observation or delayed test results (e.g., microbial excursions in WFI).
5 | Undetectable | No reliable detection until after product release or significant impact (e.g., hidden PLC logic error with no alarms).

Detectability rates the likelihood that a failure or its effect will be detected before adverse impact. A higher score means the failure is harder to detect.


Risk Priority Number (RPN)

After assigning individual scores for Severity (S), Probability (P), and Detectability (D), the Risk Priority Number (RPN) is calculated as: RPN = S × P × D

The RPN is not an absolute measure of risk, but a prioritization tool used to focus resources on higher-priority failure modes.
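The calculation can be sketched in a few lines of Python. The function name `rpn` and its range check are assumptions made for illustration; the formula and the 1 to 5 scales come from the scoring model above.

```python
def rpn(severity: int, probability: int, detectability: int) -> int:
    """Return the Risk Priority Number, RPN = S x P x D, for scores on the 1-5 scales."""
    for name, score in (
        ("severity", severity),
        ("probability", probability),
        ("detectability", detectability),
    ):
        if score not in range(1, 6):
            raise ValueError(f"{name} must be a whole number from 1 to 5, got {score!r}")
    return severity * probability * detectability


print(rpn(4, 2, 3))  # 24
print(rpn(1, 1, 1))  # 1, the lowest possible RPN
print(rpn(5, 5, 5))  # 125, the highest possible RPN

# Two very different failure modes can share the same RPN:
print(rpn(5, 1, 4))  # 20 (critical severity, remote probability)
print(rpn(1, 5, 4))  # 20 (negligible severity, frequent occurrence)
```

The last two calls illustrate why the RPN is a prioritization aid rather than an absolute measure: the product hides which factor drives the score, so the individual S, P, and D values should be reviewed alongside it.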

RPN Priority Categories

Priority | RPN Range | Actions
Low Risk | 1-19 | Acceptable; monitor via PM and trending.
Medium Risk | 20-39 | Mitigation or additional control required before qualification.
High Risk | 40-125 | Mitigation mandatory; requires CAPA, design change, or enhanced validation approach.
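As a minimal sketch, the priority bands above can be expressed as a lookup function. The function name `rpn_priority` is hypothetical; the thresholds (1-19, 20-39, 40-125) are taken directly from the table.

```python
def rpn_priority(rpn_value: int) -> str:
    """Map an RPN (1-125) to the priority category defined in this article."""
    if not 1 <= rpn_value <= 125:
        raise ValueError(f"RPN must be between 1 and 125, got {rpn_value!r}")
    if rpn_value <= 19:
        return "Low Risk"      # acceptable; monitor via PM and trending
    if rpn_value <= 39:
        return "Medium Risk"   # mitigation or additional control before qualification
    return "High Risk"         # mitigation mandatory


for value in (12, 19, 20, 39, 40, 125):
    print(value, rpn_priority(value))
```

Checking the boundary values (19, 20, 39, 40) is worthwhile in any implementation, since an off-by-one error at a threshold changes the required action.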

Risk Classification and Mitigation

Failure modes with potential adverse impact on product quality or patient safety are classified as Critical or Direct Impact. Failure modes with no such impact are classified as Non-Critical or No Impact.

For all Critical or Direct Impact risks, mitigation measures shall be defined to reduce probability or severity, or to improve detectability. These measures must be necessary, appropriate, and integrated into qualification, validation, and operational controls.
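The effect of mitigation can be checked by rescoring the failure mode and recalculating the RPN. The sketch below is illustrative only: the scores, the mitigation measures named in the comments, and the helper names `rpn` and `priority` are assumptions, while the formula and priority bands follow the scoring model in this article.

```python
def rpn(s: int, p: int, d: int) -> int:
    """RPN = S x P x D."""
    return s * p * d


def priority(rpn_value: int) -> str:
    """Priority bands as defined in this article: 1-19, 20-39, 40-125."""
    if rpn_value <= 19:
        return "Low Risk"
    if rpn_value <= 39:
        return "Medium Risk"
    return "High Risk"


# Hypothetical failure mode before mitigation.
initial = {"S": 4, "P": 3, "D": 4}

# Hypothetical mitigation: added preventive maintenance lowers probability (3 -> 2),
# and an added alarm improves detectability (4 -> 2). Severity is unchanged.
residual = {**initial, "P": 2, "D": 2}

initial_rpn = rpn(initial["S"], initial["P"], initial["D"])
residual_rpn = rpn(residual["S"], residual["P"], residual["D"])

print(initial_rpn, priority(initial_rpn))    # 48 High Risk
print(residual_rpn, priority(residual_rpn))  # 16 Low Risk
```

In this example severity stays at 4 after mitigation, so the failure mode would still be classified as Critical or Direct Impact even though its residual RPN falls into the low band; the classification and the RPN answer different questions.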


Documentation

All identified failure modes, risk scores, mitigation measures, and supporting justifications shall be documented as part of the formal Risk Assessment record. Risk Assessments are subject to review and approval by the System Owner, appropriate Subject Matter Experts, and Quality Assurance to ensure accuracy, consistency, and regulatory compliance.

Risk assessments previously executed for existing equipment or systems may be applied to functionally equivalent systems, provided that equivalency is appropriately evaluated, justified, and documented.

The table below is provided as an example to demonstrate how Failure Mode and Effects Analysis (FMEA) outputs may be documented and evaluated within a validation risk assessment. The example demonstrates the application of Severity, Probability, and Detectability scoring, the calculation of the Risk Priority Number (RPN), and the reassessment of residual risk following the implementation of mitigation measures.

This example does not represent a complete or prescriptive risk assessment and does not establish acceptance criteria for any specific system or process. Actual risk assessments shall be performed based on system-specific requirements, intended use, operating conditions, historical performance, and Quality oversight.