FMEA: Failing on Paper First -- A Field Checklist for Process Risk Analysis
Failure Mode and Effects Analysis (FMEA) is a structured technique for identifying potential process failures, assessing their severity and likelihood, and prioritising action on the highest-risk failure modes before they cause defects, delays, or service failures. It is one of the most broadly applicable tools in the Lean Six Sigma toolkit -- useful in manufacturing, service delivery, project management, and supply chain operations.
In 2022, with return-to-office transitions introducing new process variation and supply chain disruptions exposing gaps in contingency plans, process risk analysis is particularly timely. Here is a field checklist for running a process FMEA.
Preparation
Define the process scope precisely. An FMEA on "the procurement process" will produce an unmanageable list. An FMEA on "the supplier onboarding process from contract execution to first purchase order" is tractable. The tighter the scope, the more useful the output.
Assemble a cross-functional team. The team for a process FMEA should include people who do the work, people who manage it, and people who experience its failures. Aim for four to six participants who together cover the full scope of the process.
Create a process map first. You cannot identify failure modes in a process you have not mapped. Use a swim lane diagram or a simple flow chart to document the current-state process before you begin the FMEA.
The FMEA Checklist
For each process step, identify the potential failure modes. A failure mode is a way in which the process step can fail to produce its intended output. A single step can have multiple failure modes. Ask: what could go wrong? What does go wrong? Collect data on historical failures if available.
For each failure mode, identify the effect. The effect is the consequence of the failure mode for the customer or the downstream process. Effects should be described in terms of what the customer or downstream step experiences, not what the process does internally.
Rate severity (S) on a 1 to 10 scale. Severity rates the impact of the effect. A failure mode that causes a safety hazard or regulatory non-compliance rates 9 or 10. A failure mode that causes minor inconvenience with no service impact rates 1 or 2. Severity is a property of the effect, not the failure mode.
For each failure mode, identify the causes. The cause is the root reason the failure mode occurs. A single failure mode can have multiple causes. Use the 5 Whys or an Ishikawa diagram to drill below the surface cause to the root cause.
Rate likelihood of occurrence (O) on a 1 to 10 scale. Occurrence rates how often the failure mode occurs as a result of each identified cause. Base your rating on data where available. If no data exists, use team experience and agree the rating explicitly -- undefended occurrence ratings are a common source of FMEA errors.
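Identify current controls and rate detection (D) on a 1 to 10 scale. Detection rates how well the current process controls catch the failure before it reaches the customer. Note that the scale runs in the opposite direction to what the name suggests: a low detection score means the control is effective and the failure will almost certainly be caught, while a high detection score means the failure is likely to reach the customer undetected. This keeps all three ratings aligned, so that a higher number always means higher risk.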
Calculate the Risk Priority Number (RPN = S x O x D) and prioritise. The RPN is not a perfect measure of risk -- a failure mode with high severity and low occurrence may deserve more attention than one with moderate scores across all three dimensions. Use the RPN as a starting point, then apply judgment to the high-severity items regardless of their RPN.
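As a rough illustration of the calculation and the judgment step, here is a minimal Python sketch. The failure modes, their ratings, and the severity threshold of 9 are hypothetical values chosen for the example, not figures from a real analysis. The sort order (high-severity items first, then descending RPN) is one possible way to encode "apply judgment to the high-severity items", not a standard rule.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    step: str
    description: str
    severity: int    # S, 1-10 (10 = safety hazard or regulatory non-compliance)
    occurrence: int  # O, 1-10 (10 = occurs very frequently)
    detection: int   # D, 1-10 (10 = controls are unlikely to catch it)

    def __post_init__(self):
        # Reject ratings outside the 1-10 scale used in the checklist.
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be between 1 and 10, got {value}")

    @property
    def rpn(self) -> int:
        # Risk Priority Number: RPN = S x O x D
        return self.severity * self.occurrence * self.detection


# Assumption: severity 9 or 10 always goes to the top of the list.
SEVERITY_REVIEW_THRESHOLD = 9


def prioritise(modes, severity_threshold=SEVERITY_REVIEW_THRESHOLD):
    """Sort high-severity failure modes first, then by descending RPN."""
    return sorted(modes, key=lambda m: (m.severity < severity_threshold, -m.rpn))


# Hypothetical failure modes for a supplier onboarding process.
modes = [
    FailureMode("Vendor setup", "Bank details entered incorrectly", 7, 4, 5),
    FailureMode("Compliance check", "Sanctions screening skipped", 10, 2, 3),
    FailureMode("First PO", "PO raised against wrong cost centre", 4, 6, 6),
]

ranked = prioritise(modes)
for m in ranked:
    print(f"RPN {m.rpn:>4}  S={m.severity:<2}  {m.step}: {m.description}")
```

In this example the compliance failure has the lowest RPN of the three (60, against 140 and 144) but is listed first because its severity is 10. Sorting on RPN alone would have put it last.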
Define and assign improvement actions for the highest-priority items. Each improvement action should have an owner, a due date, and a target reduction in RPN. After actions are implemented, recalculate the RPN to verify that the improvement was effective.
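The verification step can be sketched the same way. The following minimal Python example records an action with an owner, a due date, and a target RPN, then checks the re-scored RPN against the target. The action, owner, date, ratings, and target are all hypothetical, and the class and function names are illustrative only.

```python
from dataclasses import dataclass
from datetime import date


def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: RPN = S x O x D."""
    return severity * occurrence * detection


@dataclass
class ImprovementAction:
    description: str
    owner: str
    due: date
    rpn_before: int
    target_rpn: int

    def met_target(self, rpn_after: int) -> bool:
        # The action is effective if the re-scored RPN is at or below target.
        return rpn_after <= self.target_rpn

    def reduction(self, rpn_after: int) -> int:
        # Absolute drop in RPN achieved by the action.
        return self.rpn_before - rpn_after


# Hypothetical action against a failure mode scored S=7, O=4, D=5.
action = ImprovementAction(
    description="Add second-person check of supplier bank details",
    owner="Accounts payable lead",
    due=date(2022, 9, 30),
    rpn_before=rpn(7, 4, 5),
    target_rpn=60,
)

# Re-score after implementation: the new control improves detection from 5 to 2.
rpn_after = rpn(7, 4, 2)

print(f"Before: {action.rpn_before}, after: {rpn_after}, "
      f"reduction: {action.reduction(rpn_after)}, "
      f"target met: {action.met_target(rpn_after)}")
```

Here only the detection rating changes, because the action adds a control and does not alter the effect or how often the cause arises. The team should re-score all three ratings after implementation and not assume that the action worked as planned.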
XNM supports public-sector and capital-project clients in applying Lean Six Sigma tools including FMEA. Connect with XNM's strategic advisory team to discuss process risk analysis for your organisation.