Building a Data-Collection Plan That Survives Contact With Reality
In Lean Six Sigma, the Measure phase of DMAIC is where good intentions go to die. A team eager to fix a problem skips straight to gathering numbers, then discovers months later that two operators counted defects differently, half the timestamps are guesses, and the 'before' baseline can't be compared to the 'after'. After the disruptions of the past year, when so much routine data was collected inconsistently or not at all, a deliberate data-collection plan matters more than ever. It is the unglamorous step that decides whether your conclusions hold up.
Decide what you need before you collect anything
Start from the question, not the data you happen to have. Tie every measure back to the project's problem statement: what output (Y) are you trying to improve, and what inputs or process steps (Xs) might drive it? If a metric won't help you confirm the problem or test a cause, do not collect it. A focused plan beats a giant spreadsheet nobody trusts.
The plan, step by step
Define each measure operationally. Write down exactly what counts as a 'defect', a 'cycle', a 'late delivery'. An operational definition removes judgement so two people measuring the same thing get the same answer. This is the single highest-leverage step and the one teams most often skip.
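One way to make an operational definition hard to misread is to write it as a rule that returns the same answer for everyone. The sketch below is illustrative only: the 30-minute threshold, the Delivery record, and the is_late function are hypothetical names and values, not part of any standard.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption for illustration: "late" means arriving more than 30 minutes
# after the promised time. Your own definition sets the real threshold.
LATE_THRESHOLD = timedelta(minutes=30)


@dataclass(frozen=True)
class Delivery:
    promised: datetime  # time promised to the customer
    arrived: datetime   # time the delivery was signed for


def is_late(delivery: Delivery) -> bool:
    """Apply the written definition of 'late delivery' to one record."""
    return delivery.arrived - delivery.promised > LATE_THRESHOLD


on_time = Delivery(datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 9, 30))
late = Delivery(datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 9, 31))
print(is_late(on_time), is_late(late))  # False True
```

Whether the rule lives in code or on a laminated card, the test is the same: the boundary case (exactly 30 minutes here) has one answer, and it does not depend on who is recording.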
Choose data type and sampling. Decide whether each measure is continuous (time, weight, dollars) or discrete (pass/fail, counts). Then plan a sample that represents the real process, covering every shift, machine, and day of the week, rather than a convenient afternoon. Write down how much data you need, and over what period, before you call the baseline complete.
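A quick coverage check can show whether a sample really spans the process. This is a minimal sketch, assuming each record carries hypothetical "shift" and "machine" fields; coverage_gaps is an illustrative helper, not a library function.

```python
from collections import Counter
from itertools import product


def coverage_gaps(records, shifts, machines, min_per_cell=1):
    """Return (shift, machine) combinations with too few observations."""
    counts = Counter((r["shift"], r["machine"]) for r in records)
    return [
        cell for cell in product(shifts, machines)
        if counts[cell] < min_per_cell
    ]


# Hypothetical records from a convenient day-heavy sample.
records = [
    {"shift": "day", "machine": "M1"},
    {"shift": "day", "machine": "M2"},
    {"shift": "night", "machine": "M1"},
]

gaps = coverage_gaps(records, shifts=["day", "night"], machines=["M1", "M2"])
print(gaps)  # [('night', 'M2')]
```

Running a check like this during collection, not after it, gives you time to go back for the missing cells while the baseline period is still open.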
Validate the measurement system. Before trusting the numbers, check that the gauge or counting method is itself reliable. For discrete data, a simple attribute agreement check — do appraisers classify the same items the same way? — catches the most common source of garbage data.
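For two appraisers classifying the same items, the agreement check can be sketched as below. Percent agreement is the simple check described above; Cohen's kappa is an addition of ours that corrects for agreement expected by chance. The function names and the sample classifications are made up for illustration.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of items that both appraisers classified the same way."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty rating lists of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Agreement between two appraisers, corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    categories = count_a.keys() | count_b.keys()
    p_expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both appraisers used one identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings of the same ten items by two appraisers.
appraiser_1 = ["pass", "pass", "fail", "fail", "pass",
               "fail", "pass", "pass", "fail", "pass"]
appraiser_2 = ["pass", "pass", "fail", "pass", "pass",
               "fail", "pass", "fail", "fail", "pass"]

print(percent_agreement(appraiser_1, appraiser_2))       # 0.8
print(round(cohens_kappa(appraiser_1, appraiser_2), 2))  # 0.58
```

The gap between the two numbers is the point: 80% raw agreement looks comfortable, but once chance is removed the appraisers agree much less than it seems. What counts as acceptable is a decision for your team to set before the check, not after.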
Specify who, how, and where it is recorded. Name the person, the form or check sheet, and the exact moment of capture. Build the form so the right entry is the easy entry. Pilot it on a handful of records and fix the confusion before the full run, not after.
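If the check sheet is electronic, "the right entry is the easy entry" can be built into the record itself. The sketch below is one possible shape, not a prescribed design: the DefectType categories, field names, and CheckSheetEntry class are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class DefectType(Enum):
    # Assumed categories; use the ones from your operational definitions.
    SCRATCH = "scratch"
    DENT = "dent"
    MISLABEL = "mislabel"


@dataclass(frozen=True)
class CheckSheetEntry:
    recorder: str        # who recorded it
    station: str         # where it was recorded
    defect: DefectType   # chosen from a fixed list, never free text
    # Stamped at the moment of capture so nobody reconstructs it later.
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self):
        if not self.recorder.strip():
            raise ValueError("recorder is required")
        if not isinstance(self.defect, DefectType):
            raise ValueError("defect must be a DefectType value")


entry = CheckSheetEntry("A. Recorder", "Line 2", DefectType.DENT)
print(entry.defect.value, entry.station)  # dent Line 2
```

The fixed category list and the automatic timestamp remove the two entries people most often guess at. A paper form can do the same with tick boxes and a pre-printed time column.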
Common traps to design out
Vague definitions, so 'defect' means something different to each recorder.
Convenience samples that quietly exclude night shift or the busiest days.
Mixing baseline data collected one way with new data collected another.
No measurement-system check, so you end up chasing the gauge's noise instead of improving the process.
A good plan is short enough that the people doing the recording actually follow it, and rigorous enough that the resulting data answers the question. Spend the extra day up front writing operational definitions and piloting the form; you will save weeks of arguing about whose numbers are right. When the data is trustworthy, the Analyze phase becomes almost easy — the signal is already in the data, waiting to be found rather than manufactured.
Remember the discipline behind DMAIC: you cannot improve what you have not honestly measured, and you cannot honestly measure without a plan made before the first data point is recorded.