
Building a Solid Data-Collection Plan: A Field Checklist for Six Sigma Practitioners

By XNM Technologies · June 7, 2022 · 3 min read

A data-collection plan is a structured document that defines what data will be collected, why it is needed, how and when it will be collected, who will collect it, and which operational definitions apply. In the DMAIC problem-solving framework, it is a central output of the Measure phase. Without a rigorous plan, the data is likely to be inconsistent, incomplete, or biased by the way it was collected, and conclusions drawn from it cannot be trusted.

Here is a field checklist for building a data-collection plan that will produce reliable, analysis-ready data.

Checklist Part 1: Define What You Are Measuring

  1. Write an operational definition for each data item. An operational definition describes exactly what is being measured and how, so that the result is the same regardless of who does the measuring. 'Defect' is not an operational definition. 'A customer order that contains one or more incorrect line items, as identified during the packing verification step' is an operational definition.

  2. Specify the unit of measure. Define the unit of measure for each data item (percent, count, minutes, dollars, kilograms) and the precision required. 'Time to resolve a complaint' measured to the nearest hour is not the same measurement as one taken to the nearest minute.

  3. Confirm alignment with the problem statement. Each data item in the collection plan should connect directly to the problem statement or the hypothesis being tested. Data that does not inform any analysis decision should not be collected.
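
The three items above can be captured as one record per data item. The sketch below is a minimal illustration, not a prescribed template: the `DataItem` class, its field names, and the `plan_gaps` helper are all hypothetical, and the example values other than the order-defect definition are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    """One row of a data-collection plan (field names are illustrative)."""
    name: str
    operational_definition: str
    unit: str
    precision: str
    linked_question: str  # problem statement or hypothesis this item informs


def plan_gaps(items):
    """Return (item name, field name) pairs for every blank field."""
    gaps = []
    for item in items:
        for field_name, value in vars(item).items():
            if not value.strip():
                gaps.append((item.name, field_name))
    return gaps


items = [
    DataItem(
        name="incorrect_order",
        operational_definition=(
            "A customer order that contains one or more incorrect line "
            "items, as identified during the packing verification step"
        ),
        unit="count",
        precision="whole orders",
        linked_question="Hypothetical: how often do orders ship with errors?",
    ),
    DataItem(
        name="complaint_resolution_time",
        operational_definition="",  # not yet written
        unit="minutes",
        precision="nearest minute",
        linked_question="",  # not yet tied to the problem statement
    ),
]

for item_name, field_name in plan_gaps(items):
    print(f"{item_name}: missing {field_name}")
```

An item with a blank `linked_question` is a candidate for removal from the plan, since it does not yet inform any analysis decision.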

Checklist Part 2: Plan the Collection Method

  1. Identify the data source. Specify where the data comes from: a database system, a manual log, a physical measurement, an observation checklist. Data from existing systems should be validated -- what the system records and what you think it records are sometimes different.

  2. Choose the sampling method. If you cannot measure the entire population, define how the sample will be selected. Random sampling is preferable to convenience sampling because it reduces selection bias. Stratified sampling, which draws a sample from within each subgroup, ensures that important subgroups are represented.

  3. Conduct a Measurement System Analysis before full-scale collection. Before collecting data at scale, validate that the measurement system (the combination of instrument, procedure, and operator) is capable of detecting the variation you are trying to measure. A Gauge R&R (repeatability and reproducibility) study will show how much of the observed variation comes from the measurement system itself rather than from the process.
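
For continuous measurements, one common way to run the Gauge R&R study mentioned above is a crossed study analysed by the ANOVA method. The sketch below assumes a balanced design, in which every operator measures every part the same number of times (at least twice, with at least two parts and two operators). The function name `gauge_rr` and the nested-list input layout are illustrative choices, not something the checklist prescribes.

```python
from math import sqrt
from statistics import mean


def gauge_rr(data):
    """Crossed Gauge R&R variance components by the ANOVA method.

    data[i][j] is the list of repeat readings of part i by operator j.
    Assumes a balanced study with at least 2 parts, 2 operators, 2 repeats.
    """
    p = len(data)        # number of parts
    o = len(data[0])     # number of operators
    r = len(data[0][0])  # repeats per part-operator cell

    cell = [[mean(data[i][j]) for j in range(o)] for i in range(p)]
    part = [mean(cell[i]) for i in range(p)]
    oper = [mean(cell[i][j] for i in range(p)) for j in range(o)]
    grand = mean(part)

    # Mean squares for parts, operators, interaction, and repeat error.
    ms_part = o * r * sum((m - grand) ** 2 for m in part) / (p - 1)
    ms_oper = p * r * sum((m - grand) ** 2 for m in oper) / (o - 1)
    ms_inter = r * sum(
        (cell[i][j] - part[i] - oper[j] + grand) ** 2
        for i in range(p) for j in range(o)
    ) / ((p - 1) * (o - 1))
    ms_error = sum(
        (x - cell[i][j]) ** 2
        for i in range(p) for j in range(o) for x in data[i][j]
    ) / (p * o * (r - 1))

    # Variance components; negative estimates are truncated to zero.
    repeatability = ms_error
    interaction = max(0.0, (ms_inter - ms_error) / r)
    operator = max(0.0, (ms_oper - ms_inter) / (p * r))
    part_to_part = max(0.0, (ms_part - ms_inter) / (o * r))

    grr = repeatability + operator + interaction
    total = grr + part_to_part
    return {
        "repeatability": repeatability,
        "reproducibility": operator + interaction,
        "part_to_part": part_to_part,
        # Share of total study variation (standard-deviation basis).
        "pct_grr": 100 * sqrt(grr / total) if total > 0 else float("nan"),
    }


# Invented readings: 3 parts x 2 operators x 2 repeats.
readings = [
    [[1.0, 1.2], [1.1, 1.3]],
    [[2.0, 2.1], [2.2, 2.0]],
    [[3.1, 2.9], [3.0, 3.2]],
]
print(gauge_rr(readings))
```

A frequently cited rule of thumb treats a `pct_grr` under 10% as acceptable and over 30% as unacceptable, with the range in between judged case by case. Attribute data, such as a defect / no-defect classification, needs an agreement study instead of this calculation.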

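The sampling choice in Part 2 can be illustrated with Python's standard library. This is a minimal sketch: the function names, the fixed per-stratum sample size, and the `shift` field in the usage example are assumptions made for illustration. Real plans often allocate the sample in proportion to stratum size instead.

```python
import random
from collections import defaultdict


def simple_random_sample(records, n, seed=None):
    """Draw n records without replacement, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(records, n)


def stratified_sample(records, key, per_stratum, seed=None):
    """Draw up to per_stratum records at random from each subgroup.

    key(record) returns the subgroup label (for example, the shift).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for label in sorted(strata):
        group = strata[label]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample


# Hypothetical population: 90 day-shift orders and 10 night-shift orders.
orders = [{"id": i, "shift": "day" if i < 90 else "night"} for i in range(100)]
picked = stratified_sample(orders, key=lambda o: o["shift"], per_stratum=5, seed=1)
print(sorted(o["id"] for o in picked))
```

With a simple random sample of 10 from this population, the night shift could easily be missed entirely. The stratified draw guarantees five observations from each shift.
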
Checklist Part 3: Assign Responsibility and Train Collectors

  • Name a specific person responsible for collecting each data item -- not a role or a team. Unnamed responsibilities are unowned responsibilities.

  • Train data collectors on the operational definition and the collection procedure. Inconsistent application of an operational definition is a common source of measurement error. Document the training and record who was trained and when.

  • Pilot the data-collection plan before full-scale deployment. A one-week pilot with 2-3 collectors and 20-30 observations will surface operational definition ambiguities, system access issues, and procedure gaps before they contaminate the full data set.

  • Define what to do with incomplete or borderline observations. What happens if a collector is uncertain whether an observation meets the operational definition? Having a documented resolution process prevents inconsistent treatment of ambiguous data.
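
One simple way to use the pilot is to have two collectors classify the same observations independently and compare the results. The sketch below is an assumed approach, not part of the checklist itself: the function name `pilot_agreement` and the labels are hypothetical. It reports raw agreement and lists the observations to send through the documented resolution process.

```python
def pilot_agreement(labels_a, labels_b):
    """Compare two collectors' classifications of the same observations.

    Returns (share of observations where both agree,
             indexes of the observations where they differ).
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    disagreements = [
        i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a != b
    ]
    agreement = 1 - len(disagreements) / len(labels_a)
    return agreement, disagreements


# Invented pilot data: four orders classified by two collectors.
collector_1 = ["defect", "ok", "ok", "defect"]
collector_2 = ["defect", "ok", "defect", "defect"]
share, to_review = pilot_agreement(collector_1, collector_2)
print(f"agreement: {share:.0%}, review observations: {to_review}")
```

Raw agreement does not correct for agreement that would occur by chance, so treat it as a screening check. The disagreements themselves are the useful output, because each one points at a possible ambiguity in the operational definition.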

XNM applies Six Sigma data-collection and analysis methodology to public-sector and capital-project environments. Reach out to XNM's strategic advisory team to discuss data quality and process improvement for your organisation.