Understanding Variation

Author

Donald J. Wheeler

Year
1993

Review

Wheeler highlights the value of Process Behaviour Charts, which present data as a time series with control limits. This approach accounts for natural process variation and gives a more accurate picture than single-point comparisons. The book is clear and accessible, and covers how to record and analyse data and how to tell signal from noise. These insights can greatly improve the way we understand data and manage processes, particularly for those in product management roles.

Key Takeaways

The 20% that gave me 80% of the value.

  1. Data are meaningless without their context.
    • Presenting data without context effectively renders them useless.
    • Monthly reports usually do not provide sufficient context.
  2. An analysis method is essential for interpreting data.
    • Comparing data to specifications, goals, and targets doesn't provide a rational context for analysis and doesn't promote a consistent purpose. It encourages a binary view in which you are either "operating okay" or "in trouble," ignores the impact of variation on the data, and treats every fluctuation as a signal.
    • Comparisons to average values are slightly better, but still don't account for data variation. Again, every fluctuation is treated as a signal.
    • Shewhart's control charts provide a superior data analysis approach, addressing these shortcomings by explicitly considering data variation. By distinguishing between predictable and unpredictable variations, the focus shifts from the results to the behavior of the system that produced them. This shift is a major step towards continual improvement.
      • If a system displays statistical control, it's already as consistent as possible. Searching for Special Causes is wasteful. Instead, efforts can be directed towards improvements and modifications.
      • If a system lacks statistical control, trying to improve or modify the process will be futile. The focus should be on identifying the Special Causes disrupting the system.
    • Failing to differentiate between these two actions is a significant source of confusion and wasted effort in businesses today.
  3. All data contain noise, but some also contain signals. To detect a signal, the noise must first be filtered out. Among all the statistical techniques designed to distinguish signals from noise, Shewhart's control charts are the simplest. This simplicity contributes to their status as one of the most powerful analysis methods available today.
  4. The goal of analysis is to gain insight. The best analysis is the simplest one that provides the required insight. By using control charts in combination with histograms, flow charts, cause and effect diagrams, Pareto charts, and running records, it is possible to extract insights from the data. These insights may remain hidden when using traditional analyses.

Deep Summary

Longer form notes, typically condensed, reworded and de-duplicated.

Introduction

Information is random and miscellaneous, but knowledge is orderly and cumulative. Daniel Boorstin
  • Raw data have to be digested before they can be useful.
  • Most people don’t understand how to extract knowledge from data.
  • Numerical illiteracy can only be overcome through practice.

Chapter 1: Data are random and miscellaneous

Managing a company by means of a monthly report is like trying to drive a car by watching the yellow line in the rear view mirror. Myron Tribus
  • Global comparisons between two values are flawed as they fail to capture and convey the behaviour of a time series and are often based on the assumption that the previous period was normal.
  • Management reports often use limited comparisons, which can be misleading due to lack of data and inherent variation in numbers, making it difficult to accurately determine the cause of any changes observed.
  • Graphs, especially time-series graphs and tally plots, are an effective way to present data. They show at a glance whether a current value is unusual or related to external factors, whereas tables often bury the reader in detail. They allow you to spot unusual data and trends.
  • Data should always be presented in such a way that preserves the evidence in the data for all the predictions that might be made from these data.
    • Numerical summaries of data may supplement graphs, but they can never replace them.
    • Tables should accompany most graphs.
    • The context for the data should be completely and fully described.
    • Data can’t be divorced from their context without the danger of distortion.
  • Data summaries, like averages, ranges, or histograms, should not mislead users into taking actions they wouldn't take if the data were shown as a time series.
  • Data should always be presented in their context, with comparisons made within a broader framework rather than between pairs of values, utilising graphs for effective presentation.

Chapter 2: Knowledge is Orderly and Cumulative

  • The interpretation of data requires a method of analysis. Data → Analysis → Interpretation. Input → Transformation → Output.
  • Variation is the random and miscellaneous component that undermines the simple and limited comparisons.
  • Comparisons to specifications (goals, targets, budgets, etc.) can lead to a binary worldview. You're either ahead or behind, which can result in alternating periods of neglect and panic. Specifications that are arbitrary numerical targets are detrimental and counter-productive.
  • Three categories:
    • Facts: Things we know to be true. E.g. Profit.
    • Predictions for planning: based on past data + present actions + future conditions
    • Arbitrary targets: are neither helpful nor necessary, they are often detrimental.
  • When people are pressured to meet a target, there are three ways they can respond:
    • Work to improve the system
    • Distort the system
    • Distort the data
  • To improve a system you need to:
    • Listen to the voice of the system/process.
    • Understand how the inputs affect the outputs
    • Change the inputs (and the system) in order to achieve the results.
    • This requires effort, purpose and an environment where continual improvement is the operating philosophy.
  • When a current value is compared to an arbitrary numerical target, it creates a temptation to make the data look favourable. Distortion is always easier than improving the system.
  • The specification approach tells you where you are, not how you got there, or how to get out.
  • Managers often ask for an explanation whenever a value departs from the average, but a process should be expected to run above or below its average most of the time. Avoid analysis that attempts to attach a meaning to each and every value.
  • Specifications do nothing to describe the Voice of the Process.
  • A control chart is a time series plot with three horizontal lines added: a central line (some sort of average computed from the data), straddled by two limit lines (computed from ranges or some other function of the data). A control chart defines the Voice of the Process.
  • Most time series data are unpredictable, inconsistent and change over time. If a time series is unpredictable, it is out of control. A process is in control when past experience can be used to predict, within limits, how it will behave in the future. Unless the process is changed in some fundamental way, you can be more confident about predicting the future.
  • The essence of statistical control is predictability.
  • Before a single month can be said to signal a change in the time series, it must fall outside the limits. Then look at the adjacent points that sit on the same side of the central line as the out-of-control point: do they show a trend or a change?
  • Instead of attempting to attach a meaning to every value of the time series, the Control Chart concentrates on the behaviour of the underlying process. It yields more insight and understanding than the specification or average value approach. It helps you understand whether it’s safe to extrapolate into the future. It defines a range of values you are likely to see in the future. It takes variation into account.
  • The noise introduced by variation clouds all comparisons between single values. Until you allow for the noise, you can’t detect the signal.
  • While every data set contains noise, some data sets may contain signals. Therefore, before you can detect a signal within any given data set, you must first filter out the noise.
  • Two mistakes when analysing data: Interpreting noise as if it were a signal. Failing to detect a signal when it is present. Control charts strike a balance between these two mistakes.
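
The central line and limits described above can be sketched in a few lines of Python. This is a minimal illustration rather than the book's own worked procedure: it assumes the common individuals-and-moving-range (XmR) form of the chart, where the limits sit 2.66 average moving ranges either side of the mean. The function name `xmr_limits` and the monthly figures are made up for the example.

```python
from statistics import mean


def xmr_limits(values):
    """Return (lower limit, central line, upper limit) for an individuals chart.

    Assumption: limits are mean +/- 2.66 * average moving range, the
    conventional XmR scaling; the notes only say "computed from ranges".
    """
    if len(values) < 2:
        raise ValueError("need at least two values to compute a moving range")
    central = mean(values)
    # Moving ranges: absolute differences between successive values.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    spread = 2.66 * mean(moving_ranges)
    return central - spread, central, central + spread


# Hypothetical monthly figures, for illustration only.
monthly = [10, 12, 11, 13, 12, 11, 10, 12]
lower, central, upper = xmr_limits(monthly)
print(f"central line {central:.3f}, limits {lower:.3f} to {upper:.3f}")
```

Values that stay inside the printed limits are treated as routine variation; only values outside them are candidates for investigation.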

Chapter 3: The purpose of Analysis is insight

  • Data are generally collected as a basis for action. Therefore it’s vital that we separate probable signal from probable noise.
  • Problems with using % difference:
    • Obscures the absolute size of the change.
    • Assumes variation in % terms is similar (some processes may naturally have higher variation)
    • Large differences can be due to an unusual value in the past rather than an unusual value in the present.
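
A small sketch of why percent differences mislead, using made-up numbers. The helper `pct_change` is a hypothetical name introduced only for this example.

```python
def pct_change(previous, current):
    """Percent difference of `current` relative to `previous`."""
    return (current - previous) / previous * 100


# Made-up figures, for illustration only.
# The same kind of summary hides very different absolute changes.
small_base = pct_change(2, 3)        # +50%, but an absolute change of 1
large_base = pct_change(2000, 2100)  # +5%, but an absolute change of 100

# A routine series with one unusually low month (60).
series = [100, 102, 60, 101]
# The large jump below comes from the unusual past value, not the present one.
rebound = pct_change(series[2], series[3])

print(f"{small_base:.1f}% vs {large_base:.1f}%; rebound {rebound:.1f}%")
```

The rebound looks dramatic even though the latest value (101) is entirely ordinary for this series, which is the third problem in the list above.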

  • Large percent differences do not necessarily indicate a signal. Small percent differences do not necessarily indicate a lack of a signal.
  • The control chart focuses data so that the user will ask the interesting and important questions.
  • A single value beyond the limits of a control chart is a signal.
  • Another pattern which is taken to be a signal consists of at least three out of four consecutive values which are closer to one of the limits than they are to the central line.
  • The control chart filters out the probable noise in order to detect the potential signals in any data set.
  • By filtering out the noise, the control chart minimises the number of times that one interprets a bit of noise as if it were a signal.
  • By causing the potential signals to stand out, the control chart also minimises the number of times that one misses a signal.
  • Shewhart’s control charts are the beginning of knowledge because they help one to ask the right questions.
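
The two signal patterns described above can be checked mechanically. The sketch below assumes the limits and central line have already been computed; the function names and sample numbers are hypothetical. "Closer to a limit than to the central line" is interpreted here as lying beyond the midpoint between the central line and that limit.

```python
def beyond_limits(values, lower, upper):
    """Indices of single values that fall outside the limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


def three_of_four_near_limit(values, lower, central, upper):
    """Indices that end a window of four consecutive values in which at
    least three are closer to the same limit than to the central line."""
    upper_mid = (central + upper) / 2  # beyond this, closer to the upper limit
    lower_mid = (central + lower) / 2  # beyond this, closer to the lower limit
    hits = []
    for end in range(3, len(values)):
        window = values[end - 3:end + 1]
        near_upper = sum(v > upper_mid for v in window)
        near_lower = sum(v < lower_mid for v in window)
        if near_upper >= 3 or near_lower >= 3:
            hits.append(end)
    return hits


# Made-up data with limits 0 and 20 around a central line of 10.
data = [10, 11, 16, 17, 12, 18, 9, 25]
outside = beyond_limits(data, lower=0, upper=20)
clustered = three_of_four_near_limit(data, lower=0, central=10, upper=20)
print(outside, clustered)
```

Everything the two checks do not flag is treated as probable noise, which is how the chart keeps both kinds of mistake in balance.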

Chapter 4: The Best Analysis is the Simplest Analysis

  • Traditional comparisons often fail to filter out noise and highlight potential signals.
    • Large percentage differences may be merely noise, while small ones may represent significant signals.
  • There are typically multiple methods to measure most processes.
  • Eight or more consecutive values on the same side of the central line signal a trend.
  • Tables can overload users with unnecessary details. In contrast, graphs uncover interesting patterns in the data.
  • Monthly management reports are not the most effective way to communicate numerical values.
  • Setting arbitrary numerical goals tends to distort the system more than improve it.
  • The Voice of the Customer outlines desired outcomes.
  • The Voice of the Process indicates what the system can deliver.
  • It is management's responsibility to align the Voice of the Process with the Voice of the Customer.
  • The key to successfully using control charts lies in adopting the associated mindset.
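
The run-of-eight pattern mentioned in this chapter can be detected with a simple counter. This is a sketch under stated assumptions: the central line is already known, a value exactly on the central line breaks the run, and the name `runs_on_one_side` and the sample data are made up for the example.

```python
def runs_on_one_side(values, central, run_length=8):
    """Indices at which `run_length` or more consecutive values have
    fallen on the same side of the central line."""
    hits = []
    run = 0
    last_side = 0
    for i, v in enumerate(values):
        # +1 above the central line, -1 below it, 0 exactly on it.
        side = (v > central) - (v < central)
        if side != 0 and side == last_side:
            run += 1
        elif side != 0:
            run = 1  # a new run starts on this side
        else:
            run = 0  # assumption: a value on the line breaks the run
        last_side = side
        if run >= run_length:
            hits.append(i)
    return hits


# Made-up data: eight values in a row above a central line of 10.
data = [9, 11, 12, 11, 13, 12, 11, 12, 13, 9]
signals = runs_on_one_side(data, central=10)
print(signals)
```

No single value in this series needs to be extreme for the pattern to count as a signal; it is the persistence on one side of the central line that matters.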

Chapter 5: But you have to use the right data

  • As data get aggregated, they lose context and usefulness.
  • Aggregated data may serve as a report card but won't pinpoint what needs fixing.
  • Setting goals doesn't change the system.
  • Setting targets to meet goals is desperate.
  • Measurements are stronger than counts.
  • Measures of activity provide more insight than counts of occurrences.
  • Some measures inherently risk distortion.
  • The key figures for customer satisfaction are unknown and unknowable.
  • Narrowly focused improvement efforts may be harmful.

Chapter 6: Look what you’ve been missing.

  1. Data are meaningless without their context.
    • Presenting data without context effectively renders them useless.
    • Monthly reports usually do not provide sufficient context.
  2. An analysis method is essential for interpreting data.
    • Comparing data to specifications, goals, and targets doesn't provide a rational context for analysis and doesn't promote a consistent purpose. It encourages a binary view in which you are either "operating okay" or "in trouble," ignores the impact of variation on the data, and treats every fluctuation as a signal.
    • Comparisons to average values are slightly better, but still don't account for data variation. Again, every fluctuation is treated as a signal.
    • Shewhart's control charts provide a superior data analysis approach, addressing these shortcomings by explicitly considering data variation. By distinguishing between predictable and unpredictable variations, the focus shifts from the results to the behavior of the system that produced them. This shift is a major step towards continual improvement.
      • If a system displays statistical control, it's already as consistent as possible. Searching for Special Causes is wasteful. Instead, efforts can be directed towards improvements and modifications.
      • If a system lacks statistical control, trying to improve or modify the process will be futile. The focus should be on identifying the Special Causes disrupting the system.
    • Failing to differentiate between these two actions is a significant source of confusion and wasted effort in businesses today.
  3. All data contain noise, but some also contain signals.
    • To detect a signal, the noise must first be filtered out.
    • Among all the statistical techniques designed to distinguish signals from noise, Shewhart's control charts are the simplest. This simplicity contributes to their status as one of the most powerful analysis methods available today.
  4. The goal of analysis is to gain insight.
    • The best analysis is the simplest one that provides the required insight. By using control charts in combination with histograms, flow charts, cause and effect diagrams, Pareto charts, and running records, it is possible to extract insights from the data. These insights may remain hidden when using traditional analyses.