Wednesday, March 30, 2011

Data Homogeneity - An excerpt from "Data Analysis with Minitab"

The following is an excerpt from my Data Analysis with Minitab course. I thought this was too important not to share; it's ignored far too often. For more information, see Davis Balestracci's Data Sanity (the paper or the book), or Don Wheeler's The Six Sigma Practitioner's Guide to Data Analysis.

Shape, Center and Spread: Histograms


We have discussed simple graphical analysis in histograms. Remember, a histogram allows us to get a visual feel for the shape, center and spread of a set of data. Adding the specification limits to a histogram allows us to see performance in relation to specifications, and any outliers might show up on a histogram.

Important to note: A histogram is a snapshot in time. It shows how the data are “piled.” If the process is not stable, we can’t make any assumptions about the distribution. So, while a histogram is a very useful tool, it’s more useful when used in conjunction with some time-series plot. The following scenarios, adapted from Davis Balestracci’s Data Sanity, illustrate the importance of looking at process data over time.

These scenarios depict the percentage of calls answered within 2 minutes for three different clinics in a metropolitan area. All three sets of data were collected during the same 60-day time period.

What can you say about the performance of the clinics, based on the histograms and data summaries?

The summaries presented in the histograms all show unimodal, fairly symmetrical, bell-shaped piles of data. The p-values for the Anderson-Darling tests for normality are all high, indicating no significant departures from a normal distribution. There are no apparent outliers. The mean percentage for each clinic is a little over 85%, and the standard deviations are all around 2.5%.

The histogram, though, is a snapshot. It only reveals how the data piled up at a particular point in time. The graphic, and its associated summary statistics, can only represent what’s happening at the clinics if the data are homogeneous. These data were gathered over time: what would a picture of the data over time reveal?

The control chart for clinic A is below. Although the histogram showed the same bell-shaped pattern and high p-value for the normality test, you can easily see that the histogram can’t represent the data for clinic A; we caught it in an overall upward trend, and so a histogram of the next sixty days will no doubt look very different from the histogram of the first sixty days.

Likewise, the control chart for Clinic B…

This chart shows that what we are actually looking at is three different processes, the data for which just appear to stack up to a single, not-different-from-normal distribution. In fact, by slicing the chart at the shifts, we can see that there are three distinct time periods when the variation is in control:

The only one of the three clinics with a stable process is clinic C. Looking at Clinic C’s plot over time, we see the random pattern of variation within the control limits. We can now expect that the histogram will not change shape significantly over time, the parameters will all remain about the same, so our assumptions about distribution will be valid and useful.
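To make the point concrete, here is a minimal Python sketch (not from the course materials). It assumes a hypothetical set of 60 daily percentages drawn from a normal distribution with mean 85 and standard deviation 2.5, and it uses the usual individuals-chart limits: the mean plus or minus three times the average moving range divided by 1.128. The function names and the simulated values are illustrative assumptions, not the clinic data.

```python
import random
import statistics


def individuals_limits(values):
    """Return (center, lcl, ucl) for an individuals (X) chart."""
    center = statistics.fmean(values)
    # Moving ranges: absolute differences between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = statistics.fmean(moving_ranges)
    sigma_short = mr_bar / 1.128  # d2 constant for ranges of two points
    return center, center - 3 * sigma_short, center + 3 * sigma_short


def points_outside(values):
    """Count points beyond the individuals-chart limits (rule 1 only)."""
    _, lcl, ucl = individuals_limits(values)
    return sum(1 for v in values if v < lcl or v > ucl)


rng = random.Random(1)  # fixed seed so the sketch is repeatable
stable = [rng.gauss(85.0, 2.5) for _ in range(60)]  # hypothetical stable clinic
trending = sorted(stable)  # the same 60 values, arranged as a steady upward trend

# Same values, so the histogram, mean and standard deviation are identical...
print(statistics.fmean(stable), statistics.stdev(stable))
print(statistics.fmean(trending), statistics.stdev(trending))
# ...but the time-ordered charts tell very different stories.
print(points_outside(stable), points_outside(trending))
```

Sorting is an exaggerated stand-in for a trend, but it shows why a histogram alone cannot tell a stable process from an unstable one: the order of the data carries information that the pile does not.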

Friday, March 11, 2011

Contra the 1.5-Sigma Shift

I'm currently working up some simulations to try once again to put the "1.5-Sigma Shift" to bed for good. The simulations seem to prove out what I've long felt about the shift, but I have one to run yet to demonstrate effects of the shift -- and detectability -- on a high-volume operation.
My understanding of the origin of the use of the shift is this: people at Motorola apparently had some data that showed that you could have undetected shifts of up to 1.5 Sigma; this would certainly be a valid concern when you have high-volume production with low monitoring rates.
As an example of what can happen when you get shifts in high volume enterprises, I'll mention Don Wheeler's Japanese Control Chart story from Tokai Rika. They were running about 17,000 cigarette lighter sockets per day, and had found that they could detect shifts using one subgroup of four sockets per day. They selected one socket at 10 AM, 12 PM, 2 PM and 4 PM each day, and kept an XbarR chart on the data. The only rule they used was rule 1 (a single point outside the control limits).
Suppose they had decided to add rule 4 of the Western Electric Zone Tests (a run of eight above or below the centerline--Minitab and JMP call this rule 2 and use a run of nine). This would mean that if a shift in the mean occurred and the first signal was a rule 4 signal, they might run 8 x 17,000 = 136,000 sockets at the changed level. This would be unlikely to result in any nonconforming product (since they were using less than half the specified tolerance), but from a Taguchi Loss perspective, it's not desirable.
So it might be prudent to study your processes and either sample more frequently or "play the slice" as Motorola did, and assume that you might have undetected shifts up to 1.5 sigma on a regular basis. If you do this, you will end up only giving yourself credit for a Cpk of 1.5 when you actually have a Cpk of 2, and you will estimate much higher proportions defective than what you actually get. As a fudge factor for setting specifications, it's sloppy but safe, I guess.
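To put rough numbers on that gap, here is a small Python sketch of the arithmetic. It assumes a normally distributed characteristic and counts only the tail beyond the nearest specification limit, which sits 3 x Cpk standard deviations from the mean. The function name is my own, for illustration.

```python
import math


def upper_tail_ppm(z):
    """Parts per million beyond z standard deviations, one side of a normal."""
    return 0.5 * math.erfc(z / math.sqrt(2)) * 1e6


actual_cpk = 2.0    # nearest spec limit is 6 sigma from the mean
credited_cpk = 1.5  # the same process after conceding a 1.5-sigma shift

ppm_actual = upper_tail_ppm(3 * actual_cpk)      # tail beyond 6 sigma
ppm_credited = upper_tail_ppm(3 * credited_cpk)  # tail beyond 4.5 sigma

print(f"{ppm_actual:.4f} ppm vs {ppm_credited:.1f} ppm")
# prints "0.0010 ppm vs 3.4 ppm"
```

Under these assumptions, conceding the shift inflates the estimated nonconforming rate by a factor of more than 3,000.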
So let's talk about what Motorola might have gotten wrong.
1. My understanding is that they (much like Tokai Rika) only used rule 1. This would keep them from picking up some of the other signals. I don't have the data from the studies they based their conclusions on, but they might have used a different value than 1.5 had they had the added sensitivity lent by the rest of the Western Electric Zone Tests.
2. "Undetected shifts" are, logically, undefined. If we operationally define a shift in the mean by using some combination of the Western Electric Zone Tests, then any long run without a signal is not (by definition) an undetected shift. Logically, you can't detect an undetected shift. We can define the difference between long-term variation (dispersion characterized by the standard deviation of the entire data set) and short-term variation (dispersion characterized by rbar/d2 or sbar/c4). If you want an operational definition of "undetected shifts," the delta between those two measures of variation might be useful. It's silly to assume, however, that there are some bursts of variation that average 1.5 sigma and somehow escape detection. Not only that, but the false alarm rate itself induces false signals.
3. It's damned difficult to induce a shift in a simulation that isn't picked up within a few subgroups. In one of the simulations I've been working recently, I created 10,000 random variables from a normal distribution, with a mean of 50 and a standard deviation of .5. I cleaned up the false signals by substituting other randomly-generated numbers for those outside the control limits, and rearranging the order to kill off the rule 2, 3 and 4 signals. I then ramped up a 1.5 sigma shift in .05-sigma intervals, 50 at a time. An ImR chart caught the shift within the first 8 subgroups (and I had only shifted .05 sigma at that time). That was for a gradual shift; an abrupt 1.5 sigma shift signalled immediately.
4. The only way you get the results the process sigma calculations give you is if all the data are shifted 1.5 sigma; in other words, the mean has to shift 1.5-sigma and stay there. So you have a control chart, and the centerline is on 50, and the upper control limit is at 51.5, and you don't have any out-of-control signals...but the actual process mean is 50.75? In what world can that happen? Those are the conditions you would need, though, to actually get "3.4 defects per million opportunities" in any process showing six sigma units between the process mean and the nearest specification limit (a process sigma of six). Occasional process meandering to as far as 1.5 in either direction, if it could go undetected, would result in significantly lower DPMO than what the Process Sigma Table predicts.
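For anyone who wants to try a detection experiment of the kind described in point 3, here is a minimal Python sketch. It is not the simulation I ran: it skips the step of cleaning false signals out of the baseline, it assumes the centerline (50) and sigma (0.5) are known rather than estimated from a moving range, and it checks only rule 1 and a run of eight on one side of the centerline. The function names, the seed, and the 200-point abrupt series are assumptions for illustration.

```python
import random


def first_signal(values, center, sigma, run_length=8):
    """Return the index of the first point that signals, or None.

    Rule 1: a point more than 3 sigma from the centerline.
    Run rule: run_length consecutive points on one side of the centerline.
    """
    run_side, run_count = 0, 0
    for i, v in enumerate(values):
        if abs(v - center) > 3 * sigma:
            return i
        side = (v > center) - (v < center)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == run_side:
            run_count += 1
        else:
            run_side, run_count = side, (1 if side != 0 else 0)
        if run_count >= run_length:
            return i
    return None


rng = random.Random(2011)  # arbitrary seed; results vary from seed to seed
MEAN, SIGMA = 50.0, 0.5


def ramp(step_sigma=0.05, total_sigma=1.5, per_step=50):
    """Gradual shift: raise the mean by step_sigma every per_step points."""
    steps = round(total_sigma / step_sigma)
    return [rng.gauss(MEAN + k * step_sigma * SIGMA, SIGMA)
            for k in range(1, steps + 1) for _ in range(per_step)]


# Abrupt shift: every point comes from a mean moved up by 1.5 sigma.
abrupt = [rng.gauss(MEAN + 1.5 * SIGMA, SIGMA) for _ in range(200)]
gradual = ramp()

print("abrupt shift first signals at point", first_signal(abrupt, MEAN, SIGMA))
print("gradual ramp first signals at point", first_signal(gradual, MEAN, SIGMA))
```

Rerunning with different seeds gives a feel for how long a shift of a given size can realistically hide. Keep in mind that an early signal on the ramp may be an ordinary false alarm rather than a response to the shift.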
I believe it was a mistake for the statistical community to allow this to become an informal standard. We are about quantifying uncertainty, not about arbitrarily adding large chunks of uncertainty. The "process sigma" is already counterintuitive. If you tell managers their process sigma is 3.2, the first question they always ask is, "So what does that mean?" It's much better, I think, to use DPMO, which makes sense to most people, doesn't require translation, and doesn't require assumptions about shifts that probably don't exist. It also acts as a sort of Rosetta Stone, allowing us to translate between data from counts and data from measurements. We do have to remind managers that DPMO is still just a best estimate based on current data, but it's certainly more meaningful than the "process sigma."
There is a danger that it will become more than just an informal standard soon. There is a proposal for a new ISO international standard for DMAIC; it does include the Process Sigma, and the language in the proposed standard says we will adjust by 1.5 sigma "by convention." Anyone interested should watch for opportunities for public comment on the standard, either through TAG 69, NIST, or ISO.