Friday, March 11, 2011

Contra the 1.5-Sigma Shift

I'm currently working up some simulations to try, once again, to put the "1.5-Sigma Shift" to bed for good. The simulations seem to bear out what I've long felt about the shift, but I have one more to run to demonstrate the effects of the shift -- and its detectability -- in a high-volume operation.
My understanding of how the shift came into use is this: people at Motorola apparently had data showing that you could have undetected shifts of up to 1.5 sigma; that would certainly be a valid concern when you have high-volume production with low monitoring rates.
As an example of what can happen when you get shifts in high-volume enterprises, I'll mention Don Wheeler's Japanese Control Chart story from Tokai Rika. They were running about 17,000 cigarette lighter sockets per day, and had found that they could detect shifts using one subgroup of four sockets per day. They selected one socket at 10 AM, 12 PM, 2 PM and 4 PM each day, and kept an XbarR chart on the data. The only rule they used was rule 1 (a single point outside the control limits).
Suppose they had decided to add rule 4 of the Western Electric Zone Tests (a run of eight above or below the centerline--Minitab and JMP call this rule 2 and use a run of nine). This would mean that if a shift in the mean occurred and the first signal was a rule 4 signal, they might run 8 x 17,000 = 136,000 sockets at the changed level. This would be unlikely to result in any nonconforming product (since they were using less than half the specified tolerance), but from a Taguchi Loss perspective, it's not desirable.
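Just to make that arithmetic concrete, here's a tiny Python sketch; the numbers are the ones from the story, and the sampling rate and run length are the knobs you'd change to see the exposure for your own process.

    # Rough exposure if the first signal is a rule-4 run rather than a rule-1 point.
    daily_volume = 17_000        # sockets per day, from the Tokai Rika story
    subgroups_per_day = 1        # one subgroup of four sockets per day
    run_length_for_rule_4 = 8    # consecutive subgroups needed for the run test

    days_exposed = run_length_for_rule_4 / subgroups_per_day
    units_exposed = days_exposed * daily_volume
    print(f"{units_exposed:,.0f} sockets produced before a rule-4 signal")   # 136,000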
So it might be prudent to study your processes and either sample more frequently or "play the slice" as Motorola did and assume that you might have undetected shifts of up to 1.5 sigma on a regular basis. If you do, you end up giving yourself credit for only a Cpk of 1.5 when you actually have a Cpk of 2, and you end up estimating much higher proportions defective than you actually get. As a fudge factor for setting specifications, it's sloppy but safe, I guess.
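To put numbers on that credit: with the nearest specification limit six sigma away, Cpk is 2.0; assume a permanent 1.5-sigma drift toward that limit and you can only claim 1.5. Here's the arithmetic as a quick sketch (my own illustration, not anything from Motorola's data); the defect-rate side of the same calculation shows up in the sketch under point 4 below.

    # Cpk = distance from the mean to the nearest spec limit, in sigma units, divided by 3.
    z_actual = 6.0                       # sigma units to the nearest specification limit
    assumed_shift = 1.5                  # the "play the slice" allowance

    cpk_actual = z_actual / 3                       # 2.0
    cpk_credited = (z_actual - assumed_shift) / 3   # 1.5
    print(f"actual Cpk {cpk_actual:.1f}, credited Cpk {cpk_credited:.1f}")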
So let's talk about what Motorola might have gotten wrong.
1. My understanding is that they (much like Tokai Rika) only used rule 1. This would keep them from picking up some of the other signals. I don't have the data from the studies they based their conclusions on, but they might have used a different value than 1.5 had they had the added sensitivity lent by the rest of the Western Electric Zone Tests.
2. "Undetected shifts" are, logically, undefined. If we operationally define a shift in the mean by some combination of the Western Electric Zone Tests, then any long run without a signal is not (by definition) an undetected shift; logically, you can't detect an undetected shift. We can, however, define the difference between long-term variation (dispersion characterized by the standard deviation of the entire data set) and short-term variation (dispersion characterized by rbar/d2 or sbar/c4). If you want an operational definition of "undetected shifts," the delta between those two measures of variation might be useful (the first sketch after this list prints both for simulated data). It's silly to assume, however, that there are some bursts of variation that average 1.5 sigma and somehow escape detection. Not only that, but the detection rules themselves produce false alarms at a predictable rate, so some of the signals you do get aren't real shifts at all.
3. It's damned difficult to induce a shift in a simulation that isn't picked up within a few subgroups. In one of the simulations I've been working on recently, I created 10,000 random variables from a normal distribution with a mean of 50 and a standard deviation of .5. I cleaned up the false signals by substituting other randomly-generated numbers for those outside the control limits, and rearranging the order to kill off the rule 2, 3 and 4 signals. I then ramped up a 1.5-sigma shift in .05-sigma increments, 50 observations at a time. An ImR chart caught the shift within the first 8 subgroups (and I had only shifted .05 sigma at that point). That was for a gradual shift; an abrupt 1.5-sigma shift signalled immediately. (A bare-bones version of this simulation appears just after this list.)
4. The only way you get the results the process sigma calculations give you is if all the data are shifted 1.5 sigma; in other words, the mean has to shift 1.5 sigma and stay there. So you have a control chart, and the centerline is at 50, and the upper control limit is at 51.5, and you don't have any out-of-control signals...but the actual process mean is 50.75? In what world can that happen? Those are the conditions you would need, though, to actually get "3.4 defects per million opportunities" in any process showing six sigma units between the process mean and the nearest specification limit (a process sigma of six). Occasional process meandering as far as 1.5 sigma in either direction, if it could go undetected, would result in significantly lower DPMO than what the Process Sigma Table predicts (the second sketch after this list works through that arithmetic).
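For what it's worth, here's a bare-bones Python version of the simulation from point 3. It skips the clean-up step where I swapped out the false signals in the baseline, so results will vary from run to run; it also prints the short-term and long-term sigma estimates mentioned in point 2, and checks the run-of-eight rule alongside rule 1 so you can see which fires first. The numbers are the ones from this post, not anything from Motorola's study.

    import numpy as np

    rng = np.random.default_rng(2011)   # arbitrary seed, just for reproducibility

    # Baseline: a stable process with mean 50 and standard deviation 0.5.
    baseline = rng.normal(50.0, 0.5, size=10_000)

    # Individuals-chart limits from the average moving range (Rbar/d2, d2 = 1.128).
    sigma_short = np.abs(np.diff(baseline)).mean() / 1.128
    center = baseline.mean()
    ucl, lcl = center + 3 * sigma_short, center - 3 * sigma_short

    # Ramp the mean up to a 1.5-sigma shift in .05-sigma steps, 50 observations per step.
    shift = np.repeat(np.linspace(0.05, 1.5, 30), 50)
    ramp = rng.normal(50.0 + shift * 0.5, 0.5)

    # Rule 1: a single point outside the control limits.
    rule1 = (ramp > ucl) | (ramp < lcl)

    # Rule 4 (Western Electric): eight successive points on one side of the centerline.
    above = (ramp > center).astype(float)
    below = (ramp < center).astype(float)
    rule4 = np.zeros_like(rule1)
    rule4[7:] = (np.convolve(above, np.ones(8), mode="valid") == 8) | \
                (np.convolve(below, np.ones(8), mode="valid") == 8)

    for name, hits in (("rule 1", rule1), ("rule 4", rule4)):
        if hits.any():
            i = int(hits.argmax())
            print(f"first {name} signal at observation {i + 1} (shift = {shift[i]:.2f} sigma)")
        else:
            print(f"no {name} signal during the ramp")

    # Point 2: short-term vs. long-term dispersion for the ramped data.
    print(f"short-term sigma (Rbar/d2): {np.abs(np.diff(ramp)).mean() / 1.128:.3f}")
    print(f"long-term sigma (overall) : {ramp.std(ddof=1):.3f}")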
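And here's the arithmetic behind point 4, using nothing but the normal tail probability. The "occasional meandering" scenario at the end is entirely my own assumption -- on target 90% of the time, shifted a full 1.5 sigma toward the nearest limit the other 10% -- but it's enough to show how far below the table's 3.4 DPMO even a fairly generous amount of undetected wandering would land.

    from math import erfc, sqrt

    def upper_tail(z):
        """P(Z > z) for a standard normal variable."""
        return 0.5 * erfc(z / sqrt(2))

    ppm = 1_000_000

    # Six sigma units between the process mean and the nearest specification limit.
    print(f"no shift at all    : {upper_tail(6.0) * ppm:.4f} DPMO")   # about 0.001
    print(f"permanent 1.5 shift: {upper_tail(4.5) * ppm:.2f} DPMO")   # the familiar 3.4

    # Assumed meandering: on target 90% of the time, shifted a full 1.5 sigma
    # toward the nearest limit the other 10% (illustrative numbers only).
    meander = 0.9 * upper_tail(6.0) + 0.1 * upper_tail(4.5)
    print(f"intermittent shift : {meander * ppm:.2f} DPMO")           # about 0.34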
I believe it was a mistake for the statistical community to allow this to become an informal standard. We are about quantifying uncertainty, not about arbitrarily adding large chunks of uncertainty. The "process sigma" is already counterintuitive. If you tell managers their process sigma is 3.2, the first question they always ask is, "So what does that mean?" It's much better, I think, to use DPMO...it makes sense to most people, doesn't require translation, and doesn't require assumptions about shifts that probably don't exist. It also acts as a sort of Rosetta Stone, allowing us to translate between data from counts and data from measurements. We do have to remind managers that DPMO is still just a best estimate based on current data, but it's certainly more meaningful than the "process sigma."
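Just to show the translation step the "process sigma" forces on you, here's the back-and-forth conversion with the conventional 1.5-sigma adjustment baked in (a sketch of the convention, not of any particular software's calculation):

    from statistics import NormalDist

    std_normal = NormalDist()

    def dpmo_from_process_sigma(process_sigma, shift=1.5):
        """DPMO implied by a 'process sigma', with the conventional 1.5-sigma adjustment."""
        return (1 - std_normal.cdf(process_sigma - shift)) * 1_000_000

    def process_sigma_from_dpmo(dpmo, shift=1.5):
        """'Process sigma' implied by a DPMO figure, with the same adjustment."""
        return std_normal.inv_cdf(1 - dpmo / 1_000_000) + shift

    print(f"process sigma 3.2 -> {dpmo_from_process_sigma(3.2):,.0f} DPMO")          # roughly 45,000
    print(f"3.4 DPMO          -> process sigma {process_sigma_from_dpmo(3.4):.2f}")  # about 6.0

The 3.2 means nothing to anyone until it goes through that conversion, which is rather the point.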
There is a danger that it will become more than just an informal standard soon. There is a proposal for a new ISO international standard for DMAIC; it does include the Process Sigma, and the language in the proposed standard says we will adjust by 1.5 sigma "by convention." Anyone interested should watch for opportunities for public comment on the standard, either through TAG 69, NIST, or ISO.