Whenever I hear someone talking about a "best practice," I always add the Homer Simpson modifier: "Best practice SO FAR..." What this term means is just that it's the best solution yet to some set of problems or circumstances.
My experience has been that they don't stifle creativity in creative people...they can serve as springboards for further creativity or improvement. I think they are best used just that way: as you study a process and analyze the cause systems that create its outputs and outcomes, you look for aspects of those systems that can be worked on to optimize the outcomes. Looking at "best practices" is like looking at any other process...we're just starting with a process that has already been improved before (at least for this set of inputs).
The downside to "best practices" comes from leaders who hear the term "best" and decide that it must actually mean "best it could be." Managers who do this will try to force replication, without knowing what to replicate, why it worked in its original environment, or whether it will work in the new environment. In that case, it will certainly create roadblocks and slow down process improvement.
Tuesday, June 1, 2010
A Story about Systems Thinking
In a class a few years ago, we asked students to talk about quality-related projects on which they were currently working. The class comprised a number of people from several business units. At one table, a project leader stood and told us all about his project for the marketing unit: they were exploring server consolidation. They knew that only a fraction of the capacity of each of many of their servers was in use; they had a large number of servers, therefore, that could be consolidated. Because this business unit "rented" the servers from the Shared Services unit, they figured they could save $250,000 per year by consolidating servers and turning them back over to Shared Services. The class politely applauded.
Next up was a person from the Shared Services unit, who talked about his project, which was developing a new service they could "sell" to the marketing unit, which would generate over $250,000 in new revenue for Shared Services. The class again politely applauded.
I asked, "What's wrong with these stories?"
Blank stares (I'm the idiot!)
I tried to give them a hint: "How does the company benefit from these projects?"
A tentative hand, then (in a tone that indicates that surely, I AM the idiot), "Well, the company saves half a million dollars! Why wouldn't THAT be a benefit?"
I asked, "How is the company saving a half-million dollars?"
Again, incredulous stares..."You're the stats guy...maybe you should have taken accounting instead...250,000 plus 250,000...isn't that half a million?"
I pointed out that marketing was "saving" a quarter of a million by not "renting" a quarter of a million's worth of servers from Shared Services, but that Shared Services was "making" a quarter million by "selling" a quarter-million's worth of new services to marketing. So they just dipped a bucket into one end of the lake and dumped it into the other end...and some evaporated while they were transporting it, because of the cost of the project.
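To make that bucket-and-lake arithmetic concrete, here is a minimal sketch in Python. The $250,000 transfers come from the story; the project cost is an invented figure, since the actual number was never part of the discussion:

```python
# Hypothetical figures: the $250K transfers are from the story above;
# the project cost is invented for illustration.
marketing_savings = 250_000        # marketing stops "renting" servers
shared_services_revenue = 250_000  # Shared Services "sells" a new service to marketing

# Both numbers describe the same internal transfer seen from opposite
# sides of the ledger, so at the company level they cancel out.
internal_transfer_net = marketing_savings - shared_services_revenue  # = 0

project_cost = 40_000  # assumed combined cost of running both projects
company_net = internal_transfer_net - project_cost

print(f"Net to the company: ${company_net:,}")  # Net to the company: $-40,000
```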
Eventually, we did work out that there were benefits...increased server capacity, benefits from the new service, etc. Most of these numbers (the actual benefits) were "unknown and unknowable" numbers. None of those benefits had been discussed originally, because the "knowable" numbers were easily calculated (and wrong)...
Labels: Deming, Six Sigma, systems theory, Systems Thinking
Friday, April 23, 2010
What is "Productivity?"
In one of my stats classes, a nursing student mentioned that they measure productivity at her hospital. It's measured this way:
"To get the productivity ratio; you take the total number of hours worked by nursing ( all nurses on the unit) and divide that by the total number of patients on the unit at midnight. For example if there are 4 nurses per shift and they work 12 hour shifts then that is 96 hours; then say there are 30 patients on the unit at midnight; divide 96(nursing hours worked) by 30(# of patients) = 3.2."
In my consulting practice, my clients often tell me about productivity numbers. This, to me, is one of the compelling questions for those of us in the quality profession: what is "productivity?" To keep the discussion going with my student, I posted the following, to raise some of the issues I've seen organizations struggle with over the years:
This is one problem with many of the metrics used for "productivity." By trying to boil it down to the simplest, easiest-to-use ratio, you leave out a lot of important information. What is productivity in nursing? Is it just being there? Clocking in and clocking out? Most of the nurses I know work pretty hard, but even the amount of work completed wouldn't necessarily reflect the value of a nurse. A number of years ago, a paradigm came out called ABC (Activity-Based Costing) that measured productivity in terms of activity...how much were you actually doing? It seems reasonable, but it doesn't necessarily reflect value, any more than motion reflects progress.
Nursing can be a lot like being in the Military. I can't tell you how many watches I stood in 20 years...tens of thousands of hours where no one took a shot at anyone. If my job was to kill enemies, then most of the time, I was a waste of taxpayer dollars. Did that mean we didn't need to be there? Our job was not to be constantly doing something, but to be alert and vigilant so that if something did happen, we could take immediate action.
Similarly, there are nights, even in Emergency Rooms, that are slow. Would you send everyone home, to keep your productivity numbers high? Or is there value in having some knowledgeable and experienced caregivers there for the probable event of an emergency?
What is the productivity measure tied to? Can you show that a higher ratio correlates with better outcomes? Higher profits? If it's just cost-cutting, it's hardly "productivity"; it's just avoiding having to pay for "non-productivity."
The point is, productivity is difficult to measure, and productivity is in the eye of the recipient. What the patient may value, the administrator may not. What the doctor may value, the HMO may not. What the nurse may value, the patient may not (one example: waking a surgical patient every hour during the night to check vitals).
Of course, I guess the whole point boils down to value: who defines it, and how you prioritize the "whos." This is where you must be able to understand something about systems thinking.
"To get the productivity ratio; you take the total number of hours worked by nursing ( all nurses on the unit) and divide that by the total number of patients on the unit at midnight. For example if there are 4 nurses per shift and they work 12 hour shifts then that is 96 hours; then say there are 30 patients on the unit at midnight; divide 96(nursing hours worked) by 30(# of patients) = 3.2."
In my consulting practice, my clients often tell me about productivity numbers. This, to me, is one of the compelling questions for those of us in the quality profession: what is "productivity?" To keep the discussion going with my student, I posted the following, to raise some of the issues I've seen organizations struggle with over the years:
This is one problem with many of the metrics used for "productivity." By trying to boil it down to the simplest, easiest to use ratio, you leave out a lot of important information. What is productivity in nursing? Is it just being there? Clocking in and clocking out? Most of the nurses I know work pretty hard, but even the amount of work completed wouldn't necessarily reflect the value of a nurse. A number of years ago, a paradigm came out called ABC (for Activity-Based Costing) that measured productivity in terms of activity...how much were you actually doing? Seems reasonable, but it doesn't necessarily reflect value, any more than motion reflects progress.
Nursing can be a lot like being in the Military. I can't tell you how many watches I stood in 20 years...tens of thousands of hours where no one took a shot at anyone. If my job was to kill enemies, then most of the time, I was a waste of taxpayer dollars. Did that mean we didn't need to be there? Our job was not to be constantly doing something, but to be alert and vigilant so that if something did happen, we could take immediate action.
Similarly, there are nights, even in Emergency Rooms, that are slow. Would you send everyone home, to keep your productivity numbers high? Or is there value in having some knowledgeable and experienced caregivers there for the probable event of an emergency?
What is the productivity measure tied to? Can you show that a higher ratio correlates to better outcomes? Higher profits? If it's just cost-cutting, it's hardly "productivity;" it's just lack of having to pay for "non-productivity."
The point is, productivity is difficult to measure, and productivity is in the eye of the recipient. What the patient may value, the administrator may not. What the doctor may value, the HMO may not. What the nurse may value, the patient may not (one example; waking a surgical patient up every hour during the night to check vitals).
Of course, I guess the whole point boils down to value...who defines that, how you prioritize the "whos." This is where you must be able to understand something about systems thinking.
Thursday, March 11, 2010
Creating a Culture of Process Improvement
This morning, a reader of IQ Six Sigma posed the following question:
“My department is charged with creating a "culture of process improvement" within our zone. We're struggling with what that looks like once we've created this culture. Looking at the Toyota model, they challenge employees to look for PI opportunities every day. What exactly does that look like, and what measurements should we consider (i.e. number of PI suggestions with managers being held accountable for X number per quarter, etc.) I'd like some ideas.”
My "short" answer (admittedly, this answer could have--and has--filled books):
Well, one thing you for sure don't want to do is set some quota for suggestions. You may already be faced with an uphill battle, because the leadership at your organization is actually the entity that has to create that culture of process improvement. If they are just rolling it downhill like any other MBO, it suggests that they don't know what they are doing.
Toyota does challenge employees to look for improvement ideas. One of the ways they do that is by implementing them. Most suggestion boxes go unused by employees because the suggestions go unheeded by management. At companies like Toyota, they use mechanisms such as Quality Function Deployment to communicate the voice of the customer to everyone in the organization. It gives people on the production line a clear line of sight to the mind of the customer and the organization's leadership.
How do you establish this culture? Well, if you have to do it locally, start by knowing that you may not be as successful as you would be if your leaders were leading. Empowerment is a big piece of the pie...you have to let people know they are empowered to make changes. You have to have mechanisms in place that let changes be approved at the lowest possible level. This doesn't mean that any line worker should be empowered to make design changes that require retooling the entire line without some study, but it does mean that small local changes can be made and standardized locally, as long as they don't suboptimize the system.
So, start by listening to people. I once found an operator potting an assembly with epoxy, using a pneumatic syringe...one of the primary quality characteristics in this assembly was that the epoxy had to be free from air bubbles! This line worker had been telling people about it for some time, but no one would listen; after all, an engineer had designed that workstation--who was this uneducated line worker to question the engineers? So, again, listen! Your people have the answers to most of your quality problems. It may take some time before they will talk (because it's a culture change for them, too).
It's not enough just to listen, though; you have to act! If you don't act on what you hear, and act promptly and visibly, soon you won't have anything to listen to. If you listen and act, you'll soon find that you can't keep up with the suggestions for improvement. That will be the beginning of changing the culture to one of improvement.
You also have to be a champion. You have to be out there talking it up, walking the talk, aggressively and visibly removing obstacles to improvement. Align whatever passes for reward and recognition in your zone with PI, to let people know that it's important. Constantly let people know what you value; proactively seek (and take) opportunities to demonstrate those values and beliefs. Measure important process and throughput measures...use SPC so you don't make boneheaded decisions about those measures.
As to what to measure to gauge progress along the cultural change path...well, there are lots of things you can measure. Probably the most important are results and employee morale. If your error rates, rework rates and scrap rates are going down and your throughput is going up, it's working. You can also measure suggestions received, but you should use that number as the basis for a perhaps more important metric: the percentage of suggestions implemented. This is certainly not an exhaustive list...there are numerous things you can measure. Deming said that the most important numbers are unknown and unknowable; this is what makes measuring what we can measure so important.
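On the SPC point, here is one minimal sketch, assuming you track something like a weekly count of suggestions implemented: an individuals (XmR) chart puts natural process limits around the metric so you react to signals rather than to routine variation. The weekly counts below are invented for illustration:

```python
# Minimal XmR (individuals / moving range) control-limit sketch.
# The weekly counts are invented; substitute your own metric.
data = [12, 15, 9, 14, 11, 16, 10, 13, 12, 14]

mean = sum(data) / len(data)
moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
mr_bar = sum(moving_ranges) / len(moving_ranges)

# 2.66 is the standard XmR scaling constant (3 / d2, with d2 = 1.128 for n = 2).
ucl = mean + 2.66 * mr_bar
lcl = max(0.0, mean - 2.66 * mr_bar)  # a count can't fall below zero

print(f"Center: {mean:.1f}, natural process limits: [{lcl:.1f}, {ucl:.1f}]")
# Points outside the limits are signals worth investigating; points inside
# are most likely just noise, and reacting to them is tampering.
```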
Standardize; do 5S; start holding 5-10 minute meetings at every cell, every day, to go over quality metrics, suggestions entered, and suggestions implemented (and to gather ideas for implementing suggestions); and recognize people for advancing continuous improvement.
Labels: Culture change, Deming, Empowerment, Lean, Management, MBO, Measurement, Process Improvement, Six Sigma, Standardization
Monday, February 8, 2010
Bonus Plans
In one of my LinkedIn Discussion Groups, we have been going back and forth on the idea of bonus schemes for a couple of weeks now. Today, we got a thoughtful post from John, who said that "Incentives and reinforcement are part of what I design." He offered insights as to how a system might be designed. I responded to one of his ideas.
He pointed out that "bonuses have been factored into sales compensation since the dawn of time because we know that vigorous sustained effort is required," then asked, "Why here and not in all key jobs?" One of his reasons: "Execs are unfamiliar with the ways that objective measures can be designed for staff, managers, and production people." He went on later to suggest that "Incentives need to be based on objective measures of performance," and that "ALL incentives are ultimately individual."
While these ideas seem to make some common sense, things that we've learned over the last 30 years or so suggest that they bear some scrutiny. Here's my reply:
_________________________________________________
I think John points to a couple of drawbacks to many bonus schemes. There are some problems with one of his fixes, though.
Let's talk about objective criteria: sometimes they do exist, but not as often as we think, and they're never (and I do mean NEVER) as clear-cut as we think. Anyone who's ever seen the Red Bead Experiment can attest to that. It's also almost never possible to separate the performance of the person from the performance of the system in which they operate. So, even when we talk about "anyone who reaches the goal gets the bonus," we assume that it's possible for everyone to reach that goal, completely independent of all the factors that drive the system.
Let me illustrate with an example from my days in the Military:
An Army school convenes twice per year and runs for 5 months. One class starts in late Fall, the other in late Spring. Each class is led and instructed by two soldiers. During a study of these classes 10-11 years back, one of these instructor teams clearly excelled, by all the “objective” criteria used to measure performance: very low dropout rates, very high academic achievement with very little remediation, almost no legal or medical problems, excellent advancement rates for graduates, etc. The other team, however, didn’t fare so well: their dropout rates were very high, most of their students struggled to pass the weekly exams (despite extensive remediation and night study), they had numerous problems reported from base security, military police, and community police, a high incidence of sick days, and most students who graduated required a lot of extra work to gain adequate proficiency once they arrived at their units.
Of course, the team with the highest scores on all the criteria won Instructor of the Quarter/Year, Soldier of the Quarter/Year, and other achievement awards given by the training command, and was consistently ranked in the top 5 by their commanders—and all of this, of course, led to rapid advancement for these soldiers.
The low-scoring team ended up at the bottom of the heap, in the “not ranked” category, and received letters of reprimand for their poor performance.
Eventually, someone noticed that this difference in performance transcended the soldiers themselves…ALL the Fall classes were better, and ALL the Spring classes were worse. As it turned out, there was a great logical explanation for all of it.
The classes that convened in the late Fall comprised students who had come into the Army right after High School graduation, many on delayed entry programs. They had enlisted for this particular specialization. They were highly qualified and highly motivated, both for the Army and for this school. In contrast, the Spring classes were made up of people for whom the Army was something to do after they had failed to find a job, and who had been put into this class to fill a quota. Some had needed waivers to get into the Army; many had required waivers to get into the class.
Ironically, if you looked at the workloads for the instructor teams, the hardest-working and most creative teams were those for the Spring class. They had to be, just to survive. They had to conduct remedial sessions at night study, as well as before classes, lunchtimes, weekends, etc. They had to continually push the envelope to find new and better ways to get these challenged students to learn. The other team largely skated through the duty…very little extra time, no extra thought needed.
This same sorry story still happens every day in Military recruiting. Recruiters in very populous areas in more patriotic-leaning states have very few problems meeting quota. They get awards, advancements, etc. Those in rural areas work many times harder, often don't make quota, and are forced to accept low evaluations and sometimes humiliating "remedial" sessions where senior recruiters come in and yell at them like drill sergeants...many of these recruiters are just back from Iraq or Afghanistan.
Monday, January 11, 2010
Some Problems with Conditional Probability
A lot of people in my statistics classes struggle with conditional probability; you may be in the same boat. If you are, though, please don’t feel alone. A lot of people get this (and simple probability, for that matter) wrong. If you read "Innumeracy" by John Allen Paulos or "The Power of Logical Thinking" by Marilyn vos Savant, you'll see examples of how a misunderstanding or misuse of this topic has put innocent people in prison and ruined many careers. It's one of the reasons I'm passionate about statistics; it's counterintuitive for me, too. It's not easy to work out in your head, unless maybe you do it all the time. I always have to build a table.
When confronted with conditional probability, my advice is that you be completely process-driven; identify what's given, then follow the process and the formulas religiously. After a while, you can start to see it intuitively, but it does take a while. It's all about what you are given, and how you define things.
In my MBA stats class, one of the problems that always stumped the students was a conditional problem:
“Pregnancy tests, like almost all health tests, do not yield results that are 100% accurate. In clinical trials of a blood test for pregnancy, the results shown in the accompanying table were obtained for the Abbott blood test (based on data from "Specificity and Detection Limit of Ten Pregnancy Tests" by Tiitinen and Stenman, in the Scandinavian Journal of Clinical and Laboratory Investigation, 53, Supplement 216). The disclaimer in the journal stated that other tests are more reliable than the test with results given in this table.
                        | Positive Result | Negative Result
Subject is pregnant     | 80              | 5
Subject is not pregnant | 3               | 11
“1. Based on the results in the table, what is the probability of a woman being pregnant if the test indicates a negative result?
“2. Based on the results in the table, what is the probability of a false positive; that is, what is the probability of getting a positive result if the woman is not actually pregnant?”
Everyone would just try to look at it as though there were no conditions...they would say, 5/80 for question 1, and 3/80 for question 2. The first question, though, is asking "what is the chance of being pregnant, given a negative result?" There were 16 negative results, and of those, 5 were pregnant. So the answer is 5/16, or 31.25%. For the second question, it's "what is the probability of a positive, given that the woman is not pregnant?" In this case, there are 14 non-pregnant women, and 3 of those got a positive result. So that's 3/14, or about 21.43%.
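Here is that reasoning as a small Python sketch, using the counts from the table above (the variable names are mine):

```python
# Counts from the pregnancy-test table.
pregnant_pos, pregnant_neg = 80, 5
not_pregnant_pos, not_pregnant_neg = 3, 11

# P(pregnant | negative): restrict attention to the "negative result" column.
total_negative = pregnant_neg + not_pregnant_neg              # 16
p_pregnant_given_neg = pregnant_neg / total_negative          # 5/16

# P(positive | not pregnant): restrict attention to the "not pregnant" row.
total_not_pregnant = not_pregnant_pos + not_pregnant_neg      # 14
p_pos_given_not_pregnant = not_pregnant_pos / total_not_pregnant  # 3/14

print(f"P(pregnant | negative)     = {p_pregnant_given_neg:.2%}")      # 31.25%
print(f"P(positive | not pregnant) = {p_pos_given_not_pregnant:.2%}")  # 21.43%
```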
These numbers, and this idea, are really important--that is, they carry real-world import. Some statisticians make their living explaining these concepts to juries. People get fired or arrested because of false positives on urinalysis and other tests, because there is a general impression that they are far more reliable than they actually are.
Let’s look at a different example. In the military, people are given random drug screenings. The test is “certified 99% accurate.” I was always told that this means that if you do drugs, and you’re tested, it will catch you 99 percent of the time. We think, “logically,” that this means there is only a one percent false negative rate…that the fact that someone who does drugs doesn’t get caught one percent of the time indicates a one percent false negative rate. Worse, we assume that if the “false negative rate” is only 1 percent, the false positive rate must also be one percent…it’s just common sense, right?
But “common sense” isn’t…it’s neither common nor truly sensical. Look at it this way…suppose we test 100,000 service members. Suppose further that 0.1%, or 1 in a thousand, of service members actually do drugs. We might get this:
              | Do Drugs | Don't Do Drugs
Test Positive | 99       | 999
Test Negative | 1        | 98,901
Tables like this are informative, but they don’t tell the whole story. You can see from this that the company is technically correct…at least in this case, of 100 people who did drugs, 99 were caught and 1 was not. But false positive and false negative rates are made up of more than that. To get the whole story, it’s also good to do the marginals, or row and column totals:
              | Do Drugs | Don't Do Drugs | Row Total
Test Positive | 99       | 999            | 1,098
Test Negative | 1        | 98,901         | 98,902
Totals        | 100      | 99,900         | 100,000
Numbers like these, the totals of people tested, are very important; they help us figure out our givens. The false negative rate is not the number of people who did drugs and tested negative. It’s the proportion of all the people who tested negative who actually did drugs. In this case, the false negative rate is much better than advertised…it’s 1/98,902, or about 0.00001: only about one in 100,000 of those who test negative actually did drugs and got away with it.
The consequences, though, are on the false positive side…this is where people get turned away for employment, get fired, etc. In the case of the military, a lot of people end up in a lot of trouble with the random urinalysis program. While we want to be cautious, and we don’t want a lot of druggies flying or controlling aircraft or tanks or other deadly weapons, we should also be concerned that we might be ruining careers unnecessarily. If we look at the table, the “common sense” interpretation of the false positive rate would be 999/100000, or 0.999 percent, very close to the one percent that we assumed initially. But, as astounding as it may seem, considering the number of people that are convicted each year because of this assumption, this is entirely incorrect!
The actual false positive rate consists of the number of people incorrectly identified as drug users, or the number of non-drug users out of the total number of positives. In this case, that’s 999 out of 1,098, or 90.98%! In other words, your chance of actually being a drug user, given a positive result on this “99% accurate” test, is only 9.02%!
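The whole table can be rebuilt from the base rate and the test's accuracy, which makes the surprising result easy to reproduce. Here is a minimal Python sketch, assuming (as above) that "99% accurate" means both 99% sensitivity and 99% specificity:

```python
population = 100_000
base_rate = 0.001    # 1 in 1,000 service members actually do drugs
sensitivity = 0.99   # P(test positive | does drugs)
specificity = 0.99   # P(test negative | doesn't do drugs)

users = population * base_rate                      # 100
non_users = population - users                      # 99,900

true_positives = users * sensitivity                # 99
false_positives = non_users * (1 - specificity)     # 999
total_positives = true_positives + false_positives  # 1,098

# P(actually does drugs | tested positive) -- Bayes' theorem via the counts.
ppv = true_positives / total_positives
print(f"P(user | positive)     = {ppv:.2%}")      # about 9.02%
print(f"P(non-user | positive) = {1 - ppv:.2%}")  # about 90.98%
```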
Yes, it’s tricky. No, it’s not easy. But it’s important. It touches lives. Juries, lab technicians, doctors and nurses, lawyers, employers, employees and patients who don’t understand this put either themselves or others in peril every day.