1 00:00:00,856 --> 00:00:05,206 In this video we continue the prior video by considering the effect of confounding. 2 00:00:05,656 --> 00:00:09,196 That prior video was all about the technical details behind confounding. 3 00:00:09,656 --> 00:00:13,626 We know that confounding can confuse us, but I'll show how some knowledge 4 00:00:13,626 --> 00:00:17,166 about our systems can be used to recover useful information. 5 00:00:17,876 --> 00:00:21,956 But first, right at the beginning of the course remember how I always asked you to think 6 00:00:21,956 --> 00:00:24,336 about how each factor might impact the outcome? 7 00:00:24,806 --> 00:00:26,906 Now you are going to see why I made you do that. 8 00:00:27,626 --> 00:00:31,876 Consider the case where we are treating water, and we have three factors. 9 00:00:32,046 --> 00:00:34,846 The first is the chemical added to treat the water. 10 00:00:35,396 --> 00:00:36,906 The second is temperature. 11 00:00:37,436 --> 00:00:39,396 And the third is stirring speed. 12 00:00:39,396 --> 00:00:45,476 A full set of experiments would require eight runs, but if I can only afford four 13 00:00:45,476 --> 00:00:48,746 or five experiments, I should run a half fraction. 14 00:00:49,986 --> 00:00:52,916 Prior experience with the system might lead me 15 00:00:52,916 --> 00:00:56,076 to believe there will be no significant interaction 16 00:00:56,196 --> 00:00:58,436 between temperature and stirring speed. 17 00:00:59,966 --> 00:01:03,836 I would like to get a good estimate of the chemical factor added. 18 00:01:04,456 --> 00:01:08,476 Remember the premise that chemical Q was twice the cost of chemical P? 19 00:01:08,726 --> 00:01:13,506 In that case, I don't want the effects of the chemical to be confounded 20 00:01:13,506 --> 00:01:15,266 with other effects in the system. 
21 00:01:15,266 --> 00:01:19,736 So if I would like this good estimate of the chemical added factor, 22 00:01:19,946 --> 00:01:26,286 by that I mean I don't want it to be confounded with other large effects, a natural choice is 23 00:01:26,286 --> 00:01:32,656 to alias the other two-factor interaction, temperature 24 00:01:32,656 --> 00:01:36,346 and stirring speed in this case, with the chemical effect. 25 00:01:37,846 --> 00:01:43,526 So now when I plan my experiments, I could rather assign A as the temperature, 26 00:01:43,956 --> 00:01:52,236 B as the stirring speed, and factor C, the chemical factor, can be set as A times B. Notice 27 00:01:52,236 --> 00:01:55,516 that I'm free to assign the letters to my factors in any way, 28 00:01:55,896 --> 00:02:00,626 and I can choose the assignment that causes the least problems given 29 00:02:00,626 --> 00:02:02,696 that I know confounding will occur. 30 00:02:03,466 --> 00:02:08,416 Now factor C, that chemical effect, will be confounded with the AB interaction, 31 00:02:09,286 --> 00:02:11,346 but I've used my knowledge of the system, 32 00:02:11,746 --> 00:02:15,216 that I know that that AB interaction is going to be small. 33 00:02:15,216 --> 00:02:22,256 So I'm pretty sure that the effect of C, the chemical effect, will be a good estimate 34 00:02:22,596 --> 00:02:27,316 of that chemical effect and not be distorted much by the AB interaction. 35 00:02:27,826 --> 00:02:34,546 Right there is where I've demonstrated good use of educated guessing and smart assumptions. 36 00:02:34,906 --> 00:02:36,916 Notice that you may not get this assignment 37 00:02:36,916 --> 00:02:39,986 of letters correct the first time, but that's quite okay. 38 00:02:40,636 --> 00:02:45,026 I often tell my students, it feels like I spend more time planning my experiments 39 00:02:45,026 --> 00:02:46,206 than actually doing them.
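The letter assignment described above (A as temperature, B as stirring speed, C as the chemical, generated as A times B) can be written out as a small table of runs. The following is only a minimal sketch in Python under that assumption: the function name `half_fraction` and the coded levels -1 and +1 are illustrative choices, not something taken from the video.

```python
from itertools import product

# Assumed labels, following the assignment in the transcript:
# A = temperature, B = stirring speed, C = chemical, generated as C = A * B.
def half_fraction():
    """Return the 4 runs of a 3-factor half fraction in coded units (-1, +1)."""
    runs = []
    for b, a in product((-1, +1), repeat=2):
        runs.append({"A": a, "B": b, "C": a * b})
    return runs

runs = half_fraction()
for run in runs:
    print(run)

# The C column equals the A*B column in every run, so the chemical effect
# and the AB interaction are estimated together (they are aliased).
c_column = [r["C"] for r in runs]
ab_column = [r["A"] * r["B"] for r in runs]
aliased = c_column == ab_column
print("C aliased with AB:", aliased)
```

No outcome values appear anywhere in this sketch, which matches the point that the confounding pattern can be worked out on paper before any experiment is run. Reassigning the letters only means changing which physical factor is called A, B, or C.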
40 00:02:46,946 --> 00:02:50,456 That's because usually we only have one chance to do them, 41 00:02:50,966 --> 00:02:53,126 but I've many chances to plan them on paper. 42 00:02:54,066 --> 00:02:59,366 If I don't like the confounding pattern I get the first time, I can simply reassign my letters 43 00:02:59,496 --> 00:03:04,826 and I can do that as many times as I like until I get the desired confounding pattern. 44 00:03:05,676 --> 00:03:09,906 What's really important to notice here is that at no point 45 00:03:09,906 --> 00:03:13,366 in this video have we used any of the y-values. 46 00:03:13,966 --> 00:03:19,876 What this means is that this analysis can be done before you run any experiments. 47 00:03:19,876 --> 00:03:25,156 You must use your brain and some educated guessing before you start the work. 48 00:03:25,706 --> 00:03:27,676 Remember in the water treatment example, 49 00:03:27,926 --> 00:03:32,136 each experiment costs $10,000, so we can't simply repeat them. 50 00:03:32,676 --> 00:03:34,366 Now back to that trade-off table. 51 00:03:34,926 --> 00:03:38,426 You might be curious about the other entries and how they were found. 52 00:03:39,256 --> 00:03:43,196 Those entries here are found so that you can minimize the confounding 53 00:03:43,626 --> 00:03:49,566 and recover the most amount of information for a given row and column combination in the table. 54 00:03:49,566 --> 00:03:53,316 So that's a bit of the detail behind half-fractions. 55 00:03:53,566 --> 00:03:56,696 We're going to have plenty of practice with these tables coming soon. 56 00:03:58,206 --> 00:04:01,006 Here's one more important pointer about half-fractions. 57 00:04:01,786 --> 00:04:04,966 They are often more suitable than a full factorial design 58 00:04:05,406 --> 00:04:07,516 when you are trying to learn more about a system. 59 00:04:08,056 --> 00:04:11,476 In other words, for that very first set of runs that you're trying. 
60 00:04:12,516 --> 00:04:17,676 What if you had four factors to investigate and did the full set of 16 runs? 61 00:04:18,366 --> 00:04:23,836 You do them, you take the samples, and you send them to a laboratory for analysis. 62 00:04:23,836 --> 00:04:25,926 A few days later you get the results, 63 00:04:26,096 --> 00:04:29,576 only to find out there was a problem with your experimental system. 64 00:04:29,956 --> 00:04:30,806 What a waste. 65 00:04:31,176 --> 00:04:36,756 It would have been cheaper to perform just eight runs, send the results for analysis 66 00:04:36,756 --> 00:04:40,796 to discover the problem, then you can still spend the remaining budget 67 00:04:40,886 --> 00:04:45,576 on those other eight runs to recover most of your results. 68 00:04:45,576 --> 00:04:47,546 What about the other case? 69 00:04:47,546 --> 00:04:51,056 What if you had done those first eight runs and there was no problem? 70 00:04:51,696 --> 00:04:53,066 Well, you haven't lost anything. 71 00:04:53,576 --> 00:04:56,806 You can quickly go do the analysis on those first eight runs, 72 00:04:57,076 --> 00:04:59,146 and if you are satisfied you can stop. 73 00:04:59,146 --> 00:05:02,686 If you want, you can go do the other eight runs. 74 00:05:03,056 --> 00:05:07,876 We call those the complementary half fraction, and it's totally your choice to do that, 75 00:05:08,236 --> 00:05:11,286 depending on how you want to spend the rest of your money. 76 00:05:12,646 --> 00:05:15,946 Trying to visualize graphically what a half fraction as well 77 00:05:15,946 --> 00:05:21,386 as its complementary half fraction looks like is only feasible for a system of 3 factors: A, B, 78 00:05:21,386 --> 00:05:27,506 and C. First we would do the runs with the open circles, and complete all our analysis to find 79 00:05:27,506 --> 00:05:31,556 out which of the factors is significant, and by how much they affect the outcome. 80 00:05:32,346 --> 00:05:35,536 Recall, we would use a Pareto plot for this.
81 00:05:35,576 --> 00:05:44,286 Then, let's say things looked promising, and we got approval to do the other half-fraction, 82 00:05:44,416 --> 00:05:46,946 then you would come back and do the runs with closed circles. 83 00:05:46,946 --> 00:05:53,416 This concept extends naturally to a system with 8 + 8 = 16 experiments, 84 00:05:53,416 --> 00:05:57,256 or 16 + 16 = 32 experiments, and so on. 85 00:05:58,226 --> 00:06:01,696 In other words, the experiments on the diagonal of the trade-off table. 86 00:06:02,426 --> 00:06:05,476 Obviously the savings are more impressive for the larger systems. 87 00:06:07,036 --> 00:06:12,666 To end off: half fractions are a great example of what we call an experimental building block. 88 00:06:13,136 --> 00:06:17,656 A piece of work that we start with and decide to add on top of if we choose. 89 00:06:19,516 --> 00:06:22,096 You should be asking questions about how to use this table. 90 00:06:22,466 --> 00:06:25,416 Here's a suggestion, in the experimental system you thought of, 91 00:06:25,416 --> 00:06:31,316 back in an earlier module, consider how you would have planned a reduced set of runs. 92 00:06:31,886 --> 00:06:34,096 How would you have used this table in your system? 93 00:06:34,576 --> 00:06:35,836 Make sure you can answer that.
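For the complementary half fraction mentioned earlier, a minimal Python sketch of the 3-factor case may help when working through that question. It assumes the first half is generated with C = +AB and the complementary half with C = -AB. The function name `fraction` and the coded levels -1 and +1 are illustrative assumptions, and the video does not state which half corresponds to the open circles.

```python
from itertools import product

def fraction(sign):
    """Runs (A, B, C) of a 3-factor half fraction generated as C = sign * A * B."""
    return [(a, b, sign * a * b) for b, a in product((-1, +1), repeat=2)]

first_half = fraction(+1)   # assumed here to be the first set of 4 runs
second_half = fraction(-1)  # the complementary half fraction

# Together the two halves should reproduce the full 2^3 factorial,
# and no run should be repeated between them.
full_factorial = set(product((-1, +1), repeat=3))
combined = set(first_half) | set(second_half)
overlap = set(first_half) & set(second_half)

print("halves combine to the full factorial:", combined == full_factorial)
print("runs shared by both halves:", len(overlap))
```

The same building-block idea carries over to larger systems: the second half flips the sign of the generator, so running it later completes the full factorial without repeating any run.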