1 00:00:03,320 --> 00:00:06,710 In the prior video, I left you halfway up the mountain. 2 00:00:06,710 --> 00:00:11,940 I had asked you to take that ninth step, that ninth experiment on your own. 3 00:00:11,940 --> 00:00:14,510 Were you able to find the location of that next run? 4 00:00:15,810 --> 00:00:18,800 As we proceed, we will cover two diversions. 5 00:00:18,900 --> 00:00:22,240 We will look at what happens if you have constraints in your experiments. 10 00:00:22,240 --> 00:00:27,400 By that I mean, what happens if you want to take a step and realize that, because of 11 00:00:27,400 --> 00:00:32,000 safety issues, or for other reasons, you can't quite go as far as you'd hoped. 12 00:00:32,410 --> 00:00:34,180 We'll also look at mistakes. 13 00:00:34,280 --> 00:00:39,640 What if you, or your colleagues, run an experiment but use the wrong settings? 14 00:00:39,640 --> 00:00:42,679 We'll show that you can easily recover from that. 29 00:00:43,680 --> 00:00:44,850 And in the prior video, 30 00:00:44,850 --> 00:00:50,760 I ended by asking you to take a step size with delta x_P equal to one and a half. 31 00:00:51,820 --> 00:00:58,338 If you did that, you would've found the associated delta x_T equal to 0.718. 32 00:00:58,338 --> 00:01:03,350 Now, let's convert these delta lower case x's to their upper case, 33 00:01:03,350 --> 00:01:07,520 real world changes, using the formulas we introduced in the prior video. 34 00:01:08,840 --> 00:01:13,200 For throughput, this lower case "delta x_T" corresponds to 35 00:01:13,200 --> 00:01:17,980 an increase of 2.87 parts per hour, which we round to 3 parts. 36 00:01:19,070 --> 00:01:23,387 For price, it's a 27 cent increase that we would add to the baseline value. 37 00:01:24,910 --> 00:01:29,460 Now we can go tell our employees or colleagues that the 9th experiment is at 38 00:01:29,460 --> 00:01:34,850 337 parts per hour, with a price of $1.45.
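The conversion from coded steps to real-world settings described above can be sketched in a few lines. This is a minimal Python sketch rather than the R code used in the course. The baseline (334 parts per hour, $1.18) and the half-ranges (4 parts per hour, $0.18) are not stated in this video; they are assumptions inferred from the quoted numbers (337 - 3, $1.45 - $0.27, 2.87 / 0.718, and 0.27 / 1.5). The function name is hypothetical.

```python
# Assumed values, inferred from the numbers quoted in the video (not stated in it).
BASE_T = 334.0        # baseline throughput, parts per hour (337 - 3)
BASE_P = 1.18         # baseline price, dollars (1.45 - 0.27)
HALF_RANGE_T = 4.0    # parts per hour per coded unit (2.87 / 0.718 is about 4)
HALF_RANGE_P = 0.18   # dollars per coded unit (0.27 / 1.5)
SLOPE_RATIO = 0.718 / 1.5   # delta x_T per unit of delta x_P, from the quoted step


def steepest_ascent_step(delta_x_p):
    """Convert a coded step in price into real-world settings for both factors."""
    delta_x_t = SLOPE_RATIO * delta_x_p      # coded step in throughput
    delta_t = delta_x_t * HALF_RANGE_T       # real-world change in throughput
    delta_p = delta_x_p * HALF_RANGE_P       # real-world change in price
    new_t = BASE_T + round(delta_t)          # throughput rounded to whole parts
    new_p = round(BASE_P + delta_p, 2)       # price rounded to whole cents
    return delta_x_t, delta_t, delta_p, new_t, new_p


dx_t, d_t, d_p, t9, p9 = steepest_ascent_step(1.5)
print(f"delta x_T = {dx_t:.3f}, delta T = {d_t:.2f}, delta P = {d_p:.2f}")
print(f"Experiment 9: T = {t9:.0f} parts/hour, P = ${p9:.2f}")
```

Under these assumptions, a step of delta x_P = 1.5 gives the settings quoted for the 9th experiment.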
39 00:01:34,850 --> 00:01:38,420 Remember, our colleagues don't speak in coded units. 40 00:01:38,420 --> 00:01:41,600 We have to talk with them in actual units, even though we 41 00:01:41,600 --> 00:01:45,530 speak in coded units behind their backs, when we deal with the least squares model. 42 00:01:46,750 --> 00:01:48,670 Now we should always go predict the outcome of 43 00:01:48,670 --> 00:01:50,300 the experiment before running it. 44 00:01:51,320 --> 00:01:55,272 In coded units x_P for the ninth experiment is at 1.5, 45 00:01:55,272 --> 00:01:56,870 because we selected that. 46 00:01:57,890 --> 00:02:02,690 You might presume that the x_T value is the 0.718 that you calculated, but 47 00:02:02,690 --> 00:02:06,250 not quite, because remember, we rounded that value. 48 00:02:06,250 --> 00:02:08,940 So we should go recalculate what x_T is for 49 00:02:08,940 --> 00:02:14,270 experiment nine, using the usual formula that connects real world units to coded units. 50 00:02:15,500 --> 00:02:20,140 So that value of x_T is equal to 0.75. 51 00:02:20,140 --> 00:02:22,400 When we go use the model, 52 00:02:22,400 --> 00:02:28,184 the prediction with these coded values gives us a profit prediction of $731. 53 00:02:29,370 --> 00:02:31,561 Now if you go to the website and 54 00:02:31,561 --> 00:02:36,478 run the actual experiment, you might get a value close to $717. 55 00:02:36,478 --> 00:02:39,120 Our prediction was off by about $13 or $14. 56 00:02:39,120 --> 00:02:46,090 You should have been able to do all of the above after watching the prior video. 57 00:02:46,090 --> 00:02:48,467 If not, go back to the prior video and 58 00:02:48,467 --> 00:02:53,089 review those calculations, where they were shown in some detail. 59 00:02:53,089 --> 00:02:57,112 Now, how bad is that prediction error of $13? 60 00:02:57,112 --> 00:03:01,570 One way to tell is by comparing it to the value from the noise in the system.
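The re-coding step mentioned above, where the rounded real-world setting is converted back to coded units before predicting, might look like the following. Again this is only a Python sketch: the center point (334 parts per hour, $1.18) and half-ranges (4 parts per hour, $0.18) are assumptions inferred from the quoted numbers, and `to_coded` is a hypothetical helper name. The model coefficients are not given in this video, so the $731 prediction is taken as quoted rather than recomputed.

```python
def to_coded(real_value, center, half_range):
    """Convert a real-world setting to coded units: (value - center) / half-range."""
    return (real_value - center) / half_range


# Assumed center point and half-ranges of the factorial (inferred, not stated).
x_t_9 = to_coded(337, center=334.0, half_range=4.0)    # rounded throughput setting
x_p_9 = to_coded(1.45, center=1.18, half_range=0.18)   # price setting

# Predicted and observed profit as quoted in the video.
predicted, observed = 731, 717
prediction_error = predicted - observed

print(f"x_T = {x_t_9:.2f} (not 0.718, because T was rounded to 337)")
print(f"x_P = {x_p_9:.2f}")
print(f"Prediction error = ${prediction_error}")
```

The small shift from 0.718 to 0.75 comes purely from rounding 2.87 parts per hour up to 3.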
61 00:03:02,640 --> 00:03:03,920 And to calculate the noise, 62 00:03:03,920 --> 00:03:08,380 we need some replicated experiments, which we haven't gone and done. 63 00:03:08,380 --> 00:03:12,770 But if we had the time and budget, we could certainly do that and verify. 64 00:03:14,190 --> 00:03:18,780 But a rough way to judge the error, without an estimate of that noise, is by comparing it 65 00:03:18,780 --> 00:03:22,400 to the coefficients of the main effects in the model. 66 00:03:22,400 --> 00:03:25,390 And it is about half the size of the smallest main effect. 67 00:03:26,460 --> 00:03:28,460 So that prediction error is not too bad. 68 00:03:29,840 --> 00:03:32,640 Now since the model's predictions are still adequate, 69 00:03:32,640 --> 00:03:35,719 we can keep going up this direction of steepest ascent. 70 00:03:36,810 --> 00:03:38,080 This is new. 71 00:03:38,080 --> 00:03:43,710 In the prior factorial, we had to stop and rebuild after using it for a single step. 72 00:03:43,710 --> 00:03:47,560 But this time our predictions are still okay, so we keep going. 73 00:03:47,560 --> 00:03:50,960 This is the general principle of response surface methods: 74 00:03:50,960 --> 00:03:55,429 keep going up that path as long as the predictions are consistent with reality. 75 00:03:56,920 --> 00:04:01,105 Now we can try a second step, delta x_P=2.5 away from the baseline. 76 00:04:02,530 --> 00:04:04,010 Pause the video and 77 00:04:04,010 --> 00:04:08,330 try to calculate these quantities for this new 10th experiment yourself. 78 00:04:09,860 --> 00:04:12,510 You'll soon become an expert at these calculations, but 79 00:04:12,510 --> 00:04:15,500 it will take you several minutes at first. 80 00:04:15,500 --> 00:04:17,070 Once you're done with your work, 81 00:04:17,070 --> 00:04:21,270 go compare your prediction to the actual experiment using the website.
82 00:04:22,400 --> 00:04:27,790 So these are the values that you should have obtained, "delta x_T" = 1.2, 83 00:04:27,790 --> 00:04:32,600 "delta T" in real world units is a change of 4.8 parts per hour, and 84 00:04:32,600 --> 00:04:34,316 we'll round that up to 5. 85 00:04:34,316 --> 00:04:39,882 "Delta P" is 0.45 or $0.45. 86 00:04:39,882 --> 00:04:47,220 T for the tenth experiment corresponds to 339 parts an hour, and P is $1.63. 87 00:04:47,220 --> 00:04:52,880 x_T in coded units is 1.25. 88 00:04:52,880 --> 00:04:56,930 Just a little bit different from the x_T=1.2 that we had calculated earlier, 89 00:04:56,930 --> 00:04:58,540 due to rounding. 90 00:04:58,540 --> 00:05:01,770 And x_P=2.5. 91 00:05:01,770 --> 00:05:06,293 Using those coded values, we can predict a y-value for 92 00:05:06,293 --> 00:05:10,022 the 10th experiment of $784.77. 93 00:05:10,022 --> 00:05:11,767 Well, $785. 94 00:05:11,767 --> 00:05:15,490 Now, the actual experimental outcome is around $732. 95 00:05:15,490 --> 00:05:19,730 You won't get that exact figure from the website because we 96 00:05:19,730 --> 00:05:22,870 add some noise to the prediction just to make things realistic. 97 00:05:24,150 --> 00:05:27,330 That's about a $50 deviation though, and 98 00:05:27,330 --> 00:05:31,650 it's comparable to the main effect of the largest factor, the price. 99 00:05:31,650 --> 00:05:34,480 So it's probably time we rebuild this model. 100 00:05:34,480 --> 00:05:37,110 And the 10th experiment can form our baseline. 101 00:05:38,390 --> 00:05:41,481 Notice that when we do this, we reset our (0,0) 102 00:05:41,481 --> 00:05:46,020 center point to this new location in real world units. 103 00:05:46,020 --> 00:05:49,270 We do not use the previous factorial's coded units. 104 00:05:49,270 --> 00:05:54,000 We start fresh and build a new local model to approximate the surface in this region. 105 00:05:55,060 --> 00:05:58,180 What range should we use for the new factorial?
106 00:05:58,180 --> 00:06:00,340 I'm going to use a slightly smaller range for 107 00:06:00,340 --> 00:06:05,120 the throughput, T, of 6 parts per hour, for two reasons. 108 00:06:05,120 --> 00:06:09,550 First, we're coming close to our upper bound of 350 parts an hour. 109 00:06:10,610 --> 00:06:13,070 In case there's an optimum near this bound, 110 00:06:13,070 --> 00:06:16,630 we will see in the next video we should have a bit of room 111 00:06:16,630 --> 00:06:20,600 to move outside the factorial bounds to fit a non-linear model. 112 00:06:21,760 --> 00:06:24,470 Secondly, we might suspect we're levelling off. 113 00:06:25,510 --> 00:06:29,880 And the way I can see this is by looking at the spread in the profit values in 114 00:06:29,880 --> 00:06:30,950 the first factorial. 115 00:06:32,300 --> 00:06:34,710 See how far apart they are over there? 116 00:06:34,710 --> 00:06:37,940 And here in this second factorial, they're closer together. 117 00:06:39,090 --> 00:06:42,430 That reduction indicates there might be a levelling off, and 118 00:06:42,430 --> 00:06:46,110 we don't want to overshoot the optimum by taking too large a step. 119 00:06:47,250 --> 00:06:50,700 For price P, I'm going to take the same range as before. 120 00:06:51,880 --> 00:06:54,850 We are still far away from the extreme upper bound. 121 00:06:54,850 --> 00:06:56,630 But if you'd like to use a different range for 122 00:06:56,630 --> 00:07:01,990 the price, go ahead and try using perhaps, $0.20 for example. 123 00:07:01,990 --> 00:07:06,090 You'll see that your direction to the optimum is not very different to the one 124 00:07:06,090 --> 00:07:09,580 I am going to take with the $0.36 range. 125 00:07:09,580 --> 00:07:12,050 So let me have a small digression here. 126 00:07:12,050 --> 00:07:14,850 You might be wondering if your choice of range will have 127 00:07:14,850 --> 00:07:17,840 a significant impact on the path of steepest ascent. 
128 00:07:19,180 --> 00:07:23,940 You notice that the direction of steepest ascent is in proportion to the range of 129 00:07:23,940 --> 00:07:25,660 the factors chosen. 130 00:07:25,660 --> 00:07:27,800 If you were doing the experiments, 131 00:07:27,800 --> 00:07:31,318 it's quite likely you will pick a different range to the one that I'll pick. 132 00:07:32,650 --> 00:07:36,930 Fortunately, and it has been shown in various statistical textbooks, 133 00:07:36,930 --> 00:07:41,150 that these different range choices selected by different experimenters 134 00:07:41,150 --> 00:07:46,270 will lead to a different path up the mountain but not radically different. 135 00:07:46,270 --> 00:07:51,210 There is this idea of "confidence interval of paths" so to speak. 136 00:07:51,210 --> 00:07:55,140 So the bottom line is this, don't be too concerned about the range choice 137 00:07:55,140 --> 00:07:58,930 as long as it is reasonable and leaves you room to the left and 138 00:07:58,930 --> 00:08:02,340 right of your extreme bounds to approach that mountain peak. 139 00:08:03,540 --> 00:08:05,510 Now, back to the factorial. 140 00:08:05,510 --> 00:08:12,290 Here are experiments, 11, 12, 13 and 14 and their corresponding profit values. 141 00:08:12,290 --> 00:08:15,880 Remember, we run them in random order, but I report them here in standard order. 142 00:08:17,310 --> 00:08:21,500 If you write and run the R code, you can show you get the following linear 143 00:08:21,500 --> 00:08:26,220 model from the five experiments, including the baseline point at position ten. 144 00:08:28,540 --> 00:08:32,190 Pause the video and fit the model from the data points. 145 00:08:32,190 --> 00:08:34,730 What interesting feature do you notice in the contour plot? 146 00:08:36,540 --> 00:08:39,820 You would've observed some curvature in the contours. 147 00:08:39,820 --> 00:08:44,110 This is an indication that something has changed in the surface. 
148 00:08:44,110 --> 00:08:46,630 Now you can happily skip onto the next video and 149 00:08:46,630 --> 00:08:48,619 see how to continue this analysis. 150 00:08:49,720 --> 00:08:52,620 But to end this video, I am going to divert and 151 00:08:52,620 --> 00:08:56,060 talk a little bit about experimental mistakes. 152 00:08:56,060 --> 00:08:59,740 I am also going to show what happens when you run into constraints. 153 00:08:59,740 --> 00:09:02,870 But feel free to come back to this topic later on 154 00:09:02,870 --> 00:09:07,519 if you want to jump ahead and see how the case study continues. 155 00:09:09,440 --> 00:09:11,640 So, to talk about mistakes, 156 00:09:11,640 --> 00:09:14,020 I will use run number 9 over here and 157 00:09:14,020 --> 00:09:16,410 show how we could have used it a bit more effectively. 158 00:09:17,870 --> 00:09:21,330 Notice that run 9 and run 11 are close to each other. 159 00:09:22,350 --> 00:09:28,620 If I were planning this third factorial here in runs 11, 12, 13 and 14, 160 00:09:28,620 --> 00:09:32,890 and if my experiments were really expensive, I would want to know if I 161 00:09:32,890 --> 00:09:36,635 could use experiment 9 and avoid running experiment 11. 162 00:09:37,890 --> 00:09:40,310 And the answer is yes, you definitely can. 163 00:09:41,560 --> 00:09:46,210 We use the concept of a "botched design", which is just an English word for 164 00:09:46,210 --> 00:09:47,180 "mistaken design". 165 00:09:48,380 --> 00:09:51,779 Mistakes happen all the time in experiments in two main ways. 166 00:09:52,890 --> 00:09:57,440 Firstly, imagine your employee wanted to actually run experiment 11, but 167 00:09:57,440 --> 00:10:01,622 made a mistake with the settings and ran the experiment at position 9 by accident. 168 00:10:03,200 --> 00:10:06,940 Another way this could have happened is to imagine that if you were running 169 00:10:06,940 --> 00:10:13,080 experiments in random order, you might have run experiment 12 then 13 then 14.
170 00:10:13,080 --> 00:10:14,360 And then you want to come and 171 00:10:14,360 --> 00:10:19,530 run experiment 11 when you suddenly realize that condition would be unsafe, or 172 00:10:19,530 --> 00:10:22,860 lead to totally different, very unexpected operation. 173 00:10:23,970 --> 00:10:27,210 Someone in the course forums asked exactly that question. 174 00:10:27,210 --> 00:10:30,630 You might think that you'd have to shrink experiment 12 over to 175 00:10:30,630 --> 00:10:36,670 this location to line up with experiment 9 and get back to a regular factorial. 176 00:10:36,670 --> 00:10:38,600 But it is not necessary. 177 00:10:38,600 --> 00:10:43,752 The important insight is that you can get an adequate model with these four points, 178 00:10:43,752 --> 00:10:47,653 even if they're not in perfect alignment with the -1 and 179 00:10:47,653 --> 00:10:51,721 +1 positions they would normally occupy on the cube plot. 180 00:10:51,721 --> 00:10:55,377 But if one or more of the experiments are shifted, 181 00:10:55,377 --> 00:10:58,780 you must use the correct coded value for them. 182 00:10:58,780 --> 00:11:01,021 For point 9, for example, 183 00:11:01,021 --> 00:11:07,170 the correct coded value is -0.67 from this equation; not -1. 184 00:11:08,880 --> 00:11:14,910 So in our R code, instead of -1, +1, -1, +1 for the factor, 185 00:11:14,910 --> 00:11:20,340 T, we use the mistaken value: -2/3, +1, -1, +1. 186 00:11:21,680 --> 00:11:24,669 And we enter the outcome value we got at the mistaken point. 187 00:11:27,220 --> 00:11:30,770 Now models with mistaken experiments, because the points are not at these -1 or 188 00:11:30,770 --> 00:11:36,540 +1 positions, are generally calculated by a computer, and not by hand. 189 00:11:36,540 --> 00:11:40,520 When you rebuild the model, you get the following prediction equation and 190 00:11:40,520 --> 00:11:41,370 contour plots.
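The botched-design idea above, entering -2/3 instead of -1 for the shifted point and letting the computer fit the model, can be sketched as follows. This is a Python sketch, not the R code from the course, and it uses only the standard library. The profit values for runs 12, 13 and 14 are not given in this video, so no real data is used: the responses are generated from hypothetical coefficients purely to check that least squares still recovers the model when one point sits off the -1 position. The center (339 parts per hour) and half-range (3 parts per hour) follow from the new baseline and the range of 6 mentioned earlier; all function names are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                        # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_model(points, y):
    """Least squares fit of y = b0 + bT*xT + bP*xP + bTP*xT*xP via normal equations."""
    X = [[1.0, xt, xp, xt * xp] for xt, xp in points]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Coded throughput of run 9 in the NEW factorial: center 339, half-range 3.
x_t_run9 = (337 - 339) / 3          # -2/3, not -1

# Standard order, with run 9 standing in for run 11, plus the baseline (run 10).
points = [(x_t_run9, -1), (+1, -1), (-1, +1), (+1, +1), (0, 0)]

# HYPOTHETICAL coefficients, used only to generate illustrative responses.
true_b = [700.0, 10.0, 20.0, -5.0]
y = [true_b[0] + true_b[1] * xt + true_b[2] * xp + true_b[3] * xt * xp
     for xt, xp in points]

coefficients = fit_model(points, y)
print([round(b, 6) for b in coefficients])
```

With real data you would replace `y` with the measured profits, entering the outcome actually observed at the mistaken point.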
191 00:11:42,740 --> 00:11:45,520 Let me contrast that to the situation over here on 192 00:11:45,520 --> 00:11:50,930 the right where I had used experiments at position 11, 12, 13 and 14. 193 00:11:50,930 --> 00:11:57,790 And you can see that that model is not very different from the model with a mistake. 194 00:11:57,790 --> 00:12:00,530 Now, you do lose some of the useful properties we get when 195 00:12:00,530 --> 00:12:05,360 the design is run at the correct values. But if this small change means 196 00:12:05,360 --> 00:12:11,030 saving lots of money to avoid redoing an experiment, it's really worth the price. 197 00:12:11,030 --> 00:12:14,230 Notice that you definitely did not have to shrink your range. 198 00:12:15,510 --> 00:12:18,890 So we have discussed what to do with constraints that relate to mistakes or 199 00:12:18,890 --> 00:12:20,380 "botched designs". 200 00:12:20,380 --> 00:12:23,390 What about constraints that are imposed by the system, 201 00:12:23,390 --> 00:12:25,590 constraints that you know about ahead of time? 202 00:12:26,620 --> 00:12:29,270 It is common for systems to have such constraints that 203 00:12:29,270 --> 00:12:31,760 prevent operation outside a certain region. 204 00:12:32,850 --> 00:12:36,590 I'm talking not just about constraints that align with extreme vertical or 205 00:12:36,590 --> 00:12:40,830 horizontal edges of the factors, outside which you cannot operate. 206 00:12:40,830 --> 00:12:44,850 But rather, I'm referring to constraints that cut entire regions out 207 00:12:44,850 --> 00:12:45,719 of consideration. 208 00:12:46,830 --> 00:12:52,100 For example, a constraint that runs along this direction, shown in red, and 209 00:12:52,100 --> 00:12:55,090 anything beyond it, we cannot go and run over there. 210 00:12:56,640 --> 00:13:01,000 What if the path of steepest ascent was showing a promising direction along 211 00:13:01,000 --> 00:13:02,880 here, in green?
212 00:13:02,880 --> 00:13:06,860 Well, then we modify our path to obey that constraint. 213 00:13:06,860 --> 00:13:12,460 We always have to obey safety requirements in our process. 214 00:13:12,460 --> 00:13:14,310 They are of primary importance. 215 00:13:14,310 --> 00:13:18,360 And we have to find our optimum within those restrictions. 216 00:13:18,360 --> 00:13:22,850 And I'll end by saying it is not uncommon to find your optimum 217 00:13:22,850 --> 00:13:25,050 right at the boundary of a constraint. 218 00:13:25,050 --> 00:13:27,990 We see that in engineering systems frequently, and 219 00:13:27,990 --> 00:13:30,310 it's likely to occur in other systems as well. 220 00:13:31,860 --> 00:13:33,843 So that's the end of this video. 221 00:13:33,843 --> 00:13:39,949 In the next video, we resume by going back to the factorial with this baseline at point 10. 222 00:13:39,949 --> 00:13:42,580 Where would you run your next experiment? 223 00:13:42,580 --> 00:13:47,310 Use the tool on the website and try a few runs yourself before watching video 6C.