1 00:00:00,036 --> 00:00:03,346 Remember that term "factor" from the last module? 2 00:00:03,556 --> 00:00:08,836 Well, our main goal with this module is to achieve confidence, to run and analyze data 3 00:00:08,916 --> 00:00:11,116 from experiments when there are two factors. 4 00:00:11,116 --> 00:00:16,006 We are going to use pen and paper only, and everything is going to be done by hand. 5 00:00:16,736 --> 00:00:18,706 It's actually a whole lot easier than you think. 6 00:00:19,336 --> 00:00:24,266 But, if you already understand the concept of factorial experiments in two factors, 7 00:00:24,546 --> 00:00:29,616 feel free to jump ahead; check out the last video, which is a 3-factor example, 8 00:00:29,796 --> 00:00:31,876 then try the quizzes for this module. 9 00:00:31,876 --> 00:00:37,146 If you do well, move ahead and start the material for module three. 10 00:00:37,146 --> 00:00:40,636 In that next module we are going to introduce computer software 11 00:00:40,636 --> 00:00:43,346 to analyze the experiments and visualize the data. 12 00:00:43,346 --> 00:00:47,736 But for now, get out that pen and paper and let's get started. 13 00:00:47,736 --> 00:00:52,676 -- So we are considering a basic example; an experiment with 2 factors. 14 00:00:53,386 --> 00:00:57,386 In the previous module we had said factors can be either numeric or categorical. 15 00:00:58,036 --> 00:01:01,086 In this example we will consider one factor of each type. 16 00:01:02,756 --> 00:01:04,386 So we're going to make popcorn! 17 00:01:04,746 --> 00:01:08,476 And in this experiment, the outcome is the number of popped kernels. 18 00:01:09,696 --> 00:01:13,136 It might be our objective to maximize that number of popped corns. 19 00:01:13,136 --> 00:01:16,946 Most of you will be able to try this one at home, which is why this is 20 00:01:16,946 --> 00:01:18,586 such a great example to start with. 
21 00:01:19,846 --> 00:01:22,556 We're going to apply the same amount of heat each time 22 00:01:22,636 --> 00:01:25,446 and use the same number of raw kernels to start with. 23 00:01:25,586 --> 00:01:29,936 From prior experience, I know that between three and four minutes are required, 24 00:01:30,076 --> 00:01:32,676 on medium heat, to pop most of the corn. 25 00:01:32,676 --> 00:01:36,546 So our first factor is going to be the time on the stove. 26 00:01:37,146 --> 00:01:40,816 And I'm going to use 160 seconds and 200 seconds. 27 00:01:41,706 --> 00:01:45,106 Notice that we use two levels, or two values, for this factor. 28 00:01:45,256 --> 00:01:47,786 Just under 3 minutes, and just over 3 minutes. 29 00:01:47,826 --> 00:01:52,676 Figuring out these numeric values for your experiments takes some practice. 30 00:01:52,676 --> 00:01:56,336 You will make mistakes, but we give general advice in coming classes. 31 00:01:57,126 --> 00:02:00,206 One quick tip though is don't use extremes. 32 00:02:00,206 --> 00:02:04,376 For example, you wouldn't use 30 seconds and 10 minutes for this experiment. 33 00:02:05,016 --> 00:02:08,636 You know in the first case that nothing happens in 30 seconds, 34 00:02:09,006 --> 00:02:12,626 and for 10 minutes you are going to burn it all. 35 00:02:12,796 --> 00:02:17,626 So let's recap: we use 160 seconds and 200 seconds. 36 00:02:18,046 --> 00:02:22,456 In later modules you'll learn how to either increase or decrease that cooking time, 37 00:02:22,456 --> 00:02:24,216 in order to improve our objective. 38 00:02:26,066 --> 00:02:28,916 The second factor we will consider is the type of popcorn. 39 00:02:29,606 --> 00:02:32,346 You could buy either white popcorn or yellow popcorn. 40 00:02:33,086 --> 00:02:36,696 Notice that this is a categorical variable, and there are two levels. 41 00:02:37,456 --> 00:02:41,236 We will assign the low level for white corn and the high level for yellow corn. 
42 00:02:41,346 --> 00:02:44,996 So let's start planning the experiments next. 43 00:02:44,996 --> 00:02:50,126 We have two factors: cooking time, and type of corn; and each factor has two levels. 44 00:02:50,126 --> 00:02:54,096 From this we know that we will have four total combinations. 45 00:02:54,726 --> 00:02:57,956 This comes from the mathematical rule that two to the power 46 00:02:57,956 --> 00:03:00,926 of "k" tells us how many experiments we will have. 47 00:03:01,906 --> 00:03:06,086 Now "k" is the number of factors, and in this experiment we have 2 of them. 48 00:03:06,246 --> 00:03:10,546 So in other words, there will be two to the power of two, or in this case four, 49 00:03:10,546 --> 00:03:12,886 experiments in total that we have to run. 50 00:03:14,096 --> 00:03:16,286 We will write them in a table first, as follows. 51 00:03:16,286 --> 00:03:22,066 Let's pick cooking time and call it factor A, then call the type of corn factor B. 52 00:03:22,206 --> 00:03:28,516 So there are two columns, one for A and one for B. We use minus signs to indicate a low level 53 00:03:28,516 --> 00:03:31,526 for a factor, and a plus sign to indicate a high level. 54 00:03:31,526 --> 00:03:35,416 You will hear me say this a few times, but I hope you believe me: 55 00:03:35,946 --> 00:03:40,096 I promise it will be clearer by the 3rd module why we use minuses and plusses. 56 00:03:40,936 --> 00:03:46,506 The standard approach is to vary the signs for factor A the fastest, so put "minus", "plus", 57 00:03:46,556 --> 00:03:52,736 "minus", "plus", in the four rows for column A. These signs tell the experimenter what levels 58 00:03:52,736 --> 00:03:54,136 to operate that factor at. 59 00:03:54,136 --> 00:04:00,156 For factor A in this experiment that means we will have two experiments at 160 seconds, 60 00:04:00,566 --> 00:04:03,636 and these other two experiments will be run at 200 seconds. 
61 00:04:03,636 --> 00:04:09,606 For numeric variables, the "minus" corresponds most naturally to the smaller numeric value, 62 00:04:09,606 --> 00:04:11,616 and the "plus" to the larger numeric value. 63 00:04:11,616 --> 00:04:14,516 Now let's consider factor B. 64 00:04:14,516 --> 00:04:16,686 This is a categorical variable. 65 00:04:16,986 --> 00:04:20,706 There isn't a natural assignment for the "minus" or "plus" signs. 66 00:04:20,706 --> 00:04:23,056 In this case, we allocate the signs arbitrarily. 67 00:04:23,056 --> 00:04:28,356 For example, let's put white corn as "minus" and yellow corn as "plus". 68 00:04:28,356 --> 00:04:30,166 We could have flipped this allocation around. 69 00:04:30,496 --> 00:04:33,936 But, as you will prove to yourself in a quiz during this module, 70 00:04:34,116 --> 00:04:35,806 you will still get the same results. 71 00:04:35,806 --> 00:04:42,086 So, complete the table, now, by adding column B, and vary that one step slower 72 00:04:42,086 --> 00:04:48,416 than you varied column A: "minus", "minus", "plus" and then "plus". 73 00:04:48,416 --> 00:04:51,696 Now we are ready to implement the experiments. 74 00:04:51,696 --> 00:04:54,066 Here's a bit of advice and, in this course, 75 00:04:54,066 --> 00:04:57,466 when we give some practical advice, we will show it with this icon. 76 00:04:58,476 --> 00:05:02,346 The most important thing that you should NOT do is run the experiments 77 00:05:02,346 --> 00:05:03,986 in the order shown in the table. 78 00:05:04,536 --> 00:05:07,046 You MUST run the experiments in random order. 79 00:05:08,166 --> 00:05:12,656 Now you can choose any method you like to pick that random order of experiments. 80 00:05:12,656 --> 00:05:19,086 The easiest, I find, is to write numbers on pieces of paper - as many as you have experiments. 81 00:05:19,086 --> 00:05:22,846 Then randomly select these pieces until no more are left. 
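The table and the random run order described above can also be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `standard_order` and the `levels` lookup are hypothetical names introduced here, and `random.shuffle` stands in for drawing numbered pieces of paper.

```python
import itertools
import random

def standard_order(k):
    """Return the 2**k sign combinations with the first factor varying fastest."""
    # itertools.product varies the LAST position fastest, so each row is
    # reversed to make factor A (the first column) change fastest.
    return [tuple(reversed(row)) for row in itertools.product("-+", repeat=k)]

# Two factors give 2**2 = 4 rows: A is -,+,-,+ and B is -,-,+,+.
design = standard_order(2)

# Hypothetical lookup from sign to the physical setting used in the video.
levels = {
    "A": {"-": "160 s", "+": "200 s"},
    "B": {"-": "white corn", "+": "yellow corn"},
}

# Shuffle the row numbers 1..4 to obtain a random run order.
run_order = list(range(1, len(design) + 1))
random.shuffle(run_order)

for run in run_order:
    a, b = design[run - 1]
    print(f"standard-order row {run}: A={a} ({levels['A'][a]}), B={b} ({levels['B'][b]})")
```

The printed order changes on every run, which is the point: the rows are labelled in standard order but executed in whatever order the shuffle produces.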
82 00:05:22,846 --> 00:05:27,516 A few other options are shown here on the screen; please take a look at them. 83 00:05:30,036 --> 00:05:32,806 So here the standard order column refers to the way 84 00:05:32,806 --> 00:05:35,066 in which we will label the rows in standard order. 85 00:05:35,726 --> 00:05:38,556 The next column we add is the "actual order" column. 86 00:05:39,166 --> 00:05:42,866 This column represents the order in which the experiments were actually run. 87 00:05:44,016 --> 00:05:47,676 Now we can go start running our experiments and record the outcome variable. 88 00:05:47,676 --> 00:05:53,866 The first experiment I randomly picked was number 3 - that experiment is run 89 00:05:53,866 --> 00:05:56,456 at short cooking times and with yellow corn. 90 00:05:56,586 --> 00:06:02,366 After the experiment is run, I recorded an outcome of 62 popped kernels. 91 00:06:02,996 --> 00:06:07,306 Then I drew my next random number and found that I should run experiment 1. 92 00:06:07,876 --> 00:06:10,386 I recorded a value of 52 popped kernels. 93 00:06:11,416 --> 00:06:16,186 I then go get another random number and suppose I get row 4 from the standard order table. 94 00:06:16,846 --> 00:06:20,146 When I run that experiment I get a value of 80 popped corns. 95 00:06:20,916 --> 00:06:25,896 My final experiment is number 2 from the standard order table, with long cooking times 96 00:06:25,896 --> 00:06:30,446 and white corn; this led me to a result of 74 popped corns. 97 00:06:30,446 --> 00:06:35,826 So once all the experiments are done we will have 4 entries of outcome values. 98 00:06:36,656 --> 00:06:39,296 Now where do we start with our analysis? 99 00:06:39,296 --> 00:06:43,496 The first thing a good statistical analysis will do is to visualize the data. 100 00:06:43,496 --> 00:06:46,816 We start by drawing a cube plot for the system. 101 00:06:47,376 --> 00:06:49,576 Remember the cube plot from the first module? 
102 00:06:50,326 --> 00:06:52,686 That plot shows us the effect of each factor. 103 00:06:52,746 --> 00:06:58,546 Start by drawing a square and then put the first variable along the horizontal axis, 104 00:06:58,546 --> 00:07:01,776 and the second variable along the vertical axis. 105 00:07:01,776 --> 00:07:04,326 Let's consider the horizontal axis first. 106 00:07:04,326 --> 00:07:07,876 We have short cooking times on the left and long cooking times on the right. 107 00:07:08,756 --> 00:07:12,886 In the vertical direction we have white corn at the bottom, and yellow corn at the top. 108 00:07:14,076 --> 00:07:16,796 Now you are ready to add the outcome variable to this plot. 109 00:07:17,586 --> 00:07:20,186 The number 52 goes over here in the bottom left, 110 00:07:20,186 --> 00:07:23,386 because that's the combination with short times and white corn. 111 00:07:23,486 --> 00:07:27,566 74 goes here at the bottom right, for those combination settings. 112 00:07:27,936 --> 00:07:29,366 Up here we have 62. 113 00:07:29,496 --> 00:07:34,616 Our final value at the top right hand corner is 80, at long cooking times with yellow corn. 114 00:07:34,716 --> 00:07:38,096 Start by considering the effect of time. 115 00:07:38,716 --> 00:07:44,816 As cooking time increases, and when using yellow corn, we go from 62 to 80. 116 00:07:44,816 --> 00:07:48,276 That's an increase of 18 units. 117 00:07:48,276 --> 00:07:55,426 For white corn, we see that we go from 52 to 74, an increase of 22 units. 118 00:07:55,426 --> 00:08:02,356 So, on average, we have a 20 unit increase when cooking time goes from 160 to 200 seconds. 119 00:08:02,356 --> 00:08:05,986 Let's consider the difference between the corn type next. 120 00:08:05,986 --> 00:08:09,276 This is the effect between yellow corn and white corn. 121 00:08:09,326 --> 00:08:16,206 Similar to before, what we do is we compare this effect, keeping the other variable constant. 
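The cooking-time arithmetic above can be checked with a short Python sketch. The dictionary `popped` and the helper `time_effect` are hypothetical names chosen for this illustration; the four counts are the ones read off the cube plot.

```python
# Popped-kernel counts from the cube plot, keyed by (time in seconds, corn type).
popped = {
    (160, "white"): 52,
    (200, "white"): 74,
    (160, "yellow"): 62,
    (200, "yellow"): 80,
}

def time_effect(corn):
    """Change in popped kernels when time goes from 160 s to 200 s for one corn type."""
    return popped[(200, corn)] - popped[(160, corn)]

effect_white = time_effect("white")     # 74 - 52
effect_yellow = time_effect("yellow")   # 80 - 62

# Average the two differences to get the overall effect of cooking time.
average_time_effect = (effect_white + effect_yellow) / 2

print(effect_white, effect_yellow, average_time_effect)  # 22 18 20.0
```

Each difference is taken with the corn type held fixed, which is the same "keep the other variable constant" idea used for the corn-type comparison.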
122 00:08:17,036 --> 00:08:22,196 In other words, let's fix time at the high value of 200 seconds and see what the effect 123 00:08:22,196 --> 00:08:24,626 of changing from white to yellow corn does. 124 00:08:25,666 --> 00:08:29,256 In this case, we go from 74 to 80 popped kernels. 125 00:08:30,106 --> 00:08:34,566 When we report and quantify this effect we say 80 minus 74, 126 00:08:34,846 --> 00:08:38,506 in other words an increase of six units. 127 00:08:38,506 --> 00:08:41,436 What is the effect of corn colour at short cooking times? 128 00:08:42,276 --> 00:08:45,226 I'd like you to pause the video and calculate that for yourself now. 129 00:08:47,096 --> 00:08:52,846 You should have found it to be a 10 unit change: from 52 to 62; in other words, 130 00:08:52,946 --> 00:08:56,436 62 minus 52 is a 10 unit increase. 131 00:08:56,436 --> 00:09:02,046 So we can report the average: an 8 unit increase in the number of popped corns 132 00:09:02,046 --> 00:09:04,896 when changing from white corn to yellow corn. 133 00:09:05,646 --> 00:09:08,366 Make sure your interpretation matches up with your cube plot. 134 00:09:08,956 --> 00:09:12,816 Those visualizations are so important to check your analysis. 135 00:09:14,956 --> 00:09:18,316 Let's visualize this in a second way with a contour plot. 136 00:09:19,156 --> 00:09:21,346 I've redrawn the cube plot here for you. 137 00:09:21,706 --> 00:09:23,766 And now we're going to add contours to it. 138 00:09:23,766 --> 00:09:28,766 Start in any corner that is not a maximum and not a minimum. 139 00:09:29,556 --> 00:09:32,546 Then connect the lines as shown by this example. 140 00:09:32,546 --> 00:09:37,156 Notice that the value of 62 would appear approximately 141 00:09:37,156 --> 00:09:38,936 over here on this side of the square. 142 00:09:38,936 --> 00:09:42,686 So we draw a contour to connect these two points, 143 00:09:42,686 --> 00:09:44,396 because they should be at the same level. 
144 00:09:46,036 --> 00:09:48,346 Look at the value of 74 over here. 145 00:09:49,206 --> 00:09:53,006 It would appear approximately on this opposite side of the cube at this point. 146 00:09:53,756 --> 00:09:57,996 And finally, we can guess that the rest of the contours are approximately linear. 147 00:09:59,136 --> 00:10:04,476 Don't worry, we will show in later classes how to verify that using computer software. 148 00:10:04,476 --> 00:10:09,296 This is a great way to visualize a set of experiments. 149 00:10:09,296 --> 00:10:14,266 Because we can quickly see here how to start moving towards improving our objective. 150 00:10:14,266 --> 00:10:19,466 For example, if our objective was to maximize the number of popped kernels, 151 00:10:20,066 --> 00:10:22,246 then we can see we should move in this direction 152 00:10:22,406 --> 00:10:25,196 to the top right hand corner, to achieve that goal. 153 00:10:26,226 --> 00:10:31,436 In this specific case that means we must use yellow corn and longer cooking times. 154 00:10:31,866 --> 00:10:36,836 The longer cooking times result is probably intuitive though for this particular case study. 155 00:10:36,836 --> 00:10:39,876 The interpretation of white and yellow corn probably wasn't. 156 00:10:41,416 --> 00:10:45,866 I always tell my students, you must ask: "where should I run my next experiment?" 157 00:10:46,476 --> 00:10:49,726 And the contour plot tells us that answer. 158 00:10:49,726 --> 00:10:52,886 In summary: we have seen two ways to visualize our data. 159 00:10:53,446 --> 00:10:57,666 One way is with a cube plot, with the values superimposed at the corners. 160 00:10:58,346 --> 00:11:02,106 The second method is to take the cube plot and add contour lines. 161 00:11:02,926 --> 00:11:06,156 I'm going to show you a third way before we end this class today. 162 00:11:06,156 --> 00:11:11,986 This plot is called an interaction plot, and you'll see why, especially in the next video. 
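For the contour plot described earlier, the points where the 62 and 74 contours meet the opposite side of the square can be estimated by linear interpolation. The sketch below rests on assumptions not stated in the video: that the outcome changes linearly along each edge, and that the "sides" pointed at on screen are the bottom edge (for 62) and the top edge (for 74), since those are the only opposite edges whose corner values bracket the targets. The helper name `fraction_along_edge` is hypothetical.

```python
def fraction_along_edge(start, end, target):
    """Linear interpolation: how far from `start` toward `end` the value `target` lies."""
    return (target - start) / (end - start)

# Corner values from the cube plot.
bottom_left, bottom_right = 52, 74   # white corn: short time, long time
top_left, top_right = 62, 80         # yellow corn: short time, long time

# The 62 contour starts at the top-left corner and meets the bottom edge here,
# measured as a fraction of the way from short to long cooking time.
x_62_on_bottom = fraction_along_edge(bottom_left, bottom_right, 62)

# The 74 contour starts at the bottom-right corner and meets the top edge here.
x_74_on_top = fraction_along_edge(top_left, top_right, 74)

print(round(x_62_on_bottom, 2), round(x_74_on_top, 2))  # 0.45 0.67
```

Under these assumptions the 62 contour lands a little left of the middle of the bottom edge, and the 74 contour about two-thirds of the way along the top edge.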
163 00:11:12,826 --> 00:11:15,716 Put one of the variables at the bottom, with its low value 164 00:11:15,716 --> 00:11:18,066 and its high value in the horizontal direction. 165 00:11:18,166 --> 00:11:22,776 For example when we use white corn, our outcome variable is 52 166 00:11:22,776 --> 00:11:25,316 and 74 at the two settings of time. 167 00:11:25,316 --> 00:11:28,366 Let's use a solid line to connect them. 168 00:11:28,466 --> 00:11:33,526 For yellow corn, the outcome values were 62 and 80, 169 00:11:33,986 --> 00:11:36,566 and I'll use a dashed line to connect those two. 170 00:11:36,566 --> 00:11:40,046 Notice that these two lines are roughly parallel. 171 00:11:40,126 --> 00:11:43,976 So there we have an interaction plot. 172 00:11:43,976 --> 00:11:48,676 We could have flipped our choice of variable to start with on the horizontal axis. 173 00:11:48,676 --> 00:11:49,986 Let me quickly show you how. 174 00:11:50,706 --> 00:11:54,886 Or maybe, you'd like to pause the video and try it yourself first. 175 00:11:54,886 --> 00:11:58,286 Put yellow corn and white corn on the horizontal axis 176 00:11:58,286 --> 00:12:00,626 and then connect them with two different line styles. 177 00:12:01,926 --> 00:12:05,986 One line for short cooking time and one line for long cooking time. 178 00:12:09,286 --> 00:12:11,086 Did you get the result that was shown here? 179 00:12:11,516 --> 00:12:17,066 I've used a solid line for short cooking durations, where the outcomes were 52 and 62, 180 00:12:17,746 --> 00:12:21,716 and I've used a dashed line for the longer duration experiments 181 00:12:21,716 --> 00:12:24,206 where the outcome values were 74 and 80. 182 00:12:25,196 --> 00:12:27,206 Note that the lines are parallel again. 183 00:12:28,476 --> 00:12:31,896 I'm pointing out the parallel lines to you because the fact 184 00:12:31,896 --> 00:12:36,006 that these lines are parallel means that the system has no interaction. 
185 00:12:36,556 --> 00:12:40,066 And that's a term you are going to hear about in the next few videos. 186 00:12:40,926 --> 00:12:45,946 One last point to wrap up: notice that the 4 visualization methods we've considered 187 00:12:45,946 --> 00:12:48,986 in this video do not require any computer software. 188 00:12:49,746 --> 00:12:53,206 We've drawn the table by hand, we've drawn a cube plot, 189 00:12:53,656 --> 00:12:56,776 and a contour plot, and now this interaction plot. 190 00:12:57,926 --> 00:13:02,486 You can apply these visualizations whether the factors are numeric or categorical. 191 00:13:03,556 --> 00:13:07,306 All of this demonstrates a distinct advantage of these experiments: 192 00:13:07,346 --> 00:13:10,926 we can quickly understand the results using simple graphical tools, 193 00:13:11,286 --> 00:13:13,236 and quick calculations on a piece of paper. 194 00:13:13,236 --> 00:13:18,606 The fact that they are so simple means that the results are easy to share 195 00:13:18,606 --> 00:13:20,656 with your colleagues and your managers at work. 196 00:13:22,136 --> 00:13:26,606 Now in the next class I'm going to show you how we can build mathematical models 197 00:13:26,606 --> 00:13:27,986 to predict the outcome. 198 00:13:28,796 --> 00:13:31,946 Making predictions is one of the most powerful aspects 199 00:13:31,946 --> 00:13:34,676 of running our experiments in this factorial manner. 200 00:13:34,676 --> 00:13:36,166 See you next time.
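As a pen-and-paper style check on the two interaction plots from this video, the rise of each line can be compared directly. This is a rough sketch: the dictionary `popped` and the variable names are hypothetical, and whether a gap of a few units counts as "roughly parallel" is a judgment call that the video leaves to later classes.

```python
# Popped-kernel counts, keyed by (cooking time, corn type).
popped = {
    ("short", "white"): 52,
    ("long", "white"): 74,
    ("short", "yellow"): 62,
    ("long", "yellow"): 80,
}

# Plot 1: time on the horizontal axis, one line per corn type.
rise_white = popped[("long", "white")] - popped[("short", "white")]
rise_yellow = popped[("long", "yellow")] - popped[("short", "yellow")]

# Plot 2: corn type on the horizontal axis, one line per cooking time.
rise_short = popped[("short", "yellow")] - popped[("short", "white")]
rise_long = popped[("long", "yellow")] - popped[("long", "white")]

print(rise_white, rise_yellow, rise_white - rise_yellow)  # 22 18 4
print(rise_short, rise_long, rise_short - rise_long)      # 10 6 4
```

In both plots the two lines differ in rise by the same 4 units, which is small next to the 20 unit time effect, so the lines look close to parallel in either orientation.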