1 00:00:02,750 --> 00:00:05,570 In today's class, we're going to talk about interactions. 2 00:00:05,570 --> 00:00:10,000 The term "interaction" has a very specific meaning when talking about experiments. 3 00:00:11,010 --> 00:00:13,860 As mentioned in the previous class, I'm going to add a 4 00:00:13,860 --> 00:00:17,580 term to our prediction model, and the interaction term is that one. 5 00:00:19,080 --> 00:00:22,970 Now, some people take a while to understand what interactions are. 6 00:00:22,970 --> 00:00:26,030 I'm going to give you a very simple example to start with, 7 00:00:26,030 --> 00:00:29,010 and then some actual numbers to look at an example more thoroughly. 8 00:00:31,130 --> 00:00:34,240 So, assume your hands are covered with dirt or oil. 9 00:00:34,240 --> 00:00:37,000 And we know if you wash your hands with cold water, it's going to 10 00:00:37,000 --> 00:00:41,140 take a while to clean them, much longer than if you wash with hot water. 11 00:00:42,630 --> 00:00:45,620 So, the temperature of the water has a significant 12 00:00:45,620 --> 00:00:48,140 effect on the time taken to clean your hands. 13 00:00:49,230 --> 00:00:54,080 Now consider the case when washing your hands with cold water, but using soap. 14 00:00:55,260 --> 00:00:58,090 If you use soap, it will reduce the time taken 15 00:00:58,090 --> 00:01:01,360 to clean your hands compared with not using soap. 16 00:01:01,360 --> 00:01:05,970 So, it's clear, when using cold water, and adding soap, 17 00:01:05,970 --> 00:01:08,350 you're going to reduce the time to clean your hands. 18 00:01:09,840 --> 00:01:13,670 Now consider what might happen if you use hot water and add soap. 19 00:01:15,640 --> 00:01:20,369 The time taken to clean your hands with hot water and soap is greatly reduced. 20 00:01:22,370 --> 00:01:26,100 We say there's an interaction between soap and the temperature of the water. 21 00:01:27,380 --> 00:01:30,730 Warm water enhances the effect of soap. 
22 00:01:31,810 --> 00:01:35,890 Conversely, the effect of soap is enhanced by using warm water. 23 00:01:36,930 --> 00:01:40,770 This is an interaction that works to help us reach our objective faster. 24 00:01:41,940 --> 00:01:45,120 All that "interaction" means is the effect of one 25 00:01:45,120 --> 00:01:48,570 factor depends on the level of the other factor. 26 00:01:50,330 --> 00:01:53,670 In this example, the effect of soap is different depending 27 00:01:53,670 --> 00:01:56,050 on whether we were using cold water or hot water. 28 00:01:57,600 --> 00:02:00,360 Interactions are also symmetrical. 29 00:02:00,360 --> 00:02:03,590 The soap's effect is enhanced by warm water. 30 00:02:03,590 --> 00:02:06,650 Also, the warm water's effect is enhanced by soap. 31 00:02:08,090 --> 00:02:12,000 So, "symmetry" means that if soap interacts with water temperature, then 32 00:02:12,000 --> 00:02:15,680 we know the water temperature factor interacts with the soap factor. 33 00:02:17,530 --> 00:02:19,880 There are examples of interactions that actually work 34 00:02:19,880 --> 00:02:22,280 against each other and cancel each other out. 35 00:02:22,280 --> 00:02:25,400 And we'll see some of that in the upcoming videos. 36 00:02:25,400 --> 00:02:30,320 Today's class is going to consider interaction using a baking experiment. 37 00:02:30,320 --> 00:02:32,770 We're going to look at ginger biscuits. 38 00:02:32,770 --> 00:02:35,970 Now, ginger biscuits are quite possibly my favourite type of biscuit. 39 00:02:37,170 --> 00:02:40,070 And the results we're going to consider are from a student that I 40 00:02:40,070 --> 00:02:43,949 had in my class a few years ago, where she considered 3 outcome variables. 41 00:02:45,000 --> 00:02:47,430 The first variable was taste. 42 00:02:47,430 --> 00:02:52,650 The second outcome was break strength or breakability of the biscuit. 43 00:02:52,650 --> 00:02:56,010 And the third outcome was the breakability of the biscuit after one week. 
44 00:02:57,130 --> 00:02:59,770 Why did she measure three outcomes? 45 00:02:59,770 --> 00:03:03,120 Here's a great piece of advice when you run experiments. 46 00:03:03,120 --> 00:03:05,290 Even if you only have one outcome variable 47 00:03:05,290 --> 00:03:09,310 as your current objective, try to measure as many 48 00:03:09,310 --> 00:03:12,400 outcomes as you possibly can because you never know 49 00:03:12,400 --> 00:03:15,540 in future which outcome you will be interested in. 50 00:03:15,540 --> 00:03:19,730 It's very expensive to repeat experiments, so measure as much as you 51 00:03:19,730 --> 00:03:23,650 can the first time, even variables you're not interested in right now. 52 00:03:25,430 --> 00:03:27,730 So, let's go back to that taste outcome. 53 00:03:27,730 --> 00:03:29,720 And ignore the other two outcomes for now. 54 00:03:31,010 --> 00:03:33,790 Taste is obviously a subjective measurement. 55 00:03:33,790 --> 00:03:37,240 So these results are going to be very specific to my student's taste. 56 00:03:37,240 --> 00:03:39,630 I have a friend who is a professional 57 00:03:39,630 --> 00:03:42,020 taster and he was trained for over six months 58 00:03:42,020 --> 00:03:44,890 before he was considered to be qualified enough 59 00:03:44,890 --> 00:03:48,190 to taste foods for a large Canadian grocery store. 60 00:03:48,190 --> 00:03:51,320 Students in my class are not qualified tasters. 61 00:03:51,320 --> 00:03:54,480 So this outcome is very subjective. 62 00:03:54,480 --> 00:03:56,880 What that means is that if you repeated 63 00:03:56,880 --> 00:03:59,980 these experiments, your answers may actually be quite different. 64 00:04:01,780 --> 00:04:03,379 So, back to those ginger biscuits. 65 00:04:04,500 --> 00:04:09,080 The two factors that were considered were the baking time, and the type of sugar. 66 00:04:09,080 --> 00:04:10,270 Here's the recipe for you. 
67 00:04:12,460 --> 00:04:17,110 The baking time values that the student used were 8 minutes and 14 minutes, 68 00:04:17,110 --> 00:04:22,390 and for the type of sugar she chose either molasses or honey. 69 00:04:23,490 --> 00:04:26,840 All other settings for the recipe were left as shown here on the screen. 70 00:04:28,530 --> 00:04:30,810 Here are the taste results. 71 00:04:30,810 --> 00:04:33,480 3, 5, 4, and 9. 72 00:04:34,750 --> 00:04:38,820 "3" is a very bad tasting biscuit and "9" is really good. 73 00:04:38,820 --> 00:04:41,370 This is on a scale from 1 to 10. 74 00:04:41,370 --> 00:04:44,130 It's clear that the best tasting biscuits were produced 75 00:04:44,130 --> 00:04:47,180 when using molasses and a baking time of 14 minutes. 76 00:04:47,180 --> 00:04:50,120 The worst tasting biscuits were those made with honey 77 00:04:50,120 --> 00:04:52,820 and baked for a short time of 8 minutes. 78 00:04:52,820 --> 00:04:55,650 Those had a value of 3 on the taste scale. 79 00:04:55,650 --> 00:04:59,000 Start with an "interaction plot" for the two types of sugars. 80 00:04:59,000 --> 00:05:01,880 And we see the lines here are divergent. 81 00:05:01,880 --> 00:05:03,810 They're definitely not parallel this time. 82 00:05:05,150 --> 00:05:08,950 When the lines are not parallel, this is evidence of interaction. 83 00:05:10,350 --> 00:05:13,180 If we change the choice of the variable on the horizontal 84 00:05:13,180 --> 00:05:16,900 axis to be sugar and redraw the plots, we have one 85 00:05:16,900 --> 00:05:20,250 line for short cooking times and another line for longer cooking 86 00:05:20,250 --> 00:05:22,700 times and again, we will observe 87 00:05:22,700 --> 00:05:25,760 divergence, again demonstrating that there's interaction. 88 00:05:27,670 --> 00:05:31,000 Recall that we had said earlier, interactions imply the effect 89 00:05:31,000 --> 00:05:34,940 of one variable depends on the level of the other variable. 
90 00:05:36,640 --> 00:05:39,500 Our conclusion here is that the effect of sugar 91 00:05:39,500 --> 00:05:43,580 type depends on how long the biscuits are baked. 92 00:05:43,580 --> 00:05:44,960 In other words, the time factor. 93 00:05:46,770 --> 00:05:51,100 Another visualization I showed you last time was the "contour plot". 94 00:05:51,100 --> 00:05:53,935 I showed you how to draw contour plots in class 2A. 95 00:05:55,020 --> 00:05:56,580 And when we draw a contour plot for this 96 00:05:56,580 --> 00:06:01,210 system, we notice that there are some non-linearities here. 97 00:06:01,210 --> 00:06:05,920 In order to connect these lines, we need to have curves in them to make it work. 98 00:06:07,120 --> 00:06:11,320 We use the term curvature to describe this sort of non-linearity. 99 00:06:11,320 --> 00:06:15,550 And curvature is evidence that there might be interaction in our system. 100 00:06:17,360 --> 00:06:19,910 Let's try to quantify the system numerically. 101 00:06:19,910 --> 00:06:21,920 We're going to apply what we learned in the previous 102 00:06:21,920 --> 00:06:25,760 class, class 2B, and see if we can predict taste. 103 00:06:27,550 --> 00:06:29,770 Start with the main effects. 104 00:06:29,770 --> 00:06:34,650 "Main effect" is the term we use to describe how a factor affects the outcome. 105 00:06:36,000 --> 00:06:40,800 In this example, there are two factors, so there are two main effects. 106 00:06:40,800 --> 00:06:42,020 Let's start with baking time. 107 00:06:43,660 --> 00:06:46,930 Always quantify a main effect as the high level minus the low level. 108 00:06:48,150 --> 00:06:53,670 So, when using molasses, the effect of time is 9 - 4, which equals 5. 109 00:06:53,670 --> 00:06:58,590 And when we use honey, it is 5 - 3, which equals 2. 110 00:06:58,590 --> 00:07:02,330 So we can say, on average, the main effect of 111 00:07:02,330 --> 00:07:05,760 time is to increase taste by 3.5 units. 
112 00:07:07,040 --> 00:07:13,520 But remember, only report half this number, which is 1.75. 113 00:07:13,520 --> 00:07:16,660 Next, we examine the main effect of sugar. 114 00:07:17,820 --> 00:07:22,030 At long baking times, high minus low is 9 - 5, that's 4, 115 00:07:22,030 --> 00:07:27,914 and the change at short baking times is 4 - 3, which equals 1. 116 00:07:30,110 --> 00:07:33,770 So, the average of 4 and 1 is 2.5. 117 00:07:34,955 --> 00:07:37,780 There's a 2.5 unit change on average 118 00:07:37,780 --> 00:07:42,630 in taste when we go from using honey to molasses. 119 00:07:42,630 --> 00:07:44,260 We report half the value again, 1.25. 120 00:07:46,875 --> 00:07:50,320 Okay, so now is where it might get a little bit messy. 121 00:07:50,320 --> 00:07:52,790 Let's try to quantify this interaction numerically. 122 00:07:53,920 --> 00:07:56,810 Keep factor B, the sugar type, at its high level, molasses, and note that 123 00:07:56,810 --> 00:08:01,190 we have a change of 5 units when A, the baking time, is changed. 124 00:08:01,190 --> 00:08:05,040 Now put factor B at its low level, honey, and we see a change of 2 units. 125 00:08:06,200 --> 00:08:09,940 There's a bit of a discrepancy here, "5" over there and "2" down here. 126 00:08:11,080 --> 00:08:13,826 In a system with no interaction, these 127 00:08:13,826 --> 00:08:18,940 individual effects of A will be roughly the same. 128 00:08:18,940 --> 00:08:20,940 But 5 and 2 are actually quite different. 129 00:08:22,820 --> 00:08:28,640 Interaction is mathematically defined as half the difference: the effect of A when factor B 130 00:08:28,640 --> 00:08:32,230 is high, minus the effect of A when factor B is low. 131 00:08:33,830 --> 00:08:35,210 Let's show that mathematically. 132 00:08:36,340 --> 00:08:43,770 That is, 5 - 2 and then divide that by 2, in other words, 1.5. 133 00:08:43,770 --> 00:08:50,080 And, by convention again, we report only half of that value, 0.75. 134 00:08:50,080 --> 00:08:54,610 Remember we said that interactions are symmetrical. 
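The arithmetic above can be checked with a few lines of Python. This is only a sketch: the dictionary layout and the function names are illustrative assumptions, not something shown in the class. The four taste scores and the "high minus low, then halve" convention come from the lecture.

```python
# Taste scores from the 2x2 ginger biscuit experiment.
# Keys are (sugar type, baking time in minutes); this layout is an
# illustrative assumption, not the lecturer's notation.
taste = {
    ("honey", 8): 3,
    ("honey", 14): 5,
    ("molasses", 8): 4,
    ("molasses", 14): 9,
}


def time_effect(sugar):
    """Effect of baking time (14 min minus 8 min) for one sugar type."""
    return taste[(sugar, 14)] - taste[(sugar, 8)]


def sugar_effect(minutes):
    """Effect of sugar type (molasses minus honey) at one baking time."""
    return taste[("molasses", minutes)] - taste[("honey", minutes)]


# Main effects: average of the two individual effects.
main_time = (time_effect("molasses") + time_effect("honey")) / 2   # (5 + 2) / 2 = 3.5
main_sugar = (sugar_effect(14) + sugar_effect(8)) / 2              # (4 + 1) / 2 = 2.5

# Interaction: half the difference between the time effect at high B
# (molasses) and the time effect at low B (honey).
interaction = (time_effect("molasses") - time_effect("honey")) / 2  # (5 - 2) / 2 = 1.5

# By the convention used in the class, half of each value is reported.
coef_time = main_time / 2            # 1.75
coef_sugar = main_sugar / 2          # 1.25
coef_interaction = interaction / 2   # 0.75

print(coef_time, coef_sugar, coef_interaction)
```

Swapping the roles of the two factors in the `interaction` line, so that it uses `sugar_effect(14)` and `sugar_effect(8)`, gives the same 1.5.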
135 00:08:54,610 --> 00:08:57,940 So, let's try it again by looking at it from the other perspective. 136 00:08:59,500 --> 00:09:04,020 Compare the difference for factor B at high and low levels of A this time. 137 00:09:05,290 --> 00:09:07,630 So, the effect of sugar type at long 138 00:09:07,630 --> 00:09:10,609 cooking times is a difference of 4 units. 139 00:09:12,090 --> 00:09:14,380 That same difference, when using short cooking 140 00:09:14,380 --> 00:09:17,758 times, is only a difference of +1. 141 00:09:17,758 --> 00:09:23,157 Those two values are quite different, +1 and +4. 142 00:09:23,157 --> 00:09:31,140 The half difference this time is 4 - 1 divided by 2, and that's a value of 1.5. 143 00:09:31,140 --> 00:09:32,510 The same value as before. 144 00:09:33,590 --> 00:09:37,030 Again, we report only half of this, so a 0.75. 145 00:09:37,030 --> 00:09:42,500 So, let's go ahead and add that term to our prediction model. 146 00:09:43,730 --> 00:09:48,459 So far, our prediction for taste is 147 00:09:48,459 --> 00:09:54,150 1.75 times xA plus 1.25 times xB. 148 00:09:54,150 --> 00:09:59,990 The interaction value was 0.75, and we multiply that by xA and xB. 149 00:10:01,060 --> 00:10:03,520 It is symmetrical and multiplicative. 150 00:10:04,860 --> 00:10:06,680 There's only one other term missing from 151 00:10:06,680 --> 00:10:10,540 this prediction equation, and that's our baseline taste. 152 00:10:10,540 --> 00:10:14,210 That baseline is the average of all four values. 153 00:10:14,210 --> 00:10:22,610 So 3 + 5 + 4 + 9 and then divide all of that by 4, that's equal to 5.25. 154 00:10:22,610 --> 00:10:25,150 And we'll add that number right up here at the front. 155 00:10:26,920 --> 00:10:29,540 So let's take a look and see how well our predictions work. 156 00:10:30,810 --> 00:10:32,740 Try to predict the taste when using 157 00:10:32,740 --> 00:10:35,479 molasses and a cooking time of 8 minutes. 
158 00:10:37,810 --> 00:10:42,197 For that case, xA is equal to -1 because of the 8-minute baking time 159 00:10:42,197 --> 00:10:46,660 and xB is equal to +1 because we are using molasses. 160 00:10:46,660 --> 00:10:47,540 That's our coding. 161 00:10:48,660 --> 00:10:53,944 So that predicted taste is 5.25 + 1.75 162 00:10:53,944 --> 00:10:59,603 times -1 plus 1.25 times +1, plus this 163 00:10:59,603 --> 00:11:05,410 interaction of 0.75 times -1 times +1. 164 00:11:05,410 --> 00:11:10,990 This shows our baseline taste is 5.25, but using 165 00:11:10,990 --> 00:11:15,150 short baking times removes 1.75 from our taste score. 166 00:11:16,870 --> 00:11:22,020 Using molasses improves the taste by 1.25 units. 167 00:11:22,020 --> 00:11:23,730 And then the interaction works against us, 168 00:11:23,730 --> 00:11:28,980 unfortunately, and subtracts off 0.75 units. 169 00:11:28,980 --> 00:11:31,020 So, our total prediction here is 4. 170 00:11:32,210 --> 00:11:38,345 Try it again but using baking times of 14 minutes so that xA is a +1. 171 00:11:39,420 --> 00:11:44,770 So, now our prediction is 5.25 plus an additional 1.75 for baking 172 00:11:44,770 --> 00:11:49,820 time, plus 1.25 for using molasses, and now 173 00:11:49,820 --> 00:11:55,830 the interaction works in our favour by adding 0.75 units for taste. 174 00:11:55,830 --> 00:11:58,499 This gets us a cumulative total of 9 units. 175 00:11:59,830 --> 00:12:05,370 Now, this all may seem very messy but it's well worth it because what we get is 176 00:12:05,370 --> 00:12:08,330 a really good prediction model that accounts for 177 00:12:08,330 --> 00:12:11,660 the interactions in our system and the main effects. 178 00:12:13,260 --> 00:12:17,250 A system with no interactions would have this term over here equal to zero. 179 00:12:18,450 --> 00:12:24,750 An interaction involving two variables is called a two-factor interaction. 180 00:12:24,750 --> 00:12:28,250 And they obviously occur when we have two factors. 
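The two predictions worked through above can be reproduced with a short Python sketch. The function name `predict_taste` and the `observed` dictionary are hypothetical names chosen for illustration. The coefficients and the coding (-1 for 8 minutes, +1 for 14 minutes, +1 for molasses) are taken from the class, and coding honey as -1 is the implied other level.

```python
def predict_taste(x_a, x_b):
    """Predicted taste from the coded factors.

    x_a: baking time, -1 for 8 minutes, +1 for 14 minutes.
    x_b: sugar type, -1 for honey (implied coding), +1 for molasses.
    """
    baseline = 5.25          # average of the four taste scores
    return baseline + 1.75 * x_a + 1.25 * x_b + 0.75 * x_a * x_b


# Observed taste scores, keyed by the coded settings (x_a, x_b).
observed = {
    (-1, -1): 3,   # honey, 8 minutes
    (+1, -1): 5,   # honey, 14 minutes
    (-1, +1): 4,   # molasses, 8 minutes
    (+1, +1): 9,   # molasses, 14 minutes
}

for (x_a, x_b), actual in observed.items():
    print(f"xA={x_a:+d}, xB={x_b:+d}: predicted {predict_taste(x_a, x_b)}, observed {actual}")
```

With four coefficients and four experiments, the model reproduces every observed value exactly, so this check confirms the arithmetic rather than the model's accuracy at settings that were not tested.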
181 00:12:28,250 --> 00:12:30,950 But two-factor interactions also occur in systems 182 00:12:30,950 --> 00:12:34,670 when we have 3 or 4 or more factors. 183 00:12:34,670 --> 00:12:37,630 We'll see these guys cropping up several times. 184 00:12:37,630 --> 00:12:41,560 In fact, two-factor interactions occur very frequently in real 185 00:12:41,560 --> 00:12:47,040 systems, so make sure you're comfortable understanding them, at least conceptually. 186 00:12:47,040 --> 00:12:49,600 Maybe go back to that soap and water example and 187 00:12:49,600 --> 00:12:53,119 really try to figure out what an interaction means in that case. 188 00:12:54,960 --> 00:12:59,360 Interactions imply that the main effect of one factor depends on the value 189 00:12:59,360 --> 00:13:03,320 of another factor, and that's really all you have to remember about an interaction. 190 00:13:05,370 --> 00:13:08,930 Also, bear in mind that interactions are symmetrical, and 191 00:13:08,930 --> 00:13:11,592 one way we can see that, over here, is mathematically. 192 00:13:11,592 --> 00:13:16,290 xA times xB is the same as xB times xA. 193 00:13:17,960 --> 00:13:21,100 Now, let's just step back and take one 194 00:13:21,100 --> 00:13:23,820 important piece of advice away from today's video. 195 00:13:25,490 --> 00:13:28,530 Never do any work without critically 196 00:13:28,530 --> 00:13:32,190 thinking about the interpretation of these numbers. 197 00:13:32,190 --> 00:13:33,900 These are not just numbers. 198 00:13:33,900 --> 00:13:35,930 What is the message that they are telling us? 199 00:13:37,630 --> 00:13:39,880 We can see we get better taste if we 200 00:13:39,880 --> 00:13:43,750 increase the baking time from 8 minutes to 14 minutes. 201 00:13:43,750 --> 00:13:45,920 But what will happen to taste if we go 202 00:13:45,920 --> 00:13:49,830 and cook for 16 minutes, 18 minutes, or 20 minutes? 203 00:13:50,870 --> 00:13:52,660 Will taste really keep improving? 
204 00:13:53,820 --> 00:13:56,220 Think of all those burnt biscuits you're going to create. 205 00:13:57,540 --> 00:14:01,440 We saw that we improved taste by changing from honey to molasses. 206 00:14:01,440 --> 00:14:02,950 But why was this? 207 00:14:02,950 --> 00:14:04,510 Does that actually make sense? 208 00:14:05,820 --> 00:14:07,210 Remember, taste is subjective. 209 00:14:08,540 --> 00:14:11,699 I think this student preferred the taste of molasses over honey. 210 00:14:12,842 --> 00:14:15,900 Molasses is certainly a more complex ingredient than honey. 211 00:14:15,900 --> 00:14:20,550 I prefer molasses as well, but many of my friends don't like the taste of it at all. 212 00:14:21,670 --> 00:14:25,930 Another important point, how do we interpret the interaction? 213 00:14:25,930 --> 00:14:27,820 Is there any reason why we got a 214 00:14:27,820 --> 00:14:30,719 better taste with molasses at longer baking times? 215 00:14:32,590 --> 00:14:36,180 Both experiments with molasses had a better taste after all. 216 00:14:36,180 --> 00:14:38,850 But it seems like molasses, when cooked for a 217 00:14:38,850 --> 00:14:42,220 longer time, brings out a much better taste than honey. 218 00:14:43,290 --> 00:14:46,730 Perhaps the chemical reactions that are occurring during that 219 00:14:46,730 --> 00:14:49,970 baking in the oven give it a better taste. 220 00:14:49,970 --> 00:14:52,010 And because of the longer duration, there's 221 00:14:52,010 --> 00:14:54,910 more time for those reactions to complete. 222 00:14:55,960 --> 00:14:58,040 That's possibly what leads to better taste. 223 00:14:59,745 --> 00:15:04,150 Okay, so that's it for today's class, all about interactions. 224 00:15:04,150 --> 00:15:06,650 I hope you get a chance to review this a second time. 225 00:15:06,650 --> 00:15:11,410 It's not always the easiest concept to understand the first time around. 226 00:15:11,410 --> 00:15:13,330 So, please feel free to review it again.