1 00:00:00,136 --> 00:00:02,326 My goal with this video is to show you 2 00:00:02,326 --> 00:00:06,496 where the predictive model we calculate using computer software comes from. 3 00:00:06,556 --> 00:00:09,606 This predictive model is called a least-squares model. 4 00:00:09,606 --> 00:00:12,316 And these models are widely used in companies. 5 00:00:12,886 --> 00:00:17,536 You've certainly seen them if you've taken a basic math or statistics class. 6 00:00:17,536 --> 00:00:20,666 Quickly watch this video, even if you understand least squares. 7 00:00:20,666 --> 00:00:24,096 If you have limited experience though with least-squares, 8 00:00:24,376 --> 00:00:27,866 take a moment to see the extra resources we've posted for you. 9 00:00:27,866 --> 00:00:31,336 We certainly want to give you as much help as we can. 10 00:00:31,866 --> 00:00:34,866 Now in the videos in the prior module, we were looking at popcorn. 11 00:00:35,226 --> 00:00:38,616 And I'm going to use that example again in this class. 12 00:00:38,616 --> 00:00:43,606 In the popcorn experiment, our objective was to maximize the amount of popcorn created. 13 00:00:44,346 --> 00:00:47,176 Our outcome variable was the number of popped kernels. 14 00:00:47,826 --> 00:00:51,666 Here is the cube plot, and the corresponding predictive model that we created. 15 00:00:53,076 --> 00:00:58,866 The predictive model has four parameters: 67, 10, 4, and -1. 16 00:00:58,916 --> 00:01:04,296 67 was the baseline amount, the average of all four experimental outcomes. 17 00:01:04,916 --> 00:01:08,246 We also refer to that as the intercept, and you'll see why in a minute. 18 00:01:09,306 --> 00:01:12,206 "10" is the effect of factor A, the cooking time. 19 00:01:12,206 --> 00:01:17,486 This is what we call the main effect for factor A. "4" is the effect 20 00:01:17,486 --> 00:01:20,016 of factor B, the kind of popcorn we used. 21 00:01:20,526 --> 00:01:23,796 And lastly, the "-1" is the two factor interaction term. 
22 00:01:23,796 --> 00:01:26,966 Do you recall how we calculated these numbers by hand? 23 00:01:27,636 --> 00:01:30,296 Go back to the videos in the previous module if you are not sure. 24 00:01:30,296 --> 00:01:37,096 The most general form of the least squares model for this system is y equals b_0, 25 00:01:37,096 --> 00:01:46,256 plus b_A times x_A, plus b_B times x_B, plus b_{AB} times x_A times x_B. 26 00:01:47,996 --> 00:01:53,396 The x_A is the coded value for factor A, and it represents the amount of cooking time. 27 00:01:53,396 --> 00:01:58,626 If x_A = -1, that represents 160 seconds of time. 28 00:01:59,066 --> 00:02:03,056 And x_A = +1 represents 200 seconds of cooking time. 29 00:02:03,366 --> 00:02:08,526 The "-1" and "+1" are called coded units and the 160 seconds 30 00:02:08,526 --> 00:02:11,826 and 200 seconds are called real-world units. 31 00:02:11,826 --> 00:02:17,486 Note that we cannot use real-world units in this equation, only the coded units. 32 00:02:17,486 --> 00:02:18,876 Similarly for x_B. 33 00:02:19,246 --> 00:02:24,276 It is coded so that "-1" represents white corn and "+1" represents yellow corn. 34 00:02:24,276 --> 00:02:30,656 Similar to the x_A case, the -1 and +1 are the coded units, while white corn 35 00:02:30,656 --> 00:02:33,716 and yellow corn are the real-world units. 36 00:02:33,716 --> 00:02:38,956 Recall that with categorical variables we assigned the -1 and +1 arbitrarily. 37 00:02:38,956 --> 00:02:43,146 The sign of the coded unit will not change the model's interpretation. 38 00:02:44,456 --> 00:02:47,626 Now take a look at what happens if I write that equation down, 39 00:02:47,756 --> 00:02:50,466 for each of the four experimental points in the system. 40 00:02:51,616 --> 00:02:55,696 We can substitute in values for the coded units into this prediction equation.
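The substitution of coded values into the prediction equation can be sketched in a few lines of Python. This is only an illustration: the parameter values 67, 10, 4, and -1 are the ones quoted earlier in the video, while the function names `to_coded` and `predict` are hypothetical, and the centre-and-half-range conversion from seconds to coded units is an assumed (though common) convention that the video does not spell out.

```python
def to_coded(seconds, low=160.0, high=200.0):
    """Map cooking time in seconds to coded units (assumed convention).

    ASSUMPTION: coded = (value - centre) / half_range, so that
    160 s maps to -1 and 200 s maps to +1.
    """
    center = (low + high) / 2
    half_range = (high - low) / 2
    return (seconds - center) / half_range


def predict(x_a, x_b):
    """Evaluate y = b_0 + b_A*x_A + b_B*x_B + b_AB*x_A*x_B.

    The parameter values are those stated in the video:
    b_0 = 67, b_A = 10, b_B = 4, b_AB = -1.
    Inputs must be in coded units, not real-world units.
    """
    return 67 + 10 * x_a + 4 * x_b - 1 * x_a * x_b


# Evaluate the model at the four corners, in standard order.
for x_a, x_b in [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]:
    print(x_a, x_b, predict(x_a, x_b))
```

Because this model has as many parameters as experiments, the four printed predictions coincide with the four experimental outcomes.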
41 00:02:55,696 --> 00:03:03,606 For the first experiment, for example, we would have y_1 equals b_0 plus b_A times x_{A-}, 42 00:03:03,606 --> 00:03:10,286 plus b_B times x_{B-}, plus b_{AB} times x_{A-} times x_{B-}. 43 00:03:10,466 --> 00:03:13,546 That's because x_A is at the minus level, 44 00:03:13,546 --> 00:03:16,486 and x_B is at the minus level, for the first experiment. 45 00:03:17,676 --> 00:03:22,246 We can repeat this process for the other three points in the cube, as shown here on the screen. 46 00:03:24,076 --> 00:03:30,686 Now let's go substitute in -1, or +1, for the factors A and B, and we will get four equations. 47 00:03:31,306 --> 00:03:34,346 Notice that the 4 equations have 4 unknown parameters: 48 00:03:34,676 --> 00:03:38,196 b_0, b_A, b_B, and b_{AB}. 49 00:03:38,196 --> 00:03:43,536 If you have some mathematical background, you will recall that four equations 50 00:03:43,536 --> 00:03:47,366 with four unknowns represent a set of equations that we can solve. 51 00:03:48,306 --> 00:03:52,856 These equations are linear, and so they're very efficiently solved using matrix methods. 52 00:03:53,846 --> 00:03:54,806 Let me show you how. 53 00:03:55,866 --> 00:03:59,406 In matrix form, the equations are written as shown here on the screen. 54 00:03:59,406 --> 00:04:01,996 Three things quickly become apparent. 55 00:04:02,086 --> 00:04:05,856 Firstly, we notice a column of 1's in the first column. 56 00:04:06,036 --> 00:04:10,026 That corresponds to this parameter: b_0, the intercept. 57 00:04:10,026 --> 00:04:13,826 Next we notice that the second and third columns, in other words, 58 00:04:13,826 --> 00:04:16,226 the columns that correspond to the parameters for A 59 00:04:16,226 --> 00:04:19,346 and B are simply the columns from the standard order table. 60 00:04:20,206 --> 00:04:25,166 And finally the last column corresponds to the two factor interaction for AB.
61 00:04:26,256 --> 00:04:33,226 You'll notice that this is simply the column for A, multiplied by the column for B. This comes 62 00:04:33,226 --> 00:04:37,306 from minus times minus is plus; plus times minus is minus. 63 00:04:37,666 --> 00:04:43,186 Minus times plus is minus; and finally, plus times plus is plus. 64 00:04:43,186 --> 00:04:49,966 This entire set of equations can be written as vector "y" equals matrix "X" times vector "b". 65 00:04:51,166 --> 00:04:57,126 Now those of you with some background in least-squares will realize that the solution 66 00:04:57,126 --> 00:05:03,896 to this set of equations is b = (X^T * X)^{-1} multiplied by (X^T * y). 67 00:05:04,096 --> 00:05:06,486 If you don't have that experience, don't worry. 68 00:05:06,836 --> 00:05:10,676 The computer software will solve these equations very efficiently for us. 69 00:05:10,916 --> 00:05:12,096 That's what computers are good for. 70 00:05:12,096 --> 00:05:17,106 All we require is the "X" matrix and the "y" vector. 71 00:05:17,336 --> 00:05:21,536 And we have these: the "X" matrix is assembled from the standard order table, 72 00:05:21,606 --> 00:05:25,026 and the "y" vector is simply the four experimental outcomes. 73 00:05:25,196 --> 00:05:28,586 The software will calculate these four parameters, in other words, 74 00:05:28,646 --> 00:05:30,466 the four entries in the vector "b". 75 00:05:30,776 --> 00:05:38,366 Those correspond to b_0, the intercept, b_A, b_B, and b_{AB} for the two factor interaction. 76 00:05:38,546 --> 00:05:41,416 So now we are ready to use the computer software. 77 00:05:41,836 --> 00:05:45,546 Please watch the next video to see how those 4 parameters are calculated.
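For readers who want to see the matrix calculation before turning to the software, here is a minimal sketch in plain Python of b = (X^T * X)^{-1} multiplied by (X^T * y) for the popcorn system. It is not the software used in the course. The helper names (`transpose`, `matmul`, `solve`) are hypothetical, and the four outcomes in `y` are an assumption: they are back-calculated from the model stated earlier (67, 10, 4, -1) rather than quoted in this video.

```python
# Coded settings in standard order: (x_A, x_B) for the four experiments.
design = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]

# ASSUMPTION: outcomes back-calculated from the stated model
# y = 67 + 10*x_A + 4*x_B - 1*x_A*x_B; they are not quoted in this video.
y = [52, 74, 62, 80]

# Each row of X is [1, x_A, x_B, x_A * x_B]:
# intercept column, the two standard-order columns, and their product.
X = [[1, a, b, a * b] for a, b in design]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(p, q):
    """Multiply two matrices stored as lists of rows."""
    q_cols = transpose(q)
    return [[sum(u * v for u, v in zip(row, col)) for col in q_cols]
            for row in p]


def solve(a, rhs):
    """Solve a @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(a, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column into place.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


Xt = transpose(X)
XtX = matmul(Xt, X)  # 4 times the identity matrix for this design
Xty = [sum(u * v for u, v in zip(row, y)) for row in Xt]

# Solving (X^T X) b = X^T y is equivalent to b = (X^T X)^{-1} X^T y.
b = solve(XtX, Xty)
print(b)  # entries are b_0, b_A, b_B, b_AB, in that order
```

Under the assumed outcomes this recovers 67, 10, 4, and -1. Solving the linear system directly, rather than forming the inverse explicitly, is a deliberate choice; with NumPy installed, `numpy.linalg.solve` applied to the same two quantities would give the same result.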