1 00:00:03,166 --> 00:00:08,716 This video is all about learning some new notation, and the use of fractional factorials. 2 00:00:09,036 --> 00:00:13,926 At the start of this section, I described how half-fractions work. 3 00:00:14,556 --> 00:00:18,326 The next logical question is: what about quarter fractions, 4 00:00:18,326 --> 00:00:21,746 or one-eighth fractions, or even fewer experiments? 5 00:00:22,696 --> 00:00:27,016 Every time we do less and less work, what other information are we losing out on? 6 00:00:27,216 --> 00:00:33,646 So to answer all of this, we are going to introduce some fun and easy-to-use notation. 7 00:00:34,946 --> 00:00:38,186 The one point I want to make here, right at the start with the hope 8 00:00:38,186 --> 00:00:42,616 that you watch these videos all the way to the end, is that the techniques we're going 9 00:00:42,616 --> 00:00:46,326 to investigate have been very well established for the last 80 years. 10 00:00:47,016 --> 00:00:51,526 But this field is evolving, and some interesting new fractional designs have emerged. 11 00:00:51,526 --> 00:00:55,226 I will give some pointers to them in the last video for this section. 12 00:00:56,086 --> 00:00:58,786 Now people in the forums for this course have already hinted 13 00:00:58,946 --> 00:01:00,996 at the problem we are going to discuss today. 14 00:01:02,446 --> 00:01:06,006 Imagine you're running a system in which you can create bacteria. 15 00:01:06,506 --> 00:01:10,796 These cells are grown to generate valuable nutrients, which are then used 16 00:01:10,796 --> 00:01:14,706 to create drugs, food products, and other items. 17 00:01:15,676 --> 00:01:20,736 These systems can operate for a long period of time, and they're expensive. 
18 00:01:20,816 --> 00:01:26,456 A scientist or engineer who is investigating the system, with five factors, 19 00:01:26,856 --> 00:01:30,776 could take well over a year to collect all the data necessary 20 00:01:30,866 --> 00:01:34,346 to run 32 experiments in a two to the five factorial. 21 00:01:35,136 --> 00:01:39,036 In most situations, we cannot wait this long for results. 22 00:01:39,036 --> 00:01:40,716 Think about your own case study. 23 00:01:41,366 --> 00:01:45,436 You might be working with a system that is expensive or takes a long time. 24 00:01:46,226 --> 00:01:51,896 In the cell culture example, imagine we had three months, that was our budget available, 25 00:01:52,366 --> 00:01:55,516 and that corresponds to about nine experiments. 26 00:01:55,516 --> 00:01:58,826 Now an inexperienced experimenter will go tell the manager 27 00:01:58,826 --> 00:02:03,546 that they can only investigate three factors, because that requires eight experiments 28 00:02:03,666 --> 00:02:05,606 which can be done in the three months. 29 00:02:05,606 --> 00:02:10,776 The experimenter actually does not have to eliminate other factors from consideration. 30 00:02:11,116 --> 00:02:13,866 They can go investigate all five factors. 31 00:02:14,186 --> 00:02:19,756 That's what this trade-off table shows us. It tells us that if we can run 8 experiments, 32 00:02:20,206 --> 00:02:26,316 then we could actually investigate 3 factors in a full factorial, 4 factors in a half fraction, 33 00:02:26,416 --> 00:02:29,846 5 factors in a quarter fraction, and so on. 34 00:02:30,576 --> 00:02:36,396 In fact, we can go all the way as far up as 7 factors in 8 experiments. 35 00:02:36,986 --> 00:02:38,076 That's pretty incredible! 36 00:02:38,536 --> 00:02:44,436 The scientist or engineer, at their next meeting, can in fact ask their colleagues for suggestions 37 00:02:44,436 --> 00:02:48,566 on two extra factors, so they can go from five up to seven. 
38 00:02:48,566 --> 00:02:53,656 Factors that they think might impact the outcome, but they're not quite sure about. 39 00:02:54,176 --> 00:02:58,226 Those are the perfect examples to go move along the row. 40 00:02:58,926 --> 00:03:03,076 Once you have a budget for a certain number of experiments, usually, 41 00:03:03,176 --> 00:03:08,846 try to go as far across to the right so you can include as many factors as possible. 42 00:03:09,626 --> 00:03:13,016 When we do that, we are generating what is called a screening design. 43 00:03:13,806 --> 00:03:16,736 We are screening to see which of the factors are important. 44 00:03:17,406 --> 00:03:20,976 We know some of them will be, we just aren't sure which ones yet. 45 00:03:21,856 --> 00:03:24,916 We don't need a detailed model of their effects at this point, 46 00:03:25,456 --> 00:03:27,736 just to be sure that they're important or not. 47 00:03:28,676 --> 00:03:32,436 I'm going to show you how we can deal with a case of 5 factors in this example, 48 00:03:32,846 --> 00:03:36,236 and you can practice with a case with 7 factors for homework. 49 00:03:36,546 --> 00:03:38,666 And you'll see that a bit in the next video too. 50 00:03:38,796 --> 00:03:41,006 So back to this trade-off table. 51 00:03:41,526 --> 00:03:45,106 Because we have five factors and a budget for 8 experiments, 52 00:03:45,416 --> 00:03:47,776 we know that we are in this entry of the table. 53 00:03:48,606 --> 00:03:51,276 A full factorial would have required 32 experiments. 54 00:03:51,676 --> 00:03:52,696 We're doing 8. 55 00:03:52,966 --> 00:03:55,436 So in fact, this is a quarter fraction. 56 00:03:56,046 --> 00:03:59,276 It is 2 to the power of 5, minus 2. 57 00:03:59,856 --> 00:04:02,976 Notice that all entries in the table have this general format. 58 00:04:03,506 --> 00:04:05,636 2 to the power of "k" minus "p". 59 00:04:06,536 --> 00:04:08,276 The "k" is the number of factors. 
60 00:04:08,796 --> 00:04:11,366 The "p" refers to the reduction in work. 61 00:04:12,316 --> 00:04:15,096 Let's focus on these two other items in the entry. 62 00:04:15,636 --> 00:04:20,076 D equals AB, and E equals AC. 63 00:04:21,036 --> 00:04:26,246 We call these two entries the generators, because they tell us how to create, or generate, 64 00:04:26,356 --> 00:04:28,936 the D and E factors in our experiment. 65 00:04:29,586 --> 00:04:33,826 If you're being observant, you will notice that "p" coincides 66 00:04:33,826 --> 00:04:37,216 with the number of generators. For half fractions, 67 00:04:37,436 --> 00:04:43,386 "p" is always equal to 1 because we halve the work, and half fractions always have one generator. 68 00:04:43,556 --> 00:04:47,556 If you have a quarter fraction, "p" is equal to 2. 69 00:04:48,026 --> 00:04:49,676 And then we have two generators. 70 00:04:50,136 --> 00:04:52,216 That pattern continues in a logical way. 71 00:04:52,316 --> 00:04:57,196 So in our case, we have "p" equals 2, we're doing one quarter 72 00:04:57,196 --> 00:04:59,766 of the work and we have two generators. 73 00:05:00,056 --> 00:05:02,696 Why do we have to generate these factors D and E? 74 00:05:03,296 --> 00:05:06,906 Well, once we've established that our budget is for eight experiments, 75 00:05:07,426 --> 00:05:13,446 we can immediately write three columns for factors A, B and C in a full factorial. 76 00:05:14,176 --> 00:05:16,426 You've done this many times by now in the course. 77 00:05:16,906 --> 00:05:21,016 Here's the table, and notice that it is missing columns D and E, 78 00:05:21,386 --> 00:05:24,556 but we can quickly generate them from these two generators. 79 00:05:25,766 --> 00:05:30,646 Now, notice that A, B, and C each refer to a column of plus and minus signs. 
80 00:05:30,646 --> 00:05:37,026 So, a product of them, such as A times C, which equals E, refers to the element 81 00:05:37,026 --> 00:05:40,366 by element multiplication of the entries in column A 82 00:05:40,366 --> 00:05:44,386 and C to give column E. See how quick that was? 83 00:05:44,926 --> 00:05:48,026 There are my five factors to use in my eight experiments. 84 00:05:48,296 --> 00:05:52,266 All done. So what about that potential 9th experiment? 85 00:05:52,946 --> 00:05:56,176 I often recommend starting with the center point or some sort 86 00:05:56,176 --> 00:05:58,836 of baseline experiment as your first experiment. 87 00:05:59,286 --> 00:06:02,556 Put all the factors at their centers, their zero value. 88 00:06:03,436 --> 00:06:06,436 Now categorical variables don't have a natural zero. 89 00:06:06,476 --> 00:06:11,186 Simply choose an arbitrary low or high value for that categorical variable. 90 00:06:11,916 --> 00:06:16,956 That first experiment is a great way to iron out any problems in your experimental protocol. 91 00:06:17,356 --> 00:06:21,256 There are always unexpected issues to deal with in the very first experiment, 92 00:06:21,416 --> 00:06:24,546 so rather do it on one that you're willing to throw away. 93 00:06:24,946 --> 00:06:27,796 However, if that first experiment does work out, 94 00:06:27,936 --> 00:06:31,796 you get to keep it, and it improves some of your predictive model parameters. 95 00:06:32,366 --> 00:06:34,536 Well, that seemed almost too good to be true. 96 00:06:34,936 --> 00:06:37,796 You've got these eight runs, you go ahead and do them, 97 00:06:38,046 --> 00:06:41,036 record your outcome values and then you're finished. 98 00:06:41,476 --> 00:06:44,646 Surely there's got to be a bit more to these fractional factorials? 99 00:06:45,446 --> 00:06:46,146 And there is. 100 00:06:47,136 --> 00:06:49,786 We're going to have to figure out what the aliases are. 
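The column generation just described can be sketched in a few lines of Python. This is only an illustration, assuming the usual coding of the low level as -1 and the high level as +1; the names `runs` and `design` are made up for this sketch and do not come from the video.

```python
from itertools import product

# Full factorial in A, B and C: 8 runs, coded -1 (low) and +1 (high).
# The tuple is unpacked as (c, b, a) so that A changes fastest (standard order).
runs = [(a, b, c) for c, b, a in product((-1, 1), repeat=3)]

design = []
for a, b, c in runs:
    d = a * b  # generator D = AB: element-by-element product of columns A and B
    e = a * c  # generator E = AC: element-by-element product of columns A and C
    design.append((a, b, c, d, e))

# Print the 8 x 5 table of plus and minus signs.
print("A B C D E")
for row in design:
    print(" ".join("+" if level > 0 else "-" for level in row))
```

Each row is one experiment, and the five entries in the row are the settings for factors A to E in that experiment.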
101 00:06:51,176 --> 00:06:56,436 Remember when we were looking at half fractions, and we introduced the word "alias"? 102 00:06:56,436 --> 00:07:00,986 If that's not something you're comfortable with yet, please review the prior videos again 103 00:07:00,986 --> 00:07:03,556 and make sure you understand what that is. 104 00:07:04,166 --> 00:07:07,206 Aliases are easy to figure out for half fractions by hand. 105 00:07:07,706 --> 00:07:11,396 But they're particularly messy for heavily fractionated designs. 106 00:07:12,476 --> 00:07:15,396 We need a system to help figure out what these aliases are, 107 00:07:16,416 --> 00:07:19,056 and I'm going to take a few minutes to show you that process. 108 00:07:19,136 --> 00:07:21,586 So we've already seen generators. 109 00:07:22,176 --> 00:07:24,626 These are the expressions we read from the trade-off table. 110 00:07:25,546 --> 00:07:29,896 Now, if we take a generator, we can multiply both the left and right side 111 00:07:29,896 --> 00:07:33,356 by the same single symbol that appears on the left hand side. 112 00:07:33,356 --> 00:07:38,646 So here, for example, we multiply by D on both sides. 113 00:07:38,646 --> 00:07:42,396 So we get D*D = ABD. 114 00:07:42,396 --> 00:07:47,416 This creates a desirable simplification that I'll quickly demonstrate. 115 00:07:48,046 --> 00:07:49,526 Now we introduce a quick rule. 116 00:07:50,016 --> 00:07:55,516 Any time you see two of the same letters side by side, you can instantly eliminate them 117 00:07:55,516 --> 00:07:57,646 and replace them with a letter, capital I. 118 00:07:58,516 --> 00:08:01,186 This I corresponds to the intercept. 119 00:08:01,356 --> 00:08:04,906 Or another way of seeing it is the identity, the number "1". 120 00:08:06,236 --> 00:08:11,316 The reason is that two columns that are the same, multiplied by each other, 121 00:08:11,586 --> 00:08:13,936 will always result in a column of ones. 
122 00:08:14,416 --> 00:08:19,576 If that column contains a minus entry, multiply by itself, you get a plus. 123 00:08:19,636 --> 00:08:23,876 If the column contained a plus entry, multiply by itself, it is still a plus. 124 00:08:24,036 --> 00:08:26,356 So two columns that are the same, 125 00:08:26,696 --> 00:08:31,546 multiplied together always equals I, and can in fact be eliminated. 126 00:08:32,746 --> 00:08:37,146 Okay, so we have taken our generator and slightly transformed it 127 00:08:37,146 --> 00:08:42,416 so that it has an identity I on the left and the rest of the generator on the right. 128 00:08:42,416 --> 00:08:44,636 That's right. 129 00:08:45,066 --> 00:08:50,416 You should have found a generator of EE, which equals I, which equals ACE. 130 00:08:50,416 --> 00:08:53,816 It is just another way of expressing that generator. 131 00:08:53,816 --> 00:09:02,606 So to summarize our progress, we have two of these generators, I equals ABD and I equals ACE, 132 00:09:02,606 --> 00:09:05,206 and we learned how to read them from the table. 133 00:09:05,206 --> 00:09:08,026 I quickly want to introduce another term. 134 00:09:08,796 --> 00:09:12,196 Any collection of sequential letters is called a "word". 135 00:09:12,636 --> 00:09:17,446 ABD is a word, ACE is a word, even I is a word. 136 00:09:19,006 --> 00:09:23,516 The last piece of terminology we need is what is known as the defining relationship. 137 00:09:24,206 --> 00:09:28,516 The defining relationship is a sequence of words that are equal to each other. 138 00:09:29,346 --> 00:09:35,686 The defining relationship always has a length of 2 to the power of "p" words, and the first word 139 00:09:35,686 --> 00:09:38,706 in the relationship is always the identity, I. 140 00:09:38,856 --> 00:09:43,036 So, remember in our example, we have "p" equals 2. 141 00:09:43,096 --> 00:09:48,676 So, we should have 2 to the 2, in other words, four words in our defining relationship. 
142 00:09:49,116 --> 00:09:52,046 And the first one is I, so where are the other three? 143 00:09:53,026 --> 00:09:58,566 We find them by taking all possible combinations of the rearranged generators. 144 00:09:58,566 --> 00:10:03,456 The simplest combination is to take the words on their own. 145 00:10:03,586 --> 00:10:11,566 So my second word is ABD, and my third word in the defining relationship is ACE. 146 00:10:11,606 --> 00:10:15,366 The next combination is to combine two of the words together. 147 00:10:16,246 --> 00:10:23,196 Well, since we only have two words, we can do this by combining ABD with ACE. 148 00:10:23,196 --> 00:10:26,116 Now let's rearrange that and regroup our letters, AABCDE. 149 00:10:26,116 --> 00:10:33,896 Next we can use the rule that two identical letters side by side are equal to the identity. 150 00:10:33,896 --> 00:10:37,616 So that, in fact, becomes IBCDE. 151 00:10:38,736 --> 00:10:41,716 Now remember that I is just a column of plus 1 entries. 152 00:10:41,716 --> 00:10:45,286 So it's kind of redundant when it's multiplied with other letters. 153 00:10:45,286 --> 00:10:49,666 So we can drop that and simplify it to BCDE. 154 00:10:49,766 --> 00:10:54,186 So here's my complete defining relationship with 4 words. 155 00:10:54,886 --> 00:11:00,406 I = ABD = ACE = BCDE. 156 00:11:00,956 --> 00:11:05,526 That simple set of words holds the key to figuring out the aliases. 157 00:11:05,936 --> 00:11:08,856 We're going to see exactly how to do that in the next video. 158 00:11:09,416 --> 00:11:14,156 But, before we wrap up, are you brave enough to try this yourself on a different example? 159 00:11:14,896 --> 00:11:16,196 Practice makes perfect! 160 00:11:16,196 --> 00:11:22,416 So consider a system with six factors and running 16 experiments. 161 00:11:22,416 --> 00:11:27,466 Your task is to write out the complete defining relationship for that case. 
162 00:11:27,466 --> 00:11:31,966 First, write out the rearranged generators from the trade-off table. 163 00:11:32,596 --> 00:11:34,426 Then take all the combinations. 164 00:11:35,366 --> 00:11:38,966 Are you sure you have the correct number of words in your defining relationship? 165 00:11:40,936 --> 00:11:44,446 Prepare to pause the video, the solution is going to be shown shortly.
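If you want to check your hand calculation afterwards, the word arithmetic from this video can be sketched in Python. This is a minimal sketch, not the course's own software: the function names `multiply` and `defining_relationship` are made up for illustration, and the example below reuses the five-factor generators from this video rather than the six-factor exercise.

```python
from itertools import combinations


def multiply(*words):
    """Multiply words together; a letter that appears twice cancels to I."""
    letters = set()
    for word in words:
        for letter in word:
            letters ^= {letter}  # toggling removes a letter seen a second time
    return "".join(sorted(letters)) or "I"


def defining_relationship(generators):
    """Return I, each rearranged generator, and every product of generators."""
    words = ["I"]
    for size in range(1, len(generators) + 1):
        for combo in combinations(generators, size):
            words.append(multiply(*combo))
    return words


# Rearranged generators from this video: I = ABD and I = ACE.
relation = defining_relationship(["ABD", "ACE"])
print(" = ".join(relation))
```

With p generators, the function takes the generators one at a time, then two at a time, and so on, which is why the result has 2 to the power of "p" words once the leading I is counted.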