1
00:00:03,320 --> 00:00:06,710
In the prior video,
I left you halfway up the mountain.

2
00:00:06,710 --> 00:00:11,940
I had asked you to take that ninth step,
that ninth experiment on your own.

3
00:00:11,940 --> 00:00:14,510
Were you able to find
the location of that next run?

4
00:00:15,810 --> 00:00:18,800
As we proceed,
we will cover two diversions.

5
00:00:18,900 --> 00:00:22,240
We will look at what happens if you
have constraints in your experiments.

10
00:00:22,240 --> 00:00:27,400
By that, I mean, what happens if you want
to take a step and realize that because of

11
00:00:27,400 --> 00:00:32,000
safety issues, or for other reasons,
you can't quite go as far as you'd hoped.

12
00:00:32,410 --> 00:00:34,180
We'll also look at mistakes.

13
00:00:34,280 --> 00:00:39,640
What if you, or your colleagues, run
an experiment but use the wrong settings?

14
00:00:39,640 --> 00:00:42,679
We'll show that you can
easily recover from that.

29
00:00:43,680 --> 00:00:44,850
And in the prior video,

30
00:00:44,850 --> 00:00:50,760
I ended by asking you to take a step size
with delta x_P equal to one and a half.

31
00:00:51,820 --> 00:00:58,338
If you did that, you would've found
the associated delta x_T equal to 0.718.

32
00:00:58,338 --> 00:01:03,350
Now, let's convert these delta
lower case x's to their upper case,

33
00:01:03,350 --> 00:01:07,520
real world changes, using the formulas
we introduced in the prior video.

34
00:01:08,840 --> 00:01:13,200
For throughput,
this lower case "delta x_T" corresponds to

35
00:01:13,200 --> 00:01:17,980
an increase of 2.87 parts per hour,
which we round to 3 parts.

36
00:01:19,070 --> 00:01:23,387
For price, it's a 27 cent increase that
we would add to the baseline value.

37
00:01:24,910 --> 00:01:29,460
Now we can go tell our employees or
colleagues that the 9th experiment is at

38
00:01:29,460 --> 00:01:34,850
337 parts per hour, with a price of $1.45.

39
00:01:34,850 --> 00:01:38,420
Remember, our colleagues
don't speak in coded units.

40
00:01:38,420 --> 00:01:41,600
We have to talk with them in actual units,
even though we

41
00:01:41,600 --> 00:01:45,530
speak in coded units behind their backs,
when we deal with the least squares model.

42
00:01:46,750 --> 00:01:48,670
Now we should always go
predict the outcome of

43
00:01:48,670 --> 00:01:50,300
the experiment before running it.

44
00:01:51,320 --> 00:01:55,272
In coded units x_P for
the ninth experiment is at 1.5,

45
00:01:55,272 --> 00:01:56,870
because we selected that.

46
00:01:57,890 --> 00:02:02,690
You might presume that the x_T value
is the 0.718 that you calculated, but

47
00:02:02,690 --> 00:02:06,250
not quite, because remember,
we rounded that value.

48
00:02:06,250 --> 00:02:08,940
So we should go recalculate what x_T is for

49
00:02:08,940 --> 00:02:14,270
experiment nine, using the usual formulas that
connect real world units to coded units.

50
00:02:15,500 --> 00:02:20,140
So that value of x_T is equal to 0.75.
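
A minimal Python sketch of the arithmetic for the ninth experiment (the course itself uses R). The baseline of 334 parts per hour and $1.18, and the half-ranges of 4 parts per hour and $0.18, are not stated in this video; they are assumptions inferred from the numbers quoted above (2.87 / 0.718 and 0.27 / 1.5). The function names are hypothetical.

```python
# Assumed values, inferred from the numbers quoted in the video.
T_CENTER, T_HALF_RANGE = 334.0, 4.0   # parts per hour
P_CENTER, P_HALF_RANGE = 1.18, 0.18   # dollars


def to_real_change(delta_coded, half_range):
    """Convert a coded step (delta x) to a real-world change."""
    return delta_coded * half_range


def to_coded(value, center, half_range):
    """Convert a real-world setting back to coded units."""
    return (value - center) / half_range


delta_T = to_real_change(0.718, T_HALF_RANGE)   # about 2.87 parts per hour
delta_P = to_real_change(1.5, P_HALF_RANGE)     # about 0.27 dollars

T9 = T_CENTER + round(delta_T)        # throughput is rounded to whole parts
P9 = round(P_CENTER + delta_P, 2)

# Recompute the coded throughput from the rounded real-world setting.
x_T9 = to_coded(T9, T_CENTER, T_HALF_RANGE)

print(T9, P9, x_T9)
```

Under these assumptions the sketch reproduces the 337 parts per hour, $1.45 and x_T of 0.75 quoted in the video. The coded value changes from 0.718 to 0.75 only because the throughput was rounded to a whole number of parts.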

51
00:02:20,140 --> 00:02:22,400
When we go use the model,

52
00:02:22,400 --> 00:02:28,184
the prediction with these coded values
gives us a profit prediction of $731.

53
00:02:29,370 --> 00:02:31,561
Now if you go to the website and

54
00:02:31,561 --> 00:02:36,478
run the actual experiment,
you might get a value close to $717.

55
00:02:36,478 --> 00:02:39,120
Our prediction was off by about $13 or $14.

56
00:02:39,120 --> 00:02:46,090
You should have been able to do all of the
above after watching the prior video.

57
00:02:46,090 --> 00:02:48,467
If not, go back to the prior video and

58
00:02:48,467 --> 00:02:53,089
review those calculations, where
they were shown in some detail.

59
00:02:53,089 --> 00:02:57,112
Now how bad is that
prediction error of $13?

60
00:02:57,112 --> 00:03:01,570
One way to tell is by comparing it to
the level of noise in the system.

61
00:03:02,640 --> 00:03:03,920
And to calculate the noise,

62
00:03:03,920 --> 00:03:08,380
we need some replicated experiments,
which we haven't gone and done.

63
00:03:08,380 --> 00:03:12,770
But if we had the time and budget,
we could certainly do that and verify.

64
00:03:14,190 --> 00:03:18,780
But a rough way to judge that error
without a noise estimate is by comparing it

65
00:03:18,780 --> 00:03:22,400
to the coefficients of the main
effects in the model.

66
00:03:22,400 --> 00:03:25,390
And it is about half the size
of the smallest main effect.

67
00:03:26,460 --> 00:03:28,460
So that prediction error is not too bad.

68
00:03:29,840 --> 00:03:32,640
Now since the model's
predictions are still adequate,

69
00:03:32,640 --> 00:03:35,719
we can keep going up this
direction of steepest ascent.

70
00:03:36,810 --> 00:03:38,080
This is new.

71
00:03:38,080 --> 00:03:43,710
In the prior factorial, we had to stop and
rebuild after using it for a single step.

72
00:03:43,710 --> 00:03:47,560
But this time our predictions
are still okay, so we keep going.

73
00:03:47,560 --> 00:03:50,960
This is the general principle
of response surface methods:

74
00:03:50,960 --> 00:03:55,429
keep going up that path as long as the
predictions are consistent with reality.

75
00:03:56,920 --> 00:04:01,105
Now we can try a second step, delta x_P=2.5,
away from the baseline.

76
00:04:02,530 --> 00:04:04,010
Pause the video and

77
00:04:04,010 --> 00:04:08,330
try to calculate these quantities for
this new 10th experiment yourself.

78
00:04:09,860 --> 00:04:12,510
You'll soon become an expert
at these calculations, but

79
00:04:12,510 --> 00:04:15,500
it will take you several minutes at first.

80
00:04:15,500 --> 00:04:17,070
Once you're done with your work,

81
00:04:17,070 --> 00:04:21,270
go compare your prediction to the actual
experiment using the website.

82
00:04:22,400 --> 00:04:27,790
So these are the values that you should
have obtained, "delta x_T" = 1.2,

83
00:04:27,790 --> 00:04:32,600
"delta T" in real world units is
a change of 4.8 parts per hour, and

84
00:04:32,600 --> 00:04:34,316
we'll round that up to 5.

85
00:04:34,316 --> 00:04:39,882
"Delta P" is 0.45 or $0.45.

86
00:04:39,882 --> 00:04:47,220
T for the tenth experiment corresponds
to 339 parts an hour; and P is $1.63.

87
00:04:47,220 --> 00:04:52,880
x_T in coded units is 1.25.

88
00:04:52,880 --> 00:04:56,930
Just a little bit different from
the x_T=1.2 that we had calculated earlier,

89
00:04:56,930 --> 00:04:58,540
due to rounding.

90
00:04:58,540 --> 00:05:01,770
And x_P=2.5.
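
A hedged Python sketch of how the tenth experiment follows from the same path. It assumes the direction ratio from the ninth step (0.718 per 1.5), and the same inferred baseline (334 parts per hour, $1.18) and half-ranges (4 parts per hour, $0.18); none of these are stated directly in this video, and `experiment_on_path` is a hypothetical helper name.

```python
# Assumed values, inferred from the numbers quoted in the video.
T_CENTER, T_HALF_RANGE = 334.0, 4.0   # parts per hour
P_CENTER, P_HALF_RANGE = 1.18, 0.18   # dollars

# Coded change in T per unit coded change in P along the path,
# taken from the ninth step (delta x_T = 0.718 for delta x_P = 1.5).
RATIO_T_PER_P = 0.718 / 1.5


def experiment_on_path(delta_x_P):
    """Return (delta_x_T, T, P, x_T) for a step of delta_x_P from the baseline."""
    delta_x_T = RATIO_T_PER_P * delta_x_P
    T = T_CENTER + round(delta_x_T * T_HALF_RANGE)   # whole parts per hour
    P = round(P_CENTER + delta_x_P * P_HALF_RANGE, 2)
    x_T = (T - T_CENTER) / T_HALF_RANGE              # coded value after rounding
    return delta_x_T, T, P, x_T


delta_x_T10, T10, P10, x_T10 = experiment_on_path(2.5)
print(round(delta_x_T10, 1), T10, P10, x_T10)
```

With a step of 2.5 this gives a delta x_T of about 1.2, a throughput of 339 parts per hour, a price of $1.63 and a recoded x_T of 1.25, matching the values in the video.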

91
00:05:01,770 --> 00:05:06,293
Using those coded values,
we can predict a y-value for

92
00:05:06,293 --> 00:05:10,022
the 10th experiment of $784.77.

93
00:05:10,022 --> 00:05:11,767
Well, $785.

94
00:05:11,767 --> 00:05:15,490
Now, the actual experimental
outcome is around $732.

95
00:05:15,490 --> 00:05:19,730
You won't get that exact figure
from the website because we

96
00:05:19,730 --> 00:05:22,870
add some noise to the prediction
just to make things realistic.

97
00:05:24,150 --> 00:05:27,330
That's about a $50 deviation though, and

98
00:05:27,330 --> 00:05:31,650
it's comparable to the main effect
of the largest factor, the price.

99
00:05:31,650 --> 00:05:34,480
So it's probably time
we rebuild this model.

100
00:05:34,480 --> 00:05:37,110
And the 10th experiment
can form our baseline.

101
00:05:38,390 --> 00:05:41,481
Notice that when we do this,
we reset our (0,0)

102
00:05:41,481 --> 00:05:46,020
center point to this new
location in real world units.

103
00:05:46,020 --> 00:05:49,270
We do not use the previous
factorial's coded units.

104
00:05:49,270 --> 00:05:54,000
We start fresh and build a new local model
to approximate the surface in this region.

105
00:05:55,060 --> 00:05:58,180
What range should we use for
the new factorial?

106
00:05:58,180 --> 00:06:00,340
I'm going to use a slightly
smaller range for

107
00:06:00,340 --> 00:06:05,120
the throughput, T,
of 6 parts per hour, for two reasons.

108
00:06:05,120 --> 00:06:09,550
First, we're coming close to our
upper bound of 350 parts an hour.

109
00:06:10,610 --> 00:06:13,070
In case there's an optimum
near this bound,

110
00:06:13,070 --> 00:06:16,630
as we will see in the next video,
we should have a bit of room

111
00:06:16,630 --> 00:06:20,600
to move outside the factorial
bounds to fit a non-linear model.

112
00:06:21,760 --> 00:06:24,470
Secondly, we might suspect
we're levelling off.

113
00:06:25,510 --> 00:06:29,880
And the way I can see this is by looking
at the spread in the profit values in

114
00:06:29,880 --> 00:06:30,950
the first factorial.

115
00:06:32,300 --> 00:06:34,710
See how far apart they are over there?

116
00:06:34,710 --> 00:06:37,940
And here in this second factorial,
they're closer together.

117
00:06:39,090 --> 00:06:42,430
That reduction indicates there
might be a levelling off, and

118
00:06:42,430 --> 00:06:46,110
we don't want to overshoot the optimum
by taking too large a step.

119
00:06:47,250 --> 00:06:50,700
For price P, I'm going to take
the same range as before.

120
00:06:51,880 --> 00:06:54,850
We are still far away from
the extreme upper bound.

121
00:06:54,850 --> 00:06:56,630
But if you'd like to use
a different range for

122
00:06:56,630 --> 00:07:01,990
the price, go ahead and
try using perhaps $0.20, for example.

123
00:07:01,990 --> 00:07:06,090
You'll see that your direction to the
optimum is not very different to the one

124
00:07:06,090 --> 00:07:09,580
I am going to take with the $0.36 range.

125
00:07:09,580 --> 00:07:12,050
So let me have a small digression here.

126
00:07:12,050 --> 00:07:14,850
You might be wondering if your
choice of range will have

127
00:07:14,850 --> 00:07:17,840
a significant impact on
the path of steepest ascent.

128
00:07:19,180 --> 00:07:23,940
You'll notice that the direction of steepest
ascent depends on the range of

129
00:07:23,940 --> 00:07:25,660
the factors chosen.

130
00:07:25,660 --> 00:07:27,800
If you were doing the experiments,

131
00:07:27,800 --> 00:07:31,318
it's quite likely you will pick a
different range to the one that I'll pick.

132
00:07:32,650 --> 00:07:36,930
Fortunately, it has been shown
in various statistical textbooks

133
00:07:36,930 --> 00:07:41,150
that these different range choices
selected by different experimenters

134
00:07:41,150 --> 00:07:46,270
will lead to a different path up
the mountain but not radically different.

135
00:07:46,270 --> 00:07:51,210
There is this idea of a "confidence
interval of paths", so to speak.

136
00:07:51,210 --> 00:07:55,140
So the bottom line is this, don't be
too concerned about the range choice

137
00:07:55,140 --> 00:07:58,930
as long as it is reasonable and
leaves you room to the left and

138
00:07:58,930 --> 00:08:02,340
right of your extreme bounds to
approach that mountain peak.

139
00:08:03,540 --> 00:08:05,510
Now, back to the factorial.

140
00:08:05,510 --> 00:08:12,290
Here are experiments 11, 12, 13 and
14 and their corresponding profit values.

141
00:08:12,290 --> 00:08:15,880
Remember, we run them in random order,
but I report them here in standard order.

142
00:08:17,310 --> 00:08:21,500
If you write and run the R code,
you can show that you get the following linear

143
00:08:21,500 --> 00:08:26,220
model from the five experiments, including
the baseline point at position ten.

144
00:08:28,540 --> 00:08:32,190
Pause the video and
fit the model from the data points.

145
00:08:32,190 --> 00:08:34,730
What interesting feature do you
notice in the contour plot?

146
00:08:36,540 --> 00:08:39,820
You would've observed some
curvature in the contours.

147
00:08:39,820 --> 00:08:44,110
This is an indication that something
has changed in the surface.

148
00:08:44,110 --> 00:08:46,630
Now you can happily skip
onto the next video and

149
00:08:46,630 --> 00:08:48,619
see how to continue this analysis.

150
00:08:49,720 --> 00:08:52,620
But to end this video,
I am going to divert and

151
00:08:52,620 --> 00:08:56,060
talk a little bit about
experimental mistakes.

152
00:08:56,060 --> 00:08:59,740
I am also going to show what happens
when you run into constraints.

153
00:08:59,740 --> 00:09:02,870
But feel free to come back
to this topic later on

154
00:09:02,870 --> 00:09:07,519
if you want to jump ahead and
see how the case study continues.

155
00:09:09,440 --> 00:09:11,640
So, to talk about mistakes,

156
00:09:11,640 --> 00:09:14,020
I will use run number 9 over here and

157
00:09:14,020 --> 00:09:16,410
show how we could have used
it a bit more effectively.

158
00:09:17,870 --> 00:09:21,330
Notice that run 9 and
run 11 are close to each other.

159
00:09:22,350 --> 00:09:28,620
If I were planning this third factorial
here in runs 11, 12, 13 and 14,

160
00:09:28,620 --> 00:09:32,890
and if my experiments were really
expensive, I would want to know if I

161
00:09:32,890 --> 00:09:36,635
could use experiment 9 and
avoid running experiment 11.

162
00:09:37,890 --> 00:09:40,310
And the answer is yes, you definitely can.

163
00:09:41,560 --> 00:09:46,210
We use the concept of a "botched design",
which is just an English word for

164
00:09:46,210 --> 00:09:47,180
"mistaken design".

165
00:09:48,380 --> 00:09:51,779
Mistakes happen all the time in
experiments in two main ways.

166
00:09:52,890 --> 00:09:57,440
Firstly, imagine your employee wanted
to actually run experiment 11, but

167
00:09:57,440 --> 00:10:01,622
made a mistake with the settings and ran
the experiment at position 9 by accident.

168
00:10:03,200 --> 00:10:06,940
Another way this could have happened:
imagine that you were running

169
00:10:06,940 --> 00:10:13,080
experiments in random order, and you
ran experiment 12, then 13, then 14.

170
00:10:13,080 --> 00:10:14,360
And then you want to come and

171
00:10:14,360 --> 00:10:19,530
run experiment 11 when you suddenly
realize that condition would be unsafe, or

172
00:10:19,530 --> 00:10:22,860
lead to totally different,
very unexpected operation.

173
00:10:23,970 --> 00:10:27,210
Someone in the course forums
asked exactly that question.

174
00:10:27,210 --> 00:10:30,630
You might think that you'd have
to shrink experiment 12 over to

175
00:10:30,630 --> 00:10:36,670
this location to line up with experiment
9 and get back to a regular factorial.

176
00:10:36,670 --> 00:10:38,600
But it is not necessary.

177
00:10:38,600 --> 00:10:43,752
The important insight is that you can get
an adequate model with these four points,

178
00:10:43,752 --> 00:10:47,653
even if they're not in perfect
alignment with the -1 and

179
00:10:47,653 --> 00:10:51,721
+1 positions they would
normally occupy on the cube plot.

180
00:10:51,721 --> 00:10:55,377
But if one or
more of the experiments are shifted,

181
00:10:55,377 --> 00:10:58,780
you must use the correct coded value for
them.

182
00:10:58,780 --> 00:11:01,021
For point 9, for example,

183
00:11:01,021 --> 00:11:07,170
the correct coded value is -0.67
from this equation; not -1.

184
00:11:08,880 --> 00:11:14,910
So in our R code, instead of -1,
+1, -1, +1 for the factor,

185
00:11:14,910 --> 00:11:20,340
T, we use the mistaken values of
-2/3, +1, -1, +1.

186
00:11:21,680 --> 00:11:24,669
And we enter the outcome value
we got at the mistaken point.
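
A minimal Python sketch of fitting a linear model when one point is off its -1 position (the course itself uses R). The coded value for run 9 uses the new centre of 339 parts per hour and the half-range of 3 from the 6 parts per hour range. The responses are synthetic, generated from a made-up plane (700 + 10 x_T + 20 x_P) purely to check that the fit recovers it; they are not the profit values from the course. `coded` and `fit_plane` are hypothetical helper names.

```python
def coded(value, center, half_range):
    """Convert a real-world setting to coded units."""
    return (value - center) / half_range


def fit_plane(x_T, x_P, y):
    """Least squares fit of y = b0 + bT*x_T + bP*x_P via the normal equations."""
    rows = [[1.0, t, p] for t, p in zip(x_T, x_P)]
    n = 3
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(n):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][n] for i in range(n)]


# Run 9 (337 parts per hour) stands in for run 11, so its coded T is -2/3, not -1.
x_T9 = coded(337.0, 339.0, 3.0)
x_T = [x_T9, 1.0, -1.0, 1.0]
x_P = [-1.0, -1.0, 1.0, 1.0]

# Synthetic responses from a made-up plane (NOT the course data).
y = [700.0 + 10.0 * t + 20.0 * p for t, p in zip(x_T, x_P)]

b0, bT, bP = fit_plane(x_T, x_P, y)
print(round(x_T9, 2), round(b0, 3), round(bT, 3), round(bP, 3))
```

The point of the sketch is only that the four points still determine the intercept and both slopes as long as each one is entered at its true coded position.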

187
00:11:27,220 --> 00:11:30,770
Now, models with mistaken experiments, because
they're not at these -1 or

188
00:11:30,770 --> 00:11:36,540
+1 positions, are generally
calculated by a computer, and not by hand.

189
00:11:36,540 --> 00:11:40,520
When you rebuild the model, you get
the following prediction equation and

190
00:11:40,520 --> 00:11:41,370
contour plots.

191
00:11:42,740 --> 00:11:45,520
Let me contrast that to
the situation over here on

192
00:11:45,520 --> 00:11:50,930
the right where I had used experiments
at position 11, 12, 13 and 14.

193
00:11:50,930 --> 00:11:57,790
And you can see that that model is not
very different to the model with a mistake.

194
00:11:57,790 --> 00:12:00,530
Now, you do lose some of
the useful properties we get when

195
00:12:00,530 --> 00:12:05,360
the design was run at the correct values.
But if this small change means

196
00:12:05,360 --> 00:12:11,030
saving lots of money to avoid redoing an
experiment, it's really worth the price.

197
00:12:11,030 --> 00:12:14,230
Notice that you definitely did
not have to shrink in your range.

198
00:12:15,510 --> 00:12:18,890
So we have discussed what to do with
constraints that relate to mistakes or

199
00:12:18,890 --> 00:12:20,380
"botched designs".

200
00:12:20,380 --> 00:12:23,390
What about constraints that
are imposed by the system,

201
00:12:23,390 --> 00:12:25,590
constraints that you know
about ahead of time?

202
00:12:26,620 --> 00:12:29,270
It is common for
systems to have such constraints that

203
00:12:29,270 --> 00:12:31,760
prevent operation outside
a certain region.

204
00:12:32,850 --> 00:12:36,590
I'm talking not just about constraints
that align with extreme vertical or

205
00:12:36,590 --> 00:12:40,830
horizontal edges of the factors,
outside which you cannot operate.

206
00:12:40,830 --> 00:12:44,850
But rather, I'm referring to
constraints that cut entire regions out

207
00:12:44,850 --> 00:12:45,719
of consideration.

208
00:12:46,830 --> 00:12:52,100
For example, a constraint that runs
along this direction, shown in red, and

209
00:12:52,100 --> 00:12:55,090
anything beyond it, we cannot go and
run over there.

210
00:12:56,640 --> 00:13:01,000
What if the path of steepest ascent was
showing a promising direction along

211
00:13:01,000 --> 00:13:02,880
here, in green?

212
00:13:02,880 --> 00:13:06,860
Well, then we modify our path
to obey that constraint.

213
00:13:06,860 --> 00:13:12,460
But we have to always obey safety
requirements in our process.

214
00:13:12,460 --> 00:13:14,310
They are of primary importance.

215
00:13:14,310 --> 00:13:18,360
And we have to find our optimum
within those restrictions.

216
00:13:18,360 --> 00:13:22,850
And I'll end by saying it is not
uncommon to find your optimum

217
00:13:22,850 --> 00:13:25,050
right at the boundary of a constraint.

218
00:13:25,050 --> 00:13:27,990
We see that in engineering
systems frequently, and

219
00:13:27,990 --> 00:13:30,310
it's likely to occur in
other systems as well.

220
00:13:31,860 --> 00:13:33,843
So that's the end of this video.

221
00:13:33,843 --> 00:13:39,949
In the next video, we go back to
the factorial with this baseline at point 10.

222
00:13:39,949 --> 00:13:42,580
Where would you run your next experiment?

223
00:13:42,580 --> 00:13:47,310
Use the tool on the website and try a few
runs yourself before watching video 6C.