1
00:00:02,750 --> 00:00:06,570
In this section,
we start looking outside our cube plot.

2
00:00:06,570 --> 00:00:11,110
What happens when we leave that range from
minus one to plus one that we've been so

3
00:00:11,110 --> 00:00:11,690
focused on?

4
00:00:13,070 --> 00:00:17,420
We're going to add a new tool to our
toolkit that we use to analyze the data.

5
00:00:17,420 --> 00:00:19,709
The concept called
Response Surface Methods (RSM).

6
00:00:20,820 --> 00:00:25,630
Now, in the next video, we will consider
in depth the case of a single factor.

7
00:00:25,630 --> 00:00:30,080
Most practical systems, though, have two
or more factors that affect the outcome.

8
00:00:30,080 --> 00:00:31,510
But if you understand the idea for

9
00:00:31,510 --> 00:00:34,920
one factor, then the subsequent
videos will make more sense.

10
00:00:36,110 --> 00:00:39,200
I'll explain what Response Surface
Methods are in this video and

11
00:00:39,200 --> 00:00:41,070
why you would want to use them.

12
00:00:41,070 --> 00:00:43,711
And in the remainder of the videos,
we'll see them in action.

13
00:00:45,832 --> 00:00:49,752
When I use data to improve a process or
a system, in my experience,

14
00:00:49,752 --> 00:00:54,240
I find that I'm inevitably trying to
achieve one of these five objectives.

15
00:00:55,320 --> 00:00:58,390
Trying to learn more or
increase my knowledge of the system.

16
00:00:58,390 --> 00:01:00,820
Maybe I'm troubleshooting the process.

17
00:01:00,820 --> 00:01:04,170
Or perhaps, I'm using the data
to make some form of prediction.

18
00:01:04,170 --> 00:01:07,170
Or maybe I'm trying to optimize
the system in some way.

19
00:01:07,170 --> 00:01:11,790
Or finally, I might just be monitoring
the process based on the data to make sure

20
00:01:11,790 --> 00:01:15,570
that I'm retaining all those performance
gains I've made in the past.

21
00:01:15,570 --> 00:01:19,960
Those of you taking the course and working
in a company, you will find that any

22
00:01:19,960 --> 00:01:24,630
project or task you do likely falls
into one of these five categories.

23
00:01:24,630 --> 00:01:28,040
Think back about the past few
projects you've been working on.

24
00:01:28,040 --> 00:01:31,830
The biggest problem I often
encounter is that people don't have

25
00:01:31,830 --> 00:01:34,130
their objectives clearly in mind.

26
00:01:34,130 --> 00:01:38,050
Once you've figured out your objective,
picking the simplest approach, and

27
00:01:38,050 --> 00:01:41,740
using the appropriate tools to solve
that problem becomes apparent.

28
00:01:42,740 --> 00:01:44,600
In the prior four modules of this course,

29
00:01:44,600 --> 00:01:48,390
we have focused really only on the first
three objectives listed there.

30
00:01:48,390 --> 00:01:50,700
We've hinted a little
bit at that fourth one,

31
00:01:50,700 --> 00:01:53,640
trying to optimize
the process in some way.

32
00:01:53,640 --> 00:01:56,790
For that first objective,
we've seen how we can learn

33
00:01:56,790 --> 00:02:00,650
which factors are important and
eliminate those which are not.

34
00:02:00,650 --> 00:02:03,690
This improves our overall
understanding of the system.

35
00:02:03,690 --> 00:02:05,130
To quote George Box:

36
00:02:05,130 --> 00:02:09,750
"discovering the unexpected is more
important than confirming the known".

37
00:02:09,750 --> 00:02:13,720
Really think about your experimental
results and interpret them every time.

38
00:02:14,890 --> 00:02:19,440
The concepts learnt in this course can
also be used to troubleshoot a problem.

39
00:02:19,440 --> 00:02:23,720
If your boss comes to you with a problem,
you can brainstorm a list of five, six, or

40
00:02:23,720 --> 00:02:26,740
more factors that
are potentially the cause.

41
00:02:26,740 --> 00:02:29,820
Use fractional factorial
ideas from module four, and

42
00:02:29,820 --> 00:02:34,550
you can quickly identify which factors
are actually related to the issue.

43
00:02:34,550 --> 00:02:37,970
And right since video 2A,
we've been making predictions based on

44
00:02:37,970 --> 00:02:41,110
our experimental results, so
you're very comfortable with that idea.

45
00:02:42,360 --> 00:02:46,340
In this section,
we're going to be optimizing our process.

46
00:02:46,340 --> 00:02:50,170
Let's go back to a familiar process,
making popcorn.

47
00:02:50,170 --> 00:02:54,550
And it was perfect timing, that there
was a great forum posting about that.

48
00:02:54,550 --> 00:02:57,240
It seems many of you love this snack.

49
00:02:57,240 --> 00:03:00,060
Let's say you were simply
investigating two factors.

50
00:03:00,060 --> 00:03:04,370
Cooking time as factor A, and
the type of oil as factor B.

51
00:03:04,370 --> 00:03:09,280
And I'm going to use the number of
unburned popcorn as the outcome variable.

52
00:03:09,280 --> 00:03:11,230
You'll see why I chose this.

53
00:03:11,230 --> 00:03:17,580
Unburned popcorn are those that have
popped but not burned, the white popcorn.

54
00:03:17,580 --> 00:03:22,560
We want to maximize this outcome variable,
that's the objective of my experiments.

55
00:03:22,560 --> 00:03:24,050
And here are the results on a cube plot.

56
00:03:25,340 --> 00:03:28,790
You're experts at this now, so
you can quickly see that factor B,

57
00:03:28,790 --> 00:03:32,580
the type of oil,
has almost no effect on the outcome.

58
00:03:32,580 --> 00:03:34,860
Notice that the first
objective was used here.

59
00:03:35,860 --> 00:03:40,340
We have learned in our system that
the type of oil over this range of

60
00:03:40,340 --> 00:03:44,060
cooking times seems to have
little impact on the outcome.

61
00:03:44,060 --> 00:03:46,560
We've learned something
new about our process.

62
00:03:46,560 --> 00:03:49,910
It doesn't mean that oil
type is totally irrelevant.

63
00:03:49,910 --> 00:03:54,050
It simply says that over the range
of A that we've used here,

64
00:03:54,050 --> 00:03:55,940
oil type seems to have little effect.

65
00:03:56,960 --> 00:03:59,680
Visually, this means we can
collapse our square down to

66
00:03:59,680 --> 00:04:01,280
a single line as shown here.

67
00:04:02,360 --> 00:04:07,110
Let's go apply objective three now and
build a predictive model for the system.

68
00:04:07,110 --> 00:04:10,500
Y = 90 + 15 x_A

69
00:04:10,500 --> 00:04:12,860
Note that we don't have
to include factor B or

70
00:04:12,860 --> 00:04:17,880
the AB interaction in our model because
we've determined that B is not useful.

71
00:04:17,880 --> 00:04:18,950
Here is the R code.

72
00:04:19,960 --> 00:04:23,740
And you will get the exact same
result with any statistical software.
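
The R code shown on screen is not part of this transcript. As a rough stand-in, here is a minimal Python sketch of the same kind of fit. The four corner values are hypothetical (they are not the numbers from the cube plot in the video) and were picked only so that the fit reproduces the model quoted above, with a small effect for factor B and no AB interaction. The function name `fit_two_level` is also just an illustrative choice.

```python
# Hypothetical corner values for the 2x2 popcorn experiment (NOT from the video).
# They are chosen only so that the fit reproduces y = 90 + 15 x_A.
runs = [
    # (x_A, x_B, y): coded cooking time, coded oil type, unburned popcorn count
    (-1, -1, 74),
    (+1, -1, 104),
    (-1, +1, 76),
    (+1, +1, 106),
]


def fit_two_level(runs):
    """Least-squares coefficients for y = b0 + bA*xA + bB*xB + bAB*xA*xB.

    For an orthogonal two-level design in coded units, each coefficient is
    the average of (column value * response) over all runs.
    """
    n = len(runs)
    b0 = sum(y for _, _, y in runs) / n
    b_a = sum(a * y for a, _, y in runs) / n
    b_b = sum(b * y for _, b, y in runs) / n
    b_ab = sum(a * b * y for a, b, y in runs) / n
    return {"intercept": b0, "A": b_a, "B": b_b, "AB": b_ab}


coefficients = fit_two_level(runs)
print(coefficients)
# {'intercept': 90.0, 'A': 15.0, 'B': 1.0, 'AB': 0.0}
```

With these assumed values, the B and AB coefficients are small next to the A coefficient, which is the pattern that justifies dropping them from the model.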

73
00:04:23,740 --> 00:04:29,420
Just a brief recap on the interpretation
of the 15 x_A term in the model.

74
00:04:29,420 --> 00:04:33,700
That says, when we increase
the cooking time from -1 to 0, or

75
00:04:33,700 --> 00:04:39,110
from 0 to +1 in coded units,
in other words, a one unit increase, then

76
00:04:39,110 --> 00:04:45,260
the number of popped but unburned popcorn
increases on average by a value of 15.
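
To make that interpretation concrete, here is a small Python sketch that evaluates the model from this video at the three coded levels and checks the change per one-unit step. The function name `predict_unburned` is only an illustrative label.

```python
def predict_unburned(x_a):
    """Model from the video: y = 90 + 15 * x_A, with x_A in coded units."""
    return 90 + 15 * x_a


levels = [-1, 0, 1]
predictions = [predict_unburned(x) for x in levels]

# Change in the prediction for each one-unit step in coded cooking time
steps = [later - earlier for earlier, later in zip(predictions, predictions[1:])]

print(predictions)  # [75, 90, 105]
print(steps)        # [15, 15]
```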

77
00:04:45,260 --> 00:04:49,720
Now response surface methods, or
response surface optimization,

78
00:04:49,720 --> 00:04:54,340
use the idea that this model can
tell us where to move to next.

79
00:04:54,340 --> 00:04:57,650
We're going to build on our
existing experiments over here

80
00:04:57,650 --> 00:05:00,340
to figure out what happens over there.

81
00:05:00,340 --> 00:05:04,560
We've figured out already that factor
B does not play an important role in

82
00:05:04,560 --> 00:05:05,580
this system.

83
00:05:05,580 --> 00:05:09,010
So response surface methods are used
after you've already completed your

84
00:05:09,010 --> 00:05:10,880
screening experiments.

85
00:05:10,880 --> 00:05:12,370
That's an important point.

86
00:05:12,370 --> 00:05:15,990
Don't include factors in
the optimization that have little or

87
00:05:15,990 --> 00:05:17,800
no effect on the outcome.

88
00:05:17,800 --> 00:05:21,250
Then once we build a model based
on the important factors,

89
00:05:21,250 --> 00:05:25,270
we can now go use it to tell
us where to move to next.

90
00:05:25,270 --> 00:05:28,600
We can see here that we should
be moving towards the right,

91
00:05:28,600 --> 00:05:29,950
to increase our objective.

92
00:05:31,040 --> 00:05:34,320
Now we can never expect
the model to tell us exactly or

93
00:05:34,320 --> 00:05:39,170
perfectly what will happen over there on
the right as we move towards that region.

94
00:05:39,170 --> 00:05:44,000
There is no way that this simple model
summarizes all the laws of physics,

95
00:05:44,000 --> 00:05:45,280
heat transfer, and

96
00:05:45,280 --> 00:05:49,950
the complex chemical reactions taking
place when popcorn is popping.

97
00:05:49,950 --> 00:05:54,530
This simple model, also referred to
by the name of an "empirical model",

98
00:05:54,530 --> 00:05:59,940
is a great approximation, and provides
good guidance on where to move to next.
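
As a sketch of how the model gives that guidance, the Python below reads the direction to move from the sign of the slope and lists the model's suggestion at two candidate settings outside the original range. The step size of one coded unit and the choice of two candidate points are assumptions for illustration only. The printed values are extrapolations, so they would have to be checked with new experiments.

```python
INTERCEPT = 90  # from the model in the video
SLOPE_A = 15    # coefficient of x_A in the model in the video


def predict_unburned(x_a):
    """Empirical model y = 90 + 15 * x_A, with x_A in coded units."""
    return INTERCEPT + SLOPE_A * x_a


# To maximize the outcome, step in the direction given by the slope's sign.
direction = 1 if SLOPE_A > 0 else -1

step_size = 1.0  # hypothetical step, in coded units, between new experiments
current = 1.0    # right-hand edge of the original -1 to +1 range
candidates = [current + direction * step_size * k for k in (1, 2)]

for x_a in candidates:
    # Extrapolated guidance on where to run next, not a guaranteed outcome
    print(f"x_A = {x_a:+.1f}: model suggests about {predict_unburned(x_a):.0f}")
```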

99
00:05:59,940 --> 00:06:03,660
That is what response surface
methods (RSM) are about, in a nutshell.

100
00:06:03,660 --> 00:06:08,010
Efficient sequential experiments
to reach an optimum, using only

101
00:06:08,010 --> 00:06:11,710
the important factors after you've
done a preliminary screening design.