1
[Link],000 --> [Link],900
Instructor: Hello, my friends.
2
[Link],900 --> [Link],540
Welcome to this new practical activity
3
[Link],540 --> [Link],700
on polynomial regression.
4
[Link],700 --> [Link],610
This time we're gonna learn how to build together
5
[Link],610 --> [Link],770
a non-linear regression model
6
[Link],770 --> [Link],440
which will allow us to tackle a problem
7
[Link],440 --> [Link],180
with a non-linear data set,
8
[Link],180 --> [Link],450
meaning a data set with non-linear relationships
9
[Link],450 --> [Link],840
on which therefore a multiple linear regression model
10
[Link],840 --> [Link],670
would not be relevant.
11
[Link],670 --> [Link],970
Now we're all gonna go into Part 2, Regression,
12
[Link],970 --> [Link],220
and this time we're gonna go to section six,
13
[Link],220 --> [Link],400
polynomial regression, to learn how to build indeed
14
[Link],400 --> [Link],220
this non-linear regression model.
15
[Link],220 --> [Link],120
All right, and as usual,
16
[Link],120 --> [Link],130
we're gonna start with Python,
17
[Link],130 --> [Link],710
inside which you will find two files,
18
[Link],710 --> [Link],660
polynomial_regression.ipynb,
19
[Link],660 --> [Link],600
which is, of course, your Python implementation,
20
[Link],600 --> [Link],640
which you can open either in Google Colab
21
[Link],640 --> [Link],930
or Jupyter Notebook,
22
[Link],930 --> [Link],280
and the data set called Position_Salaries.
23
[Link],280 --> [Link],210
All right, so as usual,
24
[Link],210 --> [Link],820
we're gonna start by describing the data set.
25
[Link],820 --> [Link],230
Once again, I'd like to remind you
26
[Link],230 --> [Link],960
that this is a simple data set, but no worries.
27
[Link],960 --> [Link],700
The further we progress in this course,
28
[Link],700 --> [Link],950
the more we will work with real-world
29
[Link],950 --> [Link],240
and complex data sets.
30
[Link],240 --> [Link],470
You will see at the end,
31
[Link],470 --> [Link],710
we will work with data sets with many more observations
32
[Link],710 --> [Link],420
and more complexities.
33
[Link],420 --> [Link],210
So what is this data set about?
34
[Link],210 --> [Link],150
Well, let's imagine the following scenario.
35
[Link],150 --> [Link],990
Let's imagine that we are actually an HR department
36
[Link],990 --> [Link],910
and that we want to hire someone
37
[Link],910 --> [Link],610
and we actually found someone that seems to be
38
[Link],610 --> [Link],050
a great fit for the job,
39
[Link],050 --> [Link],450
so we would like to offer this person
40
[Link],450 --> [Link],370
a position in our company.
41
[Link],370 --> [Link],780
And so this person says yes,
42
[Link],780 --> [Link],360
but at the end of the interview process
43
[Link],360 --> [Link],520
comes the inevitable question,
44
[Link],520 --> [Link],370
what is your salary expectation?
45
[Link],370 --> [Link],260
And let's say that this person is, you know,
46
[Link],260 --> [Link],270
very well advanced in his career,
47
[Link],270 --> [Link],270
and therefore that person is asking for $160,000 per year.
48
[Link],530 --> [Link],470
And then also as HR negotiators,
49
[Link],470 --> [Link],737
we ask this person,
50
[Link],737 --> [Link],430
"Why are you expecting such a high salary?"
51
[Link],430 --> [Link],997
And this person replies,
52
[Link],997 --> [Link],390
"Well, that's because that's what I earned
53
[Link],390 --> [Link],770
in my previous company.
54
[Link],770 --> [Link],840
That was my salary in my previous company.
55
[Link],840 --> [Link],380
I earned $160,000 per year.
56
[Link],380 --> [Link],190
So I'm expecting at least $160,000
57
[Link],190 --> [Link],920
per year in your company."
58
[Link],920 --> [Link],380
Is that the truth or is that a bluff?
59
[Link],380 --> [Link],960
Well, that's exactly what we're gonna figure out
60
[Link],960 --> [Link],900
thanks to our polynomial regression model.
61
[Link],900 --> [Link],500
We're going to build a polynomial regression model
62
[Link],500 --> [Link],370
to predict the previous salary of this candidate.
63
[Link],370 --> [Link],260
So how are we going to do this?
64
[Link],260 --> [Link],093
Well, of course,
65
[Link],093 --> [Link],750
in order to make such a prediction,
66
[Link],750 --> [Link],710
we need data,
67
[Link],710 --> [Link],830
and that's exactly the data we collected here.
68
[Link],830 --> [Link],200
So what is this data and how did we collect it?
69
[Link],200 --> [Link],440
This data is actually the different salaries
70
[Link],440 --> [Link],170
of the previous company for the different positions,
71
[Link],170 --> [Link],780
from business analyst to CEO.
72
[Link],780 --> [Link],640
And now how did we collect such data?
73
[Link],640 --> [Link],710
Well, you know, there are many websites online
74
[Link],710 --> [Link],620
which actually display the different salaries
75
[Link],620 --> [Link],170
of the different positions in companies.
76
[Link],170 --> [Link],750
I can give you an example, like Glassdoor.
77
[Link],750 --> [Link],280
Well, let's say that we did this
78
[Link],280 --> [Link],220
and that's how we collected all this data
79
[Link],220 --> [Link],740
containing all the salaries for the different positions
80
[Link],740 --> [Link],490
of this previous company for which this person worked.
81
[Link],490 --> [Link],350
Okay, so we have this data,
82
[Link],350 --> [Link],110
and now we need to know obviously which position
83
[Link],110 --> [Link],140
this person had within this previous company.
84
[Link],140 --> [Link],580
Well, that's easy.
85
[Link],580 --> [Link],650
Let's say we went to LinkedIn
86
[Link],650 --> [Link],890
and we checked out the profile of this person
87
[Link],890 --> [Link],810
and we actually saw that this person
88
[Link],810 --> [Link],440
was actually a region manager, okay?
89
[Link],440 --> [Link],350
However, on LinkedIn, we also see something else.
90
[Link],350 --> [Link],990
It turns out that this person actually has been
91
[Link],990 --> [Link],270
a region manager for quite a while,
92
[Link],270 --> [Link],980
like, let's say, two years.
93
[Link],980 --> [Link],420
And therefore, you know,
94
[Link],420 --> [Link],620
the salary of this person should not exactly be $150,000,
95
[Link],620 --> [Link],480
as we can see on this data set.
96
[Link],480 --> [Link],590
But instead it should be somewhere between $150,000,
97
[Link],590 --> [Link],450
the salary of position number six,
98
[Link],450 --> [Link],500
and $200,000, the salary of position number seven.
99
[Link],500 --> [Link],090
So in order to interpolate,
100
[Link],090 --> [Link],480
we're gonna suppose that this person has a position
101
[Link],480 --> [Link],190
in between six and seven,
102
[Link],190 --> [Link],070
and we'll consider this position to be 6.5,
103
[Link],070 --> [Link],610
so that then we can actually deploy our model, you know,
104
[Link],610 --> [Link],230
after training it, of course,
105
[Link],230 --> [Link],050
on the position level 6.5
106
[Link],050 --> [Link],240
so that we can get the predicted salary
107
[Link],240 --> [Link],620
of such a position level.
108
[Link],620 --> [Link],650
And we will compare this predicted salary
109
[Link],650 --> [Link],020
to the salary expected by this person
110
[Link],020 --> [Link],810
to see if indeed there is truth or bluff.
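As a rough sanity check before any model is trained, the two salaries quoted above (level 6 at $150,000 and level 7 at $200,000) can be joined by a straight line. This is not the polynomial regression model, only a linear baseline for level 6.5, sketched here with NumPy. The variable names are illustrative assumptions.

```python
import numpy as np

# The only two data points quoted in the transcript: levels 6 and 7.
levels = np.array([6.0, 7.0])
salaries = np.array([150_000.0, 200_000.0])

# Straight-line estimate for the in-between level 6.5 (a baseline, not the course model).
baseline = float(np.interp(6.5, levels, salaries))

# Salary the candidate claims to have earned.
claimed = 160_000.0

print(baseline)                                # 175000.0
print(salaries[0] <= claimed <= salaries[1])   # True
```

The claimed $160,000 falls inside the range for levels 6 to 7, so the baseline alone cannot settle truth or bluff; the trained model's prediction at 6.5 is what gets compared to the claim.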
111
[Link],810 --> [Link],640
All right, are you ready?
112
[Link],640 --> [Link],600
Let's do this.
113
[Link],600 --> [Link],183
Let's build our polynomial regression model.
2
[Link],953 --> [Link],040
All right, let's make that prediction with the SVR model.
3
[Link],040 --> [Link],460
So we're gonna create a new code cell here, and here we go.
4
[Link],460 --> [Link],580
We're gonna start, of course, from our regressor
5
[Link],580 --> [Link],180
from which we're gonna call the predict method
6
[Link],180 --> [Link],680
that will take as input...
7
[Link],680 --> [Link],480
And now I'm asking you what?
8
[Link],480 --> [Link],610
What will it exactly take as input?
9
[Link],610 --> [Link],760
Well, it's not directly the 6.5 level
10
[Link],760 --> [Link],870
but the scaled value of the 6.5 level, right?
11
[Link],870 --> [Link],340
Because our SVR model was trained on the scaled values
12
[Link],340 --> [Link],320
of the training set, and therefore, in the predict method,
13
[Link],320 --> [Link],290
we must enter the scaled value
14
[Link],290 --> [Link],690
of the input that we want to predict.
15
[Link],690 --> [Link],520
And therefore, here we must call our SC_X scaler object,
16
[Link],520 --> [Link],840
from which we're gonna call the transform method.
17
[Link],840 --> [Link],090
There we go, and then in this transform method
18
[Link],090 --> [Link],860
of our scaler object, well that's where
19
[Link],860 --> [Link],500
we can enter our position level of 6.5.
20
[Link],500 --> [Link],330
But remember, we have to enter it
21
[Link],330 --> [Link],850
in a double pair of square brackets
22
[Link],850 --> [Link],310
because the predict method expects any input
23
[Link],310 --> [Link],230
as a 2D array, all right?
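A minimal sketch of this scaling step, assuming scikit-learn's `StandardScaler` and assuming the position levels run from 1 to 10 (the transcript only confirms levels 6 and 7). The name `sc_X` follows the transcript; `scaled_level` is an illustrative name.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Assumed feature matrix: position levels 1..10 as a 2D column (10 rows, 1 feature).
X = np.arange(1, 11, dtype=float).reshape(-1, 1)

# Fit the input scaler on the training levels.
sc_X = StandardScaler()
sc_X.fit(X)

# The double square brackets make 6.5 a 2D array: one row, one feature.
scaled_level = sc_X.transform([[6.5]])

print(scaled_level.shape)  # (1, 1)
```

Passing a bare `6.5` or a single-bracket `[6.5]` would be rejected, because the scaler expects one row per sample and one column per feature.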
24
[Link],230 --> [Link],180
So, let's enter 6.5.
25
[Link],180 --> [Link],860
And now, that's not all.
26
[Link],860 --> [Link],810
There are two extra things we need to do.
27
[Link],810 --> [Link],130
Remember that we not only scaled the input in X,
28
[Link],130 --> [Link],740
but we also scaled the output.
29
[Link],740 --> [Link],320
Remember, I'll go back to feature scaling here.
30
[Link],320 --> [Link],140
We made that scaler object for the input X,
31
[Link],140 --> [Link],500
but also that scaler object SC_Y for the output Y.
32
[Link],500 --> [Link],320
And indeed we used that SC_Y scaler object here
33
[Link],320 --> [Link],940
to scale the output Y.
34
[Link],940 --> [Link],790
And therefore, since the output Y was scaled,
35
[Link],790 --> [Link],010
well, in order to get the prediction
36
[Link],010 --> [Link],310
in the original scale, meaning the original salaries,
37
[Link],310 --> [Link],670
well, we must apply a reverse scaling
38
[Link],670 --> [Link],800
to that whole prediction, right?
39
[Link],800 --> [Link],350
And the method that will do exactly this
40
[Link],350 --> [Link],410
is a method called inverse transform.
41
[Link],410 --> [Link],500
And this will exactly reverse the scaling
42
[Link],500 --> [Link],540
that we applied to the output Y.
43
[Link],540 --> [Link],500
All right, so let's do this.
44
[Link],500 --> [Link],000
Let's call this method. First,
45
[Link],000 --> [Link],930
we have to call it from our SC_Y object
46
[Link],930 --> [Link],600
because we want to reverse the scaling of the output Y,
47
[Link],600 --> [Link],250
so SC_Y, from which we call this inverse underscore,
48
[Link],250 --> [Link],680
there it is, inverse transform method.
49
[Link],680 --> [Link],260
And then we will put this whole prediction
50
[Link],260 --> [Link],170
in the parenthesis of this inverse transform method
51
[Link],170 --> [Link],430
of the SC_Y object.
52
[Link],430 --> [Link],360
So, here we go.
53
[Link],360 --> [Link],260
We add a parenthesis here and we close it right here.
54
[Link],260 --> [Link],640
All right, almost ready,
55
[Link],640 --> [Link],320
we're almost ready to get that prediction.
56
[Link],320 --> [Link],510
But there is one last thing we need to do
57
[Link],510 --> [Link],580
and you don't need to worry too much about this.
58
[Link],580 --> [Link],800
This is just for the SVR model.
59
[Link],800 --> [Link],630
You won't have to apply too many
60
[Link],630 --> [Link],540
reshapes in all the other models of this course.
61
[Link],540 --> [Link],680
But to avoid a format error, we must just add
62
[Link],680 --> [Link],770
inside the parenthesis of the inverse transform method
63
[Link],770 --> [Link],010
another reshape, which is ".reshape,"
64
[Link],010 --> [Link],090
and then in parenthesis, you enter minus one and one.
65
[Link],090 --> [Link],550
And this way we'll all avoid a format error
66
[Link],550 --> [Link],160
and we'll all be able to get the prediction.
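The whole chain described above (scale the input, predict, reshape, reverse the output scaling) can be sketched as follows. This assumes scikit-learn's `StandardScaler` and `SVR`; the RBF kernel and the salary figures other than levels 6 and 7 are placeholders assumed for illustration, not values stated in the transcript.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Assumed data: levels 1..10; only the salaries at levels 6 and 7 come from the transcript.
X = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = np.array([45_000, 50_000, 60_000, 80_000, 110_000,
              150_000, 200_000, 300_000, 500_000, 1_000_000],
             dtype=float).reshape(-1, 1)

# Separate scalers for the input and the output, as in the transcript.
sc_X = StandardScaler()
sc_y = StandardScaler()
X_scaled = sc_X.fit_transform(X)
y_scaled = sc_y.fit_transform(y)

# Train the SVR on scaled values (kernel choice is an assumption here).
regressor = SVR(kernel="rbf")
regressor.fit(X_scaled, y_scaled.ravel())

# predict returns a 1D array of scaled salaries, shape (1,).
scaled_pred = regressor.predict(sc_X.transform([[6.5]]))

# inverse_transform needs a 2D array, hence reshape(-1, 1).
salary = sc_y.inverse_transform(scaled_pred.reshape(-1, 1))

print(salary)  # predicted salary in the original scale, shape (1, 1)
```

The reshape is needed because `predict` hands back a flat array, while the scaler was fitted on a single-column 2D array and expects the same layout when reversing the scaling.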
67
[Link],160 --> [Link],540
All right, so that's it.
68
[Link],540 --> [Link],800
We are ready to get the prediction,
69
[Link],800 --> [Link],930
but first, remember that in the previous tutorial,
70
[Link],930 --> [Link],460
we didn't run the cell.
71
[Link],460 --> [Link],640
So, let's do it now to train the SVR model.
72
[Link],640 --> [Link],460
Here it is and so now if you're ready,
73
[Link],460 --> [Link],530
let's get the predicted salary
74
[Link],530 --> [Link],920
of the position level 6.5 by the SVR model.
75
[Link],920 --> [Link],753
Here we go.
76
[Link],753 --> [Link],723
Let's run the cell and we get $170,370.
77
[Link],330 --> [Link],010
All right. It looks pretty good.
78
[Link],010 --> [Link],390
It seems to make pretty good sense,
79
[Link],390 --> [Link],730
but we will double-check that
80
[Link],730 --> [Link],750
in the next tutorial by visualizing the SVR results.
81
[Link],750 --> [Link],860
Try to do it before me.
82
[Link],860 --> [Link],770
You will again have to play with the SC_X transform
83
[Link],770 --> [Link],080
and SC_Y inverse transform, but you can do it.
84
[Link],080 --> [Link],360
You can of course start from the code
85
[Link],360 --> [Link],280
at the end of the polynomial regression notebook.
86
[Link],280 --> [Link],960
At least that's where we'll start from.
87
[Link],960 --> [Link],280
So, I look forward to this.
88
[Link],280 --> [Link],473
And until then, enjoy machine learning.