Given a dataset D={(X1,Y1),…,(XN,YN)} such that Xi and Yi are continuous, the goal of "Linear Regression" is to find the line that best fits this data.
In other words, we want to create the model:
$$Y = a_0 + a_1 X_1 + a_2 X_2 + \dots + a_p X_p$$
where p is the number of dimensions of the variable X.
In this article we will see how to solve this problem in three scenarios:
When X is one dimensional i.e. p=1.
When X is multi-dimensional i.e. p>1.
Using gradient descent.
X is one dimensional (Ordinary Least Squares)
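The model that we want to create has the form:
$$Y = a_0 + a_1 X$$
Remember that the goal of linear regression is to find the line that best fits the data. In other words, we need to minimize the distance between the data points and the line, measured here as the sum of the squared vertical distances.
Let's put:
$$L = \sum_{i=1}^{N} \left(Y_i - (a_0 + a_1 X_i)\right)^2$$
In order to find the minimum, we need to solve the following equations:
$$\frac{\partial L}{\partial a_0} = -2 \sum_{i=1}^{N} \left(Y_i - a_0 - a_1 X_i\right) = 0$$
$$\frac{\partial L}{\partial a_1} = -2 \sum_{i=1}^{N} X_i \left(Y_i - a_0 - a_1 X_i\right) = 0$$
Solving this system gives:
$$a_1 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N} (X_i - \bar{X})^2}, \qquad a_0 = \bar{Y} - a_1 \bar{X}$$
where $\bar{X}$ and $\bar{Y}$ are the sample means of the Xi and the Yi.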
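As an illustration, here is a minimal NumPy sketch of the one-dimensional closed-form fit: the slope is the covariance of X and Y divided by the variance of X, and the intercept follows from the means. The function name `fit_simple_ols` and the sample data are made up for this example.

```python
import numpy as np

def fit_simple_ols(x, y):
    """Return (a0, a1) minimizing sum((y - (a0 + a1 * x)) ** 2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: covariance of x and y divided by the variance of x.
    a1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    a0 = y_mean - a1 * x_mean
    return a0, a1

# Hypothetical data lying exactly on the line y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x
a0, a1 = fit_simple_ols(x, y)
print(a0, a1)
```

Because the sample points lie exactly on a line, the recovered intercept and slope should match the values used to generate them, up to floating-point error.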
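X is multi-dimensional (Ordinary Least Squares)
In this case, Xi is no longer a real number; instead, it is a vector of size p:
$$X_i = (X_{i1}, X_{i2}, \dots, X_{ip})$$
So, the model is written as follows (an intercept can be included by adding a constant component equal to 1 to every Xi):
$$Y_i = w_1 X_{i1} + w_2 X_{i2} + \dots + w_p X_{ip}$$
or, it can be written in matrix form:
$$Y = XW$$
where: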
Y is of shape (N,1).
X is of shape (N,p).
W is of shape (p,1): this is the parameters vector (w1,w2,…,wp).
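Similarly to the first case, we aim to minimize the following quantity:
$$\lVert Y - XW \rVert^2$$
Again, let's put:
$$L = (Y - XW)^T (Y - XW) = Y^T Y - 2 W^T X^T Y + W^T X^T X W$$
Since we want to minimize L with respect to W, we can ignore the first term $Y^T Y$ because it is independent of W, and solve the following equation:
$$\frac{\partial L}{\partial W} = -2 X^T Y + 2 X^T X W = 0$$
which gives, when $X^T X$ is invertible:
$$W = (X^T X)^{-1} X^T Y$$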
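Setting the gradient of L with respect to W to zero leads to the linear system (XᵀX) W = XᵀY, often called the normal equations. The sketch below solves that system with NumPy. It assumes XᵀX is invertible; the function name `fit_normal_equation` and the sample data are made up for this example.

```python
import numpy as np

def fit_normal_equation(X, Y):
    """Solve (X^T X) W = X^T Y for W, with X of shape (N, p) and Y of shape (N, 1)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Solving the linear system avoids forming the explicit inverse of X^T X.
    return np.linalg.solve(X.T @ X, X.T @ Y)

# Hypothetical data generated from known parameters, with no noise.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [2.0, 1.0]])
W_true = np.array([[3.0], [-2.0]])
Y = X @ W_true

W = fit_normal_equation(X, Y)
print(W)
```

When XᵀX is singular or badly conditioned, a least-squares routine such as `np.linalg.lstsq(X, Y, rcond=None)` is usually a safer choice than solving the normal equations directly.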
Using gradient descent
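Here is the formulation of the gradient descent algorithm: starting from an initial guess, repeat the following update until convergence:
$$\theta \leftarrow \theta - \alpha \frac{\partial L}{\partial \theta}$$
where θ is a parameter of the model, L is the quantity to minimize, and α is the learning rate.
Now all we have to do is to apply it to the two parameters a0 and a1 (in the case of a one-variable X), with L the sum of squared errors $\sum_{i=1}^{N} (Y_i - (a_0 + a_1 X_i))^2$:
$$a_0 \leftarrow a_0 - \alpha \frac{\partial L}{\partial a_0} = a_0 + 2\alpha \sum_{i=1}^{N} \left(Y_i - a_0 - a_1 X_i\right)$$
$$a_1 \leftarrow a_1 - \alpha \frac{\partial L}{\partial a_1} = a_1 + 2\alpha \sum_{i=1}^{N} X_i \left(Y_i - a_0 - a_1 X_i\right)$$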
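To make the update rule concrete, here is a minimal sketch of gradient descent on a0 and a1 for the sum of squared errors. The learning rate, the iteration count, the zero initialization, and the sample data are assumptions chosen for this small example; other data may need a different learning rate to converge.

```python
import numpy as np

def gradient_descent(x, y, alpha=0.01, n_iters=5000):
    """Minimize sum((y - (a0 + a1 * x)) ** 2) over a0 and a1 by gradient descent."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a0, a1 = 0.0, 0.0  # assumed starting point
    for _ in range(n_iters):
        residual = y - (a0 + a1 * x)
        # Partial derivatives of the sum of squared errors.
        grad_a0 = -2.0 * residual.sum()
        grad_a1 = -2.0 * (x * residual).sum()
        # Step against the gradient, scaled by the learning rate.
        a0 -= alpha * grad_a0
        a1 -= alpha * grad_a1
    return a0, a1

# Hypothetical data lying exactly on the line y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x
a0, a1 = gradient_descent(x, y)
print(a0, a1)
```

On this data the iterates should approach the same intercept and slope as the closed-form solution. If the learning rate is too large, the updates diverge instead of converging.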