Bringing the linear regression equation into matrix form

The purpose of this article is to support novice data scientists. In the previous article, we worked through three ways of solving the linear regression equation: the analytical solution, gradient descent, and stochastic gradient descent. For the analytical solution we applied the formula $w = (X^T X)^{-1} X^T y$. In this article, as the title suggests, we will justify the use of this formula or, in other words, derive it ourselves.

Why does it make sense to pay special attention to the formula $w = (X^T X)^{-1} X^T y$?

In most cases, acquaintance with linear regression begins with exactly this matrix equation. At the same time, detailed derivations showing how the formula is obtained are rare.

For example, in Yandex's machine learning courses, when students are introduced to regularization, they are offered to use functions from the sklearn library, while not a word is said about the matrix representation of the algorithm. It is at this point that some listeners may want to understand this issue in more detail: to write code without using ready-made functions. And for this, we must first present the equation with a regularizer in matrix form. This article will help those who wish to master such skills. Let's get started.

Initial conditions

Targets

We have a range of target values. A target could be, for example, the price of an asset: oil, gold, wheat, the dollar, etc. By a range of target values we mean a number of observations. Such observations could be, for example, monthly oil prices for a year, that is, 12 target values. Let's start introducing notation: we denote each value of the target by $y_i$. In total we have $n$ observations, so we can write our observations as $y_1, y_2, \dots, y_n$.

Regressors

We will assume that there are factors that to a certain extent explain the values of the target indicator. For example, the exchange rate of the dollar/ruble pair is strongly influenced by the price of oil, the Fed rate, etc. Such factors are called regressors. Each value of the target indicator should correspond to a value of each regressor, that is, if we have 12 target values for each month of 2018, then we should also have 12 regressor values for the same period. Denote the values of the regressors by $x_{ij}$, where $i$ is the number of the observation and $j$ is the number of the regressor. Let us have $m$ regressors (i.e. $m$ factors that influence the target values). So our regressors can be represented as follows: for the 1st regressor (for example, the price of oil): $x_{11}, x_{21}, \dots, x_{n1}$; for the 2nd regressor (for example, the Fed rate): $x_{12}, x_{22}, \dots, x_{n2}$; for the $j$-th regressor: $x_{1j}, x_{2j}, \dots, x_{nj}$.

Dependence of target indicators on regressors

Let's assume that the dependence of the target $y_i$ on the regressors of the $i$-th observation can be expressed through a linear regression equation of the form:

$$ f(w, x_i) = w_0 + w_1 x_{i1} + w_2 x_{i2} + \dots + w_m x_{im} $$

Where $x_{ij}$ is the value of the $j$-th regressor in the $i$-th observation, with $i$ running from 1 to $n$,

$j$ is the number of the regressor, from 1 to $m$,

$w_0, w_1, \dots, w_m$ are the coefficients: $w_0$ is the intercept, and each slope coefficient $w_j$ represents the amount by which the calculated target will change, on average, when the $j$-th regressor changes by one unit.

In other words, for each coefficient except $w_0$ we have "its own" regressor; we multiply the coefficients by the values of the regressors of the $i$-th observation and, as a result, obtain a certain approximation of the $i$-th target value.

Therefore, we need to choose coefficients $w_0, w_1, \dots, w_m$ for which the values of our approximating function $f(w, x_i)$ are located as close as possible to the target values.

Estimation of the quality of the approximating function

We will estimate the quality of the approximating function by the least squares method. The quality evaluation function in this case takes the following form:

$$ Err(w) = \sum_{i=1}^{n} \big( f(w, x_i) - y_i \big)^2 $$

We need to choose the values of the coefficients $w$ for which the value of $Err(w)$ is the smallest.
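To make the estimation concrete, here is a minimal sketch of this error function in Python with NumPy, for a single regressor plus an intercept; the data points are made up purely for illustration:

```python
import numpy as np

# A minimal sketch of the least-squares error for a model with one
# regressor plus an intercept; the data below are hypothetical.
def err(w, x, y):
    """Sum of squared residuals for f(w, x) = w0 + w1 * x."""
    y_hat = w[0] + w[1] * x
    return np.sum((y_hat - y) ** 2)

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])      # the target is exactly y = 2x

print(err(np.array([0.0, 2.0]), x, y))  # perfect fit: 0.0
print(err(np.array([0.0, 1.0]), x, y))  # worse fit: 1 + 4 + 9 + 16 = 30.0
```

The smaller the error, the closer the approximating function sits to the targets; minimizing this sum is exactly what the rest of the article does analytically.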

We translate the equation into a matrix form

Vector representation

To begin with, to make life easier, pay attention to the linear regression equation and notice that the first coefficient $w_0$ is not multiplied by any regressor. When we convert the data into matrix form, this circumstance would seriously complicate the calculations. Therefore, we introduce another regressor $x_0$ for the first coefficient $w_0$ and equate it to one; more precisely, we equate each $i$-th value of this regressor to one: $x_{i0} = 1$. When multiplying by one, nothing changes from the point of view of the result of the calculations, while from the point of view of the rules for matrix products our torment will be significantly reduced.
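In code, adding the dummy regressor $x_0 = 1$ amounts to stacking a column of ones onto the matrix of regressor values. A small sketch (the numbers are hypothetical):

```python
import numpy as np

# Adding the dummy regressor x0 = 1 so the intercept w0 fits into the
# matrix product; the regressor values here are made up.
X_raw = np.array([[2.5, 0.1],
                  [3.0, 0.2],
                  [3.5, 0.1]])          # n = 3 observations, m = 2 regressors
ones = np.ones((X_raw.shape[0], 1))
X = np.hstack([ones, X_raw])            # shape (3, 3): columns x0, x1, x2

print(X)                                # first column is all ones
```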

Now, for the moment, for the sake of simplicity, let's assume that we have only one, the $i$-th, observation. We present the regressor values of the $i$-th observation as a vector $x_i$. The vector $x_i$ has dimension $(m+1) \times 1$, that is, $m+1$ rows and 1 column:

$$ x_i = \begin{pmatrix} x_{i0} \\ x_{i1} \\ \vdots \\ x_{im} \end{pmatrix} = \begin{pmatrix} 1 \\ x_{i1} \\ \vdots \\ x_{im} \end{pmatrix} $$

We represent the required coefficients as a vector $w$, which also has dimension $(m+1) \times 1$:

$$ w = \begin{pmatrix} w_0 \\ w_1 \\ \vdots \\ w_m \end{pmatrix} $$

The linear regression equation for the $i$-th observation takes the form:

$$ f(w, x_i) = x_i^T w $$

The function of assessing the quality of the linear model will take the form:

$$ Err(w) = \sum_{i=1}^{n} \big( x_i^T w - y_i \big)^2 $$

Note that, in accordance with the rules of matrix multiplication, we needed to transpose the vector $x_i$.

Matrix representation

As a result of the vector multiplication we get a number, $x_i^T w$, which is to be expected. This number is the approximation of the $i$-th target value. But we need an approximation of not just one target value, but all of them. To do this, we write the regressors of all observations into a matrix $X$: each row corresponds to one observation. The resulting matrix has dimension $n \times (m+1)$:

$$ X = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1m} \\ 1 & x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nm} \end{pmatrix} $$

Now the linear regression equation will take the form:

$$ f(w, X) = Xw $$

Let us denote the values of the target indicators (all $y_i$) by a vector $y$ of dimension $n \times 1$:

$$ y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} $$

Now we can write in a matrix format the equation for estimating the quality of a linear model:

$$ Err(w) = (Xw - y)^T (Xw - y) $$

Actually, from this formula we then obtain the formula known to us, $w = (X^T X)^{-1} X^T y$.
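Before deriving the formula, we can already check numerically that it works. A sketch on synthetic data, comparing the "by hand" result against NumPy's built-in least-squares solver:

```python
import numpy as np

# A sketch of w = (X^T X)^{-1} X^T y applied "by hand" on synthetic data,
# checked against NumPy's least-squares solver; the data are made up.
rng = np.random.default_rng(0)
n, m = 100, 3
X = np.hstack([np.ones((n, 1)), rng.normal(size=(n, m))])  # ones + m regressors
true_w = np.array([1.0, 2.0, -0.5, 0.3])
y = X @ true_w + 0.01 * rng.normal(size=n)                 # target with a little noise

w_hand = np.linalg.inv(X.T @ X) @ X.T @ y                  # the formula itself
w_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]             # library solver

print(np.allclose(w_hand, w_lstsq))                        # True
```

In practice, np.linalg.lstsq (or solving the system $X^T X w = X^T y$ with np.linalg.solve) is preferred over the explicit inverse for numerical stability, but for our purposes the direct formula illustrates the derivation.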

How is it done? Parentheses are opened, differentiation is performed, the resulting expressions are transformed, and so on; this is exactly what we will do now.

Matrix transformations

Let's open the brackets

$$ Err(w) = (Xw - y)^T (Xw - y) = (Xw)^T Xw - (Xw)^T y - y^T X w + y^T y $$

$$ = w^T X^T X w - (Xw)^T y - y^T X w + y^T y $$

where in the first term we used the transpose rule $(Xw)^T = w^T X^T$.

Let's prepare the equation for differentiation

To do this, we will carry out some transformations. In subsequent calculations, it will be more convenient for us if the vector $w$ appears at the beginning of each product in the equation.

Transformation 1

$$ y^T X w = w^T X^T y $$

How did this happen? It is enough to look at the sizes of the multiplied matrices and see that the output is a number, i.e. a $1 \times 1$ matrix. A scalar is equal to its own transpose, so we may transpose the whole product: $y^T X w = (y^T X w)^T = w^T X^T y$.

Let's write down the sizes of the matrix expressions:

$$ \underbrace{y^T}_{1 \times n} \; \underbrace{X}_{n \times (m+1)} \; \underbrace{w}_{(m+1) \times 1} = 1 \times 1 $$

$$ \big( y^T X w \big)^T = w^T X^T y $$

$$ \underbrace{w^T}_{1 \times (m+1)} \; \underbrace{X^T}_{(m+1) \times n} \; \underbrace{y}_{n \times 1} = 1 \times 1 $$
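If you distrust the index bookkeeping, the scalar identity behind transformation 1 is easy to confirm numerically; the shapes and data below are arbitrary:

```python
import numpy as np

# Numeric check of transformation 1: y^T X w and w^T X^T y are the same
# scalar, so the order of the factors can be rearranged.
rng = np.random.default_rng(42)
n, k = 5, 3                      # k = m + 1 columns, including the dummy x0
X = rng.normal(size=(n, k))
y = rng.normal(size=n)
w = rng.normal(size=k)

left = y @ X @ w                 # y^T X w   -> a scalar
right = w @ X.T @ y              # w^T X^T y -> the same scalar
print(np.isclose(left, right))   # True
```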

Transformation 2

$$ (Xw)^T y = w^T X^T y $$

Let us write out the sizes, similarly to transformation 1:

$$ \underbrace{(Xw)^T}_{1 \times n} \; \underbrace{y}_{n \times 1} = 1 \times 1 $$

$$ \underbrace{w^T}_{1 \times (m+1)} \; \underbrace{X^T}_{(m+1) \times n} \; \underbrace{y}_{n \times 1} = 1 \times 1 $$

Here the equality holds directly by the transpose rule $(Xw)^T = w^T X^T$.

At the output, we get the equation that we have to differentiate:

$$ Err(w) = w^T X^T X w - 2\, w^T X^T y + y^T y $$

We differentiate the function of assessing the quality of the model

Differentiate with respect to the vector We bring the linear regression equation into a matrix form:

$$ \frac{\partial Err(w)}{\partial w} = \frac{\partial}{\partial w}\big(w^T X^T X w\big) - 2\,\frac{\partial}{\partial w}\big(w^T X^T y\big) + \frac{\partial}{\partial w}\big(y^T y\big) $$

$$ \frac{\partial Err(w)}{\partial w} = 2 X^T X w - 2 X^T y + 0 = 0 $$

$$ X^T X w = X^T y $$

$$ w = (X^T X)^{-1} X^T y $$

There should be no questions about why $\frac{\partial}{\partial w}\big(y^T y\big) = 0$, since $y^T y$ does not depend on $w$ at all; but we will analyze the differentiation operations in the other two expressions in more detail.

Derivation 1

Let's expand the first differentiation: $\frac{\partial}{\partial w}\big(w^T X^T X w\big)$

In order to determine the derivative of a matrix or vector expression, you need to look at what is inside it. We look:

$$ w^T = (w_0 \;\; w_1 \;\; \cdots \;\; w_m), \qquad w = \begin{pmatrix} w_0 \\ w_1 \\ \vdots \\ w_m \end{pmatrix} $$

$$ X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1m} \\ 1 & x_{21} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \cdots & x_{nm} \end{pmatrix}, \qquad X^T = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_{11} & x_{21} & \cdots & x_{n1} \\ \vdots & \vdots & \ddots & \vdots \\ x_{1m} & x_{2m} & \cdots & x_{nm} \end{pmatrix} $$

$$ w^T X^T X w $$

Denote the product of the matrices $X^T X$ by the matrix $A$. The matrix $A$ is square and, moreover, symmetric: $A^T = A$. These properties will be useful to us later, so remember them. The matrix $A$ has dimension $(m+1) \times (m+1)$:

$$ A = X^T X = \begin{pmatrix} a_{00} & a_{01} & \cdots & a_{0m} \\ a_{10} & a_{11} & \cdots & a_{1m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m0} & a_{m1} & \cdots & a_{mm} \end{pmatrix}, \qquad a_{jk} = a_{kj} $$

Now our task is to correctly multiply the vectors by the matrix and not end up with "two times two is five", so let's focus and be extremely careful.

$$ w^T A w = (w_0 \; w_1 \; \cdots \; w_m) \begin{pmatrix} a_{00} & a_{01} & \cdots & a_{0m} \\ a_{10} & a_{11} & \cdots & a_{1m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m0} & a_{m1} & \cdots & a_{mm} \end{pmatrix} \begin{pmatrix} w_0 \\ w_1 \\ \vdots \\ w_m \end{pmatrix} $$

$$ = (w_0 \; w_1 \; \cdots \; w_m) \begin{pmatrix} a_{00} w_0 + a_{01} w_1 + \cdots + a_{0m} w_m \\ a_{10} w_0 + a_{11} w_1 + \cdots + a_{1m} w_m \\ \vdots \\ a_{m0} w_0 + a_{m1} w_1 + \cdots + a_{mm} w_m \end{pmatrix} $$

$$ = \sum_{j=0}^{m} \sum_{k=0}^{m} a_{jk} \, w_j w_k $$

What an intricate expression we have! In fact, we got a number, a scalar. And now, for real, we turn to differentiation. We need to find the derivative of the resulting expression with respect to each coefficient $w_0, w_1, \dots, w_m$ and obtain as output a vector of dimension $(m+1) \times 1$. Just in case, I will write down the procedure step by step:


1) differentiate with respect to $w_0$, we get: $2 a_{00} w_0 + (a_{01} + a_{10}) w_1 + \cdots + (a_{0m} + a_{m0}) w_m$

2) differentiate with respect to $w_1$, we get: $(a_{01} + a_{10}) w_0 + 2 a_{11} w_1 + \cdots + (a_{1m} + a_{m1}) w_m$

3) differentiate with respect to $w_m$, we get: $(a_{0m} + a_{m0}) w_0 + (a_{1m} + a_{m1}) w_1 + \cdots + 2 a_{mm} w_m$

The output is the promised vector of size $(m+1) \times 1$:

$$ \frac{\partial \big(w^T A w\big)}{\partial w} = \begin{pmatrix} \sum_{k=0}^{m} a_{0k} w_k + \sum_{j=0}^{m} a_{j0} w_j \\ \sum_{k=0}^{m} a_{1k} w_k + \sum_{j=0}^{m} a_{j1} w_j \\ \vdots \\ \sum_{k=0}^{m} a_{mk} w_k + \sum_{j=0}^{m} a_{jm} w_j \end{pmatrix} $$

If you take a closer look at the vector, you will notice that the left and the corresponding right elements of each row can be grouped in such a way that the vector $w$ can be factored out. For example, the left part of the top row, $\sum_{k} a_{0k} w_k$, is the first element of the product $A w$, while the right part, $\sum_{j} a_{j0} w_j$, is the first element of the product $A^T w$, and so on for each row. Let's group:

$$ \frac{\partial \big(w^T A w\big)}{\partial w} = \begin{pmatrix} \sum_{k=0}^{m} a_{0k} w_k \\ \vdots \\ \sum_{k=0}^{m} a_{mk} w_k \end{pmatrix} + \begin{pmatrix} \sum_{j=0}^{m} a_{j0} w_j \\ \vdots \\ \sum_{j=0}^{m} a_{jm} w_j \end{pmatrix} = A w + A^T w $$

Take out the vector $w$ and at the output we get:

$$ \frac{\partial \big(w^T A w\big)}{\partial w} = (A + A^T)\, w $$

Now let's look at the resulting matrix. The matrix is the sum of the two matrices $A$ and $A^T$:

$$ A + A^T = X^T X + \big(X^T X\big)^T $$

Recall that a little earlier we noted an important property of the matrix $A$: it is symmetric. Based on this property, we can confidently state that $A^T w$ equals $A w$. This is easy to check by expanding the product $A^T w$ element by element. We will not do this here; those who wish can check it themselves.

Let's go back to our expression. After our transformations, it turned out just the way we wanted to see it:

$$ \frac{\partial}{\partial w}\big(w^T X^T X w\big) = (A + A^T)\, w = 2 A w = 2 X^T X w $$
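The result of derivation 1 can be verified numerically by comparing the analytic gradient $2Aw$ with central finite differences; $A$ and $w$ below are arbitrary:

```python
import numpy as np

# Check that d(w^T A w)/dw = 2 A w for a symmetric A, by comparing the
# analytic gradient with central finite differences on arbitrary data.
rng = np.random.default_rng(1)
k = 4
M = rng.normal(size=(k, k))
A = M.T @ M                      # built like X^T X, so A is symmetric
w = rng.normal(size=k)

analytic = 2 * A @ w

eps = 1e-6
numeric = np.empty(k)
for j in range(k):
    e = np.zeros(k); e[j] = eps
    numeric[j] = ((w + e) @ A @ (w + e) - (w - e) @ A @ (w - e)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-4))  # True
```

Since $w^T A w$ is quadratic in $w$, the central difference is exact up to floating-point error, so the two gradients agree closely.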

So, we have dealt with the first differentiation. Let's move on to the second expression.

Derivation 2

$$ \frac{\partial}{\partial w}\big(2\, w^T X^T y\big) $$

Let's go down the beaten track. This derivation will be much shorter than the previous one, so don't go far from the screen.

Let's expand the vector and matrix element by element:

$$ w^T = (w_0 \;\; w_1 \;\; \cdots \;\; w_m) $$

$$ X^T = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_{11} & x_{21} & \cdots & x_{n1} \\ \vdots & \vdots & \ddots & \vdots \\ x_{1m} & x_{2m} & \cdots & x_{nm} \end{pmatrix} $$

$$ y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} $$

For a while, we will remove the factor of two from the calculations; it does not play a big role here, and we will return it to its place later. We multiply the vectors by the matrix. First, multiply the matrix $X^T$ by the vector $y$; there are no restrictions here. We get a vector of size $(m+1) \times 1$:

$$ X^T y = \begin{pmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_{i1} y_i \\ \vdots \\ \sum_{i=1}^{n} x_{im} y_i \end{pmatrix} $$

Let's perform the following action: multiply the vector $w^T$ by the resulting vector. At the output, a number will be waiting for us:

$$ w^T X^T y = w_0 \sum_{i=1}^{n} y_i + w_1 \sum_{i=1}^{n} x_{i1} y_i + \cdots + w_m \sum_{i=1}^{n} x_{im} y_i $$

We differentiate it. At the output we get a vector of dimension $(m+1) \times 1$:

$$ \frac{\partial}{\partial w}\big(w^T X^T y\big) = \begin{pmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_{i1} y_i \\ \vdots \\ \sum_{i=1}^{n} x_{im} y_i \end{pmatrix} $$

Does it remind you of something? That's right! This is the product of the matrix $X^T$ and the vector $y$. Returning the factor of two to its place, we get $\frac{\partial}{\partial w}\big(2\, w^T X^T y\big) = 2 X^T y$.
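Derivation 2 admits the same sanity check: the expression $w^T X^T y$ is linear in $w$, so its gradient should be exactly $X^T y$, regardless of $w$. The data below are arbitrary:

```python
import numpy as np

# Check that d(w^T X^T y)/dw = X^T y: the expression is linear in w,
# so its gradient does not depend on w at all.
rng = np.random.default_rng(7)
n, k = 6, 3
X = rng.normal(size=(n, k))
y = rng.normal(size=n)
w = rng.normal(size=k)

analytic = X.T @ y

eps = 1e-6
numeric = np.empty(k)
for j in range(k):
    e = np.zeros(k); e[j] = eps
    numeric[j] = ((w + e) @ X.T @ y - (w - e) @ X.T @ y) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-6))  # True
```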

Thus, the second differentiation is successfully completed.

Instead of a conclusion

Now we know how the equality $w = (X^T X)^{-1} X^T y$ came about.

Finally, we describe a quick way to transform the basic formulas.

Let's evaluate the quality of the model in accordance with the least squares method:
$$ Err(w) = (Xw - y)^T (Xw - y) \rightarrow \min_w $$

$$ Err(w) = w^T X^T X w - 2\, w^T X^T y + y^T y $$

We differentiate the resulting expression and equate the derivative to zero:

$$ \frac{\partial Err(w)}{\partial w} = 2 X^T X w - 2 X^T y = 0 $$

$$ X^T X w = X^T y \quad \Longrightarrow \quad w = (X^T X)^{-1} X^T y $$
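Putting the "quick way" end to end in code, with hypothetical oil-price and Fed-rate regressors and a noiseless target so that the coefficients are recovered exactly:

```python
import numpy as np

# The "quick way" end to end: build X with a column of ones, solve the
# normal equations X^T X w = X^T y, and recover the coefficients.
# The regressor values and the true coefficients are hypothetical.
rng = np.random.default_rng(3)
n = 50
oil = rng.uniform(40, 80, size=n)            # hypothetical oil prices
fed_rate = rng.uniform(0.5, 2.5, size=n)     # hypothetical Fed rates
y = 10 + 0.8 * oil - 3.0 * fed_rate          # noiseless target for clarity

X = np.column_stack([np.ones(n), oil, fed_rate])
w = np.linalg.solve(X.T @ X, X.T @ y)        # solve instead of inverting

print(np.allclose(w, [10.0, 0.8, -3.0]))     # True
```

Solving the linear system with np.linalg.solve is numerically preferable to forming $(X^T X)^{-1}$ explicitly, while being algebraically the same normal equations.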

Literature

Internet sources:

1) habr.com/en/post/278513
2) habr.com/ru/company/ods/blog/322076
3) habr.com/en/post/307004
4) nabatchikov.com/blog/view/matrix_der

Textbooks, collections of tasks:

1) Lecture Notes on Higher Mathematics: Full Course / D. T. Pismenny, 4th ed., Moscow: Iris-press, 2006
2) Applied Regression Analysis / N. Draper, H. Smith, 2nd ed., Moscow: Finance and Statistics, 1986 (translated from English)
3) Tasks for solving matrix equations:
function-x.ru/matrix_equations.html
mathprofi.ru/deistviya_s_matricami.html


Source: habr.com
