Least-Squares Method

The ordinary least-squares (OLS) method is a regression analysis technique used to estimate unknown values from measurements that contain random errors. It is also used to approximate a given function by other, simpler functions and to process observations.

When an unknown value can be measured directly, for example, a segment length or an angle, the value is measured many times to improve accuracy, and the result is taken as the arithmetic mean of all measurements. This rule rests on probability theory: the sum of squared deviations of the measurements from the arithmetic mean is smaller than the sum of squared deviations from any other value. The arithmetic mean rule is therefore the simplest case of the least-squares method.
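This property is easy to verify numerically. The snippet below is a minimal illustration with hypothetical measurement data (the values are invented for the example): it checks that the arithmetic mean yields a smaller sum of squared deviations than nearby candidate values.

```python
import numpy as np

# Hypothetical repeated measurements of the same quantity with random errors
measurements = np.array([10.2, 9.8, 10.1, 10.4, 9.9])
mean = measurements.mean()

def sum_sq_dev(value):
    """Sum of squared deviations of the measurements from a given value."""
    return float(np.sum((measurements - value) ** 2))

# The arithmetic mean gives a smaller sum of squared deviations
# than any other candidate value.
for candidate in (mean - 0.1, mean, mean + 0.1):
    print(candidate, sum_sq_dev(candidate))
```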

The least-squares method substitutes estimates of the unknowns into the initial equations. The right-hand sides then turn into zeros or into small residuals whose sum of squares is smaller than the sum of squares of the residuals obtained by substituting any other values of the unknowns. In addition, solving equations with the least-squares method yields the probable errors of the unknowns, that is, it allows the accuracy of the result to be estimated.

Two-Stage Least-Squares Method. Systems of Simultaneous Equations

A system of simultaneous equations is a set of equations with interdependent variables: a variable that appears as the dependent (resulting) characteristic in one model equation serves as a factor characteristic in another. Coefficients of such a system cannot be calculated with the ordinary least-squares method because the right-hand sides include endogenous variables. The most frequently used estimation method is the two-stage least-squares method.

The two-stage least-squares method is used to estimate the coefficients of a regression equation of the form: y = Y·a + X·b + u.

Where:

- y is the vector of values of the dependent variable;
- Y is the matrix of endogenous explanatory variables;
- X is the matrix of exogenous (predetermined) variables;
- a and b are the vectors of coefficients to be estimated;
- u is the vector of residuals.

The procedure consists of two stages: first, each endogenous explanatory variable is regressed on all exogenous variables of the system, and its values are replaced with the fitted values; second, the original equation is estimated with the ordinary least-squares method using these fitted values in place of the endogenous variables.
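The two stages of the method can be sketched as follows. This is a hedged illustration with synthetic data; the variable names, the instrument z, and the coefficient values are assumptions made for the example, not the platform's API.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical data: z is an instrument, x is an exogenous variable,
# y_endog is an endogenous regressor correlated with the error u.
z = rng.normal(size=n)
x = rng.normal(size=n)
u = rng.normal(size=n)
y_endog = 0.8 * z + 0.5 * u + rng.normal(size=n)  # endogenous regressor
y = 2.0 * y_endog + 1.0 * x + u                   # structural equation

def ols(X, y):
    """OLS estimate (X'X)^-1 X'y via the normal equations."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# Stage 1: regress the endogenous regressor on all exogenous
# variables and instruments, keep the fitted values.
Z = np.column_stack([np.ones(n), z, x])
y_endog_hat = Z @ ols(Z, y_endog)

# Stage 2: re-estimate the structural equation with the fitted
# values in place of the endogenous regressor.
X2 = np.column_stack([np.ones(n), y_endog_hat, x])
a_const, a, b = ols(X2, y)
print(a, b)  # estimates of the structural coefficients a and b
```

Plain OLS on the original equation would be biased here, because y_endog is correlated with u; the stage-1 fitted values remove that correlation.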

Weighted OLS

The method estimates the coefficients of the model y = Xβ + e by minimizing the sum of squared deviations e'e. The coefficients are estimated by the formula: β = (X'X)⁻¹X'y.
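The formula above can be applied directly with a linear solver. A minimal sketch with hypothetical data (the true coefficients 3, 2, −1 are chosen for the example):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# Hypothetical data generated from y = 3 + 2*x1 - x2 + e
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
e = 0.1 * rng.normal(size=n)
y = X @ np.array([3.0, 2.0, -1.0]) + e

# beta = (X'X)^-1 X'y; solving the normal equations directly
# avoids forming the inverse explicitly.
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)
```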

The case of multicollinearity, when the matrix X'X is nearly singular (the absolute value of its determinant is small), is considered separately. In this case, the coefficient estimate is ambiguous because the columns of the matrix X are linearly dependent. To obtain an unambiguous estimate, exclude columns from the matrix X until it has full rank.
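A near-singular X'X can be detected numerically, for example through its condition number. A small sketch with a hypothetical design matrix whose third column nearly duplicates the second:

```python
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.normal(size=50)
# The third column nearly duplicates the second, so X'X is close to singular.
X = np.column_stack([np.ones(50), x1, x1 + 1e-8 * rng.normal(size=50)])

XtX = X.T @ X
print(np.linalg.det(XtX))   # determinant close to zero
print(np.linalg.cond(XtX))  # very large condition number

# Dropping the redundant column restores full rank.
X_reduced = X[:, :2]
print(np.linalg.matrix_rank(X_reduced.T @ X_reduced))
```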

Weighting is used to estimate the coefficients of the model Y = Xβ + ε under the assumption that the residuals are heteroscedastic.

A simple transform reduces this case to a standard multiple linear regression model with homoscedastic residuals: each observation is divided by the standard deviation σᵢ of its residual, giving yᵢ/σᵢ = (xᵢ/σᵢ)'β + εᵢ/σᵢ, so that the transformed residuals εᵢ/σᵢ have constant variance.

The obtained model is estimated using the standard least-squares method.
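The weighting transform can be sketched as follows, using hypothetical data whose residual standard deviation grows with the regressor (the factor 0.5·x is an assumption for the example):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200

x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
sigma = 0.5 * x  # residual std grows with x: heteroscedastic residuals
y = X @ np.array([1.0, 2.0]) + sigma * rng.normal(size=n)

# Divide each row by its residual standard deviation; the transformed
# model has homoscedastic residuals and is estimated by ordinary OLS.
Xw = X / sigma[:, None]
yw = y / sigma

beta_wls = np.linalg.solve(Xw.T @ Xw, Xw.T @ yw)
print(beta_wls)
```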

Feasible Generalized Least-Squares Method

For the linear regression model Y = Xβ + ε, where the residuals ε are assumed to be distributed according to the law N(0, σ²Ω) with a known covariance matrix Ω, the generalized least-squares estimate of the coefficients is given by the formula: β = (X'Ω⁻¹X)⁻¹X'Ω⁻¹Y. In the feasible variant of the method, the unknown matrix Ω is first estimated from the data.

If the model contains a constant that should be estimated automatically, a column of ones is added to the matrix X.
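The generalized least-squares formula can be sketched as follows, assuming for simplicity a known diagonal covariance matrix Ω (the coefficient values and covariance structure are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100

# Column of ones added to X to estimate the constant automatically.
X = np.column_stack([np.ones(n), rng.normal(size=n)])

# Assumed known covariance structure: diagonal Omega with varying variances.
omega_diag = rng.uniform(0.5, 2.0, size=n)
eps = rng.normal(size=n) * np.sqrt(omega_diag)
y = X @ np.array([0.5, 1.5]) + eps

# GLS estimate: beta = (X' Omega^-1 X)^-1 X' Omega^-1 y
Omega_inv = np.diag(1.0 / omega_diag)
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)
print(beta_gls)
```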

See also:

Linear Regression | Modeling Container: The Linear Regression (OLS Estimation) Model | Time Series Analysis: Linear Regression | IModelling.Ols | ISmSimultaneousSystem