The ordinary least-squares (OLS) method is a regression analysis technique for estimating unknown values from measurements that contain random errors. The method is also used to approximate a given function by other, simpler functions and to process observations.
When an unknown value can be measured directly, for example the length of a segment or an angle, it is measured many times for accuracy, and the final value is taken as the arithmetic mean of all measurements. This arithmetic mean rule rests on probability theory: the sum of squared deviations of the measurements from the arithmetic mean is smaller than the sum of squared deviations from any other value. The arithmetic mean rule is therefore the simplest case of the least-squares method.
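As a numerical illustration (a minimal NumPy sketch with made-up measurement values, not part of the platform's functionality), the following check confirms that the arithmetic mean yields the smallest sum of squared deviations:

```python
import numpy as np

# Repeated measurements of the same quantity, contaminated by random errors (made-up data).
measurements = np.array([10.2, 9.8, 10.1, 10.4, 9.7, 10.0])

def sum_of_squared_deviations(center):
    """Sum of squared deviations of the measurements from a candidate value."""
    return float(np.sum((measurements - center) ** 2))

mean = measurements.mean()
# The sum of squares at the mean is never larger than at any other candidate value.
for candidate in np.linspace(mean - 1.0, mean + 1.0, 201):
    assert sum_of_squared_deviations(mean) <= sum_of_squared_deviations(candidate)
print("arithmetic mean:", mean)
```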
The least-squares method chooses values of the unknowns such that, when they are substituted into the initial equations, the right-hand sides turn into zeros or into small residuals whose sum of squares is smaller than the sum of squares of the corresponding residuals for any other values of the unknowns. In addition, when equations are solved by the least-squares method, the probable errors of the unknowns can be obtained, that is, the accuracy of the result can be estimated.
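Stated as a formula (a standard formulation added here for clarity, not taken verbatim from the platform documentation), if b_i is the i-th observed value and f_i(x) its model prediction, the least-squares estimate of the unknowns x is

```latex
\hat{x} = \arg\min_{x} \sum_{i=1}^{n} r_i(x)^2,
\qquad r_i(x) = b_i - f_i(x),
```

and, in the linear case, the accuracy of the result can be judged from the estimated covariance matrix \hat{\sigma}^2 (X'X)^{-1} of the coefficients, where \hat{\sigma}^2 is the residual variance.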
A system of simultaneous equations is a set of equations with interdependent variables: a variable that enters one equation of the model as the resulting (dependent) characteristic serves as a factor (explanatory) characteristic in another equation. The coefficients of a system of simultaneous equations cannot be calculated by the regular least-squares method because the right-hand sides of the equations contain endogenous variables. The most frequently used estimation method is the two-stage least-squares method.
The procedure of the two-stage least-squares method is as follows:
1. Build the reduced form of the model and calculate numerical values of its parameters using the regular least-squares method.
2. Identify the endogenous variables on the right-hand side of the structural equation whose parameters are estimated by the two-stage least-squares method, and calculate their fitted values from the corresponding equations of the reduced form of the model.
3. Calculate numerical values of the structural equation parameters with the regular least-squares method, using as source data the actual values of the predetermined variables and the fitted values of the endogenous variables on the right-hand side of this structural equation.
The two-stage least-squares method is used to estimate the coefficients of a regression equation of the form y = Y·a + X·b + u (an illustrative code sketch of the two stages follows the variable definitions below).
Where:
y. The dependent variable of the equation.
Y. The n×g matrix of observations of the other endogenous variables included in the equation.
X. The n×k matrix of observations of the predetermined variables included in the equation.
a. The g×1 vector of structural coefficients related to the variables of the matrix Y.
b. The k×1 vector of coefficients related to the variables of the matrix X.
u. The n×1 vector of random disturbances.
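A minimal sketch of the two-stage procedure in this notation, using NumPy (illustrative only; the function name, the instrument matrix Z, and the use of np.linalg.lstsq are assumptions of this sketch, not the platform's IModelling interface):

```python
import numpy as np

def two_stage_least_squares(y, Y, X, Z):
    """Estimate a and b in y = Y·a + X·b + u by two-stage least squares.

    y : (n,)   dependent variable
    Y : (n, g) endogenous regressors appearing on the right-hand side
    X : (n, k) predetermined regressors of this equation
    Z : (n, m) all predetermined variables of the system (instruments,
               including the columns of X), with m >= g + k
    """
    # Stage 1: reduced form -- regress each endogenous regressor on the
    # instruments and replace Y with its fitted values.
    gamma, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    Y_hat = Z @ gamma

    # Stage 2: ordinary least squares of y on [Y_hat, X].
    W = np.hstack([Y_hat, X])
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    a, b = coef[:Y.shape[1]], coef[Y.shape[1]:]
    return a, b
```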
The method is used to estimate the coefficients of the model y = Xβ + e by minimizing the sum of squared residuals e'e. The coefficients are estimated by the formula: β = (X'X)⁻¹X'y.
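For example (a minimal NumPy sketch with simulated data, not the platform's estimator), the formula can be applied directly by solving the normal equations:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])  # constant + regressors
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# beta_hat = (X'X)^(-1) X'y, computed by solving the normal equations
# rather than forming the explicit inverse.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # close to beta_true
```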
The case of multicollinearity, when the matrix X'X is nearly singular (the absolute value of its determinant is small), is considered separately. In this case the coefficient estimates are ambiguous because the columns of the matrix X are linearly dependent. To get an unambiguous estimate, exclude columns from the matrix X until it reaches full (maximum) rank.
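One simple way to detect and resolve this situation (an illustrative sketch; the rank tolerance and the left-to-right column selection are assumptions of the example, not necessarily the platform's criterion):

```python
import numpy as np

def drop_dependent_columns(X, tol=1e-10):
    """Return the indices of a full-rank subset of the columns of X.

    Columns are examined left to right; a column is kept only if it
    increases the rank of the columns already kept.
    """
    kept = []
    for j in range(X.shape[1]):
        candidate = X[:, kept + [j]]
        if np.linalg.matrix_rank(candidate, tol=tol) > len(kept):
            kept.append(j)
    return kept

# Example: the third column duplicates the second, so X'X is singular.
X = np.array([[1.0, 2.0, 4.0],
              [1.0, 3.0, 6.0],
              [1.0, 5.0, 10.0],
              [1.0, 7.0, 14.0]])
print(np.linalg.cond(X.T @ X))    # huge (the matrix is singular)
print(drop_dependent_columns(X))  # [0, 1] -- the dependent column is excluded
```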
Weighting is used to estimate the coefficients of the model Y = Xβ + ε under the assumption that the residuals are heteroscedastic.
A simple transform reduces the case to a standard multiple linear regression model with homoscedastic residuals (a sketch of one such transform is given below).
The obtained model is then estimated by the standard least-squares method.
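A minimal sketch of such a weighting transform, assuming the residual standard deviations sigma are known for every observation (in practice they usually have to be estimated):

```python
import numpy as np

def weighted_least_squares(X, y, sigma):
    """OLS after dividing every observation by its residual standard deviation.

    Rescaling row i of X and y by 1/sigma[i] makes the transformed residuals
    homoscedastic, so the standard least-squares formula applies.
    """
    w = 1.0 / np.asarray(sigma)
    Xw = X * w[:, None]
    yw = y * w
    beta_hat, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    return beta_hat
```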
For the linear regression model Y = Xβ + ε, under the assumption that the residuals ε are distributed according to the law N(0, σ²Ω) with a given covariance matrix Ω, the generalized least-squares estimate is calculated by the formula: β = (X'Ω⁻¹X)⁻¹X'Ω⁻¹Y.
If the model contains a constant that should be estimated automatically, a column of ones is added to the matrix X.
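A minimal sketch of the generalized estimator in NumPy (illustrative; it uses a Cholesky factor of Ω to avoid forming Ω⁻¹ explicitly and adds the column of ones for the constant as described above):

```python
import numpy as np

def generalized_least_squares(X, y, Omega, add_constant=True):
    """Compute beta_hat = (X' Omega^-1 X)^-1 X' Omega^-1 y.

    Instead of inverting Omega, the model is "whitened" with its Cholesky
    factor L (Omega = L L'), which reduces GLS to ordinary least squares
    on the transformed data L^-1 X and L^-1 y.
    """
    if add_constant:
        X = np.column_stack([np.ones(len(y)), X])  # column of ones for the intercept
    L = np.linalg.cholesky(Omega)
    Xw = np.linalg.solve(L, X)  # L^-1 X
    yw = np.linalg.solve(L, y)  # L^-1 y
    beta_hat, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    return beta_hat
```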
See also:
Linear Regression | Modeling Container: The Linear Regression (OLS Estimation) Model | Time Series Analysis: Linear Regression | IModelling.Ols | ISmSimultaneousSystem