Linear regression is a statistical procedure for predicting the value of a dependent variable from an independent variable when the relationship between the variables can be described with a linear model.
A linear regression equation can be written as Yp = mX + b, where Yp is the predicted value of the dependent variable, m is the slope of the regression line, and b is the Y-intercept of the regression line.
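As a quick illustration of the Yp = mX + b formula, the slope and intercept can be estimated from sample data with the standard least-squares expressions m = cov(X, Y)/var(X) and b = Ȳ − m·X̄. A minimal Python sketch, using made-up data (all values here are illustrative, not from the text):

```python
import numpy as np

# Hypothetical sample data: x is the independent variable, y the dependent one.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.1, 5.9, 8.2, 9.9])

# Least-squares slope m and intercept b for Yp = m*X + b:
m = np.cov(x, y, bias=True)[0, 1] / np.var(x)  # m = cov(x, y) / var(x)
b = y.mean() - m * x.mean()                    # b = mean(y) - m * mean(x)

# Predicted value of the dependent variable for a new X:
y_pred = m * 6.0 + b
```

For these numbers the fitted line is Yp = 1.97·X + 0.13, so the prediction at X = 6 is 11.95.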
In statistics, linear regression is a method of estimating the conditional expected value of one variable y given the values of some other variable or variables x. The variable of interest, y, is conventionally called the "dependent variable". The terms "endogenous variable" and "output variable" are also used. The other variables x are called the "independent variables". The terms "exogenous variables" and "input variables" are also used. The dependent and independent variables may be scalars or vectors. If the independent variable is a vector, one speaks of multiple linear regression.
A linear regression model is typically stated in the form y = α + βx + ε
The right-hand side may take other forms, but generally comprises a linear combination of the parameters, here denoted α and β. The term ε represents the unpredicted or unexplained variation in the dependent variable; it is conventionally called the "error" whether it is really a measurement error or not. The error term is conventionally assumed to have expected value equal to zero, as a nonzero expected value could be absorbed into α. See also errors and residuals in statistics; the difference between an error and a residual is also dealt with below. It is also assumed that ε is independent of x.
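The point that a nonzero error mean would be absorbed into α can be checked numerically: if the data are generated with an error term whose mean is nonzero, an ordinary least-squares fit shifts the estimated intercept by that mean, while the fitted residuals themselves average to zero and are uncorrelated with x. A small sketch with simulated data (all names and values are illustrative):

```python
import numpy as np

# Simulated data: true model y = 1.5 + 0.8*x + eps, but eps has NONZERO mean 3.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=200)
eps = rng.normal(loc=3.0, scale=1.0, size=200)
y = 1.5 + 0.8 * x + eps

# Ordinary least-squares fit; np.polyfit returns [slope, intercept].
beta_hat, alpha_hat = np.polyfit(x, y, 1)

residuals = y - (alpha_hat + beta_hat * x)
# alpha_hat lands near 1.5 + 3 = 4.5 (the error mean is absorbed into the
# intercept), while the residuals average ~0 and are uncorrelated with x.
```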
A useful alternative to linear regression is robust regression, in which the mean absolute error is minimized instead of the mean squared error used in linear regression. Robust regression is computationally much more intensive than linear regression and is somewhat more difficult to implement as well.
Robust regression usually means linear regression with robust (Huber-White) standard errors (e.g. relaxing the assumption of homoskedasticity).
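The mean-absolute-error variant described above (least absolute deviations) can be sketched with a general-purpose optimizer. The data below are hypothetical, chosen to include one gross outlier so the contrast with ordinary least squares is visible; this is an illustration under those assumptions, not a production implementation:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data: five points near the line y = x, plus one gross outlier.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.1, 2.0, 2.9, 4.2, 5.0, 60.0])

def mean_abs_error(params):
    m, b = params
    return np.mean(np.abs(y - (m * x + b)))

# Least-absolute-deviations fit: numerically minimize the mean absolute error.
lad = minimize(mean_abs_error, x0=[0.0, 0.0], method="Nelder-Mead").x

# Ordinary least squares for comparison (minimizes mean squared error).
ols = np.polyfit(x, y, 1)

# The LAD slope stays near 1 despite the outlier; the squared-error penalty
# lets the outlier drag the OLS slope far above it.
```

This also illustrates why robust regression is more computationally intensive: the absolute-error objective has no closed-form solution like the least-squares normal equations, so it must be minimized iteratively.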
An equivalent formulation which explicitly shows the linear regression as a model of conditional expectation is E(y | x) = α + βx, with the conditional distribution of y given x essentially the same as the distribution of the error term. A linear regression model need not be affine, let alone linear, in the independent variables x.
All-In-One Multivariate Data Analysis (MVA) and Design of Experiments (DoE) Package with K-Means Clustering
A Snapshot of Industry Applications of The Unscrambler® Suite of Software Products
The Unscrambler® Suite of Software Products (The Unscrambler® X, Unscrambler Predictor, Unscrambler Classifier, and Unscrambler Optimizer) is an industry-leading standard used in a variety of industries. Select an industry below to read more about how the software products are useful to that industry, with actual case studies included.
Tailor-made for advanced multivariate statistical modeling, prediction, and classification, The Unscrambler® X Software is completed by wizard-driven design-of-experiments functionality. This all-in-one analytical package enables users to delve deep into the value embedded in their data and derive models and results that deliver substantial gains in R&D efficiency, time, and cost savings for a wide and growing array of client installations.
- Food and Beverage
- Agriculture
- Oil and Gas
- Chemical Manufacturing
- Polymer and Paper
- Pharmaceutical and Biotechnology
CAMO Software provides professional training in multivariate data analysis, spectroscopy, sensometrics, linear regression and statistical regression analysis, simple linear regression, K-Means Clustering, and chemometrics across the United States & Canada, Europe, South America, Africa, Australia, and Asia through our panel of chemometric experts, spectroscopy professionals, sensometrics instructors, and multivariate data analysis trainers.