Regression is a generic term for all methods attempting to fit a model to observed data in order to quantify the relationship between two groups of variables. The fitted model may then be used either to merely describe the relationship between the two groups of variables, or to predict new values.
The two data matrices involved in regression are usually denoted X and Y, and the purpose of regression is to build a model Y = f(X). Such a model tries to explain, or predict, the variations in the Y-variable(s) from the variations in the X-variable(s). The link between X and Y is achieved through a common set of samples for which both X- and Y-values have been collected.
The X- and Y-variables can be denoted with a variety of terms, according to the particular context (or culture). The most common ones are listed in the table below:
Usual names for X- and Y-variables
| Context | X | Y |
| General | Predictors | Responses |
| Multiple Linear Regression (MLR) | Independent Variables | Dependent Variables |
| Designed Data | Factors, Design Variables | Responses |
| Spectroscopy | Spectra | Constituents |
Univariate regression uses a single predictor, which is often not sufficient to model a property precisely. Multivariate regression takes several predictors into account simultaneously, which usually allows the property of interest to be modeled more accurately.
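As a minimal sketch of this difference, the Python snippet below fits an ordinary least-squares model to synthetic data, once with a single predictor and once with all three. The data, the coefficient values, and the helper names (`fit_linear`, `predict_linear`, `rmse`) are assumptions made up for illustration. Ordinary least squares stands in here as one simple choice of regression method.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a 1-D input as a single predictor column
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Apply a fitted linear model to new predictor values."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    return coef[0] + X @ coef[1:]

def rmse(y_true, y_pred):
    """Root mean squared error between observed and fitted responses."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic data (assumption): the response depends on three predictors.
rng = np.random.default_rng(0)
n = 50
X = rng.normal(size=(n, 3))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=n)

# Univariate model: only the first predictor is used.
coef_uni = fit_linear(X[:, 0], y)
rmse_uni = rmse(y, predict_linear(coef_uni, X[:, 0]))

# Multivariate model: all three predictors are used.
coef_multi = fit_linear(X, y)
rmse_multi = rmse(y, predict_linear(coef_multi, X))

print(f"fit error, one predictor:    {rmse_uni:.3f}")
print(f"fit error, three predictors: {rmse_multi:.3f}")
```

The errors printed here are computed on the same samples used for fitting, so they measure goodness of fit only. How well either model predicts new samples has to be checked on data that was not used to build it.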
Building a regression model involves collecting predictor and response values for common samples, and then fitting a predefined mathematical relationship to the collected data.
For example, in analytical chemistry, spectroscopic measurements are made on solutions with known concentrations of a given compound. Regression is then used to relate concentration to spectrum.
Once you have built a regression model, you can predict the unknown concentration for new samples, using the spectroscopic measurements as predictors. The advantage is obvious if the concentration is difficult or expensive to measure directly.
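The calibration workflow described above can be sketched in Python as follows. Everything numeric here is an assumption for illustration: the five-wavelength `pure_spectrum`, the noise level, the concentrations, and the idea that the spectrum scales linearly with concentration. The model is a plain least-squares fit, which is only one of several regression methods that could be used for this task.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical response of the pure compound at 5 wavelengths (made-up values).
pure_spectrum = np.array([0.10, 0.35, 0.80, 0.45, 0.15])

def simulate_spectra(conc, noise=0.005):
    """Simulate spectra assumed proportional to concentration, plus random noise."""
    conc = np.asarray(conc, dtype=float)
    signal = conc[:, None] * pure_spectrum[None, :]
    return signal + rng.normal(scale=noise, size=signal.shape)

# Step 1: calibration samples with known concentrations and measured spectra.
conc_cal = np.linspace(0.5, 5.0, 20)
X_cal = simulate_spectra(conc_cal)

# Step 2: fit concentration = b0 + spectrum . b by least squares.
A = np.column_stack([np.ones(len(X_cal)), X_cal])
coef, *_ = np.linalg.lstsq(A, conc_cal, rcond=None)

def predict_concentration(spectra):
    """Predict concentrations from one or more spectra using the fitted model."""
    spectra = np.atleast_2d(spectra)
    return coef[0] + spectra @ coef[1:]

# Step 3: predict the concentration of new samples from their spectra alone.
conc_new = np.array([1.2, 3.3, 4.7])  # kept only to check the predictions
X_new = simulate_spectra(conc_new)
conc_pred = predict_concentration(X_new)

for true, pred in zip(conc_new, conc_pred):
    print(f"true {true:.2f}  predicted {pred:.2f}")
```

In practice a real spectrum has many more wavelengths than this toy example, often more than there are calibration samples, and the absorbances at neighboring wavelengths are strongly correlated. A plain least-squares fit becomes unstable in that situation, which is one reason other multivariate regression methods are commonly preferred for spectroscopic data.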
More generally, classical indications for regression as a predictive tool could be the following:
Statistical Regression Analysis Software Solutions
| The Unscrambler® 9.8 | Complete software package for Multivariate Data Analysis, Statistical Regression Analysis and Experimental Design |