Simple linear regression is a method for determining the relationship between a continuous process output (Y) and a single factor (X). The relationship is typically expressed as a mathematical equation such as Y = b + mX, where b is the intercept and m is the slope.
Suppose we believe that the value of y tends to increase or decrease in a linear manner as x increases. Then we could select a model relating y to x by drawing a line that is well fitted to a given data set. Such a deterministic model, one that does not allow for errors of prediction, might be adequate if all of the data points fell on the fitted line. However, you can see that this idealized situation will not occur for the data of Tables 11.1 and 11.2. No matter how you draw a line through the points in Figure 11.2 and Figure 11.3, at least some of the points will deviate substantially from the fitted line.
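To make the idea of deviation from a fitted line concrete, here is a minimal sketch. The data points and the candidate line are invented for illustration; they are not the values from Table 11.1 or 11.2, and the function name `deviations` is hypothetical.

```python
# Hypothetical data: these points are invented for illustration and are
# NOT the values from Table 11.1 or 11.2.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]


def deviations(xs, ys, intercept, slope):
    """Return y - (intercept + slope * x) for every data point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Try one candidate deterministic line, y = 0 + 1 * x.
devs = deviations(xs, ys, intercept=0.0, slope=1.0)
for x, y, d in zip(xs, ys, devs):
    print(f"x={x:.1f}  y={y:.1f}  deviation={d:+.2f}")

# A deterministic model would be adequate only if every deviation were zero.
print("all points on the line:", all(abs(d) < 1e-12 for d in devs))
```

Changing the intercept or slope shifts the individual deviations, but for data like these no single straight line makes all of them zero.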
The solution to the preceding problem is to construct a probabilistic model relating y to x, one that acknowledges the random variation of the data points about a line. One type of probabilistic model, the simple linear regression model, assumes that the mean value of y for a given value of x graphs as a straight line and that points deviate about this line of means by a random amount equal to e, i.e.
y = A + B x + e,
where A and B are unknown parameters of the deterministic (nonrandom) portion of the model.
If we suppose that the points deviate above or below the line of means by a random amount e with expected value E(e) = 0, then the mean value of y is
E(y) = A + B x.
Therefore, the mean value of y for a given value of x, represented by the symbol E(y), graphs as a straight line with y-intercept A and slope B.
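The model y = A + B x + e can be explored with a short simulation. The sketch below rests on several assumptions that are not in the text: the "true" values of A and B are arbitrary, the error e is drawn from a normal distribution with mean 0, and A and B are estimated with the standard least-squares formulas. The function name `fit_line` is hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line y = a + b * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope estimate
    a = y_bar - b * x_bar    # intercept estimate
    return a, b


# Assumed "true" parameters, chosen only for this demonstration.
A_TRUE, B_TRUE, SIGMA = 2.0, 0.5, 1.0

rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [10 * i / 199 for i in range(200)]  # 200 evenly spaced x values in [0, 10]
# Each y is the line of means plus a random error e with E(e) = 0.
ys = [A_TRUE + B_TRUE * x + rng.gauss(0.0, SIGMA) for x in xs]

a_hat, b_hat = fit_line(xs, ys)
print(f"estimated intercept: {a_hat:.3f} (true A = {A_TRUE})")
print(f"estimated slope:     {b_hat:.3f} (true B = {B_TRUE})")
```

Because the errors average out to zero, the estimated line should lie close to the line of means E(y) = A + B x, even though individual points scatter around it.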
All-In-One Multivariate Data Analysis (MVA) and Design of Experiments (DoE) Package with Simple Linear Regression
This procedure performs linear regression on the selected dataset. This fits a linear model of the form
Y = b_0 + b_1 X_1 + b_2 X_2 + ... + b_k X_k + e
where Y is the dependent variable (response), X_1, X_2, ..., X_k are the independent variables (predictors), and e is random error. The quantities b_0, b_1, b_2, ..., b_k are known as the regression coefficients, which have to be estimated from the data. The multiple linear regression algorithm in XLMiner chooses the regression coefficients so as to minimize the difference between predicted values and actual values.
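As an illustration of estimating regression coefficients from data, the sketch below fits the model by ordinary least squares, which minimizes the sum of squared differences between predicted and actual values. This is an assumed reading of "minimize the difference"; it is not the implementation used by XLMiner or The Unscrambler. The function names `solve` and `fit_multiple` and the data are hypothetical.

```python
def solve(matrix, rhs):
    """Solve matrix * z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Build an augmented copy so the inputs are not modified.
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z


def fit_multiple(rows, ys):
    """Least-squares coefficients [b_0, b_1, ..., b_k] for Y = b_0 + sum(b_i * X_i)."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 carries b_0
    p = len(design[0])
    # Normal equations: (D^T D) b = D^T y
    dtd = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    dty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(p)]
    return solve(dtd, dty)


# Hypothetical data generated exactly from Y = 1 + 2*X_1 - 3*X_2 (no error term).
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
ys = [1 + 2 * x1 - 3 * x2 for x1, x2 in rows]

coeffs = fit_multiple(rows, ys)
print([round(b, 6) for b in coeffs])
```

Since the example data contain no random error, the estimated coefficients should come out very close to 1, 2 and -3. With real data the error term e is present and the estimates will only approximate the underlying coefficients. Solving the normal equations directly keeps the sketch short; numerical libraries usually prefer more stable decompositions.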
Linear regression is performed either to predict the response variable from the predictor variables or to study the relationship between the response variable and the predictor variables. For example, using linear regression, the crime rate of a state can be explained as a function of demographic factors such as population, education, and male-to-female ratio.