
# Classical DoE methods and PLS-ANOVA

#### Specific methods for analyzing designed data

• Assess the important effects and interactions with an analysis of effects, including an ANOVA. The model is fitted by MLR.
• Create a response surface by including interaction and quadratic terms in the MLR model.
• Analyze results from mixture designs with Scheffé models (Scheffé, 1958).
• Use PLS regression to analyze D-optimal designs.
Analysis of effects using classical methods: An analysis of effects should be conducted on screening designs and on screening designs with interactions (Plackett-Burman, fractional factorial and full factorial designs), as well as on mixture designs when the goal is set accordingly. The classical DoE method for studying effects is an analysis of variance (ANOVA). The model is fitted either by multiple linear regression (MLR) for Plackett-Burman and factorial designs, or with Scheffé models for mixture designs.

Response surface analysis using classical methods: A response surface analysis is used when the experimental objective is optimization. It applies to Central Composite and Box-Behnken designs, as well as to mixture and D-optimal designs, depending on the experimental goal that has been set. The classical DoE method for studying a response surface is to fit a quadratic (or even a cubic) model by MLR. The ANOVA table may then be studied in the same way as in an analysis of effects. The significance of the individual effects and of the additional terms (two-variable and three-variable interactions, square and cubic terms) must be assessed, depending on the terms included in the analysis.

### Analysis of effects using classical methods

The models for Plackett-Burman and factorial designs are:

• Main effects,
• Main effects and interactions (two- or three-variable interaction).

The models for mixture designs are:

• First order
• Second order
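
For reference, the first- and second-order Scheffé models are usually written as below for a mixture of $q$ components with proportions $x_i$. The symbols $q$, $x_i$, $\beta_i$ and $\beta_{ij}$ are notation chosen here for illustration and do not come from the software. Because of the closure constraint $\sum_i x_i = 1$, these models carry no separate intercept and no pure square terms.

$$
\begin{aligned}
\text{First order:}\quad & y = \sum_{i=1}^{q} \beta_i x_i \\
\text{Second order:}\quad & y = \sum_{i=1}^{q} \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j
\end{aligned}
$$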

The ANOVA table shows how well the model fits the responses. It also shows the effect of each design variable and of their possible interactions, together with the significance of these effects.

The analysis sequence is to look first at the model p-value and R². A model p-value below 5% indicates that the model is significant, and an R² close to 1 indicates good agreement between the predicted and the actual response values. Then consider the value and sign of each individual effect (model term), and the size of its p-value:

• p-value < 1%: the effect is highly significant.
• p-value < 5%: the effect is significant.
• p-value between 5% and 10%: the effect is possibly significant. For an interaction term, a p-value in this range is considered significant.
• p-value > 10%: the effect is not considered significant.

The summary table lists the effect values and the associated p-values for all response variables.
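
The p-value thresholds described above can be sketched as a small helper. This is only an illustration of the rule of thumb in the text: the function name `classify_p_value` and the `is_interaction` flag are hypothetical and are not part of The Unscrambler®. The p-values are those of the example ANOVA table in this section.

```python
def classify_p_value(p, is_interaction=False):
    """Return the significance label described in the text for one p-value."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if p < 0.01:
        return "highly significant"
    if p < 0.05:
        return "significant"
    if p <= 0.10:
        # The text treats interaction terms more leniently in the 5-10% band
        return "significant" if is_interaction else "possibly significant"
    return "not significant"


# (p-value, is_interaction) for each term of the example ANOVA table
effects = {"A": (0.0151, False), "B": (0.0000, False), "AB": (0.0003, True)}
labels = {name: classify_p_value(p, inter) for name, (p, inter) in effects.items()}
print(labels)
```

How the boundary values (exactly 5% or 10%) are handled is an assumption of this sketch; the text does not specify it.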

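ANOVA table

| Source | Sum of Squares (SS) | Degrees of Freedom (DF) | Mean Square | F-ratio | p-value |
|---|---|---|---|---|---|
| Summary | | | | | |
| Model | 1.750e+03 | 3 | 583.333 | 194.444 | 0.0001 |
| Error | 12 | 4 | 3 | | |
| Total | 1.762e+03 | 7 | 251.714 | | |
| Variables | | | | | |
| A | 50.000 | 1 | 50.000 | 16.667 | 0.0151 |
| B | 1.250e+03 | 1 | 1.250e+03 | 416.667 | 0.0000 |
| AB | 450.000 | 1 | 450.000 | 150.000 | 0.0003 |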

In this example the model is valid (p-value=0.0001) and all effects are significant (p-values < 0.05). The most significant effect is B as it has the smallest p-value.
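
The mean squares and F-ratios in the example table follow directly from the sums of squares and degrees of freedom. The sketch below reproduces that arithmetic with the table values. The R² line uses the usual definition SS(model) / SS(total), which is an assumption here because the table itself does not list R².

```python
# Sums of squares (SS) and degrees of freedom (DF) copied from the example table
terms = {"A": (50.0, 1), "B": (1250.0, 1), "AB": (450.0, 1)}
ss_error, df_error = 12.0, 4

# The model line pools all effect terms
ss_model = sum(ss for ss, _ in terms.values())
df_model = sum(df for _, df in terms.values())

ms_error = ss_error / df_error      # mean square = SS / DF
ms_model = ss_model / df_model
f_model = ms_model / ms_error       # F-ratio = MS(term) / MS(error)

f_ratios = {name: (ss / df) / ms_error for name, (ss, df) in terms.items()}

# Assumed definition: R2 = SS(model) / SS(total)
r_squared = ss_model / (ss_model + ss_error)

print(f"Model: SS={ss_model:.0f} DF={df_model} MS={ms_model:.3f} F={f_model:.3f}")
for name, f in f_ratios.items():
    print(f"{name}: F={f:.3f}")
print(f"R2={r_squared:.4f}")
```

The p-values are then read from the F distribution with the term's DF and the error DF; that step is left out to keep the sketch free of external libraries.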

The error sum of squares and degrees of freedom can be calculated either from the design samples or, if the design is saturated, from the replicated center samples.

Note: A saturated design has as many experimental design samples as model terms (including B0 if necessary). Such a design uses all the degrees of freedom to calculate the model terms.
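
As a minimal sketch of this definition, the error degrees of freedom are the number of design samples minus the number of model terms, and a design is saturated when none are left. The function names are hypothetical.

```python
def error_degrees_of_freedom(n_runs, n_model_terms):
    """Degrees of freedom left for the error after fitting the model terms."""
    return n_runs - n_model_terms


def is_saturated(n_runs, n_model_terms):
    """A design is saturated when no degrees of freedom remain for the error."""
    return error_degrees_of_freedom(n_runs, n_model_terms) == 0


# Model with B0, A, B and AB: 4 terms
print(is_saturated(4, 4))               # 4 design samples -> True
print(error_degrees_of_freedom(8, 4))   # 8 samples, as in the example table -> 4
```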

Other checks after analyzing the ANOVA table include the detection of curvature effects, which can be seen in the main effects plot. If the center sample lies off the straight line between the low and high levels, a curvature effect is possible, and the square term of the variable can be included in the model.
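
A numerical counterpart of this visual check is to compare the average response of the center samples with the average of the factorial samples. The sketch below uses made-up responses. The t-like statistic, based on the spread of the center replicates, is one common way to judge the difference and is an assumption here, not necessarily what the software computes.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical responses, for illustration only
factorial_runs = [20.0, 30.0, 45.0, 55.0]   # corner points of a 2^2 design
center_runs = [44.0, 46.0, 45.0]            # replicated center samples

# With a purely linear model the center average should match the corner average
curvature = mean(center_runs) - mean(factorial_runs)

# The spread of the center replicates gives an estimate of pure error
s_center = stdev(center_runs)
std_error = s_center * sqrt(1 / len(factorial_runs) + 1 / len(center_runs))
t_curvature = curvature / std_error

print(f"curvature = {curvature:.2f}, t = {t_curvature:.2f}")
```

A large statistic relative to the pure error suggests adding square terms to the model.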

Figure: Main effect plot with curvature

When a variable is a category variable, it is necessary to check which effects are significant and also whether the levels are significantly different from each other.
The multiple comparison test provides this type of information. It is based on a comparison of the averages of the response variable at the different levels. If the difference between two averages is greater than the critical limit, the two levels are significantly different. If not, they have a similar effect. If no level has an effect, all levels have a statistically similar effect and the averages of the response variable at the different levels are not significantly different.
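
The comparison of level averages against a critical limit can be sketched as follows. The responses and the value of `critical_limit` are invented for illustration. In practice the limit comes from the test itself, and how it is derived is not covered by this sketch.

```python
from itertools import combinations
from statistics import mean

# Hypothetical responses per level of one category variable
responses = {
    "low": [10.1, 9.8, 10.3],
    "medium": [10.4, 10.0, 10.2],
    "high": [12.9, 13.2, 13.0],
}
critical_limit = 1.0   # assumed value, for illustration only

averages = {level: mean(values) for level, values in responses.items()}

# Two-by-two distances between the level averages
distances = {
    (a, b): abs(averages[a] - averages[b]) for a, b in combinations(averages, 2)
}

# Two levels differ significantly when their distance exceeds the critical limit
different = {pair: d > critical_limit for pair, d in distances.items()}

for pair, d in distances.items():
    print(pair, f"{d:.3f}", "different" if different[pair] else "similar")
```

With these numbers, "low" and "medium" fall in the same group, while "high" stands apart from both.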

In The Unscrambler®, there are three specific outputs for the multiple comparison test:

• A table of distances, that gives the two-by-two distance between the levels.
• A group table, that indicates the different grouping between the levels.
• A plot, that shows the levels in their group. More information in the plot section.

### Response surface analysis using classical methods

The models for Central Composite (CC) and Box-Behnken (BB) designs are:

• Main effects and interactions (two- or three-variable interaction) and quadratic and cubic terms
• Main effects and interactions (two- or three-variable interaction) and quadratic terms

The models for mixture designs are:

• Full cubic, this is similar to main effects and interactions (two- or three-variable interaction) and quadratic terms.
• Special cubic, this is similar to main effects and interactions (two- or three-variable interaction). However, because the model has a closure constraint, quadratic terms are only partially included.

In addition, the response surface itself can be studied. Maximum, minimum or saddle points can be detected by varying the conditions used to plot the response surface. More information on how to vary the conditions can be found in the RS table section of the plot interpretation page.
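
For a quadratic model in two variables, the stationary point can also be located and classified analytically. The sketch below assumes a model of the form y = b0 + b1·x1 + b2·x2 + b12·x1·x2 + b11·x1² + b22·x2². The coefficient names and example values are hypothetical, and this is not a description of how the software draws the surface.

```python
def stationary_point(b1, b2, b12, b11, b22):
    """Locate and classify the stationary point of
    y = b0 + b1*x1 + b2*x2 + b12*x1*x2 + b11*x1**2 + b22*x2**2.
    """
    # Setting both partial derivatives to zero gives a 2x2 linear system:
    #   2*b11*x1 +   b12*x2 = -b1
    #     b12*x1 + 2*b22*x2 = -b2
    det = 4.0 * b11 * b22 - b12 ** 2
    if det == 0:
        raise ValueError("no unique stationary point (ridge or plane)")
    # Cramer's rule
    x1 = (-2.0 * b22 * b1 + b12 * b2) / det
    x2 = (-2.0 * b11 * b2 + b12 * b1) / det
    if det < 0:
        kind = "saddle"
    elif b11 > 0:
        kind = "minimum"
    else:
        kind = "maximum"
    return x1, x2, kind


# Hypothetical coefficients, for illustration only
print(stationary_point(b1=2.0, b2=4.0, b12=0.0, b11=-1.0, b22=-2.0))
print(stationary_point(b1=0.0, b2=0.0, b12=0.0, b11=1.0, b22=-1.0))
```

A stationary point that falls outside the experimental region should be read with care, since the model is only supported by data inside the design.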

### Analysis with PLS regression

All designs can be analyzed with PLS; for D-optimal designs, it is the only possibility.

##### Settings

The settings are done automatically.

The number of PCs tested is the maximum number of PCs.

The analysis is done on the real-value matrix, with all X-variables weighted with the 1/std option.

The response variables are weighted with the same 1/std option.

A full cross-validation is run to validate the model.

An uncertainty test is run on the optimal number of components to obtain the jack-knife test results.
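
To make two of these settings concrete, the sketch below applies a 1/std weighting to the columns of a small matrix and lists the segments of a full cross-validation. It assumes that 1/std uses the sample standard deviation and that full cross-validation means leaving out one sample at a time. The function names and data are hypothetical, and centering and the PLS fit itself are not shown.

```python
from statistics import stdev


def weight_columns(matrix):
    """Divide every column by its standard deviation (the 1/std option)."""
    columns = list(zip(*matrix))
    stds = [stdev(col) for col in columns]   # assumed: sample standard deviation
    return [[value / s for value, s in zip(row, stds)] for row in matrix]


def full_cross_validation_segments(n_samples):
    """Yield (training indices, left-out index), one sample left out per segment."""
    for left_out in range(n_samples):
        yield [i for i in range(n_samples) if i != left_out], left_out


# Hypothetical real-value design matrix: temperature and pH
X = [[20.0, 3.0], [40.0, 3.0], [20.0, 5.0], [40.0, 5.0]]
Xw = weight_columns(X)
segments = list(full_cross_validation_segments(len(X)))

print(Xw[0])
print(segments[0])
```

After weighting, every column has unit standard deviation, so variables measured on different scales contribute equally to the model.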

##### Results

In addition to the usual results of PLS, some special plots are useful in DoE. They are all computed from the Jack-knife uncertainty test developed by Dr. Harald Martens.

• Estimated p-values: these should be used in the same way as the p-values from the ANOVA results.
• Weighted Beta coefficients with their uncertainty limits: the weighted B-coefficients are used to determine which effects are the most important; the bigger the coefficient, the more important the effect. The uncertainty test shows which coefficients are stable: a stable coefficient has an uncertainty interval that does not cross the zero line.
• Weighted Beta coefficient table: this table summarizes the results for the B-coefficients, with both estimated p-values and uncertainty limits.
• PLS ANOVA summary table: this table shows the effect values and the associated p-values.

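
The idea behind the stability check can be sketched with a generic jack-knife estimate: the coefficient from the full model is compared with the coefficients from the cross-validation sub-models. The variance scaling `(m - 1) / m`, the interval half-width `k = 2` and the numbers are assumptions made for this illustration; the text does not give the exact formula used by the software.

```python
from math import sqrt


def jackknife_uncertainty(b_full, b_segments, k=2.0):
    """Return (standard error, lower limit, upper limit, is_stable) for one coefficient."""
    m = len(b_segments)
    # Squared deviations of the sub-model coefficients from the full-model
    # coefficient, with an assumed (m - 1) / m scaling
    variance = (m - 1) / m * sum((b_full - b) ** 2 for b in b_segments)
    se = sqrt(variance)
    lower, upper = b_full - k * se, b_full + k * se
    # Stable when the uncertainty interval does not cross zero
    return se, lower, upper, (lower > 0 or upper < 0)


# Hypothetical weighted coefficients from 4 cross-validation sub-models
stable = jackknife_uncertainty(0.80, [0.78, 0.83, 0.79, 0.81])
unstable = jackknife_uncertainty(0.05, [-0.10, 0.20, 0.15, -0.05])
print(stable)
print(unstable)
```

The first coefficient varies little between sub-models and its interval stays above zero; the second changes sign between sub-models and its interval crosses zero.
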
##### Advantages and drawbacks of using PLS

The analysis with PLS provides some advantages:

• The analysis is not limited by the number of experiments or by the degrees of freedom of the model. It is therefore possible to analyze more effects without having to deal with the confounding pattern.
• If there are several response variables, their covariance is taken into account in the model.
• In addition to the results given by the effect or response surface analysis, PLS provides results for detecting outliers. Because the experiments (except the center one) are often not replicated, it is difficult to locate an outlier with classical methods.

There are some drawbacks as well:

• PLS is not a statistical method. It does not provide exact p-values, only estimates, even though these come from a t-test.