Martens Uncertainty Test

New and unique method for uncertainty testing gives safer interpretation of models

Users of multivariate modeling methods are often uncertain when interpreting models. Frequently asked questions are: Which variables are significant? Is the model stable? Why is there a problem?

Dr. Harald Martens has developed a new concept for uncertainty testing based on cross validation, Jack-Knifing and stability plots. CAMO is the first software supplier to implement this new method via The Unscrambler® ver 7.5, which was released in March 1999.

This application note explains how the method works and shows how to use it through a few examples.

How does Martens' uncertainty test work?

You validate your PLS, PCR, or PCA model with cross validation, choosing full cross validation or segmented cross validation as you find appropriate for your data. When you have chosen the optimal number of PLS- or Principal Components (PC's), tick Jack-Knifing in The Unscrambler® modeling dialog box for the optimal number of PC's.

A number of sub-models are created through the cross validation option. Each sub-model is based on the samples that were not held out in the corresponding cross validation segment. For each sub-model a set of model parameters (B-coefficients, scores, loadings, and loading weights) is calculated. In addition, an overall model is generated, based on all the samples.
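The sub-model idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not The Unscrambler®'s implementation: `pls1_coefficients` is a hypothetical helper using a textbook PLS1 algorithm, the data are random numbers, and full (leave-one-out) cross validation is assumed.

```python
import numpy as np

def pls1_coefficients(X, y, n_components):
    """Textbook PLS1 on mean-centered data; returns one B-coefficient per X-variable."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))  # loading weights
    P = np.zeros((n_vars, n_components))  # X-loadings
    q = np.zeros(n_components)            # y-loadings
    for a in range(n_components):
        w = Xc.T @ yc
        w /= np.linalg.norm(w)
        t = Xc @ w                        # scores for component a
        tt = t @ t
        p = Xc.T @ t / tt
        q[a] = yc @ t / tt
        Xc = Xc - np.outer(t, p)          # deflate X
        yc = yc - q[a] * t                # deflate y
        W[:, a] = w
        P[:, a] = p
    return W @ np.linalg.solve(P.T @ W, q)

def jackknife_submodels(X, y, n_components):
    """Fit the overall model plus one sub-model per held-out sample."""
    b_all = pls1_coefficients(X, y, n_components)
    b_sub = np.array([
        pls1_coefficients(np.delete(X, i, axis=0), np.delete(y, i), n_components)
        for i in range(X.shape[0])
    ])
    return b_all, b_sub

# Hypothetical data: 34 samples, 26 X-variables, 2 components.
rng = np.random.default_rng(0)
X = rng.normal(size=(34, 26))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=34)
b_all, b_sub = jackknife_submodels(X, y, n_components=2)
```

Here `b_sub` holds one row of B-coefficients per sub-model, and `b_all` holds the coefficients of the overall model. With segmented cross validation, each sub-model would leave out a whole segment instead of a single sample.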

Uncertainty of regression coefficients

For each variable i we can calculate the difference between its B-coefficient in a sub-model, B_i, and its B-coefficient in the overall model, B_all. The sum of the squares of these differences over all the sub-models is used to estimate the variance of the B_i estimate for the ith variable. Using a t-test, we can calculate the significance of the estimate of B_i. Thus, we can present the resulting regression coefficients with uncertainty limits that correspond to 2 standard deviations under ideal conditions. From this information we can determine which variables are significant.
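As a rough sketch of this calculation, the function below turns sub-model coefficients into standard deviations and uncertainty limits. Two points are assumptions rather than details from this note: the scaling factor (M - 1)/M is the classic delete-one jackknife choice, and the t-test is simplified to checking whether the limits of plus or minus 2 standard deviations exclude zero. The function name and the small data set are made up for illustration.

```python
import numpy as np

def jackknife_uncertainty(b_all, b_sub):
    """Estimate per-variable uncertainty from sub-model coefficients.

    b_all: (n_vars,) coefficients of the overall model.
    b_sub: (n_submodels, n_vars) coefficients of the sub-models.
    """
    m = b_sub.shape[0]
    # Sum of squared differences between each sub-model and the overall model.
    sq_diff = ((b_sub - b_all) ** 2).sum(axis=0)
    # Assumed scaling: classic delete-one jackknife factor (m - 1) / m.
    std = np.sqrt(sq_diff * (m - 1) / m)
    lower = b_all - 2.0 * std
    upper = b_all + 2.0 * std
    # Simplified significance rule: the limits must not include zero.
    significant = np.abs(b_all) > 2.0 * std
    return std, lower, upper, significant

# Made-up example: 2 variables, 4 sub-models.
b_all = np.array([1.0, 0.5])
b_sub = np.array([
    [1.1, 0.0],
    [0.9, 1.0],
    [1.0, 0.2],
    [1.0, 0.8],
])
std, lower, upper, significant = jackknife_uncertainty(b_all, b_sub)
```

In this example the first variable varies little between sub-models and comes out as significant, while the second varies so much that its limits include zero.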

Uncertainty of loadings and loading weights

The same can be done for other model parameters. However, there is a rotational ambiguity in the latent variables of bilinear models. To be able to compare all the sub-models correctly, Martens has chosen to rotate them. Consequently, we can also determine uncertainty limits for these parameters.
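One common way to remove such rotational ambiguity is an orthogonal Procrustes rotation of each sub-model towards the overall model. This note does not say which rotation Martens uses, so the sketch below is an assumption chosen for illustration; `rotate_towards` and the random loadings are hypothetical.

```python
import numpy as np

def rotate_towards(P_sub, P_all):
    """Rotate sub-model loadings towards the overall model.

    Finds the orthogonal matrix R that minimizes the Frobenius norm of
    (P_sub @ R - P_all) and returns the rotated loadings together with R.
    """
    U, _, Vt = np.linalg.svd(P_sub.T @ P_all)
    R = U @ Vt
    return P_sub @ R, R

# Hypothetical loadings: 26 variables, 2 components.
rng = np.random.default_rng(0)
P_all = rng.normal(size=(26, 2))

# Simulate a sub-model whose components are rotated and sign-flipped.
theta = 0.4
R0 = np.array([[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]]) @ np.diag([1.0, -1.0])
P_sub = P_all @ R0.T

P_rotated, R = rotate_towards(P_sub, P_all)
```

After rotation, the remaining differences between `P_rotated` and `P_all` reflect genuine instability rather than arbitrary axis orientation. In this constructed case there is none, so the two match.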

Stability plots

The results of these calculations can be visualized as stability plots in scores, loadings, and loading weights plots. They can be used to understand the influence of specific samples and variables on the model, and explain, for example, why a variable with a large regression coefficient is not statistically significant. This will be illustrated in the example that follows.

Easier to interpret important variables in models with many components

Models with many components, three, four or more, may be difficult to interpret, especially if the first ones don't explain much of the variance. For example, if each of the first 4-5 PCs explains 15-20% of the variation, the PC1/PC2 plot is not enough to understand which variables are the most important. However, Martens' uncertainty test automatically shows you the significant variables in the many-component model, and your interpretation is far easier.

Remove non-significant variables for more robust models

Non-significant variables often display non-structured variation, i.e. noise. Their removal will result in a more stable and robust model. Usually the prediction error decreases as well. Thus, after identifying the significant variables, use The Unscrambler® function Recalibrate with Marked to re-estimate a simpler model. (Verify that it is indeed better.)
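The "verify that it is indeed better" step amounts to comparing the cross-validated prediction error before and after removing variables. The sketch below shows that comparison outside The Unscrambler®. As a loud simplification, it uses ordinary least squares as a stand-in for the PLS model, and the function names, data, and `significant` mask are all made up.

```python
import numpy as np

def loo_rmse(X, y):
    """Leave-one-out RMSE of prediction for a least-squares model with intercept."""
    n = X.shape[0]
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        A = np.column_stack([np.ones(keep.sum()), X[keep]])
        coef, *_ = np.linalg.lstsq(A, y[keep], rcond=None)
        # Predict the held-out sample and store its residual.
        errors[i] = y[i] - (coef[0] + X[i] @ coef[1:])
    return float(np.sqrt(np.mean(errors ** 2)))

def compare_after_removal(X, y, significant):
    """Return (error with all variables, error with significant variables only)."""
    return loo_rmse(X, y), loo_rmse(X[:, significant], y)

# Made-up data: only the first two of six variables carry information about y.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=20)
significant = np.array([True, True, False, False, False, False])
rmse_full, rmse_reduced = compare_after_removal(X, y, significant)
```

If `rmse_reduced` is not lower than `rmse_full`, the simpler model is not an improvement in prediction, although it may still be easier to interpret.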

Spectroscopic calibrations work better if you remove noisy wavelengths. Some models may be improved by adding interactions and square terms of the variables. The Unscrambler® has a feature to do this automatically. However, many of these terms will be irrelevant. Apply Martens' uncertainty test to identify and keep only the significant ones.
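To show what adding interactions and square terms means for the data table, here is a small sketch that expands an X-matrix by hand. It is not The Unscrambler®'s feature; the function name and the naming of the new columns are assumptions.

```python
import numpy as np
from itertools import combinations

def add_interactions_and_squares(X, names):
    """Append all pairwise products and all squares to the original variables."""
    n_vars = X.shape[1]
    columns = [X[:, j] for j in range(n_vars)]
    labels = list(names)
    # Interaction terms: one product column for every pair of variables.
    for j, k in combinations(range(n_vars), 2):
        columns.append(X[:, j] * X[:, k])
        labels.append(f"{names[j]}*{names[k]}")
    # Square terms: one column per variable.
    for j in range(n_vars):
        columns.append(X[:, j] ** 2)
        labels.append(f"{names[j]}^2")
    return np.column_stack(columns), labels

X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
X_expanded, labels = add_interactions_and_squares(X, ["A", "B", "C"])
```

Three variables already grow to nine columns, and 26 variables would grow to 377, which is why a significance test is useful for pruning the expanded set.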

Example: Work environment study

We used PLS1 to model 34 data samples from a questionnaire about feeling good at work (Y) with 26 questions (X) about repetitive tasks, inspiration from the boss, helpful colleagues, positive feedback from boss, etc. The model had 2 PC's assessed with full cross validation and Jack-Knifing. Thus, the cross validation created 34 sub-models, where one sample had been left out of each sub-model.

Regression overview Feel good at work

Significant variables

When plotting the regression coefficients we can also plot the uncertainty limits:

X11 has a large regression coefficient but also a large uncertainty limit.

The function "Mark significant variables" clearly shows which variables had a significant effect on Y:

15 X-variables out of 26 were significant. X11 ("Do you get help from your colleagues?") was not significant, even though it had a large B-coefficient. How come?

Stability in Loading weights plots

By clicking the icon for Stability plot when studying Loading weights, we get this picture:

Stability Loading Weight Plots

For each variable you see a swarm of its loading weights, one from each sub-model. There are 26 such X-loading weights swarms. In the middle of each swarm you see the loading weight for the variable in the total model. All points within a swarm should lie close together. It is not unusual for the uncertainty to be larger (i.e. the spread is larger in the swarm) for variables close to the origin, i.e. these variables are non-significant.

Zooming in on variable X11.

If a variable has a sub-loading far away from the rest of the loadings in its swarm (e.g. X11 in the upper right quadrant), then the sample that was left out of that sub-model has had a large influence on this variable. Without this sample the loading weight for this variable would be very different, and the model itself might be different. It is quite possible that this sample had an extreme value in the variable in question. Consequently, the distribution may be skewed. As a result, the estimate of the loading weight for this variable is uncertain, and it becomes non-significant.
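The visual check described above can also be approximated numerically: find the point in a swarm that lies farthest from the swarm's center and note which sample was left out of that sub-model. The sketch below is one possible way to do this, not a feature of The Unscrambler®; the function name, the use of medians, and the numbers are assumptions.

```python
import numpy as np

def most_deviating_submodel(swarm):
    """Find the sub-model whose loading weight lies farthest from the swarm center.

    swarm: (n_submodels, n_components) loading weights of one variable,
           where row i comes from the sub-model with sample i left out.
    Returns the row index and its distance relative to the typical distance.
    """
    center = np.median(swarm, axis=0)
    dist = np.linalg.norm(swarm - center, axis=1)
    typical = np.median(dist)
    idx = int(np.argmax(dist))
    ratio = float(dist[idx] / typical) if typical > 0 else float("inf")
    return idx, ratio

# Made-up swarm for one variable in a two-component model.
swarm = np.array([
    [0.30, 0.20],
    [0.31, 0.20],
    [0.29, 0.20],
    [0.60, 0.50],  # sub-model without sample 3 gives a very different loading weight
    [0.30, 0.21],
    [0.30, 0.19],
])
idx, ratio = most_deviating_submodel(swarm)
```

A large `ratio` suggests that the sample left out of sub-model `idx` deserves a closer look, for example by plotting the variable against Y.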

We can see this by plotting X11 versus Y:

Only two departments do not consider their colleagues to be helpful. These two samples strongly influence the models and twist them. Without these two samples, variable X11 has a very small variation.

Stability in Scores plots

Stability: Scores Plots

For each sample there is a swarm of its scores from each sub-model. There are 34 sample swarms. In the middle of each swarm is the score for the sample from the overall model. The circle shows the projected or rotated score of the sample for the sub-model without the sample.

Zooming in on sample 23

If this encircled sample is far away from the rest in the swarm, the model without this sample is very different, i.e. this sample has strongly influenced all other sub-models due to its uniqueness. In the example above, all samples seem OK and the model seems robust.


Martens' uncertainty test is ideal for making better, simpler, and more robust models. It provides measures of the uncertainties of the parameter estimates in PCA, PCR, and PLS models. The Unscrambler® has easy-to-use features that automatically display these uncertainty limits, mark significant variables, and display stability plots of scores, loadings, and loading weights.