Food quality is multivariate. It depends on ingredients and process parameters and is described by its many attributes: taste, consistency, smell, fat content, pH, yield etc.
When developing new products or improving existing ones, all the important attributes have to be right at the same time. That means the amounts of all ingredients and the settings of all process parameters have to be just right. To find the right levels you must experiment, and this is often a time-consuming, costly and frustrating process. Experimental design combined with multivariate analysis is a powerful tool for doing this work efficiently.
This application shows how experimental design, ANOVA, PCA and Response Surface Analysis, combined with sensory analysis, can be used iteratively to work systematically towards a clear goal.
This application was developed by CAMO and a customer. A dried soup manufacturer wanted to copy a product from a competitor. They had already tried traditional methods, varying only one experimental factor at a time and keeping the rest constant. After they had performed over 50 experiments without succeeding, CAMO suggested another approach. We started from scratch and used experimental design combined with the multivariate analysis available in the new software package Guideline.
The project was carried out not only to find a better recipe, but also to teach the company's product developers to use the new methods.
In this case experimental design was used to generate different samples for comparison and to find which ingredients influence the soup's sensory attributes. Sensory profiling was considered the most relevant measure of soup quality.
The company's product developers conducted a brainstorming session to identify which ingredients might affect quality. It was decided that the soup should contain 20 ingredients, and 12 of them (see table 1) were chosen for the experiments. Each factor was varied from a low to a high level in different combinations. This is called a factorial design.
The names and values of the ingredients have been changed for confidentiality.
Table 1: Design variables and levels
|Factor||Description||Low value||High value|
|A||Milk product||2 %||6 %|
|B||Milk product||1 %||5 %|
|C||Milk product||1.5 %||2.5 %|
|D||Milk product||1.5 %||6 %|
|E||Vegetable||0 %||2 %|
|F||Vegetable||0 %||5 %|
|G||Vegetable||1 %||3 %|
|H||Vegetable||3 %||6 %|
|I||Spice||0.2 %||0.5 %|
|J||Spice||0 %||1 %|
|K||Spice||0.2 %||0.4 %|
|L||Spice||0.1 %||0.3 %|
If all combinations of low and high levels of the 12 ingredients had to be tested, 4096 (2^12) experiments would be required. This was of course too expensive and time consuming. Instead, a fractional factorial design with only 16 experiments was used.
In addition to the 16 experiments the center point of the design was included twice. This resulted in 18 productions.
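The run counts above can be reproduced with a short sketch. It builds a 16-run two-level design for 12 factors from a full factorial in four base factors, and sets each of the remaining eight factors equal to an interaction of the base factors. The choice of generators here is an assumption made for illustration; the text does not say which generators the project used. The function names `fractional_factorial_16` and `to_percent` are hypothetical.

```python
from itertools import combinations, product

# Low and high levels (percent) for the 12 design variables, from Table 1.
LEVELS = {
    "A": (2, 6), "B": (1, 5), "C": (1.5, 2.5), "D": (1.5, 6),
    "E": (0, 2), "F": (0, 5), "G": (1, 3), "H": (3, 6),
    "I": (0.2, 0.5), "J": (0, 1), "K": (0.2, 0.4), "L": (0.1, 0.3),
}


def fractional_factorial_16(n_factors):
    """Return a 16-run two-level design (coded -1/+1) for 4 to 15 factors.

    The first four columns form a full 2^4 factorial. Each further column is
    the product of a subset of those four columns (illustrative generators,
    not necessarily the ones used in the original project).
    """
    base = 4
    # Subsets of the base columns: 6 pairs, then 4 triples, then 1 quadruple.
    generators = [c for size in range(2, base + 1)
                  for c in combinations(range(base), size)]
    extra = n_factors - base
    if not 0 <= extra <= len(generators):
        raise ValueError("n_factors must be between 4 and 15")
    runs = []
    for levels in product((-1, 1), repeat=base):
        row = list(levels)
        for gen in generators[:extra]:
            value = 1
            for idx in gen:
                value *= levels[idx]
            row.append(value)
        runs.append(row)
    return runs


def to_percent(coded_row):
    """Translate coded levels (-1, 0, +1) into ingredient percentages."""
    recipe = {}
    for (name, (low, high)), coded in zip(LEVELS.items(), coded_row):
        recipe[name] = {-1: low, 0: (low + high) / 2, 1: high}[coded]
    return recipe


design = fractional_factorial_16(len(LEVELS))
center_points = [[0] * len(LEVELS) for _ in range(2)]  # center point, run twice
productions = design + center_points

print(2 ** len(LEVELS), len(design), len(productions))  # 4096 16 18
print(to_percent(productions[-1])["A"])  # 4.0
```

Every column in such a design is balanced (eight low and eight high runs) and orthogonal to every other column, which is what allows the main effect of each ingredient to be estimated from only 16 runs. The price is that main effects are confounded with interactions.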
The sensory quality of the 18 soups, plus the best soup developed by traditional methods and the competitor's soup, was evaluated by a trained panel of 8 judges using 12 sensory attributes. These were quantitative evaluations on a scale from 1 to 9.
The average ratings for each of the 20 soups are shown in table 2. Some of the variable names are changed because the customer wishes this information to remain confidential.
Table 2: Results from sensory evaluation of 20 soups
How can these results be used to find the soup most similar to the competitor's (Comp.) for all sensory attributes?
A Principal Component Analysis (PCA) was run on the results from table 2. PCA treats all variables simultaneously. The method projects the main information (variation) in the data onto a small number of new variables, called Principal Components, plus errors or residuals. The residuals represent the remaining variability, which is interpreted as noise.
The attributes were standardized to unit variance prior to the analysis. To avoid blowing up noise in the data, we only included those sensory attributes that were found to differentiate between the soups. We excluded onion taste because the standard deviation of this measurement for the replicated center samples was almost as large as the standard deviation for the design samples.
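The screening and scaling step can be sketched as follows. The ratings, the 0.8 cut-off and the function names are all made up for illustration; the text only says that onion taste was excluded because its standard deviation over the center replicates was almost as large as over the design samples.

```python
from statistics import mean, stdev


def noise_ratio(design_values, center_values):
    """Replicate standard deviation relative to the spread across the design."""
    return stdev(center_values) / stdev(design_values)


def autoscale(values):
    """Center a column and scale it to unit variance."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical average ratings (scale 1 to 9), NOT the values from table 2.
ratings = {
    "tomato taste": {"design": [2.1, 6.8, 3.5, 7.2, 4.9, 5.6],
                     "center": [5.0, 5.2]},
    "onion taste": {"design": [4.0, 4.6, 4.2, 4.8, 4.4, 4.3],
                    "center": [4.2, 4.55]},
}

THRESHOLD = 0.8  # assumed cut-off for "almost as large"

kept = []
for name, r in ratings.items():
    ratio = noise_ratio(r["design"], r["center"])
    print(f"{name}: replicate SD / design SD = {ratio:.2f}")
    if ratio < THRESHOLD:
        kept.append(name)

# Only the attributes that pass the screen are scaled and passed on to PCA.
scaled = {name: autoscale(ratings[name]["design"]) for name in kept}
```

With only two center replicates the replicate standard deviation is a rough estimate, so a comparison like this is a guide for judgment rather than a formal test.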
Fig. 1 Explained variance
Fig. 2 Score plot using PC 1 and 2
Fig. 3 Loading plot using PC 1 and 2
Fig. 4 Score plot using PC 1 and 3
Fig. 5 Loading plot using PC 1 and 3
First we decided how many Principal Components (PCs, new variables) to use for interpretation of the results. Figure 1 shows that 3 components explained a total of 81% of the variation. Using more components would not considerably increase the explained information in the data, and we could risk overfitting.
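A minimal sketch of how an explained-variance curve like the one in figure 1 can be computed is given below. It assumes NumPy is available and uses randomly generated data in place of the sensory table, so the printed percentages will not match the 58%, 11% and 12% (81% in total) reported for the soups. The function name `pca` and the 20 x 11 data shape (20 soups, 12 attributes minus onion taste) are assumptions made for the example.

```python
import numpy as np


def pca(X, n_components):
    """PCA on autoscaled data via SVD, so that Xs = scores @ loadings.T + residuals."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # unit variance per attribute
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]    # sample coordinates (score plot)
    loadings = Vt[:n_components].T                     # attribute coordinates (loading plot)
    residuals = Xs - scores @ loadings.T               # part left unexplained
    explained = s ** 2 / np.sum(s ** 2)                # fraction of variance per component
    return scores, loadings, residuals, explained


# Synthetic stand-in for the sensory table: 20 samples, 11 attributes,
# generated from three underlying directions plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(20, 3))
mixing = rng.normal(size=(3, 11))
X = latent @ mixing + 0.3 * rng.normal(size=(20, 11))

scores, loadings, residuals, explained = pca(X, n_components=3)
cumulative = np.cumsum(explained)

# Cumulative explained variance (%) for the first five components.
print(np.round(100 * cumulative[:5], 1))
```

The number of components is then chosen where the cumulative curve flattens. That remains a judgment call, as in the text, and not something the code decides.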
A map of samples (score plot) for the first two principal components is shown in figure 2. Samples lying close to each other are similar, and samples far from each other are different, with respect to the attributes described by these components. The plot shows that samples 1 and 19 are closest to the competitor's product (Competitor). These two samples are similar to the competitor's product for all sensory attributes explained by two principal components. The soup developed by traditional methods (Traditional) differs more from the competitor's. We can also see that the replicated center samples (cp) are very close to each other, which shows that most responses were measured with adequate precision and that the experimental variation is small.
By interpreting the corresponding plot of variables, figure 3, we can see what characterizes the different samples. The horizontal axis (PC 1) describes 58% of the information (variation) in the data and covers all sensory attributes except tomato taste and fish taste. The vertical axis (PC 2) describes 11% of the information in the data, and this is mainly tomato taste. Fish taste, which is placed in the center of the model, is not described at all using 2 components.
The soup made using traditional methods is fairly similar to the competitor's soup for all attributes described along the first principal component, because Competitor and Traditional lie rather close to each other along that axis. However, along the second principal component the two samples are different: the competitor's soup has more tomato taste than Traditional.
Two components do not give a complete picture of the similarities and differences among the 20 samples, so we decided to use one more component. The plots for component 3 vs. component 1 are shown in figures 4 and 5. Component 3 describes 12% of the variation in the data, which is mainly due to fish taste and to some degree carrot taste. In this plot the sample closest to the competitor's is no. 13, but it was very different from the competitor's soup regarding tomato taste, which is described by component 2. We therefore chose to concentrate on samples 1 and 19. The competitor's product had more fish taste and less carrot taste than sample no. 1, and less fish taste and more carrot taste than sample no. 19. The soup developed by traditional methods was relatively close to sample 19 along component 3, but far away in the PC1/PC2 plot (figure 2).
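The visual comparison across the score plots can also be expressed as distances in score space. The sketch below uses hypothetical score values, chosen only so that they mimic the pattern described in the text. They are not read from figures 2 and 4, and the function name `nearest` is an assumption.

```python
from math import dist

# Hypothetical (PC1, PC2, PC3) scores, NOT taken from the original plots.
scores = {
    "Competitor": (1.0, 1.5, 0.0),
    "1": (1.3, 1.2, -0.8),
    "19": (0.7, 1.3, 0.9),
    "13": (1.1, -1.6, 0.1),
    "Traditional": (1.8, -0.9, 1.0),
}


def nearest(scores, target, components):
    """Rank the other samples by Euclidean distance to `target`,
    using only the listed components (0 = PC1, 1 = PC2, 2 = PC3)."""
    ref = [scores[target][c] for c in components]
    distances = {
        name: dist(ref, [values[c] for c in components])
        for name, values in scores.items()
        if name != target
    }
    return sorted(distances, key=distances.get)


print(nearest(scores, "Competitor", (0, 1)))     # ranking in the PC1/PC2 plot
print(nearest(scores, "Competitor", (0, 2)))     # ranking in the PC1/PC3 plot
print(nearest(scores, "Competitor", (0, 1, 2)))  # ranking over all three components
```

With these made-up numbers, sample 13 ranks first when only PC1 and PC3 are considered but drops to last once PC2 is included. This is the reason a single two-dimensional plot should not be read in isolation.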