Efficient planning, preparation, performance and evaluation of sensory tests (Part 1)
DLG Expert report 3-2012
Author:
- Prof. Dr. Dietlind Hanrieder, Anhalt University of Applied Sciences, Bernburg, Dietlind.Hanrieder@hs-anhalt.de
Contact:
- Bianca Schneider-Häder, DLG Competence Center Food, Sensorik@DLG.org
In cooperation with the DLG Sensory Analysis Committee
Sensory tests are non-measuring tests, also known as attribute tests. Unlike measuring tests (variable tests), they always supply qualitative findings to which figures are only allocated subsequently with the help of a test scale in the case of certain individual test methods (e.g. profile analyses, DLG Test).
While chemical-physical measuring methods (e.g. measurements of pH value, spectrophotometric methods, mass spectrometry) are considered to be precise, sensory tests still suffer to some extent from an image problem. They are considered to be subjective and imprecise, to involve a lot of work and to yield little of value. That is why small and medium-sized firms with limited personnel resources in particular either do without them completely or reinforce these very prejudices with half-hearted “tastings”.
Only correct planning, preparation, performance and evaluation of sensory tests (Figure 1) yield reliable and useful statements and make sensory tests effective instruments for market research, product development and quality assurance. Errors are possible in all these phases and in practice are more likely to be the rule than the exception. That is why there is enormous potential for improvement here.
1. Target definition
First of all it is important to be clear what kind of problem one wishes to solve with a sensory test and what the target of the test is:
- “Measurement” of the sensory properties of foods = sensory analysis (sensory analysis of foods) ➝ Intensities, intensity differences, qualities (human = measuring instrument, like GC, HPLC etc.)
- “Measurement” of the consumer perception = sensory analysis (sensory analysis by individuals) ➝ Recognisability of a difference, defect etc.
- “Measurement” of the consumer preference/acceptance = hedonic (affective) sensory testing
Only after the target has been clarified is it possible to start planning the sensory test, as some tests differ distinctly depending on the target.
2. Test planning
The following points are among those to be addressed when planning the sensory test:
- samples, sampling
- testers
- test conditions
- testing method
- sample preparation, arrangement, presentation
- test protocol/questions
- testing time/process
- number of tests to be conducted per session
- specifications on neutralisation
- warm-up
- independent testing by each tester or contact method
Where applicable, further specifications must be made to ensure exact and reproducible results.
2.1 Samples, sampling
Depending on the target of the sensory test, the random samples to be examined consist either of one or several food samples or of one or several consumer samples.
In the sensory analysis of foods, e.g. a DLG Test, a profile analysis or an In/Out Test, random samples of foods are drawn from one or more populations (batches, varieties, formulation variants, …) and examined. These random samples must be representative of the respective population, which must be taken into account when selecting and drawing the samples. However, if the target of the test is to answer the question of whether consumers would notice a modification in the formulation of a food already on the market, consumers represent the sample to be examined. A representative sample of consumers is therefore necessary in order to be able to draw conclusions for the total population of all consumers (or for a sub-group such as children in the case of a food chiefly intended for and consumed by children). The situation is similar for a preference or acceptance test. Although the food samples must be representative of the respective formulation variant, variety, batch or the like, the sample actually examined in the hedonic test consists of consumers.
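For the food-sample case, one simple way to avoid a biased selection is to draw the units from each population at random. The following is a minimal sketch, assuming that every unit in a batch can be numbered and that simple random sampling is adequate for the product. The batch names, pack identifiers, batch size and sample size are hypothetical.

```python
import random


def draw_units(unit_ids, n, seed=None):
    """Draw n distinct units from one population by simple random sampling."""
    units = list(unit_ids)
    if n > len(units):
        raise ValueError("cannot draw more units than the population contains")
    # A seeded generator makes the draw reproducible for the test records.
    return random.Random(seed).sample(units, n)


# Hypothetical populations: two batches with 200 numbered packs each.
batches = {
    "batch_1": [f"B1-{i:03d}" for i in range(1, 201)],
    "batch_2": [f"B2-{i:03d}" for i in range(1, 201)],
}

# Draw six packs per batch; the sample size is an assumption for illustration.
drawn = {name: draw_units(units, n=6, seed=2012) for name, units in batches.items()}
```

Whether a simple random draw is sufficiently representative depends on the product. For inhomogeneous batches a different sampling scheme may be needed.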
2.2 Testers
The question of which testers are to be assigned to the test therefore also depends on the target of the test. In the case of sensory analysis of foods – depending on the requirements of the test – trained sensory testers or sensory experts are called in. The latter differ from the former in having comprehensive and usually many years of experience with the product or the product group to be examined.
Generally such sensory tests are carried out in a group (test panel) in order to minimise any subjective influences which may exist and at the same time to profit from the different strengths of the individual testers (panel as a “broadband test instrument”). However, it is not always expedient to make a group result the basis for decision-making. For example, if the aim is to find out whether or not a food sample displays a certain sensory defect (e.g. cooked taste, musty note), this defect exists even if only one or a few testers detect it in repeated tests. In this case the tester is an analytical measuring instrument like a gas chromatograph or an HPLC system, and when these are used – if several pieces of equipment with differing measuring sensitivity are available – one would always trust the most sensitive instrument.
More detailed information on the composition and fields of assignment of test panels for sensory food analysis can be found in the DLG Expert reports on panel selection and qualification.
Under no circumstances may trained testers also be assigned for hedonic tests. As a result of their preceding training and the experience gained as sensory testers, they are no longer “normal” consumers. Hedonic tests, e.g. acceptance and preference tests, but also tests that examine consumer perception of e.g. product defects or differences, must always be carried out with untrained consumers as test persons. As both the sensitivity of the sensory perception and sensory preferences and dislikes vary strongly between consumers, relatively large samples are necessary for the statements obtained to be representative.
If, for example, the aim of the test is to examine whether consumers notice a change in the sensory properties of a product already on the market due to a modification of the formulation or the production process, care should be taken to ensure that only consumers who know and consume the product are invited as test persons. On the other hand, if the aim is to obtain information about the popularity or acceptance of a new product, or to learn which of two or more product variants offered is preferred, the target group for which the product is intended should be taken into account when selecting the consumer sample. The random sample is then to be recruited from this group.
In accordance with the specific test purpose, a random consumer sample can be recruited specifically for a certain test or be selected from an existing pool of potential test persons such as is maintained by some large companies and commercial test institutes.
2.3 Test conditions
The selection of the test conditions is also closely linked with the target of a sensory test. For tests that serve analytical goals, the conditions should be selected in such a way that they do not exert any negative influence on the testers in their work, do not falsify the test result and allow reproducible results. This requires appropriate room conditions (see DIN EN ISO 8589:2014-10 “Sensory analysis – General guidance for the design of test rooms”) and technical facilities (e.g. air conditioning) that make it possible to control the test conditions and keep them constant. A professional sensory laboratory is best suited for such purposes.
Such an analytical test environment is not recommended for hedonic tests, as it motivates the testers too strongly to concentrate on the product to be tested and to “analyse” it in sensory terms. Test judgements obtained in this way may differ strongly from those that are obtained under normal consumption conditions. The test conditions should therefore be as similar as possible to a normal eating situation. In this connection the food samples can be given to the testers to take home (Home Use Test), or the test can take place e.g. in a shopping centre, during a consumer trade fair or the like (Central Location Test). However, in these cases any preparation of the product which may be necessary (e.g. packet soup) is difficult to control or realise. Specially equipped stationary or mobile test studios represent a compromise here. Figure 2 shows the connection between test target, sample, testers and test conditions.
2.4 Test method
The selection of the test method also depends on the target of the sensory test. Figure 3 provides an overview of the customary test methods. Other aspects also influence the choice of method, for example:
- sensitivity or performance capability of the method
- quantity of samples available
- number of samples to be tested
- time available
- nature of samples (e.g. spiciness, temperature)
- number and qualification of the testers
For example, the various methods for testing sensory differences differ in their detection sensitivity, as well as in the quantity of samples required, the necessary number of testers, and the test outlay and test duration. In descriptive tests a profile analysis places greater demands on the qualification of the test panel than a simple descriptive test.
2.5 Sample preparation, arrangement and presentation
Considerations regarding sample preparation, arrangement and presentation are also part of the test planning. Looking ahead, it is important to prevent external influences, known as context effects, from influencing the test result and thus falsifying it. Such context effects are caused e.g. by
- another sensory property differing from that to be tested
- the nature of sampling, i.e. the subset drawn
- the packaging of the food/the test vessel
- the sample temperature
- the quantity of samples provided
- the nature of the sample arrangement
- the sequence in which the samples are tested
- the testing environment
- the test manager, other persons or superiors present
- any background knowledge of the testers
For instance, the appearance of a food can influence the perception of other sensory properties. Intensely coloured fruit juices are, for example, rated as more aromatic than those with a less intense colour. In the case of difference tests, it should not be possible to guess an odour, taste or texture difference on the basis of the appearance. Coloured light in the test cabins (red, blue, yellow) can help to mask such differences. It is also possible to darken the test room, or if necessary blindfold the testers. If the purpose is to assess the taste/flavour of natural products such as carrots or apples, texture differences can also influence the result. In such cases it is possible e.g. to grate or rub the samples or even to assess just the juice. An assessment of the taste, independently of the odour of the food, is only possible if nose clips are used.
In the case of inhomogeneous foods, widely differing results may be obtained under certain circumstances depending on the specific subset. That is why sampling must be carried out in such a way that wherever possible comparable samples are made available to the testers.
Furthermore, the samples should not allow any conclusions to be drawn regarding the producer, the brand or the like. Hedonic tests would be falsified by this, as would quality assessments. The samples can be rendered anonymous by filling the products into neutral, identical test vessels or by rendering the original packaging anonymous (which is not possible in the case of packages that are used exclusively by a particular firm/brand). For the purpose of identification the samples are given randomly generated codes of at least three and preferably four to five characters, consisting of figures and/or letters. If the samples are transferred into neutral vessels, these vessels must not have any sensory influence on the samples either.
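The random coding described above can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the function name, the default alphabet of digits and the sample labels are assumptions. Only the minimum length of three characters and the use of figures and/or letters are taken from the text.

```python
import random
import string


def generate_sample_codes(n_samples, length=3, alphabet=string.digits, seed=None):
    """Return n_samples distinct, randomly generated codes of the given length."""
    if length < 3:
        raise ValueError("codes should have at least three characters")
    if n_samples > len(alphabet) ** length:
        raise ValueError("not enough distinct codes for this length and alphabet")
    rng = random.Random(seed)
    codes = []
    while len(codes) < n_samples:
        code = "".join(rng.choices(alphabet, k=length))
        if code not in codes:  # every sample needs its own code
            codes.append(code)
    return codes


# Hypothetical example: four samples, each given a four-character code
# made up of digits and upper-case letters.
samples = ["sample_1", "sample_2", "sample_3", "sample_4"]
codes = generate_sample_codes(
    len(samples), length=4, alphabet=string.digits + string.ascii_uppercase
)
coding_key = dict(zip(samples, codes))  # kept by the test manager only
```

The key that links codes to samples stays with the test manager, so that the testers see nothing but the codes.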
The temperature of the samples must be appropriate for the food, as sensory quality and popularity are closely linked with the right consumption temperature. For example, beer at a temperature of 20 °C would certainly produce poorer results than at refrigerator temperature. If a number of samples are being tested, they must all have the same temperature, as different temperatures can lead to differing perception of the intensity of the individual sensory attributes (profile analysis) or to different assessments of the quality or acceptance of the foods. In the case of difference tests, testers would be able to guess the deviating sample (triangle test) or the sample coinciding with the reference sample (Duo-Trio-Test) on the basis of the temperature.
The quantity of samples made available must be sufficient to solve the test task and allow the testers to verify their results by re-tasting where necessary. At the same time the same quantity of each sample should be presented to the testers. Otherwise under certain circumstances not every tester has the same opportunity to re-taste in order to confirm his/her result. Furthermore, the testers could suppose that the smaller quantity of a sample might possibly be a reflection of a higher price of this sample, which in turn could influence quality or popularity tests.
The arrangement of the samples to be tested, generally on a tray, and the sequence in which the samples of a sample set or several sample sets are tested after one another, must also not have any influence on the test result. For example, the sensory performance capability of a tester might rise in the course of the test, but it can also drop due to fatigue as a consequence of adaptation of the senses.
In the case of quality assessments or popularity tests, samples tested beforehand influence the assessment/evaluation of subsequent samples. For these reasons the samples in a set must always be set up in the same manner in front of each tester, e.g. in a row, and must be tested in a specified manner, e.g. from left to right. The arrangement of the samples in the set must either be determined purely randomly for each tester (e.g. by throwing a die or by a computer), or be balanced within the panel. The latter case means that all possible sample arrangements must be tested equally often. In a triangle test, for example, there are six possible arrangement variants (ABB, BAB, BBA, BAA, ABA, AAB). Accordingly, it is favourable to use a number of testers that is divisible by six. In the case of 30 testers, each of the six arrangement variants would then be tested five times. In a paired comparison test or a paired preference test there are only two variants (AB, BA), so that half of the testers would test each of the two variants. It should also be noted that the allocation of the individual sample sets to the testers must also be carried out on a random basis (e.g. by drawing lots). If several individual samples or sample sets are tested one after the other, their test sequence must also be determined by chance or be balanced. The same applies for the case in which samples are tested in different sessions. As the performance capability of the testers, the test conditions or the like can vary between the sessions, the possibility that the session date/time has a systematic influence on the test result must be ruled out.
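The balanced arrangement with random allocation described above can be illustrated with a short sketch. It assumes a panel of 30 testers with hypothetical identifiers T01 to T30. The function names are chosen for illustration only, and the shuffle stands in for drawing lots.

```python
import random
from collections import Counter
from itertools import permutations


def triangle_arrangements():
    """Return the six distinct serving orders of a triangle test."""
    orders = set(permutations("AAB")) | set(permutations("ABB"))
    return sorted("".join(order) for order in orders)


def balanced_allocation(testers, arrangements, seed=None):
    """Use every arrangement equally often and allocate the sets at random."""
    if len(testers) % len(arrangements) != 0:
        raise ValueError(
            "number of testers must be divisible by the number of arrangements"
        )
    repeats = len(testers) // len(arrangements)
    pool = list(arrangements) * repeats
    random.Random(seed).shuffle(pool)  # random allocation, like drawing lots
    return dict(zip(testers, pool))


testers = [f"T{i:02d}" for i in range(1, 31)]  # hypothetical panel of 30
plan = balanced_allocation(testers, triangle_arrangements())
counts = Counter(plan.values())  # each of the six arrangements occurs five times

# Paired comparison or paired preference test: only two arrangements.
paired_plan = balanced_allocation(testers, ["AB", "BA"])
```

With a number of testers that is not divisible by the number of arrangements, the sketch raises an error instead of producing an unbalanced plan. In practice the alternative named in the text, a purely random arrangement for each tester, could be used in that case.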
The test environment must be conducive to the concentration of the testers and must not influence the test result (see also test conditions above). For instance, uncomfortable climatic conditions and noise are detrimental. Foreign odours, poor illumination of the test rooms, or very bright colours of walls or furnishings also have a falsifying effect. To ensure that the test environment does not have any systematic influence on the test result, it must be kept constant during the test and across different sessions.
The test manager or another person who is present during the test can also trigger context effects. Remarks, facial expressions or gestures can influence the testers in their work and thus falsify the result.
Hierarchical differences in the panel itself can also have negative effects. If both superiors and subordinates sit on the same panel, the latter may find their judgement capacity impaired by the presence of their boss. Such constellations should therefore be avoided.
Context effects can also result from staff in the production department knowing the ingredients used in a food, or where applicable what has gone wrong in the process. In the same way staff from the laboratory know the analysis data. Such background knowledge can lead to the relevant persons assigned as panellists possibly believing that they can taste certain components or perceive fault impressions, even though this may not really be the case. Quality assessments are also influenced by such knowledge. If the company wishes to use the services of an internal test panel for sensory analysis, the panel should therefore as far as possible only consist of persons who do not have such background knowledge, e.g. staff from the administration, maintenance personnel or the like.
Further aspects concerning efficient planning, preparation, performance and evaluation of sensory tests will be provided in DLG Worksheet 04/2012.
Reading list:
- Busch-Stockfisch, M. (ed.): Praxishandbuch Sensorik in der Produktentwicklung und Qualitätssicherung. Loose-leaf collection. Behr’s Verlag, Hamburg
- O’Mahony, M.: Sensory Evaluation of Food. Marcel Dekker Inc., New York and Basel 1986
- Stone, H., Sidel, J. L.: Sensory Evaluation Practices. Academic Press Inc., San Diego, New York
- Derndorfer, E.: Lebensmittelsensorik. Facultas-Verlag, Wien 2006
- DIN and DIN EN ISO standards on sensory analysis
In cooperation with the DLG Sensory Analysis Committee (www.DLG.org/Sensorikausschuss.html)