Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
Thurstonian and
R-Index Analysis
Graham Cleaver
Consumer Perception & Behaviour
Unilever Food & Health Research Institute
Vlaardingen
Sensometrics
July 2008
Unilever Categories & Brands
Spreads Deodorants
Blind Product 1
Is It An ‘A’ or ‘Not A’ ? How Sure? Optional
Control (A) 14 11 8 4 12 1 30
Test (Not A) 10 2 5 8 12
1 3 30
Aim:
• Establish which is most appropriate / most powerful significance test for
the R-Index
• Compare with Thurstonian analysis in terms of power
• Create power curves for difference testing and similarity testing
Thurstonian Model
d’ Boundary
Criteria
-2 -1 0 1 2 3
SAS Code Equal Variance
Parameter
D
F Estimate
Standard
Error
Wald
Chi-Square Pr > ChiSq
d’
Boundary
Intercept 1 1 -1.1642 0.2832 16.9001 <.0001 Criteria
Intercept 2 1 -0.0437 0.2175 0.0404 0.8407
1.0
Approx Approximate 95%
0.9
Parameter Estimate Std Error Confidence Limits
0.8
C1 -0.7355 0.0884 -0.9627 -0.5083 0.7
PDF
0.5
D2 0.6266 0.0641 0.4618 0.7914 Ref
0.4 Test
D3 0.7520 0.0849 0.5337 0.9703 Std Dev = 1.00
0.3 Std Dev = 1.28
0.2
D4 0.6445 0.0988 0.3906 0.8983
0.1
B 1.1825 0.1281 0.8532 1.5117 0.0
-3 -2 -1 0 1 2 3 4
A -0.2451 0.0899 -0.4763 -0.0139 Intensity
Power Depends On Scale Usage
Boundary criteria
evenly separated
0.9
-3 -2 -1 0 1 2 3 4
Intensity
Power
0.8
Boundary criteria
Boundary criteria far apart
close together
0.7 -3 -2 -1 0 1
Intensity
2 3 4
-3 -2 -1 0 1 2 3 4
Intensity
0 1 2 3
Range Of Boundary Criteria
Power Determination: Simulation
0.9
SESAM Model
(approx)
0.8
Triangle Test
0.7 Retasting x 3
0.6
Power
0.5
Triangle Test
0.4 Retasting x 2
0.3
To detect difference of R-Index=65% (d’=0.6)
40 replicates per product
0.2
Sign level = 0.05
One-sided test Triangle Test
0.1
No retasting
Box plot shows mean, middle 50%, middle 90% and range
0.0
Thurstonian R Index (U Stat) R Index (Bi2007) R Index (Bi1995)
Analysis Method
Power Comparison
1.0
No product difference: R-Index=50% (d’=0.0)
40 replicates per product
0.9 Sign level = 0.05
One-sided test
0.8
0.7
0.6
Power
0.5
0.4
0.3
0.2
0.0
Thurstonian R Index (U Stat) R Index (Bi2007) R Index (Bi1995)
Analysis Method
Power Curve Comparison
1.0 To detect difference of
R-Index=65% (d’=0.6)
0.9 Sign level = 0.05
One-sided test
0.8
0.7
0.6
Power
0.5
0.4
0.3
Thurstonian Analysis
0.6
Power
0.5
0.4
0.3
Thurstonian Analysis
Difference test
Difference test Difference test
Similarity test
Aim
Aim Demonstrate, with confidence, that
Demonstrate that the product the product difference is less than
difference is significant pre-specified value
Power: Difference Test
Power: How Many Replicates Are Required ?
True True
R Index R Index
1.0 90 90
70
75
80
85
65
85
0.9
80 40 reps per product will give an 80%
60
0.8 chance detecting a ‘true’ R Index of
65% as significant.
0.7
75
Power
0.4 55
65
0.3
0.2 60
If the ‘true’ R Index is 60% then 30 reps per product will
55 give a 40% chance of getting a significant difference
0.1
50 50
0.0
10 20 30 40 50 60 70 80 90 100
Significance level 0.05 One sided test No. Of Reps Per Product
Power: Similarity Test
100 How Many Replicates Are Required To Give Power = 90%?
If (a) the ‘action standard’ is: demonstrate with
confidence that the R Index is less than 65%
Maximum 'Acceptable' R Index
90
and (b) We estimate that really is no difference
between the products (‘True’ R Index=50)
80 75%
Then 60 reps per product will
70% give 90% chance of being able to
conclude that the products are
70 65% ‘acceptably similar’
60%
60 55%
50%
50
0 100 200 300 400 500 600 700 800 900 1000
Significance level 0.05 One sided test No. Reps Per Product
'True' R Index 50 55 60 65 70 75
Power: Similarity Test
100 How Many Replicates Are Required To Give Power = 90%?
80 75%
70%
Then 600 reps per product
70 65% are required to give 90%
chance of being able to
60% conclude that the products
are ‘acceptably similar’
60 55%
50%
50
0 100 200 300 400 500 600 700 800 900 1000
Significance level 0.05 One sided test No. Reps Per Product
'True' R Index 50 55 60 65 70 75
Individual vs Pooled Analysis
Same as Reference Different to Reference
Subject Sure Not Sure Guess Guess Not Sure Sure Total R Index Std Err
Ref 0 4 1 3 2 0 10
1 78.5 12.6
Test 0 0 0 5 4 1 10
Ref 0 0 4 3 2 1 10
2 76.5 12.7
Test 0 1 0 0 6 3 10
Ref 1 5 3 1 0 0 10
3 84.0 12.6
Test 0 0 6 2 2 0 10
Ref 3 5 0 0 2 0 10
4 77.0 12.9
Test 0 2 5 2 0 1 10
Ref 0 1 3 5 1 0 10
5 84.0 12.6
Test 0 0 0 4 4 2 10
Ref 4 15 11 12 7 1 50
Pooled 73.9 5.7
Test 0 3 11 13 16 7 50
z = 4.21 ***
Alternative Analyses
Pooled Analysis
• Most suited to analysis of one individual, pooling over replicate
assessments
• Or where it can be safely assumed that individual assessors are
the same and using the scale in the same way
New Directions
• Aim: Overall analysis that allows for individual differences in
sensitivity and boundary criteria, and influences of other factors
Poster P13: A statistical model for A-Not A data with and without
sureness. R.H.B. Christensen G. J. Cleaver and P. B. Brockhoff
Summary
R-Index
- Important to use the correct analysis for significance test
- Recommended: Test based on U-Statistic with ties
- Or test in Bi (2007) if table of critical values required
Thurstonian Analysis
- Based on assumption about underlying nature of perception
- More insights on perceptual difference and scoring
- Similar power to R-Index analysis
Power Charts
- To be used according to objective of study
* Difference Test
* Similarity Test
New Directions
- Potential bias with simple pooling of data
- Analysis based on individual R-Index values safer
- New Generalised NL Model – see Poster (Christensen et al)