
ACADEMIC EMERGENCY MEDICINE December 2001, Volume 8, Number 12 1153

EDUCATIONAL ADVANCES

Reliability of the Visual Analog Scale for Measurement of Acute Pain
POLLY E. BIJUR, PHD, WENDY SILVER, MA, E. JOHN GALLAGHER, MD

Abstract. Objective: Reliable and valid measures of pain are needed to advance research initiatives on appropriate and effective use of analgesia in the emergency department (ED). The reliability of visual analog scale (VAS) scores has not been demonstrated in the acute setting where pain fluctuation might be greater than for chronic pain. The objective of the study was to assess the reliability of the VAS for measurement of acute pain. Methods: This was a prospective convenience sample of adults with acute pain presenting to two EDs. Intraclass correlation coefficients (ICCs) with 95% confidence intervals (95% CIs) and a Bland-Altman analysis were used to assess reliability of paired VAS measurements obtained 1 minute apart every 30 minutes over two hours. Results: The summary ICC for all paired VAS scores was 0.97 [95% CI = 0.96 to 0.98]. The Bland-Altman analysis showed that 50% of the paired measurements were within 2 mm of one another, 90% were within 9 mm, and 95% were within 16 mm. The paired measurements were more reproducible at the extremes of pain intensity than at moderate levels of pain. Conclusions: Reliability of the VAS for acute pain measurement as assessed by the ICC appears to be high. Ninety percent of the pain ratings were reproducible within 9 mm. These data suggest that the VAS is sufficiently reliable to be used to assess acute pain. Key words: pain; pain measurement; reproducibility of results. ACADEMIC EMERGENCY MEDICINE 2001; 8:1153–1157

From the Department of Emergency Medicine, Albert Einstein College of Medicine (PEB, WS, EJG), Bronx, NY.
Received February 28, 2001; revision received May 29, 2001; accepted June 1, 2001.
Presented at the SAEM annual meeting, San Francisco, CA, May 2000.
Address for correspondence: Dr. Polly Bijur, Albert Einstein College of Medicine, Kennedy Center Room 920, 1410 Pelham Parkway South, Bronx, NY 10461. Fax: 718-430-8821; e-mail: bijur@aecom.yu.edu

Effective pain management in the emergency department (ED) is a major challenge to emergency physicians. The First International Symposium on Pain Research in Emergency Medicine identified the development of valid and reliable measures of acute pain as a necessary first step in a research agenda designed to guide decisions about analgesia in the ED.1 Without valid (accurate) and reliable (reproducible) instruments, any true effect of treatment can be obscured by measurement error, or ineffective treatments may be erroneously considered therapeutic.

The visual analog scale (VAS) is a valid and reliable measure of chronic pain intensity.2–5 Three studies have demonstrated the validity of the VAS for acute pain measurement among ED patients.6–8 However, little work has been done to assess the reliability of the VAS for measurement of acute pain. The few studies that have explicitly assessed the reproducibility of VAS measures of pain focused on chronic or postoperative pain, and most examined the correlation between repeat VAS measures. A study of a mechanical version of a VAS (a tool with a 10-cm ruler and a marker that the patient moves to the point indicating his or her intensity of pain) used by patients with rheumatoid arthritis found a correlation of 0.88 between two measures taken two hours apart.5 Studies that examined the correlation between a vertically oriented VAS for pain and a horizontally oriented VAS found correlations of 0.99 and 0.91 when they were given within 10 minutes of each other2,3 to patients with a variety of rheumatic diseases.

Although calculation of the Pearson product-moment correlation coefficient has often been used to assess VAS reliability,2–5 this method has been justly criticized as providing an inflated estimate.9–11 A more appropriate formulation of the correlation coefficient12 and an analysis of the actual difference between repeated measures have both been recommended.9

Our primary goal in this study was to assess the reliability of the VAS in acute pain using methodologically appropriate statistical techniques.
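
The criticism of the Pearson coefficient noted above can be illustrated numerically. The following Python sketch is not the authors' analysis: it assumes a one-way random-effects form of the ICC for two ratings per patient (the article does not state which form was used), and the function names and VAS scores are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def icc_oneway(x, y):
    """One-way random-effects ICC for two ratings per subject (assumed form)."""
    n, k = len(x), 2
    subject_means = [(a + b) / 2 for a, b in zip(x, y)]
    grand_mean = sum(subject_means) / n
    # Between-subject and within-subject mean squares from a one-way ANOVA
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    msw = sum((a - m) ** 2 + (b - m) ** 2
              for a, b, m in zip(x, y, subject_means)) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Invented VAS scores in mm: the second rating is always 10 mm higher.
time0 = [10, 30, 50, 70, 90]
time1 = [20, 40, 60, 80, 100]

print(round(pearson_r(time0, time1), 3))   # perfectly correlated ratings
print(round(icc_oneway(time0, time1), 3))  # reduced by the 10 mm offset
print(round(icc_oneway(time0, time0), 3))  # identical ratings
```

With these invented numbers the Pearson coefficient is 1.0 even though no pair of ratings agrees, whereas the ICC is about 0.95 and reaches 1.0 only when the two ratings are identical.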
1154 VAS Bijur et al. RELIABILITY OF VAS FOR ACUTE PAIN

METHODS

Study Design. An observational prospective cohort design was used to assess the reliability of VAS pain measurements. The study was approved by the institutional review boards of the two hospitals that provided patients. Written informed consent forms in English or Spanish were completed by all participants.

Study Setting and Population. All English- and Spanish-speaking patients 18 years of age or older who presented to the ED with acute pain as a component of their chief complaint were eligible for inclusion. Acute pain was operationally defined as pain of recent onset or exacerbation of existing pain of sufficient severity to bring the patient to the ED. Patients with altered mental status or decreased visual acuity that precluded scoring of the VAS were excluded. Data were collected on a convenience sample of patients by trained research associates during the hours of 8:00 AM to 8:00 PM, six days a week over a period of nine months.

Study Protocol. After providing written informed consent, patients were asked to rate pain intensity by placing a mark on a 100-mm VAS. The VAS was horizontally positioned with the extremes labeled "least possible pain" and "worst possible pain." One minute later, patients were asked to rate their pain severity again on a fresh VAS without reference to the first measurement. A minute was chosen as the time interval between paired ratings under the assumption that most pain would not change within a 1-minute period. This procedure was repeated at times 0 and 1 minute, 30 and 31 minutes, 60 and 61 minutes, 90 and 91 minutes, and 120 and 121 minutes for a maximum of two hours (five paired readings) or until the patient left the ED. The first measurement in each of the five 1-minute time intervals is referred to as time 0 and the second measurement as time 1.

Data Analysis. Reliability of the VAS was assessed in two ways. Following classic measurement theory,13 correlations between VAS measures taken 1 minute apart were calculated. However, intraclass correlation coefficients (ICCs) were used rather than Pearson product-moment correlations because the ICC can reach 1.0 only when the two measurements are identical (y intercept of 0, slope of 1.0).12 In contrast, a Pearson product-moment correlation value of 1.0 requires only that the measurements be perfectly correlated, but not necessarily identical. Intraclass correlation coefficients of 0.75 and above are considered to be excellent.12

The magnitude of correlation coefficients is affected by the range of scores included in the sample, increasing as the range of scores increases. To adjust for this, ICCs were calculated separately for VAS scores above and below 50 mm. Multiple regression analyses were performed for each of the time periods to assess whether differences between 1-minute measures of pain on the VAS were associated with sex, age, and location of the pain categorized as abdominal/genitourinary, extremity, and other.

The Bland-Altman analysis consists of calculating the difference between paired pain scores 1 minute apart. The distribution of the differences is examined and the differences are plotted against the average of the two scores under the assumption that the mean is the best measure of the true pain score. In our data the differences between the 1-minute scores were not normally distributed. Therefore, the median and the ranges of differences between measures made 1 minute apart that included 50%, 90%, and 95% of the differences were calculated as recommended for differences that are not normally distributed.14

The number of VAS differences greater than or equal to 10 mm per patient was calculated in order to assess whether the most discordant VAS scores taken 1 minute apart were due to individual patients who could not appropriately perform the task of rating their pain on the VAS.

SPSS (Version 9, SPSS Inc., Chicago, IL) provided all descriptive and inferential statistics reported.

RESULTS

Ninety-six patients participated in the study, providing 432 paired measures on the VAS obtained 1 minute apart. The mean age of the participants was 37 years (range 19–71 years), and 55% were female. The most common locations of pain were the abdomen/genitourinary tract and extremities.

The ICCs between VAS scores 1 minute apart were between 0.95 and 0.98 with very narrow confidence intervals at all five time points (Table 1). As expected, the correlations are lower in each of the two subsets of VAS scores dichotomized at 50 mm. However, all but one of these ICCs (0.72, for ratings of 0–49 mm at 0 and 1 minutes) remain in the range considered to represent excellent reliability.12 There were no significant associations between sex, age, and location of pain and the difference between the VAS scores.

The Bland-Altman plot (Fig. 1) displays the differences between VAS ratings over a 1-minute time interval plotted against the average of the two VAS ratings. The differences between ratings 1 minute apart ranged from −48 mm to 47 mm. The median difference was 0. This suggests that time 1 ratings of pain were not systematically lower or higher than time 0 ratings. Fifty percent of the ratings

TABLE 1. Relationship between Time 0 and Time 1 Pain Measures

Intraclass correlation (r) with 95% CI for all VAS ratings and for VAS ratings in the 0–49 mm and 50–100 mm ranges

               All ratings                0–49 mm                    50–100 mm
Time (Min)     n     r     95% CI         n     r     95% CI         n     r     95% CI

0 and 1        96    0.95  (0.93, 0.97)   35    0.72  (0.51, 0.84)   61    0.85  (0.76, 0.91)
30 and 31      95    0.98  (0.97, 0.99)   36    0.95  (0.91, 0.98)   59    0.89  (0.82, 0.93)
60 and 61      92    0.97  (0.96, 0.98)   38    0.97  (0.95, 0.99)   54    0.84  (0.73, 0.90)
90 and 91      77    0.98  (0.97, 0.99)   35    0.89  (0.80, 0.94)   42    0.94  (0.90, 0.97)
120 and 121    72    0.96  (0.94, 0.97)   36    0.81  (0.67, 0.90)   36    0.84  (0.71, 0.92)

TOTAL          432   0.97  (0.96, 0.98)   180   0.88  (0.84, 0.91)   252   0.87  (0.84, 0.90)

taken 1 minute apart were within 2 mm of each other, 90% were within 9 mm, and 95% were within 16 mm.

It can be seen from Figure 1 that the 1-minute ratings of pain were more reproducible at the extremes of pain intensity than at moderate levels of pain. For the 88 one-minute differences associated with average pain of 20 mm or less, 95% were between −9 and 4 mm. Similarly, 95% of the 130 differences associated with average pain of 80 mm and above were between −7 and 4 mm.

In order to determine whether the most discordant ratings came from a small group of patients who may not have understood or were unable to use the VAS, the VAS differences of 10 mm or more were analyzed separately. Only two patients had three VAS differences 1 minute apart that were 10 mm or higher, ten had two VAS differences 10 mm or higher, and 20 had a single VAS difference 10 mm or higher. Sixteen of the discordant ratings were at baseline, 11 at 30 minutes, six at 60 minutes, five at 90 minutes, and eight at 120 minutes.

DISCUSSION

The VAS is generally regarded as a valid and reliable tool for chronic pain measurement.2–5 Although it appears to be equally valid in acute pain measurement,6–8 to the best of our knowledge, its reliability has not been assessed in this setting. We wished to examine the reliability of the VAS in acute pain, using two different methodologies that would be less likely to inflate its reliability than the traditional Pearson correlation coefficient.9,12 The ICCs (Table 1) suggest that, in the aggregate, the reliability of the VAS for acute pain measurement in the ED setting is high.

An examination of the Bland-Altman plot, however, suggests that the VAS may not be as reliable as the ICCs would indicate. Although 50% of the ratings are within 2 mm of each other, and 90%

Figure 1. Bland-Altman plot: differences between paired visual analog scale (VAS) scores obtained 1 minute apart
by the average of the two scores. Dotted lines indicate the interval that includes 95% of differences between time
0 and time 1 VAS scores; solid lines indicate the interval that includes 90% of differences between time 0 and time
1 VAS scores.
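
The quantities summarized in Figure 1 could be computed along the following lines. This Python sketch is an illustration under stated assumptions, not the authors' SPSS analysis: the function name, the record layout, and the example data are invented, and the sign convention (time 1 minus time 0) is assumed because the article does not state it. The 2, 9, and 16 mm tolerances and the 10 mm discordance threshold are the values reported in the text.

```python
from collections import Counter
from statistics import median


def bland_altman_summary(records, tolerances_mm=(2, 9, 16), discordant_mm=10):
    """Summarize paired VAS ratings.

    records: list of (patient_id, time0_mm, time1_mm) tuples.
    """
    # Assumed sign convention: time 1 minus time 0.
    diffs = [t1 - t0 for _, t0, t1 in records]
    # Pair averages, the x-axis of a Bland-Altman plot.
    averages = [(t0 + t1) / 2 for _, t0, t1 in records]
    n = len(diffs)
    # Share of pairs whose absolute difference is within each tolerance.
    within = {tol: sum(abs(d) <= tol for d in diffs) / n
              for tol in tolerances_mm}
    # Number of discordant pairs (at least discordant_mm apart) per patient.
    discordant = Counter(
        rec[0] for rec, d in zip(records, diffs) if abs(d) >= discordant_mm
    )
    return {
        "median_diff": median(diffs),
        "range": (min(diffs), max(diffs)),
        "within": within,
        "discordant_per_patient": dict(discordant),
        "points": list(zip(averages, diffs)),  # (average, difference) pairs
    }


# Invented example data (patient, time 0, time 1), in mm.
records = [
    ("A", 50, 52), ("A", 60, 45),
    ("B", 10, 10), ("B", 80, 79),
    ("C", 30, 41), ("C", 70, 70),
]
summary = bland_altman_summary(records)
print(summary["median_diff"], summary["range"], summary["within"])
print(summary["discordant_per_patient"])
```

Reporting the median and the share of differences within fixed tolerances, rather than a mean with standard-deviation limits, mirrors the article's choice for differences that are not normally distributed.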

are within 9 mm, there remain about 5% of paired measurements whose differences encroach upon the threshold of the minimum clinically significant difference in pain (approximately 13 mm). This is consistent with the findings of DeLoach et al., who reported that 95% of the differences between acute postoperative pain measured 3 minutes apart were within 18 mm.15 The magnitude of measurement error found in this study and that of DeLoach et al. can be used to explain the consistent finding of discrepancies between changes in VAS scores and verbal descriptions of pain in a small proportion of patients (e.g., a VAS difference indicating a 15-mm decrease in pain described as "a little more pain").6–8

Because most treatment studies select patients with substantial pain, it is of particular interest that the correlation of VAS scores taken 1 minute apart was high, 0.87, for initial VAS scores of 50 mm or higher. Similarly, the greater reproducibility of pain ratings among patients with high rather than moderate levels of pain suggests that even those in extreme pain can make reproducible assessments of acute pain severity. The finding that the reproducibility was greatest for the lowest and highest pain intensities may reflect a floor and ceiling effect. Alternatively, it may simply be easier to quantify the absence of significant pain and the presence of intense pain, than to assess moderate pain.

LIMITATIONS AND FUTURE QUESTIONS

Although the high ICCs and the finding that 90% of the paired ratings were within 9 mm of one another suggest that the VAS is a reliable measure of acute pain for the majority of ED patients, about 5% of the pain ratings did differ, on at least one occasion, by more than what is generally considered to represent a clinically significant difference in pain, approximately 13 mm. Differences in reproducibility on the VAS over a short time frame can reflect several sources of error, including: 1) difficulty in translating the subjective experience of pain into a distance measure on a quantitative scale; 2) inability to make an accurate mark on the VAS due to motor, cognitive, or visual impairment; and 3) lack of sufficient effort to appropriately complete the task due to pain or cultural and other behavioral characteristics. However, all of these patient-related factors would be expected to affect measurements over the entire two-hour time period, not just in one or two half-hour periods when the VAS scores were obtained. Our data indicate that this is not the case, since only two patients had more than two ratings with pain differences of more than 9 mm.

It is also possible that pain can vary over the course of a minute, and that lack of agreement between VAS scores 1 minute apart is not measurement error, but rather due to fluctuating pain, such as might occur with intermittent small bowel obstruction. Due to the episodic nature of pain experienced in the ED setting, it is plausible that some pain measurements obtained in studies comparing different treatment modalities will reflect rapidly changing levels of pain rather than treatment effects. In small studies, this phenomenon might bias conclusions. In large trials, however, randomization should distribute patients with rapidly fluctuating pain equally among treatment groups. We cannot rule out the possibility that some of the larger differences in pain measured 1 minute apart simply reflect random variation, rather than true change in the experience of pain.

The actual measurement properties of a VAS for pain can be best studied with an experimental design in which the pain stimulus is standardized by the investigators. However, the purpose of this study was to assess the reliability of the VAS in the ED in order that conclusions drawn from treatment studies of acute pain using this instrument would be informed by an estimate of its measurement error. In this natural setting pain does fluctuate. Although most pain ratings are within 9 mm of each other, any study of pain in the ED must take this fluctuation into account. This limitation is not apparent from the ICC, but does become evident when the Bland-Altman plot, which graphically displays all the data, is inspected.

There were no significant differences in reliability by location, age, or gender. However, the lack of detail about type of pain other than location limits the ability to assess whether there are any differences in reproducibility of the VAS by type of pain. A comparison of VAS scores and verbal ratings from this data set reported elsewhere8 with Todd et al.'s findings on a group of patients with trauma,6 and Kelly's Australian patients with a mix of different pain etiologies7 indicated similar minimum clinically significant differences despite major differences in the populations studied and casemixes. This suggests that our findings may be generalizable to settings with varying casemixes and populations, although it is possible that measures of certain types of pain may be more or less reproducible than what we found. Similarly, our experience in an urban setting with Spanish- and English-speaking patients may not be generalizable to all settings.

CONCLUSIONS

The findings from this study indicate that the VAS is a highly reliable instrument for measurement of acute pain. For 90% of paired measurements, differences between scores obtained 1 minute apart were 9 mm or less. The clinical significance of this finding is that if a VAS were used to measure change in pain in individual patients, change of 10 mm or more would be likely to indicate a true change in the experience of pain for most patients. For research purposes the findings from this study suggest that mean differences between groups of patients smaller than 10 mm are within the error of the method, and should be interpreted with caution.

For about 5% of measurements, there appeared to be a clinically significant difference between two VAS ratings made by the same patient 1 minute apart.6–8 The pattern of this finding suggests that a change of this magnitude in VAS score may represent clinically significant fluctuation in acute pain occurring in a small proportion of patients over a 1-minute interval.

References

1. Ducharme J. Proceedings from the First International Symposium on Pain Research in Emergency Medicine: Foreword. Ann Emerg Med. 1996; 27:399–403.
2. Downie WW, Leatham PA, Rhind VW, Wright V, Branco JA, Anderson JA. Studies with pain rating scales. Ann Rheum Dis. 1978; 37:378–81.
3. Scott J, Huskisson EC. Vertical or horizontal visual analogue scales. Ann Rheum Dis. 1979; 38:560.
4. McCormack HM, Horne DJL, Sheather S. Clinical applications of visual analogue scales: a critical review. Psychol Med. 1988; 18:1007–19.
5. Gaston-Johansson F. Measurement of pain: the psychometric properties of the Pain-O-Meter, a simple, inexpensive pain assessment tool that could change health care practices. J Pain Symptom Manage. 1996; 12:172–81.
6. Todd KH, Funk KG, Funk JP, Bonacci R. Clinical significance of reported changes in pain severity. Ann Emerg Med. 1996; 4:485–9.
7. Kelly AM. Does the clinically significant difference in visual analog scale pain scores vary with gender, age, or cause of pain? Acad Emerg Med. 1998; 11:1086–90.
8. Libman M, Berkoff D, Lahn M, Bijur P, Gallagher EJ. Independent validation of the minimum clinically important change in pain scores as measured by visual analog scale [abstract]. Acad Emerg Med. 2000; 7:550.
9. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986; 1:307–10.
10. Bland JM, Altman DG. Measurement error and correlation coefficients. BMJ. 1996; 313:41–2.
11. Porter AMW. Misuse of correlation and regression in three medical journals. J R Soc Med. 1999; 92:123–8.
12. Fleiss JL. The Design and Analysis of Clinical Experiments. New York: John Wiley & Sons, 1986.
13. Crocker L, Algina J. Introduction to Classical and Modern Test Theory. New York: Harcourt Brace Jovanovich, 1986.
14. Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res. 1999; 8:135–60.
15. DeLoach LJ, Higgins MS, Caplan AB, Stiff JL. The visual analog scale in the immediate postoperative period: intrasubject variability and correlation with a numeric scale. Anesth Analg. 1998; 86:102–6.
