The current issue and full text archive of this journal is available at

http://www.emerald-library.com

Attribute importance in service quality: an empirical test of the PBZ conjecture in Brazil

Received August 1998
Accepted May 1999

Frederico A. de Carvalho
COPPEAD - Universidade Federal do Rio de Janeiro (UFRJ), Brazil, and
Valdecy Faria Leite
FACC - Universidade Federal do Rio de Janeiro (UFRJ), Brazil
Keywords Service quality, Brazil
Abstract According to the Parasuraman-Berry-Zeithaml conjecture, the greater the importance
of a given quality dimension, the thinner the corresponding tolerance zone would be. This paper
seeks to test the conjecture when attribute items are individually considered. The original data
have been collected to assess the quality of postal services in Brazil. A qualitative stage yielded a
list comprising 39 attribute items. In the quantitative stage the three-column format of a
SERVQUAL questionnaire was employed to permit the computation of importance weights and
tolerance widths for each attribute item. The questionnaire was mailed to a sample of some 5,900
firms. About 10 per cent (540) of mailed questionnaires were returned and were considered valid. The
values obtained for the correlation coefficients were significantly negative and consistently close to
each other. The inverse association between importance and tolerance of service quality attributes
was then accepted. The most interesting consequence of this finding is that simply ordering the
computed width of attributes' zones of tolerance will yield the most important attributes. Other
implications are also discussed.

Introduction
The main objective of this paper is to take up an empirical investigation of the
association between the importance of quality attributes and the width of the
tolerance zone as defined through service expectations. A conjecture stated in
aggregate terms by Parasuraman, Berry and Zeithaml (1991; hereinafter PBZ)
suggested that there must be an inverse association. In other words, the greater
the importance a service user assigns to an attribute, the thinner will be the
tolerance zone defined by the consumer's expectations regarding that same
attribute. In the present paper, the conjecture is approached with a view toward
empirically testing the inverse association suggested by PBZ.
The test relies on a new, simple method for (almost) completely ordering a
list of quality attribute items in terms of their importance. The new method is
presented en route; its main idea is to aggregate a set of partial orderings, each
involving a number of items far smaller than the total number of attributes.
The empirical analysis reported here may also serve to illustrate the method.
After this brief introduction, the conceptual framework that supports the
study is presented and the main aspects relating to the idea of weighing
attributes are set out. The methodology is then described, particularly the
data collection, data analysis and hypothesis testing procedures, followed by
the main findings. In the last section some further aspects are discussed.

The authors gratefully acknowledge the detailed comments from two anonymous IJSIM
referees; the usual caveat applies. They also thank the participants in the CEPEAD-UFMG
research seminar for their comments. Financial support from CNPq, the Brazilian national
research council, and from PAP/CAPES/ANPAD has been decisive to the accomplishment of
the original research project.

International Journal of Service Industry Management, Vol. 10 No. 5, 1999, pp. 487-504.
© MCB University Press, 0956-4233

Service quality measurement: conceptual basis


Many recent authors conceptualize service quality as the result of a
comparison between the consumer's expectations about the service to be
rendered, on the one hand, and the consumer's experience resulting from the
use of that service, on the other hand (Liljander and Strandvik, 1994). Such a
comparison is, in turn, theoretically supported by the so-called Paradigm of
Disconfirmation (or Paradigm of Disconfirmation of Expectations), which is
present both in the literature on overall consumer satisfaction and in specific
references concerning service quality (Evrard, 1993).
In rough terms, according to the disconfirmation paradigm, the consumer
will be satisfied or not depending on whether or not service performance
exceeds her or his expectations about the service. However, different ideas on how
to operationalize the concept of expectation still persist. One of the most
important aspects concerning those alternative operationalizations refers to the
fact that consumer expectations have been considered either as a point, that is,
a determined numerical value, or as a zone, that is, a numerical interval
(Johnston, 1994; Liljander and Strandvik, 1994). As an example of point
expectation, the original SERVQUAL scales may be invoked (Parasuraman et
al., 1985). As an example of interval expectations, consider the most recent
version of the SERVQUAL scales (Parasuraman et al., 1994), where, for each
attribute item, two levels of expectations are defined: the desired service level
and the adequate service level. The interval defined by these two levels is the
so-called zone of tolerance of the attribute item, which represents a set of service
performances that the client considers as satisfactory (Parasuraman et al., 1994;
Berry and Parasuraman, 1991). As Johnston (1994) has shown, the concept of
zone of tolerance is quite useful as a way of exploring the dynamic aspects of
the relationship between service process and service output. It will be shown
here that the concept is also fruitful in terms of information gathering potential.
It may be emphasized, at this point, that, in a sense, the revised format of the
SERVQUAL instrument reflects a new approach to service quality, since
expectations are taken as occurring at two levels. However, it is likely that the
relevance of the revised format lies essentially in the fact that it provides a new
format for the empirical measurement of the quality construct. While
presenting this new format of the SERVQUAL scale, Parasuraman et al. (1994)
also proposed some solutions to the several criticisms (Babakus and Boller,
1992; Carman, 1990; Cronin and Taylor, 1994; Teas, 1993, 1994) quoted in the
literature, both as regards practical aspects and conceptual issues. In fact,
opponents of the five gaps approach are most often concerned about the
measurement model, or rather, the SERVQUAL scale for measuring the service
quality construct.
The first aspect to be mentioned concerns the revised number and format of
columns, which permits the construction of tolerance zones for each attribute
item. The previous two formats corresponded to direct (that is, net)
calculations, by the consumer, of the differences between perceived service and
desired service, and/or between perceived service and the minimum acceptable
level of service.
Another relevant aspect to be considered refers to the diagnostic power of
the scales used, that is, the possibility that, upon applying the (new) scales,
one may be able to diagnose and/or infer results that are relevant from a
managerial viewpoint; for example, the consequences that computing the
width of the tolerance zone might have for the improvement of services
rendered. Among the three types of questionnaire tested by Parasuraman et al.
(1994), only the three-column version was capable of specifically indicating the
tolerance zone along with the perceived level of services in relation to that same
tolerance zone (p. 215). In other words, in terms of management diagnosis the
three-column format makes it easier to visualize the most critical
dimensions, or, as will be argued here, the most critical individual attribute
items. Briefly, the assessment of perceived service together with evaluations of
the two expected levels of service, namely adequate (or minimum) and
desired (or maximum), proves useful in the diagnosis of service
deficiencies, as well as in the process of taking appropriate improvement
initiatives (p. 216).
Thirdly, the three-column format solves the practical difficulty of separately
applying two batteries of scales (respectively, concerning expectations and
perceptions), as is the case in the first versions of the SERVQUAL scales.
Finally, the nonnegative number measuring the width of the tolerance
interval (that is, the desired level minus the adequate level)
is important insofar as, paraphrasing Berry and Parasuraman (1991), the
greater the importance of an attribute item, the smaller that width will be. As a
matter of fact, Berry and Parasuraman (1991) argue in terms of the five
aggregate dimensions of quality services derived from substantial work done
since the original SERVQUAL model was established. Therefore, it seemed
quite natural to extend this conjecture to the case in which the attribute items
are individually considered, which is taken up here.

The question of attribute weighing


Despite the fact that the services literature emphasizes the relevance of
attribute weighing for measuring the determinant attributes of service quality,
the discussion goes on concerning which would be a good method for
obtaining the relative weights of those attributes. The method originally
suggested by Zeithaml et al. (1990) consisted in requesting the respondent to
distribute 100 points among five a priori dimensions associated with service
quality. It is assumed that the attribute items included within a given
dimension will all receive the same weight. Considering that the issue posed
by the generalization of the five quality dimensions is still open to criticism,
inferring the attribute weights on the basis of those a priori dimensions can
also be brought under discussion. For example, Carman (1990) suggested that
the importance scores relative to each attribute be measured directly via
the respondent's perception.
Of course, this suggestion suffers from a limitation of a practical nature, since it
does not seem reasonable to expect that a respondent would easily order, in
terms of importance, a list of some 20 or 30 attribute items. It is not hard to
accept that the longer the list, the more difficult the respondent's task would be.
In the case of the SERVQUAL scales, for example, Parasuraman et al. (1994)
suggest a minimal (or basic) list of 22 attributes, which possibly would be
considered long if it should be applied to samples of people with low or even
average schooling level. At the same time, these authors recognize that in
specific cases (especially in different sectors of service business) the
application of their measuring model might require an extension of that list.
One limitation to such an extension is precisely the fact that longer lists are,
in general, more difficult for respondents to deal with. A way out of the
difficulty would be to require the respondent to perform the ranking of only a
small subset of attribute items and try afterwards to obtain a complete ranking
of attributes through an aggregating procedure. In such general terms, the
issue of finding an optimal subset to solve the total ordering problem is
certainly not trivial.
Keeping this caution in mind, it makes sense to search for substitutes of the
importance-ranking operation. This paper examines whether measuring the
width of the tolerance zone of each attribute item may be taken as a valid
substitute. To perform a real world application, it has been necessary to specify
the smaller subset. A new method of attribute weighing and an example
showing an application to postal services will be described in the next section.

Methodology
The PBZ conjecture was originally stated in aggregate terms to imply that
the greater the importance of a given quality dimension, the thinner the
corresponding tolerance zone would be. As a matter of fact, Berry and
Parasuraman (1991) argue in terms of the five aggregate dimensions of quality
service which the original SERVQUAL model had identified since its first
versions. The idea behind the conjecture may be directly associated with the
nature of the two levels of expectations: desired service and adequate service.
Admittedly customers' tolerance zones should vary for different attributes.
However, for a given attribute, whereas the desired level of service is of a more
stable nature, the adequate level is more open to temporary, transitory or
situational changes. For any customer, a fluctuation in the tolerance zone of an
attribute is more likely due to changes in the adequate level, which moves in
response to situational circumstances, than to movements in the desired level,
which moves incrementally in response to accumulated experience (Zeithaml
and Bitner, 1996, p. 81). Accordingly, those fluctuations may be compared to
one-sided accordion gyration (p. 82). Upward movements in the adequate level
are critical in the sense that, for a given attribute, the customer becomes less
tolerant and the service offer may fall outside the competitive zone, that is,
below the minimum expected level. In this sense, shrinking tolerance zones
may be revealing important attributes to be monitored if the firm wants to stay
in business.
The original inspiration to aggregate several individual attribute items into
a smaller number of dimensions was very likely linked to empirical
considerations and it goes on being discussed on those grounds (e.g. Mels et al.,
1997). Likewise, the same empirical origin may be invoked to motivate the
attributional approach adopted here. In fact, in terms of diagnostic power
(Parasuraman et al., 1994), it will be much easier for a manager to control
specific items than aggregate dimensions which, in questionnaire terms, are
unlikely to be understood in plain English and therefore may have to be explained
to respondents. In addition, as argued by Callan (1997), in some service
contexts (e.g. the hospitality sector) providers are often audited and
accredited on an attribute item basis in order to determine whether current
classification and grading schemes indeed measure those attributes which the
consumers value most (p. 333). In other contexts (e.g. hospital services,
catering) aggregate dimensions have also served as a first step for defining
expanded, more specific lists of quality criteria (Reidenbach and
Sandifer-Smallwood, 1990, p. 48; Johns and Tyas, 1996, pp. 323-4).
Keeping these points in mind, it seemed quite natural to extend the original
PBZ conjecture to the case in which attribute items are individually considered,
which is what is taken up in this paper.
The research hypothesis to be tested here can be enunciated as:
H1: The importance of a service quality attribute is inversely associated to
the width of the corresponding tolerance zone: the more important the
attribute item, the thinner its tolerance zone.
This hypothesis corresponds to the following null hypothesis:
H0: There is no association between the order of importance of quality
attribute items and the width of the attributes' tolerance zones.
In statistical terms, this null hypothesis (the one to be rejected) can be tested
as a hypothesis of null correlation versus the one-sided alternative of negative
correlation, as stated in the original conjecture.

Sample and data collection


The original data employed to perform the test were collected by Leite
(1996) with the purpose of investigating whether there were significant
differences between the quality of services rendered by public postal agencies
vis-à-vis franchised postal agencies. According to Leite's approach, service
quality should reflect the perceptions of the clients of both categories of postal
units.
The instrument utilized for primary data collection was a questionnaire,
whose construction followed established practices. Considering the purpose of
the present work, the main elements of the research questionnaire deserving
mention are the users' levels of expectation about postal service quality and the
importance-rating of the attribute items. A pilot version of the questionnaire
was tested in firms that were users of postal services and then was revised by
three experienced researchers.
Starting from a bibliographical survey, which provided a preliminary list of
quality attributes, a qualitative stage followed. The initial list of attributes was
tested and improved via in-depth interviews with Post Office executives, with
owners of franchised agencies (franchisees), with managers of public agencies,
with public servants who were considered as specialists in postal services, and
with client firms. In this qualitative phase, the main purpose was to obtain
those quality attributes of postal service which were most relevant from the
users' viewpoint and most specific for the Brazilian situation. The final list
comprised 39 attribute items.
In the quantitative stage the three-column format of a SERVQUAL
questionnaire was employed (Parasuraman et al., 1994), each column
corresponding to one of the three service levels: desired, perceived and
acceptable (or minimum tolerable). For each attribute item a nine-point scale
was used for each level. In Figure 1 an example illustrates how the scales were
presented; the wording in the questionnaire, which appears on the example,
closely follows the words used by the authors of the three-column version
(Parasuraman et al., 1994, p. 222). By using this format, the numerical
difference between the score attributed by the respondent to the desired level of
service and the score assigned to the least acceptable level of service can be
computed. It represents the operationalization of the tolerance width of that
service user relative to that attribute item.
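The width computation just described can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the sample scores are hypothetical, and it assumes that each respondent's desired and adequate scores for one attribute item are held in two parallel lists on the nine-point scale. Averaging the individual widths across respondents follows the description given for Table II in the Results section.

```python
from statistics import mean


def tolerance_width(desired, adequate):
    """Average tolerance width for one attribute item.

    desired[i] and adequate[i] are respondent i's scores (1 to 9) for the
    desired and the minimum acceptable service level. The width for one
    respondent is desired minus adequate; the item's width is the mean of
    those differences across respondents.
    """
    if len(desired) != len(adequate):
        raise ValueError("each respondent needs both a desired and an adequate score")
    return mean([d - a for d, a in zip(desired, adequate)])


# Hypothetical scores from four respondents for a single attribute item.
desired_scores = [9, 8, 9, 7]
adequate_scores = [7, 7, 8, 5]
print(tolerance_width(desired_scores, adequate_scores))
```

When every respondent answers both columns, the mean of the differences equals the difference of the two column means, so either route gives the same width.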
Figure 1. Example of three-column scale format

In view of the remaining controversy in the methodological literature (e.g.
Carman, 1990) and of the difficulties mentioned before, this paper follows a
procedure originally proposed by Leite (1996) in order to obtain the attribute
weights which will support the ordering of attributes by importance. This
approach tries to avoid:
- asking respondents to rate each dimension independently (cf. Zeithaml
  et al., 1990, p. 26; Johns and Tyas, 1996, pp. 322, 324);
- inferring the attribute weights from the weights of the aggregate,
  underlying dimensions of service quality (cf. Zeithaml et al., 1990, p. 28);
  and
- requesting the respondent to importance-rank all the 39 attribute items.
For the sake of clarity, the method will be briefly described.
In order to weigh the attribute items via their relative importance, an attribute
list much smaller than the original one should be created so that respondents
might accomplish the partial order. The list must be much smaller in order that
respondents might find it feasible ± both cognitively and emotively ± to order
all the elements in the partial subset of attributes. For example, a list of 38
attributes would produce a complete ordering yet it would be too big. A list
containing just one attribute ± meaning that the respondent would precisely
indicate the most important attribute ± would be much smaller but would
provide very limited information, meaning that the final weights would very
likely present many ties. This fact would render the complete ordering much
less immediate.
The procedure consists of requesting the respondent to rank, in order of
decreasing importance, the six most important attributes among the 39 listed in
the questionnaire. In Figure 2 an example shows the form whereby this
question was introduced in the survey. One possible justification for this
number may be found in the literature on choice among a number of products.
In fact, according to empirical findings reported by Hauser and Wernerfelt
(1990) (see also Lilien et al., 1992, pp. 67, 80), the mean or median size of the
consideration sets they analyzed across some 20 product categories (that is,
the set of brands a consumer will evaluate or search for in a given purchase)
ranges from two to eight brands. Considering that a purchase may arouse more
interest than answering a survey, it was judged that six was an adequate
size for the subset of attributes the respondents should be asked to rank in
order of importance.
Figure 2. Example of the partial-ordering question

For each attribute the number of votes is computed from the whole group of
respondents by successively considering the number of times the attribute was
indicated as the most important attribute, then as the second most important,
then as the third most important and so forth, until the number of responses
assigned to that attribute as the sixth most important. The number of votes a
given attribute obtained as the first most important attribute was assigned
weight 6, the number obtained as the second most important attribute received
weight 5, and so forth, until weight 1 was assigned to the amount
corresponding to the sixth most important place. If each attribute is cited by at
least one of the respondents in one of the six positions in the importance scale
(see Figure 2), then each attribute will have a positive number of votes assigned
to it. If some attribute does not appear in any of those positions, it will receive
zero votes; if this happens for all the attributes, each will receive zero votes and
the method will not apply. Admittedly this would be a very uninteresting list of
attributes.
How is it possible that each attribute will receive at least one vote? This is
surely an empirical question. However, if that was not the case, either the
situation would be uninteresting in the preceding sense or there would likely be
many ties. Note that even in the case where each attribute is cited by at least
one of the respondents in one of the six positions, there still remains the
problem of likely ties. The behavior of the method with respect to ties has been
studied by Carvalho and Leite (1998a).
Once the weighted sum of votes has been computed for each attribute, the
grand total of these sums over all attributes is set equal to 100. Each
attribute's weight is then its own weighted sum expressed as a proportion of
that total, so that the weights add up to 100. In this way, the complete ranking of
the 39 attributes according to their relative importance was made possible. In
fact only one tie occurred, involving the weighted sums of the attributes
classified in the 17th and 18th places. The criterion adopted to break the tie
was to compare the number of votes each tied attribute received as the most
important. Again a tie occurred in this step. Then the number of votes each of the
two tied attributes received as the second most important attribute was
compared, and this made it possible to break the tie.
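The voting procedure described above can be sketched in a few lines. This is an illustrative reading of the method, not the authors' implementation: the function names and the toy counts are hypothetical, and the tie-break is generalized slightly by comparing first-place votes, then second-place votes, and so on through all six positions, whereas the text only needed the first two.

```python
TOP_K = 6  # each respondent ranks the six most important attributes


def weighted_votes(position_counts):
    """Weighted sum of votes for one attribute.

    position_counts[p] is the number of respondents who placed the attribute
    in position p (0 = most important, 5 = sixth most important).
    Position 0 carries weight 6 and position 5 carries weight 1.
    """
    return sum(n * (TOP_K - pos) for pos, n in enumerate(position_counts))


def importance_weights(counts):
    """Rescale the weighted sums so that all weights add up to 100."""
    totals = {attr: weighted_votes(c) for attr, c in counts.items()}
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no attribute received any vote; the method does not apply")
    return {attr: 100 * t / grand_total for attr, t in totals.items()}


def importance_order(counts):
    """Attributes from most to least important.

    Ties in the weighted sum are broken by the number of first-place votes,
    then second-place votes, and so on.
    """
    return sorted(
        counts,
        key=lambda attr: (weighted_votes(counts[attr]), *counts[attr]),
        reverse=True,
    )


# Vote counts for "Safety in postal transactions", taken from Table I.
safety = [112, 51, 25, 26, 23, 27]
print(weighted_votes(safety))

# Hypothetical tie: both attributes reach a weighted sum of 6.
toy = {"X": [1, 0, 0, 0, 0, 0], "Y": [0, 1, 0, 0, 0, 1]}
print(importance_order(toy))
```

Sorting on a tuple keeps the tie-break rule in one place and makes it easy to check which position finally separates two attributes.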
Previous work by Carvalho and Leite (1997, 1998a) has shown that this sort
of voting procedure is robust with respect to the order in which the attributes
appear in the questionnaire and that, in a specific sense, six is an appropriate
size of the smaller set to be totally ordered. These results may be viewed as an
indication of the (internal) consistency of the proposed weighing method.
Concerning the population to be studied, this research follows Leite (1996), who
restricted his attention to firms using postal services rendered in public or in
franchised agencies. From this population the author selected a convenience
sample of 6,000 firms located in various regions throughout the Brazilian
territory and listed in a commercial databank called Dun's Cone Sul - Guia de
Negócios, published by Dun and Bradstreet. This databank contains
information about some 17,000 organizations doing business in Cone Sul, the
region formed by Brazil, Argentina, Chile, Paraguay and Uruguay. According
to the publisher, those organizations may be considered the leading companies
in their respective business sectors; however, the publication does not disclose the
criteria adopted to qualify a listed company as a leader in its sector. For the
purpose of his study Leite (1996) considered such information as being of little
relevance, since the posting volume or the postage expenses (which he would
be collecting through his questionnaire) would be more important than other,
more general demographic data.
From the initial sample 122 organizations were eliminated, for they were
considered unlikely to be important as postal service users, for example, quarry
companies. The final sample comprised 5,878 firms to which letters of
institutional support from a metropolitan public university were sent, letting
them know about the overall study as well as about the research instrument. In
fact, since the purpose of the study required that the whole Brazilian territory
be surveyed regarding the quality of national postal services, the mail
survey was considered the most adequate method of primary data collection.
Studies of an identical nature (i.e. employing mail surveys for data collection)
are found in the service literature (e.g. Parasuraman et al., 1994).
About 10 per cent (540) of mailed questionnaires were returned and were
considered valid under the research objectives. The actual sample has been
controlled ex post in terms of size and industrial classification; no significant
biases have been detected. Although the number of replies cannot be
considered high, three reasons can be invoked to support accepting them as
final. First, it was a national sample covering an almost continental territory,
where very different firm profiles prevail regionally. Second, the research
budget was quite binding, so that both time and money costs would hardly be
offset by the potential benefits resulting from the follow-up of non-responses.
Third, this study may be considered as a first step into testing the
conjecture and it will need much more improvement. Even though the present
research is not exploratory in the usual sense, it is surely preliminary in
scope.

Hypothesis test
The test to be applied here will make use of the correlation coefficients between
the widths of the tolerance zones for each of the 39 attribute items and their
respective weights as computed according to the weighing procedure
previously explained.
Both Spearman correlation and Pearson correlation coefficients will be
employed to test the association between importance (weights) and tolerance
(widths). This double test was preferred because ranking is an operation that is
more resistant to errors (e.g. measurement errors) than the numerical values
representing the weights. Also, the test for the coefficient of rank correlation
does not depend on any hypothesis about the distribution of the two variables
involved and is even preferable for small samples. Specifically, Pearson
correlation will make use of two numerical variables: the weights computed
from the application of the previous weighing method and the widths
computed as difference of expectation average scores. To compute Spearman
correlation coefficients two ordinal variables will be employed, namely, the
respective ranks corresponding to each of the preceding numerical variables. In
other words, the ranks to be correlated are those originated by the computed
weights and those generated by ordering the tolerance widths.
Either for Pearson or for Spearman coefficients, in order to determine
significance levels it will suffice to consider a one-sided test (r = 0 versus
r < 0), as suggested in the original PBZ conjecture.
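The double test can be sketched with the standard library alone. Several things here are assumptions rather than the paper's procedure: the function names and data are made up for illustration, both variables are ranked in ascending order with tied values sharing their average rank (so that the Spearman and Pearson coefficients carry the same sign), and the one-sided p-value comes from a permutation test, swapped in because it needs no distribution tables; the paper does not say how its significance levels were obtained.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def average_ranks(values):
    """Ascending ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def one_sided_p(x, y, stat, n_perm=10000, seed=0):
    """Permutation p-value for r = 0 against the alternative r < 0."""
    rng = random.Random(seed)
    observed = stat(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if stat(x, shuffled) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Made-up importance weights and tolerance widths for six attribute items.
weights = [10.0, 8.0, 6.0, 4.0, 2.0, 1.0]
widths = [1.3, 1.5, 1.4, 1.7, 1.6, 1.8]
for name, stat in (("Pearson", pearson), ("Spearman", spearman)):
    print(name, stat(weights, widths), one_sided_p(weights, widths, stat))
```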

Regression analysis
According to Phlips and Blomme (1973), who use the expression descriptive
approach, regression analysis may be applied even when the estimated
equation is regarded as a purely numerical result though optimally fitted in
the least-squares sense. In such cases regression coefficients are purely
descriptive parameters with no necessary link with some econometrically
testable theory.
In addition, Wittink (1988) stresses the importance of inspecting the
residuals of an estimated equation with the objective of exploring any
remaining systematic patterns in the behavior of the dependent variable, and
much more so in a descriptive as opposed to an explanatory model. In fact, in a
descriptive framework (as is the case in many forecasting applications) one
is much freer to provide ad hoc explanations on how the dependent variable
behaves.
In this paper regression equations are employed under a descriptive
approach and their residuals are explored to learn further about the association
tested.
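As a rough illustration of the descriptive use of regression, the sketch below fits a least-squares line and returns its residuals for inspection. The specification is an assumption: this section does not state the estimated equation, so a simple regression of tolerance width on importance weight is used here, with a hypothetical function name and made-up data.

```python
from statistics import mean


def ols_fit(x, y):
    """Least-squares line y = intercept + slope * x, plus its residuals."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    intercept = my - slope * mx
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    return intercept, slope, residuals


# Made-up importance weights (x) and tolerance widths (y).
weights = [10.0, 8.0, 6.0, 4.0, 2.0, 1.0]
widths = [1.3, 1.5, 1.4, 1.7, 1.6, 1.8]
intercept, slope, residuals = ols_fit(weights, widths)

# Listing residuals next to the regressor helps reveal systematic patterns
# that the fitted line leaves unexplained.
for w, e in zip(weights, residuals):
    print(w, round(e, 3))
```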

Results
Research findings will be presented in two parts. The first part contains mostly
descriptive results corresponding to attribute weighing; in the second part
appear the results of hypothesis testing.

Computing attribute weights


To illustrate the method, Table I presents the case of the attribute Safety in
postal transactions, which, according to the weighing scheme adopted here, has
been considered by the respondents as the most important.

Table I. Example of attribute weighing

Safety in postal transactions   Number of votes   Weights   Weighted votes
Most important                              112         6              672
2nd most important                           51         5              255
3rd most important                           25         4              100
4th most important                           26         3               78
5th most important                           23         2               46
6th most important                           27         1               27
Total                                       264         -            1,178
From the overall total of weighted votes for all 39 attributes (namely,
11,025) it is possible to find the weight of the attribute Safety in postal
transactions as equalling 10.68 (100 × 1,178/11,025). In this way, since the overall sum
of the weights of the 39 attributes equals 100, each attribute weight may be
regarded as the relative importance of the attribute according to respondents'
perceptions; therefore, in the case of the attribute Safety in postal transactions,
the weight attributed was 10.68 per cent. The results for the 39 attributes,
including the respective weights, appear in Table II. The second column
(weight) results from applying the method described previously. The third
column (order) corresponds to ranking the attributes via the values in the
preceding column. The fourth column (width) results from averaging across
respondents the computed values for the difference between desired and
adequate service levels. Finally, the fifth column (rank) is obtained by
ranking the attributes via the width values appearing in the preceding column.
In terms of the columns in Table II, the PBZ conjecture to be tested states that
there is an inverse association between the weight values in column 2
(respectively, the weight orders in column 3) and the width values in column 4
(respectively, the rank values in column 5).
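The weighing arithmetic can be written compactly. The symbols below are notation introduced here for illustration and do not appear in the paper: n_{jk} is the number of respondents who placed attribute j in position k, S_j is its weighted sum of votes and w_j its weight. The numbers are those of Table I.

```latex
\[
S_j = \sum_{k=1}^{6} (7-k)\, n_{jk},
\qquad
w_j = 100 \times \frac{S_j}{\sum_{i=1}^{39} S_i}
\]
% Safety in postal transactions (Table I):
\[
S = 6(112) + 5(51) + 4(25) + 3(26) + 2(23) + 1(27) = 1{,}178,
\qquad
w = 100 \times \frac{1{,}178}{11{,}025} \approx 10.68
\]
```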
Note that, alternatively, upon using some measure of overall quality, the
order of importance might have been obtained when importance is given by the
attribute's respective (standardized) regression coefficients. Such a comparison
has not been included in the present paper's objectives. However, the question
of the significance of individual coefficients implies that this regression
approach may be more appropriate when the regressors are quality factors (or
aggregate dimensions) instead of individual attribute items.
The numerical values in the fourth column of Table II may seem to be quite
stable at first sight as the computed tolerance widths vary between 1.34 and
1.87. In spite of the fact that this is an empirical issue, the attribute list has been
split into two subsets by neglecting the fifth and sixth deciles in order to test for
differences among the values in the first four deciles and the values in the last
four deciles. Both parametric (two-tailed t-test, unequal variances, t = -7.819,
p = 0.000) and nonparametric (two-tailed Mann-Whitney test, U = 0.000,
p = 0.000) tests indicated that the figures are statistically different, although
very close to each other. In any case, this small variability might raise some
concern about the diagnostic power of the tolerance width as originally argued
(Berry and Parasuraman, 1991; Parasuraman et al., 1991, 1994), at least as far
as its relation to importance is concerned. In a sense this concern might be
expressing the fact that the tolerance width, as originally defined by PBZ,
neglects the information contained in the absolute values of the two expected
service levels. Work in progress by Carvalho and Leite (1999) shows that the
PBZ conjecture may be refined in order to improve the statistical results that
are presented in the next section. However, the new measure they propose as a
better correlate for attribute importance displayed only slightly more
variability than the tolerance width when applied to the same data set.
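
A minimal sketch of the two-group comparison described above, using only the Python standard library. The widths are those printed in Table II. The cut of 15 attributes per group is an assumption (four deciles of 39 items is not an integer, and the exact cut is not stated), so the sketch is not guaranteed to reproduce the reported t value. Only the test statistics are computed; p values would require a statistics package.

```python
from math import sqrt
from statistics import mean, variance

# Tolerance widths as printed in Table II, in order of importance.
WIDTHS = [1.34, 1.64, 1.71, 1.57, 1.68, 1.52, 1.58, 1.53, 1.64, 1.66,
          1.62, 1.70, 1.63, 1.74, 1.58, 1.63, 1.52, 1.68, 1.65, 1.76,
          1.76, 1.72, 1.59, 1.69, 1.66, 1.79, 1.74, 1.64, 1.55, 1.69,
          1.79, 1.87, 1.73, 1.71, 1.80, 1.76, 1.70, 1.70, 1.75]


def welch_t(a, b):
    """Two-sample t statistic that does not assume equal variances."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def mann_whitney_u(a, b):
    """Mann-Whitney U statistic (the smaller of the two U values)."""
    wins = sum(1.0 for x in a for y in b if x > y)
    ties = sum(0.5 for x in a for y in b if x == y)
    u_a = wins + ties
    return min(u_a, len(a) * len(b) - u_a)


K = 15  # assumed size of "four deciles" out of 39 attributes
ordered = sorted(WIDTHS)
low, high = ordered[:K], ordered[-K:]  # the middle widths are left out

t_stat = welch_t(low, high)
u_stat = mann_whitney_u(low, high)
print(t_stat, u_stat)
```

Because the two groups are taken from opposite ends of the sorted widths, they do not overlap, which is the situation in which U equals zero.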
Table II. Attributes of postal service quality in decreasing order of importance

Attribute wording                                               Weight(a)  Order(b)  Width(c)  Rank(d)
Safety in postal transactions                                       10.68  1st       1.34      1
Prompt service                                                       9.86  2nd       1.64      13
Dependability in dealing with clients' problems                      5.42  3rd       1.71      26
Keeping service promises                                             5.39  4th       1.57      6
Employees with knowledge to answer questions/eliminate doubts        4.43  5th       1.68      19
Collecting mail in client's office                                   3.91  6th       1.52      2
Readiness in responding to complaints                                3.85  7th       1.58      7
Employees who instill confidence                                     3.76  8th       1.53      4
Speed in furnishing information                                      3.45  9th       1.64      13
Performing services right since the first time                       3.32  10th      1.66      17
Convenient business hours                                            3.24  11th      1.62      10
Ease to make telephone contacts                                      3.09  12th      1.70      23
Speed in helping clients                                             2.78  13th      1.63      11
Giving individual attention to clients                               2.62  14th      1.74      30
Ease of payment                                                      2.59  15th      1.58      7
Employees are consistently courteous                                 2.28  16th      1.63      11
Filling of documents without mistakes                                2.59  17th      1.52      2
Teaching employees how to better serve the clients                   2.27  18th      1.68      19
The agency manager as decision maker                                 2.23  19th      1.65      16
Specially convenient facilities to service business clients          2.18  20th      1.76      33
Modern equipment                                                     2.08  21st      1.76      33
Employees who understand customers' needs                            1.95  22nd      1.72      28
Flexibility in responding to clients' business particularities       1.93  23rd      1.59      9
Location of the agency                                               1.85  24th      1.69      21
Automated services                                                   1.64  25th      1.66      17
Use of simplified procedures                                         1.14  26th      1.79      36
Ease of access to agency                                             1.12  27th      1.74      30
Materials allowing for previous preparation of consignment           1.12  28th      1.64      13
Relationship with agency manager                                     1.03  29th      1.55      5
Keeping customers informed about service schedule                    1.00  30th      1.69      21
Informing about products and services of post offices                0.97  31st      1.79      36
Employees with a neat, professional appearance                       0.93  32nd      1.87      39
Informing about service characteristics                              0.92  33rd      1.73      29
Convenient shipment                                                  0.59  34th      1.71      26
Visually appealing facilities                                        0.56  35th      1.80      38
Comfortable facilities                                               1.64  36th      1.76      33
Complementary services (enveloping, labeling, etc.)                  0.42  37th      1.70      23
Visually appealing materials to support the service                  0.35  38th      1.70      23
Unconventional services (e.g. selling lottery tickets)               0.29  39th      1.75      32

Notes: (a) weights obtained from the proposed method; (b) ranks corresponding to the
computed values for weights appearing in the second column; (c) width of the tolerance zones
averaged over respondents; (d) ranks corresponding to the computed values for widths
appearing in the fourth column
Source: Survey results

Correlation analysis
In order to accomplish the objective of the paper (namely, to test for the
existence of an inverse association between tolerance and importance of quality
attributes of postal services), the hypothesis will be tested via correlation
coefficients. To estimate the correlation coefficients among the convenient pairs
of variables, the values appearing in Table II were utilized and computations
were completed as explained in the methodology section.
The Pearson correlation coefficient, obtained by correlating the variables in
the second and fourth columns of Table II, equalled -0.6056 (p < 0.01). The
Spearman correlation coefficient, obtained by correlating the ordinal variables
in the third and fifth columns of Table II, equalled -0.6005 (p < 0.01). These
negative values are consistently close and indicate that the null
hypothesis should be rejected or, equivalently, that the alternative hypothesis
H1 should be accepted. In other words, there is an inverse association between
importance and tolerance of service quality attributes. These results are
summarized in Table III.
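
The two coefficients can be computed along the following lines. This is a sketch rather than the authors' own procedure: the data are the weight and width values as printed in Table II, the helper names are hypothetical, and results may differ slightly from the reported figures because of rounding in the printed table. Here both variables are ranked in ascending order, so an inverse association shows up as a negative Spearman coefficient; ranking one variable from largest to smallest, as the order column does, would only flip the sign.

```python
from math import sqrt
from statistics import mean

# Columns 2 (weight) and 4 (width) as printed in Table II.
WEIGHTS = [10.68, 9.86, 5.42, 5.39, 4.43, 3.91, 3.85, 3.76, 3.45, 3.32,
           3.24, 3.09, 2.78, 2.62, 2.59, 2.28, 2.59, 2.27, 2.23, 2.18,
           2.08, 1.95, 1.93, 1.85, 1.64, 1.14, 1.12, 1.12, 1.03, 1.00,
           0.97, 0.93, 0.92, 0.59, 0.56, 1.64, 0.42, 0.35, 0.29]
WIDTHS = [1.34, 1.64, 1.71, 1.57, 1.68, 1.52, 1.58, 1.53, 1.64, 1.66,
          1.62, 1.70, 1.63, 1.74, 1.58, 1.63, 1.52, 1.68, 1.65, 1.76,
          1.76, 1.72, 1.59, 1.69, 1.66, 1.79, 1.74, 1.64, 1.55, 1.69,
          1.79, 1.87, 1.73, 1.71, 1.80, 1.76, 1.70, 1.70, 1.75]


def pearson(x, y):
    """Product-moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ascending ranks; tied values receive the mean of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def spearman(x, y):
    """Rank correlation: Pearson coefficient of the ranked variables."""
    return pearson(average_ranks(x), average_ranks(y))


r_pearson = pearson(WEIGHTS, WIDTHS)
r_spearman = spearman(WEIGHTS, WIDTHS)
print(round(r_pearson, 4), round(r_spearman, 4))
```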

Regression analysis
In spite of the test's significance, the values obtained for the correlation
coefficients were not as high as might have been expected. Therefore, through
the so-called descriptive approach to regression, a simple equation linking
importance and tolerance was estimated via ordinary least squares. A fitted
equation allows the residuals of the estimated equation to be analyzed, which
fits the purpose of the paper, namely, knowing how well importance-ranking,
far more difficult to obtain in practice, can be approached by means of
tolerance-ranking, whose computation is of a more direct and simple nature.
In all the equations specified to test the inverse form of the association, the
results were very good, with adjusted R² values varying between 0.319 and
0.378, and all F values significant with p ≤ 0.01.
To further learn about the association, two equations were chosen which
gave, simultaneously, significant coefficients (p ≤ 0.01) and higher values for
the adjusted R². The results for these two equations are reported in Table IV.
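
The two specifications can be sketched as simple ordinary least squares fits of weight on a transformed width. The code below is an illustration, not the authors' computation: the function name is hypothetical, the data are the values printed in Table II, and the use of the natural logarithm for "Log (width)" is an assumption. Estimates may therefore differ somewhat from those reported in Table IV.

```python
from math import log
from statistics import mean

# Columns 2 (weight) and 4 (width) as printed in Table II.
WEIGHTS = [10.68, 9.86, 5.42, 5.39, 4.43, 3.91, 3.85, 3.76, 3.45, 3.32,
           3.24, 3.09, 2.78, 2.62, 2.59, 2.28, 2.59, 2.27, 2.23, 2.18,
           2.08, 1.95, 1.93, 1.85, 1.64, 1.14, 1.12, 1.12, 1.03, 1.00,
           0.97, 0.93, 0.92, 0.59, 0.56, 1.64, 0.42, 0.35, 0.29]
WIDTHS = [1.34, 1.64, 1.71, 1.57, 1.68, 1.52, 1.58, 1.53, 1.64, 1.66,
          1.62, 1.70, 1.63, 1.74, 1.58, 1.63, 1.52, 1.68, 1.65, 1.76,
          1.76, 1.72, 1.59, 1.69, 1.66, 1.79, 1.74, 1.64, 1.55, 1.69,
          1.79, 1.87, 1.73, 1.71, 1.80, 1.76, 1.70, 1.70, 1.75]


def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares (one regressor)."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)      # unexplained variation
    sst = sum((b - my) ** 2 for b in y)       # total variation
    r2 = 1.0 - sse / sst
    return {
        "intercept": intercept,
        "slope": slope,
        "adj_r2": 1.0 - (1.0 - r2) * (n - 1) / (n - 2),
        "f": (sst - sse) / (sse / (n - 2)),   # F with (1, n - 2) d.f.
        "residuals": residuals,
    }


# Equation 1: regressor is 1/width. Equation 2: regressor is log(width).
fit_inverse = simple_ols([1.0 / w for w in WIDTHS], WEIGHTS)
fit_log = simple_ols([log(w) for w in WIDTHS], WEIGHTS)

for name, fit in (("1/width", fit_inverse), ("log(width)", fit_log)):
    print(name, round(fit["intercept"], 2), round(fit["slope"], 2),
          round(fit["adj_r2"], 3), round(fit["f"], 2))
```

The returned residuals can be plotted against the weights to look for the kind of remaining structure discussed next.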
The residuals corresponding to these two specifications indicate that the
estimated deviations still present some structure, that is, some systematic
behavior. In fact, the plots corresponding to both equations suggest that the
errors increase with the weights: the greater the weight (i.e. the calculated
importance) of the attribute, the bigger (and the more positive) the error due to
utilizing the width to approximate the weight. In other words, the widths
overestimate the importance of the most important attributes. However, when
the observations corresponding to the five biggest weights are eliminated, the
residuals become distributed (around the zero average) as expected. In practical
applications overestimating might be seen as a conservative, more secure result
that could be acceptable in some service contexts.

Table III. Correlation results: weights correlated to width

Type of correlation     Value of coefficient   p value    Test result
Pearson correlation     -0.6056                p < 0.01   Significant inverse association
Spearman correlation    -0.6005                p < 0.01   Significant inverse association

Note: Weight is the second column in Table II; width appears in the fourth column

Limitations
The research presents some limitations that deserve mention. First, observed
units are organizations, which suggests that respondents may have a quite
different attitude towards the process of communication via questionnaire vis-à-vis
the communicating attitudes of individual respondents. Second, in terms of
purchase process characteristics, many organizations are served by the
employees they send to the postal outlet, in contrast to individual customers, who
buy postal services for themselves. Therefore, respondents may have only a
limited perception about, say, physical aspects of the service they receive (Leite,
1996). Third, following the argument by Rosen and Karwan (1994) as against
generic importance in terms of aggregate quality dimensions, both the type of
service and the service firms' characteristics do influence perceived importance.
In this paper a unique type of service was considered for which, in fact, the
prevailing market structure has been very close, when not fully coincident, to a
monopoly, at least as far as service delivery (at the agencies) is concerned.
Fourth, although the ex post control has depicted an acceptable profile for actual
respondents, no check concerning nonresponse bias has been tried. Fifth, to
compute the width of the tolerance zone, for each attribute, an average has been
calculated over respondents in the sample, despite the fact that averaging enjoys
many but not all the desirable properties as a sample summary. Alternatively,
however, had medians (or even some other order statistics) been employed, many
more ties might have occurred. For the moment it is not clear how to choose a
summary measure with the best properties. Finally, both the definition of
attribute weights and the choice of the size of the list of items and of the (much
smaller) list of most important attributes are apparently very particular. For
example, it is assumed here that the 39 recorded attributes represent exactly all it
is necessary to know about the attributes which determine postal service quality
(however, see Carvalho and Leite, 1998a).

Table IV. Regression results: weight regressed on selected regressor variables

Equation number                         Constant    Regressor coefficients   Adjusted R²   F(1, 37)
Equation 1: regressor is 1/(width)      -19.92      37.35                    0.376         24.05
                                        (-4.35)     (4.9)
                                        p = 0.01    p < 0.01                               p < 0.01
Equation 2: regressor is Log (width)    14.23       -22.91                   0.364         22.74
                                        (5.78)      (-4.77)
                                        p < 0.01    p < 0.01                               p < 0.01

Note: Weight is the second column in Table II; width appears in the fourth column

Discussion
This paper has empirically investigated a conjecture brought forth by
Parasuraman et al. (1991) which refers to the association between the
importance of an attribute and the width of its zone of tolerance. One way to
interpret the paper is to view it as an attempt at establishing the convergent
validity of the tolerance width as an importance measure. En route it provides a
method to order a whole attribute list, to the extent that it allows one to find a
complete ordering of the total N = 39 attributes starting from much easier
partial orderings of fewer attributes (six instead of the sufficient N - 1 = 38).
In this sense the paper will be a good test of convergent validity as long as this
method is a good way to determine the order of importance.
Notwithstanding the fact that their original assertion related to the
aggregate dimensions of service quality, no empirical test of the conjecture
seems to be available to date. Although the present paper employs data from an
empirical research completed in a Latin American context, the performed test
may well have an interest for researchers working in other contexts.
Admittedly Latin American countries are an example of contexts where target
respondents may lack time, concentration or education to completely order a
list of attributes of interest to a quality audit. However, the alleged advantage
may apply to some respondent segments (for example, educationally
disfavoured minorities) in other countries as well.
Two versions (Pearson and Spearman) of the correlation coefficient were
tested. Test results indicated that the null hypothesis should be rejected, which
means accepting the existence of an inverse association, exactly as expressed in
the original PBZ conjecture. In other words, the more important an attribute,
the thinner its tolerance zone. In practical terms, the tolerance widths to be
computed instead of orders of importance have been directly obtained from the
data on service expectation levels collected in the questionnaire. An interesting
consequence of this finding is that simply ordering the computed width of
attributes' zones of tolerance will yield the most important attributes. Since the
ordering of attributes may be a costly task on the part of respondents (e.g. in
terms of time or of cognitive effort), this finding points to a simplification when
practical applications of quality research are considered.
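
As an illustration of the practical use described above, the following sketch orders the 39 attributes by tolerance width and extracts a shortlist of the thinnest zones. It is a minimal sketch under stated assumptions: attributes are identified only by their order of importance in Table II (1 = highest weight), the shortlist size of six is chosen here purely for illustration, and the function name is hypothetical.

```python
# Tolerance widths as printed in Table II; position i holds the attribute
# whose order of importance is i + 1.
WIDTHS = [1.34, 1.64, 1.71, 1.57, 1.68, 1.52, 1.58, 1.53, 1.64, 1.66,
          1.62, 1.70, 1.63, 1.74, 1.58, 1.63, 1.52, 1.68, 1.65, 1.76,
          1.76, 1.72, 1.59, 1.69, 1.66, 1.79, 1.74, 1.64, 1.55, 1.69,
          1.79, 1.87, 1.73, 1.71, 1.80, 1.76, 1.70, 1.70, 1.75]


def thinnest_zones(widths, k):
    """Return the importance orders of the k attributes with the thinnest zones."""
    orders = range(1, len(widths) + 1)
    # sorted() is stable, so tied widths keep their order of importance.
    by_width = sorted(orders, key=lambda order: widths[order - 1])
    return by_width[:k]


K = 6  # illustrative shortlist size
shortlist = thinnest_zones(WIDTHS, K)

# Attributes that are both among the K thinnest zones and the K highest weights.
overlap = sorted(set(shortlist) & set(range(1, K + 1)))
print(shortlist, overlap)
```

Comparing the shortlist with the top of the importance ordering gives a quick check of how closely width ranking tracks importance ranking on a given data set.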

Methodological implications
In terms of survey applications, additional implications of the results still
deserve attention. First, the SERVQUAL questionnaire in its three-column
format presents an additional, attractive feature, namely, that it allows for
direct computation of the tolerance zones' widths per attribute. Second, since
individual attribute items are so vital for the results, special attention should be
paid to performing a qualitative stage where quality attribute items might be
considered and listed in detail. Third, the significant inverse association
accepted here implies that, under the announced restrictions, it should be
possible to employ width ranking as if it were importance ranking. In some
application contexts, where time or cognitive constraints are binding, it will be
possible to obtain orders of importance in a much simpler and more direct way.
Finally, additional research (with data covering other types of service and/or
explicitly looking for different demographic segments) is needed in order to
sharpen the results obtained here, particularly with respect to weight
overestimation.
In order to be able to rank attribute items, it has been necessary to
empirically determine numerical values that could be interpreted as importance
figures. A new method for weighting the quality items has been proposed
elsewhere and applied here to data collected in a service quality survey of
postal agencies in Brazil. The method shows how to start from reported relative
importance of just six among the total list of items in order to rank all the
attributes by incurring only a very small number of ties. This paper is also an
illustration of how the method operates.
A final methodological point may be raised. As mentioned before,
although the original conjecture has been expressed in aggregate terms, the test
performed here deals with attribute items. Empirically speaking, depending on
which operation is performed on numerical responses, a more than
unidimensional construct may result. For example, by factor-analyzing the
differences between ideal and received service levels, Parasuraman et al. (1985)
found five (aggregate) dimensions. However, as pointed out by Mels et al. (1997),
various numbers of dimensions (from one up to seven or eight) have been found
for essentially the same operation, i.e. ideal less expected level (see also
Dabholkar et al., 1996).
What about the difference between two expected service levels, which supports
the tolerance width construct? Research in progress (Carvalho and Leite,
1998b), employing exploratory factor analysis and alpha-reliability, encourages
the researchers to believe that such an operation truly engenders a
unidimensional construct. All these considerations encourage the approach by
quality items, which makes concrete sense for respondents in diverse quality
management contexts.

Managerial implications
To complete the present discussion, two implications of interest for managers
involved in service quality research are worth mentioning. First, if managers
intend to use surveys to get information about the relative importance of
service quality attributes, then applying the three-column versions of the
SERVQUAL instrument will provide them with an easy way to collect that
kind of data. Of course, other relevant information will result from using that
instrument. In this way one may expect to increase managers' interest in
applying research to support service quality decisions, even in contexts where
cultural traits defy quantitative research.
Second, since the empirical results directly depend on the individual items
whereby service quality is perceived, very careful statements defining the
individual attributes will be needed to develop the questionnaire. Accordingly,
managers should pay special attention to the role of qualitative research in that
respect. In particular, this will require managers to support a survey design
which stimulates an intense participation of people involved with the service as
well as those likely to possess a specific, thorough knowledge of service
characteristics. Service marketing teams, customer service executives and
branch managers will surely be instrumental along the whole research project.

References
Babakus, E. and Boller, G.W. (1992), ``An empirical assessment of the SERVQUAL scale'', Journal
of Business Research, Vol. 24 No. 2, pp. 253-68.
Berry, L.L. and Parasuraman, A. (1991), Marketing Services: Competing through Quality, The Free
Press, New York, NY.
Callan, R.J. (1997), ``An attributional approach to hotel selection. Part I - the managers'
perceptions'', Progress in Tourism and Hospitality Research, Vol. 3, pp. 333-49.
Carman, J.M. (1990), ``Consumer perceptions of service quality: an assessment of the SERVQUAL
dimensions'', Journal of Retailing, Vol. 66 No. 1, pp. 33-55.
Cronin, J.J. Jr and Taylor, S.A. (1994), ``SERVPERF versus SERVQUAL: reconciling performance-
based and perceptions-minus-expectations measurement of service quality'', Journal of
Marketing, Vol. 58 No. 1, pp. 125-31.
de Carvalho, F.A. and Leite, V.F. (1999), ``Tolerância e importância de atributos de qualidade -
refinando a conjetura PBZ'', Working Paper, IBMEC Business School, Rio de Janeiro, April
1999.
de Carvalho, F.A. and Leite, V.F. (1998a), ``Alternativas de ordenação da importância de atributos
de qualidade de serviços'', Proceedings of the XXXII Meeting of the Brazilian Association of
Graduate Programs in Administration (ANPAD), ANPAD, Foz do Iguaçu, September.
de Carvalho, F.A. and Leite, V.F. (1998b), ``A tolerância é unidimensional? uma análise fatorial
exploratória'', Working Paper, IBMEC Business School, Rio de Janeiro, July.
de Carvalho, F.A. and Leite, V.F. (1997), ``A ordem dos atributos afeta a avaliação da qualidade?
Uma investigação empírica a partir da versão mais recente do modelo SERVQUAL'',
Revista de Administração Contemporânea, Vol. 1 No. 1, pp. 35-53.
Dabholkar, P.A., Thorpe, D.I. and Rentz, J.O. (1996), ``A measure of service quality for retail
stores: scale development and validation'', Journal of the Academy of Marketing Science,
Vol. 24 No. 1, pp. 3-16.
Evrard, Y. (1993), ``La satisfaction des consommateurs: état des recherches'', in Proceedings of the
XXVII Meeting of the Brazilian Association of Graduate Programs in Administration
(ANPAD), ANPAD, Florianópolis, Vol. 8 (Marketing), pp. 59-86.
Hauser, J.R. and Wernerfelt, B. (1990), ``An evaluation cost model of consideration sets'', Journal
of Consumer Research, Vol. 16, pp. 393-408, March.
Johns, N. and Tyas, P. (1996), ``Use of service quality gap theory to differentiate between
foodservice outlets'', The Service Industries Journal, Vol. 16 No. 3, pp. 321-46, July.
Johnston, R. (1994), ``The zone of tolerance - exploring the relationship between service
transactions and satisfaction with the overall service'', International Journal of Service
Industry Management, Vol. 6 No. 2, pp. 46-61.
Leite, V.F. (1996), ``A Adoção do Sistema de Franquia nos Correios do Brasil - Um estudo sobre
qualidade e produtividade no setor público'', unpublished doctoral dissertation, Graduate
School of Business, Federal University at Rio de Janeiro (COPPEAD/UFRJ).
Lilien, G.L., Kotler, P. and Moorthy, K.S. (1992), Marketing Models, Prentice-Hall, Inc., Englewood
Cliffs, NJ.
Liljander, V. and Strandvik, T. (1994), ``Estimating zones of tolerance in perceived service quality
and perceived service value'', International Journal of Service Industry Management, Vol. 4
No. 2, pp. 6-28.
Mels, G., Boshoff, C. and Nel, D. (1997), ``The dimensions of service quality: the original European
perspective revisited'', The Service Industries Journal, Vol. 17 No. 1, pp. 173-89.
Parasuraman, A., Berry, L.L. and Zeithaml, V.A. (1991), ``Understanding customer expectations
of service'', Sloan Management Review, Vol. 32 No. 3, pp. 39-48.
Parasuraman, A., Zeithaml, V.A. and Berry, L.L. (1985), ``A conceptual model of service quality
and its implications for future research'', Journal of Marketing, Vol. 49 No. 4, pp. 41-50.
Parasuraman, A., Zeithaml, V.A. and Berry, L.L. (1994), ``Alternative scales for measuring service
quality: a comparative assessment based on psychometric and diagnostic criteria'', Journal
of Retailing, Vol. 70 No. 3, pp. 201-30.
Phlips, L. and Blomme, R. (1973), Analyse chronologique, Vander, Louvain.
Reidenbach, R.E. and Sandifer-Smallwood, B. (1990), ``Exploring perceptions of hospital
operations by a modified SERVQUAL approach'', Journal of Health Care Marketing,
Vol. 10 No. 4, pp. 47-55.
Rosen, L.D. and Karwan, K.K. (1994), ``Prioritizing the dimensions of service quality - an
empirical investigation and strategic assessment'', International Journal of Service
Industry Management, Vol. 5 No. 4, pp. 39-52.
Teas, R.K. (1993), ``Expectations, performance evaluation and consumer's perceptions of quality'',
Journal of Marketing, Vol. 57 No. 4, pp. 18-34.
Teas, R.K. (1994), ``Expectations as comparison standard in measuring service quality: an
assessment of a reassessment'', Journal of Marketing, Vol. 58 No. 1, pp. 132-9.
Wittink, D. (1988), The Applications of Regression Analysis, Allyn and Bacon Inc., Boston, MA
and London.
Zeithaml, V.A. and Bitner, M.J. (1996), Services Marketing, McGraw-Hill, New York, NY.
Zeithaml, V.A., Parasuraman, A. and Berry, L.L. (1990), Delivering Quality Service: Balancing
Customer Perceptions and Expectations, The Free Press, New York, NY.
