1 Introduction
The Web has become an immense information repository with perpetual expan-
sion and ubiquity as two of its intrinsic characteristics. With the recent flurry in
Web technologies, the “data-store” Web is steadily evolving to a more “vibrant”
environment where passive data sources coexist with active services that access
these data sources and inter-operate with limited or no intervention from users.
The Web has also brought a new non-deterministic paradigm in accessing infor-
mation. In traditional, closed, deterministic multi-user systems (e.g., enterprise
networks), data sources are accessible only by a few known users with a set of
predefined privileges. In contrast, the Web is an open environment where
information is accessible (or potentially accessible) by a far larger, a priori
unknown population of users. The problem of access control in the open Web
environment is not another “flavor” of the access control problem that the
security community has extensively studied in the context of closed systems.
In fact, solutions resulting from that research are of only peripheral relevance to the
B. Benatallah and M.-C. Shan (Eds.): TES 2003, LNCS 2819, pp. 91–103, 2003.
© Springer-Verlag Berlin Heidelberg 2003
problem of protecting information in the Web context. For example, access con-
trol models (e.g., RBAC, TBAC, MAC, DAC [4]) that work for resources shared
by a well-known, relatively small set of users are obviously of little help when
the resources are information that may be accessed by millions of random Web
clients. The problem of protecting Web-accessible information becomes more
compelling when the information is private, i.e., related to personal aspects
such as the health records of a hospital’s patients, employment history of job
seekers, etc. Accessing personal information through the Web clearly raises le-
gitimate privacy concerns that call for effective, reliable, and scalable privacy
preserving solutions.
Preserving privacy is a key requirement for the Web to reach its full potential.
Recent studies and surveys show that Web users’ concerns about their privacy
are key factors behind their reluctance in using the Web to conduct transactions
that require them to provide sensitive personal information. However, despite
the recent growing interest in the problem of privacy, little progress appears to
have been achieved. One reason is the tendency to assimilate privacy into the
wide spectrum of security-related problems. Although not completely unrelated,
the two problems of privacy and security are, essentially, different. Much of
the research in securing shared resources views security as an “inward” concept
where different users have different access rights to different shared objects.
In contrast to the concept of security, we see privacy as an “outward” concept
in the sense that a piece of information exposes a set of rules (i.e., privacy policies)
to users. These rules determine if and how any arbitrary user may access that
information. In contrast to typical shared information, private information is,
generally, related to an individual who normally owns the information and has
the authority to specify its associated privacy policy.
Solutions to the problem of preserving privacy in the Web may not be feasible,
or even conceivable, unless developed for, and with, tomorrow’s Web building
blocks: Web services. These are, essentially, applications that expose interfaces
through which they may be automatically invoked by Web clients. Many of the
traditional problems of the Web have found elegant and effective solutions de-
veloped around Web services and a growing number of Web-based applications
are being developed using Web services. Examples include health care systems,
digital government, and B2B applications. In their pre-Web versions, these ap-
plications were based on an information flow in which users divulged their private
information only to known, trusted parties. The recent introduction of Web ser-
vices as key components in building these applications has introduced new types
of transactions where users frequently disclose sensitive information to Web-
based entities (e.g., government agencies, businesses) that are unknown and/or
whose trustworthiness may not be easily determined.
The privacy problem is likely to become more challenging with the envi-
sioned semantic Web where machines become much better able to process and
understand the data that they merely display at present [3]. To enable this vi-
sion, “intelligent” (mobile or fixed) software agents will carry out sophisticated
tasks for their users. In that process, these agents will manipulate and exchange
extensive amounts of personal information and replace human users in mak-
ing decisions regarding their personal data. The challenge is then to develop
agents that autonomously enforce the privacy of their respective users, i.e., au-
tonomously determine, according to the current context, what information is
private.
[Figure: A privacy violation scenario involving an unemployed citizen (user), the citizen’s agent, a government Web service S, and a business Web service; the citizen’s private information is transferred without authorization (steps 1–5).]
mechanism that provides an objective evaluation of the trust that may be put
in any given Web service. This mechanism must be able to collect,
evaluate, update, and disseminate information related to the reputation of Web
services. Moreover, this mechanism must meet two requirements:
Verifiability. Web services may expose arbitrary privacy policies to their users.
Currently, users are left with no means to verify whether or not Web services
actually abide by the terms of their privacy policies. Ideally, a privacy preserv-
ing infrastructure must provide a practical means to check whether advertised
privacy policies are actually enforced.
Automatic Enforcement. In the envisioned semantic Web, software agents
will have to make judgments about whether or not to release their users’ private
data to other agents and Web services. Thus, a privacy preserving solution must
not require an extensive involvement of users.
In this paper, we propose a reputation-based solution to the problem of
preserving privacy in the semantic Web. The solution is based on two prin-
ciples. First, the reputation of Web services is quantified such that high rep-
utation is attributed to services that are not the source of any “leakage” of
private information. Second, the traditional invocation scheme of Web services
(discovery-selection-invocation) is extended into a reputation-based invocation
scheme where the reputation of a service is also a parameter in the discovery-
selection processes.
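The extended scheme can be sketched in a few lines. This is a minimal illustration of reputation-aware selection; the function name, threshold, and service identifiers are our own assumptions, not a specified API:

```python
def select_service(candidates, reputations, min_reputation):
    """Reputation-based selection: among discovered candidate services, keep
    those whose reputation meets the threshold, then pick the most reputable."""
    eligible = [s for s in candidates if reputations.get(s, 0.0) >= min_reputation]
    return max(eligible, key=lambda s: reputations[s], default=None)

# Hypothetical discovery result and reputation values.
reps = {"s1": 0.9, "s2": 0.4, "s3": 0.7}
print(select_service(["s1", "s2", "s3"], reps, 0.5))  # -> s1
```

Here reputation acts as an additional filter between discovery and invocation: services below the threshold are never selected, regardless of functional fit.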
The paper is organized as follows. Section 2 introduces the key concepts used
throughout the paper. In Section 3, we present a general model for reputation
management in a semantic Web environment. In Section 4, we describe an ar-
chitecture for deploying the proposed general model. In Section 5, we report on
some research work that has addressed various aspects of trust and reputation
in the Web. Section 6 concludes the paper.
2 Key Concepts
In this section, we set the framework for our discussion. We first clearly define
the terms Web service and Web agent. We then introduce the two concepts of
attribute ontology and information flow difference, on which the reputation
management model described in Section 3 is built.
In the envisioned semantic Web, two types of entities interact: Web services
and Web agents. The former are applications that may be invoked through the
Web. For instance, in our digital government (DG) example, a business may deploy a service that
provides the list of current job opportunities. Services may have access to private
information. Upon invocation, they may also deliver private information. In the
dynamic Web environment, the number of Web agents and services may not
be known a priori. In this paper, we consider a dynamic set S of Web services
that interact with a potential exchange of private information. Web agents are
“intelligent” software modules that are responsible for specific tasks (e.g.,
search for an appropriate job for a given unemployed citizen in the previous DG
example).
Let si be a Web service in the set S and Opj an operation of the service si that
has p input attributes and q output attributes. Let Input(Opj ) denote the set of
input attributes for operation Opj , and Output(Opj ) denote the set of output
attributes for operation Opj .
Definition 1: The Information Flow Difference IFD of operation Opj is defined by:

IFD(Opj) = Σ_{a ∈ Input(Opj)} PSL(a) − Σ_{a ∈ Output(Opj)} PSL(a)
Example 1: Assume that Opj has as its input the attribute SSN and as its out-
put the attribute PhoneNumber. The values of the function PSL for attributes
SSN and PhoneNumber are 6 and 4, respectively. In this example, IFD(Opj) = 2.
The meaning of this (positive) value is that an invocation of operation Opj must
provide information (SSN) that is more sensitive than the returned information
(PhoneNumber). Intuitively, the Information Flow Difference captures the degree
of “permeability” of a given operation, i.e., the difference (in the privacy signif-
icance level) between what it gets (i.e., input attributes) and what it discloses
(i.e., output attributes).
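Example 1 can be checked with a few lines of code; the PSL values come from the example itself, while the function name is our own:

```python
# Privacy Significance Levels (PSL) from Example 1 (illustrative values).
PSL = {"SSN": 6, "PhoneNumber": 4}

def ifd(input_attrs, output_attrs):
    """Definition 1: total PSL of the inputs minus total PSL of the outputs."""
    return sum(PSL[a] for a in input_attrs) - sum(PSL[a] for a in output_attrs)

print(ifd(["SSN"], ["PhoneNumber"]))  # -> 2, as in Example 1
```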
In general, positive values of the function IFD do not necessarily indicate
that invocations of the corresponding operation actually preserve privacy. In the
previous example, a service invoking Opj may still be unauthorized to access the
phone number although it already knows more sensitive information (i.e., the
social security number). However, invocations of operations with negative values
of the function IFD necessarily disclose information that is more sensitive than
their input attributes. They must, therefore, be considered as cases of privacy
violation.
Definition 2: A service s ∈ S is said to have an information flow that violates
privacy if and only if it has an operation Op such that IFD(Op) < 0.

Definition 3: The Information Flow Difference of a Web service s is the sum
of the IFDs of all of its operations.
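Definitions 2 and 3 can be sketched the same way; the operations and PSL values below (including the "Address" attribute) are hypothetical:

```python
# Hypothetical PSL values; "Address" is an assumed attribute for illustration.
PSL = {"SSN": 6, "Address": 5, "PhoneNumber": 4}

def ifd(op):
    """IFD of one operation, given its input and output attribute sets."""
    return sum(PSL[a] for a in op["in"]) - sum(PSL[a] for a in op["out"])

def service_ifd(ops):
    """Definition 3: a service's IFD is the sum of its operations' IFDs."""
    return sum(ifd(op) for op in ops)

def violates_privacy(ops):
    """Definition 2: a service violates privacy iff some operation has IFD < 0."""
    return any(ifd(op) < 0 for op in ops)

ops = [{"in": ["SSN"], "out": ["PhoneNumber"]},      # IFD = +2
       {"in": ["PhoneNumber"], "out": ["Address"]}]  # IFD = -1: a violation
print(service_ifd(ops), violates_privacy(ops))  # -> 1 True
```

Note that the service-level sum can be positive even when an individual operation violates privacy, which is why Definition 2 quantifies over operations rather than over the service total.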
reason) for any transactions. The basic idea is to deploy a reputation manage-
ment system that continuously monitors Web services, assesses their reputation
and disseminates information about services’ reputation to (other) services and
agents. To each Web service, we associate a reputation that reflects a common
perception of other Web services towards that service. In practice, different cri-
teria may be important in determining the reputation of a Web service. For
example, the reputation of a service that searches for “best” airline fares clearly
depends on whether or not it actually delivers the best fares. In this paper, ser-
vices’ reputation depends only on the effectiveness of their enforcement of the
privacy of their users. To simplify our discussion, we will use “services” instead
of “services and agents” in any context where “services” is an active entity
(e.g., “service” s invokes operation Op). We propose five criteria as the basis
for the reputation assessment process.
If the dates of deployment, d1 and d2 , of two services s1 and s2 , are known, then
the reputation of s1 may be considered better than that of s2 if d1 precedes d2 .
In practice, the criteria used in reputation assessment are not equally impor-
tant or relevant to privacy enforcement. For example, the seniority criterion is
clearly less important than the degree of permeability. To each criterion cj ∈ R,
we associate a weight wj that is proportional to its relative importance as com-
pared to the other criteria in R. A Web service’s reputation may then be defined
as follows:
Definition 4: For a given service si ∈ S, the reputation function is defined by:

Reputation(si) = Σ_{k=1}^{m} w_k · c_ki    (1)
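Equation (1) is a plain weighted sum; the weights and criterion values below are invented for illustration (the text only fixes that, e.g., the degree of permeability should outweigh seniority):

```python
def reputation(weights, criteria):
    """Equation (1): Reputation(s_i) = sum over k of w_k * c_ki."""
    assert len(weights) == len(criteria)
    return sum(w * c for w, c in zip(weights, criteria))

# Hypothetical weights for the five criteria (permeability weighted most,
# seniority least) and hypothetical criterion values c_1i..c_5i for service s_i.
w = [0.5, 0.2, 0.15, 0.1, 0.05]
c = [0.9, 0.7, 1.0, 0.4, 0.8]
print(round(reputation(w, c), 2))  # -> 0.82
```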
4 The Architecture
The Reputation Manager (RM) is the core of the reputation system. It is a unan-
imously trusted party responsible for (i) collecting, (ii) evaluating, (iii) updating,
and (iv) disseminating reputation information. To understand the operation of
the Reputation Manager, consider the following typical scenario enabled by this
architecture. A Web service si (or an agent) is about to send a message M con-
taining private information to a Web service sj . Before sending M to sj , the
service si submits to the Reputation Manager a request asking for the reputa-
tion of service sj . If a value of Reputation(sj ) is available and accurate (i.e.,
reasonably recent), the RM answers si ’s request with a message containing that
value. Based on the value of the received reputation and its local policy, the
[Figure: The reputation management architecture. Web services, each with a Local Reputation Cache, interact with the Reputation Manager, which maintains a Reputation Repository and an Attribute Ontology, dispatches Probing Agents, and accesses the Service Registry.]
service si may or may not decide to send the message M to sj . If the reputa-
tion of sj is outdated or not available (i.e., has not been previously assessed),
the Reputation Manager initiates a reputation assessment process that involves
one of a set of Probing Agents. To assess the reputation of a Web service sj,
the RM collects the information necessary to evaluate the criteria for
reputation assessment (discussed in Section 3). The evaluation of most of these
criteria is based on the (syntactic) description of the service sj . For example,
to evaluate the degree of permeability of sj , the RM reads sj ’s description (by
accessing the appropriate service registry), computes IF D(sj ) (sj ’s Information
Flow Difference), and maps that value to the corresponding degree of perme-
ability. The RM maintains an attribute ontology that is used in computing the
degree of permeability of the different Web services. Once the DoP of service sj
is evaluated, it is stored in the local Reputation Repository. The other criteria
may be obtained using similar processes. Once all the criteria are evaluated, the
RM computes sj ’s reputation (as described in Section 3), stores the obtained
value in the Reputation Repository, and sends it to the service si .
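The lookup flow described above can be sketched as follows. The class and method names, the staleness threshold, and the dummy assessment value are all our own assumptions, not a specified API:

```python
import time

# A minimal sketch of the Reputation Manager's lookup logic.
class ReputationManager:
    MAX_AGE = 3600.0  # seconds before a stored reputation counts as outdated

    def __init__(self):
        self.repository = {}  # Reputation Repository: id -> (value, timestamp)

    def assess(self, service_id):
        """Stand-in for the probing process: reading the service description,
        computing its IFD, evaluating the criteria, and combining them."""
        value = 0.5  # dummy reputation standing in for the real computation
        self.repository[service_id] = (value, time.time())
        return value

    def get_reputation(self, service_id):
        entry = self.repository.get(service_id)
        if entry is not None and time.time() - entry[1] < self.MAX_AGE:
            return entry[0]             # recent value: answer directly
        return self.assess(service_id)  # missing or outdated: reassess
```

A caller such as si would invoke `get_reputation("sj")` before sending a privacy-sensitive message to sj.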
already built without a mechanism for privacy enforcement. The solution clearly
requires modifying the service invocation scheme to accommodate the added
mechanism. Moreover, the solution must not impose a high upgrade cost on
“legacy” services. To achieve this transition to privacy preserving services, we
introduce components called service wrappers. A service wrapper associated with
a service si is a software module that is co-located with si and that handles all
messages received or sent by the service si . To send a privacy sensitive message
M to a service sj , the service si first submits the message to its wrapper. If
necessary, the wrapper sends a request to the Reputation Manager to inquire
about sj ’s reputation. Based on the answer received from the RM, si ’s wrapper
may then forward the message M to sj or decide to cancel sending M to sj . To
avoid an excessive traffic at the Reputation Manager, the wrapper may locally
maintain a Reputation Cache that contains information about the reputation of
the most frequently invoked services.
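The wrapper logic just described can be sketched as follows; the threshold, the stub Reputation Manager, and all names are our own illustration:

```python
# A sketch of a service wrapper consulting its local Reputation Cache before
# forwarding a privacy-sensitive message on behalf of its service.
class ServiceWrapper:
    def __init__(self, reputation_manager, threshold=0.6):
        self.rm = reputation_manager
        self.threshold = threshold  # local policy: minimum acceptable reputation
        self.cache = {}             # local Reputation Cache: service id -> value

    def reputation_of(self, dest):
        if dest not in self.cache:  # cache miss: ask the Reputation Manager
            self.cache[dest] = self.rm.get_reputation(dest)
        return self.cache[dest]

    def send(self, dest, message, transport):
        """Forward `message` to `dest` only if its reputation meets the policy."""
        if self.reputation_of(dest) >= self.threshold:
            transport(dest, message)
            return True
        return False  # reputation too low: cancel sending

class StubRM:  # stands in for the Reputation Manager in this example
    def get_reputation(self, dest):
        return {"trusted": 0.9, "shady": 0.2}.get(dest, 0.0)

outbox = []
w = ServiceWrapper(StubRM())
print(w.send("trusted", "M", lambda d, m: outbox.append((d, m))))  # -> True
print(w.send("shady", "M", lambda d, m: outbox.append((d, m))))    # -> False
```

The cache keeps repeated sends to the same destination from generating traffic at the Reputation Manager, at the cost of the cached value becoming stale.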
5 Related Work
Among Web applications, E-commerce is the one where the concept of reputation
has been most extensively studied. Online businesses have deployed reputation
systems that improve customers’ trust and, consequently, stimulate sales
[7]. Examples include: Ebay, Bizrate, Amazon, and Epinions. Several studies have
investigated and, generally, confirmed the potential positive impact of reputa-
tion systems on online shoppers’ decisions to engage in business transactions
(e.g., the study reported in [5], which targeted Ebay’s reputation system).
Current reputation systems are based on the simple idea of evaluating the
reputation of a service or a product using feedback collected from customers
who have purchased similar or comparable services or products in the past.
Customers may then shop in an “informed” manner in light of past experiences.
Different variations derived from this general model have been proposed. For ex-
ample, in [9], the authors presented two models based on the global and personal-
ized notions of reputation. The global model, Sporas, is similar to the reputation
model used at Ebay while the personalized model, Histos, takes the identity of
the inquirer and the environment in which the inquiry is carried out into account.
In most E-commerce systems, a typical transaction involves two parties
(e.g., a seller and a buyer). In some cases, one of the parties (the seller) is also the
provider of the reputation service (i.e., assesses and/or disseminates the repu-
tation). These reputation systems inherently lack the objectivity that generally
characterizes systems based on a neutral third party. An example of such a sys-
tem was proposed in [2]. It is based on a trust model that uses intermediaries
to bolster the buyer’s and seller’s confidence in online transactions. An interme-
diary agent, known as a trust service provider (TSP), is responsible for making
sure that the end parties (buyer and seller) behave as expected (e.g., the buyer
pays and the seller delivers). The whole process is invisible to the transacting
parties as the TSP carries out the transaction in an automatic fashion. Another
approach to introduce more objectivity in reputation systems was proposed in
[1]. It treats trust as a non-transitive concept and takes into account the credibil-
ity of the source providing a recommendation. The proposed approach assumes
that the recommender may lie or state contradictory recommendations.
Other solutions have also been proposed for other types of Web applications.
In a previous work [8,6], we addressed the problem in the context of Digital
Government applications. The solution was developed to enforce privacy in Web-
accessible (government) databases. Its principle was to combine data filters and
mobile privacy preserving agents. The former were used to protect privacy at the
server side (i.e., the databases) and the latter were used to preserve privacy at
the client side (i.e., Web clients requesting private information).
The reputation system proposed in this paper differs from the previously
mentioned (and other) systems in three aspects. First, unlike most existing
systems that target Web sites or Web databases, our system targets a semantic
Web environment hosting Web services and software agents. Second, reputation
in our system is not directly based on business criteria (e.g., quality of a service
or a product) but, rather, it reflects the “quality of conduct” of Web services
with regard to the preservation of the privacy of (personal) information that
they exchange with other services and agents. Finally, reputation management
in our system is fully automated and does not use (or necessitate) potentially
subjective human recommendations or feedback.
6 Conclusion
In this paper, we presented a reputation management system for Web services.
The proposed system aims at automating the process of privacy enforcement in
a semantic Web environment. The solution is based on a general model for as-
sessing reputation where a reputation manager permanently probes Web services
to evaluate their reputation. Currently, we are investigating three alternatives
to the proposed centralized model: distributed, registry-based, and peer-to-peer
reputation models. The objective of investigating these four models
is to identify the components and features of an ideal reputation management
system for Web services.
Acknowledgment. This research is supported by the National Science Foun-
dation under grant 9983249-EIA and by a grant from the Commonwealth Infor-
mation Security Center (CISC). The authors would also like to thank Mourad
Ouzzani for his valuable comments on earlier drafts of this paper.
References
1. A. Abdulrahman and S. Hailes. Supporting Trust in Virtual Communities. In Pro-
ceedings of the 33rd Annual Hawaii International Conference on System Sciences,
2000.
2. Yacine Atif. Building Trust in E-Commerce. IEEE Internet Computing, 6(1):18–24,
January–February 2002.
3. T. Berners-Lee, J. Hendler, and O. Lassila. The Semantic Web. Scientific Ameri-
can, May 2001.