C. System Model
We now introduce the system model used for the discussions in this document. The model covers the
architecture of the database, the environment it is operated in, the way it is accessed, and the failure
types we address. We also clarify our view on availability.
In our discussion, we only consider distributed shared-nothing database clusters. We assume that the
various database manager instances are operated by a single administrative instance, but are not
necessarily situated in the same geographical location (data centre). The clients accessing the database
are middle-tier components of the enterprise application. They access the database by sending request
messages containing query or update operations; depending on the type of request, they receive a
response with either an update status or the actual data item.
We only tackle common failure events and do not consider catastrophic events such as earthquakes or
flooding. We assume that access to the database instances is well protected by security mechanisms
outside the databases and that the software does not contain any critical bugs. We also do not consider
data corruption at disk level or disk failures. Hence, our considerations only deal with the full failure
of one or multiple nodes and with network partitions between the database nodes. Such partitions may be
full or partial. We argue that the failures excluded here can be dealt with by other mechanisms, such as
RAID for disk failures. As a consequence, the system model allows any failed node to be repaired and
re-introduced into the cluster, because the persistent state on disk is not lost and can be made
available again by repairs.
We assume unlimited network capacity between clients and database nodes. Furthermore, we assume that a
client is always capable of reaching at least one node of the database cluster, provided that not all
nodes have failed. Under these assumptions, a database is available when it is possible to access a
particular, but arbitrary data item within an arbitrary, but fixed time interval t. Likewise, a data item
i is available when it is possible to access that item within an arbitrary, but fixed time interval t.
Our model allows nodes to be overloaded such that requests get dropped, the reply time exceeds t, or no
reply is created at all. Clients that interact with such a node will consider the node as failed and
proceed as they would in case of a failure. Usually, this means contacting another replica.
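The client behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the system model itself: the names NodeFailed, make_replica and read_with_failover are hypothetical, and a replica is simulated by a function with a fixed latency rather than by a real network call.

```python
from typing import Callable, Dict, Sequence


class NodeFailed(Exception):
    """A node dropped the request or did not reply within t (hypothetical name)."""


# A replica is modelled as a function taking (key, t) and returning the value.
Replica = Callable[[str, float], str]


def make_replica(store: Dict[str, str], latency: float) -> Replica:
    """Build a simulated replica that answers after a fixed latency (assumption)."""
    def replica(key: str, t: float) -> str:
        # An overloaded node whose reply time exceeds t looks like a failed node.
        if latency > t:
            raise NodeFailed("no reply within t")
        return store[key]
    return replica


def read_with_failover(replicas: Sequence[Replica], key: str, t: float) -> str:
    """Try the replicas in order; treat a timeout exactly like a node failure."""
    for replica in replicas:
        try:
            return replica(key, t)
        except NodeFailed:
            continue  # contact another replica
    raise NodeFailed("no replica answered within t")


overloaded = make_replica({"x": "1"}, latency=5.0)
healthy = make_replica({"x": "1"}, latency=0.01)
value = read_with_failover([overloaded, healthy], "x", t=1.0)
```

In this sketch the item x counts as available for t = 1.0 because one replica answers in time, even though the first node contacted is overloaded.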
B. Aspects of Reliability
As stated earlier, reliability is the ability of a component to perform its required functions under
stated conditions for a specified period of time. In contrast to availability (which is a statistical
property), the question of whether a system is reliable primarily depends on the specification of the
function the system is supposed to fulfill. Hence, a database may be considered unreliable when
interacting with it yields results other than those expected. This is particularly important for
concurrency handling and client-side consistency. With respect to reliability, the main challenges can be
highlighted by the following two questions:
(1) How are concurrent writes to the same item resolved?
(2) What is the consistency experienced by clients?
A. Replication
With respect to data, replication means keeping several physical copies of the same logical data item.
Apart from the replication scope, which defines the relation of replicas and nodes, the primary
conceptual decisions to be made when using replication are (i) which of the physical data items may be
updated by clients and (ii) how the other replicas are changed once such an update has taken place [12].
We refer to the first aspect as the replication strategy of the system and to the second as the update
strategy. The update strategy determines how the replicas exchange updates. That is, it defines the
update data exchanged by the database nodes as well as the update laziness, which influences the relative
ordering of the responses sent to the client and the updates applied to the other replicas.
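The effect of update laziness on the ordering of client responses and replica updates can be sketched as follows. The example assumes a single updatable copy (one possible replication strategy) and uses hypothetical class names Master and Replica; real systems differ in what update data they ship and when.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class Replica:
    """A read-only physical copy (hypothetical, in-memory)."""
    def __init__(self) -> None:
        self.data: Dict[str, str] = {}


class Master:
    """The only copy clients may update in this sketch (assumed strategy)."""
    def __init__(self, replicas: List[Replica], lazy: bool) -> None:
        self.data: Dict[str, str] = {}
        self.replicas = replicas
        self.lazy = lazy
        # Updates already acknowledged to the client but not yet propagated.
        self.pending: Deque[Tuple[str, str]] = deque()

    def write(self, key: str, value: str) -> str:
        self.data[key] = value
        if self.lazy:
            # Lazy: respond first, propagate later.
            self.pending.append((key, value))
        else:
            # Eager: propagate to all replicas before responding.
            self._propagate(key, value)
        return "ok"

    def flush(self) -> None:
        """Ship all pending updates to the replicas in write order."""
        while self.pending:
            self._propagate(*self.pending.popleft())

    def _propagate(self, key: str, value: str) -> None:
        for replica in self.replicas:
            replica.data[key] = value


eager_replica, lazy_replica = Replica(), Replica()
eager = Master([eager_replica], lazy=False)
lazy = Master([lazy_replica], lazy=True)
eager.write("k", "v")  # the replica is already up to date when "ok" returns
lazy.write("k", "v")   # the replica is stale until flush() runs
```

With the lazy variant, a client reading from the replica directly after the acknowledged write may still see the old state, which is the ordering effect the update strategy controls.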
B. Conflict Management
Any kind of concurrent update to a single data item may lead to conflicts. While this is obvious for
multi-master replication, it also has to be considered for single-master approaches and even for
non-replicated items. Two strategies are commonly used to deal with this problem: optimistic and
pessimistic concurrency control. The chosen strategy influences the possible versions of an item.
Because concurrent operations may conflict, conflict detection and conflict resolution have to be applied.
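As an illustration of optimistic concurrency control, the following sketch detects conflicting writes by comparing version numbers. It is one common way to realise the idea, not a description of any particular database; VersionedStore and VersionConflict are hypothetical names, and resolution is left to the caller (here: re-read and retry).

```python
from typing import Dict, Optional, Tuple


class VersionConflict(Exception):
    """The item changed since the client read it (hypothetical name)."""


class VersionedStore:
    """In-memory store that keeps one version counter per item."""
    def __init__(self) -> None:
        self._items: Dict[str, Tuple[int, Optional[str]]] = {}

    def read(self, key: str) -> Tuple[int, Optional[str]]:
        # Items that were never written have version 0 and no value.
        return self._items.get(key, (0, None))

    def write(self, key: str, value: str, expected_version: int) -> int:
        current_version, _ = self.read(key)
        # Conflict detection: another write was applied in between.
        if current_version != expected_version:
            raise VersionConflict(key)
        self._items[key] = (current_version + 1, value)
        return current_version + 1


store = VersionedStore()
version_a, _ = store.read("item")   # client A reads version 0
version_b, _ = store.read("item")   # client B reads version 0
store.write("item", "from A", version_a)
try:
    store.write("item", "from B", version_b)
    conflict_detected = False
except VersionConflict:
    conflict_detected = True
    # Conflict resolution in this sketch: re-read and retry.
    version_b, _ = store.read("item")
    store.write("item", "from B", version_b)
```

A pessimistic scheme would instead prevent the second client from proceeding until the first has finished, for example by locking the item.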
C. Consistency
Research on databases as well as on distributed systems has developed a large number of different
consistency models. Consistency protocols define the allowed interleavings when multiple processes access
the same data sets. With respect to the scope of this document, we are only interested in what the
client, i.e. the user of the database, can experience. In the terminology of Vogel et al., this vantage
point is called client-centric consistency. Clearly, any database system also requires some internal
consistency specification, mainly with respect to how it deals with replicas.
D. Partitioning
To enable scalability, a partitioning mechanism is necessary. Partitioning means that each node is only
responsible for dedicated subsets of the data items stored in the system, i.e. data partitions. Depending
on the actual database system, partitions are sometimes also called ranges or regions. A partitioning
mechanism maps data items to partitions. A commonly established mechanism is consistent hashing. Database
systems differ in how dynamic a partitioning their mechanism allows. To access a data item, the
appropriate partition has to be resolved for every request issued by a client. Finally, a client may be
restricted in the way it is allowed to access a cluster of database nodes.
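The mapping from data items to nodes via consistent hashing can be sketched as follows. This is a minimal illustration under stated assumptions: the HashRing class, the use of MD5 as ring hash, and the choice of 64 virtual nodes per physical node are all illustrative and not taken from any specific database system.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


def ring_hash(text: str) -> int:
    """Map a string to a position on the ring (MD5 is an assumed choice)."""
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hashing ring with virtual nodes (hypothetical class)."""
    def __init__(self, nodes: Iterable[str], vnodes: int = 64) -> None:
        # Each physical node owns several positions to even out the load.
        self._ring: List[Tuple[int, str]] = sorted(
            (ring_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._positions = [position for position, _ in self._ring]

    def node_for(self, key: str) -> str:
        """Resolve the node responsible for a data item."""
        if not self._ring:
            raise ValueError("ring has no nodes")
        # First ring position clockwise from the key, wrapping at the end.
        index = bisect.bisect_right(self._positions, ring_hash(key))
        return self._ring[index % len(self._ring)][1]


small = HashRing(["node-a", "node-b", "node-c"])
large = HashRing(["node-a", "node-b", "node-c", "node-d"])
keys = [f"item-{n}" for n in range(1000)]
moved = [k for k in keys if small.node_for(k) != large.node_for(k)]
```

Adding node-d only relocates the items that node-d takes over; every other item keeps its node, which is why the mechanism suits clusters whose membership changes over time.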