
Vignesh Gawali

vg975 | vg975@nyu.edu
Paper Review 3 Dynamo
Amazon's Dynamo is a highly reliable and scalable storage system that can handle failures without
impacting the availability of the services built on it. Dynamo stores data as key-value pairs and exposes
a simple interface in which objects are read and written by primary key only. The system partitions and
replicates the data using consistent hashing, and it uses object versioning to keep replicas consistent.
Dynamo was built based on the following requirements:

Query model: A simple read and write model that allows services to access relatively small data
objects identified by a unique key.
ACID properties: Dynamo does not provide full ACID guarantees. It accepts weaker consistency in
exchange for high availability, provides no isolation guarantees, and permits only single-key updates.
Efficiency: The system must be configurable so that services consistently meet their latency and
throughput requirements.

Architecture:
System Interface: The interface provides get() and put() operations for accessing the key-value data.
A get request returns a single object, or a list of objects if there are conflicting versions, together
with a context.
Partitioning Algorithm: The partitioning mechanism uses consistent hashing to distribute the data across
the nodes. The key is hashed with MD5 to produce a 128-bit identifier, which determines the storage
nodes that hold that data.
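The lookup described above can be sketched as follows. This is a minimal illustration, not Dynamo's actual code: the `HashRing` class, the `ring_position` helper, and the node names are hypothetical, and each node is placed at a single point on the ring.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a string to a 128-bit position on the ring using MD5."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest(), "big")


class HashRing:
    """Hypothetical consistent-hash ring with one position per node."""

    def __init__(self, nodes):
        # Keep nodes sorted by their position on the ring.
        self._entries = sorted((ring_position(node), node) for node in nodes)
        self._positions = [position for position, _ in self._entries]

    def owner(self, key: str) -> str:
        """Return the first node whose position is larger than the key's."""
        index = bisect.bisect_right(self._positions, ring_position(key))
        # Wrap around to the first node when the key hashes past the last one.
        return self._entries[index % len(self._entries)][1]
```

For example, `HashRing(["node-a", "node-b", "node-c"]).owner("cart:42")` always returns the same node for the same key, regardless of which machine performs the lookup.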
Replication: The system stores N replicas of the data. Each key is assigned to a coordinator node, which is
responsible for replicating the data at its N-1 clockwise successor nodes in the ring.
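A preference list built this way (coordinator first, then its N-1 clockwise successors) might look like the sketch below. The `preference_list` function and node names are hypothetical, and the sketch assumes one ring position per physical node, so it does not need to skip duplicate machines.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a string to a 128-bit ring position using MD5."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest(), "big")


def preference_list(key: str, nodes: list, n: int) -> list:
    """Return the coordinator for `key` followed by its n-1 clockwise successors."""
    ring = sorted(nodes, key=ring_position)
    positions = [ring_position(node) for node in ring]
    start = bisect.bisect_right(positions, ring_position(key)) % len(ring)
    # A ring smaller than n cannot hold n distinct replicas.
    count = min(n, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]
```

With five nodes and N = 3, each key maps to three distinct consecutive nodes on the ring, and the first entry is the coordinator.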
Data Versioning: Dynamo treats the result of each update as a new version of the object. If the most
recent version is unavailable when an update arrives, the change is applied to an older version, and the
divergent versions are reconciled once the most recent version becomes available again.
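Dynamo tracks the relationship between versions with vector clocks. The sketch below shows one way such clocks could be compared and merged. All function names are hypothetical, and a clock is modelled as a plain dict from node name to counter.

```python
def descends(a: dict, b: dict) -> bool:
    """True if clock `a` has seen every update that clock `b` has."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def concurrent(a: dict, b: dict) -> bool:
    """True if neither clock descends from the other (a real conflict)."""
    return not descends(a, b) and not descends(b, a)


def prune_ancestors(versions):
    """Drop every (value, clock) pair that a newer version strictly supersedes."""
    kept = []
    for value, clock in versions:
        dominated = any(
            descends(other, clock) and not descends(clock, other)
            for _, other in versions
        )
        if not dominated:
            kept.append((value, clock))
    return kept


def reconcile_clock(clocks, coordinator: str) -> dict:
    """Take the element-wise maximum of `clocks`, then bump the coordinator."""
    merged = {}
    for clock in clocks:
        for node, count in clock.items():
            merged[node] = max(merged.get(node, 0), count)
    merged[coordinator] = merged.get(coordinator, 0) + 1
    return merged
```

For instance, the clocks `{"sx": 2, "sy": 1}` and `{"sx": 2, "sz": 1}` are concurrent, so both versions would be returned to the client. A reconciled write coordinated by `sx` would carry the clock `{"sx": 3, "sy": 1, "sz": 1}`.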
Request Handling: Requests arrive over HTTP and may be routed to any node in the system. If the
receiving node is among the top N nodes of the preference list for that key, it executes the request.
If not, it forwards the request to the first node in the preference list.
For a get() operation, the coordinator requests all existing versions from the N highest-ranked
reachable nodes in the preference list, waits for a configured minimum number of responses (R), and
then returns to the client all versions of the object that are causally unrelated.
For a put() operation, the coordinator generates a vector clock for the new version, writes the new
version locally, and sends it with its vector clock to the N highest-ranked reachable nodes in the
preference list. The write is considered successful once a configured minimum number of nodes (W),
counting the coordinator, have stored it.
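The read and write paths can be illustrated with a small quorum simulation. Here R and W denote the minimum number of replicas that must answer a read or acknowledge a write, as in the Dynamo paper. The `Replica` class and the `put` and `get` functions are hypothetical, and an integer version number stands in for a vector clock to keep the sketch short.

```python
class Replica:
    """Hypothetical in-memory replica; `up` simulates reachability."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}  # key -> (version, value)


def put(replicas, key, version, value, w):
    """Write to every reachable replica; succeed if at least w acknowledged."""
    acks = 0
    for replica in replicas:
        if replica.up:
            replica.data[key] = (version, value)
            acks += 1
    return acks >= w


def get(replicas, key, r):
    """Read from the first r reachable replicas and return the newest version."""
    responses = [rep.data.get(key) for rep in replicas if rep.up][:r]
    if len(responses) < r:
        raise RuntimeError("fewer than R replicas responded")
    found = [item for item in responses if item is not None]
    return max(found, key=lambda item: item[0]) if found else None
```

With N = 3 and R = W = 2, so that R + W > N, any read quorum overlaps any write quorum in at least one replica. A write that reached only replicas b and c is therefore still visible to a later read served by a and b.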
Handling Failures: Dynamo uses a modified ("sloppy") quorum approach in which all read and write
operations are performed on the first N healthy nodes in the preference list, which may not be the first
N nodes encountered on the ring. If a node designated to store an object is unavailable, the object is
sent to the next healthy node beyond the top N of the preference list, which stores it locally together
with a hint naming the intended node. Once the designated node becomes available again, the object is
delivered to it.
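The handoff behaviour described above can be sketched as follows. The `Node` class and both functions are hypothetical, and the ring is simplified to an ordered list of nodes with the key's coordinator at index `start`.

```python
class Node:
    """Hypothetical storage node; `up` simulates availability."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.store = {}  # replicas this node is designated to hold
        self.hints = []  # (intended_owner_name, key, value) held for others


def write_with_handoff(ring, start, n, key, value):
    """Write to the first n healthy nodes, walking the ring from `start`."""
    intended = [ring[(start + i) % len(ring)] for i in range(n)]
    missed = [node for node in intended if not node.up]
    written = []
    for i in range(len(ring)):
        if len(written) == n:
            break
        node = ring[(start + i) % len(ring)]
        if not node.up:
            continue
        if node in intended:
            node.store[key] = value
        else:
            # Stand in for an unavailable node and remember who it was.
            owner = missed.pop(0)
            node.hints.append((owner.name, key, value))
        written.append(node.name)
    return written


def deliver_hints(holder, nodes_by_name):
    """Send hinted replicas back to their owners once they are reachable."""
    remaining = []
    for owner_name, key, value in holder.hints:
        owner = nodes_by_name[owner_name]
        if owner.up:
            owner.store[key] = value
        else:
            remaining.append((owner_name, key, value))
    holder.hints = remaining
```

With a ring of nodes A, B, C, D, N = 3, and A down, a write coordinated at A's position lands on B, C, and D, where D holds the replica as a hint for A. Calling `deliver_hints` on D after A recovers moves the replica to A and clears the hint.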
Adding/Removing a node: When a new node is added to the ring, it becomes responsible for storing the
keys in the key ranges assigned to it. As a result, some existing nodes no longer need to store the keys
that now belong to the new node. These nodes offer the affected keys to the new node and transfer them
once it confirms. When a node is removed, the keys are reallocated in the reverse process.
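The effect of adding a node can be sketched by comparing key ownership before and after the join. The helper names are hypothetical, and the sketch assumes a single ring position per node and no replication (N = 1), so each key has exactly one owner.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a string to a 128-bit ring position using MD5."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest(), "big")


def owner(key: str, nodes: list) -> str:
    """Return the first node clockwise from the key's ring position."""
    ring = sorted(nodes, key=ring_position)
    positions = [ring_position(node) for node in ring]
    index = bisect.bisect_right(positions, ring_position(key)) % len(ring)
    return ring[index]


def keys_to_transfer(keys, nodes, new_node):
    """Return {previous_owner: [keys]} for keys that move to `new_node`."""
    moves = {}
    for key in keys:
        if owner(key, nodes + [new_node]) == new_node:
            moves.setdefault(owner(key, nodes), []).append(key)
    return moves
```

Under these assumptions only the new node's clockwise successor gives up keys, and every other key keeps its owner. This is the property that lets consistent hashing add capacity without reshuffling the whole data set.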
Implementation:
Each Dynamo storage node comprises the following components:
Request Coordination: An event-driven messaging layer that processes client requests through a pipeline
of stages.
Membership and failure detection: A detection mechanism that allows node failures to be handled
without the need for manual intervention.
Local persistence engine: A pluggable interface that allows different storage engines to be connected.
Berkeley Database (BDB) Transactional Data Store, MySQL, and an in-memory buffer with a persistent
backing store are the engines currently in use.
Conclusion:
The details above describe Amazon's ingenious implementation of the Dynamo storage system, which meets
the company's needs for high reliability and scalability and serves as the foundation for many services
on its e-commerce platform, such as the shopping cart and user session handling.
