
This is a list of fun project ideas that no one is currently working on.

a) Publish/Subscribe API

Storage systems have become much more specialized in recent years, with each system
providing expertise in certain areas: Hadoop and proprietary data warehouses provide
batch processing capabilities, search indexes support complex ranked text queries,
and a variety of distributed databases have sprung up. Voldemort is a specialized
key-value system, but the same data stored in Voldemort may need to be indexed by
search, churned over in Hadoop, or otherwise processed by another system. Each of
these systems needs the ability to subscribe to the changes happening in Voldemort
and receive a stream of those changes that it can process in its own specialized
way.

Indeed, even Voldemort nodes could subscribe to one another as a quick catch-up
mechanism for recovering from failure.

Amazon has implemented this functionality as a “Merkle tree” data structure in
their Dynamo system, which allows nodes to compare their contents quickly and catch
up on differences they have missed, but this is not the only approach. An
alternative is a simple secondary index that maintains a node-specific logical
counter recording a modification number for each key.

The API provided would be something like getAllChangesSince(int changeNumber),
which would return the latest change for each key modified after that change
number.
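
To make the counter-based approach concrete, here is a minimal in-memory sketch. Everything except the getAllChangesSince(int changeNumber) signature is an assumption made for illustration: the ChangeIndex and Change names, the String keys and values, and the choice of a sorted map as the secondary index are hypothetical and not part of Voldemort.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical per-node change index; none of these names come from Voldemort itself.
class ChangeIndex {
    // One change record: the key, its latest value, and the counter value at write time.
    static final class Change {
        final String key;
        final String value;
        final int changeNumber;

        Change(String key, String value, int changeNumber) {
            this.key = key;
            this.value = value;
            this.changeNumber = changeNumber;
        }
    }

    private int counter = 0; // node-specific logical counter
    private final Map<String, Change> latestByKey = new HashMap<>();
    // Secondary index: change number -> key, sorted so range scans are cheap.
    private final TreeMap<Integer, String> keyByChange = new TreeMap<>();

    // Stores a value and returns the change number assigned to this write.
    synchronized int put(String key, String value) {
        Change previous = latestByKey.get(key);
        if (previous != null) {
            // Only the latest change per key is retained in the index.
            keyByChange.remove(previous.changeNumber);
        }
        counter++;
        latestByKey.put(key, new Change(key, value, counter));
        keyByChange.put(counter, key);
        return counter;
    }

    // Returns the latest change for every key modified after changeNumber, oldest first.
    synchronized List<Change> getAllChangesSince(int changeNumber) {
        List<Change> changes = new ArrayList<>();
        for (String key : keyByChange.tailMap(changeNumber, false).values()) {
            changes.add(latestByKey.get(key));
        }
        return changes;
    }
}
```

A subscriber would remember the highest change number it has processed and pass it back on its next call. Because the counter is node-specific, each subscriber would have to track one such checkpoint per node.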

b) Operational Interface

One of the primary problems for a practical distributed system is knowing the state
of the system. Voldemort has a rudimentary GUI that provides basic information.
This project would build a first-rate management GUI and the corresponding control
functionality, making it possible to see the performance and availability of each
node in the system and to perform heavier operations such as starting and stopping
nodes, restoring from replication, rebalancing, etc.

c) Scala Voldemort Shell

Voldemort comes with a very simple text shell. A better way to build such a thing
is to fully integrate a language with an interpreter and provide a set of
predefined administrative commands as functions in the shell. Scala has a flexible
syntax and integrates easily with Java so it would be a good choice for such a
shell.

d) Support for LevelDB storage engine

Since Voldemort supports a pluggable storage engine interface, we definitely want
to try out other solutions. For example, we have a Krati-based storage engine in
contrib. Another storage engine that is gaining momentum is Google’s LevelDB. The
first phase of this project would require building JNA / JNI bindings for the
storage engine, followed by integration with Voldemort.

e) REST based API

Besides the existing Ruby / Python clients, having a REST-based API would increase
adoption among the web community. A good v1 could derive ideas from existing
well-known systems like Riak.
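
As a rough illustration of what a v1 could look like, the sketch below maps HTTP verbs onto key-value operations. The URL layout (/stores/<store>/keys/<key>), the status codes chosen, and the RestFrontEnd class are all assumptions made for this example. They are only loosely inspired by Riak-style resource paths, and the nested map is a stand-in for real Voldemort stores.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical REST front end; the URL layout and names are assumptions for illustration.
class RestFrontEnd {
    static final class Response {
        final int status;
        final String body;

        Response(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    // Stand-in for real stores: store name -> (key -> value).
    private final Map<String, Map<String, String>> stores = new ConcurrentHashMap<>();

    Response handle(String method, String path, String body) {
        // Expected path shape: /stores/<store>/keys/<key>
        String[] parts = path.split("/");
        if (parts.length != 5 || !parts[1].equals("stores") || !parts[3].equals("keys")) {
            return new Response(404, "unknown resource");
        }
        String store = parts[2];
        String key = parts[4];
        switch (method) {
            case "GET": {
                Map<String, String> data =
                        stores.getOrDefault(store, Collections.<String, String>emptyMap());
                String value = data.get(key);
                return value == null ? new Response(404, "not found") : new Response(200, value);
            }
            case "PUT": {
                if (body == null) {
                    return new Response(400, "missing body");
                }
                stores.computeIfAbsent(store, s -> new ConcurrentHashMap<>()).put(key, body);
                return new Response(204, "");
            }
            case "DELETE": {
                Map<String, String> data = stores.get(store);
                boolean removed = data != null && data.remove(key) != null;
                return removed ? new Response(204, "") : new Response(404, "not found");
            }
            default:
                return new Response(405, "method not allowed");
        }
    }
}
```

The sketch deliberately leaves out versioning. A real API would also have to decide how version information travels with each request and response, for example in headers.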

f) Memcache protocol support

An easy project would be to provide the same API as Memcache.
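
For a sense of scale, the sketch below handles three commands of the memcached text protocol (set, get, delete) on top of a plain map. The MemcacheTextAdapter class and its handle method are hypothetical, the map stands in for a Voldemort store, and expiry times are parsed but ignored. The request and response strings follow the memcached text protocol as commonly documented, but this is far from a complete implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical adapter that translates memcached text commands into key-value calls.
class MemcacheTextAdapter {
    private static final class Entry {
        final String flags;
        final String data;

        Entry(String flags, String data) {
            this.flags = flags;
            this.data = data;
        }
    }

    private final Map<String, Entry> store = new HashMap<>(); // stand-in for a Voldemort store

    // Handles one command line. For "set", dataBlock is the payload line that follows it.
    String handle(String commandLine, String dataBlock) {
        String[] parts = commandLine.trim().split(" ");
        switch (parts[0]) {
            case "set":
                return set(parts, dataBlock);
            case "get":
                return parts.length == 2 ? get(parts[1]) : "ERROR\r\n";
            case "delete":
                if (parts.length != 2) {
                    return "ERROR\r\n";
                }
                return store.remove(parts[1]) != null ? "DELETED\r\n" : "NOT_FOUND\r\n";
            default:
                return "ERROR\r\n";
        }
    }

    // set <key> <flags> <exptime> <bytes>; exptime is ignored in this sketch.
    private String set(String[] parts, String dataBlock) {
        if (parts.length != 5 || dataBlock == null) {
            return "ERROR\r\n";
        }
        int declaredBytes;
        try {
            declaredBytes = Integer.parseInt(parts[4]);
        } catch (NumberFormatException e) {
            return "CLIENT_ERROR bad command line format\r\n";
        }
        if (dataBlock.getBytes(StandardCharsets.UTF_8).length != declaredBytes) {
            return "CLIENT_ERROR bad data chunk\r\n";
        }
        store.put(parts[1], new Entry(parts[2], dataBlock));
        return "STORED\r\n";
    }

    private String get(String key) {
        Entry entry = store.get(key);
        if (entry == null) {
            return "END\r\n"; // a miss is an empty result, not an error
        }
        int length = entry.data.getBytes(StandardCharsets.UTF_8).length;
        return "VALUE " + key + " " + entry.flags + " " + length + "\r\n"
                + entry.data + "\r\nEND\r\n";
    }
}
```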

g) Duplex request support

Preliminary work has been done to support duplexing our socket requests. This
would minimize the impact of network latency across data centers. Some initial
thoughts can be found here
[ http://github.com/kirktrue/voldemort/wiki/Duplex-Request-Response-Communication ]
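
One common way to let several requests share a socket without waiting on one another is to tag each request with an ID and match responses to requests as they arrive. The sketch below shows that idea only. The DuplexChannel class, the ID scheme, and the wire callback are assumptions made for illustration and may differ from the design in the linked notes.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BiConsumer;

// Hypothetical client-side multiplexer; not Voldemort's actual wire protocol.
class DuplexChannel {
    private final AtomicLong nextId = new AtomicLong();
    private final Map<Long, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
    // Stand-in for the socket writer: receives (requestId, payload) for each outgoing frame.
    private final BiConsumer<Long, String> wire;

    DuplexChannel(BiConsumer<Long, String> wire) {
        this.wire = wire;
    }

    // Sends without waiting for earlier responses; the future completes when the
    // response carrying the same request ID arrives.
    CompletableFuture<String> send(String payload) {
        long id = nextId.incrementAndGet();
        CompletableFuture<String> reply = new CompletableFuture<>();
        pending.put(id, reply); // register before writing so a fast reply is never missed
        wire.accept(id, payload);
        return reply;
    }

    // Called by the socket reader for each response frame, in whatever order frames arrive.
    void onResponse(long requestId, String payload) {
        CompletableFuture<String> reply = pending.remove(requestId);
        if (reply != null) {
            reply.complete(payload);
        }
    }

    int inFlight() {
        return pending.size();
    }
}
```

With this shape, a client talking to a remote data center can have many requests in flight at once, so the round-trip latency is paid roughly once per batch of outstanding requests rather than once per request.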

h) Maven support

We want to add the ability to push jars to a central Maven repository.

i) Better configuration system

The XML files can get really out of control once the cluster size increases.
Migrating from XML to YAML would alleviate this problem a bit.
In the long term, we would like to come up with a better configuration system
(explore ZooKeeper?).
