a) Publish/Subscribe API
Storage systems have become much more specialized in recent years, with each system
providing expertise in certain areas: Hadoop and proprietary data warehouses provide
batch processing capabilities, search indexes provide support for complex ranked
text queries, and a variety of distributed databases have sprung up. Voldemort is a
specialized key-value system, but the same data stored in Voldemort may need to be
indexed by search, churned over in Hadoop, or otherwise processed by another
system. Each of these systems needs the ability to subscribe to the changes
happening in Voldemort and get a stream of such changes that they can process in
their own specialized way.
Indeed, even Voldemort nodes could subscribe to one another as a quick catch-up
mechanism for recovering from failure.
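As a rough illustration of the idea, the sketch below shows a toy store that records each put in an ordered change log and lets subscribers replay from a given position before receiving live changes. The class name `ChangeLogStore`, the `(sequence, key, value)` event shape, and the `from_seq` parameter are all assumptions made for this example; they are not part of Voldemort.

```python
from typing import Callable, Dict, List, Tuple

# A change event: (sequence number, key, new value). Hypothetical shape.
Change = Tuple[int, str, bytes]


class ChangeLogStore:
    """Toy key-value store that records every put in an ordered change log."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}
        self._log: List[Change] = []
        self._subscribers: List[Callable[[Change], None]] = []

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value
        change = (len(self._log), key, value)
        self._log.append(change)
        # Push the change to every live subscriber.
        for callback in self._subscribers:
            callback(change)

    def get(self, key: str):
        return self._data.get(key)

    def subscribe(self, callback: Callable[[Change], None], from_seq: int = 0) -> None:
        """Replay changes from from_seq onward, then deliver new ones live."""
        for change in self._log[from_seq:]:
            callback(change)
        self._subscribers.append(callback)


store = ChangeLogStore()
store.put("user:1", b"alice")

indexed = []                      # stands in for a search indexer
store.subscribe(indexed.append)   # replays the existing change first
store.put("user:2", b"bob")

# A recovering replica that already saw sequence 0 catches up from 1.
replica = []
store.subscribe(replica.append, from_seq=1)
```

The `from_seq` replay is what would let a recovering node use the same stream for catch-up: it asks only for the changes it missed.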
b) Operational Interface
One of the primary problems for a practical distributed system is knowing the state
of the system. Voldemort has a rudimentary GUI that provides basic information.
This project would build a first-rate management GUI and the corresponding control
functionality: reporting the performance and availability of each node in the
system, and performing heavier operations such as starting and stopping nodes,
restoring from replication, and rebalancing.
Voldemort comes with a very simple text shell. A better approach is to embed a
full language interpreter and expose a set of predefined administrative commands
as functions in the shell. Scala has a flexible syntax and integrates easily with
Java, so it would be a good choice for such a shell.
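The pattern of an interpreter preloaded with administrative functions can be sketched in a few lines. The text proposes Scala; Python is used here only to keep the example short. The commands `node_status` and `stop_node` and the in-memory `CLUSTER` table are hypothetical stand-ins for calls to a real admin client.

```python
import code

# Hypothetical in-memory cluster state; a real shell would call the admin client.
CLUSTER = {0: "online", 1: "online"}


def node_status(node_id):
    """Return the state of one node."""
    return CLUSTER[node_id]


def stop_node(node_id):
    """Mark a node as offline and return its new state."""
    CLUSTER[node_id] = "offline"
    return CLUSTER[node_id]


def make_shell():
    """Build an interpreter whose namespace already holds the admin commands."""
    namespace = {"node_status": node_status, "stop_node": stop_node}
    return code.InteractiveConsole(locals=namespace), namespace


shell, namespace = make_shell()
# Feed lines programmatically; shell.interact() would start a live prompt instead.
shell.push("result = stop_node(0)")
shell.push("state = node_status(1)")
```

Because the commands are ordinary functions, an operator can combine them with loops, variables, and conditionals rather than being limited to a fixed command grammar.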
Besides the existing Ruby / Python clients, a REST-based API would increase
adoption among the web community. A good v1 could borrow ideas from existing
well-known systems like Riak.
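A minimal sketch of how HTTP verbs might map onto key-value operations is shown below. The URL layout `/stores/<store-name>/<key>`, the `handle` function, and the status codes chosen are assumptions for illustration only; versioning and conflict handling are left out entirely.

```python
def handle(stores, method, path, body=None):
    """Map an HTTP-style request onto key-value operations; returns (status, body)."""
    parts = path.strip("/").split("/")
    # Hypothetical URL layout: /stores/<store-name>/<key>
    if len(parts) != 3 or parts[0] != "stores":
        return 404, None
    _, store_name, key = parts

    if method == "PUT":
        # Create the store table on first write, then set the value.
        stores.setdefault(store_name, {})[key] = body
        return 204, None

    table = stores.get(store_name, {})
    if method == "GET":
        return (200, table[key]) if key in table else (404, None)
    if method == "DELETE":
        if key not in table:
            return 404, None
        del table[key]
        return 204, None
    return 405, None  # method not allowed


stores = {}
put_result = handle(stores, "PUT", "/stores/users/alice", b'{"age": 30}')
get_result = handle(stores, "GET", "/stores/users/alice")
delete_result = handle(stores, "DELETE", "/stores/users/alice")
missing_result = handle(stores, "GET", "/stores/users/alice")
```

Keeping the mapping in a plain function like this makes it easy to test without a running HTTP server; a real endpoint would wrap it in whatever web framework the project chooses.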
Preliminary work has been done to support duplexing our socket requests. This
would minimize the impact of network latency across data centers. Some initial
thoughts can be found here
[ http://github.com/kirktrue/voldemort/wiki/Duplex-Request-Response-Communication ]
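The linked page is not reproduced here, so the following is only one common reading of the idea: several requests are kept in flight on a single connection, and each response is matched to its request by an id rather than by arrival order. The `PendingRequests` class and its methods are hypothetical and do not reflect the actual Voldemort wire protocol.

```python
import itertools


class PendingRequests:
    """Track in-flight requests on one connection by request id."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._callbacks = {}

    def send(self, payload, on_response):
        """Register a callback and return the framed request (id, payload)."""
        request_id = next(self._ids)
        self._callbacks[request_id] = on_response
        return request_id, payload

    def receive(self, request_id, response):
        """Route a response to its request, whatever order it arrives in."""
        callback = self._callbacks.pop(request_id)
        callback(response)

    def in_flight(self):
        """Number of requests still waiting for a response."""
        return len(self._callbacks)


pending = PendingRequests()
results = {}
first_id, _ = pending.send(b"get a", lambda r: results.update(a=r))
second_id, _ = pending.send(b"get b", lambda r: results.update(b=r))

# Both requests are now outstanding on the same connection.
outstanding = pending.in_flight()

# Responses come back in the opposite order and are still matched correctly.
pending.receive(second_id, b"value-b")
pending.receive(first_id, b"value-a")
```

With this scheme a client pays the cross-data-center round trip once for a batch of requests instead of once per request, which is where the latency benefit would come from.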
h) Maven support
The XML configuration files become hard to manage as the cluster size increases.
Migrating from XML to YAML would alleviate this problem a bit.
In the long term we would like to come up with a better configuration system
(explore ZooKeeper?).
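To give a feel for the difference in verbosity, the sketch below converts a small cluster description from XML to YAML text using only the standard library. The XML is modeled on the general shape of a `cluster.xml` file, but the element names and the YAML layout should be treated as illustrative assumptions, not as a proposed schema.

```python
import xml.etree.ElementTree as ET

# Illustrative cluster description; element names are assumptions for this sketch.
CLUSTER_XML = """
<cluster>
  <name>mycluster</name>
  <server>
    <id>0</id>
    <host>localhost</host>
    <http-port>8081</http-port>
    <socket-port>6666</socket-port>
    <partitions>0, 1</partitions>
  </server>
</cluster>
"""


def cluster_xml_to_yaml(xml_text):
    """Render a small cluster description as YAML text (no YAML library needed)."""
    root = ET.fromstring(xml_text.strip())
    lines = ["name: " + root.findtext("name"), "servers:"]
    for server in root.findall("server"):
        # Normalize the comma-separated partition list.
        partitions = [p.strip() for p in server.findtext("partitions").split(",")]
        lines.append("  - id: " + server.findtext("id"))
        lines.append("    host: " + server.findtext("host"))
        lines.append("    http-port: " + server.findtext("http-port"))
        lines.append("    socket-port: " + server.findtext("socket-port"))
        lines.append("    partitions: [" + ", ".join(partitions) + "]")
    return "\n".join(lines)


yaml_text = cluster_xml_to_yaml(CLUSTER_XML)
print(yaml_text)
```

Each server shrinks from seven lines of XML to five lines of YAML, which helps readability but does not remove the underlying problem of hand-maintaining one entry per node.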