
HypergraphDB

NDBI040

Jan Drozen

http://www.ms.mff.cuni.cz/~drozenj

HypergraphDB (HGDB)

open-source

graph-oriented database

embedded

higher-order relationships

queries and traversals

indices

transactions

distribution

Hypergraph

„is a family of sets over a universal set of vertices V“

undirected graph where an edge can connect ANY number of vertices

Data model

basic unit is an atom

each atom has associated tuple of atoms called target set

the size of target set is arity

arity 0 atoms are nodes, otherwise links

let x be an atom; the set of atoms having x in their target set is the incidence set of x

set of links pointing to x

each atom has its value

each value has its type
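The definitions above can be sketched in a few lines of plain Java. This is only an illustrative model: the class `SketchAtom` and its members are hypothetical names, not part of the HyperGraphDB API, and the value's Java class stands in for the atom's type.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative model only; all names are hypothetical, not HyperGraphDB API.
final class SketchAtom {
    final Object value;              // each atom has a value
    final List<SketchAtom> targets;  // target set, kept as an ordered tuple
    final Set<SketchAtom> incidence = new LinkedHashSet<>(); // links pointing to this atom

    SketchAtom(Object value, SketchAtom... targets) {
        this.value = value;
        this.targets = List.of(targets);
        // Register this atom in the incidence set of every target.
        for (SketchAtom t : targets) {
            t.incidence.add(this);
        }
    }

    int arity() { return targets.size(); }        // size of the target set
    boolean isNode() { return arity() == 0; }     // arity 0 atoms are nodes
    boolean isLink() { return !isNode(); }        // everything else is a link
    Class<?> type() { return value.getClass(); }  // each value has a type
}
```

For example, `new SketchAtom("friendship", alice, bob, carol)` is a link of arity 3, and it appears in the incidence set of each of its three targets.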

Storage architecture

physical storage independent

needs key-value indexing storage

uses BerkeleyDB

two layers

primitive storage layer

model layer

Primitive storage layer

low-level storage

graph of identities and raw data

consists of two key-value stores

LinkStore: ID->List<ID>

DataStore: ID->List<Byte>

ID is a cryptographically strong UID

making collisions practically impossible

type 4 UUID
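As a rough illustration of the two stores, here is an in-memory stand-in that uses hash maps instead of BerkeleyDB. The class `SketchPrimitiveStore` and its methods are assumptions made for this sketch; only `java.util.UUID.randomUUID()`, which produces a type 4 UUID from a cryptographically strong random source, is a real JDK call.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// In-memory stand-in for the two key-value stores; names are hypothetical.
final class SketchPrimitiveStore {
    private final Map<UUID, List<UUID>> linkStore = new HashMap<>(); // ID -> List<ID>
    private final Map<UUID, byte[]> dataStore = new HashMap<>();     // ID -> raw bytes

    // Store a list of identities under a fresh type 4 UUID.
    UUID storeLink(List<UUID> ids) {
        UUID id = UUID.randomUUID();
        linkStore.put(id, List.copyOf(ids));
        return id;
    }

    // Store raw data under a fresh type 4 UUID; the array is copied defensively.
    UUID storeData(byte[] bytes) {
        UUID id = UUID.randomUUID();
        dataStore.put(id, bytes.clone());
        return id;
    }

    List<UUID> getLink(UUID id) {
        return linkStore.get(id);
    }

    byte[] getData(UUID id) {
        byte[] raw = dataStore.get(id);
        return raw == null ? null : raw.clone();
    }
}
```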

Model layer

atoms, type system, caching, indexing, queries

formalizing layout of the primitive storage

AtomID -> [type,value,{target set}]

ValueID -> List<ID> | List<Byte>

ValueID can form complex structures

core indices needed, each of the form UUID -> SortedSet<UUID>

IncidenceIndex

maps hypergraph atom to set of all links pointing to it

TypeIndex

maps type atom to set of all its instance atoms

ValueIndex

maps a top-level value structure to the set of atoms with this value
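All three core indices share the shape UUID -> SortedSet<UUID>, so one small class can illustrate them. The sketch below is an assumption about how such an index might behave in memory; `SketchUuidIndex` is a hypothetical name and the real indices live in the key-value storage.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.UUID;

// One index of the shape UUID -> SortedSet<UUID>; the class name is hypothetical.
final class SketchUuidIndex {
    private final Map<UUID, SortedSet<UUID>> entries = new HashMap<>();

    // Add a member under a key, creating the sorted set on first use.
    void add(UUID key, UUID member) {
        entries.computeIfAbsent(key, k -> new TreeSet<>()).add(member);
    }

    // Remove a member and drop the key once its set becomes empty.
    void remove(UUID key, UUID member) {
        SortedSet<UUID> set = entries.get(key);
        if (set != null && set.remove(member) && set.isEmpty()) {
            entries.remove(key);
        }
    }

    // Read-only view of the members stored under a key (empty if none).
    SortedSet<UUID> find(UUID key) {
        SortedSet<UUID> set = entries.get(key);
        return set == null
                ? Collections.<UUID>emptySortedSet()
                : Collections.unmodifiableSortedSet(set);
    }
}
```

With this shape, an incidence index would map an atom to the links pointing to it, a type index would map a type atom to its instances, and a value index would map a value to the atoms holding it.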

Architecture

Types

programming language neutral

maps data values to/from permanent storage

type is an atom too

capable of storing, constructing and removing instances to/from storage

subtype/supertype relationships

Type system

is bootstrapped from basic types

predefined numbers, strings, records, lists, maps

HGAtomType interface

each type atom implements this one

has an Object make(…) method

a type constructor is a type atom whose make method returns an HGAtomType instance

the record type constructor manages records

a single record's parts are managed recursively

as atoms

as values
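The idea that a type is responsible for storing, constructing and removing its instances can be sketched as follows. This is a deliberately simplified stand-in: `SketchAtomType` and `SketchStringType` are hypothetical, the storage is a plain map, and the signatures do not reproduce the real `HGAtomType` interface, whose `make(…)` parameters are elided on the slide.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.UUID;

// Simplified stand-in for the idea behind HGAtomType; NOT the real interface.
interface SketchAtomType<T> {
    UUID store(T instance, Map<UUID, byte[]> storage);      // write a value, return its ID
    T make(UUID valueId, Map<UUID, byte[]> storage);        // rebuild the run-time instance
    void release(UUID valueId, Map<UUID, byte[]> storage);  // remove the value from storage
}

// A basic string type that maps values to and from raw bytes.
final class SketchStringType implements SketchAtomType<String> {
    @Override
    public UUID store(String instance, Map<UUID, byte[]> storage) {
        UUID id = UUID.randomUUID();
        storage.put(id, instance.getBytes(StandardCharsets.UTF_8));
        return id;
    }

    @Override
    public String make(UUID valueId, Map<UUID, byte[]> storage) {
        byte[] raw = storage.get(valueId);
        if (raw == null) {
            throw new IllegalStateException("no value stored under " + valueId);
        }
        return new String(raw, StandardCharsets.UTF_8);
    }

    @Override
    public void release(UUID valueId, Map<UUID, byte[]> storage) {
        storage.remove(valueId);
    }
}
```

In the same spirit, a type constructor would be a `SketchAtomType` whose `make` returns another `SketchAtomType`; that step is left out here to keep the sketch short.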

Java typing

Indices

we are able to create indices

maintained at primitive layer

handled by type implementation

and at model layer too

are always associated with atom types (and sub-types)

interface HGIndexer

instances are atoms

produces a key for given atom

predefined indexers

ByPartIndexer, ByTargetIndexer, CompositeIndexer, LinkIndexer, TargetToTargetIndexer
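The core idea, an indexer that produces a key for a given atom, can be illustrated with a plain function from atom to key. The classes `SketchIndex` and `SketchBook` below are hypothetical and do not reproduce the `HGIndexer` interface; indexing a book by its page count is loosely analogous to indexing by one part of a record.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical sketch of "an indexer produces a key for a given atom".
final class SketchIndex<A, K extends Comparable<K>> {
    private final Function<A, K> indexer;                     // extracts the key from an atom
    private final Map<K, List<A>> entries = new TreeMap<>();  // sorted by key

    SketchIndex(Function<A, K> indexer) {
        this.indexer = indexer;
    }

    // Called whenever an atom of the indexed type is added.
    void onAtomAdded(A atom) {
        entries.computeIfAbsent(indexer.apply(atom), k -> new ArrayList<>()).add(atom);
    }

    // All atoms whose key equals the given one (empty if none).
    List<A> find(K key) {
        return entries.getOrDefault(key, List.of());
    }
}

// Made-up record-like value used only for this example.
final class SketchBook {
    final String name;
    final int pageCount;

    SketchBook(String name, int pageCount) {
        this.name = name;
        this.pageCount = pageCount;
    }
}
```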

Queries

traversal

DF or BF

adjacency

depending on atom type, traversal direction

predicate match

not necessarily linked atoms

pattern matching of graph structures

special query language needed (e.g. SPARQL)
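A breadth-first traversal over a hypergraph differs from the ordinary graph case only in that one link can yield several neighbours at once. The sketch below assumes a toy representation (each link is a list of node names) and a hypothetical class `SketchTraversal`; it is not the HyperGraphDB traversal API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical breadth-first traversal over a tiny hypergraph.
// Each inner list is one link (hyperedge) connecting any number of nodes.
final class SketchTraversal {
    static Set<String> breadthFirst(String start, List<List<String>> links) {
        Set<String> visited = new LinkedHashSet<>(); // keeps discovery order
        Deque<String> queue = new ArrayDeque<>();
        visited.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove(); // FIFO order gives breadth-first
            for (List<String> link : links) {
                if (!link.contains(current)) {
                    continue; // this link is not incident to the current node
                }
                for (String neighbour : link) {
                    if (visited.add(neighbour)) {
                        queue.add(neighbour);
                    }
                }
            }
        }
        return visited;
    }
}
```

Taking elements from the other end of the deque (`removeLast()` instead of `remove()`) would turn this into a depth-first variant.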

Predicate match

set-oriented queries

set of query primitives:

eq(x), lt(x), eq("name", x): compare the atom's value

target(LinkID): atom belongs to the target set of LinkID

incident(TargetID): atom points to TargetID

arity(n): arity of the atom is n

and, or, not

lazy evaluation
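The query primitives above compose naturally as predicates over atoms. The following sketch mirrors their names with `java.util.function.Predicate` and uses a lazy `Stream` for evaluation; the class `SketchQuery` and its nested `Atom` are hypothetical, and the real HyperGraphDB condition classes are not reproduced here.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Stream;

// Hypothetical mini version of set-oriented predicate matching.
final class SketchQuery {
    static final class Atom {
        final Object value;
        final List<Atom> targets;

        Atom(Object value, Atom... targets) {
            this.value = value;
            this.targets = List.of(targets);
        }
    }

    // The atom's value equals x.
    static Predicate<Atom> eq(Object x) { return a -> a.value.equals(x); }

    // The arity of the atom is n.
    static Predicate<Atom> arity(int n) { return a -> a.targets.size() == n; }

    // The atom points to the given target.
    static Predicate<Atom> incident(Atom target) { return a -> a.targets.contains(target); }

    // The atom belongs to the target set of the given link.
    static Predicate<Atom> target(Atom link) { return a -> link.targets.contains(a); }

    // Streams are lazy: nothing is evaluated until a terminal operation runs.
    static Stream<Atom> find(List<Atom> atoms, Predicate<Atom> condition) {
        return atoms.stream().filter(condition);
    }
}
```

The logical operators come for free from `Predicate`: `arity(2).and(incident(x))` selects binary links that point to `x`, and `eq("X").negate()` plays the role of `not`.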

Transactions

multiversion consistency check

ACI by default (ACID without durability)

upon failure committed data may be lost

transaction nesting

auto-transactions (for updates)

Distribution

implemented at model layer

peer-to-peer

Agent Communication Language

propose, accept, inform, request, query,…

not total availability

eventually consistent

upon startup each agent broadcasts interest in certain atoms (sending subscribe)

each peer listens to atom events; after an update, addition or removal it notifies interested peers (sending inform)

local transactions are linearly ordered by a version number and logged (ensures consistency, can reach all interested peers)

a peer that received a transaction notification must acknowledge it and decide whether to enact the transaction locally or not
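The subscribe/inform exchange can be sketched in a single process, without any networking. Everything below is an assumption made for illustration: the class `SketchPeer`, the string-based atom types and the message format are hypothetical and do not come from HyperGraphDB's peer framework.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-process sketch of the subscribe/inform exchange.
final class SketchPeer {
    final String name;
    private long version = 0; // orders this peer's local transactions
    private final Map<String, Set<SketchPeer>> subscribers = new HashMap<>(); // atom type -> interested peers
    final List<String> received = new ArrayList<>(); // notifications this peer has acknowledged

    SketchPeer(String name) {
        this.name = name;
    }

    // "subscribe": another peer declares interest in atoms of the given type.
    void subscribe(SketchPeer other, String atomType) {
        subscribers.computeIfAbsent(atomType, k -> new HashSet<>()).add(other);
    }

    // A local change gets the next version number and is sent out as "inform".
    long update(String atomType, String change) {
        long v = ++version;
        for (SketchPeer p : subscribers.getOrDefault(atomType, Set.of())) {
            p.inform(name, v, change);
        }
        return v;
    }

    // The receiver acknowledges by recording the event; whether to enact it
    // locally is left to the receiver and not modelled here.
    void inform(String from, long v, String change) {
        received.add(from + "#" + v + ":" + change);
    }
}
```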

DEMO

assume we have the following situation:

a library contains some books, every book has an author, someone can borrow some books, and there can be friendships between people

[Diagram: demo data model]

Human

Author: First name, Last name, Nationality

Reader: First name, Last name

Book: Name, Page count

relationships: written by, friendship, lent

Queries

we can now query the database:

set-oriented queries:

find all books of an author X

find all books that are currently lent to a friend of a person X

traversal-oriented:

get all people who are connected to me via my friends
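To make the three demo queries concrete, here is a small in-memory stand-in for the library. It is a sketch under stated assumptions: `SketchLibrary` and its maps are hypothetical, the data is supplied by the caller, and a real HyperGraphDB application would express these as queries and traversals over atoms instead.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical in-memory stand-in for the demo library.
final class SketchLibrary {
    final Map<String, String> writtenBy;    // book -> author
    final Map<String, String> lentTo;       // book -> reader currently holding it
    final Map<String, Set<String>> friends; // person -> direct friends (symmetric)

    SketchLibrary(Map<String, String> writtenBy,
                  Map<String, String> lentTo,
                  Map<String, Set<String>> friends) {
        this.writtenBy = writtenBy;
        this.lentTo = lentTo;
        this.friends = friends;
    }

    // Set-oriented: all books of an author.
    Set<String> booksBy(String author) {
        return writtenBy.entrySet().stream()
                .filter(e -> e.getValue().equals(author))
                .map(Map.Entry::getKey)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    // Set-oriented: all books currently lent to a direct friend of the person.
    Set<String> booksLentToFriendsOf(String person) {
        Set<String> direct = friends.getOrDefault(person, Set.of());
        return lentTo.entrySet().stream()
                .filter(e -> direct.contains(e.getValue()))
                .map(Map.Entry::getKey)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    // Traversal-oriented: everyone reachable through chains of friendships.
    Set<String> connectedTo(String person) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(person);
        queue.add(person);
        while (!queue.isEmpty()) {
            for (String f : friends.getOrDefault(queue.remove(), Set.of())) {
                if (seen.add(f)) {
                    queue.add(f);
                }
            }
        }
        seen.remove(person); // the starting person is not part of the answer
        return seen;
    }
}
```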

Thank you!
