10/06/16, 11:29 PM
2.1 Nodes
- Nodes are typically used to represent entities (or complex value types).
- Nodes can have properties, which are key/value pairs. Values can be
primitives or collections of primitives.
- Nodes can have zero or more relationships connecting them to other nodes.
2.2 Relationships
- Relationships represent the connections between nodes and give those
nodes context.
https://www.airpair.com/neo4j/posts/getting-started-with-neo4j-and-cypher
- Relationships must have a start node and an end node, so every relationship
has a direction. Direction can be ignored at query time, so the fact that it
is there does not mean it must be used.
- Relationships must have a relationship type.
- Relationships can have properties (key/value pairs; values can be primitives
or collections of primitives).
2.3 Properties
- Nodes and relationships can have properties (key/value pairs; values can be
primitives or collections of primitives).
- Properties can quantify relationships.
2.4 Labels
- Nodes can have zero or more labels.
- Labels can represent roles, categories or types.
- Labels are used to define indexes and constraints.
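The four building blocks above can be sketched in a few lines of plain Python. This is only an illustration of the data model, not how Neo4j stores or exposes data; the class names Node and Relationship and the helper neighbours are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set = field(default_factory=set)        # zero or more labels
    properties: dict = field(default_factory=dict)  # key/value pairs


@dataclass
class Relationship:
    start: Node   # every relationship has a start node...
    end: Node     # ...and an end node, which gives it a direction
    type: str     # exactly one relationship type
    properties: dict = field(default_factory=dict)


def neighbours(node, relationships):
    """Return the nodes connected to `node`, ignoring direction."""
    result = []
    for rel in relationships:
        if rel.start is node:
            result.append(rel.end)
        elif rel.end is node:
            result.append(rel.start)
    return result


neo = Node({"User"}, {"name": "neo"})
pill = Node({"Asset"}, {"uri": "/the/red/pill"})
access = Relationship(neo, pill, "HAS_ACCESS", {"strength": 0.9})
```

Here neighbours(pill, [access]) finds neo even though the relationship points from neo to pill, which mirrors ignoring direction at query time.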
pattern, where ident and ident2 are identifiers. Relationship identifiers are
specified within square brackets, with an optional type after a colon, like
(u)-[r:HAS_ACCESS]->(a). Labels are specified similarly to relationship types, following
not only while querying, but also while creating new nodes and relationships.
connecting nodes and relationships. If there aren't any for a particular n with
an r.strength > 0.5, it will return null for r and m, while still returning n:
MATCH (n)
OPTIONAL MATCH (n)-[r]-(m)
WHERE r.strength > 0.5
RETURN n, r, m
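The behaviour of this query can be mimicked in plain Python to make the null-filling explicit. This is a toy sketch, assuming relationships are given as (node, strength, node) tuples; the function name optional_match is hypothetical and nothing here talks to Neo4j.

```python
def optional_match(nodes, rels, min_strength=0.5):
    """Return (n, r, m) rows; r and m are None when n has no qualifying relationship."""
    rows = []
    for n in nodes:
        matched = False
        for a, strength, b in rels:
            # The pattern is undirected, so n may sit at either end.
            if strength > min_strength and n in (a, b):
                m = b if n == a else a
                rows.append((n, strength, m))
                matched = True
        if not matched:
            # Keep the node, with nulls for the optional parts.
            rows.append((n, None, None))
    return rows


rows = optional_match(["a", "b", "c"], [("a", 0.9, "b"), ("b", 0.2, "c")])
```

Node c still appears in the result, because its only relationship has a strength below the threshold and the optional parts are filled with None.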
RETURN is like SELECT in SQL. You list all of the expressions and fields you want to
return, along with optional aliases. This returns the names of the first three users,
sorted by name, along with 3^2 (9) on each row:
// assumption: u is bound to :User nodes, as in the later examples
MATCH (u:User)
RETURN 3^2 as nine, u.name as userName
ORDER BY userName
LIMIT 3
MATCH (n:Person)-[:FRIEND]-(f)
WITH count(f) as c, n
MATCH (n)-[:FRIEND]-()-[:FRIEND]-(fof)
RETURN n, c, fof
3.2.5 CREATE
CREATE will create all new parts in a pattern. This query will create a new :Bar node,
and a :FOO relationship connecting out from the n node that has node id 1:
MATCH (n)
WHERE id(n) = 1
CREATE (n)-[:FOO]->(b:Bar)
RETURN n, b
Note that, if we run this query more than once, we'll end up with more than one
:Bar node connecting to our n node. Each time we'll get a new b returned out of
the query.
3.2.6 MERGE
MERGE
MATCH (n)
WHERE id(n) = 1
MERGE (n)-[:FOO]->(b:Bar)
RETURN n, b
If we run this query more than once, we'll end up with just one :Bar node connected to
our n node. This is because, once the pattern exists, MERGE matches it instead of
creating it again, so the b node returned will always be the same.
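The difference between the two clauses can be sketched with a toy in-memory graph. This is an illustration of the match-or-create idea only, not Neo4j's implementation; the ToyGraph class and its methods are hypothetical.

```python
class ToyGraph:
    def __init__(self):
        self.nodes = {}    # node id -> label
        self.rels = []     # (start id, relationship type, end id)
        self._next_id = 0

    def add_node(self, label):
        node_id = self._next_id
        self._next_id += 1
        self.nodes[node_id] = label
        return node_id

    def create(self, start, rel_type, label):
        """Like CREATE (n)-[:T]->(b:Label): always adds a new node and relationship."""
        end = self.add_node(label)
        self.rels.append((start, rel_type, end))
        return end

    def merge(self, start, rel_type, label):
        """Like MERGE (n)-[:T]->(b:Label): reuse a matching pattern, else create it."""
        for s, t, e in self.rels:
            if s == start and t == rel_type and self.nodes[e] == label:
                return e
        return self.create(start, rel_type, label)


g = ToyGraph()
n = g.add_node("Start")
first = g.merge(n, "FOO", "Bar")
second = g.merge(n, "FOO", "Bar")   # matches the existing pattern
third = g.create(n, "FOO", "Bar")   # adds a second :Bar node
```

Calling merge twice returns the same node id, while every call to create grows the graph.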
more. When you use these, the rest of the results asked for become the grouping keys.
In SQL you would need to explicitly define which fields to GROUP BY; in Cypher, the
grouping is implicit.
This query finds the count of outbound relationships for each node. If the RETURN
n, count(1) were changed to be just RETURN count(1), it would give the total count
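The implicit grouping described above can be mimicked in plain Python. This is a sketch under the assumption that outbound relationships are given as (start, end) pairs; the sample data and function names are hypothetical.

```python
from collections import Counter

# Hypothetical outbound relationships as (start node, end node) pairs.
rels = [("neo", "trinity"), ("neo", "morpheus"), ("trinity", "neo")]


def count_per_node(rels):
    """Like RETURN n, count(1): n acts as the implicit grouping key."""
    return Counter(start for start, _ in rels)


def count_total(rels):
    """Like RETURN count(1): no grouping key, so one total."""
    return len(rels)
```

With the sample data, count_per_node groups by the start node, while count_total collapses everything into a single number.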
It doesn't look so bad, but consider the queries we'll need to run in order to decide
whether a particular user has access to a particular asset. They might be directly
connected via user_asset_access, or they might be in a group that has access via
group_asset_access. Or they may even be in a group that's part of a larger group
that has access via group_asset_access, and so on...
Some SQL engines even have a hierarchical index optimization to handle this a
little better; these are still somewhat limited, though. The hierarchical nature of
the model makes it a good fit for a graph database.
Here's a quick script (SQL Fiddle) to create the schema in PostgreSQL, so you can
follow along:
CREATE TABLE USERS (id SERIAL, name varchar(50));
CREATE TABLE GROUPS (id SERIAL, name varchar(50));
CREATE TABLE USER_GROUPS (user_id integer, group_id integer);
CREATE TABLE GROUP_GROUPS (parent_group_id integer, group_id integer);
CREATE TABLE USER_ASSET_ACCESS(user_id integer, asset_id integer);
CREATE TABLE GROUP_ASSET_ACCESS(group_id integer, asset_id integer);
CREATE TABLE ASSETS (id SERIAL, uri varchar(1000));
INSERT INTO USERS (name) values('neo');
INSERT INTO USERS (name) values('morpheus');
INSERT INTO USERS (name) values('trinity');
INSERT INTO USERS (name) values('cypher');
You end up writing long queries like this (SQL Fiddle) in order to determine whether
someone has access to a particular asset. Here, if any of the returned counts is
greater than 0, the user has access:
SELECT count(1)
FROM users u, user_asset_access uaa, assets a
WHERE u.id = uaa.user_id
AND uaa.asset_id = a.id
AND a.uri = '/mainframe'
AND u.name = 'smith'
UNION ALL
SELECT count(1)
FROM users u, user_groups ug, groups g, group_asset_access gaa, assets a
WHERE u.id = ug.user_id
AND g.id = ug.group_id
AND gaa.asset_id = a.id
AND gaa.group_id = g.id
AND a.uri = '/mainframe'
AND u.name = 'smith'
UNION ALL
SELECT count(1)
FROM users u, user_groups ug, groups g, groups pg, group_groups gg, group_asset_access gaa
WHERE u.id = ug.user_id
One of the downsides is that this query has a fixed depth, and you need to keep
adding UNIONs depending on how deep your group hierarchy can be. If there's no
theoretical limit, you'll need to build the queries programmatically and run
more than one.
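Building the query programmatically might look like the following sketch. It uses Python's built-in sqlite3 module instead of PostgreSQL so that it runs standalone, and it joins the membership tables directly by id, so the groups table is not needed. The function names, the sample rows and the group ids are assumptions made for illustration.

```python
import sqlite3


def access_query(max_depth):
    """Build a UNION ALL query: direct access plus group nesting up to max_depth."""
    where = " WHERE u.name = :name AND a.uri = :uri"
    parts = [
        "SELECT count(1) FROM users u"
        " JOIN user_asset_access uaa ON uaa.user_id = u.id"
        " JOIN assets a ON a.id = uaa.asset_id" + where
    ]
    for depth in range(max_depth + 1):
        joins = [" JOIN user_groups ug ON ug.user_id = u.id"]
        last = "ug.group_id"
        for i in range(depth):
            # One extra self-join on group_groups per level of nesting.
            joins.append(f" JOIN group_groups gg{i} ON gg{i}.group_id = {last}")
            last = f"gg{i}.parent_group_id"
        joins.append(f" JOIN group_asset_access gaa ON gaa.group_id = {last}")
        joins.append(" JOIN assets a ON a.id = gaa.asset_id")
        parts.append("SELECT count(1) FROM users u" + "".join(joins) + where)
    return " UNION ALL ".join(parts)


def has_access(conn, name, uri, max_depth):
    """True if any branch of the generated query returns a count above zero."""
    rows = conn.execute(access_query(max_depth), {"name": name, "uri": uri}).fetchall()
    return any(count > 0 for (count,) in rows)


def demo_connection():
    """In-memory database: neo -> group 10 -> group 20 -> group 30 -> asset."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE assets (id INTEGER PRIMARY KEY, uri TEXT);
        CREATE TABLE user_groups (user_id INTEGER, group_id INTEGER);
        CREATE TABLE group_groups (parent_group_id INTEGER, group_id INTEGER);
        CREATE TABLE user_asset_access (user_id INTEGER, asset_id INTEGER);
        CREATE TABLE group_asset_access (group_id INTEGER, asset_id INTEGER);
        INSERT INTO users VALUES (1, 'neo');
        INSERT INTO assets VALUES (1, '/the/red/pill');
        INSERT INTO user_groups VALUES (1, 10);
        INSERT INTO group_groups VALUES (20, 10);
        INSERT INTO group_groups VALUES (30, 20);
        INSERT INTO group_asset_access VALUES (30, 1);
    """)
    return conn
```

In this sample data the asset is two group hops away from the user's own group, so has_access only finds it once max_depth reaches 2. The generated SQL grows with every extra level, which is the limitation the text describes.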
Most SQL databases provide similar functionality, but the end result will be
CSV-encoded dumps of the tables that we can use to direct the data into Neo4j. We'll
need to use LOAD CSV along with the MERGE clause to bring the data in.
First we'll make some unique constraints (which come with indexes), as well as
some indexes for our non-unique data:
CREATE CONSTRAINT ON (u:User) ASSERT u.id IS UNIQUE;
CREATE CONSTRAINT ON (g:Group) ASSERT g.id IS UNIQUE;
CREATE CONSTRAINT ON (a:Asset) ASSERT a.id IS UNIQUE;
CREATE INDEX ON :User(name);
Let's start with the simple parts, individual entity creation. We'll use the ids as
"unique identifiers" for the entities:
LOAD CSV FROM 'file:...' AS line
MERGE (:User {id:toInt(line[0]), name:line[1]});
LOAD CSV FROM 'file:...' AS line
MERGE (:Group {id:toInt(line[0]), name:line[1]});
LOAD CSV FROM 'file:...' AS line
MERGE (:Asset {id:toInt(line[0]), uri:line[1]});
Now we can connect the entities with relationships using the join table CSV
exports:
LOAD CSV FROM 'file:...' AS line
MATCH (u:User {id:toInt(line[0])}), (a:Asset {id:toInt(line[1])})
MERGE (u)-[:HAS_ACCESS]->(a);
LOAD CSV FROM 'file:...' AS line
MATCH (g:Group {id:toInt(line[0])}), (a:Asset {id:toInt(line[1])})
MERGE (g)-[:HAS_ACCESS]->(a);
LOAD CSV FROM 'file:...' AS line
MATCH (u:User {id:toInt(line[0])}), (g:Group {id:toInt(line[1])})
MERGE (u)-[:IS_MEMBER]->(g);
LOAD CSV FROM 'file:...' AS line
MATCH (p:Group {id:toInt(line[0])}), (g:Group {id:toInt(line[1])})
MERGE (g)-[:IS_MEMBER]->(p);
Let's see whether Neo has access to /the/red/pill, using the shortestPath
function to optimize the search (note the difference in complexity between this
query and the SQL that accomplished a similar result):
LOAD CSV gives us a result for each line of the CSV, a collection of strings. In order
to use those strings as integers (like the ids), we can use the conversion function
toInt. Cypher will do an index lookup on the group id and the asset id in the MATCH,
This query does a lot of things at once in the first MATCH. It can actually be broken
out to make the lines shorter, if desired:
MATCH shortestPath((neo:User)-[:HAS_ACCESS|IS_MEMBER*]->(a:Asset))
WHERE neo.name = 'neo' AND a.uri = '/the/red/pill'
RETURN count(*) > 0 as hasAccess
The above two queries are exactly the same, and you can decide whether you
prefer the inline property matching syntax, or the WHERE property matching syntax.
Let's start by looking at the pattern and WHERE filters.
(neo:User)-[:HAS_ACCESS|IS_MEMBER*]->(a:Asset)
WHERE neo.name = 'neo' AND a.uri = '/the/red/pill'
In this case, the query is almost the same, except we start with the :Asset(uri)
lookup, and then calculate the shortest path for all :User-labeled nodes. Note that
the :User has no name specified in the pattern. The names that are returned belong
to the users that have a path made of :HAS_ACCESS or :IS_MEMBER relationships, so
the query is able to traverse the group hierarchy as well as the access hierarchy.
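What the variable-length pattern asks for can be sketched as a breadth-first search over outgoing :HAS_ACCESS and :IS_MEMBER relationships. This is a plain-Python illustration of the idea, not how Neo4j evaluates shortestPath; the function name can_reach, the group names and the edge data are hypothetical.

```python
from collections import deque


def can_reach(edges, user, asset):
    """Follow outgoing HAS_ACCESS / IS_MEMBER edges from user; True if asset is reached."""
    seen = {user}
    queue = deque([user])
    while queue:
        current = queue.popleft()
        for nxt in edges.get(current, []):
            if nxt == asset:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Hypothetical graph: neo is a member of 'crew', which is a member of 'zion',
# and 'zion' has access to the asset.
edges = {
    "neo": ["crew"],
    "crew": ["zion"],
    "zion": ["/the/red/pill"],
    "smith": [],
}
```

The search works at any nesting depth without changing the code, which is the property the single Cypher pattern provides and the UNION-based SQL lacks.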
8 Conclusions
This is a simplified example of an ACL, but it should be apparent that the
simpler queries, and the direct transfer of the data model to the graph
structure, make it a great fit for Neo4j. There are many other use cases the graph