FACEBOOK GRAPH SEARCH
How to create your own graph search using
Neo4j

jDays 2013
Ole-Martin Mørk
26/11/13
ABOUT ME
Ole-Martin Mørk
Scientist
Bekk Consulting AS
Oslo, Norway

twitter: olemartin
AGENDA
INTRODUCTION TO SEARCH
INTRODUCTION TO NEO4J
INTRODUCTION TO PARSING
GRAPH SEARCH
GRAPH

[Figure: example graph whose relationships are labelled BETRAYS, KNOWS and LOVES]
NODE

PERSON --LIVED_AT--> ADDRESS

PERSON
name: Thomas
age: 24

ADDRESS
street: Aker
number: 15
RELATIONSHIP

LIVED_AT
from:
to:
RELATIONAL DATABASES
“PATH EXISTS” RESPONSE TIME

- One database containing 1,000 persons
- At most 50 friends per person
- Detect whether two random persons are connected via friends

Database        Number of persons   Response time
Relational db   1,000               2000 ms
Neo4j           1,000               2 ms
Neo4j           1,000,000           2 ms
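
The "path exists" question amounts to a graph traversal. A minimal in-memory sketch of that traversal follows, assuming a plain breadth-first search over an adjacency map. The class `FriendGraph` and its methods are hypothetical names for illustration; this is not how Neo4j or a relational database implements the query, and it says nothing about the response times in the table.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

/** Illustrative in-memory friendship graph (hypothetical, not the Neo4j API). */
class FriendGraph {
    private final Map<String, Set<String>> friends = new HashMap<>();

    /** Adds an undirected friendship between two persons. */
    void addFriendship(String a, String b) {
        friends.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        friends.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    /** Breadth-first search: true if 'to' can be reached from 'from' via friends. */
    boolean pathExists(String from, String to) {
        Set<String> visited = new HashSet<>();
        Queue<String> queue = new ArrayDeque<>();
        visited.add(from);
        queue.add(from);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(to)) {
                return true;
            }
            Set<String> neighbours =
                    friends.getOrDefault(current, Collections.<String>emptySet());
            for (String friend : neighbours) {
                // Set.add returns false for persons already seen, so each is queued once.
                if (visited.add(friend)) {
                    queue.add(friend);
                }
            }
        }
        return false;
    }
}
```

With friendships A-B and B-C registered, `pathExists("A", "C")` is true, while a person in a separate group of friends is reported as not connected.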
Neo4j
GRAPH

[Figure: example graph with nodes A, B, C, D, E, F, G, H and I]
Cypher
CYPHER

( ) --> ( )
CYPHER

a

b

(a) --> (b)
CYPHER

a

b

c

(a)-->(b)<--(c)
CYPHER

a

kjenner (knows)

b

(a)-[:kjenner]->(b)
CYPHER SEARCH

START person=node:person(name="Ole-Martin")
START school=node:school("name:Norw*")
START student=node:student("year:(3 OR 4 OR 5)")
FACEBOOK GRAPH SEARCH WITH CYPHER

START me=node:person(name = "Ole-Martin"),
      location=node:location(location = "Göteborg"),
      cuisine=node:cuisine(cuisine = "Sushi")
MATCH (me)-[:IS_FRIEND_OF]->(friend)-[:LIKES]->(restaurant)
      -[:LOCATED_IN]->(location), (restaurant)-[:SERVES]->(cuisine)
RETURN restaurant
Grammar
GRAMMAR

A “language” can be formally defined as “any system of
formalized symbols, signs, etc. used for communication”

A “grammar” can be defined as “the set of structural rules
that governs sentences, words, etc. in a natural language”
TEXT PARSING

CFG (context-free grammar) vs. PEG (parsing expression grammar)
GRAMMAR

Alfred, who loved fishing, bought fish at the store
downtown

(Alfred, (who loved (fishing)), (bought
(fish (at (the store (downtown))))))
additionExp
    : multiplyExp
      ( '+' multiplyExp
      | '-' multiplyExp
      )*
    ;

An additionExp is defined as a multiplyExp followed by zero or more
occurrences of + or - and another multiplyExp

multiplyExp
    : atomExp
      ( '*' atomExp
      | '/' atomExp
      )*
    ;

A multiplyExp is defined as an atomExp followed by zero or more
occurrences of * or / and another atomExp

atomExp
    : Number
    | '(' additionExp ')'
    ;

An atomExp is defined as a Number or a parenthesized additionExp

Number
    : ('0'..'9')+
    ;

A Number is one or more characters between 0 and 9
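
To make the grammar concrete, here is a minimal hand-written recursive-descent sketch with one method per rule. It is an illustration only: the class name `ArithmeticEvaluator` is hypothetical, it evaluates while parsing, uses integer division, and does not accept whitespace, since the grammar above defines none.

```java
/** Recursive-descent evaluator with one method per grammar rule (illustrative sketch). */
class ArithmeticEvaluator {
    private final String input;
    private int pos = 0;

    private ArithmeticEvaluator(String input) {
        this.input = input;
    }

    /** Evaluates the whole input; throws if anything is left unparsed. */
    static int evaluate(String text) {
        ArithmeticEvaluator parser = new ArithmeticEvaluator(text);
        int value = parser.additionExp();
        if (parser.pos != text.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return value;
    }

    // additionExp : multiplyExp ( '+' multiplyExp | '-' multiplyExp )* ;
    private int additionExp() {
        int value = multiplyExp();
        while (at('+') || at('-')) {
            char op = input.charAt(pos++);
            int right = multiplyExp();
            value = (op == '+') ? value + right : value - right;
        }
        return value;
    }

    // multiplyExp : atomExp ( '*' atomExp | '/' atomExp )* ;
    private int multiplyExp() {
        int value = atomExp();
        while (at('*') || at('/')) {
            char op = input.charAt(pos++);
            int right = atomExp();
            value = (op == '*') ? value * right : value / right; // integer division
        }
        return value;
    }

    // atomExp : Number | '(' additionExp ')' ;
    private int atomExp() {
        if (at('(')) {
            pos++; // consume '('
            int value = additionExp();
            if (!at(')')) {
                throw new IllegalArgumentException("')' expected at position " + pos);
            }
            pos++; // consume ')'
            return value;
        }
        return number();
    }

    // Number : ('0'..'9')+ ;
    private int number() {
        int start = pos;
        while (pos < input.length() && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Number expected at position " + pos);
        }
        return Integer.parseInt(input.substring(start, pos));
    }

    /** True if the next unread character is 'c'. */
    private boolean at(char c) {
        return pos < input.length() && input.charAt(pos) == c;
    }
}
```

Because additionExp is built from multiplyExp, multiplication binds tighter than addition: `ArithmeticEvaluator.evaluate("2+3*4")` returns 14, and `evaluate("(2+3)*4")` returns 20.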
import org.parboiled.BaseParser;
import org.parboiled.Rule;

class CalculatorParser extends BaseParser<Object> {

    // Expression: a Term, then any number of ("+" or "-") Term pairs
    Rule Expression() {
        return Sequence(
            Term(),
            ZeroOrMore(AnyOf("+-"), Term())
        );
    }

    // Term: a Factor, then any number of ("*" or "/") Factor pairs
    Rule Term() {
        return Sequence(
            Factor(),
            ZeroOrMore(AnyOf("*/"), Factor())
        );
    }

    // Factor: a Number or a parenthesized Expression
    Rule Factor() {
        return FirstOf(
            Number(),
            Sequence('(', Expression(), ')')
        );
    }

    // Number: one or more digits
    Rule Number() {
        return OneOrMore(CharRange('0', '9'));
    }
}

An Expression is a Term followed by zero or more occurrences of “+” or “-”
followed by a Term

A Term is a Factor followed by zero or more occurrences of “*” or “/”
followed by a Factor

A Factor is a Number or a parenthesized Expression

A Number is one or more characters between 0 and 9
PEG VS CFG

PEG's FirstOf operator is an ordered choice; a CFG's | operator is an unordered choice
PEG does not have a separate tokenizing step
CFG might come across as more powerful, but is also more difficult to master
PEG does not allow ambiguity in the grammar
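
The difference between ordered and unordered choice is easiest to see on alternatives that share a prefix. The sketch below assumes a toy ordered choice over string literals; `OrderedChoiceDemo`, `firstOf` and `matchesWhole` are hypothetical helpers, not parboiled's API.

```java
/** Toy model of PEG ordered choice over literal alternatives (illustrative sketch). */
class OrderedChoiceDemo {

    /**
     * Tries the alternatives in the order given and commits to the first one that
     * matches at 'pos'. Returns the number of characters consumed, or -1 if no
     * alternative matches.
     */
    static int firstOf(String input, int pos, String... alternatives) {
        for (String alternative : alternatives) {
            if (input.startsWith(alternative, pos)) {
                return alternative.length();
            }
        }
        return -1;
    }

    /**
     * Models "choice followed by end of input". Once an alternative has matched,
     * later alternatives are never tried, even if the end-of-input check fails.
     */
    static boolean matchesWhole(String input, String... alternatives) {
        return firstOf(input, 0, alternatives) == input.length();
    }
}
```

With the alternatives ordered as "to", "to the city", the input "to the city" is not matched as a whole, because the choice commits to "to". Listing the longer alternative first makes it match. In a CFG the order of the alternatives around | would not matter.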
PARBOILED
PARBOILED

Parser library for parsing expression grammars (PEGs)
Lightweight
Easy to use
Implementations in Scala and Java
Rules are written in the programming language
I went for a walk downtown

Sequence("I", "went", "for", "a", "walk", "downtown")
[Diagram: I (went | wend) for a walk (downtown | to the city)]

Sequence(
    String("I"),
    FirstOf("went", "wend"),
    Sequence("for", "a", "walk"),
    FirstOf("downtown", "to the city"));
[Diagram: I (went | walked) for a walk (downtown | to the city) [today]]

Sequence(
    …,
    Optional(String("today")));
[Diagram: [today] I (went | walked) for a walk (downtown | to the city) [today]]

Sequence(
    Today(),
    …,
    Today());

Rule Today() { return Optional(String("today")); }
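
The sentence rules can be tried out without any library. The following is a minimal sketch with hand-rolled combinators, not parboiled itself. `SentenceGrammar`, `Matcher` and `word` are hypothetical names, the elided middle of the rule is assumed to be the "I (went | walked) for a walk (downtown | to the city)" part from the earlier slides, and `word` skips trailing spaces, which is a simplification of this sketch.

```java
/** Hand-rolled PEG-style combinators for the walking sentences (illustrative sketch). */
class SentenceGrammar {

    /** A matcher returns the position after a successful match, or -1 on failure. */
    interface Matcher {
        int match(String input, int pos);
    }

    /** Matches a literal and skips any spaces that follow it. */
    static Matcher word(String literal) {
        return (input, pos) -> {
            if (!input.startsWith(literal, pos)) {
                return -1;
            }
            int end = pos + literal.length();
            while (end < input.length() && input.charAt(end) == ' ') {
                end++;
            }
            return end;
        };
    }

    /** Matches all parts one after another; fails as soon as one part fails. */
    static Matcher sequence(Matcher... parts) {
        return (input, pos) -> {
            int current = pos;
            for (Matcher part : parts) {
                current = part.match(input, current);
                if (current < 0) {
                    return -1;
                }
            }
            return current;
        };
    }

    /** Ordered choice: the first alternative that matches wins. */
    static Matcher firstOf(Matcher... alternatives) {
        return (input, pos) -> {
            for (Matcher alternative : alternatives) {
                int end = alternative.match(input, pos);
                if (end >= 0) {
                    return end;
                }
            }
            return -1;
        };
    }

    /** Always succeeds; consumes input only if the wrapped matcher matches. */
    static Matcher optional(Matcher matcher) {
        return (input, pos) -> {
            int end = matcher.match(input, pos);
            return end >= 0 ? end : pos;
        };
    }

    static Matcher today() {
        return optional(word("today"));
    }

    /** Assumed full rule: [today] I (went | walked) for a walk (downtown | to the city) [today] */
    static Matcher sentence() {
        return sequence(
                today(),
                word("I"),
                firstOf(word("went"), word("walked")),
                word("for a walk"),
                firstOf(word("downtown"), word("to the city")),
                today());
    }

    /** True if the sentence rule consumes the entire input. */
    static boolean matches(String input) {
        return sentence().match(input, 0) == input.length();
    }
}
```

`SentenceGrammar.matches("today I walked for a walk to the city")` and `matches("I went for a walk downtown today")` both succeed, while a sentence with an unknown verb such as "I ran for a walk downtown" is rejected.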
Rule AnyOf(java.lang.String characters)
Creates a new rule that matches any of the characters in the given string.

Rule Ch(char c)
Explicitly creates a rule matching the given character.

Rule CharRange(char cLow, char cHigh)
Creates a rule matching a range of characters from cLow to cHigh (both
inclusive).

Rule FirstOf(java.lang.Object... rules)
Creates a new rule that successively tries all of the given subrules and
succeeds when the first one of its subrules matches.

Rule IgnoreCase(char... characters)
Explicitly creates a rule matching the given string in a case-independent
fashion.

Rule NoneOf(char... characters)
Creates a new rule that matches all characters except the ones in the given
char array and EOI.

Rule NTimes(int repetitions, java.lang.Object rule)
Creates a new rule that repeatedly matches a given sub rule a certain fixed
number of times.
[Diagram: [today] I (went | walked) for a walk (downtown | to the city) [today]]

Sequence(
    …,
    FirstOf(
        Sequence(push("downtown"), "downtown"),
        Sequence(push("city"), "to the city")));
BEKK
CV-db
A search for Java yields 224 hits!
public Rule Expression() {
    return Sequence(
        Start(),
        FirstOf(People(), Projects(), Technologies()),
        OneOrMore(
            FirstOf(
                And(),
                Sequence(
                    Know(),
                    Subjects()
                ),
                Sequence(
                    WorkedAt(),
                    Customers()
                ),
                Sequence(
                    Know(),
                    Customers()
                ),
                Sequence(
                    Know(),
                    Technologies()
                )
            )
        )
    );
}
!

start
    fag=node:fag(navn = "Neo4J"),
    fag1=node:fag(navn = "Java"),
    prosjekt2=node:prosjekt(navn = "Modernisering")
match
    CONSULTANTS -[:KAN]-> fag,
    CONSULTANTS -[:KAN]-> fag1,
    CONSULTANTS -[:KONSULTERTE]-> prosjekt2
return
    distinct CONSULTANTS
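
A query of this shape can be assembled from the terms the parser recognizes. The sketch below is one possible way to do it, not the code behind the demo. `ConsultantQueryBuilder`, `knows` and `workedOn` are hypothetical names, the identifier numbering (fag, fag1, prosjekt2) is inferred from the query above, and values are concatenated without escaping, so it is unsuitable for untrusted input as written.

```java
import java.util.ArrayList;
import java.util.List;

/** Builds a legacy START/MATCH Cypher query string (illustrative sketch). */
class ConsultantQueryBuilder {
    private final List<String> startClauses = new ArrayList<>();
    private final List<String> matchClauses = new ArrayList<>();

    /** Adds a "knows this subject" constraint via the fag index and a KAN relationship. */
    ConsultantQueryBuilder knows(String subject) {
        String id = nextIdentifier("fag");
        startClauses.add(id + "=node:fag(navn = \"" + subject + "\")");
        matchClauses.add("CONSULTANTS -[:KAN]-> " + id);
        return this;
    }

    /** Adds a "worked on this project" constraint via the prosjekt index. */
    ConsultantQueryBuilder workedOn(String project) {
        String id = nextIdentifier("prosjekt");
        startClauses.add(id + "=node:prosjekt(navn = \"" + project + "\")");
        matchClauses.add("CONSULTANTS -[:KONSULTERTE]-> " + id);
        return this;
    }

    /** Assumed naming scheme: no suffix for the first lookup, then a running number. */
    private String nextIdentifier(String prefix) {
        return startClauses.isEmpty() ? prefix : prefix + startClauses.size();
    }

    /** Joins the collected clauses into a single query string. */
    String build() {
        return "start " + String.join(", ", startClauses)
                + " match " + String.join(", ", matchClauses)
                + " return distinct CONSULTANTS";
    }
}
```

Calling `new ConsultantQueryBuilder().knows("Neo4J").knows("Java").workedOn("Modernisering").build()` yields the same start, match and return parts as the query above, on a single line.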
Demo
LEARN MORE

graphdatabases.com
neo4j.org
bit.ly/neo-cyp
parboiled.org
?
Thank you!

@olemartin

Facebook Graph Search, by Ole-Martin Mørk, for jDays 2013, Gothenburg (www.jdays.se)
