[Snyk] Upgrade org.postgresql:postgresql from 42.2.8 to 42.3.1 #13

Open: wants to merge 208 commits into base: master

Commits (208)
0dc7f67
Add option to skip loading but generate only data files for PostgreSQ…
jmarton Mar 31, 2019
9cc948d
Re-order env variables in the PostgreSQL loader script and docs
jmarton Mar 31, 2019
5fa9731
Add option to generate message file (of posts and comments) PostgreSQ…
jmarton Mar 31, 2019
59dd882
Use 0.4.0-SNAPSHOT version of the driver
szarnyasg Dec 6, 2019
f36f816
Update links. Fixes #97
szarnyasg Jan 17, 2020
3dfd4dd
Classes for Cypher delete queries
szarnyasg Jan 23, 2020
2773770
Fix CI
szarnyasg Jan 23, 2020
bb67f38
initial cypher delete queries
jackwaudby Jan 23, 2020
73a7979
Add newlines at the ends of file
szarnyasg Feb 3, 2020
7047d78
Bump Neo4j version to 4.0.0
szarnyasg Feb 3, 2020
d987d5c
Merge branch 'neo4j-4' into cypher-delete
szarnyasg Feb 3, 2020
54c5508
Bump Java version
szarnyasg Feb 3, 2020
2ac3fa2
Fix function call on pattern comprehension
szarnyasg Feb 3, 2020
20dd563
Adjust Neo4j import scripts and variables to 4.0
szarnyasg Feb 3, 2020
c979142
Merge branch 'cypher-deletes' into cypher-delete
szarnyasg Feb 26, 2020
7e972c8
Adjust CSVs to new Datagen output (draft)
szarnyasg Feb 29, 2020
552fed6
Remove typo from hashbang
szarnyasg Mar 6, 2020
4359487
Show Neo4j log after importing/restarting the database
szarnyasg Mar 6, 2020
f27be93
Rework Neo4j setup and import process
szarnyasg Mar 6, 2020
67eeba9
Fix README instructions
szarnyasg Mar 6, 2020
bb5b6a5
Minor improvements in Neo4j scripts
szarnyasg Mar 6, 2020
9dc10de
Use Date and DateTime types when importing data to Neo4j
szarnyasg Mar 6, 2020
8bd062e
Update Neo4j test data to match schema
szarnyasg Mar 6, 2020
cb2d9e3
Fix import
szarnyasg Mar 6, 2020
f02d19c
Update Travis configuration to match env var names
szarnyasg Mar 6, 2020
ee101ad
Remove tests for deprecated BI queries
szarnyasg Mar 6, 2020
4ba371a
Remove deprecated Cypher queries
szarnyasg Mar 7, 2020
248aba0
Drop curly braces around Neo4j parameters. Fixes #103
szarnyasg Mar 7, 2020
61b8ca8
Use native date type in Neo4j, update parameter syntax
szarnyasg Mar 7, 2020
e3e5bdd
Optimize Cypher implementation of BI Q7
szarnyasg Mar 7, 2020
821daa9
Extend Postgres with creationDate and deletionDate attributes. Fixes …
szarnyasg Mar 7, 2020
94a0318
Make Postgres load scripts fail on errors #105
szarnyasg Mar 7, 2020
876807d
Rename PG_DATA_DIR to PG_CSV_DIR
szarnyasg Mar 7, 2020
56a2e8f
Optimize and document BI Q7 Cypher
szarnyasg Mar 7, 2020
03e67aa
Fix PG environment variable in CI conf
szarnyasg Mar 8, 2020
0ac0bfc
Ignore Postgres update tests for the time being
szarnyasg Mar 8, 2020
c912a36
Change joinDate to creationDate in Postgres queries
szarnyasg Mar 8, 2020
40d2870
Bump Postgres version on Travis
szarnyasg Mar 8, 2020
4e2e56e
Drop PG port
szarnyasg Mar 8, 2020
bb5cc96
Update Postgres test data
szarnyasg Mar 8, 2020
8ad2943
Fix indentation
szarnyasg Mar 8, 2020
49395f2
Push parameters inside BI Q3
szarnyasg Mar 8, 2020
a8ebe4c
Further cosmetic changes in Q3
szarnyasg Mar 8, 2020
b54194c
Update Neo4j syntax for example parameters in BI queries #103
szarnyasg Mar 8, 2020
db3e970
fix README
szarnyasg Mar 8, 2020
2c43761
Don't use absolute paths to PostgreSQL tools
petere Mar 18, 2020
93cf3e9
[add] clarify how to change update_interleave property which is of ex…
filipecosta90 Feb 26, 2020
d32e79b
Fix typo
szarnyasg Feb 26, 2020
3621b95
[add] updated interactive read query frequencies to match SF1
filipecosta90 Feb 27, 2020
396ca22
Use Java 11. Fixes #106
szarnyasg Mar 16, 2020
c08327b
Cleanup Cypher scripts #90
szarnyasg Mar 18, 2020
b27b70e
Merge pull request #107 from petere/postgres-path
szarnyasg Mar 19, 2020
50c7dec
Fix driver branch in CI configuration
szarnyasg Apr 9, 2020
474b96a
Use ed instead of sed to speed up conversion of CSV files
szarnyasg Apr 17, 2020
df84142
Use simpler method for changing headers
szarnyasg Apr 21, 2020
1207b86
Fix Neo4j CSV headers to keep up with Datagen
szarnyasg Apr 25, 2020
43d47b9
Add Cypher test files
szarnyasg Apr 26, 2020
da3f62f
Make SparqlConverter thread-safe
szarnyasg Apr 26, 2020
966a149
Change occurrences of SimpleDateFormat to DateTimeFormatter
szarnyasg Apr 26, 2020
93d540b
Merge pull request #118 from ldbc/thread-safe-datetimeconverter
szarnyasg Apr 26, 2020
e2c48a4
Fix compile errors in Cypher queries
szarnyasg Apr 26, 2020
492cdd1
Fix Interactive Q1 implementations: exclude start Person from results
szarnyasg May 3, 2020
bd83427
Use uniform frequencies
szarnyasg May 3, 2020
eabdf64
Add script to temporarily disable updates in Postgres runs
szarnyasg May 3, 2020
4051934
Remove unused results_log configuration parameter. Fixes #122
szarnyasg May 9, 2020
7e580a3
Merge pull request #124 from ldbc/fix-cypher-queries
szarnyasg May 9, 2020
47b9af2
Remove redundant implementations
szarnyasg May 9, 2020
08e754a
Fix use of datetime in Cypher
szarnyasg May 9, 2020
f0b8b45
Add 'disable-updates.sh' script to Cypher implementation
szarnyasg May 9, 2020
c319e60
Use new driver_mode flag in scripts #126
szarnyasg May 9, 2020
8b14c33
Add notice to SPARQL implementation on lack of maintenance
szarnyasg May 9, 2020
3e52195
Add new delete operation to the configurations (currently disabled)
szarnyasg May 9, 2020
4c71f5d
Revert "Fix Neo4j CSV headers to keep up with Datagen"
szarnyasg May 9, 2020
f72ed29
Track branch renaming
szarnyasg Jul 6, 2020
a0923bd
Fix Travis configuration
szarnyasg Jul 6, 2020
17bbdc5
Fix Travis CI build
szarnyasg Jul 7, 2020
b02c90a
Remove unused example
szarnyasg Jul 7, 2020
f9d858d
Drop unused deletionDate attribute from headers
szarnyasg Jul 7, 2020
3958c7a
Support Forum labels (Wall/Album/Group)
szarnyasg Jul 7, 2020
8688622
Adjust headers
szarnyasg Jul 7, 2020
3f505b5
Add note on citations
szarnyasg Aug 14, 2020
f6616d2
Add compatibility matrix
szarnyasg Sep 10, 2020
df14a34
Bump Neo4j version
szarnyasg Sep 18, 2020
4ee3bd7
Check that environment variables are set before loading
szarnyasg Sep 18, 2020
6ac9539
Fix CSV header
szarnyasg Sep 24, 2020
7165811
Update README
szarnyasg Sep 24, 2020
4ed912b
Update README
szarnyasg Sep 24, 2020
fbbbe1f
Merge branch 'dev' into postgresql-csv-generate-options
szarnyasg Oct 2, 2020
a10f6b8
Update Cypher test data
szarnyasg Oct 2, 2020
26f1afb
Update Cypher test data
szarnyasg Oct 2, 2020
55426f3
Merge pull request #125 from ldbc/postgresql-csv-generate-options
szarnyasg Oct 2, 2020
808cc1d
Cleanup READMEs
szarnyasg Oct 2, 2020
e4e428c
Rename old BI queries, add new ones (mix of implementations/skeletons)
szarnyasg Oct 2, 2020
05cbfac
Use consistent capitalization in Interactive short queries
szarnyasg Oct 3, 2020
62e33df
Initial Cypher implementations of Q16-Q20 completed
szarnyasg Oct 3, 2020
bd71a63
Fix counting interactions in BI Q19
szarnyasg Oct 3, 2020
6ffd1d6
Remove unmaintained subprojects (SPARQL and DBToaster)
szarnyasg Oct 4, 2020
7d34e5b
Add instructions on how to generate small test data
szarnyasg Oct 4, 2020
00ffb00
Remove old BI classes
szarnyasg Oct 4, 2020
fe4d37a
Add Cypher glue code for new BI queries
szarnyasg Oct 4, 2020
fd6b2e0
Fix example parameters
szarnyasg Oct 8, 2020
18204e9
Bump junit from 4.12 to 4.13.1 in /common
dependabot[bot] Oct 13, 2020
f7bcdac
Adjust BI query numbering in Postgres
szarnyasg Oct 16, 2020
137ac7d
Put new BI Q16-20 queries on ignore until we have SQL implementations
szarnyasg Oct 16, 2020
c1a6c43
Configure Neo4j after loading the data in CI
szarnyasg Oct 16, 2020
0b5b123
Merge pull request #137 from ldbc/cleanup-bi-queries
szarnyasg Oct 16, 2020
e9cf1bf
Merge pull request #138 from ldbc/dependabot/maven/common/junit-junit…
szarnyasg Oct 16, 2020
2353739
Fix Interactive Q2 reference implementations to use exclusive upper b…
szarnyasg Oct 19, 2020
becbaa8
Adjust Postgres load script and test data to the latest Datagen
szarnyasg Oct 23, 2020
08f9ac0
Change birthday type from 'timestamp' to 'date'
szarnyasg Oct 23, 2020
5c20529
Update Postgres update queries to match new schema
szarnyasg Oct 30, 2020
16f11e4
Print env vars used by the PostgreSQL load script
szarnyasg Nov 8, 2020
f56465c
Fix possible infinite loop in IC-13 query by recording the path.
jmarton Jun 1, 2020
1320b22
Bump Neo4j and GDS versions
szarnyasg Nov 12, 2020
2cbe65e
Check PG_USER
szarnyasg Nov 12, 2020
c3c22bb
Add tie-breaking to BI Q19
szarnyasg Nov 13, 2020
0d70ab9
Remove comment from Cypher params code
szarnyasg Nov 13, 2020
7acc874
Fix Cypher queries using weighted shortest paths
szarnyasg Nov 13, 2020
456cd87
Migrate build to CircleCI #146
szarnyasg Nov 14, 2020
9765bb6
Install cURL in CI
szarnyasg Nov 14, 2020
672f1e8
Change CI badge
szarnyasg Nov 14, 2020
605b1ba
Merge pull request #142 from ldbc/update-postgres-update-queries
szarnyasg Nov 14, 2020
a3f79c3
Merge pull request #147 from ldbc/neo4j-version-bump
szarnyasg Nov 14, 2020
cb25311
Bump Neo4j version
szarnyasg Nov 17, 2020
75236cf
Add missing space
szarnyasg Nov 17, 2020
783154f
Refine Neo4j load script
szarnyasg Nov 17, 2020
82eb168
Create indices in Neo4j load
szarnyasg Nov 17, 2020
1fefd76
Use one step load script in CI
szarnyasg Nov 17, 2020
be26b2a
Add missing netcat package
szarnyasg Nov 17, 2020
7a12532
Rework Neo4j default env vars and scripts, use path based on the dire…
szarnyasg Nov 17, 2020
236bc48
Use consistent directory names for projects
szarnyasg Nov 18, 2020
3a624ad
Update Postgres README
szarnyasg Nov 18, 2020
6b1bcbe
Fix Datagen configuration
szarnyasg Nov 18, 2020
1d28348
Update headers in accordance with the Datagen repository
szarnyasg Nov 18, 2020
8fc2e26
Download deployed generated data sets instead of storing them in this…
szarnyasg Nov 18, 2020
8c42e44
Omit changing labels to uppercase
szarnyasg Nov 24, 2020
36d170a
Run scripts in their directory
szarnyasg Nov 24, 2020
9e648ce
Removing executable bit from env var configuration script
szarnyasg Dec 5, 2020
72ea4bb
Use cypher-shell for testing whether DB is running
szarnyasg Dec 15, 2020
1e55e6f
Update queries and their Cypher implementation to match implementation
szarnyasg Dec 15, 2020
99a3d65
Revise PostgreSQL BI queries
szarnyasg Dec 15, 2020
c86e2a1
Add comment on handling Posts/Comments
szarnyasg Dec 15, 2020
bbaf3c3
Use Message in BI Q3 and Q4
szarnyasg Dec 15, 2020
11e51f1
Use 'date' parameter in BI Q2 instead of year/month
szarnyasg Dec 15, 2020
bec75c5
Dockerize Cypher implementation
szarnyasg Dec 15, 2020
59a934c
Use non-Docker CI image to allow running Docker
szarnyasg Dec 15, 2020
3643318
Cleanup Cypher scripts, adjust CI configuration
szarnyasg Dec 15, 2020
b125c25
Use sudo in CI
szarnyasg Dec 15, 2020
5271354
Cleanup tests, remove authentication
szarnyasg Dec 15, 2020
b03ccfc
Fix GitHub URL (broke on some git versions with a trailing slash)
szarnyasg Dec 15, 2020
fb3b9c8
Ignore Postgres tests
szarnyasg Dec 15, 2020
e9690d7
Specify CSV dir for Neo4j
szarnyasg Dec 15, 2020
403bdc1
Add dummy user/pw
szarnyasg Dec 15, 2020
eeb4756
Use common interface to reduce code duplication
szarnyasg Dec 15, 2020
11a326c
Revise Neo4j scripts, run GDS library in Docker
szarnyasg Dec 16, 2020
363870b
Unignore BI unit tests
szarnyasg Dec 16, 2020
64cc0ce
Simplify Cypher/GDS queries
szarnyasg Dec 16, 2020
19d2ad3
Add checks and pv to Cypher script
szarnyasg Dec 18, 2020
e27297d
Rename script
szarnyasg Dec 18, 2020
525633f
Run APOC in the Neo4j Docker container
szarnyasg Dec 18, 2020
8698895
Rework Cypher implementation of Q10
szarnyasg Dec 18, 2020
5262b1b
Format parameters
szarnyasg Dec 18, 2020
6f3a86d
Cleanup Cypher queries and add more checks
szarnyasg Dec 18, 2020
075248c
Add alternative BI Q10 Cypher implementation
szarnyasg Dec 18, 2020
a19e269
Remove deprecated script from README
szarnyasg Dec 28, 2020
e5c0f33
Rename Neo4j env var
szarnyasg Dec 28, 2020
7688c11
Add example graph covering BI Q1-Q15
szarnyasg Jan 3, 2021
a4a166c
Extend example to cover BI Q16-Q20
szarnyasg Jan 3, 2021
31b2919
Clarify extra requirements of Cypher queries
szarnyasg Jan 3, 2021
c825565
Clarify comment
szarnyasg Jan 4, 2021
bccb47e
Remove alternative queries
szarnyasg Jan 4, 2021
0445b05
Fix dates
szarnyasg Jan 4, 2021
110f979
Formatting
szarnyasg Jan 4, 2021
8179457
Slight adjustments to example graph
szarnyasg Jan 4, 2021
f3621f0
Do not nuke database with example graph query
szarnyasg Jan 4, 2021
29d4214
Add/update some properties in example graph
szarnyasg Jan 4, 2021
8dd99b2
Clarify BI Q6 comments
szarnyasg Jan 4, 2021
d64840a
Optimize BI Q5. Fixes #104
szarnyasg Jan 4, 2021
a12adf6
Minor fixes to the example graph
szarnyasg Jan 4, 2021
02ca463
Clarify versioning
szarnyasg Jan 6, 2021
6bcb4c2
Rename variable in recursive CTEs in SQL queries: s/depth/level/g
szarnyasg Jan 21, 2021
365cebc
Fix direction of edge in BI Q20
szarnyasg Jan 21, 2021
13a589b
Cleanup Cypher README
szarnyasg Jan 21, 2021
d0a9413
Add missing 'ORDER BY' clause
szarnyasg Jan 22, 2021
2af8a2c
Change error message
szarnyasg Jan 23, 2021
ab32c6f
Add docs to releases table
szarnyasg Jan 27, 2021
be4d39a
Remove typo, fix direction of edge
szarnyasg Jan 27, 2021
988e425
Newline
szarnyasg Jan 30, 2021
5c45b76
Add tmp file to gitignore
szarnyasg Jan 30, 2021
55d51e4
Add 'e' flag to Bash scripts
szarnyasg Jan 30, 2021
cf03bf1
Drop '-e' flag from sourced script
szarnyasg Feb 1, 2021
8c1c260
Add e/pipefail options to Neo4j scripts
szarnyasg Feb 1, 2021
70684a4
Use working data set
szarnyasg Feb 3, 2021
e5125c0
Update cityIds for new example data set
szarnyasg Feb 3, 2021
7294731
Add test Python script
szarnyasg Feb 3, 2021
0750b07
Run BI script in CI
szarnyasg Feb 3, 2021
7314c6e
Use type designators for parameter files
szarnyasg Feb 3, 2021
963b129
Fix BI Q11: add projection to only count each triangle once
szarnyasg Feb 9, 2021
fb9363f
Use correct type (DateTime -> Date) in the example parameters/scripts…
szarnyasg Feb 11, 2021
4034a1f
Make env vars script zsh-compatible
szarnyasg Feb 15, 2021
acfa0fe
Spell out locale verbosely (POSIX=C), see issue #157
szarnyasg Mar 14, 2021
2804124
Spell out query ids (necessitated by Q14a/b)
szarnyasg Mar 15, 2021
b240abe
Add conversion script for parameters files
szarnyasg Mar 15, 2021
1e4b570
Fix Python script running Cypher queries
szarnyasg Mar 15, 2021
475ce05
Add instructions on using the example data set
szarnyasg Mar 15, 2021
bf0bc7f
Fix parameters: separate date/datetime parameters
szarnyasg Mar 15, 2021
0c90eae
Use new filenames (with PascalCase node label names)
szarnyasg Mar 16, 2021
cf4a623
fix: upgrade org.postgresql:postgresql from 42.2.8 to 42.3.1
snyk-bot Feb 4, 2022
64 changes: 64 additions & 0 deletions .circleci/config.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
version: 2.1
orbs:
  slack: circleci/[email protected]
workflows:
  version: 2
  build:
    jobs:
      - test:
          filters:
            branches:
              ignore:
                - stable

jobs:
  test:
    resource_class: large
    machine:
      image: ubuntu-2004:202008-01
    steps:
      - checkout
      - run:
          name: Setup
          command: |
            export DEBIAN_FRONTEND=noninteractive
            sudo apt update
            # install dependencies
            sudo apt install -y curl git wget unzip maven netcat
            # driver
            git clone --depth 1 --branch dev https://github.com/ldbc/ldbc_snb_driver && cd ldbc_snb_driver && mvn install -DskipTests && cd ..
            # Cypher
            cd cypher
            scripts/install-dependencies.sh
            cd ..
            # PostgreSQL
            # TODO
      - run:
          name: Download data
          command: |
            mkdir data/
            cd data
            wget -q https://ldbc.github.io/ldbc_snb_data_converter/csv-composite-projected-fk.zip
            unzip csv-composite-projected-fk.zip
            cd ..
      - run:
          name: Load
          command: |
            # PostgreSQL
            # TODO
            # Cypher
            cd cypher
            . scripts/environment-variables-default.sh
            export NEO4J_CSV_DIR=`pwd`/../data/csv-composite-projected-fk
            export NEO4J_CSV_POSTFIX=.csv
            scripts/load-in-one-step.sh
            cd ..
      - run:
          name: Test
          command: mvn test -B
      - run:
          name: Run Cypher Python script
          command: |
            cd cypher
            python3 bi.py
      - slack/status
3 changes: 2 additions & 1 deletion .gitignore
@@ -11,4 +11,5 @@ validation_params.csv
*-actual.json
*-expected.json
results-*/
*.DS_Store
*.DS_Store
*.report
43 changes: 0 additions & 43 deletions .travis.yml

This file was deleted.

72 changes: 29 additions & 43 deletions README.md

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions common/pom.xml
@@ -16,12 +16,12 @@
<dependency>
<groupId>com.ldbc.driver</groupId>
<artifactId>jeeves</artifactId>
<version>0.3.3</version>
<version>0.4.0-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.12</version>
<version>4.13.1</version>
</dependency>
</dependencies>
<build>
309 changes: 156 additions & 153 deletions common/src/main/java/com/ldbc/impls/workloads/ldbc/snb/QueryStore.java

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
@@ -1,34 +1,11 @@
package com.ldbc.impls.workloads.ldbc.snb.bi;

import com.ldbc.driver.DbException;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery10TagPerson;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery11UnrelatedReplies;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery12TrendingPosts;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery13PopularMonthlyTags;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery14TopThreadInitiators;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery15SocialNormals;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery16ExpertsInSocialCircle;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery17FriendshipTriangles;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery18PersonPostCounts;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery19StrangerInteraction;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery1PostingSummary;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery20HighLevelTopics;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery21Zombies;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery22InternationalDialog;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery23HolidayDestinations;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery24MessagesByTopic;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery25WeightedPaths;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery2TopTags;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery3TagEvolution;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery4PopularCountryTopics;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery5TopCountryPosters;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery6ActivePosters;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery7AuthoritativeUsers;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery8RelatedTopics;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiQuery9RelatedForums;
import com.ldbc.driver.workloads.ldbc.snb.bi.*;
import com.ldbc.driver.workloads.ldbc.snb.bi.LdbcSnbBiWorkload;
import com.ldbc.impls.workloads.ldbc.snb.SnbTest;
import com.ldbc.impls.workloads.ldbc.snb.db.BaseDb;
import org.junit.Ignore;
import org.junit.Test;

import java.util.Arrays;
@@ -46,122 +23,97 @@ public void testBiQuery1() throws DbException {

@Test
public void testBiQuery2() throws DbException {
run(db, new LdbcSnbBiQuery2TopTags(1265583600000L, 1290380400000L, "Germany", "United_States", LIMIT));
run(db, new LdbcSnbBiQuery2TagEvolution(1311307200000L, "MusicalArtist", LIMIT));
}

@Test
public void testBiQuery3() throws DbException {
run(db, new LdbcSnbBiQuery3TagEvolution(2015, 12, 100));
run(db, new LdbcSnbBiQuery3PopularCountryTopics("MusicalArtist", "Netherlands", LIMIT));
}

@Test
public void testBiQuery4() throws DbException {
run(db, new LdbcSnbBiQuery4PopularCountryTopics("MusicalArtist", "Netherlands", LIMIT));
run(db, new LdbcSnbBiQuery4TopCountryPosters("Ethiopia", LIMIT));
}

@Test
public void testBiQuery5() throws DbException {
run(db, new LdbcSnbBiQuery5TopCountryPosters("Ethiopia", LIMIT));
run(db, new LdbcSnbBiQuery5ActivePosters("Ehud_Olmert", LIMIT));
}

@Test
public void testBiQuery6() throws DbException {
run(db, new LdbcSnbBiQuery6ActivePosters("Ehud_Olmert", LIMIT));
run(db, new LdbcSnbBiQuery6AuthoritativeUsers("Che_Guevara", LIMIT));
}

@Test
public void testBiQuery7() throws DbException {
run(db, new LdbcSnbBiQuery7AuthoritativeUsers("Che_Guevara", LIMIT));
run(db, new LdbcSnbBiQuery7RelatedTopics("Imelda_Marcos", LIMIT));
}

@Test
public void testBiQuery8() throws DbException {
run(db, new LdbcSnbBiQuery8RelatedTopics("Imelda_Marcos", LIMIT));
run(db, new LdbcSnbBiQuery8TagPerson("Che_Guevara", 1311307200000L, LIMIT));
}

@Test
public void testBiQuery9() throws DbException {
run(db, new LdbcSnbBiQuery9RelatedForums("BaseballPlayer", "ChristianBishop", 200, LIMIT));
run(db, new LdbcSnbBiQuery9TopThreadInitiators(1338523200000L, 1341115200000L, LIMIT));
}

@Test
public void testBiQuery10() throws DbException {
run(db, new LdbcSnbBiQuery10TagPerson("Che_Guevara", 1311307200000L, LIMIT));
run(db, new LdbcSnbBiQuery10ExpertsInSocialCircle(13194139534730L, "Germany", "MusicalArtist", 1, 2, LIMIT));
}

@Test
public void testBiQuery11() throws DbException {
run(db, new LdbcSnbBiQuery11UnrelatedReplies("Germany", Arrays.asList("also"), LIMIT));
run(db, new LdbcSnbBiQuery11FriendshipTriangles("Ethiopia", 1338523200000L));
}

@Test
public void testBiQuery12() throws DbException {
run(db, new LdbcSnbBiQuery12TrendingPosts(1311307200000L, 100, LIMIT));
run(db, new LdbcSnbBiQuery12PersonPostCounts(1311307200000L, 0, Arrays.asList("English"), LIMIT));
}

@Test
public void testBiQuery13() throws DbException {
run(db, new LdbcSnbBiQuery13PopularMonthlyTags("Ethiopia", LIMIT));
run(db, new LdbcSnbBiQuery13Zombies("Ethiopia", 1357016400000L, LIMIT));
}

@Test
public void testBiQuery14() throws DbException {
run(db, new LdbcSnbBiQuery14TopThreadInitiators(1338523200000L, 1341115200000L, LIMIT));
run(db, new LdbcSnbBiQuery14InternationalDialog("Mexico", "Indonesia", LIMIT));
}

@Test
public void testBiQuery15() throws DbException {
run(db, new LdbcSnbBiQuery15SocialNormals("Egypt", LIMIT));
run(db, new LdbcSnbBiQuery15WeightedPaths(2199023264119L, 8796093028894L, 1275364800000L, 1277956800000L));
}

@Test
public void testBiQuery16() throws DbException {
run(db, new LdbcSnbBiQuery16ExpertsInSocialCircle(13194139534730L, "Germany", "MusicalArtist", 1, 2, LIMIT));
run(db, new LdbcSnbBiQuery16FakeNewsDetection("Imelda_Marcos", 1317859200L, "Che", 1318377600L, 5, 10));
}

@Test
public void testBiQuery17() throws DbException {
run(db, new LdbcSnbBiQuery17FriendshipTriangles("Ethiopia"));
run(db, new LdbcSnbBiQuery17InformationPropagationAnalysis("Elizabeth_Taylor", 10, 20));
}

@Test
public void testBiQuery18() throws DbException {
run(db, new LdbcSnbBiQuery18PersonPostCounts(1311307200000L, 0, Arrays.asList("English"), LIMIT));
run(db, new LdbcSnbBiQuery18FriendRecommendation(290L, "Elizabeth_Taylor", 20));
}

@Test
public void testBiQuery19() throws DbException {
run(db, new LdbcSnbBiQuery19StrangerInteraction(599634000000L, "MusicalArtist", "OfficeHolder", LIMIT));
run(db, new LdbcSnbBiQuery19InteractionPathBetweenCities(1178L, 1142L, 20));
}

@Test
public void testBiQuery20() throws DbException {
run(db, new LdbcSnbBiQuery20HighLevelTopics(Arrays.asList("Country"), LIMIT));
}

@Test
public void testBiQuery21() throws DbException {
run(db, new LdbcSnbBiQuery21Zombies("Ethiopia", 1357016400000L, LIMIT));
}

@Test
public void testBiQuery22() throws DbException {
run(db, new LdbcSnbBiQuery22InternationalDialog("Mexico", "Indonesia", LIMIT));
}

@Test
public void testBiQuery23() throws DbException {
run(db, new LdbcSnbBiQuery23HolidayDestinations("Ethiopia", LIMIT));
}

@Test
public void testBiQuery24() throws DbException {
run(db, new LdbcSnbBiQuery24MessagesByTopic("Single", LIMIT));
}

@Test
public void testBiQuery25() throws DbException {
run(db, new LdbcSnbBiQuery25WeightedPaths(2199023264119L, 8796093028894L, 1275364800000L, 1277956800000L));
run(db, new LdbcSnbBiQuery20Recruitment("TajAir", 13194139533688L, 20));
}

}
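Read off the test changes above, the old BI query numbers map onto the new ones roughly as sketched below. This is only a reading aid assembled from the visible diff, not code from the repository: the class name `BiQueryRenumbering` and the helper `newNumberOf` are hypothetical, Q1 is left out because its test is not among the changed lines, and the new Q16 to Q20 queries have no old counterpart.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: old BI query number -> new BI query number,
// derived from which query class each testBiQueryN method runs after the change.
final class BiQueryRenumbering {
    static final Map<Integer, Integer> OLD_TO_NEW = new LinkedHashMap<>();

    static {
        OLD_TO_NEW.put(3, 2);    // TagEvolution
        OLD_TO_NEW.put(4, 3);    // PopularCountryTopics
        OLD_TO_NEW.put(5, 4);    // TopCountryPosters
        OLD_TO_NEW.put(6, 5);    // ActivePosters
        OLD_TO_NEW.put(7, 6);    // AuthoritativeUsers
        OLD_TO_NEW.put(8, 7);    // RelatedTopics
        OLD_TO_NEW.put(10, 8);   // TagPerson
        OLD_TO_NEW.put(14, 9);   // TopThreadInitiators
        OLD_TO_NEW.put(16, 10);  // ExpertsInSocialCircle
        OLD_TO_NEW.put(17, 11);  // FriendshipTriangles
        OLD_TO_NEW.put(18, 12);  // PersonPostCounts
        OLD_TO_NEW.put(21, 13);  // Zombies
        OLD_TO_NEW.put(22, 14);  // InternationalDialog
        OLD_TO_NEW.put(25, 15);  // WeightedPaths
    }

    // Returns the new number, or -1 when the old query has no counterpart in the new tests.
    static int newNumberOf(int oldNumber) {
        return OLD_TO_NEW.getOrDefault(oldNumber, -1);
    }

    public static void main(String[] args) {
        OLD_TO_NEW.forEach((oldNo, newNo) ->
                System.out.println("old Q" + oldNo + " -> new Q" + newNo));
    }
}
```

Old queries such as Q9 (RelatedForums) or Q24 (MessagesByTopic) return -1 here because no test in the new numbering uses their class names.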
Original file line number Diff line number Diff line change
@@ -3,15 +3,18 @@
import com.ldbc.driver.workloads.ldbc.snb.interactive.LdbcUpdate1AddPerson;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.List;
import java.util.TimeZone;
import java.util.stream.Collectors;

public class Converter {

final static String DATAGEN_FORMAT = "yyyy-MM-dd'T'HH:mm:ss.SSS'+0000'";
protected final ZoneId GMT = ZoneId.of("GMT");
final String DATAGEN_FORMAT = "yyyy-MM-dd'T'HH:mm:ss.SSS'+0000'";
final DateTimeFormatter dfGeneric = DateTimeFormatter.ofPattern(DATAGEN_FORMAT).withZone(GMT);

/**
* Converts epoch seconds to a date to the format of the converter (e.g. PostgreSQL-style timestamps).
@@ -24,9 +27,7 @@ public String convertDateTime(long timestamp) {
}

public String convertDateTime(Date date) {
final SimpleDateFormat sdf = new SimpleDateFormat(DATAGEN_FORMAT);
sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
return "'" + sdf.format(date) + "'";
return "'" + dfGeneric.format(date.toInstant()) + "'";
}

public String convertDate(long timestamp) {
@@ -38,16 +39,14 @@ public String convertDate(Date date) {
}

/**
* Converts timestamp strings (in the format produced by DATAGEN) ({@value #DATAGEN_FORMAT})
* Converts timestamp strings (in the format produced by DATAGEN)
* to a date.
*
* @param timestamp
* @return
*/
public long convertTimestampToEpoch(String timestamp) throws ParseException {
final SimpleDateFormat sdf = new SimpleDateFormat(DATAGEN_FORMAT);
sdf.setTimeZone(TimeZone.getTimeZone("GMT"));
return sdf.parse(timestamp).toInstant().toEpochMilli();
return Instant.from(dfGeneric.parse(timestamp)).toEpochMilli();
}

/**
@@ -117,7 +116,7 @@ public String convertBlacklist(List<String> words) {
}

/**
* Some implementations, e.g. the SPARQL one, will not work with a simple toString and require some tinkering,
* Some implementations, e.g. the SPARQL one (now deprecated), will not work with a simple toString and require some tinkering,
* e.g. padding the id with '0' characters.
*
* @param value
@@ -128,7 +127,7 @@ public String convertId(long value) {
}

/**
* Some implementation, e.g. the SPARQL one, require a different id for updates:
* Some implementations, e.g. the SPARQL one (now deprecated), require a different id for updates:
* while SparqlConverter#convertId() wraps the value with `"00000..."^^xsd:long`,
* updates require plain `00000...` format.
*
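The move from `SimpleDateFormat` to `DateTimeFormatter` in the Converter hunks above can be sketched in isolation as follows. `DateTimeFormatter` is immutable and thread-safe, so a single shared instance can replace the per-call `SimpleDateFormat` objects. The pattern and the GMT zone are taken from the diff; the class name `DatagenTimestamps` and its two methods are hypothetical and exist only for illustration.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical stand-alone sketch of the formatter usage shown in the Converter diff.
final class DatagenTimestamps {
    static final String DATAGEN_FORMAT = "yyyy-MM-dd'T'HH:mm:ss.SSS'+0000'";

    // One shared, thread-safe formatter; the zone override lets it format and parse Instants.
    static final DateTimeFormatter FORMATTER =
            DateTimeFormatter.ofPattern(DATAGEN_FORMAT).withZone(ZoneId.of("GMT"));

    // Epoch milliseconds -> Datagen-style timestamp string.
    static String format(long epochMillis) {
        return FORMATTER.format(Instant.ofEpochMilli(epochMillis));
    }

    // Datagen-style timestamp string -> epoch milliseconds.
    static long parse(String timestamp) {
        return Instant.from(FORMATTER.parse(timestamp)).toEpochMilli();
    }

    public static void main(String[] args) {
        String text = format(1311307200000L); // a parameter value used in the BI tests
        System.out.println(text);
        System.out.println(parse(text));
    }
}
```

Unlike `SimpleDateFormat.parse`, a malformed input here raises the unchecked `DateTimeParseException` rather than the checked `ParseException`, which is worth keeping in mind for callers that catch the latter.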
Original file line number Diff line number Diff line change
@@ -33,6 +33,7 @@
import com.ldbc.driver.workloads.ldbc.snb.interactive.LdbcUpdate8AddFriendship;
import com.ldbc.impls.workloads.ldbc.snb.SnbTest;
import com.ldbc.impls.workloads.ldbc.snb.db.BaseDb;
import org.junit.Ignore;
import org.junit.Test;

import java.util.Date;
@@ -148,6 +149,7 @@ public void testShortQuery7() throws Exception {
run(db, new LdbcShortQuery7MessageReplies(2061584476422L));
}

@Ignore
@Test
public void testUpdateQuery1() throws Exception {
final LdbcUpdate1AddPerson.Organization university1 = new LdbcUpdate1AddPerson.Organization(1001L, 2013);
@@ -173,26 +175,31 @@ public void testUpdateQuery1() throws Exception {
);
}

@Ignore
@Test
public void testUpdateQuery2() throws Exception {
run(db, new LdbcUpdate2AddPostLike(1021L, 1022L, new Date(0L)));
}

@Ignore
@Test
public void testUpdateQuery3() throws Exception {
run(db, new LdbcUpdate3AddCommentLike(1031L, 1032L, new Date(0L)));
}

@Ignore
@Test
public void testUpdateQuery4() throws Exception {
run(db, new LdbcUpdate4AddForum(1041L, "", new Date(0L), 1042L, ImmutableList.of(1043L, 1044L)));
}

@Ignore
@Test
public void testUpdateQuery5() throws Exception {
run(db, new LdbcUpdate5AddForumMembership(1051L, 1052L, new Date(0L)));
}

@Ignore
@Test
public void testUpdateQuery6() throws Exception {
run(db, new LdbcUpdate6AddPost(
@@ -211,6 +218,7 @@ public void testUpdateQuery6() throws Exception {
));
}

@Ignore
@Test
public void testUpdateQuery7() throws Exception {
run(db, new LdbcUpdate7AddComment(
@@ -227,6 +235,7 @@ public void testUpdateQuery7() throws Exception {
ImmutableList.of(1076L, 1077L)));
}

@Ignore
@Test
public void testUpdateQuery8() throws Exception {
run(db, new LdbcUpdate8AddFriendship(1081L, 1082L, new Date(0L)));
1 change: 1 addition & 0 deletions cypher/.gitignore
@@ -1,2 +1,3 @@
neo4j-server/
neo4j*.tar.gz
neo4j*.zip
69 changes: 18 additions & 51 deletions cypher/README.md
@@ -1,71 +1,38 @@
# LDBC SNB Cypher implementation

[(open)Cypher](http://www.opencypher.org/) implementation of the [LDBC SNB BI benchmark](https://github.com/ldbc/ldbc_snb_docs).
[Cypher](http://www.opencypher.org/) implementation of the [LDBC SNB benchmark](https://github.com/ldbc/ldbc_snb_docs).
Note that some BI queries are not expressed using pure Cypher, instead, they make use of the [APOC](https://neo4j.com/labs/) and [Graph Data Science](https://neo4j.com/product/graph-data-science-library/) Neo4j libraries.

## Starting Neo4j
## Loading the data in Neo4j

Run:
The Neo4j instance is run in Docker. To initialize the environment variables, use:

```bash
. scripts/environment-variables-default.sh
```
./get-neo4j.sh
./environment-variables-neo4j.sh && ./configure-neo4j.sh && neo4j-server/bin/neo4j start
```

## Loading the data set

### Generating the data set

The data set needs to be generated and preprocessed before loading it to the database. To generate it, use the `CSVComposite` serializer classes of the [DATAGEN](https://github.com/ldbc/ldbc_snb_datagen/) project:

```
ldbc.snb.datagen.serializer.dynamicActivitySerializer:ldbc.snb.datagen.serializer.snb.csv.dynamicserializer.activity.CsvCompositeDynamicActivitySerializer
ldbc.snb.datagen.serializer.dynamicPersonSerializer:ldbc.snb.datagen.serializer.snb.csv.dynamicserializer.person.CsvCompositeDynamicPersonSerializer
ldbc.snb.datagen.serializer.staticSerializer:ldbc.snb.datagen.serializer.snb.csv.staticserializer.CsvCompositeStaticSerializer
```

An example configuration for scale factor 1 is given in the [`params-csv-composite.ini`](https://github.com/ldbc/ldbc_snb_datagen/blob/master/params-csv-composite.ini) file of the DATAGEN repository. For small loading experiments, we recommend using scale factor 0.1, i.e. `snb.interactive.0.1`.

### Preprocessing and loading

Go to the `load-scripts/` directory.

#### Preprocessing

Set the following environment variables appropriately:
To load a data set other than the example data set, you might want to adjust the following variables:

```bash
export NEO4J_HOME=/path/to/the/neo4j/dir
export NEO4J_DB_DIR=$NEO4J_HOME/data/databases/graph.db
export NEO4J_DATA_DIR=/path/do/the/csv/files
export POSTFIX=_0_0.csv
export NEO4J_CSV_DIR=/path/to/the/directory/social_network/
export NEO4J_CSV_POSTFIX=_0_0.csv
```

The CSV files require a bit of preprocessing:

* replace headers with Neo4j-compatible ones
* replace labels (e.g. change `city` to `City`)
* convert date and datetime formats

The following script takes care of those steps:

```bash
./convert-csvs.sh
scripts/load-in-one-step.sh
```

#### Delete your database and load the SNB CSVs

Be careful -- this deletes all data in your database, imports the SNB data set and restarts the database.
This script replaces the headers in the input CSVs, loads them, starts Neo4j, and creates indices.

```bash
./delete-neo4j-database.sh
./import-to-neo4j.sh
./restart-neo4j.sh
```
## Loading the example data set

#### All-in-one loading script
Transform the example data set in the [data converter](https://github.com/ldbc/ldbc_snb_data_converter) repository:

If you know what you're doing, you can run all scripts with a single command:
Then, in this repository, run:

```bash
./load-in-one-step.sh
. scripts/environment-variables-default.sh
export NEO4J_CSV_DIR=${DATA_CONVERTER_DIR}/ldbc_snb_data_converter/data/csv-composite-projected-fk
export NEO4J_CSV_POSTFIX=.csv
scripts/load-in-one-step.sh
```
1 change: 0 additions & 1 deletion cypher/bi-benchmark.properties
@@ -9,7 +9,6 @@ printQueryResults=false
status=1
thread_count=1
name=LDBC-SNB
results_log=true
time_unit=MILLISECONDS
time_compression_ratio=0.001
peer_identifiers=
1 change: 0 additions & 1 deletion cypher/bi-create-validation-parameters.properties
@@ -9,7 +9,6 @@ printQueryResults=false
status=1
thread_count=1
name=LDBC-SNB
results_log=false
time_unit=MILLISECONDS
time_compression_ratio=0.001
peer_identifiers=
1 change: 0 additions & 1 deletion cypher/bi-validate.properties
@@ -10,7 +10,6 @@ printQueryResults=false
status=1
thread_count=1
name=LDBC-SNB
results_log=false
time_unit=MILLISECONDS
time_compression_ratio=0.001
peer_identifiers=
53 changes: 53 additions & 0 deletions cypher/bi.py
@@ -0,0 +1,53 @@
from neo4j import GraphDatabase
from datetime import datetime
from neo4j.time import DateTime, Date
import time
import pytz
import csv
import re

#@unit_of_work(timeout=300)
def query_fun(tx, query_spec, query_parameters):
    result = tx.run(query_spec, query_parameters)
    return result.value()

def run_query(session, query_id, query_spec, query_parameters):
    print(f'Q{query_id}: {query_parameters}')
    start = time.time()
    result = session.read_transaction(query_fun, query_spec, query_parameters)
    print(f'{len(result)} results')
    end = time.time()
    duration = end - start
    #print("Q{}: {:.4f} seconds, {} tuples".format(query_id, duration, result[0]))
    return (duration, result)

def convert_to_datetime(timestamp):
    dt = datetime.strptime(timestamp, '%Y-%m-%dT%H:%M:%S.%f+00:00')
    return DateTime(dt.year, dt.month, dt.day, 0, 0, 0, pytz.timezone('GMT'))  # time of day is dropped (midnight GMT)

def convert_to_date(timestamp):
    dt = datetime.strptime(timestamp, '%Y-%m-%d')
    return Date(dt.year, dt.month, dt.day)

driver = GraphDatabase.driver("bolt://localhost:7687")

with driver.session() as session:
    for query_variant in ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14a", "14b", "15", "16", "17", "18", "19", "20"]:
        query_num = re.sub("[^0-9]", "", query_variant)  # variants 14a and 14b share queries/bi-14.cypher
        query_file = open(f'queries/bi-{query_num}.cypher', 'r')
        query_spec = query_file.read()

        parameters_csv = csv.DictReader(open(f'parameters/bi-{query_variant}.txt'), delimiter='|')

        for query_parameters in parameters_csv:
            # convert fields based on type designators
            query_parameters = {k: int(v) if re.match('.*:(ID|LONG)', k) else v for k, v in query_parameters.items()}
            query_parameters = {k: convert_to_date(v) if re.match('.*:DATE$', k) else v for k, v in query_parameters.items()}
            query_parameters = {k: convert_to_datetime(v) if re.match('.*:DATETIME', k) else v for k, v in query_parameters.items()}
            query_parameters = {k: v.split(';') if re.findall(r'\[\]$', k) else v for k, v in query_parameters.items()}
            # drop type designators
            type_pattern = re.compile(':.*')
            query_parameters = {type_pattern.sub('', k): v for k, v in query_parameters.items()}
            run_query(session, query_variant, query_spec, query_parameters)

driver.close()
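
The type-designator handling in `bi.py` can be tried out without a running Neo4j instance. The following is a minimal sketch, not the script's actual code: `convert_row` is a hypothetical helper, it uses the standard library's `datetime.date` in place of `neo4j.time.Date`, and it leaves `DATETIME` values as strings for brevity. The sample header and row are copied from `parameters/bi-12.txt`.

```python
import csv
import io
from datetime import date, datetime

def convert_row(row):
    """Convert one parameter row keyed by 'name:TYPE' into plain Python values.

    Hypothetical helper for illustration; bi.py does the same work with
    a chain of dict comprehensions and regular expressions.
    """
    converted = {}
    for key, value in row.items():
        # Split 'lengthThreshold:LONG' into the name and the type designator.
        name, _, type_name = key.partition(':')
        if type_name.endswith('[]'):
            # List-valued parameters use ';' as the item separator.
            value = value.split(';')
        elif type_name in ('ID', 'LONG'):
            value = int(value)
        elif type_name == 'DATE':
            value = datetime.strptime(value, '%Y-%m-%d').date()
        # Any other type (e.g. STRING) is passed through unchanged.
        converted[name] = value
    return converted

# Header and row copied from parameters/bi-12.txt
sample = "date:DATE|lengthThreshold:LONG|languages:STRING[]\n2010-07-22|50|en;fr\n"
rows = [convert_row(r) for r in csv.DictReader(io.StringIO(sample), delimiter='|')]
print(rows[0])
```

Splitting each key once with `partition` avoids running four separate passes over the dictionary, at the cost of handling only the designators listed here.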
3 changes: 0 additions & 3 deletions cypher/configure-neo4j.sh

This file was deleted.

2 changes: 1 addition & 1 deletion cypher/bi-benchmark.sh → cypher/driver/bi-benchmark.sh
@@ -1,3 +1,3 @@
#!/bin/bash

java -cp target/cypher-0.4.0-SNAPSHOT.jar com.ldbc.driver.Client -P bi-benchmark.properties
java -cp target/cypher-0.4.0-SNAPSHOT.jar com.ldbc.driver.Client -dm EXECUTE_WORKLOAD -P bi-benchmark.properties
@@ -1,3 +1,3 @@
#!/bin/bash

java -cp target/cypher-0.4.0-SNAPSHOT.jar com.ldbc.driver.Client -P bi-create-validation-parameters.properties
java -cp target/cypher-0.4.0-SNAPSHOT.jar com.ldbc.driver.Client -dm CREATE_VALIDATION_PARAMS -P bi-create-validation-parameters.properties
2 changes: 1 addition & 1 deletion cypher/bi-validate.sh → cypher/driver/bi-validate.sh
@@ -1,3 +1,3 @@
#!/bin/bash

java -cp target/cypher-0.4.0-SNAPSHOT.jar com.ldbc.driver.Client -P bi-validate.properties
java -cp target/cypher-0.4.0-SNAPSHOT.jar com.ldbc.driver.Client -dm VALIDATE_DATABASE -P bi-validate.properties
@@ -1,3 +1,3 @@
#!/bin/bash

java -cp target/cypher-0.4.0-SNAPSHOT.jar com.ldbc.driver.Client -P interactive-benchmark.properties
java -cp target/cypher-0.4.0-SNAPSHOT.jar com.ldbc.driver.Client -dm EXECUTE_WORKLOAD -P interactive-benchmark.properties
3 changes: 3 additions & 0 deletions cypher/driver/interactive-create-validation-parameters.sh
@@ -0,0 +1,3 @@
#!/bin/bash

java -cp target/cypher-0.4.0-SNAPSHOT.jar com.ldbc.driver.Client -dm CREATE_VALIDATION_PARAMS -P interactive-create-validation-parameters.properties
@@ -1,3 +1,3 @@
#!/bin/bash

java -cp target/cypher-0.4.0-SNAPSHOT.jar com.ldbc.driver.Client -P interactive-validate.properties
java -cp target/cypher-0.4.0-SNAPSHOT.jar com.ldbc.driver.Client -dm VALIDATE_DATABASE -P interactive-validate.properties
4 changes: 0 additions & 4 deletions cypher/environment-variables-neo4j.sh

This file was deleted.

8 changes: 0 additions & 8 deletions cypher/get-neo4j.sh

This file was deleted.

27 changes: 17 additions & 10 deletions cypher/interactive-benchmark.properties
@@ -9,7 +9,6 @@ printQueryResults=false
status=1
thread_count=1
name=LDBC-SNB
results_log=true
time_unit=MILLISECONDS
time_compression_ratio=0.001
peer_identifiers=
@@ -24,22 +23,28 @@ operation_count=250
ldbc.snb.interactive.parameters_dir=../../ldbc_snb_datagen/substitution_parameters/
ldbc.snb.interactive.updates_dir=../../ldbc_snb_datagen/social_network/
ldbc.snb.interactive.short_read_dissipation=0.2
ldbc.snb.interactive.update_interleave=49274
## The ldbc.snb.interactive.update_interleave driver parameter must come from the
## updateStream.properties file, which is created by the data generator.
## This parameter should NEVER be set manually.
ldbc.snb.interactive.update_interleave=4477

warmup=100

## frequency of read queries (number of update queries per one read query)
## Make sure that the frequencies are those for the selected scale factor
## as found in Section B.1 "Scale Factor Statistics for the Interactive workload"
## at http://ldbc.github.io/ldbc_snb_docs/ldbc-snb-specification.pdf
ldbc.snb.interactive.LdbcQuery1_freq=26
ldbc.snb.interactive.LdbcQuery2_freq=37
ldbc.snb.interactive.LdbcQuery3_freq=123
ldbc.snb.interactive.LdbcQuery3_freq=69
ldbc.snb.interactive.LdbcQuery4_freq=36
ldbc.snb.interactive.LdbcQuery5_freq=78
ldbc.snb.interactive.LdbcQuery6_freq=434
ldbc.snb.interactive.LdbcQuery7_freq=38
ldbc.snb.interactive.LdbcQuery8_freq=5
ldbc.snb.interactive.LdbcQuery9_freq=527
ldbc.snb.interactive.LdbcQuery10_freq=40
ldbc.snb.interactive.LdbcQuery11_freq=22
ldbc.snb.interactive.LdbcQuery5_freq=57
ldbc.snb.interactive.LdbcQuery6_freq=129
ldbc.snb.interactive.LdbcQuery7_freq=87
ldbc.snb.interactive.LdbcQuery8_freq=45
ldbc.snb.interactive.LdbcQuery9_freq=157
ldbc.snb.interactive.LdbcQuery10_freq=30
ldbc.snb.interactive.LdbcQuery11_freq=16
ldbc.snb.interactive.LdbcQuery12_freq=44
ldbc.snb.interactive.LdbcQuery13_freq=19
ldbc.snb.interactive.LdbcQuery14_freq=49
@@ -77,3 +82,5 @@ ldbc.snb.interactive.LdbcUpdate5AddForumMembership_enable=true
ldbc.snb.interactive.LdbcUpdate6AddPost_enable=true
ldbc.snb.interactive.LdbcUpdate7AddComment_enable=true
ldbc.snb.interactive.LdbcUpdate8AddFriendship_enable=true

ldbc.snb.interactive.LdbcDelete1RemovePerson_enable=false
19 changes: 10 additions & 9 deletions cypher/interactive-create-validation-parameters.properties
@@ -9,7 +9,6 @@ printQueryResults=false
status=1
thread_count=1
name=LDBC-SNB
results_log=false
time_unit=MILLISECONDS
time_compression_ratio=0.001
peer_identifiers=
@@ -31,15 +30,15 @@ create_validation_parameters=validation_params.csv|100
## frequency of read queries (number of update queries per one read query)
ldbc.snb.interactive.LdbcQuery1_freq=26
ldbc.snb.interactive.LdbcQuery2_freq=37
ldbc.snb.interactive.LdbcQuery3_freq=123
ldbc.snb.interactive.LdbcQuery3_freq=69
ldbc.snb.interactive.LdbcQuery4_freq=36
ldbc.snb.interactive.LdbcQuery5_freq=78
ldbc.snb.interactive.LdbcQuery6_freq=434
ldbc.snb.interactive.LdbcQuery7_freq=38
ldbc.snb.interactive.LdbcQuery8_freq=5
ldbc.snb.interactive.LdbcQuery9_freq=527
ldbc.snb.interactive.LdbcQuery10_freq=40
ldbc.snb.interactive.LdbcQuery11_freq=22
ldbc.snb.interactive.LdbcQuery5_freq=57
ldbc.snb.interactive.LdbcQuery6_freq=129
ldbc.snb.interactive.LdbcQuery7_freq=87
ldbc.snb.interactive.LdbcQuery8_freq=45
ldbc.snb.interactive.LdbcQuery9_freq=157
ldbc.snb.interactive.LdbcQuery10_freq=30
ldbc.snb.interactive.LdbcQuery11_freq=16
ldbc.snb.interactive.LdbcQuery12_freq=44
ldbc.snb.interactive.LdbcQuery13_freq=19
ldbc.snb.interactive.LdbcQuery14_freq=49
@@ -77,3 +76,5 @@ ldbc.snb.interactive.LdbcUpdate5AddForumMembership_enable=true
ldbc.snb.interactive.LdbcUpdate6AddPost_enable=true
ldbc.snb.interactive.LdbcUpdate7AddComment_enable=true
ldbc.snb.interactive.LdbcUpdate8AddFriendship_enable=true

ldbc.snb.interactive.LdbcDelete1RemovePerson_enable=false
3 changes: 0 additions & 3 deletions cypher/interactive-create-validation-parameters.sh

This file was deleted.

3 changes: 2 additions & 1 deletion cypher/interactive-validate.properties
@@ -10,7 +10,6 @@ printQueryResults=false
status=1
thread_count=1
name=LDBC-SNB
results_log=false
time_unit=MILLISECONDS
time_compression_ratio=0.001
peer_identifiers=
@@ -75,3 +74,5 @@ ldbc.snb.interactive.LdbcUpdate5AddForumMembership_enable=true
ldbc.snb.interactive.LdbcUpdate6AddPost_enable=true
ldbc.snb.interactive.LdbcUpdate7AddComment_enable=true
ldbc.snb.interactive.LdbcUpdate8AddFriendship_enable=true

ldbc.snb.interactive.LdbcDelete1RemovePerson_enable=false
30 changes: 0 additions & 30 deletions cypher/load-scripts/convert-csvs.sh

This file was deleted.

4 changes: 0 additions & 4 deletions cypher/load-scripts/delete-neo4j-database.sh

This file was deleted.

19 changes: 0 additions & 19 deletions cypher/load-scripts/graphalytics-import-to-neo4j.sh

This file was deleted.

3 changes: 0 additions & 3 deletions cypher/load-scripts/graphalytics-load-in-one-step.sh

This file was deleted.

31 changes: 0 additions & 31 deletions cypher/load-scripts/headers.txt

This file was deleted.

36 changes: 0 additions & 36 deletions cypher/load-scripts/import-to-neo4j.sh

This file was deleted.

3 changes: 0 additions & 3 deletions cypher/load-scripts/load-in-one-step.sh

This file was deleted.

3 changes: 0 additions & 3 deletions cypher/load-scripts/restart-neo4j.sh

This file was deleted.

2 changes: 2 additions & 0 deletions cypher/neo4j-scratch/.gitignore
@@ -0,0 +1,2 @@
*
!.gitignore
File renamed without changes.
2 changes: 2 additions & 0 deletions cypher/parameters/bi-1.txt
@@ -0,0 +1,2 @@
datetime:DATETIME
2011-12-01T11:05:56.000+00:00
2 changes: 2 additions & 0 deletions cypher/parameters/bi-10.txt
@@ -0,0 +1,2 @@
personId:ID|country:STRING|tagClass:STRING|minPathDistance:LONG|maxPathDistance:LONG
5|France|Sports|2|3
2 changes: 2 additions & 0 deletions cypher/parameters/bi-11.txt
@@ -0,0 +1,2 @@
country:STRING|startDate:DATE
France|2010-05-01
2 changes: 2 additions & 0 deletions cypher/parameters/bi-12.txt
@@ -0,0 +1,2 @@
date:DATE|lengthThreshold:LONG|languages:STRING[]
2010-07-22|50|en;fr
2 changes: 2 additions & 0 deletions cypher/parameters/bi-13.txt
@@ -0,0 +1,2 @@
country:STRING|endDate:DATE
France|2013-01-01
2 changes: 2 additions & 0 deletions cypher/parameters/bi-14a.txt
@@ -0,0 +1,2 @@
country1:STRING|country2:STRING
Spain|France
2 changes: 2 additions & 0 deletions cypher/parameters/bi-14b.txt
@@ -0,0 +1,2 @@
country1:STRING|country2:STRING
Spain|France
2 changes: 2 additions & 0 deletions cypher/parameters/bi-15.txt
@@ -0,0 +1,2 @@
person1Id:ID|person2Id:ID|startDate:DATE|endDate:DATE
2|4|2011-06-01|2012-05-31
2 changes: 2 additions & 0 deletions cypher/parameters/bi-16.txt
@@ -0,0 +1,2 @@
tagA:STRING|dateA:DATE|tagB:STRING|dateB:DATE|maxKnowsLimit:LONG
Pyrenees|2011-10-10|Snowboard|2012-03-04|5
2 changes: 2 additions & 0 deletions cypher/parameters/bi-17.txt
@@ -0,0 +1,2 @@
tag:STRING|delta:LONG
Snowboard|10
2 changes: 2 additions & 0 deletions cypher/parameters/bi-18.txt
@@ -0,0 +1,2 @@
person1Id:ID|tag:STRING
2|Snowboard
2 changes: 2 additions & 0 deletions cypher/parameters/bi-19.txt
@@ -0,0 +1,2 @@
city1Id:ID|city2Id:ID
5|6
2 changes: 2 additions & 0 deletions cypher/parameters/bi-2.txt
@@ -0,0 +1,2 @@
date:DATE|tagClass:STRING
2011-10-01|Sports
2 changes: 2 additions & 0 deletions cypher/parameters/bi-20.txt
@@ -0,0 +1,2 @@
company:STRING|person2Id:ID
SoftEngCo|5
2 changes: 2 additions & 0 deletions cypher/parameters/bi-3.txt
@@ -0,0 +1,2 @@
tagClass:STRING|country:STRING
Sports|Spain
2 changes: 2 additions & 0 deletions cypher/parameters/bi-4.txt
@@ -0,0 +1,2 @@
country:STRING
Spain
2 changes: 2 additions & 0 deletions cypher/parameters/bi-5.txt
@@ -0,0 +1,2 @@
tag:STRING
Snowboard
2 changes: 2 additions & 0 deletions cypher/parameters/bi-6.txt
@@ -0,0 +1,2 @@
tag:STRING
Pyrenees
2 changes: 2 additions & 0 deletions cypher/parameters/bi-7.txt
@@ -0,0 +1,2 @@
tag:STRING
Pyrenees
2 changes: 2 additions & 0 deletions cypher/parameters/bi-8.txt
@@ -0,0 +1,2 @@
tag:STRING|date:DATE
Pyrenees|2010-10-01
2 changes: 2 additions & 0 deletions cypher/parameters/bi-9.txt
@@ -0,0 +1,2 @@
startDate:DATE|endDate:DATE
2011-10-01|2011-10-15
21 changes: 21 additions & 0 deletions cypher/parameters/headers.txt
@@ -0,0 +1,21 @@
bi-1 datetime:DATETIME
bi-2 date:DATE|tagClass:STRING
bi-3 tagClass:STRING|country:STRING
bi-4 country:STRING
bi-5 tag:STRING
bi-6 tag:STRING
bi-7 tag:STRING
bi-8 tag:STRING|date:DATE
bi-9 startDate:DATE|endDate:DATE
bi-10 personId:ID|country:STRING|tagClass:STRING|minPathDistance:LONG|maxPathDistance:LONG
bi-11 country:STRING|startDate:DATE
bi-12 date:DATE|lengthThreshold:LONG|languages:STRING[]
bi-13 country:STRING|endDate:DATE
bi-14a country1:STRING|country2:STRING
bi-14b country1:STRING|country2:STRING
bi-15 person1Id:ID|person2Id:ID|startDate:DATE|endDate:DATE
bi-16 tagA:STRING|dateA:DATE|tagB:STRING|dateB:DATE|maxKnowsLimit:LONG
bi-17 tag:STRING|delta:LONG
bi-18 person1Id:ID|tag:STRING
bi-19 city1Id:ID|city2Id:ID
bi-20 company:STRING|person2Id:ID
2 changes: 1 addition & 1 deletion cypher/pom.xml
@@ -21,7 +21,7 @@
<dependency>
<groupId>org.neo4j.driver</groupId>
<artifactId>neo4j-java-driver</artifactId>
<version>1.6.3</version>
<version>4.1.1</version>
</dependency>
</dependencies>
<build>
16 changes: 0 additions & 16 deletions cypher/queries/README.md

This file was deleted.

8 changes: 4 additions & 4 deletions cypher/queries/bi-1.cypher
@@ -1,18 +1,18 @@
// Q1. Posting summary
/*
:param { date: 20110721220000000 }
:param datetime => datetime('2011-12-01')
*/
MATCH (message:Message)
WHERE message.creationDate < $date
WHERE message.creationDate < $datetime
WITH count(message) AS totalMessageCountInt // this should be a subquery once Cypher supports it
WITH toFloat(totalMessageCountInt) AS totalMessageCount
MATCH (message:Message)
WHERE message.creationDate < $date
WHERE message.creationDate < $datetime
AND message.content IS NOT NULL
WITH
totalMessageCount,
message,
message.creationDate/10000000000000 AS year
message.creationDate.year AS year
WITH
totalMessageCount,
year,
68 changes: 0 additions & 68 deletions cypher/queries/bi-10-without-pattern-comprehension.cypher

This file was deleted.

73 changes: 40 additions & 33 deletions cypher/queries/bi-10.cypher
@@ -1,38 +1,45 @@
// Q10. Central Person for a Tag
// Q10. Experts in social circle
// Requires the Neo4j APOC library
/*
:param {
tag: 'John_Rhys-Davies',
date: 20120122000000000
}
:param [{ personId, country, tagClass, minPathDistance, maxPathDistance }] => {
RETURN
5 AS personId,
'France' AS country,
'Sports' AS tagClass,
2 AS minPathDistance,
3 AS maxPathDistance
}
*/
MATCH (tag:Tag {name: $tag})
// score
OPTIONAL MATCH (tag)<-[interest:HAS_INTEREST]-(person:Person)
WITH tag, collect(person) AS interestedPersons
OPTIONAL MATCH (tag)<-[:HAS_TAG]-(message:Message)-[:HAS_CREATOR]->(person:Person)
WHERE message.creationDate > $date
WITH tag, interestedPersons + collect(person) AS persons
UNWIND persons AS person
// poor man's disjunct union (should be changed to UNION + post-union processing in the future)
WITH DISTINCT tag, person
WITH
tag,
person,
100 * length([(tag)<-[interest:HAS_INTEREST]-(person) | interest])
+ length([(tag)<-[:HAS_TAG]-(message:Message)-[:HAS_CREATOR]->(person) WHERE message.creationDate > $date | message])
AS score
OPTIONAL MATCH (person)-[:KNOWS]-(friend)
WITH
person,
score,
100 * length([(tag)<-[interest:HAS_INTEREST]-(friend) | interest])
+ length([(tag)<-[:HAS_TAG]-(message:Message)-[:HAS_CREATOR]->(friend) WHERE message.creationDate > $date | message])
AS friendScore
MATCH (startPerson:Person {id: $personId})
CALL apoc.path.subgraphNodes(startPerson, {
relationshipFilter: "KNOWS",
minLevel: 1,
maxLevel: $minPathDistance-1
})
YIELD node
WITH startPerson, collect(DISTINCT node) AS nodesCloserThanMinPathDistance
CALL apoc.path.subgraphNodes(startPerson, {
relationshipFilter: "KNOWS",
minLevel: 1,
maxLevel: $maxPathDistance
})
YIELD node
WITH nodesCloserThanMinPathDistance, collect(DISTINCT node) AS nodesCloserThanMaxPathDistance
// compute the difference of sets: nodesCloserThanMaxPathDistance - nodesCloserThanMinPathDistance
WITH [n IN nodesCloserThanMaxPathDistance WHERE NOT n IN nodesCloserThanMinPathDistance] AS expertCandidatePersons
UNWIND expertCandidatePersons AS expertCandidatePerson
MATCH
(expertCandidatePerson)-[:IS_LOCATED_IN]->(:City)-[:IS_PART_OF]->(:Country {name: $country}),
(expertCandidatePerson)<-[:HAS_CREATOR]-(message:Message)-[:HAS_TAG]->(:Tag)-[:HAS_TYPE]->
(:TagClass {name: $tagClass})
MATCH
(message)-[:HAS_TAG]->(tag:Tag)
RETURN
person.id,
score,
sum(friendScore) AS friendsScore
expertCandidatePerson.id,
tag.name,
count(DISTINCT message) AS messageCount
ORDER BY
score + friendsScore DESC,
person.id ASC
messageCount DESC,
tag.name ASC,
expertCandidatePerson.id ASC
LIMIT 100
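
The set difference used in the new Q10 (persons within `maxPathDistance` hops minus persons closer than `minPathDistance`) can be sketched in plain Python. This is an illustration under stated assumptions, not the query's implementation: `nodes_within` and `expert_candidates` are hypothetical names, the `KNOWS` graph is a small adjacency dictionary, and `nodes_within` is assumed to mirror `apoc.path.subgraphNodes` with `minLevel: 1` (every node reachable within the given number of hops, excluding the start node).

```python
from collections import deque

def nodes_within(graph, start, max_level):
    """Return the nodes reachable from start in 1..max_level hops (start excluded)."""
    seen = {start}
    reached = set()
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth >= max_level:
            # Do not expand beyond the hop limit.
            continue
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                reached.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return reached

def expert_candidates(graph, start, min_distance, max_distance):
    """Nodes whose shortest distance from start lies in [min_distance, max_distance]."""
    closer_than_min = nodes_within(graph, start, min_distance - 1)
    within_max = nodes_within(graph, start, max_distance)
    return within_max - closer_than_min

# Toy undirected KNOWS chain: 1 - 2 - 3 - 4 - 5
knows = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
candidates = expert_candidates(knows, 1, 2, 3)
print(sorted(candidates))
```

Running two bounded breadth-first searches and subtracting the results avoids enumerating every path between the two distances, which is what made the variable-length pattern of the old Q16 expensive.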
38 changes: 14 additions & 24 deletions cypher/queries/bi-11.cypher
@@ -1,26 +1,16 @@
// Q11. Unrelated replies
// Q11. Friend triangles
/*
:param {
country: 'Germany',
blacklist: ['also', 'Pope', 'that', 'James', 'Henry', 'one', 'Green']
}
:param [{ country, startDate }] => { RETURN 'France' AS country, datetime('2010-05-01') AS startDate }
*/
WITH $blacklist AS blacklist
MATCH
(country:Country {name: $country})<-[:IS_PART_OF]-(:City)<-[:IS_LOCATED_IN]-
(person:Person)<-[:HAS_CREATOR]-(reply:Comment)-[:REPLY_OF]->(message:Message),
(reply)-[:HAS_TAG]->(tag:Tag)
WHERE NOT (message)-[:HAS_TAG]->(:Tag)<-[:HAS_TAG]-(reply)
AND size([word IN blacklist WHERE reply.content CONTAINS word | word]) = 0
OPTIONAL MATCH
(:Person)-[like:LIKES]->(reply)
RETURN
person.id,
tag.name,
count(DISTINCT like) AS countLikes,
count(DISTINCT reply) AS countReplies
ORDER BY
countLikes DESC,
person.id ASC,
tag.name ASC
LIMIT 100
MATCH (country:Country {name: $country})
MATCH (a:Person)-[:IS_LOCATED_IN]->(:City)-[:IS_PART_OF]->(country)
MATCH (b:Person)-[:IS_LOCATED_IN]->(:City)-[:IS_PART_OF]->(country)
MATCH (c:Person)-[:IS_LOCATED_IN]->(:City)-[:IS_PART_OF]->(country)
MATCH (a)-[k1:KNOWS]-(b)-[k2:KNOWS]-(c)-[k3:KNOWS]-(a)
WHERE a.id < b.id
AND b.id < c.id
AND $startDate <= k1.creationDate
AND $startDate <= k2.creationDate
AND $startDate <= k3.creationDate
WITH DISTINCT a, b, c
RETURN count(*) AS count
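
The `a.id < b.id` and `b.id < c.id` conditions in the new Q11 make sure each triangle is counted once rather than once per ordering of its three persons. A rough Python equivalent, assuming the persons have already been filtered to one country and that each `KNOWS` edge carries a creation date (`count_triangles` and the toy data are hypothetical):

```python
from datetime import date
from itertools import combinations

def count_triangles(person_ids, knows_edges, start_date):
    """Count triangles whose three KNOWS edges were all created on or after start_date."""
    # KNOWS is treated as undirected, so each edge is stored as an unordered pair.
    recent = {frozenset((a, b)) for a, b, created in knows_edges if created >= start_date}
    count = 0
    # combinations() over sorted ids yields each triple exactly once with a < b < c.
    for a, b, c in combinations(sorted(person_ids), 3):
        if (frozenset((a, b)) in recent
                and frozenset((b, c)) in recent
                and frozenset((a, c)) in recent):
            count += 1
    return count

persons = [1, 2, 3, 4]
edges = [
    (1, 2, date(2010, 6, 1)),
    (2, 3, date(2010, 7, 1)),
    (1, 3, date(2011, 1, 1)),
    (3, 4, date(2010, 8, 1)),
    (2, 4, date(2009, 1, 1)),  # too old for the first start date below
]
triangle_count = count_triangles(persons, edges, date(2010, 5, 1))
print(triangle_count)
```

The brute-force loop over all triples is only meant to show the counting rule; it is cubic in the number of persons and would not scale to a real data set.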
34 changes: 15 additions & 19 deletions cypher/queries/bi-12.cypher
@@ -1,23 +1,19 @@
// Q12. Trending Posts
// Q12. How many persons have a given number of posts
/*
:param {
date: 20110721220000000,
likeThreshold: 400
}
:param [{ date, lengthThreshold, languages }] => { RETURN datetime('2010-07-22') AS date, 50 AS lengthThreshold, ['en', 'fr'] AS languages }
*/
MATCH
(message:Message)-[:HAS_CREATOR]->(creator:Person),
(message)<-[like:LIKES]-(:Person)
WHERE message.creationDate > $date
WITH message, creator, count(like) AS likeCount
WHERE likeCount > $likeThreshold
MATCH (person:Person)
OPTIONAL MATCH (person)<-[:HAS_CREATOR]-(message:Message)-[:REPLY_OF*0..]->(post:Post)
WHERE message.content IS NOT NULL
AND message.length < $lengthThreshold
AND message.creationDate > $date
AND post.language IN $languages
WITH
person,
count(message) AS messageCount
RETURN
message.id,
message.creationDate,
creator.firstName,
creator.lastName,
likeCount
messageCount,
count(person) AS personCount
ORDER BY
likeCount DESC,
message.id ASC
LIMIT 100
personCount DESC,
messageCount DESC
63 changes: 42 additions & 21 deletions cypher/queries/bi-13.cypher
@@ -1,29 +1,50 @@
// Q13. Popular Tags per month in a country
// Q13. Zombies in a country
/*
:param { country: 'Burma' }
:param [{ country, endDate }] => { RETURN 'France' AS country, datetime('2013-01-01') AS endDate }
*/
MATCH (:Country {name: $country})<-[:IS_LOCATED_IN]-(message:Message)
OPTIONAL MATCH (message)-[:HAS_TAG]->(tag:Tag)
MATCH (country:Country {name: $country})<-[:IS_PART_OF]-(:City)<-[:IS_LOCATED_IN]-(zombie:Person)
OPTIONAL MATCH
(zombie)<-[:HAS_CREATOR]-(message:Message)
WHERE zombie.creationDate < $endDate
AND message.creationDate < $endDate
WITH
message.creationDate/10000000000000 AS year,
message.creationDate/100000000000%100 AS month,
message,
tag
WITH year, month, count(message) AS popularity, tag
ORDER BY popularity DESC, tag.name ASC
country,
zombie,
count(message) AS messageCount
WITH
year,
month,
collect([tag.name, popularity]) AS popularTags
country,
zombie,
12 * ($endDate.year - zombie.creationDate.year )
+ ($endDate.month - zombie.creationDate.month)
+ 1 AS months,
messageCount
WHERE messageCount / months < 1
WITH
year,
month,
[popularTag IN popularTags WHERE popularTag[0] IS NOT NULL] AS popularTags
country,
collect(zombie) AS zombies
UNWIND zombies AS zombie
OPTIONAL MATCH
(zombie)<-[:HAS_CREATOR]-(message:Message)<-[:LIKES]-(likerZombie:Person)
WHERE likerZombie IN zombies
WITH
zombie,
count(likerZombie) AS zombieLikeCount
OPTIONAL MATCH
(zombie)<-[:HAS_CREATOR]-(message:Message)<-[:LIKES]-(likerPerson:Person)
WHERE likerPerson.creationDate < $endDate
WITH
zombie,
zombieLikeCount,
count(likerPerson) AS totalLikeCount
RETURN
year,
month,
popularTags[0..5] AS topPopularTags
zombie.id,
zombieLikeCount,
totalLikeCount,
CASE totalLikeCount
WHEN 0 THEN 0.0
ELSE zombieLikeCount / toFloat(totalLikeCount)
END AS zombieScore
ORDER BY
year DESC,
month ASC
zombieScore DESC,
zombie.id ASC
LIMIT 100
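
The arithmetic in the new Q13 can be checked by hand with a small Python sketch. The function names are hypothetical, and the sketch assumes that Cypher's `/` on two integers performs integer division, so `messageCount / months < 1` holds exactly when the person wrote fewer messages than the number of months counted.

```python
from datetime import date

def months_between(creation, end):
    """Number of months from creation to end, counting both partial months."""
    return 12 * (end.year - creation.year) + (end.month - creation.month) + 1

def is_zombie(creation, end, message_count):
    """True if the person averaged less than one message per month."""
    # Integer division mirrors `messageCount / months < 1` on Cypher integers.
    return message_count // months_between(creation, end) < 1

def zombie_score(zombie_like_count, total_like_count):
    """Share of likes that came from other zombies; 0.0 when there are no likes."""
    if total_like_count == 0:
        return 0.0
    return zombie_like_count / float(total_like_count)

end_date = date(2013, 1, 1)
joined = date(2012, 3, 15)
print(months_between(joined, end_date))
print(is_zombie(joined, end_date, 10), is_zombie(joined, end_date, 11))
print(zombie_score(1, 4))
```

The `CASE` expression in the query plays the same role as the early return in `zombie_score`: it guards against dividing by zero for persons whose messages received no likes.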
56 changes: 38 additions & 18 deletions cypher/queries/bi-14.cypher
@@ -1,22 +1,42 @@
// Q14. Top thread initiators
// Q14. International dialog
/*
:param {
startDate: 20120531220000000,
endDate: 20120630220000000
}
:param [{ country1, country2 }] => { RETURN 'France' AS country1, 'Spain' AS country2 }
*/
MATCH (person:Person)<-[:HAS_CREATOR]-(post:Post)<-[:REPLY_OF*0..]-(reply:Message)
WHERE post.creationDate >= $startDate
AND post.creationDate <= $endDate
AND reply.creationDate >= $startDate
AND reply.creationDate <= $endDate
MATCH
(country1:Country {name: $country1})<-[:IS_PART_OF]-(city1:City)<-[:IS_LOCATED_IN]-(person1:Person),
(country2:Country {name: $country2})<-[:IS_PART_OF]-(city2:City)<-[:IS_LOCATED_IN]-(person2:Person)
WITH person1, person2, city1, 0 AS score
// subscore 1
OPTIONAL MATCH (person1)<-[:HAS_CREATOR]-(c:Comment)-[:REPLY_OF]->(:Message)-[:HAS_CREATOR]->(person2)
WITH DISTINCT person1, person2, city1, score + (CASE c WHEN null THEN 0 ELSE 4 END) AS score
// subscore 2
OPTIONAL MATCH (person1)<-[:HAS_CREATOR]-(m:Message)<-[:REPLY_OF]-(:Comment)-[:HAS_CREATOR]->(person2)
WITH DISTINCT person1, person2, city1, score + (CASE m WHEN null THEN 0 ELSE 1 END) AS score
// subscore 3
OPTIONAL MATCH (person1)-[k:KNOWS]-(person2)
WITH DISTINCT person1, person2, city1, score + (CASE k WHEN null THEN 0 ELSE 15 END) AS score
// subscore 4
OPTIONAL MATCH (person1)-[:LIKES]->(m:Message)-[:HAS_CREATOR]->(person2)
WITH DISTINCT person1, person2, city1, score + (CASE m WHEN null THEN 0 ELSE 10 END) AS score
// subscore 5
OPTIONAL MATCH (person1)<-[:HAS_CREATOR]-(m:Message)<-[:LIKES]-(person2)
WITH DISTINCT person1, person2, city1, score + (CASE m WHEN null THEN 0 ELSE 1 END) AS score
// preorder
ORDER BY
city1.name ASC,
score DESC,
person1.id ASC,
person2.id ASC
WITH
city1,
// using a list might be faster, but the browser query editor does not like it
collect({score: score, person1: person1, person2: person2})[0] AS top
RETURN
person.id,
person.firstName,
person.lastName,
count(DISTINCT post) AS threadCount,
count(DISTINCT reply) AS messageCount
top.person1.id,
top.person2.id,
city1.name,
top.score
ORDER BY
messageCount DESC,
person.id ASC
LIMIT 100
top.score DESC,
top.person1.id ASC,
top.person2.id ASC
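
The scoring in the new Q14 adds a fixed number of points for each kind of interaction that exists between a pair of persons, then keeps the best pair per city. A minimal sketch of that logic, with hypothetical names for the interaction kinds and toy input data:

```python
# Points per interaction kind, following subscores 1 to 5 of the query.
# The dictionary keys are made-up labels for illustration.
SUBSCORES = {
    "replied_to": 4,       # person1 wrote a Comment replying to person2's Message
    "was_replied_by": 1,   # person2 wrote a Comment replying to person1's Message
    "knows": 15,           # person1 and person2 know each other
    "liked": 10,           # person1 liked a Message by person2
    "was_liked_by": 1,     # person2 liked a Message by person1
}

def pair_score(interactions):
    """Sum the points of every interaction kind present for a pair."""
    return sum(points for kind, points in SUBSCORES.items() if kind in interactions)

def top_pair_per_city(scored_pairs):
    """Pick, per city, the pair with the highest score (ties: lowest ids first)."""
    best = {}
    ordered = sorted(scored_pairs, key=lambda t: (t[0], -t[1], t[2], t[3]))
    for city, score, person1, person2 in ordered:
        # The first entry seen for a city is its best pair, like collect(...)[0].
        best.setdefault(city, (score, person1, person2))
    return best

pairs = [("Paris", 4, 1, 9), ("Paris", 15, 2, 8), ("Lyon", 0, 3, 7)]
top = top_pair_per_city(pairs)
print(pair_score({"knows", "liked"}), top)
```

Sorting before grouping corresponds to the `ORDER BY` that precedes `collect(...)[0]` in the query.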
75 changes: 54 additions & 21 deletions cypher/queries/bi-15.cypher
@@ -1,28 +1,61 @@
// Q15. Social normals
// Q15. Weighted interaction paths
/*
:param { country: 'Burma' }
:param [{ person1Id, person2Id, startDate, endDate }] => {
RETURN
2 AS person1Id,
4 AS person2Id,
datetime('2011-06-01') AS startDate,
datetime('2012-05-31') AS endDate
}
*/
MATCH
(country:Country {name: $country})
MATCH
(country)<-[:IS_PART_OF]-(:City)<-[:IS_LOCATED_IN]-(person1:Person)
path=allShortestPaths((p1:Person {id: $person1Id})-[:KNOWS*]-(p2:Person {id: $person2Id}))
UNWIND relationships(path) AS k
WITH
path,
startNode(k) AS pA,
endNode(k) AS pB,
0 AS relationshipWeights

// case 1, A to B
// every reply (by one of the Persons) to a Post (by the other Person): 1.0
OPTIONAL MATCH
// start a new MATCH as friend might live in the same City
// and thus can reuse the IS_PART_OF edge
(country)<-[:IS_PART_OF]-(:City)<-[:IS_LOCATED_IN]-(friend1:Person),
(person1)-[:KNOWS]-(friend1)
WITH country, person1, count(friend1) AS friend1Count
WITH country, avg(friend1Count) AS socialNormalFloat
WITH country, floor(socialNormalFloat) AS socialNormal
MATCH
(country)<-[:IS_PART_OF]-(:City)<-[:IS_LOCATED_IN]-(person2:Person)
(pA)<-[:HAS_CREATOR]-(c:Comment)-[:REPLY_OF]->(post:Post)-[:HAS_CREATOR]->(pB),
(post)<-[:CONTAINER_OF]-(forum:Forum)
WHERE forum.creationDate >= $startDate AND forum.creationDate <= $endDate
WITH path, pA, pB, relationshipWeights + count(c)*1.0 AS relationshipWeights

// case 2, A to B
// every reply (by one of the Persons) to a Comment (by the other Person): 0.5
OPTIONAL MATCH
(pA)<-[:HAS_CREATOR]-(c1:Comment)-[:REPLY_OF]->(c2:Comment)-[:HAS_CREATOR]->(pB),
(c2)-[:REPLY_OF*]->(:Post)<-[:CONTAINER_OF]-(forum:Forum)
WHERE forum.creationDate >= $startDate AND forum.creationDate <= $endDate
WITH path, pA, pB, relationshipWeights + count(c1)*0.5 AS relationshipWeights

// case 1, B to A
// every reply (by one of the Persons) to a Post (by the other Person): 1.0
OPTIONAL MATCH
(pB)<-[:HAS_CREATOR]-(c:Comment)-[:REPLY_OF]->(post:Post)-[:HAS_CREATOR]->(pA),
(post)<-[:CONTAINER_OF]-(forum:Forum)
WHERE forum.creationDate >= $startDate AND forum.creationDate <= $endDate
WITH path, pA, pB, relationshipWeights + count(c)*1.0 AS relationshipWeights

// case 2, B to A
// every reply (by one of the Persons) to a Comment (by the other Person): 0.5
OPTIONAL MATCH
(country)<-[:IS_PART_OF]-(:City)<-[:IS_LOCATED_IN]-(friend2:Person)-[:KNOWS]-(person2)
WITH country, person2, count(friend2) AS friend2Count, socialNormal
WHERE friend2Count = socialNormal
(pB)<-[:HAS_CREATOR]-(c1:Comment)-[:REPLY_OF]->(c2:Comment)-[:HAS_CREATOR]->(pA),
(c2)-[:REPLY_OF*]->(:Post)<-[:CONTAINER_OF]-(forum:Forum)
WHERE forum.creationDate >= $startDate AND forum.creationDate <= $endDate
WITH path, pA, pB, relationshipWeights + count(c1)*0.5 AS relationshipWeights

WITH
[person IN nodes(path) | person.id] AS personIds,
sum(relationshipWeights) AS weight

RETURN
person2.id,
friend2Count AS count
personIds,
weight
ORDER BY
person2.id ASC
LIMIT 100
weight DESC,
personIds ASC
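
The weighting described in the comments of the new Q15 (1.0 per reply to a Post, 0.5 per reply to a Comment, in either direction, summed over the edges of a shortest path) can be illustrated as follows. This is a sketch of the intended arithmetic only: `path_weight` is a hypothetical helper, and the reply counts are assumed to be precomputed per unordered pair of persons.

```python
def path_weight(path, post_replies, comment_replies):
    """Sum the interaction weights over consecutive person pairs on a path.

    post_replies and comment_replies map an unordered pair of person ids
    (a frozenset) to the number of replies exchanged in both directions.
    """
    total = 0.0
    for a, b in zip(path, path[1:]):
        pair = frozenset((a, b))
        total += 1.0 * post_replies.get(pair, 0)      # replies to a Post
        total += 0.5 * comment_replies.get(pair, 0)   # replies to a Comment
    return total

# Toy shortest path 2 - 3 - 4 with made-up reply counts.
post_replies = {frozenset((2, 3)): 2}
comment_replies = {frozenset((3, 4)): 3}
weight = path_weight([2, 3, 4], post_replies, comment_replies)
print(weight)
```

Keying the counts by `frozenset` reflects that the weight of a `KNOWS` edge does not depend on which of the two persons replied.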
63 changes: 32 additions & 31 deletions cypher/queries/bi-16.cypher
@@ -1,34 +1,35 @@
// Q16. Experts in social circle
// Q16. Fake news detection
// These parameters return a 'false positive' as the maxKnowsLimit is set too high.
/*
:param {
personId: 19791209310731,
country: 'Pakistan',
tagClass: 'MusicalArtist',
minPathDistance: 3,
maxPathDistance: 5
}
:param [{ tagA, dateA, tagB, dateB, maxKnowsLimit }] => { RETURN
'Pyrenees' AS tagA,
date('2011-10-10') AS dateA,
'Snowboard' AS tagB,
date('2012-03-04') AS dateB,
5 AS maxKnowsLimit
}
*/
// This query will not work in a browser as is. I tried alternative approaches,
// e.g. enabling paths of arbitrary length, saving the path to a variable p and
// checking for `$minPathDistance <= length(p)`, but these could not be
// evaluated due to the excessive number of paths.
// If you would like to test the query in the browser, replace the values of
// $minPathDistance and $maxPathDistance with constants.
MATCH
(:Person {id: $personId})-[:KNOWS*$minPathDistance..$maxPathDistance]-(person:Person)
WITH DISTINCT person
MATCH
(person)-[:IS_LOCATED_IN]->(:City)-[:IS_PART_OF]->(:Country {name: $country}),
(person)<-[:HAS_CREATOR]-(message:Message)-[:HAS_TAG]->(:Tag)-[:HAS_TYPE]->
(:TagClass {name: $tagClass})
MATCH
(message)-[:HAS_TAG]->(tag:Tag)
UNWIND [
{letter: 'A', tag: $tagA, date: $dateA},
{letter: 'B', tag: $tagB, date: $dateB}
] AS param
WITH param.letter AS paramLetter, param.tag AS paramTagX, param.date AS paramDateX
CALL {
WITH paramTagX, paramDateX
MATCH (person1:Person)<-[:HAS_CREATOR]-(message1:Message)-[:HAS_TAG]->(tag:Tag {name: paramTagX})
WHERE date(message1.creationDate) = paramDateX
// filter out people with more than $maxKnowsLimit friends who posted the same kind of message
OPTIONAL MATCH (person1)-[:KNOWS]-(person2:Person)<-[:HAS_CREATOR]-(message2:Message)-[:HAS_TAG]->(tag)
WHERE date(message2.creationDate) = paramDateX
WITH person1, count(DISTINCT message1) AS cm, count(DISTINCT person2) AS cp2
WHERE cp2 <= $maxKnowsLimit
// return count
RETURN person1, cm
}
WITH person1, collect({letter: paramLetter, messageCount: cm}) AS results
WHERE size(results) = 2
RETURN
person.id,
tag.name,
count(DISTINCT message) AS messageCount
ORDER BY
messageCount DESC,
tag.name ASC,
person.id ASC
LIMIT 100
person1.id,
[r IN results WHERE r.letter = 'A' | r.messageCount][0] AS messageCountA,
[r IN results WHERE r.letter = 'B' | r.messageCount][0] AS messageCountB
ORDER BY messageCountA + messageCountB DESC, person1.id ASC
26 changes: 15 additions & 11 deletions cypher/queries/bi-17.cypher
@@ -1,13 +1,17 @@
// Q17. Friend triangles
// Q17. Information propagation analysis
/*
:param { country: 'Spain' }
:param [{ tag, delta }] => {RETURN 'Snowboard' AS tag, 10 AS delta}
*/
MATCH (country:Country {name: $country})
MATCH (a:Person)-[:IS_LOCATED_IN]->(:City)-[:IS_PART_OF]->(country)
MATCH (b:Person)-[:IS_LOCATED_IN]->(:City)-[:IS_PART_OF]->(country)
MATCH (c:Person)-[:IS_LOCATED_IN]->(:City)-[:IS_PART_OF]->(country)
MATCH (a)-[:KNOWS]-(b), (b)-[:KNOWS]-(c), (c)-[:KNOWS]-(a)
WHERE a.id < b.id
AND b.id < c.id
RETURN count(*) AS count
// as a less elegant solution, count(a) also works
MATCH
(tag:Tag {name: $tag}),
(person1:Person)<-[:HAS_CREATOR]-(message1:Message)-[:REPLY_OF*0..]->(post1:Post)<-[:CONTAINER_OF]-(forum1:Forum),
(message1)-[:HAS_TAG]->(tag),
(forum1)<-[:HAS_MEMBER]->(person2:Person)<-[:HAS_CREATOR]-(comment:Comment)-[:HAS_TAG]->(tag),
(forum1)<-[:HAS_MEMBER]->(person3:Person)<-[:HAS_CREATOR]-(message2:Message)-[:HAS_TAG]->(tag),
(comment)-[:REPLY_OF]->(message2)-[:REPLY_OF*0..]->(post2:Post)<-[:CONTAINER_OF]-(forum2:Forum)
WHERE forum1 <> forum2
AND message2.creationDate > message1.creationDate + duration({hours: $delta})
AND NOT (forum2)-[:HAS_MEMBER]->(person1)
RETURN person1.id, count(message2) AS messageCount
ORDER BY person1.id ASC
LIMIT 10
29 changes: 8 additions & 21 deletions cypher/queries/bi-18.cypher
@@ -1,23 +1,10 @@
// Q18. How many persons have a given number of posts
// Q18. Friend recommendation
/*
:param {
date: 20110722000000000,
lengthThreshold: 20,
languages: ['ar']
}
:param [{ person1Id, tag }] => {RETURN 2 AS person1Id, 'Snowboard' AS tag}
*/
MATCH (person:Person)
OPTIONAL MATCH (person)<-[:HAS_CREATOR]-(message:Message)-[:REPLY_OF*0..]->(post:Post)
WHERE message.content IS NOT NULL
AND message.length < $lengthThreshold
AND message.creationDate > $date
AND post.language IN $languages
WITH
person,
count(message) AS messageCount
RETURN
messageCount,
count(person) AS personCount
ORDER BY
personCount DESC,
messageCount DESC
MATCH (person1:Person {id: $person1Id})-[:KNOWS]-(mutualFriend:Person)-[:KNOWS]-(person2:Person)-[:HAS_INTEREST]->(:Tag {name: $tag})
WHERE person1 <> person2
AND NOT (person1)-[:KNOWS]-(person2)
RETURN person2.id AS person2Id, count(DISTINCT mutualFriend) AS mutualFriendCount
ORDER BY mutualFriendCount DESC, person2Id ASC
LIMIT 20
51 changes: 24 additions & 27 deletions cypher/queries/bi-19.cypher
@@ -1,30 +1,27 @@
// Q19. Stranger's interaction
// Q19. Interaction path between cities
// Requires the Neo4j Graph Data Science library
/*
:param {
date: 19890101,
tagClass1: 'MusicalArtist',
tagClass2: 'OfficeHolder'
}
:param [{ city1Id, city2Id }] => {RETURN 5 AS city1Id, 6 AS city2Id}
*/
MATCH
(:TagClass {name: $tagClass1})<-[:HAS_TYPE]-(:Tag)<-[:HAS_TAG]-
(forum1:Forum)-[:HAS_MEMBER]->(stranger:Person)
WITH DISTINCT stranger
MATCH
(:TagClass {name: $tagClass2})<-[:HAS_TYPE]-(:Tag)<-[:HAS_TAG]-
(forum2:Forum)-[:HAS_MEMBER]->(stranger)
WITH DISTINCT stranger
MATCH
(person:Person)<-[:HAS_CREATOR]-(comment:Comment)-[:REPLY_OF*]->(message:Message)-[:HAS_CREATOR]->(stranger)
WHERE person.birthday > $date
AND person <> stranger
AND NOT (person)-[:KNOWS]-(stranger)
AND NOT (message)-[:REPLY_OF*]->(:Message)-[:HAS_CREATOR]->(stranger)
RETURN
person.id,
count(DISTINCT stranger) AS strangersCount,
count(comment) AS interactionCount
ORDER BY
interactionCount DESC,
person.id ASC
LIMIT 100
(person1:Person)-[:IS_LOCATED_IN]->(city1:City {id: $city1Id}),
(person2:Person)-[:IS_LOCATED_IN]->(city2:City {id: $city2Id})
CALL gds.alpha.shortestPath.stream({
nodeQuery: 'MATCH (p:Person) RETURN id(p) AS id',
relationshipQuery:
'MATCH
(personA:Person)-[:KNOWS]-(personB:Person),
(personA)<-[:HAS_CREATOR]-(:Message)-[replyOf:REPLY_OF]-(:Message)-[:HAS_CREATOR]->(personB)
RETURN
id(personA) AS source,
id(personB) AS target,
1.0/count(replyOf) AS weight',
startNode: person1,
endNode: person2,
relationshipWeightProperty: 'weight'
})
YIELD nodeId, cost
WHERE nodeId = id(person2)
RETURN person1.id, person2.id, cost AS totalWeight
ORDER BY totalWeight DESC, person1.id ASC, person2.id ASC
LIMIT 20
55 changes: 21 additions & 34 deletions cypher/queries/bi-2.cypher
@@ -1,40 +1,27 @@
// Q2. Top tags for country, age, gender, time
// Q2. Tag evolution
/*
:param {
date1: 20091231230000000,
date2: 20101107230000000,
country1: 'Ethiopia',
country2: 'Belarus'
}
:param [{ date, tagClass }] => { RETURN datetime('2011-10-01') AS date, 'Sports' AS tagClass }
*/
MATCH
(country:Country)<-[:IS_PART_OF]-(:City)<-[:IS_LOCATED_IN]-(person:Person)
<-[:HAS_CREATOR]-(message:Message)-[:HAS_TAG]->(tag:Tag)
WHERE message.creationDate >= $startDate
AND message.creationDate <= $endDate
AND (country.name = $country1 OR country.name = $country2)
MATCH (tag:Tag)-[:HAS_TYPE]->(:TagClass {name: $tagClass})
// window 1
OPTIONAL MATCH (message1:Message)-[:HAS_TAG]->(tag)
WHERE $date <= message1.creationDate
AND message1.creationDate < $date + duration({days: 100})
WITH tag, count(message1) AS countWindow1
// window 2
OPTIONAL MATCH (message2:Message)-[:HAS_TAG]->(tag)
WHERE $date + duration({days: 100}) <= message2.creationDate
AND message2.creationDate < $date + duration({days: 200})
WITH
country.name AS countryName,
message.creationDate/100000000000%100 AS month,
person.gender AS gender,
floor((20130101 - person.birthday) / 10000 / 5.0) AS ageGroup,
tag.name AS tagName,
message
WITH
countryName, month, gender, ageGroup, tagName, count(message) AS messageCount
WHERE messageCount > 100
tag,
countWindow1,
count(message2) AS countWindow2
RETURN
countryName,
month,
gender,
ageGroup,
tagName,
messageCount
tag.name,
countWindow1,
countWindow2,
abs(countWindow1 - countWindow2) AS diff
ORDER BY
messageCount DESC,
tagName ASC,
ageGroup ASC,
gender ASC,
month ASC,
countryName ASC
diff DESC,
tag.name ASC
LIMIT 100
36 changes: 24 additions & 12 deletions cypher/queries/bi-20.cypher
@@ -1,15 +1,27 @@
// Q20. High-level topics
// Q20. Recruitment
// Requires the Neo4j Graph Data Science library
/*
:param { tagClasses: ['Writer', 'Single', 'Country'] }
:param [{ company, person2Id }] => {RETURN 'SoftEngCo' AS company, 5 AS person2Id}
*/
UNWIND $tagClasses AS tagClassName
MATCH
(tagClass:TagClass {name: tagClassName})<-[:IS_SUBCLASS_OF*0..]-
(:TagClass)<-[:HAS_TYPE]-(tag:Tag)<-[:HAS_TAG]-(message:Message)
RETURN
tagClass.name,
count(DISTINCT message) AS messageCount
ORDER BY
messageCount DESC,
tagClass.name ASC
LIMIT 100
(company:Company {name: $company})<-[:WORK_AT]-(person1:Person),
(person2:Person {id: $person2Id})
CALL gds.alpha.shortestPath.stream({
nodeQuery: 'MATCH (p:Person) RETURN id(p) AS id',
relationshipQuery:
'MATCH
(personA:Person)-[:KNOWS]-(personB:Person),
(personA)-[saA:STUDY_AT]->(u:University)<-[saB:STUDY_AT]-(personB)
RETURN
id(personA) AS source,
id(personB) AS target,
abs(saA.classYear - saB.classYear) + 1 AS weight',
startNode: person1,
endNode: person2,
relationshipWeightProperty: 'weight'
})
YIELD nodeId, cost
WHERE nodeId = id(person2)
RETURN person1.id, cost AS totalWeight
ORDER BY totalWeight DESC, person1.id ASC
LIMIT 20
63 changes: 0 additions & 63 deletions cypher/queries/bi-21.cypher

This file was deleted.

45 changes: 0 additions & 45 deletions cypher/queries/bi-22.cypher

This file was deleted.

21 changes: 0 additions & 21 deletions cypher/queries/bi-23.cypher

This file was deleted.

25 changes: 0 additions & 25 deletions cypher/queries/bi-24.cypher

This file was deleted.

60 changes: 0 additions & 60 deletions cypher/queries/bi-25.cypher

This file was deleted.

45 changes: 14 additions & 31 deletions cypher/queries/bi-3.cypher
@@ -1,35 +1,18 @@
// Q3. Tag evolution
// Q3. Popular topics in a country
/*
:param {
year: 2010,
month: 10
}
:param [{ tagClass, country }] => { RETURN 'Sports' AS tagClass, 'Spain' AS country }
*/
WITH
$year AS year1,
$month AS month1,
$year + toInteger($month / 12.0) AS year2,
$month % 12 + 1 AS month2
// year-month 1
MATCH (tag:Tag)
OPTIONAL MATCH (message1:Message)-[:HAS_TAG]->(tag)
WHERE message1.creationDate/10000000000000 = year1
AND message1.creationDate/100000000000%100 = month1
WITH year2, month2, tag, count(message1) AS countMonth1
// year-month 2
OPTIONAL MATCH (message2:Message)-[:HAS_TAG]->(tag)
WHERE message2.creationDate/10000000000000 = year2
AND message2.creationDate/100000000000%100 = month2
WITH
tag,
countMonth1,
count(message2) AS countMonth2
MATCH
(:Country {name: $country})<-[:IS_PART_OF]-(:City)<-[:IS_LOCATED_IN]-
(person:Person)<-[:HAS_MODERATOR]-(forum:Forum)-[:CONTAINER_OF]->
(post:Post)<-[:REPLY_OF*0..]-(message:Message)-[:HAS_TAG]->(:Tag)-[:HAS_TYPE]->(:TagClass {name: $tagClass})
RETURN
tag.name,
countMonth1,
countMonth2,
abs(countMonth1-countMonth2) AS diff
forum.id,
forum.title,
forum.creationDate,
person.id,
count(DISTINCT message) AS messageCount
ORDER BY
diff DESC,
tag.name ASC
LIMIT 100
messageCount DESC,
forum.id ASC
LIMIT 20
34 changes: 19 additions & 15 deletions cypher/queries/bi-4.cypher
@@ -1,21 +1,25 @@
// Q4. Popular topics in a country
// Q4. Top message creators in a country
/*
:param {
tagClass: 'MusicalArtist',
country: 'Burma'
}
:param country => 'Spain'
*/
MATCH (:Country {name: $country})<-[:IS_PART_OF]-(:City)<-[:IS_LOCATED_IN]-(person:Person)<-[:HAS_MEMBER]-(forum:Forum)
WITH forum, count(person) AS numberOfMembers
ORDER BY numberOfMembers DESC, forum.id ASC
LIMIT 100
WITH collect(forum) AS popularForums
UNWIND popularForums AS forum
MATCH
(:Country {name: $country})<-[:IS_PART_OF]-(:City)<-[:IS_LOCATED_IN]-
(person:Person)<-[:HAS_MODERATOR]-(forum:Forum)-[:CONTAINER_OF]->
(post:Post)-[:HAS_TAG]->(:Tag)-[:HAS_TYPE]->(:TagClass {name: $tagClass})
(forum)-[:HAS_MEMBER]->(person:Person)
OPTIONAL MATCH
(person)<-[:HAS_CREATOR]-(message:Message)-[:REPLY_OF*0..]->(post:Post)<-[:CONTAINER_OF]-(popularForum:Forum)
WHERE popularForum IN popularForums
RETURN
forum.id,
forum.title,
forum.creationDate,
person.id,
count(DISTINCT post) AS postCount
person.firstName,
person.lastName,
person.creationDate,
count(DISTINCT message) AS messageCount
ORDER BY
postCount DESC,
forum.id ASC
LIMIT 20
messageCount DESC,
person.id ASC
LIMIT 100
33 changes: 13 additions & 20 deletions cypher/queries/bi-5.cypher
@@ -1,27 +1,20 @@
// Q5. Top posters in a country
// Q5. Most active Posters of a given Topic
/*
:param { country: 'Belarus' }
:param tag => 'Snowboard'
*/
MATCH
(:Country {name: $country})<-[:IS_PART_OF]-(:City)<-[:IS_LOCATED_IN]-
(person:Person)<-[:HAS_MEMBER]-(forum:Forum)
WITH forum, count(person) AS numberOfMembers
ORDER BY numberOfMembers DESC, forum.id ASC
LIMIT 100
WITH collect(forum) AS popularForums
UNWIND popularForums AS forum
MATCH
(forum)-[:HAS_MEMBER]->(person:Person)
OPTIONAL MATCH
(person)<-[:HAS_CREATOR]-(post:Post)<-[:CONTAINER_OF]-(popularForum:Forum)
WHERE popularForum IN popularForums
MATCH (tag:Tag {name: $tag})<-[:HAS_TAG]-(message:Message)-[:HAS_CREATOR]->(person:Person)
OPTIONAL MATCH (message)<-[likes:LIKES]-(:Person)
WITH person, message, count(likes) AS likeCount
OPTIONAL MATCH (message)<-[:REPLY_OF]-(reply:Comment)
WITH person, message, likeCount, count(reply) AS replyCount
WITH person, count(message) AS messageCount, sum(likeCount) AS likeCount, sum(replyCount) AS replyCount
RETURN
person.id,
person.firstName,
person.lastName,
person.creationDate,
count(DISTINCT post) AS postCount
replyCount,
likeCount,
messageCount,
1*messageCount + 2*replyCount + 10*likeCount AS score
ORDER BY
postCount DESC,
score DESC,
person.id ASC
LIMIT 100
26 changes: 13 additions & 13 deletions cypher/queries/bi-6.cypher
@@ -1,18 +1,18 @@
// Q6. Most active Posters of a given Topic
// Q6. Most authoritative users on a given topic
/*
:param { tag: 'Abbas_I_of_Persia' }
:param tag => 'Pyrenees'
*/
MATCH (tag:Tag {name: $tag})<-[:HAS_TAG]-(message:Message)-[:HAS_CREATOR]->(person:Person)
OPTIONAL MATCH (:Person)-[like:LIKES]->(message)
OPTIONAL MATCH (message)<-[:REPLY_OF]-(comment:Comment)
WITH person, count(DISTINCT like) AS likeCount, count(DISTINCT comment) AS replyCount, count(DISTINCT message) AS messageCount
MATCH (tag:Tag {name: $tag})<-[:HAS_TAG]-(message2:Message)-[:HAS_CREATOR]->(person1)
OPTIONAL MATCH (message2)<-[:LIKES]-(person2:Person)
OPTIONAL MATCH (person2)<-[:HAS_CREATOR]-(message3:Message)<-[like:LIKES]-(person3:Person)
RETURN
person.id,
replyCount,
likeCount,
messageCount,
1*messageCount + 2*replyCount + 10*likeCount AS score
person1.id,
// Using 'DISTINCT like' here ensures that each person2's popularity score is only added once for each person1
count(DISTINCT like) AS authorityScore
ORDER BY
score DESC,
person.id ASC
authorityScore DESC,
person1.id ASC
LIMIT 100

// We need to use a redundant computation due to the lack of composable graph queries in the currently supported Cypher version.
// This might change in the future with new Cypher versions and GQL.
21 changes: 10 additions & 11 deletions cypher/queries/bi-7.cypher
@@ -1,16 +1,15 @@
// Q7. Most authoritative users on a given topic
// Q7. Related Topics
/*
:param { tag: 'Arnold_Schwarzenegger' }
:param tag => 'Pyrenees'
*/
MATCH (tag:Tag {name: $tag})
MATCH (tag)<-[:HAS_TAG]-(message1:Message)-[:HAS_CREATOR]->(person1:Person)
MATCH (tag)<-[:HAS_TAG]-(message2:Message)-[:HAS_CREATOR]->(person1)
OPTIONAL MATCH (message2)<-[:LIKES]-(person2:Person)
OPTIONAL MATCH (person2)<-[:HAS_CREATOR]-(message3:Message)<-[like:LIKES]-(p3:Person)
MATCH
(tag:Tag {name: $tag})<-[:HAS_TAG]-(message:Message),
(message)<-[:REPLY_OF]-(comment:Comment)-[:HAS_TAG]->(relatedTag:Tag)
WHERE NOT (comment)-[:HAS_TAG]->(tag)
RETURN
person1.id,
count(DISTINCT like) AS authorityScore
relatedTag.name,
count(DISTINCT comment) AS count
ORDER BY
authorityScore DESC,
person1.id ASC
count DESC,
relatedTag.name ASC
LIMIT 100
39 changes: 29 additions & 10 deletions cypher/queries/bi-8.cypher
@@ -1,15 +1,34 @@
// Q8. Related Topics
// Q8. Central Person for a Tag
/*
:param { tag: 'Genghis_Khan' }
:param [{ tag, date }] => { RETURN 'Pyrenees' AS tag, datetime('2010-10-01') AS date }
*/
MATCH
(tag:Tag {name: $tag})<-[:HAS_TAG]-(message:Message),
(message)<-[:REPLY_OF]-(comment:Comment)-[:HAS_TAG]->(relatedTag:Tag)
WHERE NOT (comment)-[:HAS_TAG]->(tag)
MATCH (tag:Tag {name: $tag})
// score
OPTIONAL MATCH (tag)<-[interest:HAS_INTEREST]-(person:Person)
WITH tag, collect(person) AS interestedPersons
OPTIONAL MATCH (tag)<-[:HAS_TAG]-(message:Message)-[:HAS_CREATOR]->(person:Person)
WHERE message.creationDate > $date
WITH tag, interestedPersons + collect(person) AS persons
UNWIND persons AS person
WITH DISTINCT tag, person
WITH
tag,
person,
100 * size([(tag)<-[interest:HAS_INTEREST]-(person) | interest]) + size([(tag)<-[:HAS_TAG]-(message:Message)-[:HAS_CREATOR]->(person) WHERE message.creationDate > $date | message])
AS score
OPTIONAL MATCH (person)-[:KNOWS]-(friend)
// We need to use a redundant computation due to the lack of composable graph queries in the currently supported Cypher version.
// This might change in the future with new Cypher versions and GQL.
WITH
person,
score,
100 * size([(tag)<-[interest:HAS_INTEREST]-(friend) | interest]) + size([(tag)<-[:HAS_TAG]-(message:Message)-[:HAS_CREATOR]->(friend) WHERE message.creationDate > $date | message])
AS friendScore
RETURN
relatedTag.name,
count(DISTINCT comment) AS count
person.id,
score,
sum(friendScore) AS friendsScore
ORDER BY
count DESC,
relatedTag.name ASC
score + friendsScore DESC,
person.id ASC
LIMIT 100
37 changes: 14 additions & 23 deletions cypher/queries/bi-9.cypher
@@ -1,28 +1,19 @@
// Q9. Forum with related Tags
// Q9. Top thread initiators
/*
:param {
tagClass1: 'BaseballPlayer',
tagClass2: 'ChristianBishop',
threshold: 200
}
:param [{ startDate, endDate }] => { RETURN datetime('2011-10-01') AS startDate, datetime('2011-10-15') AS endDate }
*/
MATCH
(forum:Forum)-[:HAS_MEMBER]->(person:Person)
WITH forum, count(person) AS members
WHERE members > $threshold
MATCH
(forum)-[:CONTAINER_OF]->(post1:Post)-[:HAS_TAG]->
(:Tag)-[:HAS_TYPE]->(:TagClass {name: $tagClass1})
WITH forum, count(DISTINCT post1) AS count1
MATCH
(forum)-[:CONTAINER_OF]->(post2:Post)-[:HAS_TAG]->
(:Tag)-[:HAS_TYPE]->(:TagClass {name: $tagClass2})
WITH forum, count1, count(DISTINCT post2) AS count2
MATCH (person:Person)<-[:HAS_CREATOR]-(post:Post)<-[:REPLY_OF*0..]-(reply:Message)
WHERE post.creationDate >= $startDate
AND post.creationDate <= $endDate
AND reply.creationDate >= $startDate
AND reply.creationDate <= $endDate
RETURN
forum.id,
count1,
count2
person.id,
person.firstName,
person.lastName,
count(DISTINCT post) AS threadCount,
count(DISTINCT reply) AS messageCount
ORDER BY
abs(count2-count1) DESC,
forum.id ASC
messageCount DESC,
person.id ASC
LIMIT 100
18 changes: 9 additions & 9 deletions cypher/queries/interactive-complex-1.cypher
@@ -1,8 +1,8 @@
MATCH (:Person {id:$personId})-[path:KNOWS*1..3]-(friend:Person)
WHERE friend.firstName = $firstName
WITH friend, min(length(path)) AS distance
ORDER BY distance ASC, friend.lastName ASC, toInteger(friend.id) ASC
LIMIT 20
MATCH p=shortestPath((person:Person {id:$personId})-[path:KNOWS*1..3]-(friend:Person {firstName: $firstName}))
WHERE person <> friend
WITH friend, length(p) AS distance
ORDER BY distance ASC, friend.lastName ASC, toInteger(friend.id) ASC
LIMIT 20
MATCH (friend)-[:IS_LOCATED_IN]->(friendCity:Place)
OPTIONAL MATCH (friend)-[studyAt:STUDY_AT]->(uni:Organisation)-[:IS_LOCATED_IN]->(uniCity:Place)
WITH
@@ -11,7 +11,7 @@ WITH
CASE uni.name
WHEN null THEN null
ELSE [uni.name, studyAt.classYear, uniCity.name]
END
END
) AS unis,
friendCity,
distance
@@ -22,7 +22,7 @@ WITH
CASE company.name
WHEN null THEN null
ELSE [company.name, workAt.workFrom, companyCountry.name]
END
END
) AS companies,
unis,
friendCity,
@@ -41,5 +41,5 @@ RETURN
friendCity.name AS friendCityName,
unis AS friendUniversities,
companies AS friendCompanies
ORDER BY distanceFromPerson ASC, friendLastName ASC, toInteger(friendId) ASC
LIMIT 20
ORDER BY distanceFromPerson ASC, friendLastName ASC, toInteger(friendId) ASC
LIMIT 20

This file was deleted.

12 changes: 6 additions & 6 deletions cypher/queries/interactive-complex-10.cypher
@@ -1,17 +1,17 @@
MATCH (person:Person {id:$personId})-[:KNOWS*2..2]-(friend:Person)-[:IS_LOCATED_IN]->(city:Place)
WHERE
((friend.birthday/100%100 = $month AND friend.birthday%100 >= 21) OR
(friend.birthday/100%100 = $nextMonth AND friend.birthday%100 < 22))
AND not(friend=person)
AND not((friend)-[:KNOWS]-(person))
((friend.birthday.month = $month AND friend.birthday.day >= 21) OR
(friend.birthday.month = $nextMonth AND friend.birthday.day < 22))
AND friend <> person
AND NOT (friend)-[:KNOWS]-(person)
WITH DISTINCT friend, city, person
OPTIONAL MATCH (friend)<-[:HAS_CREATOR]-(post:Post)
WITH friend, city, collect(post) AS posts, person
WITH
friend,
city,
length(posts) AS postCount,
length([p IN posts WHERE (p)-[:HAS_TAG]->(:Tag)<-[:HAS_INTEREST]-(person)]) AS commonPostCount
size(posts) AS postCount,
size([p IN posts WHERE (p)-[:HAS_TAG]->(:Tag)<-[:HAS_INTEREST]-(person)]) AS commonPostCount
RETURN
friend.id AS personId,
friend.firstName AS personFirstName,
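
The WHERE clause in this query is meant to keep friends whose birthday falls on or after the 21st of `$month` or before the 22nd of `$nextMonth`, as the older integer-based form of the condition (`birthday%100 >= 21`, `birthday%100 < 22`) shows. The following is only a sketch of that window check outside the database; the function names and sample values are hypothetical and not part of the repository.

```bash
#!/bin/bash
set -eu

# Hypothetical helper: succeeds when (month, day) lies in the window that
# starts on day 21 of target_month and ends on day 21 of next_month.
in_birthday_window() {
  month=$1; day=$2; target_month=$3; next_month=$4
  if [ "$month" -eq "$target_month" ] && [ "$day" -ge 21 ]; then
    return 0
  fi
  if [ "$month" -eq "$next_month" ] && [ "$day" -lt 22 ]; then
    return 0
  fi
  return 1
}

# Hypothetical wrapper that prints "in" or "out" for one birthday.
check() {
  if in_birthday_window "$@"; then echo "in"; else echo "out"; fi
}

# Arguments: birthday month, birthday day, $month, $nextMonth
A=$(check 10 25 10 11)
B=$(check 11 21 10 11)
C=$(check 11 22 10 11)
D=$(check 10 20 10 11)
echo "$A $B $C $D"
```

The day of the month has to be compared in both branches; comparing the month against 21 or 22 would never match the first branch and would always match the second.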
15 changes: 12 additions & 3 deletions cypher/queries/interactive-complex-14.cypher
@@ -1,6 +1,15 @@
MATCH path = allShortestPaths((person1:Person {id:$person1Id})-[:KNOWS*..15]-(person2:Person {id:$person2Id}))
WITH nodes(path) AS pathNodes
RETURN
extract(n IN pathNodes | n.id) AS personIdsInPath,
reduce(weight=0.0, idx IN range(1,size(pathNodes)-1) | extract(prev IN [pathNodes[idx-1]] | extract(curr IN [pathNodes[idx]] | weight + length((curr)<-[:HAS_CREATOR]-(:Comment)-[:REPLY_OF]->(:Post)-[:HAS_CREATOR]->(prev))*1.0 + length((prev)<-[:HAS_CREATOR]-(:Comment)-[:REPLY_OF]->(:Post)-[:HAS_CREATOR]->(curr))*1.0 + length((prev)-[:HAS_CREATOR]-(:Comment)-[:REPLY_OF]-(:Comment)-[:HAS_CREATOR]-(curr))*0.5) )[0][0]) AS pathWight
ORDER BY pathWight DESC
[n IN pathNodes | n.id] AS personIdsInPath,
reduce(
weight = 0.0,
idx IN range(1, size(pathNodes)-1) |
[prev IN [pathNodes[idx-1]] |
[curr IN [pathNodes[idx]] | weight +
size((curr)<-[:HAS_CREATOR]-(:Comment)-[:REPLY_OF]->(:Post)-[:HAS_CREATOR]->(prev))*1.0 +
size((prev)<-[:HAS_CREATOR]-(:Comment)-[:REPLY_OF]->(:Post)-[:HAS_CREATOR]->(curr))*1.0 +
size((prev)-[:HAS_CREATOR]-(:Comment)-[:REPLY_OF]-(:Comment)-[:HAS_CREATOR]-(curr))*0.5]
][0][0]
) AS pathWeight
ORDER BY pathWeight DESC
2 changes: 1 addition & 1 deletion cypher/queries/interactive-complex-2.cypher
@@ -1,5 +1,5 @@
MATCH (:Person {id:$personId})-[:KNOWS]-(friend:Person)<-[:HAS_CREATOR]-(message:Message)
WHERE message.creationDate <= $maxDate
WHERE message.creationDate < $maxDate
RETURN
friend.id AS personId,
friend.firstName AS personFirstName,
2 changes: 1 addition & 1 deletion cypher/queries/interactive-complex-7.cypher
@@ -15,7 +15,7 @@ RETURN
WHEN true THEN latestLike.msg.content
ELSE latestLike.msg.imageFile
END AS messageContent,
latestLike.msg.creationDate AS messageCreationDate,
latestLike.likeTime.minutes - latestLike.msg.creationDate.minutes AS minutesLatency,
not((liker)-[:KNOWS]-(person)) AS isNew
ORDER BY likeCreationDate DESC, toInteger(personId) ASC
LIMIT 20
2 changes: 2 additions & 0 deletions cypher/queries/interactive-delete-1.cypher
@@ -0,0 +1,2 @@
MATCH (person:Person {id: $personId})
DETACH DELETE person
2 changes: 2 additions & 0 deletions cypher/queries/interactive-delete-2.cypher
@@ -0,0 +1,2 @@
MATCH (person:Person {id: $personId})-[likes:LIKES]->(post:Post {id: $postId})
DELETE likes
2 changes: 2 additions & 0 deletions cypher/queries/interactive-delete-3.cypher
@@ -0,0 +1,2 @@
MATCH (person:Person {id: $personId})-[likes:LIKES]->(comment:Comment {id: $commentId})
DELETE likes
2 changes: 2 additions & 0 deletions cypher/queries/interactive-delete-4.cypher
@@ -0,0 +1,2 @@
MATCH (forum:Forum {id: $forumId})-[:CONTAINER_OF]->(posts:Post)<-[:REPLY_OF*]-(comments:Comment)
DETACH DELETE forum, posts, comments
3 changes: 3 additions & 0 deletions cypher/queries/interactive-delete-5.cypher
@@ -0,0 +1,3 @@
MATCH (person:Person {id: $personId})<-[member:HAS_MEMBER]-(forum:Forum {id: $forumId})
OPTIONAL MATCH (person)<-[moderator:HAS_MODERATOR]-(forum)
DELETE moderator, member
2 changes: 2 additions & 0 deletions cypher/queries/interactive-delete-6.cypher
@@ -0,0 +1,2 @@
MATCH (post:Post {id: $postId})<-[:REPLY_OF*]-(replies:Comment)
DETACH DELETE post, replies
2 changes: 2 additions & 0 deletions cypher/queries/interactive-delete-7.cypher
@@ -0,0 +1,2 @@
MATCH (comment:Comment {id: $commentId})<-[:REPLY_OF*]-(replies:Comment)
DETACH DELETE comment, replies
2 changes: 2 additions & 0 deletions cypher/queries/interactive-delete-8.cypher
@@ -0,0 +1,2 @@
MATCH (person1:Person {id: $person1Id})-[knows:KNOWS]->(person2:Person {id: $person2Id})
DELETE knows
6 changes: 3 additions & 3 deletions cypher/queries/interactive-short-2.cypher
@@ -1,15 +1,15 @@
MATCH (:Person {id:$personId})<-[:HAS_CREATOR]-(m:Message)-[:REPLY_OF*0..]->(p:Post)
MATCH (p)-[:HAS_CREATOR]->(c)
RETURN
m.id as messageId,
m.id AS messageId,
CASE exists(m.content)
WHEN true THEN m.content
ELSE m.imageFile
END AS messageContent,
m.creationDate AS messageCreationDate,
p.id AS originalPostId,
c.id AS originalPostAuthorId,
c.firstName as originalPostAuthorFirstName,
c.lastName as originalPostAuthorLastName
c.firstName AS originalPostAuthorFirstName,
c.lastName AS originalPostAuthorLastName
ORDER BY messageCreationDate DESC
LIMIT 10
2 changes: 1 addition & 1 deletion cypher/queries/interactive-short-4.cypher
@@ -1,6 +1,6 @@
MATCH (m:Message {id:$messageId})
RETURN
m.creationDate as messageCreationDate,
m.creationDate AS messageCreationDate,
CASE exists(m.content)
WHEN true THEN m.content
ELSE m.imageFile
1 change: 1 addition & 0 deletions cypher/scripts/.gitignore
@@ -0,0 +1 @@
tmpfile.csv
35 changes: 35 additions & 0 deletions cypher/scripts/convert-csvs.sh
@@ -0,0 +1,35 @@
#!/bin/bash

set -e
set -o pipefail

echo "Starting preprocessing CSV files"

cd "$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

: ${NEO4J_CSV_DIR:?"Environment variable NEO4J_CSV_DIR is unset or empty"}
: ${NEO4J_CSV_POSTFIX:?"Environment variable NEO4J_CSV_POSTFIX is unset or empty"}

# show a progress bar if pv is available
if command -v pv > /dev/null 2>&1; then
SNB_CAT=pv
else
SNB_CAT=cat
fi

# replace headers
while read line; do
IFS=' ' read -r -a array <<< $line
FILENAME=${array[0]}
HEADER=${array[1]}

echo ${FILENAME}: ${HEADER}
# replace header (no point using sed to save space as it creates a temporary file as well)
if [ ! -f ${NEO4J_CSV_DIR}/${FILENAME}${NEO4J_CSV_POSTFIX} ]; then
echo "${NEO4J_CSV_DIR}/${FILENAME}${NEO4J_CSV_POSTFIX} does not exist"
exit 1
fi
echo ${HEADER} | ${SNB_CAT} - <(tail -n +2 ${NEO4J_CSV_DIR}/${FILENAME}${NEO4J_CSV_POSTFIX}) > tmpfile.csv && mv tmpfile.csv ${NEO4J_CSV_DIR}/${FILENAME}${NEO4J_CSV_POSTFIX}
done < headers.txt

echo "Finished preprocessing CSV files"
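
The header replacement in the loop of this script can be tried in isolation. The following is a minimal sketch, assuming a throwaway CSV in a temporary directory; the file name, sample rows and header are hypothetical and not taken from the repository. It uses a brace group instead of the `cat - <(tail ...)` process substitution in the script, which should produce the same output and also runs in plain POSIX shells.

```bash
#!/bin/bash
set -eu

# Hypothetical sample data in a temporary directory.
WORK_DIR="$(mktemp -d)"
CSV_FILE="${WORK_DIR}/person.csv"
printf 'id|firstName\n1|Amelie\n2|Bernardo\n' > "${CSV_FILE}"

# Hypothetical replacement header.
NEW_HEADER='id:ID(Person)|firstName:STRING'

# Write the new header followed by every line of the original file except
# its first one, then move the result over the original file.
{ echo "${NEW_HEADER}"; tail -n +2 "${CSV_FILE}"; } > "${WORK_DIR}/tmpfile.csv"
mv "${WORK_DIR}/tmpfile.csv" "${CSV_FILE}"

cat "${CSV_FILE}"
```

As in the script, the data rows are copied once into a temporary file, so the free disk space needs to cover the largest CSV file.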
21 changes: 21 additions & 0 deletions cypher/scripts/convert-param-headers.sh
@@ -0,0 +1,21 @@
#!/bin/bash

set -e
set -o pipefail

echo "Starting preprocessing parameter files"

cd "$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
cd ..

# replace headers
while read line; do
IFS=' ' read -r -a array <<< $line
FILENAME=${array[0]}
HEADER=${array[1]}

echo ${FILENAME}: ${HEADER}
echo ${HEADER} | cat - <(tail -n +2 parameters/${FILENAME}.csv) > parameters/${FILENAME}.txt
done < parameters/headers.txt

echo "Finished preprocessing parameter files"
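
Both conversion scripts read a headers file with one `<file name> <header>` pair per line and split each line into an array. A possible simplification, sketched here under the assumption that neither field contains a space, is to let `read` assign the two fields to named variables directly. The sample file content is hypothetical.

```bash
#!/bin/bash
set -eu

# Hypothetical headers file: "<file name> <header>" on each line.
WORK_DIR="$(mktemp -d)"
printf 'bi-1 date\nbi-2 date|tagClass\n' > "${WORK_DIR}/headers.txt"

RESULT=""
# read -r keeps backslashes literal; the first field goes to FILENAME and
# the rest of the line goes to HEADER.
while read -r FILENAME HEADER; do
  RESULT="${RESULT}${FILENAME}: ${HEADER};"
done < "${WORK_DIR}/headers.txt"

echo "${RESULT}"
```

One difference from the array-based version: if a line held more than two space-separated fields, `HEADER` would receive everything after the first field, whereas `${array[1]}` holds only the second field.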
96 changes: 96 additions & 0 deletions cypher/scripts/create-example-graph.cypher
@@ -0,0 +1,96 @@
// Draft example graph.
// For an actively maintained example graph, see the https://github.com/ldbc/ldbc_snb_data_converter repository
CREATE
// Persons
(pA:Person {id: 1, firstName: 'Amelie', lastName: '', creationDate: datetime('2010-06-10T11:05:56.000+00:00')}),
(pB:Person {id: 2, firstName: 'Bernardo', lastName: '', creationDate: datetime('2010-06-10T11:05:56.000+00:00')}),
(pC:Person {id: 3, firstName: 'Cedric', lastName: '', creationDate: datetime('2010-06-10T11:05:56.000+00:00')}),
(pD:Person {id: 4, firstName: 'Diane', lastName: '', creationDate: datetime('2010-06-10T11:05:56.000+00:00')}),
(pE:Person {id: 5, firstName: 'Eve', lastName: '', creationDate: datetime('2010-06-10T11:05:56.000+00:00')}),
(pA)-[:KNOWS {creationDate: datetime('2010-06-10T11:05:56.000+00:00')}]->(pB),
(pA)-[:KNOWS {creationDate: datetime('2010-06-10T11:05:56.000+00:00')}]->(pC),
(pA)-[:KNOWS {creationDate: datetime('2010-06-10T11:05:56.000+00:00')}]->(pD),
(pB)-[:KNOWS {creationDate: datetime('2010-06-10T11:05:56.000+00:00')}]->(pC),
(pC)-[:KNOWS {creationDate: datetime('2010-06-10T11:05:56.000+00:00')}]->(pD),
(pD)-[:KNOWS {creationDate: datetime('2010-06-10T11:05:56.000+00:00')}]->(pE),
// Organisations
(cambridge:Organisation:University {id: 1, name: 'Cambridge'}),
(softengco:Organisation:Company {id: 2, name: 'SoftEngCo'}),
(pA)-[:STUDY_AT {classYear: 2008}]->(cambridge),
(pD)-[:STUDY_AT {classYear: 2006}]->(cambridge),
(pE)-[:STUDY_AT {classYear: 2008}]->(cambridge),
(pA)-[:WORK_AT]->(softengco),
// Places
(spain:Place:Country {id: 1, name: 'Spain'}),
(madrid:Place:City {id: 2, name: 'Madrid'}),
(france:Place:Country {id: 3, name: 'France'}),
(paris:Place:City {id: 4, name: 'Paris'}),
(lyon:Place:City {id: 5, name: 'Lyon'}),
(madrid)-[:IS_PART_OF]->(spain),
(paris)-[:IS_PART_OF]->(france),
(lyon)-[:IS_PART_OF]->(france),
(pA)-[:IS_LOCATED_IN]->(paris),
(pB)-[:IS_LOCATED_IN]->(madrid),
(pC)-[:IS_LOCATED_IN]->(lyon),
(pD)-[:IS_LOCATED_IN]->(paris),
// TagClasses
(tc1:TagClass {id: 1, name: 'Holiday resorts'}),
(tc2:TagClass {id: 2, name: 'Ski resorts'}),
(tc3:TagClass {id: 3, name: 'Sports'}),
(tc2)-[:IS_SUBCLASS_OF]->(tc1),
// Tags
(t1:Tag {id: 1, name: 'Pyrenees'}),
(t2:Tag {id: 2, name: 'Snowboard'}),
(pB)-[:HAS_INTEREST]->(t1),
(pD)-[:HAS_INTEREST]->(t2),
(t1)-[:HAS_TYPE]->(tc2),
(t2)-[:HAS_TYPE]->(tc3),
// Forums
(forum1:Forum {id: 1, creationDate: datetime('2011-10-10T11:01:47.000+00:00'), title: 'Skiing trips'}),
(forum2:Forum {id: 2, creationDate: datetime('2012-02-01T13:07:26.000+00:00'), title: 'Cinéma'}),
(forum1)-[:HAS_TAG]->(t1),
(forum1)-[:HAS_MEMBER]->(pA),
(forum1)-[:HAS_MEMBER]->(pB),
(forum1)-[:HAS_MEMBER]->(pC),
(forum2)-[:HAS_MEMBER]->(pC),
(forum2)-[:HAS_MEMBER]->(pA),
(forum1)-[:HAS_MODERATOR]->(pB),
(forum2)-[:HAS_MODERATOR]->(pC),
// Messages
(p1:Message:Post {id: 10, creationDate: datetime('2011-10-10T11:05:56.000+00:00'), length: 24, content: 'We should go to Hautacam', language: 'en'}),
(c1:Message:Comment {id: 1, creationDate: datetime('2011-10-10T11:08:01.000+00:00'), length: 24, content: 'Yes, I like the Pyrenees'}),
(c2:Message:Comment {id: 2, creationDate: datetime('2011-10-10T11:07:42.000+00:00'), length: 57, content: 'We should go to a place with better options for snowboard'}),
(c3:Message:Comment {id: 3, creationDate: datetime('2011-10-10T11:20:37.000+00:00'), length: 34, content: 'Hautacam is great for snowboarding'}),
(c4:Message:Comment {id: 4, creationDate: datetime('2012-02-15T09:47:23.000+00:00'), length: 58, content: 'It was a great place for snowboarding. Glad we went there!'}),
(c5:Message:Comment {id: 5, creationDate: datetime('2012-02-15T10:24:26.000+00:00'), length: 13, content: 'It was great!'}),
(p2:Message:Post {id: 20, creationDate: datetime('2012-03-04T13:41:23.000+00:00'), length: 38, content: 'Voici un film de snowboard intéressant', language: 'fr'}),
(c6:Message:Comment {id: 6, creationDate: datetime('2012-03-04T10:24:26.000+00:00'), length: 37, content: "Merci, j'adore les films de snowboard"}),
(forum1)-[:CONTAINER_OF]->(p1),
(forum2)-[:CONTAINER_OF]->(p2),
(c1)-[:REPLY_OF]->(p1),
(c2)-[:REPLY_OF]->(p1),
(c3)-[:REPLY_OF]->(c2),
(c4)-[:REPLY_OF]->(c3),
(c5)-[:REPLY_OF]->(c4),
(c6)-[:REPLY_OF]->(p2),
(pA)-[:LIKES]->(p1),
(pB)-[:LIKES]->(c2),
(pB)-[:LIKES]->(c3),
(pC)-[:LIKES]->(p1),
(pC)-[:LIKES]->(c4),
(p1)-[:HAS_CREATOR]->(pB),
(c1)-[:HAS_CREATOR]->(pC),
(c2)-[:HAS_CREATOR]->(pA),
(c3)-[:HAS_CREATOR]->(pC),
(c4)-[:HAS_CREATOR]->(pA),
(c5)-[:HAS_CREATOR]->(pD),
(p2)-[:HAS_CREATOR]->(pC),
(c6)-[:HAS_CREATOR]->(pA),
(p1)-[:HAS_TAG]->(t1),
(c1)-[:HAS_TAG]->(t1),
(c3)-[:HAS_TAG]->(t1),
(c2)-[:HAS_TAG]->(t2),
(c3)-[:HAS_TAG]->(t2),
(c4)-[:HAS_TAG]->(t2),
(p2)-[:HAS_TAG]->(t2),
(c6)-[:HAS_TAG]->(t2)
10 changes: 10 additions & 0 deletions cypher/scripts/create-indices.sh
@@ -0,0 +1,10 @@
#!/bin/bash

set -e
set -o pipefail

cd "$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

: ${NEO4J_CONTAINER_NAME:?"Environment variable NEO4J_CONTAINER_NAME is unset or empty"}

docker exec --interactive "${NEO4J_CONTAINER_NAME}" cypher-shell < indices.cypher
8 changes: 8 additions & 0 deletions cypher/scripts/delete-neo4j-database.sh
@@ -0,0 +1,8 @@
#!/bin/bash

set -e
set -o pipefail

: ${NEO4J_DATA_DIR:?"Environment variable NEO4J_DATA_DIR is unset or empty"}

rm -rf "${NEO4J_DATA_DIR}"
6 changes: 6 additions & 0 deletions cypher/scripts/disable-updates.sh
@@ -0,0 +1,6 @@
#!/bin/bash

set -e
set -o pipefail

sed -i 's/\(ldbc.snb.interactive.LdbcUpdate.*\)_enable=true/\1_enable=false/' interactive-*.properties
10 changes: 10 additions & 0 deletions cypher/scripts/environment-variables-default.sh
@@ -0,0 +1,10 @@
cd "$( cd "$( dirname "${BASH_SOURCE[0]:-${(%):-%x}}" )" >/dev/null 2>&1 && pwd )"
cd ..

export NEO4J_CONTAINER_ROOT="$(pwd)/neo4j-scratch"
export NEO4J_DATA_DIR="$(pwd)/neo4j-scratch/data"
export NEO4J_VERSION=4.2.1
export NEO4J_ENV_VARS=""
export NEO4J_CSV_DIR="$(pwd)/../../ldbc_snb_datagen/out/social_network"
export NEO4J_CSV_POSTFIX=_0_0.csv
export NEO4J_CONTAINER_NAME="neo-snb"
31 changes: 31 additions & 0 deletions cypher/scripts/headers.txt
@@ -0,0 +1,31 @@
static/Organisation id:ID(Organisation)|:LABEL|name:STRING|url:STRING
static/Place id:ID(Place)|name:STRING|url:STRING|:LABEL
static/TagClass id:ID(TagClass)|name:STRING|url:STRING
static/Tag id:ID(Tag)|name:STRING|url:STRING
static/TagClass_isSubclassOf_TagClass :START_ID(TagClass)|:END_ID(TagClass)
static/Tag_hasType_TagClass :START_ID(Tag)|:END_ID(TagClass)
static/Organisation_isLocatedIn_Place :START_ID(Organisation)|:END_ID(Place)
static/Place_isPartOf_Place :START_ID(Place)|:END_ID(Place)
dynamic/Comment creationDate:DATETIME|id:ID(Comment)|locationIP:STRING|browserUsed:STRING|content:STRING|length:LONG
dynamic/Forum creationDate:DATETIME|id:ID(Forum)|title:STRING
dynamic/Person creationDate:DATETIME|id:ID(Person)|firstName:STRING|lastName:STRING|gender:STRING|birthday:DATE|locationIP:STRING|browserUsed:STRING|speaks:STRING[]|email:STRING[]
dynamic/Post creationDate:DATETIME|id:ID(Post)|imageFile:STRING|locationIP:STRING|browserUsed:STRING|language:STRING|content:STRING|length:LONG
dynamic/Comment_hasCreator_Person creationDate:DATETIME|:START_ID(Comment)|:END_ID(Person)
dynamic/Comment_isLocatedIn_Country creationDate:DATETIME|:START_ID(Comment)|:END_ID(Place)
dynamic/Comment_replyOf_Comment creationDate:DATETIME|:START_ID(Comment)|:END_ID(Comment)
dynamic/Comment_replyOf_Post creationDate:DATETIME|:START_ID(Comment)|:END_ID(Post)
dynamic/Forum_containerOf_Post creationDate:DATETIME|:START_ID(Forum)|:END_ID(Post)
dynamic/Forum_hasMember_Person creationDate:DATETIME|:START_ID(Forum)|:END_ID(Person)
dynamic/Forum_hasModerator_Person creationDate:DATETIME|:START_ID(Forum)|:END_ID(Person)
dynamic/Forum_hasTag_Tag creationDate:DATETIME|:START_ID(Forum)|:END_ID(Tag)
dynamic/Person_hasInterest_Tag creationDate:DATETIME|:START_ID(Person)|:END_ID(Tag)
dynamic/Person_isLocatedIn_City creationDate:DATETIME|:START_ID(Person)|:END_ID(Place)
dynamic/Person_knows_Person creationDate:DATETIME|:START_ID(Person)|:END_ID(Person)
dynamic/Person_likes_Comment creationDate:DATETIME|:START_ID(Person)|:END_ID(Comment)
dynamic/Person_likes_Post creationDate:DATETIME|:START_ID(Person)|:END_ID(Post)
dynamic/Person_studyAt_University creationDate:DATETIME|:START_ID(Person)|:END_ID(Organisation)|classYear:LONG
dynamic/Person_workAt_Company creationDate:DATETIME|:START_ID(Person)|:END_ID(Organisation)|workFrom:LONG
dynamic/Post_hasCreator_Person creationDate:DATETIME|:START_ID(Post)|:END_ID(Person)
dynamic/Comment_hasTag_Tag creationDate:DATETIME|:START_ID(Comment)|:END_ID(Tag)
dynamic/Post_hasTag_Tag creationDate:DATETIME|:START_ID(Post)|:END_ID(Tag)
dynamic/Post_isLocatedIn_Country creationDate:DATETIME|:START_ID(Post)|:END_ID(Place)
63 changes: 63 additions & 0 deletions cypher/scripts/import-to-neo4j.sh
@@ -0,0 +1,63 @@
#!/bin/bash

set -e
set -o pipefail

cd "$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
cd ..

: ${NEO4J_CONTAINER_ROOT:?"Environment variable NEO4J_CONTAINER_ROOT is unset or empty"}
: ${NEO4J_DATA_DIR:?"Environment variable NEO4J_DATA_DIR is unset or empty"}
: ${NEO4J_CSV_DIR:?"Environment variable NEO4J_CSV_DIR is unset or empty"}
: ${NEO4J_VERSION:?"Environment variable NEO4J_VERSION is unset or empty"}
: ${NEO4J_CONTAINER_NAME:?"Environment variable NEO4J_CONTAINER_NAME is unset or empty"}

# make sure directories exist
mkdir -p "${NEO4J_CONTAINER_ROOT}"/{logs,import,plugins}

# start with a fresh data dir (required by the CSV importer)
mkdir -p "${NEO4J_DATA_DIR}"
rm -rf "${NEO4J_DATA_DIR}"/*

docker run --rm \
--user="$(id -u):$(id -g)" \
--publish=7474:7474 \
--publish=7687:7687 \
--volume="${NEO4J_DATA_DIR}":/data \
--volume="${NEO4J_CSV_DIR}":/import \
${NEO4J_ENV_VARS} \
neo4j:${NEO4J_VERSION} \
neo4j-admin import \
--id-type=INTEGER \
--nodes=Place="/import/static/Place${NEO4J_CSV_POSTFIX}" \
--nodes=Organisation="/import/static/Organisation${NEO4J_CSV_POSTFIX}" \
--nodes=TagClass="/import/static/TagClass${NEO4J_CSV_POSTFIX}" \
--nodes=Tag="/import/static/Tag${NEO4J_CSV_POSTFIX}" \
--nodes=Forum="/import/dynamic/Forum${NEO4J_CSV_POSTFIX}" \
--nodes=Person="/import/dynamic/Person${NEO4J_CSV_POSTFIX}" \
--nodes=Message:Comment="/import/dynamic/Comment${NEO4J_CSV_POSTFIX}" \
--nodes=Message:Post="/import/dynamic/Post${NEO4J_CSV_POSTFIX}" \
--relationships=IS_PART_OF="/import/static/Place_isPartOf_Place${NEO4J_CSV_POSTFIX}" \
--relationships=IS_SUBCLASS_OF="/import/static/TagClass_isSubclassOf_TagClass${NEO4J_CSV_POSTFIX}" \
--relationships=IS_LOCATED_IN="/import/static/Organisation_isLocatedIn_Place${NEO4J_CSV_POSTFIX}" \
--relationships=HAS_TYPE="/import/static/Tag_hasType_TagClass${NEO4J_CSV_POSTFIX}" \
--relationships=HAS_CREATOR="/import/dynamic/Comment_hasCreator_Person${NEO4J_CSV_POSTFIX}" \
--relationships=IS_LOCATED_IN="/import/dynamic/Comment_isLocatedIn_Country${NEO4J_CSV_POSTFIX}" \
--relationships=REPLY_OF="/import/dynamic/Comment_replyOf_Comment${NEO4J_CSV_POSTFIX}" \
--relationships=REPLY_OF="/import/dynamic/Comment_replyOf_Post${NEO4J_CSV_POSTFIX}" \
--relationships=CONTAINER_OF="/import/dynamic/Forum_containerOf_Post${NEO4J_CSV_POSTFIX}" \
--relationships=HAS_MEMBER="/import/dynamic/Forum_hasMember_Person${NEO4J_CSV_POSTFIX}" \
--relationships=HAS_MODERATOR="/import/dynamic/Forum_hasModerator_Person${NEO4J_CSV_POSTFIX}" \
--relationships=HAS_TAG="/import/dynamic/Forum_hasTag_Tag${NEO4J_CSV_POSTFIX}" \
--relationships=HAS_INTEREST="/import/dynamic/Person_hasInterest_Tag${NEO4J_CSV_POSTFIX}" \
--relationships=IS_LOCATED_IN="/import/dynamic/Person_isLocatedIn_City${NEO4J_CSV_POSTFIX}" \
--relationships=KNOWS="/import/dynamic/Person_knows_Person${NEO4J_CSV_POSTFIX}" \
--relationships=LIKES="/import/dynamic/Person_likes_Comment${NEO4J_CSV_POSTFIX}" \
--relationships=LIKES="/import/dynamic/Person_likes_Post${NEO4J_CSV_POSTFIX}" \
--relationships=HAS_CREATOR="/import/dynamic/Post_hasCreator_Person${NEO4J_CSV_POSTFIX}" \
--relationships=HAS_TAG="/import/dynamic/Comment_hasTag_Tag${NEO4J_CSV_POSTFIX}" \
--relationships=HAS_TAG="/import/dynamic/Post_hasTag_Tag${NEO4J_CSV_POSTFIX}" \
--relationships=IS_LOCATED_IN="/import/dynamic/Post_isLocatedIn_Country${NEO4J_CSV_POSTFIX}" \
--relationships=STUDY_AT="/import/dynamic/Person_studyAt_University${NEO4J_CSV_POSTFIX}" \
--relationships=WORK_AT="/import/dynamic/Person_workAt_Company${NEO4J_CSV_POSTFIX}" \
--delimiter '|'
15 changes: 15 additions & 0 deletions cypher/scripts/indices.cypher
@@ -0,0 +1,15 @@
CREATE CONSTRAINT ON (n:City) ASSERT n.id IS UNIQUE;
CREATE CONSTRAINT ON (n:Comment) ASSERT n.id IS UNIQUE;
CREATE CONSTRAINT ON (n:Country) ASSERT n.id IS UNIQUE;
CREATE CONSTRAINT ON (n:Forum) ASSERT n.id IS UNIQUE;
CREATE CONSTRAINT ON (n:Message) ASSERT n.id IS UNIQUE;
CREATE CONSTRAINT ON (n:Organisation) ASSERT n.id IS UNIQUE;
CREATE CONSTRAINT ON (n:Person) ASSERT n.id IS UNIQUE;
CREATE CONSTRAINT ON (n:Post) ASSERT n.id IS UNIQUE;
CREATE CONSTRAINT ON (n:Tag) ASSERT n.id IS UNIQUE;
CREATE INDEX ON :Country(name);
CREATE INDEX ON :Message(creationDate);
CREATE INDEX ON :Person(firstName);
CREATE INDEX ON :Post(creationDate);
CREATE INDEX ON :Tag(name);
CREATE INDEX ON :TagClass(name);
3 changes: 3 additions & 0 deletions cypher/scripts/install-dependencies.sh
@@ -0,0 +1,3 @@
#!/bin/bash

pip3 install -U neo4j
33 changes: 33 additions & 0 deletions cypher/scripts/load-in-one-step.sh
@@ -0,0 +1,33 @@
#!/bin/bash

set -e
set -o pipefail

cd "$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

echo ===============================================================================
echo Loading the Neo4j database with the following parameters
echo -------------------------------------------------------------------------------
echo NEO4J_CONTAINER_ROOT: ${NEO4J_CONTAINER_ROOT}
echo NEO4J_DATA_DIR: ${NEO4J_DATA_DIR}
echo NEO4J_CSV_DIR: ${NEO4J_CSV_DIR}
echo NEO4J_CSV_POSTFIX: ${NEO4J_CSV_POSTFIX}
echo NEO4J_VERSION: ${NEO4J_VERSION}
echo NEO4J_CONTAINER_NAME: ${NEO4J_CONTAINER_NAME}
echo NEO4J_ENV_VARS: ${NEO4J_ENV_VARS}
echo ===============================================================================

: ${NEO4J_CONTAINER_ROOT:?"Environment variable NEO4J_CONTAINER_ROOT is unset or empty"}
: ${NEO4J_DATA_DIR:?"Environment variable NEO4J_DATA_DIR is unset or empty"}
: ${NEO4J_CSV_DIR:?"Environment variable NEO4J_CSV_DIR is unset or empty"}
: ${NEO4J_CSV_POSTFIX:?"Environment variable NEO4J_CSV_POSTFIX is unset or empty"}
: ${NEO4J_VERSION:?"Environment variable NEO4J_VERSION is unset or empty"}
: ${NEO4J_CONTAINER_NAME:?"Environment variable NEO4J_CONTAINER_NAME is unset or empty"}
# NEO4J_ENV_VARS may be empty, hence no check is required

./stop-neo4j.sh
./delete-neo4j-database.sh
./convert-csvs.sh
./import-to-neo4j.sh
./start-neo4j.sh
./create-indices.sh