An experiment testing the applicability of the JanusGraph graph DBMS to the problem of finding matching paths

Hello everyone. We are developing a product for offline traffic analysis. One of the project's tasks involves statistical analysis of visitor routes across zones.

As part of this task, users can ask the system queries of the following kind:

  • how many visitors went from zone "A" to zone "B";
  • how many visitors went from zone "A" to zone "B" through zone "C" and then through zone "D";
  • how long it took a certain type of visitor to get from zone "A" to zone "B".

and a number of similar analytical queries.
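
To make these query types concrete, here is a minimal in-memory sketch in Python. It assumes each visitor's track is simply an ordered list of `(zone, timestamp)` pairs; the `passed_through` and `count_visitors` helpers and the sample data are hypothetical illustrations and not part of the product.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical representation: visitor id -> ordered list of (zone, unix_time).
Track = List[Tuple[str, int]]


def passed_through(track: Track, zones: Sequence[str]) -> Optional[int]:
    """Return the travel time in seconds if the track visits `zones` in order
    (other zones may occur in between), or None if it does not."""
    position = 0
    started_at = None
    for zone, when in track:
        if zone == zones[position]:
            if position == 0:
                started_at = when
            position += 1
            if position == len(zones):
                return when - started_at
    return None


def count_visitors(tracks: Dict[int, Track], zones: Sequence[str]) -> int:
    """Count visitors whose track contains `zones` as an ordered subsequence."""
    return sum(1 for t in tracks.values() if passed_through(t, zones) is not None)


# Made-up sample data for illustration only.
tracks = {
    1: [("A", 0), ("C", 10), ("D", 25), ("B", 40)],
    2: [("A", 5), ("B", 20)],
    3: [("B", 0), ("A", 30)],
}
```

With this data, `count_visitors(tracks, ["A", "B"])` answers the first question, `count_visitors(tracks, ["A", "C", "D", "B"])` the second, and `passed_through` gives the travel time needed for the third.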

A visitor's movement across zones is a directed graph. After reading around on the Internet, I found that graph DBMSs are also used for analytical reports. I wanted to see how graph DBMSs would cope with queries like these (TL;DR: poorly).

I chose JanusGraph as an outstanding representative of open-source graph DBMSs. It relies on a stack of mature technologies which (in my opinion) should give it decent operational characteristics:

  • BerkeleyDB, Apache Cassandra, and Scylla as storage backends;
  • complex indexes can be stored in Lucene, Elasticsearch, or Solr.

The authors of JanusGraph write that it is suitable for both OLTP and OLAP.

I have worked with BerkeleyDB, Apache Cassandra, Scylla, and ES, and these products are often used in our systems, so I was optimistic about testing this graph DBMS. Choosing BerkeleyDB over RocksDB seemed odd to me, but that is probably due to transaction requirements. In any case, for scalable production use, a Cassandra or Scylla backend is recommended.

I did not consider Neo4j because clustering requires the commercial version, that is, the product is not fully open.

Graph DBMSs say: "If it looks like a graph, treat it like a graph!" Lovely!

First, I drew a graph designed exactly according to the canons of graph DBMSs:

[Figure: the initial graph schema]

There is a Zone entity, which represents a zone. If a ZoneStep belongs to a Zone, it references that Zone. Ignore the Area, ZoneTrack, and Person entities: they belong to the domain and are not part of the test. Altogether, a chain search query for such a graph structure would look like this:

g.V().hasLabel('Zone').has('id',0).in_()
       .repeat(__.out()).until(__.out().hasLabel('Zone').has('id',19)).count().next()

In plain language, this means roughly: find the Zone with ID=0, take all the vertices that have an edge pointing to it (ZoneStep), keep walking forward without going back until you reach ZoneSteps that have an edge to the Zone with ID=19, and count the number of such chains.

I don't claim to know all the subtleties of graph search, but this query was written based on this book (https://kelvinlawrence.net/book/Gremlin-Graph-Guide.html).

I loaded 50 thousand tracks, each 3 to 20 points long, into a JanusGraph graph database with the BerkeleyDB backend and created indexes according to the manual.

The Python loading script:


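from random import random
from time import time

# `init` is the author's own helper module; it is expected to expose a
# Gremlin traversal source `g` connected to the JanusGraph server.
from init import g, graph

if __name__ == '__main__':

    max_zones = 19

    # Create the Zone vertices once and cache them by id.
    zcache = dict()
    for i in range(0, max_zones + 1):
        zcache[i] = g.addV('Zone').property('id', i).next()

    startZ = zcache[0]
    endZ = zcache[max_zones]

    for i in range(0, 10000):

        if not i % 100:
            print(i)  # progress indicator

        # Every track starts with a ZoneStep that belongs to zone 0.
        start = g.addV('ZoneStep').property('time', int(time())).next()
        g.V(start).addE('belongs').to(startZ).iterate()

        while True:
            pt = g.addV('ZoneStep').property('time', int(time())).next()
            end_chain = random()
            if end_chain < 0.3:
                # End the track: the last step belongs to the final zone.
                g.V(pt).addE('belongs').to(endZ).iterate()
                g.V(start).addE('goes').to(pt).iterate()
                break
            else:
                # Otherwise attach the step to a random zone (0..18).
                zone_id = int(random() * max_zones)
                g.V(pt).addE('belongs').to(zcache[zone_id]).iterate()
                g.V(start).addE('goes').to(pt).iterate()

            # The new step becomes the previous step of the next iteration.
            start = pt

    count = g.V().count().next()
    print(count)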

We used a VM with 4 cores and 16 GB of RAM on an SSD. JanusGraph was deployed with this command:

docker run --name janusgraph -p8182:8182 janusgraph/janusgraph:latest

In this case, the data and the indexes used for exact-match lookups are stored in BerkeleyDB. Running the query shown earlier took several tens of seconds.

By running four of the scripts above in parallel, I managed to turn the DBMS into a pumpkin, with a cheerful stream of Java stack traces (and we all love reading Java stack traces) in the Docker logs.

After some thought, I decided to simplify the graph schema to the following:

[Figure: the simplified graph schema]

I decided that searching by vertex properties would be faster than searching by edges. As a result, my query turned into the following:

g.V().hasLabel('ZoneStep').has('id',0).repeat(__.out().simplePath()).until(__.hasLabel('ZoneStep').has('id',19)).count().next()

In plain language, this means roughly: find the ZoneStep with ID=0, keep walking forward without revisiting vertices until you reach a ZoneStep with ID=19, and count the number of such chains.
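
What this traversal asks for can be mimicked with a small depth-first search in plain Python. This is only a sketch of the semantics of `repeat(out().simplePath()).until(...)`, not of how JanusGraph executes it; the adjacency dictionary `edges`, the `zone_of` mapping, and `count_chains` are hypothetical names introduced for illustration.

```python
from typing import Dict, Hashable, List, Set


def count_chains(edges: Dict[Hashable, List[Hashable]],
                 zone_of: Dict[Hashable, int],
                 start_zone: int,
                 end_zone: int) -> int:
    """Count simple paths that start at a step in `start_zone` and stop at the
    first step in `end_zone` they reach."""

    def walk(vertex: Hashable, visited: Set[Hashable]) -> int:
        found = 0
        for nxt in edges.get(vertex, []):
            if nxt in visited:            # simplePath(): never revisit a vertex
                continue
            if zone_of[nxt] == end_zone:  # until(...): count the chain and stop
                found += 1
            else:
                found += walk(nxt, visited | {nxt})
        return found

    return sum(walk(v, {v}) for v, z in zone_of.items() if z == start_zone)


# Two made-up visitor tracks: a1 -> a2 -> a3 and b1 -> b2.
edges = {"a1": ["a2"], "a2": ["a3"], "b1": ["b2"]}
zone_of = {"a1": 0, "a2": 7, "a3": 19, "b1": 0, "b2": 19}
```

Even in this toy form the search has to walk every chain from each starting step, which hints at why the cost grows with the number and length of tracks.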

I also simplified the loading script above so that it creates no unnecessary edges and relies on properties only.

The query still took several seconds to complete, which was completely unacceptable for our task, because that is far too slow for ad hoc queries of any kind.

I tried deploying JanusGraph on Scylla, as the fastest Cassandra implementation, but this did not lead to any significant change in performance either.

So even though "it looks like a graph", I could not get the graph DBMS to process it quickly. I fully admit that there may be something I don't know and that JanusGraph can be made to run this search in a fraction of a second; however, I was unable to do so.

Since the problem still had to be solved, I started thinking about table JOINs and pivots, which did not inspire optimism in terms of elegance but could be a perfectly workable option in practice.

Our project already uses ClickHouse, so I decided to test my idea on this analytical DBMS.

I deployed ClickHouse using a simple recipe:

sudo docker run -d --name clickhouse_1 \
     --ulimit nofile=262144:262144 \
     -v /opt/clickhouse/log:/var/log/clickhouse-server \
     -v /opt/clickhouse/data:/var/lib/clickhouse \
     yandex/clickhouse-server

I created a database and a table in it like this:

CREATE TABLE 
db.steps (`area` Int64, `when` DateTime64(1, 'Europe/Moscow') DEFAULT now64(), `zone` Int64, `person` Int64) 
ENGINE = MergeTree() ORDER BY (area, zone, person) SETTINGS index_granularity = 8192

I filled it with data using the following script:

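from random import random
from time import time

from clickhouse_driver import Client

client = Client('vm-12c2c34c-df68-4a98-b1e5-a4d1cef1acff.domain',
                database='db',
                password='secret')

# Number of zones; renamed from `max` to avoid shadowing the built-in.
zone_count = 20

for r in range(0, 100000):

    if r % 1000 == 0:
        print("CNT: {}, TS: {}".format(r, time()))

    # Every track starts in zone 0.
    data = [{
            'area': 0,
            'zone': 0,
            'person': r
        }]

    # Append a random number of intermediate zones (1 .. zone_count - 2).
    while True:
        if random() < 0.3:
            break

        data.append({
                'area': 0,
                'zone': int(random() * (zone_count - 2)) + 1,
                'person': r
            })

    # Every track ends in the last zone.
    data.append({
            'area': 0,
            'zone': zone_count - 1,
            'person': r
        })

    # One INSERT per track: all rows of the visitor are sent as a single batch.
    client.execute(
        'INSERT INTO steps (area, zone, person) VALUES',
        data
    )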

Because the inserts are sent in batches, loading was much faster than with JanusGraph.
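
Batching could be pushed further by accumulating the rows of several visitors before each INSERT. The sketch below keeps track generation and batching separate from the database call. `make_track`, `batched`, `load`, and the batch size are illustrative assumptions, and the `send` callback stands in for a call such as `client.execute('INSERT INTO steps (area, zone, person) VALUES', rows)`.

```python
from random import Random
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, int]


def make_track(person: int, zone_count: int, rng: Random) -> List[Row]:
    """Build one visitor track: zone 0 first, the last zone at the end and a
    random number of intermediate zones in between."""
    rows = [{'area': 0, 'zone': 0, 'person': person}]
    while rng.random() >= 0.3:
        rows.append({'area': 0,
                     'zone': rng.randrange(1, zone_count - 1),
                     'person': person})
    rows.append({'area': 0, 'zone': zone_count - 1, 'person': person})
    return rows


def batched(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield lists of at most `size` rows."""
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def load(people: int, zone_count: int, size: int,
         send: Callable[[List[Row]], None], seed: int = 0) -> int:
    """Generate tracks for `people` visitors and pass them to `send` in batches.
    Returns the total number of rows sent."""
    rng = Random(seed)
    rows = (row for p in range(people) for row in make_track(p, zone_count, rng))
    total = 0
    for batch in batched(rows, size):
        send(batch)
        total += len(batch)
    return total
```

Passing `send` as a parameter keeps the sketch runnable without a server; whether larger batches actually help on a given setup would have to be measured.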

I built two queries using JOIN. To go from point A to point B:

SELECT s1.person AS person,
       s1.zone,
       s1.when,
       s2.zone,
       s2.when
FROM
  (SELECT *
   FROM steps
   WHERE (area = 0)
     AND (zone = 0)) AS s1 ANY INNER JOIN
  (SELECT *
   FROM steps AS s2
   WHERE (area = 0)
     AND (zone = 19)) AS s2 USING person
WHERE s1.when <= s2.when

To go through 3 points:

SELECT s3.person,
       s1z,
       s1w,
       s2z,
       s2w,
       s3.zone,
       s3.when
FROM
  (SELECT s1.person AS person,
          s1.zone AS s1z,
          s1.when AS s1w,
          s2.zone AS s2z,
          s2.when AS s2w
   FROM
     (SELECT *
      FROM steps
      WHERE (area = 0)
        AND (zone = 0)) AS s1 ANY INNER JOIN
     (SELECT *
      FROM steps AS s2
      WHERE (area = 0)
        AND (zone = 3)) AS s2 USING person
   WHERE s1.when <= s2.when) p ANY INNER JOIN
  (SELECT *
   FROM steps
   WHERE (area = 0)
     AND (zone = 19)) AS s3 USING person
WHERE p.s2w <= s3.when

The queries, of course, look rather scary; for real use you would need a query generator harness. However, they work, and they work fast. Both the first and the second query complete in less than 0.1 seconds. Here is an example of the execution time for a count(*) query that passes through 3 points:

SELECT count(*)
FROM 
(
    SELECT 
        s1.person AS person, 
        s1.zone AS s1z, 
        s1.when AS s1w, 
        s2.zone AS s2z, 
        s2.when AS s2w
    FROM 
    (
        SELECT *
        FROM steps
        WHERE (area = 0) AND (zone = 0)
    ) AS s1
    ANY INNER JOIN 
    (
        SELECT *
        FROM steps AS s2
        WHERE (area = 0) AND (zone = 3)
    ) AS s2 USING (person)
    WHERE s1.when <= s2.when
) AS p
ANY INNER JOIN 
(
    SELECT *
    FROM steps
    WHERE (area = 0) AND (zone = 19)
) AS s3 USING (person)
WHERE p.s2w <= s3.when

┌─count()─┐
│   11592 │
└─────────┘

1 rows in set. Elapsed: 0.068 sec. Processed 250.03 thousand rows, 8.00 MB (3.69 million rows/s., 117.98 MB/s.)
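
The generator harness mentioned above could be sketched roughly as follows. This is a hypothetical helper, not the code used in the project: `zone_subquery` and `build_path_query` are invented names, the generated SQL only follows the shape of the hand-written queries (nested `ANY INNER JOIN` on `person` with an ordering condition on `when`), and it has not been run against a ClickHouse server.

```python
from typing import Sequence


def zone_subquery(area: int, zone: int, index: int) -> str:
    """Rows of a single zone; the timestamp column is renamed to w<index>
    so that the columns of the joined sides never collide."""
    return (f"SELECT person, `when` AS w{index} FROM steps "
            f"WHERE (area = {int(area)}) AND (zone = {int(zone)})")


def build_path_query(area: int, zones: Sequence[int]) -> str:
    """Build a count(*) query for visitors who passed `zones` in this order."""
    if len(zones) < 2:
        raise ValueError("at least two zones are required")
    query = zone_subquery(area, zones[0], 0)
    for index, zone in enumerate(zones[1:], start=1):
        # Wrap the query built so far and join the next zone onto it.
        query = (
            f"SELECT person, w{index} FROM ({query}) AS p{index} "
            f"ANY INNER JOIN ({zone_subquery(area, zone, index)}) AS s{index} "
            f"USING (person) WHERE w{index - 1} <= w{index}"
        )
    return f"SELECT count(*) FROM ({query})"


if __name__ == "__main__":
    print(build_path_query(0, [0, 3, 19]))
```

Casting the parameters with `int()` keeps arbitrary strings out of the SQL text; a real harness would more likely rely on the driver's parameter substitution.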

A note about IOPS. While loading data, JanusGraph generated a fairly high number of IOPS (1000-1300 for four loader threads), and IOWAIT was quite high. At the same time, ClickHouse put minimal load on the disk subsystem.

Conclusion

We decided to use ClickHouse to serve this type of query. We can always optimize the queries further with materialized views, and parallelize by pre-processing the event stream with Apache Flink before loading it into ClickHouse.

Performance is so good that we probably won't even need to think about pivoting tables programmatically. Previously, we had to pivot data retrieved from Vertica by exporting it to Apache Parquet.

Unfortunately, yet another attempt to use a graph DBMS was unsuccessful. I did not find that JanusGraph has a friendly ecosystem that makes it easy to get up to speed with the product. On top of that, the server is configured the traditional Java way, which will make people unfamiliar with Java cry tears of blood:

host: 0.0.0.0
port: 8182
threadPoolWorker: 1
gremlinPool: 8
scriptEvaluationTimeout: 30000
channelizer: org.janusgraph.channelizers.JanusGraphWsAndHttpChannelizer

graphManager: org.janusgraph.graphdb.management.JanusGraphManager
graphs: {
  ConfigurationManagementGraph: conf/janusgraph-cql-configurationgraph.properties,
  airlines: conf/airlines.properties
}

scriptEngines: {
  gremlin-groovy: {
    plugins: { org.janusgraph.graphdb.tinkerpop.plugin.JanusGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.server.jsr223.GremlinServerGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.tinkergraph.jsr223.TinkerGraphGremlinPlugin: {},
               org.apache.tinkerpop.gremlin.jsr223.ImportGremlinPlugin: {classImports: [java.lang.Math], methodImports: [java.lang.Math#*]},
               org.apache.tinkerpop.gremlin.jsr223.ScriptFileGremlinPlugin: {files: [scripts/airline-sample.groovy]}}}}

serializers:
# GraphBinary is here to replace Gryo and Graphson
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1, config: { serializeResultToString: true }}
  # Gryo and Graphson, latest versions
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  # Older serialization versions for backwards compatibility:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: {ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}

processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}

metrics: {
  consoleReporter: {enabled: false, interval: 180000},
  csvReporter: {enabled: false, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: false},
  slf4jReporter: {enabled: true, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
threadPoolBoss: 1
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
  enabled: false}

I even managed to accidentally bring down the BerkeleyDB version of JanusGraph.

The documentation is rather confusing when it comes to indexes, because managing indexes requires some rather strange shamanism in Groovy. For example, an index has to be created by writing code in the Gremlin console (which, by the way, does not work out of the box). From the official JanusGraph documentation:

graph.tx().rollback() //Never create new indexes while a transaction is active
mgmt = graph.openManagement()
name = mgmt.getPropertyKey('name')
age = mgmt.getPropertyKey('age')
mgmt.buildIndex('byNameComposite', Vertex.class).addKey(name).buildCompositeIndex()
mgmt.buildIndex('byNameAndAgeComposite', Vertex.class).addKey(name).addKey(age).buildCompositeIndex()
mgmt.commit()

//Wait for the index to become available
ManagementSystem.awaitGraphIndexStatus(graph, 'byNameComposite').call()
ManagementSystem.awaitGraphIndexStatus(graph, 'byNameAndAgeComposite').call()
//Reindex the existing data
mgmt = graph.openManagement()
mgmt.updateIndex(mgmt.getGraphIndex("byNameComposite"), SchemaAction.REINDEX).get()
mgmt.updateIndex(mgmt.getGraphIndex("byNameAndAgeComposite"), SchemaAction.REINDEX).get()
mgmt.commit()

Afterword

In a sense, the experiment above compares apples and oranges. If you think about it, a graph DBMS performs different operations to obtain the same results. However, as part of the tests I also ran an experiment with a query like this:

g.V().hasLabel('ZoneStep').has('id',0)
    .repeat(__.out().simplePath()).until(__.hasLabel('ZoneStep').has('id',1)).count().next()

which reflects walking distance. However, even on such data the graph DBMS showed results that went beyond a few seconds... This is, of course, because there were paths like 0 -> X -> Y ... -> 1, which the graph engine also checked.

Even for a query like:

g.V().hasLabel('ZoneStep').has('id',0).out().has('id',1).count().next()

I could not get a usable response with a processing time of less than a second.

The moral of the story is that a beautiful idea and paradigmatic modeling do not necessarily lead to the desired result, which the ClickHouse example achieves far more efficiently. The use case presented in this article is a clear anti-pattern for graph DBMSs, even though it looks well suited to modeling in their paradigm.

Source: www.habr.com
