Cassandra.Link

The best knowledge base on Apache Cassandra®

Helping platform leaders, architects, engineers, and operators build scalable real time data platforms.

8/13/2018

elubow/titan-gremlin

by John Doe

Titan is a free, open source graph database capable of processing extremely large graphs. It supports a variety of indexing and storage backends, which makes it easier to extend than some popular NoSQL graph databases.

This Docker image instantiates a Titan graph database that can integrate with an ElasticSearch container (indexing) and a Cassandra container (storage).

The default distribution of Titan runs on a single node, so I thought it would be helpful if there was a modular way at runtime to hook up Titan to its dependencies.

Enter Docker. Now it is possible to run Titan and its dependencies in separate Docker containers.

Titan

This container uses Titan 1.0.0. Please refer to the Titan project page for more information.

TinkerPop and Gremlin

TinkerPop is a vendor-independent API specification for manipulating and accessing graph databases. This container uses TinkerPop 3.0.1.

Running

The minimum system requirements for this stack are 1 GB of memory and 2 cores.

docker run -d --name es1 elasticsearch
docker run -d --name cas1 elubow/cassandra
docker run -d -P --name titan1 --link es1:elasticsearch --link cas1:cassandra elubow/titan-gremlin

I run a three-node Cassandra cluster with some ports published to the host, like so:

docker run -d --name cas1 -p 7000:7000 -p 7001:7001 -p 7199:7199 -p 9160:9160 -p 9042:9042 elubow/cassandra
docker run -d --name cas2 --link cas1:cassandra elubow/cassandra start $(docker inspect --format '{{ .NetworkSettings.IPAddress }}' cas1)
docker run -d --name cas3 --link cas1:cassandra elubow/cassandra start $(docker inspect --format '{{ .NetworkSettings.IPAddress }}' cas1)
docker run -d --name es1 --link cas1:cassandra -p 9200:9200 elasticsearch
docker run -d --name titan1 --link es1:elasticsearch --link cas1:cassandra -p 8182:8182 -p 8184:8184 elubow/titan-gremlin
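The extra Cassandra nodes all follow one pattern: link to cas1 and pass its IP address to the image's start command. As a rough sketch of how that could generalize to more nodes, the hypothetical helper below (extra_node_cmds is my name, not part of the image) only prints the commands, so they can be reviewed before running. It assumes the cas2..casN naming used above.

```shell
# Print (do not run) the docker commands for the extra Cassandra nodes
# that join the seed container.
# Arguments: seed container name, seed IP address, total node count.
# extra_node_cmds is a hypothetical helper name, not part of the image.
extra_node_cmds() {
    seed_name=$1
    seed_ip=$2
    total=$3
    i=2
    # The seed is node 1, so the extra nodes are numbered 2..total.
    while [ "$i" -le "$total" ]; do
        echo "docker run -d --name cas${i} --link ${seed_name}:cassandra elubow/cassandra start ${seed_ip}"
        i=$((i + 1))
    done
}
```

For example, extra_node_cmds cas1 "$(docker inspect --format '{{ .NetworkSettings.IPAddress }}' cas1)" 3 prints the cas2 and cas3 commands; piping the output to sh would run them.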

Connecting with Gremlin Client

If you want to connect from a Gremlin client, download Titan. Then create a properties file like the one below, where storage.hostname is the hostname or IP address of the Docker host.

storage.backend=cassandrathrift
storage.hostname=192.168.99.100
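Since the Docker host address differs between machines, it can help to generate this file rather than edit it by hand. The following is a minimal sketch; write_titan_props is a hypothetical helper name, and the two keys are the ones shown above.

```shell
# Write a minimal Titan properties file for the Cassandra Thrift backend.
# Arguments: Docker host (name or IP), output file path.
# write_titan_props is a hypothetical helper name.
write_titan_props() {
    host=$1
    out=$2
    printf 'storage.backend=cassandrathrift\nstorage.hostname=%s\n' "$host" > "$out"
}
```

For example, write_titan_props 192.168.99.100 local-gremlin.properties produces the file shown above.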

Then start the Gremlin console with bin/gremlin.sh and run the following commands inside it:

gremlin> graph = TitanFactory.open('/Users/elubow/tmp/local-gremlin.properties')
==>standardtitangraph[cassandrathrift:[192.168.99.100]]
gremlin> g = graph.traversal()
==>graphtraversalsource[standardtitangraph[cassandrathrift:[192.168.99.100]], standard]
gremlin> g.V()
==>v[4168]

NOTE: This configuration does not use the ElasticSearch backend.

Ports

- 8182: HTTP port for the REST API
- 8184: JMX port (you probably won't need this)

To test out the REST API (over Boot2docker):

curl "http://192.168.99.100:8182?gremlin=100-1"
curl "http://192.168.99.100:8182?gremlin=g.addV('Name','Eric')"
curl "http://192.168.99.100:8182?gremlin=g.V()"
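Queries that contain spaces, quotes, or other reserved characters may need URL encoding before they go into the query string. One way to handle that, sketched here under the assumption that the endpoint accepts the same gremlin parameter as above, is to let curl do the encoding. gremlin_get is a hypothetical helper name.

```shell
# Send a Gremlin query to the HTTP endpoint with the query URL-encoded.
# Arguments: Docker host (name or IP), Gremlin query string.
# gremlin_get is a hypothetical helper name; port 8182 is the HTTP port
# listed in the Ports section.
gremlin_get() {
    host=$1
    query=$2
    # -G sends the --data-urlencode payload as a query string on a GET request.
    curl -s -G --data-urlencode "gremlin=${query}" "http://${host}:8182/"
}
```

For example, gremlin_get 192.168.99.100 "g.addV('Name','Eric')" issues the same request as the second curl command above, with the query encoded.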

Dependencies

I've tested this container with the following containers:

- elubow/cassandra: The Cassandra storage backend for Titan. It scales well for large datasets. The image pins Cassandra 2.1 because that version is compatible with Titan.
- elasticsearch: The ElasticSearch indexing backend for Titan. It provides search capabilities for Titan graph datasets.

Roadmap

In the near future, I'd like to add support for:

- Scaling/Clustering Cassandra and ElasticSearch backends.
- External volumes for persistent data.
- Security between Titan and its backends.
- Example application stack integrating with Titan.
