1/6/2017


How To Configure and Run Cassandra on OpenShift – OpenShift Blog

by John Doe

Cassandra is an open-source, distributed, decentralized, horizontally scalable, and highly available NoSQL database. Its distribution model is based on Amazon’s Dynamo and its data model is based on Google’s Bigtable. Cassandra does not have any notion of master/slave, as all of its nodes are the same. This helps Cassandra be fault tolerant and avoid a single point of failure. The purpose of this post is to show users how they can install Cassandra on the OpenShift platform as a service. If you want to learn more about Cassandra, please read the documentation.


In this post, we will do a single-node Cassandra installation. Note that the whole point of using Cassandra is fault tolerance and high availability, so a single-node installation is only good for a proof of concept and for getting your hands dirty with Cassandra. This post is divided into two parts:

How to install Cassandra on a DIY application
How to use Cassandra as an embedded cartridge in a simple Java application

Prerequisites

Before we can start deploying Cassandra on OpenShift, you need to do the following:

Sign up for an OpenShift account: If you don’t already have an OpenShift account, head over to the website and sign up. It is completely free, and Red Hat gives every user three free gears on which to run applications. At the time of this writing, the combined resources allocated to each user are 1.5 GB of memory and 3 GB of disk space.
Install the client tools on your machine: The OpenShift client tools are written in Ruby. On OS X 10.6 or later and on most Linux distributions, Ruby is installed by default, so installing the client tools is a snap. Issue the following command in your terminal:
```
sudo gem install rhc
```
Set up OpenShift: The rhc client tool makes it very easy to set up your OpenShift instance with SSH keys, git, and your application namespace. The namespace is a unique name per user that becomes part of your application URL. For example, if your namespace is cix and your application name is cassandra, then the URL of the application will be http://cassandra-cix.rhcloud.com/. The command is shown below:
```
rhc setup -l openshift_login
```
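
The naming rule described above (application name, then namespace, then the rhcloud.com domain) can be sketched as a small shell function. The function name app_url is hypothetical and exists only for this illustration; it is not part of the rhc tool.

```shell
# Hypothetical helper (not part of rhc): build the public URL of an
# application from its name and the account namespace, following the
# http://<app>-<namespace>.rhcloud.com/ pattern described in the text.
app_url() {
    app_name="$1"
    namespace="$2"
    printf 'http://%s-%s.rhcloud.com/\n' "$app_name" "$namespace"
}

# Example from the text: namespace "cix", application "cassandra".
app_url cassandra cix
```

With the values used in this post, the call prints the same URL given in the example above.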

Part 1 : Installing Cassandra on a DIY application

After you have signed up for an OpenShift account and run the rhc setup command, the next step is to create a DIY application using the rhc command-line tool. OpenShift’s Do-It-Yourself (DIY) feature allows you to use your own languages and data stores if the built-in Perl, Ruby, PHP, Python, and Java support doesn’t suit you. People have used it to run Clojure, JRuby, Go, CouchDB, Redis, and many other languages and data stores. OpenShift can run any binary that runs on RHEL 6.2 x64, because the OpenShift execution environment is a carefully secured Red Hat Enterprise Linux 6.2 running on x64 systems.

Creating Cassandra DIY Application

To create a DIY application, execute the command shown below.

```
rhc app create cassandra diy
```

This creates an application container for us, called a gear, and sets up all of the required SELinux policies and cgroup configuration. OpenShift also sets up a private git repository for you and clones it to your local system. Finally, OpenShift propagates the DNS to the outside world.

The template code generated by OpenShift is not very interesting: it contains only a very simple Ruby-based HTTP server that listens on port 8080 and serves an index.html file. The testrubyserver.rb and index.html files are in the diy folder.

Pulling the code from Github

To get started with Cassandra quickly, I have created a quickstart application that we can use to install Cassandra on OpenShift. The code is on GitHub at https://github.com/shekhargulati/cassandra-openshift-quickstart. The quickstart downloads the latest Cassandra tar file, untars it, makes configuration changes, and finally starts the Cassandra database. I will talk about it in detail later in the post. Execute the git commands shown below to pull the quickstart code.

```
git remote add upstream https://github.com/shekhargulati/cassandra-openshift-quickstart.git
git pull -s recursive -X theirs upstream master
```

Pushing the code to OpenShift

Now that you have the Cassandra quickstart on your machine, push the code to OpenShift, which will perform all the steps required to install Cassandra.

```
git push
```

After you execute git push, please wait a minute while the push performs the steps required to install Cassandra on your DIY application. After git push succeeds, ssh into the application gear as shown below.

```
ssh f677086ae4b84936XXXXefrfrfr3f8e53f43eb56@cassandra-demo.rhcloud.com
```

Now if you run ps -ef | grep cassandra, you will find that the Cassandra Java process is running.

[Screenshot: View Cassandra Process]

Taking Cassandra to Test drive

Now that we are sure Cassandra is running on the OpenShift gear, let’s test it by creating a sample keyspace and column family. Then we will insert some data into the column family. Cassandra provides a command-line shell for CQL, called cqlsh, which we can use for testing. To run it, go to the cassandra/bin folder in $OPENSHIFT_DATA_DIR and run cqlsh as shown below.

```
cd app-root/data/cassandra/bin/
./cqlsh $OPENSHIFT_DIY_IP 19160 -2
```

You can also run the DESCRIBE schema command, which outputs the schema of the Cassandra system keyspace.

Let’s now create our keyspace and column family. Execute the commands shown below.

```
CREATE KEYSPACE MyKeyspace with strategy_class = 'org.apache.cassandra.locator.SimpleStrategy' AND strategy_options:replication_factor = 1;

use MyKeyspace;
CREATE TABLE users (
  user_name varchar PRIMARY KEY,
  password varchar,
  gender varchar,
  session_token varchar,
  state varchar,
  birth_year bigint
);

INSERT INTO MyKeyspace.users(user_name,password,gender,session_token,state,birth_year) VALUES ( 'shekhar','password','M','session','Haryana',1984);
```

You can also view the data using a SELECT query, such as SELECT * FROM users;

[Screenshot: Cassandra Select Statement]

Under the Hood

Now that we have Cassandra running on OpenShift, let’s take a look at how we achieved that. The changes we made are in three files in the .openshift/action_hooks folder inside your application directory. Let’s look at these files one by one.

deploy: The deploy hook is invoked after the application’s dependencies are resolved but before the application is started again. In this script we create a new cassandra directory under $OPENSHIFT_DATA_DIR, download the tar file, and create the directories Cassandra needs for application data, logs, and so on. Finally, we update some configuration files to make Cassandra work. The configuration files we change are cassandra.yaml, log4j-server.properties, and cassandra-env.sh. The changes relate to ports, using $OPENSHIFT_DIY_IP instead of localhost, memory settings, and so on. You can view the file on GitHub.
start: The start script encapsulates the logic required to start the application, so this is where we start the Cassandra database. You can view the start script on GitHub.
stop: The stop script encapsulates the logic to stop the application. Here we find the Cassandra process ID and kill the process. You can view the stop script on GitHub.
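
To make the configuration step of the deploy hook more concrete, here is a minimal sketch of how the address and port settings in cassandra.yaml might be rewritten. It is not the quickstart’s actual script: the function name configure_cassandra_yaml is hypothetical, and the sketch assumes the file contains the standard listen_address, rpc_address, and rpc_port keys. The real hook also touches log4j-server.properties and cassandra-env.sh, which are not covered here.

```shell
# Hypothetical helper, not the quickstart's actual deploy hook.
# Rewrites the listen/RPC address and the RPC port in a cassandra.yaml file.
#   $1: path to cassandra.yaml
#   $2: IP address to bind to (for example the value of $OPENSHIFT_DIY_IP)
#   $3: RPC port (for example 19160, the port used elsewhere in this post)
configure_cassandra_yaml() {
    yaml="$1"
    ip="$2"
    port="$3"
    tmp="${yaml}.tmp"
    # Write to a temporary file first, then replace the original, so the
    # function does not depend on a particular sed -i implementation.
    sed -e "s/^listen_address:.*/listen_address: ${ip}/" \
        -e "s/^rpc_address:.*/rpc_address: ${ip}/" \
        -e "s/^rpc_port:.*/rpc_port: ${port}/" \
        "$yaml" > "$tmp" && mv "$tmp" "$yaml"
}

# Possible use inside a deploy hook (the path below is an assumption):
# configure_cassandra_yaml "$OPENSHIFT_DATA_DIR/cassandra/conf/cassandra.yaml" "$OPENSHIFT_DIY_IP" 19160
```

Lines that do not start with one of the three keys are left untouched, so the rest of the configuration is preserved.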

Those are the only changes we had to make in order to run Cassandra on OpenShift.
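
For illustration, the stop logic described above (find the process ID, then kill the process) could look roughly like the sketch below. It assumes the start hook wrote the Cassandra process ID to a pid file; that file, its location, and the function name stop_from_pidfile are assumptions made for this example, and the quickstart’s own stop script may locate the process differently.

```shell
# Hypothetical helper, not the quickstart's actual stop hook.
# Stops the process whose PID is stored in the given pid file, if it is alive.
stop_from_pidfile() {
    pidfile="$1"
    # No pid file means nothing was started (or it was already stopped).
    [ -f "$pidfile" ] || return 0
    pid=$(cat "$pidfile")
    # kill -0 only checks that the process exists; it sends no signal.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        kill "$pid"    # default signal is SIGTERM
    fi
    rm -f "$pidfile"
}

# Possible use inside a stop hook (the path below is an assumption):
# stop_from_pidfile "$OPENSHIFT_DATA_DIR/cassandra/cassandra.pid"
```

Checking the process with kill -0 first keeps the hook from failing when the database has already exited.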

Part 2 : Using Cassandra as an Embedded Cartridge from within a Java application

So far we have seen how to install Cassandra on a DIY application. But it would make more sense to use Cassandra as an embedded cartridge from another application type supported by OpenShift, such as Java, PHP, Python, or Ruby. This way you don’t have to install any other server or runtime for your application. In this part, we will create a very simple Java application that will be deployed on Tomcat via the JBoss EWS cartridge.

Creating Tomcat Application

The first step is to create a Tomcat application called cassandrajavademo. Execute the command shown below to create the application.

```
rhc app create cassandrajavademo tomcat-6
```

Updating OpenShift Action Hook Scripts to install Cassandra

To install Cassandra we have to update three scripts: deploy, pre_start_jbossews, and pre_stop_jbossews. These files contain the same content as the deploy, start, and stop scripts from Part 1, so please update them accordingly. The only change is that we create the keyspace and column family in pre_start_jbossews, so please add the following line at the end.

```
bin/cassandra-cli -h $OPENSHIFT_DIY_IP -p 19160 -f $OPENSHIFT_REPO_DIR/cassandra-tutorial.txt
```
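
The contents of cassandra-tutorial.txt are not shown in this post. As a rough sketch only, the file needs to create the keyspace and column family that the Java code uses (tutorials and User). The shell function below writes one plausible version of that file; the function name write_schema_file is hypothetical, and the cassandra-cli statements are an assumption whose exact syntax can vary between Cassandra versions.

```shell
# Hypothetical helper: write a cassandra-cli script that creates the
# "tutorials" keyspace and the "User" column family used by the Java code.
# The statements are an assumption, not the post's actual file contents.
write_schema_file() {
    cat > "$1" <<'EOF'
create keyspace tutorials;
use tutorials;
create column family User with comparator = UTF8Type;
EOF
}

# Possible use (the path mirrors the -f argument of the command above):
# write_schema_file "$OPENSHIFT_REPO_DIR/cassandra-tutorial.txt"
```

In practice the file would more likely be committed to the repository than generated, which is why the cassandra-cli command reads it from $OPENSHIFT_REPO_DIR.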

Java code to interact with Cassandra

Finally, I have written a very simple Spring MVC application with a single controller that writes data into Cassandra. You can view the data by ssh’ing into the instance and running the cqlsh command-line utility. The controller is shown below.

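```
import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.InvalidRequestException;
import org.apache.cassandra.thrift.NotFoundException;
import org.apache.cassandra.thrift.TimedOutException;
import org.apache.cassandra.thrift.UnavailableException;
import org.apache.thrift.TException;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;

@Controller
public class CassandraController {

    @RequestMapping(value = "/cassandra", method = RequestMethod.GET)
    public String process() throws TException, InvalidRequestException,
            UnavailableException, UnsupportedEncodingException,
            NotFoundException, TimedOutException {

        // connect to the Cassandra Thrift port on the gear's internal IP
        String host = System.getenv("OPENSHIFT_INTERNAL_IP");
        int port = 19160;
        TTransport transport = new TFramedTransport(new TSocket(host, port));
        TProtocol protocol = new TBinaryProtocol(transport);
        Cassandra.Client client = new Cassandra.Client(protocol);
        transport.open();

        try {
            client.set_keyspace("tutorials");

            // define column parent (the column family to write to)
            ColumnParent parent = new ColumnParent("User");

            // define row id
            ByteBuffer rowid = ByteBuffer.wrap("100".getBytes("UTF-8"));

            // define consistency level
            ConsistencyLevel consistencyLevel = ConsistencyLevel.ONE;

            // define the first column and insert it
            Column username = new Column();
            username.setName("username".getBytes("UTF-8"));
            username.setValue("shekhargulati".getBytes("UTF-8"));
            username.setTimestamp(System.currentTimeMillis());
            client.insert(rowid, parent, username, consistencyLevel);

            // define the second column and insert it into the same row
            Column password = new Column();
            password.setName("password".getBytes("UTF-8"));
            password.setValue("password".getBytes("UTF-8"));
            password.setTimestamp(System.currentTimeMillis());
            client.insert(rowid, parent, password, consistencyLevel);

            transport.flush();
        } finally {
            // release resources even if an insert fails
            transport.close();
        }

        // name of the view to render
        return "hello";
    }

}
```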

Push code to OpenShift

Finally, push the code to OpenShift, which will install Cassandra, build a new WAR, and deploy it on Tomcat. Now, if you hit http://cassandrajavademo-cix.rhcloud.com/cassandra, a new row will be created in Cassandra and you will see “Hello from Cassandra”.

Source code of the application is available on my GitHub repository.

Conclusion

In this post, I showed you how easy it is to extend OpenShift by installing Cassandra on top of it. What are you waiting for? Try it now!

What’s Next?

Sign up for OpenShift Online
Interested in a private PaaS? Register for an evaluation of OpenShift Enterprise
Need Help? Post your questions in the forums
Follow us on Twitter
