May 4 · 6 min read
11/14/2019
Create Cassandra Cluster in Vagrant using Ansible
by John Doe
Pratheep Ananthan
Photo by Samson Creative on Unsplash

This post explains how to deploy a 3-node Cassandra cluster on your laptop with a single ‘vagrant up’ command.

Developers often struggle to set up tools and project environments on their laptops, especially on Windows. If the goal is only to check connectivity between a program and a third-party tool, it is time-consuming to learn that tool's configuration and set it up by hand. It is harder still when the tool is a database cluster. Automation like the following can be a good friend in such cases.

I prepared an Ansible role to automate Cassandra cluster deployment. The Vagrant setup and the Ansible role are explained below.

Required tools:
- VirtualBox
- Vagrant

Execution steps:
1. Install VirtualBox and Vagrant
2. git clone https://github.com/apkan/vagrant-cassandra-ansible.git
3. cd vagrant-cassandra-ansible
4. vagrant up

Output:
The Ansible playbook executes a set of tasks and finally displays the Cassandra cluster status.

Figure: The final result of the ‘vagrant up’ command

Vagrant

In Vagrant, the Ansible local provisioner is used for this automated deployment. It avoids the initial difficulties of installing and setting up Ansible; in particular, you do not need to worry about SSH key exchanges.

The Vagrantfile deploys 3 nodes and 1 controller, which serves as the Ansible controller node. When you issue the ‘vagrant up’ command, Ansible is installed automatically and the ‘cassandraCluster’ playbook is executed from the controller node.

    machine.vm.provider "virtualbox" do |vbox|
      vbox.memory = "1024"
      vbox.cpus = "1"
      vbox.customize ["modifyvm", :id, "--natdnshostresolver1", "on"]
      vbox.customize ["modifyvm", :id, "--natdnsproxy1", "on"]
    end

Hardware resources for each Vagrant node can be customized to match your available hardware.
‘natdnshostresolver1’ and ‘natdnsproxy1’ are enabled on all nodes. Both route the guests' DNS queries through the host, which is intended to improve networking performance.

    ---
    all:
      hosts:
        controller:
          ansible_connection: local
          ansible_ssh_host: x.x.x.x
        node1:
          ansible_ssh_host: x.x.x.x
          ansible_ssh_private_key_file: /vagrant/.vagrant/machines/node1/virtualbox/private_key
        node2:
          ansible_ssh_host: x.x.x.x
          ansible_ssh_private_key_file: /vagrant/.vagrant/machines/node2/virtualbox/private_key
        node3:
          ansible_ssh_host: x.x.x.x
          ansible_ssh_private_key_file: /vagrant/.vagrant/machines/node3/virtualbox/private_key
      children:
        cluster-nodes:
          hosts:
            node1:
            node2:
            node3:
        seed-nodes:
          hosts:
            node1:
            node2:

This inventory file plays a key role in the setup. It is the global configuration for host IPs, private keys and Cassandra seed nodes. Both Vagrant and the Ansible role fetch IP addresses from this inventory YAML file.

    config.vm.define "node1" do |machine|
      machine.vm.network "private_network", ip: inventory["all"]["hosts"]["node1"]["ansible_ssh_host"]
      ...
    end

Ansible role

    └── cassandra-cluster
        ├── handlers
        │   └── main.yml
        ├── tasks
        │   └── main.yml
        ├── templates
        │   ├── cassandra.sh
        │   └── cassandra.yaml.j2
        └── vars
            └── main.yml

This is a simple Ansible role to deploy a 3-node Cassandra cluster.
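To illustrate how the inventory file described above can act as the single source for host IPs and seed nodes, here is a minimal Ruby sketch (Ruby because the Vagrantfile is Ruby). It is not the code from the repository: the 192.168.56.x addresses, the inline YAML string standing in for the inventory file, and the helpers `host_ip` and `group_ips` are all assumptions made for illustration.

```ruby
require "yaml"

# Inline stand-in for the inventory file. The 192.168.56.x addresses are
# made-up placeholders; the post itself only shows x.x.x.x.
INVENTORY_YAML = <<~YAML
  all:
    hosts:
      controller:
        ansible_connection: local
        ansible_ssh_host: 192.168.56.10
      node1:
        ansible_ssh_host: 192.168.56.11
      node2:
        ansible_ssh_host: 192.168.56.12
      node3:
        ansible_ssh_host: 192.168.56.13
    children:
      cluster-nodes:
        hosts:
          node1:
          node2:
          node3:
      seed-nodes:
        hosts:
          node1:
          node2:
YAML

INVENTORY = YAML.safe_load(INVENTORY_YAML)

# Hypothetical helper: IP of one host, the same lookup the Vagrantfile
# snippet performs for its "private_network" setting.
def host_ip(inventory, name)
  inventory["all"]["hosts"][name]["ansible_ssh_host"]
end

# Hypothetical helper: IPs of every host listed in a child group.
def group_ips(inventory, group)
  members = inventory["all"]["children"][group]["hosts"].keys
  members.map { |name| host_ip(inventory, name) }
end

NODE_IPS = group_ips(INVENTORY, "cluster-nodes")
# Comma-separated seed IPs, built from the seed-nodes group.
SEED_LIST = group_ips(INVENTORY, "seed-nodes").join(",")

puts "node IPs: #{NODE_IPS.join(' ')}"
puts "seeds:    #{SEED_LIST}"
```

Keeping the addresses in one file means a change to a node's IP only has to be made once; everything that needs an IP or the seed list derives it from the same data.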
The following tasks are used in the role:

    - name: Check if java 8 already installed
      # The command module does not run a shell, so pipes and redirects
      # are not interpreted; only the exit status is used below.
      command: java -version
      register: java_version
      ignore_errors: True

    - debug: msg="Java not installed"
      when: java_version is failed

    - name: installJavaRepo
      apt_repository: repo='ppa:openjdk-r/ppa'
      when: java_version is failed

    - name: updateCache
      apt: update_cache=yes
      when: java_version is failed

    - name: installJava
      apt: name=openjdk-8-jdk state=present
      when: java_version is failed

    - stat:
        path: /opt/apache-cassandra-3.7-bin.tar.gz
      register: cassandra_source_file

    - name: download the source
      get_url:
        url: https://archive.apache.org/dist/cassandra/3.7/apache-cassandra-3.7-bin.tar.gz
        dest: /opt/apache-cassandra-3.7-bin.tar.gz
        mode: 0440
      when: cassandra_source_file.stat.exists == False

    - stat:
        path: /opt/apache-cassandra-3.7-bin.tar.gz
        checksum_algorithm: sha1
      register: downloaded_file_checksum

    - name: download checksum from apache page
      uri: url=https://archive.apache.org/dist/cassandra/3.7/apache-cassandra-3.7-bin.tar.gz.sha1 return_content=yes
      register: apache_page_checksum
      failed_when: downloaded_file_checksum.stat.checksum != apache_page_checksum.content

    - name: Create home directory
      file:
        path: "{{ cassandra_home }}"
        state: directory
        mode: 0755

    - name: Extract cassandra source
      unarchive:
        src: /opt/apache-cassandra-3.7-bin.tar.gz
        dest: "{{ cassandra_home }}"
        extra_opts: [--strip-components=1]
        remote_src: yes

    - name: Export path
      shell: "echo 'PATH=$PATH:{{ cassandra_path }}' > /etc/profile.d/custom-path.sh && . /etc/profile.d/custom-path.sh"

    - name: install the configuration file
      template: src=cassandra.yaml.j2 dest='{{ cassandra_home }}/conf/cassandra.yaml' mode=750

    - name: copy the startup script
      template: src=cassandra.sh dest=/etc/init.d/cassandra owner=root group=root mode=755

    - name: Enable the daemon
      shell: update-rc.d cassandra "{{ item }}"
      with_items:
        - "defaults"
        - "enable"

    - name: start cassandra service
      service: name=cassandra state=started
      notify:
        - CheckClusterStatus
        - GetClusterStatusOutput
        - DisplayClusterStatusOutput

Most of the important parameters in cassandra.yaml are configured through the Ansible template. The role uses the following variables:

    cluster_name: MyCluster

    #Directory locations
    cassandra_home: /opt/cassandra
    cassandra_path: /opt/cassandra/bin
    cassandra_data_directory: /opt/cassandra/data
    cassandra_hints_directory: /opt/cassandra/data/hints
    cassandra_commitlog_directory: /opt/cassandra/data/commitlogs
    cassandra_saved_caches_directory: /opt/cassandra/data/saved_caches
    cassandra_seeds_resolved: "{{ groups['seed-nodes'] | map('extract', hostvars, ['ansible_ssh_host']) | join(',') }}"

    #For IP Address configuration in YML
    broadcast_address: "{{ hostvars[inventory_hostname]['ansible_ssh_host'] }}"
    listen_address: "{{ hostvars[inventory_hostname]['ansible_ssh_host'] }}"
    broadcast_rpc_address: "{{ hostvars[inventory_hostname]['ansible_ssh_host'] }}"

    #Port configurations
    cassandra_port: 9042
    rpc_port: 9160
    storage_port: 7000
    ssl_storage_port: 7001

It was an interesting challenge to configure the list of Cassandra seed IPs. node1 and node2 are listed as seed nodes in the inventory file. Their IPs are passed to the cassandra.yaml file through the ‘cassandra_seeds_resolved’ variable.

Best practice tips:
- Use variables as much as possible and avoid hard-coding values in Ansible tasks.
- Prepare Ansible roles with idempotency in mind. This ensures that your environment stays consistent.
  Also, this helps you reuse the same environment to test changes to your Ansible roles.
- I used an Ansible task to check whether Java is already installed. This sort of verification helps if we want to run the playbook in some other environment.
- Display the cluster status as the last task, so you can easily see whether the playbook run created the cluster successfully.

Connectivity Testing

In addition, I prepared a simple Java program to check connectivity. You just need to pass the IP and port of one of the nodes (as listed in the inventory file); the driver then fetches the other nodes' IPs and displays the cluster details.

Figure: Connectivity test

Destroying the setup

Just like deploying this setup, destroying it is super easy. At any time, you can destroy the whole setup with the ‘vagrant destroy’ command.

Conclusion

I hope this post gives an idea of how deployment in a local environment can be automated using tools like Vagrant and Ansible.

Please note that this can be faster and more efficient if you use a cloud environment instead of Vagrant. (I will write about that in another post, “Deployment using Terraform and Ansible”.)

References

- Full source code: https://github.com/apkan/vagrant-cassandra-ansible.git
- https://www.vagrantup.com/docs/provisioning/ansible_local.html
- https://galaxy.ansible.com/andrewrothstein/cassandra-cluster
- https://docs.datastax.com/en/developer/java-driver/3.6/manual/