11/14/2019

wireapp/ansible-cassandra

by John Doe

ansible-cassandra

Ansible role to install an Apache Cassandra cluster supervised by systemd. Includes the following:

  • Some OS tuning options such as installing jemalloc, setting max_map_count and tcp_keepalive, disabling swap.
  • Bootstraps nodes using the IPs of the servers in the cassandra_seed (configurable) inventory group.
  • Weekly scheduled repairs via cron jobs that are non-overlapping (see cassandra_repair_slots).
    • Requires setting cassandra_keyspaces (the default, [], will have no effect).
  • Incremental and full backup scripts as well as a restore script (optional, disabled by default; NOTE: needs better testing).
    • Backup/restore requires access to S3.
  • Prometheus-style metrics using jmx-exporter.
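
To illustrate what "non-overlapping" weekly repairs could look like, here is a minimal sketch that is not the role's actual implementation (the role drives this through cassandra_repair_slots, whose format is not shown here). It assumes a cassandra inventory group, a hypothetical keyspace my_keyspace1, a cassandra system user, and that each repair finishes within a day:

```yaml
# repair-sketch.yml (hypothetical file, for illustration only)
- hosts: cassandra
  become: true
  tasks:
    - name: Schedule a weekly primary-range repair on a per-host weekday
      cron:
        name: "weekly repair of my_keyspace1"
        user: cassandra              # assumption: Cassandra runs as this user
        # Each host gets a weekday derived from its position in the group,
        # so hosts 0..6 repair on different days of the week.
        weekday: "{{ groups['cassandra'].index(inventory_hostname) % 7 }}"
        hour: "2"
        minute: "0"
        job: "nodetool repair -pr my_keyspace1"
```

With more than seven hosts this simple modulo scheme would reuse weekdays, so a real setup would also need to stagger the hour.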

Status: beta, see TODOs

Ansible Requirements

  • ansible >= 2.4 (>= 2.7.9 recommended)

Role Variables

Give your cluster a better name:

# set cassandra_cluster_name before running the playbook for the first time; never change it afterwards
cassandra_cluster_name: default

Override cassandra_keyspaces with the keyspaces on which you wish to run weekly repairs:

cassandra_keyspaces: []

You may wish to override the following defaults to enable backups:

# backups
cassandra_backup_enabled: false # recommended to enable this
cassandra_backup_s3_bucket: # set a name here and ensure hosts have access rights to an S3 bucket
cassandra_env: dev # used in naming backups in case you have more than one environment (e.g. production, staging, ...)
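
As a sketch of how these overrides might look once backups are enabled, assuming the variables are kept in a group_vars file (the file location and the bucket name below are hypothetical):

```yaml
# group_vars/cassandra.yml (hypothetical location)
cassandra_backup_enabled: true
# Hypothetical bucket name; the hosts still need access rights to this S3 bucket.
cassandra_backup_s3_bucket: my-cassandra-backups
# Distinguishes backups of this environment from those of other environments.
cassandra_env: staging
```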

For a list of all variables, see defaults/main.yml.

Dependencies

The following should be installed before installing this role:

  • java (OpenJDK or Oracle, see "A note on openjdk vs oracle")
  • ntp

For the above dependencies, you can use the same roles as in molecule/default/requirements.yml, but you don't have to.
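
If you prefer not to use separate roles, one possible way to satisfy the dependencies is a small play that installs distribution packages. This is only a sketch: the file name is hypothetical, and the package names assume Ubuntu 16.04 with OpenJDK 8 from the standard archive.

```yaml
# prerequisites.yml (hypothetical file name)
- hosts: cassandra
  become: true
  tasks:
    - name: Install OpenJDK 8 and ntp from the distribution archive
      apt:
        name:
          - openjdk-8-jre-headless   # assumption: Ubuntu 16.04 package name
          - ntp
        state: present
        update_cache: true
```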

Platforms

  • Currently tested with Ubuntu 16.04 only

Example Playbook

Assuming an inventory with 5 nodes on which you wish to install Cassandra, two of them seed nodes:

# hosts.ini
[all]
host01 ansible_host=<some IP>
host02 ansible_host=<some IP>
host03 ansible_host=<some IP>
host04 ansible_host=<some IP>
host05 ansible_host=<some IP>
[cassandra]
host01
host02
host03
host04
host05
# cassandra_seed group will be used to configure seed bootstrapping
# recommended is 2 seed nodes per datacenter
[cassandra_seed]
host01
host02
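
If you keep your inventory in YAML rather than INI, an equivalent layout might look like the following sketch. The file name is hypothetical, and the addresses are placeholders from the documentation range that you would replace with your own:

```yaml
# hosts.yml (hypothetical YAML equivalent of hosts.ini)
all:
  hosts:
    host01: { ansible_host: 192.0.2.11 }   # placeholder address
    host02: { ansible_host: 192.0.2.12 }
    host03: { ansible_host: 192.0.2.13 }
    host04: { ansible_host: 192.0.2.14 }
    host05: { ansible_host: 192.0.2.15 }
  children:
    cassandra:
      hosts:
        host01:
        host02:
        host03:
        host04:
        host05:
    # cassandra_seed group is used to configure seed bootstrapping
    cassandra_seed:
      hosts:
        host01:
        host02:
```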

Then the following should work and start your cluster:

# playbook.yml
- hosts: cassandra
  vars:
    # set cassandra_cluster_name before running the playbook for the first time; never change it afterwards
    cassandra_cluster_name: my_cluster
    cassandra_keyspaces:
      - my_keyspace1
  roles:
    # make sure java and ntp are installed first, e.g. by running these roles (see Dependencies section):
    # - ansible-ntp
    # - ansible-java
    - ansible-cassandra

If you don't wish to configure Cassandra seed nodes via the inventory group named by cassandra_seed_groupname (default: cassandra_seed), you can configure them statically:

  vars:
    cassandra_seed_resolution: static
    cassandra_seeds:
      - 1.2.3.4
      - ...
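
Put together with the earlier example, a complete playbook using static seeds might look like the sketch below. The cluster name, keyspace, and seed addresses are illustrative placeholders, not values from this repository:

```yaml
# playbook.yml
- hosts: cassandra
  vars:
    cassandra_cluster_name: my_cluster
    cassandra_keyspaces:
      - my_keyspace1
    # Use a fixed seed list instead of the cassandra_seed inventory group.
    cassandra_seed_resolution: static
    cassandra_seeds:
      - 192.0.2.11   # placeholder seed address
      - 192.0.2.12   # placeholder seed address
  roles:
    - ansible-cassandra
```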

License

AGPL. See LICENSE

A note on openjdk vs oracle:

As of November 2018, the Cassandra homepage lists both OpenJDK and Oracle Java as supported (and offers their download links).

In the official upgrade-to-DSE docs one can find:

Important: Although Oracle JRE/JDK 8 is supported, DataStax does more extensive testing on OpenJDK 8 starting with DSE 6.0.3. This change is due to the end of public updates for Oracle JRE/JDK 8.

It seems OpenJDK is the more future-proof JVM to use. This role is tested using OpenJDK.

Development setup

Install molecule. For example, ensure you have docker installed, then, inside a virtualenv, run pip install molecule ansible docker.

  • molecule converge runs the playbook against docker containers. If something fails, molecule --debug converge shows error details.
  • molecule lint and molecule syntax give feedback on your YAML changes.
  • molecule test runs destroy + converge + converge again for idempotence + destroy.

If you want 'molecule converge' to be run each time you save a file in this repository, install entr, then run 'make'.

  • troubleshooting: this issue has been observed with molecule, ansible 2.7 and docker. Workaround was to downgrade to ansible 2.5.

Credits

This role has been inspired by:

  • an internal role used at Wire, initially targeting older OSes and older Cassandra versions.
  • this cassandra role and its dependent roles (insufficient for our needs).

TODO

  • WARN: JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
  • test backups and restore
  • document usage of prometheus .prom files and node-exporter
  • check whether https://github.com/thelastpickle/cassandra-reaper could replace the cron jobs as a repair mechanism
