Cassandra.Link

The best knowledge base on Apache Cassandra®

Helping platform leaders, architects, engineers, and operators build scalable, real-time data platforms.

6/19/2019

Reading time: 3 min

lekane/ansible-cassandra

by John Doe


Ansible provisioning and maintenance tasks for Cassandra. Can be used to install and manage upgrades for an Apache Cassandra or Datastax (DCE, or DSE + OpsCenter) based Cassandra cluster and Spark.

Usage:

  1. Create the servers for Cassandra and the other services (e.g. Datastax OpsCenter, Spark master)
  2. Define an Ansible inventory (see inventory/example.hosts) for your environment
  3. Run the playbook to install Cassandra + other services
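Steps 2 and 3 might translate into commands like the following sketch. It assumes the main playbook is cassandra.yml (referenced in the Running section below) and that your inventory was created from inventory/example.hosts; adjust both paths to your setup.

```shell
# Dry-run first to preview the changes the playbook would make
# (--check is Ansible's standard check mode)
ansible-playbook -i inventory/example.hosts cassandra.yml --check

# Install Cassandra and the other services on all hosts in the inventory
ansible-playbook -i inventory/example.hosts cassandra.yml
```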

Inventory configuration:

| Inventory group | Variable | Options | Default | Description |
| --- | --- | --- | --- | --- |
| cassandra_nodes | dc | DC1, DC2, ... | - | data center of node |
| cassandra_nodes | rack | RAC1, RAC2, ... | - | rack of node |
| cassandra_nodes | repair_weekday | MON, TUE, WED, THU, FRI, SAT, SUN | - | day(s) to run repair on node |
| cassandra_nodes | repair_start_hour | 00-23 | 03 | hour to start cron-based repair |
| cassandra_nodes | repair_start_minute | 00-59 | 0 | minute to start cron-based repair |
| cassandra_nodes | seed | true, false | - | is the node a seed |
| cassandra_nodes | node_ip | IP address | - | IP for internal cluster communications |
| cassandra_nodes | spark_enabled | true, false | false | enable Spark on node (DSE only) |
| cassandra_nodes | s3_backup_enabled | true, false | false | enable S3 backups |
| cassandra_nodes | s3_backup_environment | aws, riakcs | - | environment for S3 backups |
| cassandra_nodes | s3_backup_host | host | - | S3 host (for non-AWS environments) |
| cassandra_nodes | s3_backup_bucket | bucket | - | S3 bucket in which to store backups |
| cassandra_nodes | s3_backup_keyspaces | keyspace,keyspace,... | - | Cassandra keyspaces to back up (comma-separated) |
| cassandra_nodes | s3_backup_access_key | access_key | - | S3 access key |
| cassandra_nodes | s3_backup_secret_key | secret_key | - | S3 secret key |
| cassandra_nodes | local_jmx | yes, no | yes | JMX local only |
| cassandra_nodes | admin_jmx_remote_password | password | - | JMX password for admin (read-write) |
| cassandra_nodes | monitoring_jmx_remote_password | password | - | JMX password for monitoring (read-only) |
| opscenter_nodes | node_ip | IP address | - | IP for internal cluster communications |
| all_cassandra_nodes | data_disk_environment | ephemeral_raid, directory_symlink, create_data_directory | ephemeral_raid | data disk options |
| all_cassandra_nodes | data_disk_symlink | symlink name | - | name of symlink when using the "directory_symlink" data_disk_environment |
| all_cassandra_nodes | deployment_environment | aws, euca | - | environment for installation |
| all_cassandra_nodes | install_version | apache, dce, dse | - | Cassandra to install (apache = Apache Cassandra, dce = Datastax Community Edition, dse = Datastax Enterprise Edition) |
| all_cassandra_nodes | ignore_shutdown_errors | true, false | false | whether to ignore errors during graceful node shutdown |
| all_cassandra_nodes | dse_username | DSE username | - | DSE username (DSE install only) |
| all_cassandra_nodes | dse_password | DSE password | - | DSE password (DSE install only) |
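To make the variables concrete, here is a minimal, hypothetical inventory sketch using the variables above. Host names, IPs, and the group layout are illustrative assumptions; the real template is inventory/example.hosts in the repository.

```ini
; Hypothetical inventory sketch -- see inventory/example.hosts for the real template.
[cassandra_nodes]
cassandra-01 dc=DC1 rack=RAC1 seed=true  node_ip=10.0.0.11 repair_weekday=MON
cassandra-02 dc=DC1 rack=RAC2 seed=false node_ip=10.0.0.12 repair_weekday=TUE

[opscenter_nodes]
opscenter-01 node_ip=10.0.0.20

; Variables shared by all Cassandra nodes
[cassandra_nodes:vars]
data_disk_environment=ephemeral_raid
deployment_environment=aws
install_version=apache
```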

Requirements:

  • Ansible 2.0 or later
  • Nodes running Ubuntu 14.04 or later
  • Nodes must have git installed

Running:

  • See the comments in the main cassandra.yml for typical running options (e.g. new install, upgrade, cron/backup-only updates, etc.)

Data disk environment options: Deployment data options are controlled by the required "data_disk_environment" variable, which can be set for all nodes or on a per-node basis. The supported environments are:

  • ephemeral_raid: Creates a RAID-0 array from the local ephemeral drives. Also works with a single ephemeral drive. (default)
  • directory_symlink: Creates a symlink from /ephemeral to "data_disk_symlink".
  • create_data_directory: Creates /data directory on root device.

Spark setup: A typical way of setting up the environment is to define two Cassandra data centers: one for real-time transactions (plain Cassandra) and another for analytics workloads (Cassandra with co-located Spark nodes). You can also use the playbook without installing Spark.

Notes:

  • DCE to Apache Cassandra migration: As Datastax dropped support for DCE (3.0.9 is the last supported version), it is recommended that you migrate to an Apache Cassandra based setup (or run DSE). The migration path we took in our clusters was a round-robin DCE->Apache migration: graceful shutdown of the node, removal of DCE, then running the playbook with the default setup on the node (which installs and configures Apache Cassandra and keeps the old node data). You'll probably want to set ignore_shutdown_errors=true so that the playbook will run even though the old binaries have been removed and the service isn't running.
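One node of such a round-robin migration might look like the sketch below. The host name is hypothetical; --limit and -e are standard ansible-playbook flags for restricting the run to one host and overriding a variable, and the playbook/inventory paths are assumed to match your setup.

```shell
# Migrate a single node: DCE has already been removed, so tell the
# playbook to tolerate the failed graceful shutdown of the old service.
ansible-playbook -i inventory/example.hosts cassandra.yml \
  --limit cassandra-01 \
  -e ignore_shutdown_errors=true
```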
