Cassandra.Link

10/24/2018

voxpupuli/puppet-cassandra

by John Doe

Table of Contents

  1. Overview
  2. Setup - The basics of getting started with Cassandra
  3. Usage - Configuration options and additional functionality
  4. Reference
  5. Limitations - OS compatibility, etc.
  6. Development

Overview

A Puppet module to install and manage Cassandra, DataStax Agent & OpsCenter

Setup

What Cassandra affects

What the Cassandra class affects

  • Installs the Cassandra package (default cassandra22 on Red Hat and cassandra on Debian).
  • Configures settings in ${config_path}/cassandra.yaml.
  • On CentOS 7, if the init service provider is used, cassandra is added as a system service.
  • Optionally ensures that the Cassandra service is enabled and running.
  • On Debian systems:
    • Optionally replaces /etc/init.d/cassandra with a workaround for CASSANDRA-9822.

What the cassandra::datastax_agent class affects

  • Optionally installs the DataStax agent.
  • Optionally sets JAVA_HOME in /etc/default/datastax-agent.

What the cassandra::datastax_repo class affects

  • Optionally configures a Yum repository to install the Cassandra packages from (on Red Hat).
  • Optionally configures an Apt repository to install the Cassandra packages from (on Debian).

What the cassandra::dse class affects

  • Optionally configures files in the /etc/dse directory if one is using DataStax Enterprise.

What the cassandra::firewall_ports class affects

  • Optionally configures the firewall for the Cassandra related network ports.

What the cassandra::java class affects

  • Optionally installs a JRE/JDK package (e.g. java-1.7.0-openjdk) and the Java Native Access (JNA).

What the cassandra::optutils class affects

  • Optionally installs the Cassandra support tools (e.g. cassandra22-tools).

Upgrading

We follow SemVer versioning: an update of the major release (e.g. from 1.Y.Z to 2.Y.Z) indicates a significant change to the API which will most probably require a change to your manifest.

Changes in 2.0.0

This is a major change to the API and you will more than likely need to change your manifest to accommodate these changes.

The service_ensure attribute of the cassandra class now defaults to undef; users who do want to manage service status in Puppet can still set it to true. Leaving the value at the default and setting service_refresh and service_enable to false means that the user, and not Puppet, controls the running state of Cassandra. This currently works OK on the Red Hat family, but has issues on Debian due to CASSANDRA-2356 during an initial install or package upgrade.
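
A minimal sketch of the hands-off arrangement described above. It assumes that the remaining attributes (such as the settings hash) are supplied elsewhere, for example through Hiera, and only shows the service-related attributes named in this section:

```puppet
# Leave the running state of Cassandra to the operator.
# service_ensure is deliberately not set, so it keeps its default of undef.
class { 'cassandra':
  service_enable  => false,
  service_refresh => false,
}
```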

All the functionality relating to OpsCenter has been divested to the locp/opscenter module on Puppet Forge.

It should also be noted that the module no longer creates the data, commitlog and saved_caches directories or, for Cassandra 3, the hints directory. These resources now need to be defined in your manifest/profile.
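
As an illustration only, the directories could be declared along these lines. The paths are the ones used in the examples later in this document; the owner, group and mode are assumptions and should be adjusted to match your installation. The sketch also assumes that the parent directory /var/lib/cassandra and the cassandra user already exist (typically created by the package), and it leaves out any ordering against the package and service.

```puppet
# Directories that the module no longer manages itself.
# The hints directory is only needed for Cassandra 3.
$cassandra_dirs = [
  '/var/lib/cassandra/commitlog',
  '/var/lib/cassandra/data',
  '/var/lib/cassandra/hints',
  '/var/lib/cassandra/saved_caches',
]

# Owner, group and mode are assumed values, not taken from the module.
file { $cassandra_dirs:
  ensure => directory,
  owner  => 'cassandra',
  group  => 'cassandra',
  mode   => '0750',
}
```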

For a list of features that have been deprecated in this release, please see https://github.com/voxpupuli/puppet-cassandra/wiki/Deprecations

For details on migrating from the version 1.X.X attributes to the settings hash, see https://github.com/voxpupuli/puppet-cassandra/wiki/Suggested-Baseline-Settings

Please also see the notes for 2.0.0 in the CHANGELOG.

Changes in 1.19.0

The hints_directory documentation will cause a change in the cassandra.yaml file regardless of the value you set it to. If you do not wish this to result in a refresh of the Cassandra service, please set service_refresh to false.

Changes in 1.9.2

Now that Cassandra 3 is available from the DataStax repositories, there is a problem (especially on Debian) with the operating system package manager attempting to install Cassandra 3. This can be mitigated by pinning the package version, using something similar to the following code from this module's acceptance tests. Please note that the default Cassandra package name has now been changed from 'dsc'. See the documentation for cassandra::package_name for details.

 if $::osfamily == 'RedHat' {
   $version = '2.2.4-1'
 } else {
   $version = '2.2.4'
 }
 class { 'cassandra':
   package_ensure => $version,
 }

Changes in 1.8.0

A somewhat embarrassing correction to the spelling of cassandra::fail_on_non_suppoted_os, which is now cassandra::fail_on_non_supported_os.

Issues when Upgrading to 1.4.0

Unfortunately, both releases 1.3.7 and 1.4.0 have subsequently been found to trigger a service refresh even when no changes had been made to the underlying configuration. In release 1.8.0 (somewhat belatedly) the service_refresh flag was introduced to mitigate similar problems.

Issues When Upgrading to 1.3.7

  • Please see the notes for 1.4.0.

Changes in 1.0.0

  • cassandra::cassandra_package_ensure has been renamed to cassandra::package_ensure.
  • cassandra::cassandra_package_name has been renamed to cassandra::package_name.

Changes in 0.4.0

There is now a cassandra::datastax_agent class, therefore:

  • cassandra::datastax_agent_package_ensure has now been replaced with cassandra::datastax_agent::package_ensure.
  • cassandra::datastax_agent_service_enable has now been replaced with cassandra::datastax_agent::service_enable.
  • cassandra::datastax_agent_service_ensure has now been replaced with cassandra::datastax_agent::service_ensure.
  • cassandra::datastax_agent_package_name has now been replaced with cassandra::datastax_agent::package_name.
  • cassandra::datastax_agent_service_name has now been replaced with cassandra::datastax_agent::service_name.

Likewise now there is a new class for handling the installation of Java:

  • cassandra::java_package_ensure has now been replaced with cassandra::java::ensure.
  • cassandra::java_package_name has now been replaced with cassandra::java::package_name.

Also there is now a class for installing the optional utilities:

  • cassandra::cassandra_opt_package_ensure has now been replaced with cassandra::optutils::ensure.
  • cassandra::cassandra_opt_package_name has now been replaced with cassandra::optutils::package_name.
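
To illustrate the renamed attributes, a sketch of how the Java and optional utilities classes might be declared with them. The package names are the Red Hat examples mentioned under Setup and are assumptions here; use the names that apply to your platform and Cassandra version.

```puppet
# Formerly cassandra::java_package_ensure and cassandra::java_package_name.
class { 'cassandra::java':
  ensure       => present,
  package_name => 'java-1.7.0-openjdk',
}

# Formerly cassandra::cassandra_opt_package_ensure and
# cassandra::cassandra_opt_package_name.
class { 'cassandra::optutils':
  ensure       => present,
  package_name => 'cassandra22-tools',
}
```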

Changes in 0.3.0

  • cassandra_opt_package_ensure changed from 'present' to undef.

  • The manage_service option has been replaced with service_enable and service_ensure.

Beginning with Cassandra

Create a Cassandra 2.X cluster called MyCassandraCluster which uses the GossipingPropertyFileSnitch and password authentication. In this very basic example the node itself becomes a seed for the cluster and the credentials default to a user called cassandra with a password of cassandra.

# Cassandra pre-requisites
include cassandra::datastax_repo
include cassandra::java
class { 'cassandra':
  settings => {
    'authenticator'               => 'PasswordAuthenticator',
    'cluster_name'                => 'MyCassandraCluster',
    'commitlog_directory'         => '/var/lib/cassandra/commitlog',
    'commitlog_sync'              => 'periodic',
    'commitlog_sync_period_in_ms' => 10000,
    'data_file_directories'       => ['/var/lib/cassandra/data'],
    'endpoint_snitch'             => 'GossipingPropertyFileSnitch',
    'listen_address'              => $::ipaddress,
    'partitioner'                 => 'org.apache.cassandra.dht.Murmur3Partitioner',
    'saved_caches_directory'      => '/var/lib/cassandra/saved_caches',
    'seed_provider'               => [
      {
        'class_name' => 'org.apache.cassandra.locator.SimpleSeedProvider',
        'parameters' => [
          {
            'seeds' => $::ipaddress,
          },
        ],
      },
    ],
    'start_native_transport'      => true,
  },
  require  => Class['cassandra::datastax_repo', 'cassandra::java'],
}

However, PLEASE note that this is the ABSOLUTE MINIMUM configuration to get Cassandra up and running but will probably give you a rather badly configured node. Please see Suggested Baseline Settings for details on making your configuration a lot more robust.

For this code to run with version 3.X of Cassandra, the hints_directory will also need to be specified:

...
class { 'cassandra':
  settings => {
    ...
    'hints_directory'             => '/var/lib/cassandra/hints',
    ...
  },
  require  => Class['cassandra::datastax_repo', 'cassandra::java'],
}

Hiera

In your top level node classification (usually common.yaml), add the settings hash and all the tweaks you want all the clusters to use:

cassandra::baseline_settings:
  authenticator: AllowAllAuthenticator
  authorizer: AllowAllAuthorizer
  auto_bootstrap: true
  auto_snapshot: true
  ...

Then, in the individual node classification add the parts which define the cluster:

cassandra::settings:
  cluster_name: developer playground cassandra cluster
cassandra::dc: Onsite1
cassandra::rack: RAC1
cassandra::package_ensure: 3.0.5-1
cassandra::package_name: cassandra30
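
With data like the above in Hiera, a plain `include cassandra` picks the values up through Puppet's automatic class parameter lookup. For comparison, the sketch below shows roughly the same values passed as a resource-like class declaration. Only the four baseline settings that are spelled out above are reproduced; the elided ones are left out.

```puppet
class { 'cassandra':
  baseline_settings => {
    'authenticator'  => 'AllowAllAuthenticator',
    'authorizer'     => 'AllowAllAuthorizer',
    'auto_bootstrap' => true,
    'auto_snapshot'  => true,
  },
  settings          => {
    'cluster_name' => 'developer playground cassandra cluster',
  },
  dc                => 'Onsite1',
  rack              => 'RAC1',
  package_ensure    => '3.0.5-1',
  package_name      => 'cassandra30',
}
```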

Usage

Setup a keyspace and users

We assume that authentication has been enabled for the Cassandra cluster and we are connecting with the default user name and password ('cassandra/cassandra').

In this example, we create a keyspace (mykeyspace) with a table called 'users' and an index called 'users_lname_idx'.

We also add four users (to Cassandra, not the mykeyspace.users table) called spillman, akers, boone and gbennet while ensuring that a user called lucan is absent.

class { 'cassandra':
  ...
}
class { 'cassandra::schema':
  cqlsh_password => 'cassandra',
  cqlsh_user     => 'cassandra',
  cqlsh_host     => $::ipaddress,
  indexes        => {
    'users_lname_idx' => {
      table    => 'users',
      keys     => 'lname',
      keyspace => 'mykeyspace',
    },
  },
  keyspaces      => {
    'mykeyspace' => {
      durable_writes  => false,
      replication_map => {
        keyspace_class     => 'SimpleStrategy',
        replication_factor => 1,
      },
    }
  },
  permissions    => {
    'Grant select permissions to spillman to all keyspaces' => {
      permission_name => 'SELECT',
      user_name       => 'spillman',
    },
    'Grant modify permissions to mykeyspace to akers'       => {
      keyspace_name   => 'mykeyspace',
      permission_name => 'MODIFY',
      user_name       => 'akers',
    },
    'Grant alter permissions to mykeyspace to boone'        => {
      keyspace_name   => 'mykeyspace',
      permission_name => 'ALTER',
      user_name       => 'boone',
    },
    'Grant ALTER permission on mykeyspace.users to gbennet' => {
      keyspace_name   => 'mykeyspace',
      permission_name => 'ALTER',
      table_name      => 'users',
      user_name       => 'gbennet',
    },
  },
  tables         => {
    'users' => {
      columns  => {
        user_id       => 'int',
        fname         => 'text',
        lname         => 'text',
        'PRIMARY KEY' => '(user_id)',
      },
      keyspace => 'mykeyspace',
    },
  },
  users          => {
    'spillman' => {
      password => 'Niner27',
    },
    'akers'    => {
      password  => 'Niner2',
      superuser => true,
    },
    'boone'    => {
      password => 'Niner75',
    },
    'gbennet'  => {
      'password' => 'foobar',
    },
    'lucan'    => {
      'ensure' => absent
    },
  },
}

Create a Cluster in a Single Data Center

The DataStax documentation "Initializing a multiple node cluster (single data center)" (http://docs.datastax.com/en/cassandra/2.2/cassandra/initialize/initSingleDS.html) gives a basic example of a six node cluster with two seeds, created in a single data center spanning two racks. The nodes in the cluster are:

Node Name        IP Address
node0 (seed 1)   110.82.155.0
node1            110.82.155.1
node2            110.82.155.2
node3 (seed 2)   110.82.156.3
node4            110.82.156.4
node5            110.82.156.5

Each node is configured to use the GossipingPropertyFileSnitch and 256 virtual nodes (vnodes). The name of the cluster is MyCassandraCluster. Also, while building the initial cluster, we are setting the auto_bootstrap to false.

In this initial example, we extend the DataStax example by ensuring that:

  • The software is installed via the DataStax Community repository, by including cassandra::datastax_repo. This needs to be executed before the Cassandra package is installed.
  • A suitable Java Runtime Environment (JRE) is installed with Java Native Access (JNA), by including cassandra::java. This needs to be executed before the Cassandra service is started.

node /^node\d+$/ {
  class { 'cassandra::datastax_repo':
    before => Class['cassandra']
  }
  class { 'cassandra::java':
    before => Class['cassandra']
  }
  class { 'cassandra':
    settings       => {
      'authenticator'               => 'AllowAllAuthenticator',
      'auto_bootstrap'              => false,
      'cluster_name'                => 'MyCassandraCluster',
      'commitlog_directory'         => '/var/lib/cassandra/commitlog',
      'commitlog_sync'              => 'periodic',
      'commitlog_sync_period_in_ms' => 10000,
      'data_file_directories'       => ['/var/lib/cassandra/data'],
      'endpoint_snitch'             => 'GossipingPropertyFileSnitch',
      'hints_directory'             => '/var/lib/cassandra/hints',
      'listen_interface'            => 'eth1',
      'num_tokens'                  => 256,
      'partitioner'                 => 'org.apache.cassandra.dht.Murmur3Partitioner',
      'saved_caches_directory'      => '/var/lib/cassandra/saved_caches',
      'seed_provider'               => [
        {
          'class_name' => 'org.apache.cassandra.locator.SimpleSeedProvider',
          'parameters' => [
            {
              'seeds' => '110.82.155.0,110.82.156.3',
            },
          ],
        },
      ],
      'start_native_transport'      => true,
    },
  }
}

The default value for num_tokens is already 256, but it is included in the example for clarity. Do not forget to either set auto_bootstrap to true or remove the attribute altogether after initializing the cluster.
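
One way to make that reminder harder to miss is to drive auto_bootstrap from a single flag. The following is only a sketch: $initial_build and the variable names are hypothetical, the settings hash is shortened to a few of the keys from the example above, and the hash merge with + requires Puppet 4 or later.

```puppet
# Hypothetical flag: set to true only while the cluster is first being built.
$initial_build = false

# A few of the settings from the example above, to keep the sketch short.
$common_settings = {
  'cluster_name'    => 'MyCassandraCluster',
  'endpoint_snitch' => 'GossipingPropertyFileSnitch',
  'num_tokens'      => 256,
}

# During the initial build add auto_bootstrap => false,
# otherwise leave the key unset.
$bootstrap_settings = $initial_build ? {
  true    => { 'auto_bootstrap' => false },
  default => {},
}

class { 'cassandra':
  settings => $common_settings + $bootstrap_settings,
}
```

Once the cluster is up, flipping $initial_build back to false removes the key from the settings hash, which corresponds to the "not set at all" option.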

Create a Cluster in Multiple Data Centers

To continue with the examples provided by DataStax, we look at the example for a cluster spanning multiple data centers: http://docs.datastax.com/en/cassandra/2.2/cassandra/initialize/initMultipleDS.html.

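| Node Name      | IP Address    | Data Center | Rack |
|----------------|---------------|-------------|------|
| node0 (seed 1) | 10.168.66.41  | DC1         | RAC1 |
| node1          | 10.176.43.66  | DC1         | RAC1 |
| node2          | 10.168.247.41 | DC1         | RAC1 |
| node3 (seed 2) | 10.176.170.59 | DC2         | RAC1 |
| node4          | 10.169.61.170 | DC2         | RAC1 |
| node5          | 10.169.30.138 | DC2         | RAC1 |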

For the sake of simplicity, we will confine this example to the nodes:

node /^node[012]$/ {
  class { 'cassandra':
    dc             => 'DC1',
    settings       => {
      'authenticator'               => 'AllowAllAuthenticator',
      'auto_bootstrap'              => false,
      'cluster_name'                => 'MyCassandraCluster',
      'commitlog_directory'         => '/var/lib/cassandra/commitlog',
      'commitlog_sync'              => 'periodic',
      'commitlog_sync_period_in_ms' => 10000,
      'data_file_directories'       => ['/var/lib/cassandra/data'],
      'endpoint_snitch'             => 'GossipingPropertyFileSnitch',
      'hints_directory'             => '/var/lib/cassandra/hints',
      'listen_interface'            => 'eth1',
      'num_tokens'                  => 256,
      'partitioner'                 => 'org.apache.cassandra.dht.Murmur3Partitioner',
      'saved_caches_directory'      => '/var/lib/cassandra/saved_caches',
      'seed_provider'               => [
        {
          'class_name' => 'org.apache.cassandra.locator.SimpleSeedProvider',
          'parameters' => [
            {
              'seeds' => '10.168.66.41,10.176.170.59',
            },
          ],
        },
      ],
      'start_native_transport'      => true,
    },
  }
}
node /^node[345]$/ {
  class { 'cassandra':
    dc             => 'DC2',
    settings       => {
      'authenticator'               => 'AllowAllAuthenticator',
      'auto_bootstrap'              => false,
      'cluster_name'                => 'MyCassandraCluster',
      'commitlog_directory'         => '/var/lib/cassandra/commitlog',
      'commitlog_sync'              => 'periodic',
      'commitlog_sync_period_in_ms' => 10000,
      'data_file_directories'       => ['/var/lib/cassandra/data'],
      'endpoint_snitch'             => 'GossipingPropertyFileSnitch',
      'hints_directory'             => '/var/lib/cassandra/hints',
      'listen_interface'            => 'eth1',
      'num_tokens'                  => 256,
      'partitioner'                 => 'org.apache.cassandra.dht.Murmur3Partitioner',
      'saved_caches_directory'      => '/var/lib/cassandra/saved_caches',
      'seed_provider'               => [
        {
          'class_name' => 'org.apache.cassandra.locator.SimpleSeedProvider',
          'parameters' => [
            {
              'seeds' => '10.168.66.41,10.176.170.59',
            },
          ],
        },
      ],
      'start_native_transport'      => true,
    },
  }
}

We don't need to specify the rack name (with the rack attribute) as RAC1 is the default value. Again, do not forget to either set auto_bootstrap to true or not set the attribute at all after initializing the cluster.
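
Because the two node blocks differ only in the dc attribute, the shared settings can be held in a single variable. The sketch below is one possible arrangement, not something the module requires: the $common_settings name is hypothetical, the seeds are the two seed nodes (node0 and node3) from the table above, and the rack attribute is spelled out only to show where it goes, since RAC1 is already the default.

```puppet
# Hypothetical variable holding the settings common to both data centers.
# auto_bootstrap is false because this is the initial cluster build.
$common_settings = {
  'authenticator'               => 'AllowAllAuthenticator',
  'auto_bootstrap'              => false,
  'cluster_name'                => 'MyCassandraCluster',
  'commitlog_directory'         => '/var/lib/cassandra/commitlog',
  'commitlog_sync'              => 'periodic',
  'commitlog_sync_period_in_ms' => 10000,
  'data_file_directories'       => ['/var/lib/cassandra/data'],
  'endpoint_snitch'             => 'GossipingPropertyFileSnitch',
  'hints_directory'             => '/var/lib/cassandra/hints',
  'listen_interface'            => 'eth1',
  'num_tokens'                  => 256,
  'partitioner'                 => 'org.apache.cassandra.dht.Murmur3Partitioner',
  'saved_caches_directory'      => '/var/lib/cassandra/saved_caches',
  'seed_provider'               => [
    {
      'class_name' => 'org.apache.cassandra.locator.SimpleSeedProvider',
      'parameters' => [
        {
          # node0 (seed 1) and node3 (seed 2) from the table above.
          'seeds' => '10.168.66.41,10.176.170.59',
        },
      ],
    },
  ],
  'start_native_transport'      => true,
}

# Nodes in the first data center.
node /^node[012]$/ {
  class { 'cassandra':
    dc       => 'DC1',
    rack     => 'RAC1',
    settings => $common_settings,
  }
}

# Nodes in the second data center.
node /^node[345]$/ {
  class { 'cassandra':
    dc       => 'DC2',
    rack     => 'RAC1',
    settings => $common_settings,
  }
}
```

Keeping the settings in one place means that the later change to auto_bootstrap only has to be made once.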

DataStax Enterprise

After configuring the relevant repository, the following snippet works on CentOS 7 to install DSE Cassandra 4.7.0, set the HADOOP_LOG_DIR, set the DSE_HOME and configure DataStax Enterprise to use LDAP for authentication:

class { 'cassandra::datastax_repo':
  descr   => 'DataStax Repo for DataStax Enterprise',
  pkg_url => 'https://username:password@rpm.datastax.com/enterprise',
  before  => Class['cassandra'],
}
class { 'cassandra':
  cluster_name    => 'MyCassandraCluster',
  config_path     => '/etc/dse/cassandra',
  package_ensure  => '4.7.0-1',
  package_name    => 'dse-full',
  service_name    => 'dse',
  ...
}
class { 'cassandra::dse':
  file_lines => {
    'Set HADOOP_LOG_DIR directory' => {
      ensure => present,
      path   => '/etc/dse/dse-env.sh',
      line   => 'export HADOOP_LOG_DIR=/var/log/hadoop',
      match  => '^# export HADOOP_LOG_DIR=<log_dir>',
    },
    'Set DSE_HOME'                 => {
      ensure => present,
      path   => '/etc/dse/dse-env.sh',
      line   => 'export DSE_HOME=/usr/share/dse',
      match  => '^#export DSE_HOME',
    },
  },
  settings   => {
    ldap_options => {
      server_host                => 'localhost',
      server_port                => 389,
      search_dn                  => 'cn=Admin',
      search_password            => 'secret',
      use_ssl                    => false,
      use_tls                    => false,
      truststore_type            => 'jks',
      user_search_base           => 'ou=users,dc=example,dc=com',
      user_search_filter         => '(uid={0})',
      credentials_validity_in_ms => 0,
      connection_pool            => {
        max_active => 8,
        max_idle   => 8,
      }
    }
  }
}

Apache Cassandra

DataStax announced in late October 2016 that it was no longer supporting the community edition of Cassandra or DSC as it was known (see [Take a bow Planet Cassandra](http://www.datastax.com/2016/10/take-a-bow-planet-cassandra) for details). However, the following snippet of code running on Ubuntu 14.04 worked fine without having to change any of the ::cassandra class settings:

require cassandra::java
include cassandra::optutils
class { 'cassandra::apache_repo':
  release => '310x',
  before  => Class['cassandra', 'cassandra::optutils'],
}
class { 'cassandra':
  ...
}

Reference

The reference documentation is generated using the puppet-strings tool. To see all of it, please go to http://voxpupuli.github.io/puppet-cassandra.

Limitations

  • When using a Ruby version before 1.9.0, the order of elements in the Cassandra configuration file may change due to a problem with to_yaml in earlier versions of Ruby.
  • When creating keyspaces, indexes, cql_types and users, the settings will only be used to create a new resource if it does not currently exist. If a change is made to the Puppet manifest but the resource already exists, this change will not be reflected.
  • At the moment, the change made by cassandra::system::transparent_hugepage does not persist between reboots.
  • Acceptance tests for Debian 7 are confined to Cassandra 2.1 and 2.2. There is a conflict between the GLIBC on Debian 7 and the newer releases of Cassandra 3.X.

Development

Contributions will be gratefully accepted. Please go to the project page, fork the project, make your changes locally and then raise a pull request. Details on how to do this are available at https://guides.github.com/activities/contributing-to-open-source.

Please also see the CONTRIBUTING.md page for project specific requirements.

Additional Contributors

For a list of contributors, see CONTRIBUTING.md and https://github.com/voxpupuli/puppet-cassandra/graphs/contributors
