Cassandra.Link

The best knowledge base on Apache Cassandra®

Helping platform leaders, architects, engineers, and operators build scalable real time data platforms.

10/24/2018

Reading time:22 mins

michaelklishin/cassandra-chef-cookbook

by John Doe

This is a Chef cookbook for Apache Cassandra (DataStax Community Edition) as well as DataStax Enterprise.

It uses officially released packages and provides an Upstart service script. It has fairly complete support for adjustment of Cassandra configuration parameters using Chef node attributes.

It was originally created for CI and development environments and now supports cluster discovery using Chef search. Feel free to contribute what you find missing!

Supported Chef Versions

This cookbook targets Chef 12 and later versions.

Cookbook Dependencies

depends 'java'
depends 'ulimit'
depends 'apt'
depends 'yum'
depends 'ark'

Cassandra Dependencies

Modern Cassandra versions require Oracle JDK 8.

Berkshelf

Most Recent Release

cookbook 'cassandra-dse', '~> 4.5.0'

From Git

cookbook 'cassandra-dse', github: 'michaelklishin/cassandra-chef-cookbook'

Supported Apache Cassandra Versions

This cookbook currently provides:

  • Cassandra via tarballs
  • Cassandra (DataStax Community Edition) via apt and yum packages
  • DataStax Enterprise (DSE) via packages

Supported OS Distributions

  • Ubuntu 12.04 through 17.10 via DataStax apt repo.
  • RHEL/CentOS via DataStax yum repo.
  • RHEL/CentOS/Amazon via tarball

Supported JDK Versions

Cassandra 2.x requires JDK 7+; later versions require Oracle JDK 8+.

Recipes

The main recipe is cassandra-dse::default. Based on the node[:cassandra][:install_method] attribute, it includes the proper installation recipe as well as cassandra-dse::config, which configures both datastax and tarball C* installations.

The two installation recipes are cassandra-dse::tarball and cassandra-dse::datastax. The former uses the official tarball and thus can be used to provision any specific version.

The latter installs packages from the DataStax repository. You can install any version available in the repository by altering the :package_name attribute (dsc20, i.e. v2.0, by default).

All configuration resources now live in a separate recipe, cassandra-dse::config, which means cassandra-dse::tarball and cassandra-dse::datastax are only responsible for C* installation.

Users upgrading from cookbook versions <= 3.5.0 need to update their run_list if they do not use the cassandra-dse::default recipe.

include_recipe cassandra-dse uses cassandra-dse::datastax as the default.
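
The dispatch described above can be pictured with a small plain-Ruby sketch. The helpers installation_recipe and recipes_for are hypothetical and are not the cookbook's own code; only the recipe names and the two install_method values come from this README, and the ordering of the returned recipes is an assumption.

```ruby
# Hypothetical helper, not the cookbook's own code: maps the
# install_method attribute to the installation recipe named above.
def installation_recipe(install_method)
  recipes = {
    'datastax' => 'cassandra-dse::datastax',
    'tarball'  => 'cassandra-dse::tarball'
  }
  recipes.fetch(install_method.to_s) do
    raise ArgumentError, "unknown install_method: #{install_method.inspect}"
  end
end

# The installation recipe followed by the shared config recipe.
# The ordering is an assumption made for this sketch.
def recipes_for(install_method)
  [installation_recipe(install_method), 'cassandra-dse::config']
end

puts recipes_for('tarball').inspect
```

Users who do not rely on cassandra-dse::default would list the equivalent pair of recipes in their run_list themselves.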

DataStax Enterprise

You can also install the DataStax Enterprise edition by adding node[:cassandra][:dse] attributes according to the datastax.rb.

  • node[:cassandra][:package_name]: Override default value to 'dse-full'.
  • node[:cassandra][:service_name]: Override default value to 'dse'.

Unencrypted Credentials:

  • node[:cassandra][:dse][:credentials][:username]: Your username from Datastax website.
  • node[:cassandra][:dse][:credentials][:password]: Your password from Datastax website.

Encrypted Credentials:

  • node[:cassandra][:dse][:credentials][:databag][:name]: Data bag name, e.g. the value 'cassandra' refers to /data_bags/cassandra.
  • node[:cassandra][:dse][:credentials][:databag][:item]: Data bag item, e.g. the value 'main' refers to /data_bags/cassandra/main.json.
  • node[:cassandra][:dse][:credentials][:databag][:entry]: The field name in the data bag item under which the credentials are stored, e.g. for the data bag item:
{
  "id": "main",
  "entry": {
    "username": "%USERNAME%",
    "password": "%PASSWORD%"
  }
}
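
As a rough illustration of how the three databag attributes fit together, the sketch below parses the item shown above and reads the credentials from the field named by [:databag][:entry]. The helper dse_credentials is hypothetical; in a real recipe the item would be loaded by Chef rather than parsed from a string.

```ruby
require 'json'

# Example data bag item, matching the structure shown above.
item_json = <<~JSON
  {
    "id": "main",
    "entry": {
      "username": "%USERNAME%",
      "password": "%PASSWORD%"
    }
  }
JSON

# Hypothetical helper: pull the credentials out of a parsed item using
# the field name configured in [:databag][:entry].
def dse_credentials(item, entry)
  creds = item.fetch(entry) do
    raise KeyError, "no '#{entry}' field in data bag item"
  end
  { username: creds.fetch('username'), password: creds.fetch('password') }
end

item = JSON.parse(item_json)
creds = dse_credentials(item, 'entry')
puts creds[:username]
```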

There are also recipes for DataStax opscenter installation ( cassandra-dse::opscenter_agent_tarball, cassandra-dse::opscenter_agent_datastax, and cassandra-dse::opscenter_server ) along with attributes available for override (see below).

JNA Support (for C* Versions Prior to 2.1.0)

The node[:cassandra][:setup_jna] attribute installs jna.jar to /usr/share/java/jna.jar and creates a symbolic link to it at #{cassandra.lib_dir}/jna.jar, as described in the DataStax documentation.

Node Attributes

Please note that the maintainers try to keep the list below up-to-date but it fairly often misses some recently added attributes. Please refer to the attributes files if an attribute you are looking for isn't listed.

Core Attributes

  • node[:cassandra][:install_method] (default: datastax): The installation method to use (either 'datastax' or 'tarball').
  • node[:cassandra][:config][:cluster_name] (default: none): Name of the cluster to create. This is required.
  • node[:cassandra][:version] (default: a recent patch version): version to provision
  • node[:cassandra][:tarball][:url] and node[:cassandra][:tarball][:sha256sum] specify the tarball URL and SHA256 checksum used by the cassandra-dse::tarball recipe.
  • Setting node[:cassandra][:tarball][:url] to "auto" (default) will download the tarball of the specified version from the Apache repository.
  • node[:cassandra][:setup_user] (default: true): create user/group for Cassandra node process
  • node[:cassandra][:setup_user_limits] (default: true): setup Cassandra user limits
  • node[:cassandra][:user]: username Cassandra node process will use
  • node[:cassandra][:group]: groupname Cassandra node process will use
  • node[:cassandra][:heap_new_size] set JVM -Xmn. If set, node[:cassandra][:max_heap_size] must also be set; if nil, defaults to min(100MB * num_cores, 1/4 * heap size)
  • node[:cassandra][:max_heap_size] set JVM -Xms and -Xmx. If set, node[:cassandra][:heap_new_size] must also be set; if nil, defaults to max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB))
  • node[:cassandra][:installation_dir] (default: /usr/local/cassandra): installation directory
  • node[:cassandra][:root_dir] (default: /var/lib/cassandra): data directory root
  • node[:cassandra][:log_dir] (default: /var/log/cassandra): log directory
  • node[:cassandra][:tmp_dir] (default: none): tmp directory. Be careful what you set this to, as the cassandra user will be given ownership of that directory.
  • node[:cassandra][:local_jmx] (default: true): bind JMX listener to localhost
  • node[:cassandra][:jmx_port] (default: 7199): port to listen for JMX
  • node[:cassandra][:jmx_remote_rmi_port] (default: $JMX_PORT): port for jmx remote method invocation. If using internode SSL, there is a bug requiring this to be different than node[:cassandra][:jmx_port]
  • node[:cassandra][:jmx_remote_authenticate] (default: false): turn on to require username/password for jmx operations including nodetool. To turn on requires node[:cassandra][:local_jmx] to be false
  • node[:cassandra][:jmx][:user] (default: cassandra): username for jmx authentication
  • node[:cassandra][:jmx][:password] (default: cassandra): password for jmx authentication.
  • node[:cassandra][:notify_restart] (default: false): notify Cassandra service restart upon resource update
  • Setting node[:cassandra][:notify_restart] to true will restart Cassandra service upon resource change
  • node[:cassandra][:setup_jna] (default: true): installs jna.jar
  • node[:cassandra][:skip_jna] (default: false): (2.1.0 and up only) removes jna.jar, adding '-Dcassandra.boot_without_jna=true' for low-memory C* installations
  • node[:cassandra][:pid_dir] (default: true): pid directory for the Cassandra node process, used by the cassandra-dse::tarball recipe
  • node[:cassandra][:dir_mode] (default: 0755): default permission set for Cassandra node directory / files
  • node[:cassandra][:service_action] (default: [:enable, :start]): default service actions for the service
  • node[:cassandra][:install_java] (default: true): whether to run the open source java cookbook
  • node[:cassandra][:cassandra_old_version_20] (default: ): attribute used in cookbook to determine C* version older or newer than 2.1
  • node[:cassandra][:log_config_files] (default: calculated): log framework configuration files name array
  • node[:cassandra][:xss] JVM per thread stack-size (-Xss option) (default: 256k).
  • node[:cassandra][:jmx_server_hostname] java.rmi.server.hostname option for the JMX interface; set it when you have problems connecting to JMX (default: false)
  • node[:cassandra][:heap_dump] -XX:+HeapDumpOnOutOfMemoryError JVM parameter (default: true)
  • node[:cassandra][:heap_dump_dir] Directory where heap dumps will be placed (default: nil, which will use cwd)
  • node[:cassandra][:vnodes] enable vnodes. (default: true)

For the complete set of supported attributes, please consult the source.
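
The default heap sizes quoted for max_heap_size and heap_new_size can be worked through in a few lines of Ruby. This is only a sketch of the arithmetic as stated in the list above; the method names are hypothetical, sizes are in megabytes, and integer division is an assumption.

```ruby
# max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB)), all in megabytes.
def default_max_heap_mb(system_memory_mb)
  half    = [system_memory_mb / 2, 1024].min
  quarter = [system_memory_mb / 4, 8192].min
  [half, quarter].max
end

# min(100MB * num_cores, 1/4 * heap size), in megabytes.
def default_heap_new_mb(max_heap_mb, cpu_cores)
  [100 * cpu_cores, max_heap_mb / 4].min
end

max_heap = default_max_heap_mb(16_384) # a node with 16 GB of RAM
heap_new = default_heap_new_mb(max_heap, 8)
puts "-Xms#{max_heap}M -Xmx#{max_heap}M -Xmn#{heap_new}M"
```

For a 16 GB, 8-core node this yields a 4096 MB heap and an 800 MB new generation. Remember that when either attribute is set explicitly, the other must be set as well.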

Attributes used to define JBOD functionality

  • default['cassandra']['jbod']['slices'] - defines the number of JBOD slices, each of which represents one data directory. Disabled by default (nil).
  • default['cassandra']['jbod']['dir_name_prefix'] - defines the data directory prefix. For example, if you want to attach 4 EBS disks as JBOD slices, the names will be data1, data2, data3, data4. cassandra.yaml.erb will automatically generate one entry per data directory location. Please note: this functionality does not create volumes or directories; it only takes care of configuration. You can use the same parameters with the AWS cookbook to create EBS volumes and map them to directories.
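
The naming scheme can be sketched as follows. The helper jbod_data_dirs is hypothetical, and joining the generated names onto the data root directory is an assumption made for illustration; the README only states the data1 to data4 naming.

```ruby
# Hypothetical helper: expand the JBOD attributes into directory paths.
# Placing the directories under root_dir is an assumption of this sketch.
def jbod_data_dirs(root_dir, slices, dir_name_prefix)
  return [] if slices.nil? # nil means JBOD is disabled

  (1..slices).map { |i| File.join(root_dir, "#{dir_name_prefix}#{i}") }
end

puts jbod_data_dirs('/var/lib/cassandra', 4, 'data')
```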

Attributes for fine tuning CMS/ParNew, the GC algorithm recommended for Cassandra deployments:

  • node[:cassandra][:gc_survivor_ratio] -XX:SurvivorRatio JVM parameter (default: 8)
  • node[:cassandra][:gc_max_tenuring_threshold] -XX:MaxTenuringThreshold JVM parameter (default: 1)
  • node[:cassandra][:gc_cms_initiating_occupancy_fraction] -XX:CMSInitiatingOccupancyFraction JVM parameter (default: 75)

Descriptions for these JVM parameters can be found here and here.

Attributes for enabling G1 GC.

  • node[:cassandra][:jvm][:g1] (default: false)

Attributes for enabling GC detail/logging.

  • node[:cassandra][:jvm][:gcdetail] (default: false)

Attributes for fine tuning the G1 GC algorithm:

  • node[:cassandra][:jvm][:g1_rset_updating_pause_time_percent] (default: 10)
  • node[:cassandra][:jvm][:g1_heap_region_size] -XX:G1HeapRegionSize (default: 0)
  • node[:cassandra][:jvm][:max_gc_pause_millis] -XX:MaxGCPauseMillis (default: 200)
  • node[:cassandra][:jvm][:heap_occupancy_threshold] -XX:InitiatingHeapOccupancyPercent (default: 45)
  • node[:cassandra][:jvm][:max_parallel_gc_threads] This will set -XX:ParallelGCThreads to the number of cores on the machine (default: false)
  • node[:cassandra][:jvm][:max_conc_gc_threads] This will set -XX:ConcGCThreads to the number of cores on the machine (default: false)
  • node[:cassandra][:jvm][:parallel_ref_proc] -XX:ParallelRefProcEnabled (default: false)
  • node[:cassandra][:jvm][:always_pre_touch] -XX:AlwaysPreTouch (default: false)
  • node[:cassandra][:jvm][:use_biased_locking] -XX:UseBiasedLocking (default: true)
  • node[:cassandra][:jvm][:use_tlab] -XX:UseTLAB (default: true)
  • node[:cassandra][:jvm][:resize_tlab] -XX:ResizeTLAB (default: true)

Oracle JVM 8 tuning parameters: here
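
To show how a few of these attributes relate to JVM options, here is a minimal sketch that turns an attribute hash into flags. The helper g1_jvm_flags is hypothetical, only a subset of the attributes is covered, and emitting an explicit -XX:-Flag for false values is an assumption; the cookbook's own template may behave differently.

```ruby
# G1 switched on; the remaining values are the defaults listed above.
jvm = {
  g1: true,
  max_gc_pause_millis: 200,
  heap_occupancy_threshold: 45,
  parallel_ref_proc: false,
  always_pre_touch: false
}

# Hypothetical helper: build JVM flags from the attribute hash.
def g1_jvm_flags(jvm)
  return [] unless jvm[:g1]

  numeric = {
    max_gc_pause_millis: 'MaxGCPauseMillis',
    heap_occupancy_threshold: 'InitiatingHeapOccupancyPercent'
  }
  boolean = {
    parallel_ref_proc: 'ParallelRefProcEnabled',
    always_pre_touch: 'AlwaysPreTouch'
  }

  flags = ['-XX:+UseG1GC']
  numeric.each { |attr, name| flags << "-XX:#{name}=#{jvm[attr]}" if jvm.key?(attr) }
  boolean.each do |attr, name|
    next unless jvm.key?(attr)

    flags << "-XX:#{jvm[attr] ? '+' : '-'}#{name}"
  end
  flags
end

puts g1_jvm_flags(jvm)
```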

Seed Discovery Attributes

  • node[:cassandra][:seeds] (default: [node[:ipaddress]]): an array of nodes this node will contact to discover cluster topology
  • node[:cassandra][:seed_discovery][:use_chef_search] (default: false): enabled seed discovery using Chef search
  • node[:cassandra][:seed_discovery][:search_role] (default: "cassandra-seed"): role to use in search query
  • node[:cassandra][:seed_discovery][:search_query] (default: uses node[:cassandra][:seed_discovery][:search_role]): allows for overriding the entire Chef search query
  • node[:cassandra][:seed_discovery][:count] (default: 3): how many nodes to include in the seed list. The first N nodes are taken in the order Chef search returns them. IP addresses of the nodes are sorted lexicographically.
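
The selection rule for :count can be illustrated with plain Ruby. The array below is a stand-in for Chef search results, and seed_list is a hypothetical helper rather than the cookbook's implementation.

```ruby
# Stand-in for Chef search results, in the order search returned them.
found = [
  { 'ipaddress' => '10.0.0.7' },
  { 'ipaddress' => '10.0.0.12' },
  { 'ipaddress' => '10.0.0.3' },
  { 'ipaddress' => '10.0.0.1' }
]

# Take the first N nodes, then sort their IP addresses as strings.
def seed_list(nodes, count)
  nodes.first(count).map { |node| node['ipaddress'] }.sort
end

puts seed_list(found, 3).join(',')
```

Note that the sort is lexicographic, so 10.0.0.12 is ordered before 10.0.0.3, and that 10.0.0.1 is left out because it was not among the first three results.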

cassandra.yaml Attributes

  • node[:cassandra][:config][:num_tokens] set the desired number of tokens. (default: 256)
  • node[:cassandra][:config][:listen_address] (default: node[:ipaddress]): address clients will use to connect to the node
  • node[:cassandra][:config][:broadcast_address] (default: node IP address): address to broadcast to other Cassandra nodes
  • node[:cassandra][:config][:rpc_address] (default: 0.0.0.0): address to bind the RPC interface. Leave blank to lookup IP from hostname.
  • node[:cassandra][:config][:hinted_handoff_enabled] see http://wiki.apache.org/cassandra/HintedHandoff (default: true)
  • node[:cassandra][:config][:max_hint_window_in_ms] The maximum amount of time a dead host will have hints generated (default: 10800000).
  • node[:cassandra][:config][:hinted_handoff_throttle_in_kb] throttle in KB's per second, per delivery thread (default: 1024)
  • node[:cassandra][:config][:max_hints_delivery_threads] Number of threads with which to deliver hints (default: 2)
  • node[:cassandra][:config][:authenticator] Authentication backend (default: org.apache.cassandra.auth.AllowAllAuthenticator)
  • node[:cassandra][:config][:authorizer] Authorization backend (default: org.apache.cassandra.auth.AllowAllAuthorizer)
  • node[:cassandra][:config][:permissions_validity_in_ms] Validity period for permissions cache, set to 0 to disable (default: 2000)
  • node[:cassandra][:config][:partitioner] The partitioner to distribute keys across the cluster (default: org.apache.cassandra.dht.Murmur3Partitioner).
  • node[:cassandra][:config][:disk_failure_policy] policy for data disk failures: stop, best_effort, or ignore (default: stop)
  • node[:cassandra][:config][:key_cache_size_in_mb] Maximum size of the key cache in memory. Set to 0 to disable, or "" for auto = (min(5% of Heap (in MB), 100MB)) (default: "", auto).
  • node[:cassandra][:config][:key_cache_save_period] Duration in seconds after which key cache is saved to saved_caches_directory. (default: 14400)
  • node[:cassandra][:config][:row_cache_size_in_mb] Maximum size of the row cache in memory, 0 to disable (default: 0)
  • node[:cassandra][:config][:row_cache_save_period] Duration in seconds after which row cache is saved to saved_caches_directory, 0 to disable cache save. (default: 0)
  • node[:cassandra][:config][:row_cache_provider] The provider for the row cache to use (default: SerializingCacheProvider)
  • node[:cassandra][:config][:commitlog_sync] periodic to ack writes immediately with periodic fsyncs, or batch to wait until fsync to ack writes (default: periodic)
  • node[:cassandra][:config][:commitlog_sync_period_in_ms] period for commitlog fsync when commitlog_sync = periodic (default: 10000)
  • node[:cassandra][:config][:commitlog_sync_batch_window_in_ms] batch window for fsync when commitlog_sync = batch (default: 50)
  • node[:cassandra][:config][:commitlog_segment_size_in_mb] Size of individual commitlog file segments (default: 32)
  • node[:cassandra][:config][:commitlog_total_space_in_mb] If space gets above this value (it will round up to the next nearest segment multiple), Cassandra will flush every dirty CF in the oldest segment and remove it. (default: 4096)
  • node[:cassandra][:config][:concurrent_reads] Should be set to 16 * drives (default: 32)
  • node[:cassandra][:config][:concurrent_writes] Should be set to 8 * cpu cores (default: 32)
  • node[:cassandra][:config][:trickle_fsync] Enable this to avoid sudden dirty buffer flushing from impacting read latencies. Almost always a good idea on SSDs; not necessary on platters (default: false)
  • node[:cassandra][:config][:trickle_fsync_interval_in_kb] Interval for fsync when doing sequential writes (default: 10240)
  • node[:cassandra][:config][:storage_port] TCP port, for commands and data (default: 7000)
  • node[:cassandra][:config][:ssl_storage_port] SSL port, unused unless enabled in encryption options (default: 7001)
  • node[:cassandra][:config][:listen_address] Address to bind for communication with other nodes. Leave blank to lookup IP from hostname. 0.0.0.0 is always wrong. (default: node[:ipaddress]).
  • node[:cassandra][:config][:broadcast_address] Address to broadcast to other Cassandra nodes. If '', will use listen_address (default: '')
  • node[:cassandra][:config][:start_native_transport] Whether to start the native transport server (default: true)
  • node[:cassandra][:config][:native_transport_port] Port for the CQL native transport to listen for clients on (default: 9042)
  • node[:cassandra][:config][:start_rpc] Whether to start the Thrift RPC server (default: true)
  • node[:cassandra][:config][:rpc_port] Port for Thrift RPC server to listen for clients on (default: 9160)
  • node[:cassandra][:config][:rpc_keepalive] Enable keepalive on RPC connections (default: true)
  • node[:cassandra][:config][:rpc_server_type] sync for one thread per connection; hsha for "half synchronous, half asynchronous" (default: sync)
  • node[:cassandra][:config][:thrift_framed_transport_size_in_mb] Frame size for Thrift (maximum field length) (default: 15)
  • node[:cassandra][:config][:thrift_max_message_length_in_mb] Max length of a Thrift message, including all fields and internal Thrift overhead (default: 16)
  • node[:cassandra][:config][:incremental_backups] Enable hardlinks in backups/ for each sstable flushed or streamed locally. Removing these links is the operator's responsibility (default: false)
  • node[:cassandra][:config][:snapshot_before_compaction] Take a snapshot before each compaction (default: false)
  • node[:cassandra][:config][:auto_snapshot] Take a snapshot before keyspace truncation or dropping of column families. If you set this value to false, you will lose data on truncation or drop (default: true)
  • node[:cassandra][:config][:column_index_size_in_kb] Add column indexes to a row after its contents reach this size (default: 64)
  • node[:cassandra][:config][:compaction_throughput_mb_per_sec] Throttle compaction to this total system throughput. Generally should be 16-32 times data insertion rate (default: 16)
  • node[:cassandra][:config][:read_request_timeout_in_ms] How long the coordinator should wait for read operations to complete (default: 10000)
  • node[:cassandra][:config][:range_request_timeout_in_ms] How long the coordinator should wait for seq or index scans to complete (default: 10000).
  • node[:cassandra][:config][:write_request_timeout_in_ms] How long the coordinator should wait for writes to complete (default: 10000)
  • node[:cassandra][:config][:truncate_request_timeout_in_ms] How long the coordinator should wait for truncates to complete (default: 60000)
  • node[:cassandra][:config][:request_timeout_in_ms] Default timeout for other, miscellaneous operations (default: 10000)
  • node[:cassandra][:config][:cross_node_timeout] Enable operation timeout information exchange between nodes to accurately measure request timeouts. Be sure ntp is installed and node times are synchronized before enabling. (default: false)
  • node[:cassandra][:config][:streaming_socket_timeout_in_ms] Enable socket timeout for streaming operation (default: 3600000 - 1 hour)
  • node[:cassandra][:config][:phi_convict_threshold] Adjusts the sensitivity of the failure detector on an exponential scale (default: 8)
  • node[:cassandra][:config][:endpoint_snitch] SimpleSnitch, PropertyFileSnitch, GossipingPropertyFileSnitch, RackInferringSnitch, Ec2Snitch, Ec2MultiRegionSnitch (default: SimpleSnitch)
  • node[:cassandra][:config][:dynamic_snitch_update_interval_in_ms] How often to perform the more expensive part of host score calculation (default: 100)
  • node[:cassandra][:config][:dynamic_snitch_reset_interval_in_ms] How often to reset all host scores, allowing a bad host to possibly recover (default: 600000)
  • node[:cassandra][:config][:dynamic_snitch_badness_threshold] Allow 'pinning' of replicas to hosts in order to increase cache capacity. (default: 0.1)
  • node[:cassandra][:config][:request_scheduler] Class to schedule incoming client requests (default: org.apache.cassandra.scheduler.NoScheduler)
  • node[:cassandra][:config][:index_interval] index_interval controls the sampling of entries from the primary row index in terms of space versus time (default: 128).
  • node[:cassandra][:config][:auto_bootstrap] Setting this parameter to false prevents the new nodes from attempting to get all the data from the other nodes in the data center. (default: true).
  • node[:cassandra][:config][:enable_assertions] Enable JVM assertions. Disabling this in production will give a modest performance benefit (around 5%) (default: true).
  • node[:cassandra][:config][:data_file_directories] (default: node['cassandra']['data_dir']): C* data directories
  • node[:cassandra][:config][:saved_caches_directory] (default: saved_caches_directory): C* saved cache directory
  • node[:cassandra][:config][:commitlog_directory] (default: node['cassandra']['commitlog_dir']): C* commit log directory
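
As an example of choosing values for these attributes, the sketch below applies the sizing hints given for concurrent_reads and concurrent_writes and prints the result as YAML. The helper concurrency_overrides and the cluster name are hypothetical, and the cookbook renders cassandra.yaml from its own template, so to_yaml is used here purely for illustration.

```ruby
require 'yaml'

# Sizing hints from the list above: 16 * drives and 8 * cpu cores.
def concurrency_overrides(drives:, cpu_cores:)
  {
    'concurrent_reads'  => 16 * drives,
    'concurrent_writes' => 8 * cpu_cores
  }
end

# 'Test Cluster' is a placeholder name chosen for this example.
overrides = { 'cluster_name' => 'Test Cluster', 'num_tokens' => 256 }
overrides = overrides.merge(concurrency_overrides(drives: 2, cpu_cores: 8))

puts overrides.to_yaml
```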

C* <v2.0 Attributes

  • node[:cassandra][:config][:memtable_flush_queue_size] Number of full memtables to allow pending flush, i.e., waiting for a writer thread (default: 4)
  • node[:cassandra][:config][:in_memory_compaction_limit_in_mb] Size limit for rows being compacted in memory (default: 64)
  • node[:cassandra][:config][:concurrent_compactors] Sets the number of concurrent compaction processes allowed to run simultaneously on a node. (default: nil, which will result in one compaction process per CPU core)
  • node[:cassandra][:config][:multithreaded_compaction] Enable multithreaded compaction. Uses one thread per core, plus one thread per sstable being merged. (default: false)
  • node[:cassandra][:config][:compaction_preheat_key_cache] Track cached row keys during compaction and re-cache their new positions in the compacted sstable. Disable if you use really large key caches (default: true)
  • node[:cassandra][:config][:native_transport_min_threads] Min number of threads for handling transport requests when the native protocol is used (default: nil)
  • node[:cassandra][:config][:native_transport_max_threads] Max number of threads for handling transport requests when the native protocol is used (default: nil)

C* >v2.1 Attributes

  • node[:cassandra][:config][:broadcast_rpc_address] RPC address to broadcast to drivers and other Cassandra nodes (default: node[:ipaddress])
  • node[:cassandra][:config][:tombstone_failure_threshold] tombstone attribute, check C* documentation for more info (default: 100000)
  • node[:cassandra][:config][:tombstone_warn_threshold] tombstone attribute, check C* documentation for more info (default: 1000)
  • node[:cassandra][:config][:sstable_preemptive_open_interval_in_mb] This helps to smoothly transfer reads between the sstables, reducing page cache churn and keeping hot rows hot (default: 50)
  • node[:cassandra][:config][:memtable_allocation_type] Specify the way Cassandra allocates and manages memtable memory (default: heap_buffers)
  • node[:cassandra][:config][:index_summary_capacity_in_mb] A fixed memory pool size in MB for SSTable index summaries. If left empty, this will default to 5% of the heap size (default: nil)
  • node[:cassandra][:config][:index_summary_resize_interval_in_minutes] How frequently index summaries should be resampled (default: 60)
  • node[:cassandra][:config][:concurrent_counter_writes] Concurrent writes, since writes are almost never IO bound, the ideal number of "concurrent_writes" is dependent on the number of cores in your system; (8 * number_of_cores) (default: 32)
  • node[:cassandra][:config][:counter_cache_save_period] Duration in seconds after which Cassandra should save the counter cache (keys only) (default: 7200)
  • node[:cassandra][:config][:counter_cache_size_in_mb] Counter cache helps to reduce counter locks' contention for hot counter cells. Default value is empty to make it "auto" (min(2.5% of Heap (in MB), 50MB)). Set to 0 to disable counter cache. (default: nil)
  • node[:cassandra][:config][:counter_write_request_timeout_in_ms] How long the coordinator should wait for counter writes to complete (default: 5000)
  • node[:cassandra][:config][:commit_failure_policy] policy for commit disk failures (default: stop)
  • node[:cassandra][:config][:cas_contention_timeout_in_ms] How long a coordinator should continue to retry a CAS operation that contends with other proposals for the same row (default: 1000)
  • node[:cassandra][:config][:batch_size_warn_threshold_in_kb] Log WARN on any batch size exceeding this value. 5kb per batch by default (default: 5)
  • node[:cassandra][:config][:batchlog_replay_throttle_in_kb] Maximum throttle in KBs per second, total. This will be reduced proportionally to the number of nodes in the cluster (default: 1024)

JAMM Attributes

  • node[:cassandra][:setup_jamm] (default: false): install the jamm jar file and use it to set java option -javaagent, obsolete for C* versions >v0.8.0
  • node[:cassandra][:jamm][:sha256sum] (default: calculated): jamm lib sha256sum for calculated version
  • node[:cassandra][:jamm][:base_url] (default: calculated): jamm lib jar url
  • node[:cassandra][:jamm][:jar_name] (default: calculated): jamm lib jar name
  • node[:cassandra][:jamm][:version] (default: calculated): jamm lib version

JNA Attributes (for C* Versions Prior to 2.1.0)

  • node[:cassandra][:jna][:base_url] The base url to fetch the JNA jar (default: https://github.com/twall/jna/tree/4.0/dist)
  • node[:cassandra][:jna][:jar_name] The name of the jar to download from the base url. (default: jna.jar)
  • node[:cassandra][:jna][:sha256sum] The SHA-256 checksum of the file. If the local jna.jar file matches the checksum, the chef-client will not re-download it. (default: dac270b6441ce24d93a96ddb6e8f93d8df099192738799a6f6fcfc2b2416ca19)

Priam Attributes

  • node[:cassandra][:setup_priam] (default: false): install the priam jar file and use it to set java option -javaagent, uses the priam version corresponding to the cassandra version
  • node[:cassandra][:priam][:sha256sum] (default: 9fde9a40dc5c538adee54f40fa9027cf3ebb7fd42e3592b3e6fdfe3f7aff81e1): priam lib sha256sum for version 2.2.0
  • node[:cassandra][:priam][:base_url] (default: priam url on maven.org): priam lib jar url
  • node[:cassandra][:priam][:jar_name] (default: calculated): priam lib jar name

Logback Attributes

  • node[:cassandra][:logback][:file][:max_file_size] (default: "20MB"): logback File appender log file rotation size
  • node[:cassandra][:logback][:file][:max_index] (default: 20): logback File appender log files max_index
  • node[:cassandra][:logback][:file][:min_index] (default: 1): logback File appender log files min_index
  • node[:cassandra][:logback][:file][:pattern] (default: "%-5level [%thread] %date{ISO8601} %F:%L - %msg%n"): logback File appender log pattern
  • node[:cassandra][:logback][:debug][:enable] (default: false): enable logback File appender log debug
  • node[:cassandra][:logback][:debug][:max_file_size] (default: "20MB"): logback File appender log file rotation size
  • node[:cassandra][:logback][:debug][:max_index] (default: 20): logback File appender log files max_index
  • node[:cassandra][:logback][:debug][:min_index] (default: 1): logback File appender log files min_index
  • node[:cassandra][:logback][:debug][:pattern] (default: "%-5level [%thread] %date{ISO8601} %F:%L - %msg%n"): logback File appender log pattern
  • node[:cassandra][:logback][:stdout][:enable] (default: true): enable logback STDOUT appender
  • node[:cassandra][:logback][:stdout][:pattern] (default: "%-5level %date{HH:mm:ss,SSS} %msg%n"): logback STDOUT appender log pattern
  • node[:cassandra][:logback][:syslog][:enable] (default: false): enable logback SYSLOG appender. Requires RSYSLOG be installed and running on the node.
  • node[:cassandra][:logback][:syslog][:host] (default: localhost): The host name the syslog is written to.
  • node[:cassandra][:logback][:syslog][:facility] (default: USER) The facility specified for the appender.
  • node[:cassandra][:logback][:syslog][:pattern] (default: "%-5level [%thread] %F:%L - %msg%n") logback SYSLOG appender log pattern
  • node[:cassandra][:logback][:override_loggers] (default: {}) Override the log level of specific loggers (e.g. { 'org.apache.cassandra.utils.StatusLogger' => 'WARN' })
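
One way to picture the effect of override_loggers is as a list of logback logger elements, one per hash entry. The helper logger_elements is hypothetical and the exact markup the cookbook's template emits is an assumption; the name and level attributes are standard logback configuration.

```ruby
override_loggers = { 'org.apache.cassandra.utils.StatusLogger' => 'WARN' }

# Hypothetical helper: one logback <logger> element per override.
def logger_elements(overrides)
  overrides.map { |name, level| %(<logger name="#{name}" level="#{level}"/>) }
end

puts logger_elements(override_loggers)
```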

Ulimit Attributes

  • node[:cassandra][:limits][:memlock] (default: "unlimited"): memory ulimit for Cassandra node process
  • node[:cassandra][:limits][:nofile] (default: 48000): file ulimit for Cassandra node process
  • node[:cassandra][:limits][:nproc] (default: "unlimited"): process ulimit for Cassandra node process

Yum Attributes

  • node[:cassandra][:yum][:repo] (default: datastax): name of the repo from which to install
  • node[:cassandra][:yum][:description] (default: "DataStax Repo for Apache Cassandra"): description of the repo
  • node[:cassandra][:yum][:baseurl] (default: "http://rpm.datastax.com/community"): repo url
  • node[:cassandra][:yum][:mirrorlist] (default: nil): a mirrorlist file
  • node[:cassandra][:yum][:gpgcheck] (default: false): whether to use gpgcheck
  • node[:cassandra][:yum][:enabled] (default: true): whether the repo is enabled by default
  • node[:cassandra][:yum][:options] (default: ""): Additional options to pass to yum_package

OpsCenter Attributes

DataStax Ops Center Server attributes

  • node[:cassandra][:opscenter][:server][:package_name] (default: opscenter-free)
  • node[:cassandra][:opscenter][:server][:port] (default: 8888)
  • node[:cassandra][:opscenter][:server][:interface] (default: 0.0.0.0)
  • node[:cassandra][:opscenter][:server][:authentication] (default: false)
  • node[:cassandra][:opscenter][:cassandra_metrics][:ignored_keyspaces] (default: [system, OpsCenter])
  • node[:cassandra][:opscenter][:cassandra_metrics][:ignored_column_families] (default: [])
  • node[:cassandra][:opscenter][:cassandra_metrics][:1min_ttl] (default: 604800)
  • node[:cassandra][:opscenter][:cassandra_metrics][:5min_ttl] (default: 2419200)
  • node[:cassandra][:opscenter][:cassandra_metrics][:2hr_ttl] (default: 31536000)
  • node[:cassandra][:opscenter][:custom_configuration] (default: {}) a hash of custom configuration sections to add to opscenterd.conf, e.g.:
{
 'ui' => {
   'default_api_timeout' => 300
 },
 'stat_reporter' => {
   'interval' => 1
 }
}
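
Assuming opscenterd.conf uses INI-style sections, the hash above would correspond to output along these lines. The helper render_sections is a hypothetical sketch, not the cookbook's template.

```ruby
custom_configuration = {
  'ui' => { 'default_api_timeout' => 300 },
  'stat_reporter' => { 'interval' => 1 }
}

# Hypothetical helper: render each top-level key as an INI section.
def render_sections(config)
  config.map do |section, settings|
    lines = settings.map { |key, value| "#{key} = #{value}" }
    (["[#{section}]"] + lines).join("\n")
  end.join("\n\n")
end

puts render_sections(custom_configuration)
```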

DataStax Ops Center Agent Tarball attributes

  • node[:cassandra][:opscenter][:agent][:download_url] (default: "") Required. You need to specify the agent download URL, because it can differ for each OpsCenter server version. (S3 is a great place to store packages.)
  • node[:cassandra][:opscenter][:agent][:checksum] (default: nil)
  • node[:cassandra][:opscenter][:agent][:install_dir] (default: /opt)
  • node[:cassandra][:opscenter][:agent][:install_folder_name] (default: opscenter_agent)
  • node[:cassandra][:opscenter][:agent][:binary_name] (default: opscenter-agent) Introduced because DataStax renamed the agent binary from opscenter-agent to datastax-agent. Make sure to set it correctly if you are upgrading to 4.0.2.
  • node[:cassandra][:opscenter][:agent][:server_host] (default: ""). If left empty, search will be used to find the IP of the node with the :server_role role.
  • node[:cassandra][:opscenter][:agent][:server_role] (default: opscenter_server). Will be used for the OpsCenter server IP lookup if :server_host is not set.
  • node[:cassandra][:opscenter][:agent][:use_chef_search] (default: true). Determines whether chef search will be used for locating the data agent server.
  • node[:cassandra][:opscenter][:agent][:use_ssl] (default: false)

DataStax Ops Center Agent Datastax attributes

  • node[:cassandra][:opscenter][:agent][:package_name] (default: "datastax-agent" ).
  • node[:cassandra][:opscenter][:agent][:server_host] (default: ""). If left empty, search will be used to find the IP of the node with the :server_role role.
  • node[:cassandra][:opscenter][:agent][:server_role] (default: opscenter_server). Will be used for the OpsCenter server IP lookup if :server_host is not set.
  • node[:cassandra][:opscenter][:agent][:use_ssl] (default: false)
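
The interplay of :server_host and :server_role can be sketched as a simple fallback. The helper opscenter_server_host is hypothetical, the search is passed in as a lambda so the example runs without Chef, and the role:... query string is an assumption about how the lookup is phrased.

```ruby
# Hypothetical helper: prefer an explicit host, else look one up by role.
# `search` is any callable that takes a query and returns a list of IPs.
def opscenter_server_host(agent, search)
  host = agent[:server_host].to_s
  return host unless host.empty?

  found = search.call("role:#{agent[:server_role]}")
  raise 'no OpsCenter server found' if found.empty?

  found.first
end

# Fake search used for this example only.
fake_search = ->(query) { query == 'role:opscenter_server' ? ['10.0.0.5'] : [] }

agent = { server_host: '', server_role: 'opscenter_server' }
puts opscenter_server_host(agent, fake_search)
```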

Data Center and Rack Attributes

  • node[:cassandra][:rackdc][:dc] (default: "") The datacenter to specify in the cassandra-rackdc.properties file. (GossipingPropertyFileSnitch only)
  • node[:cassandra][:rackdc][:rack] (default: "") The rack to specify in the cassandra-rackdc.properties file (GossipingPropertyFileSnitch only)
  • node[:cassandra][:rackdc][:prefer_local] (default: "false") Whether the snitch will prefer the internal ip when possible, as the Ec2MultiRegionSnitch does. (GossipingPropertyFileSnitch only)
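
For reference, the three attributes map onto key=value lines in cassandra-rackdc.properties. The helper rackdc_properties is a hypothetical sketch, the DC1 and RAC1 values are placeholders, and skipping empty values is an assumption about how unset attributes are handled.

```ruby
# Hypothetical helper: render the rackdc attributes as properties lines,
# leaving out any attribute that is empty.
def rackdc_properties(rackdc)
  %i[dc rack prefer_local].each_with_object([]) do |key, lines|
    value = rackdc[key].to_s
    lines << "#{key}=#{value}" unless value.empty?
  end.join("\n")
end

puts rackdc_properties(dc: 'DC1', rack: 'RAC1', prefer_local: 'false')
```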

Contributing

See CONTRIBUTING.md and TESTING.md.

Copyright & License

Michael S. Klishin, Travis CI Development Team, and contributors, 2012-2018.

Released under the Apache 2.0 License.
