DataStax: Extreme Cassandra Optimization: The Sequel

  1. ©2015 DataStax Confidential. Do not distribute without consent. https://goo.gl/JtC9YR @AlTobey
     Extreme Cassandra Optimization: The Sequel
  2. init()
     • This is all specific to Cassandra 2.1
     • I will try to call out dangerous and apocryphal settings
     • Focus is on the low-hanging fruit
  3. OODA
  4. [diagram labels: benchmark, configure, observe, think]
     START HERE (unless you’re already in prod, in which case, START HERE)
  5. Questions to ask:
     • Look at the available hardware and make an educated guess
     • How many sockets/cores? Hyperthreading? NUMA?
     • How much RAM?
     • memory bandwidth matters
     • What kind of storage?
     • How much per node?
     • What kind of network interface is it?
     • Some clouds have PPS limit
  6. hypervisors
     [diagram labels: Hypervisor, IOMMU, kernel, application, vCPU 0 to vCPU 3 per guest, address 0x00b0]
  7. containers (Docker)
     [diagram labels: kernel, bridge, veth, iptables, application, address 0x00b0; host networking vs. Docker networking]
  8. [diagram labels: benchmark, configure, observe, think]
     YOU ARE HERE
  9. JVM
     • Use Hotspot Java 8 >= u45
     • Java 7 is EOL and slower
     • OpenJDK is fine
     • Zulu is a handy way to get the latest
     • http://www.azulsystems.com/products/zulu
     • Speaking of Azul …
     • Some DataStax customers are having success with C4
     • But I can’t talk about any of them
  10. cassandra-env.sh: G1GC
          #JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}" # REJOICE!
          JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
          JVM_OPTS="$JVM_OPTS -XX:InitiatingHeapOccupancyPercent=20"
          JVM_OPTS="$JVM_OPTS -XX:G1RSetUpdatingPauseTimePercent=5"
          #JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=24"
          #JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=24"
          #JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=500"
          JVM_OPTS="$JVM_OPTS -XX:+ParallelRefProcEnabled"
  11. cassandra-env.sh: CMS
          MAX_HEAP_SIZE=8G
          HEAP_NEWSIZE=2G # start here, adjust to workload
          # http://blog.ragozin.info/2012/03/secret-hotspot-option-improving-gc.html
          JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
          JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=4096"
          # these will need to be adjusted to the workload; start here
          JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=2"
          JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=15"
  12. cassandra-env.sh: More JVM flags
          JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
          JVM_OPTS="$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB"
          JVM_OPTS="$JVM_OPTS -XX:+AlwaysPreTouch" # esp. Docker!
          JVM_OPTS="$JVM_OPTS -XX:+DisableExplicitGC"
          JVM_OPTS="$JVM_OPTS -XX:+PerfDisableSharedMem"
  13. cassandra.yaml: IO threads
          concurrent_reads: 128
          concurrent_writes: 128
  14. cassandra.yaml: memtables
          memtable_heap_space_in_mb: 2048
          memtable_cleanup_threshold: 0.10
          memtable_flush_writers: 4
          #memtable_allocation_type: offheap_objects # MAYBE
      Set these together!
  15. cassandra.yaml: commitlog
          # Cassandra >= 2.1.9
          commitlog_segment_recycling: false
          # on SSDs and some HDD RAID
          trickle_fsync: true
          trickle_fsync_interval_in_kb: 1024
          # and/or set vm.dirty_background_bytes low
          echo 8388608 > /proc/sys/vm/dirty_background_bytes
  16. cassandra.yaml: miscellaneous
          num_tokens: 32 # or 1, if you prefer
          # default in OSS is “all”
          internode_compression: dc
          # Cassandra >= 2.1.5
          otc_coalescing_strategy: TIMEHORIZON
          # https://issues.apache.org/jira/browse/CASSANDRA-8611
          streaming_socket_timeout_in_ms: 600000
  17. cassandra: schema
      • The data model is the single most important factor for performance!
      • Check your compression block size (per table)
      • Use size-tiered compaction (STCS)
      • leveled compaction (LCS) for read-heavy workloads on fast storage
      • the current default of 160MB sstable_size_in_mb is fine
      • DTCS for time series (http://www.datastax.com/dev/blog/dtcs-notes-from-the-field)
  18. Linux: sysctl.d
          vm.dirty_background_bytes = 16777216
          vm.dirty_bytes = 4294967296
          fs.file-max = 1000000
          vm.max_map_count = 1048576
          vm.swappiness = 1
  19. Linux: storage
          cd /sys/block
          for drive in sd* xvd* vd* nvme*
          do
            echo deadline > $drive/queue/scheduler
            echo 8 > $drive/queue/read_ahead_kb # only on fast SSDs
            echo 0 > $drive/queue/nomerges
          done
  20. Linux: RAID & filesystems
      • use xfs
      • ext4 if you must
      • ZFS if you love yourself and want to be happy
      • btrfs if you like to live dangerously
      • RAID*: Pass stripe size & width to mkfs whenever possible
      • RAID0 is by far the most common choice
      • RAID10 is fine if you can afford the disks
      • RAID5/6 in some circumstances, but there’s a tradeoff
      • JBOD is great but has tradeoffs
  21. Linux kernel boot parameters
          isolcpus=0
          idle=mwait
          intel_idle.max_cstate=0
          processor.max_cstate=0
          idle=halt (C1 only)
          idle=poll (for extreme cases, wastes power)
      Disable in BIOS
  22. Disable Frequency Scaling
          # make sure the CPUs run at max frequency
          for sysfs_cpu in /sys/devices/system/cpu/cpu[0-9]*
          do
            echo performance > $sysfs_cpu/cpufreq/scaling_governor
          done
  23. Docker
  24. [diagram labels: benchmark, configure, observe, think]
      YOU ARE HERE
  25. BENCHMARKETING
  26. cassandra-stress
          cassandra-stress write n=100M cl=LOCAL_QUORUM \
            -col "size=fixed(128)" "n=fixed(10)" \
            -schema "replication(factor=3)" \
            -rate threads=512 limit=35000/s \
            -errors ignore \
            -mode native cql3 \
            -node 127.0.0.1
  27. [column headings: ops/s, mean, median, p95, p99, p99.9, max]
  28. cassandra-stress: user schema
          cassandra-stress user n=100M cl=LOCAL_QUORUM \
            profile=bank_stress.yaml 'ops(simple=1)' no-warmup \
            -rate threads=512 limit=35000/s \
            -errors ignore \
            -node 127.0.0.1
  29. [diagram labels: benchmark, configure, observe, think]
      YOU ARE HERE
  30. [chart annotations: drop cache, increase RA, job done]
  31. [chart annotations: drop cache, 332MiB free, 91.6GiB free]
  32. lspci -vv
  33. https://goo.gl/JtC9YR
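
The storage loop on slide 19 can be sketched as a reusable function. This is a minimal sketch, not the author's script: the function name `tune_block_devices` and its optional sysfs-root argument are hypothetical, added so the loop can be tried against a scratch directory before pointing it at /sys/block. The values written (deadline, 8, 0) are the ones from the slide, and the slide's "only on fast SSDs" caveat still applies.

```shell
#!/bin/sh
# Hypothetical wrapper around the slide-19 loop.
# $1 = sysfs block root (defaults to /sys/block); a scratch dir works for dry runs.
tune_block_devices() {
    root="${1:-/sys/block}"
    for drive in "$root"/sd* "$root"/xvd* "$root"/vd* "$root"/nvme*; do
        # Unmatched globs stay literal in sh, so skip anything without a queue dir.
        [ -d "$drive/queue" ] || continue
        echo deadline > "$drive/queue/scheduler"
        echo 8 > "$drive/queue/read_ahead_kb"   # slide: only on fast SSDs
        echo 0 > "$drive/queue/nomerges"
        echo "tuned $(basename "$drive")"
    done
}
```

Calling `tune_block_devices` with no argument acts on the real /sys/block and needs root; passing a directory that mimics the `<device>/queue` layout lets the loop be checked first. Quoting the paths and skipping unmatched patterns avoids the errors the bare loop prints on hosts that lack, say, any `xvd*` devices.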

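
The byte values on slides 15 and 18 are whole numbers of MiB, which is easier to see when they are derived rather than typed. A small sketch, assuming a shell with 64-bit arithmetic; `mib_to_bytes` and `render_dirty_sysctl` are hypothetical helper names, not from the slides.

```shell
#!/bin/sh
# Hypothetical helper: convert MiB to bytes (needs 64-bit shell arithmetic).
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Hypothetical helper: print the two vm.dirty_* lines in sysctl.d syntax.
# $1 = background writeback threshold in MiB, $2 = hard threshold in MiB.
render_dirty_sysctl() {
    printf 'vm.dirty_background_bytes = %s\n' "$(mib_to_bytes "$1")"
    printf 'vm.dirty_bytes = %s\n' "$(mib_to_bytes "$2")"
}
```

`render_dirty_sysctl 16 4096` prints the two vm.dirty_* lines with the slide 18 values (16 MiB and 4 GiB), and `mib_to_bytes 8` gives the 8388608 written to /proc on slide 15.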