3/1/2020

Reading time:5 min

DataStax: Extreme Cassandra Optimization: The Sequel

by DataStax Academy

1. ©2015 DataStax Confidential. Do not distribute without consent. https://goo.gl/JtC9YR @AlTobey Extreme Cassandra Optimization: The Sequel
2. init()
   • This is all specific to Cassandra 2.1
   • I will try to call out dangerous and apocryphal settings
   • Focus is on the low-hanging fruit
3. OODA
4. benchmark, configure, observe, think: START HERE (unless you’re already in prod, in which case, START HERE)
5. Questions to ask:
   • Look at the available hardware and make an educated guess
   • How many sockets/cores? Hyperthreading? NUMA?
   • How much RAM?
   • memory bandwidth matters
   • What kind of storage?
   • How much per node?
   • What kind of network interface is it?
   • Some clouds have a PPS (packets per second) limit
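A minimal sketch of how a few of these questions could be answered on a Linux host, assuming /proc and /sys are mounted. The function names and the optional path arguments are illustrative and do not come from the talk.

```shell
# Count logical CPUs (hyperthreads included) from a cpuinfo-style file.
cpu_count() {
  grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Total RAM in MiB from a meminfo-style file (MemTotal is reported in kB).
mem_total_mib() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Count NUMA nodes exposed under sysfs; the directory argument is only for testing.
numa_node_count() {
  ls -d "${1:-/sys/devices/system/node}"/node[0-9]* 2>/dev/null | wc -l | tr -d ' '
}

# Print a short summary when running on a host that exposes /proc.
if [ -r /proc/cpuinfo ] && [ -r /proc/meminfo ]; then
  echo "logical CPUs: $(cpu_count)"
  echo "RAM (MiB):    $(mem_total_mib)"
  echo "NUMA nodes:   $(numa_node_count)"
fi
```

Standard tools such as lscpu, lsblk and ethtool give a fuller picture of sockets, storage and network interfaces; the functions above only cover the basics.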
6. hypervisors [diagram; labels: Hypervisor, IOMMU, vCPU 0-3, application, kernel]
7. containers (Docker) [diagram; labels: kernel, bridge, veth, iptables, application; host networking vs. Docker networking]
8. benchmark, configure, observe, think: YOU ARE HERE
9. JVM
   • Use Hotspot Java 8 >= u45
   • Java 7 is EOL and slower
   • OpenJDK is fine
   • Zulu is a handy way to get the latest
   • http://www.azulsystems.com/products/zulu
   • Speaking of Azul …
   • Some DataStax customers are having success with C4
   • But I can’t talk about any of them
10. cassandra-env.sh: G1GC
    #JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}" # REJOICE!
    JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
    JVM_OPTS="$JVM_OPTS -XX:InitiatingHeapOccupancyPercent=20"
    JVM_OPTS="$JVM_OPTS -XX:G1RSetUpdatingPauseTimePercent=5"
    #JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=24"
    #JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=24"
    #JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=500"
    JVM_OPTS="$JVM_OPTS -XX:+ParallelRefProcEnabled"
11. cassandra-env.sh: CMS
    MAX_HEAP_SIZE=8G
    HEAP_NEWSIZE=2G
    # start here, adjust to workload
    # http://blog.ragozin.info/2012/03/secret-hotspot-option-improving-gc.html
    JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
    JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=4096"
    # these will need to be adjusted to the workload; start here
    JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=2"
    JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=15"
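To make the CMS starting point concrete, here is a rough sketch of how HEAP_NEWSIZE=2G and SurvivorRatio=2 divide the young generation. It relies on HotSpot defining SurvivorRatio as eden divided by one survivor space; the function names are made up for illustration, and the real sizes may differ slightly because the JVM aligns regions.

```shell
# Approximate size of ONE survivor space, in MiB.
# The young generation is split into (ratio + 2) equal parts:
# "ratio" parts for eden and one part for each of the two survivor spaces.
survivor_mib() {
  echo $(( $1 / ($2 + 2) ))
}

# Approximate eden size: whatever is left after the two survivor spaces.
eden_mib() {
  echo $(( $1 - 2 * ($1 / ($2 + 2)) ))
}

young_mib=2048   # HEAP_NEWSIZE=2G
ratio=2          # -XX:SurvivorRatio=2
echo "eden:     $(eden_mib "$young_mib" "$ratio") MiB"
echo "survivor: $(survivor_mib "$young_mib" "$ratio") MiB each"
```

With these settings, half of the young generation ends up as survivor space, which is far more than the HotSpot default and fits the slide's advice to adjust to the workload.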
12. cassandra-env.sh: More JVM flags
    JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
    JVM_OPTS="$JVM_OPTS -XX:+UseTLAB -XX:+ResizeTLAB"
    JVM_OPTS="$JVM_OPTS -XX:+AlwaysPreTouch" # esp. Docker!
    JVM_OPTS="$JVM_OPTS -XX:+DisableExplicitGC"
    JVM_OPTS="$JVM_OPTS -XX:+PerfDisableSharedMem"
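One way to confirm that flags like these actually take effect is to read them back from the JVM's own report. The sketch below assumes the usual `-XX:+PrintFlagsFinal` line layout (type, name, `=` or `:=`, value); the helper name jvm_flag_value is hypothetical.

```shell
# Print the value of one flag from `java -XX:+PrintFlagsFinal` output on stdin.
# Each flag line looks like: "<type> <name> = <value> {kind}" (":=" when overridden).
jvm_flag_value() {
  awk -v name="$1" '$2 == name { print $4; exit }'
}

# Example against a real JVM (left commented out so the sketch runs without Java):
#   java -XX:+UseG1GC -XX:+PrintFlagsFinal -version 2>/dev/null | jvm_flag_value UseG1GC
```

Running the same check against the Cassandra process options, rather than a bare `java -version`, is the more faithful test, since cassandra-env.sh may add or override flags.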
13. cassandra.yaml: IO threads
    concurrent_reads: 128
    concurrent_writes: 128
14. cassandra.yaml: memtables
    memtable_heap_space_in_mb: 2048
    memtable_cleanup_threshold: 0.10
    memtable_flush_writers: 4
    #memtable_allocation_type: offheap_objects # MAYBE
    Set these together!
15. cassandra.yaml: commitlog
    # Cassandra >= 2.1.9
    commitlog_segment_recycling: false
    # on SSDs and some HDD RAID
    trickle_fsync: true
    trickle_fsync_interval_in_kb: 1024
    # and/or set vm.dirty_background_bytes low
    echo 8388608 > /proc/sys/vm/dirty_background_bytes
16. cassandra.yaml: miscellaneous
    num_tokens: 32 # or 1, if you prefer
    # default in OSS is “all”
    internode_compression: dc
    # Cassandra >= 2.1.5
    otc_coalescing_strategy: TIMEHORIZON
    # https://issues.apache.org/jira/browse/CASSANDRA-8611
    streaming_socket_timeout_in_ms: 600000
17. cassandra: schema
    • The data model is the single most important factor for performance!
    • Check your compression block size (per table)
    • Use size-tiered compaction (STCS)
    • leveled compaction (LCS) for read-heavy workloads on fast storage
    • the current default of 160MB sstable_size_in_mb is fine
    • DTCS for time series (http://www.datastax.com/dev/blog/dtcs-notes-from-the-field)
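The per-table settings mentioned on this slide are changed with ALTER TABLE. The sketch below only generates the CQL text; demo.events is a hypothetical table, the helper names are made up, and the compression option names are the Cassandra 2.1 spellings (later releases renamed them), so check them against your version before applying.

```shell
# Emit CQL that switches a table to leveled compaction.
# $1 = keyspace.table, $2 = sstable_size_in_mb (defaults to 160, as on the slide).
lcs_cql() {
  printf "ALTER TABLE %s WITH compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': %s};\n" "$1" "${2:-160}"
}

# Emit CQL that sets the compression block (chunk) size in KB.
# 'sstable_compression' and 'chunk_length_kb' are the 2.1-era option names.
chunk_cql() {
  printf "ALTER TABLE %s WITH compression = {'sstable_compression': 'LZ4Compressor', 'chunk_length_kb': %s};\n" "$1" "$2"
}

# demo.events is a hypothetical table name.
lcs_cql demo.events
chunk_cql demo.events 64
# To apply a statement: cqlsh -e "$(lcs_cql demo.events)"
```

Generating the statement separately makes it easy to review the change before it touches a live table, which matters because switching compaction strategy triggers recompaction.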
18. Linux: sysctl.d
    vm.dirty_background_bytes = 16777216
    vm.dirty_bytes = 4294967296
    fs.file-max = 1000000
    vm.max_map_count = 1048576
    vm.swappiness = 1
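A small sketch for checking whether a host already matches these values, reading /proc/sys directly. The SYSCTL_ROOT variable and both function names are hypothetical, introduced so the check can also run against a fixture directory.

```shell
# Compare one sysctl key against an expected value by reading /proc/sys.
# SYSCTL_ROOT is a hypothetical override so the check can run against a fixture.
check_sysctl() {
  path="${SYSCTL_ROOT:-/proc/sys}/$(echo "$1" | tr '.' '/')"
  actual=$(cat "$path" 2>/dev/null) || { echo "MISSING $1"; return 1; }
  if [ "$actual" = "$2" ]; then
    echo "OK      $1 = $actual"
  else
    echo "DIFF    $1 = $actual (slide: $2)"
    return 1
  fi
}

# Byte counts are easier to sanity-check in MiB.
to_mib() {
  echo $(( $1 / 1048576 ))
}

echo "dirty_background_bytes: $(to_mib 16777216) MiB"
echo "dirty_bytes:            $(to_mib 4294967296) MiB"
```

For example, `check_sysctl vm.swappiness 1` reports OK or DIFF for the running kernel. The conversion shows that the slide's values amount to 16 MiB of dirty data before background writeback starts and 4096 MiB before writers are throttled.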
19. Linux: storage
    cd /sys/block
    for drive in sd* xvd* vd* nvme*
    do
      # skip glob patterns that matched no device
      [ -e "$drive/queue/scheduler" ] || continue
      echo deadline > "$drive/queue/scheduler"
      echo 8 > "$drive/queue/read_ahead_kb"
      # only on fast SSDs
      echo 0 > "$drive/queue/nomerges"
    done
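Before and after changing these settings it helps to read back what the kernel is actually using. The sketch below parses the sysfs scheduler file, where the active choice is shown in square brackets; the function name is illustrative.

```shell
# Print the active I/O scheduler from a sysfs scheduler file.
# The kernel lists all choices and brackets the active one, e.g. "noop [deadline] cfq".
active_scheduler() {
  sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Report the active scheduler and read-ahead for each block device that exists.
for dev in /sys/block/*; do
  [ -r "$dev/queue/scheduler" ] || continue
  echo "${dev##*/}: scheduler=$(active_scheduler "$dev/queue/scheduler") read_ahead_kb=$(cat "$dev/queue/read_ahead_kb")"
done
```

On newer multi-queue kernels the deadline scheduler is listed as mq-deadline, so the name to write may differ from the one on the slide; the scheduler file shows which names a given kernel accepts.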
20. Linux: RAID & filesystems
    • use xfs
    • ext4 if you must
    • ZFS if you love yourself and want to be happy
    • btrfs if you like to live dangerously
    • RAID*: Pass stripe size & width to mkfs whenever possible
    • RAID0 is by far the most common choice
    • RAID10 is fine if you can afford the disks
    • RAID5/6 in some circumstances, but there’s a tradeoff
    • JBOD is great but has tradeoffs
21. Linux kernel boot parameters
    isolcpus=0
    idle=mwait
    intel_idle.max_cstate=0
    processor.max_cstate=0
    idle=halt (C1 only)
    idle=poll (for extreme cases, wastes power)
    Disable in BIOS
22. Disable Frequency Scaling
    # make sure the CPUs run at max frequency
    for sysfs_cpu in /sys/devices/system/cpu/cpu[0-9]*
    do
      echo performance > "$sysfs_cpu/cpufreq/scaling_governor"
    done
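A quick way to verify the result is to list any CPUs that are still on another governor. This is a sketch; the function name and its optional directory argument are hypothetical, and hosts without cpufreq support (many VMs) will simply print nothing.

```shell
# List CPUs whose cpufreq governor is not "performance".
# $1 overrides the sysfs CPU directory (hypothetical parameter, handy for testing).
non_performance_cpus() {
  for gov in "${1:-/sys/devices/system/cpu}"/cpu[0-9]*/cpufreq/scaling_governor; do
    [ -r "$gov" ] || continue
    if [ "$(cat "$gov")" != "performance" ]; then
      cpu=${gov%/cpufreq/scaling_governor}
      echo "${cpu##*/}: $(cat "$gov")"
    fi
  done
}

non_performance_cpus
```

Empty output means every CPU that exposes a governor is already set to performance.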
23. Docker
24. benchmark, configure, observe, think: YOU ARE HERE
25. BENCHMARKETING
26. cassandra-stress
    cassandra-stress write n=100M cl=LOCAL_QUORUM \
      -col "size=fixed(128)" "n=fixed(10)" \
      -schema "replication(factor=3)" \
      -rate threads=512 limit=35000/s \
      -errors ignore \
      -mode native cql3 \
      -node 127.0.0.1
27. ops/s, mean, median, p95, p99, p99.9, max
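cassandra-stress reports these statistics itself. For latency samples collected some other way, the sketch below shows one common definition of a percentile (nearest rank); the sample values are made up and the function name is illustrative.

```shell
# Nearest-rank percentile of numeric samples (one per line) read from stdin.
# $1 = percentile, e.g. 50, 95, 99, 99.9
percentile() {
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = p * NR / 100
      i = int(rank)
      if (i < rank) i++        # round up to the next whole rank
      if (i < 1) i = 1
      print v[i]
    }'
}

# Ten made-up latencies in milliseconds.
samples="3 1 4 1 5 9 2 6 5 35"
for p in 50 95 99; do
  echo "p$p = $(printf '%s\n' $samples | percentile "$p") ms"
done
```

With only ten samples a single outlier determines p95 and above, which is one reason to look at the tail percentiles and max together rather than the mean alone.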
28. cassandra-stress: user schema
    cassandra-stress user n=100M cl=LOCAL_QUORUM \
      profile=bank_stress.yaml 'ops(simple=1)' no-warmup \
      -rate threads=512 limit=35000/s \
      -errors ignore \
      -node 127.0.0.1
29. benchmark, configure, observe, think: YOU ARE HERE
30. drop cache, increase RA, job done
31. drop cache: 332MiB free, 91.6GiB free
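For reference, dropping the Linux page cache is done by writing to /proc/sys/vm/drop_caches as root. The sketch below wraps that together with a free-memory reading; the function names and the optional path arguments are hypothetical, added so the steps can be dry-run against ordinary files.

```shell
# Flush dirty pages, then ask the kernel to drop clean caches.
# Writing 3 drops the page cache plus dentries and inodes; needs root.
# $1 overrides the control file (hypothetical parameter, for dry runs).
drop_caches() {
  sync
  echo 3 > "${1:-/proc/sys/vm/drop_caches}"
}

# Free memory in MiB from a meminfo-style file (MemFree is reported in kB).
mem_free_mib() {
  awk '/^MemFree:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Typical use as root on a benchmark node:
#   mem_free_mib; drop_caches; mem_free_mib
```

Dropping caches makes the next reads hit the disks, so it is best kept to benchmark runs where a cold-cache baseline is the point.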
32. lspci -vv
33. https://goo.gl/JtC9YR
