Cassandra.Link

8/19/2021

Cassandra 2.1 boot camp, exercise

by Joshua McKenzie

Assignment

Build a compaction strategy that compacts the most overlapping sstables together.

0. Setting up your IDE
● https://wiki.apache.org/cassandra/RunningCassandraInIDEA
● http://wiki.apache.org/cassandra/RunningCassandraInEclipse

1. Implement a no-op compaction strategy
● class Xyz extends AbstractCompactionStrategy {..}
● Implement the abstract methods
  ○ getNextBackgroundTask
    ■ Return a CompactionTask containing the sstables you want to compact, or null if there are none
  ○ getMaximalTask
    ■ ‘Major compaction’ - should compact all sstables
  ○ ...
● ALTER TABLE foo WITH compaction = { 'class': 'Xyz' }

2. Make it compact the most overlapping sstables
● We should reduce disk usage the most if we compact the most overlapping sstables together
● CompactionMetadata has ICardinality
  ○ HyperLogLog - count unique items in a stream
  ○ Currently used to estimate how large a bloom filter we need to allocate during compaction
  ○ https://github.com/addthis/stream-lib
  ○ SSTableReader#getApproximateKeyCount
  ○ ICardinality#merge - merge several of these components to find the count of keys in the union of the sstables

3. Add support for worthDroppingTombstones
● Single-sstable compaction to drop tombstones
● Tries to figure out how much the sstables overlap, then estimates how many tombstones we have outside that overlap
● Currently we check for range overlap
● Could probably be improved if we used ICardinality

4. Add heuristics to avoid n² CompactionMetadata comparisons
● Algorithms!

Summary
1. Implement a no-op compaction strategy
2. Make it compact the most overlapping sstables
3. Add support for worthDroppingTombstones
4. Add heuristics to avoid n² comparisons

Slides: bit.ly/1pd9Bws
Cassandra Summit Boot Camp, 2014
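The overlap idea in step 2 can be sketched without a Cassandra checkout. The following is a minimal illustration, not the exercise solution: SSTableSketch and OverlapSketch are hypothetical names, and an exact in-memory key set stands in for the HyperLogLog estimator. In a real strategy the union count would presumably come from merging the sstables' ICardinality components (ICardinality#merge followed by cardinality()), and the per-sstable counts from those same estimators. The overlap itself follows from inclusion-exclusion: |A ∩ B| = |A| + |B| - |A ∪ B|.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class OverlapSketch {
    // Key count of the union of two sstables. Exact here; in Cassandra this
    // would be an estimate obtained by merging the two cardinality estimators.
    static long unionCount(SSTableSketch a, SSTableSketch b) {
        Set<String> union = new HashSet<>(a.keys);
        union.addAll(b.keys);
        return union.size();
    }

    // Inclusion-exclusion: keys present in both sstables.
    static long overlap(SSTableSketch a, SSTableSketch b) {
        return a.keys.size() + b.keys.size() - unionCount(a, b);
    }

    // Brute-force scan over all pairs (the n^2 comparison that step 4 asks
    // you to avoid). Returns an empty list when no pair shares any key.
    static List<SSTableSketch> mostOverlappingPair(List<SSTableSketch> sstables) {
        List<SSTableSketch> best = new ArrayList<>();
        long bestOverlap = 0;
        for (int i = 0; i < sstables.size(); i++) {
            for (int j = i + 1; j < sstables.size(); j++) {
                long o = overlap(sstables.get(i), sstables.get(j));
                if (o > bestOverlap) {
                    bestOverlap = o;
                    best = Arrays.asList(sstables.get(i), sstables.get(j));
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        SSTableSketch a = SSTableSketch.of("a", "k1", "k2", "k3");
        SSTableSketch b = SSTableSketch.of("b", "k2", "k3", "k4");
        SSTableSketch c = SSTableSketch.of("c", "k9");
        List<SSTableSketch> best = mostOverlappingPair(Arrays.asList(a, b, c));
        System.out.println(best.get(0).name + " + " + best.get(1).name); // prints "a + b"
    }
}

// Hypothetical stand-in for an sstable: a name plus its partition keys.
final class SSTableSketch {
    final String name;
    final Set<String> keys;

    SSTableSketch(String name, Set<String> keys) {
        this.name = name;
        this.keys = keys;
    }

    static SSTableSketch of(String name, String... keys) {
        return new SSTableSketch(name, new HashSet<>(Arrays.asList(keys)));
    }
}
```

A getNextBackgroundTask implementation could wrap the returned pair in a CompactionTask and return null when the list is empty. Ranking by absolute overlap favours large sstables; ranking by overlap relative to the union size is another option worth trying.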
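For step 4, one possible heuristic (an assumption of this sketch, not something the slides prescribe) is to use a cheap token-range check as a pre-filter, so that the more expensive CompactionMetadata comparison only runs on sstables whose ranges intersect. RangeSketch and CandidatePairs are hypothetical names, and tokens are simplified to plain long values. The sstables are sorted by first token, and each one is paired only with later sstables that start before it ends.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class CandidatePairs {
    // Returns "x+y" labels for every pair whose token ranges intersect.
    // Sorting costs O(n log n); the inner loop stops at the first sstable
    // that starts after the current one ends, so disjoint sstables are
    // never compared. If every range intersects, this is still n^2.
    static List<String> candidates(List<RangeSketch> sstables) {
        List<RangeSketch> sorted = new ArrayList<>(sstables);
        sorted.sort(Comparator.comparingLong((RangeSketch s) -> s.first));
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            RangeSketch a = sorted.get(i);
            for (int j = i + 1; j < sorted.size(); j++) {
                RangeSketch b = sorted.get(j);
                if (b.first > a.last) {
                    break; // sorted by first token: nothing later can intersect a
                }
                pairs.add(a.name + "+" + b.name);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<RangeSketch> sstables = Arrays.asList(
                new RangeSketch("a", 0, 10),
                new RangeSketch("b", 5, 20),
                new RangeSketch("c", 30, 40),
                new RangeSketch("d", 35, 50));
        System.out.println(candidates(sstables)); // prints "[a+b, c+d]"
    }
}

// Hypothetical stand-in: an sstable reduced to the token range it covers.
final class RangeSketch {
    final String name;
    final long first; // smallest token in the sstable
    final long last;  // largest token in the sstable (inclusive)

    RangeSketch(String name, long first, long last) {
        this.name = name;
        this.first = first;
        this.last = last;
    }
}
```

Only the surviving pairs would then need a cardinality-based overlap estimate. Range intersection is a necessary condition for key overlap but not a sufficient one, so this filter cannot discard a pair that shares keys.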
