Cassandra 2.1 boot camp, exercise
- 1. Assignment
- 2. Build a compaction strategy that compacts the most overlapping sstables together
- 3. 0. Setting up your IDE
  https://wiki.apache.org/cassandra/RunningCassandraInIDEA
  http://wiki.apache.org/cassandra/RunningCassandraInEclipse
- 4. 1. Implement a no-op compaction strategy
  ● class Xyz extends AbstractCompactionStrategy {..}
  ● Implement the abstract methods
    ○ getNextBackgroundTask
      ■ Return a CompactionTask containing the sstables you want to compact, or null if there are none
    ○ getMaximalTask
      ■ "Major compaction": should compact all sstables
    ○ ...
  ● ALTER TABLE foo WITH compaction = { 'class': 'Xyz' };
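The shape of step 1 can be sketched without a Cassandra checkout. The types below (`SSTable`, `Task`, `StrategySketch`) are hypothetical stand-ins invented for illustration so that the example compiles on its own; the real `AbstractCompactionStrategy` has more abstract methods and different signatures, which the slide leaves as "...".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an sstable; not Cassandra's SSTableReader.
final class SSTable {
    final String name;
    SSTable(String name) { this.name = name; }
}

// Hypothetical stand-in for a compaction task: the sstables to compact together.
final class Task {
    final List<SSTable> inputs;
    Task(List<SSTable> inputs) { this.inputs = new ArrayList<>(inputs); }
}

// Hypothetical stand-in for the abstract base class named on the slide.
abstract class StrategySketch {
    protected final List<SSTable> live = new ArrayList<>();

    void addSSTable(SSTable sstable) { live.add(sstable); }

    // Returns the next background task, or null if there is nothing to do.
    abstract Task getNextBackgroundTask();

    // "Major compaction": a task covering every live sstable, or null if there are none.
    abstract Task getMaximalTask();
}

// No-op strategy: never schedules background work, but still supports major compaction.
final class NoOpStrategy extends StrategySketch {
    @Override
    Task getNextBackgroundTask() {
        return null;
    }

    @Override
    Task getMaximalTask() {
        return live.isEmpty() ? null : new Task(live);
    }
}
```

Returning null from `getNextBackgroundTask` is what makes the strategy a no-op; the later steps replace that one method body with real selection logic.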
- 5. 2. Make it compact the most overlapping sstables
  ● Compacting the sstables that overlap the most should reduce disk usage the most
  ● CompactionMetadata has an ICardinality
    ○ HyperLogLog: estimates the number of unique items in a stream
    ○ Currently used to estimate how large a bloom filter to allocate during compaction
    ○ https://github.com/addthis/stream-lib
    ○ SSTableReader#getApproximateKeyCount
    ○ ICardinality#merge: merges several of these components to find the number of keys in the union of the sstables
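One way to turn union counts into an overlap measure is inclusion-exclusion: the number of keys shared by sstables A and B is roughly |A| + |B| - |A ∪ B|. The sketch below illustrates the idea with exact `HashSet`s standing in for the HyperLogLog estimators, so it runs without stream-lib; `OverlapSketch` and its methods are hypothetical names, and with real estimators the result would be approximate and could even come out slightly negative.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class OverlapSketch {
    // Keys shared by a and b, via inclusion-exclusion: |A| + |B| - |A ∪ B|.
    // A HashSet union stands in for merging two cardinality estimators.
    static long overlap(Set<String> a, Set<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (long) a.size() + b.size() - union.size();
    }

    // Returns the indices {i, j} of the pair sharing the most keys,
    // or null if no pair shares any key.
    static int[] mostOverlappingPair(List<Set<String>> sstables) {
        int[] best = null;
        long bestOverlap = 0;
        for (int i = 0; i < sstables.size(); i++) {
            for (int j = i + 1; j < sstables.size(); j++) {
                long shared = overlap(sstables.get(i), sstables.get(j));
                if (shared > bestOverlap) {
                    bestOverlap = shared;
                    best = new int[] {i, j};
                }
            }
        }
        return best;
    }
}
```

The nested loop compares every pair, which is the n² cost that step 4 asks you to avoid.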
- 6. 3. Add support for worthDroppingTombstones
  ● Single-sstable compaction to drop tombstones
  ● Tries to figure out how much the sstables overlap, then estimates how many tombstones lie outside that overlap
  ● Currently we check for range overlap
  ● Could probably be improved by using ICardinality
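A cardinality-based version of this check could look roughly like the sketch below. It is not Cassandra's implementation: `TombstoneSketch`, the `droppableRatio` and `threshold` parameters, and the assumption that tombstones are spread evenly across keys are all made up for illustration, and exact `HashSet`s again stand in for the estimators. The idea is to count the keys that appear only in the candidate sstable as |all ∪ candidate| - |all others|, then scale the droppable tombstone ratio by that fraction.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TombstoneSketch {
    // Fraction of the candidate's keys that appear in no other sstable.
    static double uniqueKeyFraction(Set<String> candidate, List<Set<String>> others) {
        if (candidate.isEmpty()) {
            return 0.0;
        }
        Set<String> unionOfOthers = new HashSet<>();
        for (Set<String> other : others) {
            unionOfOthers.addAll(other);
        }
        Set<String> unionOfAll = new HashSet<>(unionOfOthers);
        unionOfAll.addAll(candidate);
        // Keys added to the union by the candidate alone.
        long unique = unionOfAll.size() - unionOfOthers.size();
        return (double) unique / candidate.size();
    }

    // Assumes tombstones are evenly distributed over keys, so only the share
    // sitting on non-overlapping keys counts as safely droppable.
    static boolean worthDroppingTombstones(Set<String> candidate,
                                           List<Set<String>> others,
                                           double droppableRatio,
                                           double threshold) {
        return droppableRatio * uniqueKeyFraction(candidate, others) > threshold;
    }
}
```

For a candidate with keys {a, b, c, d} and one other sstable with {c, d, e}, half of the candidate's keys are unique, so a droppable ratio of 0.5 is scaled down to 0.25 before being compared with the threshold.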
- 7. 4. Add heuristics to avoid n² CompactionMetadata comparisons
  Algorithms!
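One possible heuristic, offered as a sketch rather than the intended answer: sstables whose token ranges do not intersect cannot share keys, so sort them by first token and only run the expensive cardinality comparison on pairs whose ranges intersect. The `CandidatePairs` class is hypothetical, and representing tokens as `long` values is an assumption that depends on the partitioner.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class CandidatePairs {
    // ranges[i] = {firstToken, lastToken}, both inclusive, for sstable i.
    // Returns index pairs whose ranges intersect; only these need a
    // cardinality comparison.
    static List<int[]> intersectingPairs(long[][] ranges) {
        Integer[] order = new Integer[ranges.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Visit sstables in order of their first token.
        Arrays.sort(order, Comparator.comparingLong(i -> ranges[i][0]));

        List<int[]> pairs = new ArrayList<>();
        for (int a = 0; a < order.length; a++) {
            long lastOfA = ranges[order[a]][1];
            // Later sstables start at or after A's start, so they intersect A
            // exactly while their first token is not past A's last token.
            for (int b = a + 1; b < order.length && ranges[order[b]][0] <= lastOfA; b++) {
                pairs.add(new int[] {order[a], order[b]});
            }
        }
        return pairs;
    }
}
```

The sort costs O(n log n) and the inner loop stops at the first non-intersecting sstable, so the work is proportional to the number of intersecting pairs. If every sstable spans the whole token range, that is still n², and a different heuristic is needed.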
- 8. Summary
  1. Implement a no-op compaction strategy
  2. Make it compact the most overlapping sstables
  3. Add support for worthDroppingTombstones
  4. Add heuristics to avoid n² comparisons
  Slides: bit.ly/1pd9Bws
Cassandra Summit Boot Camp, 2014 Coding Exercise