Cassandra.Link

The best knowledge base on Apache Cassandra®

Helping platform leaders, architects, engineers, and operators build scalable real time data platforms.

5/10/2021

Apache Cassandra Lunch #17: Tombstones - Business Platform Team

by Obioma Anomnachi

In Cassandra Lunch #17, we discuss tombstones in Cassandra. A tombstone is a special kind of write that marks a value as deleted, stops it from being returned on reads, and eventually allows it to be removed during compaction. We cover what tombstones are, why Cassandra uses them, and how they work in practice.

What and Why

A tombstone is a marker that can be written to a Cassandra cluster in a number of different ways. Each tombstone carries deletion_info, which records when the data was deleted. Tombstones exist to keep Cassandra's writes and reads fast. If every delete had to find and remove the data immediately, deletes would take a very long time. Instead, a delete writes a tombstone, taking advantage of the same high-speed path that Cassandra uses for every other write. This also allows Cassandra to scale to a large number of machines. Tombstones do have costs. They take up disk space until compaction removes them, and they slow down reads, because a SELECT has to read tombstones along with live data from the memtables and SSTables it touches and hold them in memory while it merges and filters the results.
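The idea that a delete is just another write can be sketched with a toy model. This is a minimal illustration, not Cassandra's storage engine: `ToyTable`, `Cell`, and the in-memory list of writes are hypothetical names made up for this example, and the only behavior it assumes is that the newest write wins on read and that a tombstone wins a timestamp tie.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    value: Optional[str]        # None when the cell is a tombstone
    timestamp: int              # write time supplied by the caller
    is_tombstone: bool = False


class ToyTable:
    """Append-only list of writes. A toy model only, not Cassandra's on-disk format."""

    def __init__(self):
        self.writes = []        # list of (key, Cell) in arrival order

    def insert(self, key, value, timestamp):
        self.writes.append((key, Cell(value, timestamp)))

    def delete(self, key, timestamp):
        # A delete does not search for old data. It appends a marker.
        self.writes.append((key, Cell(None, timestamp, is_tombstone=True)))

    def read(self, key):
        # The read path has to look at every version, tombstones included,
        # before it can decide what to return.
        versions = [cell for k, cell in self.writes if k == key]
        if not versions:
            return None
        # Newest timestamp wins; on a tie the tombstone wins.
        newest = max(versions, key=lambda c: (c.timestamp, c.is_tombstone))
        return None if newest.is_tombstone else newest.value


table = ToyTable()
table.insert("user:1", "alice", timestamp=1)
table.delete("user:1", timestamp=2)
print(table.read("user:1"))     # the tombstone hides the older value
print(len(table.writes))        # both the value and the tombstone still take up space
```

The old value and the tombstone both remain in the list until something rewrites it, which is the role compaction plays in a real cluster.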

How

Tombstones are created when data in a Cassandra cluster is deleted. Depending on the scope of the delete, the result is a partition, row, or cell tombstone. Tombstones are also created when null values are inserted or updated into a Cassandra table, and when data is written with a TTL (Time to Live) value that defines when it is to be deleted. A TTL can be set as a default for a table or for individual rows and cells at write time; once the TTL expires, the data is treated as a tombstone.
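As a rough illustration of the cases above, the snippet below lists one CQL statement per case. The keyspace, table, and column names are hypothetical: it assumes a table `ks.readings` whose primary key is `(sensor_id, day)` with a regular column `temperature`. The statements are held as plain strings so the example runs without a cluster.

```python
# Hypothetical schema: ks.readings, PRIMARY KEY (sensor_id, day), column temperature.
TOMBSTONE_EXAMPLES = {
    # Deleting by partition key only removes the whole partition.
    "partition tombstone":
        "DELETE FROM ks.readings WHERE sensor_id = 1",
    # Deleting by the full primary key removes one row.
    "row tombstone":
        "DELETE FROM ks.readings WHERE sensor_id = 1 AND day = '2021-05-10'",
    # Naming a column in the DELETE removes a single cell.
    "cell tombstone":
        "DELETE temperature FROM ks.readings "
        "WHERE sensor_id = 1 AND day = '2021-05-10'",
    # Writing an explicit null also writes a cell tombstone.
    "cell tombstone from a null":
        "INSERT INTO ks.readings (sensor_id, day, temperature) "
        "VALUES (1, '2021-05-10', null)",
    # Data written with a TTL turns into a tombstone when the TTL expires.
    "expiring data (TTL)":
        "INSERT INTO ks.readings (sensor_id, day, temperature) "
        "VALUES (1, '2021-05-10', 21.5) USING TTL 86400",
}

for kind, statement in TOMBSTONE_EXAMPLES.items():
    print(f"{kind}: {statement}")
```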

Tombstones are cleaned up automatically during the compaction process. At the cluster level, the settings that affect when compaction takes place are the number of compaction executors (concurrent_compactors) and the compaction throughput limit. At the table level, gc_grace_seconds, which defaults to 10 days (864000 seconds), is the minimum age a tombstone must reach before compaction is allowed to drop it. It is not a countdown that triggers a compaction; it gives repair time to propagate the delete to every replica so that deleted data does not reappear. The tombstone_threshold is the ratio of droppable tombstones in an SSTable above which that SSTable is considered for a tombstone compaction. The default value is 0.2, or 20% tombstones. The tombstone_compaction_interval setting determines how long an SSTable must exist before it is eligible for such a compaction; the default is one day. Setting unchecked_tombstone_compaction to true lets tombstone compaction run without the usual pre-check of whether the SSTable's tombstones can actually be dropped.
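The way these table-level settings interact can be sketched as two small decision functions. This is a simplified model under stated assumptions, not Cassandra's actual code: the function names and the `overlaps_other_sstables` flag are hypothetical, the real pre-check is more nuanced than a single boolean, and the defaults shown are the commonly documented ones.

```python
def tombstone_is_purgeable(local_delete_time, gc_grace_seconds, now):
    """A tombstone may be dropped only once it is older than gc_grace_seconds.

    All arguments are in seconds. gc_grace_seconds never triggers a
    compaction by itself; it only limits what a compaction may drop.
    """
    return local_delete_time < now - gc_grace_seconds


def worth_tombstone_compaction(
    sstable_age_seconds,
    droppable_ratio,
    *,
    tombstone_threshold=0.2,                # assumed default: 20% droppable tombstones
    tombstone_compaction_interval=86400,    # assumed default: one day, in seconds
    unchecked_tombstone_compaction=False,
    overlaps_other_sstables=False,          # hypothetical stand-in for the pre-check
):
    """Simplified decision: should this one SSTable get a tombstone compaction?"""
    if sstable_age_seconds < tombstone_compaction_interval:
        return False                        # SSTable is too young
    if droppable_ratio <= tombstone_threshold:
        return False                        # not enough droppable tombstones
    if unchecked_tombstone_compaction:
        return True                         # skip the pre-check
    # Simplification: if other SSTables overlap, the tombstones may still be
    # shadowing data elsewhere, so treat the compaction as not worthwhile.
    return not overlaps_other_sstables


TEN_DAYS = 10 * 24 * 3600
print(tombstone_is_purgeable(local_delete_time=0, gc_grace_seconds=TEN_DAYS, now=TEN_DAYS + 1))
print(worth_tombstone_compaction(sstable_age_seconds=2 * 86400, droppable_ratio=0.35))
```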

We can trigger compaction manually via nodetool compact, nodetool garbagecollect, and nodetool scrub. Generally, it is better to configure the cluster and the data model so that tombstones do not accumulate in the first place. A common cause of accumulation is an automated process that repeatedly writes null values. When tombstones have already built up, a manual compaction is sometimes the best way to take care of them.
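One way to keep an automated writer from pushing nulls is to leave unset columns out of the INSERT entirely. The helper below is a minimal sketch of that idea: `build_insert` is a hypothetical name, the `%s` placeholder style is an assumption about the client library in use, and the table and column names are trusted rather than validated. Note that omitting a column leaves any existing value in place, so this only suits cases where a null does not mean "clear this column".

```python
def build_insert(table, row):
    """Build an INSERT statement that skips columns whose value is None.

    Returns (statement, params). Identifiers are not sanitized here, so
    `table` and the keys of `row` must come from trusted code.
    """
    present = {column: value for column, value in row.items() if value is not None}
    if not present:
        raise ValueError("row has no non-null columns to insert")
    columns = ", ".join(present)
    placeholders = ", ".join(["%s"] * len(present))
    statement = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return statement, tuple(present.values())


# The null email is dropped instead of being written as a cell tombstone.
statement, params = build_insert(
    "ks.users", {"id": 1, "name": "alice", "email": None}
)
print(statement)
print(params)
```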

To monitor tombstones and compaction, we can use nodetool tablestats, which reports statistics for the node it is run against. Anant’s Cassandra Toolkit includes TableAnalyzer, which can aggregate these values across the machines that make up a cluster. We can also search the logs for the tombstone warnings that Cassandra writes when a single read scans more tombstones than the configured warning threshold. Cassandra Toolkit also has tools for this, under NodeAnalyzer or log-analysis.
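Scanning a log for tombstone warnings can be sketched in a few lines. This is an illustration only: it assumes the warning contains the phrase "Read N live rows and M tombstone cells", which may differ between Cassandra versions, and the sample log line is made up for the example rather than copied from a real cluster. `find_tombstone_warnings` is a hypothetical helper, not part of Cassandra Toolkit.

```python
import re

# Assumed wording of the warning; adjust if your Cassandra version differs.
TOMBSTONE_WARNING = re.compile(
    r"Read (?P<live>\d+) live rows and (?P<tombstones>\d+) tombstone cells"
)


def find_tombstone_warnings(lines, min_tombstones=0):
    """Return (live_rows, tombstone_cells, line) for each matching log line."""
    hits = []
    for line in lines:
        match = TOMBSTONE_WARNING.search(line)
        if match is None:
            continue
        live = int(match.group("live"))
        tombstones = int(match.group("tombstones"))
        if tombstones >= min_tombstones:
            hits.append((live, tombstones, line.rstrip("\n")))
    return hits


# Made-up sample lines for illustration.
sample_log = [
    "INFO  [main] 2021-05-10 12:00:00,000 Startup complete\n",
    "WARN  [ReadStage-2] 2021-05-10 12:00:01,000 Read 12 live rows and 4500 "
    "tombstone cells for query SELECT * FROM ks.readings WHERE sensor_id = 1 "
    "(see tombstone_warn_threshold)\n",
]

for live, tombstones, _ in find_tombstone_warnings(sample_log, min_tombstones=1000):
    print(f"live rows: {live}, tombstone cells: {tombstones}")
```

In practice the same function could be fed an open log file, since iterating over a file object yields lines.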

Additional Resources

Cassandra Links | Understanding Cassandra Tombstones – Beyond the Lines

Cassandra Links | The Curious Case of Tombstones

Cassandra Links | Experiences with Tombstones in Apache Cassandra

Cassandra Links | Common Problems with Cassandra Tombstones – OpenCredo

Cassandra Links | Undetectable Tombstones in Apache Cassandra

Cassandra Links | About Deletes and Tombstones in Cassandra

Cassandra Links | Deletes and Tombstones

Resources for Monitoring Datastax, Cassandra, Spark, & Solr Performance

Compaction subproperties

Cassandra Lunch 17 Recording

ICYMI

Cassandra.Link

Cassandra.Link is a knowledge base that we created for all things Apache Cassandra. Our goal with Cassandra.Link was to not only fill the gap of Planet Cassandra, but to bring the Cassandra community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.

We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!
