
Cassandra.Link

The best knowledge base on Apache Cassandra®

Helping platform leaders, architects, engineers, and operators build scalable real time data platforms.

10/30/2020


Apache Cassandra Lunch #21: Cassandra Stages / Thread Pools - Business Platform Team

by John Doe


In Cassandra Lunch #21, we discuss Cassandra and Staged Event-Driven Architecture, with an emphasis on Cassandra stages and thread pools. A video of Cassandra Lunch #21 is also embedded in this blog. Join Cassandra Lunch weekly at 12 PM EST every Wednesday here, and check out our YouTube channel for past Cassandra Lunches.

Cassandra is based on a Staged Event-Driven Architecture: it separates different tasks into stages connected by a messaging service, and each stage has its own queue and thread pool. Some stages skip the messaging service and queue tasks directly on another stage when that stage exists on the same node. A stage's queue can back up if the next stage is too busy, which can lead to performance bottlenecks.
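To make the stage idea concrete, here is a minimal Python sketch of two stages, each with its own bounded queue and small thread pool, where one stage hands its results to the next. It is only an illustration of the general pattern and not Cassandra's implementation: the `Stage` class, the stage names, the worker counts, and the queue capacity are all assumptions made for this example.

```python
import queue
import threading


class Stage:
    """One stage: a bounded queue drained by a small pool of worker threads."""

    def __init__(self, name, handler, workers=2, capacity=4, next_stage=None):
        self.name = name
        self.handler = handler
        self.next_stage = next_stage
        self.tasks = queue.Queue(maxsize=capacity)
        self.completed = 0
        self._lock = threading.Lock()
        self._threads = [
            threading.Thread(target=self._run, daemon=True) for _ in range(workers)
        ]
        for t in self._threads:
            t.start()

    def submit(self, item):
        # put() blocks while the queue is full, so a busy stage
        # slows down whoever is feeding it.
        self.tasks.put(item)

    def _run(self):
        while True:
            item = self.tasks.get()
            if item is None:  # sentinel: stop this worker
                self.tasks.task_done()
                return
            result = self.handler(item)
            if self.next_stage is not None:
                self.next_stage.submit(result)
            with self._lock:
                self.completed += 1
            self.tasks.task_done()

    def drain(self):
        """Wait until every queued task has been handled."""
        self.tasks.join()

    def stop(self):
        for _ in self._threads:
            self.tasks.put(None)
        for t in self._threads:
            t.join()


# Hypothetical two-stage pipeline: "read" feeds "respond".
results = []
results_lock = threading.Lock()


def collect(value):
    with results_lock:
        results.append(value)
    return value


respond = Stage("respond", collect)
read = Stage("read", lambda key: key.upper(), next_stage=respond)

for key in ["a", "b", "c", "d", "e", "f"]:
    read.submit(key)

read.drain()     # every read has been handled and handed off
respond.drain()  # every response has been handled
read.stop()
respond.stop()
```

Because each queue is bounded, a slow `respond` stage would eventually make the `read` workers wait inside `submit`, which is the same kind of backup between stages described above.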

Example of a read request

Here is a list with quick descriptions of the different stages and thread pools we also discussed in Cassandra Lunch #21.

  • AntiEntropyStage
    • Processing repair messages and streaming
  • CacheCleanupExecutor
    • Clearing the cache
  • CommitlogArchiver
    • Copying or archiving commitlog files for recovery
  • CompactionExecutor
    • Running compaction
  • CounterMutationStage
    • Processing local counter changes. Backs up if the write rate exceeds the mutation rate. Expect a high pending count when the consistency level is set to ONE and the workload is heavy on counter increments.
  • GossipStage
    • Distributing node information via gossip. Out-of-sync schemas can cause issues; you may have to sync them using nodetool resetlocalschema.
  • HintedHandoff
    • Sending missed mutations to other nodes. Activity here is usually a symptom of a problem elsewhere. Use nodetool disablehandoff and run repair.
  • InternalResponseStage
    • Responding to non-client initiated messages, including bootstrapping and schema checking
  • MemtableFlushWriter
    • Writing memtable contents to disk. May back up if the queue outpaces disk I/O, or because of sorting processes. WARNING: nodetool tpstats no longer reports blocked threads in the MemtableFlushWriter pool. Check the Pending Flushes metric reported by nodetool tablestats.
  • MemtablePostFlush
    • Cleaning up after flushing the memtable (discarding commit logs and secondary indexes as needed)
  • MemtableReclaimMemory
    • Making unused memory available
  • MigrationStage
    • Processing schema changes
  • MiscStage
    • Snapshotting, and replicating data after a node removal has completed.
  • MutationStage
    • Performing local inserts/updates, schema merges, commit log replays or hints in progress. A high number of Pending write requests indicates the node is having a problem handling them. Fix this by adding a node, tuning hardware and configuration, and/or updating data models.
  • Native-Transport-Requests
    • Processing CQL requests to the server
  • PendingRangeCalculator
    • Calculating pending ranges per bootstraps and departed nodes. Reporting by this tool is not useful.
  • ReadRepairStage
    • Performing read repairs. Usually fast if there is good connectivity between replicas. If Pending grows too large, attempt to lower the rate for high-read tables by altering the table to use a smaller read_repair_chance value, like 0.11.
  • ReadStage
    • Performing local reads. Also includes deserializing data from row cache. Pending values can cause increased read latency. Generally resolved by adding nodes or tuning the system.
  • RequestResponseStage
    • Handling responses from other nodes
  • ValidationExecutor
    • Running repair validation (building the Merkle trees that repair uses to compare replicas)
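The troubleshooting hints in the list above can be collected into a small lookup, sketched here in Python. The remedy text is condensed from the descriptions in this post; the `suggest` function and its `threshold` of 10 pending tasks are hypothetical choices for illustration, not recommended values.

```python
# Remedies condensed from the stage descriptions in this post.
REMEDIES = {
    "GossipStage": "check for out-of-sync schemas; consider nodetool resetlocalschema",
    "HintedHandoff": "look for a problem elsewhere; nodetool disablehandoff and run repair",
    "MemtableFlushWriter": "check Pending Flushes in nodetool tablestats",
    "MutationStage": "add a node, tune hardware and configuration, or update data models",
    "ReadRepairStage": "lower read_repair_chance on high-read tables",
    "ReadStage": "add nodes or tune the system",
}


def suggest(pending_by_pool, threshold=10):
    """Return one note per known pool whose pending count exceeds the threshold.

    `threshold` is an arbitrary illustrative cutoff, not a tuned value.
    """
    notes = []
    for pool, pending in sorted(pending_by_pool.items()):
        if pending > threshold and pool in REMEDIES:
            notes.append(f"{pool} ({pending} pending): {REMEDIES[pool]}")
    return notes


# Made-up pending counts, purely for demonstration.
notes = suggest({"ReadStage": 40, "MutationStage": 2, "CompactionExecutor": 99})
for note in notes:
    print(note)
```

Pools without an entry in `REMEDIES` are skipped on purpose, since the list above gives no specific remedy for them.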

We also covered nodetool tpstats, which provides usage statistics of thread pools. The nodetool tpstats command reports on each stage of Cassandra operations by thread pool:

  • The number of active threads.
  • The number of pending requests waiting to be executed by this thread pool.
  • The number of tasks completed by this thread pool.
  • The number of requests that are currently blocked because the thread pool for the next step in the service is full.
  • The total number of all-time blocked requests, which are all requests blocked in this thread pool up to now.
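As a rough illustration of working with those five columns, the Python sketch below parses tpstats-style text into named fields. It assumes each pool row is a pool name followed by five whole numbers in the order listed above and skips every other line; the sample text is made up for the example and is not output from a real cluster.

```python
from typing import Dict, NamedTuple


class PoolStats(NamedTuple):
    active: int
    pending: int
    completed: int
    blocked: int
    all_time_blocked: int


def parse_tpstats(text: str) -> Dict[str, PoolStats]:
    """Parse pool rows of the form: name followed by five integer columns."""
    pools = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 6:
            continue  # header, blank line, or some other section
        name, *numbers = parts
        if not all(n.isdigit() for n in numbers):
            continue
        pools[name] = PoolStats(*map(int, numbers))
    return pools


# Made-up sample in the column layout described above.
SAMPLE = """\
Pool Name                    Active   Pending   Completed   Blocked   All time blocked
ReadStage                         2        15       10432         0                  0
MutationStage                     0         0       98211         0                  0
Native-Transport-Requests         1         3         412         0                  7
"""

stats = parse_tpstats(SAMPLE)
backed_up = sorted(name for name, s in stats.items() if s.pending > 0)
print(backed_up)
```

In practice the text would come from running nodetool tpstats on a node; how that output is captured is left out here to keep the sketch self-contained.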

In addition to nodetool tpstats, check out this blog for more resources on monitoring DataStax, Cassandra, Spark, and Solr performance.

As mentioned above, you can join Cassandra Lunch weekly at 12 PM EST every Wednesday here, and check out our YouTube channel for past Cassandra Lunches!

Cassandra Lunch #21 Recording

Resources

Cassandra.Link is a knowledge base that we created for all things Apache Cassandra. Our goal with Cassandra.Link was not only to fill the gap left by Planet Cassandra, but also to bring the Cassandra community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.

We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!
