
1/24/2019

Reading time: 3 min

Advanced Node Replace - Instaclustr

by John Doe

Instaclustr has a number of internal tools and procedures that help us keep Cassandra clusters healthy. One of those tools allows us to replace the instance backing a Cassandra node while keeping its IPs and data. Dynamic Resizing, released in June 2017, uses technology developed for that tool to allow customers to scale Cassandra clusters vertically based on demand.

Initially, the replace tool operated by detaching the volumes from the instance being replaced, then re-attaching them to the new instance. This limited the tool to EBS-backed instances. Another often-requested extension was resizing a data centre to a different node size, for example to upgrade to a newly added node size or to switch over to resizable-class nodes.
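
For EBS-backed instances, that detach/re-attach step maps naturally onto the EC2 API. As a simplified illustration (not our production tooling), a boto3 sketch might look like the following; the device name and the assumption that Cassandra is already stopped on the old instance are purely illustrative:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

def move_ebs_volume(volume_id, old_instance_id, new_instance_id,
                    device="/dev/xvdf"):
    """Detach a data volume from the old instance and attach it to the new one.

    Illustrative sketch: assumes Cassandra is already stopped on the old
    instance so the filesystem is quiesced before detaching. The device
    name is an example, not a fixed convention.
    """
    ec2.detach_volume(VolumeId=volume_id, InstanceId=old_instance_id)
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])

    ec2.attach_volume(VolumeId=volume_id,
                      InstanceId=new_instance_id,
                      Device=device)
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[volume_id])
```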

One option for changing instance size where we could not simply detach and reattach data volumes was to use Cassandra’s native node replace functionality to replace each instance in the cluster in a rolling fashion. At first, this approach seems attractive, and it can be conducted with zero downtime. However, quite some time ago we realised that, unless you run a repair between each replacement, this approach almost certainly loses a small amount of data whenever a replace operation exceeds the hinted handoff window. As a result, we relied for quite a while on fairly tedious and complex rolling-upgrade methods involving detaching and re-attaching EBS volumes.
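
To make the trade-off concrete, a rolling native replace might be driven roughly as follows. The -Dcassandra.replace_address_first_boot flag and the max_hint_window_in_ms setting are standard Cassandra; the provisioning hook is a hypothetical placeholder, and the repair call is what closes the data-loss window just described:

```python
import subprocess

def nodetool(host, *args):
    """Run a nodetool command against a given node (assumes JMX is reachable)."""
    return subprocess.run(["nodetool", "-h", host, *args], check=True)

def rolling_replace(old_nodes, provision_replacement):
    """Replace each node in turn via Cassandra's native replace mechanism.

    provision_replacement(old_ip) is a hypothetical hook that boots a new
    instance with -Dcassandra.replace_address_first_boot=<old_ip>, blocks
    until the replacement has joined the ring, and returns its address.
    """
    for old_ip in old_nodes:
        new_ip = provision_replacement(old_ip)
        # Without this repair, any hints that expired while the replacement
        # ran longer than max_hint_window_in_ms (3 hours by default) are
        # silently lost.
        nodetool(new_ip, "repair", "-pr")
```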

To address this problem, we have recently extended the replace tool to remove these limitations and support the advanced use cases. The new “copy data” replace mode replaces a node in the following stages (sketched in code after the list):

    1. Provision the new node of the desired size
    2. Copy most of the data from the old node to the new node
    3. Stop the old node, ensuring that no data is lost
    4. Join the replacement node to the cluster
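
As a rough illustration of how these stages fit together (every helper below is a hypothetical placeholder for our internal provisioning and backup systems; only the ordering of the stages comes from this article):

```python
def copy_data_replace(cloud, old_node, new_size):
    """Orchestrate the 'copy data' replace mode described above.

    `cloud` and the node objects are hypothetical stand-ins; any object
    providing these methods would do.
    """
    # 1. Provision the new node of the desired size.
    new_node = cloud.provision(size=new_size)

    # 2. Copy most of the data while the old node keeps serving traffic
    #    (restored from the latest backup, then topped up; see below).
    cloud.copy_bulk_data(old_node, new_node)

    # 3. Stop Cassandra on the old node and ship the final delta so that
    #    nothing written since step 2 is lost.
    old_node.stop_cassandra()
    cloud.copy_final_delta(old_node, new_node)

    # 4. Transfer the old node's IPs and start Cassandra on the replacement,
    #    which then joins the cluster and picks up hints for the downtime.
    cloud.move_ips(old_node, new_node)
    new_node.start_cassandra()
    return new_node
```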

Provisioning is trivial with our powerful provisioning system, but copying large amounts of data from a live node presents some specific challenges. We had to develop a solution that could copy that data without creating too much additional load on a cluster that might already be under stress. We also had to work carefully within the constraints created by Cassandra’s hinted handoff system.
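
One simple way to bound that extra load is to throttle the copy itself. A sleep-based throttle along the following lines could cap read throughput on the live node; it is an illustration of the idea rather than our actual mechanism:

```python
import time

def throttled_copy(src_path, dst, max_bytes_per_sec, chunk_size=1 << 20):
    """Copy a file while capping read throughput on the source node.

    Illustrative only: keeps the average read rate under max_bytes_per_sec
    so a bulk copy does not starve Cassandra of disk bandwidth.
    `dst` is any writable binary file object.
    """
    with open(src_path, "rb") as src:
        start = time.monotonic()
        sent = 0
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            sent += len(chunk)
            # Sleep long enough that the average rate stays under the cap.
            expected = sent / max_bytes_per_sec
            elapsed = time.monotonic() - start
            if expected > elapsed:
                time.sleep(expected - elapsed)
```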

We explored a number of solutions to the problem of copying data to the new node while minimising the impact on the running nodes. After discarding several alternatives, we settled on a solution that builds on Instaclustr’s existing, proven backup/restore system. This ensures minimal resource strain on the node being replaced, as we only need to copy data added since the last backup was taken; most of the data is already in cloud storage.
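
Because SSTables are immutable once written, finding the data added since the last backup can be as simple as diffing filenames on disk against the backup manifest. As an illustration, assuming a manifest expressed as a set of relative paths (the actual manifest format in our system is not described here):

```python
import os

def sstables_to_upload(data_dir, backed_up):
    """Return SSTable files present on disk but absent from the last backup.

    SSTables are immutable once written, so comparing filenames against
    the backup manifest (`backed_up`, a set of paths relative to data_dir)
    is enough to find the delta that still needs copying.
    """
    delta = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), data_dir)
            if rel not in backed_up:
                delta.append(rel)
    return delta
```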

Stopping the old node without losing data requires stopping Cassandra and uploading the last remaining bit of data added since the previous step. This step usually completes within 10 minutes, keeping the degradation of cluster performance to a minimum.

After all of the data is on the new node, the old node is terminated, its public and private IPs are transferred to the new node, and Cassandra is started on the new node. As the replacement node joins, it receives the data it missed during the short downtime via hinted handoff.
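
On AWS, re-pointing the public address could be as simple as the boto3 call below. This is an illustration only (the actual IP transfer mechanism is not described here), and moving the private IP, for example via ENI reattachment, is omitted:

```python
import boto3

ec2 = boto3.client("ec2")

def move_public_ip(allocation_id, new_instance_id):
    """Re-point an Elastic IP at the replacement instance.

    Illustrative AWS-specific sketch; allocation_id identifies the
    Elastic IP previously associated with the old instance.
    """
    ec2.associate_address(AllocationId=allocation_id,
                          InstanceId=new_instance_id,
                          AllowReassociation=True)
```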

The new solution has allowed us to standardise our approach to node replacement for all instance types, using the proven technology of our Cassandra backup system to improve the overall performance of the process. At the moment, this resize functionality is controlled by our administrators and can be requested by customers via our support channel. We will likely make the functionality available directly to users in the future.
