This is a guest post for the Computer Weekly Open Source Insider blog written by Ben Slater in his capacity as chief product officer at Instaclustr.
Instaclustr positions itself as a firm offering managed and supported solutions for Apache Cassandra, ScyllaDB, Elasticsearch, Apache Spark, Apache Zeppelin, Kibana and Apache Lucene.
Indeed, Instaclustr is known for its willingness to describe itself as a managed open source as a service company, if that expression actually exists.
The original title in full for this piece was: Migrating Your Cassandra Cluster – with Zero Downtime – in 7 Easy Steps.
Slater’s motives for writing this piece are (obviously) directed at companies that are looking to move a live Apache Cassandra deployment to a new location.
With this task in mind, it is natural that these same companies will have some concerns, such as how to keep Cassandra clusters 100% available throughout the process.
Slater argues that if your application is able to remain online throughout connection setting changes, it can also remain fully available during this transition.
NOTE: For extra protection and peace of mind, the following technique also includes a rapid rollback strategy to return to your original configuration, up until the moment the migration is completed.
Slater writes as follows:
Here’s a recommended 7-step Cassandra cluster migration order-of-operations that will avoid any downtime:
1) Get your existing environment ready
First of all, make sure that your application is using a datacentre-aware load balancing policy, as well as LOCAL_* consistency levels (LOCAL_QUORUM, for example), so that requests are served by the local datacentre only. Also, check that all of the keyspaces that will be copied over to the new cluster are set to use NetworkTopologyStrategy as their replication strategy. It’s also recommended that all keyspaces use this replication strategy from the moment they are created, as altering this later can become complicated.
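The replication strategy check lends itself to a small script. What follows is a minimal sketch in Python, assuming the definitions of the keyspaces to be migrated have already been read into a dictionary (for example from the system_schema.keyspaces table in Cassandra 3.0 or later). The function name and the sample keyspaces are hypothetical.

```python
NTS = "NetworkTopologyStrategy"


def keyspaces_needing_change(keyspaces):
    """Return the names of keyspaces that do not use NetworkTopologyStrategy.

    `keyspaces` maps a keyspace name to its replication map, for example
    {"class": "org.apache.cassandra.locator.SimpleStrategy",
     "replication_factor": "3"}.
    """
    flagged = []
    for name, replication in sorted(keyspaces.items()):
        # The class may be reported fully qualified, so compare only the
        # last dotted component.
        strategy = replication.get("class", "").rsplit(".", 1)[-1]
        if strategy != NTS:
            flagged.append(name)
    return flagged


# Hypothetical keyspaces, used for illustration only.
sample = {
    "app_data": {
        "class": "org.apache.cassandra.locator.NetworkTopologyStrategy",
        "dc_old": "3",
    },
    "sessions": {
        "class": "org.apache.cassandra.locator.SimpleStrategy",
        "replication_factor": "3",
    },
}

print(keyspaces_needing_change(sample))  # ['sessions']
```

Any keyspace the function reports would need its replication strategy altered before the migration begins.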
2) Create the new cluster
Now it’s time to create the new cluster that you’ll be migrating to. A few things to be careful about here: be sure that the new cluster and the original cluster use the same Cassandra version and cluster name. Also, the new datacenter name that you use must be different from the name of the existing datacenter.
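These three constraints can be written down as a pre-flight check. The sketch below is illustrative only: the ClusterInfo type, the function name and the sample values are assumptions, and the details would have to be collected from each cluster by whatever means you have available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterInfo:
    cluster_name: str
    cassandra_version: str
    datacenter: str


def preflight_problems(existing, new):
    """Return a list of reasons why the two clusters cannot be joined."""
    problems = []
    if existing.cluster_name != new.cluster_name:
        problems.append("cluster names differ")
    if existing.cassandra_version != new.cassandra_version:
        problems.append("Cassandra versions differ")
    if existing.datacenter == new.datacenter:
        problems.append("datacenter names must be different")
    return problems


# Hypothetical values, used for illustration only.
old = ClusterInfo("prod", "3.11.4", "dc_old")
new = ClusterInfo("prod", "3.11.4", "dc_new")
print(preflight_problems(old, new))  # []
```

An empty list means the new cluster satisfies the three conditions described in this step.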
3) Join the clusters together
To do this, first make any necessary firewall rule changes in order to allow the clusters to be joined, remembering that some changes to the source cluster may also be necessary. Then, change the seed nodes configured for the new cluster so that they point at the existing cluster, and start the new nodes. Once this is done, the new cluster will be a second datacenter in the original cluster.
4) Change the replication settings
Next, in the existing cluster, update the replication settings for the keyspaces that will be copied, so that data will now be replicated with the new datacenter as the destination.
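In CQL terms, this change is an ALTER KEYSPACE statement that lists both datacenters. As a rough sketch, the Python helper below builds such a statement. The helper name, keyspace name, datacenter names and replication factors are all hypothetical, and the helper assumes a simple lower-case keyspace name that needs no quoting.

```python
def alter_replication_cql(keyspace, dc_replication):
    """Build an ALTER KEYSPACE statement for NetworkTopologyStrategy.

    `dc_replication` maps datacenter name -> replication factor.
    """
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc, rf in dc_replication.items():
        options.append("'" + dc + "': " + str(int(rf)))
    body = ", ".join(options)
    return "ALTER KEYSPACE " + keyspace + " WITH replication = {" + body + "};"


# Keep the existing datacenter and add the new one as a destination.
print(alter_replication_cql("app_data", {"dc_old": 3, "dc_new": 3}))
```

The generated statement would be run once per keyspace being migrated, for example through cqlsh.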
5) Copy the data to the new cluster
When the clusters are joined together, Cassandra will begin to replicate writes to the new cluster. It’s still necessary, however, to copy any existing data over with the nodetool rebuild function. It’s a best practice to perform this function on the new cluster one or two nodes at a time, so as not to place an overwhelming streaming load on the existing cluster.
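The "one or two nodes at a time" advice amounts to running the rebuild in small batches. The sketch below only plans the batches and the nodetool command for each node; it does not execute anything. The node addresses and datacenter name are hypothetical, and it assumes the rebuild is run on each new node with the existing datacenter named as the streaming source.

```python
def rebuild_batches(new_nodes, source_dc, batch_size=2):
    """Group new-cluster nodes into batches of (node, command) pairs.

    Each batch should finish before the next one is started, which
    limits the streaming load placed on the existing cluster.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    command = "nodetool rebuild -- " + source_dc
    batches = []
    for start in range(0, len(new_nodes), batch_size):
        batch = new_nodes[start:start + batch_size]
        batches.append([(node, command) for node in batch])
    return batches


# Hypothetical node addresses, used for illustration only.
plan = rebuild_batches(["10.0.1.1", "10.0.1.2", "10.0.1.3"], "dc_old")
for number, batch in enumerate(plan, start=1):
    for node, command in batch:
        print("batch", number, "on", node, ":", command)
```

With three nodes and the default batch size of two, the plan contains two batches, the second holding the single remaining node.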
6) Change over the application’s connection points
After all uses of the rebuild function are completed, each of the clusters will contain a complete copy of the data being migrated, which Cassandra will keep in sync automatically. It’s now time to change the initial connection points of your application over to the nodes in the new cluster. Once this is completed, all reads and writes will be served by the new cluster, and will subsequently be replicated to the original cluster. Finally, it’s smart to run a repair across the cluster, in order to ensure that all data has been replicated successfully from the original.
7) Shut down the original cluster
Complete the process with a little post-migration clean up, removing the original cluster. First, change the firewall rules to disconnect the original cluster from the new one. Then, update the replication settings in the new cluster to cease replication of data to the original cluster. Lastly, shut the original cluster down.
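The replication update in this step is the reverse of the earlier one: the original datacenter is dropped from each keyspace's replication settings. A minimal Python sketch, assuming the current settings are known as a mapping of keyspace to per-datacenter replication factors (all names here are hypothetical):

```python
def drop_datacenter_cql(keyspaces, dc_to_remove):
    """Build ALTER KEYSPACE statements that stop replicating to one datacenter.

    `keyspaces` maps keyspace name -> {datacenter name: replication factor}.
    """
    statements = []
    for keyspace, dc_replication in keyspaces.items():
        remaining = {
            dc: rf for dc, rf in dc_replication.items() if dc != dc_to_remove
        }
        if not remaining:
            # Refuse to generate a statement that leaves no replicas at all.
            raise ValueError(keyspace + " would have no replicas left")
        options = ", ".join(
            "'" + dc + "': " + str(int(rf)) for dc, rf in remaining.items()
        )
        statements.append(
            "ALTER KEYSPACE " + keyspace
            + " WITH replication = {'class': 'NetworkTopologyStrategy', "
            + options + "};"
        )
    return statements


# Hypothetical keyspace and datacenter names, used for illustration only.
current = {"app_data": {"dc_old": 3, "dc_new": 3}}
for statement in drop_datacenter_cql(current, "dc_old"):
    print(statement)
```

The guard against an empty result is a design choice in this sketch: it prevents a typo in the datacenter name from producing a statement that would leave a keyspace with no replicas.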
There you have it: your Apache Cassandra deployment has been fully migrated, with zero downtime, low risk and in a manner completely seamless and transparent from the perspective of your end users.
You can follow Instaclustr on Twitter.