Cassandra.Link

The best knowledge base on Apache Cassandra®

Helping platform leaders, architects, engineers, and operators build scalable real time data platforms.

11/1/2024

GitHub - datastax/zdm-proxy: An open-source component designed to seamlessly handle the real-time client application activity while a migration is in progress.

by datastax

The ZDM Proxy is a client-server component written in Go that enables users to migrate with zero downtime from one Apache Cassandra® cluster to another (which may be an Astra cluster) without requiring code changes in the client application.

The only change to the client is pointing it to the proxy rather than directly to the original cluster (Origin). In turn, the proxy connects to both Origin and Target clusters.

By default, the proxy will forward read requests only to the Origin cluster, though you can optionally configure it to forward reads to both clusters asynchronously, while writes will always be sent to both clusters concurrently.
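The routing rule described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the proxy's Go implementation: the `route` function and `ReadMode` enum are hypothetical names, and `DUAL_ASYNC_ON_SECONDARY` is an assumed value for the optional dual-read mode (only `PRIMARY_ONLY` appears in this document).

```python
from enum import Enum


class ReadMode(Enum):
    PRIMARY_ONLY = "PRIMARY_ONLY"
    # Assumption: name of the optional asynchronous dual-read mode.
    DUAL_ASYNC_ON_SECONDARY = "DUAL_ASYNC_ON_SECONDARY"


def route(is_write, primary="ORIGIN", read_mode=ReadMode.PRIMARY_ONLY):
    """Return (sync_clusters, async_clusters) for a single request.

    sync_clusters: clusters whose responses the client waits for.
    async_clusters: clusters that also receive the request in the background.
    """
    clusters = ["ORIGIN", "TARGET"]
    if primary not in clusters:
        raise ValueError(f"unknown primary cluster: {primary}")
    secondary = "TARGET" if primary == "ORIGIN" else "ORIGIN"

    if is_write:
        # Writes are always sent to both clusters concurrently.
        return (clusters, [])
    if read_mode is ReadMode.DUAL_ASYNC_ON_SECONDARY:
        # Reads are served by the primary and mirrored asynchronously.
        return ([primary], [secondary])
    # Default: reads go to the primary cluster only (Origin by default).
    return ([primary], [])


print(route(is_write=True))
print(route(is_write=False))
```

With the defaults, a write is routed to both clusters and a read only to Origin, which matches the default behavior described above.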

An overview of the proxy architecture and logical flow can be viewed here.

To run the proxy, you need to set some environment variables or pass a reference to a YAML configuration file. The list below shows the most important variables along with their default values. Required variables are marked with a comment. In a YAML configuration file, the variable names drop the ZDM_ prefix and are lower-cased.

ZDM_ORIGIN_CONTACT_POINTS=10.0.0.1  #required
ZDM_ORIGIN_USERNAME=cassandra       #required
ZDM_ORIGIN_PASSWORD=cassandra       #required
ZDM_ORIGIN_PORT=9042
ZDM_TARGET_CONTACT_POINTS=10.0.0.2  #required
ZDM_TARGET_USERNAME=cassandra       #required
ZDM_TARGET_PASSWORD=cassandra       #required
ZDM_TARGET_PORT=9042
ZDM_PROXY_LISTEN_PORT=14002
ZDM_PROXY_LISTEN_ADDRESS=127.0.0.1
ZDM_PRIMARY_CLUSTER=ORIGIN
ZDM_READ_MODE=PRIMARY_ONLY
ZDM_LOG_LEVEL=INFO

The environment variables (or the YAML configuration file) must be set for the proxy to work.
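As a rough illustration of the naming rule (drop the ZDM_ prefix, lower-case the rest), the following Python sketch converts environment-style settings into YAML lines and reports which required variables are missing. The helper names `yaml_key`, `render_config` and `missing_required` are hypothetical and not part of the proxy. The sketch emits plain `key: value` lines, so values containing special YAML characters would need quoting.

```python
# Required variables, taken from the list above.
REQUIRED = (
    "ZDM_ORIGIN_CONTACT_POINTS",
    "ZDM_ORIGIN_USERNAME",
    "ZDM_ORIGIN_PASSWORD",
    "ZDM_TARGET_CONTACT_POINTS",
    "ZDM_TARGET_USERNAME",
    "ZDM_TARGET_PASSWORD",
)

PREFIX = "ZDM_"


def yaml_key(env_name):
    """Map an environment variable name to its YAML key,
    for example ZDM_ORIGIN_PORT becomes origin_port."""
    if not env_name.startswith(PREFIX):
        raise ValueError(f"not a ZDM variable: {env_name}")
    return env_name[len(PREFIX):].lower()


def missing_required(env):
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]


def render_config(env):
    """Render the ZDM_* entries of a mapping as simple YAML lines."""
    return "".join(
        f"{yaml_key(name)}: {value}\n"
        for name, value in env.items()
        if name.startswith(PREFIX)
    )


settings = {
    "ZDM_ORIGIN_CONTACT_POINTS": "10.0.0.1",
    "ZDM_TARGET_CONTACT_POINTS": "10.0.0.2",
}
print(render_config(settings), end="")
print("missing:", missing_required(settings))
```

Passing `os.environ` instead of the sample dictionary would check the current shell environment in the same way.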

To get started quickly in your local environment, grab a copy of the binary distribution from the Releases page. For the recommended installation in a production environment, check the Production Setup section below.

Now, suppose you have two clusters running at 10.0.0.1 and 10.0.0.2 with cassandra/cassandra credentials and the same key-value schema. You can start the proxy and connect it to these clusters like this:

$ export ZDM_ORIGIN_CONTACT_POINTS=10.0.0.1
$ export ZDM_TARGET_CONTACT_POINTS=10.0.0.2
$ export ZDM_ORIGIN_USERNAME=cassandra
$ export ZDM_ORIGIN_PASSWORD=cassandra
$ export ZDM_TARGET_USERNAME=cassandra
$ export ZDM_TARGET_PASSWORD=cassandra
$ ./zdm-proxy-v2.0.0 # run the ZDM proxy executable

If you prefer to use a YAML configuration file, an equivalent setup would look like this:

$ cat zdm-config.yml
origin_contact_points: 10.0.0.1
target_contact_points: 10.0.0.2
origin_username: cassandra
origin_password: cassandra
target_username: cassandra
target_password: cassandra
$ ./zdm-proxy-v2.0.0 --config=./zdm-config.yml # run the ZDM proxy executable

At this point, you should be able to connect a client such as CQLSH to the proxy and write data through it. The proxy takes care of forwarding the requests to both clusters concurrently.

$ cqlsh <proxy-ip-address> 14002 # this is the proxy's default listen port

From the CQLSH prompt:

cqlsh> INSERT INTO test.keyvalue (key, value) VALUES (1, 'ABC');
cqlsh> INSERT INTO test.keyvalue (key, value) VALUES (2, 'DEF');
cqlsh> SELECT * FROM test.keyvalue;
cqlsh> UPDATE test.keyvalue SET value='GYEKJF' WHERE key = 1;
cqlsh> DELETE FROM test.keyvalue WHERE key = 2;

You can confirm that the data is stored in both clusters by querying them directly in other cqlsh sessions.
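One way to run that check is to issue the same query against each cluster directly with `cqlsh -e`. The Python sketch below only wraps the command line. The helper names are hypothetical, and the addresses, credentials and port 9042 are the example values used earlier in this document, not anything the proxy mandates.

```python
import subprocess


def cqlsh_command(host, query, username="cassandra", password="cassandra", port=9042):
    """Build the argv for running one CQL statement with cqlsh."""
    return ["cqlsh", host, str(port), "-u", username, "-p", password, "-e", query]


def query_both(query, origin="10.0.0.1", target="10.0.0.2"):
    """Run the query directly on each cluster (bypassing the proxy)
    and return the raw cqlsh output per cluster."""
    results = {}
    for name, host in (("origin", origin), ("target", target)):
        completed = subprocess.run(
            cqlsh_command(host, query),
            capture_output=True,
            text=True,
            check=True,  # raise if cqlsh exits with an error
        )
        results[name] = completed.stdout
    return results


if __name__ == "__main__":
    for cluster, output in query_both("SELECT * FROM test.keyvalue;").items():
        print(f"--- {cluster} ---")
        print(output)
```

If the two outputs list the same rows, the writes sent through the proxy reached both clusters.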

Note: For the moment, the keyspace must be specified when accessing a table, even after using USE <keyspace>.

If you don't have test clusters readily available to try with, check the alternative method with docker-compose in the Contributor's guide, which will set up all the dependencies, including two test clusters and a proxy instance, in a containerized sandbox environment.

ZDM Proxy supports protocol versions v2, v3, v4, DSE_V1 and DSE_V2.

It technically doesn't support v5, but handles protocol negotiation so that the client application properly downgrades the protocol version to v4 if v5 is requested. This means that any client application using a recent driver that supports protocol version v5 can be migrated using the ZDM Proxy (as long as it does not use v5-specific functionality).

ZDM Proxy requires the origin and target clusters to have at least one protocol version in common. It is therefore not feasible to configure Apache Cassandra 2.0 as origin and 3.x / 4.x as target. The table below shows the protocol versions supported by each Apache Cassandra version:

Apache Cassandra    Protocol Version
2.0                 V2
2.1                 V2, V3
2.2                 V2, V3, V4
3.x                 V3, V4
4.x                 V3, V4, V5
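The compatibility rule can be checked mechanically from the table. The Python sketch below intersects the versions supported by each cluster with the native versions the proxy handles (v2, v3, v4). The `common_version` helper is hypothetical, and returning the highest shared version is an assumption made for illustration: this document only states that at least one shared version must exist. DSE_V1 and DSE_V2 are left out because the table covers Apache Cassandra only.

```python
# Protocol versions per Apache Cassandra release, copied from the table above.
SUPPORTED = {
    "2.0": {2},
    "2.1": {2, 3},
    "2.2": {2, 3, 4},
    "3.x": {3, 4},
    "4.x": {3, 4, 5},
}

# Native protocol versions handled by the proxy; v5 is negotiated down to v4.
PROXY_VERSIONS = {2, 3, 4}


def common_version(origin, target):
    """Return the highest protocol version usable with both clusters,
    or None when the pair has no version in common."""
    shared = SUPPORTED[origin] & SUPPORTED[target] & PROXY_VERSIONS
    return max(shared) if shared else None


for pair in (("2.0", "3.x"), ("2.1", "4.x"), ("3.x", "4.x")):
    print(pair, "->", common_version(*pair))
```

The pair 2.0 and 3.x yields `None`, which is the infeasible combination mentioned above, while 2.1 and 4.x share v3.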

⚠️ Thrift is not supported by ZDM Proxy. If you are using a very old driver or cluster version that only supports Thrift, you need to change your client application to use CQL, and potentially upgrade your cluster, before starting the migration process.


In practice this means that ZDM Proxy supports the following cluster versions (as Origin and / or Target):

  • Apache Cassandra 2.0 and later, up to and including Apache Cassandra 4.x (both clusters must support a common protocol version, as mentioned above).
  • DataStax Enterprise 4.8+. DataStax Enterprise 4.6 and 4.7 support will be introduced when protocol version v2 is supported.
  • DataStax Astra DB (both Serverless and Classic)

The setup described above is intended only for testing in a local environment. It is NOT recommended for a production installation, where the minimum number of proxy instances is 3.

For a comprehensive guide to the recommended production setup, check the documentation available at DataStax Migration.

There you'll find information about an Ansible-based tool that automates most of the process.

For information on the packaged dependencies of the Zero Downtime Migration (ZDM) Proxy and their licenses, check out our open source report.

For frequently asked questions, please refer to our separate FAQ page.
