Use Cassandra or DSE in Kubernetes with Cass Operator.
This topic explains how to use your configured and provisioned Apache Cassandra® or DSE cluster in Kubernetes.
Tip: If you haven't already, create a Kubernetes cluster. For a walkthrough of the steps – especially if you're new to Kubernetes – see the Google Kubernetes Engine (GKE) cloud example in this guide's topic, Create a Kubernetes cluster.
This topic assumes you've completed the steps to configure Cass Operator, and to provision and deploy a Cassandra or DSE cluster in your existing Kubernetes environment.
Connecting from inside the Kubernetes cluster
For an example of invoking cqlsh from inside a Kubernetes cluster, refer to Connect to Cassandra via cqlsh within Kubernetes cluster.
Cass Operator makes a Kubernetes headless service available at <clusterName>-<datacenterName>-service. Any client app that submits CQL commands inside the Kubernetes cluster should be able to connect to (for example) cluster1-dc1-service.my-db-cluster and use the nodes in a round-robin fashion as contact points.
In general, a Java driver client can supply either:
- multiple Inet[Socket]Address parameters to connect
- or one Inet[Socket]Address parameter to connect, in which case the Java driver uses it as the control connection, talks directly to the cluster to discover the other nodes, and connects to them
However, with version 4.x of the DataStax Java driver, when you specify a hostname in the contact points definition in the config file, the Java driver will first resolve all the hosts associated with the hostname. In the case of the cluster deployed by Cass Operator, using the hostname of the Kubernetes service (example: cluster1-dc1-service) resolves to all the IP addresses associated with the DSE cluster; that is, all the IPs of the DSE nodes. The DataStax Java driver chooses one of those nodes as the control connection, connects to the other resolved nodes, and performs cluster verification to all its connected local DSE nodes.
When you resolve the service hostname in your own code, you can:
- Use InetAddress.getByName("cluster1-dc1-service") to resolve only one host, which the driver will use at init time, and then connect to DSE to discover the rest of the nodes.
- Or use InetAddress.getAllByName("cluster1-dc1-service"), which will resolve to all the nodes directly, and the driver uses this setting as if you had specified the multiple IP addresses of the nodes in the contact points.
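The two resolution strategies described above can be sketched with the JDK alone. This is a minimal illustration, not the driver's own code: the class name ContactPoints and its methods are hypothetical, port 9042 is assumed to be the CQL native transport port in use, and "localhost" stands in for the service hostname so that the sketch runs outside a cluster.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds contact points from a service hostname.
class ContactPoints {

    // Follows the <clusterName>-<datacenterName>-service naming pattern.
    static String serviceHost(String clusterName, String datacenterName) {
        return clusterName + "-" + datacenterName + "-service";
    }

    // resolveAll = false uses InetAddress.getByName (one address);
    // resolveAll = true uses InetAddress.getAllByName (every address behind the name).
    static List<InetSocketAddress> resolve(String host, int port, boolean resolveAll) {
        List<InetSocketAddress> points = new ArrayList<>();
        try {
            if (resolveAll) {
                for (InetAddress address : InetAddress.getAllByName(host)) {
                    points.add(new InetSocketAddress(address, port));
                }
            } else {
                points.add(new InetSocketAddress(InetAddress.getByName(host), port));
            }
        } catch (UnknownHostException e) {
            // Surface DNS failures without forcing callers to handle a checked exception.
            throw new UncheckedIOException(e);
        }
        return points;
    }

    public static void main(String[] args) {
        // Inside the cluster, pass the service hostname as the first argument;
        // "localhost" is only a stand-in so the sketch runs anywhere.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = 9042; // assumption: CQL native transport port

        System.out.println("service name: " + serviceHost("cluster1", "dc1"));
        System.out.println("single host:  " + resolve(host, port, false));
        System.out.println("all hosts:    " + resolve(host, port, true));
    }
}
```

Run inside the Kubernetes cluster with the service hostname as the argument, the "all hosts" line would be expected to list one address per database pod behind the headless service, while the "single host" line lists only one.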
Connecting from outside the Kubernetes cluster
When applications run within a Kubernetes cluster, you'll often need to access those services from outside the cluster. Connecting to a Cassandra cluster running within Kubernetes can range from trivial to complex, depending on where the client is running, latency requirements, and security requirements. See Connect to Cassandra and apps from outside the Kubernetes cluster.
Scaling up the datacenter
The size parameter on the CassandraDatacenter determines how many Cassandra or DSE instances are present in the datacenter. To add more nodes, edit the YAML file that is described in the steps of the provisioning topic. Then reapply the CassandraDatacenter configuration using the same command shown in that prior topic:
kubectl -n my-db-ns apply -f ./cluster1-dc1.yaml
When you reapply the YAML with the additional nodes defined, Cass Operator restarts and Kubernetes adds the pods to your datacenter, provided there are sufficient Kubernetes worker nodes available.
Important: As part of the scaling up process, each rack in the Kubernetes cluster must contain the same number of server instances.
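Before editing the size value, it can help to check that the new total divides evenly across the racks. The following is a small sketch of that arithmetic only; the class RackBalance and its methods are hypothetical, and it assumes that "the same number of server instances" per rack means size must be a multiple of the rack count.

```java
// Hypothetical pre-flight check for a planned datacenter size.
class RackBalance {

    // True when `size` instances can be split evenly across `rackCount` racks.
    static boolean isBalanced(int size, int rackCount) {
        if (size < 1 || rackCount < 1) {
            throw new IllegalArgumentException("size and rackCount must be positive");
        }
        return size % rackCount == 0;
    }

    // Instances each rack would hold; only meaningful when isBalanced is true.
    static int nodesPerRack(int size, int rackCount) {
        if (!isBalanced(size, rackCount)) {
            throw new IllegalArgumentException(size + " does not divide evenly across " + rackCount + " racks");
        }
        return size / rackCount;
    }

    public static void main(String[] args) {
        int racks = 3; // assumption: three racks, as in the r1/r2/r3 pod names used in this topic
        for (int size : new int[] {3, 6, 7}) {
            System.out.println("size=" + size + " balanced=" + isBalanced(size, racks));
        }
    }
}
```

With three racks, sizes 3 and 6 pass the check and 7 does not.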
Changing the server configuration
To change the server configuration, open the CassandraDatacenter definition and edit the config section of the spec key. Then reapply the CassandraDatacenter configuration using the same command shown in that prior topic:
kubectl -n my-db-ns apply -f ./cluster1-dc1.yaml
Important: Cass Operator updates the config and restarts one node at a time in a rolling fashion.
Establishing a multi-datacenter cluster
To make a multi-datacenter cluster, create two CassandraDatacenter resources and give them the same clusterName in the spec.
Note: Multi-region clusters and advanced workloads are not supported, which makes many multi-datacenter use cases inappropriate for Cass Operator.
Using kubectl to monitor resources in the Kubernetes cluster
Use kubectl commands to get more information about the Cassandra or DSE pods running in the Kubernetes cluster.
- To get information about ongoing or recent events:
kubectl get event --all-namespaces
Note: By default, Kubernetes keeps each event for only one hour (its Time to Live, or TTL).
- To check for errors in the Kubernetes log for your operator's instance, use kubectl logs. First, get the instance name by using the kubectl get pod command, specifying your namespace. Example:
kubectl -n my-db-ns get pod
NAME                            READY   STATUS    RESTARTS   AGE
cass-operator-f74447c57-kdf2p   1/1     Running   0          13m
gke-cluster1-dc1-r1-sts-0       1/1     Running   0          5m38s
gke-cluster1-dc1-r2-sts-0       1/1     Running   0          42s
gke-cluster1-dc1-r3-sts-0       1/1     Running   0          6m7s
Then use kubectl logs. The log entries may be large; consider writing the output to a file. Example:
kubectl -n my-db-ns logs cass-operator-f74447c57-kdf2p > ~/cass-operator-log.txt
Tip: To tail the Cassandra/DSE logs, use a command such as:
kubectl -n my-db-ns logs --container server-system-logger --follow gke-cluster1-dc1-r1-sts-0
- You can also use the kubectl describe pod command to get identifying information about your pod. Example:
kubectl -n my-db-ns describe pod cass-operator-f74447c57-kdf2p
Name:           cass-operator-f74447c57-kdf2p
Namespace:      my-db-ns
Priority:       0
Node:           ip-10-101-34-70.srv101.myinternal.org/10.101.34.70
Start Time:     Thu, 10 Sep 2020 23:39:42 -0600
Labels:         name=cass-operator
                pod-template-hash=f74447c57
Annotations:    <none>
Status:         Running
IP:             10.244.2.2
IPs:
  IP:           10.244.2.2
Controlled By:  ReplicaSet/cass-operator-f74447c57
Containers:
  dse-operator:
    Container ID:   docker://bacfba382ed6be8893a0c344089d40fbb6c36db93a3e3677464390dd358fef35
    Image:          datastax/cass-operator:1.4.1-20200910
    Image ID:       docker-pullable://datastax/cass-operator@sha256:4e80f26c54594133a99adefc9e2e7e9b2b5915788d8c6b24457407e2d470a36a
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Thu, 10 Sep 2020 23:39:51 -0600
    Ready:          True
    Restart Count:  0
    Environment:
      WATCH_NAMESPACE:  my-db-ns (v1:metadata.namespace)
      POD_NAME:         cass-operator-f74447c57-kdf2p (v1:metadata.name)
      OPERATOR_NAME:    cass-operator
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from cass-operator-token-q9hq5 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  cass-operator-token-q9hq5:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cass-operator-token-q9hq5
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
What's next
Learn how to use the metric reporter dashboards for Cassandra or DSE clusters in Kubernetes.