Today we will explore persistent storage for Cassandra on Kubernetes with OpenEBS.
What we will be doing
We will deploy a k3s Kubernetes cluster on Civo, deploy a Cassandra cluster on top of it, write some data to it, and then test persistence by deleting all of the Cassandra pods.
Taken from their GitHub page, OpenEBS describes itself as the leading open-source container attached storage solution: built using a cloud-native architecture, it simplifies running stateful applications on Kubernetes.
Deploy Kubernetes
Create a new Civo k3s cluster with 3 nodes:
$ civo kubernetes create demo-cluster --size=g2.small --nodes=3 --wait
Append the new cluster's Kubernetes config to your kubeconfig file:
$ civo kubernetes config demo-cluster --save
Merged config into ~/.kube/config
Switch the context to the new cluster:
$ kubectx demo-cluster
Switched to context "demo-cluster".
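Before installing anything on the cluster, it is worth a quick check that all three nodes have joined and are in a Ready state (output omitted here):

$ kubectl get nodes -o wide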
Install OpenEBS
Deploy OpenEBS to your Kubernetes cluster by applying the operator manifest, which can be found on their GitHub page:
$ kubectl apply -f https://openebs.github.io/charts/openebs-operator.yaml
Give it some time for all the pods to check in, then verify that all the pods in the openebs namespace are ready:
$ kubectl get pods -n openebs
NAME                                           READY   STATUS    RESTARTS   AGE
openebs-provisioner-c68bfd6d4-5n7kk            1/1     Running   0          5m52s
openebs-ndm-df25v                              1/1     Running   0          5m51s
openebs-snapshot-operator-7ffd685677-h8jpf     2/2     Running   0          5m52s
openebs-admission-server-889d78f96-t64m6       1/1     Running   0          5m50s
openebs-ndm-44k6n                              1/1     Running   0          5m51s
openebs-ndm-hpmg8                              1/1     Running   0          5m51s
openebs-localpv-provisioner-67bddc8568-shkqr   1/1     Running   0          5m49s
openebs-ndm-operator-5db67cd5bb-mfg5m          1/1     Running   0          5m50s
maya-apiserver-7f664b95bb-l87bm                1/1     Running   0          5m53s
We can now see the list of storage classes that come with OpenEBS, alongside the k3s default local-path class:
$ kubectl get sc
NAME                        PROVISIONER                                                 AGE
local-path (default)        rancher.io/local-path                                       28m
openebs-jiva-default        openebs.io/provisioner-iscsi                                5m43s
openebs-snapshot-promoter   volumesnapshot.external-storage.k8s.io/snapshot-promoter   5m41s
openebs-hostpath            openebs.io/local                                            5m41s
openebs-device              openebs.io/local                                            5m41s
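We will be using the openebs-jiva-default class in the next section. If you are curious about the defaults it ships with, you can inspect the class and its annotations (output omitted):

$ kubectl describe sc openebs-jiva-default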
Deploy Cassandra
Deploy the Cassandra Service to your Kubernetes cluster:
$ kubectl apply -f https://raw.githubusercontent.com/ruanbekker/blog-assets/master/civo.com-openebs-kubernetes/manifests/cassandra/cassandra-service.yaml
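The Service is what gives the Cassandra pods a stable DNS name inside the cluster; later on we will connect to the host cassandra on port 9042 through it. If you want to see exactly what was created (for a StatefulSet this is typically a headless Service), you can dump the applied object back out:

$ kubectl get svc cassandra -o yaml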
The Cassandra StatefulSet has 3 replicas, defined in the manifest as replicas: 3. I am also using the Jiva storage engine, selected with the annotation volume.beta.kubernetes.io/storage-class: openebs-jiva-default, which is well suited for running replicated block storage on Kubernetes worker nodes that only have ephemeral storage.
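For reference, the parts of the StatefulSet that matter for storage typically look something like the sketch below. This is an illustrative excerpt written out as a heredoc, not a copy of the linked manifest: the image, mount path and most other fields are assumptions, but the replica count, claim template name, access mode, 5G size and storage-class annotation match what we will see in the cluster.

$ cat <<'EOF' > cassandra-statefulset-sketch.yaml   # illustrative sketch only; apply the linked manifest instead
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra            # the Service deployed in the previous step
  replicas: 3                       # gives us cassandra-0, cassandra-1 and cassandra-2
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
        - name: cassandra
          image: gcr.io/google-samples/cassandra:v13   # assumed image; check the real manifest
          ports:
            - containerPort: 9042   # CQL native transport port
          volumeMounts:
            - name: cassandra-data
              mountPath: /var/lib/cassandra            # assumed data directory
  volumeClaimTemplates:
    - metadata:
        name: cassandra-data        # results in PVCs named cassandra-data-cassandra-N
        annotations:
          volume.beta.kubernetes.io/storage-class: openebs-jiva-default   # selects the Jiva engine
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 5G             # matches the 5G volumes we will see under kubectl get pv
EOF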
Let's deploy the Cassandra StatefulSet to your Kubernetes cluster:
$ kubectl apply -f https://raw.githubusercontent.com/ruanbekker/blog-assets/master/civo.com-openebs-kubernetes/manifests/cassandra/cassandra-statefulset.yaml
Give it some time for all the pods to check in, then have a look at the pods:
$ kubectl get pods
NAME                                                             READY   STATUS    RESTARTS   AGE
pvc-b29a4e4b-20d7-4fd4-83d9-a155ab5c1a93-rep-ffb5b9f9-7nwfl      1/1     Running   0          5m39s
pvc-b29a4e4b-20d7-4fd4-83d9-a155ab5c1a93-rep-ffb5b9f9-wr4hl      1/1     Running   0          5m39s
pvc-b29a4e4b-20d7-4fd4-83d9-a155ab5c1a93-ctrl-6f95fd555-ldd86    2/2     Running   0          5m53s
pvc-b29a4e4b-20d7-4fd4-83d9-a155ab5c1a93-rep-ffb5b9f9-2v75w      1/1     Running   1          5m39s
cassandra-0                                                      1/1     Running   0          5m53s
pvc-2fe5a138-6c1d-4bcd-a0e4-e653793abea2-ctrl-bc4769cb4-c8stc    2/2     Running   0          3m40s
pvc-2fe5a138-6c1d-4bcd-a0e4-e653793abea2-rep-7565997776-w2gqs    1/1     Running   0          3m37s
pvc-2fe5a138-6c1d-4bcd-a0e4-e653793abea2-rep-7565997776-jk6c2    1/1     Running   0          3m37s
pvc-2fe5a138-6c1d-4bcd-a0e4-e653793abea2-rep-7565997776-qq9bb    1/1     Running   1          3m37s
cassandra-1                                                      1/1     Running   0          3m40s
pvc-a1ff3db9-d2fd-43a8-9095-c3529fe2e456-ctrl-55887dc66f-zjbx8   2/2     Running   0          117s
pvc-a1ff3db9-d2fd-43a8-9095-c3529fe2e456-rep-7fbc8785b8-cqvdb    1/1     Running   0          114s
pvc-a1ff3db9-d2fd-43a8-9095-c3529fe2e456-rep-7fbc8785b8-hclgs    1/1     Running   0          114s
pvc-a1ff3db9-d2fd-43a8-9095-c3529fe2e456-rep-7fbc8785b8-85nqt    1/1     Running   1          114s
cassandra-2                                                      0/1     Running   0          118s
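Alongside the three cassandra-N pods, OpenEBS has spun up a set of pods per volume: one controller (ctrl) pod and, as we can see from the output above, three replica (rep) pods, which is how Jiva replicates each block volume across nodes. To see only the pods backing a single volume, you can grep for its PV name, for example:

$ kubectl get pods | grep pvc-b29a4e4b-20d7-4fd4-83d9-a155ab5c1a93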
When we look at our persistent volumes:
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                STORAGECLASS           REASON   AGE
pvc-b29a4e4b-20d7-4fd4-83d9-a155ab5c1a93   5G         RWO            Delete           Bound    default/cassandra-data-cassandra-0   openebs-jiva-default            12m
pvc-2fe5a138-6c1d-4bcd-a0e4-e653793abea2   5G         RWO            Delete           Bound    default/cassandra-data-cassandra-1   openebs-jiva-default            10m
pvc-a1ff3db9-d2fd-43a8-9095-c3529fe2e456   5G         RWO            Delete           Bound    default/cassandra-data-cassandra-2   openebs-jiva-default            8m49s
And our persistent volume claims:
$ kubectl get pvc
NAME                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS           AGE
cassandra-data-cassandra-0   Bound    pvc-b29a4e4b-20d7-4fd4-83d9-a155ab5c1a93   5G         RWO            openebs-jiva-default   13m
cassandra-data-cassandra-1   Bound    pvc-2fe5a138-6c1d-4bcd-a0e4-e653793abea2   5G         RWO            openebs-jiva-default   10m
cassandra-data-cassandra-2   Bound    pvc-a1ff3db9-d2fd-43a8-9095-c3529fe2e456   5G         RWO            openebs-jiva-default   9m11s
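If you want more detail on any one of these volumes, such as its claim, storage class and annotations, you can describe the PV (output omitted):

$ kubectl describe pv pvc-b29a4e4b-20d7-4fd4-83d9-a155ab5c1a93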
Interact with Cassandra
View the cluster status:
$ kubectl exec cassandra-0 -- nodetool status
Datacenter: dc1-civo-k3s
========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens   Owns (effective)   Host ID                                Rack
UN  192.168.1.9   83.16 KiB  32       63.4%              f3debec3-77c7-438e-bf86-38630fa01c14   rack1-civo-k3s
UN  192.168.0.10  65.6 KiB   32       66.1%              3f78ec71-7485-477a-a095-44aa8fade0bd   rack1-civo-k3s
UN  192.168.2.12  84.73 KiB  32       70.5%              0160da34-5975-43d9-ab89-4c92cfe4ceeb   rack1-civo-k3s
Now let's connect to our Cassandra cluster:
$ kubectl exec -it cassandra-0 -- cqlsh cassandra 9042 --cqlversion="3.4.2"
Connected to civo-k3s at cassandra:9042.
[cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh>
Then create a keyspace and a table, and write some dummy data to our cluster:
cqlsh> create keyspace test with replication = {'class': 'SimpleStrategy', 'replication_factor': 2 };
cqlsh> use test;
cqlsh:test> create table people (id uuid, name varchar, age int, PRIMARY KEY ((id), name));
cqlsh:test> insert into people (id, name, age) values(uuid(), 'ruan', 28);
cqlsh:test> insert into people (id, name, age) values(uuid(), 'samantha', 25);
cqlsh:test> insert into people (id, name, age) values(uuid(), 'stefan', 29);
cqlsh:test> insert into people (id, name, age) values(uuid(), 'james', 25);
cqlsh:test> insert into people (id, name, age) values(uuid(), 'michelle', 30);
cqlsh:test> insert into people (id, name, age) values(uuid(), 'tim', 32);
Now let's read the data:
cqlsh:test> select * from people;

 id                                   | name     | age
--------------------------------------+----------+-----
 e3a2261d-77d3-4ed6-9daf-51e65ccf618f |      tim |  32
 594b24ee-e5bc-44cf-87cf-afed48a74df9 | samantha |  25
 ca65c207-8dd0-4dc0-99e3-1a0301128921 | michelle |  30
 1eeb2b77-57d0-4e10-b5d7-6dc58c43006e |    james |  25
 3cc12f26-395d-41d1-a954-f80f2c2ff88d |     ruan |  28
 755d52f2-5a27-4431-8591-7d34bacf7bee |   stefan |  29

(6 rows)
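As an optional extra step before we start deleting pods, you can flush Cassandra's memtables to SSTables on disk, so the rows we just wrote are definitely sitting on the persistent volume rather than only in memory and the commit log:

$ kubectl exec cassandra-0 -- nodetool flush test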
Test Data Persistence
Now, it's time to delete our pods and see if our data is persisted. First we will look at our pods to determine how long they have been running and which nodes they are running on:
$ kubectl get pods --selector app=cassandra -o wide
NAME          READY   STATUS    RESTARTS   AGE   IP             NODE               NOMINATED NODE   READINESS GATES
cassandra-0   1/1     Running   0          35m   192.168.0.10   kube-master-75a2   <none>           <none>
cassandra-1   1/1     Running   0          33m   192.168.1.9    kube-node-5d0a     <none>           <none>
cassandra-2   1/1     Running   0          30m   192.168.2.12   kube-node-aca8     <none>           <none>
Let's delete all our Cassandra pods:
$ kubectl delete pod/cassandra-{0..2}
pod "cassandra-0" deleted
pod "cassandra-1" deleted
pod "cassandra-2" deleted
As we can see, the first pod is busy starting up again:
$ kubectl get pods --selector app=cassandra -o wide
NAME          READY   STATUS    RESTARTS   AGE   IP             NODE             NOMINATED NODE   READINESS GATES
cassandra-0   0/1     Running   0          37s   192.168.2.13   kube-node-aca8   <none>           <none>
Give it some time to allow all three pods to check in:
$ kubectl get pods --selector app=cassandra -o wide
NAME          READY   STATUS    RESTARTS   AGE     IP             NODE               NOMINATED NODE   READINESS GATES
cassandra-0   1/1     Running   0          4m36s   192.168.2.13   kube-node-aca8     <none>           <none>
cassandra-1   1/1     Running   0          3m24s   192.168.0.17   kube-master-75a2   <none>           <none>
cassandra-2   1/1     Running   0          2m3s    192.168.1.12   kube-node-5d0a     <none>           <none>
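You can also re-run nodetool status at this point to confirm that the ring has re-formed with the pods' new IP addresses:

$ kubectl exec cassandra-0 -- nodetool status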
Right, now that all pods have checked in, let's connect to our cluster and see if the data is still there:
$ kubectl exec -it cassandra-0 -- cqlsh cassandra 9042 --cqlversion="3.4.2"
Connected to civo-k3s at cassandra:9042.
[cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh> use test;
cqlsh:test> select * from people;

 id                                   | name     | age
--------------------------------------+----------+-----
 e3a2261d-77d3-4ed6-9daf-51e65ccf618f |      tim |  32
 594b24ee-e5bc-44cf-87cf-afed48a74df9 | samantha |  25
 ca65c207-8dd0-4dc0-99e3-1a0301128921 | michelle |  30
 1eeb2b77-57d0-4e10-b5d7-6dc58c43006e |    james |  25
 3cc12f26-395d-41d1-a954-f80f2c2ff88d |     ruan |  28
 755d52f2-5a27-4431-8591-7d34bacf7bee |   stefan |  29

(6 rows)

cqlsh:test> exit;
And as you can see, the data was persisted across the pod restarts.
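If you were only following along for the demo, you can tear the cluster down again with the Civo CLI once you are done (double-check the cluster name before you run this):

$ civo kubernetes remove demo-cluster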
Thank You
I am super impressed with OpenEBS. Have a look at their docs, and also at the examples on their GitHub page.
Let me know what you think. If you liked my content, feel free to visit me at ruan.dev or follow me on Twitter at @ruanbekker.