Table of Contents
- Google Kubernetes Engine Stateful Application Demo
- Introduction
- Architecture
- Prerequisites
- Deployment Steps
- Validation
- Tear Down
- Troubleshooting
- Relevant Materials
Introduction
This proof of concept deploys a Kubernetes Engine Cluster and then installs an Apache Cassandra database running on that cluster.
This project contains various scripts that provide push-button creation, validation, and deletion of the Cassandra (C*) database and the Kubernetes Engine cluster.
Apache Cassandra was chosen for several reasons, including that out of the box Cassandra functions well as a cloud-native database. Moreover, this POC continues the work started with the Kubernetes example and the blog post: Thousand Instances of Cassandra using Kubernetes Pet Set.
When running a database on Kubernetes Engine, as an operator you need to be experienced with both Kubernetes Engine and the datasource that you are running. If you are able to use a hosted solution, it is recommended that you do so. On the other hand, if you are not able to run a datasource in a hosted solution, then hosting it in Kubernetes Engine is doable. The challenge is not Kubernetes or Kubernetes Engine itself, but migrating and operating a database in a container-based system that can do things like moving pods/containers between nodes.
Architecture
The intricacy of running stateful datastores on Kubernetes Engine, or K8s in general, lies in the various Kubernetes manifests and pod configurations used. A container that has been customized to run as a pod is used as well. This demo includes multiple configured manifests and a specialized container for Cassandra.
Many people, including Kelsey Hightower, have talked about how running stateful datastores inside of K8s is not trivial, and frankly quite complicated. If you can run a database on a managed service, do it. Let someone else wake up at 2 am to fix your database.
There are two possibilities when you run a stateful datastore on K8s.
- You are a very experienced K8s user and know exactly what is around the corner.
- You have no idea what is around the corner, and you are going to learn very fast.
Architecture Diagram
The following diagram represents a Cassandra cluster deployed on Kubernetes Engine.
Kubernetes Manifests and Controllers
Various manifests and controllers are utilized to install Cassandra. The following sections outline the different types used.
StatefulSets
StatefulSet is the controller type used to install a Cassandra ring on Kubernetes Engine. This controller manages the installation and scaling of a set of Pods and provides various features:
- Stable, unique network identifiers.
- Stable, persistent storage.
- Ordered, graceful deployment and scaling.
- Ordered, graceful deletion and termination.
- Ordered, automated rolling updates.
- Automated creation of storage volumes.
- Stable guaranteed storage.
Like a Deployment, a StatefulSet manages Pods that are based on an identical container spec. Unlike a Deployment, a StatefulSet maintains a sticky identity for each of its Pods. These pods are created from the same spec, but are not interchangeable: each has a persistent identifier that it maintains across any rescheduling.
Cassandra utilizes all of these features in order to run on Kubernetes Engine.
Find more information here.
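As a minimal sketch of this controller type (the replica count and labels below are illustrative assumptions, not the demo's actual manifest, which lives in the manifests directory):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra        # headless Service that provides the stable network identities
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
      - name: cassandra
        image: gcr.io/pso-examples/cassandra:3.11.3-cqlsh-v22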
Persistent Storage with Persistent Volumes
With a stateful datasource, storage is required.
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator, or automatically on behalf of a StatefulSet. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual pod that uses the PV.
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., can be mounted once read/write or many times read-only).
Find more information here.
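As a sketch of the automatic provisioning mentioned above, a StatefulSet can carry a volumeClaimTemplates section from which one PVC is created per pod; the size and access mode below are illustrative assumptions:

# Nested under the StatefulSet spec. This produces PVCs named
# cassandra-data-cassandra-0, cassandra-data-cassandra-1, and so on,
# and the dynamic provisioner creates a matching PV for each.
volumeClaimTemplates:
- metadata:
    name: cassandra-data
  spec:
    accessModes: ["ReadWriteOnce"]
    resources:
      requests:
        storage: 10Gi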
Pod Disruption Budgets
An Application Owner can create a PodDisruptionBudget object (PDB) for each application. A PDB limits the number of pods of a replicated application that are down simultaneously from voluntary disruptions. For example, with Cassandra we need to ensure that the number of replicas running is never brought below the number needed for a quorum.
Find more information here.
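A minimal sketch of such a PDB, assuming a three-node ring with a quorum of two (the object name and selector labels are illustrative):

apiVersion: policy/v1beta1      # policy/v1 on Kubernetes 1.21 and later
kind: PodDisruptionBudget
metadata:
  name: cassandra-pdb
spec:
  minAvailable: 2               # never let voluntary disruptions drop the ring below quorum
  selector:
    matchLabels:
      app: cassandra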
Pod Scheduling
This proof of concept utilizes advanced scheduling for pod placement. Both scheduling anti-affinity rules and taints are used to instruct the K8s scheduler where to launch C* pods.
Anti-affinity rules instruct the scheduler to use either preferred or required pod placement. Preferred rules provide a weight value that influences the scheduling algorithm to avoid placing C* pods on the same node, while required rules force the scheduler to never place C* pods on the same node. Not having multiple Cassandra pods on the same node increases fault tolerance: when you lose a node you only lose one pod. With required rules, however, you need an excess of nodes, because a displaced pod will not reschedule onto a node that already runs a C* pod. Preferred rules do not provide the same level of high availability, but allow pods to reschedule onto existing nodes if headroom exists. A sketch of both flavors appears below.
Find more information here.
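Both flavors, nested under the pod template's spec (the labels are illustrative assumptions):

affinity:
  podAntiAffinity:
    # Required: the scheduler will never co-locate two Cassandra pods on one node.
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: cassandra
      topologyKey: kubernetes.io/hostname
    # Preferred variant instead: weighted, allows co-location when no other node fits.
    # preferredDuringSchedulingIgnoredDuringExecution:
    # - weight: 100
    #   podAffinityTerm:
    #     labelSelector:
    #       matchLabels:
    #         app: cassandra
    #     topologyKey: kubernetes.io/hostname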
Taints prevent pods from being scheduled on a node unless the pod's manifest contains the required toleration. The typical use case for this is to target a specific node pool for, say, Cassandra pods. For instance, larger machine types are often required for C*, and adding taints to those nodes ensures that Cassandra pods will only be scheduled on them.
Find more information here.
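As a sketch, with an illustrative taint key and value:

# Taint the nodes in the dedicated pool, for example:
#   kubectl taint nodes <node-name> dedicated=cassandra:NoSchedule
# Then add a matching toleration to the Cassandra pod spec:
tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "cassandra"
  effect: "NoSchedule"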
DNS and Headless Services
The Cassandra ring that is installed as part of this demo is named "cassandra". The Service named "cassandra" creates the DNS name, and the domain suffix is "default.svc.cluster.local", where "default" is the name of the namespace. An application client would connect to at least three of the nodes to provide a highly available connection. The node names, for example, would be:
cassandra-0.cassandra.default.svc.cluster.local
cassandra-1.cassandra.default.svc.cluster.local
cassandra-2.cassandra.default.svc.cluster.local
For more information, visit here.
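For reference, a headless Service is an ordinary Service with clusterIP set to None; a minimal sketch matching the names above (port 9042 is Cassandra's standard CQL port, as seen in the cqlsh output later in this document):

apiVersion: v1
kind: Service
metadata:
  name: cassandra
spec:
  clusterIP: None       # headless: DNS resolves directly to the pod IPs
  selector:
    app: cassandra
  ports:
  - name: cql
    port: 9042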
Cassandra Container
Within this demo we have included a container that is specifically built for running on Kubernetes. Features:
- OSS maintained base container that is used by K8s OSS project
- Cassandra is installed from a tarball
- Container size has been reduced
- Readiness and liveness probes are built in
- Prometheus is installed, but the port is not exposed in the manifest by default
- dumb-init is used to handle signal processing
- A logging encoder is provided for JSON logging
- Datacenter configuration
- JMX configuration
- Multiple ENV variables are exposed for manifest based configuration
For more information about the ENV variables that have been exposed, see container/README.md, container/files/run.sh, and also the manifest.
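As an illustrative sketch only (the variable names and values below are assumptions; the authoritative list is in the files referenced above), this configuration is applied through the container's env section in the pod spec:

containers:
- name: cassandra
  image: gcr.io/pso-examples/cassandra:3.11.3-cqlsh-v22
  env:
  - name: CASSANDRA_SEEDS              # assumed name; check container/README.md
    value: "cassandra-0.cassandra.default.svc.cluster.local"
  - name: CASSANDRA_CLUSTER_NAME       # assumed name; the demo ring reports "K8Demo"
    value: "K8Demo"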
This container is hosted by Google Professional Services.
Prerequisites
A Google Cloud account and project is required for this demo. The project must have the proper quota to run a Kubernetes Engine cluster with at least 3 n1-standard-4 and 3 n1-standard-1 nodes. Additionally, the project must have at least 30 Compute Engine API CPUs and 12 Compute Engine API In-use IP Addresses.
Access to an existing Google Cloud project with the Kubernetes Engine service enabled is required. If you do not have a Google Cloud account, please sign up for a free trial here.
How to check your account's quota is documented here: quotas.
Run Demo in a Google Cloud Shell
Click the button below to run the demo in a Google Cloud Shell.
All the tools for the demo are installed. When using Cloud Shell, execute the following command to set up the gcloud CLI. When executing this command, set your region and zone.
gcloud init
Supported Operating Systems
This project will run on macOS or in a Google Cloud Shell.
Tools
When not using Cloud Shell, the following tools are required.
- Google Cloud SDK version >= 204.0.0
- gcloud cli
- kubectl matching the latest GKE version
- docker (used to build the container; you can use the hosted container instead)
More recent versions of all the tools may function; please feel free to file an issue if you encounter problems with newer versions.
Install Cloud SDK
The Google Cloud SDK is used to interact with your GCP resources. Installation instructions for multiple platforms are available online.
Install kubectl CLI
The kubectl CLI is used to interact with both Kubernetes Engine and Kubernetes in general. Installation instructions for multiple platforms are available online.
Configure Authentication
The script will execute against your GCP environment and use your personal account to build out these resources. To set up the default account the script will use, run the following command and select the appropriate account:
gcloud auth login
Deployment Steps
To deploy the demo execute the following commands.
git clone https://github.com/GoogleCloudPlatform/gke-stateful-applications-demo
cd gke-stateful-applications-demo
./create.sh -c my-cluster-1
Replace the text 'my-cluster-1' with the name of the cluster that you would like to create.
The create script will output the following message when complete.
NAME MACHINE_TYPE DISK_SIZE_GB NODE_VERSION
nodepool-cassdemo-2 n1-standard-1 100 1.10.2-gke.3
The script will:
- create a new Kubernetes Engine cluster in your default ZONE, VPC and network.
- install multiple Kubernetes manifests that can be found in the manifests directory.
Because we are creating a cluster of 6 nodes, the cluster may go into a 'RECONCILING' status while the control plane's instance size is increased.
Use the following command to view the current status of the cluster.
gcloud container clusters list
An example of the output while the cluster is reconciling.
NAME LOCATION MASTER_VERSION MASTER_IP MACHINE_TYPE NODE_VERSION NUM_NODES STATUS
my-cluster-1 us-central1-a 1.10.2-gke.3 35.184.70.165 n1-standard-4 1.10.2-gke.3 6 RECONCILING
The status will change to RUNNING once the masters have been updated.
Validation
The following script will validate that the demo is deployed correctly:
./validate.sh -c my-cluster-1
Replace the text 'my-cluster-1' with the name of the cluster that you would like to validate.
The validation script uses kubectl rollout status to test whether the rollout is complete. If the cluster is in the 'RECONCILING' state this script will fail as well.
If the script fails it will output:
Validation Failed: Statefulset has not been deployed
If the script passes it will output:
Validation Passed: the Statefulset has been deployed
Using Cassandra
These commands exercise the Cassandra cluster.
Launch a Cassandra container
These kubectl commands launch a K8s deployment, wait for the deployment, and exec into the shell.
kubectl run cass-dev --image gcr.io/pso-examples/cassandra:3.11.3-cqlsh-v22 --command -- /bin/bash -c "tail -f /dev/null"
kubectl rollout status deployment cass-dev
kubectl exec $(kubectl get po --no-headers | grep cass-dev | awk '{print $1}') -it -- /bin/bash
This will launch a bash prompt for the next steps.
Connect to the ring with cqlsh
Run the following cqlsh command:
/usr/local/apache-cassandra/bin/cqlsh cassandra-0.cassandra.default.svc.cluster.local
You will now be using the interactive cqlsh shell.
The output of the command:
Connected to K8Demo at cassandra-0.cassandra.default.svc.cluster.local:9042.
[cqlsh 5.0.1 | Cassandra 3.11.3 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh>
Create a keyspace
CREATE KEYSPACE greek_monsters WITH REPLICATION = { 'class' : 'SimpleStrategy' , 'replication_factor' : 3 };
The user prompt is shown if the command is successful.
Describe the keyspace
DESC greek_monsters;
Example output:
CREATE KEYSPACE greek_monsters WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
cqlsh>
Use the new keyspace
USE greek_monsters;
The prompt will now include the keyspace that is selected.
cqlsh:greek_monsters>
Create a table
CREATE TABLE monsters (pet_id timeuuid, name text, description text, PRIMARY KEY (pet_id));
The user prompt is shown if the command is successful.
See the newly created table
DESCRIBE TABLES;
This command will output:
monsters
cqlsh:greek_monsters>
Insert data into the table
INSERT INTO monsters (pet_id,name,description) VALUES (now(), 'Cerberus (Hellhound)','The three-headed giant hound, that guarded the gates of the Underworld.');
INSERT INTO monsters (pet_id,name,description) VALUES (now(), 'Orthrus','A two-headed dog, brother of Cerberus, slain by Heracles.');
INSERT INTO monsters (pet_id,name,description) VALUES (now(), 'Argos','Odysseus faithful dog, known for his speed, strength and his superior tracking skills.');
INSERT INTO monsters (pet_id,name,description) VALUES (now(), 'Golden Dog','A dog which guarded the infant god Zeus.');
INSERT INTO monsters (pet_id,name,description) VALUES (now(), 'Guard Dog of Hephaestus Temple','The temple of Hephaestus at Mount Etna was guarded by a pack of sacred dogs.');
INSERT INTO monsters (pet_id,name,description) VALUES (now(), 'Laelaps','A female dog destined always to catch its prey.');
INSERT INTO monsters (pet_id,name,description) VALUES (now(), 'Maera','The hound of Erigone, daughter of Icarius of Athens.');
The user prompt is shown if the command is successful.
Select the data from the table
SELECT * from monsters ;
Example output:
pet_id | description | name
--------------------------------------+----------------------------------------------------------------------------------------+--------------------------------
ca3d9f20-6a89-11e8-a0dd-114bb9e30b07 | Odysseus faithful dog, known for his speed, strength and his superior tracking skills. | Argos
ca3c3f90-6a89-11e8-a0dd-114bb9e30b07 | A two-headed dog, brother of Cerberus, slain by Heracles. | Orthrus
ca42f650-6a89-11e8-a0dd-114bb9e30b07 | The hound of Erigone, daughter of Icarius of Athens. | Maera
ca3f4cd0-6a89-11e8-a0dd-114bb9e30b07 | A dog which guarded the infant god Zeus. | Golden Dog
ca41bdd0-6a89-11e8-a0dd-114bb9e30b07 | A female dog destined always to catch its prey. | Laelaps
ca40ac60-6a89-11e8-a0dd-114bb9e30b07 | The temple of Hephaestus at Mount Etna was guarded by a pack of sacred dogs. | Guard Dog of Hephaestus Temple
ca3a6ad0-6a89-11e8-a0dd-114bb9e30b07 | The three-headed giant hound, that guarded the gates of the Underworld. | Cerberus (Hellhound)
Exit
Exit the pod with the following commands:
exit
exit
This will return you to your command line.
Delete the deployment
Execute the following command to remove the deployment.
kubectl delete deployment cass-dev
The following message is displayed:
deployment "cass-dev" deleted
Tear Down
The following script will tear down the Cassandra cluster and remove the Kubernetes Engine cluster.
./delete.sh -c my-cluster-1
Replace the text 'my-cluster-1' with the name of the cluster that you would like to delete. The output will change depending on the cluster name; in this example the name "my-cluster-1" was used. The last lines of the output will be:
persistentvolumeclaim "cassandra-data-cassandra-0" deleted
persistentvolumeclaim "cassandra-data-cassandra-1" deleted
persistentvolumeclaim "cassandra-data-cassandra-2" deleted
Deleting cluster
The tear down script removes all of the components in the manifests directory, and it also destroys the cluster. This script also waits 60 seconds before it removes the PVC storage components.
Troubleshooting
- Since we are creating two node pools in this demo, the cluster may upgrade the control plane and go into a "RECONCILING" state. Give the cluster some time and it will finish reconciling. The validate script will fail during this time.
- Run "gcloud container clusters list" command to check the cluster status. Cluster
- If you get errors about quotas, please increase your quota in the project. See here for more details.
- A great diagnostic command to run is simply kubectl get pods. It will show the running pods.
- Initially, the cluster may show as "RUNNING" but then go into a "RECONCILING" state. The symptom will be timeouts when running kubectl get pods. Use the "gcloud container clusters list" command to check the latest state, and wait until it changes back to "RUNNING".
Relevant Materials
- Thousand Instances of Cassandra
- Stateful Sets
- Persistent Volumes
- Pod Disruption Budgets
- Pod Scheduling
- Taints and Tolerations
- Headless Services
- Google Cloud Quotas
- Signup for Google Cloud
- Google Cloud Shell
This is not an officially supported Google product