In Apache Cassandra Lunch #41: Cassandra on Kubernetes – Docker/Kubernetes/Helm Part 1, we discuss Cassandra on Kubernetes and give an introduction to Docker, Kubernetes, and Helm. The live recording of Cassandra Lunch, which includes a more in-depth discussion and a demo, is embedded below in case you were not able to attend live. If you would like to attend Apache Cassandra Lunch live, it is hosted every Wednesday at 12 PM EST. Register here now!
This post introduces containerization technologies and how they work, then walks through the general Kubernetes architecture, Helm charts, and Operators as they relate to running Cassandra on Kubernetes.
Containers: “Get with the program”
- Docker – One “Dockerfile” becomes an Image which runs as one container on a node / host.
- Docker-Compose – Runs multiple containers on a single node / host as one composition. (Kompose can convert a Compose file into Kubernetes resources.)
- Docker Swarm – Runs Compose-style multi-container applications at scale across many nodes / hosts.
- Mesosphere/DCOS – Built on Mesos, the granddaddy of containerization, which went on to adopt both Docker and Kubernetes.
- Kubernetes – The de facto standard for deploying large containerized applications.
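To make the step from a single container to a composition more concrete, here is a minimal sketch that builds a Compose-style definition with one Cassandra service. It was not part of the talk. The image tag `cassandra:4.0`, the volume name `cassandra-data`, and the helper name `cassandra_compose` are assumptions chosen for illustration. The definition is emitted as JSON, which is also valid YAML.

```python
import json


def cassandra_compose(image="cassandra:4.0", cql_port=9042):
    """Build a Compose-style definition with a single Cassandra service.

    The image tag and volume name are illustrative assumptions.
    """
    return {
        "services": {
            "cassandra": {
                "image": image,
                # Map a host port to Cassandra's CQL native transport port.
                "ports": [f"{cql_port}:9042"],
                # Keep the data directory on a named volume so it survives
                # container restarts.
                "volumes": ["cassandra-data:/var/lib/cassandra"],
            }
        },
        "volumes": {"cassandra-data": {}},
    }


if __name__ == "__main__":
    print(json.dumps(cassandra_compose(), indent=2))
```

Adding more entries under `services` is what turns a single container into a composition on one host. Swarm and Kubernetes take the same idea across many hosts.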
Kubernetes General Architecture
- Nodes (minimum of 3 physical nodes)
- Master Nodes / Control Plane (at least 1)
- A master node controls and manages a set of worker nodes (the workload runtime); together they form a cluster in Kubernetes. … All external communication with the cluster goes through the API server. The kube-controller-manager runs a set of controllers for the running cluster.
- If a master node fails, workloads already running on the worker nodes keep running, but without a working control plane no new pods can be scheduled and Service membership changes are not applied until a master is available again. If a worker node fails, the master stops receiving updates from that worker node.
- High-availability (multi-master) setups place a load balancer in front of the API servers, which serve both the clients and the worker nodes.
- Worker Nodes (at least 2)
- The worker nodes run the workloads. Each worker node runs a kubelet and a kube-proxy, both of which work with the pods running in the container runtime (such as Docker). … A kubelet takes information from the master node and ensures that any pods assigned to it are running and configured in the desired state. All Kubernetes nodes must have a kubelet.
- Containers
- By themselves, containers don’t do much in Kubernetes.
- A container needs to be part of a Pod to run.
- Pods
- Pods are the smallest, most basic deployable objects in Kubernetes. A Pod represents a single instance of a running process in your cluster. Pods contain one or more containers, such as Docker containers. When a Pod runs multiple containers, the containers are managed as a single entity and share the Pod’s resources.
- Workloads
- Deployments / ReplicaSet (Stateless)
- Node.js Express
- Python Flask
- Java Spring Boot
- StatefulSets (Stateful)
- MySQL
- Cassandra
- Leverage PersistentVolumes, which are managed as cluster resources.
- DaemonSet
- Runs a copy of a Pod on every node, like a “sidecar” local to that node
- Monitoring / Management
- Job / CronJob
- Tasks that run and stop.
- Job – one-off tasks
- CronJob – recurring tasks
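The way Pods, StatefulSets, and PersistentVolumes fit together for a database like Cassandra can be sketched as a manifest. The example below is a rough illustration, not the setup used in the demo. It builds a headless Service and a StatefulSet as Python dictionaries and prints them as JSON, which `kubectl` accepts alongside YAML. The image tag `cassandra:4.0`, the resource names, the storage size, and both helper functions are assumptions. Production settings such as seed configuration, resource limits, and probes are left out.

```python
import json


def headless_service(name="cassandra"):
    """Headless Service that gives each StatefulSet Pod a stable DNS name."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "clusterIP": "None",  # "None" is what makes the Service headless
            "selector": {"app": name},
            "ports": [{"name": "cql", "port": 9042}],
        },
    }


def cassandra_statefulset(name="cassandra", replicas=3, storage="10Gi"):
    """StatefulSet whose Pods each get their own PersistentVolumeClaim."""
    container = {
        "name": "cassandra",
        "image": "cassandra:4.0",  # assumed image tag
        "ports": [{"containerPort": 9042, "name": "cql"}],
        "volumeMounts": [{"name": "data", "mountPath": "/var/lib/cassandra"}],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # must match the headless Service name
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {"containers": [container]},
            },
            # One claim per Pod; each claim is bound to a PersistentVolume.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


if __name__ == "__main__":
    manifest = {
        "apiVersion": "v1",
        "kind": "List",
        "items": [headless_service(), cassandra_statefulset()],
    }
    print(json.dumps(manifest, indent=2))
```

If the script were saved as `statefulset.py` (a hypothetical file name), its output could be piped to `kubectl apply -f -`. A Deployment for a stateless service would look similar but without `serviceName` and `volumeClaimTemplates`, which is the main structural difference between the two workload types.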
Helm Charts
- Components
- The client (CLI), which lives on your local workstation.
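- The server (Tiller), which lives on the Kubernetes cluster to execute what’s needed. Tiller exists only in Helm 2; Helm 3 removed it, and the client talks to the Kubernetes API server directly.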
- Concepts
- Chart: A package of pre-configured Kubernetes resources.
- Release: A specific instance of a chart which has been deployed to the cluster using Helm.
- Repository: A group of published charts which can be made available to others.
- Why?
- Boosts productivity
- Reduces duplication & complexity
- Smooths the learning curve
- Simplifies deployments
- Operators
- The Operator pattern aims to capture the key aim of a human operator who is managing a service or set of services. Human operators who look after specific applications and services have deep knowledge of how the system ought to behave, how to deploy it, and how to react if there are problems.
- Tasks
- deploying an application on demand
- taking and restoring backups of that application’s state
- handling upgrades of the application code alongside related changes such as database schemas or extra configuration settings
- publishing a Service to applications that don’t support Kubernetes APIs to discover them
- simulating failure in all or part of your cluster to test its resilience
- choosing a leader for a distributed application without an internal member election process
- Why?
- Robots are better at doing repetitive tasks
- Simplifies the overall complexity of an operation into a few configurable components
- Makes pattern reuse possible
- How to make Operators? (via Operator SDK)
- Helm
- Go
- Java
- Ansible
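The core of the Operator pattern is a reconcile loop: compare the desired state with the observed state and work out which actions close the gap. The sketch below shows that idea in plain Python with no Kubernetes client. It is not Operator SDK code, and the `ClusterState` fields, the version strings, and the action descriptions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterState:
    """Simplified view of a database cluster (hypothetical fields)."""
    replicas: int
    version: str


def reconcile(desired, observed):
    """Return the ordered actions that move `observed` toward `desired`.

    A real operator would call the Kubernetes API here; this sketch only
    describes the actions as strings.
    """
    actions = []
    if observed.version != desired.version:
        # Encode what a human operator would do: back up before upgrading.
        actions.append(f"backup before upgrade from {observed.version}")
        actions.append(f"rolling upgrade to {desired.version}")
    if observed.replicas < desired.replicas:
        actions.append(f"scale up by {desired.replicas - observed.replicas}")
    elif observed.replicas > desired.replicas:
        actions.append(
            f"decommission {observed.replicas - desired.replicas} node(s)"
        )
    return actions


if __name__ == "__main__":
    desired = ClusterState(replicas=3, version="4.0.1")
    observed = ClusterState(replicas=2, version="3.11.10")
    for action in reconcile(desired, observed):
        print(action)
```

When desired and observed states match, `reconcile` returns an empty list. That is the steady state an operator tries to keep the cluster in, by running this comparison again whenever something changes.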
If you missed last week’s Apache Cassandra Lunch #40: Scylla Migrator for Cassandra Data Operations, don’t forget to check it out as well! If you want to attend Cassandra Lunch live every Wednesday at 12 PM EST, then you can register here now! Additionally, the playlist with all the previously recorded Cassandra Lunches is available here.
Resources
- https://kubernetes.io/docs/tutorials/kubernetes-basics/explore/explore-intro/
- https://kubernetes.io/docs/concepts/workloads/
- https://medium.com/velotio-perspectives/demystifying-high-availability-in-kubernetes-using-kubeadm-3d83ed8c458b
- https://enterprisersproject.com/article/2020/9/pod-cluster-container-what-is-difference
- https://medium.com/developerworld/pod-vs-node-in-kubernetes-26c858988f94
- https://www.bmc.com/blogs/kubernetes-helm-charts/
- https://www.vamsitalkstech.com/architecture/big-data-kubernetes-a-reference-architecture-for-spark-with-kubernetes-2-4/
- banzaicloud/kafka-operator
- https://www.vembu.com/blog/docker-vs-kubernetes-differences-architecture-and-use-cases/
- https://tanzu.vmware.com/developer/guides/kubernetes/create-helm-chart/
Cassandra.Link
Cassandra.Link is a knowledge base that we created for all things Apache Cassandra. Our goal with Cassandra.Link was to not only fill the gap of Planet Cassandra, but to bring the Cassandra community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.
We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!