In Apache Cassandra Lunch #29: Cassandra & Kubernetes Update, we cover updates regarding Cassandra and Kubernetes after the recent KubeCon event. The live recording of Cassandra Lunch, which includes a more in-depth discussion, is also embedded below in case you were not able to attend live. If you would like to attend Apache Cassandra Lunch live, it is hosted every Wednesday at 12 PM EST. Register here now!
This post covers DataStax’s Cass-Operator updates, Orange Telecom’s CassKop updates, and a new project called K8ssandra. As mentioned above, a more in-depth discussion is available in the embedded YouTube video below, so don’t forget to watch it alongside this blog. Also, while you are there, don’t forget to like and subscribe!
DataStax’s Cass-Operator
- Datacenter provisioning
- Schedule all pods
- Bootstrap nodes in the appropriate order
- Seeds
- Across racks
- etc.
- Uniform configuration
- Scale-up
- Add new nodes in a balanced manner across racks
- Scale-down
- Remove nodes one at a time across racks
- Node recovery
- Restart process
- Reschedule instance (i.e., replace node)
- Replace instance
- Specific workflows for seed node replacements
- Multi-DC / Multi-Rack
- Multi-Region / Multi-K8s Cluster
- Note this requires support at a networking layer for pod to pod IP connectivity. This may be accomplished within the cluster with CNIs like Cilium or externally via traditional networking tools.
- Differentiators
- OSS Ecosystem / Components
- Cass Config Builder – OSS project extracted from DataStax OpsCenter
- Life Cycle Manager to provide automated configuration file rendering
- Cass Config Definitions – definition files for cass-config-builder that define all configuration files, their parameters, and templates
- Management API for Apache Cassandra (MAAC)
- Metrics Collector for Apache Cassandra (MCAC)
- Reference Prometheus Operator CRDs
- ServiceMonitor
- Instance
- Reference Grafana Operator CRDs
- Instance
- Dashboards
- Datasource
- PodTemplateSpec
- Customization of existing pods, including support for adding containers, volumes, etc.
- Advanced Networking
- Node Port
- Host Network
- Simple security
- Management API mTLS support
- Automated generation of keystore and truststore for internode and client to node TLS
- Automated superuser account configuration
- The default superuser (cassandra/cassandra) is disabled and never available to clients
- Cluster administration account may be generated automatically (or provided), with values stored in a Kubernetes secret
- Automatic application of NetworkTopologyStrategy with appropriate RF for system keyspaces
- Validating webhook
- Invalid changes are rejected with a helpful message
- Rolling cluster updates
- Change in binary (C* upgrade)
- Change in configuration
- Canary deployments – single rack application of changes for validation before broader deployment
- Rolling restart
- Platform Integration / Testing / Certification
- Red Hat OpenShift compatible and certified
- Secure, Universal Base Image (UBI) foundation images with security scanning performed by Red Hat
- cass-operator
- cass-config-builder
- apache-cassandra w/ MCAC and MAAC
- Integration with Red Hat certification pipeline / marketplace
- Presence in Red Hat Operator Hub built into OpenShift interface
- VMware Tanzu Kubernetes Grid Integrated Edition compatible and certified
- Security scanning for images performed by VMware
- Amazon EKS
- Azure AKS
- Google GKE
- Documentation / Reference Implementations
- Cloud storage classes
- Ingress solutions
- Sample connection validation application with reference implementations of Java Driver client connection parameters
- Cluster-level Stop / Resume
- Stop all running instances while keeping persistent storage
- Allows for scaling compute down to zero. Bringing the cluster back up follows expected startup procedures
- Road Map / Inflight
- Repair
- Reaper integration
- Backups
- Velero integration
- Medusa integration
- Advanced Networking via sidecar
- Combination of proxy sidecars (a la Envoy) to allow for persistent IP addresses despite Kubernetes’ best efforts to shuffle them.
- Single pod canary deployments
- Platform Certification
- VMware Project Pacific
- Rancher Kubernetes Engine (K3s)
- Documentation
- Multi-region
- Multi-cloud
- Additional ingress providers
- Voyager
- HAProxy
- Gloo
- Ambassador
- Envoy
- NGINX Ingress Controller
- Additional storage class references
- OpenEBS
- Cassandra Enhancements
- Repair
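The scale-up and scale-down behavior listed above (adding nodes in a balanced manner across racks, and removing nodes one at a time across racks) can be illustrated with a minimal Python sketch. This is not cass-operator's actual implementation: the function names `scale_up` and `scale_down_order`, the rack names, and the tie-breaking rule are all assumptions made for illustration.

```python
from typing import Dict, List


def scale_up(rack_sizes: Dict[str, int], new_nodes: int) -> Dict[str, int]:
    """Return rack sizes after adding nodes one at a time to the smallest rack.

    Hypothetical helper: illustrates balanced placement only.
    """
    sizes = dict(rack_sizes)  # copy so the caller's mapping is not mutated
    for _ in range(new_nodes):
        # Pick the rack with the fewest nodes; break ties by rack name
        # so the result is deterministic.
        target = min(sizes, key=lambda rack: (sizes[rack], rack))
        sizes[target] += 1
    return sizes


def scale_down_order(rack_sizes: Dict[str, int], remove_nodes: int) -> List[str]:
    """Return the racks to remove a node from, one removal per list entry.

    Hypothetical helper: always shrinks the currently largest rack so the
    racks stay as even as possible while nodes leave one at a time.
    """
    sizes = dict(rack_sizes)
    order: List[str] = []
    for _ in range(remove_nodes):
        target = max(sizes, key=lambda rack: (sizes[rack], rack))
        if sizes[target] == 0:
            raise ValueError("cannot remove more nodes than the cluster has")
        sizes[target] -= 1
        order.append(target)
    return order


if __name__ == "__main__":
    racks = {"rack1": 2, "rack2": 1, "rack3": 1}
    grown = scale_up(racks, 3)
    print(grown)
    print(scale_down_order(grown, 2))
```

Starting from racks of size 2, 1, and 1, adding three nodes fills the two smaller racks first and then the remaining one. A real operator also has to wait for each node to finish bootstrapping or decommissioning before touching the next one, which this sketch leaves out.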
Orange Telecom’s CassKop
- Node labeling to map any internal architecture (including network-specific labels for multi-DC setups)
- Volumes & sidecars management (possibly linked to PodTemplateSpec)
- Backup & restore (we ruled out Velero and can share why we went with Instaclustr, but Medusa could work too)
- kubectl plugin integration (quite useful on the ops side without an admin UI)
- MultiCassKop evolution to drive multiple cass-operators instead of multiple CassKop instances (this could remain Orange internal if too specific)
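To make the node-labeling idea above more concrete, here is a small Python sketch of grouping Kubernetes nodes into a datacenter/rack topology based on their labels. It is only an illustration and does not reflect how CassKop does this internally: the label keys `example.com/datacenter` and `example.com/rack`, the function name, and the node names are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical label keys; a real cluster would use whatever labels
# its operators have agreed on.
DC_LABEL = "example.com/datacenter"
RACK_LABEL = "example.com/rack"


def group_nodes_by_topology(
    nodes: Dict[str, Dict[str, str]]
) -> Dict[Tuple[str, str], List[str]]:
    """Map (datacenter, rack) pairs to the sorted node names carrying those labels.

    `nodes` maps a node name to its label dictionary. Nodes missing either
    label are skipped, since they cannot be placed in the topology.
    """
    topology: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for name, labels in nodes.items():
        dc = labels.get(DC_LABEL)
        rack = labels.get(RACK_LABEL)
        if dc is None or rack is None:
            continue
        topology[(dc, rack)].append(name)
    return {key: sorted(names) for key, names in topology.items()}


if __name__ == "__main__":
    cluster_nodes = {
        "node-a": {DC_LABEL: "dc1", RACK_LABEL: "rack1"},
        "node-b": {DC_LABEL: "dc1", RACK_LABEL: "rack2"},
        "node-c": {DC_LABEL: "dc1", RACK_LABEL: "rack1"},
        "node-d": {},  # unlabeled node, ignored
    }
    print(group_nodes_by_topology(cluster_nodes))
```

Keeping the label keys as constants makes it easy to swap in network-specific labels for a multi-DC setup without changing the grouping logic.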
K8ssandra
- K8ssandra provides a production-ready platform for running Apache Cassandra on Kubernetes. This includes automation for operational tasks such as repairs, backups, and monitoring.
- K8ssandra is a cloud native distribution of Apache Cassandra meant to run on Kubernetes.
- At a pure component level, K8ssandra integrates and packages together:
- Apache Cassandra 3.11.7
- Kubernetes Operator for Apache Cassandra (cass-operator)
- Reaper, also known as the Repair Web Interface
- Medusa for backup and restore
- Metrics Collector, with Prometheus integration, and visualization via preconfigured Grafana dashboards
- Templates for connections into your Kubernetes environment via Ingress solutions such as Traefik
- Right now, K8ssandra is deployed as an entire stack and assumes your deployment uses all of it. Swapping out individual components for others is not supported.
If you missed last week’s Apache Cassandra Lunch #28: Cassandra Backup / Restore Scenarios, be sure to check it out! As mentioned above, the live recording of Apache Cassandra Lunch #29 is embedded below. Also, check out our YouTube channel for more videos and the Cassandra Lunch playlist here! Don’t forget to subscribe while you are there!
Cassandra.Link is a knowledge base that we created for all things Apache Cassandra. Our goal with Cassandra.Link was to not only fill the gap of Planet Cassandra, but to bring the Cassandra community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.
We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!