At the online KubeCon + CloudNativeCon North America 2020 conference, DataStax released K8ssandra, a curated open source distribution of the Apache Cassandra database that includes all the components required to deploy the platform on a Kubernetes cluster.
The Cassandra database itself already runs on Kubernetes alongside a wide range of other platforms. K8ssandra adds a Kubernetes Helm chart that deploys all the components needed to run and manage a Cassandra database, including an instance of the open source Prometheus monitoring platform and a dashboard based on open source Grafana visualization tools.
Other components include the open source Operator for Cassandra that DataStax made available earlier this year to simplify the management of the database on Kubernetes clusters, and Cassandra Reaper, a tool that helps manage the critical maintenance tasks of a Cassandra database. There’s also Cassandra Medusa, a backup and restore tool for Cassandra databases.
Patrick McFadin, vice president of developer relations for DataStax, says Cassandra is gaining traction in Kubernetes environments because its core architecture lends itself better to stateful distributions of federated applications. Designed to run big data applications on top of a wide-column store, Cassandra stores data on nodes that use a gossip protocol to intelligently replicate data in order to optimize application performance, McFadin says. That approach, he notes, makes it easier to deploy stateful applications based on event-driven architecture at scale, from the network edge to the cloud.
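For readers unfamiliar with the term, the general idea behind a gossip protocol can be illustrated with a toy simulation. The sketch below is not Cassandra's implementation. It assumes a simplified model in which each node repeatedly picks a random peer and the two merge what they know, keeping the highest version seen for each entry. The names `merge` and `gossip_until_converged` are hypothetical and exist only for this illustration.

```python
import random


def merge(a, b):
    """Return the union of two views, keeping the highest version per key."""
    merged = dict(a)
    for key, version in b.items():
        if version > merged.get(key, -1):
            merged[key] = version
    return merged


def gossip_until_converged(node_count, seed=0, max_rounds=1000):
    """Simulate push-pull gossip; return (rounds taken, final views).

    Toy model (an assumption of this sketch): every node starts out
    knowing only its own entry, and in each round every node exchanges
    its full view with one randomly chosen peer.
    """
    views = [{i: 1} for i in range(node_count)]
    if node_count < 2:
        return 0, views  # nothing to exchange

    rng = random.Random(seed)  # seeded so runs are repeatable
    full_view = {i: 1 for i in range(node_count)}

    for round_no in range(1, max_rounds + 1):
        for i in range(node_count):
            peer = rng.choice([j for j in range(node_count) if j != i])
            combined = merge(views[i], views[peer])
            # Both sides of the exchange end up with the merged view.
            views[i] = combined
            views[peer] = dict(combined)
        if all(view == full_view for view in views):
            return round_no, views

    raise RuntimeError("views did not converge within max_rounds")


if __name__ == "__main__":
    rounds, _ = gossip_until_converged(8, seed=42)
    print(f"8 nodes converged after {rounds} round(s)")
```

Because every exchange spreads information to another node, knowledge tends to reach the whole cluster in a small number of rounds without any central coordinator, which is the property that makes this style of protocol attractive for large distributed systems.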
While the bulk of container applications deployed on Kubernetes clusters have tended to be stateless, the rate at which stateful applications are now being deployed on Kubernetes clusters is starting to accelerate as enterprise IT organizations gain confidence in their ability to manage Kubernetes environments. Rather than storing data on an external storage system, it’s generally more efficient to streamline the management of compute and storage around a set of processes managed by a single team.
There is, of course, no shortage of options when it comes to databases and data stores that might be employed to run a stateful application on a Kubernetes cluster. DataStax is making a case for an elastic Cassandra database that will continue to scale as stateful applications grow and expand over time.
In the meantime, it’s not clear if databases in Kubernetes environments will continue to be managed by database administrators or whether that role will be subsumed in a set of DevOps processes managed by site reliability engineers (SREs). Regardless of approach, it’s probable most organizations will be employing multiple types of databases on Kubernetes clusters that are optimized for different application use cases. Each application development team within the same organization tends to have its own database preferences.
Cassandra, of course, is already widely employed on a variety of platforms, so it’s now only a matter of time before many of those organizations deploy Cassandra on top of a Kubernetes cluster. The battle now is for the hearts and minds of developers who are building the next generation of cloud-native stateful applications and have yet to decide which database platform they prefer.