Cassandra.Link

12/2/2020

Big Data Analytics On Kubernetes for Streaming Application - XenonStack

by John Doe

Setting up Analytics Stack for Streaming Applications

  • SKACK Stack is an open-source, full-stack platform for real-time analysis of big data. It consists of Apache Spark, Kubernetes, Akka, Apache Cassandra, and Apache Kafka.
  • GCP and GlusterFS act as the storage solution because they support multi-mount and the data remains available on all GlusterFS and GCP nodes.
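
The multi-mount requirement above can be sketched as a Kubernetes PersistentVolumeClaim that requests the `ReadWriteMany` access mode. This is a minimal illustration, not the original setup: the claim name `skack-shared-data`, the size `10Gi`, and the storage class name `glusterfs` are assumptions chosen for the example.

```python
import json


def build_shared_pvc(name: str, size: str, storage_class: str) -> dict:
    """Return a PersistentVolumeClaim manifest that several pods can mount at once."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteMany allows pods on different nodes to mount the same volume
            # read-write, which is the "multi-mount" behaviour described above.
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical values; replace them with the names used in your cluster.
    manifest = build_shared_pvc("skack-shared-data", "10Gi", "glusterfs")
    print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed manifest can be saved to a file and applied directly, provided a matching storage class exists in the cluster.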

Challenges for Setting Up a Multi-Node Cluster on SKACK

  • Set up and document a multi-node cluster for the SKACK Stack on Kubernetes.
  • Container environments are not persistent by default, so applications in Kubernetes need persistent storage to store data.
    • Using Kubernetes to scale up Spark.
    • Using Kubernetes to scale up Cassandra.
    • Using Kubernetes to scale up Kafka.
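
Scaling the three components can be sketched as a set of `kubectl scale` invocations. The workload kinds and names below (`spark-worker` as a Deployment, `cassandra` and `kafka` as StatefulSets) are assumptions for illustration; the source does not say how the components are deployed.

```python
import shlex

# Hypothetical workload kinds and names; adjust them to match your manifests.
WORKLOADS = {
    "spark": ("deployment", "spark-worker"),
    "cassandra": ("statefulset", "cassandra"),
    "kafka": ("statefulset", "kafka"),
}


def scale_command(component: str, replicas: int, namespace: str = "default") -> list:
    """Build the kubectl argument list that scales one component."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    kind, name = WORKLOADS[component]
    return [
        "kubectl", "scale", f"{kind}/{name}",
        f"--replicas={replicas}",
        "--namespace", namespace,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for component in WORKLOADS:
        print(shlex.join(scale_command(component, 3)))
```

Stateful services such as Cassandra and Kafka usually pair each replica with its own persistent volume, which is why the sketch treats them as StatefulSets rather than Deployments.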

Solution Offerings for Setting Up an On-Premises Kubernetes Cluster

To overcome the challenges mentioned above, set up a three-node on-premises Kubernetes cluster in which one node acts as the master and the other two act as workers.
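
One way to check the one-master, two-worker layout is to inspect the node labels returned by `kubectl get nodes -o json`. The sketch below assumes the cluster uses the standard `node-role.kubernetes.io/control-plane` label (or the older `node-role.kubernetes.io/master` label); the node names in the sample data are hypothetical.

```python
CONTROL_PLANE_LABELS = (
    "node-role.kubernetes.io/control-plane",
    "node-role.kubernetes.io/master",  # label used by older Kubernetes releases
)


def split_roles(node_list: dict) -> tuple:
    """Split a `kubectl get nodes -o json` document into (masters, workers)."""
    masters, workers = [], []
    for node in node_list.get("items", []):
        meta = node["metadata"]
        labels = meta.get("labels", {})
        if any(label in labels for label in CONTROL_PLANE_LABELS):
            masters.append(meta["name"])
        else:
            workers.append(meta["name"])
    return masters, workers


if __name__ == "__main__":
    # Hypothetical sample standing in for real kubectl output.
    sample = {
        "items": [
            {"metadata": {"name": "node-1",
                          "labels": {"node-role.kubernetes.io/control-plane": ""}}},
            {"metadata": {"name": "node-2", "labels": {}}},
            {"metadata": {"name": "node-3", "labels": {}}},
        ]
    }
    masters, workers = split_roles(sample)
    print("masters:", masters)
    print("workers:", workers)
```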

The cluster includes:

  • Kubernetes Master
  • Kubernetes Scheduler
  • Kubernetes Controller Manager

A monitoring setup analyzes the cluster and reports to the API server to store metrics covering resource utilization, availability, and performance.
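
As an illustration of the resource-utilization metric, the sketch below converts Kubernetes quantity strings (such as `500m` CPU or `1Gi` memory) into numbers and computes usage as a percentage of a node's allocatable capacity. It handles only a subset of the quantity suffixes, and the input values are hypothetical rather than taken from the original setup.

```python
# Subset of Kubernetes quantity suffixes; other suffixes are not handled here.
CPU_SUFFIXES = {"n": 1e-9, "u": 1e-6, "m": 1e-3}
MEMORY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as '500m' or '2' into cores."""
    if quantity[-1] in CPU_SUFFIXES:
        return float(quantity[:-1]) * CPU_SUFFIXES[quantity[-1]]
    return float(quantity)


def parse_memory(quantity: str) -> float:
    """Convert a memory quantity such as '512Mi' or a plain byte count into bytes."""
    if quantity[-2:] in MEMORY_SUFFIXES:
        return float(quantity[:-2]) * MEMORY_SUFFIXES[quantity[-2:]]
    return float(quantity)


def utilization(usage: dict, allocatable: dict) -> dict:
    """Return CPU and memory usage as percentages of allocatable capacity."""
    return {
        "cpu_percent": 100 * parse_cpu(usage["cpu"]) / parse_cpu(allocatable["cpu"]),
        "memory_percent": 100 * parse_memory(usage["memory"]) / parse_memory(allocatable["memory"]),
    }


if __name__ == "__main__":
    # Hypothetical readings for a single node.
    print(utilization({"cpu": "500m", "memory": "1Gi"}, {"cpu": "2", "memory": "4Gi"}))
```

The `usage` and `allocatable` dictionaries mirror the shape of the fields exposed by the Kubernetes resource metrics API and the node status, so real readings could be passed in once they are fetched.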
