Cassandra.Link

The best knowledge base on Apache Cassandra®

Helping platform leaders, architects, engineers, and operators build scalable real-time data platforms.

12/2/2020

Cassandra and Spark: Optimizing for Data Locality - Databricks

by John Doe

There are only three things that are important in doing analytics on a distributed database: locality, locality, and locality. Learn how the Cassandra-Spark connector builds RDDs and optimizes for interacting with local Cassandra machines. We’ll go in depth into how Cassandra stores data in a cluster and the steps the open source connector uses for both reading data from and writing data to Cassandra. Discover the Cassandra-specific RDD functions that let you take advantage of underlying Cassandra mechanisms and perform lightning-fast analytics on the world’s most scalable OLTP database. You will learn to apply these strategies in your applications and make sure that you are making the most of your cluster resources.
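To make the locality idea concrete, here is a minimal toy sketch in Python of how a token ring can map partition keys to nodes, and how token ranges can be grouped by owning node so that each group could become a Spark partition with a preferred location. This is not the connector's code: `ToyRing`, `token_for`, and the node addresses are hypothetical names for illustration, MD5 stands in for Cassandra's Murmur3 partitioner, the token space is shrunk to 32 bits, and replication is ignored.

```python
import bisect
import hashlib
from collections import defaultdict
from typing import Dict, List, Tuple

RING_SIZE = 2**32  # toy token space (assumption; not Cassandra's real range)


def token_for(partition_key: str) -> int:
    """Map a partition key to a token. MD5 is a stand-in for Murmur3."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


class ToyRing:
    """Hypothetical single-replica token ring, for illustration only."""

    def __init__(self, node_tokens: Dict[str, List[int]]):
        # Each node owns the range that ends at each of its tokens.
        self.entries = sorted(
            (token, node) for node, tokens in node_tokens.items() for token in tokens
        )
        self.tokens = [token for token, _ in self.entries]

    def owner(self, token: int) -> str:
        """Return the node whose token is the first one >= token, wrapping around."""
        index = bisect.bisect_left(self.tokens, token) % len(self.tokens)
        return self.entries[index][1]

    def ranges_by_node(self) -> Dict[str, List[Tuple[int, int]]]:
        """Group (start, end] token ranges by owning node.

        Each group is what a locality-aware reader would try to scan
        from an executor running on that same node.
        """
        grouped = defaultdict(list)
        for i, (end, node) in enumerate(self.entries):
            start = self.entries[i - 1][0]  # i == 0 wraps to the last token
            grouped[node].append((start, end))
        return dict(grouped)


# Two hypothetical nodes, two tokens each.
ring = ToyRing({"10.0.0.1": [0, 2**31], "10.0.0.2": [2**30, 3 * 2**30]})
for node, ranges in ring.ranges_by_node().items():
    print(node, ranges)
print("user:42 lives on", ring.owner(token_for("user:42")))
```

The point of the grouping step is that a task reading a token range is cheapest when it runs on a machine that already stores that range, so the scheduler has a useful hint about where to place it.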

Learn more:

Spark and Cassandra: An Amazing Apache Love Story
Spark And Cassandra: 2 Fast, 2 Furious
Cassandra and SparkSQL: You Don’t Need Functional Programming for Fun
Zen and the Art of Apache Spark Maintenance with Cassandra
