12/2/2020

Spark And Cassandra: 2 Fast, 2 Furious - Databricks

by John Doe

Not since peanut butter and jelly has there been such an epic combo. Spark is the world’s foremost distributed analytics platform, delivering in-memory analytics with a speed and ease of use unheard of in Hadoop. Cassandra is the lightning-fast distributed database powering such IT giants as Outbrain and Netflix. Did you know you can combine them with free open source technology? Integrate them easily with the DataStax Open Source Spark Cassandra Connector. This feature-rich integration allows Spark to take full advantage of Cassandra as well as use Cassandra-specific Spark optimizations.

Increase the efficiency of your application with the insider knowledge delivered by one of the main authors of the connector. In this session we’ll go over some of the most common use cases of the Spark Cassandra Connector and highlight how to avoid the most common pitfalls. We will walk through:

Spark Cassandra Basic Features:
- How the Spark Cassandra Connector reads and writes data to C*
- How Spark DataFrames are integrated with Cassandra
- How to use Cassandra data locality to your advantage
- How Cassandra predicate pushdown works in SparkSQL

Building and Tuning Spark Streaming Applications with Cassandra:
- Tuning standard RDD operations for maximum throughput
- Using the internal C* driver pool for flexibility and efficient access
- Understanding how receivers work and interact with Cassandra locality

Use Spark to Perform Common Cassandra Maintenance:
- Migrating data from RDBMS sources directly into Cassandra
- Using Spark to migrate information between different Cassandra clusters
- Bulk loading Cassandra using Spark and DataFrames
- Rebuilding Cassandra tables with different indexes using Spark

