There are only three things that matter when doing analytics on a distributed database: locality, locality, and locality. Learn how the Cassandra-Spark connector builds RDDs and optimizes for interacting with local Cassandra machines. We'll go in depth into how Cassandra stores data in a cluster and the steps the open source connector takes when reading data from and writing data to Cassandra. Discover the Cassandra-specific RDD functions that let you take advantage of underlying Cassandra mechanisms and perform lightning-fast analytics on the world's most scalable OLTP database. You will learn to apply these strategies in your own applications and make the most of your cluster resources.
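To make the locality idea concrete, here is a minimal sketch of how a partition key might map to the nodes that hold it, and how keys can be grouped by owning node so work runs where the data lives. It is an illustration under loud assumptions, not the connector's code: the `TokenRing` class, the `token_for` and `group_by_primary_replica` helpers, and the node names are all hypothetical; each node holds a single token (no virtual nodes); replicas are simply the next nodes clockwise; and MD5 stands in for the Murmur3 hash that Cassandra's default partitioner uses.

```python
import bisect
import hashlib


def token_for(partition_key):
    """Hash a partition key to a signed 64-bit token.

    Assumption: MD5 is a stdlib stand-in; Cassandra's default
    partitioner uses Murmur3, so real tokens will differ.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


class TokenRing:
    """Hypothetical single-token-per-node ring (no vnodes)."""

    def __init__(self, node_tokens):
        # node_tokens maps node name -> the token that node owns.
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    def replicas(self, partition_key, replication_factor=1):
        """Return the nodes holding the key, primary replica first."""
        token = token_for(partition_key)
        # The owner is the first node whose token is >= the key's token;
        # keys past the last token wrap around to the first node.
        start = bisect.bisect_left(self._tokens, token) % len(self._ring)
        count = min(replication_factor, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


def group_by_primary_replica(ring, keys):
    """Bucket keys by the node that owns them, so each bucket
    could be processed by a worker on that same machine."""
    groups = {}
    for key in keys:
        groups.setdefault(ring.replicas(key)[0], []).append(key)
    return groups


if __name__ == "__main__":
    ring = TokenRing({
        "node-a": -6_000_000_000_000_000_000,
        "node-b": 0,
        "node-c": 6_000_000_000_000_000_000,
    })
    for node, keys in group_by_primary_replica(ring, ["user-1", "user-2", "user-3"]).items():
        print(node, keys)
```

Grouping keys by owning node before touching the database is the same intuition behind replica-aware repartitioning: each worker then talks mostly to the Cassandra node on its own machine instead of pulling rows across the network.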
Learn more:
Spark and Cassandra: An Amazing Apache Love Story
Spark And Cassandra: 2 Fast, 2 Furious
Cassandra and SparkSQL: You Don't Need Functional Programming for Fun
Zen and the Art of Apache Spark Maintenance with Cassandra