Spark and Cassandra: An Amazing Apache Love Story by Patrick McFadin
- 1. Spark and Cassandra: An amazing Apache love story. Patrick McFadin (@PatrickMcFadin), Chief Evangelist, DataStax. ©2013 DataStax Confidential. Do not distribute without consent.
- 2. Store a ton of data Analyze a ton of data
- 3. Community Response?
- 4. Cassandra Only DC
- 5. Cassandra Only DC Cassandra + Spark DC Spark Jobs
- 6. Cassandra Only DC Cassandra + Spark DC Spark Jobs Spark Streaming
- 7. Worker Worker Worker Worker Analytics Workload Transactional Workload
- 8. DataStax Enterprise
- 9. DataStax Enterprise
- 10. • 10T of high frequency event data daily • Constantly increasing volume “The web server that powers the interface can query both datacenters, depending on which the user is closest to.” “A small set of signals tend to double every eight months. So we needed a model that can scale linearly.” - Arun Jayandra, Microsoft
- 11. [Architecture diagram] Cluster 1: REST API, O365 Event Hub, Ingestion Worker (Azure worker role using DataStax C# driver), C*, Analytics; 20k – 50k events/sec. Cluster 2: REST API, O365, Kafka, C*/Spark Streaming, Analytics; 200k+ events/sec. Node sizes shown: G4 – Local SSD; Kafka: G4 – Data Disk; ZooKeeper: A7 – Data Disk; PaaS Small; G4 – Local SSD.
- 12. Data Protection • Maximilian Schrems v Data Protection Commissioner • No longer OK to ship EU data to US under “Safe Harbour” Product_Catalog RF=3 Product_Catalog RF=3 Customer_Data RF=3 Customer_Data RF=0 Product_Catalog RF=3 Customer_Data RF=3
- 13. • 300k customers • Report on energy usage • Predict boiler failure “We’re dealing largely with time series data, and Spark is 10 to 100 times quicker as it is operating on data in-memory…Cassandra delivers what we need today and if you look at the Internet of Things space; that is what is really useful right now.” - Jim Anning, British Gas Hive Active Heating™
- 14. Cassandra Only DC Cassandra + Spark DC Spark Jobs Spark Streaming Home Data Center Hive Active Heating™
- 15. Store a ton of data Analyze a ton of data Thank you!
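
The per-datacenter replication factors on slide 12 (Product_Catalog at RF=3 everywhere, Customer_Data at RF=0 in one datacenter) could be expressed as keyspace definitions using Cassandra's `NetworkTopologyStrategy`. The following is only a minimal sketch: the datacenter names `eu_dc_1`, `eu_dc_2` and `us_dc` are hypothetical (the slide does not name them), and placing the RF=0 entry in the US datacenter is an assumption based on the Safe Harbour bullet. It only builds the CQL text and does not connect to a cluster.

```python
# Sketch: build CREATE KEYSPACE statements with per-datacenter replication.
# Datacenter names are hypothetical; the slide does not name them.
# Assumption: the datacenter with Customer_Data RF=0 is the US one.
REPLICATION = {
    "Product_Catalog": {"eu_dc_1": 3, "eu_dc_2": 3, "us_dc": 3},
    "Customer_Data": {"eu_dc_1": 3, "eu_dc_2": 3, "us_dc": 0},
}


def keyspace_cql(name, rf_by_dc):
    """Return a CREATE KEYSPACE statement for the given per-DC replication factors.

    Datacenters with a replication factor of 0 are left out of the map,
    which means no replicas of the keyspace are placed there.
    """
    replicas = {dc: rf for dc, rf in rf_by_dc.items() if rf > 0}
    if not replicas:
        raise ValueError("keyspace needs at least one datacenter with RF > 0")
    options = ", ".join("'%s': %d" % (dc, rf) for dc, rf in replicas.items())
    return (
        "CREATE KEYSPACE IF NOT EXISTS " + name
        + " WITH replication = {'class': 'NetworkTopologyStrategy', "
        + options + "};"
    )


if __name__ == "__main__":
    for keyspace, rf_by_dc in REPLICATION.items():
        print(keyspace_cql(keyspace, rf_by_dc))
```

With this layout, the customer keyspace never lists the US datacenter, while the product catalog is replicated to all three.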