codecentric AG: CQRS and Event Sourcing Applications with Cassandra

  1. CQRS and Event Sourcing Applications with Cassandra
     Matthias Niehoff, #CassandraSummit 2015
  2. Agenda
     • The Use Case
     • Event Sourcing
     • CQRS
     • Cassandra for Storage
     • Spark for Processing
     • Benefits & Pitfalls
     • Q&A
  3. The Use Case
  4. 24x7 Proxy
     Diagram labels: Legacy Systems (not 24x7); 24x7 Proxy; "Internet Ready" Applications (24x7 available)
     The 24x7 Proxy:
     • Caches data
     • Provides data
     • Stores changes
     • Provides changes
     • No business logic/validation
  5. Assumptions
     • Solution needs to be highly scalable (up to 100,000 reads/s and 10,000 writes/s)
     • Read and write access needs to be low latency
     • Read/write ratio is 10:1 or higher
     • Solution needs to deal with up to 500,000,000 customers
  6. Event Sourcing
  7. Traditional Pattern: Saving Application State
     Diagram labels: Store (ID, Address); Article (Name, StockSize, updateInventory(), getInventory()); relation: sells
  8. What is different with Event Sourcing?
     A series of sales and replenishments for
     • a tablet
       • Starting with 60, sell 20, replenish 10
     • a stove
       • Starting with 25, sell 5, no replenishments
  9. What is the Difference?
     Saving only application state
     • :ArticleInventory, Fancy Tablet, 50
     • :ArticleInventory, Gas Stove, 20
  10. What is the Difference?
      Saving events instead of state
      • :ArticleInventory, Fancy Tablet, 39, 15-08-14T19:..
      • :ArticleInventory, Gas Stove, 20, 15-08-14T19:..
      • :ArticleInventory, Fancy Tablet, 45, 15-08-14T19:..
      • :ArticleInventory, Gas Stove, 20, 15-08-14T19:..
      • :ArticleInventory, Fancy Tablet, 50, 15-08-14T19:..
      • :ArticleInventory, Gas Stove, 20, 15-08-14T19:..
  11. Benefits of Storing Events
      • Log of all stock changes
      • Complete rebuild of the state
      • Temporal query
      • Event replay and rollback
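
A minimal sketch of how a state rebuild and a temporal query could work on such an event log. It is not taken from the slides: modelling each event as a signed stock delta, the `StockChanged` class, the `stock_at` helper, and the exact timestamps are all assumptions made for illustration. The quantities follow the tablet and stove series from the slides.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class StockChanged:
    """Hypothetical event: a signed change of stock for one article."""
    article: str
    delta: int  # positive = replenishment, negative = sale
    at: datetime


def stock_at(events, article, as_of=None):
    """Rebuild the stock of one article by replaying its events up to as_of."""
    return sum(
        e.delta
        for e in events
        if e.article == article and (as_of is None or e.at <= as_of)
    )


# Illustrative timestamps; the slides only show truncated values.
events = [
    StockChanged("Fancy Tablet", +60, datetime(2015, 8, 14, 19, 0)),
    StockChanged("Gas Stove", +25, datetime(2015, 8, 14, 19, 1)),
    StockChanged("Fancy Tablet", -20, datetime(2015, 8, 14, 19, 2)),
    StockChanged("Gas Stove", -5, datetime(2015, 8, 14, 19, 3)),
    StockChanged("Fancy Tablet", +10, datetime(2015, 8, 14, 19, 4)),
]

# Complete rebuild of the current state.
current_tablet = stock_at(events, "Fancy Tablet")
current_stove = stock_at(events, "Gas Stove")

# Temporal query: stock of the tablet before the replenishment arrived.
tablet_before_replenish = stock_at(
    events, "Fancy Tablet", as_of=datetime(2015, 8, 14, 19, 3)
)
```

Replaying all events yields the same numbers a state-only store would hold (50 and 20), while the log also answers what the stock was at any earlier point in time.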
  12. CQRS
  13. Default Application Architecture
      Diagram labels: User Interface; Domain Model; Application Services; DB
  14. CQRS Application Architecture
      Diagram labels: User Interface; Query Services; Command Services; Domain Model; DB
  15. A Pattern Changing Your Mindset
      • The pattern is simple
      • Going further
        • Split up the domain model
        • Independent scaling of models
        • Not using a query model at all
        • Different databases for models
  16. Event Sourcing & CQRS
      Diagram labels: Command Services; Command Model; Event Store (DB); Event Processor (asynchronous); Query Stores (DB, DB, DB); Read Layer; Query Services (×3)
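
The flow on this slide could be sketched roughly as follows. This is an in-memory stand-in, not the talk's implementation: the function names, the dictionary-based event format, and the single `stock_by_article` query store are assumptions. The point is that commands only append events, the processor runs separately, and queries read only from the query store.

```python
from collections import defaultdict

event_store = []                     # append-only log (command side)
stock_by_article = defaultdict(int)  # one query store (read side)


def handle_change_stock(article, delta):
    """Command service: record the change as an event, never update state in place."""
    event = {"article": article, "delta": delta}
    event_store.append(event)
    return event


def process(event):
    """Event processor: project one event into the query store."""
    stock_by_article[event["article"]] += event["delta"]


def get_stock(article):
    """Query service: reads only from the query store."""
    return stock_by_article[article]


pending = [
    handle_change_stock("Gas Stove", 25),
    handle_change_stock("Gas Stove", -5),
]

stale = get_stock("Gas Stove")  # the processor has not run yet
for event in pending:           # asynchronous in a real system
    process(event)
fresh = get_stock("Gas Stove")
```

Between the command and the processor run, the query side still returns the old value. That gap is the eventual consistency listed later under pitfalls.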
  17. Storage with Cassandra
  18. Why Cassandra…
      • Not only an event sink
        • Compaction
        • Selective replay
      • No single point of failure
      • Horizontal scale & Geo Replication
      • Write ahead of unmodified data
      • Plays well with further processing
      • Open source & a huge community
      • Easy operations
  19. Event Store
      For accessing all entities of a given type
        CREATE TABLE event_source_by_type (
          entity_type TEXT,
          bucket INT,             -- prevents huge partitions
          entity_key TEXT,
          insert_time TIMESTAMP,
          update_time TIMESTAMP,
          payload TEXT,           -- e.g. as JSON, XML, protobuf, Avro
          PRIMARY KEY ((entity_type, bucket), insert_time, entity_key)
        ) WITH CLUSTERING ORDER BY (insert_time DESC, entity_key ASC);
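
The slides do not say how the `bucket` value is chosen. One possible scheme, shown here purely as an assumption, derives it from a stable hash of the entity key so that the events of one entity type are spread over a fixed number of partitions. `NUM_BUCKETS`, `bucket_for`, and `partition_key` are hypothetical names.

```python
import zlib

NUM_BUCKETS = 16  # hypothetical; sized to keep each partition reasonably small


def bucket_for(entity_key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map an entity key to a bucket using a hash that is stable across processes."""
    return zlib.crc32(entity_key.encode("utf-8")) % num_buckets


def partition_key(entity_type: str, entity_key: str):
    """Partition key tuple matching (entity_type, bucket) in the table above."""
    return (entity_type, bucket_for(entity_key))


example = partition_key("article", "article-42")
```

With such a scheme, reading all entities of a type means querying every bucket from 0 to `NUM_BUCKETS - 1` and merging the results. A time-based bucket would be another option.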
  20. Optional: Second Table
      For accessing an entity directly
        CREATE TABLE event_source_by_key (
          entity_type TEXT,
          entity_key TEXT,
          insert_time TIMESTAMP,
          update_time TIMESTAMP,
          payload TEXT,           -- e.g. as JSON, XML or protobuf
          PRIMARY KEY ((entity_type, entity_key), insert_time)
        ) WITH CLUSTERING ORDER BY (insert_time DESC);
  21. Query Stores
      • Create tables that fit your queries!
      • E.g. "Get articles in category 'computer'"
        CREATE TABLE articles_by_category (
          category TEXT,          -- partition key; may need bucketing
          article_id UUID,        -- clustering column, so a category can hold many articles
          article_info TEXT,      -- could also be a JSON document
          PRIMARY KEY (category, article_id)
        );
  22. Query Stores
      • "I need ad-hoc queries"
      • "I need specific queries with a lot of different filters"
  23. Query Stores
  24. Processing with Spark
  25. From Event Store to Query Store
      • Command model triggers event processor
      • Event processor updates query views
      Diagram labels: Command Model; Event Processor (×3); DB (×3)
  26. Event Processing in Detail
      Diagram labels: Command Model; DB (×3)
  27. Why Spark?
      • Easy scale out
      • Easy deployment
      • Intuitive Scala & Java API
      • Fault tolerant
      • Out-of-the-box Kafka adapter
      • Integrates well with Cassandra
  28. Spark Job in Detail
      • Spark Streaming application
      • Consumes only topics of interest
      • Joins the stream of events with the current view
        • Use primary key of entity for correlation
        • Use joinWithCassandraTable
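
The join step can be illustrated without Spark. The sketch below is a plain-Python stand-in, not Spark or connector code: it treats the query view as a dictionary keyed by the entity's primary key and applies one micro-batch of events to it. The `apply_batch` function and the delta-style events are assumptions for illustration.

```python
def apply_batch(view, batch):
    """Join one micro-batch of (key, delta) events with the current view rows."""
    updated = {}
    for key, delta in batch:
        # The "join": look up the current row by primary key, preferring
        # a value already updated earlier in this batch.
        current = updated.get(key, view.get(key, 0))
        updated[key] = current + delta
    # Only the touched rows are written back; all others stay unchanged.
    return {**view, **updated}


view = {"Fancy Tablet": 60, "Gas Stove": 25}
batch = [("Fancy Tablet", -20), ("Gas Stove", -5), ("Fancy Tablet", 10)]
new_view = apply_batch(view, batch)
```

In the actual job, `joinWithCassandraTable` from the Spark Cassandra connector plays the role of the dictionary lookup, fetching only the rows whose keys appear in the batch.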
  29. If you need a new query view
      1. Create a table for the query view
      2. Create a Spark job filling your table
      3. Deploy the Spark job
      4. Initiate reprocessing of the event DB
         • same transformation logic as in normal processing
         • source can be different
      5. Mark view as initialized
      Diagram labels: Query DB; Event DB
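
Steps 4 and 5 could look roughly like this in miniature. The sketch assumes an `articles_by_category` view and events that carry `article_id` and `category` fields; those field names and the `project` and `init_view` helpers are hypothetical. The key idea from the slide is that reprocessing reuses the same transformation logic as live processing and only the source differs.

```python
def project(rows, event):
    """Transformation logic shared by live processing and reprocessing."""
    rows.setdefault(event["category"], set()).add(event["article_id"])


def init_view(event_db):
    """Fill a new query view from the full event DB, then mark it initialized."""
    view = {"rows": {}, "initialized": False}
    for event in event_db:      # step 4: reprocess the event DB
        project(view["rows"], event)
    view["initialized"] = True  # step 5: mark view as initialized
    return view


event_db = [
    {"article_id": "a1", "category": "computer"},
    {"article_id": "a2", "category": "kitchen"},
    {"article_id": "a3", "category": "computer"},
]
articles_by_category = init_view(event_db)
```

Readers of the view can check the `initialized` flag before trusting its contents.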
  30. Benefits & Pitfalls
  31. Benefits
      • Scalability
        • On storage & processing: just add nodes
        • Efficient queries due to separation
      • Collaboration
        • Every client gets its own data access
        • Easy to support new queries
  32. Pitfalls
      • More complexity than simple CRUD
      • Side effects on event replay
      • Eventual consistency in query views
      • Concurrent writes
      • Performance of replay
  33. Pitfalls: Lost Updates
      • Due to parallel processing
        • Two events A and B as sequential input
        • A is processed after B
      • Solution
        • Partition Spark RDD by entity key
        • Use a lambda architecture
      Diagram labels: Data Stream; speed; batch; Serving Layer
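
Why partitioning by entity key helps can be shown with a small stand-in for the RDD partitioner. This is plain Python, not Spark code, and `partition_by_key` is a hypothetical helper: all events of one key land in the same partition and keep their input order there, so A cannot overtake B for the same entity even if partitions are processed in parallel.

```python
import zlib


def partition_by_key(events, num_partitions):
    """Assign each (key, payload) event to a partition chosen by a stable hash of its key."""
    partitions = [[] for _ in range(num_partitions)]
    for key, payload in events:
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index].append((key, payload))
    return partitions


events = [("tablet", "A"), ("stove", "X"), ("tablet", "B")]
parts = partition_by_key(events, 4)

# Partitions that contain events for "tablet", and the order of those events.
tablet_partitions = {
    i for i, part in enumerate(parts) for key, _ in part if key == "tablet"
}
tablet_order = [
    payload for part in parts for key, payload in part if key == "tablet"
]
```

Events of different keys may still be processed in any relative order, which is fine as long as updates to different entities are independent.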
  34. Pitfalls
      • Event Store Compaction
        • Compact store to improve processing time
        • Only store the latest entry of an entity key
        • e.g. a Spark batch job / Cassandra TTL
      • Snapshot / Master State
        • Constantly build a complete state of all data
        • Can be used
          • To speed up initialization
          • As a store for a search engine
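
A compaction pass of this kind might be sketched as below. It assumes each event carries the complete state of its entity in `payload`, as the stock values on the earlier slide suggest; with delta-style events, dropping older entries would lose information. The `compact` function and the integer timestamps are illustrative only.

```python
def compact(events):
    """Keep only the newest event per (entity_type, entity_key)."""
    latest = {}
    for event in events:
        key = (event["entity_type"], event["entity_key"])
        if key not in latest or event["insert_time"] > latest[key]["insert_time"]:
            latest[key] = event
    return sorted(latest.values(), key=lambda e: e["insert_time"])


events = [
    {"entity_type": "article", "entity_key": "tablet", "insert_time": 1, "payload": "39"},
    {"entity_type": "article", "entity_key": "stove", "insert_time": 2, "payload": "20"},
    {"entity_type": "article", "entity_key": "tablet", "insert_time": 3, "payload": "45"},
]
compacted = compact(events)
```

The compacted result also doubles as a simple snapshot: it holds exactly one current entry per entity, which is what a new query view needs for fast initialization.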
  35. The Use Case: Solved with ES & CQRS
  36. 24x7 Proxy
      Diagram labels: Legacy Core Systems (not 24x7); 24x7 Proxy; "Internet Ready" Applications (24x7 available)
  37. Questions?
  38. Thank You!
      Matthias Niehoff, IT-Consultant
      codecentric AG
      Zeppelinstraße 2
      76185 Karlsruhe, Germany
      www.codecentric.de
      blog.codecentric.de
      matthiasniehoff