
Cassandra.Link

The best knowledge base on Apache Cassandra®

Helping platform leaders, architects, engineers, and operators build scalable real time data platforms.

5/18/2020

Reading time: 5 min

Understanding Data Consistency in Apache Cassandra

by DataStax



  1. Cassandra Essentials Tutorial Series: Understanding Data Consistency in Apache Cassandra
  2. Agenda
     ›  Overview of reading/writing data in Cassandra
     ›  Details on how Cassandra writes data
     ›  Review of the CAP theorem
     ›  Tunable data consistency
     ›  Choosing a data consistency strategy for writes
     ›  Choosing a data consistency strategy for reads
     ›  CQL examples of data consistency
     ›  Where to get Cassandra
     www.datastax.com
  3. Reading and Writing in Cassandra
     Cassandra is a peer-to-peer, read/write-anywhere architecture, so any user can connect to any node in any data center and read/write the data they need, with all writes being partitioned and replicated for them automatically throughout the cluster.
  4. Writes in Cassandra
     ›  Data is first written to a commit log for durability
     ›  Then written to a memtable in memory
     ›  Once the memtable becomes full, it is flushed to an SSTable (sorted strings table)
     ›  Writes are atomic at the row level; all columns are written or updated, or none are. RDBMS-style transactions are not supported
     [Figure: write path from INSERT INTO… to commit log, memtable, and SSTable]
     Cassandra is known for being the fastest database in the industry where write operations are concerned.
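The write path on this slide can be sketched as a toy model. This is only an illustration under stated assumptions, not Cassandra's implementation: the class name `WritePathSketch` and the row-count threshold `memtable_limit` are hypothetical, and a real memtable is flushed based on size settings rather than a row count.

```python
class WritePathSketch:
    """Toy model of the write path: commit log, then memtable, then SSTable flush."""

    def __init__(self, memtable_limit=3):
        self.memtable_limit = memtable_limit  # hypothetical "full" threshold, in rows
        self.commit_log = []  # append-only record of every write, for durability
        self.memtable = {}    # in-memory rows, keyed by row key
        self.sstables = []    # key-sorted snapshots flushed from memory

    def write(self, row_key, columns):
        # 1. Append to the commit log first so the write is durable.
        self.commit_log.append((row_key, dict(columns)))
        # 2. Apply all columns of the row to the memtable in one step.
        self.memtable.setdefault(row_key, {}).update(columns)
        # 3. Flush once the memtable is "full".
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # An SSTable holds rows sorted by key; start a fresh memtable afterwards.
        self.sstables.append(sorted(self.memtable.items(), key=lambda kv: kv[0]))
        self.memtable = {}


node = WritePathSketch(memtable_limit=2)
node.write("customer:5", {"total_purchases": 100})
node.write("customer:4", {"total_purchases": 500000})
flushed_keys = [key for key, _ in node.sstables[0]]
print(flushed_keys)
```

With a limit of two rows, the second write triggers a flush, so the memtable ends up empty while both writes remain in the commit log.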
  5. Writes in Cassandra vs. Other Databases
     Cassandra is up to:
     ›  4x better in writes!
     ›  2x better in reads!
     ›  12x better in reads/updates!
     Source (Sept. 2011): http://blog.cubrid.org/dev-platform/nosql-benchmarking/
  6. Review of the CAP Theorem
  7. Tunable Data Consistency
     ›  Choose between strong and eventual consistency (from all nodes to any node responding) depending on the need
     ›  Can be done on a per-operation basis, and for both reads and writes
     ›  Handles multi-data center operations
     Writes: Any, One, Quorum, Local_Quorum, Each_Quorum, All
     Reads: One, Quorum, Local_Quorum, Each_Quorum, All
     [Figure: ring of six nodes, numbered 1 to 6]
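One common rule of thumb for choosing between strong and eventual consistency (not stated on the slides) is that a read is guaranteed to overlap the latest write when the number of replicas read plus the number of replicas written exceeds the replication factor. A minimal sketch, with the hypothetical helper name `reads_see_latest_write`:

```python
def reads_see_latest_write(read_replicas, write_replicas, replication_factor):
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


RF = 3                 # assumed replication factor for this example
QUORUM = RF // 2 + 1   # 2 replicas when RF is 3

quorum_both = reads_see_latest_write(QUORUM, QUORUM, RF)   # QUORUM reads and writes
one_both = reads_see_latest_write(1, 1, RF)                # ONE reads and writes
write_all_read_one = reads_see_latest_write(1, RF, RF)     # ALL writes, ONE reads
print(quorum_both, one_both, write_all_read_one)
```

Under this rule, QUORUM on both sides gives strong consistency, while ONE on both sides is only eventually consistent.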
  8. Selecting a Strategy for Writes
     ›  Any – a write must succeed on any available node
     ›  One – a write must succeed on any node responsible for that row (either primary or replica)
     ›  Quorum – a write must succeed on a quorum of replica nodes, determined by (replication_factor / 2) + 1, rounded down to a whole number
     ›  Local_Quorum – a write must succeed on a quorum of replica nodes in the same data center as the coordinator node
     ›  Each_Quorum – a write must succeed on a quorum of replica nodes in each data center
     ›  All – a write must succeed on all replica nodes for a row key
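The write levels above can be turned into a count of required replica acknowledgements. The sketch below is an illustration only: the function names and the `rf_per_dc` mapping (data center name to replication factor) are assumptions made for the example, not a Cassandra API.

```python
def quorum(replication_factor):
    """Quorum size: (replication_factor / 2) + 1, using integer division."""
    return replication_factor // 2 + 1


def required_write_acks(level, rf_per_dc, local_dc):
    """Total acknowledgements needed for a write at the given level."""
    total_rf = sum(rf_per_dc.values())
    if level in ("ANY", "ONE"):
        return 1  # for ANY, a stored hint also counts as success
    if level == "QUORUM":
        return quorum(total_rf)
    if level == "LOCAL_QUORUM":
        return quorum(rf_per_dc[local_dc])
    if level == "EACH_QUORUM":
        # A quorum must be reached in every data center separately.
        return sum(quorum(rf) for rf in rf_per_dc.values())
    if level == "ALL":
        return total_rf
    raise ValueError(f"unknown consistency level: {level}")


rf_per_dc = {"dc1": 3, "dc2": 3}  # hypothetical two-data-center cluster
acks = {
    level: required_write_acks(level, rf_per_dc, "dc1")
    for level in ("ANY", "ONE", "QUORUM", "LOCAL_QUORUM", "EACH_QUORUM", "ALL")
}
print(acks)
```

With three replicas in each of two data centers, Quorum needs four acknowledgements across the cluster, while Local_Quorum needs only two from the coordinator's own data center.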
  9. Hinted Handoffs
     ›  Cassandra attempts to write a row to all replicas for that row
     ›  If some replica nodes are unavailable, a hint is stored on one node so that the downed nodes can be updated with the row once they are available again
     ›  If no replica nodes are available for a row, the ANY consistency level instructs the coordinator node to store a hint and the row data, which it passes to the replica nodes when they are available
     [Figure: Replica 1, Replica 2, Replica 3, and a stored "Hint for Node5"]
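A rough sketch of the hinted handoff idea follows. It is a simplified, single-process model with hypothetical names (`HintingCoordinator`, `replay_hints`); real hints are stored and replayed by Cassandra itself and are subject to settings this example ignores.

```python
class HintingCoordinator:
    """Toy coordinator that stores hints for replicas that are down."""

    def __init__(self, replica_names):
        self.replicas = {name: {} for name in replica_names}  # per-replica row store
        self.up = set(replica_names)                          # replicas currently reachable
        self.hints = []                                       # (target, row_key, value)

    def write(self, row_key, value):
        acks = 0
        for name, store in self.replicas.items():
            if name in self.up:
                store[row_key] = value
                acks += 1
            else:
                # Remember the write so the downed replica can catch up later.
                self.hints.append((name, row_key, value))
        return acks

    def replay_hints(self):
        remaining = []
        for target, row_key, value in self.hints:
            if target in self.up:
                self.replicas[target][row_key] = value
            else:
                remaining.append((target, row_key, value))
        self.hints = remaining


coordinator = HintingCoordinator(["replica1", "replica2", "replica3"])
coordinator.up.discard("replica3")              # replica3 goes down
acks = coordinator.write("customer:4", 500000)  # only two replicas acknowledge
hints_before = len(coordinator.hints)
coordinator.up.add("replica3")                  # replica3 comes back
coordinator.replay_hints()
print(acks, hints_before, coordinator.replicas["replica3"])
```

After the replay, the recovered replica holds the row and no hints remain.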
  10. Selecting a Strategy for Reads
     ›  One – reads from the closest node holding the data
     ›  Quorum – once a quorum of replicas has responded, returns the data with the most recent timestamp
     ›  Local_Quorum – once a quorum of replicas in the same data center as the coordinator node has responded, returns the data with the most recent timestamp
     ›  Each_Quorum – once a quorum of replicas in each data center has responded, returns the data with the most recent timestamp
     ›  All – returns a result once all replica nodes for a row key have responded
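The "most recent timestamp wins" behaviour of the read levels can be illustrated with a short sketch. The function `resolve_read` and the response tuples are made up for this example; the order of `responses` stands in for which replicas answered first.

```python
def resolve_read(responses, required):
    """Pick the newest value among the first `required` replica responses.

    responses: list of (replica_name, value, timestamp) tuples.
    """
    if len(responses) < required:
        raise RuntimeError("not enough replicas responded for this consistency level")
    consulted = responses[:required]
    return max(consulted, key=lambda response: response[2])


# replica2 has the latest write (timestamp 20); the others are stale.
responses = [
    ("replica1", 100, 10),
    ("replica2", 500000, 20),
    ("replica3", 100, 10),
]

read_one = resolve_read(responses, required=1)     # ONE: a single replica
read_quorum = resolve_read(responses, required=2)  # QUORUM with replication factor 3
print(read_one, read_quorum)
```

In this example the read at One returns the stale value from the first replica, while the Quorum read consults two replicas and returns the newer value.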
  11. Read Repair
     ›  Cassandra ensures that frequently read data remains consistent
     ›  When a read is done, the coordinator node compares, in the background, the data from all the remaining replicas that own the row. If they are inconsistent, it issues writes to the out-of-date replicas to update the row to reflect the most recently written values.
     ›  Read repair can be configured per column family and is enabled by default.
     [Figure: read repair request among Replica 1, Replica 2, and Replica 3]
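Read repair can be sketched as "find the newest version, then overwrite any replica that differs". The example below runs synchronously for simplicity, whereas the slide describes a background comparison; `read_repair` and the replica layout are hypothetical.

```python
def read_repair(replicas, row_key):
    """Return the newest value and the names of replicas that were repaired.

    replicas: dict of replica name -> {row_key: (value, timestamp)}.
    """
    versions = [store[row_key] for store in replicas.values() if row_key in store]
    newest = max(versions, key=lambda version: version[1], default=None)
    if newest is None:
        return None, []  # no replica holds the row
    repaired = []
    for name, store in replicas.items():
        if store.get(row_key) != newest:
            store[row_key] = newest  # write the latest version to the stale replica
            repaired.append(name)
    return newest[0], repaired


replicas = {
    "replica1": {"customer:4": (500000, 20)},  # up to date
    "replica2": {"customer:4": (100, 10)},     # stale value
    "replica3": {},                            # missed the write entirely
}
value, repaired = read_repair(replicas, "customer:4")
print(value, repaired)
```

After the call, all three replicas hold the same version of the row.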
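  12. CQL Examples

     SELECT total_purchases FROM SALES
     USING CONSISTENCY QUORUM
     WHERE customer_id = 5

     UPDATE SALES
     USING CONSISTENCY ONE
     SET total_purchases = 500000
     WHERE customer_id = 4

     The USING CONSISTENCY clause is CQL 2 syntax. CQL 3 removed it; the consistency level is now set per request by the client driver or with the cqlsh CONSISTENCY command.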
  13. Where to get Cassandra?
     ›  Go to www.datastax.com
     ›  DataStax makes free smart start installers available for Cassandra that include:
        ›  The most up-to-date Cassandra version that is production quality
        ›  A version of DataStax OpsCenter, which is a visual, browser-based management tool for managing and monitoring Cassandra
        ›  Drivers and connectors for popular development languages
        ›  Sample database and application
        ›  Automatic configuration assistance for ensuring optimal performance and setup for either standalone or cluster implementations
        ›  Getting Started Guide
  14. Where Can I Learn More?
     www.datastax.com
     ›  Free Online Documentation
     ›  Technical White Papers
     ›  Technical Articles
     ›  Tutorials
     ›  User Forums
     ›  User/Customer Case Studies
     ›  FAQs
     ›  Videos
     ›  Blogs
     ›  Software downloads
  15. Cassandra Essentials Tutorial Series: Understanding Data Partitioning and Replication in Apache Cassandra
     Thanks!
