How to size up an Apache Cassandra cluster (Training)

  1. How To Size Up A Cassandra Cluster. Joe Chu, Technical Trainer, jchu@datastax.com. April 2014. ©2014 DataStax Confidential. Do not distribute without consent.
  2. What is Apache Cassandra? • Distributed NoSQL database • Linearly scalable • Highly available with no single point of failure • Fast writes and reads • Tunable data consistency • Rack and datacenter awareness
  3. Peer-to-peer architecture • All nodes are the same • No master/slave architecture • Less operational overhead for better scalability • Eliminates the single point of failure, increasing availability
  4. Linear Scalability • Operation throughput increases linearly with the number of nodes added.
  5. Data Replication • Cassandra can write copies of data on different nodes (e.g. RF = 3). • The replication factor setting determines the number of copies. • The replication strategy can replicate data to different racks and different datacenters. Example: INSERT INTO user_table (id, first_name, last_name) VALUES (1, 'John', 'Smith');
  6. Node • Instance of a running Cassandra process. • Usually represents a single machine or server.
  7. Rack • Logical grouping of nodes. • Allows data to be replicated across different racks.
  8. Datacenter • Grouping of nodes and racks. • Each datacenter can have separate replication settings. • May be in different geographical locations, but not always.
  9. Cluster • Grouping of datacenters, racks, and nodes that communicate with each other and replicate data. • Clusters are not aware of other clusters.
  10. Consistency Models • Immediate consistency: when a write is successful, subsequent reads are guaranteed to return the latest value. • Eventual consistency: when a write is successful, stale data may still be read, but reads will eventually return the latest value.
  11. Tunable Consistency • Cassandra offers the ability to choose between immediate and eventual consistency by setting a consistency level. • The consistency level is set per read or write operation. • Common consistency levels are ONE, QUORUM, and ALL. • For multiple datacenters, additional levels such as LOCAL_QUORUM and EACH_QUORUM are available to control cross-datacenter traffic.
  12. CL ONE • Write: success when at least one replica node has acknowledged the write. • Read: only one replica node is given the read request.
  13. CL QUORUM • Write: success when a majority of the replica nodes have acknowledged the write. • Read: a majority of the replica nodes are given the read request. • Majority = ( RF / 2 ) + 1
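The majority rule can be sanity-checked with a short sketch (an illustration; the function name is ours, not from the deck):

```python
def quorum(rf: int) -> int:
    """Replica acknowledgements required for CL QUORUM: (RF / 2) + 1, using integer division."""
    return rf // 2 + 1

for rf in range(1, 6):
    print(f"RF={rf} -> quorum={quorum(rf)}")
# RF=3 -> quorum=2: one replica can be down and QUORUM still succeeds.
# RF=2 -> quorum=2: QUORUM needs every replica, the same as ALL.
```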
  14. CL ALL • Write: success when all of the replica nodes have acknowledged the write. • Read: all replica nodes are given the read request.
  15. Log-Structured Storage Engine • Cassandra's storage engine is inspired by Google BigTable. • Key to fast write performance on Cassandra. • Write path: Commit Log → Memtable → SSTables.
  16. Updates and Deletes • SSTable files are immutable and cannot be changed. • Updates are written as new data. • Deletes write a tombstone, which marks a row or column(s) as deleted. • Updates and deletes are just as fast as inserts. Example versions of the same row across SSTables: (id:1, first:John, last:Smith, timestamp: …405), (id:1, first:John, last:Williams, timestamp: …621), (id:1, deleted, timestamp: …999).
  17. Compaction • Periodically, an operation is triggered that merges the data in several SSTables into a single SSTable. • Helps to limit the number of SSTables to read. • Removes old data and tombstones. • The input SSTables are deleted after compaction. In the slide's example, only the newest version (id:1, deleted, timestamp 999) is written to the new SSTable.
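The merge step can be sketched as a toy last-write-wins reduction (illustrative only; real compaction streams data and only purges tombstones after a grace period):

```python
def compact(*sstables):
    """Merge SSTables into one, keeping only the newest version of each row
    (highest timestamp). A tombstone wins like any other write."""
    merged = {}
    for sstable in sstables:
        for row in sstable:
            key = row["id"]
            if key not in merged or row["ts"] > merged[key]["ts"]:
                merged[key] = row
    return list(merged.values())

# The three versions of row id=1 from the slides collapse to the tombstone:
sstables = (
    [{"id": 1, "first": "John", "last": "Smith", "ts": 405}],
    [{"id": 1, "first": "John", "last": "Williams", "ts": 621}],
    [{"id": 1, "deleted": True, "ts": 999}],
)
print(compact(*sstables))  # [{'id': 1, 'deleted': True, 'ts': 999}]
```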
  18. Cluster Sizing
  19. Cluster Sizing Considerations • Replication Factor • Data Size: "How many nodes would I need to store my data set?" • Data Velocity (Performance): "How many nodes would I need to achieve my desired throughput?"
  20. Choosing a Replication Factor
  21. What Are You Using Replication For? • Durability or availability? • Each node has local durability (commit log), but replication can be used for distributed durability. • For availability, a recommended setting is RF = 3. • RF = 3 is the minimum necessary to achieve both consistency and availability using QUORUM and LOCAL_QUORUM.
  22. How Replication Can Affect Consistency Level • When RF < 3, you do not have as much flexibility when choosing between consistency and availability. • With RF = 2, a quorum is both replicas, so QUORUM is equivalent to ALL.
  23. Using a Larger Replication Factor • When RF > 3, there is more data usage and higher latency for operations requiring immediate consistency. • If using eventual consistency, a consistency level of ONE will have consistent performance regardless of the replication factor. • High-availability clusters may use a replication factor as high as 5.
  24. Data Size
  25. Disk Usage Factors • Data Size • Replication Setting • Old Data • Compaction • Snapshots
  26. Data Sizing • Row and Column Data • Row and Column Overhead • Indices and Other Structures
  27. Replication Overhead • A replication factor > 1 effectively multiplies your data size by that amount.
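The multiplication is simple but easy to forget when budgeting disk; a minimal sketch with illustrative numbers:

```python
def replicated_size(raw_gb: float, rf: int) -> float:
    """Total data stored across the cluster: each of the RF copies counts toward disk."""
    return raw_gb * rf

print(replicated_size(500, 3))  # 500 GB of unique data at RF=3 -> 1500.0 GB on disk
```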
  28. Old Data • Updates and deletes do not actually overwrite or delete data. • Older versions of data and tombstones remain in the SSTable files until they are compacted. • This becomes more important for heavy update and delete workloads.
  29. Compaction • Compaction needs free disk space to write the new SSTable before the SSTables being compacted are removed. • Leave enough free disk space on each node to allow compactions to run. • The worst case for the Size-Tiered Compaction Strategy is 50% of the total data capacity of the node. • For the Leveled Compaction Strategy, it is about 10% of the total data capacity.
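Those worst-case figures translate directly into how full a node's disk can safely get; a hedged sketch using the 50% and 10% numbers above:

```python
def usable_capacity(disk_gb: float, strategy: str) -> float:
    """Live data a node can hold while reserving worst-case compaction headroom:
    50% free for size-tiered (STCS), ~10% free for leveled (LCS)."""
    headroom = {"STCS": 0.50, "LCS": 0.10}[strategy]
    return disk_gb * (1 - headroom)

print(usable_capacity(1000, "STCS"))  # 500.0 GB of live data on a 1 TB disk
print(usable_capacity(1000, "LCS"))   # 900.0
```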
  30. Snapshots • Snapshots are hard links or copies of SSTable data files. • After SSTables are compacted, the disk space may not be reclaimed if a snapshot of those SSTables was created. Snapshots are created when: executing the nodetool snapshot command, dropping a keyspace or table, taking incremental backups, or during compaction.
  31. Recommended Disk Capacity • For current Cassandra versions, the ideal disk capacity is approximately 1 TB per node if using spinning disks, and 3-5 TB per node using SSDs. • A larger disk capacity may be limited by the resulting performance. • What works for you still depends on your data model design and desired data velocity.
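Putting the data-size factors together, a rough node-count estimate might look like this (all parameters are assumptions to adjust, not recommendations from the deck):

```python
import math

def nodes_for_data(raw_tb: float, rf: int, per_node_tb: float, headroom: float = 0.5) -> int:
    """Nodes needed to store a data set: replicate the raw size, then divide by
    what each node can hold after reserving compaction headroom."""
    usable_per_node = per_node_tb * (1 - headroom)
    return math.ceil(raw_tb * rf / usable_per_node)

# 10 TB of unique data, RF=3, 1 TB spinning disks, 50% STCS headroom:
print(nodes_for_data(10, 3, 1.0))  # 60
```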
  32. Data Velocity (Performance)
  33. How to Measure Performance • I/O throughput: "How many reads and writes can be completed per second?" • Read and write latency: "How fast should I be able to get a response for my read and write requests?"
  34. Sizing for Failure • The cluster must be sized taking into account the performance impact caused by failure. • When a node fails, the corresponding workload must be absorbed by the other replica nodes in the cluster. • Performance is further impacted when recovering a node: data must be streamed or repaired using the other replica nodes.
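A simplified way to reason about failure headroom (this assumes load redistributes evenly across survivors, which is optimistic: in practice only the replicas of the failed node's token ranges absorb it):

```python
import math

def per_node_load(total_ops: float, nodes: int, failed: int = 0) -> float:
    """Ops/sec each surviving node must serve when `failed` nodes are down."""
    return total_ops / (nodes - failed)

# Size so survivors stay at or under a per-node ceiling with one node down:
total_ops, ceiling = 90_000, 10_000
nodes = math.ceil(total_ops / ceiling) + 1  # one extra node as failure headroom
print(per_node_load(total_ops, nodes))            # 9000.0 ops/sec when healthy
print(per_node_load(total_ops, nodes, failed=1))  # 10000.0, right at the ceiling
```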
  35. Hardware Considerations for Performance • CPU: operations are often CPU-intensive; more cores are better. • Memory: Cassandra uses JVM heap memory; additional memory is used as off-heap memory by Cassandra, or as the OS page cache. • Disk: Cassandra is optimized for spinning disks, but SSDs will perform better; attached storage (SAN) is strongly discouraged.
  36. Some Final Words…
  37. Summary • Cassandra allows flexibility when sizing your cluster, from a single node to thousands of nodes. • Your use case will dictate how you want to size and configure your Cassandra cluster: do you need availability? Immediate consistency? • The minimum number of nodes needed is determined by your data size, desired performance, and replication factor.
  38. Additional Resources • DataStax Documentation: http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architecturePlanningAbout_c.html • Planet Cassandra: http://planetcassandra.org/nosql-cassandra-education/ • Cassandra Users Mailing List: user-subscribe@cassandra.apache.org, http://mail-archives.apache.org/mod_mbox/cassandra-user/
  39. Questions?
  40. Thank You. We power the big data apps that transform business.