9/23/2020
Introduction to Cassandra
by Gokhan Atil
1. INTRODUCTION TO APACHE CASSANDRA
Gökhan Atıl

2. GÖKHAN ATIL
➤ Database Administrator
➤ Oracle ACE Director (2016), ACE (2011)
➤ 10g/11g and R12 Oracle Certified Professional (OCP)
➤ Co-author of Expert Oracle Enterprise Manager 12c
➤ Founding Member and Vice President of TROUG
➤ Blogger (since 2008): gokhanatil.com
➤ Twitter: @gokhanatil

3. INTRODUCTION TO APACHE CASSANDRA
➤ What is Apache Cassandra? Why use it?
➤ Cassandra Architecture
➤ Cassandra Query Language (CQL)
➤ Cassandra Data Modeling
➤ How to install and run Cassandra
➤ Cassandra nodetool
➤ Backup and Recovery

4. WHAT IS APACHE CASSANDRA? WHY USE IT?

5. WHAT IS APACHE CASSANDRA? WHY USE IT?
➤ Fast Distributed (Column Family NoSQL) Database: High Availability, Linear Scalability, High Performance
➤ Fault tolerant on Commodity Hardware
➤ Multi-Data Center Support
➤ Easy to operate
➤ Proven: CERN, Netflix, eBay, GitHub, Instagram, Reddit

6. HIGH AVAILABILITY: CAP THEOREM AND CASSANDRA
(Diagram labels: Partition Tolerance, Availability, Consistency (ACID); RDBMS: Atomicity, Consistency, Isolation, Durability)

7. HIGH AVAILABILITY: THE RING
(Diagram labels: NO MASTER, NO SLAVE; PEER TO PEER; gossip; "I'm online!")

8. LINEAR SCALABILITY

9. CASSANDRA ARCHITECTURE

10. CASSANDRA PARTITIONS
EMAIL     NAME     PHONE
gokhan@   Gokhan   542xxxxxxx
aylin@    Aylin    532xxxxxxx
ilayda@   Ilayda   532xxxxxxx
(Diagram labels: partitioner; PRIMARY KEY: PARTITION KEY, CLUSTERING KEY)

11. REPLICATION FACTOR
(Diagram labels: EMAIL gokhan@; Murmur3Partitioner; # 60)

12. WRITE PATH (CLUSTER)
(Diagram labels: client, coordinator node, hinted hand off)

13. WRITE PATH (NODE)
➤ Logging data in the commit log
➤ Writing data to the memtable
➤ Flushing to (immutable) SSTables (Sorted Strings Table)
(Diagram labels: memtable, commit log, SSTable (x3), disk, mem, flush, compaction)

14. READ PATH (CLUSTER)
➤ Read Repair: repair during read path using digest and timestamp
(Diagram labels: client, coordinator node, data, digest, digest)

15. READ PATH (NODE)
(Diagram labels: memtable, row (read) cache, bloom filter (maybe or no), partition key cache, partition summary, partition index, SSTable; found, maybe, found, no; disk, mem)

16. CONSISTENCY LEVELS
➤ Formula for Strong Consistency: R + W > N
ANY (write only): at least one node
ONE, TWO, THREE: at least one/two/three replica nodes
QUORUM: a quorum (N/2+1) of replica nodes across all datacenters
LOCAL_QUORUM: a quorum (N/2+1) of replica nodes in the same datacenter
ALL: on all replica nodes

17. CASSANDRA QUERY LANGUAGE (CQL)

18. CASSANDRA QUERY LANGUAGE (CQL)
➤ Create a Keyspace (Database): create keyspace demo with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
➤ Remove a keyspace: drop keyspace demo;
➤ Select a keyspace to operate on: use demo;

19. CASSANDRA QUERY LANGUAGE (CQL)
➤ Create a table: create table demo.democlients ( email text, name text, phone text, primary key (email, name) );
➤ Alter a table: alter table democlients add money int;
➤ Remove a table: drop table democlients;
➤ Remove all rows in a table: truncate table democlients;
(Diagram labels: EMAIL: PARTITION KEY; NAME: CLUSTERING KEY)

20. CASSANDRA QUERY LANGUAGE (CQL)
➤ Retrieve rows: select * from democlients where name='Gokhan Atil' ALLOW FILTERING; -- or create a secondary index
➤ Retrieve distinct values: select DISTINCT email from democlients;
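
(Editor's sketch, not part of the original slides.) The strong-consistency rule on slide 16, R + W > N, together with the quorum size N/2+1, can be checked with a few lines of Python. The function names `quorum` and `is_strongly_consistent` are hypothetical helpers chosen for illustration; they are not part of Cassandra or any driver. The sketch assumes N/2 means integer division, and that R and W count the replicas that must acknowledge a read or a write.

```python
# Illustrative helpers only: these names are assumptions for this sketch,
# not functions from Cassandra or any client library.

def quorum(replication_factor: int) -> int:
    """Replicas needed for QUORUM: N/2 + 1, using integer division."""
    return replication_factor // 2 + 1


def is_strongly_consistent(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """Apply the rule R + W > N: the read set must overlap the write set."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    n = 3              # replication factor N
    q = quorum(n)      # QUORUM for both reads and writes
    print(q)
    print(is_strongly_consistent(q, q, n))   # QUORUM reads + QUORUM writes
    print(is_strongly_consistent(1, 1, n))   # ONE reads + ONE writes
```

With N = 3, QUORUM is 2 replicas, so QUORUM reads plus QUORUM writes give 2 + 2 > 3 and satisfy the rule, while ONE plus ONE gives 1 + 1 = 2, which does not.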