The best knowledge base on Apache Cassandra®

Helping platform leaders, architects, engineers, and operators build scalable real time data platforms.

6/10/2020

Reading time:5 min

Cassandra 3.0 Data Modeling

by DataStax Academy

Cassandra 3.0 Data Modeling SlideShare Explore You Successfully reported this slideshow.Cassandra 3.0 Data ModelingUpcoming SlideShareLoading in …5× 1 Comment 5 Likes Statistics Notes Younes Rahimi , Full Stack Developer at Institute of Applied Intelligent Systems of University of Tehran (IAIS) at Institute of Applied Intelligent Systems of University of Tehran (IAIS) Aravindh Elanthirian Krishnan , Member of Technical Staff at Zoho Corporation at Zoho Corporation Saulius Preidys , Member of Committee of Management of Information Systems Programme at Communication faculty, Vilnius University at Vilnius University Colin Kuo , Staff Engineer at Brocade Communications Systems at Brocade Radha Krishna Proddaturi , Principal Software Engineer at Coupons.com at Coupons.com No DownloadsNo notes for slide 1. @PatrickMcFadinPatrick McFadin
Chief Evangelist, DataStaxCassandra 3.0 Data Modeling1 2. A brief history of CQLYou 3. CQL 3.0 - Cassandra 1.2• Goodbye CQL 2.0!• Custom secondary indexes• Empty IN 4. CQL 3.1 - Cassandra 2.0• Aliases• CREATE <table> IF NOT EXISTS• INSERT IF NOT EXISTS• UPDATE IF• DELETE IF EXISTS• IN supports cluster columnsLWT 5. CQL 3.2 - Cassandra 2.1• User Defined Types• Collection Indexing• Indexes can use contains• Tuples? 6. User Defined TypesCREATE TYPE video_metadata (height int,width int,video_bit_rate set<text>,encoding text); 7. User Defined TypesCREATE TABLE videos (videoid uuid,userid uuid,name varchar,description varchar,location text,location_type int,preview_thumbnails map<text,text>,tags set<varchar>,metadata set <frozen<video_metadata>>,added_date timestamp,PRIMARY KEY (videoid)); 8. CQL 3.3 - Cassandra 2.2• Date and Time are now types• TinyInt and SmallInt• User Defined Functions• Aggregates• User Defined Aggregates 9. User Defined FunctionsCREATE TABLE video_rating (videoid uuid,rating_counter counter,rating_total counter,PRIMARY KEY (videoid));CREATE OR REPLACE FUNCTIONavg_rating (rating_counter counter, rating_total counter)CALLED ON NULL INPUTRETURNS doubleLANGUAGE java AS'return Double.valueOf(rating_total.doubleValue()/rating_counter.doubleValue());'; 10. User Defined FunctionsSELECT avg_rating(rating_counter, rating_total) AS avg_ratingFROM video_ratingWHERE videoid = 99051fe9-6a9c-46c2-b949-38ef78858dd0; 11. AggregatesCREATE TABLE video_ratings_by_user (videoid uuid,userid uuid,rating int,PRIMARY KEY (videoid, userid));SELECT count(userid)FROM video_ratings_by_userWHERE videoed = 49f64d40-7d89-4890-b910-dbf923563a33 12. CQL 3.4 - Cassandra 3.x• CAST operator• Per Partition Limit• Materialized Views• SASI 13. Materialized ViewCREATE TABLE videos (videoid uuid,userid uuid,name varchar,description varchar,location text,location_type int,preview_thumbnails map<text,text>,tags set<varchar>,metadata set <frozen<video_metadata>>,added_date timestamp,PRIMARY KEY (videoid));Lookup by this? 14. Materialized ViewCREATE TABLE videos_by_location (videoid uuid,userid uuid,location text,added_date timestamp,PRIMARY KEY (location, videoid));Roll your own 15. Materialized ViewCREATE MATERIALIZED VIEW videos_by_locationAS SELECT userid, added_date, videoid, locationFROM videosWHERE videoId IS NOT NULL AND location IS NOT NULLPRIMARY KEY(location, videoid);Cassandra rolls for you 16. Materialized View Perf 17. Materialized View Perf5 Materialized Views vs 5 tables writes async 18. Materialized ViewSELECT location, videoidFROM videos_by_location ;location | videoid-------------------------------------------------+--------------------------------------http://www.youtube.com/watch?v=px6U2n74q3g | 06049cbb-dfed-421f-b889-5f649a0de1edhttp://www.youtube.com/watch?v=qphhxujn5Es | 873ff430-9c23-4e60-be5f-278ea2bb21bd/us/vid/0c/0c3f7e87-f6b6-41d2-9668-2b64d117102c | 0c3f7e87-f6b6-41d2-9668-2b64d117102c/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401 | 99051fe9-6a9c-46c2-b949-38ef78858dd0/us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401 | b3a76c6b-7c7f-4af6-964f-803a9283c401http://www.youtube.com/watch?v=HdJlsOZVGwM | 49f64d40-7d89-4890-b910-dbf923563a33/us/vid/41/416a5ddc-00a5-49ed-adde-d99da9a27c0c | 416a5ddc-00a5-49ed-adde-d99da9a27c0c 19. SASICREATE TABLE users (userid uuid,firstname varchar,lastname varchar,email text,created_date timestamp,PRIMARY KEY (userid));Lookup by this? 20. Storage Attached Secondary Index 21. SASI 22. SASICREATE CUSTOM INDEX ON users (firstname)USING 'org.apache.cassandra.index.sasi.SASIIndex'WITH OPTIONS = {'analyzer_class':'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer','case_sensitive': 'false'}; 23. SASICREATE CUSTOM INDEX ON users (lastname)USING 'org.apache.cassandra.index.sasi.SASIIndex'WITH OPTIONS = {'mode': 'CONTAINS'}; 24. SASICREATE CUSTOM INDEX ON users (created_date)USING 'org.apache.cassandra.index.sasi.SASIIndex'WITH OPTIONS = {'mode': 'SPARSE'}; 25. SASI IndexesClientINSERT INTO users(userid,firstname,lastname,email,created_date)
VALUES (9761d3d7-7fbd-4269-9988-6cfd4e188678,’Patrick’,’McFadin’,’patrick@datastax.com’,’2015-06-01’);userid 1userid 2MemtableSSTableSSTableSSTableSASI IndexNodeDatalastnamelastnamefirstnamefirstnameemailemailcreated_datecreated_dateSASI IndexSASI IndexIndexer 26. SASI QueriesSELECT * FROM users WHERE firstname LIKE 'pat%';SELECT * FROM users WHERE lastname LIKE ‘%Fad%';SELECT * FROM users WHERE email LIKE '%data%';SELECT * FROM usersWHERE created_date > '2011-6-15'AND created_date < '2011-06-30';userid | created_date | email | firstname | lastname--------------------------------------+---------------------------------+----------------------+-----------+----------9761d3d7-7fbd-4269-9988-6cfd4e188678 | 2011-06-20 20:50:00.000000+0000 | patrick@datastax.com | Patrick | McFadin 27. SASI Guidelines• Multiple fields to search• No more than 1000 rows returned• You know the partition key• Indexing static columnsUse SASI when… 28. SASI Guidelines• Searching large partitions• Tight SLA on reads• Search for analytics• Ordering search is importantDon’t Use SASI when… Recommended Teaching Techniques: Blended LearningOnline Course - LinkedIn Learning Gamification of LearningOnline Course - LinkedIn Learning Social Media in the ClassroomOnline Course - LinkedIn Learning Data Modeling for Apache CassandraDataStax Academy Forrester CXNYC 2017 - Delivering great real-time cx is a true craftDataStax Academy Introduction to DataStax Enterprise Graph DatabaseDataStax Academy Introduction to DataStax Enterprise Advanced Replication with Apache CassandraDataStax Academy Cassandra on Docker @ Walmart LabsDataStax Academy Cassandra Adoption on Cisco UCS & Open stackDataStax Academy Coursera Cassandra DriverDataStax Academy About Blog Terms Privacy Copyright LinkedIn Corporation © 2020 × Public clipboards featuring this slideNo public clipboards found for this slideSelect another clipboard ×Looks like you’ve clipped this slide to already.Create a clipboardYou just clipped your first slide! Clipping is a handy way to collect important slides you want to go back to later. Now customize the name of a clipboard to store your clips. Description Visibility Others can see my Clipboard

Cassandra 3.0 Data Modeling

Successfully reported this slideshow.

Cassandra 3.0 Data Modeling
@PatrickMcFadin
Patrick McFadin

Chief Evangelist, DataStax
Cassandra 3.0 Data Modeling
1
A brief history of CQL
You
CQL 3.0 - Cassandra 1.2
• Goodbye CQL 2.0!
• Custom secondary indexes
• Empty IN
CQL 3.1 - Cassandra 2.0
• Aliases
• CREATE <table> IF NOT EXISTS
• INSERT IF NOT EXISTS
• UPDATE IF
• DELETE IF EXISTS
• I...
CQL 3.2 - Cassandra 2.1
• User Defined Types
• Collection Indexing
• Indexes can use contains
• Tuples?
User Defined Types
CREATE TYPE video_metadata (
height int,
width int,
video_bit_rate set<text>,
encoding text
);
User Defined Types
CREATE TABLE videos (
videoid uuid,
userid uuid,
name varchar,
description varchar,
location text,
locat...
CQL 3.3 - Cassandra 2.2
• Date and Time are now types
• TinyInt and SmallInt
• User Defined Functions
• Aggregates
• User ...
User Defined Functions
CREATE TABLE video_rating (
videoid uuid,
rating_counter counter,
rating_total counter,
PRIMARY KEY ...
User Defined Functions
SELECT avg_rating(rating_counter, rating_total) AS avg_rating
FROM video_rating
WHERE videoid = 9905...
Aggregates
CREATE TABLE video_ratings_by_user (
videoid uuid,
userid uuid,
rating int,
PRIMARY KEY (videoid, userid)
);
SE...
CQL 3.4 - Cassandra 3.x
• CAST operator
• Per Partition Limit
• Materialized Views
• SASI
Materialized View
CREATE TABLE videos (
videoid uuid,
userid uuid,
name varchar,
description varchar,
location text,
locat...
Materialized View
CREATE TABLE videos_by_location (
videoid uuid,
userid uuid,
location text,
added_date timestamp,
PRIMAR...
Materialized View
CREATE MATERIALIZED VIEW videos_by_location
AS SELECT userid, added_date, videoid, location
FROM videos
...
Materialized View Perf
Materialized View Perf
5 Materialized Views vs 5 tables writes async
Materialized View
SELECT location, videoid
FROM videos_by_location ;
location | videoid
----------------------------------...
SASI
CREATE TABLE users (
userid uuid,
firstname varchar,
lastname varchar,
email text,
created_date timestamp,
PRIMARY KE...
Storage Attached Secondary Index
SASI
SASI
CREATE CUSTOM INDEX ON users (firstname)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {
'analyzer...
SASI
CREATE CUSTOM INDEX ON users (lastname)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {'mode': 'CO...
SASI
CREATE CUSTOM INDEX ON users (created_date)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {'mode':...
SASI Indexes
Client
INSERT INTO users(userid,firstname,lastname,email,created_date)

VALUES (9761d3d7-7fbd-4269-9988-6cfd4...
SASI Queries
SELECT * FROM users WHERE firstname LIKE 'pat%';
SELECT * FROM users WHERE lastname LIKE ‘%Fad%';
SELECT * FR...
SASI Guidelines
• Multiple fields to search
• No more than 1000 rows returned
• You know the partition key
• Indexing stat...
SASI Guidelines
• Searching large partitions
• Tight SLA on reads
• Search for analytics
• Ordering search is important
Do...
Cassandra 3.0 Data Modeling

Upcoming SlideShare

Loading in …5

×

  1. 1. @PatrickMcFadin Patrick McFadin
 Chief Evangelist, DataStax Cassandra 3.0 Data Modeling 1
  2. 2. A brief history of CQL You
  3. 3. CQL 3.0 - Cassandra 1.2 • Goodbye CQL 2.0! • Custom secondary indexes • Empty IN
  4. 4. CQL 3.1 - Cassandra 2.0 • Aliases • CREATE <table> IF NOT EXISTS • INSERT IF NOT EXISTS • UPDATE IF • DELETE IF EXISTS • IN supports cluster columns LWT
  5. 5. CQL 3.2 - Cassandra 2.1 • User Defined Types • Collection Indexing • Indexes can use contains • Tuples?
  6. 6. User Defined Types CREATE TYPE video_metadata ( height int, width int, video_bit_rate set<text>, encoding text );
  7. 7. User Defined Types CREATE TABLE videos ( videoid uuid, userid uuid, name varchar, description varchar, location text, location_type int, preview_thumbnails map<text,text>, tags set<varchar>, metadata set <frozen<video_metadata>>, added_date timestamp, PRIMARY KEY (videoid) );
  8. 8. CQL 3.3 - Cassandra 2.2 • Date and Time are now types • TinyInt and SmallInt • User Defined Functions • Aggregates • User Defined Aggregates
  9. 9. User Defined Functions CREATE TABLE video_rating ( videoid uuid, rating_counter counter, rating_total counter, PRIMARY KEY (videoid) ); CREATE OR REPLACE FUNCTION avg_rating (rating_counter counter, rating_total counter) CALLED ON NULL INPUT RETURNS double LANGUAGE java AS 'return Double.valueOf(rating_total.doubleValue()/ rating_counter.doubleValue());';
  10. 10. User Defined Functions SELECT avg_rating(rating_counter, rating_total) AS avg_rating FROM video_rating WHERE videoid = 99051fe9-6a9c-46c2-b949-38ef78858dd0;
  11. 11. Aggregates CREATE TABLE video_ratings_by_user ( videoid uuid, userid uuid, rating int, PRIMARY KEY (videoid, userid) ); SELECT count(userid) FROM video_ratings_by_user WHERE videoed = 49f64d40-7d89-4890-b910-dbf923563a33
  12. 12. CQL 3.4 - Cassandra 3.x • CAST operator • Per Partition Limit • Materialized Views • SASI
  13. 13. Materialized View CREATE TABLE videos ( videoid uuid, userid uuid, name varchar, description varchar, location text, location_type int, preview_thumbnails map<text,text>, tags set<varchar>, metadata set <frozen<video_metadata>>, added_date timestamp, PRIMARY KEY (videoid) ); Lookup by this?
  14. 14. Materialized View CREATE TABLE videos_by_location ( videoid uuid, userid uuid, location text, added_date timestamp, PRIMARY KEY (location, videoid) ); Roll your own
  15. 15. Materialized View CREATE MATERIALIZED VIEW videos_by_location AS SELECT userid, added_date, videoid, location FROM videos WHERE videoId IS NOT NULL AND location IS NOT NULL PRIMARY KEY(location, videoid); Cassandra rolls for you
  16. 16. Materialized View Perf
  17. 17. Materialized View Perf 5 Materialized Views vs 5 tables writes async
  18. 18. Materialized View SELECT location, videoid FROM videos_by_location ; location | videoid -------------------------------------------------+-------------------------------------- http://www.youtube.com/watch?v=px6U2n74q3g | 06049cbb-dfed-421f-b889-5f649a0de1ed http://www.youtube.com/watch?v=qphhxujn5Es | 873ff430-9c23-4e60-be5f-278ea2bb21bd /us/vid/0c/0c3f7e87-f6b6-41d2-9668-2b64d117102c | 0c3f7e87-f6b6-41d2-9668-2b64d117102c /us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401 | 99051fe9-6a9c-46c2-b949-38ef78858dd0 /us/vid/b3/b3a76c6b-7c7f-4af6-964f-803a9283c401 | b3a76c6b-7c7f-4af6-964f-803a9283c401 http://www.youtube.com/watch?v=HdJlsOZVGwM | 49f64d40-7d89-4890-b910-dbf923563a33 /us/vid/41/416a5ddc-00a5-49ed-adde-d99da9a27c0c | 416a5ddc-00a5-49ed-adde-d99da9a27c0c
  19. 19. SASI CREATE TABLE users ( userid uuid, firstname varchar, lastname varchar, email text, created_date timestamp, PRIMARY KEY (userid) ); Lookup by this?
  20. 20. Storage Attached Secondary Index
  21. 21. SASI
  22. 22. SASI CREATE CUSTOM INDEX ON users (firstname) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = { 'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', 'case_sensitive': 'false' };
  23. 23. SASI CREATE CUSTOM INDEX ON users (lastname) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {'mode': 'CONTAINS'};
  24. 24. SASI CREATE CUSTOM INDEX ON users (created_date) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {'mode': 'SPARSE'};
  25. 25. SASI Indexes Client INSERT INTO users(userid,firstname,lastname,email,created_date)
 VALUES (9761d3d7-7fbd-4269-9988-6cfd4e188678,’Patrick’,’McFadin’, ’patrick@datastax.com’,’2015-06-01’); userid 1 userid 2 Memtable SSTable SSTable SSTable SASI Index Node Data lastname lastname firstname firstname email email created_date created_date SASI Index SASI Index Indexer
  26. 26. SASI Queries SELECT * FROM users WHERE firstname LIKE 'pat%'; SELECT * FROM users WHERE lastname LIKE ‘%Fad%'; SELECT * FROM users WHERE email LIKE '%data%'; SELECT * FROM users WHERE created_date > '2011-6-15' AND created_date < '2011-06-30'; userid | created_date | email | firstname | lastname --------------------------------------+---------------------------------+----------------------+-----------+---------- 9761d3d7-7fbd-4269-9988-6cfd4e188678 | 2011-06-20 20:50:00.000000+0000 | patrick@datastax.com | Patrick | McFadin
  27. 27. SASI Guidelines • Multiple fields to search • No more than 1000 rows returned • You know the partition key • Indexing static columns Use SASI when…
  28. 28. SASI Guidelines • Searching large partitions • Tight SLA on reads • Search for analytics • Ordering search is important Don’t Use SASI when…

×