Migration from Thrift to CQL (Brij Bhushan Ravat, Ericsson) | Cassandra Summit 2016
- 1. Migration from Thrift to CQL Brij Bhushan Ravat Chief Architect, Voucher Server - Charging System
- 2. Migration from Thrift to CQL | Public | © Ericsson AB 2016 | 2016-09-04 | Page 2 To be continued … “I didn't come here to tell you how this is going to end. I came here to tell you how it's going to begin.”
- 3. 1 Why Thrift-to-CQL | 2 Impact on data model | 3 Approaches | 4 Comparison | 5 Summary
- 4. Context
  When everyone thought there was no alternative to an RDBMS, one person started his journey with Cassandra. He interfaced the application with Cassandra through the Thrift interface. Thrift is now deprecated, and CQL is the new interface.
- 5. Thrift & CQL
  › The CQL interface was introduced in Cassandra in November 2012.
  › Since then, applications can interface with Cassandra through either Thrift or CQL.
- 6. Thrift & CQL
  › Development of the Thrift interface has been officially frozen for the last two years.
  › Thrift support will be removed completely in Cassandra 4.0.
  › Therefore, moving from Thrift to CQL is not just a choice. It is mandatory:
    – to leverage the new capabilities of Cassandra, and
    – to make your application ready for Cassandra 4.0.
  › Moreover:
    – from Cassandra 3.0 onwards, CQL performs much better than Thrift, and
    – CQL is easier to use because it is similar to SQL.
- 7. A few points to ponder
  › Moving from Thrift to CQL changes every touch point between an application and Cassandra.
  › The application may therefore have to redesign its framework for operations that work directly on data, for example:
    – atomicity of multiple updates
    – isolation of a transaction
- 8. 1 Why Thrift-to-CQL | 2 Impact on data model | 3 Approaches | 4 Comparison | 5 Summary
- 9. Impact on data model
  › Unlike Thrift, CQL depends more heavily on column_metadata.
  › Therefore, if a table has both fixed and dynamic columns, CQL can read only the fixed columns.
- 10. Cassandra table

  Key     Name       City       State  Zip    Phone
  ABC_SJ  ABC Hotel  San Jose   CA     95113  124-543-2244
  567_SJ  567 Hotel  San Jose   CA     95113  124-756-1567
  XYZ_LV  XYZ Hotel  Las Vegas  NV     89109  311-587-2222

  create column family hotels
    with key_validation_class = UTF8Type
    and comparator = UTF8Type
    and column_metadata = [
      {column_name: Name, validation_class: UTF8Type},
      {column_name: City, validation_class: UTF8Type},
      {column_name: State, validation_class: UTF8Type},
      {column_name: Zip, validation_class: IntegerType},
      {column_name: Phone, validation_class: UTF8Type}
    ];
- 11. Cassandra table: actual format

  Row key  Column  Value         Timestamp
  ABC_SJ   Name    ABC Hotel     2005-10-30T10:45
           City    San Jose      2005-10-30T10:46
           State   CA            2005-10-30T10:47
           Zip     95113         2005-10-30T10:48
           Phone   124-543-2244  2011-04-16T08:15
  567_SJ   Name    567 Hotel     2005-11-14T15:06
           City    San Jose      2005-11-14T15:06
           State   CA            2005-11-14T15:06
           Zip     95113         2005-11-14T15:06
           Phone   124-756-1567  2005-11-14T15:06
  XYZ_LV   Name    XYZ Hotel     2005-02-21T09:10
           City    Las Vegas     2005-02-21T09:10
           State   NV            2005-02-21T09:10
           Zip     89109         2005-02-21T09:10
           Phone   311-587-2222  2007-12-02T14:02
- 12. Dynamic column

  Row key  Column     Value         Timestamp
  ABC_SJ   Name       ABC Hotel     2005-10-30T10:45
           City       San Jose      2005-10-30T10:46
           State      CA            2005-10-30T10:47
           Zip        95113         2005-10-30T10:48
           Phone      124-543-2244  2011-04-16T08:15
           My Rating  Just average  2011-05-22T15:07
  567_SJ   Name       567 Hotel     2005-11-14T15:06
           City       San Jose      2005-11-14T15:06
           State      CA            2005-11-14T15:06
           Zip        95113         2005-11-14T15:06
           Phone      124-756-1567  2005-11-14T15:06
  XYZ_LV   Name       XYZ Hotel     2005-02-21T09:10
           City       Las Vegas     2005-02-21T09:10
           State      NV            2005-02-21T09:10

  • If a table has a dynamic column alongside its fixed columns,
  • CQL will fail to read the dynamic column(s),
  • because, unlike Thrift, CQL depends more heavily on metadata.

  column_metadata = [ column_name: Name, column_name: City, column_name: State, column_name: Zip, column_name: Phone ]
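The visibility problem described on this slide can be sketched with a toy model. This is only an illustration under stated assumptions, not Cassandra's actual read path: `thrift_view` and `cql_view` are hypothetical helper names, and the row data is copied from the hotels example.

```python
# Toy model only: this is NOT Cassandra's real read path.
# Column names and values come from the hotels example in the slides.
COLUMN_METADATA = ["Name", "City", "State", "Zip", "Phone"]

ABC_SJ = {
    "Name": "ABC Hotel",
    "City": "San Jose",
    "State": "CA",
    "Zip": 95113,
    "Phone": "124-543-2244",
    "My Rating": "Just average",  # dynamic column, absent from column_metadata
}


def thrift_view(row):
    """Hypothetical helper: Thrift returns every stored column."""
    return dict(row)


def cql_view(row, column_metadata):
    """Hypothetical helper: CQL exposes only the declared columns."""
    return {name: row[name] for name in column_metadata if name in row}


# Columns that Thrift can read but CQL cannot, in this model.
hidden = set(thrift_view(ABC_SJ)) - set(cql_view(ABC_SJ, COLUMN_METADATA))
print(hidden)  # {'My Rating'}
```

In this model the fixed columns stay readable through both interfaces; only the undeclared "My Rating" column drops out of the CQL view.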
- 13. 1 Why Thrift-to-CQL | 2 Impact on data model | 3 Approaches | 4 Comparison | 5 Summary
- 14. Approaches (for moving to CQL)
  › Add collections to the schema
  › Make the table schema-less (if you have both fixed and dynamic columns)
- 15. Add collections to the schema
  • Replace multiple dynamic columns with one (or more) collections, such as a map

  With dynamic columns:

  Row key  Column         Value
  ABC_SJ   Name           ABC Hotel
           City           San Jose
           State          CA
           Zip            95113
           Phone          124-543-2244
           My Rating      Just average
  567_SJ   Name           567 Hotel
           City           San Jose
           State          CA
           Zip            95113
           Phone          124-756-1567
           Portal Rating  Good
           My Rating      Above average

  With a map collection:

  Row key  Column  Value
  ABC_SJ   Name    ABC Hotel
           City    San Jose
           State   CA
           Zip     95113
           Phone   124-543-2244
           Rating  {my_rating: "Just average"}
  567_SJ   Name    567 Hotel
           City    San Jose
           State   CA
           Zip     95113
           Phone   124-756-1567
           Rating  {my_rating: "Above average", portal_rating: "Good"}
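The per-row transformation that this data migration implies can be sketched as follows. It is a minimal illustration, assuming that map keys are derived by lower-casing the dynamic column name and replacing spaces with underscores (as the slide's `my_rating` and `portal_rating` keys suggest). `to_collection_row` is a hypothetical helper, not part of any Cassandra driver.

```python
# Illustrative sketch of the "add collections" migration for one row.
# FIXED_COLUMNS mirrors the column_metadata of the hotels example.
FIXED_COLUMNS = ("Name", "City", "State", "Zip", "Phone")

HOTEL_567_SJ = {
    "Name": "567 Hotel",
    "City": "San Jose",
    "State": "CA",
    "Zip": 95113,
    "Phone": "124-756-1567",
    "Portal Rating": "Good",       # dynamic column
    "My Rating": "Above average",  # dynamic column
}


def to_collection_row(row, fixed_columns=FIXED_COLUMNS, map_column="Rating"):
    """Hypothetical helper: fold all dynamic columns into one map column."""
    migrated = {name: value for name, value in row.items() if name in fixed_columns}
    # Assumed key convention: "My Rating" -> "my_rating".
    dynamic = {
        name.lower().replace(" ", "_"): value
        for name, value in row.items()
        if name not in fixed_columns
    }
    if dynamic:
        migrated[map_column] = dynamic
    return migrated


migrated_row = to_collection_row(HOTEL_567_SJ)
print(migrated_row["Rating"])
```

Every existing row has to pass through a step like this, which is why this approach requires a data migration.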
- 16. Make the table schema-less

  Row key  column1    value         timestamp
  ABC_SJ   Name       ABC Hotel     2005-10-30T10:45
           City       San Jose      2005-10-30T10:46
           State      CA            2005-10-30T10:47
           Zip        95113         2005-10-30T10:48
           Phone      124-543-2244  2011-04-16T08:15
           My Rating  Just average  2011-05-22T15:07

  › Drop the entire column_metadata.
  › This makes it possible to read all the columns of the table, because all the columns are now dynamic.
    – In the absence of column_metadata, CQL uses an internal column called ‘column1’.
    – column1 holds the names of all the columns in the table.

  update column family hotels
    with key_validation_class = UTF8Type
    and comparator = UTF8Type
    and column_metadata = [];
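To make the column1 view concrete, here is a small sketch of how one Thrift row could appear as several (key, column1, value) rows once column_metadata is dropped. It is a toy model: `as_schemaless_rows` is a hypothetical helper, values are shown as text for readability, and the sort only mimics a UTF8Type comparator ordering column names.

```python
# Toy model of the schema-less view: one output row per stored column.
ABC_SJ = {
    "Name": "ABC Hotel",
    "City": "San Jose",
    "State": "CA",
    "Zip": "95113",
    "Phone": "124-543-2244",
    "My Rating": "Just average",
}


def as_schemaless_rows(key, row):
    """Hypothetical helper: return (key, column1, value) tuples.

    Column names are sorted to mimic a UTF8Type comparator (assumption).
    """
    return [(key, column1, row[column1]) for column1 in sorted(row)]


rows = as_schemaless_rows("ABC_SJ", ABC_SJ)
for key, column1, value in rows:
    print(key, column1, value)
```

Fixed and dynamic columns are treated alike in this view, so "My Rating" shows up next to "Name" and "City" without any schema change to the stored data.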
- 17. Schema-less table: advantage
  › Usually, an application performs multiple operations on a table, such as:
    – adding new data records
    – seeking a data record and updating it
    – updating data records in bulk (based on a criterion)
    – just reading the data records and generating a report
  › Good news:
    – APIs that work with the Thrift interface continue to work even without the metadata.
    – This gives the flexibility to migrate functionality from Thrift to CQL one piece at a time.
- 18. Schema-less table: advantage
  [Diagram: four application functions (Add new hotel, Update hotel record, Add a rating to hotel, Report generation), each shown with Hector APIs on Cassandra's Thrift interface and CQL APIs on its CQL interface]
- 19. 1 Why Thrift-to-CQL | 2 Impact on data model | 3 Approaches | 4 Comparison | 5 Summary
- 20. Comparison
  › Schema with collections
    – Requires data migration
    – Once the schema is changed, all functions must be migrated in one go
  › Schema-less
    – No need for data migration
    – Application functions can be migrated in multiple phases
  Wait!
    – Don’t jump to schema-less. The decision is not that easy.
    – There is one more dimension to evaluate: performance.
- 21. Performance comparison
  › By moving to CQL:
    – Marginal improvement in performance for key-based read queries and write operations
    – But a major performance drop in the full-table scan scenario with ‘Schema with collections’
  [Charts: Cassandra 2.0.14; data sizes of 1 million records and 50 million records]
- 23. “I didn't come here to tell you how this is going to end. I came here to tell you how it's going to begin.”
  So, when you move from Thrift to CQL while on Cassandra v 2.0.x, you don’t get a significant performance benefit. The performance benefit comes when you upgrade Cassandra.
- 24. Performance across versions
  Data size: 50 million records | Scenario: full-table scan | Spark version: 1.2
  › Schema-less
    • At v 2.0.14, schema-less gives the same performance,
    • but from v 2.1 onwards, its performance drops.
  › Schema-with-collection
    • At v 2.0.14, schema-with-collection gives almost half the performance,
    • but from v 2.1 onwards, its performance is always better than schema-less.
- 25. 1 Why Thrift-to-CQL | 2 Impact on data model | 3 Approaches | 4 Comparison | 5 Summary
- 26. Summary
  › Moving from Thrift to CQL is important for the lifecycle management of a solution.
  › CQL poses a challenge when both fixed and dynamic columns are present.
  › There are two approaches for moving to CQL:
    – Schema-less
      › Doesn’t require data migration. Hence, data remains compatible with Thrift APIs.
      › Better performance with Cassandra v 2.0.14
    – Schema with collections
      › Requires data migration. Hence, data is no longer compatible with Thrift APIs.
      › Better performance with Cassandra v 2.1.13 and higher
  › Performance of CQL (schema with collections) improves with each Cassandra version upgrade and becomes significantly higher after an upgrade to Cassandra 3.x.