At OpenCredo we have been working with Cassandra since 2012, and we are big fans of both open source Apache Cassandra and the capabilities of DataStax Enterprise. Over the years we have built up a great deal of experience across the company in delivering the benefits of Cassandra in real-world projects, and we have also seen some common pitfalls that businesses have fallen into.
To help more people enjoy success with Cassandra, we have put together a series of articles for teams adopting it, so that they get the kind of results they expect whilst navigating around some of the common issues. The series will conclude with a webinar exploring some of these issues, with the opportunity at the end to put questions to our consultants in a Question and Answer session. (View the recording here)
In our first article we will explore the strengths and limitations of Cassandra at a high level. Later articles move on to more detailed technical ideas and look deeper into issues that can have negative consequences if they aren’t well understood or accounted for at the outset of an initiative. Some of the issues we have seen are such that the problems they cause often surface only late in a project, when corrective measures are more costly. We hope that sharing this series of articles, and with it our experience, will help many people avoid these issues in the future.
Our first article, “Fulfilling the promise of Apache Cassandra” by Guy Richardson, discusses the promises and limitations of Cassandra before exploring when, and in what environments, one should consider using it.
Alla Babkina then explores data modelling in a two-part series, “Patterns of Successful Cassandra Data Modelling”. Alla provides some patterns which can help you avoid, as she puts it, a “whole world of hurt as your application starts to scale”. By reference to the way Cassandra stores its data, she explains patterns that are sympathetic to the underlying storage.
The third article in the series will explore a common problem: the SQL-like nature of CQL can lead developers to treat Cassandra like a relational database. In his article “How Not To Use Cassandra like an RDBMS (and what will happen if you do)”, Dominic Fox provides detailed examples of usage that is better suited to an RDBMS and explains “where a beginner might be unpleasantly surprised by the differences”. He also shares his advice that “if Cassandra doesn’t immediately allow you to do something, it’s worth making sure you understand why”.
In the final article of the series, “Common Problems with Cassandra Tombstones”, Alla Babkina will explore how tombstones are created, answering questions such as “We are not deleting anything, how are we getting tombstones?”. She then looks at how to get a better understanding of tombstones in your Cassandra cluster by examining SSTable content where necessary.