Advice on Cassandra partition and clustering key


Author: Archit Rai Gupta

Originally Sourced from: https://stackoverflow.com/questions/66065095/advice-on-cassandra-partition-and-clustering-key

I am unsure what the partition key and clustering key should be in the following two cases:

Case 1: Users can make multiple posts. Queries:

  1. Get post by id
  2. Get all posts by user_id
  3. Get all posts made by user_id where createdAt > X

If we choose post_id as the partition key, we can support only Query 1. If we choose user_id as the partition key (not ideal, as it can lead to imbalanced load across servers), we can support Queries 2 and 3 but not Query 1.

One way to support all the queries is to keep multiple copies of the data with different partition keys, but that would take a huge amount of storage.
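
To make the trade-off concrete, here is a minimal in-memory sketch of the "multiple copies" idea for Case 1. It is plain Python, not CQL or the Cassandra driver, and the names `PartitionedTable`, `save_post`, `posts_by_id` and `posts_by_user` are hypothetical, made up only for illustration. It assumes each table groups rows by partition key and keeps them sorted by clustering key within a partition, which is roughly the behaviour I am relying on.

```python
from bisect import bisect_right
from collections import defaultdict


class PartitionedTable:
    """Toy model: rows grouped by partition key, sorted by clustering key."""

    def __init__(self):
        # partition key -> (sorted clustering keys, rows in the same order)
        self._partitions = defaultdict(lambda: ([], []))

    def insert(self, partition_key, clustering_key, row):
        keys, rows = self._partitions[partition_key]
        i = bisect_right(keys, clustering_key)
        keys.insert(i, clustering_key)
        rows.insert(i, row)

    def read_partition(self, partition_key, after=None):
        """Return rows of one partition, optionally only those with clustering key > after."""
        keys, rows = self._partitions.get(partition_key, ([], []))
        start = 0 if after is None else bisect_right(keys, after)
        return rows[start:]


# Two copies of the same data, each keyed for a different query (hypothetical names).
posts_by_id = PartitionedTable()    # partition key: post_id
posts_by_user = PartitionedTable()  # partition key: user_id, clustering key: created_at


def save_post(post_id, user_id, created_at, body):
    # Every write goes to both copies; this is the storage cost mentioned above.
    row = {"post_id": post_id, "user_id": user_id, "created_at": created_at, "body": body}
    posts_by_id.insert(post_id, created_at, row)
    posts_by_user.insert(user_id, created_at, row)


save_post("p1", "u1", 100, "first")
save_post("p2", "u1", 200, "second")
save_post("p3", "u2", 150, "other user")

query1 = posts_by_id.read_partition("p1")               # Query 1: post by id
query2 = posts_by_user.read_partition("u1")             # Query 2: all posts by user_id
query3 = posts_by_user.read_partition("u1", after=100)  # Query 3: createdAt > X
```

Each query here reads exactly one partition, but every post is stored twice. In a real table the clustering key for the per-user copy would presumably also need to include post_id, so that two posts with the same createdAt stay distinct.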

Case 2: Users make hotel reservations. Queries:

  1. Get reservation by id
  2. Get all reservations by user_id
  3. Get all reservations by hotel_id

Is Cassandra the best choice for the above two data sets, given:

  1. Very high write throughput
  2. Huge amount of data
  3. No transactions needed

Any pointers are deeply appreciated.