- SizeTieredCompactionStrategy: the default, and a suitable starting point for most use cases with a balance of reads and writes.
- LeveledCompactionStrategy: does more compaction work to improve read performance. Generally used when the ratio of reads to writes is high.
- DateTieredCompactionStrategy: useful for data that is “hot” when first written but sees less access over time.
- Check that the compaction strategy is appropriately tuned (see Cassandra Docs). Defaults are usually OK, but DTCS requires specific compaction options to be set in order to be effective.
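As an illustration of setting compaction options explicitly, the sketch below renders an `ALTER TABLE ... WITH compaction` statement from a strategy name and a set of options. The helper name `compaction_cql`, the keyspace and table names, and the option values are assumptions made for this example. `base_time_seconds` and `max_sstable_age_days` are DTCS options, but suitable values depend on the workload and should be checked against the Cassandra docs for the version in use.

```python
def compaction_cql(keyspace, table, strategy, **options):
    """Render a CQL statement that sets the compaction strategy and its options.

    Hypothetical helper: it only builds the statement text, it does not run it.
    """
    settings = {"class": strategy, **options}
    # CQL expects a map literal of quoted keys and values.
    body = ", ".join(f"'{key}': '{value}'" for key, value in settings.items())
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {{{body}}};"


# Example values only (assumed, not recommendations).
statement = compaction_cql(
    "ks",
    "events",
    "DateTieredCompactionStrategy",
    base_time_seconds=3600,
    max_sstable_age_days=30,
)
print(statement)
```

The resulting statement can be run through cqlsh or a driver session once the options have been reviewed.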
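Also, secondary indexes on boolean columns are not effective: a boolean has only two distinct values, so each index entry points at a very large number of rows.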
See Cassandra Docs and http://www.wentnet.com/blog/?p=77
Is the indexed column frequently updated/deleted?
The overhead of maintaining the index is incurred on each update/delete and may also result in excessive tombstones in the index table.

Queries

Are logged batches used? If so, are they relatively small (<100)?
Logged batches require the coordinator node to control all operations, which can result in very high load on the coordinator node for large batches. Logged batches are only required for atomic operations across multiple rows/tables (not for performance).

Are there unlogged batches? If so, are they small (<100) or on the same partition key?
Unlogged batches can improve performance but need to be either small or on a single partition key; otherwise they can negatively impact performance. Note that unlogged batches do not provide atomic operations.

For large range queries, is the client paging through results?
Paging is necessary to read large result sets without running into memory constraints. Most drivers have built-in paging support, but it needs to be explicitly turned on in query code.

Does the query on the index look up a row in a large partition?
The whole partition will be scanned to find matching rows, which makes for potentially expensive reads.
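The unlogged batch guidance above (keep batches small, or keep them on a single partition key) can be sketched as a planning step that runs before any statements are sent. This is a minimal sketch, not a driver API: the function name `plan_unlogged_batches`, the `partition_key` callable, and the row shape are assumptions, and the cap of 99 is chosen only to stay under the "<100" guideline. It applies both conditions at once, which is more conservative than the checklist requires.

```python
from collections import defaultdict

MAX_BATCH_SIZE = 99  # assumed cap, chosen to stay under the "<100" guideline


def plan_unlogged_batches(rows, partition_key, max_size=MAX_BATCH_SIZE):
    """Group rows by partition key, then split each group into small chunks.

    Returns a list of (key, chunk) pairs; each chunk touches one partition
    and holds at most max_size rows.
    """
    by_partition = defaultdict(list)
    for row in rows:
        by_partition[partition_key(row)].append(row)

    batches = []
    for key, group in by_partition.items():
        # Slice each partition's rows into chunks of at most max_size.
        for start in range(0, len(group), max_size):
            batches.append((key, group[start:start + max_size]))
    return batches


# Example with made-up rows: 150 writes for user "a", 1 write for user "b".
rows = [{"user": "a", "n": i} for i in range(150)] + [{"user": "b", "n": 0}]
for key, chunk in plan_unlogged_batches(rows, lambda r: r["user"]):
    print(key, len(chunk))
```

Each chunk could then be sent as one unlogged batch through whichever driver is in use. Since every chunk stays within a single partition, the coordinator does not have to fan the batch out across many replicas.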