8/24/2020

Reading time:4 min

scylladb/scylla

by scylladb

Cassandra partitions can be found using a partition key. What if the applications wants to find the partition by the value of a different column?Of course, it's easy to do this inefficiently by scanning all the data. But doing this efficiently requires building additional indexes to make the new type of lookups quick. Cassandra 3 has three implementations of such indexes: The classic secondary index, the newer SASI, and materialized views, briefly described here:The two indexing methods "secondary index" and "SASI" share a common design: Each indexes the data held by a given node, on that node. The downside to such a design is that when we search for a column with a particular value, we need to forward the query to all nodes, because we cannot know in which node to look. This makes search with this method unscalable: if a cluster with 1 node can process K search requests per second, a cluster with N nodes can also process only K - and not N*K - search requests per second.On the other hand, the benefit of this approach is fairly efficient to modify to the index.Proponents of Local Indexing note that it is not necessarily inefficient for low cardinality data, i.e., a column for which there is a low number of different values and therefore each query yields a very large number of matching partitions. In that case, to get just the first N results (in token order), we don't need to send the query to all nodes, but rather need to start only in the vnode which holds the lowest tokens, and only go to the next vnodes if we need more results. But for high-cardinality data (where each query yields only a few results), it is indeed inefficient, because we need to query on all nodes (and in Scylla, on all shards) even though most of them will find nothing.Secondary IndexSee also Secondary IndexingThis is the classic indexing method of Cassandra, introduced in Cassandra 0.7. The index for each indexed column is stored in a hidden, local (not distributed across the cluster) column family, whose partition key is the indexed column, and the value is the list of partition keys in the main index with this column value.SASISASI (acroynym of "SStable-Attached Secondary Indexing") is a reimplementation of the classic Cassandra secondary indexing with one main goal in mind - efficiently support more sophisticated search queries such as:AND or OR combinations of queries.Wildcard search in string values.Range queries.Lucene-inspired word search in string values (including word breaking, capitalization normalization, stemming, etc., as determined by a user-given "Analyzer").Some of these things were already possible with secondary index, but inefficient, because required getting a long list of partitions, reading them (requiring inefficient seeks to each one) and filtering on them. SASI implement them using a new on-disk format based on B+ trees, and does not reuse regular Cassandra column families or sstables like the classic Secondary Indexing method did.SASI attaches to each sstable its own immutable index file (and hence the name of this method), and also attaches an index to each memtable. During compaction, the indexes of the files being compacted together are also compacted to create one new index.In Local Indexing above, we only held on each node an index of the data it holds. Conversely, in Distributed Indexing we distribute the whole index over the cluster, using Cassandra's normal distribution scheme - namely that when we are looking for a certain value of a column, we can know (using the token ring) which of the nodes hold it.Materialized ViewsCassandra's "Materialized Views" feature was developed in https://issues.apache.org/jira/browse/CASSANDRA-6477 and explained in http://www.datastax.com/dev/blog/new-in-cassandra-3-0-materialized-views.Often a distributed index will only list partition keys matching a column value (although Lucene, for example, does support general "payloads" in the inverted index), but Cassandra's implementation,Materialized Views, basically builds a new table (a regular distributed Cassandra table) with the indexed column as a partition key, and the user's choice of data as values.People have been doing this, creating additional tables containing other "views" into the same data (and calling them "Materialized Views") for a long time. The new feature in Cassandra 3.0 is how to do this in the server without application support,automatically, safely and efficiently. In other words, the application just updates the base table, and the additional materialized-view tables gets updated automatically as needed. Ensuring correctness and atomicity (the materialized views are updated together with the tables) makes updates to a table with materialized views significantly slower than normal updates, although Cassandra docs claim its performance is still acceptable. Reads, however, are just regular fast reads (as opposed to secondary index), and both read and write operations are scalable.Materialized Views do not have the sophisticated Lucene-inspired query features of SASI.

Read this article if you want to know more about scylladb/scylla

Cassandra partitions can be found using a partition key. What if the applications wants to find the partition by the value of a different column?

Of course, it's easy to do this inefficiently by scanning all the data. But doing this efficiently requires building additional indexes to make the new type of lookups quick. Cassandra 3 has three implementations of such indexes: The classic secondary index, the newer SASI, and materialized views, briefly described here:

The two indexing methods "secondary index" and "SASI" share a common design: Each indexes the data held by a given node, on that node. The downside to such a design is that when we search for a column with a particular value, we need to forward the query to all nodes, because we cannot know in which node to look. This makes search with this method unscalable: if a cluster with 1 node can process K search requests per second, a cluster with N nodes can also process only K - and not N*K - search requests per second.

On the other hand, the benefit of this approach is fairly efficient to modify to the index.

Proponents of Local Indexing note that it is not necessarily inefficient for low cardinality data, i.e., a column for which there is a low number of different values and therefore each query yields a very large number of matching partitions. In that case, to get just the first N results (in token order), we don't need to send the query to all nodes, but rather need to start only in the vnode which holds the lowest tokens, and only go to the next vnodes if we need more results. But for high-cardinality data (where each query yields only a few results), it is indeed inefficient, because we need to query on all nodes (and in Scylla, on all shards) even though most of them will find nothing.

Secondary Index

SASI

SASI (acroynym of "SStable-Attached Secondary Indexing") is a reimplementation of the classic Cassandra secondary indexing with one main goal in mind - efficiently support more sophisticated search queries such as:

AND or OR combinations of queries.
Wildcard search in string values.
Range queries.
Lucene-inspired word search in string values (including word breaking, capitalization normalization, stemming, etc., as determined by a user-given "Analyzer").

Some of these things were already possible with secondary index, but inefficient, because required getting a long list of partitions, reading them (requiring inefficient seeks to each one) and filtering on them. SASI implement them using a new on-disk format based on B+ trees, and does not reuse regular Cassandra column families or sstables like the classic Secondary Indexing method did.

SASI attaches to each sstable its own immutable index file (and hence the name of this method), and also attaches an index to each memtable. During compaction, the indexes of the files being compacted together are also compacted to create one new index.

In Local Indexing above, we only held on each node an index of the data it holds. Conversely, in Distributed Indexing we distribute the whole index over the cluster, using Cassandra's normal distribution scheme - namely that when we are looking for a certain value of a column, we can know (using the token ring) which of the nodes hold it.

Materialized Views

Cassandra's "Materialized Views" feature was developed in https://issues.apache.org/jira/browse/CASSANDRA-6477 and explained in http://www.datastax.com/dev/blog/new-in-cassandra-3-0-materialized-views.

Often a distributed index will only list partition keys matching a column value (although Lucene, for example, does support general "payloads" in the inverted index), but Cassandra's implementation, Materialized Views, basically builds a new table (a regular distributed Cassandra table) with the indexed column as a partition key, and the user's choice of data as values.

People have been doing this, creating additional tables containing other "views" into the same data (and calling them "Materialized Views") for a long time. The new feature in Cassandra 3.0 is how to do this in the server without application support, automatically, safely and efficiently. In other words, the application just updates the base table, and the additional materialized-view tables gets updated automatically as needed. Ensuring correctness and atomicity (the materialized views are updated together with the tables) makes updates to a table with materialized views significantly slower than normal updates, although Cassandra docs claim its performance is still acceptable. Reads, however, are just regular fast reads (as opposed to secondary index), and both read and write operations are scalable.

Materialized Views do not have the sophisticated Lucene-inspired query features of SASI.

python

java

cassandra

Vald

John Doe

2/11/2024

mongo

scylladb

rust

Our Top NoSQL Blogs of the Year: Rust, Raft, MongoDB, Books & Tablets

John Doe

1/5/2024

migration

schema

scylladb

GitHub - eighty4/cquill: Versioned CQL migrations for Cassandra and ScyllaDB

eighty4

12/2/2023

scylladb

cassandra

Apache Cassandra 4.0 vs. ScyllaDB 4.4: Comparing Performance

Peter Corless

2/16/2023

scylladb

cassandra

dynamo

The Three DynamoDB Limits You Need to Know

John Doe

2/15/2023

scylladb

business.intelligence

metabase

Apache Cassandra Lunch #81: Redash and Cassandra - Business Platform Team

Arpan Patel

7/5/2022

benchmarking

scylladb

cassandra

Benchmarking Apache Cassandra (40 Nodes) vs. ScyllaDB (4 Nodes)

John Doe

5/13/2022

scylladb

business.intelligence

metabase

Apache Cassandra Lunch #81: Redash and Cassandra - Business Platform Team

Arpan Patel

4/28/2022

cassandra

cassandra.sasi

Apache Cassandra indexing without having to say I’m sorry - SD Times

John Doe

1/15/2021

scylladb

mysql

cassandra

hugegraph/hugegraph

hugegraph

12/1/2020

Checkout Planet Cassandra

Claim Your Free Planet Cassandra Contributor T-shirt!

Make your contribution and score a FREE Planet Cassandra Contributor T-Shirt!  We value our incredible Cassandra community, and we want to express our gratitude by sending an exclusive Planet Cassandra Contributor T-Shirt you can wear with pride.

Join Our Newsletter!

Explore Related Topics

AllKafkaSparkScyllaSStableKubernetesApiGithubGraphQl

Explore Further

scylladb

python

java

cassandra

Vald

John Doe

2/11/2024

mongo