Basic rules of data modeling in Cassandra involve manually denormalizing data into separate tables based on the queries that will be run against that table. Currently, the only way to query a column without specifying the partition key is to use secondary indexes, but they are not a substitute for the denormalization of data into new tables as they are not fit for high cardinality data. High cardinality secondary index queries often require responses from all of the nodes in the ring, which adds latency to each request. Instead, client-side denormalization and multiple independent tables are used, which means that the same code is rewritten for many different users. In 3.0, Cassandra will introduce a new feature called Materialized Views. Materialized views handle automated server-side denormalization, removing the need for client side handling of this denormalization and ensuring eventual consistency between the base and view data. This denormalization allows for very fast lookups of data in each view using the normal Cassandra read path.
An Example
As an example of how materialized views can be used, suppose we want to track the high scores for players of several games. We have a number of queries that we would like to be able to answer:
- Given a game, who has the highest score, and what is it?
- Given a game and a day, who had the highest score, and what was it?
- Given a game and a month, who had the highest score, and what was it?
Materialized views maintain a correspondence of one CQL row each in the base and the view, so we need to ensure that each CQL row which is required for the views will be reflected in the base table's primary keys. For the first query, we will need the game, the player, and their highest score. For the second, we will need the game, the player, their high score, as well the day, the month, and the year of that high score. For the final query, we need everything from the second except the day. The second query will be the most restrictive, so it determines the primary key we will use. A user can update their high score over the course of day, so we only need to track the highest score for a particular day.
|
Next, we'll create the view which presents the all time high scores. To create the materialized view, we provide a simple select statement and the primary key to use for this view. Specifying the CLUSTERING ORDER BY
allows us to reverse sort the high score so we can get the highest score by simply selecting the first item in the partition.
|
To query the daily high scores, we create a materialized view that groups the game title and date together so a single partition contains the values for that date. We do the same for the monthly high scores.
|
We prime our materialized views with some data. We just insert the data into the scores table, and Cassandra will populate the materialized views accordingly.
|
We can now search for users who have scored the highest ever on our games:
|
|
And the daily high score:
|
|
All of the entries have been copied into the all time high materialized view:
|
|
Because we have a CQL Row in the view for each CQL Row in the base, 'pcmanus' and 'tjake' appear multiple times in the high scores table, one for each date in the base table.
We can also delete rows from the base table and the materialized view's records will be deleted. We'll delete the tjake
rows from the scores table:
|
Now, looking at all of the top scores, we don't find the tjake
entries anymore:
|
|
When a deletion occurs, the materialized view will query all of the deleted values in the base table and generate tombstones for each of the materialized view rows, because the values that need to be tombstoned in the view are not included in the base table's tombstone. For the single base tombstone, two view tombstones were generated; one for (tjake, 1000)
and one for (tjake, 500)
.
How it works
The base replica performs a local read of the data in order to create the correct update for the view. If the primary key of the view has been updated in the base table, a tombstone would need to be generated so that the old value is no longer present in the view. The batchlog is used to provide an equivalent eventual consistency to what is provided on the base table. Without the batchlog if view updates are not applied but the base updates are, the view and the base will be inconsistent with each other. Using the batchlog, however, does add significant overhead, especially since the batchlog must be written to twice.
Creation
When a materialized view is created against a table which has data already, a building process will be kicked off to populate the materialized view. As such, materialized views can be created on existing tables, but there will be a period during which queries against the materialized view may not return all results. This is similar in behavior to how secondary indexes currently work. When the build is complete, the system.built_materializedviews
table on each node will be updated with the view's name.
Altering the Base Table
When a base view is altered, the materialized view is updated as well. If the materialized view has a SELECT *
statement, any added columns will be included in the materialized view's columns. Any deleted columns which are part of the SELECT
statement will be removed from the materialized view. If a column in the base table is altered, the same alteration will occur in the view table. If the base table is dropped, any associated views will also be dropped.
When not to use Materialized Views
-
Materialized views do not have the same write performance characteristics that normal table writes have
The materialized view requires an additional read-before-write, as well as data consistency checks on each replica before creating the view updates. These additions overhead, and may change the latency of writes.
- If the rows are to be combined before placed in the view, materialized views will not work. Materialized views will create a CQL Row in the view for each CQL Row in the base
-
Low cardinality data will create hotspots around the ring
If the partition key of all of the data is the same, those nodes would become overloaded. In the
alltimehigh
materialized view above, if the only game that we stored high scores for was 'Coup', only the nodes which stored 'Coup' would have any data stored on them. -
Currently, only simple
SELECT
statements are supported, but a ticket has been filed to add support for more complexSELECT
statementsWHERE
clauses,ORDER BY
, and functions aren't available with materialized views - If there will be a large number of partition tombstones, the performance may suffer; the materialized view must query for all of the current values and generate a tombstone for each of them. The materialized view will have one tombstone per CQL row deleted in the base table
- Materialized views are not supported through Thrift