Illustration Image

Cassandra.Link

The best knowledge base on Apache Cassandra®

Helping platform leaders, architects, engineers, and operators build scalable real time data platforms.

11/13/2020

Reading time:2 min

Start using virtual tables in Apache Cassandra 4.0

by John Doe

Among the many additions in the recent Apache Cassandra 4.0 beta release, virtual tables is one that deserves some attention.In previous versions of Cassandra, users needed access to Java Management Extensions (JMX) to examine Cassandra details such as running compactions, clients, metrics, and a variety of configuration settings. Virtual tables removes these challenges. Cassandra 4.0 beta enables users to query those details and data as Cassandra Query Language (CQL) rows from a read-only system table.Here is how the JMX-based mechanism in previous Cassandra versions worked. Imagine a user wants to check on the compaction status of a particular node in a cluster. The user first has to establish a JMX connection to run nodetool compactionstats on the node. This requirement immediately presents the user with a few complications. Is the user's client configured for JMX access? Are the Cassandra nodes and firewall configured to allow JMX access? Are the proper measures for security and auditing prepared and in place? These are only some of the concerns users had to contend with when dealing with in previous versions of Cassandra.With Cassandra 4.0, virtual tables make it possible for users to query the information they need by utilizing their previously configured driver. This change removes all overhead associated with implementing and maintaining JMX access.Cassandra 4.0 creates two new keyspaces to help users leverage virtual tables: system_views and system_virtual_schema. The system_views keyspace contains all the valuable information that users seek, usefully stored in a number of tables. The system_virtual_schema keyspace, as the name implies, stores all necessary schema information for those virtual tables.It's important to understand that the scope of each virtual table is restricted to its node. Any query of virtual tables will return data that is valid only for the node that acts as its coordinator, regardless of consistency. To simplify for this requirement, support has been added to several drivers to specify the coordinator node in these queries (the Python, DataStax Java, and other drivers now offer this support).To illustrate, examine this sstable_tasks virtual table. This virtual table displays all operations on SSTables, including compactions, cleanups, upgrades, and more.If a user were to run nodetool compactionstats in a previous Cassandra version, this is the same type of information that would be displayed. Here, the query finds that the node currently has one active compaction. It also displays its progress and its keyspace and table. Thanks to the virtual table, a user can gather this information quickly, and just as efficiently gain the insight needed to correctly diagnose the cluster's health.To be clear, Cassandra 4.0 doesn't eliminate the need for JMX access: JMX is still the only option for querying some metrics. That said, users will welcome the ability to pull key cluster metrics simply by using CQL. Thanks to the convenience afforded by virtual tables, users may be able to reinvest time and resources previously devoted to JMX tools into Cassandra itself. Client-side tooling should also begin to leverage the advantages offered by virtual tables.If you are interested in the Cassandra 4.0 beta release and its virtual tables feature, try it out.

Illustration Image

Among the many additions in the recent Apache Cassandra 4.0 beta release, virtual tables is one that deserves some attention.

In previous versions of Cassandra, users needed access to Java Management Extensions (JMX) to examine Cassandra details such as running compactions, clients, metrics, and a variety of configuration settings. Virtual tables removes these challenges. Cassandra 4.0 beta enables users to query those details and data as Cassandra Query Language (CQL) rows from a read-only system table.

Here is how the JMX-based mechanism in previous Cassandra versions worked. Imagine a user wants to check on the compaction status of a particular node in a cluster. The user first has to establish a JMX connection to run nodetool compactionstats on the node. This requirement immediately presents the user with a few complications. Is the user's client configured for JMX access? Are the Cassandra nodes and firewall configured to allow JMX access? Are the proper measures for security and auditing prepared and in place? These are only some of the concerns users had to contend with when dealing with in previous versions of Cassandra.

With Cassandra 4.0, virtual tables make it possible for users to query the information they need by utilizing their previously configured driver. This change removes all overhead associated with implementing and maintaining JMX access.

Cassandra 4.0 creates two new keyspaces to help users leverage virtual tables: system_views and system_virtual_schema. The system_views keyspace contains all the valuable information that users seek, usefully stored in a number of tables. The system_virtual_schema keyspace, as the name implies, stores all necessary schema information for those virtual tables.

system_views and system_virtual_schema keyspaces and tables

It's important to understand that the scope of each virtual table is restricted to its node. Any query of virtual tables will return data that is valid only for the node that acts as its coordinator, regardless of consistency. To simplify for this requirement, support has been added to several drivers to specify the coordinator node in these queries (the Python, DataStax Java, and other drivers now offer this support).

To illustrate, examine this sstable_tasks virtual table. This virtual table displays all operations on SSTables, including compactions, cleanups, upgrades, and more.

Querying the sstable_tasks virtual table

If a user were to run nodetool compactionstats in a previous Cassandra version, this is the same type of information that would be displayed. Here, the query finds that the node currently has one active compaction. It also displays its progress and its keyspace and table. Thanks to the virtual table, a user can gather this information quickly, and just as efficiently gain the insight needed to correctly diagnose the cluster's health.

To be clear, Cassandra 4.0 doesn't eliminate the need for JMX access: JMX is still the only option for querying some metrics. That said, users will welcome the ability to pull key cluster metrics simply by using CQL. Thanks to the convenience afforded by virtual tables, users may be able to reinvest time and resources previously devoted to JMX tools into Cassandra itself. Client-side tooling should also begin to leverage the advantages offered by virtual tables.

If you are interested in the Cassandra 4.0 beta release and its virtual tables feature, try it out.

Related Articles

mongo
nocode
elasticsearch

GitHub - ibagroup-eu/Visual-Flow: Visual-Flow main repository

ibagroup-eu

12/2/2024

Checkout Planet Cassandra

Claim Your Free Planet Cassandra Contributor T-shirt!

Make your contribution and score a FREE Planet Cassandra Contributor T-Shirt! 
We value our incredible Cassandra community, and we want to express our gratitude by sending an exclusive Planet Cassandra Contributor T-Shirt you can wear with pride.

Join Our Newsletter!

Sign up below to receive email updates and see what's going on with our company

Explore Related Topics

AllKafkaSparkScyllaSStableKubernetesApiGithubGraphQl

Explore Further

cassandra