This is a simple example of reading from and writing to an Apache Cassandra™ database cluster while giving a window into the internal routing and execution tracing within the cluster. It uses basic logging, Cassandra query tracing, and events from the driver's connection to the cluster. The driver events can include notifications that members of the cluster go down or come back up, along with connections to local and remote data centers.
There are often questions about why the client application sees certain
exceptions. For example, why do I get NoNodeAvailableException
errors when I know nodes in my cluster are available? Diagnosing a fault in a
distributed system is tricky. The application and driver have limited visibility
into the status of the network and of the members of the database cluster. Each
exception reflects only partial knowledge of the state of all components in the system. See the
Byzantine Generals problem
for a general explanation. However, with more information about what is seen at each stage
through tracing and logging, you will hopefully arrive at a diagnosis more efficiently.
This example does not directly cover multi-data center connections or data center failover. However, the logging in this example shows what connections are made from the driver to nodes in the cluster. Best practices for failover, especially in a multi-data center environment, are discussed at length in this white paper and in this webinar about designing fault-tolerant applications. There is also an accompanying demo to show best practices for data center failover with the latest 4.x Java driver.
Contributors: Jeremy Hanna
Objectives
- Understand how queries interact with and within a Cassandra cluster
- Learn about tools to help diagnose problems such as logging, tracing, and those in Additional Resources
Project Layout
The project has a standard Apache Maven project layout with a single Java class: QueryDiagnostics.
Setup and Running
Prerequisites
- Apache Maven 3 should be installed and in the path
- JDK 14
- An Apache Cassandra™ cluster is running and accessible through the contact points and data center identified in application.conf.
The program will create a keyspace with NetworkTopologyStrategy with replication in dc1, the Apache Cassandra default data center name when using the GossipingPropertyFileSnitch.
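As a rough illustration of the statement described above, the following sketch builds a CREATE KEYSPACE statement from a map of data center names to replication factors. The class name KeyspaceCql, the keyspace name demo, and the IF NOT EXISTS clause are assumptions made for this sketch; the statement that QueryDiagnostics.java actually issues may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of this project: builds the CQL text only.
final class KeyspaceCql {

    // Returns a CREATE KEYSPACE statement using NetworkTopologyStrategy,
    // with one replication entry per data center in the given map.
    static String createKeyspace(String keyspace, Map<String, Integer> replicasPerDc) {
        StringJoiner replication = new StringJoiner(", ", "{", "}");
        replication.add("'class': 'NetworkTopologyStrategy'");
        replicasPerDc.forEach((dc, rf) -> replication.add("'" + dc + "': " + rf));
        return "CREATE KEYSPACE IF NOT EXISTS " + keyspace
                + " WITH replication = " + replication;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the data centers in insertion order.
        Map<String, Integer> replicas = new LinkedHashMap<>();
        replicas.put("dc1", 1); // dc1 is the data center named in this README
        System.out.println(createKeyspace("demo", replicas)); // "demo" is a placeholder name
    }
}
```

If your cluster uses a different data center name, the key in the map has to match it exactly; a keyspace replicated only to a data center that does not exist has no replicas to serve requests.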
Running
Building
At the project root level, execute
mvn clean package
Configuration changes
The main configuration file is src/main/resources/application.conf.
Consider the following changes from defaults for your cluster and environment:
- Change basic.contact-points = ["127.0.0.1:9042"] to the address and CQL port of your Cassandra cluster.
- Change basic.local-datacenter = dc1 to connect to a data center in your Cassandra cluster.
- Change basic.request.consistency = LOCAL_ONE to your preferred consistency level.
- Change replication = {'class': 'NetworkTopologyStrategy', 'dc1' : 1} in QueryDiagnostics.java to your preferred replication settings.
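The consistency level and the replication settings interact: a request can only succeed if enough replicas are reachable for the chosen level. The sketch below is a simplified illustration that assumes a keyspace replicated in a single data center, as in the defaults above. The class name RequiredReplicas is hypothetical and the code does not use the driver; it only shows the arithmetic.

```java
// Hypothetical helper, not part of this project or of the Java driver.
final class RequiredReplicas {

    // Number of replicas that must respond for a request to succeed,
    // assuming the keyspace is replicated in one data center only.
    static int required(String consistencyLevel, int replicationFactor) {
        switch (consistencyLevel) {
            case "ONE":
            case "LOCAL_ONE":
                return 1;
            case "TWO":
                return 2;
            case "THREE":
                return 3;
            case "QUORUM":
            case "LOCAL_QUORUM":
                // A quorum is a majority of the replicas: floor(RF / 2) + 1.
                return replicationFactor / 2 + 1;
            case "ALL":
                return replicationFactor;
            default:
                throw new IllegalArgumentException(
                        "Unhandled consistency level: " + consistencyLevel);
        }
    }

    public static void main(String[] args) {
        System.out.println(required("LOCAL_ONE", 1));    // defaults in this README
        System.out.println(required("LOCAL_QUORUM", 3)); // a common production setting
    }
}
```

With the defaults of LOCAL_ONE and a replication factor of 1, each partition has exactly one replica, so losing that node makes the partition unavailable even though other nodes in the cluster are up. This is one way to run into availability errors while nodes appear healthy.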
Running the program
To execute the program, run the following:
mvn exec:java -D"exec.mainClass"="com.datastax.examples.QueryDiagnostics"
Additional Resources
- Note in this repo that we've enabled debug logging of com.datastax.oss.driver.internal.core.channel in the logback configuration to log connection updates with different nodes in the cluster
- Java driver documentation on query tracing
- cqlsh documentation on query tracing
- DataStax Studio developer notebooks have execution configurations that can perform query traces
- Interactive request tracing
- Probabilistic tracing
- Packet capture for dynamic tracing
- Using Wireshark for dynamic cql tracing
- Replacing Cassandra tracing with Zipkin (this method is not yet possible with DataStax Enterprise)