6/21/2018


criteo/cassandra_exporter

by John Doe



README.md

Description

Cassandra exporter is a standalone application which exports Apache Cassandra® metrics through a Prometheus-friendly endpoint. This project was originally a fork of JMX exporter but aims at an easier integration with Apache Cassandra®.

Specifically, this project brings:

Exporting EstimatedHistogram metrics specific to Apache Cassandra®
Filtering on MBean attributes
Metrics naming that respects the MBean hierarchy
Comprehensive config file

An essential design choice the project makes is to not let Prometheus drive the scraping frequency. This decision was taken because many Apache Cassandra® metrics are expensive to scrape and can hinder the performance of the node. As we don't want this kind of situation to happen in production, the scrape frequency is restricted via the configuration of Cassandra Exporter.
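The effect of this design choice can be illustrated with a small cache that re-reads its source at most once per configured period, however often it is queried. This is a minimal Python sketch of the idea, not the exporter's actual (Java) implementation; `CachedMetric`, `read_fn` and the injectable `clock` are hypothetical names introduced for illustration.

```python
import time
from typing import Callable, Optional


class CachedMetric:
    """Serve a cached value and re-read the source at most once per period."""

    def __init__(self, read_fn: Callable[[], float], period_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._read_fn = read_fn        # expensive read (stands in for a JMX call)
        self._period_sec = period_sec  # minimum delay between two real reads
        self._clock = clock            # injectable so the behavior can be tested
        self._value: Optional[float] = None
        self._last_read: Optional[float] = None

    def get(self) -> Optional[float]:
        now = self._clock()
        # Only hit the expensive source when the cached value is too old.
        if self._last_read is None or now - self._last_read >= self._period_sec:
            self._value = self._read_fn()
            self._last_read = now
        return self._value
```

With such a cache, a monitoring system polling every few seconds would still trigger at most one expensive read per period.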

How to use

To start the application

java -jar cassandra_exporter.jar config.yml

The Cassandra exporter needs to run on every Cassandra node to get all the information about the whole cluster.

A full configuration file is shown in the "Config file example" section. The two main parts are:

blacklist
maxScrapFrequencyInSec

In the blacklist block, you specify the metrics you don't want the exporter to scrape. This is important because JMX is an RPC mechanism and you don't want to trigger some of those RPCs. For example, the MBean endpoints under org:apache:cassandra:db:.* do not expose any metrics but are used to trigger actions on Cassandra nodes.

In the maxScrapFrequencyInSec block, you specify which metrics should be scraped and at what frequency. Starting from the set of all MBeans, the blacklist is applied first to filter this set, and then maxScrapFrequencyInSec is applied as a whitelist to filter the resulting set.

As an example, take the input set of metrics {a, b, c} and the following config file:

blacklist:
  - a
maxScrapFrequencyInSec:
  50:
    - .*
  3600:
    - b

Cassandra Exporter will have the following behavior:

Metrics matching a blacklist entry are never scraped; here, metric a will not be available.
The remaining metrics are matched against maxScrapFrequencyInSec, starting with the longest period:
1. Metric b will be scraped every hour.
2. The remaining metrics will be scraped every 50 seconds; here, only c.

Resulting in:

Metric	Scrape frequency
a	never
b	every hour
c	every 50 seconds
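The resolution described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the exporter's code: the function name `scrape_period` is hypothetical, and the sketch assumes that each pattern must match the whole metric name (`re.fullmatch`), which the README does not state explicitly.

```python
import re
from typing import Dict, List, Optional


def scrape_period(metric: str,
                  blacklist: List[str],
                  max_scrape_frequency: Dict[int, List[str]]) -> Optional[int]:
    """Return the scrape period in seconds, or None if the metric is never scraped."""
    # 1. The blacklist is applied first.
    if any(re.fullmatch(p, metric) for p in blacklist):
        return None
    # 2. Periods are checked from the longest to the shortest; the first match wins.
    for period in sorted(max_scrape_frequency, reverse=True):
        if any(re.fullmatch(p, metric) for p in max_scrape_frequency[period]):
            return period
    # 3. maxScrapFrequencyInSec acts as a whitelist: no match means no scrape.
    return None


# Same configuration as the {a, b, c} example.
blacklist = ["a"]
max_scrape_frequency = {50: [".*"], 3600: ["b"]}
for name in ("a", "b", "c"):
    print(name, scrape_period(name, blacklist, max_scrape_frequency))
```

Checking the longest period first is what keeps b from being captured by the catch-all `.*` entry under 50.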

Once started, the Prometheus endpoint will be available at localhost:listenPort/ or localhost:listenPort/metrics, and the metrics format will look like the one below:

cassandra_stats{name="org:apache:cassandra:metrics:table:biggraphite:datapoints_5760p_3600s_aggr:writelatency:50thpercentile",} 35.425000000000004
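To make the structure of such a line concrete, here is a small Python sketch that splits it into the metric path and its value. It handles only the single-label form shown above and is not a general Prometheus parser; `parse_line` is a hypothetical helper, not part of the exporter.

```python
import re
from typing import Tuple

# Matches lines such as: cassandra_stats{name="<path>",} <value>
_LINE_RE = re.compile(r'^cassandra_stats\{name="([^"]*)",?\}\s+(\S+)$')


def parse_line(line: str) -> Tuple[str, float]:
    """Split one exposition line into (metric path, numeric value)."""
    match = _LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized metric line: {line!r}")
    return match.group(1), float(match.group(2))


sample = ('cassandra_stats{name="org:apache:cassandra:metrics:table:biggraphite:'
          'datapoints_5760p_3600s_aggr:writelatency:50thpercentile",} '
          '35.425000000000004')
name, value = parse_line(sample)
# The colon-separated path mirrors the MBean hierarchy; the last element is the attribute.
print(name.split(":")[-1], value)
```

Because the full MBean path is carried in the `name` label, the blacklist and maxScrapFrequencyInSec patterns can be read as expressions over this same colon-separated string.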

How to debug

Run the program with the following options:

java -Dorg.slf4j.simpleLogger.defaultLogLevel=trace -jar cassandra_exporter.jar config.yml --oneshot

You will get the time it took to scrape each individual MBean, which is useful for understanding which metrics are expensive to scrape.

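Good sources of information for understanding what the MBeans are doing and for creating your dashboards are:

https://cassandra.apache.org/doc/latest/operating/metrics.html
https://github.com/apache/cassandra/tree/trunk/src/java/org/apache/cassandra/metrics
http://thelastpickle.com/blog/2017/12/05/datadog-tlp-dashboards.html
https://www.youtube.com/watch?v=Q9AAR4UQzMk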

Config file example

host: localhost:7199
ssl: False
user:
password:
listenPort: 8080
blacklist:
   # Inaccessible metrics (not enough privilege)
   - java:lang:memorypool:.*usagethreshold.*
   # Leaf attributes that are not interesting for us but are present in many paths (reduces the cardinality of metrics)
   - .*:999thpercentile
   - .*:95thpercentile
   - .*:fifteenminuterate
   - .*:fiveminuterate
   - .*:durationunit
   - .*:rateunit
   - .*:stddev
   - .*:meanrate
   - .*:mean
   - .*:min
   # Paths present in many metrics but uninteresting
   - .*:viewlockacquiretime:.*
   - .*:viewreadtime:.*
   - .*:cas[a-z]+latency:.*
   - .*:colupdatetimedeltahistogram:.*
   # Mostly for RPC, do not scrape them
   - org:apache:cassandra:db:.*
   # columnfamily is an alias for Table metrics in Cassandra 3.x
   # https://github.com/apache/cassandra/blob/8b3a60b9a7dbefeecc06bace617279612ec7092d/src/java/org/apache/cassandra/metrics/TableMetrics.java#L162
   - org:apache:cassandra:metrics:columnfamily:.*
   # Should we export metrics for system keyspaces/tables?
   - org:apache:cassandra:metrics:[^:]+:system[^:]*:.*
   # Don't scrape us
   - com:criteo:nosql:cassandra:exporter:.*
maxScrapFrequencyInSec:
  50:
    - .*
  # Refresh these metrics only every hour, as they are costly for Cassandra to retrieve
  3600:
    - .*:snapshotssize:.*
    - .*:estimated.*
    - .*:totaldiskspaceused:.*

Docker

You can pull an image directly from Docker Hub:

docker pull criteord/cassandra_exporter:latest

Kubernetes

To get an idea of how to integrate Cassandra Exporter in Kubernetes, you can look at this Helm chart.

Grafana

Dedicated dashboards can be found here
