{"componentChunkName":"component---src-templates-article-single-page-js","path":"/post/monitoring-apache-cassandra-tm-made-simple","result":{"pageContext":{"obj_id":"344cc497-9d5b-5ea8-b392-f5271d516517","node":{"content":"<p><em>To learn more about the DataStax open-source project, <a href=\"https://github.com/datastax/metric-collector-for-apache-cassandra\">Metric Collector for Apache Cassandra</a> and to try a demo, visit us on <a href=\"https://github.com/datastax/metric-collector-for-apache-cassandra\">GitHub</a>.</em></p><p>Apache Cassandra is a resilient system for users to build applications on, but many operators see Cassandra as a bit of a black box. It’s not that Cassandra doesn’t have <a href=\"https://cassandra.apache.org/doc/latest/operating/metrics.html\">hundreds of metrics to consume</a>, it does (over 300 metric series per table!). The fact is visualizing and getting a unified view of the cluster combined with OS-level metrics and application metrics is not an easy thing for Cassandra users to set up.  </p><h3>What is the Metrics Collector for Apache Cassandra?</h3><p>To help solve this problem, DataStax released a new open source project called the <a href=\"https://github.com/datastax/metric-collector-for-apache-cassandra\">Metric Collector for Apache Cassandra</a> (MCAC for short).  This project provides a drop-in solution to solve this monitoring gap for Apache Cassandra. Here’s how it works.</p><p>MCAC is built on the widely used <a href=\"https://collectd.org/\">collectd</a> agent but with a novel twist. 
Collectd is a widely adopted metric collection agent that integrates with many external metrics systems, such as <a href=\"https://collectd.org/wiki/index.php/Plugin:Write_Prometheus\">Prometheus</a>, <a href=\"https://collectd.org/wiki/index.php/Plugin:Write_Graphite\">Graphite</a>, <a href=\"https://collectd.org/wiki/index.php/Plugin:Write_Stackdriver\">Stackdriver</a>, and <a href=\"https://collectd.org/wiki/index.php/Plugin:Write_HTTP\">others</a>. While collectd can scrape JMX metrics out of the box, <a href=\"https://github.com/prometheus/jmx_exporter/issues/246#issuecomment-367573931\">JMX scraping can be quite slow</a> and works best with only a subset of metrics. Not to mention, many people don’t want to maintain and configure a metric agent on every node.</p><p>MCAC powers the health tab in <a href=\"https://astra.datastax.com/register\">Astra</a> and is bundled with our <a href=\"https://github.com/datastax/cass-operator\">Kubernetes operator for Apache Cassandra</a>.</p><h3>Why MCAC is different</h3><p>To solve this problem, MCAC ships as a single bundle containing our Java agent and a portable Linux build of collectd. Add the agent to cassandra-env.sh and it brings up collectd and ships every Cassandra metric to collectd over a Unix socket. It works on all Apache Cassandra versions from 2.2 through 4.0.</p><p>Because the metrics are shipped this efficiently, MCAC can export hundreds of thousands of series per node with little or no impact on C* performance.</p><p>MCAC not only sends the metrics, it is also designed to work well with Prometheus out of the box: <a href=\"https://www.robustperception.io/how-does-a-prometheus-histogram-work\">histograms are tailored for aggregation</a> by Prometheus, and labels are automatically converted on ingest. 
This means you can slice and dice metrics across DCs and racks, all the way down to individual tables.</p><p>The Cassandra metrics are one part of the equation, but with collectd we can also gather and expose OS-level metrics such as context switches and disk/network performance.</p><p>MCAC also keeps a historical log on each node of metric and non-metric diagnostic events related to activity on that node. Non-metric events include details on flushes, compactions, exceptions, GC, etc. This DataLog can be used to help analyze performance or other issues affecting the cluster. If you need help, our SRE team is available to help you diagnose problems with this log at <a href=\"https://www.datastax.com/keepcalm\">https://www.datastax.com/keepcalm</a>, and if you have any questions we're here to help at <a href=\"https://community.datastax.com/\">https://community.datastax.com/</a>.</p><p>Finally, what good are all these metrics without a way to visualize them? To tie it all together, MCAC comes with pre-built Grafana dashboards which give operators the best Cassandra monitoring solution out there. 
These dashboards will change over time to focus on specific aspects of the system to make it easier to drill into the cluster.</p><p><img alt=\"Grafana\" data-entity-type=\"file\" data-entity-uuid=\"3352a498-c70f-43a8-adb4-cf1086877b2b\" src=\"https://www.datastax.com/sites/default/files/inline-images/MCAC1_0.png\" /></p><p><img alt=\"mcac\" data-entity-type=\"file\" data-entity-uuid=\"7a740a4f-874c-4520-b95d-a549adb48be6\" src=\"https://www.datastax.com/sites/default/files/inline-images/MCAC3.png\" /></p><p><img alt=\"mcac2\" data-entity-type=\"file\" data-entity-uuid=\"f6e23d85-a306-4141-a3b1-35b2c92ba846\" src=\"https://www.datastax.com/sites/default/files/inline-images/MCAC2_0.png\" /></p>","id":"344cc497-9d5b-5ea8-b392-f5271d516517","title":"Monitoring Apache Cassandra™ Made Simple","origin_url":"https://www.datastax.com/blog/2020/05/monitoring-apache-cassandratm-made-simple","url":"https://www.datastax.com/blog/2020/05/monitoring-apache-cassandratm-made-simple","wallabag_created_at":"2020-05-20T21:17:41+00:00","published_at":null,"published_by":"['']","reading_time":2,"domain_name":"www.datastax.com","preview_picture":"https://www.ibm.com/content/dam/connectedassets-adobe-cms/worldwide-content/creative-assets/s-migr/ul/g/d8/31/datastax-overview-leadspace.png/_jcr_content/renditions/cq5dam.thumbnail.1280.1280.png","tags":["monitoring","cassandra","grafana"],"description":"To learn more about the DataStax open-source project, Metric Collector for Apache Cassandra and to try a demo, visit us on GitHub.Apache Cassandra is a resilient system for users to build applications..."},"relatedArticles":[{"content":"<p>DataStax Mission Control currently focuses on the lifecycle management and observability of DSE clusters. <a href=\"https://docs.datastax.com/en/mission-control/docs/install/product.html\" class=\"xref page\">Install</a> DataStax Mission Control quickly, and then use it to create and manage DataStax Enterprise 6.8.26+ clusters. 
DataStax Mission Control is composed of a suite of operators which handle the orchestration of automation across regional cluster boundaries. This simplifies management of globally deployed DSE clusters. Define a <a href=\"https://docs.datastax.com/en/mission-control/docs/reference/dse-cluster.html\" class=\"xref page\">DSE Cluster</a> custom resource (CR) to <a href=\"https://docs.datastax.com/en/mission-control/docs/manage/dse/add-dse-cluster.html\" class=\"xref page\">create</a> your first DSE Cluster with DataStax Mission Control.</p>","id":"8c9be501-0d76-5325-81f8-ae484d83122d","title":"DataStax Mission Control :: DataStax Project Mission Control","origin_url":"https://docs.datastax.com/en/mission-control/docs/index.html","url":"https://docs.datastax.com/en/mission-control/docs/index.html","wallabag_created_at":"2023-04-15T01:46:36+00:00","published_at":null,"published_by":null,"reading_time":null,"domain_name":"docs.datastax.com","preview_picture":"https://docs.datastax.com/en/_/img/datastax-docs-banner.png","tags":["kubernetes","datastax","cassandra","grafana"],"description":"DataStax Mission Control currently focuses on the lifecycle management and observability of DSE clusters. Install DataStax Mission Control quickly, and then use it to create and manage DataStax Enterp..."},{"content":"<p>Last week at the Cassandra Summit I gave a talk with <a href=\"https://twitter.com/rustyrazorblade/status/511932312515526656\">Blake Eggleston</a> on diagnosing performance problems in production. We spoke to about 300 people for about 25 minutes followed by a healthy Q&amp;A session. I’ve expanded on our presentation to include a few extra tools, screenshots, and more clarity on our talking points.</p><p>There’s finally a lot of material available for someone looking to get started with Cassandra. 
There’s several introductory videos on YouTube by both <a href=\"https://www.youtube.com/watch?v=W45Ysb9b6oE\">me</a> and <a href=\"https://www.youtube.com/watch?v=B-bTPSwhsDY\">Patrick McFadin</a> as well as videos on <a href=\"https://www.youtube.com/watch?v=Vv3QJxAdjic\">time series data modeling</a>. I’ve posted videos for my own project, cqlengine, (<a href=\"https://www.youtube.com/watch?v=zrbQcPNMbB0\">intro</a> &amp; <a href=\"https://www.youtube.com/watch?v=clXN9pnakvI\">advanced</a>), and plenty more on the <a href=\"https://www.youtube.com/channel/UCvP-AXuCr-naAeEccCfKwUA\">PlanetCassandra channel</a>. There’s also a boatload of <a href=\"http://planetcassandra.org/client-drivers-tools/\">getting started</a> material on PlanetCassandra written by <a href=\"https://twitter.com/rebccamills\">Rebecca Mills</a>.</p><p>This is the guide for what to do once you’ve built your application and you’re ready to put Cassandra in production. Whether you’ve been in operations for years or you are first getting started, this post should give you a good sense of what you need in order to address any issues you encounter.</p><p>The original slides are available via <a href=\"http://www.slideshare.net/JonHaddad/diagnosing-problems-in-production-cassandra-summit-2014\">Slideshare</a>.</p><p>Update: the presentation is now <a href=\"https://www.youtube.com/watch?v=QOwVDcLZd0A\">available on YouTube</a>!</p><iframe width=\"560\" height=\"315\" src=\"http://www.youtube.com/embed/QOwVDcLZd0A\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\">[embedded content]</iframe><p>Before you even put your cluster under load, there’s a few things you can set up that will help you diagnose problems if they pop up.</p><ol><li>\n<p>Ops center</p>\n<p>This is the standard management tool for Cassandra clusters. This is recommended for every cluster. While not open source, the community version is free. 
It gives you a high level overview of your cluster and provides historical metrics for the most important information. It comes with a variety of graphs that handle about 90% of what you need on a day to day basis.</p>\n<p><img src=\"http://www.datastax.com/wp-content/themes/datastax-2013/images/opscenter/opsc4-ring-view-c-hadoop-solr.jpg\" referrerpolicy=\"no-referrer\" alt=\"image\" /></p>\n</li>\n<li>\n<p>Metrics plugins</p>\n<p>Cassandra has included the <a href=\"https://dropwizard.github.io/metrics/3.1.0/\">metrics library</a> since version 1.1. Each release tracks more metrics with it. <strong>Why is this awesome?</strong> In previous versions of Cassandra, the standard way to access what was going on in the internals was over JMX, a very Java centric communications protocol. That meant writing a Java agent or setting up mx4j or Jolokia, then digging through JMX, which can be a little hairy. Not everyone wants to do this much work.</p>\n<p>The metrics library allows you to tell Cassandra to report its internal, table level metrics out to a whole slew of different places. It can report to CSV, Ganglia, Graphite, and STDOUT, and it’s pluggable, so you can push metrics anywhere you want.</p>\n<p><img src=\"http://www.datastax.com/wp-content/uploads/2013/11/client-vs-cf.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></p>\n<p><a href=\"http://www.datastax.com/dev/blog/pluggable-metrics-reporting-in-cassandra-2-0-2\">Read more about the metrics library integration.</a></p>\n</li>\n<li>\n<p>Munin, Nagios, Icinga (or other system metrics monitoring)</p>\n<p>I’ve found these tools to be incredibly useful for graphing system metrics as well as custom application metrics. There are many options. If you’re already familiar with one tool, you can probably keep using it. 
There are hosted solutions as well (Server Density, Datadog, etc.)</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/cassandra_writes.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n<li>\n<p>Statsd, Graphite, Grafana</p>\n<p>Your application should be tracking internal metrics: timing queries, frequently called functions, etc. These tools let you get a profile of what’s going on with your code in production. Statsd collects raw stats and aggregates them together, then kicks them to Graphite. Grafana is an optional (better) front end to Graphite.</p>\n<p>There was a great post by Etsy, <a href=\"http://codeascraft.com/2011/02/15/measure-anything-measure-everything/\">Measure Anything, Measure Everything</a>, that introduced statsd and outlined its usage with Graphite.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/cassandra-graphite2.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n<li>\n<p>Logstash</p>\n<p>We didn’t mention <a href=\"http://logstash.net/\">Logstash</a> in our presentation, but we’ve found it to be incredibly useful in correlating application issues with other failures. This is useful for application logging aggregation. If you don’t want to host your own log analysis tool, there are hosted services for this as well.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/logstash_blog-1024x514.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n</ol><p>There’s a bunch of system tools that are useful if you’re logged onto a machine and want to see real time information.</p><ol><li>\n<p>iostat</p>\n<p>iostat is useful for seeing what’s happening with each disk on your machine. If you’re hitting I/O issues, you’ll see it here. Specifically, if you see high read &amp; write rates along with a big avgqu-sz (disk queue) or a high svctm (service time), there’s a good chance you’re bottlenecked on your disk. 
You either want to use more disks or faster disks. Cassandra loves SSDs.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/iostat.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n<li>\n<p>htop</p>\n<p>Htop is a better version of top, which is useful for getting a quick glance at your system. It shows load, running processes, memory usage, and a bunch of other information at a quick glance.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/htop.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n<li>\n<p>iftop &amp; netstat</p>\n<p>iftop is like top, but shows you active connections and the transfer rates between your server and whoever is at the other end.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/iftop.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure><p>Netstat is more of a networking swiss army knife. You can see network connections, routing tables, interface statistics, and a variety of other network information.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/netstat.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n<li>\n<p>dstat</p>\n<p>I prefer to use dstat over iostat now since it includes all of its functionality and much of the functionality of other tools as well.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/dstat.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n<li>\n<p>strace</p>\n<p>strace is useful when you want to know what system calls are happening for a given process.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/strace.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n<li>\n<p>pcstat</p>\n<p>This tool, written by <a href=\"https://twitter.com/AlTobey\">Al Tobey</a>, allows you to examine a bunch of files and quickly determine how much of each 
file is in the buffer cache. If you’re trying to figure out why table access is slow, this tool can tell you if your data is in cache already or if you have to go out to disk. <a href=\"http://www.linuxatemyram.com/\">Here’s a good read</a> to get familiar with buffer cache. <a href=\"https://github.com/tobert/pcstat\">Check out the repo</a>.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/pcstat.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n</ol><p>There’s a few issues that are easy to run into that I’d consider “gotchas”, things that come up often enough that they’re worth mentioning.</p><p>An important design decision in Cassandra is that it uses last write wins when there are two inserts, updates, or deletes to the same cell. To determine the last update, Cassandra uses the system clock (or the client can specify the time explicitly). If server times differ, the last write may not actually win; the write whose timestamp is skewed furthest into the future wins instead.</p><p>To address this issue, always make sure your clocks are synced. ntpd will constantly correct for drift, while ntpdate performs a hard adjustment to your system clock. Use ntpdate if your clock is significantly off, then let ntpd keep it at the correct time.</p><figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/ntpdate.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure><h2 id=\"disk-space-not-reclaimed\">Disk space not reclaimed</h2><p>If you add new nodes to a cluster, each replica becomes responsible for less data. That data is streamed to the new nodes; however, it is not removed from the old nodes. If you’re adding new nodes because you’re running low on disk space, this is extremely important. You are required to run <code>nodetool cleanup</code> in order to reclaim that disk space. 
This is a good idea any time you change your database topology.</p><h2 id=\"issues-adding-nodes-or-running-repairs\">Issues adding nodes, or running repairs</h2><p>There are two common problems that come up with repair. The first is that repairs take forever in 2.0. <a href=\"http://www.datastax.com/dev/blog/more-efficient-repairs\">This is solved in 2.1</a> which uses an incremental repair, and does not repair data which has already been repaired. The second issue relates to trying to repair (or add nodes) to a cluster when the versions do not match. It is, in general, not a good idea (yet) to stream data between servers which are of different versions. It will appear to have started, but will just hang around doing nothing.</p><p>Cassandra comes with several tools to help diagnose performance problems. They are available via <code>nodetool</code>, Cassandra’s multipurpose administration tool.</p><h2 id=\"compaction\">Compaction</h2><p>Compaction is the process of merging SSTables together. It reduces the number of seeks required to return a result. It’s a necessary part of Cassandra. If not configured correctly, it can be problematic. You can limit the I/O used by compaction by using <code>nodetool setcompactionthroughput</code>.</p><p>There’s 2 types of compaction available out of the box. Size Tiered is the default and great for write heavy workloads. Leveled compaction is good for read &amp; update heavy workloads, but since it uses much higher I/O it’s recommended you use this only if you’re on SSD. 
I recommend reading through the <a href=\"http://datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_configure_compaction_t.html\">documentation</a> to understand more about which is right for your workload.</p><p><img src=\"http://www.datastax.com/documentation/cassandra/2.0/cassandra/images/dml_compaction.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></p><p>Histograms let you quickly understand at both a high level and table level what your performance looks like on a single node in your cluster. The first histogram, <code>proxyhistograms</code>, give you a quick top level view of all your tables on a node. This includes network latency. Histogram output has changed between versions to be more user friendly. The screenshot below is from Cassandra 2.1.</p><figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/proxyhistograms.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure><p>If you’d like to find out if you’ve got a performance problem isolated to a particular table, I suggest first running <code>nodetool cfstats</code> on a keyspace. You’ll be able to scan the list of tables and see if there’s any abnormalities. You’ll be able to quickly tell which tables are queried the most (both reads and writes).</p><figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/cfstats.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure><p><code>nodetool cfhistograms</code> lets you identify performance problems with a single table on a single node. The statistics are more easily read in Cassandra 2.1.</p><figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/cfhistograms.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure><h2 id=\"query-tracing\">Query Tracing</h2><p>If you’ve narrowed down your problem to a particular table, you can start to trace the queries that you execute. 
If you’re coming from something like MySQL, you’re used to the command <code>explain</code>, which tells you in advance what the query plan is for a given query. Tracing takes a different approach. Instead of showing a query plan, query tracing keeps track of the events in the system when the query actually executes. Here’s an example where we’ve created a whole bunch of tombstones on a partition. Even on an SSD you still want to avoid a lot of tombstones - they’re disk, CPU, and memory intensive.</p><figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/tracing.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure><p>The JVM gets a reputation for being a bit of a beast. It’s a really impressive feat of engineering, but it shouldn’t be regarded as black magic. I strongly recommend reading through <a href=\"http://blakeeggleston.com/cassandra-tuning-the-jvm-for-read-heavy-workloads.html\">Blake Eggleston’s post on the JVM</a>; it’s well written and does a great job of explaining things. (Much better than I would here.)</p><p>OK - we’ve got all these tools under our belt. Now we can start to narrow down the problem.</p><ul><li>\n<p>Are you seeing weird consistency issues, even on consistency level ALL?<br />It’s possible you’re dealing with a clock sync issue. If you’re sending queries really close to one another, they might also be getting the same millisecond level timestamp due to an async race condition in your code. If you’re sending lots of writes at the same time to the same row, you may have a problem in your application. Try to rethink your data model to avoid this.</p>\n</li>\n<li>\n<p>Has query performance dropped? Are you bottlenecked on disk, network, CPU, or memory? Use the tools above to figure out your bottleneck. Did the number of queries to your cluster increase? Are you seeing longer than normal garbage collection times? Ops center has historical graphs that are useful here. Is there a single table affected, or every table? 
Use histograms and cfstats to dig into it.</p>\n</li>\n<li>\n<p>Are nodes going up and down? Use a combination of ops center and your system metrics to figure out which node it is. If it’s always the same node, start investigating why. Is there a hot partition? Is it doing a lot of garbage collection? Is your application opening more connections than before? You should have system metrics that show these trends over time. Maybe you just have additional load on the system - it may be necessary to add new nodes. Don’t forget to run cleanup.</p>\n</li>\n</ul><p>This started out as a small recap but has evolved into much more than that. The tools above have helped me diagnose a wide variety of problems, not just Cassandra ones. If you follow the above recommendations you should be in a great spot to diagnose most problems that come your way.</p><p>You can find me on <a href=\"https://twitter.com/rustyrazorblade\">Twitter</a> for any comments or suggestions.</p>","id":"eef63404-0bd6-5a8d-96dc-2a4c8b270540","title":"Cassandra Summit Recap: Diagnosing Problems in Production - RustyRazorblade.com","origin_url":"http://rustyrazorblade.com/post/2014/2014-09-18-diagnosing-production/","url":"http://rustyrazorblade.com/post/2014/2014-09-18-diagnosing-production/","wallabag_created_at":"2023-03-01T19:46:02+00:00","published_at":null,"published_by":null,"reading_time":10,"domain_name":"rustyrazorblade.com","preview_picture":"http://rustyrazorblade.com/images/default.png","tags":["monitoring","cassandra"],"description":"Last week at the Cassandra Summit I gave a talk with Blake Eggleston on diagnosing performance problems in production. 
We spoke to about 300 people for about 25 minutes followed by a healthy Q&A sessi..."},
{"content":"<p><strong itemprop=\"name\" class=\"mr-2 flex-self-stretch\"><a data-pjax=\"#repo-content-pjax-container\" data-turbo-frame=\"repo-content-turbo-frame\" href=\"https://github.com/jlacefie/cfstats-csv-parser\">cfstats-csv-parser</a></strong> Public</p><p class=\"f4 mb-3\">Repo for a utility to parse cfstats into a csv file for analysis</p><h3 class=\"sr-only\">License</h3><p><a href=\"https://github.com/jlacefie/cfstats-csv-parser/blob/master/LICENSE\" class=\"Link--muted\" data-analytics-event=\"{&quot;category&quot;:&quot;Repository Overview&quot;,&quot;action&quot;:&quot;click&quot;,&quot;label&quot;:&quot;location:sidebar;file:license&quot;}\"> MIT license</a></p><p><a class=\"Link--secondary no-underline mr-3\" href=\"https://github.com/jlacefie/cfstats-csv-parser/stargazers\"> 1 star</a> <a class=\"Link--secondary no-underline\" href=\"https://github.com/jlacefie/cfstats-csv-parser/network/members\"> 4 forks</a></p>","id":"8e3a1b9c-7cd2-578f-a0b1-877ac6f47cb3","title":"GitHub - jlacefie/cfstats-csv-parser: Repo for a utility to parse cfstats into a csv file for analysis","origin_url":"https://github.com/jlacefie/cfstats-csv-parser","url":"https://github.com/jlacefie/cfstats-csv-parser","wallabag_created_at":"2023-03-01T19:40:38+00:00","published_at":null,"published_by":"['jlacefie']","reading_time":null,"domain_name":"github.com","preview_picture":"https://opengraph.githubassets.com/a44d5ad918f82ef2250d84b5d7d746b2e183f62eb7bd35a736db6977a6ff0bfb/jlacefie/cfstats-csv-parser","tags":["monitoring","cassandra"],"description":"{{ message }}\n / cfstats-csv-parser PublicNotifications\nFork 4\n\n Star 1 \n\nRepo for a utility to parse cfstats into a csv file for analysisLicense MIT license 1 star  4 forks Star Notifications\n\n © 202..."},
{"content":"<div class=\"entry clearfix\"><p>In Apache Cassandra Lunch #62, guest speaker Sarma Pydipally presented on the Grafana Dashboard for Cassandra. The live recording of Cassandra Lunch, which includes a more in-depth discussion and a demo, is embedded below in case you were not able to attend live. If you would like to attend Apache Cassandra Lunch live, it is hosted every Wednesday at 12 PM EST. Register <a href=\"https://www.meetup.com/Cassandra-DataStax-DC/events/\" target=\"_blank\" rel=\"noreferrer noopener\">here</a> now!</p><h2>Grafana Dashboard for Apache Cassandra</h2><p>We appreciate Sarma Pydipally for taking the time to create a presentation and sharing his knowledge of the Grafana Dashboard and how it can be used with Apache Cassandra.</p><h3>Grafana Dashboard</h3><p>Grafana Dashboards is an open-source data and analytics visualization tool. Some key features of Grafana include visualization, queries, alerts, and metrics exploration. 
It allows users to take data from their time-series databases (TSDBs) and create graphs and visualizations.</p><h3>Prometheus</h3><p>Prometheus is an open-source monitoring system. It records and tracks real-time metrics in a time series database. Prometheus supports flexible queries using PromQL and supports real-time alerting. It collects data through a pull model. The Prometheus server queries various data sources from a list at a set frequency and updates the current values based on these queries.</p><h3>Cassandra</h3><p>Apache Cassandra is a highly available and highly scalable, open-source, distributed NoSQL database. Cassandra features proven fault-tolerance on hardware or in the cloud.</p><h3>Grafana Dashboard with Prometheus and Cassandra</h3><p>To find out more about how to use Grafana Dashboard with Cassandra, please check out Sarma’s video and presentation embedded below. The GitHub repo to the demo featured in the presentation is linked below.</p><figure class=\"wp-block-image size-large is-style-default\"><a href=\"https://blog.anant.us/wp-content/uploads/2021/09/Cassandra-Grafana-Dashboard.jpg\"><img width=\"1024\" height=\"576\" src=\"https://blog.anant.us/wp-content/uploads/2021/09/Cassandra-Grafana-Dashboard-1024x576.jpg\" alt=\"Grafana Dashboard for Cassandra architecture diagram.\" class=\"wp-image-178116\" srcset=\"https://blog.anant.us/wp-content/uploads/2021/09/Cassandra-Grafana-Dashboard-1024x576.jpg 1024w, https://blog.anant.us/wp-content/uploads/2021/09/Cassandra-Grafana-Dashboard-300x169.jpg 300w, https://blog.anant.us/wp-content/uploads/2021/09/Cassandra-Grafana-Dashboard-768x432.jpg 768w, https://blog.anant.us/wp-content/uploads/2021/09/Cassandra-Grafana-Dashboard-1536x864.jpg 1536w, https://blog.anant.us/wp-content/uploads/2021/09/Cassandra-Grafana-Dashboard.jpg 1920w\" referrerpolicy=\"no-referrer\" /></a></figure><h3>Demo</h3><p><a 
href=\"https://github.com/sarma1807/Prometheus-Grafana-Cassandra\">https://github.com/sarma1807/Prometheus-Grafana-Cassandra</a></p><figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><iframe title=\"Apache Cassandra Lunch #62: Grafana Dashboard for Apache Cassandra\" width=\"900\" height=\"506\" src=\"https://www.youtube.com/embed/ATfKQ9YLfv8?feature=oembed\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"> </iframe>\n</figure><h2>Cassandra.Link</h2><p><a href=\"https://cassandra.link/\" target=\"_blank\" rel=\"noreferrer noopener\">Cassandra.Link</a> is a knowledge base that we created for all things Apache Cassandra. Our goal with <a href=\"https://cassandra.link/\" target=\"_blank\" rel=\"noreferrer noopener\">Cassandra.Link</a> was to not only fill the gap of <a href=\"https://web.archive.org/web/*/http://www.planetcassandra.org\" target=\"_blank\" rel=\"noreferrer noopener\">Planet Cassandra</a> but to bring the <strong>Cassandra </strong>community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.</p><p>We are a technology company that specializes in building business platforms. 
If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!</p></div>","id":"efd6b1f7-3f4d-5a7e-8313-2984d4a821c7","title":"Apache Cassandra Lunch #62: Grafana Dashboard for Apache Cassandra - Business Platform Team","origin_url":"https://blog.anant.us/apache-cassandra-lunch-62-grafana-dashboard-for-apache-cassandra/","url":"https://blog.anant.us/apache-cassandra-lunch-62-grafana-dashboard-for-apache-cassandra/","wallabag_created_at":"2022-06-25T14:34:24+00:00","published_at":"2021-09-09T17:41:33+00:00","published_by":"['']","reading_time":1,"domain_name":"blog.anant.us","preview_picture":"https://blog.anant.us/wp-content/uploads/2021/08/SM_Webinar_Deck_Template-KEEP-12-1.png","tags":["cassandra.lunch","grafana.dashboard","cassandra","grafana"],"description":"In Apache Cassandra Lunch #62, guest speaker Sarma Pydipally presented on the Grafana Dashboard for Cassandra. The live recording of Cassandra Lunch, which includes a more in-depth discussion and a de..."},{"content":"<h2>Cassandra DataSource for Grafana</h2><p>Apache Cassandra Datasource for Grafana. 
This datasource visualises <strong>time-series data</strong> stored in Cassandra/DSE. If you are looking for Cassandra <strong>metrics</strong>, you may need <a href=\"https://github.com/datastax/metric-collector-for-apache-cassandra\" target=\"_blank\">datastax/metric-collector-for-apache-cassandra</a> instead.</p><p><img src=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/workflows/Handle%20Release/badge.svg\" alt=\"Release Status\" /><img src=\"https://github.com/HadesArchitect/grafana-cassandra-source/workflows/CodeQL/badge.svg?branch=master\" alt=\"CodeQL\" /><img src=\"https://img.shields.io/github/downloads/hadesarchitect/grafanacassandradatasource/total?color=%2326c458&amp;label=Downloads&amp;logo=github\" alt=\"GitHub all releases\" /></p><p>To see the datasource in action, please follow the <a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/wiki/Quick-Demo\" target=\"_blank\">Quick Demo</a> steps. Documentation is available <a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/wiki\" target=\"_blank\">here</a>.</p><p><strong>Supports</strong>:</p><ul><li>Grafana 5.x, 6.x, 7.x (4.x not tested; 8.x is a work in progress and not supported yet)</li>\n<li>Cassandra 3.x, 4.x (2.x not tested)</li>\n<li>DataStax Enterprise 6.x</li>\n<li>DataStax Astra (<a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/wiki/DataStax-Astra\" target=\"_blank\">docs</a>)</li>\n<li>AWS Keyspaces (limited support) (<a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/wiki/AWS-Keyspaces\" target=\"_blank\">docs</a>)</li>\n<li>Linux, OSX (incl. 
M1), Windows</li>\n</ul><p><strong>Contacts</strong>:</p><ul><li><a href=\"https://discord.gg/FU2Cb4KTyp\" target=\"_blank\"><img src=\"https://img.shields.io/badge/discord-chat%20with%20us-green\" alt=\"Discord Chat\" /></a></li>\n<li><a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/discussions\" target=\"_blank\"><img src=\"https://img.shields.io/badge/github-discussions-green\" alt=\"Github discussions\" /></a></li>\n</ul><h2>Usage</h2><p>You can find more detailed instructions in <a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/wiki\" target=\"_blank\">the datasource wiki</a>.</p><h3>Installation</h3><ol><li>Download <code>cassandra-datasource-VERSION.zip</code> from the <a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/releases/latest\" target=\"_blank\">latest release</a> and uncompress it into the Grafana plugins directory (<code>grafana/plugins</code>).</li>\n<li>Add the Cassandra DataSource as a datasource on the datasource configuration page.</li>\n<li>Configure the datasource by specifying the contact point and port (for example, \"10.11.12.13:9042\"), the username, and the password. It's recommended to use a dedicated user with read-only permissions restricted to the table you need to access.</li>\n<li>Press the \"Save and Test\" button. If there is an error message, check the credentials and connection.</li>\n</ol><p><img src=\"https://user-images.githubusercontent.com/1742301/148654400-3ac4a477-8ca3-4606-86e7-5d10cbdc4ea9.png\" alt=\"Datasource Configuration\" /></p><h3>Panel Setup</h3><p>There are <strong>two ways</strong> to query data from Cassandra/DSE: the <strong>Query Configurator</strong> and the <strong>Query Editor</strong>. 
The Configurator is easier to use but has limited capabilities; the Editor is more powerful but requires an understanding of <a href=\"https://cassandra.apache.org/doc/latest/cql/\" target=\"_blank\">CQL</a>.</p><h4>Query Configurator</h4><p><img src=\"https://user-images.githubusercontent.com/1742301/148654262-b9cb7253-4086-4367-8aae-35ea458fcbb6.png\" alt=\"Query Configurator\" /></p><p>Query Configurator is the easiest way to query data. First, enter the keyspace and table name, then pick the proper columns. If the keyspace and table names are given correctly, the datasource will suggest the column names automatically.</p><ul><li><strong>Time Column</strong> - the column storing the timestamp value; it answers the \"when\" question.</li>\n<li><strong>Value Column</strong> - the column storing the value you'd like to show. It can be the <code>value</code>, <code>temperature</code> or whatever property you need.</li>\n<li><strong>ID Column</strong> - the column that uniquely identifies the source of the data, e.g. <code>sensor_id</code>, <code>shop_id</code> or whatever allows you to identify the origin of the data.</li>\n</ul><p>After that, you have to specify the <code>ID Value</code>, the particular ID of the data origin you want to show. You may need to enable \"ALLOW FILTERING\", although we recommend avoiding it.</p><p><strong>Example</strong> Imagine you want to visualise reports of a temperature sensor installed in your smart home. 
Given that the sensor reports its ID, time, location and temperature every minute, we create a table to store the data and put some values there:</p><pre>CREATE TABLE IF NOT EXISTS temperature (\n    sensor_id uuid,\n    registered_at timestamp,\n    temperature int,\n    location text,\n    PRIMARY KEY ((sensor_id), registered_at)\n);\ninsert into temperature (sensor_id, registered_at, temperature, location) values (99051fe9-6a9c-46c2-b949-38ef78858dd0, '2020-04-01T11:21:59.001+0000', 18, 'kitchen');\ninsert into temperature (sensor_id, registered_at, temperature, location) values (99051fe9-6a9c-46c2-b949-38ef78858dd0, '2020-04-01T11:22:59.001+0000', 19, 'kitchen');\ninsert into temperature (sensor_id, registered_at, temperature, location) values (99051fe9-6a9c-46c2-b949-38ef78858dd0, '2020-04-01T11:23:59.001+0000', 20, 'kitchen');\n</pre><p>In this case, we have to fill the configurator fields the following way to get the results:</p><ul><li><strong>Keyspace</strong> - smarthome <em>(keyspace name)</em></li>\n<li><strong>Table</strong> - temperature <em>(table name)</em></li>\n<li><strong>Time Column</strong> - registered_at <em>(occurrence)</em></li>\n<li><strong>Value Column</strong> - temperature <em>(value to show)</em></li>\n<li><strong>ID Column</strong> - sensor_id <em>(ID of the data origin)</em></li>\n<li><strong>ID Value</strong> - 99051fe9-6a9c-46c2-b949-38ef78858dd0 <em>(ID of the sensor)</em></li>\n<li><strong>ALLOW FILTERING</strong> - FALSE <em>(not required, so we are happy to avoid)</em></li>\n</ul><p>If there are several origins (multiple sensors), you will need to add more rows. If your case is as simple as that, the query configurator will be a good choice; otherwise, please proceed to the query editor.</p><h4>Query Editor</h4><p>Query Editor is a more powerful way to query data. 
To enable the query editor, press the \"toggle text edit mode\" button.</p><p><img src=\"https://user-images.githubusercontent.com/1742301/148654475-6718f3ff-1290-4d7a-a40b-dc107c52ac15.png\" alt=\"102781863-a8bd4b80-4398-11eb-8c28-4d06a1f29279\" /></p><p>Query Editor unlocks all possibilities of CQL, including User-Defined Functions, aggregations etc.</p><p>Example using <a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/blob/master/test_data.cql\" target=\"_blank\">test_data.cql</a>:</p><pre>SELECT id, CAST(value as double), created_at FROM test.test WHERE id IN (99051fe9-6a9c-46c2-b949-38ef78858dd1, 99051fe9-6a9c-46c2-b949-38ef78858dd0) AND created_at &gt; $__timeFrom and created_at &lt; $__timeTo\n</pre><ol><li>Follow the order of the SELECT expressions; it's important!</li>\n</ol><ul><li><strong>Identifier</strong> - the first property in the SELECT expression must be the ID, something that uniquely identifies the data (e.g. <code>sensor_id</code>)</li>\n<li><strong>Value</strong> - The second property must be the value you want to show</li>\n<li><strong>Timestamp</strong> - The third property must be the timestamp of the value.\nAll other properties will be ignored.</li>\n</ul><ol start=\"2\"><li>To filter data by time, use the <code>$__timeFrom</code> and <code>$__timeTo</code> placeholders as in the example. The datasource will replace them with time values from the panel. <strong>Notice</strong> It's important to add the placeholders; otherwise the query will try to fetch data for the whole period of time. Don't try to specify the timeframe on your own, just put the placeholders. 
It's Grafana's job to specify time limits.</li>\n</ol><p><img src=\"https://user-images.githubusercontent.com/1742301/148654522-8e50617d-0ba9-4c5a-a3f0-7badec92e31f.png\" alt=\"103153625-1fd85280-4792-11eb-9c00-085297802117\" /></p><h2>Development</h2><p><a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/wiki/Developer-Guide\" target=\"_blank\">Developer documentation</a></p>","id":"de9ff26a-f27a-57fb-b10d-bcbd676a6d9d","title":"Apache Cassandra","origin_url":"https://grafana.com/grafana/plugins/hadesarchitect-cassandra-datasource/","url":"https://grafana.com/grafana/plugins/hadesarchitect-cassandra-datasource/","wallabag_created_at":"2022-04-25T21:45:46+00:00","published_at":null,"published_by":null,"reading_time":4,"domain_name":"grafana.com","preview_picture":"https://grafana.com/media/images/meta/grafana-labs-meta-default_1200x630.png","tags":["cassandra","grafana"],"description":"Cassandra DataSource for GrafanaApache Cassandra Datasource for Grafana. This datasource is to visualise time-series data stored in Cassandra/DSE, if you are looking for Cassandra metrics, you may nee..."},{"content":"<div class=\"top-section\"><div class=\"container-fluid main-content-area\"><div id=\"scroll-status-bar\"><div id=\"scroll-status-percent\"></div><div class=\"blog-post-hero container-fluid\"><div class=\"blog-post-hero-bg\"><div class=\"blog-post-hero-bg-img\"></div><div class=\"container\"><img src=\"https://sematext.com/wp-content/uploads/2021/10/critical-cassandra-metrics-to-monitor.jpg\" id=\"the-featured-image\" width=\"1140\" height=\"626\" alt=\"image\" /></div></div><div class=\"container-fluid container-single-blog-post\"><div class=\"container\"><article class=\"single-article-blog-post\" id=\"post-53726\"><main><section><div id=\"the-content\"><p>Apache Cassandra is a distributed database known for its high availability, fault tolerance, and near-linear scaling. 
It was initially developed at Facebook and is now a widely used open-source system run by some of the largest tech companies in the world. There are numerous reasons behind its popularity, including no single point of failure, exceptional horizontal scaling, and a data layout that fits time-series data well.</p><p>However, despite these perks, like any other system, Cassandra is prone to performance issues. This makes monitoring imperative. And it all starts with knowing what to measure. In this article, we will explain the <strong>key Cassandra performance metrics</strong> you should monitor to make sure everything is up and running at all times.</p><h2>What Is Cassandra and How Does It Work?</h2><p>Let’s keep it short – Apache Cassandra is a distributed NoSQL database designed to provide a fault-tolerant and highly available architecture with performance in mind.</p><p>As a distributed system, Cassandra is built out of nodes. A <strong>node</strong> is a single instance of Apache Cassandra that can operate on its own. Multiple nodes can form a <strong>cluster</strong> – a distributed system holding common data and responding to query requests. Cassandra works in a master-less architecture where each node communicates in a <strong>peer-to-peer</strong> fashion using a protocol known as <strong>Gossip</strong>. The <strong>gossip</strong> protocol is designed so that each node is informed about the state of all other nodes, and a single node performs <strong>gossip</strong> communication with up to three other nodes every second.</p><p>The <strong>cluster</strong> can be divided into <strong>data centers</strong> and <strong>racks</strong>, just as real-life data centers are divided. 
In Cassandra terminology, a <strong>data center</strong> is designed to hold multiple <strong>racks</strong> and a single <strong>rack</strong> holds a complete replica of the data.</p><p><img data-lazyloaded=\"1\" src=\"https://sematext.com/wp-content/uploads/2021/10/cassandra-metrics-3.png\" alt=\"image\" /></p><p><em>Cassandra Cluster Logical Overview</em></p><p>When it comes to the data, Cassandra stores it in tables that are organized the same way as in any other database – in rows and columns. A single table is also called a column family. The tables themselves are grouped into keyspaces, where a keyspace usually holds logically similar data – for example, from a business perspective. The keyspace is also used for data replication, and the replication itself is configured on a keyspace level.</p><p>Getting back to the tables: each table defines a primary key that is built from the partition key and the clustering columns. Cassandra uses the partition key to index the data. All data that share a common partition key make a single data partition – a basic unit for data retrieval and storage. The clustering columns are optional.</p><p>Needless to say, Apache Cassandra is a complicated, distributed system and it’s not uncommon for users to encounter operational problems and difficulties. Everything breaks eventually, from the low-level bare metal components up to the high-level software. It is not unusual for users to deal with network issues and CPU utilization problems, especially on very large clusters. Cassandra is written in Java and uses both off-heap and heap memory, which means that as the volume of data grows, you may hit issues with the garbage collector. Finally, because of the amount of data that you will process, you may need to deal with disk space and the performance of your I/O subsystem. 
All of these can be avoided by keeping an eye on the relevant metrics with the help of a good <a href=\"https://sematext.com/integrations/cassandra\">Cassandra monitoring tool</a>.</p><h2>How Is Cassandra Performance Measured?</h2><p>Most complex distributed systems provide a set of metrics that you should monitor and alert on to ensure that your system is healthy and working well. Apache Cassandra is no different. It provides a plethora of performance metrics which we can divide into three categories:</p><ul><li>Dedicated Apache Cassandra metrics that describe how the system and its parts perform.</li><li><a href=\"https://sematext.com/blog/jvm-metrics/\">Java Virtual Machine metrics</a> that tell you about the execution environment on which Apache Cassandra is running.</li><li><a href=\"https://sematext.com/server-monitoring/\">Operating system metrics</a> describing the bare metal servers, virtual machines, or containers, depending on the environment that you are using.</li></ul><h3>Dedicated Cassandra Performance Metrics</h3><p>When monitoring Apache Cassandra clusters, the first place to look is the metrics that the distributed data store exposes via the JMX interface. There are many Cassandra performance metrics exposed via JMX and having visibility into most of them is a good idea. You never know what can be useful when troubleshooting.</p><h4>Nodes</h4><p>One of the most important Cassandra metrics is the number of nodes that are currently available and connected to form a cluster. 
The ability to store the data and respond to queries is directly related to the availability of nodes.</p><h4>Compaction Metrics</h4><p><a href=\"https://cassandra.apache.org/doc/latest/operating/compaction/index.html\">Compaction</a> is the operation of merging multiple smaller instances of <a href=\"https://cassandra.apache.org/doc/latest/architecture/storage_engine.html#sstables\">SSTable</a> into one bigger SSTable that contains all the data from the smaller tables. Because of that, it can be very expensive and resource-consuming. Having visibility into compaction performance is critical for long-term observability – the <a href=\"https://sematext.com/blog/cassandra-monitoring-tools/\">Cassandra monitoring tool</a> of your choice needs to provide the number of compactions and the number of compacted bytes.</p><p>During compaction, until the process ends, the total disk space used may be double that before the compaction. Because of that, you should consider leaving about 50% of space free to account for compactions and, of course, set up appropriate alerts to inform you when the amount of free disk space is close to a level where compaction could fail.</p><h4>Read and Write Performance Metrics</h4><p>The next set of metrics is dedicated to clients and the read and write side of the operations. You should measure the number of reads happening in a given period, the request latency, and the number of timeouts and failures. Your Cassandra monitoring tool should provide the top-level view and allow for slicing and dicing through the data showing you the aggregated view, per node view, per keyspace view, and per table view. The same goes for write operations.</p><p>You should see the number of write requests happening and write latency. Local writes and reads may also be important when troubleshooting.</p><h4>Table Metrics</h4><p>Table metrics are also essential. 
The ones you should pay close attention to are partition size, tombstone scans, and the number of SSTables per read.</p><h5>Partition Size</h5><p>Partition size is crucial for cluster performance. Cassandra uses it as a unit of data storage, replication, and retrieval, thus directly dictating the performance of your Cassandra tables. The ideal partition size varies but is usually below 100MB and not less than 10 – 20MB.</p><h5>Tombstones</h5><p>Cassandra produces <a href=\"https://cassandra.apache.org/doc/latest/operating/compaction.html#tombstones-and-garbage-collection-gc-grace\">tombstones</a> when you delete the data. They are markers of the deleted data. Data in Cassandra is immutable by design, and because of that, it can only be physically removed from the SSTable during compactions. Because of that, you should keep an eye on how they affect your disk space.</p><h5>SSTables Per Read</h5><p>Similar to tombstones, the number of SSTables per read is related to the immutability of the data in Cassandra. A single table can be built of multiple SSTables, which are written sequentially. A single read operation can result in reading multiple SSTables to retrieve the relevant data. The more SSTables Cassandra needs to read to return the data, the more resources are required to complete the read operation. This is why you should minimize the number whenever possible.</p><h4>Other Metrics</h4><p>As we mentioned earlier, other Apache Cassandra performance metrics can be helpful and you should consider monitoring them.</p><h5>Caches</h5><p>There are two types of caches in Cassandra – the key cache and the row cache. Cassandra uses the key cache to store the location of row keys in memory so that the rows can be accessed without the need to hit the disk. The row cache stores the rows themselves in memory. 
By using the caches, Cassandra reduces the need to read the data from the disk and trades memory usage for performance.</p><p>You need to monitor the key cache requests and row cache requests, which tell how many requests to a given cache type were made, and the key cache hit ratio and the row cache hit ratio, which show the percentage of results retrieved from the cache instead of the disk.</p><h5>Threadpool</h5><p>Cassandra is designed to handle high load, withstand backpressure, and perform asynchronous tasks. Monitoring various thread pools is crucial for understanding Cassandra’s performance and bottlenecks. Each thread pool exposes the number of active, pending, and blocked tasks. Accumulating pending and blocked tasks usually point to performance issues and the need for more processing power or a different data and query architecture.</p><h5>Bloom Filter</h5><p>In the read path, Cassandra merges the data stored on disk inside the SSTables with the data stored in memory. To minimize the amount of checking for data existence in the SSTables on disk, Cassandra uses a data structure called a bloom filter.</p><p>The bloom filter is a probabilistic data structure that can tell Cassandra that the data is definitely not in a given file or that the data may be present in a given file. The key metrics to monitor here are the amount of space used by bloom filters, the number of false positives, and the false-positive ratio. You can reduce the number of false positives by assigning more memory to the bloom filters.</p><h3>Java Virtual Machine Metrics</h3><p><a href=\"https://cassandra.apache.org/\">Apache Cassandra</a> is a JVM-based application that comes with all the usual JVM pros and cons. From the developer’s perspective, memory management is easier and requires less hassle – you just use an object and forget about it, letting the JVM do the cleaning up. But that means that something has to clean up all the unused objects in memory. 
This is where <a href=\"https://sematext.com/blog/java-garbage-collection/\">Java Garbage Collection</a> comes in, along with the metrics that come with it.</p><p>A proper <a href=\"https://sematext.com/integrations/cassandra-monitoring/\">Cassandra monitoring tool</a> should provide metrics that allow you to check and troubleshoot issues with the Java Virtual Machine, such as JVM memory utilization and garbage collection count and time. You can read more about them in our guide about <a href=\"https://sematext.com/blog/jvm-metrics/\">JVM metrics</a>.</p><h3>Operating System Metrics</h3><p>You can’t ignore Operating System metrics either. Information such as CPU utilization, memory usage, and disk usage is essential and can play a major role when it comes to Cassandra performance.</p><h4>CPU Utilization</h4><p>Your CPU is used for data processing and query handling. The more spare CPU cycles you have on a given node, the more data and queries it can process. The <strong>user</strong> part of the CPU usage shows what your Cassandra process needs, while the <strong>wait</strong> part can point to a bottleneck in I/O or network. As with every Java application, CPU cycles are also needed for garbage collection, so keep that in mind when planning.</p><h4>Memory Usage</h4><p>Memory usage is crucial for every JVM-based application. The newest version of Cassandra leverages both off-heap and heap memory. This means that you not only need to set the heap size of your Cassandra nodes correctly but also need enough off-heap memory to keep your cluster performance at its best.</p><h4>Disk Usage</h4><p>Disk and I/O are crucial – Cassandra keeps its data on the disk, and each query may require a substantial number of I/O operations to return the results. You need to be sure that your hardware can handle your data retrieval needs. 
You also need to be sure that you have enough space to hold your data and handle the compaction process.</p><h2>Monitor Cassandra Performance with Sematext</h2><p><img data-lazyloaded=\"1\" data-placeholder-resp=\"1999x1017\" src=\"https://sematext.com/wp-content/uploads/2021/10/cassandra-metrics-1.png\" class=\"alignnone\" alt=\"monitoring cassandra performance metrics with sematext\" width=\"1999\" height=\"1017\" /></p><p><a href=\"https://sematext.com/cloud/\">Sematext Cloud</a> and its <a href=\"https://sematext.com/integrations/cassandra\">Apache Cassandra monitoring</a> integration provide all that you need to monitor your distributed database. Everything is available within a single view, without distractions:</p><ul><li>The overview report gives you a perfect starting point for your metrics, painting a picture of the whole cluster.</li><li>A dedicated Cassandra report provides an in-depth view of all relevant metrics related to the distributed database.</li><li>The OS report provides necessary operating system metrics such as CPU and memory utilization and visibility into your network traffic.</li><li>Finally, the JVM metrics give the full view of the Java Virtual Machine, such as metrics related to garbage collection and per-heap space memory utilization.</li></ul><p>Using the dedicated split-view, you can correlate all the available metrics with other metrics, <a href=\"https://sematext.com/logsene/\">logs</a>, and <a href=\"https://sematext.com/experience/\">real user monitoring</a> data, making Sematext a perfect visibility tool.</p><p>Sematext allows you to set up alerts on any metric or log event and supports both threshold-based and anomaly-based alerts for full flexibility. You don’t have to watch your metrics over and over. 
Once you configure your alerts, you can sleep well, and Sematext will let you know if something is wrong.</p><p>If you want to see how Sematext stacks up against similar solutions, read our article about the best <a href=\"https://sematext.com/blog/cassandra-monitoring-tools/\">Cassandra monitoring tools</a> available today.</p><h2>Get Started with Cassandra Monitoring</h2><p>As a distributed database, Apache Cassandra can quickly become an operational challenge without visibility into what is happening from a global perspective as well as on a node level. You need to have full visibility from top to bottom, but that is not enough. You need to be sure that your monitoring system can notify you when an issue happens and also predict issues before your customers notice them.</p><p>One of the tools that will give you all of that is Sematext’s <a href=\"https://sematext.com/integrations/\">Apache Cassandra monitoring</a> integration. Start monitoring your Cassandra cluster by creating a Sematext Cloud account and then the Cassandra monitoring App. 
Don’t forget to create a Logs App as well to ship your Cassandra logs for a full observability experience.</p></div></section><aside><div class=\"aside-blog-content\"><div id=\"related-content\"><div class=\"write-to-us\"><h4>Do you have a cool story to share?</h4><p><a href=\"https://sematext.com/contact/\" title=\"Contact Us\">Write for 
us</a></p></div></div></div></aside></main><footer><div id=\"alternative-sharing-block\"></div></footer></article></div></div></div><div class=\"footer-area\"><div class=\"container footer-inner\"><div class=\"col-md-3 col-sm-6\"><h4>Products</h4><ul><li><a href=\"https://sematext.com/cloud/\" title=\"Sematext Cloud\">Sematext Cloud</a></li><li><a href=\"https://sematext.com/spm/\" title=\"Infrastructure Monitoring\">Infrastructure Monitoring</a></li><li><a href=\"https://sematext.com/logsene/\" title=\"Log Management\">Log Management</a></li><li><a href=\"https://sematext.com/experience/\" title=\"Real User Monitoring\">Real User Monitoring</a></li><li><a href=\"https://sematext.com/synthetic-monitoring/\" title=\"Synthetic Monitoring\">Synthetic Monitoring</a></li><li><a href=\"https://sematext.com/tracing/\" title=\"Distributed Transaction Tracing\">APM / Tracing</a></li><li><a href=\"https://sematext.com/enterprise/\" title=\"Sematext Enterprise\">Sematext Enterprise</a></li></ul></div><div class=\"col-md-2 col-sm-6\"><h4>Services</h4><ul><li><a href=\"https://sematext.com/consulting/\" title=\"Consulting\">Consulting</a></li><li><a href=\"https://sematext.com/support/\" title=\"Support\">Support</a></li><li><a href=\"https://sematext.com/training/\" title=\"Training\">Training</a></li></ul></div><div class=\"col-md-2 col-sm-6\"><h4>About</h4><ul><li><a href=\"https://sematext.com/about/\" title=\"Company\">Company</a></li><li><a href=\"https://sematext.com/blog/\" title=\"Blog\">Blog</a></li><li><a href=\"https://sematext.com/jobs/\" title=\"Jobs\">Jobs</a></li><li><a href=\"https://sematext.com/customers/\" title=\"Customers\">Customers</a></li><li><a href=\"https://status.sematext.com/\" title=\"Status\">Status</a></li></ul></div><div class=\"col-md-2 col-sm-6\"><h4>Contact</h4><ul><li><i class=\"fa fa-phone fa-fw\"> <a href=\"tel:+1%20347-480-1610\">+1 347-480-1610</a></i></li><li><i class=\"fa fa-envelope fa-fw\"> <a 
href=\"mailto:info@sematext.com\">info@sematext.com</a></i></li><li><i class=\"fa fa-map-marker fa-fw\"> <a href=\"https://www.google.com/maps/place/540+President+St,+Brooklyn,+NY+11215,+EE.+UU./@40.6773068,-73.9875385,17z/data=!3m1!4b1!4m5!3m4!1s0x89c25a55722bfff7:0x2143eab42dc5c96d!8m2!3d40.67713!4d-73.984982\" target=\"_blank\">Brooklyn, NY USA</a></i></li><li class=\"social-networks\">\n<a href=\"https://twitter.com/sematext\"><i class=\"fa fa-twitter\" aria-hidden=\"true\"></i></a>\n<a href=\"https://www.facebook.com/Sematext/\"><i class=\"fa fa-facebook\" aria-hidden=\"true\"></i></a>\n<a href=\"https://github.com/sematext\"><i class=\"fa fa-github\" aria-hidden=\"true\"></i></a>\n<a href=\"https://www.linkedin.com/company/294493/\"><i class=\"fa fa-linkedin\" aria-hidden=\"true\"></i></a></li></ul></div><div class=\"col-md-3 col-sm-12\"><p>\n<strong>© Sematext Group. All rights reserved</strong>\n<br /><a href=\"https://sematext.com/legal/terms-of-service/\">Terms Of Service</a> · <a href=\"https://sematext.com/legal/privacy/\">Privacy Policy</a></p><figure><a href=\"https://www.softwareadvice.com/network-monitoring/#top-products\"><img src=\"https://sematext.com/wp-content/themes/sematext-next/inc/images/crozdesk-badges/badge-01.png\" alt=\"Software Advice 2020 Front Runners\" /></a>\n<a href=\"https://www.softwareadvice.com/reporting-tools/#top-products\"><img src=\"https://sematext.com/wp-content/themes/sematext-next/inc/images/crozdesk-badges/badge-02.png\" alt=\"Software Advice 2021 Front Runners\" /></a>\n<a href=\"https://www.getapp.com/business-intelligence-analytics-software/analytics-reporting/category-leaders/\"><img src=\"https://sematext.com/wp-content/themes/sematext-next/inc/images/crozdesk-badges/badge-03.png\" alt=\"GetApp Category Leaders 2021\" /></a>\n<a href=\"https://crozdesk.com/it/application-performance-monitoring-apm-software/sematext-cloud\"><img 
src=\"https://sematext.com/wp-content/themes/sematext-next/inc/images/crozdesk-badges/badge-04.png\" alt=\"Crozdesk 2020 Quality Choice\" /></a>\n<a href=\"https://crozdesk.com/it/application-performance-monitoring-apm-software/sematext-cloud\"><img src=\"https://sematext.com/wp-content/themes/sematext-next/inc/images/crozdesk-badges/badge-05.png\" alt=\"Crozdesk 2020 Trusted Vendor\" /></a>\n<a href=\"https://crozdesk.com/it/application-performance-monitoring-apm-software/sematext-cloud\"><img src=\"https://sematext.com/wp-content/themes/sematext-next/inc/images/crozdesk-badges/badge-06.png\" alt=\"Crozdesk 2020 Happiest Users\" /></a></figure></div></div><footer id=\"colophon\" class=\"site-footer\" role=\"contentinfo\"><div class=\"container\"><div class=\"copyright col-md-12\"><p>\nApache Lucene, Apache Solr and their respective logos are trademarks of the Apache Software Foundation.\nElasticsearch, Kibana, Logstash, and Beats are trademarks of Elasticsearch BV, registered in the U.S.\nand in other countries. Sematext Group, Inc. is not affiliated with Elasticsearch BV.</p></div></div></footer></div></div></div></div>","id":"d16c68f8-5520-5e6b-a126-62cd8cda87a2","title":"How Do You Monitor Cassandra Performance: Key Metrics to Measure","origin_url":"https://sematext.com/blog/cassandra-monitoring/","url":"https://sematext.com/blog/cassandra-monitoring/","wallabag_created_at":"2021-11-08T17:30:27+00:00","published_at":"2021-10-04T10:55:25+00:00","published_by":"['Rafal Kuć']","reading_time":11,"domain_name":"sematext.com","preview_picture":"https://sematext.com/wp-content/uploads/2021/10/critical-cassandra-metrics-to-monitor.jpg","tags":["monitoring","cassandra","performance"],"description":"Apache Cassandra is a distributed database known for its high availability, fault tolerance, and near-linear scaling. 
It was initially developed by Facebook, but it is a widely used open-source system..."},{"content":"<p><a target=\"_blank\" rel=\"noopener noreferrer\" href=\"https://github.com/sarma1807/Prometheus-Grafana-Cassandra/blob/main/Screenshots/JPGs/CassPromGraf_00_Arch.jpg\"><img src=\"https://github.com/sarma1807/Prometheus-Grafana-Cassandra/raw/main/Screenshots/JPGs/CassPromGraf_00_Arch.jpg\" alt=\"CassPromGraf_00_Arch.jpg\" /></a> </p><h3><a id=\"user-content-environment\" class=\"anchor\" aria-hidden=\"true\" href=\"#environment\"></a>Environment</h3><h5><a id=\"user-content-following-servers-are-running-with-centos-linux-release-782003-core-\" class=\"anchor\" aria-hidden=\"true\" href=\"#following-servers-are-running-with-centos-linux-release-782003-core-\"></a>Following servers are running with <code>CentOS Linux release 7.8.2003 (Core)</code> :</h5><div class=\"snippet-clipboard-content position-relative\" data-snippet-clipboard-copy-content=\"192.168.1.151      eternal1      eternal1.OracleByExample.com&#10;192.168.1.152      eternal2      eternal2.OracleByExample.com&#10;192.168.1.153      eternal3      eternal3.OracleByExample.com&#10;192.168.1.191      PromGraf      PromGraf.OracleByExample.com&#10;\"><pre>192.168.1.151      eternal1      eternal1.OracleByExample.com\n192.168.1.152      eternal2      eternal2.OracleByExample.com\n192.168.1.153      eternal3      eternal3.OracleByExample.com\n192.168.1.191      PromGraf      PromGraf.OracleByExample.com\n</pre></div><h3><a id=\"user-content-apache-cassandra\" class=\"anchor\" aria-hidden=\"true\" href=\"#apache-cassandra\"></a>Apache Cassandra</h3><p>3 node Cassandra cluster <code>cluster_name: 'id_cluster'</code> version : <code>Apache Cassandra 4.0-beta4</code> running on following servers :</p><div class=\"snippet-clipboard-content position-relative\" data-snippet-clipboard-copy-content=\"192.168.1.151      eternal1      eternal1.OracleByExample.com&#10;192.168.1.152      eternal2      
eternal2.OracleByExample.com&#10;192.168.1.153      eternal3      eternal3.OracleByExample.com&#10;\"><pre>192.168.1.151      eternal1      eternal1.OracleByExample.com\n192.168.1.152      eternal2      eternal2.OracleByExample.com\n192.168.1.153      eternal3      eternal3.OracleByExample.com\n</pre></div><p>We will configure Cassandra's built in <code>metrics-reporter</code> to extract and publish metrics to Prometheus. <br />We will also configure <code>Prometheus node_exporter</code> on each of our Cassandra nodes to extract and publish metrics to Prometheus.\n</p><h3><a id=\"user-content-prometheus--grafana\" class=\"anchor\" aria-hidden=\"true\" href=\"#prometheus--grafana\"></a>Prometheus &amp; Grafana</h3><p>We will configure and run <code>Prometheus &amp; Grafana</code> on following server :</p><div class=\"snippet-clipboard-content position-relative\" data-snippet-clipboard-copy-content=\"192.168.1.191      PromGraf      PromGraf.OracleByExample.com&#10;\"><pre>192.168.1.191      PromGraf      PromGraf.OracleByExample.com\n</pre></div><p><code>Prometheus</code> will gather and organize all collected metrics into its internal time-series database. <br /><code>Grafana</code> will consume the metrics from Prometheus and display them in a nice dashboard. 
</p><h3><a id=\"user-content-prometheus-alertmanager\" class=\"anchor\" aria-hidden=\"true\" href=\"#prometheus-alertmanager\"></a>Prometheus Alertmanager</h3><p>In the future, we should implement <code>Alertmanager</code>\n</p>","id":"3d846338-4870-59b3-8a50-8ba8bf31966d","title":"sarma1807/Prometheus-Grafana-Cassandra","origin_url":"https://github.com/sarma1807/Prometheus-Grafana-Cassandra","url":"https://github.com/sarma1807/Prometheus-Grafana-Cassandra","wallabag_created_at":"2021-07-10T11:53:50+00:00","published_at":null,"published_by":"['sarma1807']","reading_time":null,"domain_name":"github.com","preview_picture":"https://opengraph.githubassets.com/b79fbf32a62a4dfb02f40b3c4144d102b6d471b6f61ae0f8af63847ce6ea5fbf/sarma1807/Prometheus-Grafana-Cassandra","tags":["cassandra","grafana","prometheus"],"description":" EnvironmentFollowing servers are running with CentOS Linux release 7.8.2003 (Core) :192.168.1.151      eternal1      eternal1.OracleByExample.com\n192.168.1.152      eternal2      eternal2.OracleByExa..."},{"content":"<p><em>This blog post is a writeup of the presentation Bartek Plotka and I gave at \n<a href=\"https://promcon.io/2019-munich/talks/two-households-both-alike-in-dignity-cortex-and-thanos/\" target=\"_blank\" rel=\"noopener noreferrer\">PromCon 2019</a>.</em></p><p>Cortex is a horizontally scalable, clustered \n<a href=\"https://grafana.com/oss/prometheus/\">Prometheus</a> implementation aimed at giving users a global view of all their Prometheus metrics in one place, and providing long term storage for those metrics. Thanos is a newer project aimed at solving the same challenges. 
In this blog post, we compare these two projects and see how it is possible to have two completely different approaches to the same problems.</p><iframe width=\"560\" height=\"315\" src=\"https://www.youtube.com/embed/KmJnmd3K3Ws\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\">[embedded content]</iframe><p>The Cortex Project was started in 2016 by Julius Volz and me, and joined the CNCF sandbox in 2018. Thanos was started in 2017 by Fabian Reinartz and Bartłomiej Płotka, and joined the CNCF sandbox in 2019. Both are written in Go, \n<a href=\"https://github.com/thanos-io/thanos\" target=\"_blank\" rel=\"noopener noreferrer\">hosted on</a> \n<a href=\"https://github.com/cortexproject/cortex\" target=\"_blank\" rel=\"noopener noreferrer\">Github</a>, and re-use large swathes of the Prometheus codebase. Both projects aim to solve the same problems: a global view of all your metrics, highly available monitoring without gaps, and long term storage.</p><h2 id=\"1-global-view-queries-over-data-from-multiple-prometheus-servers\">1. Global View: Queries over Data from Multiple Prometheus Servers</h2><p>Prometheus is a pull-based monitoring system; as such, Prometheus servers need to be colocated with the jobs and infrastructure they are monitoring. This works extremely well when your application is deployed in one region; when you start deploying in multiple regions, you need Prometheus servers in each region. And when you want to run queries that cover data in both regions, you can’t. This is the first problem that Cortex and Thanos set out to solve.</p><p>Thanos reuses your existing Prometheus servers in your existing clusters. A new stateless service, the Thanos Querier, “fans out” queries to these existing Prometheus servers. 
The existing Prometheus servers need an added sidecar (the Thanos Sidecar) deployed alongside them in order to handle these queries.</p><figure class=\"figure-wrapper  figure-wrapper__lightbox\" itemprop=\"associatedMedia\" itemscope=\"itemscope\" itemtype=\"http://schema.org/ImageObject\"><a href=\"https://grafana.com/static/assets/img/blog/thanos_cortex1.jpg\" itemprop=\"contentUrl\">\n\t\t\t<img src=\"https://grafana.com/static/assets/img/blog/thanos_cortex1.jpg\" alt=\"Thanos: Fanout Queries\" width=\"\" height=\"\" title=\"Thanos: Fanout Queries\" /></a>\n\t\n</figure><p>In Cortex, requests flow in the opposite direction – your existing Prometheus servers push data to a central, scalable Cortex cluster using Prometheus’ built-in remote-write capability. The central Cortex cluster stores all the data and handles queries “locally.”</p><figure class=\"figure-wrapper  figure-wrapper__lightbox\" itemprop=\"associatedMedia\" itemscope=\"itemscope\" itemtype=\"http://schema.org/ImageObject\"><a href=\"https://grafana.com/static/assets/img/blog/thanos_cortex2.jpg\" itemprop=\"contentUrl\">\n\t\t\t<img src=\"https://grafana.com/static/assets/img/blog/thanos_cortex2.jpg\" alt=\"Cortex: Centralized Data\" width=\"\" height=\"\" title=\"Cortex: Centralized Data\" /></a>\n\t\n</figure><p>This presents us with our first tradeoff: query performance and availability. With Thanos, chunk data must be pulled back from the “edge” locations to a central location where your Thanos queriers are running. If there is an interruption in this wide-area network, your query performance and availability will suffer. In the worst case, the latest 2 hours of data may not be available for querying. As Cortex relies on a push-based model for centrally aggregating the data, when an edge location becomes unavailable, all data up to that point is still available centrally. 
On the other hand, with Cortex you need to manage a separate Cortex cluster and storage on top of your Prometheus deployment – Thanos leverages your existing deployment.</p><h2 id=\"2-ha-prometheus-no-gaps-in-the-graphs\">2. HA Prometheus: No Gaps in the Graphs</h2><p>Prometheus runs as a single process on a single computer; if that computer fails, the operating system needs updating, or Prometheus needs restarting, you end up with gaps in your graphs. To work around this, the best practice is to deploy a pair of Prometheus servers in each region; if one fails, the idea is the other one continues to scrape your jobs and store metrics. But this doesn’t solve the gaps problem – out of the box, Prometheus doesn’t provide a way to merge data from multiple replicas.</p><p>However, Thanos does! The Thanos querier will read from both replicas and combine the metrics into a single, no-gaps-included result. As the two replicas may naturally be slightly out-of-sync, Thanos uses \n<a href=\"https://github.com/thanos-io/thanos/blob/master/pkg/query/iter.go#L458\" target=\"_blank\" rel=\"noopener noreferrer\">a heuristic to determine which results to show to avoid gaps</a>.</p><figure class=\"figure-wrapper  figure-wrapper__lightbox\" itemprop=\"associatedMedia\" itemscope=\"itemscope\" itemtype=\"http://schema.org/ImageObject\"><a href=\"https://grafana.com/static/assets/img/blog/thanos_cortex3.jpg\" itemprop=\"contentUrl\">\n\t\t\t<img src=\"https://grafana.com/static/assets/img/blog/thanos_cortex3.jpg\" alt=\"Thanos: Query Time Deduplication\" width=\"\" height=\"\" title=\"Thanos: Query Time Deduplication\" /></a>\n\t\n</figure><p>Cortex takes a different approach. As each replica is pushing samples to the central Cortex cluster, Cortex deduplicates the streams and only stores a single copy. 
Cortex relies on tracking the last push from each replica and using a simple timeout to “elect” the other replica as the master.</p><figure class=\"figure-wrapper  figure-wrapper__lightbox\" itemprop=\"associatedMedia\" itemscope=\"itemscope\" itemtype=\"http://schema.org/ImageObject\"><a href=\"https://grafana.com/static/assets/img/blog/thanos_cortex4.jpg\" itemprop=\"contentUrl\">\n\t\t\t<img src=\"https://grafana.com/static/assets/img/blog/thanos_cortex4.jpg\" alt=\"Cortex: Resolve Gaps at Write Time\" width=\"\" height=\"\" title=\"Cortex: Resolve Gaps at Write Time\" /></a>\n\t\n</figure><p>Both systems result in a very similar experience: gapless graphs in Grafana.</p><h2 id=\"3-long-term-storage-store-data-for-long-term-analysis\">3. Long Term Storage: Store Data for Long-Term Analysis</h2><p>Prometheus is designed to run with as few dependencies as possible; it uses the local network to gather metrics and stores data on the local disk. Even though the storage format is capable of almost indefinite retention, truly durable long-term storage needs to consider disk failure, replication, and recovery.</p><p>Thanos offloads Prometheus TSDB blocks to object stores; \n<a href=\"https://github.com/thanos-io/thanos/blob/master/docs/storage.md#configuration\" target=\"_blank\" rel=\"noopener noreferrer\">7 different object stores</a> are supported. Two-hour-long blocks are built locally and then uploaded to the object store by the Thanos sidecar. A separate microservice – the Thanos Store Gateway – is used to handle queries against the blocks in object storage. 
Responsibility for the durability of the blocks is offloaded to the object store – a wise choice, as replication and repair are difficult to get right.</p><figure class=\"figure-wrapper  figure-wrapper__lightbox\" itemprop=\"associatedMedia\" itemscope=\"itemscope\" itemtype=\"http://schema.org/ImageObject\"><a href=\"https://grafana.com/static/assets/img/blog/thanos_cortex5.jpg\" itemprop=\"contentUrl\">\n\t\t\t<img src=\"https://grafana.com/static/assets/img/blog/thanos_cortex5.jpg\" alt=\"Thanos: TSDB Blocks in Object Store\" width=\"\" height=\"\" title=\"Thanos: TSDB Blocks in Object Store\" /></a>\n\t\n</figure><p>Cortex takes a similar approach, offloading the responsibility for durability and replication to managed services. The key differences are that Cortex stores individual Prometheus chunks in an object store and a custom index in a NoSQL store – recall that a TSDB block is a collection of chunks from multiple time series with an included index. At Grafana Labs, we store both the index and the chunks in a Google Bigtable cluster, although both Cassandra and AWS DynamoDB are supported.</p><figure class=\"figure-wrapper  figure-wrapper__lightbox\" itemprop=\"associatedMedia\" itemscope=\"itemscope\" itemtype=\"http://schema.org/ImageObject\"><a href=\"https://grafana.com/static/assets/img/blog/thanos_cortex6.jpg\" itemprop=\"contentUrl\">\n\t\t\t<img src=\"https://grafana.com/static/assets/img/blog/thanos_cortex6.jpg\" alt=\"Cortex: NoSQL Index &amp; Chunks\" width=\"\" height=\"\" title=\"Cortex: NoSQL Index &amp; Chunks\" /></a>\n\t\n</figure><p>When comparing these approaches it boils down to two things: cost and performance. Both systems use managed services, and both systems can be run with a single dependency. Managed NOSQL stores (such as Google Bigtable or AWS DynamoDB) tend to cost more per GB stored than object stores (such as Google Cloud Storage or AWS S3), although they tend to cost less per IOP.  
This does tend to lead to a higher TCO for a Cortex-style system than a Thanos-style system. On the other hand, using a NOSQL store for the index in Cortex gives it an edge when it comes to query performance and scalability.</p><figure class=\"figure-wrapper  figure-wrapper__lightbox\" itemprop=\"associatedMedia\" itemscope=\"itemscope\" itemtype=\"http://schema.org/ImageObject\"><a href=\"https://grafana.com/static/assets/img/blog/thanos_cortex7.jpg\" itemprop=\"contentUrl\">\n\t\t\t<img src=\"https://grafana.com/static/assets/img/blog/thanos_cortex7.jpg\" alt=\"Thanos vs. Cortex\" width=\"\" height=\"\" title=\"Thanos vs. Cortex\" /></a>\n\t\n</figure><h2 id=\"future-collaboration\">Future Collaboration</h2><p>As the Cortex and Thanos projects both aim to solve the same problems, both make heavy use of the existing Prometheus codebase, and two of the maintainers both live in London, we have been collaborating more and more over the past year.</p><p>The first example of this collaboration is that we have made the Cortex query frontend, a caching and parallelization service that accelerates PromQL queries, work for Thanos. You can read more about it in our previous blog post, \n<a href=\"https://grafana.com/blog/2019/09/19/how-to-get-blazin-fast-promql/\">How to Get Blazin’ Fast PromQL</a>. We are really excited to be able to bring these optimizations to a wider audience and welcome contributions to improve Thanos support.</p><p>The second example of collaboration is making it possible for Cortex to use Thanos’ blocks-in-object-storage approach, reusing the existing Thanos code to achieve this. My thanks go out to \n<a href=\"https://github.com/cortexproject/cortex/pull/1695\" target=\"_blank\" rel=\"noopener noreferrer\">Thor at DigitalOcean</a> for putting in the legwork to make this happen. 
While it’s not quite ready for production usage yet, we’re really excited about the TCO improvements this will bring to Cortex, and are hoping to work closely with the Thanos team to improve scalability and performance.</p><p>I’m always amazed at how two different groups of engineers can come up with completely different (and hopefully equally valid) solutions to the same problems. At Grafana Labs, we use Cortex to power \n<a href=\"https://grafana.com/products/cloud/\">Grafana Cloud</a>’s Prometheus service, and have recently started using Thanos internally to monitor it. I’m excited to see what the future holds for Cortex and Thanos!</p>","id":"6cfd006a-f3f5-5747-847f-6059ff765462","title":"[PromCon Recap] Two Households, Both Alike in Dignity: Cortex and Thanos","origin_url":"https://grafana.com/blog/2019/11/21/promcon-recap-two-households-both-alike-in-dignity-cortex-and-thanos/","url":"https://grafana.com/blog/2019/11/21/promcon-recap-two-households-both-alike-in-dignity-cortex-and-thanos/","wallabag_created_at":"2021-04-29T12:10:17+00:00","published_at":null,"published_by":"['']","reading_time":6,"domain_name":"grafana.com","preview_picture":"https://s3.amazonaws.com/a-us.storyblok.com/f/1022730/769ef3c07d/thanos_cortex7.jpg","tags":["cortex","kubernetes","cassandra","grafana","aws.s3","prometheus","thanos"],"description":"This blog post is a writeup of the presentation Bartek Plotka and I gave at \nPromCon 2019.Cortex is a horizontally scalable, clustered \nPrometheus implementation aimed at giving users a global view of..."},{"content":"<p><em>To learn more about the DataStax open-source project, <a href=\"https://github.com/datastax/metric-collector-for-apache-cassandra\">Metric Collector for Apache Cassandra</a> and to try a demo, visit us on <a href=\"https://github.com/datastax/metric-collector-for-apache-cassandra\">GitHub</a>.</em></p><p>Apache Cassandra is a resilient system for users to build applications on, but many operators see Cassandra as a 
bit of a black box. It’s not that Cassandra doesn’t have <a href=\"https://cassandra.apache.org/doc/latest/operating/metrics.html\">hundreds of metrics to consume</a>, it does (over 300 metric series per table!). The fact is visualizing and getting a unified view of the cluster combined with OS-level metrics and application metrics is not an easy thing for Cassandra users to set up.  </p><h3>What is the Metrics Collector for Apache Cassandra?</h3><p>To help solve this problem, DataStax released a new open source project called the <a href=\"https://github.com/datastax/metric-collector-for-apache-cassandra\">Metric Collector for Apache Cassandra</a> (MCAC for short).  This project provides a drop-in solution to solve this monitoring gap for Apache Cassandra. Here’s how it works.</p><p>MCAC is built on the widely used <a href=\"https://collectd.org/\">collectd</a> agent but with a novel twist. Collectd is a metric collection agent that is well adopted and integrates well with all kinds of external metrics systems like <a href=\"https://collectd.org/wiki/index.php/Plugin:Write_Prometheus\">prometheus</a>, <a href=\"https://collectd.org/wiki/index.php/Plugin:Write_Graphite\">graphite</a>, <a href=\"https://collectd.org/wiki/index.php/Plugin:Write_Stackdriver\">stackdriver</a>, and <a href=\"https://collectd.org/wiki/index.php/Plugin:Write_HTTP\">others</a>. While collectd can scrape JMX metrics out of the box, <a href=\"https://github.com/prometheus/jmx_exporter/issues/246#issuecomment-367573931\">JMX scraping can be quite slow</a> and works best with only a subset of metrics. Not to mention many people don’t want to maintain and configure the metric agent on every node.  </p><p>We use MCAC to power the health tab in <a href=\"https://astra.datastax.com/register\">Astra</a>, and it is bundled with our <a href=\"https://github.com/datastax/cass-operator\">Kubernetes operator for Apache Cassandra</a>. 
</p><h3>Why MCAC is different</h3><p>To solve this problem MCAC comes as a single bundle with our java agent and a linux portable collectd build all in one. Just add the agent to cassandra-env.sh; it brings up collectd and ships every metric in Cassandra to collectd via a unix-socket. It works on all Apache Cassandra versions from 2.2 -&gt; 4.0. </p><p>By shipping the metrics efficiently this way, it is able to export hundreds of thousands of series per node with little/no impact on C* performance.</p><p>Not only does it send the metrics, but it is specially designed to work well with prometheus out of the box: for example, <a href=\"https://www.robustperception.io/how-does-a-prometheus-histogram-work\">histograms are tailored for aggregation</a> by prometheus and labels are automatically converted on ingest. This means you can slice and dice metrics across DCs, racks, down to even tables.</p><p>The Cassandra metrics are one aspect of the equation but with collectd we can also gather and expose all the OS level metrics, like context switches and disk/network performance.</p><p>MCAC also creates a historical log on the nodes of metric and non-metric diagnostic events related to activity on the node. Non-metric events include details on Flushes, Compactions, Exceptions, GC, etc. This DataLog can be used to help analyze performance or other issues impacting the cluster. If you need help our SRE team is available to help you diagnose problems with this log <a href=\"https://www.datastax.com/keepcalm\">https://www.datastax.com/keepcalm</a> and if you have any questions we're here to help at <a href=\"https://community.datastax.com/\">https://community.datastax.com/</a>.</p><p>Finally, what good are all these metrics without a way to visualize them! To tie it all together, MCAC comes with pre-built grafana dashboards which give operators the best Cassandra monitoring solution out there. 
These dashboards will change over time to focus on specific aspects of the system to make it easier to drill into the cluster.</p><p><img alt=\"Grafana\" data-entity-type=\"file\" data-entity-uuid=\"3352a498-c70f-43a8-adb4-cf1086877b2b\" src=\"https://www.datastax.com/sites/default/files/inline-images/MCAC1_0.png\" /></p><p><img alt=\"mcac\" data-entity-type=\"file\" data-entity-uuid=\"7a740a4f-874c-4520-b95d-a549adb48be6\" src=\"https://www.datastax.com/sites/default/files/inline-images/MCAC3.png\" /></p><p><img alt=\"mcac2\" data-entity-type=\"file\" data-entity-uuid=\"f6e23d85-a306-4141-a3b1-35b2c92ba846\" src=\"https://www.datastax.com/sites/default/files/inline-images/MCAC2_0.png\" /></p>","id":"44ef75af-85c2-58c6-ac00-d9e7a9bd9296","title":"Monitoring Apache Cassandra™ Made Simple","origin_url":"https://www.datastax.com/blog/monitoring-apache-cassandratm-made-simple","url":"https://www.datastax.com/blog/monitoring-apache-cassandratm-made-simple","wallabag_created_at":"2021-02-12T16:13:22+00:00","published_at":null,"published_by":"['']","reading_time":2,"domain_name":"www.datastax.com","preview_picture":"https://www.ibm.com/content/dam/worldwide-content/stock-assets/adb-stk/ul/g/30/19/adobestock_1831482701.jpeg/_jcr_content/renditions/cq5dam.web.1280.1280.jpeg","tags":["prometheus","monitoring","cassandra","grafana"],"description":"To learn more about the DataStax open-source project, Metric Collector for Apache Cassandra and to try a demo, visit us on GitHub.Apache Cassandra is a resilient system for users to build applications..."},{"content":"<div class=\"post-thumbnail\"><img width=\"200\" height=\"130\" src=\"https://2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com/wp-content/uploads/2016/11/shutterstock_cloud_storage_cassandra-200x130.jpg\" class=\"attachment-featured-thumb size-featured-thumb wp-post-image\" alt=\"\" 
srcset=\"https://2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com/wp-content/uploads/2016/11/shutterstock_cloud_storage_cassandra-200x130.jpg 200w, https://2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com/wp-content/uploads/2016/11/shutterstock_cloud_storage_cassandra-300x195.jpg 300w, https://2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com/wp-content/uploads/2016/11/shutterstock_cloud_storage_cassandra-100x65.jpg 100w, https://2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com/wp-content/uploads/2016/11/shutterstock_cloud_storage_cassandra-120x78.jpg 120w, https://2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com/wp-content/uploads/2016/11/shutterstock_cloud_storage_cassandra.jpg 667w\" /></div><p>The latest beta release of Apache Cassandra is designed to hit the ground running as the NoSQL database moves steadily to the cloud to provide managed services in production deployments.</p><p>Cassandra 4.0 released on Monday (July 20) is the first major update of the database since 2017, incorporating more than 1,000 bug fixes and extensive “battle” testing to improve performance in production, making it “the most stable release ever,” maintainers asserted. Performance testing included running Cassandra on clusters as large as 1,000 nodes using an array of enterprise use cases.</p><p>Cassandra promoters note that hyper-scalers such as Apple (NASDAQ: AAPL) have deployed the database in production with more than 75,000 nodes, illustrating its ability to scale.</p><p>Among the new features incorporated into version 4.0 is the ability to stream data between nodes during scaling operations such as adding a new node or datacenter during peak traffic times.</p><p>It also includes new data access controls operating on a “per datacenter basis.”  In one scenario, operators of datacenters located in Europe and the United States could configure Cassandra to allow access to a single datacenter using a “network authorizer” feature. 
Data governance features are gaining traction as European authorities crack down on the <a href=\"https://www.datanami.com/2020/07/16/as-ai-matures-so-too-do-risks-survey-finds/\">cross-border movement</a> of personal user data.</p><p>Monitoring tools are also emphasized in the latest Cassandra release. Previously, open source tools from key code contributors such as DataStax and Instaclustr were the primary tools for observing Cassandra clusters.</p><p>“Constant monitoring of key performance indicators such as latency, disk usage, and throughput is critical to maintaining an optimal deployment,” Justin Cameron, a senior software engineer at <a href=\"https://www.instaclustr.com/\">Instaclustr</a>, wrote last year in <em>Datanami</em>.</p><p>Around-the-clock “monitoring is necessary because both internal and external changes to Cassandra usage patterns are very common,” Cameron added.</p><p>The latest version allows users to selectively monitor system metrics and configuration settings via a feature called <a href=\"https://thelastpickle.com/blog/2019/03/08/virtual-tables-in-cassandra-4_0.html\">Virtual Tables</a>. 
Other tools allow users to record and replay production workloads to analyze performance.</p><p>Along with DataStax, the data platform developer behind Cassandra, key code contributors to the 4.0 version include Amazon Web Services (NASDAQ: AMZN) and Instaclustr.</p><p>The Cassandra 4.0 beta release is <a href=\"https://cassandra.apache.org/download/\">here</a>.</p><p><strong>Recent items:</strong></p><p><a href=\"https://www.datanami.com/2020/05/12/cassandra-now-officially-in-the-cloud-with-datastax-astra/\">Cassandra Now Officially in the Cloud with DataStax Astra</a></p><p><a href=\"https://www.datanami.com/2019/03/19/4-apache-cassandra-pitfalls-you-must-avoid/\">4 Apache Cassandra Pitfalls You Must Avoid</a></p><p>Tags: <a href=\"https://www.datanami.com/tag/apache-cassandra/\" rel=\"tag\">apache cassandra</a>, <a href=\"https://www.datanami.com/tag/cassandra/\" rel=\"tag\">cassandra</a>, <a href=\"https://www.datanami.com/tag/cassandra-4-0/\" rel=\"tag\">Cassandra 4.0</a>, <a href=\"https://www.datanami.com/tag/cloud-database/\" rel=\"tag\">cloud database</a>, <a href=\"https://www.datanami.com/tag/cluster-monitoring/\" rel=\"tag\">cluster monitoring</a>, <a href=\"https://www.datanami.com/tag/data-governance/\" rel=\"tag\">Data Governance</a>, <a href=\"https://www.datanami.com/tag/instaclutr/\" rel=\"tag\">Instaclutr</a>, <a href=\"https://www.datanami.com/tag/network-authorizer/\" rel=\"tag\">network authorizer</a>, <a href=\"https://www.datanami.com/tag/nosql-database/\" rel=\"tag\">NoSQL database</a></p>","id":"4ac5d9e4-3f90-5f78-b0bc-2796eaee3d8e","title":"Cassandra Gets Monitoring, Performance 
Upgrades","origin_url":"https://www.datanami.com/2020/07/21/cassandra-gets-monitoring-performance-upgrades/","url":"https://www.datanami.com/2020/07/21/cassandra-gets-monitoring-performance-upgrades/","wallabag_created_at":"2020-11-13T15:53:24+00:00","published_at":"2020-07-21T15:19:34+00:00","published_by":null,"reading_time":2,"domain_name":"www.datanami.com","preview_picture":"https://2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com/wp-content/uploads/2016/11/shutterstock_cloud_storage_cassandra.jpg","tags":["monitoring","cassandra"],"description":"The latest beta release of the Apache Cassandra is designed to hit the ground running as the NoSQL database moves steadily to the cloud to provide managed services in production deployments.Cassandra ..."}],"tagSets":[{"tag":"monitoring","articles":[{"content":"<p>Last week at the Cassandra Summit I gave a talk with <a href=\"https://twitter.com/rustyrazorblade/status/511932312515526656\">Blake Eggleston</a> on diagnosing performance problems in production. We spoke to about 300 people for about 25 minutes followed by a healthy Q&amp;A session. I’ve expanded on our presentation to include a few extra tools, screenshots, and more clarity on our talking points.</p><p>There’s finally a lot of material available for someone looking to get started with Cassandra. There’s several introductory videos on YouTube by both <a href=\"https://www.youtube.com/watch?v=W45Ysb9b6oE\">me</a> and <a href=\"https://www.youtube.com/watch?v=B-bTPSwhsDY\">Patrick McFadin</a> as well as videos on <a href=\"https://www.youtube.com/watch?v=Vv3QJxAdjic\">time series data modeling</a>. I’ve posted videos for my own project, cqlengine, (<a href=\"https://www.youtube.com/watch?v=zrbQcPNMbB0\">intro</a> &amp; <a href=\"https://www.youtube.com/watch?v=clXN9pnakvI\">advanced</a>), and plenty more on the <a href=\"https://www.youtube.com/channel/UCvP-AXuCr-naAeEccCfKwUA\">PlanetCassandra channel</a>. 
There’s also a boatload of <a href=\"http://planetcassandra.org/client-drivers-tools/\">getting started</a> material on PlanetCassandra written by <a href=\"https://twitter.com/rebccamills\">Rebecca Mills</a>.</p><p>This is the guide for what to do once you’ve built your application and you’re ready to put Cassandra in production. Whether you’ve been in operations for years or you are first getting started, this post should give you a good sense of what you need in order to address any issues you encounter.</p><p>The original slides are available via <a href=\"http://www.slideshare.net/JonHaddad/diagnosing-problems-in-production-cassandra-summit-2014\">Slideshare</a>.</p><p>Update: the presentation is now <a href=\"https://www.youtube.com/watch?v=QOwVDcLZd0A\">available on YouTube</a>!</p><iframe width=\"560\" height=\"315\" src=\"http://www.youtube.com/embed/QOwVDcLZd0A\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\">[embedded content]</iframe><p>Before you even put your cluster under load, there’s a few things you can set up that will help you diagnose problems if they pop up.</p><ol><li>\n<p>Ops center</p>\n<p>This is the standard management tool for Cassandra clusters. This is recommended for every cluster. While not open source, the community version is free. It gives you a high level overview of your cluster and provides historical metrics for the most important information. It comes with a variety of graphs that handle about 90% of what you need on a day to day basis.</p>\n<p><img src=\"http://www.datastax.com/wp-content/themes/datastax-2013/images/opscenter/opsc4-ring-view-c-hadoop-solr.jpg\" referrerpolicy=\"no-referrer\" alt=\"image\" /></p>\n</li>\n<li>\n<p>Metrics plugins</p>\n<p>Cassandra has since version 1.1 included the <a href=\"https://dropwizard.github.io/metrics/3.1.0/\">metrics library</a>. In every release it tracks more metrics using it. 
<strong>Why is this awesome?</strong> In previous versions of Cassandra, the standard way to access what was going on in the internals was over JMX, a very Java-centric communications protocol. That meant writing a Java Agent, setting up mx4j or Jolokia, then digging through JMX, which can be a little hairy. Not everyone wants to do this much work.</p>\n<p>The metrics library allows you to tell Cassandra to report its internal, table level metrics out to a whole slew of different places. It can report out to CSV, Ganglia, Graphite, and STDOUT, and it’s pluggable, so you can push metrics anywhere you want.</p>\n<p><img src=\"http://www.datastax.com/wp-content/uploads/2013/11/client-vs-cf.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></p>\n<p><a href=\"http://www.datastax.com/dev/blog/pluggable-metrics-reporting-in-cassandra-2-0-2\">Read more about the metrics library integration.</a></p>\n</li>\n<li>\n<p>Munin, Nagios, Icinga (or other system metrics monitoring)</p>\n<p>I’ve found these tools to be incredibly useful at graphing system metrics as well as custom application metrics. There are many options. If you’re already familiar with one tool, you can probably keep using it. There are hosted solutions as well (Server Density, Datadog, etc.).</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/cassandra_writes.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n<li>\n<p>Statsd, Graphite, Grafana</p>\n<p>Your application should be tracking internal metrics: timing queries, frequently called functions, etc. These tools let you get a profile of what’s going on with your code in production. Statsd collects raw stats and aggregates them together, then kicks them to Graphite. 
Grafana is an optional (better) front end to Graphite.</p>\n<p>There was a great post by Etsy, <a href=\"http://codeascraft.com/2011/02/15/measure-anything-measure-everything/\">Measure Anything, Measure Everything</a>, that introduced statsd and outlined its usage with Graphite.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/cassandra-graphite2.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n<li>\n<p>Logstash</p>\n<p>We didn’t mention <a href=\"http://logstash.net/\">Logstash</a> in our presentation, but we’ve found it to be incredibly useful in correlating application issues with other failures. This is useful for application log aggregation. If you don’t want to host your own log analysis tool, there are hosted services for this as well.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/logstash_blog-1024x514.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n</ol><p>There are a bunch of system tools that are useful if you’re logged onto a machine and want to see real time information.</p><ol><li>\n<p>iostat</p>\n<p>iostat is useful for seeing what’s happening with each disk on your machine. If you’re hitting I/O issues, you’ll see it here. Specifically, if you see high read &amp; write rates and a big avgqu-sz (disk queue), or a high svctm (service time), there’s a good chance you’re bottlenecked on your disk. You either want to use more disks or faster disks. Cassandra loves SSDs.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/iostat.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n<li>\n<p>htop</p>\n<p>Htop is a better version of top, which is useful for getting a quick glance at your system. 
It shows load, running processes, memory usage, and a bunch of other information at a quick glance.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/htop.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n<li>\n<p>iftop &amp; netstat</p>\n<p>iftop is like top, but shows you active connections and the transfer rates between your server and whoever is at the other end.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/iftop.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure><p>Netstat is more of a networking swiss army knife. You can see network connections, routing tables, interface statistics, and a variety of other network information.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/netstat.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n<li>\n<p>dstat</p>\n<p>I prefer to use dstat over iostat now since it includes all of its functionality and much of the functionality of other tools as well.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/dstat.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n<li>\n<p>strace</p>\n<p>strace is useful when you want to know what system calls are happening for a given process.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/strace.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n<li>\n<p>pcstat</p>\n<p>This tool, written by <a href=\"https://twitter.com/AlTobey\">Al Tobey</a>, allows you to examine a bunch of files and quickly determine how much of each file is in the buffer cache. If you’re trying to figure out why table access is slow, this tool can tell you if your data is in cache already or if you have to go out to disk. <a href=\"http://www.linuxatemyram.com/\">Here’s a good read</a> to get familiar with buffer cache. 
<a href=\"https://github.com/tobert/pcstat\">Check out the repo</a>.</p>\n<figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/pcstat.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure></li>\n</ol><p>There are a few issues that are easy to run into that I’d consider “gotchas”, things that come up often enough that they’re worth mentioning.</p><p>An important design decision in Cassandra is that it uses last write wins when there are two inserts, updates, or deletes to a cell. To determine the last update, Cassandra uses the system clock (or the client can specify the time explicitly). If server clocks differ, the last write may not actually win; the winner will be whichever write carries the timestamp skewed furthest into the future.</p><p>To address this issue, always make sure your clocks are synced. ntpd will constantly correct for drift, while ntpdate performs a hard adjustment to your system clock. Use ntpdate if your clock is significantly off, then let ntpd keep it at the correct time.</p><figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/ntpdate.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure><h2 id=\"disk-space-not-reclaimed\">Disk space not reclaimed</h2><p>If you add new nodes to a cluster, each replica is responsible for less data. The data is streamed to the new nodes; however, it is not removed from the old nodes. If you’re adding new nodes because you’re running low on disk space, this is extremely important. You are required to run <code>nodetool cleanup</code> in order to reclaim that disk space. This is a good idea any time you change your database topology.</p><h2 id=\"issues-adding-nodes-or-running-repairs\">Issues adding nodes, or running repairs</h2><p>There are two common problems that come up with repair. The first is that repairs take forever in 2.0. 
<a href=\"http://www.datastax.com/dev/blog/more-efficient-repairs\">This is solved in 2.1</a>, which uses incremental repair and does not repair data that has already been repaired. The second issue relates to trying to repair (or add nodes to) a cluster when the versions do not match. It is, in general, not a good idea (yet) to stream data between servers which are of different versions. It will appear to have started, but will just hang around doing nothing.</p><p>Cassandra comes with several tools to help diagnose performance problems. They are available via <code>nodetool</code>, Cassandra’s multipurpose administration tool.</p><h2 id=\"compaction\">Compaction</h2><p>Compaction is the process of merging SSTables together. It reduces the number of seeks required to return a result. It’s a necessary part of Cassandra. If not configured correctly, it can be problematic. You can limit the I/O used by compaction by using <code>nodetool setcompactionthroughput</code>.</p><p>There are two types of compaction available out of the box. Size Tiered is the default and great for write heavy workloads. Leveled compaction is good for read &amp; update heavy workloads, but since it uses much more I/O it’s recommended you use this only if you’re on SSD. I recommend reading through the <a href=\"http://datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_configure_compaction_t.html\">documentation</a> to understand more about which is right for your workload.</p><p><img src=\"http://www.datastax.com/documentation/cassandra/2.0/cassandra/images/dml_compaction.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></p><p>Histograms let you quickly understand at both a high level and table level what your performance looks like on a single node in your cluster. The first histogram, <code>proxyhistograms</code>, gives you a quick top level view of all your tables on a node. This includes network latency. 
Histogram output has changed between versions to be more user friendly. The screenshot below is from Cassandra 2.1.</p><figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/proxyhistograms.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure><p>If you’d like to find out if you’ve got a performance problem isolated to a particular table, I suggest first running <code>nodetool cfstats</code> on a keyspace. You’ll be able to scan the list of tables and see if there are any abnormalities. You’ll be able to quickly tell which tables are queried the most (both reads and writes).</p><figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/cfstats.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure><p><code>nodetool cfhistograms</code> lets you identify performance problems with a single table on a single node. The statistics are more easily read in Cassandra 2.1.</p><figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/cfhistograms.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure><h2 id=\"query-tracing\">Query Tracing</h2><p>If you’ve narrowed down your problem to a particular table, you can start to trace the queries that you execute. If you’re coming from something like MySQL, you’re used to the command <code>explain</code>, which tells you in advance what the query plan is for a given query. Tracing takes a different approach. Instead of showing a query plan, query tracing keeps track of the events in the system when it actually executes. Here’s an example where we’ve created a whole bunch of tombstones on a partition. Even on an SSD you still want to avoid a lot of tombstones - it’s disk, CPU, and memory intensive.</p><figure class=\"full-middle\"><img src=\"http://rustyrazorblade.com/images/tracing.png\" referrerpolicy=\"no-referrer\" alt=\"image\" /></figure><p>The JVM gets a reputation for being a bit of a beast. 
It’s a really impressive feat of engineering, but it shouldn’t be regarded as black magic. I strongly recommend reading through <a href=\"http://blakeeggleston.com/cassandra-tuning-the-jvm-for-read-heavy-workloads.html\">Blake Eggleston’s post on the JVM</a>, it’s well written and does a great job of explaining things. (Much better than I would here).</p><p>OK - we’ve got all these tools under our belt. Now we can start to narrow down the problem.</p><ul><li>\n<p>Are you seeing weird consistency issues, even on consistency level ALL?<br />It’s possible you’re dealing with a clock sync issue. If you’re sending queries really close to one another, they might also be getting the same millisecond level timestamp due to an async race condition in your code. If you’re sending lots of writes at the same time to the same row, you may have a problem in your application. Try to rethink your data model to avoid this.</p>\n</li>\n<li>\n<p>Has query performance dropped? Are you bottlenecked on disk, network, CPU, memory? Use the tools above to figure out your bottleneck. Did the number of queries to your cluster increase? Are you seeing longer than normal garbage collection times? Ops center has historical graphs that are useful here. Is there a single table affected, or every table? Use histograms and cfstats to dig into it.</p>\n</li>\n<li>\n<p>Are nodes going up and down? Use a combination of ops center and your system metrics to figure out which node it is. If it’s the same node, start investigating why. Is there a hot partition? Is it doing a lot of garbage collection? Is your application opening more connections than before? You should have system metrics that show these trends over time. Maybe you just have additional load on the system - it may be necessary to add new nodes. Don’t forget to run cleanup.</p>\n</li>\n</ul><p>This started out as a small recap but has evolved into much more than that. 
The tools above have helped me with a wide variety of problems, not just Cassandra ones. If you follow the above recommendations you should be in a great spot to diagnose most problems that come your way.</p><p>You can find me on <a href=\"https://twitter.com/rustyrazorblade\">Twitter</a> for any comments or suggestions.</p>","id":"eef63404-0bd6-5a8d-96dc-2a4c8b270540","title":"Cassandra Summit Recap: Diagnosing Problems in Production - RustyRazorblade.com","origin_url":"http://rustyrazorblade.com/post/2014/2014-09-18-diagnosing-production/","url":"http://rustyrazorblade.com/post/2014/2014-09-18-diagnosing-production/","wallabag_created_at":"2023-03-01T19:46:02+00:00","published_at":null,"published_by":null,"reading_time":10,"domain_name":"rustyrazorblade.com","preview_picture":"http://rustyrazorblade.com/images/default.png","tags":["monitoring","cassandra"],"description":"Last week at the Cassandra Summit I gave a talk with Blake Eggleston on diagnosing performance problems in production. We spoke to about 300 people for about 25 minutes followed by a healthy Q&A sessi..."},{"content":"<p><strong itemprop=\"name\" class=\"mr-2 flex-self-stretch\"><a data-pjax=\"#repo-content-pjax-container\" data-turbo-frame=\"repo-content-turbo-frame\" href=\"https://github.com/jlacefie/cfstats-csv-parser\">cfstats-csv-parser</a></strong> Public</p><p class=\"f4 mb-3\">Repo for a utility to parse cfstats into a csv file for analysis</p><h3 class=\"sr-only\">License</h3><p><a href=\"https://github.com/jlacefie/cfstats-csv-parser/blob/master/LICENSE\" class=\"Link--muted\" data-analytics-event=\"{&quot;category&quot;:&quot;Repository Overview&quot;,&quot;action&quot;:&quot;click&quot;,&quot;label&quot;:&quot;location:sidebar;file:license&quot;}\"> MIT license</a></p><p><a class=\"Link--secondary no-underline mr-3\" href=\"https://github.com/jlacefie/cfstats-csv-parser/stargazers\"> 1 star</a> <a class=\"Link--secondary no-underline\" href=\"https://github.com/jlacefie/cfstats-csv-parser/network/members\"> 4 forks</a></p>","id":"8e3a1b9c-7cd2-578f-a0b1-877ac6f47cb3","title":"GitHub - jlacefie/cfstats-csv-parser: Repo for a utility to parse cfstats into a csv file for analysis","origin_url":"https://github.com/jlacefie/cfstats-csv-parser","url":"https://github.com/jlacefie/cfstats-csv-parser","wallabag_created_at":"2023-03-01T19:40:38+00:00","published_at":null,"published_by":"['jlacefie']","reading_time":null,"domain_name":"github.com","preview_picture":"https://opengraph.githubassets.com/a44d5ad918f82ef2250d84b5d7d746b2e183f62eb7bd35a736db6977a6ff0bfb/jlacefie/cfstats-csv-parser","tags":["monitoring","cassandra"],"description":"Repo for a utility to parse cfstats into a csv file for analysis"},{"content":"<div class=\"top-section\"><div class=\"container-fluid main-content-area\"><div id=\"scroll-status-bar\"><div id=\"scroll-status-percent\"></div><div class=\"blog-post-hero container-fluid\"><div class=\"blog-post-hero-bg\"><div class=\"blog-post-hero-bg-img\"></div><div class=\"container\"><img src=\"https://sematext.com/wp-content/uploads/2021/10/critical-cassandra-metrics-to-monitor.jpg\" id=\"the-featured-image\" width=\"1140\" height=\"626\" alt=\"image\" /></div></div><div class=\"container-fluid container-single-blog-post\"><div class=\"container\"><article class=\"single-article-blog-post\" id=\"post-53726\"><main><section><div id=\"the-content\"><p>Apache Cassandra is a distributed database known for its high availability, fault tolerance, and near-linear scaling. It was initially developed by Facebook, but it is now a widely used open-source system adopted by the largest tech companies in the world. There are numerous reasons behind its popularity, including no single point of failure, exceptional horizontal scaling, and a data layout well suited to time-series data.</p><p>However, despite these perks, like any other system, Cassandra is prone to performance issues. This makes monitoring imperative. And it all starts with knowing what to measure. In this article, we will explain the <strong>key Cassandra performance metrics</strong> you should monitor to make sure everything is up and running at all times.</p><h2>What Is Cassandra and How Does It Work?</h2><p>Let’s keep it short – Apache Cassandra is a distributed NoSQL database designed to provide fault-tolerant and highly available architecture with performance in mind.</p><p>As a distributed system, Cassandra is built out of nodes. A <strong>node</strong> is a single instance of Apache Cassandra that can operate on its own. Multiple nodes can form a <strong>cluster</strong> – a distributed system holding common data and responding to query requests. 
Cassandra works in a master-less architecture where each node communicates in a <strong>peer to peer </strong>fashion using a protocol known as <strong>Gossip</strong>. The <strong>gossip</strong> protocol is designed so that each node is informed about the state of all other nodes and a single node performs <strong>gossip</strong> communication with up to three other nodes every second.</p><p>The <strong>cluster</strong> can be divided into <strong>data centers</strong> and <strong>racks</strong>, just as real-life data centers are divided. In Cassandra terminology, a <strong>data center</strong> is designed to hold multiple <strong>racks</strong> and a single <strong>rack </strong>holds a complete replica of the data.</p><p><img data-lazyloaded=\"1\" src=\"https://sematext.com/wp-content/uploads/2021/10/cassandra-metrics-3.png\" alt=\"image\" /></p><noscript><img src=\"https://sematext.com/wp-content/uploads/2021/10/cassandra-metrics-3.png\" alt=\"image\" /></noscript><p><em>Cassandra Cluster Logical Overview</em></p><p>When it comes to the data, Cassandra stores it in tables that are organized the same way as in any other database – in rows and columns. A single table is called a column family. The tables themselves are grouped into keyspaces, where a keyspace usually holds logically similar data – for example, from a business perspective. The keyspace is also used for data replication, and the replication itself is configured on a keyspace level.</p><p>Getting back to the tables. Each table defines a primary key that is built of the partition key and the clustering columns. Cassandra uses the partition key to index the data. All data that share a common partition key make a single data partition – a basic unit for data retrieval and storage. The clustering columns are optional.</p><p>Needless to say, Apache Cassandra is a complicated, distributed system and it’s not uncommon for users to encounter operational problems and difficulties. 
Everything breaks eventually, from the low-level bare metal components up to the high-level software. It is not unusual for users to deal with network issues and CPU utilization problems, especially on very large clusters. Cassandra is written in Java and uses both off-heap and heap memory, which means that as the volume of data grows, you may hit issues with the garbage collector. Finally, because of the amount of data that you will process, you may need to deal with disk space and the performance of your I/O subsystem. All of these can be avoided by keeping an eye on the relevant metrics with the help of a good <a href=\"https://sematext.com/integrations/cassandra\">Cassandra monitoring tool</a>.</p><h2>How Is Cassandra Performance Measured?</h2><p>Most complex, distributed systems provide a set of metrics that you should take care of, monitor, and alert on to ensure that your system is healthy and working well. Apache Cassandra is no different. It provides a plethora of performance metrics which we can divide into three categories:</p><ul><li>Dedicated Apache Cassandra metrics that describe how the system and its parts perform.</li><li><a href=\"https://sematext.com/blog/jvm-metrics/\">Java Virtual Machine metrics</a> that tell you about the execution environment on which Apache Cassandra is running.</li><li><a href=\"https://sematext.com/server-monitoring/\">Operating system metrics</a> describing the metrics related to the bare metal servers, virtual machines, or containers, depending on the environment that you are using.</li></ul><h3>Dedicated Cassandra Performance Metrics</h3><p>When monitoring Apache Cassandra clusters, the first place to look is the metrics that the distributed data store exposes via the JMX interface. There are many Cassandra performance metrics exposed in the JMX and having visibility into most of them is a good idea. 
You never know what can be useful when troubleshooting.</p><h4>Nodes</h4><p>One of the most important Cassandra metrics is the number of nodes that are currently available and connected to form a cluster. The ability to store the data and respond to queries is directly related to the availability of nodes.</p><h4>Compaction Metrics</h4><p><a href=\"https://cassandra.apache.org/doc/latest/operating/compaction/index.html\">Compaction</a> is the operation of merging multiple smaller instances of <a href=\"https://cassandra.apache.org/doc/latest/architecture/storage_engine.html#sstables\">SSTable</a> into one bigger SSTable that contains all the data from the smaller tables. Because of that, it can be very expensive and resource-consuming. Having visibility into compaction performance is critical for long-term observability – the <a href=\"https://sematext.com/blog/cassandra-monitoring-tools/\">Cassandra monitoring tool</a> of your choice needs to provide the number of compactions and the number of compacted bytes.</p><p>During compaction, until the process ends, the total disk space used may be double that before the compaction. Because of that, you should consider leaving about 50% of space free to account for compactions and, of course, set up appropriate alerts to inform you when the amount of free disk space is close to a level where compaction could fail.</p><h4>Read and Write Performance Metrics</h4><p>The next set of metrics is dedicated to clients and the read and write side of the operations. You should measure the number of reads happening in a given period, the request latency, and the number of timeouts and failures. Your Cassandra monitoring tool should provide the top-level view and allow for slicing and dicing through the data showing you the aggregated view, per node view, per keyspace view, and per table view. The same goes for write operations.</p><p>You should see the number of write requests happening and write latency. 
Local writes and reads may also be important when troubleshooting.</p><h4>Table Metrics</h4><p>Table metrics are also essential. The ones you should pay close attention to are partition size, tombstone scans, and the number of SSTables per read.</p><h5>Partition Size</h5><p>Partition size is crucial for cluster performance. Cassandra uses it as a unit of data storage, replication, and retrieval, thus directly dictating the performance of your Cassandra tables. The ideal partition size varies but is usually below 100MB and not less than 10 – 20MB.</p><h5>Tombstones</h5><p>Cassandra produces <a href=\"https://cassandra.apache.org/doc/latest/operating/compaction.html#tombstones-and-garbage-collection-gc-grace\">tombstones</a> when you delete the data. They are markers of the deleted data. Data in Cassandra is immutable by design, and because of that, it can only be physically removed from the SSTable during compactions. Because of that, you should keep an eye on how they affect your disk space.</p><h5>SSTables Per Read</h5><p>Similar to tombstones, the number of SSTables per read is related to the immutability of the data in Cassandra. A single table can be built of multiple SSTables, which are written sequentially. A single read operation can result in reading multiple SSTables to retrieve the relevant data. The more SSTables Cassandra needs to read to return the data, the more resources are required to complete the read operation. This is why you should minimize the number whenever possible.</p><h4>Other Metrics</h4><p>As we mentioned earlier, other Apache Cassandra performance metrics can be helpful and you should consider monitoring them.</p><h5>Caches</h5><p>There are two types of caches in Cassandra – the key cache and the row cache. Cassandra uses the key cache to store the location of row keys in memory so that the rows can be accessed without the need to hit the disk. The row cache stores the rows themselves in memory. 
By using the caches, Cassandra reduces the need to read the data from the disk and trades memory usage for performance.</p><p>You need to monitor the key cache requests and row cache requests, which tell how many requests to a given cache type were made, and the key cache hit ratio and the row cache hit ratio, which show the percentage of results retrieved from the cache instead of the disk.</p><h5>Threadpool</h5><p>Cassandra is designed to handle high load, withstand backpressure, and perform asynchronous tasks. Monitoring various thread pools is crucial for understanding Cassandra’s performance and bottlenecks. Each thread pool exposes the number of active, pending, and blocked tasks. Accumulating pending and blocked tasks usually point to performance issues and the need for more processing power or a different data and query architecture.</p><h5>Bloom Filter</h5><p>In the read path, Cassandra merges the data stored on disk inside the SSTables with the data stored in memory. To minimize the amount of checking for data existence in the SSTables on disk, Cassandra uses a data structure called a bloom filter.</p><p>The bloom filter is a probabilistic data structure that can tell Cassandra that the data is definitely not in a given file or that the data may be present in a given file. The key metrics to monitor here are the amount of space used by bloom filters, the number of false positives, and the false positive ratio. You can reduce the number of false positives by assigning more memory to the bloom filters.</p><h3>Java Virtual Machine Metrics</h3><p><a href=\"https://cassandra.apache.org/\">Apache Cassandra</a> is a JVM-based application that comes with all the usual JVM pros and cons. From the developer’s perspective, memory management is easier and requires less hassle – you just use an object and forget about it, letting the JVM do the cleaning up. But that means that something has to clean up all the unused objects in memory. 
This is where <a href=\"https://sematext.com/blog/java-garbage-collection/\">Java Garbage Collection</a> comes in, along with the metrics that come with it.</p><p>A proper <a href=\"https://sematext.com/integrations/cassandra-monitoring/\">Cassandra monitoring tool</a> should provide metrics that allow you to check and troubleshoot issues with the Java Virtual Machine, such as JVM memory utilization and garbage collection count and time. You can read more about them in our guide about <a href=\"https://sematext.com/blog/jvm-metrics/\">JVM metrics</a>.</p><h3>Operating System Metrics</h3><p>You can’t ignore Operating System metrics either. Information such as CPU utilization, memory usage, and disk usage is essential and can play a major role when it comes to Cassandra performance.</p><h4>CPU Utilization</h4><p>Your CPU is used for data processing and query handling. The more spare CPU cycles you have on a given node, the more data and queries it can process. The <strong>user</strong> part of the CPU usage will show you your Cassandra process needs, while the <strong>wait</strong> can point to a bottleneck in I/O or network. As with every Java application, CPU cycles are also needed for garbage collection, so keep that in mind when planning.</p><h4>Memory Usage</h4><p>Memory usage is crucial for every JVM-based application. The newest version of Cassandra leverages both off-heap and heap memory. This means that you not only need to set the heap size of your Cassandra nodes correctly but also have enough off-heap memory for keeping your cluster performance at its best.</p><h4>Disk Usage</h4><p>Disk and I/O are crucial – Cassandra keeps its data on the disk, and each query may require a substantial number of I/O operations to return the results. You need to be sure that your hardware can handle your data retrieval needs. 
You also need to be sure that you have enough space to hold your data and handle the compaction process.</p><h2>Monitor Cassandra Performance with Sematext</h2><p><img data-lazyloaded=\"1\" data-placeholder-resp=\"1999x1017\" src=\"https://sematext.com/wp-content/uploads/2021/10/cassandra-metrics-1.png\" class=\"alignnone\" alt=\"monitoring cassandra performance metrics with sematext\" width=\"1999\" height=\"1017\" /></p><noscript><img class=\"alignnone\" src=\"https://sematext.com/wp-content/uploads/2021/10/cassandra-metrics-1.png\" alt=\"monitoring cassandra performance metrics with sematext\" width=\"1999\" height=\"1017\" /></noscript><p><a href=\"https://sematext.com/cloud/\">Sematext Cloud</a> and its <a href=\"https://sematext.com/integrations/cassandra\">Apache Cassandra monitoring</a> integration provide all that you need to monitor your distributed database. Everything is within a single view available without distractions:</p><ul><li>The overview report gives you a perfect starting point for your metrics, painting a picture of the whole cluster.</li><li>The dedicated Cassandra report provides an in-depth view of all relevant metrics related to the distributed database.</li><li>The OS report provides necessary operating system metrics such as CPU and memory utilization and visibility into your network traffic.</li><li>Finally, the JVM metrics give the full view of the Java Virtual Machine, such as metrics related to garbage collection and per-heap space memory utilization.</li></ul><p>Using the dedicated split-view, you can correlate all the available metrics with other metrics, <a href=\"https://sematext.com/logsene/\">logs</a>, and <a href=\"https://sematext.com/experience/\">real user monitoring</a> data, making Sematext a perfect visibility tool.</p><p>Sematext allows you to set up alerts on any metric or log event and supports both threshold-based and anomaly-based alerts for full flexibility. You don’t have to watch your metrics over and over. 
Once you configure your alerts, you can sleep well, and Sematext will let you know if something is wrong.</p><p>If you want to see how Sematext stacks up against similar solutions, read our article about the best <a href=\"https://sematext.com/blog/cassandra-monitoring-tools/\">Cassandra monitoring tools</a> available today.</p><h2>Get Started with Cassandra Monitoring</h2><p>As a distributed database, Apache Cassandra can quickly become an operational challenge without visibility into what is happening from a global perspective as well as on a node level. You need to have full visibility from top to bottom, but that is not enough. You need to be sure that your monitoring system can notify you when an issue happens and also predict issues before your customers notice them.</p><p>One of the tools that will give you all of that is Sematext’s <a href=\"https://sematext.com/integrations/\">Apache Cassandra monitoring</a> integration. Start monitoring your Cassandra cluster by creating a Sematext Cloud account and then a Cassandra monitoring App. 
Don’t forget to create a Logs App as well to ship your Cassandra logs for a full observability experience.</p></div></section></main></article></div></div></div></div></div></div></div>","id":"d16c68f8-5520-5e6b-a126-62cd8cda87a2","title":"How Do You Monitor Cassandra Performance: Key Metrics to Measure","origin_url":"https://sematext.com/blog/cassandra-monitoring/","url":"https://sematext.com/blog/cassandra-monitoring/","wallabag_created_at":"2021-11-08T17:30:27+00:00","published_at":"2021-10-04T10:55:25+00:00","published_by":"['Rafal Kuć']","reading_time":11,"domain_name":"sematext.com","preview_picture":"https://sematext.com/wp-content/uploads/2021/10/critical-cassandra-metrics-to-monitor.jpg","tags":["monitoring","cassandra","performance"],"description":"Apache Cassandra is a distributed database known for its high availability, fault tolerance, and near-linear scaling. 
It was initially developed by Facebook, but it is a widely used open-source system..."},{"content":"<p><em>To learn more about the DataStax open-source project, <a href=\"https://github.com/datastax/metric-collector-for-apache-cassandra\">Metric Collector for Apache Cassandra</a> and to try a demo, visit us on <a href=\"https://github.com/datastax/metric-collector-for-apache-cassandra\">GitHub</a>.</em></p><p>Apache Cassandra is a resilient system for users to build applications on, but many operators see Cassandra as a bit of a black box. It’s not that Cassandra doesn’t have <a href=\"https://cassandra.apache.org/doc/latest/operating/metrics.html\">hundreds of metrics to consume</a>, it does (over 300 metric series per table!). The fact is visualizing and getting a unified view of the cluster combined with OS-level metrics and application metrics is not an easy thing for Cassandra users to set up.  </p><h3>What is the Metrics Collector for Apache Cassandra?</h3><p>To help solve this problem, DataStax released a new open source project called the <a href=\"https://github.com/datastax/metric-collector-for-apache-cassandra\">Metric Collector for Apache Cassandra</a> (MCAC for short).  This project provides a drop-in solution to solve this monitoring gap for Apache Cassandra. Here’s how it works.</p><p>MCAC is built on the widely used <a href=\"https://collectd.org/\">collectd</a> agent but with a novel twist. Collectd is a metric collection agent that is well adopted and integrates well with all kinds of external metrics systems like, <a href=\"https://collectd.org/wiki/index.php/Plugin:Write_Prometheus\">prometheus</a>, <a href=\"https://collectd.org/wiki/index.php/Plugin:Write_Graphite\">graphite</a>, <a href=\"https://collectd.org/wiki/index.php/Plugin:Write_Stackdriver\">stackdriver</a>, and <a href=\"https://collectd.org/wiki/index.php/Plugin:Write_HTTP\">others</a>. 
While collectd can scrape JMX metrics out of the box, <a href=\"https://github.com/prometheus/jmx_exporter/issues/246#issuecomment-367573931\">JMX scraping can be quite slow</a> and works best with only a subset of metrics. Not to mention many people don’t want to maintain and configure the metric agent on every node.</p><p>We use MCAC to power the health tab in <a href=\"https://astra.datastax.com/register\">Astra</a>, and it is bundled with our <a href=\"https://github.com/datastax/cass-operator\">Kubernetes operator for Apache Cassandra</a>.</p><h3>Why MCAC is different</h3><p>To solve this problem, MCAC comes as a single bundle with our java agent and a linux portable collectd build all in one. Just add the agent to cassandra-env.sh; it brings up collectd and ships every metric in Cassandra to collectd via a unix-socket. It works on all Apache Cassandra versions from 2.2 through 4.0.</p><p>By shipping the metrics this way, it can efficiently export hundreds of thousands of series per node with little/no impact on C* performance.</p><p>Not only does it send the metrics, but it is specially designed to work well with prometheus out of the box: <a href=\"https://www.robustperception.io/how-does-a-prometheus-histogram-work\">histograms are tailored for aggregation</a> by prometheus and labels are automatically converted on ingest. This means you can slice and dice metrics across DCs, racks, down to even tables.</p><p>The Cassandra metrics are one aspect of the equation but with collectd we can also gather and expose all the OS level metrics, like context switches and disk/network performance.</p><p>MCAC also creates a historical log on the nodes of metric and non-metric diagnostic events related to activity on the node. Non-metric events include details on Flushes, Compactions, Exceptions, GC, etc. This DataLog can be used to help analyze performance or other issues impacting the cluster. 
If you need help our SRE team is available to help you diagnose problems with this log <a href=\"https://www.datastax.com/keepcalm\">https://www.datastax.com/keepcalm</a> and if you have any questions we're here to help at <a href=\"https://community.datastax.com/\">https://community.datastax.com/</a>.</p><p>Finally, what good are all these metrics without a way to visualize them! To tie it all together, MCAC comes with pre-built grafana dashboards which give operators the best Cassandra monitoring solution out there. These dashboards will change over time to focus on specific aspects of the system to make it easier to drill into the cluster.</p><p><img alt=\"Grafana\" data-entity-type=\"file\" data-entity-uuid=\"3352a498-c70f-43a8-adb4-cf1086877b2b\" src=\"https://www.datastax.com/sites/default/files/inline-images/MCAC1_0.png\" /></p><p><img alt=\"mcac\" data-entity-type=\"file\" data-entity-uuid=\"7a740a4f-874c-4520-b95d-a549adb48be6\" src=\"https://www.datastax.com/sites/default/files/inline-images/MCAC3.png\" /></p><p><img alt=\"mcac2\" data-entity-type=\"file\" data-entity-uuid=\"f6e23d85-a306-4141-a3b1-35b2c92ba846\" src=\"https://www.datastax.com/sites/default/files/inline-images/MCAC2_0.png\" /></p>","id":"44ef75af-85c2-58c6-ac00-d9e7a9bd9296","title":"Monitoring Apache Cassandra™ Made Simple","origin_url":"https://www.datastax.com/blog/monitoring-apache-cassandratm-made-simple","url":"https://www.datastax.com/blog/monitoring-apache-cassandratm-made-simple","wallabag_created_at":"2021-02-12T16:13:22+00:00","published_at":null,"published_by":"['']","reading_time":2,"domain_name":"www.datastax.com","preview_picture":"https://www.ibm.com/content/dam/worldwide-content/stock-assets/adb-stk/ul/g/30/19/adobestock_1831482701.jpeg/_jcr_content/renditions/cq5dam.web.1280.1280.jpeg","tags":["prometheus","monitoring","cassandra","grafana"],"description":"To learn more about the DataStax open-source project, Metric Collector for Apache Cassandra and to try a demo, visit 
us on GitHub.Apache Cassandra is a resilient system for users to build applications..."}]},{"tag":"cassandra","articles":[{"content":"<p>This is the second post in my series on improving node density and lowering costs with Apache Cassandra. In the <a href=\"https://rustyrazorblade.com/post/2025/03-streaming/\">previous post</a>, I examined how streaming performance impacts node density and operational costs. In this post, I’ll focus on compaction throughput, and a recent optimization in Cassandra 5.0.4 that significantly improves it, <a href=\"https://issues.apache.org/jira/browse/CASSANDRA-15452\" target=\"_blank\">CASSANDRA-15452</a>.</p><p>This post assumes some familiarity with Apache Cassandra storage engine fundamentals. The documentation has a nice <a href=\"https://cassandra.apache.org/doc/stable/cassandra/architecture/storage_engine.html\" target=\"_blank\">section covering the storage engine</a> if you’d like to brush up before reading this post.</p><h2 id=\"the-compaction-bottleneck\">The Compaction Bottleneck</h2><p>Compaction in Cassandra is the process of merging multiple SSTables and writing out new ones, discarding tombstones, resolving overwrites, and generally organizing data for efficient reads. It’s an I/O intensive background operation that directly competes with foreground operations for system resources. In a later post I’ll look at how compaction strategies impact node density, but for now, I’ll just focus on throughput.</p><h2 id=\"why-compaction-throughput-matters-for-node-density\">Why Compaction Throughput Matters for Node Density</h2><p>As we continue to increase the amount of data we store per node, compaction performance becomes increasingly important. 
It affects:</p><ul><li>How quickly the system can reclaim disk space</li>\n<li>Whether the cluster can keep up with incoming writes</li>\n<li>Read latency, by keeping the number of SSTables per read low</li>\n<li>How fast nodes are able to join a new cluster</li>\n</ul><p>Simply put: as your data volume and write throughput increase, compaction throughput must as well. If it doesn’t, you’ll hit a performance wall that effectively caps your maximum practical node density.</p><p>Despite the significant improvements to compaction throughput over the years, there are some circumstances where compaction performance is inadequate. Let’s take a look at the reason why, then dive into what can be done about it.</p><p>When doing any performance evaluation, it’s important to understand how to measure where your time is spent. A lot of folks make incorrect assumptions, and then waste a lot of time trying to optimize something that doesn’t matter. I’ve written several posts about how useful profiling with the <a href=\"https://rustyrazorblade.com/post/2023/2023-11-07-async-profiler\">async-profiler</a> can be from an application perspective. For looking at the OS and hardware, the eBPF-based toolkit <a href=\"https://rustyrazorblade.com/post/2023-11-14-bcc-tools\">bcc-tools</a> can help you identify process bottlenecks. I’ve used these tools extensively over the years, and in this post I’ll show how they’ve helped identify two major performance bottlenecks in compaction. My <a href=\"https://github.com/rustyrazorblade/easy-cass-lab\" target=\"_blank\">easy-cass-lab</a> software includes all these tools, as well as integration with <a href=\"https://axonops.com/\" target=\"_blank\">AxonOps</a> for Cassandra dashboards and operational tooling.</p><h2 id=\"being-10x-smarter-with-our-disk-access\">Being 10x Smarter With Our Disk Access</h2><p>When investigating compaction behavior, I discovered a major inefficiency in how Cassandra was accessing disk. 
The problem was especially severe in cloud environments with disaggregated storage like AWS EBS, where IOPS (Input/Output Operations Per Second) are both limited and expensive when used improperly.</p><p>When Cassandra would read in data during compaction, it would read individual compressed chunks off disk, one small read at a time. Using bcc-tools, we can monitor every filesystem operation. Here I’m using <code>xfsslower</code> to record every read operation on the filesystem (original headers back in for clarity):</p><div class=\"highlight\"><pre class=\"language-shell\" data-lang=\"shell\">$ sudo /usr/share/bcc/tools/xfsslower 0 -p 26988 | awk '$4 == \"R\" { print $0 }'\nTracing XFS operations\nTIME     COMM           PID    T BYTES   OFF_KB   LAT(ms) FILENAME\n22:27:38 CompactionExec 26988  R 4096    0           0.01 nb-7-big-Statistics.db\n22:27:38 CompactionExec 26988  R 4096    4           0.00 nb-7-big-Statistics.db\n22:27:38 CompactionExec 26988  R 2062    8           0.00 nb-7-big-Statistics.db\n22:27:38 CompactionExec 26988  R 14907   0           0.01 nb-7-big-Data.db\n22:27:38 CompactionExec 26988  R 14924   14          0.01 nb-7-big-Data.db\n22:27:38 CompactionExec 26988  R 14896   29          0.01 nb-7-big-Data.db\n22:27:38 CompactionExec 26988  R 14844   43          0.01 nb-7-big-Data.db\n22:27:38 CompactionExec 26988  R 14923   58          0.01 nb-7-big-Data.db\n22:27:38 CompactionExec 26988  R 14931   72          0.01 nb-7-big-Data.db\n22:27:38 CompactionExec 26988  R 14905   87          0.01 nb-7-big-Data.db\n22:27:38 CompactionExec 26988  R 14891   101         0.01 nb-7-big-Data.db\n22:27:38 CompactionExec 26988  R 14919   116         0.01 nb-7-big-Data.db\n22:27:38 CompactionExec 26988  R 14965   130         0.01 nb-7-big-Data.db\n22:27:38 CompactionExec 26988  R 14918   145         0.01 nb-7-big-Data.db\n22:27:38 CompactionExec 26988  R 14930   160         0.01 nb-7-big-Data.db\n</pre></div><p>The above is showing we’re reading about 14KB at a 
time. That’s the size of the compressed page. This pattern is terrible for performance on cloud storage systems like EBS, where:</p><ol><li>Each read operation, no matter how small, counts against your provisioned IOPS</li>\n<li>Small reads waste IOPS quota while delivering minimal data</li>\n<li>You pay for IOPS allocation whether you use it efficiently or not</li>\n</ol><p>Looking at a wall clock performance profile, we can see compaction is spending a LOT of time waiting on disk, in the really wide column with the <code>pread</code> call at the top:</p><p><img src=\"https://rustyrazorblade.com/images/2025/wall-clock-profile-compaction.png\" alt=\"wall-clock-profile-compaction.png\" referrerpolicy=\"no-referrer\" /></p><p>Readahead is a disk optimization strategy where the operating system reads a larger block of data than was requested into memory. The objective is to reduce latency and improve performance for sequential read operations. Unfortunately, when you don’t need the data it’s reading, it can be the source of major performance problems. In my experience, readahead is one of the worst culprits in the world of Cassandra performance. It’s especially terrible for lightweight transactions and counters, where we perform a read before write.</p><p>My advice to Cassandra operators is to reduce readahead to 4KB to avoid unnecessary read amplification on the read path.</p><p>Readahead does have one place, however, where it can benefit performance. You <em>may</em> have already guessed that it’s compaction. Let’s take a step back and look at how the size of our reads impacts our throughput in a simple benchmark. 
Larger reads, initiated either from read ahead or the user, should deliver improved throughput, especially when we’re dealing with a quota on our IOPS (EBS), our drives have higher latency (SAN), or both.</p><h2 id=\"benchmarking\">Benchmarking</h2><p>I ran benchmark tests with sequential <code>fio</code> workloads using different request sizes on a 3K IOPS GP3 EBS volume. Here’s the configuration used:</p><div class=\"highlight\"><pre class=\"language-text\" data-lang=\"text\">[global]\nrw=read\ndirectory=data\ndirect=1\ntime_based=1\nfile_service_type=normal\nstonewall\nsize=100M\nnumjobs=12\ngroup_reporting\n[bs4]\nstonewall\nruntime=60s\nblocksize=4k\n[bs8]\nstonewall\nruntime=60s\nblocksize=8k\n[bs16]\nstonewall\nruntime=60s\nblocksize=16k\n[bs32]\nstonewall\nruntime=60s\nblocksize=32k\n[bs64]\nstonewall\nruntime=60s\nblocksize=64k\n[bs128]\nstonewall\nblocksize=128k\nruntime=60s\n[bs256]\nstonewall\nruntime=60s\nblocksize=256k\n</pre></div><p>When reviewing the results, the benefits of using larger request sizes were evident:</p><table><thead><tr><th>Request Size</th>\n<th>IOPS</th>\n<th>Throughput</th>\n</tr></thead><tbody><tr><td>4K</td>\n<td>3049</td>\n<td>11.9 MB/s</td>\n</tr><tr><td>8K</td>\n<td>3012</td>\n<td>23 MB/s</td>\n</tr><tr><td>16K</td>\n<td>3013</td>\n<td>47 MB/s</td>\n</tr><tr><td>32K</td>\n<td>3013</td>\n<td>94 MB/s</td>\n</tr><tr><td>64K</td>\n<td>1938</td>\n<td>121 MB/s</td>\n</tr><tr><td>128K</td>\n<td>957</td>\n<td>120 MB/s</td>\n</tr><tr><td>256K</td>\n<td>478</td>\n<td>120 MB/s</td>\n</tr></tbody></table><p>The data shows that using 256KB reads instead of 16KB reads would deliver almost 3x the throughput while using only 1/6th of the provisioned IOPS. That’s a massive efficiency improvement. Rather than chewing through all our IOPS to deliver a paltry 47MB/s of throughput, we’re only using about 500 for 120MB/s. 
That means if we can see these gains in the database, we’ll be able to compact faster, put more data on each node, and lower our total cost.</p><h2 id=\"the-solution-internally-buffering-sequential-reads\">The Solution: Internally Buffering Sequential Reads</h2><p>In <a href=\"https://issues.apache.org/jira/browse/CASSANDRA-15452\" target=\"_blank\">CASSANDRA-15452</a>, I worked with my fellow Cassandra committer Jordan West to implement a solution: an efficient, internal read-ahead buffer for bulk reading operations. Here’s how it works:</p><ol><li>Instead of reading tiny chunks, we use a 256KB off-heap buffer</li>\n<li>Each read operation pulls in a full 256KB of data at once</li>\n<li>Compressed chunks are extracted from this buffer as needed</li>\n<li>The buffer is refilled only when necessary</li>\n</ol><p>This approach maximizes IOPS efficiency by using larger reads during compaction (as well as repair and range reads) that deliver more data per operation. For cloud environments, it’s a game-changer that directly aligns with storage provider recommendations. AWS EBS, for instance, <a href=\"https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-io-characteristics.html#ebs-io-iops\" target=\"_blank\">considers any I/O operation up to 256KB as a single operation</a>, so by using the largest possible size we should get optimal performance.</p><h2 id=\"real-world-impact-a-major-improvement-in-compaction-throughput\">Real-World Impact: A Major Improvement in Compaction Throughput</h2><p>When Jordan and I tested the implementation using <a href=\"https://github.com/rustyrazorblade/easy-cass-lab\" target=\"_blank\">easy-cass-lab</a> on EBS, the results were nothing short of spectacular. The <code>10.0.2.171</code> node is running our patched version, the other two nodes are running an unpatched release. 
The graphs clearly show a 2-3x improvement to throughput and a 3x reduction in IOPS.</p><p><img src=\"https://rustyrazorblade.com/images/2025/15452-bytes-read.png\" alt=\"15452-bytes-read.png\" referrerpolicy=\"no-referrer\" /></p><p><img src=\"https://rustyrazorblade.com/images/2025/15452-compaction.png\" alt=\"Compaction Throughput Comparison\" referrerpolicy=\"no-referrer\" /></p><p>You can see the results in the flamegraph as well. The calls to <code>pread</code> take up significantly less time.</p><p><img src=\"https://rustyrazorblade.com/images/2025/wall-clock-profile-compaction-after-15452.png\" alt=\"wall-clock-profile-compaction-after-15452.png\" referrerpolicy=\"no-referrer\" /></p><p>We can use <code>xfsslower</code> from <code>bcc-tools</code> again to watch the filesystem access:</p><div class=\"highlight\"><pre class=\"language-shell\" data-lang=\"shell\">$ sudo /usr/share/bcc/tools/xfsslower 0 -p $(cassandra-pid) | awk '$4 == \"R\" { print $0 }'\nTIME     COMM           PID    T BYTES   OFF_KB   LAT(ms) FILENAME\n14:40:29 CompactionExec 1782   R 262144  256         0.07 nb-4-big-Data.db\n14:40:29 CompactionExec 1782   R 262144  512         0.06 nb-4-big-Data.db\n14:40:29 CompactionExec 1782   R 262144  768         0.06 nb-4-big-Data.db\n14:40:29 CompactionExec 1782   R 262144  1024        0.07 nb-4-big-Data.db\n14:40:29 CompactionExec 1782   R 262144  1280        0.07 nb-4-big-Data.db\n14:40:29 CompactionExec 1782   R 262144  1536        0.07 nb-4-big-Data.db\n14:40:29 CompactionExec 1782   R 241123  1792        0.07 nb-4-big-Data.db\n</pre></div><p>This is a lot better, now we’re fetching 256KB at a time using way fewer requests.</p><p>The EBS test configuration used a GP3 volume with 3K IOPS and 256MB throughput. With the existing code, compaction was bottlenecked by IOPS, peaking at exactly 3K IOPS but achieving only about 51MB/s throughput. 
With our optimization, the same operation used only ~500 IOPS to achieve around 106MB/s, a more than 2x improvement in throughput with 1/3 of the IOPS.</p><p>In our most aggressive testing, <strong>we actually hit the EBS throughput limit rather than the IOPS limit</strong>. That’s a significant transformation in Cassandra’s resource utilization profile.</p><p>The patch also has the benefit of applying to anti-compaction, repair, and range reads. We can see a significant reduction in range reads, aka table scans:</p><p><img src=\"https://rustyrazorblade.com/images/2025/15452-range-reads.png\" alt=\"15452-range-reads.png\" referrerpolicy=\"no-referrer\" /></p><p>If you’re running Spark jobs using the Cassandra connector, you should see an improvement in performance, and your repair times should decrease.</p><h2 id=\"whats-next--can-we-do-more\">What’s next? Can we do more?</h2><p>Yes, absolutely! There are several more IO improvements that will help. I’ll cover them here very quickly, and if there’s interest I’ll write about them in detail in a future post.</p><h3 id=\"avoid-reading-the-statistics\">Avoid Reading the Statistics</h3><p>When compacting, we read data out of the Statistics.db file before reading the data itself. This is completely unnecessary, as it’s stats about the data we’re about to read. Skipping this can reduce IO even further. 
Looking at a compaction’s IO activity, I see about 30% of the filesystem access is reading from <code>Statistics.db</code>:</p><div class=\"highlight\"><pre class=\"language-text\" data-lang=\"text\">14:40:29 CompactionExec 1782   R 4096    0           0.00 nb-3-big-Statistics.db\n14:40:29 CompactionExec 1782   R 701     4           0.00 nb-3-big-Statistics.db\n14:40:29 CompactionExec 1782   R 4096    0           0.00 nb-4-big-Statistics.db\n14:40:29 CompactionExec 1782   R 4096    4           0.00 nb-4-big-Statistics.db\n14:40:29 CompactionExec 1782   R 1962    8           0.00 nb-4-big-Statistics.db\n14:40:29 CompactionExec 1782   R 262144  0           0.07 nb-4-big-Data.db\n14:40:29 CompactionExec 1782   R 2115    0           0.01 nb-3-big-Data.db\n14:40:29 CompactionExec 1782   R 262144  256         0.07 nb-4-big-Data.db\n14:40:29 CompactionExec 1782   R 262144  512         0.06 nb-4-big-Data.db\n14:40:29 CompactionExec 1782   R 262144  768         0.06 nb-4-big-Data.db\n14:40:29 CompactionExec 1782   R 262144  1024        0.07 nb-4-big-Data.db\n14:40:29 CompactionExec 1782   R 262144  1280        0.07 nb-4-big-Data.db\n14:40:29 CompactionExec 1782   R 262144  1536        0.07 nb-4-big-Data.db\n14:40:29 CompactionExec 1782   R 241123  1792        0.07 nb-4-big-Data.db\n</pre></div><p>This has already been fixed in <code>trunk</code> by Branimir Lambov in <a href=\"https://issues.apache.org/jira/browse/CASSANDRA-20092\" target=\"_blank\">CASSANDRA-20092</a> and is being backported to 5.0 by Jordan.</p><h3 id=\"direct-io-for-compaction\">Direct I/O for Compaction</h3><p>Let’s talk more about page cache. Since we go through the Linux page cache when doing reads, we want to make sure it’s working optimally. Page cache lets us avoid going to disk! Unfortunately we also use it when reading for compaction. This is a problem because we’re pulling data into the page cache that we plan on deleting. To make room for the new data, other data will be evicted. 
If we compact 10GB of data, we’re pushing out a lot of valuable data from the page cache, meaning it needs to be fetched back into memory later on. Using <a href=\"https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/5/html/global_file_system/s1-manage-direct-io\" target=\"_blank\">Direct I/O</a> we can bypass the page cache entirely, which will prevent data from being evicted. This can be a huge help in latency-sensitive systems or systems where IOPS are limited, like EBS.</p><p>I’ve filed <a href=\"https://issues.apache.org/jira/browse/CASSANDRA-19987\" target=\"_blank\">CASSANDRA-19987</a> to look at this.</p><h3 id=\"non-blocking-compression\">Non-Blocking Compression</h3><p>Next, compression. When we’re writing to disk, we fill a buffer, sized by the <code>chunk_length_in_kb</code> table setting, compress it, and write it to disk. The compression here is a blocking call, which means we can spend a lot of time waiting on compression to finish, when we could be reading and merging the next chunk in parallel. This can show up as a performance bottleneck, so I’ve filed <a href=\"https://issues.apache.org/jira/browse/CASSANDRA-20085\" target=\"_blank\">CASSANDRA-20085</a> to look into it.</p><h3 id=\"better-memory-management\">Better Memory Management</h3><p>When a system is not bottlenecked on disk I/O, such as when using NVMe, the main issue we’ll run into is our heap allocation rate. I’ll go into details in a future post, but for now, it’s enough to know that the more memory we allocate, the worse our performance. Being smart about memory allocations can make a big difference in overall time spent, as allocations aren’t free. It also reduces both the frequency and duration of garbage collection. Big wins all around.</p><p>I recently profiled an instance where the row size was about 2KB (not out of the ordinary) and found that a single call was accounting for roughly 50% of memory allocated. 
Fixing this <em>one</em> thing has the potential to deliver a massive performance improvement, especially in workloads where we have either lots of fields, or large fields like serialized blobs.</p><p>Reaching again for async-profiler, this time we run it with <code>-e alloc</code> to track allocations and <code>--reverse</code> to reverse the stacks. I do this because the same underlying call comes from the read path and compaction, and I want to see the time in aggregate.</p><p><img src=\"https://rustyrazorblade.com/images/2025/allocation-profile-compaction.png\" alt=\"allocation-profile-compaction.png\" referrerpolicy=\"no-referrer\" /></p><p>Addressing this single allocation won’t just deliver faster compaction, but will reduce pressure on the heap, which in turn reduces GC overhead. As part of this series I’ll also be covering GC, as a lot’s changed since I wrote about it last.</p><p>I’ve filed <a href=\"https://issues.apache.org/jira/browse/CASSANDRA-20428\" target=\"_blank\">CASSANDRA-20428</a> and there’s already a fair bit of discussion about different approaches to solving the problem.</p><h2 id=\"conclusion\">Conclusion</h2><p>Maximizing compaction throughput is critical for achieving higher node density with Apache Cassandra. 
The improvements in <a href=\"https://issues.apache.org/jira/browse/CASSANDRA-15452\" target=\"_blank\">CASSANDRA-15452</a> have removed one of the primary bottlenecks that previously limited practical node size in a lot of clusters.</p><p>By upgrading to Cassandra 5.0.4 (or later) you can:</p><ol><li>Dramatically improve compaction throughput</li>\n<li>Reduce IOPS consumption significantly</li>\n<li>Improve overall system stability during write-heavy workloads</li>\n<li>Increase the maximum practical data density per node</li>\n<li>Significantly reduce your cloud storage costs</li>\n</ol><p>This improvement, combined with the streaming optimizations discussed in the <a href=\"https://rustyrazorblade.com/post/2025/03-streaming/\">previous post</a>, creates a multiplier effect on your ability to increase node density. Each optimization removes a bottleneck, allowing you to push your hardware further and achieve more with less.</p><p>In my next post, I’ll be discussing how and why compaction strategies affect node density. Picking the right strategy can have a significant impact on your cluster’s performance and cost efficiency. Make sure you sign up for my <a href=\"https://rustyrazorblade.com/mailing-list/\">mailing list</a> if you’re interested in getting notified when it’s released!</p>If you found this post helpful, please consider sharing to your network. I'm also available to help you be successful with your distributed systems! 
Please <a href=\"mailto:info@rustyrazorblade.com?subject=Consulting%20Services%20Inquiry\">reach out</a> if you're interested in working with me, and I'll be happy to schedule a free one-hour consultation.","id":"53b52a64-a2c0-5b1e-83ff-89bcfe2da9b9","title":"Cassandra Compaction Throughput Performance Explained","origin_url":"https://rustyrazorblade.com/post/2025/04-compaction-throughput/","url":"https://rustyrazorblade.com/post/2025/04-compaction-throughput/","wallabag_created_at":"2025-04-24T12:03:02+00:00","published_at":"2025-04-16T00:00:00+00:00","published_by":"['']","reading_time":14,"domain_name":"rustyrazorblade.com","preview_picture":"https://rustyrazorblade.com/images/2025/wall-clock-profile-compaction.png","tags":["cassandra","performance"],"description":"This is the second post in my series on improving node density and lowering costs with Apache Cassandra. In the previous post, I examined how streaming performance impacts node density and operational..."},{"content":"<p dir=\"auto\">Welcome to the Awesome Accord repository! This guide provides resources and examples for implementing ACID transactions in Apache Cassandra. Learn how to leverage distributed transactions for building reliable applications.</p><ul dir=\"auto\"><li><strong>Quick Start with Docker</strong>: Single-node deployment for immediate testing</li>\n<li><strong>Lab Environment</strong>: Multi-node cluster setup for development</li>\n<li><strong>Use Cases &amp; Examples</strong>: Production-ready implementations</li>\n<li><strong>Learning Resources</strong>: Documentation and best practices</li>\n</ul><p dir=\"auto\">Accord is in active development and still a feature branch in the Apache Cassandra® Repo. You will find bugs. 
All we ask is that you contribute a bug report.</p><p dir=\"auto\">You can use the <a href=\"https://github.com/pmcfadin/awesome-accord/discussions\">GitHub discussions</a> bug report forum for this or use the Planet Cassandra Discord channel for Accord listed below. A bug report should have the following:</p><ul dir=\"auto\"><li>The data model used</li>\n<li>Actions to reproduce the bug</li>\n<li>Full stack trace from system.log</li>\n</ul><p dir=\"auto\">If you have suggestions about syntax or improving the overall developer experience, we want to hear about that too! Add it as a suggestion or feature request using <a href=\"https://github.com/pmcfadin/awesome-accord/discussions\">GitHub discussions</a> or let us know in the Planet Cassandra Discord.</p><p dir=\"auto\">Now, on to the fun!</p><div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"docker pull pmcfadin/cassandra-accord docker run -d --name cassandra-accord -p 9042:9042 pmcfadin/cassandra-accord\"><pre>docker pull pmcfadin/cassandra-accord\ndocker run -d --name cassandra-accord -p 9042:9042 pmcfadin/cassandra-accord</pre></div><div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"brew tap rustyrazorblade/rustyrazorblade brew install easy-cass-lab\"><pre>brew tap rustyrazorblade/rustyrazorblade\nbrew install easy-cass-lab</pre></div><ul dir=\"auto\"><li><strong>Banking Transactions</strong>: Account transfers with ACID guarantees</li>\n<li><strong>Inventory Management</strong>: Race-free inventory tracking</li>\n<li><strong>User Management</strong>: Multi-table atomic operations</li>\n</ul><ul dir=\"auto\"><li>Provide feedback and bug reports in the <a href=\"https://github.com/pmcfadin/awesome-accord/discussions\">repository forum</a></li>\n<li><a href=\"https://discord.gg/GrRCajJqmQ\" rel=\"nofollow\">Join our 
Discord Community</a> for discussions and support</li>\n<li>Review our <a href=\"https://github.com/pmcfadin/awesome-accord/blob/main/CONTRIBUTING.md\">Contributor Guide</a></li>\n<li>Submit issues and improvements through GitHub</li>\n</ul><div class=\"snippet-clipboard-content notranslate position-relative overflow-auto\" data-snippet-clipboard-copy-content=\"/ ├── docker/ # Docker configuration and setup ├── easy-cass-lab/ # Multi-node testing environment ├── examples/ # Implementation examples │ ├── banking/ # Financial transaction examples │ ├── inventory/ # Stock management examples │ └── user-mgmt/ # User operations examples └── docs/ # Guides and documentation\"><pre>/\n├── docker/              # Docker configuration and setup\n├── easy-cass-lab/      # Multi-node testing environment\n├── examples/           # Implementation examples\n│   ├── banking/       # Financial transaction examples\n│   ├── inventory/     # Stock management examples\n│   └── user-mgmt/     # User operations examples\n└── docs/              # Guides and documentation\n</pre></div><p dir=\"auto\">Our <a href=\"https://github.com/pmcfadin/awesome-accord/blob/main/docs/README.md\">documentation</a> includes:</p><ul dir=\"auto\"><li>Comprehensive setup instructions</li>\n<li>Transaction patterns and implementations</li>\n<li>Performance optimization guides</li>\n<li>Troubleshooting and best practices</li>\n</ul><ol dir=\"auto\"><li>Choose your deployment option:\n<ul dir=\"auto\"><li><a href=\"https://github.com/pmcfadin/awesome-accord/blob/main/docker/README.md\">Docker Guide</a></li>\n<li><a href=\"https://github.com/pmcfadin/awesome-accord/blob/main/easy-cass-lab/README.md\">Easy-Cass-Lab Guide</a></li>\n</ul></li>\n<li>Follow the <a href=\"https://github.com/pmcfadin/awesome-accord/blob/main/docs/quickstart.md\">Quick Start Guide</a></li>\n<li>Explore <a href=\"https://github.com/pmcfadin/awesome-accord/blob/main/examples\">example implementations</a></li>\n<li>Connect with our <a 
href=\"https://discord.gg/GrRCajJqmQ\" rel=\"nofollow\">Discord community</a></li>\n<li>Feedback! <a href=\"https://github.com/pmcfadin/awesome-accord/discussions\">Github Discussions</a></li>\n</ol><div class=\"highlight highlight-source-sql notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"BEGIN TRANSACTION LET fromBalance = (SELECT account_balance FROM ks.accounts WHERE account_holder='alice'); IF fromBalance.account_balance &gt;= 20 THEN UPDATE ks.accounts SET account_balance -= 20 WHERE account_holder='alice'; UPDATE ks.accounts SET account_balance += 20 WHERE account_holder='bob'; END IF COMMIT TRANSACTION;\"><pre>BEGIN TRANSACTION\n    LET fromBalance = (SELECT account_balance \n                      FROM ks.accounts \n                      WHERE account_holder='alice');\n    IF fromBalance.account_balance &gt;= 20 THEN\n        UPDATE ks.accounts \n        SET account_balance -= 20 \n        WHERE account_holder='alice';\n        UPDATE ks.accounts \n        SET account_balance += 20 \n        WHERE account_holder='bob';\n    END IF\nCOMMIT TRANSACTION;</pre></div><p dir=\"auto\">Apache License 2.0</p>","id":"227c7330-078e-50ac-95e7-f5e8c0264a5c","title":"GitHub - pmcfadin/awesome-accord: Repository of all kinds of things to help you get up and running with ACID transactions on Apache Cassandra®","origin_url":"https://github.com/pmcfadin/awesome-accord","url":"https://github.com/pmcfadin/awesome-accord","wallabag_created_at":"2025-01-16T16:28:31+00:00","published_at":null,"published_by":"['pmcfadin']","reading_time":1,"domain_name":"github.com","preview_picture":"https://opengraph.githubassets.com/3e477fb2dd2b1ded1c5b53477f4848297badc75ece00c5b49bad1476fdb76167/pmcfadin/awesome-accord","tags":["acid","open.source","cassandra","accord"],"description":"Welcome to the Awesome Accord repository! This guide provides resources and examples for implementing ACID transactions in Apache Cassandra. 
Learn how to leverage distributed transactions for building..."},{"content":"<p dir=\"auto\">Visual Flow is an ETL tool designed for effective data manipulation via convenient and user-friendly interface. The tool has the following capabilities:</p><ul dir=\"auto\"><li>Can integrate data from heterogeneous sources:\n<ul dir=\"auto\"><li>AWS S3</li>\n<li>Cassandra</li>\n<li>Click House</li>\n<li>DB2</li>\n<li>Dataframe (for reading)</li>\n<li>Elastic Search</li>\n<li>IBM COS</li>\n<li>Kafka</li>\n<li>Local File</li>\n<li>MS SQL</li>\n<li>Mongo</li>\n<li>MySQL/Maria</li>\n<li>Oracle</li>\n<li>PostgreSQL</li>\n<li>Redis</li>\n<li>Redshift</li>\n</ul></li>\n<li>Leverage direct connectivity to enterprise applications as sources and targets</li>\n<li>Perform data processing and transformation</li>\n<li>Run custom code</li>\n<li>Leverage metadata for analysis and maintenance</li>\n</ul><p dir=\"auto\">Visual Flow application is divided into the following repositories:</p><p dir=\"auto\"><a href=\"https://github.com/ibagroup-eu/Visual-Flow/blob/main/CONTRIBUTING.md\">Check the official guide</a>.</p><p dir=\"auto\">Visual flow is an open-source software licensed under the <a href=\"https://github.com/ibagroup-eu/Visual-Flow/blob/main/LICENSE\">Apache-2.0 license</a>.</p>","id":"3f0ec87d-a3e1-5490-9ee1-3e96df4176c1","title":"GitHub - ibagroup-eu/Visual-Flow: Visual-Flow main repository","origin_url":"https://github.com/ibagroup-eu/Visual-Flow","url":"https://github.com/ibagroup-eu/Visual-Flow","wallabag_created_at":"2024-12-02T13:34:31+00:00","published_at":null,"published_by":"['ibagroup-eu']","reading_time":null,"domain_name":"github.com","preview_picture":"https://opengraph.githubassets.com/9187fdecad3a37939c1971bcdec19ffed4090307ee508b009f47c7bcd49a7f8d/ibagroup-eu/Visual-Flow","tags":["mongo","nocode","elasticsearch","open.source","cassandra","data.pipeline","elastic","aws.s3","etl","low.code","postgres"],"description":"Visual Flow is an ETL tool designed for effective 
data manipulation via convenient and user-friendly interface. The tool has the following capabilities:Can integrate data from heterogeneous sources:\nA..."},{"content":"<p dir=\"auto\"><a href=\"https://github.com/datastax/cql-proxy/actions/workflows/test.yml\"><img src=\"https://github.com/datastax/cql-proxy/actions/workflows/test.yml/badge.svg\" alt=\"GitHub Action\" class=\"c13\" referrerpolicy=\"no-referrer\" /></a> <a href=\"https://goreportcard.com/report/github.com/datastax/cql-proxy\" rel=\"nofollow\"><img src=\"https://camo.githubusercontent.com/e1c32ff51117d37ba38fd853bb54c63214d25a3a367d0de90a00a03124924acb/68747470733a2f2f676f7265706f7274636172642e636f6d2f62616467652f6769746875622e636f6d2f64617461737461782f63716c2d70726f7879\" alt=\"Go Report Card\" data-canonical-src=\"https://goreportcard.com/badge/github.com/datastax/cql-proxy\" class=\"c13\" referrerpolicy=\"no-referrer\" /></a></p><p dir=\"auto\"><a target=\"_blank\" rel=\"noopener noreferrer\" href=\"https://github.com/datastax/cql-proxy/blob/main/cql-proxy.png\"><img src=\"https://github.com/datastax/cql-proxy/raw/main/cql-proxy.png\" alt=\"cql-proxy\" class=\"c13\" referrerpolicy=\"no-referrer\" /></a></p><p dir=\"auto\"><code>cql-proxy</code> is designed to forward your application's CQL traffic to an appropriate database service. It listens on a local address and securely forwards that traffic.</p><p dir=\"auto\">The <code>cql-proxy</code> sidecar enables unsupported CQL drivers to work with <a href=\"https://astra.datastax.com/\" rel=\"nofollow\">DataStax Astra</a>. 
These drivers include both legacy DataStax <a href=\"https://docs.datastax.com/en/driver-matrix/doc/driver_matrix/common/driverMatrix.html\" rel=\"nofollow\">drivers</a> and community-maintained CQL drivers, such as the <a href=\"https://github.com/gocql/gocql\">gocql</a> driver and the <a href=\"https://github.com/scylladb/scylla-rust-driver\">rust-driver</a>.</p><p dir=\"auto\"><code>cql-proxy</code> also enables applications that are currently using <a href=\"https://cassandra.apache.org/\" rel=\"nofollow\">Apache Cassandra</a> or <a href=\"https://www.datastax.com/products/datastax-enterprise\" rel=\"nofollow\">DataStax Enterprise (DSE)</a> to use Astra without requiring any code changes. Your application just needs to be configured to use the proxy.</p><p dir=\"auto\">If you're building a new application using DataStax <a href=\"https://docs.datastax.com/en/driver-matrix/doc/driver_matrix/common/driverMatrix.html\" rel=\"nofollow\">drivers</a>, <code>cql-proxy</code> is not required, as the drivers can communicate directly with Astra. DataStax drivers have excellent support for Astra out of the box, and are well documented in the <a href=\"https://docs.datastax.com/en/astra/docs/connecting-to-astra-databases-using-datastax-drivers.html\" rel=\"nofollow\">driver guide</a>.</p><p dir=\"auto\">Use the <code>-h</code> or <code>--help</code> flag to display a listing of all flags and their corresponding descriptions and environment variables (shown below as items starting with <code>$</code>):</p><div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"$ ./cql-proxy -h Usage: cql-proxy Flags: -h, --help Show context-sensitive help. -b, --astra-bundle=STRING Path to secure connect bundle for an Astra database. Requires '--username' and '--password'. Ignored if using the token or contact points option ($ASTRA_BUNDLE). 
-t, --astra-token=STRING Token used to authenticate to an Astra database. Requires '--astra-database-id'. Ignored if using the bundle path or contact points option ($ASTRA_TOKEN). -i, --astra-database-id=STRING Database ID of the Astra database. Requires '--astra-token' ($ASTRA_DATABASE_ID) --astra-api-url=&quot;https://api.astra.datastax.com&quot; URL for the Astra API ($ASTRA_API_URL) --astra-timeout=10s Timeout for contacting Astra when retrieving the bundle and metadata ($ASTRA_TIMEOUT) -c, --contact-points=CONTACT-POINTS,... Contact points for cluster. Ignored if using the bundle path or token option ($CONTACT_POINTS). -u, --username=STRING Username to use for authentication ($USERNAME) -p, --password=STRING Password to use for authentication ($PASSWORD) -r, --port=9042 Default port to use when connecting to cluster ($PORT) -n, --protocol-version=&quot;v4&quot; Initial protocol version to use when connecting to the backend cluster (default: v4, options: v3, v4, v5, DSEv1, DSEv2) ($PROTOCOL_VERSION) -m, --max-protocol-version=&quot;v4&quot; Max protocol version supported by the backend cluster (default: v4, options: v3, v4, v5, DSEv1, DSEv2) ($MAX_PROTOCOL_VERSION) -a, --bind=&quot;:9042&quot; Address to use to bind server ($BIND) -f, --config=CONFIG YAML configuration file ($CONFIG_FILE) --debug Show debug logging ($DEBUG) --health-check Enable liveness and readiness checks ($HEALTH_CHECK) --http-bind=&quot;:8000&quot; Address to use to bind HTTP server used for health checks ($HTTP_BIND) --heartbeat-interval=30s Interval between performing heartbeats to the cluster ($HEARTBEAT_INTERVAL) --idle-timeout=60s Duration between successful heartbeats before a connection to the cluster is considered unresponsive and closed ($IDLE_TIMEOUT) --readiness-timeout=30s Duration the proxy is unable to connect to the backend cluster before it is considered not ready ($READINESS_TIMEOUT) --idempotent-graph If true it will treat all graph queries as idempotent by default and 
retry them automatically. It may be dangerous to retry some graph queries -- use with caution ($IDEMPOTENT_GRAPH). --num-conns=1 Number of connection to create to each node of the backend cluster ($NUM_CONNS) --proxy-cert-file=STRING Path to a PEM encoded certificate file with its intermediate certificate chain. This is used to encrypt traffic for proxy clients ($PROXY_CERT_FILE) --proxy-key-file=STRING Path to a PEM encoded private key file. This is used to encrypt traffic for proxy clients ($PROXY_KEY_FILE) --rpc-address=STRING Address to advertise in the 'system.local' table for 'rpc_address'. It must be set if configuring peer proxies ($RPC_ADDRESS) --data-center=STRING Data center to use in system tables ($DATA_CENTER) --tokens=TOKENS,... Tokens to use in the system tables. It's not recommended ($TOKENS)\"><pre>$ ./cql-proxy -h\nUsage: cql-proxy\nFlags:\n  -h, --help                                              Show context-sensitive help.\n  -b, --astra-bundle=STRING                               Path to secure connect bundle for an Astra database. Requires '--username' and '--password'. Ignored if using the\n                                                          token or contact points option ($ASTRA_BUNDLE).\n  -t, --astra-token=STRING                                Token used to authenticate to an Astra database. Requires '--astra-database-id'. Ignored if using the bundle path\n                                                          or contact points option ($ASTRA_TOKEN).\n  -i, --astra-database-id=STRING                          Database ID of the Astra database. Requires '--astra-token' ($ASTRA_DATABASE_ID)\n      --astra-api-url=\"https://api.astra.datastax.com\"    URL for the Astra API ($ASTRA_API_URL)\n      --astra-timeout=10s                                 Timeout for contacting Astra when retrieving the bundle and metadata ($ASTRA_TIMEOUT)\n  -c, --contact-points=CONTACT-POINTS,...                 Contact points for cluster. 
Ignored if using the bundle path or token option ($CONTACT_POINTS).\n  -u, --username=STRING                                   Username to use for authentication ($USERNAME)\n  -p, --password=STRING                                   Password to use for authentication ($PASSWORD)\n  -r, --port=9042                                         Default port to use when connecting to cluster ($PORT)\n  -n, --protocol-version=\"v4\"                             Initial protocol version to use when connecting to the backend cluster (default: v4, options: v3, v4, v5, DSEv1,\n                                                          DSEv2) ($PROTOCOL_VERSION)\n  -m, --max-protocol-version=\"v4\"                         Max protocol version supported by the backend cluster (default: v4, options: v3, v4, v5, DSEv1, DSEv2)\n                                                          ($MAX_PROTOCOL_VERSION)\n  -a, --bind=\":9042\"                                      Address to use to bind server ($BIND)\n  -f, --config=CONFIG                                     YAML configuration file ($CONFIG_FILE)\n      --debug                                             Show debug logging ($DEBUG)\n      --health-check                                      Enable liveness and readiness checks ($HEALTH_CHECK)\n      --http-bind=\":8000\"                                 Address to use to bind HTTP server used for health checks ($HTTP_BIND)\n      --heartbeat-interval=30s                            Interval between performing heartbeats to the cluster ($HEARTBEAT_INTERVAL)\n      --idle-timeout=60s                                  Duration between successful heartbeats before a connection to the cluster is considered unresponsive and closed\n                                                          ($IDLE_TIMEOUT)\n      --readiness-timeout=30s                             Duration the proxy is unable to connect to the backend cluster before it is considered not ready\n                                 
                         ($READINESS_TIMEOUT)\n      --idempotent-graph                                  If true it will treat all graph queries as idempotent by default and retry them automatically. It may be\n                                                          dangerous to retry some graph queries -- use with caution ($IDEMPOTENT_GRAPH).\n      --num-conns=1                                       Number of connection to create to each node of the backend cluster ($NUM_CONNS)\n      --proxy-cert-file=STRING                            Path to a PEM encoded certificate file with its intermediate certificate chain. This is used to encrypt traffic\n                                                          for proxy clients ($PROXY_CERT_FILE)\n      --proxy-key-file=STRING                             Path to a PEM encoded private key file. This is used to encrypt traffic for proxy clients ($PROXY_KEY_FILE)\n      --rpc-address=STRING                                Address to advertise in the 'system.local' table for 'rpc_address'. It must be set if configuring peer proxies\n                                                          ($RPC_ADDRESS)\n      --data-center=STRING                                Data center to use in system tables ($DATA_CENTER)\n      --tokens=TOKENS,...                                 Tokens to use in the system tables. It's not recommended ($TOKENS)</pre></div><p dir=\"auto\">To pass configuration to <code>cql-proxy</code>, either command-line flags, environment variables, or a configuration file can be used. 
Using the <code>docker</code> method as an example, the following samples show how the token and database ID are defined with each method.</p><div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"docker run -p 9042:9042 --rm datastax/cql-proxy:v0.1.5 --astra-token &lt;astra-token&gt; --astra-database-id &lt;astra-database-id&gt;\"><pre>docker run -p 9042:9042 \\\n  --rm datastax/cql-proxy:v0.1.5 \\\n  --astra-token &lt;astra-token&gt; --astra-database-id &lt;astra-database-id&gt;</pre></div><div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"docker run -p 9042:9042 --rm -e ASTRA_TOKEN=&lt;astra-token&gt; -e ASTRA_DATABASE_ID=&lt;astra-database-id&gt; datastax/cql-proxy:v0.1.5\"><pre>docker run -p 9042:9042 --rm \\\n  -e ASTRA_TOKEN=&lt;astra-token&gt; -e ASTRA_DATABASE_ID=&lt;astra-database-id&gt; \\\n  datastax/cql-proxy:v0.1.5</pre></div><p dir=\"auto\">Proxy settings can also be passed using a configuration file with the <code>--config /path/to/proxy.yaml</code> flag. This can be mixed and matched with command-line flags and environment variables. 
Here are some example configuration files:</p><div class=\"highlight highlight-source-yaml notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"contact-points: - 127.0.0.1 username: cassandra password: cassandra port: 9042 bind: 127.0.0.1:9042 # ...\"><pre>contact-points:\n  - 127.0.0.1\nusername: cassandra\npassword: cassandra\nport: 9042\nbind: 127.0.0.1:9042\n# ...</pre></div><p dir=\"auto\">or with an Astra token:</p><div class=\"highlight highlight-source-yaml notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"astra-token: &lt;astra-token&gt; astra-database-id: &lt;astra-database-id&gt; bind: 127.0.0.1:9042 # ...\"><pre>astra-token: &lt;astra-token&gt;\nastra-database-id: &lt;astra-database-id&gt;\nbind: 127.0.0.1:9042\n# ...</pre></div><p dir=\"auto\">All configuration keys match their command-line flag counterparts, e.g. <code>--astra-bundle</code> is <code>astra-bundle:</code>, <code>--contact-points</code> is <code>contact-points:</code>, etc.</p><p dir=\"auto\">Multi-region failover with a DC-aware load balancing policy is the most useful case for a multiple-proxy setup.</p><p dir=\"auto\">When configuring <code>peers:</code>, you must set <code>--rpc-address</code> (or <code>rpc-address:</code> in the YAML) for each proxy, and it must match its corresponding <code>peers:</code> entry. Also, <code>peers:</code> is only available in the configuration file and cannot be set using a command-line flag.</p><p dir=\"auto\">Here's an example of configuring multi-region failover with two proxies. A proxy is started for each region of the cluster and connects to it using that region's bundle. 
They all share a common configuration file that contains the full list of proxies.</p><p dir=\"auto\"><em>Note:</em> Only bundles are supported for multi-region setups.</p><div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"cql-proxy --astra-bundle astra-region1-bundle.zip --username token --password &lt;astra-token&gt; --bind 127.0.0.1:9042 --rpc-address 127.0.0.1 --data-center dc-1 --config proxy.yaml\"><pre>cql-proxy --astra-bundle astra-region1-bundle.zip --username token --password &lt;astra-token&gt; \\\n  --bind 127.0.0.1:9042 --rpc-address 127.0.0.1 --data-center dc-1 --config proxy.yaml</pre></div><div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"cql-proxy --astra-bundle astra-region2-bundle.zip --username token --password &lt;astra-token&gt; --bind 127.0.0.2:9042 --rpc-address 127.0.0.2 --data-center dc-2 --config proxy.yaml\"><pre>cql-proxy --astra-bundle astra-region2-bundle.zip --username token --password &lt;astra-token&gt; \\\n  --bind 127.0.0.2:9042 --rpc-address 127.0.0.2 --data-center dc-2 --config proxy.yaml</pre></div><p dir=\"auto\">The peers settings are configured using a YAML file. It's a good idea to explicitly provide the <code>--data-center</code> flag; otherwise, these values are pulled from the backend cluster, and you would need to read them from the <code>system.local</code> and <code>system.peers</code> tables to properly set up the peers <code>data-center:</code> values. 
Here's an example <code>proxy.yaml</code>:</p><div class=\"highlight highlight-source-yaml notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"peers: - rpc-address: 127.0.0.1 data-center: dc-1 - rpc-address: 127.0.0.2 data-center: dc-2\"><pre>peers:\n  - rpc-address: 127.0.0.1\n    data-center: dc-1\n  - rpc-address: 127.0.0.2\n    data-center: dc-2</pre></div><p dir=\"auto\"><em>Note:</em> It's okay for the <code>peers:</code> to contain entries for the current proxy itself because they'll just be omitted.</p><p dir=\"auto\">There are three methods for using <code>cql-proxy</code>:</p><ul dir=\"auto\"><li>Locally build and run <code>cql-proxy</code></li>\n<li>Run a docker image that has <code>cql-proxy</code> installed</li>\n<li>Use a Kubernetes container to run <code>cql-proxy</code></li>\n</ul><ol dir=\"auto\"><li>\n<p dir=\"auto\">Build <code>cql-proxy</code>.</p>\n<div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"go build\"><pre>go build</pre></div>\n</li>\n<li>\n<p dir=\"auto\">Run with your desired database.</p>\n<ul dir=\"auto\"><li>\n<p dir=\"auto\"><a href=\"https://astra.datastax.com/\" rel=\"nofollow\">DataStax Astra</a> cluster:</p>\n<div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"./cql-proxy --astra-token &lt;astra-token&gt; --astra-database-id &lt;astra-database-id&gt;\"><pre>./cql-proxy --astra-token &lt;astra-token&gt; --astra-database-id &lt;astra-database-id&gt;</pre></div>\n<p dir=\"auto\">The <code>&lt;astra-token&gt;</code> can be generated using these <a href=\"https://docs.datastax.com/en/astra/docs/manage-application-tokens.html\" rel=\"nofollow\">instructions</a>. 
The proxy also supports using the <a href=\"https://docs.datastax.com/en/astra/docs/obtaining-database-credentials.html#_getting_your_secure_connect_bundle\" rel=\"nofollow\">Astra Secure Connect Bundle</a> along with a client ID and secret generated using these <a href=\"https://docs.datastax.com/en/astra/docs/manage-application-tokens.html\" rel=\"nofollow\">instructions</a>:</p>\n<div class=\"snippet-clipboard-content notranslate position-relative overflow-auto\" data-snippet-clipboard-copy-content=\"./cql-proxy --astra-bundle &lt;your-secure-connect-zip&gt; --username &lt;astra-client-id&gt; --password &lt;astra-client-secret&gt;\"><pre>./cql-proxy --astra-bundle &lt;your-secure-connect-zip&gt; \\\n--username &lt;astra-client-id&gt; --password &lt;astra-client-secret&gt;\n</pre></div>\n</li>\n<li>\n<p dir=\"auto\"><a href=\"https://cassandra.apache.org/\" rel=\"nofollow\">Apache Cassandra</a> cluster:</p>\n<div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"./cql-proxy --contact-points &lt;cluster node IPs or DNS names&gt; [--username &lt;username&gt;] [--password &lt;password&gt;]\"><pre>./cql-proxy --contact-points &lt;cluster node IPs or DNS names&gt; [--username &lt;username&gt;] [--password &lt;password&gt;]</pre></div>\n</li>\n</ul></li>\n</ol><ol dir=\"auto\"><li>\n<p dir=\"auto\">Run with your desired database.</p>\n<ul dir=\"auto\"><li>\n<p dir=\"auto\"><a href=\"https://astra.datastax.com/\" rel=\"nofollow\">DataStax Astra</a> cluster:</p>\n<div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"docker run -p 9042:9042 datastax/cql-proxy:v0.1.5 --astra-token &lt;astra-token&gt; --astra-database-id &lt;astra-database-id&gt;\"><pre>docker run -p 9042:9042 \\\n  datastax/cql-proxy:v0.1.5 \\\n  --astra-token &lt;astra-token&gt; --astra-database-id 
&lt;astra-database-id&gt;</pre></div>\n<p dir=\"auto\">The <code>&lt;astra-token&gt;</code> can be generated using these <a href=\"https://docs.datastax.com/en/astra/docs/manage-application-tokens.html\" rel=\"nofollow\">instructions</a>. The proxy also supports using the <a href=\"https://docs.datastax.com/en/astra/docs/obtaining-database-credentials.html#_getting_your_secure_connect_bundle\" rel=\"nofollow\">Astra Secure Connect Bundle</a>, but it requires mounting the bundle to a volume in the container:</p>\n<div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"docker run -v &lt;your-secure-connect-bundle.zip&gt;:/tmp/scb.zip -p 9042:9042 --rm datastax/cql-proxy:v0.1.5 --astra-bundle /tmp/scb.zip --username &lt;astra-client-id&gt; --password &lt;astra-client-secret&gt;\"><pre>docker run -v &lt;your-secure-connect-bundle.zip&gt;:/tmp/scb.zip -p 9042:9042 \\\n--rm datastax/cql-proxy:v0.1.5 \\\n--astra-bundle /tmp/scb.zip --username &lt;astra-client-id&gt; --password &lt;astra-client-secret&gt;</pre></div>\n</li>\n<li>\n<p dir=\"auto\"><a href=\"https://cassandra.apache.org/\" rel=\"nofollow\">Apache Cassandra</a> cluster:</p>\n<div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"docker run -p 9042:9042 datastax/cql-proxy:v0.1.5 --contact-points &lt;cluster node IPs or DNS names&gt; [--username &lt;username&gt;] [--password &lt;password&gt;]\"><pre>docker run -p 9042:9042 \\\n  datastax/cql-proxy:v0.1.5 \\\n  --contact-points &lt;cluster node IPs or DNS names&gt; [--username &lt;username&gt;] [--password &lt;password&gt;]</pre></div>\n</li>\n</ul></li>\n</ol><p dir=\"auto\">If you wish to have the docker image removed after you are done with it, add <code>--rm</code> before the image name <code>datastax/cql-proxy:v0.1.5</code>.</p><p dir=\"auto\">Using Kubernetes with 
<code>cql-proxy</code> requires a number of steps:</p><ol dir=\"auto\"><li>\n<p dir=\"auto\">Generate a token following the Astra <a href=\"https://docs.datastax.com/en/astra/docs/manage-application-tokens.html#_create_application_token\" rel=\"nofollow\">instructions</a>. This step will display your Client ID, Client Secret, and Token; make sure you download the information for the next steps. Store the secure bundle in <code>/tmp/scb.zip</code> to match the example below.</p>\n</li>\n<li>\n<p dir=\"auto\">Create <code>cql-proxy.yaml</code>. You'll need to add three sets of information: arguments, volume mounts, and volumes. A full example can be found <a href=\"https://github.com/datastax/cql-proxy/blob/main/k8s/cql-proxy.yml\">here</a>.</p>\n</li>\n</ol><ul dir=\"auto\"><li>\n<p dir=\"auto\">Arguments: In the container arguments, set the bundle location, username, and password, using the client ID and client secret obtained in the previous step.</p>\n<div class=\"snippet-clipboard-content notranslate position-relative overflow-auto\" data-snippet-clipboard-copy-content=\"command: [&quot;./cql-proxy&quot;] args: [&quot;--astra-bundle=/tmp/scb.zip&quot;,&quot;--username=Client ID&quot;,&quot;--password=Client Secret&quot;]\"><pre>command: [\"./cql-proxy\"]\nargs: [\"--astra-bundle=/tmp/scb.zip\",\"--username=Client ID\",\"--password=Client Secret\"]\n</pre></div>\n</li>\n<li>\n<p dir=\"auto\">Volume mounts: Change the <code>/tmp/</code> mount path if required.</p>\n<div class=\"snippet-clipboard-content notranslate position-relative overflow-auto\" data-snippet-clipboard-copy-content=\"volumeMounts: - name: my-cm-vol mountPath: /tmp/\"><pre>volumeMounts:\n  - name: my-cm-vol\n    mountPath: /tmp/\n</pre></div>\n</li>\n<li>\n<p dir=\"auto\">Volume: Modify the <code>configMap</code> name as required. In this example, it is named <code>cql-proxy-configmap</code>. 
Use the same name for the <code>volumes</code> that you used for the <code>volumeMounts</code>.</p>\n<div class=\"snippet-clipboard-content notranslate position-relative overflow-auto\" data-snippet-clipboard-copy-content=\"volumes: - name: my-cm-vol configMap: name: cql-proxy-configmap\"><pre>volumes:\n  - name: my-cm-vol\n    configMap:\n      name: cql-proxy-configmap        \n</pre></div>\n</li>\n</ul><ol start=\"3\" dir=\"auto\"><li>\n<p dir=\"auto\">Create a configmap. Use the same secure bundle that was specified in the <code>cql-proxy.yaml</code>.</p>\n<div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"kubectl create configmap cql-proxy-configmap --from-file /tmp/scb.zip\"><pre>kubectl create configmap cql-proxy-configmap --from-file /tmp/scb.zip </pre></div>\n</li>\n<li>\n<p dir=\"auto\">Check the configmap that was created.</p>\n<div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"kubectl describe configmap cql-proxy-configmap Name: cql-proxy-configmap Namespace: default Labels: &lt;none&gt; Annotations: &lt;none&gt; Data ==== BinaryData ==== scb.zip: 12311 bytes\"><pre>kubectl describe configmap cql-proxy-configmap\n  Name:         cql-proxy-configmap\n  Namespace:    default\n  Labels:       &lt;none&gt;\n  Annotations:  &lt;none&gt;\n  Data\n  ====\n  BinaryData\n  ====\n  scb.zip: 12311 bytes</pre></div>\n</li>\n<li>\n<p dir=\"auto\">Create a Kubernetes deployment with the YAML file you created:</p>\n<div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" dir=\"auto\" data-snippet-clipboard-copy-content=\"kubectl create -f cql-proxy.yaml\"><pre>kubectl create -f cql-proxy.yaml</pre></div>\n</li>\n<li>\n<p dir=\"auto\">Check the logs:</p>\n<div class=\"highlight highlight-source-shell notranslate position-relative overflow-auto\" 
dir=\"auto\" data-snippet-clipboard-copy-content=\"kubectl logs &lt;deployment-name&gt;\"><pre>kubectl logs &lt;deployment-name&gt;</pre></div>\n</li>\n</ol><p dir=\"auto\">Drivers that use token-aware load balancing may print a warning or may not work when using cql-proxy. Because cql-proxy abstracts the backend cluster as a single endpoint, it doesn't always work well with token-aware drivers that expect there to be at least \"replication factor\" number of nodes in the cluster. Many drivers print a warning (which can be ignored) and fall back to something like round-robin, but other drivers might fail with an error. Drivers that fail with an error must either disable token-aware load balancing or be configured with the round-robin load balancing policy.</p>","id":"e7536061-697a-5e22-9f2f-2ff7a472e641","title":"GitHub - datastax/cql-proxy: A client-side CQL proxy/sidecar.","origin_url":"https://github.com/datastax/cql-proxy","url":"https://github.com/datastax/cql-proxy","wallabag_created_at":"2024-11-01T17:26:01+00:00","published_at":null,"published_by":"['datastax']","reading_time":8,"domain_name":"github.com","preview_picture":"https://opengraph.githubassets.com/c2528e3426d98910ed27819e048b4c1081fab2ed2c7adbea6e6a3b1872deb30a/datastax/cql-proxy","tags":["migration","proxy","cassandra","cql"],"description":" cql-proxy is designed to forward your application's CQL traffic to an appropriate database service. It listens on a local address and securely forwards that traffic.The cql-proxy sidecar enables unsu..."}]},{"tag":"grafana","articles":[{"content":"<p>DataStax Mission Control currently focuses on the lifecycle management and observability of DSE clusters. <a href=\"https://docs.datastax.com/en/mission-control/docs/install/product.html\" class=\"xref page\">Install</a> DataStax Mission Control quickly, and then use it to create and manage DataStax Enterprise 6.8.26+ clusters. 
DataStax Mission Control is composed of a suite of operators which handle the orchestration of automation across regional cluster boundaries. This simplifies management of globally deployed DSE clusters. Define a <a href=\"https://docs.datastax.com/en/mission-control/docs/reference/dse-cluster.html\" class=\"xref page\">DSE Cluster</a> custom resource (CR) to <a href=\"https://docs.datastax.com/en/mission-control/docs/manage/dse/add-dse-cluster.html\" class=\"xref page\">create</a> your first DSE Cluster with DataStax Mission Control.</p>","id":"8c9be501-0d76-5325-81f8-ae484d83122d","title":"DataStax Mission Control :: DataStax Project Mission Control","origin_url":"https://docs.datastax.com/en/mission-control/docs/index.html","url":"https://docs.datastax.com/en/mission-control/docs/index.html","wallabag_created_at":"2023-04-15T01:46:36+00:00","published_at":null,"published_by":null,"reading_time":null,"domain_name":"docs.datastax.com","preview_picture":"https://docs.datastax.com/en/_/img/datastax-docs-banner.png","tags":["kubernetes","datastax","cassandra","grafana"],"description":"DataStax Mission Control currently focuses on the lifecycle management and observability of DSE clusters. Install DataStax Mission Control quickly, and then use it to create and manage DataStax Enterp..."},{"content":"<div class=\"entry clearfix\"><p>In Apache Cassandra Lunch #62, guest speaker Sarma Pydipally presented on the Grafana Dashboard for Cassandra. The live recording of Cassandra Lunch, which includes a more in-depth discussion and a demo, is embedded below in case you were not able to attend live. If you would like to attend Apache Cassandra Lunch live, it is hosted every Wednesday at 12 PM EST. 
Register <a href=\"https://www.meetup.com/Cassandra-DataStax-DC/events/\" target=\"_blank\" rel=\"noreferrer noopener\">here</a> now!</p><h2>Grafana Dashboard for Apache Cassandra</h2><p>We appreciate Sarma Pydipally for taking the time to create a presentation and sharing his knowledge of the Grafana Dashboard and how it can be used with Apache Cassandra.</p><h3>Grafana Dashboard</h3><p>Grafana Dashboards is an open-source data and analytics visualization tool. Some key features of Grafana include visualization, queries, alerts, and metrics exploration. It allows users to take data from their time-series databases (TSDBs) and create graphs and visualizations.</p><h3>Prometheus</h3><p>Prometheus is an open-source monitoring system. It records and tracks real-time metrics in a time series database. Prometheus supports flexible queries using PromQL and supports real-time alerting. It collects data through a pull model. The Prometheus server queries various data sources from a list at a set frequency and updates the current values based on these queries.</p><h3>Cassandra</h3><p>Apache Cassandra is a highly available and highly scalable, open-source, distributed NoSQL database. Cassandra features proven fault-tolerance on hardware or in the cloud.</p><h3>Grafana Dashboard with Prometheus and Cassandra</h3><p>To find out more about how to use Grafana Dashboard with Cassandra, please check out Sarma’s video and presentation embedded below. 
The GitHub repo to the demo featured in the presentation is linked below.</p><figure class=\"wp-block-image size-large is-style-default\"><a href=\"https://blog.anant.us/wp-content/uploads/2021/09/Cassandra-Grafana-Dashboard.jpg\"><img width=\"1024\" height=\"576\" src=\"https://blog.anant.us/wp-content/uploads/2021/09/Cassandra-Grafana-Dashboard-1024x576.jpg\" alt=\"Grafana Dashboard for Cassandra architecture diagram.\" class=\"wp-image-178116\" srcset=\"https://blog.anant.us/wp-content/uploads/2021/09/Cassandra-Grafana-Dashboard-1024x576.jpg 1024w, https://blog.anant.us/wp-content/uploads/2021/09/Cassandra-Grafana-Dashboard-300x169.jpg 300w, https://blog.anant.us/wp-content/uploads/2021/09/Cassandra-Grafana-Dashboard-768x432.jpg 768w, https://blog.anant.us/wp-content/uploads/2021/09/Cassandra-Grafana-Dashboard-1536x864.jpg 1536w, https://blog.anant.us/wp-content/uploads/2021/09/Cassandra-Grafana-Dashboard.jpg 1920w\" referrerpolicy=\"no-referrer\" /></a></figure><h3>Demo</h3><p><a href=\"https://github.com/sarma1807/Prometheus-Grafana-Cassandra\">https://github.com/sarma1807/Prometheus-Grafana-Cassandra</a></p><figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><iframe title=\"Apache Cassandra Lunch #62: Grafana Dashboard for Apache Cassandra\" width=\"900\" height=\"506\" src=\"https://www.youtube.com/embed/ATfKQ9YLfv8?feature=oembed\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"> </iframe>\n</figure><h2>Cassandra.Link</h2><p><a href=\"https://cassandra.link/\" target=\"_blank\" rel=\"noreferrer noopener\">Cassandra.Link</a> is a knowledge base that we created for all things Apache Cassandra. 
Our goal with <a href=\"https://cassandra.link/\" target=\"_blank\" rel=\"noreferrer noopener\">Cassandra.Link</a> was to not only fill the gap of <a href=\"https://web.archive.org/web/*/http://www.planetcassandra.org\" target=\"_blank\" rel=\"noreferrer noopener\">Planet Cassandra</a> but to bring the <strong>Cassandra </strong>community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.</p><p>We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!</p></div>","id":"efd6b1f7-3f4d-5a7e-8313-2984d4a821c7","title":"Apache Cassandra Lunch #62: Grafana Dashboard for Apache Cassandra - Business Platform Team","origin_url":"https://blog.anant.us/apache-cassandra-lunch-62-grafana-dashboard-for-apache-cassandra/","url":"https://blog.anant.us/apache-cassandra-lunch-62-grafana-dashboard-for-apache-cassandra/","wallabag_created_at":"2022-06-25T14:34:24+00:00","published_at":"2021-09-09T17:41:33+00:00","published_by":"['']","reading_time":1,"domain_name":"blog.anant.us","preview_picture":"https://blog.anant.us/wp-content/uploads/2021/08/SM_Webinar_Deck_Template-KEEP-12-1.png","tags":["cassandra.lunch","grafana.dashboard","cassandra","grafana"],"description":"In Apache Cassandra Lunch #62, guest speaker Sarma Pydipally presented on the Grafana Dashboard for Cassandra. 
The live recording of Cassandra Lunch, which includes a more in-depth discussion and a de..."},{"content":"<h2>Cassandra DataSource for Grafana</h2><p>Apache Cassandra Datasource for Grafana. This datasource is to visualise <strong>time-series data</strong> stored in Cassandra/DSE, if you are looking for Cassandra <strong>metrics</strong>, you may need <a href=\"https://github.com/datastax/metric-collector-for-apache-cassandra\" target=\"_blank\">datastax/metric-collector-for-apache-cassandra</a> instead.</p><p><img src=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/workflows/Handle%20Release/badge.svg\" alt=\"Release Status\" /><img src=\"https://github.com/HadesArchitect/grafana-cassandra-source/workflows/CodeQL/badge.svg?branch=master\" alt=\"CodeQL\" /><img src=\"https://img.shields.io/github/downloads/hadesarchitect/grafanacassandradatasource/total?color=%2326c458&amp;label=Downloads&amp;logo=github\" alt=\"GitHub all releases\" /></p><p>To see the datasource in action, please follow the <a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/wiki/Quick-Demo\" target=\"_blank\">Quick Demo</a> steps. Documentation is available <a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/wiki\" target=\"_blank\">here</a></p><p><strong>Supports</strong>:</p><ul><li>Grafana 5.x, 6.x, 7.x (4.x not tested, 8.x WiP not supported yet)</li>\n<li>Cassandra 3.x, 4.x (2.x not tested)</li>\n<li>DataStax Enterprise 6.x</li>\n<li>DataStax Astra (<a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/wiki/DataStax-Astra\" target=\"_blank\">docs</a>)</li>\n<li>AWS Keyspaces (limited support) (<a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/wiki/AWS-Keyspaces\" target=\"_blank\">docs</a>)</li>\n<li>Linux, OSX (incl. 
M1), Windows</li>\n</ul><p><strong>Contacts</strong>:</p><ul><li><a href=\"https://discord.gg/FU2Cb4KTyp\" target=\"_blank\"><img src=\"https://img.shields.io/badge/discord-chat%20with%20us-green\" alt=\"Discord Chat\" /></a></li>\n<li><a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/discussions\" target=\"_blank\"><img src=\"https://img.shields.io/badge/github-discussions-green\" alt=\"Github discussions\" /></a></li>\n</ul><h2>Usage</h2><p>You can find more detailed instructions in <a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/wiki\" target=\"_blank\">the datasource wiki</a>.</p><h3>Installation</h3><ol><li>Download the plugin using <a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/releases/latest\" target=\"_blank\">latest release</a>, please download <code>cassandra-datasource-VERSION.zip</code> and uncompress a file into the Grafana plugins directory (<code>grafana/plugins</code>).</li>\n<li>Add the Cassandra DataSource as a datasource at the datasource configuration page.</li>\n<li>Configure the datasource specifying contact point and port like \"10.11.12.13:9042\", username and password. It's recommended to use a dedicated user with read-only permissions only to the table you have to access.</li>\n<li>Push the \"Save and Test\" button, if there is an error message, check the credentials and connection.</li>\n</ol><p><img src=\"https://user-images.githubusercontent.com/1742301/148654400-3ac4a477-8ca3-4606-86e7-5d10cbdc4ea9.png\" alt=\"Datasource Configuration\" /></p><h3>Panel Setup</h3><p>There are <strong>two ways</strong> to query data from Cassandra/DSE, <strong>Query Configurator</strong> and <strong>Query Editor</strong>. 
Configurator is easier to use but has limited capabilities; Editor is more powerful but requires an understanding of <a href=\"https://cassandra.apache.org/doc/latest/cql/\" target=\"_blank\">CQL</a>.</p><h4>Query Configurator</h4><p><img src=\"https://user-images.githubusercontent.com/1742301/148654262-b9cb7253-4086-4367-8aae-35ea458fcbb6.png\" alt=\"Query Configurator\" /></p><p>Query Configurator is the easiest way to query data. First, enter the keyspace and table name, then pick the proper columns. If the keyspace and table names are given correctly, the datasource will suggest the column names automatically.</p><ul><li><strong>Time Column</strong> - the column storing the timestamp value; it's used to answer the \"when\" question.</li>\n<li><strong>Value Column</strong> - the column storing the value you'd like to show. It can be the <code>value</code>, <code>temperature</code> or whatever property you need.</li>\n<li><strong>ID Column</strong> - the column to uniquely identify the source of the data, e.g. <code>sensor_id</code>, <code>shop_id</code> or whatever allows you to identify the origin of data.</li>\n</ul><p>After that, you have to specify the <code>ID Value</code>, the particular ID of the data origin you want to show. You may need to enable \"ALLOW FILTERING\", although we recommend avoiding it.</p><p><strong>Example</strong> Imagine you want to visualise reports of a temperature sensor installed in your smart home. 
Given the sensor reports its ID, time, location and temperature every minute, we create a table to store the data and put some values there:</p><pre>CREATE TABLE IF NOT EXISTS temperature (\n    sensor_id uuid,\n    registered_at timestamp,\n    temperature int,\n    location text,\n    PRIMARY KEY ((sensor_id), registered_at)\n);\ninsert into temperature (sensor_id, registered_at, temperature, location) values (99051fe9-6a9c-46c2-b949-38ef78858dd0, '2020-04-01T11:21:59.001+0000', 18, 'kitchen');\ninsert into temperature (sensor_id, registered_at, temperature, location) values (99051fe9-6a9c-46c2-b949-38ef78858dd0, '2020-04-01T11:22:59.001+0000', 19, 'kitchen');\ninsert into temperature (sensor_id, registered_at, temperature, location) values (99051fe9-6a9c-46c2-b949-38ef78858dd0, '2020-04-01T11:23:59.001+0000', 20, 'kitchen');\n</pre><p>In this case, we have to fill the configurator fields the following way to get the results:</p><ul><li><strong>Keyspace</strong> - smarthome <em>(keyspace name)</em></li>\n<li><strong>Table</strong> - temperature <em>(table name)</em></li>\n<li><strong>Time Column</strong> - registered_at <em>(occurrence)</em></li>\n<li><strong>Value Column</strong> - temperature <em>(value to show)</em></li>\n<li><strong>ID Column</strong> - sensor_id <em>(ID of the data origin)</em></li>\n<li><strong>ID Value</strong> - 99051fe9-6a9c-46c2-b949-38ef78858dd0 <em>(ID of the sensor)</em></li>\n<li><strong>ALLOW FILTERING</strong> - FALSE <em>(not required, so we are happy to avoid)</em></li>\n</ul><p>If you have several origins (multiple sensors), you will need to add more rows. If your case is as simple as that, query configurator will be a good choice, otherwise please proceed to the query editor.</p><h4>Query Editor</h4><p>Query Editor is a more powerful way to query data. 
To enable the query editor, press the \"toggle text edit mode\" button.</p><p><img src=\"https://user-images.githubusercontent.com/1742301/148654475-6718f3ff-1290-4d7a-a40b-dc107c52ac15.png\" alt=\"102781863-a8bd4b80-4398-11eb-8c28-4d06a1f29279\" /></p><p>Query Editor unlocks all possibilities of CQL, including User-Defined Functions, aggregations, etc.</p><p>Example using <a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/blob/master/test_data.cql\" target=\"_blank\">test_data.cql</a>:</p><pre>SELECT id, CAST(value as double), created_at FROM test.test WHERE id IN (99051fe9-6a9c-46c2-b949-38ef78858dd1, 99051fe9-6a9c-46c2-b949-38ef78858dd0) AND created_at &gt; $__timeFrom and created_at &lt; $__timeTo\n</pre><ol><li>Follow the order of the SELECT expressions; it's important!</li>\n</ol><ul><li><strong>Identifier</strong> - the first property in the SELECT expression must be the ID, something that uniquely identifies the data (e.g. <code>sensor_id</code>)</li>\n<li><strong>Value</strong> - the second property must be the value you are going to show</li>\n<li><strong>Timestamp</strong> - the third property must be the timestamp of the value.\nAll other properties will be ignored</li>\n</ul><ol start=\"2\"><li>To filter data by time, use the <code>$__timeFrom</code> and <code>$__timeTo</code> placeholders as in the example. The datasource will replace them with time values from the panel. <strong>Notice</strong> It's important to add the placeholders; otherwise the query will try to fetch data for the whole period of time. Don't try to specify the timeframe on your own, just put the placeholders. 
It's grafana's job to specify time limits.</li>\n</ol><p><img src=\"https://user-images.githubusercontent.com/1742301/148654522-8e50617d-0ba9-4c5a-a3f0-7badec92e31f.png\" alt=\"103153625-1fd85280-4792-11eb-9c00-085297802117\" /></p><h2>Development</h2><p><a href=\"https://github.com/HadesArchitect/GrafanaCassandraDatasource/wiki/Developer-Guide\" target=\"_blank\">Developer documentation</a></p>","id":"de9ff26a-f27a-57fb-b10d-bcbd676a6d9d","title":"Apache Cassandra","origin_url":"https://grafana.com/grafana/plugins/hadesarchitect-cassandra-datasource/","url":"https://grafana.com/grafana/plugins/hadesarchitect-cassandra-datasource/","wallabag_created_at":"2022-04-25T21:45:46+00:00","published_at":null,"published_by":null,"reading_time":4,"domain_name":"grafana.com","preview_picture":"https://grafana.com/media/images/meta/grafana-labs-meta-default_1200x630.png","tags":["cassandra","grafana"],"description":"Cassandra DataSource for GrafanaApache Cassandra Datasource for Grafana. This datasource is to visualise time-series data stored in Cassandra/DSE, if you are looking for Cassandra metrics, you may nee..."},{"content":"<p><a target=\"_blank\" rel=\"noopener noreferrer\" href=\"https://github.com/sarma1807/Prometheus-Grafana-Cassandra/blob/main/Screenshots/JPGs/CassPromGraf_00_Arch.jpg\"><img src=\"https://github.com/sarma1807/Prometheus-Grafana-Cassandra/raw/main/Screenshots/JPGs/CassPromGraf_00_Arch.jpg\" alt=\"CassPromGraf_00_Arch.jpg\" /></a> </p><h3><a id=\"user-content-environment\" class=\"anchor\" aria-hidden=\"true\" href=\"#environment\"></a>Environment</h3><h5><a id=\"user-content-following-servers-are-running-with-centos-linux-release-782003-core-\" class=\"anchor\" aria-hidden=\"true\" href=\"#following-servers-are-running-with-centos-linux-release-782003-core-\"></a>Following servers are running with <code>CentOS Linux release 7.8.2003 (Core)</code> :</h5><div class=\"snippet-clipboard-content position-relative\" 
data-snippet-clipboard-copy-content=\"192.168.1.151      eternal1      eternal1.OracleByExample.com&#10;192.168.1.152      eternal2      eternal2.OracleByExample.com&#10;192.168.1.153      eternal3      eternal3.OracleByExample.com&#10;192.168.1.191      PromGraf      PromGraf.OracleByExample.com&#10;\"><pre>192.168.1.151      eternal1      eternal1.OracleByExample.com\n192.168.1.152      eternal2      eternal2.OracleByExample.com\n192.168.1.153      eternal3      eternal3.OracleByExample.com\n192.168.1.191      PromGraf      PromGraf.OracleByExample.com\n</pre></div><h3><a id=\"user-content-apache-cassandra\" class=\"anchor\" aria-hidden=\"true\" href=\"#apache-cassandra\"></a>Apache Cassandra</h3><p>3-node Cassandra cluster <code>cluster_name: 'id_cluster'</code> version : <code>Apache Cassandra 4.0-beta4</code> running on the following servers :</p><div class=\"snippet-clipboard-content position-relative\" data-snippet-clipboard-copy-content=\"192.168.1.151      eternal1      eternal1.OracleByExample.com&#10;192.168.1.152      eternal2      eternal2.OracleByExample.com&#10;192.168.1.153      eternal3      eternal3.OracleByExample.com&#10;\"><pre>192.168.1.151      eternal1      eternal1.OracleByExample.com\n192.168.1.152      eternal2      eternal2.OracleByExample.com\n192.168.1.153      eternal3      eternal3.OracleByExample.com\n</pre></div><p>We will configure Cassandra's built-in <code>metrics-reporter</code> to extract and publish metrics to Prometheus. 
<br />We will also configure <code>Prometheus node_exporter</code> on each of our Cassandra nodes to extract and publish host-level metrics to Prometheus.\n</p><h3><a id=\"user-content-prometheus--grafana\" class=\"anchor\" aria-hidden=\"true\" href=\"#prometheus--grafana\"></a>Prometheus &amp; Grafana</h3><p>We will configure and run <code>Prometheus &amp; Grafana</code> on the following server :</p><div class=\"snippet-clipboard-content position-relative\" data-snippet-clipboard-copy-content=\"192.168.1.191      PromGraf      PromGraf.OracleByExample.com&#10;\"><pre>192.168.1.191      PromGraf      PromGraf.OracleByExample.com\n</pre></div><p><code>Prometheus</code> will gather and organize all collected metrics into its internal time-series database. <br /><code>Grafana</code> will consume the metrics from Prometheus and display them in a nice dashboard. </p><h3><a id=\"user-content-prometheus-alertmanager\" class=\"anchor\" aria-hidden=\"true\" href=\"#prometheus-alertmanager\"></a>Prometheus Alertmanager</h3><p>In the future, we should implement <code>Alertmanager</code>\n</p>","id":"3d846338-4870-59b3-8a50-8ba8bf31966d","title":"sarma1807/Prometheus-Grafana-Cassandra","origin_url":"https://github.com/sarma1807/Prometheus-Grafana-Cassandra","url":"https://github.com/sarma1807/Prometheus-Grafana-Cassandra","wallabag_created_at":"2021-07-10T11:53:50+00:00","published_at":null,"published_by":"['sarma1807']","reading_time":null,"domain_name":"github.com","preview_picture":"https://opengraph.githubassets.com/b79fbf32a62a4dfb02f40b3c4144d102b6d471b6f61ae0f8af63847ce6ea5fbf/sarma1807/Prometheus-Grafana-Cassandra","tags":["cassandra","grafana","prometheus"],"description":" EnvironmentFollowing servers are running with CentOS Linux release 7.8.2003 (Core) :192.168.1.151      eternal1      eternal1.OracleByExample.com\n192.168.1.152      eternal2      eternal2.OracleByExa..."}]}]}},"staticQueryHashes":[]}