“So what”, you say, “Why should I care if that cluster had such low hardware utilization? I’m sure mine’s fine.”
Here’s why you should care: this kind of thing happens all the time in the Cassandra world. At the time of that conversation, I had two other customers struggling with the same challenge: a large chip maker and a network equipment manufacturer.
There are many variations of how and why this happens. You just might be unknowingly wasting CapEx and leaving a ton of scalability on the table with your Cassandra clusters.
Here’s the reality check:
My customers needed to deploy a different number of Cassandra software nodes than the number of physical host servers they had available. But they had no feasible way to do it.
What my customers really needed was the ability to deploy scenarios like 52 Cassandra nodes on 23 host servers, or maybe 17 nodes on 11 servers, or possibly 500 nodes on as few servers as possible to keep their hardware costs down for such a large cluster.
How does this happen? It happens because the number of server hosts and the number of Cassandra nodes get optimized separately.
- Host Count: Provision host server capacity based on the best economics you can get from your IT or Purchasing for gear that meets or exceeds the Cassandra minimum server specifications. Current best practice is to get enough servers to handle the aggregate peak workload or data set size, whichever is the limiter in your use case.
- Cassandra Node Count: You really need to size your Cassandra software nodes and determine your cluster node count based on testing before you go to production. Sizing Cassandra before you’re in production is both an art and a science. It’s hard. You can guesstimate through calculations, but sizing calculators are usually not effective. Luckily for you, Cassandra scales linearly. You can test a small cluster and then arrive at your total cluster node count by dividing your aggregate workload or data size by the capacity of each node (from your tests). Don’t forget to factor in replication and multi-DC.
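The linear-scaling extrapolation above can be sketched in a few lines. This is a hypothetical back-of-the-envelope helper, not a real sizing tool: the function name, the per-node capacity, and the example numbers are all illustrative assumptions you would replace with results from your own load tests.

```python
import math

def cluster_node_count(total_data_tb, per_node_capacity_tb,
                       replication_factor, datacenters):
    """Nodes needed so the total data, multiplied out by replication
    and the number of DCs, fits at the per-node density your load
    test proved out. (Illustrative sketch, not a sizing calculator.)"""
    logical_data_tb = total_data_tb * replication_factor * datacenters
    return math.ceil(logical_data_tb / per_node_capacity_tb)

# Example (made-up numbers): 60 TB of unique data, 1.5 TB per node
# proven in testing on HDD, RF=3, two data centers.
print(cluster_node_count(60, 1.5, 3, 2))  # → 240 nodes
```

The point is not the arithmetic, which is trivial, but that the per-node capacity input must come from testing with your own data model and access patterns, not from a spec sheet.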
Chances are slim to none the boxes available to you are exactly the right size for a single Cassandra node. Most of my Cassandra customers had no choice but to use servers that were significantly over the minimum Cassandra server requirements. So they tried to over-size their Cassandra node density to get the most out of their servers. That’s a path proven to be fraught with pain.
As your workloads and data sizes change over time, the node:host mismatch is exacerbated. But that’s a topic for another day and another blog post…
**Here’s how we get to the sizing of 3 Cassandra nodes per box, based on current practices with Cassandra 3.x:**
- Storage – Under a write or read/write workload, each Cassandra node does its own background garbage collecting and flushing, which causes a lot of storage IO. Under a heavy write workload, a well-tuned node will be compacting almost constantly without falling too far behind, if at all. This means the amount of data you can put on a Cassandra node is often IO bound. Think: a max of ~1.5TB for HDD and ~4-5TB for SSD. Because many variables factor into how much data you can fit on a Cassandra node while meeting your SLAs, you will want to load test with your own data model, access patterns with realistic payload sizes, client code, driver setup, hardware stack, etc., before deciding your production data density per node.
- Cores – A single Cassandra node under max load can effectively use maybe ~12 cores. That 12 cores is not a hard limit; it’s a ballpark before diminishing returns set in. Yes, I’m generalizing here and not hitting all the permutations, but it’s still valid. 8 cores is a good sweet spot for a moderately sized yet very capable node.
- Memory – The Cassandra JVM heap can only be so big before you start to get into GC trouble. Buffer cache will take as much memory as you throw at it, which is good for read-heavy workloads, but that cache benefit might not be there for a write or mixed read/write workload. 42GB of memory per node is fine for 8 cores and a reasonable data density.
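Putting the three rules of thumb above together, a quick sanity check of how many nodes an over-sized host can carry is just a min over the binding resources. A minimal sketch, assuming the per-node figures from the bullets (~8 cores, ~42GB RAM, ~4TB SSD); the host specs in the example are made up:

```python
def nodes_per_host(host_cores, host_ram_gb, host_ssd_tb,
                   cores_per_node=8, ram_per_node_gb=42, ssd_per_node_tb=4):
    """How many Cassandra nodes fit on one host, given per-node rules
    of thumb. The scarcest resource is the limiter. (Illustrative only;
    validate any density with load testing before production.)"""
    return min(host_cores // cores_per_node,
               host_ram_gb // ram_per_node_gb,
               int(host_ssd_tb // ssd_per_node_tb))

# Example (hypothetical server): 28 cores, 128GB RAM, 12TB of SSD.
# Cores allow 3, RAM allows 3 (128 // 42), SSD allows 3 (12 // 4).
print(nodes_per_host(28, 128, 12))  # → 3 nodes per box
```

This is exactly the kind of arithmetic that lands you at 3 nodes per box: the hosts available to you are a multiple of a sensible node size, so running one node per host leaves most of the machine idle.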