Cassandra.Link

2/21/2020

OpsCenter & Search for its Replacement

by John Doe

OpsCenter

And Search for its Replacement

Table of Contents

  • Abstract
  • Need of Open Tools
  • Tasks of OpsCenter
  • Features of OpsCenter
  • Enterprise Edition
  • Conclusion

Abstract

DataStax OpsCenter is a visual management and monitoring solution for Apache Cassandra and DataStax Enterprise.

OpsCenter is one of the most important tools in the DataStax community. When it comes to monitoring Cassandra clusters, OpsCenter is the favourite among DataStax customers.

OpsCenter version 5.2.4 monitors, manages and maintains:

  • OSS Cassandra [ open source software Cassandra clusters ]
  • DDC [ DataStax Distribution of Cassandra clusters ]
  • DSC [ DataStax Community ]

However, due to a policy change, starting with version 6.0 OpsCenter is compatible only with DSE clusters [ DataStax Enterprise clusters ]. DataStax is discontinuing OpsCenter support for OSS and DDC clusters.

This leaves customers with two options: upgrade their clusters to the Enterprise Edition, or find a replacement for OpsCenter.

Community Edition of Cassandra vs. DSE [ Enterprise Edition ]

| Feature | Open Source | DataStax Enterprise |
|---|---|---|
| Core security features | Yes | Yes |
| Enterprise security features | No | Yes |
| Built-in automatic management services | No | Yes |
| Integrated enterprise search | No | Yes |
| Integrated streaming, real-time and batch analytics | No | Yes |
| External Hadoop integration | No | Yes |
| In-memory feature | No | Yes |
| Workload management/isolation | No | Yes |
| Easy migration of RDBMS and log data | No | Yes |
| Certified software updates | No | Yes |
| Certified platform support | No | Yes |
| OpsCenter | Basic functionality | Advanced functionality |

Replacement for OpsCenter

Starting with version 6.0, OpsCenter is compatible only with DataStax Enterprise (DSE) clusters. Earlier versions remain available for other clusters, but only with the Basic functionality, that is, without Advanced functionality such as certified software updates and certified platform support.

In the search for alternatives, a few open tools came very close to the functionality of the earlier OpsCenter releases:

  • Nagios
  • Graphite
  • Grafana

Grafana

Grafana is an open source metrics dashboard and graph composer for Graphite and InfluxDB.

Latest Version: Grafana v3.0

Updates in Latest Version:

  • Refactored data source plugin architecture, with two new plugin types:
      • Panel plugins: add new panel types for dashboards.
      • App plugins: bundle panel plugins, data source plugins, dashboards and Grafana pages.
  • grafana-cli: Grafana 3.0 comes with a new command line tool called grafana-cli, which makes it easy to install plugins from Grafana.net, for example:

          grafana-cli plugins install grafana-piechart-panel

Graphite Support:

Grafana includes a built-in Graphite query parser that makes Graphite metric expressions easier to read and faster to edit.

  • Click on any metric segment to change it.
  • Quickly add functions (with search).
  • Click on a function parameter to change it.
  • Rich templating support.

InfluxDB Support:

  • Rich editor with measurement, tag and tag value completion.
  • Automatic handling of group by time.
  • Templating queries for generic dashboards.

Features of Grafana:

Rich Graphing:

  • Fast and flexible client-side graphs with a multitude of options: bars, lines, points, thresholds and logarithmic scales. Any graph can be viewed or edited in fullscreen.

Graph Styling:

  • Full control over how each series is drawn. Mix stacked series with isolated series, and export any graph to a PNG image.

Annotations:

Annotate graphs with rich events from different data sources. Hovering over an event shows the full event metadata and tags.

  • Fetch annotations from Elasticsearch.
  • Fetch annotations from Graphite events and metrics.
  • Fetch annotations from InfluxDB.

Many data sources & plugins:

All data sources in Grafana use the same data source plugin API.

  • Mix different data sources on the same dashboard.
  • Mix different data sources in the same graph.
  • Add custom data sources.

The picture below shows the graph of read and write operations measured during a benchmarking test performed on a 3-node cluster.

[Image: Capture.JPG]

Nagios

Nagios is a powerful monitoring system that enables organizations to identify and resolve IT infrastructure problems before they affect critical business processes. In the event of a failure, Nagios can alert technical staff of the problem, allowing them to begin remediation processes before outages affect business processes, end-users, or customers.

Cassandra Monitoring plugins

Nagios can monitor Cassandra through two plugins: Check Cassandra Cluster, and Check Cassandra Status and Heap Memory.

  1. Check Cassandra Cluster

The Cluster Node Check Plugin is designed to verify whether the number of live nodes is less than a specified number, and if so, trigger a warning or critical alert within Nagios XI. This plugin needs to run on a server with the Apache Cassandra nodetool utility.

http://exchange.nagios.org/directory/Plugins/Java-Applications-and-Servers/check_cassandra_cluster-2Esh/details
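The live-node check described above can be sketched as follows. This is a minimal illustration, not the code of the plugin linked above: it assumes the output of `nodetool status` is already available as text, counts the nodes flagged `UN` (Up/Normal), and maps the result to the conventional Nagios exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL). The function names, thresholds and sample output are hypothetical.

```python
# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def count_live_nodes(status_output: str) -> int:
    """Count nodes reported as Up/Normal ("UN") in `nodetool status` text."""
    return sum(
        1 for line in status_output.splitlines()
        if line.split()[:1] == ["UN"]
    )


def check_cluster(status_output: str, warn_below: int, crit_below: int):
    """Return (exit_code, message) based on the number of live nodes."""
    live = count_live_nodes(status_output)
    if live < crit_below:
        return CRITICAL, f"CRITICAL - {live} live nodes"
    if live < warn_below:
        return WARNING, f"WARNING - {live} live nodes"
    return OK, f"OK - {live} live nodes"


# Hypothetical sample of `nodetool status` output for a 3-node cluster
# in which one node is down (DN).
SAMPLE = """\
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns   Host ID  Rack
UN  10.0.0.1   1.2 GiB    256     33.3%  aaaa     rack1
UN  10.0.0.2   1.1 GiB    256     33.3%  bbbb     rack1
DN  10.0.0.3   1.3 GiB    256     33.4%  cccc     rack1
"""

code, message = check_cluster(SAMPLE, warn_below=3, crit_below=2)
print(code, message)
```

In a real deployment the text would come from running nodetool on a Cassandra server, and the script would exit with the returned code so that Nagios can interpret it.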

  2. Memory Heap and Other Metrics Retrievable Through JMX

  • Monitors Cassandra status and heap memory utilization

The Memory Heap and Status plugin script (cassandra.pl, written in Perl) from the Nagios Exchange can be used to monitor Cassandra status and heap utilization. Because the script relies on Cassandra's “nodetool” utility, it must be installed on the Cassandra servers and invoked through NRPE or another agent of your choice.

http://exchange.nagios.org/directory/Plugins/Java-Applications-and-Servers/Check-Cassandra-status-and-heap-memoryutilization/details
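As a rough idea of what a heap check built on nodetool involves, the sketch below parses the `Heap Memory (MB)` line from `nodetool info` output and turns it into a utilization percentage. It is not the cassandra.pl script itself; the function names, the 80/90 percent thresholds and the sample output are assumptions made for illustration.

```python
import re

# Matches a line such as "Heap Memory (MB)       : 6054.00 / 8072.00".
HEAP_LINE = re.compile(r"Heap Memory \(MB\)\s*:\s*([\d.]+)\s*/\s*([\d.]+)")


def heap_utilization(info_output: str) -> float:
    """Return used heap as a percentage of maximum heap."""
    for line in info_output.splitlines():
        # Skip "Off Heap Memory (MB)", which contains the same phrase.
        if line.lstrip().startswith("Heap Memory"):
            match = HEAP_LINE.search(line)
            if match:
                used, total = float(match.group(1)), float(match.group(2))
                return 100.0 * used / total
    raise ValueError("no heap memory line found")


def heap_status(percent: float, warn: float = 80.0, crit: float = 90.0) -> str:
    """Map a utilization percentage to a status (hypothetical thresholds)."""
    if percent >= crit:
        return "CRITICAL"
    if percent >= warn:
        return "WARNING"
    return "OK"


# Hypothetical excerpt of `nodetool info` output.
SAMPLE_INFO = """\
ID                     : aaaa
Gossip active          : true
Load                   : 1.2 GiB
Heap Memory (MB)       : 6054.00 / 8072.00
Off Heap Memory (MB)   : 12.50
"""

percent = heap_utilization(SAMPLE_INFO)
print(f"{percent:.1f}% {heap_status(percent)}")
```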

OpsCenter Vs Other Open Tools

OpsCenter

Tasks of OpsCenter:

  • Adding and expanding clusters
  • Configuring nodes
  • Viewing performance metrics
  • Rectifying issues
  • Monitoring the health of your clusters on the dashboard

Features of OpsCenter:

Dashboard:

A centralized dashboard acts as a single point of communication and allows at-a-glance management.

A sample image of the DataStax OpsCenter dashboard:

[Image: Opscenter.jpg]

Configuration and Administration:

Provides a user-friendly environment with visual support for creating new clusters and for adding nodes to or removing nodes from existing clusters.

  • Administration tasks, such as adding a cluster, using simple point-and-click actions.
  • Multiple cluster management from a single OpsCenter instance using agents.
  • Multiple node management.
  • Downloadable PDF cluster report.

Fault Tolerance:

Automatic failover is built into OpsCenter, which helps ensure that any scheduled tasks or monitoring activities continue even if a primary OpsCenter service fails.

Proactive Assistance:

Proactive help is available in OpsCenter for troubleshooting problems and planning for future needs. Forecasting future capacity needs (e.g. when will I need more disk space or RAM?) is carried out with a single mouse click.

Enterprise Only Functionality

Enterprise functionality in OpsCenter is only enabled on DataStax Enterprise clusters:

  • Monitoring capabilities for DSE In-Memory tables.
  • View the Spark console.
  • Automatic failover from the primary OpsCenter to a backup OpsCenter instance.
  • Security, with the ability to define user roles.

DSE Management Services:

Repair Service:

The Repair Service is configured to run continuously and perform repair operations across a DSE cluster in a minimally impactful way.

Capacity Service:

Using trend analysis and forecasting, the Capacity Service helps users understand how their cluster is performing within its current environment and workload, and gain a better sense of how time affects those trends, both past and future.

Data Backup and Restore:

A backup is a snapshot of all on-disk data files (SSTable files) stored in the data directory. Backups are taken per keyspace while the system is online. A backup first flushes all in-memory writes to disk, then makes a hard link to each SSTable file in the keyspace. Backups are stored in the snapshots directory of the column family being snapshotted, for example /var/lib/cassandra/data/OpsCenter/settings/snapshots.

OpsCenter Data Backups lets you specify a schedule for removing old backups and prevents backups from being taken when disk space falls below a specified level. It covers:

  • Scheduling a backup
  • Restoring from a backup
  • Using custom scripts before and after backups
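The hard-link step is what makes a snapshot cheap: no data is copied, and the snapshot keeps the bytes alive even after the original file is removed. The sketch below illustrates the idea on a temporary directory. It is not how Cassandra or OpsCenter implements backups; the function `snapshot_table`, the tag name and the file name are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def snapshot_table(table_dir: Path, tag: str) -> Path:
    """Hard-link every data file in table_dir into snapshots/<tag>/."""
    target = table_dir / "snapshots" / tag
    target.mkdir(parents=True, exist_ok=False)
    for entry in table_dir.iterdir():
        if entry.is_file():
            # A hard link points at the same data on disk, so nothing is copied.
            os.link(entry, target / entry.name)
    return target


with tempfile.TemporaryDirectory() as tmp:
    # Mimic the <keyspace>/<table> layout with a stand-in data file.
    table_dir = Path(tmp) / "OpsCenter" / "settings"
    table_dir.mkdir(parents=True)
    sstable = table_dir / "example-Data.db"  # hypothetical file name
    sstable.write_bytes(b"sstable bytes")

    snap_dir = snapshot_table(table_dir, "backup-1")
    linked = snap_dir / sstable.name

    # Both names refer to the same underlying file.
    print(os.path.samefile(sstable, linked))

    # Removing the original name leaves the snapshot readable.
    sstable.unlink()
    print(linked.read_bytes() == b"sstable bytes")
```

This works because SSTable files are immutable once written, so a link taken right after a flush keeps describing a consistent point in time.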

Alerts:

| Metric | Definition |
|---|---|
| Node down | When a node is not responding to requests, it is marked as down. |
| Write requests | The number of write requests per second. Monitoring the number of writes over a given time period can give you an idea of system write workload and usage patterns. |
| Write request latency | The response time (in milliseconds) for successful write operations. The time period starts when a node receives a client write request, and ends when the node responds back to the client. |
| Read requests | The number of read requests per second. Monitoring the number of reads over a given time period can give you an idea of system read workload and usage patterns. |
| Read request latency | The response time (in milliseconds) for successful read operations. The time period starts when a node receives a client read request, and ends when the node responds back to the client. |
| CPU usage | The percentage of time that the CPU was busy, which is calculated by subtracting the percentage of time the CPU was idle from 100 percent. |
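The CPU usage definition is simple arithmetic, and a small worked example may help. The sketch below assumes idle time is derived from two samples of cumulative tick counters (as many operating systems expose them); the function names and the counter values are hypothetical and say nothing about how OpsCenter collects the metric.

```python
def busy_percent(idle_percent: float) -> float:
    """CPU busy time: 100 percent minus the idle percentage."""
    if not 0.0 <= idle_percent <= 100.0:
        raise ValueError("idle_percent must be between 0 and 100")
    return 100.0 - idle_percent


def idle_percent_between(idle_before: int, total_before: int,
                         idle_after: int, total_after: int) -> float:
    """Idle percentage over the interval between two counter samples."""
    total_delta = total_after - total_before
    if total_delta <= 0:
        raise ValueError("the second sample must be later than the first")
    return 100.0 * (idle_after - idle_before) / total_delta


# Hypothetical tick counters sampled a few seconds apart:
# 750 of the 1000 elapsed ticks were idle, so the CPU was busy 25% of the time.
idle = idle_percent_between(idle_before=400, total_before=1000,
                            idle_after=1150, total_after=2000)
print(busy_percent(idle))  # prints 25.0
```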

What others are using?

A survey on the Cassandra mailing list asked how C* is used, and one of the questions was about monitoring tools.

212 answers were recorded regarding monitoring systems. The most popular tools (each with more than 3% of the votes) are:

  1. DataStax OpsCenter 85 (40%)
  2. Nagios 43 (20%)
  3. Grafana 15 (7%)
  4. In-house tools 11 (5%)

Conclusion

Based on the survey results and on ease of use, Grafana and Nagios are the tools that come closest to answering “The Search for OpsCenter’s Replacement”.

More technical details on “Grafana + InfluxDB”, used in a benchmarking test on a Cassandra cluster, will follow in the next blog.

Thank You.
