Cassandra.Link

5/3/2021

grafana/metrictank

by John Doe

Introduction

Grafana Metrictank is a multi-tenant timeseries platform that can be used as a backend or replacement for Graphite. It provides long term storage, high availability, efficient storage, retrieval and processing for large scale environments.

Grafana Labs has been running Metrictank in production since December 2015. It currently requires an external datastore like Cassandra or Bigtable, and we highly recommend using Kafka to support clustering, as well as a clustering manager like Kubernetes. This makes it non-trivial to operate, though Grafana Labs has an on-premise product that makes this process much easier.

Features

  • 100% open source
  • Heavily compressed chunks (inspired by the Facebook Gorilla paper) dramatically lower CPU, memory, and storage requirements and get much greater performance out of Cassandra than other solutions.
  • Writeback RAM buffers and chunk caches, serving most data out of memory.
  • Multiple rollup functions can be configured per series (or group of series), e.g. min/max/sum/count/average, and selected at query time via consolidateBy(). This lets us do consolidation (combined runtime+archived) accurately and correctly, unlike most other Graphite backends such as Whisper.
  • Flexible tenancy: can be used as single tenant or multi tenant. Selected data can be shared across all tenants.
  • Input options: carbon, metrics2.0, kafka.
  • Guards against excessively large queries (per-request series/points restrictions).
  • Data backfill/import from Whisper.
  • Speculative Execution means you can use replicas not only for High Availability but also to reduce query latency.
  • Write-Ahead buffer based on Kafka facilitates robust clustering and enables other analytics use cases.
  • Tags and Meta Tags support
  • Render response metadata: performance statistics, series lineage information and rollup indicator visible through Grafana
  • Index pruning (hide inactive/stale series)
  • Timeseries can change resolution (interval) over time; they will be merged seamlessly at read time. No need for any data migrations.
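
To illustrate the rollup bullet above: keeping several aggregates per series is what makes query-time consolidation accurate. The following Go sketch is not Metrictank's implementation; the `Rollup` type and the `Aggregate`, `ConsolidateAvg` and `AvgOfAvgs` functions are hypothetical names chosen for illustration. It assumes each archived bucket stores a sum and a count, so an average across buckets can be recomputed exactly, whereas averaging stored averages skews the result when buckets hold different numbers of points.

```go
package main

import "fmt"

// Rollup is a hypothetical archived aggregate for one time bucket.
type Rollup struct {
	Sum, Count, Min, Max float64
}

// Aggregate builds a Rollup from the raw points of one bucket.
func Aggregate(points []float64) Rollup {
	var r Rollup
	for i, p := range points {
		if i == 0 || p < r.Min {
			r.Min = p
		}
		if i == 0 || p > r.Max {
			r.Max = p
		}
		r.Sum += p
		r.Count++
	}
	return r
}

// ConsolidateAvg computes the exact average over several buckets
// by dividing the sum of sums by the sum of counts.
func ConsolidateAvg(rs []Rollup) float64 {
	var sum, count float64
	for _, r := range rs {
		sum += r.Sum
		count += r.Count
	}
	if count == 0 {
		return 0 // no points at all; a real system might return null instead
	}
	return sum / count
}

// AvgOfAvgs is the naive alternative: it averages per-bucket averages
// and therefore ignores how many points each bucket contained.
func AvgOfAvgs(rs []Rollup) float64 {
	var total, n float64
	for _, r := range rs {
		if r.Count > 0 {
			total += r.Sum / r.Count
			n++
		}
	}
	if n == 0 {
		return 0
	}
	return total / n
}

func main() {
	a := Aggregate([]float64{1, 2, 3}) // sum 6, count 3
	b := Aggregate([]float64{10})      // sum 10, count 1
	fmt.Println(ConsolidateAvg([]Rollup{a, b})) // 4 (16 / 4)
	fmt.Println(AvgOfAvgs([]Rollup{a, b}))      // 6 ((2 + 10) / 2)
}
```

With only an average rollup stored, the second result is the best a backend could do; storing sum and count alongside it is what allows the first.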

Relation to Graphite

The goal of Metrictank is to provide a more scalable, secure, resource-efficient and performant version of Graphite that is backwards compatible, while also adding some novel functionality (see Features, above).

There are two main ways to deploy Metrictank:

  • as a backend for Graphite-web, by setting the CLUSTER_SERVER configuration value.
  • as an alternative to a Graphite stack. This enables most of the additional functionality. Note that Metrictank's API is not yet quite on par with Graphite-web: some less commonly used functions are not yet implemented natively, in which case Metrictank relies on a Graphite-web process to handle those requests. See our Graphite comparison page for more details.

Limitations

  • No performance/availability isolation between tenants per instance (only data isolation).
  • Minimal computation locality: we move the data from storage to the processing code, which lives in both Metrictank and Graphite.
  • Can't overwrite old data. We support reordering within the most recent time window, but that's it (unless you restart Metrictank).

Interesting design characteristics (feature or limitation... up to you)

  • Upgrades and process restarts require running multiple instances (potentially only for the duration of the maintenance) and possibly re-assigning the primary role. Otherwise, data in the current chunks will be lost. See the operations guide.
  • Clustering works best with an orchestrator like Kubernetes. Metrictank itself does not automate master promotions. See clustering for more.
  • Only float64 values. Ints and bools are currently stored as floats (this works quite well thanks to the Gorilla compression).
  • Only uint32 Unix timestamps in second resolution. For higher resolution, consider streaming directly to Grafana.
  • We distribute data by hashing keys, like many similar systems. This means no data locality (data that is often used together may not live together).
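
The remark that integers stored as float64 compress well can be made concrete. Gorilla-style value compression XORs each value with its predecessor and encodes only the window of bits that differ. The sketch below is an illustration under that assumption, not Metrictank's encoder: `MeaningfulBits` is a hypothetical helper that reports how wide that window is for two consecutive values.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// MeaningfulBits returns the width of the bit window that differs between
// the IEEE 754 representations of prev and cur. A Gorilla-style encoder
// only has to store this window (plus a small header); identical values
// yield a width of zero.
func MeaningfulBits(prev, cur float64) int {
	xor := math.Float64bits(prev) ^ math.Float64bits(cur)
	if xor == 0 {
		return 0
	}
	return 64 - bits.LeadingZeros64(xor) - bits.TrailingZeros64(xor)
}

func main() {
	// Hypothetical integer-valued series, e.g. a counter stored as float64.
	series := []float64{100, 100, 101, 102, 104}
	for i := 1; i < len(series); i++ {
		fmt.Printf("%v -> %v: %d meaningful bits\n",
			series[i-1], series[i], MeaningfulBits(series[i-1], series[i]))
	}
}
```

Small integers occupy only the top few mantissa bits of a float64, so consecutive values tend to differ in a narrow window, which is why storing them as floats costs little after compression.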

Docs

installation, configuration and operation.

features in-depth

Other

Releases and versioning

  • releases and changelog

  • We aim to keep master stable and vet code before merging to master.

  • We're pre-1.0 but adopt semver for our 0.MAJOR.MINOR format. The rules are simple:

    • MAJOR version for incompatible API or functionality changes
    • MINOR version when you add functionality in a backwards-compatible manner

    We don't do patch level releases since minor releases are frequent enough.
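
As a rough illustration of the 0.MAJOR.MINOR scheme described above, the following Go sketch parses such a version string and checks whether an upgrade stays within the same MAJOR. The `Version` type and the `Parse` and `BackwardsCompatible` functions are hypothetical helpers, not part of Metrictank, and the version strings used are made-up examples.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Version holds the two meaningful components of a 0.MAJOR.MINOR string.
type Version struct {
	Major, Minor int
}

// Parse accepts strings such as "0.12.3" or "v0.12.3".
func Parse(s string) (Version, error) {
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 || parts[0] != "0" {
		return Version{}, fmt.Errorf("expected 0.MAJOR.MINOR, got %q", s)
	}
	major, err := strconv.Atoi(parts[1])
	if err != nil || major < 0 {
		return Version{}, fmt.Errorf("bad MAJOR in %q", s)
	}
	minor, err := strconv.Atoi(parts[2])
	if err != nil || minor < 0 {
		return Version{}, fmt.Errorf("bad MINOR in %q", s)
	}
	return Version{Major: major, Minor: minor}, nil
}

// BackwardsCompatible reports whether moving from one version to another
// keeps the same MAJOR and does not go back to an older MINOR.
func BackwardsCompatible(from, to Version) bool {
	return from.Major == to.Major && to.Minor >= from.Minor
}

func main() {
	from, _ := Parse("0.12.3") // made-up versions for illustration
	to, _ := Parse("0.12.7")
	next, _ := Parse("0.13.0")
	fmt.Println(BackwardsCompatible(from, to))   // true: same MAJOR, newer MINOR
	fmt.Println(BackwardsCompatible(from, next)) // false: MAJOR changed
}
```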

Copyright 2016-2019 Grafana Labs

This software is distributed under the terms of the GNU Affero General Public License.

Some specific packages have a different license:
