In this blog, we cover three open source BI tools that you can use to do SQL and reporting on Apache Cassandra. This blog is part 4 and wraps up our series on “Doing SQL and Reporting on Apache Cassandra with Open Source Tools”. Parts 1-3 are linked at the end of the blog if you want to check them out as well. A webinar is also embedded below if you want to watch a video version of this blog, along with a live demo using one of the tools we discuss!

The tools we will be discussing are Metabase, Redash, and Apache Superset.

Metabase

Metabase is an open source tool designed to be the simplest and fastest way to get business intelligence and analytics to everyone in your company.

Features

  • 5-minute setup
  • Fast install with Docker or JAR
  • Let anyone on your team ask questions without knowing SQL
  • Rich beautiful dashboards with auto-refresh and fullscreen
  • SQL Mode for analysts and data pros
  • Create canonical segments and metrics for your team to use
  • Send data to Slack or email on a schedule with Pulses
  • View data in Slack anytime with MetaBot
  • Humanize data for your team by renaming, annotating, and hiding fields
  • See changes in your data with alerts
Example Metabase Visualization
Example Metabase Dashboard
Visual joins and multiple aggregations and filtering steps give you the tools to dig deeper into your data.
Add variables to your queries to create interactive visualizations that users can tweak and explore.
Metabase Database Connector List

Metabase also lets you call its Query API directly from JavaScript, so you can integrate its analytics with your own application or third-party services to do things like:

  • Build moderation interfaces
  • Export subsets of your users to third-party marketing automation software
  • Provide a specialized customer lookup application for the people in your company
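
As a rough illustration of what such an integration involves, here is a minimal sketch in Python (the same HTTP calls can be made from JavaScript). It only builds the requests; it does not send them. It assumes a self-hosted instance at `http://localhost:3000`, a saved question with a hypothetical id of 1, and the `/api/session` and `/api/card/<id>/query` endpoints as we understand them from the Metabase API docs, so check the documentation for your version before relying on it.

```python
import json
from urllib.request import Request

# Assumption: a self-hosted Metabase instance reachable at this address.
METABASE_URL = "http://localhost:3000"


def build_login_request(username: str, password: str) -> Request:
    """Build POST /api/session, which is expected to return a session id."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return Request(
        f"{METABASE_URL}/api/session",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_card_query_request(session_id: str, card_id: int) -> Request:
    """Build POST /api/card/<id>/query, which runs a saved question."""
    return Request(
        f"{METABASE_URL}/api/card/{card_id}/query",
        data=b"{}",
        headers={
            "Content-Type": "application/json",
            # The session id from the login call is passed in this header.
            "X-Metabase-Session": session_id,
        },
        method="POST",
    )
```

To actually run a question, you would pass each request to `urllib.request.urlopen` and parse the JSON response; the login credentials and card id above are placeholders.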

Metabase offers a paid option hosted on its own cloud, but the self-hosted option is free and open source. Pricing can be seen here. In the video embedded below, there is a live demo where we quickly spin up Metabase on Docker and connect to Cassandra via Presto. If you are not familiar with Presto, you can check out Part 1 of the “Doing SQL and Reporting on Apache Cassandra with Open Source Tools” series here (also linked below in a list with parts 1-3).
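
If you have not read Part 1, the Presto side of that setup roughly comes down to a Cassandra catalog and fully qualified table names. The sketch below renders a minimal catalog file and builds a table reference. The host name `cassandra-host`, the keyspace `demo`, and the table `users` are hypothetical, and a real deployment will likely need more connector properties than the two shown.

```python
from typing import List


def cassandra_catalog_properties(contact_points: List[str]) -> str:
    """Render a minimal Presto catalog file, e.g. etc/catalog/cassandra.properties.

    Presto derives the catalog name from the file name, so a file named
    cassandra.properties is exposed as the catalog "cassandra".
    """
    lines = [
        "connector.name=cassandra",
        f"cassandra.contact-points={','.join(contact_points)}",
    ]
    return "\n".join(lines) + "\n"


def qualified_table(keyspace: str, table: str, catalog: str = "cassandra") -> str:
    """Presto addresses tables as catalog.schema.table; a Cassandra
    keyspace shows up as the schema."""
    return f"{catalog}.{keyspace}.{table}"


# Hypothetical keyspace and table names, for illustration only.
example_query = f"SELECT * FROM {qualified_table('demo', 'users')} LIMIT 10"
```

A query like `example_query` is what you would type into Metabase's SQL mode once Presto has been added as a database.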

Redash

Redash is an open source tool designed to enable anyone, regardless of their level of technical sophistication, to harness the power of data big and small. SQL users leverage Redash to explore, query, visualize, and share data from any data source. Their work in turn enables anybody in their organization to use the data.

Features

  • Browser-based: Everything in your browser, with a shareable URL.
  • Ease-of-use: Become immediately productive with data without the need to master complex software.
  • Query editor: Quickly compose SQL and NoSQL queries with a schema browser and auto-complete.
  • Visualization and dashboards: Create beautiful visualizations with drag and drop, and combine them into a single dashboard.
  • Sharing: Collaborate easily by sharing visualizations and their associated queries, enabling peer review of reports and queries.
  • Schedule refreshes: Automatically update your charts and dashboards at regular intervals you define.
  • Alerts: Define conditions and be alerted instantly when your data changes.
  • REST API: Everything that can be done in the UI is also available through the REST API.
  • Broad support for data sources: Extensible data source API with native support for a long list of common databases and platforms.
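
To give a feel for the REST API mentioned in the list above, here is a minimal sketch that builds (but does not send) a request for the latest results of a saved query. It assumes a self-hosted instance at `http://localhost:5000`, a hypothetical query id of 42, and authentication with an API key in the `Authorization: Key ...` header, which is how we understand the Redash API docs; verify against your Redash version.

```python
from urllib.request import Request

# Assumption: a self-hosted Redash instance reachable at this address.
REDASH_URL = "http://localhost:5000"


def build_results_request(query_id: int, api_key: str) -> Request:
    """Build GET /api/queries/<id>/results.json for a saved query."""
    return Request(
        f"{REDASH_URL}/api/queries/{query_id}/results.json",
        # Redash accepts a user or query API key in this header.
        headers={"Authorization": f"Key {api_key}"},
        method="GET",
    )
```

Passing the request to `urllib.request.urlopen` would return the results as JSON; the query id and key here are placeholders.
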
Demo of Redash
Databases Connections in Redash
Integrations Available for Redash
Example Redash Dashboard

Like Metabase, Redash offers a paid, hosted version, but it is also free to host on your own. If you want to host it yourself, you can check out this page, which shows how to quickly set up an instance of Redash using an AWS EC2 AMI, DigitalOcean, a Google Compute Engine image, or Docker.

Apache Superset

Apache Superset is an open source modern data exploration and visualization platform.

Features

  • An intuitive interface to explore and visualize datasets, and create interactive dashboards.
  • A wide array of beautiful visualizations to showcase your data.
  • Easy, code-free, user flows to drill down and slice and dice the data underlying exposed dashboards. The dashboards and charts act as a starting point for deeper analysis.
  • A state of the art SQL editor/IDE exposing a rich metadata browser, and an easy workflow to create visualizations out of any result set.
  • An extensible, high granularity security model allowing intricate rules on who can access which product features and datasets. Integration with major authentication backends (database, OpenID, LDAP, OAuth, REMOTE_USER, …)
  • A lightweight semantic layer that lets you control how data sources are exposed to the user by defining dimensions and metrics
  • Out of the box support for most SQL-speaking databases
  • Deep integration with Druid allows for Superset to stay blazing fast while slicing and dicing large, realtime datasets
  • Fast loading dashboards with configurable caching
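
Superset connects to databases through SQLAlchemy URIs, so pointing it at Cassandra via Presto mostly means writing the right URI. The helper below is a minimal sketch of that; it assumes the `presto://host:port/catalog/schema` form used by the PyHive Presto dialect, and the host, port, and keyspace values are hypothetical.

```python
from typing import Optional


def presto_sqlalchemy_uri(
    host: str, port: int, catalog: str, schema: Optional[str] = None
) -> str:
    """Build a SQLAlchemy URI for a Presto catalog.

    For Cassandra behind Presto, the catalog would be the Cassandra
    catalog and the optional schema would be a keyspace.
    """
    uri = f"presto://{host}:{port}/{catalog}"
    if schema:
        uri += f"/{schema}"
    return uri
```

For example, `presto_sqlalchemy_uri("presto-host", 8080, "cassandra", "demo")` gives a string you could paste into Superset's database connection form, provided a Presto driver is installed in the Superset environment.
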
Example Superset Dashboard
Superset Supported Databases
Superset SQL Editor
Superset Visualizations

Unlike Metabase and Redash, there is no paid service or hosting for Apache Superset, so you will need to figure out where to host it. Superset can be set up using Docker, and more on getting started with Superset can be found here. A word of caution: Superset is not officially supported on Windows, unfortunately. The best option for Windows users who want to try out Superset locally is to install an Ubuntu Desktop VM via VirtualBox and follow the Docker on Linux instructions inside that VM.

And that wraps up our series on “Doing SQL and Reporting on Apache Cassandra with Open Source Tools”. As mentioned above, a video version of this blog is embedded below, which includes a live demo of using Metabase with Presto to connect to Cassandra. Parts 1-3 of this series are also linked below!

Doing SQL and Reporting on Apache Cassandra with Open Source Tools

  1. Presto and Cassandra
  2. Spark and Cassandra
  3. Open Source Notebooks and Cassandra
  4. Open Source BI Tools and Cassandra

Cassandra.Link is a knowledge base that we created for all things Apache Cassandra. Our goal with Cassandra.Link was to not only fill the gap of Planet Cassandra, but to bring the Cassandra community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.

We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!