8/21/2020

Cassandra Lunch #15: Cassandra Backup / Restoration - Business Platform Team

by John Doe

In Cassandra Lunch #15, we discuss Cassandra backup and restoration: disaster avoidance, disaster recovery, and the tools you can use to back up and restore your Cassandra data. We also walk through an example scenario showing how one team set up multi-node clusters and how they handle backup and restoration.

Strategy

Our backup / restoration strategy covers three areas: disaster avoidance, disaster recovery, and tools for Cassandra backup and restoration.

Disaster Avoidance

For disaster avoidance, we discuss strategies for running multi-datacenter, multi-region, and/or multi-cloud clusters. We also discuss examples using AWS and Google, which are covered in more depth in the video linked below.
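As a rough illustration of the multi-datacenter idea, a keyspace can be told to keep replicas in more than one datacenter by using `NetworkTopologyStrategy`. The sketch below only builds the CQL statement as a string. The keyspace name `orders` and the datacenter names `aws_us_east` and `gcp_us_central` are hypothetical; real datacenter names must match what the cluster's snitch reports.

```python
from typing import Mapping


def create_keyspace_cql(keyspace: str, replicas_per_dc: Mapping[str, int]) -> str:
    """Build a CREATE KEYSPACE statement that places replicas in each listed datacenter."""
    if not replicas_per_dc:
        raise ValueError("at least one datacenter is required")
    # Sort the datacenters so the generated statement is deterministic.
    options = ", ".join(
        f"'{dc}': {count}" for dc, count in sorted(replicas_per_dc.items())
    )
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{'class': 'NetworkTopologyStrategy', {options}}};"
    )


if __name__ == "__main__":
    # Hypothetical names: one datacenter in AWS, one in Google Cloud.
    print(create_keyspace_cql("orders", {"aws_us_east": 3, "gcp_us_central": 3}))
```

With three replicas in each datacenter, losing an entire region still leaves a full set of replicas in the other one, which is the point of disaster avoidance as opposed to recovery.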

Disaster Recovery

We discussed three methods of disaster recovery, listed below. A more in-depth explanation can be found in the video linked below.

  1. Cassandra Backup / Restore
    • Single node
      • Snapshot + Restore
    • Multi-node
      • Snapshot + Restore (same size cluster vs different sized cluster)
  2. Cloud IaaS Snapshot Backup / Restore
    • AWS EBS
  3. Cassandra Backup + Upload to Distributed Filesystem (S3)
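The first and third methods above can be sketched in a few lines. This is a minimal sketch, not a complete backup tool: it builds the `nodetool snapshot` command for one keyspace, finds the resulting snapshot directories under the data directory (assuming the usual `<data_dir>/<keyspace>/<table>-<id>/snapshots/<tag>/` layout), and maps each file to an object-store key. The key layout, the node name `node1`, and the tag `lunch15` are assumptions made up for illustration; the actual upload step is left out.

```python
import tempfile
from pathlib import Path
from typing import List


def snapshot_command(keyspace: str, tag: str) -> List[str]:
    """Return the nodetool invocation that snapshots one keyspace under a tag."""
    return ["nodetool", "snapshot", "-t", tag, keyspace]


def find_snapshot_dirs(data_dir: Path, keyspace: str, tag: str) -> List[Path]:
    """List the per-table directories that hold the files of one snapshot."""
    pattern = f"*/snapshots/{tag}"
    return sorted(p for p in (data_dir / keyspace).glob(pattern) if p.is_dir())


def object_key(data_dir: Path, node: str, sstable_file: Path) -> str:
    """Map a snapshot file to a hypothetical object-store key, prefixed by node."""
    return f"{node}/{sstable_file.relative_to(data_dir).as_posix()}"


if __name__ == "__main__":
    # Build a fake data directory so the sketch runs without a Cassandra node.
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp)
        snap = data_dir / "orders" / "by_id-abc123" / "snapshots" / "lunch15"
        snap.mkdir(parents=True)
        (snap / "nb-1-big-Data.db").write_bytes(b"placeholder")

        print(" ".join(snapshot_command("orders", "lunch15")))
        for directory in find_snapshot_dirs(data_dir, "orders", "lunch15"):
            for f in sorted(directory.iterdir()):
                print(object_key(data_dir, "node1", f))
```

Keeping the node name in the key matters for restores: on a same-size cluster each node's files go back to the node that owns the same tokens, while a different-sized cluster typically needs the data streamed back in (for example with `sstableloader`) so that rows land on their new owners.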

Tools for Backup / Restoration

We also covered several tools that can be used for backup and restoration. Each is discussed in more depth in the video linked below.

  • DataStax OpsCenter
    • Automated data synchronization
    • Full and continuous backups
    • Seamless enterprise integration
    • Simplified upgrades
    • End-to-end performance visibility
    • Comprehensive cluster health management
  • Tablesnap
    • Uses inotify to monitor Cassandra SSTables and upload them to AWS S3
  • Cassandra Medusa
    • Medusa is an Apache Cassandra backup system.
    • Medusa is a command-line tool that offers the following features:
      • single node backup
      • single node restore
      • cluster-wide in-place restore
      • cluster-wide remote restore
      • backup purge
      • support for local storage, GCS, AWS S3, and others through Apache Libcloud
      • support for clusters using single tokens or vnodes
      • full or incremental backups
    • Currently does not support:
      • Cassandra deployments with multiple data folder directories
  • Cassandra-Backup
    • Backup utility and library for Apache Cassandra
    • The tool is able to perform these operations:
      • backup of SSTables
      • restore of SSTables
      • backup of commit logs
      • restore of commit logs
  • Rubrik Mosaic (Datos.io)
    • Simplifies protection and data management of MongoDB, DataStax Enterprise, and Cassandra while assuring application availability.
    • Achieves significant storage savings with incremental-forever backup and semantic deduplication.
    • Mosaic's always-consistent backup speeds recovery and lets you start using the application during recovery.
    • Mosaic runs cloud-native, on-premises, or both.
    • Mosaic reduces multiple NoSQL replicas into a single always-consistent copy and stores the backup on any cloud.
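Several of the tools above advertise incremental backups. The idea behind them can be sketched simply: SSTable files are immutable once written, so a backup only needs to upload the files that are not already in the backup store. The function below is an illustrative sketch of that comparison, not the implementation used by any of the tools listed; the file names and sizes are made up.

```python
from typing import Dict, Set


def files_to_upload(local: Dict[str, int], already_stored: Dict[str, int]) -> Set[str]:
    """Return the SSTable files missing from the backup store.

    Both arguments map a file name to its size in bytes. A file is uploaded
    when the store has no entry for it or the stored size differs.
    """
    return {
        name for name, size in local.items()
        if already_stored.get(name) != size
    }


if __name__ == "__main__":
    # Hypothetical state: one SSTable was uploaded by the previous backup.
    local = {"nb-1-big-Data.db": 100, "nb-2-big-Data.db": 250}
    stored = {"nb-1-big-Data.db": 100}
    print(sorted(files_to_upload(local, stored)))
```

A real tool would usually compare checksums rather than sizes and would also record a manifest per backup, so that a restore knows which files belong together.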

Example Scenario

We also discussed an example scenario showing how one team set up multi-node clusters and how they handle data backup and restoration. This example is discussed in more depth in the video linked below.

Cassandra Lunch #15

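Additional Resources

  • Documentation: https://cassandra.apache.org/doc/latest/operating/backups.html
  • Cassandra Backup and Restore – Backup in AWS using EBS Volumes: https://thelastpickle.com/blog/2018/04/03/cassandra-backup-and-restore-aws-ebs.html
  • Create and restore Cassandra backups: https://docs.bitnami.com/google-templates/infrastructure/cassandra/administration/backup-restore/
  • How To Backup And Restore A Cassandra Keyspace In Linux: https://www.youtube.com/watch?v=Wnn1QWCG9AI
  • Cassandra | Apache Cassandra | Cassandra Backup | Cassandra Backup and Restore: https://www.youtube.com/watch?v=Uw1hez8Ry7c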

Cassandra.Link

Cassandra.Link is a knowledge base that we created for all things Apache Cassandra. Our goal with Cassandra.Link was to not only fill the gap of Planet Cassandra, but to bring the Cassandra community together. Feel free to reach out if you wish to collaborate with us on this project in any capacity.

We are a technology company that specializes in building business platforms. If you have any questions about the tools discussed in this post or about any of our services, feel free to send us an email!