Cassandra.Link

10/31/2017

trireme

by John Doe

Trireme is a tool providing migration support for Apache Cassandra, DataStax Enterprise Cassandra & Solr. Commands are run using the Python Invoke CLI tool.

System Dependencies

  • cqlsh must be on the PATH. Some tasks utilize the cqlsh tool for running scripts and dumping schemas to disk.

Integration

To use this tool within your project, follow these steps:

  1. Install trireme with pip install trireme

  2. Create a tasks.py file with the following content:

    from invoke import Collection
    from trireme import trireme
    namespace = Collection(trireme)
  3. Create a trireme_config.py file with your Cassandra and Solr information.

    # Cassandra Configuration
    # Contact points for your cluster, currently only the first is used
    contact_points = ['127.0.0.1']
    # Keyspace to work with, this doesn't have to exist yet.
    keyspace = 'foo'
    # Replication options. Defined as a map just as you would in CQL.
    replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }
    # replication = {'class' : 'NetworkTopologyStrategy', 'dc1' : 3, 'dc2' : 2}
    # Authentication Information
    username = None
    password = None
    # Flag indicating whether this host is the migration master. Migrations are only run on the migration master
    migration_master = True
    # Solr Configuration
    solr_url = 'http://127.0.0.1:8983/solr'
  4. Run the trireme setup task to create the basic directories

    inv trireme.setup
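
The replication setting in trireme_config.py is described as a map "just as you would in CQL". As a rough illustration of that correspondence, the sketch below renders such a map into a CREATE KEYSPACE statement. The function name create_keyspace_cql is hypothetical and this is not trireme's actual implementation; it only shows how the config values line up with CQL syntax.

```python
def create_keyspace_cql(keyspace, replication):
    """Render a CREATE KEYSPACE statement from a replication map.

    Hypothetical helper for illustration; not part of trireme.
    """
    rendered = []
    for key, value in replication.items():
        # String values (such as the strategy class) are quoted in CQL,
        # numeric values (replication factors) are not.
        if isinstance(value, str):
            rendered.append("'{}': '{}'".format(key, value))
        else:
            rendered.append("'{}': {}".format(key, value))
    return "CREATE KEYSPACE IF NOT EXISTS {} WITH replication = {{{}}};".format(
        keyspace, ", ".join(rendered)
    )


if __name__ == "__main__":
    # Values taken from the sample trireme_config.py above.
    replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
    print(create_keyspace_cql('foo', replication))
```

The commented-out NetworkTopologyStrategy map from the sample config would render the same way, with one entry per data center.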

Usage

Migrators contain the logic to run migrations. This project contains migrators for simple CQL scripts and for Solr core configuration files. Not every migrator supports every action. For example, the Solr included in DSE 4.6 cannot delete cores, so the Solr migrator has no task named drop.

List all commands:

inv -l

To list optional parameters for a command:

inv --help command.name

Example: inv --help cassandra.add_migration

Cassandra

Note: This feature works with both Apache Cassandra and DataStax Enterprise

Actions supported:

  • cassandra.create - Creates the keyspace along with a table to track migrations. Note: The default replication strategy is SimpleStrategy with a Replication Factor of 3.
  • cassandra.drop - Drops the keyspace.
  • cassandra.migrate - Runs all missing migrations against the keyspace.
  • cassandra.add_migration --name migration_name - Generates a migration file under db/migrations with the current timestamp prepended to the provided --name value.
  • cassandra.dump_schema - Dumps the current schema to db/schema.cql. Use this feature with care when Solr is enabled.
  • cassandra.load_schema - Loads the schema from db/schema.cql. This may be faster than running all migrations in a project.

Examples:

  • inv cassandra.create cassandra.migrate - Creates the keyspace and runs all migrations
  • inv cassandra.load_schema - Creates the keyspace and loads the schema from db/schema.cql
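
The idea behind "runs all missing migrations" can be sketched as a set difference between the files on disk and the migrations already recorded. The sketch below is an assumption about the general approach, not trireme's code: the function name pending_migrations is hypothetical, and the set of applied names stands in for whatever the migration tracking table stores.

```python
import os
import tempfile


def pending_migrations(migrations_dir, applied):
    """Return .cql files not yet in `applied`, in filename order.

    Hypothetical helper: `applied` stands in for the names recorded
    in the migration tracking table.
    """
    candidates = sorted(
        name for name in os.listdir(migrations_dir) if name.endswith(".cql")
    )
    return [name for name in candidates if name not in applied]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for name in (
            "201501301409_create_users_table.cql",
            "201501301623_create_tweets_table.cql",
        ):
            open(os.path.join(root, name), "w").close()
        already_applied = {"201501301409_create_users_table.cql"}
        print(pending_migrations(root, already_applied))
```

Because the timestamp prefix has a fixed width, sorting by filename also sorts the migrations chronologically.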

Solr

Note: This feature only works with DataStax Enterprise

Actions supported:

  • solr.create [--core foo.bar] - Uploads the core configuration files and calls the CREATE API endpoint
  • solr.migrate [--core foo.bar] - Uploads the core configuration files and calls the RELOAD API endpoint
  • solr.add_core --name foo.bar - Creates a core configuration directory and files. Use the format keyspace.table_name when naming your cores.

Example: inv solr.create - Uploads all core configuration files and calls the create core API endpoint.

solr.create and solr.migrate support the --core core.name flag. This will run the task against only one core instead of all cores. Remember the core name in DSE Solr is keyspace.table_name.

Directory Layout

db/migrations

CQL migration files generated by the cassandra.add_migration command will be placed in this directory with a timestamp prepended.

Example directory layout:

db/
  migrations/
    201501301409_create_users_table.cql
    201501301623_create_tweets_table.cql
    ...
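
The file names in the layout above appear to follow a YYYYMMDDHHMM prefix followed by the --name value. A minimal sketch of that naming scheme, assuming this format is what the tool uses (the helper migration_filename is hypothetical, and whether trireme uses local time or UTC is not stated in the source):

```python
from datetime import datetime


def migration_filename(name, now=None):
    """Build a migration file name with a YYYYMMDDHHMM prefix.

    Hypothetical helper; the timestamp format is inferred from the
    example file names, not taken from trireme's source.
    """
    now = now or datetime.now()
    return "{}_{}.cql".format(now.strftime("%Y%m%d%H%M"), name)


if __name__ == "__main__":
    print(migration_filename("create_users_table", datetime(2015, 1, 30, 14, 9)))
```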

db/solr

Folder containing all the Solr core configuration files. With DataStax Enterprise, the core name is composed of the keyspace and table name in the format keyspace.table_name. This directory holds one sub-directory per core, and each sub-directory contains the schema.xml and solrconfig.xml files needed to configure that core.

Example directory layout:

db/
  solr/
    example_keyspace.a_table/
      schema.xml
      solrconfig.xml
    example_keyspace.b_table/
      schema.xml
      solrconfig.xml
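
As an illustration of how this layout could be consumed, the sketch below walks db/solr, splits each directory name into keyspace and table, and checks that both configuration files are present. The function find_cores and its only parameter (mirroring the --core flag) are hypothetical; trireme's own discovery logic may differ.

```python
import os
import tempfile

REQUIRED_FILES = ("schema.xml", "solrconfig.xml")


def find_cores(solr_dir, only=None):
    """Map core names (keyspace.table_name) to (keyspace, table) pairs.

    Hypothetical helper. `only` restricts the result to a single core,
    similar in spirit to the --core flag.
    """
    cores = {}
    for entry in sorted(os.listdir(solr_dir)):
        path = os.path.join(solr_dir, entry)
        if not os.path.isdir(path):
            continue
        if only is not None and entry != only:
            continue
        keyspace, sep, table = entry.partition(".")
        if not (sep and keyspace and table):
            raise ValueError("core name must be keyspace.table_name: " + entry)
        missing = [
            name for name in REQUIRED_FILES
            if not os.path.isfile(os.path.join(path, name))
        ]
        if missing:
            raise ValueError("{} is missing {}".format(entry, ", ".join(missing)))
        cores[entry] = (keyspace, table)
    return cores


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for core in ("example_keyspace.a_table", "example_keyspace.b_table"):
            os.makedirs(os.path.join(root, core))
            for name in REQUIRED_FILES:
                open(os.path.join(root, core, name), "w").close()
        print(find_cores(root))
        print(find_cores(root, only="example_keyspace.b_table"))
```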

Trireme Project Layout

trireme/
  migrators/
    cassandra.py
    solr.py
  trireme.py

migrators

The code that powers the migration engine. Each migrator lives in its own file and provides Invoke tasks.

trireme.py

Collects all of the Invoke tasks into a common namespace, along with a simple setup task.

Python Dependencies

All required items have been specified in requirements.txt and setup.py. Select items are outlined below.

  • cassandra-driver - DataStax driver for connecting with Cassandra, used when creating and dropping keyspaces
  • requests - HTTP Client, used when communicating with the Solr APIs
  • invoke - Task execution tool & library. This is used to run the exposed migration tasks

Extending Trireme

To add a new migrator, place its code, with its tasks marked using Invoke's task annotations, in a file within the migrators directory. Next, add your migrator to the Collection entry in trireme.py. If you create a new migrator and would like to share it with the community, please fork the repo, add your migrator, and then open a pull request.
