1/10/2017

LukeTillman/dse-docker

by John Doe


README.md

DataStax has started offering official Docker images for development environments, which is great news! I was fortunate to be involved in the planning and creation of those images, so if you were using this image before, transitioning to the new official images should involve minimal or no changes. This repository is now archived and not actively maintained. I encourage you to check out the new official Docker images and open issues/PRs there to help us improve them.

Thanks to everyone who helped make this image better by opening issues and PRs. And thanks to everyone who used this image, which ultimately played a small role in encouraging DataStax to release official images of their own.



A Docker image for DataStax Enterprise. Please use the GitHub repository for opening issues.

Usage

Note: Not meant for production use. See this whitepaper on DataStax.com for details on setting up DSE and Docker in production.

Starting DSE

By default, this image will start DSE in Cassandra-only mode. For example:

docker run --name my-dse -d luketillman/datastax-enterprise:TAG

You should replace TAG in all of these examples with the version of DSE you are trying to start. (See the Docker Hub tags for a list of available versions.)

The image's entrypoint script runs the command dse cassandra and will append any switches you provide to that command. So it's possible to start DSE in any of the other supported modes by adding switches to the end of your docker run command.

Example: Start a Graph Node

docker run --name my-dse -d luketillman/datastax-enterprise:TAG -g

In the container, this will run dse cassandra -g to start a graph node.

Example: Start an Analytics (Spark) Node

docker run --name my-dse -d luketillman/datastax-enterprise:TAG -k

In the container, this will run dse cassandra -k to start an analytics node.

Example: Start a Search Node

docker run --name my-dse -d luketillman/datastax-enterprise:TAG -s

In the container, this will run dse cassandra -s to start a search node.

You can also use combinations of those switches. For more examples, see the Starting DSE documentation.
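As a minimal sketch of combining switches, the helper below wraps the docker run command from the examples above and forwards any extra arguments to the image's entrypoint. The function name start_dse is hypothetical and not part of this image; it only assumes the entrypoint behaviour described above.

```shell
# Hypothetical helper (not part of this image): start a container named
# my-dse from the given tag and forward any extra switches to the entrypoint.
# Usage: start_dse TAG [switches...]
start_dse() {
  tag="$1"
  shift
  docker run --name my-dse -d "luketillman/datastax-enterprise:${tag}" "$@"
}
```

For example, start_dse TAG -k -s should run dse cassandra -k -s in the container, which starts a node with both analytics and search enabled.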

Exposing Ports on the Host

Chances are you'll want to expose some ports on the host so that you can talk to DSE from outside of Docker (for example, from code running on your local machine). You can do that using the -p switch when calling docker run. The port you'll most likely want to expose is 9042, which is the port CQL clients connect to. For example:

docker run --name my-dse -d -p 9042:9042 luketillman/datastax-enterprise:TAG

This will expose the container's CQL client port (9042) on the host at port 9042. For a list of the ports used by DSE, see the Securing DataStax Enterprise ports documentation.
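The host side of the -p mapping does not have to match the container port. A small sketch, assuming port 9042 is already taken on your machine and you want to pick another host port (the function name start_dse_on_port is hypothetical):

```shell
# Hypothetical helper: publish the container's CQL port (9042) on a host
# port of your choosing. Docker's -p switch takes HOST_PORT:CONTAINER_PORT.
# Usage: start_dse_on_port HOST_PORT TAG
start_dse_on_port() {
  host_port="$1"
  tag="$2"
  docker run --name my-dse -d -p "${host_port}:9042" "luketillman/datastax-enterprise:${tag}"
}
```

After start_dse_on_port 19042 TAG, CQL clients on the host would connect to port 19042 instead of 9042.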

Starting Related Tools

With a node running, use docker exec to run other tools. For example, the nodetool status command:

docker exec -it my-dse nodetool status

Or to connect with cqlsh:

docker exec -it my-dse cqlsh
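For scripting, it can be handy to run a single CQL statement without opening an interactive shell. The sketch below assumes a running container named my-dse and uses cqlsh's -e option to execute one statement; the function name dse_cql is hypothetical.

```shell
# Hypothetical helper: run one CQL statement in the my-dse container and exit.
# No -it here because the command is non-interactive.
# Usage: dse_cql "CQL STATEMENT"
dse_cql() {
  docker exec my-dse cqlsh -e "$1"
}
```

For example, dse_cql "SELECT release_version FROM system.local" asks the node which Cassandra version it reports.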

Environment Variables

The following environment variables can be set at runtime to override the corresponding settings in the cassandra.yaml configuration file:

  • LISTEN_ADDRESS: The IP address to listen for connections from other nodes. Defaults to the hostname's IP address.
  • BROADCAST_ADDRESS: The IP address to advertise to other nodes. Defaults to the same value as the LISTEN_ADDRESS.
  • RPC_ADDRESS: The IP address to listen for client/driver connections. Defaults to 0.0.0.0 (i.e. wildcard).
  • BROADCAST_RPC_ADDRESS: The IP address to advertise to clients/drivers. Defaults to the same value as the BROADCAST_ADDRESS.
  • SEEDS: The comma-delimited list of seed nodes for the cluster. Defaults to this node's BROADCAST_ADDRESS if not set. The value is only applied the first time the node is started.
  • START_RPC: Whether to start the Thrift RPC server. Will leave the default in the cassandra.yaml file if not set.
  • CLUSTER_NAME: The name of the cluster. Will leave the default in the cassandra.yaml file if not set.
  • NUM_TOKENS: The number of tokens randomly assigned to this node. Will leave the default in the cassandra.yaml file if not set.

If you need more advanced control over the configuration, the configuration files are exposed as a volume (under /opt/dse/resources), which allows you to mount a more customized configuration file from the host. See below for more information on volumes.
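To illustrate how these variables are passed, here is a rough sketch of joining a second node to an existing one by setting SEEDS with docker run's -e switch. It assumes both containers sit on the same Docker network so they can reach each other by IP; the function names container_ip and join_node are hypothetical.

```shell
# Hypothetical helper: print the IP address Docker assigned to a container.
container_ip() {
  docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$1"
}

# Hypothetical helper: start a new node that uses an existing container as
# its seed. Assumes both containers share a Docker network.
# Usage: join_node NEW_NAME SEED_CONTAINER TAG
join_node() {
  seed_ip="$(container_ip "$2")"
  docker run --name "$1" -d -e SEEDS="${seed_ip}" "luketillman/datastax-enterprise:$3"
}
```

With a first node running as my-dse, join_node my-dse-2 my-dse TAG would start a second container whose SEEDS value points at the first. Other variables, such as CLUSTER_NAME, can be passed with additional -e switches in the same way.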

Volumes

The following volumes are created and can be mounted to the host system:

  • /var/lib/cassandra: Data from Cassandra
  • /var/lib/spark: Data from DSE Analytics w/ Spark
  • /var/lib/dsefs: Data from DSEFS
  • /var/log/cassandra: Logs from Cassandra
  • /var/log/spark: Logs from Spark
  • /opt/dse/resources: Most configuration files including cassandra.yaml, dse.yaml, and more can be found here.
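As a sketch of mounting one of these volumes, the helper below bind-mounts a host directory over /var/lib/cassandra so the data outlives the container. The function name is hypothetical, and the host directory is assumed to be an absolute path that the container's DSE process is able to write to.

```shell
# Hypothetical helper: keep Cassandra data in a host directory by mounting
# it over the image's /var/lib/cassandra volume.
# Usage: start_dse_with_data_dir /absolute/host/dir TAG
start_dse_with_data_dir() {
  host_dir="$1"
  tag="$2"
  docker run --name my-dse -d -v "${host_dir}:/var/lib/cassandra" "luketillman/datastax-enterprise:${tag}"
}
```

The same -v HOST_PATH:CONTAINER_PATH pattern applies to the other volumes listed above.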

Logging

You can view logs via Docker's container logs:

docker logs my-dse
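While a node is starting up, it is often more useful to follow the log than to dump all of it. A small sketch using docker logs' -f and --tail options (the function name follow_dse_logs and its default of 100 lines are arbitrary choices for illustration):

```shell
# Hypothetical helper: follow the my-dse container's log, starting from the
# last N lines (100 if no count is given). Stop following with Ctrl-C.
# Usage: follow_dse_logs [LINES]
follow_dse_logs() {
  docker logs -f --tail "${1:-100}" my-dse
}
```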

Builds

Build and publish scripts are available in the build folder of the repository. All those scripts are meant to be run from the root of the repository. For example:

> ./build/docker-build.sh

Because DSE requires credentials to download, the build needs some way to access those credentials without baking them into the final image and exposing them. Since Docker doesn't currently support build-time secrets, I had to come up with a "creative" (read: hacky) workaround that grabs those credentials from a local HTTP server during the build and then removes them after DSE has been downloaded. See Issue 8 and the files in the srv directory for more details.

Continuous integration builds are handled by Travis.
