10/31/2017

Spark Cassandra Stress

by John Doe

A tool for testing the DataStax Spark Cassandra Connector against both Apache Cassandra (TM) and DataStax Enterprise (DSE) with either bundled libraries from DSE, Maven, or the connector built from source!

Building

This project is built using Gradle and can be built in three ways:

  1. Using the Connector and Spark libraries installed by DataStax Enterprise
  2. Using libraries downloaded from Maven
  3. Using libraries assembled fresh from your local Spark Cassandra Connector repository

The jar can be built using:

./gradlew jar -Pagainst=[type]

where type is one of dse, maven, or source.
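
As a minimal sketch of the three build variants, the helper below validates the build type and prints the matching Gradle command instead of running it. The function name gradle_build_cmd is hypothetical and not part of the project; only the command line and the three type values come from the text above.

```shell
# Hypothetical helper (not part of the project): print the Gradle command
# for a given build type, rejecting anything other than dse, maven, or source.
gradle_build_cmd() {
  case "$1" in
    dse|maven|source)
      printf './gradlew jar -Pagainst=%s\n' "$1"
      ;;
    *)
      printf 'unknown build type: %s (expected dse, maven, or source)\n' "$1" >&2
      return 1
      ;;
  esac
}
```

For example, gradle_build_cmd maven prints ./gradlew jar -Pagainst=maven, which can then be run from the project root.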

DSE Options

DSE libraries are located by looking for the DSE installation on your machine. Set the environment variables DSE_HOME and DSE_RESOURCES if your installation differs from the defaults.

Defaults

DSE_HOME=$HOME/dse
DSE_RESOURCES=$HOME/dse/resources
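
Before building against DSE, it can help to confirm that both directories exist. The sketch below is a hypothetical pre-flight check, not something the project ships. It falls back to the defaults listed above when the variables are unset, and it assumes the build needs both paths to be directories.

```shell
# Hypothetical pre-flight check (not part of the project): verify that the
# DSE directories exist, using the documented defaults when the variables are
# unset. Variables are global because POSIX sh has no "local".
check_dse_paths() {
  dse_home="${DSE_HOME:-$HOME/dse}"
  dse_resources="${DSE_RESOURCES:-$HOME/dse/resources}"
  rc=0
  for dir in "$dse_home" "$dse_resources"; do
    if [ ! -d "$dir" ]; then
      printf 'missing directory: %s\n' "$dir" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

A typical use would be check_dse_paths && ./gradlew jar -Pagainst=dse, so the build only starts when both locations are present.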

Maven Options

When getting libraries from Maven, you need to specify the Connector and Spark versions to compile against. Set the environment variables CONNECTOR_VERSION and SPARK_VERSION to the artifact versions you would like to use.

Defaults

CONNECTOR_VERSION=1.2.0-rc2
SPARK_VERSION=1.2.1
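
To see which versions a Maven build would pick up from the current shell, a small helper can print the effective values. This is only a sketch: maven_build_versions is a hypothetical name, and the fallbacks simply repeat the defaults listed above rather than reading them from the Gradle build.

```shell
# Hypothetical helper (not part of the project): show the Connector and Spark
# versions in effect, falling back to the documented defaults when unset.
maven_build_versions() {
  printf 'connector=%s spark=%s\n' \
    "${CONNECTOR_VERSION:-1.2.0-rc2}" \
    "${SPARK_VERSION:-1.2.1}"
}
```

To override the defaults for a single build, the variables can be set on the command line, as in CONNECTOR_VERSION=<connector-version> SPARK_VERSION=<spark-version> ./gradlew jar -Pagainst=maven, with the placeholders replaced by the versions you need.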

Source Options

Gradle will attempt to clean and build the assembly jar for the Spark Cassandra Connector, looking for the repository at the path given in the environment variable SPARKCC_HOME. This builds whatever commit the connector repository currently has checked out.

Default

SPARKCC_HOME=$HOME/repos/spark-cassandra-connector/
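
Because a source build uses whatever commit is checked out, it is worth checking that commit first. The sketch below assumes SPARKCC_HOME points at a Git working tree and uses standard git commands. The function name connector_commit is hypothetical and not part of the project.

```shell
# Hypothetical helper (not part of the project): print the short hash and
# subject of the commit currently checked out in the connector repository.
connector_commit() {
  repo="${SPARKCC_HOME:-$HOME/repos/spark-cassandra-connector/}"
  if ! git -C "$repo" rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    printf 'not a git repository: %s\n' "$repo" >&2
    return 1
  fi
  git -C "$repo" log -1 --format='%h %s'
}
```

Running connector_commit before ./gradlew jar -Pagainst=source shows which connector revision the stress jar will be compiled against.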

Running

Many options can be used to configure a run of Spark Cassandra Stress, but the two main invocations use either dse spark-submit or spark-submit.

See the run.sh script for examples or use it to launch the program.

./run.sh [dse|apache] --help

This brings up the built-in help.
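
The relationship between the run.sh mode and the two invocations can be sketched as follows. The mapping of dse to dse spark-submit and apache to spark-submit is an assumption drawn from the description above, not a reading of run.sh itself, and submit_command is a hypothetical name.

```shell
# Hypothetical helper (not part of the project): print the submit command
# assumed to correspond to each run.sh mode.
submit_command() {
  case "$1" in
    dse)    printf 'dse spark-submit\n' ;;
    apache) printf 'spark-submit\n' ;;
    *)
      printf 'usage: submit_command dse|apache\n' >&2
      return 1
      ;;
  esac
}
```

The run.sh script remains the authoritative source for the exact arguments passed to either command.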

License

Copyright 2014-17, DataStax, Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
