Cassandra.Link

The best knowledge base on Apache Cassandra®

Helping platform leaders, architects, engineers, and operators build scalable real-time data platforms.

Explore our top articles on Cassandra

Welcome to our article directory on Cassandra database! Here you can find a wide range of informative articles that cover various aspects of Cassandra database management, administration, optimization, and best practices.

GitHub - ibagroup-eu/Visual-Flow: Visual-Flow main repository

Visual Flow is an ETL tool designed for effective data manipulation via a convenient and user-friendly interface. The tool has the following capabilities:

- Can integrate data from heterogeneous sources: AWS S3, Cassandra, Click House, DB2, Dataframe (for reading), Elastic Search, IBM COS, Kafka, Local File, MS SQL, Mongo, MySQL/Maria, Oracle, PostgreSQL, Redis, Redshift
- Leverage direct connectivity to enterprise applications as sources and targets
- Perform data processing and transformation
- Run custom code
- Leverage metadata for analysis and maintenance

The Visual Flow application is divided into several repositories; check the official guide. Visual Flow is open-source software licensed under the Apache-2.0 license.

['ibagroup...

12/2/2024

GitHub - datastax/cql-proxy: A client-side CQL proxy/sidecar.

cql-proxy is designed to forward your application's CQL traffic to an appropriate database service. It listens on a local address and securely forwards that traffic.

The cql-proxy sidecar enables unsupported CQL drivers to work with DataStax Astra. These drivers include both legacy DataStax drivers and community-maintained CQL drivers, such as the gocql driver and the rust-driver.

cql-proxy also enables applications that are currently using Apache Cassandra or DataStax Enterprise (DSE) to use Astra without requiring any code changes. Your application just needs to be configured to use the proxy.

If you're building a new application using DataStax drivers, cql-proxy is not required, as the drivers can communicate directly with Astra. DataStax drivers have excellent support for Astra out-of-the-box, and are well-documented in the driver guide.

Use the -h or --help flag to display a listing of all flags and their corresponding descriptions and environment variables (shown below as items starting with $):

    $ ./cql-proxy -h
    Usage: cql-proxy

    Flags:
      -h, --help  Show context-sensitive help.
      -b, --astra-bundle=STRING  Path to secure connect bundle for an Astra database. Requires '--username' and '--password'. Ignored if using the token or contact points option ($ASTRA_BUNDLE).
      -t, --astra-token=STRING  Token used to authenticate to an Astra database. Requires '--astra-database-id'. Ignored if using the bundle path or contact points option ($ASTRA_TOKEN).
      -i, --astra-database-id=STRING  Database ID of the Astra database. Requires '--astra-token' ($ASTRA_DATABASE_ID)
          --astra-api-url="https://api.astra.datastax.com"  URL for the Astra API ($ASTRA_API_URL)
          --astra-timeout=10s  Timeout for contacting Astra when retrieving the bundle and metadata ($ASTRA_TIMEOUT)
      -c, --contact-points=CONTACT-POINTS,...  Contact points for cluster. Ignored if using the bundle path or token option ($CONTACT_POINTS).
      -u, --username=STRING  Username to use for authentication ($USERNAME)
      -p, --password=STRING  Password to use for authentication ($PASSWORD)
      -r, --port=9042  Default port to use when connecting to cluster ($PORT)
      -n, --protocol-version="v4"  Initial protocol version to use when connecting to the backend cluster (default: v4, options: v3, v4, v5, DSEv1, DSEv2) ($PROTOCOL_VERSION)
      -m, --max-protocol-version="v4"  Max protocol version supported by the backend cluster (default: v4, options: v3, v4, v5, DSEv1, DSEv2) ($MAX_PROTOCOL_VERSION)
      -a, --bind=":9042"  Address to use to bind server ($BIND)
      -f, --config=CONFIG  YAML configuration file ($CONFIG_FILE)
          --debug  Show debug logging ($DEBUG)
          --health-check  Enable liveness and readiness checks ($HEALTH_CHECK)
          --http-bind=":8000"  Address to use to bind HTTP server used for health checks ($HTTP_BIND)
          --heartbeat-interval=30s  Interval between performing heartbeats to the cluster ($HEARTBEAT_INTERVAL)
          --idle-timeout=60s  Duration between successful heartbeats before a connection to the cluster is considered unresponsive and closed ($IDLE_TIMEOUT)
          --readiness-timeout=30s  Duration the proxy is unable to connect to the backend cluster before it is considered not ready ($READINESS_TIMEOUT)
          --idempotent-graph  If true it will treat all graph queries as idempotent by default and retry them automatically. It may be dangerous to retry some graph queries -- use with caution ($IDEMPOTENT_GRAPH).
          --num-conns=1  Number of connection to create to each node of the backend cluster ($NUM_CONNS)
          --proxy-cert-file=STRING  Path to a PEM encoded certificate file with its intermediate certificate chain. This is used to encrypt traffic for proxy clients ($PROXY_CERT_FILE)
          --proxy-key-file=STRING  Path to a PEM encoded private key file. This is used to encrypt traffic for proxy clients ($PROXY_KEY_FILE)
          --rpc-address=STRING  Address to advertise in the 'system.local' table for 'rpc_address'. It must be set if configuring peer proxies ($RPC_ADDRESS)
          --data-center=STRING  Data center to use in system tables ($DATA_CENTER)
          --tokens=TOKENS,...  Tokens to use in the system tables. It's not recommended ($TOKENS)

To pass configuration to cql-proxy, either command-line flags, environment variables, or a configuration file can be used. Using the docker method as an example, the following samples show how the token and database ID are defined with each method.

Using command-line flags:

    docker run -p 9042:9042 \
      --rm datastax/cql-proxy:v0.1.5 \
      --astra-token <astra-token> --astra-database-id <astra-database-id>

Using environment variables (the -e options belong to docker run, so they must come before the image name):

    docker run -p 9042:9042 \
      -e ASTRA_TOKEN=<astra-token> -e ASTRA_DATABASE_ID=<astra-database-id> \
      --rm datastax/cql-proxy:v0.1.5

Proxy settings can also be passed using a configuration file with the --config /path/to/proxy.yaml flag. This can be mixed and matched with command-line flags and environment variables. Here are some example configuration files:

    contact-points:
      - 127.0.0.1
    username: cassandra
    password: cassandra
    port: 9042
    bind: 127.0.0.1:9042
    # ...

or with an Astra token:

    astra-token: <astra-token>
    astra-database-id: <astra-database-id>
    bind: 127.0.0.1:9042
    # ...

All configuration keys match their command-line flag counterpart, e.g. --astra-bundle is astra-bundle:, --contact-points is contact-points:, etc.

Multi-region failover with DC-aware load balancing policy is the most useful case for a multiple proxy setup.

When configuring peers:, it is required to set --rpc-address (or rpc-address: in the yaml) for each proxy, and it must match its corresponding peers: entry. Also, peers: is only available in the configuration file and cannot be set using a command-line flag.

Here's an example of configuring multi-region failover with two proxies. A proxy is started for each region of the cluster, connecting to it using that region's bundle. They all share a common configuration file that contains the full list of proxies.

Note: Only bundles are supported for multi-region setups.

    cql-proxy --astra-bundle astra-region1-bundle.zip --username token --password <astra-token> \
      --bind 127.0.0.1:9042 --rpc-address 127.0.0.1 --data-center dc-1 --config proxy.yaml

    cql-proxy --astra-bundle astra-region2-bundle.zip --username token --password <astra-token> \
      --bind 127.0.0.2:9042 --rpc-address 127.0.0.2 --data-center dc-2 --config proxy.yaml

The peers settings are configured using a yaml file. It's a good idea to explicitly provide the --data-center flag; otherwise, these values are pulled from the backend cluster, and you would need to look them up in the system.local and system.peers tables to properly set up the peers data-center: values. Here's an example proxy.yaml:

    peers:
      - rpc-address: 127.0.0.1
        data-center: dc-1
      - rpc-address: 127.0.0.2
        data-center: dc-2

Note: It's okay for the peers: to contain entries for the current proxy itself because they'll just be omitted.

There are three methods for using cql-proxy:

- Locally build and run cql-proxy
- Run a docker image that has cql-proxy installed
- Use a Kubernetes container to run cql-proxy

Locally build and run cql-proxy:

1. Build cql-proxy.

    go build

2. Run with your desired database.

   DataStax Astra cluster:

    ./cql-proxy --astra-token <astra-token> --astra-database-id <astra-database-id>

   The <astra-token> can be generated using these instructions. The proxy also supports using the Astra Secure Connect Bundle along with a client ID and secret generated using these instructions:

    ./cql-proxy --astra-bundle <your-secure-connect-zip> \
      --username <astra-client-id> --password <astra-client-secret>

   Apache Cassandra cluster:

    ./cql-proxy --contact-points <cluster node IPs or DNS names> [--username <username>] [--password <password>]

Run a docker image that has cql-proxy installed:

1. Run with your desired database.

   DataStax Astra cluster:

    docker run -p 9042:9042 \
      datastax/cql-proxy:v0.1.5 \
      --astra-token <astra-token> --astra-database-id <astra-database-id>

   The <astra-token> can be generated using these instructions. The proxy also supports using the Astra Secure Connect Bundle, but it requires mounting the bundle to a volume in the container:

    docker run -v <your-secure-connect-bundle.zip>:/tmp/scb.zip -p 9042:9042 \
      --rm datastax/cql-proxy:v0.1.5 \
      --astra-bundle /tmp/scb.zip --username <astra-client-id> --password <astra-client-secret>

   Apache Cassandra cluster:

    docker run -p 9042:9042 \
      datastax/cql-proxy:v0.1.5 \
      --contact-points <cluster node IPs or DNS names> [--username <username>] [--password <password>]

If you wish to have the docker image removed after you are done with it, add --rm before the image name datastax/cql-proxy:v0.1.5.

Use a Kubernetes container to run cql-proxy. This requires a number of steps:

1. Generate a token following the Astra instructions. This step will display your Client ID, Client Secret, and Token; make sure you download the information for the next steps. Store the secure bundle in /tmp/scb.zip to match the example below.

2. Create cql-proxy.yaml. You'll need to add three sets of information: arguments, volume mounts, and volumes. A full example can be found here.

   Arguments: set the local bundle location, username, and password in the container arguments, using the client ID and client secret obtained in the previous step.

    command: ["./cql-proxy"]
    args: ["--astra-bundle=/tmp/scb.zip","--username=Client ID","--password=Client Secret"]

   Volume mounts: modify /tmp/ as a volume mount as required.

    volumeMounts:
      - name: my-cm-vol
        mountPath: /tmp/

   Volumes: modify the configMap filename as required. In this example, it is named cql-proxy-configmap. Use the same name for the volumes that you used for the volumeMounts.

    volumes:
      - name: my-cm-vol
        configMap:
          name: cql-proxy-configmap

3. Create a configmap. Use the same secure bundle that was specified in the cql-proxy.yaml.

    kubectl create configmap cql-proxy-configmap --from-file /tmp/scb.zip

4. Check the configmap that was created.

    kubectl describe configmap cql-proxy-configmap

    Name:         cql-proxy-configmap
    Namespace:    default
    Labels:       <none>
    Annotations:  <none>

    Data
    ====

    BinaryData
    ====
    scb.zip: 12311 bytes

5. Create a Kubernetes deployment with the YAML file you created:

    kubectl create -f cql-proxy.yaml

6. Check the logs:

    kubectl logs <deployment-name>

Drivers that use token-aware load balancing may print a warning or may not work when using cql-proxy. Because cql-proxy abstracts the backend cluster as a single endpoint, this doesn't always work well with token-aware drivers that expect there to be at least "replication factor" number of nodes in the cluster. Many drivers print a warning (which can be ignored) and fall back to something like round-robin, but other drivers might fail with an error. For the drivers that fail with an error, token-aware load balancing must be disabled or the round-robin load balancing policy configured instead.
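The multi-region setup described earlier (one proxy per region, each with its own --rpc-address, all sharing one peers: file) can be sketched in a few lines of Python. This is only an illustrative helper using the standard library, not part of cql-proxy: the RegionProxy class and the function names are hypothetical, the addresses and bundle names are copied from the example above, and the duplicate-address check is an assumption about what a sane peers list looks like.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionProxy:
    """One proxy per region (hypothetical helper type, values are examples)."""
    bundle: str        # path to that region's secure connect bundle
    rpc_address: str   # value for --rpc-address and the peers: entry
    data_center: str   # value for --data-center and the peers: entry


def check_unique_addresses(proxies):
    """Assumption: every proxy should have its own distinct rpc-address."""
    seen = set()
    for p in proxies:
        if p.rpc_address in seen:
            raise ValueError(f"duplicate rpc-address: {p.rpc_address}")
        seen.add(p.rpc_address)


def render_peers_yaml(proxies):
    """Render the shared proxy.yaml 'peers:' section as plain text."""
    check_unique_addresses(proxies)
    lines = ["peers:"]
    for p in proxies:
        lines.append(f"  - rpc-address: {p.rpc_address}")
        lines.append(f"    data-center: {p.data_center}")
    return "\n".join(lines) + "\n"


def proxy_command(p, config_path="proxy.yaml", port=9042):
    """Build the argument list for one region, reusing the same
    rpc-address in --bind, --rpc-address and the peers: entry."""
    return [
        "cql-proxy",
        "--astra-bundle", p.bundle,
        "--username", "token",
        "--password", "<astra-token>",  # placeholder, as in the example above
        "--bind", f"{p.rpc_address}:{port}",
        "--rpc-address", p.rpc_address,
        "--data-center", p.data_center,
        "--config", config_path,
    ]


if __name__ == "__main__":
    regions = [
        RegionProxy("astra-region1-bundle.zip", "127.0.0.1", "dc-1"),
        RegionProxy("astra-region2-bundle.zip", "127.0.0.2", "dc-2"),
    ]
    print(render_peers_yaml(regions))
    for region in regions:
        print(shlex.join(proxy_command(region)))
```

Generating both the peers file and the per-region commands from one list keeps each proxy's --rpc-address and its peers: entry from drifting apart, which is the mismatch the text above warns about.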

['datastax...

11/1/2024

GitHub - ibagroup-eu/Visual-Flow: Visual-Flow main repository

ibagroup-eu

12/2/2024

GitHub - datastax/zdm-proxy: An open-source component designed to seamlessly handle the real-time client application activity while a migration is in progress.

No description provided

Spark and Cassandra’s SSTable loader

Arunkumar · 3 min read · May 13, 2018. Why: We had a lot of very useful data in our warehouse and wanted to take advantage of that data in some of our production services to enhance the user’s exper...

GitHub - apache/cassandra-analytics: Apache cassandra

cassandra-analytics (public repository): Apache Cassandra. Website: cassandra.apache.org. License: Apache-2.0. Forks: 11. Stars: 15.

Build an Event-Driven Architecture with Apache Kafka, Apache Spark, and Apache Cassandra

Author: Cédrick Lunven. DataStax · Published in Building Real-World, Real-Time AI · 9 min read · May 27, 2022. Knowing how to construct event-driven architectures is a crucial skill for developers as ent...

DataStax Hyper-Converged Database: The Future of Data Infrastructure Is Here | DataStax

Cloud native is now mainstream in infrastructure. In a recent CNCF survey, 84% of organizations used or evaluated Kubernetes (and that figure has been growing steadily). Every cloud provider has a man...

GitHub - arodrime/Montecristo: Datastax Cluster Health Check Tooling

Montecristo (public repository, forked from datastax-labs/Montecristo): Datastax Cluster Health Check Tooling. License: Apache-2.0.
