Cassandra.Link
The best knowledge base on Apache Cassandra®
Helping platform leaders, architects, engineers, and operators build scalable real time data platforms.
A collection of 171 posts
Spark and Cassandra’s SSTable loader
11/1/2024
Arunkumar · 3 min read · May 13, 2018. Why: We had a lot of very useful data in our Warehouse and wanted to take advantage of those data in some of our production service to enhance the user’s exper...
GitHub - apache/cassandra-analytics: Apache cassandra
9/4/2024
cassandra-analytics · Public · Fork 11 · Star 15 · Apache cassandra · cassandra.apache.org · License: Apache-2.0 license · 15 sta...
Build an Event-Driven Architecture with Apache Kafka, Apache Spark, and Apache Cassandra
8/3/2024
Author: Cédrick Lunven · DataStax · Published in Building Real-World, Real-Time AI · 9 min read · May 27, 2022. Knowing how to construct event-driven architectures is a crucial skill for developers as ent...
GitHub - andreia-negreira/Data_streaming_project: Data streaming project with robust end-to-end pipeline, combining tools such as Airflow, Kafka, Spark, Cassandra and containerized solution to easy deployment.
12/2/2023
Data_streaming_project · Public · Fork 0 · Star 0 · Data streaming project with robust end-to-end pipeline, combining tools such as Airflow, Kafka, Spark, Cassandra and conta...
GitHub - airscholar/e2e-data-engineering: An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra. All components are containerized with Docker for easy deployment and scalability.
e2e-data-engineering · Public · Fork 5 · Star 19 · An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Ka...
Google Dataflow - Awesome-Astra
5/10/2023
Integrating Astra and Beam/Dataflow. Astra allows both bulk and real time operations through AstraDB and Astra Streaming. For each service there are multiple interfaces available and integration with Ap...
Dealing with Large Spark Partitions
2/17/2023
One of the biggest issues with working with Spark and Cassandra is dealing with large Partitions. There are several issues we need to overcome before we can really handle the challenge well. I’m going...
Apache Cassandra Lunch #84: Data & Analytics Platform: Cassandra, Spark, Kafka
11/4/2022