Cassandra.Link
The best knowledge base on Apache Cassandra®
Helping platform leaders, architects, engineers, and operators build scalable real-time data platforms.
A collection of 5 posts
Next-Gen Data Movement Platform at PayPal
7/9/2021
…using Apache Airflow scheduler and Apache Gobblin — a data integration framework open-sourced by LinkedIn. As PayPal grows beyond 300 million users, we generate lots of data, both on our online (site)...
brianmhess/DSE-Spark-HDFS
1/28/2021
Introduction: The goal of this exercise is to learn how to access data in HDFS via Spark in DSE 4.6. A simple scenario we want to address is loading data from files in HDFS in an external Hadoop system ...
About the Cassandra File System (CFS)
8/19/2020
Analytics jobs often require a distributed file system. DataStax Enterprise provides a replacement for the Hadoop Distributed File System (HDFS) called the Cassandra File System (CFS)....
SnappyDataInc/snappydata
8/3/2018
SnappyData fuses Apache Spark with an in-memory database to deliver a data engine capable of processing streams, transactions and interactive analytics in a single cluster. The Challenge with Spark an...
tuplejump/snackfs-release
8/2/2018
SnackFS @ Calliope SnackFS is our bite-sized, lightweight HDFS-compatible FileSystem built over Cassandra. With its unique fat driver design, it requires no additional SysOps or setup on the Cassandr...