Cassandra.Link

8/2/2018

tuplejump/snackfs-release

by John Doe

SnackFS @ Calliope

SnackFS is our bite-sized, lightweight, HDFS-compatible FileSystem built over Cassandra. With its unique fat driver design, it requires no additional SysOps or setup on the Cassandra cluster. All you have to do is point to your Cassandra cluster and you are ready to go.

As SnackFS was written as a drop-in replacement for HDFS, your existing HDFS-backed applications not only run as-is on SnackFS, but they also run faster! A SnackFS cluster is also more resilient than an HDFS cluster, as there is no single point of failure (SPOF) like the NameNode.

Prerequisites

  1. SBT: It can be set up from the instructions here.

  2. Cassandra (v1.2.12): Instructions can be found here. An easier alternative would be using CCM.

Using SnackFS

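Use the binary

  • You can download the SnackFS distribution built with Scala 2.9.x here and Scala 2.10.x here.

  • To add SnackFS to your SBT project, use

"com.tuplejump" %% "snackfs" % "0.6.1-EA"

  • To add SnackFS to your Maven project with Scala 2.9.3, use
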
<dependency>
  <groupId>com.tuplejump</groupId>
  <artifactId>snackfs_2.9.3</artifactId>
  <version>0.6.1-EA</version>
</dependency>

And with Scala 2.10.3,

<dependency>
  <groupId>com.tuplejump</groupId>
  <artifactId>snackfs_2.10</artifactId>
  <version>0.6.1-EA</version>
</dependency>

Build from Source

  1. Check out the source from http://github.com/tuplejump/snackfs

  2. To build the SnackFS distribution, run sbt's dist command in the project directory:

[snackfs]$ sbt dist

This will result in a "snackfs-{version}.tgz" file in the "target" directory of "snackfs". Extract "snackfs-{version}.tgz" to the desired location.

  3. Start Cassandra (the default setup for SnackFS assumes it's a cluster with 3 nodes).

  4. It is possible to configure the file system by updating core-site.xml. The following properties can be added.

    • snackfs.cassandra.host (default 127.0.0.1)
    • snackfs.cassandra.port (default 9160)
    • snackfs.consistencyLevel.write (default QUORUM)
    • snackfs.consistencyLevel.read (default QUORUM)
    • snackfs.keyspace (default snackfs)
    • snackfs.subblock.size (default 8 MB (8 * 1024 * 1024))
    • snackfs.block.size (default 128 MB (128 * 1024 * 1024))
    • snackfs.replicationFactor (default 3)
    • snackfs.replicationStrategy (default org.apache.cassandra.locator.SimpleStrategy)
  5. The SnackFS shell provides fs commands similar to the Hadoop shell. For example, to create a directory:

[Snackfs(extracted)]$ bin/snackfs -mkdir snackfs:///random
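
The configuration properties listed above use the usual Hadoop <property> form in core-site.xml. As a minimal sketch, the script below writes all nine properties with their documented defaults to a fragment file. The file name snackfs-properties.xml and the helper function property are hypothetical, and the sizes are assumed to be byte counts, as the documented defaults suggest. Paste the output inside the <configuration> element of your core-site.xml and change only the values you need.

```shell
#!/bin/sh
# Sketch: emit the SnackFS properties, with their documented defaults, as
# Hadoop <property> elements. The output file name is a hypothetical choice.
OUT="${OUT:-snackfs-properties.xml}"

# Sizes are assumed to be byte counts, as the documented defaults suggest.
SUBBLOCK_SIZE=$((8 * 1024 * 1024))      # 8 MB
BLOCK_SIZE=$((128 * 1024 * 1024))       # 128 MB

# Print one <property> element for a name/value pair (hypothetical helper).
property() {
  printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

{
  property snackfs.cassandra.host 127.0.0.1
  property snackfs.cassandra.port 9160
  property snackfs.consistencyLevel.write QUORUM
  property snackfs.consistencyLevel.read QUORUM
  property snackfs.keyspace snackfs
  property snackfs.subblock.size "$SUBBLOCK_SIZE"
  property snackfs.block.size "$BLOCK_SIZE"
  property snackfs.replicationFactor 3
  property snackfs.replicationStrategy org.apache.cassandra.locator.SimpleStrategy
} > "$OUT"
```

With these defaults, one block spans 16 subblocks (128 MB / 8 MB).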

To build and use with Hadoop

  1. Set up Apache Hadoop v1.0.4 (http://hadoop.apache.org/#Getting+Started). The base directory will be referred to as 'hadoop-1.0.4' in the following steps.

  2. Execute the following command in the snackfs project directory.

[snackfs]$ sbt package

This will result in a "snackfs_<scala_version>-<version>.jar" file in the "target/scala-<scala_version>" directory of "snackfs". Copy the jar to 'hadoop-1.0.4/lib'.

  3. Copy all the jars in snackfs/lib_managed and scala-library-<scala_version>.jar (located at '~/.ivy2/cache/org.scala-lang/scala-library/jars') to 'hadoop-1.0.4/lib'.

  4. Copy snackfs/src/main/resources/core-site.xml to 'hadoop-1.0.4/conf'.

  5. Start Cassandra (the default setup for SnackFS assumes it's a cluster with 3 nodes).

  6. Hadoop fs commands can now be run using SnackFS. For example:

[hadoop-1.0.4]$ bin/hadoop fs -mkdir snackfs:///random
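
The copy steps above can be collected into one shell function. This is only a sketch: the function name install_snackfs_into_hadoop and its three arguments are hypothetical, while the relative paths are the ones named in the steps. It assumes sbt package has already been run, and it searches lib_managed recursively on the assumption that sbt may nest the jars in subdirectories.

```shell
#!/bin/sh
# Sketch of the copy steps as a reusable function (hypothetical name).
#   $1 - snackfs project directory
#   $2 - Hadoop base directory (for example hadoop-1.0.4)
#   $3 - path to scala-library-<scala_version>.jar
install_snackfs_into_hadoop() {
  snackfs_dir=$1
  hadoop_dir=$2
  scala_jar=$3

  # The snackfs jar produced by `sbt package` under target/scala-<scala_version>.
  cp "$snackfs_dir"/target/scala-*/snackfs_*.jar "$hadoop_dir/lib/" || return 1

  # All managed dependency jars; searched recursively in case they are nested.
  find "$snackfs_dir/lib_managed" -name '*.jar' -exec cp {} "$hadoop_dir/lib/" \; || return 1

  # The Scala runtime library matching the build.
  cp "$scala_jar" "$hadoop_dir/lib/" || return 1

  # The SnackFS-aware Hadoop configuration.
  cp "$snackfs_dir/src/main/resources/core-site.xml" "$hadoop_dir/conf/" || return 1
}
```

It would be called with the three paths in order: the snackfs directory, the hadoop-1.0.4 directory, and the scala-library jar from the Ivy cache. Note that the last copy overwrites any existing core-site.xml in hadoop-1.0.4/conf.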

To configure logging

In System Environment

Set SNACKFS_LOG_LEVEL in the shell to one of the following values:

  • DEBUG
  • INFO
  • ERROR
  • ALL
  • OFF

The default value, if not set, is ERROR.
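
As a small sketch of the environment setting, assuming a POSIX shell, the variable can be exported once for the session so that every SnackFS command started afterwards inherits it:

```shell
# Sketch: select a log level for the current shell session.
SNACKFS_LOG_LEVEL=DEBUG
export SNACKFS_LOG_LEVEL

# Commands started from this shell afterwards, such as
#   bin/snackfs -mkdir snackfs:///random
# inherit the exported variable.
```

To change the level for a single command only, the assignment can instead be prefixed to that command on the same line.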

In code (for further control/tuning)

If you want your logs in a file, update LogConfiguration.scala as shown below:

val config = new LoggerFactory("", Option(Level.ALL), List(FileHandler("logs")), true)

The arguments for LoggerFactory are:

  1. node - Name of the logging node. The empty string ("") is the top-level logger.
  2. level - Log level for this node. Leaving it None implies the parent logger's level.
  3. handlers - Where to send log messages.
  4. useParents - Indicates whether log messages are passed up to parent nodes. To stop at this node level, set it to false.

Additional logging configuration details can be found here
