SETUP LAMBDA ARCHITECTURE EXAMPLE

1. Introduction - Setup lambda architecture example

24 March 2016

Let’s see how to build a lambda architecture example (LAE) for processing large volumes of log data.

What is the Lambda Architecture?

Nathan Marz came up with the term Lambda Architecture (LA) for a generic, scalable and fault-tolerant data processing architecture, based on his experience working on distributed data processing systems at BackType and Twitter.

The LA aims to satisfy the needs for a robust system that is fault-tolerant, both against hardware failures and human mistakes, being able to serve a wide range of workloads and use cases, and in which low-latency reads and updates are required. The resulting system should be linearly scalable, and it should scale out rather than up.

Here’s what it looks like, from a high-level perspective:

LA screenshot

The Lambda Architecture (cred. http://lambda-architecture.net/)

Lambda Architecture Example

LA screenshot

lambda architecture example (LAE)

Let’s build the lambda architecture example (LAE) step by step, with an introduction and code for each part.

The biggest difference from the standard picture is that I put a buffering layer into this example architecture.

If log data went from the API servers directly to S3 or Spark Streaming, each server would have to wait a long time to send its data: I assume that many API servers send logs at the same time, while the threads available to aws-cli (used to send data to S3) and to Spark Streaming are limited. So I will use Kafka, a distributed, replicated messaging system, as a buffer. With Kafka in between, the API servers do not have to wait to send their log data.
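To make the buffering idea concrete, here is a minimal sketch in Python. It does not use Kafka at all: an in-memory `queue.Queue` stands in for a Kafka topic, and the names `api_server`, `slow_consumer`, the five servers, the 20 log lines each, and the 1 ms delay are all made up for illustration. It only shows that producers can finish quickly while a slower downstream writer drains the buffer at its own pace.

```python
import queue
import threading
import time


def api_server(buffer, server_id, n_logs):
    """Hypothetical API server: enqueue log lines and return immediately."""
    for i in range(n_logs):
        buffer.put(f"server={server_id} event={i}")


def slow_consumer(buffer, sink, delay=0.001):
    """Hypothetical downstream writer (stand-in for an S3 upload or a streaming job)."""
    while True:
        line = buffer.get()
        if line is None:       # sentinel: producers are done
            break
        time.sleep(delay)      # simulate slow downstream I/O
        sink.append(line)


buffer = queue.Queue()         # in-memory stand-in for a Kafka topic (assumption)
sink = []                      # what the downstream system eventually receives

consumer = threading.Thread(target=slow_consumer, args=(buffer, sink))
consumer.start()

start = time.perf_counter()
producers = [
    threading.Thread(target=api_server, args=(buffer, server_id, 20))
    for server_id in range(5)
]
for p in producers:
    p.start()
for p in producers:
    p.join()
produce_time = time.perf_counter() - start   # time the API servers were busy

buffer.put(None)               # tell the consumer to stop once the buffer is drained
consumer.join()
total_time = time.perf_counter() - start     # time until every line was written

print(f"producers finished in {produce_time:.4f}s, "
      f"all {len(sink)} lines written after {total_time:.4f}s")
```

In the real setup the queue would be replaced by a Kafka cluster, which adds what an in-process queue cannot: persistence, replication, and consumers running on other machines.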

In the next post, we will learn about Kafka and see how to set it up for the LAE.

Contents

  1. Intro to lambda architecture example (LAE) setup

  2. Kafka - zookeeper (Distribution Layer)

    2-1. Kafka - zookeeper single-server setup

    2-2. Kafka - zookeeper multi-server setup

  3. Spark streaming (Speed Layer)

    3-1. Spark-streaming setup and connect with LAE

  4. Cassandra (Query Layer)

    4-1. NoSQL database performance comparison (Cassandra, HBase, etc.)

    4-2. Cassandra db setup and connect with LAE

  5. Example API server setup and connect with LAE

  6. S3 setup and connect with LAE (Batch Layer)

  7. Spark, Zeppelin setup and connect with LAE (View Layer)

  8. Docker setup for LAE