4/24/2018

How NerdWallet and BlockCypher are Building Data Platforms

by Lars Kamp

On February 23 in San Francisco, SF DATA hosted a tech talk on Data Platforms in FinTech. The speakers described how they are building data platforms to support blockchain applications and cryptocurrencies and to detect fraudulent activity.

Matthieu Riou, CTO and co-founder of BlockCypher, explained how BlockCypher used data to hunt down $70 million in stolen Bitcoins for the Department of Homeland Security.

Vaibhav Jajoo, head of Data Infrastructure at NerdWallet, described how the data team at NerdWallet is using a new brand of data analytics solutions with Kafka, Python, EMR, and Redshift.

Analytics for Bitcoin and Other Cryptocurrencies at BlockCypher

Developers, companies, and government agencies use the BlockCypher API to build cryptocurrency applications and analyze patterns in blockchain transactions. Bitcoin alone has 8.2 million new transactions per month, with 250 million IP addresses to monitor. The Department of Homeland Security recently used the BlockCypher Analytics API to track down $70 million in stolen Bitcoins from the Bitfinex heist.

BlockCypher — Analytics for Bitcoin and other Cryptocurrencies

In August 2016, BlockCypher noticed that 0.75% of Bitcoins suddenly started moving in unusual patterns.

While the culprits are still unknown, BlockCypher was able to filter the data to pinpoint where the transactions were coming from: in this instance, Bitcoin wallet provider BitGo. The BlockCypher architecture uses a combination of Cassandra, Redshift, and Spark.
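The talk summary does not say how BlockCypher filtered the data, so the following is only a minimal sketch of the general idea of grouping transactions by origin to see which one dominates. The record layout (`source`, `amount`), the function names, and the sample values are all hypothetical and are not taken from the BlockCypher API or from real blockchain data.

```python
from collections import defaultdict


def volume_by_source(transactions):
    """Sum the transferred amount per source label."""
    totals = defaultdict(float)
    for tx in transactions:
        totals[tx["source"]] += tx["amount"]
    return dict(totals)


def dominant_source(transactions):
    """Return (source, share_of_total) for the source moving the most volume."""
    totals = volume_by_source(transactions)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return None
    source = max(totals, key=totals.get)
    return source, totals[source] / grand_total


# Made-up sample records for illustration only.
sample = [
    {"source": "wallet_provider_a", "amount": 6.0},
    {"source": "wallet_provider_b", "amount": 1.0},
    {"source": "wallet_provider_a", "amount": 2.0},
    {"source": "unknown", "amount": 1.0},
]

if __name__ == "__main__":
    print(dominant_source(sample))
```

In a real pipeline the grouping would likely run in a distributed engine such as Spark rather than in a single Python process; the aggregation logic stays the same.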

According to Matthieu, the Holy Grail in cryptocurrency is to deanonymize every transaction, tie it to off-chain transactions, classify transactions using machine learning, and provide APIs for law enforcement and industry.

Building Data Solutions and Innovations at NerdWallet

NerdWallet gives consumers and small businesses clarity around all of life’s financial decisions by building accessible online tools and providing research and expert advice.

Vaibhav joined NerdWallet in 2014 and started the Data Analytics Team to help everybody at NerdWallet create meaning from the large volume and variety of data that NerdWallet customers generate every day (popular products, click-through rates, platform attributes, etc.). NerdWallet has ~450 employees, and data consumers (i.e. “everybody”) range from analysts to the CEO. Product and marketing analysts care about granular views of a specific product or campaign. The CEO cares about high-level views of the business. The data platform at NerdWallet needs to be flexible enough to serve data in a form each audience can understand, while allowing users to slice and dice data across many dimensions.

FinTech Data Challenges at NerdWallet

Using a combination of Kafka, Amazon Redshift, and EMR, NerdWallet has been able to create “ETL at Scale” and manage dynamic workloads. There are 250+ named SQL users with varying levels of SQL skill. That makes the Redshift environment challenging to manage, especially when some users write large ad-hoc queries. A key to balancing resources with workloads is Redshift’s WLM (Workload Management).
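To make the WLM idea concrete, here is a toy model of how workload management separates users into queues, each with its own concurrency limit and memory share. It is a sketch of the concept only: it does not call Redshift, and the queue names, user groups, and numbers are assumptions chosen for illustration, not NerdWallet's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Queue:
    name: str
    user_groups: frozenset  # user groups routed to this queue
    concurrency: int        # max queries running at once in this queue
    memory_percent: int     # share of cluster memory given to this queue


# Hypothetical queue layout; the last entry acts as the catch-all default.
QUEUES = [
    Queue("etl", frozenset({"etl"}), concurrency=3, memory_percent=50),
    Queue("analysts", frozenset({"analyst"}), concurrency=5, memory_percent=30),
    Queue("default", frozenset(), concurrency=5, memory_percent=20),
]


def route(user_group, queues=QUEUES):
    """Return the first queue that lists the user group, else the default queue."""
    for queue in queues:
        if user_group in queue.user_groups:
            return queue
    return queues[-1]


if __name__ == "__main__":
    for group in ("etl", "analyst", "ceo"):
        q = route(group)
        print(f"{group} -> {q.name} (slots={q.concurrency}, memory={q.memory_percent}%)")
```

The point of the split is isolation: a large ad-hoc query from the analyst group can only consume slots and memory in its own queue, so scheduled ETL jobs keep their reserved share.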

— —

Interested in building data platforms? Subscribe to SF Data Weekly for more stories on data engineering you don’t want to miss.

Want to attend our next event? Follow the SF Data Facebook page.
