Tag archive: hadoop

Dec 19, 2014

Solving MapReduce Performance Problems With Sharded Joins

Sometimes the answer to a sluggish data pipeline isn’t more […]

Nov 27, 2014

Data Processing with Apache Crunch at Spotify

All of our lovely Spotify users generate many terabytes of […]

May 7, 2013

Snakebite: a pure Python HDFS client

As we all know, Hadoop is great and here at […]