Cascading – MapReduce without the complexity

I just bumped into Cascading, an open source (GPL 3) framework "for defining and executing complex and fault tolerant data processing workflows on a Hadoop cluster". Hadoop is, as I’m sure you all know, an open source implementation of MapReduce, the programming model at the core of how Google does its processing. Anyway, the Cascading API "lets the
