Cascading – MapReduce without the complexity

Just bumped into Cascading, an open source (GPL 3) framework "for defining and executing complex and fault tolerant data processing workflows on a Hadoop cluster". Hadoop is, as I'm sure you all know, an open source implementation of MapReduce, the programming model at the core of how Google does its large-scale data processing.

Anyway, the Cascading API "lets the developer quickly assemble complex distributed processes without having to 'think' in MapReduce". That matters because expressing everything as map and reduce steps can get really complex in non-trivial applications.
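
To give a feel for what "thinking in MapReduce" means, here is a minimal sketch of a word count split into explicit map, shuffle, and reduce phases. It is plain in-memory Java, not Hadoop or Cascading code, and the class and method names (`WordCountSketch`, `map`, `shuffle`, `reduce`, `wordCount`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: an in-memory imitation of the MapReduce
// model. It does not use the Hadoop or Cascading APIs.
public class WordCountSketch {

    // Map phase: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group every emitted value under its key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse all values for one key into a single count.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: wire the three phases together for a list of input lines.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            counts.put(entry.getKey(), reduce(entry.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(wordCount(List.of("the quick brown fox", "the lazy dog")));
    }
}
```

Even this toy problem has to be bent into key/value pairs and three separate phases. A real job that joins or filters several data sets typically needs a chain of such steps, which is the kind of bookkeeping a higher-level API like Cascading's is meant to hide.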
