Wednesday, October 3 • 11:30am - 12:00pm
Data Scientist Meets Enterprise IT Best Practices

It's been said that all data in the Enterprise is "Big Data". In
particular, relatively new practices such as MapReduce allow
organizations to extract value from unstructured data -- for example,
by working with large-scale collections of log files or integrating
externally available data. Generally these efforts lead toward new
products based on predictive analytics. The Cascading API provides
workflow orchestration for large-scale MapReduce and is particularly
well-suited for Enterprise IT, with bindings in Java, Scala, Clojure,
Python, and Ruby. Cascading is used in production at Twitter, eBay,
Trulia, AirBnB, Etsy, RapLeaf, Climate Corporation, and many others
for building predictive analytics on large-scale public and private
clouds. This talk will explore best practices and architectural
patterns for large MapReduce workflows where robustness and
predictability are high priorities.


2012 Keynote & Breakout Sessio...

Paco Nathan

Evil Mad Scientist, Liber 118
Paco Nathan is a "player/coach" who has led innovative Data teams building large-scale apps for several years. Paco is an O'Reilly author, an Apache Spark open source evangelist with Databricks, and an advisor for Zettacap, Amplify Partners, and The Data Guild...

Wednesday, October 3, 2012 11:30am - 12:00pm
Conference Theater Registration Floor - Grand Hyatt Hotel