Hadoop Summit 2012 Review


My review of Hadoop Summit 2012 is up on the Hortonworks Blog.

Daniel Dai and Thejas Nair’s talk, Pig programming is more fun: New features in Pig (slides available), covered the new features in Pig 0.10, which you can read more about here. They discussed improvements to Piggybank to include Pig macros. ILLUSTRATE has been fixed in Pig 0.10 and now works with AvroStorage. Pig’s ability to cast single-record relations as scalars is a great addition to the language. UDFs in JRuby greatly simplify extending Pig and bring many JRuby-compatible gems into Pig. Pig embedding enables iterative Pig scripts, such as those used to train statistical models. Pig’s HCatalog integration enables sharing resources with Hive and MapReduce users. MongoStorage integration enables simple data publishing to MongoDB, a popular NoSQL database. Finally, Talend integration allows graphical programming of Pig.

The Hadoop market continues to mature and grow, but “you ain’t seen nothing yet!” Every shred of data on earth is going on HDFS, and we’ve only just begun the big data journey. I can’t wait until next year’s Hadoop Summit to find out more!