Paperback: 224 pages
Publisher: O'Reilly Media; 1 edition (October 23, 2011)
Product Dimensions: 7 x 0.8 x 9.2 inches
Shipping Weight: 4 ounces
Average Customer Review: 4.0 out of 5 stars (16 customer reviews)
Best Sellers Rank: #519,937 in Books
#171 in Books > Computers & Technology > Databases & Big Data > Data Warehousing
#255 in Books > Computers & Technology > Databases & Big Data > Data Modeling & Design
#309 in Books > Computers & Technology > Databases & Big Data > Data Mining
The book presents an advanced introduction to PIG. It's a book by an insider for insiders, not an introduction to PIG itself. You may end up producing code to run through some data, but may not necessarily gain any understanding. The book reads like a blog. Beyond spell check, it has no editorial oversight whatsoever. The content order is ad hoc; it goes from one topic to another without any apparent continuity. For example, right after a cursory introduction to MapReduce and PIG, the discussion goes into arcane details of command-line and flag settings without any context. The book covers a whole lot of concepts, but the introduction of these concepts itself is weak. For example, projections are introduced as something PIG has in common with SQL. The book itself is -2 stars, and -1 for the Kindle edition and O'Reilly. The publishing quality of this book is horrible: the code fonts are smaller than the main text, the spacing is uneven, etc. Default settings on freely available web publishing software produce better output than what has been produced here.
PIG is a powerful programming tool for big data, yet it is simple to write. I got the book to help myself learn PIG programming, and the book does help with that. If you are new to PIG programming, you will find it useful. I was disappointed that the book has only a cursory reference to piggybank.jar, which is a big plus for PIG, and that only at the end, with no real examples. Also, there are tons of PIG examples on the internet that walk through many scenarios much better than this book does. This is a quick reference book, not a PIG Bible.
Has some good info, especially on how to extend Pig functionality, but it leaves a lot to be desired as a primer on how to use Pig. Most of the examples are poorly explained. To be honest, I learned more about some of the thinking behind Pig than I learned about how to actually use Pig. This book needs a new, seriously upgraded edition. It only got three stars because it has no competition.
The company I was working for started using Big Data technologies recently and we were all expected to come up to speed quickly. For PIG, this is the only book I could find. Luckily it's available on Safari, so I didn't have to buy it myself (I didn't feel that the content justified the cost of purchasing this book). You can find a good chunk of the content in this book in online blogs, tutorials, and wikis, but it helped to have this book by my side when we were working on a project, since all the relevant information is in a single location. Some of the information in this book is outdated. For instance, it says the Boolean data type is not supported, but recent versions of PIG do support it, so be sure to refer to the PIG docs from time to time to make sure you have the latest information. PIG is a relatively immature framework. The authors admit this to a certain extent in the book by mentioning that much of the tuning/optimization effort is left to the user (unlike databases, which make an effort to optimize queries). This book includes some insights into how to tune for performance (e.g., what types of JOINs to use and when, writing UDFs for performance), which is certainly helpful, but the general tone is along the lines of "here are some things to look for, but you need to test and find what works best yourself". In other words, comprehensive examples illustrating the concepts are missing. Like a lot of other books, this one has some typos, although they are not too hard to figure out. A basic understanding of Hadoop is necessary to use this book, but a solid foundation in Hadoop is not necessarily required (although it helps a lot). All in all, if you are looking for a single place to refer to for PIG-related docs, get this book.
The book is written pretty well, and the examples are clear and easy to follow for the most part. The only lacking aspect in my opinion was a deeper delve into the analytic capabilities for Pig. This is minor, and may actually be a good prompt for a follow-up "Pig Cookbook". Other than that, this is a great reference and has proven very useful.
Good basic primer on Pig and its relationship with Hadoop. Worth getting, though beware that Pig, like any vital open-source project, is changing fast.
The book is a very good introduction to Pig written by an insider. It does not assume any previous knowledge of the subject. However, you need some programming experience and familiarity with Hadoop concepts. I don't give it 5 stars only because it is already not quite up to date.
It seems to me that all the information in this book is freely available online, where it is also much easier to search. I was disappointed by its brevity and lack of fresh, helpful information.