My latest book, Agile Data Science v2.0, is out in early release from O'Reilly! The book has been rewritten using PySpark and many of the latest and greatest tools, including scikit-learn, word2vec, Spark SQL, and d3.js. In addition, much new content has been added to make the book a great introduction to predictive analytics in theory and practice.
I’m very proud of it :)
http://bit.ly/agile_data_science