The Best of Unpublished Machine Learning and Statistics Books

Sebastian Nowozin - Tue 09 February 2016 -

Nowadays authors in the fields of statistics and machine learning often choose to write their books openly by publishing early draft versions. For popular books this generates a lot of feedback, which clearly improves the final book when it is published.

Here is a short list of very promising draft books. Because completing a book is difficult, it is likely that some of these books will never be finished.

"Deep Learning"

By Ian Goodfellow, Yoshua Bengio, and Aaron Courville.

Deep learning has revolutionized multiple applied pattern recognition fields since 2011. If you want to get started in applying deep learning methods, now is the time.

First, there are now many well-engineered frameworks available that make learning and experimentation fun.

Second, there are good learning resources available. If you have a solid background in basic machine learning and basic linear algebra, this book is for you. The field of deep learning is advancing very swiftly and likely this book will not cover all the latest models and techniques used when it is published. However, at the moment it is quite up-to-date and simply contains some of the most organized material on the topic of deep learning.

"Advanced Data Analysis from an Elementary Point of View"

By Cosma Rohilla Shalizi.

This book covers a lot of ground, from classical statistics (linear/additive/spline regression, resampling methods, density estimation, dimensionality reduction) to more recent topics (causality, graphical models, models for dependent data).

One feature of this book is highlighted by the phrase "from an elementary point of view" in the title: it is quite accessible, and the author genuinely cares about conveying understanding, often in a delightfully casual tone.

See also the unfinished but great book written by Cosma Shalizi and Aryeh Kontorovich, "Almost None of the Theory of Stochastic Processes".

"Monte Carlo theory, methods and examples"

By Art Owen.

While Monte Carlo methods have advanced in the last decade, in particular driven by the needs of large-scale Bayesian statistics, most comprehensive textbooks (Liu; Robert and Casella; Rubinstein and Kroese) are now dated.

Art Owen's book is a wonderful addition that will become a classic once it is completed. From first principles and with great depth, Art Owen introduces the theory and practice of Monte Carlo methods. Chapters 8 to 10 contain a wealth of material not found in other textbook treatments. I am eagerly waiting for chapters 11 to 17.
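To give a flavor of the basic idea the book builds on, here is the classic textbook illustration (a generic example, not taken from Owen's book): estimating π by sampling uniform points in the unit square and counting how many fall inside the quarter circle.

```python
import random

def estimate_pi(n_samples, seed=0):
    """Estimate pi via Monte Carlo: the fraction of uniform points in
    [0,1]^2 that land inside the quarter circle approximates pi/4."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples

print(estimate_pi(100_000))  # close to 3.14159; error shrinks as O(1/sqrt(n))
```

The O(1/√n) convergence rate of such estimators, and the variance-reduction techniques that improve on it, are exactly the kind of material the book develops rigorously.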

"A Course in Machine Learning"

By Hal Daumé III.

A basic course on a broad set of machine learning methods, including planned (but not yet written) chapters on structured prediction and Bayesian learning. Very accessible, with lots of pseudo-code.

"Introduction to Machine Learning"

By Alex Smola and SVN Vishwanathan.

This draft was last updated in 2010 and probably will never be completed. Chapter 3, however, covering continuous optimization methods, is complete and nicely written.