Intro books to machine learning

This page lists some good intro books to machine learning and artificial intelligence.

The first is The Master Algorithm by Pedro Domingos. The author states that “intuition is what you use when you don’t have enough data”. He shows, heuristically, how intuition is slowly being taken out of the analysis of big data and replaced by algorithms that teach themselves to make the data speak for itself.

“All learning starts with some knowledge” (a quote from Hume that the author invokes), and from Hume we also know that there is a problem with induction: no finite set of particulars can prove a universal. The trick is to get from the data (the particular) to the universal, and the author explains in detail the five general ways we learn and shows how they work in practice. The five ways are:

  • Symbolist (think: rational thought – math & logic),
  • Connectionist (modeling the neural networks of the brain),
  • Bayesian (probabilities – nothing is certain and everything is contingent),
  • Evolutionary (genetic algorithms), and
  • Analogy (reasoning from similar cases).
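
The Bayesian way lends itself to a tiny worked example. The sketch below applies Bayes’ rule to a spam-filtering toy case; the prior and likelihood numbers are invented purely for illustration and come from none of the books listed here.

```python
# A minimal illustration of the Bayesian way: nothing is certain,
# so a belief (the prior) is updated with Bayes' rule as evidence arrives.
# All probabilities below are made-up toy values.

def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) via Bayes' rule for a binary hypothesis H and evidence E."""
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence

# Prior belief that an email is spam, before seeing any words.
prior_spam = 0.2
# P("free" appears | spam) and P("free" appears | not spam).
p_word_given_spam = 0.6
p_word_given_ham = 0.05

p_spam_given_word = posterior(prior_spam, p_word_given_spam, p_word_given_ham)
print(round(p_spam_given_word, 3))  # posterior belief after seeing the word
```

Seeing one suggestive word moves the belief from 0.2 to 0.75 – the evidence shifts, but never settles, the conclusion, which is the heart of the Bayesian approach.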

Pattern Recognition and Machine Learning by Christopher Bishop is another great book about machine learning. See below for some reasons:

  • Some negative comments mention that the book is math-heavy, but that is not really true: the essential math concepts are covered in Chapter 2, and everything else builds on them. Machine learning is math.
  • Basic concepts are repeated when they are needed – but that is what makes the book so useful: continuous reinforcement.
  • The basic ideas are explained again every time they are used. Yes, this takes a few extra lines and makes the material a bit redundant, but it reinforces the foundations on which everything is built. We do not need to struggle in endless loops or resort to Google; everything is right there, stated clearly.
  • Consistent use of a small vocabulary and a few central ideas: all techniques are boiled down to the same fundamentals. These ideas are developed very clearly early on, and we are told that the rest of the book will grow out of them. In Chapters 1 and 2, Dr. Bishop lays down the fundamentals of maximum likelihood and Bayesian models and of linear models, explains inference and decision, and builds on these principles throughout.
  • The book often makes the big picture clear: how it relates to the basic ideas, and what roles those ideas play in more advanced topics. Not many authors are able to achieve this.
  • As for the complaint “too much theory, not enough practice”: yes, there is no Python code in the book. But a practical text suits advanced users; for beginners and intermediate readers, it is better to understand the fundamentals first – otherwise we would probably fall into the common trap of trying several different models on our data and averaging them. If you are looking for code, just go to scikit-learn.
  • Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.). Springer. Download the book PDF (corrected 12th printing, Jan 2017) (2017 version pdf backup)
  • Natural Language Processing with Python – Analyzing Text with the Natural Language Toolkit by Steven Bird, Ewan Klein, and Edward Loper (This version of the NLTK book is updated for Python 3 and NLTK 3.)
  • Introduction to Evaluation metrics can be found in Chapter 6 (here).
  • Chapter 8: Evaluation in information retrieval (pdf, html)
  • Chapter 17: Hierarchical clustering (pdf, html)
  • Seni, G., & Elder, J. F. (2010). Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions. Synthesis Lectures on Data Mining and Knowledge Discovery, 2(1), 1–126. (pdf)
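
As a taste of the evaluation metrics discussed in the chapters referenced above, here is a minimal sketch of precision, recall, and F1 for binary classification. The labels are toy data invented for this example, not taken from any of the books.

```python
# Precision, recall, and F1 for binary labels (1 = positive, 0 = negative).
# Toy example only; real projects would use a tested library implementation.

def precision_recall_f1(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0   # of predicted positives, how many were right
    recall = tp / (tp + fn) if tp + fn else 0.0      # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

y_true = [1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
p, r, f = precision_recall_f1(y_true, y_pred)
```

F1 is the harmonic mean of precision and recall, so it punishes a model that does well on one at the expense of the other.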

Pages 26–28 have a good, concise introduction to cross-validation.
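
The cross-validation idea introduced on those pages can be sketched in a few lines: split the data into k folds, train on k−1 of them, evaluate on the held-out fold, and rotate. The “model” below is a trivial mean predictor, chosen only to keep the loop visible; it is not from the referenced text.

```python
# k-fold cross-validation from scratch, using a mean predictor as the model.

def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

def cross_val_mse(y, k=3):
    """Mean squared error of predicting the training mean, averaged over folds."""
    errors = []
    for train, test in k_fold_indices(len(y), k):
        mean = sum(y[i] for i in train) / len(train)       # "fit" on k-1 folds
        errors.append(sum((y[i] - mean) ** 2 for i in test) / len(test))
    return sum(errors) / len(errors)                       # average held-out error

y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(cross_val_mse(y, k=3))
```

In practice one would use a library routine such as scikit-learn’s cross-validation utilities, but the loop above is all that is happening underneath.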