We've had some great threads on books for learning statistics and other topics recently! Machine learning is obviously a massive trend right now, and packages like keras look super exciting.
The trouble is, I have zero background in this stuff, and seeing some minimal examples of a package running isn't really the same as understanding what machine learning is trying to achieve and how it's going about it (perhaps contrasted with, say, regression modelling, which I feel I understand pretty well). I'm looking for some summer reading material (since, y'know, it's not like I have a thesis to submit or anything) to take me a bit further than the ML elevator pitch. Does anyone have any favourite books? I'm fairly language-agnostic (R would be preferred, with Python a runner-up, but again, I'm more interested in how well the theory is laid out). Cheers!
A great starting place that I'd recommend would be An Introduction To Statistical Learning; it's a very well-written and accessible introduction to the area, complete with R exercises and code for you to get more familiar with. If you want a heavier version of the material (with more of the maths left in) from the same authors, check out The Elements of Statistical Learning.
For a nice, practical, online course introducing the basic concepts I'd recommend Machine Learning on Coursera; it's fantastic content with some excellent hands-on exercises, and will give you some of the finer details and broader concepts, too.
edX also has a few worth checking out. This one from Caltech is basically filmed lectures, but the guy knows his stuff. Others have good reviews, but I haven't tested them personally.
I'd definitely be interested to know if anybody has a bookdown project underway featuring relevant R packages.
I second "An Introduction To Statistical Learning" as a good start. The authors of the text teach a free online course that follows the material of the book. I found this a very engaging course which gave me a good introduction to the subject. I wasn't expecting it, but it also contains a lot of important supplementary material on statistical methods such as bootstrapping, cross-validation, etc., which are important for the field but don't immediately come to mind when putting together a curriculum on machine learning.
N.B. The course covers methods such as linear regression, decision trees, and support vector machines. It doesn't really touch on "deep learning" methods such as neural nets.
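To make the resampling ideas mentioned above (bootstrapping and cross-validation) a bit more concrete, here is a minimal sketch in plain Python using only the standard library. It is not taken from the course or the book; the helper names (`fit_line`, `kfold_mse`, `bootstrap_se`) and the toy data are made up for illustration. It fits a simple linear regression, estimates its out-of-sample error with k-fold cross-validation, and estimates the standard error of a mean with the bootstrap.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form). Returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def kfold_mse(xs, ys, k=5, seed=0):
    """Estimate out-of-sample mean squared error by k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint, roughly equal folds
    errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit on the training folds only, then score on the held-out fold
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors.extend((ys[i] - (a + b * xs[i])) ** 2 for i in held_out)
    return statistics.fmean(errors)

def bootstrap_se(values, stat=statistics.fmean, n_boot=1000, seed=0):
    """Bootstrap standard error: resample with replacement, recompute stat."""
    rng = random.Random(seed)
    n = len(values)
    reps = [stat(rng.choices(values, k=n)) for _ in range(n_boot)]
    return statistics.stdev(reps)

# Toy data (invented for this example): y = 2 + 3x plus Gaussian noise
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]

print("5-fold CV MSE:", kfold_mse(xs, ys))
print("Bootstrap SE of mean(y):", bootstrap_se(ys))
```

Since the noise here has variance 1, the cross-validated MSE should land somewhere near 1. In practice you would reach for a library (for example caret in R or scikit-learn in Python) rather than hand-rolling this, but seeing the loop written out can help when reading the corresponding chapters.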
I can highly recommend the book from Max Kuhn (the author of the caret pkg): Applied Predictive Modeling. Apart from showing and discussing the models, it also goes into the practicalities of training, pitfalls to watch out for, etc.
An Intro to and Elements of Statistical Learning came up in an earlier thread, so I reckon they'll be a good start. I also like the look of that link, @pedram! Thanks, everyone!
I 100% support starting with ISLR. It's a great guide to the fundamentals and underlying theory.
For book 2, I recommend:
It's a great walk through a dozen ML models serving a specific purpose. It skips over some theory in favor of getting you to be productive. It also really doesn't require much programming experience.
For book 3, I recommend going to ESL. It's much denser than the other two books, but covers the theory in great detail. If you get through those three books, you'll have a solid understanding of the underlying statistics and the practical usage of machine learning in R. From there, picking up new tools and techniques, like neural nets using Keras, will make a lot more sense and be easier.
Some people here have linked to specific O'Reilly books, but I can't see a link to their free ebooks page, which is fantastic. It covers ebooks from "O’Reilly editors, authors, and Strata speakers" covering topics such as Data Science, AI and Big Data.