Home
This was going to be the second edition of my original book, "Getting Started with Deep Learning". However, so much has changed about the book and the field since its initial publication that I decided a new name was more appropriate. It is now 2020 and deep learning is still going strong; in fact, I believe its evolution is accelerating. The techniques are now widely used by companies, and the algorithms are starting to do things that are truly amazing. As is necessary with progress, the algorithms have also become more complicated, with deeper and more resource-intensive networks. This is best exemplified by one of the newest deep learning architectures: the Transformer. Transformers are, for me, the first algorithm I was not able to run on a laptop. They truly require a machine learning "war machine": lots of GPU power, memory, and so on.

The algorithms themselves are much more complicated too; a little too much, in fact, and the frameworks are starting to abstract away too much of the code. This is something I am not crazy about, as I like writing code from scratch, and I never use a deep learning algorithm until I understand every detail about it. My quest for understanding always makes me gravitate away from abstracting libraries and oversimplifications. As such, I have great admiration for the static computational graph and the TensorFlow low-level API. I feel I can only truly understand a deep learning algorithm when I implement it in the low-level API with a static graph, so all algorithms discussed in this book are implemented in this way. The goal of this book, then, is to learn and better understand how to write deep learning algorithms from scratch (as much as is possible using TensorFlow) with the TensorFlow low-level API and the static graph. This is a book for everyone, from those just starting in deep learning to those with more advanced knowledge.
The book starts with basic linear regression and builds with every chapter toward more advanced algorithms such as CNNs, RNNs, autoencoders, GANs, Q-learning, and Transformers, to name a few. I hope you enjoy the book. I have certainly enjoyed writing it.
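To give a feel for the static-graph style the book is built on, here is a minimal sketch of linear regression written against the TF 1.x-compatible low-level API: build the graph first, then run it in a session. The dataset and variable names here are illustrative, not taken from the book's chapters.

```python
import numpy as np
import tensorflow as tf

# Use the static computational graph, as in TensorFlow 1.x.
tf.compat.v1.disable_eager_execution()

# --- Build the graph ---
x = tf.compat.v1.placeholder(tf.float32, shape=[None, 1], name="x")
y = tf.compat.v1.placeholder(tf.float32, shape=[None, 1], name="y")

w = tf.Variable(tf.zeros([1, 1]), name="weight")
b = tf.Variable(tf.zeros([1]), name="bias")

y_pred = tf.matmul(x, w) + b                       # linear model
loss = tf.reduce_mean(tf.square(y_pred - y))       # mean squared error
train_op = tf.compat.v1.train.GradientDescentOptimizer(0.1).minimize(loss)

# --- Run the graph ---
# Toy data following y = 2x + 1 (illustrative only).
xs = np.array([[0.0], [1.0], [2.0], [3.0]], dtype=np.float32)
ys = 2.0 * xs + 1.0

with tf.compat.v1.Session() as sess:
    sess.run(tf.compat.v1.global_variables_initializer())
    for _ in range(500):
        sess.run(train_op, feed_dict={x: xs, y: ys})
    w_val, b_val = sess.run([w, b])
# After training, w_val approaches 2.0 and b_val approaches 1.0.
```

Notice that nothing is computed while the graph is defined; `matmul`, the loss, and the optimizer step only describe operations, and actual numbers flow only when `sess.run` is called with a `feed_dict`. This separation is exactly what the book exploits to expose every detail of each algorithm.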
Ever since 2007, with the explosion in the use of parallel hardware, the field of machine learning has become more exciting and more promising. It seems the dream of true AI is finally just around the corner. Certainly, many companies are starting to rely heavily on AI for their products. These include search and social companies like Google and Facebook, as well as retailers and multimedia companies like Amazon and Netflix. More recently, many others in the health-care and cyber-security industries have also become interested in what AI and machine learning can do for them. Some of these technologies, such as TensorFlow (which came about around 2015), are new and not widely understood. In this book I hope to provide basic discussions of machine learning, and in particular deep learning, to help readers quickly get started with these technologies.

This book is not a comprehensive survey of deep learning. There are many topics I do not cover here, as too much material can be overwhelming to the uninitiated; there are many good books that cover all the theory in depth, and I will mention some of them along the way. Instead, the goal of this book is to help people new to deep learning get started quickly with these concepts using Python and TensorFlow. Therefore, a lot of detail is devoted to helping the reader write his or her first deep network classifier. Additionally, I will try to connect several elements of machine learning that I think are related and are very important for data analysis and automatic classification. In general, I prefer Python, and I will present all examples in this great language. I will also use the more common libraries and the Linux development environment. Many people use scikit-learn (sklearn), so I have tried to use this library in the TensorFlow examples so that the focus stays mainly on creating the deep network architectures.