by Harini Janakiraman
Day 24: How to build a Deep Learning Image Classifier for Game of Thrones dragons
Performance of most flavors of the old generations of learning algorithms will plateau. Deep learning, training large neural networks, is scalable and performance keeps getting better as you feed them more data. — Andrew Ng
Deep learning doesn’t have to take a huge amount of time or computational resources. Nor does it require highly complex code, and in some cases it doesn’t even need a large amount of training data. Curated best practices are now available as libraries that make it easy to plug in and write your own neural network architectures, using a minimal amount of code, and achieve a prediction accuracy of more than 90%.
The two most popular deep learning libraries are: (1) PyTorch, created by Facebook (we will be using fastai today, which is built on top of PyTorch) and (2) the Keras/TensorFlow framework, created by Google.
We will build an image classifier using a Convolutional Neural Network (CNN) model to predict whether a given image shows Drogon or Viserion (any Game of Thrones fans here in the house? Clap to say yay!).
You can adapt this problem statement to any type of image classification that interests you. Here are some ideas: cat or dog (classic deep learning 101), if a person is wearing glasses or not, bus or car, hot dog vs not-hot dog (Silicon Valley fans also say yay! ;) ).
Step 1: Installation
You can run your model on any GPU-accelerated cloud computing platform. For the purpose of this blog we will be using Paperspace (the most affordable option). Complete instructions on how to get this up and running are available here.
Once set up, you can launch Jupyter notebook on that machine using the following command:
This will give you a localhost URL. Replace “localhost” in that URL with your machine’s IP address, then open it in your browser to launch your notebook.
Now you can copy over the IPython notebook and dataset files from my GitHub repo into the directory structure below.
Note: Do not forget to shut down the machine from the Paperspace console once you are done, to avoid getting accidentally charged.
Step 2: Training
Follow the instructions in the notebook to initialize the libraries needed for this exercise, and set PATH to the location of your data directory. Note that each block of code can be run using Shift+Enter. In case you need additional info on Jupyter notebook commands, you can read more here.
Now for training the image classifier. The following three lines of code form the core of building the deep learning model:
- data: represents the training and validation datasets.
- learn: contains the model.
- learn.fit(learning_rate, epochs): fits the model using two parameters, the learning rate and the number of epochs.
We have set the learning rate to “0.01” here. The learning rate controls how large a step the model takes each time it updates its weights during training. It needs to be small enough that the model moves toward a good solution in incremental steps rather than overshooting it. But it shouldn’t be too small, either, as training would then need too many steps and take too long. The library has a learning rate finder method, “lr_find()”, to help you pick a good value.
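To make the effect of the learning rate concrete, here is a minimal sketch of plain gradient descent on a toy one-parameter loss, (w - 3)^2. This is not the fastai training loop: the function name `gradient_descent`, the toy loss, and the three rates tried are all assumptions chosen for illustration.

```python
def gradient_descent(learning_rate, steps=20, start=0.0):
    """Minimize the toy loss (w - 3)**2 and return the final value of w."""
    w = start
    for _ in range(steps):
        grad = 2 * (w - 3)             # derivative of (w - 3)**2 with respect to w
        w = w - learning_rate * grad   # the learning rate scales each update step
    return w


# Illustrative rates only: one too small, one reasonable, one too large.
for lr in (0.001, 0.1, 1.1):
    print(f"lr={lr}: w after 20 steps = {gradient_descent(lr):.3f}")
```

With the smallest rate, w barely moves away from its starting point in 20 steps. With 0.1 it ends up close to the minimum at 3. With 1.1 each step overshoots by more than the last, so w moves further and further away.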
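The number of epochs is set to “3” in the code here. One epoch is one full pass over the training data, so this value is how many times the model sees the whole training set. We can run as many epochs as we want, but after a point accuracy on the validation data will start to get worse due to overfitting.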
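As a rough sketch of what an epoch means in terms of work done, the snippet below counts the mini-batch updates in a training run, where one epoch is one full pass over the training set. The helper names `iterate_minibatches` and `count_updates`, and the figures of 100 images and a batch size of 32, are hypothetical and not taken from the notebook.

```python
def iterate_minibatches(items, batch_size):
    """Yield consecutive mini-batches that together cover every item once."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def count_updates(num_images, batch_size, epochs):
    """Count how many mini-batch updates a training run would perform."""
    images = list(range(num_images))    # stand-ins for real images
    updates = 0
    for _ in range(epochs):             # each epoch sees the full training set once
        for _batch in iterate_minibatches(images, batch_size):
            updates += 1                # a real loop would update the weights here
    return updates


# 100 images in batches of 32 gives 4 batches per epoch (32, 32, 32, 4).
print(count_updates(num_images=100, batch_size=32, epochs=3))  # prints 12
```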
Step 3: Prediction
We will now run prediction on the validation data using the trained model.
PyTorch returns the log of the predicted probabilities, so to get the actual probability you have to raise e to the power of the prediction, using numpy’s exp function. Follow the instructions step by step in the notebook in my GitHub repo. A probability close to 0 implies it’s an image of Drogon and a probability close to 1 implies it’s an image of Viserion.
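The conversion from log-probabilities to probabilities can be sketched as follows. The values in `log_preds` are made up for illustration, and the column order (column 0 for Drogon, column 1 for Viserion) is an assumption that matches the 0/1 convention described above.

```python
import numpy as np

# Hypothetical log-probabilities for three images; columns = [Drogon, Viserion].
log_preds = np.log(np.array([[0.9, 0.1],
                             [0.2, 0.8],
                             [0.5, 0.5]]))

probs = np.exp(log_preds)                   # e to the power of each log-probability
prob_viserion = probs[:, 1]                 # probability that each image is Viserion
pred_class = np.argmax(log_preds, axis=1)   # 0 = Drogon, 1 = Viserion

print(prob_viserion)
print(pred_class)
```

Each row of `probs` sums to 1, so the Viserion column alone is enough to describe a two-class prediction.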
Step 4: Visualize
Plotting functions can be used to visualize the prediction results better. The images below show correctly classified validation data, with a probability of 0.2–0.3 indicating it’s Drogon and a probability of 0.7–0.8 indicating it’s Viserion.
You can also look at the most uncertain predictions, those whose probability lingers close to 0.5.
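One way to pick out the most uncertain predictions is to sort by distance from 0.5. This is a minimal sketch: the helper name `most_uncertain` and the probabilities in `probs` are made up for illustration, not output from the trained model.

```python
import numpy as np


def most_uncertain(probs, k=2):
    """Return the indices of the k probabilities closest to 0.5."""
    probs = np.asarray(probs)
    return np.argsort(np.abs(probs - 0.5))[:k]


probs = np.array([0.05, 0.48, 0.93, 0.61, 0.27])  # made-up Viserion probabilities
print(most_uncertain(probs))  # prints [1 3]
```

The returned indices can then be used to look up and plot the corresponding validation images.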
The image classifier can produce uncertain predictions in some scenarios, for example with long-tailed images, because it only looks at a small square piece of the image at a time.
In those cases, techniques such as data augmentation, optimizing the learning rate, using differential learning rates for different layers, and test-time augmentation can improve the results. These advanced concepts will be explored in future posts.
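As a small preview of one of these ideas, test-time augmentation averages the model's predictions over several altered copies of the same image. The sketch below uses only a horizontal flip and a toy stand-in for the model. The names `predict_with_tta` and `toy_predict` are hypothetical, and a real setup would typically use the trained classifier and a wider set of augmentations.

```python
import numpy as np


def predict_with_tta(predict_fn, image):
    """Average predictions over the original image and its horizontal flip."""
    variants = [image, np.fliplr(image)]
    return float(np.mean([predict_fn(v) for v in variants]))


def toy_predict(image):
    """Toy stand-in for a trained model: mean brightness of the left half."""
    half = image.shape[1] // 2
    return float(image[:, :half].mean())


image = np.zeros((4, 4))
image[:, :2] = 1.0   # left half bright, right half dark

print(toy_predict(image))                    # prints 1.0
print(predict_with_tta(toy_predict, image))  # prints 0.5
```

The toy model is deliberately sensitive to which side of the image is bright, so averaging over the flipped copy pulls its overconfident answer back toward the middle.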
This blog was inspired by the fastai CNN video. To get an in-depth understanding and continue your quest in Deep Learning, you can take the famous set of courses by Andrew Ng on Coursera.