Transfer Learning in ConvNets

The rise of deep learning methods in areas like computer vision and natural language processing has led to the need for massive datasets to train deep neural networks. In most cases, finding a large enough dataset is not possible, and training a deep neural network from scratch for a particular task can be very time consuming. To address this problem, transfer learning can be used as a learning framework, where the knowledge acquired from a previously learned, related task is transferred to improve learning on a new task.

In simple terms, transfer learning helps a machine learning model learn more easily by getting help from a pre-trained model whose domain is similar to some extent (not exactly the same).


The ways in which transfer might improve learning

There are cases where the transfer actually decreases performance; this is called negative transfer. Normally, we humans take on the task of deciding which knowledge can be transferred (the mapping) for a particular task, but active research is going on into ways of doing this mapping automatically.

That’s all for the theory! Let’s discuss how we can apply transfer learning to a computer vision task. As you all know, Convolutional Neural Networks (CNNs) perform really well on tasks such as image classification and image recognition. Training deep CNNs needs large amounts of image/video data, and the massive number of parameter-tuning operations means models take a long time to train. In such cases, transfer learning is a great fit for training new models, and it is widely used in industry as well as in research.

There are three main approaches to using transfer learning in machine learning problems. To make them easier to understand, I’ll take my examples from the context of training deep neural network models for computer vision tasks (image classification, labeling etc.).

ConvNet as fixed feature extractor –

In this case, you take a ConvNet that has been pre-trained on a large image repository like ImageNet and remove its last fully connected layer. The rest of the network is used as a fixed feature extractor for your own dataset. Then a linear classifier (softmax or a linear SVM) is trained on those features for the new dataset.


VGG16 as a feature extractor
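As a rough sketch of this idea (not code from the original post), here’s how VGG16 from keras.applications could be used as a frozen feature extractor; X_train, y_train and num_classes are placeholders you’d replace with your own data:

# Using a pre-trained VGG16 (without its fully connected top) as a fixed feature extractor
from keras.applications.vgg16 import VGG16, preprocess_input
from keras.models import Sequential
from keras.layers import Flatten, Dense

# Convolutional base pre-trained on ImageNet; include_top=False drops the fully connected layers
conv_base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Run the images through the frozen base once to get fixed features
# (X_train is a placeholder for your preprocessed image array)
features = conv_base.predict(preprocess_input(X_train))

# Train a simple softmax classifier on top of those fixed features
# (y_train and num_classes are placeholders)
clf = Sequential()
clf.add(Flatten(input_shape=features.shape[1:]))
clf.add(Dense(num_classes, activation='softmax'))
clf.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
clf.fit(features, y_train, epochs=10, batch_size=32)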

Fine-tuning the ConvNet –

Here we don’t just stop at using the ConvNet as a feature extractor. We fine-tune the weights of the ConvNet with the data we have. Sometimes not the whole network is tuned; only the last set of layers is, since the first layers capture the most generic features.
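To make that concrete, here is a rough sketch of fine-tuning in Keras (again assuming the keras.applications VGG16 base; num_classes, X_train and y_train are placeholders): freeze the early blocks and re-train only the last block plus a new head, with a small learning rate.

# Fine-tuning: freeze the early convolutional blocks and re-train only the last one plus a new top
from keras.applications.vgg16 import VGG16
from keras.models import Model
from keras.layers import Flatten, Dense
from keras import optimizers

conv_base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Only the 'block5_*' layers (the last convolutional block) stay trainable;
# the earlier, more generic layers are frozen
for layer in conv_base.layers:
    layer.trainable = layer.name.startswith('block5')

# New classification head for our task (num_classes is a placeholder)
x = Flatten()(conv_base.output)
x = Dense(256, activation='relu')(x)
predictions = Dense(num_classes, activation='softmax')(x)

model = Model(inputs=conv_base.input, outputs=predictions)

# A small learning rate keeps the pre-trained weights from being destroyed
model.compile(optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, batch_size=32)  # X_train, y_train are placeholders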

Using pretrained models –

Here we use the pre-trained models available in most deep learning frameworks and adjust them according to our needs. In the next post, we’ll discuss how to perform this using PyTorch.

One of the most important decisions to make in transfer learning is whether to fine-tune the network or to leave it as it is. The size of your dataset and its similarity to the dataset the model was originally trained on are the deciding factors. Here’s a summary that will help you take that decision.

Let’s discuss how to perform transfer learning with an example in the next post. 😊


The Story of Deep Pan Pizza: AI Explained for Dummies

Artificial Intelligence, Machine Learning, Neural Networks, Deep Learning….

Most probably, the words above are the most widely used and widely discussed buzzwords today. Even the big companies use them to make their products appear more futuristic and “market candy” (like a ‘tech giant’ recently introducing something called a ‘neural engine’)!

Though AI and the related buzzwords are so popular, there are still some misconceptions about their definitions. One thing you should clearly know is that AI, machine learning & deep learning deviate quite a lot from the field called “Big Data”. It’s true that some ML & DL experiments use big data for training… but keep in mind that handling big data and doing operations with big data is a separate discipline.

So, what is Artificial Intelligence?

“Artificial intelligence, sometimes called machine intelligence, is intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans and other animals.” – Wikipedia

Simple as that. If a system has been developed to perform tasks that need human intelligence, such as visual perception, speech recognition and decision making, it can be defined as an intelligent system, or a so-called AI!

The most famous “Turing Test” developed by Alan Turing (Yes. The Enigma guy in the Imitation Game movie!) proposed a way to evaluate the intelligent behavior of an AI system.


Turing Test

There are two closed rooms… let’s say A & B. In room A we have a human, while in room B we have a machine. The interrogator, person C, is given the task of identifying which room the human is in. C is limited to using written questions to make the determination. If C fails to do it, the computer in room B can be defined as an AI! Though this test is not so valid for the intelligent systems we have today, it gives a basic idea of what AI is.

Then Machine Learning?

Machine learning is a subcomponent of AI that consists of methods and algorithms which allow computer systems to statistically learn the patterns in data. Isn’t that just statistics? No. Machine learning doesn’t rely on rule-based programming (meaning an if-else ladder is not ML 😀 ), whereas statistical modeling is mostly about formulating relationships between data in the form of mathematical equations.

There are many machine learning algorithms out there: SVMs, decision trees, unsupervised methods like K-means clustering, and the so-called neural networks.

That’s ma boy! Artificial Neural Networks?

Inspired by the neural networks we all have inside our bodies, artificial neural network systems “learn” to perform tasks by considering many examples. Simply put, we show a thousand images of cute cats to an ANN and next time… when the ANN sees a cat it is gonna yell… “Hey, it seems like a cat!”.

If you wanna know all the math and magic behind that… just Google! Tons of resources there.

Alright… then Deep Learning?

Yes! That’s deep! Imagine the typical vanilla neural network as a thin crust pizza… It has an input layer (the crust), one or two hidden layers (the thinly soft part in the middle) and an output layer (the topping). When it comes to Deep Learning, or deep neural networks, that’s DEEP PAN PIZZA!


DNNs are just like Deep Pan Pizzas

Deep Neural Networks consist of many hidden layers between the input layer and the output layer. Not only the typical propagation operations, but also some add-ins (like pineapple) in the middle: pooling layers, activation functions… MANY!

So, the CNNs… RNNs…

You can have many flavors of deep pan pizza! Some are good for spicy lovers… some are good for meat lovers. Same with deep neural networks. Many good researchers have found interesting ways of connecting the hidden layers (or baking the yummy middle) of DNNs. Some of them are very good at image interpretation while others are good at predicting values that involve time or state. Convolutional Neural Networks and Recurrent Neural Networks are the most famous flavors of these deep pan pizzas!

These deep pan pizzas have proven that they are able to perform some tasks with close-to-human accuracy, and sometimes even with higher accuracy than humans!

Don’t panic! Robots would not invade the world soon…

 

Image Courtesy : DataScienceCentral | Wikipedia

Deep Learning Vs. Traditional Computer Vision


Traditional way of feature extraction

Computer vision can be succinctly described as finding and describing features in images to help discriminate objects and/or classes of objects.

Computer vision has become one of the vital research areas, and commercial applications built on computer vision methodologies make up a huge portion of industry. The accuracy and the speed of processing and identifying images captured from cameras have developed over the decades. Being the well-known boy in town, deep learning is now playing a major role as a computer vision tool.

Is deep learning the only tool to perform computer vision?

A big no! Deep learning came onto the computer vision scene only a couple of years back. Before that, computer vision was mainly based on image processing algorithms and methods. The main process in computer vision was extracting the features of the image: detecting colors, edges, corners and objects was the first step in performing a computer vision task. These features are human engineered, and the accuracy and reliability of the models directly depend on the extracted features and on the methods used for feature extraction. In the traditional vision scope, algorithms like SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features) and BRIEF (Binary Robust Independent Elementary Features) play the major role of extracting the features from the raw image.
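To get a feel for what such hand-engineered feature extraction looks like in code, here’s a small illustrative sketch using OpenCV’s ORB detector (a free alternative in the same family as SIFT/SURF; 'image.jpg' is just a placeholder path):

# Traditional, hand-engineered feature extraction with OpenCV (illustrative sketch)
import cv2

# 'image.jpg' is a placeholder path for any input image
img = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute binary descriptors with ORB
orb = cv2.ORB_create(nfeatures=500)
keypoints, descriptors = orb.detectAndCompute(img, None)
print('Detected keypoints:', len(keypoints))

# In the traditional pipeline, these descriptors would then be fed to a classical
# classifier (e.g. an SVM); exactly the hand-crafted step that deep learning replaces.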

The difficulty with this approach to feature extraction in image classification is that you have to choose which features to look for in each given image. When the number of classes goes up, or the image clarity goes down, it is really hard to cope with traditional computer vision algorithms.

The Deep Learning approach –

Deep learning, which is a subset of machine learning, has shown significant performance and accuracy gains in the field of computer vision. In 2012, in arguably one of the most influential papers applying deep learning to computer vision, a neural network containing over 60 million parameters significantly beat previous state-of-the-art approaches to image recognition in a popular ImageNet computer vision competition: ILSVRC-2012.

The boom started with convolutional neural networks and the modified architectures of ConvNets. By now, some ConvNet architectures are said to come very close to 100% accuracy on image classification challenges, sometimes beating the human eye!

The main difference in the deep learning approach to computer vision is the concept of end-to-end learning. There is no longer any need to define the features and do feature engineering; the network does that for you. It can be put simply in this way:

If you want to teach a [deep] neural network to recognize a cat, for instance, you don’t tell it to look for whiskers, ears, fur, and eyes. You simply show it thousands and thousands of photos of cats, and eventually it works things out. If it keeps misclassifying foxes as cats, you don’t rewrite the code. You just keep coaching it.

Wired | 2016

Though deep neural networks have major drawbacks, like the need for huge amounts of training data and large computation power, the field of computer vision has already been conquered by this amazing tool!

Keras: The API for Human Beings

Obviously, deep learning is a hit! Being a subfield of machine learning, building deep neural networks for various predictive and learning tasks is one of the major practices of AI enthusiasts today. There are several deep learning frameworks out there that help with building deep neural networks. TensorFlow, Theano and CNTK are some of the major frameworks used in industry and in research. These frameworks have their own ways of defining tensor units and of configuring the connections between nodes, which involves a bit of a learning curve.

As shown in the graph, TensorFlow is the most popular and most widely used deep learning framework right now. When it comes to Keras, it doesn’t work independently; it works as an upper layer on top of existing deep learning frameworks, namely TensorFlow, Theano & CNTK (an MXNet backend for Keras is on the way). To be more precise, Keras acts as a wrapper for these frameworks. Working with Keras is as easy as working with Lego blocks: what you have to know is where to fix the right component. So it is the ultimate deep learning tool for human beings!


Architecture of Keras API

Why Keras?

  • Fast prototyping – In most cases, you may have to test different neural architectures to find the best fit. Building the models from scratch every time can be time consuming. Keras helps here by modularizing your task and letting you reuse code.
  • Supports CNNs, RNNs & combinations of both
  • Modularity
  • Easy extensibility
  • Simple to get started, simple to keep going
  • Deep enough to build serious models
  • Well-written documentation – Yes! Refer to http://keras.io
  • Runs seamlessly on CPU and GPU – Keras supports GPU parallelization, which will boost your execution.

Keras follows a very simple design idea. Here I’ve summed up the four main steps of designing a Keras deep learning model; a minimal skeleton illustrating them follows the list.

  1. Prepare your inputs and output tensors
  2. Create first layer to handle input tensor
  3. Create output layer to handle targets
  4. Build virtually any model you like in between
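
A minimal skeleton that walks through those four steps (the layer sizes and the 10-class output here are arbitrary placeholders, not from the original post):

# Minimal Keras skeleton following the four steps above
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()

# Steps 1 & 2: the first layer declares the shape of the input tensor (100 features is a placeholder)
model.add(Dense(64, activation='relu', input_shape=(100,)))

# Step 4: build virtually any stack of layers you like in between
model.add(Dense(64, activation='relu'))

# Step 3: the output layer matches the target tensor (10 classes is a placeholder)
model.add(Dense(10, activation='softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])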

Basically, Keras models go through the following pipeline. You may have to revisit the steps again and again to come up with the best model.

Let’s start with a simple experiment that involves classifying dog & cat images from Kaggle. First, make sure to download the training & testing image files from Kaggle (https://www.kaggle.com/c/dogs-vs-cats/data).

Before playing with Keras, you may need to set up your rig. Please refer to this post and make your beast ready for deep learning.

Then try this code! The code sections are commented for your reference. Here I’m using the TensorFlow backend. You can change the configuration a bit and use Theano or CNTK as you wish.
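By the way, if you want to check which backend Keras is currently using, or switch it, here’s a quick sketch (assuming a standard Keras 2 installation):

# Print the backend Keras is currently using
import keras
print(keras.backend.backend())  # e.g. 'tensorflow'

# To switch backends, edit the "backend" field in the keras.json config file
# (usually at ~/.keras/keras.json, or %USERPROFILE%/.keras/keras.json on Windows) and restart Python.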

# Convolutional Neural Network with Keras

# Installing Tensorflow
# pip install tensorflow-gpu

# Installing Keras
# pip install --upgrade keras

# Part 1 - Building the CNN

# Importing the Keras libraries and packages
import keras
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense

# Initialising the CNN
classifier = Sequential()

# Step 1 - Convolution
#input_shape is (height, width, channels) for the TensorFlow backend; channels come first with the Theano backend
#Images are 2D (64x64 RGB)
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))

# Step 2 - Pooling
#Most of the time a (2, 2) pool is used; it downsamples without losing much information
classifier.add(MaxPooling2D(pool_size = (2, 2)))

# Adding a second convolutional layer
#Inputs are the pooled feature maps of the previous layer
classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))

 

# Step 3 - Flattening
classifier.add(Flatten())

# Step 4 - Full connection
#relu - rectifier activation function
#128 nodes in the hidden layer
classifier.add(Dense(units = 128, activation = 'relu'))
#Sigmoid is used because this is a binary classification. For multiclass softmax
classifier.add(Dense(units = 1, activation = 'sigmoid'))

# Compiling the CNN
#adam is for stochastic gradient descent 
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

# Part 2 - Fitting the CNN to the images
#Preprocess the images to reduce overfitting
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255, #All the pixel values would be 0-1
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'binary')

test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')

classifier.fit_generator(training_set,
                         steps_per_epoch = 250, #8000 training images / batch size of 32
                         epochs = 5,
                         validation_data = test_set,
                         validation_steps = 63) #2000 test images / batch size of 32

#Prediction
import numpy as np
from keras.preprocessing import image
test_image = image.load_img('dataset/single_prediction/cat_or_dog_2.jpg', target_size=(64,64))
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis=0)
result = classifier.predict(test_image)
print(training_set.class_indices) #mapping of class labels to indices, e.g. {'cats': 0, 'dogs': 1}
if result[0][0] == 1:
    prediction = 'dog'
else:
    prediction = 'cat'

print(prediction)

Configuring a Windows Running Deep Learning Rig

When it comes to deep learning, the first thing that comes to your mind is “computation power”. The thousands of matrix operations you are going to perform when training deep neural networks would take ages if you used only the CPU.

The solution is Graphical Processing Units (GPUs).

There are a few ways you can get this high computation power for deep learning.

No offence, but in my experience the Linux operating system (what I’m using is the Ubuntu flavor) comes in handy for performing deep learning operations in Python, because the terminal, bash commands, open source editing tools and GPU hackability are a bit easier for me on Linux.

But the recent Windows and Visual Studio updates make it possible to do deep learning on your Windows rig too.

Here are the steps I’ve followed to configure my laptop to perform some DL based computations with Tensorflow and Keras.

The laptop I’m using is an Asus UX310UA with a Core i7 7th Gen processor, 16GB RAM and an Nvidia GeForce 940MX 2 GB GPU.

I’m running Windows 10 Enterprise 1703 build on my laptop.

Please note that the following steps may change according to some conditions.

  1. Check the GPU processing capability of your GPU

If you wish to use your GPU for parallel processing, first check the CUDA support of your GPU device. The more CUDA cores you have, the more computation you get. As an example, the Nvidia Tesla K80 has 4992 CUDA cores, while the GeForce 940MX is equipped with 384 CUDA cores. The GPU compute capability should be 3.0 or higher.

Check whether your GPU appears in the list here:

https://www.geforce.com/hardware/technology/cuda

 

  2. Install CUDA Toolkit

Installing CUDA on Windows has a dependency on a C++ compiler. The CUDA version I’ve installed on my laptop is CUDA 8.0. Along with that, I’ve installed the Visual C++ 15.0 compiler. Refer to the following guide to install the CUDA Toolkit on your computer.

 http://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html

 

  3. Install CuDNN Tools

For faster computations, you need to install the CUDA Deep Neural Network library (cuDNN). Depending on the CUDA version you’ve installed, you should select the appropriate cuDNN version. In my case, with CUDA 8.0, both cuDNN 7.0 & cuDNN 6.0 work. When it came to package installations, cuDNN 7.0 threw me some errors, so I went with cuDNN 6.0 and it’s working fine on my machine 😊

Note that you need to do some manual file copy-pasting in this step.

http://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html#install-windows

To be on the safe side, restart the machine now! It will then pop up any additional dependencies that the GPU asks you to install.

 

  4. Install Anaconda

Now it’s time for the big snake! Anaconda is the leading Python data science platform. It comes with many pre-installed essential libraries and configurations that you may need regularly. Go with Python 3, since it is the latest.

https://www.anaconda.com/download/

 

  5. Create a python environment for your experiments

Python comes with a hell of a lot of libraries that you may need to compile your program, so the best thing is to create a separate environment for deep learning and use it. It will save you from tangling the dependencies among libraries.

Go to the Anaconda prompt (find it in the Start menu; it’s advised to open the conda prompt as administrator) and push the command below. We are using Python 3.5 at the moment. ‘tensorflow-gpu’ is the environment name.

conda create -n tensorflow-gpu python=3.5 anaconda

Activate the environment

activate tensorflow-gpu


  6. Install Theano

Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. We need it! Make sure you are installing all of these inside your environment.

conda install theano

 

  7. Install mingw python

Even though Python is an interpreted language, you may need to install Windows C++ compilers in some cases. For Python 3.5/3.6 you can use the Visual C++ 14.0 compiler.

conda install mingw libpython

 

  8. Install tensorflow

TensorFlow is an open source library for numerical computation. If you don’t have a GPU in your machine, just install the CPU-only package (pip install tensorflow) instead.

pip install tensorflow-gpu
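
Once TensorFlow is installed, a quick sanity check (a small sketch, assuming a TensorFlow 1.x GPU build) will tell you whether it can actually see your GPU:

# Quick check that TensorFlow can see the GPU (assumes a TensorFlow 1.x GPU install)
from tensorflow.python.client import device_lib

# Lists the devices TensorFlow can use; a working setup should show a '/device:GPU:0' entry
print(device_lib.list_local_devices())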

 

  9. Install keras

Keras is a high-level neural network API. It can run on top of TensorFlow, CNTK or Theano. For ease of coding, we will install Keras too.

conda install keras

 

  10. Update all the packages

conda update --all

All set! 😊 Now you are ready to start coding. Start with your favorite IDE. For me, I prefer Spyder and sometimes Visual Studio. You can launch Spyder directly from your Anaconda prompt or Anaconda Navigator.

We’ll discuss dealing with Python in Visual Studio in the next article.

Democratizing Machine Learning with Cloud

We have already passed the era of gigabytes when it comes to data. The world is talking about terabytes of unstructured data and massive numbers of data points generated from IoT devices and sensors, millions per second. To analyze these heaps of data, obviously, we need large computation power and massive storage. Building workhorse machines to fulfil those tremendous workloads would definitely cost a lot. The cloud computing paradigm comes in handy here. The resourcefulness and the scalability of the public cloud can be used to perform the large calculations in machine learning algorithms.

Almost all the major public cloud providers in the market come with machine learning services. Cloud Machine Learning services in Google Cloud Platform provide modern machine learning services, with pre-trained models and a service to generate your own tailored models. Amazon Machine Learning is a service that makes it easy for developers of all skill levels to use machine learning technology. IBM Analytics comes with a machine learning platform as part of its cloud data services. Azure Machine Learning Studio is a GUI-based integrated development environment for constructing and operationalizing machine learning workflows on Azure. We discussed Azure Machine Learning and its uses in practical scenarios at length in the previous posts.

All the mentioned platforms provide machine learning as a service. Most of them offer pre-built ML algorithms in packages. Simple drag-and-drop interactions and easy deployment have attracted many developers to these tools.

But how would it be if you wanted to go from scratch? Or if you wanted to use the power of Graphical Processing Units (GPUs) to run ML algorithms in parallel? Cloud-based virtual machines specifically optimized for computation are one of the best solutions you can consume.

Azure Data Science Virtual Machine (DSVM) –


DSVM in Azure Portal

If you have already used Azure virtual machines for your computation, hosting or storage tasks, this will not be a new concept for you. The Azure DSVM is specifically optimized for large computations. It comes in two flavors: one with Windows and the other with Linux. You can choose the hardware configuration as you wish. Many development environments, programming IDEs and languages come pre-installed in the VM instances.

My personal favorite here is the Linux DSVM instance. Here I’ve created a Linux DSVM with the basic configuration. For accessing the VM, you can use any tool that can make an SSH call. What I normally do is access the VM using Ubuntu Bash on Windows 10.

GPUs for machine learning –


Configurations of the Linux VM with Nvidia GPU

Many machine learning algorithms currently available can be executed in parallel; the execution parts of those algorithms are embarrassingly parallel. With such parallel programming, you can reduce the execution time of the algorithms drastically. Data scientists in both industry and academia have been using GPUs for machine learning to make groundbreaking improvements across a variety of applications, including image classification, video analytics, speech recognition and natural language processing.


GPUs Vs. CPU computing

Especially in deep learning, parallel processing using GPUs can drastically decrease computation time. Purchasing a deep learning dream machine powered with a CUDA-enabled high-end GPU such as the Nvidia Tesla K80 would cost nearly 6000 dollars! Rather than spending a lot on a machine like that, the most feasible plan is to provision a virtual machine with the specifications we need and pay as we consume.


VM instance price plans

The N-series is a family of Azure Virtual Machines with GPU capabilities that you can use for these kinds of tasks. The N-series will feature the NVIDIA Tesla accelerated platform as well as NVIDIA GRID 2.0 technology, providing the highest-end graphics support available in the cloud today. Through your Azure portal, you can choose a desired price plan with the desired configurations for your tasks when provisioning the VM.

Here’s my Azure VM specifically configured for deep learning exercises. The machine is powered with a Tesla K80 GPU, which has 4992 cores in it!! I installed Anaconda on it and do my computations using Jupyter notebooks.

Just a hint: stop your VM instance when you are not using it for computation to avoid getting huge unnecessary bills. 😉

No need for huge wallets! The wise decision is to apply cloud technologies for machine learning.