Evaluating AzureML Experiments

Azure Machine Learning Studio allows you to build and deploy predictive machine learning experiments easily with few drags and drops (technically 😉).

The performance of the machine learning models can be evaluated based on number of matrices that are commonly used in machine learning and statistics available through the studio. Evaluation of the supervised machine learning problems such as regression, binary classification and multi-class classification can be done in two ways.

  1. Train-test split evaluation
  2. Cross validation

Train-test evaluation –

In AzureML Studio you can perform train-test evaluation with a simple experiment setup. The ‘Score Model’ module make the predictions for a portion of the original dataset. Normally the dataset is divided into two parts and the majority is used for training while the rest used for testing the trained model.


Train-test split

You can use ‘Split Data’ module to split the data. Choose whether you want a randomized split or not. In most of the cases, randomized split works better. If the dataset is having a periodic distribution for an example a time series data, NEVER use randomized split. Use the regular split.

Stratified split allows you to split the dataset according to the values in the key column. This would make the testing set more unbiased.

  • Pros-
    • Easy to implement and interpret
    • Less time consuming in execution
  • Cons-
    • If the dataset is small, keeping a portion for testing would be decrease the accuracy of the predictive model.
    • If the split is not random, the output of the evaluation matrices are inaccurate.
    • Can cause over-fitted predictive models.

Cross Validation –

Overcome the mentioned pitfalls in train-test split evaluation, cross validation comes handy in evaluating machine learning methods. In cross validation, despite of using a portion of the dataset for generating evaluation matrices, the whole dataset is used to calculate the accuracy of the model.


k-fold cross validation

We split our data into k subsets, and train on k-1 of those subsets. What we do is holding the last subset for test. We’re able to do it for each of the subsets. This is called k-folds cross validation.

  • Pros –
    • More realistic evaluation matrices can be generated.
    • Reduce the risk of over-fitting models.
  • Cons –
    • May take more time in evaluation because more calculations to be done.

Cross-validation with a parameter sweep –

I would say using ‘Tune model Hyperparameters’ module is the easiest way to identify the best predictive model and then use ‘Cross validate Model’ to check its reliability.

Here in my sample experiment I’ve used the breast cancer dataset available in AzureML Studio that normally use for binary classification.

experimentThe dataset consists 683 rows. I used train-test split evaluation as well as cross validation to generate the evaluation matrices. Note that whole dataset has been used to train the model in cross validation case, while train-test split only use 70% of the dataset for training the predictive model.

Two-class neural networks has used as the binary classification algorithm. The parameters are swapped to get the optimal predictive model.

When observing the outputs, the cross-validation evaluation provides that model trained with whole dataset give a mean accuracy of 0.9736 while the train-test evaluation provides an accuracy of 0.985! So, is that mean training with less data has increased the accuracy? Hell no! The evaluation done with cross-validation provides more realistic matrices for the trained model by testing the model with maximum number of data points.

Take-away – Always try to use cross-validation for evaluating predictive models rather than going for a simple train-test split.

You can access the experiment in the Cortana Intelligence Gallery through this link –



Chatbots : What & Why?

robot-customer-serviceThe word ‘chatbots’ has become one of the most whispered words in the tech world today. Each and every tech company is putting a lot of effort on researching and developing bot related technologies.

The very first thing that you should keep in your mind is “Bot is not an acronym neither a magic app”. Bot is an application that operates as an agent for a user or another program or simulates a human activity.

I would say, there’s no Artificial intelligence or natural language processing attached with most of the chatbots you see out there. But AI and machine learning have become prominent factors of giving bots more human side.

The evolution of chatting paradigms and the rapid adaptation of millennials for chatting platforms like Facebook messenger, WhatsApp and Viber increased the need of chatbots that can handle business processes.


Evolution of user interaction

The same way a website is interacting with a user, bot acts as the interface into the service. Simplicity, increasing productivity, personalized service lines are some of the major benefits that we can achieve with getting chatbots into the play.

Super bots Vs domain specific bots

Probably the very first thought that comes to your mind when it says ‘bots’ might be “Siri, Cortana or Google assistant”. Dominating our pockets with their ability of interacting as a personal assistant, these software utilities can be defined as super bots. They are equipped with speech recognition as well as natural language understanding. Normally there’s a persona specifically designed for these super bots. The backend of these intelligent applications is backed with machine learning and deep learning based technological interventions.


AI powered personal assistants in your pocket and home

Domain specific bots are easy to find and easy to build (comparatively to the super bots). They are specifically designed aligning to a particular business process.  Ordering a pizza from nearest pizza shop, customer service call centers or booking a flight ticket are some example business processes that can be easily adopted to a conversational bot interface. These bots may use machine learning techniques for natural language understanding.

Business bots Vs Consumer bots

Bots are not only mend to be to involve in business process. Fun is mandatory! The consumer bots are specifically designed to maintain human like conversations with the users. Sometimes even for flirting 😉 Mitsuku is known as one of the prominent consumer bots that have built for today.

Text or the voice?

Interacting with a chatbot can be done in several ways. Textual communication is just one thing. Speech recognition enables the user to interact with the chatbots with speech. Some chatbots provide interactive clickable cards for user interaction. Amazon Alexa even has a hardware component that interacts with the user with voice commands.

Building bots

There are plenty of programming paradigms prevailing today that helps you to build conversational bots. Microsoft Bot Framework is a programmer friendly framework that supports C# or node.js for deploying bots. Integrating chat channels like Skype and messenger can be too done through the framework.

Natural Language understanding provides more human like nature for bot’s conversations. For that, LUIS service by Microsoft, API.AI, wit.ai are some prominent services used today by the programmers. No need to go from the scratch of machine learning algorithms. Just an API call will do the magic for you.

Bots can be given more human like abilities with machine learning based intelligent API services and SDKs. Microsoft Cognitive services is a valuable toolset that you can use to give your chatbot the ability to see, hear and even think!

What’s next?

I guess, codeless bot building services (some are already there in the market, but not so matured) and natural language generation would be the next big things in the conversational bot building industry. Deep learning will come to scene with language generation for sure.

Time to market is a prominent factor in the world of business. Then why not going with the trend and adopt a chatbot for your own business or start building bots as your business? 😉

Copying & Migrating AzureML experiments

A set Major advantages in using cloud based machine learning platforms are the ability of collaborative projects, easy sharing and easy migration.  Within AzureML Studio you can share or migrate the experiments using various approaches.

01. Share AzureML workspace

If you want to share all the experiments in your workspace with another user, this is the best option you can go with. All your built experiments, trained models, datasets would be shared with the users with this permission.

  1. Click SETTINGS in the left pane
  2. Click the USERS tab
  3. Click INVITE MORE USERS at the bottom of the page

ml4The users you inviting should have a Microsoft account or a work/school account from Azure Active Directory. Two user access levels can be defined as “Users” and “Owners”.

02. Copy experiment to an AzureML workspace

If you want to migrate an experiment from the current workspace to another, you can go for the experiments pane and click “Copy to workspace”. Note that you only can copy experiments to the workspaces in the same Azure region. This is important if you want to move your experiment from a free tier workspace to a paid standard tier.

ml6You’ll not be able to copy multiple experiments using a single click. If you have such kind of scenario, use poweshell scripts as instructed in this descriptive post.

03. Publish to Gallery

ml7For me this is one of the most useful options. You can use this option in two ways. One is to make the experiments public and in a way that only accessible through a shared link. If you share the experiment publicly that will be listed in the Cortana Intelligence Gallery.

ml8If you want to share an experiment only with your peer group, publishing as an ‘unlisted’ experiment is the best way. Users can open the experiment in their own AzureML studio. This option can be used to migrate your experiment within different workspaces as well as between different azure regions. Only the users who’s having the link you shared can only view or use the experiment you shared.

Lambda Architecture & Cortana Intelligence Suite solutions

Data processing has become the key part of modern applications. Not only processing the data, but also visualizing data in a meaningful way is vital for making business decisions in an enterprise application.

With the rise of massive data storages and the speed of data generation, effective data processing architectural patterns came into industrial standards.

In the era of big data processing where data generated in high volume, variety, velocity, veracity and value; there are many architectural patterns that industrial applications are following for data processing. Lambda, Kappa and Zeta are some patterns used for real time big data processing.

Let’s take a look on how Lambda architecture can be adopted with the products and services comes with Microsoft Cortana Intelligence Suite.

What is Lambda Architecture?

2 - lambaLambda architecture is a data processing architecture designed to handle massive quantities of data by taking the advantage of both batch and stream processing methods. Nathan Marz introduced the term of Lambda Architecture (LA) for having a generic, scalable and fault tolerant data processing architecture.

LA contains different layers which handles data in various methodologies in the process of data processing.

The ability of processing both batch data and real-time data streams is one of the significant features of lambda architecture.

What is Cortana Intelligence Suite?

architectureCortana Intelligence Suite is the Microsoft’s umbrella branding for fully managed business intelligence, big data and advanced analytics offerings comes with Azure cloud which enables businesses to transform the data into intelligent actions. So “Cortana” is there in this name. Then what? Is this related to the smart assistant comes with Windows 10? As Microsoft says, Cortana symbolizes the contextual intelligence that the solutions hope to deliver across the entire suite.

Cortana Intelligence Suite comes with the following services that specially designed for following tasks.

  • Information Management
  • Big Data Stores
  • Machine Learning & Analytics
  • Intelligence
  • Dashboards & Visualizations

How Cortana Intelligence Suite aligns with Lambda architecture?

Cortana Intelligence Suite (CIS) comes with different solutions that can cater both batch data sources and data streams. It is a significant improvement where you combine traditional batch processing systems and data stream analysis systems.

For an example think of a system that indicates the fuel level, oil levels, car tire pressure etc. of a vehicle… The system too should have the ability to analyze the data fetching from the IoT sensors real time as well as do predictions using the stored batch of data. CIS comes handy with various approaches to design this system with lambda architecture.


Usage of CIS tools for data processing

IoT sensors creates hundreds or maybe thousands of data points for a second. Handling such data streams and directing them to analytics flows can be done using Event Hubs(https://azure.microsoft.com/en-us/services/event-hubs/).  you can use Azure Stream Analytics to get data from EventHub into Azure Storage Blobs. Thereafter you can use Azure Data Factory (ADF) to copy data on a scheduled basis from Blobs to Azure Data Lake Store. ADF can act as the batch data source. For analyzing and to build predictive models on the batch data HDInsight & Azure Machine Learning is the option you can go with. Azure SQL data warehouse can be used to store the analyzed data and visualizing them using PowerBI can be done. This is the batch data processing line.

In the line of real time data analysis, you can push the data stream coming from event hub to a Stream Analytics service or for an azure machine learning model. Visualizing data with PowerBI would come handy too.

Apart from the above explained components comes for data processing task, Microsoft Cognitive services can be used for transforming the user interaction for more human side. For an example, Bot framework and LUIS can be used with Bing speech API to provide voice commands for applications. Cortana skills can be used for enabling your app to deal with Cortana assistant.

Tips & Tricks for building a better LUIS model

Windows-Live-Writer-1ea16f646b8b_EBC7-image_6b0e682f-3c30-4d18-a764-b7a8375e6ffeChatbots has become a ‘trend’ today. Everyone wants to attach a chatbot for their businesses. Either to the public facing phase or as the interfering interface of an internal business process. If you observe the social media handles of the major brands, they are using chatbots to communicate with their customers.

Building a bot from the scratch is not an easy task and there’s no need of going from the scratch! There’s plenty of frameworks and language understanding services that you can get the aid of.

Api.ai, Recast.ai, Motion.ai, rasa NLU, Wit.ai, Amazon Lex, Watson Conversation and LUIS.ai are some of services that you can use individually or with the use of other channels and frameworks to build your bot.

Here I’ve listed out some of the best practices I use when creating a language understanding engine using LUIS.ai; which is Microsoft’s NLU product framework.

(To get started with LUIS.ai refer the documentation here – https://docs.microsoft.com/en-us/azure/cognitive-services/luis/home )

Narrow down the scope of the bot

Don’t ever try to put up a Bot with a wide scope. First thing you should get into your mind is “LUIS doesn’t’ generate the answers for the questions. It just direct the questions come into the service into pre-defined answering lines.” Although LUIS limits the number of intents for a single LUIS model to 80, try to reduce the number of intents because it increases the probability of getting the right intent for the questions you ask. Lots of intents may confuse the language understanding algorithm of LUIS. It may lead your program to a wrong intent.

Use a good naming convention for the intents and entities

When you plan to build a LUIS model, choose a good naming convention. Else It would be hard for you when you referring the particular intent from your code. Don’t use too lengthy words as intent names. Just use short descriptive wordings. Using Camel case or dot separated phrases is a good practice.

As an example, for creating a Bot for a Pizza shop, these are some sample intents that you can use.

  • General.Introduction – Shows a general introduction about the coffee shop
  • General.ContactDetails – Show the contact details
  • General.ShowPriceList – Display the full price list of the coffee shop
  • Order.Pizza – Pizza ordering process starts
  • Order.ConfirmOrder – Confirm the placed order
  • Order.CancelOrder – Cancel a placed pizza order
  • Feedback.RecieveFeedback – Receive customer feedback.

Using this kind of dot separated phrases helps you to see the same kind of intents together in a row in your LUIS dashboard.

Use Entities wisely

Entities let you change the answer of a particular intent dynamically. It also let you to reduce the number of intents. If you using dynamic SQL queries to fetch data in your bot, entities may help you to build those queries with the appropriate parameters.

There are four types of custom entities as Simple, Composite, Hierarchal and List. If you want to identify a set of names from the question (Names of pizzas etc.) use the List type of custom entity. It increases the discoverability of the entity. You can add 20,000 items for a single list. Never forget to add the synonyms for the list items. Bot have to think from the user’s language.


Adding a item for a List type entity

There a set of pre-built entities that you can use. If you want to extract a number, age, date kind of data from the question, don’t hesitate to use them.


Pre-built Entities

Train the model with the maximum number of utterances that you can think of! –

Programmer never knows how the end user going to think. So, train each and every intent with the maximum number of utterances that you can think of.

Do check the model and improve it regularly

Even after publishing the LUIS model for the production, make sure to check the suggested utterances and retrain the model with them assigned for the correct intent.

Beware! LUIS is not from your country!

LUIS may capable of identifying a name of city in your question where you want it as an entity value. Sadly, LUIS may not recognize the name of your hometown as a city If it’s not popular as Paris or New York! So, use the ‘Feature List’ for this kind of applications. LUIS may learn that the similar kind of items in your feature list should be treated same. It doesn’t do a strict mapping. Just a hint for the algorithms behind the LUIS engine.

Do version controlling

Training the LUIS model with some utterances may lead it to push out some wrong predictions. Maintaining the clones of the previous versions may help you to rollback and start from the place you were right. Do clone the model when needed.

img_4These are just small tips you can use when building your own LUIS model. Do comment any best practice that you find useful in building an accurate model for your Bot.


Image Classification with CustomVision.ai

cv1Extracting the teeny tiny features in images, feeding the features into deep neural networks with number of hidden neuron layers and granting the silicon chips “eyes” to see has become a hot topic today. Computer vision has gone so far from the era of pattern recognition and feature engineering. With the advancement of machine learning algorithms combined with deep learning; understanding the content in the images and using them in real world applications has become a MUST more than a trend.

Recently during the Microsoft Build2017 conference, they announced a handy tool for training a machine learning image classification model to tag or label your own images. Most interesting part of this tool is, it provides an easy to use user interface to upload your own images for training the model.

After training and tuning the model you can use it as a web service. Using the REST API you just have to push the request to the web service and it’ll do the magic for you.

I just did a tiny experiment with this tool by building an image classifier that classifies few famous landmarks.

I’ve the following image set

  • Eiffel tower – 6 images
  • Great wall – 11 images
  • KL tower – 7 images
  • Stonehenge – 7 images
  • Space Needle – 7 images
  • Taj Mahal – 7 images
  • Sigiriya – 8 images

Let’s get started!

Go to customvision.ai – just sign in with your mail id and you’ll land onto the “My Projects” page

cv2Fill the name, description and select the domain you going to build the model. Here I’ve selected Landmarks because the images I’m going to use contains landmarks and structural buildings.

I had the images of each landmark in separate folders in my local machine. I uploaded the images category by category.  System will detect if you upload duplicate images.

cv4All together 53 images with different tags were uploaded for training.

Training will get few minutes. Optimize the probability threshold to get the best precision and recall. Then get the prediction URL. What you have to do is simply forward a JSON input for the Prediction API.cv7

You can retrain the model by tagging the images used for testing. In a production environment, you can use the user inputs to make the perdition model more accurate. The retrained model will appear as a different iteration. You have the freedom to choose the best iteration that should go live with the API.

You can quickly test how well the model you built us performing. Note that any ML model isn’t giving you 100% accuracy.



A prediction from the API

If you prefer to do this in a programmatic way, or your application need to do all the training and calling in the backend, just use Custom vision SDK.


The SDK comes pretty handy with training new models and adding labels for the images and training it before publishing the prediction API.

Grab a set of images. Build a classifier or a tagger. Make your clients WOW! 😃

Democratizing Machine Learning with Cloud

HiRes.jpg.800x600_q96We have already passed the era of gigabytes when it comes to data. World is talking about terabytes of unstructured data and massive amounts of data points generated from IoT devices and sensors in millions per a second. To analyze these heaps of data, obviously, we need large computation power and massive storage. Building workhorse machines to fulfil those tremendous workloads would definitely cost a lot. Cloud computing paradigm comes handy here. The resourcefulness and the scalability of the public cloud can be used to perform the large calculations in machine learning algorithms.

Almost all the major public cloud providers in the market comes up with machine learning services. Cloud machine learning services in Google Cloud Platform provides modern machine learning services, with pre-trained models and a service to generate your own tailored models. Amazon Machine Learning is a service that makes it easy for developers of all skill levels to use machine learning technology. IBM analytics comes up with a machine learning platform with its cloud data services. Azure Machine Learning Studio is a GUI-based integrated development environment for constructing and operationalizing Machine Learning workflow on Azure. We discussed a lot about Azure Machine Learning and its appliances in practical scenarios in the previous posts.

All the mentioned platforms provide machine learning as a service. Most of the platforms offer pre-built ML algorithms in packages. Simple drag and drop user interactions and easy deployment has attracted many developers to use these tools.

But, how would it be if you want to go from the scratch? Either you want to use the power of Graphical Processing Units (GPUs) to process the ML algorithms parallelly? Cloud based Virtual Machines specifically optimized for computation is one of the best solutions that you can consume.

Azure Data Science Virtual Machine (DSVM) –


DSVM in Azure Portal

If you already have used Azure virtual machines for your computation, hosting or storage tasks, this would not be a new concept for you. Azure DSVM is specifically optimized for large computations. Azure DSVM comes in two flavors. One with Windows and the other with Linux. You can choose the hardware configurations as you wish. Many development environments, programming IDEs, languages are pre-installed in the VM instances.

dsvm_linuxMy personal favorite here is the Linux DSVM instance. Here I’ve created a Linux DSVM with the basic configurations. For accessing the VM you can use any tool that can do a SSH call. What I normally do is calling the accessing the VM using Ubuntu Bash on Windows 10.

GPUs for machine learning –



Configurations of the Linux VM with Nvidia GPU

Many machine learning algorithms currently available can be executed parallely. Execution parts of those algorithms are embarrassingly parallel. With that parallel programming, you can reduce the execution time of the algorithms drastically. Data scientists in both industry and academia have been using GPUs for machine learning to make groundbreaking improvements across a variety of applications including image classification, video analytics, speech recognition and natural language processing.


GPUs Vs. CPU computing

Specially in Deep Learning, parallel processing using GPUs can make a drastic decrease in computation time. Purchasing a deep learning dream machine powered with a CUDA enabled high-end GPU such as Nvidia Tesla K80 would cost nearly 6000 dollars! Rather than spending a lot on a machine like that, the most feasible plan is to provision a virtual machine with the specifications we need and pay as we consume.


VM instance price plans

The N-series is a family of Azure Virtual Machines with GPU capabilities that you can use for these kinds of tasks. The N-series will feature the NVIDIA Tesla accelerated platform as well as NVIDIA GRID 2.0 technology, providing the highest-end graphics support available in the cloud today. Through your Azure portal, you can choose a desired price plan with the desired configurations for your tasks when provisioning the VM.

teslaHere’s my Azure VM specifically configured for deep learning exercises. The machine is powered with Tesla K80 GPU which is having 4992 cores in it!! I installed anaconda for that and doing computations using Jupyter notebooks.

Just a hint: stop your VM instance when you are not using it for computation to avoid getting huge unnecessary bills. 😉

No need of huge wallets! The wise decision would be applying cloud technologies for machine learning.