Tips & Tricks for building a better LUIS model – 2

2017 ended up making chatbots not just a trend but an essential part of the tech world. With the rise of chatbots, building effective natural language understanding (NLU) models is a must. LUIS is an admirable service from Microsoft Cognitive Services that can be used for building NLU models.

In most cases LUIS models perform well, but sometimes they pop out unexpected results. This can happen for various reasons, including a poor understanding of the chatbot's business domain.

Here I'm adding a few more points to the ones we discussed in the previous article that you need to consider when building accurate LUIS models.

Note that LUIS has had a major update, as well as a UI change, compared to the version we used last year.

Using Phrase List

You can create a phrase list to teach synonyms to the model. As an example, here I've added a phrase list for the word "good". The wizard recommends a set of words that can be added to the phrase list, and you can select the appropriate words from those suggestions. No need to type all the synonyms. Let LUIS handle it. 😊

Another use of a phrase list is to teach the model domain-specific words. As an example, you can add a list of fruit names (Apple, Orange, Grapes) to a phrase list and give it a name. It'll help the LUIS model align with the given domain.

Use the same number of utterances to train each intent

When training the intents with possible utterances, make sure to use roughly the same number of utterances for each intent. Otherwise, intent prediction may be biased towards the intents with a higher number of utterances.

Comparing the accuracy of the model with published versions

With the new LUIS portal, you can test the accuracy of the built model without connecting it to a bot service. In addition, you can compare two versions of the same LUIS model side by side. After observing the difference for various utterances, you can decide which model version should go to production.


LUIS Programmatic API

This is not directly related to optimizing accuracy. The LUIS Programmatic API allows you to perform almost all the tasks involved in building and training a LUIS model through API calls. This comes in handy when you're building a bot that can learn by itself.
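As a rough illustration, here's how adding a labelled example utterance could look with Python's requests library. The region, route and identifiers below are placeholders based on the v2.0 Programmatic API, so verify them against the official API reference for your LUIS resource.

# Rough sketch: add a labelled example utterance through the LUIS
# Programmatic (authoring) API. Region, route, app ID, version and key
# are placeholders/assumptions - check the official API reference.
import requests

AUTHORING_KEY = "<your-authoring-key>"   # placeholder
APP_ID = "<your-app-id>"                 # placeholder
VERSION_ID = "0.1"                       # placeholder version

url = ("https://westus.api.cognitive.microsoft.com/luis/api/v2.0/"
       "apps/{}/versions/{}/examples".format(APP_ID, VERSION_ID))

payload = {
    "text": "book me a flight to colombo",
    "intentName": "BookFlight",
    "entityLabels": []
}

response = requests.post(
    url,
    headers={"Ocp-Apim-Subscription-Key": AUTHORING_KEY},
    json=payload
)
print(response.status_code, response.json())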

Add Bing spell checker to the chatbot

You may notice that users' spelling mistakes cause wrong intent identifications. To overcome this issue, you can enable the Bing Spell Check API in your LUIS model. This may cost you a bit, but the accuracy of intent identification will go up.


If you have more tips & tricks to share for optimizing NLU models, do share here as comments. 😊


Deploy Machine Learning Models in a Production environment as APIs (Python Flask + Visual Studio)

Building intelligent applications basically consists of integrating machine learning based predictive components into apps and systems. Mostly, data scientists or AI engineers are responsible for building these machine learning models.

When it comes to integration and deployment in a production environment, the problem is platform dependency. Most data scientists and AI engineers are comfortable with Python or R and develop their models with them, while the rest of the system may be a .NET or Java based application.

One of the best approaches for connecting these components is deploying the ML predictive module as a web API and calling the API from the application. Any programmer can work with an API once they have its definition.

Flask is a small and powerful web framework for Python. It's easy to learn and simple to use, enabling you to build your web app in a short amount of time. Visual Studio provides an easy way to create Python Flask web applications through its templates. Here are the steps I've gone through to deploy the ML experiment as a REST API.

01. Create the machine learning model, train, tune and evaluate it.

What I've done here is a simple linear regression that predicts monthly salary from years of experience. The scikit-learn Python library has been used to perform the regression. The dataset used for the experiment is from SuperDataScience.

The code is available in the GitHub repository.
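For reference, a minimal sketch of that training step could look like the one below; the CSV file name and column names are my assumptions, so adjust them to match the actual dataset.

# Minimal sketch of the training step: simple linear regression that
# predicts salary from years of experience. File and column names are
# assumptions - adjust them to the actual dataset.
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

data = pd.read_csv("Salary_Data.csv")        # assumed file name
X = data[["YearsExperience"]].values         # assumed column name
y = data["Salary"].values                    # assumed column name

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

regressor = LinearRegression()
regressor.fit(X_train, y_train)
print("R^2 on test set:", regressor.score(X_test, y_test))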

02. Creating the pickle

When you deploy the predictive model in a production environment, there's no need to retrain the model in code again and again. Python has a built-in way of persisting objects called pickle. The pickle module can serialize objects or data into a file that we can save and load from, so the pickle can be used as a binary artefact for generating predictions. scikit-learn also has its own model persistence helper, joblib, which we will use here: it is more efficient with scikit-learn models because it is better at handling the large NumPy arrays that may be stored inside them.
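As a rough sketch, persisting and reloading the trained regressor with joblib could look like this; the file name and the sample training data are just examples.

# Rough sketch: persist a trained scikit-learn model with joblib and
# load it back later. 'model.pkl' is just an example file name.
from sklearn.linear_model import LinearRegression
from sklearn.externals import joblib    # newer scikit-learn: import joblib

# 'regressor' stands in for the model fitted in the previous step
regressor = LinearRegression().fit([[1.0], [3.0], [5.0]], [40000, 60000, 80000])

joblib.dump(regressor, "model.pkl")      # serialize the trained model

loaded_model = joblib.load("model.pkl")  # deserialize when needed
print(loaded_model.predict([[5.0]]))     # salary prediction for 5 years' experience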

03. Create a Python Flask web application.

Simply open Visual Studio (I'm using VS 2017, which comes with Python by default) and select a web project. The step by step guide is here. I would recommend going with option 2 mentioned in the blog because it reduces a lot of unnecessary overhead.

To be on the safe side, use Python virtual environments. It avoids many of the hassles that occur with library dependencies. I've used an Anaconda environment as the base of the virtual environment.


04. Create the API.

Create a new Python file in your project and set it as the startup file (in my case, the startup file contains the API code). The pickle file that contains the model binaries is the only dependency the API needs when it is deployed.

Here the API operates through a POST method which accepts its input as JSON.
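Here's a minimal sketch of such an endpoint; the pickle file name, the route and the JSON field name are assumptions for illustration, not the exact code in the repo.

# Minimal Flask API sketch: load the pickled model once and serve
# predictions through a POST endpoint. File name, route and JSON field
# names are assumptions for illustration.
from flask import Flask, request, jsonify
from sklearn.externals import joblib    # newer scikit-learn: import joblib

app = Flask(__name__)
model = joblib.load("model.pkl")         # assumed pickle file name

@app.route("/predict", methods=["POST"])
def predict():
    body = request.get_json(force=True)
    years = float(body["experience"])            # assumed JSON field
    salary = float(model.predict([[years]])[0])  # cast for JSON serialization
    return jsonify({"predicted_salary": salary})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)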

05. Run & Test

You can run the API and test it by sending POST requests to the URL with a JSON body. Here I've used Postman to send a POST request, and it gives me the predicted salary for the entered years of experience.
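If you want to script the test instead of using Postman, a quick check could look like this; the URL and JSON field name follow the assumptions of the Flask sketch above.

# Quick test of the API without Postman. URL and JSON field name match
# the assumed Flask sketch above.
import requests

resp = requests.post("http://localhost:5000/predict",
                     json={"experience": 5.0})
print(resp.status_code, resp.json())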


You can access the whole code of the project through my GitHub repo here.


    Do comment if you have any suggestion to change the API structure.

Handling Big snakes on Visual Studio

In the last post we discussed setting up a Windows rig for deep learning. If you still haven't set up your machine, go do it first :D

After getting the so-called big snakes, Python and Anaconda, onto the machine, we need a proper IDE for coding.

There are many good IDEs you can use in a Windows environment to code in Python. PyCharm and Spyder are some popular tools.

If you're familiar with Visual Studio, the so-called father of all IDEs, Python works smoothly with VS. There are a few configurations that need to be done.

No need to purchase Visual Studio Enterprise or Ultimate; the freely available Visual Studio Community edition works fine. In the 2017 version, Python comes alongside the default installation options. For earlier versions you need to install Python Tools for Visual Studio (PTVS) separately.

Refer to this guide for more details.

The Python environments configured on the machine can be seen from the 'Python Environments' pane of Visual Studio. (If it's not there, go to Tools -> Python -> Python Environments.)


By default, your Anaconda environment and the default Python environment should be there. First, refresh those environments so that IntelliSense works and the installed libraries are picked up into its completion database.

For our deep learning experiments, we configured a separate Python environment earlier. To add that environment to Visual Studio, follow these steps.

01. Click Custom on ‘Python environments’

02. Go to your Anaconda environments and activate the environment you pre-configured for deep learning (mine is tensorflow-gpu)


03. Copy the interpreter path of the environment

04. Paste it into the interpreter path field and click 'Auto Detect'. Visual Studio will detect the rest


05. Click Apply

It may take a few minutes to refresh the packages as well as IntelliSense. Make the configured environment your default and open the Interactive window. You are good to go 😊
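To double-check that Visual Studio picked up the right environment, you can run something like this in the Interactive window; the tensorflow import assumes you installed it into the deep learning environment from the previous post.

# Quick check in the Python Interactive window: confirm the interpreter
# and that the deep learning libraries resolve from this environment.
# (Assumes tensorflow was installed into this environment earlier.)
import sys
print(sys.executable)    # should point at your conda environment

import tensorflow as tf
print(tf.__version__)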


Configuring a Windows Running Deep Learning Rig

When it comes to deep learning, the first thing that comes to your mind is "computation power". The thousands of matrix operations that you are going to perform when training deep neural networks would take ages if you use only the CPU.

The solution is Graphics Processing Units (GPUs).

There are a few ways to get that kind of computation power for deep learning.

No offence, but in my experience the Linux operating system (I use the Ubuntu flavour) is handier for deep learning work in Python, because the terminal, bash commands, open source editing tools and GPU hackability are a bit easier for me on Linux.

But recent Windows and Visual Studio updates make it possible to do deep learning on your Windows rig too.

Here are the steps I’ve followed to configure my laptop to perform some DL based computations with Tensorflow and Keras.

The laptop I'm using is an Asus UX310UA with a Core i7 7th Gen processor, 16GB RAM and an Nvidia GeForce 940MX 2GB GPU.

I’m running Windows 10 Enterprise 1703 build on my laptop.

Please note that the following steps may vary depending on your setup.

  1. Check the CUDA compute capability of your GPU

If you wish to use your GPU for parallel processing, first check the CUDA support of your GPU device. The more CUDA cores you have, the more parallel computation you get. As an example, the Nvidia Tesla K80 has 4992 CUDA cores while the GeForce 940MX is equipped with 384 CUDA cores. The GPU compute capability should be 3.0 or higher.

Check whether your GPU appears in Nvidia's list of CUDA-enabled GPUs.


  2. Install the CUDA Toolkit

Installing CUDA on Windows has a dependency on a C++ compiler. The CUDA version I've installed on my laptop is CUDA 8.0. Along with that I've installed the Visual C++ 15.0 compiler. Refer to the following guide to install the CUDA Toolkit on your computer.


  3. Install the cuDNN tools

For faster computation, you need to install the CUDA Deep Neural Network library (cuDNN). Depending on the CUDA version you've installed, you should select the appropriate cuDNN version. In my case, with CUDA 8.0, both cuDNN 7.0 and cuDNN 6.0 work. When it came to package installations, cuDNN 7.0 threw me some errors, so I went with cuDNN 6.0 and it's working fine on my machine 😊

Note that you need to do some manual file copy-pasting in this step.

To be on the safe side, restart the machine now! It'll then prompt you for any additional dependencies the GPU asks you to install.


  4. Install Anaconda

Now it's time for the big snake! Anaconda is the leading Python data science platform. It comes with many pre-installed essential libraries and configurations that you'll need regularly. Go with Python 3 since it is the latest.


  5. Create a Python environment for your experiments

Python comes with a whole lot of libraries that your programs may depend on, so the best thing is to create a separate environment for deep learning and use it. It keeps the dependencies of different projects from tangling with each other.

Open the Anaconda prompt (find it on the Start menu; it's advisable to open the conda prompt as administrator) and run the command below. We are using Python 3.5 at the moment, and 'tensorflow-gpu' is the environment name.

conda create -n tensorflow-gpu python=3.5 anaconda

Activate the environment

activate tensorflow-gpu


  6. Install Theano

Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. We need it! Make sure you are installing all of these inside your environment.

conda install theano


  7. Install MinGW and libpython

Even though Python is an interpreted language, you may need Windows C++ compilers in some cases. For Python 3.5/3.6 you can use the Visual C++ 14.0 compiler.

conda install mingw libpython


  8. Install TensorFlow

TensorFlow is an open source library for numerical computation. If you don't have a GPU in your machine, install the CPU-only package (tensorflow) instead of tensorflow-gpu.

pip install tensorflow-gpu
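Once the install finishes, a quick way to confirm that TensorFlow can actually see the GPU (with the 1.x releases available at the time) is something like this:

# Confirm that TensorFlow (1.x) can see the GPU. If everything is set up
# correctly, the printed device list should include a '/device:GPU:0' entry.
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())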


  9. Install Keras

Keras is a high-level neural network API. It can run on top of TensorFlow, CNTK or Theano. For ease of coding, we'll install Keras too.

conda install keras
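As a quick sanity check that Keras is wired up, a tiny model trained on random data could look like the sketch below; the shapes and numbers are purely illustrative.

# Tiny Keras sanity check: a two-layer network trained on random data.
# Purely illustrative - shapes and numbers are arbitrary.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

X = np.random.rand(100, 8)
y = np.random.randint(2, size=(100, 1))

model = Sequential()
model.add(Dense(16, activation="relu", input_dim=8))
model.add(Dense(1, activation="sigmoid"))
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=2, batch_size=16, verbose=1)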


  10. Update all the packages

conda update --all

All set! 😊 Now you are ready to start coding. Start with your favorite IDE. For me, I prefer Spyder and sometimes Visual Studio. You can launch Spyder directly from the Anaconda prompt or Anaconda Navigator.

We will discuss dealing with Python in Visual Studio in the next article.


Artificial Neural Networks with Net# in Azure ML Studio

The ideas for neural networks go back to the 1940s. The essential concept is that a network of artificial neurons built out of interconnected threshold switches can learn to recognize patterns in the same way that an animal brain and nervous system does.

Though the name "neural network" gives the idea of a 'black box' type of predictive operation, an ANN is just a set of mathematical operations.


As the name implies, a neural network is a structured 'network'. The nodes of the neural network are organized in layers, and the nodes are connected to each other by edges. The edges are directional and weighted.

Azure Machine Learning Studio comes with pre-built neural network modules that can easily be used for predictive analytics.


Pre-built neural networks in AML Studio  

Multiclass Neural Network Module –

Used for multiclass classification problems. The number of hidden nodes, the learning rate, the number of learning iterations and many other parameters can be changed easily through the module properties.

Two-Class Neural Network –

Ideal for binary classification problems. As with the Multiclass Neural Network module, the properties of the neural network can be changed through the module properties.

Neural Network regression –

This is a supervised machine learning method that can be used to predict a numerical value.

These pre-built modules can be added to your ML experiment with just a drag and drop, and the parameters can be changed through the module properties. But what are you going to do if you want to implement a more complex neural network architecture, or create a deep neural network with more hidden layers?

AzureML Studio comes in handy here by giving you the ability to define the hidden layer(s) of the ANN with a script. The Net# scripting language provides the ability to define almost any neural network architecture in an easy-to-read format.

With the Net# scripting language you can

  • Create hidden layers and control the number of nodes in each layer.
  • Specify how layers are to be connected to each other.
  • Define special connectivity structures, such as convolutions and weight sharing bundles.
  • Specify different activation functions.

In Azure Machine Learning, you can add a Net# script by choosing 'Custom definition script' in the Hidden layer specification property. By default, it is set to the fully connected case.


Net# lexical conventions are similar to C#. The structure of a Net# script has four main sections.

  1. Constant declaration (Optional) – Define values used elsewhere in the neural network definition
  2. Layer declaration – The input, hidden and output layers are defined with the layer dimensions. The layer declaration for hidden or output layer can include the output function.
  3. Connection declaration – You can define connection bundles (Full, Filtered, Convolutional, Pooling, Response normalization) – Full connection bundle is the default configuration.
  4. Share declaration (Optional) – Defining multiple bundles with shared weights.

This is a simple neural network defined by a Net# script to perform a binary classification. You can customize the number of hidden neurons and the activation functions and see how the accuracy of the model varies.


//A simple neural network definition
//auto keyword allows the ANN to automatically include all feature columns in the input examples
//input layer named Data
input Data auto;

//Hidden layer named "H" including 200 nodes
hidden H [200] from Data all;

//output layer named "Out" including 2 nodes (binary classification problem) 
//Sigmoid activation function has been used.
output Out [2] sigmoid from H all;

For more insights, here are the resources –


Evaluating AzureML Experiments

Azure Machine Learning Studio allows you to build and deploy predictive machine learning experiments easily, with a few drags and drops (technically 😉).

The performance of machine learning models can be evaluated based on a number of metrics, commonly used in machine learning and statistics, that are available through the studio. Evaluation of supervised machine learning problems such as regression, binary classification and multi-class classification can be done in two ways.

  1. Train-test split evaluation
  2. Cross validation

Train-test evaluation –

In AzureML Studio you can perform train-test evaluation with a simple experiment setup. The 'Score Model' module makes the predictions for a portion of the original dataset. Normally the dataset is divided into two parts: the majority is used for training while the rest is used for testing the trained model.


Train-test split

You can use the 'Split Data' module to split the data, and choose whether you want a randomized split or not. In most cases a randomized split works better, but if the dataset has a periodic distribution, for example time series data, NEVER use a randomized split; use the regular split instead.

A stratified split allows you to split the dataset according to the values in a key column. This makes the testing set less biased.

  • Pros-
    • Easy to implement and interpret
    • Less time consuming in execution
  • Cons-
    • If the dataset is small, keeping a portion aside for testing can decrease the accuracy of the predictive model.
    • If the split is not random, the evaluation metrics can be misleading.
    • Can result in over-fitted predictive models.

Cross Validation –

To overcome the pitfalls mentioned for train-test split evaluation, cross validation comes in handy for evaluating machine learning models. In cross validation, instead of using just a portion of the dataset for generating the evaluation metrics, the whole dataset is used to assess the accuracy of the model.


k-fold cross validation

We split our data into k subsets, train on k-1 of those subsets and hold out the remaining subset for testing, then repeat the process so that each subset serves as the test set exactly once. This is called k-fold cross validation. (A small scikit-learn sketch of this contrast follows the pros and cons below.)

  • Pros –
    • More realistic evaluation metrics can be generated.
    • Reduce the risk of over-fitting models.
  • Cons –
    • May take more time to evaluate because more calculations have to be done.
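To make the contrast concrete outside Azure ML Studio, here is a small scikit-learn sketch comparing a single train-test split with 5-fold cross-validation; the dataset and classifier are placeholders for illustration, not the AzureML experiment itself.

# The same idea sketched with scikit-learn: one train-test split versus
# 5-fold cross-validation on the same classifier. Dataset and model are
# placeholders for illustration.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_score

X, y = load_breast_cancer(return_X_y=True)
clf = LogisticRegression()

# Train-test split: a single accuracy figure from one held-out portion
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
print("train-test accuracy:", clf.fit(X_train, y_train).score(X_test, y_test))

# 5-fold cross-validation: every record gets used for testing exactly once
scores = cross_val_score(clf, X, y, cv=5)
print("cross-validation mean accuracy:", scores.mean())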

Cross-validation with a parameter sweep –

I would say using the 'Tune Model Hyperparameters' module is the easiest way to identify the best predictive model, and then using 'Cross Validate Model' to check its reliability.

Here in my sample experiment I've used the breast cancer dataset available in AzureML Studio, which is normally used for binary classification.

The dataset consists of 683 rows. I used train-test split evaluation as well as cross validation to generate the evaluation metrics. Note that the whole dataset has been used to train the model in the cross validation case, while the train-test split only uses 70% of the dataset for training the predictive model.

A two-class neural network has been used as the binary classification algorithm. The parameters are swept to find the optimal predictive model.

Observing the outputs, the cross-validation evaluation shows that the model trained with the whole dataset gives a mean accuracy of 0.9736, while the train-test evaluation gives an accuracy of 0.985! So, does that mean training with less data has increased the accuracy? Hell no! The evaluation done with cross-validation provides more realistic metrics for the trained model, because it tests the model on the maximum number of data points.

Take-away – Always try to use cross-validation for evaluating predictive models rather than going for a simple train-test split.

You can access the experiment in the Cortana Intelligence Gallery through this link –


Chatbots : What & Why?

The word 'chatbots' has become one of the most whispered words in the tech world today. Each and every tech company is putting a lot of effort into researching and developing bot related technologies.

The very first thing that you should keep in mind is that a bot is neither an acronym nor a magic app. A bot is an application that operates as an agent for a user or another program, or simulates a human activity.

I would say there's no artificial intelligence or natural language processing attached to most of the chatbots you see out there. But AI and machine learning have become prominent factors in giving bots a more human side.

The evolution of chat paradigms and the rapid adoption of chat platforms like Facebook Messenger, WhatsApp and Viber by millennials have increased the need for chatbots that can handle business processes.


Evolution of user interaction

In the same way a website interacts with a user, a bot acts as the interface to a service. Simplicity, increased productivity and personalized service lines are some of the major benefits that we can achieve by bringing chatbots into play.

Super bots Vs domain specific bots

Probably the very first thing that comes to your mind when someone says 'bots' is Siri, Cortana or Google Assistant. Dominating our pockets with their ability to act as personal assistants, these software utilities can be defined as super bots. They are equipped with speech recognition as well as natural language understanding, and normally there's a persona specifically designed for each of them. The backend of these intelligent applications is powered by machine learning and deep learning based technologies.


AI powered personal assistants in your pocket and home

Domain specific bots are easier to find and easier to build (compared to super bots). They are specifically designed around a particular business process. Ordering a pizza from the nearest pizza shop, customer service call centers or booking a flight ticket are some example business processes that can easily be adapted to a conversational bot interface. These bots may use machine learning techniques for natural language understanding.

Business bots Vs Consumer bots

Bots are not only meant to be involved in business processes. Fun is mandatory! Consumer bots are specifically designed to maintain human-like conversations with users, sometimes even for flirting 😉 Mitsuku is known as one of the most prominent consumer bots built to date.

Text or the voice?

Interacting with a chatbot can be done in several ways. Textual communication is just one of them. Speech recognition enables the user to interact with a chatbot by voice, and some chatbots provide interactive clickable cards for user interaction. Amazon Alexa even has a hardware component that interacts with the user through voice commands.

Building bots

There are plenty of programming frameworks available today that help you build conversational bots. The Microsoft Bot Framework is a programmer-friendly framework that supports C# and Node.js for building bots. Integrating chat channels like Skype and Messenger can also be done through the framework.

Natural language understanding gives a bot's conversations a more human-like nature. For that, the LUIS service from Microsoft and API.AI are some of the prominent services used by programmers today. No need to build machine learning algorithms from scratch; an API call will do the magic for you.

Bots can be given more human-like abilities with machine learning based intelligent API services and SDKs. Microsoft Cognitive Services is a valuable toolset that you can use to give your chatbot the ability to see, hear and even think!

What’s next?

I guess codeless bot building services (some are already on the market, but not very mature) and natural language generation will be the next big things in the conversational bot building industry. Deep learning will come onto the scene with language generation for sure.

Time to market is a prominent factor in the world of business. So why not go with the trend and adopt a chatbot for your own business, or start building bots as your business? 😉