Lambda Architecture & Cortana Intelligence Suite solutions

Data processing has become a key part of modern applications. Beyond processing the data, visualizing it in a meaningful way is vital for making business decisions in an enterprise application.

With the rise of massive data stores and ever faster data generation, effective architectural patterns for data processing have become industry standards.

In the era of big data, where data is generated with high volume, variety, velocity, veracity and value, industrial applications follow many architectural patterns for data processing. Lambda, Kappa and Zeta are some of the patterns used for real-time big data processing.

Let’s take a look at how the Lambda architecture can be adopted with the products and services that come with the Microsoft Cortana Intelligence Suite.

What is Lambda Architecture?

Lambda architecture is a data processing architecture designed to handle massive quantities of data by taking advantage of both batch and stream processing methods. Nathan Marz introduced the term Lambda Architecture (LA) for a generic, scalable and fault-tolerant data processing architecture.

LA is made up of layers (commonly described as the batch, speed and serving layers), each of which handles data in a different way as it moves through the pipeline.

The ability to process both batch data and real-time data streams is one of the significant features of the Lambda architecture.
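To make the batch-plus-stream idea concrete, here is a toy sketch in plain Python. It is not tied to any Azure service, and the class and method names (LambdaCounter, ingest, run_batch, query) are made up for illustration. It assumes the usual description of the pattern: a batch view recomputed from the full event log, a speed view updated per event, and a query that merges the two.

```python
from collections import defaultdict


class LambdaCounter:
    """Toy Lambda architecture that counts events per key (illustrative only)."""

    def __init__(self):
        self.master_log = []                # append-only raw events
        self.batch_view = {}                # recomputed from the whole log
        self.speed_view = defaultdict(int)  # incremental counts since last batch run

    def ingest(self, key):
        # Every event is stored for the batch path and counted on the speed path.
        self.master_log.append(key)
        self.speed_view[key] += 1

    def run_batch(self):
        # Batch layer: rebuild the view from scratch over all stored events.
        view = defaultdict(int)
        for key in self.master_log:
            view[key] += 1
        self.batch_view = dict(view)
        # The speed view now only has to cover events that arrive after this run.
        self.speed_view.clear()

    def query(self, key):
        # Serving step: merge the batch view with the recent speed view.
        return self.batch_view.get(key, 0) + self.speed_view.get(key, 0)


counter = LambdaCounter()
for event in ["low_fuel", "low_oil", "low_fuel"]:
    counter.ingest(event)
counter.run_batch()
counter.ingest("low_fuel")  # arrives after the batch run
```

After the last line, the batch view still says 2 for "low_fuel" while the query returns 3, because the speed view fills the gap until the next batch run.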

What is Cortana Intelligence Suite?

Cortana Intelligence Suite is Microsoft’s umbrella brand for the fully managed business intelligence, big data and advanced analytics offerings on the Azure cloud, which enable businesses to transform data into intelligent actions. So why is “Cortana” in the name? Is it related to the smart assistant that comes with Windows 10? As Microsoft puts it, Cortana symbolizes the contextual intelligence that the solutions hope to deliver across the entire suite.

Cortana Intelligence Suite comes with services designed for the following tasks:

  • Information Management
  • Big Data Stores
  • Machine Learning & Analytics
  • Intelligence
  • Dashboards & Visualizations

How does Cortana Intelligence Suite align with the Lambda architecture?

Cortana Intelligence Suite (CIS) comes with different solutions that can cater to both batch data sources and data streams. Combining traditional batch processing systems with data stream analysis systems in this way is a significant improvement.

For example, think of a system that monitors the fuel level, oil level, tire pressure and so on of a vehicle. The system should be able to analyze the data coming from the IoT sensors in real time as well as make predictions using the stored batch data. CIS offers various ways to design such a system with the Lambda architecture.

Usage of CIS tools for data processing

IoT sensors create hundreds or even thousands of data points per second. Such data streams can be handled and directed to analytics flows using Event Hubs (https://azure.microsoft.com/en-us/services/event-hubs/). You can use Azure Stream Analytics to move data from Event Hubs into Azure Storage Blobs. From there, Azure Data Factory (ADF) can copy the data on a schedule from Blobs to Azure Data Lake Store, and ADF can then act as the batch data source. To analyze the batch data and build predictive models on it, HDInsight and Azure Machine Learning are the options to go with. Azure SQL Data Warehouse can store the analyzed data, which can then be visualized with Power BI. This is the batch data processing line.

On the real-time analysis line, you can push the data stream coming from Event Hubs to a Stream Analytics job or to an Azure Machine Learning model. Visualizing the results with Power BI comes in handy here too.
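As a rough picture of what the real-time line computes, the sketch below averages sensor readings over fixed, non-overlapping time windows. It is plain Python, not Stream Analytics query syntax; the function name, the sample readings and the 10-second window are all assumptions made for illustration.

```python
from collections import defaultdict


def tumbling_window_average(readings, window_seconds):
    """Average (timestamp_seconds, value) readings per fixed, non-overlapping window."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, value in readings:
        # Each reading belongs to exactly one window, identified by its start time.
        window_start = (timestamp // window_seconds) * window_seconds
        sums[window_start] += value
        counts[window_start] += 1
    return {start: sums[start] / counts[start] for start in sorted(sums)}


# Hypothetical tire pressure readings: (seconds since start, pressure in psi).
readings = [(0, 32.0), (4, 31.0), (11, 30.0), (14, 28.0), (21, 27.0)]
averages = tumbling_window_average(readings, 10)
print(averages)
```

In a real deployment this aggregation would run inside the streaming service, with the per-window averages pushed to a dashboard or a scoring model.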

Apart from the data processing components described above, Microsoft Cognitive Services can be used to make user interaction more human. For example, the Bot Framework and LUIS can be used with the Bing Speech API to provide voice commands for applications. Cortana skills let your app work with the Cortana assistant.

Democratizing Machine Learning with Cloud

We have already passed the era of gigabytes when it comes to data. The world is talking about terabytes of unstructured data and about IoT devices and sensors that generate millions of data points per second. To analyze these heaps of data we obviously need large computation power and massive storage. Building workhorse machines to handle such tremendous workloads would cost a lot. This is where the cloud computing paradigm comes in handy: the resources and scalability of the public cloud can be used to perform the large calculations that machine learning algorithms require.

Almost all the major public cloud providers offer machine learning services. Google Cloud Platform provides modern machine learning services, with pre-trained models and a service for generating your own tailored models. Amazon Machine Learning is a service that makes it easy for developers of all skill levels to use machine learning technology. IBM Analytics offers a machine learning platform alongside its cloud data services. Azure Machine Learning Studio is a GUI-based integrated development environment for constructing and operationalizing machine learning workflows on Azure. We discussed Azure Machine Learning and its applications in practical scenarios at length in previous posts.

All of these platforms provide machine learning as a service. Most of them offer pre-built ML algorithms as packages. Simple drag-and-drop interaction and easy deployment have attracted many developers to these tools.

But what if you want to build from scratch? Or what if you want to use the power of Graphics Processing Units (GPUs) to run ML algorithms in parallel? Cloud-based virtual machines optimized for computation are one of the best solutions you can use.

Azure Data Science Virtual Machine (DSVM) –

DSVM in Azure Portal

If you have already used Azure virtual machines for computation, hosting or storage tasks, this will not be a new concept for you. The Azure DSVM is specifically optimized for large computations and comes in two flavors: one with Windows and the other with Linux. You can choose the hardware configuration as you wish. Many development environments, IDEs and languages are pre-installed on the VM instances.

My personal favorite here is the Linux DSVM instance. Here I’ve created a Linux DSVM with the basic configuration. To access the VM you can use any tool that can make an SSH connection. What I normally do is access the VM using Ubuntu Bash on Windows 10.

GPUs for machine learning –

Configurations of the Linux VM with Nvidia GPU

Many of the machine learning algorithms currently available can be executed in parallel, and some parts of those algorithms are embarrassingly parallel. With parallel programming, you can reduce their execution time drastically. Data scientists in both industry and academia have been using GPUs for machine learning to make groundbreaking improvements across a variety of applications including image classification, video analytics, speech recognition and natural language processing.
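Here is a small sketch of what “embarrassingly parallel” means in practice: the squared error of each prediction is independent of all the others, so the work can be mapped over a pool of workers and then reduced. The function names and the sample numbers are made up. I use a thread pool only to show the structure; in CPython, threads will not speed up pure-Python arithmetic, so for real gains you would swap in processes or a GPU library.

```python
from concurrent.futures import ThreadPoolExecutor


def squared_error(pair):
    # Each (prediction, target) pair is processed with no shared state.
    prediction, target = pair
    return (prediction - target) ** 2


def parallel_mse(predictions, targets, workers=4):
    """Mean squared error where the per-item work is mapped over a worker pool."""
    pairs = list(zip(predictions, targets))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        errors = list(pool.map(squared_error, pairs))  # map step, order preserved
    return sum(errors) / len(errors)                   # reduce step


mse = parallel_mse([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 5.0, 8.0])
print(mse)
```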

GPUs Vs. CPU computing

Especially in deep learning, parallel processing on GPUs can drastically decrease computation time. Purchasing a deep learning dream machine powered by a CUDA-enabled high-end GPU such as the Nvidia Tesla K80 would cost nearly 6000 dollars! Rather than spending that much on a machine, the more feasible plan is to provision a virtual machine with the specifications we need and pay as we consume.
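The buy-versus-rent decision comes down to simple arithmetic. The sketch below uses the roughly 6000 dollar purchase price mentioned above; the hourly VM rate of 1.50 dollars is a made-up placeholder, so plug in the actual price of the instance you pick.

```python
def break_even_hours(purchase_cost, hourly_rate):
    """Hours of cloud usage after which renting has cost as much as buying."""
    return purchase_cost / hourly_rate


# 6000 comes from the text above; 1.50 per hour is a hypothetical rate.
hours = break_even_hours(6000.0, 1.50)
print(hours)
```

With these assumed numbers you would have to keep the VM running for 4000 hours before buying becomes the cheaper option, and that ignores electricity, maintenance and hardware aging on the purchased machine.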

VM instance price plans

The N-series is a family of Azure Virtual Machines with GPU capabilities that you can use for these kinds of tasks. The N-series will feature the NVIDIA Tesla accelerated platform as well as NVIDIA GRID 2.0 technology, providing the highest-end graphics support available in the cloud today. Through your Azure portal, you can choose a desired price plan with the desired configurations for your tasks when provisioning the VM.

Here’s my Azure VM, specifically configured for deep learning exercises. The machine is powered by a Tesla K80 GPU, which has 4992 cores! I installed Anaconda on it and run computations using Jupyter notebooks.

Just a hint: stop your VM instance when you are not using it for computation to avoid getting huge unnecessary bills. 😉

No need for huge wallets! The wise decision is to apply cloud technologies to machine learning.

Azure ML Web Services gets a new look

There’s a huge buzz around machine learning. What for? Building intelligent apps is one of its dominant uses. A web service is a “language” that software developers readily understand. If data scientists can provide a web service to the developers, they’ll be super excited, because they only have to deal with JSON, not regression algorithms or neural networks! 😀

Azure ML Studio gives you the power to deploy web services easily, with a nice interface that a software developer can understand. Consuming a web service built with Azure Machine Learning has become pretty easy because it even provides code samples and samples of the JSON that goes in and out.
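To give a feel for the developer’s side, here is a sketch that only builds the HTTP request and never sends it. The URL, API key and column names are placeholders. The payload shape (an "Inputs" object with "ColumnNames" and "Values", plus "GlobalParameters") and the Bearer authorization header follow the sample code I remember Studio generating, so treat them as assumptions and check the API help page of your own service for the exact format.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, column_names, rows):
    """Build (but do not send) a POST request for a request-response scoring call."""
    body = {
        # Assumed payload layout; verify against your service's API help page.
        "Inputs": {"input1": {"ColumnNames": column_names, "Values": rows}},
        "GlobalParameters": {},
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


request = build_scoring_request(
    "https://example.invalid/score",   # placeholder endpoint
    "YOUR_API_KEY",                    # placeholder key
    ["fuel_level", "tire_pressure"],   # hypothetical input columns
    [["0.42", "31.5"]],
)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing the request to urllib.request.urlopen and parsing the JSON that comes back.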

services.azureml.net

 

Recently Azure ML Studio came out with a new interface for managing web services. It is now pretty easy to manage and monitor the behavior of your web services.

Go to your ML Studio. In the web services section, you’ll find a new link to the “New web services experience”. It is currently in preview.

New web services dashboard

 

The dashboard shows the performance of the web service you built, including the average execution time. You can even get a glimpse of the cost of consuming the web service.

Web services can be tested through the new portal. If you want to build a web application that consumes the web service you built, you can go to the Azure web app template that is pre-built for consuming ML web services.

Take a look at http://services.azureml.net and you’ll get used to it! 😀

 

 

Modules & Capabilities of Azure Machine Learning – Azure ML Part 03

On the journey of getting familiar with Azure Machine Learning, Microsoft’s cloud-based machine learning platform, we discussed the very first steps of getting started.
When you open the online studio in your favorite web browser, you’ll be directed to create a blank experiment. Let’s start with it.

Blank Experiment in Azure ML Studio

On the left-hand side of the studio, you can see the pre-built modules that you can use to develop your experiments. If they are not enough for your case, you can use R or Python scripts in your experiment.
With Azure ML Studio, you can deploy models for almost all machine learning problem types. The algorithms you can use for classification, regression and clustering are in the Azure ML cheat sheet, which you can download from here: (http://download.microsoft.com/download/A/6/1/A613E11E-8F9C-424A-B99D-65344785C288/microsoft-machine-learning-algorithm-cheat-sheet-v6.pdf)

Let’s take a look at the sections into which the modules are categorized. If you want to find a specific module, just search for it in the search box.

Saved datasets – You can find a set of sample datasets to use in experiments. Most of the popular machine learning datasets, like the “iris dataset”, are available here. If you want your own dataset in the studio, you can upload it here.

Trained models – These are the models you get as the output after training on the data with an appropriate algorithm and methodology. They can be used for building another experiment or a web service later.

Data Format Conversions – The data coming into and going out of the experiment can be converted into the desired format using the modules in this section. If you wish to convert the output of your experiment to ARFF format (which is supported by Weka) or to a CSV file, you can use the modules here.

Data input & output – Azure ML can get data from various sources directly. You can use an Azure SQL database, Azure Blob storage or a Hive query to get the data. Fetching data from a local SQL Server is still in preview (August 2016).

Data transformation – Data transformation tasks like normalization, clipping etc. can be done using the modules listed in this section. You can use SQL queries to do the data transformations if you want.
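If normalization and clipping are new to you, this is roughly what those transformations do to a column of numbers. It is a plain Python sketch with made-up helper names and sample values, not the code behind the Studio modules.

```python
def clip(values, lower, upper):
    """Pull every value back into the range [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def min_max_normalize(values):
    """Rescale values linearly so the smallest becomes 0.0 and the largest 1.0."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


raw = [10.0, 20.0, 30.0, 500.0]      # 500.0 is an obvious outlier
clipped = clip(raw, 0.0, 50.0)
scaled = min_max_normalize(clipped)
print(clipped, scaled)
```

Clipping first keeps the single outlier from squashing all the other values towards zero when the column is rescaled.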

Feature Selection – Appropriate feature selection can drastically increase the accuracy of your machine learning model. Three methods are available: Filter Based Feature Selection, Fisher Linear Discriminant Analysis and Permutation Feature Importance. Use whichever suits your requirement.
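To show the idea behind filter-style feature selection, the sketch below scores each column by the absolute Pearson correlation with the target and keeps the top k. This is only one possible scoring metric and my own minimal version; the column names and values are invented, and it does not reproduce the Studio module.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column tells us nothing about the target
    return covariance / math.sqrt(var_x * var_y)


def top_features(columns, target, k):
    """Rank columns by |correlation with target| and return the best k names."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


columns = {
    "tire_pressure": [30.0, 31.0, 32.0, 33.0],
    "cabin_temp": [21.0, 19.0, 22.0, 23.0],
    "constant": [1.0, 1.0, 1.0, 1.0],
}
target = [1.0, 2.0, 3.0, 4.0]
selected = top_features(columns, target, 2)
print(selected)
```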

Machine Learning – In this section you can find the modules built for training machine learning models, evaluating accuracy and so on. Most of the popular machine learning algorithms for classification, clustering and regression problems are listed here as modules. You can change the parameters of each module yourself, or use the Tune Model Hyperparameters module to tune the experiment for the optimal output.

OpenCV Library Modules – ML is widely used in image recognition. Azure ML has a Pretrained Cascade Image Classification module that is trained to identify images with front-facing human faces.

Python language modules – Python is one of the most widely used languages in data mining and machine learning applications. With Azure ML Studio you can execute your own Python script using this module. 200+ common Python libraries are currently supported in Azure ML.

R language modules – Like Python, R is one of the favorite statistical languages among data scientists. You can use your favorite R scripts and train models with R using these modules. Most R packages are supported in Azure ML. If a package is not there, you can import it for the experiment. (Unfortunately there are some limitations. Some R packages, like rJava and openNLP, are not yet supported in Azure ML – Aug. 2016)

Statistical Functions – If you want to apply mathematical functions to the data or perform statistical operations, you can find the modules for that here. A basic descriptive statistical analysis of the dataset can also be performed using these modules.

Text Analytics – Machine learning models can be used for text analytics. Azure ML Studio includes modules for text preprocessing (removing stop words, punctuation marks, white space etc.), named entity recognition (a pre-trained module) and many more. The Vowpal Wabbit learning system library is also included among the modules.
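The text preprocessing steps mentioned above (dropping stop words, punctuation and extra white space) look roughly like this in plain Python. The stop word list is a tiny made-up sample and the function name is my own; the Studio module has its own, richer options.

```python
import string

# A deliberately tiny, illustrative stop word list.
STOP_WORDS = {"the", "is", "a", "an", "and", "of", "to", "in"}


def preprocess(text):
    """Lowercase, strip punctuation, collapse white space and drop stop words."""
    lowered = text.lower()
    no_punctuation = lowered.translate(str.maketrans("", "", string.punctuation))
    tokens = no_punctuation.split()  # split() also swallows repeated spaces
    return [token for token in tokens if token not in STOP_WORDS]


tokens = preprocess("The tire pressure is low,   and the oil level is OK!")
print(tokens)
```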

Web service – One of the most notable advantages of Azure ML is the ability to deploy an experiment as a web service. Here you find the web service input and output modules to use with the experiments you build.

Deprecated – Assigning data to clusters, binning, quantizing data and cleaning missing data can be done using these modules.

Building Azure ML experiments and deploying web applications with them is not that hard.

This is one of the best step-by-step guides for that task from MSDN.

In the coming posts we’ll discuss interesting applications and Azure ML hacks for building your predictive models.
Play with the tool and leave your experience in the comments below.  🙂

  

Behind the Scene – Azure ML Part 02

With the power of the cloud, we are going to play with data now! 🙂

Machine learning is a niche part of predictive analysis. Predictive analysis gets its power from tools and techniques such as mathematics, statistics, data mining and machine learning. Predictive analysis doesn’t refer only to predicting future events; real-time detection of fraudulent credit card transactions is also a use of predictive analysis.

I’m not going to discuss the uses of machine learning or what you can do with machine learning methods. Let’s see what benefits you get by using Azure ML Studio for your analysis.

Fully managed scalable cloud service – You have to deal with thousands, often millions, of data records when doing your analysis. The computation power of a local machine may not be sufficient for that kind of mammoth task. Make use of Azure’s scalable and efficient cloud. It’ll make your predictions super-fast.

Ability to develop & deploy – Want to deploy an application that gets its intelligence from an ML backend? Azure ML Studio is the best solution then. It lets you easily deploy a web service from the ML model you built and use it in your application. REST will do the rest. 🙂

Friendly user interface for data science workflow – I’m pretty sure dragging and dropping is your ‘thing’, right? Then AML Studio suits you! 😀 From data loading to deployment of the web service, you get a friendly UI where you can mostly just drag and drop the modules into the workspace without bothering about their underlying complex algorithms.

Wide range of built-in ML algorithms – No need to start from scratch. Plenty of ML algorithms come pre-built in AML Studio. You can use them right away for building models.

R & Python integration – For data scientists, R and Python are like lifeblood. If you wish to integrate your own scripts into the model, AML Studio gives you the chance. You can choose R, Python or both. AML Studio takes care of it.

Support for R libraries – The R language has a vibrant user community and a rich set of libraries. With AML Studio you get access to most of the R libraries, and you can add more libraries if you want.

Azure Machine Learning Process

Let’s go through the process. It all starts with defining the objective. Before jumping into the problem, you should have a clear idea of what you are going to do. Whether it’s classification, linear regression or recommendation, you should be able to figure it out by skimming through the data sources and the problem definition.

 

Then the data! The data may be a set of sales data in your enterprise cloud or in your local storage. Identify the relevant data fields and components you need for building the model. If the dataset exceeds 10 GB, it’s better to store the data in an Azure SQL database first and get the data through the ‘Import Data’ module. You can also use data stored in HDInsight through Hive queries.

Pay attention to data quality. Real-world data is normally noisy and full of outliers, error values, missing values and so on, so data preprocessing should be done first. Make sure the data fields are of the appropriate type (numerical, categorical, etc.). Azure ML has plenty of modules with which you can perform data preprocessing tasks.
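As one small example of preprocessing, here is a sketch that fills missing values in a numeric column with the median of the values that are present. The function name and the sample data are invented, and median imputation is just one of several common strategies.

```python
import statistics


def impute_missing(values):
    """Replace None entries with the median of the values that are present."""
    present = [v for v in values if v is not None]
    median = statistics.median(present)
    return [median if v is None else v for v in values]


cleaned = impute_missing([4.0, None, 10.0, 6.0, None])
print(cleaned)
```

The median is often preferred over the mean here because a few outliers will not drag it around.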

Model development! Here’s the fun part. You can use the ML algorithms that come with the studio or go with your own scripts in R or Python. If you are familiar with ML model development platforms like Weka, RapidMiner or Orange, you will find this is not so different. You have to put the right module in the right place, and use the right algorithm to reach the right decision.

After developing the model, we normally train it. For that you can use the past data that you have. You must always keep a portion of your dataset aside for testing the model too.
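Holding back part of the data for testing can be sketched in a few lines. The 25% test fraction and the fixed seed are arbitrary choices of mine, and the helper name is made up; in Studio you would do the same thing by dragging in a split module rather than writing code.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(20))  # stand-in for 20 data records
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), len(test_rows))
```

The model only ever sees train_rows during training, so its score on test_rows tells you how it behaves on data it has not memorized.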

Is it over after training the model? No, there is more to the process. You should score and evaluate the model you built. It is useless if the predictions you make with the model have a high error rate. You may not have used the appropriate algorithm, or you may not have used the correct and optimal parameters. Using the ‘Score Model’ and ‘Evaluate Model’ modules, you can compare different algorithms for the particular task and pick the best one.

Obviously, ML algorithms are not always 100% accurate. But the model you build should be more accurate than a wild guess.
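One simple way to check that a classifier beats guessing is to compare its accuracy with a baseline that always predicts the most common class. The labels below are invented for illustration and the helper names are my own.

```python
from collections import Counter


def accuracy(predicted, actual):
    """Fraction of predictions that match the actual labels."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


def majority_baseline_accuracy(actual):
    """Accuracy of a 'model' that always predicts the most frequent label."""
    most_common_count = Counter(actual).most_common(1)[0][1]
    return most_common_count / len(actual)


actual = ["ok", "ok", "ok", "fault", "fault", "ok", "fault", "ok"]
predicted = ["ok", "ok", "fault", "fault", "fault", "ok", "fault", "ok"]

model_accuracy = accuracy(predicted, actual)
baseline_accuracy = majority_baseline_accuracy(actual)
print(model_accuracy, baseline_accuracy)
```

Here the model gets 7 of 8 right while always answering "ok" would get 5 of 8, so the model is adding something. If the two numbers were close, it would be time to revisit the algorithm or its parameters.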

After building your predicting magic box, you can publish it as a web service. This allows you to consume it from a custom application, Microsoft Excel or a similar tool.

For better accuracy, this process normally runs iteratively.

That’s it for the theory; let’s get our hands dirty with experiments!

Simply put, there are 3 steps to start working with Azure ML:

  1. Navigate to AzureML and choose your subscription plan
  2. Create a Machine Learning workspace in Azure Portal
  3. Sign in to ML Studio

Step 01 – Go to http://www.azure.com and choose Products -> Analytics -> Machine Learning

You can use Azure ML absolutely for free. But if you want to deploy a web service and do serious work, you have to go for an appropriate subscription. If you have an MSDN subscription, you can use it here 🙂

Azure ML subscriptions

Step 02 – You need an Azure account here. If you don’t have one, go for the 3-month free trial.

In the portal go to New -> Data + Analytics -> Machine Learning

From there you can create your workspace to do the machine learning tasks.

Step 03 – Sign in to Azure ML Studio at https://studio.azureml.net

Now you are there! Click on New -> Blank Experiment!

We are ready to start now.

The GUI of AML Studio is pretty clear and easy to understand. Try to find how to upload datasets, and the modules that contain the ML algorithms, in the pane on the left-hand side.

We’ll explore some cool capabilities of Azure ML in the coming posts. Here’s a video for your motivation.

Let’s Jump In! – Azure ML Part 01

“In the world of intelligent applications, data will be the king!” Regardless of how they make their revenue, data has become the main asset of every company. Sales and distribution data, customer data repositories, employee records and all sorts of structured and unstructured data have become the lifeblood of a company’s business processes, because accurate and relevant data is vital for making correct business decisions and relevant business predictions.

Digital data and cloud storage follow Moore’s law: the world’s data doubles every two years, while the cost of storing that data declines at roughly the same rate.

This abundance of data enables more features and tasks, and it calls for better machine learning models and methodologies for predictive analytics.

When the data is widely available in the cloud, and when processing and analyzing data repositories needs large computation power and infrastructure, the best move is to the cloud!

Machine learning (ML) is starting to move to the cloud, where a scalable web service is an API call away. Data scientists will no longer need to manage infrastructure or implement custom code. The systems will scale for them, generating new models on the fly, and delivering faster, more accurate results.

What is Machine Learning?

Simply put, machine learning is teaching silicon chips to think! 😀 To use the general definition: “Machine learning is the systematic study of algorithms and systems that improve their knowledge or performance with experience.”

When you go through the theory behind machine learning, you may find that it is closely related to computational statistics, where computers are used to make predictions. Machine learning covers a range of computing tasks for solving problems where designing and programming explicit algorithms is infeasible.

All of these things mean it’s possible to quickly and automatically produce models that can analyze bigger, more complex data and deliver faster, more accurate results – even on a very large scale. The result? High-value predictions that can guide better decisions and smart actions in real time without human intervention.

Where the hell is ML used?

Did you notice eBay pushing you to buy a protective glass after you bought a fancy case for your iPhone? Netflix suggesting movies for you? Siri or Cortana recognizing your speech? All these tiny miracles are possible with the power of machine learning. Spam filtering for your emails, speech recognition and recommender systems in e-commerce are some famous applications of machine learning.

So… how are we going to do it?

If you Google or do a Bing search on machine learning, you’ll find hundreds of ways of applying machine learning techniques in practical applications, and tools that we can use to create machine learning models.

Here’s a glimpse of the Intelligent App Stack

With this post series, I’m mainly going to take you on a journey with Azure Machine Learning Studio, which comes under the Cortana Intelligence Suite.

Why AzureML?

With advanced capabilities, free access, strong support for R, cloud hosting benefits, drag-and-drop development and many more features, Azure ML is ready to take the consumerization of ML to the next level.

It’s as easy as ABC and powerful enough to handle petabytes of data with the power of Azure.

Theories??

The basics of computing and statistics will be useful going forward. It’s fantastic if you have a rough idea of machine learning algorithms, data preparation methods and that kind of stuff. Don’t worry. Here’s a book to read!  🙂

So we’ll take the first step into Azure ML in the coming post.
