Behind the Scene – Azure ML Part 02

With the power of the cloud, we are going to play with data now! 🙂

Machine learning is one part of predictive analysis. Predictive analysis gets its power from tools and techniques such as mathematics, statistics, data mining and machine learning. Predictive analysis doesn’t refer only to predicting future events; real-time detection of fraudulent credit card transactions is also a use of predictive analysis.

I am not going to discuss the uses of machine learning or what you can do with machine learning methods. Let’s see what benefits you get by using Azure ML Studio for your analysis.

Fully managed scalable cloud service – You have to deal with thousands, often millions, of data records when you are doing your analysis. The computation power of a local machine may not be sufficient for that kind of mammoth task. Make use of Azure’s scalable and efficient cloud instead. It can make your predictions much faster.

Ability to develop & deploy – Want to deploy an application that gets its intelligence from an ML backend? Then Azure ML Studio is a great fit. It lets you easily deploy a web service from the ML model you built and use it in your application. REST will do the rest. 🙂

Friendly user interface for data science workflow – I’m pretty sure dragging and dropping is your ‘thing’, right? Then AML Studio suits you! From data loading to deployment of the web service, you get a friendly UI where you can mostly just drag and drop the modules into the workspace without bothering about their underlying complex algorithms.

Wide range of built-in ML algorithms – No need to start from scratch. Plenty of ML algorithms come pre-built as modules in AML Studio. You can use them right away for building models.

R & Python integration – For data scientists, R and Python are like lifeblood. If you wish to integrate your own scripts into the model, AML Studio gives you the chance. You can choose R, Python or both. AML Studio takes care of it.

Support for R libraries – The R language has a vibrant user community and a rich set of libraries. With AML Studio you get access to most of the R libraries, and you can add more libraries if you want to.

Azure Machine Learning Process

Let’s go through the process. It all starts with defining the objective. Before jumping into the problem, you should have a clear idea of what you are going to do. Whether it’s classification, regression, recommendation or something else, you should be able to figure it out by skimming through the data sources and the problem definition.

 

Then the data! The data may be a set of sales records in your enterprise cloud or in your local storage. Identify the relevant data fields and components that you want for building the model. If the dataset exceeds 10 GB, it’s better to store the data in an Azure SQL database first and get the data through the ‘Import Data’ module. You can also use data stored in HDInsight through Hive queries.

Pay attention to data quality. Real-world data is normally noisy and full of outliers, erroneous values, missing values and so on, so data preprocessing should be done first. Make sure the data fields are of the appropriate type (numerical, categorical, etc.). Azure ML has plenty of modules that you can use to perform data preprocessing tasks.
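To make the preprocessing idea concrete, here is a minimal sketch in plain Python of the kind of cleanup described above: casting text fields to numbers, treating an implausible value as missing, and filling the gaps with the median. The records, the field names and the 120 cut-off are made up for illustration; in AML Studio you would normally do this with the built-in preprocessing modules or a script module instead.

```python
from statistics import median

# Hypothetical raw records: "age" has one missing value and one outlier,
# "plan" is categorical, and "spend" arrives as text.
raw_rows = [
    {"age": "34", "plan": "basic", "spend": "120.5"},
    {"age": "", "plan": "Premium ", "spend": "310.0"},
    {"age": "29", "plan": "basic", "spend": "95.0"},
    {"age": "410", "plan": "premium", "spend": "280.0"},
    {"age": "41", "plan": "Basic", "spend": "150.0"},
]


def to_number(text):
    """Convert a text field to float; return None for missing or invalid values."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def clean(rows, max_age=120):
    """Cast fields to proper types and fill missing or implausible ages with the median."""
    ages = [to_number(row["age"]) for row in rows]
    # Treat implausible ages as missing (the cut-off is an assumption).
    ages = [a if a is not None and 0 <= a <= max_age else None for a in ages]
    fill_value = median(a for a in ages if a is not None)
    cleaned = []
    for row, age in zip(rows, ages):
        cleaned.append({
            "age": age if age is not None else fill_value,
            "plan": row["plan"].strip().lower(),  # keep categorical labels consistent
            "spend": to_number(row["spend"]),
        })
    return cleaned


cleaned_rows = clean(raw_rows)
for row in cleaned_rows:
    print(row)
```

Filling with the median rather than the mean is just one reasonable choice here, since the median is less sensitive to the outliers that are still in the data.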

Model development! Here’s the fun part. You can use the ML algorithms that come with the studio, or you can go with your own scripts in R or Python. If you are familiar with ML model development platforms like Weka, RapidMiner or Orange, you will find that this is not so different. You have to put the right module in the right place and use the right algorithm to make the right decision.

After developing the model, we normally train it. For that you can use the past data that you have. You must always keep a portion of your dataset aside for testing the model too.
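Holding back part of the data can be sketched in a few lines of plain Python. This is only an illustration of the idea; the function name, the 30% test fraction and the fixed seed are assumptions, and in AML Studio the ‘Split Data’ module plays this role.

```python
import random


def split_train_test(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of the rows and hold out a fraction of them for testing."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    test_size = max(1, round(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]


records = list(range(10))  # stand-in for ten rows of past data
train_rows, test_rows = split_train_test(records)
print(len(train_rows), "rows for training,", len(test_rows), "rows for testing")
```

The model never sees the test rows during training, so its score on them is a fairer estimate of how it will behave on new data.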

Is it over after training the model? No, there is more to the process. You should score and evaluate the model you built. The model is useless if its predictions have a high error rate. You may not have used the appropriate algorithm, or you may not have used the correct and optimal parameters. So by using the ‘Score Model’ and ‘Evaluate Model’ modules you can compare different algorithms for the particular task and pick the best one among them.

It’s obvious that ML algorithms are not always 100% accurate. But the model you are building should be more accurate than a wild guess.
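One simple way to check whether a model beats a wild guess is to compare its accuracy with a baseline that always predicts the most common label. The sketch below does that for two hypothetical models; all the labels and predictions are made up for illustration, and in AML Studio the scoring and evaluation modules produce these kinds of metrics for you.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def majority_baseline(train_labels, n):
    """A 'wild guess' baseline: always predict the most common training label."""
    most_common_label = Counter(train_labels).most_common(1)[0][0]
    return [most_common_label] * n


# Hypothetical labels: training labels, test labels, and two models' test predictions.
train_labels = ["ok", "ok", "fraud", "ok", "ok", "ok"]
actual = ["fraud", "ok", "ok", "ok", "fraud", "ok", "ok", "ok"]
predictions = {
    "model_a": ["fraud", "ok", "ok", "ok", "ok", "ok", "ok", "ok"],
    "model_b": ["ok", "ok", "fraud", "ok", "fraud", "ok", "ok", "fraud"],
    "baseline": majority_baseline(train_labels, len(actual)),
}

scores = {name: accuracy(actual, predicted) for name, predicted in predictions.items()}
best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("best:", best)
```

With this toy data, one model beats the baseline and the other does not, which is exactly the kind of comparison that tells you whether an algorithm is worth keeping. For imbalanced problems such as fraud detection, accuracy alone can be misleading, so it is worth looking at the other metrics too.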

After building your prediction magic box, you can publish it as a web service. This allows you to consume it from a custom application, Microsoft Excel or a similar tool.
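As a rough idea of what consuming the service from a custom application looks like, here is a sketch that builds a REST request with the Python standard library. The URL, the API key, the column names and the exact JSON layout are all placeholders and assumptions; the real values and the request format for your service are shown on its API help page after you publish it.

```python
import json
import urllib.request

# Placeholders: copy the real values from your own published web service.
SCORING_URL = "https://example.invalid/score"  # hypothetical endpoint
API_KEY = "<your-api-key>"                     # hypothetical key


def build_scoring_request(url, api_key, column_names, rows):
    """Build (but do not send) a POST request carrying rows to be scored."""
    # Assumed payload layout; check your service's API help page for the real one.
    payload = {
        "Inputs": {"input1": {"ColumnNames": column_names, "Values": rows}},
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


request = build_scoring_request(
    SCORING_URL, API_KEY, ["age", "plan", "spend"], [["34", "basic", "120.5"]]
)
print(request.get_method(), request.full_url)

# To actually call a real service, send the request and parse the JSON reply:
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
```

The request is only built here, not sent, so the sketch runs without a live service.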

For better accuracy, this process is normally repeated iteratively.

That finishes the theory, so let’s get our hands dirty with our experiments!

Simply put, there are 3 steps to start working with Azure ML:

  1. Navigate to Azure ML and choose your subscription plan
  2. Create a Machine Learning workspace in the Azure Portal
  3. Sign in to ML Studio

Step 01 – Go to http://www.azure.com and navigate to Products -> Analytics -> Machine Learning

You can use Azure ML absolutely for free. But if you want to deploy a web service and play with serious tasks, you have to go for an appropriate subscription. If you have an MSDN subscription, you can use it here 🙂

Azure ML subscriptions

Step 02 – You need an Azure account here. If you don’t have one, go for the 3-month free trial.

In the portal, go to New -> Data + Analytics -> Machine Learning

From there you can create your workspace to do the machine learning tasks.

Step 03 – Sign in to Azure ML Studio at https://studio.azureml.net

Now you are there! Click on New -> Blank Experiment!

We are ready to start now.

The GUI of AML Studio is pretty clear and easy to understand. Try to find out how to upload datasets, and look for the modules that contain the ML algorithms in the pane on the left-hand side.

We will explore some cool capabilities of Azure ML in the coming posts. Here’s a video for your motivation.

Part 01