Simple Linear Regression with Azure ML + Python

Simple linear regression is a statistical method for summarizing and studying the relationship between two continuous (quantitative) variables. One variable, denoted x, is regarded as the predictor, explanatory, or independent variable. The other, denoted y, is regarded as the response, outcome, or dependent variable.
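To make the idea concrete, here is a minimal sketch of fitting a straight line y = intercept + slope * x by ordinary least squares. The data points and the function name `fit_simple_linear_regression` are made up for illustration; they are not taken from the Istanbul Stock Exchange dataset or from the Azure ML experiment.

```python
# Hypothetical toy data, invented for this sketch.
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # predictor (independent variable)
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # response (dependent variable)


def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x); the 1/n factors cancel out.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_simple_linear_regression(x, y)
print(f"y = {intercept:.3f} + {slope:.3f} * x")
print("prediction at x = 6:", round(intercept + slope * 6, 3))
```

Once the slope and intercept are known, predicting y for a new x is a single multiplication and addition, which is all a simple linear regression model does at prediction time.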

Typically, when doing regression analysis, we look at the correlation coefficients of the input variables. Correlation analysis measures the extent to which two variables vary together, including the strength and direction of their relationship.

The linear correlation coefficient (also called the Pearson product-moment correlation coefficient) is a measure of the strength and direction of the linear association between two random variables.
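For reference, the sample version of this coefficient for n paired observations can be written as follows. The symbols are introduced here for illustration: x_i and y_i are the individual observations, and \bar{x} and \bar{y} are their sample means.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1
```

A value near +1 indicates a strong positive linear relationship, a value near -1 a strong negative one, and a value near 0 little or no linear relationship.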

I used the Istanbul Stock Exchange dataset to demonstrate the steps of a simple linear regression prediction. An Azure Machine Learning experiment has been built for the regression model (get the experiment from here), using the built-in Bayesian Linear Regression algorithm.

The most interesting part comes with Python! 🙂

I used a Jupyter Notebook and fetched the data into that workspace to visualize the dataset and to calculate the correlation coefficient between each pair of variables. The pearsonr method in the scipy library was used for that.
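The pairwise calculation described above can be sketched roughly as follows. To keep the example free of extra packages, it uses a small hand-written helper, `pearson_r`, as a stand-in; in the notebook, `scipy.stats.pearsonr(x, y)` plays this role and returns the coefficient together with a p-value. The column names and values below are invented and are not the Istanbul Stock Exchange data.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical columns, invented for this sketch.
columns = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.0],  # exactly 2 * a, so perfectly correlated
    "c": [5.0, 3.0, 4.0, 1.0, 2.0],   # tends to fall as a rises
}

# One coefficient per unordered pair of columns.
correlations = {
    (left, right): pearson_r(columns[left], columns[right])
    for left, right in combinations(columns, 2)
}

for (left, right), r in correlations.items():
    print(f"{left} vs {right}: r = {r:+.3f}")
```

Looking at every pair this way helps you spot which input variables move together with the target before choosing predictors for the regression model.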

Refer to the IPython notebook on Azure Notebooks for the complete Python script and the visualizations.

https://notebooks.azure.com/library/Python%20Visualizations/html/Istanbul%20Stock%20Python%203%20notebook.ipynb

Do run the code on your own. You’ll get it for sure!

 

Jupyter Notebook on AzureML

If you are fond of playing with data, digging out the relationships in it and plotting interesting visualizations, Python is the language you should speak.

Over the years, with strong community support, Python has gained dedicated libraries for data analysis and predictive modeling, such as scikit-learn, TensorFlow and Theano. Even Visual Studio, the ultimate IDE in town, has started supporting Python! So, no hesitation: Python is a great choice to make.

You can use many IDEs, or even a simple text editor, to write your Python files. But the Python ecosystem also offers a handy web application, Jupyter Notebook, where you can write your code and even run it!

Jupyter was born in 2014 as a spin-off project of IPython, a command shell for interactive computing in multiple programming languages that was originally developed for Python.

Why Jupyter?

Jupyter Notebook is a very popular tool among data scientists. It is a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. “Jupyter” is a loose acronym meaning Julia, Python and R. One of the most prominent benefits of using Jupyter Notebook is the ability to share your data transformation and visualization steps with your peers.
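Part of what makes notebooks easy to share is that a notebook file (.ipynb) is a plain JSON document holding a list of cells. The sketch below builds a minimal two-cell notebook with only the standard library, assuming the version 4 notebook format; the cell contents are made up for illustration.

```python
import json

# A minimal notebook in the version 4 format: one text cell and one code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",  # explanatory text, may contain equations
            "metadata": {},
            "source": ["# Hello\n", "Euler's identity: $e^{i\\pi} + 1 = 0$"],
        },
        {
            "cell_type": "code",      # live code; outputs are stored next to it
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print(1 + 1)"],
        },
    ],
}

# Serialize to the text that would be saved in a .ipynb file, then read it back.
text = json.dumps(notebook, indent=1)
restored = json.loads(text)
cell_types = [cell["cell_type"] for cell in restored["cells"]]
print(cell_types)
```

Saving `text` to a file with an .ipynb extension should give you something Jupyter can open, which is why notebooks travel well by email, in repositories, or through hosted services.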

If you want to run Jupyter Notebook on your local machine, refer to the link below. With a few easy steps, you can have Jupyter Notebook up and running on your machine.

http://jupyter.readthedocs.io/en/latest/install.html

One of the easiest ways to use Jupyter is to run the notebook on Azure. There is no need to have Python or its dependencies installed on your local machine. You can create, edit and share Jupyter notebooks using Azure Machine Learning Studio. All the execution happens in the cloud.

Let’s get started!

Access your notebooks from the “Notebooks” tab of AzureML Studio. When creating a new notebook, you can select the language and version you want. Python 2, Python 3 and R are the supported languages right now.

Just like a Jupyter notebook running on your local machine, you get the same IPython interface in your browser.

On the notebook menu bar, you can find the ‘Help’ menu, which contains a brief user interface tour as well as a list of keyboard shortcuts that you can use to drive the notebook.

Here’s a little data mashup I’ve done using the famous ‘Iris dataset’ included in Python’s sklearn. The .ipynb file is available on my GitHub repo. Feel free to download it and play with it. A static HTML page created from the notebook output is also included in the repo.
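As a taste of the kind of transformation involved, here is a small sketch that groups Iris measurements by species and averages the petal length. It is not the code from the notebook: to stay dependency-free it hard-codes the first two rows of each species from the Iris dataset instead of loading the full 150 rows from sklearn.

```python
from collections import defaultdict

# (sepal length, sepal width, petal length, petal width, species), all in cm.
# Only the first two rows of each species, copied by hand for this sketch.
rows = [
    (5.1, 3.5, 1.4, 0.2, "setosa"),
    (4.9, 3.0, 1.4, 0.2, "setosa"),
    (7.0, 3.2, 4.7, 1.4, "versicolor"),
    (6.4, 3.2, 4.5, 1.5, "versicolor"),
    (6.3, 3.3, 6.0, 2.5, "virginica"),
    (5.8, 2.7, 5.1, 1.9, "virginica"),
]

# Group petal lengths by species.
petal_lengths = defaultdict(list)
for sepal_length, sepal_width, petal_length, petal_width, species in rows:
    petal_lengths[species].append(petal_length)

# Reduce each group to its mean.
mean_petal_length = {
    species: sum(values) / len(values)
    for species, values in petal_lengths.items()
}

for species, mean in mean_petal_length.items():
    print(f"{species}: mean petal length = {mean:.2f} cm")
```

Even on this tiny sample, petal length separates the three species clearly, which is the sort of pattern a scatter plot in the notebook makes visible at a glance.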

Azure is coming up with the Azure Notebooks preview feature. Here’s the Iris visualization hosted on Azure Notebooks:

https://notebooks.azure.com/library/Python%20Visualizations/html/Iris+Data+Visualization.ipynb

No machine learning algorithms or complex code snippets here. Just data visualization and data transformation. 🙂