Designing and implementing predictive experiments requires an understanding about the problem domain as well as the knowledge on machine learning algorithms and methodologies. Extensive knowledge on programming is a necessity when it comes to real-world machine learning model training and implementations.
Automated machine learning is capable of training and tuning a machine learning model for a given dataset and specified target metrics by selecting the appropriate algorithms and parameters by its own. Azure Machine Learning offers a user-friendly wizard-like Automated ML feature for training and implementing predictive models without giving you the burden of algorithm and hyperparameter selection.
Azure Automated ML comes handy, where you are able to implement a complete machine learning pipeline without a single line of coding. It saves a time and compute resources since the model tuning is done by following data science best practices.
Azure machine learning currently supports three types of machine learning user cases in their AutomatedML pipeline.
1. Classification – To predict one of several categories in the target column
2. Time series forecasting – To predict values based on time
3. Regression – To predict continuous numeric values
Let’s go through the step by step process of developing a machine learning experiment pipeline with Azure Automated ML.
01. CREATE AN AZURE MACHINE LEARNING WORKSPACE
Azure Machine Learning Workspace is the resource you create on Azure to perform all machine learning related activities on the cloud. The steps are straight forward same as creating any other Azure resource. Make sure you create the Workspace edition is ‘Enterprise’ since AutoML is not available in the basic edition.
02. CREATE AUTOMATEDML EXPERIMENT
ml.azure.com web interface is the one stop portal for accessing all the tools and services related to machine learning on Azure. You have to create a new Automated ML run by selecting Automated ML on the Author section of the left pane.
03. SELECT DATASET
As of now, AutomatedML supports tabular data formats only. You can upload your dataset from the local storage, import from a registered datastore, fetch from a web file or else retrieve from Azure open datasets.
04. CONFIGURE RUN
In this section you have to specify the target column of the experiment. If it’s a classification task this should be the column that indicates the class values and if it’s regression that’s the column where the numerical value to be predicted. Select a training cluster where the experiments going to run. Make sure you select a cluster that is enough for the complexity of the dataset you provided.
05. SELECT TASK TYPE AND SETTINGS
Select the task type that is appropriate for the dataset you selected. If you have textual data in your dataset you can enable deep learning (which is in preview) to extract the features.
In the settings of the run, you can specify the evaluation metrics, any algorithms that you are don’t want to use, validation type, exit criterion etc. for the experiment. If you wish to select only a specific set of features in the provided dataset you can configure that through the settings.
Running the experiment may take some time depending on the complexity of the dataset, algorithms you use and the exit criterion you used.
When the run is completed AzureML provides a summary of the run by indicating the best performing algorithm. You can directly deploy or download the best performing model as a .pkl file from the portal.
Deployment comes as a REST API which runs on an Azure Kubernetes Service (AKS) or Azure Container Instance (ACI).
AutomatedML comes handy when you need to do fast prototyping for a specific set of data and supports the agile process of intelligent application development. Will look on the other tools and features we have on Azure AI stack in the coming articles.