Out of all the public cloud platforms, Microsoft Azure has brought almost every step of the machine learning life cycle into the cloud. Though the resources and capabilities are there, finding the right cloud-based product or service for your solution can sometimes be a problem.
Providing a neat answer to that issue, Azure has come up with the whole new Azure Machine Learning Studio, which is in preview at the time I'm blogging this! (Don't get confused with the AzureML Studio, the drag-and-drop interface we had before. This is a new thing – ml.azure.com ) There's no framework dependency or restriction for using these services. You can easily bring in your open source machine learning code base (maybe written with Python, scikit-learn, TensorFlow, PyTorch, Keras… anything).
The most awesome feature of this new service is the AzureML Python SDK (https://docs.microsoft.com/en-us/python/api/overview/azure/ml/intro) and the AzureML R SDK (https://github.com/Azure/azureml-sdk-for-r). These SDKs allow you to create and manage ML experiments in your familiar coding style.
In order to use this one-stop solution in Azure, you first have to create an Azure Machine Learning workspace from the Azure portal. Then it'll direct you to the new interface. You can go for either the Enterprise pricing tier or the Basic tier. In the Basic tier you won't get the visual designer and Automated ML features.
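You can also create (or reconnect to) the workspace straight from the Python SDK. Here's a minimal sketch, assuming you have the azureml-sdk package installed; the workspace name, resource group and region are placeholder values, and the subscription ID is left for you to fill in.

```python
# A minimal sketch of creating (or fetching) a workspace through the
# AzureML Python SDK. Name, resource group and region are placeholders.
from azureml.core import Workspace

ws = Workspace.create(name='my-aml-workspace',          # hypothetical name
                      subscription_id='<subscription-id>',
                      resource_group='my-resource-group',
                      location='eastus',
                      exist_ok=True)                    # reuse it if it already exists

# Save a config.json locally so later scripts can simply do
# ws = Workspace.from_config()
ws.write_config()
```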
Let’s go through each and every tab in the side pane of the latest release and see what we can do with them.
Notebooks –
These are fully managed Jupyter notebook instances on the cloud. The notebook servers run on top of a new VM instance type called the notebook VM. These notebook VMs are fully configured working environments for your machine learning and data science tasks. No need to worry about installing all the Python packages and their dependencies – they are already there! You also have the privilege of changing the VM size (yes, GPU-enabled VMs are there too) or installing new packages through a Python package manager.
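For instance, in a notebook cell on one of these VMs you can confirm the pre-installed SDK and pull in anything extra with pip (xgboost below is just an example package, not something the service requires):

```python
# The AzureML SDK comes pre-installed on a notebook VM - quick check:
import azureml.core
print(azureml.core.VERSION)

# Anything missing is one pip install away (run in a notebook cell):
!pip install xgboost
```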
Automated ML –
Not available in the Basic tier though. Automated ML is a process of selecting the best-suited algorithm for the dataset you have. Right now it supports classification, regression and time series forecasting on tabular data; deep learning based computer vision applications are not supported, and deep learning based text analysis is still in preview. The Automated ML process runs a set of machine learning algorithms on the data you provide and sees which one gives the best score on the chosen accuracy metric. Good for building prototypes, and in some cases even for production.
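As a rough sketch of what that looks like through the SDK: the dataset name 'my-dataset', the experiment name and the label column below are assumptions, and the iteration counts are just example values.

```python
# A sketch of kicking off an Automated ML classification run on a
# registered tabular dataset. Names and counts are example values.
from azureml.core import Workspace, Dataset, Experiment
from azureml.train.automl import AutoMLConfig

ws = Workspace.from_config()
training_data = Dataset.get_by_name(ws, name='my-dataset')

automl_config = AutoMLConfig(task='classification',
                             training_data=training_data,
                             label_column_name='label',
                             primary_metric='accuracy',
                             iterations=20,
                             n_cross_validations=5)

run = Experiment(ws, 'automl-demo').submit(automl_config)
run.wait_for_completion(show_output=True)
best_run, fitted_model = run.get_output()  # the winning algorithm + its trained model
```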
Designer –
This is the evolution of Azure ML Studio (the old drag-and-drop thing). It seems Microsoft is going to end its life cycle and make the new Designer its replacement. Here you can build the complete ML workflow by dragging and dropping modules, and if you want, you can integrate R or Python scripts into the experiment. The machine learning service endpoint can be exposed through an Azure Kubernetes Service (AKS) deployment.
Datasets –
The place to manage and version your datasets. Datasets can be either tabular or file based. Here you can profile your dataset by performing a basic statistical analysis of your data. If your dataset sits on a datastore (which we'll discuss later), this acts as a high-level encapsulation of that data.
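Creating and versioning a dataset from code is a few lines. A minimal sketch, assuming a CSV already sits on the workspace's default datastore at a hypothetical path:

```python
# Create a tabular dataset from a delimited file on the default
# datastore and register it; the path 'data/titanic.csv' is an assumption.
from azureml.core import Workspace, Dataset

ws = Workspace.from_config()
datastore = ws.get_default_datastore()

titanic = Dataset.Tabular.from_delimited_files(path=(datastore, 'data/titanic.csv'))
titanic = titanic.register(workspace=ws,
                           name='titanic',
                           create_new_version=True)  # bumps the version if it already exists

df = titanic.to_pandas_dataframe()  # pull it into pandas for exploration
```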
Experiments –
You may execute several runs of the same experiment with different configurations. This is the place where you can see all their log files and compare the runs with each other.
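The metrics you see in this tab come from logging calls in your code. A small sketch, with a hypothetical experiment name and made-up metric values:

```python
# Log metrics to a run so different runs of the same experiment can be
# compared side by side in the studio afterwards.
from azureml.core import Workspace, Experiment

ws = Workspace.from_config()
exp = Experiment(workspace=ws, name='my-experiment')  # hypothetical name

run = exp.start_logging()
run.log('learning_rate', 0.01)   # a single value
run.log('accuracy', 0.87)        # shows up on the run's charts
run.complete()
```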
Pipelines –
Don’t confuse Azure Machine Learning pipelines with Azure Pipelines. Azure ML pipelines are specifically designed for MLOps tasks: you can manage the whole experiment process up to production using them. These pipelines are reusable and support collaborative development of the solution.
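A hedged sketch of a two-step pipeline: prep.py and train.py are hypothetical scripts in a local ./src folder, and 'cpu-cluster' is assumed to be an existing training cluster in the workspace.

```python
# A two-step ML pipeline: data prep followed by training.
from azureml.core import Workspace, Experiment
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()
compute_target = ws.compute_targets['cpu-cluster']  # assumed to exist

prep_step = PythonScriptStep(script_name='prep.py',
                             compute_target=compute_target,
                             source_directory='./src')
train_step = PythonScriptStep(script_name='train.py',
                              compute_target=compute_target,
                              source_directory='./src')
train_step.run_after(prep_step)  # enforce the ordering

pipeline = Pipeline(workspace=ws, steps=[train_step])
run = Experiment(ws, 'pipeline-demo').submit(pipeline)
```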
Models –
You can register your trained ML models here. Versioning the models and managing which model goes to production are some use cases of this model registry. You can even register models that have been trained outside the particular Azure ML workspace.
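Registration itself is a one-liner; a minimal sketch, where the model file path and names are assumptions (the file could have been trained anywhere):

```python
# Register a serialized model file into the workspace's model registry.
from azureml.core import Workspace
from azureml.core.model import Model

ws = Workspace.from_config()
model = Model.register(workspace=ws,
                       model_path='outputs/model.pkl',  # local file to upload (assumed)
                       model_name='my-classifier',      # name in the registry
                       tags={'framework': 'scikit-learn'})
print(model.name, model.version)  # the version auto-increments on re-registration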
Endpoints –
The endpoint of an Azure Machine Learning experiment can be a web service or an IoT module endpoint. Managing the endpoints and their authentication keys is done in this section.
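To show where those keys come from, here's a hedged sketch of deploying a registered model as a web service on Azure Container Instances; score.py and the curated environment name are assumptions, and for production you'd typically target an AKS inference cluster instead.

```python
# Deploy a registered model as an authenticated web service on ACI.
from azureml.core import Workspace, Environment
from azureml.core.model import Model, InferenceConfig
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()
model = Model(ws, name='my-classifier')  # registered earlier

inference_config = InferenceConfig(entry_script='score.py',  # your scoring script (assumed)
                                   environment=Environment.get(ws, 'AzureML-Minimal'))
deploy_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1,
                                                   auth_enabled=True)

service = Model.deploy(ws, 'my-endpoint', [model], inference_config, deploy_config)
service.wait_for_deployment(show_output=True)
print(service.scoring_uri)
print(service.get_keys())  # the same keys you manage under the Endpoints tab
```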
Compute –
In most cases you'll use Azure for your computations. In the Compute section you can create and manage the following compute resource types:
- Notebook VMs – as discussed previously in the Notebooks section, these are fully managed ML development environments suited for development and prototyping purposes.
- Training clusters – you can create either a CPU-based or a GPU-based cluster for running your experiments (see the SDK sketch after this list). Note that you are charged for the computation hours as well as the number of nodes you use. The good thing is there's no charge when the cluster isn't doing any computation.
- Inference clusters – these are AKS clusters where you can deploy your endpoints. You can even register an existing AKS cluster as an inference cluster.
- Attached compute – if you're working with Azure Databricks, Data Lake Analytics or HDInsight, you can configure that computation here. In an interesting use case, you can even attach your own physical computer (which should be a workstation running Ubuntu) as a compute target through the AML service.
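The sketch referenced in the list above: provisioning an auto-scaling training cluster through the SDK. The VM size, node counts and cluster name are just example values.

```python
# Provision an auto-scaling training cluster from code.
from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget

ws = Workspace.from_config()

config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D2_V2',
                                               min_nodes=0,   # scales to zero, so no idle charge
                                               max_nodes=4)
cluster = ComputeTarget.create(ws, 'cpu-cluster', config)
cluster.wait_for_completion(show_output=True)
```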
Datastores –
When it comes to machine learning experiments, it's normal to have large amounts of data, and that data may sit in your Azure storage. A datastore is a storage abstraction over an Azure storage account, which you can then use inside your machine learning experiments.
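Registering one through the SDK looks roughly like this; the datastore name and container name are hypothetical, and the account name and key are placeholders for your own storage credentials.

```python
# Register an Azure blob container as a datastore and retrieve it later.
from azureml.core import Workspace, Datastore

ws = Workspace.from_config()
datastore = Datastore.register_azure_blob_container(workspace=ws,
                                                    datastore_name='my_blob_store',
                                                    container_name='training-data',
                                                    account_name='<storage-account>',
                                                    account_key='<account-key>')

# Later, fetch it by name inside any experiment:
datastore = Datastore.get(ws, 'my_blob_store')
```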
Data labeling –
A cool new feature for data annotators. Right now this supports image classification (multi-label and multi-class) and object identification (bounding box) annotations. The annotators don't need to have an Azure subscription, so you can easily outsource your tedious annotation workload through this feature.
This is just an overview of the options we have in the new Azure Machine Learning Studio. It's pretty clear that the Azure team is bringing all the ML-related services under one umbrella. Let's discuss some cool use cases and tips on using these services in the next blog posts.
Happy coding! 😊