Azure Machine Learning Studio lets you build and deploy predictive machine learning experiments with just a few drags and drops (technically speaking 😉).
The performance of machine learning models can be evaluated with a number of metrics commonly used in machine learning and statistics, all available through the Studio. Supervised machine learning problems such as regression, binary classification and multi-class classification can be evaluated in two ways.
- Train-test split evaluation
- Cross validation
Train-test split evaluation –
In AzureML Studio you can perform train-test evaluation with a simple experiment setup. The ‘Score Model’ module makes predictions for a portion of the original dataset. Typically the dataset is divided into two parts: the majority is used for training, while the rest is used for testing the trained model.
You can use the ‘Split Data’ module to split the data, and choose whether you want a randomized split or not. In most cases, a randomized split works better. But if the dataset has a periodic distribution, for example time series data, NEVER use a randomized split. Use the regular split instead.
A stratified split divides the dataset so that the proportions of values in a key column are preserved in both subsets. This makes the test set less biased.
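If you prefer to see the same options in code, here is a minimal sketch using scikit-learn rather than the Studio modules (`X` and `y` stand in for your feature matrix and label column; they are assumptions, not part of the Studio experiment):

```python
# A minimal sketch of the 'Split Data' options, outside the Studio.
# X and y are assumed to be your features and labels.
from sklearn.model_selection import train_test_split

# Randomized 70/30 split (the usual choice)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, shuffle=True)

# Regular (non-randomized) split for ordered data such as a time series:
# keep the original row order and cut at the 70% mark
split_at = int(len(X) * 0.7)
X_train, X_test = X[:split_at], X[split_at:]
y_train, y_test = y[:split_at], y[split_at:]

# Stratified split: preserve the class proportions of y in both subsets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)
```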
- Pros –
- Easy to implement and interpret
- Less time-consuming in execution
- Cons –
- If the dataset is small, keeping a portion aside for testing would decrease the accuracy of the predictive model.
- If the split is not random (or not representative), the evaluation metrics are inaccurate.
- Can produce over-fitted predictive models.
Cross Validation –
To overcome the pitfalls mentioned in train-test split evaluation, cross validation comes in handy for evaluating machine learning models. In cross validation, instead of using only a portion of the dataset for generating evaluation metrics, the whole dataset is used to assess the accuracy of the model.
We split the data into k subsets and train on k−1 of them, holding out the remaining subset for testing. We repeat this so that each subset gets its turn as the test set. This is called k-fold cross validation.
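For reference, here is roughly what the ‘Cross Validate Model’ module computes, sketched with scikit-learn (the logistic regression is just a placeholder classifier; `X` and `y` are assumed to be your full dataset):

```python
# A minimal k-fold cross-validation sketch, assuming X, y are
# the full feature matrix and labels.
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

model = LogisticRegression(max_iter=1000)  # placeholder classifier

# cv=10 trains on 9 folds and tests on the held-out fold, 10 times,
# so every row is used for testing exactly once
scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
print("Per-fold accuracy:", scores)
print("Mean accuracy:", scores.mean())
```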
- Pros –
- More realistic evaluation metrics can be generated.
- Reduces the risk of over-fitting models.
- Cons –
- May take more time in evaluation because more computations have to be done.
Cross-validation with a parameter sweep –
I would say using the ‘Tune Model Hyperparameters’ module is the easiest way to identify the best predictive model, and then using ‘Cross Validate Model’ to check its reliability.
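Outside the Studio, a rough scikit-learn analogue of this tune-then-validate workflow would look like the sketch below (the grid values and network sizes are illustrative assumptions, not what the Studio sweeps by default):

```python
# A rough analogue of 'Tune Model Hyperparameters' followed by
# 'Cross Validate Model'. X and y are assumed to be the full dataset.
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neural_network import MLPClassifier

# Illustrative parameter grid for a small two-class neural network
param_grid = {
    "hidden_layer_sizes": [(25,), (50,), (100,)],
    "learning_rate_init": [0.001, 0.01, 0.1],
}

# Sweep the parameters, scoring each combination with cross validation
sweep = GridSearchCV(
    MLPClassifier(max_iter=2000), param_grid, cv=5, scoring="accuracy")
sweep.fit(X, y)

# Check the reliability of the tuned model with 10-fold cross validation
best_model = sweep.best_estimator_
scores = cross_val_score(best_model, X, y, cv=10, scoring="accuracy")
print("Best parameters:", sweep.best_params_)
print("Mean CV accuracy:", scores.mean())
```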
Here in my sample experiment I’ve used the breast cancer dataset available in AzureML Studio, which is typically used for binary classification.
The dataset consists of 683 rows. I used train-test split evaluation as well as cross validation to generate the evaluation metrics. Note that the whole dataset has been used to train the model in the cross validation case, while the train-test split uses only 70% of the dataset for training the predictive model.
A two-class neural network was used as the binary classification algorithm, and its parameters were swept to obtain the optimal predictive model.
Observing the outputs, cross-validation shows that the model trained on the whole dataset gives a mean accuracy of 0.9736, while train-test evaluation reports an accuracy of 0.985! So, does that mean training with less data has increased the accuracy? Hell no! The evaluation done with cross-validation provides more realistic metrics for the trained model by testing it on the maximum number of data points.
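If you want to replicate this comparison in code, here is a sketch using scikit-learn’s built-in breast cancer dataset as a stand-in for the Studio one (569 rows instead of 683, so the exact numbers will differ):

```python
# Comparing train-test split accuracy against mean cross-validation
# accuracy on a stand-in breast cancer dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

X, y = load_breast_cancer(return_X_y=True)
model = make_pipeline(
    StandardScaler(), MLPClassifier(max_iter=2000, random_state=0))

# Train-test split evaluation: a single 70/30 split
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
split_accuracy = model.fit(X_tr, y_tr).score(X_te, y_te)

# Cross validation: every row gets used for testing exactly once
cv_accuracy = cross_val_score(model, X, y, cv=10, scoring="accuracy").mean()

print(f"Train-test accuracy: {split_accuracy:.4f}")
print(f"Mean CV accuracy:    {cv_accuracy:.4f}")
```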
Take-away – Always try to use cross-validation for evaluating predictive models rather than going for a simple train-test split.
You can access the experiment in the Cortana Intelligence Gallery through this link –