Supervised learning is the most common type of machine learning algorithm. It uses a known dataset (called the training dataset), in which input data (called features) are paired with known responses (the desired outputs, or labels), to train a model. From this training data, the supervised learning algorithm seeks to build a model by discovering relationships between the features and the responses, and then uses that model to predict response values for new data.
Prior to applying supervised learning, unsupervised learning is frequently used to discover patterns in the input data that suggest candidate features, and feature engineering transforms those features into a form better suited to supervised learning. In addition to identifying features, the correct category or response must be identified for every observation in the training set, which is a very labor-intensive step. Semi-supervised learning lets you train models with a small amount of labeled data and thus reduces the labeling effort.
Once the algorithm is trained, a test dataset that has not been used for training is typically used to estimate the algorithm's performance and validate it. To obtain accurate performance estimates, it is critical that both the training and test sets are a good representation of “reality” (i.e., of the data the model will see in the production environment) and that the model is validated correctly.
You can train, validate, and tune predictive supervised learning models in MATLAB® with Deep Learning Toolbox™ and Statistics and Machine Learning Toolbox™.
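As a rough illustration of this workflow, the following MATLAB sketch holds out part of a labeled dataset for testing, trains a model on the remainder, and evaluates it on the held-out data. It uses the ionosphere sample data that ships with Statistics and Machine Learning Toolbox; the variable and model names are only illustrative, and any table of features with known responses would work the same way.

```matlab
% Minimal sketch of the supervised learning workflow: split labeled data
% into training and test sets, fit a model, and check test performance.
load ionosphere                       % X: features, Y: known class labels

rng(1)                                % for a reproducible split
cv = cvpartition(Y, 'HoldOut', 0.2);  % hold out 20% of observations for testing
Xtrain = X(training(cv), :);  Ytrain = Y(training(cv));
Xtest  = X(test(cv), :);      Ytest  = Y(test(cv));

mdl = fitcsvm(Xtrain, Ytrain);        % train a binary SVM classifier
testError = loss(mdl, Xtest, Ytest);  % misclassification rate on unseen data
Ypred = predict(mdl, Xtest);          % predicted responses for new data
```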
Categories of Supervised Learning Algorithms
Classification: Used for categorical response values, where the data can be separated into specific classes. A binary classification model has two classes, and a multiclass classification model has more than two. You can train classification models interactively with the Classification Learner app in MATLAB, or at the command line as sketched after the list below.
Common classification algorithms include:
- Logistic regression
- Support vector machine (SVM)
- Neural network
- Naïve Bayes classifier
- Decision tree
- Discriminant analysis
- k-nearest neighbors (kNN)
- Ensemble classification
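For example, a minimal command-line sketch of training two of these classifiers on the Fisher iris sample data (shipped with Statistics and Machine Learning Toolbox) might look like the following; the model and variable names are illustrative.

```matlab
% Sketch: train multiclass classifiers (a decision tree and a kNN model)
% on the Fisher iris sample data and estimate their accuracy.
load fisheriris                               % meas: features, species: labels

treeMdl = fitctree(meas, species);            % classification decision tree
knnMdl  = fitcknn(meas, species, 'NumNeighbors', 5);   % k-nearest neighbors

cvTree  = crossval(treeMdl);                  % 10-fold cross-validation
cvError = kfoldLoss(cvTree)                   % estimated misclassification rate

label = predict(treeMdl, [5.9 3.0 5.1 1.8])   % classify a new observation
```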
Regression: Used for continuous numerical response values. You can train regression models interactively with the Regression Learner app in MATLAB, or at the command line as sketched after the list below.
Common regression algorithms include:
- Linear regression
- Nonlinear regression
- Generalized linear model
- Decision tree
- Neural network
- Gaussian process regression
- Support vector machine regression
- Ensemble regression
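As a brief sketch, the carsmall sample data (shipped with Statistics and Machine Learning Toolbox) can be used to fit some of these regression models at the command line; the model and variable names here are illustrative.

```matlab
% Sketch: fit regression models for a continuous response, using the
% carsmall sample data (fuel economy as a function of weight and horsepower).
load carsmall                                        % Weight, Horsepower, MPG, ...

tbl = rmmissing(table(Weight, Horsepower, MPG));     % drop rows with missing values
lmMdl  = fitlm(tbl, 'MPG ~ Weight + Horsepower');    % linear regression
gprMdl = fitrgp(tbl, 'MPG');                         % Gaussian process regression

% Predict fuel economy for a hypothetical 3000 lb, 130 hp vehicle
newCar = table(3000, 130, 'VariableNames', {'Weight', 'Horsepower'});
predict(lmMdl, newCar)
```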
Supervised Learning Applications
Supervised learning is used in a broad range of applications, for example:
- Financial applications: credit scoring, algorithmic trading, and bond classification
- Image and video applications: object classification and tracking
- Industrial applications: outlier detection
- Predictive maintenance: estimating the remaining life of equipment
- Biological applications: tumor detection and drug discovery
- Energy applications: price and load forecasting
Example
Let's assume you want to predict housing prices and have historical data on housing sales, with home size, location, and year sold as features and the actual sale price as the known response. That is an excellent use case for supervised regression, and you can try it out yourself in this example. The weights of the linear model shown below make sense: type and size of home, year built, and neighborhood indeed determine home values. The residual plot indicates that the linear model captures the relationship between the variables and price reasonably well.
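A hedged sketch of what fitting such a model looks like in MATLAB is shown below. The file name and variable names (housingSales.csv, SquareFeet, YearBuilt, Neighborhood, SalePrice) are illustrative assumptions, not the data set used in the linked example.

```matlab
% Hypothetical housing-price sketch: fit a linear regression of sale price
% on home features and inspect the learned weights and residuals.
housingTbl = readtable('housingSales.csv');                      % historical sales (assumed file)
housingTbl.Neighborhood = categorical(housingTbl.Neighborhood);  % categorical feature

mdl = fitlm(housingTbl, 'SalePrice ~ SquareFeet + YearBuilt + Neighborhood');

disp(mdl.Coefficients)          % learned weight for each feature
plotResiduals(mdl, 'fitted')    % residuals vs. fitted values
```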