Reduce Dimensionality

Reduce dimensionality using Principal Component Analysis (PCA) in Live Editor

Since R2022b

Description

The Reduce Dimensionality Live Editor task enables you to interactively perform Principal Component Analysis (PCA). The task generates MATLAB® code for your live script and returns the resulting transformed data to the MATLAB workspace.

Using the Reduce Dimensionality Live Editor task, you can:

• Determine the number of components required to explain a fixed percentage of the variance in the data, such as 95% or 99%.

• Create a scree plot of explained variances of the principal components.

• Create a scatter plot of two principal components.

• Create a biplot of two principal components.

• Obtain the transformed data.

To add the Reduce Dimensionality task to a live script, perform one of these actions:

• On the Live Editor tab, select Task > Reduce Dimensionality; or on the Insert tab, select Task > Reduce Dimensionality.

• In a code block in the live script, type a relevant keyword, such as `pca` or `reduce`. Select Reduce Dimensionality from the suggested command completions.

Examples

Load the `cities` data set.

`load cities`

In the File section of the Home tab, click New Live Script.

In the Code section of the Live Editor tab, click Task to open the task gallery. Under Statistics and Machine Learning, click Reduce Dimensionality.

Select Input data > ratings.

Run the task by clicking the diagonal striped bar on the left of the Live Editor window, or by pressing Ctrl+Enter. By default, the task creates three plots.

By default, the software returns the transformed data to the workspace as a variable named `transformedData`. You can edit this name.
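For comparison, the following is a minimal command-line sketch of a similar analysis using the `pca` function. It is not the code that the task generates. The variable name `allScores` is chosen for illustration, and the sketch keeps every component rather than applying a reduction criterion.

```matlab
% Sketch only: a command-line analogue of the task, not its generated code.
load cities                      % provides ratings, categories, and names

% coeff:     principal component coefficients (one column per component)
% score:     data expressed in principal component coordinates
% explained: percentage of total variance explained by each component
[coeff, score, ~, ~, explained] = pca(ratings);

% Keep all components here (illustrative name); a reduction criterion
% would keep only the leading columns of score.
allScores = score;

disp(size(allScores))            % same number of rows as ratings
disp(cumsum(explained).')        % cumulative percentage of variance explained
```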

Load the `moore` data set.

`load moore`

Convert the data into a table.

`tbl = array2table(moore);`

In the File section of the Home tab, click New Live Script.

In the Code section of the Live Editor tab, click Task to open the task gallery. Under Statistics and Machine Learning, click Reduce Dimensionality.

Select Input data > tbl.

Run the task by clicking the diagonal striped bar on the left of the Live Editor window, or by pressing Ctrl+Enter. By default, the task creates three plots.
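As a rough command-line counterpart for table input, the sketch below converts the table to a numeric matrix before calling `pca`. The conversion step and the variable name `X` are assumptions made for illustration. The task itself accepts the table directly.

```matlab
% Sketch only: analyze table data at the command line.
load moore                       % provides the numeric matrix moore
tbl = array2table(moore);

% Convert the table to a numeric matrix (assumes all variables are numeric)
X = table2array(tbl);

[coeff, score, ~, ~, explained] = pca(X);

% The table variable names can serve as labels for the original variables
varNames = tbl.Properties.VariableNames;
```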

Parameters

Specify the data to reduce by selecting a variable from the available workspace variables. The variable can be a numeric matrix or a table.

Specify the criterion for reducing the dimensionality of the data.

• `Explained variance (%)` — Specify the percentage of variance to explain, a nonnegative scalar from 0 through 100. If you specify 100, then the result retains all principal components.

• `Number of components` — Specify the number of principal components to retain, an integer from 1 through the number of columns of data. If you specify the number of columns of data, then the result retains all principal components.

Regardless of the criterion you specify, you can plot all the principal components. The reduction criterion changes only the number of columns in the returned, transformed data; the plots can use all the transformed data before reduction.
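The two criteria can be illustrated with a short sketch that uses `pca` on synthetic data. The data, the 95% threshold, the component count of 2, and all variable names are hypothetical, and the task may implement the selection differently.

```matlab
% Sketch only: two ways to choose how many components to keep.
rng(0)                                   % reproducible synthetic data
X = randn(200, 6) * randn(6, 6);         % hypothetical 200-by-6 data set

[~, score, ~, ~, explained] = pca(X);    % score holds all components

% Criterion 1: explained variance (%), with a hypothetical 95% threshold.
% Keep the fewest components whose cumulative percentage reaches it.
threshold = 95;
k1 = find(cumsum(explained) >= threshold, 1);
reducedByVariance = score(:, 1:k1);

% Criterion 2: a fixed number of components
k2 = 2;
reducedByCount = score(:, 1:k2);

% Either way, the full score matrix remains available for plotting.
```

In both cases only the number of returned columns changes; the rows and the underlying component scores are the same.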

To display plots of the principal components, select from the available options:

• Select Scree plot to display the percentage of the variance explained by each principal component as a bar chart. The cumulative percentages appear as a line plot above the bars. The task uses the `bar` function to create the bar chart and the `plot` function to plot the cumulative percentages.

• Select 2D scatter plot to display the principal components of the data in a 2D scatter plot. The task uses either the `scatter` function or the `gscatter` function to create the scatter plot, depending on whether you specify a grouping variable.

• Select 2D biplot to plot the data as a 2D biplot. The task uses the `biplot` function to create the biplot.
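The plots described above can be approximated at the command line with the same functions. This is a sketch under the assumption that the `cities` data set is used and that no grouping variable is specified. Axis labels and styling are illustrative and might differ from what the task produces.

```matlab
% Sketch only: approximate the three plots with bar, plot, scatter, and biplot.
load cities                      % provides ratings and categories
[coeff, score, ~, ~, explained] = pca(ratings);

% Scree plot: bars for each component, line for the cumulative percentage
figure
bar(explained)
hold on
plot(cumsum(explained), '-o')
hold off
xlabel('Principal component')
ylabel('Variance explained (%)')

% 2D scatter plot of the first two principal components (no grouping)
figure
scatter(score(:, 1), score(:, 2))
xlabel('Component 1')
ylabel('Component 2')

% 2D biplot of the first two components, labeled with the variable names
figure
biplot(coeff(:, 1:2), 'Scores', score(:, 1:2), 'VarLabels', categories)
```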

Tips

• By default, the Reduce Dimensionality task does not run automatically when you modify the task parameters. To have the task run automatically after any change, select the button at the top right of the task. If your data set is large, enabling this option can cause the task to run slowly.

Version History

Introduced in R2022b