Nonclassical and Nonmetric Multidimensional Scaling
Perform nonclassical multidimensional scaling using mdscale.
Nonclassical Multidimensional Scaling
The function mdscale performs nonclassical multidimensional scaling. As with cmdscale, you use mdscale either to visualize dissimilarity data for which no “locations” exist, or to visualize high-dimensional data by reducing its dimensionality. Both functions take a matrix of dissimilarities as an input and produce a configuration of points. However, mdscale offers a choice of different criteria to construct the configuration, and allows missing data and weights.
For example, the cereal data include measurements on 10 variables describing breakfast cereals. You can use mdscale to visualize these data in two dimensions. First, load the data. For clarity, this example code selects a subset of 22 of the observations.
load cereal.mat
X = [Calories Protein Fat Sodium Fiber ...
    Carbo Sugars Shelf Potass Vitamins];
% Take a subset from a single manufacturer
mfg1 = strcmp('G',cellstr(Mfg));
X = X(mfg1,:);
size(X)
ans = 22 10
Then use pdist to transform the 10-dimensional data into dissimilarities. The output from pdist is a symmetric dissimilarity matrix, stored as a vector containing only the (22*21/2) = 231 elements in its upper triangle.
dissimilarities = pdist(zscore(X),'cityblock');
size(dissimilarities)
ans = 1 231
This example code first standardizes the cereal data, and then uses city block distance as a dissimilarity. The choice of transformation to dissimilarities is application-dependent, and the choice here is only for simplicity. In some applications, the original data are already in the form of dissimilarities.
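As a quick check on the vector form that pdist returns, the following sketch uses a small made-up data matrix (not the cereal data) to confirm that pdist produces n*(n-1)/2 values and that squareform expands them into the full symmetric matrix. The variable names and the random data are assumptions chosen only for illustration.

```matlab
rng(1);                                    % fixed seed so the run is repeatable
n = 6;
Xdemo = randn(n,4);                        % hypothetical data: 6 observations, 4 variables
Dvec = pdist(zscore(Xdemo),'cityblock');   % row vector of pairwise dissimilarities
Dmat = squareform(Dvec);                   % full symmetric matrix with a zero diagonal
size(Dvec)                                 % 1 15, because 6*5/2 = 15
size(Dmat)                                 % 6 6
```

The vector form is what this example passes to mdscale; the square form is mainly convenient for inspecting individual pairs.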
Next, use mdscale to perform metric MDS. Unlike cmdscale, you must specify the desired number of dimensions, and the method to use to construct the output configuration. For this example, use two dimensions. The metric STRESS criterion is a common method for computing the output; for other choices, see the mdscale reference page in the online documentation. The second output from mdscale is the value of that criterion evaluated for the output configuration. It measures how well the inter-point distances of the output configuration approximate the original input dissimilarities:
[Y,stress] = ...
    mdscale(dissimilarities,2,'criterion','metricstress');
stress
stress = 0.1856
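To make the criterion value more concrete, the sketch below recomputes it by hand on made-up data. It assumes that 'metricstress' is the root of the summed squared differences between configuration distances and dissimilarities, normalized by the sum of squared dissimilarities; treat that formula as an assumption to check against the mdscale reference page rather than as a definitive statement. The data and variable names are hypothetical.

```matlab
rng(0);                                    % fixed seed so the run is repeatable
Xdemo = randn(12,5);                       % hypothetical data: 12 points in 5-D
D = pdist(zscore(Xdemo),'cityblock');      % input dissimilarities
[Ydemo,stressDemo] = mdscale(D,2,'criterion','metricstress');
d = pdist(Ydemo);                          % Euclidean distances in the 2-D configuration
% Assumed definition of the metric STRESS criterion (see lead-in)
manualStress = sqrt(sum((d - D).^2) / sum(D.^2));
fprintf('mdscale: %.4f, manual: %.4f\n',stressDemo,manualStress);
```

If the two printed values agree, the assumed formula matches the criterion that mdscale reports for this configuration.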
A scatterplot of the output from mdscale represents the original 10-dimensional data in two dimensions, and you can use the gname function to label selected points:
plot(Y(:,1),Y(:,2),'o','LineWidth',2);
gname(Name(mfg1))
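Because gname is interactive, a script that must run unattended may prefer to label every point with text instead. The following is a minimal sketch of that alternative on made-up data; the labels and variable names are hypothetical and stand in for the cereal names.

```matlab
rng(4);                                    % fixed seed so the run is repeatable
Xdemo = randn(8,5);                        % hypothetical data: 8 points in 5-D
D = pdist(zscore(Xdemo),'cityblock');
Ydemo = mdscale(D,2,'criterion','metricstress');
labels = arrayfun(@(k) sprintf('item %d',k),(1:8)','UniformOutput',false);
plot(Ydemo(:,1),Ydemo(:,2),'o','LineWidth',2);
% Place one label next to each point, without any mouse interaction
text(Ydemo(:,1),Ydemo(:,2),labels,'VerticalAlignment','bottom');
```

Labeling every point works for small data sets; with many observations the interactive approach usually gives a more readable plot.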
Nonmetric Multidimensional Scaling
Metric multidimensional scaling creates a configuration of points whose inter-point distances approximate the given dissimilarities. This is sometimes too strict a requirement, and nonmetric scaling is designed to relax it a bit. Instead of trying to approximate the dissimilarities themselves, nonmetric scaling approximates a nonlinear, but monotonic, transformation of them. Because of the monotonicity, larger or smaller distances on a plot of the output will correspond to larger or smaller dissimilarities, respectively. However, the nonlinearity implies that mdscale only attempts to preserve the ordering of dissimilarities. Thus, there may be contractions or expansions of distances at different scales.
You use mdscale to perform nonmetric MDS in much the same way as for metric scaling. The nonmetric STRESS criterion is a common method for computing the output; for more choices, see the mdscale reference page in the online documentation. As with metric scaling, the second output from mdscale is the value of that criterion evaluated for the output configuration. For nonmetric scaling, however, it measures how well the inter-point distances of the output configuration approximate the disparities. The disparities are returned in the third output. They are the transformed values of the original dissimilarities:
[Y,stress,disparities] = ...
    mdscale(dissimilarities,2,'criterion','stress');
stress
stress = 0.1562
To check the fit of the output configuration to the dissimilarities, and to understand the disparities, it helps to make a Shepard plot:
distances = pdist(Y);
[dum,ord] = sortrows([disparities(:) dissimilarities(:)]);
plot(dissimilarities,distances,'bo', ...
     dissimilarities(ord),disparities(ord),'r.-', ...
     [0 25],[0 25],'k-')
xlabel('Dissimilarities')
ylabel('Distances/Disparities')
legend({'Distances' 'Disparities' '1:1 Line'},...
       'Location','NorthWest');
This plot shows that mdscale has found a configuration of points in two dimensions whose inter-point distances approximate the disparities, which in turn are a nonlinear transformation of the original dissimilarities. The concave shape of the disparities as a function of the dissimilarities indicates that the fit tends to contract small distances relative to the corresponding dissimilarities. This may be perfectly acceptable in practice.
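The two properties discussed for nonmetric scaling, a monotonic transformation and a criterion measured against the disparities, can also be checked numerically. The sketch below does so on made-up data. It assumes that 'stress' is normalized by the sum of squared inter-point distances; that formula is an assumption to verify against the mdscale reference page, and the data and variable names are hypothetical.

```matlab
rng(2);                                    % fixed seed so the run is repeatable
Xdemo = randn(15,6);                       % hypothetical data: 15 points in 6-D
D = pdist(zscore(Xdemo),'cityblock');
[Ydemo,stressDemo,dispDemo] = mdscale(D,2,'criterion','stress');
d = pdist(Ydemo);                          % distances in the 2-D configuration
% Assumed definition: squared residuals against the disparities,
% normalized by the squared configuration distances
manualStress = sqrt(sum((d(:) - dispDemo(:)).^2) / sum(d(:).^2));
% Sort by dissimilarity; the disparities should then never decrease
[~,ord] = sort(D(:));
isMonotone = all(diff(dispDemo(ord)) >= -1e-10);
fprintf('mdscale: %.4f, manual: %.4f, monotone: %d\n', ...
    stressDemo,manualStress,isMonotone);
```

The monotonicity check is the numerical counterpart of the red disparity curve in a Shepard plot never turning downward.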
mdscale uses an iterative algorithm to find the output configuration, and the results can often depend on the starting point. By default, mdscale uses cmdscale to construct an initial configuration, and this choice often leads to a globally best solution. However, it is possible for mdscale to stop at a configuration that is a local minimum of the criterion. Such cases can be diagnosed and often overcome by running mdscale multiple times with different starting points. You can do this using the 'start' and 'replicates' name-value pair arguments. The following code runs five replicates of MDS, each starting at a different randomly chosen initial configuration. The criterion value is printed out for each replication; mdscale returns the configuration with the best fit.
opts = statset('Display','final');
[Y,stress] = ...
    mdscale(dissimilarities,2,'criterion','stress',...
    'start','random','replicates',5,'Options',opts);
35 iterations, Final stress criterion = 0.156209
31 iterations, Final stress criterion = 0.156209
48 iterations, Final stress criterion = 0.171209
33 iterations, Final stress criterion = 0.175341
32 iterations, Final stress criterion = 0.185881
Notice that mdscale finds several different local solutions, some of which do not have as low a stress value as the solution found with the cmdscale starting point.
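To see what the replicates amount to, the following sketch runs the restarts by hand on made-up data: it passes an explicit initial configuration to 'start' for each run, keeps the configuration with the lowest criterion value, and compares it with the default cmdscale start. The unscaled randn starting points are an assumption and need not match how 'random' draws its initial configurations; the data and variable names are hypothetical.

```matlab
rng(3);                                    % fixed seed so the run is repeatable
Xdemo = randn(15,6);                       % hypothetical data: 15 points in 6-D
D = pdist(zscore(Xdemo),'cityblock');
n = size(Xdemo,1);
nStarts = 5;
allStress = zeros(nStarts,1);
bestStress = Inf;
for r = 1:nStarts
    Y0 = randn(n,2);                       % assumed form of a random starting configuration
    [Yr,sr] = mdscale(D,2,'criterion','stress','start',Y0);
    allStress(r) = sr;
    if sr < bestStress                     % keep the best-fitting configuration so far
        bestStress = sr;
        bestY = Yr;
    end
end
[~,stressCmd] = mdscale(D,2,'criterion','stress');   % default cmdscale start
fprintf('best random start: %.4f, cmdscale start: %.4f\n',bestStress,stressCmd);
```

Comparing the spread of values in allStress with stressCmd gives a rough indication of how sensitive the solution is to the starting point for a particular data set.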