Calculating the sum of squared within-cluster distances

I would like to calculate "the sum of pair-wise distance between all maps of a given cluster" (Murray et al., 2008) in order to compute the Krzanowski-Lai criterion in MATLAB. Murray et al. (2008) give the following equation for the sum of (squared) distances:
$$D_k = \sum_{u, v \in \text{cluster } k} \lVert u - v \rVert^2$$
Here u is the vector of measured values at one time point (one observation; in my case the measured potentials, since I am working with EEG data).
v is the vector of measured values at a different time point than u.
I found code that already implements the Krzanowski-Lai criterion for my kind of data in calc_fitmeas, provided by atpoulsen. There, the sum of squared within-cluster distances is calculated as follows:
Nk = size(cluster,2);              % number of observations (maps) in the cluster
clstrsq = dot(cluster,cluster,1);  % 1-by-Nk row of squared column norms
%sum of pair-wise distance between all maps of cluster k
Dk = sum(sum(bsxfun(@plus,clstrsq',clstrsq)-2*(cluster'*cluster)));
`cluster` is a matrix containing the data for a specific cluster. Columns correspond to observations, rows to variables. For example, with 100 observations and 10 variables per observation:
cluster = randn(10,100)
I have no formal training in linear algebra, so I have some questions about the implementation.
First, what kind of distance is calculated here?
Second, I do not understand how this code relates to the formula given above. Can someone confirm that this code corresponds to the formula and explain the derivation?

Accepted Answer

Matt J on 20 Jul 2023
Edited: Matt J on 20 Jul 2023
The method you've posted is highly inefficient. You should just do,
Dk=2*width(cluster)*(norm(cluster-mean(cluster,2),'fro')^2 )
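For readers wondering why this one-liner gives the same number as the double sum, here is a sketch of the algebra. It assumes the quantity of interest is the sum of squared Euclidean distances over all ordered pairs of columns of `cluster`. The symbols $x_i$ (the i-th column), $N$ (the number of columns), $\bar{x}$ (the column mean) and $X$ (the whole matrix) are notation introduced here for illustration, not taken from the original post.

```latex
\begin{aligned}
D_k &= \sum_{i=1}^{N} \sum_{j=1}^{N} \lVert x_i - x_j \rVert^2
     = \sum_{i=1}^{N} \sum_{j=1}^{N}
       \left( \lVert x_i \rVert^2 + \lVert x_j \rVert^2 - 2\, x_i^{\top} x_j \right) \\
    &= 2N \sum_{i=1}^{N} \lVert x_i \rVert^2
       - 2 \Bigl\lVert \sum_{i=1}^{N} x_i \Bigr\rVert^2
     = 2N \sum_{i=1}^{N} \lVert x_i \rVert^2 - 2N^2 \lVert \bar{x} \rVert^2 \\
    &= 2N \sum_{i=1}^{N} \lVert x_i - \bar{x} \rVert^2
     = 2N \, \bigl\lVert X - \bar{x}\,\mathbf{1}^{\top} \bigr\rVert_F^2 .
\end{aligned}
```

The first line is what the posted code evaluates: `bsxfun(@plus,clstrsq',clstrsq)` builds the matrix of $\lVert x_i \rVert^2 + \lVert x_j \rVert^2$ and `cluster'*cluster` supplies the inner products $x_i^{\top} x_j$, so the distance involved is the squared Euclidean distance. The last line is the centroid form used above. Note that each unordered pair is counted twice in this sum.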
Matt J on 20 Jul 2023
Timing test:
cluster=rand(10,6000);
tic
Dk0 = baselineMethod(cluster);
toc
Elapsed time is 0.364334 seconds.
tic;
Dk=2*width(cluster)*(norm(cluster-mean(cluster,2),'fro')^2 );
toc
Elapsed time is 0.005465 seconds.
relativeError=abs(Dk0-Dk)/Dk0
relativeError = 3.7123e-16
function Dk=baselineMethod(cluster)
clstrsq = dot(cluster,cluster,1);
%sum of pair-wise distance between all maps of cluster k
Dk = sum(sum(bsxfun(@plus,clstrsq',clstrsq)-2*(cluster'*cluster)));
end
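As a sanity check that is independent of either vectorized form, the double sum can also be written as an explicit loop over all ordered pairs of columns. The following is a minimal sketch, assuming the target quantity is the sum of squared Euclidean distances; the variable names (`DkLoop`, `DkExpanded`, `DkCentroid`, `relErrExpanded`, `relErrCentroid`) are illustrative and do not come from calc_fitmeas.

```matlab
% Rows are variables, columns are observations (same layout as in the question).
cluster = randn(10, 50);
N = size(cluster, 2);

% 1) Brute force: add up ||u - v||^2 over all ordered pairs of columns.
DkLoop = 0;
for i = 1:N
    for j = 1:N
        d = cluster(:, i) - cluster(:, j);
        DkLoop = DkLoop + d' * d;
    end
end

% 2) Vectorized expansion ||u||^2 + ||v||^2 - 2*u'*v, as in the question.
clstrsq = dot(cluster, cluster, 1);
DkExpanded = sum(sum(bsxfun(@plus, clstrsq', clstrsq) - 2 * (cluster' * cluster)));

% 3) Centroid form from the answer.
DkCentroid = 2 * N * norm(cluster - mean(cluster, 2), 'fro')^2;

% Relative deviations from the brute-force value; both should be tiny.
relErrExpanded = abs(DkExpanded - DkLoop) / DkLoop;
relErrCentroid = abs(DkCentroid - DkLoop) / DkLoop;
fprintf('expanded: %.3g, centroid: %.3g\n', relErrExpanded, relErrCentroid);
```

The loop is far too slow for real data, but on a small matrix it makes it easy to see that all three expressions compute the same quantity up to floating-point rounding.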

Release

R2022b
