Index number of k-means clusters

Gabriel on 25 Feb 2016
Commented: wathiq dukhan on 26 Jul 2019
Hi guys, I have a program that clusters some data and then calculates the optimal number of clusters.
I want to know how I can get the cluster index of each data point so I can plot the points separated by cluster.
My code is the following:
% Import data from Excel
imp = xlsread('Academia.xlsx');
%% Loop: k-means for 1 to k clusters
k=20
CH=zeros(1,k);
SH=zeros(1,k);
DB=zeros(1,k);
SUB=zeros(1,k);
%
% for i=1:k
% CH{i}=0;
% SH{i}=0;
% DB{i}=0;
% end
% eva = evalclusters(x,clust,criterion)
for i=1:k
[idx,C]=kmeans(imp,i,'MaxIter',10000); % C = centroids
% How do I get the value of idx?
eva = evalclusters(imp, idx, 'CalinskiHarabasz')
CH(1,i)=eva.CriterionValues
eva2 = evalclusters(imp, idx, 'Silhouette')
SH(1,i)=eva2.CriterionValues
eva3 = evalclusters(imp, idx, 'DaviesBouldin')
DB(1,i)=eva3.CriterionValues
end
CH(1,1)=0
% % Find the differences between consecutive CH values
while i>1
SUB(1,i)=CH(1,i)-CH(1,i-1);
i=i-1;
end
SUB(1,1)=0
% Find the largest jump between clusters for the CH value:
% cell2mat(SUB); % Convert to matrix
%SUB2=cell2mat(SUB)
[V,N]=max(SUB) % For some reason it is skipping the empty cell and giving an incorrect value
SH
%SHF=cell2mat(SH)
[V2,N2]=max(SH) % Value closest to 1
DB
%DBF=cell2mat(DB)
[V3,N3]=min(DB) % Value closest to 0
if (N==N2) && (N2==N3)
disp('CH=SH=DB')
N
N2
N3
i=N
% y=xlswrite('Academia_target.xlsx',M(1,i),'D1:D80')
elseif N2==N3
disp('SH=DB')
elseif N==N3
disp('CH=DB')
elseif N==N2
disp('CH=SH')
else
disp('All metrics gave different values')
end
In this case I want to retrieve the idx from the clustering with 4 clusters, which is the optimal one in my case (the Excel file from which I take the data is attached to this question).
Thanks in advance!
  2 Comments
jgg on 25 Feb 2016
I'm unclear what the problem is here; idx is a vector corresponding to the cluster ID for each observation for your fitted cluster. Isn't that what you want?
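To illustrate the point about idx, here is a minimal sketch (not from the thread) of how idx maps each row of the data to a cluster and can drive a per-cluster plot. The synthetic data X and the choice of 3 clusters are assumptions for illustration only; kmeans and gscatter require the Statistics and Machine Learning Toolbox.

```matlab
rng(1);                                   % fix the seed so the run is repeatable
% Hypothetical 2-D data: three blobs of 50 points each (not the asker's file)
X = [randn(50,2); randn(50,2) + 5; randn(50,2) + repmat([10 0], 50, 1)];

[idx, C] = kmeans(X, 3);                  % idx(j) is the cluster ID of row X(j,:)
members2 = X(idx == 2, :);                % all observations assigned to cluster 2

gscatter(X(:,1), X(:,2), idx);            % scatter plot grouped by cluster ID
hold on
plot(C(:,1), C(:,2), 'kx', 'MarkerSize', 12, 'LineWidth', 2);  % mark the centroids
hold off
```

Logical indexing such as X(idx == 2, :) is usually all that is needed to pull out or plot one cluster at a time.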
Gabriel on 27 Feb 2016
Hi jgg,
The thing is that each time k-means runs (up to 20 times), the value of idx changes, and I only know which value of "i" I need after the for loop has completed. So I need to store all of the idx values and have a way to retrieve the one that corresponds to i=4 in this case.
I already tried assigning R{i}=idx after the k-means call, but for some reason I receive this error message:
Cell contents assignment to a non-cell array object.
Error in import_excelv4 (line 27) R{i}=idx
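That error typically appears when R already exists in the workspace as a non-cell variable (for example, a numeric array left over from an earlier run), so brace indexing into it is rejected. A minimal sketch of one way around it, assuming the same loop structure as in the question: preallocate R as a cell array before the loop. The data below is synthetic, and the names R, bestK, and idxBest are illustrative rather than taken from the original script.

```matlab
rng(1);                                   % repeatable run
imp = [randn(40,2); randn(40,2) + 6];     % stand-in for the Excel data (assumption)
k = 5;

R = cell(1, k);                           % preallocate as a cell array, replacing any old R
for i = 1:k
    R{i} = kmeans(imp, i, 'MaxIter', 10000);   % keep the index vector for each i
end

bestK   = 4;                              % hypothetical: whichever i the criteria select
idxBest = R{bestK};                       % cluster ID of every row for that clustering
```

Because every idx has one entry per row of the data, a numeric matrix with one column per value of i (for example zeros(size(imp,1), k)) would work equally well.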


Answers (1)

wathiq dukhan on 26 Jul 2019
I would like a program that calculates the number of clusters using a genetic algorithm and k-means.
  2 Comments
Walter Roberson on 26 Jul 2019
k-means itself must be passed the number of clusters to use; it is not able to calculate the number of clusters. However, there are algorithms that can be used that run k-means a number of times and take estimates of what the most likely number of clusters is under certain conditions.
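One approach of that kind in MATLAB is evalclusters, which can run k-means over a list of candidate cluster counts and report the count favored by a chosen criterion. A minimal sketch, assuming synthetic data and the silhouette criterion (Statistics and Machine Learning Toolbox required); the result is only an estimate under that criterion, not a definitive answer.

```matlab
rng(1);                                   % repeatable run
% Hypothetical data: three well-separated 2-D blobs
X = [randn(50,2); randn(50,2) + 6; randn(50,2) + repmat([12 0], 50, 1)];

% Let evalclusters run k-means for each candidate K and score the solutions
eva = evalclusters(X, 'kmeans', 'silhouette', 'KList', 2:6);

bestK = eva.OptimalK;                     % candidate count with the best criterion value
idx   = eva.OptimalY;                     % cluster ID per row for that solution
```

Passing the algorithm name instead of a precomputed idx also means the cluster assignments for the selected count are returned directly, so nothing has to be stored by hand.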
wathiq dukhan on 26 Jul 2019
I need to use GA to calculate the number of clusters.

