How can the data be accurately clustered?
Hello, I have a dataset named "Data" with four columns. I ran the K-means algorithm with K = 6 and stored the rows of each cluster in the corresponding cell of the variable "clusters". However, I have noticed that some values in these clusters are assigned to the wrong cluster.
I need guidance on how to reassign the clusters using a suitable algorithm. The ground truth is shown in the attached image. Any help would be greatly appreciated.

disp('Calculating centroids')
K = 6;                                   % number of clusters
% Cluster the rows of Data; idx(i) is the cluster index of row i
[idx, C, sumdist] = kmeans(Data, K, 'Display', 'final');
dataset = Data;
% Append the cluster index as a fifth column
dataset_idx = [dataset(:, 1:4), idx];
% Collect the rows of each cluster in its own cell
clusters = cell(K, 1);
for i = 1:K
    clusters{i} = dataset_idx(dataset_idx(:, 5) == i, :);
end
cluster_assignments = idx;
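One possible cause of misassigned points is that kmeans uses Euclidean distance by default, so a column with a much larger numeric range dominates the result. The following is only a minimal sketch under that assumption: it standardizes each column with zscore and uses the 'Replicates' option to keep the best of several random starts. The synthetic matrix stands in for "Data" and is purely illustrative; whether this helps on the real dataset is untested.

```matlab
% Synthetic stand-in for "Data" (assumption): two groups that differ only in
% column 4, while column 1 has a much larger numeric range.
rng(0);                                     % reproducible random numbers
Data = [1e6*rand(50,1), rand(50,1), rand(50,1), 30*ones(50,1);
        1e6*rand(50,1), rand(50,1), rand(50,1), 60*ones(50,1)];

K = 2;                                      % number of clusters in this toy example
Z = zscore(Data);                           % zero mean, unit variance per column
[idx, C] = kmeans(Z, K, 'Replicates', 10);  % keep the best of 10 random starts

% Count how many rows landed in each cluster
counts = accumarray(idx, 1);
disp(counts.')
```

The centroids in C are in standardized units, so they need to be mapped back with the column means and standard deviations before comparing them to the original values.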
7 Comments
Image Analyst
on 17 Feb 2024
I don't understand your ground truth image.
Med Future
on 18 Feb 2024
Image Analyst
on 18 Feb 2024
So cluster 1 can have times ranging from 30 to 1,250,000?
Med Future
on 18 Feb 2024
Image Analyst
on 18 Feb 2024
I'm sorry but this is just getting so confusing to me. You may have to wait for someone else who understands your problem better. How many actual features are there? Four? Three? Can't you just use masking to classify your observations based on known values? Like
class1Indexes = (data(:, 1) > 1200000) & (data(:, 1) < 1250000) & (data(:, 2) > 1200000) & (data(:, 2) < 1250000) & (data(:, 4) == 30);
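The masking idea can be extended to several classes with a small table of rules. This is a rough sketch only: the example rows, the second rule, and the names rules and labels are hypothetical, and the real ranges would have to come from the ground truth.

```matlab
% Hypothetical example data: columns 1-2 hold large values, column 4 a discrete level.
% These rows and thresholds are illustrative assumptions, not the asker's data.
data = [1210000 1215000 5 30;
        1240000 1230000 7 30;
         500000  510000 5 30;
        1220000 1225000 9 60];

% One row per class: [col1Min col1Max col2Min col2Max col4Value] (assumed ranges).
rules = [1200000 1250000 1200000 1250000 30;
          400000  600000  400000  600000 30];

labels = zeros(size(data, 1), 1);    % 0 means "no rule matched"
for k = 1:size(rules, 1)
    r = rules(k, :);
    mask = data(:, 1) > r(1) & data(:, 1) < r(2) & ...
           data(:, 2) > r(3) & data(:, 2) < r(4) & ...
           data(:, 4) == r(5);
    labels(mask & labels == 0) = k;  % keep the first matching rule
end
disp(labels.')
```

Rows that match no rule keep the label 0, which makes it easy to see which observations the known ranges do not cover.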
Med Future
on 18 Feb 2024
Med Future
on 19 Feb 2024
Answers (1)
