Formation of higher dimensional subspace clusters

Higher dimensional clusters are formed by connecting 2-dimensional subspace clusters that share common objects. If an outlier object becomes part of a one-dimensional cluster by chance, it will be absent from the clusters in the remaining dimensions, so its support will be below attribute_threshold. Such outlier objects are eliminated in this step. The attributes containing 2-dimensional clusters are arranged in non-decreasing order of the percentage of data items covered by their clusters. The objects in a 2-dimensional cluster in one attribute are connected to the objects in the next attribute in the sequence if they have object indices in common, forming higher dimensional subspace clusters. I also need to find whether the count of dimensions in a subspace cluster is less than attribute_threshold. The default attribute_threshold is 5. I want to do this with Z in the code below.
clc;
clear;
data=xlsread('Glassxl.xlsx');
asc=sort(data);
minpts=4;
epsilon=2;
tic
idx=dbscan(asc,epsilon,minpts);
figure(1)
gscatter(asc(:,1),asc(:,2),idx);
title('DBSCAN Using Euclidean Distance Metric')
Z = linkage(idx,'ward');
How can I find and merge the clusters that share a common object? Please help me.
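The merging step described in the question could be sketched roughly as follows. This is only an illustration under assumptions: the toy input `clusters2d`, the variable names, and the use of `intersect` to "connect" clusters are not from the original post, and the threshold is lowered to 2 so the tiny example has something to show. It uses base MATLAB only.

```matlab
% Hypothetical toy input: clusters2d{a} holds the clusters found in attribute a,
% each cluster being a row vector of object (row) indices.
nObjects = 10;
clusters2d = { {[1 2 3 4], [7 8]}, {[2 3 4 9]}, {[1 2 3 4 5 6 7 8]} };
attribute_threshold = 2;   % the question uses 5; 2 suits this tiny example

% Coverage: fraction of objects that fall in any cluster of the attribute.
nAttr = numel(clusters2d);
coverage = zeros(1, nAttr);
for a = 1:nAttr
    coverage(a) = numel(unique([clusters2d{a}{:}])) / nObjects;
end
[~, order] = sort(coverage, 'ascend');        % non-decreasing coverage

% Seed the candidates with the clusters of the first attribute in the sequence.
objs = clusters2d{order(1)};                  % object indices per candidate
dims = repmat({order(1)}, 1, numel(objs));    % attributes spanned per candidate

% Connect each candidate to clusters of the next attribute via common objects.
for a = order(2:end)
    nExisting = numel(objs);                  % candidates from earlier attributes
    for c = 1:numel(clusters2d{a})
        current = clusters2d{a}{c};
        for s = 1:nExisting
            common = intersect(objs{s}, current);
            if ~isempty(common)               % shared objects: extend the subspace
                objs{end+1} = common;
                dims{end+1} = [dims{s} a];
            end
        end
        objs{end+1} = current;                % keep the cluster itself as a seed
        dims{end+1} = a;
    end
end

% Flag candidates whose dimension count is below the threshold.
nDims = cellfun(@numel, dims);
belowThreshold = nDims < attribute_threshold;
[~, best] = max(nDims);                       % highest dimensional candidate
disp(objs{best}); disp(dims{best});
```

In this toy data, object 9 appears in only one attribute, so it never reaches the highest dimensional candidate, which is the behaviour the question describes for outliers. With real data, `clusters2d` would have to be built from the per-attribute clustering results first.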

 Accepted Answer

I can try to help, though I don't quite follow what you are asking.
Let me ask you: do you have a good understanding of DBSCAN in the first place?

8 Comments

I use dbscan to find the dense areas, and I chose the minpts and epsilon values arbitrarily.
OK, can you explain how you apply dbscan to find dense areas?
I tried to run your script, but it did not work. Also, I wondered why you sorted the data that way.
The 1st column looks like row indices, but you treated it as a data column. I was confused and do not understand what you're trying to do.
Anyway, I changed your code slightly so that it runs:
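clc;
clear;
data=xlsread('Glassxl.xlsx');
% sort(data) sorts every column independently, which breaks up the rows,
% so it is disabled here.
%asc=sort(data);
% Drop the 1st column (it looks like row indices) and keep the features.
asc = data(:,2:end);
minpts=4;
epsilon=2;
% idx holds one cluster label per row; -1 marks noise points.
idx=dbscan(asc,epsilon,minpts);
figure(1)
gscatter(asc(:,1),asc(:,2),idx);
title('DBSCAN Using Euclidean Distance Metric')
% Note: this builds the hierarchy from the cluster labels, not from the data.
Z = linkage(idx,'ward');
figure;
dendrogram(Z)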
Hope this is what you want.
Thank you. I need more clusters, which is why I sorted the data. How can I get more clusters?
Try decreasing minpts and adjusting epsilon. A smaller epsilon usually splits the data into more clusters, while a larger one tends to merge them.
I would suggest first understanding how the algorithm works.
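How the number of clusters responds to minpts and epsilon depends on the data, so it may be easiest to check directly. The sketch below sweeps a few values and counts the clusters each time. It assumes the Statistics and Machine Learning Toolbox (for `dbscan`), and the synthetic data and parameter grids are made up for illustration; with the glass data, `X` would be the feature columns instead.

```matlab
rng(0);                                    % reproducible synthetic data (assumption)
X = [randn(50,2); randn(50,2) + 6; randn(50,2) + [0 8]];

epsValues    = [0.3 0.6 1.2 2.4];          % hypothetical grid of epsilon values
minptsValues = [3 6];                      % hypothetical grid of minpts values

% nClusters(i,j) = clusters found with minptsValues(i) and epsValues(j).
nClusters = zeros(numel(minptsValues), numel(epsValues));
for i = 1:numel(minptsValues)
    for j = 1:numel(epsValues)
        idx = dbscan(X, epsValues(j), minptsValues(i));
        nClusters(i,j) = numel(unique(idx(idx > 0)));   % label -1 is noise
    end
end
disp(nClusters)                            % rows: minpts, columns: epsilon
```

Reading the table row by row shows, for a fixed minpts, where the cluster count peaks before a larger epsilon starts merging neighbouring groups.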
Sir, I used rmoutliers() to identify the outliers. How can I visualize those outliers in the data?
A = [57 59 60 100 59 58 57 58 300 61 62 60 62 58 57];
[B,TF] = rmoutliers(A)
% TF is a logical vector, so you can recover the removed values with:
A(TF)
Does this help?
MathWorks provides well-documented explanations for all of its functions.
You should always look there for a solution first when you get stuck.
Thank you, sir. Is it possible to visualize the outliers in the data as a graph?
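One possible way to show the outliers as a graph is to plot the data and mark the entries flagged by `TF` on top of it. This is a minimal sketch that reuses the vector `A` from the comment above; the marker and label choices are arbitrary.

```matlab
A = [57 59 60 100 59 58 57 58 300 61 62 60 62 58 57];
[B, TF] = rmoutliers(A);        % TF(k) is true where A(k) was removed

k = 1:numel(A);                 % position of each value in A
figure;
plot(k, A, '-o');               % all data points
hold on;
plot(k(TF), A(TF), 'rx', 'MarkerSize', 12, 'LineWidth', 2);   % outliers only
hold off;
legend('data', 'outliers');
xlabel('index');
ylabel('value');
```

For a table such as the glass data, the same idea could be applied to one column at a time.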
Good to know.
Please accept the answer and close this question. Good luck.
