Finding cluster centre in hierarchical clustering
I am trying to find the cluster centers in hierarchical clustering. Below is the code I use, but it returns only the cluster number for each observation. I am dealing with multi-dimensional data (32 dimensions). Any ideas or code would be very helpful.
c = clusterdata(input,'linkage','ward','savememory','off','maxclust',10);
Siddharth Sundar on 14 Oct 2014
Edited: Siddharth Sundar on 14 Oct 2014
Hierarchical clustering is not a center-based clustering method. This doc page describes what goes on under the hood when you use clusterdata: Hierarchical Clustering. Essentially, the pdist function computes the distance between every pair of objects in the data set. The linkage function then uses these distances to decide how the objects should be grouped into clusters, forming a binary hierarchical cluster tree.
By default, linkage uses the 'single' method, which measures the distance between two clusters as the smallest distance between any object in one cluster and any object in the other.
In your case, you use Ward linkage, which is based on the incremental sum of squares: the increase in the total within-cluster sum of squares that results from joining two clusters determines the grouping.
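To make the steps above concrete, here is a minimal sketch of the pdist, linkage and cluster calls that, as far as I understand the documentation, correspond to your clusterdata call. X is hypothetical random data standing in for your 32-dimensional input, and the final cluster call is the step that cuts the tree into at most 10 groups.

```matlab
% Hypothetical stand-in data: 100 observations, 32 dimensions
rng(0);
X = rand(100, 32);

Y = pdist(X);                    % pairwise Euclidean distances (default metric)
Z = linkage(Y, 'ward');          % binary hierarchical cluster tree, Ward linkage
c = cluster(Z, 'maxclust', 10);  % cut the tree into at most 10 clusters
```

Like clusterdata, this returns only a cluster index for each row of X. The tree Z holds the merge order and merge distances, not cluster centers.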
You can, however, set the linkage method to 'centroid' (see the documentation for linkage). This then uses the centroids of the individual clusters at a lower level in the tree to do the clustering at a higher level.
However, there is no way to access these centroids as an output argument. Since the code for the linkage function is available to you, you could set breakpoints, step through it, and check how the grouping is done using the centroids.
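If all you need is a representative center for each final cluster, one possible workaround is to compute the per-cluster mean yourself from the labels that clusterdata returns. The sketch below assumes that the mean of the member observations is an acceptable definition of "center". X is hypothetical random data standing in for your input, and the variable names k and centroids are only illustrative.

```matlab
% Hypothetical stand-in data: 100 observations, 32 dimensions
rng(0);
X = rand(100, 32);

% Same call as in the question, applied to the stand-in data
c = clusterdata(X, 'linkage', 'ward', 'savememory', 'off', 'maxclust', 10);

k = max(c);                        % number of clusters actually returned
centroids = zeros(k, size(X, 2));  % one row per cluster, one column per dimension
for i = 1:k
    % Mean over the rows assigned to cluster i; the explicit dimension
    % argument keeps single-member clusters as row vectors
    centroids(i, :) = mean(X(c == i, :), 1);
end
```

Row i of centroids is then the mean of all observations labelled i. These values are computed after the fact and are not necessarily the quantities that linkage works with internally.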