"No one is harder on a talented person than the person themselves" - Linda Wilkinson ; "Trust your guts and don't follow the herd" ; "Validate direction not destination" ;

February 23, 2016

Hierarchical Clustering


  • Compute the distance between every pair of clusters
  • Merge the nearest pair; repeat until the desired number of clusters remains
  • The entire merge history can be represented as a dendrogram
  • At the end of the algorithm, the dendrogram is plotted
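The steps above can be sketched with base R's hclust on a built-in dataset (USArrests here, since the post's food data isn't shown in this snippet):

```r
# Agglomerative clustering sketch on a built-in dataset (USArrests)
d  <- dist(scale(USArrests), method = "euclidean")  # pairwise distances
hc <- hclust(d, method = "complete")                # repeatedly merge nearest clusters
plot(hc)                                            # dendrogram of the merge history
clusters <- cutree(hc, k = 4)                       # stop at 4 clusters
table(clusters)                                     # cluster sizes
```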
Measuring Distance between Clusters
  • Single linkage (minimum distance over all pairs of points, one from each cluster)
  • Complete linkage (maximum distance over all pairs of points, one from each cluster)
  • Average linkage (average distance over all possible pairs)
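These three linkage rules map directly onto the method argument of hclust; on the same distance matrix they generally produce different trees:

```r
# Same distances, three linkage rules (USArrests as a stand-in dataset)
d <- dist(scale(USArrests), method = "euclidean")
hc_single   <- hclust(d, method = "single")    # minimum pairwise distance
hc_complete <- hclust(d, method = "complete")  # maximum pairwise distance
hc_average  <- hclust(d, method = "average")   # mean of all pairwise distances
# Each tree records n-1 merges; compare the dendrograms side by side
par(mfrow = c(1, 3))
plot(hc_single); plot(hc_complete); plot(hc_average)
```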
library(cluster)
#Exclude the country name column
#Euclidean distance, complete linkage
#diss = FALSE since we pass a data frame, not a dissimilarity matrix
foodagg <- agnes(food[,-1], diss = FALSE, metric = "euclidean", method = "complete")
plot(foodagg)
#Cut the tree to get the required number of clusters
cutree(foodagg, k = 5)
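A fully runnable version of the same workflow on R's built-in USArrests data (the food data frame isn't defined in this snippet); converting the agnes result with as.hclust() before cutree() keeps the cut step on the standard hclust interface:

```r
library(cluster)
# Same pipeline as above, but on a dataset shipped with R
ag <- agnes(scale(USArrests), diss = FALSE,
            metric = "euclidean", method = "complete")
grp <- cutree(as.hclust(ag), k = 5)  # convert to hclust, then cut into 5 clusters
table(grp)                           # observations per cluster
head(grp)                            # cluster label for the first few rows
```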
# Agglomerative hierarchical clustering (agnes)
ah <- agnes(food[,-1])
plot(ah)

# Divisive hierarchical clustering (diana)
dh <- diana(food[,-1])
plot(dh)
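agnes and diana each report a coefficient (ac and dc respectively) that summarizes how strong the clustering structure is, with values closer to 1 indicating a clearer structure; a quick way to compare the two approaches (shown on USArrests as a stand-in dataset):

```r
library(cluster)
x  <- scale(USArrests)
ah <- agnes(x)  # agglomerative (bottom-up)
dh <- diana(x)  # divisive (top-down)
# Agglomerative coefficient vs. divisive coefficient, both in (0, 1)
c(agglomerative = ah$ac, divisive = dh$dc)
```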
#Examples http://www.math.wustl.edu/~victor/classes/ma322/r-eg-28.txt

Happy Learning!!!
