Applied Math & Computer Science Lab
Data Analysis, Optimization & Mathematical Modeling, Artificial Intelligence, Neural Net For Everyday Life Applications

Hierarchical Clustering

Hierarchical clustering is a clustering method that groups objects into clusters and builds a hierarchy of those clusters. In the "top down" approach the program starts by assigning all objects to one cluster and then splits a selected cluster at each step. The decision about which cluster to split is based on the distance (or similarity) between the objects in the cluster and can be implemented in different ways [2].
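One possible way to make that decision, sketched here in Python rather than Perl, is to split the cluster with the largest diameter, that is, the greatest distance between any two of its objects. The diameter criterion, the function names and the sample points are assumptions made for illustration; they are not taken from the script in [1]. The sketch needs Python 3.8 or later for math.dist.

```python
import math
from itertools import combinations

def diameter(points, members):
    """Largest Euclidean distance between any two objects of a cluster."""
    return max(
        (math.dist(points[a], points[b]) for a, b in combinations(members, 2)),
        default=0.0,  # a cluster with a single object has zero diameter
    )

def pick_cluster_to_split(points, clusters):
    """Return the index of the cluster with the largest diameter."""
    return max(range(len(clusters)), key=lambda i: diameter(points, clusters[i]))

# Hypothetical data: five objects already divided into two clusters.
points = [(0, 0), (0, 1), (10, 10), (10, 11), (30, 30)]
clusters = [[0, 1], [2, 3, 4]]
print(pick_cluster_to_split(points, clusters))  # 1
```

The second cluster is chosen because object 4 lies far from objects 2 and 3. Other criteria, such as the average distance inside a cluster, would fit the same structure.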



In this discussion we will use Perl code to create a program for hierarchical clustering with the "top down" approach. The input data for clustering in our example is a 2 dimensional array, but the program is not limited to two dimensions.
At each step we keep a one dimensional array. The ith element of this array keeps track of the objects that are included in the ith cluster at the given step. The program saves object ids to a string, with each object id followed by the special separator character "_". For example, if cluster 3 has objects 2, 6, 9 then the 3rd element of this array will have the string value "2_6_9_".
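The string representation can be illustrated with a small Python sketch. The helper names encode_cluster and decode_cluster are hypothetical and do not come from the Perl script.

```python
SEP = "_"  # separator character described in the text

def encode_cluster(ids):
    """Build a string such as '2_6_9_' (each id is followed by the separator)."""
    return "".join(f"{i}{SEP}" for i in ids)

def decode_cluster(text):
    """Recover the list of object ids from a string such as '2_6_9_'."""
    return [int(part) for part in text.split(SEP) if part]

print(encode_cluster([2, 6, 9]))  # 2_6_9_
print(decode_cluster("2_6_9_"))   # [2, 6, 9]
```

The empty piece left after the trailing separator is skipped when decoding.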
After each step the size of the array increases by one, because one cluster is split into two.
At the end of each step the program outputs the clusters and the objects included in them, so we can see the hierarchy of clusters.
The program uses the Euclidean distance between objects.
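The whole procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a translation of the Perl script in [1]: the rule for choosing and splitting a cluster (split the cluster containing the most distant pair of objects, then assign every object to the nearer of the two) is one option among many, and the sample points and function names are hypothetical. It needs Python 3.8 or later for math.dist.

```python
import math

def farthest_pair(points, members):
    """Return the two objects of a cluster that are farthest apart."""
    pairs = [(a, b) for i, a in enumerate(members) for b in members[i + 1:]]
    return max(pairs, key=lambda p: math.dist(points[p[0]], points[p[1]]))

def split(points, members):
    """Split one cluster in two around its farthest pair (an assumed rule)."""
    seed_a, seed_b = farthest_pair(points, members)
    left, right = [], []
    for m in members:
        to_a = math.dist(points[m], points[seed_a])
        to_b = math.dist(points[m], points[seed_b])
        # seed_b always starts the second cluster, so neither half is empty
        (right if m == seed_b or to_b < to_a else left).append(m)
    return left, right

def divisive_clustering(points):
    """Return the clusters after every step, starting from one big cluster."""
    clusters = [list(range(len(points)))]
    history = [[c[:] for c in clusters]]
    while any(len(c) > 1 for c in clusters):
        # Split the cluster whose farthest pair is the most distant.
        candidates = [i for i, c in enumerate(clusters) if len(c) > 1]
        idx = max(candidates, key=lambda i: math.dist(
            *(points[m] for m in farthest_pair(points, clusters[i]))))
        clusters[idx:idx + 1] = split(points, clusters[idx])
        history.append([c[:] for c in clusters])
    return history

# Hypothetical input: six two-dimensional objects forming two groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
for step, clusters in enumerate(divisive_clustering(points)):
    # Print each cluster in the "id_id_" string form described above.
    print(step, ["".join(f"{m}_" for m in c) for c in clusters])
```

With this input the first split separates objects 0, 1, 2 from objects 3, 4, 5, and the loop stops once every cluster holds a single object.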
For the input data

the output produced by the program is as follows:

A link to the Perl source code is provided below [1]. With some minor adjustments it can be used for hierarchical clustering in practical data mining tasks.


References



1. Hierarchical clustering - 'top down' approach - perl script
2. Wikipedia: Hierarchical clustering

Related


1. Hierarchical clustering - 'bottom-up' approach
2. Hierarchical clustering - 'bottom up' approach - online clustering demo