Clustering is the classification of objects into different groups, or more precisely, the partitioning of a data set into subsets (clusters), so that the data in each subset (ideally) share some common trait - often proximity according to some defined distance measure.

Java implementation of the Quality Threshold clustering algorithm

A collection of clustering algorithms and tools written in Java has been developed at the ICB and is available as a library called "QtClustering". This is free software distributed under the GNU General Public License.

Algorithms Implemented

  • QT clustering algorithm. QT stands for Quality Threshold, the maximum diameter allowed for a cluster. See Wikipedia, and Heyer LJ et al. 1999 for the article where the algorithm was reported and tested on microarray data.
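
The general idea of QT clustering can be sketched as follows: grow a candidate cluster around every remaining point by repeatedly adding the point that keeps the diameter smallest, stop when the diameter would exceed the threshold, keep the largest candidate, remove its points, and repeat. The code below is only an illustrative sketch and is not the QtClustering library's API. The class name QtSketch, its method names, and the choice of Euclidean distance are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the QtClustering library.
class QtSketch {
    /** Euclidean distance between two points of equal dimension (an assumed measure). */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Diameter the cluster would have if the candidate point were added to it. */
    static double diameterWith(List<Integer> cluster, double current,
                               int candidate, double[][] points) {
        double diameter = current;
        for (int member : cluster) {
            diameter = Math.max(diameter, distance(points[member], points[candidate]));
        }
        return diameter;
    }

    /** Grows a candidate cluster around seed without exceeding the threshold diameter. */
    static List<Integer> grow(int seed, List<Integer> remaining,
                              double[][] points, double threshold) {
        List<Integer> cluster = new ArrayList<>();
        cluster.add(seed);
        double diameter = 0.0;
        while (true) {
            int best = -1;
            double bestDiameter = Double.MAX_VALUE;
            for (int candidate : remaining) {
                if (cluster.contains(candidate)) {
                    continue;
                }
                double d = diameterWith(cluster, diameter, candidate, points);
                if (d < bestDiameter) {
                    bestDiameter = d;
                    best = candidate;
                }
            }
            // Stop when no point is left or the best addition breaks the threshold.
            if (best < 0 || bestDiameter > threshold) {
                return cluster;
            }
            cluster.add(best);
            diameter = bestDiameter;
        }
    }

    /** Returns clusters as lists of point indices, largest candidate first each round. */
    static List<List<Integer>> cluster(double[][] points, double threshold) {
        List<Integer> remaining = new ArrayList<>();
        for (int i = 0; i < points.length; i++) {
            remaining.add(i);
        }
        List<List<Integer>> clusters = new ArrayList<>();
        while (!remaining.isEmpty()) {
            List<Integer> largest = null;
            for (int seed : remaining) {
                List<Integer> candidate = grow(seed, remaining, points, threshold);
                if (largest == null || candidate.size() > largest.size()) {
                    largest = candidate;
                }
            }
            clusters.add(largest);
            remaining.removeAll(largest);
        }
        return clusters;
    }
}
```

Calling QtSketch.cluster(points, 2.0) on a small array of 2-D points returns lists of indices into that array. Every point ends up in exactly one cluster, and isolated points form clusters of size one. This naive version rebuilds every candidate cluster in each round, so it is meant for reading rather than for large data sets.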

Software Requirements

Getting the library

The clustering library is distributed as precompiled jar files and also in source code form. The distribution types are described in the following sections.

Binary Distribution

The binary distribution of the clustering library contains two jar files described as follows:

  • qtclustering.jar
includes all the external classes needed to run (i.e., fastutil)
  • qtclustering-core.jar
includes only the project classes; to use it in your own projects you must also supply a fastutil.jar

Source Distribution

The source distribution of the clustering library contains the Java source code along with supporting files that are used to compile and test the package.


Note that this section is meant only for those with the source distribution or Subversion access. Users of the binary distribution should skip this section.

Compiling and packaging

The target used to build the clustering package is called "jar". Executing ant jar will produce a file called "qtclustering.jar" in the <install-dir>.

Running JUnit Tests

The clustering library is built using ant and a build.xml file located in the <install-dir>. The default target will compile the source and run the JUnit tests.

Subversion Access from the ICB local environment

This project's Subversion repository can be checked out with the following command:

 svn co https://pbtech-vc.med.cornell.edu/public/svn/icb/trunk/icb-commons/qtclustering

Browse the clustering package in the Subversion repository.


The clustering Javadoc API is available here and is also included with the binary distribution.

More Information

Cluster analysis page at Wikipedia.

Contact. Email feedback and questions to icb at med.cornell.edu.

