Module rdkit.ML.Cluster.Murtagh

Interface to the C++ Murtagh hierarchic clustering code

Functions

_LookupDist(dists, i, j, n)
*Internal Use Only*

_ToClusters(data, nPts, ia, ib, crit, isDistData=0)
*Internal Use Only*

ClusterData(data, nPts, method, isDistData=0)
Clusters the data points passed in and returns the cluster tree.

Variables
  WARDS = 1
  SLINK = 2
  CLINK = 3
  UPGMA = 4
  MCQUITTY = 5
  GOWER = 6
  CENTROID = 7
  methods = [("Ward's Minimum Variance", WARDS, "Ward's Minimum ...

Imports: Clusters, MurtaghCluster, MurtaghDistCluster, numpy


Function Details

_LookupDist(dists, i, j, n)

*Internal Use Only*

Returns the distance between points i and j in the symmetric
distance matrix _dists_.
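
The lookup can be sketched in plain Python. This is an illustrative
stand-in, not the module's source: the name lookup_dist is hypothetical,
the n argument is left out because the indexing rule does not need it,
and returning 0.0 for i == j is an assumption. The index formula is the
one documented for the _isDistData_ argument of ClusterData.

```python
def lookup_dist(dists, i, j):
    """Hypothetical stand-in for _LookupDist (assumed behaviour).

    dists is a flat sequence in which, for i < j, the distance d_ij
    is stored at index j*(j-1)//2 + i.
    """
    if i == j:
        # Assumption: a point is at distance zero from itself.
        return 0.0
    if i > j:
        # The matrix is symmetric, so order the pair as i < j.
        i, j = j, i
    return dists[j * (j - 1) // 2 + i]


# Four points; pairs are stored in the order
# (0,1), (0,2), (1,2), (0,3), (1,3), (2,3).
example_dists = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

d_12 = lookup_dist(example_dists, 1, 2)  # index 2*1//2 + 1 = 2
d_31 = lookup_dist(example_dists, 3, 1)  # swapped to (1,3), index 3*2//2 + 1 = 4
```

With this layout a set of n points needs n*(n-1)/2 stored values.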

_ToClusters(data, nPts, ia, ib, crit, isDistData=0)

*Internal Use Only*

Converts the results of the Murtagh clustering code into
a cluster tree, which is returned in a single-entry list.

ClusterData(data, nPts, method, isDistData=0)

Clusters the data points passed in and returns the cluster tree.

**Arguments**

  - data: a list of lists (or an array or other sequence) holding the
    input data (see the discussion of the _isDistData_ argument for the
    exception)

  - nPts: the number of points to be used

  - method: determines which clustering algorithm should be used.
      The constants defined for these are:
      WARDS, SLINK, CLINK, UPGMA, MCQUITTY, GOWER, CENTROID

  - isDistData: set this toggle when the data passed in is a
      distance matrix.  The distance matrix should be stored
      symmetrically, as a flat sequence with one entry per pair
      of points, so that _LookupDist (above) can retrieve
      the results (integer division):
        for i<j: d_ij = dists[j*(j-1)//2 + i]


**Returns**

  - a single-entry list with the cluster tree
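
A minimal usage sketch, under stated assumptions: the helper
condensed_distances is hypothetical (it is not part of this module),
Euclidean distance is just an example metric, and the calls into
ClusterData run only if rdkit is importable. The helper lays the
distances out in the order the _isDistData_ argument describes, so that
for i<j the value d_ij lands at index j*(j-1)//2 + i.

```python
import math


def condensed_distances(points):
    """Hypothetical helper: flatten pairwise Euclidean distances so that,
    for i < j, d_ij is stored at index j*(j-1)//2 + i."""
    dists = []
    for j in range(1, len(points)):
        for i in range(j):
            dists.append(math.dist(points[i], points[j]))
    return dists


# Three collinear points; pairs are stored as (0,1), (0,2), (1,2).
points = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
dists = condensed_distances(points)

try:
    from rdkit.ML.Cluster import Murtagh
except ImportError:
    Murtagh = None  # rdkit not installed; skip the clustering calls

if Murtagh is not None:
    # Cluster the raw data points.
    tree_from_points = Murtagh.ClusterData(points, len(points), Murtagh.WARDS)[0]
    # Cluster from the precomputed distances instead.
    tree_from_dists = Murtagh.ClusterData(
        dists, len(points), Murtagh.UPGMA, isDistData=1)[0]
```

The trailing [0] unpacks the single-entry list that ClusterData returns.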


Variables Details

methods

Value:
[("Ward's Minimum Variance", WARDS, "Ward's Minimum Variance"),
 ('Average Linkage', UPGMA, 'Group Average Linkage (UPGMA)'),
 ('Single Linkage', SLINK, 'Single Linkage (SLINK)'),
 ('Complete Linkage', CLINK, 'Complete Linkage (CLINK)'),
 ("Centroid", CENTROID, "Centroid method"),]
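
As a small sketch of how this list might be used, the entries can be
turned into a lookup from display name to method constant. The block
redefines the constants and the list locally so that it runs without
rdkit. Reading each tuple as (short name, constant, description) is an
assumption based on the values shown above, and name_to_constant is a
hypothetical name.

```python
# Local copies of the module-level values listed under Variables.
WARDS, SLINK, CLINK, UPGMA, MCQUITTY, GOWER, CENTROID = range(1, 8)

methods = [
    ("Ward's Minimum Variance", WARDS, "Ward's Minimum Variance"),
    ('Average Linkage', UPGMA, 'Group Average Linkage (UPGMA)'),
    ('Single Linkage', SLINK, 'Single Linkage (SLINK)'),
    ('Complete Linkage', CLINK, 'Complete Linkage (CLINK)'),
    ("Centroid", CENTROID, "Centroid method"),
]

# Assumed tuple layout: (short name, constant, longer description).
name_to_constant = {name: const for name, const, _desc in methods}

chosen = name_to_constant['Complete Linkage']  # same value as CLINK
```

The resulting constant is what ClusterData expects as its method argument.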