mddc

PURPOSE

Minimum Density Divisive Clustering

SYNOPSIS

function [idx,t] = mddc(X, K, varargin)

DESCRIPTION

Minimum Density Divisive Clustering 
[IDX,T] = MDDC(X, K, VARARGIN)

 [IDX, T] = MDDC(X, K) produces a divisive hierarchical clustering of the
 N-by-D data matrix X into at most K clusters. The algorithm builds a
 hierarchy of binary partitions, each of which splits the observations in a
 cluster with the hyperplane that has the minimum density integral. Fewer
 than K clusters are returned if no valid separating hyperplane is found.

  [IDX, T] = MDDC(X, K) returns the cluster assignment (IDX) and the binary tree
  (T) containing the cluster hierarchy.

  [IDX, T] = MDDC(X, K, 'PARAM1',val1, 'PARAM2',val2, ...) specifies optional parameters
  in the form of Name,Value pairs. 
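
  The calling forms above can be sketched as follows. This is a minimal
  illustration that assumes mddc is on the MATLAB path; the synthetic data and
  the chosen parameter values are assumptions made for the example, not
  recommendations from the authors.

```matlab
% Two well-separated Gaussian blobs in 2-D (synthetic data, illustration only).
rng(1);
X = [randn(100, 2); randn(100, 2) + 6];

% Default settings: request at most 2 clusters.
[idx, t] = mddc(X, 2);

% Same call with optional Name,Value pairs documented on this page.
[idx, t] = mddc(X, 2, 'minsize', 10, 'maxit', 100);

% Fewer than K clusters may be returned, so count the labels actually used.
numFound = numel(unique(idx));
```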

  OPTIONAL PARAMETERS:
  'v0' - Initial projection vector(s)
    Function handle: v0(X,P) returns D-by-S matrix of initial projection vectors
       (default: v0 = @(y,p)(pca(y,'NumComponents',1)) -- 1st principal component)
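
    As a sketch of supplying a custom 'v0' handle: the first handle below
    restates the documented default, and the second is a hypothetical
    alternative that returns two initial vectors (S = 2). It assumes the
    Statistics and Machine Learning Toolbox function pca is available; how
    mddc uses the S columns is not described on this page.

```matlab
% Synthetic data for illustration only.
rng(1);
X = [randn(100, 3); randn(100, 3) + 4];

% Documented default: first principal component (D-by-1).
v0_default = @(y, p) pca(y, 'NumComponents', 1);

% Hypothetical alternative: first two principal components (D-by-2).
v0_two = @(y, p) pca(y, 'NumComponents', 2);

[idx, t] = mddc(X, 2, 'v0', v0_two);
```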

  'bandwidth' - Bandwidth parameter
    Function Handle: bandwidth(X,pars) returns bandwidth (positive scalar)
    (default: bandwidth = 0.9 * sqrt(eigs(cov(X),1)) * size(X,1)^(-0.2))
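
    The default expression can be wrapped in a handle with the documented
    bandwidth(X,pars) signature, which makes it easy to scale. In the sketch
    below, the names bw_default and bw_half and the factor 0.5 are
    illustrative assumptions, and pars is simply ignored.

```matlab
% Synthetic data for illustration only.
rng(1);
X = [randn(100, 2); randn(100, 2) + 6];

% Documented default: 0.9 * sqrt(largest eigenvalue of cov(X)) * N^(-1/5).
bw_default = @(X, pars) 0.9 * sqrt(eigs(cov(X), 1)) * size(X, 1)^(-0.2);

% Hypothetical variant: half the default bandwidth (less smoothing).
bw_half = @(X, pars) 0.5 * bw_default(X, pars);

% Inspect the value the default formula gives for this data set.
h = bw_default(X, []);

[idx, t] = mddc(X, 2, 'bandwidth', bw_half);
```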

  'split_index' - Criterion determining which cluster to split
    Function Handle: index = split_index(v, X, pars)
            (v: projection vector, X:data matrix, pars: parameters structure)
    Cluster with MAXIMUM INDEX is split at each step of the algorithm
    Three standard choices of split index can be enabled by setting 'split_index' to 
    one of the strings below:
        + 'fval':    Split cluster whose hyperplane achieves the lowest density integral
        + 'size':    Split largest cluster
        + 'rdepth':  Split cluster with maximum relative depth
    (default: split_index = 'size')
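
    A custom split index only needs to follow the documented signature
    index = split_index(v, X, pars) and return a scalar, with larger values
    meaning "split this cluster first". The handles below are hypothetical
    examples and assume v is a D-by-1 column vector.

```matlab
% Synthetic data for illustration only.
rng(1);
X = [randn(100, 2); randn(60, 2) + 6; randn(40, 2) - 6];

% Mirrors the idea of the 'size' option: larger clusters get a larger index.
split_by_size = @(v, X, pars) size(X, 1);

% Hypothetical criterion: variance of the data projected onto v.
split_by_spread = @(v, X, pars) var(X * v);

% Pass a handle ...
[idx, t] = mddc(X, 3, 'split_index', split_by_spread);

% ... or one of the documented strings.
[idx, t] = mddc(X, 3, 'split_index', 'fval');
```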

  'minsize' - Minimum cluster size (integer)
    (default: minsize = 1)

  'alphamin' - Minimum value of ALPHA. MDHs are sought in the range
    [mean(X*V) - ALPHA*std(X*V), mean(X*V) + ALPHA*std(X*V)].
    ALPHA starts at (alphamin) and increases by 0.1 every (maxit) iterations until (alphamax) is reached.
    (default: 0)

  'alphamax' - Maximum value of ALPHA. MDHs are sought in the range
    [mean(X*V) - ALPHA*std(X*V), mean(X*V) + ALPHA*std(X*V)].
    ALPHA starts at (alphamin) and increases by 0.1 every (maxit) iterations until (alphamax) is reached.
    (default: 1)
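
    To make the ALPHA schedule concrete, the sketch below prints the search
    range for each value of ALPHA along a fixed projection vector. It only
    illustrates the interval formula above and is not the internal
    optimisation loop of mddc; the choice of v (first principal component,
    via pca from the Statistics and Machine Learning Toolbox) is an
    assumption made for the example.

```matlab
% Synthetic data for illustration only.
rng(1);
X = [randn(100, 2); randn(100, 2) + 6];

v = pca(X, 'NumComponents', 1);   % projection vector (D-by-1)
p = X * v;                        % projected data (N-by-1)

alphamin = 0;                     % documented default
alphamax = 1;                     % documented default

% ALPHA grows in steps of 0.1; the range widens around the projected mean.
for alpha = alphamin:0.1:alphamax
    lo = mean(p) - alpha * std(p);
    hi = mean(p) + alpha * std(p);
    fprintf('alpha = %.1f: [%.3f, %.3f]\n', alpha, lo, hi);
end
```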

  'maxit' - Number of BFGS iterations to perform for each value of alpha
    (default: 50)

  'ftol' - Stopping criterion for change in objective function value over consecutive iterations
    (default: 1.e-5)

  'verb' - Verbosity. Values greater than 0 enable visualisation during execution.
    Enabling this option slows down the algorithm considerably.
    (default: 0)

  'labels' - True cluster labels. Specifying these enables performance to be computed over 
    successive iterations and gives a better visualisation of how clusters are split.

  'colours' - Matrix containing colour specification for observations in different clusters.
    Number of rows must be equal to the number of true clusters (if 'labels' has been specified) or equal to 2.

Reference:
N.G. Pavlidis, D.P. Hofmeyr and S.K. Tasoulis. Minimum density hyperplanes.
Journal of Machine Learning Research, 17(156):1–33, 2016.
http://jmlr.org/papers/v17/15-307.html.

Generated on Tue 17-Jul-2018 18:58:09 by m2html © 2005