informative eigen vector selection algorithm proposed by: Tao Xiang and Shaogang Gong. 2008. Spectral clustering with eigenvector selection. Pattern Recogn. 41, 3 (March 2008), 1012-1029.

sudeepsahadevan/EigenVectorSelection

---
title: "Eigen vector selection"
author: "Sudeep Sahadevan"
date: "09/17/2015"
output:
  BiocStyle::html_document:
    toc: true
    theme: united
    highlight: tango
---

Functions to perform informative eigen vector selection based on the algorithm proposed by:
Tao Xiang and Shaogang Gong (2008). Spectral clustering with eigenvector selection. Pattern Recogn. 41(3), 1012-1029.
Article DOI: 10.1016/j.patcog.2007.07.023
Pdf link

See the original manuscript or open the html file for equations

## compute.params

Description

Compute the posterior probabilities for the Gaussian mixture model. Given an eigenvector, compute the posterior probabilities that the vector follows a Gaussian mixture under the given parameters. The variable names in this function follow those used by Xiang and Gong (2008) in their manuscript.
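
As a rough illustration of the model behind these posteriors, the sketch below evaluates a relevance-weighted mixture density. It assumes the form described by Xiang and Gong (2008): with probability $1 - R_{ek}$ an element comes from a single Gaussian, and with probability $R_{ek}$ from a two-component Gaussian mixture with weight $w_k$. The function names `mixture.density` and `posterior.relevant` and the arguments `mean1` and `var1` are hypothetical and not part of this package. Treating the `var` arguments as variances (so `sqrt()` is applied before calling `dnorm`) is also an assumption.

```r
# Hypothetical helpers, not part of this package.
# Density of eigenvector elements under the assumed relevance-weighted mixture.
mixture.density <- function(x, rel, w, mean1, var1, mean2, var2, mean3, var3) {
  # "irrelevant" case: a single Gaussian
  p.single <- dnorm(x, mean = mean1, sd = sqrt(var1))
  # "relevant" case: two Gaussians mixed with weight w
  p.mix <- w * dnorm(x, mean = mean2, sd = sqrt(var2)) +
    (1 - w) * dnorm(x, mean = mean3, sd = sqrt(var3))
  (1 - rel) * p.single + rel * p.mix
}

# Posterior probability that each element came from the two-component part.
posterior.relevant <- function(x, rel, w, mean1, var1, mean2, var2, mean3, var3) {
  p.mix <- w * dnorm(x, mean = mean2, sd = sqrt(var2)) +
    (1 - w) * dnorm(x, mean = mean3, sd = sqrt(var3))
  rel * p.mix /
    mixture.density(x, rel, w, mean1, var1, mean2, var2, mean3, var3)
}

x <- c(-1, 0, 1)
dens <- mixture.density(x, rel = 0.5, w = 0.5, mean1 = 0, var1 = 1,
                        mean2 = -1, var2 = 0.25, mean3 = 1, var3 = 0.25)
post <- posterior.relevant(x, rel = 0.5, w = 0.5, mean1 = 0, var1 = 1,
                           mean2 = -1, var2 = 0.25, mean3 = 1, var3 = 0.25)
```

With `rel = 0` the density reduces to the single Gaussian, and with `rel = 1` it reduces to the two-component mixture, which is a convenient way to check the two extremes.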

Parameters
  • vec: input eigen vector $e_{kn}$
  • rel: numeric, $0 \leq R_{ek} \leq 1$, relevance of the vector (default: 0.50)
  • mean2: mean of the first Gaussian mixture component, $\mu_{k2}$; if NULL, this parameter is estimated according to the init option
  • mean3: mean of the second Gaussian mixture component, $\mu_{k3}$; if NULL, this parameter is estimated according to the init option
  • var2: variance of the first Gaussian mixture component, $\sigma_{k2}$; if NULL, this parameter is estimated according to the init option
  • var3: variance of the second Gaussian mixture component, $\sigma_{k3}$; if NULL, this parameter is estimated according to the init option
  • w: weight of the Gaussian mixture, $\mathit{w}_{k}$; if NULL, the weight is drawn at random as w <- runif(1, min = 0, max = 1)
  • init: initialization option, either "random" or "cluster". "random" draws the initial parameters at random; "cluster" uses the cluster means from k-means clustering with centers = 2. For details on k-means clustering see the R function kmeans.
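
The two `init` modes described above can be sketched as follows. This is only an illustration under stated assumptions: the helper name `init.params` is hypothetical, and the way variances and the weight are initialised here (per-cluster variance and cluster proportion for "cluster", uniform draws for "random") is a guess rather than the package's actual behaviour. Only the use of `kmeans` with `centers = 2` comes from the description.

```r
# Hypothetical helper, not part of this package.
init.params <- function(vec, init = c("random", "cluster")) {
  init <- match.arg(init)
  if (init == "cluster") {
    # cluster means from k-means with two centers
    km <- kmeans(vec, centers = 2)
    means <- as.numeric(km$centers)
    # assumption: variances taken per cluster, weight from cluster proportion
    vars <- as.numeric(tapply(vec, km$cluster, var))
    w <- mean(km$cluster == 1)
  } else {
    # assumption: uniform draws within the observed range of the vector
    means <- runif(2, min = min(vec), max = max(vec))
    vars <- runif(2, min = 0, max = var(vec))
    w <- runif(1, min = 0, max = 1)
  }
  list(mean2 = means[1], mean3 = means[2],
       var2 = vars[1], var3 = vars[2], w = w)
}

set.seed(1)
# toy vector with two well-separated groups
vec <- c(rnorm(20, mean = -1, sd = 0.1), rnorm(20, mean = 1, sd = 0.1))
p.cluster <- init.params(vec, init = "cluster")
p.random <- init.params(vec, init = "random")
```

On a vector with two clearly separated groups, the "cluster" start places the two means near the group centres, while the "random" start may land anywhere in the observed range.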
Usage

This function is not meant to be called directly; it is used internally by the compute.relevance function.

Return

A list with the following elements:

  • rnk: estimated $R_{ek}^{new}$
  • wnk: estimated $\mathit{w}_{k}^{new}$
  • m2nk: estimated $\mu_{k2}^{new}$
  • v2nk: estimated $\sigma_{k2}^{new}$
  • m3nk: estimated $\mu_{k3}^{new}$
  • v3nk: estimated $\sigma_{k3}^{new}$

## compute.relevance

Description

Given an eigenvector, compute the relevance of the vector according to the expectation maximization algorithm proposed by Xiang and Gong (2008).

Parameters
  • vec: input eigen vector, $e_{k}$ in the equations
  • tol: tolerance level for convergence (default: $10^{-6}$, i.e. 1e-6)
  • maxit: maximum number of iterations for the expectation maximization step before convergence (default: 2500)
  • maxtrials: maximum number of repeated runs (default: 25)
  • init: see the init description under compute.params
Usage

Example usage: create random dataset

testdata <- matrix(runif(36,0,1),6,6)
testdata

Make it symmetric, and assign 0 to diagonal elements

testdata <- testdata %*% t(testdata)
diag(testdata) <- 0
testdata

Compute the Laplacian as $L = D - W$ and compute the eigen decomposition of the Laplacian

testlap <- diag(rowSums(testdata))-testdata
testlap
testeig <- eigen(testlap,isSymmetric(testlap))
testeig
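
A quick sanity check on the Laplacian can catch construction mistakes. The sketch below rebuilds the matrices with a fixed seed (the seed and the names `W`, `L` and `eig` are assumptions for illustration) and checks two standard properties of $L = D - W$: every row sums to zero, and the smallest eigenvalue is numerically zero. Note that `eigen` returns eigenvalues in decreasing order, so the smallest one is last.

```r
set.seed(42)
# symmetric weight matrix with a zero diagonal, built as in the example above
W <- matrix(runif(36, 0, 1), 6, 6)
W <- W %*% t(W)
diag(W) <- 0

# unnormalised Laplacian: degree matrix minus weight matrix
L <- diag(rowSums(W)) - W
eig <- eigen(L, symmetric = TRUE)

# each row of L should sum to zero (up to floating point error)
row.sums.zero <- all(abs(rowSums(L)) < 1e-10)
# eigenvalues come back in decreasing order, so the smallest is the last one
smallest <- eig$values[length(eig$values)]
```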

The eigenvectors can be used for relevance estimation like:

testrel <- compute.relevance(testeig$vectors[,2],tol=1e-6,maxit=2500,maxtrials=2 )
testrel

Eigenvectors with rnk > 0.50 are considered relevant and can be used for further downstream processing.
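
Applying the 0.50 threshold amounts to a column filter on the eigenvector matrix. In the sketch below, `rel.scores` is a hypothetical vector of relevance values made up for illustration (it is not actual output of `compute.relevance`), and `vecs` stands in for an eigenvector matrix such as `testeig$vectors`.

```r
set.seed(1)
vecs <- matrix(rnorm(24), nrow = 6, ncol = 4)  # stand-in for an eigenvector matrix
rel.scores <- c(0.91, 0.12, 0.67, 0.50)        # hypothetical relevance values

# keep only columns whose relevance is strictly above the threshold
keep <- which(rel.scores > 0.50)
# drop = FALSE keeps the result a matrix even if a single column survives
selected <- vecs[, keep, drop = FALSE]
```

Using `drop = FALSE` avoids the result silently collapsing to a plain vector when only one eigenvector passes the threshold.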

Return

A list of values:

## wrapper.compute.relevance

Description

Wrapper for the function compute.relevance: instead of a single eigenvector, it takes a matrix of eigenvectors as input.

Parameters
  • mat: input eigen vector matrix
  • tol: tolerance level for convergence (default: $10^{-6}$, i.e. 1e-6)
  • maxit: maximum number of iterations for the expectation maximization step (default: 2500)
  • maxtrials: maximum number of repeated runs (default: 25)
  • init: see the init description under compute.params
  • ncpus: number of cores to use; ncpus > 1 requires the doMC and foreach packages
Usage
testrel <- wrapper.compute.relevance(testeig$vectors[,c(2:4)])
testrel
Return