Skip to content

A CUDA implementation of non-negative matrix factorization for GPUs.

License

Notifications You must be signed in to change notification settings

recoord/nmf-gpu

 
 

Repository files navigation

nmf-cuda

A CUDA implementation of non-negative matrix factorization for GPUs.

A description of the implementation is available in [1] below. If you use this in your work, please cite [1].

[1] E. Battenberg and D. Wessel, “Accelerating non-negative matrix factorization for audio source separation on multi-core and many-core architectures,” in International Society for Music Information Retrieval Conference (ISMIR 2009), 2009.

Bibtex entry:

@inproceedings{battenberg2009accelerating,
    title={Accelerating nonnegative matrix factorization for audio source separation on multi-core and many-core architectures},
    author={Battenberg, E. and Wessel, D.},
    booktitle={10th International Society for Music Information Retrieval Conference (ISMIR 2009)},
    year={2009}
}

Implementation Details:

Iterative NMF on Cuda: X = W*H

  • multiplicative updates
  • divergence cost function

nmf.cu contains an example usage of the update_div function.

The function read_matrix reads binary floats in from files and stores them in a matrix struct

  • values need to be stored in column-major order, and the first two values in the file are integer dimensions of the matrix
  • an example of how to create the binary files (from e.g. Matlab data) is contained in the matlab script matrix_export.m
  • data can also just be stored in a column-major float array using the matrix struct and proper assignment of the dim values

The function update_div is where the work is done.

void update_div(
        matrix W, matrix H, 
        matrix X,
        float CONVERGE_THRESH,
        int max_iter,
        double t[10],
        int verbose);
  • W, H are initial values for the factor matrices
  • X is the target matrix to be decomposed
  • CONVERGE_THRESH is the convergence threshold (expressed as a ratio of cost function change to cost function value)
  • max_iter is the maximum number of iterations
  • double t[10] is a pointer to a double array of at least size 10 that will contain individual timing results for different computational pieces. (set this to NULL for normal use)
  • verbose set to 1 if you want more text output, 0 otherwise

About

A CUDA implementation of non-negative matrix factorization for GPUs.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Cuda 89.9%
  • C++ 5.8%
  • Makefile 1.7%
  • Python 1.3%
  • Shell 1.3%