This repository has been archived by the owner on Nov 10, 2017. It is now read-only.

CallForFunding

Sam Halliday edited this page Jun 21, 2017 · 12 revisions

Fund netlib-java to accelerate the machine learning revolution

The netlib-java project is seeking funding to allow machine learning / Big Data to reap the benefits of new GPU hardware on the Java virtual machine.

According to Eric Schmidt, machine learning and Big Data are going to be at the core of wealth creation from the IT industry for the foreseeable future, with 35x growth in the last two years alone. The workhorses of this revolution are GPUs and next generation hardware, like Google's TPU or NVIDIA's DGX-1.

All forms of machine learning require lightning-fast linear algebra operations. netlib-java provides access to hardware-accelerated high performance linear algebra from the Java virtual machine and ships as a part of Spark (MLlib). It is also widely used with Hadoop. You can find out more details about it in this talk and slide deck.

The explosion of new hardware architectures means that volunteer work on netlib-java is not enough to support and anticipate them all. In order to empower the community to fully embrace these new platforms and help build the next generation of applications, netlib-java needs funding.

A funded version of netlib-java will provide the strategic API for hardware-optimized linear algebra, building on decades of machine code optimization by experts. It will ensure that hardware acceleration for any new algorithm is instantly available and make upgrades to new hardware or OS effortless. It will allow customer-facing businesses to focus on solving their users' problems instead of spending endless, precious hours on low-level hardware integration and optimisation.

Specific Improvements

We have identified three priority areas for improvement:

  1. Continuous, automated release.
  2. Support for complex numbers.
  3. Support for hardware-specific memory regions and NIO.

Continuous Release / OS Support

Today, releasing an update to netlib-java requires three physical computers and several hours of effort. With the advent of Docker and cloud platforms, we can do better. With funding, we can provide binaries for any virtualised operating system, with snapshot releases on merges to master.

This empowers the community to contribute to the project, and it means that bugfixes / features are available instantly without you needing to understand the complex multi-platform build.

Complex Numbers / additional algorithms

Any application involving signal processing, audio or detailed image analysis requires fast Fourier transforms and thus complex numbers.

However, complex numbers can only be supported by returning to the Fortran JVM compiler at the core of netlib-java. The compiler hasn't seen any code changes in nearly ten years and doesn't support a complex number representation that maps onto accelerated hardware.

A redesign of the compiler would enable complex number support as well as simplifying the build and supporting Fortran reference implementations of breakthrough academic work in the areas of large matrix decomposition and tensor algebra.
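To make the representation question concrete, here is a minimal sketch of the interleaved layout that Fortran uses for double-precision complex arrays, where each element is stored as a consecutive (real, imaginary) pair. The class and method names are hypothetical and are not part of netlib-java; the loop is a plain-Java stand-in for the kind of unconjugated complex dot product that an accelerated library would provide.

```java
// Hypothetical illustration only: not part of the netlib-java API.
final class InterleavedComplexSketch {

    /**
     * Unconjugated dot product of two complex vectors of n elements.
     * Element k is stored as x[2k] (real part) and x[2k+1] (imaginary part).
     * Returns a two-element array {re, im}.
     */
    static double[] dotu(int n, double[] x, double[] y) {
        double re = 0.0;
        double im = 0.0;
        for (int k = 0; k < n; k++) {
            double xr = x[2 * k], xi = x[2 * k + 1];
            double yr = y[2 * k], yi = y[2 * k + 1];
            // (xr + i*xi) * (yr + i*yi)
            re += xr * yr - xi * yi;
            im += xr * yi + xi * yr;
        }
        return new double[] {re, im};
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0, 4.0}; // (1+2i), (3+4i)
        double[] y = {5.0, 6.0, 7.0, 8.0}; // (5+6i), (7+8i)
        double[] r = dotu(2, x, y);
        System.out.println(r[0] + " + " + r[1] + "i"); // prints "-18.0 + 68.0i"
    }
}
```

Keeping the real and imaginary parts adjacent in one flat array means the data could be handed to native code without conversion, which an array of boxed complex objects would not allow.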

Special Memory / Hardware / NIO Safety

Bleeding edge GPUs allow direct access to their memory as if it were in the CPU memory space. netlib-java could support this with a thin native binding layer and direct memory access via NIO. This, again, requires changes to the Fortran compiler.

netlib-java currently uses a kind of native memory access known as critical access, which can delay garbage collection for long periods and may result in OutOfMemoryError failures. Using NIO as an alternative to JVM-managed memory would reduce the risk of such corner-case problems.
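As a rough sketch of the NIO alternative, the following uses only the standard `java.nio` API to hold a vector in a direct (off-heap) buffer. Native code can read a direct buffer through its address instead of pinning a heap array. The class and method names are hypothetical, and the plain-Java `dot` merely stands in for a native call; none of this is netlib-java's actual API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

// Hypothetical illustration only: not part of the netlib-java API.
final class DirectVectorSketch {

    /** Allocates an off-heap buffer for n doubles in the platform's native byte order. */
    static DoubleBuffer allocate(int n) {
        return ByteBuffer.allocateDirect(n * Double.BYTES)
                .order(ByteOrder.nativeOrder())
                .asDoubleBuffer();
    }

    /** Plain-Java dot product, standing in for a native BLAS routine. */
    static double dot(DoubleBuffer x, DoubleBuffer y) {
        double sum = 0.0;
        for (int i = 0; i < x.capacity(); i++) {
            sum += x.get(i) * y.get(i); // absolute reads, independent of position
        }
        return sum;
    }

    public static void main(String[] args) {
        DoubleBuffer x = allocate(3);
        DoubleBuffer y = allocate(3);
        x.put(new double[] {1.0, 2.0, 3.0});
        y.put(new double[] {4.0, 5.0, 6.0});
        System.out.println(x.isDirect()); // prints "true"
        System.out.println(dot(x, y));    // prints "32.0"
    }
}
```

The trade-off is that direct buffers are more awkward to use than `double[]` and their memory is released only when the buffer object itself is collected, so an API built on them would need to manage buffer lifetimes deliberately.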

Funding

If you are interested in contributing funding to this initiative, please get in touch by emailing me at [email protected]

While all interest is appreciated, at this point, we are seeking serious financial commitments -- at least $10,000 per organisation -- not small individual donations. Ideally one organisation would fund me to perform all the necessary work on a 6 to 12 month contract. This is because we want to focus on providing the best, most scalable solution, not on running a crowdfunding campaign (and I cannot fit such a piece of work in between other full-time work).

You -- whether you are a machine learning startup, a hardware manufacturer, or an established company using big data -- will benefit the most from netlib-java. We need your help to take the project to the next level.

Who Are You?

I am Sam Halliday. I am a chartered mathematician and software engineer, based in London. I am the author of netlib-java.
