CallForFunding
GPUs and next generation hardware, such as Google's TPU, are the
underpinning technologies of Big Data. netlib-java, which ships with
Spark's MLlib and is widely used in Hadoop, can only continue to
support and anticipate new hardware architectures with funding.
We already see fragmentation in Big Data platforms, with each new application or platform creating tailored support for restricted algorithms on specific chipsets. It is a tactical necessity for each player to repeat the work of their competitors, with no advantage.
By pooling resources, netlib-java can provide the strategic API for
linear algebra operations that benefit from hardware acceleration,
building on decades of machine code optimisation from the experts.
A funded netlib-java means that when you need hardware acceleration
for your new algorithm, it will already be available for you to use.
When you upgrade your hardware or operating systems, they will already
be supported.
Customer-facing businesses can focus on solving their customers' problems, instead of spending their precious human resources on low-level hardware integration and optimisation issues.
There are a few areas that require improvement:
- support for complex numbers
- support for hardware-specific memory regions and NIO
- an automated, continuous release process
- MultiBLAS
Complex numbers can only be supported by returning to the Fortran JVM
compiler at the core of netlib-java. The compiler hasn't seen any
code changes in nearly ten years and doesn't support a complex number
representation that maps onto accelerated hardware. A redesign of the
compiler would enable complex number support as well as simplifying
the build and supporting additional Fortran algorithms beyond BLAS,
LAPACK and ARPACK.
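To make the representation question concrete, here is a minimal sketch of one layout that native BLAS routines consume directly: complex values stored as adjacent (real, imaginary) pairs in a flat array, as Fortran does for its COMPLEX types. The class name `InterleavedComplexSketch` and the method `dotu` are hypothetical and are not part of netlib-java; the pure-Java loop only stands in for what an accelerated routine would compute.

```java
public class InterleavedComplexSketch {

    // Unconjugated dot product of two complex vectors of length n.
    // Each vector is stored interleaved: [re0, im0, re1, im1, ...].
    // Hypothetical helper for illustration; not a netlib-java API.
    public static double[] dotu(int n, double[] x, double[] y) {
        double re = 0.0;
        double im = 0.0;
        for (int i = 0; i < n; i++) {
            double xr = x[2 * i], xi = x[2 * i + 1];
            double yr = y[2 * i], yi = y[2 * i + 1];
            // (xr + i*xi) * (yr + i*yi)
            re += xr * yr - xi * yi;
            im += xr * yi + xi * yr;
        }
        return new double[] {re, im};
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0, 4.0}; // (1 + 2i), (3 + 4i)
        double[] y = {5.0, 6.0, 7.0, 8.0}; // (5 + 6i), (7 + 8i)
        double[] r = dotu(2, x, y);
        System.out.println(r[0] + " + " + r[1] + "i");
    }
}
```

Because the pairs are contiguous, an array in this layout could in principle be handed to a native routine without any conversion step; whether a redesigned compiler would adopt this layout is an assumption of the sketch.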
Bleeding edge GPUs allow direct access to their memory as if it were
in the CPU memory space. netlib-java could support this with a thin
native binding layer and direct memory access via NIO. This, again,
requires changes to the Fortran compiler.
netlib-java currently uses a kind of native memory access known as
critical access, which can delay garbage collection for long periods
and may result in OutOfMemory exceptions. Using NIO (as an
alternative to JVM-managed memory) would reduce the risk of such
corner-case problems.
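As a rough illustration of the NIO alternative, the sketch below allocates vectors off-heap with the standard `java.nio` direct buffer API. A native binding can obtain the address of a direct buffer through JNI's `GetDirectBufferAddress` instead of pinning a Java array with critical access. The class `DirectVectorSketch` and its methods are hypothetical names, and the pure-Java `dot` only stands in for a native call; nothing here reflects netlib-java's actual API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

public class DirectVectorSketch {

    // Allocates an off-heap buffer of n doubles in the platform's
    // native byte order, so native code could read it without conversion.
    public static DoubleBuffer allocate(int n) {
        return ByteBuffer.allocateDirect(n * Double.BYTES)
                .order(ByteOrder.nativeOrder())
                .asDoubleBuffer();
    }

    // Pure-Java dot product standing in for a native BLAS call.
    // Uses absolute gets, so buffer positions are left untouched.
    public static double dot(DoubleBuffer x, DoubleBuffer y) {
        int n = Math.min(x.capacity(), y.capacity());
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += x.get(i) * y.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        DoubleBuffer x = allocate(3);
        DoubleBuffer y = allocate(3);
        x.put(0, 1.0).put(1, 2.0).put(2, 3.0);
        y.put(0, 4.0).put(1, 5.0).put(2, 6.0);
        System.out.println("direct: " + x.isDirect());
        System.out.println("dot: " + dot(x, y));
    }
}
```

The trade-off is that direct buffers live outside the garbage-collected heap, so callers work with buffers rather than plain `double[]` arrays.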
Today, releasing an update to netlib-java requires three physical
computers and several hours of effort. With the advent of Docker and
cloud platforms, we can do better. There is no reason why we can't
provide binaries for any operating system that can be virtualised,
with snapshot releases on merges to master. The community then becomes
empowered and there is even less need for any of us to write
non-competitive tactical algorithms.
If you are interested in funding this initiative, please get in
touch by emailing me at [email protected]
Please note that I am only interested in serious financial commitments, not individual donations. If your organisation is unable to commit at least $10,000 to the pot, then I'm afraid the numbers just don't work out.
I am Sam Halliday. I am a
chartered mathematician and software engineer, based in London. I am
the author of netlib-java.