Talking to Numpy #10

Open
vegabook opened this issue Sep 23, 2018 · 4 comments

Comments

@vegabook

vegabook commented Sep 23, 2018

Brilliant library. Of course NumPy has two decades of accumulated functionality on top, so there's still a lot of stuff Python-side that I'd want to use.

How can Matrex talk to NumPy efficiently via, say, ErlPort? ErlPort returns Python values as follows:

Erlang/OTP 21 [erts-10.0.8] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [hipe]
Interactive Elixir (1.7.3) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> {:ok, pid} = :python.start()                   
{:ok, #PID<0.176.0>}
iex(2)> xx = :python.call(pid, :py, :eig, [10])        
{{:"$erlport.opaque", :python,
  <<128, 2, 99, 110, 117, 109, 112, 121, 46, 99, 111, 114, 101, 46, 109, 117,
    108, 116, 105, 97, 114, 114, 97, 121, 10, 95, 114, 101, 99, 111, 110, 115,
    116, 114, 117, 99, 116, 10, 113, 1, 99, 110, 117, 109, 112, 121, ...>>},
 {:"$erlport.opaque", :python,
  <<128, 2, 99, 110, 117, 109, 112, 121, 46, 99, 111, 114, 101, 46, 109, 117,
    108, 116, 105, 97, 114, 114, 97, 121, 10, 95, 114, 101, 99, 111, 110, 115,
    116, 114, 117, 99, 116, 10, 113, 1, 99, 110, 117, 109, 112, ...>>}}
iex(3)> xx = :python.call(pid, :py, :eig_msgpack, [10])
<<146, 133, 164, 116, 121, 112, 101, 164, 60, 99, 49, 54, 164, 107, 105, 110,
  100, 160, 162, 110, 100, 195, 165, 115, 104, 97, 112, 101, 145, 10, 164, 100,
  97, 116, 97, 218, 0, 160, 61, 85, 216, 39, 118, 191, 18, 64, 0, 0, 0, 0, ...>>
iex(4)> Msgpax.unpack!(xx)
[
  %{
    "data" => <<61, 85, 216, 39, 118, 191, 18, 64, 0, 0, 0, 0, 0, 0, 0, 0, 152,
      188, 99, 203, 39, 212, 235, 191, 0, 0, 0, 0, 0, 0, 0, 0, 176, 49, 23, 6,
      60, 177, 190, 63, 82, 69, 170, 45, 5, 196, 229, 63, ...>>,
    "kind" => "",
    "nd" => true,
    "shape" => '\n',
    "type" => "<c16"
  },
  %{
    "data" => <<24, 3, 175, 142, 209, 167, 205, 191, 0, 0, 0, 0, 0, 0, 0, 0, 78,
      49, 82, 243, 187, 186, 202, 191, 0, 0, 0, 0, 0, 0, 0, 0, 24, 28, 80, 170,
      100, 115, 198, 191, 68, 9, 19, 108, 248, 3, 178, ...>>,
    "kind" => "",
    "nd" => true,
    "shape" => '\n\n',
    "type" => "<c16"
  }
]

As you can see, calling eig, which doesn't use msgpack, just returns an opaque binary blob (a pickled NumPy object) for each array. However, if we msgpack the NumPy arrays first, we get a more structured return value, which might be useful. Could the "data" field go into a Matrex matrix? How would one go about doing that?
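One possible route, sketched on the Python side with only the standard library: the "type" field "<c16" means little-endian complex128, so "data" is a run of (real, imaginary) float64 pairs, and the "shape" printed as '\n' is just how IEx shows the list [10]. The sketch below splits those pairs and re-packs each part as float32 behind a two-integer header. That target layout (uint32 rows, uint32 columns, float32 body, all little-endian) is an assumption about Matrex's binary format that should be checked against the Matrex docs; the function names are hypothetical, and the block targets Python 3 rather than the Python 2 used in this post.

```python
import struct

def split_complex128(data):
    """Decode a little-endian complex128 ("<c16") buffer into (reals, imags)."""
    if len(data) % 16 != 0:
        raise ValueError("complex128 buffer length must be a multiple of 16")
    count = len(data) // 8  # number of float64 values (two per complex number)
    flat = struct.unpack("<%dd" % count, data)
    return list(flat[0::2]), list(flat[1::2])

def pack_float32_matrix(values, rows, cols):
    """Pack values using the ASSUMED layout: uint32 rows, uint32 cols, float32 body."""
    if len(values) != rows * cols:
        raise ValueError("shape does not match number of values")
    return struct.pack("<II%df" % len(values), rows, cols, *values)

# First two eigenvalues, copied from the start of the first "data" binary above.
sample = bytes([61, 85, 216, 39, 118, 191, 18, 64, 0, 0, 0, 0, 0, 0, 0, 0,
                152, 188, 99, 203, 39, 212, 235, 191, 0, 0, 0, 0, 0, 0, 0, 0])

reals, imags = split_complex128(sample)
real_blob = pack_float32_matrix(reals, 1, 2)  # 1 x 2 row vector of real parts
imag_blob = pack_float32_matrix(imags, 1, 2)  # 1 x 2 row vector of imaginary parts
```

Keeping the real and imaginary parts as two separate blobs avoids silently dropping the imaginary components, which eig can return as non-zero for a non-symmetric matrix. Narrowing float64 to float32 does lose precision.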

Here by the way is the Python code:

from __future__ import print_function
import numpy as np
import pdb
import IPython
import string
import msgpack
import msgpack_numpy as m
m.patch() # patch msgpack to do numpy

def eig(n):
    np.random.seed(8472)
    xx = np.random.rand(n * n).reshape(n, n)
    yy = np.linalg.eig(xx)
    return yy

def eig_msgpack(n):
    return msgpack.packb(eig(n))

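def dicadd(d):
    # " ".join works on Python 2 and 3 (string.join is Python 2 only);
    # the parameter is renamed so it no longer shadows the built-in dict.
    return {" ".join(d.keys()): sum(d.values())}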

And here are my mix.exs deps:

  defp deps do
    [
      {:erlport, "~> 0.10.0"},
      {:benchwarmer, "~> 0.0.2"},
      {:msgpax, "~> 2.0"}
    ]
  end

@versilov
Owner

Hello!

Thanks for your praise!

What you suggest sounds like a nice idea.
I looked at the NumPy binary format several months ago. If I remember correctly, it should be perfectly compatible with the new Matrex format (the new version is in the 'array' branch), because that format was inspired by NumPy.

In this format the data is stored separately in binary form, and meta fields such as shape, data type, strides, etc. are stored as fields of a struct.
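To make that layout concrete, here is a minimal Python sketch of a NumPy-style descriptor: a raw byte buffer plus shape, dtype and strides. The class and field names are hypothetical and are not taken from the 'array' branch; the sketch only shows how the meta fields locate an element inside the flat binary.

```python
import struct
from dataclasses import dataclass
from typing import Tuple

@dataclass
class ArrayDescriptor:
    data: bytes               # flat element buffer
    shape: Tuple[int, ...]    # size of each dimension
    dtype: str                # NumPy-style type code, e.g. "<f4"
    strides: Tuple[int, ...]  # bytes to step per dimension

def row_major_strides(shape, itemsize):
    """Strides in bytes for a C-ordered (row-major) array."""
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))

def byte_offset(desc, index):
    """Byte offset of the element at the given multi-dimensional index."""
    return sum(i * s for i, s in zip(index, desc.strides))

# A 2 x 3 float32 matrix holding 0.0 .. 5.0 in row-major order.
values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
desc = ArrayDescriptor(
    data=struct.pack("<6f", *values),
    shape=(2, 3),
    dtype="<f4",
    strides=row_major_strides((2, 3), 4),
)

# Read element [1, 2] straight out of the buffer via its strides.
element = struct.unpack_from("<f", desc.data, byte_offset(desc, (1, 2)))[0]
```

Because the buffer carries no header of its own, the same bytes can be handed to another runtime unchanged, as long as both sides agree on the meta fields.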

The 'array' branch is generally ready for merging, except that I am stuck on visualizing multi-dimensional matrices in ASCII.

@vegabook
Author

vegabook commented Sep 24, 2018

I'll take a look.

Also there's Apache Arrow, which seems to be gaining traction with the Python Ray framework and GPU data frames. It's basically being written by the guy who wrote Pandas. I'll dig around and see how compatible it is with your format.

Personally I would love to be able to do the maximum I can in Elixir. For production use cases, Python is just so stone age now. It's great that you're addressing what to me is the Erlang ecosystem's single biggest weakness.

@devnacho

Hey @versilov,

First of all thank you for this amazing library! 👏

This idea by @vegabook sounds quite interesting. Are there any updates?

@versilov
Owner

Hello @devnacho!

Actually, the Matrex format in the 'array' branch is already very close to NumPy's.
The data is stored separately, and meta fields such as size, type, strides, etc. are members of a map.

I still have not merged this branch into master, because I am stuck on the ASCII display of multi-dimensional matrices.

Guess I should release it with broken multi-dimensional display.

elcritch added a commit to elcritch/matrex that referenced this issue Jan 7, 2020
# This is the 1st commit message:

porting numerix stats libraries

# This is the commit message versilov#2:

adding power function

# This is the commit message versilov#3:

fixing power

# This is the commit message versilov#4:

porting statistics

# This is the commit message versilov#5:

porting over statistics

# This is the commit message versilov#6:

fix apply

# This is the commit message versilov#7:

fix algos

# This is the commit message versilov#8:

fixing algos

# This is the commit message versilov#9:

test

# This is the commit message versilov#10:

updating stats test

# This is the commit message versilov#11:

switch type to matrex

# This is the commit message versilov#12:

fixing more tests

# This is the commit message versilov#13:

matrex

# This is the commit message versilov#14:

updates

# This is the commit message versilov#15:

removing experiment

# This is the commit message versilov#16:

try as row matrix by def

# This is the commit message versilov#17:

add vector pattern match

# This is the commit message versilov#18:

adding pseudo-vector type

# This is the commit message versilov#19:

fix order for column-wise vector

# This is the commit message versilov#20:

fix order for column-wise vector

# This is the commit message versilov#21:

fix order for column-wise vector

# This is the commit message versilov#22:

removing extras deps