Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Many methods accept both numpy arrays and Python lists are parameters, but the method signatures only mention numpy arrays. Can this be improved? #584

Closed
hx2A opened this issue Dec 31, 2024 · 32 comments · Fixed by #589
Labels
enhancement New feature or request question Further information is requested

Comments

@hx2A
Copy link
Collaborator

hx2A commented Dec 31, 2024

As mentioned in a discussion thread:

I can import it, I just don't know how to use it; looks complicated to me. I don't understand why it appears to be required:
set_fills_err

Originally posted by @vsquared in #577 (reply in thread)

There are methods that accept both numpy arrays and Python lists are parameters, but the method signatures only mention numpy arrays. Can this be improved without making the signatures nasty type Unions? What is most clear for users?

@hx2A hx2A added enhancement New feature or request question Further information is requested labels Dec 31, 2024
@vsquared
Copy link

vsquared commented Dec 31, 2024

I think it's questionable whether you even need set_fills(). So far I can't come up with a demo which shows how to turn it loose on a shape and expect it to find all the faces. It looks at the number of vertices and not the number of faces. For example, a cube has 6 faces, but only 4 vertices for each face and it generates this error if you call it on a PShape cube:
e

@hx2A
Copy link
Collaborator Author

hx2A commented Dec 31, 2024

I think it's questionable whether you even need set_fills().

I agree, and I didn't want to add set_fills() to py5. However, py5's trimesh integration needs it to efficiently convert trimesh objects to Py5Shape objects. So its primary purpose is to be used inside py5. It does pair well with Py5Shape's vertices(). After creating a shape with all of the vertices using a call to vertices(), you can then set the fills with set_fills().

It looks at the number of vertices and not the number of faces. For example, a cube has 6 faces, but only 4 vertices for each face and it generates this error if you call it on a PShape cube

You'd need to get vertex count with get_vertex_count() to find out the correct number. The fill color is assigned to each vertex individually.

@GoToLoop
Copy link

GoToLoop commented Dec 31, 2024

Would that set_fills() accept an argument of type array.array as well?

Maybe even Java arrays returned from Java functions; or created by java.lang.reflect.Array.newInstance() static method?

@villares
Copy link
Collaborator

I think set_fills() is a bit hard to use but it could be useful "performance wise", I'm for a long time planning to explore making particles and flocking with numpy and it will help me render the results more efficiently updating a Py5Shape object with triangles (or so I hope).

An alternative would be to have some vectorized drawing methods that can take shapes + attributes (this is something we discussed when talking about rendering geodataframes...).

It would be cool if points() could have a keyword argument for passing stroke colors.
Crazy idea... can you imagine triangles() that took coordinates + attributes?

@hx2A
Copy link
Collaborator Author

hx2A commented Dec 31, 2024

Would that set_fills() accept an argument of type array.array as well?

Maybe even Java arrays returned from Java functions; or created by java.lang.reflect.Array.newInstance() static method?

Yes, absolutely! It can work with Java arrays. Adding that to the signatures would make them even more complicated though.

My preference is to work with numpy arrays. JPype does let you create numpy arrays that are backed by a Java array. It uses memory buffers. The data will be read only but there is no memory copying, so the performance is very fast.

@hx2A
Copy link
Collaborator Author

hx2A commented Dec 31, 2024

I think set_fills() is a bit hard to use but it could be useful "performance wise", I'm for a long time planning to explore making particles and flocking with numpy and it will help me render the results more efficiently updating a Py5Shape object with triangles (or so I hope).

Well, one of the difficulties is because of the bug. It works well for what it was designed to do, which is efficiently set the fill colors during shape creation so py5 can quickly convert Trimesh objects to something py5 can use.

An alternative would be to have some vectorized drawing methods that can take shapes + attributes (this is something we discussed when talking about rendering geodataframes...).
It would be cool if points() could have a keyword argument for passing stroke colors.
Crazy idea... can you imagine triangles() that took coordinates + attributes?

I would suggest using hybrid coding for this. It's like a way of making fast and efficient Java extensions for py5. They can be customized for your use case.

https://py5coding.org/content/hybrid_programming.html

@villares
Copy link
Collaborator

I would suggest using hybrid coding for this. It's like a way of making fast and efficient Java extensions for py5. They can be customized for your use case.

It occurred to me this would be a way of doing it, probably "the" way, but I was too lazy to engage with these ideas. Maybe in 2025 :D

@hx2A
Copy link
Collaborator Author

hx2A commented Dec 31, 2024

It occurred to me this would be a way of doing it, probably "the" way, but I was too lazy to engage with these ideas. Maybe in 2025 :D

2025 will be the year of big things!

@GoToLoop
Copy link

GoToLoop commented Jan 1, 2025

Adding that to the signatures would make them even more complicated though.

Why not instead use a broader datatype hint like Iterable[int]?

I haven't checked set_fills()'s implementation; but my guess would be it simply iterates over its fills parameter, right?

If it requires direct access to fills's indices, a more specific, but still broad, type hint Sequence[int] would be adequate enough I believe.

I've done some tests on how "jpype" perceives Java's arrays & arraylists:

from array import array
from typing import Iterable, Sequence

import jpype # from 'jpype1' package

# Function to check if an object is a Java array
def is_java_array(obj):
    return obj.getClass().isArray() if isinstance(obj, jpype.JObject) else False

# Start the JVM
jpype.startJVM()

# Import the Array classes
Array = jpype.JClass('java.lang.reflect.Array')
Arrays = jpype.JClass('java.util.Arrays')
Vector = jpype.JClass('java.util.Vector')

# Create a Python tuple for testing
python_tuple = 10, 20, 30, 40, 50

# Create a Java ArrayList from the Python tuple
java_list = Arrays.asList(python_tuple)

# And convert it to a Java Vector, which is also a List-like container
java_list = Vector(java_list)

# Create a Java array and populate it w/ the contents of the Java Vector
java_array = Array.newInstance(jpype.JInt, len(java_list))
for i in range(len(java_array)): java_array[i] = java_list[i].intValue()

# Create a Python array for testing using the Java array
python_array = array('i', java_array)

# Test the function is_java_array()
print(java_array, len(java_array), is_java_array(java_array), type(java_array)) # True
print(java_list, len(java_list), is_java_array(java_list), type(java_list)) # False
print(python_tuple, len(python_tuple), is_java_array(python_tuple), type(python_tuple)) # False
print(python_array, len(python_array), is_java_array(python_array), type(python_array)) # False

# Check if Python believes a Java array is an Iterable or a Sequence
print("\nJava's array is an Iterable:", isinstance(java_array, Iterable)) # True
print("Java's array is a Sequence:", isinstance(java_array, Sequence)) # False

# Check if Python believes a @RandomAccess or a List-like Java container is an Iterable or a Sequence
print("\nJava's ArrayList is an Iterable:", isinstance(java_list, Iterable)) # True
print("Java's ArrayList is a Sequence:", isinstance(java_list, Sequence), '\n') # False

# Check if a Java array behaves as a Python Iterable
for int in java_array: print(int)

# Shutdown the JVM (optional)
jpype.shutdownJVM()

It outputs the following:

[10, 20, 30, 40, 50] 5 True <java class 'int[]'>
[10, 20, 30, 40, 50] 5 False <java class 'java.util.Arrays.Vector'>
(10, 20, 30, 40, 50) 5 False <class 'tuple'>
array('i', [10, 20, 30, 40, 50]) 5 False <class 'array.array'>

Java's array is an Iterable: True
Java's array is a Sequence: False

Java's ArrayList is an Iterable: True
Java's ArrayList is a Sequence: True

10
20
30
40
50

Seems like "jpype" recognizes a Java array as an Iterable, but not as a Sequence though.

So I've opened an issue about it on "jpype"'s repo:
jpype-project/jpype#1252

@hx2A
Copy link
Collaborator Author

hx2A commented Jan 5, 2025

I haven't checked set_fills()'s implementation; but my guess would be it simply iterates over its fills parameter, right?

Here is the link to that code. It just uses a for loop to iterate over the passed array. It is much more efficient to iterate through a list like this in Java. The alternative is to loop in Python and make many individual calls to Java to change one vertex color at a time. That is slow.

Why not instead use a broader datatype hint like Iterable[int]?

This is a good idea! Thank you for the suggestion. I am looking into making this change now.

@hx2A
Copy link
Collaborator Author

hx2A commented Jan 5, 2025

There are many methods like points() with parameters that must have two dimensions, not one.

https://py5coding.org/reference/sketch_points.html

The type hint there could perhaps be Iterable[Iterable[float]], but that might suggest that a jagged array would be acceptable, and it is not. The second dimension must have 2 or 3 columns. But then again, npt.NDArray[np.floating] doesn't communicate that either, and doesn't even make it clear it needs to have 2 dimensions.

A list of tuples does actually work for points() though. I just tested it. Do other people think Iterable[Iterable[float]] would be a better type hint here?

@villares
Copy link
Collaborator

villares commented Jan 5, 2025

A list of tuples does actually work for points() though. I just tested it. Do other people think Iterable[Iterable[float]] would be a better type hint here?

Yep, I've used lists of tuples quite a few times with points(). I'm not the best person to give opinions on type hints, but for me it would look very reasonable!

@GoToLoop
Copy link

GoToLoop commented Jan 5, 2025

This is a good idea! Thank you for the suggestion. I am looking into making this change now.

Interesting, so the whole set_fills(), and other similar methods, are split in Python & Java code!

After taking a careful look at both implementation halves, I believe Iterable[int] isn't an adequate type hint for it.

B/c parameter fills has to be forwarded to the Java half as a Sequence[int], which "jpype" would auto-convert to an actual Java int[] array.

IMO, it's unlikely users would pass an actual generator/iterable as argument to set_fills().

Most users would just pass a list argument; while I would personally prefer a tuple instead. ;-)

So such lines like if isinstance(fills, types.GeneratorType): fills = list(fills) are just wasting CPU power.

Rather, for the rare cases a user would wanna pass a generator to set_fills(), the user should use the unpack/spread operator * over the argument inside a [] or () like this:
[*gen_arg] or (*gen_arg,).

Or explicitly call list() or tuple() over the argument.

BtW, I did some quick refactoring for both halves. What do you think?

https://github.com/py5coding/py5generator/blob/py5-1292-fork-0024-0.10.4a2/py5_resources/py5_module/py5/shape.py#L173-L177

    def set_fills(self, fills: Sequence[int], /) -> None:
        """$class_Py5Shape_set_fills"""
        if fills: _Py5ShapeHelper.setFills(self._instance, fills)

https://github.com/py5coding/py5generator/blob/py5-1292-fork-0024-0.10.4a2/py5_jar/src/main/java/py5/core/Py5ShapeHelper.java#L113-L127

  public static void setFills(final PShape s, final int[] fills) {
    if (fills.length != s.getVertexCount()) throw new RuntimeException(
      "the length of parameter fills (" + fills.length +
      ") must be equal the number of vertices in the shape (" + s.getVertexCount() + ")"
    );

    for (int i = 0; i < fills.length; s.setFill(i, fills[i++]));
  }

@villares
Copy link
Collaborator

villares commented Jan 5, 2025

IMO, it's unlikely users would pass an actual generator/iterable as argument to set_fills().

I quite like the idea of being able to use a generator expressions, and this is exactly what I did here but I would accept the limitation if it would improve things otherwise.

@hx2A
Copy link
Collaborator Author

hx2A commented Jan 5, 2025

I quite like the idea of being able to use a generator expressions, and this is exactly what I did here but I would accept the limitation if it would improve things otherwise.

I agree. I added that code that converts a generator to a list because JPype won't automatically do that to a generator in its attempt to convert the parameter to a Java array. Instead it will give a confusing error.

Interesting, so the whole set_fills(), and other similar methods, are split in Python & Java code!

Yes! And much of py5 is like this. Nobody interacts with the PApplet class directly; there are Java Sketch and SketchBase classes that help facilitate a comfortable interaction between Python and Java. On the Python side there is auto-generated code for much of it with extra decorators added to make small adjustments. The whole thing works very well and is easy to maintain.

for (int i = 0; i < fills.length; s.setFill(i, fills[i++]));

Ah, I like that! I never thought to add code into a for loop like that.

@GoToLoop
Copy link

GoToLoop commented Jan 5, 2025

I quite like the idea of being able to use generator expressions,

Oh, I see! You have code lines like these there:
icosa.set_fills(random_color() for _ in range(60))

Although those could easily be changed to a "list" comprehension syntax by using []:
icosa.set_fills([random_color() for _ in range(60)])

Which would relief those methods the need to have if isinstance(fills, types.GeneratorType): fills = list(fills) inside them.

B/c regardless, your generator comprehension argument would still need to be checked and then converted to a list or tuple before invoking the internal Java method.

And for everything else, there's still the permanent cost of always checking if it happens to be a generator instead of a sequence container.

@villares
Copy link
Collaborator

villares commented Jan 5, 2025

On the other hand, @GoToLoop, Python's list() & tuple() are "indempotent" so tuple(tuple(a)) == tuple(a) or list(list(a) == list(a)and so no need for an if there.

@hx2A
Copy link
Collaborator Author

hx2A commented Jan 6, 2025

And for everything else, there's still the permanent cost of always checking if it happens to be a generator instead of a sequence container.

Yes, but for most cases the time to do the memory copy to send the data from Python to Java will be larger, so it's OK.

@hx2A hx2A mentioned this issue Jan 6, 2025
@hx2A
Copy link
Collaborator Author

hx2A commented Jan 6, 2025

This is now fixed.

Updated reference docs with all of the changes have been pushed to the dev website. EG:

http://dev.py5coding.org/reference/py5shape_set_fills.html

Here's the full list of methods with new signatures:

Py5Graphics.bezier_vertices()
Py5Graphics.curve_vertices()
Py5Graphics.lines()
Py5Graphics.mask()
Py5Graphics.points()
Py5Graphics.quadratic_vertices()
Py5Graphics.text()
Py5Graphics.text_width()
Py5Graphics.vertices()
Py5Image.mask()
Py5Shader.set()
Py5Shape.bezier_vertices()
Py5Shape.curve_vertices()
Py5Shape.get_vertex_codes()
Py5Shape.quadratic_vertices()
Py5Shape.set_fills()
Py5Shape.set_path()
Py5Shape.set_strokes()
Py5Shape.vertices()
Sketch.bezier_vertices()
Sketch.create_font()
Sketch.curve_vertices()
Sketch.lines()
Sketch.points()
Sketch.quadratic_vertices()
Sketch.text()
Sketch.text_width()
Sketch.vertices()

There also were a couple of signature bugs I picked up on along the way. Revisiting this part of py5 helped improve the docs quite a bit. Thanks to everyone who contributed to the discussion here and elsewhere.

@hx2A hx2A closed this as completed Jan 6, 2025
@GoToLoop
Copy link

GoToLoop commented Jan 6, 2025

AFAIK, neither lists, tuples, arrays, etc. are Iterator types, but Iterable & Sequence!

from typing import Iterator, Iterable, Sequence

empty_list, empty_tuple = [], tuple()

# False True True
print(isinstance(empty_list, Iterator), isinstance(empty_list, Iterable), isinstance(empty_list, Sequence))

# False True True
print(isinstance(empty_tuple, Iterator), isinstance(empty_tuple, Iterable), isinstance(empty_tuple, Sequence))

But IMO, Sequence would be the best datatype hint for them.

@hx2A
Copy link
Collaborator Author

hx2A commented Jan 7, 2025

AFAIK, neither lists, tuples, arrays, etc. are Iterator types, but Iterable & Sequence!

Ooops, it looks like I confused Iterator and Iterable. It's been a long day. I'm glad you are awake.

But IMO, Sequence would be the best datatype hint for them.

I just looked up the difference between Sequence and Iterable. Generators are Iterable but not Sequences. Although Py5Shape.set_fills() and others check for generators and converts to a list before passing to Java via JPype, there are some methods listed above that don't do that check, and it would be tedious and difficult to put that everywhere. JPype doesn't do that for you probably because it wants to get the length first to allocate the Java array. So it seems Sequence is the most correct datatype to use here.

@hx2A hx2A reopened this Jan 7, 2025
@villares
Copy link
Collaborator

villares commented Jan 7, 2025

If you always converted things with tuple() upfront (I've heard somewhere it is not that costly...) you could accept generators and iterables in general...

@hx2A
Copy link
Collaborator Author

hx2A commented Jan 7, 2025

If you always converted things with tuple() upfront (I've heard somewhere it is not that costly...) you could accept generators and iterables in general...

Right, but to add that to everywhere it would need to be added would be tedious and difficult, and right now I'm not sure I should have added that conversion to the places where it has been added already. I don't think it is a good idea to modify the user's passed parameters. If I have a generator and pass it to one of these methods, the generator will be "spent" as you cannot rewind a generator. That's probably why JPype doesn't do this when you pass a generator. It's a weird side effect. I think it would be better to make the user add brackets ie [gen] for generator gen because then it is more clear and explicit what is happening.

@villares
Copy link
Collaborator

villares commented Jan 7, 2025

Hmmm... I'm not sure I understand your reasoning... this part of the Python vocabulary is rather confusing anyway. But yes the generator will always be consumed/spent and this is expected either way, converting it to a tuple or inside a loop etc. but never mind! Any solution you adopt, if it is well documented will be fine!

@hx2A
Copy link
Collaborator Author

hx2A commented Jan 7, 2025

Well I guess if someone writes code with the generator in the method call, like this:

shape.set_fills(py5.color(py5.random(255), 0, 0) for _ in range(shape.get_vertex_count()))

people would expect the generator would be created and consumed right there.

But if the generator comes from somewhere else:

gen = (py5.random(255), 0, 0) for _ in range(shape.get_vertex_count()))

...

shape.set_fills(gen)
# gen is now empty

then it could be unexpected. But on the other hand, assigning generator objects to variables like that is a more complex thing to do, and someone doing that would be well aware of what shape.set_fills(gen) would do to the generator.

It does bother me a bit that there is an inconsistency. For example, right now you can pass a generator to Py5Shape.vertices() and Py5Graphics.vertices() but not Sketch.vertices(). But after thinking about it a bit, I see can correct that part. But I just don't want generators to be allowed everywhere, because that would get ridiculous.

@hx2A hx2A mentioned this issue Jan 8, 2025
@hx2A hx2A closed this as completed in #589 Jan 8, 2025
@hx2A
Copy link
Collaborator Author

hx2A commented Jan 8, 2025

OK, now this is fixed. Everything that was Iterator before is now Sequence.

http://dev.py5coding.org/reference/py5shape_set_fills.html

It does bother me a bit that there is an inconsistency. For example, right now you can pass a generator to Py5Shape.vertices() and Py5Graphics.vertices() but not Sketch.vertices(). But after thinking about it a bit, I see can correct that part

Turns out I already thought of this and took care of it a long time ago.

So although Python lists do work for these methods, numpy arrays are still the preferred way to go here because the performance will be much better.

@GoToLoop
Copy link

GoToLoop commented Jan 8, 2025

, numpy arrays are still the preferred way to go here because the performance will be much better.

What if we pass an actual Java int[] or float[] array?
Would JPype be able to pass those Java arrays to a Java method w/o any conversion?

@hx2A
Copy link
Collaborator Author

hx2A commented Jan 8, 2025

What if we pass an actual Java int[] or float[] array? Would JPype be able to pass those Java arrays to a Java method w/o any conversion?

Yes, that would also work fine, and would be fast and efficient because there is no memory copying from numpy to Java. Although you pointed out that Java arrays are not Sequences. Now I see your point about why they should be. They meet the requirements.

Personally I find working with Java arrays in Python to be irritating. In particular, in Java, pixels are the one dimensional array of ints. Pixels in numpy with np.uint8 data type makes much more sense. But there are cases where the performance gain would make using Java arrays more than worth it.

@GoToLoop
Copy link

GoToLoop commented Jan 8, 2025

Although you pointed out that Java arrays are not Sequences. Now I see your point about why they should be. They meet the requirements.

It means for the time being, if we prefer to pass a Java array, we'll have to silence the linter for the calling line, until JPype devs add Sequence type for Java arrays.

BtW, now that you have type Sequence for all those functions, do you still intend to keep the extra "generator to list" conversion check lines?

@hx2A
Copy link
Collaborator Author

hx2A commented Jan 8, 2025

It means for the time being, if we prefer to pass a Java array, we'll have to silence the linter for the calling line, until JPype devs add Sequence type for Java arrays.

Oh right, linters will pick up that.

BtW, now that you have type Sequence for all those functions, do you still intend to keep the extra "generator to list" conversion check lines?

Yes, absolutely. I would not revoke functionality like that. I can't easily change the type hints on those to Iterable but generators will continue to work.

@GoToLoop
Copy link

GoToLoop commented Jan 8, 2025

Yes, absolutely. I would not revoke functionality like that.

So the actual type hint for set_fills()'s fills parameter is still: Sequence[int] | Generator[int]. ;-)

And IDE linters will complain about this @villares's code line icosa.set_fills(random_color() for _ in range(60)); b/c the argument isn't a Sequence[int], but a Generator[int] instead; even though the method would handle that regardless.

In other words, the type Sequence[int] is the official type, while Generator[int] is the undocumented 1.

@hx2A
Copy link
Collaborator Author

hx2A commented Jan 9, 2025

In other words, the type Sequence[int] is the official type, while Generator[int] is the undocumented 1.

Yes, that's a good way of putting it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request question Further information is requested
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants