Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add BigInteger type #10750

Open
wants to merge 93 commits into
base: development
Choose a base branch
from

Conversation

flashultra
Copy link
Contributor

This pull request add BigInteger type in Haxe . The implementation is created by Chuck Batson and I have his permission to relicense his code under the Haxe license ( thank you @cbatson ! ).

Some things about the current implementation. I compare three BigInteger implementations - on Chuck ( https://github.com/cbatson/hxBitcoin/tree/master/com/fundoware/engine/bigint) , in haxe-crypto lib ( https://github.com/Jens-G/haxe-crypto/blob/master/src/com/hurlant/math/BigInteger.hx ) and littleBigInt ( https://github.com/maitag/littleBigInt/blob/master/src/BigInt.hx ).

The Chuck implementation give about 5x better performance and also is better than native Java implementation ( 3x times better !).

As a additional info - many languages have a BigInteger , such as Java, C#, Javascript

@skial skial mentioned this pull request Jul 7, 2022
1 task
@Aurel300
Copy link
Member

Aurel300 commented Jul 7, 2022

Thanks, this would be useful to have in the standard library indeed. There are a couple of high-level comments I have (I haven't yet looked through all of the implementation):

  • For targets that support big integers natively (which you mention at the end of your description), can we use the native implementation? If it is in fact slower, maybe we can have a define to switch between the two versions, similar to the JSON define?
  • There should be tests, for all the API methods.
  • We might need other types to be aware of BigInt. What about the seraliser? What about additional methods to read/write in Input/Output?
  • Could @cbatson himself comment on whether the permission was indeed given to relicense the code?

@flashultra
Copy link
Contributor Author

Thank you for reviewing this pull request.
About the comments:

  • Well, this is possible, but different implementation can expose different methods, so some methods can exist only in Java, but not in Javacript , c# or to have a different names ( example isProbablePrime, modPow, ModPo ..).
    So that change will require to match any specific implementation for each language or to expose only main methods which exist in all implementations. Not sure how feasible is that solution .As I mentioned for some targets the current implementation give a better results.
    However , in the current implementation are missing some important methods such as pow , modPow and etc. , so I can try to add it.

  • Yes, I should add test. Maybe after adding the missing methods .

  • Yes, I had not thought for that, but that is valid point. At the moment is possible to convert BigInt to haxe.io.Bytes , String , Vector , which give some possibilities for serialisation .

  • His permission is public and available here : Ask for integration cbatson/hxBitcoin#4 (comment)
    I'm sure he can comment too when have a good internet connectivity

@flashultra
Copy link
Contributor Author

I think this pull request can be reviewed and merged as I added the missing methods ( abs, pow, modPow, isProbablePrime, getLowestSetBit and bitLength) and all the tests for them.
Regarding the other questions :
1) BigInt native implementation
Currently, only three targets supports BigInt type natively - Java, C# and Javascript . For the first two it contains many methods ( pow, modpow, prime ...) , but for Javascript it's just an empty class that will required other methods to be implemented using javascript's native BigInt type .I'm not sure how feasible it is since the current implementation stores data in a Vector. For C# and Java - they do not support Muttable BigInt type (this PR adds immutable and mutable types - BigInt and MutableBigInt) . It might be possible to add only muttable as native type, but maybe in the future if someone is interested in that.
2) Serializer
There are several solutions for that - for example, like adding BigInt as a separate type to the Serializer class, perhaps similar to StringMap, IntMap, ObjectMap, which would require a change to the Serializer class, or simply serializing the BigInt as a class or bytes (there is a method to convert to bytes) .
This may be a separate PR in the future, and as I mentioned, it is currently possible to serialize a BigInt to Bytes.
3) Input/ Output
Internally BigInt keep the data in Vector. This could be ease converted to Bytes.
So I think it's possible someone to get the data and use it in Input/Output methods .
Another solution could be adding new methods writeBigInt / readBigInt ( with keeping the size of the Bigint ) , which will require change in Input /Output class in separate PR.

@cbatson
Copy link
Contributor

cbatson commented Jul 18, 2022

Yes, this does indeed have my blessing. Hope it's a useful addition. 👍

Thank you @flashultra for doing this work!

@flashultra
Copy link
Contributor Author

Php tests fails on another test. I think the bigint tests should pass the Php target.

@flashultra
Copy link
Contributor Author

All tests are green except PHP ( and mac-cpp which failed with other reason) .
For php test I tried to run on my server , but I've got the following error which is not related with BigInt.
I think it's safe to be merge this pull request.

Stack trace:
#0 /var/www/lib/Array_hx.php(11): php\Boot::php\{closure}()
#1 /var/www/index.php(9): include_once('...')
#2 /var/www/lib/unit/Test.php(629): {closure}()
#3 /var/www/lib/unit/Test.php(644): unit\Test::__hx__init()
#4 /var/www/index.php(9): include_once('...')
#5 /var/www/lib/unit/TestBigInt.php(23): {closure}()
#6 /var/www/index.php(9): include_once('...')
#7 /var/www/lib/BigIntTest.php(16): {closure}()
#8 /var/www/index.php(14): BigIntTest::main()
#9 {main} in /var/www//lib/Array_hx.php on line 11
PHP Fatal error:  During inheritance of JsonSerializable: Uncaught ErrorException: Return type of Array_hx::jsonSerialize() should either be compatible with JsonSerializable::jsonSerialize(): mixed, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in /var/www/lib/Array_hx.php:267

@flashultra
Copy link
Contributor Author

The above error is when I test with PHP 8.1.
I ran a new test with PHP 7.3 and all tests passed.
This PR is ready to be merged.

}

public static inline function fromInt(value:Int):BigInt {
return new BigInt(BigInt_.fromInt(value));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe some caching could be added for values e.g.from -10 to 10, so that in these cases no new instances are created but shared BigInts are re-used. In Java for example values from -16 to 16 are cached https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/share/classes/java/math/BigInteger.java#L949

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about that. What is the purpose of this caching in Java ? The only logic for that is if there are any operations and they often use values between -16 and 16. Otherwise , why not cache between -100 and 100 or any other values ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is about reducing garbage collector pressure from very commonly used values. why they decided for +/-16 I don't know. For the regular Java Integer type it is actually -128 to 128 https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/share/classes/java/lang/Integer.java#L638

Maybe we can returned the cached values for the once static variables are initialized anyways? I.e. ZERO, ONE, TWO, NEGATIVE_ONE (which I would actually call MINUS_ONE - a symbol that is 25 times more common when you do a GitHub code search).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the cache, since this is not my implementation, I just foundthat there is already a cache for values between -16 and 16 (

public static function fromInt(value:Int):BigInt_ {
) , the fromInt() method calls getCachedValue(...) ,so it caches all values ( -16...16) after first call .
I think this is a good implementation and does not require change

About NEGATIVE_ONE , I changed it to MINUS_ONE


/* Original code courtesy Chuck Batson (github.com/cbatson) */
@:allow(haxe.math.bigint)
abstract BigInt(BigInt_) {
Copy link
Contributor

@sebthom sebthom Mar 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible in case of Java to delegate to the BigInteger class? Mabye other targets also have native BigInt implementations that could be used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that maybe will be a good solution as Java implementation have a very optimize methods for modPow and pow . The problems here is about some methods that not exist in Java, and need to be reimplemented here ( withouth using the current BigInt_.hx class). Most methods exists, but miss some for random and random prime numbers.

As I mention in above comments , there is also native BigInteger implementation in Python and Javascript which should be used here too. Native javascript implementation give better performance, but as the Java many current methods should be rewritten as Javascript implementation doesn't have any,

C# also have BigInteger class , but I think the performance is not great.

However, in some tests the current BigInteger have same or better performance against Java ( when use some of those benchmarks : https://github.com/maitag/littleBigInt/tree/master/benchmarks ) . Also Karatsuba multiplication in native Java is slower than the current implementation ( but the have a slighty faster one using implementation of Toom–Cook multiplication )

At the moment , I can't apply native BigInteger implementation for Java, C#, Javascript or Python , but hope so someone else will add them in the future.

For me , the bigger drawback for Haxe implementation is the slower modPow ( and partly pow ) operations which is used for RSA .

@sebthom
Copy link
Contributor

sebthom commented Mar 22, 2023

@flashultra for the php job to succeed you may need to specify -d memory_limit=-1 here:

runCommand("php", generateArgs(binDir + "/index.php"));
changeDirectory(sysDir);
if(isCi())
deleteDirectoryRecursively(binDir);
runCommand("haxe", ["compile-php.hxml"].concat(prefix).concat(args));
runSysTest("php", generateArgs(binDir + "/Main/index.php"));

@flashultra
Copy link
Contributor Author

@sebthom Thank you for your help.
Cppia still fails. ( https://github.com/HaxeFoundation/haxe/actions/runs/4497491663/jobs/7913155711#step:11:2791 ) .
Do you have any idea what the problem could be?

@sebthom
Copy link
Contributor

sebthom commented Mar 23, 2023

"core dumped" sounds like a segmentation fault.

To be honest, I had no good experiences with Cppia myself. A lot of random crashes with it while things worked fine on all other targets. I gave up trying to support it in my projects.

maybe @hughsando has some ideas?

@Simn Simn added this to the Later milestone Mar 25, 2023
@flashultra
Copy link
Contributor Author

flashultra commented Jan 7, 2024

There is some problem with CI. Since I can't cancel the CI job, please someone to cancel CI #4960 as the new one (CI #4962) has been set.

Some thoughts on this pull request.
At the moment multiplication is very slow because it uses 16-bit operations.
Replacing it with 32-bit requires using Int64, but that's even slower in the tests I've done
Proper(native) Int64 for each target is required ( many targetd have such type - C,C++,C#, Java , JS(bigint) etc.).
The new modPow() method gives a better result, but is still slow (for the reason mentioned above).

Any futher optimization will require a native Int64 type and/or Toom-Cook / Schonhage-Strassen implementation.

@Simn
Copy link
Member

Simn commented Jan 7, 2024

CI is just completely broken at the moment it seems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants