PyPMML is making systematically off predictions with XGBoost PMML documents? #407

Closed

claudiocc1 opened this issue Jan 16, 2024 · 16 comments

@claudiocc1

Hi,

I am training a model with xgb.XGBClassifier.
I save the model in pmml format using sklearn2pmml.
I read back the model with pypmml.Model.fromFile.

I then compare the results from the original model (xgb) and the model read back from the file (pmml).

I compare the original xgb.predict and xgb.predict_proba calculations with the pmml-based ones (to get the equivalent of xgb.predict from pmml, I undo the sigmoid function).
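
For reference, a minimal sketch of what I mean by "undoing the sigmoid" (pmml_proba is just my name for the positive-class probability returned by pypmml):

import numpy as np

# Applying the logit (the inverse of the sigmoid) to the PMML probability
# recovers the raw boosted score, i.e. the equivalent of calling
# xgb.predict with output_margin=True
margin = np.log(pmml_proba / (1.0 - pmml_proba))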

I find that the results are not exactly the same. See these plots
http://tinyurl.com/9bjm6mbz

Interestingly, the difference in the "predict" quantity is quantized, see bottom right plot.
IMHO, the differences are too large to be due to machine precision in the leaf values.
It might be machine precision in the various "if this < that" conditions that can send the calculation down a different path. However, there are only 35 such conditions in the tree, yet more than half the differences are "off" from (about) zero. I say "about zero" because the most probable difference is actually not zero, but somewhat higher (again, see the bottom right plot).

Any idea of what is going on?

The dataset and the full code can be accessed from here
http://tinyurl.com/yp96ypc4

Some details about my setup.

Mac OS 14.2.1 M2 processor
python version = 3.11.6
xgboost version = 2.0.3
sklearn version = 1.3.2
sklearn2pmml version = 0.101.0
pypmml version = 0.9.17
java version = 1.8.0_401

(I get the same results with my other Mac, which is Intel based).

Thank you in advance

Claudio

@vruusmann
Member

vruusmann commented Jan 17, 2024

I save the model in pmml format using sklearn2pmml.
I read back the model with pypmml.Model.fromFile.

Dear people, please don't use the PyPMML package because it is known to be problematic! And if you still do, please don't complain about your issues in the JPMML software project, because the JPMML software project has got nothing to do with the PyPMML package!

Now, a bit more constructively:

  1. If you compare the contents of the PMML document with the XGBoost native dump (eg. in JSON and TXT formats; see the sketch below), then you will see that the PMML document is 100% accurate - all the numbers are 32-bit floats, and they match exactly.
  2. Please re-run your evaluations using the JPMML-Evaluator-Python package. Does the picture improve?
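
For point 1, a minimal sketch of how to produce the native dumps (assuming model is your fitted xgb.XGBClassifier; the file names are arbitrary):

# Dump the fitted model in TXT and JSON formats, then compare the split
# thresholds and leaf values against the corresponding elements in the PMML
booster = model.get_booster()
booster.dump_model("xgb_dump.txt")
booster.dump_model("xgb_dump.json", dump_format = "json")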

I would advise people to use the full JPMML stack (both on the converter and the evaluator side).

@vruusmann vruusmann changed the title xgb.XGBClassifier loaded as pmml does not give exact results PyPMML is making systematically off predictions with XGBoost PMML documents? Jan 17, 2024
@claudiocc1
Author

Thank you for your reply!
I never used PMML before and I appreciate your help.

You suggest using the JPMML stack for both the converter and the evaluator.
I used sklearn2pmml for the converter; if there is a better way, can you give me a hint?

Anyway, using the PMML file from sklearn2pmml, I find that the evaluations from PyPMML and JPMML-Evaluator-Python are the same within differences of typically 10^{-9}. See http://tinyurl.com/5n7vddh8
The code to make this comparison is now also in the http://tinyurl.com/yp96ypc4 folder.

BTW: If I try to use JPMML-Evaluator-Python on an ARM Mac, Java crashes and the Python interpreter also crashes at the make_evaluator step.
On Intel Mac it works, and that is where I performed the test mentioned above.
It does not just crash with my PMML file but also on this one
https://github.com/jpmml/jpmml-evaluator-python/blob/master/jpmml_evaluator/tests/resources/DecisionTreeIris.pmml
when I follow verbatim the example in
https://github.com/jpmml/jpmml-evaluator-python/blob/master/README.md#workflow

I tried two Intel Macs and one ARM Mac. I will have access to another ARM Mac tomorrow, and I will try on that as well, in case my java setup is messed up, but it is a fresh install.
As mentioned in my first post, I am on java 8 (1.8.0_401). Maybe I should try a newer version of java for ARM?

Thank you again.

Claudio

@vruusmann
Member

vruusmann commented Jan 17, 2024

BTW: If I try to use JPMML-Evaluator-Python on an ARM Mac, Java crashes and the Python interpreter also crashes at the make_evaluator step

It crashes because the default JPype backend (which implements Python-to-Java connectivity over JNI) is looking for a JNI native library that isn't available for the Mac ARM architecture. This is something that the JPype developers must fix; we can't help it.

Similarly, Mac ARM won't work with the PyJNIus backend either (also, caused by a missing JNI native library).

The only option that works is the Py4J backend (this is also what PyPMML uses).

You can switch between backends like this:

from jpmml_evaluator import make_evaluator

evaluator = make_evaluator("DecisionTreeIris.pmml", backend = "py4j") \
	.verify()

It's generally recommended to keep using the JPype backend when the underlying architecture supports it. Right now, the only troublemaker is Mac ARM; all the others should be supported.

@vruusmann
Member

I find that the evaluations from PyPMML and JPMML-Evaluator-Python are the same within differences of typically 10^{-9}. See http://tinyurl.com/5n7vddh8

The important thing about XGBoost models is that they default to 32-bit math operations (aka floats).

The underlying JPMML-XGBoost library encodes this "math context hint" using a Model@x-mathContext="float" vendor extension attribute. When the JPMML-Evaluator-Python library sees this type hint, it switches from 64-bit math operations to 32-bit ones.

PyPMML doesn't know about this type hint, and carries out all operations using 64-bit math. Essentially, it adds one float's worth of "extra precision" where it shouldn't exist. These are just meaningless numbers.

Thus, your graph should be interpreted as follows: JPMML-Evaluator-Python gives the reference prediction using 32-bit math, and then PyPMML simply appends around ~1e-8 worth of "noise" to it.
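
A quick way to see the magnitude of this effect (a sketch of the two code paths, not the actual implementations):

import numpy as np

# The PMML document stores the boosted score as a 32-bit float
margin = np.float32(0.7)

# 32-bit sigmoid, as performed by JPMML-Evaluator-Python for x-mathContext="float"
p32 = np.float32(1) / (np.float32(1) + np.exp(-margin))

# 64-bit sigmoid, as performed by PyPMML (the type hint is ignored)
p64 = 1.0 / (1.0 + np.exp(-np.float64(margin)))

print(p64 - float(p32))  # on the order of 1e-8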

The handful of values where JPMML-Evaluator-Python and PyPMML agree (to the right in your figure) must be some very common (integer-like?) values such as 0 or 1, or something similar.

@vruusmann
Member

The main issue about "XGBoost native" vs "XGBoost over PMML" reproducibility concerns the use of post-transformations such as the sigmoid function.

If the model does not contain any post-transformation (eg. a linear regression), then the results come out identical almost always (ie. within 1-2 ULPs). However, when you bring in a post-transformation, then the gap widens, especially if there are exponentiation operations involved.

I haven't investigated this issue in much detail lately. But I recall that in earlier XGBoost versions they were using some non-standard sigmoid implementation (arbitrarily mixing 32-bit and 64-bit math operations), which simply couldn't be translated to PMML as-is. The JPMML-XGBoost library was emitting its markup under the assumption that all math operations will take place using 32-bit math operations.

@vruusmann
Member

vruusmann commented Jan 17, 2024

TLDR: If you're interested in testing reproducibility, then you should first establish the baseline using the linear regression (eg. reg:squarederror objective function).

You will see that XGBoost and XGBoost-via-PMML reproduce fine when the task is about (weighted-) summing member decision tree contributions.
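
A sketch of such a baseline setup (assuming X and y are your training data; the pipeline and file names are arbitrary):

from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline
from xgboost import XGBRegressor

# No post-transformation: the prediction is just the weighted sum of leaf values
pipeline = PMMLPipeline([("regressor", XGBRegressor(objective = "reg:squarederror"))])
pipeline.fit(X, y)
sklearn2pmml(pipeline, "XGBRegressor.pmml")

# Evaluate XGBRegressor.pmml (eg. with JPMML-Evaluator-Python), and compare
# against pipeline.predict(X) - the two should agree within 1-2 ULPs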

If your task gets more complicated, such as involving a post-transformation on the boosted value, then you'll start seeing some systematic errors due to pure/mixed 32-bit/64-bit math operations.

IIRC, I'd argue that XGBoost post-transformations cannot be re-implemented in pure Java.

@vruusmann
Member

Some of my integration test resources are here:
https://github.com/jpmml/jpmml-xgboost/blob/master/pmml-xgboost/src/test/resources/main.py

@claudiocc1
Author

Thank you for your explanations!

Indeed changing the backend for Apple silicon works.
It may be nice to put a warning in the README, or in the help for make_evaluator.
Maybe even a hardwired check/warning based on platform.processor()?

At some point I might do some of the tests you suggest, for fun, and will let you know if I see any surprises.

C.

@vruusmann
Member

@claudiocc1 Could you copy&paste me the full stack traces that you're experiencing on Mac ARM here: jpmml/jpmml-evaluator-python#21

Please post one for the JPype backend, and another one for the PyJNIus backend (I assume that they are a little different):

from jpmml_evaluator import make_evaluator

# The first troublemaker
# Calling the Evaluator.verify() method is needed to actually load and initialize the model to its fullest
evaluator = make_evaluator("DecisionTreeIris.pmml", backend = "jpype") \
	.verify()

# The second troublemaker 
evaluator = make_evaluator("DecisionTreeIris.pmml", backend = "pyjnius") \
	.verify()

I don't have access to Mac ARM myself.

@claudiocc1
Author

In http://tinyurl.com/57fhxpd6 you should find

  • XXX_terminal_messages.txt ... the messages that I get when I run from the terminal as a script. It also makes a log file
  • *.log ... the logfiles mentioned above (from XXX_terminal_messages.txt you should figure out what log file goes with what test)
  • XXX_jupyter.txt ... the error messages that the system asks me whether I want to send to Apple when the interpreter crashes

XXX = jpype or pyjnius

C.

@vruusmann
Member

Got the files, appreciate your efforts!

So, looks like the underlying Java.exe process crashes with a segmentation fault, and then brings down the parent Python process as well? There are no Python or Java errors to catch (eg. for intelligent error handling purposes), everything just goes offline?

Very intriguing. That leaves OS/architecture pre-detection as the only prevention measure. Can't do anything after the fact, because the system is down.

Will try to figure something out over the weekend, and then ask you to confirm my fix(es).

@claudiocc1
Author

Yes, Python crashes as well.

I also tried a try-except construct but it did not seem to change things.

Actually, before you told me about the ARM issue, I was trying to figure out my problem by changing this and that, which meant that I made python/java crash several times. IIRC, in a few instances within jupyter I saw a proper java stack trace. It seems like python is trying to print the java stack trace, but in most cases it crashes before it is done with that (I don't claim to understand).
Yesterday when I was getting the info for you I repeated the test a few times hoping to see the stack trace but I never saw it.

FWIW, since I switch between Intel and ARM machines depending on where I am, I added these lines to my code and they seem to work.

import os
import platform

# This is the default backend in make_evaluator
backend = 'jpype'

# On Mac OS, the JAVA_HOME environment variable needs to be set explicitly
if platform.system() == 'Darwin':
    os.environ['JAVA_HOME'] = '/Library/Internet Plug-Ins/JavaAppletPlugin.plugin/Contents/Home/'

    # Java for Apple silicon does not have the JNI native library needed
    # for Python-to-Java connectivity. Set the backend to the only one
    # that works on Apple silicon
    if platform.processor() == 'arm':
        backend = 'py4j'

jpmml = make_evaluator('xgb.pmml', backend=backend).verify()

PS: When testing "this and that" I also tried changing the backend to pyjnius, but not to py4j. Sigh.

PPS: The explicit setting of JAVA_HOME is necessary for reading the pmml file with jpmml-evaluator-python, but not for reading it with pypmml or for writing it with sklearn2pmml. Not that it matters, but I do not understand that either (I don't speak java...)

C.

@vruusmann
Member

The explicit setting of JAVA_HOME is necessary for reading the pmml file with jpmml-evaluator-python but not for reading it with pypmml or for writing it with sklearn2pmml.

This JAVA_HOME requirement also comes from using the JNI communication approach. Both JPype and PyJNIus want to communicate with the Java/JVM "directly"; therefore, they approach it using native libraries (which are located via JAVA_HOME), not the java executable.

The Py4J backend, and the PyPMML and SkLearn2PMML packages, directly execute the java executable using the standard popen function. So, this depends more on the correct PATH environment variable being set (and not JAVA_HOME anymore).

Nevertheless, this is a very interesting observation from your side. Perhaps the backend auto-detecting algorithm should try to detect both the platform system/processor and JAVA_HOME. If the latter is not set, then everything should gracefully fall back to the Py4J backend, as this is the most robust option.
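
Such a pre-detection could look something like this (a hypothetical sketch, not actual jpmml_evaluator code):

import os
import platform

def detect_backend():
    # JNI-based backends (JPype, PyJNIus) need a JNI native library for the
    # current architecture, plus JAVA_HOME for locating the JVM libraries
    if platform.system() == "Darwin" and platform.processor() == "arm":
        return "py4j"
    if "JAVA_HOME" not in os.environ:
        # The Py4J backend executes the java command via popen, relying on PATH
        return "py4j"
    return "jpype"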

@claudiocc1
Author

Without setting JAVA_HOME, even with the default backend, python would not crash. The call to make_evaluator would instead return some lengthy error message, including a suggestion, something like "maybe you need to set JAVA_HOME". I guess in this case the java process stopped and returned an error to python before it had a chance to crash.

Through a somewhat painful series of google searches I figured out where JAVA_HOME should point to. At least on my current systems. As I said, I never use java, so I was flying blind.

@Thrameos
Copy link

I would love for JPype to better support Mac ARM, but it is a system issue: the OS is not loading the libjvm from Python, likely due to some OS-level trust flags. Thus it isn't a bug in JPype or something we can solve. We just call the shared-library load and hope it works.

@vruusmann
Member

Thanks to everybody for their insightful comments about JNI on Mac ARM issues.

However, I'm closing this ticket, and will be making progress under a different ticket (jpmml/jpmml-evaluator-python#21), where there is an actual opportunity to make code changes.

I don't want to have PyPMML and other unrelated projects occupying my territory for no good reason.
