-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ValueError: could not convert string to int, better type checks for OOBTree keys under Python 3? #272
Comments
Andreas Jung wrote at 2019-6-17 21:28 -0700:
Plone 5.2RC3, ZODB 5.5.1, Python 3.7
During a Plone migration over plone.restapi I received this random ZODB error:
...
2019-06-18 00:02:35,161 ERROR [ZODB.Connection:809][waitress] Couldn't load state for BTrees.OOBTree.OOBTree 0x52b7
Traceback (most recent call last):
File "/home/ajung/sandboxes/ugent-longtime/eggs/ZODB-5.5.1-py3.7.egg/ZODB/Connection.py", line 795, in setstate
self._reader.setGhostState(obj, p)
File "/home/ajung/sandboxes/ugent-longtime/eggs/ZODB-5.5.1-py3.7.egg/ZODB/serialize.py", line 633, in setGhostState
state = self.getState(pickle)
File "/home/ajung/sandboxes/ugent-longtime/eggs/ZODB-5.5.1-py3.7.egg/ZODB/serialize.py", line 626, in getState
return unpickler.load()
ValueError: could not convert string to int
2019-06-18 00:02:35,281 ERROR [Zope.SiteErrorLog:251][waitress] 1560808955.19451550.22340673761002505 http://localhost:30081/plone_portal/lw/geschiedenis/en/POST_application_json_
Traceback (innermost last):
...
Module Products.PluginIndexes.unindex, line 240, in index_object
Module Products.PluginIndexes.unindex, line 276, in _index_object
Module Products.PluginIndexes.unindex, line 216, in insertForwardIndexEntry
Module ZODB.Connection, line 795, in setstate
Module ZODB.serialize, line 633, in setGhostState
Module ZODB.serialize, line 626, in getState
ValueError: could not convert string to int
The index likely contains both "int" and "string" values.
Under Python 3, `BTrees` can no longer have keys of different types:
`BTrees` are based on an order and Python 3 (unlike Python 2)
does not order objects of different types.
|
That does not explain how this corruption of self._index can occur. |
Andreas Jung wrote at 2019-6-17 22:26 -0700:
> The index likely contains both "int" and "string" values. Under Python 3, `BTrees` can no longer have keys of different types: `BTrees` are based on an order and Python 3 (unlike Python 2) does not order objects of different types.
That does not explain how this corruption of self._index can occur.
I assume that there are indexed objects which provide index values
of different types (at least "int" and "str"). Unter Python 2,
this was no problem - and therefore might have been undetected.
Up to you to verify the assumption.
The problem might come from the migration instead.
The migration uses heuristics to promote Python 2's `str`
to Python 3's `str`. I do not know what it does if the heuritics fail,
maybe it converts Python 2's `str` to Python 3 `bytes`, instead.
Important operations on Python 3 `bytes` return `int` which might
explain a convertion to `int` (even though this does not seem likely).
I would approach the problem as follows:
Load the pickles (!) stored for OID 0x52b7.
Use "pickletools.dis" to analyse the pickles.
Watch out especially for different types related to the `BTrees` keys.
|
Andreas Jung wrote at 2019-6-17 23:05 -0700:
> Load the pickles (!) stored for OID 0x52b7.
How do you load the raw pickle?
I look at how the ZODB does it.
`ZODB.Connection`:
...
def get(self, oid):
...
p, _ = self._storage.load(oid)
obj = self._reader.getGhost(p)
Note that the pickle returned by `self._storage.load` actually consists
of 2 adjacent pickles - the first describing the class, the second the state.
You will likely want to use `ZODB.utils.p64` to convert the number
"0x52b7" into a (true) OID.
|
|
The analysis is like this: The issue is clearly an index corruption that happened within a Plone migration after a series of plone.restapi calls. The affected field index is department which is defined in two content-types as a Choice field connected to a vocabulary. However the affected Python pickle of the underlaying index structure indicates that a content object was indexed which has a parent folder with ID department. So the problem is caused clearly caused on the application level through acquistion. |
Andreas Jung wrote at 2019-6-17 23:38 -0700:
```
(Pdb) pickletools.dis(pickle)
0: \x80 PROTO 3
2: c GLOBAL 'BTrees.OOBTree OOBTree'
26: q BINPUT 0
28: . STOP
highest protocol among opcodes = 2
(Pdb) for x in pickletools.genops(pickle): print(x)
(<pickletools.OpcodeInfo object at 0x7f1b86dad048>, 3, 0)
(<pickletools.OpcodeInfo object at 0x7f1b86da7d08>, 'BTrees.OOBTree OOBTree', 2)
(<pickletools.OpcodeInfo object at 0x7f1b86da7ac8>, 0, 26)
(<pickletools.OpcodeInfo object at 0x7f1b86dad0a8>, None, 28)
```
I have told you that the returned pickle consists of *2* adjacent
pickles (first class, second state).
Above you show the not very interesting first pickle.
You must look at the second one.
Apparently, the first pickle is 29 bytes long. This would
mean that the second pickle starts at byte 30.
Alternatively, you could wrap your pickle into a file like
object (likely `io.BytesIO`) and call "dis" 2 times on it.
|
No further analysis needed since the reason for the corruption appears to be clear now. |
Andreas Jung wrote at 2019-6-17 23:54 -0700:
...
However I raise the question how the ZODB - OOBTrees under Python 3 in particular - could be improved in order to avoid this kind of corruption. The OOBTree implementation would have to perform type checks...
With a complete Python 3 history, the corruption would have been prevented:
you would not have been able to mix different key types in the same
`BTree`. The corruption could only happen because Python 2 has been
less strict than Python 3 and you are now interpreting a Python 2 generated
`BTree` under Python 3.
Should you really want to support Python 2 created `BTrees`
under Python 3, then `BTrees` would need to implement the old
Python 2 comparison for objects of different types.
I doubt however, that there is general interest.
|
There is no Python 2 in the game here. The ZODB was created from scratch under Python 3.7/Plone 5.2RC3 |
Andreas Jung wrote at 2019-6-18 00:27 -0700:
There was no Python 2 in the game here. The ZODB was created from scratch under Python 3.7/Plone 5.2RC3
That is very strange. In order to insert a key into a tree,
the new key must be compared with (some of) the current keys (to find out
where the new key is stored). Python 3 cannot compare
objects of different types:
```
>> from BTrees.OOBTree import OOBTree
>> t=OOBTree()
>> t[1] = 1
>> t[''] = ''
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'int' and 'str'
```
Maybe, you can use `dm.historical` to find out which transaction
has introduced the inconsistency and the states before and after
this transaction. This should allow to analyse further how
the inconsistency could happen.
|
Andreas Jung wrote at 2019-6-18 00:27 -0700:
There was no Python 2 in the game here. The ZODB was created from scratch under Python 3.7/Plone 5.2RC3
You might have been hit by the issue described in
"zopefoundation/zodbpickle#47":
"... manifests in a random byte replacing a correct one in a pickle, ..."
|
I stumbled on a similar issue here. |
NicolasGoeddel wrote at 2020-11-27 09:08 -0800:
I stumbled on a similar issue here.
I have a field index `myIndex` and depending on the portal_type I sometimes add a list of strings to it and sometimes just a string. This seems not to work anymore. This happened while migrating a project from Plone 4 to Plone 5.2.2 with Python 3.8.
According to your responses so far it seems that I can no longer mix these types within one index, am I right?
You are right but this is not a ZODB issue but one of `BTrees` (and
Python 3).
`BTrees` maintain their key/value pairs essentially in a key ordered
structure. Therefore, it is important that all keys can be (completely)
ordered.
With Python 2, objects of unrelated types could be (artificiantly) ordered.
Python 3 has abolished this hack. Therefore, all keys in
a `BTree` must now have order compatible types. As an exception, `None` can
cohabitate with other keys (even it Python 3 does not define such
an ordering).
|
Plone 5.2RC3, ZODB 5.5.1, Python 3.7
During a Plone migration over plone.restapi I received this random ZODB error, not sure how to reproduce:
The text was updated successfully, but these errors were encountered: