Allow 3.7 Pickles to be Loaded in 3.8 #406
Hi @mmckerns - can I get a review for this? I’m just moving around a few lines of existing code so I don’t think the code coverage CI failure is meaningful. Thanks -
Impact of the change needs to be assessed on the various use cases.
Is your description still correct? It seems that all this PR does is to move
The key bit is line 585 - when the deserialisation code sees a CodeType, it now uses the backcompat logic in _create_code to interpret it. Previously dill would just try to read the CodeType using the default logic, which fails on 3.8 when reading a 3.7 pickle because the CodeType constructor signature has changed.
Here's an example of the current behavior:
and then in 3.8
Can you provide a case where
@mmckerns I've pushed a test case - it fails if you comment out the handler.
def lambda_a():
    pkl = os.path.join(
        os.path.dirname(__file__),
        "lambda.pkl")
It would be much better to write the file contents elsewhere, instead of relying on a stored pickle file. Was the file written in python 3.7? dill tests are currently run with 2.7, 3.6, 3.7, 3.8, 3.9, 3.10, pypy27, pypy36, and pypy37. Is the test only supposed to run with 3.8?
You don't need to add a test case into the code at the moment, if it's difficult. Rather, just present the details in the main conversation of the GitHub issue, so that I and others can reproduce what you are seeing.
The file was written with python 3.7 / dill 0.3.0. I think that’s why you can’t put the file contents there (and need the binary) - the bug seems to be that dill isn’t backwards compatible.
The pickled object is just lambda x: x.
Here's an example using lambda in dill master with 3.7:
Python 3.7.10 (default, Mar 18 2021, 06:11:04)
[Clang 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> dill.dump(lambda x:x, open('test.pkl', 'wb'))
and loading with 3.8...
Python 3.8.10 (default, May 7 2021, 23:18:56)
[Clang 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> f = dill.load(open('test.pkl', 'rb'))
>>> f(4)
4
>>>
What is your PR doing that is not possible currently?
You need to dump with dill 0.3.0 and python 3.7, not master/3.7 to reproduce.
So... you are saying that the issue is that old pickles from python 3.7 created with dill 0.3.0 don't unpickle in python 3.8 with dill master. I'm assuming this is also the case for other old versions of dill (before _create_function was recently modified).
Am I correct in thinking that you could, as a workaround, load the pickle in 3.7 with dill master, and then dump it again... then the resulting file would be able to be opened with 3.8?
My 3.8 workaround is just to set:
dill._dill._reverse_typemap['CodeType'] = dill._dill._create_code
much easier than re-serialising all the pickle files I have lying around :)
Riiight, of course... hence the PR. I never really considered it; however, I'm wondering if adding others of the _create_ functions to the reverse_typemap is worth investigating. I'm not certain of what functionality it might impact.
Use _create_code logic when loading pickled objects rather than just the builtin CodeType. If the pickle file was created using, e.g., Python 3.7 then the serialized object will contain 15 arguments (missing co_posonlyargcount) but the version in the current (Python 3.8) interpreter expects 16. This PR just fills in a zero, reusing the existing logic in _dill.py.
Related issues: #357 #318 #394 cloudpipe/cloudpickle#396 python/cpython#12701 facebookincubator#39