-
Notifications
You must be signed in to change notification settings - Fork 244
api: Add (rudimentary) autopickling support #2775
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
mloubout
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comments
| pkl_files = glob.glob(os.path.join(self._directory, '*.pkl')) | ||
|
|
||
| self.registry.update({ | ||
| os.path.basename(pkl_file)[:-4]: pkl_file for pkl_file in pkl_files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
os.path.splitext(pkl_file)
| if not configuration['autopickling']: | ||
| return None | ||
|
|
||
| tic = time() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
process_time is better.
|
|
||
| tic = time() | ||
|
|
||
| if not self._initialized: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should just be in init
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2775 +/- ##
==========================================
+ Coverage 83.04% 83.07% +0.02%
==========================================
Files 248 248
Lines 50452 50564 +112
Branches 4439 4447 +8
==========================================
+ Hits 41900 42005 +105
- Misses 7791 7798 +7
Partials 761 761
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| # Maybe lookup an existing Operator from disk | ||
| name = kwargs.get('name', default_op_name) | ||
| op = autopickler.maybe_load(name) | ||
| if op is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this "log" at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I agree that it should log that it is using a pickled version (and which one it found)
|
|
||
| toc = time() | ||
|
|
||
| perf(f"Operator `{name}` unpickled from disk in {toc - tic:.2f} s") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah ok, this is fine, then I was expecting sth like that, ignore the above
| # Maybe lookup an existing Operator from disk | ||
| name = kwargs.get('name', default_op_name) | ||
| op = autopickler.maybe_load(name) | ||
| if op is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I agree that it should log that it is using a pickled version (and which one it found)
| return super().__new__(cls, **kwargs) | ||
|
|
||
| # Maybe lookup an existing Operator from disk | ||
| name = kwargs.get('name', default_op_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure using the name is the ideal approach, given that this is often left as the default "kernel". Maybe hash the args instead somehow?
Side note: is this behaviour turned off by default?
Side side note: do the pickled operators get cleared automatically, or is this left to the user?
Yet another side note: can the user specify a path to their pickled operator dir?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see the word "rudimentary" in the title and the PR description which I'm reporting below
This is intended mostly for developers... for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, in that case as long as you don't see it as a potential source of mistakes/bother, I'm not overly concerned
| def _initialize(self): | ||
| # Search the `autopickling` temporary directory for pickled Operators | ||
| # and maintain a registry of them. Each pickled Operator is uniquely | ||
| # identified by a name; this might not be enough to avoid name clashes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about name_dsf8s7 where the suffix is the first n characters of the hash of the args the operator was constructed with?
| default_op_name = "Kernel" | ||
|
|
||
|
|
||
| class Autopickler: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this live in its own file? Operator.py is getting pretty big
|
|
||
|
|
||
| @pytest.fixture | ||
| def purged_autopickling_dir(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe make a helper function to do this, akin to clear_cache()?
This is intended mostly for developers... for now