-
Notifications
You must be signed in to change notification settings - Fork 32
Developer Tips for Python code in HATCHet
HATCHet
has evolved over time, and includes certain patterns that can aid with addition of new features (and refactoring/cleanup of existing HATCHet code). A non-exhaustive list of these recommendations would include:
- Always try to add URLs/"magic constants" and any other tweakable knobs in
hatchet.ini
, in an existing or new section. If a key is added in a section[mysection]
asmykey = 42
, it is then available to the entireHATCHet
codebase as follows:
from hatchet import config
print(config.mysection.mykey + 1) # prints 43
Note that values are automatically cast as int
, float
or str
, as appropriate. A missing value is interpreted as None
. Looking for a missing key will raise an Exception. Further, at runtime, this configuration can be overridden (by the developer or the end-user), by setting HATCHET_MYSECTION_MYKEY
environment variable.
-
Do you wish to check for certain conditions and raise Exceptions if they are not met? Use the
ensure
function inhatchet.utils.Supporting
if possible. -
Want to run an external command, or a series of commands, optionally saving stdout/stderr, and check if they ran correctly? See the
hatchet.utils.Supporting.run
function to see if it will work for you. -
Trying to write your own multiprocessing code? Struggling with capturing exceptions thrown in the parallel code? This is often done in the
HATCHet
codebase, so we've recently introduced a submodule that may help. The essential idea is this:
from hatchet.utils.multiprocessing import Worker
class MyWorker(Worker):
def __init__(self, x):
"""
Save any attributes you wish to maintain internal state,but don't ever modify them in work()!
"""
self.x = x
def work(self, *args):
# Do any work you wish to do, consulting class attributes if you wish
# return a result
return self.x + args[0]
worker = MyWorker(x=42)
work = [1, 2, 3, 5] # arguments specifying the work that needs to be done
# run up to 8 parallel instances of worker; this call waits for all workers to finish.
results = worker.run(work=work, n_instances=8)
# Results will conform to the same ordering as the input arguments
# regardless of how long each worker took to complete the job
assert results == [43, 44, 45, 47]
# Any exception thrown in any worker is propagated to the worker.run() call
-
Do you see that you're writing (or have detected in HATCHet code) the same logic more than once? Consider introducing a new module or function in
hatchet.utils
and re-using it. Use sensible default arguments to lessen code noise! -
No commented-out code please! Commented-out code tends to rot much faster since it goes untouched during testing, results in unused variables and imports, and it's almost always unclear why it even exists. If you wish to point something out in the code or maintain a TODO for your future self, please use Github Issues instead. One easy way to do this is to navigate to the source on Github, click on the
line number -> Reference in new issue..