Conversation
Added `workflow_dispatch` event to trigger the workflow manually.
Convert WFSC snippet artifact writes from pickle to HDF5 and add an HDF5 serializer/deserializer for the `out` object tree. Also:
- add a functional test readback assertion for WFSC FLC
- wire h5py into requirements, setup.py, and the CI workflow
- keep the full-workspace pickle path unchanged
- include the earlier WFSC estimator shape fix and warning-string fixes

Co-authored-by: GitHub Copilot <copilot[bot]@users.noreply.github.com>
Task transcript and timing metadata
Timing
Notes on timing:
Conversation transcript (best-effort full transcript from available chat context)
Work summary included in PR
Performance results (same functional test)
Security assessment
Pull request overview
This PR migrates WFSC “snippet” artifact persistence from pickle to HDF5, adding an HDF5 serializer/deserializer and updating the functional regression test and dependency wiring accordingly.
Changes:
- Write WFSC snippet artifacts as `*_snippet.h5` via a new `falco.hdf5utils` HDF5 object-tree serializer.
- Extend `test_wfsc_flc` to load the HDF5 snippet and assert key fields match the in-memory `out`.
- Add `h5py` to dependency/install wiring (requirements, setup, and CI workflow), plus small related fixes (estimator shape, docstring escapes).
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `falco/wfsc.py` | Switch snippet persistence from pickle to HDF5 (`*_snippet.h5`) using `save_object_to_hdf5`. |
| `falco/hdf5utils.py` | New HDF5 serializer/deserializer for `falco.config.Object` and common container/scalar types. |
| `tests/functional/test_wfsc_flc.py` | Load the snippet HDF5 after the WFSC loop and assert readback equivalence for selected fields. |
| `falco/est.py` | Adjust Epix zero-shape for the estimator path to avoid shape-related errors. |
| `falco/plot.py` | Escape-sequence tweak in a plot label to avoid string/doc warnings. |
| `falco/proper/prop_psd_errormap.py` | Escape-sequence tweaks in docstring math blocks to avoid warnings. |
| `requirements.txt` | Add pinned `h5py` dependency. |
| `setup.py` | Add `h5py` to `install_requires`. |
| `.github/workflows/pyfalco_test.yml` | Install `h5py` in CI and add `workflow_dispatch`. |
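For orientation, a minimal sketch of the new write path. Only `save_object_to_hdf5` and the `*_snippet.h5` naming come from the PR; the construction of `out` and the argument order are assumptions:

```python
import os

import falco.config
import falco.hdf5utils as hdf5utils

# Sketch only: assumes falco.config.Object() constructs empty and that
# save_object_to_hdf5 takes (path, obj); the real call site in falco/wfsc.py
# may differ.
out = falco.config.Object()  # stands in for the WFSC `out` results tree
out.Nitr = 5                 # illustrative field
fn_snippet = os.path.join("/tmp", "series0001_trial0001_snippet.h5")
hdf5utils.save_object_to_hdf5(fn_snippet, out)
```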
```diff
@@ -611,7 +611,7 @@ def pairwise_probing(mp, ev, jacStruct=np.array([])):
     # If <2 probe pairs had good measurements, can't do pinv. Leave Eest as zero.
     if NpairsGood < 2:
         zerosCounter = zerosCounter + 1
```
In the `NpairsGood < 2` branch, `zerosCounter` is incremented twice (`zerosCounter = zerosCounter + 1` and then `zerosCounter += 1`). This will double-count the number of zeroed pixels and make the summary printout inaccurate. Remove one of the increments (and prefer a single `+= 1`).
Suggested change:
```diff
-        zerosCounter = zerosCounter + 1
```
(removing this line leaves the single `zerosCounter += 1` increment elsewhere in the branch)
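A self-contained demonstration of the double-count; names mirror the hunk, and the loop stands in for the per-pixel estimator path:

```python
# Each "bad" pixel hits both increments, so the summary reports twice
# the true count of zeroed pixels.
zerosCounter = 0
bad_pixels = 3
for _ in range(bad_pixels):
    zerosCounter = zerosCounter + 1  # first increment (the one to remove)
    zerosCounter += 1                # second increment (keep this one)
assert zerosCounter == 2 * bad_pixels  # 6 reported instead of 3
```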
```python
value = node[()]

if node_type == "bytes":
    return bytes(np.asarray(value, dtype=np.uint8).tolist())
```
`_read_node` reconstructs byte datasets via `bytes(np.asarray(value, dtype=np.uint8).tolist())`, which is very slow and memory-heavy for larger byte payloads. Prefer a zero-copy-ish approach like `np.asarray(value, dtype=np.uint8).tobytes()` (or `value.tobytes()` if already uint8).
Suggested change:
```diff
-    return bytes(np.asarray(value, dtype=np.uint8).tolist())
+    return np.asarray(value, dtype=np.uint8).tobytes()
```
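For reference, both reconstructions produce identical bytes; the difference is purely cost (a Python-int list round trip versus one contiguous buffer copy):

```python
import numpy as np

payload = b"snippet bytes payload"
value = np.frombuffer(payload, dtype=np.uint8)  # as stored in the HDF5 dataset

slow = bytes(value.tolist())  # materializes a Python list of ints first
fast = value.tobytes()        # single memcpy-style buffer copy
assert slow == fast == payload
```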
```python
def _write_node(parent, name: str, value) -> None:
    # falco.config.Object: write as a group and recurse over its backing dict.
    if isinstance(value, falco.config.Object):
        group = parent.create_group(name)
        group.attrs[_NODE_TYPE_ATTR] = "falco_object"
        for key, sub_value in value.data.items():
            _write_node(group, str(key), sub_value)
        return

    if isinstance(value, dict):
        group = parent.create_group(name)
        group.attrs[_NODE_TYPE_ATTR] = "dict"
        for key, sub_value in value.items():
            _write_node(group, str(key), sub_value)
        return

    if isinstance(value, (list, tuple)):
        group = parent.create_group(name)
        group.attrs[_NODE_TYPE_ATTR] = "tuple" if isinstance(value, tuple) else "list"
        # Zero-padded item keys preserve element order on readback.
        for index, sub_value in enumerate(value):
            _write_node(group, f"item_{index:06d}", sub_value)
        return

    if isinstance(value, np.ndarray):
        dataset = parent.create_dataset(name, data=value)
        dataset.attrs[_NODE_TYPE_ATTR] = "ndarray"
        return

    if value is None:
        group = parent.create_group(name)
        group.attrs[_NODE_TYPE_ATTR] = "none"
        return

    if isinstance(value, str):
        dataset = parent.create_dataset(
            name, data=np.array(value, dtype=h5py.string_dtype("utf-8"))
        )
        dataset.attrs[_NODE_TYPE_ATTR] = "str"
        return

    if isinstance(value, bytes):
        dataset = parent.create_dataset(name, data=np.frombuffer(value, dtype=np.uint8))
        dataset.attrs[_NODE_TYPE_ATTR] = "bytes"
        return

    if isinstance(value, (np.generic, bool, int, float, complex)):
        dataset = parent.create_dataset(name, data=np.asarray(value))
        dataset.attrs[_NODE_TYPE_ATTR] = "scalar"
        return

    if isinstance(value, types.SimpleNamespace):
        group = parent.create_group(name)
        group.attrs[_NODE_TYPE_ATTR] = "simple_namespace"
        for key, sub_value in vars(value).items():
            _write_node(group, str(key), sub_value)
        return

    raise TypeError(f"Unsupported type for HDF5 serialization: {type(value)}")
```
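For context, the public entry point presumably wraps `_write_node` along these lines; this is a sketch, and the actual `save_object_to_hdf5` in `falco/hdf5utils.py` may differ in signature and root-group naming:

```python
import h5py


def save_object_to_hdf5(filepath, obj, root_name="out"):
    """Sketch: serialize an object tree into a fresh HDF5 file via _write_node."""
    with h5py.File(filepath, "w") as h5file:
        _write_node(h5file, root_name, obj)
```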
`falco/hdf5utils.py` introduces a general-purpose serializer/deserializer with multiple supported node types (Object/dict/list/tuple/ndarray/scalars/str/bytes/SimpleNamespace), but the PR only adds a functional assertion on a couple of fields. Please add a focused unit test that round-trips representative values for each supported type (including nested containers) and asserts that unsupported types raise `TypeError`; this will help prevent silent format regressions.
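One possible shape for such a test. The loader name `load_object_from_hdf5` and the `(path, obj)` argument order are assumptions; only `save_object_to_hdf5` and the `TypeError` behavior are confirmed by the PR:

```python
import types

import numpy as np
import pytest

from falco import config, hdf5utils


def test_hdf5_round_trip(tmp_path):
    # Cover each supported node type, including nested containers.
    obj = config.Object()
    obj.scalar = 3.5
    obj.flag = True
    obj.text = "alpha"
    obj.raw = b"\x00\x01\x02"
    obj.arr = np.arange(6.0).reshape(2, 3)
    obj.nothing = None
    obj.nested = {"items": [1, (2.0, "b")], "ns": types.SimpleNamespace(n=2)}

    path = str(tmp_path / "round_trip.h5")
    hdf5utils.save_object_to_hdf5(path, obj)
    restored = hdf5utils.load_object_from_hdf5(path)  # hypothetical loader name

    assert restored.scalar == obj.scalar
    assert restored.text == obj.text
    assert restored.raw == obj.raw
    np.testing.assert_array_equal(restored.arr, obj.arr)
    assert restored.nested["items"][1] == (2.0, "b")


def test_unsupported_type_raises(tmp_path):
    obj = config.Object()
    obj.bad = object()  # not in the serializer's type whitelist
    with pytest.raises(TypeError):
        hdf5utils.save_object_to_hdf5(str(tmp_path / "bad.h5"), obj)
```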
Summary
This PR converts a localized pickle workflow to HDF5 while keeping behavior equivalent and test-backed.
What changed
- Convert snippet artifact writes from `*_snippet.pkl` to `*_snippet.h5`: `falco/wfsc.py`
- Add an HDF5 serializer/deserializer for the `out` object graph: `falco/hdf5utils.py`
- Add a functional test readback assertion: `tests/functional/test_wfsc_flc.py`
- Wire up the `h5py` dependency: `requirements.txt`, `setup.py`, `.github/workflows/pyfalco_test.yml`

Included related fixes from the same working session
- Estimator shape fix for a `TypeError` in `test_wfsc_flc`: `falco/est.py`
- Escape-sequence warning fixes: `falco/plot.py`, `falco/proper/prop_psd_errormap.py`

Functional equivalence
Validated with the existing regression test: `tests/functional/test_wfsc_flc.py`

Performance measurement
Measured test runtime with `/usr/bin/time -p python -m pytest tests/functional/test_wfsc_flc.py -q`. Results:

- Before: `real 135.22s` (pytest: `134.72s`)
- After: `real 133.83s` (pytest: `133.40s`)

Observed delta: `-1.39s` wall time (about `1.0%` faster). This is small and within normal noise for long functional runs, with no regression.

Security assessment
Before (pickle snippet path):
- Snippet artifacts used `pickle.dump`/`pickle.load`; deserialization can execute arbitrary code if the file is attacker-controlled.
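For context, a self-contained illustration of pickle's load-time code execution (standard `__reduce__` behavior, not specific to FALCO):

```python
import pickle

class Malicious:
    def __reduce__(self):
        # pickle calls this at load time and invokes the returned callable.
        import os
        return (os.system, ("echo code ran during unpickling",))

payload = pickle.dumps(Malicious())
pickle.loads(payload)  # loading the file executes the command
```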
After (HDF5 snippet path):
- Snippet artifacts are written and read by the typed serializer in `falco/hdf5utils.py`, restricted to a known set of node types (`falco.config.Object`, dict/list/tuple, ndarray/scalars/strings/bytes, `types.SimpleNamespace`).

Net effect:
- Snippet readback reconstructs only whitelisted data types rather than executing pickled bytecode, removing the arbitrary-code-execution risk from this path.
Notes
- The full-workspace pickle save (`*_all.pkl`) was intentionally left unchanged to keep the conversion localized.