
Update .gitignore to exclude virtual environment directories and enhance documentation on adding datasets with h5py #2032

Open: wants to merge 1 commit into base: dev

Conversation

bendichter (Contributor) commented Feb 6, 2025

Motivation

Added a new section to the editing tutorial demonstrating how to add custom datasets using h5py. This enhances the documentation by showing users how to add new datasets to existing groups in NWB files when the standard PyNWB API does not provide direct methods for doing so. The example shows adding a genotype dataset to the Subject object, a common use case in neuroscience data management. This came up recently in the following Slack message: https://nwb-users.slack.com/archives/C5XKC14L9/p1738791649800719

I don't think there is a way to do this directly with pynwb. Is that right, @rly?

Adrian Duszkiewicz
Hi all 🙂 I'm wondering about possible solutions to the problem of blinding some of the 'subject' information in an NWB file.
We are moving our ephys processing pipeline to the NWB format, and in our case the experimenter is blind to the genotype of the animal during pre-processing and initial data analysis. Ideally this info would not be accessible to them in the NWB files they are working with and would only be added at a later stage. However, as I understand it, the subject info can only be added when creating the NWB file and cannot be edited later. Is there anything obvious I'm missing, or is recreating the whole NWB file from scratch after unblinding the only solution to this issue?
Thank you in advance for all the tips!
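The h5py approach the PR motivation describes can be sketched as follows. This is a minimal, self-contained sketch, not the tutorial's actual text: the file name and genotype value are made up, and it assumes the standard NWB HDF5 layout, in which the Subject is stored at /general/subject. Note that writing with h5py directly bypasses PyNWB's validation.

```python
import h5py

# Hypothetical file name; assumes the NWB file stores the Subject at
# /general/subject (the standard NWB HDF5 layout) and that no genotype
# dataset was written at creation time.
with h5py.File("blinded.nwb", "a") as f:
    subject = f.require_group("/general/subject")
    if "genotype" not in subject:
        # Example value; adding a dataset this way bypasses PyNWB validation
        subject.create_dataset("genotype", data="Sst-IRES-Cre")

# Read the new dataset back to confirm it was written
with h5py.File("blinded.nwb", "r") as f:
    genotype = f["/general/subject/genotype"][()].decode()
```

Because the dataset is created outside PyNWB, it is the user's responsibility to match the names and dtypes the NWB schema expects for that group.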


codecov bot commented Feb 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.73%. Comparing base (716e3ce) to head (730ed7b).

Additional details and impacted files
@@           Coverage Diff           @@
##              dev    #2032   +/-   ##
=======================================
  Coverage   91.73%   91.73%           
=======================================
  Files          27       27           
  Lines        2722     2722           
  Branches      710      710           
=======================================
  Hits         2497     2497           
  Misses        149      149           
  Partials       76       76           
Flag         Coverage Δ
integration  72.96% <ø> (ø)
unit         82.29% <ø> (ø)

Flags with carried forward coverage won't be shown.


rly (Contributor) commented Feb 7, 2025

I don't think there is a way to do this directly with pynwb. Is that right, @rly?

If the dataset has not been created yet, it can be added using PyNWB, but subject.set_modified() has to be called. (If it has already been written, it can be replaced only through h5py, because it is a scalar dataset and PyNWB does not provide direct access to the h5py.Dataset object for scalar datasets; the replacement must have the same shape.)

import pynwb
from pynwb.testing.mock.file import mock_NWBFile

# Write a file whose Subject has no genotype yet
nwb = mock_NWBFile()
nwb.subject = pynwb.file.Subject(subject_id="test")
with pynwb.NWBHDF5IO("test.nwb", "w") as io:
    io.write(nwb)

# Reopen in append mode and add the genotype dataset
with pynwb.NWBHDF5IO("test.nwb", "a") as io:
    nwb = io.read()
    nwb.subject.genotype = "test"
    nwb.subject.set_modified()  # mark the container dirty so the change is written
    io.write(nwb)

# Verify the new dataset round-trips
with pynwb.NWBHDF5IO("test.nwb", "r") as io:
    nwb = io.read()
    print(nwb.subject.genotype)  # prints "test"

set_modified should really be called any time a field setter executes successfully, but there might be some strange edge cases. I'll look into that in hdmf-dev/hdmf#1244
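For the already-written case mentioned above, replacing the scalar dataset in place has to go through h5py. A minimal, self-contained sketch (the file name and genotype values are hypothetical; it assumes the standard /general/subject/genotype path and bypasses PyNWB validation):

```python
import h5py

# Create a stand-in file with a scalar genotype dataset already written.
# In practice this would be an NWB file produced by PyNWB.
with h5py.File("unblind.nwb", "w") as f:
    f.create_dataset("/general/subject/genotype", data="unknown")

# Overwrite the scalar dataset in place; the new value must be
# compatible with the existing dataset's dtype and shape.
with h5py.File("unblind.nwb", "r+") as f:
    ds = f["/general/subject/genotype"]
    ds[()] = "Sst-IRES-Cre"  # direct h5py write, PyNWB is bypassed

# Read the replaced value back
with h5py.File("unblind.nwb", "r") as f:
    genotype = f["/general/subject/genotype"][()].decode()
```

Since PyNWB never sees this change, no set_modified() call is needed, but the file should be re-validated afterwards.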

bendichter (Contributor, Author) commented:
OK, we should definitely add this to the editing tutorial. Can optional Attributes and optional non-scalar Datasets also be added in this way?
