Add typed Units columns and UnitMetrics table for per-unit metrics by h-mayorquin · Pull Request #20 · catalystneuro/ndx-spikesorting

h-mayorquin · 2026-05-20T02:20:24Z

I'd like to propose a different shape for spike-sorting metrics storage than the open MetricExtension PR (#18). Two pieces:

Cell properties as typed columns on the Units table
Some properties are intrinsic to the cell, independent of any analysis run, so they belong on nwbfile.units as typed VectorData columns. As a minimal implementation of the idea I am adding FiringRate. Other candidates are peak-to-trough duration, trough half-width, and median amplitude, which we can add after this PR if we refine the idea.

A generic UnitMetrics table for run-dependent metrics
Besides cell properties, the metrics extension covers a lot. From the SpikeInterface side we have quality_metrics (defined by purpose), template_metrics (defined by source), and spiketrain_metrics (defined by source). Like in #18, I have a generic DynamicTable to cover all those cases, but with some differences from #18: (1) explicit linkage to the Units table via a required unit DynamicTableRegion per row, which ensures provenance; (2) per-row obs_intervals matching the NWB-core Units.obs_intervals ragged-column pattern, so we reuse the existing NWB convention for period attribution rather than introducing a new shape (this also covers the periods produced by SpikeInterface's valid_unit_periods extension).
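To make the two differences concrete, here is a minimal sketch in plain Python of a metrics row that carries a required unit reference and a per-row list of observation intervals, flattened the way NWB ragged columns are stored (a flat data array plus an index of cumulative end offsets). The MetricRow class, the helper names, and the metric name are hypothetical stand-ins for illustration, not the extension's actual API.

```python
from dataclasses import dataclass


@dataclass
class MetricRow:
    # Row index into the Units table; stands in for the DynamicTableRegion.
    unit_index: int
    # Intervals (start, stop) in seconds over which the metrics were computed.
    obs_intervals: list
    # Metric name -> value for this unit (names here are illustrative).
    values: dict


def encode_ragged(rows):
    """Flatten per-row interval lists into (data, index) ragged form."""
    data, index = [], []
    for row in rows:
        data.extend(row.obs_intervals)
        index.append(len(data))  # cumulative end offset of this row
    return data, index


def decode_ragged(data, index):
    """Recover the per-row interval lists from (data, index)."""
    out, start = [], 0
    for end in index:
        out.append(data[start:end])
        start = end
    return out


rows = [
    MetricRow(0, [(0.0, 10.0), (20.0, 30.0)], {"some_metric": 0.01}),
    MetricRow(1, [(0.0, 30.0)], {"some_metric": 0.20}),
]
data, index = encode_ragged(rows)
```

Because each row keeps its own unit_index, a reader can always trace a metric value back to the unit it describes, and decode_ragged(data, index) gives back the intervals per row.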

Taken together, this PR is the structural base: typed canonical columns on nwbfile.units for cell properties, plus a generic UnitMetrics DynamicTable for run-dependent metrics. Future PRs can either canonize more typed columns on Units or subclass UnitMetrics for specific purposes (e.g., curation) with their own canonical columns.

alejoe91 · 2026-05-28T15:55:44Z

As we discussed, since only a subset of metrics use the periods, individual columns could link to the ValidUnitPeriods (if they were used for computation) or another TimeIntervals for custom periods.

This is how to check if valid unit periods were used in SI and which metric used it:

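# Assumes `analyzer` is an existing SortingAnalyzer with quality metrics computed.
from spikeinterface.metrics import ComputeQualityMetrics

qm_ext = analyzer.get_extension("quality_metrics")
# True only if periods were requested and actually provided.
use_valid_periods = qm_ext.params["use_valid_periods"] and qm_ext.params["periods"] is not None

# Collect the output column names of every metric that supports periods.
metrics_with_periods = []
for m in ComputeQualityMetrics.metric_list:
    if m.supports_periods:
        metrics_with_periods.extend(list(m.metric_columns.keys()))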

@h-mayorquin the only "issue" here is that spike train metrics (which are also in the quality metrics list) also support periods, but I think that's ok...

h-mayorquin · 2026-05-31T13:37:40Z

I implemented the logic for adding the time_support link only to metric columns whose underlying SpikeInterface metric class has supports_periods=True.
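Roughly, the selection works like the sketch below. It uses a hypothetical FakeMetric stand-in that only mimics the two attributes used in the snippet above (metric_columns and supports_periods), so it runs without SpikeInterface; the metric and column names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FakeMetric:
    """Hypothetical stand-in for a SpikeInterface metric class."""
    metric_columns: dict
    supports_periods: bool = False


def split_columns_by_period_support(metric_list):
    """Return (columns that get a time_support link, columns that do not)."""
    with_support, without_support = [], []
    for metric in metric_list:
        target = with_support if metric.supports_periods else without_support
        target.extend(metric.metric_columns.keys())
    return with_support, without_support


# Illustrative metrics only; real names and flags come from SpikeInterface.
metric_list = [
    FakeMetric({"metric_a": float}, supports_periods=True),
    FakeMetric({"metric_b": float}),
    FakeMetric({"metric_c_ratio": float, "metric_c_count": int}, supports_periods=True),
]
linked, unlinked = split_columns_by_period_support(metric_list)
```

A metric that produces several columns contributes all of them to the same group, since the flag lives on the metric class rather than on each column.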

This PR is ready to go, here is a summary of the current implementation:

UnitsMetrics is meant to be a flexible way of adding analysis data to the Units table. A file can hold more than one, for the case where more than one analysis is done. Properties should go in the Units table directly if they are definitive cell properties (e.g. cell_type, firing_rate, brain_region, peak_channel).
ValidUnitPeriods remains and is meant to be a generic container for per-unit intervals during which each unit's sort can be trusted; one or more may coexist per file. SpikeInterface's valid_unit_periods extension produces these by thresholding false-positive (refractory violations) and false-negative (amplitude cutoff) rates per time bin and merging contiguous good bins, but the type stays algorithm-agnostic so other methods can populate it too. When converting NWB back to a SortingAnalyzer we restore the SpikeInterface extension with method="user_defined".
We now have a UnitVectorData column that adds a time_support attribute for provenance about which time domain the computation used. When qm.params["periods"] is set, a SpikeInterface valid_unit_periods extension is also present, and the two arrays match (via np.array_equal), we link time_support to a ValidUnitPeriods. If no valid_unit_periods extension is available, we store a plain TimeIntervals and link time_support to that. This will be simplified after your SpikeInterface PR.
The reason for time_support being a column attribute is twofold: 1) it allows for flexibility (every column can link to a different dynamic table if they used different time support) but also 2) it allows for avoiding data duplication (if all the columns were calculated with the same time support we link all of them to one). I think this mechanism for storing computational provenance might be used more generally in NWB, so I am happy to test it here.
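As a sketch of the time_support decision and the de-duplication it allows, in plain Python: the function names and the (unit_index, start, stop) period layout are hypothetical, a list comparison stands in for np.array_equal, and two branches are my assumptions rather than something stated above (no link when no periods were used, and a plain TimeIntervals when the extension exists but its periods differ).

```python
def choose_time_support_kind(periods, valid_unit_periods):
    """Pick the container type a column's time_support should point to.

    periods: periods passed to the metrics computation, or None.
    valid_unit_periods: periods from the valid_unit_periods extension, or None.
    """
    if periods is None:
        # Assumption: no periods were used, so there is nothing to link.
        return None
    if valid_unit_periods is not None and list(periods) == list(valid_unit_periods):
        # Same periods as the extension: reuse the ValidUnitPeriods table.
        return "ValidUnitPeriods"
    # No extension (or, by assumption, mismatching periods): plain TimeIntervals.
    return "TimeIntervals"


def link_columns(column_names, container):
    """Point every column at the same container object (no duplication)."""
    return {name: container for name in column_names}


# Hypothetical period layout: (unit_index, start_seconds, stop_seconds).
periods = [(0, 0.0, 10.0), (1, 5.0, 30.0)]
kind_shared = choose_time_support_kind(periods, list(periods))
kind_custom = choose_time_support_kind(periods, None)
kind_none = choose_time_support_kind(None, list(periods))

shared_container = {"kind": kind_shared, "periods": periods}
links = link_columns(["metric_a", "metric_c_ratio"], shared_container)
```

Columns computed with a different time support would simply get their own container in links, which is the flexibility described in point 1.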

Let me know if you want to take a second look. Otherwise, we can merge and make a release.

alejoe91 · 2026-06-09T15:23:02Z

Good to merge for me @h-mayorquin !

h-mayorquin · 2026-06-09T15:26:28Z

I merged yours. Should we still merge this one?

h-mayorquin · 2026-06-09T16:53:59Z

> I merged yours. Should we still merge this one?

Yes. I realize I have tests here.

h-mayorquin added 6 commits May 19, 2026 17:36

Alterantive for metrics

05e7fec

unit linking required

e0bbcc9

restore unit periods

b22373e

restore valid intervals

79dcbe3

add obs_intervals

258b1cd

obs intervals vs computation_intervals

3bbd6d4

h-mayorquin mentioned this pull request May 20, 2026

Add MetricExtension for quality/template/spiketrain metrics #18

Closed

h-mayorquin added 3 commits May 20, 2026 01:42

pandas

99a084a

naming consistnet with pynapple

bb4d52a

round-trip fix

5de25a5

This was referenced May 20, 2026

Tracking: canonical columns for unit metrics #21

Open

Add FiringRate and NumSpikes canonical typed VectorData columns #22

Merged

h-mayorquin added 3 commits May 27, 2026 15:26

utils

a2e5cce

spike sorting utils

8c7749b

alessio suggestion

c5c2eb8

h-mayorquin added 2 commits May 29, 2026 11:40

naming

b230fb6

Long refactor

96bca5d

Simplify metrics implementation (#23)

fe6c2d0

alejoe91 approved these changes Jun 9, 2026

Merge remote-tracking branch 'origin/main' into heberto_metrics

38e94e6

h-mayorquin merged commit 6710e0a into main Jun 9, 2026
8 checks passed

h-mayorquin deleted the heberto_metrics branch June 9, 2026 16:54

h-mayorquin merged 16 commits into main from heberto_metrics
