Unify report writing across single-rank and MPI runs using recording sites by ilkilic · Pull Request #65 · openbraininstitute/BlueCelluLab

ilkilic · 2026-02-27T13:39:28Z

Refactors report writing to operate on cell recordings instead of voltage trace dictionaries. Reports are now generated from configured recording sites, unifying single-rank and MPI workflows and making compartment reports consistent with the NEURON recording model.

What changed

Breaking: ReportManager.write_all(cells_or_traces=...) removed → now write_all(cells=...).
Compartment writers now read traces via cell.get_recording(rec_name) and iterate
cell.report_sites[report_name].
Report writing is now variable-agnostic (no longer hardcoded to voltage).
Adds explicit MPI gather helpers so consumers can gather recordings/spikes to rank 0
and write reports there.

The cells argument must be a mapping {CellId: cell-like} where entries expose:

report_sites
get_recording(rec_name)
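
As an illustration of that contract, here is a minimal sketch of a cell-like object and of how a writer might walk it. `StubCell`, `collect_report_traces`, and the sample values are hypothetical and not part of BlueCelluLab; the site keys (`report`, `rec_name`, `section`, `segx`) are taken from the SiteEntry discussion in the review comments.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class StubCell:
    """Hypothetical stand-in exposing the two members the writer needs."""
    report_sites: dict[str, list[dict[str, Any]]] = field(default_factory=dict)
    recordings: dict[str, list[float]] = field(default_factory=dict)

    def get_recording(self, rec_name: str) -> list[float]:
        return self.recordings[rec_name]


def collect_report_traces(cells: dict, report_name: str) -> dict:
    """Walk report_sites and fetch each trace by rec_name (illustrative only)."""
    traces = {}
    for cell_id, cell in cells.items():
        for site in cell.report_sites.get(report_name, []):
            key = (cell_id, site["section"], site["segx"])
            traces[key] = cell.get_recording(site["rec_name"])
    return traces


# Made-up report name, recording name, and trace values.
site = {"report": "soma_v", "rec_name": "rec_soma", "section": "soma[0]", "segx": 0.5}
cell = StubCell(
    report_sites={"soma_v": [site]},
    recordings={"rec_soma": [-65.0, -64.8, -64.5]},
)
cells = {("NodeA", 0): cell}
traces = collect_report_traces(cells, "soma_v")
```

Any object with these two members should be usable in the same way, which is what lets live cells and cells rebuilt from gathered data share one writer path.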

Why

The previous pipeline supported multiple input shapes (live cells vs gathered trace
dictionaries). This made behavior harder to reason about and fragile, especially for
non-voltage variables and MPI execution.

The new pipeline separates responsibilities:
1. recording configuration
2. simulation
3. gathering
4. writing

MPI usage (Recommended)

Consumers running under MPI should gather locally recorded data and write only on rank 0:

# after sim.run(...)
local_sites_index = getattr(sim, "sites_index", {})
gathered_sites = pc.py_gather(local_sites_index, 0)

local_payload = collect_local_payload(sim.cells, cell_ids_for_this_rank, sim.recording_index)
local_spikes = collect_local_spikes(sim, cell_ids_for_this_rank)

all_payload, all_spikes = gather_payload_to_rank0(pc, local_payload, local_spikes)

if rank == 0:
    all_sites_index = gather_recording_sites(gathered_sites)
    cells_for_writer = payload_to_cells(all_payload, all_sites_index)

    report_mgr = ReportManager(sim.circuit_access.config, sim.dt)
    report_mgr.write_all(cells=cells_for_writer, spikes_by_pop=all_spikes)

This replaces the previous approach where users gathered trace dicts and passed cells_or_traces=traces.
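
The gather step can be pictured without MPI. NEURON's `ParallelContext.py_gather(obj, root)` returns a list with one entry per rank on the root rank and `None` on the others. The sketch below merges such a list into one mapping. `merge_rank_payloads` is a hypothetical name, not the helper added in this PR, and both the payload shape and the "one cell lives on exactly one rank" rule are assumptions.

```python
from typing import Optional


def merge_rank_payloads(gathered: Optional[list]) -> dict:
    """Merge a list of per-rank dicts, as a gather would return on the root rank.

    Returns an empty dict when given None (the gather result on non-root ranks).
    Raises if two ranks report the same key (assumed to indicate a bug).
    """
    merged: dict = {}
    for rank, part in enumerate(gathered or []):
        for key, value in (part or {}).items():
            if key in merged:
                raise ValueError(f"duplicate key {key!r} from rank {rank}")
            merged[key] = value
    return merged


# Simulated gather result on rank 0 for three ranks; the middle rank had no cells.
gathered = [
    {("NodeA", 0): {"recordings": {"rec_a": [-65.0, -64.9]}}},
    {},
    {("NodeA", 1): {"recordings": {"rec_b": [-70.0, -69.5]}}},
]
all_payload = merge_rank_payloads(gathered)
```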

Non-MPI usage

Single-rank usage remains simple:

report_mgr = ReportManager(sim.circuit_access.config, sim.dt)
report_mgr.write_all(cells=sim.cells)

Any consumer code passing cells_or_traces or trace dictionaries must migrate to:
• cells=sim.cells (single rank), or
• cells=payload_to_cells(...) on rank 0 after MPI gather.

codecov · 2026-02-27T13:46:31Z

Codecov Report

❌ Patch coverage is 93.79845% with 24 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
bluecellulab/reports/utils.py	93.57%	9 Missing ⚠️
bluecellulab/reports/writers/compartment.py	72.00%	7 Missing ⚠️
tests/test_reports/test_compartment_writer.py	90.76%	6 Missing ⚠️
tests/test_reports/test_reports_utils.py	98.31%	2 Missing ⚠️

Files with missing lines	Coverage Δ
bluecellulab/cell/core.py	`78.73% <100.00%> (+0.13%)`	⬆️
...ab/circuit/circuit_access/sonata_circuit_access.py	`98.20% <100.00%> (ø)`
bluecellulab/circuit_simulation.py	`84.89% <100.00%> (-0.82%)`	⬇️
bluecellulab/reports/manager.py	`93.02% <100.00%> (-2.22%)`	⬇️
bluecellulab/type_aliases.py	`100.00% <100.00%> (ø)`
tests/test_cell/test_core.py	`99.40% <100.00%> (+<0.01%)`	⬆️
tests/test_reports/test_reports_utils.py	`98.61% <98.31%> (-1.39%)`	⬇️
tests/test_reports/test_compartment_writer.py	`88.49% <90.76%> (-11.51%)`	⬇️
bluecellulab/reports/writers/compartment.py	`83.72% <72.00%> (-2.31%)`	⬇️
bluecellulab/reports/utils.py	`92.85% <93.57%> (+9.52%)`	⬆️

... and 5 files with indirect coverage changes


AurelienJaquier · 2026-02-27T15:14:19Z

bluecellulab/cell/core.py

+                else:
+                    rec_name = section_to_variable_recording_str(sec, float(seg), variable_name)
+                    if rec_name not in self.recordings:
+                        self.add_variable_recording(variable=variable_name, section=sec, segx=float(seg))


Is there a reason we check for rec_name presence in self.recordings in the else but not in the if?

Good catch, it should be checked in both cases. Thanks!
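
For illustration, the guard under discussion amounts to an idempotent registration: resolve the recording name first, then add the recording only if it is missing, with the same check on both branches. A toy sketch follows; `ToyCell` and its naming scheme are hypothetical and do not reproduce BlueCelluLab's `Cell`.

```python
from typing import Optional


class ToyCell:
    """Hypothetical cell that tracks recordings by name."""

    def __init__(self) -> None:
        self.recordings: dict[str, list[float]] = {}
        self.add_calls = 0  # how many recordings were actually created

    def _rec_name(self, section: str, segx: float, variable: str) -> str:
        # Assumed naming scheme, for illustration only.
        return f"{section}({segx}).{variable}"

    def ensure_recording(self, section: Optional[str], segx: float, variable: str) -> str:
        # A missing section falls back to the soma, mirroring the None branch in the diff.
        sec_name = "soma[0]" if section is None else section
        rec_name = self._rec_name(sec_name, float(segx), variable)
        if rec_name not in self.recordings:  # same check for both branches
            self.recordings[rec_name] = []
            self.add_calls += 1
        return rec_name


cell = ToyCell()
first = cell.ensure_recording(None, 0.5, "v")
second = cell.ensure_recording("soma[0]", 0.5, "v")  # same site, named explicitly
third = cell.ensure_recording("dend[3]", 0.25, "v")
```

With the check applied uniformly, requesting the soma via `None` and via its explicit section resolves to a single recording.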

AurelienJaquier · 2026-02-27T15:22:03Z

bluecellulab/reports/manager.py

        """Write all configured reports (compartment and spike) in SONATA
        format.

-        Parameters


Could you still keep Parameters in the docstring please?

Added in the last commit

AurelienJaquier · 2026-02-27T15:23:21Z

bluecellulab/reports/utils.py

+def prepare_recordings_for_reports(
+    cells: Dict[CellId, Any],
+    simulation_config: Any,
+) -> tuple[dict[CellId, list[str]], dict[CellId, list[SiteEntry]]]:


docstring please! Especially with the complicated return type, would be nice to explain what it represents

Added in my last commit

AurelienJaquier · 2026-02-27T15:30:56Z

bluecellulab/reports/utils.py

-    ----------
-    cells : dict
-        Mapping from (population, gid) → Cell object.
+def prepare_recordings_for_reports(


This function is quite long. It would be nice to refactor it into something with fewer lines. This can be done in another PR, though, if we want to merge this fast.

Refactored in my last commit

darshanmandge

Thanks! I have made some comments.

darshanmandge · 2026-02-27T15:18:56Z

bluecellulab/cell/core.py

+                if sec is None:
+                    self.add_variable_recording(variable=variable_name, section=None, segx=float(seg))
+                    sec_obj = self.soma
+                    rec_name = section_to_variable_recording_str(sec_obj, float(seg), variable_name)
+                else:
+                    rec_name = section_to_variable_recording_str(sec, float(seg), variable_name)
+                    if rec_name not in self.recordings:


Do you want to check for duplicate recording for None (which results in soma recording) as you do in the else statement in L1049?

Already fixed in one of the latest commits.

darshanmandge · 2026-02-27T15:28:39Z

bluecellulab/reports/utils.py

-):
-    """Build per-cell recording sites based on source type and report
-    configuration.
+            for (sec, sec_name, segx), rec_name in zip(sites, rec_names):


When len(rec_names) < len(sites) (as warned in L99 above), this zip silently drops the tail of sites. Those dropped sites never get a site_entry in sites_index or report_sites, which could cause a silent mismatch when payload_to_cells reconstructs cells on rank 0.
Could you log which specific (sec_name, segx) pairs were skipped?

I updated configure_recording to return (site, rec_name) pairs. We also log sec_name/segx in the exception path in configure_recording, and I updated the AttributeError path to include sec_name/segx as well, so skipped sites are identifiable in the logs.

darshanmandge · 2026-02-27T15:39:32Z

bluecellulab/reports/utils.py

+    sites_index: Mapping[CellId, list[SiteEntry]],
+) -> Dict[CellId, RecordedCell]:
+    """
+    payload: {"pop_gid": {"recordings": {rec_name: [floats...]}}}


You are coding both the population name and gid as pop_gid. Population names sometimes have underscores, too. Do you want to maintain the tuple structure as in SONATA (population_name, node_id) in payload?

Good catch! I will switch the payload to use CellId directly.
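
A small sketch of the ambiguity, using a made-up population name. A plain tuple stands in for CellId here; the exact CellId definition is not shown in this thread.

```python
population, gid = "hippocampus_neurons", 42

# String encoding: a naive split on "_" breaks when the name contains underscores.
encoded = f"{population}_{gid}"
naive_parts = encoded.split("_")

# Tuple key: nothing to parse, and the gid keeps its integer type.
payload = {(population, gid): {"recordings": {"rec_a": [-65.0, -64.7]}}}
pop_back, gid_back = next(iter(payload))
```

Splitting from the right with a limit of one would recover this particular case, but a tuple key removes the encoding and decoding step entirely and matches the (population_name, node_id) convention mentioned above.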

darshanmandge · 2026-02-27T15:47:03Z

bluecellulab/reports/utils.py

+            report_sites = getattr(cell, "report_sites", None)
+            if not isinstance(report_sites, dict):
+                report_sites = {}
+                setattr(cell, "report_sites", report_sites)


Real Cell objects always have report_sites (set in __init__), so this getattr/setattr fallback will never execute for them — it only exists to handle objects that implement ReportConfigurableCell but forgot to include report_sites. Since the Protocol doesn't declare report_sites as a required attribute, those implementors have no way of knowing they need it. Adding it to the Protocol explicitly can be an option:

class ReportConfigurableCell(ReportSiteResolvable, Protocol):
    report_sites: dict[str, list[dict]]  # ← add this

Right, I removed the protocol to reduce complexity. Both Cell and RecordedCell now define report_sites, so objects without it are no longer supported and the dynamic fallback isn’t needed anymore.

darshanmandge · 2026-02-27T15:48:58Z

bluecellulab/type_aliases.py

 TStim: TypeAlias = hoc_type

 SectionMapping = Dict[str, NeuronSection]
+SiteEntry: TypeAlias = dict[str, Any]


SiteEntry is used in at least 5 places with fixed string keys "report", "rec_name", "section", "segx". Using dict[str, Any] does not provide static checking. A TypedDict would catch key typos at type-check time:

class SiteEntry(TypedDict):
    report: str
    rec_name: str
    section: str
    segx: float

Good point, I replaced dict[str, Any] with a TypedDict
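
A short sketch of what the TypedDict buys. The fields mirror the suggestion above; the definition that was actually merged may differ, and the sample values are invented.

```python
from typing import TypedDict


class SiteEntry(TypedDict):
    report: str
    rec_name: str
    section: str
    segx: float


def describe(site: SiteEntry) -> str:
    # A static checker such as mypy would reject site["rec_nmae"] as an unknown key.
    return f'{site["report"]}:{site["section"]}({site["segx"]}) -> {site["rec_name"]}'


entry: SiteEntry = {
    "report": "soma_v",
    "rec_name": "rec_soma",
    "section": "soma[0]",
    "segx": 0.5,
}
label = describe(entry)
```

At runtime a TypedDict instance is an ordinary dict, so existing code that indexes entries by key, or pickles them for an MPI gather, should keep working unchanged; the benefit is purely at type-check time.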

darshanmandge · 2026-02-27T16:01:10Z

bluecellulab/reports/utils.py

+            )
+            spikes[pop][gid] = list(times) if times is not None else []
+        except Exception:
+            spikes[pop][gid] = []


You can add logging, e.g. logger.warning("Failed to collect spikes for (%s, %d): %s", pop, gid, e, exc_info=True), for easier debugging (this needs `except Exception as e` so that e is bound).

Good idea, I added logging but at debug level since missing spikes are expected for some cells and shouldn’t warn in large simulations.
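
A minimal sketch of the pattern agreed on here: bind the exception, log it at debug level, and fall back to an empty spike list. `collect_spikes`, `fake_get_spikes`, and the logger name are hypothetical and are not the functions in `bluecellulab/reports/utils.py`.

```python
import logging

logger = logging.getLogger("report_spikes_sketch")


def collect_spikes(cell_ids, get_spikes):
    """Collect spike times per (population, gid), defaulting to [] on failure."""
    spikes: dict[str, dict[int, list[float]]] = {}
    for pop, gid in cell_ids:
        try:
            times = get_spikes(pop, gid)
            spikes.setdefault(pop, {})[gid] = list(times) if times is not None else []
        except Exception as e:  # bind the exception so it can be logged
            # Debug level: missing spikes are treated as expected for some cells.
            logger.debug("Failed to collect spikes for (%s, %d): %s", pop, gid, e, exc_info=True)
            spikes.setdefault(pop, {})[gid] = []
    return spikes


def fake_get_spikes(pop, gid):
    # Hypothetical accessor: gid 1 has no spike recorder and raises.
    if gid == 1:
        raise AttributeError("no spike recorder")
    return [12.5, 40.0]


result = collect_spikes([("NodeA", 0), ("NodeA", 1)], fake_get_spikes)
```

Running with the logger set to DEBUG surfaces the traceback for each failing cell, while default log levels stay quiet in large simulations.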

ilkilic added 3 commits February 27, 2026 11:42

refactor reporting

e41ec7e

refactor part2

c0d9cac

lint fix

f01b92b

ilkilic self-assigned this Feb 27, 2026

ilkilic changed the title ~~Refactor report writing to be variable-agnostic and driven by rec_name + report_sites~~ Redesign SONATA report pipeline to support generic variables and MPI-safe writing Feb 27, 2026

ilkilic changed the title ~~Redesign SONATA report pipeline to support generic variables and MPI-safe writing~~ Unify report writing across single-rank and MPI runs using recording sites Feb 27, 2026

ilkilic requested review from AurelienJaquier and darshanmandge February 27, 2026 13:46

ilkilic added 3 commits February 27, 2026 15:38

add unit-tests

f06b502

minor: update example

980f9de

simplify

b01feba

AurelienJaquier reviewed Feb 27, 2026


bug fix

ea6c2e0

darshanmandge reviewed Feb 27, 2026


ilkilic added 3 commits February 27, 2026 17:06

docstring + refactor

8ee22ba

fix

4f8bf19

fix

d5f37c9

3 participants
