diff --git a/README.md b/README.md index 281df9f..ed7d48c 100644 --- a/README.md +++ b/README.md @@ -43,11 +43,79 @@ If `git pull` shows a conflict or error, reach out before trying to fix it. 3. **Set frame range** — use "Set Start/End" buttons to find the active region automatically 4. **Run analysis** — click "Analyze Brightness" (or press F5), choose an output folder +### Capture Metadata Sidecar (new) + +When a video is loaded, the app now checks for an optional sidecar file that shares the video's base name and ends in: + +`.capture.json` + +Example: +- `experiment_01.mov` +- `experiment_01.capture.json` + +The current schema authority is a lightweight versioned contract with `schema_version: "1.0"`. The validator checks required acquisition fields (`device_model`, exposure/white-balance lock flags, exposure duration, ISO, FPS, resolution, HDR flag), warns on legacy or invalid metadata, and shows a metadata status line in the UI after load. + +Reference: +- `docs/capture_metadata_schema.md` + +### Capture Inbox Workflow (new) + +For end-to-end testing, you can now point a local inbox at incoming iPhone captures and optionally auto-run analysis with a fixed manifest: + +```bash +python tools/ingest_capture_inbox.py tools/capture_inbox_manifest.example.json +``` + +That tool: +- scans `inbox_dir` for video + `*.capture.json` pairs +- validates capture metadata using the current schema contract +- creates deterministic per-capture output folders using `capture_id` when present +- writes `capture_ingest_summary.json` for each capture +- optionally runs the existing mask-review analysis path if `analysis_case` is configured +- optionally archives processed source files out of the inbox + +If you want to rerun ingest against the same capture and `output_dir`, pass `--force-reprocess` or set `"force_reprocess": true` in the manifest. Otherwise, captures with identical source signatures are skipped by design, and the run summary records the existing summary path that triggered the skip.
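The inbox workflow above is driven by a small JSON manifest. A hedged sketch of its shape follows: `inbox_dir`, `output_dir`, `analysis_case`, and `force_reprocess` are keys referenced in this README, while the other values are illustrative guesses — treat `tools/capture_inbox_manifest.example.json` as the authoritative template.

```json
{
  "inbox_dir": "incoming/",
  "output_dir": "capture_outputs/",
  "analysis_case": "dark_enclosure_default",
  "force_reprocess": false
}
```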
+ +Use `--watch-seconds 5` to keep rescanning during manual device-to-desktop testing. + ### Output Each analysis produces: - **CSV files** — one per ROI with columns: `frame, brightness_mean, brightness_median, blue_mean, blue_median` - **Plot images** — dual-panel PNG (brightness trends + difference plot) with statistical annotations +- **Metadata sidecar** — one `*_analysis_metadata.json` file capturing mask mode, thresholds, source frames, mask-quality warnings, and normalized capture provenance / validation results when a capture sidecar exists + +### Dark-Enclosure Review Workflow + +For lab-style review of electrode light inside a dark enclosure: + +1. Lock exposure, ISO, white balance, and focus before recording. +2. Draw tight electrode ROIs and place the background ROI close to the electrodes, but outside visible glow. +3. Capture a fixed mask, then enable **Show Pixel Mask** to inspect agreement between the fixed mask and the current adaptive mask. + Fixed-only pixels render in red, adaptive-only pixels in blue, and agreement in magenta. +4. Check the mask-quality summary: + - `high` / `medium` confidence means the consensus mask is stable enough to review. + - `low` confidence, `low_consensus`, `unstable_mask`, or `small_mask` means the mask needs operator review before trusting the run. +5. Export the analysis and confirm the `*_analysis_metadata.json` sidecar was written next to the CSV/plot files. +6. Package one or more exported runs into a repeatable review bundle: + +```bash +python tools/run_real_video_review.py \ + tools/real_video_review_manifest.example.json \ + --output-dir review_output +``` + +The manifest should point at already-exported analysis folders plus the original raw video paths. The review bundle copies metadata, CSVs, and plots into one folder and generates `review_report.md` with a per-run PASS/FAIL summary. 
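A hedged sketch of what a review manifest entry might look like: the only documented requirement is that it point at already-exported analysis folders plus the original raw video paths, so every key name and path here is illustrative — consult `tools/real_video_review_manifest.example.json` for the real shape.

```json
{
  "runs": [
    {
      "analysis_dir": "exports/experiment_01_run1",
      "raw_video": "raw/experiment_01.mov"
    }
  ]
}
```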
+ +For a direct rerun from raw videos and ROI manifests, use: + +```bash +python tools/run_mask_review.py \ + tools/mask_review_manifest.example.json \ + --output-dir mask_review_outputs +``` + +That runner performs auto-capture plus full analysis from the raw videos, writes overlay PNGs for source frames, exports fresh CSV/metadata artifacts, and generates a case-by-case review summary. ### Useful Shortcuts @@ -70,10 +138,11 @@ Arrow keys nudge a selected ROI instead of navigating frames. Shift+Arrow for 10 1. User draws ROIs on the video frame (one can be designated as a background reference). 2. For each frame in the selected range, the tool converts BGR pixels to **CIE LAB** color space and extracts the **L\* channel** (perceptually uniform brightness, 0–100 scale). -3. Pixels below a noise threshold (default 5 L\*) are filtered out. An optional morphological opening (erode then dilate) removes isolated bright pixels. -4. If a background ROI is set, its brightness (configurable percentile, default 90th) is subtracted per-frame to compensate for lighting drift. -5. Both mean and median brightness are computed per ROI per frame. -6. Results are exported to CSV and plotted. +3. Fixed-mask capture scores signal above the background ROI and absolute noise floor, then builds a deterministic consensus mask from the strongest source frames. +4. Pixels below the noise floor (default 5 L\*) are filtered out. Morphological opening plus connected-component filtering remove isolated bright artifacts. +5. If a background ROI is set, its brightness (configurable percentile, default 90th) is subtracted per-frame to compensate for lighting drift. +6. Both mean and median brightness are computed per ROI per frame. +7. Results are exported to CSV, plots, and metadata. 
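Steps 4–6 of the pipeline above can be sketched in a few lines of NumPy (step 2's BGR→LAB conversion is done with OpenCV in the real code and is omitted here; clipping subtracted values at zero is an assumption of this sketch, not confirmed behavior):

```python
import numpy as np

def roi_brightness(roi_l_star, background_roi_l_star=None,
                   noise_floor=5.0, background_percentile=90.0):
    """Sketch of steps 4-6 for one frame's L* values (0-100 scale)."""
    # Step 4: drop pixels at or below the noise floor (default 5 L*).
    signal = roi_l_star[roi_l_star > noise_floor]
    if signal.size == 0:
        return 0.0, 0.0
    # Step 5: subtract a percentile of the background ROI (default 90th)
    # to compensate for gradual lighting drift.
    if background_roi_l_star is not None:
        background = np.percentile(background_roi_l_star, background_percentile)
        signal = np.clip(signal - background, 0.0, None)
    # Step 6: report both mean and median per ROI per frame.
    return float(np.mean(signal)), float(np.median(signal))
```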
### Architecture @@ -123,10 +192,21 @@ Brightness Sorcerer reports **relative** L\* brightness values derived from smar ### Pipeline Notes - **Background subtraction** uses a configurable percentile (default 90th) from the background ROI. This adapts to gradual lighting drift but assumes the background ROI contains no glow signal. +- **Fixed-mask provenance** records the source frames, consensus score, warning flags, and threshold settings used to create each reusable mask. - **Morphological filtering** removes isolated bright pixels but may erode edges of very small glow regions. For ROIs smaller than ~50 px, use smaller kernel sizes (1–3). - **No temporal smoothing.** Each frame is analyzed independently. Raw traces may appear noisier than time-averaged instruments; post-hoc filtering (moving average, Savitzky-Golay) can be applied to the exported CSV data. - **Blue channel values** are on the raw 0–255 sensor scale without perceptual correction — useful for qualitative spectral trends, not calibrated spectral measurements. +### Mask-Quality Interpretation + +- `high` confidence: the fixed mask stayed stable across the strongest source frames and showed no blocking warnings. +- `medium` confidence: acceptable for review, but verify the overlay and source frames before using the run as a reference. +- `low` confidence: do not trust the run without manual inspection and likely recapturing the mask. +- `single_frame_capture`: only one usable source frame contributed to the fixed mask; repeatability is weaker. +- `low_consensus`: candidate frames disagreed about which pixels belonged to the glow region. +- `unstable_mask`: the consensus region was much smaller than the total detected union, suggesting drifting or noisy detections. +- `small_mask`: the retained signal region was near the minimum component-size floor and may be dominated by artifacts. 
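The confidence labels above come from fixed ratio thresholds. A condensed restatement of the `_confidence_label` helper added to `ecl_analysis/analysis/masking.py` in this change (same thresholds as the diff):

```python
def confidence_label(candidate_count, consensus_ratio, stability_ratio,
                     pixel_count, min_component_area):
    """Mirror of _confidence_label in ecl_analysis/analysis/masking.py."""
    if pixel_count <= 0:
        return "none"
    # Single source frame or weak agreement blocks trust outright.
    if candidate_count == 1 or consensus_ratio < 0.6 or stability_ratio < 0.4:
        return "low"
    # Near the component-size floor, or middling agreement: review first.
    if pixel_count <= min_component_area * 2 or consensus_ratio < 0.8 or stability_ratio < 0.6:
        return "medium"
    return "high"
```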
+ ### Reporting Recommendations When citing results in publications, note: diff --git a/docs/capture_metadata_schema.md b/docs/capture_metadata_schema.md new file mode 100644 index 0000000..f54842f --- /dev/null +++ b/docs/capture_metadata_schema.md @@ -0,0 +1,53 @@ +# Capture Metadata Sidecar Schema + +The analyzer accepts an optional sidecar JSON file next to each video: + +- Video: `experiment_01.mov` +- Sidecar: `experiment_01.capture.json` + +The current lightweight schema authority is: + +- `schema_version: "1.0"` +- Schema contract source of truth: `ecl_analysis/ingest/metadata.py` + +This is intentionally versioned but non-blocking during the transition from legacy videos to the dedicated iPhone capture app. Missing or invalid metadata should warn, not block analysis. + +## Required fields for schema `1.0` + +```json +{ + "schema_version": "1.0", + "device_model": "iPhone 15 Pro", + "capture_id": "8A0F0A5A-2A79-4D8C-9C2A-0CCF9F9368EA", + "recorded_at": "2026-04-01T10:15:30Z", + "app_version": "0.1.0", + "ios_version": "iOS 26.0", + "video_codec": "h264", + "color_space": "sdr", + "exposure_mode_locked": true, + "exposure_duration": 0.0333333333, + "iso": 80, + "white_balance_mode_locked": true, + "fps": 30, + "resolution": "1920x1080", + "hdr_disabled": true +} +``` + +## Validation behavior + +- Missing sidecar: warning-only in the UI; analysis still proceeds. +- Missing `schema_version`: warning; validator assumes compatibility with schema `1.0` and marks `schema_version_assumed: true` in exported provenance. +- Unknown `schema_version`: warning; validator performs best-effort validation against current fields. +- Missing required acquisition fields: warning-only at load time, but surfaced as validation errors in exported provenance. +- Unknown fields are retained in provenance as `unrecognized_fields` so schema drift is visible without blocking ingest. 
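The warning-first contract above can be illustrated with a standalone sketch. The real validator is `validate_capture_metadata` in `ecl_analysis/ingest/metadata.py`; the required-field names below come from the schema table in this document, but the warning strings and the return shape are illustrative assumptions.

```python
import json
from pathlib import Path

REQUIRED_FIELDS = (
    "device_model", "exposure_mode_locked", "exposure_duration", "iso",
    "white_balance_mode_locked", "fps", "resolution", "hdr_disabled",
)

def check_sidecar(video_path):
    """Warning-first sidecar check: always returns, never raises on bad metadata."""
    sidecar = Path(video_path).with_suffix(".capture.json")
    if not sidecar.exists():
        return None, ["missing_sidecar"]  # analysis still proceeds
    data = json.loads(sidecar.read_text(encoding="utf-8"))
    warnings = []
    if "schema_version" not in data:
        warnings.append("schema_version_assumed")  # assume compatibility with 1.0
    warnings += [f"missing_field:{name}" for name in REQUIRED_FIELDS if name not in data]
    return data, warnings
```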
+ +## Export behavior + +Analysis metadata exports now include: + +- `capture_metadata_validation`: whether the sidecar passed validation plus any warnings/errors +- `capture_metadata`: normalized capture provenance when a sidecar is present +- `capture_provenance`: grouped export view that carries both the normalized metadata and the validation record used for the run + +That contract is the boundary the iPhone capture app should target. diff --git a/docs/iphone_capture_pipeline_review.md b/docs/iphone_capture_pipeline_review.md new file mode 100644 index 0000000..bff1ae4 --- /dev/null +++ b/docs/iphone_capture_pipeline_review.md @@ -0,0 +1,73 @@ +# iPhone Capture Pipeline Feasibility Review + +## Context +The current app analyzes pre-recorded videos and assumes camera settings are stable enough for relative brightness trends. + +## What the project already does well +- Computes brightness using CIE L* from each frame and supports background subtraction and noise/morphological filtering. +- Exports reproducible frame-level CSV files and plots. +- Explicitly documents that manual exposure/ISO/white balance lock is required for valid results. + +## Current gap vs. requested workflow +Your proposed workflow is: +1. Record on iPhone with exposure lock and stable imaging pipeline. +2. Persist capture settings in metadata. +3. Automatically deliver video into ECL_Analysis for processing. + +The repository currently starts analysis from a local file picker / drag-drop and does not include: +- iPhone capture controls. +- In-app metadata ingestion/validation for camera settings. +- An automated watch/import service for incoming files. + +## Feasibility assessment +This is feasible and likely worth it if consistency is your top priority. + +### Why it is worth doing +- This codebase already depends on consistency of acquisition conditions for scientific validity. +- Most of your measurement error risk is upstream (capture variability), not downstream (analysis code). 
+- A capture-controlled iPhone flow should reduce false trends caused by auto-exposure, tone mapping, HDR, or AWB drift. + +### Practical constraints to account for +- iPhone camera APIs are iOS-native (AVFoundation). A robust capture app is best built as a separate iOS app, not inside this PyQt desktop app. +- iOS may not allow writing arbitrary custom metadata into the container exactly how you want for every codec/profile; often you should also create a sidecar JSON record. +- HEVC/HDR/Dolby Vision defaults can distort analysis unless explicitly disabled. + +## Recommended architecture (incremental) + +### Phase 1 (highest ROI, low risk): metadata-aware import in this repo +Add import-time validation in ECL_Analysis: +- Parse container metadata via ffprobe/exiftool (codec, fps, dimensions, capture date, color transfer/profile when available). +- Use a lightweight versioned sidecar JSON contract (`schema_version: "1.0"`) from `ecl_analysis/ingest/metadata.py` with fields like: + - device_model + - exposure_mode_locked + - exposure_duration + - iso + - white_balance_mode_locked + - fps + - resolution + - hdr_disabled +- Warn, rather than block, when required fields are missing or invalid so legacy videos remain analyzable during the transition. +- Normalize recognized sidecar fields before export so downstream analysis artifacts stay reproducible even when inputs vary in representation. + +### Phase 2: automatic ingest +- Add a watched inbox folder (`incoming/`). +- New files with valid sidecar metadata are queued for analysis automatically. +- Save outputs to deterministic folder names tied to capture IDs. 
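The Phase 2 inbox scan described above can be sketched as follows. This is not the `tools/ingest_capture_inbox.py` implementation: the video-suffix set is an assumption, though pairing on `*.capture.json`, keying output folders on `capture_id`, and skipping folders that already hold a `capture_ingest_summary.json` all follow the behavior documented in this repo.

```python
import json
from pathlib import Path

VIDEO_SUFFIXES = {".mov", ".mp4"}  # assumption: which containers count as videos

def pending_captures(inbox_dir, output_root):
    """Yield (video, sidecar, output_dir) for capture pairs not yet processed."""
    for video in sorted(Path(inbox_dir).iterdir()):
        if video.suffix.lower() not in VIDEO_SUFFIXES:
            continue
        sidecar = video.with_suffix(".capture.json")
        if not sidecar.exists():
            continue  # unpaired video: leave for warning-first handling elsewhere
        meta = json.loads(sidecar.read_text(encoding="utf-8"))
        # Deterministic folder name: capture_id when present, else the video stem.
        out_dir = Path(output_root) / meta.get("capture_id", video.stem)
        if not (out_dir / "capture_ingest_summary.json").exists():
            yield video, sidecar, out_dir
```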
+ +### Phase 3: iPhone acquisition app +- Build a lightweight iOS capture app (Swift + AVFoundation): + - lock exposure/ISO/white balance/focus + - disable HDR/night mode/deep tone mapping where possible + - force fixed FPS and resolution + - export MOV + sidecar JSON + - upload directly to shared storage / API endpoint consumed by the desktop pipeline + +## Suggested acceptance criteria +- Repeated static-scene captures produce <= X% frame-level brightness variance across runs. +- Pipeline surfaces capture-provenance warnings for any run lacking lock-confirmed metadata. +- Analysis output includes capture settings provenance in summary artifacts, including schema version, validation status, and normalized sidecar fields. + +## Bottom line +Yes, this is feasible. It is also strategically aligned with the project’s own measurement assumptions. + +Best path: keep this Python analyzer as the analysis engine, and add (1) metadata-gated ingest now, then (2) iPhone capture app integration. That gives you immediate quality gains without a risky full rewrite. diff --git a/docs/metadata_ingest_execution_plan.md b/docs/metadata_ingest_execution_plan.md new file mode 100644 index 0000000..a6e064b --- /dev/null +++ b/docs/metadata_ingest_execution_plan.md @@ -0,0 +1,50 @@ +# Metadata Ingest Execution Plan + +## Goal +Improve acquisition consistency and provenance in the desktop analyzer while keeping legacy videos analyzable during the transition to a dedicated iPhone capture app. + +## Decisions +- Capture metadata validation is warning-first, not hard-blocking. +- The schema authority is a lightweight versioned sidecar contract with `schema_version: "1.0"`. +- The Python desktop app remains the analysis engine. +- The iPhone capture app should live in a separate repository and can start as a minimal AVFoundation MVP. 
+ +## Phase Status + +### Phase 1: metadata-aware import in this repo +Status: in progress + +Implemented: +- Sidecar schema contract and validator in `ecl_analysis/ingest/metadata.py` +- UI load-time metadata status in `ecl_analysis/video_analyzer.py` +- Provenance export fields in analysis metadata outputs +- Tests covering validation behavior and metadata export wiring + +Remaining: +- Optional container-level metadata parsing (`ffprobe` / `exiftool`) to cross-check sidecar claims +- More explicit UI surfacing of validation warnings/details beyond the status line + +### Phase 2: automatic ingest +Status: in progress + +Implemented: +- Inbox ingest script in `tools/ingest_capture_inbox.py` +- Deterministic capture output folders using `capture_id` when present +- Per-capture ingest summaries and optional archive behavior +- Manifest-driven optional auto-analysis flow + +Remaining: +- Decide where the watched inbox should live in real deployments +- Add any daemon/service wrapper if continuous unattended ingest is needed + +### Phase 3: iPhone capture app +Status: not started in this repository + +Planned: +- Minimal Swift / AVFoundation capture app in a separate repository +- Fixed capture settings, sidecar JSON export, and transfer into the desktop ingest path + +## Near-Term Next Steps +1. Commit the Phase 1 and Phase 2 desktop-side ingest work. +2. Decide whether container metadata cross-checking is required before starting the iPhone app. +3. Create a separate repository for the iPhone capture MVP. 
diff --git a/ecl_analysis/analysis/__init__.py b/ecl_analysis/analysis/__init__.py index 395c6b3..e91658d 100644 --- a/ecl_analysis/analysis/__init__.py +++ b/ecl_analysis/analysis/__init__.py @@ -3,14 +3,20 @@ from .background import compute_background_brightness from .brightness import compute_brightness, compute_brightness_stats, compute_l_star_frame from .duration import validate_run_duration -from .models import AnalysisRequest, AnalysisResult +from .masking import MASK_TOP_CANDIDATES, build_consensus_mask, build_signal_mask, evaluate_mask_candidate +from .models import AnalysisRequest, AnalysisResult, MaskCaptureMetadata __all__ = [ "AnalysisRequest", "AnalysisResult", + "MaskCaptureMetadata", + "MASK_TOP_CANDIDATES", + "build_consensus_mask", + "build_signal_mask", "compute_background_brightness", "compute_brightness", "compute_brightness_stats", "compute_l_star_frame", + "evaluate_mask_candidate", "validate_run_duration", ] diff --git a/ecl_analysis/analysis/masking.py b/ecl_analysis/analysis/masking.py new file mode 100644 index 0000000..a322a09 --- /dev/null +++ b/ecl_analysis/analysis/masking.py @@ -0,0 +1,242 @@ +"""Mask scoring and consensus helpers for electrode-light analysis.""" + +from __future__ import annotations + +from dataclasses import dataclass +from math import ceil +from typing import List, Optional, Sequence, Tuple + +import cv2 +import numpy as np + +from .models import MaskCaptureMetadata + +MASK_TOP_CANDIDATES = 3 + + +@dataclass(frozen=True) +class MaskCandidate: + """Single-frame mask candidate for one ROI.""" + + frame_idx: int + score: float + background_brightness: Optional[float] + mask: np.ndarray + pixel_count: int + signal_peak: float + threshold_value: float + min_component_area: int + + +def compute_min_component_area( + mask_shape: Tuple[int, int], + morphological_kernel_size: int, +) -> int: + """Return a conservative connected-component floor scaled to ROI area.""" + roi_area = max(1, int(mask_shape[0] * mask_shape[1])) + 
area_floor = int(round(roi_area * 0.002)) + return max(4, morphological_kernel_size, min(64, area_floor)) + + +def filter_connected_components(mask: np.ndarray, min_component_area: int) -> np.ndarray: + """Remove connected components smaller than the requested area.""" + if mask.size == 0 or not np.any(mask): + return np.zeros(mask.shape, dtype=bool) + + num_labels, labels, stats, _centroids = cv2.connectedComponentsWithStats( + mask.astype(np.uint8), + connectivity=8, + ) + filtered = np.zeros(mask.shape, dtype=bool) + for label_idx in range(1, num_labels): + area = int(stats[label_idx, cv2.CC_STAT_AREA]) + if area >= min_component_area: + filtered |= labels == label_idx + return filtered + + +def build_signal_mask( + roi_l_star: np.ndarray, + background_brightness: Optional[float], + noise_floor_threshold: float, + morphological_kernel_size: int, + min_component_area: Optional[int] = None, +) -> Tuple[np.ndarray, float, int]: + """Build a cleaned binary mask for pixels that plausibly belong to electrode light.""" + if roi_l_star.size == 0: + return np.zeros(roi_l_star.shape, dtype=bool), float(noise_floor_threshold), 0 + + threshold_value = float(noise_floor_threshold) + if background_brightness is not None: + threshold_value = max(threshold_value, float(background_brightness)) + + mask = roi_l_star > threshold_value + if np.any(mask): + kernel = cv2.getStructuringElement( + cv2.MORPH_ELLIPSE, + (morphological_kernel_size, morphological_kernel_size), + ) + mask_uint8 = mask.astype(np.uint8) * 255 + cleaned = cv2.morphologyEx(mask_uint8, cv2.MORPH_OPEN, kernel) + mask = cleaned > 0 + + min_area = ( + compute_min_component_area(mask.shape, morphological_kernel_size) + if min_component_area is None + else max(1, int(min_component_area)) + ) + mask = filter_connected_components(mask, min_area) + return mask, threshold_value, min_area + + +def evaluate_mask_candidate( + roi_l_star: np.ndarray, + background_brightness: Optional[float], + noise_floor_threshold: float, + 
morphological_kernel_size: int, + frame_idx: int, +) -> Optional[MaskCandidate]: + """Score a single-frame candidate using background-aware positive signal only.""" + mask, threshold_value, min_area = build_signal_mask( + roi_l_star=roi_l_star, + background_brightness=background_brightness, + noise_floor_threshold=noise_floor_threshold, + morphological_kernel_size=morphological_kernel_size, + ) + if not np.any(mask): + return None + + reference_value = float(background_brightness) if background_brightness is not None else threshold_value + signal_values = np.maximum(roi_l_star[mask] - reference_value, 0.0) + score = float(np.sum(signal_values)) + if score <= 0.0: + return None + + return MaskCandidate( + frame_idx=int(frame_idx), + score=score, + background_brightness=None if background_brightness is None else float(background_brightness), + mask=mask, + pixel_count=int(np.count_nonzero(mask)), + signal_peak=float(np.max(signal_values)), + threshold_value=float(threshold_value), + min_component_area=int(min_area), + ) + + +def update_top_candidates( + candidates: Sequence[MaskCandidate], + candidate: Optional[MaskCandidate], + limit: int = MASK_TOP_CANDIDATES, +) -> List[MaskCandidate]: + """Return the top scored candidates with deterministic ordering.""" + if candidate is None: + return list(candidates) + + ranked = list(candidates) + [candidate] + ranked.sort(key=lambda item: (-item.score, item.frame_idx)) + return ranked[: max(1, int(limit))] + + +def _confidence_label( + candidate_count: int, + consensus_ratio: float, + stability_ratio: float, + pixel_count: int, + min_component_area: int, +) -> str: + if pixel_count <= 0: + return "none" + if candidate_count == 1 or consensus_ratio < 0.6 or stability_ratio < 0.4: + return "low" + if pixel_count <= (min_component_area * 2) or consensus_ratio < 0.8 or stability_ratio < 0.6: + return "medium" + return "high" + + +def build_consensus_mask( + candidates: Sequence[MaskCandidate], + capture_mode: str, + 
noise_floor_threshold: float, + morphological_kernel_size: int, +) -> Tuple[Optional[np.ndarray], MaskCaptureMetadata]: + """Build a deterministic fixed mask from top candidates and summarize provenance.""" + ranked = sorted(candidates, key=lambda item: (-item.score, item.frame_idx)) + if not ranked: + return ( + None, + MaskCaptureMetadata( + capture_mode=capture_mode, + warnings=["no_signal"], + noise_floor_threshold=float(noise_floor_threshold), + morphological_kernel_size=int(morphological_kernel_size), + ), + ) + + source_frames = [candidate.frame_idx for candidate in ranked] + background_values = [ + 0.0 if candidate.background_brightness is None else float(candidate.background_brightness) + for candidate in ranked + ] + signal_scores = [float(candidate.score) for candidate in ranked] + threshold_values = [float(candidate.threshold_value) for candidate in ranked] + min_component_area = max(candidate.min_component_area for candidate in ranked) + + if len(ranked) == 1: + consensus_mask = ranked[0].mask.copy() + consensus_ratio = 1.0 + stability_ratio = 1.0 + else: + mask_stack = np.stack([candidate.mask.astype(np.uint8) for candidate in ranked], axis=0) + support_counts = np.sum(mask_stack, axis=0) + required_votes = max(1, ceil(len(ranked) / 2)) + consensus_mask = support_counts >= required_votes + consensus_mask = filter_connected_components(consensus_mask, min_component_area) + if np.any(consensus_mask): + consensus_ratio = float(np.mean(support_counts[consensus_mask] / len(ranked))) + else: + consensus_ratio = 0.0 + union_mask = np.any(mask_stack.astype(bool), axis=0) + overlap = np.count_nonzero(consensus_mask) + union = np.count_nonzero(union_mask) + stability_ratio = float(overlap / union) if union else 0.0 + + warnings: List[str] = [] + if len(ranked) == 1: + warnings.append("single_frame_capture") + if not np.any(consensus_mask): + warnings.append("low_consensus") + if np.count_nonzero(consensus_mask) <= min_component_area: + 
warnings.append("small_mask") + if consensus_ratio and consensus_ratio < 0.7: + warnings.append("low_consensus") + if stability_ratio and stability_ratio < 0.5: + warnings.append("unstable_mask") + + confidence_label = _confidence_label( + candidate_count=len(ranked), + consensus_ratio=consensus_ratio, + stability_ratio=stability_ratio, + pixel_count=int(np.count_nonzero(consensus_mask)), + min_component_area=min_component_area, + ) + + metadata = MaskCaptureMetadata( + capture_mode=capture_mode, + source_frames=source_frames, + primary_source_frame=ranked[0].frame_idx, + background_values=background_values, + signal_scores=signal_scores, + threshold_values=threshold_values, + pixel_count=int(np.count_nonzero(consensus_mask)), + consensus_ratio=float(consensus_ratio), + stability_ratio=float(stability_ratio), + confidence_label=confidence_label, + min_component_area=int(min_component_area), + warnings=sorted(set(warnings)), + noise_floor_threshold=float(noise_floor_threshold), + morphological_kernel_size=int(morphological_kernel_size), + ) + if not np.any(consensus_mask): + return None, metadata + return consensus_mask.astype(bool), metadata diff --git a/ecl_analysis/analysis/models.py b/ecl_analysis/analysis/models.py index 111f52c..a7faa5e 100644 --- a/ecl_analysis/analysis/models.py +++ b/ecl_analysis/analysis/models.py @@ -1,7 +1,7 @@ """Data contracts for analysis requests and results.""" -from dataclasses import dataclass -from typing import List, Optional, Sequence, Tuple +from dataclasses import dataclass, field +from typing import Any, Dict, List, Optional, Sequence, Tuple import numpy as np @@ -9,6 +9,64 @@ RoiRect = Tuple[Point, Point] +@dataclass +class MaskCaptureMetadata: + """Structured provenance for a captured fixed mask.""" + + capture_mode: str + source_frames: List[int] = field(default_factory=list) + primary_source_frame: Optional[int] = None + background_values: List[float] = field(default_factory=list) + signal_scores: List[float] = 
field(default_factory=list) + threshold_values: List[float] = field(default_factory=list) + pixel_count: int = 0 + consensus_ratio: float = 0.0 + stability_ratio: float = 0.0 + confidence_label: str = "none" + min_component_area: int = 0 + warnings: List[str] = field(default_factory=list) + noise_floor_threshold: float = 0.0 + morphological_kernel_size: int = 0 + + def clone(self) -> "MaskCaptureMetadata": + """Return a detached copy for history snapshots and worker payloads.""" + return MaskCaptureMetadata( + capture_mode=str(self.capture_mode), + source_frames=[int(frame) for frame in self.source_frames], + primary_source_frame=None if self.primary_source_frame is None else int(self.primary_source_frame), + background_values=[float(value) for value in self.background_values], + signal_scores=[float(value) for value in self.signal_scores], + threshold_values=[float(value) for value in self.threshold_values], + pixel_count=int(self.pixel_count), + consensus_ratio=float(self.consensus_ratio), + stability_ratio=float(self.stability_ratio), + confidence_label=str(self.confidence_label), + min_component_area=int(self.min_component_area), + warnings=[str(value) for value in self.warnings], + noise_floor_threshold=float(self.noise_floor_threshold), + morphological_kernel_size=int(self.morphological_kernel_size), + ) + + def to_dict(self) -> Dict[str, Any]: + """Serialize mask provenance for JSON export.""" + return { + "capture_mode": self.capture_mode, + "source_frames": list(self.source_frames), + "primary_source_frame": self.primary_source_frame, + "background_values": list(self.background_values), + "signal_scores": list(self.signal_scores), + "threshold_values": list(self.threshold_values), + "pixel_count": self.pixel_count, + "consensus_ratio": self.consensus_ratio, + "stability_ratio": self.stability_ratio, + "confidence_label": self.confidence_label, + "min_component_area": self.min_component_area, + "warnings": list(self.warnings), + "noise_floor_threshold": 
self.noise_floor_threshold,
+            "morphological_kernel_size": self.morphological_kernel_size,
+        }
+
+
 @dataclass(frozen=True)
 class AnalysisRequest:
     """Immutable snapshot of all inputs required for frame analysis."""
@@ -23,6 +81,8 @@ class AnalysisRequest:
     background_percentile: float
     morphological_kernel_size: int
     noise_floor_threshold: float
+    mask_metadata: Sequence[Optional[MaskCaptureMetadata]] = field(default_factory=list)
+    analysis_metadata: Dict[str, Any] = field(default_factory=dict)
 
 
 @dataclass
@@ -40,3 +100,6 @@ class AnalysisResult:
     elapsed_seconds: float
     start_frame: int
     end_frame: int
+    use_fixed_mask: bool = False
+    mask_metadata: List[Optional[MaskCaptureMetadata]] = field(default_factory=list)
+    analysis_metadata: Dict[str, Any] = field(default_factory=dict)
diff --git a/ecl_analysis/constants.py b/ecl_analysis/constants.py
index 1ae995a..f5bc633 100644
--- a/ecl_analysis/constants.py
+++ b/ecl_analysis/constants.py
@@ -33,6 +33,7 @@ AUTO_DETECT_BASELINE_PERCENTILE = 5
 BRIGHTNESS_NOISE_FLOOR_PERCENTILE = 2
 
 DEFAULT_MANUAL_THRESHOLD = 5.0
+DEFAULT_NOISE_FLOOR_THRESHOLD = 5.0
 MORPHOLOGICAL_KERNEL_SIZE = 3
 MOUSE_RESIZE_HANDLE_SENSITIVITY = 10
 
diff --git a/ecl_analysis/export/csv_exporter.py b/ecl_analysis/export/csv_exporter.py
index 7dc7ad1..6d49fa5 100644
--- a/ecl_analysis/export/csv_exporter.py
+++ b/ecl_analysis/export/csv_exporter.py
@@ -2,10 +2,11 @@
 from __future__ import annotations
 
+import json
 import logging
 import os
 from dataclasses import dataclass
-from typing import Callable, List, Optional, Sequence, Tuple
+from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple
 
 import numpy as np
 import pandas as pd
@@ -120,6 +121,37 @@ def save_analysis_outputs(
             plot_failed = True
             summary_lines.append(f" - FAILED: ROI {actual_roi_idx + 1}")
 
+    metadata_payload: Dict[str, Any] = {
+        "analysis_name": clean_analysis_name,
+        "video_name": base_video_name,
+        "frames_processed": analysis_result.frames_processed,
+        "total_frames": analysis_result.total_frames,
+        "start_frame": analysis_result.start_frame,
+        "end_frame": analysis_result.end_frame,
+        "elapsed_seconds": analysis_result.elapsed_seconds,
+        "use_fixed_mask": analysis_result.use_fixed_mask,
+        "non_background_rois": list(analysis_result.non_background_rois),
+        "analysis_metadata": dict(analysis_result.analysis_metadata),
+        "mask_metadata": [
+            metadata.to_dict() if metadata is not None else None
+            for metadata in analysis_result.mask_metadata
+        ],
+    }
+    metadata_filename = (
+        f"{clean_analysis_name}_{base_video_name}_"
+        f"frames{analysis_result.start_frame + 1}-{analysis_result.end_frame + 1}_analysis_metadata.json"
+    )
+    metadata_path = os.path.join(save_dir, metadata_filename)
+    try:
+        with open(metadata_path, "w", encoding="utf-8") as metadata_file:
+            json.dump(metadata_payload, metadata_file, indent=2)
+        out_paths.append(metadata_path)
+        summary_lines.append(f" - Saved Metadata: {metadata_filename}")
+    except Exception as exc:
+        logging.exception("Failed to export metadata to %s: %s", metadata_path, exc)
+        plot_failed = True
+        summary_lines.append(" - FAILED: analysis metadata export")
+
     return ExportResult(
         summary_lines=summary_lines,
         avg_brightness_summary=avg_brightness_summary,
diff --git a/ecl_analysis/ingest/__init__.py b/ecl_analysis/ingest/__init__.py
new file mode 100644
index 0000000..f74a5ee
--- /dev/null
+++ b/ecl_analysis/ingest/__init__.py
@@ -0,0 +1,19 @@
+"""Capture-ingest helpers."""
+
+from .metadata import (
+    CAPTURE_METADATA_SCHEMA_NAME,
+    CAPTURE_METADATA_SCHEMA_VERSION,
+    CURRENT_CAPTURE_SCHEMA_VERSION,
+    CaptureMetadataValidation,
+    get_capture_metadata_schema_contract,
+    validate_capture_metadata,
+)
+
+__all__ = [
+    "CAPTURE_METADATA_SCHEMA_NAME",
+    "CAPTURE_METADATA_SCHEMA_VERSION",
+    "CURRENT_CAPTURE_SCHEMA_VERSION",
+    "CaptureMetadataValidation",
+    "get_capture_metadata_schema_contract",
+    "validate_capture_metadata",
+]
diff --git a/ecl_analysis/ingest/metadata.py b/ecl_analysis/ingest/metadata.py
new file mode 100644
index 0000000..9e9c28c
--- /dev/null
+++ b/ecl_analysis/ingest/metadata.py
@@ -0,0 +1,364 @@
+"""Validation helpers for camera-capture sidecar metadata."""
+
+from __future__ import annotations
+
+import json
+import os
+from dataclasses import dataclass, field
+from typing import Any, Dict, List, Optional
+
+CAPTURE_METADATA_SCHEMA_NAME = "ecl_capture_metadata"
+CURRENT_CAPTURE_SCHEMA_VERSION = "1.0"
+CAPTURE_METADATA_SCHEMA_VERSION = CURRENT_CAPTURE_SCHEMA_VERSION
+SUPPORTED_CAPTURE_SCHEMA_VERSIONS = {CURRENT_CAPTURE_SCHEMA_VERSION}
+REQUIRED_CAPTURE_FIELDS = (
+    "device_model",
+    "exposure_mode_locked",
+    "exposure_duration",
+    "iso",
+    "white_balance_mode_locked",
+    "fps",
+    "resolution",
+    "hdr_disabled",
+)
+OPTIONAL_CAPTURE_FIELDS = (
+    "capture_id",
+    "recorded_at",
+    "app_version",
+    "ios_version",
+    "color_space",
+    "video_codec",
+)
+KNOWN_CAPTURE_FIELDS = ("schema_version",) + REQUIRED_CAPTURE_FIELDS + OPTIONAL_CAPTURE_FIELDS
+
+CAPTURE_METADATA_FIELD_SPECS: Dict[str, Dict[str, str]] = {
+    "schema_version": {
+        "type": "string",
+        "required": "false during transition; assumed when omitted",
+        "description": "Version of the sidecar contract emitted by capture.",
+    },
+    "device_model": {
+        "type": "string",
+        "required": "true",
+        "description": "Capture device model.",
+    },
+    "exposure_mode_locked": {
+        "type": "boolean",
+        "required": "true",
+        "description": "Exposure lock state during capture.",
+    },
+    "exposure_duration": {
+        "type": "number",
+        "required": "true",
+        "description": "Exposure duration in seconds.",
+    },
+    "iso": {
+        "type": "number",
+        "required": "true",
+        "description": "Sensor ISO at capture time.",
+    },
+    "white_balance_mode_locked": {
+        "type": "boolean",
+        "required": "true",
+        "description": "White-balance lock state during capture.",
+    },
+    "fps": {
+        "type": "number",
+        "required": "true",
+        "description": "Configured frames per second.",
+    },
+    "resolution": {
+        "type": "string|object",
+        "required": "true",
+        "description": "Capture resolution as WIDTHxHEIGHT or {width,height}.",
+    },
+    "hdr_disabled": {
+        "type": "boolean",
+        "required": "true",
+        "description": "Whether HDR/tone mapping was disabled.",
+    },
+    "capture_id": {
+        "type": "string",
+        "required": "false",
+        "description": "Stable capture identifier for downstream provenance.",
+    },
+    "recorded_at": {
+        "type": "string",
+        "required": "false",
+        "description": "Capture timestamp in ISO-8601 format.",
+    },
+    "app_version": {
+        "type": "string",
+        "required": "false",
+        "description": "Version of the capture app.",
+    },
+    "ios_version": {
+        "type": "string",
+        "required": "false",
+        "description": "iOS version on the capture device.",
+    },
+    "color_space": {
+        "type": "string",
+        "required": "false",
+        "description": "Recorded color space/profile label.",
+    },
+    "video_codec": {
+        "type": "string",
+        "required": "false",
+        "description": "Recorded video codec label.",
+    },
+}
+
+
+def get_capture_metadata_schema_contract() -> Dict[str, object]:
+    """Return the current lightweight sidecar schema contract."""
+    return {
+        "schema_name": CAPTURE_METADATA_SCHEMA_NAME,
+        "schema_version": CAPTURE_METADATA_SCHEMA_VERSION,
+        "supported_schema_versions": sorted(SUPPORTED_CAPTURE_SCHEMA_VERSIONS),
+        "required_fields": {
+            field: dict(CAPTURE_METADATA_FIELD_SPECS[field]) for field in REQUIRED_CAPTURE_FIELDS
+        },
+        "optional_fields": {
+            field: dict(CAPTURE_METADATA_FIELD_SPECS[field]) for field in OPTIONAL_CAPTURE_FIELDS
+        },
+        "schema_version_field": dict(CAPTURE_METADATA_FIELD_SPECS["schema_version"]),
+        "resolution_formats": ["1920x1080", {"width": 1920, "height": 1080}],
+    }
+
+
+@dataclass(frozen=True)
+class CaptureMetadataValidation:
+    """Structured validation result for capture metadata sidecars."""
+
+    is_valid: bool
+    sidecar_path: str
+    errors: List[str]
+    warnings: List[str]
+    metadata: Optional[Dict[str, object]] = None
+    normalized_metadata: Optional[Dict[str, object]] = None
+    detected_schema_version: Optional[str] = None
+    schema_version_assumed: bool = False
+    unrecognized_fields: List[str] = field(default_factory=list)
+
+    @property
+    def status(self) -> str:
+        if not self.is_valid:
+            return "invalid"
+        if self.warnings:
+            return "valid_with_warnings"
+        return "valid"
+
+    def to_dict(self) -> Dict[str, Any]:
+        """Serialize the validation result for export/reporting."""
+        return {
+            "is_valid": self.is_valid,
+            "status": self.status,
+            "sidecar_path": self.sidecar_path,
+            "errors": list(self.errors),
+            "warnings": list(self.warnings),
+            "schema_name": CAPTURE_METADATA_SCHEMA_NAME,
+            "expected_schema_version": CAPTURE_METADATA_SCHEMA_VERSION,
+            "detected_schema_version": self.detected_schema_version,
+            "schema_version": (
+                None if self.normalized_metadata is None else self.normalized_metadata.get("schema_version")
+            ),
+            "schema_version_assumed": self.schema_version_assumed,
+            "unrecognized_fields": list(self.unrecognized_fields),
+            "schema_contract": get_capture_metadata_schema_contract(),
+            "metadata": dict(self.metadata) if isinstance(self.metadata, dict) else None,
+            "normalized_metadata": (
+                dict(self.normalized_metadata) if isinstance(self.normalized_metadata, dict) else None
+            ),
+        }
+
+
+def _sidecar_path_for_video(video_path: str) -> str:
+    base, _ = os.path.splitext(video_path)
+    return f"{base}.capture.json"
+
+
+def _coerce_bool(value: object) -> Optional[bool]:
+    if isinstance(value, bool):
+        return value
+    if isinstance(value, str):
+        normalized = value.strip().lower()
+        if normalized in {"true", "1", "yes"}:
+            return True
+        if normalized in {"false", "0", "no"}:
+            return False
+    return None
+
+
+def _coerce_positive_float(value: object, field_name: str, errors: List[str]) -> Optional[float]:
+    if value is None:
+        return None
+    try:
+        numeric = float(value)
+    except (TypeError, ValueError):
+        errors.append(f"{field_name} must be numeric.")
+        return None
+    if numeric <= 0:
+        errors.append(f"{field_name} must be greater than 0.")
+        return None
+    return numeric
+
+
+def _normalize_resolution(value: object, errors: List[str]) -> Optional[str]:
+    if value is None:
+        return None
+    if isinstance(value, str):
+        normalized = value.strip().lower().replace(" ", "")
+        if "x" in normalized:
+            width, height = normalized.split("x", 1)
+            if width.isdigit() and height.isdigit():
+                return f"{int(width)}x{int(height)}"
+    if isinstance(value, dict):
+        width = value.get("width")
+        height = value.get("height")
+        if isinstance(width, (int, float)) and isinstance(height, (int, float)) and width > 0 and height > 0:
+            return f"{int(width)}x{int(height)}"
+    errors.append("resolution must be a string like '1920x1080' or an object with width/height.")
+    return None
+
+
+def validate_capture_metadata(video_path: str) -> CaptureMetadataValidation:
+    """Load and validate `