Pipeline silently produces a broken poster (only 1 column filled) on papers with ≤4 narrative sections #8



## Summary

The pipeline ran end-to-end without crashing but produced a poster where only the **left column** had content; the middle and right columns were entirely blank. Tracing through the agent log files revealed three independent failure modes that compound silently. This issue documents all three with minimal patches that I'm using locally.

Run command:

```bash
python -m src.workflow.pipeline \
  --poster_width 54 --poster_height 36 \
  --paper_path ./data/PhyGround/paper.pdf \
  --text_model gemini-2.5-pro \
  --vision_model gemini-2.5-pro \
  --logo "" \
  --aff_logo ./data/PhyGround/aff.png
```

---

## Bug 1 — Curator validator rejects valid storyboards over a single dropped field

**File**: `src/agents/curator.py`, `_validate_story_board`

When the LLM (gemini-2.5-pro in my run) generates a 5-section storyboard but forgets `column_assignment` on the *last* section, the entire storyboard is rejected. The retry then sometimes makes the situation worse — the LLM panics and returns only 1–2 sections, blowing through the 3 attempts:

```
section 5 missing 'column_assignment'
section 5 missing 'column_assignment'
need 5-8 sections, got 2
❌ failed to create story board
```

After this, every downstream agent fails (Color → Section Title → Layout → Font → Renderer).

**Patch**: insert an autofix step right after `extract_json` and before validation. For each section it assigns the least-used column when `column_assignment` is missing or invalid, defaults `vertical_priority`, generates `section_id` and `section_title` if absent, and ensures `text_content` is at least an empty list. Also: if the sections leave a column empty, it moves one section (the one with the largest `importance_level`) out of the most crowded column into the empty one so all 3 columns get used.

```python
story_board = extract_json(response.content)
self._autofix_story_board(story_board)   # NEW
if self._validate_story_board(...):
    ...
```

```python
def _autofix_story_board(self, story_board):
    """Fill in fields the LLM commonly drops, then make sure no column is left empty."""
    sections = story_board.get("spatial_content_plan", {}).get("sections", [])
    column_cycle = ["left", "middle", "right"]
    vertical_cycle = ["top", "middle", "bottom"]
    # Valid assignments seen so far; used to find the least-used column.
    used = [s.get("column_assignment") for s in sections if s.get("column_assignment") in column_cycle]
    for i, s in enumerate(sections):
        s.setdefault("section_id", f"section_{i+1}")
        if s.get("column_assignment") not in column_cycle:
            # Missing or invalid: assign the least-used column (ties go to the leftmost).
            counts = {c: used.count(c) for c in column_cycle}
            s["column_assignment"] = min(counts, key=counts.get)
            used.append(s["column_assignment"])
        if s.get("vertical_priority") not in vertical_cycle:
            s["vertical_priority"] = vertical_cycle[i % 3]
        s.setdefault("section_title", s["section_id"])
        s.setdefault("text_content", [])
    # Redistribute when any column is still empty.
    if len(sections) >= 3:
        cols_used = {s["column_assignment"] for s in sections}
        for empty_col in [c for c in column_cycle if c not in cols_used]:
            counts = {c: sum(1 for s in sections if s.get("column_assignment") == c) for c in column_cycle}
            src = max(counts, key=counts.get)  # most crowded column
            if counts[src] <= 1:
                break  # nothing to spare without emptying another column
            # Move the section with the largest importance_level out of the crowded column.
            target = max((s for s in sections if s.get("column_assignment") == src),
                         key=lambda s: s.get("importance_level", 0))
            target["column_assignment"] = empty_col
```
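
To make the least-used-column rule concrete, here is a minimal standalone sketch of just that step, run on a storyboard shaped like the failing one from the log (5 sections, the last one missing `column_assignment`). The function name `fill_missing_columns` and the section IDs are made up for illustration; this is not the code in `curator.py`.

```python
from collections import Counter

COLUMNS = ("left", "middle", "right")


def fill_missing_columns(sections):
    """Assign the least-used column to sections with a missing or invalid one.

    Hypothetical standalone helper; it mirrors only the column-filling part
    of the autofix patch above.
    """
    counts = Counter(
        s["column_assignment"] for s in sections
        if s.get("column_assignment") in COLUMNS
    )
    for s in sections:
        if s.get("column_assignment") not in COLUMNS:
            # min() returns the first minimal item, so ties go to the leftmost column.
            col = min(COLUMNS, key=lambda c: counts[c])
            s["column_assignment"] = col
            counts[col] += 1
    return sections


sections = [
    {"section_id": "s1", "column_assignment": "left"},
    {"section_id": "s2", "column_assignment": "left"},
    {"section_id": "s3", "column_assignment": "middle"},
    {"section_id": "s4", "column_assignment": "right"},
    {"section_id": "s5"},  # the dropped field from the log above
]
fill_missing_columns(sections)
print(sections[-1]["column_assignment"])  # middle
```

With `left` already holding two sections, `middle` and `right` tie at one each, and the tie resolves to `middle`. The existing assignments are left untouched, so the storyboard can pass validation without another LLM round trip.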

---

## Bug 2 — Balancer LLM happily drops sections and the validator doesn't notice

**File**: `src/agents/balancer_agent.py`, `_validate_story_board`

The balancer agent's prompt asks the LLM to redistribute content so every column is used. On my paper the LLM consistently took the lazy path: instead of moving sections, it **deleted** the sections it couldn't fit and returned a 2-section storyboard with both sections in the left column. The current `_validate_story_board` only checks `column_assignment` enum membership, so this passes validation.

Result: `column_analysis.json` shows left = 110% utilization (overflow), middle/right = 0%.

```json
"left":   {"utilization_rate": 1.10, "status": "overflow"},
"middle": {"utilization_rate": 0.00, "status": "underutilized"},
"right":  {"utilization_rate": 0.00, "status": "underutilized"}
```

`balancer_decisions.json` also reports `section_removals: []` even though sections *were* removed: the regex only catches "removed section" text in the response, not actual JSON diffs.

**Patch**: add a `_preserves_sections(original, optimized)` check that fails if any original `section_id` is missing OR if all optimized sections collapse into <2 columns. Pass the original `story_board` into the validation call.

```python
optimized_story_board = extract_json(response.content)
if self._validate_story_board(optimized_story_board) \
   and self._preserves_sections(story_board, optimized_story_board):  # NEW
    ...
```

```python
def _preserves_sections(self, original, optimized):
    """Reject balancer output that drops sections or uses fewer than 2 columns."""
    orig_ids = {s.get("section_id") for s in original["spatial_content_plan"]["sections"]}
    new = optimized["spatial_content_plan"]["sections"]
    # Any original section_id absent from the optimized storyboard was dropped.
    missing = orig_ids - {s.get("section_id") for s in new}
    if missing:
        log_agent_error(self.name, f"balancer dropped sections: {missing}")
        return False
    # Everything collapsed into a single column (or none): not a usable layout.
    cols = {s.get("column_assignment") for s in new if s.get("column_assignment")}
    if len(cols) < 2:
        return False
    return True
```

When all retries fail this guard, the balancer's existing fallback (`return {"optimized_story_board": story_board, ...}`) preserves the original storyboard, which is much better than the silent collapse.
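
The same ID comparison could also populate `section_removals` directly instead of relying on the regex over the response text. A minimal sketch, assuming the storyboard shape shown above; the helper name `diff_section_ids` and the `section_additions` key are hypothetical and not part of the current code:

```python
from typing import Any


def diff_section_ids(original: dict[str, Any], optimized: dict[str, Any]) -> dict[str, list[str]]:
    """Compare section IDs between the input and output storyboards.

    Hypothetical helper: reports what actually changed in the JSON rather
    than what the LLM says it changed.
    """

    def ids(board: dict[str, Any]) -> set[str]:
        sections = board.get("spatial_content_plan", {}).get("sections", [])
        return {s["section_id"] for s in sections if s.get("section_id")}

    before, after = ids(original), ids(optimized)
    return {
        "section_removals": sorted(before - after),
        "section_additions": sorted(after - before),
    }


original = {"spatial_content_plan": {"sections": [
    {"section_id": "intro"}, {"section_id": "method"}, {"section_id": "results"},
]}}
optimized = {"spatial_content_plan": {"sections": [
    {"section_id": "intro"}, {"section_id": "method"},
]}}

print(diff_section_ids(original, optimized))
# {'section_removals': ['results'], 'section_additions': []}
```

A non-empty `section_additions` list would also flag the case that triggers Bug 3 below, where the balancer introduces a section ID that no earlier agent has seen.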

---

## Bug 3 — Layout agent crashes (`'NoneType' object has no attribute 'get'`) when balancer adds new section IDs

**File**: `src/agents/layout_agent.py`, `_create_section_title_design`

`section_title_designer` runs **before** the balancer. If the balancer adds a new section (or my autofix above renames one), the layout agent looks up that section_id in `state["section_title_design"]["section_applications"]`, doesn't find it, and `section_app` stays `None`:

```python
section_app = None
for app in title_design.get("section_applications", []):
    if app.get("section_id") == section_id:
        section_app = app
        break
title_styling  = section_app.get("title_styling", {})    # 💥 AttributeError
accent_styling = section_app.get("accent_styling", {})
```

The exception is caught one frame up and reported as `❌ final layout error: 'NoneType' object has no attribute 'get'`, which then cascade-fails Font and Renderer with cryptic "missing design_layout" / "no styled_layout" messages.

**Patch**: fall back to the first available design (or `{}`) when no section-specific design exists.

```python
section_app = None
for app in title_design.get("section_applications", []):
    if app.get("section_id") == section_id:
        section_app = app
        break
if section_app is None:
    apps = title_design.get("section_applications", [])
    section_app = apps[0] if apps else {}
    log_agent_warning(self.name, f"no title design for section '{section_id}', using fallback styling")
```

---

## Config knobs that helped

In `config/poster_config.yaml`:

```yaml
validation:
  min_section_count: 4   # was 5 — short papers struggle to hit 5
  max_llm_attempts: 5    # was 3 — gives autofix room to converge
```

---

## Suggested next steps

1. **Curator validator should auto-fill** instead of rejecting on a single missing field. The retry-on-validation-failure pattern is fragile because LLM behavior often gets *worse* under "you got it wrong, try again" feedback. The autofix above covers the most common drops without touching the prompt.
2. **Balancer needs a hard invariant**: section count must be non-decreasing (or at minimum: `set(original_section_ids) ⊆ set(optimized_section_ids)`). The free-form natural-language "decisions extraction" via regex (`section_removals` heuristic) is unreliable — compare actual section IDs between input and output.
3. **Pipeline should fail fast** when an upstream agent produces a structurally broken state. Right now Color/SectionTitle/Layout/Font/Renderer all run even after curator fails completely, producing piles of `❌ failed: missing X` cascades that obscure the original cause.
4. The "visual table_" / "visual figure_" lookup warning (empty `lookup_id`) suggests the LLM occasionally emits malformed `visual_id` strings — worth adding a regex check (`r"^(table|figure)_\d+$"`) at storyboard validation time.
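
Suggestions 3 and 4 can be sketched together: a `visual_id` check using the regex above, wired into a runner that stops at the first failing stage. Everything except the regex is an assumption made for illustration (`malformed_visual_ids`, `run_stages`, `StageFailed`, the toy `curator`/`color` stages, and the state keys); the real pipeline's agent interface may look quite different.

```python
import re
from typing import Any, Callable

# Pattern quoted from suggestion 4.
VISUAL_ID_RE = re.compile(r"^(table|figure)_\d+$")


def malformed_visual_ids(visual_ids: list[str]) -> list[str]:
    """Return the IDs that do not look like 'table_3' or 'figure_12'."""
    return [v for v in visual_ids if not VISUAL_ID_RE.fullmatch(v)]


class StageFailed(RuntimeError):
    """Raised by run_stages (hypothetical) to name the stage that broke."""


Stage = Callable[[dict[str, Any]], dict[str, Any]]


def run_stages(stages: list[tuple[str, Stage]], state: dict[str, Any]) -> dict[str, Any]:
    """Run stages in order and stop at the first one that raises."""
    for name, stage in stages:
        try:
            state = stage(state)
        except Exception as exc:
            # Surface the original cause instead of letting later stages
            # fail with unrelated "missing X" errors.
            raise StageFailed(f"{name} failed: {exc}") from exc
    return state


def curator(state: dict[str, Any]) -> dict[str, Any]:
    """Toy stand-in for the curator: validates visual IDs only."""
    bad = malformed_visual_ids(state["visual_ids"])
    if bad:
        raise ValueError(f"malformed visual_id values: {bad}")
    return state


def color(state: dict[str, Any]) -> dict[str, Any]:
    """Toy stand-in for a downstream agent."""
    state["ran"].append("color")
    return state


state = {"visual_ids": ["figure_1", "table_", "figure_"], "ran": []}
try:
    run_stages([("curator", curator), ("color", color)], state)
except StageFailed as err:
    message = str(err)
    print(message)
    # curator failed: malformed visual_id values: ['table_', 'figure_']
```

The downstream `color` stage never runs here, so the only error reported is the one that names the original cause.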

---

Happy to open a PR with these patches if the maintainers are interested. Tested on:

- Python 3.11
- gemini-2.5-pro (text + vision)
- 10-page paper, 4 narrative sections, 3 figures, 7 tables

End-to-end runtime after all patches: 4.7 min, 10 API calls.

