Feature request: make shell-heavy spec suites easier to optimize without losing spec clarity #21

@spilist

Hi, we use specdown for executable specs in a fairly shell-heavy TypeScript repo, and we ran into a performance pattern that feels common enough to deserve first-class support.

Problem

specdown already helps with file-level parallelism via -jobs, which is useful.

The remaining bottleneck for us was different:

  • many run:shell blocks
  • several of them call expensive commands like node --import tsx --test ...
  • the cost is dominated by repeated process startup / loader startup, not by the shell block text itself

In our case, the suite was structurally correct, but it was too easy to end up with:

  • repeated cold starts for the same test file or test runtime
  • spec prose split across sections, with each section re-running the same heavyweight command
  • slow feedback loops even after enabling -jobs

What helped on our side was:

  • collapsing duplicate test invocations
  • moving seam-only assertions from shell blocks to source_guard-style checks
  • enabling specdown run -jobs N by default in our wrapper

That improved things a lot, but it still feels like the tool could help more directly.
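
For illustration, here is a rough sketch of the kind of source_guard-style check I mean. The `sourceGuard` helper, the `GuardResult` shape, and the demo file are all hypothetical names made up for this issue, not specdown API. The point is only that a seam assertion can read a file and match patterns without spawning any process.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical result shape, not part of specdown.
interface GuardResult {
  ok: boolean;
  missing: string[];
}

// Hypothetical helper: checks that a source file contains every required
// pattern. It only reads the file; no test runtime is started.
function sourceGuard(filePath: string, required: RegExp[]): GuardResult {
  const text = readFileSync(filePath, "utf8");
  const missing = required
    .filter((re) => !re.test(text))
    .map((re) => re.source);
  return { ok: missing.length === 0, missing };
}

// Demo against a throwaway file so the sketch is self-contained.
const demoDir = mkdtempSync(join(tmpdir(), "guard-demo-"));
const demoFile = join(demoDir, "seam.ts");
writeFileSync(
  demoFile,
  "export function parseConfig(raw: string) { return JSON.parse(raw); }\n",
);

const guardResult = sourceGuard(demoFile, [
  /export function parseConfig/,
  /export function loadConfig/,
]);
console.log(guardResult);
```

In this demo the second pattern is deliberately absent, so the guard reports it as missing instead of passing.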

What I wish specdown supported better

1. Better performance ergonomics for repeated shell patterns

A first-class way to express "same runtime, different assertions" without restarting a heavyweight process every time.

Examples of possible directions:

  • suite-scoped or section-scoped reusable exec sessions
  • a built-in pattern for “run once, assert many”
  • a way to bind named command outputs once and reuse them across cases more ergonomically
  • helpers that encourage converting repeated shell checks into structured checks before performance becomes a problem

I do not mean result caching across source changes. I mean reducing avoidable process startup when the spec is intentionally checking multiple facts from the same expensive runtime.
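
To make "run once, assert many" concrete, here is a minimal sketch of the shape I have in mind. `runOnceAssertMany` and the `Assertion` type are hypothetical, not something specdown provides today, and the inline `node -e` command is only a stand-in for a heavyweight invocation such as node --import tsx --test ...

```typescript
import { execFileSync } from "node:child_process";

// Hypothetical shape: a named check over the captured output of one run.
interface Assertion {
  name: string;
  check: (stdout: string) => boolean;
}

function runOnceAssertMany(
  cmd: string,
  args: string[],
  assertions: Assertion[],
): { name: string; passed: boolean }[] {
  // The expensive process starts exactly once.
  const stdout = execFileSync(cmd, args, { encoding: "utf8" });
  // Every assertion only inspects the captured text.
  return assertions.map((a) => ({ name: a.name, passed: a.check(stdout) }));
}

// Stand-in for a heavyweight command; prints two TAP-like lines.
const results = runOnceAssertMany(
  process.execPath,
  ["-e", "console.log('ok 1 - parses config'); console.log('ok 2 - rejects bad input')"],
  [
    { name: "parses config", check: (out) => out.includes("ok 1 - parses config") },
    { name: "rejects bad input", check: (out) => out.includes("ok 2 - rejects bad input") },
    { name: "no failures", check: (out) => !/^not ok/m.test(out) },
  ],
);
console.log(results);
```

Each case still gets its own name and pass/fail result, so the spec prose can stay split across sections while the process cost is paid once.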

2. Better performance visibility in machine-readable output

The HTML report was useful for spotting slow cases, but I would love stronger performance metadata in report.json / CLI output.

Useful additions would be:

  • top slowest cases
  • per-spec total time
  • per-case startup vs execution time if available
  • a summary of which block kinds dominate runtime
  • possibly a --perf flag or a separate trace mode

This would make it much easier to answer:

  • “why is this suite slow?”
  • “which spec should I refactor first?”
  • “did a spec refactor actually reduce cost?”
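
As a sketch of the summary I would like to get, assuming per-case timing records were available: the `CaseTiming` shape below is my own guess for illustration and is not the actual report.json schema, and the sample durations are invented.

```typescript
// Hypothetical per-case timing record; the real report.json may differ.
interface CaseTiming {
  spec: string;
  caseName: string;
  blockKind: string;
  durationMs: number;
}

// Sums durations grouped by an arbitrary key (spec file, block kind, ...).
function sumBy(
  cases: CaseTiming[],
  key: (c: CaseTiming) => string,
): Map<string, number> {
  const totals = new Map<string, number>();
  for (const c of cases) {
    const k = key(c);
    totals.set(k, (totals.get(k) ?? 0) + c.durationMs);
  }
  return totals;
}

function summarize(cases: CaseTiming[], topN: number) {
  // Copy before sorting so the caller's array is left untouched.
  const slowest = [...cases]
    .sort((a, b) => b.durationMs - a.durationMs)
    .slice(0, topN);
  return {
    slowest,
    perSpec: sumBy(cases, (c) => c.spec),
    perBlockKind: sumBy(cases, (c) => c.blockKind),
  };
}

// Invented sample data, for illustration only.
const sample: CaseTiming[] = [
  { spec: "cli.spec.md", caseName: "prints help", blockKind: "run:shell", durationMs: 1800 },
  { spec: "cli.spec.md", caseName: "rejects bad flag", blockKind: "run:shell", durationMs: 1750 },
  { spec: "config.spec.md", caseName: "exports parser", blockKind: "source_guard", durationMs: 4 },
];

const summary = summarize(sample, 2);
console.log(summary.slowest.map((c) => c.caseName));
console.log(summary.perSpec, summary.perBlockKind);
```

Comparing `perSpec` before and after a refactor would answer the third question directly.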

3. Docs guidance for shell-heavy suites

The current syntax and workflow docs are good, but I think a dedicated section on performance-oriented authoring would help a lot.

For example:

  • when to replace repeated run:shell blocks with a check table + adapter
  • when to use file/source guards instead of full runtime execution
  • how to avoid repeating the same heavyweight command with different --test-name-pattern filters
  • how to structure specs so prose stays readable but execution stays cheap
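
One example such a docs section could show, sketched under assumptions: since --test-name-pattern takes a regular expression, several filters against the same test file can be merged into one invocation with alternation. The `combinedTestArgs` helper and the test file path are hypothetical.

```typescript
// Hypothetical helper: builds one argv for `node` that covers several
// name filters, instead of one cold start per filter.
function combinedTestArgs(testFile: string, namePatterns: string[]): string[] {
  // Wrap each pattern in a non-capturing group so alternation stays scoped.
  const combined = namePatterns.map((p) => `(?:${p})`).join("|");
  return ["--import", "tsx", "--test", `--test-name-pattern=${combined}`, testFile];
}

const args = combinedTestArgs("test/cli.test.ts", [
  "parses config",
  "rejects bad input",
]);
console.log(["node", ...args].join(" "));
```

The spec can then assert on each test name in the single captured output rather than re-running the file per filter.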

Why this matters

The big value of specdown is that specs remain readable and close to the contract.

The risk in shell-heavy repos is that authors keep the prose clean but accidentally make execution much more expensive than necessary. Then teams either:

  • stop running specs often, or
  • start bypassing specdown for faster ad hoc verification

It would be great if specdown made the fast path more natural.

Concretely, if I had to pick two things

I would vote for:

  1. richer timing/perf data in report.json
  2. a first-class “run once, assert many” or reusable-exec-session pattern

Those two alone would make shell-heavy suites much easier to keep healthy.

If helpful, I can also write up a more concrete example from a real suite where repeated tsx --test cold starts dominated runtime.
