Repro: it works as shipped; one line breaks it
curl -fsSL https://rubys.github.io/roundhouse/browse/spinel.tgz -o spinel.tgz
tar xzf spinel.tgz && cd spinel
make build SPINEL=/path/to/spinel # spinel main.rb --rbs sig -o build/blog
( cd e2e && npm install && npx playwright install chromium && CI=1 npx playwright test )
# -> 6/6 pass in ~2s
Now change the one line in runtime/tep/net.rb (Sock.sphttp_filesize, used to serve static assets):
- File.read(path).length
+ File.size(path)
Rebuild and rerun the e2e -> 6/6 fail: every page.goto times out at 30s. Same spinel binary, same app, only this line changed.
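For reference, here is a minimal sketch of how the two expressions compare under CRuby (not Spinel). The helper name size_both_ways and the temp file are hypothetical and only serve the illustration. Under CRuby, String#length counts characters and File.size counts bytes, so the two agree for ASCII-only content such as the asset written below. Whether Spinel's runtime follows the same semantics is an assumption this sketch does not verify.

```ruby
require "tempfile"

# Hypothetical helper (not part of roundhouse or spinel): measures one file
# three ways so the results can be compared side by side.
def size_both_ways(path)
  content = File.read(path)
  {
    read_length:   content.length,   # characters, per the string's encoding
    read_bytesize: content.bytesize, # bytes held in memory after the read
    file_size:     File.size(path)   # bytes on disk, no read required
  }
end

# ASCII-only sample asset: all three numbers are expected to match.
Tempfile.create("asset") do |f|
  f.write("body { color: red }\n")
  f.flush
  p size_both_ways(f.path)
end
```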
What "fail" looks like
The server stays alive but deaf: it stops accepting new connections. Running sample on the wedged process shows it 100% in the scheduler's poll loop:
sp_Scheduled_s_run_worker -> sp_net_poll_run -> poll() [single thread, all samples]
It's blocked in poll() and never wakes for an incoming connection — as if the listen socket dropped out of the poll set.
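The "alive but deaf" state can be detected from outside with a small probe. The sketch below is plain CRuby, and the name responds? and its timeout default are hypothetical. A bare TCP connect is not a reliable signal here, because the kernel completes handshakes on the listen backlog even when the process never calls accept. The probe therefore sends a request and waits for at least one byte of response.

```ruby
require "socket"

# Hypothetical probe: true if the server sends at least one byte in reply
# to a GET within `timeout` seconds, false on timeout or any socket error.
def responds?(host, port, timeout: 2.0)
  sock = Socket.tcp(host, port, connect_timeout: timeout)
  sock.write("GET / HTTP/1.1\r\nHost: #{host}\r\nConnection: close\r\n\r\n")
  # Wait for readability; nil means nothing arrived before the timeout.
  return false unless IO.select([sock], nil, nil, timeout)
  data = sock.read_nonblock(1, exception: false)
  data.is_a?(String) && !data.empty?
rescue SystemCallError, IOError, SocketError
  false
ensure
  sock&.close
end
```

Against a healthy build, a call such as responds?("127.0.0.1", 3000) should return true; against the wedged process it should return false once the timeout elapses. The port is an assumption and should be replaced with whatever the app listens on.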
We tried to reduce it to a bare sp_net + File.size program and couldn't
Everything simpler stays healthy or is build-identical:
- File.size in isolation returns the correct value.
- ~400 concurrent File.size static GETs: healthy.
- parked /cable WebSocket + keep-alive File.size bursts: healthy.
- a comment POST that broadcasts to parked WS subscribers: healthy, and identical on both builds.
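A reduced check along the lines of the concurrent static GET test above could be sketched as follows. This is not the harness actually used for the report. The helper name concurrent_gets, the timeout, and the success criterion (a status line beginning with "HTTP/1.1 200") are all assumptions made for illustration.

```ruby
require "socket"

# Hypothetical load helper: fires `n` GETs for `path` in parallel threads and
# returns how many got a status line starting with "HTTP/1.1 200".
def concurrent_gets(host, port, path, n, timeout: 5.0)
  threads = Array.new(n) do
    Thread.new do
      begin
        sock = Socket.tcp(host, port, connect_timeout: timeout)
        sock.write("GET #{path} HTTP/1.1\r\nHost: #{host}\r\nConnection: close\r\n\r\n")
        if IO.select([sock], nil, nil, timeout)
          sock.gets.to_s.start_with?("HTTP/1.1 200")
        else
          false # no response before the timeout: counts as a failure
        end
      rescue SystemCallError, IOError, SocketError
        false
      ensure
        sock&.close
      end
    end
  end
  threads.count(&:value)
end
```

For example, concurrent_gets("127.0.0.1", 3000, "/assets/app.css", 400) would be expected to return 400 on a healthy server; both the port and the asset path are placeholders.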
Notably, in several of those tests File.size is never called, yet the File.read and File.size builds behave the same. So the File.size codegen change (whole-program inference / layout shift) seems to surface a latent UB in the sp_net scheduler, rather than File.size itself being wrong. Only the full browser workload (File.size-served assets + concurrent /cable WS + a broadcast + multi-worker timing, together) trips it, which is why we couldn't shrink it to a deterministic minimal case.
Build / env
spinel master @ b2e92ea; reproduces on Linux x86_64 and macOS arm64. Roundhouse pinned back to File.read to keep the e2e green (roundhouse@daccd53).