
Conversation

@f-f (Member) commented Dec 9, 2025

WIP, should fix #696, fix #649

@f-f (Member Author) commented Dec 9, 2025

Main chunks of work to be done at this point:

  • enqueuing/running matrix jobs: we need to settle on a design for this. The current direction is "the build plan is fixed at submission time, and we dynamically add jobs for dependent packages after a job is completed" (see the sketch after this list)
  • enqueuing/running package set jobs: we need to figure out the auth story for that
  • returning jobs from the API; we need to spec this out more:
    • do we keep backwards compatibility? Spago should keep working
    • what details do we return?
    • when returning a list, we should allow selecting queued/in-progress/done jobs
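
As a concrete reference for the first bullet, here is a minimal pure sketch of the "fixed build plan, dynamically added dependent jobs" idea. None of the names below exist in this PR; they are placeholders to illustrate the shape of the logic:

module MatrixSketch where

import Prelude

import Data.Array (filter)
import Data.Array as Array
import Data.Foldable (all, elem)
import Data.Map (Map)
import Data.Map as Map
import Data.Set (Set)
import Data.Set as Set
import Data.Tuple (Tuple(..), fst)

-- Placeholder; the registry has a real PackageName type.
type PackageName = String

-- The build plan is frozen at submission time: each package in the plan is
-- mapped to its direct dependencies (restricted to packages in the plan).
type BuildPlan = Map PackageName (Array PackageName)

-- Packages whose jobs have already completed.
type Completed = Set PackageName

-- Jobs to seed the queue with: packages that have no dependencies in the plan.
initialJobs :: BuildPlan -> Array PackageName
initialJobs plan =
  map fst $ filter (\(Tuple _ deps) -> Array.null deps) $ Map.toUnfoldable plan

-- After `finished` completes, the dependents that become unblocked: packages
-- that depend on `finished` and whose dependencies are now all completed.
-- This is what we would call to dynamically enqueue follow-up jobs.
readyDependents :: BuildPlan -> Completed -> PackageName -> Array PackageName
readyDependents plan completed finished =
  map fst
    $ filter
        ( \(Tuple name deps) ->
            not (Set.member name completed')
              && elem finished deps
              && all (\dep -> Set.member dep completed') deps
        )
    $ Map.toUnfoldable plan
  where
  completed' = Set.insert finished completed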

f-f mentioned this pull request Dec 9, 2025
@fsoikin (Contributor) commented Dec 9, 2025

But the first two bullets can be implemented as separate changes, right? They're not needed for strict parity.

@f-f (Member Author) commented Dec 11, 2025

@fsoikin the overall goal is to merge #669 ASAP, so that we can reupload the whole registry.
That can't be merged until we have guaranteed single job execution (#696 and #649; the former is addressed in this PR, the latter not yet, but soon), because jobs would otherwise take too long and likely conflict. Splitting off compilation for different compilers is part of making the jobs faster and of keeping the compiler information up to date.

We could do all these changes in separate PRs, but I don't see the point: they will all need to end up in the compilers-in-metadata branch anyway, since we can't merge to trunk until we have the whole package.

Comment on lines +100 to +105
insertPackageJob :: PackageOperation -> ContT Response (Run _) Response
insertPackageJob operation = do
  lift $ Log.info $ "Enqueuing job for package " <> PackageName.print (Operation.packageName operation)
  jobId <- newJobId
  lift $ Db.insertPackageJob { jobId, payload: operation }
  jsonOk V1.jobCreatedResponseCodec { jobId }

The old code checked for running jobs before creating new ones:

lift (Db.runningJobForPackage packageName) >>= case _ of
  Right { jobId, jobType: runningJobType } -> ...

Looks like the duplicate checking is no longer there; is that intentional?
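
For reference, one way the guard could be folded back into the new insertPackageJob, assuming Db.runningJobForPackage still exists as in the old code. The Left-means-no-running-job reading is a guess, so treat this as a sketch rather than a proposed diff:

insertPackageJob :: PackageOperation -> ContT Response (Run _) Response
insertPackageJob operation = do
  let packageName = Operation.packageName operation
  -- Duplicate check: if a job for this package is already queued or running,
  -- return its id instead of enqueuing a second one.
  lift (Db.runningJobForPackage packageName) >>= case _ of
    Right { jobId } ->
      jsonOk V1.jobCreatedResponseCodec { jobId }
    Left _ -> do
      lift $ Log.info $ "Enqueuing job for package " <> PackageName.print packageName
      jobId <- newJobId
      lift $ Db.insertPackageJob { jobId, payload: operation }
      jsonOk V1.jobCreatedResponseCodec { jobId }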

JOIN ${JOB_INFO_TABLE} info ON job.jobId = info.jobId
WHERE info.finishedAt IS NULL
AND info.startedAt IS NULL
ORDER BY info.createdAt DESC

Why DESC here? Don't we want the oldest job to go out first (FIFO), like it previously was? Same for the other selectNext* functions.

@thomashoneyman (Member) commented

Also the tests are failing because of a missing version field in the fixtures, see:
5afab58

Comment on lines +151 to +153
DeleteIncompleteJobs next -> do
  Run.liftEffect $ SQLite.deleteIncompleteJobs env.db
  pure next

Are we sure we want to just delete incomplete jobs? Should they instead be re-inserted into the queue? For example, reset startedAt to NULL so the job is picked up again.

We may also want to put more effort into making partially-completed jobs recoverable: making operations as close to idempotent as possible, and having a way to sweep through, catch any partially-completed operations, and decide whether to roll them back or retry and complete them.
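
As a sketch of the "requeue instead of delete" alternative, the handler above could clear startedAt for unfinished jobs so the queue picks them up again. SQLite.resetIncompleteJobs is hypothetical and does not exist in this PR:

-- Hypothetical variant of the handler above: requeue rather than delete.
ResetIncompleteJobs next -> do
  -- e.g. UPDATE the job-info table, setting startedAt = NULL
  -- WHERE finishedAt IS NULL, so the jobs become selectable again.
  Run.liftEffect $ SQLite.resetIncompleteJobs env.db
  pure next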

@thomashoneyman (Member) commented

Main chunks of work to be done at this point:

In addition to your list (some responses to it below), the other point I'm not seeing here is that we need to ensure the GitHub issue module only proxies requests over to the server API and writes comments for 'notify' logs, but does not actually execute e.g. the publish operation itself. Otherwise we've defeated the purpose of this pull request, as we can't enforce a lock on the git commits anymore.
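
For illustration, the shape being described might look roughly like this; submitJobToApi and postIssueComment are hypothetical stand-ins, not functions in this PR:

-- Sketch only: the GitHub issue module becomes a thin proxy.
handleIssueOperation :: PackageOperation -> Run _ Unit
handleIssueOperation operation = do
  -- Forward the parsed payload to the server API instead of running the
  -- publish pipeline in the GitHub Actions environment...
  jobId <- submitJobToApi operation
  -- ...and only mirror progress ('notify' logs) back as issue comments.
  postIssueComment $ "Enqueued job " <> jobId <> "; progress will be reported here."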

  • enqueuing/running matrix jobs: we need to settle on a design for this. The current direction is "the build plan is fixed at submission time, and we dynamically add jobs for dependent packages after a job is completed"

This makes sense to me: start with the no-dependency packages, and on each completion queue the dependents whose dependencies are now satisfied, propagating outwards.
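
To make the propagation concrete, here is a small worked example over a made-up three-package plan, reusing the initialJobs/readyDependents sketch from earlier in the thread (illustrative only):

module MatrixPropagation where

import Prelude

import Data.Map as Map
import Data.Set as Set
import Data.Tuple (Tuple(..))
import Effect (Effect)
import Effect.Console (log)

import MatrixSketch (BuildPlan, initialJobs, readyDependents)

examplePlan :: BuildPlan
examplePlan = Map.fromFoldable
  [ Tuple "prelude" []
  , Tuple "arrays" [ "prelude" ]
  , Tuple "maps" [ "prelude", "arrays" ]
  ]

main :: Effect Unit
main = do
  -- Seed the queue: only "prelude" has no dependencies.
  log $ show $ initialJobs examplePlan
  -- Completing "prelude" unblocks "arrays"; "maps" still waits on "arrays".
  log $ show $ readyDependents examplePlan Set.empty "prelude"
  -- Completing "arrays" (with "prelude" already done) unblocks "maps".
  log $ show $ readyDependents examplePlan (Set.singleton "prelude") "arrays"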

  • enqueuing/running package set jobs: we need to figure out the auth story for that

What do we need to do beyond what we have today? Today, we open an issue, pacchettibotti signs the payload via the GitHub Actions setup, and then the job is executed by the action. With the server, the only difference would be that this payload is submitted to the server API. I'm not sure I'm seeing what the additional auth issues are. We can also have a daily GitHub Action cron job that kicks off the daily package set updater, so as far as the server is concerned it's just receiving an authenticated package set update.

The same approach is used for transfers or unpublishing: open the issue as a purescript/packaging maintainer and it will be auto-signed by pacchettibotti. I like this because it's public.
