Reorganise run-processing logs #118
Conversation
I think this is a great idea, the logs being in multiple places is definitely quite annoying. When it comes to the implementation, the one concern I have is that this will lead to multiple extractors writing to the same file, which will cause their logs to be interleaved. There are two situations I can think of where this would happen:
- Someone reprocesses a run (perhaps only a non-cluster variable with `--match`) while a slurm job for that run is already in progress.
- At some point I want to add support for multiple slurm jobs 😈 This is useful whenever you have multiple heavy-but-independent variables.
And it would be really nice if the logs weren't interleaved 😅 That's actually my main gripe with the current logging system: locally-processed runs often overlap in execution which causes their logs to be interleaved and makes debugging tricky.
I wrote a whole thing about this on Zulip, but the solutions all seem to be rather complex. I would say that right now this is a net win and we should merge it and fix the interleaving/concurrency issues later.
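Just to illustrate one of the simpler (partial) mitigations, a sketch: tagging every record with the writing process's PID wouldn't prevent interleaving, but it would at least keep each line attributable. This is not what DAMNIT currently does, and the file name is simply the example from the PR description:

```python
import logging
from pathlib import Path

# Sketch only: per-line PID tags so interleaved writers stay distinguishable.
log_file = Path("process_logs/r123_p4507.out")
log_file.parent.mkdir(exist_ok=True)

handler = logging.FileHandler(log_file, mode="a")  # append, never truncate
handler.setFormatter(logging.Formatter(
    "%(asctime)s [pid %(process)d] %(levelname)s %(name)s: %(message)s"
))
logging.getLogger().addHandler(handler)
logging.getLogger("damnit.extractor").info("evaluating variables")
```

Since logging emits each record with a single write in append mode, whole lines generally don't get split mid-record on POSIX filesystems, though blocks of lines from different processes would still interleave.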
WRT old logs, let's keep them and append new ones to the same file. Sometimes multiple variables will be failing, and while debugging and `amore-proto reprocess --match`'ing one variable you might not want the logs for the other failing variables to be deleted. This also ties in with the versioning thing; I think we should preserve logs for each version.
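A minimal sketch of what that append-and-keep-history idea could look like (`start_log_section()` and the `version` argument are invented names here, not existing DAMNIT API):

```python
from datetime import datetime, timezone
from pathlib import Path

def start_log_section(path: Path, version: int) -> None:
    """Append a header before a new round of processing, keeping old logs."""
    path.parent.mkdir(exist_ok=True)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with path.open("a") as f:
        f.write(f"\n===== Processing version {version} at {stamp} =====\n")

start_log_section(Path("process_logs/r123_p4507.out"), version=2)
```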
Yeah, I can see that interleaving is going to be annoying, especially if there are multiple Slurm jobs per run. I can see some possible ways to deal with that, but they all have their own drawbacks. Let's discuss that further elsewhere.
Sure, that's easy to do. 🙂
LGTM!
Force-pushed from 59dc0c8 to 7597dbd.
Thanks for the review 👍
At present, variables evaluated in listener subprocesses go into `amore.log`, variables evaluated under `amore-proto reprocess` are logged in the terminal of whoever ran it, and variables evaluated in a Slurm job get a per-job log file in `slurm_logs`. This makes it hard to find any log messages from processing a given run (see #45, #49).

With this change, all DAMNIT processing for a given run will be logged in a file like `process_logs/r123_p4507.out`. I've tried to put some markers in to delineate different processes adding to the file.

Example log file
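For concreteness, here is a rough sketch of how the per-run path and the delimiting markers might be computed; the function names and signatures are made up for illustration, not the PR's actual code:

```python
import os
import socket
from datetime import datetime
from pathlib import Path

def run_log_path(db_dir: Path, run: int, proposal: int) -> Path:
    """Single log file per run, e.g. process_logs/r123_p4507.out."""
    logs = db_dir / "process_logs"
    logs.mkdir(exist_ok=True)
    return logs / f"r{run}_p{proposal}.out"

def write_marker(path: Path, what: str) -> None:
    """Delimit one process's contribution to the shared file."""
    with path.open("a") as f:
        f.write(f"\n----- {what}: pid {os.getpid()} on {socket.gethostname()} "
                f"at {datetime.now():%Y-%m-%d %H:%M:%S} -----\n")

log = run_log_path(Path("."), run=123, proposal=4507)
write_marker(log, "Slurm job starting")
```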
Question: Should we keep older logs at all? For now, when you `reprocess` a run, its log file is overwritten. One alternative is to append each new processing to the same file; another is to write a new file each time, while making it easy to see the latest (a sketch of that last option follows below).

I view this PR as part 1 of 2: this changes where the backend puts the information, part 2 will be to use the new structure so you can get to the logs from the GUI. I've got some ideas about how precisely to do that, but I think having the Slurm, non-Slurm, listener & reprocess logs all going to the same place is a good starting point.
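A sketch of the new-file-each-time option, using a `latest` symlink so the most recent log is easy to find (the naming scheme is entirely hypothetical):

```python
from datetime import datetime
from pathlib import Path

def new_log_file(logs: Path, run: int, proposal: int) -> Path:
    """Create a timestamped log file and point a 'latest' symlink at it."""
    logs.mkdir(exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%dT%H%M%S")
    path = logs / f"r{run}_p{proposal}_{stamp}.out"
    path.touch()
    latest = logs / f"r{run}_p{proposal}_latest.out"
    if latest.is_symlink() or latest.exists():
        latest.unlink()
    latest.symlink_to(path.name)  # relative link within process_logs/
    return path

print(new_log_file(Path("process_logs"), run=123, proposal=4507))
```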
cc @JamesWrigley @matheuscteo