Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NEW CLI output in ProcessJobQueueTask #310

Merged
merged 7 commits into from
Nov 17, 2020

Conversation

chillu
Copy link
Contributor

@chillu chillu commented Jun 9, 2020

It's counterintuitive to run the queue on CLI (e.g. when testing things),
get zero error output, and then discover why the job was broken by looking
at a tiny text field in the admin/jobs CMS UI.

There's an interesting edge case where the logger does output to CLI only
when a broken job is discovered, because that uses the logger for messages
before adding the job-specific handlers. And in case a Monolog logger
implementation doesn't have any handlers, it'll add a php://stderr by
default. Very confusing :D

Note that modifying the singleton during execution by adding job-specific
handlers isn't ideal (didn't change that status quo). But there's no clear
interface for any services being executed through a task receiving a logger
from the task. So we have to assume they'll use the injector singleton.
Technically it means any messages after the job-specific execution (e.g.
during shutdown) would also be logged into the QueuedJobDescriptor database
record, but you could argue that's desired behaviour.

This should really be fixed by adding BuildTask->getLogger() and making it
available to all tasks, but this is the first step to fix this specific
task behaviour. See
silverstripe/silverstripe-framework#9183

@chillu
Copy link
Contributor Author

chillu commented Jun 12, 2020

OK, builds are passing now ... what a mission. See 9e7b2fc

@mfendeksilverstripe
Copy link
Contributor

@chillu I'm trying to do the following:

  • queue runner / service logging needs to go to Graylog
  • job execution (process()) logging needs to go into job messages
  • note that job execution logging is not limited to job failing by throwing an exception
  • both logging setups can emit log messages of any level so you can't use filtering to differentiate the origin
  • common case for migration jobs is that they use logger to indicate things that failed during migration process but completing the job execution without throwing
  • this is because we usually go with the try to migrate all we can and log whatever fails and accept the failure (fixed manually) approach

Please explain to me how your change-set allows me to implement the setup I need as opposed to the change-set proposed in this PR.

@chillu
Copy link
Contributor Author

chillu commented Jun 14, 2020

queue runner / service logging needs to go to Graylog

Ensure your project has a SyslogLogger as described in https://docs.silverstripe.cloud/cloud-features/logs/

job execution (process()) logging needs to go into job messages

Even before my PR here, the global logger had the QueuedJobHandler (wrapped in a BufferHandler) attached via the QueuedJobService which is used by ProcessJobQueueTask.. Within the job, you can retrieve this preconfigured logger singleton via Injector::inst()->get(LoggerInterface::class). Ideally we'd make that explicit by adding Job->setLogger() and passing it in, but that's not strictly required to achieve what you're asking for here.

note that job execution logging is not limited to job failing by throwing an exception

Yes, you have all the different log levels to express those priorities within the job.

both logging setups can emit log messages of any level so you can't use filtering to differentiate the origin

Once the QueuedJobHandler is attached the global logger, it'll receive any log messages emitted within the job, or any other service used within that job (which can also use the logger global). It's correct that the log handling can't differentiate between sources beyond log levels. I think it's an acceptable limitation to log all messages (with optional level filtering) to the job database row during job execution, rather than just e.g. the messages directly created by the job (but not any services used by the job). And if you really care about that distinction, then ensure you can inject different logger instances into the services used by your job.

If you wanted to increase the log levels for specific jobs only (e.g. log info rather than just warning/error), then you can do that within your job execution on the global logger. It'll reconfigure it globally on the singleton, so might be best to clone the object graph in this case. This seems pretty advanced though?

@mfendeksilverstripe
Copy link
Contributor

Thanks for clarification, will try it out ;-)

@chillu
Copy link
Contributor Author

chillu commented Jul 7, 2020

@mfendeksilverstripe Did you get a chance to try this out? Keen to both get this one merged, and the "competing PR" (#309) closed without merge.

@mfendeksilverstripe
Copy link
Contributor

mfendeksilverstripe commented Aug 23, 2020

Hey @chillu , I reviewed your suggestions and I don't think the two PRs are competing. I see your point though, so this is my suggestion:

  • feel free to merge this PR as this change is really useful
  • I'll update the other PR with the new information based of off this PR

@chillu
Copy link
Contributor Author

chillu commented Aug 25, 2020

I don't have permission to merge actually, maybe @maxime-rainville or @emteknetnz could?

@emteknetnz
Copy link
Collaborator

@dnsl48 @Cheddam

@dnsl48 dnsl48 requested review from dnsl48 and emteknetnz and removed request for dnsl48 August 26, 2020 02:14
Copy link
Collaborator

@emteknetnz emteknetnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good, a couple of minor questions/suggestions

],
true
),
$logger->error(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to test if the $logger is not null?

Copy link
Contributor Author

@chillu chillu Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, we also don't check it in the other dozen places. private $logger is a hard dependency. That's not made super clear because we tend to not inject dependencies through the constructor, but I'd say that's a different discussion (pre-existing issue in the class).

Copy link
Contributor Author

@chillu chillu Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scratch that, just read through the diff again - I'm replacing the Injector::inst()->get(LoggerInterface::class) call which used to be getLogger(). So I'm turning an internal dependency into an external dependency, but making it implicit. I'd say that the default calling method on non object in $this->logger PHP error is pretty much the same than a custom throw "dependency not met" though? Moving this to a constructor based injection would be a cleaner API, but also an API breakage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I was cranky before - I've fixed it be reinstating the old functionality, but taking preference for $this->logger if defined. So it should never be null.

@@ -43,6 +47,26 @@ public function run($request)

$service = $this->getService();

// Ensure that log messages are visible when executing this task on CLI.
// TODO Replace with BuildTask logger: https://github.com/silverstripe/silverstripe-framework/issues/9183
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't include TODO's in PR's, already enough of those in the codebase =)

Better of linking to this PR:line in that issue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you feel strongly about this then please change the TODO tag in a commit. I don't ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I was cranky before - I've removed the TODO tag, and converted it to a standard comment referencing this context instead. Thanks again for following up on those PRs, and getting into the details during review!

Logger::WARNING
);

$logger->pushHandler($standardFilterHandler);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to test if the $logger is not null here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't be the case any more, see previous comment.

@emteknetnz
Copy link
Collaborator

@mfendeksilverstripe I'm looking to get this merged, though I'm also aware that your team has about 20x the queuedjobs knowledge that I personally have. Any chance you could run this pull-request on your local and see how it goes?

Copy link
Collaborator

@emteknetnz emteknetnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing locally, seems like the CLI output is a bit better, though the CMS output is worse as it's missing the line number an exception was thrown from which is pretty bad from a live debugging point of view

I added the following code to a job to test the output:

    public function process()
    {
        /** @var Logger $logger */
        $logger = Injector::inst()->get(LoggerInterface::class);
        $logger->warning("MY WARNING");
        echo "MY ECHO\n";
        throw new \Exception("MY EXCEPTION");

CLI BEFORE

[2020-08-31 10:29:55] error-log.ERROR: Array (     [errno] => 0     [errstr] => Broken jobs were found in the job queue     [errfile] => /var/www/mysite/www/vendor/silverstripe/queuedjobs/src/Services/QueuedJobService.php     [errline] => 374     [errcontext] => Array         (         )  )  {"file":"/var/www/mysite/www/vendor/silverstripe/queuedjobs/src/Services/QueuedJobService.php","line":381} []
[2020-08-31 10:29:55] No new jobs on queue 2
[2020-08-31 10:29:55] error-log.WARNING: MY WARNING [] []
[2020-08-31 10:29:55] error-log.ERROR: MY EXCEPTION {"exception":"[object] (Exception(code: 0): MY EXCEPTION at /var/www/mysite/www/vendor/silverstripe/queuedjobs/src/Jobs/CleanupJob.php:111)"} []
MY ECHO

CLI AFTER

[2020-08-31 10:30:30] error-log.ERROR: Broken jobs were found in the job queue {"file":"/var/www/mysite/www/vendor/silverstripe/queuedjobs/src/Services/QueuedJobService.php","line":487} []
[2020-08-31 10:30:30] No new jobs on queue 2
[2020-08-31 10:30:30] error-log.WARNING: MY WARNING [] []
[2020-08-31 10:30:30] error-log.ERROR: MY EXCEPTION {"exception":"[object] (Exception(code: 0): MY EXCEPTION at /var/www/mysite/www/vendor/silverstripe/queuedjobs/src/Jobs/CleanupJob.php:111)"} []
MY ECHO
[2020-08-31 10:30:30] error-log.WARNING: Job paused at 2020-08-31 10:30:30 [] []

Running via the Jobs ModelAdmin also results in tidier output, and it's also showing the MY WARNING which is certainly an improvement, BUT it's missing the line number the exception was throw from so it's much less usable

MODELADMIN BEFORE

    [2020-08-31 10:36:40][INFO] Job caused exception MY EXCEPTION in /var/www/mysite/www/vendor/silverstripe/queuedjobs/src/Jobs/CleanupJob.php at line 111
    [2020-08-31 10:36:40][INFO] MY ECHO
    [2020-08-31 10:36:40][INFO] Job paused at 2020-08-31 10:36:40

MODELADMIN AFTER

    [2020-08-31 10:36:41][INFO] MY ECHO
    [2020-08-31 10:36:41][WARNING] Job paused at 2020-08-31 10:36:41
    [2020-08-31 10:36:41][WARNING] MY WARNING
    [2020-08-31 10:36:41][ERROR] MY EXCEPTION
    [2020-08-31 10:36:41][WARNING] Job paused at 2020-08-31 10:36:41

@chillu
Copy link
Contributor Author

chillu commented Sep 4, 2020

OK turns out the formatters configured for this weren't kicking in for the queue handler because the Monolog specification was a bit ambiguous about how to implement those. See 741d446

Triggered through ModelAdmin:

image

CLI with JobType=1 (IMMEDIATE)

image

CLI with JobType=2 (QUEUE)

image

Not showing logs there because of Doorman forking off via ProcessJobQueueChildTask. Even when defining StreamHandler there (like in ProcessJobQueueTask, Doorman super helpfully sends stderr to dev/null by default. silverstripe/doorman#23. The logs are in the db message though.

@emteknetnz
Copy link
Collaborator

emteknetnz commented Sep 6, 2020

@chillu Looks great.

I was about to squash merge though I noticted the rather large commit message generated

It might be worth you either doing the squash yourself and tailor the commit message however you like, or just reply to this comment with an updated commit message and I'll use that in the squash-merge

Commit message:

* Avoid serialisation in human readable log messages

It only does sprintf() on two occurrences in this class,
and not anywhere else. I can't see why it's necessary.

* NEW CLI output in ProcessJobQueueTask

It's counterintuitive to run the queue on CLI (e.g. when testing things),
get zero error output, and then discover why the job was broken by looking at a tiny text field in the admin/jobs CMS UI.

There's an interesting edge case where the logger *does* output to CLI only when
a broken job is discovered, because that uses the logger for messages *before*
adding the job-specific handlers. And in case a Monolog logger implementation doesn't have any handlers,
it'll add a php://stderr by default. Very confusing :D

Note that modifying the singleton during execution by adding job-specific handlers isn't ideal (didn't change that status quo).
But there's no clear interface for any services being executed through a task
receiving a logger *from* the task. So we have to assume they'll use the injector singleton.
Technically it means any messages after the job-specific execution (e.g. during shutdown)
would also be logged into the QueuedJobDescriptor database record,
but you could argue that's desired behaviour.

This should really be fixed by adding BuildTask->getLogger() and making it available to all tasks,
but this is the first step to fix this specific task behaviour.
See https://github.com/silverstripe/silverstripe-framework/issues/9183

* FIX Avoid double message logging

$logger already has a QueuedJobsHandler which calls addMessage()

* Use PHP7 Throwable consistently

We no longer support PHP 5.x so no need to keep Exception catching

* More printf reformatting

* FIX Use logger consistently in QueuedJobService

Job->addMessage() only adds the message to the job, but not the actual task output.
Which means it's harder to diagnose issues. The logger already has a QueuedJobHander
which also calls Job->addMessage(), *and* can do other things like output to php://stderr

* FIX Resilient log handler implementation

The log handler on QueuedJobService is triggering a write to the database on log messages.
It's added to the *global logger singleton*, meaning it applies beyond the scope of this service execution.
That's problematic when completely unrelated logic writes new log messages,
and the buffer flushes - at the latest point, during PHP shutdown.
At this point, a database is often no longer in a state to accept the write,
e.g. during unit test execution.

Note that a better fix would be to use a specific logger instance clone
for this purpose, but we do want to inherit the log handlers set up for a specific project here,
so can't just create a logger from scratch.

* Use formatters in QueuedJobHandler

If you check other handlers, they all call format() as part of their handle() implementation.
But this handler is wrapped in a BufferHandler, and that's expecting handleBatch() implementations instead.
So unlike other handlers, this one doesn't inherit the default formatter invocation behaviour from parent implementations.
This wasn't a problem before because the formatting was just inlined into the custom code (QueuedJobService),
which is an anti-pattern in terms of Monolog usage. Compare this to SyslogHandler and AbstractSyslogHandler.

It only does sprintf() on two occurrences in this class,
and not anywhere else. I can't see why it's necessary.
It's counterintuitive to run the queue on CLI (e.g. when testing things),
get zero error output, and then discover why the job was broken by looking at a tiny text field in the admin/jobs CMS UI.

There's an interesting edge case where the logger *does* output to CLI only when
a broken job is discovered, because that uses the logger for messages *before*
adding the job-specific handlers. And in case a Monolog logger implementation doesn't have any handlers,
it'll add a php://stderr by default. Very confusing :D

Note that modifying the singleton during execution by adding job-specific handlers isn't ideal (didn't change that status quo).
But there's no clear interface for any services being executed through a task
receiving a logger *from* the task. So we have to assume they'll use the injector singleton.
Technically it means any messages after the job-specific execution (e.g. during shutdown)
would also be logged into the QueuedJobDescriptor database record,
but you could argue that's desired behaviour.

This should really be fixed by adding BuildTask->getLogger() and making it available to all tasks,
but this is the first step to fix this specific task behaviour.
See silverstripe/silverstripe-framework#9183
$logger already has a QueuedJobsHandler which calls addMessage()
We no longer support PHP 5.x so no need to keep Exception catching
Job->addMessage() only adds the message to the job, but not the actual task output.
Which means it's harder to diagnose issues. The logger already has a QueuedJobHander
which also calls Job->addMessage(), *and* can do other things like output to php://stderr
The log handler on QueuedJobService is triggering a write to the database on log messages.
It's added to the *global logger singleton*, meaning it applies beyond the scope of this service execution.
That's problematic when completely unrelated logic writes new log messages,
and the buffer flushes - at the latest point, during PHP shutdown.
At this point, a database is often no longer in a state to accept the write,
e.g. during unit test execution.

Note that a better fix would be to use a specific logger instance clone
for this purpose, but we do want to inherit the log handlers set up for a specific project here,
so can't just create a logger from scratch.
If you check other handlers, they all call format() as part of their handle() implementation.
But this handler is wrapped in a BufferHandler, and that's expecting handleBatch() implementations instead.
So unlike other handlers, this one doesn't inherit the default formatter invocation behaviour from parent implementations.
This wasn't a problem before because the formatting was just inlined into the custom code (QueuedJobService),
which is an anti-pattern in terms of Monolog usage. Compare this to SyslogHandler and AbstractSyslogHandler.
@chillu
Copy link
Contributor Author

chillu commented Sep 18, 2020

@emteknetnz OK I've squashed down two commits, but it's still a handful of distinct changes which are probably worth calling out in the changelog individually - so I'd recommend keeping those, rather than squashing on merge.

@chillu
Copy link
Contributor Author

chillu commented Nov 17, 2020

@emteknetnz Can you merge please?

@emteknetnz emteknetnz merged commit 1d263b9 into symbiote:4 Nov 17, 2020
@chillu chillu deleted the pulls/4/task-logging branch November 18, 2020 21:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants