Amazon Bedrock agentcore - add alert templates #16074
base: main
Conversation
| "timeWindowSize": 5, | ||
| "timeWindowUnit": "m", | ||
| "esqlQuery": { | ||
| "esql": "// Alert triggers when system errors occur during AWS Bedrock AgentCore runtime invocations.\n//\n// System errors correspond to InvocationError.Internal - Internal server error (500).\n// These are server-side errors indicating infrastructure or service issues.\n//\n// The alert is grouped by cloud account, region, and agent endpoint name (aws.dimensions.Name)\n// to pinpoint the specific agent endpoint experiencing issues.\n//\n// To reduce alert noise, increase the threshold (e.g., `total_system_errors > 5`) to only\n// alert on sustained error patterns rather than transient service disruptions.\n\nFROM metrics-aws_bedrock_agentcore.metrics-*\n| WHERE aws.dimensions.Operation == \"InvokeAgentRuntime\"\n| STATS total_system_errors = sum(aws.bedrock_agentcore.metrics.SystemErrors.sum) BY cloud.account.id, cloud.region, aws.dimensions.Name\n| WHERE total_system_errors > 0" |
"To reduce alert noise, increase the threshold (e.g., total_system_errors > 5) to only ..."
Should the default threshold itself be set to > 5, with a note that users can modify it? WDYT?
There are pros and cons for both approaches.
With > 0, no errors go unnoticed, which suits new deployments and low-traffic agents. Users can then raise the threshold based on their noise level.
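For comparison, the stricter variant under discussion might look like the sketch below. It assumes the rest of the template query stays unchanged; the value 5 is only the example from the query comment, not a tuned recommendation.

```esql
FROM metrics-aws_bedrock_agentcore.metrics-*
| WHERE aws.dimensions.Operation == "InvokeAgentRuntime"
| STATS total_system_errors = sum(aws.bedrock_agentcore.metrics.SystemErrors.sum) BY cloud.account.id, cloud.region, aws.dimensions.Name
// Assumption: 5 is the illustrative value from the template comment; pick a value that matches the agent's traffic.
| WHERE total_system_errors > 5
```

With this filter, an agent endpoint that logs between one and five system errors in the 5-minute window would not raise an alert, which is the trade-off against the `> 0` default.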
muthu-mps
left a comment
LGTM!
💔 Build Failed
| conditions: | ||
| kibana: | ||
| version: "^8.19.0 || ^9.1.0" | ||
| version: "^8.19.0 || ^9.2.1" |
Is this bump necessary? In the other alert PRs we kept 9.1.x. cc: @muthu-mps
gpop63
left a comment
Alerts look good!
Proposed commit message
Add alert templates for the Amazon Bedrock agentcore runtime.
Checklist
changelog.yml file.
Author's Checklist
How to test this PR locally
elastic-package build && elastic-package stack up -v -d --services package-registry
Screenshots
Not applicable.