out_cloudwatch_logs: increase MAX_EVENT_LEN to 1MB with tests #11159
base: master
Conversation
Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.
Walkthrough: Updated the CloudWatch Logs public header.
Sequence diagram(s): Not applicable — the changes are macro/documentation updates and new test cases; there is no runtime control-flow or component interaction to diagram.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ✅ Passed checks (3 passed)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Force-pushed 7c85150 to ee95b3b
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
plugins/out_cloudwatch_logs/cloudwatch_api.h (1 hunks)
🔇 Additional comments (1)
plugins/out_cloudwatch_logs/cloudwatch_api.h (1)
56-56: MAX_EVENT_LEN increase to 1,000,000 is correct and properly handled.
The increase to 1,000,000 bytes correctly reflects CloudWatch's 2025 update supporting log events up to 1 MB. Verification confirms the buffer logic is sound:
- Individual events are truncated at MAX_EVENT_LEN (cloudwatch_api.c:558-561)
- Payload size is calculated as the sum of all event messages in UTF-8, plus 26 bytes for each log event, and the code conservatively uses 42 bytes per event
- Batch size check (cloudwatch_api.c:891) prevents payloads from exceeding PUT_LOG_EVENTS_PAYLOAD_SIZE
- Maximum single event with overhead: 1,000,000 + 42 = 1,000,042 bytes fits within the 1,048,576-byte batch limit
No buffer sizing issues detected.
Force-pushed 3d7597b to ee654ea
Actionable comments posted: 0
🧹 Nitpick comments (5)
tests/runtime/out_cloudwatch.c (5)
12-14: Avoid duplicating MAX_EVENT_LEN and keep tests tied to the real limit
Here you reintroduce MAX_EVENT_LEN locally, but it's not used anywhere in this file, and the actual definition already lives in cloudwatch_api.h. That creates a drift risk if the plugin constant changes again. Consider either:
- Including the CloudWatch API header and using its MAX_EVENT_LEN directly in the tests, or
- Wiring these tests to use this macro (e.g., create_large_json_message(MAX_EVENT_LEN - 1000), constructing "over-limit" payloads as MAX_EVENT_LEN + delta) instead of hard-coded sizes later in the file.
399-430: create_large_json_message is bounds-safe but relies on magic offsets
The allocation and index math look safe (no obvious buffer overruns), but the use of hard-coded values (12 for the prefix offset, 15 for the "overhead" size) makes this helper fragile if the JSON structure ever changes. You could make this more robust and self-documenting by defining prefix/suffix strings and using strlen for lengths, e.g. const char *prefix = "{\"message\":\""; const char *suffix = "\"}"; and computing offsets from those instead of literal 12/15.
432-468: Nice reuse of CloudWatch setup; consider minimal defensive checks
This helper cleanly de-duplicates the common setup/teardown logic and pushes arbitrary JSON payloads via the existing lib input/output harness. Functionally it's consistent with the rest of the file. If you want to harden it slightly, you could add a TEST_CHECK(ctx != NULL); after flb_create() to fail fast on allocation/init errors, but that's in line with (and optional relative to) existing tests.
470-498: Use MAX_EVENT_LEN (or the AWS max) instead of hard-coded sizes in new large-event tests
Both flb_test_cloudwatch_event_size_near_limit and flb_test_cloudwatch_event_size_at_aws_max currently hard-code 999000 and 1048500, while the file defines MAX_EVENT_LEN and the PR is about aligning with CloudWatch's updated maximum. To keep these tests robust against future limit changes and better reflect their intent:
- For the "near limit" test, derive the size from MAX_EVENT_LEN (e.g., MAX_EVENT_LEN - 1000 or similar) rather than 999000 directly.
- For the "AWS max" test, consider either:
  - Using a named constant that documents how it relates to the AWS limit (e.g., AWS_MAX_EVENT_SIZE_APPROX), or
  - Deriving it from MAX_EVENT_LEN plus a clear delta that documents "this should be truncated down to MAX_EVENT_LEN".
That would remove magic numbers and ensure these tests stay aligned with the semantics they're meant to cover.
500-525: Backslash-boundary test indexing looks correct but could reuse the JSON helper
The manual construction of the backslash-heavy payload appears bounds-safe: the loop limit (i < 999980), the base offset (12), and the subsequent strcpy(large_json + 12 + 999980, "\"}") all fit well within the flb_malloc(1000100) buffer. To reduce the amount of manual index arithmetic and keep the JSON shape consistent with other tests, you could consider building the bulk of the message via create_large_json_message() and then post-processing the payload (e.g., walking the message body and replacing selected characters with \\) instead of re-encoding {"message":"..."} by hand here.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
plugins/out_cloudwatch_logs/cloudwatch_api.h (1 hunks)
tests/runtime/out_cloudwatch.c (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- plugins/out_cloudwatch_logs/cloudwatch_api.h
🧰 Additional context used
🧬 Code graph analysis (1)
tests/runtime/out_cloudwatch.c (3)
plugins/out_s3/s3.c (1): setenv (57-60)
src/flb_lib.c (9): flb_create (143-225), flb_input (266-276), flb_input_set (305-335), flb_output (279-289), flb_output_set (520-551), flb_start (983-994), flb_lib_push (843-870), flb_stop (1011-1055), flb_destroy (228-263)
include/fluent-bit/flb_mem.h (1): flb_free (126-128)
🔇 Additional comments (1)
tests/runtime/out_cloudwatch.c (1)
539-541: Registering new tests in TEST_LIST looks correct
The new tests are properly added to TEST_LIST with descriptive names, and the required {NULL, NULL} terminator remains intact. This should integrate cleanly with the existing runtime test harness.
Force-pushed ee654ea to 3bf4539
Force-pushed 3bf4539 to bb01b00
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/runtime/out_cloudwatch.c (1)
403-437: Consider safer string operations and add documentation.
The function is correct but could be improved:
- strcpy usage (lines 424, 433): Consider using memcpy or strncpy for explicit length control, even though the current usage is safe.
- Silent size adjustment (lines 414-416): The function silently adjusts target_size if it's too small. Consider returning NULL or documenting this behavior.
- Missing documentation: Add a comment noting that the caller must free the returned pointer.
Example improvement for the strcpy calls:
- strcpy(json, TEST_JSON_PREFIX);
+ memcpy(json, TEST_JSON_PREFIX, prefix_len);
- strcpy(json + prefix_len + data_size, TEST_JSON_SUFFIX);
+ memcpy(json + prefix_len + data_size, TEST_JSON_SUFFIX, suffix_len);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
plugins/out_cloudwatch_logs/cloudwatch_api.h (1 hunks)
tests/runtime/out_cloudwatch.c (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/runtime/out_cloudwatch.c (3)
plugins/out_s3/s3.c (1): setenv (57-60)
src/flb_lib.c (9): flb_create (143-225), flb_input (266-276), flb_input_set (305-335), flb_output (279-289), flb_output_set (520-551), flb_start (983-994), flb_lib_push (843-870), flb_stop (1011-1055), flb_destroy (228-263)
include/fluent-bit/flb_mem.h (1): flb_free (126-128)
🔇 Additional comments (7)
plugins/out_cloudwatch_logs/cloudwatch_api.h (2)
46-56: LGTM! Conservative limit with clear documentation. The 1MB limit with ~4.6% safety margin is a sound approach for production reliability. The comment clearly documents the rationale and empirical testing results.
60-60: Good addition for maintainability. Defining the prefix length as a constant improves code clarity and reduces magic numbers.
tests/runtime/out_cloudwatch.c (5)
8-9: LGTM! Correct header inclusion. Including the CloudWatch API header provides access to MAX_EVENT_LEN and other constants needed for the new tests.
15-17: LGTM! Clear test structure constants. These constants appropriately define the JSON envelope for constructing test messages.
439-476: LGTM! Standard test helper implementation. The function properly sets up the test environment, handles optional data injection, and ensures cleanup.
478-534: LGTM! Comprehensive edge-case coverage. The three test functions properly exercise large-event handling:
- At-limit events
- Over-limit events requiring truncation
- Backslash handling at truncation boundaries (important for JSON escaping)
Memory management is correct in all tests. The tests follow the existing pattern of integration-style validation without explicit truncation assertions.
Note: The test names in code (event_size_at_limit, event_size_over_limit) differ from those mentioned in the PR objectives (event_size_near_limit, event_size_at_aws_max). The current names are more descriptive and accurate.
548-550: LGTM! Proper test registration. The three new tests are correctly registered in the test suite with names matching their function declarations.
Increase MAX_EVENT_LEN from 262,118 bytes (256 KiB) to 1,000,000 bytes (1 MB) to better align with AWS CloudWatch's documented maximum event size of 1,048,576 bytes (1 MiB). The 1 MB limit provides a ~4.6% safety margin to account for JSON encoding overhead. Testing confirmed messages up to 1,048,546 bytes (encoding to 1,048,586 bytes) succeed, though we use a conservative limit for production safety.
Add runtime tests to validate the new limit:
- event_size_at_limit: Validates events at exactly MAX_EVENT_LEN (1 MB) are accepted
- event_size_over_limit: Validates events exceeding MAX_EVENT_LEN are truncated
- event_truncation_with_backslash: Validates backslash handling at the truncation boundary
Signed-off-by: Shelby Hagman <[email protected]>
Force-pushed bb01b00 to f844d36
Actionable comments posted: 2
🧹 Nitpick comments (4)
tests/runtime/out_cloudwatch.c (4)
403-438: Large JSON generator is correct; minor optional simplification
The helper correctly:
- enforces a minimum target_size to hold prefix + at least one data byte + suffix,
- allocates target_size + 1 and null-terminates at json[target_size],
- fills the payload so that strlen(json) == target_size.
If you ever want to shave a few lines, the data fill loop could be replaced with a memset starting at json + prefix_len for data_size bytes, but that's purely cosmetic in a test helper.
440-477: Shared CloudWatch test harness looks consistent with existing patterns
run_cloudwatch_test_with_data mirrors the setup/teardown logic of the earlier tests (env flag, lib input, cloudwatch_logs output, start/push/stop/destroy) and correctly frees resources on the happy path. Optionally, you could assert on the return value of flb_lib_push to catch immediate push failures, but given the existing tests do not, this is non-blocking.
479-492: Limit-size test correctly drives a MAX_EVENT_LEN-sized payload through the pipeline
create_large_json_message(MAX_EVENT_LEN) followed by strlen(large_json) ensures the pushed payload is exactly MAX_EVENT_LEN bytes long, which is what the truncation logic should treat as in-bounds. The test currently asserts only that setup succeeds and the pipeline runs without crashing, which matches the style of existing runtime tests here. If you later add hooks to inspect the mock AWS backend, this would be a good place to assert that the event was not truncated.
494-507: Over-limit test exercises truncation path, though assertions are coarse
Using create_large_json_message(MAX_EVENT_LEN + 1) ensures the input exceeds the per-event cap and should force the CloudWatch code down the truncation path. As with the previous test, this currently validates "no crash / no fatal error" rather than inspecting the truncated size. That's acceptable for a runtime smoke test, but if you later expose the observed payload length from the mock backend, asserting that it equals MAX_EVENT_LEN would make this test much stronger.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
plugins/out_cloudwatch_logs/cloudwatch_api.h (1 hunks)
tests/runtime/out_cloudwatch.c (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/runtime/out_cloudwatch.c (3)
plugins/out_s3/s3.c (1): setenv (57-60)
src/flb_lib.c (9): flb_create (143-225), flb_input (266-276), flb_input_set (305-335), flb_output (279-289), flb_output_set (520-551), flb_start (983-994), flb_lib_push (843-870), flb_stop (1011-1055), flb_destroy (228-263)
include/fluent-bit/flb_mem.h (1): flb_free (126-128)
🔇 Additional comments (2)
tests/runtime/out_cloudwatch.c (2)
8-17: Using shared constants for JSON shape and MAX_EVENT_LEN is a good choice
Including cloudwatch_api.h and defining TEST_JSON_PREFIX/TEST_JSON_SUFFIX keeps the tests aligned with the real CloudWatch event constraints and avoids duplicated magic numbers. No issues here.
549-551: New tests are correctly registered in TEST_LIST
The three new test cases are wired into TEST_LIST with clear names matching their purposes, so they will run under the existing runtime test harness without further changes.
/*
 * https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_PutLogEvents.html
 * AWS CloudWatch's documented maximum event size is 1,048,576 bytes (1 MiB),
 * including JSON encoding overhead (structure, escaping, etc.).
 *
 * Setting MAX_EVENT_LEN to 1,000,000 bytes (1 MB) provides a ~4.6% safety margin
 * to account for JSON encoding overhead and ensure reliable operation.
 * Testing confirmed messages up to 1,048,546 bytes (encoding to 1,048,586 bytes)
 * succeed, though we use a conservative limit for production safety.
 */
#define MAX_EVENT_LEN 1000000
Event size cap looks sound; align earlier payload comment with new semantics
The new MAX_EVENT_LEN = 1000000 choice, with the detailed comment and empirical numbers, is reasonable and keeps event->len + PUT_LOG_EVENTS_PER_EVENT_LEN comfortably under PUT_LOG_EVENTS_PAYLOAD_SIZE for a single event.
One thing to tidy up: the earlier comment at lines 23–27 still states that using the documented payload size (1,048,576) “leads to API errors” and that “no issues have been seen setting it to 1,000,000 bytes,” but PUT_LOG_EVENTS_PAYLOAD_SIZE here remains 1,048,576 while 1,000,000 is now used for MAX_EVENT_LEN. That’s confusing for future readers.
Consider updating that earlier comment to explicitly distinguish:
- payload buffer size (still 1,048,576, per AWS docs), and
- per-event cap (MAX_EVENT_LEN = 1,000,000) used as a conservative safety margin.
This keeps the rationale clear and avoids implying the payload macro itself should be 1,000,000.
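Concretely, the earlier comment could be reworded along these lines (suggested wording only, not the current header; values as quoted in this thread):

```c
/* Payload buffer: keep at the AWS-documented PutLogEvents request limit
 * of 1 MiB; this is the size of a whole batch, not of one event. */
#define PUT_LOG_EVENTS_PAYLOAD_SIZE  1048576

/* Per-event cap: deliberately below the documented 1 MiB event maximum.
 * 1,000,000 bytes leaves headroom for JSON encoding overhead (structure,
 * escaping), so one maximal event plus its per-event overhead always
 * fits inside a single payload buffer. */
#define MAX_EVENT_LEN                1000000
```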
/* Test event with trailing backslash at truncation boundary */
void flb_test_cloudwatch_event_truncation_with_backslash(void)
{
    char *large_json;
    size_t prefix_len = strlen(TEST_JSON_PREFIX);
    size_t suffix_len = strlen(TEST_JSON_SUFFIX);
    size_t total_len;
    size_t data_len;
    size_t i;

    /* Create base message near MAX_EVENT_LEN */
    large_json = create_large_json_message(MAX_EVENT_LEN + 100);
    TEST_CHECK(large_json != NULL);

    if (large_json) {
        total_len = strlen(large_json);
        data_len = total_len - prefix_len - suffix_len;

        /* Post-process: replace every 100th character with backslash */
        for (i = 99; i < data_len; i += 100) {
            large_json[prefix_len + i] = '\\';
        }

        run_cloudwatch_test_with_data(large_json, strlen(large_json));
        flb_free(large_json);
    }
}
Backslash truncation test likely builds invalid JSON and may not hit the boundary case
Two concerns with this helper:
1. JSON correctness: create_large_json_message() builds a valid JSON string. The loop:
for (i = 99; i < data_len; i += 100) { large_json[prefix_len + i] = '\\'; }
replaces individual data characters with a single '\\' but does not adjust the following character, so many substrings inside the "message" value become \A, \B, etc. In standard JSON, a backslash must be followed by a valid escape (", \, /, b, f, n, r, t, or u); \A is invalid. If the lib input's test formatter is parsing this as JSON (as it does for JSON_TD), this test may fail or behave differently than intended because the JSON is malformed rather than because of truncation behavior.
2. Not guaranteed to place a backslash at the truncation boundary: the current pattern (every 100th data character starting at offset 99) does not ensure that the character at the truncation boundary (index MAX_EVENT_LEN - 1 in the full JSON string) is a backslash. If the truncation logic only special-cases "last byte before cut is '\'", this test may never exercise that branch.
To both keep the JSON valid and guarantee a backslash at the boundary, you could:
- Explicitly plant a \\ escape sequence straddling MAX_EVENT_LEN - 1 / MAX_EVENT_LEN, and
For example:
- /* Post-process: replace every 100th character with backslash */
- for (i = 99; i < data_len; i += 100) {
- large_json[prefix_len + i] = '\\';
- }
+ /* Post-process: add backslashes in the data portion */
+
+ /* 1) Ensure a backslash is exactly at the truncation boundary.
+ * MAX_EVENT_LEN is an index in the full JSON string; make sure we stay in the data range
+ * and create a valid JSON escape sequence ("\\") there.
+ */
+ {
+ size_t boundary = MAX_EVENT_LEN - 1; /* index in full JSON string */
+
+ if (boundary + 1 < total_len - suffix_len) {
+ large_json[boundary] = '\\';
+ large_json[boundary + 1] = '\\';
+ }
+ }
+
+ /* 2) Optionally, sprinkle additional backslashes while keeping JSON valid:
+ * turn pairs of 'A's into '\\\\' sequences.
+ */
+ for (i = 99; i + 1 < data_len; i += 100) {
+ large_json[prefix_len + i] = '\\';
+ large_json[prefix_len + i + 1] = '\\';
+ }
This way:
- The input JSON remains syntactically valid.
- The character at MAX_EVENT_LEN - 1 is a backslash, so the truncation logic's "trailing backslash" handling is actually exercised.
Summary
The CloudWatch Logs PutLogEvents API changed in 2025 to support events up to 1 MB (ref), while our defaults still reflect the older 256 KB value.
Add runtime tests to validate the new limit:
Enter [N/A] in the box if an item is not applicable to your change.
Testing
Before we can approve your change; please submit the following in a comment:
Gist: https://gist.github.com/ShelbyZ/76bfac405e03723efeb532a48d5b6a87
Over-limit on current logs
New limit
Tested directly against CloudWatch Logs via PutLogEvents to discover that a maximum value of 1,048,546 bytes could be sent without issue. Using 1,000,000 bytes (1 MB) gives a safe margin and avoids recalculating PUT_LOG_EVENTS_PAYLOAD_SIZE, which would need reevaluation to support the observed max. The encoded max size was also 1,048,586 bytes, which is 10 bytes above 1 MiB (1,048,576); that makes me worry that depending on an exact max may break over time.
If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-test label to test for all targets (requires maintainer to do).
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.