Filebeat ignores ignore_older option and processes all files in S3 folder #42824

Open
b2ronn opened this issue Feb 21, 2025 · 2 comments
Labels
needs_team Indicates that the issue/PR needs a Team:* label

Comments

b2ronn commented Feb 21, 2025

Description:
I am using Filebeat to read logs from an S3 bucket and want to process only files from the last hour using the ignore_older option. However, Filebeat is processing all files in the folder, regardless of their age.
Steps to Reproduce:

  1. Create an S3 bucket SOME_BUCKET and upload log files whose LastModified timestamps are older than 1 hour:

    $ aws s3api list-objects --bucket SOME_BUCKET --prefix SOME/FOLDER
    {
        "Contents": [
            {
                "Key": "SOME/FOLDER/20250213T004708Z_20250213T004713Z_86de733b.log.gz",
                "LastModified": "2025-02-13T00:00:48+00:00",
                "ETag": "\"\"",
                "Size": 165655,
                "StorageClass": "STANDARD",
                "Owner": {
                    "ID": ""
                }
            },

  2. Configure Filebeat with the following input:

    filebeat.inputs:
    - access_key_id: ${ACCESS_KEY_ID}
      bucket_arn: arn:aws:s3:::SOME_BUCKET
      bucket_list_interval: 10m
      bucket_list_prefix: SOME/FOLDER
      default_region: eu-central-1
      ignore_older: 1h
      secret_access_key: ${SECRET_ACCESS_KEY}
      type: aws-s3

    logging.level: debug
    output:
      console:

  3. Start Filebeat and observe that it processes all files in SOME/FOLDER, including those older than 1 hour.

Expected Behavior:
Filebeat should only process files that have been modified within the last hour, as per the ignore_older setting.
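
To make the expected behavior concrete, here is a minimal Python sketch of the age check I would expect ignore_older to perform. It is not Filebeat's actual implementation: the function names `is_ignored` and `keys_to_process` are hypothetical, the second object (`recent.log.gz`) is made up for contrast, and I am assuming the comparison is done against the object's LastModified value as reported by `aws s3api list-objects`.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def is_ignored(last_modified: datetime, now: datetime, ignore_older: timedelta) -> bool:
    """Return True when the object is older than the ignore_older window."""
    return now - last_modified > ignore_older


def keys_to_process(listing: dict, now: datetime, ignore_older: timedelta) -> List[str]:
    """Keep only the keys whose LastModified falls inside the window."""
    kept = []
    for obj in listing.get("Contents", []):
        # LastModified uses the ISO 8601 form shown in the listing above.
        last_modified = datetime.fromisoformat(obj["LastModified"])
        if not is_ignored(last_modified, now, ignore_older):
            kept.append(obj["Key"])
    return kept


listing = {
    "Contents": [
        {
            # Object from the bug report, about eight days old at processing time.
            "Key": "SOME/FOLDER/20250213T004708Z_20250213T004713Z_86de733b.log.gz",
            "LastModified": "2025-02-13T00:00:48+00:00",
        },
        {
            # Hypothetical recent object, added only for contrast.
            "Key": "SOME/FOLDER/recent.log.gz",
            "LastModified": "2025-02-21T10:00:00+00:00",
        },
    ]
}

# Time at which the old object was processed, taken from the event's @timestamp.
now = datetime(2025, 2, 21, 10, 33, 40, tzinfo=timezone.utc)
print(keys_to_process(listing, now, timedelta(hours=1)))
```

With a 1h window, only the hypothetical recent key survives the filter; the object from the report would be skipped.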

Actual Behavior:
Filebeat processes all files in the specified folder, even those much older than 1 hour.

Logs & Debug Output:
Example of a processed file that is older than 1 hour:

{"@timestamp":"2025-02-21T10:33:40.276Z","@metadata":{"beat":"filebeat","type":"_doc","version":"8.17.2","_id":""},"log":{"file":{"path":"https://SOME_BUCKET.s3.eu-central-1.amazonaws.com/SOME/FOLDER/20250213T004708Z_20250213T004713Z_86de733b.log.gz"},"offset":93908},"aws":{"s3":{"bucket":{"name":"SOME_BUCKET","arn":"arn:aws:s3:::SOME_BUCKET"},"object":{"key":"SOME/FOLDER/20250213T004708Z_20250213T004713Z_86de733b.log.gz"}}},"cloud":{"provider":"","region":"eu-central-1"},"input":{"type":"aws-s3"},"ecs":{"version":"8.0.0"},"host":{"name":"host001"},"agent":{"version":"8.17.2","ephemeral_id":"","id":"","name":"host001","type":"filebeat"},"message":"{\"EdgeEndTimestamp\":1739407629000000000,\"EdgeResponseBytes\":4945,\"EdgeResponseStatus\":403,\"EdgeStartTimestamp\":1739407629000000000,\"ClientRequestProtocol\":\"HTTP/1.1\"}"}

Corresponding debug log:

{"log.level":"debug","@timestamp":"2025-02-21T11:33:40.263+0100","log.logger":"input.aws-s3.s3","log.origin":{"function":"github.com/elastic/beats/v7/x-pack/filebeat/input/awss3.(*s3ObjectProcessor).ProcessS3Object","file.name":"awss3/s3_objects.go","file.line":123},"message":"Begin S3 object processing.","service.name":"filebeat","bucket_arn":"SOME_BUCKET","object_key":"/SOME/FOLDER/20250213T004708Z_20250213T004713Z_86de733b.log.gz","ecs.version":"1.6.0"}
{"log.level":"debug","@timestamp":"2025-02-21T11:33:40.273+0100","log.logger":"input.aws-s3.s3","log.origin":{"function":"github.com/elastic/beats/v7/x-pack/filebeat/input/awss3.(*s3ObjectProcessor).ProcessS3Object.func1","file.name":"awss3/s3_objects.go","file.line":131},"message":"End S3 object processing.","service.name":"filebeat","id":"","bucket_arn":"SOME_BUCKET","object_key":"/SOME/FOLDER/20250213T004708Z_20250213T004713Z_86de733b.log.gz","ecs.version":"1.6.0"}

Environment:
Filebeat Version: 8.17.2

botelastic bot added the needs_team label on Feb 21, 2025

botelastic bot commented Feb 21, 2025

This issue doesn't have a Team:<team> label.

lucabelluccini (Contributor) commented Feb 21, 2025

This is a provisional answer, since I have not tested it: ignore_older will be available in 8.18.x and has been backported to the 8.17 branch, see #41817 (comment).

The code is definitely not in 8.17.2 (https://github.com/elastic/beats/blob/v8.17.2/x-pack/filebeat/input/awss3/s3_filters.go), because it did not make the feature freeze. It will be in the next 8.17.x release, 8.17.3, if one is published (https://github.com/elastic/beats/blob/8.17/x-pack/filebeat/input/awss3/s3_filters.go).
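
As a quick way to check a running agent against this, the sketch below compares the `agent.version` value from an event with an assumed minimum version. The threshold 8.17.3 is an assumption taken from the comment above, not a confirmed release, and `has_ignore_older` is a hypothetical helper name.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '8.17.2' into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# Assumption from the comment above: the backport first ships in 8.17.3.
ASSUMED_MIN_VERSION = (8, 17, 3)


def has_ignore_older(version: str) -> bool:
    """Return True when the version is at or above the assumed minimum."""
    return parse_version(version) >= ASSUMED_MIN_VERSION


print(has_ignore_older("8.17.2"))  # version from the report
print(has_ignore_older("8.18.0"))
```

Under that assumption, 8.17.2 is below the threshold, which matches the behavior described in this issue.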
