[bug][storage] Fixed the ES-Rollover Idempotency #6638

Open: Manik2708 wants to merge 7 commits into base: main

Manik2708
Contributor

@Manik2708 Manik2708 commented Jan 30, 2025

Which problem is this PR solving?

Fixes: #6203

Description of the changes

  • Currently, es-rollover detects an existing index by inspecting the error returned when index creation fails. It mainly expects this error:
{"error":{"root_cause":[{"type":"resource_already_exists_exception","reason":"]"}],"type":"resource_already_exists_exception","reason":"request [/jaeger-*] contains unrecognized parameter: [help]"},"status":400}

However, this can lead to inconsistent results, as reported in the issue, where init failed with this error:

Error: failed to create index: jaeger-span-000001, request failed, status code: 400, body: {"error":{"root_cause":[{"type":"invalid_index_name_exception","reason":"Invalid index name [jaeger-span-000001], already exists as alias","index_uuid":"_na_","index":"jaeger-span-000001"}],"type":"invalid_index_name_exception","reason":"Invalid index name [jaeger-span-000001], already exists as alias","index_uuid":"_na_","index":"jaeger-span-000001"},"status":400}

Looking closely, this error is also caused by the index name already existing, but the error type is different. es-rollover handles only resource_already_exists_exception, yet the same condition can produce other errors, such as the one above.

Inferring existence from the parsed error response is unsafe. The safer sequence is: check whether the index exists, then create it only if it does not. This way the predictable error (resource_already_exists_exception) is avoided, and an unavoidable error such as invalid_index_name_exception is ignored.
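
The check-then-create sequence described above can be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the `IndexAPI` interface, `ensureIndex`, and the in-memory `fakeES` are hypothetical names introduced here to stand in for the real Elasticsearch client.

```go
package main

import (
	"errors"
	"fmt"
)

// IndexAPI is a hypothetical, minimal view of an index client.
// It is an assumption for this sketch, not the interface used in the PR.
type IndexAPI interface {
	IndexExists(name string) (bool, error)
	CreateIndex(name string) error
}

// ensureIndex creates the index only when it does not exist yet,
// so calling it repeatedly is a no-op after the first success.
func ensureIndex(api IndexAPI, name string) (created bool, err error) {
	exists, err := api.IndexExists(name)
	if err != nil {
		return false, fmt.Errorf("failed to check index %s: %w", name, err)
	}
	if exists {
		return false, nil // nothing to do; no error parsing required
	}
	if err := api.CreateIndex(name); err != nil {
		return false, fmt.Errorf("failed to create index %s: %w", name, err)
	}
	return true, nil
}

// fakeES is an in-memory stand-in used only to exercise ensureIndex.
type fakeES struct {
	indices map[string]bool
	creates int
}

func (f *fakeES) IndexExists(name string) (bool, error) {
	return f.indices[name], nil
}

func (f *fakeES) CreateIndex(name string) error {
	if f.indices[name] {
		return errors.New("resource_already_exists_exception")
	}
	f.indices[name] = true
	f.creates++
	return nil
}

func main() {
	es := &fakeES{indices: map[string]bool{}}
	for i := 0; i < 2; i++ {
		created, err := ensureIndex(es, "jaeger-span-000001")
		fmt.Println(created, err)
	}
	// Output:
	// true <nil>
	// false <nil>
}
```

Note that the existence check and the create call are two separate requests, so a concurrent creator could still trigger an error between them.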

How was this change tested?

  • Unit and E2E Tests

Checklist

codecov bot commented Jan 30, 2025

Codecov Report

Attention: Patch coverage is 94.73684% with 2 lines in your changes missing coverage. Please review.

Project coverage is 95.99%. Comparing base (8bf69c7) to head (4082416).

Files with missing lines Patch % Lines
pkg/es/client/index_client.go 92.30% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6638      +/-   ##
==========================================
- Coverage   95.99%   95.99%   -0.01%     
==========================================
  Files         363      363              
  Lines       20568    20589      +21     
==========================================
+ Hits        19745    19764      +19     
- Misses        628      630       +2     
  Partials      195      195              
Flag Coverage Δ
badger_v1 9.82% <ø> (ø)
badger_v2 1.82% <ø> (ø)
cassandra-4.x-v1-manual 14.79% <ø> (ø)
cassandra-4.x-v2-auto 1.81% <ø> (ø)
cassandra-4.x-v2-manual 1.81% <ø> (ø)
cassandra-5.x-v1-manual 14.79% <ø> (ø)
cassandra-5.x-v2-auto 1.81% <ø> (ø)
cassandra-5.x-v2-manual 1.81% <ø> (ø)
elasticsearch-6.x-v1 19.12% <ø> (ø)
elasticsearch-7.x-v1 19.20% <ø> (ø)
elasticsearch-8.x-v1 19.37% <ø> (ø)
elasticsearch-8.x-v2 1.82% <ø> (ø)
grpc_v1 10.80% <ø> (ø)
grpc_v2 7.80% <ø> (ø)
kafka-3.x-v1 10.12% <ø> (ø)
kafka-3.x-v2 1.82% <ø> (ø)
memory_v2 1.82% <ø> (ø)
opensearch-1.x-v1 19.25% <ø> (ø)
opensearch-2.x-v1 19.25% <ø> (ø)
opensearch-2.x-v2 1.82% <ø> (ø)
tailsampling-processor 0.48% <ø> (ø)
unittests 94.87% <94.73%> (-0.01%) ⬇️


@Manik2708
Contributor Author

Manik2708 commented Jan 30, 2025

@yurishkuro I think even this approach can lead to the error described in the issue: an index might not exist but still be pointed to by an alias. Should we ignore that error in init (since it means rollover has already taken place)? I initially thought that checking for index existence would fix this, but I may be wrong, so I would like your view on ignoring this error in init.
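
For illustration only, ignoring that error in init could look roughly like the sketch below. It assumes the error body has the shape quoted in the PR description; `esErrorResponse` and `indicatesExistingName` are hypothetical names, and the sample body in `main` is abbreviated from the error reported in the issue. Whether init should actually ignore this error is the open question here, not something the sketch decides.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// esErrorResponse models only the fields this sketch needs from the
// error body; other fields (root_cause, index_uuid, ...) are ignored.
type esErrorResponse struct {
	Error struct {
		Type   string `json:"type"`
		Reason string `json:"reason"`
	} `json:"error"`
	Status int `json:"status"`
}

// indicatesExistingName reports whether the body describes a name that
// is already taken, either by an index or by an alias.
// Hypothetical helper: unparsable bodies are treated as "not ignorable".
func indicatesExistingName(body []byte) bool {
	var resp esErrorResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return false
	}
	switch resp.Error.Type {
	case "resource_already_exists_exception":
		return true
	case "invalid_index_name_exception":
		// Only the alias conflict counts; other invalid names stay errors.
		return strings.Contains(resp.Error.Reason, "already exists as alias")
	default:
		return false
	}
}

func main() {
	// Abbreviated form of the error body reported in the issue.
	body := []byte(`{"error":{"type":"invalid_index_name_exception",` +
		`"reason":"Invalid index name [jaeger-span-000001], already exists as alias"},` +
		`"status":400}`)
	fmt.Println(indicatesExistingName(body)) // prints "true"
}
```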

@Manik2708 changed the title from "[bug][storage] Enhanced the idempotency of ES-Rollover" to "[bug][storage] Fixed the ES-Rollover Idempotency" on Jan 31, 2025
@Manik2708
Contributor Author

@mahadzaryab1 Could you please review this PR and help resolve the question raised in #6638 (comment)?

pkg/es/client/client.go (outdated, resolved)
cmd/es-rollover/app/init/action.go (outdated, resolved)
cmd/es-rollover/app/init/action.go (resolved)
Signed-off-by: Manik2708 <[email protected]>
@Manik2708 Manik2708 requested a review from yurishkuro February 11, 2025 00:16
@Manik2708 Manik2708 requested a review from yurishkuro February 11, 2025 11:20
pkg/es/client/client.go (outdated, resolved)
pkg/es/client/client.go (outdated, resolved)
Signed-off-by: Manik2708 <[email protected]>
@Manik2708 Manik2708 requested a review from yurishkuro February 11, 2025 14:57
@Manik2708
Contributor Author

Fixing e2e tests

This reverts commit aab70e3.

Signed-off-by: Manik2708 <[email protected]>
Signed-off-by: Manik2708 <[email protected]>
Development

Successfully merging this pull request may close these issues.

[Bug]: jaeger-es-rollover-init tries to create already existing [jaeger-span-000001] index in Elasticseach
2 participants