Lucene: update the same record multiple times in the same transaction #3705
+158
−18
When multiple update operations on the same record run in a single transaction, the partition's document counts and the PK-segment index can end up in an inconsistent state. The root cause is that the first update in the transaction clears the document from both the Lucene index and the PK index. Because the changes are not yet flushed, the IndexWriter holds them in the NRT cache. The second update then fails to find the record in the PK index (the segment has changed, but the IndexReader does not yet reflect that), so the delete is skipped, along with the partition count update. Note that the code still attempts a delete-by-query, which does remove the document from the Lucene index, but since we can't know whether it removed anything, the partition count is not updated.
The proposed solution is as follows:
- Add a `forceDelete` flag to the `LuceneIndexMaintainer.tryDelete` method, set to true when we are doing an update. The rationale is that during an update we know the old document actually existed, even if we can't find it.
- When `forceDelete` is set, clear the entire range for the record PK, removing the record entry regardless of the segment it is currently in.
- When `forceDelete` is set, always try to decrement the count, since we assume that an update always has an original record that is being modified.

Resolve #3704
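The failure mode and the effect of `forceDelete` can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual `LuceneIndexMaintainer` code: the class `ForceDeleteSketch`, its fields, and its methods are all hypothetical, the PK index is reduced to an in-memory map from primary key to segment names, and the stale IndexReader is reduced to a set of segments that is refreshed only on an explicit flush.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model (hypothetical names); it does not use the Record Layer or Lucene APIs. */
class ForceDeleteSketch {
    // Stand-in for the PK-segment index: primary key -> segments holding that key.
    private final Map<String, Set<String>> pkIndex = new HashMap<>();
    // Segments the (possibly stale) reader knows about.
    private final Set<String> readerSegments = new HashSet<>();
    private int nextSegment = 0;
    int partitionCount = 0;

    /** Writes the document into a fresh, unflushed segment and bumps the count. */
    void insert(String pk) {
        String segment = "seg" + nextSegment++;
        pkIndex.computeIfAbsent(pk, k -> new HashSet<>()).add(segment);
        partitionCount++;
    }

    /** Models a flush: the reader now sees every segment written so far. */
    void refreshReader() {
        for (Set<String> segments : pkIndex.values()) {
            readerSegments.addAll(segments);
        }
    }

    /** Returns true if the count was decremented. */
    boolean tryDelete(String pk, boolean forceDelete) {
        Set<String> segments = pkIndex.getOrDefault(pk, Set.of());
        String visible = null;
        for (String segment : segments) {
            if (readerSegments.contains(segment)) {
                visible = segment;
            }
        }
        if (visible != null) {
            // Normal path: the reader can locate the entry, so remove exactly that one.
            segments.remove(visible);
            if (segments.isEmpty()) {
                pkIndex.remove(pk);
            }
            partitionCount--;
            return true;
        }
        if (forceDelete) {
            // Update path: the old document must exist, so clear every entry
            // for this key (the "range clear") and decrement anyway.
            pkIndex.remove(pk);
            partitionCount--;
            return true;
        }
        // Entry lives in a segment the reader cannot see: delete is skipped.
        return false;
    }

    void update(String pk, boolean forceDelete) {
        tryDelete(pk, forceDelete);
        insert(pk);
    }

    /** One committed record, then two updates in the same "transaction" (no flush). */
    static int countAfterTwoUpdates(boolean forceDelete) {
        ForceDeleteSketch index = new ForceDeleteSketch();
        index.insert("rec-1");
        index.refreshReader();
        index.update("rec-1", forceDelete);
        index.update("rec-1", forceDelete);
        return index.partitionCount;
    }

    public static void main(String[] args) {
        System.out.println("without forceDelete: " + countAfterTwoUpdates(false)); // 2
        System.out.println("with forceDelete:    " + countAfterTwoUpdates(true));  // 1
    }
}
```

In this model the second update without `forceDelete` leaves the count at 2 for a single logical record, which mirrors the inconsistency described above. With `forceDelete` the count stays at 1.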