Elasticsearch HTTP proxy refactor #9318
Open
josegar74 wants to merge 4 commits into
Previously, `EsHTTPProxy` had too many responsibilities. This refactor splits that code up to make it clearer and more maintainable. The workflow follows a request-response pipeline:

1. `EsHTTPProxy` receives a search request → `EsQueryProcessor` modifies the JSON DSL → `EsHTTPProxy` forwards the request to Elasticsearch.
2. `EsResponseProcessor` receives the response → streams it through the various `EsDocumentProcessor` implementations → returns the modified response to the client.

A. Proxy Layer
- `EsHTTPProxy`: The primary entry point. It is a Spring MVC controller that handles HTTP POST requests to `/search/records/_search`. It manages the connection to Elasticsearch, handles content encoding (GZIP), and coordinates the pre-processing and post-processing steps.

B. Query Processing Layer (Request)
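
As a rough illustration of the kind of rewrite this layer performs, the sketch below wraps a client query in a `bool` query and adds an ACL restriction as a `filter` clause. It is not the code from this PR: the class `AclFilterSketch`, the method `withAclFilter`, the field name `op0`, and the group ids are all assumptions made for the example, and plain `Map` objects stand in for the JSON body so that no library is needed.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the PR's EsQueryFilterBuilder.
// The ACL field name and the group ids are assumptions for illustration.
final class AclFilterSketch {

    /** Returns a copy of the search body whose query only matches documents visible to the given groups. */
    static Map<String, Object> withAclFilter(Map<String, Object> searchBody,
                                             String aclField,
                                             List<Integer> allowedGroupIds) {
        // Fall back to match_all when the client sent no query at all.
        Object userQuery = searchBody.getOrDefault("query", Map.of("match_all", Map.of()));

        // The user's query stays in "must"; the ACL restriction goes in "filter",
        // so it limits the hits without affecting relevance scoring.
        Map<String, Object> bool = new LinkedHashMap<>();
        bool.put("must", userQuery);
        bool.put("filter", Map.of("terms", Map.of(aclField, allowedGroupIds)));

        // Copy the body so the caller's map is left untouched; only "query" is replaced.
        Map<String, Object> rewritten = new LinkedHashMap<>(searchBody);
        rewritten.put("query", Map.of("bool", bool));
        return rewritten;
    }

    public static void main(String[] args) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("size", 10);
        body.put("query", Map.of("match", Map.of("title", "rivers")));
        System.out.println(withAclFilter(body, "op0", List.of(1, 2)));
    }
}
```

Putting the restriction in `filter` rather than `must` is a design choice of this sketch: filter clauses do not contribute to scoring, so the ranking of the hits a user may see is unchanged.
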
- `EsQueryProcessor`: Intercepts the raw JSON query body. It supports both single searches and multi-searches (`_msearch`). It ensures required fields (like ID, Owner, Schema) are always requested from Elasticsearch.
- `EsQueryFilterBuilder`: Responsible for modifying the Elasticsearch Query DSL. It injects Access Control List (ACL) filters to ensure users only see records they have permission to access.

C. Response Processing Layer (Response)
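
The response layer applies a sequence of small document processors to each search hit. A minimal sketch of that chain pattern follows. Everything in it is hypothetical: the interface `DocumentProcessorSketch`, the field names `owner`, `selected`, and the `op` prefix for privilege fields are assumptions, and a plain `Map` replaces the Jackson nodes the real `EsDocumentProcessor` implementations work on. The signatures in the PR itself may differ.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical contract for this sketch; the PR's EsDocumentProcessor
// signature is not reproduced here.
interface DocumentProcessorSketch {
    void process(Map<String, Object> hit);
}

final class ProcessorChainSketch {

    /** Flags whether the current user owns the record ("owner" is an assumed field name). */
    static DocumentProcessorSketch userInfo(String currentUserId) {
        return hit -> hit.put("isOwner", currentUserId.equals(hit.get("owner")));
    }

    /** Flags whether the record is in the user's selection ("selected" is an assumed field name). */
    static DocumentProcessorSketch selectionInfo(Set<String> selectedIds) {
        return hit -> hit.put("selected", selectedIds.contains(hit.get("_id")));
    }

    /** Drops internal privilege fields; the "op" prefix is an assumption for illustration. */
    static DocumentProcessorSketch removePrivileges() {
        return hit -> hit.keySet().removeIf(key -> key.startsWith("op"));
    }

    /** Runs every processor over the hit, in list order. */
    static void applyAll(List<DocumentProcessorSketch> chain, Map<String, Object> hit) {
        for (DocumentProcessorSketch processor : chain) {
            processor.process(hit);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> hit = new LinkedHashMap<>();
        hit.put("_id", "abc");
        hit.put("owner", "u1");
        hit.put("op0", List.of(1, 2));

        // Privilege cleanup runs last so earlier processors could still read those fields.
        applyAll(List.of(userInfo("u1"), selectionInfo(Set.of("xyz")), removePrivileges()), hit);
        System.out.println(hit);
    }
}
```

Keeping each concern in its own processor means a new enrichment step can be added to the list without touching the others; the ordering shown (cleanup last) is an assumption of this sketch.
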
- `EsResponseProcessor`: Orchestrates the enrichment of the search hits. It uses a streaming approach to process results efficiently without loading the entire response into memory.
- `EsDocumentProcessor` (interface): Defines a contract for individual document modification.
- `EsDocumentUserInfoProcessor`: Adds privilege information (e.g., `canEdit`, `isOwner`) to each search hit based on the current user's session.
- `EsDocumentSelectionInfoProcessor`: Adds information about whether a record is currently "selected" in the user's bucket.
- `EsDocumentMetadataFiltersProcessor`: Applies schema-specific filters to remove fields that shouldn't be exposed.
- `EsDocumentRemovePrivilegesProcessor`: Cleans up internal privilege fields before sending the data to the client.

D. Utility & Infrastructure
- `JsonStreamUtils`: A critical utility that leverages the Jackson library for streaming JSON processing. It allows GeoNetwork to "walk" through the Elasticsearch response and apply modifications only to the `hits` array, maintaining high performance for large results.
- `ObjectNodeUtils`: Provides helper methods for manipulating Jackson `ObjectNode` structures.

Checklist
- `main` branch, backports managed with label
- `README.md` files
- `pom.xml` dependency management. Update build documentation with intended library use and library tutorials or documentation