Elasticsearch HTTP proxy refactor #9318

Open: josegar74 wants to merge 4 commits into geonetwork:main from GeoCat:eshttpproxy-refactor

@josegar74 (Member) commented Jun 6, 2026

Previously, EsHTTPProxy carried too many responsibilities. This refactor splits the code into smaller classes to make it clearer and more maintainable. The workflow follows a request-response pipeline:

  • Request: EsHTTPProxy receives a search request → EsQueryProcessor modifies the JSON DSL → EsHTTPProxy forwards the request to Elasticsearch.
  • Response: EsResponseProcessor receives the response → Streams it through various EsDocumentProcessor implementations → Returns the modified response to the client.
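The two stages above can be pictured as a simple composition of functions. This is only a minimal sketch: `ProxyPipelineSketch`, `handle`, and the string-based stand-ins for the query processor, the Elasticsearch call, and the response processor are hypothetical and are not the classes in this PR.

```java
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Hypothetical illustration of the request-response pipeline; not the PR's code.
class ProxyPipelineSketch {

    // Rewrites the request, forwards it, then post-processes the response.
    static String handle(String requestBody,
                         UnaryOperator<String> queryProcessor,
                         Function<String, String> backend,
                         UnaryOperator<String> responseProcessor) {
        String rewrittenQuery = queryProcessor.apply(requestBody); // request stage
        String rawResponse = backend.apply(rewrittenQuery);        // forward to the search backend
        return responseProcessor.apply(rawResponse);               // response stage
    }

    public static void main(String[] args) {
        String result = handle(
            "query",
            q -> q + "+acl",              // stand-in for the query rewrite
            q -> "hits(" + q + ")",       // stand-in for the Elasticsearch call
            r -> r + "+enriched");        // stand-in for the response enrichment
        System.out.println(result);
    }
}
```

Keeping each stage behind its own function makes it possible to test the query rewrite and the response enrichment without a running Elasticsearch instance.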

A. Proxy Layer

  • EsHTTPProxy: The primary entry point. It is a Spring MVC Controller that handles HTTP POST requests to /search/records/_search. It manages the connection to Elasticsearch, handles content encoding (GZIP), and coordinates the pre-processing and post-processing steps.

B. Query Processing Layer (Request)

  • EsQueryProcessor: Intercepts the raw JSON query body. It supports both single searches and multi-searches (_msearch). It ensures required fields (like ID, Owner, Schema) are always requested from Elasticsearch.
  • EsQueryFilterBuilder: Responsible for modifying the Elasticsearch Query DSL. It injects Access Control List (ACL) filters to ensure users only see records they have permission to access.
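One common way to inject an ACL filter is to wrap the caller's query in a `bool` query and add the permission check as a `filter` clause. The sketch below illustrates that idea with plain maps standing in for the JSON body. `AclFilterSketch`, `withAclFilter`, and the privilege field name passed in by the caller are assumptions for illustration; the actual EsQueryFilterBuilder may build its filter differently.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of ACL filter injection; not the PR's EsQueryFilterBuilder.
class AclFilterSketch {

    // Returns a copy of the search body whose query only matches documents
    // where privilegeField contains at least one of the user's group ids.
    static Map<String, Object> withAclFilter(Map<String, Object> searchBody,
                                             String privilegeField,
                                             List<Integer> groupIds) {
        // Fall back to match_all when the client sent no query at all.
        Object originalQuery = searchBody.getOrDefault("query", Map.of("match_all", Map.of()));

        // "terms" matches when the field holds any of the listed values.
        Map<String, Object> aclClause = Map.of("terms", Map.of(privilegeField, groupIds));

        Map<String, Object> bool = new LinkedHashMap<>();
        bool.put("must", originalQuery); // the client's query, unchanged
        bool.put("filter", aclClause);   // non-scoring permission restriction

        // Copy the body so other keys (size, from, aggregations) are preserved.
        Map<String, Object> result = new LinkedHashMap<>(searchBody);
        result.put("query", Map.of("bool", bool));
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("size", 10);
        body.put("query", Map.of("match", Map.of("title", "rivers")));
        // "op0" is an assumed field name used only for this demonstration.
        System.out.println(withAclFilter(body, "op0", List.of(1, 2)));
    }
}
```

Placing the restriction in `filter` rather than `must` keeps it out of relevance scoring, so the ranking of visible records is not affected by the permission check.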

C. Response Processing Layer (Response)

  • EsResponseProcessor: Orchestrates the enrichment of the search hits. It uses a streaming approach to process results efficiently without loading the entire response into memory.
  • EsDocumentProcessor (Interface): Defines a contract for individual document modification.
    • EsDocumentUserInfoProcessor: Adds privilege information (e.g., canEdit, isOwner) to each search hit based on the current user's session.
    • EsDocumentSelectionInfoProcessor: Adds information about whether a record is currently "selected" in the user's bucket.
    • EsDocumentMetadataFiltersProcessor: Applies schema-specific filters to remove fields that shouldn't be exposed.
    • EsDocumentRemovePrivilegesProcessor: Cleans up internal privilege fields before sending the data to the client.
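The processor contract described above resembles a chain of small per-document steps. The following is a simplified sketch, assuming each hit is represented as a mutable map. The `DocumentProcessor` interface, the factory methods, and the field names (`id`, `canEdit`, `selected`, and privilege fields matching `op` followed by digits) are hypothetical stand-ins, not the signatures introduced by this PR.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for the EsDocumentProcessor contract.
interface DocumentProcessor {
    void process(Map<String, Object> hit);
}

class DocumentProcessorChainSketch {

    // Flags whether the current user may edit the record (assumed field names).
    static DocumentProcessor userInfo(Set<String> editableIds) {
        return hit -> hit.put("canEdit", editableIds.contains(String.valueOf(hit.get("id"))));
    }

    // Flags whether the record is in the user's selection.
    static DocumentProcessor selectionInfo(Set<String> selectedIds) {
        return hit -> hit.put("selected", selectedIds.contains(String.valueOf(hit.get("id"))));
    }

    // Drops internal privilege fields; the "op<digits>" pattern is an assumption.
    static DocumentProcessor removePrivileges() {
        return hit -> hit.keySet().removeIf(key -> key.matches("op\\d+"));
    }

    // Applies every processor to the hit, in list order.
    static void applyAll(List<DocumentProcessor> processors, Map<String, Object> hit) {
        for (DocumentProcessor processor : processors) {
            processor.process(hit);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> hit = new LinkedHashMap<>();
        hit.put("id", "abc");
        hit.put("op0", List.of("1"));
        applyAll(List.of(userInfo(Set.of("abc")), selectionInfo(Set.of()), removePrivileges()), hit);
        System.out.println(hit);
    }
}
```

Order matters in a chain like this: a step that reads privilege fields has to run before the step that removes them, which is why the cleanup processor is listed last.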

D. Utility & Infrastructure

  • JsonStreamUtils: A critical utility that leverages the Jackson library for streaming JSON processing. It allows GeoNetwork to "walk" through the Elasticsearch response and apply modifications only to the hits array, maintaining high performance for large results.
  • ObjectNodeUtils: Provides helper methods for manipulating Jackson ObjectNode structures.

Checklist

  • I have read the contribution guidelines
  • Pull request provided for main branch, backports managed with label
  • Good housekeeping of code, cleaning up comments, tests, and documentation
  • Clean commit history broken into understandable chunks, avoiding big commits with hundreds of files, cautious of reformatting and whitespace changes
  • Clean commit messages, longer verbose messages are encouraged
  • API Changes are identified in commit messages
  • Testing provided for features or enhancements using automatic tests
  • User documentation provided for new features or enhancements in manual
  • Build documentation provided for development instructions in README.md files
  • Library management using pom.xml dependency management. Update build documentation with intended library use and library tutorials or documentation

@josegar74 josegar74 added this to the 4.4.12 milestone Jun 6, 2026