
CSW GetRecords fails when requesting an outputSchema not supported by a metadata schema.#6941

Merged
josegar74 merged 4 commits into
geonetwork:mainfrom
GeoCat:fix-6940
Jun 10, 2026
Conversation


@josegar74 josegar74 commented Mar 24, 2023

Member

Fixes #6940

Add a new setting to customise the behaviour:

  • Ignore the metadata (default)
  • Fail the request
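
The two behaviours can be illustrated with a minimal sketch. This is not the code merged in this pull request: the enum `UnsupportedSchemaPolicy`, the `Metadata` class, the `present` method and the `convertibleSchemas` set are hypothetical names introduced only for illustration, and the real implementation works on XSL transformations rather than on a set of schema names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class OutputSchemaPolicySketch {
    /** Hypothetical names for the two behaviours of the new setting. */
    enum UnsupportedSchemaPolicy { IGNORE, FAIL }

    /** Minimal stand-in for a catalogue record: an id and its metadata schema. */
    static final class Metadata {
        final String id;
        final String schema;

        Metadata(String id, String schema) {
            this.id = id;
            this.schema = schema;
        }
    }

    /**
     * Presents each record of a page in the requested outputSchema.
     * convertibleSchemas is an assumption: the metadata schemas for which
     * a conversion to outputSchema exists.
     */
    static List<String> present(List<Metadata> page, String outputSchema,
                                Set<String> convertibleSchemas,
                                UnsupportedSchemaPolicy policy) {
        List<String> presented = new ArrayList<>();
        for (Metadata md : page) {
            if (convertibleSchemas.contains(md.schema)) {
                presented.add(md.id + " as " + outputSchema);
            } else if (policy == UnsupportedSchemaPolicy.FAIL) {
                // Fail the request: one unsupported record aborts the whole response.
                throw new IllegalArgumentException("OutputSchema '" + outputSchema
                    + "' not supported for metadata '" + md.id + "' (" + md.schema + ")");
            }
            // IGNORE: the record is simply left out of the response.
        }
        return presented;
    }

    public static void main(String[] args) {
        List<Metadata> page = List.of(
            new Metadata("1", "iso19139"),
            new Metadata("106", "dublin-core"));
        System.out.println(present(page, "eu-geodcat-ap", Set.of("iso19139"),
            UnsupportedSchemaPolicy.IGNORE));
    }
}
```

With the ignore policy the Dublin Core record is dropped and the rest of the page is returned; with the fail policy the same call throws.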

@josegar74 josegar74 added this to the 4.2.4 milestone Mar 24, 2023
@josegar74 josegar74 requested a review from fxprunayre March 24, 2023 14:45
@fxprunayre fxprunayre modified the milestones: 4.2.4, 4.2.5 May 10, 2023
@fxprunayre fxprunayre modified the milestones: 4.2.5, 4.2.6 Jul 5, 2023
@fxprunayre fxprunayre modified the milestones: 4.2.6, 4.4.1 Oct 4, 2023
@josegar74 josegar74 modified the milestones: 4.4.1, 4.4.2 Nov 22, 2023
@fxprunayre fxprunayre modified the milestones: 4.4.2, 4.4.3 Jan 23, 2024
@fxprunayre fxprunayre modified the milestones: 4.4.3, 4.4.4 Mar 13, 2024
@fxprunayre fxprunayre modified the milestones: 4.4.4, 4.4.5 Apr 16, 2024
@fxprunayre fxprunayre modified the milestones: 4.4.5, 4.4.6 Jun 4, 2024
@fxprunayre fxprunayre modified the milestones: 4.4.6, 4.4.7 Oct 15, 2024

CLAassistant commented Dec 8, 2024


CLA assistant check
All committers have signed the CLA.

@josegar74 josegar74 modified the milestones: 4.4.7, 4.4.8 Apr 10, 2025
@josegar74 josegar74 modified the milestones: 4.4.8, 4.4.9 May 28, 2025
@jahow jahow modified the milestones: 4.4.9, 4.4.10 Oct 7, 2025
@josegar74 josegar74 removed this from the 4.4.10 milestone Jun 6, 2026
@josegar74 josegar74 added this to the 4.4.12 milestone Jun 6, 2026
juanluisrp added a commit to GeoCat/core-geonetwork that referenced this pull request Jun 9, 2026
…eturned

When harvesting via CSW GetRecords, the source returns pages of several
records at once. Some CSW servers (GeoNetwork included) abort the whole
GetRecords response when a single record in the page cannot be serialized
in the requested outputSchema, for instance an ISO 19110 feature catalogue
requested with outputSchema=gmd, which has no gmd presentation. The
harvester turned that page error into a fatal OperationAbortedEx and the
whole harvest run stopped, leaving the catalogue partially harvested.

Make the page fetch resilient: when the source returns an OWS exception for
a page, split the page in half and retry each half. A single record that
still fails is logged, skipped, and the harvest carries on with the rest.
Each bad record costs O(log n) extra requests to isolate; in the sparse
case this keeps the overhead low. In the worst case (every record in the
page is bad) the split visits 2n-1 nodes, so the total is linear in the
page size; this is bounded in practice because pages are small (typically 10-200
records), and still far fewer requests than fetching records one by one
from the start. A SearchResults element is synthesized for the recovered
page with consistent numberOfRecordsMatched, numberOfRecordsReturned
(positions consumed, i.e. returned plus skipped) and nextRecord attributes,
so the existing paging and end-of-set detection keep working unchanged.
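
The split-and-retry recovery described above can be sketched as follows. This is an illustration under stated assumptions, not the harvester's actual code: `PageFetcher`, `OwsPageException`, `Recovered`, `fetchResilient` and `nextRecord` are hypothetical names, records are reduced to strings, and the `nextRecord` helper assumes the CSW 2.0.2 convention that a value of 0 means the result set is exhausted.

```java
import java.util.ArrayList;
import java.util.List;

final class PageBisectionSketch {
    /** Hypothetical stand-in for a server-side OWS exception returned for a page. */
    static final class OwsPageException extends RuntimeException {
        OwsPageException(String message) { super(message); }
    }

    /** Hypothetical page request: fetch count records starting at position start. */
    interface PageFetcher {
        List<String> fetch(int start, int count);
    }

    /** Accumulates what was recovered from one page. */
    static final class Recovered {
        final List<String> records = new ArrayList<>();
        final List<Integer> skippedPositions = new ArrayList<>();
        int requests;
    }

    /** Fetches a range; on an OWS exception, splits it in half and retries each half. */
    static void fetchResilient(PageFetcher fetcher, int start, int count, Recovered acc) {
        if (count <= 0) {
            return;
        }
        acc.requests++;
        try {
            acc.records.addAll(fetcher.fetch(start, count));
        } catch (OwsPageException e) {
            if (count == 1) {
                // A single record that still fails is skipped; the harvest carries on.
                acc.skippedPositions.add(start);
                return;
            }
            int half = count / 2;
            fetchResilient(fetcher, start, half, acc);
            fetchResilient(fetcher, start + half, count - half, acc);
        }
    }

    /** nextRecord for the synthesized results; consumed = returned + skipped. */
    static int nextRecord(int start, int consumed, int matched) {
        int next = start + consumed;
        return next > matched ? 0 : next;
    }

    public static void main(String[] args) {
        // Simulated source: any range containing position 3 fails as a whole.
        PageFetcher fetcher = (start, count) -> {
            List<String> out = new ArrayList<>();
            for (int p = start; p < start + count; p++) {
                if (p == 3) {
                    throw new OwsPageException("cannot present record " + p);
                }
                out.add("rec" + p);
            }
            return out;
        };
        Recovered acc = new Recovered();
        fetchResilient(fetcher, 1, 8, acc);
        System.out.println(acc.records + " skipped=" + acc.skippedPositions
            + " requests=" + acc.requests);
    }
}
```

The request counter includes the initial failed page request, so a page in which every record is bad costs 2n-1 requests, matching the worst case stated above.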

Only server-side OWS exceptions trigger this recovery. Connection and
protocol errors, and all the existing handling of well-behaved and
misbehaving servers (nextRecord based termination, old CSW namespace,
GET/POST fallback), are left untouched.

The recovery is on the harvester side, so it also handles non-GeoNetwork
CSW servers that fail a page for any reason. The matching server-side
behaviour, making a GeoNetwork source skip the records it cannot present
instead of failing the whole page, is tracked in geonetwork#6940 and proposed in
geonetwork#6941; the two changes are complementary.

Related to geonetwork#6940
Related to geonetwork#6941
@josegar74 josegar74 requested a review from juanluisrp June 9, 2026 14:51

@fxprunayre fxprunayre left a comment

Member


Tested with the default samples and the option off. The request http://localhost:8080/geonetwork/srv/fre/csw?SERVICE=CSW&VERSION=2.0.2&REQUEST=GetRecords&outputSchema=http://data.europa.eu/930/&elementSetName=full&resultType=results&maxRecords=30 returns:

<ows:ExceptionReport xmlns:ows="http://www.opengis.net/ows" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.2.0" xsi:schemaLocation="http://www.opengis.net/ows http://schemas.opengis.net/ows/1.0.0/owsExceptionReport.xsd">
<ows:Exception exceptionCode="NoApplicableCode">
<ows:ExceptionText>java.lang.RuntimeException: org.fao.geonet.csw.common.exceptions.InvalidParameterValueEx: code=InvalidParameterValue, locator=OutputSchema, message=OutputSchema 'eu-geodcat-ap' not supported for metadata with '106' (dublin-core). Corresponding XSL transformation 'eu-geodcat-ap-full.xsl' (or 'eu-geodcat-ap.xsl') does not exist for this schema. The record will not be returned in response.</ows:ExceptionText>
</ows:Exception>
</ows:ExceptionReport>
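
A client that wants to tell a response like this one apart from a normal GetRecords response could check the root element, as in the sketch below. It uses only the JDK's DOM parser; the class and method names (`OwsExceptionSketch`, `owsExceptionCode`) are hypothetical and this is not how GeoNetwork's harvester is known to do it.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class OwsExceptionSketch {
    /** Namespace used by the exception report shown in the review comment. */
    static final String OWS_NS = "http://www.opengis.net/ows";

    /**
     * Returns the exceptionCode of the first ows:Exception when the body is
     * an ows:ExceptionReport, or null when the root element is anything else.
     */
    static String owsExceptionCode(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Refuse DOCTYPE declarations, since the response comes from a remote server.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            if (!OWS_NS.equals(root.getNamespaceURI())
                    || !"ExceptionReport".equals(root.getLocalName())) {
                return null;
            }
            NodeList exceptions = root.getElementsByTagNameNS(OWS_NS, "Exception");
            if (exceptions.getLength() == 0) {
                return "";
            }
            return ((Element) exceptions.item(0)).getAttribute("exceptionCode");
        } catch (Exception e) {
            throw new IllegalStateException("Response is not parseable XML", e);
        }
    }

    public static void main(String[] args) {
        String report = "<ows:ExceptionReport xmlns:ows=\"" + OWS_NS + "\" version=\"1.2.0\">"
            + "<ows:Exception exceptionCode=\"NoApplicableCode\">"
            + "<ows:ExceptionText>example</ows:ExceptionText>"
            + "</ows:Exception></ows:ExceptionReport>";
        System.out.println(owsExceptionCode(report));
    }
}
```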

@josegar74 josegar74 merged commit c7be4e4 into geonetwork:main Jun 10, 2026
8 checks passed
@josegar74 josegar74 added the update-release-notes PRs with this label need some notes to be added to the documentation/release notes for user action. label Jun 10, 2026
