Improve exception handling for JsonXContentParser #123439

smalyshev · 2025-02-26T00:44:24Z

Ensures that all exception handling in the parser follows a single code path and that JSON exceptions are handled in one place.

rjernst

Seems like this could have an impact on JSON parsing speed due to introducing lambdas at every token event. Is the lambda needed?
cc @original-brownbear

smalyshev · 2025-02-26T04:55:54Z

@rjernst Thanks, I did not consider the performance aspect; I'll look into that. Strictly speaking, the lambda is not necessary: that part was added to avoid copy-pasting try/catch all over, but if it kills performance I can go back to copy-pasting.
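
For readers following the trade-off, here is a minimal sketch of the two styles under discussion. Every name in it (`TokenSource`, `ParseFailure`, `CheckedSupplier`, `safeParse` as written here, `handleFailure`) is a hypothetical stand-in and is not taken from the PR diff; `IOException` stands in for the low-level JSON exception types.

```java
import java.io.IOException;

// Hypothetical sketch; these types are illustrative, not the real JsonXContentParser code.
final class SafeParseSketch {

    /** Stand-in for the underlying JSON parser, which may throw a low-level exception. */
    interface TokenSource {
        String nextToken() throws IOException;
    }

    /** Stand-in for the exception type that callers of the wrapper should see. */
    static final class ParseFailure extends RuntimeException {
        ParseFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Supplier variant that is allowed to throw a checked IOException. */
    @FunctionalInterface
    interface CheckedSupplier<T> {
        T get() throws IOException;
    }

    /** The single place where low-level exceptions are translated. */
    static ParseFailure handleFailure(IOException e) {
        return new ParseFailure("parse error: " + e.getMessage(), e);
    }

    /** Style 1: a lambda-based wrapper, so the try/catch is written only once. */
    static <T> T safeParse(CheckedSupplier<T> action) {
        try {
            return action.get();
        } catch (IOException e) {
            throw handleFailure(e);
        }
    }

    static String nextTokenWithLambda(TokenSource source) {
        return safeParse(source::nextToken);
    }

    /** Style 2: the try/catch is repeated ("unrolled") in each delegating method. */
    static String nextTokenUnrolled(TokenSource source) {
        try {
            return source.nextToken();
        } catch (IOException e) {
            throw handleFailure(e);
        }
    }

    public static void main(String[] args) {
        TokenSource ok = () -> "START_OBJECT";
        TokenSource bad = () -> { throw new IOException("unexpected character"); };
        System.out.println(nextTokenWithLambda(ok));
        System.out.println(nextTokenUnrolled(ok));
        try {
            nextTokenUnrolled(bad);
        } catch (ParseFailure e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch both styles funnel failures through `handleFailure`, so the exception translation still lives in one place. The only difference is whether each call goes through an extra lambda invocation, which the JIT may or may not inline.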

original-brownbear · 2025-02-26T12:24:00Z

Yeah, we can't do this; I would expect it to annihilate performance. We have some JMH benchmarks around mapper parsing that will probably show this quite clearly.

smalyshev · 2025-02-26T17:37:21Z

I ran FilterContentBenchmark and don't see any substantial difference there (in fact, the majority of runs were faster, but I'm not sure it means anything; it's probably just jitter). Is it the right one to use? I put the results here: https://docs.google.com/spreadsheets/d/1JxHtSVzgfcOCZVAVHuIj1NmkvUvZKRcN1gVCChXz3xo/edit in case they're useful. I'd expect the JIT to unroll the statics and lambdas, but I'm not sure how smart it is.

original-brownbear · 2025-02-26T19:34:39Z

@smalyshev Does that benchmark even exercise this code path much? Almost looks like it just delegates to com.fasterxml.jackson.core.JsonGenerator#copyCurrentStructure? Also I'd expect the cost in that one to mainly live with the builder side of things, not the parsing?

I'd expect org.elasticsearch.benchmark.index.mapper.BeatsMapperBenchmark to show a regression (but run it on a clean machine, not a laptop; at least for me, JMH results on a MacBook or similar are all over the place).

smalyshev · 2025-02-26T19:39:24Z

@original-brownbear OK I'll try BeatsMapperBenchmark on a VM and see what happens.

smalyshev · 2025-02-26T20:26:07Z

@original-brownbear So this is what I get on the Google VM:
from main:

BeatsMapperBenchmark.benchmarkParseKeywordFields  1600172297  avgt    5  19035.809 ± 63.917  ns/op
BeatsMapperBenchmark.benchmarkParseKeywordFields  1600172297  avgt    5  19312.652 ± 82.225  ns/op
BeatsMapperBenchmark.benchmarkParseKeywordFields  1600172297  avgt    5  19097.191 ± 168.124  ns/op

with the patch:

BeatsMapperBenchmark.benchmarkParseKeywordFields  1600172297  avgt    5  18866.318 ± 190.904  ns/op
BeatsMapperBenchmark.benchmarkParseKeywordFields  1600172297  avgt    5  18837.553 ± 221.828  ns/op
BeatsMapperBenchmark.benchmarkParseKeywordFields  1600172297  avgt    5  19106.008 ± 160.673  ns/op

I'm not sure why the patch gets a tiny bit faster results, but it seems to be all within the error bars. Any other suggestions about how I could test it?
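
One way to sanity-check the "within the error bars" reading is to compare the mean of each set of runs against the spread between runs of the same build. A minimal sketch using only the six scores quoted above (the class and method names are illustrative):

```java
// Illustrative arithmetic on the scores quoted in this comment; not part of the PR.
final class BenchmarkSpreadSketch {

    // Scores in ns/op, copied from the runs above.
    static final double[] MAIN = {19035.809, 19312.652, 19097.191};
    static final double[] PATCH = {18866.318, 18837.553, 19106.008};

    /** Arithmetic mean of the scores. */
    static double mean(double[] scores) {
        double sum = 0.0;
        for (double s : scores) {
            sum += s;
        }
        return sum / scores.length;
    }

    /** Difference between the slowest and fastest run in a set. */
    static double range(double[] scores) {
        double min = scores[0];
        double max = scores[0];
        for (double s : scores) {
            min = Math.min(min, s);
            max = Math.max(max, s);
        }
        return max - min;
    }

    public static void main(String[] args) {
        double mainMean = mean(MAIN);
        double patchMean = mean(PATCH);
        System.out.printf("main mean:  %.3f ns/op (range %.3f)%n", mainMean, range(MAIN));
        System.out.printf("patch mean: %.3f ns/op (range %.3f)%n", patchMean, range(PATCH));
        System.out.printf("difference: %.3f ns/op (%.2f%% of main)%n",
            mainMean - patchMean, 100.0 * (mainMean - patchMean) / mainMean);
    }
}
```

The means differ by roughly 212 ns/op (about 1.1%), which is smaller than the spread between runs of the same build (roughly 277 ns/op on main and 268 ns/op with the patch), so these numbers alone do not separate the two builds.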

elasticsearchmachine · 2025-02-27T18:09:23Z

Pinging @elastic/es-core-infra (Team:Core/Infra)

original-brownbear · 2025-02-27T18:15:45Z

@smalyshev even if you cannot show a regression in a microbenchmark, I would strongly discourage this kind of change. It might be that the compiler is able to inline the whole thing. But it could also be that for some percentage of runs it won't be able to do so (JIT is not deterministic at all, and you might even be able to reproduce this fact by rerunning the benchmark a couple of times in fresh JVMs).
If we want to make a change to this code, let's keep it cleanly procedural IMO.

I also gave that benchmark a spin and it's not a good reproducer sadly because it mostly goes through the dot expanding parser which has so much overhead that it drowns out the overhead here. Sorry for the inaccurate suggestion, I couldn't find a good Rally run at all when I just checked.
You might be able to reproduce a change for this kind of thing in some Rally runs that are heavy in runtime fields, those parse heavily but whether that avoids the dot expander noise is a different topic.

smalyshev · 2025-02-27T19:06:18Z

Maybe we could add a parser-specific microbenchmark if the existing ones aren't good enough? It's not hard for me to unroll the lambdas, but if we don't have a good benchmark I can't be sure any change wouldn't be problematic, so there's something that needs to be improved regardless.

original-brownbear · 2025-04-29T09:17:03Z

Thanks @smalyshev, this looks quite nice. I could actually see this being a performance improvement as well, since it now reduces the code size in the hot path! Maybe wait for @rjernst to be good with this one as well, but LGTM from my end!

rjernst

Looks much better, but I have one more question

.../x-content/impl/src/main/java/org/elasticsearch/xcontent/provider/json/JsonXContentImpl.java

rjernst

LGTM

* Improve exception handling for JsonXContentParser
* Wrap parser creation so that bad char sequences do not cause 500
* Unroll safeParse due to performance concerns
* Add parser-only microbenchmark

elasticsearchmachine · 2025-08-06T00:20:56Z

💚 Backport successful

Status	Branch	Result
✅	9.1
✅	8.19

…132481)
* Improve exception handling for JsonXContentParser (#123439)
* Improve exception handling for JsonXContentParser
* Wrap parser creation so that bad char sequences do not cause 500
* Unroll safeParse due to performance concerns
* Add parser-only microbenchmark
* Old Java doesn't have pattern matching, do it the hard way

Improve exception handling for JsonXContentParser

c9d4bb7

elasticsearchmachine added the v9.1.0 label Feb 26, 2025

smalyshev added >refactoring :Core/Infra/Core Core issues without another label labels Feb 26, 2025

smalyshev requested review from cbuescher, rjernst and benwtrent February 26, 2025 02:27

rjernst reviewed Feb 26, 2025

Merge branch 'main' into json-parser-exc

99bca26

smalyshev added 3 commits February 26, 2025 13:41

Fix tests

617f12a

Merge branch 'main' into json-parser-exc

2b3f32b

More test fixes

d37e09c

smalyshev force-pushed the json-parser-exc branch from 459d209 to d37e09c Compare February 26, 2025 21:48

smalyshev added v8.19.0 auto-backport Automatically create backport pull requests when merged labels Feb 26, 2025

Wrap parser creation so that bad char sequences do not cause 500

20a0f9e

smalyshev marked this pull request as ready for review February 27, 2025 18:09

smalyshev requested a review from a team as a code owner February 27, 2025 18:09

elasticsearchmachine added the Team:Core/Infra Meta label for core/infra team label Feb 27, 2025

smalyshev added 2 commits February 27, 2025 13:22

Merge branch 'main' into json-parser-exc

428ec3c

Unroll safeParse due to performance concerns

ca7b09e

Merge branch 'main' into json-parser-exc

0771670

benwtrent removed their request for review March 27, 2025 14:04

smalyshev added 2 commits April 26, 2025 22:20

Merge branch 'main' into json-parser-exc

9b1331b

Merge branch 'main' into json-parser-exc

a53dd2b

original-brownbear approved these changes Apr 29, 2025

rjernst reviewed Apr 29, 2025

.../x-content/impl/src/main/java/org/elasticsearch/xcontent/provider/json/JsonXContentImpl.java (outdated, resolved)

Merge branch 'main' into json-parser-exc

e33ea1e

elasticsearchmachine added v9.2.0 and removed v9.1.0 labels Jun 26, 2025

smalyshev removed the v8.19.0 label Jul 9, 2025

smalyshev and others added 4 commits July 9, 2025 13:56

Merge branch 'main' into json-parser-exc

ce56773

Merge branch 'main' into json-parser-exc

cf11dd6

Merge branch 'main' into json-parser-exc

2df22a2

Patch feedback

1f2b9be

smalyshev requested a review from rjernst August 5, 2025 22:47

smalyshev added v9.1.1 v8.19.1 auto-backport Automatically create backport pull requests when merged and removed auto-backport Automatically create backport pull requests when merged labels Aug 5, 2025

rjernst approved these changes Aug 5, 2025

smalyshev merged commit 91b6267 into elastic:main Aug 6, 2025
33 checks passed

smalyshev mentioned this pull request Aug 6, 2025

[9.1] Improve exception handling for JsonXContentParser (#123439) #132480

Merged

smalyshev mentioned this pull request Aug 6, 2025

[8.19] Improve exception handling for JsonXContentParser (#123439) #132481

Merged
