
martijnvg
Member

Contains all changes mentioned in #125403

martijnvg and others added 30 commits March 21, 2025 13:57
The doc values codec iterates several times over each doc values instance that needs to be written to disk. When merging with index sorting enabled, this is much more expensive, because each iteration of the doc values instance performs an expensive doc id sort (in order to return the doc ids in index-sort order).

There are several reasons why the doc value instance is iterated multiple times:
* To compute stats (num values, number of docs with value) required for writing values to disk.
* To write the bitset that indicates which documents have a value (indexed disi, jump table).
* To write the actual values to disk.
* To write the addresses to disk (in case docs have multiple values)

This applies to numeric doc values, but also to the ordinals of sorted (set) doc values.
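To make the cost concrete, here is a toy model of the four passes listed above. It is not Lucene or Elasticsearch code: `MultiPassSketch`, `SortedView`, `Doc` and `writeField` are hypothetical names, and the sort on every `iterate()` call only stands in for the doc id sorting that happens during a merge with index sorting.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Toy model (assumption, not the real codec): every iteration re-sorts the doc ids. */
final class MultiPassSketch {

    /** One document: its id in the source segment, its index-sort key, and its value. */
    static final class Doc {
        final int docId;
        final long sortKey;
        final long value;

        Doc(int docId, long sortKey, long value) {
            this.docId = docId;
            this.sortKey = sortKey;
            this.value = value;
        }
    }

    /** Hypothetical stand-in for a doc values instance viewed through index sorting. */
    static final class SortedView {
        private final List<Doc> docs;
        int sortCount = 0; // how many times the expensive sort ran

        SortedView(List<Doc> docs) {
            this.docs = docs;
        }

        /** Each new iteration pays for a full sort by the index-sort key. */
        List<Doc> iterate() {
            sortCount++;
            List<Doc> sorted = new ArrayList<>(docs);
            sorted.sort(Comparator.comparingLong((Doc d) -> d.sortKey));
            return sorted;
        }
    }

    /** Runs the four passes from the list above and returns how many sorts they triggered. */
    static int writeField(SortedView view) {
        long numValues = 0;
        for (Doc d : view.iterate()) { numValues++; }  // pass 1: stats
        for (Doc d : view.iterate()) { }               // pass 2: docs-with-value bitset
        for (Doc d : view.iterate()) { }               // pass 3: values
        for (Doc d : view.iterate()) { }               // pass 4: addresses (multi-valued fields)
        return view.sortCount;
    }
}
```

In this model, writing one field costs four sorts, so dropping any single pass removes one full sort per field.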

This PR removes the need for the first of these iterations (computing stats). The optimization applies only when merging, and only when the segments being merged also use es87 doc values, have the same codec version, and contain no deletes.
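A minimal sketch of the idea, assuming each source segment already stores per-field stats (`numValues`, `numDocsWithField`). The method name `compatibleWithOptimizedMerge` is borrowed from a commit title in this PR, but its signature and body here are illustrative assumptions, as are the `FieldEntry` and `MergeStats` types and the `"ES87TSDB"` codec name string used in the usage example.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch; these types are illustrative, not the PR's actual classes. */
final class OptimizedMergeSketch {

    /** Per-segment, per-field metadata assumed to have been stored at write time. */
    static final class FieldEntry {
        final String codecName;
        final int version;
        final boolean hasDeletes;
        final long numValues;
        final int numDocsWithField;

        FieldEntry(String codecName, int version, boolean hasDeletes, long numValues, int numDocsWithField) {
            this.codecName = codecName;
            this.version = version;
            this.hasDeletes = hasDeletes;
            this.numValues = numValues;
            this.numDocsWithField = numDocsWithField;
        }
    }

    /** Stats needed before values can be written to disk. */
    static final class MergeStats {
        final long numValues;
        final int numDocsWithField;

        MergeStats(long numValues, int numDocsWithField) {
            this.numValues = numValues;
            this.numDocsWithField = numDocsWithField;
        }
    }

    /**
     * Returns summed stats when every segment matches the codec name and version and has
     * no deletes; otherwise returns empty, meaning the caller must iterate to compute stats.
     */
    static Optional<MergeStats> compatibleWithOptimizedMerge(List<FieldEntry> segments, String codecName, int version) {
        long numValues = 0;
        int numDocsWithField = 0;
        for (FieldEntry e : segments) {
            boolean compatible = codecName.equals(e.codecName) && e.version == version && !e.hasDeletes;
            if (!compatible) {
                return Optional.empty(); // fall back to the stats-computing pass
            }
            // With no deletes, every source doc survives the merge, so the stats simply add up.
            numValues += e.numValues;
            numDocsWithField += e.numDocsWithField;
        }
        return Optional.of(new MergeStats(numValues, numDocsWithField));
    }
}
```

For example, two compatible segments with 10 and 5 values yield `numValues == 15` without touching the values themselves, while a single segment with deletes makes the method return `Optional.empty()`.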
fixed sorted set dv
added unit test with index sorting
…ctly in compatibleWithOptimizedMerge(...) method.
* Always store `numDocsWithField` and move it to `NumericEntry`
* Use `TsdbDocValuesProducer` instead of EmptyDocValuesProducer where possible in ES87TSDBDocValuesConsumer
* Fix TsdbDocValuesProducer#isSingleValued(...)
@martijnvg
Member Author

martijnvg commented Mar 28, 2025

Running tsdb track (with force merge) without this change as baseline and with this change as contender:

|                                                        Metric |                    Task |        Baseline |       Contender |       Diff |   Unit |    Diff % |
|--------------------------------------------------------------:|------------------------:|----------------:|----------------:|-----------:|-------:|----------:|
|                    Cumulative indexing time of primary shards |                         |   261.856       |   254.106       |   -7.75052 |    min |    -2.96% |
|             Min cumulative indexing time across primary shard |                         |   261.856       |   254.106       |   -7.75052 |    min |    -2.96% |
|          Median cumulative indexing time across primary shard |                         |   261.856       |   254.106       |   -7.75052 |    min |    -2.96% |
|             Max cumulative indexing time across primary shard |                         |   261.856       |   254.106       |   -7.75052 |    min |    -2.96% |
|           Cumulative indexing throttle time of primary shards |                         |     0           |     0           |    0       |    min |     0.00% |
|    Min cumulative indexing throttle time across primary shard |                         |     0           |     0           |    0       |    min |     0.00% |
| Median cumulative indexing throttle time across primary shard |                         |     0           |     0           |    0       |    min |     0.00% |
|    Max cumulative indexing throttle time across primary shard |                         |     0           |     0           |    0       |    min |     0.00% |
|                       Cumulative merge time of primary shards |                         |    93.0234      |    75.2697      |  -17.7538  |    min |   -19.09% |
|                      Cumulative merge count of primary shards |                         |    58           |    58           |    0       |        |     0.00% |
|                Min cumulative merge time across primary shard |                         |    93.0234      |    75.2697      |  -17.7538  |    min |   -19.09% |
|             Median cumulative merge time across primary shard |                         |    93.0234      |    75.2697      |  -17.7538  |    min |   -19.09% |
|                Max cumulative merge time across primary shard |                         |    93.0234      |    75.2697      |  -17.7538  |    min |   -19.09% |
|              Cumulative merge throttle time of primary shards |                         |    14.8963      |    15.443       |    0.54673 |    min |    +3.67% |
|       Min cumulative merge throttle time across primary shard |                         |    14.8963      |    15.443       |    0.54673 |    min |    +3.67% |
|    Median cumulative merge throttle time across primary shard |                         |    14.8963      |    15.443       |    0.54673 |    min |    +3.67% |
|       Max cumulative merge throttle time across primary shard |                         |    14.8963      |    15.443       |    0.54673 |    min |    +3.67% |
|                     Cumulative refresh time of primary shards |                         |     1.68313     |     1.6907      |    0.00757 |    min |    +0.45% |
|                    Cumulative refresh count of primary shards |                         |    89           |    88           |   -1       |        |    -1.12% |
|              Min cumulative refresh time across primary shard |                         |     1.68313     |     1.6907      |    0.00757 |    min |    +0.45% |
|           Median cumulative refresh time across primary shard |                         |     1.68313     |     1.6907      |    0.00757 |    min |    +0.45% |
|              Max cumulative refresh time across primary shard |                         |     1.68313     |     1.6907      |    0.00757 |    min |    +0.45% |
|                       Cumulative flush time of primary shards |                         |     0.00426667  |     0.1037      |    0.09943 |    min | +2330.47% |
|                      Cumulative flush count of primary shards |                         |     1           |     3           |    2       |        |  +200.00% |
|                Min cumulative flush time across primary shard |                         |     0.00426667  |     0.1037      |    0.09943 |    min | +2330.47% |
|             Median cumulative flush time across primary shard |                         |     0.00426667  |     0.1037      |    0.09943 |    min | +2330.47% |
|                Max cumulative flush time across primary shard |                         |     0.00426667  |     0.1037      |    0.09943 |    min | +2330.47% |
|                                       Total Young Gen GC time |                         |    36.185       |    35.991       |   -0.194   |      s |    -0.54% |
|                                      Total Young Gen GC count |                         |  1609           |  1604           |   -5       |        |    -0.31% |
|                                         Total Old Gen GC time |                         |     0           |     0           |    0       |      s |     0.00% |
|                                        Total Old Gen GC count |                         |     0           |     0           |    0       |        |     0.00% |
|                                                    Store size |                         |     4.59639     |     4.59631     |   -8e-05   |     GB |    -0.00% |
|                                                 Translog size |                         |     5.12227e-08 |     5.12227e-08 |    0       |     GB |     0.00% |
|                                        Heap used for segments |                         |     0           |     0           |    0       |     MB |     0.00% |
|                                      Heap used for doc values |                         |     0           |     0           |    0       |     MB |     0.00% |
|                                           Heap used for terms |                         |     0           |     0           |    0       |     MB |     0.00% |
|                                           Heap used for norms |                         |     0           |     0           |    0       |     MB |     0.00% |
|                                          Heap used for points |                         |     0           |     0           |    0       |     MB |     0.00% |
|                                   Heap used for stored fields |                         |     0           |     0           |    0       |     MB |     0.00% |
|                                                 Segment count |                         |     1           |     1           |    0       |        |     0.00% |
|                                   Total Ingest Pipeline count |                         |     0           |     0           |    0       |        |     0.00% |
|                                    Total Ingest Pipeline time |                         |     0           |     0           |    0       |     ms |     0.00% |
|                                  Total Ingest Pipeline failed |                         |     0           |     0           |    0       |        |     0.00% |
|                                                Min Throughput |                   index | 84267.9         | 86637.2         | 2369.27    | docs/s |    +2.81% |
|                                               Mean Throughput |                   index | 89665.9         | 91234.4         | 1568.42    | docs/s |    +1.75% |
|                                             Median Throughput |                   index | 89995.2         | 91391.1         | 1395.89    | docs/s |    +1.55% |
|                                                Max Throughput |                   index | 93419.7         | 94634.4         | 1214.7     | docs/s |    +1.30% |
|                                       50th percentile latency |                   index |   824.045       |   798.843       |  -25.2022  |     ms |    -3.06% |
|                                       90th percentile latency |                   index |  1069.81        |  1015.18        |  -54.6373  |     ms |    -5.11% |
|                                       99th percentile latency |                   index |  2827.33        |  2790.88        |  -36.4487  |     ms |    -1.29% |
|                                     99.9th percentile latency |                   index |  4272.78        |  4345.99        |   73.2042  |     ms |    +1.71% |
|                                    99.99th percentile latency |                   index |  5994.54        |  5228.54        | -766.003   |     ms |   -12.78% |
|                                      100th percentile latency |                   index |  6301.8         |  6121.14        | -180.66    |     ms |    -2.87% |
|                                  50th percentile service time |                   index |   823.468       |   798.785       |  -24.6824  |     ms |    -3.00% |
|                                  90th percentile service time |                   index |  1071.32        |  1014.54        |  -56.7888  |     ms |    -5.30% |
|                                  99th percentile service time |                   index |  2826.77        |  2793.96        |  -32.8081  |     ms |    -1.16% |
|                                99.9th percentile service time |                   index |  4276.41        |  4345.83        |   69.4221  |     ms |    +1.62% |
|                               99.99th percentile service time |                   index |  5994.54        |  5228.54        | -766.003   |     ms |   -12.78% |
|                                 100th percentile service time |                   index |  6301.8         |  6121.14        | -180.66    |     ms |    -2.87% |
|                                                    error rate |                   index |     0           |     0           |    0       |      % |     0.00% |
|                                                Min Throughput |                 default |    57.1771      |    62.3085      |    5.13131 |  ops/s |    +8.97% |
|                                               Mean Throughput |                 default |    57.1771      |    62.3085      |    5.13131 |  ops/s |    +8.97% |
|                                             Median Throughput |                 default |    57.1771      |    62.3085      |    5.13131 |  ops/s |    +8.97% |
|                                                Max Throughput |                 default |    57.1771      |    62.3085      |    5.13131 |  ops/s |    +8.97% |
|                                       50th percentile latency |                 default |    12.234       |    11.7563      |   -0.47768 |     ms |    -3.90% |
|                                       90th percentile latency |                 default |    13.7187      |    12.8379      |   -0.88079 |     ms |    -6.42% |
|                                       99th percentile latency |                 default |    19.121       |    15.5433      |   -3.5777  |     ms |   -18.71% |
|                                      100th percentile latency |                 default |    19.8582      |    17.4709      |   -2.38726 |     ms |   -12.02% |
|                                  50th percentile service time |                 default |    12.234       |    11.7563      |   -0.47768 |     ms |    -3.90% |
|                                  90th percentile service time |                 default |    13.7187      |    12.8379      |   -0.88079 |     ms |    -6.42% |
|                                  99th percentile service time |                 default |    19.121       |    15.5433      |   -3.5777  |     ms |   -18.71% |
|                                 100th percentile service time |                 default |    19.8582      |    17.4709      |   -2.38726 |     ms |   -12.02% |
|                                                    error rate |                 default |     0           |     0           |    0       |      % |     0.00% |
|                                                Min Throughput |              default_1k |    26.2445      |    26.3179      |    0.07338 |  ops/s |    +0.28% |
|                                               Mean Throughput |              default_1k |    26.7474      |    27.0223      |    0.27485 |  ops/s |    +1.03% |
|                                             Median Throughput |              default_1k |    26.785       |    27.1886      |    0.40363 |  ops/s |    +1.51% |
|                                                Max Throughput |              default_1k |    27.1753      |    27.3941      |    0.21875 |  ops/s |    +0.80% |
|                                       50th percentile latency |              default_1k |    35.0587      |    34.5172      |   -0.54152 |     ms |    -1.54% |
|                                       90th percentile latency |              default_1k |    36.4106      |    35.2654      |   -1.14522 |     ms |    -3.15% |
|                                       99th percentile latency |              default_1k |    38.9861      |    48.2695      |    9.28334 |     ms |   +23.81% |
|                                      100th percentile latency |              default_1k |    43.4351      |    53.5956      |   10.1605  |     ms |   +23.39% |
|                                  50th percentile service time |              default_1k |    35.0587      |    34.5172      |   -0.54152 |     ms |    -1.54% |
|                                  90th percentile service time |              default_1k |    36.4106      |    35.2654      |   -1.14522 |     ms |    -3.15% |
|                                  99th percentile service time |              default_1k |    38.9861      |    48.2695      |    9.28334 |     ms |   +23.81% |
|                                 100th percentile service time |              default_1k |    43.4351      |    53.5956      |   10.1605  |     ms |   +23.39% |
|                                                    error rate |              default_1k |     0           |     0           |    0       |      % |     0.00% |
|                                                Min Throughput | date-histo-entire-range |   252.385       |   252.975       |    0.58992 |  ops/s |    +0.23% |
|                                               Mean Throughput | date-histo-entire-range |   252.385       |   252.975       |    0.58992 |  ops/s |    +0.23% |
|                                             Median Throughput | date-histo-entire-range |   252.385       |   252.975       |    0.58992 |  ops/s |    +0.23% |
|                                                Max Throughput | date-histo-entire-range |   252.385       |   252.975       |    0.58992 |  ops/s |    +0.23% |
|                                       50th percentile latency | date-histo-entire-range |     2.53056     |     2.31873     |   -0.21183 |     ms |    -8.37% |
|                                       90th percentile latency | date-histo-entire-range |     3.17176     |     3.31687     |    0.14511 |     ms |    +4.58% |
|                                       99th percentile latency | date-histo-entire-range |     4.19283     |     4.18774     |   -0.00508 |     ms |    -0.12% |
|                                      100th percentile latency | date-histo-entire-range |     4.76439     |     4.81196     |    0.04757 |     ms |    +1.00% |
|                                  50th percentile service time | date-histo-entire-range |     2.53056     |     2.31873     |   -0.21183 |     ms |    -8.37% |
|                                  90th percentile service time | date-histo-entire-range |     3.17176     |     3.31687     |    0.14511 |     ms |    +4.58% |
|                                  99th percentile service time | date-histo-entire-range |     4.19283     |     4.18774     |   -0.00508 |     ms |    -0.12% |
|                                 100th percentile service time | date-histo-entire-range |     4.76439     |     4.81196     |    0.04757 |     ms |    +1.00% |
|                                                    error rate | date-histo-entire-range |     0           |     0           |    0       |      % |     0.00% |
|                                                Min Throughput |          esql-fetch-500 |     4.69794     |     4.79783     |    0.09989 |  ops/s |    +2.13% |
|                                               Mean Throughput |          esql-fetch-500 |     5.07152     |     5.14014     |    0.06862 |  ops/s |    +1.35% |
|                                             Median Throughput |          esql-fetch-500 |     5.12409     |     5.17911     |    0.05501 |  ops/s |    +1.07% |
|                                                Max Throughput |          esql-fetch-500 |     5.28468     |     5.33109     |    0.04642 |  ops/s |    +0.88% |
|                                       50th percentile latency |          esql-fetch-500 |   174.502       |   174.21        |   -0.29179 |     ms |    -0.17% |
|                                       90th percentile latency |          esql-fetch-500 |   181.165       |   180.646       |   -0.51969 |     ms |    -0.29% |
|                                       99th percentile latency |          esql-fetch-500 |   200.495       |   194.328       |   -6.1667  |     ms |    -3.08% |
|                                      100th percentile latency |          esql-fetch-500 |   200.661       |   202.518       |    1.85774 |     ms |    +0.93% |
|                                  50th percentile service time |          esql-fetch-500 |   174.502       |   174.21        |   -0.29179 |     ms |    -0.17% |
|                                  90th percentile service time |          esql-fetch-500 |   181.165       |   180.646       |   -0.51969 |     ms |    -0.29% |
|                                  99th percentile service time |          esql-fetch-500 |   200.495       |   194.328       |   -6.1667  |     ms |    -3.08% |
|                                 100th percentile service time |          esql-fetch-500 |   200.661       |   202.518       |    1.85774 |     ms |    +0.93% |
|                                                    error rate |          esql-fetch-500 |     0           |     0           |    0       |      % |     0.00% |

@martijnvg martijnvg closed this Apr 10, 2025