pageserver: getpage requests sometimes skip reading recently written image layers #9185

jcsp · 2024-09-27T17:14:08Z

Found while investigating #9058: in that issue, getpage requests were observed visiting layers older than recently written image layers.

It seems that under some circumstances, a getpage request at the exact LSN where an image layer exists can fail to hit that image layer. It's not clear whether being at the exact same LSN matters: it might just be that reads don't hit image layers until the current in-memory layer is closed.

Lots of uncertainty here; I'm not claiming to have conclusively diagnosed this.

Branch with experimental test:
https://github.com/neondatabase/neon/tree/jcsp/layer-map-search-at-image-lsn-2

In that branch, there are some log lines hacked in to record which layers are visited at INFO level. In the test, there is a checkpoint line commented out:

    # Uncomment this checkpoint, and the logs will show getpage requests hitting the image layers we
    # just created.  However, without the checkpoint, getpage requests will hit one InMemoryLayer and
    # one persistent delta layer.
    # env.pageserver.http_client().timeline_checkpoint(tenant_id, timeline_id, wait_until_uploaded=True)

The presence or absence of in-memory layers shouldn't make any difference to whether reads hit an image layer, but apparently it does.


jcsp · 2024-10-14T16:42:56Z

A cleaner reproducer that uses layer eviction + on-demand downloads to prove which layers are touched by a getpage request: https://github.com/neondatabase/neon/tree/jcsp/layer-map-search-at-image-lsn-3

This test does reads at exactly the LSN of the image layer, but I can also reproduce the issue with some writes between generating the image layer and doing the read, so this is not something that only occurs when reading exactly at the image layer's LSN. I suspect our reads are skipping the image layer until the next time we freeze the ephemeral layer.

jcsp · 2024-10-15T12:57:13Z

Perhaps this piece of logic is at fault in get_vectored_reconstruct_data_timeline:

                match in_memory_layer {
                    Some(l) => {
                        let lsn_range = l.get_lsn_range().start..cont_lsn;
                        fringe.update(
                            ReadableLayer::InMemoryLayer(l),
                            unmapped_keyspace.clone(),
                            lsn_range,
                        );
                    }

...because lsn_range is constructed from the absolute start of the layer. Our cont_lsn therefore jumps back to the start of the oldest in-memory layer before we start looking at historic layers at all. If that's right, an image layer written at an LSN above that start would never be considered.
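To make the suspected mechanism concrete, here is a toy model of the traversal. This is only a sketch under assumptions, not the pageserver code: `simulate_read`, `Historic` and `Visited` are hypothetical names, keys are ignored entirely, and the historic search rule (pick the layer reaching closest to `cont_lsn`, preferring an image on ties) is a simplification of the real layer map search.

```rust
use std::ops::Range;

type Lsn = u64;

/// Layers a simulated read touched, in visit order (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
enum Visited {
    InMemory(Range<Lsn>),
    Delta(Range<Lsn>),
    Image(Lsn),
}

/// Persistent layers in the toy layer map (hypothetical type).
#[derive(Debug, Clone)]
enum Historic {
    Delta(Range<Lsn>),
    Image(Lsn),
}

impl Historic {
    /// An image at LSN `l` is treated as covering `l..l + 1`.
    fn lsn_range(&self) -> Range<Lsn> {
        match self {
            Historic::Delta(r) => r.clone(),
            Historic::Image(lsn) => *lsn..*lsn + 1,
        }
    }
}

/// Simulate one read at `request_lsn`. `open_in_memory_start` is the start
/// LSN of the open in-memory layer, if there is one.
fn simulate_read(
    open_in_memory_start: Option<Lsn>,
    historic: &[Historic],
    request_lsn: Lsn,
) -> Vec<Visited> {
    let mut visited = Vec::new();
    // Exclusive upper bound for the next layer search.
    let mut cont_lsn = request_lsn + 1;

    // Suspected problem: the in-memory layer is read from its absolute start,
    // so cont_lsn drops to that start before any historic layer is searched.
    if let Some(start) = open_in_memory_start {
        if start < cont_lsn {
            visited.push(Visited::InMemory(start..cont_lsn));
            cont_lsn = start;
        }
    }

    loop {
        // Simplified search: among layers starting below cont_lsn, pick the
        // one that reaches closest to cont_lsn, preferring an image on ties.
        let candidate = historic
            .iter()
            .filter(|l| l.lsn_range().start < cont_lsn)
            .max_by_key(|l| {
                let end = l.lsn_range().end.min(cont_lsn);
                (end, matches!(l, Historic::Image(_)))
            });

        match candidate {
            Some(Historic::Image(lsn)) => {
                // An image completes the read.
                visited.push(Visited::Image(*lsn));
                break;
            }
            Some(Historic::Delta(r)) => {
                visited.push(Visited::Delta(r.start..r.end.min(cont_lsn)));
                cont_lsn = r.start;
            }
            None => break,
        }
    }
    visited
}
```

With made-up LSNs, an open in-memory layer starting at 10, a delta layer covering `0..10` and an image at 20, `simulate_read(Some(10), &layers, 20)` visits the in-memory layer and then the delta layer, and never reaches the image. Once the in-memory layer is frozen and flushed (modelled as `None` plus an extra `Historic::Delta(10..21)`), the same read stops at the image. That matches the behaviour seen in the test with and without the checkpoint, but it only shows the hypothesis is self-consistent, not that it is the actual cause.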

jcsp added the c/storage/pageserver and t/bug labels Sep 27, 2024

jcsp mentioned this issue Oct 1, 2024

Investigate unexpectedly high volume of "became visible" logs #9058

Closed

jcsp added a commit that referenced this issue Oct 15, 2024

pageserver: work around #9185 in layer visibility calculation

0ed9897

jcsp mentioned this issue Oct 15, 2024

pageserver: work around #9185 in layer visibility calculation #9400

Draft

5 tasks
