Clock mismatch between eBPF and userspace code. #8

@fracappa

Sometimes, logs like this can show up:

YYYY/MM/DD hh:mm:ss Stale CLOSE_WAIT: local_ip:local_port -> remote_ip:remote_port (age: -128.141236ms)

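A negative age means the timestamp userspace took when computing the age is earlier than the entered_at value recorded by the BPF program. It's a clear indication of a clock mismatch between the userspace code and BPF.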

There could be a few reasons why this happens:

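  1. NTP slewing: While CLOCK_MONOTONIC doesn't jump, NTP can slew it (gradually speed up or slow down its rate). Slewing changes how fast the clock advances but never moves it backward, so this would only produce small negative values if the two sides read differently adjusted clocks, for example a slewed CLOCK_MONOTONIC in userspace against an unadjusted clock such as CLOCK_MONOTONIC_RAW when the kernel recorded entered_at.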
  2. vDSO timing: clock_gettime(CLOCK_MONOTONIC) is typically served via the vDSO, which reads from a shared memory page that the kernel updates periodically. There can be tiny discrepancies between this reading and the kernel's internal ktime_get_ns().
  3. Virtualization: In VMs, the guest kernel's internal ktime and the userspace-visible clock can drift slightly due to how the hypervisor handles time.

This is a bug to be solved.

Labels: bug (Something isn't working)