bastien-curutchet
No description provided.

kernel-patches-daemon-bpf (bot) force-pushed the bpf-next_base branch 17 times, most recently from c199778 to b0c73f0, on September 28, 2025 11:38

Hi all,

The test_xsk.sh script covers many AF_XDP use cases. The tests it runs
are defined in xskxceiver.c. Since this script is used to test real
hardware, the goal here is to leave it as it is, and only integrate the
tests that run on veth peers into the test_progs framework.

Some tests are flaky so they can't be integrated in the CI as they are.
I think that fixing their flakiness would require a significant amount of
work. So, as a first step, I've excluded them from the list of tests
migrated to the CI (cf. PATCH 14). If these tests get fixed at some
point, integrating them into the CI will be straightforward.

While investigating the summary report issue that Maciej pointed out in
the previous iteration, I noticed a small error in a function's return
value. The new PATCH 3 fixes it.

PATCH 1 extracts test_xsk[.c/.h] from xskxceiver[.c/.h] to make the
tests available to test_progs.
PATCH 2 to 7 fix small issues in the current test.
PATCH 8 to 13 handle all errors to release resources instead of calling
exit() when any error occurs.
PATCH 14 isolates some flaky tests.
PATCH 15 integrates the non-flaky tests into the test_progs framework.

To: Björn Töpel <[email protected]>
To: Magnus Karlsson <[email protected]>
To: Maciej Fijalkowski <[email protected]>
To: Jonathan Lemon <[email protected]>
To: Alexei Starovoitov <[email protected]>
To: Daniel Borkmann <[email protected]>
To: Andrii Nakryiko <[email protected]>
To: Martin KaFai Lau <[email protected]>
To: Eduard Zingerman <[email protected]>
To: Song Liu <[email protected]>
To: Yonghong Song <[email protected]>
To: John Fastabend <[email protected]>
To: KP Singh <[email protected]>
To: Stanislav Fomichev <[email protected]>
To: Hao Luo <[email protected]>
To: Jiri Olsa <[email protected]>
To: Mykola Lysenko <[email protected]>
To: Shuah Khan <[email protected]>
To: David S. Miller <[email protected]>
To: Jakub Kicinski <[email protected]>
To: Jesper Dangaard Brouer <[email protected]>
Cc: Thomas Petazzoni <[email protected]>
Cc: Alexis Lothore <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>

---
Changes in v5:
- EDITME: describe what is new in this series revision.
- EDITME: use bulletpoints and terse descriptions.
- Link to v4: https://lore.kernel.org/r/[email protected]

Changes in v4:
- Fix test_xsk.sh's summary report.
- Merge PATCH 11 & 12 together, otherwise PATCH 11 fails to build.
- Split old PATCH 3 in two patches. The first one fixes
  testapp_stats_rx_dropped(), the second one fixes
  testapp_xdp_shared_umem(). The unnecessary frees (in
  testapp_stats_rx_full() and testapp_stats_fill_empty()) are removed.
- Link to v3: https://lore.kernel.org/r/[email protected]

Changes in v3:
- Rebase on latest bpf-next_base to integrate commit c9110e6 ("selftests/bpf:
Fix count write in testapp_xdp_metadata_copy()").
- Move XDP_METADATA_COPY_* tests from flaky-tests to nominal tests
- Link to v2: https://lore.kernel.org/r/[email protected]

Changes in v2:
- Rebase on the latest bpf-next_base and integrate the newly added tests
  to the work (adjust_tail* and tx_queue_consumer tests)
- Re-order patches to split xskxceiver sooner.
- Fix the bug reported by Maciej.
- Fix verbose mode in test_xsk.sh by keeping kselftest (remove PATCH 1,
  7 and 8)
- Link to v1: https://lore.kernel.org/r/[email protected]

--- b4-submit-tracking ---
# This section is used internally by b4 prep for tracking purposes.
{
  "series": {
    "revision": 5,
    "change-id": "20250218-xsk-0cf90e975d14",
    "prefixes": [
      "bpf-next"
    ],
    "history": {
      "v1": [
        "[email protected]"
      ],
      "v2": [
        "[email protected]"
      ],
      "v3": [
        "[email protected]"
      ],
      "v4": [
        "[email protected]"
      ]
    }
  }
}
AF_XDP features are tested by the test_xsk.sh script but not by the
test_progs framework. The tests used by the script are defined in
xskxceiver.c, which can't be integrated into the test_progs framework as is.

Extract these test definitions from xskxceiver{.c/.h} to put them in new
test_xsk{.c/.h} files.
Keep the main() function and its unshared dependencies in xskxceiver to
avoid impacting the test_xsk.sh script, which is often used to test real
hardware.
Move the ksft_test_result_*() calls to xskxceiver.c to keep the kselftest
report valid.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
bitmap is used before being initialized.

Initialize it to zero before using it.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
__testapp_validate_traffic() is supposed to return an integer value that
tells if the test passed (0), failed (-1) or was skipped (2). It actually
returns a boolean in the end. This is harmless when the test succeeds,
but it can lead to misinterpretation in case of failure, as 1 will be
returned instead of -1.

Return TEST_FAILURE (-1) in case of failure, TEST_PASS (0) otherwise.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
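
The return value problem described in the commit message above can be
illustrated with a minimal sketch. It is not the selftest's code: the two
functions and their 'failed' parameter are hypothetical, and only the
TEST_PASS (0) and TEST_FAILURE (-1) values come from the message.

```c
#include <stdbool.h>

/* Status values quoted in the commit message. */
#define TEST_PASS     0
#define TEST_FAILURE -1

/*
 * Hypothetical buggy shape: the boolean is returned as is, so a failure
 * reaches the caller as 1 instead of TEST_FAILURE.
 */
static int validate_returning_bool(bool failed)
{
	return failed;
}

/* Hypothetical fixed shape: map the flag to the documented status codes. */
static int validate_returning_status(bool failed)
{
	return failed ? TEST_FAILURE : TEST_PASS;
}
```

A caller that compares the result with TEST_FAILURE misses the failure in
the first variant and catches it in the second.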
testapp_stats_rx_dropped() generates a pkt_stream twice. The last one
generated is released by pkt_stream_restore_default() at the end of the
test, but the pointer to the first pkt_stream is lost.

Release the 'middle' pkt_stream when it's getting replaced to prevent
memory leaks.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
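
The leak pattern described in the commit message above (a stream pointer
overwritten without releasing the previous stream) can be sketched as
follows. Everything here is an assumption made for illustration: struct
stream, stream_new(), stream_free(), stream_replace() and the live_streams
counter are hypothetical stand-ins, not the pkt_stream helpers.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the selftest's packet stream type. */
struct stream {
	int nb_pkts;
};

/* Number of streams allocated and not yet freed (illustration only). */
static int live_streams;

static struct stream *stream_new(int nb_pkts)
{
	struct stream *s = calloc(1, sizeof(*s));

	if (!s)
		return NULL;
	s->nb_pkts = nb_pkts;
	live_streams++;
	return s;
}

static void stream_free(struct stream *s)
{
	if (!s)
		return;
	free(s);
	live_streams--;
}

/* Install a new stream and release the one it replaces, if any. */
static int stream_replace(struct stream **slot, int nb_pkts)
{
	struct stream *s = stream_new(nb_pkts);

	if (!s)
		return -1;
	stream_free(*slot);
	*slot = s;
	return 0;
}
```

Without the stream_free(*slot) call, generating a stream twice would leave
the first allocation unreachable, which is the situation the message
describes.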
testapp_xdp_shared_umem() generates a pkt_stream on each xsk from xsk_arr,
where normally xsk_arr[0] gets pkt_streams and xsk_arr[1] has them NULLed.
At the end of the test, pkt_stream_restore_default() only releases
xsk_arr[0], which leads to memory leaks.

Release the missing pkt_stream at the end of testapp_xdp_shared_umem().

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
The clean-up done at the end of a test in __testapp_validate_traffic()
isn't wrapped in a function. It isn't convenient if we want to use it
somewhere else in the code.

Wrap the clean-up in two new functions: the first deletes the sockets,
the second releases the umem.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
testapp_validate_traffic() doesn't release the sockets and the umem
created by the threads if the test isn't currently in its last step.
Thus, if swap_xsk_resources() fails before the last step, the
created resources aren't cleaned up.

Clean the sockets and the umem in case of swap_xsk_resources() failure.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
init_iface() doesn't have any return value while it can fail. In case of
failure it calls exit_with_error(), which exits the application
immediately. This prevents the following tests from being run and isn't
compliant with the CI.

Add a return value to init_iface() so errors can be handled more
smoothly.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
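
The idea behind the commit message above (report an error to the caller
instead of exiting, so that the remaining tests still run) can be sketched
like this. The helpers setup_iface(), run_one() and run_all() are
hypothetical and only mimic the shape of the change; they are not the
selftest's functions.

```c
#include <errno.h>
#include <stddef.h>

#define TEST_PASS     0
#define TEST_FAILURE -1

/* Hypothetical setup step: fails when no interface name is given. */
static int setup_iface(const char *name)
{
	if (!name || !name[0])
		return -EINVAL;
	return 0;
}

/* Run one test and report a setup failure instead of calling exit(). */
static int run_one(const char *ifname)
{
	if (setup_iface(ifname))
		return TEST_FAILURE;
	return TEST_PASS;
}

/* Run every test even if an earlier one failed; return the failure count. */
static int run_all(const char *const *ifnames, size_t nb)
{
	int failures = 0;
	size_t i;

	for (i = 0; i < nb; i++)
		if (run_one(ifnames[i]) != TEST_PASS)
			failures++;
	return failures;
}
```

With an exit() in setup_iface(), the loop in run_all() would stop at the
first bad entry. With a return value, the failure is counted and the next
test still runs.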
xsk_reattach_xdp() calls exit_with_error() on failures. This exits the
program immediately. It prevents the following tests from being run and
isn't compliant with the CI.

Add a return value to the functions handling XDP attachments to handle
errors more smoothly.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
exit_with_error() is called when gettimeofday() fails. This exits the
program immediately. It prevents the following tests from being run and
isn't compliant with the CI.

Return TEST_FAILURE instead of calling exit_with_error().

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
TX and RX workers can fail in many places. These failures trigger a call
to exit_with_error() which exits the program immediately. It prevents the
following tests from running and isn't compliant with the CI.

Add a return value to the functions that can fail.
Handle failures more smoothly through report_failure().

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
…ails

__testapp_validate_traffic() calls exit_with_error() on failures. This
exits the program immediately. It prevents the following tests from
running and isn't compliant with the CI.

Return TEST_FAILURE instead of calling exit_with_error().
Release the resources of the first thread if a failure happens between
its creation and the creation of the second thread.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
If any allocation in the pkt_stream_*() helpers fails, exit_with_error() is
called. This terminates the program immediately. It prevents the following
tests from running and isn't compliant with the CI.

Return NULL in case of allocation failure.
Return TEST_FAILURE when something goes wrong in the packet generation.
Clean up the resources if a failure happens between two steps of a test.

Move exit_with_error()'s definition into xskxceiver.c as it isn't used
anywhere else now.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
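
The clean-up rule stated in the commit message above (return NULL on
allocation failure, turn it into TEST_FAILURE, and release what an earlier
step created) can be sketched with the usual goto-based unwinding. The
names buf_alloc(), buf_free(), two_step_test() and the live_bufs counter
are hypothetical, and the 'fail' switch only exists to simulate an
allocation failure.

```c
#include <stdbool.h>
#include <stdlib.h>

#define TEST_PASS     0
#define TEST_FAILURE -1

/* Number of buffers allocated and not yet freed (illustration only). */
static int live_bufs;

/* Hypothetical allocator that returns NULL on (simulated) failure. */
static void *buf_alloc(bool fail)
{
	void *p = fail ? NULL : malloc(64);

	if (p)
		live_bufs++;
	return p;
}

static void buf_free(void *p)
{
	if (!p)
		return;
	free(p);
	live_bufs--;
}

/* Two-step test: a failure in step two must release step one's buffer. */
static int two_step_test(bool fail_second)
{
	int ret = TEST_FAILURE;
	void *first, *second;

	first = buf_alloc(false);
	if (!first)
		return TEST_FAILURE;

	second = buf_alloc(fail_second);
	if (!second)
		goto free_first;

	ret = TEST_PASS;
	buf_free(second);
free_first:
	buf_free(first);
	return ret;
}
```

Both the success path and the failure path leave live_bufs at zero, which
is the property the clean-up is meant to guarantee.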
Some tests are flaky and fail from time to time on virtual interfaces.
Adding them to the CI would trigger lots of 'false' errors.

Remove the flaky tests from the nominal tests table so they won't be
run by the CI in an upcoming patch.
Create a flaky_tests table to hold them.
Use this flaky table in xskxceiver.c to keep all the tests available
from the test_xsk.sh script.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>
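
The table split described in the commit message above can be pictured with
a minimal sketch. The entry type, the test names and the run_ci() and
run_script() helpers are all hypothetical; only the idea of a nominal table
used by the CI and a separate flaky table also used by the script comes
from the message.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table entry: a test name and the function that runs it. */
struct test_entry {
	const char *name;
	int (*fn)(void);
};

/* Placeholder test body that always passes. */
static int stub_pass(void)
{
	return 0;
}

/* Tests considered stable enough for the CI (names are made up). */
static const struct test_entry nominal_tests[] = {
	{ "STABLE_A", stub_pass },
	{ "STABLE_B", stub_pass },
};

/* Tests kept out of the CI but still reachable from the script. */
static const struct test_entry flaky_tests[] = {
	{ "UNSTABLE_A", stub_pass },
};

/* Run every entry of a table and return how many tests were run. */
static size_t run_table(const struct test_entry *tests, size_t nb)
{
	size_t i, run = 0;

	for (i = 0; i < nb; i++) {
		tests[i].fn();
		run++;
	}
	return run;
}

/* CI path: nominal tests only. */
static size_t run_ci(void)
{
	return run_table(nominal_tests, ARRAY_SIZE(nominal_tests));
}

/* Script path: nominal tests followed by the flaky ones. */
static size_t run_script(void)
{
	return run_ci() + run_table(flaky_tests, ARRAY_SIZE(flaky_tests));
}
```

Moving a test back to the CI then amounts to moving its entry from one
table to the other.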
test_xsk.c isn't part of the test_progs framework.

Integrate the tests defined by test_xsk.c into the test_progs framework
through a new file: prog_tests/xsk.c. ZeroCopy mode isn't tested in it
as veth peers don't support it.

Move test_xsk{.c/.h} to prog_tests/.

Add the find_bit library to test_progs sources in the Makefile as it is
used by test_xsk.c.

Signed-off-by: Bastien Curutchet (eBPF Foundation) <[email protected]>