Collaborator

@PlaidCat PlaidCat commented Nov 3, 2025

Update process (this kernel uses the rocky8_10 rebuild of 4.18.0-553.81.1.el8_10 as its base)

  • Kernel History Rebuild Process for all src.rpms hosted by RESF
  • Create rlc-8/4.18.0-553.81.1.el8_10 branch
  • Check if any maintained code is included in the new el release.
  • Cherry-pick all code from previous branch into new branch (skipping unneeded code)
    • Fix conflicts as they arise
  • Build and Test

Removed Commits

None

Forward Port Process

[jmaple@devbox kernel-src-tree-tools]$ python3 rolling-release-update.py --repo ../kernel-src-tree-rolling/ --new-base-branch rocky8_10 --old-rolling-branch rlc-8/4.18.0-553.80.1.el8_10 | tee ../RR.RLC.$(git -C ../kernel-src-tree-rolling/ describe origin/rocky8_10).log
[rolling release update] Rolling Product:  rlc-8
[rolling release update] Checking out branch:  rlc-8/4.18.0-553.80.1.el8_10
[rolling release update] Gathering all the RESF kernel Tags
[rolling release update] Found 38 RESF kernel tags
[rolling release update] Checking out branch:  rocky8_10
[rolling release update] Gathering all the RESF kernel Tags
[rolling release update] Found 39 RESF kernel tags
[rolling release update] Latest RESF tag sha:  b'9646b4b50868'
"9646b4b5086849534b26c6f9a9a941c0aa70f7b2 Rebuild rocky8_10 with kernel-4.18.0-553.80.1.el8_10"
[rolling release update] Checking for FIPS protected changes between the common tag and HEAD
[rolling release update] Checking for FIPS protected changes
[rolling release update] Getting SHAS 9646b4b50868..HEAD
[rolling release update] Number of commits to check:  96
[rolling release update] Checking modifications of shas
[rolling release update] Checked 9 of 96 commits
[rolling release update] Checked 18 of 96 commits
[rolling release update] Checked 27 of 96 commits
[rolling release update] Checked 36 of 96 commits
[rolling release update] Checked 45 of 96 commits
[rolling release update] Checked 54 of 96 commits
[rolling release update] Checked 63 of 96 commits
[rolling release update] Checked 72 of 96 commits
[rolling release update] Checked 81 of 96 commits
[rolling release update] Checked 90 of 96 commits
[rolling release update] 0 of 96 commits have FIPS protected changes
[rolling release update] Checking out old rolling branch:  rlc-8/4.18.0-553.80.1.el8_10
[rolling release update] Finding the CIQ Kernel and Associated Upstream commits between the last resf tag and HEAD
[rolling release update] Last RESF tag sha:  b'9646b4b50868'
[rolling release update] Total commits in old branch: 10
[rolling release update] Checking out new base branch:  rocky8_10
[rolling release update] Finding the kernel version for the new rolling release
[rolling release update] New Branch to create: rlc-8/4.18.0-553.81.1.el8_10
[rolling release update] Creating new branch: rlc-8/4.18.0-553.81.1.el8_10
[rolling release update] Creating new branch for PR:  jmaple_rlc-8/4.18.0-553.81.1.el8_10
[rolling release update] Creating Map of all new commits from last rolling release fork
[rolling release update] Total commits in new branch: 95
[rolling release update] Checking if any of the commits from the old rolling release are already present in the new base branch
[rolling release update] Found 0 duplicate commits to remove
[rolling release update] Applying 10 remaining commits to the new branch
  [1/10] 6ff2d845c731 x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach
  [2/10] 137cb1de0f69 x86/boot: Move x86_cache_alignment initialization to correct spot
  [3/10] e643d65e6f5f x86/cpu: Allow reducing x86_phys_bits during early_identify_cpu()
  [4/10] 861b82b15ebf x86/cpu: Get rid of an unnecessary local variable in get_cpu_address_sizes()
  [5/10] 8ec98fd5263e x86/cpu: Provide default cache line size if not enumerated
  [6/10] e5580ed4f6fb net: mana: Enable MANA driver on ARM64 with 4K page size
  [7/10] 3210f4946a52 net: mana: Add support for page sizes other than 4KB on ARM64
  [8/10] ec01560cbf97 RDMA/mana_ib: Fix bug in creation of dma regions
  [9/10] d4f1388b851a RDMA/mana_ib: use the correct page size for mapping user-mode doorbell page
  [10/10] ee9f1c2e3b95 RDMA/mana_ib: use the correct page table index based on hardware page size
[rolling release update] Successfully applied all 10 commits
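
The duplicate check reported in the log above can be pictured with a small sketch. This is not the code in rolling-release-update.py, which is not shown here. It assumes, hypothetically, that each carried commit records its upstream hash on a `commit <sha>` line (the convention visible in the commit messages of this PR) and that a commit counts as a duplicate when the new base already references the same upstream hash. The function names and data shapes are invented for illustration.

```python
import re

# Hypothetical helper: pull the upstream hash from a "commit <sha>" line,
# the convention visible in the commit messages of this PR.
UPSTREAM_RE = re.compile(r"^commit ([0-9a-f]{7,40})$", re.MULTILINE)


def upstream_sha(message):
    """Return the upstream hash referenced by a commit message, or None."""
    match = UPSTREAM_RE.search(message)
    return match.group(1) if match else None


def split_duplicates(old_commits, new_base_messages):
    """Split old-branch commits into (to_apply, duplicates).

    old_commits: list of (local_sha, message) tuples from the old rolling branch.
    new_base_messages: commit messages found in the new base branch.
    """
    already_present = {upstream_sha(m) for m in new_base_messages} - {None}
    to_apply, duplicates = [], []
    for local_sha, message in old_commits:
        if upstream_sha(message) in already_present:
            duplicates.append(local_sha)
        else:
            to_apply.append(local_sha)
    return to_apply, duplicates


old = [
    ("e5580ed4f6fb",
     "net: mana: Enable MANA driver on ARM64 with 4K page size\n\n"
     "jira LE-3812\ncommit 40a1d11\n"),
    ("ec01560cbf97",
     "RDMA/mana_ib: Fix bug in creation of dma regions\n\n"
     "jira LE-3812\ncommit e02497f\n"),
]
# With no matching upstream hashes in the new base, everything is re-applied.
to_apply, duplicates = split_duplicates(old, new_base_messages=[])
print(len(to_apply), len(duplicates))
```

A real tool would also need to normalise abbreviated and full-length hashes before comparing them; this sketch compares the strings exactly as written.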

Build

[jmaple@devbox kernel-src-tree-rolling]$ git push --follow-tags origin jmaple_rlc-8/4.18.0-553.81.1.el8_10
Everything up-to-date
[jmaple@devbox kernel-src-tree-rolling]$ ^C
[jmaple@devbox kernel-src-tree-rolling]$ cd ../
[jmaple@devbox code]$ egrep -B 5 -A 5 "\[TIMER\]|^Starting Build" $(ls -t kbuild* | head -n1)
/mnt/code/kernel-src-tree-build
Running make mrproper...
[TIMER]{MRPROPER}: 5s
x86_64 architecture detected, copying config
'configs/kernel-x86_64.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-jmaple_rlc-8_4.18.0-553.81.1.el8_10-ed216b7115b9"
Making olddefconfig
--
  HOSTLD  scripts/kconfig/conf
scripts/kconfig/conf  --olddefconfig Kconfig
#
# configuration written to .config
#
Starting Build
scripts/kconfig/conf  --syncconfig Kconfig
  SYSTBL  arch/x86/include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_32_ia32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_64_x32.h
  SYSTBL  arch/x86/include/generated/asm/syscalls_64.h
--
  LD [M]  sound/usb/usx2y/snd-usb-usx2y.ko
  LD [M]  sound/virtio/virtio_snd.ko
  LD [M]  sound/x86/snd-hdmi-lpe-audio.ko
  LD [M]  sound/xen/snd_xen_front.ko
  LD [M]  virt/lib/irqbypass.ko
[TIMER]{BUILD}: 1844s
Making Modules
  INSTALL arch/x86/crypto/blowfish-x86_64.ko
  INSTALL arch/x86/crypto/camellia-aesni-avx-x86_64.ko
  INSTALL arch/x86/crypto/camellia-aesni-avx2.ko
  INSTALL arch/x86/crypto/camellia-x86_64.ko
--
  INSTALL sound/virtio/virtio_snd.ko
  INSTALL sound/x86/snd-hdmi-lpe-audio.ko
  INSTALL sound/xen/snd_xen_front.ko
  INSTALL virt/lib/irqbypass.ko
  DEPMOD  4.18.0-jmaple_rlc-8_4.18.0-553.81.1.el8_10-ed216b7115b9+
[TIMER]{MODULES}: 9s
Making Install
sh ./arch/x86/boot/install.sh 4.18.0-jmaple_rlc-8_4.18.0-553.81.1.el8_10-ed216b7115b9+ arch/x86/boot/bzImage \
        System.map "/boot"
[TIMER]{INSTALL}: 19s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-4.18.0-jmaple_rlc-8_4.18.0-553.81.1.el8_10-ed216b7115b9+ and Index to 3
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 5s
[TIMER]{BUILD}: 1844s
[TIMER]{MODULES}: 9s
[TIMER]{INSTALL}: 19s
[TIMER]{TOTAL} 1882s
Rebooting in 10 seconds
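
For readers who want to compare build timings across PRs, the `[TIMER]` summary lines above are easy to parse. The following is a minimal sketch, assuming the line format stays exactly as printed in this log (including the `TOTAL` line, which has no colon); `parse_timers` is a hypothetical helper, not part of the build script.

```python
import re

# Matches "[TIMER]{NAME}: 12s" and also "[TIMER]{TOTAL} 1882s" (no colon).
TIMER_RE = re.compile(r"^\[TIMER\]\{(\w+)\}:?\s*(\d+)s$")


def parse_timers(lines):
    """Return {phase: seconds}; later lines overwrite earlier duplicates."""
    timers = {}
    for line in lines:
        match = TIMER_RE.match(line.strip())
        if match:
            timers[match.group(1)] = int(match.group(2))
    return timers


summary = [
    "[TIMER]{MRPROPER}: 5s",
    "[TIMER]{BUILD}: 1844s",
    "[TIMER]{MODULES}: 9s",
    "[TIMER]{INSTALL}: 19s",
    "[TIMER]{TOTAL} 1882s",
]
timers = parse_timers(summary)
phase_sum = sum(v for k, v in timers.items() if k != "TOTAL")
# Seconds included in TOTAL but not attributed to a timed phase.
print(timers["TOTAL"] - phase_sum)
```

The four timed phases add up to 1877s, 5s less than the reported total; the remainder presumably covers steps that have no timer of their own.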

KSelfTest

[jmaple@devbox code]$ ~/workspace/auto_kernel_history_rebuild/Rocky10/rocky10/code/get_kselftest_diff.sh
kselftest.4.18.0-jmaple_sig-cloud-8_4.18.0-553.72.1.el8_10-5a1cae8e021+.log
206
kselftest.4.18.0-jmaple_sig-cloud-8_4.18.0-553.72.1.el8_10-7c700caae90+.log
206
kselftest.4.18.0-rocky8_10_rebuild-baea35f64da5+.log
207
kselftest.4.18.0-jmaple_rlc-8_4.18.0-553.81.1.el8_10-ed216b7115b9+.log
206
Before: kselftest.4.18.0-rocky8_10_rebuild-baea35f64da5+.log
After: kselftest.4.18.0-jmaple_rlc-8_4.18.0-553.81.1.el8_10-ed216b7115b9+.log
Diff:
-ok 1 selftests: filesystems: devpts_pts # SKIP
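
The comparison above can be approximated as follows. get_kselftest_diff.sh itself is not shown, so this is only a sketch under two assumptions: that the number printed after each log name is the count of result lines starting with `ok`, and that the diff lists `ok` lines present in the "Before" log but missing from the "After" log. The second test name in the sample data is a made-up placeholder.

```python
def ok_lines(log_text):
    """Return the set of TAP result lines that start with 'ok '."""
    return {
        line.strip()
        for line in log_text.splitlines()
        if line.strip().startswith("ok ")
    }


def lost_passes(before_text, after_text):
    """Lines reported 'ok' in the before log but absent from the after log."""
    return sorted(ok_lines(before_text) - ok_lines(after_text))


before = (
    "ok 1 selftests: filesystems: devpts_pts # SKIP\n"
    "ok 2 selftests: example: placeholder_test\n"
)
after = (
    "ok 2 selftests: example: placeholder_test\n"
    "not ok 3 selftests: example: other_placeholder\n"
)
for line in lost_passes(before, after):
    print("-" + line)
```

Here the only difference is a test that was already marked `# SKIP` in the baseline, which matches the 207 versus 206 counts reported above.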

ciq-sahlberg and others added 10 commits November 3, 2025 15:46
…tead of a two-phase approach

jira roc-2673
commit fbf6449

Instead of setting x86_virt_bits to a possibly-correct value and then
correcting it later, do all the necessary checks before setting it.

At this point, the #VC handler references boot_cpu_data.x86_virt_bits,
and in the previous version, it would be triggered by the CPUIDs between
the point at which it is set to 48 and when it is set to the correct
value.

    Suggested-by: Dave Hansen <[email protected]>
    Signed-off-by: Adam Dunlap <[email protected]>
    Signed-off-by: Ingo Molnar <[email protected]>
    Tested-by: Jacob Xu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]

Signed-off-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
jira roc-2673
commit 3e32552

c->x86_cache_alignment is initialized from c->x86_clflush_size.
However, commit fbf6449 moved c->x86_clflush_size initialization
to later in boot without moving the c->x86_cache_alignment assignment:

  fbf6449 ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")

This presumably left c->x86_cache_alignment set to zero for longer
than it should be.

The result was an oops on 32-bit kernels while accessing a pointer
at 0x20.  The 0x20 came from accessing a structure member at offset
0x10 (buffer->cpumask) from a ZERO_SIZE_PTR=0x10.  kmalloc() can
evidently return ZERO_SIZE_PTR when it's given 0 as its alignment
requirement.

Move the c->x86_cache_alignment initialization to be after
c->x86_clflush_size has an actual value.

    Fixes: fbf6449 ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")
    Signed-off-by: Dave Hansen <[email protected]>
    Signed-off-by: Ingo Molnar <[email protected]>
    Tested-by: Nathan Chancellor <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    (cherry picked from commit 3e32552)
Signed-off-by: Ronnie Sahlberg <[email protected]>

Signed-off-by: Jonathan Maple <[email protected]>
Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2183
bug-fix x86/sev-es: Set x86_virt_bits
commit-author Paolo Bonzini <[email protected]>
commit 9a45819

In commit fbf6449 ("x86/sev-es: Set x86_virt_bits to the correct
value straight away, instead of a two-phase approach"), the initialization
of c->x86_phys_bits was moved after this_cpu->c_early_init(c).  This is
incorrect because early_init_amd() expected to be able to reduce the
value according to the contents of CPUID leaf 0x8000001f.

Fortunately, the bug was negated by init_amd()'s call to early_init_amd(),
which does reduce x86_phys_bits in the end.  However, this is very
late in the boot process and, most notably, the wrong value is used for
x86_phys_bits when setting up MTRRs.

To fix this, call get_cpu_address_sizes() as soon as X86_FEATURE_CPUID is
set/cleared, and c->extended_cpuid_level is retrieved.

Fixes: fbf6449 ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")
	Signed-off-by: Paolo Bonzini <[email protected]>
	Signed-off-by: Dave Hansen <[email protected]>
	Cc:[email protected]
Link: https://lore.kernel.org/all/20240131230902.1867092-2-pbonzini%40redhat.com
(cherry picked from commit 9a45819)
	Signed-off-by: Jonathan Maple <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
…sizes()

jira LE-2183
bug-fix-prereq x86/sev-es: Set x86_virt_bits
commit-author Borislav Petkov (AMD) <[email protected]>
commit 95bfb35

Drop 'vp_bits_from_cpuid' as it is not really needed.

No functional changes.

	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
	Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 95bfb35)
	Signed-off-by: Jonathan Maple <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2183
bug-fix x86/sev-es: Set x86_virt_bits
commit-author Dave Hansen <[email protected]>
commit 2a38e4c

tl;dr: CPUs with CPUID.80000008H but without CPUID.01H:EDX[CLFSH]
will end up reporting cache_line_size()==0 and bad things happen.
Fill in a default on those to avoid the problem.

Long Story:

The kernel dies a horrible death if c->x86_cache_alignment (aka.
cache_line_size()) is 0.  Normally, this value is populated from
c->x86_clflush_size.

Right now the code is set up to get c->x86_clflush_size from two
places.  First, modern CPUs get it from CPUID.  Old CPUs that don't
have leaf 0x80000008 (or CPUID at all) just get some sane defaults
from the kernel in get_cpu_address_sizes().

The vast majority of CPUs that have leaf 0x80000008 also get
->x86_clflush_size from CPUID.  But there are oddballs.

Intel Quark CPUs[1] and others[2] have leaf 0x80000008 but don't set
CPUID.01H:EDX[CLFSH], so they skip over filling in ->x86_clflush_size:

	cpuid(0x00000001, &tfms, &misc, &junk, &cap0);
	if (cap0 & (1<<19))
		c->x86_clflush_size = ((misc >> 8) & 0xff) * 8;

So they: land in get_cpu_address_sizes() and see that CPUID has level
0x80000008 and jump into the side of the if() that does not fill in
c->x86_clflush_size.  That assigns a 0 to c->x86_cache_alignment, and
hilarity ensues in code like:

        buffer = kzalloc(ALIGN(sizeof(*buffer), cache_line_size()),
                         GFP_KERNEL);

To fix this, always provide a sane value for ->x86_clflush_size.
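
The failure chain described above can be reproduced with plain integer arithmetic. The sketch below is a Python model, not kernel code: `clflush_size` mirrors the quoted CPUID snippet, and `kernel_align` models the kernel's power-of-two `ALIGN()` rounding with 64-bit unsigned wraparound. The CPUID register values are made up for illustration.

```python
MASK64 = (1 << 64) - 1  # model 64-bit unsigned wraparound
CLFSH_BIT = 1 << 19     # CPUID.01H:EDX[CLFSH], as tested in the snippet above


def clflush_size(misc_ebx, cap0_edx, current=0):
    """Mirror the quoted snippet: only set the size when CLFSH is advertised."""
    if cap0_edx & CLFSH_BIT:
        return ((misc_ebx >> 8) & 0xFF) * 8
    return current  # stays at its previous value (0 on the affected CPUs)


def kernel_align(x, a):
    """Model of ALIGN(x, a): round x up to a multiple of a (a power of two)."""
    mask = (a - 1) & MASK64
    return ((x + mask) & MASK64) & (~mask & MASK64)


# A CPU that advertises CLFSH with a line-size field of 8 (8 * 8 = 64 bytes).
normal = clflush_size(misc_ebx=0x0800, cap0_edx=CLFSH_BIT)
# A CPU without CLFSH: the field is ignored and the size stays 0.
oddball = clflush_size(misc_ebx=0x0800, cap0_edx=0)

print(kernel_align(100, normal))   # 128
print(kernel_align(100, oddball))  # 0, so the allocation request is zero bytes
```

With an alignment of 0 the mask becomes all ones, so every size rounds to 0. A zero-byte kzalloc() returns ZERO_SIZE_PTR (0x10), and reading a member at offset 0x10 from it faults at 0x20, which is the oops described earlier for commit 3e32552.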

Big thanks to Andy Shevchenko for finding and reporting this and also
providing a first pass at a fix. But his fix was only partial and only
worked on the Quark CPUs.  It would not, for instance, have worked on
the QEMU config.

1. https://raw.githubusercontent.com/InstLatx64/InstLatx64/master/GenuineIntel/GenuineIntel0000590_Clanton_03_CPUID.txt
2. You can also get this behavior if you use "-cpu 486,+clzero"
   in QEMU.

[ dhansen: remove 'vp_bits_from_cpuid' reference in changelog
	   because bpetkov brutally murdered it recently. ]

Fixes: fbf6449 ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")
	Reported-by: Andy Shevchenko <[email protected]>
	Signed-off-by: Dave Hansen <[email protected]>
	Tested-by: Andy Shevchenko <[email protected]>
	Tested-by: Jörn Heusipp <[email protected]>
	Cc: [email protected]
Link: https://lore.kernel.org/all/[email protected]/
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/all/20240517200534.8EC5F33E%40davehans-spike.ostc.intel.com
(cherry picked from commit 2a38e4c)
	Signed-off-by: Jonathan Maple <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3812
commit-author Haiyang Zhang <[email protected]>
commit 40a1d11

Change the Kconfig dependency, so this driver can be built and run on ARM64
with 4K page size.
16/64K page sizes are not supported yet.

	Signed-off-by: Haiyang Zhang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 40a1d11)
	Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3812
commit-author Haiyang Zhang <[email protected]>
commit 382d174

As defined by the MANA Hardware spec, the queue size for DMA is 4KB
minimal, and power of 2. And, the HWC queue size has to be exactly
4KB.

To support page sizes other than 4KB on ARM64, define the minimal
queue size as a macro separately from the PAGE_SIZE, which we always
assumed it to be 4KB before supporting ARM64.

Also, add MANA specific macros and update code related to size
alignment, DMA region calculations, etc.
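
The size rules stated above can be written down as a few checks. This is a sketch in Python rather than the driver's C, and the constant and function names are assumptions chosen for illustration, not copied from the MANA driver.

```python
# Illustrative constants; the names are assumptions, not taken from the driver.
HW_PAGE_SHIFT = 12
HW_PAGE_SIZE = 1 << HW_PAGE_SHIFT  # 4KB, the hardware minimum from the commit message
MIN_QUEUE_SIZE = HW_PAGE_SIZE


def hw_page_align(nbytes):
    """Round nbytes up to a multiple of the 4KB hardware page."""
    return (nbytes + HW_PAGE_SIZE - 1) & ~(HW_PAGE_SIZE - 1)


def is_valid_queue_size(nbytes):
    """A DMA queue must be at least 4KB and a power of two."""
    return nbytes >= MIN_QUEUE_SIZE and (nbytes & (nbytes - 1)) == 0


def is_valid_hwc_queue_size(nbytes):
    """The HWC queue must be exactly 4KB."""
    return nbytes == HW_PAGE_SIZE


SYSTEM_PAGE_SIZE = 64 * 1024  # example ARM64 configuration with 64KB pages
print(is_valid_queue_size(SYSTEM_PAGE_SIZE))      # True
print(is_valid_hwc_queue_size(SYSTEM_PAGE_SIZE))  # False
print(hw_page_align(5000))                        # 8192
```

The example shows why sizing queues by PAGE_SIZE stops working once the system page is larger than 4KB: a 64KB page is a legal DMA queue size but not a legal HWC queue size.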

	Signed-off-by: Haiyang Zhang <[email protected]>
	Reviewed-by: Michael Kelley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 382d174)
	Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
jira LE-3812
commit-author Konstantin Taranov <[email protected]>
commit e02497f

Use ib_umem_dma_offset() helper to calculate correct dma offset.

Fixes: 0266a17 ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
	Signed-off-by: Konstantin Taranov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Leon Romanovsky <[email protected]>
(cherry picked from commit e02497f)
	Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
…l page

jira LE-3812
commit-author Long Li <[email protected]>
commit 4a3b99b

When mapping doorbell page from user-mode, the driver should use the system
page size as this memory is allocated via mmap() from user-mode.

	Cc: [email protected]
Fixes: 0266a17 ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
	Signed-off-by: Long Li <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Leon Romanovsky <[email protected]>
(cherry picked from commit 4a3b99b)
	Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
… size

jira LE-3812
commit-author Long Li <[email protected]>
commit 9e517a8

MANA hardware uses 4k page size. When calculating the page table index,
it should use the hardware page size, not the system page size.
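
A generic illustration of why the choice of page size matters for an index calculation follows. It does not reproduce the driver's code; the 64KB system page and the byte offset are assumed example values.

```python
HW_PAGE_SHIFT = 12      # MANA hardware page: 4KB, per the commit message
SYSTEM_PAGE_SHIFT = 16  # assumed example: an ARM64 kernel built with 64KB pages


def page_index(byte_offset, page_shift):
    """Index of the page that contains byte_offset, for a given page size."""
    return byte_offset >> page_shift


offset = 3 * 64 * 1024 + 8 * 1024  # 200KB into a buffer (made-up value)
print(page_index(offset, HW_PAGE_SHIFT))      # 50
print(page_index(offset, SYSTEM_PAGE_SHIFT))  # 3
```

On a kernel with 4KB pages both shifts are 12 and the two indices agree, so the mismatch only becomes visible with larger system pages.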

	Cc: [email protected]
Fixes: 0266a17 ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
	Signed-off-by: Long Li <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Leon Romanovsky <[email protected]>
(cherry picked from commit 9e517a8)
	Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
@PlaidCat PlaidCat requested a review from a team November 3, 2025 22:10
@PlaidCat PlaidCat self-assigned this Nov 3, 2025
Collaborator

@bmastbergen bmastbergen left a comment

🥌

@PlaidCat PlaidCat merged commit ed216b7 into rlc-8/4.18.0-553.81.1.el8_10 Nov 5, 2025
2 checks passed
@PlaidCat PlaidCat deleted the jmaple_rlc-8/4.18.0-553.81.1.el8_10 branch November 5, 2025 23:18
5 participants