OCPBUGS-79576: Mark raw FC/iSCSI multipath members as ineligible#10142
Conversation
This reverts merge commit a63f435 (PR openshift#10091). The GetPreferredDiskID approach is bypassed by the "already matches" short-circuit in applyRootDeviceHints: if the service auto-selected a raw FC/iSCSI path and it appears in the acceptable list, the code returns early without ever calling GetPreferredDiskID. The correct fix is to mark raw FC/iSCSI multipath members as ineligible at the validator level (next commit).
|
@yoavsc0302: This pull request references Jira Issue OCPBUGS-79576, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughDisk selection logic was refactored: the shared Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 9 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (9 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
internal/controller/controllers/bmh_agent_controller.go (1)
1295-1305: Consider adding a guard for empty input slice.The function assumes
disksis non-empty when accessingdisks[0].IDon line 1304. While the current call site at line 1282 guards withlen(acceptable) > 0, defensive coding would protect against future misuse.🛡️ Optional: Add defensive check
func getDiskID(disks []*models.Disk) string { + if len(disks) == 0 { + return "" + } for _, disk := range disks { if strings.EqualFold(string(disk.DriveType), string(models.DriveTypeMultipath)) { return disk.ID } } return disks[0].ID }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/controllers/bmh_agent_controller.go` around lines 1295 - 1305, getDiskID assumes disks is non-empty and will panic on disks[0].ID; add a defensive empty-slice guard at the start of getDiskID (e.g., if len(disks) == 0 { return "" }) so the function safely returns an empty string when no disks are provided (alternatively consider changing signature to return (string, error) if callers need explicit error handling); update any callers or tests if they rely on a non-empty return.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@internal/controller/controllers/bmh_agent_controller.go`:
- Around line 1295-1305: getDiskID assumes disks is non-empty and will panic on
disks[0].ID; add a defensive empty-slice guard at the start of getDiskID (e.g.,
if len(disks) == 0 { return "" }) so the function safely returns an empty string
when no disks are provided (alternatively consider changing signature to return
(string, error) if callers need explicit error handling); update any callers or
tests if they rely on a non-empty return.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 155b91ad-d759-4f3b-93d7-5cfaa0b0eac7
📒 Files selected for processing (6)
cmd/agentbasedinstaller/host_config.gocmd/agentbasedinstaller/host_config_test.gointernal/controller/controllers/bmh_agent_controller.gointernal/hardware/validator.gointernal/hardware/validator_test.gointernal/host/hostutil/host_utils.go
💤 Files with no reviewable changes (2)
- cmd/agentbasedinstaller/host_config_test.go
- internal/host/hostutil/host_utils.go
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #10142 +/- ##
==========================================
+ Coverage 44.29% 44.37% +0.08%
==========================================
Files 415 415
Lines 72735 72825 +90
==========================================
+ Hits 32215 32316 +101
+ Misses 37606 37596 -10
+ Partials 2914 2913 -1
🚀 New features to boost your workflow:
|
Raw FC and iSCSI disks that are members of a multipath device are not eligible for installation. Writing to a raw path instead of the multipath device causes installation failure (ENODEV) when multipathd reclaims the disk. Mark these disks as ineligible at the validator level so they are excluded from all disk selection paths (SaaS, ZTP, ABI) and show as ineligible in the UI.
37496d6 to
b587c8e
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
internal/hardware/validator.go (1)
235-240: Trim holder tokens before lookup to avoid edge-case false negatives.If
disk.Holdersever contains spacing ("dm-0, dm-1"), direct map lookup can miss valid multipath holders.♻️ Proposed hardening
holders := strings.Split(disk.Holders, ",") for _, holder := range holders { + holder = strings.TrimSpace(holder) + if holder == "" { + continue + } if _, exists := multipathDiskNamesMap[holder]; exists { return fmt.Errorf(diskIsMultipathMember, holder) } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/hardware/validator.go` around lines 235 - 240, The loop that checks disk.Holders splits on commas but does not trim whitespace, so lookups on multipathDiskNamesMap can miss entries like " dm-1"; update the holder handling inside the loop in validator.go (the loop over holders created by strings.Split in the function that checks multipath membership) to call strings.TrimSpace(holder) and skip empty tokens before doing the map lookup; use the trimmed token for the exists check and the error message (fmt.Errorf(diskIsMultipathMember, trimmedHolder)).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@internal/hardware/validator.go`:
- Around line 235-240: The loop that checks disk.Holders splits on commas but
does not trim whitespace, so lookups on multipathDiskNamesMap can miss entries
like " dm-1"; update the holder handling inside the loop in validator.go (the
loop over holders created by strings.Split in the function that checks multipath
membership) to call strings.TrimSpace(holder) and skip empty tokens before doing
the map lookup; use the trimmed token for the exists check and the error message
(fmt.Errorf(diskIsMultipathMember, trimmedHolder)).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 60aa4063-6544-4467-bea9-8520fe6a10d6
📒 Files selected for processing (2)
internal/hardware/validator.gointernal/hardware/validator_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/hardware/validator_test.go
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: avishayt, yoavsc0302 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/hold |
|
/test ? |
|
/test edge-e2e-oci-assisted-bm-iscsi-4-22 |
|
/jira refresh |
|
@yoavsc0302: This pull request references Jira Issue OCPBUGS-79576, which is invalid:
Comment DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@zaneb: This pull request references Jira Issue OCPBUGS-79576, which is valid. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/cherrypick release-4.22 release-4.21 release-4.20 Note: auto cherry-pick to release-4.21 and release-4.20 will likely fail because the original fix (#10091) was manually backported to those branches with adapted code, so the revert commit won't match. Commenting anyway for documentation, will prepare manual backports. |
|
@yoavsc0302: once the present PR merges, I will cherry-pick it on top of DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/hold cancel |
|
/retest |
|
@yoavsc0302: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@yoavsc0302: Jira Issue OCPBUGS-79576: All pull requests linked via external trackers have merged:
Jira Issue OCPBUGS-79576 has been moved to the MODIFIED state. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@yoavsc0302: new pull request created: #10150 DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Two commits:
Revert PR OCPBUGS-79576: Prefer multipath disk in ABI disk selection #10091 — The
GetPreferredDiskIDapproach is bypassed by the "already matches" early return inapplyRootDeviceHints(): if Phase 1 auto-selects a raw FC/iSCSI path and it appears in the acceptable disk list, the function returns beforeGetPreferredDiskIDis ever called. Confirmed on s390x FC multipath hardware.Mark raw FC/iSCSI multipath members as ineligible — Raw FC and iSCSI disks that are members of a multipath device are now marked as ineligible at the validator level. Writing to a raw path instead of the multipath device causes installation failure (ENODEV) when multipathd reclaims the disk. By making them ineligible at the source, they are filtered out from all disk selection paths (SaaS, ZTP, ABI), no bypass is possible, and they show as ineligible in the UI with a clear reason.
List all the issues related to this PR
What environments does this code impact?
How was this code tested?
Checklist
docs, README, etc)Reviewers Checklist
Summary by CodeRabbit
Release Notes
Bug Fixes
Refactor