@pevogam
Contributor

@pevogam pevogam commented Nov 17, 2025

The typical use case is dumping extra logs and verifying that certain information is always present (e.g. startup messages) and certain information is always absent (e.g. segfaults).

Summary by CodeRabbit

  • Refactor

    • VM dump processing now handles multiple dumps individually, establishing a fresh session before each copy and running per-dump verification prior to cleanup.
    • Removed the previous unconditional end-of-run dump copy in favor of per-dump handling.
  • Bug Fixes

    • If a VM is not alive, extra-dump download is skipped with a warning instead of attempting a copy.
    • Kernel-parameter updates use a simplified alive/not-alive check to determine start/create and apply boot options.

@coderabbitai

coderabbitai bot commented Nov 17, 2025

Walkthrough

The patch modifies virttest/env_process.py. It adds an alias import from aexpect import ops_linux as ops. In postprocess_vm the VM-alive guard for kernel parameter updates was simplified to if not vm.is_alive(). New handling for vm_extra_dump_paths iterates each path: if VM is not alive it logs and skips; if alive it waits for login, ensures a vm_extra_dumps directory, copies each dump from VM to local storage, and optionally verifies per-dump expected/unexpected strings using ops.grep, raising TestFail on violations. The previous unconditional copy of extra dumps at function end was removed; session/log cleanup now occurs after per-dump processing.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Verify the new from aexpect import ops_linux as ops import and all ops.grep usages (error handling, return semantics).
  • Check per-dump parameter naming and parsing (vm_extra_dump_paths, verify_vm_extra_dump_{i}, expected_vm_extra_dump_{i}, unexpected_vm_extra_dump_{i}) and delimiter handling.
  • Review VM liveness logic change for kernel-parameter updates (removed explicit vm existence check).
  • Confirm session/login waiting, directory creation, file copy error handling, and that cleanup runs on all paths (including exceptions).
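The per-dump parameter naming from the checklist above can be sketched as follows. This is only an illustration: a plain dict stands in for the real params object, split_list is a hypothetical stand-in for params.get_list(), and the "yes"/"no" boolean encoding is an assumption. The index in each parameter name comes from enumerate() and is therefore 0-based.

```python
def split_list(params, key, delimiter=";"):
    """Hypothetical stand-in for params.get_list(): missing key -> empty list."""
    value = params.get(key)
    if not value:
        return []
    return value.split(delimiter)


def per_dump_settings(params):
    """Collect the per-dump verification settings keyed by dump path."""
    settings = {}
    for i, dump_path in enumerate(split_list(params, "vm_extra_dump_paths")):
        settings[dump_path] = {
            # Assumption: booleans are encoded as "yes"/"no" strings.
            "verify": params.get(f"verify_vm_extra_dump_{i}", "no") == "yes",
            "expected": split_list(params, f"expected_vm_extra_dump_{i}"),
            "unexpected": split_list(params, f"unexpected_vm_extra_dump_{i}"),
        }
    return settings


example_params = {
    "vm_extra_dump_paths": "/var/log/boot.log;/var/log/app.log",
    "verify_vm_extra_dump_0": "yes",
    "expected_vm_extra_dump_0": "Started;Reached target",
    "unexpected_vm_extra_dump_0": "segfault",
}
dump_settings = per_dump_settings(example_params)
```

The second dump has no per-dump parameters, so its lists are empty and nothing would be checked for it.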

Pre-merge checks

✅ Passed checks (3 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title clearly and specifically describes the main change: adding configurable validation for extra VM dump files to check for presence/absence of messages.
Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
virttest/env_process.py (1)

685-690: Replace bare except: with specific exception handling.

The bare except: clause catches all exceptions including SystemExit and KeyboardInterrupt, which can mask critical errors and make debugging difficult.

Apply this diff:

-            except:
+            except Exception as e:
                 LOG.error(
-                    "Could not copy the extra dump '%s' from the vm '%s'",
+                    "Could not copy the extra dump '%s' from the vm '%s': %s",
                     dump_path,
                     vm.name,
+                    e,
                 )
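
The difference between the two handlers can be shown with a small standalone sketch. The function names here are hypothetical and unrelated to env_process.py; the point is only that a bare except: also swallows KeyboardInterrupt, while except Exception lets it propagate.

```python
def run_with_bare_except(action):
    """Bare except: swallows everything, including KeyboardInterrupt."""
    try:
        action()
    except:  # noqa: E722 - deliberately bare for the demonstration
        return "swallowed"
    return "ok"


def run_with_except_exception(action):
    """except Exception: lets KeyboardInterrupt and SystemExit propagate."""
    try:
        action()
    except Exception as exc:
        return f"logged: {exc}"
    return "ok"


def interrupt():
    raise KeyboardInterrupt


def copy_failure():
    raise OSError("copy failed")


bare_result = run_with_bare_except(interrupt)
narrow_result = run_with_except_exception(copy_failure)
try:
    run_with_except_exception(interrupt)
    interrupt_propagated = False
except KeyboardInterrupt:
    interrupt_propagated = True
```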
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8dd98ee and a1c8ff6.

📒 Files selected for processing (1)
  • virttest/env_process.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
virttest/env_process.py (3)
virttest/utils_config.py (2)
  • get_list (200-202)
  • get_boolean (188-198)
virttest/guest_agent.py (1)
  • cmd (412-444)
virttest/qemu_monitor.py (2)
  • cmd (931-981)
  • cmd (2033-2093)
🪛 GitHub Actions: CI
virttest/env_process.py

[error] 1-1: Black formatting would reformat 1 file. Run 'black' to format and re-run 'make check'.

🪛 Ruff (0.14.4)
virttest/env_process.py

694-696: Avoid specifying long messages outside the exception class

(TRY003)


699-701: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (1)
virttest/env_process.py (1)

682-682: Good change to use get_list for parsing the dump paths.

The switch from string splitting to params.get_list() with a delimiter is a cleaner approach and consistent with the codebase patterns.

@pevogam pevogam force-pushed the extra-vm-dump-validation branch from a1c8ff6 to 6184f30 Compare November 17, 2025 17:07

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
virttest/env_process.py (1)

672-706: Fix shell injection risk and incorrect grep/exit-status handling in dump verification

The current verification logic has multiple critical issues:

  • message and dump_path are interpolated directly into shell commands (Lines 694 and 702), so any spaces, metacharacters, or user-controlled content can break grep or allow shell injection.
  • For unexpected messages (Lines 699-703), vm.session.cmd() is used but then treated as if it returned an exit status; instead it returns command output and may raise on non‑zero status.
  • The grep -q -v {message} usage and status != 0 check implement the wrong semantics: it will generally succeed as long as there is at least one line without the pattern, instead of failing when the unexpected pattern is present.

You can address all of these in one go by (1) using utils_misc.shell_escape() for both arguments, and (2) using cmd_status() with plain grep -q and checking the exit code appropriately:

-        for message in params.get_list(
-            f"expected_vm_extra_dump_{i}", delimiter=";"
-        ):
-            status = vm.session.cmd_status(f"grep -q {message} {dump_path}")
-            if status != 0:
-                raise exceptions.TestFail(
-                    f"Missing expected message {message} in {vm.name} extra dump {dump_path}"
-                )
-        for message in params.get_list(
-            f"unexpected_vm_extra_dump_{i}", delimiter=";"
-        ):
-            status = vm.session.cmd(f"grep -q -v {message} {dump_path}")
-            if status != 0:
-                raise exceptions.TestFail(
-                    f"Redundant unexpected message {message} in {vm.name} extra dump {dump_path}"
-                )
+        for message in params.get_list(
+            f"expected_vm_extra_dump_{i}", delimiter=";"
+        ):
+            cmd = "grep -q %s %s" % (
+                utils_misc.shell_escape(message),
+                utils_misc.shell_escape(dump_path),
+            )
+            status = vm.session.cmd_status(cmd)
+            if status != 0:
+                raise exceptions.TestFail(
+                    f"Missing expected message '{message}' in {vm.name} extra dump {dump_path}"
+                )
+
+        for message in params.get_list(
+            f"unexpected_vm_extra_dump_{i}", delimiter=";"
+        ):
+            cmd = "grep -q %s %s" % (
+                utils_misc.shell_escape(message),
+                utils_misc.shell_escape(dump_path),
+            )
+            status = vm.session.cmd_status(cmd)
+            if status == 0:
+                raise exceptions.TestFail(
+                    f"Found unexpected message '{message}' in {vm.name} extra dump {dump_path}"
+                )

This keeps the behavior clear and robust: exit status 0 = match; non‑zero = no match/error, and all arguments are properly escaped.
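
The two points above (exit-status semantics and quoting) can be illustrated without a guest session. In this sketch grep_q_status is a hypothetical pure-Python emulation of the grep -q exit status using substring matching, and shlex.quote from the standard library stands in for utils_misc.shell_escape; neither is the code from the PR.

```python
import shlex


def grep_q_status(pattern, lines, invert=False):
    """Emulate the exit status of `grep -q` (or `grep -q -v`).

    Returns 0 if at least one line is selected, 1 otherwise.
    """
    for line in lines:
        matched = pattern in line
        if matched != invert:
            return 0
    return 1


def build_grep_command(message, dump_path):
    """Quote both arguments so spaces and metacharacters stay literal."""
    return f"grep -q {shlex.quote(message)} {shlex.quote(dump_path)}"


dump_lines = ["boot ok", "segfault at 0x0", "shutdown"]

# -v selects non-matching lines, so it "succeeds" although the pattern is present.
inverted_status = grep_q_status("segfault", dump_lines, invert=True)
# Plain -q answers the question that matters: is the pattern present?
plain_status = grep_q_status("segfault", dump_lines)

command = build_grep_command("kernel panic; rm -rf /", "/tmp/my dump.log")
```

Both statuses are 0 here, which is why the inverted form cannot distinguish a clean dump from one that contains the unexpected message.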

🧹 Nitpick comments (1)
virttest/env_process.py (1)

708-716: Avoid swallowing all exceptions when closing remote sessions

The try/except Exception: pass around vm.remote_sessions cleanup (Lines 709-714) hides any real problems and triggers the BLE001/S110 lints. It’s cheap to make this more robust while still being non‑critical for the test flow.

Consider logging at least at debug level and ensuring the session is removed from the list even on failure:

-    # Close all SSH sessions that might be active to this VM
-    for s in vm.remote_sessions[:]:
-        try:
-            s.close()
-            vm.remote_sessions.remove(s)
-        except Exception:
-            pass
-
-    utils_logfile.close_log_file()
+    # Close all SSH sessions that might be active to this VM
+    for s in vm.remote_sessions[:]:
+        try:
+            s.close()
+        except Exception as exc:
+            LOG.debug(
+                "Failed to close remote session %r for VM %s: %s",
+                s,
+                vm.name,
+                exc,
+            )
+        finally:
+            if s in vm.remote_sessions:
+                vm.remote_sessions.remove(s)
+
+    utils_logfile.close_log_file()

This keeps failures non‑fatal but makes them visible and avoids silently swallowing unexpected exceptions.
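
The try/except/finally shape suggested above can be exercised in isolation. FakeSession and close_all are hypothetical names used only for this sketch; the real sessions come from vm.remote_sessions.

```python
import logging

LOG = logging.getLogger("cleanup-sketch")


class FakeSession:
    """Hypothetical session whose close() may fail."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail
        self.closed = False

    def close(self):
        if self.fail:
            raise RuntimeError(f"{self.name}: connection already gone")
        self.closed = True


def close_all(sessions):
    """Close every session; log failures and always drop the entry."""
    failures = []
    for s in sessions[:]:  # iterate over a copy while mutating the list
        try:
            s.close()
        except Exception as exc:
            LOG.debug("Failed to close session %s: %s", s.name, exc)
            failures.append(s.name)
        finally:
            if s in sessions:
                sessions.remove(s)
    return failures


remote_sessions = [FakeSession("a"), FakeSession("b", fail=True), FakeSession("c")]
failed = close_all(remote_sessions)
```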

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between a1c8ff6 and 6184f30.

📒 Files selected for processing (1)
  • virttest/env_process.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
virttest/env_process.py (3)
virttest/utils_config.py (2)
  • get_list (200-202)
  • get_boolean (188-198)
virttest/remote.py (3)
  • cmd_status (708-716)
  • close (30-31)
  • remove (424-438)
virttest/utils_logfile.py (1)
  • close_log_file (150-175)
🪛 Ruff (0.14.4)
virttest/env_process.py

696-698: Avoid specifying long messages outside the exception class

(TRY003)


704-706: Avoid specifying long messages outside the exception class

(TRY003)


713-714: try-except-pass detected, consider logging the exception

(S110)


713-713: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Static checks

@pevogam pevogam force-pushed the extra-vm-dump-validation branch from 6184f30 to b27d0e5 Compare November 17, 2025 17:30

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
virttest/env_process.py (1)

707-715: Consider logging exceptions during session cleanup.

The blanket exception handler on lines 712-713 silently swallows all errors, which could hide issues during cleanup. While this is existing code that was moved (not newly introduced), consider logging exceptions for better debuggability.

Apply this diff to log exceptions:

     # Close all SSH sessions that might be active to this VM
     for s in vm.remote_sessions[:]:
         try:
             s.close()
             vm.remote_sessions.remove(s)
-        except Exception:
-            pass
+        except Exception as e:
+            LOG.debug("Failed to close session: %s", e)
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 6184f30 and b27d0e5.

📒 Files selected for processing (1)
  • virttest/env_process.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
virttest/env_process.py (4)
virttest/utils_config.py (2)
  • get_list (200-202)
  • get_boolean (188-198)
virttest/qemu_monitor.py (2)
  • is_responsive (511-522)
  • close (731-748)
virttest/remote.py (2)
  • close (30-31)
  • remove (424-438)
virttest/utils_logfile.py (1)
  • close_log_file (150-175)
🪛 GitHub Actions: CI
virttest/env_process.py

[error] 1-1: isort: Imports are incorrectly sorted and/or formatted.

🪛 Ruff (0.14.5)
virttest/env_process.py

696-698: Avoid specifying long messages outside the exception class

(TRY003)


703-705: Avoid specifying long messages outside the exception class

(TRY003)


712-713: try-except-pass detected, consider logging the exception

(S110)


712-712: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (2)
virttest/env_process.py (2)

673-675: LGTM: Clean refactor to support indexed dump configuration.

The change from splitting a string to using params.get_list() with enumeration enables per-dump configuration, which aligns with the PR objectives.


684-705: Verify the ops.grep() check parameter behavior before merging.

The code at lines 695 and 702 calls ops.grep(..., check=True) and assumes it returns a boolean value. However, since ops is imported from the external aexpect library (line 16), I cannot definitively verify the behavior of the check parameter from the codebase alone.

Key concerns:

  1. Return type verification: Confirm that ops.grep(..., check=True) returns True when a pattern is found and False when not found (or raises an exception, which the code doesn't handle).

  2. Empty pattern handling: If params.get_list() returns strings with only whitespace, those will be passed to grep and may cause unexpected behavior. Consider filtering empty or whitespace-only strings before calling grep.

The suggested diff in the original review for handling empty patterns is still valid:

                 for message in params.get_list(
                     f"expected_vm_extra_dump_{i}", delimiter=";"
                 ):
+                    if not message.strip():
+                        continue
                     if not ops.grep(vm.session, message, dump_path, check=True):
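
The empty-pattern concern matters because an empty pattern matches every line in grep, so a stray delimiter would make an "unexpected" check fail on any non-empty dump. A minimal sketch of the filtering step, assuming a hypothetical helper that splits the raw parameter value itself:

```python
def non_empty_patterns(raw, delimiter=";"):
    """Split a delimited parameter value and drop blank entries."""
    return [part for part in raw.split(delimiter) if part.strip()]


patterns = non_empty_patterns("Started;; ;Reached target;")
```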

@pevogam pevogam force-pushed the extra-vm-dump-validation branch from b27d0e5 to ebc52e6 Compare November 17, 2025 17:52

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
virttest/env_process.py (1)

15-15: Import ordering violation persists - run isort to fix.

The import order still violates isort rules. Third-party imports should be alphabetically sorted: aexpect should come before avocado and six.

Run this command to fix:

isort virttest/env_process.py
🧹 Nitpick comments (2)
virttest/env_process.py (2)

696-698: Consider moving long exception messages into a dedicated exception class as a cleanup.

The static analysis tool suggests avoiding long messages outside the exception class, though this is a minor style concern.

If you want to address the Ruff TRY003 warnings, you could create custom exception classes or keep the messages concise, but this is a low-priority cleanup since the current code is functional and readable.

Also applies to: 703-705
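
One way to satisfy TRY003 is to let the exception class build its own message. The class below is a hypothetical sketch; it derives from Exception to stay self-contained, whereas in the PR it would presumably derive from exceptions.TestFail.

```python
class DumpVerificationError(Exception):
    """Hypothetical exception that formats its own message."""

    def __init__(self, vm_name, message, dump_path, found):
        self.vm_name = vm_name
        self.message = message
        self.dump_path = dump_path
        self.found = found
        state = "Redundant unexpected" if found else "Missing expected"
        super().__init__(
            f"{state} message '{message}' in {vm_name} extra dump {dump_path}"
        )


error = DumpVerificationError("vm1", "segfault", "/var/log/app.log", found=True)
```

The raise site then shrinks to raise DumpVerificationError(vm.name, message, dump_path, found=True), with no long string literal at the call.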


707-715: Consider logging exceptions during session cleanup.

The exception handling in the session cleanup loop suppresses all exceptions without logging. While this pattern is consistent with existing code in this file (e.g., lines 1522-1525), logging exceptions would help with debugging cleanup issues.

If you want to address the static analysis warnings (S110, BLE001), consider logging exceptions:

     for s in vm.remote_sessions[:]:
         try:
             s.close()
             vm.remote_sessions.remove(s)
-        except Exception:
-            pass
+        except Exception as e:
+            LOG.debug("Failed to close remote session: %s", e)
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between b27d0e5 and ebc52e6.

📒 Files selected for processing (1)
  • virttest/env_process.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
virttest/env_process.py (3)
virttest/utils_config.py (2)
  • get_list (200-202)
  • get_boolean (188-198)
virttest/qemu_monitor.py (2)
  • is_responsive (511-522)
  • close (731-748)
virttest/utils_logfile.py (1)
  • close_log_file (150-175)
🪛 Ruff (0.14.5)
virttest/env_process.py

696-698: Avoid specifying long messages outside the exception class

(TRY003)


703-705: Avoid specifying long messages outside the exception class

(TRY003)


712-713: try-except-pass detected, consider logging the exception

(S110)


712-712: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Static checks
  • GitHub Check: fedora_40 AVOCADO_SRC: SETUP:-m pip install . VT_TYPE:libvirt
  • GitHub Check: fedora_40 AVOCADO_SRC: SETUP:-m pip install PYPI_UPLOAD/*.whl VT_TYPE:libvirt
🔇 Additional comments (1)
virttest/env_process.py (1)

684-705: No issues found. The code correctly handles missing verification parameters.

The get_list method returns an empty list if the key is not in the parameters, which is exactly what occurs when expected_vm_extra_dump_{i} or unexpected_vm_extra_dump_{i} parameters are missing. The for loops at lines 692-698 and 699-705 will simply not execute, and no exception is raised.

@pevogam pevogam force-pushed the extra-vm-dump-validation branch from ebc52e6 to 0bb9915 Compare November 17, 2025 21:11

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
virttest/env_process.py (2)

684-707: Logic looks correct, and session validation is properly implemented.

The verification logic correctly handles both expected and unexpected messages:

  • For expected messages: raises TestFail when ops.grep returns False (not found)
  • For unexpected messages: raises TestFail when ops.grep returns True (found)

The session availability check prevents errors when the VM session is unavailable.

Optional: Consider extracting exception messages to constants to address Ruff TRY003.

The static analysis tool flags long exception messages on lines 697-699 and 705-707. While not critical, you could extract these to module-level constants or a helper function:

def _format_dump_verification_error(vm_name, message, dump_path, found):
    state = "Redundant unexpected" if found else "Missing expected"
    return f"{state} message '{message}' in {vm_name} extra dump {dump_path}"

Then use it in the exception raises. However, this is purely stylistic and the current implementation is clear and functional.


709-717: Consider logging exceptions during session cleanup for better debugging.

The blanket except Exception: pass on lines 714-715 silently ignores any errors during session cleanup. While this is acceptable for best-effort cleanup, logging the exceptions would help with debugging issues.

Apply this diff to add logging:

     # Close all SSH sessions that might be active to this VM
     for s in vm.remote_sessions[:]:
         try:
             s.close()
             vm.remote_sessions.remove(s)
-        except Exception:
-            pass
+        except Exception as e:
+            LOG.debug("Failed to close session: %s", e)

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between ebc52e6 and 0bb9915.

📒 Files selected for processing (1)
  • virttest/env_process.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
virttest/env_process.py (3)
virttest/utils_config.py (2)
  • get_list (200-202)
  • get_boolean (188-198)
virttest/qemu_monitor.py (2)
  • is_responsive (511-522)
  • close (731-748)
virttest/utils_logfile.py (1)
  • close_log_file (150-175)
🪛 Ruff (0.14.5)
virttest/env_process.py

697-699: Avoid specifying long messages outside the exception class

(TRY003)


705-707: Avoid specifying long messages outside the exception class

(TRY003)


714-715: try-except-pass detected, consider logging the exception

(S110)


714-714: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Static checks

@pevogam
Contributor Author

pevogam commented Nov 17, 2025

@luckyh @YongxueHong Finally, I hope you will consider a change like this one. It is meant to enhance the original dmesg validation during postprocessing with arbitrary validation for the presence or absence of log messages or other strings in arbitrary, configurable dumped files from any VM.

@pevogam
Contributor Author

pevogam commented Nov 21, 2025

Just to clarify, the failed test here is not related to the changes but reads: "⚠️ Not enough compute credits to prioritize tasks!"

@pevogam pevogam force-pushed the extra-vm-dump-validation branch from 0bb9915 to 7dba010 Compare November 21, 2025 19:08

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
virttest/env_process.py (1)

687-703: Harden ops.grep usage and clarify success/failure conditions

There are two concerns in this verification block, plus a formatting failure reported by CI:

  1. Shell‑injection surface via ops.grep
    Earlier analysis showed ops.grep builds a shell command where the pattern argument is not safely quoted, so a crafted message (from expected_vm_extra_dump_{i} / unexpected_vm_extra_dump_{i}) can inject shell commands when it contains problematic characters like single quotes. Even if you plan to fix this upstream in aexpect, you should add a local guard here to reject or sanitize unsafe patterns before calling ops.grep (e.g., at minimum reject messages containing ' and clearly fail the test with an explanatory error).

  2. Ambiguous behavior with check=True and the if conditions
    With check=True, a non‑zero grep exit status (no match or other error) is likely to raise an exception internally. That makes conditions like:

    • if not ops.grep(..., check=True, ...) for expected messages
    • if ops.grep(..., check=True, ...) for unexpected messages

    somewhat brittle and hard to reason about – it’s unclear whether “no match” will be surfaced via a boolean result or an exception, and for unexpected messages you probably want “no match” to be treated as success rather than an exception.

    Consider restructuring to make the contract explicit, for example:

    • Use check=False (or equivalent) and inspect the return value / status for “found vs not found”.
    • Or catch the specific exception raised on non‑zero exit and translate it into a TestFail only when it truly indicates a logic failure, not just “pattern not found”.
  3. Formatting / CI
    CI reports that Black would reformat this file. Please run:

    black virttest/env_process.py

    to clear the formatting failure.
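
The ambiguity described in item 2 is a general one and can be illustrated with Python's standard subprocess module, where check=True turns a non-zero exit status into an exception. This is an analogy only; it makes no claim about how aexpect's ops.grep actually behaves.

```python
import subprocess
import sys

# A child process that exits with status 1, like grep when nothing matches.
NO_MATCH = [sys.executable, "-c", "import sys; sys.exit(1)"]

# check=False: the status is returned and the caller decides what it means.
result = subprocess.run(NO_MATCH, check=False)
status_without_check = result.returncode

# check=True: the same status surfaces as an exception instead.
try:
    subprocess.run(NO_MATCH, check=True)
    raised = False
except subprocess.CalledProcessError as exc:
    raised = True
    status_from_exception = exc.returncode
```

With an API of this shape, an if condition wrapped around a check=True call would never observe the "not found" case, which is why the contract needs to be confirmed.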

🧹 Nitpick comments (2)
virttest/env_process.py (2)

669-678: Double‑check control flow and indexing for extra dump handling

The overall flow (only acting when vm_extra_dump_paths is set, creating a single session, and iterating with enumerate() over params.get_list("vm_extra_dump_paths", delimiter=";")) is reasonable. Two things to verify:

  • enumerate() starts at 0, so the configuration must use verify_vm_extra_dump_0, expected_vm_extra_dump_0, etc. If your configs were written with 1‑based indices, they won’t be picked up.
  • vm.wait_for_login() is called even if none of the verify_vm_extra_dump_{i} flags are enabled; if that’s a common case or if VMs are often unreachable in postprocess, you may want to defer the login until you know at least one dump needs verification.

If you confirm both are intentional, the structure is fine.
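
Both points can be sketched together: the indices are 0-based, and the login can be deferred until at least one dump asks for verification. FakeVM, the dict-based params, and the "yes"/"no" flag encoding are assumptions made for this illustration.

```python
def dumps_needing_verification(params):
    """Return the 0-based indices whose verify flag is enabled."""
    paths = [p for p in params.get("vm_extra_dump_paths", "").split(";") if p]
    return [
        i
        for i, _ in enumerate(paths)
        if params.get(f"verify_vm_extra_dump_{i}", "no") == "yes"
    ]


class FakeVM:
    """Hypothetical VM that counts login attempts."""

    def __init__(self):
        self.logins = 0

    def wait_for_login(self):
        self.logins += 1
        return object()


def maybe_login(vm, params):
    """Log in only when at least one dump requests verification."""
    if dumps_needing_verification(params):
        return vm.wait_for_login()
    return None


vm = FakeVM()
no_verify = {"vm_extra_dump_paths": "/a.log;/b.log", "verify_vm_extra_dump_1": "no"}
one_based = {"vm_extra_dump_paths": "/a.log", "verify_vm_extra_dump_1": "yes"}
zero_based = {"vm_extra_dump_paths": "/a.log", "verify_vm_extra_dump_0": "yes"}
sessions = [maybe_login(vm, p) for p in (no_verify, one_based, zero_based)]
```

The one_based configuration is silently ignored because index 1 does not exist for a single dump path.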


705-713: Avoid silent except Exception: pass when closing remote sessions

The cleanup loop correctly iterates over a copy of vm.remote_sessions and removes entries as they are closed, but the blanket handler:

except Exception:
    pass

fully swallows any errors, which can hide real issues (e.g., leaking sessions or file descriptors) and is flagged by static analysis.

Consider at least logging at debug/warning level, and narrowing the exception type if possible, e.g.:

-    for s in vm.remote_sessions[:]:
-        try:
-            s.close()
-            vm.remote_sessions.remove(s)
-        except Exception:
-            pass
+    for s in vm.remote_sessions[:]:
+        try:
+            s.close()
+            vm.remote_sessions.remove(s)
+        except Exception as exc:
+            LOG.debug("Failed to close remote session %r for VM %s: %s", s, vm.name, exc)

This keeps cleanup best‑effort while retaining observability.

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 0bb9915 and 7dba010.

📒 Files selected for processing (1)
  • virttest/env_process.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
virttest/env_process.py (4)
virttest/libvirt_vm.py (2)
  • wait_for_login (1979-2020)
  • remove (2410-2427)
virttest/virt_vm.py (1)
  • wait_for_login (1216-1307)
virttest/utils_config.py (2)
  • get_list (200-202)
  • get_boolean (188-198)
virttest/utils_logfile.py (1)
  • close_log_file (150-175)
🪛 GitHub Actions: CI
virttest/env_process.py

[error] 687-689: Black would reformat 1 file; 406 files would be left unchanged. Run 'black' to format the code before committing.

🪛 Ruff (0.14.5)
virttest/env_process.py

693-695: Avoid specifying long messages outside the exception class

(TRY003)


701-703: Avoid specifying long messages outside the exception class

(TRY003)


710-711: try-except-pass detected, consider logging the exception

(S110)


710-710: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (1)
virttest/env_process.py (1)

15-16: Import of ops_linux alias looks fine

Using from aexpect import ops_linux as ops is consistent with existing aexpect usage in this module; no issues from a dependency or style perspective.

@pevogam pevogam force-pushed the extra-vm-dump-validation branch from 7dba010 to d561027 Compare November 21, 2025 19:30

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
virttest/env_process.py (1)

669-705: Tighten vm-extra-dump verification: session lifecycle, behavior change, and ops.grep semantics

This block introduces useful functionality but has a few non-trivial concerns:

  1. Unconditional login in postprocess may change behavior
    vm.wait_for_login() is now always called whenever vm_extra_dump_paths is non-None, even if no verify_vm_extra_dump_{i} flag is set. If the VM has already shut down or crashed by the time postprocess runs, this will raise (e.g. login timeout) and fail postprocessing where previously copy_files_from() failures were only logged and the test could still pass. Consider:

    • Only attempting login if at least one verify_vm_extra_dump_{i} is enabled, or
    • Catching login errors here and turning them into a clearer TestFail/skip with context (“cannot verify extra dumps because login failed”).
  2. Session created by wait_for_login() is never explicitly closed
    The session obtained at Line 672 is not closed in this function. Elsewhere (e.g., kill_unresponsive_vms), the pattern is session = vm.wait_for_login(...); session.close(), so this is likely a new leak unless wait_for_login internally registers the session in vm.remote_sessions and you rely solely on that list. To be safe and explicit, wrap this block and close session in finally:

-    if params.get("vm_extra_dump_paths") is not None:
-        # even if there is previously existing session we cannot guarantee its state
-        # is viable, e.g. it may be in python .env or SMTP subsession, etc.
-        session = vm.wait_for_login(timeout=vm.LOGIN_WAIT_TIMEOUT)
+    if params.get("vm_extra_dump_paths") is not None:
+        # even if there is previously existing session we cannot guarantee its state
+        # is viable, e.g. it may be in python .env or SMTP subsession, etc.
+        session = None
+        try:
+            session = vm.wait_for_login(timeout=vm.LOGIN_WAIT_TIMEOUT)
@@
-            if params.get_boolean(f"verify_vm_extra_dump_{i}"):
+            if params.get_boolean(f"verify_vm_extra_dump_{i}"):
                 for message in params.get_list(
@@
                     LOG.info(f"Checking for unexpected message in {dump_path}")
                     if ops.grep(session, message, dump_path, check=True, flags="-aP"):
                         raise exceptions.TestFail(
                             f"Redundant unexpected message '{message}' in {vm.name} extra dump {dump_path}"
                         )
-
-    # Close all SSH sessions that might be active to this VM
+        finally:
+            if session is not None:
+                session.close()
+
+    # Close all SSH sessions that might be active to this VM
  3. Verify ops.grep behavior with check=True and conditional use
    The conditions

    • if not ops.grep(..., check=True, flags="-aP"): for expected messages, and
    • if ops.grep(..., check=True, flags="-aP"): for unexpected messages

    assume that ops.grep returns a truthy/falsy value without throwing when check=True. In many APIs, a check=True flag instead raises on non-zero exit status (i.e. when the pattern is not found), in which case:

    • The if conditions may never see a False result for “not found”, and
    • You’ll get raw command errors rather than the tailored TestFail messages in Lines 695–697 and 703–705.

    Please double-check the exact semantics of aexpect.ops_linux.grep in the version you’re using and adjust accordingly (e.g., use check=False and inspect the return/exit status, or wrap the call in try/except to translate grep failures into TestFail with your custom message).

    What is the behavior of `ops_linux.grep` in the `aexpect` library when `check=True`? Does it raise on non-zero grep exit codes, or does it return a boolean/tuple that can be safely used in conditionals?
    
  4. Shell-injection risk via ops.grep expr parameter (already noted earlier)
    As identified in the earlier review on this PR, ops.grep currently builds a shell command where the search expression (expr) is only wrapped in single quotes and not fully shell-quoted. Because message and dump_path here come from configuration (expected_vm_extra_dump_{i}, unexpected_vm_extra_dump_{i}, vm_extra_dump_paths), a crafted value containing ' and shell metacharacters could inject commands into the guest shell. Until an upstream fix for aexpect is confirmed and consumed here, consider:

    • Restricting allowed characters in expected/unexpected messages (e.g., reject entries with shell metacharacters), or
    • Avoiding ops.grep here and instead running grep via session.cmd_status() with utils_misc.shell_escape() applied to both message and dump_path.

    Since this matches the previously raised concern, I’m marking it as a duplicate for tracking.

Overall, I’d treat (2) and the confirmed ops.grep injection behavior as important enough to address before merging, and (1)/(3) as behavior/robustness points that at least deserve an explicit decision.
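
The session-lifecycle point can be tried out in isolation. In this sketch FakeSession, VerificationFailed (a stand-in for exceptions.TestFail), and verify_with_session are hypothetical names; the pattern shown is simply try/finally around a freshly opened session.

```python
class FakeSession:
    """Hypothetical session that records whether it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class VerificationFailed(Exception):
    """Stand-in for exceptions.TestFail."""


def verify_with_session(open_session, checks):
    """Run checks against a fresh session and always close it."""
    session = None
    try:
        session = open_session()
        for check in checks:
            check(session)
    finally:
        if session is not None:
            session.close()


opened = []


def open_session():
    session = FakeSession()
    opened.append(session)
    return session


def failing_check(session):
    raise VerificationFailed("Missing expected message")


try:
    verify_with_session(open_session, [failing_check])
    failure_seen = False
except VerificationFailed:
    failure_seen = True
```

The failure still propagates to the caller, but the session is closed first.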

🧹 Nitpick comments (1)
virttest/env_process.py (1)

707-715: Avoid silent bare except when closing remote sessions

This teardown block is best-effort, but an except Exception: pass silently swallows any issue closing sessions or mutating vm.remote_sessions, which can hide real cleanup problems and triggers Ruff warnings (S110/BLE001).

A lightweight improvement that keeps behavior non-fatal but surfaces issues for debugging would be:

-    # Close all SSH sessions that might be active to this VM
-    for s in vm.remote_sessions[:]:
-        try:
-            s.close()
-            vm.remote_sessions.remove(s)
-        except Exception:
-            pass
-
-    utils_logfile.close_log_file()
+    # Close all SSH sessions that might be active to this VM
+    for s in vm.remote_sessions[:]:
+        try:
+            s.close()
+            vm.remote_sessions.remove(s)
+        except Exception as exc:
+            LOG.debug(
+                "Ignoring error while closing remote session %r for VM %s: %s",
+                s,
+                vm.name,
+                exc,
+            )
+
+    utils_logfile.close_log_file()

This keeps failures non-fatal but provides some observability when cleanup goes wrong.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7dba010 and d561027.

📒 Files selected for processing (1)
  • virttest/env_process.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
virttest/env_process.py (5)
virttest/libvirt_vm.py (2)
  • wait_for_login (1979-2020)
  • remove (2410-2427)
virttest/virt_vm.py (1)
  • wait_for_login (1216-1307)
virttest/utils_config.py (2)
  • get_list (200-202)
  • get_boolean (188-198)
virttest/remote.py (2)
  • close (30-31)
  • remove (424-438)
virttest/utils_logfile.py (1)
  • close_log_file (150-175)
🪛 Ruff (0.14.5)
virttest/env_process.py

695-697: Avoid specifying long messages outside the exception class

(TRY003)


703-705: Avoid specifying long messages outside the exception class

(TRY003)


712-713: try-except-pass detected, consider logging the exception

(S110)


712-712: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Static checks
🔇 Additional comments (1)
virttest/env_process.py (1)

15-15: Import alias for ops_linux looks fine

Using from aexpect import ops_linux as ops is appropriate for the new ops.grep usage and keeps call sites readable. No issues from a functionality standpoint here.

@pevogam pevogam force-pushed the extra-vm-dump-validation branch from d561027 to bdbf86f Compare November 21, 2025 19:54

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
virttest/env_process.py (1)

679-686: Replace bare except: with specific exception types.

The bare except: clause will catch all exceptions including SystemExit and KeyboardInterrupt, which can prevent proper test cleanup and interruption.

Apply this diff:

             try:
                 vm.copy_files_from(dump_path, vm_extra_dumps)
-            except:
+            except Exception as e:
                 LOG.error(
-                    "Could not copy the extra dump '%s' from the vm '%s'",
+                    "Could not copy the extra dump '%s' from the vm '%s': %s",
                     dump_path,
                     vm.name,
+                    e,
                 )
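
The difference between the two clauses can be checked with a small self-contained sketch (the helper name `caught_by_except_exception` is made up for illustration):

```python
def caught_by_except_exception(exc):
    """Report whether `except Exception` would intercept the given exception."""
    try:
        raise exc
    except Exception:
        # Ordinary errors (ValueError, OSError, ...) end up here
        return True
    except BaseException:
        # KeyboardInterrupt and SystemExit derive from BaseException only,
        # so `except Exception` lets them propagate; a bare `except:` would not
        return False
```

In the copy loop this means that Ctrl-C or an explicit exit still interrupts postprocessing once the clause is narrowed to `except Exception`.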
♻️ Duplicate comments (1)
virttest/env_process.py (1)

15-15: Import ordering violation (duplicate of past review).

This import should be sorted according to the project's isort rules. Run isort virttest/env_process.py to fix.

🧹 Nitpick comments (1)
virttest/env_process.py (1)

707-715: Improve exception handling in SSH session cleanup.

The except Exception: pass block silently swallows all errors, making debugging difficult. Consider logging exceptions while still ensuring cleanup continues.

Apply this diff:

     # Close all SSH sessions that might be active to this VM
     for s in vm.remote_sessions[:]:
         try:
             s.close()
             vm.remote_sessions.remove(s)
-        except Exception:
-            pass
+        except Exception as e:
+            LOG.warning("Failed to close SSH session: %s", e)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d561027 and bdbf86f.

📒 Files selected for processing (1)
  • virttest/env_process.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
virttest/env_process.py (4)
virttest/libvirt_vm.py (2)
  • wait_for_login (1979-2020)
  • remove (2410-2427)
virttest/virt_vm.py (1)
  • wait_for_login (1216-1307)
virttest/utils_config.py (2)
  • get_list (200-202)
  • get_boolean (188-198)
virttest/utils_logfile.py (1)
  • close_log_file (150-175)
🪛 Ruff (0.14.5)
virttest/env_process.py

695-697: Avoid specifying long messages outside the exception class

(TRY003)


703-705: Avoid specifying long messages outside the exception class

(TRY003)


712-713: try-except-pass detected, consider logging the exception

(S110)


712-712: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Static checks


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
virttest/env_process.py (1)

669-711: Tighten error handling around extra dump copying and verification.

The overall flow (single login, per‑path copy, then per‑path expected/unexpected checks) looks good, but two points are worth adjusting:

  1. Bare except around vm.copy_files_from (Line 684)
    Catching all exceptions (including BaseException subclasses like KeyboardInterrupt) is unnecessary here and can hide serious problems. Prefer narrowing this to Exception and logging the exception details:
    -            try:
    -                vm.copy_files_from(dump_path, vm_extra_dumps)
    -            except:
    -                LOG.error(
    -                    "Could not copy the extra dump '%s' from the vm '%s'",
    -                    dump_path,
    -                    vm.name,
    -                )
    +            try:
    +                vm.copy_files_from(dump_path, vm_extra_dumps)
    +            except Exception:
    +                LOG.exception(
    +                    "Could not copy the extra dump '%s' from the vm '%s'",
    +                    dump_path,
    +                    vm.name,
    +                )

  2. Behavior when copy fails but verification is requested
    After a failed copy, the code continues and still runs ops.grep on the guest path. If the intention is that inability to retrieve a requested dump is a test failure, you might want to either:
    • Raise TestFail immediately on copy failure when verify_vm_extra_dump_{i} is true, or
    • Track a failure flag and raise once the loop completes.

If continuing despite copy failure is intentional (e.g., dumps are “best‑effort” artifacts while content verification against the guest path is the only requirement), consider adding a short comment to document that choice.
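
The second option (track failures and raise once the loop completes) could look roughly like the sketch below. All names are hypothetical: `copy_from_vm` stands in for something like `vm.copy_files_from`, `must_verify` for a lookup of `verify_vm_extra_dump_{i}`, and `DumpRetrievalError` for the framework's `TestFail`; the index passed to `must_verify` is simply the loop position.

```python
class DumpRetrievalError(Exception):
    """Hypothetical stand-in for the framework's TestFail."""


def collect_dumps(dump_paths, copy_from_vm, must_verify):
    """Copy every dump, then fail once if a dump needed for verification is missing."""
    failed = []
    for i, path in enumerate(dump_paths):
        try:
            copy_from_vm(path)
        except Exception as exc:
            # Dumps without verification remain best-effort artifacts;
            # only those that must be verified count as failures.
            if must_verify(i):
                failed.append("%s (%s)" % (path, exc))
    if failed:
        raise DumpRetrievalError(
            "Could not retrieve dumps required for verification: "
            + ", ".join(failed)
        )
```

Deferring the raise lets the remaining dumps still be collected, which tends to be useful when debugging the failure afterwards.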

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bdbf86f and 14d1f49.

📒 Files selected for processing (1)
  • virttest/env_process.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
virttest/env_process.py (2)
virttest/qemu_vm.py (1)
  • is_alive (308-314)
virttest/virt_vm.py (3)
  • is_alive (1635-1639)
  • wait_for_login (1216-1307)
  • copy_files_from (1370-1427)
🪛 Ruff (0.14.5)
virttest/env_process.py

684-684: Do not use bare except

(E722)


685-689: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


698-700: Avoid specifying long messages outside the exception class

(TRY003)


708-710: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Static checks
🔇 Additional comments (2)
virttest/env_process.py (2)

15-15: Import alias usage looks correct.

ops is only used for ops.grep in this file, and the alias import alongside remote keeps the aexpect imports grouped logically. No issues here.


657-661: Condition simplification for kernel param revert is safe.

Switching to if not vm.is_alive(): is correct here because vm is already validated earlier in postprocess_vm (env.get_vm(name) + early return on None). Behavior is unchanged and clearer.

The typical use for this is when dumping extra logs and wanting to
make sure certain information is always present (e.g. startup, etc.)
and certain information is always absent (e.g. segfaults, etc.).

It makes use of aexpect's linux ops as well as pattern matching
for grepping the expected or unexpected messages.

Signed-off-by: Plamen Dimitrov <[email protected]>
The vm being defined is something done at the very beginning of
postprocessing. Only is_alive() checks make sense at each separate
stage, including the handling of extra file dumps from the vms.

Signed-off-by: Plamen Dimitrov <[email protected]>
@pevogam pevogam force-pushed the extra-vm-dump-validation branch from 14d1f49 to e9fc541 Compare November 25, 2025 19:56

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 14d1f49 and e9fc541.

📒 Files selected for processing (1)
  • virttest/env_process.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
virttest/env_process.py (2)
virttest/virt_vm.py (3)
  • is_alive (1635-1639)
  • wait_for_login (1216-1307)
  • copy_files_from (1370-1427)
virttest/qemu_vm.py (1)
  • is_alive (308-314)
🪛 Ruff (0.14.5)
virttest/env_process.py

684-684: Do not use bare except

(E722)


685-689: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


698-700: Avoid specifying long messages outside the exception class

(TRY003)


708-710: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Static checks
🔇 Additional comments (2)
virttest/env_process.py (2)

15-15: Alias import of aexpect.ops_linux looks correct and matches usage.

The new from aexpect import ops_linux as ops import is consistent with how ops.grep is used later in postprocess_vm; no issues here.


655-667: Kernel param revert guard is safe after prior vm existence check.

Changing the guard to if not vm.is_alive(): is correct because vm was already validated earlier (if not vm: return), so the extra vm and check was redundant. The revert logic for kernel params remains unchanged.

     if kernel_extra_params_add or kernel_extra_params_remove:
         # VM might be brought down after test
-        if vm and not vm.is_alive():
+        if not vm.is_alive():
Contributor


From my understanding, we will return if the vm is None at the beginning of postprocess_vm:

vm = env.get_vm(name)
if not vm:
    return

It means that the vm is a VM instance here.

Could you elaborate on which cases would result in the vm being None? Thanks.

if not vm.is_alive():
    LOG.warning("VM is not alive so we cannot download extra dump paths")
else:
    session = vm.wait_for_login(timeout=vm.LOGIN_WAIT_TIMEOUT)
Contributor


Your intention is to check the VM's dump content at the post_process stage, right?
We download the dump files from the VM to the host first and then check them, so I think we can check the content on the host instead of in the guest session.
Please correct me if I misunderstood. Thanks.
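
A host-side check along these lines might look like the sketch below. It is only an illustration of the suggestion: `local_dump_matches` is a hypothetical helper, and it uses Python's `re` module, whose syntax differs in places from grep's regular expressions.

```python
import re


def local_dump_matches(local_path, pattern):
    """Return True if any line of the already-copied dump matches the pattern."""
    regex = re.compile(pattern)
    # errors="replace" keeps the scan going if the dump holds undecodable bytes
    with open(local_path, "r", errors="replace") as dump:
        return any(regex.search(line) for line in dump)
```

Checking the local copy needs no guest session, but it does depend on the copy having succeeded first.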

dump_path,
vm.name,
)
if params.get_boolean(f"verify_vm_extra_dump_{i}"):
Contributor

@YongxueHong YongxueHong Dec 17, 2025


Defining parameters for vm_extra_dump using an index feels a bit inelegant and unfriendly to users. It risks causing confusion when configuring multiple vm_extra_dump instances.

Since the framework already has a mechanism for this (similar to the image object parameter), I believe we should align with that existing design principle.

For example:

vm_extra_dumps = foo boo
vm_extra_dump_file_foo = /path/foo_dump_file
vm_extra_dump_file_boo = /path/foo_dump_boo
verify_vm_extra_dump_foo = yes
verify_vm_extra_dump_boo = yes
expected_vm_extra_dump_foo = boot success
expected_vm_extra_dump_boo = shut down success
unexpected_vm_extra_dump_foo = boot failed
unexpected_vm_extra_dump_boo = shut down failed

However, I realize this approach would impact existing test case configurations, so I’d like to hear other maintainers' opinions. That being said, your current workaround solution is acceptable to me. Thanks.
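
To make the proposal concrete, the sketch below resolves such name-based keys into one settings dictionary per dump. It is an assumption-laden illustration: `per_dump_params` is a hypothetical helper and a plain `dict` stands in for the framework's params object.

```python
def per_dump_params(params):
    """Group the name-suffixed keys into one dict per configured dump."""
    dumps = {}
    for name in params.get("vm_extra_dumps", "").split():
        dumps[name] = {
            "file": params.get("vm_extra_dump_file_%s" % name),
            # "yes"/"no" strings as in the configuration example above
            "verify": params.get("verify_vm_extra_dump_%s" % name, "no") == "yes",
            "expected": params.get("expected_vm_extra_dump_%s" % name),
            "unexpected": params.get("unexpected_vm_extra_dump_%s" % name),
        }
    return dumps
```

Because each dump is addressed by name rather than by position, reordering or removing an entry in `vm_extra_dumps` does not shift the meaning of the other keys.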

