Commit acc7044

test: cleanup and improve "wait for readiness" checks
In Microvm.spawn(), we try to wait for the firecracker process to initialize itself and become ready to serve requests. We have multiple checks of varying fidelity in there, and they can be strictly ordered by fidelity based on what they wait for: e.g. if we wait for SSH availability (guest userspace is ready), there is no point to _also_ wait for firecracker's startup message in the logs, as that is always printed before SSH becomes available. Thus clean this logic up to only do the one check that has the highest fidelity in any given setup.

While we're at it, update/move some comments.

Then, improve the check for API server readiness to wait for the log message that signals completion of API server initialization, instead of just waiting for socket file creation (which happens before we are actually ready to accept connections on it).

Signed-off-by: Patrick Roy <[email protected]>
1 parent 90774bd commit acc7044
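
The ordering described in the commit message can be sketched as a pure function that returns the single check to run. This is only an illustration under stated assumptions: `pick_readiness_check`, its parameters, and the returned labels are hypothetical names that do not appear in `tests/framework/microvm.py`. Only the argument keys (`config-file`, `no-api`) and the log levels come from the diff.

```python
from typing import Collection, Optional

# Log levels at which the messages used by the diff are assumed to be visible.
INFO_OR_MORE_VERBOSE = ("Trace", "Debug", "Info")


def pick_readiness_check(
    extra_args: Collection[str],
    has_iface: bool,
    has_log_file: bool,
    log_level: Optional[str],
) -> str:
    """Return a label for the single highest-fidelity check that applies.

    Hypothetical helper: the labels stand in for the methods called in spawn().
    """
    logs_visible = has_log_file and log_level in INFO_OR_MORE_VERBOSE
    # Guest userspace reachable over SSH implies everything below it.
    if "config-file" in extra_args and has_iface:
        return "ssh"
    # API in use: prefer the log message, fall back to the socket file.
    if "no-api" not in extra_args:
        return "api_log" if logs_visible else "api_socket"
    # No API: the startup log message is the last thing we can check.
    if logs_visible:
        return "startup_log"
    return "none"
```

Because the branches are mutually exclusive, exactly one check runs per setup, which mirrors the `if` / `elif` chain in the change.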

1 file changed

+20
-8
lines changed


tests/framework/microvm.py

Lines changed: 20 additions & 8 deletions
@@ -688,21 +688,33 @@ def spawn(
         if emit_metrics:
             self.monitors.append(FCMetricsMonitor(self))
 
-        # Wait for the jailer to create resources needed, and Firecracker to
-        # create its API socket.
-        # We expect the jailer to start within 80 ms. However, we wait for
-        # 1 sec since we are rechecking the existence of the socket 5 times
-        # and leave 0.2 delay between them.
-        if "no-api" not in self.jailer.extra_args:
-            self._wait_for_api_socket()
+        # Ensure Firecracker is in as good a state as possible wrts guest
+        # responsiveness / API availability.
+        # If we are using a config file and it has a network device specified,
+        # use SSH to wait until guest userspace is available. If we are
+        # using the API, wait until the log message indicating the API server
+        # has finished initializing is printed (if logging is enabled), or
+        # until the API socket file has been created.
+        # If none of these apply, do a last ditch effort to make sure the
+        # Firecracker process itself at least came up by checking
+        # for the startup log message. Otherwise, you're on your own kid.
         if "config-file" in self.jailer.extra_args and self.iface:
             self.wait_for_ssh_up()
-        if self.log_file and log_level in ("Trace", "Debug", "Info"):
+        elif "no-api" not in self.jailer.extra_args:
+            if self.log_file and log_level in ("Trace", "Debug", "Info"):
+                self.check_log_message("API server started.")
+            else:
+                self._wait_for_api_socket()
+        elif self.log_file and log_level in ("Trace", "Debug", "Info"):
             self.check_log_message("Running Firecracker")
 
     @retry(wait=wait_fixed(0.2), stop=stop_after_attempt(5), reraise=True)
     def _wait_for_api_socket(self):
         """Wait until the API socket and chroot folder are available."""
+
+        # We expect the jailer to start within 80 ms. However, we wait for
+        # 1 sec since we are rechecking the existence of the socket 5 times
+        # and leave 0.2 delay between them.
         os.stat(self.jailer.api_socket_path())
 
     @retry(wait=wait_fixed(0.2), stop=stop_after_attempt(5), reraise=True)
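
The body of `check_log_message` is not part of this diff. As a rough sketch of what "wait for a log message" can look like, the standalone function below polls a log file with the same budget as the `@retry` decorators above (5 attempts, 0.2 s apart). `wait_for_log_message` and its parameters are hypothetical, and it uses only the standard library rather than the decorator seen in the diff, so it should not be read as the framework's implementation.

```python
import time
from pathlib import Path


def wait_for_log_message(log_path, message, attempts=5, delay=0.2):
    """Poll log_path until message appears; raise TimeoutError otherwise.

    Hypothetical helper, not the framework's check_log_message.
    """
    path = Path(log_path)
    for attempt in range(attempts):
        # The log file may not exist yet if the process has only just started.
        if path.exists() and message in path.read_text(errors="replace"):
            return
        # Sleep between attempts, but not after the final one.
        if attempt < attempts - 1:
            time.sleep(delay)
    raise TimeoutError(
        f"{message!r} not found in {path} after {attempts} attempts"
    )
```

Under these assumptions, `wait_for_log_message(log_file, "API server started.")` waits for the stronger signal the commit switches to, whereas a socket-file check only confirms that the socket has been created.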

0 commit comments
