podman-machine uncategorized flakes #22551

Open
edsantiago opened this issue Apr 30, 2024 · 8 comments
Labels
flakes Flakes from Continuous Integration machine

Comments

@edsantiago
Member

Seeing this very often - much more than the table below shows, because my flake logger only logs once a PR is merged.

           Starting machine "blah blah"

[+1207s]   [FAILED] Timed out after 600.001s.
           Expected process to exit.  It did not.
x x x x x x
machine-linux(4) podman(5) fedora-39-aarch64(4) rootless(5) host(5) sqlite(5)
machine-mac(1) darwin(1)
@edsantiago
Member Author

x x x x x x
machine-linux(20) podman(22) fedora-39-aarch64(20) rootless(22) host(22) sqlite(22)
machine-hyperv(1) darwin(1)
machine-mac(1) windows(1)

edsantiago added a commit to edsantiago/libpod that referenced this issue May 6, 2024
Followup to containers#13936 : add an exclusion to localmachine tests
so we can avoid running those on test- or doc-only PRs.
Reason: containers#22551, the machine-start-timeout flake, is causing
hours of wasted time.

Signed-off-by: Ed Santiago <[email protected]>
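
For illustration only, the exclusion described in the commit message above (skip the localmachine tests when a PR touches only tests or docs) could be sketched roughly as follows. The path patterns and function names here are hypothetical, not taken from the actual CI configuration, which may express this rule quite differently.

```python
from fnmatch import fnmatch

# Hypothetical patterns (assumption): the real CI setup may use other paths.
DOC_OR_TEST_PATTERNS = ["docs/*", "test/*", "*.md"]


def is_doc_or_test_only(changed_files):
    """Return True when every changed file matches a doc/test pattern."""
    return bool(changed_files) and all(
        any(fnmatch(path, pattern) for pattern in DOC_OR_TEST_PATTERNS)
        for path in changed_files
    )


def should_run_machine_tests(changed_files):
    """Run the machine tests unless the change is doc- or test-only."""
    return not is_doc_or_test_only(changed_files)
```

An empty change list is treated as "run the tests", on the assumption that it is safer to run than to skip when the file list is unknown.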
@cevich
Member

cevich commented May 8, 2024

@edsantiago
Member Author

Different one: here the test fails, but not via timeout. machine wsl:

  C> podman.exe machine start 492b48671720
  Starting machine "492b48671720"
  your 131072x1 screen size is bogus. expect trouble

  This machine is currently configured in rootless mode. If your containers
  require root permissions (e.g. ports < 1024), or if you run into compatibility
  issues with non-podman clients, you can switch using the following command:

  	podman machine set --rootful 492b48671720

  API forwarding listening on: npipe:////./pipe/docker_engine

  Docker API clients default to this address. You do not need to set DOCKER_HOST.
  Error: machine did not transition into running state: ssh error: machine is not listening on ssh port

Is this the same bug? Shall I assign it to this issue?

@cevich
Member

cevich commented May 21, 2024

I seem to remember @baude was debugging this, or something related to "machine fails to start", a month or so ago. Dunno if it's the same thing or different, but in terms of getting it fixed, that's who I'd start with.


A friendly reminder that this issue had no activity for 30 days.

@edsantiago
Member Author

The past 30 days. It's tempting to focus on Mac because that's what's hitting us so hard right now in #23154 and #23157, but this is happening on Windows and Linux too.

x x x x x x
machine-mac(9) podman(17) darwin(9) rootless(17) host(17) sqlite(17)
machine-hyperv(4) fedora-40-aarch64(4)
machine-linux(4) windows(4)

@edsantiago
Member Author

I've just spent ten minutes blindly assigning all podman machine flakes to this issue. I looked at logs for some of them: some include the timeout, some are different errors. I have neither the time nor the interest to open issues for every podman-machine failure, so the list below is not entirely accurate. Still, I hope it helps in some way.

Podman machine CI is super broken right now. I hope this helps someone diagnose and fix it.

x x x x x x
machine-mac(82) podman(205) darwin(82) rootless(205) host(205) sqlite(205)
machine-linux(69) windows(54)
machine-hyperv(44) fedora-39-aarch64(37)
machine-wsl(10) fedora-40-aarch64(32)
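
As a sanity check on the tally above, here is a minimal sketch that parses the `name(count)` cells and sums the `machine-*` entries. It assumes only the cell format visible in this thread; the function name is made up for illustration and is not part of the flake logger.

```python
import re

# Each cell looks like "name(count)", e.g. "machine-mac(82)".
CELL = re.compile(r"([\w.-]+)\((\d+)\)")


def parse_cells(text):
    """Return {name: count} for every name(count) cell found in text."""
    return {name: int(count) for name, count in CELL.findall(text)}


table = """
machine-mac(82) podman(205) darwin(82) rootless(205) host(205) sqlite(205)
machine-linux(69) windows(54)
machine-hyperv(44) fedora-39-aarch64(37)
machine-wsl(10) fedora-40-aarch64(32)
"""

counts = parse_cells(table)
machine_total = sum(n for name, n in counts.items() if name.startswith("machine-"))
print(machine_total, counts["podman"])  # 82 + 69 + 44 + 10 = 205, matching podman(205)
```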

@edsantiago changed the title from "machine: timeout in start" to "podman-machine uncategorized flakes" on Aug 1, 2024
@edsantiago
Member Author

Executive decision: this issue is now a one-stop catchall for all podman-machine flakes. There are too many flakes, I don't have the time to look at each one, so I'm just doing an automatic lump of all flakes with "machine" in the test name into this issue. If anyone cares about podman-machine, please feel free to start tackling these.

Here's the last two weeks. Have fun.

x x x x x x
machine-hyperv(84) podman(164) windows(112) rootless(164) host(164) sqlite(164)
machine-mac(42) darwin(42)
machine-wsl(28) fedora-40-aarch64(10)
machine-linux(10)
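
The catchall rule described above (any flake with "machine" in the test name gets lumped into this issue) amounts to something like the sketch below. The function and constant names are hypothetical, and the case-insensitive match is an assumption; the actual flake logger is not shown in this thread.

```python
CATCHALL_ISSUE = 22551  # this issue


def assign_issue(test_name):
    """Return the catchall issue number for machine tests, else None.

    Assumption: a simple case-insensitive substring match on the test name.
    """
    if "machine" in test_name.lower():
        return CATCHALL_ISSUE
    return None


for name in ["machine-hyperv podman windows", "sys podman fedora-40 rootless"]:
    print(name, "->", assign_issue(name))
```

A rule this broad trades accuracy for effort, which is why the tallies in this issue mix the start timeout with unrelated errors.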


2 participants