Default route confusion when using multiple --network options with macvlan and bridge networks #23984
Comments
I am not sure if there is something special about IPv6, but what we do is add a default route for each network, with the same settings by default, and the kernel seems to apply some round-robin-based logic to route between them, which is far from perfect when macvlan and bridge networks are mixed. One of the reasons is that we do not know which route the user prefers. What you can do is ...
Thank you for the tip. I understand what you're telling me about how Podman works (I assumed something like that). I am not sure whether any other options are necessary, e.g. for disabling/enabling the default route on a per-container basis. But what you propose does fix my problem. Come to think of it, it's actually documented in the ...
For multiple bridge networks it doesn't really matter too much, as the traffic is masqueraded on the host anyway, but for macvlan this is certainly not very nice out-of-the-box behavior.
Having a different default weight for macvlan/ipvlan and bridge interfaces seems like a very good idea for the exact reason you mentioned; it would resolve a lot of potentially awkward behavior when using both types of networks at the same time.
The problem is that if we change the default for macvlan now, it might break others that already depend on the current default.
A friendly reminder that this issue had no activity for 30 days.
Issue Description
Consider the following macvlan network bound to a physical interface, the following Podman-internal bridge network, and the following container attached to both:
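A minimal sketch of how such a setup could be created; the network names, parent interface, subnets, and image here are assumptions pieced together from the rest of this report, not the exact commands used:

```
# hedged reconstruction -- parent interface, subnets and image are assumptions
podman network create -d macvlan -o parent=enp1s0 --ipv6 \
    --subnet 2a02:2f0f:2f7:1::/64 --gateway 2a02:2f0f:2f7:1::1 v6pub

podman network create --ipv6 --subnet fd12:3456:789a::/64 v46bridge

# attach the container to both networks at once
podman run -d --name web --network v6pub --network v46bridge docker.io/library/nginx
```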
Then the routing table will be very confusing (we'll be having a peek at the routing table using the host's "ip" utility via "nsenter", because the container's watered-down busybox-based "ip" doesn't show all the details):
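One way to take that peek, assuming root Podman and a container named web as in the sketch above:

```
# resolve the container's init PID, then enter only its network namespace
pid=$(podman inspect --format '{{.State.Pid}}' web)
sudo nsenter -t "$pid" -n ip -6 route show
```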
The problem is that I have two default routes: one via eth0, which is the one I want, and another one via eth1, which is a local address and would usually get NAT'ed/MASQUERADE'd. (There's actually a third one via the link-local eth0 IP address.)
This is OK if the container is only to initiate communications to the outside (ping google.com ...); it doesn't really matter where the packets go out. But if the container works as a server, e.g. an HTTP reverse proxy, all kinds of things happen:
- traceroute 2a02:2f0f:2f7:1::78:137 usually works from anywhere in the world
- ping 2a02:2f0f:2f7:1::78:137 mostly works from outside the local network (i.e. outside the uplink router boundary), but sometimes it fails from the outside
- ping ... mostly works perfectly from the host
- telnet 2a02:2f0f:2f7:1::78:137 2345 will mostly simply just freeze

The hot fix:
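A hedged sketch of what such a hot fix could look like (with pid as in the earlier sketch; the gateway addresses are placeholders, not the actual values):

```
# drop the two global default routes inside the container's network namespace,
# leaving only the link-local default on eth0
sudo nsenter -t "$pid" -n ip -6 route del default via 2a02:2f0f:2f7:1::1 dev eth0
sudo nsenter -t "$pid" -n ip -6 route del default via fd12:3456:789a::1 dev eth1
```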
This essentially removes all default routes except for the link-local one. The new table looks like this:
... which is not perfect, but the connection now works as expected.
A better solution would be this, but it's more difficult to implement manually:
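One possible reading of that better solution, again sketched with placeholder addresses: drop only the bridge network's default route and keep the public one on eth0.

```
# remove only the default route pointing at the internal bridge; eth0's default stays intact
sudo nsenter -t "$pid" -n ip -6 route del default via fd12:3456:789a::1 dev eth1
```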
...which leaves a clean routing table via the public IP:
In essence I need to enter the container and manipulate the routing table from within. I'm at a loss as to how this could be done elegantly when starting the container. The only solution I can come up with is to create the v46bridge network as --internal, which means that it won't be used as a default route.
But this leaves me with the fact that I do have containers which will be started within v46bridge (to be able to communicate with other containers in the same network via host name resolution), will not be included in v6pub (because they contain sensitive services not to be exposed directly), and might need to reach the internet to download stuff -- which they then couldn't, for lack of a default route.
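For reference, a hedged sketch of that --internal variant (same assumed subnet as above); an internal network contributes no default route and gets no outbound access:

```
# create the bridge network as internal so it never adds a default route
podman network create --internal --ipv6 --subnet fd12:3456:789a::/64 v46bridge
```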
Steps to reproduce the issue
Describe the results you received
Unreliable connectivity via ping, and even less reliable connectivity via "regular" TCP/IP applications, from hosts in networks other than the container host's. Flawless "tracepath", though.
Describe the results you expected
Flawless connection in all cases.
podman info output
On a current Fedora CoreOS stable:
Additional environment details
I did this on a machine connected to the internet; at the very least it should be something where there are multiple networks (at least two).
It's also possible that the connection issues are related to the router.
I'm using a MikroTik router with RouterOS 6.49.17, and, for your consideration, the routing table looks like this:
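Assuming the table was captured on the router itself, it would have been produced by something like the following (RouterOS 6.x CLI):

```
# print the IPv6 routing table on RouterOS
/ipv6 route print
```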
The pub-bridge here represents the interfaces connected to the corresponding IPv6 network. So to my understanding there really is nothing here that should prevent proper routing -- given that every single physical machine with an IP in the corresponding range has no connection problems whatsoever (to the outside world, through the MikroTik gateway), including the container host.
I'd argue that even if this is a weird MikroTik IPv6 routing issue (is it?), Podman is on the hook at least for its inability to control the default route when using more than one network -- which would apparently mitigate the problem in this case :-p
Additional information
Owing to the fairly complex setup, the "steps to reproduce" above are not an exact copy & paste. I've reproduced them from my work as well as I could, but I apologize in advance for any problems or typos I might have inadvertently introduced.
The general idea should be clear: when using two or more --network options with podman run, Podman appears to be confused as to which the default route should be. With macvlan networks this leads to a catastrophic inability to reliably serve connections.