Introduction
This post discusses NAT traversal for WireGuard. In the existing system, WireGuard connects XDCs to facility gateways, which host infrapods. These infrapods have WireGuard interfaces that connect to the infranet - and thus to the experiment nodes. The XDCs have their own WireGuard interfaces that are created at XDC attach time. This is a standard WireGuard connection.
The Merge portal hosts a WireGuard key exchange service. When a new WireGuard interface is created on an XDC or a facility, the private key stays on the local interface and the public key is sent to the portal. The portal then distributes this peer information (public key, initial client IP address and port, and the allowed addresses for that client in the tunnel) to all entities that need it for the given materialization. If a facility key is given, all attached XDCs get the peer information. If an XDC key is given, all facilities in the materialization get the peer information.
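To make the distribution rule concrete, here is a minimal sketch of it. The `PeerInfo` type, its field names, and the `recipients` helper are all hypothetical names made up for illustration; they are not the portal's actual API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class PeerInfo:
    """Hypothetical record for what the portal distributes about one peer."""
    owner: str                    # e.g. "xdc-a" or "facility-1" (made-up names)
    kind: str                     # "xdc" or "facility"
    public_key: str               # the private key never leaves the owner
    endpoint: Optional[str]       # initial "ip:port", None when not known
    allowed_ips: Tuple[str, ...]  # addresses the peer may use in the tunnel


def recipients(new_peer: PeerInfo, members: List[PeerInfo]) -> List[str]:
    """Return the owners that must learn about new_peer.

    A facility key goes to every attached XDC; an XDC key goes to every
    facility in the materialization.
    """
    wanted = "xdc" if new_peer.kind == "facility" else "facility"
    return [m.owner for m in members if m.kind == wanted]


# Example materialization with two facilities and one attached XDC.
members = [
    PeerInfo("facility-1", "facility", "pubF1", "198.51.100.10:51820", ("10.0.0.1/32",)),
    PeerInfo("facility-2", "facility", "pubF2", "198.51.100.20:51820", ("10.0.0.2/32",)),
    PeerInfo("xdc-a", "xdc", "pubXa", None, ("10.0.1.1/32",)),
]
print(recipients(members[2], members))  # the XDC key goes to the facilities
```

Note that the XDC's `endpoint` is `None` here: the facilities only learn it when the XDC makes first contact.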
This works well for all the reasons that WireGuard works well: minimal setup (a single key and peer information exchange), client roaming, and a minimal attack surface. Once all peer information is distributed, WireGuard waits for an initial connection from either side, notes where the first authenticated packet came from, and uses that address and port for packets sent back to that client. The client does the same when the other side responds. So at least one end needs a routable endpoint for the initial connection. This breaks when both sides are NATted, as neither side has a well-known endpoint.
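The first-contact rule can be modelled with a few lines of code. This is a toy model only, not WireGuard's implementation: `Side`, `handshake`, and the `nat_mapping` address are illustrative names and values I made up.

```python
from typing import Optional


class Side:
    """One end of the tunnel in a toy model of endpoint learning."""

    def __init__(self, name: str, public_endpoint: Optional[str]):
        self.name = name
        self.public_endpoint = public_endpoint      # None models a NATted side
        self.remote_endpoint: Optional[str] = None  # where to send packets


def handshake(a: Side, b: Side, nat_mapping: str = "203.0.113.5:40000") -> bool:
    """Return True when both sides end up knowing where to send packets."""
    for initiator, responder in ((a, b), (b, a)):
        if responder.public_endpoint is None:
            continue  # the initiator has nowhere to send the first packet
        initiator.remote_endpoint = responder.public_endpoint
        # The responder records the source of the first authenticated packet;
        # for a NATted initiator that is the address the NAT mapped it to.
        responder.remote_endpoint = initiator.public_endpoint or nat_mapping
        return True
    return False  # both sides are NATted, nobody can make first contact


xdc = Side("xdc", None)
facility = Side("facility", "198.51.100.10:51820")
byod = Side("byod", None)

print(handshake(xdc, facility))  # one routable end is enough
print(handshake(xdc, byod))      # the double NAT case
```

The second call is the BYOD problem in miniature: with no routable endpoint on either side, neither loop iteration can send a first packet.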
This is generally fine for Merge as it exists today. The facilities all have well-known endpoints and the connection is driven by XDCs - XDCs make first contact. This means the facility knows how to route back to XDCs regardless of where they are.
But we are introducing bring-your-own-device (BYOD) to Merge. A BYOD is generally imagined to be a single node or small cluster not running with a public or well-known endpoint. BYOD will, we think, use multi-facility functionality to materialize the device into an experiment that is mostly running on another facility. So two endpoints may be NATted - the XDCs and the BYOD. At least one of them will need to learn the other’s NATted endpoint to be able to make first contact.
So how to do that?
As this case is not thought to be that prevalent (is this true?), I think the solution should not disrupt the existing system too much. We do not want to re-architect a working system for an edge case that won’t happen too often, especially a user-facing part of the system, which the WireGuard service is. We want it to be as robust and simple as possible.
Chris’ Suggestion
Chris has suggested a hub-and-spoke design: the portal will host one (or more?) wireguard-pods that have well-known endpoints. All WireGuard traffic for a materialization (XDC <=> facilities) will flow through this pod or pods. The traffic between these pods and the XDCs will not be encrypted, so XDCs will not need WireGuard keys. Please let me know if this is not correct.
The upside of this approach is, first of all, that it solves the double NAT problem. As a nice side effect, it simplifies key exchange, as each materialization will only need 1 key for the portal (the wireguard pod) and 1 key for each facility. The downsides of this approach (according to me) are:
1) it is a single point of failure: if the pod goes down or has network issues, all connections are broken;
2) key distribution doesn’t really need simplification - it’s not a complex protocol;
3) it creates a dependency that doesn’t exist in WireGuard, which is very robust to changing network conditions, and introducing a hub to a peer-to-peer connection adds complexity and possible network latency;
4) it breaks “local XDC” connections (XDCs that run on user machines and are not hosted in the portal).
Chris has rightly said that if we do not go with this, well, we need to go with something. BYOD will generally be behind a NAT.
Another Suggestion
I don’t really have another suggestion.
Although after a very small bit of looking around I have found that others have come up with solutions for this. Or at least thought about it. This is not surprising as many people use wireguard and many people are behind NATs.
Here’s a write-up for one: WireGuard Endpoint Discovery and NAT Traversal using DNS-SD | Jordan Whited. First it talks about using STUN (Session Traversal Utilities for NAT) to punch through a NAT. That may work. Then it also suggests a NAT traversal broker that distributes peer addressing information. At first glance I think this looks like a pretty good solution. It could likely be integrated into the existing WireGuard service on the portal, so there would be minimal new “things” needed.
When a WireGuard connection is created, the client (facility) already gives its keying information to the portal as normal. It could, in addition, make a second call to this address broker. The broker would note the (NATted or not) endpoint for the facility. This endpoint would be added to the existing WireGuard enclave information that is distributed, so all XDCs (and other facilities) would get the (NATted or not) endpoint of the facility. We’ll need to look into this a bit more, but on the face of it, it seems reasonable to me. This API call / broker only extends an existing system and integrates nicely. No real re-architecting is needed and it does not route all traffic through a single point, so the robustness inherent in WireGuard is kept.
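A rough sketch of what such a broker could look like, to show how little is involved. `EndpointBroker`, `register`, `serve_once`, and `enrich` are hypothetical names, and the one-datagram "protocol" (the payload is just the public key) is an assumption made for illustration, not something taken from the write-up. One caveat I believe applies: the observed mapping is only useful to peers if the registration is sent from the same UDP port the WireGuard interface listens on.

```python
import socket
from typing import Dict, List, Tuple


class EndpointBroker:
    """Hypothetical address broker: public key -> endpoint it was seen from."""

    def __init__(self) -> None:
        self.observed: Dict[str, str] = {}

    def register(self, public_key: str, source: Tuple) -> str:
        """Record the source address of a registration as 'host:port'."""
        host, port = source[0], source[1]
        endpoint = f"[{host}]:{port}" if ":" in host else f"{host}:{port}"
        self.observed[public_key] = endpoint
        return endpoint

    def serve_once(self, sock: socket.socket) -> str:
        # recvfrom reports the sender's address as the broker sees it,
        # that is, after any NAT on the path has rewritten it.
        data, source = sock.recvfrom(1024)
        return self.register(data.decode("ascii").strip(), source)


def enrich(peers: List[dict], broker: EndpointBroker) -> List[dict]:
    """Return peer records with observed endpoints filled in where known."""
    return [
        {**p, "endpoint": broker.observed.get(p["public_key"], p.get("endpoint"))}
        for p in peers
    ]


broker = EndpointBroker()
broker.register("pubBYOD", ("203.0.113.7", 41234))  # address seen by the broker
enclave = enrich(
    [{"public_key": "pubBYOD", "endpoint": None},
     {"public_key": "pubF1", "endpoint": "198.51.100.10:51820"}],
    broker,
)
```

The `enrich` step is the part that would hook into the existing enclave distribution: peers with a configured endpoint keep it, and NATted peers get whatever the broker observed.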
Edit: and if we can get the gRPC system to give us the source address of the packet used to send the public key, we will already have the (NATted or not) address for the facility. So no second API call would be needed. I’m not sure gRPC will give us that though. Worth looking into anyway.
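On the gRPC question: as far as I know gRPC does expose the caller's address to the handler (`peer.FromContext` in Go, `ServicerContext.peer()` in Python). The sketch below parses the string form used by the Python API. The `ipv4:host:port` / `ipv6:[host]:port` format is an assumption based on what I have seen, and `parse_grpc_peer` is a made-up helper name. Two caveats: gRPC runs over TCP, so the port seen here is a TCP source port and not the NAT's mapping for the WireGuard UDP port, and any proxy in front of the portal would hide the real client address.

```python
from typing import Optional, Tuple
from urllib.parse import unquote


def parse_grpc_peer(peer: str) -> Optional[Tuple[str, int]]:
    """Split a gRPC peer string into (host, port); None for non-IP peers."""
    # Some versions percent-encode the brackets around IPv6 hosts.
    scheme, _, rest = unquote(peer).partition(":")
    if scheme not in ("ipv4", "ipv6"):
        return None  # e.g. a unix domain socket peer
    host, _, port = rest.rpartition(":")
    if not host or not port.isdigit():
        return None
    return host.strip("[]"), int(port)


# Inside a servicer method this would be parse_grpc_peer(context.peer()).
observed = parse_grpc_peer("ipv4:203.0.113.7:41234")
```

So the key-exchange call alone would likely give us the facility's public IP, but probably not a usable UDP port.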
Anyway…thoughts?