I’ve noticed a large buildup of stale EVPN routes on one of the testbed facilities we manage. This is a collection of my notes, hopefully ending with a solution.
Failing to delete routes through GoBGP
When I first noticed the stale routes, my instinct was to use GoBGP to delete them.
gobgp global rib -a evpn del multicast 10.99.1.3 etag 0 rd 10.99.1.3:12
However, it appears that GoBGP will only withdraw a route that it originated. An indication of this is the following in the gobgpd logs.
WARN[0006] No matching path for withdraw found, may be path was not installed into table Key="[type:multicast][rd:10.99.1.3:12][etag:0][ip:10.99.1.3]" Path="{ [type:multicast][rd:10.99.1.3:12][etag:0][ip:10.99.1.3] | src: local, nh: 0.0.0.0, withdraw }" Topic=Table
Where the key text is
No matching path for withdraw found, may be path was not installed into table
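When there are many of these warnings, it can help to pull the route key out of each log line and split it into its fields. The following is a minimal sketch that assumes the key always has the bracketed `[name:value]` shape shown in the log line above; the function names `parse_route_key` and `key_from_log_line` are my own and are not part of GoBGP.

```python
import re

# Assumption: the route key is logged as Key="[name:value][name:value]..."
KEY_RE = re.compile(r'Key="([^"]*)"')
# The field name ends at the first colon; the value runs to the closing bracket,
# so values that contain colons (RDs, MAC addresses) stay in one piece.
FIELD_RE = re.compile(r"\[([^:\]]+):([^\]]*)\]")


def parse_route_key(key):
    """Split a key such as '[type:multicast][rd:10.99.1.3:12]' into a dict."""
    return dict(FIELD_RE.findall(key))


def key_from_log_line(line):
    """Return the parsed Key="..." fields of a gobgpd log line, or None if absent."""
    match = KEY_RE.search(line)
    return parse_route_key(match.group(1)) if match else None
```

The resulting dict carries the `rd`, `etag` and `ip` values needed to look the route up again later.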
That warning is a roundabout way of saying the path came from a peer rather than being originated locally. The peer that advertised it can be found by querying GoBGP with the --json flag. This dumps a lot of data, so it’s best to redirect it to a file and inspect it afterwards.
gobgp global rib -a evpn --json > out
"[type:macadv][rd:10.99.1.3:12][etag:0][mac:00:08:a2:0d:dc:ab][ip:<nil>]": [
------ snip --------
{
"source-id": "10.99.1.2",
"neighbor-ip": "fe80::526b:4bff:fe8e:9e70"
}
],
So this tells us that our problem lies with the router 10.99.1.2. This router is a Cumulus switch running FRR, so the next stage of our saga will go there.
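Rather than reading the dump by eye, the paths can be tallied by the router that advertised them. This is a rough sketch, assuming the dump is a JSON object that maps each route key to a list of path objects, each with a "source-id" field as in the snippet above. The helper name `count_sources` is hypothetical, and paths without a "source-id" are grouped under "unknown".

```python
from collections import Counter


def count_sources(rib, rd=None):
    """Count paths per "source-id" in a parsed GoBGP JSON RIB dump.

    rib: dict mapping route key -> list of path dicts (assumed layout).
    rd:  optional route distinguisher; only keys containing "[rd:<rd>]" are counted.
    """
    needle = f"[rd:{rd}]" if rd else None
    counts = Counter()
    for key, paths in rib.items():
        if needle and needle not in key:
            continue
        for path in paths:
            # Paths lacking a "source-id" field are grouped under "unknown".
            counts[path.get("source-id", "unknown")] += 1
    return counts
```

With the file written by the command above, this could be used as `count_sources(json.load(open("out")), rd="10.99.1.3:12").most_common()` after an `import json`, which lists the routers advertising the most paths for that RD first.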
Investigating stale routes on Cumulus switches running FRR
Hopping on to the router with id 10.99.1.2, we see the following
# net show bgp evpn route rd 10.99.1.3:12 mac 00:08:a2:0d:dc:ab
BGP routing table entry for 10.99.1.3:12:[2]:[0]:[0]:[48]:[00:08:a2:0d:dc:ab]
Paths: (1 available, best #1)
Advertised to non peer-group peers:
swp7 swp9 xf0(xf0) xf1(xf1) xf2(xf2) xf3(xf3) xf4(xf4)
Route [2]:[0]:[0]:[48]:[00:08:a2:0d:dc:ab] VNI 693
64803
10.99.1.3 from xf0(xf0) (10.99.1.3)
Origin IGP, valid, external, bestpath-from-AS 64803, best
Extended Community: RT:64803:693 ET:8
AddPath ID: RX 0, TX 1452543
Last update: Tue Apr 28 11:55:34 2020
which tells us that the route came from 10.99.1.3.
Hopping onto the router with id 10.99.1.3 we see the following
xf0:$ net show bgp evpn route rd 10.99.1.3:12 mac 00:08:a2:0d:dc:ab
% Network not in table
which appears to mean the route is not here, which is … odd.
On the 10.99.1.2 router, doing a ‘hard clear’ of BGP got rid of this particular stale route. Here xf0 is the name of the 10.99.1.3 router.
$ vtysh
% enable
% clear bgp l2vpn xf0