Basic leaf/spine networking

Hello,

I’m attempting to set up a basic leaf/spine network by hand to learn the nuts and bolts of basic networking. The goal (for now) is just to set up two VLANs among four nodes in a simple leaf/spine topology. After that I’ll work on VXLAN, then VXLAN/EVPN.

Repo for the topology is at: MergeTB / DevOps / VTE / leaf-spine · GitLab

I have a simple rvn model. Its status output and the model itself:

INFO[0001] nodes
INFO[0001] x0 running success 172.22.3.28
INFO[0001] y0 running success 172.22.3.92
INFO[0001] x1 running success 172.22.3.163
INFO[0001] y1 running success 172.22.3.84
INFO[0001] switches
INFO[0001] spine running success 172.22.3.115
INFO[0001] leaf0 running success 172.22.3.117
INFO[0001] leaf1 running success 172.22.3.211
INFO[0001] external links

topo = {
  name: "basic_" + Math.random().toString().substr(-6),
  nodes: [...["x0", "y0", "x1", "y1"].map((x) => node(x))],
  switches: [cumulus("leaf0"), cumulus("leaf1"), cumulus("spine")],
  links: [
    // hosts on leaf0 (mac keys must name the two endpoints of the link)
    v2v("x0", 1, "leaf0", 1, { mac: { x0: "04:80:00:00:00:01", leaf0: "04:60:00:00:00:02" }, }),
    v2v("y0", 1, "leaf0", 2, { mac: { y0: "04:81:00:00:00:01", leaf0: "04:60:00:00:00:03" }, }),

    // hosts on leaf1
    v2v("x1", 1, "leaf1", 1, { mac: { x1: "04:82:00:00:00:01", leaf1: "04:60:00:00:00:02" }, }),
    v2v("y1", 1, "leaf1", 2, { mac: { y1: "04:83:00:00:00:01", leaf1: "04:60:00:00:00:03" }, }),

    // leaf uplinks to the spine
    v2v("leaf0", 3, "spine", 1, { mac: { leaf0: "04:60:00:00:00:01", spine: "04:50:00:00:00:01" }, }),
    v2v("leaf1", 3, "spine", 2, { mac: { leaf1: "04:70:00:00:00:01", spine: "04:50:00:00:00:02" }, }),
  ],
};

This is two nodes, x0 and y0, connected to leaf0, and two other nodes, x1 and y1, connected to leaf1. The two leaf switches are then connected to a spine switch. Connections are:

  • x0/eth1 <=> leaf0/swp1
  • y0/eth1 <=> leaf0/swp2
  • x1/eth1 <=> leaf1/swp1
  • y1/eth1 <=> leaf1/swp2
  • leaf0/swp3 <=> spine/swp1
  • leaf1/swp3 <=> spine/swp2

Am I right in assuming that swp1 and swp2 on leaf0 and leaf1 should be configured in access mode, as they are connected to machines, and that leaf0 swp3, leaf1 swp3, and spine swp1 and swp2 should be configured as trunks, as they are connected to switches?

This is what I have for configuration in Ansible at the moment for leaf0 and leaf1:

nv set interface swp1-3 bridge domain br_default         
nv set bridge domain br_default vlan 10,20
nv set interface swp1 bridge domain br_default access 10
nv set interface swp2 bridge domain br_default access 20
nv set interface swp3 bridge domain br_default vlan 10,20
nv set system hostname leaf0                  
nv config apply -y                                 

I’m seeing conflicting information about setting up spine to just trunk the two ports. So any help here would be appreciated. As would any pointers to documentation about this which is not aimed at setting up a data center and is 9238472 pages long.
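One reading of the Cumulus VLAN-aware bridge model, offered as a sketch rather than a confirmed answer for 5.12: a port placed in the bridge domain without an access VLAN acts as a trunk for the bridge's VLANs, so the spine may need nothing beyond bridge membership and the VLAN list. The helper name spine_nvue_commands is hypothetical, and it only prints the commands so they can be reviewed before running them on the switch.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) NVUE commands for a trunk-only spine.
# Assumes the spine ports and VLAN IDs match the leaf config in this post.
spine_nvue_commands() {
    ports="$1"   # NVUE port range, e.g. swp1-2
    vlans="$2"   # comma-separated VLAN IDs, e.g. 10,20
    echo "nv set interface ${ports} bridge domain br_default"
    echo "nv set bridge domain br_default vlan ${vlans}"
    echo "nv set system hostname spine"
    echo "nv config apply -y"
}

spine_nvue_commands "swp1-2" "10,20"
```

There are deliberately no access lines here: the assumption is that both spine ports only ever see tagged traffic from the leaves.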

The switches are all running cumulus 5.12.1.1000. I can be available for zoom if that’s easier.

edit: It looks like I can just read the canopy code to get all the vtysh commands to do this. So I will do that.

We “roughly” follow data center networking though.
Leaf means → Connected to edge devices (like a hypervisor) and spine switches.
Spine means → Connects to leaf switches and spine switches.

How exactly you set up the path through the network is up to you. To be specific, phys → access → trunk → access → phys is one implementation of connecting these nodes across the network.

You use nv set interface when you want it persistent (like if you’re actually configuring the switch), but there’s nothing stopping you from using raw ip link commands (which is similar to how canopy does it.)

Using raw ip link is probably easier, since its documentation is easier to find and better.

If you want to see all permutations of what’s possible on merge, it’s pretty much in here in the test cases, which tells you what interfaces and types are on each node.

https://gitlab.com/mergetb/portal/services/-/blob/main/pkg/realize/sfe/pathfinder_test.go?ref_type=heads

What’s stem in this context?

Stem is a specific XIR role meaning “VXLAN_Capable.”
The usage is deprecated, but I think Joe still uses it in his models.
With XIR capabilities now, you just put down the VXLAN Capable one instead.


i don’t know how to use capabilities.

generally, i let canopy handle any edge ports.

you don’t need to set up the vlans on the br_default or bridge because by default the bridges are vlan aware in 5.11 and up, so as long as the interfaces are added to the “bridge domain” the vlans will pass through the bridge.

on a leaf, canopy will set the vlans/vxlans as necessary except on the trunk ports which are upstream, and afaik canopy does not configure lacp/aggregated bond uplinks, so i usually do that manually.

spines are the same afaik. canopy will configure anything except a bond correctly, but if you name the bond correctly everything else does happen as it should.

This post was an exercise in setting up leaf/spine networking “the hard way” - by hand so I can understand how canopy/merge does it automatically. This is so I can extend it to multi-facility.


Did you explore bgp unnumbered within cumulus as well?

Nope. I did not complete the exercise.

To set up connectivity using VLANs:

on leaf0

bridge vlan add vid 100 dev swp1 pvid untagged   # host connection on swp1 on vlan 100
bridge vlan add vid 200 dev swp2 pvid untagged   # host connection on swp2 on vlan 200
bridge vlan add vid 100 dev swp3                 # trunked port to spine allows vlan 100
bridge vlan add vid 200 dev swp3                 # trunked port to spine allows vlan 200

on leaf1 (same as leaf0, since the x host is on swp1 and the y host is on swp2):

bridge vlan add vid 100 dev swp1 pvid untagged   # host connection on swp1 on vlan 100
bridge vlan add vid 200 dev swp2 pvid untagged   # host connection on swp2 on vlan 200
bridge vlan add vid 100 dev swp3                 # trunked port to spine allows vlan 100
bridge vlan add vid 200 dev swp3                 # trunked port to spine allows vlan 200

on spine:

# add vlans 100 and 200 to trunked ports swp1 and swp2
bridge vlan add vid 100 dev swp1
bridge vlan add vid 200 dev swp1 
bridge vlan add vid 100 dev swp2 
bridge vlan add vid 200 dev swp2 
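
The per-port repetition above can be generated with a small loop. This is only a sketch, not what canopy does: the function name trunk_vlan_commands is made up for illustration, and it echoes the bridge commands instead of executing them, so it can be checked without root.

```shell
#!/bin/sh
# Hypothetical helper: print one "bridge vlan add" line per (port, VLAN) pair
# for tagged trunk membership. Pipe the output to "sh" as root to apply it.
trunk_vlan_commands() {
    ports="$1"   # space-separated trunk ports, e.g. "swp1 swp2"
    vlans="$2"   # space-separated VLAN IDs, e.g. "100 200"
    for port in $ports; do
        for vid in $vlans; do
            echo "bridge vlan add vid ${vid} dev ${port}"
        done
    done
}

# Spine: both ports carry VLANs 100 and 200 tagged.
trunk_vlan_commands "swp1 swp2" "100 200"
```

The same function covers the leaf uplinks with `trunk_vlan_commands "swp3" "100 200"`; the access ports still need their own `pvid untagged` lines.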

It seems like all the swps must be bridged together for this to work. I would not have expected that: these are switches, so shouldn’t all swps already be connected? I am running nv set interface swp1-3 bridge domain br_default on all switches to do this. This sets up the bridge br_default and puts untagged vid 1 on all ports. :shrug:

edit: instead of using the nv command I can just make a bridge and add all ports to it.

ip link add name br0 type bridge
ip link set dev swp1 master br0 
ip link set dev swp2 master br0 
ip link set dev swp3 master br0 

This does not work.

[rvn@x0 ~]$ ssh x1 hostname  
y1                           

Sshing to x1 gets me to y1. No idea why this is.
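
One possible cause, offered as a guess rather than a diagnosis: a bridge created with a bare `ip link add ... type bridge` has VLAN filtering switched off by default, so the `bridge vlan` entries are stored but not enforced and every port ends up in one broadcast domain. Below is a sketch of the same bridge created VLAN-aware. The helper name is hypothetical, and it prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: print the commands that create a VLAN-aware bridge
# and attach the given ports. Nothing is executed; review, then run as root.
vlan_aware_bridge_commands() {
    bridge="$1"; shift
    # vlan_filtering 1 makes the kernel enforce per-port VLAN membership.
    echo "ip link add name ${bridge} type bridge vlan_filtering 1"
    for port in "$@"; do
        echo "ip link set dev ${port} master ${bridge}"
    done
    echo "ip link set dev ${bridge} up"
}

vlan_aware_bridge_commands br0 swp1 swp2 swp3
```

`ip -d link show br0` reports the current vlan_filtering value, which is a quick way to confirm or rule out this guess on the running switches.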

Config:

root@leaf0:mgmt:~# brctl show                                      
bridge name     bridge id               STP enabled     interfaces 
br0             8000.046000000001       no              swp1       
                                                        swp2       
                                                        swp3       
root@leaf0:mgmt:~# bridge vlan show                                
port              vlan-id                                          
swp1              100 Egress Untagged                              
swp2              200 Egress Untagged                              
swp3              100                                              
                  200                                              
br0               1 PVID Egress Untagged                           
root@leaf0:mgmt:~#                                                 
rvn@spine:mgmt:~$ sudo su                                         
root@spine:mgmt:/var/home/rvn# brctl show                         
bridge name     bridge id               STP enabled     interfaces
br0             8000.045000000001       no              swp1      
                                                        swp2      
root@spine:mgmt:/var/home/rvn# bridge vlan show                   
port              vlan-id                                         
swp1              100                                             
                  200                                             
swp2              100                                             
                  200                                             
br0               1 PVID Egress Untagged                          
root@spine:mgmt:/var/home/rvn#                                    
root@leaf1:mgmt:~# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.046000000002       no              swp1
                                                        swp2
                                                        swp3
root@leaf1:mgmt:~# bridge vlan show
port              vlan-id  
swp1              100 Egress Untagged
swp2              200 Egress Untagged
swp3              100
                  200
br0               1 PVID Egress Untagged
root@leaf1:mgmt:~# 

EDIT: If I ping from x1 to x0 it works. x0 suddenly understands how to get to x1. Hmm. Why is this, if I’ve set up static VLANs for traffic? Why does the switch seem to learn the correct path after that?

VXLANs seem to be working. I used /etc/network/interfaces to set this up. I stole the config from the NVIDIA site here: VXLAN Devices | Cumulus Linux 5.15.

The basic repository is updated with the files. The gist, though, is this file on leaf0 and leaf1:

...
auto swp1                                                           
iface swp1                                                          
    bridge-access 10                         
                                                                    
auto swp2                                                           
iface swp2                                                          
    bridge-access 20                                                
                                                                    
auto swp3                                    
iface swp3                                                          
                                                                    
auto vni10                                                          
iface vni10                                  
    bridge-access 10                                                
    mstpctl-bpduguard yes                                           
    mstpctl-portbpdufilter yes                                                            
    vxlan-id 10                                                                           
                                                                                          
auto vni20                                   
iface vni20                                                                               
    bridge-access 20                         
    mstpctl-bpduguard yes     
    mstpctl-portbpdufilter yes                                                            
    vxlan-id 20                              
                                                                                          
auto bridge          
iface bridge                                 
    bridge-ports swp1 swp2 swp3 vni10 vni20
    bridge-vlan-aware yes                                                                 
    bridge-vids 10 20
    bridge-pvid 1
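
The two vni stanzas differ only in the ID, so they can be templated. This is a sketch under the assumption that the VNI always equals the VLAN ID, as it does in the file above. The function name vni_stanza is hypothetical, and the output still has to be placed into /etc/network/interfaces by hand.

```shell
#!/bin/sh
# Hypothetical helper: print an ifupdown2 stanza for one VXLAN device.
# Assumes VNI == VLAN ID and the same STP settings as the file above.
vni_stanza() {
    id="$1"
    cat <<EOF
auto vni${id}
iface vni${id}
    bridge-access ${id}
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id ${id}
EOF
}

vni_stanza 10
echo
vni_stanza 20
```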

And a similar setup on spine, minus swp3:

auto swp1
iface swp1

auto swp2
iface swp2

auto vni10
iface vni10
    bridge-access 10
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id 10

auto vni20
iface vni20
    bridge-access 20
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id 20

auto bridge
iface bridge
    bridge-ports swp1 swp2 vni10 vni20
    bridge-vlan-aware yes
    bridge-vids 10 20
    bridge-pvid 1

I still don’t fully understand this (mstpctl-bpduguard? mstpctl-portbpdufilter?) but it seems to work. Going by the names, are these spanning-tree BPDU guard and BPDU filter settings on the VXLAN ports?