Testbed Support For Mobility

This post explores various ways in which mobility can be supported in the Merge testbed setting.

Concepts

Generally speaking, a node in an experiment is mobile if at time t it is at location a and at time t+1 it is at location b. What the meaning of location is for a given node is a matter of perspective. Some nodes are aware of their geographic location through GPS or similar equipment. Some nodes are aware of their location relative to a base station or access point, through transceivers and protocols that allow for signal strength measurement. Some nodes have a perspective of location based on what they can observe through network measurement techniques such as topology discovery. Some nodes have physical sensing capabilities such as cameras, accelerometers and gyroscopes that give a perspective of location.

For a testbed to support mobility, it must be aware of how nodes within the testbed perceive location and provide experiments with an API that allows that perception to be controlled.

Here we’ll cover the following perspectives of location and how they can be controlled by a testbed facility.

  • Link layer network topology location
  • Physical layer network topology location
    • Wifi, WPAN, ZWave, Zigbee/Xbee, Nanobeam, Photonic Switching …
  • GPS location
  • Physical sensors

Node-Based vs Network-Based Techniques

All of the location control techniques that follow can be broadly categorized as either node-based or network-based. Node-based techniques control a node’s perspective of location by manipulating the node itself. Network-based techniques control a node’s perspective of location by manipulating the network a node is connected to.

Techniques

Link Layer Network Topology Location

We’ll start with the most basic case. Consider a simple topology:

import mergexp as mx
from mergexp.net import capacity, latency
from mergexp.unit import mbps

net = mx.Topology('mobile0')

r = net.device('router')
net0 = [net.device(name) for name in ['a', 'b']]
net1 = [net.device(name) for name in ['c', 'd']]
net.connect([r] + net0, capacity == mbps(100))
net.connect([r] + net1, capacity == mbps(10))

mx.experiment(net)

Let’s consider the above as our initial topological state at time t_0. Now suppose that at some time t_k, for k>0, the experiment requires c to move to lan0 (the 100 Mbps LAN, net0 in the code above). In static terms the topology would then be the following.

import mergexp as mx
from mergexp.net import capacity, latency
from mergexp.unit import mbps

net = mx.Topology('mobile0')

r = net.device('router')
net0 = [net.device(name) for name in ['a', 'b', 'c']]
net1 = [net.device(name) for name in ['d']]
net.connect([r] + net0, capacity == mbps(100))
net.connect([r] + net1, capacity == mbps(10))

mx.experiment(net)

Before we even begin to think about how to implement this very simple case of mobility, a node moving from one part of the network to another, we must address how this will be expressed by experimenters.

Expression

The following is a simple and pragmatic proposal for structurally dynamic experiments. In the example above we have two forms of a network at times t_0 and t_k. There are several factors that motivate the proposed mechanism of expression.

  1. Define the variation space up front so the testbed can allocate the necessary node and network assets to support the structural dynamics.
  2. Do not define how the dynamics will unfold up front; leave that up to experiment orchestration. Variation in the structure of the network may need to be event-based and so cannot always be determined in advance. Experimenters may also need to introduce structural changes incrementally while constructing an experiment, so it should be easy to do so on a running experiment without re-defining the experiment for each change in the way structurally dynamic events are introduced.
  3. Preserve reproducibility. The mechanism of describing the space in which a topology can evolve must produce a set of constraints sufficient for the experiment to be reproduced.

import mergexp as mx
from mergexp.net import capacity, latency
from mergexp.unit import mbps

# The first variant of the topology
t0 = mx.Topology('mobile0').tag('t0')

# Nodes in the topology
r = t0.device('router')
[a, b] = [t0.device(name) for name in ['a', 'b']]
[c, d] = [t0.device(name) for name in ['c', 'd']]

# Make connections between nodes and tag them for easy access later
t0.connect([r, a, b], capacity == mbps(100)).tag('location 0')
t0.connect([r, c, d], capacity == mbps(10)).tag('location 1')

# The second variant of the topology
# Migrate node c from location 1 to location 0
t1 = t0.clone().tag('t1')
c = t1.device('c')
t1.connection('location 1').disconnect(c)
t1.connection('location 0').connect(c)

mx.experiment([t0, t1])

Here we have defined the two topologies presented in the sections above in a single Merge experiment model. Because both variations are given, the Merge realization engine has all the information it needs to allocate sufficient resources to support all variations of the model. Note that with this mechanism nodes can be added or removed as well; the realization engine will see this and ensure that an all-encompassing union set is allocated for the experiment.

What’s important here is that the realization engine is able to identify how migrations can possibly take place (node-based or network-based) and ensure that the node and network resources allocated for the experiment are in a position to support every variation defined in the model. So instead of realizing a single topology, we are realizing the union of a set of topologies and ensuring that the link constraints hold in each defined variant.
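The union-set idea can be illustrated with plain Python data structures. This is a minimal sketch and not the Merge realization engine: the `Variant` class and `allocation_union` function are hypothetical names, and each connection is reduced to a tag, a member set and a capacity in Mbps.

```python
from dataclasses import dataclass, field


@dataclass
class Variant:
    """Hypothetical stand-in for one tagged topology variant."""
    tag: str
    # connection tag -> (set of member node names, capacity in Mbps)
    connections: dict = field(default_factory=dict)

    def nodes(self):
        # Every node that appears on at least one connection in this variant
        return set().union(*(members for members, _ in self.connections.values()))


def allocation_union(variants):
    """Return the nodes and (node, connection) attachments needed by any variant."""
    nodes = set()
    attachments = set()
    for variant in variants:
        nodes |= variant.nodes()
        for conn, (members, _) in variant.connections.items():
            attachments |= {(node, conn) for node in members}
    return nodes, attachments


t0 = Variant('t0', {
    'location 0': ({'router', 'a', 'b'}, 100),
    'location 1': ({'router', 'c', 'd'}, 10),
})
t1 = Variant('t1', {
    'location 0': ({'router', 'a', 'b', 'c'}, 100),
    'location 1': ({'router', 'd'}, 10),
})

nodes, attachments = allocation_union([t0, t1])
```

In this sketch node c ends up with two possible attachments, one per location, which is the kind of information an allocator would need in order to place c where both can be realized.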

Now that we have defined how to express the potential for mobility within an experiment, we need to provide a way for experiments to actually enact the structural changes in a topology that provide the perspective of mobility. The idea here is to provide an API similar to how dynamic link modification in Merge currently works. That is, we:

  • Provide an API for enacting changes in a topology.
  • Provide a command line client for using that API from experiment development containers.
  • Provide API documentation for advanced users to write software that interacts with the API.
  • Provide Orchid orchestration engine agents that use the API.

So where we have commands like

moacmd set wan delay 47ms

for dynamic link modification from the moacmd tool, we’ll now have commands like

topocmd set topology t1

Here, t1 references the topology tag in the experiment definition above.
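What a topology switch has to enact is the difference between the current variant and the target variant. The following is a rough sketch of that computation, assuming each variant can be represented as a mapping from connection tag to member set. The `transition` function is hypothetical and not part of any Merge tool.

```python
def transition(current, target):
    """Compute the detach/attach steps that turn `current` into `target`.

    Both arguments map a connection tag to the set of nodes on that connection.
    """
    detach, attach = [], []
    for conn in sorted(set(current) | set(target)):
        before = current.get(conn, set())
        after = target.get(conn, set())
        # Nodes that leave this connection, then nodes that join it
        detach += [(node, conn) for node in sorted(before - after)]
        attach += [(node, conn) for node in sorted(after - before)]
    return detach, attach


t0 = {'location 0': {'router', 'a', 'b'}, 'location 1': {'router', 'c', 'd'}}
t1 = {'location 0': {'router', 'a', 'b', 'c'}, 'location 1': {'router', 'd'}}

detach, attach = transition(t0, t1)
print(detach)  # [('c', 'location 1')]
print(attach)  # [('c', 'location 0')]
```

Because the steps are derived from the two declared variants, the same switch can be replayed in either direction, which fits the reproducibility goal stated earlier.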

Implementation

There are several ways the migration of node c from one part of the network to another in our running example can take place. We’ll cover the following two here:

  • virtual link migration
  • virtual machine migration

Virtual Link Migration

Virtual link migration means taking the virtual link that connects a node into the topology and moving it to a different place in the virtual network. In the Merge testbed setting, experiment topologies are mostly implemented through virtual overlay networks. Some exceptions exist for switched photonic networks, but we won’t be covering those here. The underlay/overlay split decouples the experiment topology from the underlying physical testbed topology. In some cases physical layer protocols are even tunneled over an underlay network to specialized virtual NICs running on hypervisors, giving nodes the perspective of a certain type of physical connection (this point will become important later, so keep it in mind). The point at present is that the overlay is very flexible, and we can migrate links almost arbitrarily within an experiment topology.

In the examples above, all nodes are connected to a parameterized link. This means that the traffic is going through a network emulator. In this case the network emulator itself can be used to implement the topological change.

In Merge, network emulation works through a BGP-based overlay protocol called EVPN. EVPN works by advertising layer 2 reachability information within a virtual network segment identified by an integer tag called a virtual network identifier (VNI). For example, MAC 00:00:00:00:00:AA is reachable on VNI 47 through a router at address 10.99.0.1. What the emulator does is play man in the middle with EVPN. For a given point-to-point link connecting nodes X and Z that goes through an emulator, the testbed will create two independent overlay segments (distinct VNIs), one from the emulator to each node. Then the emulator will advertise that X’s MAC is reachable through the emulator on Z’s VNI and vice versa. This way traffic between the two flows through the emulator.
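The advertisement pattern for a single emulated point-to-point link can be written down as a small data model. This is only an illustration of the description above, not Merge or EVPN code: the `Advertisement` class, the `emulated_link` function, the second MAC address and VNI 48 are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    """Simplified model of an EVPN MAC reachability advertisement."""
    mac: str       # MAC address being advertised
    vni: int       # virtual network identifier of the segment
    next_hop: str  # underlay address traffic for this MAC should be sent to


def emulated_link(x_mac, z_mac, x_vni, z_vni, emulator_addr):
    """Advertisements the emulator originates for a link X <-> Z.

    X and Z sit on separate VNIs, so each only learns the other's MAC
    through the emulator, which places the emulator on the data path.
    """
    return [
        Advertisement(mac=x_mac, vni=z_vni, next_hop=emulator_addr),
        Advertisement(mac=z_mac, vni=x_vni, next_hop=emulator_addr),
    ]


adverts = emulated_link(
    x_mac='00:00:00:00:00:aa', z_mac='00:00:00:00:00:bb',
    x_vni=47, z_vni=48, emulator_addr='10.99.0.1',
)
for advert in adverts:
    print(advert)
```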

Thus, link migration in this setting is a simple matter of moving a few VNI segments and EVPN advertisements around. A node on the link will then appear to enter the network at a different place. Note that because EVPN is a routed framework for creating isolated layer 2 networks (unlike VLAN) it is trivial to extend this technique into the distributed testbed setting as the underlay network is simply a normal layer 3 network that can transit regular vanilla BGP (or other routing protocol) routers.
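To make "moving a few advertisements around" concrete, here is a sketch that computes which advertisements would be withdrawn and which announced when c moves between locations. It assumes a simplified model that the source does not spell out: every endpoint has its own VNI to the emulator, and the emulator advertises each endpoint’s MAC on the VNI of every peer in the same emulated segment. The endpoint names (with r0 and r1 standing for the router’s two interfaces), the VNI numbers and both functions are hypothetical.

```python
def advertisements(segments, vni_of):
    """(endpoint, vni) pairs: each endpoint's MAC advertised on every peer's VNI."""
    return {
        (node, vni_of[peer])
        for members in segments.values()
        for node in members
        for peer in members
        if peer != node
    }


def migrate(segments, endpoint, src, dst):
    """Return a copy of `segments` with `endpoint` moved from `src` to `dst`."""
    moved = {name: set(members) for name, members in segments.items()}
    moved[src].remove(endpoint)
    moved[dst].add(endpoint)
    return moved


segments = {
    'location 0': {'r0', 'a', 'b'},
    'location 1': {'r1', 'c', 'd'},
}
vni_of = {'r0': 100, 'a': 101, 'b': 102, 'r1': 103, 'c': 104, 'd': 105}

before = advertisements(segments, vni_of)
after = advertisements(migrate(segments, 'c', 'location 1', 'location 0'), vni_of)

withdrawn = before - after  # advertisements the emulator would withdraw
announced = after - before  # advertisements the emulator would newly originate
```

Only advertisements involving c or c’s VNI change in this model. The rest of the overlay is untouched, which is what makes the migration cheap.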

Without loss of generality, this technique also extends to complex topologies that are implemented by a network emulator and have testbed nodes connected at the edges, not just simple point-to-point links and multi-point links (LANs). This technique is already being used on the Merge-based Steam testbed to represent optical switching in emulated networks, but has not yet been integrated into the mainline Merge emulation codebase.

Links are modeled

Virtual Node Migration

The virtual node migration technique takes the concept of mobility a bit more literally. In this setting we assume the experiment nodes that are migrating are virtual machines. This is another reason for the realization engine to be privy to potential structural dynamics in an experiment. If the mobility mechanism must be implemented through node migration, the realization engine must know this so the nodes are constrained to VM allocations and the proper configuration information is passed down to hypervisors to ensure that the resulting VMs are in fact migratable.

The hypervisor technology that Merge uses (QEMU/KVM) has support for migrating virtual machines out of the box. We would just need to add the hooks to migrate virtual machines in response to topology modification calls. The hypervisor technologies in play even support live migration. This capability is commonly used in the cloud, but using it in an experimentation setting would require investigation as to what artifacts may be introduced during a migration period.
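One possible hook is QEMU’s machine protocol (QMP), which exposes `migrate` and `query-migrate` commands as JSON messages. The sketch below only builds the message sequence and does not open a connection. How Merge would actually drive migrations is not specified here, so the function name, the destination address and the port are assumptions, and the destination QEMU is assumed to have been started listening for an incoming migration on that address.

```python
import json


def qmp_migration_commands(dest_host, dest_port):
    """Build the QMP messages that ask a running QEMU to migrate its guest.

    The messages are returned as JSON strings, ready to be written to the
    source VM's QMP socket one at a time.
    """
    commands = [
        # QMP requires capability negotiation before other commands
        {'execute': 'qmp_capabilities'},
        # Start migrating the guest to the destination hypervisor
        {'execute': 'migrate',
         'arguments': {'uri': f'tcp:{dest_host}:{dest_port}'}},
        # Poll migration status; a caller would repeat this until done
        {'execute': 'query-migrate'},
    ]
    return [json.dumps(command) for command in commands]


for line in qmp_migration_commands('10.99.0.2', 4444):
    print(line)
```

A topology modification call could trigger a sequence like this for each node that changes hypervisors, then record how long `query-migrate` reports the migration took so that any artifacts can be correlated with experiment data.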

Physical Layer Network Topology Location

For nodes that can perceive, to a certain extent, where they are through physical carrier signals such as RF or OF (optical frequency), modifying the EVPN virtual overlay they are connected through is not sufficient to control their physical perspective of location. There are two basic ways that we can control the physical carrier signal perspective:

  • Controlling a physical signal that is sent to the device through equipment like a software defined radio (SDR) operating in free space or over wave guides.
  • Implementing the device as a virtual machine, implementing the physical transceiver that allows it to interact with the physical carrier network as a virtual device, and implementing the physical characteristics of the network it is connected to as another layer in the network overlay stack. This is the next logical step from what EVPN/VXLAN does now: instead of embedding L2 in L3, we are embedding L1 in L2 in L3, while still transiting as normal over an L3 underlay.

The former technique is limited to the physical reach of the controllable emitting device in free space or the stretch and loss profiles of the wave guides it’s attached to. From a scaling perspective this can be fairly limiting, but does present the opportunity for “real” devices on “real” physical networks of interest to be a part of experiments. The latter technique has the same scaling properties as the standard emulated networks in Merge today and can easily stretch across testbed facilities.