This document describes the use of the Merge testbed platform strictly for operating testbed facilities. A facility can be anything from a single computer to thousands of interconnected heterogeneous resources. A typical Merge deployment includes a centralized portal that presides over a set of testbed facilities. The Merge portal
- manages projects, users, and experiments
- defines how an experiment is expressed
- calculates how experiments map onto testbed facility resources
- orchestrates the materialization of experiments across testbed facilities
- provides project and user level collaborative compute environments (XDCs)
- provides access to experiments through on-demand VPNs from XDCs
We refer to these capabilities collectively as experiment space. Everything below experiment space that actually makes an experiment tick (the provisioning systems, isolated network synthesis, imaging systems, etc.) we refer to as resource space.
This document delineates a path forward for several Merge use cases that revolve strictly around resource space:
- Resource providers that want to simply hook into an existing portal.
- Experimenters who want a bring-your-own-machine (or set of machines) provisioning model to support their experimentation.
- Resource providers that want to use Merge resource space automation, but hook into a different experiment space.
MARS is a suite of technologies that allows a collection of interconnected resources to be automatically provisioned into isolated experiment fragments. We define an experiment fragment as a collection of devices and interconnections that collectively comprise a part of an experiment. An experiment materialization is a collection of experiment fragments. MARS is composed as a layered system of services with well-defined interfaces and a top-level materialization API; these services cooperate to create experiment fragments within a testbed facility. At a high level, the layers are:
- Commander: provides the top-level MARS API for fragment management and an internal API for driver registration.
- Driver: presides over a set of resources. Registers the resources it is responsible for with the commander and takes delegated fragment API calls from the commander. Translates fragment management requests into directed graphs of actions that must be performed to carry out the task.
- Rex: monitors task graphs created by drivers and carries out the specified tasks by using Merge technology stack components.
- Tech Stack: a collection of services that implement focused testbed tasks. Examples include experiment DHCP/DNS, virtual network synthesis, device configuration, and OS imaging.
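The layering above can be illustrated with a minimal sketch. It is not the MARS implementation: the class names, method names, and the image/network/configure task ordering are all assumptions made for illustration. It only shows the shape of the flow, in which a driver registers its resources with the commander, the commander delegates a fragment request to the owning driver, the driver emits a directed graph of tasks, and a Rex-like executor runs the tasks in dependency order.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Driver:
    """Hypothetical driver: owns resources and plans fragment work as a task graph."""
    name: str
    resources: set

    def plan(self, devices):
        # Map each task to the set of tasks that must finish before it.
        # The three task kinds and their ordering are illustrative assumptions.
        graph = {}
        for dev in devices:
            graph[f"image:{dev}"] = set()
            graph[f"network:{dev}"] = {f"image:{dev}"}
            graph[f"configure:{dev}"] = {f"network:{dev}"}
        return graph


@dataclass
class Commander:
    """Hypothetical commander: driver registry plus fragment delegation."""
    owners: dict = field(default_factory=dict)  # resource name -> Driver

    def register(self, driver):
        for resource in driver.resources:
            self.owners[resource] = driver

    def materialize(self, devices):
        # Group the requested devices by owning driver, then delegate planning.
        by_driver = {}
        for dev in devices:
            by_driver.setdefault(self.owners[dev].name, []).append(dev)
        return {
            name: self.owners[devs[0]].plan(devs)
            for name, devs in by_driver.items()
        }


def rex_run(graph, execute):
    """Rex-like executor: run every task after all of its prerequisites."""
    done = []
    for task in TopologicalSorter(graph).static_order():
        execute(task)
        done.append(task)
    return done
```

In this sketch the `execute` callback stands in for calls into tech stack services such as imaging or network synthesis.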
Other Experiment Spaces
This section describes how MARS may be interfaced with an experiment space other than the Merge experiment space.
In the end-to-end Merge setting, resources are proactively managed by resource providers through commissioning and decommissioning. When a provider wants resources to become available to experimenters, they commission the resources to a resource pool of their choice (or creation) in experiment space through the Merge portal API. From that point forward, experimenters who have access to that pool have access to the resources. Likewise, when resources are to be pulled back, they can be decommissioned or deactivated by the provider. The former removes the resource from the facility model completely and is intended for situations like device failure or end of life. The latter is intended for more transient actions like device maintenance or temporary duty cycles off-testbed.
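The difference between decommissioning and deactivating can be sketched as a small state model. This is an illustrative assumption about how such a lifecycle could be represented, not the Merge portal API: `FacilityModel`, its method names, and the `State` values are hypothetical.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"


class FacilityModel:
    """Hypothetical model of resources commissioned into named pools."""

    def __init__(self):
        self.pools = {}  # pool name -> {resource name: State}

    def commission(self, pool, resource):
        # Make the resource available to experimenters with access to the pool.
        self.pools.setdefault(pool, {})[resource] = State.ACTIVE

    def deactivate(self, pool, resource):
        # Transient: the resource stays in the model but cannot be allocated.
        self.pools[pool][resource] = State.INACTIVE

    def activate(self, pool, resource):
        # Return a deactivated resource to service.
        self.pools[pool][resource] = State.ACTIVE

    def decommission(self, pool, resource):
        # Permanent: remove the resource from the model completely.
        del self.pools[pool][resource]

    def available(self, pool):
        return sorted(
            name
            for name, state in self.pools.get(pool, {}).items()
            if state is State.ACTIVE
        )
```

A deactivated resource can come back with `activate`, while a decommissioned one has to be commissioned again from scratch.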
Merge is unusual in this manner of proactive, distributed resource management. Some other systems that wish to use MARS may employ a more reactive model, where a resource broker or centralized entity supplies a resource request and the resource provider gives back a response with the resources it can provide. For this model we recommend a middleware approach where a broker is placed between the Merge Commander and the experiment space callers. On the MARS side, this broker would implement the Merge Commander API. On the experiment space side, the broker would implement whatever APIs are necessary for the callers in play and translate the calls from those APIs into Merge Commander API calls. The Merge broker could optionally implement the Merge Commissioning API to manage the availability of resources, or take on the resource management APIs of the foreign experiment space.
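A minimal sketch of the reactive broker pattern follows. Everything here is hypothetical: `RecordingCommander` is a stand-in for a real Commander API client, and the `materialize`/`dematerialize` method names, the count-based request, and the response shape are assumptions chosen only to show where translation happens.

```python
class RecordingCommander:
    """Stand-in for a Commander API client; it only records the calls it gets."""

    def __init__(self):
        self.calls = []

    def materialize(self, resources):
        self.calls.append(("materialize", list(resources)))

    def dematerialize(self, resources):
        self.calls.append(("dematerialize", list(resources)))


class Broker:
    """Hypothetical broker between a foreign experiment space and the Commander."""

    def __init__(self, commander, inventory):
        self.commander = commander
        self.free = set(inventory)

    def request(self, count):
        # Foreign-side call: grant as many free resources as can be provided.
        granted = sorted(self.free)[:count]
        if granted:
            self.free -= set(granted)
            # MARS-side call: translate the grant into a Commander request.
            self.commander.materialize(granted)
        return {"granted": granted}

    def release(self, resources):
        # Foreign-side call: give resources back and tear the fragment down.
        self.commander.dematerialize(resources)
        self.free |= set(resources)
```

The foreign side only ever sees `request` and `release`; the Commander only ever sees its own API, which keeps both sides unaware of each other.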
Authentication and Authorization
There are two planes of authentication and authorization for MARS.
- Merge Commander API calls.
- Merge Commissioner API calls.
These two APIs are handled differently.
Merge Commander API
This API is protected by TLS client certificate authentication. Any client that bears a certificate that warrants authentication is authorized to use the Commander API in its entirety, i.e., authentication and authorization are combined. In an E2E Merge system, when a new facility is stood up, the resource provider provides the Merge portal it is a part of with a TLS client cert that allows that Merge portal to use the facility through the Commander. This is done securely via the Merge API, which has an OAuth2-based authentication mechanism in place. Only a maintainer of the facility is allowed to provision certs through the Portal API.
In the foreign experiment space setting, a client certificate would be given to the Merge broker, either through a secure API or through some sort of administrative out-of-band process. The broker would then have full access to the Commander API.
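As an illustration of client certificate authentication in general, the sketch below builds the two TLS contexts involved using Python's standard `ssl` module. It is not how the Commander itself is implemented; the function names and the file path parameters are hypothetical, and the paths are optional only so the contexts can be constructed without real key material.

```python
import ssl


def commander_server_context(ca_file=None, cert_file=None, key_file=None):
    """Server side: refuse any client that does not present a trusted certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED  # client certificate is mandatory
    if ca_file:
        # CA that signed the client certificates handed to portals or brokers.
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def broker_client_context(ca_file=None, cert_file=None, key_file=None):
    """Client side: verify the server and present the provisioned certificate."""
    # PROTOCOL_TLS_CLIENT enables hostname checking and CERT_REQUIRED by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # The client certificate provisioned for the portal or broker.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

Because possession of a trusted certificate grants access to the whole API, how the certificate is distributed matters as much as the TLS configuration.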
Merge Commissioner API
Implementing Merge style resource management is optional for a Merge broker. The discussion below provides high level guidance for doing so.
The Merge Commissioner is implemented as a containerized micro-service that runs inside a Merge portal. Unlike the Commander API, which relies on TLS client certificates, the Commissioner sits behind the Merge Portal API, which performs authentication via the same OAuth2-based mechanism described in the previous section for certificate provisioning. Once a user is authenticated by an identity provider the Portal trusts, their request is routed through the portal policy layer, which checks whether the user making a (de)commission request on behalf of a facility is authorized to do so. This policy layer is designed similarly to Linux security modules.
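One way to picture a policy layer in the style of Linux security modules is as a chain of hooks, each of which can veto a request. The sketch below is an assumption about the general shape only, not the portal's policy code: `PolicyLayer`, `maintainer_only`, and the rule that only facility maintainers may (de)commission are all hypothetical.

```python
class PolicyLayer:
    """Hypothetical hook chain: a request passes only if no hook denies it."""

    def __init__(self):
        self.hooks = []

    def register(self, hook):
        self.hooks.append(hook)

    def authorize(self, user, action, facility):
        # With no hooks registered, all() of an empty sequence is True,
        # so every request is allowed; real deployments would register hooks.
        return all(hook(user, action, facility) for hook in self.hooks)


def maintainer_only(maintainers):
    """Build a hook that limits (de)commission requests to facility maintainers."""

    def hook(user, action, facility):
        if action in ("commission", "decommission"):
            return user in maintainers.get(facility, set())
        return True  # this hook has no opinion about other actions

    return hook
```

The hook runs only after authentication has succeeded, so `user` is assumed to be an identity the portal already trusts.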
In the foreign experiment space setting, (de)commission requests would not be going to the Merge portal, but rather to the broker. The Merge Commissioner service is a modular component of the Merge portal and does not necessarily need to run within a Portal to function. Thus a potential Merge broker could use this Commissioning service as a building block, if so desired, for reactive resource management. Authentication and authorization would be up to the implementer. For simple scenarios, client-side TLS certificate authentication is sufficient. Authentication and authorization could also be implemented to be consistent with whatever the experiment space side is doing.
Merge has its own model of how experiment elements are described, how resources are described, and how experiment elements map onto resources. When hooking MARS up to a foreign experiment space that has its own notion of experiment elements, translation will need to take place at the broker between the foreign model and the native model.
The Merge experiment elements are defined in the Protocol Buffers version 3 interface description language (IDL). As a result, they have a static structure, a fixed set of types, a defined wire protocol, and widely used programming language bindings.
In addition to experiment elements, Merge also defines a set of message envelopes that carry materialization fragment requests. These envelopes describe the mapping of experiment elements onto testbed facility resources. These envelopes are also defined in Protocol Buffers V3 IDL.
When a Merge broker receives requests to provision resources on behalf of a foreign experiment space, it will need to translate the experiment element definitions into native Merge element definitions and wrap them in Merge materialization fragment request envelopes according to the element-to-resource mappings defined in the foreign provisioning request.
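The translation step can be sketched as follows. The real element and envelope types are generated from the Merge Protocol Buffers definitions; the dataclasses here are hypothetical stand-ins, and the foreign request layout (`hosts`, `hostname`, `os`, `machine`) is invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a native Merge experiment element."""
    name: str
    image: str


@dataclass
class FragmentRequest:
    """Hypothetical stand-in for a materialization fragment request envelope."""
    resource: str  # facility resource the element is mapped onto
    element: Node


def translate(foreign_request):
    """Convert a foreign provisioning request into native fragment requests.

    Each foreign host entry carries both the element description and the
    resource it should land on, so one envelope is produced per host.
    """
    return [
        FragmentRequest(
            resource=host["machine"],
            element=Node(name=host["hostname"], image=host["os"]),
        )
        for host in foreign_request["hosts"]
    ]
```

For example, a foreign request naming host `a` with OS `debian` on machine `m7` would yield a single envelope that maps a `Node` named `a` onto resource `m7`.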