Portal V1 Testing Environment

This post describes how to get set up with a Portal v1 testing environment.

Setup OpenShift Code Ready Containers

Start here

This requires a Red Hat login, but not any sort of subscription. Download the CodeReady Containers archive for your OS along with the pull secret.

Extract the CRC archive

tar xf crc-linux-amd64.tar.xz
cd crc-linux-1.20.0-amd64

Configure CRC to have enough resources to run a Merge Portal. Here we have 32 cores, 64G of memory, and 200G of disk; you can probably get away with about half of this.

./crc config set cpus 32
./crc config set memory 65536
./crc config set disk-size 200
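As a rough sketch of scaling these values down: crc takes memory in MiB and disk size in GiB, so halving the configuration above works out as below. The helper name gib_to_mib is made up for this example, and the commands are only printed so they can be reviewed before being run.

```shell
# gib_to_mib is a hypothetical helper: convert GiB to the MiB value crc expects.
gib_to_mib() {
  echo $(( $1 * 1024 ))
}

# Half of the configuration shown above: 16 cores, 32G memory, 100G disk.
CPUS=16
MEMORY_MIB=$(gib_to_mib 32)
DISK_GIB=100

# Print the commands rather than running them.
echo "./crc config set cpus $CPUS"
echo "./crc config set memory $MEMORY_MIB"
echo "./crc config set disk-size $DISK_GIB"
```

Whether half is actually enough depends on what you run on the portal, so treat these numbers as a starting point.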

Stash the pull secret you downloaded somewhere and tell CRC where to find it

./crc config set pull-secret-file <path-to-pull-secret>

Setup and start CRC

./crc setup
./crc start
eval $(./crc oc-env)

Display the CRC credentials for later use

./crc console --credentials

First log in to the ‘developer’ account and grab the token.

oc login -u developer -p developer
DEVTOKEN=$(oc whoami -t)

Now log in with the admin (kubeadmin) credentials displayed previously, using the kubeadmin password in place of <token>

oc login -u kubeadmin -p <token> https://api.crc.testing:6443

Save the OKD registry host in a variable for later use

HOST=$(oc get route default-route -n openshift-image-registry --template='{{ .spec.host }}')

Build the Portal

The installer can be made in a self-contained, self-extracting way, but that’s not the goal here. We want to use the logic of the installer to get things started, but feed it the artifacts we build to deploy instead of ones it grabs from CI.

First clone the portal code

git clone git@gitlab.com:mergetb/portal/services
cd services
git checkout ry-v1

Install the ymk build tool

curl -OL https://gitlab.com/mergetb/devops/ymk/-/jobs/940984772/artifacts/raw/ymk
chmod +x ymk
sudo cp ymk /usr/local/bin

Build the portal code

export DOCKER=podman REGISTRY=$HOST REGISTRY_PATH=merge TAG=latest
ymk tools.yml
ymk build.yml
ymk containers.yml
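One thing that can go wrong here is that REGISTRY ends up empty, for example if $HOST was set in a different shell. A minimal sketch of a guard for that, assuming nothing about ymk itself; require_env is a made-up helper name, not part of the portal tooling.

```shell
# require_env is a hypothetical helper: fail if any named variable is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage, before running the ymk steps:
#   require_env DOCKER REGISTRY REGISTRY_PATH TAG && ymk containers.yml
```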

Build and Configure the Installer

First clone the installer code

git clone git@gitlab.com:mergetb/portal/install

Make sure you have the tools needed

sudo dnf install -y makeself device-mapper-devel btrfs-progs-devel gpgme-devel

Build

go build

Make a directory called build and copy the installer there

mkdir build
cp installer build/

Go to the build directory, create the containers/merge and containers/xdc directories, and copy the containers you built in the portal build step into containers/merge.

cd build
mkdir -p containers/{merge,xdc}
cp <path-to-portal-code>/build/*.tar containers/merge/

If you are using a recent build of the portal with compressed containers then you’ll need to uncompress the xz archives in place

cd containers/merge
for x in *; do unxz "$x"; done
cd ../..
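If the directory holds a mix of compressed and uncompressed archives, a slightly more careful variant only touches files ending in .xz. This is a sketch, and unxz_dir is a made-up helper name.

```shell
# unxz_dir is a hypothetical helper: decompress every .xz file in a directory,
# leaving files that are already plain .tar untouched.
unxz_dir() {
  for f in "$1"/*.xz; do
    # If nothing matches, the glob stays literal; skip it.
    [ -e "$f" ] || continue
    unxz "$f"
  done
}

# Usage from the build directory:
#   unxz_dir containers/merge
```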

Move xdc containers to the expected location

mv containers/merge/{ssh-jump,xdc-base,wgd}.tar containers/xdc/

Copy the example installer config to the build directory

cp ../example/portal.yml .

Edit portal.yml, specifically

  • change kubeconfig to the location of your CRC kubernetes config. This should be ~/.crc/machines/crc/kubeconfig on *nix type machines.
  • change openshift.registry.password to the content of the DEVTOKEN environment variable created earlier. Note that this token expires over time, so if you use it at a later time, you may need to refresh the value from oc whoami -t.

Run the preflight.sh script in the top level installer directory.

../preflight.sh

Your install/build directory should now look like this

[install/build]$ tree
.
├── containers
│   ├── merge
│   │   ├── apiserver.tar
│   │   ├── cred.tar
│   │   ├── git-server.tar
│   │   ├── identity.tar
│   │   ├── materialize.tar
│   │   ├── mergefs.tar
│   │   ├── model.tar
│   │   ├── ops-init.tar
│   │   ├── pops.tar
│   │   ├── realize.tar
│   │   ├── step-ca.tar
│   │   ├── wgsvc.tar
│   │   └── xdc.tar
│   ├── merge-auth
│   │   ├── kratos.tar
│   │   ├── postgres.tar
│   │   └── user-ui.tar
│   └── xdc
│       ├── ssh-jump.tar
│       ├── wgd.tar
│       └── xdc-base.tar
├── installer
└── portal.yml
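A quick way to compare your directory against the layout above is a small check like this one. It is only a sketch: check_layout is a made-up helper, and it verifies that each container directory has at least one .tar file rather than checking for every archive by name.

```shell
# check_layout is a hypothetical helper: report anything missing from the
# build directory layout shown above. Returns non-zero if something is missing.
check_layout() {
  root=$1
  status=0
  for path in installer portal.yml; do
    if [ ! -e "$root/$path" ]; then
      echo "missing: $path" >&2
      status=1
    fi
  done
  for dir in merge merge-auth xdc; do
    # Each container directory should hold at least one .tar archive.
    found=0
    for tarfile in "$root/containers/$dir"/*.tar; do
      [ -e "$tarfile" ] && found=1
    done
    if [ "$found" -eq 0 ]; then
      echo "no .tar files in containers/$dir" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage from install/build:
#   check_layout .
```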

Wait until the openshift-controller-manager has started

oc get pods -n openshift-controller-manager
NAME                       READY   STATUS              RESTARTS   AGE
controller-manager-vbsw9   0/1     ContainerCreating   0          36s

some time passes …

oc get pods -n openshift-controller-manager
NAME                       READY   STATUS    RESTARTS   AGE
controller-manager-vbsw9   1/1     Running   0          89s
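Instead of re-running the command by hand, the wait can be scripted. The sketch below parses the READY column from the output format shown above; all_ready, wait_for, and controller_manager_ready are made-up helper names, and the retry counts are arbitrary.

```shell
# all_ready reads `oc get pods` output on stdin and succeeds only if at least
# one pod is listed and every pod's READY column is complete (for example 1/1).
all_ready() {
  awk 'NR > 1 { n++; split($2, r, "/"); if (r[1] != r[2]) bad = 1 }
       END { if (bad || n == 0) exit 1 }'
}

# wait_for retries a command up to TRIES times, sleeping DELAY seconds
# between attempts. Usage: wait_for TRIES DELAY command [args...]
wait_for() {
  tries=$1
  delay=$2
  shift 2
  attempt=0
  while [ "$attempt" -lt "$tries" ]; do
    if "$@"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}

controller_manager_ready() {
  oc get pods -n openshift-controller-manager | all_ready
}

# Usage: poll every 5 seconds for up to 5 minutes.
#   wait_for 60 5 controller_manager_ready
```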

Now we can run the Merge portal installer against OpenShift.

./installer --config portal.yml

Once this installer finishes you’ll have a working Merge portal running inside your OpenShift cluster. A good sanity check is the following.

oc get pods -n merge
NAME                          READY   STATUS      RESTARTS   AGE
apiserver-5c97dc85f5-jn9kj    1/1     Running     0          116s
etcd-95d54bdb5-t6jns          1/1     Running     0          117s
git-server-56554c5795-tjfdx   1/1     Running     0          115s
identity-77d4bb67b-vjgrx      1/1     Running     0          116s
minio-6d66649fc7-dxp8k        1/1     Running     0          117s
model-fc8b5fcd5-66fzd         1/1     Running     0          115s
ops-init-cnt7v                0/1     Error       0          114s
ops-init-pfmg7                0/1     Completed   0          40s
realize-556bbf7d9b-c6r4r      1/1     Running     0          115s

Seeing a few errors on the ops-init job is normal, as it starts at the same time as the pods it depends on, but it eventually succeeds once sufficient infrastructure is up.
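To tell "a few errors, then success" apart from "still failing", you can look for an ops-init pod in the Completed state. A small sketch that parses the output format shown above; ops_init_completed is a made-up helper name.

```shell
# ops_init_completed reads `oc get pods -n merge` output on stdin and succeeds
# if at least one ops-init pod has reached the Completed status.
ops_init_completed() {
  awk '$1 ~ /^ops-init-/ && $3 == "Completed" { ok = 1 }
       END { exit (ok ? 0 : 1) }'
}

# Usage:
#   oc get pods -n merge | ops_init_completed && echo "ops-init finished"
```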

Doing work

I would highly recommend using the OpenShift web console.

./crc console

will open a browser window. Select Log in with … kube:admin, and provide the login credentials from

./crc console --credentials

Add the following to your /etc/hosts

192.168.130.11    api.mergetb.example.net auth.mergetb.example.net grpc.mergetb.example.net git.mergetb.example.net
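The address above is presumably the CRC VM's IP, which `crc ip` prints, so the entry can be generated rather than typed. This is a sketch under that assumption; hosts_line is a made-up helper name.

```shell
# hosts_line is a hypothetical helper: print the /etc/hosts entry for a given IP.
hosts_line() {
  printf '%s    %s %s %s %s\n' "$1" \
    api.mergetb.example.net auth.mergetb.example.net \
    grpc.mergetb.example.net git.mergetb.example.net
}

# Usage: take the address from `crc ip` and append the entry only if it is
# not already present.
#   line=$(hosts_line "$(./crc ip)")
#   grep -qxF "$line" /etc/hosts || echo "$line" | sudo tee -a /etc/hosts
```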

Getting started with the v1 client

First check out and build the code

git clone git@gitlab.com:mergetb/portal/cli
cd cli
git checkout ry-v1
go build -o mrg

Now log in with the credentials generated by the installer. Change directory back to install/build

cat .conf/generated.yml | grep opspw
  opspw: BlOouW27DS08IgY65qE1saf4Pm3Jyi9j
mrg login --nokeys ops BlOouW27DS08IgY65qE1saf4Pm3Jyi9j
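To avoid copying the password by hand, it can be pulled out of the generated file directly. A sketch that assumes the file keeps the `opspw: <value>` line format shown above; ops_password is a made-up helper name.

```shell
# ops_password is a hypothetical helper: print the opspw value from the
# installer's generated config.
ops_password() {
  awk '$1 == "opspw:" { print $2; exit }' "$1"
}

# Usage from install/build:
#   mrg login --nokeys ops "$(ops_password .conf/generated.yml)"
```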

Now a quick sanity check to make sure the portal is initialized

mrg config set server grpc.mergetb.example.net
mrg list id
USERNAME    EMAIL                      ADMIN
ops         ops@mergetb.example.net    true

You are now ready to take over the world.

Pushing Updated Containers to the Portal

If you need to re-build and push containers to the portal, you do not need to run the installer again. Go back to the services repo and run

oc login -u developer -p developer
podman login -u $(oc whoami) -p $(oc whoami -t) --tls-verify=false $HOST
export DOCKER=podman DOCKER_PUSH_ARGS=--tls-verify=false
ymk push-containers.yml

Alternatively, to push individual containers you can just use podman on its own, for example

podman push --tls-verify=false default-route-openshift-image-registry.apps-crc.testing/merge/apiserver
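The image names follow the $REGISTRY/$REGISTRY_PATH/<service>:$TAG pattern used by the container build, so pushing a handful of services can be scripted. This is only a sketch: image_ref is a made-up helper, the service list is an arbitrary pick, and the commands are printed rather than executed.

```shell
# image_ref is a hypothetical helper: build the registry reference for a service image.
image_ref() {
  echo "$REGISTRY/$REGISTRY_PATH/$1:$TAG"
}

# Fall back to the values used earlier in this post if the variables are unset.
REGISTRY=${REGISTRY:-default-route-openshift-image-registry.apps-crc.testing}
REGISTRY_PATH=${REGISTRY_PATH:-merge}
TAG=${TAG:-latest}

# Print the push commands for a few services rather than running them.
for service in apiserver model realize; do
  echo "podman push --tls-verify=false $(image_ref "$service")"
done
```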

A few small things wrt the Portal V1 Testing Env.

  • ../preflight.sh should be (cd .. && ./preflight.sh)
  • We may want to spin off a generic F33 pod in the merge namespace for cli usage. I spun one up to run the ops-init command to init the ops id. I imagine it would be useful for other things.
  • The merge-auth containers/tars are expected to exist in the install ./containers/merge-auth dir, but there is no documentation about them.
  • mrg config set server grpc.mergetb.example.net should be run before the ops login.

Ran into a couple of problems. First, the ymk tools.yml seems to want root. But then there is an error even when running as root.

[elkins@willow services]$ ymk tools.yml
❌ dnf: dnf install -y protobuf-compiler protobuf-devel npm go
exit status 1
Error: This command has to be run with superuser privileges (under the root user on most systems).

[elkins@willow services]$ sudo ymk tools.yml
✅ dnf: dnf install -y protobuf-compiler protobuf-devel npm go
Last metadata expiration check: 1:16:47 ago on Tue 12 Jan 2021 02:09:14 PM PST.
Package protobuf-compiler-3.12.4-1.fc33.x86_64 is already installed.
Package protobuf-devel-3.12.4-1.fc33.x86_64 is already installed.
Package npm-1:6.14.10-1.14.15.4.1.fc33.x86_64 is already installed.
Package golang-1.15.6-1.fc33.x86_64 is already installed.
Dependencies resolved.
Nothing to do.
Complete!
✅ dnf: dnf install -y protobuf-compiler protobuf-devel npm go
Last metadata expiration check: 1:16:49 ago on Tue 12 Jan 2021 02:09:14 PM PST.
Package protobuf-compiler-3.12.4-1.fc33.x86_64 is already installed.
Package protobuf-devel-3.12.4-1.fc33.x86_64 is already installed.
Package npm-1:6.14.10-1.14.15.4.1.fc33.x86_64 is already installed.
Package golang-1.15.6-1.fc33.x86_64 is already installed.
Dependencies resolved.
Nothing to do.
Complete!
✅ npm: npm install -g redoc-cli
/usr/local/bin/redoc-cli → /usr/local/lib/node_modules/redoc-cli/index.js

core-js@3.8.0 postinstall /usr/local/lib/node_modules/redoc-cli/node_modules/core-js
node -e "try{require('./postinstall')}catch(e){}"

Thank you for using core-js ( https://github.com/zloirock/core-js ) for polyfilling JavaScript standard library!

The project needs your help! Please consider supporting of core-js on Open Collective or Patreon:

> https://opencollective.com/core-js
> https://www.patreon.com/zloirock

Also, the author of core-js ( https://github.com/zloirock ) is looking for a good job -)

npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@2.1.3 (node_modules/redoc-cli/node_modules/fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for fsevents@2.1.3: wanted {"os":"darwin","arch":"any"} (current: {"os":"linux","arch":"x64"})

Try sudo -E ymk tools.yml

Yes, that worked for that step.

The build step went fine, but I encountered errors building the containers:

❌ model_ctr: $DOCKER build -f service/model/Dockerfile -t $REGISTRY/$REGISTRY_PATH/model:$TAG .
exit status 125
STEP 1: FROM fedora:33
STEP 2: RUN dnf update -y
Fedora 33 openh264 (From Cisco) - x86_64 2.1 kB/s | 2.5 kB 00:01
Fedora Modular 33 - x86_64 95 kB/s | 3.3 MB 00:35
Fedora Modular 33 - x86_64 - Updates 330 kB/s | 2.9 MB 00:08
Fedora 33 - x86_64 - Updates 6.5 MB/s | 21 MB 00:03
Fedora 33 - x86_64 3.4 MB/s | 72 MB 00:20
Dependencies resolved.

Package Arch Version Repo Size

Upgrading:
audit-libs x86_64 3.0-1.fc33 updates 114 k

Transaction Summary

Install 48 Packages
Upgrade 70 Packages

Total download size: 57 M
Downloading Packages:
(1/118): elfutils-debuginfod-client-0.182-1.fc3 200 kB/s | 33 kB 00:00

(118/118): python3-libs-3.9.1-1.fc33.x86_64.rpm 2.3 MB/s | 7.4 MB 00:03

Total 3.8 MB/s | 57 MB 00:14
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1

Running scriptlet: dnf-4.4.0-2.fc33.noarch 134/188
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down

Cleanup : python3-dnf-4.4.0-2.fc33.noarch 135/188

Verifying : zlib-1.2.11-22.fc33.x86_64 188/188

Upgraded:
audit-libs-3.0-1.fc33.x86_64

Installed:
acl-2.2.53-9.fc33.x86_64

Complete!
Error: error checking if cached image exists from a previous build: error getting history of "f463699e6689a136ee7eea9ec990fe674c5ebbc59315680035fbf45cf8778ee3": error creating new image from reference to image "f463699e6689a136ee7eea9ec990fe674c5ebbc59315680035fbf45cf8778ee3": error locating item named "manifest" for image with ID "f463699e6689a136ee7eea9ec990fe674c5ebbc59315680035fbf45cf8778ee3": file does not exist

I re-ran ymk containers.yml and this time it completed successfully.

Encountered another error while running ./installer --config portal.yml

INFO Pushing merge/realize                        
FATA failed to push image: Error trying to reuse blob sha256:58af922103e84c91de28c055a4fff487dc620386d1a007c7753731d41b2b1c97 at destination: Head "https://default-route-openshift-image-registry.apps-crc.testing/v2/merge/realize/blobs/sha256:58af922103e84c91de28c055a4fff487dc620386d1a007c7753731d41b2b1c97": dial tcp 192.168.130.11:443: connect: connection refused

Running it again gets a slightly different error (different port):

INFO Pushing merge/apiserver                      
FATA check auth: error pinging docker registry default-route-openshift-image-registry.apps-crc.testing: Get "http://default-route-openshift-image-registry.apps-crc.testing/v2/": dial tcp 192.168.130.11:80: connect: connection refused

Yeah. It seems podman has issues with concurrent builds that have a common base.

Is your CRC instance running?

How to install the wireguard kernel module on the CRC/OpenShift host.

  1. Spawn a privileged container in OpenShift. The container image should be registry.access.redhat.com/ubi8:latest.

  2. Copy the following RPMs into the container

kernel-headers-4.18.0-147.0.3.el8_1.x86_64.rpm        
kernel-headers-4.18.0-193.41.1.el8_2.x86_64.rpm       
kernel-core-4.18.0-193.41.1.el8_2.x86_64.rpm          
kernel-devel-4.18.0-193.41.1.el8_2.x86_64.rpm         
kernel-modules-4.18.0-193.41.1.el8_2.x86_64.rpm       
elfutils-libelf-devel-0.180-1.el8.x86_64.rpm          
linux-firmware-20191202-97.gite8a0f4c9.el8.noarch.rpm 

These can be downloaded directly from Red Hat once a subscription is registered with Red Hat (free developer subscriptions seem to work). Download from here: https://access.redhat.com/downloads/content/package-browser. The specific kernel version comes from the CRC virtual machine, which is Red Hat Enterprise Linux CoreOS release 4.6.

These can be copied via the oc cp command. e.g. oc cp elfutils-libelf-devel-0.180-1.el8.x86_64.rpm wgd-585x6:/tmp.

Presumably, if we could get the Red Hat subscription management to work on the container, this step would not be needed, as we could just yum install ... the packages directly. I'm not sure that it does work on containers, though.
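Since most of the RPM names above are tied to the kernel release, the file names to look for can be derived from the output of `uname -r` on the CRC node. A small sketch; kernel_rpms is a made-up helper name, and the elfutils and linux-firmware packages are not covered because their versions do not follow the kernel release.

```shell
# kernel_rpms is a hypothetical helper: list the kernel RPM file names that
# match a given kernel release string (as printed by `uname -r`).
kernel_rpms() {
  for pkg in kernel-headers kernel-core kernel-devel kernel-modules; do
    echo "$pkg-$1.rpm"
  done
}

# Example with the kernel release from the list above.
kernel_rpms 4.18.0-193.41.1.el8_2.x86_64
```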

  3. Install those RPMs on the container: yum install /tmp/*.rpm
  4. Build the wireguard kernel module on the container
  • yum install kernel-devel make gcc git binutils iproute
  • git clone https://git.zx2c4.com/wireguard-linux-compat
  • cd wireguard-linux-compat/src
  • git checkout v0.0.20200318
  • make module
  • make install
  • modprobe wireguard
  • dnf install wireguard-tools

This is all wrapped up in the repo at https://gitlab.com/mergetb/internal/wgbuilder. You should be able to clone that and run install.sh to build and install the wireguard kernel module on the crc/openshift node.

This should be run after the portal is installed and running as it runs in the xdc namespace. This may not actually be needed.