The etcd API provides a notion of a Transaction to facilitate use in distributed applications. Transactions provide several key features:
- They allow updates to multiple keys to be performed atomically. For example, assume there are two keys, `A` and `B`, which have values `a0` and `b0`. If a transaction updates those values to `a1` and `b1` respectively, a reader that attempts to read `A` and `B` concurrently will see either `a0`/`b0` or `a1`/`b1`, i.e., never a combination of old and new values.
- They allow for optimistic concurrency control (OCC) through If/Then/Else constructs, which obviate the need for explicit locks around concurrent updates to the same key space.
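To make the If/Then/Else idea concrete, here is a minimal in-memory sketch of OCC. It is not the etcd client API: `memStore`, `get`, and `txn` are hypothetical names invented for illustration, and the guard compares a per-key "last written" revision in the same spirit as an etcd revision comparison. Two writers read the same revisions and race; only the first commit succeeds, and both keys move together.

```go
package main

import (
	"fmt"
	"sync"
)

// entry is a value plus the store revision at which it was last written.
type entry struct {
	value string
	rev   int64
}

// memStore is a hypothetical in-memory stand-in for a transactional
// key-value store. It is NOT the etcd client API.
type memStore struct {
	mu   sync.Mutex
	rev  int64
	data map[string]entry
}

func newMemStore() *memStore {
	return &memStore{data: make(map[string]entry)}
}

// get returns the value of key and the revision at which it was last
// written (0 if the key has never been written).
func (s *memStore) get(key string) (string, int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e := s.data[key]
	return e.value, e.rev
}

// txn models If/Then/Else: if every key in expect is still at the given
// revision, all puts are applied at one new revision and txn returns true
// (Then). Otherwise nothing is written and txn returns false (Else).
func (s *memStore) txn(expect map[string]int64, puts map[string]string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	for k, want := range expect {
		if s.data[k].rev != want {
			return false // a concurrent writer changed k first
		}
	}
	s.rev++
	for k, v := range puts {
		s.data[k] = entry{value: v, rev: s.rev}
	}
	return true
}

func main() {
	s := newMemStore()
	// Initial state: A and B have never been written (revision 0).
	s.txn(map[string]int64{"A": 0, "B": 0}, map[string]string{"A": "a0", "B": "b0"})

	// Two writers observe the same revisions, then race to update.
	_, revA := s.get("A")
	_, revB := s.get("B")
	guard := map[string]int64{"A": revA, "B": revB}
	first := s.txn(guard, map[string]string{"A": "a1", "B": "b1"})
	second := s.txn(guard, map[string]string{"A": "a2", "B": "b2"})

	a, _ := s.get("A")
	b, _ := s.get("B")
	fmt.Println(first, second, a, b) // true false a1 b1
}
```

The losing writer would normally re-read the keys and retry with fresh revisions, which is what makes the scheme optimistic rather than lock-based.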
One place where Merge uses transactions is on the facility, when handling materialize API requests. When the apiserver receives a request, it constructs a transaction consisting of updates to the set of keys that are watched by the various site reconcilers. The transactional semantics are especially important here: if a watched key is updated, the reconciler that wakes up to handle it may assume that all other parts of the etcd key space for that materialization have also been updated.
However, etcd limits the number of keys that can be involved in a single transaction. We can raise the limit somewhat, but doing so will eventually push etcd into an operating range that it simply cannot handle.
Chris has a recent MR that works around this limit by splitting transactions - i.e., breaking the key-space into smaller segments and enforcing transactional semantics within (but not across) those individual segments. This gets around bulk txn limits, but breaks the full-txn atomic property that we rely on with our use of transactions.
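As a rough illustration of what splitting looks like (this is not the code from the MR, and `splitKeys` and the key names are hypothetical), the sketch below breaks a list of keys into batches that each fit under a per-transaction limit. In etcd that limit is, as far as I know, controlled by the server's `--max-txn-ops` flag. Each batch would be committed as its own transaction, so a reader can observe a state in which some batches have landed and others have not.

```go
package main

import "fmt"

// splitKeys breaks keys into consecutive batches of at most maxOps keys.
// Each batch is small enough to be sent as one transaction, but nothing
// ties the batches together: atomicity holds only within a batch.
func splitKeys(keys []string, maxOps int) [][]string {
	if maxOps <= 0 {
		return nil
	}
	var batches [][]string
	for start := 0; start < len(keys); start += maxOps {
		end := start + maxOps
		if end > len(keys) {
			end = len(keys)
		}
		batches = append(batches, keys[start:end])
	}
	return batches
}

func main() {
	// Hypothetical key names, for illustration only.
	keys := []string{"/mz/a", "/mz/b", "/mz/c", "/mz/d", "/mz/e"}
	for i, batch := range splitKeys(keys, 2) {
		fmt.Printf("txn %d: %v\n", i, batch)
	}
	// txn 0: [/mz/a /mz/b]
	// txn 1: [/mz/c /mz/d]
	// txn 2: [/mz/e]
}
```

With the batches above, a reconciler woken by a change to `/mz/a` could run before `txn 2` commits and find `/mz/e` still at its old value.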
Thus, we have two options: (1) remove our reliance on transactional semantics, or (2) find some other way to provide transactional semantics without relying on etcd's native transactions.
My initial preference is for (2). One way to achieve this is with reader-writer locks, i.e., moving away from OCC in favor of explicit locking. etcd provides reader-writer locks as an experimental feature in its recipe package: go.etcd.io/etcd/client/v3/experimental/recipes
The basic idea is as follows:
- All txn writers must acquire a write-lock. Once acquired, they can non-atomically write to their key space (i.e., through a split transaction, or any other means). After all keys of the transaction are written, they release the lock.
- All txn readers must acquire a read-lock. Once acquired, they can read anywhere in the transaction's key space, and transactional semantics are ensured. Once they are finished reading, they release the lock.
Write locks are mutually exclusive with writes and reads, while read locks are only mutually exclusive with writes. That is, any number of concurrent reads are permitted, as long as there is no write. This aligns well with the read/write mode that we typically use in Merge, which is:
- apiserver: single writer
- reconcilers: many readers
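The protocol above can be modeled in-process as follows. This is only a sketch: Go's `sync.RWMutex` stands in for a distributed reader-writer lock such as the one in the experimental recipe package, and `keyspace`, `writeAll`, and `consistent` are hypothetical names. The writer updates keys one at a time (as a split transaction would), yet readers holding the read lock never observe a mix of old and new versions.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// keyspace is a hypothetical in-memory stand-in for the etcd keys of one
// materialization. lock plays the role of the distributed reader-writer lock.
type keyspace struct {
	lock sync.RWMutex
	data map[string]int
}

// writeAll sets every key to version, one key at a time. The individual
// writes are not atomic; the write lock is what hides the intermediate state.
func (k *keyspace) writeAll(keys []string, version int) {
	k.lock.Lock()
	defer k.lock.Unlock()
	for _, key := range keys {
		k.data[key] = version
	}
}

// consistent reports whether all keys hold the same version, as seen by a
// reader holding the read lock.
func (k *keyspace) consistent(keys []string) bool {
	k.lock.RLock()
	defer k.lock.RUnlock()
	if len(keys) == 0 {
		return true
	}
	first := k.data[keys[0]]
	for _, key := range keys[1:] {
		if k.data[key] != first {
			return false
		}
	}
	return true
}

func main() {
	keys := []string{"A", "B", "C"}
	ks := &keyspace{data: map[string]int{"A": 0, "B": 0, "C": 0}}

	var wg sync.WaitGroup
	var mixed int64

	// Single writer, in the role of the apiserver.
	wg.Add(1)
	go func() {
		defer wg.Done()
		for v := 1; v <= 100; v++ {
			ks.writeAll(keys, v)
		}
	}()

	// Many readers, in the role of the reconcilers.
	for r := 0; r < 4; r++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				if !ks.consistent(keys) {
					atomic.AddInt64(&mixed, 1)
				}
			}
		}()
	}

	wg.Wait()
	fmt.Println("mixed reads:", atomic.LoadInt64(&mixed)) // mixed reads: 0
}
```

A distributed lock additionally has to cope with a holder that crashes or loses its lease partway through a write, which this in-process model does not capture.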
It may also be that option (1) is preferable: do not build reconcilers to expect any sort of transactional semantics. This may be more in line with the idea of the reconciler model, which is that each reconciler only cares about state that is keyed under the prefix it is watching. We'd need some code updates to achieve this, because I know of at least one situation where a reconciler assumes that some keys outside of its immediate keyspace exist, but it might ultimately be the better option.