1 Introduction

We study Federated Byzantine Agreement Systems (FBASs), as originally proposed by Mazières [16]. FBASs are conceptually related to Asymmetric Quorum Systems [2] and Personal Byzantine Quorum Systems [14]. While research on consensus protocols has accelerated in the wake of global blockchain enthusiasm, developments still mostly fall into two extreme categories: permissionless, i.e., open-membership, as exemplified by Bitcoin’s notoriously energy-hungry “Nakamoto consensus” [17], and permissioned, with a closed group of validators, as assumed both in the classical Byzantine fault tolerance (BFT) literature (e.g., [4]) and many state-of-the-art protocols from the blockchain world (e.g., [22]). The FBAS paradigm and the works it has inspired suggest a middle way: Each node defines its own rules about which groups of nodes it will consider as sufficient validators. If the sum of all such configurations fulfills a set of properties, protocols like the Stellar Consensus Protocol (SCP) [16] can be defined that leverage the resulting structure for establishing a live and safe consensus system [3, 8, 9, 13, 14].

In the original FBAS model [16], which this paper is based on, these properties are foremost quorum availability despite faulty nodes, which enables liveness, and quorum intersection despite faulty nodes, which makes it possible for consensus protocols to prevent forks and thus enables safety. In a practical deployment, it is seldom clear which nodes are faulty, and the level of risk w.r.t. liveness and safety is therefore uncertain. We propose an intuitive and yet precise analysis approach for determining the level of risk, based on enumerating minimal blocking sets and minimal splitting sets, i.e., minimal sets of nodes that, if faulty, can by themselves compromise liveness and safety, respectively. We provide algorithms for determining these sets in arbitrary FBASs and make available an efficient software-based analysis framework (Footnote 1). To the best of our knowledge, we are the first to propose and implement an analysis methodology for the assessment of the liveness and safety guarantees of FBAS instances that yields precise results as opposed to heuristic estimations. As previously shown in [8], FBASs induce Byzantine quorum systems as per Malkhi and Reiter [15]; hence our results might be of interest to more classical formalizations as well. For example, we explicitly distinguish between sets of nodes that can undermine liveness and sets that can undermine safety, highlighting that in an actual system the threat to liveness and the threat to safety can differ both in structure and in severity.

We apply our analysis approach and tooling in an empirical study that investigates the emergence of FBASs from existing inter-node relationships, as encoded in, e.g., trust graphs. Based on example configuration policies, we demonstrate that while FBASs can be bootstrapped in a bottom-up fashion from individual preferences, strategic considerations should additionally be applied by node operators in order to arrive at FBASs that are robust and amenable to monitoring.

Strategic considerations can increase centralization, on top of what is already implied by individual preferences. We observe that centralization manifests as a top tier of nodes that is solely relevant when determining liveness buffers. We contribute a proof that if maintaining basic safety guarantees is a minimal strategic requirement of node operators, top tiers are effectively “closed-membership” in the sense that a top tier’s composition can only change with cooperation of current top tier nodes. This casts doubt on the reported “open-membership” property of FBASs—while any node can become part of the FBAS, our results show that only nodes approved by the current top tier can become relevant for consensus.

Following an overview of related work (Sec. 2) and the formal introduction of the FBAS model and its interpretation in practical deployments (Sec. 3), we structure our paper around our main original contributions:

  • An analysis framework for reasoning about safety and liveness guarantees in concrete FBASs (Sec. 4).

  • Algorithms for efficiently performing the proposed analyses (Sec. 5).

  • A simulation-based exploration of possible configuration policies and their effects (Sec. 6).

  • Formal proof that membership in an FBAS’ top tier is only “open” if a violation of safety is considered acceptable (Sec. 7).

As appendices, we prove a number of additional corollaries and theorems (Appendix A) and present results from applying our analysis methodology to an interesting toy network (Appendix B) and the current Stellar network (Appendix C).

2 Related work

Federated Byzantine Agreement Systems were first proposed in [16], together with the Stellar Consensus Protocol (SCP), a first protocol for this setting. The viability of SCP has been proven formally [8, 9, 13] and the protocol is in active use in two large-scale payment networks [13, 18]. The FBAS notion has furthermore been generalized and reformulated in different ways, creating bridges to more classical models and enabling the development of additional protocols [2, 3, 14]. Among other things, as shown by García-Pérez and Gotsman [8], FBASs with “safe” configurations induce Byzantine quorum systems [15]. In this work, we are less interested in the mechanics of specific protocols for the FBAS setting but instead investigate the conditions they require for achieving safety, liveness and performance. We investigate how many node failures (and of which nodes) an FBAS can tolerate before the conditions for safety and liveness are compromised, and how individual node configuration policies influence these “buffers”.

Previously, consensus protocols relevant in practice (such as PBFT [4]) have relied on a symmetric threshold model. In a typical instantiation with \(3f+1\) nodes that can tolerate up to f Byzantine node failures, any \(2f+1\) nodes form a (minimal) quorum. This model naturally gives rise to quorum systems that are trivial to analyze, i.e., for which it is trivial to determine under which maximal fail-prone sets [15] consensus is still possible. The possibility of quorum systems that lack symmetry (opened up by the FBAS paradigm and related notions) makes the investigation of a more general analysis approach necessary.
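As a small illustration of why the symmetric threshold model is easy to analyze, the following sketch (ours, not taken from the cited works) enumerates all quorums of size \(2f+1\) among \(3f+1\) nodes and computes the smallest overlap between any two of them. By counting, this overlap is \(2(2f+1) - (3f+1) = f+1\), so every intersection contains at least one non-faulty node as long as at most f nodes are faulty. The function name `min_pairwise_overlap` is a hypothetical helper.

```python
from itertools import combinations


def min_pairwise_overlap(f: int) -> int:
    """Smallest intersection size between any two quorums of size 2f+1
    in a symmetric threshold system with 3f+1 nodes (brute force)."""
    n, size = 3 * f + 1, 2 * f + 1
    quorums = [frozenset(c) for c in combinations(range(n), size)]
    return min(len(a & b) for a, b in combinations(quorums, 2))


for f in (1, 2):
    # The overlap should match the closed-form value f + 1.
    print(f, min_pairwise_overlap(f), f + 1)
```

No comparable closed form is available for asymmetric configurations, which is what motivates the enumeration-based analyses in the remainder of the paper.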

A heuristics-based methodology for analyzing FBAS instances was previously proposed in [11], focusing on the identification of central nodes and threats to FBAS liveness. We propose a novel analysis approach that is not heuristics-based and hence yields precise insights, based on a solid theoretic foundation. As in [11], we apply our methodology to snapshots of the live Stellar network (cf. Appendix C).

Bracciali et al. [1] explore fundamental bounds on the decentrality in open quorum systems. One of their central arguments with regards to the FBAS paradigm is that quorum intersection, a crucial requirement to guaranteeing safety in protocols like SCP, is computationally intractable to determine and maintain, necessitating centralization if safety is a requirement. The NP-hardness of determining quorum intersection was previously also proven by Lachowski [12], together, however, with practical algorithms for nevertheless determining safety-critical properties of non-trivial FBASs. We develop new algorithms that incorporate the possibility that some nodes may fail, enumerating minimal blocking sets and minimal splitting sets. We evaluate their performance for different FBAS sizes, providing insights into the computational limitations that are relevant in practice. While, based on our analysis approach and its application to specific FBASs, we can confirm that nodes of higher influence (top tier nodes according to our choice of words) naturally emerge, we argue that it is not only the existence and size of such a group that determines “centralization” but also the fluidity of that group’s membership (which we explicitly investigate).

An alternative analysis methodology and software framework has recently been presented in [10]. Among other things, the authors provide algorithms for determining the consequences of specific sets of nodes becoming faulty, whereas we propose and implement approaches for identifying all minimal sets of nodes that need to become faulty for an FBAS to lose safety and liveness guarantees.

3 Federated byzantine agreement

In the following, we introduce core concepts of the FBAS paradigm that form our basis for reasoning about specific FBAS instances. We use terminology based on [12, 13, 16] and the Stellar codebase (stellar-core).

Our FBAS model is based on the concept of nodes. Whereas nodes usually represent individual machines, for the purposes of this paper we typically assume that each node represents a distinct entity or organization. We will illustrate introduced concepts using examples, with nodes represented as integers. For example, \(\{{0, 1, 2}\}\) denotes a set of three distinct nodes. We will occasionally also use established terms in the context of consensus protocols, such as “slot”, “externalize” and “faulty”, without formally introducing them. As an informal and approximate adaptation to the blockchain setting, a slot is a block of a given height, to externalize a value is to decide the contents of a block (Footnote 2), and a faulty node is one that violates protocol rules in arbitrary ways, e.g., assuming the worst-case scenario, via being under the control of an attacker that also controls all other faulty nodes.

We first introduce the formal foundation of the FBAS paradigm as originally proposed in [16]. Following that, we formally define the quorum set configuration format for FBAS nodes that was previously only used in a practical implementation (of the Stellar network software) but whose convenience for defining specific FBAS instances also benefits the theoretical discussion. Based on the introduced foundations, we finally derive the necessary properties an FBAS must exhibit in order to enable liveness and safety guarantees.

3.1 Quorum slice and FBAS

In an FBAS, each node (respectively its human administrator) individually configures which other nodes’ opinions it should consider when participating in consensus. Configurations can express individual expectations, such as “out of these n nodes, at most f will simultaneously cooperate to attack the system”, and can be used to strategically influence global system parameters. On a conceptual level, the configuration of an FBAS node consists in the definition of quorum slices.

Definition 3.1

(FBAS; adapted from [16]) A Federated Byzantine Agreement System (FBAS) is a pair \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) comprising a set of nodes \({{\,\mathrm{\mathbf {V}}\,}}\) and a quorum function \({{\,\mathrm{\mathbf {Q}}\,}}: {{\,\mathrm{\mathbf {V}}\,}}\rightarrow 2^{2^{{{\,\mathrm{\mathbf {V}}\,}}}}\) specifying quorum slices for each node, where a node belongs to all of its own quorum slices—i.e., \(\forall v \in {{\,\mathrm{\mathbf {V}}\,}}, \forall q \in {{\,\mathrm{\mathbf {Q}}\,}}(v), v \in q\).

Informally, each quorum slice of a node v describes a set of nodes that, should they all agree to externalize a value in a given slot, is sufficient to also cause v to externalize that value.

Clearly, an FBAS cannot be modeled as a regular graph (with FBAS nodes as graph vertices) without losing information. Graph-based analyses as in [11] can therefore result only in heuristic insights. An FBAS can be modeled as a directed hypergraph [7]. However, we find the quorum set abstraction (presented next) more suitable for subsequent analysis. In Sec. 6, we explore strategies for bootstrapping robust FBASs from graphs.

3.2 Quorum set

While a useful abstraction for formally describing protocols for the FBAS setting, quorum slices are an unwieldy format for describing concrete FBAS instances. In Stellar, the currently most relevant practical deployment of an FBAS, nodes are configured not via quorum slices but via quorum sets [13]. Each quorum set defines a set of validator nodes \(U \subseteq {{\,\mathrm{\mathbf {V}}\,}}\), a set of inner quorum sets \(\mathcal {I}\) and a threshold value t. Intuitively, this representation enables the encoding of notions such as “out of these nodes U, at least t must agree” (satisfying the quorum set) or “the sum of agreeing nodes in U and satisfied inner quorum sets in \(\mathcal {I}\) must be at least t”.

Definition 3.2

(quorum set; adapted from Stellar codebase) A quorum set is a recursive tuple \((U, \mathcal {I}, t) \in \mathfrak {D}, \, \mathfrak {D}:= 2^{{{\,\mathrm{\mathbf {V}}\,}}} \times 2^\mathfrak {D}\times \mathbb {Z}^{+}\). For quorum sets of the form \(D = (U, \mathcal {I}, t)\), we recursively define that a set of nodes \(q \subseteq {{\,\mathrm{\mathbf {V}}\,}}\) satisfies D iff \((|{q \cap U}| + |{\{{I \in \mathcal {I}: q \text { satisfies } I}\}}|) \ge t\).

For example, \((\{{0, 1}\},\emptyset , 1)\) encodes that agreement is required from either node 0 or node 1, whereas \((\{{0}\}, \mathcal {I}, 1)\) with \(\mathcal {I}= \{{(\{{1, 2, 3}\}, \emptyset , 2)}\}\) encodes that either node 0 or two out of \(\{{1, 2, 3}\}\) must agree. Inner quorum sets (members of \(\mathcal {I}\)) are often used for grouping nodes belonging to the same entity (respectively organization), so that the importance of an entity can be decoupled from the number of nodes it controls.
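The recursive satisfaction rule of Def. 3.2 can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the Stellar implementation: the class name `QuorumSet`, its field names, and the use of a tuple for the inner quorum sets (which, unlike a set, would permit duplicates) are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class QuorumSet:
    """A quorum set (U, I, t): validators U, inner quorum sets I, threshold t."""
    validators: FrozenSet[int]
    inner: Tuple["QuorumSet", ...]
    threshold: int


def satisfies(q: FrozenSet[int], d: QuorumSet) -> bool:
    """True iff |q ∩ U| plus the number of satisfied inner sets is >= t."""
    direct_hits = len(q & d.validators)
    inner_hits = sum(1 for i in d.inner if satisfies(q, i))
    return direct_hits + inner_hits >= d.threshold


# The two examples from the text above.
either = QuorumSet(frozenset({0, 1}), (), 1)
nested = QuorumSet(frozenset({0}), (QuorumSet(frozenset({1, 2, 3}), (), 2),), 1)

print(satisfies(frozenset({1}), either))     # node 1 alone suffices
print(satisfies(frozenset({2, 3}), nested))  # two out of {1, 2, 3} suffice
print(satisfies(frozenset({3}), nested))     # a single inner node does not
```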

Quorum sets are useful for defining the quorum slices of a node. To ease notation, we define the formalism \({{\,\mathrm{qset}\,}}(v, D)\) that expresses the set of quorum slices of a node \(v \in {{\,\mathrm{\mathbf {V}}\,}}\) based on a quorum set \(D \in \mathfrak {D}\).

Definition 3.3

(quorum set \(\rightarrow \) quorum slices) For a node \(v \in {{\,\mathrm{\mathbf {V}}\,}}\) and a quorum set \(D \in \mathfrak {D}\), \({{\,\mathrm{qset}\,}}(v, D)\) maps to the set of all valid quorum slices for v that satisfy D, i.e., \({{\,\mathrm{qset}\,}}(v, D): {{\,\mathrm{\mathbf {V}}\,}}\times \, \mathfrak {D}\rightarrow 2^{2^{{{\,\mathrm{\mathbf {V}}\,}}}} := \{{q \subseteq {{\,\mathrm{\mathbf {V}}\,}}\mid v \in q \wedge q \text { satisfies } D}\}\).

Via the \({{\,\mathrm{qset}\,}}\) notation, quorum sets and quorum slices become equivalent representations that can be transformed into one another. A straightforward (but generally not space-efficient) way to express any k quorum slices \(\{{q_i \in 2^{{{\,\mathrm{\mathbf {V}}\,}}} \mid i \in [0, k), v \in q_i}\}\) of a node \(v \in {{\,\mathrm{\mathbf {V}}\,}}\) via a quorum set is \({{\,\mathrm{qset}\,}}(v, (\emptyset , \mathcal {I}, 1))\), with \(\mathcal {I}= \{{(q_i, \emptyset , |{q_i}|) \mid i \in [0, k)}\}\). Quorum sets are translated to quorum slices (values of \({{\,\mathrm{\mathbf {Q}}\,}}\)) by applying the \({{\,\mathrm{qset}\,}}\) function. For example (with \({{\,\mathrm{\mathbf {V}}\,}}= \{{0, 1, 2}\}\)):

$$\begin{aligned} {{\,\mathrm{\mathbf {Q}}\,}}(0)&= {{\,\mathrm{qset}\,}}(0, (\{{1, 2}\},\emptyset , 1)) = \{{\{{0, 1}\}, \{{0, 2}\}, \{{0, 1, 2}\}}\}\\ {{\,\mathrm{\mathbf {Q}}\,}}(1)&= {{\,\mathrm{qset}\,}}(1, (\{{0, 2}\},\emptyset , 2)) = \{{\{{0, 1, 2}\}}\}\\ {{\,\mathrm{\mathbf {Q}}\,}}(2)&= {{\,\mathrm{qset}\,}}(2, (\{{0, 1, 2}\},\emptyset , 2)) = \{{\{{0, 2}\}, \{{1, 2}\}, \{{0, 1, 2}\}}\} \end{aligned}$$

In the above example, \({{\,\mathrm{\mathbf {V}}\,}}= \{{0,1,2}\}\) and their quorum sets (as per \({{\,\mathrm{\mathbf {Q}}\,}}\)) form the FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\). As a way to visualize \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\), it can heuristically be represented as a graph where the existence of an edge \((v_i, v_j)\) implies that \(v_j\) is included in at least one of \(v_i\)’s quorum slices:

[Figure a: graph representation of the example FBAS with nodes 0, 1 and 2]
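The translation from quorum sets to quorum slices in Def. 3.3 can be reproduced by brute force for the three-node example of this section. The sketch below is illustrative only: it represents a quorum set as a plain tuple `(U, inner, t)`, and the helper names `satisfies` and `qset` mirror the notation of the text but are our own. Enumerating all subsets is exponential in the number of nodes and is only meant for small examples.

```python
from itertools import combinations


def satisfies(q, d):
    """Def. 3.2 for a quorum set d = (U, inner, t) given as a tuple."""
    validators, inner, threshold = d
    return len(q & validators) + sum(satisfies(q, i) for i in inner) >= threshold


def qset(v, d, nodes):
    """All q ⊆ nodes with v in q that satisfy d (Def. 3.3), by enumeration."""
    others = sorted(nodes - {v})
    slices = set()
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            q = frozenset((v, *extra))
            if satisfies(q, d):
                slices.add(q)
    return slices


V = frozenset({0, 1, 2})
Q = {
    0: qset(0, (frozenset({1, 2}), (), 1), V),
    1: qset(1, (frozenset({0, 2}), (), 2), V),
    2: qset(2, (frozenset({0, 1, 2}), (), 2), V),
}

for v in sorted(Q):
    print(v, sorted(sorted(q) for q in Q[v]))
```

Under these assumptions, the printed slices should coincide with the values of \({{\,\mathrm{\mathbf {Q}}\,}}(0)\), \({{\,\mathrm{\mathbf {Q}}\,}}(1)\) and \({{\,\mathrm{\mathbf {Q}}\,}}(2)\) listed in the example above.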

3.3 Preconditions to liveness

A consensus system is live if it can externalize new values (Footnote 3). A consensus system built upon an FBAS is live if the FBAS contains an intact quorum, i.e., a group of FBAS nodes that can externalize new values by itself.

Definition 3.4

(quorum [16]) A set of nodes \(U \subseteq {{\,\mathrm{\mathbf {V}}\,}}\) in FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) is a quorum iff \(U \ne \emptyset \) and U contains a quorum slice for each member—i.e., \(\forall v \in U \; \exists q \in {{\,\mathrm{\mathbf {Q}}\,}}(v): q \subseteq U\).

This is equivalent to stating that U satisfies the quorum sets of all \(v \in U\). Quorums are therefore determined by the sum of all individual quorum set configurations. Continuing the previous example with nodes \({{\,\mathrm{\mathbf {V}}\,}}= \{{0, 1, 2}\}\), we get the quorums \(\mathcal {U}= \{{\{{0,2}\},\{{0,1,2}\}}\}\). We capture part of the semantics behind quorums by defining what it means for a consensus protocol to honor a given FBAS, namely that whenever values are externalized for a slot, at least one quorum of nodes must eventually externalize values as well.
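The quorums of the running example can be enumerated directly from Def. 3.4. The following sketch is a naive illustration (exponential in the number of nodes); the function names `is_quorum` and `all_quorums` are hypothetical, and the quorum function is written out as a dictionary of explicit slices taken from the example in Sec. 3.2.

```python
from itertools import combinations

fs = frozenset


def is_quorum(u, Q):
    """Def. 3.4: u is non-empty and contains a slice of each of its members."""
    return bool(u) and all(any(q <= u for q in Q[v]) for v in u)


def all_quorums(nodes, Q):
    """Enumerate every subset of nodes and keep those that are quorums."""
    nodes = sorted(nodes)
    return [fs(c)
            for r in range(1, len(nodes) + 1)
            for c in combinations(nodes, r)
            if is_quorum(fs(c), Q)]


# Quorum slices of the example FBAS from Sec. 3.2.
Q = {
    0: {fs({0, 1}), fs({0, 2}), fs({0, 1, 2})},
    1: {fs({0, 1, 2})},
    2: {fs({0, 2}), fs({1, 2}), fs({0, 1, 2})},
}

quorums = all_quorums(Q.keys(), Q)
print([sorted(u) for u in quorums])
```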

Definition 3.5

(protocol that honors an FBAS) Let \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) be an FBAS such that \({{\,\mathrm{\mathbf {V}}\,}}\) contains only non-faulty nodes, P a consensus protocol, and \(N_i \subseteq {{\,\mathrm{\mathbf {V}}\,}}\) the set of all nodes that, following P, eventually externalize a value for a given slot i. We say that P honors \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) iff any nonempty \(N_i\) contains a quorum, i.e., \(\forall i: N_i = \emptyset \vee \exists U \subseteq N_i\) such that U is a quorum for \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\).

We say that \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) has quorum availability despite faulty nodes iff there exists a \(U \subseteq {{\,\mathrm{\mathbf {V}}\,}}\) that is a quorum in \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) and consists of only non-faulty nodes. Quorum availability despite faulty nodes is a necessary condition to achieving liveness in an FBAS, i.e., ensuring that non-faulty nodes can externalize new values independently of the behavior of faulty nodes [16].

Theorem 3.1

(quorum availability \(\Longleftarrow \) liveness) Let \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) be an FBAS and P a consensus protocol that honors \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\). If P can provide liveness for \(({{\,\mathrm{\mathbf {V}}\,}},{{\,\mathrm{\mathbf {Q}}\,}})\) independently of the behavior of faulty nodes, then \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) enjoys quorum availability despite faulty nodes.

Proof

Let \(F \subseteq {{\,\mathrm{\mathbf {V}}\,}}\) be the set of all faulty nodes and \(({{\,\mathrm{\mathbf {V}}\,}}\setminus F, {{\,\mathrm{\mathbf {Q}}\,}}^\prime )\) a sub-FBAS that contains all non-faulty nodes, with \({{\,\mathrm{\mathbf {Q}}\,}}^\prime (v) := \{{q \in {{\,\mathrm{\mathbf {Q}}\,}}(v) \mid q \subseteq {{\,\mathrm{\mathbf {V}}\,}}\setminus F}\}\) for all \(v \in {{\,\mathrm{\mathbf {V}}\,}}\setminus F\). P honors \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) and can provide liveness independently of the behavior of nodes in F, therefore there must exist a protocol \(P^\prime \) that can provide liveness while honoring \(({{\,\mathrm{\mathbf {V}}\,}}\setminus F, {{\,\mathrm{\mathbf {Q}}\,}}^\prime )\). Based on Def. 3.5, there is therefore at least one \(U \subseteq {{\,\mathrm{\mathbf {V}}\,}}\setminus F\) that is a quorum for \(({{\,\mathrm{\mathbf {V}}\,}}\setminus F, {{\,\mathrm{\mathbf {Q}}\,}}^\prime )\). U is, trivially, also a quorum for \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\). \(\square \)

Given quorum availability despite faulty nodes, protocols like SCP can provide liveness [16]. In the case of SCP, this was previously demonstrated through correctness proofs [9] as well as formal verification and practical deployment experience [13]. Additional conditions to achieving liveness include the reaction (via quorum set adaptations, i.e., changes to \({{\,\mathrm{\mathbf {Q}}\,}}\)) to (detectable) timing attacks [13]. We defer to works such as [2, 3, 14, 16] for an in-depth exploration of the mechanics and guarantees of consensus protocols for the FBAS setting.
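Quorum availability despite faulty nodes, as defined in this section, can be checked for a given set F of faulty nodes without enumerating all quorums. The sketch below uses a simple fixpoint iteration, which is our own illustrative choice and not necessarily the approach of the paper's analysis framework: starting from \({{\,\mathrm{\mathbf {V}}\,}}\setminus F\), it repeatedly removes nodes that have no quorum slice inside the remaining set. Whatever remains is the largest quorum consisting only of non-faulty nodes, or the empty set if there is none. The function names are hypothetical.

```python
fs = frozenset


def max_quorum_within(candidates, Q):
    """Largest quorum contained in `candidates` (empty if none exists)."""
    current = set(candidates)
    changed = True
    while changed:
        changed = False
        for v in list(current):
            # v cannot be part of any quorum inside `current` if none of
            # its slices fits into `current`.
            if not any(q <= current for q in Q[v]):
                current.discard(v)
                changed = True
    return fs(current)


def has_quorum_availability(nodes, Q, faulty):
    """True iff some quorum consists of non-faulty nodes only."""
    return bool(max_quorum_within(fs(nodes) - fs(faulty), Q))


# Quorum slices of the example FBAS from Sec. 3.2.
Q = {
    0: {fs({0, 1}), fs({0, 2}), fs({0, 1, 2})},
    1: {fs({0, 1, 2})},
    2: {fs({0, 2}), fs({1, 2}), fs({0, 1, 2})},
}

print(has_quorum_availability(Q.keys(), Q, {1}))  # quorum {0, 2} survives
print(has_quorum_availability(Q.keys(), Q, {0}))  # no quorum without node 0
```

The iteration is correct because any quorum contained in the candidate set is never shrunk: each of its members always retains a slice inside the quorum itself.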

3.4 Preconditions to safety

A set of nodes in an FBAS enjoys safety if no two of them ever externalize different values for the same slot [16]. In a blockchain context, a lack of safety guarantees translates into the possibility of forks and double spends. Protocols that honor an FBAS can only guarantee safety if the FBAS enjoys quorum intersection.

Definition 3.6

(quorum intersection [16]) A given FBAS enjoys quorum intersection iff any two of its quorums share a node—i.e., for all quorums \(U_{1}\) and \(U_{2}\), \(U_{1} \cap U_{2} \ne \emptyset \).

For example, the set of quorums \(\{{\{{0,2}\},\{{0,1,2}\}}\}\) intersects, whereas introducing an additional quorum \(\{{1,4}\}\) would break quorum intersection. In the latter scenario, \(\{{0,2}\}\) and \(\{{1,4}\}\) could induce two new, separated FBASs [14]. We say that an FBAS enjoys quorum intersection despite faulty nodes if every two quorums that contain non-faulty nodes intersect in at least one non-faulty node, even if all faulty nodes change their quorum sets in arbitrary ways or report different quorum sets to different peers. Formally, quorum intersection despite faulty nodes is defined via a delete operation that transforms an FBAS based on the assumption that a given set of nodes is acting in the most harmful (to safety) way possible.

Definition 3.7

(delete [16]) If \(({{\,\mathrm{\mathbf {V}}\,}},{{\,\mathrm{\mathbf {Q}}\,}})\) is an FBAS and \(F \subseteq {{\,\mathrm{\mathbf {V}}\,}}\) a set of nodes, then to delete F from \(({{\,\mathrm{\mathbf {V}}\,}},{{\,\mathrm{\mathbf {Q}}\,}})\), written \(({{\,\mathrm{\mathbf {V}}\,}},{{\,\mathrm{\mathbf {Q}}\,}})^F\), means to compute the modified FBAS \(({{\,\mathrm{\mathbf {V}}\,}}\setminus F, {{\,\mathrm{\mathbf {Q}}\,}}^F)\) where \({{\,\mathrm{\mathbf {Q}}\,}}^F(v) = \{{q \setminus F \mid q \in {{\,\mathrm{\mathbf {Q}}\,}}(v)}\}\).

If \(F \subseteq {{\,\mathrm{\mathbf {V}}\,}}\) is the set of all faulty nodes, then an FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) enjoys quorum intersection despite faulty nodes iff \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})^F\) enjoys quorum intersection. If quorum intersection despite faulty nodes is not given, safety cannot be guaranteed (although it can be maintained by chance).
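Def. 3.6 and Def. 3.7 combine into a direct, if naive, check for quorum intersection despite faulty nodes. The sketch below is an illustration under our own assumptions: all function names are hypothetical, quorums are enumerated by brute force, and the FBAS used is the three-node example from Sec. 3.2.

```python
from itertools import combinations

fs = frozenset


def is_quorum(u, Q):
    return bool(u) and all(any(q <= u for q in Q[v]) for v in u)


def quorums_of(nodes, Q):
    nodes = sorted(nodes)
    return [fs(c) for r in range(1, len(nodes) + 1)
            for c in combinations(nodes, r) if is_quorum(fs(c), Q)]


def has_quorum_intersection(quorums):
    """Def. 3.6: every two quorums share at least one node."""
    return all(u1 & u2 for u1, u2 in combinations(quorums, 2))


def delete(nodes, Q, faulty):
    """Def. 3.7: remove the faulty nodes from V and from every slice."""
    remaining = fs(nodes) - faulty
    return remaining, {v: {q - faulty for q in Q[v]} for v in remaining}


def intersection_despite(nodes, Q, faulty):
    remaining, Q_f = delete(nodes, Q, fs(faulty))
    return has_quorum_intersection(quorums_of(remaining, Q_f))


# The sets of quorums discussed after Def. 3.6.
print(has_quorum_intersection([fs({0, 2}), fs({0, 1, 2})]))
print(has_quorum_intersection([fs({0, 2}), fs({0, 1, 2}), fs({1, 4})]))

# Quorum slices of the example FBAS from Sec. 3.2.
Q = {
    0: {fs({0, 1}), fs({0, 2}), fs({0, 1, 2})},
    1: {fs({0, 1, 2})},
    2: {fs({0, 2}), fs({1, 2}), fs({0, 1, 2})},
}
print(intersection_despite(Q.keys(), Q, {0}))
print(intersection_despite(Q.keys(), Q, {1}))
```

In this sketch, deleting node 0 leaves the quorums \(\{{2}\}\) and \(\{{1,2}\}\), which intersect, whereas deleting node 1 leaves the disjoint quorums \(\{{0}\}\) and \(\{{2}\}\), so the last check fails.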

Theorem 3.2

(quorum intersection \(\Longleftarrow \) guaranteed safety) Let \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) be an FBAS and P a consensus protocol that can provide liveness for any FBAS with quorum availability despite faulty nodes, while honoring the respective FBAS. Let P furthermore be non-trivial, in the sense that externalized values are non-deterministic and depend on user input. If P can guarantee safety for all non-faulty nodes in \({{\,\mathrm{\mathbf {V}}\,}}\), then \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) enjoys quorum intersection despite faulty nodes.

Proof

Let \(F \subseteq {{\,\mathrm{\mathbf {V}}\,}}\) be the set of all faulty nodes and \(({{\,\mathrm{\mathbf {V}}\,}}^\prime , {{\,\mathrm{\mathbf {Q}}\,}}^\prime ) := ({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})^F\). If \(({{\,\mathrm{\mathbf {V}}\,}}^\prime , {{\,\mathrm{\mathbf {Q}}\,}}^\prime )\) does not enjoy quorum intersection, then there are two quorums \(U_1, U_2 \subset {{\,\mathrm{\mathbf {V}}\,}}^\prime \) so that \(U_1 \cap U_2 = \emptyset \). For \(i \in \{{1, 2}\}\), let \(Q_i\) be defined such that \(\forall v \in U_i: Q_i(v) := \{{q \in {{\,\mathrm{\mathbf {Q}}\,}}^\prime (v) \mid q \subseteq U_i}\}\). Then both \((U_1, Q_1)\) and \((U_2, Q_2)\) form FBASs with quorum availability. As P can provide liveness for any FBAS with quorum availability, \((U_1, Q_1)\) and \((U_2, Q_2)\) can externalize values for the same slots without any communication taking place between nodes in \(U_1\) and nodes in \(U_2\). As P is non-trivial, the externalized values can differ, i.e., safety cannot be guaranteed. \(\square \)

As formally proven by García-Pérez and Gotsman [8], an FBAS that enjoys quorum intersection induces a Byzantine quorum system [15], and an FBAS that enjoys quorum intersection despite faulty nodes can induce a dissemination quorum system [15]. These results are independent of attempts by faulty nodes to lie about their quorum set configuration [8]. There is strong evidence that protocols like SCP can guarantee safety in any FBAS with quorum intersection despite faulty nodes [2, 9, 13, 14].

4 Concepts for further analysis

In the following, we define new concepts for capturing relevant properties of concrete FBAS instances. While it is typical in the BFT literature to construct proofs based on assuming which sets of nodes can fail simultaneously (i.e., which are the fail-prone sets [15]), we instead investigate which sets of nodes have to fail in order for global liveness and safety guarantees to become void. This perspective uncovers the liveness and safety buffers a given (potentially non-trivial) quorum system has and is thus highly relevant for the monitoring and evaluation of systems deployed in practice. While defined based on the FBAS model, the proposed concepts are readily transferable to more general quorum system formalizations (e.g., recall that safety-enabling FBASs induce Byzantine quorum systems [8]).

For illustration, we will be using the example FBAS defined via Fig. 1. An analysis of a slightly larger example FBAS is presented in Appendix B. Appendix A contains formal write-ups and proofs of various corollaries and theorems relevant to this section.

Fig. 1: Example FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\)

4.1 Starting point: Minimal quorums

As a prerequisite to subsequent analyses, it is helpful to understand which quorums (cf. Def. 3.4) exist in an FBAS. We will be focusing on minimal quorums, i.e., quorums \(\hat{U} \subseteq {{\,\mathrm{\mathbf {V}}\,}}\) for which there is no proper subset \(U \subset \hat{U}\) that is also a quorum. Informally, the set of all minimal quorums \(\hat{\mathcal {U}}\) carries sufficient information for precisely determining FBAS-wide liveness properties, while being of significantly smaller size than the set of all quorums \(\mathcal {U}\).

Definition 4.1

(minimal node set) Within the set of node sets \(\mathcal {N}\subseteq 2^{{{\,\mathrm{\mathbf {V}}\,}}}\), a member set \(\hat{N} \in \mathcal {N}\) is minimal iff none of its proper subsets is included in \(\mathcal {N}\)—i.e., \(\forall N \in \mathcal {N}, N \not \subset \hat{N}\).

The FBAS depicted in Fig. 1 has the quorums \(\mathcal {U}= \{{\{{0,1,2}\}, \{{0,3,4}\}, \{{0,1,2,3,4}\}}\}\) and consequently the minimal quorums \(\hat{\mathcal {U}} = \{{\{{0,1,2}\}, \{{0,3,4}\}}\}\).

The notion of minimal quorums is helpful, among other things, for efficiently determining whether an FBAS enjoys quorum intersection [12]: it can be shown that an FBAS enjoys quorum intersection iff every two of its minimal quorums intersect (Cor. A.1).
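Def. 4.1 translates into a short filter over a collection of node sets. The following sketch applies it to the quorums of the Fig. 1 example as listed above and then checks quorum intersection on the minimal quorums only; the function name `minimal_sets` is a hypothetical helper.

```python
from itertools import combinations

fs = frozenset


def minimal_sets(sets):
    """Def. 4.1: keep the member sets that have no proper subset in `sets`."""
    sets = {fs(s) for s in sets}
    return {s for s in sets if not any(t < s for t in sets)}


# Quorums of the example FBAS from Fig. 1, as stated in the text.
quorums = [fs({0, 1, 2}), fs({0, 3, 4}), fs({0, 1, 2, 3, 4})]
minimal_quorums = minimal_sets(quorums)
print(sorted(sorted(u) for u in minimal_quorums))

# Checking pairwise intersection on the minimal quorums only.
print(all(u1 & u2 for u1, u2 in combinations(minimal_quorums, 2)))
```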

4.2 Minimal blocking sets

As per Thm. 3.1, an FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) cannot enjoy liveness if it does not contain at least one non-faulty quorum. Considering the state of the art in consensus protocols for the FBAS setting and their formal verification (see also Sec. 3.3), quorum availability despite faulty nodes is furthermore the only precondition to achieving liveness that depends on \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) and arguably the most difficult to satisfy in a practical deployment. However, while quorum availability can easily be checked based on \({{\,\mathrm{\mathbf {Q}}\,}}\), faulty nodes are usually not readily identifiable as such in practice. We therefore propose, as a means to grasping liveness risks, to look at sets of nodes that, if faulty, can undermine quorum availability.

Definition 4.2

(blocking set) Let \(\mathcal {U}\subseteq 2^{{{\,\mathrm{\mathbf {V}}\,}}}\) be the set of all quorums of the FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\). We denote the set \(B \subseteq {{\,\mathrm{\mathbf {V}}\,}}\) as blocking iff it intersects every quorum of the FBAS, i.e., \(\forall U \in \mathcal {U}, B \cap U \ne \emptyset \).

For example: \(\{{0}\}\) and \(\{{1,3}\}\) are both blocking sets for \(\mathcal {U}= \{{\{{0,1,2}\}, \{{0,3,4}\}, \{{0,1,2,3,4}\}}\}\).

Corollary 4.1

(blocking sets and liveness) Control over any blocking set B is sufficient for compromising the liveness of an FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\).

Proof

As B intersects all quorums of the FBAS, there is no quorum that can be formed without cooperation by B. Without at least one non-faulty quorum, liveness is not possible as per Thm 3.1. \(\square \)

Notably, blocking sets can also block liveness selectively, enabling censorship. As nodes from the blocking set are present in every quorum, consensus will never be reached on any value that the blocking set opposes. For example, in the context of Stellar, the blocking set could block the ratification of transactions involving specific accounts. We chose the term blocking in analogy to the v-blocking sets introduced in [16]. As an important distinction, we use the term blocking set to refer to a property of the whole FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\), as opposed to a property of an individual node \(v \in {{\,\mathrm{\mathbf {V}}\,}}\).

In the above example, \(\{{0}\}\) and \(\{{1,3}\}\) are not only blocking sets with respect to \(\mathcal {U}\), they are minimal blocking sets, i.e., none of their proper subsets is a blocking set (Footnote 4). In essence, minimal blocking sets describe minimal threat (respectively, fail) scenarios w.r.t. liveness.
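For small FBASs, minimal blocking sets can be found by exhaustively testing candidate sets in order of increasing size. The sketch below is such a brute-force illustration and should not be read as the algorithm developed in Sec. 5; the function names are hypothetical, and the input is the set of quorums of the Fig. 1 example as given in the text.

```python
from itertools import combinations

fs = frozenset


def is_blocking(b, quorums):
    """Def. 4.2: b intersects every quorum."""
    return all(b & u for u in quorums)


def minimal_blocking_sets(quorums):
    """Enumerate candidates by increasing size; a blocking candidate is
    minimal iff no previously found (smaller) blocking set is a subset."""
    nodes = sorted(set().union(*quorums))
    found = []
    for r in range(1, len(nodes) + 1):
        for c in combinations(nodes, r):
            b = fs(c)
            if is_blocking(b, quorums) and not any(m <= b for m in found):
                found.append(b)
    return found


quorums = [fs({0, 1, 2}), fs({0, 3, 4}), fs({0, 1, 2, 3, 4})]
print([sorted(b) for b in minimal_blocking_sets(quorums)])
```

Besides \(\{{0}\}\) and \(\{{1,3}\}\), the enumeration should also report the remaining combinations of one node from \(\{{1,2}\}\) and one node from \(\{{3,4}\}\).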

4.3 Minimal splitting sets

As per Thm. 3.2, an FBAS can only be considered safe (as one coherent system) as long as it enjoys quorum intersection despite faulty nodes, i.e., as long as any two of its quorums intersect even after all faulty nodes have been deleted (as per Def. 3.7). For practical purposes, quorum intersection despite faulty nodes is furthermore a sufficient condition for achieving safety in an FBAS, considering protocols like SCP and the correctness proofs surrounding them (see also Sec. 3.4). Hence, for assessing the risk to safety, it is interesting to identify sets of nodes that can cause an FBAS to effectively lose quorum intersection. We call such a set of nodes a splitting set, as it can, if faulty, cause at least two quorums to diverge, splitting the FBAS.

Definition 4.3

(splitting set) We denote the set \(S \subseteq {{\,\mathrm{\mathbf {V}}\,}}\) a splitting set iff \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})^S\) lacks quorum intersection—i.e., there are distinct quorums \(U_{1}\) and \(U_{2}\) of \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})^S\) so that \(U_{1} \cap U_{2} = \emptyset \).

In the above example with \(\hat{\mathcal {U}} = \{{\{{0,1,2}\},\{{0,3,4}\}}\}\), \(\{{0}\}\) is already a splitting set, as \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})^{\{{0}\}}\) induces the two non-intersecting quorums \(\{{1,2}\}\) and \(\{{3,4}\}\). Intuitively, \({\{{0}\}}\) is a splitting set of \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) because it forms the intersection of the quorums \(\{{0,1,2}\}\) and \(\{{0,3,4}\}\).

The existence of a faulty splitting set violates quorum intersection despite faulty nodes and therefore, as per Thm. 3.2, threatens safety. Informally, the members of a splitting set can perform two types of actions to compromise safety in practice (see also Thm. A.1). On the one hand, they can change their quorum configurations (or lie about them) to cause existing quorums to shrink or new quorums to emerge, both with the goal of reducing the overlap between quorums. On the other hand, whenever the intersection of two (minimal) quorums consists entirely of faulty nodes, these nodes can agree to different statements in each quorum, causing the quorums to externalize conflicting values and in this way diverge.

As with blocking sets, we are especially interested in finding the minimal splitting sets \(\hat{\mathcal {S}} \subset 2^{{{\,\mathrm{\mathbf {V}}\,}}}\) of an FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) (Footnote 5). Minimal splitting sets describe minimal threat scenarios w.r.t. safety.

4.4 Top tier

For narrowing down notions of “centralization” with respect to FBASs, we propose the concept of a top tier. Informally, the top tier is the set of nodes in the FBAS that is exclusively relevant when determining minimal blocking sets and hence the liveness buffers of an FBAS.

Definition 4.4

(top tier) The top tier of an FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) is the set of all nodes that are contained in one or more minimal quorums—i.e., if \(\hat{\mathcal {U}} \subseteq 2^{{{\,\mathrm{\mathbf {V}}\,}}}\) is the set of all minimal quorums of the FBAS, \(T=\bigcup {\hat{\mathcal {U}}}\) is its top tier.

In the above example, it in fact holds that \(T = \{{0,1,2,3,4}\} = {{\,\mathrm{\mathbf {V}}\,}}\).

It can be shown that each minimal blocking set consists exclusively of top tier nodes (Cor. A.5), and each top tier node is included in at least one minimal blocking set (Thm. A.2). The FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) with top tier T has therefore the same properties w.r.t. global liveness as the FBAS induced by T, i.e., the FBAS \((T, {{\,\mathrm{\mathbf {Q}}\,}}^\prime )\) with \({{\,\mathrm{\mathbf {Q}}\,}}^\prime (v) := \{{q \cap T \mid q \in {{\,\mathrm{\mathbf {Q}}\,}}(v)}\}\).

This observation has direct implications for the computational complexity of FBAS analysis (further discussed in Sec. 5), and for the performance of FBAS-based consensus protocols. A consensus round in SCP (the so far only production-ready protocol for the FBAS setting, to the best of our knowledge) can demonstrably be completed in \(O(|{T}|^2)\) messages. While classical consensus protocols with quadratic message complexity (such as PBFT [4]) are notorious for becoming unusable in larger validator groups, several improved protocols have recently emerged that target the blockchain use case and scenarios with 100 and more validators [20, 22]. As a possible avenue for future exploration, such existing permissioned protocols could be adapted, without much modification, for FBASs with a symmetric top tier.

Definition 4.5

(symmetric top tier) The top tier T of an FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) is a symmetric top tier iff all top tier nodes have identical quorum sets—i.e., \(\exists D \in \mathfrak {D}, \forall v \in T: {{\,\mathrm{\mathbf {Q}}\,}}(v) = {{\,\mathrm{qset}\,}}(v, D)\).

Symmetric top tiers are also significantly more amenable to analysis. For example, in FBASs with a symmetric top tier T and a non-nested top tier quorum set \((T, \emptyset , t)\), it holds that any minimal blocking set has cardinality \(|{\hat{B}}| = |{T}|-t+1\) (Thm. A.3) and any minimal splitting set that can cause two top tier nodes to diverge from each other has cardinality \(|{\hat{S}}| = 2t-|{T}|\) (Thm. A.4).

5 Analysis algorithms

In the following, we propose algorithms for performing the analyses introduced in Sec. 4. We describe them as pseudocode that necessarily abstracts away some implementation details and optimizations. As a companion to this paper, we release a well-tested implementation of the presented algorithms as open source (fbas_analyzer, Footnote 6). After outlining algorithms for enumerating minimal quorums (foundation for further analyses), determining quorum intersection (necessary condition for safety), enumerating minimal blocking sets (liveness “buffers”), enumerating minimal splitting sets (safety “buffers”), and efficiently dealing with symmetric top tiers, the section concludes with a short empirical study on analysis scalability.

5.1 Minimal quorums

Algorithm 1 describes a branch-and-bound algorithm for finding all minimal quorums. It is based on a quorum enumeration procedure originally described in [12]. Previous algorithms did not rigorously filter out non-minimal quorums, which we realize through is_minimal_quorum. The set of all minimal quorums of an FBAS defines its top tier (cf. Sec. 4.4) and can be used for determining whether the FBAS enjoys quorum intersection.

figure b

The keystone of the algorithm is the function fmq_step that takes a current quorum candidate U, a sorted list of yet-to-be-considered nodes V and a reference to \({{\,\mathrm{\mathbf {Q}}\,}}\) for mapping nodes to their quorum sets. The algorithm implements a classical branching pattern: at each invocation of fmq_step in which U is not already a quorum, the next node in V is taken out and, in one branch, added to U, and, in the other, not. Hopeless branches are identified early using the \(\texttt {is}\_\texttt {satisfiable}\) function.
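The branching pattern described above can be sketched in Python as follows. This is a minimal illustration and not the fbas_analyzer implementation. It assumes flat (non-nested) quorum sets given as `(validators, threshold)` pairs, visits nodes in plain sorted order instead of a PageRank ordering, and inlines the satisfiability check. The helper names other than `fmq_step`, `is_minimal_quorum` and `find_minimal_quorums` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Assumption: every quorum set is flat, i.e. (validators, threshold).
QSets = Dict[int, Tuple[FrozenSet[int], int]]


def is_satisfied(v: int, nodes: Set[int], qsets: QSets) -> bool:
    """True if at least `threshold` of v's validators are in `nodes`."""
    validators, threshold = qsets[v]
    return len(validators & nodes) >= threshold


def is_quorum(nodes: Set[int], qsets: QSets) -> bool:
    return bool(nodes) and all(is_satisfied(v, nodes, qsets) for v in nodes)


def contains_quorum(nodes: Set[int], qsets: QSets) -> bool:
    """Repeatedly drop unsatisfied nodes; a non-empty remainder is a quorum."""
    remaining = set(nodes)
    while True:
        satisfied = {v for v in remaining if is_satisfied(v, remaining, qsets)}
        if satisfied == remaining:
            return bool(remaining)
        remaining = satisfied


def is_minimal_quorum(nodes: Set[int], qsets: QSets) -> bool:
    """A quorum is minimal if removing any single member leaves no quorum."""
    return is_quorum(nodes, qsets) and not any(
        contains_quorum(nodes - {v}, qsets) for v in nodes
    )


def fmq_step(u: Set[int], rest: List[int], qsets: QSets,
             found: List[FrozenSet[int]]) -> None:
    if is_quorum(u, qsets):
        if is_minimal_quorum(u, qsets):
            found.append(frozenset(u))
        return  # supersets of a quorum cannot be minimal quorums
    if not rest:
        return
    # Pruning: every node in u must be satisfiable by u plus the
    # yet-to-be-considered nodes, otherwise the branch is hopeless.
    reachable = u | set(rest)
    if not all(is_satisfied(v, reachable, qsets) for v in u):
        return
    head, tail = rest[0], rest[1:]
    fmq_step(u | {head}, tail, qsets, found)  # branch 1: include head
    fmq_step(u, tail, qsets, found)           # branch 2: exclude head


def find_minimal_quorums(qsets: QSets) -> List[FrozenSet[int]]:
    found: List[FrozenSet[int]] = []
    fmq_step(set(), sorted(qsets), qsets, found)
    return found


# Four nodes that all require 3 out of {0, 1, 2, 3}.
demo = {v: (frozenset(range(4)), 3) for v in range(4)}
print(find_minimal_quorums(demo))
```

In this toy instance every minimal quorum is a 3-node subset, which matches the expectation for a flat threshold of 3 out of 4.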

As proposed in [12], we initially sort V using a heuristic such as PageRank [19], which can improve the algorithm’s performance in practice. Another important optimization from [12], which we leave out of our pseudocode for greater clarity, is the partitioning of \({{\,\mathrm{\mathbf {V}}\,}}\) into strongly connected components (Footnote 7) so that find_minimal_quorums must be applied only to (often significantly smaller) subsets of \({{\,\mathrm{\mathbf {V}}\,}}\). Tarjan [21] gives an algorithm for performing this preprocessing step in linear time.

As noted in other works (e.g., [1, 12]), determining quorum intersection, and hence also enumerating all minimal quorums, is NP-hard. Consequently, our algorithm has exponential time complexity. For an FBAS with \(n = |{{{\,\mathrm{\mathbf {V}}\,}}}|\) nodes and a top tier of size \(m= |{T}|\) we find all \(k \le \left( {\begin{array}{c}m\\ \lceil {\frac{m}{2}}\rceil \end{array}}\right) \) minimal quorums in \(O(2^n)\). Note that in practice the number of de-facto considered nodes n is greatly reduced through polynomial-time preprocessing steps such as strongly-connected-component analysis and heuristics-based sorting, yielding actual running times that are close to the \(O(2^m)\) bound.

5.2 Quorum intersection

Quorum intersection is a central property for being able to guarantee safety in an FBAS (cf. Sec. 4.3). Quorum intersection can be determined by checking the pairwise intersection of all minimal quorums (Cor. A.1). This straightforward approach, that was also proposed in [12], is embodied in Algorithm 2.

figure c

In this paper, we propose an additional, alternative algorithm (Algorithm 3), that doesn’t check for pairwise intersections but instead checks whether the complement sets of found quorums contain quorums themselves. If this is never the case, the FBAS enjoys quorum intersection. This approach for checking for quorum intersection has the benefit that only a constant number of node sets must be held in memory at the same time, as opposed to all minimal quorum sets as in Algorithm 2. The space complexity of the check is therefore reduced from exponential to linear.

figure d

Our implementation of Algorithm 3 is also empirically faster for many FBASs, probably because contains_quorum scales better than iterating once over all minimal quorums, and because less data must be written to memory. For both algorithms, we leave out optimization details such as leveraging the fact that quorum intersection is guaranteed to hold if all minimal quorums \(\hat{U} \in \hat{\mathcal {U}}\) have cardinality greater than \(\frac{|{\bigcup \hat{\mathcal {U}}}|}{2}\). In Algorithm 3, for example, it suffices to check only minimal quorums with at most \(\frac{|{\bigcup \hat{\mathcal {U}}}|}{2}\) members.
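The complement-based check behind Algorithm 3 can be illustrated with the following Python sketch. It is not the fbas_analyzer implementation: it assumes flat `(validators, threshold)` quorum sets, and the generator `all_quorums` is a brute-force stand-in (a hypothetical helper) for a streaming quorum enumeration. The point of the sketch is that only the current quorum and its complement are held at any time.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterator, Set, Tuple

# Assumption: every quorum set is flat, i.e. (validators, threshold).
QSets = Dict[int, Tuple[FrozenSet[int], int]]


def is_satisfied(v: int, nodes: Set[int], qsets: QSets) -> bool:
    validators, threshold = qsets[v]
    return len(validators & nodes) >= threshold


def is_quorum(nodes: Set[int], qsets: QSets) -> bool:
    return bool(nodes) and all(is_satisfied(v, nodes, qsets) for v in nodes)


def contains_quorum(nodes: Set[int], qsets: QSets) -> bool:
    """Repeatedly drop unsatisfied nodes; a non-empty remainder is a quorum."""
    remaining = set(nodes)
    while True:
        satisfied = {v for v in remaining if is_satisfied(v, remaining, qsets)}
        if satisfied == remaining:
            return bool(remaining)
        remaining = satisfied


def all_quorums(qsets: QSets) -> Iterator[Set[int]]:
    """Brute-force quorum generator, for illustration on tiny inputs only."""
    nodes = sorted(qsets)
    for size in range(1, len(nodes) + 1):
        for candidate in combinations(nodes, size):
            if is_quorum(set(candidate), qsets):
                yield set(candidate)


def has_quorum_intersection(qsets: QSets) -> bool:
    """Quorum intersection holds iff no quorum's complement contains a quorum."""
    all_nodes = set(qsets)
    for quorum in all_quorums(qsets):
        if contains_quorum(all_nodes - quorum, qsets):
            return False
    return True


strict = {v: (frozenset(range(4)), 3) for v in range(4)}  # 3 out of 4
loose = {v: (frozenset(range(4)), 2) for v in range(4)}   # 2 out of 4
print(has_quorum_intersection(strict), has_quorum_intersection(loose))
```

Iterating over all quorums instead of only the minimal ones does not change the result, since any two disjoint quorums each contain a minimal quorum.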

5.3 Minimal blocking sets

Algorithm 4 presents our algorithm for enumerating all minimal blocking sets based on a branch-and-bound strategy. The check whether a given candidate set B is blocking is performed by checking whether the FBAS contains any quorums after B is removed from the node population. If a blocking set can still be formed from B and the yet-to-be-considered nodes V (this is the pruning rule), the enumeration continues, branching via either adding the next node in V to the candidate set or discarding it altogether. The order in which nodes are visited can be tuned using a suitable heuristic—we sort nodes using PageRank [19] (as for finding minimal quorums) in the example pseudocode and our current implementation. Like for Algorithm 1, the complexity of Algorithm 4 is in \(O(2^n)\) (for an FBAS with n nodes) with a likely practical average case complexity of \(O(2^m)\) (\(m\) being the size of the top tier).

figure e
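As an illustration of the enumeration strategy, the following Python sketch implements the branching and the pruning rule under simplifying assumptions: quorum sets are flat `(validators, threshold)` pairs and nodes are visited in sorted order rather than by PageRank. The names `fmb_step` and `find_minimal_blocking_sets` are hypothetical and do not refer to the fbas_analyzer code.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Assumption: every quorum set is flat, i.e. (validators, threshold).
QSets = Dict[int, Tuple[FrozenSet[int], int]]


def contains_quorum(nodes: Set[int], qsets: QSets) -> bool:
    """Repeatedly drop unsatisfied nodes; a non-empty remainder is a quorum."""
    remaining = set(nodes)
    while True:
        satisfied = {
            v for v in remaining
            if len(qsets[v][0] & remaining) >= qsets[v][1]
        }
        if satisfied == remaining:
            return bool(remaining)
        remaining = satisfied


def is_blocking(b: Set[int], all_nodes: Set[int], qsets: QSets) -> bool:
    """B is blocking if no quorum survives once B is removed."""
    return not contains_quorum(all_nodes - b, qsets)


def fmb_step(b: Set[int], rest: List[int], all_nodes: Set[int],
             qsets: QSets, found: List[FrozenSet[int]]) -> None:
    if is_blocking(b, all_nodes, qsets):
        # Blocking is monotone, so checking single removals suffices.
        if not any(is_blocking(b - {v}, all_nodes, qsets) for v in b):
            found.append(frozenset(b))
        return
    # Pruning rule: stop if even b plus all outstanding nodes cannot block.
    if not is_blocking(b | set(rest), all_nodes, qsets):
        return
    head, tail = rest[0], rest[1:]
    fmb_step(b | {head}, tail, all_nodes, qsets, found)  # include head
    fmb_step(b, tail, all_nodes, qsets, found)           # discard head


def find_minimal_blocking_sets(qsets: QSets) -> List[FrozenSet[int]]:
    found: List[FrozenSet[int]] = []
    fmb_step(set(), sorted(qsets), set(qsets), qsets, found)
    return found


demo = {v: (frozenset(range(4)), 3) for v in range(4)}  # 3 out of 4
print(find_minimal_blocking_sets(demo))
```

For this 3-out-of-4 instance, all minimal blocking sets have two members, in line with the cardinality \(|{T}|-t+1\) stated in Sec. 4.4 for symmetric top tiers.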

5.4 Minimal splitting sets

Algorithm 5 presents our algorithm for enumerating all minimal splitting sets. We again perform a branch-and-bound search. The final condition for accepting a candidate set S is whether deleting it (cf. Def. 3.7) from the FBAS causes the FBAS to lose quorum intersection.

This check is significantly more expensive than the corresponding checks in Algorithm 1 and Algorithm 4. Additionally, unlike the previously presented algorithms, Algorithm 5 also needs to consider non-top tier nodes as candidates. We incorporate the observation (from Thm. A.1) that a node can only be part of a minimal splitting set if it is part of a minimal quorum (only then can it be part of an intersection of minimal quorums) or if a change of its quorum set can potentially cause new, smaller quorums to emerge. Consequently, we consider as candidates all top tier nodes and all nodes that are quorum expanders: nodes that are part of a quorum slice of another node that is not a quorum slice for themselves (formal definition in Def. A.1). Informally, by not sharing a quorum slice with a node they affect, quorum expanders may force quorums to expand beyond this quorum slice. By changing their quorum set, quorum expanders could reverse this effect, leading to smaller quorums and, accordingly, an increased risk to quorum intersection.

The has_potential function embodies an explicit pruning condition for the branch-and-bound search. Here, we check whether a change in the FBAS’s minimal quorums is possible if some or all outstanding candidate nodes V are joined with the current candidate set S. As a heuristic to avoid actually calculating minimal quorums, we check whether the quorum-containing strongly connected components of the FBAS change after deleting V in addition to S.

For improving readability and comprehension, we leave out various details and smaller optimizations from our pseudocode listing for Algorithm 5. Among other things, we don’t include our full algorithms for enumerating quorum_expanders and deliberately ignore opportunities for caching and reusing the results of costly operations.

figure f

The asymptotic complexity of Algorithm 5 remains in \(O(2^n)\), respectively \(O(2^{|{T \cup X}|})\) where T is the top tier and X the set of all quorum expanders. However, due to the costly acceptance check for splitting sets and the larger number of nodes that need to be considered, the algorithm is significantly slower than Algorithm 1 and Algorithm 4 in practice.

5.5 Symmetric clusters

As a generalization of symmetric top tiers (Def. 4.5), we define symmetric clusters of an FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) as groups of nodes \(Y \subseteq {{\,\mathrm{\mathbf {V}}\,}}\) such that \(\exists D \in \mathfrak {D}, \forall v \in Y: {{\,\mathrm{\mathbf {Q}}\,}}(v) = {{\,\mathrm{qset}\,}}(v, D)\) and \(\bigcup {\bigcup {\{{{{\,\mathrm{\mathbf {Q}}\,}}(v), v \in Y}\}}} = Y\). If an FBAS has one symmetric cluster Y and \({{\,\mathrm{\mathbf {V}}\,}}\setminus Y\) does not contain a quorum, Y is the symmetric top tier of \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) (Footnote 8).

Symmetric clusters can be found in polynomial time, by grouping nodes with identical quorum set configurations (values for \({{\,\mathrm{\mathbf {Q}}\,}}\)) and checking the above condition for each thus formed candidate set.
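The grouping step can be sketched in a few lines of Python. The sketch assumes flat `(validators, threshold)` quorum sets and uses a simplified, sufficient form of the cluster condition: the shared validator list must equal the group itself and the threshold must be attainable. The function name `find_symmetric_clusters` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set, Tuple

# Assumption: every quorum set is flat, i.e. (validators, threshold).
QSets = Dict[int, Tuple[FrozenSet[int], int]]


def find_symmetric_clusters(qsets: QSets) -> List[Set[int]]:
    """Group nodes by identical quorum set and keep self-contained groups."""
    groups: Dict[Tuple[FrozenSet[int], int], Set[int]] = defaultdict(set)
    for node, qset in qsets.items():
        groups[qset].add(node)
    clusters = []
    for (validators, threshold), members in groups.items():
        # Simplified condition: the group references exactly itself and
        # its threshold can actually be met.
        if validators == members and 1 <= threshold <= len(validators):
            clusters.append(members)
    return clusters


demo = {
    0: (frozenset({0, 1, 2}), 2),
    1: (frozenset({0, 1, 2}), 2),
    2: (frozenset({0, 1, 2}), 2),
    3: (frozenset({0, 1, 2, 3}), 3),  # depends on the cluster, not part of it
}
print(find_symmetric_clusters(demo))
```

Each node is visited once and each group is checked once, so the effort is polynomial in the number of nodes.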

Symmetric clusters can be analyzed significantly more efficiently. For example, an FBAS with a non-nested symmetric top tier is isomorphic to a classical, threshold-based quorum system (s.a. Thm. A.3 and A.4). For symmetric clusters formed around a nested quorum set, minimal quorums and minimal blocking sets can be enumerated without the overhead of checking candidate sets, by recursively listing combinations and forming their Cartesian product. If the interest is to find only such splitting sets that can cause nodes within the symmetric cluster to diverge, then the same is true for minimal splitting sets.

5.6 Analysis performance

Our analysis approach requires the enumeration of minimal quorums, minimal blocking sets and minimal splitting sets—which in all three cases is an NP-hard problem. It is unclear, however, what this means for the practical limitations of thoroughly determining the safety and liveness buffers of an FBAS. Practical limitations are difficult to conclusively determine as the real-life performance of analyses depends heavily on the topology of analyzed FBASs and the implementation of the algorithms.

In the following, we present a short exploratory study into the scalability of our own implementation. We construct synthetic FBASs of increasing size that consist of only a top tier. In the first series of presented experiments (Fig. 2), we construct FBASs \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) resembling classical \(3f+1\) quorum systems:

$$\begin{aligned} \forall v \in {{\,\mathrm{\mathbf {V}}\,}}: {{\,\mathrm{\mathbf {Q}}\,}}(v) = {{\,\mathrm{qset}\,}}(v, ({{\,\mathrm{\mathbf {V}}\,}}, \emptyset , \lceil {\frac{2|{{{\,\mathrm{\mathbf {V}}\,}}}|+1}{3}}\rceil )) \end{aligned}$$

In a second series of experiments (Fig. 3), we approximate the structure of the Stellar network’s top tier where each organization is represented by (usually) 3 physical nodes arranged in crash failure-tolerating \(2f+1\) inner quorum sets:

$$\begin{aligned} {{\,\mathrm{\mathbf {V}}\,}}&= \{{v_0, v_1, ... v_{n-1}}\}, n = 3m\\ \mathcal {I}&= \{{(\{{v_{3i}, v_{3i+1}, v_{3i+2}}\}, \emptyset , 2) \mid i \in [0, m)}\}\\ \forall v \in {{\,\mathrm{\mathbf {V}}\,}}: {{\,\mathrm{\mathbf {Q}}\,}}(v)&= {{\,\mathrm{qset}\,}}(v, (\emptyset , \mathcal {I}, \lceil {\frac{2m+1}{3}}\rceil )) \end{aligned}$$
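For illustration, both synthetic families can be generated with a few lines of Python. The sketch below is not taken from fbas_analyzer: it represents a quorum set as a plain `(validators, inner_sets, threshold)` tuple mirroring the triple notation used in the formulas, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: a quorum set is a (validators, inner_sets, threshold) tuple.
QuorumSet = Tuple[List[int], list, int]


def bft_threshold(k: int) -> int:
    """Integer form of ceil((2k + 1) / 3)."""
    return (2 * k + 1 + 2) // 3


def flat_fbas(n: int) -> Dict[int, QuorumSet]:
    """Every node uses all n nodes as validators with a 3f+1-style threshold."""
    qset: QuorumSet = (list(range(n)), [], bft_threshold(n))
    return {v: qset for v in range(n)}


def stellar_like_fbas(m: int) -> Dict[int, QuorumSet]:
    """m organizations of 3 nodes each, arranged in 2-of-3 inner quorum sets."""
    inner_sets = [([3 * i, 3 * i + 1, 3 * i + 2], [], 2) for i in range(m)]
    qset: QuorumSet = ([], inner_sets, bft_threshold(m))
    return {v: qset for v in range(3 * m)}


print(flat_fbas(4)[0])
print(stellar_like_fbas(4)[0])
```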

We enumerate all minimal quorums, minimal blocking sets and minimal splitting sets of thus generated FBASs and record the time to completion of each of these operations. All analyses were single-threaded and performed on regular server-class hardware. We explicitly deactivated all optimizations based on detecting and exploiting symmetric clusters, so that the results of this study reflect the performance of the more expensive Algorithms 1, 4 and 5.

Fig. 2. Analysis duration for FBASs resembling classical \(3f+1\) quorum systems. Analysis optimizations for symmetric top tiers were turned off.

Fig. 3. Analysis duration for FBASs resembling the structure of the Stellar network top tier. Analysis optimizations for symmetric top tiers were turned off.

Figures 2 and 3 depict the median measured times on a log scale, from a set of 10 measurements per FBAS size (we performed the same analysis 10 times, recording individual times). As expected, analysis durations rise exponentially with growing top tier sizes m. Analyses start requiring more than an hour to finish at \(m \ge 23\) for flat symmetric top tiers and \(m \ge 24\) for Stellar-like topologies. This is a cautiously positive result: top tier sizes observed in practice are currently in the range of 7 organizations (23 raw nodes) for the Stellar network (cf. Appendix C) and 7 organizations (10 raw nodes) for the MobileCoin network [18]. It is likely that, for example through parallelization or the development of additional optimizations for “almost symmetric” FBASs, the analysis durations for naturally occurring FBASs can be reduced further.

6 Bootstrapping FBASs

The reported openness enabled through the FBAS paradigm comes at the cost of increased configuration responsibilities for node operators. As discussed in Sec. 3, each node must become associated with a quorum set (respectively quorum slices) in order to become a useful part of an FBAS. We will refer to this process as quorum set configuration (QSC). But how should a node operator go about QSC? Based on the analytical toolset introduced in Sec. 4, we can now investigate what kinds of QSC policies are plausible and in what kind of FBASs they result.

Notably, we explore how individual preferences (such as which nodes should be “trusted”) can be mapped to the quorum set formalism. Based on experiments that use Internet topology as a representative graph representation of interdependence and trust, we conclude that purely individualistic configuration policies can result in systems with low liveness and high complexity. We outline possible directions for future research by sketching policies with a strategic element and empirically demonstrating their effectiveness.

6.1 QSC policies and their evaluation

A QSC policy is individually and repeatedly invoked for each node \(v \in {{\,\mathrm{\mathbf {V}}\,}}\). It takes information about a current FBAS instance \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) as input and returns a quorum set for v, setting a new value for \({{\,\mathrm{\mathbf {Q}}\,}}(v)\). We use the quorum set formalization introduced in Sec. 3.2. For illustration, consider the following trivial policy:

$$\begin{aligned} \forall v \in {{\,\mathrm{\mathbf {V}}\,}}:\quad {{\,\mathrm{\mathbf {Q}}\,}}(v) = {{\,\mathrm{qset}\,}}(v, ({{\,\mathrm{\mathbf {V}}\,}}, \emptyset , |{{{\,\mathrm{\mathbf {V}}\,}}}|)) \end{aligned}$$
(Super Safe QSC)

If implemented by all nodes in \({{\,\mathrm{\mathbf {V}}\,}}\), Super Safe QSC leads to each node having only one quorum slice—\({{\,\mathrm{\mathbf {V}}\,}}\) itself (\({{\,\mathrm{\mathbf {Q}}\,}}(v) = \{{{{\,\mathrm{\mathbf {V}}\,}}}\}\)). The policy maximizes safety but leads to blocking sets of cardinality 1—any node can block the single quorum in the induced FBAS.

As an improvement, the threshold of the formed quorum sets can be set in resemblance to classical BFT protocols:

$$\begin{aligned} \forall v \in {{\,\mathrm{\mathbf {V}}\,}}:\quad {{\,\mathrm{\mathbf {Q}}\,}}(v) = {{\,\mathrm{qset}\,}}(v, ({{\,\mathrm{\mathbf {V}}\,}}, \emptyset , \lceil {\frac{2|{{{\,\mathrm{\mathbf {V}}\,}}}|+1}{3}}\rceil )) \end{aligned}$$
(Ideal Open QSC)

For \(|{{{\,\mathrm{\mathbf {V}}\,}}}| = 3f + 1\) with an \(f \in \mathbb {Z}^{+}\), setting the threshold to \(t = \lceil {\frac{2|{{{\,\mathrm{\mathbf {V}}\,}}}|+1}{3}}\rceil \) leads to FBASs in which any \(2f + 1\) nodes form a (minimal) quorum. This results in both all minimal blocking sets and all minimal splitting sets of the induced FBAS having cardinality \(f + 1\), i.e., both safety and liveness can be maintained in the face of up to f node failures.
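These cardinalities can be checked by substituting \(|{{{\,\mathrm{\mathbf {V}}\,}}}| = 3f+1\) into the threshold and into the formulas for symmetric top tiers from Sec. 4.4 (Thm. A.3 and Thm. A.4), using the fact that here the top tier is \(T = {{\,\mathrm{\mathbf {V}}\,}}\):

$$\begin{aligned} t&= \lceil {\frac{2(3f+1)+1}{3}}\rceil = \lceil {2f+1}\rceil = 2f+1\\ |{\hat{B}}|&= |{T}| - t + 1 = (3f+1) - (2f+1) + 1 = f+1\\ |{\hat{S}}|&= 2t - |{T}| = 2(2f+1) - (3f+1) = f+1 \end{aligned}$$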

6.1.1 Choosing validators

The preceding example policies construct non-nested quorum sets that use as validators U the set of all nodes in the FBAS (\(U = {{\,\mathrm{\mathbf {V}}\,}}\)). These are clearly toy examples: if nothing else, without additional mechanisms to restrict or filter the membership in \({{\,\mathrm{\mathbf {V}}\,}}\), \({{\,\mathrm{\mathbf {V}}\,}}\) can easily become dominated by faulty Sybil [5] nodes.

In the scope of this work, and in line with the motivation behind the FBAS paradigm, we consider \({{\,\mathrm{\mathbf {V}}\,}}\) to enjoy open membership, with no universally trusted whitelist or ranking. For arriving at sensible choices for U, QSC policies must therefore take individual knowledge into account.

6.1.2 Modeling individual preferences

QSC policies based on individual preferences contribute node-local knowledge to the collective FBAS configuration. For example:

  • Which nodes are trusted to be (and stay) non-faulty. It is often implied that QSC should reflect some form of trust, e.g., in wordings such as “flexible trust” [16] or “asymmetric distributed trust” [2]. While reasoning about the future behavior of participants in a consensus protocol might be an overwhelming task for node operators, they may at least encode plausible beliefs about non-Sybilness [5] (i.e., which groups of nodes are (un)likely to be controlled by the same entity).

  • To which nodes do dependencies exist (e.g., for business reasons).

    Adding nodes of organizations one interacts with to one’s quorum sets might be necessary to maintain “sync” with these organizations [13], as opposed to ending up with diverging ledgers in the event of a fork.

In the following discussion, we will use graph representations for modeling individual preferences. It is an intriguing hypothesis that the FBAS paradigm can enable Sybil-resistant and yet energy-efficient permissionless consensus by bootstrapping quorum systems along existing trust graphs or interdependence graphs. In Sec. 3.1 we saw that transforming an FBAS into an equally sized regular graph leads to a loss of information, i.e., can yield only heuristic representations. In the following sections we pose the inverse question: How can a “good” FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) be instantiated from a given graph \(G = ({{\,\mathrm{\mathbf {V}}\,}}, E)\)?

For evaluating example policies incorporating individual preferences, we will use the autonomous system (AS) relationships graph inferred by the CAIDA project (Footnote 9), a reflection of the interdependence and trust between networks that form the Internet. The topological structure of the Internet has repeatedly been cited as an argument for the viability of the FBAS model [13, 16]. We discuss results based on two snapshots of the AS relationships graph: one from January 1998, the earliest available snapshot, describing a younger Internet with 3233 ASs connected via 4921 (directed) customer/provider links and 852 (undirected) peering links, and one from January 2020, with 67308 ASs connected via 133864 customer/provider links and 312763 peering links. We will refer to the graphs as \(G_\text {AS98}\) and \(G_\text {AS20}\).

6.2 Naive individualistic QSC

We consider a QSC policy naively individualistic if it is based entirely on individual preferences. We model “preference for a node” as edges in a graph \(G = ({{\,\mathrm{\mathbf {V}}\,}}, E)\), with nodes being aware only of their own graph neighborhood.

Consider a simple representative of this class: forming quorum sets using the entire graph neighborhood of a node, weighting each neighbor equally within a \(3f + 1\) threshold logic (that models the assumption that strictly less than a third of all neighbors can be faulty):

$$\begin{aligned} \begin{aligned} \forall v \in {{\,\mathrm{\mathbf {V}}\,}}:\quad U&= \{{v}\} \cup \{{v^\prime \in {{\,\mathrm{\mathbf {V}}\,}}\mid (v, v^\prime ) \in E}\}\\ {{\,\mathrm{\mathbf {Q}}\,}}(v)&= {{\,\mathrm{qset}\,}}(v, (U, \emptyset , \lceil {\frac{2|{U}|+1}{3}}\rceil )) \end{aligned} \end{aligned}$$
(All Neighbors QSC)
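A minimal Python sketch of this policy is given below. It assumes that the preference graph is available as an adjacency dictionary mapping each node to the set of nodes it has outgoing edges to, and it returns flat `(validators, threshold)` pairs. The function name `all_neighbors_qsc` is hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple


def all_neighbors_qsc(
    edges: Dict[int, Set[int]],
) -> Dict[int, Tuple[FrozenSet[int], int]]:
    """Build one flat quorum set per node from its graph neighborhood."""
    qsets = {}
    for v, neighbors in edges.items():
        u = {v} | neighbors                  # U = {v} plus all neighbors of v
        threshold = (2 * len(u) + 1 + 2) // 3  # ceil((2|U| + 1) / 3)
        qsets[v] = (frozenset(u), threshold)
    return qsets


# Complete graph on 4 nodes: the result coincides with Ideal Open QSC.
complete = {v: {w for w in range(4) if w != v} for v in range(4)}
print(all_neighbors_qsc(complete))
```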

If G is a complete graph, we get the same result as with Ideal Open QSC. If G is not connected, we cannot have quorum intersection (and hence safety). The latter is also true if G contains more than one cluster of sufficient size and weak (relative) connectedness to the rest of the graph. We can confirm that this is the case for the AS graph snapshots \(G_\text {AS98}\) and \(G_\text {AS20}\). Using them, All Neighbors QSC induces FBASs that do not enjoy quorum intersection (Footnote 10). The high prevalence of AS peering is a likely explanation for why sufficiently well intraconnected clusters can emerge outside of the “natural” top tier of the AS graph.

A lack of quorum intersection implies that the induced FBASs may split into multiple sub-FBASs. This might be a desirable effect when bootstrapping from individual preferences. For example, separated communities with low levels of inter-community interaction and trust might prefer the added sovereignty of an “own” FBAS. We repeated the analysis for the respectively largest sub-FBASs, with an upper bound on top tier size (Footnote 11) of, respectively, 355 and 14339 nodes. Potential top tier sizes of this magnitude make a complete analysis infeasible (s.a. the discussion on analysis scalability in Sec. 5.6). This is problematic, as the robustness of the resulting FBASs, in terms of safety and liveness, cannot be reliably determined. Existing weaknesses in the global quorum structure cannot be identified and (strategically) fixed. Weaknesses, however, are likely to exist. For example, preliminary analysis results for the FBAS instantiated from \(G_\text {AS98}\) imply the existence of blocking sets with only 3 members.

6.3 Tier-based QSC

Towards making resulting top tiers more focused (and hence, the resulting FBASs more efficient and more amenable to analysis), QSC policies can incorporate strategic considerations in addition to individual preferences. We explore a prudent example strategy in the following: the weighting of nodes based on tierness, or relative importance. Tierness is an established notion for ASs in the Internet graph. For FBASs, a tiered quorum structure with every node including only higher-tier neighbors in its quorum sets was proposed (as an example) as early as in the original FBAS proposal [16]. Classifying nodes based on their tierness is also related to the quality-based configuration format currently used by the Stellar software [13]. Lastly, it is a plausible assumption that the relative tierness of graph neighbors can be estimated locally, enabling QSC decisions that do not require a global view.

We sketch an example QSC policy in which nodes use only higher-tier nodes in their quorum sets, or same-tier nodes if none of their neighbors appears to be of higher tier. We assume that nodes can infer the relative tierness of their graph neighbors, specifically, that they can determine which of their neighbors are of a higher tier than themselves. For simulation, we use the PageRank [19] score of nodes (calculated without damping) as a proxy for their tierness. Each simulated node considers a neighbor to be of higher (lower) tier if the neighbor’s PageRank score is at least twice (at most half) its own. More formally, with \(R(v)\) denoting the PageRank score of node v, \({{\,\mathrm{edges^{+}}\,}}(v)\) the set of its neighbors (\({{\,\mathrm{edges^{+}}\,}}(v) := \{{v^\prime \in {{\,\mathrm{\mathbf {V}}\,}}\mid (v, v^\prime ) \in E}\}\)), H its higher-tier neighbors and P its same-tier neighbors (“peers”):

$$\begin{aligned} \begin{aligned} H(v)&= \{{v^\prime \in {{\,\mathrm{edges^{+}}\,}}(v) \mid R(v^\prime ) \ge 2R(v)}\}\\ P(v)&= \{{v^\prime \in {{\,\mathrm{edges^{+}}\,}}(v) \mid \frac{1}{2}R(v)< R(v^\prime ) < 2R(v)}\} \end{aligned} \end{aligned}$$
(Tierness Heuristics)

Based on this heuristic, we can define the following QSC policy:

$$\begin{aligned} \begin{aligned} \forall v \in {{\,\mathrm{\mathbf {V}}\,}}: \quad U&= {\left\{ \begin{array}{ll} \, \{{v}\} \cup H(v) &{} \text{ if } H(v) \ne \emptyset \\ \, \{{v}\} \cup P(v) &{} \text{ else } \end{array}\right. }\\ {{\,\mathrm{\mathbf {Q}}\,}}(v)&= {{\,\mathrm{qset}\,}}(v, (U, \emptyset , \lceil {\frac{2|{U}|+1}{3}}\rceil )) \end{aligned} \end{aligned}$$
(Higher-Tier Neighbors QSC)
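The tierness heuristic and the resulting policy can be sketched in Python as follows. The sketch assumes that the scores \(R(v)\) have already been computed and are passed in as a dictionary (the score values in the example are made up for illustration), that the graph is given as an adjacency dictionary, and that quorum sets are flat `(validators, threshold)` pairs. The function name `higher_tier_neighbors_qsc` is hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple


def higher_tier_neighbors_qsc(
    edges: Dict[int, Set[int]],
    rank: Dict[int, float],
) -> Dict[int, Tuple[FrozenSet[int], int]]:
    """Use higher-tier neighbors as validators, or peers if there are none."""
    qsets = {}
    for v, neighbors in edges.items():
        # H(v): neighbors with at least twice the score of v.
        higher = {w for w in neighbors if rank[w] >= 2 * rank[v]}
        # P(v): neighbors whose score lies strictly between R(v)/2 and 2R(v).
        peers = {w for w in neighbors
                 if 0.5 * rank[v] < rank[w] < 2 * rank[v]}
        u = {v} | (higher if higher else peers)
        threshold = (2 * len(u) + 1 + 2) // 3  # ceil((2|U| + 1) / 3)
        qsets[v] = (frozenset(u), threshold)
    return qsets


# Made-up scores: node 2 is clearly of higher tier than nodes 0 and 1.
edges = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
rank = {0: 1.0, 1: 1.2, 2: 4.0}
print(higher_tier_neighbors_qsc(edges, rank))
```

In this toy graph, node 2 has neither higher-tier neighbors nor peers and therefore ends up with a one-node quorum set.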

Our results show that improvements to the naive case are possible when incorporating strategic considerations, despite the fact that the quorum structure is heavily influenced by individual preferences. Most prominently, top tiers become more manageable in size (both for analysis and for consensus protocols leveraging the FBAS).

Fig. 4. Histogram of the cardinalities of relevant sets in FBASs resulting from the application of Higher-Tier Neighbors QSC using snapshots of the AS relationship graph (\(G_\text {AS98}\), \(G_\text {AS20}\)).

We simulated the application of Higher-Tier Neighbors QSC using the AS graph snapshots \(G_\text {AS98}\) and \(G_\text {AS20}\). The two thus induced FBASs contained, respectively, 2 and 6 nodes with one-node quorum sets, which we filter out for the subsequent analysis. We apply fbas_analyzer, our software-based analysis framework (cf. Sec. 5), to the resulting FBASs.

Figure 4 presents the analysis findings. It depicts histograms of the relevant sets, i.e., how many minimal quorums, minimal blocking sets or minimal splitting sets of a given size exist for the given FBAS. For the \(G_\text {AS98}\) case, we restricted our minimal splitting sets analysis to the core of the FBAS, i.e., to its top tier and all nodes that are referenced by top tier nodes either directly or transitively (Footnote 12). We find that doing so yields more informative results; the full FBAS contains a large number of splitting sets with cardinality 1 that only split off very small groups of nodes from the rest. Even when restricting the analysis to core nodes only, we were not able to fully enumerate the minimal splitting sets for \(G_\text {AS20}\) in reasonable time, due to the size and specific structure of the resulting FBAS.

Strikingly, our analysis reveals that the liveness of both FBASs is easily compromised. Despite their relatively large top tiers (of 15 and 36 nodes, respectively), groups of only 2 nodes, and in the \(G_\text {AS20}\) case even one group of only one node, exist that are sufficient to completely block (or censor) the FBAS. For comparison, symmetric top tiers of the same size would result in all minimal blocking sets having sizes of, respectively, 5 and 12. This liveness-threatening discrepancy can be explained through cascading failures: If (for example) two nodes fail, this can result in a third node with a “weak” quorum set becoming unsatisfiable, so that three nodes have now de-facto failed, which can result in a fourth node becoming unsatisfiable, et cetera. It can be concluded that the composition and size of the smallest blocking sets for an FBAS are heavily influenced by the “weakest” quorum sets in the FBAS’s top tier. An additional example for cascading failures is given in Appendix B.

6.4 Symmetry enforcement

The graph-based QSC policies discussed so far easily result in systems that are brittle (in the sense of small minimal blocking sets) and hard to analyze. Both of these characteristics are vastly improved, relative to top tier size, in FBASs with symmetric top tiers. However, symmetric top tiers emerge organically from a preexisting relationship graph G only if the top tier nodes form a complete subgraph of G, which is not the case in the graphs investigated so far. As a policy enhancement, nodes believing themselves to be top tier can mirror the quorum sets of other apparently top tier nodes, strategically including non-neighbors in their quorum sets for improving the global FBAS structure. A behavior along these lines can, in fact, be observed in the live Stellar network (s.a. Appendix C).

Yet, by making validator decisions independent of the local knowledge representation G, new assumptions become necessary to be able to rule out attacks. Mirroring makes it easier for malicious top tier nodes to introduce Sybil nodes into the top tier. The approach is therefore only secure (w.r.t. both safety and liveness) if it can be assumed that nodes in T make plausibility checks before expanding their quorum sets, so that attempted (Sybil) attacks can be detected. Given the lack of explicit incentives for running validator nodes in systems like Stellar, such a burden on the operators of top tier nodes might be viewed as problematic [11]. However, similar critique can also be voiced against systems (like Bitcoin) that base their security arguments on notions of economic rationality, as economic rationality can also be leveraged by attackers [6].

7 Limits on openness and top tier fluidity

The FBAS paradigm reportedly enables the instantiation of consensus systems with open membership [13, 16]. And clearly, arbitrary nodes can join an FBAS, causing new quorums to be formed that contain them. Based on the preceding discussion, however, we recognize that without creating a new, de-facto disjoint FBAS, or the active reconfiguration of existing nodes, new nodes cannot become part of minimal quorums and hence minimal blocking sets. Thereby, their existence is irrelevant as far as the discussed liveness indicators are concerned, and their importance for safety is limited. In Sec. 4 we defined the notion of a top tier to reflect the set of nodes in an FBAS that is central to liveness, i.e., the set of nodes from which all minimal quorums and blocking sets are formed. The top tier wields absolute power to censor and block the whole FBAS.

In the following, we investigate the question to what extent this top tier can be considered a group with open membership. How can its power be diluted by promoting additional nodes to top tier status? Can nodes be “fired” from the top tier? We make the case that, in general, a top tier T can neither grow nor shrink without either the active involvement of existing top tier nodes or a loss of safety guarantees. We base all subsequent projections on the status quo of an FBAS that enjoys quorum intersection despite faulty nodes (a safe FBAS as per the discussion in Sec. 3.4).

7.1 Top-down top tier change

As a preliminary remark, recall that, as per Def. 4.4, we define the top tier T of an FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) as the union of all its minimal quorums. T is therefore also a quorum and intersects every quorum in \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\).
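For small instances, the top tier and the two properties stated in this remark can be checked by brute force. The sketch below is exponential in the number of nodes and only meant as an illustration. It assumes that a non-empty set U is a quorum iff every member of U has at least one quorum slice contained in U. All function names and the five-node example FBAS are hypothetical.

```python
from itertools import combinations


def threshold_slices(validators, k):
    """All k-out-of-n slices over `validators` (illustrative helper)."""
    return [frozenset(c) for c in combinations(sorted(validators), k)]


def is_quorum(quorum_sets, candidate):
    """Non-empty, and every member has a slice inside the candidate set."""
    return bool(candidate) and all(
        any(q <= candidate for q in quorum_sets[v]) for v in candidate
    )


def all_quorums(quorum_sets):
    """Enumerate every quorum by testing all non-empty subsets of nodes."""
    nodes = sorted(quorum_sets)
    return [
        frozenset(c)
        for size in range(1, len(nodes) + 1)
        for c in combinations(nodes, size)
        if is_quorum(quorum_sets, frozenset(c))
    ]


def minimal_quorums(quorum_sets):
    """Quorums that do not properly contain another quorum."""
    quorums = all_quorums(quorum_sets)
    return [u for u in quorums if not any(other < u for other in quorums)]


def top_tier(quorum_sets):
    """Union of all minimal quorums."""
    return frozenset().union(*minimal_quorums(quorum_sets))


# Toy FBAS: every node accepts any 2 of the nodes a, b, c.
fbas = {v: threshold_slices("abc", 2) for v in "abcde"}
T = top_tier(fbas)
print(sorted(T))
print(is_quorum(fbas, T))
print(all(T & u for u in all_quorums(fbas)))
```

In this example, d and e appear in quorums but in no minimal quorum, so they are not part of the top tier.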

Theorem 7.1

(top tier can safely change itself) Let \(T \subset {{\,\mathrm{\mathbf {V}}\,}}\) be the top tier of an FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) that enjoys quorum availability and quorum intersection. Then it is possible, without compromising either quorum availability or quorum intersection, to instantiate a new top tier \(T^\prime \subseteq {{\,\mathrm{\mathbf {V}}\,}}, T^\prime \ne \emptyset \) by changing only the quorum sets of new and old top tier nodes \(v \in T \cup T^\prime \).

Proof

Let \(T^\prime \subseteq {{\,\mathrm{\mathbf {V}}\,}}, T^\prime \ne \emptyset \) be the target top tier. Let \({{\,\mathrm{\mathbf {Q}}\,}}^\prime \) be a modification of \({{\,\mathrm{\mathbf {Q}}\,}}\) so that \(\forall v \in T \cup T^\prime : {{\,\mathrm{\mathbf {Q}}\,}}^\prime (v) = \{{T^\prime }\}\) (see Footnote 13) and \(\forall v \notin T \cup T^\prime : {{\,\mathrm{\mathbf {Q}}\,}}^\prime (v) = {{\,\mathrm{\mathbf {Q}}\,}}(v)\). As \(T^\prime \) is a quorum w.r.t. \({{\,\mathrm{\mathbf {Q}}\,}}^\prime \), \((T^\prime , {{\,\mathrm{\mathbf {Q}}\,}}^\prime )\) enjoys quorum availability. Therefore, \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}}^\prime )\) enjoys quorum availability. \(({{\,\mathrm{\mathbf {V}}\,}}\setminus T^\prime , {{\,\mathrm{\mathbf {Q}}\,}}^\prime )\) does not enjoy quorum availability, because no node in T is satisfied without \(T^\prime \) and no node in \({{\,\mathrm{\mathbf {V}}\,}}\setminus T\) can form a quorum without a node from T (otherwise T would not have been the top tier w.r.t. \({{\,\mathrm{\mathbf {Q}}\,}}\), cf. Def. 4.4). There are therefore no quorums w.r.t. \({{\,\mathrm{\mathbf {Q}}\,}}^\prime \) that are disjoint from \(T^\prime \). \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}}^\prime )\) therefore enjoys quorum intersection iff \((T^\prime , {{\,\mathrm{\mathbf {Q}}\,}}^\prime )\) enjoys quorum intersection, which it (trivially) does. \(\square \)
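The construction used in the proof can be replayed on a small example. The following sketch assumes the same brute-force notion of a quorum as the definition cited above (a non-empty set in which every member has a slice inside the set) and checks quorum intersection via pairwise intersection of minimal quorums. The function `retarget_top_tier` and the six-node FBAS are hypothetical. The function assigns the single slice \(T^\prime \) to all nodes in \(T \cup T^\prime \) and leaves all other quorum sets untouched.

```python
from itertools import combinations


def threshold_slices(validators, k):
    """All k-out-of-n slices over `validators` (illustrative helper)."""
    return [frozenset(c) for c in combinations(sorted(validators), k)]


def is_quorum(quorum_sets, candidate):
    return bool(candidate) and all(
        any(q <= candidate for q in quorum_sets[v]) for v in candidate
    )


def minimal_quorums(quorum_sets):
    """Brute force: enumerate all quorums, keep those without a smaller one."""
    nodes = sorted(quorum_sets)
    quorums = [
        frozenset(c)
        for size in range(1, len(nodes) + 1)
        for c in combinations(nodes, size)
        if is_quorum(quorum_sets, frozenset(c))
    ]
    return [u for u in quorums if not any(other < u for other in quorums)]


def top_tier(quorum_sets):
    return frozenset().union(*minimal_quorums(quorum_sets))


def has_quorum_intersection(quorum_sets):
    """All quorums intersect iff all minimal quorums intersect pairwise."""
    return all(a & b for a, b in combinations(minimal_quorums(quorum_sets), 2))


def retarget_top_tier(quorum_sets, new_top):
    """Q'(v) = {T'} for v in T ∪ T', Q'(v) = Q(v) otherwise."""
    new_top = frozenset(new_top)
    affected = top_tier(quorum_sets) | new_top
    return {
        v: [new_top] if v in affected else list(slices)
        for v, slices in quorum_sets.items()
    }


# Old FBAS: top tier {a, b, c}; every node accepts any 2 of a, b, c.
old = {v: threshold_slices("abc", 2) for v in "abcdef"}
new = retarget_top_tier(old, "cde")

print(sorted(top_tier(old)), sorted(top_tier(new)))
print(has_quorum_intersection(new), len(minimal_quorums(new)) > 0)
```

Node f keeps its old quorum set in this example. It can still only be part of quorums that contain a node from the old top tier, and those nodes now depend on \(T^\prime \).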

The situation is less clear if some nodes \(T \setminus T^\prime \) do not wish to leave T. Note, however, that single nodes can always endanger safety via trivial configurations such as \({{\,\mathrm{\mathbf {Q}}\,}}(v) = \{{\{{v}\}}\}\). If performed by one or more nodes in T, such an act of sabotage can have an impact on the safety of large portions of the FBAS.

7.2 Bottom-up top tier change

In the following, we assume a “self-centered” top tier in the sense that all top tier nodes include only other top tier nodes in their quorum sets. Symmetric top tiers (Def. 4.5) have this property, as do top tiers observed in the wild in the Stellar network (cf. Appendix C).

Theorem 7.2

(no safe top tier change with uncooperative top tier) Let \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) be an FBAS that enjoys quorum intersection and has a “self-centered” top tier \(T \subset {{\,\mathrm{\mathbf {V}}\,}}\) such that all top tier quorum slices consist of only top tier nodes (\(\forall v \in T: \bigcup {{{\,\mathrm{\mathbf {Q}}\,}}(v)} \subseteq T\)). Then it is not possible, without compromising quorum intersection, to instantiate a new top tier \(T^\prime \subseteq {{\,\mathrm{\mathbf {V}}\,}}, T^\prime \ne T\) by changing only the quorum sets of non-top tier nodes \(v \in {{\,\mathrm{\mathbf {V}}\,}}\setminus T\).

Proof

Let \(T^\prime \subseteq {{\,\mathrm{\mathbf {V}}\,}}, T^\prime \ne T\) be the top tier of a new FBAS \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}}^\prime )\) that enjoys quorum intersection. Let \(\hat{\mathcal {U}}\) and \(\hat{\mathcal {U}}^\prime \) be the sets of all minimal quorums of \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}})\) and \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}}^\prime )\), respectively. As per Def. 4.4, \(T^\prime \ne T\) implies that \(\hat{\mathcal {U}} \ne \hat{\mathcal {U}}^\prime \).

Assume there exists a \(\hat{U} \in \hat{\mathcal {U}} \setminus \hat{\mathcal {U}}^\prime \). Then \(\hat{U}\) is a quorum w.r.t. \({{\,\mathrm{\mathbf {Q}}\,}}\) and either (a) not a quorum w.r.t. \({{\,\mathrm{\mathbf {Q}}\,}}^\prime \) or (b) not minimal w.r.t. \({{\,\mathrm{\mathbf {Q}}\,}}^\prime \).

However, we require that the quorum sets of top tier nodes don’t change: \(\forall v \in T: {{\,\mathrm{\mathbf {Q}}\,}}^\prime (v) = {{\,\mathrm{\mathbf {Q}}\,}}(v)\). Therefore \(\hat{U}\) is a quorum also w.r.t. \({{\,\mathrm{\mathbf {Q}}\,}}^\prime \), contradicting (a). Hence, (b) must hold and there must be a \(\hat{U}^\prime \in \hat{\mathcal {U}}^\prime \) such that \(\hat{U}^\prime \subset \hat{U}\) (cf. Def. 4.1). As \(\hat{U}^\prime \subseteq \hat{U} \subseteq T\), \(\hat{U}^\prime \) being a quorum w.r.t. \({{\,\mathrm{\mathbf {Q}}\,}}^\prime \) implies it also being a quorum w.r.t. \({{\,\mathrm{\mathbf {Q}}\,}}\). But then \(\hat{U}\) is not minimal w.r.t. \({{\,\mathrm{\mathbf {Q}}\,}}\), implying \(\hat{U} \notin \hat{\mathcal {U}}\) and thus again leading to a contradiction. This proves that \(\hat{\mathcal {U}} \subseteq \hat{\mathcal {U}}^\prime \).

Assume now there exists a \(\hat{U}^\prime \in \hat{\mathcal {U}}^\prime \setminus \hat{\mathcal {U}}\) and let \(\hat{U} \in \hat{\mathcal {U}}\). As \(({{\,\mathrm{\mathbf {V}}\,}}, {{\,\mathrm{\mathbf {Q}}\,}}^\prime )\) enjoys quorum intersection, \(\hat{U}^\prime \cap \hat{U} \ne \emptyset \) and \(\hat{U}^\prime \) contains members of the “old” top tier T. \(\hat{U}^\prime \) is a quorum w.r.t. \({{\,\mathrm{\mathbf {Q}}\,}}^\prime \), but \(\hat{U}^\prime \cap T\) cannot be a quorum w.r.t. \({{\,\mathrm{\mathbf {Q}}\,}}^\prime \) as otherwise \(\hat{U}^\prime \) would not be a minimal quorum. There must therefore exist a node \(v \in \hat{U}^\prime \cap T\) with a quorum slice \(q \in {{\,\mathrm{\mathbf {Q}}\,}}^\prime (v)\) such that \((\hat{U}^\prime \cap T) \subset q \subseteq \hat{U}^\prime \) (cf. Def. 3.4), i.e., \(q \setminus T \ne \emptyset \). As \(v \in T\), we require that \({{\,\mathrm{\mathbf {Q}}\,}}^\prime (v) = {{\,\mathrm{\mathbf {Q}}\,}}(v)\) and \(\bigcup {{{\,\mathrm{\mathbf {Q}}\,}}(v)} \subseteq T\), which leads to a contradiction since \(q \in {{\,\mathrm{\mathbf {Q}}\,}}(v)\) and \(q \setminus T \ne \emptyset \). It must therefore hold that \(\hat{\mathcal {U}}^\prime \setminus \hat{\mathcal {U}} = \emptyset \). Together with \(\hat{\mathcal {U}} \subseteq \hat{\mathcal {U}}^\prime \), this yields \(\hat{\mathcal {U}}=\hat{\mathcal {U}}^\prime \) and \(T = T^\prime \). \(\square \)
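The theorem can be illustrated with a toy scenario in which non-top tier nodes attempt to promote themselves. The sketch below assumes the quorum notion used throughout (a non-empty set in which every member has a slice inside the set). The six-node FBAS and the helper `find_disjoint_quorums` are hypothetical. The old top tier {a, b, c} is self-centered and keeps its quorum sets, while d, e and f reconfigure to rely only on each other.

```python
from itertools import combinations


def threshold_slices(validators, k):
    """All k-out-of-n slices over `validators` (illustrative helper)."""
    return [frozenset(c) for c in combinations(sorted(validators), k)]


def is_quorum(quorum_sets, candidate):
    return bool(candidate) and all(
        any(q <= candidate for q in quorum_sets[v]) for v in candidate
    )


def all_quorums(quorum_sets):
    """Brute-force enumeration of all quorums (exponential, toy sizes only)."""
    nodes = sorted(quorum_sets)
    return [
        frozenset(c)
        for size in range(1, len(nodes) + 1)
        for c in combinations(nodes, size)
        if is_quorum(quorum_sets, frozenset(c))
    ]


def find_disjoint_quorums(quorum_sets):
    """Return a pair of disjoint quorums, or None if all quorums intersect."""
    for a, b in combinations(all_quorums(quorum_sets), 2):
        if not a & b:
            return a, b
    return None


# Old FBAS: every node accepts any 2 of the top tier nodes a, b, c.
old = {v: threshold_slices("abc", 2) for v in "abcdef"}

# Bottom-up change: only the non-top tier nodes d, e, f reconfigure.
new = dict(old)
for v in "def":
    new[v] = threshold_slices("def", 2)

print(find_disjoint_quorums(old))
split = find_disjoint_quorums(new)
print(split is not None)
```

After the change, d, e and f form quorums among themselves and thereby do appear in minimal quorums. These quorums are, however, disjoint from the quorums formed by a, b and c, so the new top tier comes at the cost of quorum intersection.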

7.3 Consequences

Who determines which FBAS nodes get to form the top tier? Our results imply that, if maintaining safety is seen as an untouchable requirement, the top tier \(T_i\) of an FBAS \(({{\,\mathrm{\mathbf {V}}\,}}_i, {{\,\mathrm{\mathbf {Q}}\,}}_i)\) at “iteration” i is legitimated by decisions of, exclusively, members of \(T_{i-1} \cup T_i\) (if none of them cooperate, we lose safety; if all of them cooperate, we don’t). Because of the top tier’s importance to the liveness, safety and performance achievable within a given FBAS, open membership in \({{\,\mathrm{\mathbf {V}}\,}}_i\) is of little benefit without open membership in \(T_i\).

How closed is the membership in \(T_i\)? It might be sufficient that only some nodes in \(T_{i-1}\) support a transition to \(T_i\). If reactive QSC policies are used (e.g., for enforcing top tier symmetry as discussed in Sec. 6.4), one cooperative top tier node \(v \in T_{i-1}\) might already be enough for growing the top tier in a way that is robust and doesn’t only dilute the relative influence of v. How partially supported top tier changes would play out must be investigated based on more specific scenarios. We expect the safe “firing” of top tier nodes to be especially challenging.

This raises the question: can the safety requirement be weakened? For example, given sufficiently good (out-of-band) coordination between members of \({{\,\mathrm{\mathbf {V}}\,}}_{i-1} \setminus T_{i-1}\), a \(({{\,\mathrm{\mathbf {V}}\,}}_i, {{\,\mathrm{\mathbf {Q}}\,}}_i)\) might be instantiated in which at least \(({{\,\mathrm{\mathbf {V}}\,}}_i \setminus T_{i-1}, {{\,\mathrm{\mathbf {Q}}\,}}_i)\) enjoys quorum intersection. It is conceivable that novel protocols can be developed, possibly also leveraging the FBAS structure, that reduce the notorious difficulty of coordinating such bottom-up actions.

8 Conclusion

We demonstrate in this paper that, despite the complexity of the FBAS model, the properties of concrete FBAS instances can be described in a way that is both precise and intuitive, and allows comparisons with more classical Byzantine agreement systems. We propose the notions of minimal blocking sets, minimal splitting sets and top tiers to describe which groups of nodes can compromise liveness and safety. In essence, minimal blocking sets and minimal splitting sets describe minimal viable threat scenarios, thereby enabling a comprehensive risk assessment in FBAS-based systems like the Stellar network. While some analyses imply computational problems of exponential complexity, we developed and implemented algorithms that enable the exact analysis of a wide range of interesting FBASs.

Our implemented analysis framework also enables us to investigate how individual configurations result in global properties. We find that overly strategic configuration policies result in FBASs that are indistinguishable from permissioned systems. Individualistic approaches, on the other hand, cannot guarantee safe results while quickly resulting in systems that are infeasible to analyze. Adding some strategic decision-making at organically emerging top tier nodes offers a potential middle way towards robust FBASs instantiated from the sum of individual preferences.

Independently of the way in which a given FBAS came to be, however, the composition of a once established top tier cannot be changed without either the cooperation of existing top tier nodes or a threat to safety. This seems to place the FBAS paradigm closer to the “permissioned consensus” camp than hoped. More investigation is needed to determine the exact impact of bottom-up top tier changes (for example, in terms of the number of nodes affected by a loss of safety or liveness) and to formulate possible coordination strategies to keep such impacts low.