Abstract
Continuing our earlier work in Nam et al. (One-step replica symmetry breaking of random regular NAE-SAT I, arXiv:2011.14270, 2020), we study the random regular k-nae-sat model in the condensation regime. In Nam et al. (2020), the (1rsb) properties of the model were established with positive probability. In this paper, we improve the result to probability arbitrarily close to one. To do so, we introduce a new framework which is the synthesis of two approaches: the small subgraph conditioning and a variance decomposition technique using Doob martingales and discrete Fourier analysis. The main challenge is a delicate integration of the two methods to overcome the difficulty arising from applying the moment method to an unbounded state space.
1 Introduction
Building on the earlier theory of spin-glasses, statistical physicists in the early 2000s developed a detailed collection of predictions for a broad class of sparse random constraint satisfaction problems (rcsp). These predictions describe a series of phase transitions as the constraint density varies, governed by one-step replica symmetry breaking (1rsb) ([31, 34]; cf. [4] and Chapter 19 of [33] for a survey). We study one such rcsp, the random regular k-nae-sat model, which is perhaps the most mathematically tractable among the 1rsb class of rcsp’s. As a continuation of our companion work [37], this paper completes our program to establish that the 1rsb prediction for the random regular nae-sat model holds with probability arbitrarily close to one.
The nae-sat problem is a random Boolean cnf formula in which n Boolean variables are subject to constraints in the form of clauses, each the “or” of k of the variables or their negations chosen uniformly at random. The formula itself is the “and” of these clauses. A variable assignment \(\underline{x}\in \{0,1\}^{n}\) is called a nae-sat solution if both \(\underline{ x}\) and \(\lnot \underline{ x}\) evaluate to true. We then choose a uniformly random instance of the d-regular (each variable appears d times) k-nae-sat (each clause has k literals) problem, which gives the random d-regular k-nae-sat problem with clause density \(\alpha =d/k\) (see Sect. 2 for the formal definition).
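As an illustration (not part of the formal development), the nae-sat constraint can be checked mechanically. The sketch below uses a hypothetical encoding: `clauses[a]` lists the variables of clause a, and `literals[(v, a)]` is the literal on the edge between variable v and clause a.

```python
def is_nae_solution(x, clauses, literals):
    """x is a nae-sat solution iff in every clause the evaluated literals
    L_e XOR x_v are not all equal (so x and its negation both satisfy)."""
    for a, clause in enumerate(clauses):
        vals = {literals[(v, a)] ^ x[v] for v in clause}
        if len(vals) < 2:  # all literals evaluate the same: NAE violated
            return False
    return True

# Toy instance with n = 3 variables and two clauses of length k = 3.
clauses = [[0, 1, 2], [0, 1, 2]]
literals = {(0, 0): 0, (1, 0): 0, (2, 0): 1,
            (0, 1): 1, (1, 1): 0, (2, 1): 0}
```

By the symmetry of the NAE constraint, an assignment is a solution if and only if its negation is.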
Let \(Z_n\) denote the number of solutions for a given random d-regular k-nae-sat instance. The physics prediction is that for each fixed \(\alpha \), there exists \(\textsf {f}(\alpha )\) called the free energy such that
A direct computation of the first moment \(\mathbb {E}Z_n\) gives that
(\(\textsf {f}^{\textsf {rs}}(\alpha )\) is called the replica-symmetric free energy), so \(\textsf {f}\le \textsf {f}^{\textsf {rs}}\) holds by Markov’s inequality. The work of Ding–Sly–Sun [25] and Sly–Sun–Zhang [43] established some of the physics conjectures on the description of \(Z_n\) and \(\textsf {f}\) given in [31, 36, 45], which are summarized as follows.
- ([25]) For large enough k, there exists the satisfiability threshold \(\alpha _{\textsf {sat}}\equiv \alpha _{\textsf {sat}}(k)>0\) such that
$$\begin{aligned} \lim _{n\rightarrow \infty } \mathbb {P}(Z_n>0) = {\left\{ \begin{array}{ll} 1 &{} \text { for } \alpha \in (0,\alpha _{\textsf {sat}});\\ 0 &{} \text { for }\alpha > \alpha _{\textsf {sat}}. \end{array}\right. } \end{aligned}$$
- ([43]) For large enough k, there exists the condensation threshold \(\alpha _{\textsf {cond}}\equiv \alpha _{\textsf {cond}}(k)\in (0,\alpha _{\textsf {sat}})\) such that
$$\begin{aligned} \textsf {f}(\alpha )= {\left\{ \begin{array}{ll} \textsf {f}^{\textsf {rs}}(\alpha ) &{} \text { for } \alpha \le \alpha _{\textsf {cond}};\\ \textsf {f}^{1\textsf {rsb}}(\alpha ) &{} \text { for } \alpha > \alpha _{\textsf {cond}}, \end{array}\right. } \end{aligned}$$(1)
where \(\textsf {f}^{1\textsf {rsb}}(\alpha )\) is the 1rsb free energy. Moreover, \(\textsf {f}^{\textsf {rs}}(\alpha ) > \textsf {f}^{1\textsf {rsb}}(\alpha )\) holds for \(\alpha \in (\alpha _{\textsf {cond}},\alpha _{\textsf {sat}})\). For the explicit formula and derivation of \(\textsf {f}^{1\textsf {rsb}}(\alpha )\) and \(\alpha _{\textsf {cond}}\), we refer to Section 1.6 of [43] for a concise overview.
Furthermore, more detailed physics predictions state that for \(\alpha \in (\alpha _{\textsf {cond}},\alpha _{\textsf {sat}})\), the solution space of the random regular k-nae-sat condenses into a finite number of clusters. Here, a cluster is a connected component of the solution space, where two solutions are connected if they differ in exactly one variable. Indeed, in [37], we proved that for large enough k, the solution space of random regular k-nae-sat becomes condensed in the condensation regime for a positive fraction of the instances, that is, with probability strictly bounded away from 0.
The following theorem strengthens the aforementioned result and shows that the condensation phenomenon holds with probability arbitrarily close to 1.
Theorem 1.1
Let \(k\ge k_0\) where \(k_0\) is a large absolute constant, and let \(\alpha \in (\alpha _{\textsf {cond}}, \alpha _{\textsf {sat}})\) such that \(d\equiv \alpha k\in \mathbb {N}\). For all \(\varepsilon >0\) and \(M\in \mathbb {N}\), there exist constants \(K\equiv K(\varepsilon ,\alpha ,k)\in \mathbb {N}\) and \(C\equiv C(M,\varepsilon ,\alpha ,k)>0\) such that with probability at least \(1-\varepsilon \), the random d-regular k-nae-sat instance satisfies the following:
- (a) The K largest solution clusters, \(\mathcal {C}_1,\ldots ,\mathcal {C}_K\), occupy at least a \(1-\varepsilon \) fraction of the solution space;
- (b) There are at least \(\exp (n \textsf {f}^{1\textsf {rsb}}(\alpha ) -c^{\star }\log n -C )\) many solutions in \(\mathcal {C}_1,\ldots ,\mathcal {C}_M\), the M largest clusters (see Definition 2.19 for the definition of \(c^\star \)).
Remark 1.2
Throughout the paper, we take \(k_0\) to be a large absolute constant so that the results of [25, 43] and [37] hold. In addition, it was shown in [43, Proposition 1.4] that \((\alpha _{\textsf {cond}}, \alpha _{\textsf {sat}})\) is a subset of \((\alpha _{\textsf {lbd}}, \alpha _{\textsf {ubd}})\), where \(\alpha _{\textsf {lbd}}\equiv (2^{k-1}-2)\log 2\) and \(\alpha _{\textsf {ubd}}\equiv 2^{k-1}\log 2\), so we restrict our attention to \(\alpha \in (\alpha _{\textsf {lbd}}, \alpha _{\textsf {ubd}})\).
1.1 One-step replica symmetry breaking
In the condensation regime \(\alpha \in (\alpha _{\textsf {cond}},\alpha _{\textsf {sat}})\), the random regular k-nae-sat model is believed to possess a single layer of hierarchy of clusters in the solution space. That is, the solutions are fairly well-connected inside each cluster, so that there is no additional hierarchical structure within it. Such behavior is conjectured in various other models such as random graph coloring and random k-sat. We remark that there are also other models, such as maximum independent set (or the high-fugacity hard-core model) on random graphs with small degrees [10] and the Sherrington–Kirkpatrick model [41, 44], which are expected, or proven [6], to undergo full rsb, meaning that there are infinitely many levels of hierarchy inside the solution clusters.
A way to characterize 1rsb is to look at the overlap between two uniformly drawn solutions. In the condensation regime, a bounded number of clusters contain most of the solutions. Thus, the events that the two solutions belong to the same cluster, or to different clusters, each happen with non-trivial probability. According to the 1rsb description, there is no additional structure inside each cluster, so the Hamming distance between the two solutions is expected to concentrate at precisely two values, depending on whether they came from the same cluster or not.
It was verified in [37] that the overlap concentrates at two values for a positive fraction of the random regular nae-sat instances. Theorem 1.4 below verifies that the overlap concentration happens for almost all random regular nae-sat instances.
Definition 1.3
For \(\underline{x}^1,\underline{x}^2 \in \{0,1\}^n\), let \(\underline{y}^i = 2\underline{x}^i - {\textbf {1}}\). The overlap \(\rho (\underline{x}^1,\underline{x}^2)\) is defined by
$$\begin{aligned} \rho (\underline{x}^1,\underline{x}^2) := \frac{1}{n} \langle \underline{y}^1, \underline{y}^2 \rangle = \frac{1}{n}\sum _{j=1}^{n} y^1_j\, y^2_j. \end{aligned}$$
In words, the overlap is the normalized difference between the number of variables with the same value and the number of those with different values.
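As an illustrative sketch (not from the paper), the overlap is a one-line computation on \(\{0,1\}\)-assignments:

```python
def overlap(x1, x2):
    """rho(x1, x2) = <y1, y2> / n with y = 2x - 1 in {-1, +1}^n,
    i.e. (#agreeing coordinates - #disagreeing coordinates) / n."""
    n = len(x1)
    return sum((2 * a - 1) * (2 * b - 1) for a, b in zip(x1, x2)) / n
```

In particular, \(\rho (\underline{x},\underline{x})=1\) and \(\rho (\underline{x},\lnot \underline{x})=-1\).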
Theorem 1.4
Let \(k\ge k_0\), \(\alpha \in (\alpha _{\textsf {cond}}, \alpha _{\textsf {sat}})\) such that \(d\equiv \alpha k\in \mathbb {N}\), and \(p^\star \equiv p^\star (\alpha ,k)\in (0,1)\) be a fixed constant (for its definition, see Definition 6.8 of [37]). For all \(\varepsilon >0\), there exist constants \(\delta =\delta (\varepsilon ,\alpha ,k)>0\) and \(C\equiv C(\varepsilon ,\alpha ,k)\) such that with probability at least \(1-\varepsilon \), the random d-regular k-nae-sat instance \(\mathscr {G}\) satisfies the following. Let \(\underline{x}^1, \underline{x}^2\in \{0,1\}^n\) be independent, uniformly chosen satisfying assignments of \(\mathscr {G}\). Then, the absolute value \(\rho _{\text {abs}} \equiv |\rho |\) of their overlap \(\rho \equiv \rho (\underline{x}^1,\underline{x}^2)\) satisfies
- (a) \(\mathbb {P}(\rho _{\text {abs}}\le n^{-1/3} |\mathscr {G}) \ge \delta \);
- (b) \(\mathbb {P}( |\rho _{\text {abs}}-p^\star | \le n^{-1/3} |\mathscr {G})\ge \delta \);
- (c) \(\mathbb {P}( \min \{ \rho _{\text {abs}}, |\rho _{\text {abs}}-p^\star |\} \ge n^{-1/3}|\mathscr {G})\le Cn^{-1/4}\).
1.2 Related works
Many of the earlier works on rcsps focused on determining satisfiability thresholds, and for rcsp models known not to exhibit rsb, this goal was achieved. These models include random linear equations [7], random 2-sat [12, 13], random 1-in-k-sat [1] and k-xor-sat [22, 26, 38]. On the other hand, for the models predicted to exhibit rsb, intensive studies have been conducted to estimate their satisfiability thresholds (random k-sat [5, 18, 30], random k-nae-sat [2, 17, 21] and random graph coloring [3, 14, 15, 19]).
More recently, the satisfiability thresholds for rcsps in the 1rsb class have been rigorously determined for several models including maximum independent set [24], random regular k-nae-sat [25], random regular k-sat [18] and random k-sat [23]. These works carried out a demanding second moment method applied to the number of clusters instead of the number of solutions. Although determining the colorability threshold is left open, the condensation threshold for random graph coloring was established in [9], based on a challenging analysis using a clever “planting” technique, and the results were generalized to other models in [16]. Also, [8] identified the condensation threshold for random regular k-sat, where each variable appears d/2 times positive and d/2 times negative.
Further theory was developed in [43] to establish the 1rsb free energy for random regular k-nae-sat in the condensation regime by applying the second moment method to the \(\lambda \)-tilted partition function. Later, our companion paper [37] made further progress in the same model by giving a cluster level description of the condensation phenomenon. Namely, [37] showed that with positive probability, a bounded number of clusters dominate the solution space and the overlap concentrates on two points in the condensation regime. Our main contribution is to push the probability arbitrarily close to one and show that the same phenomenon holds with high probability.
Lastly, [11] studied the random k-max-nae-sat beyond \(\alpha _{\textsf {sat}}\), where they verified that the 1rsb description breaks down before \(\alpha \asymp k^{-3}4^k\). Indeed, the Gardner transition from 1rsb to full rsb is expected at \(\alpha _{\textsf {Ga}}\asymp k^{-3}4^k >\alpha _{\textsf {sat}}\) [32, 35], and [11] provides evidence of this phenomenon.
1.3 Proof ideas
In [37], the majority of the work was devoted to computing the moments of the tilted cluster partition functions \(\overline{{{\textbf {Z}}}}_\lambda \) and \(\overline{{{\textbf {Z}}}}_{\lambda ,s}\), defined as
where the sums are taken over all clusters \(\Upsilon \). Moreover, let \(\overline{{{\textbf {N}}}}_{s}\) denote the number of clusters whose size is in the interval \([e^{ns}, e^{ns +1} )\), i.e.
Denote by \(s_\circ \) the size of the solution space in normalized logarithmic scale from Theorem 1.1:
where \(c^\star \) is the constant appearing in Theorem 1.1 and \(C\in \mathbb {R}\). In [37], we obtained estimates on \(\overline{{\textbf {N}}}_{s_\circ }\) from the second moment method, showing that \(\mathbb {E}[\overline{{{\textbf {N}}}}_{s_\circ }^2] \lesssim _{k} (\mathbb {E}\overline{{{\textbf {N}}}}_{s_\circ })^2\) holds and that \(\mathbb {E}\overline{{{\textbf {N}}}}_{s_\circ }\) decays exponentially as \(C\rightarrow -\infty \). Thus, it was shown in [37] that
However, in order to establish (a) and (b) of Theorem 1.1, we need to push the probability in the second line to \(1-\varepsilon \).
To do so, one may hope to have \(\mathbb {E}[\overline{{{\textbf {N}}}}_{s_\circ }^2] \approx (\mathbb {E}\overline{{{\textbf {N}}}}_{s_\circ })^2\) to deduce \(\mathbb {P}(\overline{{{\textbf {N}}}}_{s_\circ } >0) {\rightarrow } 1\) for large enough C, but this is false in the case of random regular nae-sat. The primary reason is that short cycles in the graph cause multiplicative fluctuations in \(\overline{{{\textbf {N}}}}_{s_\circ }\). Therefore, our approach is to rescale \(\overline{{{\textbf {N}}}}_{s_\circ }\) according to the effects of short cycles, so that the rescaled partition function \(\widetilde{{{\textbf {N}}}}_{s_\circ }\) concentrates around its expectation. That is, \(\mathbb {E}[\widetilde{{{\textbf {N}}}}_{s_\circ }^2] \approx (\mathbb {E}\widetilde{{{\textbf {N}}}}_{s_\circ })^2\) (to be precise, this holds only when C is large enough, due to the intrinsic correlations coming from the largest clusters). Furthermore, we argue that the fluctuations coming from the short cycles are not too big, and hence can be absorbed by \(\overline{{{\textbf {N}}}}_{s_\circ }\) if \(\mathbb {E}\overline{{{\textbf {N}}}}_{s_\circ }\) is large. To this end, we develop a new argument that combines small subgraph conditioning [39, 40], a widely used tool in problems on random graphs, with the Doob martingale approach used in [24, 25]; neither is effective in our model if used alone.
The small subgraph conditioning method ([39, 40]; for a survey, see Chapter 9.3 of [29]) has proven useful in many settings [27, 28, 42] for deriving the precise distributional limits of partition functions. For example, [27] applied this method to the proper coloring model on bipartite random regular graphs and determined the limiting distribution of the number of colorings. However, this method relies heavily on algebraic identities specific to the model, which are sometimes intractable, as in our case. Roughly speaking, one needs a fairly explicit combinatorial formula for the second moment to carry out the algebraic and combinatorial computations.
Another technique that inspired our proof, which we will refer to as the Doob martingale approach, was introduced in [24, 25]. This method controls the multiplicative fluctuations of \(\overline{{{\textbf {N}}}}_{s_\circ }\) more directly, by investigating the Doob martingale increments of \(\log \overline{{{\textbf {N}}}}_{s_\circ }\). It has proven useful in the study of models like random regular nae-sat, as seen in [25]. However, in spin systems with infinitely many spins, like our model, some of the key estimates in the argument become false due to the existence of rare spins (or large free components).
Our approach blends the two techniques in a novel way, so that each compensates for the other's limitations. Although we could not algebraically derive the identities required for the small subgraph conditioning, we instead deduce them by a modified Doob martingale approach for the truncated model, which has a finite spin space. Then, we send the truncation parameter to infinity in these algebraic identities and show that they converge to the corresponding formulas for the untruncated model. This step requires a more refined understanding of the first and second moments of \(\widetilde{{{\textbf {N}}}}_{s_\circ }\), including the constant coefficient of the leading exponential term, whereas knowing the order of the leading term was sufficient in the earlier works [25, 43]. We then appeal to the small subgraph conditioning method to deduce the conclusion from those identities. We believe that our approach is potentially applicable to other models with an infinite spin space where the traditional small subgraph conditioning method is not tractable.
1.4 Notational conventions
For non-negative quantities \(f=f_{d,k, n}\) and \(g=g_{d,k,n}\), we use any of the equivalent notations \(f=O_{k}(g), g= \Omega _k(f), f\lesssim _{k} g\) and \(g \gtrsim _{k} f \) to indicate that for each \(k\ge k_0\), there exists a constant \(C_k>0\), depending only on k, such that \(f/g \le C_k\), with the convention \(0/0\equiv 1\). We drop the subscript k if there exists a universal constant C such that \(f/g \le C\).
When \(f\lesssim _{k} g\) and \(g\lesssim _{k} f\), we write \(f\asymp _{k} g\). Similarly when \(f\lesssim g\) and \(g\lesssim f\), we write \(f \asymp g\).
2 The Combinatorial Model
We begin by introducing the mathematical framework to analyze the clusters of solutions. We follow the formulation derived in [43, Section 2]. In [37], we needed further definitions in addition to those from [43], but in this work it is enough to rely on the concepts of [43]. In this section, we briefly review the necessary definitions for completeness.
There is a natural graphical representation to describe a d-regular k-nae-sat instance by a labelled (d, k)-regular bipartite graph: Let \(V=\{ v_1, \ldots , v_n \}\) and \(F=\{a_1, \ldots , a_m \}\) be the sets of variables and clauses, respectively. Connect \(v_i\) and \(a_j\) by an edge if \(v_i\) is one of the variables contained in the clause \(a_j\). Let \(\mathcal {G}=(V,F,E)\) be this bipartite graph, and let \(\texttt {L}_e\in \{0,1\}\) for \(e\in E\) be the literal corresponding to the edge e. Then, the labelled bipartite graph \(\mathscr {G}=(V,F,E,\underline{\texttt {L}})\equiv (V,F,E,\{\texttt {L}_{e}\}_{e\in E})\) represents a nae-sat instance.
For each \(e\in E\), we denote the variable (resp. clause) adjacent to it by v(e) (resp. a(e)). Moreover, \(\delta v\) (resp. \(\delta a\)) denotes the collection of edges adjacent to \(v\in V\) (resp. \(a \in F\)). We write \(\delta v {\setminus } e:= \delta v {\setminus } \{e\}\) and \(\delta a \setminus e:= \delta a \setminus \{e\}\) for simplicity. Formally speaking, we regard E as a perfect matching between the set of half-edges adjacent to variables and those adjacent to clauses, labelled from 1 to \(nd=mk\), and hence as a permutation in \(S_{nd}\).
Definition 2.1
For an integer \(l\ge 1\) and \(\underline{{\textbf {x}}}=(\textbf{x}_i) \in \{0,1\}^l\), define
Let \(\mathscr {G}= (V,F,E,\underline{\texttt {L}})\) be a nae-sat instance. An assignment \(\underline{{\textbf {x}}}\in \{0,1\}^V\) is called a solution if
where \(\oplus \) denotes addition mod 2. Denote by \(\textsf {SOL}(\mathscr {G})\subset \{0,1\}^V\) the set of solutions, and endow a graph structure on \(\textsf {SOL}(\mathscr {G})\) by connecting \(\underline{{\textbf {x}}}\sim \underline{{\textbf {x}}}'\) if and only if their Hamming distance is one. Also, let \(\textsf {CL}(\mathscr {G})\) be the set of clusters, namely the connected components under this adjacency.
2.1 The frozen configuration
Our first step is to define the frozen configuration, which is a basic way of encoding clusters. We introduce the free value, denoted by \({\texttt {f}}\), whose Boolean addition is defined by \(\texttt {f}\oplus 0=\texttt {f}\oplus 1=\texttt {f}\). Recalling the definition of \(I^{\textsc {nae}}\) in (6), a frozen configuration is defined as follows.
Definition 2.2
(Frozen configuration). For \(\mathscr {G}= (V,F,E, \underline{\texttt {L}})\), \(\underline{x}\in \{0,1,{\texttt {f}}\}^V\) is called a frozen configuration if the following conditions are satisfied:
-
No nae-sat constraints are violated for \(\underline{x}\). That is, \(I^{\textsc {nae}}(\underline{x};\mathscr {G})=1\).
-
For \(v\in V\), \(x_v\in \{0,1\}\) if and only if it is forced to be so. That is, \(x_v\in \{0,1\}\) if and only if there exists \(e\in \delta v\) such that a(e) becomes violated if \(\texttt {L}_e\) is negated, i.e., \(I^{\textsc {nae}} (\underline{x}; \mathscr {G}\oplus \mathbb {1}_e )=0\) where \(\mathscr {G}\oplus \mathbb {1}_e\) denotes \(\mathscr {G}\) with \(\texttt {L}_e\) flipped. \(x_v={\texttt {f}}\) if and only if no such \(e\in \delta v\) exists.
We record some observations that follow directly from the definition. Details can be found in the previous works ([25], Section 2 and [43], Section 2).
-
(1)
We can map a nae-sat solution \(\underline{{\textbf {x}}}\in \{0,1 \}^V\) to a frozen configuration via the following coarsening algorithm: If there is a variable v such that \(\textbf{x}_v\in \{0,1\}\) and \(I^{\textsc {nae}}(\underline{{\textbf {x}}};\mathscr {G}) = I^{\textsc {nae}}(\underline{{\textbf {x}}}\oplus \mathbb {1}_v;\mathscr {G})=1\) (i.e., flipping \(\textbf{x}_v\) does not violate any clause), then set \(\textbf{x}_v = \text {{\texttt {f}}}\). Iterate this process until additional modifications are impossible.
-
(2)
All solutions in a cluster \(\Upsilon \in \textsf{CL}(\mathscr {G})\) are mapped to the same frozen configuration \(\underline{x}\equiv \underline{x}[\Upsilon ] \in \{0,1,\text {{\texttt {f}}}\}^V\). However, the coarsening algorithm is not necessarily surjective. For instance, a typical instance of \(\mathscr {G}\) does not have a cluster corresponding to the all-free configuration (\(\underline{{{\textbf {x}}}}\equiv {\texttt {f}}\)).
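The coarsening algorithm of observation (1) can be sketched as follows; this is an illustrative implementation under the same hypothetical encoding as before (`clauses[a]` lists the variables of clause a, `literals[(v, a)]` is the edge literal), not the paper's formal construction.

```python
def coarsen(x, clauses, literals):
    """Repeatedly turn a {0,1}-variable free ('f') whenever flipping it
    violates no clause, until no further modification is possible."""
    x = list(x)

    def clause_ok(a):
        # A clause touching a free variable is never violated; otherwise
        # it is violated iff all of its evaluated literals agree.
        if any(x[v] == 'f' for v in clauses[a]):
            return True
        return len({literals[(v, a)] ^ x[v] for v in clauses[a]}) >= 2

    changed = True
    while changed:
        changed = False
        for v, xv in enumerate(x):
            if xv == 'f':
                continue
            x[v] = xv ^ 1  # tentatively flip x_v
            if all(clause_ok(a) for a, cl in enumerate(clauses) if v in cl):
                x[v] = 'f'  # the flip is harmless: v becomes free
                changed = True
            else:
                x[v] = xv   # the flip violates a clause: v stays frozen
    return x
```

On tiny toy instances, most variables coarsen to free; in the actual model with large k, typical solutions have an abundance of frozen variables.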
2.2 Message configurations
Although frozen configurations provide a representation of clusters, they do not tell us how to compute the size of a cluster. The main obstacle comes from the connected structure of free variables, which can potentially be complicated. We now introduce the notions needed to handle this issue in a tractable way.
Definition 2.3
(Separating and forcing clauses). Let \(\underline{x}\) be a given frozen configuration on \(\mathscr {G}= (V,F,E,\underline{\texttt {L}})\). A clause \(a\in F\) is called separating if there exist \(e, e^\prime \in \delta a\) such that \(\texttt {L}_{e}\oplus x_{v(e)} = 0, \quad \texttt {L}_{e^\prime } \oplus x_{v(e^\prime )}=1.\) We say \(a\in F\) is non-separating if it is not a separating clause. Moreover, \(e\in E\) is called forcing if \(\texttt {L}_{e}\oplus x_{v(e)} \oplus 1 = \texttt {L}_{e'}\oplus x_{v(e')}\in \{0,1\}\) for all \(e'\in \delta a(e) {\setminus } e\). We say \(a\in F\) is forcing if there exists a forcing edge \(e\in \delta a\). In particular, a forcing clause is also separating.
Observe that a non-separating clause must be adjacent to at least two free variables, which is a fact frequently used throughout the paper.
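Definition 2.3 can be phrased directly in terms of the multiset of evaluated literals \(\texttt {L}_e \oplus x_{v(e)}\) of a clause; the sketch below is an illustration with an assumed list encoding of these values.

```python
FREE = 'f'

def lit_val(L, x):
    # Boolean addition with the free value: f XOR anything = f
    return FREE if x == FREE else L ^ x

def is_separating(vals):
    """Definition 2.3: some evaluated literal is 0 and some other is 1."""
    return 0 in vals and 1 in vals

def forcing_edges(vals):
    """Indices i such that all OTHER evaluated literals equal some b in
    {0,1} and the i-th equals 1 XOR b, i.e. edge i is forcing."""
    out = []
    for i, v in enumerate(vals):
        others = [u for j, u in enumerate(vals) if j != i]
        if others and others[0] in (0, 1) and all(u == others[0] for u in others):
            if v == others[0] ^ 1:
                out.append(i)
    return out
```

Note that any clause with a forcing edge automatically contains both a 0 and a 1 among its evaluated literals, matching the remark that forcing clauses are separating.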
Definition 2.4
(Free cycles). Let \(\underline{x}\) be a given frozen configuration on \(\mathscr {G}= (V,F,E,\underline{\texttt {L}})\). A cycle in \(\mathscr {G}\) (which necessarily has even length) is called a free cycle if
-
Every variable v on the cycle is \(x_v = {\texttt {f}}\);
-
Every clause a on the cycle is non-separating.
Throughout the paper, our primary interest is in frozen configurations that do not contain any free cycles. If \(\underline{x}\) does not have any free cycle, then we can easily extend it to a nae-sat solution \(\underline{{\textbf {x}}}\) such that \(\textbf{x}_v = x_v\) whenever \(x_v\in \{0,1\}\), since the nae-sat problem on a tree is always solvable.
Definition 2.5
(Free trees). Let \(\underline{x}\) be a frozen configuration in \(\mathscr {G}\) without any free cycles. Consider the induced subgraph H of \(\mathscr {G}\) consisting of the free variables and non-separating clauses. Each connected component of H is called a free piece of \(\underline{x}\) and denoted by \(\mathfrak {t}^{\text {in}}\). For each free piece \(\mathfrak {t}^{\text {in}}\), the free tree \(\mathfrak {t}\) is defined as the union of \(\mathfrak {t}^{\text {in}}\) and the half-edges incident to \(\mathfrak {t}^{\text {in}}\).
For the pair \((\underline{x}, \mathscr {G})\), we write \(\mathscr {F}(\underline{x},\mathscr {G})\) to denote the collection of free trees inside \((\underline{x}, \mathscr {G})\). We write \(V(\mathfrak {t})=V(\mathfrak {t}^{\text {in}})\), \(F(\mathfrak {t})=F(\mathfrak {t}^{\text {in}})\) and \(E(\mathfrak {t})=E(\mathfrak {t}^{\text {in}})\) for the collections of variables, clauses and (full-)edges in \(\mathfrak {t}\). Moreover, define \(\dot{\partial } \mathfrak {t}\) (resp. \(\hat{\partial } \mathfrak {t}\)) to be the collection of boundary half-edges adjacent to \(F(\mathfrak {t})\) (resp. \(V(\mathfrak {t})\)), and write \(\partial \mathfrak {t}:= \dot{\partial }\mathfrak {t}\sqcup \hat{\partial } \mathfrak {t}\).
We now introduce the message configuration, which enables us to calculate the size of a free tree (that is, the number of nae-sat solutions on \(\mathfrak {t}\) that extend \(\underline{x}\)) via local quantities. The message configuration is given by \(\underline{\tau }= (\tau _e)_{e\in E} \in \mathscr {M}^E\) (\(\mathscr {M}\) is defined below). Here, \(\tau _e=(\dot{\tau }_e,\hat{\tau }_e)\), where \(\dot{\tau }\) (resp. \(\hat{\tau }\)) denotes the message from v(e) to a(e) (resp. from a(e) to v(e)).
A message will carry information of the structure of the free tree it belongs to. To this end, we first define the notion of joining l trees at a vertex (either variable or clause) to produce a new tree. Let \(t_1,\ldots , t_l\) be a collection of rooted bipartite factor trees satisfying the following conditions:
-
Their roots \(\rho _1,\ldots ,\rho _l\) are all of the same type (i.e., either all variables or all clauses) and all have degree one.
-
If an edge in \(t_i\) is adjacent to a degree one vertex, which is not the root, then the edge is called a boundary-edge. The rest of the edges are called internal-edges. For the special case where \(t_i\) consists of a single edge and a single vertex, we regard the single edge to be a boundary-edge.
-
\(t_1,\ldots ,t_l\) are boundary-labelled trees, meaning that their variables, clauses, and internal edges are unlabelled (except that we distinguish the root), while the boundary edges are assigned values from \(\{0,1,{{\texttt {S}}}\}\), where \({{\texttt {S}}}\) stands for ‘separating’.
Then, the joined tree \(t \equiv \textsf {j}(t_1,\ldots , t_l) \) is obtained by identifying all the roots as a single vertex o, and adding an edge which joins o to a new root \(o'\) of an opposite type of o (e.g., if o was a variable, then \(o'\) is a clause). Note that \(t= \textsf {j}(t_1,\ldots ,t_l)\) is also a boundary-labelled tree, whose labels at the boundary edges are induced by those of \(t_1,\ldots ,t_l\).
For the simplest trees, consisting of a single vertex and a single edge, we use 0 (resp. 1) to denote the ones whose edge is labelled 0 (resp. 1): for the case of \(\dot{\tau }\), the root is the clause, and for the case of \(\hat{\tau }\), the root is the variable. Also, if its root is a variable and its edge is labelled \({{\texttt {S}}}\), we write the tree as \({{\texttt {S}}}\).
We can also define Boolean addition on a boundary-labelled tree t as follows. For the trees 0, 1, the Boolean additions \(0\oplus \texttt {L}\), \(1\oplus \texttt {L}\) are defined as above (\(t\oplus \texttt {L}\)), and we define \({{\texttt {S}}}\oplus \texttt {L}= {{\texttt {S}}}\) for \(\texttt {L}\in \{0,1\}\). For the remaining trees, \(t \oplus 0:= t\), and \(t\oplus 1\) is the boundary-labelled tree with the same graphical structure as t and the boundary labels Boolean-added by 1 (here, we set \({{\texttt {S}}}\oplus 1 = {{\texttt {S}}}\) for the \({{\texttt {S}}}\)-labels).
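The joining operation and the Boolean addition \(t\oplus 1\) can be illustrated with a toy representation (our own, hypothetical choice, not the paper's): a boundary-labelled tree is a nested tuple whose leaves are the labels 0, 1, 'S', and, since children are unordered, joining canonically sorts the sub-trees.

```python
def join(subtrees):
    """j(t_1,...,t_l): children are unordered (only the boundary carries
    labels), so represent the joined tree as the sorted tuple of its
    sub-trees; leaves are the boundary labels 0, 1, 'S'."""
    return tuple(sorted(subtrees, key=repr))

def flip(t):
    """Boolean addition t XOR 1: flip the {0,1} boundary labels, keep 'S'."""
    if t == 'S':
        return 'S'
    if t in (0, 1):
        return t ^ 1
    return join(flip(s) for s in t)
```

In this representation two joined trees compare equal exactly when they agree up to permutation of children, mirroring the remark in Definition 2.8 that messages are well-defined up to permutation.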
Definition 2.6
(Message configuration). Let \(\dot{\mathscr {M}}_0:= \{0,1,\star \}\) and \(\hat{\mathscr {M}}_0:= \emptyset \). Suppose that \(\dot{\mathscr {M}}_t, \hat{\mathscr {M}}_t\) are defined, and we inductively define \(\dot{\mathscr {M}}_{t+1}, \hat{\mathscr {M}}_{t+1}\) as follows: For \(\hat{\underline{\tau }} \in (\hat{\mathscr {M}}_t)^{d-1}\), \(\dot{\underline{\tau }} \in (\dot{\mathscr {M}}_t)^{k-1}\), we write \(\{\hat{\tau }_i \}:= \{\hat{\tau }_1,\ldots ,\hat{\tau }_{d-1} \}\) and similarly for \(\{\dot{\tau }_i \}\). We define
Further, we set \(\dot{\mathscr {M}}_{t+1}:= \dot{\mathscr {M}}_t \cup \dot{T}( \hat{\mathscr {M}}_t^{d-1} ) {\setminus } \{{\texttt {z}}\}\), and \(\hat{\mathscr {M}}_{t+1}:= \hat{\mathscr {M}}_t \cup \hat{T}(\dot{\mathscr {M}}_t^{k-1} )\), and define \(\dot{\mathscr {M}}\) (resp. \(\hat{\mathscr {M}}\)) to be the union of all \(\dot{\mathscr {M}}_t\) (resp. \(\hat{\mathscr {M}}_t\)) and \(\mathscr {M}:= \dot{\mathscr {M}} \times \hat{\mathscr {M}}\). Then, a (valid) message configuration on \(\mathscr {G}=(V,F,E,\underline{\texttt {L}})\) is a configuration \(\underline{\tau }\in \mathscr {M}^E\) that satisfies (i) the local equations given by
for all \(e\in E\), and (ii) if one element of \(\{\dot{\tau }_e,\hat{\tau }_e\}\) equals \(\star \) then the other element is in \(\{0,1\}\).
In the definition, \(\star \) is the symbol introduced to cover cycles, and \({\texttt {z}}\) is an error message. See Figure 1 in Section 2 of [43] for an example of a \(\star \) message.
When a frozen configuration \(\underline{x}\) on \(\mathscr {G}\) is given, we can construct a message configuration \(\underline{\tau }\) via the following procedure:
1. For a forcing edge e, set \(\hat{\tau }_e=x_{v(e)}\). Also, for an edge \(e\in E\), if there exists \(e^\prime \in \delta v(e) {\setminus } e\) such that \(\hat{\tau }_{e^\prime } \in \{0,1\}\), then set \(\dot{\tau }_e=x_{v(e)}\).
2. For an edge \(e\in E\), if there exist \(e_1,e_2\in \delta a(e){\setminus } e\) such that \(\{\texttt {L}_{e_1}\oplus \dot{\tau }_{e_1}, \texttt {L}_{e_2}\oplus \dot{\tau }_{e_2}\}=\{0,1\}\), then set \(\hat{\tau }_e = {{\texttt {S}}}\).
3. After these steps, apply the local equations (8) recursively to define \(\dot{\tau }_e\) and \(\hat{\tau }_e\) wherever possible.
4. For the edges whose messages remain undefined after the previous steps, set them to be \(\star \).
In fact, the following lemma shows the relation between the frozen and message configurations. We refer to [43], Lemma 2.7 for its proof.
Lemma 2.7
The mapping explained above defines a bijection
Next, we introduce a dynamic programming method based on belief propagation to calculate the size of a free tree by local quantities from a message configuration.
Definition 2.8
Let \(\mathcal {P}\{0,1\} \) denote the space of probability measures on \(\{0,1\}\). We define the mappings \(\dot{{{\texttt {m}}}}:\dot{\mathscr {M}} \rightarrow \mathcal {P}\{0,1\}\) and \(\hat{{{\texttt {m}}}}: \hat{\mathscr {M}} \rightarrow \mathcal {P}\{0,1\}\) as follows. For \(\dot{\tau }\in \{0,1\}\) and \(\hat{\tau }\in \{0,1\}\), let \(\dot{{{\texttt {m}}}}[\dot{\tau }] =\delta _{\dot{\tau }}\), \(\hat{{{\texttt {m}}}}[\hat{\tau }] = \delta _{\hat{\tau }}\). For \(\dot{\tau }\in \dot{\mathscr {M}} {\setminus } \{0,1,\star \}\) and \(\hat{\tau }\in \hat{\mathscr {M}} {\setminus } \{0,1,\star \}\), \(\dot{{{\texttt {m}}}}[\dot{\tau }]\) and \(\hat{{{\texttt {m}}}}[\hat{\tau }]\) are recursively defined:
-
Let \(\dot{\tau } = \dot{T}(\hat{\tau }_1,\ldots ,\hat{\tau }_{d-1})\), with \(\star \notin \{\hat{\tau }_i \}\). Define
$$\begin{aligned} \dot{z}[\dot{\tau }] := \sum _{\textbf{x}\in \{0,1\} } \prod _{i=1}^{d-1} \hat{{{\texttt {m}}}}[\hat{\tau }_i](\textbf{x}), \quad \dot{{{\texttt {m}}}}[\dot{\tau }](\textbf{x}) := \frac{1}{\dot{z}[\dot{\tau }]} \prod _{i=1}^{d-1} \hat{{{\texttt {m}}}}[\hat{\tau }_i](\textbf{x}). \end{aligned}$$(10)
Note that these equations are well-defined, since \((\hat{\tau }_1,\ldots , \hat{\tau }_{d-1})\) are well-defined up to permutation.
-
Let \(\hat{\tau } = \hat{T} ( \dot{\tau }_1,\ldots ,\dot{\tau }_{k-1}; \texttt {L})\), with \(\star \notin \{\dot{\tau }_i \}\). Define
$$\begin{aligned} \hat{z}[\hat{\tau }] := 2-\sum _{\textbf{x}\in \{0,1\}} \prod _{i=1}^{k-1} \dot{{{\texttt {m}}}}[\dot{\tau }_i](\textbf{x}), \quad \hat{{{\texttt {m}}}}[\hat{\tau }](\textbf{x}) := \frac{1}{\hat{z}[\hat{\tau }]} \left\{ 1- \prod _{i=1}^{k-1} \dot{{{\texttt {m}}}}[\dot{\tau }_i](\textbf{x}) \right\} . \end{aligned}$$(11)Similarly as above, these equations are well-defined.
Moreover, observe that inductively, \(\dot{{{\texttt {m}}}}[\dot{\tau }], \hat{{{\texttt {m}}}}[\hat{\tau }] \) are not Dirac measures unless \(\dot{\tau }, \hat{\tau }\in \{0,1\}\).
It turns out that \(\dot{{{\texttt {m}}}}[\star ], \hat{{{\texttt {m}}}}[\star ]\) can be arbitrary measures for our purposes, and hence we take them to be uniform measures on \(\{0,1\}\).
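As a concrete illustration, the recursions (10) and (11) can be sketched in a few lines of Python, representing each message as a probability vector on \(\{0,1\}\). This is an illustrative sketch only: the actual messages also carry the tree structure of \(\dot{\tau }\) and \(\hat{\tau }\), which we drop here.

```python
# Minimal sketch of the belief propagation recursions (10) and (11).
# A message is a probability vector [m(0), m(1)] on {0,1}.

def var_update(hat_msgs):
    # Eq. (10): variable-to-clause message from d-1 incoming clause messages.
    prod = [1.0, 1.0]
    for m in hat_msgs:
        prod = [prod[x] * m[x] for x in (0, 1)]
    z = prod[0] + prod[1]                       # normalizer z-dot in (10)
    return [prod[x] / z for x in (0, 1)], z

def clause_update(dot_msgs):
    # Eq. (11): clause-to-variable message from k-1 incoming variable messages.
    prod = [1.0, 1.0]
    for m in dot_msgs:
        prod = [prod[x] * m[x] for x in (0, 1)]
    z = 2.0 - (prod[0] + prod[1])               # normalizer z-hat in (11)
    return [(1.0 - prod[x]) / z for x in (0, 1)], z
```

For instance, two incoming variable messages frozen to 0 force the clause-to-variable message onto 1, in line with the not-all-equal constraint, while uniform inputs stay uniform.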
Equations (10) and (11) are known as the belief propagation equations. We refer to [43], Section 2, where the same notions are introduced, for a detailed explanation, or to [33], Chapter 14 for more fundamental background. From these quantities, we define the following local weights, which will lead to the computation of cluster sizes.
where the last identities in the last two lines hold for any choices of i. These weight factors can be used to derive the size of a free tree. Let \(\mathfrak {t}\) be a free tree in \(\mathscr {F}(\underline{x},\mathscr {G})\), and let \(w^{\text {lit}} (\mathfrak {t}; \underline{x},\mathscr {G})\) be the number of nae-sat solutions that extend \(\underline{x}\) to \(\{0,1\}^{V(\mathfrak {t})}\). Further, let \(\textsf {size}(\underline{x},\mathscr {G})\) denote the total number of nae-sat solutions that extend \(\underline{x}\) to \(\{0,1\}^V.\)
Lemma 2.9
([43], Lemma 2.9 and Corollary 2.10; [33], Ch. 14). Let \(\underline{x}\) be a frozen configuration on \(\mathscr {G}=(V,F,E,\underline{\texttt {L}})\) without any free cycles, and \(\underline{\tau }\) be the corresponding message configuration. For a free tree \(\mathfrak {t}\in \mathscr {F}(\underline{x};\mathscr {G})\), we have that
Furthermore, let \(\Upsilon \in \textsf {CL}(\mathscr {G})\) be the cluster corresponding to \(\underline{x}\). Then, we have
2.3 Colorings
In this subsection, we introduce the coloring configuration, a simplification of the message configuration. Its definition is analogous to that in [43].
Recall the definition of \(\mathscr {M}=\dot{\mathscr {M}}\times \hat{\mathscr {M}}, \) and let \(\{ {\texttt {F}}\} \subset \mathscr {M}\) be defined by \(\{{\texttt {F}}\}:= \{\tau \in \mathscr {M}: \, \dot{\tau } \notin \{ 0,1,\star \} \text { and } \hat{\tau }\notin \{ 0,1,\star \} \}\).
Note that \(\{{\texttt {F}}\}\) corresponds to the messages on the edges of free trees, except the boundary edges labelled either 0 or 1.
Define \(\Omega := \{{{{\texttt {R}}}}_0, {{{\texttt {R}}}}_1, {{{\texttt {B}}}}_0, {{{\texttt {B}}}}_1\} \cup \{{\texttt {F}}\}\) and let \(\textsf {S}: \mathscr {M}\setminus \{(\star ,\star )\} \rightarrow \Omega \) be the projection given by
Here, we note that a (valid) message configuration \(\underline{\tau }=(\tau _e)_{e\in E} \in \mathscr {M}^{E}\) cannot have an edge e such that \(\tau _e=(\star ,\star )\) (see Definition 2.6), thus we may safely exclude the spin \((\star ,\star )\) from \(\mathscr {M}\).
For convenience, we abbreviate \(\{{{{\texttt {R}}}}\}= \{{{{\texttt {R}}}}_0, {{{\texttt {R}}}}_1 \}\) and \(\{{{{\texttt {B}}}}\} = \{{{{\texttt {B}}}}_0, {{{\texttt {B}}}}_1 \}\), and define the Boolean addition as \({{{\texttt {B}}}}_\textbf{x}\oplus \texttt {L}:= {{{\texttt {B}}}}_{\textbf{x}\oplus \texttt {L}}\), and similarly for \({{{\texttt {R}}}}_\textbf{x}\). Also, for \(\sigma \in \{ {{{\texttt {R}}}},{{{\texttt {B}}}},{{\texttt {S}}}\}\), we set \(\dot{\sigma }:= \sigma =:\hat{\sigma }\).
Definition 2.10
(Colorings). For \(\underline{\sigma }\in \Omega ^d\), let
Also, define \( \hat{I}^{\text {lit}}: \Omega ^k \rightarrow \mathbb {R}\) to be
On a nae-sat instance \(\mathscr {G}= (V,F,E,\underline{\texttt {L}})\), \(\underline{\sigma }\in \Omega ^E\) is a (valid) coloring if \(\dot{I}(\underline{\sigma }_{\delta v})=\hat{I}^{\text {lit}}((\underline{\sigma }\oplus \underline{\texttt {L}})_{\delta a}) =1 \) for all \(v\in V, a\in F\).
Given nae-sat instance \(\mathscr {G}\), it was shown in Lemma 2.12 of [43] that there is a bijection
The weight elements for coloring, denoted by \(\dot{\Phi }, \hat{\Phi }^{\text {lit}}, \bar{\Phi }\), are defined as follows. For \(\underline{\sigma }\in \Omega ^d,\) let
For \(\underline{\sigma }\in \Omega ^k\), let
(If \(\sigma _i\notin \{{{{\texttt {R}}}}\}, \) then \(\dot{\tau }(\sigma _i)\) is well-defined.)
Lastly, let
Note that if \(\hat{\sigma }={{\texttt {S}}}\), then \(\bar{\varphi }(\dot{\sigma },\hat{\sigma })=2\) for any \(\dot{\sigma }\). The remaining details on the compatibility of \(\varphi \) and \(\Phi \) can be found in [43], Section 2.4. The formula for the cluster size in Lemma 2.9 then carries over to the coloring configuration.
Lemma 2.11
([43], Lemma 2.13). Let \(\underline{x}\in \{0,1,\text {{\texttt {f}}}\}^V\) be a frozen configuration on \(\mathscr {G}=(V,F,E,\underline{\texttt {L}})\), and let \(\underline{\sigma }\in \Omega ^E\) be the corresponding coloring. Define
Then, we have \(\textsf {size}(\underline{x};\mathscr {G}) = w_\mathscr {G}^{\text {lit}}(\underline{\sigma })\).
Among the valid frozen configurations, we can ignore the contribution from the configurations with too many free or red colors, as observed in the following lemma.
Lemma 2.12
([25] Proposition 2.2, [43] Lemma 3.3). For a frozen configuration \(\underline{x}\in \{0,1,\text {{\texttt {f}}}\}^{V}\), let \({{{\texttt {R}}}}(\underline{x})\) count the number of forcing edges and \(\text {{\texttt {f}}}(\underline{x})\) count the number of free variables. There exists an absolute constant \(c>0\) such that for \(k\ge k_0\), \(\alpha \in [\alpha _{\textsf {lbd}}, \alpha _{\textsf {ubd}}]\), and \(\lambda \in (0,1]\),
where \(\textsf {size}(\underline{x};\mathscr {G})\) is the number of nae-sat solutions \(\underline{{\textbf {x}}}\in \{0,1\}^{V}\) which extend \(\underline{x}\in \{0,1,\text {{\texttt {f}}}\}^{V}\).
Thus, our interest is in counting the number of frozen configurations and colorings such that the fraction of red edges and the fraction of free variables are bounded by \(7/2^k\). To this end, we define
where \({{{\texttt {R}}}}(\underline{\sigma })\) counts the number of red edges and \(\text {{\texttt {f}}}(\underline{\sigma })\) counts the number of free variables. The superscript \(\text {tr}\) emphasizes that the above quantities count the contribution from frozen configurations which only contain free trees, i.e. no free cycles (recall that by Lemma 2.7 and (14), the space of colorings is in bijective correspondence with the space of frozen configurations without free cycles). Similarly, recalling the definition of \(\overline{\text {{\textbf {N}}}}_s\) in (3), the total number of clusters of size in \([e^{ns},e^{ns+1})\), we define \({{\textbf {N}}}_s\) to be
Hence, \(e^{-n\lambda s-\lambda }{{\textbf {Z}}}_{\lambda ,s}\le {{\textbf {N}}}_{s}\le e^{-n\lambda s} {{\textbf {Z}}}_{\lambda ,s}\) holds.
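This inequality holds since each cluster counted by \({{\textbf {N}}}_{s}\) has size \(w\in [e^{ns},e^{ns+1})\) and thus contributes a weight \(w^{\lambda }\in [e^{n\lambda s}, e^{n\lambda s+\lambda })\) to \({{\textbf {Z}}}_{\lambda ,s}\). A toy numerical check, with arbitrary synthetic cluster sizes standing in for the actual cluster structure:

```python
import math
import random

# Toy check of  e^{-n*lam*s - lam} * Z  <=  N  <=  e^{-n*lam*s} * Z,
# where Z sums w^lam over clusters whose size w lies in [e^{ns}, e^{ns+1}).
random.seed(0)
n, s, lam = 10, 0.2, 0.7
# synthetic cluster sizes, each in the window [e^{ns}, e^{ns+1})
sizes = [random.uniform(math.exp(n * s), math.exp(n * s + 1)) for _ in range(50)]
N = len(sizes)                      # number of clusters in the window
Z = sum(w ** lam for w in sizes)    # lambda-tilted partition function
```

Since every summand of \(Z\) lies in \([e^{n\lambda s}, e^{n\lambda s+\lambda })\), dividing by these endpoints sandwiches \(N\) exactly as in the display.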
Definition 2.13
(Truncated colorings). Let \(1\le L< \infty \), \(\underline{x}\) be a frozen configuration on \(\mathscr {G}\) without free cycles and \(\underline{\sigma }\in \Omega ^E\) be the coloring corresponding to \(\underline{x}\). Recalling the notation \(\mathscr {F}(\underline{x};\mathscr {G})\) (Definition 2.5), we say \(\underline{\sigma }\) is a (valid) L-truncated coloring if \(|V(\mathfrak {t})| \le L\) for all \(\mathfrak {t}\in \mathscr {F}(\underline{x};\mathscr {G})\). For an equivalent definition, let \(|\sigma |:=v(\dot{\sigma })+v(\hat{\sigma })-1\) for \(\sigma \in \{{\texttt {F}}\}\), where \(v(\dot{\sigma })\) (resp. \(v(\hat{\sigma })\)) denotes the number of variables in \(\dot{\sigma }\) (resp. \(\hat{\sigma }\)). Define \(\Omega _L:= \{{{{\texttt {R}}}},{{{\texttt {B}}}}\}\cup \{{\texttt {F}}\}_L\), where \(\{{\texttt {F}}\}_L\) is the collection of \(\sigma \in \{{\texttt {F}}\}\) such that \(|\sigma | \le L\). Then, \(\underline{\sigma }\) is a (valid) L-truncated coloring if \(\underline{\sigma }\in \Omega _L^E\).
To clarify the names, we often call the original coloring \(\underline{\sigma }\in \Omega ^E\) the untruncated coloring.
Analogous to (15), define the truncated partition function
2.4 Averaging over the literals
Let \(\mathscr {G}=(V,F,E,\underline{\texttt {L}})\) be a nae-sat instance and \(\mathcal {G}=(V,F,E)\) be the factor graph without the literal assignment. Let \(\mathbb {E}^{\text {lit}}\) denote the expectation over the literals \(\underline{\texttt {L}}\sim \text {Unif} [\{0,1\}^E]\). Then, for a coloring \(\underline{\sigma }\in \Omega ^{E}\), we can use Lemma 2.11 to write \(\mathbb {E}^{\text {lit}}[w_\mathscr {G}^{\text {lit}}(\underline{\sigma }) ]\) as
To this end, define
We now recall a property of \(\hat{\Phi }^{\text {lit}}\) from [43], Lemma 2.17:
Lemma 2.14
([43], Lemma 2.17). \(\hat{\Phi }^{\text {lit}}\) can be factorized as \(\hat{\Phi }^{\text {lit}}(\underline{\sigma }\oplus \underline{\texttt {L}}) = \hat{I}^{\text {lit}}(\underline{\sigma }\oplus \underline{\texttt {L}}) \hat{\Phi }^{\text {m}}(\underline{\sigma })\) for
As a consequence, we can write \(\hat{\Phi }(\underline{\sigma })^\lambda = \hat{\Phi }^{\text {m}}(\underline{\sigma })^\lambda \hat{v}(\underline{\sigma })\), where
2.5 Empirical profile of colorings
The coloring profile, defined below, was introduced in [43]. Hereafter, \(\mathscr {P}(\mathfrak {X})\) denotes the space of probability measures on \(\mathfrak {X}\).
Definition 2.15
(Coloring profile and the simplex of coloring profile, Definition 3.1 and 3.2 of [43]). Given a nae-sat instance \(\mathscr {G}\) and a coloring configuration \(\underline{\sigma }\in \Omega ^E \), the coloring profile of \(\underline{\sigma }\) is the triple \(H[\underline{\sigma }]\equiv H\equiv (\dot{H},\hat{H},\bar{H}) \) defined as follows.
A valid H must satisfy the following compatibility equation:
The simplex of coloring profile \(\varvec{\Delta }\) is the space of triples \(H=(\dot{H},\hat{H},\bar{H})\) which satisfies the following conditions:
-
\(\dot{H} \in \mathscr {P}(\text {supp}\,\dot{\Phi }), \hat{H} \in \mathscr {P}(\text {supp}\,\hat{\Phi })\) and \(\bar{H} \in \mathscr {P}(\Omega )\).
-
\(\dot{H},\hat{H}\) and \(\bar{H}\) satisfy (18).
-
Recalling the definition of \(\text {{\textbf {Z}}}_{\lambda }\) in (15), \(\dot{H},\hat{H}\) and \(\bar{H}\) satisfy \(\max \{\bar{H}({\texttt {f}}),\bar{H}({{{\texttt {R}}}})\} \le \frac{7}{2^k}\).
For \(L <\infty \), we let \(\varvec{\Delta }^{(L)}\) be the subspace of \(\varvec{\Delta }\) satisfying the following extra condition:
-
\(\dot{H} \in \mathscr {P}(\text {supp}\,\dot{\Phi }\cap \Omega _{L}^{d}), \hat{H} \in \mathscr {P}(\text {supp}\,\hat{\Phi }\cap \Omega _L^{k})\) and \(\bar{H} \in \mathscr {P}(\Omega _L)\).
Given a coloring profile \(H\in \varvec{\Delta }\), denote by \({{\textbf {Z}}}_{\lambda }^{\text{ tr }}[H]\) the contribution to \({{\textbf {Z}}}_{\lambda }^{\text{ tr }}\) from the coloring configurations whose coloring profile is H. That is, \({{\textbf {Z}}}_\lambda ^{\text{ tr }}[H]:= \sum _{\underline{\sigma }:\; H[\underline{\sigma }] = H} w^{\text{ lit }}(\underline{\sigma })^\lambda \). For \(H \in \varvec{\Delta }^{(L)}\), \({{\textbf {Z}}}^{(L),\text{ tr }}_{\lambda }[H]\) is defined analogously. It was shown in [43] that \(\mathbb {E}{{\textbf {Z}}}^{(L),\text{ tr }}_\lambda [H]\) for the L-truncated coloring model can be written as the following formula, obtained via Stirling's approximation:
Similar to \(F_{\lambda ,L}(H)\) for \(H\in \varvec{\Delta }^{(L)}\), the untruncated free energy \(F_{\lambda }(H)\) for \(H\in \varvec{\Delta }\) is defined by the same equation \(F_{\lambda }(H):=\Sigma (H)+\lambda s(H)\).
2.6 Belief propagation fixed point and optimal profiles
It was proven in [43] that the truncated free energy \(F_{\lambda ,L}(H)\) is maximized at the optimal profile \(H^\star _{\lambda ,L}\), defined in terms of a Belief Propagation (BP) fixed point. In this subsection, we review the notions necessary to define \(H^\star _{\lambda ,L}\) (cf. Section 5 of [43]). To do so, we first define the BP functional: for probability measures \(\dot{\textbf{q}},\hat{\textbf{q}}\in \mathscr {P}(\Omega _L)\), where \(L<\infty \), let
where \(\sigma \in \Omega _L\) and \(\cong \) denotes equality up to normalization, so that the output is a probability measure. Now, restrict the domain to the probability measures with one-sided dependence, i.e. satisfying \(\dot{\textbf{q}}(\sigma )=\dot{f}(\dot{\sigma })\) and \(\hat{\textbf{q}}(\sigma )=\hat{f}(\hat{\sigma })\) for some \(\dot{f}:\dot{\Omega }_L\rightarrow \mathbb {R}_{\ge 0}\) and \(\hat{f}:\hat{\Omega }_L\rightarrow \mathbb {R}_{\ge 0}\). It can be checked that \(\dot{\textbf{B}}_{1,\lambda }, \hat{\textbf{B}}_{1,\lambda }\) preserve the one-sided property, inducing
More precisely, for \(\hat{q}\in \mathscr {P}(\hat{\Omega }_L)\) and \(\dot{q} \in \mathscr {P}(\dot{\Omega }_L)\), define the probability measures \(\dot{\text {BP}}_{\lambda ,L}(\hat{q})\in \mathscr {P}(\dot{\Omega }_L)\) and \(\hat{\text {BP}}_{\lambda ,L}(\dot{q})\in \mathscr {P}(\hat{\Omega }_L)\) as follows. For \(\dot{\sigma }\in \dot{\Omega }_L\) and \(\hat{\sigma }\in \hat{\Omega }_L\), let
where \(\hat{\sigma }^\prime \in \hat{\Omega }_L\) and \(\dot{\sigma }^\prime \in \dot{\Omega }_L\) are arbitrary with the only exception that when \(\dot{\sigma }\in \{{{{\texttt {R}}}},{{{\texttt {B}}}}\}\) (resp. \(\hat{\sigma }\in \{{{{\texttt {R}}}},{{{\texttt {B}}}}\}\)), then we take \(\hat{\sigma }^\prime = \dot{\sigma }\) (resp. \(\dot{\sigma }^\prime = \hat{\sigma }\)) so that the RHS above is non-zero. From the definition of \(\dot{\Phi },\hat{\Phi }\), and \(\bar{\Phi }\), it can be checked that the choices of \(\hat{\sigma }^\prime \in \hat{\Omega }_L\) and \(\dot{\sigma }^\prime \in \dot{\Omega }_L\) do not affect the values of the RHS above (see (12)). The normalizing constants \(\dot{\mathscr {Z}}_{\hat{q}}\) and \(\hat{\mathscr {Z}}_{\dot{q}}\) are given by
Here, \(\hat{\sigma }^\prime \in \hat{\Omega }_L\) and \(\dot{\sigma }^\prime \in \dot{\Omega }_L\) are again arbitrary. We then define the Belief Propagation functional by \(\text {BP}_{\lambda ,L}:= \dot{\text {BP}}_{\lambda ,L}\circ \hat{\text {BP}}_{\lambda ,L}\). The untruncated BP map, which we denote by \(\text {BP}_{\lambda }:\mathscr {P}(\dot{\Omega }) \rightarrow \mathscr {P}(\dot{\Omega })\), is analogously defined, where we replace \(\dot{\Omega }_L\) (resp. \(\hat{\Omega }_L\)) with \(\dot{\Omega }\) (resp. \(\hat{\Omega }\)).
Remark 2.16
In defining the untruncated BP map, note that \(\dot{\Omega }\) and \(\hat{\Omega }\) are not finite sets, so the normalizing constants, the analogues of (22), may be infinite. However, from the definitions of \(\dot{\Phi },\hat{\Phi }\), and \(\bar{\Phi }\), we have that \(\bar{\Phi }(\sigma _1)\dot{\Phi }(\underline{\sigma })\le 2\) and \(\bar{\Phi }(\tau _1)\hat{\Phi }(\underline{\tau })\le 2\) for \(\underline{\sigma }=(\sigma _1,\ldots ,\sigma _d) \in \Omega ^{d}\) and \(\underline{\tau }=(\tau _1,\ldots ,\tau _k) \in \Omega ^k\). Thus, it follows that the normalizing constants for the untruncated BP map are at most 2. We also remark that \(\underline{\sigma }=((\dot{\sigma }_1,\hat{\sigma }_1),\ldots ,(\dot{\sigma }_d,\hat{\sigma }_d))\in \Omega ^{d}\) with \(\dot{\Phi }(\underline{\sigma })\ne 0\) is fully determined by \((\dot{\sigma }_1,\hat{\sigma }_1)\) and \(\hat{\sigma }_2,\ldots ,\hat{\sigma }_d\). Thus, the second sum over \(\underline{\sigma }\in \Omega _L^d\) in the definition of \(\dot{\mathscr {Z}}_{\hat{q}}\) in (22) can be replaced with the sum over \(\sigma _1\in \Omega \) and \(\hat{\sigma }_2,\ldots , \hat{\sigma }_d\in \hat{\Omega }\). The analogous remark holds for \(\hat{\mathscr {Z}}_{\dot{q}}\) and for the untruncated model.
Let \(\mathbf {\Gamma }_C\) be the set of \(\dot{q} \in \mathscr {P}(\dot{\Omega })\) such that
where \(\{{{{\texttt {R}}}}\}\equiv \{{{{\texttt {R}}}}_0,{{{\texttt {R}}}}_1\}\) and \(\{{{{\texttt {B}}}}\}\equiv \{{{{\texttt {B}}}}_0,{{{\texttt {B}}}}_1\}\). The proposition below shows that the BP map is a contraction on the set \(\mathbf {\Gamma }_C\) for large enough C, which guarantees the existence of a Belief Propagation fixed point.
Proposition 2.17
(Proposition 5.5 item a,b of [43]). For \(\lambda \in [0,1]\), the following holds:
-
1.
There exists a large enough universal constant C such that the map \(\text {BP}\equiv \text {BP}_{\lambda ,L}\) has a unique fixed point \(\dot{q}^\star _{\lambda ,L}\in \mathbf {\Gamma }_C\). Moreover, if \(\dot{q}\in \mathbf {\Gamma }_C\), \(\text {BP}\dot{q}\in \mathbf {\Gamma }_C\) holds with
$$\begin{aligned} ||\text {BP}\dot{q}-\dot{q}^\star _{\lambda ,L}||_1\lesssim k^2 2^{-k}||\dot{q}-\dot{q}^\star _{\lambda ,L}||_1. \end{aligned}$$(24)The same holds for the untruncated BP, i.e. \(\text {BP}_{\lambda }\), with fixed point \(\dot{q}^\star _{\lambda }\in \mathbf {\Gamma }_C\). For large enough L, \(\dot{q}^\star _{\lambda ,L}\) and \(\dot{q}^\star _{\lambda }\) have full support on their domains.
-
2.
In the limit \(L \rightarrow \infty \), \(||\dot{q}^\star _{\lambda ,L}-\dot{q}^\star _{\lambda }||_1 \rightarrow 0\).
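The contraction (24) implies that iterating BP from any \(\dot{q}\in \mathbf {\Gamma }_C\) converges geometrically to \(\dot{q}^\star _{\lambda ,L}\). The following toy stand-in illustrates this mechanism only: `toy_bp` is a hypothetical map mixing toward the uniform measure, not the actual BP map, with a fixed L1-contraction rate playing the role of \(k^2 2^{-k}\).

```python
def toy_bp(q, rate=0.05):
    # Hypothetical contraction: mix q toward the uniform measure.
    # Its L1-Lipschitz constant is `rate`, mimicking the rate k^2 2^{-k} in (24).
    u = 1.0 / len(q)
    return [(1.0 - rate) * u + rate * x for x in q]

q = [0.7, 0.2, 0.05, 0.05]     # arbitrary starting measure on 4 spins
star = [0.25] * 4              # the unique fixed point of toy_bp
errs = []
for _ in range(5):
    q = toy_bp(q)
    errs.append(sum(abs(a - b) for a, b in zip(q, star)))
# errs decays geometrically with ratio `rate`
```

Each iteration shrinks the L1 distance to the fixed point by the factor `rate`, which is the Banach fixed-point mechanism behind the uniqueness claim in the proposition.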
For \(\dot{q} \in \mathscr {P}(\dot{\Omega })\), denote \(\hat{q}\equiv \hat{\text {BP}}\dot{q}\), and define \(H_{\dot{q}}=(\dot{H}_{\dot{q}},\hat{H}_{\dot{q}}, \bar{H}_{\dot{q}})\in \varvec{\Delta }\) by
where \(\dot{\mathfrak {Z}}\equiv \dot{\mathfrak {Z}}_{\dot{q}},\hat{\mathfrak {Z}}\equiv \hat{\mathfrak {Z}}_{\dot{q}}\) and \(\bar{\mathfrak {Z}}\equiv \bar{\mathfrak {Z}}_{\dot{q}}\) are normalizing constants.
Definition 2.18
(Definition 5.6 of [43]). The optimal coloring profile for the truncated model and the untruncated model is the tuple \(H^\star _{\lambda ,L}=(\dot{H}^\star _{\lambda ,L},\hat{H}^\star _{\lambda ,L},\bar{H}^\star _{\lambda ,L})\) and \(H^\star _{\lambda }=(\dot{H}^\star _{\lambda },\hat{H}^\star _{\lambda },\bar{H}^\star _{\lambda })\), defined respectively by \( H^\star _{\lambda ,L}:= H_{\dot{q}^\star _{\lambda ,L}}\) and \(H^\star _{\lambda }:=H_{\dot{q}^\star _{\lambda }}\).
Definition 2.19
For \(k\ge k_0,\alpha \in (\alpha _{\textsf {cond}}, \alpha _{\textsf {sat}})\) and \(\lambda \in [0,1]\), define the optimal \(\lambda \)-tilted truncated weight \(s^\star _{\lambda ,L}\equiv s^\star _{\lambda ,L}(\alpha ,k)\) and untruncated weight \(s^\star _{\lambda } \equiv s^\star _{\lambda }(\alpha ,k)\) by
Then, define the optimal tilting constants \(\lambda ^\star _L\equiv \lambda ^\star _L(\alpha ,k)\) and \(\lambda ^\star \equiv \lambda ^\star (\alpha , k)\) by
Finally, we define \(s^\star _L\equiv s^\star _L(\alpha ,k),s^\star \equiv s^\star (\alpha ,k)\) and \(c^\star \equiv c^\star (\alpha ,k)\) by
We remark that \(s^\star = \textsf {f}^{1\textsf {rsb}}(\alpha )\) and \(\lambda ^\star \in (0,1)\) hold for \(\alpha \in (\alpha _{\textsf {cond}}, \alpha _{\textsf {sat}})\) (see Proposition 1.4 of [43]).
To end this section, we define the optimal coloring profile in the second moment (cf. Definition 5.6 of [43]). Define the analogue of \((\dot{\Phi },\hat{\Phi },\bar{\Phi })\) in the second moment \((\dot{\Phi }_2,\hat{\Phi }_2,\bar{\Phi }_2)\) by \(\dot{\Phi }_2:=\dot{\Phi }\otimes \dot{\Phi }\), \(\bar{\Phi }_2:=\bar{\Phi }\otimes \bar{\Phi }\) and
Then, the BP map in the second moment is defined by replacing \((\dot{\Phi },\hat{\Phi },\bar{\Phi })\) in (20) by \((\dot{\Phi }_2,\hat{\Phi }_2,\bar{\Phi }_2)\). Moreover, analogous to (25), \(H_{\dot{q}}\) is defined for \(\dot{q}\in \mathscr {P}\big ((\dot{\Omega }_L)^2\big )\) by replacing \((\dot{\Phi },\hat{\Phi },\bar{\Phi })\) in (25) by \((\dot{\Phi }_2,\hat{\Phi }_2,\bar{\Phi }_2)\).
Definition 2.20
(Definition 5.6 of [43]). The optimal coloring profile in the second moment for the truncated model is the tuple \(H^{\bullet }_{\lambda ,L}=(\dot{H}^{\bullet }_{\lambda ,L},\hat{H}^{\bullet }_{\lambda ,L},\bar{H}^{\bullet }_{\lambda ,L})\) defined by .
3 Proof Outline
Recall that \({{\textbf {N}}}_s^{\text{ tr }}\equiv {{\textbf {Z}}}_{0,s}^{\text{ tr }}\) counts the number of valid colorings with weight in \([e^{ns},e^{ns+1})\) which do not contain a free cycle. Also, recall the constant \(s_\circ (C)\equiv s_\circ (n,\alpha ,C)\) from (4). It was shown in [37] that for fixed \(C\in \mathbb {R}\), \(\mathbb {E}{{\textbf {N}}}^{\text{ tr }}_{s_{\circ }(C)}\asymp _{k} e^{\lambda ^\star C}\) holds and we have the following:
Hence, the Cauchy-Schwarz inequality shows that there is a constant \(C_k<1\), which only depends on \(\alpha \) and k, such that for \(C>0\),
The remaining work is to push this probability close to 1. The key to proving Theorems 1.1 and 1.4 is the following theorem.
Theorem 3.1
Let \(k\ge k_0\), \(\alpha \in (\alpha _{\textsf {cond}}, \alpha _{\textsf {sat}})\), and set \(\lambda ^\star , s^\star \) as in Definition 2.19. For every \(\varepsilon >0\), there exists \(C(\varepsilon ,\alpha ,k)>0\) and \(\delta \equiv \delta (\varepsilon ,\alpha ,k)>0\) such that we have for \(n\ge n_0(\varepsilon ,\alpha ,k)\) and \(C\ge C(\varepsilon ,\alpha ,k)\),
where \(s_\circ (C)\equiv s_\circ (n,\alpha ,C)\equiv s^\star -\frac{ \log n}{2\lambda ^\star n} - \frac{C}{n}\).
Theorem 3.1 easily implies Theorems 1.1 and 1.4: in [37, Remark 6.11], we have already shown that Theorem 3.1 implies Theorem 1.4, so we are left to prove Theorem 1.1.
Proof of Theorem 1.1
By Theorem 3.22 of [37], \(\mathbb {E}{{\textbf {N}}}^{\text{ tr }}_{s_{\circ }(C)}\asymp e^{\lambda ^\star C}\), so Theorem 3.1 implies Theorem 1.1-(b). Hence, it remains to prove Theorem 1.1-(a). Fix \(\varepsilon >0\) throughout the proof. By Theorem 3.1, there exists \(C_1\equiv C_1(\varepsilon ,\alpha ,k)\) such that
Note that on the event \({{\textbf {N}}}_{s_\circ (C_1)}^{\text{ tr }}>0\), we have
where Z denotes the number of nae-sat solutions in \(\mathscr {G}\). Moreover, it was shown in [37, Theorem 1.1-(a)] that for \(C_2\le n^{1/5}\), we have
where \(C_k\) is a constant that depends only on k and the sum on the lhs is over \(s\in n^{-1}\mathbb {Z}\). Therefore, by Markov's inequality, we can choose \(C_2\equiv C_2(\varepsilon ,\alpha ,k)\) to be large enough so that
Furthermore, by Theorem 1.1-(a) of [37], there exists \(C_3\equiv C_3(\varepsilon ,\alpha ,k)\) such that
Finally, Theorem 3.24 and Proposition 3.25 of [37] show that for \(|C|\le n^{1/4}\), \(\mathbb {E}{{\textbf {N}}}_{s_{\circ }(C)}\asymp _{k}e^{-\lambda ^\star C}\) holds. Thus, we can choose \(K\equiv K(\varepsilon ,\alpha ,k)\in \mathbb {N}\) large enough so that
Therefore, by (30)–(33), the conclusion of Theorem 1.1-(a) holds with \(K=K(\varepsilon ,\alpha ,k)\). \(\square \)
3.1 Outline of the proof of Theorem 3.1
In this subsection, we discuss the outline of the proof of Theorem 3.1. We begin with a natural way of characterizing cycles in \(\mathscr {G}= (\mathcal {G}, \underline{\texttt {L}})\) which was also used in [20].
Definition 3.2
(\(\zeta \)-cycle). Let \(l>0\) be an integer. For each \({\zeta }\in \{0,1\}^{2l}\), a \(\zeta \)-cycle in \(\mathscr {G}=(\mathcal {G},\underline{\texttt {L}})\) consists of
which satisfies the following conditions:
-
\(v_1,\ldots ,v_l\in [n]\equiv V\) are distinct variables, and for each \(i\in [l]\), \(e_{v_i}^0\) and \(e_{v_i}^1\) are distinct half-edges attached to \(v_i\).
-
\(a_1,\ldots ,a_l\in [m]\equiv F\) are distinct clauses, and for each \(i\in [l]\), \(e_{a_i}^0\) and \(e_{a_i}^1\in [k]\) are distinct half-edges attached to \(a_i\). Moreover,
$$\begin{aligned} a_1 = \min \{a_i:i\in [l] \}, \quad \text {and} \quad e_{a_1}^0<e_{a_1}^1. \end{aligned}$$(34) -
\((e_{v_i}^1,e_{a_{i+1}}^0)\) and \((e_{a_i}^1,e_{v_i}^0)\) are edges in \(\mathcal {G}\) for each \(i\in [l]\). (\(a_{l+1}=a_1\))
-
The literal on the half-edge \(\texttt {L}({e_{a_i}^j})\) is given by \(\texttt {L}({e_{a_i}^j}) = \zeta _{2(i-1)+j }\) for each \(i\in [l]\) and \(j\in \{0,1\}\). (\(\zeta _0=\zeta _{2l}\))
Note that (34) is introduced in order to prevent overcounting. Also, we denote the size of \(\zeta \) by \(||\zeta ||\), defined as
Furthermore, we denote by \(X({\zeta })\) the number of \(\zeta \)-cycles in \(\mathscr {G}=(\mathcal {G},\underline{\texttt {L}})\). For \(\zeta \in \{0,1\}^{2\,l}\), it is not difficult to see that
Moreover, \(\{X({\zeta })\}\) are asymptotically jointly independent in the sense that for any \(l_0>0\),
Both (36) and (37) follow from an application of the method of moments (e.g., see Theorem 9.5 in [29]). Given these definitions and properties, we are ready to state the small subgraph conditioning method, appropriately adjusted to our setting.
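The method of moments used here rests on the identity \(\mathbb {E}[(X)_a]=\mu ^a\) for \(X\sim \text {Poisson}(\mu )\): matching the joint falling-factorial moments of the cycle counts to those of independent Poissons yields (36) and (37). A quick Monte Carlo sanity check of that identity (the sampler and helper names below are illustrative):

```python
import math
import random

# Check E[(X)_a] = mu^a for X ~ Poisson(mu), the identity behind
# the method-of-moments proof of (36)-(37).
random.seed(1)

def sample_poisson(mu):
    # Knuth's multiplication method; adequate for small mu.
    L, p, x = math.exp(-mu), 1.0, 0
    while True:
        p *= random.random()
        if p < L:
            return x
        x += 1

def falling(x, a):
    # falling factorial (x)_a = x(x-1)...(x-a+1)
    out = 1
    for j in range(a):
        out *= (x - j)
    return out

mu, a, trials = 1.5, 3, 100000
est = sum(falling(sample_poisson(mu), a) for _ in range(trials)) / trials
# est should be close to mu**a = 3.375, up to Monte Carlo error
```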
Theorem 3.3
(Small subgraph conditioning [39, 40]). Let \(\mathscr {G}= (\mathcal {G}, \underline{\texttt {L}})\) be a random d-regular k-nae-sat instance and let \(X({\zeta })\equiv X({\zeta ,n})\) be the number of \(\zeta \)-cycles in \(\mathscr {G}\) with \(\mu ({\zeta })\) given as (36). Suppose that a random variable \(Z_n\equiv Z_n(\mathscr {G})\) satisfies the following conditions:
-
(a)
For each \(l\in \mathbb {N}\) and \(\zeta \in \{0,1\}^{2l}\), the following limit exists:
$$\begin{aligned} 1+ \delta ({\zeta }) \equiv \lim _{n \rightarrow \infty } \frac{\mathbb {E}[ Z_nX({\zeta })]}{\mu ({\zeta })\mathbb {E}Z_n }. \end{aligned}$$(38)Moreover, for each \(a,l\in \mathbb {N}\) and \(\zeta \in \{0,1 \}^{2\,l}\), we have
$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{\mathbb {E}[Z_n (X(\zeta ))_a] }{ \mathbb {E}Z_n} = (1+\delta (\zeta ))^a \mu (\zeta )^a, \end{aligned}$$where \((b)_a\) denotes the falling factorial \((b)_a:= b(b-1)\cdots (b-a+1)\).
-
(b)
The following limit exists:
$$\begin{aligned} C\equiv \lim _{n \rightarrow \infty } \frac{\mathbb {E}Z_n^2}{(\mathbb {E}Z_n)^2}. \end{aligned}$$ -
(c)
We have \(\sum _{l=1}^\infty \sum _{\zeta \in \{0,1\}^{2\,l}} \mu ({\zeta }) \delta ({\zeta })^2 <\infty .\)
-
(d)
Moreover, the constant C satisfies \(C\le \exp \left( \sum _{l=1}^\infty \sum _{\zeta \in \{0,1\}^{2\,l}} \mu ({\zeta }) \delta ({\zeta })^2 \right) \).
Then, we have the following conclusion:
where \(\bar{X}({\zeta })\) are independent Poisson random variables with mean \(\mu ({\zeta })\).
We briefly explain the crux of the theorem. Since \(\{X({\zeta })\}\) converge jointly to \(\{\bar{X}({\zeta })\}\), it is not hard to see that
using the conditions (a),(b),(c),(d) (e.g. see Theorem 9.12 in [29] and its proof). Therefore, conditions (b) and (d) imply that the conditional variance of \(Z_n\) given \(\{X({\zeta })\}\) is negligible compared to \((\mathbb {E}Z_n)^2\), and hence the distribution of \(Z_n\) is asymptotically the same as that of \(\mathbb {E}\big [Z_n\big | \{X({\zeta })\}\big ]\) as addressed in the conclusion of the theorem.
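In our notation, this reasoning can be summarized by the following conditional variance sketch, using independence of the Poisson limits and \(\mathbb {E}[t^{X}]=e^{\mu (t-1)}\) for \(X\sim \text {Poisson}(\mu )\) (heuristically treating \(\mathbb {E}[Z_n\mid \{X(\zeta )\}]\) as \(\mathbb {E}[Z_n]\prod _\zeta (1+\delta (\zeta ))^{X(\zeta )}e^{-\mu (\zeta )\delta (\zeta )}\)):

```latex
\begin{aligned}
\frac{\mathbb{E}\big[\mathbb{E}[Z_n \mid \{X(\zeta)\}]^2\big]}{(\mathbb{E}Z_n)^2}
 &\longrightarrow
 \exp\Big(\sum_{\zeta}\mu(\zeta)\big[(1+\delta(\zeta))^2-1\big]-2\mu(\zeta)\delta(\zeta)\Big)
 =\exp\Big(\sum_{\zeta}\mu(\zeta)\delta(\zeta)^2\Big),\\
\frac{\mathbb{E}\big[\big(Z_n-\mathbb{E}[Z_n\mid\{X(\zeta)\}]\big)^2\big]}{(\mathbb{E}Z_n)^2}
 &\longrightarrow
 C-\exp\Big(\sum_{\zeta}\mu(\zeta)\delta(\zeta)^2\Big)\;\le\;0
 \quad\text{by (b) and (d).}
\end{aligned}
```

Since the conditional variance is nonnegative, the limit in the second line must in fact be 0.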
Having Theorem 3.3 in mind, our goal is to (approximately) establish the four conditions for (a truncated version of) \({{\textbf {Z}}}_{\lambda ,s_\circ (C)}^{\text{ tr }},\) for \(s_{\circ }(C)\equiv s^\star -\frac{\log n}{2\lambda ^\star n}-\frac{C}{n}\). Condition (b) has already been obtained from the moment analysis in [37]. Condition (a) will be derived in Proposition 4.1 below and (c) in Lemma 4.6 below. Condition (d), however, holds only in an approximate sense because of within-cluster correlations; the approximation error becomes smaller as we take the constant C larger.
In the previous works [27, 28, 39, 40, 42], condition (d) could be obtained through a direct calculation of the second moment in a purely combinatorial way. However, this approach seems intractable in our model; for instance, the empirical measure \(H^\star _{\lambda }\) contributing most to the first moment has little combinatorial meaning.
Instead, we first establish (39) for the L-truncated model by showing the concentration of the rescaled partition function introduced in (40) below. The truncated model is easier to work with since, unlike the untruncated model, it has a finite spin space. Then, we rely on the convergence results on the leading constants of the first and second moments, derived in [37], to deduce (d) for the untruncated model in an approximate sense. We then apply the ideas behind the proof of Theorem 3.3 to deduce Theorem 3.1 (for details, see Sect. 6).
We now give a more precise description of how we establish (d) for the truncated model. Let \(1\le L,l_0 <\infty \) and \(\lambda \in (0,\lambda ^\star _L)\), where \(\lambda ^\star _L\) is given in Definition 2.19. Then, define the rescaled partition function \({{\textbf {Y}}}^{(L)}_{\lambda , l_0}\)
Here, \(\delta _{L}(\zeta )\) is the constant defined in (38) for \(Z_n = {{\textbf {Z}}}_{\lambda }^{(L),\text{ tr }}\), assuming its existence. The precise definition of \(\delta _{L}(\zeta )\) is given in (45). The reason to consider \(\widetilde{{{\textbf {Z}}}}^{(L),\text{ tr }}_{\lambda }\) instead of \({{\textbf {Z}}}^{(L),\text{ tr }}_{\lambda }\) is to ignore the contribution from near-identical copies in the second moment. Then, Proposition 3.4 below shows that the rescaled partition function is concentrated for each \(L<\infty \). Its proof is provided in Sect. 5.
Proposition 3.4
Let \(k\ge k_0\), \(L<\infty \), and \(\lambda \in (\lambda ^\star _L-0.01\cdot 2^{-k},\lambda ^\star _L)\). Then we have
Remark 3.5
An important point is that Proposition 3.4 fails for \(\lambda =\lambda ^\star _L\). If \(\lambda <\lambda ^\star _L\), then \(s^\star _{\lambda ,L}<s^\star _L\), so there exist exponentially many clusters of size \(e^{n s^\star _{\lambda ,L}}\). Therefore, the intrinsic correlations within clusters are negligible (that is, when we pick two clusters at random, the probability of selecting the same one is close to 0) and the fluctuation is dominated by cycle effects. However, when there are only boundedly many clusters of size \(e^{ns^\star _{\lambda ,L}}\) (that is, when \(\lambda \) is very close to \(\lambda ^\star _L\)), within-cluster correlations become non-trivial. Mathematically, we can see this from (29), where we can ignore the first moment term in the rhs of (29) if (and only if) it is large enough.
Nevertheless, for \(s_\circ (C)\) defined as in Theorem 3.1, we will see in Sect. 6 that if we take C large, then (d) holds up to a small error, and hence the conclusion of Theorem 3.3 holds with a small error.
Further notation. Throughout this paper, we will often use the following multi-index notation. Let \(\underline{a}=(a_\zeta )_{||\zeta ||\le l_0}\) and \(\underline{b}=(b_\zeta )_{||\zeta ||\le l_0}\) be tuples of integers indexed by \(\zeta \) with \(||\zeta ||\le l_0\). Then, we write
Moreover, for non-negative quantities \(f=f_{d,k, L, n}\) and \(g=g_{d,k,L, n}\), we use any of the equivalent notations \(f=O_{k,L}(g), g= \Omega _{k,L}(f), f\lesssim _{k,L} g\) and \(g \gtrsim _{k,L} f \) to indicate that \(f\le C_{k,L}\cdot g\) holds for some constant \(C_{k,L}>0\), which only depends on k, L.
4 The Effects of Cycles
In this section, our goal is to obtain the condition (a) of Theorem 3.3 for (truncated versions of) \({{\textbf {Z}}}^{(L),\text{ tr }}_{\lambda }\) and \({{\textbf {Z}}}^{\text{ tr }}_{\lambda ,s_n}\), where \(|s_n-s^\star _{\lambda }|=O(n^{-2/3})\) (see Proposition 4.1 below). To do so, we first introduce necessary notations to define \(\delta (\zeta )\) appearing in Theorem 3.3.
For \(\lambda \in [0,1]\), recall the optimal coloring profile of the untruncated model \(H^\star _{\lambda }\) and truncated model \(H^\star _{\lambda ,L}\) from Definition 2.18. We denote the two-point marginals of \(\dot{H}^\star _{\lambda }\) by
and similarly for \(\dot{H}^\star _{\lambda ,L}\). On the other hand, for \({\underline{\texttt {L}}}\in \{0,1\}^k\), consider the optimal clause empirical measure \(\hat{H}^{\underline{\texttt {L}}}_{\lambda }\) given the literal assignment \(\underline{\texttt {L}}\in \{0,1\}^{k}\) around a clause, namely for \(\underline{\sigma }\in \Omega ^k\),
where \(\hat{\mathfrak {Z}}^{\underline{\texttt {L}}}_{\lambda }\) is the normalizing constant. Note that \(\hat{\mathfrak {Z}}^{\underline{\texttt {L}}}_{\lambda } = \hat{\mathfrak {Z}}_{\lambda }\) is independent of \(\underline{\texttt {L}}\) due to the symmetry \(\dot{q}_\lambda ^\star (\dot{\sigma })=\dot{q}_\lambda ^\star (\dot{\sigma }\oplus 1)\). Similarly, define \(\hat{H}^{\underline{\texttt {L}}}_{\lambda ,L}\) for the truncated model. Given the literals \(\texttt {L}_1,\texttt {L}_2\) at the first two coordinates of a clause, the two-point marginal of \(\hat{H}^{\underline{\texttt {L}}}_{\lambda }\) is defined by
where the second equality holds for any \(\underline{\texttt {L}}\in \{0,1\}^k\) that agrees with \(\texttt {L}_1,\texttt {L}_2\) at the first two coordinates, due to the symmetry \(\hat{H}^{\underline{\texttt {L}}}_{\lambda }(\underline{\tau })=\hat{H}_{\lambda }^{\underline{\texttt {L}}\oplus \underline{\texttt {L}}'}(\underline{\tau }\oplus \underline{\texttt {L}}')\). The symmetry also implies that
for any \(\texttt {L}_1, \texttt {L}_2 \in \{0,1\}\) and \(\tau _1\in \Omega \). We also define \(\hat{H}^{\texttt {L}_1,\texttt {L}_2}_{\lambda ,L}\) analogously for the truncated model.
Then, we define \(\dot{A}\equiv \dot{A}_{\lambda },\hat{A}^{\texttt {L}_1,\texttt {L}_2}\equiv \hat{A}^{\texttt {L}_1,\texttt {L}_2}_{\lambda }\) to be the \(\Omega \times \Omega \) matrices as follows:
and \(\Omega _L\times \Omega _L\) matrices \(\dot{A}_{\lambda ,L}\) and \(\hat{A}_{\lambda ,L}^{\texttt {L}_1,\texttt {L}_2}\) are defined analogously using \(\dot{H}^\star _{\lambda ,L}, \hat{H}^{\texttt {L}_1,\texttt {L}_2}_{\lambda ,L}\) and \(\bar{H}^\star _{\lambda ,L}\). Note that both matrices have row sums equal to 1, and hence their largest eigenvalue is 1. For \(\zeta \in \{0,1\}^{2l}\), we introduce the following notation for convenience:
where \(\zeta _0 = \zeta _{2l}\). Moreover, we define \((\dot{A}_L \hat{A}_L)^\zeta \) analogously. Then, the main proposition of this section is given as follows.
Proposition 4.1
Let \(L,l_0>0\) and let \(\underline{X} = \{X({\zeta })\}_{||\zeta ||\le l_0}\) denote the numbers of \(\zeta \)-cycles in \(\mathscr {G}\). Also, set \(\mu ({\zeta })\) as in (36), and for each \(\zeta \in \cup _l\{0,1\}^{2l}\), define
Then, there exists a constant \(c_{\textsf {cyc}}=c_{\textsf {cyc}}(l_0)\) such that the following statements hold true:
-
(1)
For \(\lambda \in (0,1)\) and any tuple of nonnegative integers \(\underline{a}=(a_\zeta )_{||\zeta ||\le l_0}\), such that \(||\underline{a}||_\infty \le c_{\textsf {cyc}} \log n\), we have
$$\begin{aligned} \mathbb {E}\left[ \widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }} \cdot (\underline{X})_{\underline{a}} \right] = \left( 1+ err(n,\underline{a}) \right) \left( \underline{\mu } ( 1+ \underline{\delta }_L)\right) ^{\underline{a}} \mathbb {E}\widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }}, \end{aligned}$$(46)where \(\widetilde{{{\textbf {Z}}}}^{(L),\text{ tr }}_{\lambda }\) is defined in (40) and \(err(n,\underline{a}) = O_k \left( ||\underline{a}||_1 n^{-1/2} \log ^2 n \right) \).
-
(2)
For \(\lambda \in (0,\lambda _L^\star )\), where \(\lambda ^\star _{L}\) is defined in Definition 2.19, the analogue of (46) holds for the second moment as well. That is, for \(\underline{a}=(a_\zeta )_{||\zeta ||\le l_0}\) with \(||\underline{a}||_\infty \le c_{\textsf {cyc}} \log n\), we have
$$\begin{aligned} \mathbb {E}\left[ \big (\widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }}\big )^2 \cdot (\underline{X})_{\underline{a}} \right] = \left( 1+ err(n,\underline{a}) \right) \left( \underline{\mu } ( 1+ \underline{\delta }_L)^2\right) ^{\underline{a}} \mathbb {E}\big (\widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }}\big )^2. \end{aligned}$$(47) -
(3)
Under a slightly weaker error bound, given by \(err'(n,\underline{a}) = O_k(||\underline{a}||_1 n^{-1/8})\), the analogue of (1) holds for the untruncated model with any \(\lambda \in (0,1)\). Namely, analogously to (40), define \(\widetilde{{{\textbf {Z}}}}_{\lambda }^{\text{ tr }}\) and \(\widetilde{{{\textbf {Z}}}}^{\text{ tr }}_{\lambda ,s}\) by
$$\begin{aligned}&\widetilde{{{\textbf {Z}}}}^{\text{ tr }}_{\lambda }:=\sum _{||H-H^\star _{\lambda }||_1\le n^{-1/2}\log ^2 n}{{\textbf {Z}}}^{\text{ tr }}_{\lambda }[H];\nonumber \\ {}&\widetilde{{{\textbf {Z}}}}^{\text{ tr }}_{\lambda ,s}:=\sum _{||H-H^\star _{\lambda }||_1\le n^{-1/2}\log ^2 n}{{\textbf {Z}}}^{\text{ tr }}_{\lambda }[H]\mathbb {1}\big \{s(H)\in [ns,ns+1)\big \}\,. \end{aligned}$$(48)Then, (46) continues to hold when we replace \(\widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }},err(n,\underline{a})\) and \(\underline{\delta }_L\) by \(\widetilde{{{\textbf {Z}}}}_{\lambda }^{\text{ tr }}, err'(n,\underline{a})\) and \(\underline{\delta }\) respectively. Moreover, (46) continues to hold when we replace \(\widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }},err(n,\underline{a})\) and \(\underline{\delta }_L\) by \({{\textbf {Z}}}_{\lambda ,s_n}^{\text{ tr }}, err'(n,\underline{a})\) and \(\underline{\delta }\) respectively, where \(|s_n-s^\star _{\lambda }|=O(n^{-2/3})\).
-
(4)
For each \(\zeta \in \cup _l \{0,1\}^{2l}\), we have \(\lim _{L\rightarrow \infty } \delta _L (\zeta ) = \delta (\zeta )\).
In the remainder of this section, we focus on proving (1) of Proposition 4.1. In the proof, we will see that (2) of the proposition follows by an analogous argument (see Remark 4.4). The proofs of (3) and (4) are deferred to Appendix 6, since they require a substantial amount of additional technical work.
For each \(\zeta \in \{0,1\}^{2l}\) and a nonnegative integer \(a_\zeta \), let \(\mathcal {Y}_i \equiv \mathcal {Y}_i(\zeta ) \in \big \{\{v_{\iota },a_{\iota },(e_{v_{\iota }}^{j}, e_{a_{\iota }}^{j})_{j=0,1} \}_{\iota =1}^l\big \}\), \(i\in [a_\zeta ]\), denote the possible locations of the \(a_\zeta \) \(\zeta \)-cycles defined as in Definition 3.2. Then, it is not difficult to see that
where the summation runs over distinct \(\mathcal {Y}_1,\ldots ,\mathcal {Y}_{a_\zeta }\), and \(\underline{\texttt {L}}(\mathcal {Y}_i;\mathscr {G})\) denotes the literals on \(\mathcal {Y}_i\) inside \(\mathscr {G}\). Based on this observation, we will show (1) of Proposition 4.1 by computing the cost of planting cycles at specific locations \(\{\mathcal {Y}_i\}\). Moreover, in addition to \(\{\mathcal {Y}_i\}\), it will be useful to prescribe a particular coloring on those locations. In the following definition, we introduce the formal notation needed to carry out this idea.
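The expansion above rests on the elementary identity that the falling factorial \((X)_a\) counts ordered \(a\)-tuples of distinct objects; a minimal sketch with hypothetical small values:

```python
from itertools import permutations

def falling_factorial(x, a):
    """(x)_a = x (x - 1) ... (x - a + 1)."""
    out = 1
    for j in range(a):
        out *= x - j
    return out

# If X(zeta) = 5 zeta-cycles are present in the graph, the number of ways to
# choose an ordered tuple (Y_1, Y_2, Y_3) of distinct cycle locations is (5)_3.
num_tuples = len(list(permutations(range(5), 3)))
assert falling_factorial(5, 3) == num_tuples == 60
```

This is why \(\mathbb {E}[{{\textbf {Z}}}\cdot (\underline{X})_{\underline{a}}]\) decomposes as a sum over distinct location tuples \(\mathcal {Y}_1,\ldots ,\mathcal {Y}_{a_\zeta }\).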
Definition 4.2
(Empirical profile on \(\mathcal {Y}\)). Let \(L,l_0>0\) be given integers and let \(\underline{a}=(a_\zeta )_{||\zeta ||\le l_0}\). Moreover, let
denote the distinct \(a_\zeta \) \(\zeta \)-cycles for each \(||\zeta ||\le l_0\) inside \(\mathscr {G}\) (Definition 3.2), and let \(\underline{\sigma }\) be a valid coloring on \(\mathscr {G}\). We define \({\Delta } \equiv {\Delta }[\underline{\sigma };\mathcal {Y}]\), the empirical profile on \(\mathcal {Y}\), as follows.
-
Let \(V(\mathcal {Y})\) (resp. \(F(\mathcal {Y})\)) be the set of variables (resp. clauses) in \(\cup _{||\zeta ||\le l_0} \cup _{i=1}^{a_\zeta } \mathcal {Y}_i(\zeta )\), and let \(E_c(\mathcal {Y})\) denote the collection of variable-adjacent half-edges included in \(\cup _{||\zeta ||\le l_0} \cup _{i=1}^{a_\zeta } \mathcal {Y}_i(\zeta )\). We write \(\underline{\sigma }_\mathcal {Y}\) to denote the restriction of \(\underline{\sigma }\) onto \(V(\mathcal {Y})\) and \(F(\mathcal {Y})\).
-
\(\Delta \equiv \Delta [\underline{\sigma };\mathcal {Y}] \equiv (\dot{\Delta }, (\hat{\Delta }^{\underline{\texttt {L}}})_{\underline{\texttt {L}}\in \{0,1\}^k}, \bar{\Delta }_c)\) is the counting measure of coloring configurations around \(V(\mathcal {Y}), F(\mathcal {Y})\) and \(E_c(\mathcal {Y})\) given as follows.
$$\begin{aligned} \begin{aligned}&\dot{\Delta } (\underline{\tau }) = |\{v\in V(\mathcal {Y}): \underline{\sigma }_{\delta v} = \underline{\tau } \} |, \quad \text {for all } \underline{\tau } \in \dot{\Omega }_L^d;\\&\hat{\Delta }^{\underline{\texttt {L}}} (\underline{\tau }) = |\{a\in F(\mathcal {Y}): \underline{\sigma }_{\delta a} = \underline{\tau }, \;\underline{\texttt {L}}_{\delta a} = \underline{\texttt {L}}\} |, \quad \text {for all } \underline{\tau } \in \dot{\Omega }_L^k, \; \underline{\texttt {L}}\in \{0,1\}^k;\\&\bar{\Delta }_c ({\tau }) = |\{e\in E_c(\mathcal {Y}): \sigma _{e} = \tau \} |, \quad \text {for all } \tau \in \dot{\Omega }_L. \end{aligned} \end{aligned}$$(50) -
We write \(|\dot{\Delta }| \equiv \langle \dot{\Delta },1 \rangle \), and define \(|\hat{\Delta }^{\underline{\texttt {L}}} |\), \(|\bar{\Delta }_c|\) analogously.
Note that \(\Delta \) is well-defined if \(\mathcal {Y}\) and \(\underline{\sigma }_\mathcal {Y}\) are given.
In the proof of Proposition 4.1, we will fix \(\mathcal {Y}\), the locations of \(\underline{a}\) \(\zeta \)-cycles, and a coloring configuration \(\underline{\tau }_\mathcal {Y}\) on \(\mathcal {Y}\), and compute the contributions from \(\mathscr {G}\) and \(\underline{\sigma }\) that have cycles on \(\mathcal {Y}\) and satisfy \(\underline{\sigma }_\mathcal {Y}= \underline{\tau }_\mathcal {Y}\). Formally, abbreviate \({{\textbf {Z}}}^\prime \equiv \widetilde{{{\textbf {Z}}}}^{(L),\text{ tr }}_{\lambda }\) for simplicity and define
Then, we express that
where the notation in the last equality is introduced for convenience. The key idea of the proof is to study the rhs of the above equation. We follow an idea similar to the one developed in Section 6 of [25], which is to decompose \({{\textbf {Z}}}^\prime \) in terms of empirical profiles of \(\underline{\sigma }\) on \(\mathcal {G}\). The main contribution of our proof is a method that overcomes the complications caused by the indicator term (that is, the planted cycles).
Proof of Proposition 4.1-(1)
As discussed above, our goal is to understand \(\mathbb {E}[{{\textbf {Z}}}^\prime \mathbb {1}\{\mathcal {Y},\underline{\tau }_\mathcal {Y}\} ]\) for given \(\mathcal {Y}\) and \(\underline{\tau }_\mathcal {Y}\). To this end, we decompose the partition function in terms of coloring profiles. It will be convenient to work with
the non-normalized empirical counts of H. Moreover, if g is given, then the product of the weight, clause, and edge factors is also determined. Let us denote this by w(g), defined by
Recalling the definition of \({{\textbf {Z}}}^\prime \equiv \widetilde{{{\textbf {Z}}}}^{(L),\text{ tr }}_{\lambda }\) in (40), we consider g such that \(||g-g^\star _{\lambda ,L}||_1\le \sqrt{n}\log ^2 n \), where we defined
Now, fix the literal assignment \(\underline{\texttt {L}}_E\) on \(\mathscr {G}\) which agrees with those on the cycles given by \(\mathcal {Y}\). Finally, let \(\Delta =(\dot{\Delta },\hat{\Delta }, \bar{\Delta }_c)\) denote the empirical profile on \(\mathcal {Y}\) induced by \(\underline{\tau }_\mathcal {Y}\). Then, we have
where we wrote \(H^\star = H^\star _{\lambda ,L}\) and the last equality follows from \(||g-g^\star _{\lambda ,L}||\le \sqrt{n}\log ^2 n\).
In the remainder, we sum the above over \(\mathcal {Y}\) and \(\underline{\tau }_\mathcal {Y}\), depending on the structure of \(\mathcal {Y}\). To this end, we define \(\eta \equiv \eta (\mathcal {Y})\) to be
where \(|\hat{\Delta }| = \sum _{\underline{\texttt {L}}} |\hat{\Delta }^{\underline{\texttt {L}}}|\); note that \(|\dot{\Delta }|, |\hat{\Delta }|\) and \(|\bar{\Delta }_c|\) are well-defined once \(\mathcal {Y}\) is given. The quantity \(\eta \) describes the number of disjoint components in \(\mathcal {Y}\), in the sense that
Firstly, suppose that all the cycles given by \(\mathcal {Y}\) are disjoint, that is, \(\eta (\mathcal {Y})=0\). In other words, all the variable sets \(V(\mathcal {Y}_i (\zeta ))\), \(i\in [a_\zeta ], ||\zeta ||\le l_0\) are pairwise disjoint, and the same holds for the clause sets \(F(\mathcal {Y}_i (\zeta ))\). In this case, the effects of the cycles can be treated as independent when summing (55) over \(\underline{\tau }_\mathcal {Y}\), which gives us
where \((\dot{A}_L \hat{A}_L)^\zeta \) is defined as in (44). Also, note that although \({\Delta }\) is defined depending on \(\underline{\tau }_\mathcal {Y}\), the quantity \(|\bar{\Delta }_c|\) in the denominator is well-defined given \(\mathcal {Y}\). Thus, averaging the above over all \(\underline{\texttt {L}}_E\) gives
Moreover, setting \(a^\dagger = \sum _{||\zeta ||\le l_0} a_\zeta ||\zeta ||\), the number of ways of choosing \(\mathcal {Y}\) to be \(\underline{a}\) disjoint \(\zeta \)-cycles can be written as
Having this in mind, summing (58) over all \(\mathcal {Y}\) that describe \(\underline{a}\) disjoint \(\zeta \)-cycles, and then over all g with \(||g-g^\star _{\lambda ,L}||\le \sqrt{n}\log ^2 n\), we obtain that
where \(\underline{\mu }, \underline{\delta }_L\) are defined as in the statement of the proposition.
Our next goal is to deal with \(\mathcal {Y}\) such that \(\eta (\mathcal {Y})=\eta >0\) and to show that such \(\mathcal {Y}\) give a negligible contribution. If \(\eta >0\), then at least \(||\underline{a}||_1 - 2\eta \) cycles of \(\mathcal {Y}\) must be disjoint from everything else in \(\mathcal {Y}\). Therefore, when summing the terms with \(H^\star \) in (55) over \(\underline{\tau }_\mathcal {Y}\), all but at most \(2\eta \) cycles contribute a factor of \((1+{\delta }_L(\zeta ))\), while the intersecting ones may contribute a different value. Thus, we obtain that
for some constant \(C>0\) depending on \(k, L, l_0\).
Then, similarly to (59), we can bound the number of choices of \(\mathcal {Y}\) satisfying \(\eta (\mathcal {Y})=\eta \). Since all but \(2\eta \) of the cycles are disjoint from the others, we have
The formula in the rhs can be described as follows.
-
1.
The first bracket describes the number of ways to choose variables and clauses, along with the locations of half-edges described by \(\mathcal {Y}\). Note that at this point we have not yet chosen the places of variables, clauses and half-edges that are given by the intersections of cycles in \(\mathcal {Y}\).
-
2.
The second bracket is introduced to prevent overcounting the locations of cycles that are disjoint from all others. The factor \((2l_0)^{2\eta }\) comes from the observation that there can be at most \(2\eta \) intersecting cycles.
-
3.
The third bracket bounds the number of ways of choosing where to put overlapping variables and clauses, which can be understood as follows.
-
Choose where to put an overlapping variable (or clause): number of choices bounded by \(a^\dagger \).
-
If there is an overlapping half-edge adjacent to the chosen variable (or clause), we decide where to put the clause at its endpoint: number of choices bounded by d.
-
Since there are \(2a^\dagger - |\bar{\Delta }_c|\) overlapping half-edges and \(2a^\dagger - |\dot{\Delta }|-|\hat{\Delta }|\) overlapping variables and clauses, we obtain the expression (62).
To conclude the analysis, we need to sum (61) over \(\mathcal {Y}\) with \(\eta (\mathcal {Y}) = \eta \), using (62) (and average over \(\underline{\texttt {L}}_E\)). One thing to note here is the following relation among \(|\dot{\Delta }|, |\hat{\Delta }|,\) and \(|\bar{\Delta }_c|\):
which comes from the fact that for each overlapping edge, its endpoints count as overlapping variables and clauses. Therefore, we can simplify (62) as
Thus, we obtain that
for another constant \(C'\) depending on \(k, L, l_0\). We choose \(c_{\textsf {cyc}}=c_{\textsf {cyc}}(l_0)\) so that \(2^{2a^\dagger } \le n^{1/3}\) holds whenever \(||\underline{a}||_\infty \le c_{\textsf {cyc}} \log n\). Then, summing this over \(\eta \ge 1\) and all g with \(||g-g^\star _{\lambda ,L}|| \le \sqrt{n}\log ^2 n\) shows that the contribution from \(\mathcal {Y}\) with \(\eta (\mathcal {Y})\ge 1\) is negligible for our purposes. Combining with (60), we deduce the conclusion. \(\square \)
Remark 4.3
Although we will not use it in the rest of the paper, the analogue of Proposition 4.1-(1) for \({{\textbf {Z}}}^{(L),\text{ tr }}_{\lambda }\) holds under the same condition. That is, we have
To prove the equation above, we have from Proposition 3.4 of [43] that
In the second line, we controlled the second term crudely by using \({{\textbf {Z}}}_{\lambda }^{(L),\text{ tr }}\le 2^n\) and (37). Having (66) in hand, the rest of the proof of (65) is the same as the proof of Proposition 4.1-(1).
Remark 4.4
Having proved Proposition 4.1-(1), the proof of Proposition 4.1-(2) is almost identical. Namely, if we consider the empirical coloring profile in the second moment and consider the analogue of w(g) (53) in the second moment (i.e. replace \(\dot{\Phi },\hat{\Phi }^{\text {lit}}\), and \(\bar{\Phi }\) in (53) respectively by \(\dot{\Phi }\otimes \dot{\Phi }, \hat{\Phi }^{\text {lit}}(\cdot \oplus \underline{\texttt {L}}_1)\otimes \hat{\Phi }^{\text {lit}}(\cdot \oplus \underline{\texttt {L}}_2)\), and \(\bar{\Phi }\otimes \bar{\Phi }\)), then the rest of the argument is the same.
As a corollary, we observe that the contribution to \(\mathbb {E}{{\textbf {Z}}}^{(L),\text{ tr }}_{\lambda }\) and \(\mathbb {E}\big (\widetilde{{{\textbf {Z}}}}^{(L),\text{ tr }}_{\lambda }\big )^2\) from excessively large \(X(\zeta )\) is negligible.
Corollary 4.5
Let \(c>0\), \(L>0\), \(\lambda \in (0,\lambda ^\star _L)\) and \(\zeta \in \cup _l \{0,1\}^{2l}\) be fixed. Then, the following estimates hold true:
-
(1)
\(\mathbb {E}\Big [ \widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }} \mathbb {1}\{X(\zeta )\ge c\log n \}\Big ] = n^{-\Omega (\log \log n)} \mathbb {E}\widetilde{{{\textbf {Z}}}}_{\lambda }^{(L)}\);
-
(2)
\(\mathbb {E}\Big [ \big (\widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }}\big )^2 \mathbb {1}\{X(\zeta )\ge c\log n \}\Big ] = n^{-\Omega (\log \log n)} \mathbb {E}\Big [\big (\widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }}\big )^2\Big ]\);
-
(3)
The analogue of (1) is true for the untruncated model with \(\lambda \in (0,\lambda ^\star )\). Namely, (1) continues to hold when we replace \({{\textbf {Z}}}_{\lambda }^{(L),\text{ tr }}\) by \({{\textbf {Z}}}_{\lambda }^{\text{ tr }}\).
Proof
We present the proof of (1) of the corollary; the others follow by the same idea, owing to Proposition 4.1. Let \(c_{\textsf {cyc}}=c_{\textsf {cyc}}(||\zeta ||)\) be as in Proposition 4.1, and set \(c'=\frac{1}{2}(c\wedge c_{\textsf {cyc}})\). Then, by Markov’s inequality, we have
Then, plugging the estimate from Proposition 4.1-(1) in the rhs implies the conclusion. \(\square \)
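The Markov step uses the fact that \((X)_a\) is nondecreasing in integer \(X\ge a\), so \(\mathbb {1}\{X\ge t\}\le (X)_a/(t)_a\) pointwise; choosing \(a \asymp \log n\) then converts a constant-factor moment bound into the \(n^{-\Omega (\log \log n)}\) rate. A minimal numerical check of the pointwise inequality, with hypothetical values:

```python
def falling_factorial(x, a):
    """(x)_a = x (x - 1) ... (x - a + 1)."""
    out = 1
    for j in range(a):
        out *= x - j
    return out

t, a = 10, 4  # threshold t and order a <= t
for X in range(0, 40):
    if X >= t:
        # indicator bound: 1{X >= t} <= (X)_a / (t)_a
        assert falling_factorial(X, a) >= falling_factorial(t, a)
    else:
        # (X)_a >= 0 for nonnegative integers, so the bound never goes negative
        assert falling_factorial(X, a) >= 0
```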
To conclude this section, we present an estimate that bounds the sizes of \(\delta (\zeta )\) and \(\delta _L(\zeta )\). One purpose of doing so is to verify assumption (c) of Theorem 3.3.
Lemma 4.6
In the setting of Proposition 4.1, let \(\lambda \in (0,\lambda ^\star ]\) and \(\delta _L\) be defined as (45). Then, there exists an absolute constant \(C>0\) such that for all \(\zeta \in \cup _l \{0,1\}^{2l}\) and L large enough,
Hence, \(\delta (\zeta ;\lambda ) \le (k^C 2^{-k})^{||\zeta ||}\) holds by Proposition 4.1-(4), and we have for large enough k,
where the last inequality holds because \(d\le k 2^k\) holds by Remark 1.2. Replacing \(\delta _L(\zeta ;\lambda )\) by \(\delta (\zeta ;\lambda )\) in the equation above, the analogue also holds for the untruncated model.
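To illustrate why a geometric bound of this form suffices for the summability typically required in assumption (c), here is a hedged back-of-the-envelope computation. It assumes the standard cycle-count mean bound \(\mu (\zeta )\le ((d-1)(k-1))^{||\zeta ||}/(2||\zeta ||)\) for biregular graphs (an assumption on our part; the exact form of (36) may differ), together with the fact that there are \(4^{l}\) types \(\zeta \in \{0,1\}^{2l}\):

```latex
\sum_{l\ge 1}\sum_{\zeta\in\{0,1\}^{2l}} \mu(\zeta)\,\delta(\zeta)^2
\;\le\; \sum_{l\ge 1} 4^{\,l}\cdot\frac{\big((d-1)(k-1)\big)^{l}}{2l}
        \cdot\big(k^C 2^{-k}\big)^{2l}
\;\le\; \sum_{l\ge 1}\Big(4dk\,k^{2C}4^{-k}\Big)^{l}
\;<\;\infty,
```

since \(d\le k2^k\) gives \(4dk\,k^{2C}4^{-k}\le 4k^{2C+2}2^{-k}<1\) for large enough \(k\), so the series is geometric with ratio below 1.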
5 The Rescaled Partition Function and Its Concentration
In random regular k-nae-sat, it is believed that the primary source of the non-concentration of \({{\textbf {Z}}}_{\lambda }^{\text{ tr }}\) is the existence of short cycles in the graph. Based on the computations done in the previous section, we show that the partition function is indeed concentrated once we rescale it by the cycle effects. We work with the truncated model, since some of our important estimates break down in the untruncated model. The goal of this section is to establish Proposition 3.4.
To this end, we write the variance of the rescaled partition function as the sum of squares of Doob martingale increments with respect to the clause-revealing filtration, and study each increment using a version of the discrete Fourier transform. Although a similar idea was used in [25] to study \({{\textbf {Z}}}_0\), the rescaling factors of the partition function make the analysis more involved and call for more delicate estimates (for instance, Proposition 4.1) than those in [25]. Moreover, it is important to note that due to the rescaling, the result we obtain in Proposition 3.4 is stronger than Proposition 6.1 in [25]. This improvement reflects the underlying principle more clearly, namely that the multiplicative fluctuation of the partition function originates from the existence of cycles.
Although the setting in this section is similar to that of Section 6 of [25], we begin by briefly recalling it for completeness. We then focus on the point from which the aforementioned improvement originates, and outline the remaining technical details, which are essentially analogous to those in [25].
Throughout this section, we fix \(L\ge 1\), \(\lambda \in (0,\lambda ^\star _L)\) and \(l_0>0\), which all can be arbitrary. Recall the rescaled partition function \({{\textbf {Y}}}\equiv {{\textbf {Y}}}_{\lambda ,l_0}^{(L)}(\mathscr {G})\) defined in (40):
where \(\widetilde{{{\textbf {Z}}}}^{(L),\text{ tr }}_{\lambda }\equiv \sum _{||H-H^\star _{\lambda ,L}||_1\le n^{-1/2}\log ^{2}n}{{\textbf {Z}}}^{(L),\text{ tr }}_{\lambda }[H]\). We sometimes write \({{\textbf {Y}}}(\mathscr {G})\) to emphasize the dependence on \(\mathscr {G}= (\mathcal {G}, \underline{\texttt {L}})\), the underlying random (d, k)-regular graph.
Let \(\mathcal {F}_i\) be the \(\sigma \)-algebra generated by the first i clauses \(a_1,\ldots ,a_i\) and the matching of the half-edges adjacent to them. Then, we can write
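The orthogonality of Doob martingale increments, which underlies this variance decomposition, can be checked exactly on a toy function of independent bits (a generic illustration of the technique, not the paper's setting):

```python
from itertools import product

# Toy state space: three independent fair bits, Y = f(bits).
def f(x):
    return x[0] + 2 * x[1] * x[2] + x[0] * x[2]

omega = list(product([0, 1], repeat=3))  # uniform measure, weight 1/8 each

def cond_exp(i, prefix):
    """E[Y | first i coordinates equal prefix] under the uniform measure."""
    vals = [f(w) for w in omega if w[:i] == prefix]
    return sum(vals) / len(vals)

EY = cond_exp(0, ())
var = sum((f(w) - EY) ** 2 for w in omega) / len(omega)

# Doob increments D_i = E[Y|F_i] - E[Y|F_{i-1}] are orthogonal, so
# Var(Y) = sum_i E[D_i^2].
inc_sum = 0.0
for i in range(1, 4):
    inc_sum += sum(
        (cond_exp(i, w[:i]) - cond_exp(i - 1, w[: i - 1])) ** 2 for w in omega
    ) / len(omega)

assert abs(var - inc_sum) < 1e-12
```

Here the coordinate-revealing filtration plays the role of the clause-revealing filtration in the text.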
For each i, let A denote the set of clauses with indices between \(i \vee (m-k+1)\) and m. Set \(\mathscr {K}\) to be the collection of variable-adjacent half-edges that are matched to A. Further, let \(\acute{\mathscr {G}} = (\acute{\mathcal {G}}, \acute{\texttt {L}})\) be the random (d, k)-regular graph coupled to \(\mathscr {G}\), which has the same clauses \(a_1,\ldots ,a_{\max {\{i-1,m-k\}}}\) (and the literals adjacent to them) as \(\mathscr {G}\), while the clauses matched to \(\mathscr {K}\) and their adjacent literals are resampled at random:
Let \(G^\circ \equiv \mathcal {G}\setminus A\) be the graph obtained by removing A and the half-edges adjacent to it from \(\mathcal {G}\). Then, for \(i\le m-k+1\), Jensen’s inequality implies that
where the summation in the rhs runs over all possible matchings \(A, \acute{A}\) of \(\mathscr {K}\) by k clauses (we refer to Section 6.1 in [25] for the details). Note that the sum runs over finitely many choices, whose number depends only on k, which is affordable in our estimate. Also, we can write down the same inequality for \(i>m-k+1\), for which the only difference is that the size of \(\mathscr {K}\) is smaller than \(k^2\). Thus, in the rest of this subsection, our goal is to show that for \(|\mathscr {K}|=k^2 \), there exists an absolute constant \(C>0\) such that
where we denoted \({{\textbf {Y}}}( A)\equiv {{\textbf {Y}}}(G^\circ \cup A)\). This estimate, which is shown at the end of Sect. 5.3, directly implies the conclusion of Proposition 3.4.
Before moving on, we present an analogue of Corollary 4.5 for the rescaled partition function, which will serve as a useful fact in our later analysis of \({{\textbf {Y}}}\). Due to the rescaling factors in \({{\textbf {Y}}}\), the proof is more complicated than that of Corollary 4.5; however, it is still based on ideas similar to those of Proposition 4.1, and hence we defer it to Sect. A.2.
Corollary 5.1
Let \(c>0\), \(L>0\), \(\lambda \in (0,\lambda ^\star _L)\) and \(l_0>0 \) be fixed, and let \({{\textbf {Y}}}= {{\textbf {Y}}}_{\lambda ,l_0}^{(L)}\) be as above. Then, for any \(\zeta \) such that \(||\zeta ||\le l_0\), the following estimates hold true:
-
(1)
\(\mathbb {E}[{{\textbf {Y}}}\mathbb {1}\{X(\zeta )\ge c\log n \}] = n^{-\Omega _k(\log \log n)} \mathbb {E}\widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }}\);
-
(2)
\(\mathbb {E}[ {{\textbf {Y}}}^2 \mathbb {1}\{X(\zeta )\ge c\log n \}] = n^{-\Omega _k(\log \log n)} \mathbb {E}(\widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }})^2\).
5.1 Fourier decomposition and the effect of rescaling
To see (68), we will apply a discrete Fourier transform to \({{\textbf {Y}}}(A)\) and control its Fourier coefficients. We begin by introducing the following definitions to study the effect of A and \(\acute{A}\): let \(B_t^\circ (\mathscr {K})\) denote the ball of graph-distance t in \(G^\circ \) around \(\mathscr {K}\). For instance, if t is even, then the leaves of \(B_t^\circ (\mathscr {K})\) are the half-edges adjacent to clauses. Then, we set
Note that T is most likely a union of \(|\mathscr {K}|\) disjoint trees, but it can contain a cycle with probability \(O((dk)^{l_0/2}/n)\). Let \(\mathscr {U}\) denote the collection of leaves of T other than the ones in \(\mathscr {K}\), and we write \(G^\partial \equiv G^\circ \setminus T\).
Remark 5.2
(A parity assumption) For the rest of Sect. 5, we assume that \(l_0\) is even. This assumption guarantees that the half-edges in \(\mathscr {U}\) are adjacent to clauses of T, and hence their counterparts are adjacent to variables of \(G^\partial \). For technical reasons in dealing with the rescaling factors (Lemma 5.5), we have to treat the case of odd \(l_0\) separately; however, it will be apparent that the argument of Sects. 5.1–5.3 carries over unchanged. In Remark 5.4, we explain the main difference in formulating the Fourier decomposition for odd \(l_0\).
Based on the above decomposition of \(\mathcal {G}\), we introduce several more notions as follows. For \(\zeta \in \{0,1\}^{2l}\) with \(l\le l_0\), let \(X({\zeta })\) and \(X^T({\zeta })\) (resp. \(\acute{X}({\zeta })\) and \(\acute{X}^T({\zeta })\)) be the number of \(\zeta \)-cycles in the graph \(G^\circ \cup A = \mathcal {G}\) and \(A\cup T\) (resp. \(G^\circ \cup \acute{A} = \acute{\mathcal {G}}\) and \(\acute{A}\cup T\)), respectively, and set
(Note that this quantity is the same as \(\acute{X}({\zeta })-\acute{X}^T({\zeta })\), since the distance from \(\mathscr {U}\) to \(\mathscr {K}\) is at least \(2l_0\).) Based on this notation, we define the local-neighborhood-rescaled partition function \({{\textbf {Z}}}_T\) and \(\acute{{{\textbf {Z}}}}_T\) by
where \({{\textbf {Z}}}' \equiv \widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }}\) and \({{\textbf {Z}}}'[G^\circ \cup A]\) denotes the partition function on the graph \(G^\circ \cup A=\mathcal {G}\). Here, we omitted the dependence on the literals \(\underline{\texttt {L}}\) on \(\mathcal {G}\), since we are only interested in their moments.
One of the main ideas of Sect. 5 is to relate \({{\textbf {Y}}}\) and \({{\textbf {Z}}}_T\), by establishing the following lemma:
Lemma 5.3
Let \({{\textbf {Y}}}(A), {{\textbf {Y}}}(\acute{A}), {{\textbf {Z}}}_T\), \(\acute{\text {{\textbf {Z}}}}_T\) and \(X^\partial \) be defined as above. Then, we have
where \({{\textbf {Z}}}'\equiv \widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }}\) and the error o(1) depends on L, \(l_0\).
The lemma can be understood as a generalization of Proposition 4.1 to the case of \({{\textbf {Z}}}_T\). Although its proof is based on ideas similar to those of the proposition, the analysis becomes more delicate, since we need to work with the difference \({{\textbf {Y}}}(A)- {{\textbf {Y}}}(\acute{A})\). The proof will be discussed later, in Sect. 5.4.
In the remainder of this section, we develop ideas to deduce (68) from Lemma 5.3. To work with \({{\textbf {D}}}:={{\textbf {Z}}}_T - \acute{{{\textbf {Z}}}}_T\), we develop a discrete Fourier transform framework as introduced in Section 6 of [25]. Recall the definition of the weight factor \(w^{\text {lit}}_{\mathcal {G}}(\underline{\sigma }_{\mathcal {G}})\) on a factor graph \(\mathcal {G}\), which is
Let \(\kappa (\underline{\sigma }_\mathscr {U})\) (resp. \({{\textbf {Z}}}^\partial (\underline{\sigma }_\mathscr {U})\)) denote the contributions to \({{\textbf {Y}}}(A)\) coming from \(T{\setminus } \mathscr {U}\) (resp. \(G^\partial \)) given \(\underline{\sigma }_\mathscr {U}\), namely,
where \(\underline{\sigma }_T \sim \underline{\sigma }_\mathscr {U}\) means that the configuration of \(\underline{\sigma }_T\) on \(\mathscr {U}\) is \(\underline{\sigma }_\mathscr {U}\). Define \(\acute{\kappa }(\underline{\sigma }_\mathscr {U})\) analogously, by \(\acute{\kappa } (\underline{\sigma }_\mathscr {U}) \equiv \kappa (\underline{\sigma }_\mathscr {U}, \acute{\mathscr {G}})\).
The main intuition is that the dependence of \(\mathbb {E}{{\textbf {Z}}}^\partial (\underline{\sigma }_\mathscr {U})\) on \(\underline{\sigma }_\mathscr {U}\) should be given by the product measure that is i.i.d. \(\dot{q}^\star _{\lambda ,L}\) at each \(u\in \mathscr {U}\), where \(\dot{q}^\star _{\lambda ,L}\) is the fixed point of the BP recursion we saw in Proposition 2.17. To formalize this idea, we perform a discrete Fourier decomposition with respect to \(\underline{\sigma }_\mathscr {U}\) in the following setting. Let \(({{\textbf {b}}}_1,\ldots ,{{\textbf {b}}}_{|\dot{\Omega }_L|})\) be an orthonormal basis for \(L^2(\dot{\Omega }_L,\dot{q}^\star _{\lambda ,L})\) with \({{\textbf {b}}}_1\equiv 1\), and let \({{\textbf {q}}} \) be the product measure \(\otimes _{u\in \mathscr {U}} \dot{q}^\star _{\lambda ,L}\). Extend this to the orthonormal basis \(({{\textbf {b}}}_{\underline{r}})\) on \(L^2((\dot{\Omega }_L)^\mathscr {U}, {{\textbf {q}}})\) by
where \([|\dot{\Omega }_L|]:= \{1,2,\ldots , |\dot{\Omega }_L| \}\). For a function f on \((\dot{\Omega }_L)^\mathscr {U}\), we denote its Fourier coefficient by
Then, defining \({{\textbf {F}}}(\underline{\sigma }_\mathscr {U})\equiv {{\textbf {q}}}(\underline{\sigma }_\mathscr {U})^{-1} {{\textbf {Z}}}^\partial (\underline{\sigma }_\mathscr {U})\), we use Plancherel’s identity to obtain that
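A compact numerical sketch of this construction: an orthonormal basis of \(L^2\) of a finite space under a weighted inner product with \({{\textbf {b}}}_1\equiv 1\), obtained by Gram–Schmidt, with Plancherel's identity checked directly (all concrete values are hypothetical, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
q = np.array([0.5, 0.3, 0.2])            # a probability measure on a 3-point space
inner = lambda f, g: np.sum(f * g * q)   # inner product of L^2(q)

# Gram-Schmidt starting from the all-ones function, so that b_1 == 1.
raw = [np.ones(3), np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
basis = []
for v in raw:
    for b in basis:
        v = v - inner(v, b) * b
    basis.append(v / np.sqrt(inner(v, v)))

f = rng.standard_normal(3)
coeffs = np.array([inner(f, b) for b in basis])  # Fourier coefficients of f

# Plancherel: sum_r coeffs[r]^2 equals the squared L^2(q)-norm of f.
assert np.isclose(np.sum(coeffs**2), inner(f, f))
```

The product basis over \(\mathscr {U}\) in the text is the tensor product of such one-site bases, and Plancherel's identity extends verbatim.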
Remark 5.4
(When \(l_0\) is odd). If \(l_0\) is odd, then the half-edges \(\mathscr {U}\) are adjacent to the clauses of \(G^\partial \). Therefore, the base measure of the Fourier decomposition should be \(\hat{q}^\star _{\lambda ,L}\) rather than \(\dot{q}^\star _{\lambda ,L}\). In this case, we rely on the same idea that \({{\textbf {Y}}}^\partial (\underline{\sigma }_{\mathscr {U}})\) should approximately be written in terms of the product measure of \(\hat{q}^\star _{\lambda ,L}\).
To describe the second moment of the above quantity, we abuse notation and write \({{\textbf {q}}}\), \({{\textbf {b}}}\) for the product measure of \(\dot{q}_{\lambda ,L}^\star \otimes \dot{q}_{\lambda ,L}^\star \) on \(\mathscr {U}\) and the orthonormal basis given by \({{\textbf {b}}}_{\underline{r}^1,\underline{r}^2}(\underline{\sigma }^1,\underline{\sigma }^2)\equiv {{\textbf {b}}}_{\underline{r}^1}(\underline{\sigma }^1){{\textbf {b}}}_{\underline{r}^2}(\underline{\sigma }^2).\) Moreover, we denote the pair configuration by \(\underline{\varvec{\sigma }}=(\underline{\sigma }^1,\underline{\sigma }^2)\) throughout Sect. 5. Let be the contribution of the pair configurations on \(G^\partial \) given by
Then, denote by the contribution to from pair coloring profile \(||H-H^\bullet _{\lambda ,L}||_1\le n^{-1/2}\log ^{2}n\), where \(H^\bullet _{\lambda ,L}\) is defined in Definition 2.20. Recall that \({{\textbf {Z}}}_T\) is defined in terms of \(\widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }}\) as in (69). Since \(\lambda <\lambda ^\star _L\) and we restricted our attention to \(||H-H^\star _{\lambda ,L}||_1 \le n^{-1/2}\log ^{2}n\) in \(\widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }}\), the major contribution to the second moment \(\mathbb {E}{{\textbf {D}}}^{2}\equiv \mathbb {E}({{\textbf {Z}}}_T-\acute{{{\textbf {Z}}}}_T)^{2}\) comes from , where is defined by
Namely, Proposition 4.20 of [37] and Proposition 3.10 of [43] imply that
Thus, we aim to upper bound . Let \(\mathbb {E}_T\) denote the conditional expectation given T. Again using Plancherel’s identity, we can write
where we wrote
In the remaining subsections, we begin by estimating \(\kappa ^\wedge \) in Sect. 5.2. This is the part that differs most from [25] at the conceptual level, and it provides Proposition 3.4, a stronger conclusion than Proposition 6.1 of [25]. Then, since the Fourier coefficients deal with the non-rescaled partition function, we may appeal to the analysis given in [25] to deduce (68) above in Sect. 5.3.
Before moving on, we introduce some notations following [25] that are used in the remainder of Sect. 5. We write \(\varnothing \) as the index of an all-1 vector, that is, \({{\textbf {b}}}_{\varnothing }\equiv 1\). Moreover, for \(\underline{r}=(\underline{r}^1,\underline{r}^2)\in [|\dot{\Omega }_L|]^{2\mathscr {U}}\), we define
5.2 Local neighborhood Fourier coefficients
The properties of \(\kappa ^\wedge \) may vary significantly depending on the structure of \(T= B^\circ _{l_0}(\mathscr {K})\). Typically, T consists of \(|\mathscr {K}|\) disjoint trees, and in this case the rescaling factor has no effect due to the absence of cycles. Therefore, the analysis done in Section 6.4 of [25] can be applied to our case as follows. Let \({{\textbf {T}}}\) be the event that T consists of \(|\mathscr {K}|\) tree components. Then, Lemmas 6.8 and 6.9 of [25] imply that when \({{\textbf {T}}}\) holds, for \(\underline{r}\in [|\dot{\Omega }_L|]^{2\mathscr {U}}\),
-
\(\kappa ^\wedge (\underline{r}) = \acute{\kappa }^\wedge (\underline{r})\) for all \(|\{\underline{r} \}|\le 1\).
-
\(\left. \kappa ^\wedge (\varnothing )\right| _{\text {{\textbf {T}}}}\) takes a constant value \(\overline{\kappa }^\wedge (\varnothing )\) independent of A and the literals on T.
-
\(|\kappa ^\wedge (\underline{r}) - \acute{\kappa }^\wedge (\underline{r})| \lesssim _k \overline{\kappa }^\wedge (\varnothing )/4^{(k-4)l_0}\) for all \(|\{\underline{r}\}|=2\).
Moreover, let \({{\textbf {C}}}^\circ \) denote the event that T consists of \(|\mathscr {K}|\) connected components, one of which contains a single cycle while the others are trees. In this case, although the rescaling factor is now non-trivial, it is the same for both \(\kappa \) and \(\acute{\kappa }\). Therefore, Lemma 6.8 of [25] tells us that
-
\(\kappa ^\wedge (\varnothing ) = \acute{\kappa }^\wedge (\varnothing )\).
The case where we notice an important difference is the event \({{\textbf {C}}}_{t}\), \(t\le l_0\), on which \(B_{t-1}^\circ (\mathscr {K})\) has \(|\mathscr {K}|\) connected components but \(B_{t'}^\circ (\mathscr {K})\) has \(|\mathscr {K}|-1\) components for \(t\le t'\le l_0\). Using the cycle effect, we deduce the following estimate, which is stronger than Lemma 6.10 of [25].
Lemma 5.5
Suppose that \(T\in {{\textbf {C}}}_{t}\) for some \(t\le l_0\). Then, for any choice of matchings A and \(\acute{A}\) of \(\mathscr {K}\) with k clauses, we have
Proof
Let \(T_0\) and \(T_\textsf {link}\) be the connected components of T defined as follows: \(T\in {{\textbf {C}}}_t\) consists of \(|\mathscr {K}|-2\) isomorphic copies of a tree \(T_0\) and one tree \(T_\textsf {link}\) that contains two half-edges of \(\mathscr {K}\). Note that \(T\cup A\) and \(T\cup \acute{A}\) have different structures only if we are in the following situation:
-
One clause in A is connected with both half-edges of \(\mathscr {K}\cap T_\textsf {link}\). Thus, the connected components of \(T\cup A\) are \((k-1)\) copies of \(\mathcal {T}_0\) and one copy of \(\mathcal {T}_\textsf {cyc}\) as illustrated in Fig. 1. (Recall that we assumed \(|\mathscr {K}|=k^2\) in (68).) Here, \(\mathcal {T}_0\) is the union of k disjoint copies of \(T_0\) and a clause connecting them. Also, \(\mathcal {T}_\textsf {cyc}\) is the union of \(k-2\) disjoint copies of \(T_0\), one \(T_\textsf {link}\), and a clause connecting them.
-
The two half-edges \(\mathscr {K}\cap T_{\textsf {link}}\) are connected to different clauses of \(\acute{A}\). Therefore, the connected components of \(T\cup \acute{A}\) are \((k-2)\) copies of \(\mathcal {T}_0\) and one copy of \(\mathcal {T}_\textsf {link}\). Here, \(\mathcal {T}_{\textsf {link}}\) is the union of \(2k-2\) disjoint copies of \(T_0\), one \(T_{\textsf {link}}\) and two clauses connecting them as illustrated in Fig. 1.
Let \(\kappa _0^\wedge \), \(\kappa _\textsf {cyc}^\wedge \) and \(\kappa _\textsf {link}^\wedge \) be the contributions from \(\mathcal {T}_0\), \(\mathcal {T}_\textsf {cyc}\) and \(\mathcal {T}_{\textsf {link}}\), respectively, so that \(\kappa ^\wedge (\varnothing )\) is given in terms of \(\kappa _0^\wedge \) and \(\kappa _\textsf {cyc}^\wedge \), while \(\acute{\kappa }^\wedge (\varnothing )\) is given in terms of \(\kappa _0^\wedge \) and \(\kappa _\textsf {link}^\wedge \). Then, we have
In what follows, we present an explicit computation of \(\kappa _0^\wedge \), \(\kappa _\textsf {cyc}^\wedge \) and \(\kappa _\textsf {link}^\wedge \) and show that the two quantities in (74) are the same.
We begin by computing \(\kappa _0^\wedge \). Since we are in a tree, \(\kappa _0^\wedge \) does not depend on the assignments of literals, and hence we can replace the weight factor \(w^{\text {lit}}\) by its averaged version w. Let \(e_0\) (resp. \(\mathscr {Y}_0\)) be the root half-edge (resp. the collection of leaf half-edges) of \(T_0\). We define
where \(\underline{\sigma }_{T_0}\sim (\sigma , \underline{\sigma }_{\mathscr {Y}_0})\) means that \(\underline{\sigma }_{T_0}\) agrees with \(\sigma \) and \(\underline{\sigma }_{\mathscr {Y}_0}\) at \(e_0\) and \(\mathscr {Y}_0\), respectively. Note that since \(\mathcal {T}_0\) is a tree, the rescaling factor from the cycle effect is trivial. Denoting the number of variables and clauses of \(T_0\) by \(v(T_0)\) and \(a(T_0)\), respectively, the Fourier coefficient of \(\varkappa _0(\sigma ;\,\cdot \,)\) at \(\varnothing \) is given by
where the second equality follows from the fact that \(\dot{q}^\star _{\lambda ,L}\) is a fixed point of the Belief Propagation recursion (20), whose normalizing constants are \(\dot{\mathscr {Z}} = \dot{\mathscr {Z}}_{{q}^\star _{\lambda ,L}}\) and \(\hat{\mathscr {Z}} = \hat{\mathscr {Z}}_{q^\star _{\lambda ,L}}\). Thus, we can calculate \(\kappa _0^\wedge \) by
where \(\hat{\mathfrak {Z}}\) is the normalizing constant of \(\hat{H}^\star _{\lambda ,L}\) given by (25). Since \(\mathcal {T}_{\textsf {link}}\) is a tree, we can compute \(\kappa _\textsf {link}^\wedge \) using the same argument, namely,
since the numbers of variables and clauses in \(\mathcal {T}_{\textsf {link}}\) are \((2k-2)v(T_0)+v(T_\textsf {link})\) and \((2k-2)a(T_0)+ a(T_\textsf {link})+2\), respectively.
What remains is to calculate \(\kappa _\textsf {cyc}^\wedge \). The graph \(T\cup A\) contains a single cycle of length 2t; let this be a \(\zeta \)-cycle with \(\zeta \in \{0,1\}^{2t}\). Unlike the previous two cases, the literal assignment \(\zeta \) has a non-trivial effect, but the literals outside of the cycle can still be ignored. We compute
which does not include the rescaling term arising from the cycle effect. Let C denote the cycle in \(\mathcal {T}_{\textsf {cyc}}\) and 2t be its length. Let \(\mathscr {Y}_{C}\) be the half-edges that are adjacent to but not contained in C. Among these, \(t(d-2)\) (resp. \(t(k-2)\)) half-edges in \(\mathscr {Y}_{C}\) are adjacent to a variable (resp. a clause) in C.
For each \(u\in \mathscr {Y}_{C}\), let \(T_u\) denote the connected component of \(\mathcal {T}_{\textsf {cyc}}\setminus \{u \}\) that is a tree. Let \(e_u\) denote the root half-edge of \(T_u\), that is, the half-edge that is matched with u in \(\mathcal {T}_{\textsf {cyc}}\), and let \(\varkappa _u(\sigma ;\,\cdot \,)\) be defined analogously to (75). Then, by the same computation as (76), we obtain that
Furthermore, for convenience, denote the sets of variables, clauses and edges of C by V, F, and E, respectively, and set \(\mathscr {Y}\equiv \mathscr {Y}_{C}\cup E\). For each \(a\in F\), denote the two literals on C that are adjacent to a by \(\zeta _a^1, \zeta _a^2\). Observe that \(\kappa _\textsf {cyc}^\wedge \) can be written as
where the second equality is obtained by multiplying \(\prod _{e\in E} \dot{q}^\star _{\lambda ,L}(\sigma _e) \hat{q}^\star _{\lambda ,L}(\sigma _e)\) in both the numerator and the denominator of the first line. Moreover, the normalizing constant for \(\hat{H}^{\zeta _1,\zeta _2}\) is the same regardless of \(\zeta _1,\zeta _2\) (see (41)). (Note that in the RHS we wrote \(\dot{H}^\star \equiv \dot{H}^\star _{\lambda ,L}\) and similarly for \(\hat{H}^{\zeta _1,\zeta _2}, \bar{H}^\star \).) The literal assignments did not play a role in the previous two cases of \(\mathcal {T}_0\), \(\mathcal {T}_{\textsf {link}}\), which are trees, but in \(\mathcal {T}_{\textsf {cyc}}\) their effect is in principle non-trivial due to the existence of the cycle C. Plugging the identities \(\dot{\mathfrak {Z}}=\dot{\mathscr {Z}}\bar{\mathfrak {Z}}\) and \(\hat{\mathfrak {Z}}=\hat{\mathscr {Z}}\bar{\mathfrak {Z}}\) into (81), we deduce that
and hence \(\tilde{\kappa }_\textsf {cyc}^\wedge = \dot{\mathscr {Z}}^{v(\mathcal {T}_{\textsf {cyc}})} \hat{\mathscr {Z}}^{a(\mathcal {T}_{\textsf {cyc}})} \). Therefore, combining this result with (74), (77) and (78), we obtain the conclusion \(\kappa ^\wedge (\varnothing ) = \acute{\kappa }^\wedge (\varnothing ) \). \(\square \)
5.3 The martingale increment estimate and the proof of Proposition 3.4
We begin by establishing (68), combining the discussions in the previous subsections. The proof follows the same argument as Section 7 of [25], plugging in the improved estimate from Lemma 5.5 and an estimate on \(\mathbb {E}{{\textbf {Y}}}\) obtained from Proposition 4.1.
To this end, we first review the result from [25] that gives the estimate on the Fourier coefficients defined in (73). In Lemma 6.7 of [25] and the discussion below it, it was shown that
independent of T. (The logarithmic factor for \(|\{\underline{r}^1, \underline{r}^2 \}| \ge 3\) is slightly worse than that of [25], since we work with g such that \(||g-g^\star ||\le \sqrt{n}\log ^2 n\), not \(||g-g^\star ||\le \sqrt{n}\log n\).) Based on this fact and the analysis from Sect. 5.2, our first goal in this subsection is to establish the following:
Lemma 5.6
Let \(L>0, \lambda \in (0,\lambda ^\star _L) \) and \(l_0 >0\) be fixed, and let \({{\textbf {Z}}}_T\) and \(\acute{{{\textbf {Z}}}}_T\) be given as (69). Then, there exist an absolute constant \(C>0\) and a constant \(C_{k,L}>0\) such that for large enough n,
where \({{\textbf {Z}}}' = \widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }} \).
Proof
Let be defined as in (71). Based on the expression (72), we study the conditional expectation for different shapes of T. To this end, we first recall the events \({{\textbf {T}}}\), \({{\textbf {C}}}^\circ \) and \({{\textbf {C}}}_t\) defined in the beginning of Sect. 5.2. We additionally write
Note that T can be constructed from a configuration model in a depth-\(l_0\) neighborhood of \(\mathscr {K}\), which is of size \(O_k(1)\). Revealing the edges of these neighborhoods one by one, each new edge creates a cycle with probability \(O_k(1/n)\). The event \({{\textbf {C}}}^\circ \) requires a single cycle, so by a union bound \(\mathbb {P}({{\textbf {C}}}^\circ )=O_k(1/n)\), while the event \(\text {{\textbf {B}}}\) requires at least two cycles, so again by a union bound \(\mathbb {P}(\text {{\textbf {B}}})=O_k(n^{-2})\).
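The \(O_k(1/n)\) heuristic above can be illustrated with a small Monte Carlo sketch. The parameters below are made up for illustration and the setup is a plain configuration model, not the paper's graph verbatim: the probability that the depth-1 neighborhood of a fixed variable contains a cycle decays like 1/n.

```python
import random

# Illustrative Monte Carlo: in a random perfect matching of d*n half-edges
# (the configuration model), the depth-1 neighborhood of a fixed variable
# contains a cycle (self-loop or multi-edge) with probability O(1/n).

def depth1_cycle(n, d, rng):
    stubs = [v for v in range(n) for _ in range(d)]
    rng.shuffle(stubs)                     # pair stubs 2i and 2i+1
    nbrs = []
    for i in range(0, n * d, 2):
        a, b = stubs[i], stubs[i + 1]
        if a == 0 and b == 0:
            return True                    # self-loop at variable 0
        if a == 0:
            nbrs.append(b)
        elif b == 0:
            nbrs.append(a)
    return len(nbrs) != len(set(nbrs))     # repeated neighbor: a 2-cycle

def estimate(n, d, trials, seed):
    rng = random.Random(seed)
    return sum(depth1_cycle(n, d, rng) for _ in range(trials)) / trials

p_small = estimate(100, 3, 1500, seed=0)   # roughly c/100
p_large = estimate(800, 3, 1500, seed=1)   # roughly c/800, so much smaller
assert p_small > p_large
```

Requiring two cycles squares this small probability, which is the source of the \(O_k(n^{-2})\) bound for \(\text {{\textbf {B}}}\).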
For each event above, we can make the following observation. When we have \({{\textbf {T}}}\), the only contribution to comes from \((\underline{r}^1, \underline{r}^2)\) such that \(|\{\underline{r}^1, \underline{r}^2 \}| \ge 2\), due to the properties of \(\kappa ^\wedge \) discussed in the beginning of Sect. 5.2. Note that the number of choices of \((\underline{r}^1, \underline{r}^2)\) with \(|\{\underline{r}^1, \underline{r}^2 \}| = 2\) is at most \(|\dot{\Omega }_L|^4 (k^5 4^k)^{l_0}\). Therefore, (82) gives that
Similarly on \({{\textbf {C}}}^\circ \), the analysis on \(\kappa ^\wedge \) implies that there is no contribution from \((\underline{r}^1,\underline{r}^2)= \varnothing \). Thus, we obtain from (82) that
Since the event \({{\textbf {B}}}\) has probability \(\mathbb {P}({{\textbf {B}}}) =O_k(n^{-2})\), we also have that
The last remaining case is \({{\textbf {C}}}_t\), and this is where we get a nontrivial improvement compared to [25]. Lemma 5.5 tells us that there is no contribution from \((\underline{r}^1,\underline{r}^2) = \varnothing \). Thus, similarly to (86), for each \(t\le l_0\) we have
Thus, combining (85)–(88), we obtain the conclusion. \(\square \)
To obtain the conclusion of the form (68), we need to replace \((\mathbb {E}{{\textbf {Z}}}')^2\) in (83) by \((\mathbb {E}{{\textbf {Y}}})^2\). This follows from Proposition 4.1 and can be summarized as follows.
Corollary 5.7
Let \(L>0\), \(\lambda \in (0,\lambda ^\star _L)\) and \(l_0>0\) be fixed, and let \({{\textbf {Y}}}\equiv {{\textbf {Y}}}_{\lambda ,l_0}^{(L)}\) be the rescaled partition function defined by (40). Further, let \(\underline{\mu }\), \(\underline{\delta }_L\) be as in Proposition 4.1. Then, we have
where \({{\textbf {Z}}}^\prime \equiv \widetilde{{{\textbf {Z}}}}^{(L),\text{ tr }}_{\lambda }\).
Proof
Let \(c_\textsf {cyc}=c_\textsf {cyc}(l_0)\) be given as in Proposition 4.1. Corollary 5.1 shows that \(\mathbb {E}{{\textbf {Y}}}\mathbb {1}\{||\underline{X}||_\infty \ge c_{\textsf {cyc}} \log n \}\) is negligible for our purposes, and hence we focus on estimating \(\mathbb {E}{{\textbf {Y}}}\mathbb {1}\{||\underline{X}||_\infty \le c_{\textsf {cyc}} \log n \}\).
Note that for an integer \(x\ge 0\), \((1+\theta )^x = \sum _{a\ge 0} \frac{(x)_a}{a!} \theta ^a \). Thus, if we define \(\tilde{\delta }(\zeta ) \equiv (1+\delta _L(\zeta ))^{-1} -1\), we can write
and performing the summation in the rhs easily implies the conclusion.\(\square \)
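The elementary expansion \((1+\theta )^x = \sum _{a\ge 0} \frac{(x)_a}{a!} \theta ^a\) used in this proof (and again in Sect. 5.4) can be verified numerically. The following sketch is purely illustrative and independent of the paper's objects; it also shows why the sum terminates.

```python
from math import prod, factorial

# Numerical check of the identity: for a nonnegative integer x,
#     (1 + theta)^x = sum_{a >= 0} (x)_a / a! * theta^a,
# where (x)_a = x(x-1)...(x-a+1) is the falling factorial.  The sum
# terminates at a = x, since (x)_a = 0 for every a > x; this is what lets
# the exponential rescaling factor be expanded into finitely many
# falling-factorial terms.

def falling(x, a):
    return prod(x - i for i in range(a))   # empty product = 1 when a = 0

for x in range(9):                          # nonnegative integer exponents
    for theta in (-0.5, 0.1, 2.0):
        lhs = (1 + theta) ** x
        rhs = sum(falling(x, a) / factorial(a) * theta ** a
                  for a in range(x + 1))
        assert abs(lhs - rhs) < 1e-9
```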
We conclude this subsection by presenting the proof of Proposition 3.4.
Proof of Proposition 3.4
As discussed in the beginning of Sect. 5, it suffices to establish (68) to deduce Proposition 3.4. Combining Lemmas 5.3, 5.6 and Corollary 5.7 gives that
for some absolute constant \(C>0\). Moreover, Lemma 4.6 implies that
hence establishing (68). \(\square \)
5.4 Proof of Lemma 5.3
In this subsection, we establish Lemma 5.3. One nontrivial aspect of this lemma is achieving the error \(O(n^{-3/2} \log ^6 n) \mathbb {E}[({{\textbf {Z}}}')^2]\), where \({{\textbf {Z}}}^\prime \equiv \widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }}\). For instance, there can be short cycles in \(\mathcal {G}\) intersecting T (but not included in T) with probability \(O(n^{-1})\), and in principle this would contribute \(O(n^{-1})\) to the error term. One observation we will see later is that the effect of these cycles wears off since we are looking at the difference \({{\textbf {Y}}}(A) - {{\textbf {Y}}}(\acute{A})\) between rescaled partition functions.
To begin with, we decompose the rescaling factor (which is exponential in \(\underline{X}^\partial \)) into the sum of polynomial factors based on an elementary fact we also saw in the proof of Corollary 5.7: for a nonnegative integer x, we have \((1+\theta )^x = \sum _{a\ge 0} \frac{(x)_a}{a!} \theta ^a\). Let \(\tilde{\delta }(\zeta )= (1+\delta _L(\zeta ) )^{-2}-1\), and write
Therefore, our goal is to understand \(\mathbb {E}[({{\textbf {Z}}}_T-\acute{{{\textbf {Z}}}}_T)^2 (\underline{X}^\partial )_{\underline{a}}]\), which can be described as follows.
Lemma 5.8
Let \(L>0\), \(\lambda \in (0,\lambda _L^\star )\) and \(l_0>0\) be fixed, set \(\underline{\mu }\), \(\underline{\delta }_L\) as in Proposition 4.1, and let \({{\textbf {Z}}}_T, \acute{{{\textbf {Z}}}}_T\) be defined as (69). For any \(\underline{a}=(a_\zeta )_{||\zeta ||\le l_0} \) with \(||\underline{a}||_\infty \le \log ^2 n\), we have
The first step towards the proof is to write the lhs of (90) using the Fourier decomposition as in Sect. 5.1. To this end, we recall Definitions 3.2, 4.2 (but now \(\Delta \) counts the number of pair-coloring configurations around variables, clauses, and half-edges) and decompose \((\underline{X}^\partial )_{\underline{a}}\) similarly as the expression (51). Hence, we write
where \(\mathcal {Y}= \{\mathcal {Y}_i(\zeta ) \}_{i\in [a_\zeta ],\, ||\zeta ||\le l_0}\) denotes the locations of the \(a_\zeta \) \(\zeta \)-cycles, \(||\zeta ||\le l_0\), and \(\underline{\varvec{\sigma }}_\mathcal {Y}\) describes a prescribed coloring configuration on them.
In what follows, we fix a tuple \((\mathcal {Y},\underline{\varvec{\sigma }}_\mathcal {Y})\) and work with the summand above via Fourier decomposition. Let
be the set of half-edges in \(\mathscr {U}\) that are adjacent to a variable in \(\mathcal {Y}\). Since the colors on U are already given by \(\underline{\varvec{\sigma }}_\mathcal {Y}\), we will perform a Fourier decomposition in terms of \(\underline{\varvec{\sigma }}_{\mathscr {U}'}\), with \(\mathscr {U}' \equiv \mathscr {U}\setminus U\). Let \(\kappa (\underline{\sigma }_{\mathscr {U}'}; \underline{\sigma }_\mathcal {Y}) \) (resp. \(\acute{\kappa }(\underline{\sigma }_{\mathscr {U}'}; \underline{\sigma }_\mathcal {Y}) \)) be the partition function on \(T \cup A\) (resp. \(T\cup \acute{A}\)) (in terms of the single-copy model), under the prescribed coloring configuration \(\underline{\sigma }_{\mathscr {U}'}\) on \(\mathscr {U}'\) and \(\underline{\sigma }_{\mathcal {Y}\cap T}\) on \(\mathcal {Y}\cap T\). Setting
and writing \(\underline{\varvec{\sigma }}_\mathcal {Y}= (\underline{\sigma }_\mathcal {Y}^1, \underline{\sigma }_\mathcal {Y}^2 )\), we obtain by following the same idea as (71) that
Note that \((\underline{X}^\partial )_{\underline{a}}\) is deterministically bounded by \(\exp (O(\log ^3 n))\), and hence at the end the second term will have a negligible contribution due to \(\exp (-\Omega (n) )\), which comes from the correlated pairs of colorings. Then, we investigate
To be specific, we want to derive the analog of Lemma 6.7 of [25], which dealt with the corresponding quantity without planted cycles inside the graph. To explain the main computation, we introduce several notations before moving on. Let \(\bar{\Delta }\), \(\bar{\Delta }_U\) be counting measures on \(\dot{\Omega }_L^2\) defined as
Note that \(\bar{\Delta }\) and \(\bar{\Delta }_U\) indicate empirical counts of edge-colors on disjoint sets. Moreover, for a given coloring configuration \(\underline{\varvec{\sigma }}_\mathcal {Y}\) on \(\mathcal {Y}\), we define \(\Delta _\partial =(\dot{\Delta }_\partial , (\hat{\Delta }^{\underline{\texttt {L}}}_\partial )_{\underline{\texttt {L}}})\), the restricted empirical profile on \(\mathcal {Y}\setminus T\), by
Note that \(\dot{\Delta }_\partial \) carries the information on the colors on U, while \(\bar{\Delta }\) does not (and hence we use different notations). Lastly, let \(\mathscr {U}' \equiv \mathscr {U}\setminus U\), and for a given coloring configuration \(\underline{\varvec{\sigma }}_{\mathscr {U}'}\) on \(\mathscr {U}'\), define \(\bar{h}^{\underline{\varvec{\sigma }}_{\mathscr {U}'}}\) to be the following counting measure on \(\dot{\Omega }_L^2\):
Then, the next lemma provides a refined estimate on (92), which can be thought of as a planted-cycles analog of Lemma 6.7 of [25].
Lemma 5.9
Let \(\mathcal {Y}, \underline{\varvec{\sigma }}_\mathcal {Y}\) be given as above. For any given \(\underline{a}\) with \(||\underline{a}||_\infty \le \log ^2 n\) and for all \(\underline{\varvec{\sigma }}_{\mathscr {U}'}\), we have
where the terms in the identity can be explained as follows.
-
(1)
\(c_0>0\) is a constant depending only on \(|\mathscr {U}|\).
-
(2)
\(\epsilon (\underline{\varvec{\sigma }}_{\mathcal {Y}})\) is a quantity such that \(|\epsilon (\underline{\varvec{\sigma }}_{\mathcal {Y}})| = O(n^{-1/2} \log ^2n)\), independent of \(\underline{\varvec{\sigma }}_{\mathscr {U}'}\).
-
(3)
\(C_{k,L}>0\) is an integer depending only on k and L, and \(\xi _j = (\xi _j (\tau ))_{\tau \in \dot{\Omega }_L^2}\), \(0\le j\le C_{k,L}\), are fixed vectors on \(\dot{\Omega }_L^2\) satisfying
$$\begin{aligned} ||\xi _j||_\infty = O(n^{-1/2}). \end{aligned}$$ -
(4)
\(\mathbb {P}_T(\mathcal {Y})\) is the conditional probability given the structure T that the prescribed half-edges of \(\mathcal {Y}\) are all paired together and assigned the right literals.
-
(5)
Write \(\dot{H}\equiv \dot{H}^\star _{\lambda ,L} \), and similarly for \(\hat{H}^{\underline{\texttt {L}}}\), \(\bar{H}\). The function \(\beta _T( \mathcal {Y}, \Delta )\) is defined as
$$\begin{aligned} \beta _T(\mathcal {Y},\Delta ) \equiv \frac{\dot{H}^{\dot{\Delta }_\partial } \prod _{\underline{\texttt {L}}} (\hat{H}^{\underline{\texttt {L}}})^{\hat{\Delta }_\partial ^{\underline{\texttt {L}}}} }{ \bar{H}^{\bar{\Delta } + \bar{\Delta }_U }} \times \prod _{e\in U} \dot{q}^\star _{\lambda ,L} (\varvec{\sigma }_e). \end{aligned}$$
The proof goes similarly to that of Proposition 4.1, but requires extra care due to the complications caused by the (possible) intersection between \(\mathcal {Y}\) and T. Due to its technicality, we defer the proof to Sect. A.4 in the appendix.
Based on the expansion obtained from Lemma 5.9, we conclude the proof of Lemma 5.8.
Proof of Lemma 5.8
We work with fixed \(\mathcal {Y}, \underline{\varvec{\sigma }}_\mathcal {Y}\) as in Lemma 5.9. For \(\underline{r}=(\underline{r}^1,\underline{r}^2)\), define the Fourier coefficient of (92) as
We compare this with the Fourier coefficients
of which we already saw the estimates in (82). In addition, it will be crucial to understand the expansion of as in Lemma 5.9. This was already done in Lemma 6.7 of [25] and we record the result as follows.
Lemma 5.10
(Lemma 6.7 of [25]). There exist a constant \(C_{k,L}'>0\) and coefficients \(\xi _j'\equiv (\xi _j'(\varvec{\sigma }))_{\varvec{\sigma }\in \dot{\Omega }_L^2}\) indexed by \(0\le j\le C_{k,L}'\), such that \(||\xi _j'||_\infty = O(n^{-1/2})\) and
where \(c_0\) is the constant appearing in Lemma 5.9. Moreover, \(C_{k,L}'\) and the coefficients \(\xi _j'\), \(1\le j \le C_{k,L}'\) can be set to be the same as \(C_{k,L}\) and \(\xi _j\) in Lemma 5.9.
The identity (96) follows directly from Lemma 6.7 of [25], and the last statement is apparent from the proof of Lemma 5.9 (see Sect. A.4).
Based on Lemma 5.9, we obtain the following bound on the Fourier coefficient (94):
Moreover, suppose that \(U= \varnothing \), that is, \(\mathcal {Y}\) does not intersect \(\mathscr {U}\). In this case, we can compare (94) and (95) in the following way, based on Lemmas 5.9 and 5.10:
Using these observations, we investigate the following formula which can be deduced from (91) by Plancherel’s identity:
where the Fourier coefficients of \(\varpi \) are given by
Define \(\eta (\mathcal {Y})\equiv \eta (\mathcal {Y};T)\equiv |\bar{\Delta }|+|U|-|\dot{\Delta }_\partial |- |\hat{\Delta }_\partial |\), similarly to (56). As before, note that the quantities \(|\bar{\Delta }|, |U|, |\dot{\Delta }_\partial |,\) and \( |\hat{\Delta }_\partial |\) are all well-defined once T and \(\mathcal {Y}\) are given. Observe that
The remaining work is done by a case analysis with respect to \(\eta (\mathcal {Y})\).
Case 1. \(\eta (\mathcal {Y})=0\).
In this case, all cycles in \(\mathcal {Y}\) are not only pairwise disjoint, but also disjoint from \(\mathscr {U}\). As we will see below, such \(\mathcal {Y}\) gives the main contribution to (99). Recall the events \(\text {{\textbf {T}}}\), \(\text {{\textbf {C}}}^\circ \), \(\text {{\textbf {C}}}_t\) and \(\text {{\textbf {B}}}\) defined in the beginning of Sect. 5.2 and in (84).
On the event \({{\textbf {T}}}^c = \cup _{t\le l_0} {{\textbf {C}}}_t \cup {{\textbf {C}}}^\circ \cup {{\textbf {B}}}\), we can apply the same approach as in the proof of Lemma 5.6 using (97) and obtain that
On the other hand, on \({{\textbf {T}}}\), \(\varpi ^\wedge (\underline{r}^1) = 0 \) for \(|\{\underline{r}^1 \} |\le 1\) and hence the main contribution comes from \(|\{\underline{r} \} | =2\). To control this quantity, we use the estimate (98) and get
If we sum over all \(\underline{\varvec{\sigma }}_\mathcal {Y}\), and then over all \(\mathcal {Y}\) such that \(\eta (\mathcal {Y})=0\), we obtain by following the same computations as (57)–(60) that
Case 2. \(\eta (\mathcal {Y})=1\).
One important observation we make here is that if \(T\in {{\textbf {T}}}\) and \(\eta (\mathcal {Y})=1\), then for any \(\underline{\varvec{\sigma }}_\mathcal {Y}= (\underline{\sigma }_\mathcal {Y}^1,\underline{\sigma }_\mathcal {Y}^2)\), we have
and analogously for the second copy \(\underline{\sigma }_\mathcal {Y}^2\). If we had \(|U|\le 1\), then this is a direct consequence of the results mentioned in the beginning of Sect. 5.1.
On the other hand, suppose that \(|U|=2\). If we want to have \(\eta (\mathcal {Y})=1\), then the only choice of \(\mathcal {Y}\) is that there exists one cycle in \(\mathcal {Y}\) that intersects \(\mathscr {U}\) at two distinct half-edges, while all others in \(\mathcal {Y}\) are disjoint from each other and from \(\mathscr {U}\). In such a case, since the lengths of the cycles in \(\mathcal {Y}\) are all at most \(2l_0\), the cycle intersecting \(\mathscr {U}\) cannot intersect A (or \(\acute{A}\)). Therefore, the two half-edges of U are contained in the same tree of T, and hence by symmetry the \(\varnothing \)-th Fourier coefficient does not depend on A (or \(\acute{A}\)).
With this in mind, the \(\varnothing \)-th Fourier coefficient does not contribute to (99), and hence we get
where \(\Delta = \Delta [\underline{\varvec{\sigma }}_\mathcal {Y}]\).
On the event \(\text {{\textbf {T}}}^c\), we can bound it coarsely by
What remains is to sum the above two over \(\underline{\varvec{\sigma }}_\mathcal {Y}\) and \(\mathcal {Y}\) such that \(\eta ({\mathcal {Y}})=1\). Since there can be at most 2 cycles from \(\mathcal {Y}\) that are not disjoint from all the rest, there exists a constant \(C=C_{k,L,l_0}\) such that
(see (61)). Then, we can bound the number of choices of \(\mathcal {Y}\) as done in (62) and (64). This gives that
Case 3. \(\eta (\mathcal {Y})\ge 2\).
In this case, we deduce the conclusion relatively straightforwardly since \(\sum _{\mathcal {Y}} \mathbb {P}_T (\mathcal {Y})\) is too small. Namely, we first have the crude bound from (97) such that
From observations similar to (101), we can obtain that
where C is as in (101). Further, we control the number of choices of \(\mathcal {Y}\) as before, which gives that
Combining (100), (102) and (103), we obtain the conclusion.\(\square \)
Having Lemma 5.8 in hand, we are now ready to finish the proof of Lemma 5.3.
Proof of Lemma 5.3
Set \(\tilde{{\delta }}(\zeta ) = (1+\delta _L(\zeta ))^{-2}-1\). Using the identity \((1+\theta )^x = \sum _{a\ge 0} \frac{(x)_a}{a!} \theta ^a\) (which holds for all nonnegative integers x), we can write
where we used Corollary 5.1 to obtain the error term in the RHS. Also note that \((\underline{X}^\partial )_{\underline{a}}=0\) if \(||\underline{a}||_\infty >\log n\) and \(||\underline{X}^\partial ||_\infty \le \log n\). Therefore, by applying Lemma 5.8, we see that the above is the same as
and from here we can directly deduce the conclusion by performing the summation. \(\square \)
6 Small Subgraph Conditioning and the Proof of Theorem 3.1
In this section, we prove Theorem 3.1 by the small subgraph conditioning method. To do so, we first derive condition (d) of Theorem 3.3 for the truncated model, and then deduce the analogue for the untruncated model based on the continuity of the coefficients, which was proved in [37, Lemma 4.17].
Proposition 6.1
Let \(L>0\) and \(\lambda \in (0,\lambda ^\star _L)\) be given. Moreover, set \(\mu (\zeta ), \delta _L(\zeta ;\lambda )\) as in Proposition 4.1. Recalling \(\widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }}\) defined in (40), we have
Proof
Fix \(\lambda <\lambda ^\star _L\) and abbreviate \(\delta _L(\zeta )\equiv \delta _L(\zeta ;\lambda )\) as before. We first show that the lhs is lower bounded by the rhs in (104). Let \(\underline{X} = (X(\zeta ))_\zeta \) be the number of \(\zeta \)-cycles in \(\mathscr {G}\). For an integer \(l_0>0\), we write \(\underline{X}_{\le l_0}=(X(\zeta ))_{||\zeta ||\le l_0}\) (note the difference from the notations used in the previous subsections). Note that Proposition 4.1-(1) gives us that the limiting law of \(\underline{X}_{\le l_0}\) reweighted by \(\widetilde{{{\textbf {Z}}}}^{(L),\text{ tr }}_{\lambda }\) must be that of independent Pois\(\big (\mu (\zeta )(1+\delta _L(\zeta ))\big )\) variables, since the moments of falling factorials are given by (46). Namely, for a given collection of integers \(\underline{x}_{\le l_0} = (x(\zeta ))_{||\zeta || \le l_0}\), we have
Recall that the unweighted \(\underline{X}_{\le l_0}\) has the limiting law given by (37). Thus, we have
for any \(\underline{x}_{\le l_0} = (x(\zeta ))_{||\zeta || \le l_0}\). Thus, by Fatou’s Lemma, we have
Since this holds for any \(l_0\), we obtain the lower bound on the lhs of (104).
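The Poisson identification above rests on the standard fact that the falling-factorial moments of a Pois\((\mu )\) variable are \(\mathbb {E}[(X)_a]=\mu ^a\), so convergence of joint falling-factorial moments pins down an independent-Poisson limit (the method of moments). The following spot-check of this fact is illustrative only; the function names and parameters are ours.

```python
from math import exp

# Spot-check: if X ~ Pois(mu), then E[(X)_a] = mu^a, where
# (X)_a = X(X-1)...(X-a+1) is the falling factorial.

def falling(x, a):
    out = 1
    for i in range(a):
        out *= x - i
    return out

def poisson_factorial_moment(mu, a, cutoff=200):
    total, pmf = 0.0, exp(-mu)          # pmf = P(X = 0)
    for x in range(cutoff):             # truncated sum; the tail is negligible
        total += pmf * falling(x, a)
        pmf *= mu / (x + 1)             # advance to P(X = x + 1)
    return total

for mu in (0.5, 1.0, 3.0):
    for a in range(6):
        assert abs(poisson_factorial_moment(mu, a) - mu ** a) < 1e-8
```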
To work with the upper bound, recall the definition of the rescaled partition function \({{\textbf {Y}}}_{l_0} \equiv {{\textbf {Y}}}_{\lambda ,l_0}^{(L)}\) in (40). For any \(\varepsilon >0\), Proposition 3.4 implies that there exists \(l_0(\varepsilon )>0\) such that for \(l_0\ge l_0(\varepsilon )\),
On the other hand, we make the following observations, which are consequences of Corollaries 4.5, 5.1 and Proposition 4.1:
We briefly explain how to obtain (108). First, note that it suffices to estimate \(\mathbb {E}[{{\textbf {Y}}}_{l_0} \mathbb {1}_{\{||\underline{X} ||_\infty \le \log n \} }] \) due to Corollary 5.1. Then, we expand the rescaling factor of \({{\textbf {Y}}}_{l_0}\) by falling factorials using the formula (89). Each correlation term \(\mathbb {E}[{{\textbf {Z}}}^{(L),\text{ tr }}_{\lambda } (\underline{X})_{\underline{a}}\mathbb {1}_{\{||\underline{X} ||_\infty \le \log n \} }]\) can then be studied based on Proposition 4.1 and Corollary 4.5. We can investigate the second moment of \({{\textbf {Y}}}_{l_0}\) analogously.
Combining (107) and (108) shows
which holds for all \(l_0 \ge l_0(\varepsilon )\) and \(\varepsilon >0\). Therefore, letting \(l_0\rightarrow \infty \) and \(\varepsilon \rightarrow 0\) gives the conclusion. \(\square \)
The next step is to deduce the analogue of Proposition 6.1 for the untruncated model. To do so, we first review the following notions from [37]: for coloring configurations \(\underline{\sigma }^{1},\underline{\sigma }^{2}\in \Omega ^{E}\), let \(x^{1}\in \{0,1,{\texttt {f}}\}^{V}\) (resp. \(x^{2}\in \{0,1,{\texttt {f}}\}^{V}\)) be the frozen configuration corresponding to \(\underline{\sigma }^{1}\) (resp. \(\underline{\sigma }^{2}\)) via Lemma 2.7 and (14). Then, define the overlap \(\rho (\underline{\sigma }^{1},\underline{\sigma }^{2})\) of \(\underline{\sigma }^1\) and \(\underline{\sigma }^2\) by
Then, for \(\lambda \in [0,1]\) and \(s\in [0,\log 2)\), denote by the contribution to \((\text {{\textbf {Z}}}^{\text {tr}}_{\lambda ,s})^{2}\) from the near-independence regime \(|\rho (\underline{\sigma }^{1},\underline{\sigma }^{2})-\frac{1}{2}|\le k^{2}2^{-k/2}\).
Similarly, we respectively denote by \({{\textbf {Z}}}^{2}_{\lambda ,\text{ ind }}\), \({{\textbf {Z}}}^{2,(L)}_{\lambda ,\text{ ind }}\) and \({{\textbf {Z}}}^{2,(L)}_{\lambda ,s,\text{ ind }}\) the contribution to \(({{\textbf {Z}}}^{\text{ tr }}_{\lambda })^{2}\), \(({{\textbf {Z}}}^{(L),\text{ tr }}_{\lambda })^2\) and \(({{\textbf {Z}}}^{(L),\text{ tr }}_{\lambda ,s})^{2}\) from the near-independence regime \(|\rho (\underline{\sigma }^{1},\underline{\sigma }^{2})-\frac{1}{2}|\le k^{2}2^{-k/2}\).
Proposition 6.2
Let \(\mu (\zeta )\) and \(\delta (\zeta ;\lambda ^\star )\) be the constants from Proposition 4.1. Then, for \((s_n)_{n \ge 1}\) converging to \(s^\star \) with \(|s_n-s^\star |\le n^{-2/3}\), we have
Proof
Note that for \(\lambda <\lambda ^\star _L\), Proposition 4.20 of [37] shows that the contribution to \(\mathbb {E}\big ( \widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }}\big )^{2}\) from the correlated regime \(|\rho (\underline{\sigma }^{1},\underline{\sigma }^{2})-\frac{1}{2}|\ge k^{2}2^{-k/2}\) is negligible compared to the near-independence regime. Also, since \(\widetilde{{{\textbf {Z}}}}^{(L),\text{ tr }}_{\lambda }\) is defined to be the contribution to \({{\textbf {Z}}}^{(L),\text{ tr }}_{\lambda }\) from \(||H-H^\star _{\lambda ,L}||_1 \le n^{-1/2}\log ^{2}n\), Proposition 3.10 of [43] shows that the contribution to \(\mathbb {E}\big ( \widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }}\big )^{2}\) from the near-independence regime is \(\big (1-o(1)\big )\mathbb {E}{{\textbf {Z}}}^{2,(L)}_{\lambda ,\text{ ind }}\). Similarly, Proposition 3.4 of [43] shows \(\mathbb {E}\widetilde{{{\textbf {Z}}}}_{\lambda }^{(L),\text{ tr }}= \big (1-o(1)\big )\mathbb {E}{{\textbf {Z}}}_{\lambda }^{(L),\text{ tr }}\). Therefore, for \(\lambda <\lambda ^\star _L\), we have
where the last inequality is from Proposition 6.1. By Theorem 3.21, Proposition 4.15 and Proposition 4.18 in [37], we can send \(L\rightarrow \infty \) and \(\lambda \nearrow \lambda ^\star \) to have
where in the last inequality, we used (109) and Proposition 4.1-(4). Finally, Lemma 4.17 and Proposition 4.19 of [37] show that the lhs of the equation above equals \(\lim _{n\rightarrow \infty }\frac{\mathbb {E}{{\textbf {Z}}}^{2}_{\lambda ^\star ,s_n,\text{ ind }}}{\big (\mathbb {E}{{\textbf {Z}}}^{\text{ tr }}_{\lambda ^\star ,s_n}\big )^{2}}\), so (110) concludes the proof. \(\square \)
Corollary 6.3
Let \(\underline{X}_{\le l_0}=(X(\zeta ))_{||\zeta ||\le l_0}\) be the collection of the numbers of \(\zeta \)-cycles in \(\mathscr {G}\) with size \(||\zeta ||\le l_0\). Denote \(s_\circ (C)\equiv s^\star -\frac{\log n}{2\lambda ^\star n}-\frac{C}{n}\). Recalling the definition of \(\widetilde{{{\textbf {Z}}}}^{\text{ tr }}_{\lambda ,s}\) from (48), we have
Proof
Proceeding in the same fashion as (106) in the proof of Proposition 6.1, Proposition 4.1-(3) shows
Next, we aim to find a matching upper bound for \(\frac{\mathbb {E}\big (\widetilde{{{\textbf {Z}}}}^{\text{ tr }}_{\lambda ^\star ,s_{\circ }(C)}\big )^{2}}{\big (\mathbb {E}\widetilde{{{\textbf {Z}}}}^{\text{ tr }}_{\lambda ^\star ,s_{\circ }(C)}\big )^{2}}\). Note that Proposition 4.20 of [37] shows that the contribution to \(\mathbb {E}\big (\widetilde{{{\textbf {Z}}}}^{\text{ tr }}_{\lambda ^\star ,s_{\circ }(C)}\big )^{2}\) from the correlated regime \(|\rho (\underline{\sigma }^{1},\underline{\sigma }^{2})-\frac{1}{2}|\ge k^{2}2^{-k/2}\) is \(\lesssim _{k}e^{2n\lambda ^\star s_{\circ }(C)}\mathbb {E}{{\textbf {N}}}_{s_{\circ }(C)}+e^{-\Omega _{k}(n)}\). Thus, we have
where \(C_k\) and \(C_k^\prime \) are constants which depend only on k, and the last inequality holds because of Proposition 4.16 in [37]. Moreover, by Proposition 3.17 in [37], \(\mathbb {E}\widetilde{{{\textbf {Z}}}}^{\text{ tr }}_{\lambda ^\star ,s_{\circ }(C)}=\big (1-o(1)\big )\mathbb {E}{{\textbf {Z}}}^{\text{ tr }}_{\lambda ^\star ,s_{\circ }(C)}\) holds. Thus, combining (113) and Proposition 6.2, we have
Therefore, (112), (114), and Lemma 4.6 conclude the proof. \(\square \)
Proof of Theorem 3.1
Fix \(\varepsilon >0\). With Corollary 6.3 in mind, for \(\delta>0, C>0\) and \(l_0\in \mathbb {N}\), we bound
We first control the second term of the rhs of (115): Proposition 4.1-(3) shows (cf. (105))
where \(\{\bar{X}(\zeta )\}_{\zeta }\) are independent Poisson random variables with mean \(\{\mu (\zeta )\}_\zeta \). Moreover, we have
where the infinite product in W is well defined a.s. due to Lemma 4.6 (see Theorem 9.13 of [29] for a proof). Thus, for small enough \(\delta \equiv \delta _{\varepsilon }\) which does not depend on C and large enough \(l_0\ge l_0(\varepsilon )\), we have
We now turn to the first term of the rhs of (115). By Chebyshev's inequality and Corollary 6.3, for large enough \(C\ge C_{\varepsilon }\) and \(l_0\ge l_0(\varepsilon )\), we have
Therefore, by (115), (116) and (117), for \(\delta \equiv \delta _{\varepsilon }\) and \(C\ge C_{\varepsilon }\), we have
Since \(\mathbb {E}\widetilde{{{\textbf {Z}}}}^{\text{ tr }}_{\lambda ^\star ,s_{\circ }(C)}=\big (1-o(1)\big )\mathbb {E}{{\textbf {Z}}}^{\text{ tr }}_{\lambda ^\star ,s_{\circ }(C)}\) holds by Proposition 3.17 of [37] and \({{\textbf {Z}}}^{\text{ tr }}_{\lambda ^\star ,s}\asymp e^{n\lambda ^\star s}{{\textbf {N}}}^{\text{ tr }}_{s}\) holds by definition, (118) concludes the proof. \(\square \)
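For intuition on the limiting random variable W appearing above: in the standard small subgraph conditioning framework (Theorem 9.13 of [29]), each factor of such an infinite product has the form \((1+\delta (\zeta ))^{\bar{X}(\zeta )}e^{-\mu (\zeta )\delta (\zeta )}\) with \(\bar{X}(\zeta )\sim \text{Poisson}(\mu (\zeta ))\), and each factor has mean one. Assuming W takes this standard form (a hedged assumption; the displayed definition is not reproduced here), this mean-one property follows from the Poisson probability generating function \(\mathbb {E}[t^{\bar{X}}]=e^{\mu (t-1)}\), as the following toy check illustrates:

```python
import math

def poisson_pgf(mu, t):
    """E[t^X] for X ~ Poisson(mu), from the pgf identity E[t^X] = e^{mu (t-1)}."""
    return math.exp(mu * (t - 1.0))

# each factor (1 + delta)^X e^{-mu delta} of the product has mean exactly 1:
# E[(1+delta)^X] * e^{-mu delta} = e^{mu delta} * e^{-mu delta} = 1
for mu, delta in [(0.5, 0.2), (2.0, -0.3), (1.3, 0.7)]:
    mean_factor = poisson_pgf(mu, 1.0 + delta) * math.exp(-mu * delta)
    assert abs(mean_factor - 1.0) < 1e-12
```

Each factor being mean one (together with summability of \(\mu (\zeta )\delta (\zeta )^2\), as in Lemma 4.6) is what makes the product a uniformly integrable martingale limit.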
References
Achlioptas, D., Chtcherba, A., Istrate, G., Moore, C.: The phase transition in 1-in-\(k\) SAT and NAE 3-SAT. In: Proceedings of the Twelfth Annual ACM-SIAM Symposium on Discrete Algorithms (Philadelphia, PA, USA, 2001), SODA ’01, Society for Industrial and Applied Mathematics, pp. 721–722
Achlioptas, D., Moore, C.: Random \(k\)-SAT: two moments suffice to cross a sharp threshold. SIAM J. Comput. 36(3), 740–762 (2006)
Achlioptas, D., Naor, A.: The two possible values of the chromatic number of a random graph. Ann. Math. (2) 162(3), 1335–1351 (2005)
Achlioptas, D., Naor, A., Peres, Y.: Rigorous location of phase transitions in hard optimization problems. Nature 435(7043), 759–764 (2005)
Achlioptas, D., Peres, Y.: The threshold for random \(k\)-SAT is \(2^k\log 2-O(k)\). J. Am. Math. Soc. 17(4), 947–973 (2004)
Auffinger, A., Chen, W.-K., Zeng, Q.: The sk model is infinite step replica symmetry breaking at zero temperature. Commun. Pure Appl. Math. 73(5), 921–943 (2020)
Ayre, P., Coja-Oghlan, A., Gao, P., Müller, N.: The satisfiability threshold for random linear equations. arXiv preprint arXiv:1710.07497 (2017)
Bapst, V., Coja-Oghlan, A.: The condensation phase transition in the regular \(k\)-SAT model. In: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques, vol. 60 of LIPIcs. Leibniz International Proceedings in Informatics, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, Wadern, Art. No. 22, 18 pp. (2016)
Bapst, V., Coja-Oghlan, A., Hetterich, S., Raßmann, F., Vilenchik, D.: The condensation phase transition in random graph coloring. Commun. Math. Phys. 341(2), 543–606 (2016)
Barbier, J., Krz̧akała, F., Zdeborová, L., Zhang, P.: The hard-core model on random graphs revisited. J. Phys. Conf. Ser. 473, 012021 (2013)
Bartha, Z., Sun, N., Zhang, Y.: Breaking of 1RSB in random MAX-NAE-SAT. arXiv preprint, arXiv:1904.08891 (2019)
Bollobás, B., Borgs, C., Chayes, J.T., Kim, J.H., Wilson, D.B.: The scaling window of the 2-SAT transition. Random Struct. Algorithms 18(3), 201–256 (2001)
Chvátal, V., Reed, B.: Mick gets some (the odds are on his side) (satisfiability). In: Proceedings of the 33rd Annual Symposium on Foundations of Computer Science (Washington, DC, USA, 1992), SFCS ’92, IEEE Computer Society, pp. 620–627
Coja-Oghlan, A.: Upper-bounding the \(k\)-colorability threshold by counting covers. Electron. J. Combin. 20(3), 32 (2013)
Coja-Oghlan, A., Efthymiou, C., Hetterich, S.: On the chromatic number of random regular graphs. J. Combin. Theory Ser. B 116, 367–439 (2016)
Coja-Oghlan, A., Krz̧akała, F., Perkins, W., Zdeborová, L.: Information-theoretic thresholds from the cavity method. Adv. Math. 333, 694–795 (2018)
Coja-Oghlan, A., Panagiotou, K.: Catching the \(k\)-NAESAT threshold [extended abstract]. In: STOC’12—Proceedings of the 2012 ACM Symposium on Theory of Computing (2012), ACM, New York, pp. 899–907
Coja-Oghlan, A., Panagiotou, K.: The asymptotic \(k\)-SAT threshold. Adv. Math. 288, 985–1068 (2016)
Coja-Oghlan, A., Vilenchik, D.: Chasing the \(k\)-colorability threshold. In: 2013 IEEE 54th Annual Symposium on Foundations of Computer Science (FOCS ’13), IEEE Computer Society, Los Alamitos, CA, pp. 380–389 (2013)
Coja-Oghlan, A., Wormald, N.: The number of satisfying assignments of random regular k-SAT formulas. Combin. Probab. Comput. 27(4), 496–530 (2018)
Coja-Oghlan, A., Zdeborová, L.: The condensation transition in random hypergraph 2-coloring. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms (2012), SODA ’12, ACM, New York, pp. 241–250
Dietzfelbinger, M., Goerdt, A., Mitzenmacher, M., Montanari, A., Pagh, R., Rink, M.: Tight thresholds for cuckoo hashing via XORSAT. In: Abramsky, S., Gavoille, C., Kirchner, C., Meyer auf der Heide, F., Spirakis, P.G. (eds.) Automata, Languages and Programming (Berlin, Heidelberg, 2010), Springer, Berlin, pp. 213–225
Ding, J., Sly, A., Sun, N.: Proof of the satisfiability conjecture for large k. In: Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing (New York, NY, USA, 2015), STOC ’15, ACM, pp. 59–68
Ding, J., Sly, A., Sun, N.: Maximum independent sets on random regular graphs. Acta Math. 217(2), 263–340 (2016)
Ding, J., Sly, A., Sun, N.: Satisfiability threshold for random regular NAE-SAT. Commun. Math. Phys. 341(2), 435–489 (2016)
Dubois, O., Mandler, J.: The 3-XORSAT threshold. In: Proceedings of the 43rd Symposium on Foundations of Computer Science (Washington, DC, USA, 2002), FOCS ’02, IEEE Computer Society, pp. 769–778
Galanis, A., Štefankovič, D., Vigoda, E.: Inapproximability for antiferromagnetic spin systems in the tree nonuniqueness region. J. ACM 62(6), 50 (2015)
Galanis, A., Štefankovič, D., Vigoda, E.: Inapproximability of the partition function for the antiferromagnetic Ising and hard-core models. Comb. Probab. Comput. 25(4), 500–559 (2016)
Janson, S., Łuczak, T., Rucinski, A.: Random Graphs. Wiley-Interscience Series in Discrete Mathematics and Optimization. Wiley, New York (2000)
Kirousis, L.M., Kranakis, E., Krizanc, D., Stamatiou, Y.C.: Approximating the unsatisfiability threshold of random formulas. Random Struct. Algorithms 12(3), 253–269 (1998)
Krz̧akała, F., Montanari, A., Ricci-Tersenghi, F., Semerjian, G., Zdeborová, L.: Gibbs states and the set of solutions of random constraint satisfaction problems. Proc. Natl. Acad. Sci. 104(25), 10318–10323 (2007)
Krz̧akała, F., Pagnani, A., Weigt, M.: Threshold values, stability analysis, and high-\(q\) asymptotics for the coloring problem on random graphs. Phys. Rev. E 70, 046705 (2004)
Mézard, M., Montanari, A.: Information, Physics, and Computation. Oxford Graduate Texts. Oxford University Press, Oxford (2009)
Mézard, M., Parisi, G., Zecchina, R.: Analytic and algorithmic solution of random satisfiability problems. Science 297(5582), 812–815 (2002)
Montanari, A., Ricci-Tersenghi, F.: On the nature of the low-temperature phase in discontinuous mean-field spin glasses. Eur. Phys. J. B Condens. Matter Complex Syst. 33(3), 339–346 (2003)
Montanari, A., Ricci-Tersenghi, F., Semerjian, G.: Clusters of solutions and replica symmetry breaking in random \(k\)-satisfiability. J. Stat. Mech. Theory E 04, P04004 (2008)
Nam, D., Sly, A., Sohn, Y.: One-step replica symmetry breaking of random regular NAE-SAT I. arXiv preprint, arXiv:2011.14270 (2020)
Pittel, B., Sorkin, G.B.: The satisfiability threshold for \(k\)-XORSAT. Combin. Probab. Comput. 25(2), 236–268 (2016)
Robinson, R.W., Wormald, N.C.: Almost all cubic graphs are Hamiltonian. Random Struct. Algorithms 3(2), 117–125 (1992)
Robinson, R.W., Wormald, N.C.: Almost all regular graphs are Hamiltonian. Random Struct. Algorithms 5(2), 363–374 (1994)
Sherrington, D., Kirkpatrick, S.: Solvable model of a spin-glass. Phys. Rev. Lett. 35, 1792–1796 (1975)
Sly, A.: Computational transition at the uniqueness threshold. In: Proceedings of the 2010 IEEE 51st Annual Symposium on Foundations of Computer Science (Washington, DC, USA, 2010), FOCS ’10, IEEE Computer Society, pp. 287–296
Sly, A., Sun, N., Zhang, Y.: The number of solutions for random regular NAE-SAT. In: Proceedings of the 57th Symposium on Foundations of Computer Science (2016), FOCS ’16, pp. 724–731
Talagrand, M.: The Parisi formula. Ann. Math. (2) 163(1), 221–263 (2006)
Zdeborová, L., Krz̧akała, F.: Phase transitions in the coloring of random graphs. Phys. Rev. E 76, 031131 (2007)
Acknowledgements
We thank Amir Dembo, Nike Sun and Yumeng Zhang for helpful discussions. We thank the anonymous reviewers for their careful reading and valuable feedback, which improved our paper. DN is supported by a Samsung Scholarship. AS is supported by NSF Grants DMS-1352013 and DMS-1855527, a Simons Investigator grant and a MacArthur Fellowship. YS is partially supported by NSF Grants DMS-1613091 and DMS-1954337.
Funding
Open Access funding provided by the MIT Libraries.
Appendix A. Proof of technical lemmas
In this section, we provide the omitted proofs from Sects. 4 and 5, which deal with the effect of short cycles in \(\mathbb {E}{{\textbf {Z}}}_\lambda \). We begin by establishing Lemma 4.6 and Proposition 4.1-(4) in Sect. A.1. Then, we discuss the details of Corollary 5.1 in Sect. A.2. In Sect. A.3, we establish Proposition 4.1-(3). The final subsection, Sect. A.4, is devoted to the proof of Lemma 5.9.
1.1 Proof of Proposition 4.1-(4)
The goal of this subsection is to study \(\delta (\zeta ;\lambda )\) and \(\delta _L(\zeta ;\lambda )\) defined in (45). We first establish Lemma 4.6, and then show (4) of Proposition 4.1. Our approach is based on a rather direct study of the matrix \((\dot{A}\hat{A})^\zeta \). Once we obtain an explicit formula for the matrix, we use the combinatorial properties of free trees and the estimates on the belief propagation fixed point.
Proof of Lemma 4.6
Throughout the proof, we fix \(\lambda \in (0,\lambda ^\star ]\). Moreover, we assume that \(\zeta =\underline{0}\in \{0,1\}^{2l}\), and write \( \hat{A}_L\equiv \hat{A}^{0,0}_L\). It will be apparent that the same proof works for other choices of \(\zeta \). We first introduce several pieces of notation that will be crucial in the proof.
On the finite-dimensional vector space \(\mathbb {R}^{\Omega _L}\), we define the inner product \(\langle \, \cdot ,\,\cdot \,\rangle _\star \) by
and denote \(||f||_\star ^2 \equiv \langle f,f\rangle _\star \). Note that both \(\dot{A}_L\) and \(\hat{A}_L\) are stochastic matrices: since \(\dot{q}^\star _{\lambda ,L}\) is the BP fixed point, we have for every \(\tau _1\in \Omega \) and \((\texttt {L}_1,\texttt {L}_2)\in \{0,1\}^2\) that
Thus, the all-ones vector \(\mathbb {1}\) is an eigenvector with eigenvalue 1 for both of the matrices \(\dot{A}_L\) and \(\hat{A}_L\). Also, note that if f is orthogonal to \(\mathbb {1}\) (denoted \(f\perp _\star \mathbb {1}\)), then
where the equalities follow from (119). Moreover, it is straightforward to see that \(\dot{A}_L\hat{A}_L\) defines the transition matrix of an ergodic Markov chain on \(\Omega _L\). Thus, 1 is the largest eigenvalue and it is simple. Define the matrix \(B_L\in \mathbb {R}^{|\Omega _L|\times |\Omega _L|}\) by
That is, \(B_L \equiv \dot{A}_L \hat{A}_L-\mathbb {1}(\bar{H}^\star )^\textsf{T}\), where, with a slight abuse of notation, \(\bar{H}^\star \) denotes the vector \((\bar{H}^\star _{\lambda ,L}(\tau ))_{\tau \in \Omega _L}\). Since \(\dot{A}_L \hat{A}_L\mathbb {1}=\mathbb {1}\) and \((\bar{H}^\star )^\textsf{T}\dot{A}_L \hat{A}_L=(\bar{H}^\star )^\textsf{T}\) hold, we have that \(B_L \mathbb {1}= (\bar{H}^\star )^\textsf{T} B_L=0\). Thus, \((\dot{A}_L\hat{A}_L)^{l}=B_L^{l}+\mathbb {1}(\bar{H}^\star )^\textsf{T}\), so
The remaining work is to understand the rhs above.
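The identity \((\dot{A}_L\hat{A}_L)^{l}=B_L^{l}+\mathbb {1}(\bar{H}^\star )^\textsf{T}\) derived above holds for any ergodic stochastic matrix and its stationary distribution. A numerical sanity check on a generic toy matrix (an illustration only; A and pi below stand in for \(\dot{A}_L\hat{A}_L\) and \(\bar{H}^\star \), not the actual nae-sat objects):

```python
import numpy as np

rng = np.random.default_rng(0)

# A: a strictly positive (hence ergodic) stochastic matrix
A = rng.random((5, 5)) + 0.1
A /= A.sum(axis=1, keepdims=True)

# pi: stationary distribution, i.e. the left eigenvector for eigenvalue 1
w, V = np.linalg.eig(A.T)
pi = np.real(V[:, np.argmin(np.abs(w - 1.0))])
pi /= pi.sum()

one = np.ones(5)
B = A - np.outer(one, pi)          # B = A - 1 pi^T

assert np.allclose(B @ one, 0)     # B 1 = 0
assert np.allclose(pi @ B, 0)      # pi^T B = 0
l = 7                              # A^l = B^l + 1 pi^T, by induction on l
assert np.allclose(np.linalg.matrix_power(A, l),
                   np.linalg.matrix_power(B, l) + np.outer(one, pi))
```

In particular, taking traces gives \(Tr[A^{l}] = Tr[B^{l}] + 1\), which is exactly how the short-cycle corrections are isolated from the leading eigenvalue.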
Let \(\Omega _\circ \equiv \{{{{\texttt {B}}}}_0,{{{\texttt {B}}}}_1,{{{\texttt {R}}}}_0,{{{\texttt {R}}}}_1,{{\texttt {S}}}\} \), and \(\Omega _{\texttt {f}}\equiv \Omega _L \setminus \Omega _\circ \). We first need to understand how the entries of \(B_L\) are defined, especially \(B_L(\sigma ,\tau )\) with \(\sigma ,\tau \in \Omega _{\texttt {f}}\). If \(\sigma , \tau \in \Omega _{\texttt {f}}\), then we have the following observations:
-
\(\dot{A}_L(\sigma , \tau )=0\), unless both \(\sigma \) and \(\tau \) define the same free tree, and their root edges can be embedded into the tree as distinct edges adjacent to the same variable.
-
When \(\sigma ,\tau \) satisfy the above condition, denote \(\sigma = \sigma _v(e;\mathfrak {t})\) and \(\tau = \sigma _v(e';\mathfrak {t})\), where \(\mathfrak {t}\) denotes the free tree given by \(\sigma , \tau \) and v, e describe the variable and the half-edge in \(\mathfrak {t}\) where \(\sigma \) can be embedded. Then, we can observe that
$$\begin{aligned} \dot{A}_L(\sigma ,\tau ) = \frac{1}{d-1} \left| \left\{ e'': e''\sim v, \, e''\ne e,\, \sigma _v(e'';\mathfrak {t}) = \sigma _v(e';\mathfrak {t}) \right\} \right| . \end{aligned}$$
-
The same holds for \(\hat{A}_L\): for \(\sigma ,\tau \in \Omega _{\texttt {f}}\), we have
$$\begin{aligned} \hat{A}_L(\sigma ,\tau ) = \frac{1}{k-1} \left| \left\{ e'': e''\sim a, \, e''\ne e,\, \sigma _a(e'';\mathfrak {t}) = \sigma _a(e';\mathfrak {t}) \right\} \right| , \end{aligned}$$
if and only if there exist some \(\mathfrak {t}, a,e,e'\) such that \(\sigma = \sigma _a(e;\mathfrak {t})\), \(\tau =\sigma _a(e';\mathfrak {t})\); otherwise it is 0.
For a free tree \(\mathfrak {t}\), suppose that \(v,a\in \mathfrak {t}\) with \(v\sim a\), and \(e\sim v\), \(e'\sim a\) satisfy \(e\ne (va) \ne e'\). Then, letting \(\sigma = \sigma _v(e;\mathfrak {t})\) and \(\tau =\sigma _a(e';\mathfrak {t})\), we have
Here, note that there cannot be \(\tau '\in \Omega _\circ \) such that \(\dot{A}(\sigma ,\tau ') \hat{A}(\tau ',\tau ) \ne 0\). Further, since \(\bar{H}^\star _L (\Omega _{\texttt {f}}) \le (k^C2^{-k})^2\), for such \(\sigma , \tau \) we have
For \(\sigma , \tau \in \Omega _{\texttt {f}}\) that do not satisfy the above condition, we have \(B_L(\sigma ,\tau ) =-\bar{H}^\star _{\lambda ,L}(\tau )= O((k^C2^{-k})^2)\). With these observations in mind, the main analysis is to establish the following.
Claim A.1
There exists an absolute constant \(C>0\) such that the following hold true: For any positive integer l, we have
We first assume that the claim holds true and finish the proof of Lemma 4.6. In the formula
(with \(\sigma _{l+1}\equiv \sigma _1\)), we see that the first sum in the last line can be controlled by (124). To be specific, we define for \(\underline{\sigma }= (\sigma _i)_{i=1}^l \in \Omega _L^l\) that
where \(\mathfrak {t}(\sigma )\) denotes the free tree associated with the color \(\sigma \). If \(\underline{\sigma }=(\sigma _i)_{i=1}^l \in \Omega _{\texttt {f}}^{l}\) contributes to the above sum, then \(|\mathfrak {t}[\underline{\sigma }]|>1\), since \(|\mathfrak {t}[\underline{\sigma }]|=1\) would imply that the free component given by \(\underline{\sigma }\) forms a cycle. Therefore, we can bound
For the second sum, there are some i with \(\sigma _i\in \Omega _{\texttt {f}}\), and in this case we can use (123) to control the summation. When there are multiple such colors, we estimate the sum within each interval between \(\sigma _i, \sigma _{i'} \in \Omega _\circ \) by (123). Since the number of ways to choose a subset of the indices \(\{i\mid \sigma _i\in \Omega _\circ \}\subseteq [l]\) is at most \(2^l\), it can be absorbed into \((k^C2^{-k})^l\) and hence we obtain the conclusion of Lemma 4.6.\(\square \)
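The cyclic trace expansion used above, \(Tr[B_L^{l}]=\sum _{\sigma _1,\ldots ,\sigma _l}\prod _{i}B_L(\sigma _i,\sigma _{i+1})\) with \(\sigma _{l+1}\equiv \sigma _1\), is elementary linear algebra and can be checked numerically on a generic matrix (a toy stand-in, not the actual \(B_L\)):

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)
m = 4                      # toy state space standing in for Omega_L
B = rng.standard_normal((m, m))
l = 3

# Tr[B^l] = sum over cyclic tuples (s_1, ..., s_l) of prod_i B(s_i, s_{i+1}),
# with the convention s_{l+1} = s_1
cyclic_sum = sum(
    np.prod([B[s[i], s[(i + 1) % l]] for i in range(l)])
    for s in product(range(m), repeat=l)
)
assert np.isclose(cyclic_sum, np.trace(np.linalg.matrix_power(B, l)))
```

The proof of Lemma 4.6 proceeds precisely by splitting this cyclic sum according to which \(\sigma _i\) fall in \(\Omega _{\texttt {f}}\) versus \(\Omega _\circ \).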
Proof of Claim A.1
According to (121) and (122), it suffices to establish (123) for \(A_L\equiv \dot{A}_L\hat{A}_L\). This is because the contribution to \(B_L(\sigma , \tau )\) from \(\sigma \), \(\tau \) such that \(A_L(\sigma , \tau ) =0\) is bounded by \(O((k^C2^{-k})^2)\), which is of smaller order than \(k^C2^{-k}\) as we can see from (122).
In order to obtain (123), let \(\underline{\sigma }=(\sigma _i)_{i=1}^{l-1} \in \Omega _{\texttt {f}}^{l-1}\), and observe that we need \(|\mathfrak {t}[\underline{\sigma }]|=1\) to have
For a fixed \(\sigma _1\in \Omega _{\texttt {f}}\), let \(\mathfrak {t}, v, e\) be such that \(\sigma _1 =\sigma _v(e;\mathfrak {t})\). Moreover, define \(\mathfrak {t}_{v\setminus e}\) to be the connected component of \(\mathfrak {t}{\setminus } \{e \}\) containing v, and let
Then, the formula (121) tells us that
Since \(A_L(\sigma _0,\sigma _1) \le (k^C2^{-k})^{v(\mathfrak {t})}\) for any \(\sigma _0\in \Omega _\circ \) and \(\sigma _1\) with \(\mathfrak {t}(\sigma _1)= \mathfrak {t}\), we see that
The inequality (124) can be proven in a similar way. Let \(\underline{\sigma }=(\sigma _i)_{i=1}^l\), and note that tuples with \(|\mathfrak {t}[\underline{\sigma }]|=1\) do not give any contribution to (124), since this would imply that the free component given by \(\underline{\sigma }\) contains a cycle. Suppose that \(|\mathfrak {t}[\underline{\sigma }]|=2\), and assume that \(|\mathfrak {t}[\sigma _1,\ldots ,\sigma _{i_0-1}]|=|\mathfrak {t}[\sigma _{i_0},\ldots ,\sigma _{l}]|=1\). Using (126), we obtain that
where the term \((k^C2^{-k})^{v(\mathfrak {t}_1) + v(\mathfrak {t}_2)}\) comes from
Thus, summing (128) over all \(i_0\), \(\mathfrak {t}_1\), \(\mathfrak {t}_2\) as (127), we obtain (124). The case where \(|\mathfrak {t}[\underline{\sigma }]|>2\) can be derived analogously and is left to the interested reader. \(\square \)
The final goal of this subsection is to establish Proposition 4.1-(4). This comes as a rather straightforward application of Claim A.1, and hence we briefly sketch the proof without all the details.
Proof of Proposition 4.1-(4)
Define the matrix B analogously as (120). Let \(L_0>0\) and let \(B|_{L_0}\) be the \(\Omega _{L_0} \times \Omega _{L_0}\) submatrix of B. Then, we can write
where \(\sigma _{l+1}\equiv \sigma _1\). Since \(\mathfrak {t}[\underline{\sigma }]\) cannot be a singleton for \(\underline{\sigma }= (\sigma _i)_{i=1}^l\) that contributes to the above sum, for the same reason as in the proof of (124), there should be some \(i_0\) such that \(\sigma _{i_0}\in \Omega \setminus \Omega _{L_0}\) and \(\mathfrak {t}(\sigma _{i_0-1}) \ne \mathfrak {t}(\sigma _{i_0})\). For such \(i_0\), we get
and hence the above sum can be controlled by
In order to compare \(Tr[B^l]\) to \(Tr[B_L^l]\), we set \(L>L_0 >0\), and obtain that
Moreover, we can see that \(Tr\left[ ((B_L)|_{L_0})^l \right] \) converges to \(Tr\left[ (B|_{L_0})^l \right] \) as \(L\rightarrow \infty \) since \(H^\star _L \rightarrow H^\star \). Therefore, we obtain the conclusion of Proposition 4.1-(4) by combining (129) and (130). \(\square \)
1.2 Proof of Corollary 5.1
In this section, we present the proof of Corollary 5.1. The proof is based on ideas from Proposition 4.1 and Corollary 4.5. We show (1) of the corollary; the derivation of (2) is analogous.
Note that for any nonnegative integer x, we have \((1+\theta )^x = \sum _{a\ge 0} \frac{(x)_a}{a!} \theta ^a\). Setting \(\tilde{\delta }(\zeta ) = (1+\delta _L(\zeta ))^{-1}-1\), we can write
where we abbreviated \({{\textbf {Z}}}' = \widetilde{{{\textbf {Z}}}}^{(L),\text{ tr }}_{\lambda }\). Let \(c_{\textsf {cyc}}=c_\textsf {cyc}(l_0)\) be as in Proposition 4.1, and set \(c' = \frac{1}{3}(c\wedge c_{\textsf {cyc}})\). We will control \(\mathbb {E}[{{\textbf {Z}}}' \cdot (\underline{X})_{\underline{a}} \mathbb {1}\{||\underline{X}||_\infty \ge c\log n \}] \) for each \(\underline{a}\) as follows.
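The falling-factorial expansion invoked above, \((1+\theta )^x = \sum _{a\ge 0} \frac{(x)_a}{a!}\theta ^a\), is a finite sum for integer x, since \((x)_a/a!=\binom{x}{a}\) and the terms vanish for \(a>x\). A direct check (helper names `falling` and `expand` are ours, purely illustrative):

```python
from math import factorial, isclose

def falling(x, a):
    """Falling factorial (x)_a = x (x-1) ... (x-a+1)."""
    out = 1
    for j in range(a):
        out *= (x - j)
    return out

def expand(x, theta):
    """sum_{a>=0} (x)_a / a! * theta^a; terminates at a = x for integer x."""
    return sum(falling(x, a) / factorial(a) * theta**a for a in range(x + 1))

# the expansion reproduces (1 + theta)^x exactly (binomial theorem)
for x in range(8):
    for theta in (-0.5, 0.3, 2.0):
        assert isclose((1 + theta)**x, expand(x, theta))
```

This is why expanding the rescaling factor of the partition function in falling factorials of the cycle counts \(\underline{X}\) reduces the computation to the correlation terms \(\mathbb {E}[{{\textbf {Z}}}'\cdot (\underline{X})_{\underline{a}}]\).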
Case 1. \(||\underline{a}||_\infty \le c'\log n\).
Controlling the indicator crudely by \(\mathbb {1}\{||\underline{X}||_\infty \ge c\log n \} \le \sum _{||\zeta '||\le l_0 } \mathbb {1}\{X(\zeta ') \ge c\log n \}\), we study
for each \(\zeta '\). Define \(\underline{a}'\) by
Since \(||\underline{a}'||_\infty \le \frac{2}{3} (c_{\textsf {cyc}}\wedge c)\log n\), we can see that
where the last inequality follows from Proposition 4.1.
Case 2. \(||\underline{a}||_\infty >c'\log n\).
In this case, it will be enough to study \(\mathbb {E}[{{\textbf {Z}}}'\cdot (\underline{X})_{\underline{a}} ]\), similarly to Proposition 4.1. However, the proof of Proposition 4.1 breaks down when \(||\underline{a}||_1\) is large, and hence we work with a more general but weaker approach to control Case 2.
To begin with, as in (51), we write
where \(\mathcal {Y}= \{\mathcal {Y}_i(\zeta ) \}_{i\in [a_\zeta ],\, ||\zeta ||\le l_0}\) denotes the locations of \(\underline{a}\) \(\zeta \)-cycles and \(\underline{\tau }_\mathcal {Y}\) describes a prescribed coloring configuration on them (recall Definition 3.2). As before, we derive an estimate on the summand for each fixed \((\mathcal {Y}, \underline{\tau }_\mathcal {Y})\). Let \(\Delta =\Delta [\underline{\tau }_\mathcal {Y}]\) be given as in Definition 4.2. Consider a literal assignment \(\underline{\texttt {L}}_E\) and an empirical count measure \(g=(\dot{g}, (\hat{g}^{\underline{\texttt {L}}})_{\underline{\texttt {L}}\in \{0,1\}^k }, \bar{g} )\) on \(\mathscr {G}\) that contributes to \(\mathbb {E}{{\textbf {Z}}}'\). Here, we assume that \(\underline{\texttt {L}}_E\) and \((\hat{g}^{\underline{\texttt {L}}})\) are compatible in the sense that \(|\{a\in F: (\underline{\texttt {L}}_E)_a = \underline{\texttt {L}}\}| = |\hat{g}^{\underline{\texttt {L}}}| \) for each \(\underline{\texttt {L}}\in \{0,1\}^k\). Based on the expression in the first line of (55), we have that
Define the quantity \(H(g,\Delta )\) to be
Moreover, let \(\hat{\Delta } \equiv \sum _{\underline{\texttt {L}}} \hat{\Delta }^{\underline{\texttt {L}}}\), and define
Our goal is to deduce a general upper bound on \(\mathcal {H}(g,\Delta )\) that depends only on \(\eta (\mathcal {Y})\), not on g or \(||\underline{a}||\).
We can interpret \(\bar{\Delta }_c\) as a partition of the set \([|\bar{\Delta }_c|]\). That is, \(\bar{\Delta }_c(\sigma )\) for each \(\sigma \in \dot{\Omega }_L\) corresponds to a (disjoint) interval of length \(|\bar{\Delta }_c(\sigma )|\) inside \([|\bar{\Delta }_c|]\). Similarly, we can think of a partition of the set \([|\dot{\Delta }|+|\hat{\Delta }| ]\) by disjoint intervals of length \(|\dot{\Delta }(\sigma )|\) and \(|\hat{\Delta }^{\underline{\texttt {L}}}(\sigma )|\), for each \(\sigma \in \dot{\Omega }_L\) and \(\underline{\texttt {L}}\in \{0,1\}^k\). Since \(\bar{\Delta }\) corresponds to a marginal measure of \(\dot{\Delta }\) and \(\hat{\Delta }\), we see that the latter partition of \([|\dot{\Delta }|+|\hat{\Delta }|]\) can be chosen as a subpartition of the former of \([|\bar{\Delta }|]\). This means that the expression in the numerator of \(\mathcal {H}(g,\Delta )\) must be smaller than its denominator. Furthermore, note that \(|\bar{\Delta }|\) exceeds \( |\dot{\Delta }|+|\hat{\Delta }|\) by \(\eta \), and for any nonnegative integers \(\{y(\sigma )\}_{\sigma \in \dot{\Omega }_L}\) such that \(\sum _{\sigma } y(\sigma ) \ge \eta \), it holds that
Thus, \(\mathcal {H}(g,\Delta )\) can be crudely controlled as follows:
On the other hand, for a fixed \(\eta \), we can bound the number of possible choices of \(\mathcal {Y}\) analogously to (62). Setting \(a^\dagger = \sum _{||\zeta ||\le l_0} ||\zeta || a_\zeta \) and applying (62) to (131), we deduce that
Therefore, we can sum this over all \(\eta \) and obtain that
where C is a constant depending on k, L, and \(l_0\). Averaging over g and summing the above for \(||\underline{a}||_\infty \ge \frac{1}{3}\log n\), we see that
The conclusion for (2) can be obtained analogously if we work with the pair model. \(\square \)
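The subpartition comparison used in the proof above reduces to an elementary combinatorial fact: under the interpretation of the numerator and denominator of \(\mathcal {H}(g,\Delta )\) as products of factorials over the blocks of the two partitions (which we assume here), the product over a refinement divides the product over the coarser blocks, since each block contributes a multinomial coefficient \(\ge 1\). A toy check:

```python
from math import factorial, prod

def factorial_product(sizes):
    return prod(factorial(s) for s in sizes)

# blocks of a partition of a finite set, and an arbitrary refinement of them
blocks = [5, 3, 4]
refinement = [[2, 2, 1], [3], [1, 1, 2]]
assert [sum(r) for r in refinement] == blocks

# product of sub-block factorials divides (hence is at most) the product of
# block factorials: each block contributes a multinomial coefficient >= 1
fine = factorial_product([s for r in refinement for s in r])
coarse = factorial_product(blocks)
assert coarse % fine == 0 and fine <= coarse
```

This is the quantitative content of the statement that the numerator of \(\mathcal {H}(g,\Delta )\) is dominated by its denominator.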
1.3 Proof of Proposition 4.1-(3)
Here we present the proof of Proposition 4.1-(3), by establishing (46) for \({{\textbf {Z}}}_\lambda \). The proof for \({{\textbf {Z}}}_{\lambda ,s_n} \) is analogous. The main difference from the truncated model is that the optimal empirical measure \(H^*\) is no longer bounded below by a constant. This aspect requires extra care in the derivation of (55), which indeed is no longer true in general for the untruncated model. To overcome this difficulty, let \(\dot{q}^\star = \dot{q}^\star _{\lambda ^\star }\in \mathscr {P}(\dot{\Omega })\) be the BP fixed point, and split the space \(\dot{\Omega }\) into two types:
Now, recall the expression (51) for \(\widetilde{{{\textbf {Z}}}}_{\lambda }^{\text{ tr }}\):
where \(\mathcal {Y}= \{\mathcal {Y}_i(\zeta ) \}_{i\in [a_\zeta ],\, ||\zeta ||\le l_0}\) denotes the locations of \(\underline{a}\) \(\zeta \)-cycles and \(\underline{\tau }_\mathcal {Y}\) describes a prescribed coloring configuration on them.
As before, we work with an empirical profile count \(g= (\dot{g},(\hat{g}^{\underline{\texttt {L}}})_{\underline{\texttt {L}}}, \bar{g})\) that satisfies \(||g-g^\star _{\lambda }||_1\le \sqrt{n} \log ^2 n\). We additionally assume that
and analogous conditions for \(\hat{g}^{\underline{\texttt {L}}}\) and \(\bar{g}\). The empirical counts g that do not satisfy the equation above are excluded for the same reason as (66). We additionally write \(H = (\dot{H}, (\hat{H}^{\underline{\texttt {L}}})_{\underline{\texttt {L}}}, \bar{H})\) for their normalized versions, that is,
Recall the definition of the empirical profile \(\Delta = (\dot{\Delta }, (\hat{\Delta }^{\underline{\texttt {L}}})_{\underline{\texttt {L}}}, \bar{\Delta }_c)\) on \(\mathcal {Y}\) (Definition 4.2). Then, as in (131), we fix a literal assignment \(\underline{\texttt {L}}_E\) that is compatible with \((\hat{g}^{\underline{\texttt {L}}})_{\underline{\texttt {L}}}\) and write
Moreover, we define
as before, noting that it is well-defined without knowing \(\underline{\tau }_\mathcal {Y}\). In what follows, we perform a case analysis depending on \(\eta (\mathcal {Y})\). It turns out that the case \(\eta = 0\) gives the main contribution, but the analysis for both cases becomes more complicated than in the proof of Proposition 4.1-(1) or in Sect. A.2 due to the existence of \(\dot{\Omega }^{\textsf {atyp}}\).
The key analysis lies in the computation of \(\sum _{\underline{\tau }_\mathcal {Y}} \mathcal {H}(H,\Delta [\underline{\tau }_\mathcal {Y}] )\). In what follows, we carry out this analysis in two different cases, when \(\eta =0\) and when \(\eta >0\).
1.3.1 Case 1. \(\eta =0\)
Since \(\mathcal {Y}\) consists of pairwise disjoint cycles, we can consider \(\mathcal {H}\) as a product of the corresponding functions defined on each cycle and treat each factor separately when summing over \(\underline{\tau }_\mathcal {Y}\). Therefore, we will assume that \(\mathcal {Y}= \{\mathcal {Y}(\zeta ) \}\) for some \(||\zeta ||\le l_0\), and later take products over different cycles.
We may separate the sum \(\sum _{\underline{\tau }_\mathcal {Y}} \mathcal {H}(H,\Delta [\underline{\tau }_\mathcal {Y}] )\) into two cases, when \(\underline{\tau }_\mathcal {Y}\subset \dot{\Omega }^{\textsf {typ}}\) and when it is not.
Case 1-1. \(\underline{\tau }_\mathcal {Y}\subset \dot{\Omega }^{\textsf {typ}}\).
If \(||g-g^\star ||\le \sqrt{n} \log ^2 n\), then for all \(\sigma \in \dot{\Omega }^{\textsf {typ}}\) we have
Moreover, recall the matrices \((\dot{A}\hat{A})^\zeta \) defined in (44). Similarly, we introduce
where \(\dot{A}_{\textsf {typ}}\) and \(\hat{A}_{\textsf {typ}}^{\texttt {L}_1,\texttt {L}_2}\) denote the \(\dot{\Omega }^{\textsf {typ}} \times \dot{\Omega }^{\textsf {typ}} \) submatrices of \(\dot{A}\) and \(\hat{A}^{\texttt {L}_1,\texttt {L}_2}\). Then, for H of our interest, we can express
Following the same analysis done in the proof of Proposition 4.1-(4) in Sect. 6, we obtain that
which gives us that
Case 1-2. \(\underline{\tau }_\mathcal {Y}\nsubseteq \dot{\Omega }^{\textsf {typ}}\).
This case can be treated in a similar way to the proof of Proposition 4.1-(4) in Sect. 6. Let \(l=||\zeta ||\), and without loss of generality we assume that \(\zeta = \underline{0}\). Denoting \(\hat{A} \equiv \hat{A}^{0,0}\), we can write
with \(\sigma _0 = \sigma _{2l}\).
Observe that in a tuple \((\sigma _1,\ldots ,\sigma _{2l})\) that contributes to the above sum, there should exist \(j\in [2l]\) such that \(\sigma _j \in \{{{{\texttt {B}}}}_0,{{{\texttt {B}}}}_1,{{\texttt {S}}}\}\) and \(\sigma _{j+1}\in \dot{\Omega }^{\textsf {atyp}}\). Otherwise, the tuple \((\sigma _1,\ldots ,\sigma _{2l})\) would form a free component that has a cycle (of length 2l), which contradicts the assumption that the set \(\dot{\Omega }\) only contains the colors which induce a free tree. Without loss of generality, suppose that \(j=2l-1\) satisfies the above criterion (the case of j being even can be covered by the same argument). Then,
(Note that this holds not only for \(H^\star \), but for any H satisfying (134).) Thus, plugging this into (137) and summing over the remaining colors gives that
Combining Cases 1-1 and 1-2, we obtain that for \(\mathcal {Y}\) with \(\eta (\mathcal {Y})=0\),
Therefore, in the general case when \(\mathcal {Y}\) consists of \(\underline{a}\) disjoint \(\zeta \)-cycles, averaging over g, \(\underline{\texttt {L}}_E\) and then summing over \(\mathcal {Y}\) gives
1.3.2 Case 2. \(\eta >0\)
In this case, \(\mathcal {Y}\) decomposes into \(||\underline{a}||_1-\eta \) connected components, and each component can be considered separately. If a component in \(\mathcal {Y}\) is a single cycle, it can be treated analogously to the previous case. Therefore, we assume that \(\mathcal {Y}= \{\mathcal {Y}(\zeta _1), \ldots , \mathcal {Y}(\zeta _j) \}\) such that the cycles \(\mathcal {Y}(\zeta _1),\ldots , \mathcal {Y}(\zeta _j)\) form a single connected component in \(\mathscr {G}\). Moreover, without loss of generality, we consider the case where all \(\zeta _i\), \(1\le i \le j\), are identically 0.
We define the orientation on \(\mathcal {Y}\) as follows:
-
O1.
For each half edge \(e=(va)\in E_c(\mathcal {Y})\), make it a directed edge by assigning a direction, either \(v\rightarrow a\) or \(a\rightarrow v\).
-
O2.
An assignment of directions on \(E_c(\mathcal {Y})\) is called an orientation if every variable and clause has at least one incoming edge adjacent to it.
Note that we can always construct an orientation as follows: take a spanning tree of \(\mathcal {Y}\) and pick a variable (or clause) that has an edge not included in the tree. Starting from the selected vertex (the root), we can assign directions on the tree so that all vertices but the root have an incoming edge. Then, direct the non-tree edge at the root toward the root to complete the orientation.
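As an illustration, the spanning-tree construction above can be sketched in code. This is a toy version on a plain connected graph with at least one cycle; the list-based encoding is a hypothetical stand-in for the bipartite structure of \(\mathcal {Y}\):

```python
from collections import deque

def orient_for_incoming(n, edges):
    """Direct every edge of a connected graph (with >= 1 cycle) so that
    each vertex ends up with at least one incoming edge."""
    adj = [[] for _ in range(n)]
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))

    # Build one spanning tree (BFS from vertex 0) and record its edges.
    tree, seen, q = set(), {0}, deque([0])
    while q:
        u = q.popleft()
        for v, i in adj[u]:
            if v not in seen:
                seen.add(v)
                tree.add(i)
                q.append(v)
    # Pick a non-tree edge and root the tree at one of its endpoints.
    spare = next(i for i in range(len(edges)) if i not in tree)
    root = edges[spare][0]

    # Orient tree edges away from the root: every non-root vertex gets
    # exactly one incoming edge (the one from its parent).
    direction = {}
    seen, q = {root}, deque([root])
    while q:
        u = q.popleft()
        for v, i in adj[u]:
            if i in tree and v not in seen:
                seen.add(v)
                direction[i] = (u, v)
                q.append(v)
    # The spare edge is directed into the root, giving it an incoming edge.
    u, v = edges[spare]
    direction[spare] = (v, root) if u == root else (u, root)
    # Any remaining edges may point either way.
    for i, (u, v) in enumerate(edges):
        direction.setdefault(i, (u, v))
    return direction
```

On a triangle with a pendant vertex, for instance, every vertex receives an incoming edge, matching property O2.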
We fix an orientation of \(\mathcal {Y}\), and for each variable \(v\in V(\mathcal {Y})\) (resp. clause \(a\in F(\mathcal {Y})\)), fix e(v) (resp. e(a)) to be an incoming edge. Note that \(e(v),\; v\in V(\mathcal {Y})\) and \(e(a), \;a\in F(\mathcal {Y})\) are all distinct by definition.
Denoting \(E_c = E_c (\mathcal {Y}), \, V'=V(\mathcal {Y})\) and \(F'=F(\mathcal {Y})\), let
Here, note that \(\eta (\mathcal {Y}) = |E_\circ |\). Additionally, for each \(v\in V'\) and \(a\in F'\), we define
(Note that \(\delta _c (v)\) is a singleton unless v is an overlapping variable; the same goes for \(\delta _c(a)\).) For a fixed \(\underline{\sigma }_{E_c}\) we express the sum of \(\mathcal {H}(H,\Delta ) \equiv \mathcal {H}(H, \underline{\tau }_\mathcal {Y})\) as follows.
where the conditional measures in the formula are defined as
We study the sum of (139) over \(\underline{\sigma }_{E_c}\), in two cases: when \(\underline{\sigma }_{E_c} \subset \dot{\Omega }^{\textsf {typ}}\) and when it is not.
Case 2-1. \(\underline{\sigma }_{E_c}\subset \dot{\Omega }^{\textsf {typ}}\).
In this case, since \(|E_\circ | = \eta \), we have
Since each conditional measure \(\dot{H}(\;\cdot \;|\sigma _{e(v)})\), \(\hat{H}(\;\cdot \;|\sigma _{e(a)})\) has total mass equal to 1 on \(\dot{\Omega }\), we sum the above over all \(\underline{\sigma }_{E_c} \subset \dot{\Omega }^{\textsf {typ}}\) and deduce that
Case 2-2. \(\underline{\sigma }_{E_c}\nsubseteq \dot{\Omega }^{\textsf {typ}}\).
As done in Case 1-2, there must exist two adjacent edges \(e',e''\in E_c\) such that \(\sigma _{e'} \in \{{{{\texttt {B}}}}_0, {{{\texttt {B}}}}_1, {{\texttt {S}}}\}\) and \(\sigma _{e''}\in \dot{\Omega }^{\textsf {atyp}}\). Assume that both \(e', e''\) are adjacent to a variable v and \(e'=e(v)\).
In such a setting, we have
Having this property in mind, fix \(\underline{\sigma }_{E_c} \nsubseteq \dot{\Omega }^{\textsf {typ}}\), and let \(E_\circ ^{\textsf {atyp}}\) be
and define \(\eta '\equiv \eta '(\underline{\sigma }_{E_c}) \equiv |E_\circ ^{\textsf {atyp}}|\). Then, similarly as (140), we can write
where we used the crude bound \(\bar{H}(\sigma _e) \ge n^{-1}\) for \(\sigma _e \in \dot{\Omega }^{\textsf {atyp}}.\) We claim that there must be at least \(\eta '+1\) variables or clauses for which (142) happens.
For each \(e \in E_\circ ^{\textsf {atyp}}\), consider the following “backtracking” algorithm:
-
(1)
Let \(e_0=e\), and let \(x(e_0)\) be the variable or clause that has \(e_0\) as an outgoing edge.
-
(2)
Let \(e_1 = e(x(e_0))\in E_c\setminus E_\circ \) be the unique incoming edge into \(x(e_0)\) as defined above. If \(\sigma _{e_1} \in \{{{{\texttt {B}}}}_0,{{{\texttt {B}}}}_1, {{\texttt {S}}}\}\), then we terminate the algorithm and output \(e_\star (e)= e_1\).
-
(3)
If not, define \(e_{i+1}= e(x(e_i))\) as in (1) and (2), and continue until termination as described in (2).
For each \(e\in E_\circ ^{\textsf {atyp}}\), this algorithm must terminate, since otherwise \(\underline{\sigma }_{E_c}\) would contain a cycle in a free component. We also introduce a similar algorithm which outputs \(e_{\star \star }(e)\in E_c\) for each \(e\in E_\circ ^{\textsf {atyp}}\):
-
(a)
Let \(y(e_0)\) be the variable or clause that has \(e_0=e\) as an incoming edge.
-
(b)
Let \(e_1= e(y(e_0)) \in E_c \setminus E_\circ \) be the unique incoming edge into \(y(e_0)\) as defined above. If \(\sigma _{e_1} \in \{{{{\texttt {B}}}}_0,{{{\texttt {B}}}}_1, {{\texttt {S}}}\}\), then we terminate the algorithm and output \(e_{\star \star }(e)= e_1\).
-
(c)
If not, define \(e_{i+1} = e( x(e_i))\) (\(i\ge 1\)), where \(x(e_i)\) is defined as in (1) of the previous algorithm. Continue until termination as described in (b).
This algorithm must also terminate in finite time, as above. Moreover, \(e_\star (e)\) and \(e_{\star \star }(e)\) must be different for each \(e\in E_\circ ^{\textsf {atyp}}\), since if they were the same, the free component containing e would have a cycle.
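Both backtracking procedures share the same skeleton: repeatedly replace the current edge by the unique incoming edge \(e(x(\cdot ))\), until a color in \(\{{{{\texttt {B}}}}_0,{{{\texttt {B}}}}_1,{{\texttt {S}}}\}\) is reached. A toy sketch of this skeleton, where the dictionaries `step` and `color` are hypothetical encodings of \(e(x(\cdot ))\) and \(\underline{\sigma }\):

```python
BOUNDARY = {"B0", "B1", "S"}

def backtrack(e0, step, color):
    """Follow e0 -> e(x(e0)) -> e(x(e1)) -> ... and return the first
    edge whose color lies in BOUNDARY.  Termination is guaranteed when
    the chain contains no cycle, as in the free-tree setting above."""
    cur = step[e0]                 # e_1 = e(x(e_0))
    while color[cur] not in BOUNDARY:
        cur = step[cur]            # e_{i+1} = e(x(e_i))
    return cur
```

For instance, with `step = {0: 1, 1: 2, 2: 3}` and `color = {1: "atyp", 2: "atyp", 3: "S"}`, `backtrack(0, step, color)` returns edge 3.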
Consider the graph \(\mathfrak {G}=(\mathfrak {V},\mathfrak {E})\) defined as follows:
-
\(\mathfrak {V} \equiv \{e_\star (e), e_{\star \star }(e): \, e\in E_\circ ^{\textsf {atyp}} \}.\)
-
\(e_1, e_2 \in \mathfrak {V}\) are adjacent if there exists \(e\in E_\circ ^{\textsf {atyp}}\) such that \(e_1 = e_\star (e)\) and \(e_2 = e_{\star \star }(e)\).
Observe that \(\mathfrak {G}\) cannot contain any cycles, since a cycle inside \(\mathfrak {G}\) would imply the existence of a free component containing a cycle. Since \(|\mathfrak {E}|=\eta '\), this implies that \(|\mathfrak {V}|\ge \eta '+1\). Since the set \(\mathfrak {V}\) locates the edges \(e\in E_c\) where (142) happens, there are at least \(\eta '+1\) distinct edges (or vertices) satisfying (142).
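The counting step here rests on the elementary fact that a forest satisfies \(|\mathfrak {V}| = |\mathfrak {E}| + \#\{\text {components}\} \ge |\mathfrak {E}|+1\). A minimal union-find check of this acyclicity-counting relation (the edge lists below are purely illustrative):

```python
def forest_vertex_bound(edges):
    """Return (is_forest, #vertices, #edges).  Union-find: an edge whose
    endpoints already share a component closes a cycle."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    acyclic = True
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            acyclic = False                # cycle detected
        else:
            parent[ru] = rv                # merge components
    return acyclic, len(parent), len(edges)

# For any forest, #vertices >= #edges + 1, as used in the proof.
```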
Having this in mind, we sum (143) over all \(\underline{\sigma }_{E_c}\nsubseteq \dot{\Omega }^{\textsf {typ}}\) and deduce that
Back to the proof of Case 2.
Now we return to the general setting, where \(\mathcal {Y}\) contains multiple connected components with \(\eta (\mathcal {Y})>0\). When we sum \(\mathbb {E}[\text {{\textbf {Z}}}_{\lambda }^{\text {tr}} \mathbb {1}\{\mathcal {Y},\underline{\tau }_\mathcal {Y}\}]\) over all \(\underline{\tau }\), each \(\zeta \)-cycle in \(\mathcal {Y}\) that is disjoint from all the others contributes a factor of \((1+\delta (\zeta ) +O(n^{-1/5}))\) as discussed in Case 1. On the other hand, the contributions from components that are not a single cycle are bounded by \(n^{3\eta /4}\) due to (141), (144). Summarizing the discussion, we have
Summing over all \(\mathcal {Y}\) satisfying \(\eta (\mathcal {Y})=\eta \) can then be done using (63). This gives that
where \(C'\) is a constant depending only on k, d and \(a^\dagger \equiv \sum _{||\zeta ||\le l_0} ||\zeta ||a_\zeta \). We can choose \(c_{\textsf {cyc}}=c_{\textsf {cyc}}(l_0)\) so that \(2^{2a^\dagger } \le n^{1/8}\) for any \(||\underline{a}||_\infty \le c_{\textsf {cyc}}\log n\). Then, we obtain the following conclusion by summing the above over all \(\eta \ge 1\) and averaging over \(\underline{\texttt {L}}_E\) and g satisfying \(||g-g^\star _{\lambda }||_1\le \sqrt{n}\log ^2 n\) and (134):
Finally, we conclude the proof of Proposition 4.1-(5) by combining (138) and (145). \(\square \)
1.4 Proof of Lemma 5.9
In this section, we present the proof of Lemma 5.9. Our approach relies on applying ideas similar to those of Lemma 6.7 of [25] and Proposition 4.1 to
Proof of Lemma 5.9
For a given \(\underline{\tau }_{\mathscr {U}}\), let \(\dot{\epsilon }\) and \((\hat{\epsilon }^{\underline{\texttt {L}}})_{\underline{\texttt {L}}}\) be integer-valued measures on \((\dot{\Omega }_L^2)^d\) and \((\dot{\Omega }_L^2)^k\), respectively, such that
In particular, we can first define \(\dot{\epsilon }\) and \(\sum _{\underline{\texttt {L}}}\hat{\epsilon }^{\underline{\texttt {L}}}\), following the construction of \((\dot{\epsilon }, \hat{\epsilon })\) given in (60) of [25] and Lemma 4.4 of [43]: there exist \((\dot{\epsilon }^\tau , \hat{\epsilon }^\tau )_{\tau \in \dot{\Omega }_L^2}\) such that
satisfy the desired condition (147). After that, we distribute the mass \(\hat{\epsilon } \equiv \sum _{\underline{\texttt {L}}} \hat{\epsilon }^{\underline{\texttt {L}}}\), which can be done in the following way:
-
For each \(\underline{\tau }\in (\dot{\Omega }_L^2)^k\), pick one \(\underline{\texttt {L}}\in \{0,1\}^k\) such that \(\underline{\tau } \oplus \underline{\texttt {L}}\) defines a valid coloring around a clause. Then, set \(\hat{\epsilon }^{\underline{\texttt {L}}}(\underline{\tau }) = \hat{\epsilon }(\underline{\tau })\).
For such \(\dot{\epsilon }\) and \(\hat{\epsilon }\), let
where both depend only on \(|\mathscr {U}|\), not on \(\underline{\tau }_{\mathscr {U}}\).
As in the proof of Proposition 4.1, we study (146) by computing the contribution from each empirical profile. If \(g-\epsilon = (\dot{g}-\dot{\epsilon }, (\hat{g}^{\underline{\texttt {L}}}-\hat{\epsilon }^{\underline{\texttt {L}}})_{\underline{\texttt {L}}})\) is an empirical profile contributing to (93), then g contributes to the full random (d, k)-regular graph with \(\tilde{n}=n-|V(T)|+\nu \) variables and \(\tilde{m}=m-|F(T)|+\mu \) clauses. Let \(\Xi (g|\underline{\texttt {L}}_{\tilde{E}})\) be the contribution of g to \(\mathbb {E}[{{\textbf {Z}}}^2 |\underline{\texttt {L}}_{\tilde{E}} ]\) on such a random graph with literal assignment \(\underline{\texttt {L}}_{\tilde{E}}\), given by
where w(g) is given by (53).
Let \(\Xi _c (g,\epsilon ,\Delta ,U\,|\,\underline{\texttt {L}}_E )\) be the contribution of the profile \(g-\epsilon \) to (146), conditioned on the literal assignments being \(\underline{\texttt {L}}_E\). We can write down its explicit formula as follows.
where the meaning of each term on the rhs is as follows.
-
(1)
The first term counts the number of ways to locate the variables and clauses except the ones given by \(\mathcal {Y}\) and \(\underline{\tau }_\mathcal {Y}\).
-
(2)
The second denotes the probability of getting a valid matching between variable- and clause-adjacent half-edges. Note that \(\bar{\Delta }+\bar{\Delta }_U\) is subtracted since the edges on \(\mathcal {Y}\) should be matched through specific choices prescribed by \(\mathcal {Y}\).
-
(3)
In (2), we should exclude the cases in which the half-edges in \(\cup _{v\in V(\mathcal {Y}) } \delta v \setminus E_c(\mathcal {Y})\) are matched with the boundary half-edges of T. The probability of avoiding such an event is given by the third term. For future use, we define
$$\begin{aligned} b_1 (g,\epsilon ,\Delta ,U) \equiv \frac{ (\dot{M}(\dot{g}-\dot{\epsilon }) -\bar{\Delta } -\bar{g}^{\underline{\tau }_{\mathscr {U}} } )_{\dot{M}\dot{\Delta }_\partial -\bar{\Delta } -\bar{\Delta }_U} }{ (\dot{M}(\dot{g}-\dot{\epsilon })- \bar{\Delta } -\bar{\Delta }_U )_{\dot{M}\dot{\Delta }_\partial -\bar{\Delta } -\bar{\Delta }_U} } \end{aligned}$$
-
(4)
The last term denotes the product of variable, clause and edge factors in \(G^\partial \).
Then, we compare \(\Xi _c(g,\epsilon ,\Delta ,U| \underline{\texttt {L}}_E)\) and \(\Xi (g|\underline{\texttt {L}}_{\tilde{E}})\) for g satisfying \(||g-g_\star || \le \sqrt{n}\log ^2 n\), where we write \(g_\star \equiv g^\star _{\lambda ,L}\). Note that in such a setting, \(\underline{\texttt {L}}_{\tilde{E}}\) and \(\underline{\texttt {L}}_E\) should differ by \(|\hat{\epsilon }^{\underline{\texttt {L}}}|\) for each \(\underline{\texttt {L}}\in \{0,1\}^k\). Moreover, set \(\hat{g}=\sum _{\underline{\texttt {L}}} \hat{g}^{\underline{\texttt {L}}}\) and \(\hat{\Delta }_\partial = \sum _{\underline{\texttt {L}}} \hat{\Delta }^{\underline{\texttt {L}}}_\partial \). We can write
where we define \(\hat{\Phi }^{\underline{\texttt {L}}} (\underline{\tau }) \equiv \hat{\Phi }^{\text {lit}}(\underline{\tau } \oplus \underline{\texttt {L}}).\) We also set
and rearrange (149) to obtain that
We define
which is the constant \(c_0\) in the statement of the lemma. Moreover, since \(n-|\dot{g}|\) and \(m-|\hat{g}|\) are both bounded by \(O((dk)^{l_0})\), we can write
and this quantity is independent of \(\underline{\tau }_{\mathscr {U}'}\).
What remains is to analyze the error terms \(b_1\) and \(b_2\). The estimate for \(b_1\) can be obtained by the following direct expansion:
On the other hand, \(b_2\) can be studied based on the same approach as Lemma 6.7 of [25]. Define \(A[g] \equiv (\dot{A}[g], \hat{A}[g], \bar{A}[g])\) and \(B[g] \equiv (\dot{B}[g], \hat{B}[g], \bar{B}[g])\) to be
We can write \(b_2\) using the above, namely,
and similarly for the terms including \(\hat{g}\) and \(\dot{M}\dot{g}\) (see the proof of Lemma 6.7 (page 480) of [25] for the precise derivation). Moreover, since the leading exponent of \(\Xi (g)\) is negative-definite at \(g_\star \), the averages \(A^{\text {avg}}\), \(B^{\text {avg}}\) defined by
satisfy the bounds \(||A^{\text {avg}}||_\infty =O(n^{-1/2})\), \(||B^{\text {avg}}||_\infty = O(n^{-1})\). Meanwhile, we can write
and similarly for the terms involving \(\hat{\epsilon }^{\underline{\texttt {L}}}\) and \(\dot{M}\dot{\epsilon }\).
One more thing to note when averaging (150) is that only a \(2^{-|\bar{\Delta }|}\) fraction of \(\underline{\texttt {L}}_E\) gives a non-zero value (as written in (150)), since the literals prescribed by \(\mathcal {Y}\) must be fixed. With this in mind, averaging (150) based on the observations (151), (152) and (153) gives the conclusion. \(\square \)
Nam, D., Sly, A. & Sohn, Y. One-Step Replica Symmetry Breaking of Random Regular NAE-SAT II. Commun. Math. Phys. 405, 61 (2024). https://doi.org/10.1007/s00220-023-04868-6