Markov capacity for factor codes with an unambiguous symbol

GUANGYUE HAN; BRIAN MARCUS; CHENGYU WU

doi:10.1017/etds.2023.103

Markov capacity for factor codes with an unambiguous symbol

Part of: Topological dynamics Communication, information

Published online by Cambridge University Press: 07 November 2023

GUANGYUE HAN ,

BRIAN MARCUS and

CHENGYU WU

Show author details

GUANGYUE HAN: Affiliation:
Department of Mathematics, The University of Hong Kong, Pok Fu Lam, Hong Kong (e-mail: ghan@hku.hk)
BRIAN MARCUS: Affiliation:
Department of Mathematics, The University of British Columbia, Vancouver, Canada (e-mail: marcus@math.ubc.ca)
CHENGYU WU*: Affiliation:
Department of Mathematics, The University of British Columbia, Vancouver, Canada (e-mail: marcus@math.ubc.ca)
*: e-mail: wuchengyu0228@gmail.com

Article contents

Abstract
Introduction
Notation and brief background from symbolic dynamics
A problem in information theory
Factor codes as channels
Factor codes with an unambiguous symbol
Characterization of the one-to-one condition for factor codes with an unambiguous symbol
Standard factor codes defined on spoke graphs
Characterization of the finite-to-one condition for standard factor codes on spoke graphs
Conjecture: properties P2 and P3 are equivalent for standard factor codes on spoke graphs
Standard factor codes defined on another class of graphs
Concluding remarks
References

Rights & Permissions

Abstract

In this paper, we first give a necessary and sufficient condition for a factor code with an unambiguous symbol to admit a subshift of finite type restricted to which it is one-to-one and onto. We then give a necessary and sufficient condition for the standard factor code on a spoke graph to admit a subshift of finite type restricted to which it is finite-to-one and onto. We also conjecture that for such a code, the finite-to-one and onto property is equivalent to the existence of a stationary Markov chain that achieves the capacity of the corresponding deterministic channel.

Keywords

symbolic dynamics factor codes finite-to-one codes Markov capacity shift of finite type

MSC classification

Primary: 37B10: Symbolic dynamics

Secondary: 94A40: Channel models (including quantum)

Type: Original Article
Information: Ergodic Theory and Dynamical Systems , First View , pp. 1 - 30

DOI: https://doi.org/10.1017/etds.2023.103 [Opens in a new window]
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2023. Published by Cambridge University Press

1 Introduction

Shifts of finite type (SFT), and more generally sofic shifts, are spaces of bi-infinite sequences that play a prominent role in symbolic dynamics. Of particular interest are factor codes (onto sliding block codes) from one such space to another, as they represent ways of encoding blocks in the domain space into blocks in the range space. However, typically, such maps are badly many-to-one. So, it would be useful to know when one can restrict to a subspace of the domain such that the code is still onto and one-to-one/finite-to-one. Consider the following properties. Given an irreducible SFT X, a sofic shift Y, and a factor code, $\phi : X \rightarrow Y$ :

P1 there exists an SFT $Z\subset X$ such that $\phi \vert _Z$ is a conjugacy onto Y;
P2 there exists an SFT $Z\subset X$ such that $\phi \vert _Z$ is finite-to-one and onto Y;
P3 there exists a stationary Markov measure $\nu $ on X such that $\phi ^*(\nu ) = \mu _0$ , the unique measure of maximal entropy (mme for short) on Y.

We are interested in finding checkable, necessary and sufficient conditions for each of these properties and in determining relationships among these properties. Clearly, property P1 implies property P2 and property P2 implies property P3 because, given property P2, any mme $\nu $ on Z satisfies property P3 (see Proposition 4.2).

A factor code $\phi :X \rightarrow Y$ can be viewed as an input-constrained, deterministic, but typically lossy channel in the information theoretic sense: an input x determines a channel output $y = \phi (x)$ . Our interest in property P3 stems from the fact that it is equivalent to the condition that the Markov capacity achieves the capacity of this channel, that is, there is an input Markov measure on X that achieves capacity (see §§3 and 4 for more details).

Since Y is the image of an irreducible shift space, it must be irreducible, and it follows that $\mu _0$ is indeed unique and fully supported on Y. However, we do not require $\nu $ to be fully supported on X.

For property P1, there are certainly some necessary conditions; for instance, if Y has a fixed point, then X must have a fixed point and Y must be an SFT.

We consider the special class of factor codes with an unambiguous symbol. This means that the alphabet of Y is $\{0,1\}$ and in the block code $\Phi $ that generates $\phi $ , there is exactly one block u such that $\Phi (u) = 1$ . In Theorem 6.1, we characterize, for this class, all such $\phi $ for which there exists a shift space $Z\subset X$ such that $\phi \vert _Z$ is a conjugacy onto Y and show that such a Z must necessarily be an SFT, that is, property P1 is satisfied. In Theorem 6.5, we give a refined version of this result when X is the full 2-shift.

Note that if a factor code $\phi $ defined on an irreducible SFT X is finite-to-one but not one-to-one itself, then property P1 is not satisfied. This follows from the fact that if property P1 is satisfied for some Z, then by [Reference Lind and MarcusLM95, Corollary 4.4.9], $h_{\mathrm {top}}(Z)<h_{\mathrm {top}}(X)$ , which contradicts [Reference Lind and MarcusLM95, Corollary 8.1.20]. For a simple example of such a $\phi $ with an unambiguous symbol, see Example 8.3.

For property P2, we recall from a counterexample [Reference Marcus, Petersen and WilliamsMPW84, pp. 287–289] that property P2 is not always satisfied. Motivated by that counterexample, we consider a subclass of factor codes with an unambiguous symbol, called standard factor codes on spoke graphs (for the definition, see §7). In Theorem 8.1, for this subclass, we characterize all such $\phi $ satisfying property P2, and we show that for any $\phi $ in this subclass, property P2 is equivalent to the existence of an SFT $Z \subset X$ , such that $\phi |_Z$ is almost invertible and onto Y.

The same counterexample in [Reference Marcus, Petersen and WilliamsMPW84, pp. 287–289] shows that for standard factor codes on spoke graphs, property P3 is not always satisfied.

We conjecture that for standard factor codes on spoke graphs, properties P3 and P2 are equivalent, that is, if there exists a stationary Markov measure $\nu $ on X such that ${\phi ^*(\nu ) = \mu _0}$ , then there exists an SFT $Z\subset X$ such that $\phi \vert _Z$ is finite-to-one and onto Y; if true, then for this class, the same characterization for property P2 holds for property P3. In Proposition 9.6, we prove this in several special cases. The proof combines the Chinese remainder theorem and a dominance condition.

We note that property P3 is related to the property that a factor code from an irreducible SFT to an irreducible SFT is Markovian, although in that case, one assumes that such $\nu $ is fully supported [Reference Boyle and TuncelBT84, Reference Boyle, Petersen, Marcus, Petersen and WeissmanBP11].

It was shown in [Reference Marcus, Petersen and WilliamsMPW84, Proposition 3.2] that property P2 always holds if we relax SFT Z to sofic Z. Similarly, it was shown in [Reference Marcus, Petersen and WilliamsMPW84, Corollary 3.3] that if we relax stationary Markov $\nu $ to stationary hidden Markov $\nu $ , then property P3 always holds.

We point the reader to a related paper which considers factor codes $\phi :X \rightarrow Y$ as deterministic channels and for a given factor code $\phi $ , characterizes those subshifts of entropy strictly less than that of Y that can be faithfully encoded through $\phi $ [Reference MacDonaldMac23].

The remainder of this paper is organized as follows. In §2, we give a brief background on symbolic dynamics, focusing on SFTs, sofic shifts, and factor codes. In §3, we describe a motivating problem from information theory. In §4, we describe factor codes as special channels in information theory (as was done in [Reference Marcus, Petersen and WilliamsMPW84]). We introduce in §5 the class of factor codes with an unambiguous symbol and, for this class, consider property P1 in §6. In §7, we introduce the subclass of standard factor codes on spoke graphs and consider property P2 for this subclass in §8. In §9, we consider property P3 for this subclass and prove Proposition 9.6. Finally, in §10, we discuss standard factor codes on another class of graphs.

2 Notation and brief background from symbolic dynamics

We introduce in this section some basic terms and facts in symbolic dynamics. For more details, see [Reference Lind and MarcusLM95].

Let $\mathcal {A}$ be a finite alphabet. The full $\mathcal {A}$ -shift, denoted by $\mathcal {A}^{\mathbb {Z}}$ , is the collection of all bi-infinite sequences over $\mathcal {A}$ . When $\mathcal {A}=\{0,1, \ldots , n-1\}$ , the full shift is called the full n-shift and will be denoted by $X_{[n]}$ . For any point $x=\cdots x_{-1} x_0 x_1 \cdots \in \mathcal {A}^{\mathbb {Z}}$ , we use $x_i$ to denote the ith coordinate of x and $x_{[i,j]}$ to denote the block $x_ix_{i+1} \ldots x_j$ . For a block $x_1\ldots x_m$ , we use $(x_1 \ldots x_m)^k$ to denote its k-concatenation and $(x_1 \ldots x_m)^\infty $ to mean its infinite concatenation. The shift map $\sigma $ on $\mathcal {A}^{\mathbb {Z}}$ is defined by $(\sigma (x))_i=x_{i+1}$ for any $x\in \mathcal {A}^{\mathbb {Z}}$ . A subset of $\mathcal {A}^{\mathbb {Z}}$ is a shift space if it is compact and is invariant under $\sigma $ . For any positive integer m and a shift space X, we use $\mathcal {B}_m(X)$ to denote the set of all allowed blocks of length m in X, and $\mathcal {B}(X) := \bigcup _{n} \mathcal {B}_n (X)$ is called the language of X. The Nth higher block shift of X is the image $\beta _N(X)$ in the full shift over $\mathcal {A}^N$ , where $\beta _N: X \rightarrow (\mathcal {A}^N)^{\mathbb {Z}}$ is defined by $(\beta _N(x))_i= x_{[i,i+N-1]}$ for any $x\in X$ . A shift space X is irreducible if for any $u, v\in \mathcal {B}(X)$ , there is a $w\in \mathcal {B}(X)$ such that $uwv\in \mathcal {B}(X)$ .

Let $\mathcal {A}_1$ , $\mathcal {A}_2$ be two alphabets, $s,t$ be two fixed integers, and let X be a shift space over $\mathcal {A}_1$ . The map $\phi : X \rightarrow {\mathcal {A}_2}^{\mathbb {Z}}$ defined by $\phi (x)_i=\phi (x_{[i-s, i+t]})$ for any i is called a sliding block code with anticipation t and memory s. A sliding block code $\phi : X \rightarrow Y$ is finite-to-one if there is an integer M such that $\vert \phi ^{-1}(y) \vert \leq M$ for every $y\in Y$ , and it is one-to-one when $M=1$ . Moreover, the sliding block code $\phi : X\rightarrow Y$ is a factor code if it is onto, in which case Y will be called the factor of X, and $\phi $ is a conjugacy if it is one-to-one and onto.

A point diamond for $\phi $ is a pair of distinct points in X that differ in finitely many coordinates and have the same image under $\phi $ . If X is irreducible, then $\phi $ is finite-to-one if and only if it has no point diamonds [Reference Lind and MarcusLM95, Theorem 8.1.16].

Let G be a directed graph with no multiple edges. For a path $\gamma $ in G, $V(\gamma )$ denotes the sequence of vertices of $\gamma $ and $\vert \gamma \vert $ is the length, that is, the number of edges, of $\gamma $ (for example, for $\gamma =e_1 e_2 \ldots e_n$ , $V(\gamma )=I(e_1) I(e_2) \ldots I(e_n) T(e_n)$ and $\vert \gamma \vert =n$ , where for any i, $I(e_i)$ and $T(e_i)$ denote the initial vertex and the terminal vertex of $e_i$ , respectively). We use $\mathcal {V}(G)$ to denote the vertex set of G and $\widehat {X_G}$ to denote the vertex shift induced by G. That is, the shift space whose points are sequences of vertices of bi-infinite paths in G. Let $\Phi : \mathcal {V}(G) \rightarrow \mathcal {A}$ be a labeling of vertices of G over a finite alphabet $\mathcal {A}$ . A graph diamond of $\Phi $ is a pair of distinct paths in G that have the same initial vertex, terminal vertex, and label. It is well known that, assuming G is irreducible, the factor code generated by $\Phi $ is finite-to-one if and only if $\Phi $ has no graph diamonds [Reference Lind and MarcusLM95, §8.1].

A shift space X can be expressed as $X = X_{\mathcal {F}}$ where $\mathcal {F}$ is a forbidden set, a list of forbidden words such that $x \in X$ if and only if x contains no element of $\mathcal {F}$ . The choice of the forbidden set of X is in general not unique. When $X=X_{\mathcal {F}}$ for some finite set $\mathcal {F}$ , X is called an SFT. An SFT X is called M-step (or has memory M) if $X=X_{\mathcal {F}}$ for a collection $\mathcal {F}$ of $(M+1)$ -blocks. A vertex shift is always a $1$ -step SFT and conversely, by lifting to its $(M+1)$ th higher block shift, an M-step SFT can always be represented as the vertex shift of a graph. A shift space Y is sofic if there exist an SFT X and a sliding block code $\phi $ such that $\phi (X)=Y$ . Clearly, SFTs must be sofic.

There is a general definition of the degree of a factor code on any subshift, see [Reference Lind and MarcusLM95, Definition 9.1.2]. For our purposes, we focus only on the following equivalent definition of the degree of a 1-block finite-to-one factor code $\phi : X\rightarrow Y$ , where X is an irreducible M-step SFT X: let $N:=\max \{1, M\}$ . The degree of $\phi $ is defined as the minimum over all blocks $w=w_1 w_2 \ldots w_{\vert w \vert }$ in Y and all $1\leq i\leq \vert w\vert -N+1$ of the number of distinct N-blocks in X that we see beginning at coordinate i among all the pre-images of w [Reference Lind and MarcusLM95, Proposition 9.1.12]. A word w that achieves the minimum above with some coordinate i is called a magic word, and the subblock $w_{i}w_{i+1}\ldots w_{i+N-1}$ is called the corresponding magic block.

A factor code $\phi $ is almost invertible if its degree is $1$ . While an almost invertible code need not be finite-to-one, on an irreducible SFT, it must be finite-to-one [Reference Lind and MarcusLM95, Proposition 9.2.2].

The topological entropy of a shift space X is

$$ \begin{align*}h_{\mathrm{top}} (X) := \lim_{m\rightarrow \infty} \frac{1}{m} \log \vert \mathcal{B}_m (X) \vert.\end{align*} $$

For a probability measure $\mu $ on X, let $h(\mu )$ denote its measure theoretic entropy. By the variational principle [Reference WaltersWal82, Theorem 8.6],

(1)

$$ \begin{align} h_{\mathrm{top}} (X) = \sup_\mu \{h(\mu): \mu \mbox{ is a shift-invariant Borel probability measure on }X \}. \end{align} $$

An mme $\mu _0$ of X is a probability measure on X such that the supremum in equation (1) is achieved.

Given $S\subset \mathbb {Z}_{\geq 0}$ , an S-gap shift $X(S)$ is a subshift of $X_{[2]}$ such that any $x\in X(S)$ is a concatenation of blocks of the form $0^s1$ with $s\in S$ , where points with infinitely many $0$ s to both sides are allowed when S is infinite. Let $\unicode{x3bb} $ be the unique positive solution to $\sum _{m\in S} x^{-m-1}=1$ . Then $h_{\mathrm {top}}(X(S))=\log \unicode{x3bb} $ [Reference Dastjerdi and JangjooDJ12], and the unique mme $\mu _0$ of $X(S)$ is determined by

$$ \begin{align*} \frac{\mu_0(X_{0} X_{1} \ldots X_{i+1} = 10^i 1)}{\mu_0(X_{0}=1)}= \unicode{x3bb}^{-i-1} \quad \mbox{for any }i\in S \end{align*} $$

and

$$ \begin{align*} \mu_0 (X_1 \ldots X_n&=x_1 \ldots x_n \vert X_{-m} \ldots X_{-1} X_0 =x_{-m} \ldots x_{-1}1)\\ &= \mu_0 (X_1 X_2 \ldots X_n=x_1\ldots x_n \vert X_0=1) \end{align*} $$

for any $m,n$ , and any allowed block $x_{-m}\ldots x_{-1}1x_1\ldots x_n$ [Reference García-Ramos and PavlovGP19, Corollary 3.9].

It has been proven in [Reference Dastjerdi and JangjooDJ12] that $X(S)$ is an SFT if and only if S is finite or cofinite. Indeed, the forbidden set of $X(S)$ is

(2)

$$ \begin{align} \mathcal{F}= \begin{cases} \{10^m1: m\in \{0,1,2,\ldots, \max{S}\} \setminus S\}\cup \{0^{1+\max{S}}\} \quad &\mbox{when }S\mbox{ is finite},\\ \{10^m1: m\in \mathbb{Z}_{\geq 0} \setminus S\} \quad &\mbox{when }S\mbox{ is cofinite,} \end{cases} \end{align} $$

which will be called the standard forbidden set of $X(S)$ in this paper.

3 A problem in information theory

A central object in information theory is a discrete channel. Here, there is a space X of input sequences, a space Y of output sequences, each over a finite alphabet, and for each $x \in X$ , a probability measure $\unicode{x3bb} _x$ on Y which gives the distribution of outputs, given that x was transmitted. One assumes that the map $x \mapsto \unicode{x3bb} _x$ is at least measurable and the channel is stationary in the sense that $\unicode{x3bb} _{\sigma x} = \sigma ^*\unicode{x3bb} _x$ , where $\sigma $ is the left shift defined on X and $\sigma ^*$ is the induced shift for measures.

Typically, X and Y are full shifts and in the simplest case, that of a discrete memoryless channel, $\unicode{x3bb} _x(y_1 \ldots y_n) = \Pi _{i=1}^n p(y_i|x_i)$ ; here, for each element a of the alphabet of X, $p(\cdot |a)$ is a probability distribution on the alphabet of Y; the channel is memoryless in the sense that conditioned on the input $x_i$ , the output $y_i$ is independent of all other inputs. For example, the binary symmetric channel (BSC) is the memoryless channel where X and Y are the full 2-shift and

$$ \begin{align*} p(b|a) = \begin{cases} \epsilon, & b \ne a, \\ 1- \epsilon, & b = a. \end{cases} \end{align*} $$

Here, $\epsilon $ is a parameter, known as the crossover probability.

Given a stationary (that is, shift invariant) input measure $\nu $ on X, one defines the stationary output measure $\kappa (\nu )$ on Y by $\kappa (\nu ) = \int \unicode{x3bb} _x\, d\nu $ . The mutual information of $\kappa (\nu )$ and $\nu $ is defined as

$$ \begin{align*} I(\kappa(\nu),\nu) = h(\kappa(\nu)) - h(\kappa(\nu) | \nu) = h(\nu) - h(\nu | \kappa(\nu)), \end{align*} $$

where $h(\cdot )$ denotes entropy and $h(\cdot | \cdot )$ denotes conditional entropy (the second equality follows from the chain rule for entropy, which is a fundamental equality in information theory); in information theory, shift-invariant measures are viewed as stationary processes and these entropies are often referred to as entropy rates.

There are several notions of channel capacity, which all agree under relatively mild assumptions. The stationary capacity (capacity for short) of a discrete noisy channel is defined as

$$ \begin{align*} \textit{Cap} = \sup_{ \mbox{ stationary }\nu} I(\kappa(\nu),\nu). \end{align*} $$

For a discrete memoryless channel, the capacity can be computed effectively because it agrees with the $\sup $ when restricted only to independent and identically distributed (that is, stationary Bernoulli) measures, turning it into a finite dimensional optimization problem, and, while there is no known closed form expression for capacity in general, the optimum can be effectively approximated by the well-known Blahut–Arimoto algorithm [Reference ArimotoAri72, Reference BlahutBla72].

We define the kth-order Markov capacity by

$$ \begin{align*} {\textit{Cap}}_k = \sup_{ \mbox{ stationary }k\mbox{th-order Markov }\nu} I(\kappa(\nu),\nu). \end{align*} $$

We are interested in the problem: when does Markov capacity achieve capacity, that is, when does $\textit {Cap}_k = \textit {Cap}$ for some k?

It is known, using the ergodic decomposition, that under mild assumptions, $\textit {Cap}$ (respectively, $\textit {Cap}_k$ ) coincides with the maximum mutual information over all stationary, ergodic input measures (respectively, stationary, irreducible, kth-order Markov input measures) [Reference FeinsteinFei59, Reference GrayGra11].

Again, with mild assumptions on the channel, one shows that $\lim _{k \rightarrow \infty } \textit {Cap}_k = \textit {Cap}$ [Reference Chen and SiegelCS08]; informally, ‘Markov capacity asymptotically achieves capacity.’ This is important because for fixed k, computation of $\textit {Cap}_k$ is a finite-dimensional optimization problem. According to the discussion above, for discrete memoryless channels, $\textit {Cap}_0 = \textit {Cap}$ ; informally, ‘Bernoulli capacity achieves capacity.’ However, for channels with memory, even just one step of memory, except in certain cases such as input-constrained noiseless channels below, it is believed that $\textit {Cap}_k \ne \textit {Cap}$ for all k. However, we are not aware of any such result.

If X is not a full shift, then the channel is called input-constrained. Typically, the input constraint X is an SFT or sofic shift. Such a shift space can be considered a noiseless channel in itself, in a trivial way: $Y = X$ and for each $x \in X$ , $\unicode{x3bb} _x = \delta _x$ , the point mass on $\{x\}$ . The capacity of this channel is easily seen to be the topological entropy, $h_{\mathrm {top}}(X)$ , otherwise known as the noiseless capacity, which can be easily computed.

Now, consider the input-constrained binary symmetric channel. This is the BSC, where the inputs are required to belong to a given SFT or sofic shift X over $\{0,1\}$ . While the capacity of the BSC and the noiseless capacity of X are known explicitly, the capacity of the X-constrained BSC is not known. And while Markov capacity asymptotically achieves capacity of this channel, it is believed that Markov capacity does not achieve capacity, i.e. for all k, $\textit {Cap}_k \ne \textit {Cap}$ . However, this has not been proven.

4 Factor codes as channels

This brings us to a main point of our paper: for a class of channels, albeit rather simple in practice, we can rigorously decide whether or not Markov capacity achieves capacity. An example of this was given in [Reference Marcus, Petersen and WilliamsMPW84, pp. 287–289]. Specifically, we view a factor code $\phi : X \rightarrow Y$ as an input-constrained, deterministic channel; here, $\unicode{x3bb} _x = \delta _{\phi (x)}$ , so the input determines the output uniquely. Intuitively, for this channel, input sequences are distorted in a deterministic way. It follows that, in this case, for any invariant input measure $\nu $ , $h(\kappa (\nu )|\nu ) = h(\phi ^*(\nu )|\nu ) = 0$ , where $\phi ^*$ is the induced map (of $\phi $ ) on stationary measures on X. So

$$ \begin{align*} \textit{Cap} = \sup_{ \mbox{ stationary }\nu} h(\phi^*(\nu)). \end{align*} $$

According to [Reference Marcus, Petersen and WilliamsMPW84, Corollary 3.2], there exists a stationary input measure $\nu $ (in fact, a stationary hidden Markov input measure) such that $\phi ^*(\nu ) = \mu _0$ , the unique mme on Y. Thus, by the variational principle [Reference WaltersWal82, Theorem 8.6], $\textit {Cap} = h_{\mathrm {top}}(Y)$ (an alternative to this argument is to show that the map $\nu \mapsto \phi ^*(\nu )$ is onto the set of all stationary measures on Y: given stationary $\mu $ on Y, use the Hahn–Banach theorem to find a not necessarily stationary $\nu '$ on X such that $\phi ^*(\nu ') = \mu $ and let $\nu $ be any weak limit point of the sequence $(1/n)(\nu ' +\sigma \nu ' + \cdots + \sigma ^{n-1}\nu ')$ ).

In summary, we have the following proposition.

Proposition 4.1. Let $\phi :X \rightarrow Y$ be a factor code from an irreducible SFT X to a sofic shift Y. Let $\mu _0$ be the unique measure of maximal entropy on Y. For the input-constrained, deterministic channel defined by $\phi $ :

(1) $\textit {Cap}$ (respectively, $\textit {Cap}_k$ ) coincides with the maximum mutual information over all stationary, ergodic input measures (respectively, stationary, irreducible, kth-order Markov input measures);
(2) $\lim _{k \rightarrow \infty } \textit {Cap}_k = \textit {Cap};$
(3) $\textit {Cap}= h_{\mathrm {top}}(Y);$
(4) a stationary measure $\nu $ on X achieves $\textit {Cap}$ if and only if $\phi ^*(\nu ) = \mu _0$ if and only if $h(\phi ^*(\nu )) = h_{\mathrm {top}}(Y).$

The following simple result gives a relation between properties P2 and P3.

Proposition 4.2. With the same assumptions as in Proposition 4.1, if there is an SFT $Z\subset X$ such that $\phi \vert _Z$ is finite-to-one and onto Y, then there is an irreducible stationary Markov measure $\nu $ on Z of order at most the memory of Z such that $\phi ^*(\nu )=\mu _0$ .

Proof. Let $\nu $ be the unique mme of any irreducible component of Z with maximum topological entropy. It is stationary, irreducible, and Markov. Since $\phi \vert _Z$ is finite-to-one and onto Y,

$$ \begin{align*} h(\phi^*(\nu))=h(\nu) =h_{\mathrm{top}} (\mathrm{supp}~ \nu) =h_{\mathrm{top}}(Z) = h_{\mathrm{top}}(Y). \end{align*} $$

Since $\mu _0$ is the unique mme on Y, we have $\phi ^*(\nu )=\mu _0$ .

Proposition 4.3. Let $\phi :X \rightarrow Y$ be a factor code from an irreducible SFT X to a sofic shift Y. Let $\nu $ be an irreducible stationary Markov measure on X and assume that $\phi ^*(\nu ) = \mu _0$ , the unique mme on Y (in particular, Markov capacity achieves capacity of the input-constrained deterministic channel determined by $\phi $ ).

The following are equivalent:

(1) $\phi |_{\mathrm {supp(\nu )}}$ is finite-to-one and onto;
(2) $h_{\mathrm {top}}(\mathrm {supp}(\nu )) = h_{\mathrm {top}}(Y)$ ;
(3) $h(\nu ) = h_{\mathrm {top}}(Y)$ ;
(4) for every periodic point in $\mathrm {supp}(\nu )$ , the weight per symbol, for $\nu $ , is $e^{-h_{\mathrm {top}}(Y)}$ (the weight per symbol of a periodic point $(p_0 \ldots p_{n-1})^\infty $ for a kth-order Markov measure $\nu $ on X is defined to be $\nu (p_0 \ldots p_{n-1} | p_{-k} \ldots p_{-1})^{1/n}$ ).

Proof. (1) $\Rightarrow $ (2): This follows directly from [Reference Lind and MarcusLM95, Corollary 8.1.2].

(2) $\Rightarrow $ (3):

$$ \begin{align*} h_{\mathrm{top}}(Y)= h_{\mathrm{top}}(\mathrm{supp}(\nu)) \ge h(\nu) \ge h(\phi^*(\nu)) = h(\mu_0) = h_{\mathrm{top}}(Y). \end{align*} $$

This yields item (3).

(3) $\Rightarrow $ (1): Apply [Reference ParryPar97, Theorem 2].

((2) and (3)) $\Rightarrow $ (4): The condition that for some $c \ge 0$ , for every periodic point in $\mathrm {supp}(\nu )$ , the $\nu $ -weight per symbol is $e^{-c}$ , is equivalent to the condition that $h(\nu ) = c$ and that $\nu $ is an mme for $\mathrm {supp}(\nu )$ . This is essentially contained in [Reference Petersen, Quas and ShinPT82, Proposition 44].

It follows from Propositions 4.2 and 4.3 that property P2 holds if and only if property P3 holds with a measure $\nu $ that is also irreducible stationary Markov and satisfies any of the equivalent conditions in Proposition 4.3. We will return to this point in §9.

5 Factor codes with an unambiguous symbol

We begin with a brief introduction to factor codes with an unambiguous symbol. Such factor codes are also known as factor codes with a singleton clump [Reference Parry and TuncelPQS03].

Let X be a shift space over an alphabet $\mathcal {A}$ and $D=b_1 b_2 \ldots b_k$ be an allowed block in X. Define $\Phi : \mathcal {A}^k\rightarrow \{0,1\}$ by

(3)

$$ \begin{align} \Phi(x_{[1,k]})= \begin{cases} 1& \mbox{if } x_{[1,k]}=D, \\ 0 & \mbox{otherwise}. \end{cases} \end{align} $$

Then, the factor code $\phi : X \rightarrow Y\subset X_{[2]}$ induced by $\Phi $ is called a factor code with an unambiguous symbol. Here, Y is the image of $\phi $ .

In the remainder of this paper, we focus on the case when X is an irreducible SFT. Note that in this case, by passing to a higher block shift, in the preceding definition, we can and sometimes will assume that $k=1$ and that X is an SFT with memory 1.

The following propositions give some properties of Y.

Proposition 5.1. Let $\phi : X\rightarrow Y$ be a factor code with an unambiguous symbol. Then Y is an S-gap shift.

Proof. The elements of Y are arbitrary concatenations of strings of the form $10^s$ with $s\in S$ such that there exists some allowed block w of length $k+s+1$ satisfying the following:

(1) $w_{[1,k]}=D$ ;
(2) $w_{[s+2,s+k+1]}=D$ ;
(3) for all $2\leq i\leq s+1$ , $w_{[i,k+i-1]}\neq D$ .

Hence, Y is an S-gap shift.

Proposition 5.2. Let $\phi : X\rightarrow Y$ be a factor code with an unambiguous symbol. If $X= X_{[2]}$ , then:

(1) $10^{k-1}1$ is not allowed in Y if and only if D is purely periodic (that is, $D = u^{\ell }$ for some $\ell \ge 2$ and some block u);
(2) for any $j\geq k$ , $10^j1$ is allowed in Y.

Proof. To prove item (1), first observe that $10^{k-1}1$ is allowed if and only if the image of $DD$ is $10^{k-1}1$ . If $10^{k-1}1$ is not allowed, then the image of $DD$ has a prefix of the form $10^c1$ for some $0\le c \le k-2$ . Let $d = c+1 \le k-1$ . Then for all $0\le i \le k-1$ , $b_i = b_{i+d}$ (here and below in this proof, subscripts are read modulo k). It follows that for all integers $m,n$ and all $0\le i \le k-1$ , $b_i = b_{i+md+nk}$ . Let $e = \gcd (d,k)$ . Then $e = md+nk$ for some $m,n$ . Thus, for all $0\le i \le k-1$ , $b_i = b_{i+e}$ . It follows that $D = b_1 \ldots b_k= (b_1\ldots b_e)^{k/e}$ . Since $e < k$ , $k/e \ge 2$ . So, D is purely periodic.

Conversely, assume that D is purely periodic. Then the image of the block $DD$ is not $10^{k-1}1$ and so $10^{k-1}1$ is not allowed.

We now prove item (2). For $j\geq k$ , we show $10^j1$ is allowed in Y by finding a binary block $x_1x_2\ldots x_{j-k+1}$ such that

(4)

$$ \begin{align} \Phi(b_1 \ldots b_k x_1 x_2\ldots x_{j-k+1} b_1 \ldots b_k) =10^j1. \end{align} $$

If $b_1 \ldots b_k=0^k$ , then one immediately verifies that $\Phi (b_1\ldots b_k 1^{j-k+1} b_1\ldots b_k)=10^j1$ . By reversing the roles of $0$ and $1$ in the domain, a similar argument works when $b_1 \ldots b_k=1^k$ .

Now assume that $b_1\ldots b_k\neq 0^k$ and $b_1\ldots b_k\neq 1^k$ . Express $b_1 \ldots b_k$ uniquely by

(5)

$$ \begin{align} b_1b_2\ldots b_k =(b_1 \ldots b_m)^s b_1 \ldots b_t \quad (m\geq 2, s\geq 1, 0\leq t<m), \end{align} $$

where $ms+t=k$ and m is the smallest positive integer such that $b_1 \ldots b_k$ can be expressed by equation (5). We consider the following two cases.

Case 1: $j-k+1\geq m$ . In this case, we claim that equation (4) is satisfied by letting $x_1x_2\ldots x_{j-k+1}=1^{j-k+1}$ . To see this, assume to the contrary that

$$ \begin{align*} \Phi((b_1 \ldots b_m)^s b_1 \ldots b_t 1^{j-k+1} (b_1 \ldots b_m)^s b_1 \ldots b_t) \neq 10^j1. \end{align*} $$

This means that there is an extra $1$ in addition to the two $1$ s at the first and the last position in the image. Hence, there is an extra $b_1 \ldots b_k$ in the input in addition to the two at the initial and tail end (these two $b_1 \ldots b_k$ terms will be called the head and the tail, respectively). Since $x_1 \ldots x_{j-k-1} =1^{j-k+1}$ and $b_1 \ldots b_k\neq 1^k$ , this extra $b_1 \ldots b_k$ must start with some $b_1 \ldots b_t$ in the head or end with some $b_1 \ldots b_t$ in the tail. Thus, it must intersect the ‘intermediate’ subblock $x_1 \ldots x_{j-k+1}$ in at least m bits. Therefore, either

(6)

$$ \begin{align} x_1 x_2 \ldots x_m= b_{t+1} \ldots b_m b_1 \ldots b_t \end{align} $$

(7)

$$ \begin{align} x_{j-k-m+2}\ldots x_{j-k+1}=b_1 \ldots b_m. \end{align} $$

Recalling that $x_1 \ldots x_{j-m+1}=1^{j-k+1}$ , either equation (6) or equation (7) implies $b_1 b_2 \ldots b_k=1^k$ , which is a contradiction.

Case 2: $1\leq j-k+1 < m$ . In this case, an extra $b_1 \ldots b_k$ in the input must intersect the head, the tail, and the ‘intermediate’ subblock $x_1 x_2\ldots x_{j-k+1}$ simultaneously. Thus, this extra $b_1 \ldots b_k$ must start with some $b_1 \ldots b_t$ in the head and end with some $b_1 \ldots b_t$ in the tail. Therefore, equation (4) holds as long as

(8)

$$ \begin{align} \begin{cases} x_1 \neq b_{t+1} \quad \mbox{and}\quad x_{j-k+1} \neq b_m \quad &\mbox{if }j-k>0,\\ x_1 \neq b_{t+1} &\mbox{if }j-k=0, \end{cases} \end{align} $$

which is always possible for some binary $x_1x_2 \ldots x_{j-k+1}$ .

6 Characterization of the one-to-one condition for factor codes with an unambiguous symbol

In this section, we address property P1 for factor codes with an unambiguous symbol. Through this section, a factor code with an unambiguous symbol always refers to the one induced by $\Phi $ in equation (3) unless otherwise specified.

We have the following theorem which characterizes the existence of a subshift of finite type, on which the restriction of $\phi $ is one-to-one and onto.

Theorem 6.1. Let $\phi : X\rightarrow Y$ be a factor code with an unambiguous symbol defined on an irreducible shift space X. Let S be such that Y is an S-gap shift. Then, there is a shift space $Z\subset X$ such that $\phi \vert _Z$ is a conjugacy from Z onto Y if and only if either of the following conditions holds:

(C1) S is a finite set;
(C2) there is a fixed point (that is, fixed via the shift) in X other than $D^\infty $ .

Moreover, Z and Y must be SFTs if either condition (C1) or (C2) holds.

(Note: $D^{\infty }$ may or may not be in X and even if $D^\infty \in X$ , it may or may not be a fixed point.)

Remark 6.2. Note to say that S is finite means that there exists some M such that every allowed block in X of length M contains D as a subblock. Sometimes, one says that in such a case, D is a ‘Rome’.

Remark 6.3. According to Proposition 4.2, when condition (C1) or (C2) holds, the capacity of the deterministic channel, defined by $\phi $ , is achieved by a Markov chain.

Proof of Theorem 6.1

Only if part: If S is finite, we are done. So assume that S is infinite. Then $0^\infty \in Y$ . Since there exists a shift space $Z\subset X$ such that $\phi \vert _Z$ is a conjugacy from Z onto Y, Z must have a fixed point z such that $\phi (z)=0^\infty $ . Finally, noting that $D^\infty \notin X$ or $\phi (D^\infty ) \neq 0^\infty $ , we conclude that z must be different from $D^\infty $ .

If part: Assume condition (C2) of the theorem. Up to recoding, we may assume that X is a (1-step) vertex shift $\widehat {X_G}$ , D is a vertex of the graph G, and there is a vertex A in G such that A is distinct from D and G has a self-loop $\tau $ at A. Using irreducibility of X, there are paths in G, $\beta ^+$ from D to A and $\beta ^-$ from A to D, neither of which contains D in its interior. Let $N: =\vert \beta ^+\beta ^-\vert -1$ .

Now Y is a gap shift with gap set of the form $S:= F \cup \{N, N+1, \ldots \}$ , where each element of F is less than N. For each $s\in S$ , choose $\pi ^s$ to be a first-return cycle of length s from D to itself (‘first-return’ means that it does not contain D in its interior). We will assume that for $s\geq N$ , we choose $\pi ^s=\beta ^+ \tau ^{s-N} \beta ^-$ . For $y\in Y$ , let $O_y:= \{j\in \mathbb {Z}: y_j=1\}$ and define $\eta : Y \rightarrow X$ as follows:

(D1) if $i\in O_y$ , define $(\eta (y))_i=D$ ;
(D2) if $j,j'\in O_y$ and $\{l\in \mathbb {Z}: j<l<j'\}\subset O_y^c$ , define $(\eta (y))_{[j,j']}=V(\pi ^{j'-j})$ ;
(D3) if $O_y$ has a maximum element s, define $(\eta (y))_{[s,\infty )}=V(\beta ^+\tau ^{\infty })$ ;
(D4) if $O_y$ has a minimum element s, define $(\eta (y))_{(-\infty , s]}=V(\tau ^{\infty } \beta ^-)$ ;
(D5) if $O_y=\emptyset $ , define $\eta (y)=A^\infty $ .

Observe that $\eta $ is injective because if $y,y'\in Y$ and $y\neq y'$ , then for some i, without loss of generality, we assume $y_i=1$ and $y_i'=0$ , and so $(\eta (y))_i=D$ and ${(\eta (y'))_i\neq D}$ . Furthermore, we claim that $\eta $ is a sliding block code. To see this, note that $\eta $ is shift-invariant by virtue of its definition, and $(\eta (y))_i$ is a function of $y_{[-N+i, N+i]}$ .

So, $\eta $ is an injective sliding block code from Y into $X=\widehat {X_G}$ . Let Z be its image. Then, $\eta ^{-1}$ is a bijective sliding block code from Z onto Y. Moreover, by the construction of $\eta $ , for every $y\in Y$ ,

(9)

$$ \begin{align} \phi \circ \eta(y) =y. \end{align} $$

It follows that $\eta ^{-1} = \phi \vert _Z$ . This completes the proof of the if part assuming condition (C2).

Now assume condition (C1). The proof follows along the same lines except that the definition of $\eta $ is even easier: $S=F$ is a finite set, and we only need the first two cases, definitions (D1) and (D2), of the definition of $\eta $ because for any $y\in Y$ , $O_y$ is a non-empty set with no maximum and no minimum.

Finally, we show that Y must be an SFT (and thus Z must also be an SFT) when condition (C1) or (C2) holds. To see this, first note that an S-gap shift is an SFT if and only if S is either finite or cofinite [Reference Dastjerdi and JangjooDJ12]. If condition (C1) holds, there is nothing to prove. If condition (C2) holds, then the proof of the ‘if part’ above in particular shows that Y is an S-gap shift with $S:=F\cup \{N, N+1, \ldots \}$ , where N is a positive integer and F is a finite subset of non-negative integers. Thus, S is cofinite and therefore Y is an SFT.

Example 6.4. Let $\mathcal {F}_1=\{111\}$ , $X=X_{\mathcal {F}_1}$ , and $\Phi : \{0,1\}^4 \rightarrow \{0,1\}$ be a $4$ -block code defined by

$$ \begin{align*} \Phi(x_{[1,4]})= \begin{cases} 1 \quad \mbox{if } x_{[1,4]}=1010,\\ 0 \quad \mbox{otherwise}. \end{cases} \end{align*} $$

We let $\phi : X\rightarrow Y$ be the factor code with an unambiguous symbol induced by $\Phi $ . According to Proposition 5.1, Y is an S-gap shift. Applying a similar argument as in the proof of Proposition 5.2 to $\phi $ , one can verify that $3\notin S$ and $\{4,5,6, 7\ldots \}\subset S$ . Furthermore, a direct examination gives

$$ \begin{align*}0\notin S, \quad 1\in S \quad \mbox{and} \quad 2\notin S.\end{align*} $$

Thus, Y is an S-gap shift with $S=\{1,4,5,6,7\ldots \}$ . Equivalently, Y is an SFT with the forbidden set $\mathcal {F}=\{11,1001,10001\}$ . Moreover, since $0^\infty \in X$ , condition (C2) is satisfied and we conclude from Theorem 6.1 that there is an SFT $Z\subset X$ such that $\phi \vert _Z$ is a conjugacy from Z to Y.

When the domain of $\phi $ is $X_{[2]}$ , then condition (C2) in Theorem 6.1 holds and there is always an SFT $Z \subset X$ to which the restriction of $\phi $ is one-to-one and onto Y. Note that Y must be an S-gap shift with S cofinite. Our next result gives an explicit description of Z for some special cases.

Theorem 6.5. Let $\phi :X=X_{[2]} \rightarrow Y$ be a factor code with an unambiguous symbol, $\mathcal {F}$ be the standard forbidden set of Y, and $\overline {\mathcal {F}}$ be the bitwise complement of $\mathcal {F}$ . Then, the following are equivalent:

(1) at least one of the symbols from $\{0,1\}$ occurs at most once in D;
(2) either $\phi \vert _{X_{\mathcal {F}}}$ or $\phi \vert _{X_{\overline {\mathcal {F}}}}$ is one-to-one and onto Y;
(3) either $\phi \vert _{X_{\mathcal {F}}}$ or $\phi \vert _{X_{\overline {\mathcal {F}}}}$ is finite-to-one and onto Y;
(4) either $\phi \vert _{X_{\mathcal {F}}}$ or $\phi \vert _{X_{\overline {\mathcal {F}}}}$ is onto Y.

(Note: When item (1) holds, $\phi \vert _{X_{\mathcal {F}}}$ and $\phi \vert _{X_{\overline {\mathcal {F}}}}$ may not both satisfy item (2) (respectively, items (3) and (4)). For example, suppose $k=4$ and $D=b_1 b_2 b_3 b_4=0000$ . Then, one verifies that $\phi \vert _{X_{\mathcal {F}}}$ is one-to-one and onto, but $\phi \vert _{X_{\overline {\mathcal {F}}}}$ is not. See Example 6.6 for more details.)

Proof. When $k=1$ , $Y=X=X_{[2]}$ and $\phi $ is trivially a conjugacy. Hence, we assume ${k\ge 2}$ throughout the remainder of the proof.

(1) $\Rightarrow $ (2): We consider the following two cases.

Case 1: $b_1 \ldots b_k=0^{k}$ or $b_1 \ldots b_k=1^k$ . Assume $b_1 \ldots b_k=0^{k}$ . Then, Y is an S-gap shift with $S=\{0, k, k+1, \ldots \}$ . Equivalently, Y is an SFT with forbidden set

$$ \begin{align*}\mathcal{F}=\{101,1001, \ldots, 10^{k-1}1\}.\end{align*} $$

Note that any $y\in Y$ can be uniquely expressed by $y=\cdots 1^{m_1}0^{n_1} 1^{m_2}0^{n_2} 1^{m_3} \ldots $ with $m_i\geq 1, n_i\geq k$ . Define

$$ \begin{align*}x:= \cdots 0^{m_1+k-1}1^{n_1-k+1} 0^{m_2+k-1} 1^{n_2-k+1} 0^{m_3+k-1} \ldots.\end{align*} $$

Then, $x\in X_{\mathcal {F}}$ and $\phi (x)=y$ . Hence, $\phi \vert _{X_{\mathcal {F}}}$ is onto.

We then claim that $\phi \vert _{X_{\mathcal {F}}}$ is one-to-one. To see this, consider $x, x'\in X_{\mathcal {F}}$ and $x\neq x'$ . Then, for some i, without loss of generality, we assume $x_i=1, x_i'=0.$ Now, $x_i=1$ implies $(\phi (x))_{[i, i+k-1]}=0^k$ ; however, recalling that $\mathcal {F}=\{101,1001, \ldots , 1{0}^{k-1}1\}$ , we deduce from $x^{\prime }_i=0$ that there is an $i\leq l\leq i+k-1$ such that $x^{\prime }_{[l-k+1, l]}=0^k$ and therefore $(\phi (x'))_l=1$ . Thus, $\phi (x)\neq \phi (x')$ and $\phi \vert _{X_{\mathcal {F}}}$ is one-to-one.

By reversing the roles of $0$ and $1$ in the domain, it follows that $\phi \vert _{X_{\overline {\mathcal {F}}}}: X_{\overline {\mathcal {F}}} \rightarrow X_{\mathcal {F}}$ is also one-to-one and onto when $b_1 \ldots b_k={1}^{k}$ .

Case 2: There is only one $0$ or only one $1$ in $b_1 b_2 \ldots b_k$ . We first assume that $b_j=1$ for some $1\leq j\leq k$ and $b_i=0$ for any $1\leq i\leq k$ and $i\neq j$ . Let $M:=\max \{j-1, k-j\}$ . Then Y is an S-gap shift with $S=\{M, M+1, \ldots \}$ . Equivalently, Y is an SFT with the forbidden set $\mathcal {F}=\{11, 101, \ldots , 10^{M-1} 1\}$ . Expressing any $x\in X_{\mathcal {F}}$ by

$$ \begin{align*} x=\cdots 1 0^{m_{-1}} 1 0^{m_0} 1 0^{m_1} 1 \ldots \end{align*} $$

with $m_l\geq M$ for all $l\in \mathbb {Z}$ , one directly verifies that $\phi (x) = \sigma ^{j-k} (x)$ . Thus, $\phi \vert _{X_{\mathcal {F}}}$ must be one-to-one and onto Y.

By reversing the roles of $0$ and $1$ in the domain, it follows that $\phi \vert _{X_{\overline {\mathcal {F}}}}\rightarrow X_{\mathcal {F}}$ is also one-to-one and onto when there is only one $0$ in $b_{[1,k]}$ .

(2) $\Rightarrow $ (3): Obvious.

(3) $\Rightarrow $ (4): Obvious.

(4) $\Rightarrow $ (1): We prove by way of contradiction. Suppose there are at least two $1$ s and at least two $0$ s in $b_{[1,k]}$ . Then, $k\geq 4$ and $11\in \mathcal {F}$ . We will show that both $\phi \vert _{X_{\mathcal {F}}}$ and $\phi \vert _{X_{\overline {\mathcal {F}}}}$ are not onto by finding a $y\in Y$ and two blocks $B_1\in \mathcal {F}$ and $B_2\in \overline {\mathcal {F}}$ such that any ${x\in \phi ^{-1}(y)}$ contains $B_1$ and $B_2$ . Indeed, if such a y exists, then $y\notin \phi (X_{\mathcal {F}})$ and $y\notin \phi (X_{\overline {\mathcal {F}}})$ , and therefore both $\phi \vert _{X_{\mathcal {F}}}$ and $\phi \vert _{X_{\overline {\mathcal {F}}}}$ are not onto, contradicting item (4).

We consider the following cases.

Case 1: Both $00$ and $11$ are subblocks of $b_1 b_2 \ldots b_k$ . Choose $y\in Y$ with $y_0=1$ . Then, for any $x\in \phi ^{-1}(y)$ , $x_{[-k+1,0]}=b_{[1,k]}$ . Since $11\in \mathcal {F}$ , $00\in \overline {\mathcal {F}}$ , and they are both subblocks of x, we conclude that $\phi \vert _{X_{\mathcal {F}}}$ and $\phi \vert _{X_{\overline {\mathcal {F}}}}$ are not onto.

Case 2: Neither $00$ nor $11$ is a subblock of $b_1 b_2 \ldots b_k$ . In this case, $b_1 b_2 \ldots b_k$ is a binary block with $0$ and $1$ occurring alternately. We assume without loss of generality that $b_1b_2\ldots b_k=010101\ldots $ .

If k is odd, one verifies that $b_1=b_k=0$ , $\mathcal {F}=\{10^j1: j\in \{0, 2, 3, \ldots k-2\}\}$ , and $\overline {\mathcal {F}}=\{01^j0: j\in \{0, 2, 3, \ldots , k-2\}\}$ . Consider $y\in Y$ such that $y_{[0, k]}=1 0^{k-1} 1$ . For any $x\in \phi ^{-1}(y)$ , $x_{[-k+1, k]}=(b_1 b_2 \ldots b_k)^2$ ; in particular, $x_{[-1, 2]}=b_{k-1}b_kb_1b_2=1001\in \mathcal {F}$ and $x_{[0,1]}=b_kb_1=00 \in \overline {\mathcal {F}}$ . Thus, both $\phi \vert _{X_{\mathcal {F}}}$ and $\phi \vert _{X_{\overline {\mathcal {F}}}}$ are not onto.

If k is even, $\mathcal {F}=\{10^j1: j\in \{0, 2, 3, \ldots , k-1\}\}$ and $\overline {\mathcal {F}}=\{01^j0: j\in \{0, 2, 3, \ldots , k-1\}\}$ . Consider $y\in Y$ such that $y_{[0,k+1]}=10^k1$ . Then for any $x\in \phi ^{-1}(y)$ , either $x_{[-k+1, k+1]}=b_1b_2\ldots b_k 0 b_1 b_2\ldots b_k$ or $x_{[-k+1, k+1]}=b_1b_2\ldots b_k 1 b_1 b_2\ldots b_k$ . In the former case, $x_{[0,3]}=1001\in \mathcal {F}$ and $x_{[0,1]}=00\in \overline {\mathcal {F}}$ ; in the latter case, $x_{[0,1]}=11\in \mathcal {F}$ and $x_{[-1,2]}=0110\in \overline {\mathcal {F}}$ . Therefore, $\phi \vert _{X_{\mathcal {F}}}$ and $\phi \vert _{X_{\overline {\mathcal {F}}}}$ are not onto in both cases.

Case 3: Exactly one of $00$ or $11$ is a subblock of $b_1b_2\ldots b_k$ . We assume without loss of generality that $11$ is a subblock of $b_1b_2\ldots b_k$ yet $00$ is not. If for any $2\leq j\leq k-2$ , $01^j0$ is not a subblock of $b_1b_2 \ldots b_k$ , then $b_1b_2 \ldots b_k=1^{m_1} (01)^{m_2} 1^{m_3}$ , where either $m_1\geq 2, m_2 \geq 2, m_3 \geq 0$ or $m_1\geq 0, m_2 \geq 2, m_3 \geq 1$ . In either case, one directly verifies that $11\in \mathcal {F}$ , $010\in \overline {\mathcal {F}}$ . Consider any $y\in Y$ with $y_0=1$ . Then, any $x\in \phi ^{-1} (y)$ satisfies $x_{[-k+1, 0]}= b_1 b_2 \ldots b_k$ , and therefore it contains both $11$ and $010$ . Thus, both $\phi \vert _{X_{\mathcal {F}}}$ and $\phi \vert _{X_{\overline {\mathcal {F}}}}$ are not onto.

Otherwise, there exists $2\leq j\leq k-2$ such that $01^{j}0$ is a subblock of $b_1 b_2 \ldots b_k$ . If $b_1 b_2\ldots b_{k-j-1}\neq b_{j+2}\ldots b_k$ , then $10^j1 \in \mathcal {F}$ and therefore $01^j0 \in \overline {\mathcal {F}}$ . Let $y\in Y$ be such that $y_0=1$ . Then, for any $x\in \phi ^{-1}(y)$ , $x_{[-k+1, 0]}=b_1 b_2 \ldots b_k$ , and therefore x contains both $11\in \mathcal {F}$ and $01^j0 \in \overline {\mathcal {F}}$ . Hence, both $\phi \vert _{X_{\mathcal {F}}}$ and $\phi \vert _{X_{\overline {\mathcal {F}}}}$ are not onto.

If $b_1 b_2\ldots b_{k-j-1}= b_{j+2}\ldots b_k$ , then

$$ \begin{align*} b_1 b_2 \ldots b_k= \begin{cases} 1^{s_1} (01^j)^{m_1}& \mbox{with } 0\leq s_1\leq j, m_1\geq 2\\ &\mbox{and } s_1+m_1(j+1)=k \\ \mbox{ or } \\ 1^{s_2} (01^j)^{m_2} 01^{t_2} & \mbox{with } 0\leq s_2 \leq j, m_2\geq 1, 0\leq t_2\leq j-1\\ &\mbox{and } s_2+m_2(j+1)+t_2+1=k, \end{cases} \end{align*} $$

and $10^{i}1\in \mathcal {F}$ for any $j+1\leq i\leq 2j$ .

Subcase 3.1: $b_1 b_2 \ldots b_k=1^{s_1} (01^j)^{m_1}$ for some $0\leq s_1\leq j$ and $m_1\geq 2$ . If $s_1=0$ , $b_1b_2\ldots b_k=(01^j)^{m_1}$ and it is purely periodic. In this case, we infer from Proposition 5.2(1) that $10^{k-1}1$ is not allowed in Y but $10^{k}1$ is. Consider $y\in Y$ with $y_{[0,k+1]}=10^k1$ . For any $x\in \phi ^{-1}(y)$ , either $x_{[-k+1, k+1]}=b_1 b_2\ldots b_k 0 b_1 b_2\ldots b_k = (01^j)^{m_1} 0 (01^j)^{m_1}$ or $x_{[-k+1, k+1]}=b_1 b_2\ldots b_k 1 b_1 b_2\ldots b_k = (01^j)^{m_1} 1 (01^j)^{m_1}$ . In the former case, $x_{[0,1]}=00\in \overline {\mathcal {F}}$ ; in the latter case, $x_{[-j-1,1]}=01^{j+1}0 \in \overline {\mathcal {F}}$ . Since $b_1 b_2 \ldots b_k$ contains $11\in \mathcal {F}$ , we conclude that both $\phi \vert _{X_{\mathcal {F}}}$ and $\phi \vert _{X_{\overline {\mathcal {F}}}}$ are not onto.

If $s_1\neq 0$ , $b_1b_2\ldots b_k$ is not purely periodic. Hence, we infer from Proposition 5.2(1) that $10^{k-1}1$ is allowed in Y. A similar argument as in Case 2 for odd k implies that both $\phi \vert _{X_{\mathcal {F}}}$ and $\phi \vert _{X_{\overline {\mathcal {F}}}}$ are not onto.

Subcase 3.2: $b_1 b_2 \ldots b_k=1^{s_2} (01^j)^{m_2}01^{t_2}$ for some $0\leq s_2\leq j$ , $m_2\geq 1$ , and $0\leq t_2 \leq j-1$ . If $s_2=j$ and $t_2=0$ , $b_1b_2\ldots b_k=(1^j0)^{m_2}$ . By reversing the roles of $0$ and $1$ , a similar argument as in Subcase 3.1 for $s_1=0$ implies that both $\phi \vert _{X_{\mathcal {F}}}$ and $\phi \vert _{X_{\overline {\mathcal {F}}}}$ are not onto.

If $s_2\neq j$ or $t_2\neq 0$ , a similar argument as in Subcase 3.1 for $s_1 \neq 0$ again implies that both $\phi \vert _{X_{\mathcal {F}}}$ and $\phi \vert _{X_{\overline {\mathcal {F}}}}$ are not onto.

Example 6.6. Let $\Phi : \{0,1\}^2 \rightarrow \{0,1\}$ be a $4$ -block code defined by

$$ \begin{align*} \Phi(0000)=1 \quad \mbox{and} \quad \Phi(b_1 b_2 b_3 b_4)=0 \quad \mbox{if } b_1 b_2 b_3 b_4\neq 0000. \end{align*} $$

Let $\phi : X=X_{[2]}\rightarrow Y$ be the factor code induced by $\Phi $ . Using Proposition 5.2, one verifies that Y is an S-gap shift with $S=\{0, 4,5,6, \ldots \}$ . Equivalently, Y is an SFT with the forbidden set $\mathcal {F}=\{101, 1001, 10001\}$ . Noting that $1^\infty \in X$ , we deduce from Theorem 6.1 that there is an SFT $Z\subset X$ such that $\phi \vert _Z$ is a conjugacy. Note that $\phi \vert _{X_{\overline {\mathcal {F}}}}$ is not onto: since $010\in \overline {\mathcal {F}}$ and $\Phi ^{-1}(100001)=000010000$ , $100001$ is not allowed in the image of $\phi \vert _{X_{\overline {\mathcal {F}}}}$ and therefore $\phi \vert _{X_{\overline {\mathcal {F}}}}$ is not onto. It follows from Theorem 6.5 that we can choose Z to be $X_{\mathcal {F}}$ . The reader can verify this directly.

7 Standard factor codes defined on spoke graphs

In this section, we consider a class of factor codes with an unambiguous symbol motivated by the example in [Reference Marcus, Petersen and WilliamsMPW84, pp. 287–289].

A graph U is called a spoke if U consists of a state B, a simple path $\gamma ^+$ from B to a state $B'\ne B$ , a simple path $\gamma ^-$ from $B'$ to B, a simple cycle C including $B'$ such that $\gamma ^+, \gamma ^-$ and C are all disjoint (except that they all share the state $B'$ and $\gamma ^+, \gamma ^-$ share the state B). We also allow degenerate spokes with one simple cycle C at B, which we indicate by $\gamma ^+=\gamma ^-=\emptyset .$

A graph G is a spoke graph if it consists of a central state B and finitely many distinct spokes $U_i, i \in T$ such that for any $i\neq j\in T$ , $U_i$ and $U_j$ only intersect at B. Let $\gamma ^+_i, \gamma ^-_i,B_i'$ and $C_i$ denote the $\gamma ^+, \gamma ^-, B'$ and C of the spoke $U_i$ . Let $T_0\triangleq \{i\in T: \gamma ^+=\gamma ^-=\emptyset \}$ denote the indices of degenerate spokes and $T_1\triangleq T\setminus T_0$ denote the indices of regular spokes. See Figure 1 for an example of a spoke graph with two regular spokes and one degenerate spoke.

Figure 1 A spoke graph with two regular spokes and one degenerate spoke, where dots denote vertices.

Let $\Phi : \mathcal {V}(G) \rightarrow \{0,1\}$ be defined by

(10)

$$ \begin{align} \Phi(x)= \begin{cases} 1 \quad \mbox{if }x_i=B, \\ 0 \quad \mbox{otherwise}. \end{cases} \end{align} $$

For a block $x_1 \ldots x_m$ with $x_i\in \mathcal {V}(G)$ for any $1\leq i\leq m$ , we use $\Phi (x_1 \ldots x_m)$ to denote $\Phi (x_1)\Phi (x_2) \ldots \Phi (x_m)$ .

Consider the factor code $\phi : \widehat {X_G} \rightarrow Y\subset X_{[2]}$ induced by $\Phi $ . We call $\phi $ the standard factor code on G. The image Y of $\phi $ is a gap shift with gap set

$$ \begin{align*} S:= \bigcup_{i\in T} S_i, \end{align*} $$

where

$$ \begin{align*} S_i&:= \begin{cases} \{d_i -1 \}& \mbox{if }i\in T_0, \\ \{n \in \mathbb{Z}_{\geq 0}: n=a_i \ (\bmod \ d_i), n \ge m_i\} & \mbox{if }i\in T_1, \end{cases} \\ d_i &:= |C_i|, \qquad\qquad\qquad\qquad\qquad\qquad\qquad\quad i\in T_0\cup T_1, \\ m_i &:= |\gamma^+_i| + |\gamma^-_i|-1, \qquad\qquad\qquad\qquad\quad\ \ \ \hspace{1.3pt} i\in T_1, \\ a_i&: = m_i \bmod d_i, \qquad\qquad\qquad\qquad\qquad\qquad\ 0\leq a_i\leq d_i-1. \end{align*} $$

Let $D = l.c.m.(\{d_i: i\in T_1\})$ and $n(i):= D/{d_i}$ . It is then immediate that for $i\in T_1$ ,

$$ \begin{align*} S_i = \{n\in \mathbb{Z}_{\geq 0}: n=b_i^{(j)} \ (\bmod \ D), 1\leq j\leq n(i), n\geq m_i \}, \end{align*} $$

where $b_i^{(j)}:=a_i+(j-1) d_i$ and $0\leq b_i^{(j)}<D$ for any $i\in T_1$ and any $1\leq j\leq n(i)$ . For each $i\in T_1$ , denote

$$ \begin{align*} K_i := \{b_i^{(1)}, b_i^{(2)}, \ldots, b_i^{n(i)}\} \quad \mbox{and} \quad K_i \bmod D:= \bigcup_{j=1}^{n(i)}\{n: n=b_i^{(j)}\ (\bmod \ D)\}. \end{align*} $$

Then the gap set S can be expressed by

$$ \begin{align*} S= \bigg(\bigcup_{i\in T_1} \{n \in \mathbb{Z}_{\geq 0}: n\in K_i \bmod D, n\geq m_i \}\bigg) \cup \{\vert C_i \vert -1: i\in T_0\}. \end{align*} $$

8 Characterization of the finite-to-one condition for standard factor codes on spoke graphs

Here, we characterize property P2 for standard factor codes on spoke graphs.

Theorem 8.1. Let G be a spoke graph and $\phi $ be the standard factor code on G. Then, the following are equivalent.

(1) There is a $W\subset T_1$ such that $\bigcup _{i\in W} K_i=\bigcup _{i\in T_1} K_i$ and $\{K_i: i\in W\}$ are pairwise disjoint.
(2) There is an irreducible SFT $Z\subset \widehat {X_G}$ such that $\phi \vert _{Z}$ is almost invertible and onto Y.
(3) There is an irreducible SFT $Z\subset \widehat {X_G}$ such that $\phi \vert _{Z}$ is finite-to-one and onto Y.

(Note 1: If $d_i \ge 2$ for all $i \in T_0\cup T_1$ , then the vertex shift of a spoke graph does not have a fixed point. If $T_1\neq \emptyset $ , then the image Y always has a fixed point $0^\infty $ . So, under these assumptions, $\phi \vert _Z$ cannot be a conjugacy.

Note 2: In items (2) and (3), it is not necessary to assume that Z is irreducible since otherwise we can replace Z with an irreducible component with maximal topological entropy.)

Proof. (1) $\Rightarrow $ (2): Suppose there is a set $W\subseteq T_1$ such that $\bigcup _{i \in W} K_i = \bigcup _{i \in T_1} K_i$ and $\{K_i: i \in W\}$ are pairwise disjoint.

Denote

$$ \begin{align*} S^{(0)}&=\bigcup_{i\in W} \{n \in \mathbb{Z}_{\geq 0}: n\in K_i \bmod D, n\geq m_i \}, \\ S^{(1)}&=\bigcup_{i\in T_1} \{n \in \mathbb{Z}_{\geq 0}: n\in K_i \bmod D, n\geq m_i \},\\ S^{(2)}&=\{\vert C_i \vert -1: i\in T_0\}. \end{align*} $$

We first construct a new graph H from the graph G through the following three steps:

(A) let H be the graph consisting of the central state $B\in \mathcal {V}(G)$ and all the spokes $U_i\subset G$ with $i\in W$ ;
(B) for each $r\in S^{(1)} \setminus S^{(0)}$ , add to H a simple cycle, denoted $C(r)$ , of length $r+1$ starting and ending with B;
(C) for each $s\in S^{(2)} \setminus S^{(1)}$ , choose an $i(s)\in T_0$ such that $\vert C_i \vert = s+1$ . Add the degenerate spoke $U_{i(s)}$ to H.

See Figure 2 for an example of the construction of H.

Figure 2 An example of G and H, where dots denote vertices, and $U_i$ terms and $U_i'$ terms are spokes in G and H, respectively. For this example, $T_0=\{3,4\}, T_1=\{1,2\}, W=\{2\}$ and $H_1=U_2', H_2=U_1'\cup U_3', H_3=U_4'$ .

Let $H_1, H_2, H_3$ denote subgraphs consisting of spokes added to H in steps (A), (B), and (C), respectively. It is worth noting that any $r\in S^{(1)}\setminus S^{(0)}$ corresponds to a ‘gap’ in regular spokes of G that is missing from $\{U_i: i\in W\}$ , and any $s\in S^{(2)}\setminus S^{(1)}$ corresponds to a ‘gap’ in degenerate spokes of G that is missing from $\{U_i: i\in T_1\}$ .

The following properties are immediate from the construction of H.

(a) H is a spoke graph. It consists of the central state B and several spokes intersecting at B, where spokes in $H_1$ are regular spokes and spokes in $H_2\cup H_3$ are degenerate spokes.
(b) $H_1\cup H_3$ is a subgraph of G.
(c) If $\eta _1$ and $\eta _2$ are two different simple cycles at B in H, then $\vert \eta _1\vert \neq \vert \eta _2\vert .$

Now, define a one-block map $\Psi : \mathcal {V}(H)\rightarrow \mathcal {V}(G)$ as follows:

• for $v\in \mathcal {V}(H_1\cup H_3)$ , let $\Psi (v)=v$ ;
• for any $r\in S^{(1)}\setminus S^{(0)}$ , choose a cycle $\widetilde {C}(r)$ in G starting and ending with B with no B in its interior such that $\vert \widetilde {C}(r) \vert = \vert C(r) \vert $ . Define
$$ \begin{align*} \Psi(V(C(r))):= V(\widetilde{C}(r)). \end{align*} $$

Note that for any two distinct vertices $v_1, v_2\in \mathcal {V}(H)$ , $\Psi (v_1)=\Psi (v_2)$ only if there exist $r_1, r_2\in S^{(1)}\setminus S^{(0)}$ with $r_1 \neq r_2$ such that $v_1\in V(C(r_1))$ and $v_2\in V(C(r_2))$ , where $C(r_1)$ and $C(r_2)$ are constructed in step (B).

Let $\psi :\widehat {X_H}\rightarrow \widehat {X_G}$ be the sliding block code induced by $\Psi $ and define $Z: =\psi (\widehat {X_H})$ . Note that any point $z\in Z$ is a concatenation of strings of the form

(11)

$$ \begin{align} Bu_1 u_2 \ldots u_k B,\quad \ldots v_{-3}v_{-2}v_{-1} B,\quad Bw_1w_2w_3\ldots,\quad \ldots i_{-2} i_{-1} i_0 i_1 i_2 \ldots, \end{align} $$

where $u_j$ terms, $v_j$ terms, $w_{j}$ terms, and $i_j$ terms are vertices in G distinct from B. Thus, to show that $\psi $ is one-to-one, it suffices to show that any string in equation (11) has a unique $\Psi $ -pre-image, and we prove this by considering the following cases.

(1) Any allowed block of the form $Bu_1 u_2 \ldots u_k B$ in $\widehat {X_G}$ must be the $\Psi $ -image of some block of the form $B x_1 x_2 \ldots x_k B$ with $x_i\in \mathcal {V}(H)$ for any $1\leq i\leq k$ . Noting from property (c) that each $B x_1 x_2 \ldots x_k B$ is uniquely determined by its length, we conclude that the $\Psi $ -pre-image of $Bu_1 u_2 \ldots u_k B$ is unique.
(2) For simplicity, among the infinite paths in equation (11), we consider only those of the form $\ldots v_{-3}v_{-2}v_{-1} B$ in $\widehat {X_G}$ . Such a string must be the $\Psi $ -image of some string of the form $\ldots x_{-3} x_{-2} x_{-1} B$ with $x_i\in \mathcal {V}(H_1)$ . Since $\Psi $ is the identity map on $\mathcal {V}(H_1\cup H_3)$ , $\ldots v_{-3}v_{-2}v_{-1} B$ has a unique $\Psi $ -pre-image.

Let $Z:=\psi (\widehat {X_H})$ . Then Z is an irreducible SFT because it is conjugate to $\widehat {X_H}$ . We now prove that $\phi \vert _Z$ is almost invertible and onto Y. Note that by definition, $\Phi \circ \Psi $ maps the central state B to $1$ and maps all other vertices in H to $0$ . So $\phi \circ \psi $ is the standard factor code on the spoke graph H.

To see that $\phi \circ \psi $ is onto, first note that the image $(\phi \circ \psi )(\widehat {X_{H}})$ is a gap shift with gaps of the form

$$ \begin{align*} S' &:= S^{(0)} \cup (S^{(1)} \setminus S^{(0)}) \cup (S^{(2)}\setminus S^{(1)}) \\ &=S^{(0)} \cup S^{(1)} \cup S^{(2)}\\ &=S^{(1)} \cup S^{(2)}, \end{align*} $$

where we use the fact that $S^{(0)}\subset S^{(1)}$ in the last equation. Since $\bigcup _{i \in W} K_i = \bigcup _{i \in T_1} K_i$ , we have $S' =S$ , where S is such that Y is an S-gap shift. Therefore, $\phi \circ \psi $ is onto.

We now show that $\phi \circ \psi $ is finite-to-one. We first note from the construction of H that for any $t\in S$ , there is a unique cycle of length $t+1$ in H starting and ending with B, whose interior does not contain B. Hence, for any $t\in S$ , there is a unique path in H whose image under $\Phi \circ \Psi $ is $10^t1$ . This implies that $\phi \circ \psi $ has no graph diamond and therefore it is finite-to-one.

Since the central state B is the only vertex in H whose $(\Phi \circ \Psi )$ -image is $1$ , and since $\phi \circ \psi $ is a finite-to-one 1-block code on a 1-step SFT, its degree is $1$ (by [Reference Lind and MarcusLM95, Theorem 9.1.11(3) and Proposition 9.1.12]) and therefore it is almost invertible.

Finally, since $\phi \circ \psi $ is almost invertible and onto Y, and $\psi $ is a conjugacy from $\widehat {X_H}$ to Z, we conclude that $\phi \vert _{Z}: Z\rightarrow Y$ is almost invertible and onto.

(2) $\Rightarrow $ (3): As we said in §2, any almost invertible factor code on an irreducible SFT is finite-to-one [Reference Lind and MarcusLM95, Proposition 9.2.2].

(3) $\Rightarrow $ (1): Suppose that there is an irreducible SFT $Z\subset X$ such that $\phi \vert _{Z}$ is finite-to-one and onto. Let k be the degree of $\phi \vert _{Z}$ and L be the maximum length of words in a forbidden list of blocks from X that defines Z. Then, it follows from our definition of the degree of finite-to-one codes in §2 that there exist a word of the form $u:=0^{e_1}10^{e_2}1\ldots 10^{e_n}$ with $e_i \in S$ , an integer $L\leq M\leq \vert u\vert $ , and an index $1\leq j\leq \vert u \vert -M+1$ such that the set

$$ \begin{align*} E:=\{v_{[j, j+M-1]}: v\in\mathcal{B}(Z), \Phi(v) =u\} \end{align*} $$

has cardinality k. Note that u is a magic word and $u_{[j, j+M-1]}$ is the corresponding magic block.

For notational convenience, in the remainder of this proof, for any block w with length $\vert u \vert $ , we use the following notation:

$$ \begin{align*} \overline{w}:= w_{[1, j-1]}, \quad \widetilde{w}:=w_{[j, j+M-1]}, \quad \widehat{w}:=w_{[j+M, \vert u\vert]}, \end{align*} $$

where u, j, and M are defined as above.

Denote elements in E by $a^{(1)}, a^{(2)}, \ldots , a^{(k)}$ and for any $1\leq l \leq k$ , define

$$ \begin{align*} B^{(l)}:=\{v\in \mathcal{B}(Z): \Phi(v)=u, \widetilde{v}=a^{(l)}\} \quad \mbox{and} \quad R:=\bigcup_{1\leq l\leq k} B^{(l)}. \end{align*} $$

Note that R is the set of all $\phi \vert _{Z}$ -pre-images of u. By a higher block recoding similar to [Reference Lind and MarcusLM95, Proposition 9.1.7], the following observation follows from [Reference Lind and MarcusLM95, Proposition 9.1.9 (part 2)].

Observation 1. Let $u x u$ be a word in $\mathcal {B}(Y)$ and let $A: =\{z\in \mathcal {B}(Z): \Phi (z)= uxu\}$ . Note that any element in A is of the form $v^{(l)} w v^{(l')}$ , where $1\leq l ,l' \leq k$ and $v^{(l)}\in B^{(l)}, v^{(l')}\in B^{(l')}$ . Then, there exists a permutation $\tau =\tau _{uxu}$ of $\{1,2,\ldots k\}$ such that for any pair $(l, l')$ , $v^{(l)} w v^{(l')}\in A$ for some w only if $l'=\tau (l)$ .

For any $1\leq l\leq k$ , define

$$ \begin{align*} F^{(l)}:=\{i\in T_1: &vV(\gamma_i^+ (C_i)^L \gamma_i^-)w \in \mathcal{B}(Z) \mbox{ for some }v\in B^{(l)}\mbox{ and some }w\in R \} \end{align*} $$

to be the index set of regular spokes that can follow some pre-images of u in $B^{(l)}$ and precede some pre-images of u in R. We claim that for any $1\leq l\leq k$ , $\{K_i: i\in F^{(l)}\}$ are pairwise disjoint and $\bigcup _{i\in F^{(l)}} K_i =\bigcup _{i\in T_1} K_i.$ We assume without loss of generality that $l=1$ in the following.

To show $\{K_i: i\in F^{(1)}\}$ are pairwise disjoint, we suppose to the contrary that there exists $f\in K_{i_1} \cap K_{i_2}$ for some $i_1, i_2\in F^{(1)}$ with $i_1\neq i_2$ . Choose $n(f)= f (\bmod \ D)$ such that $n(f)\geq \max \{d_{i_1} L+m_{i_1}, d_{i_2} L+m_{i_2}\}$ . Then, $n(f)\in S$ and according to the definition of $F^{(1)}$ , there are $v, x\in B^{(1)}$ , $w\in B^{(l_1)}$ , $y\in B^{(l_2)}$ for some $1\leq l_1, l_2 \leq k$ such that

$$ \begin{align*} \Phi(vV(\gamma_{i_1}^+ (C_{i_1})^{(n(f)-m_{i_1})/d_{i_1}}\gamma_{i_1}^- )w)&=\Phi(xV( \gamma_{i_2}^+ (C_{i_2})^{(n(f)-m_{i_2})/d_{i_2}}\gamma_{i_2}^- )y)=u10^{n(f)}1 u, \notag \\ \mbox{and} \quad \widetilde{v}=\widetilde{x}=a^{(1)}&, \quad \widetilde{w}=a^{(l_1)}, \quad \widetilde{y}=a^{(l_2)}. \notag \end{align*} $$

Then, we infer from Observation 1 that $l_1=l_2$ and therefore $\widetilde {w}=\widetilde {y}=a^{(l_1)}$ . Now, the two words

(12)

$$ \begin{align} \widetilde{v} \widehat{v}V( \gamma_{i_1}^+ (C_{i_1})^{(n(f)-m_{i_1})/d_{i_1}}\gamma_{i_1}^-) \overline{w} \widetilde{w} \quad \mbox{and} \quad \widetilde{x} \widehat{x} V(\gamma_{i_2}^+ (C_{i_2})^{(n(f)-m_{i_2})/d_{i_2}}\gamma_{i_2}^-) \overline{y} \widetilde{y} \end{align} $$

are both $\phi \vert _{Z}$ -pre-images of $\widetilde {u} \widehat {u}1 0^{n(f)}1 \overline {u}\widetilde {u}$ , and they both start with $a^{(1)}$ and end with $a^{(l_1)}$ . Since $a^{(1)}$ and $a^{(l_1)}$ both have length M, which is no less than L, we deduce that the two words in equation (12) can be extended to a point diamond, contradicting the fact that $\phi \vert _Z$ is finite-to-one.

To show $\bigcup _{i\in F^{(1)}} K_i =\bigcup _{i\in T_1} K_i,$ assume to the contrary that there is a $g\in \bigcup _{i\in T_1} K_i$ but $g\notin \bigcup _{i\in F^{(1)}} K_i$ . Choose $n(g):= g\ (\bmod \ D)$ such that $n(g)> \max \{ d_i: i\in T_0\} $ and $n(g)\geq \max \{d_i L + m_i: i\in T_1\}$ . Then, $n(g)\in S$ and we deduce from the definition of $F^{(1)}$ that the set

$$ \begin{align*} Q:=\{z_{[j, j+M-1]}: z\in \mathcal{B}(Z), \Phi(z)= u 10^{n(g)}1 u\} \end{align*} $$

does not contain $a^{(1)}$ . Noting that $Q\subset \{a^{(1)}, a^{(2)}, \ldots , a^{(k)}\}$ , since u is a magic word, the cardinality of Q is at most $k-1$ . This contradicts the fact that $\phi \vert _{Z}$ has degree k, and therefore $\bigcup _{i\in F^{(1)}} K_i =\bigcup _{i\in T_1} K_i$ .

Now let $W= F^{(1)}$ . Then, we immediately infer from above that W is the desired set and therefore complete the proof.

Remark 8.2. Our proof indeed shows that conditions $(2)$ and $(3)$ in Theorem 8.1 are equivalent for any 1-block factor code with an unambiguous symbol defined on a 1-step SFT.

Example 8.3. Let G be the graph in Figure 3 where B is the central state. Let $\phi $ be the standard factor code on G. Then, one verifies that $\Phi $ (which generates $\phi $ ) has no graph diamond and so $\phi $ is finite-to-one; however, $\phi $ is not one-to-one: both $(V_1V_2)^\infty $ and $(V_2V_1)^{\infty }$ are pre-images of $0^\infty $ . In this case, there is no subshift $Z\subset \widehat {X_G}$ such that $\phi \vert _Z$ is one-to-one and onto.

Figure 3 The graph G, which is a representation of $X_{\mathcal {F}}$ .

Example 8.4. Let G be the 3-spoke graph defined by

$$ \begin{align*}d_1=d_3=6, \quad d_2=3, \quad m_1=m_2=1, \quad m_3=4\end{align*} $$

and $\phi $ be the standard factor code on G. Then, $T_0=\emptyset $ , $T_1=\{1,2,3\}$ , $D=l.c.m.(d_1, d_2, d_3)=6$ and

$$ \begin{align*} K_1=\{1\}, \quad K_2=\{1,4\}, \quad K_3=\{4\}. \end{align*} $$

Here the image Y of $\phi $ is an S-gap shift with

$$ \begin{align*}S=\{n\in \mathbb{Z}_{\geq 0}: n=1 \bmod 3\}.\end{align*} $$

There are two ways to choose W.

(1) $W=\{1,3\}$ . It can be readily checked that $\bigcup _{i\in W} K_i = \bigcup _{i\in T_1} K_i$ and $K_1 \cap K_3 = \emptyset $ . So, by Theorem 8.1, there is an SFT $Z \subset \widehat {X_G}$ such that $\phi _Z$ is finite-to-one and onto Y. In this case, the proof chooses Z to be $\widehat {X_{U_1 \cup U_3}}$ .

(2) $W=\{2\}$ . Here, the proof of Theorem 8.1 chooses Z to be $\widehat {X_{U_2}}$ .

This shows that there are two irreducible Markov measures, $\nu _1$ and $\nu _2$ , with $\nu _1$ supported on $\widehat {X_{U_1\cup U_3}}$ and $\nu _2$ supported on $\widehat {X_{U_2}}$ , which both achieve the capacity of the channel given by the standard factor code on ${G}$ .

Example 8.5. Let G be the 4-spoke graph defined by

$$ \begin{align*} m_1=m_2=m_3=1, \quad m_4=10, \quad d_1=2,\quad d_2=3, \quad d_3=4, \quad d_4=6 \end{align*} $$

and $\phi $ be the standard factor code on G. Then,

$$ \begin{align*}T_0=\emptyset, \ \ T_1=\{1,2,3,4\}, \ \ D=l.c.m. (d_1, d_2,d_3,d_4)= 12\end{align*} $$

and

$$ \begin{align*} K_1=\{1,3,5,7,9,11\}, \quad K_2=\{1,4,7,10\}, \quad K_3=\{1,5,9\}, \quad K_4=\{4,10\}. \end{align*} $$

Let Y be the image of $\phi $ . Since $K_1 \cap K_4 =\emptyset $ and $K_1 \cup K_4 =\bigcup _{i\in T_1} K_i$ , it follows from Theorem 8.1 that there is an SFT $Z\subset \widehat {X_G}$ such that $\phi \vert _Z$ is finite-to-one and onto Y. Note that in this example, we cannot simply choose H in the proof of Theorem 8.1 to be the graph obtained from G by deleting $U_2$ and $U_3$ . This is because $10^41$ is allowed in Y, but not allowed in $\phi (\widehat {X_{U_1 \cup U_4}})$ : the only $\Phi $ -pre-image of $10^41$ is $V(\gamma _2^+ C_2 \gamma _2^-)$ and it comes only from the spoke $U_2$ . Instead, we let H be the graph obtained from G by deleting $U_2$ and $U_3$ , and then adding to H a cycle of length $5$ starting and ending with B. Then, according to the proof of Theorem 8.1, ${\widehat {X_H}}$ is conjugate to some SFT $Z\subset \widehat {X_G}$ and $\phi \vert _{Z}$ is finite-to-one and onto Y.

Example 8.6. An example for which the conditions in Theorem 8.1 are not satisfied is given in [Reference Marcus, Petersen and WilliamsMPW84, §3]. Here, G is the 4-spoke graph defined by

$$ \begin{align*} m_1=m_2=1, \ \ m_3=2, \ \ m_4=6, \ \ d_1=2, \ \ d_2=3, \ \ d_3=6, \ \ d_4=6. \end{align*} $$

Let $\phi $ be the standard factor code on G. It was shown in [Reference Marcus, Petersen and WilliamsMPW84] that for this $\phi $ , property P3 is not satisfied and therefore property P2 is not satisfied.

9 Conjecture: properties P2 and P3 are equivalent for standard factor codes on spoke graphs

Having characterized the condition under which property P2 is satisfied for standard factor codes on spoke graphs, we now turn to the question whether property P2 is equivalent to property P3 for these codes. Recall from Proposition 4.2 that property P2 always implies property P3 for a general factor code. For the converse, we have the following.

Conjecture 9.1. Let G be a spoke graph and $\phi $ be the standard factor code on G. Then property P3 implies property P2.

Remark 9.2. It will be shown that if property P3 holds, that is, there is a kth-order Markov measure $\nu $ on $\widehat {X_G}$ such that $\phi ^*(\nu )=\mu _0$ , the unique mme on Y, then $\nu (V(C_i) \vert V((C_i)^k)) = Q^{-d_i},$ where $C_i$ is the cycle (disjoint from B) on the spoke $U_i$ , $Q=e^{h_{\mathrm {top}}(Y)}$ , and $d_i$ is the length of $C_i$ . This is part (a) of the proof of Proposition 9.4 (see equation (13)). Hence, the $\nu $ -weight-per-symbol of each such $V((C_i)^\infty )$ is a constant $Q^{-1}$ . If it is true that the weight-per-symbol of each of the periodic points $V((\gamma _i^+\gamma _i^-)^\infty )$ is also $Q^{-1}$ , then one would have condition (4) of Proposition 4.3 and property P2 would be true. It may be that there is another Markov measure $\nu '$ on $\widehat {X_G}$ with $\phi ^*(\nu ')=\mu _0$ such that this condition is satisfied.

In the remainder of this section, we will prove some special cases of Conjecture 9.1. To this end, we begin with some lemmas.

Lemma 9.3. (Consequence of strong form of Chinese remainder theorem)

Let k be a positive integer. If for any $1\leq i< j\leq k$ , there exists $x_{i,j}$ such that $x_{i,j}=a_i \ (\bmod \ d_i)$ and $x_{i,j}=a_j \ (\bmod \ d_j)$ , then there exists x such that $x=a_l \ (\bmod \ d_l)$ for any $1\leq l\leq k$ .

Proof. For any $1\leq i<j\leq k$ , let $g_{i,j}=gcd(d_i, d_j)$ . Then $g_{i,j}$ divides $x_{i,j}-a_i$ and $x_{i,j}-a_j$ so $g_{i,j}$ divides $a_i-a_j$ . Hence, the generalized Chinese remainder theorem [Reference LeVequeLe56, Theorem 3-12] asserts that there is a common solution to $x=a_i \ (\bmod \ d_i), i=1,2,\ldots , k$ .

Lemma 9.4. Let $\nu $ be a kth-order Markov measure on $\widehat {X_G}$ such that property P3 holds. Define $\Pi _i := \nu (V(\gamma _i^+(C_i)^{Dk/d_i}) \vert B)$ , $P:= \{i\in T_1: \Pi _i>0\}$ and $R_j:= \{i\in T_1: j\in K_i\}=\{i\in T_1: d_i \mbox { divides } j-m_i \}$ . Then:

(a) for each $0\leq j<D$ ,
(13) $$ \begin{align} Q^{-Dk} = \sum_{i \in R_j\cap P} \Pi_i Q^{m_i + 1}(1 - Q^{-d_i}), \end{align} $$
where $Q:=e^{h_{\mathrm {top}}(Y)}$ ;
(b) $\bigcup _{i\in T_1\setminus P} K_i \subset \bigcup _{i\in P} K_i$ ;
(c) for each pair $j, j'$ , if $R_{j'}\cap P \subset R_j\cap P$ , then $R_{j'} \cap P = R_{j}\cap P$ .

Proof. Fix a congruence class $0\le j < D$ and let $\mu _0$ be the unique mme on Y.

Since $\phi (\nu ) = \mu _0$ , for all $n\ge \max _{i\in T} m_i/D$ ,

$$ \begin{align*} \mu_0(10^{Dk + j + Dn}1) = \sum_{i \in R_j} \nu(V(\gamma_i^+ (C_i)^{(1/d_i)(Dk + j + Dn - m_i)}\gamma_i^-)). \end{align*} $$

Let

$$ \begin{align*} Q_i := \nu(V(C_i) \vert V(\gamma_i^+(C_i)^{(1/d_i)Dk})). \end{align*} $$

Since $\mu _0(1) = \nu (B)$ , using the formula for the unique mme $\mu _0$ , we have

$$ \begin{align*} Q^{-(Dk + j + Dn + 1)} &= \sum_{i \in R_j} \Pi_i (Q_i^{1/d_i})^{-m_i} (Q_i^{1/d_i})^{Dn+j}(1-Q_i) \\ &= \sum_{i \in R_j\cap P} \Pi_i (Q_i^{1/d_i})^{-m_i} (Q_i^{1/d_i})^{Dn+j}(1-Q_i) \end{align*} $$

and so

$$ \begin{align*} Q^{-(Dk + 1)} = \sum_{i \in R_j\cap P} \Pi_i (Q_i^{1/d_i})^{-m_i} \left(\frac{Q_i^{1/d_i}}{Q^{-1}}\right)^{Dn+j}(1-Q_i). \end{align*} $$

Letting $n \rightarrow \infty $ , we have for all $i \in R_j \cap P$ ,

(14)

$$ \begin{align} \frac{Q_i^{1/d_i}}{Q^{-1}} = 1. \end{align} $$

This yields equation (13) and proves item (a).

Since Y is a gap shift, $\mu _0$ is fully supported and so gives positive measure to each allowed gap. Thus, $\bigcup _{i \in T_1 \setminus P} K_i \subset \bigcup _{i \in P} K_i$ , proving item (b).

To see item (c), we first derive from equation (13) that

$$ \begin{align*} \sum_{i \in R_{j'}\cap P} \Pi_iQ^{m_i+1}(1 - Q^{-d_i}) = Q^{-Dk} = \sum_{i \in R_j\cap P} \Pi_iQ^{m_i+1}(1 - Q^{-d_i}). \end{align*} $$

Thus,

$$ \begin{align*} \sum_{i \in (R_{j}\cap P) \setminus (R_{j'} \cap P)} \Pi_iQ^{m_i + 1}(1 - Q^{-d_i}) =0, \end{align*} $$

which immediately implies $ (R_{j}\cap P) \setminus (R_{j'} \cap P) = \emptyset $ .

Lemma 9.5. Let P be defined as in Lemma 9.4 and $i_1, i_2\in P$ with $K_{i_1}\cap K_{i_2}\neq \emptyset $ . Then, for any $j\in K_{i_1} \setminus K_{i_2}$ , there exists $i_3\in P$ such that:

(1) $j\in K_{i_3}$ ;
(2) $K_{i_2}\cap K_{i_3}= \emptyset .$

Proof. For notational convenience, we rewrite j by $j_1$ and define $\small S(i_1, i_2, j_1)\kern1.3pt{:=}\kern1.3pt(R_{j_1} \kern1.3pt{\cap}\kern1.3pt P) \kern1.2pt{\setminus} \{i_1, i_2\}$ , where $R_{j_1}$ is defined in Lemma 9.4.

We first show that $S(i_1, i_2, j_1)\neq \emptyset $ . Suppose to the contrary that $S(i_1, i_2, j_1)= \emptyset $ . Then, $R_{j_1}\cap P=\{i_1\}$ . Since $K_{i_1} \cap K_{i_2}\neq \emptyset $ , there exists $j_2\in K_{i_1} \cap K_{i_2}$ and therefore $R_{j_2}\cap P \supset \{i_1, i_2\}$ . Hence, $R_{j_1}\cap P\subsetneq R_{j_2} \cap P$ , contradicting Lemma 9.4(c).

We then claim that there exists $i_3\in S(i_1, i_2, j_1)$ such that $K_{i_2} \cap K_{i_3}=\emptyset $ . If not, then

$$ \begin{align*} K_{i_2} \cap K_i\neq \emptyset \quad \mbox{ for any }i\in S(i_1, i_2, j_1). \end{align*} $$

Recalling that $j_2\in K_{i_1}\cap K_{i_2}$ and $j_1\in K_{i_1} \cap (\bigcap _{i\in S(i_1, i_2, j_1)}K_i)$ , we derive from Lemma 9.3 that there exists $j_4\in K_{i_1} \cap K_{i_2} \cap (\bigcap _{i\in S(i_1, i_2, j_1)} K_i).$ Hence,

$$ \begin{align*}R_{j_1}\cap P =\{i_1\}\cup S(i_1, i_2, j_1) \subsetneq \{i_1, i_2\}\cup S(i_1, i_2, j_1) \subset R_{j_4}\cap P,\end{align*} $$

contradicting Lemma 9.4(c).

With these lemmas in hand, we prove the following.

Proposition 9.6. Let G be a spoke graph, $\phi $ be the standard factor code on G, and P be defined as in Lemma 9.4. If there is a stationary Markov measure $\nu $ on $\widehat {X_G}$ such that $\phi ^*(\nu )=\mu _0$ , the unique mme of the output Y, then there is an SFT $Z\subset \widehat {X_G}$ such that $\phi \vert _Z$ is finite-to-one and onto Y if any of the following hold:

(a) $\bigcap _{i\in P} K_i \neq \emptyset $ (in particular, this holds when $m_i = 1$ for all i or the $\{d_i\}$ are pairwise co-prime (by the Chinese remainder theorem));
(b) for any $i_1, i_2\in P$ , $K_{i_1}\cap K_{i_2} \neq \emptyset $ ;
(c) there are subsets $E_1$ and $E_2$ of P such that $\{K_i: i\in E_1\}$ and $\{K_i: i\in E_2\}$ are both pairwise disjoint and $\bigcup _{i\in E_1\cup E_2} K_i =\bigcup _{i\in T_1} K_i$ . In particular, this condition is satisfied if there are only two distinct $d_i$ terms;
(d) $\vert P\vert \leq 5$ .

Proof. According to Theorem 8.1, it suffices to show that there is a $W\subset T_1$ such that $\bigcup _{i\in W} K_i = \bigcup _{i\in T_1} K_i$ and $\{K_i: i\in W\}$ are pairwise disjoint.

Proof of item $(a)$ : Let $A:= \bigcup _{i\in P} K_i$ . Note that $P \ne \emptyset $ by the existence of $\nu $ . Let ${j\in \bigcap _{i\in P} K_i}$ . Apply Lemma 9.4(c) to this j and an arbitrary $j' \in A$ to get that for all $i \in P$ , $i \in \bigcap _{j'\in A} R_{j'}$ and so each $K_i = A$ . By Lemma 9.4(b), $A = \bigcup _{i \in T_1} K_i$ . Hence, W can be taken to consist of only one element, namely any element of P.

Proof of item $(b)$ : Since $K_i$ terms are pairwise intersecting, an application of Lemma 9.3 to $\{K_i: i\in P\}$ implies that $\bigcap _{i\in P} K_i\neq \emptyset $ which is item (a).

Proof of item $(c)$ : We assume without loss of generality that the $K_i$ terms are distinct. Denote

(15)

$$ \begin{align} F:= \{i\in E_1: K_i\cap K_{i'} = \emptyset \mbox{ for all } i'\in E_2\}. \end{align} $$

We claim that for any $i\in E_1 \setminus F$ , $K_i\subset \bigcup _{i'\in E_2} K_{i'}$ . To see this, assume to the contrary that there are $i_1 \in E_1 \setminus F$ and $j\in K_{i_1}$ such that $j \notin \bigcup _{ i' \in E_2} K_{i'}$ . Recalling that $K_{i_1} \cap K_{i_2} =\emptyset $ for $i_1, i_2\in E_1$ with $i_1\neq i_2$ , we have $R_{j}\cap P=\{i_1\}$ . However, $i_1 \in E_1 \setminus F$ implies that there exists $j'\in K_{i_1}$ and $i_3\in E_2$ such that $j'\in K_{i_1} \cap K_{i_3}$ . Hence, $R_{j'}\cap P\supset \{i_1, i_3\} \supsetneq \{i_1\}=R_{j} \cap P$ , which contradicts Lemma 9.4(c).

Now let $W:= F \cup E_2$ . Clearly $\{K_i: i\in W\}$ are pairwise disjoint by the definition of F. Since $K_i\subset \bigcup _{i'\in E_2} K_{i'}$ for any $i\in E_1 \setminus F$ , $\bigcup _{i\in W} K_i=\bigcup _{i\in P} K_i=\bigcup _{i\in T_1} K_i$ , proving item (c).

Proof of item $(d)$ : By adding repeated spokes (for which the choice of the set W is not affected), we can regard the cases $\vert P\vert <5$ as special cases of $\vert P \vert =5$ . Hence, we assume $\vert P \vert =5$ in the following.

Let $P=\{1,2,3,4,5\}$ . A pair $i,i'\in P$ is called an intersecting pair if $K_i \ne K_{i'}$ and $K_i \cap K_{i'} \ne \emptyset $ . We consider the following cases.

Case 1: For any intersecting pair $i, i'\in P$ , either $K_{i}\subset K_{i'}$ or $K_{i'}\subset K_{i}$ .

In this case, we define a partial order $\preccurlyeq $ in the following way: if $i,i'$ is an intersecting pair and $K_i \subset K_{i'}$ , then $K_i\preccurlyeq K_{i'}$ ; if $i ,i'$ is not a intersecting pair, then $K_i$ and $K_{i'}$ are incomparable.

The partial order $\preccurlyeq $ partitions the set $\{K_i: i\in P\}$ into several classes such that:

(1) each class is a chain with a unique maximal element (under $\preccurlyeq $ );
(2) if $K_i$ and $K_{i'}$ are from different classes, then $K_{i} \cap K_{i'}=\emptyset $ .

Hence, letting W be the indices of all the maximal elements, we have $\{K_i: i\in W\}$ are pairwise disjoint and $\bigcup _{i\in W} K_i = \bigcup _{i\in P} K_i =\bigcup _{i\in T_1} K_i$ .

Case 2: There exists an intersecting pair $i, i'\in P$ such that both $K_{i} \nsubseteq K_{i'}$ and ${K_{i'} \nsubseteq K_{i}}$ .

First note that in this case, we necessarily have $d_i\neq d_{i'}$ . We may assume that $i=1$ , $i'=2$ , and ${l.c.m.(d_1, d_2)}/{d_1} \geq 3$ . Let $j_1\in K_1\cap K_2$ , $j_2\triangleq (j_1+d_1) \bmod (l.c.m.(d_1, d_2))$ , and $j_3\triangleq (j_2+ d_1) \bmod (l.c.m.(d_1, d_2))$ , where $0\leq j_2,j_3 < l.c.m.(d_1, d_2)$ . Then $j_2, j_3\in K_1\setminus K_2$ . Furthermore, there is also a $j_4\in K_2 \setminus K_1$ . Applying Lemma 9.5 to $j_2, j_3, j_4$ , we deduce that there exist $l_1, l_2, l_3\in \{3,4,5\}$ such that

(16)

$$ \begin{align} j_2\in K_1 \cap K_{l_1}, \quad K_{l_1} \cap K_2 =\emptyset, \end{align} $$

(17)

$$ \begin{align} j_3\in K_1 \cap K_{l_2}, \quad K_{l_2} \cap K_2 =\emptyset, \notag \\ j_4\in K_2 \cap K_{l_3}, \quad K_{l_3} \cap K_1 =\emptyset. \end{align} $$

Note that we necessarily have $l_3\neq l_1$ and $l_3\neq l_2$ . We now claim that $l_1\neq l_2$ . To see this, suppose that $l_1=l_2=l$ . Then $j_2\in K_l, j_3\in K_l$ . Since $ j_3=(j_2+d_1) \bmod (l.c.m.(d_1, d_2))$ and $j_2\in K_1, j_3\in K_1$ , we have $K_l\subset K_1$ . Hence, $j_1\in K_1 \cap K_2 \cap K_l$ , contradicting the fact that $K_2\cap K_l=\emptyset $ . (See Figure 4, where for any $r,s$ , a $\bullet $ (respectively, a $\times $ ) on the $(r,s)$ position means that $j_r\in K_s$ (respectively, $j_r\notin K_s$ )).

Figure 4 Relationship between $K_1, K_2$ , and $K_l$ if $l_1=l_2=l$ .

Hence, $l_1, l_2$ , and $l_3$ are distinct. We may assume that $l_1=3, l_2=4$ , and $l_3=5$ . The current relation between $\{K_1, K_2 ,K_3, K_4,K_5\}$ is given in Figure 5, where $?$ means that whether this position is $\bullet $ or $\times $ is unknown up to now.

Figure 5 Relationship between $K_1, K_2 ,K_3, K_4, K_5$ with some unknowns.

We then claim that $j_3\notin K_3$ and $j_2\notin K_4$ (that is, $?_1=?_2=\times $ in Figure 5). To verify this claim, assume without loss of generality that $j_3\in K_3$ . Then $j_2\in K_1\cap K_3$ and ${j_3\in K_1 \cap K_3}$ . Noting that $j_3-j_2=d_1 \bmod (l.c.m.(d_1, d_2))$ , we must have $K_3\supset K_1$ , which contradicts the fact that $j_1\in K_1 \setminus K_3$ . Hence, $j_3\notin K_3$ . A similar argument shows that $j_2\notin K_4$ , proving the claim.

Now the relationship between $\{K_1, K_2, K_3, K_4, K_5\}$ is partially characterized in Figure 6.

Figure 6 Relationship between $K_1, K_2 ,K_3, K_4, K_5$ .

We then claim that $K_3\cap K_4=\emptyset $ . To see this, suppose to the contrary that there is a $j_5\in K_3\cap K_4$ . Since $j_2\in K_1\cap K_3$ , $j_3\in K_1\cap K_4$ , we infer from Lemma 9.3 that there is a $j_6\in K_1\cap K_3 \cap K_4$ , contradicting Lemma 9.4(c).

Now let $E_1:=\{1,5\}, E_2:=\{2,3,4\}$ . Since $\{K_i: i\in E_1\}$ and $\{K_i: i\in E_2\}$ are both pairwise disjoint, the desired result follows from item (c).

Remark 9.7. When $\vert P \vert \le 4$ , by carefully going through a similar argument as in the proof of item (d), one can show that for any $i\neq j\in P$ , $K_i \cap K_j=\emptyset $ or $K_i\subset K_j$ or $K_j\subset K_i$ .

10 Standard factor codes defined on another class of graphs

We believe that our approach in the proof of Theorem 8.1 also works for more general graphs. Note that for a graph G with one (regular) spoke, Theorem 8.1 implies that property P2 always holds. In this section, as an example, we show that property P2 also holds for standard factor codes defined on a different kind of spoke graph. To be specific, let G be a graph which consists of a central state B, a simple path $\gamma ^+$ from B to $B'\neq B$ , a simple path $\gamma ^-$ from $B'$ to B, and two simple cycles $C_1$ and $C_2$ including $B'$ such that:

(a) $\vert C_i\vert>0$ for $i=1,2$ ;
(b) $\gamma ^+$ and $\gamma ^-$ only intersect at B and $B'$ ;
(c) $\gamma ^+, \gamma ^-, C_1$ and $C_2$ share the vertex $B'$ , and there is no other common vertex among $\gamma ^+, \gamma ^-, C_1$ , and $C_2$ .

Here, we implicitly assume that $\gamma ^+\neq \emptyset $ and $\gamma ^-\neq \emptyset $ .

Just as in §7, a standard factor code $\phi $ on G is induced by a one-block map $\Phi : \mathcal {V}(G) \rightarrow \{0,1\}$ that maps the central state B to $1$ and any other vertex to $0$ .

Let Y be the image of $\phi $ . We have the following.

Proposition 10.1. Let G be the graph defined above and $\phi $ be the standard factor code on G. Then, there is an SFT $Z\subset \widehat {X_{G}}$ such that $\phi \vert _Z$ is finite-to-one and onto Y.

We need the following lemma.

Lemma 10.2. Suppose $d_1, d_2$ are two positive integers. Let

$$ \begin{align*} E\:&=\{n\in \mathbb{Z}_{\geq 0}: n=s \cdot d_1+t \cdot d_2, s,t\in \mathbb{Z}_{\geq 0}\}, \\ u:&=\frac{l.c.m.(d_1, d_2)}{d_2}. \end{align*} $$

Then for any $n\in E$ , the equation

(18)

$$ \begin{align} x\cdot d_1+ y\cdot d_2=n \quad \text{such that } x,y\in \mathbb{Z}_{\geq 0}, 0\leq y< u \end{align} $$

has a unique solution.

Proof. We first show that equation (18) has a solution. Suppose $n=s\cdot d_1+t\cdot d_2$ for some $s,t\in \mathbb {Z}_{\geq 0}$ . If $t< u$ , then $x=s, y=t$ is a solution to equation (18); otherwise, if $t\geq u$ , then there exist non-negative integers $k,r$ with $0\leq r< u$ such that $t=k u +r$ . Hence, we have

(19)

$$ \begin{align} n&= s \cdot d_1 + t \cdot d_2 \notag \\ &=s\cdot d_1+(ku+r) d_2 \notag \\ &= s \cdot d_1 + k \cdot (l.c.m.(d_1 ,d_2))+r d_2 \notag \\ &= \bigg(s+ k\cdot \frac{l.c.m. (d_1, d_2)}{d_1}\bigg) d_1 + r d_2. \end{align} $$

Since $d_1 \mid l.c.m. (d_1, d_2)$ and $0\leq r<u$ , we conclude from equation (19) that $x=s+ k\cdot {l.c.m. (d_1, d_2)}/{d_1}, y=r$ is a solution to equation (18).

We now prove that equation (18) has no more than one solution. Suppose to the contrary that there exist two different pairs of integers $(x_1, y_1)$ and $(x_2, y_2)$ that satisfy equation (18) and without loss of generality $y_1<y_2$ . Now we have $ x_1 \cdot d_1 + y_1 \cdot d_2 = x_2 \cdot d_1 + y_2 \cdot d_2=n, $ which implies $ (y_2-y_1) d_2 = (x_1-x_2) d_1$ . Hence, $d_1 \mid (y_2-y_1) d_2$ and it follows that

(20)

$$ \begin{align} (y_2-y_1) d_2\geq l.c.m. (d_1, d_2), \end{align} $$

since $d_2 \mid (y_2-y_1) d_2$ . However, recalling that $y_1, y_2<u$ , we have $y_2-y_1<u$ and

$$ \begin{align*} (y_2-y_1) d_2< u\cdot d_2 = \frac{l.c.m.(d_1, d_2)}{d_2}\cdot d_2 = l.c.m. (d_1, d_2), \end{align*} $$

contradicting equation (20).

Proof of Proposition 10.1

We first note that the image Y of $\phi $ is a gap shift with gap set

$$ \begin{align*} S:= \{n \in \mathbb{Z}_{\geq 0}: n=m+s \cdot d_1 +t\cdot d_2 \mbox{ with } s,t \in \mathbb{Z}_{\geq 0}\}, \end{align*} $$

Let $u:= l.c.m.(d_1, d_2)/d_2$ and denote the vertices on the cycle $C_2$ and path $\gamma ^+$ by

$$ \begin{align*} V(C_2)&=f_1 f_2\ldots f_{d_2}, \\ V(\gamma^+)&=B g_1 g_2 \ldots g_{\vert \gamma^+\vert-1} f_1, \end{align*} $$

where $f_1=B'$ . We then construct a new graph H from G through the following steps:

(A) let H be the graph obtained from G by deleting the cycle $C_2$ ;
(B) if $u> 1$ , add to H a simple path $\beta $ from B to $B'$ such that
$$ \begin{align*} \vert\beta \vert &= \vert\gamma^+ \vert + (u-1) d_2,\\ V(\beta)&=Bg_1' g_2' \ldots g_{\vert \gamma^+\vert-1}' f_1^{(1)} f_2^{(1)}\ldots f_{d_2}^{(1)} \ldots \ldots f_1^{(u-1)} f_2^{(u-1)}\ldots f_{d_2}^{(u-1)} B'; \end{align*} $$
(C) for each $1\leq j\leq u-2$ , add to H an edge from $f_{d_2}^{(j)}$ to $B'$ .

See Figure 7 for an example of G and H when $m=3$ , $\vert C_1 \vert =4$ , and $\vert C_2 \vert =3$ .

Figure 7 An example of G and H with $m=3$ , $\vert C_1 \vert =4, \vert C_2 \vert =3$ .

We now construct a sliding block code $\psi :X_H \rightarrow X_G$ such that $\psi $ is one-to-one and $\phi \circ \psi $ is finite-to-one and onto. It will follow that $Z:= \psi (X_H$ ) is an SFT and $\phi |_Z$ is finite-to-one and onto.

Let $\Psi : \mathcal {V}(H)\rightarrow \mathcal {V}(G)$ be the 1-block map defined by:

(a) for any vertex v on $\gamma ^+, \gamma ^-$ or $C_1$ , $\Psi (v)=v$ ;
(b) for any $1\leq i\leq \vert \gamma ^+\vert -1$ , $\Psi (g_i')=g_i$ ; for any $1\leq j\leq d_2$ and $1\leq k\leq u-1$ , $\Psi (f_{j}^{(k)})=f_j$ .

Let $\psi $ be the sliding block code induced by $\Psi $ . To show that $\psi $ is one-to-one, it suffices to show that there exists some M such that whenever $\psi (x) = y$ , then $x_0$ can be uniquely determined from $y_{[-M,M]}$ . We show this by considering the following possibilities for $y_0$ :

(1) if $y_0$ is on $\gamma _-$ or $C_1$ and $y_0\neq B'$ , then $x_0 = y_0$ ;
(2) if $y_0 = g_i$ for some i, let
$$ \begin{align*}N_1: =\min\{l\geq 0: y_l=g_{\vert \gamma^+\vert-1}\}\leq \vert \gamma^+ \vert-2.\end{align*} $$
Then $x_0 = g_i'$ if $y_{N_1+2}=f_2$ and $x_0=g_i$ otherwise;
(3) if $y_0 = f_j$ for some j, let
$$ \begin{align*} N_2:= \min \{ \ell \ge 0: y_{-\ell} = g_{|\gamma^+|}\} \leq (u-1)d_2. \end{align*} $$
If $y_1\neq f_l$ for any $1\leq l\leq d_2$ , then $x_0=f_1$ ; otherwise, $x_0 = f_j^{(k)}$ where $k = \lceil N_2/d_2 \rceil $ .

This shows that $\psi $ meets the criterion above to be one-to-one with $M:= \max \{\vert \gamma ^+ \vert , (u-1)d_2\}$ .

Now we show that $\phi \circ \psi : \widehat {X_H} \rightarrow Y$ is finite-to-one and onto. Note that by definition, $\phi \circ \psi $ maps the central state B of H to $1$ and maps any other vertex to $0$ .

To this end, first observe that any $k\in S$ must satisfy $k=m+s\cdot d_1 + t\cdot d_2$ for some $s, t\in \mathbb {Z}_{\geq 0}$ . Noting from Lemma 10.2 that there is a unique pair $(x, y)$ with $x,y\in \mathbb {Z}_{\geq 0}$ and $0\leq y<u$ such that $s\cdot d_1 + t \cdot d_2 = x \cdot d_1 + y \cdot d_2$ , we conclude that

$$ \begin{align*} (\Phi\circ\Psi)^{-1}(10^{k} 1)= \begin{cases} V(\gamma^+(C_1)^x \gamma^-) \quad &\mbox{if }y=0,\\ V(\beta(C_1)^x \gamma^- ) \quad &\mbox{if }0<y<u. \end{cases} \end{align*} $$

In particular, any block of the form $10^{k} 1$ with $k\in S$ has a unique pre-image under $\Phi \circ \Psi $ . Similarly, one can show that $|(\Phi \circ \Psi )^{-1}(0^\infty 1)| = 1$ , $|(\Phi \circ \Psi )^{-1}(1 0^\infty )| = u$ , and $|(\Phi \circ \Psi )^{-1}(0^\infty )| = d_1$ . Since each element $y \in Y$ is a concatenation of blocks of the form $10^k, 0^\infty 1, 1 0^\infty $ , and $0^\infty $ with $k\in S$ ,

$$ \begin{align*} 1 \le |(\phi \circ \psi)^{-1}(y)| \le \max(u, d_1). \end{align*} $$

So $\phi \circ \psi $ is finite-to-one and onto Y.

Remark 10.3. The subshift of finite type Z is not unique: indeed, by interchanging the role of $C_1$ and $C_2$ , we can construct another SFT $Z'\subset \widehat {X_G}$ such that $\phi \vert _{Z'}$ is finite-to-one and onto Y.

11 Concluding remarks

In this paper, we have interpreted input-constrained deterministic channels as factor codes on irreducible SFTs. We introduced two properties, properties P1 and P2 (weaker than property P1), of such factor codes sufficient for Markov capacity to achieve capacity of the corresponding channel. We characterized property P1 for a class of factor codes and property P2 for a more specialized class of factor codes. For the latter class, we conjectured that property P2 is equivalent to the condition that Markov capacity achieves capacity and gave several special cases to support this conjecture.

Acknowledgements

We thank Mike Boyle, Felipe García-Ramos, Sophie MacDonald, Tom Meyerovitch, Ronny Roth, and Paul Siegel for helpful discussions. We also thank the anonymous referee for the constructive comments which have greatly improved the quality of this article. The work of G.H. was supported by the Research Grants Council of the Hong Kong Special Administrative Region, China, under Project 17304121 and by the National Natural Science Foundation of China, under Project 61871343.

References

Arimoto, S.. An algorithm for computing the capacity of arbitrary memoryless channels. IEEE Trans. Inform. Theory 18(1) (1972), 12–20.CrossRef Google Scholar

Blahut, R. E.. Computation of channel capacity and rate distortion functions. IEEE Trans. Inform. Theory 18(4) (1972), 460–473.CrossRef Google Scholar

Boyle, M. and Petersen, K.. Hidden Markov processes in the context of symbolic dynamics . Entropy of Hidden Markov Processes and Connections to Dynamical Systems (Lecture Note Series, 385). Eds. Marcus, B., Petersen, K. and Weissman, T.. Mathematical Society, London, 2011, pp. 5–71.CrossRef Google Scholar

Boyle, M. and Tuncel, S.. Infinite-to-one codes and Markov measures. Trans. Amer. Math. Soc. 285(2) (1984), 657–684.CrossRef Google Scholar

Chen, J. and Siegel, P. H.. Markov processes asymptotically achieve the capacity of finite state intersymbol interference channels. IEEE Trans. Inform. Theory 54(3) (2008), 1295–1303.CrossRef Google Scholar

Dastjerdi, D. A. and Jangjoo, S.. Dynamics and topology of S-gap shifts. Topology Appl. 159(10–11) (2012), 2654–2661.CrossRef Google Scholar

Feinstein, A.. On the coding theorem and its converse for finite memory channels. Inform. Control 2(1) (1959), 25–44.CrossRef Google Scholar

García-Ramos, F. and Pavlov, R.. Extender sets and measures of maximal entropy for subshifts. J. Lond. Math. Soc. (2) 100(3) (2019), 1013–1033.CrossRef Google Scholar

Gray, R. M.. Entropy and Information Theory, 2nd edn. Springer, New York, NY, 2011.CrossRef Google Scholar

LeVeque, W. J.. Topics in Number Theory, Vol. 1. Addison Wesley, Reading, MA, 1956.Google Scholar

Lind, D. and Marcus, B.. An Introduction to Symbolic Dynamics and Coding. Cambridge University Press, Cambridge, 1995; (Cambridge Mathematical Library), 2nd edn. Cambridge University Press, Cambridge, 2021.CrossRef Google Scholar

MacDonald, S.. Encoding subshifts through a given sliding block code. Ergod. Th. & Dynam. Sys. doi:10.1017/etds.2023.56. Published online 3 August 2023.Google Scholar

Marcus, B., Petersen, K. and Williams, S.. Transmission rate and factors of Markov chains. Contemp. Math. 26 (1984), 279–294.CrossRef Google Scholar

Parry, W.. A finitary classification of topological Markov chains and sofic systems. Bull. Lond. Math. Soc. 9(1) (1997), 86–92.CrossRef Google Scholar

Petersen, K., Quas, A. and Shin, S.. Measures of maximal relative entropy. Ergod. Th. & Dynam. Sys. 23(1) (2003), 207–203.CrossRef Google Scholar

Parry, W. and Tuncel, S.. Classification Problems in Ergodic Theory (London Mathematical Society Lecture Note Series, 67). Cambridge University Press, Cambridge, 1982.CrossRef Google Scholar