Publicly Available Published by De Gruyter March 27, 2023

Efficient, continual, and generalized learning in the brain – neural mechanism of Mental Schema 2.0 –

  • Takefumi Ohki, Naoto Kunii, and Zenas C. Chao

Abstract

There has been tremendous progress in artificial neural networks (ANNs) over the past decade; however, the gap between ANNs and the biological brain as a learning device remains large. With the goal of closing this gap, this paper reviews learning mechanisms in the brain by focusing on three important issues in ANN research: efficiency, continuity, and generalization. We first discuss how the brain utilizes a variety of self-organizing mechanisms to maximize learning efficiency, with a focus on the role of the brain’s spontaneous activity in shaping synaptic connections to facilitate spatiotemporal learning and numerical processing. We then examine the neuronal mechanisms that enable lifelong continual learning, with a focus on memory replay during sleep and its implementation in brain-inspired ANNs. Finally, we explore how the brain generalizes learned knowledge to new situations, particularly from the mathematical perspective of topology. Besides a systematic comparison of learning mechanisms between the brain and ANNs, we propose “Mental Schema 2.0,” a new computational property underlying the brain’s unique learning ability that can be implemented in ANNs.

1 Introduction

Learning is one of the major themes in neuroscience, and many findings have already been made, from the molecular to the behavioral level (Asok et al. 2019; Chen et al. 2010; Fanselow and Poulos 2005; Kandel 2001; Krabbe et al. 2019; Nakazawa et al. 2004; Tse et al. 2007). However, with the development of artificial neural networks (ANNs), there is a growing need to better understand how the brain can learn efficiently and continually, and skillfully generalize knowledge. In this paper, we discuss major differences between the brain and ANNs as learning devices, in terms of efficiency, continuity, and generalization, and elucidate the uniqueness of the brain’s learning mechanism.

In the first part of this paper, we review recent findings on brain development and its use of self-organized mechanisms to maximize learning efficiency (Figure 1A). For example, the brain has innate structures and spontaneous activity that facilitate processing of specific information, such as spatiotemporal and numerical information, from an early stage of development. In this regard, it is widely recognized that most existing ANNs take a contrasting strategy: the network structure and connection weights are determined solely by specific tasks and input data. From the perspective of learning efficiency, such a data-driven strategy is not necessarily optimal. Designing an ANN to achieve an objective requires a great deal of trial and error, large amounts of data, and high costs (Sezgin et al. 2022; Strubell et al. 2020). We compare ANNs, such as convolutional neural networks (CNNs) and variational autoencoders (VAEs), with the biological neuronal networks that generate spontaneous activity during offline states, and describe neuronal mechanisms that allow the brain to learn more efficiently (Flesch et al. 2018; Hadsell et al. 2020; Parisi et al. 2019). We also describe the effects of sharp-wave ripples (SPW-Rs), the most typical neuronal oscillations during the offline state, which can upregulate and downregulate network connections to prevent catastrophic forgetting (Figure 1B), and introduce recent brain-inspired studies that have incorporated SPW-Rs into ANNs (Hadsell et al. 2020; Shin et al. 2017; van de Ven et al. 2020; Wang et al. 2022).

Figure 1: 
Schematic representations for efficient, continual, and generalized learning in the brain. (A) Efficient learning in the brain. In the brain, unlike artificial neural networks, spontaneous activity, in addition to external inputs, serves as a learning signal, which leads to efficient learning. The basic structure of the network and modules, such as the formation of grid cells, has already been partially prespecified in DNA. (B) Continual learning. Continual learning is possible in the brain because a part of the existing weights (represented by the blue line and R) is not rewritten even when new learning is performed. Such a mechanism is also realized by spontaneous activity; a typical example is sharp-wave ripples initiated by the rigid cell shown in blue. In contrast, the red line and P represent weights newly added to the network. The gray dotted line and P denote downregulated weights. (C) Generalization in the brain. The functionality of generalization is realized by a neural complex represented as a topological network. Topological networks have properties suitable for generalizing multiple kinds of information, such as encoding more abstract information.

In the last part of the paper, we focus on the brain’s ability to generalize, which enables the abstraction of multiple pieces of information to form generalized knowledge (e.g., semantic memory) and allows flexible action selection in very different environments. In recent years, efforts have been made to elucidate the generalizing ability of the brain, which has attracted a great deal of attention across the neuroscience community (Bahtiyar et al. 2020; Ngo et al. 2021; Ohki and Takei 2018; Vaidya et al. 2021; Zeithamova and Bowman 2020). In contrast, ANNs are good at learning subtle features, especially in image data (Esteva et al. 2017; Hosny et al. 2018; Rostami et al. 2021), but rarely acquire semantics or implicit rules that encompass knowledge as a whole, as evidenced by studies of natural language processing (Korngiebel and Mooney 2021). We suggest that the brain transforms all information, spatial and nonspatial, into a topologically invariant representation (Figure 1C), an idea that has attracted increasing attention in recent years in the field of neuroscience (Chung et al. 2019; Dabaghian et al. 2012; Giusti et al. 2015; Patania et al. 2019; Romano et al. 2014; Sizemore et al. 2019).

2 Efficient learning

2.1 Biological constraints for efficient learning

On comparing the nodes that make up ANNs with the neurons and glial cells that constitute the brain, we immediately notice fundamental computational differences (Figure 2). To elucidate these differences, Figure 2 schematically illustrates the fundamental computation implemented in ANNs (e.g., CNNs). Figure 2A demonstrates the computational process in the node, the most basic unit of ANNs. The computation consists of two main processes: the first computes the inner product of the data and the weights, denoted as (dataᵀweight), and the second transforms the inner product with a nonlinear function (e.g., the sigmoid function or ReLU), denoted as f in the figure. Thus, the computation at each node can be expressed concisely as f(dataᵀweight). The computational result at each node is passed on to the next node, a process called forward propagation, and the predictive result, denoted ŷ, is the final output to be compared against the objective function (y).
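The two-step node computation described above can be sketched in a few lines of Python. The specific input values, weights, bias, and the choice of sigmoid are illustrative, not taken from the paper:

```python
import math

# One node's computation, f(d^T w + b): inner product, then a nonlinearity.
# Inputs, weights, and bias are illustrative values.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def node(data, weights, bias):
    z = sum(d * w for d, w in zip(data, weights)) + bias  # inner product d^T w + b
    return sigmoid(z)                                      # nonlinear transform f

y_hat = node(data=[0.5, -1.0, 2.0], weights=[0.1, 0.4, 0.2], bias=0.05)
# z = 0.05 - 0.40 + 0.40 + 0.05 = 0.10, so y_hat = sigmoid(0.10) ≈ 0.525
```

In forward propagation, such node outputs simply become the inputs of the next layer's nodes, repeated layer by layer until the final prediction ŷ.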

Figure 2: 

Fundamental computations in artificial neural networks (ANNs). (A) Computation at each node and the forward propagation. In this panel, inputs are denoted by d₁∼ₙ and the bias term (or intercept) by b. For these data, nodes have initially been assigned random weights within a certain range. The circle in magenta denoted as Σ represents the first computational process in the node, i.e., the inner product. The computed inner product is transformed in a second process using a nonlinear function (e.g., the sigmoid or ReLU function). This process is represented as f by the pale blue circle. The output is denoted by ŷ. The equation denoted as f(dᵀw), marked by a large white circle, summarizes these two processes. (B) Backward propagation as minimization of the loss function. In the left panel, we show an example of minimizing the loss function using 1D data. In this panel, the x-axis represents the weight and the y-axis the loss function. The blue line indicates the ANN’s outputs in terms of the loss function at different weights. The orange line is the derivative of the blue line. The value of the weight that minimizes the loss function, denoted by the red dot, is called a local minimum. In the right panel, the process of minimizing the loss function is illustrated using two-variable data (i.e., two weights). In this case, since there are two weights to be adjusted, the partial derivative (∂) is used. The color plot illustrates the loss function obtained for each combination of the two weights: red indicates a high loss and blue a low loss. The small blue square represents the starting position of the search, and the small red circle denotes the end point of the calculation, a local minimum. Note that the result of this calculation is not the optimal weight when viewed globally. (C) Schematic of a typical convolutional neural network. Note that the node computation is schematically illustrated for clarity.

Naturally, the initial predictive result (ŷ) does not necessarily match the objective function. Therefore, as the next step, an error minimization process called backward propagation is conducted (Figure 2B). The goal of backward propagation is to adjust the weights at each node to minimize the loss function between the predictions (ŷ) obtained via forward propagation and the objective function (y). For example, the mean-squared-error and cross-entropy loss functions are defined as L = ½(y − ŷ)² and L = −(y log(ŷ) + (1 − y) log(1 − ŷ)), respectively. In the error minimization process, the gradient descent method is typically used. In Figure 2B, we demonstrate how gradient descent works, using low-dimensional data for illustration purposes. As demonstrated with the 1D example, gradient descent is synonymous with computing the derivative. As shown in the left panel of Figure 2B, the weight at which the derivative of the loss function is around zero is called a local minimum. Thus, backward propagation is the process of minimizing the loss function via its derivative, which is mathematically defined as w_B = w_F − η∇L. Here, w_B and w_F are the weights after backward and forward propagation, respectively, η is a scalar that controls the amount of learning, called the learning rate, and ∇L is the gradient of the loss function. Although gradient descent is a very useful technique, it has some pitfalls. For instance, it is easily trapped in local minima, which limits the search space for better weights. Such an example is illustrated in the right panel of Figure 2B, using 2D data. Although the true minimum exists at the bottom, shown in dark blue, the search stops at a local minimum in the middle of the figure, where the gradient no longer changes. As a result, the true minimum, called the global minimum, cannot be discovered.
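The update rule and the local-minimum pitfall described above can be made concrete with a minimal sketch. The loss function, learning rate, and starting points below are illustrative choices, not taken from the paper:

```python
# Minimal gradient-descent sketch on an illustrative non-convex 1D loss.
# L(w) = (w^2 - 1)^2 + 0.5*w has a deep (global) minimum near w ≈ -1.06
# and a shallower (local) minimum near w ≈ +0.93.

def loss(w):
    return (w ** 2 - 1) ** 2 + 0.5 * w

def grad(w):
    # Analytic derivative dL/dw.
    return 4 * w * (w ** 2 - 1) + 0.5

def gradient_descent(w0, eta=0.01, steps=5000):
    w = w0
    for _ in range(steps):
        w = w - eta * grad(w)  # the update rule: w_new = w_old - eta * dL/dw
    return w

w_right = gradient_descent(1.5)   # starts on the right: trapped in the local minimum
w_left = gradient_descent(-1.5)   # starts on the left: reaches the global minimum
```

Starting from w = 1.5, the search settles in the shallow local minimum even though a better weight exists elsewhere, which is exactly the limitation illustrated in Figure 2B.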
Usually, the search space for the optimal weights in an ANN trained via gradient descent is high-dimensional, and the search range is huge. Furthermore, as the training data grow, the computation of the loss function also grows. Therefore, in normal ANN training, it is usual to divide the training data into several subsets called mini-batches. The averaged sum of the loss functions obtained over a mini-batch is called the cost function, and learning is usually performed by minimizing this cost function, mathematically defined as I = (1/n) Σᵢ₌₁ⁿ L(ŷᵢ, yᵢ). Thus, the fundamental mechanism of ANNs such as CNNs is realized by repeating simple yet sophisticated mathematical computations over multiple layers (Figure 2C).
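The mini-batch cost computation is just the mean of per-sample losses; a short sketch using the cross-entropy loss, with illustrative predictions and labels:

```python
import math

# Cost function over a mini-batch: the mean of per-sample losses.
# Predictions (y_hats) and labels (ys) below are illustrative values.

def cross_entropy(y_hat, y):
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

def cost(y_hats, ys):
    # (1/n) * sum of L(y_hat_i, y_i) over the mini-batch
    return sum(cross_entropy(yh, y) for yh, y in zip(y_hats, ys)) / len(ys)

batch_cost = cost([0.9, 0.2, 0.7], [1, 0, 1])  # ≈ 0.228
```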

In contrast, the brain relies on far more complex mechanisms. Neurons have diverse morphologies, connection patterns, and functionalities, such as filtering, logical operations, coincidence detection, segregation, and amplification (London and Häusser 2005; Kanari et al. 2022; Kepecs and Fishell 2014; Zeng and Sanes 2017). For instance, the characteristic amplitude attenuation and duration decay of a neuron’s spiking activity can change according to the frequency of its inputs (e.g., 10–100 Hz) (Gulledge and Stuart 2003). Thus, even at the level of elemental computation, neurons and glial cells show far more complex behavior and have functionalities distinct from the nodes of ANNs. Nevertheless, the comparison can be meaningful for elucidating the learning mechanism of the brain. In particular, focusing on how weights are adjusted, with ANNs (e.g., the gradient descent method) as a reference, may provide important clues. Therefore, we first consider the brain’s learning mechanism from the perspective of weight optimization.

The first problem we face is the enormous number of neurons and glial cells in the brain. The brain is currently estimated to have 86 billion neurons, each with approximately 10,000 synaptic connections, and 100 billion glial cells (Colbran 2015). This is far greater than the number of nodes that make up an ANN for general research purposes. Adequately adjusting such a huge number of weights via a data-driven learning algorithm such as gradient descent would require enormous amounts of data and computational cost (Figure 2). This problem is instructive for rethinking the learning principles of the brain. For instance, why does the brain need so little data, time, and energy to assign appropriate weights to so many neurons? Furthermore, the real-world, multimodal data that the brain actually learns and integrates are far more complex than the data an ANN is usually exposed to. These facts imply that the brain may have a learning strategy distinct from that of ANNs; more specifically, the brain may be narrowing down the number of adjusted parameters, and/or weights might be predetermined or preconfigured from very early developmental stages (Burgess 2008; Ge et al. 2021; Huszár et al. 2022; Khazipov et al. 2004).
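A back-of-the-envelope calculation makes the scale gap tangible; the ANN parameter count used for comparison is an illustrative figure for a very large modern model, not a number from the paper:

```python
# Rough comparison of adjustable "weights": ~86 billion neurons with
# ~10,000 synapses each, versus a large ANN (illustrative parameter count).

neurons = 86e9
synapses_per_neuron = 1e4
brain_connections = neurons * synapses_per_neuron  # ~8.6e14 synapses

large_ann_params = 175e9                 # illustrative: a very large modern ANN
ratio = brain_connections / large_ann_params  # brain has thousands of times more
```

Even against one of the largest ANNs, the brain has on the order of a thousand times more adjustable connections, yet tunes them with far less data and energy.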

As examples of such preconfigured networks, Figure 3 demonstrates developmental changes in the brain’s spatiotemporal information processing system: head direction cells (Figure 3A), place cells (Figure 3B), and grid cells (Figure 3C; Langston et al. 2010; Wills et al. 2010). As shown, a pup’s brain exhibits activation patterns almost identical to those of adults at its first spatial exploration (at least by postnatal day 15, when a silicon probe can be stably placed in the pup’s soft brain). Such preconfigured neuronal circuits are highly likely to be encoded at the genetic level (Moser et al. 2008) and to be further refined via self-organized brain activity (Burgess 2008; Burgess and O’Keefe 2011; DiTullio and Balasubramanian 2021; Hasselmo et al. 2007; Kang and Balasubramanian 2019). Thus, the brain appears to be preconfigured to extract fundamental statistical properties of spatial information without an explicit training signal, such as that used in supervised learning in ANNs.

Figure 3: 

Spatial system in the brain during the early developmental stage. (A) The developmental changes of head direction cells in the pre- and para-subiculum. Black arrows in the upper left panel indicate the recording site. In the upper right panel, head direction cells show well-tuned directional activation patterns on postnatal days P15 and P16. The surrogate distribution of mean vector length is depicted in the bottom left panel; the red line indicates the 95th percentile (p > 0.05). The proportion of cells (%) in the pre- and para-subiculum shows the mean vector lengths at P15 and P16 (gray) and in adults (black). The arrow indicates chance level (p > 0.05). (B) Development of place cells. In the upper left panel, the representative recording site in CA1 is indicated by the black arrow. At the upper right, the place fields encoded by CA1 place cells between P17 and P35 are demonstrated. Firing rate maps are color scaled from blue (min) to red (max). In the bottom left panel, the surrogate distribution of spatial information scores from P16 to P18 is demonstrated; the red line represents the 95th percentile (p > 0.05). At the bottom right, the percentage of CA1 cells above the criterion (p < 0.05), by postnatal day, is depicted. Red solid and dotted lines represent statistical significance (p < 0.05) and the upper limit of the 95% confidence interval, respectively. (C) Development of grid cells. In the upper left panel, the black arrow indicates the representative recording site in layer 2 of the medial entorhinal cortex. The hexagonal firing patterns of grid cells from P16 to P34 are demonstrated in two ways: rate maps and spatial autocorrelations (r, from −1 to +1). From left to right in the bottom panel, the null distribution of grid scores obtained by surrogation, the percentage of cells above the 95th percentile threshold, the percentage of grid cells, grid scores for each age, and spatial correlations between rate maps are depicted. All figures are adapted from Langston et al. (2010) with permission.

Similar findings have been reported for higher cognitive functions such as number sense (Figure 4). Number processing must handle abstract information that does not depend on the modality of the stimulus (Kutter et al. 2018; Nieder 2016; Nieder and Dehaene 2009; Sawamura et al. 2002). Previous studies have identified neurons that respond selectively to specific numbers, preconfigured neuronal circuit motifs related to numerical processing (e.g., lateral inhibition), and associated brain regions (e.g., the intraparietal sulcus and prefrontal cortex), not only in humans (Figure 4A) but also in other species such as crows, fish, bees, and nonhuman primates (Bongard and Nieder 2010; Nieder 2021; Nieder and Miller 2004). There have also been many studies on the development of number sense (Edwards et al. 2016; Hyde et al. 2010; Izard et al. 2009; Mccrink and Wynn 2004). For instance, Izard et al. used electroencephalography (EEG) to show that the brains of 3-month-old infants (Figure 4B) differentially encode the identities of numbers and objects (Izard et al. 2008). Thus, these findings suggest that there are preconfigured brain mechanisms that can handle abstract representations, such as numbers, independent of the modality of the stimulus.

Figure 4: 
Number system in the brain. (A) The brain regions for number sense and number neurons. The left panel illustrates representative brain regions involved in number sense that have been identified in previous studies: in monkeys, the lateral prefrontal cortex (lPFC), area 5, the ventral intraparietal area (VIP), and the intraparietal sulcus (IPS); in humans, regions such as the IPS and multiple prefrontal regions (mPFC and lPFC). In the right panel, the firing patterns of number neurons that show selective responses to numbers from one to four are demonstrated. For example, cell 1, which encodes “1,” selectively increases its firing rate, independent of the modality of the stimulus. For larger numbers (e.g., 4), such selectivity is known to be relatively weak. This phenomenon is reminiscent of the Weber–Fechner law. (B) Three-month-old infants detect number changes. Brain networks recognizing a number change (i.e., a deviant number) at 3 months of age. The brain networks related to number sense, such as the IPS, respond to a number change. On the other hand, the IPS and other number-sensing regions do not respond to an object change (i.e., a deviant object). (C) Lateral inhibition for number tuning. A neural circuit motif called lateral inhibition found among number sense neurons. In this example, excitatory cells encoding 1 (marked by triangles) receive inhibitory input from inhibitory cells (e.g., gamma-aminobutyric acid neurons) associated with 2. The converse is also true: the excitatory cell encoding 2 receives input from the inhibitory cell associated with 1. Such lateral inhibition is not a number-specific circuit motif but is widespread throughout the brain, including the hippocampus. (D) The brain-inspired hierarchical convolutional neural network (HCNN) for number sense. The network consists of three structures: an input layer, a feature extraction network, and a classification network. In the feature extraction layer, multiple feature maps were extracted by convolution. Each feature map represented all visual features of the input image. After convolution, a non-linear activation function was applied. The max pooling layer in the feature extraction network aggregated the responses by computing the maximum response in a small non-overlapping region of the visual input. The classification network consisted of two layers, a global average pooling layer and a fully connected layer. The former computed the average response for each input feature map. The latter calculated, from each unit’s response, the probability of the presence of a particular object class in the input image. (E) Schematic diagram of the preconfigured brain network; one of the major differences from artificial neural networks is that spontaneous activity itself, not just external input, serves as the training signal. Panel (A) was adapted from Nieder (2016), panel (B) from Izard et al. (2008), and panel (E) from Nasr et al. (2019) with permission.

In recent years, several studies have developed ANNs that demonstrate responses similar to those of neurons in the brain (Kim et al. 2021; Stoianov and Zorzi 2012). In one study, the authors applied a hierarchical CNN to obtain numerical competence (Nasr et al. 2019). This brain-inspired ANN, with receptive fields and lateral inhibition that closely mimic the preconfigured visual system of the brain, was trained solely to classify real images unrelated to numerosity (Figure 4C, D). Intriguingly, approximately 10% of all nodes spontaneously became selectively responsive to the number of dots through learning. In addition, this ANN shares many characteristics of the brain’s number sense, such as decreasing accuracy as the number increases, consistent with the Weber–Fechner law (Mandler and Shebo 1982; Pica et al. 2004; Revkin et al. 2008). Based on these findings, the authors suggest that the spontaneous emergence of number sense is mainly due to neural mechanisms such as hierarchical layers in which nodes with receptive fields and lateral inhibition are arranged topographically (Nasr et al. 2019). We can build on these findings to investigate the mechanics of number sense in greater detail, such as the independence of number sense from stimulus attributes. How are these visual mechanisms related, or unrelated, to the perception of the number of sound stimuli? How can we develop a novel ANN that handles sequentially presented quantities and learns and uses the successor principle (i.e., each number is created by adding 1 to its predecessor) based on learning from natural images? Thus, brain-inspired ANNs allow us to investigate, with greater clarity, issues that are difficult to validate in actual brain-measurement experiments.
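The lateral-inhibition motif on which this network builds (Figure 4C) can be illustrated with a toy sketch: each unit's response is suppressed in proportion to its neighbors' responses, which sharpens number tuning. The response profile and inhibition strength below are invented for illustration, not values from the study:

```python
# Toy lateral inhibition: each unit is suppressed by its immediate neighbours,
# sharpening an initially broad tuning curve. Values are illustrative.

def lateral_inhibition(responses, strength=0.4):
    out = []
    for i, r in enumerate(responses):
        neighbours = [responses[j] for j in (i - 1, i + 1)
                      if 0 <= j < len(responses)]
        out.append(max(0.0, r - strength * sum(neighbours)))
    return out

broad = [0.2, 0.6, 1.0, 0.6, 0.2]   # broadly tuned responses to numerosities 1-5
sharp = lateral_inhibition(broad)   # peak survives, flanks are suppressed
```

After inhibition, the ratio of the peak response to its flanks increases, giving the sharper number selectivity described in the text.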

On the other hand, it should be noted that a dataset of 1.2 million supervised images was used to train this network, which is very different from the emergence of number sense in the brain. In particular, the universal behavioral trait that infants as young as 3 months sleep for an average of 12–15 h per day needs to be taken more seriously (Galland et al. 2012; Mindell et al. 2010; Sadeh et al. 2009). Moreover, most sleep research to date has focused on the postlearning effects of sleep, such as memory consolidation. However, it is likely that sleep, or spontaneous activity more generally, also has forward effects on learning. For instance, the motor, visual, and auditory cortices each have characteristic spontaneous activity, such as neural oscillations (Higgins et al. 2021; Khazipov et al. 2004; Lehtelä et al. 1997; Ohki and Takei 2018; Ohki et al. 2016). Investigating these spontaneous activities (e.g., during sleep) from a more developmental perspective has the potential to reveal the forward effects of sleep on efficient learning in the brain.

To summarize this section, the brain very likely adopts a learning strategy different from that of ANNs (Figures 2 and 4E), because there are too many weights in the brain to be adjusted solely by data-driven learning. A possible alternative strategy is that the brain has preconfigured weights that extract fundamental statistical properties, which narrows down the number of adjustable weights and thereby realizes efficient learning. As seen in Figures 3 and 4, typical examples are spatial information processing and number sense. These neural mechanisms show activation patterns similar to those of adults from very early developmental stages, without requiring a large amount of training signals. These findings suggest that the brain may be using not only external inputs but also spontaneously generated brain activity (e.g., neural oscillations observed during sleep) as learning signals. Therefore, in the next section, we will discuss the neuronal foundation that generates spontaneous activities in the brain for efficient learning.

2.2 Brain with unequally biased weights

To elucidate efficient learning in the brain in more detail, we first introduce studies that elucidated the neuronal foundations generating spontaneous activities. Ikegaya et al. focused on unitary excitatory postsynaptic conductance (uEPSC) observed across a variety of ex vivo, in vitro, and in toto data (Ikegaya et al. 2013) and discovered that the amplitude distribution of uEPSCs (e.g., measured at CA3 synapses in hippocampal slice cultures) follows a heavy-tailed, log-normal distribution (Figure 5A); that is, the spontaneous organization of a recurrent neural network such as CA3 yields many weak synapses and a few strong ones that can produce a large depolarization (e.g., >5 mV). Furthermore, the occurrence of sufficiently large uEPSCs is strongly associated with reciprocal connections between neurons (Figure 5B), and only the larger uEPSCs generated by these neurons can induce reliable synchronous firing at the population level. Importantly, such synchronous firing is associated with the occurrence of SPW-Rs (Takahashi et al. 2010), one of the most typical spontaneous activity patterns in the brain (Figure 5C). Furthermore, the authors applied these neuronal properties (i.e., the log-normal distribution) to a novel neural network in silico (Figure 5D–F). As a result, this ANN reproduced the six most fundamental properties related to spontaneous activity: (1) a persistent spontaneous firing pattern at frequencies below 1 Hz, (2) asynchronous spikes among the majority of neurons (e.g., plastic cells), (3) a moderate excitatory–inhibitory balance, (4) network activity patterns that are robust against external perturbation, (5) responsiveness even to a single spike of excitatory neurons (e.g., rigid cells), and (6) precise firing sequences (e.g., replay and preplay). Importantly, this statistical property of a skewed distribution in the brain is not limited to uEPSCs.
In fact, the firing frequency of individual neurons, the number of synaptic connections, synaptic bouton size, local field potential (LFP) power spectrum, and axon diameters in various brain regions, including not only the hippocampus (Sayer et al. 1990) but also the neocortex (Cossell et al. 2015; Feldmeyer et al. 1999; Lefort et al. 2009; Song et al. 2005) and cerebellum (Brunel et al. 2004; Lanore et al. 2021), follow a log-normal distribution (Buzsáki and Mizuseki 2014). Thus, it is highly significant that such unequally biased neuronal weights reproduce and reflect the various features of spontaneous brain activity.
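The skewed-weight picture can be sketched numerically. The parameters below (the μ and σ of the underlying normal) are illustrative, not fitted to the uEPSC data; the point is only that a log-normal ensemble concentrates most of the total synaptic "drive" in a small fraction of strong synapses, while mean and median diverge sharply.

```python
import math
import random

def lognormal_weights(n, mu=-0.7, sigma=1.0, seed=1):
    """Sample synaptic weights from a log-normal distribution:
    many weak synapses plus a heavy tail of strong ones."""
    rng = random.Random(seed)
    return [math.exp(rng.gauss(mu, sigma)) for _ in range(n)]

w = sorted(lognormal_weights(10_000), reverse=True)
top10_share = sum(w[:1000]) / sum(w)   # drive carried by the strongest 10%
mean_w = sum(w) / len(w)
median_w = w[len(w) // 2]              # mean >> median marks the skew
```

With σ = 1, the strongest 10% of synapses carry roughly a third or more of the total weight, mirroring the "few rigid, many plastic" picture; a Gaussian ensemble of the same mean would show no such concentration.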

Figure 5: 
The weights of a neural network and an artificial neural network (ANN). (A) A long-tailed distribution of the amplitudes of unitary excitatory postsynaptic conductance (uEPSC) at CA3–CA3 synapses in hippocampal slice cultures. The broken line indicates the least-squares best fit for the log-normal distribution. The two left insets indicate a presynaptic spike and 30 trials of postsynaptic responses. In the right inset, a semilog plot of the uEPSC distribution is shown. (B) Three neuronal connection motifs for producing uEPSCs. Left, middle, and right: reciprocal, convergent, and divergent connections, respectively. A significant linear relationship was observed between the uEPSC amplitudes and the reciprocal cell connection (#1 and #2), but not with the other connection motifs. (C) Spike sequences in spontaneously active networks of hippocampal slice cultures. The spiking activity of CA3 neurons was recorded using functional multineuron calcium imaging (fMCI). In the sample data, forward and reverse spike sequences, denoted in red and blue, respectively, were observed. (D) Persistent spontaneous activity of a recurrent network with strongly biased synapses without external input in silico. Raster plots of representative spontaneous activity of all 4000 excitatory and 1000 inhibitory neurons. At the bottom, the histogram represents the number of excitatory neurons activated in a time window of 10 ms. (E) Distribution of the frequency of spontaneous spiking activities in excitatory (red) and inhibitory (blue) neurons in the simulated network. (F) Representative sequential firing patterns in the simulated network. In this figure, a sequential firing motif consisting of four cells was repeated five times within 200 s. (G) An illustration of the variational autoencoder (VAE) and variational sparse coding (VSC) model. An observed variable ${x}_{i}$ in gray is generated from an unobserved variable ${z}_{i}$ in both models. Usually, a Gaussian distribution is chosen as the likelihood function to fit the expected natural variation in the data for a VAE. In contrast, VSC models hold a sparse normal distribution with a spike-and-slab prior distribution. In the middle panel, the classification accuracies of VAE (blue) and VSC (red) at varying numbers of latent space dimensions are demonstrated. While the VAE shows its peak classification accuracy only for the optimal selection of latent space dimensions, VSC maintains its peak performance across varying latent space dimensions. (H) A simplified illustration of the convolutional neural network (CNN). Initially, the weights of the nodes in the CNN are evenly distributed. These weights are gradually fixed and transformed into a Gaussian distribution after learning. The yellow and blue lines denote the weight distributions during the initial and final learning phases, respectively. (I) A simplified illustration of the brain network. In the brain, there are significant biases in weight and connectivity among individual neurons, denoted as “R” in blue and “P” in red. Such biases strongly predetermine the flow of information processing. Thick arrows from the R cells represent strong synaptic connections, which have more influence on downstream neurons (i.e., P). Dashed arrows represent weak synaptic connections of downstream neurons, denoted as P cells, which have a smaller effect. These weight distributions are represented as a log-normal distribution in the bottom panel. Panels (A–F) have been adapted from Ikegaya et al. (2013) and (G) from Tonolini et al. (2020) with permission.

As ANN research has a similar focus on network weights, there are attempts to design “sparse” ANNs by adjusting weights via regularization (Bui et al. 2021; Ma et al. 2019; Mitsuno et al. 2020; Rasmussen and Bro 2012). The ultimate goal of sparse ANNs is to find only the weights significant for predicting the data. In contrast, deep learning models such as CNNs aim to maximize prediction accuracy. In other words, sparse ANNs pursue the “why” of objective functions, whereas CNNs seek the “how.” As a representative model, Tonolini et al. (2020) developed a novel variational sparse coding (VSC) model by integrating variational autoencoders (VAEs) and a sparse coding algorithm, imposing sparsity on the latent space of a VAE with a spike-and-slab prior distribution (Figure 5G). A characteristic feature of this model is that performance does not decrease as the latent-space dimensionality increases; in other words, overfitting can be prevented. Furthermore, the learned content can be visualized more intuitively; for example, manipulating one latent code produced an intuitive change in the image (e.g., the length of a sleeve). In addition, from the perspective of efficient learning, sparse models do not need large datasets or graphical processing unit (GPU) computation (i.e., they have a lower cost), and they are well suited to handling linear structure in data. It has been reported that the number of weights can be reduced by 80% without altering classification accuracy (Yaguchi et al. 2018). Thus, manipulating weights may be an effective strategy for implementing efficient learning in ANNs. However, sparse ANNs have several problems; for example, their prediction accuracy is inferior to that of CNNs when a large dataset is available. Here, it would be useful to refer to brain mechanisms to overcome these problems. For example, in sparse ANNs with lasso regularization, weights close to 0 are replaced with 0.
This regularization approach is distinct from the actual brain mechanism. In addition, in the VAE and VSC examples, a Gaussian or Bernoulli distribution and a Gaussian with a spike-and-slab distribution are used, respectively. A similar weight distribution was observed in CNNs trained on the Modified National Institute of Standards and Technology (MNIST) database (Figure 5H). These weight distributions in ANNs do not correspond to those of the brain; in a real brain, a Gaussian distribution is extremely rare (Figure 5I). Therefore, developing network architectures that incorporate the properties of a log-normal distribution may provide new insights into the development of new ANNs.
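The lasso behavior described above (weights close to 0 replaced by exactly 0) comes from the soft-thresholding operator, the proximal step of the L1 penalty. A minimal sketch, with an arbitrary threshold `lam` and invented weight values:

```python
def soft_threshold(w, lam):
    """Proximal operator of the L1 penalty (the lasso update step):
    shrink each weight toward zero, and clip weights within lam of
    zero to exactly 0, producing genuine sparsity."""
    return [0.0 if abs(x) <= lam else (x - lam if x > 0 else x + lam)
            for x in w]

weights = [0.9, -0.05, 0.02, -0.7, 0.001, 0.3]
sparse = soft_threshold(weights, lam=0.1)
# small entries become exactly zero; large ones shrink toward zero by lam
```

Unlike a Gaussian prior, which merely shrinks all weights, this operator drives a subset to exactly zero, which is precisely why lasso-style regularization yields sparse networks but also why its weight distribution (a spike at zero plus a shrunken bulk) differs from the brain's log-normal profile.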

In summary, this section focused on the brain’s spontaneous activities and the fundamental properties of the weights necessary for generating them, which are involved in the brain’s efficient learning mechanism. To generate spontaneous activities, it is important that the weights follow a log-normal distribution (Figure 5I). Such a skewed distribution implies the existence of a small number of neurons, described as rigid cells, that have a strong influence on other neurons, and a large number of neurons, denoted as plastic cells, that do not. In fact, the activity of rigid cells can robustly induce population activities such as SPW-Rs (Figures 1B and 5C, F). These findings suggest that there is an underlying spontaneous activity pattern that could work as a “template” for learning, and that skillful usage of such templates may allow more efficient learning. Therefore, in the next section, we will discuss how the templates created by the brain’s skewed weight distribution actually improve learning efficiency. In parallel, recent ANN studies have developed novel ANNs by making some weights “sparse,” thereby achieving efficient learning. Future ANN research may incorporate the properties of a log-normal distribution for weights to create novel ANNs that allow more efficient learning.

2.3 Learning as reassociation

We have described the unequally biased weights in the brain that can generate spontaneous activities, in the context that self-organized activities can serve as a training signal for the brain. In this section, we review the effect of spontaneous activities on learning. The first question to be addressed is the role of unequally weighted neurons in a population during learning. To address this issue, Grosmark and Buzsáki (2016) focused on spike sequences before, during, and after a one-dimensional spatial learning task. Importantly, in this study, neurons with a high firing frequency, described as rigid cells, and neurons with a lower firing rate but dynamic changes in response to learning, denoted as plastic cells, were quantified separately (Figure 6A). Critically, during and after the spatial learning task, the authors observed no significant changes in rigid cell activities; however, plastic cells that fired less frequently before learning showed significant spiking changes during and after learning (Figure 6B). These findings suggest that the brain’s strategy for efficient learning is to maintain basic patterns (i.e., rigid cells) while minimizing changes (i.e., plastic cells), which is cost-effective. Although the spike sequence seen prior to the task is usually referred to as preplay, some concerns and negative results have been reported in recent years, especially regarding statistical methods (Gillespie et al. 2021; Tingley and Peyrache 2020; van de Ven et al. 2020). These issues and alternative methods are discussed in a later section.

Figure 6: 
Learning by neuronal reassociation. (A) Spatial learning by reassociation. Left: time-series firing patterns of place cells (n = 77) corresponding to space during task execution. Right and bottom: spike sequence changes before (PRE-epoch sleep), during immobility in the novel maze, and after (POST-epoch sleep) learning. (B) Properties of rigid (blue) and plastic (red) neurons differ during learning. Within-session improvement of plastic neurons’ in-field versus outside-field firing ratios. Plastic cells improved spatial coding overlaps on the maze. The shaded areas show the bootstrapped 95% confidence interval. The second panel from the left shows a per-neuron summary of within-field firing specificity changes (mean ± SE). The third panel demonstrates differences in spatial coding properties within and outside the place field firing rate ratio (log-mean ± log-SE). The rightmost panel shows spatial information per spike (log-mean ± log-SE). (C) Brain–machine interface (BCI) learning experiment. On the left, a schematic of the BCI system is depicted. Under visual feedback, the rhesus macaques were required to produce neuronal population activities that move a cursor to hit visual targets. On the right, the cursor movements via BCI before learning (intuitive mapping), during early perturbation (perturbed mapping), and after learning (perturbed learning) are shown. (D) Neural population activity changes before and after learning. Left inset: population activity patterns shown as the 2D output before and after learning. The black and red dots denote population activity during the last 50 trials under the intuitive mapping and during the 50 best-performance trials under the perturbed mapping, respectively. Each point in the axes (${V}_{x}$ and ${V}_{y}$) indicates population activity that moves the cursor. The black and red outlines represent 98% of the neural population patterns before and after learning. Left: the neural population activities after learning are depicted separately for each movement direction. The gray and red zones represent the distribution pattern of the neural population coding each direction (0–2$\pi$). In this calculation, covariance matrices were calculated for the z term of the factor analysis. As a result, no significant changes in the population activities were found across the learning phases (from the intuitive to the perturbed mapping). Center: quantification of the covariance of the population activities along the dimensions of the intuitive mapping. Each data point represents one experiment, and the diagonal line indicates unity. Right: covariance along the dimensions of the perturbed mapping. As in the center panel, the diagonal line denotes unity. Panels (A–B) have been adapted from Grosmark and Buzsáki (2016) and (C–D) from Golub et al. (2018) with permission.

A similar neuronal phenomenon in the neocortex was reported in another type of learning task using a brain–computer interface (BCI). In this study, the authors recorded population activities in the motor cortex of monkeys learning proficient control of a cursor (Golub et al. 2018). First, the authors applied factor analysis to extract population activity patterns corresponding to eight directions of cursor movement, denoted as intuitive mapping, from the 80 calibration trials (Figure 6C). The population activity patterns were then fed to a Kalman filter to estimate cursor velocity and position. After the calibration phase, the monkeys skillfully adapted to the BCI environment, generating brain activity that moved the cursor appropriately to the indicated targets (left panel in Figure 6C). Following the calibration phase, a perturbed BCI mapping was created by shuffling the correspondence between the decoder (i.e., the z term in the factor analysis) and the population activity patterns, disturbing the original relationship between neural population activities and cursor movements (middle panel in Figure 6C). The authors used the perturbed BCI mapping to investigate the changes induced in the neural population by learning to move the cursor (right panel in Figure 6C).

Initially, the authors hypothesized three possible learning-related changes. The first hypothesis was that learning occurs through realignment: the overall neuronal population activity would be globally remapped, and a new neuronal repertoire would be formed by learning. The second scenario involves learning by rescaling: the animal would learn to rescale the variance of neural population activity along each dimension to adjust for the change in each dimension’s influence on cursor movement caused by the perturbed mapping. The final hypothesis is that learning occurs through reassociation: the brain generates a limited set of firing patterns corresponding to the intuitive mapping, and learning does not significantly alter these original patterns; instead, the monkey reuses existing population activity patterns for different intended movements. The results supported the reassociation hypothesis: the neuronal population activities after learning demonstrated a nearly complete overlap with the activation patterns before learning (Figure 6D). Although this study did not examine the firing characteristics of individual neurons, the finding that population activity patterns did not change significantly after learning is consistent with the results of the aforementioned study. In other words, these studies suggest that the brain reuses existing patterns for improved learning efficiency. The brain can, of course, generate novel population activities to learn new skills; however, it is important to note that creating a novel activity pattern to acquire a new skill takes a certain number of trials and days (Oby et al. 2019).

In short, when learning, the brain does not completely rewrite its weights but reuses existing ones. (The effect of relearning on weights in ANNs is described in the next section.) In particular, rigid cells serve as templates on which efficient learning is based. These templates are continuously generated during spontaneous activities and maintained during learning, a typical example being SPW-Rs. The effect of learning, in contrast, can be quantified as a change in the activity of plastic cells, which add information to the template maintained by rigid cells, encoding more learning-specific information. This efficient learning strategy of the brain (i.e., learning as reassociation) is observed not only in spatial learning but also in motor learning tasks using BCI. Based on these findings, the brain appears to skillfully separate the weights that should be changed from those that should be maintained during learning. Thus, the mechanism for efficient learning in the brain seems to converge on two points: the use of spontaneous activities as training signals and templates, and the limitation of the number of adjustable weights.
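The rigid/plastic separation can be caricatured in a few lines of code: gradient updates are applied only to a "plastic" subset of weights, while the "rigid" template weights stay frozen. This is a conceptual sketch, not a model of hippocampal or motor-cortical learning; the feature values, target, and learning rate are arbitrary.

```python
def predict(w, x):
    """Linear readout over a fixed feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))

def update_plastic(w, x, y, plastic, lr=0.05):
    """One squared-error gradient step applied ONLY to plastic weights;
    rigid weights are left untouched, mimicking template reuse."""
    err = predict(w, x) - y
    return [wi - lr * err * xi if i in plastic else wi
            for i, (wi, xi) in enumerate(zip(w, x))]

# Weight 0 is 'rigid' (the template); weights 1 and 2 are 'plastic'.
w = [1.0, 0.0, 0.0]
x, y = [1.0, 0.5, 2.0], 3.0
for _ in range(100):
    w = update_plastic(w, x, y, plastic={1, 2})
```

After training, the prediction matches the target while the rigid weight is bit-for-bit unchanged: all of the learning-specific information has been absorbed by the small plastic subset, which is the cost-saving structure both studies above point to.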

2.4 Summary of efficient learning

The conclusions of these studies regarding efficient learning in the brain are as follows. First, the brain has preconfigured mechanisms before birth that are good at extracting certain information, which can serve as the neuronal foundation of efficient learning as a biological constraint. Mechanisms such as the brain’s spatial system and number sense are likely genetically encoded and/or formed by spontaneous activities such as neural oscillations, as they demonstrate their functionality very early in development, before sufficient training data are available. Second, we discussed the network properties necessary to generate the spontaneous activity needed for efficient learning. The brain has unequally weighted connections characterized by a log-normal distribution. Such unequally biased mechanisms give rise to the basic brain motifs (i.e., spontaneous spike sequences) seen during SPW-Rs. The brain does not significantly rewrite these motifs during learning but rather reuses them; an advantage of this strategy is minimized learning cost. In this respect, the network weights of ANNs, such as VAEs and CNNs, follow a distinctly different distribution, such as a Gaussian distribution. A new ANN design that enables more efficient learning would consider how to implement spontaneous activity patterns, including log-normally distributed weights, in ANNs. Some of these challenges will be discussed in the next section.

3 Continual learning and sleep

In the previous section, we discussed how the brain achieves efficient learning by keeping some weights unchanged. In this section, we focus on the neuronal mechanism by which some weights are maintained and others are consolidated. Sleep has an important functional significance in this process. Sleep is the most representative self-organized state of the brain, during which characteristic spontaneous activities such as SPW-Rs, spindles, and slow waves have been demonstrated to be systematically associated with the sleep state, such as REM and non-REM sleep (Adamantidis et al. 2019; Helfrich et al. 2018; Mander et al. 2017). Although the functionality of sleep for learning is only partially elucidated (Girardeau and Lopes-Dos-Santos 2021; Mikutta et al. 2019; Samanta et al. 2020; Walker and Stickgold 2004), the role of sleep phenomena such as replay in continual learning has received increasing attention in recent years with the rise of ANNs (Chen and Liu 2018; Flesch et al. 2018; Roscow et al. 2021; Wittkuhn et al. 2021). In neuroscience, the functionality of replay has traditionally been discussed in the context of memory consolidation (Buch et al. 2021; Girardeau and Lopes-Dos-Santos 2021; Girardeau et al. 2009; Gridchyn et al. 2020). More recently, its role in supporting continual learning and preventing catastrophic forgetting, in comparison with ANNs, has attracted attention (Figure 7A). In this section, we provide an overview of replay and introduce new ANNs, such as generative replay, that achieve continual learning.
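The generative-replay idea (Shin et al. 2017) can be sketched with stand-in components: a "generator" that merely fits a per-class Gaussian and replays pseudo-samples, and a "solver" that is a nearest-class-mean classifier refit from whatever batch it receives. All class labels, means, and sample counts below are invented for illustration; the real scholar model uses a trained generative network and a neural-network solver.

```python
import random

class Generator:
    """Toy 'generator': fits a per-class Gaussian to data it has seen
    and replays pseudo-samples from it (stand-in for a trained GAN/VAE)."""
    def __init__(self):
        self.params = {}  # label -> (mean, sd)

    def fit(self, data):
        by_label = {}
        for x, y in data:
            by_label.setdefault(y, []).append(x)
        for y, xs in by_label.items():
            m = sum(xs) / len(xs)
            sd = (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5 or 0.1
            self.params[y] = (m, sd)

    def replay(self, n, rng):
        labels = list(self.params)
        out = []
        for _ in range(n):
            y = rng.choice(labels)
            m, sd = self.params[y]
            out.append((rng.gauss(m, sd), y))
        return out

class Solver:
    """Nearest-class-mean classifier, refit from scratch on each batch."""
    def __init__(self):
        self.means = {}

    def fit(self, data):
        by_label = {}
        for x, y in data:
            by_label.setdefault(y, []).append(x)
        self.means = {y: sum(xs) / len(xs) for y, xs in by_label.items()}

    def predict(self, x):
        return min(self.means, key=lambda y: abs(x - self.means[y]))

rng = random.Random(0)
task_a = [(rng.gauss(-3, 0.3), "A") for _ in range(50)] + \
         [(rng.gauss(-1, 0.3), "B") for _ in range(50)]
task_b = [(rng.gauss(+1, 0.3), "C") for _ in range(50)] + \
         [(rng.gauss(+3, 0.3), "D") for _ in range(50)]

# Without replay: refitting on task B alone wipes out task A's classes.
forgetful = Solver(); forgetful.fit(task_a); forgetful.fit(task_b)

# With generative replay: mix task B with pseudo-samples of task A.
gen = Generator(); gen.fit(task_a)
scholar = Solver(); scholar.fit(task_b + gen.replay(100, rng))
```

The `forgetful` solver illustrates catastrophic forgetting (it can no longer produce task A's labels at all), whereas the `scholar` retains both tasks because the generator's pseudo-samples substitute for the old data, analogous to hippocampal replay re-presenting past experience during sleep.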

Figure 7: 
Neural implementation for continual learning. (A) Catastrophic forgetting. Left: in the case of the brain, previously learned content (red) is maintained even after new learning (blue) is conducted. Right: in the case of a typical ANN, on the other hand, previously learned information is rapidly lost. (B) Closed-loop manipulation of sharp-wave ripples (SPW-Rs). The left panel shows an example of silencing SPW-Rs (a target trial) and a delayed control. In the central panel, SPW-Rs induce suppression of the fEPSPs, while silencing of SPW-Rs prevents synaptic depression. The right panel indicates that behavioral spatial memory task performance decreased significantly in the SPW-R silencing condition, suggesting that synaptic depression via SPW-Rs is necessary for spatial memory acquisition. These figures (B) are adapted from Norimoto et al. (2018). (C) Triple activation of SPW-Rs across the ventral part of hippocampal CA1 (magenta), the basolateral nucleus of the amygdala (green), and layer 5 of the prefrontal cortex (blue). In the upper, middle, and bottom left panels, the wideband LFP, filtered LFP, and the instantaneous ensemble activation strength and spikes measured from the three regions, respectively, are shown. On the right, the activation order across the three regions. The letters C, B, and P denote CA1, amygdala, and prefrontal cortex, respectively. The gray line indicates the mean and 95% confidence interval. These figures are adapted from Miyawaki and Mizuseki (2022). (D) One example of a brain-inspired generative ANN that is robust under continual learning. Top: schematic representation of sequential training of the ANN, described as the scholar model. Note that training the scholar model sequentially is identical to continual learning. This ANN consists of two parts: a generator ($G$) and a solver ($S$). First, the current generator is trained to imitate a mixed data distribution of real images ($x$) and data ($x'$) generated by the previous generator in the old model via replay. The solver learns both from real input–target pairs ($x, y$) and generated, or replayed, pairs ($x', y'$). By repeating this process sequentially, previous learning contents are not lost even when new learning contents are added to the network. Middle: classification accuracies for two different datasets. (a) The scholar model was first trained on MNIST and then on the Street View House Numbers (SVHN) dataset, or the learning order was reversed (b). The thick (M) and dim curves (S) denote the classification accuracy for the original and new data, respectively. ER (green) denotes learning where the model replayed real past data paired with the predicted target from the old solver, denoted as exact replay. GR (orange) represents learning where the previous learning content was generated by the generator. As the baseline, when sequential training without the generator was applied to the solver (denoted as “none,” in purple), catastrophic forgetting set in, resulting in poorer classification accuracy. Bottom: examples of data generated for replay using the generator model. In this example, MNIST was used for training first, followed by SVHN. From the left, the numbers of training sessions were 1000, 2000, 5000, 10,000, and 20,000. These figures are adapted from Shin et al. (2017).

3.1 Replay and SPW-Rs

Replay was first discovered by Pavlides and Winson (1989), who found that hippocampal cell assemblies encoding specific spatial information during the wake state increased their firing rate and synchronous firing probability during subsequent slow-wave sleep. The phenomenon is called replay because the learned content is repeated, or reactivated, during sleep in a temporally compressed form (e.g., several meters of spatial experience are compressed into approximately 100 ms of activity). Although many studies have focused on the elevated activity of cell assemblies that consolidates learned content, replay has also been reported to simultaneously downregulate the weights of unrelated synapses (Navarro-Lobato and Genzel 2019; Norimoto et al. 2018; Samanta et al. 2020) (Figures 1B and 7B). Furthermore, it has been robustly confirmed that replayed firing occurs in the reverse as well as the forward direction (Ambrose et al. 2016; Foster and Wilson 2006). Although little is known about the functional difference between forward and reverse replay, reverse replay has been reported to occur frequently in association with reward learning (Bhattarai et al. 2020; Liu et al. 2021). The synchronous firing that usually accompanies replay can be observed at the LFP level as high-frequency activity (80–200 Hz, i.e., SPW-Rs). Thus, sleep replay and its relation to learning have been a major topic in the neuroscience community for decades.
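As a concrete illustration, candidate SPW-R events are commonly extracted from the LFP by band-pass filtering in the ripple band and thresholding the amplitude envelope. The following is a minimal sketch in Python; the sampling rate, band edges, and z-score threshold are illustrative assumptions of this example, not parameters from any specific study cited here.

```python
# Minimal sketch of offline ripple detection: band-pass the LFP in the
# ripple band (80-200 Hz), take the Hilbert envelope, and keep stretches
# where the z-scored envelope exceeds a threshold.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def detect_ripples(lfp, fs=1000, band=(80, 200), z_thresh=3.0):
    """Return (start, stop) sample indices of candidate ripple events."""
    # Zero-phase band-pass filter in the ripple band.
    b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, lfp)
    # Instantaneous amplitude envelope via the Hilbert transform.
    envelope = np.abs(hilbert(filtered))
    # Z-score the envelope and extract supra-threshold stretches.
    z = (envelope - envelope.mean()) / envelope.std()
    above = np.concatenate(([False], z > z_thresh, [False]))
    edges = np.flatnonzero(np.diff(above.astype(int)))
    # Rising and falling edges alternate, giving half-open (start, stop) pairs.
    return list(zip(edges[::2], edges[1::2]))
```

In practice, detection pipelines add further criteria (minimum event duration, merging of nearby events, comparison against a movement-free baseline), but the filter-envelope-threshold core is the same.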

The terms replay and ripple oscillations have often referred specifically to hippocampal activity, because both were originally found in the hippocampus (Buzsáki et al. 1983; Pavlides and Winson 1989). Recently, however, similar spiking activities and high-frequency oscillations have been observed in other regions, such as the parietal association cortex, medial prefrontal cortex (mPFC), and amygdala (Khodagholy et al. 2017; Miyawaki and Mizuseki 2022), occasionally synchronous but mostly asynchronous with the hippocampus. Thus, it is highly likely that neural activities with properties similar to replay and ripple oscillations exist in other brain regions (ripples in the mPFC will be discussed in the next section). Importantly, synchronous ripples across these multiple regions showed a robust temporal ordering (Figure 7C), consistent with the hippocampal role proposed by the hippocampal memory index hypothesis (Buzsáki and Tingley 2018).

To summarize, during sleep, the brain uses neural oscillations (e.g., SPW-Rs) that occur during spontaneous activity to maintain, strengthen, and weaken synaptic weights. A typical phenomenon is replay, initiated by the firing of rigid cells, during which the activity of plastic neurons associated with learning is enhanced while the activity of neurons not associated with learning is suppressed. Through such an elaborate mechanism, the brain keeps its networks in a constantly optimized state.

3.2 Brain-inspired generative replay model to prevent catastrophic forgetting

Recently, not only neuroscientists but also engineers have been inspired by replay to develop new ANNs, mainly to overcome catastrophic forgetting. Catastrophic forgetting, or catastrophic interference, was first reported by McCloskey and Cohen (1989). They found that ANNs tend to forget previously learned content after training on new data (Figure 7A). This is mainly because ANNs are not equipped with nonadjustable weights analogous to rigid cells; the weights determined in prior training are easily overwritten by training on new data. Transfer learning with backpropagation-based fine-tuning in deep neural networks generally suffers from this problem (Chen and Liu 2018). Thus, creating novel ANNs that retain information and achieve lifelong learning, as the brain does, is an important research topic attracting great interest in the deep learning community (Chen and Liu 2018; Hadsell et al. 2020; Kirkpatrick et al. 2017).

Many methods, such as progressive neural networks, elastic weight consolidation, "learning without forgetting," and FearNet, have already been proposed to solve this problem (Jung et al. 2018; Kemker and Christopher 2017; Kirkpatrick et al. 2017; Li and Hoiem 2018; Palm et al. 2018; Rusu et al. 2016; Shin et al. 2017). For instance, the strategy employed in elastic weight consolidation is intuitively similar to that of the brain described in the previous section: some weights important for previous tasks are maintained (analogous to rigid cells), while variable nodes (analogous to plastic cells) are selectively adjusted, i.e., parameters are constrained to stay in the previous task's low-error region (Kirkpatrick et al. 2017). Another insightful brain-inspired ANN is deep generative replay, developed from the finding that the hippocampus is comparable to a generative model, as shown in Figures 5C and 6A (Kumaran et al. 2016; Shin et al. 2017; van de Ven et al. 2020). Shin et al. (2017) applied generative models that maximize the likelihood of generated pseudo-data under a given real data distribution (Goodfellow et al. 2020) and developed a deep generative replay framework with a cooperative dual-model architecture consisting of a generator (G) and a solver (S). In this model, the generator G produces images similar to real ones, and the solver S is a task-solving model parameterized by θ with respect to the loss function (Figure 7D). The two components are trained in two independent steps. First, G receives not only the current input (x) but also data (x′) generated from previous learning, so that G learns to imitate the real data distribution as closely as possible. This learning process is described as intrinsic replay or pseudo-rehearsal, because not only new data but also pseudo-data generated from previous learning are reused, or replayed, to prevent catastrophic forgetting (Robins 1995). Subsequently, S learns to tie inputs and targets from both real input-target pairs (x, y) and replayed input-target pairs (x′, y′). The loss function of the i-th solver is defined as follows:

$$L_{\mathrm{train}}(\theta_i) = r\,\mathbb{E}_{(x,y)\sim D_i}\big[L\big(S(x;\theta_i),\, y\big)\big] + (1-r)\,\mathbb{E}_{x'\sim G_{i-1}}\big[L\big(S(x';\theta_i),\, S(x';\theta_{i-1})\big)\big]$$

where θ_i are the network parameters of the i-th model, D_i is the data distribution of the current task, G_{i−1} is the generator from the previous model, and r is the ratio of real data included. Note that the second loss term should be ignored when i = 1 because there are no replayed data.
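The solver's mixed objective above can be sketched directly. In this toy version, which is an assumption of this example rather than Shin et al.'s implementation, the solver is a bare linear softmax classifier and the replayed inputs are simply passed in as arrays instead of being sampled from a trained generator.

```python
# Toy sketch of the solver's mixed loss in deep generative replay:
# L_train = r * L(S(x), y) + (1 - r) * L(S(x'), S_old(x')).
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(p, q):
    # Mean cross-entropy between target distribution p and prediction q.
    return -np.mean(np.sum(p * np.log(q + 1e-12), axis=1))

def solver_loss(theta, x, y_onehot, x_replay, old_theta, r):
    """Mixed loss over real data and replayed data distilled from the old solver."""
    # First term: supervised loss on real input-target pairs (x, y).
    real_term = cross_entropy(y_onehot, softmax(x @ theta))
    # Second term: match the previous solver's predictions on replayed data x'.
    old_pred = softmax(x_replay @ old_theta)
    replay_term = cross_entropy(old_pred, softmax(x_replay @ theta))
    return r * real_term + (1 - r) * replay_term
```

With r = 1 the loss reduces to the ordinary supervised term, matching the remark above that the replay term is dropped for the first task.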

In summary, catastrophic forgetting has long been a problem in ANN research. To overcome it, recent ANN research has incorporated the neurophysiological phenomenon of replay. In brain-inspired generative replay models, the ANN replays previously learned content, thereby preventing the overwriting of previous weights and enabling continual learning. Although this is a reverse-engineering approach that relies on findings from neuroscience, it is also a constructive effort to reexamine the validity of the brain's proposed learning mechanisms. In other words, SPW-Rs appearing during sleep maintain the weights associated with past learning, enabling the brain to continue learning. For social applications of ANNs, the applicability of this brain-inspired generative replay model is thought to be extremely large.
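For comparison with this replay-based approach, the elastic weight consolidation strategy mentioned earlier can be sketched as a quadratic penalty that anchors parameters important for previous tasks. The diagonal Fisher estimate and the function signatures here are illustrative assumptions of this sketch, not the exact formulation of Kirkpatrick et al. (2017).

```python
# Toy sketch of an elastic-weight-consolidation-style penalty: parameters
# with high (approximate) Fisher information are pulled back toward their
# values after the previous task, implementing "rigid" weights.
import numpy as np

def diagonal_fisher(per_sample_grads):
    """Approximate the diagonal Fisher as the mean squared per-sample gradient."""
    return np.mean(np.asarray(per_sample_grads) ** 2, axis=0)

def ewc_loss(new_task_loss, theta, theta_star, fisher, lam):
    """Total loss = new-task loss + (lam/2) * sum_i F_i * (theta_i - theta*_i)^2."""
    penalty = 0.5 * lam * np.sum(fisher * (theta - theta_star) ** 2)
    return new_task_loss + penalty
```

The penalty vanishes at the previous task's solution and grows quadratically as important parameters drift, which is the sense in which some weights behave like rigid cells while low-Fisher weights remain plastic.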

3.3 Summary of continual learning

Results from brain-inspired ANNs that originate from replay suggest why the brain incurs significant costs for intrinsic activities such as non-REM sleep (Raichle 2010; Zhang and Raichle 2010). Recent research has focused on the functionality of replay in relation to the consolidation of new memories. Brain-inspired ANNs, however, indicate that replay also subserves continual learning, that is, the maintenance of past memories. This process, known as a homeostatic balancing function, maintains only those weights necessary for past memories and downscales the weights (i.e., synapses) that are unnecessary (Norimoto et al. 2018; Tononi and Cirelli 2014). Such functionality may also rest on the sophisticated wiring of single neurons that has recently been revealed (Ishikawa and Ikegaya 2020). Thus, studies on brain-inspired ANNs have the potential to reveal the functionality of sleep in brain learning in a manner complementary to existing neuroscience research.
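The homeostatic balancing idea, keeping weights needed for past memories while downscaling the rest, can be caricatured in a few lines. The multiplicative downscaling rule and the notion of "tagged" protected synapses are simplifying assumptions of this sketch, not a model taken from the studies cited above.

```python
# Toy caricature of sleep-dependent homeostatic downscaling: untagged
# synapses are multiplicatively weakened, while tagged (learning-relevant)
# synapses are spared, keeping relative memory traces while lowering
# overall synaptic load.
import numpy as np

def sleep_downscale(weights, tagged, factor=0.8, floor=0.0):
    """Return a copy of `weights` with untagged entries scaled by `factor`."""
    w = weights.copy()
    w[~tagged] = np.maximum(w[~tagged] * factor, floor)
    return w
```

Repeated application of such a rule across "sleep" cycles drives unneeded weights toward the floor while the protected subset survives, which is the qualitative behavior attributed to replay-related downregulation above.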

Although we have mainly focused on ANNs that incorporate replay (SPW-Rs), better ANNs may be created in the future by incorporating other sleep-related mechanisms, such as slow waves and spindles. In recent sleep models, slow waves and spindles have also been discussed within the framework of memory consolidation (Helfrich et al. 2018; Latchoumane et al. 2017; Mikutta et al. 2019). However, functional differentiation exists among SPW-Rs, slow waves, spindles, and their triple cross-frequency coupling (Ohki and Takei 2018; Oyanedel et al. 2020). There are also reports of differences in frequency characteristics, such as fast and slow spindles, as well as differences in spatial distribution (Gonzalez et al. 2022; Mölle et al. 2011; Schilling et al. 2018). Therefore, incorporating these three spatiotemporally distinct neural rhythms into ANNs and clarifying their different natures has the potential to reveal a more positive function of sleep in continual learning and may also lead to the development of better ANNs.

4 Generalization

To understand the world better, our brains need to form abstractions across objects, events, sounds, meanings, movements, odors, space, time, and so on, and to learn the relations among pieces of knowledge. The brain can generalize such knowledge so that it can be utilized in novel environments. In short, generalization is a mental process that maintains a certain structure while extending only its adaptive range, supports the inference of causal relationships, and ultimately enables us to select more effective actions. These mental processes are essential, especially in science. For instance, Jules-Henri Poincaré famously stated that "mathematics is the art of regarding different things as the same thing" (Yamaguchi 2010). Indeed, from the perspective of mathematics and statistics, an area can be regarded identically as a probability. Likewise, in neuroscience, it has been suggested that the amplitude fluctuation of higher frequencies at a specific phase of a lower frequency (i.e., phase-amplitude coupling) can be viewed as equivalent to distance and cost in optimal transport theory (Ohki 2022). Naturally, generalization is not a privilege reserved for professionals. Every parent has seen the spectacle of a 2-year-old child acquiring a word, not just a sound but a new meaning, and generalizing it smoothly in different environments (Dehaene-Lambertz and Spelke 2015). The brain makes this happen with fragmented sensory inputs riddled with noise and ambiguity. However, it is not easy for developers to implement these learning characteristics of the brain in ANNs (Curtis et al. 2022; Tenenbaum et al. 2011; Xu et al. 2022). Thus, the existence of such an internal process is empirically self-evident, and there is no doubt that this type of computation reflects the specificity of the brain's learning functionality (Konidaris 2019; Zeki 1999).

Neural mechanisms implementing generalization are assumed to show identical functionality or similar activation patterns under stimuli and situations that are superficially different yet essentially identical. In our previous work, we described generalization as a mental schema and discussed the neurophysiological mechanisms involved, mainly focusing on neural oscillations (Ohki and Takei 2018). Our goal in this section is to review recent progress in this domain and to propose a mathematically valid way to quantify this concept, as Mental Schema 2.0. Recently, the hippocampus has been the central region for investigating this mechanism (Baram et al. 2021; Bowman and Zeithamova 2018; Kumaran 2012). The hippocampus has long been recognized as essential for episodic memory and learning (FeldmanHall et al. 2021; Jarrard 1993; Wallenstein et al. 1998). It is also a region where various mechanisms for spatial information processing have been discovered (Broadbent et al. 2004; Foster and Wilson 2007; Michon et al. 2019; O'Keefe and Recce 1993; Sherry et al. 1992; Stevenson et al. 2018; Wikenheiser and Redish 2015). Given this background, it is highly anticipated that research will elucidate how the hippocampus handles seemingly different information, such as spatial and episodic memory, in a generalized manner. In this section, we describe how such a brain mechanism may be implemented as a topological neural mechanism.

4.1 Schema cells in the brain

Recent neuroscience research has attempted to clarify generalization as a core functionality of the brain. For instance, one study investigated whether hippocampal cells in the monkey reflect spatial abstraction: extracting commonalities across multiple spatial experiences beyond superficial differences (Baraduc et al. 2019). To elucidate this neuronal process, the authors trained monkeys to explore a virtual maze with a joystick in search of an invisible reward (Figure 8A). To obtain rewards, the monkeys had to estimate their own location with respect to the landmarks and to learn the reward positions. After training in a familiar maze, where the landmarks were always presented at the same positions, the monkeys achieved an accuracy of more than 90% (Figure 8B). Subsequently, the animals were tested in an identically shaped novel maze with never-before-seen landmarks (Figure 8B). In the novel environment, they had to make more flexible spatial inferences without being distracted by landmark differences. After several dozen trials, the monkeys mastered this new task and showed a high percentage of correct responses (Figure 8B). As different landmarks were presented each time in the novel environment, learning landmark associations alone was not enough for success.

Figure 8: 
The generalization mechanism of spatial information in the brain. (A) Spatial information generalization task for macaque monkeys. The upper panel shows an allocentric view of the space used in the task. Five star-shaped corridors, landmarks placed between the corridors, and the locations of rewards (indicated by water droplets) are shown. The movement of the monkey is indicated by an orange line, and the gray inverted triangle indicates the monkey's field of view. As shown in the rightmost panel, after each trial, the monkey is automatically moved to one of four random aisle entrances, excluding the rewarded one. The bottom panel illustrates the monkey's internal space. (B) The leftmost panel illustrates the locations of landmarks and rewards used for learning (familiar trials). The middle panel shows an occasional novel task. The rightmost panel shows learning curves for the familiar and novel trials. (C) The left panel shows the activity of a neuron that exhibited a common pattern of activity during the familiar and novel tasks. These neurons also showed common firing activity in the task space and the task state space. The right panel shows the location of the hippocampus (green) where the neuronal activity was recorded. (D) The bottom panel shows the correlation between the neuronal activity observed in the familiar and novel trials. The left bar graph shows the percentage of neurons (about 30%) that showed a high correlation between the familiar and novel trials among all recorded neurons. Neurons indicated by the black line show map correlations insensitive to rotation. The histograms show map correlation coefficients in the position space and the task state space in dark orange and dark blue, respectively. The gray outline denotes the surrogate data distribution. (E) Spatial information generalization task for rodents. Left: A spatial rule and a spatial cue (i.e., light-guided) task. Rats were required to shift between the two distinctive tasks on a plus maze. Under the spatial rule, rats had to ignore the light cue, while in the light-guided task, rats had to follow it. Right: Percentage of errors in the 15 trials before the rule changed, the 15 trials after the rule changed, and the last 15 trials. Error bars show the standard error of the mean. (F) Coding of relative spatial position in the mPFC. The left panel depicts z-scored firing rate maps of one of the four linearized trajectories (i.e., north to east) in the hippocampus (HPC) and the mPFC. The right panel demonstrates an example of a firing rate map measured in mPFC cells. Irrespective of spatial and rule differences, some mPFC neurons show multiple firing fields. For instance, some neurons show selectively higher firing rates at the two start-arm positions, at the start and goal arms, or at the goal arms. (G) Population vector correlations between the south and north arms and between the east and west arms. In the right panel, in contrast with the hippocampal cells, the mean errors of decoding 1D and 2D spatial position via a Bayesian decoder are shown for the hippocampus (HPC) and mPFC. Importantly, the decoding errors for the 1D map significantly decreased only in the mPFC. (H) Replay in the hippocampus and mPFC. The left panel shows forward and backward replay corresponding to the linearized maze position observed in hippocampal CA1; the central panel shows similar replay observed in the mPFC. The right panel shows the percentage of co-occurrence of ripples and replay (%). (A–D) and (E–H) are adopted from Baraduc et al. (2019) and Kaefer et al. (2020), respectively.

Based on these behavioral observations, the authors investigated the activity of hippocampal cells during exploration of the familiar and novel mazes using two different quantification approaches. In the first, the monkeys' location was represented on a normal Cartesian map (Figure 8C). The second was a more internal spatial map, in which position was transformed into a task-related state space: navigation behavior was represented as possible action trajectories (e.g., rotations and translations) reflecting head orientation in the virtual maze. They found that some cells encoded both the familiar and novel environments, whereas many cells discriminated between the two. Furthermore, a subpopulation of these cells showed spatially similar activity patterns when the maps were realigned with respect to the reward positions. Similarly, spatially identical activations of these cells were observed when Cartesian map-based positions were compared with internal space-based positions (Figure 8C). Intriguingly, as the monkeys learned the novel environment, hippocampal activity in the novel environment became increasingly correlated with that in the familiar environment (Figure 8D). Based on these findings, the authors named hippocampal neurons that exhibit a common firing pattern across the two environments and spatial scales "schema cells."
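The screening logic behind identifying such cells can be sketched as correlating each cell's firing-rate map across the two environments. The map format and the correlation threshold below are illustrative assumptions of this example, not the exact procedure or criterion used by Baraduc et al. (2019).

```python
# Sketch of schema-cell screening: a cell whose spatial firing-rate map in a
# familiar environment correlates strongly with its map in a novel environment
# is a candidate for a generalized (schema-like) representation.
import numpy as np

def map_correlation(map_a, map_b):
    """Pearson correlation between two flattened firing-rate maps."""
    return np.corrcoef(map_a.ravel(), map_b.ravel())[0, 1]

def candidate_schema_cells(maps_familiar, maps_novel, r_thresh=0.5):
    """Indices of cells whose maps correlate across the two environments."""
    rs = [map_correlation(f, n) for f, n in zip(maps_familiar, maps_novel)]
    return [i for i, r in enumerate(rs) if r > r_thresh]
```

In the actual study, observed correlations were further compared against surrogate distributions (the gray outlines in Figure 8D) to establish significance, a step omitted in this sketch.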

The presence of schema cells outside the hippocampus has also been reported (Kaefer et al. 2020). Specifically, the authors used a rule-switching task (Figure 8E) and demonstrated that cells in the medial prefrontal cortex (mPFC) had multiple firing fields. These fields often occupied symmetrical locations, such as the start arms and/or goal arms of the maze, irrespective of movement direction and task rule. These cells were therefore assumed to generalize different spatial positions as the "start" and the "goal" (Figure 8F). The symmetric, or generalizing, coding properties of the mPFC were quantified using population vector correlations (Figure 8G). The average population vector correlations in the mPFC were high (0.89 and 0.92 for the start and goal arms, respectively), while those in the hippocampal population were quite low (0.07 and 0.06 for the start and goal arms, respectively). These results indicate that the mPFC population had similar spatial firing patterns in both the start and goal arms, irrespective of their positions on a Cartesian map. In addition, the authors observed a significant improvement in the decoding precision for the mPFC when the animal's spatial position between start and goal was transformed into a 1D map (i.e., a linearized map) rather than a 2D map (e.g., a Cartesian map). This improved decoding precision with a 1D map also indicates that the mPFC contained little Cartesian map-based spatial information (Figure 8G).
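Bayesian position decoding of the kind referred to here is typically built on an independent-Poisson spiking model. The following is a minimal sketch; the flat prior, the bin width, and the tuning-curve format are assumptions of this example, not the exact decoder used by Kaefer et al. (2020).

```python
# Sketch of a one-bin Bayesian (maximum a posteriori) position decoder under
# an independent-Poisson spiking model with a flat prior over positions.
import numpy as np

def decode_position(counts, tuning, dt=0.25):
    """MAP position from one bin of population spike counts.

    counts : (n_cells,) spike counts in the bin
    tuning : (n_cells, n_positions) expected firing rates (Hz)
    dt     : bin width in seconds
    """
    rates = tuning * dt  # expected spike counts per bin at each position
    # log P(counts | position) up to a counts-only constant, summed over cells.
    log_like = counts[:, None] * np.log(rates + 1e-12) - rates
    return int(np.argmax(log_like.sum(axis=0)))
```

Running such a decoder over 2D position bins versus linearized 1D bins, and comparing the decoding errors, is the kind of analysis summarized in Figure 8G.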

They also investigated mPFC and hippocampal population activity during replay. Cell assemblies in both regions showed task-related forward and backward replay (Figure 8H). However, the statistical properties of replay, such as its duration and the time-compression ratio of the trajectory, differed significantly between the mPFC and hippocampus. Furthermore, the cross-correlation of task-related replay between the two regions showed no significant effect, indicating that task-related replay in these areas generally occurred independently. On average, only 5% of all trajectory events in the mPFC occurred simultaneously with trajectory events in the hippocampus.

Both of these studies focused on spatial information processing. Knudsen and Wallis (2021), on the other hand, showed that hippocampal neurons encode not only physical space but also an abstract value space. In this study, the authors trained macaques to learn the values represented by a combination of three pictures. As shown in Figure 9A, each trial consisted of three phases: fixation, choice (free or forced), and reward. The three pictures carried different values, and the probability of obtaining a reward depended on which picture was chosen. Thus, the animals quickly learned to choose the picture with the higher reward probability. Importantly, the reward probabilities varied sequentially across trials. By skillfully varying the reward probabilities, the authors produced abstract value spaces shaped as a three-dimensional trajectory (Figure 9A), a circle, several loops of a helix, and a double lemniscate, as well as an ABA′ context design (Figure 9B). Across these experiments, hippocampal neurons consistently exhibited the same encoding properties as those observed during physical-space information processing. For example, during fixation (Figure 9A), the hippocampus generated spike sequences that encoded specific positions in the abstract value space in a direction-dependent manner. Furthermore, this study quantified the remapping of hippocampal neurons during the generalization of learning contexts (i.e., ABA′), identifying neurons that appear to be schema cells (Figure 9B). Specifically, during the context change (i.e., B to A′), the authors found three patterns of remapping. In the first, neurons showed a firing pattern in A′ similar to that in A. In the second, some neurons showed a common firing pattern in contexts B and A′. In the third, neurons maintained the same firing patterns as in A and B during the A′ presentation.
The authors argued that such context-independent neurons with consistent firing patterns may be involved in the generalization of learning content, that is, schema cells (or rigid cells). We should point out that some of these context-independent neurons may also play an important role in efficient and continual learning, as described in the previous sections. Accordingly, it is extremely important for future studies to investigate the behavior of such context-independent neurons (e.g., schema cells or rigid cells) in detail to elucidate their roles in generalization.

Figure 9: 
Hippocampal schema cells in value space fields. (A) Task procedure and hippocampal cell activities in an abstract three-dimensional value space. The upper left depicts the procedure of a single trial. First, the animals were required to look at the fixation point (the red point) for 700 ms and then to perform either a free-choice or a forced-choice task. The macaques chose pictures via saccades, which resulted in probabilistic delivery of reward (80 or 20%). At the bottom, the value changes of the three pictures across trials (left) and the value trajectory through value space (right) are shown. The right panel demonstrates histograms of spike density for six cells encoding place fields in the abstract three-dimensional value space (left); the firing rates of the six cells are shown superimposed on the trajectory through value space. (B) Value-related schema cells in the ABA′ task. Left column: Schematic representation of the ABA′ task. In this task, image sets consisting of three pictures, with the values shown at the lower left, are used in the order A, B, A′. The value changes can be expressed as a circle. Right column: Examples of neurons showing task-related activity. Top: Two neurons were spatially correlated only between the A and A′ blocks. Middle: Two neurons were spatially correlated only between the B and A′ blocks. Bottom: Two neurons were correlated both between A and A′ and between B and A′. The right triangle shows the correlation values of these neurons across the three blocks. All figures are adopted from Knudsen and Wallis (2021) with permission.
Figure 9:

Hippocampal schema cells in value space fields. (A) Task procedure and the hippocampal cell activities in an abstract three-dimensional value space. In the upper left, the task procedure of a single trial is depicted. First, the animals were required to look at the fixation point denoted as the red point for 700 ms and to perform either a free choice or a forced choice task. The macaques chose pictures via saccades, which resulted in the probabilistic delivery of reward (80 or 20%). At the bottom, the value change of three pictures across trials (left) and the value trajectory through value space (right). In the right panel, the histograms of spike density of six cells encoding the place field in an abstract three-dimensional value space are demonstrated (left). Firing rates of six cells are shown superimposed on the trajectory through value space. (B) Value-related schema cells in the ABA′ task. Left column: Schematic representation of the ABA′ task. In this task, image sets consisting of three pictures with values demonstrated at the lower left are used in the order of ABA′. The value changes can be expressed as a circular. Right column: Examples of neurons showing activities for this task. Top: Two neurons were spatially correlated on only A and A′ block. Middle: Two neurons were spatially correlated on only B and A′ block. Bottom: Two neurons were correlated on both A:A′ and B:A′ block. The right triangle shows the correlation values of the correlated neurons for these three blocks. All these figures are adopted from Knudsen and Wallis (2021) with permission.

We summarize the results of the three studies. The first and second studies applied two different experimental environments designed based on a Cartesian map and succeeded in identifying neurons encoding abstract spatial information in the hippocampus and mPFC, respectively. These cells showed common firing patterns even in different spatial environments. Specifically, Baraduc et al. demonstrated that schema cells do not merely learn associations of landmarks but can encode more abstract spatial information to make appropriate behavioral choices in different environments. Kaefer et al. found that, unlike the hippocampus, neurons in the mPFC did not rely on a Cartesian map but held multiple spatial fields encoding abstract spatial information such as start and goal positions. The properties of hippocampal and mPFC replay also suggested that the two regions encode spatial information differently. Furthermore, Knudsen and Wallis (2021) revealed two unique points. First, hippocampal neurons, which have been assumed to encode spatial information, can also encode abstract information such as value via the same neuronal mechanisms (e.g., pre-play and theta sequences) as those in physical space. These findings clearly indicate that the hippocampus is capable of encoding various information regardless of stimulus attributes. The second concerns the consistency of a neuron's activity during remapping: context-independent neurons were identified during remapping from the unique perspective of an abstract value space. Taken together, these three experimental results strongly suggest the existence of a generalization mechanism (e.g., rigid cells or schema cells) in the brain that encodes more abstract information beyond superficial details. 
It is important to note that hippocampal computation does not adopt completely different strategies for different kinds of information, such as spatial, nonspatial, and abstract value spaces. Therefore, to clarify the true functionality of the hippocampus, a comprehensive effort is needed to elucidate the computational properties of its generalization, focusing on schema cells. The next question we face is how to quantify the neuronal mechanism of generalization in more detail; hence, we proceed to discuss a computational approach called topological data analysis. In the next section, we introduce the basic concepts of topology and related previous works.

4.2 Encoding topological information and the topological data analysis

To elucidate hippocampal spatial encoding, most previous studies have used spatial designs based on a Cartesian coordinate system. Indeed, many types of neurons related to spatial information processing have been identified based on this coordinate system (Behrens et al. 2018; Danjo et al. 2018; Gauthier and Tank 2018; Høydal et al. 2019; Lever et al. 2009; Sarel et al. 2017). For instance, grid cells are generally characterized by a hexagonal firing pattern, but their firing features differ in grid spacing (distance between grid fields), grid orientation (rotation of the grid axis), and grid phase (xy position). Moreover, grid cells form semi-independent, module-like structures along the dorsoventral axis of the medial entorhinal cortex: smaller grid spacing dominates dorsally, and larger spacing is more prominent ventrally. (Notice that this modular structure in the grid cells is also closely related to shifts in the scale of grid orientation.) Thus, grid cells show a clear and detailed representation of the Cartesian coordinate system, including distances and directions. Accordingly, the distance and direction moved by rodents could be precisely estimated from grid cell activity (Burak and Fiete 2009; Fiete et al. 2008; Mathis et al. 2012; Wei et al. 2015). On the other hand, it has gradually been recognized that there is no absolute correspondence between the formation of the spatial map in the hippocampus and spatial learning based on a Cartesian coordinate system. Some place cells (e.g., plastic neurons) show large changes in their activity patterns depending on other sensory stimuli, such as a change of room color or local landmarks (e.g., remapping); however, navigational performance remains intact (Julian and Doeller 2021; Jeffery et al. 2003; Sanders et al. 2020), possibly due to the functionality of schema or rigid cells. Thus, the type of spatial information that place cells actually encode is still under debate.
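As a concrete illustration of this Cartesian-like code, grid-cell firing is often idealized in computational models as a sum of three cosine plane waves whose wave vectors are rotated 60° apart, which reproduces the hexagonal pattern together with its spacing, orientation, and phase parameters. The following is a minimal sketch of that idealization (our own illustration, not a model from the studies cited above; the parameter values are arbitrary):

```python
import numpy as np

def grid_rate(x, y, spacing=1.0, phase=(0.0, 0.0)):
    """Idealized grid-cell rate map: the sum of three cosine plane waves
    whose wave vectors are rotated 60 degrees apart, yielding a hexagonal
    firing pattern with the given grid spacing and spatial phase."""
    k = 4 * np.pi / (np.sqrt(3) * spacing)  # wave number for this spacing
    rate = 0.0
    for theta in (0.0, np.pi / 3, 2 * np.pi / 3):
        kx, ky = k * np.cos(theta), k * np.sin(theta)
        rate += np.cos(kx * (x - phase[0]) + ky * (y - phase[1]))
    return rate

# The pattern repeats on a hexagonal lattice: shifting the position by one
# lattice vector leaves the firing rate unchanged.
k = 4 * np.pi / np.sqrt(3)  # wave number for spacing = 1
lattice = (2 * np.pi / k, 2 * np.pi / (k * np.sqrt(3)))
```

Varying `spacing` along a simulated dorsoventral axis, or the rotation angles and `phase`, mimics the modular variation in grid spacing, orientation, and phase described above.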

In recent years, it has been suggested that place cells may encode information topologically (Babichev and Dabaghian 2018; Babichev et al. 2019; Chen et al. 2012, 2014; Dabaghian 2020; Dabaghian et al. 2012). Topology is concerned with the properties of shapes and spaces that are preserved under stretching and contraction, quantified as abstract structures called topological invariants (e.g., a mug with a perforated handle and a donut are considered isomorphic in topological space by the criteria of connection and number of cavities). Intuitively, we can consider a topological map as a subway map, where only connection matters, while a geometric map can be regarded as a topographic city map. In other words, the basic rule of topological space is "connection," regardless of the shape of the geometrical space. (A mathematical explanation of topology is given in the section "Topology and topological invariants.") Dabaghian et al. provided in vivo experimental evidence supporting the idea that place cells encode such topological information using a deformable U-shaped maze (Dabaghian et al. 2014). This maze allows its arms to be straight or folded into a zigzag; thus, its geometry, i.e., the orientations and protrusions of its arms, could be altered without changing its topological structure, because the relative order of connections in the maze was preserved. Intriguingly, the basic firing properties of place cells, such as their firing order and the connections among place fields, were maintained even when the rats moved through geometrically different environments. Similarly, other studies have reported the topological nature of place cells. One study demonstrated that as an open field increased in size, the place fields covered by place cells also expanded, while maintaining the same shape and connections of place fields (Muller and Kubie 1987). Comparable findings were observed in linear tracks (Gothard et al. 1996) and rectangular (O'Keefe and Burgess 1996) and morphing (Lever et al. 2002; Leutgeb et al. 2005) environments. Considering these observations, it seems that the nature of hippocampal encoding is literally topological; i.e., "connection" is the basic rule, independent of the shape of the geometrical space. In short, this series of studies revealed novel properties of hippocampal encoding by reconstructing spatial information from a topological perspective.
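The subway-map intuition can be made concrete with a toy version of the deformable maze: two geometric layouts that share the same connectivity. A minimal sketch (node names and coordinates are purely illustrative, in the spirit of the Dabaghian et al. 2014 maze):

```python
# Two geometric layouts of the same U-shaped track: one with straight arms,
# one with zigzag arms. Coordinates differ, but the order of connections
# along the track -- the topological structure -- is identical.
straight = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (0.0, 2.0),
            "D": (1.0, 2.0), "E": (2.0, 2.0)}
zigzag = {"A": (0.0, 0.0), "B": (0.5, 0.8), "C": (0.0, 1.6),
          "D": (0.9, 2.1), "E": (1.6, 1.5)}
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]

def topological_map(layout, edge_list):
    """Keep only the connectivity relation -- the 'subway map' view.
    The geometry stored in `layout` plays no role in the result."""
    return {frozenset(e) for e in edge_list
            if e[0] in layout and e[1] in layout}
```

Place cells whose firing follows `edges` rather than the coordinates would fire identically in both layouts, which is the essence of the deformable-maze result.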

The advantages of encoding topological information can be easily understood when its computational properties are taken into account. For example, one advantage of topological encoding is the ability to abstract information. Consider the subway map again as an example of a topological map. When we are actually riding the subway, topographic information (e.g., the gradient of the track) is of little importance; what matters is that the current station is connected to the destination station. Thus, such encoding has the unique benefit of simplifying and abstracting information. Other unique topological computational properties can be better understood by comparison with graph theory (Bassett and Sporns 2017; Sizemore et al. 2019). The graph-based method is one of the most frequently used connectivity analyses in neuroscience and has provided significant insights into brain networks, such as the scale-free properties found in the dentate gyrus and CA3 and the abnormal network patterns related to brain pathology (Bassett et al. 2018; Bonifazi et al. 2009; Li et al. 2010; Sunaga et al. 2020; Tagawa et al. 2022). Computationally, it relies on pairwise relationships, such as between two neurons or brain regions. In contrast, topological data analysis aims to capture multiple connections beyond pairwise relationships, which seems to be a more adequate assumption for the network properties of the brain (Giusti et al. 2016). Furthermore, complex neuronal connections or networks consisting of multiple neurons or brain regions can be quantified as a lower-dimensional manifold (Curto 2017). For instance, these computational properties of topological data analysis can reveal the coordination of multiple neurons, showing similar topological structures between spontaneous and visual task-related activity (Singh et al. 2008) and between waking and sleep (Chaudhuri et al. 2019), as well as the pathological traits of whole-brain networks in schizophrenia (Stolz et al. 2021). Similarly, topological data analysis has also been used in recent years for the development of ANNs (Cang and Wei 2017; Hensel et al. 2021; Hofer et al. 2017). Thus, a topological reconsideration of hippocampal encoding can reveal novel aspects of the brain, and topological data analysis has a unique computational advantage in abstracting and capturing patterns or structures (or commonalities) across multiple connections inside a cell assembly.
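The difference between pairwise (graph) analysis and the higher-order view of topological data analysis shows up already in the smallest possible example: three mutually connected nodes. A graph description sees a loop, whereas filling the triple in as a 2-simplex, as a clique complex does, removes it. A minimal numerical sketch (our own illustration):

```python
import numpy as np

# A triangle of three mutually connected nodes. Pairwise (graph) analysis
# sees a one-dimensional cycle: b1 = E - V + C = 3 - 3 + 1 = 1.
V, E, C = 3, 3, 1
b1_graph = E - V + C  # the triangle looks like a loop

# Topological data analysis typically builds a clique complex: three
# mutually connected nodes are filled in as a 2-simplex. The boundary map
# from the filled triangle to its three edges, d[v0,v1,v2] = [v1,v2] -
# [v0,v2] + [v0,v1], has rank 1, which cancels the loop:
# b1 = dim(cycles) - dim(boundaries) = 1 - 1 = 0.
boundary_2 = np.array([[1.0], [-1.0], [1.0]])  # rows: edges; column: the 2-simplex
b1_complex = b1_graph - np.linalg.matrix_rank(boundary_2)
```

The filled triangle has no cavity, so the higher-order description correctly reports that the three-way co-activity is one solid unit rather than a loop of pairwise links.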

We hypothesize that these topological properties (e.g., the ability to encode more abstract information and topological network structures represented as lower-dimensional manifolds), found in brain regions such as the hippocampus (Dabaghian et al. 2014), entorhinal cortex (Gardner et al. 2022), postsubiculum (Chaudhuri et al. 2019), and cortex (Singh et al. 2008), are closely related to the realization of generalization. For instance, a recent study revealed that the joint activity of grid cells can be represented as topological information, namely a toroidal manifold (Gardner et al. 2022). In particular, focusing on the topological properties of the hippocampus, which has a unique encoding property independent of stimulus attributes, is of great significance toward this goal. In the next section, we take this hypothesis one step further and examine the topological properties of the hippocampal memory space to further clarify the functional properties of generalization.

4.3 Elucidation of generalized memory space in the hippocampus via topological data analysis

As we have discussed, one of the unique functionalities of the hippocampus is that it can encode a wide variety of stimuli, such as visual, auditory, olfactory, spatial, and abstract value information, via identical neuronal mechanisms such as theta sequences and SPW-Rs. Moreover, hippocampal encoding can handle highly abstract information, such as topological spatial information. These encoding properties, together with the functionality of context-independent neurons such as schema cells and rigid cells, are thought to be involved in the realization of generalization in the brain. In this section, we discuss the generalized memory space of the hippocampus that could be revealed by topological data analysis.

To elucidate generalization in the brain, the most significant clue may be the wide variety of spike sequences (observable as SPW-Rs) that the hippocampus spontaneously generates during the off-state (e.g., slow-wave sleep), as demonstrated in Figure 5C, F. As an explanation for the functionality of such spike sequences, it has been proposed that some of them are used as templates, described as pre-play, to efficiently encode future experiences (Figure 6A) (Dragoi and Tonegawa 2011; Grosmark and Buzsáki 2016). The generation of these templates is induced by the firing of rigid cells, which have a larger influence on other neurons (Figure 5B, C, F, I). Since the weights of rigid cells are not overwritten by learning, their context-independent functionality can be maintained permanently. (These context-independent neurons are also denoted as schema cells in Figures 8 and 9.) We hypothesize that the formation of these templates may be one fundamental neuronal mechanism of generalization. We are currently investigating this issue, aiming to show that SPW-Rs before tasks (i.e., pre-play) can function as templates even for higher-level cognitive functions such as context-independent memory (Ohki et al., in preparation). It is therefore important to focus on rigid cells or schema cells that can induce SPW-Rs and to elucidate their network patterns. Topological data analysis could play an important role in this regard: pre-play has been the subject of ongoing debate in recent years, mainly owing to questions about its computational and statistical validity (Gillespie et al. 2021; Tingley and Peyrache 2020), and a topology-based computational approach could serve as an alternative to currently used methods. More precisely, it would be worthwhile to attempt to quantify the network structures of the wide variety of spontaneous spike sequences via topological data analysis (Figure 10A). It would also be insightful to determine whether spontaneous spiking sequences can be classified into more stimulus-attribute-dependent groups (e.g., nonspatial or spatial memory) and more abstract groups (e.g., episodic or semantic memory). In addition, whether similar topological structures observed across several spontaneous spiking sequences reflect the proximity of the encoded information is a significant open question. Answering these questions would be a major advance in understanding the neural mechanisms that implement the generalization function of the brain.
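One common way replay and pre-play analyses quantify whether a spontaneous spike sequence matches a template is rank-order (Spearman) correlation between the firing orders of the cells. A minimal sketch with hypothetical cell labels (an illustration of the general method, not a reanalysis of any dataset discussed here):

```python
def rank_order_correlation(template, candidate):
    """Spearman rank-order correlation between the position of each cell in
    a template sequence and its position in a candidate sequence. Values
    near 1 indicate that the candidate preserves the template's firing
    order; values near -1 indicate a reversed order. Assumes no ties."""
    common = [c for c in template if c in candidate]
    n = len(common)
    if n < 2:
        return 0.0
    template_rank = {c: i for i, c in enumerate(common)}
    candidate_rank = {c: i for i, c in
                      enumerate(c for c in candidate if c in template_rank)}
    d2 = sum((template_rank[c] - candidate_rank[c]) ** 2 for c in common)
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))

cells = ["c1", "c2", "c3", "c4", "c5"]  # hypothetical cell labels
```

In practice, such a score is compared against a null distribution obtained by shuffling candidate sequences, which is precisely where the statistical-validity debates around pre-play arise.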

Figure 10: 
(A) Topological representation of the hippocampus. The image on the left shows how a complex composed of n-simplices can ultimately approximate a hippocampal network; this is called the neural complex. Rigid and plastic cells are indicated by blue and red dots, respectively. Right: simulation of rat navigation in a square-shaped environment with a cavity in the middle. The trajectories denoted as Υ1, Υ2, and Υ3 represent segments of the physical trajectory navigated by the rat. A blue cluster in the top-left corner of the environment represents a typical place field. A highlighted pair (green and red), triple (light blue, orange, and dark red), and quadruple (magenta, blue, purple, and pale green) denote overlapping place fields. These place fields are represented as 0- to 3-simplices, respectively, in the upper-middle panel. A collection of simplices forms a simplicial complex as part of the neural complex, which schematically represents the connection structure of the mental place-field map. Γ1, Γ2, and Γ3 in the topological mental spaces correspond to the physical passages Υ1, Υ2, and Υ3, respectively. (B) For the time series of Betti numbers, the green and blue lines denote b0 and b1, respectively. Most of the time, both Betti numbers remained small, indicating few components of the coactivity complex (Fτ) and few cavities in them. In contrast, rapid increases in the Betti numbers, such as the instability period In highlighted in pink, indicate periods of strong topological fluctuation. Bottom: during the instability period I5 depicted on the left, reactivation (i.e., replay of a specific spike sequence) is injected into the neuronal complex, and the instability of the neuronal complex is suppressed. However, when the connections decay, the instability returns. Five successive replays, denoted by five vertical red dashed lines, provide a more sustainable effect on the neuronal complex. The right panel in A and all panels in B are adapted from Babichev et al. (2019).

The second clue is to elucidate the information encoded by cell assemblies during the on-state via topological data analysis. Previous studies have shown that, regardless of whether information is spatial or nonspatial, the hippocampus compresses and encodes past, present, and future information into a single theta phase (Dragoi and Buzsáki 2006; Qasim et al. 2021; Shahbaba et al. 2022; Terada et al. 2017). This implies that the information encoded by a cell assembly can always be represented in an ordered and connected form. Given this property, it is reasonable to quantify the cell assembly topologically as a cell assembly complex (Babichev and Dabaghian 2018; Dabaghian 2020) (Figure 10B). Indeed, it has been shown in silico that spatial information with topological properties can be learned and maintained when a cell assembly complex is replayed (Babichev et al. 2019). Interestingly, doublets and triplets of ripples are often observed in vivo (Buzsáki 2015), and in this computational model, too, the neural complex can be stabilized only if multiple replays occur at a sufficiently high, physiologically plausible frequency. Thus, the learning and maintenance of topological spatial maps in the brain (i.e., neuronal complexes) can be represented using topological approaches. A major issue for future research is to clarify how event cells that encode discrete nonspatial information can represent generalized information such as semantic memory; in this regard, topological data analysis can be a powerful tool.

Finally, it is important to incorporate a topological perspective into the design of experimental paradigms. The spatial environments used in previous studies to clarify the generalization functions of the brain are often topologically identical. So far, only a few studies have demonstrated that the generalization function of the brain (i.e., the memory schema) facilitates learning by deliberately incorporating topologically designed spatial information into the task space (Dabaghian et al. 2012; Tse et al. 2007). Thus, computational and experimental efforts with reference to topological space will reveal the properties of the brain's generalization function and, thereby, of the brain's learning mechanism.

4.4 Topology and topological invariants

To make the topological calculation (e.g., algebraic topology) more concrete, we briefly introduce the method for calculating topological invariants called Betti numbers. The zeroth Betti number ($b_0$) counts the connected components in the data, the first Betti number ($b_1$) counts circles (one-dimensional cavities), and so on. Accordingly, the Betti numbers of a circle are b = (1, 1, 0, …), those of a 2D sphere are b = (1, 0, 1, 0, …), and those of a torus are b = (1, 2, 1, 0, …). To obtain the Betti numbers, we first translate the data into an object called a simplicial complex, denoted as $S$: a finite collection of $n$-simplices ($\sigma$). As an intuitive example, a 0-simplex is a vertex, a 1-simplex is an edge, a 2-simplex is a triangle, a 3-simplex is a tetrahedron, etc. (Figure 11A). An $n$-simplex formed from vertices $v_0, \ldots, v_n$ is denoted $[v_0, \ldots, v_n]$; note that an $n$-simplex with $n \geq 1$ has an orientation, denoted by arrows. A collection of $n$-simplices forms a simplicial complex ($S$) or a neuronal complex ($N$), which can be regarded as the complex of all neurons that make up the hippocampus (Figure 10A). To compute topological invariants, it is necessary to impose an algebraic structure on the simplicial complex $S$ (or the neuronal complex $N$). Namely, for each dimension $n$, we create an abstract vector space $C_n$ spanned by the $n$-simplices, so that the dimension of $C_n$ corresponds to the number of $n$-simplices. Consequently, every element of $C_n$ can be represented as a linear combination of the $n$-simplices $\sigma_i^n$. Such a linear combination, called a chain ($n \geq 0$), is defined as $c_n = \sum_i a_i \sigma_i^n$, where $a_i \in \mathbb{Z}$. Because the coefficients $a_i$ are integers, the chains form a free abelian group (Hatcher 2001).

Figure 11: 
Topology and the neuronal complex. (A) Basic components of a simplicial complex. Here, we show the 0-simplex through the 3-simplex used in the construction of a simplicial complex. Note that $n$-simplices ($n \geq 1$) have an orientation similar to that of a vector. (B) Topological barcode and Vietoris–Rips complex. The upper four figures show the simplicial complexes of nine points for distinctive values of the proximity parameter ($\varepsilon$). The vertical lines in the barcode denote four different levels of $\varepsilon$. The horizontal bars intersecting each vertical line indicate the Betti numbers. This panel was created according to Topaz et al. (2015). (C) $n$-Cycles and boundary cycles. Top: examples of cycles (Z) and boundary cycles (B), which are important in computing Betti numbers; here, Z and B are depicted as 2-simplices. Middle: a Venn diagram showing the set relationship between cycles and boundary cycles, where B is a subgroup of Z; Z also contains components (Z′ and Z″) that are not in B. Bottom: two projections, called the image and the kernel, are used to obtain Z and B.

The question here is how to determine whether vertices are connected. There are several approaches to address this issue. For example, if the data are a point cloud, e.g., representing the state space of the brain, a method called the Vietoris–Rips complex may be useful (Figure 11B), mainly because of its computational tractability (Carlsson 2009). Using the Vietoris–Rips construction, we first compute a symmetric $N \times N$ matrix of distances among the vertices and then choose a threshold $\varepsilon > 0$, called the proximity parameter. For each value of $\varepsilon$, we create a simplicial complex $S_\varepsilon$ (Figure 11B). For example, a 1-simplex (i.e., an edge) is formed whenever two points are within $\varepsilon$ of each other; similarly, a 2-simplex (i.e., a triangle) is created whenever three vertices are pairwise within $\varepsilon$. Thus, a simplicial complex built via the proximity parameter allows any number of vertices to be connected, rather than only examining pairwise relationships. For creating the distance matrix in functional magnetic resonance imaging studies, methods such as a combination of searchlight analysis and the Vietoris–Rips complex have also been used (Ellis et al. 2019).
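This construction can be sketched in a few lines: a k-simplex is added to $S_\varepsilon$ whenever all pairwise distances among its k + 1 vertices are at most $\varepsilon$. A minimal illustration:

```python
from itertools import combinations
import math

def vietoris_rips(points, eps, max_dim=2):
    """Vietoris-Rips complex S_eps: a k-simplex (tuple of point indices) is
    included whenever all pairwise distances among its k+1 vertices are at
    most eps."""
    simplices = {0: [(i,) for i in range(len(points))]}
    for k in range(1, max_dim + 1):
        simplices[k] = [
            s for s in combinations(range(len(points)), k + 1)
            if all(math.dist(points[i], points[j]) <= eps
                   for i, j in combinations(s, 2))
        ]
    return simplices

# Three points at the corners of an equilateral triangle with side 1.
pts = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
sparse = vietoris_rips(pts, eps=0.5)  # below the side length: vertices only
dense = vietoris_rips(pts, eps=1.2)   # above it: three edges and a filled triangle
```

Sweeping `eps` from small to large and recording when simplices appear is exactly what the topological barcode in Figure 11B summarizes.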

The third point relates to the topological concept of a boundary. In the example of the 2-simplex shown in the right panel of Figure 11C, the boundary consists of three edges; consequently, the boundary of the 2-simplex is one-dimensional (i.e., the boundary maps $C_n$ to $C_{n-1}$). Computing the boundary is realized using a linear transformation called the boundary operator $\partial$. Mathematically, this can be described as follows:

( v 0 , v 1 , , v n ) = i = 0 n ( 1 ) i [ v 0 , v 1 , v i 1 , v i + 1 , v n ]

As is clear from this equation, the boundary operator $\partial$ yields a linear sum of faces, each obtained by removing one vertex ($v_i$). The sign of each term is determined by whether the index of the removed vertex is even or odd; for example, removing an even-indexed vertex such as $v_2$ contributes with sign $+$, and removing an odd-indexed vertex such as $v_1$ with sign $-$. More specifically, $\partial [v_0, v_1, v_2] = [v_1, v_2] - [v_0, v_2] + [v_0, v_1]$. The result of this calculation is three oriented edges surrounding a triangle, similar to vectors; thus, the boundary of the 2-simplex $[v_0, v_1, v_2]$ signifies that each vertex receives two inputs from the other two vertices. Importantly, in the case of a 0-simplex, the dimension cannot be reduced any further; therefore, $\partial(\sigma^0) = 0$.
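The boundary operator defined above can be implemented directly: remove each vertex in turn and attach the alternating sign $(-1)^i$. A minimal sketch, which also verifies the fundamental property that the boundary of a boundary vanishes:

```python
def boundary(simplex):
    """Boundary of an oriented n-simplex [v0, ..., vn]: the alternating sum
    over its faces, each obtained by removing one vertex, returned as a
    chain {face: integer coefficient}. For a 0-simplex the boundary is 0."""
    if len(simplex) == 1:
        return {}
    chain = {}
    for i in range(len(simplex)):
        face = tuple(simplex[:i] + simplex[i + 1:])
        chain[face] = chain.get(face, 0) + (-1) ** i
    return chain

def boundary_of_chain(chain):
    """Linear extension of the boundary operator to chains: apply it to
    each simplex and sum the resulting coefficients."""
    out = {}
    for simplex, coeff in chain.items():
        for face, sign in boundary(list(simplex)).items():
            out[face] = out.get(face, 0) + coeff * sign
    return {f: c for f, c in out.items() if c != 0}
```

For the 2-simplex `[0, 1, 2]`, `boundary` returns the three oriented edges described in the text, and applying the operator twice yields the empty chain.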

Here, consider the kernel ($\ker$) of the boundary operator $\partial_n : C_n \to C_{n-1}$ applied to $n$-chains. An $n$-chain whose boundary vanishes, i.e., $\partial_n(c_n) = 0$, is called an $n$-cycle; the set of all $n$-cycles is denoted $Z_n$ in Figure 11C. Similarly, consider the image ($\mathrm{Im}$) of the boundary operation $\partial_{n+1} : C_{n+1} \to C_n$, denoted $B_n$: these are the $n$-chains that can be represented as $b_n = \partial_{n+1}(c_{n+1})$ and are called boundary cycles. The two sets are mathematically described as follows:

$$n\text{-cycles:}\quad Z_n := \ker(\partial_n : C_n \to C_{n-1})$$
$$n\text{-boundary cycles:}\quad B_n := \mathrm{Im}(\partial_{n+1} : C_{n+1} \to C_n)$$

As shown in the Venn diagram in the middle panel of Figure 11C, the key difference between an $n$-cycle ($Z_n$) and a boundary cycle ($B_n$) is that an $n$-cycle may be any closed loop of edges, whether or not it bounds a filled interior region, whereas a boundary cycle necessarily bounds one. Accordingly, in the next step, we need to distinguish the $n$-cycles ($Z_n$) from the boundary cycles ($B_n$). Note that every boundary cycle is itself a cycle, which follows from the basic property of the boundary operation, $\partial_n \circ \partial_{n+1} = 0$.

The $n$-cycles ($Z_n$) and boundary cycles ($B_n$) can be related by an equivalence relation, denoted $\sim$. An equivalence relation satisfies three rules: reflexivity ($x \sim x$), symmetry (if $x \sim y$, then $y \sim x$), and transitivity (if $x \sim y$ and $y \sim z$, then $x \sim z$). Two $n$-cycles are regarded as equivalent, or homologous, when they differ only by a boundary cycle. The equivalence classes defined by this relation are called homology classes, denoted $H_n = \{[Z_n]\}$. Algebraically, $H_n$ is the quotient vector space $H_n = Z_n / B_n$, and the Betti number ($\mathrm{Betti}_n$) is formulated as follows:

$$\mathrm{Betti}_n = \dim(H_n) = \dim(Z_n) - \dim(B_n)$$

Thus, the Betti number counts closed cycles (or cavities) without an interior region, such as $Z'$ and $Z''$ in Figure 11C. Incidentally, the Betti numbers may vary depending on the proximity parameter ($\varepsilon$); therefore, a method called the topological barcode (Figure 11B) has been proposed to obtain Betti numbers that are robust to the choice of $\varepsilon$ (Ghrist 2008; Topaz et al. 2015).
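Putting the pieces together, Betti numbers can be computed from the ranks of the boundary matrices, since $\dim Z_n = \dim C_n - \mathrm{rank}\,\partial_n$ and $\dim B_n = \mathrm{rank}\,\partial_{n+1}$, giving $b_n = \dim C_n - \mathrm{rank}\,\partial_n - \mathrm{rank}\,\partial_{n+1}$. A minimal sketch over real coefficients (sufficient for the examples in the text):

```python
import numpy as np

def betti_numbers(simplices, max_dim):
    """Betti numbers from boundary-matrix ranks over real coefficients:
    b_n = dim C_n - rank(d_n) - rank(d_{n+1}).
    `simplices` maps each dimension n to a list of sorted vertex tuples."""
    def boundary_matrix(n):
        rows = {s: i for i, s in enumerate(simplices.get(n - 1, []))}
        cols = simplices.get(n, [])
        M = np.zeros((len(rows), len(cols)))
        if not rows or not cols:
            return M
        for j, s in enumerate(cols):
            for i in range(len(s)):
                M[rows[s[:i] + s[i + 1:]], j] = (-1) ** i  # alternating signs
        return M
    ranks = {}
    for n in range(max_dim + 2):
        M = boundary_matrix(n)
        ranks[n] = int(np.linalg.matrix_rank(M)) if M.size else 0
    return [len(simplices.get(n, [])) - ranks[n] - ranks[n + 1]
            for n in range(max_dim + 1)]

# A hollow triangle is topologically a circle: b = (1, 1).
circle = {0: [(0,), (1,), (2,)], 1: [(0, 1), (0, 2), (1, 2)]}
# Filling in the 2-simplex removes the cavity: b = (1, 0, 0).
disk = {0: [(0,), (1,), (2,)], 1: [(0, 1), (0, 2), (1, 2)], 2: [(0, 1, 2)]}
```

Running this over the complexes $S_\varepsilon$ produced at successive values of $\varepsilon$ yields exactly the barcode picture described above.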

5 Conclusions

This paper has attempted to contrast the nature of learning between ANNs and the brain and to elucidate three key issues: how does the brain learn efficiently, learn continually, and produce generalized knowledge? Unlike ANNs, the brain does not adopt a purely data-driven learning style. Instead, the brain has innate mechanisms specialized for processing certain kinds of information, such as grid cells, head-direction cells, place cells, and the number sense. Furthermore, we described how the brain uses self-organizing structures and spontaneous activity patterns as training signals to achieve more efficient learning. Such spontaneous brain activities are characterized by log-normal distributions of synaptic weights and firing properties; in contrast, the network weights of ANNs, such as CNNs, typically follow a different distribution, such as a Gaussian. In addition, some neurons at the top of the log-normal distribution, known as rigid cells, can initiate spike sequences or SPW-Rs, which can be used as learning templates. Thus, the brain exploits self-organizing structures and spontaneous activity patterns to learn more efficiently.

The second issue concerns continual learning. The brain can continually learn new things while leaving previously learned content almost intact. By contrast, earlier ANNs suffered from a major disadvantage in continual learning: training on new tasks caused a near-complete loss of previously learned content, a phenomenon known as catastrophic forgetting. To elucidate the potential brain mechanisms that prevent catastrophic forgetting, we focused on spontaneous brain activities observed during sleep, such as SPW-Rs. Ripples not only reactivate learned sequences but also downregulate network weights that are unnecessary for learning. Moreover, because ripples form spontaneously and repeatedly, past memories are constantly reactivated during sleep and, consequently, the networks encoding previously learned content are maintained. Using such brain-inspired mechanisms, it is now possible to realize continual learning in state-of-the-art ANNs.
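The core of the replay idea can be captured in a deliberately tiny sketch (our own toy construction, not any published model): a single linear unit trained with SGD on task A, then on task B either alone or interleaved with replayed task-A samples. Training on B alone overwrites the task-A solution, while interleaved replay preserves it.

```python
import numpy as np

def sgd(w, batches, lr=0.2, epochs=200):
    """Plain SGD on squared error for a linear unit y_hat = w @ x."""
    w = w.copy()
    for _ in range(epochs):
        for x, y in batches:
            w -= lr * (w @ x - y) * x
    return w

task_a = (np.array([1.0, 0.0]), 1.0)   # task A: respond 1 to (1, 0)
task_b = (np.array([1.0, 1.0]), 0.0)   # task B: respond 0 to (1, 1)

w0 = np.zeros(2)
w_a = sgd(w0, [task_a])                # learn task A first

w_forget = sgd(w_a, [task_b])          # task B alone: A is overwritten
w_replay = sgd(w_a, [task_b, task_a])  # interleave replayed A samples

xa, ya = task_a
print(abs(w_forget @ xa - ya))  # large error on task A (~0.5)
print(abs(w_replay @ xa - ya))  # near zero: replay preserved task A
```

Because the two tasks share an input dimension, sequential training drags the weights away from the task-A solution, whereas interleaving even a single replayed task-A sample per epoch pulls the weights to the joint solution for both tasks.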

Third, we described the generalization function of the brain. The ability to generalize helps to form more sophisticated or abstract knowledge beyond superficial differences and is beneficial for choosing more effective and/or less costly actions in novel environments. To address this issue, we introduced schema cells, neurons that encode commonalities across several Cartesian-coordinate environments and abstract value spaces. Furthermore, to clarify generalization in the brain, we suggested that topological approaches could play a significant role. In this regard, we pointed out that the hippocampus inherently has topological properties: the hippocampus cares only about connectivity, not exact geometry. Such topological properties of the hippocampus could shed light on the mechanisms that span multiple modalities, such as spatial and nonspatial memory, which neuroscience has been trying to unravel for decades.


Corresponding author: Takefumi Ohki, International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo Institutes for Advanced Study, The University of Tokyo, Tokyo 113-0033, Japan, E-mail: .

Funding source: World Premier International Research Center Initiative (WPI), MEXT, Japan

Acknowledgments

This work was supported by the World Premier International Research Center Initiative (WPI), MEXT, Japan. The funding source had no role in the content of the manuscript. We would like to express our deepest gratitude to the lab members who provided meaningful feedback and to our families who support our research activities.

  1. Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: This work was supported by the World Premier International Research Center Initiative (WPI), MEXT, Japan.

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

References

Adamantidis, A.R., Gutierrez Herrera, C., and Gent, T.C. (2019). Oscillating circuitries in the sleeping brain. Nat. Rev. Neurosci. 20: 746–762, https://doi.org/10.1038/s41583-019-0223-4.

Ambrose, R.E., Pfeiffer, B.E., and Foster, D.J. (2016). Reverse replay of hippocampal place cells is uniquely modulated by changing reward. Neuron 91: 1124–1136, https://doi.org/10.1016/j.neuron.2016.07.047.

Asok, A., Leroy, F., Rayman, J.B., and Kandel, E.R. (2019). Molecular mechanisms of the memory trace. Trends Neurosci. 42: 14–22, https://doi.org/10.1016/j.tins.2018.10.005.

Babichev, A. and Dabaghian, Y.A. (2018). Topological schemas of memory spaces. Front. Comput. Neurosci. 12: 27, https://doi.org/10.3389/fncom.2018.00027.

Babichev, A., Morozov, D., and Dabaghian, Y. (2019). Replays of spatial memories suppress topological fluctuations in cognitive map. Netw. Neurosci. 3: 707–724, https://doi.org/10.1162/netn_a_00076.

Bahtiyar, S., Gulmez Karaca, K., Henckens, M.J.A.G., and Roozendaal, B. (2020). Norepinephrine and glucocorticoid effects on the brain mechanisms underlying memory accuracy and generalization. Mol. Cell. Neurosci. 108: 103537, https://doi.org/10.1016/j.mcn.2020.103537.

Baraduc, P., Duhamel, J.R., and Wirth, S. (2019). Schema cells in the macaque hippocampus. Science 363: 635–639, https://doi.org/10.1126/science.aav5404.

Baram, A.B., Muller, T.H., Nili, H., Garvert, M.M., and Behrens, T.E.J. (2021). Entorhinal and ventromedial prefrontal cortices abstract and generalize the structure of reinforcement learning problems. Neuron 109: 713.e7–723.e7, https://doi.org/10.1016/j.neuron.2020.11.024.

Bassett, D.S. and Sporns, O. (2017). Network neuroscience. Nat. Neurosci. 20: 353–364, https://doi.org/10.1038/nn.4502.

Bassett, D.S., Xia, C.H., and Satterthwaite, T.D. (2018). Understanding the emergence of neuropsychiatric disorders with network neuroscience. Biol. Psychiatr. Cognit. Neurosci. Neuroimaging 3: 742–753, https://doi.org/10.1016/j.bpsc.2018.03.015.

Behrens, T.E.J., Muller, T.H., Whittington, J.C.R., Mark, S., Baram, A.B., Stachenfeld, K.L., and Kurth-Nelson, Z. (2018). What is a cognitive map? Organizing knowledge for flexible behavior. Neuron 100: 490–509, https://doi.org/10.1016/j.neuron.2018.10.002.

Bhattarai, B., Lee, J.W., and Jung, M.W. (2020). Distinct effects of reward and navigation history on hippocampal forward and reverse replays. Proc. Natl. Acad. Sci. U. S. A. 117: 689–697, https://doi.org/10.1073/pnas.1912533117.

Bongard, S. and Nieder, A. (2010). Basic mathematical rules are encoded by primate prefrontal cortex neurons. Proc. Natl. Acad. Sci. U. S. A. 107: 2277–2282, https://doi.org/10.1073/pnas.0909180107.

Bonifazi, P., Goldin, M., Picardo, M.A., Jorquera, I., Cattani, A., Bianconi, G., Represa, A., Ben-Ari, Y., and Cossart, R. (2009). GABAergic hub neurons orchestrate synchrony in developing hippocampal networks. Science 326: 1419–1424, https://doi.org/10.1126/science.1175509.

Bowman, C.R. and Zeithamova, D. (2018). Abstract memory representations in the ventromedial prefrontal cortex and hippocampus support concept generalization. J. Neurosci. 38: 2605–2614, https://doi.org/10.1523/jneurosci.2811-17.2018.

Broadbent, N.J., Squire, L.R., and Clark, R.E. (2004). Spatial memory, recognition memory, and the hippocampus. Proc. Natl. Acad. Sci. U. S. A. 101: 14515–14520, https://doi.org/10.1073/pnas.0406344101.

Brunel, N., Hakim, V., Isope, P., Nadal, J.P., and Barbour, B. (2004). Optimal information storage and the distribution of synaptic weights: perceptron versus Purkinje cell. Neuron 43: 745–757, https://doi.org/10.1016/s0896-6273(04)00528-8.

Buch, E.R., Claudino, L., Quentin, R., Bönstrup, M., and Cohen, L.G. (2021). Consolidation of human skill linked to waking hippocampo-neocortical replay. Cell Rep. 35: 109193, https://doi.org/10.1016/j.celrep.2021.109193.

Bui, K., Park, F., Zhang, S., Qi, Y., and Xin, J. (2021). Structured sparsity of convolutional neural networks via nonconvex sparse group regularization. Front. Appl. Math. Stat. 6: 62, https://doi.org/10.3389/fams.2020.529564.

Burak, Y. and Fiete, I.R. (2009). Accurate path integration in continuous attractor network models of grid cells. PLoS Comput. Biol. 5: e1000291, https://doi.org/10.1371/journal.pcbi.1000291.

Burgess, N. (2008). Grid cells and theta as oscillatory interference: theory and predictions. Hippocampus 18: 1157–1174, https://doi.org/10.1002/hipo.20518.

Burgess, N. and O’Keefe, J. (2011). Models of place and grid cell firing and theta rhythmicity. Curr. Opin. Neurobiol. 21: 734–744, https://doi.org/10.1016/j.conb.2011.07.002.

Buzsáki, G. (2015). Hippocampal sharp wave-ripple: a cognitive biomarker for episodic memory and planning. Hippocampus 25: 1073–1188, https://doi.org/10.1002/hipo.22488.

Buzsáki, G. and Mizuseki, K. (2014). The log-dynamic brain: how skewed distributions affect network operations. Nat. Rev. Neurosci. 15: 264–278, https://doi.org/10.1038/nrn3687.

Buzsáki, G. and Tingley, D. (2018). Space and time: the hippocampus as a sequence generator. Trends Cognit. Sci. 22: 853–869, https://doi.org/10.1016/j.tics.2018.07.006.

Buzsáki, G., Leung, L.W., and Vanderwolf, C.H. (1983). Cellular bases of hippocampal EEG in the behaving rat. Brain Res. 287: 139–171, https://doi.org/10.1016/0165-0173(83)90037-1.

Cang, Z. and Wei, G.W. (2017). TopologyNet: topology based deep convolutional and multi-task neural networks for biomolecular property predictions. PLoS Comput. Biol. 13: e1005690, https://doi.org/10.1371/journal.pcbi.1005690.

Carlsson, G. (2009). Topology and data. Bull. Am. Math. Soc. 46: 255–308, https://doi.org/10.1090/s0273-0979-09-01249-x.

Chaudhuri, R., Gerçek, B., Pandey, B., Peyrache, A., and Fiete, I. (2019). The intrinsic attractor manifold and population dynamics of a canonical cognitive circuit across waking and sleep. Nat. Neurosci. 22: 1512–1520, https://doi.org/10.1038/s41593-019-0460-x.

Chen, G., Zou, X., Watanabe, H., van Deursen, J.M., and Shen, J. (2010). CREB binding protein is required for both short-term and long-term memory formation. J. Neurosci. 30: 13066–13077, https://doi.org/10.1523/jneurosci.2378-10.2010.

Chen, Z., Gomperts, S.N., Yamamoto, J., and Wilson, M.A. (2014). Neural representation of spatial topology in the rodent hippocampus. Neural Comput. 26: 1–39, https://doi.org/10.1162/neco_a_00538.

Chen, Z., Kloosterman, F., Brown, E.N., and Wilson, M.A. (2012). Uncovering spatial topology represented by rat hippocampal population neuronal codes. J. Comput. Neurosci. 33: 227–255, https://doi.org/10.1007/s10827-012-0384-x.

Chen, Z. and Liu, B. (2018). Lifelong machine learning. In: Synthesis Lectures on Artificial Intelligence and Machine Learning, 2nd ed. Cham: Springer, pp. 1–207, https://doi.org/10.2200/S00832ED1V01Y201802AIM037.

Chung, M.K., Lee, H., DiChristofano, A., Ombao, H., and Solo, V. (2019). Exact topological inference of the resting-state brain networks in twins. Netw. Neurosci. 3: 674–694, https://doi.org/10.1162/netn_a_00091.

Colbran, R.J. (2015). Thematic minireview series: molecular mechanisms of synaptic plasticity. J. Biol. Chem. 290: 28594–28595, https://doi.org/10.1074/jbc.r115.696468.

Cossell, L., Iacaruso, M.F., Muir, D.R., Houlton, R., Sader, E.N., Ko, H., Hofer, S.B., and Mrsic-Flogel, T.D. (2015). Functional organization of excitatory synaptic strength in primary visual cortex. Nature 518: 399–403, https://doi.org/10.1038/nature14182.

Curtis, A., Silver, T., Tenenbaum, J.B., Lozano-Pérez, T., and Kaelbling, L. (2022). Discovering state and action abstractions for generalized task and motion planning. In: The 36th AAAI conference on artificial intelligence (AAAI-22), AAAI, Vol. 36, pp. 5377–5384, https://doi.org/10.1609/aaai.v36i5.20475.

Curto, C. (2017). What can topology tell us about the neural code? Bull. Am. Math. Soc. 54: 63–78, https://doi.org/10.1090/bull/1554.

Dabaghian, Y. (2020). From topological analyses to functional modeling: the case of hippocampus. Front. Comput. Neurosci. 14: 593166, https://doi.org/10.3389/fncom.2020.593166.

Dabaghian, Y., Brandt, V.L., and Frank, L.M. (2014). Reconceiving the hippocampal map as a topological template. Elife 3: e03476, https://doi.org/10.7554/elife.03476.

Dabaghian, Y., Mémoli, F., Frank, L., and Carlsson, G. (2012). A topological paradigm for hippocampal spatial map formation using persistent homology. PLoS Comput. Biol. 8: e1002581, https://doi.org/10.1371/journal.pcbi.1002581.

Danjo, T., Toyoizumi, T., and Fujisawa, S. (2018). Spatial representations of self and other in the hippocampus. Science 359: 213–218, https://doi.org/10.1126/science.aao3898.

Dehaene-Lambertz, G. and Spelke, E.S. (2015). The infancy of the human brain. Neuron 88: 93–109, https://doi.org/10.1016/j.neuron.2015.09.026.

DiTullio, R.W. and Balasubramanian, V. (2021). Dynamical self-organization and efficient representation of space by grid cells. Curr. Opin. Neurobiol. 70: 206–213, https://doi.org/10.1016/j.conb.2021.11.007.

Dragoi, G. and Buzsáki, G. (2006). Temporal encoding of place sequences by hippocampal cell assemblies. Neuron 50: 145–157, https://doi.org/10.1016/j.neuron.2006.02.023.

Dragoi, G. and Tonegawa, S. (2011). Preplay of future place cell sequences by hippocampal cellular assemblies. Nature 469: 397–401, https://doi.org/10.1038/nature09633.

Edwards, L.A., Wagner, J.B., Simon, C.E., and Hyde, D.C. (2016). Functional brain organization for number processing in pre-verbal infants. Dev. Sci. 19: 757–769, https://doi.org/10.1111/desc.12333.

Ellis, C.T., Lesnick, M., Henselman-Petrusek, G., Keller, B., and Cohen, J.D. (2019). Feasibility of topological data analysis for event-related fMRI. Netw. Neurosci. 3: 695–706, https://doi.org/10.1162/netn_a_00095.

Esteva, A., Kuprel, B., Novoa, R.A., Ko, J., Swetter, S.M., Blau, H.M., and Thrun, S. (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature 542: 115–118, https://doi.org/10.1038/nature21056.

Fanselow, M. and Poulos, A.M. (2005). The neuroscience of mammalian associative learning. Annu. Rev. Psychol. 56: 207–234, https://doi.org/10.1146/annurev.psych.56.091103.070213.

FeldmanHall, O., Montez, D.F., Phelps, E.A., Davachi, L., and Murty, V.P. (2021). Hippocampus guides adaptive learning during dynamic social interactions. J. Neurosci. 41: 1340–1348, https://doi.org/10.1523/jneurosci.0873-20.2020.

Feldmeyer, D., Egger, V., Lübke, J., and Sakmann, B. (1999). Reliable synaptic connections between pairs of excitatory layer 4 neurones within a single “barrel” of developing rat somatosensory cortex. J. Physiol. 521: 169–190, https://doi.org/10.1111/j.1469-7793.1999.00169.x.

Fiete, I.R., Burak, Y., and Brookings, T. (2008). What grid cells convey about rat location. J. Neurosci. 28: 6858–6871, https://doi.org/10.1523/jneurosci.5684-07.2008.

Flesch, T., Balaguer, J., Dekker, R., Nili, H., and Summerfield, C. (2018). Comparing continual task learning in minds and machines. Proc. Natl. Acad. Sci. U. S. A. 115: E10313–E10322, https://doi.org/10.1073/pnas.1800755115.

Foster, D.J. and Wilson, M.A. (2006). Reverse replay of behavioural sequences in hippocampal place cells during the awake state. Nature 440: 680–683, https://doi.org/10.1038/nature04587.

Foster, D.J. and Wilson, M.A. (2007). Hippocampal theta sequences. Hippocampus 17: 1093–1099, https://doi.org/10.1002/hipo.20345.

Galland, B.C., Taylor, B.J., Elder, D.E., and Herbison, P. (2012). Normal sleep patterns in infants and children: a systematic review of observational studies. Sleep Med. Rev. 16: 213–222, https://doi.org/10.1016/j.smrv.2011.06.001.

Gardner, R.J., Hermansen, E., Pachitariu, M., Burak, Y., Baas, N.A., Dunn, B.A., Moser, M.B., and Moser, E.I. (2022). Toroidal topology of population activity in grid cells. Nature 602: 123–128, https://doi.org/10.1038/s41586-021-04268-7.

Gauthier, J.L. and Tank, D.W. (2018). A dedicated population for reward coding in the hippocampus. Neuron 99: 179.e7–193.e7, https://doi.org/10.1016/j.neuron.2018.06.008.

Ge, X., Zhang, K., Gribizis, A., Hamodi, A.S., Sabino, A.M., and Crair, M.C. (2021). Retinal waves prime visual motion detection by simulating future optic flow. Science 373, https://doi.org/10.1126/science.abd0830.

Ghrist, R. (2008). Barcodes: the persistent topology of data. Bull. Amer. Math. Soc. 45: 61–75, https://doi.org/10.1090/S0273-0979-07-01191-3.

Gillespie, A.K., Astudillo Maya, D.A., Denovellis, E.L., Liu, D.F., Kastner, D.B., Coulter, M.E., Roumis, D.K., Eden, U.T., and Frank, L.M. (2021). Hippocampal replay reflects specific past experiences rather than a plan for subsequent choice. Neuron 109: 3149.e6–3163.e6, https://doi.org/10.1016/j.neuron.2021.07.029.

Girardeau, G., Benchenane, K., Wiener, S.I., Buzsáki, G., and Zugaro, M.B. (2009). Selective suppression of hippocampal ripples impairs spatial memory. Nat. Neurosci. 12: 1222–1223, https://doi.org/10.1038/nn.2384.

Girardeau, G. and Lopes-Dos-Santos, V. (2021). Brain neural patterns and the memory function of sleep. Science 374: 560–564, https://doi.org/10.1126/science.abi8370.

Giusti, C., Ghrist, R., and Bassett, D.S. (2016). Two’s company, three (or more) is a simplex: algebraic-topological tools for understanding higher-order structure in neural data. J. Comput. Neurosci. 41: 1–14, https://doi.org/10.1007/s10827-016-0608-6.

Giusti, C., Pastalkova, E., Curto, C., and Itskov, V. (2015). Clique topology reveals intrinsic geometric structure in neural correlations. Proc. Natl. Acad. Sci. U. S. A. 112: 13455–13460, https://doi.org/10.1073/pnas.1506407112.

Golub, M.D., Sadtler, P.T., Oby, E.R., Quick, K.M., Ryu, S.I., Tyler-Kabara, E.C., Batista, A.P., Chase, S.M., and Yu, B.M. (2018). Learning by neural reassociation. Nat. Neurosci. 21: 607–616, https://doi.org/10.1038/s41593-018-0095-3.

Gonzalez, C., Jiang, X., Gonzalez-Martinez, J., and Halgren, E. (2022). Human spindle variability. J. Neurosci. 42: 4517–4537, https://doi.org/10.1523/jneurosci.1786-21.2022.

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2020). Generative adversarial networks. Commun. ACM 63: 139–144, https://doi.org/10.1145/3422622.

Gothard, K.M., Skaggs, W.E., and McNaughton, B.L. (1996). Dynamics of mismatch correction in the hippocampal ensemble code for space: interaction between path integration and environmental cues. J. Neurosci. 16: 8027–8040, https://doi.org/10.1523/jneurosci.16-24-08027.1996.

Gridchyn, I., Schoenenberger, P., O’Neill, J., and Csicsvari, J. (2020). Assembly-specific disruption of hippocampal replay leads to selective memory deficit. Neuron 106: 291.e6–300.e6, https://doi.org/10.1016/j.neuron.2020.01.021.

Grosmark, A.D. and Buzsáki, G. (2016). Diversity in neural firing dynamics supports both rigid and learned hippocampal sequences. Science 351: 1440–1443, https://doi.org/10.1126/science.aad1935.

Gulledge, A. and Stuart, G. (2003). Action potential initiation and propagation in layer 5 pyramidal neurons of the rat prefrontal cortex: absence of dopamine modulation. J. Neurosci. 23: 11363–11372, https://doi.org/10.1523/JNEUROSCI.23-36-11363.2003.

Hadsell, R., Rao, D., Rusu, A.A., and Pascanu, R. (2020). Embracing change: continual learning in deep neural networks. Trends Cognit. Sci. 24: 1028–1040, https://doi.org/10.1016/j.tics.2020.09.004.

Hasselmo, M.E., Giocomo, L.M., and Zilli, E.A. (2007). Grid cell firing may arise from interference of theta frequency membrane potential oscillations in single neurons. Hippocampus 17: 1252–1271, https://doi.org/10.1002/hipo.20374.

Hatcher, A. (2001). Algebraic topology. New York: Cambridge University Press.

Helfrich, R.F., Mander, B.A., Jagust, W.J., Knight, R.T., and Walker, M.P. (2018). Old brains come uncoupled in sleep: slow wave-spindle synchrony, brain atrophy, and forgetting. Neuron 97: 221.e4–230.e4, https://doi.org/10.1016/j.neuron.2017.11.020.

Hensel, F., Moor, M., and Rieck, B. (2021). A survey of topological machine learning methods. Front. Artif. Intell. 4: 681108, https://doi.org/10.3389/frai.2021.681108.

Higgins, C., Liu, Y., Vidaurre, D., Kurth-Nelson, Z., Dolan, R., Behrens, T., and Woolrich, M. (2021). Replay bursts in humans coincide with activation of the default mode and parietal alpha networks. Neuron 109: 882.e7–893.e7, https://doi.org/10.1016/j.neuron.2020.12.007.

Hofer, C., Kwitt, R., Niethammer, M., and Uhl, A. (2017). Deep learning with topological signatures. Adv. Neural Inf. Process. Syst. 30.

Hosny, A., Parmar, C., Coroller, T.P., Grossmann, P., Zeleznik, R., Kumar, A., Bussink, J., Gillies, R.J., Mak, R.H., and Aerts, H.J.W.L. (2018). Deep learning for lung cancer prognostication: a retrospective multi-cohort radiomics study. PLoS Med. 15: e1002711, https://doi.org/10.1371/journal.pmed.1002711.

Høydal, Ø.A., Skytøen, E.R., Andersson, S.O., Moser, M.B., and Moser, E.I. (2019). Object-vector coding in the medial entorhinal cortex. Nature 568: 400–404, https://doi.org/10.1038/s41586-019-1077-7.

Huszár, R., Zhang, Y., Blockus, H., and Buzsáki, G. (2022). Preconfigured dynamics in the hippocampus are guided by embryonic birthdate and rate of neurogenesis. Nat. Neurosci. 25: 1201–1212, https://doi.org/10.1038/s41593-022-01138-x.

Hyde, D.C., Boas, D.A., Blair, C., and Carey, S. (2010). Near-infrared spectroscopy shows right parietal specialization for number in pre-verbal infants. Neuroimage 53: 647–652, https://doi.org/10.1016/j.neuroimage.2010.06.030.

Ikegaya, Y., Sasaki, T., Ishikawa, D., Honma, N., Tao, K., Takahashi, N., Minamisawa, G., Ujita, S., and Matsuki, N. (2013). Interpyramid spike transmission stabilizes the sparseness of recurrent network activity. Cerebr. Cortex 23: 293–304, https://doi.org/10.1093/cercor/bhs006.

Ishikawa, T. and Ikegaya, Y. (2020). Locally sequential synaptic reactivation during hippocampal ripples. Sci. Adv. 6: eaay1492, https://doi.org/10.1126/sciadv.aay1492.

Izard, V., Dehaene-Lambertz, G., and Dehaene, S. (2008). Distinct cerebral pathways for object identity and number in human infants. PLoS Biol. 6: e11, https://doi.org/10.1371/journal.pbio.0060011.

Izard, V.R., Sann, C., Spelke, E.S., and Streri, A. (2009). Newborn infants perceive abstract numbers. Proc. Natl. Acad. Sci. U. S. A. 106: 10382–10385, https://doi.org/10.1073/pnas.0812142106.

Jarrard, L.E. (1993). On the role of the hippocampus in learning and memory in the rat. Behav. Neural. Biol. 60: 9–26, https://doi.org/10.1016/0163-1047(93)90664-4.

Jeffery, K.J., Gilbert, A., Burton, S., and Strudwick, A. (2003). Preserved performance in a hippocampal-dependent spatial task despite complete place cell remapping. Hippocampus 13: 175–189, https://doi.org/10.1002/hipo.10047.

Julian, J.B. and Doeller, C.F. (2021). Remapping and realignment in the human hippocampal formation predict context-dependent spatial behavior. Nat. Neurosci. 24: 863–872, https://doi.org/10.1038/s41593-021-00835-3.

Jung, H., Ju, J., Jung, M., and Kim, J. (2018). Less-forgetful learning for domain expansion in deep neural networks. AAAI 32: 3358–3365, https://doi.org/10.1609/aaai.v32i1.11769.

Kaefer, K., Nardin, M., Blahna, K., and Csicsvari, J. (2020). Replay of behavioral sequences in the medial prefrontal cortex during rule switching. Neuron 106: 154.e6–165.e6, https://doi.org/10.1016/j.neuron.2020.01.015.

Kanari, L., Dictus, H., Chalimourda, A., Arnaudon, A., Van Geit, W., Coste, B., Shillcock, J., Hess, K., and Markram, H. (2022). Computational synthesis of cortical dendritic morphologies. Cell Rep. 39: 110586, https://doi.org/10.1016/j.celrep.2022.110586.

Kandel, E.R. (2001). The molecular biology of memory storage: a dialogue between genes and synapses. Science 294: 1030–1038, https://doi.org/10.1126/science.1067020.

Kang, L. and Balasubramanian, V. (2019). A geometric attractor mechanism for self-organization of entorhinal grid modules. Elife 8, https://doi.org/10.7554/elife.46687.

Kemker, R. and Kanan, C. (2017). FearNet: brain-inspired model for incremental learning. arXiv preprint, https://doi.org/10.48550/arXiv.1711.10563.

Kepecs, A. and Fishell, G. (2014). Interneuron cell types are fit to function. Nature 505: 318–326, https://doi.org/10.1038/nature12983.

Khazipov, R., Sirota, A., Leinekugel, X., Holmes, G.L., Ben-Ari, Y., and Buzsáki, G. (2004). Early motor activity drives spindle bursts in the developing somatosensory cortex. Nature 432: 758–761, https://doi.org/10.1038/nature03132.

Khodagholy, D., Gelinas, J.N., and Buzsáki, G. (2017). Learning-enhanced coupling between ripple oscillations in association cortices and hippocampus. Science 358: 369–372, https://doi.org/10.1126/science.aan6203.

Kim, G., Jang, J., Baek, S., Song, M., and Paik, S.B. (2021). Visual number sense in untrained deep neural networks. Sci. Adv. 7, https://doi.org/10.1126/sciadv.abd6127.

Kirkpatrick, J., Pascanu, R., Rabinowitz, N., Veness, J., Desjardins, G., Rusu, A.A., Milan, K., Quan, J., Ramalho, T., Grabska-Barwinska, A., et al. (2017). Overcoming catastrophic forgetting in neural networks. Proc. Natl. Acad. Sci. U. S. A. 114: 3521–3526, https://doi.org/10.1073/pnas.1611835114.

Knudsen, E.B. and Wallis, J.D. (2021). Hippocampal neurons construct a map of an abstract value space. Cell 184: 4640.e10–4650.e10, https://doi.org/10.1016/j.cell.2021.07.010.

Konidaris, G. (2019). On the necessity of abstraction. Curr. Opin. Behav. Sci. 29: 1–7, https://doi.org/10.1016/j.cobeha.2018.11.005.

Korngiebel, D.M. and Mooney, S.D. (2021). Considering the possibilities and pitfalls of Generative Pre-trained Transformer 3 (GPT-3) in healthcare delivery. NPJ Digit. Med. 4: 93, https://doi.org/10.1038/s41746-021-00464-x.

Krabbe, S., Paradiso, E., d’Aquin, S., Bitterman, Y., Courtin, J., Xu, C., Yonehara, K., Markovic, M., Müller, C., Eichlisberger, T., et al. (2019). Adaptive disinhibitory gating by VIP interneurons permits associative learning. Nat. Neurosci. 22: 1834–1843, https://doi.org/10.1038/s41593-019-0508-y.

Kumaran, D. (2012). What representations and computations underpin the contribution of the hippocampus to generalization and inference? Front. Hum. Neurosci. 6: 157, https://doi.org/10.3389/fnhum.2012.00157.

Kumaran, D., Hassabis, D., and McClelland, J.L. (2016). What learning systems do intelligent agents need? Complementary learning systems theory updated. Trends Cognit. Sci. 20: 512–534, https://doi.org/10.1016/j.tics.2016.05.004.

Kutter, E.F., Bostroem, J., Elger, C.E., Mormann, F., and Nieder, A. (2018). Single neurons in the human brain encode numbers. Neuron 100: 753.e4–761.e4, https://doi.org/10.1016/j.neuron.2018.08.036.

Langston, R.F., Ainge, J.A., Couey, J.J., Canto, C.B., Bjerknes, T.L., Witter, M.P., Moser, E.I., and Moser, M.B. (2010). Development of the spatial representation system in the rat. Science 328: 1576–1580, https://doi.org/10.1126/science.1188210.

Lanore, F., Cayco-Gajic, N.A., Gurnani, H., Coyle, D., and Silver, R.A. (2021). Cerebellar granule cell axons support high-dimensional representations. Nat. Neurosci. 24: 1142–1150, https://doi.org/10.1038/s41593-021-00873-x.

Lefort, S., Tomm, C., Floyd Sarria, J.C., and Petersen, C.C.H. (2009). The excitatory neuronal network of the C2 barrel column in mouse primary somatosensory cortex. Neuron 61: 301–316, https://doi.org/10.1016/j.neuron.2008.12.020.

Lehtelä, L., Salmelin, R., and Hari, R. (1997). Evidence for reactive magnetic 10-Hz rhythm in the human auditory cortex. Neurosci. Lett. 222: 111–114, https://doi.org/10.1016/s0304-3940(97)13361-4.

Leutgeb, J.K., Leutgeb, S., Treves, A., Meyer, R., Barnes, C.A., McNaughton, B.L., Moser, M.B., and Moser, E.I. (2005). Progressive transformation of hippocampal neuronal representations in “morphed” environments. Neuron 48: 345–358, https://doi.org/10.1016/j.neuron.2005.09.007.

Lever, C., Burton, S., Jeewajee, A., O’Keefe, J., and Burgess, N. (2009). Boundary vector cells in the subiculum of the hippocampal formation. J. Neurosci. 29: 9771–9777, https://doi.org/10.1523/jneurosci.1319-09.2009.

Lever, C., Wills, T., Cacucci, F., Burgess, N., and O’Keefe, J. (2002). Long-term plasticity in hippocampal place-cell representation of environmental geometry. Nature 416: 90–94, https://doi.org/10.1038/416090a.

Li, X., Ouyang, G., Usami, A., Ikegaya, Y., and Sik, A. (2010). Scale-free topology of the CA3 hippocampal network: a novel method to analyze functional neuronal assemblies. Biophys. J. 98: 1733–1741, https://doi.org/10.1016/j.bpj.2010.01.013.

Li, Z. and Hoiem, D. (2018). Learning without forgetting. IEEE Trans. Pattern Anal. Mach. Intell. 40: 2935–2947, https://doi.org/10.1109/tpami.2017.2773081.

Liu, Y., Mattar, M.G., Behrens, T.E.J., Daw, N.D., and Dolan, R.J. (2021). Experience replay is associated with efficient nonlocal learning. Science 372, https://doi.org/10.1126/science.abf1357.

London, M. and Häusser, M. (2005). Dendritic computation. Annu. Rev. Neurosci. 28: 503–532, https://doi.org/10.1146/annurev.neuro.28.061604.135703.

Latchoumane, C.F.V., Ngo, H.V., Born, J., and Shin, H.S. (2017). Thalamic spindles promote memory formation during sleep through triple phase-locking of cortical, thalamic, and hippocampal rhythms. Neuron 95: 424.e6–435.e6, https://doi.org/10.1016/j.neuron.2017.06.025.Search in Google Scholar PubMed

Ma, R., Miao, J., Niu, L., and Zhang, P. (2019). Transformed ℓ1 regularization for learning sparse deep neural networks. Neural Network 119: 286–298, https://doi.org/10.1016/j.neunet.2019.08.015.

Mander, B.A., Winer, J.R., and Walker, M.P. (2017). Sleep and human aging. Neuron 94: 19–36, https://doi.org/10.1016/j.neuron.2017.02.004.

Mandler, G. and Shebo, B.J. (1982). Subitizing: an analysis of its component processes. J. Exp. Psychol. Gen. 111: 1–22, https://doi.org/10.1037/0096-3445.111.1.1.

Mathis, A., Herz, A.V.M., and Stemmler, M. (2012). Optimal population codes for space: grid cells outperform place cells. Neural Comput. 24: 2280–2317, https://doi.org/10.1162/neco_a_00319.

McCloskey, M. and Cohen, N.J. (1989). Catastrophic interference in connectionist networks: the sequential learning problem. Psychol. Learn. Motiv. Adv. Res. Theor. 24: 109–165, https://doi.org/10.1016/S0079-7421(08)60536-8.

McCrink, K. and Wynn, K. (2004). Large-number addition and subtraction by 9-month-old infants. Psychol. Sci. 15: 776–781, https://doi.org/10.1111/j.0956-7976.2004.00755.x.

Michon, F., Sun, J.J., Kim, C.Y., Ciliberti, D., and Kloosterman, F. (2019). Post-learning hippocampal replay selectively reinforces spatial memory for highly rewarded locations. Curr. Biol. 29: 1436.e5–1444.e5, https://doi.org/10.1016/j.cub.2019.03.048.

Mikutta, C., Feige, B., Maier, J.G., Hertenstein, E., Holz, J., Riemann, D., and Nissen, C. (2019). Phase-amplitude coupling of sleep slow oscillatory and spindle activity correlates with overnight memory consolidation. J. Sleep Res. 28: e12835, https://doi.org/10.1111/jsr.12835.

Mindell, J.A., Sadeh, A., Wiegand, B., How, T.H., and Goh, D.Y.T. (2010). Cross-cultural differences in infant and toddler sleep. Sleep Med. 11: 274–280, https://doi.org/10.1016/j.sleep.2009.04.012.

Mitsuno, K., Miyao, J., and Kurita, T. (2020). Hierarchical group sparse regularization for deep convolutional neural networks. In: 2020 international joint conference on neural networks (IJCNN), https://doi.org/10.1109/IJCNN48605.2020.9207531.

Miyawaki, H. and Mizuseki, K. (2022). De novo inter-regional coactivations of preconfigured local ensembles support memory. Nat. Commun. 13: 1272, https://doi.org/10.1038/s41467-022-28929-x.

Mölle, M., Bergmann, T.O., Marshall, L., and Born, J. (2011). Fast and slow spindles during the sleep slow oscillation: disparate coalescence and engagement in memory processing. Sleep 34: 1411–1421, https://doi.org/10.5665/sleep.1290.

Moser, E.I., Kropff, E., and Moser, M.B. (2008). Place cells, grid cells, and the brain’s spatial representation system. Annu. Rev. Neurosci. 31: 69–89, https://doi.org/10.1146/annurev.neuro.31.061307.090723.

Muller, R.U. and Kubie, J.L. (1987). The effects of changes in the environment on the spatial firing of hippocampal complex-spike cells. J. Neurosci. 7: 1951–1968, https://doi.org/10.1523/jneurosci.07-07-01951.1987.

Nakazawa, K., McHugh, T.J., Wilson, M.A., and Tonegawa, S. (2004). NMDA receptors, place cells and hippocampal spatial memory. Nat. Rev. Neurosci. 5: 361–372, https://doi.org/10.1038/nrn1385.

Nasr, K., Viswanathan, P., and Nieder, A. (2019). Number detectors spontaneously emerge in a deep neural network designed for visual object recognition. Sci. Adv. 5: eaav7903, https://doi.org/10.1126/sciadv.aav7903.

Navarro-Lobato, I. and Genzel, L. (2019). The up and down of sleep: from molecules to electrophysiology. Neurobiol. Learn. Mem. 160: 3–10, https://doi.org/10.1016/j.nlm.2018.03.013.

Ngo, C.T., Benear, S.L., Popal, H., Olson, I.R., and Newcombe, N.S. (2021). Contingency of semantic generalization on episodic specificity varies across development. Curr. Biol. 31: 2690.e5–2697.e5, https://doi.org/10.1016/j.cub.2021.03.088.

Nieder, A. (2016). The neuronal code for number. Nat. Rev. Neurosci. 17: 366–382, https://doi.org/10.1038/nrn.2016.40.

Nieder, A. (2021). Neuroethology of number sense across the animal kingdom. J. Exp. Biol. 224: jeb218289, https://doi.org/10.1242/jeb.218289.

Nieder, A. and Dehaene, S. (2009). Representation of number in the brain. Annu. Rev. Neurosci. 32: 185–208, https://doi.org/10.1146/annurev.neuro.051508.135550.

Nieder, A. and Miller, E.K. (2004). A parieto-frontal network for visual numerical information in the monkey. Proc. Natl. Acad. Sci. U. S. A. 101: 7457–7462, https://doi.org/10.1073/pnas.0402239101.

Norimoto, H., Makino, K., Gao, M., Shikano, Y., Okamoto, K., Ishikawa, T., Sasaki, T., Hioki, H., Fujisawa, S., and Ikegaya, Y. (2018). Hippocampal ripples down-regulate synapses. Science 359: 1524–1527, https://doi.org/10.1126/science.aao0702.

O’Keefe, J. and Burgess, N. (1996). Geometric determinants of the place fields of hippocampal neurons. Nature 381: 425–428, https://doi.org/10.1038/381425a0.

O’Keefe, J. and Recce, M.L. (1993). Phase relationship between hippocampal place units and the EEG theta rhythm. Hippocampus 3: 317–330, https://doi.org/10.1002/hipo.450030307.

Oby, E.R., Golub, M.D., Hennig, J.A., Degenhart, A.D., Tyler-Kabara, E.C., Yu, B.M., Chase, S.M., and Batista, A.P. (2019). New neural activity patterns emerge with long-term learning. Proc. Natl. Acad. Sci. U. S. A. 116: 15210–15215, https://doi.org/10.1073/pnas.1820296116.

Ohki, T. (2022). Measuring phase-amplitude coupling between neural oscillations of different frequencies via the Wasserstein distance. J. Neurosci. Methods 374: 109578, https://doi.org/10.1016/j.jneumeth.2022.109578.

Ohki, T., Gunji, A., Takei, Y., Takahashi, H., Kaneko, Y., Kita, Y., Hironaga, N., Tobimatsu, S., Kamio, Y., Hanakawa, T., et al. (2016). Neural oscillations in the temporal pole for a temporally congruent audio-visual speech detection task. Sci. Rep. 6: 37973, https://doi.org/10.1038/srep37973.

Ohki, T. and Takei, Y. (2018). Neural mechanisms of mental schema: a triplet of delta, low beta/spindle and ripple oscillations. Eur. J. Neurosci. 48: 2416–2430, https://doi.org/10.1111/ejn.13844.

Oyanedel, C.N., Durán, E., Niethard, N., Inostroza, M., and Born, J. (2020). Temporal associations between sleep slow oscillations, spindles and ripples. Eur. J. Neurosci. 52: 4762–4778, https://doi.org/10.1111/ejn.14906.

Parisi, G.I., Tani, J., Weber, C., and Wermter, S. (2018). Lifelong learning of spatiotemporal representations with dual-memory recurrent self-organization. Front. Neurorob. 12: 78, https://doi.org/10.3389/fnbot.2018.00078.

Parisi, G.I., Kemker, R., Part, J.L., Kanan, C., and Wermter, S. (2019). Continual lifelong learning with neural networks: a review. Neural Network 113: 54–71, https://doi.org/10.1016/j.neunet.2019.01.012.

Patania, A., Selvaggi, P., Veronese, M., Dipasquale, O., Expert, P., and Petri, G. (2019). Topological gene expression networks recapitulate brain anatomy and function. Netw. Neurosci. 3: 744–762, https://doi.org/10.1162/netn_a_00094.

Pavlides, C. and Winson, J. (1989). Influences of hippocampal place cell firing in the awake state on the activity of these cells during subsequent sleep episodes. J. Neurosci. 9: 2907–2918, https://doi.org/10.1523/jneurosci.09-08-02907.1989.

Pica, P., Lemer, C., Izard, V., and Dehaene, S. (2004). Exact and approximate arithmetic in an Amazonian indigene group. Science 306: 499–503, https://doi.org/10.1126/science.1102085.

Qasim, S.E., Fried, I., and Jacobs, J. (2021). Phase precession in the human hippocampus and entorhinal cortex. Cell 184: 3242.e10–3255.e10, https://doi.org/10.1016/j.cell.2021.04.017.

Raichle, M.E. (2010). Two views of brain function. Trends Cognit. Sci. 14: 180–190, https://doi.org/10.1016/j.tics.2010.01.008.

Rasmussen, M.A. and Bro, R. (2012). A tutorial on the Lasso approach to sparse modeling. Chemometr. Intell. Lab. Syst. 119: 21–31, https://doi.org/10.1016/j.chemolab.2012.10.003.

Revkin, S.K., Piazza, M., Izard, V., Cohen, L., and Dehaene, S. (2008). Does subitizing reflect numerical estimation? Psychol. Sci. 19: 607–614, https://doi.org/10.1111/j.1467-9280.2008.02130.x.

Robins, A. (1995). Catastrophic forgetting, rehearsal and pseudorehearsal. Connect. Sci. 7: 123–146, https://doi.org/10.1080/09540099550039318.

Romano, D., Nicolau, M., Quintin, E.M., Mazaika, P.K., Lightbody, A.A., Cody Hazlett, H., Piven, J., Carlsson, G., and Reiss, A.L. (2014). Topological methods reveal high and low functioning neuro-phenotypes within fragile X syndrome. Hum. Brain Mapp. 35: 4904–4915, https://doi.org/10.1002/hbm.22521.

Roscow, E.L., Chua, R., Costa, R.P., Jones, M.W., and Lepora, N. (2021). Learning offline: memory replay in biological and artificial reinforcement learning. Trends Neurosci. 44: 808–821, https://doi.org/10.1016/j.tins.2021.07.007.

Rostami, B., Anisuzzaman, D.M., Wang, C., Gopalakrishnan, S., Niezgoda, J., and Yu, Z. (2021). Multiclass wound image classification using an ensemble deep CNN-based classifier. Comput. Biol. Med. 134: 104536, https://doi.org/10.1016/j.compbiomed.2021.104536.

Rusu, A.A., Rabinowitz, N.C., Desjardins, G., Soyer, H., Kirkpatrick, J., Kavukcuoglu, K., Pascanu, R., and Hadsell, R. (2016). Progressive neural networks. arXiv Preprint, https://doi.org/10.48550/arXiv.1606.04671.

Sadeh, A., Mindell, J.A., Luedtke, K., and Wiegand, B. (2009). Sleep and sleep ecology in the first 3 years: a web-based study. J. Sleep Res. 18: 60–73, https://doi.org/10.1111/j.1365-2869.2008.00699.x.

Samanta, A., Alonso, A., and Genzel, L. (2020). Memory reactivations and consolidation: considering neuromodulators across wake and sleep. Curr. Opin. Physiol. 15: 120–127, https://doi.org/10.1016/j.cophys.2020.01.003.

Sanders, H., Wilson, M.A., and Gershman, S.J. (2020). Hippocampal remapping as hidden state inference. Elife 9: 1–31, https://doi.org/10.7554/elife.51140.

Sarel, A., Finkelstein, A., Las, L., and Ulanovsky, N. (2017). Vectorial representation of spatial goals in the hippocampus of bats. Science 355: 176–180, https://doi.org/10.1126/science.aak9589.

Sawamura, H., Shima, K., and Tanji, J. (2002). Numerical representation for action in the parietal cortex of the monkey. Nature 415: 918–922, https://doi.org/10.1038/415918a.

Sayer, R.J., Friedlander, M.J., and Redman, S.J. (1990). The time course and amplitude of EPSPs evoked at synapses between pairs of CA3/CA1 neurons in the hippocampal slice. J. Neurosci. 10: 826–836, https://doi.org/10.1523/JNEUROSCI.10-03-00826.1990.

Schilling, C., Gappa, L., Schredl, M., Streit, F., Treutlein, J., Frank, J., Deuschle, M., Meyer-Lindenberg, A., Rietschel, M., and Witt, S.H. (2018). Fast sleep spindle density is associated with rs4680 (Val108/158Met) genotype of catechol-O-methyltransferase (COMT). Sleep 41, https://doi.org/10.1093/sleep/zsy007.

Sezgin, E., Sirrianni, J., and Linwood, S.L. (2022). Operationalizing and implementing pretrained, large artificial intelligence linguistic models in the US health care system: outlook of Generative Pretrained Transformer 3 (GPT-3) as a service model. JMIR Med. Inform. 10: e32875, https://doi.org/10.2196/32875.

Shahbaba, B., Li, L., Agostinelli, F., Saraf, M., Cooper, K.W., Haghverdian, D., Elias, G.A., Baldi, P., and Fortin, N.J. (2022). Hippocampal ensembles represent sequential relationships among an extended sequence of nonspatial events. Nat. Commun. 13: 787, https://doi.org/10.1038/s41467-022-28057-6.

Sherry, D.F., Jacobs, L.F., and Gaulin, S.J.C. (1992). Spatial memory and adaptive specialization of the hippocampus. Trends Neurosci. 15: 298–303, https://doi.org/10.1016/0166-2236(92)90080-r.

Shin, H., Lee, J.K., Kim, J., and Kim, J. (2017). Continual learning with deep generative replay. Adv. Neural Inf. Process. Syst. 30.

Singh, G., Memoli, F., Ishkhanov, T., Sapiro, G., Carlsson, G., and Ringach, D.L. (2008). Topological analysis of population activity in visual cortex. J. Vis. 8: 11.1–11.18, https://doi.org/10.1167/8.8.11.

Sizemore, A.E., Phillips-Cremins, J.E., Ghrist, R., and Bassett, D.S. (2019). The importance of the whole: topological data analysis for the network neuroscientist. Netw. Neurosci. 3: 656–673, https://doi.org/10.1162/netn_a_00073.

Song, S., Sjöström, P.J., Reigl, M., Nelson, S., and Chklovskii, D.B. (2005). Highly nonrandom features of synaptic connectivity in local cortical circuits. PLoS Biol. 3: e68, https://doi.org/10.1371/journal.pbio.0030068.

Stevenson, R.F., Zheng, J., Mnatsakanyan, L., Vadera, S., Knight, R.T., Lin, J.J., and Yassa, M.A. (2018). Hippocampal CA1 gamma power predicts the precision of spatial memory judgments. Proc. Natl. Acad. Sci. U. S. A. 115: 10148–10153, https://doi.org/10.1073/pnas.1805724115.

Stoianov, I. and Zorzi, M. (2012). Emergence of a “visual number sense” in hierarchical generative models. Nat. Neurosci. 15: 194–196, https://doi.org/10.1038/nn.2996.

Stolz, B.J., Emerson, T., Nahkuri, S., Porter, M.A., and Harrington, H.A. (2021). Topological data analysis of task-based fMRI data from experiments on schizophrenia. J. Phys. Complex. 2: 035006, https://doi.org/10.1088/2632-072x/abb4c6.

Strubell, E., Ganesh, A., and McCallum, A. (2020). Energy and policy considerations for modern deep learning research. AAAI 34: 13693–13696, https://doi.org/10.1609/aaai.v34i09.7123.

Sunaga, M., Takei, Y., Kato, Y., Tagawa, M., Suto, T., Hironaga, N., Ohki, T., Takahashi, Y., Fujihara, K., Sakurai, N., et al. (2020). Frequency-specific resting connectome in bipolar disorder: an MEG study. Front. Psychiatr. 11: 597, https://doi.org/10.3389/fpsyt.2020.00597.

Tagawa, M., Takei, Y., Kato, Y., Suto, T., Hironaga, N., Ohki, T., Takahashi, Y., Fujihara, K., Sakurai, N., Ujita, K., et al. (2022). Disrupted local beta band networks in schizophrenia revealed through graph analysis: a magnetoencephalography study. Psychiatr. Clin. Neurosci. 76: 309–320, https://doi.org/10.1111/pcn.13362.

Takahashi, N., Sasaki, T., Matsumoto, W., Matsuki, N., and Ikegaya, Y. (2010). Circuit topology for synchronizing neurons in spontaneously active networks. Proc. Natl. Acad. Sci. U. S. A. 107: 10244–10249, https://doi.org/10.1073/pnas.0914594107.

Tenenbaum, J.B., Kemp, C., Griffiths, T.L., and Goodman, N.D. (2011). How to grow a mind: statistics, structure, and abstraction. Science 331: 1279–1285, https://doi.org/10.1126/science.1192788.

Terada, S., Sakurai, Y., Nakahara, H., and Fujisawa, S. (2017). Temporal and rate coding for discrete event sequences in the hippocampus. Neuron 94: 1248.e4–1262.e4, https://doi.org/10.1016/j.neuron.2017.05.024.

Tingley, D. and Peyrache, A. (2020). On the methods for reactivation and replay analysis. Philos. Trans. R. Soc. Lond. B Biol. Sci. 375: 20190231, https://doi.org/10.1098/rstb.2019.0231.

Tonolini, F., Jensen, B.S., and Murray-Smith, R. (2020). Variational sparse coding. In: Proceedings of the 35th uncertainty in artificial intelligence conference. PMLR 115, pp. 690–700.

Tononi, G. and Cirelli, C. (2014). Sleep and the price of plasticity: from synaptic and cellular homeostasis to memory consolidation and integration. Neuron 81: 12–34, https://doi.org/10.1016/j.neuron.2013.12.025.

Topaz, C.M., Ziegelmeier, L., and Halverson, T. (2015). Topological data analysis of biological aggregation models. PLoS One 10: e0126383, https://doi.org/10.1371/journal.pone.0126383.

Tse, D., Langston, R.F., Kakeyama, M., Bethus, I., Spooner, P.A., Wood, E.R., Witter, M.P., and Morris, R.G.M. (2007). Schemas and memory consolidation. Science 316: 76–82, https://doi.org/10.1126/science.1135935.

Vaidya, A.R., Jones, H.M., Castillo, J., and Badre, D. (2021). Neural representation of abstract task structure during generalization. Elife 10, https://doi.org/10.7554/elife.63226.

van de Ven, G.M., Siegelmann, H.T., and Tolias, A.S. (2020). Brain-inspired replay for continual learning with artificial neural networks. Nat. Commun. 11: 4069, https://doi.org/10.1038/s41467-020-17866-2.

van der Meer, M.A.A., Kemere, C., and Diba, K. (2020). Progress and issues in second-order analysis of hippocampal replay. Philos. Trans. R. Soc. Lond. B Biol. Sci. 375: 20190238, https://doi.org/10.1098/rstb.2019.0238.

Walker, M.P. and Stickgold, R. (2004). Sleep-dependent learning and memory consolidation. Neuron 44: 121–133, https://doi.org/10.1016/j.neuron.2004.08.031.

Wallenstein, G.V., Eichenbaum, H., and Hasselmo, M.E. (1998). The hippocampus as an associator of discontiguous events. Trends Neurosci. 21: 317–323, https://doi.org/10.1016/s0166-2236(97)01220-4.

Wang, L., Lei, B., Li, Q., Su, H., Zhu, J., and Zhong, Y. (2022). Triple-memory networks: a brain-inspired method for continual learning. IEEE Trans. Neural Network Learn. Syst. 33: 1925–1934, https://doi.org/10.1109/tnnls.2021.3111019.

Wei, X.X., Prentice, J., and Balasubramanian, V. (2015). A principle of economy predicts the functional architecture of grid cells. Elife 4: e08362, https://doi.org/10.7554/elife.08362.

Wikenheiser, A.M. and Redish, A.D. (2015). Hippocampal theta sequences reflect current goals. Nat. Neurosci. 18: 289–294, https://doi.org/10.1038/nn.3909.

Wills, T.J., Cacucci, F., Burgess, N., and O’Keefe, J. (2010). Development of the hippocampal cognitive map in preweanling rats. Science 328: 1573–1576, https://doi.org/10.1126/science.1188224.

Wittkuhn, L., Chien, S., Hall-McMaster, S., and Schuck, N.W. (2021). Replay in minds and machines. Neurosci. Biobehav. Rev. 129: 367–388, https://doi.org/10.1016/j.neubiorev.2021.08.002.

Xu, M., Shen, Y., Zhang, S., Lu, Y., Zhao, D., Tenenbaum, J.B., and Gan, C. (2022). Prompting decision transformer for few-shot policy generalization. In: Proceedings of the 39th International conference on machine learning. PMLR 162, pp. 24631–24645.

Yaguchi, A., Suzuki, T., Asano, W., Nitta, S., Sakata, Y., and Tanizawa, A. (2018). Adam induces implicit weight sparsity in rectifier neural networks. In: Proceedings ICMLA 2018, pp. 318–325, https://doi.org/10.1109/ICMLA.2018.00054.

Yamaguchi, M. (2010). Understanding mathematics. Chikumashobo, Tokyo.

Zeithamova, D. and Bowman, C.R. (2020). Generalization and the hippocampus: more than one story? Neurobiol. Learn. Mem. 175: 107317, https://doi.org/10.1016/j.nlm.2020.107317.

Zeki, S. (1999). Splendours and miseries of the brain. Philos. Trans. R. Soc. Lond. B Biol. Sci. 354: 2053–2065, https://doi.org/10.1098/rstb.1999.0543.

Zeng, H. and Sanes, J.R. (2017). Neuronal cell-type classification: challenges, opportunities and the path forward. Nat. Rev. Neurosci. 18: 530–546, https://doi.org/10.1038/nrn.2017.85.

Zhang, D. and Raichle, M.E. (2010). Disease and the brain’s dark energy. Nat. Rev. Neurol. 6: 15–28, https://doi.org/10.1038/nrneurol.2009.198.

Received: 2022-11-15
Accepted: 2023-02-26
Published Online: 2023-03-27
Published in Print: 2023-12-15

© 2023 Walter de Gruyter GmbH, Berlin/Boston