
Information entropy facilitates (not impedes) lexical processing during language comprehension

  • Brief Report
Psychonomic Bulletin & Review

Abstract

It is well known that contextual predictability facilitates word identification, but it is less clear whether the uncertainty associated with the current context (i.e., its lexical entropy) influences sentence processing. On the one hand, high entropy contexts may lead to interference due to a greater number of lexical competitors. On the other hand, predicting multiple lexical competitors may facilitate processing through the preactivation of shared semantic features. In this study, we examined whether entropy measured at the trial level (i.e., for each participant, for each item) corresponds to facilitatory or inhibitory effects. Trial-level entropy captures each individual’s knowledge about specific contexts and is therefore a more valid and sensitive measure of entropy (relative to the commonly employed item-level entropy). Participants (N = 112) completed two experimental sessions (with counterbalanced orders) that were separated by a 3- to 14-day interval. In one session, they produced up to 10 completions for sentence fragments (N = 647). In another session, they read the same sentences including a target word (whose entropy value was calculated based on the produced completions) while reading times were measured. We observed a facilitatory (not inhibitory) effect of trial-level entropy on lexical processing over and above item-level measures of lexical predictability (including cloze probability, surprisal, and semantic constraint). Extra analyses revealed that greater semantic overlap between the target and the produced responses facilitated target processing. Thus, the results lend support to theories of lexical prediction maintaining that prediction involves broad activation of semantic features rather than activation of full lexical forms.


Data Availability

All the data, materials, and analysis code for this article are available at https://osf.io/buwxs/. This study was not pre-registered.

Notes

  1. Note that greater entropy reduction would result in slower (not faster) reading times on the target word (bowl).

  2. Note that in addition to semantic features, a cohort of activated lexical items may also lead to activation of form-related features such as phonological and/or orthographic information. Importantly, form-related features have been shown to exert both facilitatory and inhibitory effects on target processing depending on the frequency of the target word relative to the activated cohort (e.g., Carreiras et al., 1997; Karimi & Diaz, 2020; Vergara-Martinez & Swaab, 2012). Although the potential effects of such form-related factors would constitute a valid and interesting research question, this was not the aim of the current research. Given the exponential growth of interaction terms with the addition of these factors and their interactions, we believe a proper investigation of the potential effects of such variables requires a considerably larger dataset with proper controls implemented in the experimental design. Despite this, we still examined the potential effect of orthographic similarity on reading times. This analysis revealed an inhibitory (not facilitatory) main effect of orthographic overlap. Moreover, this variable did not reliably interact with trial entropy (see section Investigating the Mechanism Underlying the Effect of Trial Entropy).

  3. Due to dropouts and experimenter errors, the distribution of participants was not exactly balanced across sessions. Fifty-four participants completed the cloze task before the reading task, and 58 participants did the reading task before the cloze task.

  4. We thank an anonymous reviewer for suggesting these analyses.

  5. However, we believe that more direct comparisons of trial-level, item-level, and participant-level entropy (subject-specific prediction fluency) will be an important goal for future research.

  6. We thank Kiel Christianson for bringing this discussion point to our attention.


Acknowledgements

We thank Trevor Brothers and Martin Pickering for providing feedback on this manuscript. We also thank Soran Malaie and Kathryn Walters for their help with data collection.

H.K. was supported by the National Institute on Aging of the National Institutes of Health under award number 1R15AG073945-01A1. The data, materials, and analysis scripts for all experiments are available at: https://osf.io/buwxs/.

Author information

Corresponding author

Correspondence to Hossein Karimi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix. Extra confirmatory analyses

As mentioned above, in addition to the main analyses, we also performed extra analyses on the target region where the effect of entropy was significant to corroborate the main findings. The results of these analyses are briefly outlined in the main text. Here, we present these analyses in more detail. The statistical models corresponding to these analyses can be found in the supplementary materials.

  • 1. Memory (session order) did not affect reading times on any of the critical regions.

Because participants who took part in the SPR session first were necessarily exposed to the target word, and because our main measure was reading times on the target word (and the following regions), we checked the potential effect of session order (SPR-first vs. Cloze-first; a proxy for memory carryover effects) on reading times as well as on the probability of guessing the target word during the cloze session. Importantly, session order did not have a significant effect on RTs (the primary dependent variable) on any of the critical regions (“target-1”: t = −.32, “target”: t = −.38, “target+1”: t = −.28, “target+2”: t = −.38, “target+3”: t = .07). However, session order did have a significant effect on the predictability of the target word, such that the probability of the target word being included in the response set was higher in the SPR-first relative to the Cloze-first session (46.4% vs. 44.1%, respectively; t = 2.45). The full results for these analyses are reported in the supplementary materials.

  • 2. The effect of entropy was still significant when the probability values were ignored and only the number of produced responses was the predictor.

Because human participants may not be good at judging probabilities, we replaced entropy values with the number of responses produced per trial. The results showed that the more responses were produced in a trial, the faster the reading times were on the target word on that trial (t = −2.84), indicating that the observed effect of entropy does not depend on the given probability values. Note that this result also indicates that the way the original probabilities were normalized was not critical for obtaining this effect.
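To make the two predictors in this analysis concrete, the sketch below computes Shannon entropy for a single trial from a participant's probability ratings and compares it with the simple response count. It is a minimal illustration, not the authors' analysis code: the function name `trial_entropy`, the sum-to-one normalization, the base-2 logarithm, and the example ratings are all assumptions.

```python
import math

def trial_entropy(ratings):
    """Shannon entropy (in bits) of one trial's response set.

    `ratings` holds the probability a participant assigned to each completion.
    Assumption: ratings are rescaled to sum to 1 before entropy is computed.
    """
    positive = [r for r in ratings if r > 0]
    total = sum(positive)
    if total == 0:
        return 0.0
    probs = [r / total for r in positive]
    return sum(-p * math.log2(p) for p in probs)

# Hypothetical ratings for three trials (not data from the study).
one_response = [100]
four_equal = [25, 25, 25, 25]
four_skewed = [70, 10, 10, 10]

print(len(one_response), trial_entropy(one_response))  # 1 0.0
print(len(four_equal), trial_entropy(four_equal))      # 4 2.0
print(len(four_skewed), trial_entropy(four_skewed))
```

In this sketch, `four_equal` and `four_skewed` have the same response count but different entropy, which is why the count-only analysis serves as a check that does not rely on the probability ratings or on how they were normalized.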

  • 3. The sole effect of entropy was significant (and facilitatory), ruling out the possibility that the observed effect of entropy may arise from multicollinearity between the predictors.

Because entropy and cloze probability were moderately correlated (−.41), including them in the same regression model could have caused issues due to multicollinearity. Thus, we ran a separate model predicting reading times on the target word with only entropy as the predictor. This model still showed a significant facilitatory effect of entropy (t = −2.54).
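As a rough gauge of how much collinearity a correlation of this size implies, the variance inflation factor for a model with exactly two predictors is 1 / (1 − r²). The sketch below applies that standard formula to the reported correlation. The two-predictor simplification is an assumption made for illustration, since the main models contain additional terms, and the function name is hypothetical.

```python
def vif_two_predictors(r):
    """Variance inflation factor for a model with exactly two predictors correlated at r."""
    return 1.0 / (1.0 - r ** 2)

# Reported correlation between entropy and cloze probability.
vif = vif_two_predictors(-0.41)
print(round(vif, 2))  # 1.2
```

A value this close to 1 suggests only modest variance inflation under the stated simplification, which fits with the entropy-only model reaching the same conclusion.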

  • 4. The entropy effect was independent of surprisal.

In another analysis, we replaced cloze probability with surprisal in our original main analyses to ensure that the effect of entropy is independent of surprisal. This model still showed a significant facilitatory effect of entropy (t = −2.77).
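Surprisal is conventionally defined as the negative log probability of a word given its context. The sketch below converts a cloze probability to surprisal under that definition. The base-2 logarithm and the `floor` value used for targets that were never produced are assumptions, as this section does not state how either was handled in the reported analysis.

```python
import math

def surprisal(cloze_probability, floor=0.001):
    """Surprisal in bits: -log2(p).

    `floor` is a hypothetical lower bound that avoids log(0) when the
    target was never produced in the cloze task.
    """
    p = max(cloze_probability, floor)
    return -math.log2(p)

print(surprisal(0.5))   # 1.0
print(surprisal(0.25))  # 2.0
```

Because surprisal is a monotone but nonlinear transformation of cloze probability, swapping one for the other mainly changes how differences among low-probability words are weighted.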

  • 5. The entropy effect was independent of semantic constraint.

We also replaced cloze probability with sentence constraint in our main models. This model still showed a significant facilitatory effect of entropy (t = −3.04) on the target word, indicating that this effect is independent of sentential constraint as well.

  • 6. Entropy had similar effects when the target word was produced in the response set vs. when it was not.

Another concern was that participants who provided their cloze responses second, after the self-paced reading task, were slightly more likely to provide the target word in their response set (see above). To determine if a memory effect was driving our entropy results, we re-ran our original analyses on subsets of trials in which the critical word was (or was not) included in the response set. When the target was included within the responses, entropy had a non-significant effect on reading times on the target word (t = −1.68). However, when the target word was not present in the response set, the effect of trial entropy was significant (t = −2.46). Note that these results are underpowered because they are based on fewer data points, and they should therefore not be overinterpreted. In fact, the number of trials in which the target word was predicted was smaller than the number in which the target was not predicted (10,600 vs. 12,816, respectively), which may explain the non-significant but trending effect of entropy within target-present trials. In any case, it appears that the facilitatory effect of trial-level entropy was not a result of our within-subject design. If anything, the observed effect of entropy seems to be slightly greater when participants excluded the target word from their response set during the cloze norming task, suggesting that memory retrieval did not drive the observed facilitatory effects.

  • 7. The effect of entropy was independent of the length and frequency of the target word, as well as the words preceding the target, ruling out the possibility that the observed effect depends on target frequency or that it may reflect spillover effects from earlier regions.

Although the target words were identical across the three levels of semantic constraint, it is possible that the target word’s frequency might have still influenced reading times. We therefore performed two separate analyses in which the length and frequency of target words were added to trial entropy as predictors of reading times. These models showed that the effect of trial entropy remained significant (t = −2.53, t = −2.48, respectively). Moreover, although the word immediately before the target (i.e., “target −1”) was always constant across all conditions, the words further preceding the target varied across constraint conditions (although an average of two words were constant across all items). It was therefore still possible that the observed effect of entropy on the target could have originated in earlier regions and spilled over to the target region. As expected, the results revealed that both the length and frequency of these two words were significant predictors of reading times on the target. However, entropy remained a significant predictor both in the model including the length and in the model including the frequency of the preceding words (t = −2.81, t = −2.49, respectively), demonstrating its independent influence on reading times on the target word. For all analyses reported in this section, we gathered data on the length and log-transformed subtitle frequency (“SUBTLWF”) of the relevant words from the English Lexicon Project (https://elexicon.wustl.edu/), and included them as predictors in the regression models. Because length and frequency are highly correlated, we conducted two separate analyses, one including the length of the relevant words and the other including their frequency.

  • 8. The effect of trial entropy was in the same direction across the Cloze-first and SPR-first sessions.

Although the interaction between trial entropy and session order was not significant (see Table 4), we further examined the effect of trial entropy across the two sessions. The results revealed that the effect of trial entropy was similar in both sessions. Specifically, the t values associated with the effect of trial entropy across the Cloze-first and SPR-first sessions were −1.76 and −2.44, respectively. Note that the nonsignificant effect of trial entropy in the Cloze-first sessions could simply be due to lack of power (recall that we had 54 participants in the Cloze-first order vs. 58 in the SPR-first order). Interestingly, the effect of cloze probability was not significant in SPR-first sessions. This could be because the range of cloze values obtained in the cloze sessions was more limited due to exposure to the target word in the SPR sessions.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Karimi, H., Weber, P. & Zinn, J. Information entropy facilitates (not impedes) lexical processing during language comprehension. Psychon Bull Rev (2024). https://doi.org/10.3758/s13423-024-02463-x

  • DOI: https://doi.org/10.3758/s13423-024-02463-x
