
Nativism and empiricism in artificial intelligence


Abstract

Historically, the dispute between empiricists and nativists in philosophy and cognitive science has concerned human and animal minds (Margolis and Laurence in Philos Stud: An Int J Philos Anal Tradit 165(2): 693–718, 2013; Ritchie in Synthese 199(Suppl 1): 159–176, 2021; Colombo in Synthese 195: 4817–4838, 2018). But recent progress has highlighted how empiricist and nativist concerns arise in the construction of artificial systems (Buckner in From deep learning to rational machines: What the history of philosophy can teach us about the future of artificial intelligence, Oxford University Press, 2023). This paper uses nativism and empiricism to address questions about the nature of artificial intelligence and its trajectory. It begins by defining the nativism/empiricism debate in terms of the generality of a system: nativist systems have initial states with domain-specific features; empiricist systems have initial states with only domain-general features. With the debate framed in this way, it then explores a variety of arguments for nativism and empiricism in AI. These arguments revolve around two different questions which must be distinguished: whether nativism is the only possible approach to developing human-level AI (HLAI), and whether nativism is the most practical approach to developing HLAI. On the first question, the paper argues that nativism is quite clearly not the only possible approach to developing HLAI, contrary to what is sometimes suggested. Existing arguments for the necessity of nativism are unconvincing, because they analogize from poverty-of-the-stimulus arguments about humans, while AIs often have access to much more data than humans. It then argues that the case of evolution gives us a compelling argument against nativism. On the second, practical question, the paper argues that there is a tradeoff between the advantages of encoding innate machinery directly and the advantages of evolving or learning it. However, as the past decade has shown, empiricism is a much more viable path to greater capability levels, given the ‘bitter lesson’ (Sutton in The Bitter Lesson, 2019) that encoding the ‘correct’ knowledge in AI systems is perennially outperformed by more empiricist methods that leverage large-scale data and computation.


Notes

  1. This is not to say that these systems are fully human level in language production (whatever that might be), and they are obviously doing something quite different from what humans are doing when they produce language. How far we can expect these techniques to scale is still an open question.

  2. Marcus (2020) argues that GPT-2 shows the limits of scaling up empiricist LLMs, noting that “Every linguistic class in the 1980's and 1990s was filled with analyses of syntactic tree structures; GPT-2 has none”.

  3. For another kind of connection, see Buckner’s (2018) examination of how convolutional neural networks shed light on abstraction in empiricist theories of the mind. See an extended and related treatment of the topics of this paper in Buckner (2023).

  4. Sometimes the issue is framed as what the newborn mind is like; however, this is not quite right, because (a) learning can occur before birth and (b) innate features, though unlearned, may not manifest until later in life. See Laurence and Margolis (2001), Sect. 2.

  5. I say ‘early work’ here and ‘early Chomsky’ throughout because Chomsky’s later Minimalist Program (e.g. Chomsky 1999) can be construed as less nativist than his earlier work (e.g. Chomsky 1965).

  6. For more modern versions of empiricism, see (among many others) Prinz (2002), Barsalou (1999), Elman et al. (1997).

  7. This framing can help us to avoid conflating nativism and empiricism with any particular technique that is often associated with these traditions, such as the association between connectionism and empiricism, or between symbolic methods and nativism. These general frameworks can admit of nativist or empiricist construals (Colombo 2018).

  8. Per the definition, this sentence can be read as “…mechanisms, states, and processes”. Unless otherwise specified, “mechanisms” will serve as a convenient shorthand when the same points apply to states and processes.

  9. All that said, the issue of nativism vs. empiricism in AI is relatively young, and as it evolves it may well be that Laurence and Margolis’s framing is a better fit in the AI context as well. In either case, it will be important to keep in mind that it’s a matter of degree how much domain-specific machinery a system has.

  10. Although there is some skepticism about the possibility of human-level artificial intelligence, or about the coherence of the notion, in this paper I take for granted, as many AI researchers do, that the notion is coherent and that HLAI cannot be ruled out as impossible a priori, especially when it is defined in terms of performance, as it has been in this paper. The performance conception allows us to sidestep general metaphysical questions about whether an artificial system would ‘really’ have understanding, intelligence, and so on.

  11. For simplicity, this means that I am reserving ‘empiricism’ generally for techniques on the far end of the empiricist spectrum: a ‘single’ learning mechanism. Though of course, in practice it may make more sense to speak of degrees of nativism versus empiricism (cf. Colombo 2018).

  12. Cf. Colombo (2018) for an illuminating proposal for, likewise, using Bayesian frameworks to precisify debates about the relevant learning mechanisms for intelligent systems.

  13. A counterexample to metaphysical necessity nativism would be an astronomically large “empiricist” Blockhead that “learns” everything it needs to know by encountering exactly the right dataset: the one that lists all the “right” actions in every task it will ever face. (The Blockhead thought experiment first appears in Block (1981), though not under its now-canonical name; the name is sometimes attributed to Jackson, but Jackson and Pettit (1993) state that “we do not know who first coined this term”.) Perhaps this counterexample could be excluded by requiring that a system learn robustly in a variety of environments. For a discussion of Blockhead-style arguments in the context of large language models, see Millière and Buckner (2024).
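
To make the Blockhead scenario concrete, here is a minimal toy sketch in Python (the tasks, table entries, and function names are hypothetical inventions for illustration, not anything from Block’s paper):

```python
# A toy "Blockhead": behavior is driven entirely by a giant lookup table,
# not by any learning mechanism. The real thought experiment imagines an
# astronomically large table covering every input the system will ever face.
LOOKUP_TABLE = {
    ("chess", "1. e4"): "1... c5",      # hypothetical entries
    ("greeting", "hello"): "hi there",
}

def blockhead_act(task: str, observation: str) -> str:
    # No generalization: inputs absent from the table simply fail, which is
    # why requiring robust learning across environments might exclude it.
    return LOOKUP_TABLE.get((task, observation), "<no entry>")
```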

  14. For examples of general poverty of the stimulus arguments, see Samet and Zaitchik (2017).

  15. I mention GPT-3 instead of the more recent GPT-4 because OpenAI has not disclosed details about the latter system (OpenAI 2023).

  16. This is just to say that GPT-3 bakes in far less than a nativist might have predicted (e.g. Marcus 2020)—not that GPT-3 surmounts the same challenges that an empiricist human language learner would. As Laurence and Margolis (2001; Sect. 3.3) point out, such a learner would not just have to learn a grammar from words and sentences—such a learner must also figure out which noises are words and sentences in the first place, what the phonemes are, which aspects of speech are meaning-relevant, and more. A pure-text LLM does not face this challenge, as its inputs and outputs are all tokens.

  17. Hofstadter (1979) speculates: “There may be programs that can beat anyone at chess, but they will not be exclusively chess programs. They will be programs of general intelligence.” Hofstadter (quoted in Weber 1996): “My God, I used to think chess required thought… Now, I realize it doesn’t.”

  18. While this form of evolutionary empiricism is couched in terms of deep learning, it could easily be generalized to other forms of learning-driven AI research.

  19. Fodor (2008) maintains that there are concepts that are not innate but nonetheless are not learned; he does so by arguing that they are acquired non-psychologically, by lower-level ‘biological’ processes which do not qualify as learning (cf. Margolis & Laurence 2011). Thank you to a reviewer for pressing this point.

  20. As Sutton and Barto (2018) point out, evolution uses very coarse-grained evidence: it is indifferent to which states of the world any individual creature passes through. But there doesn’t seem to be a principled way to exclude this as an “evidential” process, even if it is very coarse-grained with respect to the batches of evidence it updates on.

  21. But perhaps it will not be true that we can see our innate machinery as “learned” if it is a mere byproduct, a spandrel, or simply a fluke. However, this is less likely to be true given that our innate machinery, or at least much of it, is shared with other animals.

  22. E.g., in a reply to commentators, the authors write that they “did not intend to take a strong stance on ‘nature versus nurture’ or ‘designing versus learning’ for how our proposed ingredients should come to be incorporated into more human-like AI systems” (p. 52).

  23. As with so many foundational issues in AI, this point was anticipated by Turing (1950) in “Computing Machinery and Intelligence”: “We have thus divided our problem into two parts: the child-programme and the education process. These two remain very closely connected. We cannot expect to find a good child-machine at the first attempt. One must experiment with teaching one such machine and see how well it learns. One can then try another and see if it is better or worse. There is an obvious connection between this process and evolution, by the identifications: Structure of the child machine = Hereditary material; Changes = Mutations; Natural selection = Judgment of the experimenter. One may hope, however, that this process will be more expeditious than evolution. The survival of the fittest is a slow method for measuring advantages. The experimenter, by the exercise of intelligence, should be able to speed it up. Equally important is the fact that he is not restricted to random mutations. If he can trace a cause for some weakness he can probably think of the kind of mutation which will improve it.”
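
Read as an algorithm, Turing’s proposal is a simple evolutionary loop in which the experimenter supplies the selection pressure. A minimal Python sketch under that reading (the fitness function, mutation operator, and toy “machines” below are hypothetical stand-ins):

```python
import random

def evolve_child_machine(seed, mutate, fitness, generations=1000):
    """Turing's analogy: structure of the child machine = hereditary
    material; changes = mutations; the experimenter's judgment = natural
    selection (here, keeping whichever candidate scores higher)."""
    best = seed
    for _ in range(generations):
        # Turing notes the mutation need not be random: the experimenter
        # can direct changes at a known weakness to speed things up.
        candidate = mutate(best)
        if fitness(candidate) > fitness(best):
            best = candidate
    return best

# Toy usage: "machines" are integers, and fitness rewards closeness to 42.
best = evolve_child_machine(
    seed=0,
    mutate=lambda m: m + random.choice([-1, 1]),
    fitness=lambda m: -abs(m - 42),
)
```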

  24. This is a place where my “strict” framing matters. It could be that the appropriate amount of innate machinery for efficiency is only minimal; so that on my strict framing, this counts as a nativist AI but others might classify it as an empiricist AI.

  25. Again, I am being concessive to the nativist that we “know” these things.

  26. It’s possible that the amount of innate machinery needed to achieve explainability is minimal. If this is the case, then my “strict” framing may be disadvantaging the empiricist here; see p. 4.

  27. In stipulating that the reasons / explanations be humanly comprehensible, I’m following Doshi-Velez and Kim (2017), who gloss explainability as “the ability to explain or to present in understandable terms to a human.” This means that human cognitive limitations are an important part of what makes a system interpretable. This definition also applies whether or not one thinks that artificial systems can have “reasons” for action, as opposed to explanations of their actions.

  28. See Goodfellow et al. (2014) for the canonical paper on adversarial examples.
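
That paper also introduces the fast gradient sign method (FGSM) for constructing such examples. A minimal PyTorch sketch of the method (the model, the epsilon value, and the [0, 1] pixel range are illustrative assumptions, not details from this paper):

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    """Fast gradient sign method (Goodfellow et al., 2014): take one step
    in the direction of the sign of the loss gradient with respect to the
    input, yielding an input that looks unchanged to a human but can flip
    the model's prediction."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()     # bounded L-infinity perturbation
    return x_adv.detach().clamp(0.0, 1.0)   # keep pixels in a valid range
```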

  29. This paper has mostly focused on insights for thinking about AI. But of course, as I’ve mentioned, there is the promise that these debates clarify our thinking about humans; I hold out a similar hope to Colombo (2018) that the formal methodology imposed by AI, like the formalisms of Bayesianism discussed by Colombo (2018), will help us continue to precisify many of the relevant issues.

References

  • Baldassarre, G., Santucci, V. G., Cartoni, E., & Caligiore, D. (2017). The architecture challenge: future artificial-intelligence systems will require sophisticated architectures, and knowledge of the brain might guide their construction. The Behavioral and Brain Sciences, 40, e254. https://doi.org/10.1017/S0140525X17000036


  • Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22(4), 577–660. https://doi.org/10.1017/S0140525X99002149


  • Block, N. (1981). Psychologism and behaviorism. Philosophical Review, 90, 5–43.


  • Botvinick, M., Barrett, D. G., Battaglia, P., de Freitas, N., Kumaran, D., Leibo, J. Z., & Hassabis, D. (2017). Building machines that learn and think for themselves. Behavioral and Brain Sciences, 40.

  • Buckner, C. J. (2023). From deep learning to rational machines: What the history of philosophy can teach us about the future of artificial intelligence. Oxford University Press.

  • Buckner, C. (2018). Empiricism without magic: Transformational abstraction in deep convolutional neural networks. Synthese, 195(12), 5339–5372. https://doi.org/10.1007/s11229-018-01949-1

  • Carey, S. (2011). Précis of The Origin of Concepts. Behavioral and Brain Sciences, 34(3), 113–124.

  • Chomsky, N. (1965). Aspects of the theory of syntax. MIT Press.

  • Chomsky, N. (1999). Derivation by phase. MIT.


  • Colombo, M. (2018). Bayesian cognitive science, predictive brains, and the nativism debate. Synthese, 195, 4817–4838.


  • Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.

  • Elman, J. L., Bates, E. A., Johnson, M. H., Karmiloff-Smith, A., Parisi, D., & Plunkett, K. (1997). Rethinking Innateness: A Connectionist Perspective on Development (Reprint). A Bradford Book / The MIT Press.


  • Fodor, J. A. (2008). LOT 2: The language of thought revisited. Oxford University Press.


  • Goodfellow, I. J., Shlens, J., & Szegedy, C. (2014). Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.

  • Goodman, N. (1955). Fact, fiction, and forecast. Harvard University Press.

  • Halevy, A., Norvig, P., & Pereira, F. (2009). The unreasonable effectiveness of data. IEEE Intelligent Systems, 24(2), 8–12.


  • Hespos, S. J., & VanMarle, K. (2012). Physics for infants: Characterizing the origins of knowledge about objects, substances, and number. Wiley Interdisciplinary Reviews: Cognitive Science, 3(1), 19–27.


  • Hofstadter, D. R. (1979). Gödel, Escher, Bach: An Eternal Golden Braid. Basic Books.


  • De Houwer, J., Barnes-Holmes, D., & Moors, A. (2013). What is learning? On the nature and merits of a functional definition of learning. Psychonomic Bulletin & Review, 20(4), 631–642. https://doi.org/10.3758/s13423-013-0386-3

  • Ilyas, A., Santurkar, S., Tsipras, D., Engstrom, L., Tran, B., & Madry, A. (2019). Adversarial examples are not bugs, they are features. arXiv preprint arXiv:1905.02175.

  • Jackson, F., & Pettit, P. (1993). Folk belief and commonplace belief. Mind and Language, 8(2), 298–305.


  • Karlsson, F., Voutilainen, A., Heikkilä, J., & Anttila, A. (Eds.). (2011). Constraint Grammar: A language-independent system for parsing unrestricted text (Vol. 4). Walter de Gruyter.

  • Lake, B. M., Ullman, T. D., Tenenbaum, J. B., & Gershman, S. J. (2017). Building machines that learn and think like people. Behavioral and Brain Sciences, 40, e253.


  • Laurence, S., & Margolis, E. (2001). The poverty of the stimulus argument. British Journal for the Philosophy of Science, 52(2), 217–276.

  • LeCun, Y., & Marcus, G. (2017). “Does artificial intelligence need more innate machinery?” Debate hosted by the NYU Center for Mind, Brain, and Consciousness, October 5, 2017. https://wp.nyu.edu/consciousness/innate-ai/

  • Linzen, T., & Baroni, M. (2021). Syntactic structure from deep learning. Annual Review of Linguistics, 7. https://doi.org/10.1146/annurev-linguistics-032020-051035

  • Lipton, Z. C. (2018). The mythos of model interpretability. Queue, 16(3), 31–57.


  • Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60, 91–110.


  • Marcus, G. (2018a). Deep learning: A critical appraisal. arXiv preprint arXiv:1801.00631.

  • Marcus, G. (2018b). Innateness, AlphaZero, and artificial intelligence. arXiv preprint arXiv:1801.05667.

  • Marcus, G. (2020). GPT-2 and the nature of intelligence. The Gradient.

  • Marcus, G. (2022). Deep learning is hitting a wall. Nautilus, March 2022.

  • Margolis, E., & Laurence, S. (2013). In defense of nativism. Philosophical Studies: An International Journal for Philosophy in the Analytic Tradition, 165(2), 693–718.


  • Marsland, T. A. (1990). A short history of computer chess. In Computers, Chess, and Cognition (pp. 3–7). New York, NY: Springer New York.

  • Millière, R., & Buckner, C. (2024). A Philosophical Introduction to Language Models--Part I: Continuity With Classic Debates. arXiv preprint arXiv:2401.03910.

  • Minsky, M., & Papert, S. (2017). Perceptrons (reissue of the 1988 expanded edition with a new foreword by Léon Bottou). The MIT Press. https://mitpress.mit.edu/books/perceptrons-reissue-1988-expanded-edition-new-foreword-leon-bottou

  • OpenAI (2023). GPT-4 Technical Report.

  • Prinz, J. J. (2002). Furnishing the mind: Concepts and their perceptual basis. MIT Press.

  • Ramsey, W., & Stich, S. (1990). Connectionism and Three Levels of Nativism. Synthese, 82(2), 177–205.


  • Reed, S., Zolna, K., Parisotto, E., Colmenarejo, S. G., Novikov, A., Barth-Maron, G., & de Freitas, N. (2022). A generalist agent. arXiv preprint arXiv:2205.06175.

  • Ritchie, J. B. (2021). What’s wrong with the minimal conception of innateness in cognitive science? Synthese, 199(Suppl 1), 159–176.


  • Samet, J., & Zaitchik, D. (2017). Innateness and contemporary theories of cognition. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Fall 2017 ed.). Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/fall2017/entries/innateness-cognition/

  • Samet, J. (1987). Troubles with Fodor’s nativism. Midwest Studies in Philosophy, 10(1), 575–594.

  • Santoro, A. (2019). Thoughts on “A critique of pure learning”, Zador (2019). Medium, October 17, 2019. https://medium.com/@adamsantoro/thoughts-on-a-critique-of-pure-learning-zador-2019-820a7dbbc783

  • Schrittwieser, J., Antonoglou, I., Hubert, T., Simonyan, K., Sifre, L., Schmitt, S., & Silver, D. (2020). Mastering Atari, Go, chess and shogi by planning with a learned model. Nature, 588(7839), 604–609.

  • Semenova, L., Rudin, C., & Parr, R. (2022). On the existence of simpler machine learning models. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (pp. 1827–1858).

  • Sevilla, J., Heim, L., Ho, A., Besiroglu, T., Hobbhahn, M., & Villalobos, P. (2022). Compute trends across three eras of machine learning. In 2022 International Joint Conference on Neural Networks (IJCNN) (pp. 1–8). IEEE.

  • Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., et al. (2018). A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, 362(6419), 1140–1144. https://doi.org/10.1126/science.aar6404

  • Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., Hubert, T., Baker, L., Lai, M., & Bolton, A. (2017). Mastering the game of Go without human knowledge. Nature, 550(7676), 354–359.

  • Spelke, E. S., & Blass, J. A. (2017). Intelligent machines and human minds. Behavioral and Brain Sciences, 40. https://doi.org/10.1017/S0140525X17000267

  • Spelke, E. S. (2022). What babies know: Core knowledge and composition (Vol. 1). Oxford University Press.

  • Such, F. P., Madhavan, V., Conti, E., Lehman, J., Stanley, K. O., & Clune, J. (2017). Deep neuroevolution: Genetic algorithms are a competitive alternative for training deep neural networks for reinforcement learning. arXiv preprint arXiv:1712.06567.

  • Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press.

  • Sutton, R. S. (2019). The bitter lesson. March 13, 2019. http://www.incompleteideas.net/IncIdeas/BitterLesson.html

  • The mystery of Go, the ancient game that computers still can’t win. (2014). Wired. Accessed February 20, 2019. https://www.wired.com/2014/05/the-world-of-computer-go/

  • Turing, A. (1950). Computing machinery and intelligence. Mind, LIX(236), 433–460. https://doi.org/10.1093/mind/LIX.236.433

  • Villalobos, P. (2023). Scaling laws literature review. Published online at epochai.org. Retrieved from https://epochai.org/blog/scaling-laws-literature-review

  • Weber, B. (1996). Mean chess-playing computer tears at meaning of thought. The New York Times, February 19, 1996.


Author information


Correspondence to Robert Long.


Cite this article

Long, R. Nativism and empiricism in artificial intelligence. Philos Stud 181, 763–788 (2024). https://doi.org/10.1007/s11098-024-02122-w
