Published by De Gruyter Mouton, March 30, 2021

Corpus-based typology: applications, challenges and some solutions

Natalia Levshina
From the journal Linguistic Typology

Abstract

Over the last few years, the number of corpora that can be used for language comparison has dramatically increased. The corpora are so diverse in structure, size and annotation style that a novice might not know where to start. The present paper charts this new and changing territory, providing a few landmarks, warning signs and safe paths. Although no corpus at present can replace the traditional type of typological data based on language description in reference grammars, corpora can help with diverse tasks, being particularly well suited for investigating probabilistic and gradient properties of languages and for discovering and interpreting cross-linguistic generalizations based on processing and communicative mechanisms. At the same time, using corpora for typological purposes brings not only advantages and opportunities but also numerous challenges. This paper also contains an empirical case study addressing two pertinent problems: the role of text types in language comparison and the problem of the word as a comparative concept.


Corresponding author: Natalia Levshina [nɐ’talʲjə ’lʲefʃɪnə], Max Planck Institute for Psycholinguistics, Wundtlaan 1, 6525 Nijmegen, The Netherlands

Award Identifier / Grant number: 024.001.006

Appendix

Table 1: Greenberg’s corpus-based indices for morphological typology.

Index | Meaning | Illustrations
index of synthesis | the ratio of the number of morphemes to the number of words | sing-s has 2 morphemes
index of agglutination | the ratio of agglutinative constructions to the number of morpheme junctures | kind-ness is an agglutinative juncture; dep-th is a non-agglutinative juncture
compounding index | the ratio of the number of root morphemes to the number of words | black-bird has two root morphemes
derivational index | the ratio of derivational morphemes to words | lion-ess has one derivational morpheme (-ess)
gross inflectional index | the ratio of inflectional morphemes to words | sing-s has one inflectional morpheme (-s)
prefixial index | the ratio of prefixes to the number of words | re-make has one prefix
suffixial index | the ratio of suffixes to the number of words | sing-s has one suffix
isolational index | the ratio of grammatical dependencies without inflectional morphemes to the number of grammatical dependencies | nice person has no features expressed by inflectional morphemes
pure inflectional index | the ratio of grammatical relationships (features) expressed by “pure” inflections (i.e. not concord) to the number of grammatical dependencies | he walk-ed has one feature (past tense) formally expressed by a morpheme
concordial index | the ratio of formally expressed concord features to the number of grammatical dependencies | he sing-s has two concord features (3rd person, singular) formally expressed by a morpheme
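
The word-based ratios in Table 1 can be illustrated with a small computation. The following is a minimal sketch, not an implementation from the paper. It assumes a hypothetical annotation scheme in which every word is a list of (form, kind, position) triples, with kind being "root", "deriv" or "infl" and position being "prefix", "suffix" or None. The function name, the labels and the toy sample (built from the illustrations in Table 1) are all assumptions made for this example. The index of agglutination and the three dependency-based indices are left out because they would need annotation of morpheme junctures and grammatical dependencies.

```python
from collections import Counter

# Hypothetical annotation scheme (an assumption for this sketch):
# each word is a list of (form, kind, position) triples.
#   kind:     "root", "deriv" (derivational) or "infl" (inflectional)
#   position: "prefix", "suffix", or None for roots
sample = [
    [("sing", "root", None), ("s", "infl", "suffix")],
    [("black", "root", None), ("bird", "root", None)],
    [("lion", "root", None), ("ess", "deriv", "suffix")],
    [("re", "deriv", "prefix"), ("make", "root", None)],
    [("kind", "root", None), ("ness", "deriv", "suffix")],
]


def greenberg_word_indices(words):
    """Compute the word-based ratios from Table 1 for a segmented sample."""
    n_words = len(words)
    if n_words == 0:
        raise ValueError("need at least one word")

    # Flatten the sample into one list of morphemes.
    morphemes = [m for word in words for m in word]
    kinds = Counter(kind for _, kind, _ in morphemes)
    positions = Counter(pos for _, _, pos in morphemes)

    # Every index below is a count divided by the number of words.
    return {
        "synthesis": len(morphemes) / n_words,
        "compounding": kinds["root"] / n_words,
        "derivational": kinds["deriv"] / n_words,
        "gross_inflectional": kinds["infl"] / n_words,
        "prefixial": positions["prefix"] / n_words,
        "suffixial": positions["suffix"] / n_words,
    }


if __name__ == "__main__":
    for name, value in greenberg_word_indices(sample).items():
        print(f"{name}: {value:.2f}")
```

With real data, the sample would come from a morphologically segmented corpus. The values depend heavily on how words and morphemes are delimited, which is the comparability problem raised in the abstract.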


Supplementary Material

The online version of this article offers supplementary material (https://doi.org/10.1515/lingty-2020-0118).


Received: 2020-08-31
Accepted: 2021-02-06
Published Online: 2021-03-30
Published in Print: 2022-05-25

© 2021 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 7.5.2024 from https://www.degruyter.com/document/doi/10.1515/lingty-2020-0118/html