1 Introduction

The investigation of the relationship between language and economic behavior is a recent addition to the economic literature. One aspect of languages has been put under particular scrutiny, namely the way time is encoded and the way predictions about future events are expressed.

For events outside individual control, such as weather conditions, some languages require predictions about future circumstances to be expressed by means of an explicit and obligatory linguistic marker, either a periphrastic form (e.g. English, Tomorrow it will rain or It’s going to rain) or an inflectional form (e.g. French, Demain il pleuvra, tr. Tomorrow it-will-rain). In other languages, for example German, the use of future tense is not obligatory and the present tense may be used (Tomorrow it will rain is rendered as Morgen regnet es, i.e. Tomorrow it-rains), although a future tense might be used in other contexts. In these languages, predictions of future events are typically expressed as present events.Footnote 1 It has been argued that the way languages express predictions may affect the cognitive domain and the way individuals perceive time. Speakers of languages where future tense marking is not obligatory (Thieroff, 2000; Chen, 2013) could perceive the divide between present and future events as blurred. Hence, they could resolve inter-temporal trade-offs differently from speakers of languages where the expression of present and future events requires sharply distinct grammatical markers.

The link between language and human choices has been analyzed in the past by philosophers of language such as Sapir (1921) and Whorf (1956). The so-called Sapir-Whorf hypothesis consists in the idea that a language’s structure may influence its speakers’ cognition and their conceptualization of reality:

“the ’real world’ is to a large extent unconsciously built upon the language habits of the group. [...] We see and hear and otherwise experience very largely as we do because the language habits of our community predispose certain choices of interpretation.”- Sapir (1958:69)

“We dissect nature along lines laid down by our native languages. [...] the world is presented in a kaleidoscopic flux of impressions which has to be organized by our minds-and this means largely by the linguistic systems in our minds.”- Whorf (1940:213)

This hypothesis, also known as the Linguistic Relativity hypothesis, has been at the center of a decades-long debate among linguists, with some authors criticizing its lack of a cohesive formulation and of rigorous proof that language influences human thought (Pinker, 1994). In more recent years, the idea that language may influence human thinking and behavior has gained renewed consideration because of advances in cognitive psychology, and the debate has shifted to the extent to which a language’s structure and its speakers’ worldview are connected (Lakoff, 1987; Lucy, 1992; Gumperz & Levinson, 1996; Lucy, 1997; Boroditsky, 2001; Casasanto, 2015).Footnote 2

In the economic literature, the first contribution to suggest that language future reference may influence inter-temporal choices was Chen (2013). Chen’s hypothesis is that speakers of languages that do not prescribe an obligatory grammatical marking of future reference in prediction contexts (defined as weak FTR) may perceive future rewards as closer and be more willing to undertake future-oriented behaviours, such as saving, compared to speakers of languages that do so (strong FTR). His cross-country evidence supports the hypothesis. However, Chen’s empirical approach has been criticised because of possible confounding factors, such as cultural traits that correlate with language future reference and are responsible for influencing future-oriented behaviors. For example, Dahl (2013) suggests that a within-country approach would be preferable to separate the effects of language and culture.Footnote 3 In addition, the fact that languages are typically clustered in larger families raised some doubts about the validity of the cross-country statistical inference in Chen’s original work. In a more recent paper (Roberts et al., 2015), Chen and co-authors check the robustness of the findings in Chen (2013) when accounting for the geographic and historical relatedness of languages. They find that the cross-country association between language and savings is weaker and that for some specifications the estimates are not statistically significant.

The purpose of our paper is to analyze the effect of language future reference on individual economic behavior, in a setting that allows us to address some of the limitations of Chen’s study. In particular, we focus on one specific future-oriented behavior, i.e. the propensity for self-employment.

The choice to engage in self-employment rather than being a dependent employee is eminently forward looking, as it requires an initial investment, either in specific human capital or in equipment, promotion, advertising and the creation of a customer base, with the purpose of earning a future reward. Time preferences matter in this choice, not only because they affect the present value of the returns to investments (Lazear & Moore, 1984), but also because they influence the present value of non-pecuniary returns, such as independence, and the possibility to fully exploit those personal skills which are conducive to self-employment (Hamilton, 2000; Berglann et al., 2011).

The importance of risk aversion has been much stressed in the literature on entrepreneurship and self-employment since the contribution of Kihlstrom and Laffont (1979), because the returns of the self-employed are typically uncertain. The role of time preferences, by contrast, has been relatively understudied, despite the returns to self-employment being clearly deferred. A number of recent contributions fill this gap. In their theoretical model of endogenous technical change where growth is driven by the innovative activity of entrepreneurs, Doepke and Zilibotti (2014) argue that the occupational choice of entrepreneurship hinges on risk tolerance and patience traits, with the entrepreneurial rate in an economy being determined by parental investments in children’s patience and risk tolerance. Using data from the Global Preference Survey, Falk et al. (2018) show that patience is positively and significantly correlated with the plan to start a business, after conditioning on attitude to risk and other preferences. Shtudiner (2018) reaches the same conclusion with Israeli data. Recently, Dawson and Failla (2022), using large survey data from Britain, conclude that patience increases both the likelihood of becoming an entrepreneur and entrepreneurial earnings, and Bokern et al. (2022) report a higher level of self-assessed patience among the Dutch self-employed. Using the World Values Survey, Lortie et al. (2019) document a positive association at the regional level between entrepreneurial activity and long-term orientation (a construct defined by Minkov and Hofstede (2012)). Finally, Andersen et al. (2014) run an experimental analysis in Denmark and document that entrepreneurs are willing to wait longer for monetary rewards than the general population.

We adopt a broad definition of self-employed which includes all those individuals that risk on their own, self-organize their work schedule, often manage dependent employees and are residual claimants on the revenues from their activity. Hence our definition spans from business leaders to self-employed professionals and small business owners.

We are the first, to our knowledge, to investigate the link between language and self-employment. Language itself is an expression and a part of a people’s culture and largely contributes to shaping a people’s identity in terms of preferences, attitudes and beliefs. Hence, our analysis adds to the broader debate on the relationship between the culture shared within a social group (e.g. religious, ethnic, national) and socio-economic outcomes. A number of papers on the cultural determinants of self-employment and entrepreneurship do exist and we refer to Nunziata and Rocco (2016, 2018) for a review.

Our large scale within-country analysis is focused on Switzerland, a very homogeneous country as regards its socio-economic conditions, where the vast majority of the self-employed are concentrated in high-skilled occupations and self-employment can hardly be a residual option (the so called necessity entrepreneurs). We adopt an epidemiological approach, in the spirit of Fernandez (2011), exploiting individual-level data from the last complete census of the population carried out in 2000 and focusing on long-term immigrants residing in Switzerland for more than five years, i.e. similarly accustomed to Swiss institutions and culture, and whose mother tongue is not a Swiss language.Footnote 4 Switzerland represents an ideal case study because of its historically rooted multilingualism. In Switzerland there are four official languages, three of which can be classified as strong FTR (French, Italian and Romansh) and one as weak FTR (German). Moreover, the country hosts a large share of immigrants (21.8% of its population according to the 2000 Census) which further increases its linguistic variety. Online Appendix A provides a broad discussion on the classification of languages into weak and strong FTR categories.Footnote 5

The first margin of our analysis compares immigrants in Switzerland who preserve their mother tongue as main language of daily use (defined as stayers). This analysis exploits the variation in FTR across languages learnt in early childhood and improves with respect to Chen (2013) because the concern of context confounders is minimised given the homogeneity of the Swiss socio-economic landscape. The second and novel margin of analysis, which is feasible only in a multilingual country such as Switzerland, involves the comparison between immigrants who switch to a Swiss language as main language of daily use (defined as switchers). In this case we compare individuals from the same country of origin, who learn and speak a Swiss weak FTR language with those who learn and speak a Swiss strong FTR language.

This margin identifies the differential effect of learning a weak versus a strong FTR language in the destination country, conditional on the characteristics of the culture of origin and the mother tongue. Hence, our analysis allows us to test whether a language learnt later in life may change the individual’s mindset compared to a language learnt from parents in early childhood. This is an important question which has not been previously answered in the literature. Intuitively, the mother tongue should have a deeper influence on the individual mindset, because it is learned at the same time as individual cognitive abilities start to develop. However, a few studies provide experimental evidence showing that even relatively short exposures to a new language tend to affect the mindset (Casasanto, 2008, 2010) and can produce structural changes in the learner’s brain (Mårtensson et al., 2012).

Along the first margin, our estimates indicate that speakers of weak FTR languages learnt in early childhood are about 4 percentage points more likely to be self-employed compared to speakers of strong FTR languages. This effect is larger than, and statistically different from, the effect estimated along the second margin, i.e. considering languages learnt later in life (equal to about 2 percentage points).Footnote 6 Hence, speakers of a language which is less restrictive in its grammatical markings of future events are significantly more likely to be self-employed.

Our results lend credit to the hypothesis that the absence of an explicit and obligatory marking of future tense in prediction-based contexts could lead speakers to perceive future rewards as less distant and, therefore, provide stronger incentives to invest in an entrepreneurial activity. We extend the analysis through a battery of robustness tests and conclude with a series of heterogeneous estimates by skill groups, age and gender.

The paper is organized as follows: Sect. 2 highlights our contribution to the literature on the economic and behavioral effect of language and on the determinants of self-employment. Section 3 provides a description of Switzerland’s multilingualism and its historical evolution, and presents the main linguistic differences among Swiss native languages. Section 4 describes the data. In Sect. 5, we present the research design and the empirical findings, including a number of robustness checks and further evidence on the heterogeneous effect of FTR across a number of socio-economic and demographic dimensions. Finally, Sect. 6 concludes.

2 Contribution to the literature

This paper contributes to three strands of literature. The first is the literature on the cultural determinants of economic behavior, with a special focus on long-term orientation; the second pertains to the determinants of self-employment and entrepreneurship; and the third is the broader cognitive psychology literature on the relationship between language and human behavior.

As regards the first strand, recent contributions provide extensive evidence of the relationship between culture and education and labour market outcomes, among others. For example, Figlio et al. (2019) use data on second generation immigrant pupils in Florida and show that their within-school performance is related to long-term orientation, one of the six cultural traits codified and measured by Hofstede et al. (2010). Eugster et al. (2017) exploit the exogenous variation in culture (and in particular in attitudes toward work) that is generated by the linguistic border which separates German from Romance languages in Switzerland. They show that speakers of Romance languages tend to remain unemployed longer and that their job search is significantly less intense compared to their German-speaking counterparts across the border. Compared to them, we isolate the effect of one particular cultural characteristic of language, i.e. time orientation embodied by future tense encoding, rather than the effect of a general notion of culture, broadly intended, attached to specific linguistic areas.

Galor et al. (2017) and Galor et al. (2020) argue that crop productivity in pre-industrial times determined the emergence of cultural traits which in turn were reflected in language characteristics. The latter played a pivotal role in making cultural traits persistent and have a permanent direct and independent effect on the decision to acquire college education. Galor et al. (2020) provide a major improvement compared to the cross-country analysis by Chen (2013). They distinguish between languages prescribing either inflectional forms (e.g. French and Italian) or periphrastic structures (e.g. English) to mark future tense.Footnote 7 They adopt an epidemiological approach and focus on second generation immigrants in the US. In their baseline setting, immigrants’ culture of origin is captured by country of origin dummies and the effect of language future reference is mainly identified by the variation, within country of origin, between those who switch to English as their main language of daily use (switchers), and those who maintain their mother tongue (stayers). However, all switchers speak English, a language with periphrastic future marking, while stayers can speak a language with either periphrastic or inflectional future marking. Hence, estimating the effect of future reference involves the comparison between switchers and stayers. The two groups might not be fully comparable if their decision to speak a new language or not is correlated with a differential degree of integration, resulting in different opportunities in the labour market.

The authors address this concern and present additional estimates where immigrants from English or Spanish speaking countries are excluded from the analysis. This robustness check provides evidence that their findings are not driven by the higher integration of individuals who are proficient in the two dominating languages in the US, who may therefore have greater incentives to invest in human capital.

Our approach is akin to Galor et al., but we take a step further by exploiting Switzerland’s multilingual landscape to identify the differential effect of the obligatoriness of future tense marking when the main language of daily use is the mother tongue learnt in childhood versus when it is a language learnt later in life.

This is possible because, on the one hand, we can exploit the variability in language future reference among those immigrants who reside in Switzerland for more than 5 years but keep their mother tongue as main language of daily use (defined as stayers), controlling for their country of origin’s cultural characteristics. On the other hand, we can compare immigrants who substitute their mother tongue as main language of daily use and switch to a Swiss weak FTR language with those who switch to a Swiss strong FTR language (both defined as switchers).

For instance, an immigrant from Spain can keep speaking Spanish (a strong FTR language) as her main language of daily use and can therefore be compared to other immigrants originating from other countries who keep speaking their mother tongue (weak or strong FTR). Alternatively, she can switch to French, Italian or Romansh (a Swiss strong FTR language) or to German (a Swiss weak FTR language) and therefore we can estimate the differential effect of switching to a weak vs a strong FTR language, given the characteristics of the mother tongue.
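The stayer/switcher distinction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name `classify_speaker` and the handling of cases that fall in neither group are assumptions, while the FTR class of each Swiss language is taken from the text.

```python
from typing import Optional

# FTR class of the four Swiss national languages, as stated in the text.
SWISS_LANGUAGE_FTR = {
    "German": "weak",
    "French": "strong",
    "Italian": "strong",
    "Romansh": "strong",
}


def classify_speaker(mother_tongue: str, main_language: str) -> Optional[str]:
    """Hypothetical helper: label an immigrant as 'stayer' or 'switcher'.

    A stayer keeps the mother tongue as main language of daily use;
    a switcher adopts one of the Swiss languages instead.
    """
    if main_language == mother_tongue:
        return "stayer"
    if main_language in SWISS_LANGUAGE_FTR:
        return "switcher"
    # Assumption: any other combination is left unclassified.
    return None


# The Spanish-born immigrant from the example in the text:
print(classify_speaker("Spanish", "Spanish"))  # stayer
print(classify_speaker("Spanish", "German"))   # switcher (to a weak FTR language)
print(SWISS_LANGUAGE_FTR["French"])            # strong
```

Within the switcher group, the comparison of interest is then between those whose adopted language maps to "weak" and those whose adopted language maps to "strong".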

Our paper also contributes to the literature on the determinants of self-employment and entrepreneurship. For a broader review we refer to Simoes et al. (2016). For the determinants of entrepreneurship, and in particular on the role played by innate entrepreneurial abilities, risk-aversion, individual wealth and financial constraints and social networks, we refer to Parker (2005). According to the theoretical literature in economics, the selection into entrepreneurship is affected by innate entrepreneurial abilities (Lucas, 1978) and risk-aversion (Kihlstrom & Laffont, 1979). Other contributions in the psychological literature focus on the role of personality traits, such as the need for achievement, the desire for independence, self-confidence and attitude toward risk (McClelland, 1967; Douglas & Shepherd, 2002; Cuervo, 2005), and internal locus of control (Evans & Leighton, 1989).Footnote 8 Such dimensions are found to be relevant in shaping the propensity for entrepreneurship, and can in turn be affected by cultural factors (Nunziata & Rocco, 2022). In this paper we provide evidence that time orientation, encoded by the main language of daily use, is an important factor in the decision to be self-employed, a result which is entirely novel, to the best of our knowledge.

Finally, our paper provides a contribution to the recent cognitive psychology literature on the relationship between language and human thinking and behavior. Several contributions have debated the extent to which language is associated with its speakers’ worldview (Lakoff, 1987; Lucy, 1992; Gumperz & Levinson, 1996; Lucy, 1997; Boroditsky, 2001). Recently, Casasanto (2008, 2010, 2015) showed how cross-linguistic differences in spatial metaphors for time correspond to differences in non-linguistic mental representations.

In other words, languages can significantly transform people’s minds, and in particular their time orientation. This effect has been reported to be significant even after relatively short exposures to a new language among adults, confirming that human thinking is affected not only by the language learnt from parents during childhood, but also by a language learnt later in life. These findings are consistent with the experimental evidence provided by Mårtensson et al. (2012), who document that learning a foreign language in a relatively short amount of time affects the structure of adult individuals’ brains.Footnote 9 To our knowledge, our contribution provides the first investigation of the behavioral implications of languages learnt at different life stages on observational data.

3 Key features of Swiss culture and languages

While we refer to Online Appendix B for a more extensive overview of Swiss historical multilingualism, in this section we provide a brief discussion of the linguistic landscape in Switzerland. Since part of our estimates rely on the comparison between Swiss native languages’ FTR only, controlling for the mother tongue and other ancestral cultural traits (the analysis on switchers), we analyze the linguistic differences between German, French and Italian in detail by relying on WALS data on languages’ structural features. In addition, since part of our estimates investigate the effect of the obligatoriness of future reference implied by speaking a non-Swiss mother tongue, controlling for country of origin cultural features (the analysis on stayers), we also provide a brief description of the set of languages spoken by long-term immigrants in our sample.

More than two thirds of Swiss citizens speak German, one fifth speaks French, about 5 percent Italian and less than 1 percent Romansh.Footnote 10 The French language is highly reputed and it was considered the educated language also among the German elites in the past.Footnote 11 Moreover, while Swiss French and Italian are resilient to the influence of other languages and do not absorb external vocabulary, structures or pronunciation, the Swiss German absorbs heavily, especially from French. For this reason, the Swiss Germans do not hold a majoritarian outlook vis-a-vis other linguistic communities (Schmid, 1981).

At the canton level, institutions promote language uniformity. It is possible for a citizen to speak only the canton official language without needing to learn another national language. Hence, the proportion of Swiss who spoke more than one language in a native-like fashion was small, about 15 percent overall according to 1990 Swiss census data (Rash, 2002).Footnote 12 The Swiss learn another national language at school, as a foreign language, but eventually a large proportion of Swiss residents do not speak a second language, or at least not well (Pap, 1990).Footnote 13 Typically, the Swiss French learn German, and the Swiss Germans and Italians learn French. Bilingualism partly depends on internal migration, with Germans being more likely to move to the French or Italian areas than vice versa (Pap, 1990). As a result, in the Francophone region over 10 percent of its inhabitants spoke German in 1990, while in the German-speaking areas only 1.9 percent were French speakers (Rash, 2002). Despite this substantial internal migration, the Franco-German boundary is neat and corresponds to a boundary already established by 1100.Footnote 14

3.1 Linguistic differences among Swiss native languages

Using data drawn from the WALS-World Atlas of Language Structures (Dryer & Haspelmath, 2013), that displays the structural properties of the world’s languages, we review the main linguistic differences between weak FTR and strong FTR Swiss languages, i.e., we compare, respectively, German and French, and German and Italian.Footnote 15

Beyond differing in the coding of future tense, we observe that German and French differ in 38 WALS linguistic features. More than half (27) pertain to the areas of phonology, morphology, order of words, simple clauses and complex sentences. Other differences regard the number of genders, the use of cases and other categories such as the coding of numerals and pronouns. None of these characteristics is related to the conceptualization of time. The only WALS verbal category, besides future tense, on which the two languages differ is the distinction between perfective and imperfective aspects.Footnote 16 Unlike French, German lacks the distinction between perfective and imperfective. Similarly, German and Italian differ in 28 WALS linguistic features and only one, the way prohibitions are expressed grammatically, pertains to the verbal domain, but it does not directly relate to the expression of events along the time dimension. As discussed below, we account for all these differences in our estimations.

3.2 Linguistic diversity among immigrants’ languages

The Swiss linguistic landscape is enriched by the large proportion of foreign-born residents. As reported in Table A4 in the Online Appendix, 56 percent of immigrants substitute their mother tongue (i.e. the official language in their country of birth) with one of the Swiss languages as main language of daily use (we label them as switchers). The remaining 44 percent keep speaking their mother tongue while in Switzerland (stayers), for example those who are born in Spain and report Spanish as their main spoken language. German and French are the most used languages among immigrants, followed by Portuguese, Spanish, Turkish, English and Italian. German is the only weak FTR language among Swiss languages, and it is the most spoken in the country. Among immigrants, weak FTR languages are German, Dutch, the Scandinavian languages, Finnish, Chinese and Japanese. In all our estimates we control for these languages’ WALS linguistic features other than future encoding.

4 Data

We use micro-data from the complete Swiss Census collected in 2000 and we focus on long-term first-generation immigrants who are born abroad but have resided in Switzerland for over 5 years and, as a result, are similarly accustomed to Swiss institutions and culture.Footnote 17

Our sample includes those individuals who are employed and aged between 25 (when education is typically completed), and 70 (to limit concerns of differential selective retirement between self-employed and dependent workers), regardless of their citizenship.Footnote 18 In total, we can identify 37 areas of origin from the 2000 Swiss Census.

However, some migrants come from countries where Swiss native languages are either official, as in Germany, France and Italy, or widespread, as in the case of French in Tunisia and Algeria, because of their colonial past. These migrants’ work trajectory might be naturally advantaged by the higher proficiency in one of the Swiss local languages. In addition, it is more likely that they choose to settle in areas where their language is spoken by the majority of Swiss natives (e.g. Austrians locating in German-speaking Zürich). Their inclusion in the analysis might therefore raise concerns for identification due to the potential self-selective geographical sorting according to language. We hence exclude those immigrants who are born in countries where German, French and Italian are either official or widely spoken and end up with immigrants from 28 areas of origin (see Table A4 in the Online Appendix).Footnote 19

Thanks to the information in the Census about the place of residence 5 years earlier, including foreign countries, we are able to distinguish between individuals who migrated before or after 1995.Footnote 20

For our analysis, we only consider foreign-born individuals who spent more than 5 years in the country, and exclude those individuals who are at an early stage of their migration trajectory and therefore did not spend enough time in Switzerland to absorb local culture and customs. These individuals of recent immigration are less likely to start an entrepreneurial activity because of lower chances to create wide and effective networks, to accumulate knowledge about local business opportunities and to adapt their skills to local labour markets.Footnote 21
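The sample restrictions described in this section can be summarised as a single filter. The sketch below is purely illustrative: the `Person` record and its field names are hypothetical and do not correspond to actual Census variable names, the age bounds are assumed to be inclusive, and the set of excluded origins lists only the countries named as examples in the text rather than the full exclusion list.

```python
from dataclasses import dataclass

# Countries mentioned in the text as examples where a Swiss language is
# official or widespread. Illustrative only: not the complete exclusion list.
EXCLUDED_ORIGINS = {"Germany", "France", "Italy", "Austria", "Tunisia", "Algeria"}


@dataclass
class Person:
    # Hypothetical fields standing in for Census information.
    age: int
    employed: bool
    born_abroad: bool
    in_switzerland_5y_ago: bool  # from the place of residence 5 years earlier
    country_of_birth: str


def in_sample(p: Person) -> bool:
    """Return True if the record satisfies the restrictions described above."""
    return (
        p.born_abroad
        and p.in_switzerland_5y_ago            # long-term immigrant
        and p.employed
        and 25 <= p.age <= 70                  # assumption: bounds inclusive
        and p.country_of_birth not in EXCLUDED_ORIGINS
    )


print(in_sample(Person(40, True, True, True, "Spain")))    # True
print(in_sample(Person(40, True, True, False, "Spain")))   # False: recent arrival
print(in_sample(Person(40, True, True, True, "Germany")))  # False: excluded origin
```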

Our sample amounts to about 278,000 immigrants, of which around 12% work on their own. The share of self-employed in high skilled occupations is 13%, i.e. almost double that among low skilled occupations (around 7%).Footnote 22 According to OECD (2014), the share of self-employed with employees in Switzerland is one of the largest among OECD countries, and the share without employees is one of the smallest. This, coupled with the relatively high educational attainment and occupational status among self-employed immigrants in our sample, suggests that self-employment in Switzerland is not a residual option for immigrants who would otherwise remain unemployed.Footnote 23

The 2000 Census reports the “main language spoken at home or at the workplace”, i.e. the respondents’ main language of daily use which can be different from their mother tongue. This language is described by the respondents as the predominant language they speak through the day, and it is therefore the most likely to affect their time orientation. Since we can distinguish between those individuals who declare their mother tongue as the main language of daily use and those who instead declare a different language, we can investigate both the effect of a language that has been inherited from parents and a language that has been acquired later in life.

We define the dummy weak FTR, which is equal to one if an individual speaks a language which does not require the use of future tense for expressing predictions of future events, and zero if the spoken language prescribes an obligatory future marker in those contexts.
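As a minimal sketch of how such a dummy could be coded, the snippet below maps a language name to 1 (weak FTR) or 0 (strong FTR). It covers only languages whose classification is stated in the text; expanding "the Scandinavian languages" to Swedish, Norwegian and Danish is an assumption, and the function name `weak_ftr_dummy` is hypothetical. The complete classification is the one reported in Table 1.

```python
# Languages described as weak FTR in the text. The three Scandinavian
# entries are an assumed expansion of "the Scandinavian languages".
WEAK_FTR_LANGUAGES = {
    "German", "Dutch", "Finnish", "Chinese", "Japanese",
    "Swedish", "Norwegian", "Danish",
}

# Languages described as strong FTR in the text (partial list).
STRONG_FTR_LANGUAGES = {"French", "Italian", "Romansh", "Spanish", "English"}


def weak_ftr_dummy(language: str) -> int:
    """Hypothetical helper: 1 if the language is weak FTR, 0 if strong FTR."""
    if language in WEAK_FTR_LANGUAGES:
        return 1
    if language in STRONG_FTR_LANGUAGES:
        return 0
    # Languages not classified in this sketch are rejected rather than guessed.
    raise KeyError(f"no FTR classification available for {language!r}")


print(weak_ftr_dummy("German"))  # 1
print(weak_ftr_dummy("French"))  # 0
```

Raising an error for unlisted languages, rather than defaulting to 0, avoids silently misclassifying a language that the sketch does not cover.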

In Table 1 we classify our sample of languages as weak and strong FTR according to the definition adopted in Chen (2013). Overall, we are able to identify 15 strong FTR and 6 weak FTR languages.Footnote 24 As reported in Table 1, about 12 percent of immigrants in our sample are self-employed. Across languages, the self-employment rate is quite low among Portuguese and Spanish speakers (4 and 8 percent respectively), while it is relatively high among Arabic, Hebrew, Hungarian, Persian and Dutch speakers (27, 26, 21, 21 and 20 percent respectively), although these groups amount to about 2 percent of our total observations.

Table 2 reports the sample mean of some selected individual characteristics distinguishing between weak and strong FTR speakers. Compared to strong FTR speakers, weak FTR speakers display a higher share of self-employed (15 vs 10 percent). The proportion of individuals with more than upper secondary education (36 vs 21 percent) and the proportion of Swiss citizens (44 vs 24 percent) are larger among the weak FTR group as well. As regards religion, Catholicism is relatively more frequent among strong FTR speakers while Protestantism and the absence of a religious affiliation are relatively more common in the weak FTR group.

Table 1 Future Time Reference (FTR) classification of Languages
Table 2 Summary statistics-by type of FTR

5 Research design and empirical findings

5.1 Language future reference and time orientation

The object of our investigation is the causal chain that originates in the obligatoriness of future marking in prediction contexts prescribed by the main language of daily use, i.e. our weak FTR dummy. The latter may affect the individual mindset concerning the perception of the future, which in turn may influence a future-oriented behaviour like entrepreneurial activity. The first crucial link in the causal chain, i.e. the effect of weak FTR on time orientation, has been recently discussed and supported by a growing literature in cognitive psychology and the social sciences (Lakoff, 1987; Lucy, 1992; Gumperz & Levinson, 1996; Lucy, 1997). Boroditsky (2001) provides experimental evidence supporting the claim that language may shape how individuals think and perceive time. Additional experimental evidence supporting the idea that language shapes mental representations of time is provided by Casasanto (2008, 2010). Tan et al. (2008) show that language modifies the way we perceive reality by acting at the level of the brain cortex. Interestingly, Mårtensson et al. (2012) document that structural changes in the trainees’ brains become apparent in neuro-images after just three months of intensive training in a foreign language.

Sutter et al. (2018) perform an inter-temporal choice experiment in primary school in a bilingual Northern Italian city showing that German-speaking children are more likely than Italian-speaking children to delay gratification, a behaviour that is consistent with the time orientation of their spoken language. Similarly, speakers of weak FTR languages are found to be more likely to support costly forward-looking pro-environmental policies (Mavisakalyan et al., 2018), to be religious, as they may perceive after-life as temporally closer (Mavisakalyan et al., 2022), and to invest in R&D whose rewards are typically reaped in the medium to long run (Chi et al., 2020).Footnote 25

We check whether language FTR is significantly correlated with the cultural trait regarding time perception and inter-temporal preferences at the country level by performing an auxiliary cross-country analysis. As country-wide measures of cultural traits, we consider two sets of data. The first is provided by the experimentally validated Global Preference Survey (GPS henceforth, Falk et al., 2018) that measures preferences at the country level on six dimensions, namely risk-taking, patience, positive reciprocity, negative reciprocity, altruism and trust.Footnote 26 The second is provided by the six cultural dimensions developed by Hofstede et al. (2010), i.e. uncertainty avoidance, long-term orientation, individualism, power distance, masculinity and indulgence.

Table A5 in the Online Appendix presents our cross-country estimates of the correlation between our Weak FTR dummy and each GPS dimension. All specifications include continent fixed effects and several geographic and institutional controls at the country level (latitude, land quality, elevation, temperature, precipitation, distance to waterways, percentage of arable land, genetic diversity, legal origin dummies, Old World dummy). Our findings reveal a positive association between Patience and the probability that a country language is characterized by weak FTR. No significant correlation is found between Weak FTR and any other GPS cultural dimension, including risk aversion (Willingness to take risks) and trust. When all six GPS cultural dimensions are included in the model, we find that a one standard deviation increase in Patience (about 0.4 points) is associated with an increase of roughly 24 percentage points (= 0.4 × 0.61) in the probability that a weak FTR language is spoken in a country.

We obtain similar results when performing the same analysis using the six cultural dimensions developed by Hofstede et al. (2010). In particular, as shown in Table A6 in the Online Appendix, we find a positive correlation between Long-Term Orientation, which proxies how a society promotes societal change and efforts, and the probability that a country language is characterized by weak FTR. No significant correlation is found between weak FTR and any other Hofstede cultural dimension, including Uncertainty Avoidance, which is aimed at capturing how individuals in a country feel uncomfortable with uncertainty.

Our findings are in line with the literature (in particular with Falk et al., 2018, Galor and Özak, 2016, Galor et al., 2017 and Galor et al., 2020), confirming that language FTR is significantly correlated with the cultural trait regarding time perception and inter-temporal preferences, while there is no evidence of association with other cultural characteristics, in particular with those related to risk aversion, which can be a significant determinant of selection into self-employment and future-oriented activities in general.

5.2 Research design

Our contribution focuses on Switzerland, which represents an ideal and unique case study for the investigation of the effect of language on economic behaviour because of its historically rooted multilingualism, the homogeneous economic context and the large share of immigrants, which further increases its linguistic variety. We implement an epidemiological approach (Fernandez, 2011; Galor et al., 2020) where we distinguish the effect of language from that of the culture of origin, controlling simultaneously for the contextual effects related to the place of residence. In our analysis, we first consider the sub-sample of stayers, who keep speaking their mother tongue while living in Switzerland. We then shift the focus to switchers, who conversely adopted one of the Swiss native languages for daily use. This distinction allows us to estimate both the effect of a language inherited from parents and that of a language learnt later in life.

Our sample is constructed by imposing two restrictions on the adult population of immigrants in Switzerland. First, we exclude immigrants who were born in countries where German, French or Italian is either official or widely spoken (e.g. Tunisia, where a significant portion of the population speaks French), and who might therefore gain an advantage from their proficiency in one of the Swiss native languages. This exclusion also limits the concerns about migrants’ self-selective sorting into Swiss areas where their homeland language is spoken by the majority of local natives. Second, we only retain those immigrants who have lived in Switzerland for more than 5 years and who can be plausibly considered as long-term migrants. As a result, we only consider individuals who had time to absorb the local culture of their destination country and adjust to Swiss institutions and customs. In addition, we avoid a comparison between individuals at too different stages of their migration trajectory.
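The two restrictions can be read as a simple filter on individual records. The sketch below is only an illustration and not the authors' code: the field names `birth_country` and `resident_over_5_years`, as well as the contents of `EXCLUDED_BIRTH_COUNTRIES`, are assumptions introduced here, and the set of excluded countries is deliberately incomplete.

```python
from dataclasses import dataclass

# Illustrative and incomplete: countries where German, French or Italian is
# official or widely spoken (Tunisia is the example mentioned in the text).
EXCLUDED_BIRTH_COUNTRIES = {"Germany", "France", "Italy", "Tunisia"}


@dataclass
class Immigrant:
    birth_country: str            # hypothetical field name
    resident_over_5_years: bool   # hypothetical field name


def in_sample(person: Immigrant) -> bool:
    """Return True if the record passes both sample restrictions."""
    language_advantage = person.birth_country in EXCLUDED_BIRTH_COUNTRIES
    return (not language_advantage) and person.resident_over_5_years


# Made-up records, used only to show the filter at work.
people = [
    Immigrant("Tunisia", True),     # dropped: French widely spoken at origin
    Immigrant("Portugal", True),    # kept
    Immigrant("Portugal", False),   # dropped: not a long-term resident
]
sample = [p for p in people if in_sample(p)]
```

Only the second record survives, since it is the only one that satisfies both conditions at once.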

When focusing on the sub-sample of stayers, we specify the following linear probability model:

$$\begin{aligned} {Self-Employed }_{idc} = \alpha +\beta _1\ WeakFTR_{idc}+\varvec{\gamma _1 X_{idc}} + \varvec{\gamma _2 W_{idc}}+\varvec{\gamma _3 Z_{idc}} +\mu _d + \varepsilon _{idc} \end{aligned}$$
(1)

where the dependent variable is Self-Employed, a dummy equal to 1 if individual i living in district d and originating from country c is self-employed. Our regressor of interest is the dummy Weak FTR, which indicates whether individual i speaks a weak FTR language for daily use. The vector \(\varvec{X_{idc}}\) includes a set of religion dummies and demographic controls: gender, age, age squared, household characteristics (marital status, number of children in household), a dummy for Swiss citizenship and a dummy for being a high-skilled immigrant (i.e. with more than secondary school; see Peri and Sparber, 2009). The vector \(\varvec{W_{idc}}\) includes four major language features other than future reference, namely the presence of: (i) markers for past tense; (ii) a gender-based system; (iii) politeness distinctions; (iv) present perfect tense.Footnote 27 The parameter \(\mu _d\) accounts for district of residence (LAU-1 level) fixed effects that control for all contextual unobservable local characteristics affecting self-employment, such as the presence of entrepreneurship clusters, and local economic and institutional settings, including taxation policies and general socio-economic and cultural factors.Footnote 28 Standard errors are clustered at the country of origin, Swiss linguistic area and spoken language level (Galor et al., 2020).Footnote 29

In the case of stayers, our baseline model specification prevents us from including area of origin fixed effects because these would be perfectly collinear with Weak FTR. This is due to the fact that we cannot leverage any within-country-of-origin linguistic variation in our sample.Footnote 30 Therefore, to account for migrants’ national cultural background, we augment the model with the vector \(\varvec{Z_{idc}}\), which includes the complete set of country-level preference measures (risk-taking, patience, positive reciprocity, negative reciprocity, altruism and trust) from the Global Preference Survey (Falk et al., 2018).Footnote 31 We also control for the share of self-employment in the country of origin in 2000 (World Bank data) to capture differential cultural propensities to entrepreneurship.Footnote 32
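As a rough illustration of how a linear probability model with district fixed effects such as equation (1) can be estimated, the sketch below absorbs the fixed effects by demeaning every variable within district and then runs ordinary least squares. It is a minimal sketch under stated assumptions, not the authors' estimation code: the data are simulated, all numerical values are made up, the function names are hypothetical, only two regressors are used, and the clustered standard errors described above are not computed.

```python
import numpy as np


def demean_by_group(values, group_index, n_groups):
    """Subtract each group's mean from its observations (within transformation)."""
    counts = np.bincount(group_index, minlength=n_groups)
    sums = np.bincount(group_index, weights=values, minlength=n_groups)
    return values - (sums / counts)[group_index]


def lpm_with_fixed_effects(y, X, district):
    """OLS of a 0/1 outcome on X with district fixed effects absorbed by demeaning."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    _, group_index = np.unique(np.asarray(district), return_inverse=True)
    n_groups = int(group_index.max()) + 1
    y_dm = demean_by_group(y, group_index, n_groups)
    X_dm = np.column_stack(
        [demean_by_group(X[:, j], group_index, n_groups) for j in range(X.shape[1])]
    )
    coef, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return coef


# Simulated data: every number below is invented for illustration only.
rng = np.random.default_rng(0)
n, n_districts = 50_000, 50
district = rng.integers(0, n_districts, size=n)
weak_ftr = rng.integers(0, 2, size=n)
age = rng.uniform(25, 70, size=n)
mu_d = rng.uniform(-0.03, 0.03, size=n_districts)   # district effects
prob = 0.10 + 0.04 * weak_ftr + 0.001 * (age - 40) + mu_d[district]
self_employed = rng.binomial(1, prob)

coef = lpm_with_fixed_effects(
    self_employed, np.column_stack([weak_ftr, age]), district
)
print(f"estimated coefficient on weak FTR: {coef[0]:.3f}")
```

Demeaning within district is numerically equivalent to including a full set of district dummies, which is why the constant and the district effects do not appear among the estimated coefficients.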

Thanks to Swiss multilingualism, we are also able to analyze the sub-sample of switchers, i.e. those immigrants whose main language of daily use is one of the Swiss native languages. In this case, we exploit the variation in spoken language’s future reference across immigrants from the same country, who share the same cultural background and all unobservable characteristics that pertain to the area of origin. Similarly to Galor et al. (2020), we adopt as baseline specification for the sub-sample of switchers the following model:

$$\begin{aligned} {Self-Employed }_{idc} = \alpha +\beta _1\ WeakFTR_{idc}+\varvec{\gamma _1 X_{idc}} + \varvec{\gamma _2 W_{idc}}+\upsilon _c +\mu _d + \varepsilon _{idc} \end{aligned}$$
(2)

where, unlike in model (1), the vector of country-level controls, \(\varvec{Z_{idc}}\), is replaced by the more demanding set of country of origin fixed effects, \(\upsilon _c\), which accounts for all unobservable cultural and socio-economic factors pertaining to the country of origin and for the relatedness of the mother tongue to any of the Swiss native languages.Footnote 33 However, we also present the results for switchers from the more parsimonious model in equation (1).

When focusing on the sample of switchers, we need to consider the possibility that causation may also run from the decision to be self-employed to the choice of learning and using a particular Swiss language. To gauge whether this is a relevant concern, we compare our baseline estimates with the findings from a sample of young individuals aged below 35, who necessarily migrated before being 30 years old. Given their young age, it is less likely that their choice to become self-employed precedes the choice of which language to speak in Switzerland, because individuals typically start their career as dependent workers and only later switch to self-employment (Piguet, 2010).Footnote 34, Footnote 35

Finally, we pool stayers and switchers together and adopt a flexible econometric specification as follows:

$$\begin{aligned}&{Self-Employed }_{idc} = \alpha +\beta _1\ WeakFTR_{idc}+\beta _2\ Switcher_{idc}+\beta _3\ WeakFTR_{idc} \times Switcher_{idc}+\nonumber \\&\qquad +\varvec{\gamma _1 X_{idc}} + \varvec{\gamma _2 W_{idc}}+\upsilon _c +\mu _d + \varepsilon _{idc} \end{aligned}$$
(3)

where \(Switcher_{idc}\) is a dummy which indicates whether individual i is a switcher and accounts for the fact that switchers and stayers might not be fully comparable. Indeed, switchers are more likely to be integrated in the destination country and to have personality traits, such as openness and courage, which are typically supportive of self-employment and entrepreneurship. The inclusion of the interaction term, \(WeakFTR_{idc} \times Switcher_{idc}\), allows us to test whether the impact of language FTR differs between the sub-samples of switchers and stayers. This amounts to unveiling any difference between the estimated effect among speakers of a language learnt later in life, as for switchers, and among speakers of a language inherited from parents, as for stayers.
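To make the role of the interaction term explicit, the change in the predicted probability of self-employment associated with speaking a weak FTR language, as implied by equation (3) with all other regressors held fixed, can be written separately for the two groups:

$$\begin{aligned} \text {stayers}\ (Switcher_{idc}=0):&\quad \beta _1 \\ \text {switchers}\ (Switcher_{idc}=1):&\quad \beta _1+\beta _3 \end{aligned}$$

A negative \(\beta _3\) therefore corresponds to a smaller association among switchers than among stayers, and a test of \(\beta _3=0\) is a test of equality between the two groups.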

5.3 Baseline empirical findings

Table 3 presents the results from our baseline estimates. In column 1, we focus on the sub-sample of stayers and show that, after controlling for individual characteristics, district of residence fixed effects, WALS linguistic features, GPS cultural dimensions (including risk-aversion) and the self-employment rate in the country of origin, speaking a Weak FTR language is associated with a statistically significant increase in the probability of being self-employed equal to 4.1 percentage points, which corresponds to approximately 40 percent of the self-employment rate in this group of immigrants (equal to around 10 percent of total employment).

Both columns 2 and 3 consider the sub-sample of switchers but differ in that we adopt a parsimonious specification with the full set of GPS dimensions and birth-country self-employment rate as controls for the former model, whereas we include country of origin fixed effects in the latter.Footnote 36 Both specifications yield positive and statistically significant point estimates that amount to about 2 percentage points and correspond to slightly more than 15 percent of the sample mean (equal to around 13 percent of total employment). The similar point estimates across the two specifications suggest that the parsimonious model is successful in capturing a relevant share of the heterogeneity across countries of origin.
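The relative magnitudes quoted above follow from dividing each point estimate by the corresponding self-employment rate. The short sketch below reproduces that arithmetic with the rounded figures given in the text; the function name `relative_effect` is introduced here for illustration only.

```python
def relative_effect(effect_pp: float, base_rate_pct: float) -> float:
    """Express an effect in percentage points as a percent of the base rate."""
    return 100.0 * effect_pp / base_rate_pct


# Rounded figures quoted in the text, not the exact table entries.
stayers = relative_effect(4.1, 10.0)     # 4.1 points on a base of about 10 percent
switchers = relative_effect(2.0, 13.0)   # about 2 points on a base of about 13 percent
print(f"stayers: {stayers:.1f}%, switchers: {switchers:.1f}%")
```

With these rounded inputs the ratios come out at about 41 percent for stayers and slightly above 15 percent for switchers, in line with the figures reported in the text.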

Finally, we estimate the flexible model in equation (3) for the pooled sample of stayers and switchers. In column 4, we adopt the parsimonious specification, while we include the full set of country of origin fixed effects in column 5. In both cases, the coefficient for the Weak FTR dummy is positive and significant, while we estimate negative and significant coefficients for the interaction term with the Switcher dummy. This indicates that the effect of FTR is significantly larger among migrants who speak their mother tongue (as for the stayers) than among the ones who speak a language that they have plausibly learned later in life (as for the switchers). In other words, our set of estimates suggests that the future marking of the main language of daily use may affect the individual’s conceptualization of time even if the language is acquired later in life, although the effect seems to be much larger when the language is inherited from parents at the time when individual cognitive abilities start to develop, as one would expect. Not surprisingly, the point estimate for the Switcher dummy is always positive and significant, confirming that the openness and social integration that come with speaking a Swiss native language are associated with an increase in the probability of being self-employed.

Table 3 Baseline estimates

5.4 Robustness checks

We perform a series of tests with the purpose of assessing the robustness of our baseline results. We first estimate our baseline models replacing the Global Preference Survey controls with the alternative set of country-specific cultural dimensions provided by Hofstede et al. (2010), i.e. uncertainty avoidance, long-term orientation, individualism, power distance, masculinity and indulgence. Our estimates, reported in Table A7 in the Online Appendix, are comparable to the findings in Table 3. We also check whether our findings are driven by immigrants from certain countries who display an unusually low percentage of self-employed, such as Portugal and Spain. When we estimate our baseline models excluding such outlier countries and languages, i.e. after excluding Spain and Portugal, or Spanish and Portuguese, all point estimates remain positive and highly significant.

As a further robustness check, we restrict the focus to multilingual cantons where the presence of more than one native linguistic group is rooted in history and each language is recognized as official. This particular feature gives the same status to each of the official languages, and therefore immigrants may switch to any of the official languages without incurring any particular socio-economic advantage or penalty. We define as multilingual those cantons where the districts do not display a homogeneous linguistic majority.Footnote 37 As displayed in Fig. 1, according to this definition, only 4 out of 26 cantons are multilingual: Bern, Fribourg and Valais have districts where the majority language is either German or French, while Graubünden includes districts with German, Italian and Romansh majorities. The rest of the cantons are linguistically earmarked, and most of them are predominantly German. In Bern, Fribourg and Valais, the German and French areas are clearly separated. In Graubünden the distribution of languages is more irregular and each municipality adopts its own official language (Grin & Korth, 2005). For consistency, we have therefore preferred to keep Graubünden out of the analysis on multilingual cantons. Its inclusion, however, does not alter in any way the significance and direction of our findings (estimates are available upon request).
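The definition of a multilingual canton used here can be stated as a one-line rule: a canton is multilingual when its districts do not all share the same majority language. The sketch below is a hypothetical illustration of that rule; the district labels are placeholders, the third canton is invented, and the majority languages for Bern and Graubünden simply follow the description in the text.

```python
from typing import Dict


def is_multilingual(district_majorities: Dict[str, str]) -> bool:
    """True if the canton's districts do not share a single majority language."""
    return len(set(district_majorities.values())) > 1


# Placeholder district labels; not actual district names or census data.
cantons = {
    "Bern": {"district_a": "German", "district_b": "French"},
    "Graubünden": {
        "district_a": "German",
        "district_b": "Italian",
        "district_c": "Romansh",
    },
    "hypothetical_monolingual_canton": {
        "district_a": "German",
        "district_b": "German",
    },
}

multilingual = sorted(name for name, d in cantons.items() if is_multilingual(d))
```

Under this rule the two cantons with mixed district majorities are flagged, while the invented canton with a single majority language is not.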

In Table 4, we estimate the model over progressively more homogeneous areas. We start with the whole of Switzerland by showing the baseline result from column 5 of Table 3 as reference. We then consider, in column 2, the three Franco-German bilingual cantons of Bern, Fribourg and Valais. In this case, however, the sample size is reduced to slightly more than 40,000 observations. Finally, in column 3, we only consider the districts in bilingual cantons lying on the Rösti line, i.e. the linguistic border that separates German and non-German-speaking areas, with the sample size further reduced to about 15,000 observations.Footnote 38 The result in column 2 indicates that weak FTR speakers are significantly more likely to be self-employed in multilingual cantons as well. The point estimate is larger than the baseline, albeit not significantly different. The interaction term yields a negative and significant coefficient as in the baseline. Again, this indicates that the effect is larger among immigrants who keep speaking their mother tongue as opposed to the ones who speak a language learned later in life. When we consider districts along the linguistic border in column 3, we still find a positive association between language FTR and self-employment, with broadly similar point estimates, although this is only significant at the 10 percent level. The estimated coefficient for the interaction term is again negative and comparable to the previous estimates, but not significant. Most plausibly, the remarkable drop in the number of observations, combined with the demanding econometric specification, contributes to the less precise estimates.

Table 5 displays a second battery of robustness tests. In column 1, we exclude low-skilled occupations to address the concern that self-employment may be the last resort for poorly integrated immigrant workers. Following the ISCO classification, we remove categories 5 (service workers and shop and market sales workers) and 9 (elementary occupations). For the estimate in column 2, we remove immigrants who are employed in agriculture, which is a sector where self-employment is traditionally over-represented. In columns 3 and 4 we estimate the heterogeneous effect by education. We consider immigrants with secondary education or less in column 3, while we focus on those with tertiary education in column 4. Point estimates from these tests are all positive and significant, and comparable in magnitude to our baseline results. The only exception is represented by the estimate of the interaction term among high-skilled migrants (column 4), which is still negative but not significant. This indicates that the distance between stayers and switchers is reduced for the high skilled, suggesting that the effect of learning a new language is comparable to that of the mother tongue in this case. A possible interpretation of this finding pertains to the learning skills of immigrants with tertiary education, who, as a result, deeply internalize the time orientation implied by the new language.

In Table 6, we perform separate analyses by gender and age. We find that the point estimates are marginally larger among men and among individuals aged under 35. The heterogeneity by age is important because it represents an indirect test of whether our choice of considering only the long-term immigrants residing for over 5 years in Switzerland is comparable to analyzing 1.5 generation immigrants (i.e. those who migrated before age 18), as is common in an epidemiological setting. Typically, young immigrants tend to be culturally homogeneous, as they are exposed to the culture of the destination country from a young age. However, since we do not observe the immigrants’ precise year of arrival in Switzerland, our sample encompasses the sample of 1.5 generation immigrants. Define p(a) as the number of individuals aged a in our sample and q(a) as the proportion of 1.5 generation immigrants among individuals of age \(a\in [25,70]\) in our sample. Such proportion is 1 among first generation immigrants aged 23 and residing for over 5 years in Switzerland (all of them entered Switzerland before age 18). When we consider older cohorts, the share of 1.5 generation immigrants in our sample q(a) necessarily declines with age. For example, among those aged 24, a small portion might have entered at age 19. Among those aged 25 some might have entered between age 19 and 20, so that q(a) further declines, and so on. The proportion of 1.5 generation immigrants in the sub-sample of individuals younger than a certain age A is equal to:

$$\begin{aligned} P(1.5)_{25,A}=\frac{\sum _{a=25}^{a=A}q(a)p(a)}{\sum _{a=25}^{a=A}p(a)} \end{aligned}$$
(4)

which corresponds to a weighted average of the decreasing function of age q(a). Hence, in the sample aged 25-35 we will have a larger proportion of 1.5 generation immigrants than in the total sample (that includes individuals between 25 and 70 years of age) and in the sample of individuals over 35.
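The weighted average in equation (4) can be illustrated numerically. The sketch below uses entirely made-up inputs, namely equal cohort sizes p(a) and a share q(a) that falls linearly with age, so it shows only the mechanics of the formula and says nothing about the actual composition of the sample; the function name is hypothetical.

```python
def share_1_5_generation(p, q, upper_age, lower_age=25):
    """Equation (4): average of q(a) weighted by p(a) over ages lower_age..upper_age."""
    ages = range(lower_age, upper_age + 1)
    total = sum(p[a] for a in ages)
    return sum(q[a] * p[a] for a in ages) / total


# Made-up inputs for illustration: equal cohort sizes and a linearly declining q(a).
p = {a: 100 for a in range(25, 71)}
q = {a: max(0.0, 1.0 - 0.05 * (a - 23)) for a in range(25, 71)}

young = share_1_5_generation(p, q, upper_age=35)      # sample aged 25-35
everyone = share_1_5_generation(p, q, upper_age=70)   # sample aged 25-70
print(f"aged 25-35: {young:.2f}, aged 25-70: {everyone:.2f}")
```

Because q(a) is decreasing in age, truncating the sum at a lower upper age necessarily yields a larger weighted average, which is the property the argument in the text relies on.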

However, we find that the estimates for individuals aged 25-35 are only marginally larger than (and not statistically different from) the baseline and the estimates for the sample aged 35+. This suggests that focusing on the sample of long-term immigrants residing for over 5 years in Switzerland yields results that are plausibly in line with those that could be obtained on the subsample of 1.5 generation immigrants.

The robustness of our findings to the analysis on the sample of migrants aged below 35 suggests that the concern that switchers may choose to speak a Swiss language after becoming self-employed is unlikely to affect our results. Indeed, given the respondents’ young age at migration, necessarily below 30 years of age, it is less likely that the young switchers’ choice to become self-employed precedes the choice of which language to speak in Switzerland as typically individuals start their career as dependent workers and then only later they switch to self-employment (Piguet, 2010). This concern is also contradicted by our findings on the sample of stayers, who speak their mother tongue learned during childhood.

We further exploit Swiss multilingualism by performing the baseline analysis on natives only, i.e. on residents who were born in Switzerland, are Swiss citizens and speak one of the four Swiss native languages. In this case, the effect of speaking a weak-FTR language is identified through the comparison of German (weak-FTR) versus French, Italian and Romansh (strong-FTR). Our sample of natives is much larger, including about 2,200,000 observations. In addition to the usual controls, here we include district of birth fixed effects instead of country of origin fixed effects, on top of district of residence fixed effects, exploiting the substantial internal migration among natives and especially among German speakers who tend to move to French and Italian areas. As noted above, in this case we cannot control for other language features since the latter do not vary across native Swiss languages. Our findings, reported in Table A8 in the Online Appendix, are very much in line with our previous results, as the increase in the probability of being self-employed is estimated between 2.2 (when considering the whole of Switzerland) and 3.5 percentage points (in multilingual cantons).

As a final check, we alter the classification of languages following the approach of Galor et al. (2020) and Mavisakalyan et al. (2018, 2022), i.e. we group languages on the basis of the existence of an inflectional versus a periphrastic future tense, rather than on the actual use of a future tense in prediction contexts. According to these contributions, speakers of non-inflectional languages should be long-term oriented, as are the speakers of weak-FTR languages. We therefore re-estimate our models by defining a dummy which is equal to 1 if the language has a non-inflectional future and zero otherwise.Footnote 39 Our findings, reported in Table A3 in the Online Appendix, are similar to our baseline estimates.

Fig. 1 The geographical distribution of Swiss linguistic majorities (Census 2000)

Fig. 2 Multilingual cantons and the linguistic border (Census 2000)

Table 4 Estimates on multilingual cantons and districts at the linguistic border
Table 5 Estimates with low-skilled occ. and agriculture excluded, and heterogeneous by education
Table 6 Heterogeneous effects by age and gender

6 Conclusions

This paper contributes to the recent and growing literature on the relationship between language and economic behaviour, by providing the first comprehensive empirical analysis of the link between the way languages encode future events and one of the most important forward-looking economic choices, the decision to be self-employed. Our empirical strategy is based on an epidemiological approach which benefits from detailed Swiss census data. Switzerland is an ideal laboratory for such analysis, because it is characterized by a long-standing multilingualism and a large immigrant population living in a relatively small geographic area which is homogeneous as regards institutions, socio-economic conditions and broad cultural features.

We test the hypothesis that speakers of weak FTR languages, who are not required to use a future tense in prediction-based contexts, may have a closer perception of future rewards and be more willing to undertake future-oriented behaviours, such as being self-employed.

Our data from a multilingual country allow us to investigate the differential effect of the obligatoriness of future tense marking when the main language of daily use is the mother tongue learnt in childhood versus when it is a language learnt later in life. This is possible because part of the long-term immigrants who reside in Switzerland keep their mother tongue as their main language of daily use (defined as stayers), whereas part switch to a Swiss language, either weak FTR or strong FTR (both defined as switchers). In the first case, we estimate the effect of speaking a weak FTR language controlling for the country of origin’s cultural characteristics. In the second case, we exploit the variation in the spoken language’s FTR status within the immigrant population originating from the same country of departure. In all our estimates we therefore distinguish the effect of language from that of cultural origins (including risk aversion), controlling simultaneously for the contextual effects related to the place of residence, as well as for language features other than FTR.

Since we observe multiple languages with different FTR within each status, our identification does not hinge on the comparison of switchers with stayers. This is a key advantage in our setting since switchers and stayers might not be fully comparable as they may differ along a series of unobservable characteristics possibly correlated with the propensity to be self-employed, such as social integration and personality traits supportive of entrepreneurship.

In order to exclude those individuals who might be advantaged because of their proficiency in one of the local languages, our sample does not include immigrants who were born in countries where one of the Swiss native languages is either official or widely spoken. Moreover, we further limit the analysis to those immigrants who have lived in Switzerland for more than 5 years, i.e. including only those long-term immigrants who spent enough time in Switzerland to absorb local culture and customs. As a result, individuals who are at an early stage of their migration trajectory and who therefore have limited opportunities of integrating into local labour markets are excluded.

Our baseline estimates reveal that speakers of weak FTR languages are between 2 and 4.1 percentage points more likely to be self-employed compared to speakers of strong FTR languages, lending credit to the hypothesis that the absence of a clear and obligatory marking of future tense in prediction-based contexts could lead speakers to perceive future rewards as less distant and therefore provide stronger incentives to invest in an activity whose returns are typically postponed.

The effect of weak FTR is positive and significant for both stayers and switchers. However, we find that the effect is larger for the stayers, i.e. among immigrants who speak their mother tongue, than for the switchers, i.e. the ones who speak a language that is not the one they grew up with. In other words, our set of estimates suggests that the future marking of the main language of daily use may affect the individual’s conceptualization of time even if the language is acquired later in life. However, the effect seems to be considerably larger (i.e. 4.1 versus 2 percentage points) when the language is learnt during childhood, as one would expect. This finding is consistent with previous experimental evidence that suggests that language may shape mental representations of time even when adult individuals are exposed to a language that is not their mother tongue (Casasanto, 2008, 2010, 2015).

Our findings are remarkably stable across a battery of robustness tests, including restricting the analysis to multilingual cantons and districts on the linguistic border, where more than one of the Swiss native languages is official and can be spoken without incurring any particular socio-economic advantage or penalty.

Our results are also robust to the exclusion of workers in low-skilled occupations, who might choose self-employment as a last resort before unemployment, and of workers in agriculture, where self-employment is typically over-represented. We also find that the effect of FTR is stable across education groups, with the distance in the effect between stayers and switchers being reduced for the high skilled. Furthermore, we include an indirect test that suggests that focusing on the sample of long-term immigrants, residing for over 5 years in Switzerland, yields results that are plausibly in line with those that could be obtained on the sub-sample of 1.5 generation immigrants.

Finally, our findings are confirmed when we further exploit Swiss multilingualism, by performing the same analysis on natives only, i.e. on residents who were born in Switzerland, have Swiss citizenship and speak one of the four Swiss native languages.