Introduction

The identification of the movement of people and animals using strontium isotope analysis (87Sr/86Sr) has proven remarkably useful. However, in the context of geologically homogeneous regions, such as the Arabian Gulf, the use of this analysis for the identification of “local” and “non-local” individuals has proven challenging (Gregoricka 2011, 2013; Kutterer and Uerpmann 2017). While most archaeological biogeochemical applications of strontium isotope analysis rely on either the creation of isoscape models and/or representative “baseline” sampling of local bioavailable strontium, there are serious challenges to applying these approaches in places like Bahrain. Located in the Arabian Gulf, Bahrain has experienced extensive environmental change, has limited archaeological non-domesticated faunal samples, and is known to have engaged in trade involving domesticates (Bangsgaard 2001, 2003; Lorimer 1908, Uerpmann and Uerpmann 1999, 2008).

Major settlement on Bahrain dates from the early Dilmun period (2200–1750 BCE) and consists of burials, settlements, and temples (Crawford 1998; Crawford and Moon 2017; Laursen 2017). Naturally occurring springs facilitated agriculture, but the islands’ economy was also part of an extensive trading network in agricultural (including animals) and luxury products with the surrounding regions (Mesopotamia, Indus, Iran, Arabian Peninsula) (Dobney and Jaques 1994; Insoll 2005; Larsen 1983; Laursen 2017; Lombard 2016; Magee 2014; Nesbitt 1993; Uerpmann and Uerpmann 1994; 1999). The continued importance of trade throughout the Dilmun settlement in Bahrain is attested by written sources (Olijdam 1997; 2016). Meanwhile, the greatest archaeological evidence of the importance of trade to the local economy during the Dilmun era is material culture, including imported grave goods (Højlund et al. 2008; Killick 2003; Lombard 2000; 2016). Despite periods of economic unrest during the late Tylos to Middle Islamic periods, evidence of trade (again based on material goods) is continuous (Kervran 1999; Lombard 2000; 2016). While trade and economic relationships are a major focus of research in the Gulf (e.g., Carter 2003; Magee 2014; Olijdam 2018; Potts 1993; Swan 2018), a concentration on objects of movement rather than on the living beings implicated in the movement has meant that aspects of scale (such as the proportion of human and domesticate populations involved, Gregoricka 2021) are understudied compared to the identification of long-distance linkages (Seland 2014).

Strontium isotope analysis of archaeological samples from Bahrain offers an opportunity to evaluate the movement of human and animals in a regional trade center. However, in a geologically homogenous area like Bahrain, it can be difficult to determine local versus non-local using this kind of analysis. This challenge is compounded in the light of significant modern environmental change, particularly the loss of native and agricultural vegetation, and drying of the springs (Almansoori et al. 2015; Larsen 1983; Tengberg and Lombard 2001; Zubari et al. 1994; Zubari 2005). Further, limited animal bone samples, frequently used to characterize “local” strontium values, come from a range of site contexts and may result from trade in live animals. In this paper, we argue that using diverse statistical methods applied to strontium isotope ratios of humans and animals allows us to explore aspects of scale with degrees of certainty. This approach provides a basis for exploring what types of movement might have occurred.

Strontium isotope analysis

The principle underlying the use of strontium isotopes for identifying movement is that the strontium ratio (87Sr/86Sr) of an individual’s local geology is reflected in their tissues over the time those tissues formed. Strontium within a local landscape comes from the weathering of bedrock into soils and groundwater, as well as potentially non-local sources such as surface water, precipitation, windblown particulates, sea spray, and fertilizers (Bentley 2006; Capo et al. 1998; Holt et al. 2021; Thomsen and Andreasen 2019). These sources are incorporated into plants and animals in a given landscape, and the strontium isotope ratios of these organisms (bioavailable strontium) reflect a mixture of these sources. Food sources containing more strontium and calcium contribute more to the measured strontium values of an individual (typically, plants have higher contents than meat; Bentley 2006). Bioavailable strontium in a landscape can differ from geological values due to local factors such as different rates of substrate erosion, particulates, and atmospheric contributions (Burton and Hahn 2016). Therefore, in archaeological analyses, characterizing the bioavailable strontium, as opposed to geological ratios, in a landscape is considered more useful for identifying individuals who do not reflect a “local” ratio. Yet, there are places, like Bahrain, where characterizing local bioavailable strontium is difficult.

Bahrain and geological sources of strontium

Bahrain is an archipelago comprised of 13 islands in the shallow waters of the Arabian Gulf. The largest island, Bahrain Island (563 km2), sits on an elevated portion of the Eocene Rus and Dammam formations surrounded by more recent Neogene (23.03–2.58 mya), Quaternary (2.58–0 mya), and Holocene (11,700 kya-present) deposits (Doornkamp et al. 1980: 1–2). There are two primary sources of “local” strontium in Bahrain: spring water from aquifers and the strontium in soils that is derived from the erosion of geological formations including the Eocene carbonate of Jabal al-Dukhan (Ar. Mountain of Smoke) and the more recent surrounding sands and playas.

The main source of drinking water for humans and animals was freshwater springs sourced from the Eocene Rus and Dammam aquifer systems formed by dolomite, shale, and limestone formations (Doornkamp et al. 1980: 329). In the north of Bahrain, the springs feed irrigation canals, gardens, and natural water courses supporting date palm gardens (bustān). Evidence of wells (e.g., at Barbar temple) plus written accounts attest to their use from the Early Dilmun period (Larsen 1983). This groundwater contains high levels of salts including ions which are commonly substituted for strontium (1983: 29). There are no available 87Sr/86Sr ratios for groundwater from Bahrain. There are very few natural springs remaining and they are contaminated by seawater and brackish water infiltration given the loss of hydrostatic pressure due to modern overuse (Larsen 1983; Zubari et al 1994). Wells sampled from the Rus and Dammam aquifers in the Rub al Khali of Saudi Arabia have 87Sr/86Sr ratios ranging from 0.70771 to 0.70874 (Sultan et al. 2008: 77). Additional wells from Saudi Arabia sampled by Saeed et al. (2021) show remarkable consistency in the Dammam aquifer samples ( = 0.70789, n = 2) with no modern seawater intrusion. Seawater intrusion in the Bahrain springs before modern exploitation is unlikely given that there were relatively minor decreases in hydrostatic pressure over time (Larsen 1983). Comparing the Eocene aquifer samples to samples from an older aquifer system (0.70770–0.70785) and a more recent Neogene aquifer system (0.70819–0.70822) provides evidence that regionally, 87Sr/86Sr ratios of more recent geology are elevated in comparison to older sources (Saeed et al. 2021: 9). The ubiquity of the Dammam aquifer samples means that we might expect moderately uniform 87Sr/86Sr ratios in spring water in Bahrain in the past.

Strontium variation in soils is often dependent on the underlying geology, but this is a problem for Bahrain, where there are no published values for the various soil substrates. Instead, we need to use regional values from better-sampled areas of the Gulf. In terms of geological values, the Dammam formation was sampled in Kuwait with a 87Sr/86Sr ratio of 0.70778 (Amer and Al-Hajeri 2019), which is consistent with the aquifer values for this formation.

While these values are useful in terms of a regional picture, their relevance for Bahrain is unclear. In Bahrain, the more recent deposits are a mix of Quaternary and Holocene gravel, sands, and silts along with Sabkha deposits (Willis 1963: 2). These deposits, which underly most of the habitable portions of the island, along with thin layers of sandy loam soil, are from disparate sources. Nevertheless, these soils are important for understanding the local strontium range of Bahrain given that these soils were used for gardening over thousands of years. Gardens were crucial to subsistence in Bahrain, and they relied on extensive modification of the soils through manuring and irrigation. Today, these soils remain raised over 1 m above the natural geological deposits (Doornkamp et al. 1980). Plants from these soils likely represented a significant source of bioavailable strontium for people and animals in Bahrain. Bioavailable strontium values for Bahrain until the present study are limited to faunal samples from Gregoricka 2013: 455 (n = 20) and range from 0.708107 to 0.708518. While these values paint a general picture of what bioavailable strontium signal may be in Bahrain, one particular concern is that the animals represented in the faunal samples may not be local given the diversity of contexts from which they are sources (burials, temple, settlements).

The existing sampling of strontium in Bahrain is not sufficient for the creation of an isoscape. The consistency in the regional groundwater and geological values suggests that the strontium values of peoples and animals from Bahrain will show limited variation. That lack of local heterogeneity is useful in that “local” values should demonstrate a narrow range of variation corresponding to a normal distribution given sufficient samples. Some movement, particularly between Bahrain and the neighboring coast of Saudi Arabia, may be undetectable, but this is an instance where statistical approaches may be particularly appropriate, especially if applied to humans and animals rather than assuming one represents a “baseline.” A comparative approach, we argue, allows for the scale and scope of movement to be parameterized.

Methods of identifying local

A range of methods have been used in archaeological analyses to characterize local strontium isotopic variation. These can be largely grouped into those relying on the creation of local isoscapes based on proxies (local geology, modern environmental or archaeological proxies) and those derived from statistical analyses.

In the context of Bahrain, many of the potential proxies are not feasible. It is impossible to determine geological strontium values for Bahrain based on modern data. Bioavailable strontium values are hard to obtain given the limited areas of native vegetation and a lack of deeper-rooted plants (Hartman and Richards 2014), extensive gardening with widespread use of modern fertilizers (see Frei 2021 and Thomsen and Andreasen 2019), and the change from naturally occurring springs to desalinated water and pumped groundwater (tapping older aquifer sources, Holt et al. 2021).

Currently, archaeological proxies from Bahrain are not available given the limited number of settlement excavations and the rarity in subsequent assemblages of wild taxa and small mammals with limited range (e.g., Price et al. 2002; Bentley 2006). Relying upon herbivore values as a priori “local” is particularly problematic given the evidence of animal trade (e.g., Madgwick et al. 2019).

Statistical methods rely upon analysis of the distribution of strontium results to establish the “local” range. Most commonly, the range has been established using a criterion of two standard deviations from the sample mean. The sample mean is variably calculated, from the entire sample (e.g., Grupe et al. 1997), from fauna only (e.g., Gregoricka 2013), or from particular fauna (e.g., Price et al. 2002). The assumption when using faunal remains to create a “local” range is that the sources of variation for these samples will be low and should reflect the average of local values (Szostek et al. 2015). The reliance upon two standard deviations is predicated upon the assumption that human and animal samples are sufficiently representative to reliably reflect variation, that there are more locals than non-locals, and that the data are approximately normally distributed. If local animal trade in the species sampled is substantial, then the proportion of extra-locals will be undercounted (Grimstead et al. 2017). The selection of a cut-off of two standard deviations automatically assumes that 95% of values will be within the “local” range, thereby constraining any consideration of the scale of movement. Some have used a one standard deviation range as a less conservative threshold, but that may result in the problem of overestimating non-locals.

An alternate method (e.g., Wright 2005; Tung and Knudson 2011) circumvents the initial assumption of a normal distribution. Removing extreme values from the data set (in Wright’s case defined by comparison to regional geology), the “trimmed” data are then analyzed for concordance to a normal distribution. Then, a second group of outlying samples are identified as possible outliers based on their position on the tails of the distribution. The likelihood of these classifications of local versus non-local is then assessed in comparison to other archaeological data. The method assumes that a “local” population will be normally distributed when most people are born locally and when the diet comes from the same sources. There are circumstances where non-locals may outnumber locals, but these should be suspected from contextual data as well as deviation from a normal distribution (e.g., Harappa, Kenoyer et al. 2013; Valentine et al. 2015). The method does rely upon a large sample size but is less arbitrary in its identification of outliers.

An additional method suggested by Tung and Knudson (2011) is that of division rows. The data are grouped sequentially according to their strontium ratios with differences calculated between adjacent values. Breaks in the sequence are identified where the difference is significantly higher (in the example used by the authors, the difference is an order of magnitude higher, and it must also exceed measurement error). The authors emphasize that this method is useful to determine outliers if the sample size is large. It has the advantage of allowing (as does Wright’s method) a more detailed evaluation of the entire corpus of results but is not predicated on the assumption of a normal distribution making a comparison using other techniques worthwhile.

The potential for differential offset between teeth of the same individual is also useful to the extent that intraindividual variation may reflect variation in regional geology (Hrnčíř and Laffoon 2019), distinguishing individuals who moved during their lives. Assuming local values will predominate in any sample, differential offsets within an individual provide further information on the extent of variation, particularly when it can be demonstrated that major dietary differences within a single individual are unlikely.

Each of these methods is predicated upon a differing set of requirements and assumptions which affects their value in a geologically homogenous area. In the context of Bahrain, only statistical approaches are currently feasible, but we argue that a combination of these methods allows us to parameterize the potential scale of movement of human and animal populations while recognizing that given the homogeneity of much of the region, some movement will be indistinguishable.

Materials and methods

Archaeological enamel samples were obtained from humans and animals from sites across Bahrain (Table 1, Fig. 2). The samples analyzed in this paper span from the Early Dilmun (ca. 2200 BCE) to the Middle Islamic period (ca. 1250 CE) and come from a range of contexts. We combine our dataset with the previously published faunal strontium isotope data from Gregoricka (2011). Individual data including tooth class are available as Supplementary Data. With the exception of one cow incisor, all animal teeth were molars. Human teeth were preferentially molars but two lower canines and three premolars were sampled.

Table 1 Samples of human and animal enamel used in this analysis, including date and site. More than one tooth was analyzed from a limited number of individuals

The sites of Hamad Town, Janabiya, and Saar are comprised of Early Dilmun (2200–1750 BCE) burial mounds (Højlund et al. 2008; Laursen 2017). While the mound fields were reused, the individuals analyzed in this work can be safely attributed to the early Dilmun period by grave goods and burial type. These goods included items of Mesopotamian or Indus origin, suggesting trade links, and potentially the movement of people and goods (Potts 1993; Olijdam and David-Cuny 2018). While there has been a division drawn between elite and non-elite tombs on the island (Laursen 2017), all of the tombs sampled here are classed as non-elite. The faunal samples collected by Gregoricka (2011) also date to the early Dilmun period but originate from the Barbar temple site, which was in use and rebuilt three times over a millennium. The faunal bone from the Early Dilmun contexts at Barbar is argued to be the result of temple activities requiring the highest quality of animals and the age distribution of the animals is not consistent with a breeding population (Bangsgaard 2003).

The Al Maqsha samples are from individuals buried in Middle Dilmun (1750–1000 BCE) subterranean rock-cut tombs. The small sample size from this site is due to poor preservation. While trade is attested in this period, it has been suggested by some authors that long-distance networks may have contracted and then expanded, possibly with the establishment of a Kassite foreign enclave at Qal’et al.-Bahrain (Crawford 1996; Lombard 2000, 2016). Written records confirm major trade in agricultural produce, including dates and grains, during this period (Lombard 2016; Olijdam 1997).

The individuals from A’ali analyzed in this manuscript are from a Late Dilmun chamber with collective burials (MNI = 129) along with Late Dilmun pottery, firmly setting these burials to the years between 1000 and 300 BCE (Littleton and Frohlich 1989). During this period, burials are either collective (as in this tomb) or singular within pottery coffins. There is no indication that collective burials were other than the normative form of burial (Littleton 1997). There is an alternative form of bathtub coffins, hypothesized to be indicative of Mesopotamian ties or a foreign enclave but the current samples do not come from these. (Lombard 2016).

The Tylos period (300 BCE–600 CE) is represented by a single individual from Karranah (Andersen 2007) buried in a typical tomb of this time. While the collapse of extensive trading networks due to political unrest is suggested for the later part of this period (Andersen 2007), there is evidence of an extensive trade in gold and glassware in the early Tylos period (Lombard 2016) and a Sassanian fortress was built in the last centuries of this period, demonstrating that Bahrain was still regionally connected.

Qal’at al-Bahrain (QAB) was a port settlement site on the northern coast of Bahrain (Fig. 1) and was involved in Gulf trade semi-continuously from the Early Dilmun to the Islamic period (ca 628–1600 CE, Lombard 2016). The QAB humans analyzed in this work were excavated from a Middle Islamic cemetery located on the site and used for at least 200 years (ca 1400–1600 CE) based on stratigraphy (Kervran et al. 2005: 339), while the animals come from the Islamic settlement levels. During the Middle Islamic period, economic activity at QAB resumed after a period of apparent abandonment. Evidence of trade at this time includes the industrial-scale production of date honey at a Salgharid warehouse, which was traded to the Song Dynasty of China in exchange for porcelain (Kervran et al. 2005; Lombard 2016). While the village and graves at QAB post-date the re-establishment of commercial activity at this site, these activities point to the important role Bahrain played in regional and inter-regional trade during the Islamic period.

Fig. 1
figure 1

Map of Bahrain showing site locations

Analytical methods

Before destructive analysis, all teeth were recorded for pathology and photographed, and impressions were made of the human teeth. Permission for destructive analysis was granted by the Bahrain Authority for Culture and Antiquities (BACA).

Initial sample preparation was completed at the University of Auckland (Aotearoa, New Zealand). Careful attention was taken to avoid any enamel that was stained or had a chalky consistency. The teeth selected for analysis did not have significant wear or caries. The crown of each tooth was rinsed with Type 1 water, and the outer surface was removed with a Dremel to remove surface contaminants. Enamel powder (10 mg) was removed from the tooth using the Dremel (model #4000) attached to a Dremel workstation with a diamond cylindrical drill bit along the growth axis of the tooth, to sample enamel over the entire time of enamel formation.

The enamel was prepared for strontium isotopic analysis at the University of Calgary (Canada) using the protocol reported by Peschel et al. (2017). The enamel samples were digested in 1.0 mL 5 M HNO3 and 200 µL ultrapure 30% H2O2. The samples were redissolved in 500 µL 3 M HNO3 for ion exchange. The sample solutions were then loaded onto columns with Eichrom Sr-spec resin, eluted with MilliQ water, and dried in a heating block.

Strontium isotope ratios were measured using a Thermo-Fisher Triton thermal-ionization mass spectrometer in the Department of Earth Sciences at the University of Calgary. The dried sample was dissolved in 2.0 µL of a Ta2O3 activator solution and deposited onto a single, out-gassed rhenium filament. Samples were then heated to 1440 °C prior to the measurement of isotopic ratios. Each sample was analyzed for ten blocks (15 cycles/block) with a 8.389 s integration time for a total of 150 measurements. Baseline measurement and matrix rotation of the amplifiers were done between each block. The 87Sr/86Sr was internally normalized to an 88Sr/86Sr ratio of 8.375209 using the exponential law to correct for instrumental mass bias. The 85Rb signal was monitored to correct for any interference of 87Rb on 87Sr.

The reproducibility of the standard SRM987 during sample analysis was 0.710264 ± 0.000012 (2σ, n = 9). The 87Sr/86Sr was internally normalized to an 88Sr/86Sr ratio of 8.375209 using the exponential law to correct for instrumental mass bias. Analytical uncertainty for these measurements is 0.000013 (2σ).

Statistical analysis

Results were analyzed using SPSS v28. Descriptive statistics were calculated including measures of normality assessed by the Shapiro–Wilk test. Outliers for all methods were identified using the interquartile range rule (IQR + 1.5*IQR).

Results

Descriptive statistics

Table 2 contains the descriptive statistics for the human, animal (including Gregoricka’s data), and combined enamel samples. Individual values can be found in the Supplementary Data. The range of animal values from Gregoricka (0.708107–0.708518) is encapsulated by the combined animal range. There is no statistically significant difference between the human and animal means (t =  − 1.019, df = 98, p = 0.311). The distribution for animals is positively skewed while the distribution for humans has markedly long tails. Neither corresponds to a normal distribution (Fig. 2).

Table 2 Descriptive statistics of Bahrain 87Sr/86Sr samples including animal data from Gregoricka (2011)
Fig. 2
figure 2

87Sr/86Sr values of all Bahrain samples. a Histogram showing normal curve. b Test of normality. c Q-Q plot of values

The comparison of the enamel samples by species with geological and seawater values is shown in Fig. 3. The local geology is indicated by the more recent Quaternary or Neogene aquifer values. This range is too narrow to encompass local variability but consistent with the average 87Sr/86Sr for the combined Bahrain samples suggesting that the recent geological strata have a significant impact on bioavailable strontium. In contrast, the average water value from the Damman aquifer (Fig. 3, lower blue line) sits well below the Bahrain values. No values approach the Arabian Gulf seawater 87Sr/86Sr level of 0.7093 (Saeed et al. 2021).

Fig. 3
figure 3

Distribution of sample 87Sr/86Sr values by species compared to Neogene aquifer values (green band), average Damman aquifer value (lower blue line), and the local range calculated by Gregoricka (2011) from faunal values ± two standard deviations

Trimmed data

Based on the analysis of the entire sample using outliers, 15 samples (ten humans and five animals: two cattle, three ovicaprines) can be assigned as extreme values (falling below 0.707989 or above 0.708419). Removing these from calculation results in a trimmed data set as suggested by Wright (2005) (Table 3). The rationale of the trimmed method is that the trimmed data set can be assessed for concordance with a normal distribution.

Table 3 Descriptive statistics of the “trimmed” dataset once outliers (below 0.707989 or above 0.708419) have been removed

This trimmed data set has a normal distribution as assessed by the Shapiro–Wilk test (Fig. 4), with outliers in this subset defined as less than 0.708025 or more than 0.708360. A further two humans are identified as outliers and therefore probably not local. Strontium ratios can be categorized on the basis of this statistical analysis as follows: definitely non-local (< 0.707990 or > 0.708420), probably non-local (> 0.707990– < 0.708025; > 0.708360 < 0.708420), and local (0.708025–0.708360).

Fig. 4
figure 4

87Sr/86Sr values of Bahrain samples within the trimmed data set. a Histogram showing normal curve. b Tests of normality. c Q-Q plot of values

These data suggest that around 81.5% of animals (n = 27) and 83.6% of the human samples (n = 73) are local. However, on the Q-Q plot (Fig. 4), values below 0.70815 and above 0.708300 fall away from a normal distribution suggesting that they may either be non-local, reflecting a mixture of strontium sources for individuals on the move, or local individuals consuming food or water which was non-local in origin. Using this approach alone, there is no way of distinguishing between these scenarios. An alternate statistical approach is to use the division rows method which might help clarify whether a narrower “local” range of 0.708150–0.708300 is meaningful. This method is non-parametric and uses all individuals.

Division rows

In calculating differences between neighboring values, multiple samples from the same individual were not used. Rather, only the latest forming tooth was used to avoid bias from intraindividual homogeneity. The average difference between neighboring values was 0.000017 (SD = 0.000664). Plotting the difference (Fig. 5) shows that most values are indistinguishable; 61% of values are less than 0.000004 from their neighbor, while 19% are more than 0.000013 different from their neighbor, reflecting some stepwise progression in the distribution. These steps are shown in Fig. 6. The arrows indicate a step of 0.000019 (lower end) and 0.000016 (upper end) or between 0.708143 and 0.708303. These steps are significantly greater than the average gap between the central values (in this case 0.000003) and greater than the analytical error of 0.000013.

Fig. 5
figure 5

Distribution of differences between neighbouring 87Sr/86Sr values (n = 80) excluding paired samples

Fig. 6
figure 6

Sequential distribution of 87Sr/86Sr values of entire sample from lowest (excluding single human value 0.707177) to highest. Arrows point to the first “jump” in values at lower (0.000019) and upper (0.000016) ends of the central band (0.708143–0.708300). Blue are human, black are cattle, and red are ovicaprine

Five humans fall outside the animal range. Otherwise, human and faunal values are interchangeable. At the lower end of the range (below 0.708143), except for two outliers (one person from QAB and one from Hamad Town), the 14 values range from 0.708082 to 0.708124 and include four animals from Barbar: three cattle and one ovicaprine. These individuals vary little from each other ( = 0.000003). In contrast, at the other end of the range (above 0.708303), the individual values (n = 19) are more variable ( = 0.000015) and comprise 11 humans and eight animals (three cattle and five ovicaprines). The range suggested as definitely local (0.708140–0.708303) is narrow but consistent with the central band of trimmed values corresponding to the Q-Q plot of the trimmed distribution (Fig. 4c).

Differential offset with individuals

One way of confirming if this smaller range is valid is to compare the intra-individual level of variation. In SE Arabia, this offset is notably small (Hrnčíř and Laffoon 2019). This Bahrain set gives a smaller offset (Table 4), confirming the homogeneity of strontium values between individuals. The offset is significantly less than the range suggested for definitely local values of 0.000150 by division rows or 0.000370 using the trimmed data method.

Table 4 Calculation of the offsets between paired teeth from the same individual. Data for Southeastern Arabia from Hrnčíř and Laffoon (2019) and Bahrain data from human samples within this study where early forming teeth are M1, middle teeth are M2 or LC, and late teeth are the M3

Discussion

There have been many studies grappling with the identification of a local range in geologically heterogeneous regions. As pointed out by Gregoricka (2011, 2013) and Hrnčíř and Laffoon (2019), the Arabian Gulf and surrounding regions present the opposite problem. Furthermore, the evidence of extensive trade at different points in time involving the movement of both humans and animals means faunal samples (unless of less mobile local fauna, e.g., snails) are problematic. Yet in Bahrain, extensive human modification of the landscape and vegetation makes the application of modern proxies and the generation of a meaningful strontium isoscape impossible. Statistical methods analyzing the distribution of strontium values are required to estimate a local range and, as we have demonstrated, allow designations of local/non-local to be parameterized with degrees of certainty as opposed to a single local/non-local division.

Comparison of methods

The division rows method gives the narrowest local range (0.708140–708300). This is a range of ca 0.000150. The “jumps” on either side are greater than the analytical error, but small. They are consistent with what we know of the region and levels of variation present (Gregoricka 2013; Hrnčíř and Laffoon 2019; Kutterer and Uerpmann 2017; Greenfield et al. 2022). This narrow range coincides tightly with the values plotting as a normal distribution using the trimmed dataset method and suggests that this might be used as a band of definitely local with the wider range from trimmed data (0.708025–0.708360) as a probable local range. This concordance occurs despite the fact that trimmed methods are focused on the central tendency of a sample while the division row method is focused upon between individual divergence (i.e., dispersal). This seems particularly apt given its encapsulation of the estimated geological 87Sr/86Sr range. All the methods (including the commonly used 2SD) are consistent in identifying outliers, presumably extra-locals, at the lower end of the distribution, but there is a difference in the upper range of values. Division rows plus trimmed methods that do not assume a normal distribution a priori give a narrower upper bound and reflect the long tail and greater variability at this end of the distribution.

None of these methods is absolute measures of locality or movement for several reasons. Given the homogeneity in strontium values within the region (i.e., Bahrain and the eastern coast of Saudi Arabia), movement within this region may be invisible. Furthermore, all samples are time-averaged, so there may be mixing through multiple movements during the formation of enamel. In addition, strontium values are a window in time (the period of development of the tissue), not a sum value of movement. However, they do allow for movement (i.e., the incorporation of bioavailable strontium from sources the individual is exposed to over the period of development of the tissue sample) to be parameterized with varying degrees of certainty.

Diet can impact measured strontium values for individuals and should be considered carefully as a potential confounding factor in statistical methods of parameterizing local/non-local individuals. One possible bias that could impact the estimation of a local range is marine input in individual strontium ratios. Three individuals have 87Sr/86Sr values above 0.708800. While it is possible that these values reflect a marine food contribution, carbon and nitrogen isotope analysis of these same individuals indicates no evidence for heavy marine food reliance on Bahrain, lending credence to the idea that these three people spent their childhood outside the island (Smith 2023). A second possible bias is the consumption of meat from non-local animals. This is unlikely given that meat contains relatively little strontium compared to the garden produce which is the major dietary component in Bahrain (Smith 2023).

The multiple statistical methods used here indicate homogeneity between most of the samples. This consistency makes sense given the remarkable continuity on the reliance of gardens for produce in Bahrain (Smith 2023) and that individual strontium values reflect the bioavailable strontium of food sources, particularly plants.

Identifying scale of movement

Methods using two standard deviations unsurprisingly result in estimates of less than 10% of human and animals as “non-local” (Table 5). Given the non-normality of the distribution, these estimates are outside what would be expected for Bahrain and privilege extreme outliers in a homogeneous landscape. Using the trimmed data set and outliers determined using interquartile ranges suggest that about 13.7% of humans and 18.5% of animals were non-local at the time of tooth development with a further 2.7% of humans probably non-local. Taking the central band of division rows suggests that 32.9% of humans and 44.4% of animals were non-local. This central band corresponds to the values of the trimmed data set that accord with a normal distribution, but the bordering values have a strong potential to include individuals with mixed values. Mixing can be consequent to movement occurring during the period of development sampled or preferential consumption of non-local food during that period. These are, of course, not mutually exclusive scenarios: people with a different cultural background are more likely to have access to and preference for non-local foods. Untangling the confound of mixing may be assisted by archaeological context (e.g., association of discrete burial context, Leach et al. 2010) or by multi-isotope analysis (e.g., including a dietary marker, Sołtysiak 2019).

Table 5 Designation of the “local values” using different methods: 2SD around the animal mean (Gregoricka 2011, 2013), trimmed data (including the set of probably non-local), and division rows. Subsequent categorization of human and animal values by source is shown below

In Fig. 7, strontium values for humans, ovicaprine, and cattle are shown against the bands suggested by the trimmed data set with the inclusion of the central band of values (0.708140–0.708300) suggested by division rows and the QQ plot (Fig. 4c). It confirms that on the lower range of strontium values, except for the three human outliers, there is a small group of animals and humans showing consistent values but possibly on the margins of what we would expect for Bahrain. The four animals come from Barbar temple, while the humans come from across the island and different time periods. Several explanations include the following: that the Bahrain range is drawn too narrowly on the lower margin, dietary differences, or that the source is an area with similar strontium values to Bahrain. Dietary differences seem unlikely given the interspersing of human and faunal results and the lack of any clear temporal or spatial pattern in human values. Mixing is unlikely for the faunal values because there are no faunal samples representing more extreme values, which we would expect as seen at the other end of the distribution.

Fig. 7
figure 7

Samples arranged in increasing order against the designations of locality derived from the trimmed data (extra-local to probably local) and the division rows (narrow band of local). Humans, blue; cattle, black; ovicaprines, red

It seems likely the range is drawn too narrowly at the lower margin, or this group comes from an area with a similar or slightly lower local range. Given comparative data from the broader region (Supplementary Table 2) and geological mapping (Ryan et al. 2021: Fig. 3), potential candidate areas are Iran, given similar levels from Tepe Yahya (Gregoricka 2013), southern Mesopotamia (Greenfield et al. 2022), or Eastern Saudi Arabia (based on geological similarity and proximity). There is currently no way to distinguish between these potential sources. Due to this issue of geological homogeneity, movement between Bahrain and other regional locales, particularly the eastern Saudi Arabian coast, may be invisible. The other band of “probably local” has a very different distribution: the values are more widely spaced and could easily represent mixing given the number of human and animal samples with higher values. Including only the three animals and humans in this upper band as non-local would suggest that 33.3% of animals and 19.2% of humans have moved. The comparison overall points to between 13.7 and 32.9% of human samples having a non-local value and between 18.5 and 44.4% of animals falling into this group.

The purpose of movement

These data suggest a larger percentage of ovicaprines and cattle, compared to humans, were non-local to Bahrain. It is, however, important to recognize that human movement may be underestimated since only the period of dental formation is recorded in the human teeth. In this analysis, the numbers of early (24), middle (23), and late (21) forming human teeth were similar and there is no significant difference in the designation of local versus extra-local by tooth class. A more recent residence is recorded in the hypsodont teeth of ovicaprine and cattle. Nevertheless, the identification of potential and definite non-local faunal individuals confirms the importance of animal trade to the economy and subsistence of people in Bahrain from the Early Dilmun period. Significant trade in animals has been confirmed for the Early Dilmun and the Islamic periods which are the largest animal assemblages available for analysis. The situation for the Middle Dilmun and Tylos periods is less clear given the lack of settlement excavations. Evidence indicates the persistence of land and water routes from the Arabian Gulf margins over this period although the relative importance of specific neighbors changes over time (Lombard 2016; Magee 2014).

Crucially, a comparison between animal samples from Barbar temple compared to elsewhere on the island points to the ways in which this trade was purposeful. Approximately 28.6% of the ovicaprines and 12.5% of the cattle at Barbar can be classified as definitely non-local (Table 6). There are insufficient cattle samples to compare, but only 9.1% of ovicaprines from other contexts are non-local. Zooarchaeological analysis at Barbar temple has suggested the use of the site involved ritual practices centered around the temple complex (Bangsgaard 2003). Cattle and ovicaprines found at the site were slaughtered at a younger age than would be sustainable for a breeding population, suggesting they were not raised locally. The Barbar cattle and ovicaprines also differ in body size from animals recovered from the Early Dilmun layers of QAB and are thought to be of a different origin (Bangsgaard 2003: 14). The zooarchaeological data, and now, the strontium isotope data, suggest that some livestock was imported specifically for temple activities at Barbar.

Table 6 Percentage of ovicaprines classified as local or non-local at Barbar compared to other Bahrain sites

Apart from a live trade in animals for ritual purposes, there is evidence that a small proportion of cattle and sheep was imported, possibly for breeding purposes. There is one non-local cow from QAB and one ovicaprine from an Early Dilmun burial mound that do not fall within the local or possibly local range. This finding supports arguments based on body size differences of cattle and sheep at QAB and Saar that different breeds were imported to Bahrain (Uerpmann and Uerpmann 1999:642; 2008:74), with likely sources including Mesopotamia and the Arabian Peninsula. In particular, the body size differences in sheep in the Gulf seem to reflect the introduction of animals from Mesopotamia and Iran, along with the small-bodied population of sheep that was maintained on the mainland from the Neolithic and moved with people to Bahrain (see Bangsgaard 2003 and Uerpmann and Uerpmann 1999 for discussion of body size in Bahrain; see Uerpmann and Uerpmann 2008 for South-east Arabia). Uerpmann and Uerpmann (2008: 74) argue that the transportation of animals overseas was no trouble, and certainly, the data here support that conclusion. Additionally, historical records remind us of the potential extent of trade: 14,000 sheep and goats were imported from Iran, and 2050 sheep from the Arabian mainland in 1905 (Lorimer 1908: 246). The trade in live animals was a significant part of the economy from the Early Dilmun, and importation for special purposes points to the need to consider the context of faunal samples, rather than assuming they will represent local values, especially in trading centers like Bahrain.

Outside of Barbar temple, the identification of between 13.7 and 32.9% of people as having spent some period of their childhood outside the region corresponds to what might be expected within a trading nexus such as Bahrain. These figures vary by period when those with sufficient sample size are compared. Only two of 27 Early Dilmun teeth were extra-local (one probable, one definite, 3.7–7.4%), one from late Dilmun (probably extra-local, maximum 4.8%), and compared to eight from the Islamic period at Qal’at al-Bahrain (one probable extra-local and seven definite, 35–40%). It should be noted that this is considerably less than the scale of movement identified at Harappa (“almost half” Kenoyer et al. 2013:2295, “all first molars sampled” Valentine et al. 2015: 11) or the Byzantine port of Alia (Perry et al. 2017), where the number of “non-locals” (91.3%) exceeds locals. The early and late Dilmun rates are comparable to Bronze Age sites in the Oman Peninsula (8% Gregoricka 2013, 0% et al. Buhais Kutterer and Uerpmann 2017) but the Islamic period sample shows a more extensive scale of movement. These comparisons throw into question the nature of the movement. Could this reflect single migratory movements undertaken during childhood/adolescent or childhood/adolescent movement to multiple places as part of trade? Analysis of paired samples is essential to untie those complexities along with future work we will undertake to explore the identity (age, sex estimation) of potential non-local individuals. Nevertheless, the alignment of the human values with faunal values, except for the most extreme outliers, points to the ways in which many of those movements mirrored trade in subsistence products.

Conclusions

A lack of proxies for bioavailable strontium in a context of geological homogeneity presents challenges for identifying the movement of humans and animals in archaeological samples. In cases where there are sufficient samples, using multiple statistical methods allows for potential movement to be identified and parameterized with different degrees of certainty as opposed to using just one arbitrary method. This multi-method approach then provides a basis for further testing including in heterogeneous areas where migration might be more easily detectable.

In the context of Bahrain, the current analysis points to the trade of live animals and how that trade was differentiated by purpose. Larger samples from across time and site types should help elucidate how that changed over time and the different roles of urban versus rural settlements in this animal trade. In contrast, our estimates of people who moved to Bahrain are just the tip of the iceberg given human movement is not necessarily from one place to another, but potentially sequential, complicated, and can occur during adulthood as well. Nevertheless, the startling homogeneity of the central band of Bahrain values points to a predominantly local population even in the Islamic period. Future temporal comparison will indicate whether this is a constant in Bahrain.