South Asia: The Missing Diverse in Diversity

  • Review
  • Published in: Behavior Genetics

Abstract

South Asia, which accounts for around 25% of the world’s population, encompasses individuals with tremendous genetic and environmental diversity. The region spans eight countries and is home to over 4500 anthropologically defined groups that speak numerous languages and hold an array of religious beliefs and cultural practices, making it one of the most diverse places in the world. Much of the region’s rich genetic diversity and structure results from a complex combination of population history, migration patterns, and endogamous practices. Despite the region’s size and diversity, South Asians have been underrepresented in genetic research, making up less than 2% of the participants in genetic studies. This has led to a lack of population-specific understanding of genetic disease risks. We aim to raise awareness of the underlying genetic diversity in this ancestry group, call attention to the group’s lack of representation, and highlight strategies for future studies in South Asians.

Funding

This work was supported by the National Institute on Drug Abuse (DA053693, T32DA017637), the National Institute of Mental Health (MH016880), the National Institute on Aging (AG046938), and the National Institutes of Health (DA051937).

Author information

Contributions

DRD and TBH wrote the main manuscript text. All authors reviewed the manuscript.

Corresponding author

Correspondence to Deepika R. Dokuru.

Ethics declarations

Competing interests

The authors declare no competing interests.

Human and Animal Rights and Informed Consent

This article does not contain any studies with human or animal subjects performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Edited by Sara Jaffee.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dokuru, D.R., Horwitz, T.B., Freis, S.M. et al. South Asia: The Missing Diverse in Diversity. Behav Genet 54, 51–62 (2024). https://doi.org/10.1007/s10519-023-10161-y

  • DOI: https://doi.org/10.1007/s10519-023-10161-y
