Introduction

Osteopetrosis describes a group of rare, inherited skeletal disorders resulting from the defective bone resorption by the osteoclasts that causes a marked increase in bone density [1]. The disease was first defined as a clinical entity in 1904 by Albers-Schonberg in a 26 year old man with generalised skeletal sclerosis and multiple fractures while the term osteopetrosis was introduced in 1926 by Karshner [2, 3]. Based on the pattern of inheritance, two major forms of osteopetrosis have been described: autosomal dominant osteopetrosis (ADO) and autosomal recessive osteopetrosis (ARO) also known as malignant infantile osteopetrosis [4].

ADO, also known as Albers-Schonberg disease, is generally considered to be the milder form with adult-onset and was initially ascribed to pathogenic variants in CLCN7 (ADO2) or LRP5 (ADO1) [5, 6]. However, variable disease severity has been reported in ADO patients ranging from asymptomatic disease to severe phenotype manifesting in early childhood [6]. ADO1 caused by LRP5 variants resulting in reduced LRP5 affinity for the extracellular antagonists namely SOST and DKK1 is associated with enhanced osteoblast activity and Wnt canonical signaling rather than an osteoclast defect and is now classified as a disease of LRP5 activation [6,7,8]. Autosomal recessive osteopetrosis (ARO), also termed malignant infantile osteopetrosis, presents very early in life, has a severe phenotype and is usually fatal if not treated [4]. Patients present with varying clinical manifestations including anemia, hepatosplenomegaly, bleeding, infections, growth retardation, visual and hearing loss, skeletal and dental anomalies, nerve palsies [9]. ARO can be osteoclast rich caused by pathogenic variants in TCIRG1, CLCN7, CAII, OSTM1, SNX10 and PLEKHM1 or osteoclast poor caused by pathogenic variants in TNFSF11 and TNFRSF11A [4, 7, 10]. Rare cases of X-linked osteopetrosis have also been described in the literature resulting from pathogenic variants in the IKBKG [11].

With the introduction of next generation sequencing (NGS), pathogenic variants in other genes like CSF1R, CTSK, C16ORF57, FERMT3, LRRK1, MITF, RELA, SLC29A3 and TRAF6 have also been identified in patients with osteopetrosis although their exact role in the pathogenesis of osteopetrosis is not completely understood [7]. The heterogeneity in phenotype and genotype can present diagnostic challenges for this rare entity.

In this study, we describe our results from the genetic analysis of 31 patients clinically suspected with osteopetrosis. The results highlight the genetic heterogeneity of osteopetrosis and the advantages of using a targeted NGS approach for the genetic diagnosis of osteopetrosis.

Materials and Methods

Samples

The study includes 31 patients suspected of osteopetrosis based on clinical history, examination, laboratory, and radiological findings. Parents’ samples, wherever available were screened to assess the carrier status if a variant was identified in the index case. Peripheral blood was collected for nucleic acid (DNA) extraction and molecular analysis following informed consent. Till 2019, samples were screened by Sanger sequencing for variants in TCIRG1, CLCN7, TNFRSF11A and TNFSF11 genes. After 2019, all the samples were screened directly by next generation sequencing. In 4 patients from the pre-NGS timeline with no variants identified by Sanger sequencing, NGS was done later with the banked DNA sample. Clinical history, including presenting complaints, physical examination findings and biochemical investigation results were collected from the patient’s clinical records. Prenatal diagnosis was also done in one family following consent and ethics committee approval.

DNA Sequencing

Sanger sequencing (SS) was done using the earlier established protocols in our lab [12]. Peripheral blood was collected after obtaining consent in EDTA anticoagulant tubes. Genomic DNA was extracted from the peripheral blood leukocytes employing a commercial kit (Gentra Puregene Blood kit, Qiagen, Hilden, Germany) consistent with the manufacturer’s instructions. Microvolume spectrophotometer (Nanodrop, Thermo Scientific) and agarose gel electrophoresis were used to assess the quality and quantity of the extracted DNA. Amplification was carried out by polymerase chain reaction (PCR) using primers targeting the exons and exon-intron junctions. Primers were designed to cover all the coding exons and the splice regions of the introns (minimum 20 bases on either side of the exons) in TCIRG1 (ENST00000265686.8), TNFSF11 (ENST00000398795.7), TNFRSF11A (ENST00000586569.3) and CLCN7 (ENST00000382745.9) (Online Resource 1). Primer sequences for TCIRG1 (except EX1F, EX1R, 16altF and 16altR) were obtained from published literature [13] while the rest of the primers were designed using Primer3web version 4.1.0 (https://primer3.ut.ee/). BLAST like alignment tool (BLAT) and In-Silico PCR tool in the UCSC Genome Browser (http://genome.ucsc.edu/) were used to confirm the specificity of the primers [14, 15]. Genetic analysis was carried out with Sanger sequencing using Big Dye Terminator version 3.1 chemistry and analysed in ABI 3130 DNA analyser (Applied Biosystems, Foster City, California). The sequences were aligned and analysed using the bioinformatics software Geneious [16].

Next Generation Sequencing (NGS)

The sequencing protocol with Illumina Hiseq sequencer and the subsequent bioinformatics used in our lab for germline variants have been described in detail earlier [17]. Library preparation was done with targeted gene capture using a custom capture kit that covered 2196 genes.

NGS Data Analysis and Variant Annotation

Bioinformatics analysis was carried out using Genome Analysis ToolKit (GATK) best practices pipeline. Briefly, FASTQ files were converted into binary alignment map (BAM) files following alignment with the human reference genome (GRCh37/hg19) and subsequently to variant call format (VCF) files. Following annotation, the subsequent variant filtering strategy was done based on the minor allele frequency (MAF) in population databases like genome aggregation database (gnomAD; gnomad.broadinstitute.org) and single nucleotide polymorphism database (dbSNP) where variants identified in more than 1% of the population (MAF > 0.01) were filtered out [18]. Variant search in the local population database (Indigenomes) was also performed [19]. The initial analysis included 20 genes associated with osteopetrosis followed by a more detailed analysis involving the rest of the genes in the panel (Online Resource 2). The clinical significance of reported variants and the genotype-phenotype correlation was assessed from the clinically relevant variant (ClinVar) database, Human genome mutation database (HGMD) and online Mendelian inheritance in man (OMIM) records [20,21,22]. For novel variants not described earlier, scores from different computational algorithms like SIFT, PolyPhen2, MutationAssessor and the conservation of the protein sequences involving the mutation sites (for exon variants) were compared across different species to determine their functional importance [23,24,25]. Variants were verified with the integrative genomics viewer (IGV) for excluding possible false calls due to background signal noise [26]. Variant search in the literature for disease evidence was performed using Mastermind®, a comprehensive search engine for genetic variants [27]. The variants were labelled as pathogenic/likely pathogenic/variant of uncertain significance (VUS)/likely benign/benign based on the American college of medical genetics and genomics (ACMG) guidelines [28]. Copy number variants (CNV) analysis was performed using the DRAGEN (Dynamic Read Analysis for GENomics) CNV pipeline and CNVs were screened based on the ratio between expected and observed reads and Bayes Factor (BF), defined as the likelihood ratio value (log10) for the CNV call divided by the normal copy number, was used as statistical support for the identified CNVs.

PCR for SNX10 Deletion

A PCR with custom-designed primers was done to confirm a large deletion involving SNX10 (ensemble transcript ID: ENST00000338523.9) in a patient. Primers were designed targeting exon 1 of SNX10 as described earlier (Online resource 1). The post-PCR amplicons were run on 2% agarose gel electrophoresis at 120 volts for 20 min and visualised with a gel documentation system (BIO-RAD, California, USA). Amplicons produced with primers targeting exon 1 of the HBB was used as an internal control.

Results

Patient Characteristics

The study cohort included 31 patients with 18 males and 13 females. The median age at presentation was 9 months [range: 1 month to 27 years]. With the exception of one patient diagnosed at 27 years of age, all the other patients were diagnosed under 12 years. History of developmental delay could be elucidated from 15 patients while 11 patients had complaints of loss of vision. Hepatomegaly and splenomegaly were documented in 25 and 23 patients respectively while facial dysmorphism or stunted growth was identified in 21 patients. History of consanguinity was recorded in 24 patients (including 15 among the 17 patients with homozygous variants). The clinical and laboratory parameters of the study cohort including the haematological indices, serum calcium, phosphorus, parathormone and vitamin D levels are shown in Online Resource 3.

Genotype Results

Screening for the genetic variants was done in 48 samples including 31 samples from the index patients, 16 samples from the parents’ and 1 chorionic villus sample (CVS) sample (Fig. 1). Among the 20 patients, where variants were screened by SS targeting TCIRG1, TNFSF11, TNFRSF11A and CLCN7, pathogenic variants were identified in 13 of them. This includes 8 patients with biallelic variants and one patient with a monoallelic variant in TCIRG1, 3 patients with biallelic variants in TNFRSF11A and 1 patient with biallelic variants in CLCN7 (Fig. 2). In the remaining 11 patients along with 4 patients where SS failed to pick up a pathogenic variant, variants were screened by NGS. Pathogenic/likely pathogenic variants were identified in 9 patients including 4 patients with biallelic variants in TCIRG1, 1 patient each with biallelic variants in TNFSF11 and TNFRSF11A, 2 patients with a monoallelic variant in CLCN7 and 1 patient with a homozygous deletion in SNX10. Four patients had monoallelic variants in PTPN11, TCIRG1, VDR and a reported variant in FBN1 respectively with the last patient also harbouring a second variant of uncertain significance (VUS) in CSF1R. The genotype results of the entire cohort are shown in Table 1 and Online Resource 4. NGS could not be done from the banked DNA sample due to quality/quantity issues in 3 patients with no pathogenic variants identified by SS. Copy number variation (CNV) analysis showed a possible homozygous deletion (CNV ratio = 0) in SNX10 in 1 patient which was confirmed with custom-designed primers targeting the deleted exon (Fig. 3a and Online Resource 5).

Fig. 1
figure 1

Flow chart showing the approach to the molecular diagnosis in the study cohort and the results of genetic screening

CVS, chorionic villus sampling; P, pathogenic; LP, likely pathogenic; SS, sanger sequencing; NGS, next generation sequencing; VUS, variant of uncertain significance

Fig. 2
figure 2

TCIRG1 (Ensembl transcript ID: ENST00000265686.8) with 20 exons (19 coding exons) coding for 830 amino acids showing the spectrum of variants identified in the study. *Denotes novel variants

Fig. 3
figure 3

Confirmatory PCR a done in a patient with SNX10 deletion identified by the copy number variant (CNV) analysis. HBB was used as the internal control. Representative electropherograms b from a family with homozygous c.807 + 2T > G variant in TCIRG1. The patient and the CVS were homozygous while the father and mother were heterozygous for the variant. Normal control showing wild type (T) sequence is shown below

A total of 29 variants, including 28 unique variants (1 variant was seen in two different patients belonging to different families) were identified in 26 patients which included missense (15), splice site (5), frameshift (4), nonsense (4) variants and a large deletion. Parents’ samples were screened in 8 families and the results are shown in Table 1. Prenatal diagnosis was also done in one family where the pathogenic variant in the index patient was also identified in the CVS (Table 1). Extended screening by including the remaining genes in the panel for analysis showed heterozygous variants of uncertain significance in FBN1, PTPN11, VDR and CSF1R in 3 patients. No variant could be identified in 2 patients even after NGS despite clear clinical and radiological evidence of osteopetrosis. Among the 28 unique variants identified in this study, 11 had been reported earlier in the literature while the remaining 17 variants were novel based on a literature search with Mastermind®. All the identified variants were submitted to ClinVar database [ClinVar accession numbers: SCV002516023–SCV002516050].

Discussion

There is a paucity of published data describing the genetic variants in osteopetrosis from Indian patients. The earlier study by Phadke et al. described the genotype in 8 osteopetrosis patients (6 TCIRG1 and 2 CLCN7 pathogenic variants) [29]. To the best of our knowledge, this is the largest data describing the mutation spectrum in osteopetrosis from India. Among the 31 index cases, we were able to identify pathogenic/likely pathogenic variants in 22 patients (70.9%) involving genes that are known to be mutated in osteopetrosis. Another 4 patients had variants that were classified as VUS based on the ACMG guidelines.

Pathogenic variants in TCIRG1 are the most common cause of ARO [7]. TCIRG1 encodes a subunit of a large protein complex known as a vacuolar H + ATPase which transports protons across the membrane thereby regulating the pH of cells (osteoclasts) and their surrounding environment [30]. TCIRG1 variants have a heterogeneous distribution along the entire gene resulting in an increased incidence of novel variants in the gene [29, 31,32,33,34]. This is corroborated by our study findings, wherein, among the 13 pathogenic/likely pathogenic variants identified in the 12 ARO patients with TCIRG1 variants (including one patient with compound heterozygous variants), 6 variants were novel. Interestingly, patient 20 had a reported pathogenic variant in a heterozygous state involving TCIRG1. The clinical presentation along with the family history (elder sibling died at the age of 10 months with osteopetrosis) was suggestive of ARO. Although there have been case reports of ADO with heterozygous variants in TCIRG1 resulting in a milder phenotype, this may still not explain the more severe phenotype in this patient, further supported by the fact that the asymptomatic mother also harboured the same variant [35]. Further workup could reveal additional variants that would explain the severe phenotype in the patient.

Biallelic variants in CLCN7 are the second most common cause of ARO while heterozygous CLCN7 variants are the commonest cause for ADO [7]. Like TCIRG1, several novel CLCN7 pathogenic variants have been reported in patients from different ethnic backgrounds with varying disease severity [36,37,38]. In our study we had one ARO patient with a compound heterozygous variant [c.1125 C > G (p.Ile375Met) and c.1681 C > T (p.Arg561Trp)] in CLCN7 and two ADO patients with heterozygous c.643G > A (p.Gly215Arg) and heterozygous c.2274 C > G (p.Phe758Leu) variants in CLCN7. While both the variants in the ADO patients had been reported earlier, the two variants identified in the ARO patient were novel [38,39,40].

The TNFSF11 (RANKL) codes for the essential osteoclastogenic ligand and, along with its functional receptor TNFRSF11A (RANK), is involved in osteoclast differentiation and activation via downstream signaling pathways such as nuclear factor kappa B (NF-kB), c-Jun N-terminal kinase (JNK) and SRC family kinases [7, 41]. In our study, we had 5 ARO patients with homozygous variants in TNFRSF11A (4) or TNFSF11 (1). Except for the c.730G > T (p.Ala244Ser) variant in TNFRSF11A, which has been reported earlier in various studies, all the other variants identified in the two genes were novel [42, 43].

SNX10 codes for a protein that participates in protein sorting and membrane trafficking [44]. Large deletions in SNX10 have been reported earlier in patients with ARO [45, 46]. We identified a large deletion in SNX10 involving the non-coding exon 1 in one patient with ARO by NGS CNV analysis (Supplemental File S5) which was subsequently confirmed with a PCR (Fig. 3a). However, as this is a targeted exon sequencing, the exact breakpoints and the extent of the deletion could not be determined. Earlier studies have shown that large deletions involving the non-coding exon 1 and the upstream region of SNX10 result in autosomal recessive osteopetrosis [45].

In patients where no variants were identified in the initial analysis involving 20 genes, an extended screening was performed. Heterozygous variants were identified in FBN1, PTPN11, VDR and CSF1R in 3 patients (Supplemental File S5). These variants were classified as VUS and further studies are warranted to establish their clinical significance. The above genes have been reported in other conditions, with one or more overlapping clinical features, such as acromicric or geleophysic dysplasia (FBN1), rickets (VDR) and dysosteosclerosis (CSF1R) and hence a possible clinical misdiagnosis cannot be ruled out in these cases. Juvenile myelomonocytic leukemia (JMML), caused by variants in genes such as PTPN11, is one of the close differentials for ARO [47,48,49]. Animal studies have shown that knockout of PTPN11 results in mild osteopetrosis [50]. Although the pathogenicity scores predicted the variant c.28 A > C in PTPN11 identified in patient 10 to be deleterious, the variant was classified as VUS and further functional studies are required for a better genotype-phenotype correlation to attribute causality to osteopetrosis. No significant variant could be identified in 2 samples after NGS (Table 1). Earlier studies have also shown that genetic diagnosis could not be made in about 10% of osteopetrosis cases which could be because of novel genes involved that are yet to be functionally characterised or because of variants that are generally missed during the analysis such as functionally relevant deep intronic variants [7]. Further, as this is a targeted exome sequencing, copy number variant detection is based on the read depth and may not be reliably identified by this method as the read depth in exome sequencing is known to be extremely variable and influenced by several factors such as GC content, PCR duplication bias, targeted depth, sequencing efficiency, and sample batching.

Currently, the only curative treatment for ARO is haematopoietic stem cell transplantation (HSCT), done at the earliest following the diagnosis [51]. In vitro and animal studies on gene therapy for osteopetrosis have invoked a lot of interest as it has the potential to cure without some of the allogeneic HSCT related complications [52]. One of the biggest advantages of performing genetic screening is the option of performing prenatal diagnosis in families where a causative variant has been identified in the index patient [29]. In our study, prenatal diagnosis was done with chorionic villus sampling (CVS) in one family with c.807 + 2T > G variant in TCIRG1 after the ethics committee approval (Table 1). Sequencing the target exon of TCIRG1 from the CVS DNA revealed a homozygous c.807 + 2T > G variant and subsequently, medical termination of pregnancy was done in week 14 of pregnancy (Fig. 3b).

In conclusion, despite the molecular heterogeneity, NGS has enabled us to take big strides in understanding the genetic bases of osteopetrosis. Our study findings are consistent with the earlier genetic studies in osteopetrosis with an increased incidence of consanguinity in the affected families and novel variants in the genes involved with TCIRG1 being the most common mutated gene. With NGS becoming more and more affordable, molecular diagnosis can help in the diagnosis of osteopetrosis and make treatment-related decisions such as prenatal diagnosis and HSCT.

Table 1 Results of genetic studies and the variants identified in our study cohort