-
Sanger validation of WGS variants - when to? bioRxiv. Genom. Pub Date : 2024-04-25 Arina A. Kopernik, Gaukhar Yu. Zobkova, Natalia Doroschuk, Anna V. Smirnova, Daria V. Molodtsova-Zolotukhina, Olesya Sagaydak, Oxana P. Ryzhkova, Sergey I. Kutsev, Olga Groznova, Lyusya Melikyan, Elizaveta Bondarchuk, Mary Woroncow, Eugene Albert, Viktor P. Bogdanov, Pavel Y. Volchkov
With the development of Next-Generation Sequencing (NGS) technologies it became possible to simultaneously analyze millions of variants. Despite the quality improvement it is generally still required to confirm the variants before reporting. However, in recent years the dominant idea is that one could define the quality thresholds for "high quality" variants which do not require orthogonal validation
-
Chromosome-level assembly of Cucumis sativus cv. 'Tokiwa' as a reference genome of Japanese cucumber bioRxiv. Genom. Pub Date : 2024-04-25 Takashi Seiko, Chiaki Muto, Koichiro Shimomura, Ryoichi Yano, Yoichi Kawazu, Mitsuhiro Sugiyama, Kenji Kato, Norihiko Tomooka, Ken Naito
Cucumber is one of the most important vegetables in the Japanese market. To facilitate genomics-based breeding, there is a demand for a reference genome of Japanese cucumber. However, although the cucumber genome is relatively small, its assembly is challenging because tandem repeats comprise ~30% (~100 Mbp) of the genome. To overcome this, we deployed Oxford Nanopore sequencing that produces
-
Recent emergence of cephalosporin resistant Salmonella Typhi in India due to the endemic clone acquiring IncFIB(K) plasmid encoding blaCTX-M-15 gene bioRxiv. Genom. Pub Date : 2024-04-25 Tharani Priya Thirumoorthy, Jobin John Jacob, Aravind V, Monisha Priya T, Bhavini Sandip Shah, Veena Iyer, Geeti Maheshwari, Urmi Trivedi, Anand Shah, Pooja Patel, Anushree Gaigawale, Yesudoss M, Pavithra Sathya Narayanan, Ankur Mutreja, Megan Carey, Jacob John, Gagandeep Kang, Balaji Veeraraghavan
The emergence and spread of Salmonella Typhi (S. Typhi) resistant to third generation cephalosporins are a serious global health concern. In this study, we have genomically characterized 142 cephalosporin resistant S. Typhi strains isolated from Gujarat, India. Comparative genome analysis of study isolates revealed the emergence of a new clone of ceftriaxone-resistant S. Typhi harboring three plasmids
-
Inferring gene regulatory networks using DNA methylation data bioRxiv. Genom. Pub Date : 2024-04-25 Thomas E Bartlett
We show much-improved accuracy of inference of GRN (gene regulatory network) structure resulting from the use of an epigenomic prior network. We also find that DNAme data are very effective for inferring the epigenomic prior network, recapitulating known epigenomic network structure found previously from chromatin accessibility data, and in some cases providing potential TF cis-regulations for eight
-
Genome-wide profiling of highly similar paralogous genes using HiFi sequencing bioRxiv. Genom. Pub Date : 2024-04-24 Xiao Chen, Daniel Baker, Egor Dolzhenko, Joseph M Devaney, Jessica Noya, April S Berlyoung, Rhonda Brandon, Kathleen S Hruska, Lucas Lochovsky, Paul Kruszka, Scott Newman, Emily Farrow, Isabelle Thiffault, Tomi Pastinen, Dalia Kasperaviciute, Christian Gilissen, Lisenka Vissers, Alexander Hoischen, Seth Berger, Eric Vilain, Emmanuele Delot, Genomics Research to Elucidate the Genetics of Rare Diseases
Variant calling is hindered in segmental duplications by sequence homology. We developed Paraphase, a HiFi-based informatics method that resolves highly similar genes by phasing all haplotypes of a gene family. We applied Paraphase to 160 long (>10 kb) segmental duplication regions across the human genome with high (>99%) sequence similarity, encoding 316 genes. Analysis across five ancestral populations
-
Bridging genomic gaps: A versatile SARS-CoV-2 benchmark dataset for adaptive laboratory workflows bioRxiv. Genom. Pub Date : 2024-04-24 Sara E. Zufan, Louise M. Judd, Calum J. Walsh, Michelle L. Sait, Susan A. Ballard, Jason Kwong, Timothy P. Stinear, Torsten Seemann, Benjamin P. Howden
Genomic sequencing's adoption in public health laboratories (PHLs) for pathogen surveillance is innovative yet challenging, particularly in the realm of bioinformatics. Low- and middle-income countries (LMICs) face heightened difficulties due to supply chain volatility, workforce training, and unreliable infrastructure such as electricity and internet services. These challenges also extend to high-income
-
Catalytically distinct IDH1 mutants tune phenotype severity in tumor models bioRxiv. Genom. Pub Date : 2024-04-23 Mowaffaq Adam Ahmed Adam, Mikella Robinson, Ashley V. Schwartz, Grace A. Wells, An Hoang, Elene Albekioni, Grace Chao, Joi Weeks, Uduak Z. George, Carrie D. House, Sevin Turcan, Christal D. Sohl
Mutations in isocitrate dehydrogenase 1 (IDH1) impart a neomorphic reaction that produces the oncometabolite D-2-hydroxyglutarate (D2HG), which can inhibit DNA and histone demethylases to drive tumorigenesis via epigenetic changes. Though heterozygous point mutations in patients primarily affect residue R132, there are myriad D2HG-producing mutants that display unique catalytic efficiency of D2HG production
-
Yeast poly(A)-binding protein (Pab1) controls translation initiation in vivo primarily by blocking mRNA decapping and decay bioRxiv. Genom. Pub Date : 2024-04-23 Poonam Poonia, Vishalini Valabhoju, Tianwei Li, James Iben, Xiao Niu, Zhenguo Lin, Alan Hinnebusch
Poly(A)-binding protein (Pab1 in yeast) is involved in mRNA decay and translation initiation, but its molecular functions are incompletely understood. We found that auxin-induced degradation of Pab1 reduced bulk mRNA and polysome abundance in a manner suppressed by deleting the catalytic subunit of decapping enzyme (dcp2Δ), demonstrating that enhanced decapping/degradation is the major driver of reduced
-
Beyond A and B Compartments: how major nuclear locales define nuclear genome organization and function bioRxiv. Genom. Pub Date : 2024-04-23 Omid Gholamalamdari, Tom van Schaik, Yuchuan Wang, Pradeep Kumar, Liguo Zhang, Yang Zhang, Gabriela A. Hernandez Gonzalez, Athanasios E. Vouzas, Peiyao A. Zhao, David M. Gilbert, Jian Ma, Bas van Steensel, Andrew S. Belmont
Models of nuclear genome organization often propose a binary division into active versus inactive compartments, yet they overlook nuclear bodies. Here we integrated analysis of sequencing and image-based data to compare genome organization in four human cell types relative to three different nuclear locales: the nuclear lamina, nuclear speckles, and nucleoli. Whereas gene expression correlates mostly
-
Reference genome of the ant Lasius platythorax bioRxiv. Genom. Pub Date : 2024-04-23 Barbara Feldmeyer, Nadege Guiglielmoni, Joseph Kirangwa, Florian Menzel, Judit Salces-Ortiz, Rosa Fernandez, Elena Buena-Atienza, Claudio Ciofi, Maria Angela Diroma, Alessio Iannucci, Chiara Natali, Ann Mc Cartney, Olaf Riess, Nicolas Casadei, Ann-Marie Waldvogel
Ants are a highly diversified insect family of the order Hymenoptera, with many fascinating characteristics such as eusociality, chemical communication, farming, or social parasitism. Moreover, ants frequent a wide variety of habitats from dry deserts, grasslands and savannas to cold temperate forests. The ability of ants to inhabit such diverse habitat ranges demonstrates their adaptability and ecological
-
The Ribosomal Operon Database (ROD): A full-length rDNA operon database extracted from genome assemblies bioRxiv. Genom. Pub Date : 2024-04-23 Anders Kristian Krabberod, Embla Stokke, Ella Thoen, Inger Skrede, Havard Kauserud
Current rDNA reference sequence databases are tailored towards shorter DNA markers, such as parts of the 16/18S marker or the ITS region. However, due to advances in long-read DNA sequencing technologies, longer stretches of the rDNA operon are increasingly used in environmental sequencing studies to increase the phylogenetic resolution. There is, therefore, a growing need for longer rDNA reference
-
Evaluation of sequence-based tools to gather more insight into the positioning of rhizogenic agrobacteria within the Agrobacterium tumefaciens species complex bioRxiv. Genom. Pub Date : 2024-04-23 Pablo Roberto Vargas Ribera, Nuri Kim, Marc Venbrux, Sergio Álvarez-Pérez, Hans Rediers
Rhizogenic Agrobacterium, the causative agent of hairy root disease (HRD), is known for its high phenotypic and genetic diversity. The taxonomy of rhizogenic agrobacteria has undergone several changes in the past and is still somewhat controversial. While the classification of Agrobacterium strains was initially mainly based on phenotypic properties and the symptoms they induced on plants, more and
-
Massively parallel reporter assays and mouse transgenic assays provide complementary information about neuronal enhancer activity bioRxiv. Genom. Pub Date : 2024-04-23 Michael Kosicki, Dianne Laboy Cintron, Nicholas F. Page, Ilias Georgakopoulos-Soares, Jennifer A. Akiyama, Ingrid Plajzer-Frick, Catherine S. Novak, Momoe Kato, Riana D. Hunter, Kianna von Maydell, Sarah Barton, Patrick Godfrey, Erik Beckman, Stephan J. Sanders, Len A. Pennacchio, Nadav Ahituv
Genetic studies find hundreds of thousands of noncoding variants associated with psychiatric disorders. Massively parallel reporter assays (MPRAs) and in vivo transgenic mouse assays can be used to assay the impact of these variants. However, the relevance of MPRAs to in vivo function is unknown and transgenic assays suffer from low throughput. Here, we studied the utility of combining the two assays
-
Optical genome mapping enables accurate repeat expansion testing bioRxiv. Genom. Pub Date : 2024-04-22 Bart van der Sanden, Kornelia Neveling, Syukri Shukor, Michael D Gallagher, Joyce Lee, Stephanie L Burke, Maartje Pennings, Ronald van Beek, Michiel Oorsprong, Ellen Kater-Baats, Eveline Kamping, Alide Tieleman, Nicol Voermans, Ingrid E Scheffer, Jozef Gecz, Mark Corbett, Lisenka ELM Vissers, Andy WC Pang, Alex Hastie, Erik-Jan Kamsteeg, Alexander Hoischen
Short tandem repeats (STRs) are among the most abundant classes of variation in human genomes and are meiotically and mitotically unstable, which leads to expansions and contractions. STR expansions are frequently associated with genetic disorders, with the size of expansions often correlating with the severity and age of onset. Therefore, being able to accurately detect the total repeat expansion
-
Single cell regulatory architecture of human pancreatic islets suggests sex differences in β cell function and the pathogenesis of type 2 diabetes. bioRxiv. Genom. Pub Date : 2024-04-22 Mirza Muhammad Fahd Qadir, Ruth M. Elgamal, Keijing Song, Parul Kudtarkar, Siva S.V.P. Sakamuri, Prasad V. Katakam, Samir El-Dahr, Jay K. Kolls, Kyle J. Gaulton, Franck Mauvais-Jarvis
Biological sex affects the pathogenesis of type 2 and type 1 diabetes (T2D, T1D), including the development of β cell failure, which is observed more often in males. The mechanisms that drive sex differences in β cell failure are unknown. Studying sex differences in islet regulation and function represents a unique avenue to understand the sex-specific heterogeneity in β cell failure in diabetes. Here, we examined
-
Droplet Hi-C for Fast and Scalable Profiling of Chromatin Architecture in Single Cells bioRxiv. Genom. Pub Date : 2024-04-22 Lei Chang, Yang Xie, Brett Taylor, Zhaoning Wang, Jiachen Sun, Tuyet R Tan, Rafael Bejar, Clark C Chen, Frank B Furnari, Ming Hu, Bing Ren
Comprehensive analysis of chromatin architecture is crucial for understanding the gene regulatory programs during development and in disease pathogenesis, yet current methods often inadequately address the unique challenges presented by analysis of heterogeneous tissue samples. Here, we introduce Droplet Hi-C, which employs a commercial microfluidic device for high-throughput, single-cell chromatin
-
Utility Analyses of AVITI Sequencing Chemistry bioRxiv. Genom. Pub Date : 2024-04-22 Silvia Liu, Carolyn Obert, Yanping Yu, Junhua Zhao, Baoguo Ren, Kelly Wiseman, Benjamin J Krajacich, Wenjia Wang, Kyle Metcalfe, Mat Smith, Tuval Ben-Yehezkel, Jianhua Luo
Background: DNA sequencing is a critical tool in modern biology. Over the last two decades, it has been revolutionized by the advent of massively parallel sequencing, leading to significant advances in the genome and transcriptome sequencing of various organisms. Nevertheless, challenges with accuracy, lack of competitive options and prohibitive costs associated with high throughput parallel short-read
-
Transposable elements in a cold-tolerant fly species, Drosophila montana: a link to adaptation to the harsh cold environments bioRxiv. Genom. Pub Date : 2024-04-22 Mohadeseh S Tahami, Carlos Vargas-Chavez, Noora Poikela, Marta Coronado-Zamora, Josefa Gonzalez, Maaria Kankare
Background: Substantial discoveries during the past century have revealed that transposable elements (TEs) can play a crucial role in genome evolution by affecting gene expression and inducing genetic rearrangements, among other molecular and structural effects. Yet, our knowledge of the role of TEs in adaptation to extreme climates is still in its infancy. The availability of long-read sequencing has
-
A de novo variant in PAK2 detected in an individual with Knobloch type 2 syndrome bioRxiv. Genom. Pub Date : 2024-04-22 Elizabeth A Werren, Louisa Kalsner, Jessica Ewald, Michael Peracchio, Cameron King, Purva Vats, Peter A Audano, Peter N. Robinson, Mark D Adams, Melissa A Kelly, Adam P Matson
P21-activated kinase 2 (PAK2) is a serine/threonine kinase essential for a variety of cellular processes including signal transduction, cellular survival, proliferation, and migration. A recent report proposed that monoallelic PAK2 variants cause Knobloch syndrome type 2 (KNO2), a developmental disorder primarily characterized by ocular anomalies. Here, we identified a novel de novo heterozygous missense
-
Reconstructing prehistoric viral genomes from Neanderthal sequencing data bioRxiv. Genom. Pub Date : 2024-04-21 Renata C. Ferreira, Gustavo V. Alves, Marcello Ramon, Fernando Martins Antoneli, Marcelo RS Briones
DNA viruses that produce persistent infections have been proposed as potential causes for the extinction of Neanderthals and therefore, the identification of viral genome remnants in Neanderthal sequencing reads is an initial step to address this hypothesis. Here, as proof of concept, we searched for viral remnants in sequencing reads of Neanderthal genome data by mapping to adenovirus, herpesvirus
-
Long-read sequencing and structural variant characterization in 1,019 samples from the 1000 Genomes Project bioRxiv. Genom. Pub Date : 2024-04-20 Siegfried Schloissnig, Samarendra Pani, Bernardo Rodriguez-Martin, Jana Ebler, Carsten Hain, Vasiliki Tsapalou, Arda Soylev, Patrick Huether, Hufsah Ashraf, Timofey Prodanov, Mila Asparuhova, Sarah Hunt, Tobias Rausch, Tobias Marschall, Jan O Korbel
Structural variants (SVs) contribute significantly to human genetic diversity and disease. Previously, SVs have remained incompletely resolved by population genomics, with short-read sequencing facing limitations in capturing the whole spectrum of SVs at nucleotide resolution. Here we leveraged nanopore sequencing to construct an intermediate coverage resource of 1,019 long-read genomes sampled within
-
Unraveling the phylogenetic signal of gene expression from single-cell RNA-seq data bioRxiv. Genom. Pub Date : 2024-04-20 Joao M. Alves, Laura Tomas, David Posada
Single-cell RNA sequencing (scRNA-seq) has transformed our understanding of phenotypic heterogeneity. Although the predominant focus of scRNA-seq analyses has been assessing gene expression changes, several approaches have been proposed in recent years to identify changes at the DNA level from scRNA-seq data. In this study, we evaluated the relative performance of six strategies for calling single-nucleotide
-
A high-continuity and annotated reference genome of allotetraploid Siberian wildrye (Elymus sibiricus L., Poaceae: Triticeae) bioRxiv. Genom. Pub Date : 2024-04-20 Jiajun Yan, Xinrui Li, Lili Wang, Daxu Li, Changmian Ji, Zujun Yang, Lili Chen, Changbing Zhang, Minghong You, Lijun Yan, Wenlong Gou, Xiong Lei, Xiaofei Ji, Yingzhu Li, Qi Wu, Decai Mao, Dan Chang, Shangang Jia, Ping Li, Jianbo Zhang, Yanli Xiong, Yi Xiong, Mengli Han, Zhao Chen, Xinchao Cheng, Juan Tang, Wengang Xie, Wenhui Liu, Hongkun Zheng, Xiao Ma, Xuebing Yan, Shiqie Bai
Elymus sibiricus L. (Siberian wildrye, Es), a species belonging to the wheat tribe, is extensively employed as forage and for the reclamation of degraded grasslands within the Qinghai-Tibet Plateau (QTP). This study provides a high-quality reference genome assembly for the allotetraploid Es, which is composed of 14 pseudomolecules with a total genome size of 6.57 Gb. Our findings suggest that large-scale
-
Comprehensive Whole Genome Sequencing Reveals Origins of Mutational Signatures Associated with Aging and Temozolomide Chemotherapy bioRxiv. Genom. Pub Date : 2024-04-20 Taejoo Hwang, Lukasz Karol Sitko, Ratih Khoirunnisa, Fernanda Navarro Aguad, David M Samuel, Hajoong Park, Banyoon Cheon, Luthfiyyah Mutsnaini, Jaewoong Lee, Shunichi Takeda, Semin Lee, Dmitri Ivanov, Anton Gartner
In a comprehensive study to decipher the multi-layered response to the chemotherapeutic agent temozolomide (TMZ), we analyzed 427 genomes and determined mutational patterns in a collection of ~40 isogenic DNA repair-deficient human TK6 lymphoblast cell lines. We demonstrate that the spontaneous mutational background is very similar to the aging-associated mutational signature SBS40 and mainly caused
-
Integrative machine learning approaches for predicting disease risk using multi-omics data from the UK Biobank bioRxiv. Genom. Pub Date : 2024-04-20 Oscar Thomas Aguilar, Cheng Chang, Elsa Bismuth, Manuel A Rivas
We train prediction and survival models using multi-omics data for disease risk identification and stratification. Existing work on disease prediction focuses on risk analysis using datasets of individual data types (metabolomic, genomics, demographic), while our study creates an integrated model for disease risk assessment. We compare machine learning models such as Lasso Regression, Multi-Layer Perceptron
-
Healthy carriage of Salmonella within cattle lymph nodes is a key source of ground beef contamination with strains of clinical significance bioRxiv. Genom. Pub Date : 2024-04-20 Enrique Jesus Delgado-Suarez, Abril Viridiana Garcia-Meneses, Elfrego Adrian Ponce-Hernandez, Cindy Fabiola Hernandez-Perez, Maria Salud Rubio-Lozano, Nayarit Emerita Ballesteros-Nova, Orbelin Soberanis-Ramos
This study assessed the genetic relatedness, evolutionary dynamics, and virulence profile of Salmonella isolated from lymph nodes and ground beef from apparently healthy cattle over two years. For this purpose, we used a set of isolates of nine different serovars: Anatum (n=23), Reading (n=22), Typhimurium (n=10), London (n=9), Kentucky (n=6), Fresno (n=4), Give, Muenster, and monophasic 1,4,[5],12:i-
-
JUND plays a genome-wide role in the quiescent to contractile switch in the pregnant human myometrium bioRxiv. Genom. Pub Date : 2024-04-20 Nawrah Khader, Anna Dorogin, Oksana Shynlova, Jennifer A. Mitchell
The myometrium, the muscular layer of the uterus, undergoes crucial transitions during pregnancy, maintaining quiescence throughout gestation, and generating coordinated contractions during labor. Dysregulation of this transition can lead to premature labor with serious complications for the infant. Despite extensive gene expression data available for varying myometrial states, the molecular mechanisms
-
Phylogenomic reconstruction of Cryptosporidium spp. captured directly from clinical samples reveals extensive genetic diversity bioRxiv. Genom. Pub Date : 2024-04-20 Asis Khan, Eliza V. C. Alves-Ferreira, Helena Vogel, Senyo Botchie, Irene Ayi, Mattie Pawlowic, Guy Robinson, Rachel M. Chalmers, Hernan Lorenzi, Michael E. Grigg
Cryptosporidium is a leading cause of severe diarrhea and mortality in young children and infants in Africa and southern Asia. More than twenty Cryptosporidium species infect humans, of which C. parvum and C. hominis are the major agents causing moderate to severe diarrhea. Relatively few genetic markers are typically applied to genotype and/or diagnose Cryptosporidium. Most infections produce limited
-
Genetic variation at transcription factor binding sites largely explains phenotypic heritability in maize bioRxiv. Genom. Pub Date : 2024-04-20 Julia Engelhorn, Samantha J Snodgrass, Amelie Kok, Arun S Seetharam, Michael Schneider, Tatjana Kiwit, Ayush Singh, Michael Banf, Merritt Khaipho-Burch, Daniel E Runcie, Victor A. Sanchez-Camargo, J Vladimir Torres-Rodriguez, Guangchao Sun, Maike Stam, Fabio Fiorani, Sebastian Beier, James C Schnable, Hank W. Bass, Matthew B Hufford, Benjamin Stich, Wolf B Frommer, Jeffrey Ross-Ibarra, Thomas Hartwig
Comprehensive maps of functional variation at transcription factor (TF) binding sites (cis-elements) are crucial for elucidating how genotype shapes phenotype. Here we report the construction of a pancistrome of the maize leaf under well-watered and drought conditions. We quantified haplotype-specific TF footprints across a pangenome of 25 maize hybrids and mapped over two-hundred thousand genetic
-
Overcoming Limitations to Deep Learning in Domesticated Animals with TrioTrain bioRxiv. Genom. Pub Date : 2024-04-20 Jenna Kalleberg, Jacob Rissman, Robert Schnabel
Variant calling across diverse species remains challenging as most bioinformatics tools default to assumptions based on human genomes. DeepVariant (DV) excels without joint genotyping while offering fewer implementation barriers. However, the growing appeal of a "universal" algorithm has magnified the unknown impacts when used with non-human genomes. Here, we use bovine genomes to assess the limits
-
Decoupled genetic and epigenetic variation in the montane endemic Erodium cazorlanum (Geraniaceae) and a widespread congener bioRxiv. Genom. Pub Date : 2024-04-20 Ruben Martin-Blazquez, Monica Medrano, Pilar Bazaga, Francisco Balao, Ovidiu Paun, Conchita Alonso
Epigenetic states offer an additional layer of variation besides genetic polymorphism that contributes to phenotypic variation and may arise either randomly or in response to environmental factors. We hypothesize that closely related species with different life histories and habitat requirements could show distinct patterns of intraspecific epigenetic variation. We used Restriction-site Associated DNA
-
Adaptation across a precipitation gradient from niche center to niche edge bioRxiv. Genom. Pub Date : 2024-04-20 Samantha vanDeurs, Oliver Reutimann, Hirzi Luqman, Dikla Lifshitz, Einav Mayzlish Gati, Jake Alexander, Simone Fior
Evaluating the potential for species to adapt to changing climate relies on an appreciation of current patterns of adaptive variation and selection, which might vary in intensity across the niche of a species, affecting our inference of where adaptation might be most important in the future. Here we investigate the genetic basis of adaptation in Lactuca serriola, the wild relative of the common lettuce
-
Parasite genetic variation and systemic immune responses are not associated with different clinical presentations of cutaneous leishmaniasis caused by Leishmania aethiopica bioRxiv. Genom. Pub Date : 2024-04-19 Endalew Yizengaw, Yegnasew Takele, Susanne U. Franssen, Bizuayehu Gashaw, Mulat Yimer, Emebet Adem, Endalkachew Nibret, Gizachew Yismaw, Edward Cruz Cervera, Kefale Ejigu, Dessalegn Tamiru, Abaineh Munshea, Ingrid Muller, Richard Weller, James Cotton, Pascale Kropf
Cutaneous leishmaniasis (CL) is a neglected tropical skin disease, caused by the protozoan parasite Leishmania (L.). It is endemic to 90 countries and causes >200,000 new infections each year. In Ethiopia, CL is mainly caused by L. aethiopica and can present in different clinical forms: localised cutaneous leishmaniasis (LCL); mucocutaneous leishmaniasis (MCL), where the mucosa of the nose and/or the
-
Massively parallel jumping assay decodes Alu retrotransposition activity bioRxiv. Genom. Pub Date : 2024-04-19 Nadav Ahituv, Navneet Matharu, Jingjing Zhao, Ajuni Sohota, Linbei Deng, Yan Hung, Zizheng Li, Jasmine Sims, Sawitree Rattanasopha, Josh Meyer, Lucia Carbone, Martin Kircher
The human genome contains millions of retrotransposons, several of which could become active due to somatic mutations having phenotypic consequences, including disease. However, it is not thoroughly understood how nucleotide changes in retrotransposons affect their jumping activity. Here, we developed a novel massively parallel jumping assay (MPJA) that can test the jumping potential of thousands of
-
DNA virus infections shape transposable elements activity in vitro and in vivo bioRxiv. Genom. Pub Date : 2024-04-19 Jiang Tan, Vedran Franke, Eva Neugebauer, Justine Lagisquet, Anna Katharina Kuderna, Stephanie Walter, Markus Landthaler, Armin Ensser, Thomas Stamminger, Thomas Gramberg, Altuna Akalin, Emanuel Wyler, Florian Full
Transposable elements (TEs) are implicated in a variety of processes including placental and preimplantation development and a variety of human diseases. TEs are known to be activated in the context of viral infections, but the mechanisms and consequences are not understood. We show strong activation of TEs upon DNA virus infection, in particular the MLT- and THE1-class of LTR-containing retrotransposons
-
Wampee chromosome-level reference genome elucidates fruit sugar-acid metabolism bioRxiv. Genom. Pub Date : 2024-04-19 Huiqiong Chen, Jingxuan Wang, Xiangfeng Wang, Cheng Peng, Xiaoxiao Chang, Zhe Chen, Bowen Yang, Xinrui Wang, Jishui Qiu, Li Guo, Yusheng Lu
Wampee (Clausena lansium) is an economically significant subtropical fruit tree widely cultivated in Southern China. High-quality genomic resources are unavailable, but they are essential for functional genomics and germplasm enhancement of wampee. Here, we provide a chromosome-level genome sequence for the wampee cultivar JinFeng and a population genomic analysis of 266 accessions. The 297.1 Mb wampee
-
Large-scale single-virus genomics uncovers hidden diversity of river water viruses and diversified gene profiles bioRxiv. Genom. Pub Date : 2024-04-19 Yohei Nishikawa, Ryota Wagatsuma, Yuko Tsukada, Lin Chia-ling, Rieka Chijiiwa, Masahito Hosokawa, Haruko Takeyama
Environmental viruses (primarily bacteriophages) are widely recognized as playing an important role in ecosystem homeostasis through the infection of host cells. However, the majority of environmental viruses are still unknown as their mosaic structure and frequent mutations in their sequences hinder genome construction in current metagenomics. To enable the large-scale acquisition of environmental
-
Robust differential expression testing for single-cell CRISPR screens at low multiplicity of infection bioRxiv. Genom. Pub Date : 2024-04-18 Timothy Barry, Kaishu Mason, Kathryn Roeder, Eugene Katsevich
Single-cell CRISPR screens (perturb-seq) link genetic perturbations to phenotypic changes in individual cells. The most fundamental task in perturb-seq analysis is to test for association between a perturbation and a count outcome, such as gene expression. We conduct the first-ever comprehensive benchmarking study of association testing methods for low multiplicity-of-infection (MOI) perturb-seq data
-
T7 DNA polymerase treatment improves quantitative sequencing of both double-stranded and single-stranded DNA viruses bioRxiv. Genom. Pub Date : 2024-04-18 Maud Billaud, Ilias Theodorou, Quentin Lamy-Besnier, Shiraz A Shah, Francois Lecointe, Luisa De Sordi, Marianne De Paepe, Marie-Agnes Petit
Background: Bulk microbiome, as well as virome-enriched shotgun sequencing only reveals the double-stranded DNA (dsDNA) content of a given sample, unless specific treatments are applied. However, genomes of viruses often consist of a circular single-stranded DNA (ssDNA) molecule. Pre-treatment and amplification of DNA using the multiple displacement amplification (MDA) method enables conversion of
-
A Foundational Large Language Model for Edible Plant Genomes bioRxiv. Genom. Pub Date : 2024-04-18 Javier Mendoza-Revilla, Evan Trop, Liam Gonzalez, Masa Roller, Hugo Dalla-Torre, Bernardo P de Almeida, Guillaume Richard, Jonathan Caton, Nicolas Lopez Carranza, Marcin Skwark, Alex Laterre, Karim Beguir, Thomas Pierrot, Marie Lopez
Significant progress has been made in the field of plant genomics, as demonstrated by the increased use of high-throughput methodologies that enable the characterization of multiple genome-wide molecular phenotypes. These findings have provided valuable insights into plant traits and their underlying genetic mechanisms, particularly in model plant species. Nonetheless, effectively leveraging them to
-
Enhancer-driven cell type comparison reveals similarities between the mammalian and bird pallium bioRxiv. Genom. Pub Date : 2024-04-18 Nikolai Hecker, Niklas Kempynck, David Mauduit, Darina Abaffyová, Roel Vandepoel, Sam Dieltiens, Ioannis Sarropoulos, Carmen Bravo González-Blas, Elke Leysen, Rani Moors, Gert Hulselmans, Lynette Lim, Joris De Wit, Valerie Christiaens, Suresh Poovathingal, Stein Aerts
Combinations of transcription factors govern the identity of cell types, which is reflected by enhancer codes in cis-regulatory genomic regions. Cell type-specific enhancer codes at nucleotide-level resolution have not yet been characterized for the mammalian neocortex. It is currently unknown whether these codes are conserved in other vertebrate brains, and whether they are informative to resolve
-
Fast and flexible joint fine-mapping of multiple traits via the Sum of Single Effects model bioRxiv. Genom. Pub Date : 2024-04-18 Yuxin Zou, Peter Carbonetto, Dongyue Xie, Gao Wang, Matthew Stephens
We introduce mvSuSiE, a multi-trait fine-mapping method for identifying putative causal variants from genetic association data (individual-level or summary data). mvSuSiE learns patterns of shared genetic effects from data, and exploits these patterns to improve power to identify causal SNPs. Comparisons on simulated data show that mvSuSiE is competitive in speed, power and precision with existing
-
Dark side of the honeymoon: reconstructing the Asian x European rose breeding history through the lens of genomics bioRxiv. Genom. Pub Date : 2024-04-18 Thibault Leroy, Elise Albert, Tatiana Thouroude, Sylvie Baudino, Jean-Claude Caissard, Annie Chastellier, Jerome Chameau, Julien Jeauffre, Therese Loubert, Saretta Nindya Paramita, Alix Pernet, Vanessa Soufflet-Freslon, Cristiana Oghina-Pavie, Fabrice Foucher, Laurence Hibrand-Saint Oyant, Jeremy Clotault
Roses hold significant symbolic value in Western cultural heritage, often serving as a symbol of love and romance. Despite their ancient cultivation, the appreciation for the phenotypic diversity of roses emerged relatively recently, notably during the 19th century. This period is characterized by a remarkable expansion in the number of varieties, from around 100 to over 8,000, representing a golden
-
Genomic diversity and evolution in the Hawaiian Islands endemic Kokia (Malvaceae) bioRxiv. Genom. Pub Date : 2024-04-17 Ehsan Kayal, Mark A. Arick, Chuan-yu Hsu, Adam Thrash, Mitsuko Yorkston, Clifford W. Morden, Jonathan F. Wendel, Daniel G. Peterson, Corrinne E. Grover
Island species are highly vulnerable due to habitat destruction and their often small population sizes with reduced genetic diversity. The Hawaiian Islands constitute the most isolated archipelago on the planet, harboring many endemic species. Kokia is an endangered flowering plant genus endemic to these islands, encompassing three extant and one extinct species. Recent studies provided evidence of
-
GALEON: A Comprehensive Bioinformatic Tool to Analyse and Visualise Gene Clusters in Complete Genomes bioRxiv. Genom. Pub Date : 2024-04-17 Vadim A Pisarenco, Joel Vizueta, Julio Rozas
Gene clusters, defined as a set of genes encoding functionally-related proteins, are abundant in eukaryotic genomes. Despite the increasing availability of chromosome-level genomes, the comprehensive analysis of gene family evolution remains largely unexplored, particularly for large and highly dynamic gene families or those including very recent family members. These challenges stem from limitations
-
Sex-specific recombination landscape in a species with holocentric chromosomes bioRxiv. Genom. Pub Date : 2024-04-17 Sebastian Chmielewski, Mateusz Konczal, Jonathan Michael Parrett, Stephane Rombauts, Katarzyna Dudek, Jacek Radwan, Wieslaw Babik
The rate and chromosomal positioning of meiotic recombination significantly affect the distribution of genetic diversity in eukaryotic genomes. Many studies have revealed sex-specific recombination patterns, with male recombination typically biased toward chromosome ends, while female recombination is more evenly distributed along chromosomes, or concentrated near the chromosome center. It has
-
Interpretable and predictive models to harness the life science data revolution bioRxiv. Genom. Pub Date : 2024-04-17 Joshua P Jahner, C. Alex Buerkle, Dustin G Gannon, Eliza M Grames, S. Eryn McFarlane, Andrew Siefert, Katherine L Bell, Victoria L DeLeo, Matthew L Forister, Joshua G Harrison, Daniel C Laughlin, Amy C Patterson, Breanna F Powers, Chhaya M Werner, Isabella A Oleksy
The proliferation of high-dimensional biological data is kindling hope that life scientists will be able to fit statistical and machine learning models that are highly predictive and interpretable. However, large biological data are commonly burdened with an inherent trade-off: in-sample prediction will improve as additional predictors are included in the model, but this may come at the cost of poor
-
Expression of Most Retrotransposons in Human Blood Correlates with Biological Aging bioRxiv. Genom. Pub Date : 2024-04-17 Yi-Ting Tsai, Nogayhan Seymen, Ian Richard Thompson, Xinchen Zou, Warisha Mumtaz, Sila Gerlevik, Ghulam J. Mufti, Mohammad M Karimi
Retrotransposons (RTEs) have been postulated to reactivate with age and contribute to aging through an activated innate immune response and inflammation. Here, we systematically analyzed the relationship between RTE expression and aging using published transcriptomic and methylomic datasets of human blood. Despite no observed correlation between RTE activity and chronological age, most RTE classes and
-
Defective Integrator activity shapes the transcriptome of patients with multiple sclerosis bioRxiv. Genom. Pub Date : 2024-04-17 Yevhenia Porozhan, Mikkel Carstensen, Sandrine Thouroude, Mickael Costallat, Christophe Rachez, Eric Batsche, Thor Petersen, Tove Christensen, Christian Muchardt
While the role of the immune system in multiple sclerosis (MS) is widely acknowledged, our comprehension of the transcriptional foundations of this prevalent neurological disorder remains largely incomplete. Here, we conducted high-depth RNA sequencing on monocytes from a cohort of patients with MS to explore rare RNA species associated with the disease. In a subset of the patients, a markedly altered
-
DNA methylation differences between the female and male X chromosomes in human brain bioRxiv. Genom. Pub Date : 2024-04-17 Robert Morgan, Eddie Loh, Devika Singh, Isabel Mendizabal, Soojin Yi
The mechanisms of X chromosome inactivation suggest fundamental epigenetic differences between the female and male X chromosomes. However, DNA methylation studies often exclude the X chromosomes. In addition, many previous studies relied on techniques that examine non-randomly selected subsets of positions such as array-based methods, rather than assessing the whole X chromosome. Consequently, our
-
Enhancing transcriptome expression quantification through accurate assignment of long RNA sequencing reads with TranSigner bioRxiv. Genom. Pub Date : 2024-04-16 Hyun Joo Ji, Mihaela Pertea
Recently developed long-read RNA sequencing technologies promise to provide a more accurate and comprehensive view of transcriptomes compared to short-read sequencers, primarily due to their capability to achieve full-length sequencing of transcripts. However, realizing this potential requires computational tools tailored to process long reads, which exhibit a higher error rate than short reads. Existing
-
Chromosome-level baobab (Adansonia digitata) genome illuminates its evolutionary insights bioRxiv. Genom. Pub Date : 2024-04-16 Justine K Kitony, Kelly Colt, Bradley W Abra, Nolan T Hartwick, Semar Petrus, Emad Hassan Konozy, Nisa Karimi, Levi Yant, Todd P Michael
Baobab, Adansonia digitata, is a long-lived tree endemic to Africa that holds great economic, ecological, and cultural value. However, our knowledge of its genomic features, evolutionary history, and diversity is limited, rendering it orphaned scientifically. We generated a haploid chromosome-level reference genome anchored into 42 chromosomes for A. digitata, as well as draft assemblies for a sibling
-
Development and Implementation of a Core Genome Multilocus Sequence Typing (cgMLST) scheme for Haemophilus influenzae bioRxiv. Genom. Pub Date : 2024-04-16 Made Ananda Krisna, Keith A. Jolley, William Monteith, Alexandra Boubour, Raph L. Hamers, Angela B. Brueggemann, Odile B. Harrison, Martin C. J. Maiden
Haemophilus influenzae is part of the human nasopharyngeal microbiota and a pathogen causing invasive disease. The extensive genetic diversity observed in H. influenzae necessitates discriminatory analytical approaches to evaluate its population structure. This study developed a core genome MLST (cgMLST) scheme for H. influenzae using pangenome analysis tools and validated the cgMLST scheme using datasets
-
Genomic signature of reproductive isolation between the last two remnant populations of Torrey pine (Pinus torreyana Parry) bioRxiv. Genom. Pub Date : 2024-04-16 Lionel N Di Santo, Alayna Mead, Jessica Wright, Jill Hamilton
Understanding the genomic mechanisms contributing to speciation requires studies of taxa that are in the early stages of divergence, before complete reproductive isolation has evolved. One of the rarest pines in the world, Torrey Pine (Pinus torreyana Parry), persists naturally across one island and one mainland population in southern California, and is an ideal system for assessing the evolution of
-
Genotyping of paired clinical isolates using PvCSP, PvMSP3α, PvMSP3β and exploring STRs to differentiate between relapse and reinfection in P. vivax bioRxiv. Genom. Pub Date : 2024-04-16 Deepali Savargaonkar, Renuka Gahtori, Swati Sinha, Preeti Kumari, Paras Mahale, Bina Srivastava, Veena Pande, Himmat Singh Pawar, Anupkumar R Anvikar
The challenge of eliminating vivax malaria is due to the relapses caused by hypnozoites. Despite several attempts to identify molecular markers to differentiate between relapse and new infection, a reliable marker has not yet been established. To address this issue, a genomic study was conducted on paired samples of patients who had experienced Plasmodium vivax infection twice. Genotyping was performed
-
The recombination landscape of the barn owl, from families to populations bioRxiv. Genom. Pub Date : 2024-04-16 Alexandros Topaloudis, Eléonore Lavanchy, Tristan Cumer, Anne-Lyse Ducrest, Celine Simon, Ana Paula Machado, Nika Paposhvili, Alexandre Roulin, Jérôme Goudet
Homologous recombination is a meiotic process that generates diversity along the genome and interacts with all evolutionary forces. Despite its importance, studies of recombination landscapes are lacking due to methodological limitations and a dearth of appropriate data. Linkage mapping based on familial data gives unbiased sex-specific broad-scale estimates of recombination while linkage disequilibrium
-
Inference of Locus-Specific Population Mixtures From Linked Genome-Wide Allele Frequencies bioRxiv. Genom. Pub Date : 2024-04-16 Carlos S Reyna-Blanco, Madleina Caduff, Marco Galimberti, Christoph Leuenberger, Daniel Wegmann
Admixture between populations and species is common in nature. Since the influx of new genetic material might be either facilitated or hindered by selection, variation in mixture proportions along the genome is expected in organisms undergoing recombination. Various graph-based models have been developed to better understand these evolutionary dynamics of population splits and mixtures. However, current
-
Discovering non-additive heritability using additive GWAS summary statistics bioRxiv. Genom. Pub Date : 2024-04-15 Samuel Pattillo Smith, Gregory Darnell, Dana Udwin, Julian Stamp, Arbel Harpak, Sohini Ramachandran, Lorin Crawford
LD score regression (LDSC) is a method to estimate narrow-sense heritability from genome-wide association study (GWAS) summary statistics alone, making it a fast and popular approach. In this work, we present interaction-LD score (i-LDSC) regression: an extension of the original LDSC framework that accounts for interactions between genetic variants. By studying a wide range of generative models in
-
Molecular responses of chicken embryos to maternal heat stress through DNA methylation and gene expression bioRxiv. Genom. Pub Date : 2024-04-15 Keyvan Karami, Jules Sabban, Chloe Cerutti, Guillaume Devailly, Sylvain Foissac, David Gourichon, Alexandre Hubert, Jean-Noel Hubert, Sophie Leroux, Tatiana Zerjal, Sandrine Lagarrigue, Frederique Pitel
Climate change, with its repercussions on agriculture, is one of the most important adaptation challenges for livestock production. Poultry production is a major source of proteins for human consumption all over the world. With a growing human population, improving poultry adaptation to environmental constraints becomes critical. Extensive evidence highlights the influence of environmental variations
-
Chromatin accessibility variation provides insights into missing regulation underlying immune-mediated diseases bioRxiv. Genom. Pub Date : 2024-04-15 Raehoon Jeong, Martha L Bulyk
Most genetic loci associated with complex traits and diseases through genome-wide association studies (GWAS) are noncoding, suggesting that the causal variants likely have gene regulatory effects. However, only a small number of loci have been linked to expression quantitative trait loci (eQTLs) detected currently. To better understand the potential reasons for many trait-associated loci lacking eQTL