CBCB Seminar

March 4, 2024 3:30 PM

Ammon-Pinizzotto Biopharmaceutical Innovation (BPI) Building
Conference Room 140

Functional Annotation of Glycan Motifs via Glycosylation Enzymes and the GlyGen GlycoTree Sandbox

Nathan Edwards, PhD

Associate Professor
Department of Biochemistry and Molecular & Cellular Biology, and
Clinical and Translational Glycoscience Research Center
Georgetown University

Abstract: The GlyGen GlycoMotif data-resource integrates nearly 700 glycan motifs and determinants collected from eleven different glycoinformatics resources, provides precomputed semantic alignments with GlyTouCan glycan structures, and hosts the actively curated set of glycan motifs for the GlyGen glycan knowledge-base. Despite glycans’ significant structural heterogeneity, recurring glycan motifs and determinants are understood to be responsible for driving glycans’ cell- and protein-binding activity and thus their functional role in specific biological contexts. We seek to explore the inference of glycan motif function based on the phenotypes, cell-types, and other functional annotations associated with the glycoenzymes required for synthesis of glycans containing a motif. The GlyGen GlycoTree Sandbox associates human and mouse glycosyltransferases with the monosaccharide residues of GlyGen N-linked and O-linked glycan structures. GlycoMotif motif alignments to corresponding GlyGen structures make it possible to map Sandbox glycosyltransferases on structures’ residues to the corresponding residues of the motif. Specific motif residues may be annotated repeatedly by the same enzymes or by different enzymes by virtue of motif alignments to a variety of structures and multiple placements on a single structure. A key computational innovation necessary to transfer GlycoTree Sandbox enzyme annotations is the use of canonical monosaccharide IDs, together with motif-to-structure and structure-to-structure alignments that identify corresponding residues. For structure-to-structure alignments, a single bijection between the structures’ monosaccharides is returned, while for motif-to-structure alignments, pairs of corresponding residues for all possible placements of the motif on the structure are computed.
The GlycoMotif glycan motif data-resource integrates glycoenzyme annotations with each motif, and provides enzyme pages for each glycoenzyme showing the motifs associated with at least one structure they help to synthesize. Enzymes are annotated with gene names, UniProt and MGI accessions, and links to functional annotation resources such as the International Mouse Phenotype Consortium (IMPC). Human glycoenzymes implicated in congenital disorders of glycosylation (Freeze et al., 2014) are also shown. Other functional annotations associated with genes, notably those mined from large-scale gene-expression data-resources such as the Genotype-Tissue Expression Project (GTEx), suggest tissue- and/or cell-type-specific expression of glycoenzymes, and may suggest such specificity for glycan structures containing specific glycan motifs.
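
The residue-level annotation transfer described in the abstract can be illustrated with a small sketch. This is not the GlyGen or GlycoMotif implementation: it assumes toy glycans stored as rooted trees keyed by residue ID, it matches only on monosaccharide name (ignoring linkage, anomer, and canonical monosaccharide IDs), and the function names `placements` and `transfer` and the enzyme labels `ENZ_A`, `ENZ_B`, and `ENZ_C` are hypothetical placeholders.

```python
from itertools import permutations

# Toy glycan: residue id -> (monosaccharide name, list of child residue ids).
# Linkages and anomers are deliberately ignored in this sketch.

def placements(motif, motif_root, struct):
    """Yield every mapping {motif residue id: structure residue id}."""
    def match(m, s):
        m_sugar, m_kids = motif[m]
        s_sugar, s_kids = struct[s]
        if m_sugar != s_sugar:
            return
        # Try each injective assignment of motif children to structure children.
        for chosen in permutations(s_kids, len(m_kids)):
            partial = [{m: s}]
            for mk, sk in zip(m_kids, chosen):
                partial = [{**p, **q} for p in partial for q in match(mk, sk)]
            yield from partial

    # The motif root may be placed on any residue of the structure.
    for s in struct:
        yield from match(motif_root, s)

def transfer(motif, motif_root, struct, enzymes):
    """Collect, per motif residue, the enzymes seen across all placements."""
    annotated = {m: set() for m in motif}
    for placement in placements(motif, motif_root, struct):
        for m, s in placement.items():
            annotated[m] |= enzymes.get(s, set())
    return annotated

# Hypothetical structure fragment with placeholder enzyme annotations.
struct = {
    1: ("GlcNAc", [2]),
    2: ("GlcNAc", [3]),
    3: ("Man", [4, 5]),
    4: ("Man", []),
    5: ("Man", []),
}
enzymes = {3: {"ENZ_A"}, 4: {"ENZ_B"}, 5: {"ENZ_C"}}

# Two-residue Man-Man motif; it can be placed on the structure in two ways.
motif = {"a": ("Man", ["b"]), "b": ("Man", [])}

all_placements = list(placements(motif, "a", struct))
motif_enzymes = transfer(motif, "a", struct, enzymes)
```

In this toy case the motif has two placements on one structure, so motif residue `b` collects two different enzyme labels while residue `a` is annotated twice with the same one, mirroring the repeated and differing annotations the abstract mentions.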

Bio: Dr. Nathan Edwards is an Associate Professor in the Department of Biochemistry and Molecular & Cellular Biology at Georgetown University and Co-director of the Clinical and Translational Glycoscience Research Center, also at Georgetown. Dr. Edwards received his Ph.D. in 2000 from Cornell University in the field of Operations Research, and then joined Celera Genomics’ Informatics Research group, working in proteomics informatics, and Applied Biosystems, before transitioning back to academia at the University of Maryland, College Park, and then Georgetown University in 2008. His research interests include proteomics and glycoproteomics informatics, particularly peptide and glycopeptide identification from tandem mass spectra, and more generally glycoinformatics. Dr. Edwards has been involved in the GlyGen Glycan Knowledgebase and Informatics Portal project since its inception in 2017, developing and supporting a variety of public-facing subprojects in support of GlyGen, including the GlycoMotif glycan motif site; the GNOme glycan subsumption and naming ontology; and the GlyGen Sandbox for mapping glycan structures to glycosylation pathways and enzymes.