Research Associate Professor
Department of Computer & Information Sciences
University of Delaware (UD)
Adjunct Assistant Professor
Department of Biochemistry and Molecular & Cell Biology
Georgetown University Medical Center (GUMC)
Contact
Office: 205 Delaware Biotechnology Institute
Mailing address:
Delaware Biotechnology Institute
590 Avenue 1743, Suite 147
Newark, DE 19713
Phone: (302) 831-3444
FAX: (302) 831-4841
E-Mail: arighi@dbi.udel.edu
Web Site: Protein Information Resource
Education
B.S.: University of Buenos Aires, Argentina
Ph.D.: University of Buenos Aires, Argentina
Postdoctoral: National Institutes of Health, USA
Previous Positions
Research Assistant, Department of Biological Chemistry, School of Pharmacy and Biochemistry, University of Buenos Aires, Argentina (1992-1996)
Postdoctoral Fellow, Laboratory of Dr. Juan S. Bonifacino, Cell Biology and Metabolism Branch, National Institute of Child Health and Human Development, NIH, MD (2001-2005)
Research Assistant Professor, Department of Biochemistry and Molecular & Cell Biology, GUMC (2005-2009)
Research Assistant Professor, Center for Bioinformatics and Computational Biology, Department of Computer and Information Sciences, UD (2009-2012)
Honors, Professional Appointments and Activities
Honors
PEW Trust Latin American Fellowship for the Biomedical Sciences (2001-2003)
National Institutes of Health Visiting Fellowship (2003-2005)
Fellowship to attend Computational and Comparative Genomic Course, Cold Spring Harbor Lab (2008)
Travel expenses to chair the PIR workshop at Plant and Animal Genome XVIII meeting (2010-2011)
Board and Committees
Chair of the Executive Committee of the International Society for Biocuration, 2017-present
Member of the Faculty Senate Library Committee, University of Delaware, 2017-present
Member of the Editorial Board of Database: The Journal of Biological Databases and Curation, 2014-present
Member of PIR Executive Board, 2010-present
Member of the Executive Committee of the International Society for Biocuration, 2015-present, Secretary 2016-2017.
Member of the International Society for Computational Biology (ISCB)
Bioinformatics PhD Thesis Committee CBCB, University of Delaware, 2015
PhD Thesis Committee Department of Chemistry, University of Delaware, 2015-present
Bioinformatics Master Thesis Committee CBCB, University of Delaware, 2013
Statistics Master Thesis Committee Georgetown University, 2008
Scientific and organizing committee of BioCreative IV-2013 (Co-Chair); BioCreative-2012 (Co-Chair); BioCreative III-2010 (Co-Chair); Text mining session at ISB 2010 and ISB 2015 (Co-Chair); BioCreative workshop at ISB 2012 (Co-Chair)
Grant Ad hoc Reviews: External grant reviewer for the Swiss Science Foundation (2013)
Special Journal Issues and Conference Proceedings: Co-Editor of BioCreative special issues in BioMed Central (BioCreative III, 2011) and Database (BioCreative workshop 2012, BioCreative IV, BioCreative V, and BioCreative VI)
Ad hoc Reviewer: BMC Bioinformatics, BMC Genomics, J Proteomics Bioinformatics, PLoS, Nucleic Acids Research, Faculty 1000, Database.
Teaching: involved in teaching multiple graduate-level courses, 2005-present.
Course Director of BCHB521 Bioinformatics at GUMC (2013-2014)
Proteomics Workshop, FAES, NIH. Instructor (2006-2013)
Bioinformatics Workshop. FAES, NIH. Instructor (2008-2013)
PIR Workshop at the ISCB Youth Bioinformatics Workshop at George Mason University (May 5, 2016)
Course Director of BINF644 Introduction to Bioinformatics at UD (spring 2018-present)
Research interest
My general interest is in the accurate representation of protein information (e.g., sequence, evolution, function, post-translational modifications, and pathways) in a form that both humans and computers can reason over, to provide the basis for hypothesis generation. I work within the framework of many international and interdisciplinary consortia, such as UniProt (lead of the curation group and text mining efforts at PIR), Protein Ontology (lead of the curation team and workshop organizer), and BioCreative (current PI of the NIH BioCreative Conference grant and, since 2009, workshop organizer, organizer of the User Interactive Text Mining track, and user advisory group chair). Activities in my group include (i) database curation and bibliography mapping (UniProtKB); (ii) curation of proteoforms for the Protein Ontology; (iii) the development, in collaboration with Dr. Vijay Shanker and the text mining group, and evaluation of natural language processing tools to assist researchers in retrieving information about genes, proteins, and miRNAs; and (iv) bioinformatics/text mining support for other research groups.
Publications
Full list of publications available here
Selected list based on projects (* equal contribution to work)
UniProt:
- Poux S*, Arighi CN*, Magrane M*, Bateman A, Wei CH, Lu Z, Boutet E, Bye-A-Jee H, Famiglietti ML, Roechert B, UniProt Consortium T. On expert curation and scalability: UniProtKB/Swiss-Prot as a case study. Bioinformatics (Oxford, England). 2017; 33(21):3454-3460. PMID: 29036270
- UniProt: the universal protein knowledgebase. Nucleic acids research. 2017; 45(D1):D158-D169. PMID: 27899622, PMCID: PMC5210571
- Poux S*, Magrane M*, Arighi CN*, Bridge A, O’Donovan C, Laiho K. Expert curation in UniProtKB: a case study on dealing with conflicting and erroneous data. Database: the journal of biological databases and curation. 2014; 2014:bau016. PMID: 24622611, PMCID: PMC3950660
Protein Ontology:
- Natale DA, Arighi CN, Blake JA, Bona J, Chen C, Chen SC, Christie KR, Cowart J, D’Eustachio P, Diehl AD, Drabkin HJ, Duncan WD, Huang H, Ren J, Ross K, Ruttenberg A, Shamovsky V, Smith B, Wang Q, Zhang J, El-Sayed A, Wu CH. Protein Ontology (PRO): enhancing and scaling up the representation of protein entities. Nucleic acids research. 2017; 45(D1):D339-D346. PMID: 27899649, PMCID: PMC5210558
- Ross KE, Natale DA, Arighi C, Chen SC, Huang H, Li G, Ren J, Wang M, Vijay-Shanker K, Wu CH. Scalable Text Mining Assisted Curation of Post-Translationally Modified Proteoforms in the Protein Ontology. CEUR workshop proceedings. 2016; 1747. PMID: 28706471, PMCID: PMC5504912
- Arighi C, Shamovsky V, Masci AM, Ruttenberg A, Smith B, Natale DA, Wu C, D’Eustachio P. Toll-like receptor signaling in vertebrates: testing the integration of protein, complex, and pathway data in the protein ontology framework. PloS one. 2015; 10(3):e0122978. PMID: 25894391, PMCID: PMC4404318
BioCreative:
- Singhal A, Leaman R, Catlett N, Lemberger T, McEntyre J, Polson S, Xenarios I, Arighi C, Lu Z. Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges. Database: the journal of biological databases and curation. 2016; 2016. PMID: 28025348, PMCID: PMC5199160
- Kim S, Islamaj Doğan R, Chatr-Aryamontri A, Chang CS, Oughtred R, Rust J, Batista-Navarro R, Carter J, Ananiadou S, Matos S, Santos A, Campos D, Oliveira JL, Singh O, Jonnagaddala J, Dai HJ, Su EC, Chang YC, Su YC, Chu CH, Chen CC, Hsu WL, Peng Y, Arighi C, Wu CH, Vijay-Shanker K, Aydın F, Hüsünbeyi ZM, Özgür A, Shin SY, Kwon D, Dolinski K, Tyers M, Wilbur WJ, Comeau DC. BioCreative V BioC track overview: collaborative biocurator assistant task for BioGRID. Database: the journal of biological databases and curation. 2016; 2016. PMID: 27589962, PMCID: PMC5009341
- Peng Y, Arighi C, Wu CH, Vijay-Shanker K. BioC-compatible full-text passage detection for protein-protein interactions using extended dependency graph. Database: the journal of biological databases and curation. 2016; 2016. PMID: 27170286, PMCID: PMC4915133
- Wang Q, S Abdul S, Almeida L, Ananiadou S, Balderas-Martínez YI, Batista-Navarro R, Campos D, Chilton L, Chou HJ, Contreras G, Cooper L, Dai HJ, Ferrell B, Fluck J, Gama-Castro S, George N, Gkoutos G, Irin AK, Jensen LJ, Jimenez S, Jue TR, Keseler I, Madan S, Matos S, McQuilton P, Milacic M, Mort M, Natarajan J, Pafilis E, Pereira E, Rao S, Rinaldi F, Rothfels K, Salgado D, Silva RM, Singh O, Stefancsik R, Su CH, Subramani S, Tadepally HD, Tsaprouni L, Vasilevsky N, Wang X, Chatr-Aryamontri A, Laulederkind SJ, Matis-Mitchell S, McEntyre J, Orchard S, Pundir S, Rodriguez-Esteban R, Van Auken K, Lu Z, Schaeffer M, Wu CH, Hirschman L, Arighi CN. Overview of the interactive task in BioCreative V. Database: the journal of biological databases and curation. 2016; 2016. PMID: 27589961, PMCID: PMC5009325
Text mining UD:
- Li G, Ross KE, Arighi CN, Peng Y, Wu CH, Vijay-Shanker K. miRTex: A Text Mining System for miRNA-Gene Relation Extraction. PLoS computational biology. 2015; 11(9):e1004391. PMID: 26407127, PMCID: PMC4583433
- Torii M, Arighi CN, Li G, Wang Q, Wu CH, Vijay-Shanker K. RLIMS-P 2.0: A Generalizable Rule-Based Information Extraction System for Literature Mining of Protein Phosphorylation Information. IEEE/ACM transactions on computational biology and bioinformatics. 2015; 12(1):17-29. NIHMSID: NIHMS718336, PMID: 26357075, PMCID: PMC4568560
- Tudor CO, Ross KE, Li G, Vijay-Shanker K, Wu CH, Arighi CN. Construction of phosphorylation interaction networks by text mining of full-length articles using the eFIP system. Database: the journal of biological databases and curation. 2015; 2015. PMID: 25833953, PMCID: PMC4381107
- Ding R, Arighi CN, Lee JY, Wu CH, Vijay-Shanker K. pGenN, a gene normalization tool for plant genes and proteins in scientific literature. PloS one. 2015; 10(8):e0135305. PMID: 26258475, PMCID: PMC4530884