Bioinformatics - Projects

Project                                             Principal Investigator
GCID: Genome Center for Infectious Disease          Claire M. Fraser Ph.D.
NeMO: Neuroscience Multi-Omic Archive               Owen White Ph.D.
DCPPC: The NIH Data Commons Pilot Phase Consortium  Owen White Ph.D.
HMP DACC                                            Owen White Ph.D.
ECO: Evidence and Conclusion Ontology               Michelle Gwinn-Giglio Ph.D.
DO: Disease Ontology                                Lynn Schriml Ph.D.
Analysis Engine                                     Michelle Gwinn-Giglio Ph.D.


GCID: Genome Center for Infectious Disease

This project explores the dynamic interactions among pathogens, hosts, microbiota, the immune system, and the environment, with the goal of providing a comprehensive understanding of the determinants of infectious disease. The project includes work on bacteria (Escherichia coli), fungi (Candida, Aspergillus, and the Mucorales species that cause mucormycosis), and eukaryotic parasites (Plasmodium and Brugia).

NeMO: Neuroscience Multi-Omic Archive

As part of the NIH BRAIN Initiative, researchers at IGS have developed the Neuroscience Multi-Omic (NeMO) Archive, which focuses on the storage and dissemination of omics data from the BRAIN Initiative and related brain research projects. The repository provides a catalog of omics data from cells in the mammalian brain, together with rich metadata about the studies. The NeMO Portal provides a query interface for users to explore this data.

DCPPC: The NIH Data Commons Pilot Phase Consortium

Massive quantities of high-throughput biological data of many types have been generated. Currently, it is difficult for most researchers to combine, query, and analyze data from diverse sources. The NIH Data Commons will provide a cloud-based platform to house data and analysis workflows from NIH-funded projects. Ultimately, this resource will provide a storage and computing environment in which researchers can store, share, access, and analyze data, leading to new hypotheses and discoveries. IGS is part of the NIH Data Commons Pilot Phase Consortium, which is charged with producing an initial implementation of this cloud resource. As a member of the DCPPC, we produce a public-facing web resource for the project and contribute to infrastructure and metadata harmonization efforts.


HMP DACC

The NIH Common Fund initiated the Human Microbiome Project (HMP) to explore the microbial communities of the human host and characterize their role in human health and disease. The initial five-year phase of the effort established a baseline of data from a large sample of healthy subjects, explored changes in community composition with disease states, and provided resources for the community to use in human microbiome research. A second phase of the effort, called the Integrated Human Microbiome Project (iHMP), focused on three particular conditions in human health: onset of type 2 diabetes, inflammatory bowel disease, and pregnancy/pre-term birth. In contrast to the initial phase of the HMP, this phase applied many different types of omics approaches to both the microbiome and the host in order to provide a more systems-level view of human-microbe interactions. IGS is the data coordination center for both of these HMP efforts.


ECO: Evidence and Conclusion Ontology

The Evidence and Conclusion Ontology (ECO) contains terms to describe types of evidence used in the process of biocuration. Capturing this information systematically with an ontology allows tracking of annotation provenance, establishment of quality control measures, and querying of evidence. ECO contains over 1500 terms and is in use by many leading biological resources, including the Gene Ontology, UniProt, and several model organism databases. ECO is continually being expanded and revised based on the needs of the biocuration community.

DO: Disease Ontology

The Disease Ontology (DO) has been developed as a standardized ontology for human disease. Its purpose is to provide the biomedical community with consistent, reusable, and sustainable descriptions of human disease terms, phenotype characteristics, and related disease concepts from medical vocabularies. It has been developed through the collaborative efforts of researchers at the Center for Genetic Medicine at Northwestern University and the Institute for Genome Sciences at the University of Maryland School of Medicine. The Disease Ontology semantically integrates disease and medical vocabularies through extensive cross-mapping of DO terms to MeSH, ICD, the NCI Thesaurus, SNOMED, and OMIM.


Analysis Engine

It has become relatively easy to acquire the genome sequence of prokaryotic organisms. However, there are still few options available for doing systematic, complete annotation of the whole genome using a robust annotation pipeline. The IGS Analysis Engine provides comprehensive automated annotation along with all underlying search data as well as tools for visualization and (optional) manual curation. In addition to single genome annotation, we also offer comparative analysis of multiple genomes with an associated visualization tool. These services are provided on a fee-for-service basis.