Infectious Diseases - Projects

Project | Principal Investigator
Diversity of enteric pathogens including E. coli/Shigella | David Rasko Ph.D.
Genome diversity among non-typhoidal Salmonella enterica | Jacques Ravel Ph.D.
Evolution of Eukaryotic Parasites | Joana Carneiro Da Silva Ph.D.
GSCID | Claire M. Fraser Ph.D.
Genetic Population Structure Among Escherichia coli O157:H7 Associated with Hemolytic Uremic Syndrome | Jacques Ravel Ph.D.
Evolutionary History of Plague | Jacques Ravel Ph.D.
High-Throughput Identification and Characterization of Novel Vaccine Candidates Against Bacterial Pathogens | Hervé S.G. Tettelin Ph.D.

Diversity of enteric pathogens including E. coli/Shigella


Diarrheal disease is a problem on a global scale as recognized by United Nations Millennium Development Goal number 4, which states as a main goal, "to combat death due to diarrhea diseases in children less than 5 years of age". The most comprehensive diarrheal studies indicate that there are more than 110 million cases of diarrhea in children under 5 each year and that approximately 2 million people, a large proportion of them children, die each year as a direct result of diarrheal disease. The major bacterial pathogens that contribute to diarrheal disease are Escherichia coli and Shigella species. Although E. coli and Shigella are very similar from an organismal perspective, they range from commensal organisms that cause no disease in humans and animals to pathogens whose infections cause significant disease and death. We still do not fully grasp the diversity of these pathogens and commensals. We are trying to capture this diversity at both the genomic and transcriptomic levels using the latest-generation sequencing technologies. A current project will address the genomic content of more than 100 isolates of E. coli and Shigella, leading to further functional characterization of pathogenic features that will result in additional therapeutic and vaccine targets.

Genome diversity among non-typhoidal Salmonella enterica


Non-typhoidal Salmonella are a leading cause of bacterial outbreaks of foodborne gastrointestinal infection and are responsible for an estimated 1.4 million cases of infection, 15,000 hospitalizations and more than 400 deaths annually in the U.S. alone. The Salmonella genus contains over 2,500 known serotypes that are distributed between the two species S. enterica and S. bongori. Serotypes differ significantly with respect to host specificity, pathogenicity, geographical distribution, prevalence of antimicrobial resistance and other phenotypes. Correspondingly, the genomic diversity between different serotypes is significant. Although a few serotypes are transmitted from person to person with no apparent animal reservoirs, most Salmonella serovars have zoonotic origins, and Salmonella have been isolated from at least 100 different animal species.

As part of the first Microbial Sequencing Centers program, funded by the National Institute of Allergy and Infectious Diseases, a comprehensive genome sequencing project of non-typhoidal Salmonella strains is currently under way, in order to assess the genomic diversity within this bacterial group. For this project, a total of 20 genomes have been selected for sequencing, annotation, and comparative analysis, originating from food, animal, and human non-typhoidal Salmonella isolates covering different degrees of animal and/or human virulence, as well as recent trends in the emergence of antimicrobial resistance.

Evolution of Eukaryotic Parasites

The phylum Apicomplexa comprises a diverse group of unicellular, eukaryotic parasites, many of which infect humans or mammalian species on which human livelihood greatly depends. This phylum includes the causative agents of malaria, babesiosis, cryptosporidiosis, and toxoplasmosis in humans, as well as theileriosis and East Coast fever in cattle. Genome sequences for over a dozen apicomplexan species are available in either complete or draft form, including those of eight Plasmodium species and three species each in the Theileria, Babesia and Cryptosporidium genera.



GSCID

The Genomic Sequencing Center for Infectious Disease (GSCID) will provide researchers with rapid and cost-efficient production of high-quality genome sequences of NIAID Category A-C priority pathogens, related organisms, clinical isolates, closely related species, invertebrate vectors of infectious diseases, and microorganisms responsible for emerging and re-emerging infectious diseases.

Genetic population structure among Escherichia coli O157:H7 associated with hemolytic uremic syndrome


The outbreak of E. coli O157:H7 in spinach captured the attention of both the public health and lay communities. As of October 2006, 199 people from 26 states were infected after ingesting fresh spinach contaminated with E. coli O157:H7. Among the ill, more than half were hospitalized, and in 16% of the cases infection progressed to hemolytic uremic syndrome (HUS). The high number of patients hospitalized and the high rate of kidney failure suggest that this outbreak was due to a highly virulent strain of E. coli O157:H7. To gain insights into the genetic diversity and genome dynamics in this genetically highly homogeneous population, we selected nineteen isolates collected by the Food and Drug Administration (FDA) for whole genome sequencing and comparative analyses. The sequencing of both environmental and clinical isolates presents a rare opportunity to analyze alterations in the genomic sequences that occur during an infectious process within a short two-month outbreak period. The in-depth genome analyses allow us, for the first time, to analyze on a genome-wide scale the types of host variation, selection and adaptation occurring during the time course of a single outbreak. Extensive single-nucleotide polymorphism (SNP)-based genotyping analyses among these closely related bacterial strains will enable us to investigate isolate-specific genetic polymorphisms and determine the phylogenetic relationships within this highly virulent lineage.
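
To illustrate what SNP-based genotyping of closely related isolates involves, the following is a minimal sketch, not the project's actual pipeline. It assumes the genomes have already been aligned to a common coordinate system, and the isolate names and sequences are hypothetical toy data. It finds the variable positions in the alignment and counts SNP differences between each pair of isolates, which is the kind of distance a phylogenetic analysis can start from.

```python
from itertools import combinations


def snp_positions(alignment):
    """Return 0-based column indices where the aligned sequences differ."""
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must be aligned to the same length")
    return [i for i in range(length) if len({s[i] for s in seqs}) > 1]


def pairwise_snp_distances(alignment):
    """Count differing SNP positions for every pair of isolates."""
    sites = snp_positions(alignment)
    return {
        (a, b): sum(alignment[a][i] != alignment[b][i] for i in sites)
        for a, b in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment; not data from the outbreak isolates.
toy_alignment = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTACGTAT",
    "isolate_C": "ACGAACGTAT",
}

sites = snp_positions(toy_alignment)
distances = pairwise_snp_distances(toy_alignment)
for pair, n_snps in distances.items():
    print(pair, n_snps)
```

In a highly homogeneous population such as an outbreak lineage, most columns of the alignment are identical, so restricting the comparison to variable sites keeps the distance calculation small even for whole genomes.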

Evolutionary history of plague


The plague bacterium Y. pestis is responsible for the three plague pandemics in human history and poses a global threat to human health to this day. Natural outbreaks of plague occur in active foci in Asia, Africa and America, and its possible use as a biological weapon raises major public health concerns. Owing to the critical importance of Y. pestis, additional sequence information is important for examining variation in virulence phenotype at the level of individual polymorphisms. The aim of this study is to gain insights into the evolutionary history and global population structure of the species since it emerged from its progenitor Y. pseudotuberculosis. The genomes of ten additional Y. pestis strains and one Y. pseudotuberculosis strain, IP31758, which causes Far East Scarlet-like Fever (FESLF), are being sequenced to capture the genetic diversity in this evolutionarily young and genetically homogeneous pathogen. A major goal is the reconstruction of the phylogeny to deduce the ancestry of the species utilizing SNP-based genotyping approaches and to elucidate common and unique traits in genome evolution and speciation. The identified genetic markers will aid in future forensic, diagnostic and epidemiological studies by setting up the basis for an accurate and robust typing system, and result in a refined model for the evolutionary history of plague.
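
One way to picture how SNPs can inform both ancestry and a typing system is to polarize them against an outgroup. The sketch below is a simplified illustration under stated assumptions, not the study's method: it treats the allele in the progenitor species as ancestral, treats any differing allele in an ingroup strain as derived, and ignores recurrent or back mutation. Strain names and sequences are hypothetical toy data.

```python
def derived_allele_carriers(alignment, outgroup):
    """Map each variable site to the ingroup strains whose allele differs
    from the outgroup allele (assumed here to be the ancestral state)."""
    ancestral = alignment[outgroup]
    ingroup = {n: s for n, s in alignment.items() if n != outgroup}
    if any(len(s) != len(ancestral) for s in ingroup.values()):
        raise ValueError("all sequences must be aligned to the same length")
    carriers_by_site = {}
    for i, anc in enumerate(ancestral):
        carriers = frozenset(n for n, s in ingroup.items() if s[i] != anc)
        if carriers:
            carriers_by_site[i] = carriers
    return carriers_by_site


def classify_sites(carriers_by_site, n_ingroup):
    """Split sites into those fixed in all ingroup strains, those shared by
    a subset (informative for branching order), and strain-specific ones."""
    classes = {"fixed": [], "shared": [], "strain_specific": []}
    for site, carriers in sorted(carriers_by_site.items()):
        if len(carriers) == n_ingroup:
            classes["fixed"].append(site)
        elif len(carriers) == 1:
            classes["strain_specific"].append(site)
        else:
            classes["shared"].append(site)
    return classes


# Hypothetical toy alignment; names and sequences are illustrative only.
toy = {
    "pseudotb_outgroup": "AAAAAA",
    "pestis_1": "GAACAA",
    "pestis_2": "GAACTA",
    "pestis_3": "GAAAAC",
}

carriers = derived_allele_carriers(toy, "pseudotb_outgroup")
classes = classify_sites(carriers, n_ingroup=3)
print(classes)
```

Under these assumptions, sites fixed in every ingroup strain mark changes on the branch leading to the species, shared sites group strains into lineages, and strain-specific sites are candidates for markers that distinguish individual strains.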

High-Throughput Identification and Characterization of Novel Vaccine Candidates Against Bacterial Pathogens

Genomics has revolutionized the way novel candidates are identified for the development of efficacious vaccines. Reverse vaccinology, whereby all candidates of interest are identified by analysis of a pathogen's genome, enables characterization of many candidates simultaneously. It accelerates the initial steps of vaccine development and greatly increases the chances of obtaining reliable candidates or cocktails thereof as an end result. The gene complement of the pathogen is analyzed to predict surface-exposed proteins. These are expressed in vitro and used to immunize mice. Antisera are tested in functional assays to verify their potential as vaccine candidates. The availability of one or two genome sequences for any given pathogen provides access to strain-specific vaccine candidates but often fails to identify candidates that would confer general protection. Analysis of the genomes of multiple strains of a given pathogen is more informative and leads to the concept of the pan-genome. Comparative analysis of eight strains of group B Streptococcus (GBS) reveals a core genome of 1800 genes present in all strains, while 20% of the genes are dispensable (absent in one or more strains). Each genome displays an average of 33 unique genes, indicating that the species' gene repertoire, or "pan-genome", is extremely large. The same result was obtained with group A Streptococcus and other species. Overall, the availability of eight complete genomes of a single species unveiled 26% more information than any single isolate. Thus, the genome sequences of multiple, independent isolates are required to understand the global complexity of a bacterial species and gain access to relevant virulence determinants and protective antigens.
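
The core, dispensable and unique gene categories described above can be made concrete with a small sketch. It assumes that genes from different strains have already been grouped into shared gene families, so that each strain is represented by a set of family identifiers. The strain names and gene identifiers are hypothetical, and the numbers are not the GBS results.

```python
from collections import Counter


def pan_genome_summary(gene_sets):
    """Partition gene families into core (present in every strain) and
    dispensable (absent from at least one strain), and list the families
    unique to each strain."""
    n_strains = len(gene_sets)
    # Count in how many strains each gene family occurs.
    counts = Counter(g for genes in gene_sets.values() for g in set(genes))
    pan = set(counts)
    core = {g for g, c in counts.items() if c == n_strains}
    unique = {
        strain: {g for g in set(genes) if counts[g] == 1}
        for strain, genes in gene_sets.items()
    }
    return {
        "pan": pan,
        "core": core,
        "dispensable": pan - core,
        "unique": unique,
    }


# Hypothetical gene family content for three strains.
toy_strains = {
    "strain_1": {"geneA", "geneB", "geneC", "geneD"},
    "strain_2": {"geneA", "geneB", "geneC", "geneE"},
    "strain_3": {"geneA", "geneB", "geneF"},
}

summary = pan_genome_summary(toy_strains)
print(len(summary["pan"]), "gene families in the pan-genome")
print(sorted(summary["core"]), "are core")
```

Each added strain in this toy set contributes one gene family seen nowhere else, so the pan-genome keeps growing while the core stays the same or shrinks. That is the pattern the paragraph describes for species with a large gene repertoire.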