Posts in category Infectious Disease

Impact: Ebola Research Efforts at JCVI

We have all read with concern the stories about the rapid spread of Ebola virus disease (EVD) in Africa. Now, with the first diagnosis of the virus in the United States, it is clear this virus is not under control. If not contained, Ebola poses a significant threat to the African continent and beyond. JCVI is on the front lines of working to better understand this infectious agent. Dr. Reed Shabman, a member of JCVI’s infectious disease team, is seeking to understand why infections with Ebola and Marburg viruses (both filoviruses) result in such severe human disease.

Ebola virus

Produced by the National Institute of Allergy and Infectious Diseases (NIAID), under a magnification of 25,000X, this digitally-colorized scanning electron micrograph (SEM) depicts numerous filamentous Ebola virus particles (red) budding from a chronically-infected VERO E6 cell (blue). Image credit: NIAID

During his time as a postdoc at the Icahn School of Medicine at Mount Sinai in New York, Reed helped to develop research platforms designed to understand how Ebola virus mediates its replication and gene expression and evades the immune system. The innovative approaches used at JCVI do not require high-level containment facilities, and through established collaborations with biosafety level 4 (BSL-4) labs his group is able to confirm its results in the context of actual Ebola infection.

Some of the ongoing collaborative projects in the group include:

  • Determining how Ebola virus evades the host immune system, specifically the innate immune response.
  • Employing sequencing platforms to identify previously undescribed aspects of Ebola and Marburg virus RNAs.
  • Developing reporter systems to understand how the untranslated regions (UTRs) of Ebola and Marburg virus control their protein production.

This important research seeks to enhance the scientific community’s understanding of Ebola and Marburg virus biology, which will aid our ability to rationally design ways to combat these deadly viruses.

Ebola Background

Ebola has entered the human population before, with the first documented cases occurring in 1976 in areas that are now South Sudan and the Democratic Republic of Congo. Since 1976, there have been approximately 20 outbreaks in central Africa resulting in just over 2300 confirmed cases of Ebola virus disease (EVD).

One unusual aspect of the current Ebola outbreak is that it is occurring in west Africa rather than central Africa. Initial cases were reported in February in Guinea. Shortly after these initial reports, EVD spread into Liberia, followed by cases in Sierra Leone and Nigeria. While efforts to control the virus in Nigeria appear to be successful, the number of cases since the first reported case now totals approximately 8000, with almost 4000 fatalities. The numbers from this single outbreak are larger than those of all previous outbreaks combined.

Ebola virus was identified almost 40 years ago; however, there are still no approved vaccines or antiviral therapies, only supportive care. There are promising therapies and vaccines on the horizon, but a fundamental understanding of how the virus interacts with its human host is critical to advancing treatment of this deadly disease.

Study Signals Bat Flu Unlikely to Jump to Humans

Bat species harbor a large number of viruses that cause human disease. So, when the first influenza sequences from Guatemalan little yellow-shouldered bats were uncovered in 2009, the question arose of whether bat influenza viruses pose a threat to human health. A collaborative project between JCVI and Kansas State University, recently published in PLoS Pathogens, addresses this question.

H1N1 influenza virus particles

Image Credit: National Institute of Allergy and Infectious Diseases (NIAID)

The study employed cutting-edge synthetic biology approaches and demonstrated that, while the sequences of the bat influenza virus of subtype H17N10 are viable, they are unable to infect human cells. Additional experiments clearly indicated that these bat virus sequences are not able to reassort with other influenza A and B viruses known to infect humans. Therefore, a pandemic bat influenza entering the human population is extremely unlikely.

David Wentworth, the former Director of Viral Programs at JCVI, was the lead investigator for this study.  Additional authors from JCVI include Tim Stockwell, Wei Wang, Xudong Lin, Bin Zhou (now at NYU), and Reed Shabman.

For additional information see the press release.

H3Africa Update

The National Institutes of Health (NIH) and the UK-based Wellcome Trust, in partnership with the African Society of Human Genetics, developed a program to foster genomic and epidemiological research in African scientific institutions. The laboratory and computational infrastructure available to most scientists on the African continent is currently insufficient to keep up with the rapid developments in DNA sequencing technologies and the need to use advanced computationally intensive methods to analyze this data.

Through the H3Africa Consortium, a partnership between NIH and Wellcome Trust, funding has become available to support knowledge development and implementation of genomics-centered research in several African academic institutions. The first scientific paper to come from this effort, Enabling the Genomic Revolution in Africa, was published in the journal Science in June 2014.

H3Africa Efforts at J. Craig Venter Institute (JCVI)

One of the main initiatives of H3Africa is to foster scientific exchange between US-based partners and their African-based consortium members. JCVI is involved in a number of such partnerships through training and research collaborations.

Tuberculosis Research with Addis Ababa University

Addis Ababa University is the only Ethiopian institution to receive a primary award from NIH under H3Africa. It is based on a collaboration with JCVI. Professor Gobena Ameni of Addis Ababa University and Dr. Rembert Pieper of JCVI developed a proposal on Systems Biology for Molecular Analysis of Tuberculosis in Ethiopia which was initiated earlier this year. The research focuses on genomic variability in M. tuberculosis strains in Ethiopian pastoralist societies and also has an oral microbiome and proteomic biomarker discovery component.

Bioinformatics Training for African Scientists

As part of H3Africa, JCVI is leveraging its recent GCID award, where appropriate, for training of African scientists. As part of this effort, Dr. Andrey Tovchigrechko taught microbiome analysis to graduate students in Ibadan, Nigeria. The workshop was organized by the local H3Africa Bioinformatics Network node. It took place in July 2014 and included students from Nigeria and other West and Central African countries.

Symposium presenters.

Workshop student participants.

The workshop was held at IITA.

During the three-day workshop, Dr. Tovchigrechko taught the students how to launch and control computing instances on the Amazon cloud, the basics of Python and R programming, the MG-RAST Web interface, the MG-RAST R package matR and the JCVI-developed R code MGSAT. MG-RAST tutorials were provided by one of its developers, Andreas Wilke (ANL).

Dr. Tovchigrechko also gave a talk, along with a dozen other speakers, at a one-day symposium at the University of Ibadan that preceded the workshop and included approximately 200 participants. Special thanks go to Nash Oyekanmi, the organizer and manager of the whole event, for his relentless efforts.

Collaborations with University of Cape Town

Also as part of the H3Africa Consortium, Dr. William Nierman from JCVI and Dr. Mark Nicol from the University of Cape Town, South Africa are collaborating to study the nasopharyngeal microbiome and respiratory disease in African children. Dr. Nierman’s group has conducted a month-long in-house microbiome training workshop with students from Dr. Nicol’s group.

The focus of the training was to teach students JCVI’s complete microbiome pipeline (including sample preparation, sequence generation, and final association analysis). The aim of the training collaboration is to ensure that this complete pipeline can be performed at the University of Cape Town, to help build independent and sustainable capacity in this field within South Africa.

Understanding Complex Data through Better Visualization

Recently, researchers at JCVI reported on the Rhizoctonia solani mitochondrial genome, the largest fungal mitochondrial genome sequenced to date. We showed that its unusually large size was probably due to the expansion of multiple genetic elements that populated the genome in a somewhat ‘parasitic’ relationship. The visualization was meant to convey the number and variety of these repetitive genetic elements, and was selected in a commentary in FEMS Microbiology Letters as an example of how to summarize molecular data in order to obtain an overall view of the results.

The outermost circle represents the chromosome and repetitive elements. Other important features such as genes, endonucleases, exons, RNAseq coverage are represented in the concentric circles respectively. Grey links represent short repeats (< 35bp) found up to 100 times in the genome; colored links show the location of repeats and follow the coloration in Track 1.

JCVI Hosts South African Scientists to Share Microbiome Research Techniques

Two scientists from the University of Cape Town, South Africa have joined Dr. Bill Nierman’s lab for the next month as part of NIH’s Human Heredity and Health in Africa (H3Africa) Initiative, a training program designed to build out technical biological skills in the African research community. This training relates specifically to developing techniques around the area of microbiome analysis, a relatively new field in the biological sciences.

Microbiome analysis for the collaborative study looks at the entire community of microorganisms in the respiratory tract of South African infants to better understand how the microbiome is associated with infant pneumonia and wheezing episodes. The expectation is that the organisms that reside in the infant respiratory tract will provide either protection from or a predisposition to pneumonia or wheezing episodes.

The Nierman Group

The Nierman group, left to right: Sarah Lucas, Bill Nierman, Shantelle Claassen, Mamadou Kaba and Stephanie Mounaud (not pictured: Jyoti Shankar and Lilliana Losada). The group welcomes visiting scientists Ms. Claassen and Dr. Kaba from the University of Cape Town for a month-long training in microbiome sequencing and analysis.

Mamadou Kaba, MD, PhD and colleague Shantelle Claassen from the University of Cape Town will be working closely under the guidance of JCVI’s Stephanie Mounaud, who is functioning as the project manager and coordinating the laboratory components of a similar project at JCVI studying the microbiomes of infants in the Philippines and in South Africa. These studies are sponsored by the Bill and Melinda Gates Foundation. The training will focus initially on preparing samples for DNA sequencing on a modern DNA sequencing platform, the Illumina MiSeq instrument. Once the sequence reads are off the sequencer, the instructional focus will shift to analysis of the reads by means of an informatics pipeline that develops phylogenies, or family trees, of the microbes obtained from the infant respiratory tract so that the abundance and relatedness of the microbes can be established. The bioinformatics training will be provided by Jyoti Shankar, the statistical analyst working on the Gates Foundation project.

Mamadou Kaba is a Wellcome Trust Fellow working in the Division of Medical Microbiology, Faculty of Health Sciences, University of Cape Town. Mamadou’s research interests include the molecular epidemiology of infectious diseases and the study of the human microbiome in health and disease. He has contributed to establishing a new research group conducting studies on how the composition of the upper respiratory tract, gastrointestinal, and house dust microbial communities influences the development of respiratory diseases.

Prior to joining the University of Cape Town, Mamadou worked as Research Associate at the Laboratory of Medical Microbiology, Timone University Hospital, Marseille, France, where he studied the epidemiological characteristics of infection with hepatitis E virus in South-eastern France.

Shantelle Claassen is pursuing a Master’s degree in the Division of Medical Microbiology at the University of Cape Town. She has completed a BSc (Med) Honours degree in Infectious Diseases and Immunology at the University of Cape Town, during which she examined the relative efficacy of extracting bacterial genomic DNA from human faecal samples using five commercial DNA extraction kits. The DNA extraction kits were evaluated based on their ability to efficiently lyse bacterial cells, cause minimal DNA shearing, produce reproducible results and ensure broad-range representation of bacterial diversity.

Mamadou and Shantelle are currently involved in an additional prospective, longitudinal study of which the primary objective is to investigate the association between fecal bacterial communities and recurrent wheezing during the first two years of life.

The 2014 Summer Internship Application is Open and Announcing the Genomics Scholar Program

The 2014 Summer Internship Application is now open. Last summer, we hosted 49 interns from a pool of 424 applicants. They presented their research in the First Annual Summer Internship Poster Sessions held in San Diego and Rockville. The posters were judged by a team of volunteer JCVI scientists, and the poster sessions were open to all employees, interns and their guests so that everyone could share the great work the interns took part in over the summer.

2013 Intern Poster Session

We are also excited to announce the new Genomics Scholar Program, which begins this summer and is also accepting applications. The Genomics Scholar Program (GSP) is a targeted research experience program for community college students in Rockville. Our program incorporates multiple avenues of support for students through the research experience, with the Principal Investigators as mentors and supplemental professional development provided by JCVI. Additionally, selected students will have the opportunity to participate in undergraduate research conferences.

The GSP is supported by the National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health under award number R25DK098111.

‘Twas the night before Christmas

‘Twas the night before Christmas, when all through the building
All our creatures were stirring, even our mold;
The dishes were placed in the incubator with prayer,
In hopes that pure growth soon would be there;

The scientists were nestled all close to their screens instead
While swirls of DNA danced in their heads;

My coworker in her labcoat, and I with my pipettor,
Had just settled down for a long overnighter,

When out in the lab there arose such a clatter,
I sprang from my microscope to see what was the matter.
Away to the incubator I flew like a flash,
Tore open the doors then saw what was trash.

When, what to my tired red eyes should appear,
But a bunch of contaminated plates, there goes my career.

Santa Hat - Ho Ho Ho

Brim and ball: Neosartorya fischeri; Hat: Penicillium marneffei; Ho,Ho,Ho: Aspergillus flavus. Image Credit: Stephanie Mounaud / J. Craig Venter Institute.

Last year, still in an isolated fungal room placed far away from others, I made an attempt at this one, but my stocks were contaminated. Something all fungal folks know something about. (Aspergillus is just EVERYWHERE). So with a little luck (let’s face it, with complete luck) I was able to clean things up and told the fungus to be on its best behavior. However, N. fischeri still did not want to play nice with the P. marneffei…so they remained slightly separated.

Fungal Christmas Tree

Star: Talaromyces stipitatus; Tree: Aspergillus nidulans; Ornaments: Penicillium marneffei; Trunk: Aspergillus terreus. Image Credit: Stephanie Mounaud / J. Craig Venter Institute.

I hope everyone enjoys my creation, although the credit goes to my jolly ole fungus for being so wonderfully diverse and satisfying my slightly nerdy creative side.

Fungalman

Hat, Eyes, Mouth, Buttons: Aspergillus niger; Arms: Aspergillus nidulans; Nose: Aspergillus terreus with Penicillium marneffei; Body: Neosartorya fischeri. Image Credit: Stephanie Mounaud / J. Craig Venter Institute.

Let us all show the world the true side of fungus and all its amazing potential. Because we all know they can do more than just sit there and look pretty.

JCVI Viral Finishing Pipeline: a Winning Combination of Advanced Sequencing Technologies, Software Development and Automated Data Processing

JCVI viral projects are supported by the NIAID Genomic Sequencing Center for Infectious Disease (GSCID). The viral sequencing and finishing pipeline at JCVI combines next generation sequencing technologies with automated data processing. This allowed us to complete over 1,800 viral genomes in the last 12 months, and almost 8,800 genomes since 2005.

Viral Projects at JCVI

JIRA Viral Sample Tracking Workflow

Our NextGen pipeline, which utilizes SISPA-generated libraries with Roche/454 and Illumina sequencing, enables us to complete a wide variety of viral genomes, including challenging samples. The automated assembly pipeline employs CLCbio command-line tools and JCVI cas2consed, a tool that converts assemblies from cas to ace format. Our complementary Sanger pipeline software is currently being integrated with the NextGen pipeline. This will improve our data processing and will allow us to use validation software (autoTasker) more efficiently.

Assembly of Repetitive Viral Genomes

Genome Organization of Varicella-Zoster

Assembly of Novel Viral Genomes

CLC Assembly Viewer Representation

Promoter of Bat Genome

During the past year we have found that novel viruses, repetitive genomes, and mixed infection samples could not be easily integrated with our high-throughput assembly pipeline. We have developed an assembly and finishing process that utilizes components of the high-throughput pipeline and combines them with manual reference selection and editing. Using this approach we completed novel adenovirus genomes and mixed-infection avian influenza genomes, and improved assemblies of previously unknown arbovirus genomes. We are currently working on optimizing and automating this new pipeline.

Assembly of Mixed Viral Genomes

Consed Representation of Mixed Viral Sample

Repetitive genomes have long been known to present great challenges during assembly and finishing. We are presenting a new approach to assembly and finishing of the repetitive varicella genome that is based on separating it into overlapping PCR amplicons and then merging the sequenced amplicons during assembly.
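The amplicon-merging idea can be illustrated with a small Python sketch. This is not the JCVI pipeline: it assumes the amplicon consensus sequences are error-free, already ordered along the genome, and joined by exact suffix/prefix overlaps. The function names and the min_overlap parameter are hypothetical and chosen only for illustration.

```python
def merge_pair(left, right, min_overlap=20):
    """Join two sequences where a suffix of `left` exactly matches a prefix of `right`.

    The longest exact overlap of at least `min_overlap` bases is used.
    """
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            # Keep `left` whole and append only the non-overlapping part of `right`.
            return left + right[size:]
    raise ValueError("no exact overlap of at least min_overlap bases")


def merge_amplicons(amplicons, min_overlap=20):
    """Merge amplicon sequences, given in genome order, into one sequence."""
    if not amplicons:
        raise ValueError("at least one amplicon is required")
    merged = amplicons[0]
    for next_amplicon in amplicons[1:]:
        merged = merge_pair(merged, next_amplicon, min_overlap)
    return merged
```

Because each amplicon spans a known interval, a repeat that falls inside one amplicon is read within its flanking unique sequence, which is the property this approach relies on. Real data would need alignment-based merging that tolerates sequencing errors.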

To streamline our viral pipelines, we have fully integrated them with JCVI’s LIMS and JIRA Workflow Management to create a semi-automated tracking interface that follows the progress of viral samples from acquisition through to NCBI submission. This allows us to process a large volume of samples with limited manual interaction and, at the same time, gives us flexibility to work on challenging and novel genomes.

Acknowledgements

The JCVI Viral Genomics Group is supported by federal funds from the National Institute of Allergy and Infectious Diseases, the National Institutes of Health, and the Department of Health and Human Services under contract no. HHSN272200900007C.

The bat coronavirus project is a collaboration with Kathryn Holmes and Sam Dominguez, University of Colorado Medical Center.

The authors would like to thank members of the Viral Genomics and Informatics group at JCVI.

References

Viral genome sequencing by random priming methods. Djikeng A, Halpin R, Kuzmickas R, Depasse J, Feldblyum J, Sengamalay N, Afonso C, Zhang X, Anderson NG, Ghedin E, Spiro DJ. BMC Genomics. 2008 Jan 7;9:5.

A virus discovery method incorporating DNase treatment and its application to the identification of two bovine parvovirus species. Allander T, Emerson SU, Engle RE, Purcell RH, Bukh J.

Note

This post is based on a poster by Nadia Fedorova, Danny Katzel, Tim Stockwell, Peter Edworthy, Rebecca Halpin, and David E. Wentworth.

Scientist Spotlight: Meet David Wentworth

During the height of the H1N1 Flu pandemic, David Wentworth was running a microbial genetics laboratory at the Wadsworth Center, New York State Department of Health (NYSDOH) where he was instrumental in developing a method to amplify influenza genomes regardless of strain using “universal primers” or short strands of DNA that recognize conserved segments across the genomes of many different flu strains. This amplification process was developed to generate recombinant influenza A viruses (the most common flu type affecting humans and animals) that could be used for the production of new vaccines. From a clinical swab it took his team 9-12 days to develop vaccine seed stocks. It was this work that first brought Dave to JCVI’s attention.

Several years ago Dave began collaborations with JCVI scientists to sequence human and avian influenza viruses. The collaborations intensified two years ago when all pandemic flu samples (or suspected flu samples) were first sent to Dave’s lab so the virus could be amplified in sufficient quantities for sequencing using his new amplification pipeline. The amplification took only a day and then isolated, non-infectious, DNA was sent to JCVI for sequencing. JCVI was the natural choice for this work since we are host to the government-funded “Influenza Genome Sequencing Project,” with the goal of sequencing large numbers of viral genomes to help scientists worldwide to understand how flu viruses evolve and cause disease. JCVI researchers then deposited influenza sequences into GenBank within two days of receiving DNA from Dave’s lab, enabling researchers worldwide to track what strains are circulating and how they are evolving. JCVI has sequenced over 75% of the influenza genomes in GenBank, the NIH public repository for sharing genetic sequencing data.

Influenza Genome Amplification Directly From Clinical Specimens

Influenza Genome Amplification Directly From Clinical Specimens (Zhou, B., M. E. Donnelly, D. T. Scholes, K. St.George, M. Hatta, Y. Kawaoka, and D. E. Wentworth. 2009. J.Virol. 83:10309-10313.).

Dave was soon invited for a talk at JCVI. “The opportunities at JCVI were to help build the [viral genomics] program. And already good, quality people are here studying viruses with a focus on viral evolution and sequencing analysis,” Dave remarked. “Being part of generating that information, I think makes you have a better feel for the biology.” The capabilities for viral sequencing, combined with IFX strengths and the interest in viral evolution at JCVI, were a draw for Dave and he soon joined the team. Moreover, there are opportunities at JCVI to work with collaborators who send specimens from various regions of the world for sequencing so that we can “more deeply understand the mutations that contribute to virulence,” he said. He is particularly interested in antigenic drift (how viruses escape immunity) that contributes to the “annual influenza escape,” which is critical in developing vaccine strains.

New Live Attenuated Vaccine Approaches

New Live Attenuated Vaccine Approaches. Figure shows influenza RNA polymerase activity (GFP) at various temperatures. Mutations engineered into the genome (PB1-Mut3, PB2-Mut4) synergize and inhibit replication at higher temperatures of the lung (37 C) or fever (39 C).

The need for new and improved methods to develop vaccines, coupled with the advances in synthetic genomics developed at JCVI, led to the formation last year by JCVI and the company Synthetic Genomics Inc. of a new company, Synthetic Genomics Vaccines Inc. (SGVI). JCVI scientists, through SGVI, are working on a three-year collaboration agreement with Novartis to apply synthetic genomics tools and technologies to accelerate the production of the influenza seed strains required for vaccine manufacturing. The agreement, supported by an award from the U.S. Biomedical Advanced Research and Development Authority (BARDA), could ultimately lead to a more timely and effective response to seasonal and pandemic influenza outbreaks. The idea is to create viruses de novo, or to synthesize the genes critical for their antigenicity, and put these in normal vaccine strains for production. The goal of the work at SGVI is to synthesize a virus in one week, or rather a seed stock, which still needs to be amplified in big fermenters. New seed stocks take 3-4 weeks to produce, which is currently a rate-limiting step.

You don’t hear too many people singing its praises and saying “I love the flu!” as Dave has remarked, but put in context, his enthusiasm for his work shines through best when talking about his love of teaching. He gets excited teaching young scientists about virology, especially helping them to understand the important areas to study, and where the research will lead to solve a major problem. “The rewarding part of being a mentor is to see all of the people who have found their niche – it might not be bench research but they are still carrying knowledge with them.”

David Wentworth (DEW) checking a hive in the late spring.

Aside from spending time with his family, Dave enjoys a hobby started by his dad – to cultivate honey bees. A community gardens group at a middle school in Albany, NY was looking for bees to pollinate their plants. Dave spearheaded the effort and used it as a learning tool for kids, who helped feed honey to caterpillars and moths. He also used to give lectures on bee cultivation and has taught college courses in animal science. Dave’s enthusiasm for science among his students and peers could be considered infectious, just like the subject of his research!

Evaluating Strain-level Variation of Key Acidogenic Species in Dental Plaque Biofilms

The characterization of the dental plaque microbiome using traditional 16S rDNA profiling strategies illustrates both the strengths and the limitations of this method. The central limitation of the 16S rDNA methodology is its inability to decipher strain-level variation within a microbiome. Why is this important? It is becoming a common theme in microbiome research that microbiomes associated with the human host are distinct from those that inhabit the environment. The species present in distinct human microbiomes represent only a small number of taxa. Within these taxa are relatively few genera that have massive representation of member species. This structure has been referred to as the deep fan structure. When comparing microbiomes from healthy and diseased subjects, important strain-level variations may commonly exist that are, in many instances, potentially causally related to the health of the human host. The dental plaque microbiome illustrates this point strongly. Oral microbiologists have isolated strains from species including S. mitis, S. sanguinis, S. mutans, S. gordonii and others that differ dramatically in their acid production and acid tolerance characteristics. The genes encoding these activities are not part of the core genome, but reflect functions encoded in the strain-variable portion of the genome (~10-30% of the genome’s coding capacity). Important aspects of human disease etiology may be missed if we fail to address this possibility.

Summary of Progress: Dental plaque samples from human subjects with and without dental caries were used to isolate S. mutans and S. sobrinus colonies using enrichment culturing procedures. Most colonies were subjected to 2-3 rounds of replating to obtain pure colonies. The individual clones were then grown in liquid media to isolate genomic DNA for fingerprinting of strains based on RFLP analysis. This allowed us to collapse positive strains that appeared identical or highly similar into a set of strains that appeared to be of maximal diversity, encoding the largest number of unique gene sequences. We further characterized the individual strains using primer pairs that are specific for either S. mutans or S. sobrinus. Several of the isolates were negative by PCR; these corresponded to isolates with unusual RFLP patterns and so were excluded from further analysis. Some isolates tested positive for one of the two primer pairs used for screening and were marked as such but retained for further analysis using genome sequencing. The isolates obtained were multiplexed into two lanes of the Solexa GA IIx at a theoretical depth of coverage of 50X. Previous evidence based on comparative analyses indicates that strain-specific regions of the S. mutans genome are not randomly distributed but rather are present at discrete locations. The breadth of these regions is not fully characterized, but our analyses will greatly improve that characterization. To date no reference genome sequence is available for S. sobrinus, a potentially important contributor to dental caries.
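For readers unfamiliar with the term, theoretical depth of coverage is conventionally estimated as the total number of sequenced bases divided by the genome size, C = N × L / G. The Python sketch below shows that arithmetic. The function names are hypothetical, and the read length and genome size in the usage note are illustrative assumptions, not values reported in this post.

```python
import math


def theoretical_coverage(num_reads, read_length, genome_size):
    """Expected depth of coverage: C = N * L / G."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest whole number of reads that reaches the target depth."""
    return math.ceil(target_coverage * genome_size / read_length)
```

As an illustration only, calling reads_needed(50, 100, 2_000_000) estimates the reads required per isolate for 50X depth on an assumed 2.0 Mb genome with assumed 100 bp reads. Multiplexing many isolates per lane divides the lane's read yield among them, which is why the stated depth is theoretical.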

Each genome to be sequenced was uniquely barcoded using the EpiBio Nextera DNA sample prep kit, and sequencing was performed using an Illumina Genome Analyzer IIx. The sequenced reads were then searched against the GenBank non-redundant nucleotide database for quality assessment and to determine the top hit of each genome. As shown in Table 1, 76 isolates generated best hits to S. mutans and 47 to S. sobrinus genomes. It is somewhat puzzling how the 17 isolates that do not appear to be either S. mutans or S. sobrinus were cultivated on the media used. We believe these colonies were impure and consisted predominantly of the organism whose genome was sequenced.

Top BLAST hit (genome)     # of isolates
S. sobrinus                47
S. parasanguinis           1
E. faecalis                1
Lactobacillus spp.         1
S. mutans                  76
Chryseobacterium gleum     1
S. aureus                  8
S. epidermidis             1
S. caprae                  4

Table 1. Summary of the top hits of the reads from each isolate sequenced.
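The counts in Table 1 can be checked with a short tally. The Python sketch below copies the counts from the table; the summarize function and its targets parameter are hypothetical names used only for this illustration.

```python
# Isolate counts per top BLAST hit, copied from Table 1.
top_hits = {
    "S. sobrinus": 47,
    "S. parasanguinis": 1,
    "E. faecalis": 1,
    "Lactobacillus spp.": 1,
    "S. mutans": 76,
    "Chryseobacterium gleum": 1,
    "S. aureus": 8,
    "S. epidermidis": 1,
    "S. caprae": 4,
}


def summarize(hits, targets=("S. mutans", "S. sobrinus")):
    """Return (total isolates, isolates matching a target species, all others)."""
    total = sum(hits.values())
    on_target = sum(hits.get(name, 0) for name in targets)
    return total, on_target, total - on_target
```

Calling summarize(top_hits) separates the isolates whose best hit is one of the two target species from the off-target isolates discussed in the text.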

We used Newbler to assemble the sequence reads from each genome. For S. mutans we used mapping assembly against the S. mutans UA159 sequence, and for S. sobrinus we performed de novo assembly because no reference genome sequence is available. Overall the sequencing of isolates was successful, with one exception. The remaining 75 isolates assembled with an average coverage of 91% with respect to the reference genome. Given what is known about strain-specific gene content in S. mutans, one expects 90% coverage to be equivalent to complete coverage, since ~10% of UA159’s genome sequence is not likely to be shared with these isolates. The average number of contigs per isolate is 215, with an average length of 10,842 bp. Based on this outcome it is highly likely that we will identify sequence reads from essentially all strain-specific genes for each isolate. Determining the extent to which full-length gene sequences have been generated, and to what extent those sequences display genomic context, is part of our current efforts.

Ongoing Efforts. We are currently identifying strain-specific sequences from each isolate to determine the extent to which these sequences are shared among newly characterized isolates and their association with either caries-free or caries-active subjects. We will also identify the set of core gene sequences that appear to be present in all S. mutans and S. sobrinus genomes respectively. Ultimately we have demonstrated the use of high-throughput sequencing technology as a means of characterizing oral pathogens of interest. Suggested applications for this type of research effort include the generation of strain-specific oligonucleotides to be added to existing DNA microarray content to enhance analysis using standard CGH methods. Another powerful use of this data is the application of a variety of selection schemes that reveal the fitness of individual strains among the groups sequenced. The identification of strain-specific sequence signatures allows us to design primer pairs that can be used to measure the abundance and growth characteristics of that strain by qPCR. Potentially more interesting is the measurement of strains’ growth characteristics in competition with other sequenced strains. We have created mixtures of all of the sequenced S. mutans and S. sobrinus strains as independent pools and also generated a super pool including all sequenced strains. We have subjected these pools to a number of selective growth conditions including oxidative stress, low pH and growth on a variety of sugar substrates. In each case we envision that the generation of gene expression data and/or qPCR data detailing the abundance of each strain before and after selection will reveal individual strains that display high and low resistance to low pH, oxidative stress, etc.

This experimental procedure is analogous to phenotypic screens involving pools of single-gene KO strains that have been uniquely barcoded to allow highly parallel analysis using DNA microarrays, as popularized by the S. cerevisiae community. The variation performed here is to make use of the strain-specific gene sequences as a surrogate for the molecular barcode. Each strain will have at least one and probably hundreds of unique sequence identifiers that may be exploited for this purpose.
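The idea of using strain-specific sequence as a surrogate barcode can be sketched in Python as a search for k-mers that occur in exactly one strain of a pool. This is an illustrative simplification, not the method used in the project: it works on whole sequences rather than assembled contigs, ignores reverse complements and sequencing errors, and the function names and the choice of k are hypothetical.

```python
from collections import Counter


def kmers(sequence, k):
    """Return the set of all k-length substrings of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def strain_specific_kmers(genomes, k):
    """Map each strain name to the k-mers found in that strain and no other.

    `genomes` maps strain name -> sequence string.
    """
    per_strain = {name: kmers(seq, k) for name, seq in genomes.items()}
    # Count the number of strains in which each k-mer appears.
    strains_with_kmer = Counter(
        kmer for strain_kmers in per_strain.values() for kmer in strain_kmers
    )
    return {
        name: {kmer for kmer in strain_kmers if strains_with_kmer[kmer] == 1}
        for name, strain_kmers in per_strain.items()
    }
```

Any k-mer returned for a strain is a candidate site for a strain-specific primer pair or probe. In practice k would need to be long enough to be unique in the pool, and candidates would still have to be screened for primer suitability.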

It is our hope that this demonstration will provide the dental research community with a blueprint for how genome sequence data can be exploited and become more than a simple GenBank record for reference purposes. The experimental process described above provides a novel way to relate genotypic and phenotypic information on collections of strains derived from healthy and diseased human subjects. The sequence data for all assemblies has been placed in the public domain and we are currently awaiting accession number assignments. If you have ideas for negative selection, let me know. I am happy to share the strains/pools and, funding permitting, primer pair aliquots targeting specific strains in the pools.

The projects described above were supported by NIAID via a contract to JCVI under the Pathogen Functional Genomics Resource Center (N01-AI15447) and funds from NIDCR to PFGRC in an attempt to enable the HMP research community to exploit genomic and metagenomic methods. The work pertaining to the oral cavity was done in collaboration with Dr. Walter Bretz at NYU and the efforts pertaining to the gut microbiome were done in collaboration with Dr. Cynthia Sears at JHU.