Evaluating Strain-level Variation of Key Acidogenic Species in Dental Plaque Biofilms

The characterization of the dental plaque microbiome using traditional 16S rDNA profiling strategies illustrates both the strengths and the limitations of this method. The central limitation of the 16S rDNA methodology is its inability to resolve strain-level variation within a microbiome. Why is this important? It is becoming a common theme in microbiome research that microbiomes associated with the human host are distinct from those that inhabit the environment. The species present in distinct human microbiomes represent only a small number of taxa. Within these taxa are relatively few genera that have massive representation of member species. This structure has been referred to as the deep fan structure. When comparing microbiomes from healthy and diseased subjects, important strain-level variations may commonly exist, and in many instances these may be causally related to the health of the human host. The dental plaque microbiome illustrates this point strongly. Oral microbiologists have isolated strains of species including S. mitis, S. sanguinis, S. mutans, S. gordonii and others that differ dramatically in their acid production and acid tolerance characteristics. The genes encoding these activities are not part of the core genome but reflect functions encoded in the strain-variable portion of the genome (~10-30% of the genome's coding capacity). Important aspects of human disease etiology may be missed if we fail to address this possibility.

Summary of Progress: Dental plaque samples from human subjects with and without dental caries were used to isolate S. mutans and S. sobrinus colonies using enrichment culturing procedures. Most colonies were subjected to 2-3 rounds of replating to obtain pure colonies. The individual clones were then grown in liquid media to isolate genomic DNAs to carry out fingerprinting of strains based on RFLP analysis. This allowed us to collapse positive strains that appeared identical or highly similar into a set of strains that appeared to be of maximal diversity, encoding the largest number of unique gene sequences. We further characterized the individual strains using primer pairs that are specific for either S. mutans or S. sobrinus. Several of the isolates were negative by PCR and these corresponded to isolates with unusual RFLP patterns and so were excluded from further analysis. Some isolates tested positive for one of the two primer pairs used for screening and were marked as such but retained for further analysis using genome sequencing. The isolates obtained were multiplexed into two lanes of the Solexa GSA IIx at a theoretical depth of coverage of 50X. Previous evidence based on comparative analyses indicates that strain-specific regions of the S. mutans genome are not randomly distributed but rather are present at discrete locations. The breadth of these regions is not fully characterized but will be greatly enhanced by our analyses. To date no reference genome sequence is available for S. sobrinus, a potentially important contributor to dental caries.

Each genome to be sequenced was uniquely barcoded using the EpiBio Nextera DNA sample prep kit, and sequencing was performed using an Illumina Genome Analyzer IIx. The sequenced reads were then searched against the GenBank non-redundant nucleotide database for quality assessment and to determine the top hit for each genome. As shown in Table 1, 76 isolates generated best hits to S. mutans and 47 to S. sobrinus genomes. For the 17 isolates that do not appear to be either S. mutans or S. sobrinus, it is somewhat puzzling how they were cultivated on the media used. We believe these colonies were impure and consisted predominantly of the organism whose genome was sequenced.

| Top BLAST hit genome | # of isolates |
|---|---|
| S. sobrinus | 47 |
| S. parasanguinis | 1 |
| E. faecalis | 1 |
| Lactobacillus spp. | 1 |
| S. mutans | 76 |
| Chryseobacterium gleum | 1 |
| S. aureus | 8 |
| S. epidermidis | 1 |
| S. caprae | 4 |

Table 1. Summary of the top hits of the reads from each isolate sequenced.

We used Newbler to assemble the sequence reads from each genome. For S. mutans we used mapping assembly against the S. mutans UA159 sequence, and for S. sobrinus we performed de novo assembly because no reference genome sequence is available. Overall the sequencing of isolates was successful, with one exception. The remaining 75 isolates assembled with an average coverage of 91% with respect to the reference genome. Given what is known about strain-specific gene content in S. mutans, one expects 90% coverage to be equivalent to complete coverage, since ~10% of UA159’s genome sequence is not likely to be shared with these isolates. The average number of contigs per isolate is 215, with an average length of 10,842 bp. Based on this outcome it is highly likely that we will identify sequence reads from essentially all strain-specific genes for each isolate. Determining the extent to which full-length gene sequences have been generated, and the extent to which those sequences retain genomic context, is part of our current efforts.

Ongoing Efforts. We are currently identifying strain-specific sequences from each isolate to determine the extent that these sequences might be shared among newly characterized isolates and their association with either caries-free or caries-active subjects. We will also identify the set of core gene sequences that appear to be present in all S. mutans and S. sobrinus genomes respectively. Ultimately we have demonstrated the use of high throughput sequencing technology as a means for characterizing oral pathogens of interest. Suggested applications for this type of research effort include the generation of strain-specific oligonucleotides to be added to existing DNA microarray content to enhance analysis using standard CGH methods. Another powerful use of this data can be obtained via the application of a variety of selection schemes that reveal the fitness of individual strains among the groups sequenced. The identification of strain-specific sequence signatures allows us to design primer pairs that can be used to measure the abundance and growth characteristics of that strain by qPCR. Potentially more interesting is the measurement of strains’ growth characteristics in competition with other sequenced strains. We have created mixtures of all of the sequenced S. mutans and S. sobrinus strains as independent pools and also generated a super pool including all sequenced strains. We have subjected these pools to a number of selective growth conditions including oxidative stress, low pH and growth on a variety of sugar substrates. In each case we envision that the generation of gene expression data and/or qPCR data detailing the abundance of each strain before and after selection will reveal individual strains that display high and low resistance to low pH, oxidative stress etc. 
This experimental procedure is analogous to phenotypic screens involving pools of single gene KO strains that have been uniquely barcoded to allow highly parallel analysis using DNA microarrays as popularized by the S. cerevisiae community. The variation performed here is to make use of the strain-specific gene sequences as a surrogate for the molecular barcode. Each strain will have at least one and probably hundreds of unique sequence identifiers that may be exploited for this purpose.
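The pooled selection idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the strain names and abundance values are hypothetical, the abundances stand in for qPCR measurements of a strain-specific sequence, and scoring by log2 fold change of relative abundance is one common choice rather than the method used in this project.

```python
import math

def log2_fold_changes(before, after, pseudocount=1e-6):
    """Return log2(after / before) of each strain's relative abundance.

    `before` and `after` map strain name -> measured abundance (arbitrary
    units). The pseudocount avoids log(0) for strains that drop out.
    """
    total_before = sum(before.values())
    total_after = sum(after.values())
    changes = {}
    for strain, value in before.items():
        frac_before = value / total_before
        frac_after = after.get(strain, 0.0) / total_after
        changes[strain] = math.log2(
            (frac_after + pseudocount) / (frac_before + pseudocount)
        )
    return changes

# Hypothetical abundances for three pooled strains, before and after
# growth under a selective condition such as low pH.
before = {"strain_A": 100.0, "strain_B": 100.0, "strain_C": 100.0}
after_low_ph = {"strain_A": 400.0, "strain_B": 100.0, "strain_C": 25.0}

scores = log2_fold_changes(before, after_low_ph)
# Strains ordered from most enriched to most depleted after selection.
ranked = sorted(scores, key=scores.get, reverse=True)
```

A positive score suggests a strain gained ground in the pool under the selective condition, and a negative score suggests it lost ground relative to its competitors.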

It is our hope that this demonstration will provide the dental research community a blueprint for how genome sequence data can be exploited and become more than a simple GenBank record for reference purposes. The experimental process described above provides a novel way to relate genotypic and phenotypic information on collections of strains derived from healthy and diseased human subjects. The sequence data for all assemblies has been placed in the public domain and we are currently awaiting accession number assignments. If you have ideas for negative selection, let me know. I am happy to share the strains/pools and, funding permitting, primer pair aliquots targeting specific strains in the pools.

The projects described above were supported by NIAID via a contract to JCVI under the Pathogen Functional genomics Resource Center (N01-AI15447) and funds from NIDCR to PFGRC in an attempt to enable the HMP research community to exploit genomic and metagenomic methods. The work pertaining to the oral cavity was done in collaboration with Dr. Walter Bretz at NYU and the efforts pertaining to the gut microbiome were done in collaboration with Dr. Cynthia Sears at JHU.

Cataloguing the Gene Expression Patterns of Dental Plaque Biofilms: A Reference Dental Plaque Transcriptome

The RNA-Seq method has been widely adopted as an alternative to the use of DNA microarrays. In most contexts, the RNA-Seq method is implemented when a single reference organism is being studied. Our project endeavored to establish working methods to enable the generation of cDNA libraries that were depleted of contaminating human mRNA and host/microbiome rRNA sequences that would otherwise represent over 95% of the total sequence reads obtained. We have also made significant efforts to define bioinformatics procedures that allow RNA-Seq data to be assigned to appropriate species such that global gene expression analyses can be routinely conducted by the dental research community and those involved in HMP research objectives.

We have established a catalogue of expressed genes in dental plaque by turning to the Solexa sequencing platform and applying RNA-Seq to a collection of 19 twin pairs that are either concordant for dental health (caries-free concordant twin pairs), concordant for dental caries (caries-active concordant twin pairs) or discordant for dental caries (one twin caries-free and the other member of the twin pair caries-active). Based on our analysis of the data we have established that the most abundant ten species in each sample varies significantly from subject to subject. This fact greatly complicates the mapping of reads to reference genomes. Another significant conceptual challenge we faced was how to conduct highly specific mapping of transcripts to genomes of interest. We know that genes in genomes evolve at substantially different rates; some genes may differ by 2-5% across species boundaries whereas others may differ by 25-30%. The consequence of this is that no single cut-off for mapping a transcript to a reference genome may be reliably employed. We therefore reasoned that by creating an oral cavity reference genome database we could map each transcript according to reasonable specificity criteria but impose a best-hit criteria on the data to ensure minimal mis-mapping.
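A minimal Python sketch of the best-hit idea described above follows. It is not the project's pipeline. The 90% identity floor, the tuple layout of the hits, and the decision to discard reads whose top score is tied between genomes are all assumptions made for illustration.

```python
from collections import defaultdict

def assign_best_hits(hits, min_identity=90.0):
    """Assign each read to the single genome with its highest identity hit.

    `hits` is an iterable of (read_id, genome, percent_identity) tuples,
    for example parsed from tabular alignment output. Hits below
    `min_identity` are ignored, and reads whose best score is shared by
    more than one genome are left unassigned to limit mis-mapping.
    """
    by_read = defaultdict(list)
    for read_id, genome, identity in hits:
        if identity >= min_identity:
            by_read[read_id].append((identity, genome))

    assignments = {}
    for read_id, candidates in by_read.items():
        best_identity = max(identity for identity, _ in candidates)
        top = [g for identity, g in candidates if identity == best_identity]
        if len(top) == 1:
            assignments[read_id] = top[0]
    return assignments

# Hypothetical hits against an oral reference genome database.
hits = [
    ("r1", "S. mutans", 99.0),
    ("r1", "S. sobrinus", 92.0),  # passes the floor but is not the best hit
    ("r2", "S. mitis", 97.0),
    ("r2", "S. oralis", 97.0),    # tie, so r2 stays unassigned
    ("r3", "S. gordonii", 80.0),  # below the identity floor
]
assignments = assign_best_hits(hits)
```

Here the identity floor plays the role of the "reasonable specificity criteria" and the unique-best-hit rule plays the role of the best-hit criterion, so a read that matches several related genomes is credited only to its closest match.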

Based upon the data generated (38 samples x ~32.8 million reads/sample; ~1 billion reads, or over 100 Gb of sequence data), we have fulfilled the goal of establishing a robust procedure for RNA-Seq and of cataloguing the specific transcripts expressed in dental plaque biofilms. These sequences and the associated SOPs developed for effective microbial RNA enrichment have been made available through the DACC. In addition, we have devised a strategy for mapping reads to particular functional or biochemical pathways, such as those related to acid/base production, as an independent means of exploiting RNA-Seq data. In this scheme, which species are expressing a function is not considered important; what matters is the sum total of expressed sequences related to acid/base production. The approach is similar to that described above in that a database is created from all sequence data derived from particular biochemical pathways, as a means of recruiting reads of appropriate sequence identity that map to annotated genes. Over- or under-representation of expressed genes constituting discrete pathways may then be evaluated.
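The pathway-level view can be illustrated with a short Python sketch. Everything specific here is an assumption for illustration: the read identifiers, the gene-to-pathway table, and the use of a simple fraction of total reads as the summary statistic.

```python
from collections import Counter

def pathway_fractions(read_to_gene, gene_to_pathway):
    """Return the fraction of all reads recruited to each pathway.

    `read_to_gene` maps read id -> annotated gene name, regardless of
    which species the read came from. Genes absent from
    `gene_to_pathway` are simply not counted toward any pathway.
    """
    total = len(read_to_gene)
    if total == 0:
        return {}
    counts = Counter()
    for gene in read_to_gene.values():
        pathway = gene_to_pathway.get(gene)
        if pathway is not None:
            counts[pathway] += 1
    return {pathway: n / total for pathway, n in counts.items()}

# Hypothetical pathway table and read assignments for one sample.
gene_to_pathway = {
    "ldh": "acid production",   # lactate dehydrogenase
    "arcA": "base production",  # arginine deiminase
}
read_to_gene = {"r1": "ldh", "r2": "ldh", "r3": "arcA", "r4": "recA"}

fractions = pathway_fractions(read_to_gene, gene_to_pathway)
```

Comparing these fractions between caries-free and caries-active samples would be one simple way to look for over- or under-representation of a pathway, although a real analysis would also need normalization and a statistical test.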

The projects described above were supported by NIAID via a contract to JCVI under the Pathogen Functional genomics Resource Center (N01-AI15447) and funds from NIDCR to PFGRC in an attempt to enable the HMP research community to exploit genomic and metagenomic methods. The work pertaining to the oral cavity was done in collaboration with Dr. Walter Bretz at NYU and the efforts pertaining to the gut microbiome were done in collaboration with Dr. Cynthia Sears at JHU.

Surrogate Methods for Profiling Species of the Oral and Gut Microbiome

We engaged in an effort focused on alleviating a substantial barrier facing the human microbiome research community. While powerful, the 16S rDNA gene is insufficiently divergent to discriminate many species, or essentially any strains, present within communities. The increasing costs of Sanger sequencing have forced most investigators to adopt the Roche 454 sequencing platform to address the question, “who’s there?” The benefits of the 454 sequence data are clear, as investigators enjoy deep data sets with excellent statistical power. A major drawback is that the read length of the 454 platform does not allow the acquisition of a sufficient number of “informative bases” for species-level identification, so the data generally depict only the genera present in the microbiome. While there is much to be gained by large-scale genus-based comparisons, it is highly desirable to have species and even strain-level resolution. Much of the difference between healthy and diseased human microbiomes may lie at the species and strain level, making it important to develop strategies that allow species abundance measurements to be made on large human cohorts in a cost-effective manner. We used capture array technology in an iterative fashion to establish a comprehensive sequence database of seven conserved gene sequences. We performed a proof of concept using two model systems: the oral (dental plaque) microbiome and the fecal microbiome. We designed capture oligonucleotides that tiled each of seven universally conserved gene sequences present in GenBank belonging to genera known to be present in the gut and oral cavity, respectively. We refer to these oligonucleotides as “seed sequences” for use in capturing orthologous sequences present in both stool and dental plaque biofilms and saliva.

We next prepared complex mixtures of dental plaque and saliva from several individuals and separately also prepared a similar stool mixture representing a diversity of subjects. The DNAs generated from these microbiome samples were used in conjunction with the capture array. We refer to the captured DNAs as “cloud sequences” that represent related sequences (phylogenetic clades) surrounding the original seed sequences. We repeated the capture array process three times such that novel identified sequences relative to the original seeds were added to subsequent capture array designs. Our goal is to establish a taxonomic representation of these microbiomes based on detailed DNA sequence data of seven housekeeping genes, reminiscent of long-standing MLST approaches. We are leveraging existing and future reference genome sequences to annotate the sequence data obtained from capture array data. Additional species may be subsequently added to this framework by the HMP research community simply by sequencing the relevant loci from defined species available via ATCC, BEI or from the strain collections held by hundreds of investigators world-wide.  The power of this approach lies in the provision of DNA sequences that can be used to design qPCR primer pairs capable of highly discriminatory amplification and abundance measurements of species and strains of potential interest.

Despite the fluctuation in the efficiency of capturing orthologs among the seven target genes, we were able to generate a substantial depth of coverage for three genes in the oral cavity (pyrG, pgi and recA) and four genes in the gut (pyrG, dnaG, pgi and recA). We have been analyzing the total gene sequence data obtained from capture arrays, including four 454 runs each for oral and fecal microbiomes. Given that the sequence data represent highly related sequences derived from tens or hundreds of strains belonging to the same species, we were pessimistic that assembly of sequence reads would be fruitful. Our attempt at de novo assembly using Newbler confirmed our concerns and was not successful. We have defined an in silico approach to organize the sequence data that involves generating a microbiome reference genome database populated with relevant genomes derived from the oral cavity and gut. In addition to the original genes collected from GenBank, we added the 7 targeted gene sequences from 134 oral-related genomes and 162 gut-related genomes. By creating this database we will be able to map each gene sequence to the reference genome to enhance the specificity of each assignment. We are mapping the reads from our sequencing data to genomes using high-stringency cut-offs. Those reads mapping to reference genomes will be used to generate multiple sequence alignments to derive a consensus sequence and identify exploitable polymorphisms for qPCR primer design. For this we will not only rely on the multiple sequence alignments but will also compare alignments for any individual species to others within a major clade (common genera). This will allow us to determine the sequences with the highest probability of being unique to the species of interest.
Preliminary assessment of the DNA sequence data has shown promising outcomes as we are able to recapitulate phylogenetic clades such as the viridans group of Streptococci using gene sequences derived from recA. This supports the idea that gene representation from species known to be present in the oral cavity were effectively captured. The clade or sub-clade primer design will be based on all the sequences reliably mapped to genomes.
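As a rough illustration of deriving a consensus and locating candidate polymorphisms from aligned reads, consider the Python sketch below. It assumes the reads have already been aligned to equal length with "-" marking gaps, uses a simple majority vote per column, and uses made-up sequences. It is not the procedure used in this project.

```python
from collections import Counter

def consensus_and_variable_sites(aligned):
    """Return (consensus, variable_positions) for equal-length aligned reads.

    The consensus takes the most frequent non-gap base in each column.
    A column is reported as variable when more than one distinct non-gap
    base appears in it. Positions are 0-based.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to one common length")

    consensus = []
    variable = []
    for position, column in enumerate(zip(*aligned)):
        bases = [base for base in column if base != "-"]
        if not bases:
            consensus.append("-")  # column contains only gaps
            continue
        counts = Counter(bases)
        consensus.append(counts.most_common(1)[0][0])
        if len(counts) > 1:
            variable.append(position)
    return "".join(consensus), variable

# Hypothetical aligned fragments of one gene from reads mapped to one species.
aligned = ["ATGCCA", "ATGTCA", "ATG-CA", "ATGCCA"]
consensus, variable = consensus_and_variable_sites(aligned)
```

Variable columns found within one species, compared against the same columns in neighboring species of the clade, are the kind of positions a primer design step could target or avoid.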

It is our goal to design useful primer pairs representing species-level resolution. This will be achievable in many cases but not all. We are seeking funds to create a repository of primer pairs to share with the HMP community. It should be noted that initially, none of the primer designs will be experimentally validated and as such users will need to carefully evaluate their usage in the context of their experimental goals. It is our plan to continue efforts associated with this project to conduct validations to the extent that funding permits. These results will be added to the primer designs as they are validated or deemed unsuitable for experimental use.

The projects described above were supported by NIAID via a contract to JCVI under the Pathogen Functional genomics Resource Center (N01-AI15447) and funds from NIDCR to PFGRC in an attempt to enable the HMP research community to exploit genomic and metagenomic methods. The work pertaining to the oral cavity was done in collaboration with Dr. Walter Bretz at NYU and the efforts pertaining to the gut microbiome were done in collaboration with Dr. Cynthia Sears at JHU.

Sequencing of high yield influenza reassortants at JCVI

As part of the Influenza Genome Sequencing Project, JCVI will be sequencing a large number of high yield influenza reassortants created in the lab of Dr. Doris Bucher at New York Medical College. Dr. Bucher’s lab has prepared the type A H3N2 high yield reassortants (hyrs) for the influenza vaccine for the past several years, both within the US and worldwide.
The Bucher lab continues the tradition of preparing the hyrs as developed by preeminent influenza virologist Dr. Edwin D. Kilbourne (1920-2011). Dr. Kilbourne developed and applied the technology to produce the first genetically engineered influenza vaccines; these vaccines, which typically change yearly, have been in use for over 40 years.
JCVI will be sequencing approximately 46 hyrs from Dr. Kilbourne’s collection, which was assembled as part of the Kilbourne/New York Medical College Archive of Influenza Virus Reassortants, Mutants, and Antisera. Detailed information for every virus stored in the archive is provided at the archive website. The assembly of the archive was sponsored by the NIAID, and viruses in the archive are available through BEI Resources. All sequence data and metadata associated with the hyrs sequenced at JCVI will be made publicly available in the Influenza Research Database.
Dr. Kilbourne passed away on February 21, 2011 at the age of 90. A eulogy in remembrance of Dr. Kilbourne and his pioneering work in the field of influenza virology can be found at:

Insights gained from influenza genomic sequence data: viral diversity within human populations

The advent of large amounts of influenza genomic sequence data produced by the Influenza Genome Sequencing Project (IGSP) has led to new concepts regarding influenza viral diversity.  It was previously believed that a single influenza lineage entered a human population at the start of an influenza season and gradually spread over time; however, recent analyses of influenza genomes revealed that multiple viral lineages co-circulate within individual populations throughout an influenza season.  These different lineages appear to be continuously introduced which provides the opportunity for frequent intra-subtype reassortment.  Interestingly, similar levels of influenza diversity exist within populations of both large metropolitan cities and small towns (E.C. Holmes, 2009).  Multiple, diverse viral lineages of the same subtype have been observed co-circulating in urban locations comprised of expansive travel networks and rural locations that are geographically isolated.

Additional analyses of complete influenza genomes have led to a ‘source-sink’ model of influenza seasonality.  In this model, a global, human ‘source’ population of influenza viruses is thought to be responsible for the antigenic variants that ignite seasonal epidemics in the ‘sink’ populations of the Northern and Southern hemispheres (A. Rambaut, 2008; E. C. Holmes, 2009).  The geographic regions of East and Southeast Asia have been hypothesized as potential sources of influenza due to the large, dense human populations which would allow influenza viruses to antigenically evolve with maximum efficiency.  These locales may be the focus of future surveillance efforts aimed at identifying emergent influenza viruses that have evolved mechanisms to evade current vaccines.

A Look Back at 2010 at the JCVI…

As the J. Craig Venter Institute (JCVI) soars into its 19th year, we reflect on the past year of highlights and accomplishments to mark the close of 2010 and look forward to more significant scientific advances in 2011.

JCVI Top 10 of 2010 …

1. First Synthetic Cell: Fifteen years in the making, the first self-replicating, synthetic bacterial cell was successfully constructed in 2010 amid huge anticipation. The work was published in Science in May. The synthetic cell, called Mycoplasma mycoides JCVI-syn1.0, is the proof of principle that genomes can be designed in the computer, chemically made in the laboratory and transplanted into a recipient cell to produce a new self-replicating cell controlled only by an artificial genome. Although the first synthetic cell was not designed to produce a specific bioproduct, the team has shown that this can be done and the potential benefits are numerous. The research team, led by JCVI President Craig Venter, Hamilton Smith, Clyde Hutchison, and Daniel Gibson, envisions a future where the rapid design and production of biological products using synthetic biology techniques will be used to produce clean fuels, medicines, and other bioproducts. Throughout the course of this work, the JCVI Policy group has extensively engaged in outside review of the ethical and societal implications of this work, including advising the new Presidential Commission on Bioethics on their recommendations for oversight.

M. mycoides JCVI-syn1

2. Synthetic Vaccines: Following on the heels of the announcement of the first synthetic cell, the company Synthetic Genomics Inc. and JCVI announced in October the formation of a new company, Synthetic Genomics Vaccines Inc. (SGVI). The privately held company is focused on developing next generation vaccines that can be rapidly produced and tested, which is especially important for outbreaks of new infectious diseases. SGVI also announced a three-year collaboration with Novartis to apply synthetic genomics technologies to accelerate the production of the influenza (flu) seed strains required for vaccine manufacturing. The seed strain is the starter culture of a virus, and is the base from which larger quantities of the vaccine virus can be grown. Under this collaboration, Novartis and SGVI will work to develop a “bank” of synthetically constructed seed viruses ready to go into production as soon as WHO makes recommendations on the flu strains. The technology could reduce vaccine production time by up to two months, which is particularly critical in the event of a pandemic.

3. Hydra Genome – one of the animal kingdom’s earliest common ancestors: JCVI scientists, along with more than 70 other researchers from around the world, have sequenced and analyzed the genome of Hydra magnipapillata, a freshwater member of the Cnidaria, stinging animals that include jellyfish, sea anemones and corals. The research, published in the March 14 edition of Nature, was co-led by Ewen F. Kirkness, JCVI, Jarrod A. Chapman, Department of Energy Joint Genome Institute, and Oleg Simakov, University of California, Berkeley. This is the second sequenced cnidarian genome, following that of a sea anemone, Nematostella vectensis, in 2007. The ancestors of these two species diverged more than 500 million years ago, and comparison of their genomes has revealed common features of the earliest animals that gave rise to the diversity of animals on Earth today. The team found clear evidence for conserved genome structure between the Hydra and other animals, like humans. Unexpectedly, the sequencing also revealed a novel bacterium that lives in close association with the Hydra.

4. Uncovering the Human Microbiome: Microbes live within and on the human body, and this collective community is called the human microbiome. As one component of the large-scale NIH Roadmap Human Microbiome Project, JCVI scientists and colleagues at three other genome centers sequenced the genomes of ~180 microbes from the human body; the work was published in the May 21 edition of Science. At the JCVI we anticipate sequencing an additional 400 species over the next few months. Colleagues at the JCVI are also using single cell approaches to isolate new strains that have not been cultured, isolates whose genomes will also be completely sequenced. The role these microbes play in human health and disease is still relatively unknown and these approaches are allowing us to gain a greater understanding of these enigmatic species.

5. Body Louse Genome: A global research team led by Ewen Kirkness and colleagues from JCVI published a study in the Proceedings of the National Academy of Sciences in June describing the sequencing and analysis of the human body louse, Pediculus humanus humanus, a human parasite responsible for the transmission of bacteria that cause epidemic typhus, relapsing fever and trench fever. Detailed analysis of the genome was then conducted by a large international group of 71 scientists, coordinated by Barry Pittendrigh, University of Illinois, and Professor Evgeny Zdobnov, University of Geneva Medical School. Comparative studies of the body louse genome with other species revealed features that will enhance our understanding of the relationships between disease-vector insects, the pathogens they transmit, and the human hosts. In addition to the targeted louse genome, the project unexpectedly yielded the complete genome sequence of a bacterial species, Riesia, that lives in close association with lice, and which is essential for survival of the insects. The researchers believe that the genome will be a valuable reference for evolutionary studies of insect species, especially in the areas related to insect growth and development.

6. Castor Bean Genome Sequencing: A research team co-led by Agnes P. Chan and colleagues from JCVI and Jonathan Crabtree and others at the Institute for Genome Sciences, University of Maryland School of Medicine, published the sequence and analysis of the castor bean (Ricinus communis) genome in Nature Biotechnology in August. Because of the potential use of castor bean as a biofuel and its production of the potent toxin ricin, the team focused efforts on analysis of genes related to oil and ricin production. The analyses could be important for comparative studies with other oilseed crops, and could also allow for genetic engineering of castor bean to produce oil without ricin. Identifying and understanding the ricin-producing gene family in castor bean will be important in preventing and dealing with potential bioterrorism events. Genomics enables enhanced diagnostic and forensic methods for the detection of ricin and precise identification of strains and geographical origins. As a next step, the group suggests further comparative genomic studies with the close relative cassava, a major crop in the developing world, to further elucidate their disease resistance aspects.

7. Science Education: JCVI was an Official Partner of the inaugural USA Science and Engineering Festival held on the National Mall in Washington, DC in October. The Festival, which was the country’s first national science festival, included over 500 of the country’s leading science and engineering organizations with the aim to reignite the interest of our nation’s youth in the sciences. The JCVI ‘Discover Genomes’ Bus was showcased during a two-day expo and some of the research being done at JCVI was presented to around 1700 visitors by our scientists and staff.

There were lines all day!

8. Viral Genomics: In 2010 the JCVI published over 1600 influenza genomes. Over 75% of all published flu genomes to date have been sequenced by the JCVI, a total of over 6000 genomes. This year the diversity of viral genomes we have sequenced has significantly expanded under the NIH Genomic Sequencing Center for Infectious Diseases contract. Some of the projects include viruses causing diseases such as measles, mumps, rubella, encephalitis, SARS, and the common cold, just to name a few. The viral group has annotated and published 79 Rotavirus (stomach flu) and 33 Coronavirus genomes (includes SARS and common cold) this year and many more will be published in 2011. The pace of sequencing and finishing genomes has also increased this year as a result of adoption of nextgen platforms (e.g. Roche/454 and Illumina/Solexa) and the development of more efficient methodologies to increase productivity while reducing costs.

9. Marine Microbial Genome Sequencing Project: JCVI scientists have continued their quest to isolate and sequence microbes living in global ocean waters to discover new genes and enzymes, and to help understand the role microbes play in the ocean ecosystem. Shibu Yooseph, Kenneth Nealson and colleagues at JCVI published an analysis of the genomes of 137 known marine microbes living in global ocean surface waters in Nature in November. These genomes were compared to 10.97 million sequences from JCVI’s Sorcerer II Global Ocean Sampling (GOS) metagenomic data and thousands of 16S rRNA sequences. The marine genomes were collected as part of the Gordon and Betty Moore Foundation-funded Marine Microbial Genome Sequencing Project, a project coordinated by JCVI that has a primary goal of obtaining whole genome sequences of ecologically important microbes from a variety of diverse, global marine environments. The work provides a good example of combining metagenomic data with sequenced genome data to study microbial communities and to generate testable hypotheses in microbial ecology.

10. Sorcerer II Global Ocean Sampling Expedition: On December 17th 2010 Sorcerer II arrived in Florida after spending the last two years with her crew collecting samples in the Baltic, Mediterranean and Black Seas. Funded generously by the Beyster Family Foundation Fund, The San Diego Foundation, and Life Technologies Foundation, Sorcerer II has sailed ~28,000 nautical miles since departing San Diego in March 2009. During this time 212 samples were collected and over 5,100 liters of sea water were filtered and sent to JCVI for analysis of the microbial life they contain. JCVI established strong collaborations with scientists in all 16 countries in which samples were collected, which will lead to joint publications and future collaborative studies in the new year.

Sunrise in the Ligurian Sea

Looking Forward to 2011…

Ten-year anniversary of the Human Genome Project: To commemorate the anniversary of the publications of the first human genome sequences in 2001, JCVI and Nature are hosting a conference and celebration in February 2011 titled – Human Genomics: The Next 10 Years. The conference will look forward to the promises of human genomics for the next 10 years, with sessions on medical advances related to genomics; the technological and ethical challenges of human genomics; personalized and familial genomics; the human microbiome project; variation in the human genome; and making sense of the genetic code. This conference will be a great way to jump into the new year and inspire the grandiose ideas and achievements that genomic scientists will accomplish over the years to come.

Holiday Art

In a relatively unknown place, on the 3rd floor of JCVI in Rockville, MD, is a small fungal room where art meets science (and of course where all our fungal research takes place). Fungus often gets a bad reputation for being gross and somewhat ‘standard’. We fungal folks know better, and I am hoping to educate others about the underlying beauty that fungi possess, in a funky way. I recognize that beauty is in the eye of the beholder, but I felt this might convince some that fungus can be fun and not just something that grows in the back of your fridge or a nuisance that contaminates your plates. Please enjoy these funky fungal holiday art forms.

Fungal Christmas tree. Top: Talaromyces stipitatus; Tree: Aspergillus nidulans; Ornaments: Penicillium marneffei; Trunk: Aspergillus terreus.


Fungal snowman. Hat, Eyes, Mouth, Buttons: Aspergillus niger; Arms: Aspergillus nidulans; Nose: Aspergillus terreus with Penicillium marneffei; Body: Neosartorya fischeri.


Fungal Christmas Tree.


I am open to suggestions and only limited by my own creativity (and of course my current work load) but never by the diversity of the very cool fungal world.

Insights gained from influenza genomic sequence data: frequent intrasubtype reassortment

Studies using whole genomic influenza sequence data produced by the Influenza Genome Sequencing Project (IGSP) have focused mainly on influenza evolution and epidemiology. For instance, IGSP data has provided important insight into the frequency of intrasubtype reassortment (in which reassortment occurs between different segments of the Influenza genome). The data suggests that reassortment occurs frequently, leading to viruses with altered antigenic properties that may evade current vaccines. Thus, it is useful to study not only the HA and NA segments that produce the hemagglutinin and neuraminidase proteins that sit on the surface of the virion and interact with host cells, but the whole viral genome, as this provides a complete picture of the emergence of the virus (E.C. Holmes, 2009).

The significance of intrasubtype reassortment for strain emergence was shown by the appearance of the new strain of Influenza H1N1 in 2009, which is a reassortant virus containing multiple swine influenza lineages.

In their October 2010 publication, Ilyushina et al. show that, although none have been detected in humans so far, viable seasonal/pandemic Influenza virus reassortants can be generated in a laboratory setting. Their study showed that intrasubtype reassortment is able to occur between seasonal H3N2 and pandemic H1N1 viruses, potentially leading to the emergence of a strain with higher virulence.
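
To make the idea of a reassortant concrete, here is a minimal sketch in Python. It assumes that each of the eight influenza A genome segments has already been assigned a lineage label by some upstream phylogenetic analysis; the function names and the labels `lineage_A` and `lineage_B` are hypothetical and are not taken from the IGSP or from any of the studies cited above. A genome whose segments carry more than one lineage label is flagged as a candidate reassortant.

```python
# The eight genome segments of influenza A.
SEGMENTS = ("PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS")


def segments_by_lineage(lineage_by_segment):
    """Group segment names by their (hypothetical) lineage label."""
    missing = [s for s in SEGMENTS if s not in lineage_by_segment]
    if missing:
        raise ValueError(f"missing lineage for segments: {missing}")
    groups = {}
    for seg in SEGMENTS:
        groups.setdefault(lineage_by_segment[seg], []).append(seg)
    return groups


def looks_reassortant(lineage_by_segment):
    """True if the segments do not all share a single lineage label."""
    return len(segments_by_lineage(lineage_by_segment)) > 1


# Made-up example data, for illustration only.
uniform = {seg: "lineage_A" for seg in SEGMENTS}
mixed = dict(uniform, NA="lineage_B", M="lineage_B")

print(looks_reassortant(uniform))
print(looks_reassortant(mixed))
print(segments_by_lineage(mixed))
```

This is also why looking only at HA and NA can miss things: a check limited to those two segments would not notice a swap in, say, the M or NS segment.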

Take home message of the 2010 Amebiasis Montreal Meeting: beware of who you kiss…

The Entamoeba community is a small and collegial one. Everyone knows everyone, and everyone wants to collaborate, learn, and do more to tackle this most neglected of neglected diseases. For many, the thought of an amoeba brings to mind Gary Larson’s The Far Side amorphous characters watching TV and dealing with domestic issues…but what few know is that the WHO considers amebiasis one of the major health problems in developing countries, surpassed only by malaria and schistosomiasis for deaths caused by a parasitic infection.

Amoeba Real Life...

TIGR/JCVI has had a long-standing relationship with Entamoeba histolytica and other related species. Started by Brendan Loftus back in the day, followed by Neil Hall and continued by yours truly with the Entamoeba GSCID project, we have provided this eager community with what they greatly need: genome sequences. And they are appreciative of that, for sure.

The meeting was small and intense, and started with a one-day workshop on clinical aspects of the disease, not only the enteric disease but also the oral disease. It was good to mingle with dentists and see their point of view and the devastating reality of clinical cases of periodontal disease…and periodontal disease affects, at one point or another, the majority of adults. And Entamoeba gingivalis is always there…but not always diagnosed. Why? We assumed bacteria, bacteria, and bacteria. Treat the bacteria with antibiotics…but ignore the amoebas. Patients do everything right when it comes to oral hygiene, but the disease persists, the bone continues to be destroyed, and teeth are lost…secondary consequences such as fatigue, diabetes, heart disease, renal dysfunction, low birth weight, and a myriad of other diagnoses are known to be related to oral health and periodontal disease, but they are rarely considered in the context of this amebiasis infection. Interesting cases of entire families being affected by this parasite were presented (it is highly contagious). Dr. Bonner, the organizer of the conference, is a big advocate of microscopes in dental practices as the primary diagnostic method for periodontal disease and identification of amebiasis, and he is surely determined to speak to the world about that. And he also wants the genome done.

Entamoeba community, Montreal 2010

During the core of the meeting, Neil Hall and I were the only ones on the genomics side of things: we presented SNP analyses done so far on the strains we have sequenced, in their case using SOLiD, in our case just 454 (for now). Both talks were very well received and the community is eager to see more. Both pieces of work will be used in a global SNP analysis, particularly focusing on a family of proteins known to be involved in virulence, the Gal/GalNac lectins. These proteins are one of the main targets for vaccine development, led by Dr. Bill Petri, who is about to start on that endeavor at full speed. Also on this topic, Dr. Jonathan Ravdin (Dean and Executive Vice President of the Medical College of Wisconsin) presented results for an intranasal Gal-lectin subunit vaccine against experimental Entamoeba histolytica infection in baboons, showing its protective effect against enteric colitis and the promising future of this vaccine target for humans.

For amebiasis “a la mode”, the usual suspect topics of the conference involved Entamoeba histolytica signaling pathways, classic protein characterization of large families, proteases and invasion, and large genotyping studies of outbreaks, as well as mechanisms of pathogenesis. One interesting talk was on the generation of cyst-like forms in vitro for Entamoeba histolytica, a true breakthrough, since there is no model of encystation so far, and the possibility of obtaining these structures in vitro will certainly open up a whole new world of studies that were previously confined to Entamoeba invadens, a very distant parasite of lizards that does encyst in vitro.

There was modest interest in drug discovery. There are so far two kinds of drug therapy for amebiasis: luminal amoebicides (paromomycin, diloxanide furoate and iodoquinol), which are used for intestinal disease but are ineffective against organisms in tissue, and metronidazole and other derivatives, which are used for invasive disease. However, resistance to these drugs is easily achievable at clinically acceptable drug levels. James McKerrow’s group (UCSF) presented a high-throughput screen of small molecules using a library of FDA-approved drugs and known bioactive compounds, and they identified six compounds with similar or better activity than metronidazole in vitro. Another group, from Mexico, is focusing on probiotics and natural compounds from plants such as Astrophytum capricorne (cactus), Jatropha dioica and Eucalyptus camaldulensis.

Montreal View from Le Crystal Hotel

To finish, because the list of interesting things could go on and on, a wonderful piece of work that will hopefully provide an extensive framework for further studies of host susceptibility to E. histolytica comes from Bill Petri’s lab. They have performed a small hairpin RNA (shRNA) screen to identify human factors crucial for E. histolytica cytotoxicity. Using a mammalian shRNA knockdown library, they did nine rounds of selection using 1:5 and 1:50 parasite:host ratios, and resistant clones after 6 rounds of selection were sequenced using Solexa. In this way they identified a number of host gene families, including kinases, surface receptors and ion channels, that may be important for susceptibility to the parasite, and of course, they are working on that…

But the take home message that I have imprinted in my brain is…do not kiss your dog.  After three years of age, ALL DOGS have periodontal disease. And inevitably they have Entamoeba gingivalis and possibly other species as well…and they are one of the sources of infection to humans. Dog kisses owner, owner kisses lover, wife, husband, and kids…and the amoeba conquest of the world continues…

For your delight, two movies on the topic: Mark Bonner, the meeting organizer (in French) and a film of a patient with periodontal disease biofilm. Enjoy!

Entamoeba histolytica research presented at the Molecular Parasitology Meeting

Entamoeba histolytica causes invasive intestinal and extraintestinal infections, known as amoebiasis, in about 50 million people and still remains a significant cause of human death in developing countries. However, for unknown reasons, fewer than 10% of E. histolytica infections are symptomatic (causing symptoms such as diarrhea, dysentery or liver abscess). The J. Craig Venter Institute is among the institutions awarded the NIAID Genome Sequencing Centers for Infectious Diseases (GSCID) contracts to provide high-quality genome sequencing and high-throughput genotyping of NIAID Category A-C priority pathogens.

Photo of Entamoeba histolytica

Entamoeba histolytica in the trophozoite stage.

A GSCID project led at JCVI by Dr. Elisabet Caler includes performing whole-genome sequencing of Entamoeba phenotypic variants from symptomatic, asymptomatic and liver abscess-causing strains chosen to include a range of clinical manifestations and taken from human cases, as well as strains grown under different conditions. Our objective is to develop a genome-wide landscape of Entamoeba diversity to understand how sequence variations in the parasite relate to pathogenicity (ability to cause disease) and clinical outcome.

The Molecular Parasitology Meeting held at the Woods Hole Oceanographic Institution, Woods Hole, MA last week provided a window into the exciting science of Parasitology.  The keynote speaker, Fotis Kafatos, spoke on “Major Challenges to Global Health in the Tropics and Beyond–Insect Vectors of Malaria and Other Parasitic or Viral Diseases.”  Dr. Kafatos stressed that a multi-pronged approach to the control of malaria is necessary to prevent the devastating loss of life that malaria causes.

Woods Hole Oceanographic Institution

A view of Woods Hole Oceanographic Institution.

The many excellent papers and posters provided an overview of the field, including Plasmodium falciparum, Toxoplasma gondii, the trypanosomes, Giardia lamblia, Trichomonas vaginalis, Entamoeba histolytica, Schistosoma species, Babesia bovis, and associated vectors. Topics spanned basic biology, drug design, sequencing and host-pathogen interactions.

I presented an overview of the Entamoeba sequencing project at the meeting. The presentation prompted questions about the details of sequencing and of handling next-generation sequencing data. We had animated discussions about methods for assembling the DNA sequences, including reference-guided vs de novo assembly. Many attendees were impressed with JCVI’s open-source METAREP metagenomic tool (J. Goll, et al., Bioinformatics 2010). How best to analyze differences among the clinical isolates generated much discussion. Entamoeba researchers see the sequences as a great resource and are looking forward to being able to mine the data. One, from India, was very excited that he was going to have about 15 times the resources he has had in the past, since he has had only one genome to mine up until now.

The Molecular Parasitology Meeting was an excellent venue for scientific exchange.  The Entamoeba histolytica GSCID project will help us understand the pathogenicity of Entamoeba histolytica, and has the potential to save lives in developing countries.