Posts tagged GSC

JCVI Viral Finishing Pipeline: a Winning Combination of Advanced Sequencing Technologies, Software Development and Automated Data Processing

JCVI viral projects are supported by the NIAID Genomic Sequencing Center for Infectious Disease (GSCID). The viral sequencing and finishing pipeline at JCVI combines next-generation sequencing technologies with automated data processing. This has allowed us to complete over 1,800 viral genomes in the last 12 months, and almost 8,800 genomes since 2005.

Viral Projects at JCVI

JIRA Viral Sample Tracking Workflow

Our NextGen pipeline, which utilizes SISPA-generated libraries with Roche/454 and Illumina sequencing, enables us to complete a wide variety of viral genomes, including challenging samples. The automated assembly pipeline employs CLCbio command-line tools and JCVI cas2consed, a cas-to-ace assembly format conversion tool. Our complementary Sanger pipeline software is currently being integrated with the NextGen pipeline. This will improve our data processing and will allow us to use validation software (autoTasker) more efficiently.

Assembly of Repetitive Viral Genomes

Genome Organization of Varicella-Zoster

Assembly of Novel Viral Genomes

CLC Assembly Viewer Representation

Promoter of Bat Genome

During the past year we have found that novel viruses, repetitive genomes, and mixed infection samples could not be easily integrated with our high-throughput assembly pipeline. We have developed an assembly and finishing process that utilizes components of the high-throughput pipeline and combines them with manual reference selection and editing. Using this approach we completed novel adenovirus genomes and mixed-infection avian influenza genomes, and improved assemblies of previously unknown arbovirus genomes. We are currently working on optimizing and automating this new pipeline.

Assembly of Mixed Viral Genomes

Consed Representation of Mixed Viral Sample

Repetitive genomes have long been known to present great challenges during assembly and finishing. We are presenting a new approach to assembly and finishing of the repetitive varicella genome that is based on separating it into overlapping PCR amplicons, then merging the sequenced amplicons during assembly.
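As a rough illustration of the amplicon-merging idea, the sketch below joins amplicons that are already listed in genome order by matching the end of one to the start of the next. It is a toy under stated assumptions, not the JCVI pipeline: the function names, the `min_overlap` parameter, and the short example sequences are all hypothetical, and it requires exact overlaps, which real reads with sequencing errors would not provide.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_amplicons(amplicons: list[str], min_overlap: int = 3) -> str:
    """Merge amplicons (hypothetical input, given in genome order) end to end."""
    if not amplicons:
        return ""
    merged = amplicons[0]
    for amplicon in amplicons[1:]:
        size = overlap_length(merged, amplicon, min_overlap)
        if size == 0:
            raise ValueError("adjacent amplicons do not overlap")
        # Append only the part of the amplicon that extends past the overlap.
        merged += amplicon[size:]
    return merged


# Made-up sequences: each amplicon shares 4 bases with its neighbour.
amplicons = ["ATGCGTACGT", "ACGTTTGACC", "GACCGGATAA"]
genome = merge_amplicons(amplicons)
print(genome)  # ATGCGTACGTTTGACCGGATAA
```

Because the order of the amplicons is known from the PCR design, a repeat only has to be resolved within one amplicon rather than across the whole genome.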

To streamline our viral pipelines, we have fully integrated them with JCVI’s LIMS and JIRA Workflow Management to create a semi-automated tracking interface that follows the progress of viral samples from acquisition through to NCBI submission. This allows us to process a large volume of samples with limited manual interaction and, at the same time, gives us flexibility to work on challenging and novel genomes.


The JCVI Viral Genomics Group is supported by federal funds from the National Institute of Allergy and Infectious Diseases, the National Institutes of Health, and the Department of Health and Human Services under contract no. HHSN272200900007C.

The bat coronavirus project is a collaboration with Kathryn Holmes and Sam Dominguez, University of Colorado Medical Center.

The authors would like to thank members of the Viral Genomics and Informatics group at JCVI.


Viral genome sequencing by random priming methods. Djikeng A, Halpin R, Kuzmickas R, Depasse J, Feldblyum J, Sengamalay N, Afonso C, Zhang X, Anderson NG, Ghedin E, Spiro DJ. BMC Genomics. 2008 Jan 7;9:5.

A virus discovery method incorporating DNase treatment and its application to the identification of two bovine parvovirus species. Allander T, Emerson SU, Engle RE, Purcell RH, Bukh J.


This post is based on a poster by Nadia Fedorova, Danny Katzel, Tim Stockwell, Peter Edworthy, Rebecca Halpin, and David E. Wentworth.

Scientist Spotlight: Meet David Wentworth

During the height of the H1N1 Flu pandemic, David Wentworth was running a microbial genetics laboratory at the Wadsworth Center, New York State Department of Health (NYSDOH) where he was instrumental in developing a method to amplify influenza genomes regardless of strain using “universal primers” or short strands of DNA that recognize conserved segments across the genomes of many different flu strains. This amplification process was developed to generate recombinant influenza A viruses (the most common flu type affecting humans and animals) that could be used for the production of new vaccines. From a clinical swab it took his team 9-12 days to develop vaccine seed stocks. It was this work that first brought Dave to JCVI’s attention.

Several years ago Dave began collaborations with JCVI scientists to sequence human and avian influenza viruses. The collaborations intensified two years ago when all pandemic flu samples (or suspected flu samples) were first sent to Dave’s lab so the virus could be amplified in sufficient quantities for sequencing using his new amplification pipeline. The amplification took only a day and then isolated, non-infectious, DNA was sent to JCVI for sequencing. JCVI was the natural choice for this work since we are host to the government-funded “Influenza Genome Sequencing Project,” with the goal of sequencing large numbers of viral genomes to help scientists worldwide to understand how flu viruses evolve and cause disease. JCVI researchers then deposited influenza sequences into GenBank within two days of receiving DNA from Dave’s lab, enabling researchers worldwide to track what strains are circulating and how they are evolving. JCVI has sequenced over 75% of the influenza genomes in GenBank, the NIH public repository for sharing genetic sequencing data.

Influenza Genome Amplification Directly From Clinical Specimens (Zhou, B., M. E. Donnelly, D. T. Scholes, K. St. George, M. Hatta, Y. Kawaoka, and D. E. Wentworth. 2009. J. Virol. 83:10309-10313).

Dave was soon invited for a talk at JCVI. “The opportunities at JCVI were to help build the [viral genomics] program. And already good, quality people are here studying viruses with a focus on viral evolution and sequencing analysis,” Dave remarked. “Being part of generating that information, I think makes you have a better feel for the biology.” The capabilities for viral sequencing combined with IFX strengths and the interest in viral evolution at JCVI was a draw for Dave and he soon joined the team. Moreover, there are opportunities at JCVI to work with collaborators who send specimens from various regions of the world for sequencing so that we can “more deeply understand the mutations that contribute to virulence,” he said. He is particularly interested in antigenic drift (how viruses escape immunity) that contributes to the “annual influenza escape,” which is critical in developing vaccine strains.

New Live Attenuated Vaccine Approaches. Figure shows influenza RNA polymerase activity (GFP) at various temperatures. Mutations engineered into the genome (PB1-Mut3, PB2-Mut4) synergize and inhibit replication at higher temperatures of the lung (37 C) or fever (39 C).

The need for new and improved methods to develop vaccines, coupled with the advances in synthetic genomics developed at JCVI, led to the formation last year by JCVI and the company Synthetic Genomics Inc. of a new company, Synthetic Genomics Vaccines Inc. (SGVI). JCVI scientists, through SGVI, are working on a three-year collaboration agreement with Novartis to apply synthetic genomics tools and technologies to accelerate the production of the influenza seed strains required for vaccine manufacturing. The agreement, supported by an award from the U.S. Biomedical Advanced Research and Development Authority (BARDA), could ultimately lead to a more timely and effective response to seasonal and pandemic influenza outbreaks. The idea is to create viruses de novo, or to synthesize the genes critical for a virus’s antigenicity and put these into normal vaccine strains for production. The goal of the work at SGVI is to synthesize a virus in one week, or rather a seed stock, which still needs to be amplified in big fermenters. New seed stocks take 3-4 weeks to produce, which is currently a rate-limiting step.

You don’t hear too many people singing its praises and saying “I love the flu!” as Dave has remarked, but put in context, his enthusiasm for his work shines through best when talking about his love of teaching. He gets excited teaching young scientists about virology, especially helping them to understand the important areas to study, and where the research will lead to solve a major problem. “The rewarding part of being a mentor is to see all of the people who have found their niche – it might not be bench research but they are still carrying knowledge with them.”

David Wentworth DEW checking a hive in the late Spring.

Aside from spending time with his family, Dave enjoys a hobby started by his dad – to cultivate honey bees. A community gardens group at a middle school in Albany, NY was looking for bees to pollinate their plants. Dave spearheaded the effort and used it as a learning tool for kids, who helped feed honey to caterpillars and moths. He also used to give lectures on bee cultivation and has taught college courses in animal science. Dave’s enthusiasm for science among his students and peers could be considered infectious, just like the subject of his research!

Sequencing of high yield influenza reassortants at JCVI

As part of the Influenza Genome Sequencing Project, JCVI will be sequencing a large number of high yield influenza reassortants created in the lab of Dr. Doris Bucher at New York Medical College. Dr. Bucher’s lab has prepared the type A H3N2 high yield reassortants (hyrs) for the influenza vaccine for the past several years, both within the US and worldwide.
The Bucher lab continues the tradition of preparing the hyrs as developed by preeminent influenza virologist Dr. Edwin D. Kilbourne (1920-2011). Dr. Kilbourne developed and applied the technology to produce the first genetically engineered influenza vaccines; these vaccines, which typically change yearly, have been in use for over 40 years.
JCVI will be sequencing approximately 46 hyrs from Dr. Kilbourne’s collection, which was assembled as part of the Kilbourne/New York Medical College Archive of Influenza Virus Reassortants, Mutants, and Antisera. Detailed information on every virus stored in the archive is provided at the archive website. The assembly of the archive was sponsored by the NIAID, and viruses in the archive are available through BEI Resources. All sequence data and metadata associated with the hyrs sequenced at JCVI will be made publicly available in the Influenza Research Database.
Dr. Kilbourne passed away on February 21, 2011 at the age of 90. A eulogy in remembrance of Dr. Kilbourne and his pioneering work in the field of influenza virology can be found at:

Insights gained from influenza genomic sequence data: viral diversity within human populations

The advent of large amounts of influenza genomic sequence data produced by the Influenza Genome Sequencing Project (IGSP) has led to new concepts regarding influenza viral diversity.  It was previously believed that a single influenza lineage entered a human population at the start of an influenza season and gradually spread over time; however, recent analyses of influenza genomes revealed that multiple viral lineages co-circulate within individual populations throughout an influenza season.  These different lineages appear to be continuously introduced which provides the opportunity for frequent intra-subtype reassortment.  Interestingly, similar levels of influenza diversity exist within populations of both large metropolitan cities and small towns (E.C. Holmes, 2009).  Multiple, diverse viral lineages of the same subtype have been observed co-circulating in urban locations comprised of expansive travel networks and rural locations that are geographically isolated.

Additional analyses of complete influenza genomes have led to a ‘source-sink’ model of influenza seasonality.  In this model, a global, human ‘source’ population of influenza viruses is thought to be responsible for the antigenic variants that ignite seasonal epidemics in the ‘sink’ populations of the Northern and Southern hemispheres (A. Rambaut, 2008; E. C. Holmes, 2009).  The geographic regions of East and Southeast Asia have been hypothesized as potential sources of influenza due to the large, dense human populations which would allow influenza viruses to antigenically evolve with maximum efficiency.  These locales may be the focus of future surveillance efforts aimed at identifying emergent influenza viruses that have evolved mechanisms to evade current vaccines.

Insights gained from influenza genomic sequence data: frequent intrasubtype reassortment

Studies using whole-genome influenza sequence data produced by the Influenza Genome Sequencing Project (IGSP) have focused mainly on influenza evolution and epidemiology. For instance, IGSP data has provided important insight into the frequency of intrasubtype reassortment (in which co-infecting viruses of the same subtype exchange genome segments). The data suggests that reassortment occurs frequently, leading to viruses with altered antigenic properties that may evade current vaccines. Thus, it is useful to study not only the HA and NA segments that produce the hemagglutinin and neuraminidase proteins that sit on the surface of the virion and interact with host cells, but the whole viral genome, as this provides a complete picture of the emergence of the virus (E.C. Holmes, 2009).
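To make the whole-genome argument concrete, here is a minimal sketch of how a reassortant can be missed when only HA and NA are examined. The eight segment names are those of influenza A; everything else (the function names, the `clade_A`/`clade_B` labels, and the idea that each segment has already been assigned a lineage by some upstream phylogenetic analysis) is an assumption made for illustration and is not taken from the IGSP analyses.

```python
# The eight genome segments of influenza A.
SEGMENTS = ("PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS")


def lineages(assignment: dict[str, str], segments=SEGMENTS) -> set[str]:
    """Return the set of lineage labels seen across the chosen segments."""
    missing = [name for name in segments if name not in assignment]
    if missing:
        raise ValueError(f"missing segments: {missing}")
    return {assignment[name] for name in segments}


def looks_reassortant(assignment: dict[str, str], segments=SEGMENTS) -> bool:
    """True if the chosen segments trace back to more than one lineage."""
    return len(lineages(assignment, segments)) > 1


# Hypothetical isolate: surface genes from clade_A, two internal genes from clade_B.
isolate = {
    "PB2": "clade_A", "PB1": "clade_B", "PA": "clade_A", "HA": "clade_A",
    "NP": "clade_A", "NA": "clade_A", "M": "clade_B", "NS": "clade_A",
}

surface_only = looks_reassortant(isolate, segments=("HA", "NA"))
whole_genome = looks_reassortant(isolate)
print(surface_only, whole_genome)  # False True
```

Looking at HA and NA alone reports a single lineage, while the full set of segments reveals the mixed ancestry.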

The significance of intrasubtype reassortment for strain emergence was shown by the appearance of the new strain of Influenza H1N1 in 2009, which is a reassortant virus containing multiple swine influenza lineages.

In their October 2010 publication, Ilyushina et al. show that, despite the lack of detection thus far in humans, viable seasonal/pandemic influenza virus reassortants can be generated in a laboratory setting. Their study showed that reassortment is able to occur between seasonal H3N2 and pandemic H1N1 viruses, potentially leading to the emergence of a strain with higher virulence.

Take home message of the 2010 Amebiasis Montreal Meeting: beware of who you kiss…

The Entamoeba community is a small and collegial one. Everyone knows everyone, and everyone wants to collaborate, learn, and do more to tackle this most neglected of neglected diseases. For many, the thought of an amoeba brings to mind the amorphous characters of Gary Larson’s The Far Side, watching TV and dealing with domestic issues…but what few know is that the WHO considers amebiasis one of the major health problems in developing countries, surpassed only by malaria and schistosomiasis for deaths caused by a parasitic infection.

Amoeba Real Life...

TIGR/JCVI has had a long-standing relationship with Entamoeba histolytica and other related species. Started by Brendan Loftus back in the day, followed by Neil Hall and continuing by yours truly, with the Entamoeba GSCID project, we have provided this eager community with what they are in great need of: genome sequences. And they are appreciative of that, for sure.

The meeting was small and intense, and started with a one-day workshop on clinical aspects of the disease: not only the enteric disease, but the oral disease. It was good to mingle with dentists and see their point of view and the devastating reality of clinical cases of periodontal disease…and periodontal disease affects, at one point or another, the majority of adults. And Entamoeba gingivalis is always there…but not always diagnosed. Why? We assumed bacteria, bacteria, and bacteria. Treat the bacteria with antibiotics…but ignore the amoebas. Patients do everything right when it comes to oral hygiene, but the disease persists, the bone continues to be destroyed, and teeth are lost…secondary consequences such as fatigue, diabetes, heart disease, renal dysfunction, low birth weight, and a myriad of other diagnoses are known to be related to oral health and periodontal disease, but rarely in the context of these amebic infections. Interesting cases of entire families being affected by this parasite were presented (it is highly contagious). Dr. Bonner, the organizer of the conference, is a big advocate of microscopes in dental practices as the primary diagnostic method for periodontal disease and identification of amebiasis, and he is surely determined to speak to the world about that. And he also wants the genome done.

Entamoeba community, Montreal 2010

During the core of the meeting, Neil Hall and I were the only ones on the genomics side of things: we presented on the SNP analysis done so far on the strains we have sequenced, in their case using SOLiD, in our case just 454 (for now). Both talks were very well received and the community is eager to see more. Both pieces of work will be used in a global SNP analysis, particularly focusing on a family of proteins known to be involved in virulence, the Gal/GalNAc lectins. These proteins are one of the main targets for vaccine development, led by Dr. Bill Petri, who is in the process of starting full speed on that endeavor. Also on this topic, Dr. Jonathan Ravdin (Dean and Executive Vice President of the Medical College of Wisconsin) presented the results of an intranasal Gal-lectin subunit vaccine against experimental Entamoeba histolytica infection in baboons, showing its protective effect against enteric colitis, and the promising future of this vaccine target for humans.

For amebiasis “a la mode”, the usual suspect topics of the conference involved Entamoeba histolytica signaling pathways, classic protein characterization of large families, proteases and invasion, and large genotyping studies of outbreaks, as well as mechanisms of pathogenesis. One interesting talk was on the generation of cyst-like forms in vitro for Entamoeba histolytica, a true breakthrough, since there is no model of encystation so far, and the possibility of obtaining these structures in vitro will certainly open up a whole new world of studies that before were confined to Entamoeba invadens, a very distant parasite of lizards that does encyst in vitro.

A modest interest in drug discovery was present. There are so far two kinds of drug therapy for amebiasis: luminal amoebicides (paromomycin, diloxanide furoate and iodoquinol) for intestinal disease, which are ineffective against organisms in tissue, and metronidazole and other derivatives for invasive disease. However, resistance to these drugs is easily achievable at clinically acceptable drug levels. James McKerrow’s group (UCSF) presented a high-throughput screen of small molecules using a library of FDA-approved drugs and known bioactive compounds, and they identified six compounds with similar or better activity than metronidazole in vitro. Another group, from Mexico, is focusing on probiotics and natural compounds from plants such as Astrophytum capricorne (cactus), Jatropha dioica and Eucalyptus camaldulensis.

Montreal View from Le Crystal Hotel

To finish, because the list of interesting things can be just too much, a wonderful piece of work that will hopefully provide an extensive framework for further studies of host susceptibility to E. histolytica comes from Bill Petri’s lab. They have performed a small hairpin RNA (shRNA) screen to identify human factors crucial for E. histolytica cytotoxicity. Using a mammalian shRNA knockdown library, they did nine rounds of selection using 1:5 and 1:50 parasite:host ratios, and resistant clones, after 6 rounds of selection, were sequenced using Solexa. This way, they identified a number of host gene families, including kinases, surface receptors and ion channels, that may be important for susceptibility to the parasite, and of course, they are working on that…

But the take home message that I have imprinted in my brain is…do not kiss your dog.  After three years of age, ALL DOGS have periodontal disease. And inevitably they have Entamoeba gingivalis and possibly other species as well…and they are one of the sources of infection to humans. Dog kisses owner, owner kisses lover, wife, husband, and kids…and the amoeba conquest of the world continues…

For your delight, two movies on the topic: Mark Bonner, the meeting organizer (in French) and a film of a patient with periodontal disease biofilm. Enjoy!

Entamoeba histolytica research presented at the Molecular Parasitology Meeting

Entamoeba histolytica causes invasive intestinal and extraintestinal infections, known as amoebiasis, in about 50 million people and still remains a significant cause of human death in developing countries. However, for unknown reasons, fewer than 10% of E. histolytica infections are symptomatic (causing symptoms such as diarrhea, dysentery or liver abscess). The J. Craig Venter Institute is among the institutions awarded the NIAID Genome Sequencing Centers for Infectious Diseases (GSCID) contracts to provide high-quality genome sequencing and high-throughput genotyping of NIAID Category A-C priority pathogens.

Photo of Entamoeba histolytica

Entamoeba histolytica in the trophozoite stage.

A GSCID project led at JCVI by Dr. Elisabet Caler includes performing whole-genome sequencing of Entamoeba phenotypic variants from symptomatic, asymptomatic and liver abscess-causing strains chosen to include a range of clinical manifestations and taken from human cases, as well as strains grown under different conditions. Our objective is to develop a genome-wide landscape of Entamoeba diversity to understand how sequence variations in the parasite relate to pathogenicity (ability to cause disease) and clinical outcome.

The Molecular Parasitology Meeting held at the Woods Hole Oceanographic Institution, Woods Hole, MA last week provided a window into the exciting science of Parasitology.  The keynote speaker, Fotis Kafatos, spoke on “Major Challenges to Global Health in the Tropics and Beyond–Insect Vectors of Malaria and Other Parasitic or Viral Diseases.”  Dr. Kafatos stressed that a multi-pronged approach to the control of malaria is necessary to prevent the devastating loss of life that malaria causes.

Woods Hole Oceanographic Institution

A view of Woods Hole Oceanographic Institution.

The many excellent papers and posters provided an overview of the field, including   Plasmodium falciparum, Toxoplasma gondii, the trypanosomes, Giardia lamblia, Trichomonas vaginalis, Entamoeba histolytica, Schistosoma species, Babesia bovis, and associated vectors.  Topics spanned basic biology, drug design, sequencing and host-pathogen interactions.

I presented an overview of the Entamoeba sequencing project at the meeting. Discussions as a result of the presentation included questions about the details of sequencing and handling the next-generation sequencing data. We had animated discussions about methods for assembly of the DNA sequences, including reference-guided vs de novo assembly. Many attendees were impressed with JCVI’s open-source METAREP metagenomic tool (J. Goll, et al., Bioinformatics 2010). Determination of the best methods for the analysis of differences in the clinical isolates generated much discussion. Entamoeba researchers see the sequences as a great resource and are looking forward to being able to mine the data. One, from India, was very excited that he was going to have about 15 times the resources he has had in the past, since he has had only one genome to mine up until now.

The Molecular Parasitology Meeting was an excellent venue for scientific exchange.  The Entamoeba histolytica GSCID project will help us understand the pathogenicity of Entamoeba histolytica, and has the potential to save lives in developing countries.

Looking for a Few Good Genomes (to sequence)!

The JCVI is one of three centers funded by the National Institute of Allergy and Infectious Diseases (NIAID) to provide sequencing and genotyping services to the infectious disease community. We are continually looking for researchers who would like to have organisms of research interest to them sequenced so that they become a resource for the community. The costs are covered under the NIAID contract to the JCVI Genome Sequencing Center for Infectious Diseases (GSC), so there is no cost to the investigators.

The JCVI GSC provides the infectious disease research community with rapid and cost-effective high-quality sequencing services for pathogenic microorganisms including viruses, bacteria, fungi, protozoa, and invertebrate vectors of disease. The center is focused on NIAID Category A-C priority pathogens, related organisms, clinical isolates, closely related species, and microorganisms responsible for emerging and re-emerging infectious diseases and their hosts. Genotyping services are offered by the Center in order to study the variation in host response. The center also offers expertise in pathogen biology with the ultimate goal to use the sequencing and genotyping data to develop new diagnostics, vaccines, and drugs.  Data generated from the sequencing and genotyping projects will be released to the scientific community in accordance with the NIAID Data and Reagent Sharing and Release Guidelines.

JCVI has completed many projects to date and over 15 are ongoing or about to start. See here for information about these projects and the collaborators. We are currently in year 2 of our second five-year contract and over 70 known publications have resulted from these collaborative efforts.

We particularly encourage multi-collaborator projects that will provide the most impact for the scientific community. The sequencing and genotyping projects to be conducted by the center are selected from white paper proposals that can be submitted by investigators worldwide, including academia, not-for-profit organizations, industry, and government. Information about the application process is available here.

Please contact us with questions and for advice about developing a white paper proposal.  A JCVI project lead will be assigned to each person/group who submits a white paper to help them with the process and ensure the best chance for success.  We look forward to hearing from you!

Influenza H1N1pdm sequencing project overview

Since 2004, the JCVI Influenza Genome Sequencing Project, funded by the National Institute of Allergy and Infectious Diseases (NIAID), has sequenced thousands of human, swine, and avian influenza isolates from collections around the world to provide researchers with a better understanding of the evolution of this important pathogen and to enable the development of new therapeutics, diagnostics, and vaccines.

JCVI has been collaborating with groups worldwide to monitor the evolution of the pandemic H1N1 Influenza virus (also known as H1N1pdm) that entered the human population in the spring of 2009 and has been responsible for at least 16,226 deaths worldwide. Genomic sequence information and epidemiological data are being used to address critical scientific questions of virus adaptation.

Some of the questions we are trying to answer with our current H1N1pdm studies include:

  • How do pandemic viruses collected during the first wave of the pandemic compare to those collected in the later phases? We have ongoing studies in New York, Texas, Wisconsin, and California which address this question.
  • How will the presence of a new pandemic influenza virus affect the evolution of seasonal H1N1 and H3N2 viruses? Will the seasonal viruses become extinct? Will we identify novel reassortants between the H1N1pdm and seasonal human viruses?
  • Will the pandemic virus acquire resistance to neuraminidase inhibitors such as Tamiflu?
  • How does H1N1pdm isolated in the tropics differ from isolates collected in temperate regions? What is the relationship between strains present in the tropics and epidemic strains in temperate regions? We have collections in Nicaragua, Hong Kong, and Brazil which will help answer these questions.
  • What are the evolutionary dynamics of H1N1pdm in a situation of intense viral transmission such as between students in a university setting?
  • Influenza samples have never been collected during the summer months. Thus the collection of pandemic influenza samples during the summer gives us a glimpse of viral persistence and transmission during the off peak months. How do circulating influenza strains collected during the southern hemisphere’s influenza season compare to those collected during the US summer and which strains persist into the next northern hemisphere flu season?

For more information please visit