I attended the Summit on Systems Biology hosted by Virginia Commonwealth University in Richmond, VA, June 15-17. So, judging from the talks given, what is systems biology?
- Systems biology is non-linear and/or multi-step. Heavy math does not make something systems biology if it’s directly solvable. Taking a big gene expression matrix, running principal component analysis on it, and coming up with a linear equation for the contributions of a list of biomarker genes is not systems biology. The same microarray expression experiment, coupled with pathway analysis to reduce the number of candidate genes, which permits a less stringent multiple-hypothesis-testing correction and so yields fewer false negatives, is. So is a non-linear model of how just a few genes interact over time.
- Standard bioinformatic analysis seeks correlations. Systems biology goes beyond that to seek cause and effect. Thus, most systems biology work involves time series, and sometimes simulation.
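To make the multiple-testing point concrete, here is a minimal sketch of why shrinking the candidate list helps. It assumes a Bonferroni correction, which none of the talks specifically named; it is just the simplest correction to show. The gene names, p-values, and the 20,000 and 200 gene counts are invented for illustration.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def significant(p_values: dict, alpha: float = 0.05) -> list:
    """Return the genes whose p-value survives a Bonferroni correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return sorted(gene for gene, p in p_values.items() if p <= threshold)


# Hypothetical data: 20,000 genes, one of which ("gene7") has a modest p-value.
genome = {f"gene{i}": 0.5 for i in range(20000)}
genome["gene7"] = 1e-4

# Hypothetical pathway analysis that narrows the candidates to 200 genes.
pathway = {f"gene{i}": genome[f"gene{i}"] for i in range(200)}

print(significant(genome))   # cutoff is 0.05 / 20000 = 2.5e-6, so gene7 is missed
print(significant(pathway))  # cutoff is 0.05 / 200 = 2.5e-4, so gene7 is found
```

The same gene with the same p-value is a false negative in the genome-wide test and a hit in the pathway-restricted one. The only thing that changed is how many hypotheses were tested.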
What data and techniques do systems biologists use?
- Large datasets of all types. Microarray time-series, genomes, SNPs, protein-protein interactions, automated protein annotation – anything that comes in gigabytes instead of kilobytes.
- There was marked interest in protein-protein interaction networks, and in microRNAs (which inhibit translation of multiple target mRNAs).
- There were several papers using reverse-phase protein microarrays. RPMAs can distinguish phosphorylated (which usually means active) from unphosphorylated proteins, which helps understand protein interaction dynamics.
- There were several papers using weighted gene co-expression network analysis. WGCNA analyzes modules of co-expressed genes, rather than individual genes. This gives more statistical power from sparse data.
- Brian Sayre of VSU identified disease-resistance genes in livestock and crop species using single-nucleotide polymorphisms (SNPs) from related species. We might know about some goats that are resistant to a disease that also affects sheep; but sheep don’t have the same SNPs as goats. His group categorized the SNPs into genes, and the genes into pathways common across species, then looked for pathways associated with disease resistance in other species, and hypothesized that the same pathways would be involved in disease resistance in the target species.
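As a rough illustration of the module idea behind WGCNA, the sketch below groups genes whose expression profiles are strongly correlated. It is a heavy simplification and not the real WGCNA procedure, which uses topological overlap and hierarchical clustering. Here I just raise the absolute Pearson correlation to a power `beta` (the soft-thresholding step) and take connected components above a `cutoff`. The expression values, gene names, `beta`, and `cutoff` are all made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_modules(expr, beta=6, cutoff=0.5):
    """Group genes into modules: connected components of the graph whose
    edges are gene pairs with |correlation| ** beta >= cutoff."""
    genes = list(expr)
    neighbors = {g: set() for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) ** beta >= cutoff:
                neighbors[g].add(h)
                neighbors[h].add(g)

    modules, seen = [], set()
    for g in genes:
        if g in seen:
            continue
        stack, module = [g], set()
        while stack:  # depth-first walk over one connected component
            current = stack.pop()
            if current in module:
                continue
            module.add(current)
            stack.extend(neighbors[current] - module)
        seen |= module
        modules.append(sorted(module))
    return modules


# Hypothetical expression levels for four genes across five samples.
expr = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 11],
    "geneC": [5, 1, 4, 2, 3],
    "geneD": [10, 2, 8, 4, 6.5],
}

print(coexpression_modules(expr))
```

Downstream tests are then run per module rather than per gene, so there are far fewer hypotheses to correct for. That is where the extra statistical power comes from.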
What do people do with systems biology?
- Medical applications predominated. The main areas of interest were cancer, aging, cell simulation, eukaryotic model organisms, genome-wide association studies, pathway analysis, and immunology.
- There were no talks about industrial applications or synthetic biology.
- There were no talks on prokaryotes, except one on host-pathogen interactions. This struck me as odd, since eukaryotes are more difficult to analyze or simulate than prokaryotes, and we haven’t done these things with prokaryotes yet.
- There were no talks on metagenomics. This also struck me as odd; bacterial communities seem like a natural systems biology problem.
What does the future hold for systems biology?
- Omniomics: We don’t want just a protein’s sequence – we want to know where and when it is expressed, what regulates it, what it interacts with, and what parameters describe those interactions. Soon, annotating a genome will not mean producing a list of genes and their functions – it will mean producing a simulation.
- We need to learn to think at a higher level of abstraction. If you have tens or hundreds of thousands of genes, transcripts, proteins, small molecules, and structures interacting, you need to figure out what it is you’re really interested in (e.g., “How did this cancer bypass the G1 cell-cycle restriction checkpoint?”), how to specify that precisely enough to ask the computer for an answer, and not to insist on understanding all the details if the answer checks out.
- There is a growing gap between research and practice. We can make more and more detailed analyses of diseases, especially in cancer, where each patient has a unique disease at the genetic level. Meanwhile, the FDA approval process is so long and expensive that even in diseases (for example, Alzheimer’s and FTLD) for which there are millions of patients and a handful of known causes, pharmaceutical companies don’t try to develop three to four separate therapies for those three to four causes. And the gap is growing wider: Even as we are coming up with ways to combine weak information from across an entire genome, the FDA is considering proposals to regulate genomic sequencing that would forbid doctors from acquiring a full sequence.