The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing / May 4, 2007
Evolution -- the very word makes some people uneasy. But researchers at Argonne National Laboratory aren't looking for a debate about science versus religion. They're focusing on a much smaller issue: the evolutionary analysis of microorganisms.
The Argonne team of computational biologists -- or bioinformaticists, as they sometimes are called -- has developed a new system, named Chisel, to identify enzyme variations and to gain insight into the adaptation of organisms to particular environments.
It's long been known that organisms from all domains of life -- eukaryotes, bacteria, and archaea -- share a common ancestry. But differences in lifestyle can result in different evolutionary paths and in enzymatic variations that make it easier for the organism to adapt to its environment. For example, the archaeon P. furiosus, which lives in boiling ocean vents at temperatures of over 100 °C, and the soil bacterium D. radiodurans, which can survive extremely high levels of radiation, have both evolved proteins that tolerate and function in such harsh environments.
Today, scientists have accumulated enormous volumes of genomic and enzymatic data. Analysis of such data is complicated and computationally intensive. Microorganisms are like small chemical factories, able to execute and control billions of chemical reactions catalyzed by enzymes -- protein catalysts invented by nature. Understanding how different organisms function under different conditions may help researchers discover new antibiotics, advance biotechnology, and find new energy sources.
Most bioinformatics systems used for evolutionary analysis have been limited to reasoning about the evolutionary history of a single enzymatic function. In contrast, the Chisel system allows scientists to explore the evolutionary history of over 900 enzymatic functions and understand the differences in these natural catalysts that occur in different organisms in the larger context of metabolic pathways.
"Chisel was built specifically to enable a systems-level study of the evolution of enzymes," says Natalia Maltsev, a computational biologist in the Mathematics and Computer Science Division at Argonne. "Understanding these broad evolutionary patterns is essential for genetic engineering, environmental research, and drug design," she explains.
Using enzymatic sequences provided by the user or obtained from public databases, Chisel generates function-specific and taxonomy-specific clusters. From these clusters it creates a library of computational models that can be used to predict the functions of unannotated sequences.
Chisel already contains more than 8,500 clusters covering more than 900 distinct enzymatic functions.
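The idea of function-specific and taxonomy-specific clusters can be illustrated with a minimal Python sketch. It is not Chisel's implementation: the records, function labels, and taxon names are invented, and a simple nearest-neighbor identity score stands in for the computational models, whose type the article does not specify.

```python
from collections import defaultdict

# Hypothetical toy records: (sequence_id, function, taxonomic_group, sequence).
# All labels and sequences are invented for illustration only.
RECORDS = [
    ("seq1", "function_A", "Archaea", "MKTAYIAK"),
    ("seq2", "function_A", "Archaea", "MKTAYLAK"),
    ("seq3", "function_A", "Bacteria", "MRSAYIGK"),
    ("seq4", "function_B", "Bacteria", "MLLPEWQT"),
]


def build_clusters(records):
    """Group sequences by the pair (function, taxonomic group)."""
    clusters = defaultdict(list)
    for seq_id, function, taxon, sequence in records:
        clusters[(function, taxon)].append((seq_id, sequence))
    return dict(clusters)


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def predict(query, clusters):
    """Return the cluster key whose closest member best matches the query.

    This nearest-neighbor rule is an assumed stand-in for a real model.
    """
    best_key, best_score = None, -1.0
    for key, members in clusters.items():
        score = max(identity(query, seq) for _, seq in members)
        if score > best_score:
            best_key, best_score = key, score
    return best_key, best_score


clusters = build_clusters(RECORDS)
print(predict("MKTAYIAR", clusters))
```

Here the unannotated query is assigned both a function and a taxonomic group, because each cluster key carries the two labels together.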
So, is it just creating another data-mining nightmare?
"Not at all," says Alex Rodriguez, principal developer of Chisel at Argonne. "To help scientists make sense of the data, we have designed Chisel so its results can be presented through a Web-based interface."
The user-friendly interface includes a suite of bioinformatics tools for interactive analysis and development of models. For example, sequences from Chisel can be aligned and represented as dendrograms -- tree diagrams showing arrangements of clusters. From these diagrams, users can select subsets of sequences for further analysis.
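As a rough illustration of how a dendrogram arranges sequences and how a user might pick a subset from it, the following Python sketch builds a tree by single-linkage agglomerative clustering. The sequences are invented and assumed to be already aligned, and the Hamming distance and linkage rule are assumptions chosen for simplicity; the article does not say which methods Chisel's tools use.

```python
from itertools import combinations

# Hypothetical, pre-aligned toy sequences of equal length (invented).
SEQUENCES = {
    "a": "MKTAYIAK",
    "b": "MKTAYLAK",
    "c": "MRSAYIGK",
    "d": "MLLPEWQT",
}


def hamming(s, t):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(s, t))


def leaves(tree):
    """List the sequence names under a node of the dendrogram."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def linkage_distance(t1, t2, seqs):
    """Single linkage: smallest distance between any two leaves."""
    return min(hamming(seqs[x], seqs[y])
               for x in leaves(t1) for y in leaves(t2))


def build_dendrogram(seqs):
    """Repeatedly merge the two closest subtrees into a nested tuple."""
    trees = sorted(seqs)
    while len(trees) > 1:
        i, j = min(
            combinations(range(len(trees)), 2),
            key=lambda p: linkage_distance(trees[p[0]], trees[p[1]], seqs),
        )
        merged = (trees[i], trees[j])
        trees = [t for k, t in enumerate(trees) if k not in (i, j)]
        trees.append(merged)
    return trees[0]


tree = build_dendrogram(SEQUENCES)
print(tree)
# Selecting a branch yields a subset of sequences for further analysis.
print(leaves(tree[1]))
```

Choosing a branch of the nested tuple corresponds to selecting a subtree of the diagram, which is one way a subset of sequences could be handed to a later analysis step.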
In addition to comparing known variations, the Chisel system can be used to predict functions of hypothetical proteins or unannotated sequences. To test the accuracy of these new predictions, the Argonne team conducted more than 200,000 experiments.
"The results were outstanding," says Rodriguez. Functions were predicted with an accuracy rate of almost 95 percent.
The Argonne researchers then compared Chisel results with the annotated enzymatic functions in protein families from major public databases. Again, the Chisel clusters yielded significantly more accurate function predictions.
These results are especially exciting because they may enable scientists to predict which variants of the same enzymatic functions are more likely to be found in a particular environment. Identification of these variants may provide insights into how taxonomy-specific differences in metabolic pathways emerged.
"We know of no other bioinformatics system being developed exclusively for this purpose," says Maltsev.
Chisel has already been used in several key applications. For example, at the Hanford site in Washington, the extremely high levels of radiation and chromium pollution mean that scientists have access to only a small number of sequences for study. Using Chisel, however, the Argonne researchers have been able to predict the taxonomic distribution and physiology of the site's microorganisms -- the first step in understanding what makes them resistant to such pollutants.
The Argonne team also noted a surprising similarity between the microbial community at Hanford and that reported for soil samples in irradiated zones of the Atacama desert in northern Chile. Using Chisel, the researchers confirmed their hypothesis that microorganisms residing in extreme environments are preconditioned for further adaptation to other extreme environments.
In another study of how microorganisms adapt to their environment, Chisel was used to analyze genomes of Shewanella. This bacterium can grow both with and without air and uses different electron acceptors (such as nitrate, iron, or uranium) for its energy production. Of the variations of enzymes predicted by Chisel, more than half proved to be associated with metabolic pathways involved in key metabolic functions such as the biosynthesis of amino acids. According to Maltsev and Rodriguez, these results suggest that Shewanella organisms have undergone significant systems-level adaptation that led to diversification of enzymes during their evolution.
Chisel also offers promise for biomedical research. The system contains almost 250 models for enzymes specific for Enterobacteriaceae, a large family of bacteria that includes pathogens that can cause typhoid fever, salmonellosis, and dysentery; 90 models for Staphylococcus, sometimes called the plague of the century, which causes a large variety of aggressive inflammatory diseases; and 126 models for Streptococcus, another microorganism that is linked to a range of human diseases, from common skin lesions to autoimmune diseases such as lupus and rheumatoid arthritis. The enzymes in these models represent a set of potential targets for antibacterial drugs.
"Chisel's library of enzymatic variations can help experimentalists in characterizing organisms that are difficult or impossible to grow as pure cultures," says Rodriguez.
The Argonne bioinformaticists attribute much of their success with Chisel to the use of GADU, the Genome Analysis and Database Update system developed at Argonne. With GADU they are able to take advantage of the distributed computational resources of the national computational Grid.
"Grid-enabled GADU resources are essential for acquiring and comparing enormous amounts of sequence and enzymatic data so we can determine where a particular microorganism is on the evolutionary tree," says Dinanath Sulakhe, a computational biologist at the University of Chicago and an Argonne collaborator, who has focused on the use of the Grid for evolutionary analysis of genomes.
"By creating advanced bioinformatics tools such as Chisel, we hope to enable scientists to explore the fascinating world of microbes and to understand the evolution of multispecies microbial communities," Sulakhe says.
The development of Chisel was sponsored in part by the Chicago Biomedical Consortium, which promotes collaborative work among Chicagoland universities through philanthropic funds from the family of John G. Searle.
For more information and graphics, visit the Chisel website: http://compbio.mcs.anl.gov/CHISEL/.
Source: Argonne National Laboratory