Many microbes can't be cultured in the lab. However, with the aid of modern genomic techniques, their genetic material can potentially be isolated from samples of tissue, water, soil or some other substrate.
Genomic methods in use since the 1990s yield genetic sequences around 800 base pairs (bp) long. More recent methods yield sequences of about 100bp or shorter. With both kinds of output, the challenge is to identify which types of organism the sequences come from. This is typically done by comparing them with sequences whose taxonomic origin is already known.
Matching tens or hundreds of thousands of short sequences against the sequences in genetic databases is computationally challenging. Now Stephan Schuster, Daniel Huson of Tübingen University, and collaborators have published a new computer program, MEGAN (Metagenome Analyzer), which facilitates taxonomic exploration of genome sequence data.
In a 2007 issue of Genome Research, the researchers demonstrate how the program can be used on a laptop computer to analyze various kinds of large genomic datasets, including samples consisting of 35-100bp sequences. MEGAN is conservative, with an extremely low rate of false positive assignments to a particular taxon.
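The published description of MEGAN assigns each read to the lowest common ancestor (LCA) of the taxa it matches in a reference database, which is one reason ambiguous reads end up at a higher taxonomic level rather than being forced onto a single species. The following Python sketch illustrates that idea on a toy taxonomy. It is not MEGAN's code: the taxonomy table, the function names and the min_score and top_fraction thresholds are all assumptions made up for illustration.

```python
# Toy taxonomy as child -> parent links (illustrative only, not a real database).
PARENT = {
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Enterobacteriaceae": "Proteobacteria",
    "Escherichia": "Enterobacteriaceae",
    "Escherichia coli": "Escherichia",
    "Escherichia coli O157:H7": "Escherichia coli",
    "Escherichia coli K-12": "Escherichia coli",
    "Salmonella": "Enterobacteriaceae",
    "Salmonella enterica": "Salmonella",
}


def lineage(taxon):
    """Return the path from the root down to `taxon`."""
    path = [taxon]
    while path[-1] != "root":
        path.append(PARENT[path[-1]])
    return path[::-1]


def lowest_common_ancestor(taxa):
    """Return the deepest taxon shared by the lineages of all `taxa`."""
    lineages = [lineage(t) for t in taxa]
    lca = "root"
    # Walk down the lineages level by level until they disagree.
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break
        lca = level[0]
    return lca


def assign_read(hits, min_score=35.0, top_fraction=0.9):
    """Assign one read given its (taxon, score) database hits.

    Hits scoring below `min_score` are ignored, and only hits within
    `top_fraction` of the best score are kept (both thresholds are
    hypothetical). Returns None when no hit is strong enough.
    """
    strong = [(taxon, score) for taxon, score in hits if score >= min_score]
    if not strong:
        return None
    best = max(score for _, score in strong)
    kept = [taxon for taxon, score in strong if score >= top_fraction * best]
    return lowest_common_ancestor(kept)


if __name__ == "__main__":
    # A read matching two E. coli strains about equally well is placed
    # at the species level rather than on either strain.
    print(assign_read([("Escherichia coli O157:H7", 80.0),
                       ("Escherichia coli K-12", 78.0)]))
```

In this sketch, a read that matches only one strain is assigned to that strain, whereas a read that matches both Escherichia and Salmonella equally well is pushed up to their shared family. This shows how strain-specific reads can separate close relatives while ambiguous reads are handled conservatively.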
Of potential interest to researchers studying pathogenic organisms is MEGAN's ability to distinguish between closely related species and strains (for instance, closely related pathogenic and non-pathogenic variants) more clearly than some phylogenetic techniques. Because the program analyzes random reads, it picks up species- and strain-specific sequences that are not typically used in conventional phylogenetic analyses.
MEGAN can be downloaded free at http://www-ab.informatik.uni-tuebingen.de/software/megan
Written By: Daniel H. Huson, Alexander F. Auch, Ji Qi, & Stephan C. Schuster
Journal: Genome Research 17: 366-386
Paper Id: 10.1101.gr5969107