Traditional methods for nucleic acid and protein sequence analysis rely on multiple sequence alignment (MSA). In addition, a large number of alignment-free approaches to sequence analysis have been developed over the past decades. Most of these methods use k-tuple frequencies to define dissimilarity measures between sequences as a basis for phylogeny reconstruction and classification. In genome-based phylogeny studies, alignment-free methods have several advantages over the more traditional alignment-based approaches: they are much faster, they do not require sets of alignable orthologous genes, they are not affected by MSA errors, and they can take coding as well as non-coding parts of genomes into account. At our workshop, we will discuss new developments in the field of alignment-free sequence comparison. The workshop will be small and informal; the focus will be on open questions and ongoing projects rather than on final results.
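To make the idea of a k-tuple-based dissimilarity measure concrete, here is a minimal sketch in Python. It is an illustration only and not the method of any particular talk: the function names `kmer_frequencies` and `kmer_distance`, the default `k=3`, and the choice of the Euclidean distance between relative frequency vectors are all assumptions made for this example. Many other measures on k-tuple frequencies are in use.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(seq, k):
    """Relative frequencies of all overlapping k-tuples in seq (hypothetical helper)."""
    seq = seq.upper()
    total = len(seq) - k + 1  # number of overlapping k-tuples
    if total <= 0:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(total))
    return {kmer: n / total for kmer, n in counts.items()}


def kmer_distance(seq_a, seq_b, k=3):
    """Euclidean distance between two k-tuple frequency vectors.

    Assumption: Euclidean distance is used only as a simple example
    of a dissimilarity measure; no alignment is computed.
    """
    fa = kmer_frequencies(seq_a, k)
    fb = kmer_frequencies(seq_b, k)
    # k-tuples absent from one sequence contribute a frequency of 0.
    words = set(fa) | set(fb)
    return sqrt(sum((fa.get(w, 0.0) - fb.get(w, 0.0)) ** 2 for w in words))
```

Computing `kmer_distance` for every pair of sequences gives a distance matrix that could then be passed to a distance-based tree reconstruction method such as neighbour joining.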
The workshop takes place on 10 September 2013 in lecture hall HS3 of the Faculty of Physics on Göttingen's North Campus.
(1) General Introduction
(2) Information theory applications for alignment-free sequence analysis
Tools for Alignment-free Genome Comparison
Alignment-free distance: global vs pairwise approaches
Variable-length decoding and classification of protein families
The k-Mismatch Average Common Substring approach
Using Suffix Trees for Alignment-Free Sequence Comparison and Phylogenetic Reconstruction
A phylogenetic analysis of the Brassicales clade based on an alignment-free sequence comparison method