bwt based LACA - A tool for computing the LCP array directly from the BWT

Installation:

1. Download this bz2-file.

2. Unpack the bz2 file with the command "tar xfvj bwt_based_laca.bz2".

3. Change directory to sdsl2/build and execute "cmake .." and "make".

4. Execute "make" in the root directory of the archive.

The last "make" will produce the executables bwt_based_laca, bwt_based_laca2 and others. You'll find more information in the readme.txt file (at the root directory of the archive).

Note: You can use the tool dbwt of Kunihiko Sadakane, to compute the BWT directly from the input.

backwardMEM - A tool for computing maximal exact matches

Installation:

1. Download this tar.gz-file.

2. Unpack the tar.gz file with "tar -xzvf calcMEM.tar.gz".

3. Change directory to backwardMEM and unpack sdsl-0.7.3.tar.gz with "tar -xzvf sdsl-0.7.3.tar.gz"  and execute "./configure" and "make" in the sdsl-0.7.3/ directory.

4. Change directory to sparseMEM/ directory and execute 'make'

4. Change directory to backwardMEM and execute 'make'

The last 'make' will produce the executable backwardMEM1, backwardMEM2, backwardMEM4, backwardMEM8, and backwardMEM16.

Execute ./backwardMEM[1|2|4|8|16] -h to get information about the usage.

You could try to run the small example from the paper with "./backwardMEM1 l=2 example1.fasta example2.fasta"

Note:  We use the algorithm and original implementation of Larsson and Sadakane (article: "Faster suffix sorting", 2007) for the construction of the index. The algorithm works only for sequences of length < 2GB. We will replace the algorithm in the next version of backwardMEM to handle input of more than 2GB.

 

backwardSK - a tool to calculate string kernels

Installation:

1. Download this tar.gz-file.

2. Unpack the tar.gz file with "tar -xzvf backwardSK.tgz".

3. Change directory to sdsl2/skernel and read the README file to get further instructions

Bidirectional search

Bidirectional search in a string with wavelet trees

C++ source code for the bidirectional search in a string with wavelet trees:

bidirectionalsearch.tar.gz

The documentation is available here.

 

Linear Time Algorithms for Generalizations of the Longest Common Substring Problem

Here, one can find pseudo-code and implementations of the algorithms described in the article "Linear Time Algorithms for Generalizations of the Longest Common Substring Problem" by Michael Arnold and Enno Ohlebusch, Algorithmica, to appear.

 

All Pairs Suffix-Prefix Problem

In this tar.gz-file, one can find the implementation of an efficient algorithm for the all pairs suffix-prefix problem based on a generalized enhanced suffix array, as described in the article "Efficient Algorithms for the All Pairs Suffix-Prefix Problem and the All Pairs Substring-Prefix Problem" by Enno Ohlebusch and Simon Gog, Information Processing Letters, to appear. It also contains an alternative implementation based on a generalized suffix tree and the EST database of C. elegans for test purposes.