clasp - A fast local fragment chainer using sum-of-pair gap costs

Introduction

clasp is a fast and flexible fragment chainer that supports linear and sum-of-pair gap costs and uses highly time-efficient index structures, i.e., Johnson priority queues and range trees padded with Johnson priority queues. Chaining short match fragments helps to quickly and accurately identify regions of synteny that may not be found using common local alignment heuristics alone. Further details on the algorithm and the gap cost models are provided in Abouelhoda and Ohlebusch (2003).

clasp reads tab-separated fragment files that provide, for each fragment, the start and end positions on the query and database sequences as well as a score. It executes a local chaining algorithm using either the linear (parameter -L, --lin) or the sum-of-pair gap cost model (default). It produces tab-separated output of chain data, optionally including the fragments of each chain. Note that the algorithm is optimized for short queries and large database sequences using a novel clustering approach.
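
As an illustration of the input described above, the following sketch writes a tiny fragment file and counts lines that do not look like valid fragments. The file name frags.tsv, the coordinates, the scores, and the column order (query start, query end, database start, database end, score) are assumptions made for this example only; the column order clasp actually expects by default should be checked with ./clasp.x --help.

```shell
# Hypothetical fragment file: qstart, qend, dstart, dend, score (tab-separated).
printf '1\t20\t1001\t1020\t20\n'  > frags.tsv
printf '25\t40\t1030\t1045\t16\n' >> frags.tsv

# Count lines that do not have exactly five numeric fields or whose
# start coordinate lies behind the end coordinate.
bad_lines=$(awk -F '\t' '
  NF != 5 { bad++; next }
  { for (i = 1; i <= 5; i++) if ($i !~ /^[0-9]+(\.[0-9]+)?$/) { bad++; next } }
  $1 > $2 || $3 > $4 { bad++ }
  END { print bad + 0 }' frags.tsv)
echo "malformed lines: $bad_lines"
```

A check of this kind can be run on any fragment file before chaining; it does not call clasp itself.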

Download

For detailed instructions type
  ./clasp.x --help
or see the man page.

Installation

Download the latest release and extract the archive using
 tar -xvzf clasp_v*.tar.gz
Then change to the new directory and type
 make
Run clasp by typing
 ./clasp.x

Example (with BLAST)

Run BLAST

Download chr5 of Mus musculus from UCSC and uncompress:
  gunzip chr5.fa.gz
Create a BLAST database with
  formatdb -i chr5.fa -p F

Run BLAST with Human H/ACA snoRNA ACA42 (from snoRNABase) and
its reverse complement (options -m 8 and -S 1 are required):
  blastall -p blastn -d chr5.fa -i ACA42.fa -m 8 -S 1 -W 8 -e 1e5 -o ACA42.blast
  blastall -p blastn -d chr5.fa -i ACA42_revcomp.fa -m 8 -S 1 -W 8 -e 1e5 -o ACA42_revcomp.blast
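
With -m 8, blastall writes one hit per line in twelve tab-separated columns: query ID, subject ID, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, E-value, and bit score. The sketch below labels the fields of a single hit line; the line itself is invented for illustration and is not real BLAST output.

```shell
# Hypothetical hit line in BLAST -m 8 layout (values invented for illustration).
hit=$(printf 'ACA42\tchr5\t95.00\t40\t2\t0\t1\t40\t5000\t5039\t1e-10\t63.9')

# Label the sequence names, the coordinates, and the alignment length.
summary=$(printf '%s\n' "$hit" | awk -F '\t' '
  NF == 12 {
    printf "query=%s db=%s qstart=%s qend=%s dstart=%s dend=%s length=%s\n",
           $1, $2, $7, $8, $9, $10, $4
  }')
echo "$summary"
```

This layout suggests that -c 7 8 9 10 4 in the clasp call selects the query and database coordinates with the alignment length as fragment score, and that -C 1 2 selects the sequence names. Treat this reading of the options as an assumption and confirm it with ./clasp.x --help.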

Chain BLAST output

Run clasp on the BLAST output (with the sum-of-pair gap cost model,
epsilon=0, lambda=0.5, and a minimum required chain score of 35):
 ./clasp.x -i ACA42.blast -c 7 8 9 10 4 -C 1 2 -e 0 -l 0.5 -S 35 -o ACA42.chn
 ./clasp.x -i ACA42_revcomp.blast -c 7 8 9 10 4 -C 1 2 -e 0 -l 0.5 -S 35 -o ACA42_revcomp.chn
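
To make the epsilon and lambda parameters more concrete, here is a minimal sketch of a sum-of-pair gap cost in the two-sequence form commonly given in the chaining literature: the part of the gap that is present on both sequences is charged epsilon per position, and the remaining length difference is charged lambda per position. That clasp computes its gap cost in exactly this way is an assumption of this example, not something stated in this README; the function name sop_gap_cost and the gap lengths are hypothetical.

```shell
# Sum-of-pair gap cost between two consecutive fragments (hedged sketch).
# Arguments: gap length on query, gap length on database, epsilon, lambda.
sop_gap_cost() {
  awk -v dq="$1" -v dd="$2" -v eps="$3" -v lam="$4" 'BEGIN {
    lo = (dq < dd) ? dq : dd          # gap length shared by both sequences
    hi = (dq < dd) ? dd : dq          # longer of the two gaps
    print eps * lo + lam * (hi - lo)  # shared part + unpaired remainder
  }'
}

# Hypothetical gaps: 4 positions on the query, 10 on the database.
cost=$(sop_gap_cost 4 10 0 0.5)
echo "gap cost: $cost"
```

Under this assumption, epsilon=0 and lambda=0.5 mean that only the difference in gap length between query and database is penalized, at half a score unit per position.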

Contact

If you have any further questions, complaints, or bug reports, please send an e-mail to christian (at) bioinf (dot) uni-leipzig (dot) de.