Provided by: soapaligner_2.20-3_amd64

NAME

       SOAPaligner/soap2 - Short Oligonucleotide Analysis Package aligner

SYNOPSIS

       soap reference.index short_reads.fast[a|q] alignment.out [options]

DESCRIPTION

       SOAPaligner/soap2 is a member of the SOAP (Short Oligonucleotide Analysis Package) suite. It is an updated
       version of the SOAP software for short oligonucleotide alignment. The new program provides very fast and
       accurate alignment of the huge numbers of short reads generated by the Illumina/Solexa Genome Analyzer.
       Compared to soap v1, it is one order of magnitude faster: it requires only 2 minutes to align one million
       single-end reads onto the human reference genome. Another notable improvement of SOAPaligner is that it
       now supports a wide range of read lengths.

       SOAPaligner gains its time and space efficiency from a complete redesign of the basic data structures and
       algorithms used. The core algorithms and the indexing data structure (2way-BWT) were developed by the
       algorithms research group of the Department of Computer Science, the University of Hong Kong (T.W. Lam,
       Alan Tam, Simon Wong, Edward Wu and S.M. Yiu).

COMMAND AND OPTIONS

       soap   -D   <in.fasta.index>   -a   <query.file.a>   [-b   <query.file.b>]   -o   <alignment.output>  [-2
       <unpaired.output>] [options]

       OPTIONS:

              -D STR Prefix name for reference index [*.index]. See APPENDIX for how to build the reference index

              -a STR Query file, for SE reads alignment or one end of PE reads

              -b STR Query b file, one end of PE reads

              -o STR Output file for alignment results

              -2 STR Output file for mapped but unpaired reads when doing PE alignment

              -u STR Output file for unmapped reads, [none]

              -m INT Minimal insert size INT allowed for PE, [400]

              -x INT Maximal insert size INT allowed for PE, [600]

              -n INT Filter out low-quality reads containing more than INT Ns, [5]

              -t     Output read ID instead of read name, [none]

              -r INT How to report repeat hits, 0=none; 1=random one; 2=all, [1]

              -R     Use RF alignment for PE data with long insert size (>= 2k bp), [none] (FR alignment)

              -l INT For long reads with a high error rate at the 3'-end that cannot be aligned over their whole
                     length, first align the 5' INT bp subsequence as a seed, [256] (use the whole length of the
                     read)

              -s INT Minimal alignment length (for soft clip)

              -v INT Total number of mismatches allowed in one read when a subsequence is used as a seed, [5]

              -g INT Maximum gap size allowed in one read, [0]

              -M INT Match mode for each read, or for the seed part of a read, which must not contain more than 2
                     mismatches, [4]

                     0: exact match only

                     1: 1-mismatch match only

                     2: 2-mismatch match only

                     4: find the best hits

              -p INT Number of threads, [1]
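
       The options above can be combined programmatically. The following is a minimal Python sketch, not part
       of SOAPaligner, that assembles an argument list from the documented flags. The function name and the
       file names are hypothetical, and the checks reflect only the value ranges listed above.

```python
import shlex


def build_soap_command(index_prefix, reads_a, output, reads_b=None,
                       unpaired=None, min_insert=400, max_insert=600,
                       match_mode=4, repeat_mode=1, threads=1):
    """Assemble an argument list for soap from the documented options."""
    if match_mode not in (0, 1, 2, 4):
        raise ValueError("-M must be 0, 1, 2 or 4")
    if repeat_mode not in (0, 1, 2):
        raise ValueError("-r must be 0, 1 or 2")
    cmd = ["soap", "-D", index_prefix, "-a", reads_a, "-o", output]
    if reads_b is not None:
        # Paired-end run: the insert-size window (-m/-x) applies to PE data.
        if min_insert > max_insert:
            raise ValueError("-m must not exceed -x")
        cmd += ["-b", reads_b]
        if unpaired is not None:
            cmd += ["-2", unpaired]
        cmd += ["-m", str(min_insert), "-x", str(max_insert)]
    cmd += ["-M", str(match_mode), "-r", str(repeat_mode), "-p", str(threads)]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    cmd = build_soap_command("ref.fa.index", "reads_1.fq", "pe.soap",
                             reads_b="reads_2.fq", unpaired="se.soap")
    print(shlex.join(cmd))
```

       The returned list can be passed to subprocess.run() once soap is installed and the index exists.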

OUTPUT FORMAT

       SOAP2 output contains the following columns:

       1. read name, or read ID if -t is given

       2. read sequence (if the read aligns to the reverse strand, this is the reverse sequence of the original
       read)

       3. quality sequence (if the input reads are FASTA, this column is all 'h'; the sequence is reversed if the
       read maps to the reverse strand)

       4.
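
       As an illustration of the columns described so far, the sketch below reads the first three fields of
       one output line. It is not part of SOAPaligner. It assumes the columns are tab-separated, which this
       page does not state, and it leaves any later columns untouched. All names are hypothetical.

```python
from typing import NamedTuple


class SoapRecord(NamedTuple):
    read_name: str   # column 1: read name, or read ID when -t was given
    sequence: str    # column 2: reversed if the read hit the reverse strand
    quality: str     # column 3: all 'h' when the input reads were FASTA


def parse_leading_columns(line, sep="\t"):
    """Return the first three documented columns of one alignment line.

    Assumption: columns are separated by tabs; pass another `sep` if not.
    """
    fields = line.rstrip("\n").split(sep)
    if len(fields) < 3:
        raise ValueError("expected at least three columns")
    return SoapRecord(*fields[:3])


def is_fasta_placeholder(quality):
    """True when the quality string consists only of 'h' characters.

    This matches the filler written for FASTA input, although a FASTQ read
    could in principle carry the same string.
    """
    return len(quality) > 0 and set(quality) == {"h"}
```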

APPENDIX

       Before using soap2 for alignment, the reference index must be generated with 2bwt-builder.

              2bwt-builder <reference.fasta>

              NOTE: 1. the reference input must be in FASTA format; 2. the program automatically generates the
              index files in the directory where the FASTA file is located, so check that you have write
              permission there first.
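
       The permission check in the note can be done before starting a long index build. The following Python
       sketch is not part of SOAPaligner and its function name is hypothetical. It also assumes that the
       prefix later given to soap -D is the FASTA path plus ".index", as the <in.fasta.index> placeholder
       suggests.

```python
import os


def plan_index_build(fasta_path):
    """Return (command, index_prefix) for building a 2bwt index.

    Raises PermissionError if the directory holding the FASTA file is not
    writable, since 2bwt-builder writes the index files next to it.
    """
    directory = os.path.dirname(os.path.abspath(fasta_path))
    if not os.access(directory, os.W_OK):
        raise PermissionError(f"cannot write index files to {directory}")
    command = ["2bwt-builder", fasta_path]
    # Assumption: the index prefix is the FASTA path plus ".index".
    index_prefix = fasta_path + ".index"
    return command, index_prefix
```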

ENVIRONMENT

       The data structure is incompatible with 32-bit systems, so the program cannot be ported to any 32-bit
       platform. Because MMX instructions are used to optimize parts of the code, the current version runs only
       on x86_64 platforms. We will provide a universal version for most 64-bit platforms later.

       HARDWARE REQUIREMENT
              1. 8 GB RAM (for a genome as large as human's)

              2. at least 8 GB of hard disk space to store the index (for a genome as large as human's)

       SYSTEM REQUIREMENT
              Linux x86_64
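
       These requirements can be checked from a script before running the aligner. The Python sketch below is
       not part of SOAPaligner. Its function names are hypothetical, and it assumes the disk requirement
       above is meant in gigabytes.

```python
import platform
import shutil


def check_platform(system=None, machine=None):
    """Return a list of problems; an empty list means Linux on x86_64."""
    system = system or platform.system()
    machine = machine or platform.machine()
    problems = []
    if system != "Linux":
        problems.append(f"operating system is {system}, expected Linux")
    if machine != "x86_64":
        problems.append(f"architecture is {machine}, expected x86_64")
    return problems


def has_disk_space(path, required_gb=8):
    """True if the filesystem holding `path` has at least `required_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024 ** 3
```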

SEE ALSO

       Website for SOAP <http://soap.genomics.org.cn>,

       Google Group for SOAP <http://groups.google.com/group/bgi-soap>

       Publication:
              "SOAP: short oligonucleotide alignment program" (2008) Bioinformatics, Vol. 24, no. 5, pages
              713-714

AUTHOR

       BGI Shenzhen SOAP team. The core algorithm Bidirect-BWT was written by Prof. T.W. Lam and his team at
       the University of Hong Kong.

REPORT BUGS

       Report bugs to <soap@genomics.org.cn>

ACKNOWLEDGEMENTS

       We thank Prof. T.W. Lam, Alan Tam, Simon Wong, Edward Wu and S.M. Yiu for their prominent work on
       Bidirect-BWT.