Provided by: seqan-apps_1.4.1+dfsg-2_amd64

NAME

       rabema_evaluate - RABEMA Evaluation

       SYNOPSIS

              rabema_evaluate  [OPTIONS]  --reference REF.fa --in-gsi IN.gsi --in-sam MAPPING.sam
              rabema_evaluate [OPTIONS] --reference REF.fa --in-gsi IN.gsi --in-bam MAPPING.bam

       DESCRIPTION

              Compare the SAM/BAM output MAPPING.sam/MAPPING.bam of any read mapper  against  the
              RABEMA gold standard previously built with rabema_build_gold_standard. The input is
              a reference FASTA file, a gold standard interval (GSI) file and the  SAM/BAM  input
              to evaluate.

              The input SAM/BAM file must be sorted by queryname. The program will create a FASTA
              index file REF.fa.fai for fast random access to the reference.
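
              A quick pre-flight check can catch unsorted input before a long run. The sketch
              below reads the sort order that a SAM header declares in its @HD line (the SO
              tag of the SAM format). The function name declared_sort_order is hypothetical
              and not part of RABEMA. The tag is only a declaration by whoever wrote the
              file, and whether rabema_evaluate itself consults it is not stated here.

```python
def declared_sort_order(sam_header_lines):
    """Return the SO value of the @HD header line, or None if it is absent."""
    for line in sam_header_lines:
        if not line.startswith("@"):
            break  # header lines come first; stop at the first alignment record
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs after the record type.
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None


def looks_queryname_sorted(sam_path):
    """True if the SAM header of the file at sam_path declares SO:queryname."""
    with open(sam_path, "r") as handle:
        return declared_sort_order(handle) == "queryname"
```

              A file for which looks_queryname_sorted returns False may still be sorted
              correctly if its header simply omits the tag, so treat a negative result as a
              prompt to check the file, not as proof.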

       -h, --help

              Display this help message.

       --version

              Display version information.

       -v, --verbose

              Enable verbose output.

       -vv, --very-verbose

              Enable even more verbose output.

              Input / Output:

       -r, --reference FASTA

              Path to load reference FASTA from. Valid filetypes are: fa and fasta.

       -g, --in-gsi GSI

              Path to load gold standard intervals from. If compressed using gzip, the file  will
              be decompressed on the fly. Valid filetypes are: gsi and gsi.gz.

       -s, --in-sam SAM

              Path to load the read mapper SAM output from. Valid filetype is: sam.

       -b, --in-bam BAM

              Path to load the read mapper BAM output from. Valid filetype is: bam.

       --out-tsv TSV

              Path to write the statistics to as TSV. Valid filetype is: tsv.
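
              The on-the-fly decompression described for --in-gsi can be mirrored in scripts
              that inspect GSI files themselves. This is a minimal sketch that picks a reader
              by file extension. The helper name open_gsi is hypothetical, and the sketch
              makes no assumption about the contents of a GSI file beyond it being text.

```python
import gzip


def open_gsi(path):
    """Open a .gsi or .gsi.gz file for reading as text."""
    if path.endswith(".gsi.gz"):
        # gzip.open in text mode decompresses while the file is being read.
        return gzip.open(path, "rt")
    if path.endswith(".gsi"):
        return open(path, "r")
    raise ValueError("expected a file name ending in .gsi or .gsi.gz")
```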

              Benchmark Parameters:

       --oracle-mode

              Enable  oracle  mode. This is used for simulated data when the input GSI file gives
              exactly one position that is considered the true sample position.

       --only-unique-reads

              Consider only reads that have a single alignment in the mapping result file. Useful
              for precision computation.

       --match-N

              When set, N matches all characters without penalty.

       --distance-metric METRIC

              Set distance metric. One of hamming and edit. Default: edit.

       -e, --max-error RATE

              Maximal error rate, in percent, that the gold standard is built for. This parameter
              is an integer and relative to the read length. The error rate is ignored in  oracle
              mode;  there,  the  distance  of  the  read  at  the  sample  position is used instead,
              individually for each read. Default: 0.

       -c, --benchmark-category CAT

              Set benchmark category. One of all, all-best, and any-best. Default: all.

       --trust-NM

              When  set, we trust the alignment and distance from SAM/BAM file and no realignment
              is performed. Off by default.

       --ignore-paired-flags

              When set, we ignore all SAM/BAM flags related to pairing. This  is  necessary  when
              analyzing SAM from SOAP's soap2sam.pl script.

       --DONT-PANIC

              Do  not  stop  program execution if an additional hit was found that indicates that
              the gold standard is incorrect.
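
              To make the interplay of --distance-metric, --match-N and --max-error concrete,
              the sketch below computes both distances between a read and a reference segment
              and compares the result with an error budget derived from the read length. All
              function names are hypothetical. Two assumptions are not confirmed by this page:
              the budget is rounded down, and the distance is a plain global one between the
              two strings. How the program itself aligns reads against the reference is not
              described here.

```python
def chars_match(a, b, match_n=False):
    """Compare two characters; with match_n, N matches anything (as --match-N)."""
    if match_n and (a == "N" or b == "N"):
        return True
    return a == b


def hamming_distance(s, t, match_n=False):
    """Number of mismatching positions between two equal-length strings."""
    if len(s) != len(t):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(not chars_match(a, b, match_n) for a, b in zip(s, t))


def edit_distance(s, t, match_n=False):
    """Levenshtein distance (substitutions, insertions, deletions cost 1 each)."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        curr = [i]
        for j, b in enumerate(t, start=1):
            cost = 0 if chars_match(a, b, match_n) else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def max_errors(read_length, max_error_percent):
    """Error budget for a read; rounding down is an assumption of this sketch."""
    if read_length < 0 or max_error_percent < 0:
        raise ValueError("length and error rate must be non-negative")
    return read_length * max_error_percent // 100


def within_error_rate(read, segment, max_error_percent, metric="edit", match_n=False):
    """True if the read/segment distance fits the budget for the read length."""
    if metric == "hamming":
        distance = hamming_distance(read, segment, match_n)
    elif metric == "edit":
        distance = edit_distance(read, segment, match_n)
    else:
        raise ValueError("metric must be 'hamming' or 'edit'")
    return distance <= max_errors(len(read), max_error_percent)
```

              With the default rate of 0, the budget is 0 for every read length, so only
              alignments at distance 0 would pass a check like within_error_rate.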

              Logging:

       --show-missed-intervals

              Show details for each missed interval from the GSI.

       --show-invalid-hits

              Show details for invalid hits (with too high error rate).

       --show-additional-hits

              Show details for additional hits (low enough error rate but not in gold standard).

       --show-hits

              Show details for hit intervals.

       --show-try-hit

              Show details for each alignment in SAM/BAM input.

              The occurrence of "invalid" hits in the read mapper's output is not  an  error.  If
              there are additional hits, however, this shows an error in the gold standard.

       RETURN VALUES

              A return value of 0 indicates success, any other value indicates an error.
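
              Scripts that drive the program can rely on this convention. The sketch below
              assembles a command line from the options documented above and treats any
              non-zero exit status as failure. The helper names build_command and
              run_evaluation are hypothetical, and choosing --in-sam or --in-bam from the
              file extension is a convenience of this sketch, not program behavior.

```python
import subprocess


def build_command(reference, gsi, mapping, out_tsv=None, max_error=None):
    """Assemble an argument list for rabema_evaluate from documented options."""
    if mapping.endswith(".sam"):
        mapping_flag = "--in-sam"
    elif mapping.endswith(".bam"):
        mapping_flag = "--in-bam"
    else:
        raise ValueError("mapping file name must end in .sam or .bam")
    cmd = ["rabema_evaluate", "--reference", reference,
           "--in-gsi", gsi, mapping_flag, mapping]
    if out_tsv is not None:
        cmd += ["--out-tsv", out_tsv]
    if max_error is not None:
        cmd += ["--max-error", str(max_error)]
    return cmd


def run_evaluation(cmd):
    """Run the command and return True only if it exits with status 0."""
    completed = subprocess.run(cmd)
    return completed.returncode == 0
```

              For example, run_evaluation(build_command("REF.fa", "IN.gsi", "MAPPING.sam"))
              returns False for any non-zero exit status.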

       MEMORY REQUIREMENTS

              From  version 1.1, great care has been taken to keep the memory requirements as low
              as possible.

              The evaluation step needs to store the whole reference sequence in memory but little
              else.  For  the  human genome, the memory requirements are therefore below 4 GB,
              regardless of the size of the GSI or SAM/BAM file.

       REFERENCES

              M. Holtgrewe, A.-K. Emde, D.  Weese  and  K.  Reinert.  A  Novel  And  Well-Defined
              Benchmarking  Method  For  Second Generation Read Mapping, BMC Bioinformatics 2011,
              12:210.

              http://www.seqan.de/rabema

              RABEMA Homepage

              http://www.seqan.de/mason

              Mason Homepage

       VERSION

              rabema_evaluate version: 1.2.0
              Last update March 14, 2013