Provided by: seqan-apps_1.4.1+dfsg-2_amd64

NAME

       rabema_evaluate - RABEMA Evaluation

       SYNOPSIS

              rabema_evaluate [OPTIONS] --reference REF.fa --in-gsi IN.gsi --in-sam MAPPING.sam
              rabema_evaluate [OPTIONS] --reference REF.fa --in-gsi IN.gsi --in-bam MAPPING.bam

       DESCRIPTION

              Compare the SAM/BAM output (MAPPING.sam or MAPPING.bam) of any read mapper against the RABEMA
              gold standard previously built with rabema_build_gold_standard. The input is a reference FASTA
              file, a gold standard interval (GSI) file, and the SAM/BAM file to evaluate.

              The input SAM/BAM file must be sorted by queryname. The program will create  a  FASTA  index  file
              REF.fa.fai for fast random access to the reference.
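
              As a quick pre-check of the queryname requirement, the sort order declared in a SAM header can
              be read from the SO tag of the @HD line (defined by the SAM format). The sketch below is only
              an illustration: the helper name sam_sort_order is hypothetical, and this page does not say
              whether rabema_evaluate itself inspects the header. A missing SO tag does not prove the file
              is unsorted, since some read mappers write no sort order at all.

```python
from typing import Iterable, Optional


def sam_sort_order(lines: Iterable[str]) -> Optional[str]:
    """Return the SO value of the @HD header line, or None if it is absent."""
    for line in lines:
        if not line.startswith("@"):
            break  # header lines come first; stop at the first alignment record
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "@HD":
            for field in fields[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None


# Hypothetical header lines, used only for demonstration.
header = ["@HD\tVN:1.4\tSO:queryname\n", "@SQ\tSN:chr1\tLN:1000\n"]
print(sam_sort_order(header))
```

              For a real file, the same function can be given an open text file handle, for example
              sam_sort_order(open("MAPPING.sam")).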

       -h, --help

              Displays this help message.

       --version

              Displays version information.

       -v, --verbose

              Enable verbose output.

       -vv, --very-verbose

              Enable even more verbose output.

              Input / Output:

       -r, --reference FASTA

              Path to load reference FASTA from. Valid filetypes are: fa and fasta.

       -g, --in-gsi GSI

              Path to load gold standard intervals from. If compressed using gzip, the file will be decompressed
              on the fly. Valid filetypes are: gsi and gsi.gz.

       -s, --in-sam SAM

              Path to load the read mapper SAM output from. Valid filetype is: sam.

       -b, --in-bam BAM

              Path to load the read mapper BAM output from. Valid filetype is: bam.

       --out-tsv TSV

              Path to write the statistics to as TSV. Valid filetype is: tsv.
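
              The input and output options above can be assembled into a command line from a script. The
              following is a minimal sketch that uses only the flags documented on this page; the helper
              name build_command and the example file names are hypothetical. It selects --in-sam or
              --in-bam from the file extension and does not run the program.

```python
from pathlib import Path
from typing import List, Optional


def build_command(reference: str, gsi: str, mapping: str,
                  out_tsv: Optional[str] = None) -> List[str]:
    """Build an argument list for rabema_evaluate without executing it."""
    suffix = Path(mapping).suffix.lower()
    if suffix == ".sam":
        mapping_flag = "--in-sam"
    elif suffix == ".bam":
        mapping_flag = "--in-bam"
    else:
        raise ValueError(f"expected a .sam or .bam file, got: {mapping}")

    cmd = ["rabema_evaluate",
           "--reference", reference,
           "--in-gsi", gsi,
           mapping_flag, mapping]
    if out_tsv is not None:
        cmd += ["--out-tsv", out_tsv]
    return cmd


# Hypothetical file names, used only for demonstration.
print(build_command("REF.fa", "IN.gsi.gz", "MAPPING.bam", out_tsv="stats.tsv"))
```

              The resulting list can be passed to subprocess.run(cmd, check=True), which raises
              CalledProcessError when the program exits with a non-zero return value.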

              Benchmark Parameters:

       --oracle-mode

              Enable oracle mode. This is used for simulated data when the input GSI file gives exactly one
              position that is considered as the true sample position.

       --only-unique-reads

              Consider only reads that have a single alignment in the mapping result file. Useful for
              precision computation.

       --match-N

              When set, N matches all characters without penalty.

       --distance-metric METRIC

              Set distance metric. One of hamming and edit. Default: edit.
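
              To illustrate the two metrics and the effect of --match-N, the sketch below implements the
              textbook Hamming and edit (Levenshtein) distances with an optional rule that lets N match any
              character. It is not taken from the RABEMA sources: how the program aligns reads against the
              reference, and how it treats N when --match-N is not set, is not described on this page. Here,
              as an assumption, N is compared like any other character in that case.

```python
def chars_match(a: str, b: str, match_n: bool) -> bool:
    """Compare two characters; with match_n, N matches anything at no cost."""
    if match_n and (a == "N" or b == "N"):
        return True
    return a == b


def hamming_distance(read: str, ref: str, match_n: bool = False) -> int:
    """Count mismatching positions; both strings must have equal length."""
    if len(read) != len(ref):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(not chars_match(a, b, match_n) for a, b in zip(read, ref))


def edit_distance(read: str, ref: str, match_n: bool = False) -> int:
    """Levenshtein distance (substitutions, insertions, deletions cost 1)."""
    prev = list(range(len(ref) + 1))
    for i, a in enumerate(read, start=1):
        curr = [i]
        for j, b in enumerate(ref, start=1):
            cost = 0 if chars_match(a, b, match_n) else 1
            curr.append(min(prev[j] + 1,          # delete from read
                            curr[j - 1] + 1,      # insert into read
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


print(hamming_distance("ACNT", "ACGT"))                # N counted as mismatch
print(hamming_distance("ACNT", "ACGT", match_n=True))  # N matches G
print(edit_distance("ACGT", "AGT"))                    # one deleted base
```

              The Hamming distance only allows substitutions, so a single inserted or deleted base that the
              edit distance scores as one error can shift every following position into a mismatch.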

       -e, --max-error RATE

              Maximal error rate to build gold standard for, in percent. This parameter is an integer and
              relative to the read length. The error rate is ignored in oracle mode; there, the distance of
              the read at the sample position is taken, individually for each read. Default: 0.
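
              Because the rate is an integer percentage of the read length, the number of errors allowed for
              a given read can be estimated as follows. This is a sketch only: the function name max_errors
              is hypothetical, and rounding down is an assumption, since this page does not state how
              rabema_evaluate rounds.

```python
def max_errors(read_length: int, max_error_percent: int) -> int:
    """Errors allowed for one read at the given integer error rate in percent.

    Assumption: fractional results are rounded down.
    """
    if read_length < 0 or max_error_percent < 0:
        raise ValueError("read length and error rate must be non-negative")
    return read_length * max_error_percent // 100


print(max_errors(100, 4))  # 4 percent of a 100 bp read
print(max_errors(36, 5))   # 5 percent of a 36 bp read, rounded down
```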

       -c, --benchmark-category CAT

              Set benchmark category. One of all, all-best, and any-best. Default: all.

       --trust-NM

              When set, we trust the alignment and distance from SAM/BAM file and no realignment  is  performed.
              Off by default.

       --ignore-paired-flags

              When  set,  we  ignore  all SAM/BAM flags related to pairing. This is necessary when analyzing SAM
              from SOAP's soap2sam.pl script.

       --DONT-PANIC

              Do not stop program execution if an additional hit was found that indicates that the gold standard
              is incorrect.

              Logging:

       --show-missed-intervals

              Show details for each missed interval from the GSI.

       --show-invalid-hits

              Show details for invalid hits (with too high error rate).

       --show-additional-hits

              Show details for additional hits (low enough error rate but not in gold standard).

       --show-hits

              Show details for hit intervals.

       --show-try-hit

              Show details for each alignment in SAM/BAM input.

              The  occurrence  of  "invalid"  hits  in  the  read  mapper's output is not an error. If there are
              additional hits, however, this shows an error in the gold standard.

       RETURN VALUES

              A return value of 0 indicates success, any other value indicates an error.

       MEMORY REQUIREMENTS

              From version 1.1, great care has been taken to keep the memory requirements as low as possible.

              The evaluation step needs to store the whole reference sequence in memory but little else.
              For the human genome, the memory requirements are therefore below 4 GB, regardless of the size
              of the GSI or SAM/BAM file.
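
              The figure for the human genome can be checked with a rough calculation. The sketch below
              assumes about 3.1 billion bases stored at one byte per base; both numbers are assumptions made
              for illustration, and the program's actual in-memory encoding is not described on this page.

```python
HUMAN_GENOME_BASES = 3_100_000_000  # approximate length, assumption
BYTES_PER_BASE = 1                  # assumption: one byte per character


def reference_memory_gib(num_bases: int, bytes_per_base: int = BYTES_PER_BASE) -> float:
    """Estimated memory for the reference sequence alone, in GiB."""
    return num_bases * bytes_per_base / 2**30


print(f"{reference_memory_gib(HUMAN_GENOME_BASES):.2f} GiB")
```

              Under these assumptions the reference takes a little under 3 GiB, which leaves headroom below
              the stated 4 GB bound for the program's remaining data.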

       REFERENCES

              M. Holtgrewe, A.-K. Emde, D. Weese and K. Reinert. A Novel And  Well-Defined  Benchmarking  Method
              For Second Generation Read Mapping, BMC Bioinformatics 2011, 12:210.

              http://www.seqan.de/rabema

              RABEMA Homepage

              http://www.seqan.de/mason

              Mason Homepage

       VERSION

              rabema_evaluate version: 1.2.0
              Last update March 14, 2013