Provided by: seqan-apps_1.4.1+dfsg-2_amd64

NAME

       rabema_build_gold_standard - RABEMA Gold Standard Builder

       SYNOPSIS

              rabema_build_gold_standard  [OPTIONS]  --out-gsi  OUT.gsi  --reference REF.fa --in-sam PERFECT.sam
              rabema_build_gold_standard [OPTIONS] --out-gsi OUT.gsi --reference REF.fa --in-bam PERFECT.bam

       DESCRIPTION

              This program builds a RABEMA gold standard. The input is a reference FASTA file and a
              perfect SAM/BAM mapping (e.g. created using RazerS 3 in full-sensitivity mode).

              The  input  SAM/BAM  file must be sorted by coordinate. The program will create a FASTA index file
              REF.fa.fai for fast random access to the reference.
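
              Because the input must be sorted by coordinate, it can help to check a SAM file before
              running the builder. The following is a minimal sketch, not part of RABEMA; the function
              name is_coordinate_sorted is hypothetical, and it assumes that reference order follows the
              @SQ header lines and that unmapped records (reference name "*") come last.

```python
def is_coordinate_sorted(sam_lines):
    """Return True if the alignment records appear in coordinate order.

    Assumption: reference sequences are ranked in the order of their
    @SQ header lines; unmapped records (RNAME "*") sort to the end.
    """
    ref_rank = {}      # reference name -> rank taken from the header
    last_key = None
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Header line; only @SQ lines define the reference order.
            if line.startswith("@SQ"):
                for field in line.split("\t")[1:]:
                    if field.startswith("SN:"):
                        ref_rank.setdefault(field[3:], len(ref_rank))
            continue
        fields = line.split("\t")
        rname, pos = fields[2], int(fields[3])
        if rname == "*":
            key = (float("inf"), 0)
        else:
            # References missing from the header are ranked by first use.
            key = (ref_rank.setdefault(rname, len(ref_rank)), pos)
        if last_key is not None and key < last_key:
            return False
        last_key = key
    return True
```

              Called on the lines of an open SAM file, the function returns False at the first record
              that is out of order.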

       -h, --help

              Display this help message.

       --version

              Display version information.

       -v, --verbose

              Enable verbose output.

       -vv, --very-verbose

              Enable even more verbose output.

              Input / Output:

       -o, --out-gsi GSI

              Path to write the resulting GSI file to. Valid filetypes are: gsi and gsi.gz.

       -r, --reference FASTA

              Path to load reference FASTA from. Valid filetypes are: fa and fasta.

       -s, --in-sam SAM

              Path to load the "perfect" SAM file from. Valid filetype is: sam.

       -b, --in-bam BAM

              Path to load the "perfect" BAM file from. Valid filetype is: bam.

              Gold Standard Parameters:

       --oracle-mode

              Enable oracle mode. This is used for simulated data when the input SAM/BAM file gives exactly  one
              position that is considered as the true sample position.

       --match-N

              When set, N matches all characters without penalty.

       --distance-metric METRIC

              Set distance metric. Valid values: hamming, edit. Default: edit.
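
              To illustrate the difference between the two metrics, here is a small sketch of both
              distances, with an optional flag mirroring the idea of --match-N. It is an illustration
              only: it compares two whole strings, whereas the program aligns reads against the
              reference, and the function names are hypothetical.

```python
def _same(x, y, match_n):
    """Characters match if equal, or if match_n is set and either is N."""
    return x == y or (match_n and "N" in (x, y))


def hamming_distance(a, b, match_n=False):
    """Count mismatching positions; substitutions only, no indels."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(0 if _same(x, y, match_n) else 1 for x, y in zip(a, b))


def edit_distance(a, b, match_n=False):
    """Levenshtein distance: substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))   # distances for the empty prefix of a
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if _same(x, y, match_n) else 1
            cur.append(min(prev[j] + 1,          # deletion from a
                           cur[j - 1] + 1,       # insertion into a
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1]
```

              Hamming distance is only defined for strings of equal length, while edit distance can also
              explain a difference by a single inserted or deleted base.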

       -e, --max-error RATE

              Maximal error rate, in percent, to build the gold standard for. RATE is an integer and is
              relative to the read length. In oracle mode, the error rate of the read at its sampling
              position is used, and RATE serves as a cutoff threshold. Default: 0.
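
              As a worked illustration of an integer percentage that is relative to the read length,
              the sketch below converts RATE into an absolute error budget per read. The helper name
              max_errors is hypothetical, and rounding down is an assumption; this page does not state
              how the program rounds.

```python
def max_errors(read_length, rate_percent):
    """Errors allowed for one read at the given integer error rate.

    Assumption: the product is rounded down to a whole number of errors.
    """
    if read_length < 0 or rate_percent < 0:
        raise ValueError("read_length and rate_percent must be non-negative")
    return (read_length * rate_percent) // 100
```

              Under this assumption, a 100 bp read at RATE 4 may carry up to 4 errors, and a 36 bp read
              at the same rate may carry 1.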

       RETURN VALUES

              A return value of 0 indicates success, any other value indicates an error.

       EXAMPLES

              rabema_build_gold_standard -e 4 -o OUT.gsi -s IN.sam -r REF.fa

              Build gold standard from a SAM file IN.sam with all mapping locations and a FASTA reference REF.fa
              to GSI file OUT.gsi with a maximal error rate of 4%.

              rabema_build_gold_standard --distance-metric hamming -e 4 -o OUT.gsi -b IN.bam -r REF.fa

              Same as above, but using Hamming instead of edit distance and BAM as the input.

              rabema_build_gold_standard --oracle-mode -o OUT.gsi -s IN.sam -r REF.fa

              Build  gold standard from a SAM file IN.sam with the original sample position, e.g. as exported by
              read simulator Mason.
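
              When the program is driven from a script, the command lines above can be assembled
              programmatically. The following sketch uses only options documented on this page; the
              helper build_command and its parameter names are hypothetical, and choosing --in-bam by
              file extension is an assumption of the sketch.

```python
def build_command(out_gsi, reference, alignments, max_error=0,
                  distance_metric="edit", oracle_mode=False, match_n=False):
    """Return an argument list for rabema_build_gold_standard."""
    if distance_metric not in ("hamming", "edit"):
        raise ValueError("distance_metric must be 'hamming' or 'edit'")
    # Pick the input option from the file extension (sketch assumption).
    in_flag = "--in-bam" if alignments.endswith(".bam") else "--in-sam"
    cmd = [
        "rabema_build_gold_standard",
        "--out-gsi", out_gsi,
        "--reference", reference,
        in_flag, alignments,
        "--distance-metric", distance_metric,
        "--max-error", str(int(max_error)),
    ]
    if oracle_mode:
        cmd.append("--oracle-mode")
    if match_n:
        cmd.append("--match-N")
    return cmd
```

              The resulting list can be passed to subprocess.run(cmd, check=True), which raises an
              exception when the program exits with a non-zero return value.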

       MEMORY REQUIREMENTS

              Since version 1.1, great care has been taken to keep the memory requirements as low as possible.
              The memory required is two times the size of the largest chromosome plus some constant memory
              for each match.

              For example, the memory usage for 100bp human genome reads at 5% error rate was 1.7GB. Of this,
              roughly 400MB came from the chromosome and 1.3GB from the matches.
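
              The rule of thumb stated in this section can be written as a one-line estimate. This is a
              sketch only: the function name is hypothetical, and the per-match constant bytes_per_match
              is not given on this page, so it has to be supplied by the caller.

```python
def estimate_memory_bytes(largest_chromosome_bytes, num_matches, bytes_per_match):
    """Two times the largest chromosome plus a constant amount per match.

    bytes_per_match is a caller-supplied assumption; this page does not
    state the actual constant.
    """
    return 2 * largest_chromosome_bytes + num_matches * bytes_per_match
```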

       REFERENCES

              M.  Holtgrewe,  A.-K.  Emde, D. Weese and K. Reinert. A Novel And Well-Defined Benchmarking Method
              For Second Generation Read Mapping, BMC Bioinformatics 2011, 12:210.

              http://www.seqan.de/rabema

              RABEMA Homepage

              http://www.seqan.de/mason

              Mason Homepage

       VERSION

              rabema_build_gold_standard version: 1.2.0

              Last update: March 14, 2013