Provided by: gmap_2017-11-15-1_amd64

NAME

       gmap - Genomic Mapping and Alignment Program

SYNOPSIS

       gmap [OPTIONS...] <FASTA files...>, or cat <FASTA files...> | gmap [OPTIONS...]

OPTIONS

   Input options (must include -d or -g)
       -D, --dir=directory
              Genome   directory.   Default  (as  specified  by  --with-gmapdb  to  the  configure  program)  is
              /var/cache/gmap

       -d, --db=STRING
              Genome database.  If argument is '?' (with the quotes), this command lists available databases.

       -k, --kmer=INT
              kmer size to use in genome database (allowed values: 16 or less).  If not specified,  the  program
              will find the highest available kmer size in the genome database

       --sampling=INT
              Sampling  to  use  in  genome  database.   If  not  specified,  the program will find the smallest
              available sampling value in the genome database within selected k-mer size

       -g, --gseg=filename
              User-supplied genomic segment

       -1, --selfalign
              Align one sequence  against  itself  in  FASTA  format  via  stdin  (Useful  for  getting  protein
              translation of a nucleotide sequence)

       -2, --pairalign
              Align two sequences in FASTA format via stdin, first one being genomic and second one being cDNA

       --cmdline=STRING,STRING
              Align  these  two  sequences  provided on the command line, first one being genomic and second one
              being cDNA

       -q, --part=INT/INT
              Process only the i-th out of every n sequences, e.g., 0/100 or 99/100 (useful for distributing
              jobs to a computer farm).
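
              One way to use -q on a farm is to launch n jobs, each with a different i/n.  The Python sketch
              below only builds the argument lists.  The database name mydb, the input file reads.fa, and the
              helper name part_commands are hypothetical, and the 0-based range 0/n to (n-1)/n is inferred
              from the 0/100 and 99/100 examples above.

```python
def part_commands(n_jobs, db="mydb", fasta="reads.fa"):
    """Return one gmap argument list per job, using -q i/n with i in 0..n-1.

    db and fasta are placeholder names, not values taken from the man page.
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    return [["gmap", "-d", db, "-q", f"{i}/{n_jobs}", fasta]
            for i in range(n_jobs)]


if __name__ == "__main__":
    # Print the command lines; submitting them to a scheduler is left out.
    for argv in part_commands(4):
        print(" ".join(argv))
```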

       --input-buffer-size=INT
              Size of input buffer (program reads this many sequences at a time for efficiency) (default 1000)

       Computation options

       -B, --batch=INT
              Batch mode (default = 2)

                       Mode     Offsets       Positions       Genome
                         0      see note      mmap            mmap
                         1      see note      mmap & preload  mmap
               (default) 2      see note      mmap & preload  mmap & preload
                         3      see note      allocate        mmap & preload
                         4      see note      allocate        allocate
                         5      expand        allocate        allocate

              Note: For a single sequence, all data structures use mmap.  If mmap is not available and
              allocate is not chosen, the program will use fileio (very slow).

              Note about --batch and offsets: Expansion of offsets can be controlled independently by the
              --expand-offsets flag.  The --batch=5 option is equivalent to --batch=4 plus
              --expand-offsets=1.

       --expand-offsets=INT
              Whether to expand the genomic offsets index.  Values: 0 (no, default), or 1 (yes).  Expansion
              gives faster alignment, but requires more memory

       --nosplicing
              Turns off splicing (useful for aligning genomic sequences onto a genome)

       --min-intronlength=INT
              Min length for one internal intron (default 9).  Below this size, a genomic gap will be considered
              a deletion rather than an intron.

       --max-intronlength-middle=INT
              Max length for one internal intron (default 500000).  Note: for backward compatibility, the -K  or
              --intronlength flag will set both --max-intronlength-middle and --max-intronlength-ends.  Also see
              --split-large-introns below.

       --max-intronlength-ends=INT
              Max  length for first or last intron (default 10000).  Note: for backward compatibility, the -K or
              --intronlength flag will set both --max-intronlength-middle and --max-intronlength-ends.

       --split-large-introns
              Sometimes GMAP will exceed the value for --max-intronlength-middle, if  it  finds  a  good  single
              alignment.  However, you can force GMAP to split such alignments by using this flag
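
              The intron-length thresholds above can be read as a simple classification of a genomic gap.
              The sketch below is an illustration of that reading, not GMAP's implementation: the function
              name classify_gap and the label "exceeds-max" are hypothetical, and, as noted above, GMAP may
              still report an intron longer than --max-intronlength-middle unless --split-large-introns is
              given.

```python
def classify_gap(gap_len, is_end_intron,
                 min_intronlength=9,
                 max_intronlength_middle=500000,
                 max_intronlength_ends=10000):
    """Classify a genomic gap using the documented default thresholds.

    Returns "deletion", "intron", or "exceeds-max" (a label invented here).
    """
    # Below --min-intronlength the gap counts as a deletion, not an intron.
    if gap_len < min_intronlength:
        return "deletion"
    # First/last introns and internal introns have separate maximum lengths.
    limit = max_intronlength_ends if is_end_intron else max_intronlength_middle
    if gap_len > limit:
        return "exceeds-max"
    return "intron"
```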

       --trim-end-exons=INT
              Trim end exons with fewer than given number of matches (in nt, default 12)

       -w, --localsplicedist=INT
              Max length for known splice sites at ends of sequence (default 2000000)

       -L, --totallength=INT
              Max total intron length (default 2400000)

       -x, --chimera-margin=INT
              Amount  of  unaligned  sequence  that  triggers  search  for  the remaining sequence (default 30).
              Enables alignment of chimeric reads, and may help with some non-chimeric reads.  To turn off,  set
              to zero.

       --no-chimeras
              Turns off finding of chimeras.  Same effect as --chimera-margin=0

       -t, --nthreads=INT
              Number of worker threads

       -c, --chrsubset=string
              Limit search to given chromosome

       -z, --direction=STRING
              cDNA direction (sense_force, antisense_force, sense_filter, antisense_filter, or auto (default))

       --canonical-mode=INT
              Reward for canonical and semi-canonical introns: 0=low reward, 1=high reward (default), 2=low
              reward for high-identity sequences and high reward otherwise

       --cross-species
              Use a more sensitive search for canonical  splicing,  which  helps  especially  for  cross-species
              alignments and other difficult cases

       --allow-close-indels=INT
              Allow  an  insertion  and  deletion  close  to  each  other  (0=no,  1=yes  (default),  2=only for
              high-quality alignments)

       --microexon-spliceprob=FLOAT
              Allow microexons only if one of the splice site probabilities is greater than this value  (default
              0.95)

       --cmetdir=STRING
              Directory  for methylcytosine index files (created using cmetindex) (default is location of genome
              index files specified using -D, -V, and -d)

       --atoidir=STRING
              Directory for A-to-I RNA editing index files (created using atoiindex)  (default  is  location  of
              genome index files specified using -D, -V, and -d)

       --mode=STRING
              Alignment mode: standard (default), cmet-stranded, cmet-nonstranded, atoi-stranded,
              atoi-nonstranded, ttoc-stranded, or ttoc-nonstranded.  Non-standard modes require you to have
              previously run the cmetindex or atoiindex programs (which also cover the ttoc modes) on the genome

       -p, --prunelevel
              Pruning level: 0=no pruning (default), 1=poor seqs, 2=repetitive seqs, 3=poor and repetitive

       Output types

       -S, --summary
              Show summary of alignments only

       -A, --align
              Show alignments

       -3, --continuous
              Show alignment in three continuous lines

       -4, --continuous-by-exon
              Show alignment in three lines per exon

       -Z, --compress
              Print output in compressed format

       -E, --exons=STRING
              Print exons ("cdna" or "genomic")

       -P, --protein_dna
              Print protein sequence (cDNA)

       -Q, --protein_gen
              Print protein sequence (genomic)

       -f, --format=INT
              Other  format  for  output  (also note the -A and -S options and other options listed under Output
              types):
               psl (or 1) = PSL (BLAT) format,
               gff3_gene (or 2) = GFF3 gene format,
               gff3_match_cdna (or 3) = GFF3 cDNA_match format,
               gff3_match_est (or 4) = GFF3 EST_match format,
               splicesites (or 6) = splicesites output (for GSNAP splicing file),
               introns = introns output (for GSNAP splicing file),
               map_exons (or 7) = IIT FASTA exon map format,
               map_ranges (or 8) = IIT FASTA range map format,
               coords (or 9) = coords in table format,
               sampe = SAM format (setting paired_read bit in flag),
               samse = SAM format (without setting paired_read bit),
               bedpe = indels and gaps in BEDPE format
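
              As an illustration of combining -D, -d, -f, and -t, the Python sketch below assembles an
              argument list and checks the format name against the names listed above.  The helper name
              build_gmap_command and the example database and file names are hypothetical.  The list can be
              passed to subprocess.run once gmap and a genome database are installed.

```python
# Format names accepted by -f, copied from the list above.
FORMATS = {
    "psl", "gff3_gene", "gff3_match_cdna", "gff3_match_est", "splicesites",
    "introns", "map_exons", "map_ranges", "coords", "sampe", "samse", "bedpe",
}


def build_gmap_command(db, fasta_files, fmt=None, nthreads=None, genome_dir=None):
    """Build a gmap argument list; nothing is executed here."""
    argv = ["gmap"]
    if genome_dir is not None:
        argv += ["-D", genome_dir]
    argv += ["-d", db]
    if fmt is not None:
        if fmt not in FORMATS:
            raise ValueError(f"unknown output format: {fmt}")
        argv += ["-f", fmt]
    if nthreads is not None:
        argv += ["-t", str(nthreads)]
    argv += list(fasta_files)
    return argv


if __name__ == "__main__":
    # "mydb" and "reads.fa" are placeholders for a real database and input.
    print(" ".join(build_gmap_command("mydb", ["reads.fa"], fmt="samse", nthreads=4)))
```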

       Output options

       -n, --npaths=INT
              Maximum number of paths to show (default  5).   If  set  to  1,  GMAP  will  not  report  chimeric
              alignments, since those imply two paths.  If you want a single alignment plus chimeric alignments,
              then set this to be 0.

       --suboptimal-score=FLOAT
              Report only paths whose score is within this value of the best path.  If specified between 0.0
              and 1.0, then treated as a fraction of the score of the best alignment (matches minus penalties
              for mismatches and indels).  Otherwise, treated as an integer number to be subtracted from the
              score of the best alignment.  Default value is 0.50.
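
              The two interpretations of --suboptimal-score can be sketched as follows.  This is one reading
              of the description above, not GMAP's code: the function names are hypothetical, the fractional
              case is taken to mean "within fraction times best score of the best score", and a value of
              exactly 1.0 is assumed to fall in the integer case.

```python
def suboptimal_cutoff(best_score, suboptimal_score=0.50):
    """Return the lowest score that would still be reported (assumed reading)."""
    if 0.0 <= suboptimal_score < 1.0:
        # Fractional case: allowed distance is a fraction of the best score.
        delta = suboptimal_score * best_score
    else:
        # Integer case: allowed distance is subtracted from the best score.
        delta = int(suboptimal_score)
    return best_score - delta


def keep_paths(scores, suboptimal_score=0.50):
    """Keep the scores that are within the allowed distance of the best one."""
    if not scores:
        return []
    cutoff = suboptimal_cutoff(max(scores), suboptimal_score)
    return [s for s in scores if s >= cutoff]
```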

       -O, --ordered
              Print output in same order as input (relevant only if there is more than one worker thread)

       -5, --md5
              Print MD5 checksum for each query sequence

       -o, --chimera-overlap
              Overlap to show, if any, at chimera breakpoint

       --failsonly
              Print only failed alignments, those with no results

       --nofails
              Exclude printing of failed alignments

       -V, --snpsdir=STRING
              Directory for SNPs index files (created using snpindex) (default is location of genome index files
              specified using -D and -d)

       -v, --use-snps=STRING
              Use  database  containing  known  SNPs  (in  <STRING>.iit,  built  previously  using snpindex) for
              tolerance to SNPs

       --split-output=STRING
              Basename for multiple-file output, separately for nomapping, uniq, mult (and chimera, if
              --chimera-margin is selected)

       --failed-input=STRING
              Print completely failed alignments as input FASTA or FASTQ format  to  the  given  file.   If  the
              --split-output  flag  is  also  given,  this  file  is  generated in addition to the output in the
              .nomapping file.

       --append-output
              When --split-output or --failed-input is given, this flag will append output to the existing files.
              Otherwise, the default is to create new files.

       --output-buffer-size=INT
              Buffer size, in queries, for output thread (default 1000).  When  the  number  of  results  to  be
              printed exceeds this size, the worker threads are halted until the backlog is cleared

       --translation-code=INT
              Genetic code used for translating codons to amino acids and computing CDS.  Integer value
              (default=1) corresponds to an available code at
              http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi

       --alt-start-codons
              Also, use the alternate initiation codons shown in the above Web site.  By default, without this
              option, only ATG is considered an initiation codon

       -F, --fulllength
              Assume full-length protein, starting with Met

       -a, --cdsstart=INT
              Translate codons from given nucleotide (1-based)
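
              To make the 1-based convention of -a concrete, the sketch below translates a nucleotide
              sequence starting at a given position, using the standard genetic code (NCBI translation
              table 1, the default for --translation-code).  The function name translate_from, the use of X
              for unrecognized codons, and stopping at the first stop codon are choices of this sketch, not
              documented GMAP behavior.

```python
BASES = "TCAG"
# Amino acids of NCBI translation table 1, in TCAG codon order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from(seq, cdsstart=1):
    """Translate seq starting at the 1-based nucleotide position cdsstart."""
    if cdsstart < 1:
        raise ValueError("cdsstart is 1-based and must be at least 1")
    seq = seq.upper()
    protein = []
    # Convert the 1-based start to a 0-based index and step codon by codon.
    for pos in range(cdsstart - 1, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[pos:pos + 3], "X")
        protein.append(aa)
        if aa == "*":  # stop at the first stop codon (sketch's choice)
            break
    return "".join(protein)
```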

       -T, --truncate
              Truncate alignment around full-length protein, Met to Stop.  Implies -F flag.

       -Y, --tolerant
              Translates cDNA with corrections for frameshifts

       Options for GFF3 output

       --gff3-add-separators=INT
              Whether to add a ### separator after each query sequence.  Values: 0 (no), 1 (yes, default)

       --gff3-swap-phase=INT
              Whether to swap phase (0 => 0, 1 => 2, 2 => 1) in gff3_gene format.  Needed by some analysis
              programs, but deviates from GFF3 specification.  Values: 0 (no, default), 1 (yes)
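
              The phase swap described above is a fixed mapping, which can be written as a small lookup.
              The names SWAPPED_PHASE and swap_phase are hypothetical.  Applying the swap twice returns the
              original phase.

```python
# Mapping stated for --gff3-swap-phase: 0 => 0, 1 => 2, 2 => 1.
SWAPPED_PHASE = {0: 0, 1: 2, 2: 1}


def swap_phase(phase):
    """Return the swapped phase for a GFF3 CDS phase of 0, 1, or 2."""
    return SWAPPED_PHASE[phase]
```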

       --gff3-cds=STRING
              Whether to use cDNA or genomic translation for the CDS coordinates.  Values: cdna (default), genomic

       Options for SAM output

       --no-sam-headers
              Do not print headers beginning with '@'

       --sam-use-0M
              Insert 0M in CIGAR between adjacent insertions and deletions.  Required by Picard, but can cause
              errors in other tools
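
              The effect of --sam-use-0M on a CIGAR string can be illustrated with a short post-processing
              sketch.  This shows the transformation only and is not how GMAP produces it; the function name
              insert_0m is hypothetical.

```python
import re


def insert_0m(cigar):
    """Insert 0M between an insertion and a deletion that are adjacent."""
    out = []
    prev_op = None
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        # An I directly followed by a D (or a D by an I) gets a 0M separator.
        if prev_op is not None and {prev_op, op} == {"I", "D"}:
            out.append("0M")
        out.append(length + op)
        prev_op = op
    return "".join(out)
```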

       --sam-extended-cigar
              Use extended CIGAR format (using = and X symbols instead of M, to indicate matches and
              mismatches, respectively)
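
              The difference between the standard and extended CIGAR can be sketched by expanding each M
              operation into = (match) and X (mismatch) runs.  The function name extend_cigar is
              hypothetical, and ref is assumed to be the reference segment spanned by the alignment,
              including any deleted or skipped bases.

```python
import re


def extend_cigar(cigar, read, ref):
    """Rewrite M operations as runs of = and X by comparing read and ref."""
    out = []  # list of [length, op], merged as runs

    def push(op, n=1):
        if out and out[-1][1] == op:
            out[-1][0] += n
        else:
            out.append([n, op])

    r = g = 0  # current positions in read and reference segment
    for length, op in re.findall(r"(\d+)([MIDNSH])", cigar):
        n = int(length)
        if op == "M":
            for _ in range(n):
                push("=" if read[r] == ref[g] else "X")
                r += 1
                g += 1
        else:
            push(op, n)
            if op in "IS":    # these consume read bases only
                r += n
            elif op in "DN":  # these consume reference bases only
                g += n
    return "".join(f"{n}{op}" for n, op in out)
```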

       --force-xs-dir
              For RNA-Seq alignments, disallows XS:A:? when the sense direction is unclear,  and  replaces  this
              value  arbitrarily  with  XS:A:+.  May be useful for some programs, such as Cufflinks, that cannot
              handle XS:A:?.  However, if you use this flag, the reported value of XS:A:+ in  these  cases  will
              not be meaningful.

       --md-lowercase-snp
              In MD string, when known SNPs are given by the -v flag, prints difference nucleotides as
              lower-case when they differ from reference but match a known alternate allele

       --action-if-cigar-error
              Action to take if there is a disagreement between CIGAR length and sequence length.  Allowed
              values: ignore, warning (default), abort

       --read-group-id=STRING
              Value to put into read-group id (RG-ID) field

       --read-group-name=STRING
              Value to put into read-group name (RG-SM) field

       --read-group-library=STRING
              Value to put into read-group library (RG-LB) field

       --read-group-platform=STRING
              Value to put into read-group platform (RG-PL) field
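
              The four read-group flags correspond to the ID, SM, LB, and PL tags of a SAM @RG header line.
              The sketch below shows how such a line is laid out in the SAM format.  The function name
              read_group_header and the tag order are choices of this sketch, and the exact line GMAP writes
              is not shown in this page.

```python
def read_group_header(rg_id, name=None, library=None, platform=None):
    """Build a tab-separated SAM @RG header line from read-group values."""
    fields = ["@RG", f"ID:{rg_id}"]
    for tag, value in (("SM", name), ("LB", library), ("PL", platform)):
        if value is not None:
            fields.append(f"{tag}:{value}")
    return "\t".join(fields)


if __name__ == "__main__":
    # "rg1" and "sampleA" are placeholder values.
    print(read_group_header("rg1", name="sampleA", platform="ILLUMINA"))
```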

       Options for quality scores

       --quality-protocol=STRING
              Protocol for input quality scores.  Allowed values:
                illumina (ASCII 64-126) (equivalent to -J 64 -j -31)
                sanger   (ASCII 33-126) (equivalent to -J 33 -j 0)

              Default is sanger (no quality print shift).  SAM output files should have quality scores in
              sanger protocol.

              Or you can specify the print shift with this flag:

       -j, --quality-print-shift=INT
              Shift FASTQ quality scores by this amount in output (default is 0 for sanger protocol;  to  change
              Illumina input to Sanger output, select -31)
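
              The print shift is an offset applied to the ASCII code of each quality character.  The sketch
              below applies such a shift to a quality string, so that -31 maps the Illumina range (starting
              at ASCII 64) onto the Sanger range (starting at ASCII 33).  The function name shift_qualities
              and the range check are additions of this sketch.

```python
def shift_qualities(quality, shift=0):
    """Shift each quality character by `shift` ASCII positions."""
    codes = [ord(ch) + shift for ch in quality]
    # Reject results outside the printable range 33-126 used by FASTQ.
    if any(code < 33 or code > 126 for code in codes):
        raise ValueError("shifted quality falls outside ASCII 33-126")
    return "".join(chr(code) for code in codes)


if __name__ == "__main__":
    # Illumina-encoded qualities converted to Sanger encoding.
    print(shift_qualities("hhh@", -31))
```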

       External map file options

       -M, --mapdir=directory
              Map directory

       -m, --map=iitfile
              Map file.  If argument is '?' (with the quotes),
               this lists available map files.

       -e, --mapexons
              Map each exon separately

       -b, --mapboth
              Report hits from both strands of genome

       -u, --flanking=INT
              Show flanking hits (default 0)

       --print-comment
              Show comment line for each hit

       Alignment output options

       -N, --nolengths
              No intron lengths in alignment

       -I, --invertmode=INT
              Mode for alignments to genomic (-) strand:
                0=Don't invert the cDNA (default)
                1=Invert cDNA and print genomic (-) strand
                2=Invert cDNA and print genomic (+) strand

       -i, --introngap=INT
              Nucleotides to show on each end of intron (default 3)

       -l, --wraplength=INT
              Wrap length for alignment (default 50)

       Filtering output options

       --min-trimmed-coverage=FLOAT
              Do not print alignments with trimmed coverage less than this value (default=0.0, which means no
              filtering).  Note that chimeric alignments will be output regardless of this filter

       --min-identity=FLOAT
              Do not print alignments with identity less than this value (default=0.0, which means no
              filtering).  Note that chimeric alignments will be output regardless of this filter

       Help options

       --check
              Check compiler assumptions

       --version
              Show version

       --help Show this help message

       Other tools of GMAP suite are located in /usr/lib/gmap

gmap 2017-11-15-1                                 December 2017                                          GMAP(1)