Provided by: mummer_3.23+dfsg-8_amd64

NAME

       mummer - package for sequence alignment of multiple genomes

SYNOPSIS

       mummer-annotate <gapfile> <datafile>
       combineMUMs <RefSequence> <MatchSequences> <GapsFile>
       dnadiff [options] <reference> <query> or [options] -d <deltafile>
       exact-tandems <file> <min-match-len>
       gaps
       mapview [options] <coordsfile> [UTRcoords] [CDScoords]
       mgaps [-d <DiagDiff>] [-f <DiagFactor>] [-l <MatchLen>] [-s <MaxSeparation>]
       mummer [options] <reference-file> <query-files>
       mummerplot [options] <matchfile>
       nucmer [options] <Reference> <Query>
       nucmer2xfig
       promer [options] <Reference> <Query>
       repeat-match [options] <genome-file>
       run-mummer1 <fastareference> <fastaquery> <prefix> [-r]
       run-mummer3 <fastareference> <multi-fastaquery> <prefix>
       show-aligns [options] <deltafile> <refID> <qryID>

       Input  is  the  .delta output of either the "nucmer" or the "promer" program passed on the
       command line.

       Output is to stdout, and consists of all the alignments between the  query  and  reference
       sequences identified on the command line.

       NOTE:  No sorting is done by default, therefore the alignments will be ordered as found in
       the <deltafile> input.
       show-coords [options] <deltafile>
       show-snps [options] <deltafile>
       show-tiling [options] <deltafile>

DESCRIPTION

OPTIONS

       All tools except gaps accept the -h, --help, -V and --version options. The built-in help
       text is the more complete reference; this man page only summarizes it.
       combineMUMs  Combines  MUMs  in <GapsFile> by extending matches off ends and between MUMs.
       <RefSequence> is a fasta file of the reference sequence.   <MatchSequences>  is  a  multi-
       fasta file of the sequences matched against the reference.

         -D      Only output to stdout the difference positions
                 and characters
         -n      Allow matches only between nucleotides, i.e., ACGTs
         -N num  Break matches at <num> or more consecutive non-ACGTs
         -q tag  Used to label query match
         -r tag  Used to label reference match
         -S      Output all differences in strings
         -t      Label query matches with query fasta header
         -v num  Set verbose level for extra output
         -W file Set the output filename (default witherrors.gaps)
         -x      Don't output .cover files
         -e      Set error-rate cutoff to e (e.g. 0.02 is two percent)
       dnadiff  Run  comparative  analysis  of  two sequence sets using nucmer and its associated
       utilities with recommended parameters.  See  MUMmer  documentation  for  a  more  detailed
       description of the output. Produces the following output files:

           .report  - Summary of alignments, differences and SNPs
           .delta   - Standard nucmer alignment output
           .1delta  - 1-to-1 alignment from delta-filter -1
           .mdelta  - M-to-M alignment from delta-filter -m
           .1coords - 1-to-1 coordinates from show-coords -THrcl .1delta
           .mcoords - M-to-M coordinates from show-coords -THrcl .mdelta
           .snps    - SNPs from show-snps -rlTHC .1delta
           .rdiff   - Classified ref breakpoints from show-diff -rH .mdelta
           .qdiff   - Classified qry breakpoints from show-diff -qH .mdelta
           .unref   - Unaligned reference IDs and lengths (if applicable)
           .unqry   - Unaligned query IDs and lengths (if applicable)

       MANDATORY:
           reference       Set the input reference multi-FASTA filename
           query           Set the input query multi-FASTA filename
             or
           delta file      Unfiltered .delta alignment file from nucmer

       OPTIONS:
           -d|delta        Provide precomputed delta file for analysis
           -h
           --help          Display help information and exit
           -p|prefix       Set the prefix of the output files (default "out")
           -V
           --version       Display the version information and exit
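
       As a rough sketch of a dnadiff run, the following invokes the tool with a prefix and then
       checks which of the output files listed above exist. The names ref.fasta, qry.fasta and
       the prefix ref_qry are placeholders, not names used by MUMmer; the run is skipped if
       dnadiff is not installed or the inputs are absent.

```shell
prefix=ref_qry   # placeholder output prefix

# Run dnadiff only if it is installed and the placeholder inputs exist.
if command -v dnadiff >/dev/null 2>&1 && [ -r ref.fasta ] && [ -r qry.fasta ]; then
    dnadiff -p "$prefix" ref.fasta qry.fasta
fi

# Report which of the documented output files are present.
# .unref and .unqry are documented as written only if applicable.
missing=0
for ext in report delta 1delta mdelta 1coords mcoords snps rdiff qdiff unref unqry; do
    if [ -e "$prefix.$ext" ]; then
        echo "found:   $prefix.$ext"
    else
        echo "missing: $prefix.$ext"
        missing=$((missing + 1))
    fi
done
echo "$missing of 11 documented files not found"
```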

       mapview
         -h
         --help   Display help information and exit
         -m|mag   Set the magnification at which the figure is rendered,
                  this is an option for fig2dev which is used to generate
                  the PDF and PS files (default 1.0)
         -n|num   Set the number of output files used to partition the
                  output, this is to avoid generating files that are too
                  large to display (default 10)
         -p|prefix  Set the output file prefix
                  (default "PROMER_graph" or "NUCMER_graph")
         -v
         --verbose  Verbose logging of the processed files
         -V
         --version  Display the version information and exit
         -x1 coord  Set the lower coordinate bound of the display
         -x2 coord  Set the upper coordinate bound of the display
         -g|ref     If the input file is provided by 'mgaps', set the
                    reference sequence ID (as it appears in the first column
                    of the UTR/CDS coords file)
         -I         Display the name of query sequences
         -Ir        Display the name of reference genes
       mummer  Find  and  output  (to  stdout)  the positions and length of all sufficiently long
       maximal matches of a substring in <query-file> and <reference-file>.

         -mum           compute maximal matches that are unique in both sequences
         -mumcand       same as -mumreference
         -mumreference  compute maximal matches that are unique in
                        the reference-sequence but not necessarily
                        in the query-sequence (default)
         -maxmatch      compute all maximal matches regardless of their uniqueness
         -n             match only the characters a, c, g, or t
                        they can be in upper or in lower case
         -l             set the minimum length of a match
                        if not set, the default value is 20
         -b             compute forward and reverse complement matches
         -r             only compute reverse complement matches
         -s             show the matching substrings
         -c             report the query-position of a reverse complement match
                        relative to the original query sequence
         -F             force 4 column output format regardless of the number of
                        reference sequence inputs
         -L             show the length of the query sequences on the header line
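
       A minimal sketch of one possible invocation, assuming placeholder inputs ref.fasta and
       qry.fasta: it asks for matches unique in both sequences (-mum) on both strands (-b),
       reports reverse complement positions relative to the original query (-c), and states the
       default minimum match length explicitly (-l 20). The output name ref_qry.mums is also a
       placeholder.

```shell
ref=ref.fasta    # placeholder reference file
qry=qry.fasta    # placeholder query file

# Assemble the command as positional parameters so it can be printed before it is run.
set -- mummer -mum -b -c -l 20 "$ref" "$qry"
echo "$*"

# Run only when mummer is installed and both inputs are readable.
# mummer writes its matches to stdout, so redirect them to a file.
if command -v mummer >/dev/null 2>&1 && [ -r "$ref" ] && [ -r "$qry" ]; then
    "$@" > ref_qry.mums
fi
```
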
       nucmer
           nucmer generates nucleotide alignments between two multi-FASTA input
           files. Two output files are generated. The .cluster output file lists
           clusters of matches between each sequence. The .delta file lists the
           distance between insertions and deletions that produce maximal scoring
           alignments between each sequence.

       MANDATORY:
           Reference     Set the input reference multi-FASTA filename
           Query         Set the input query multi-FASTA filename

         --mum           Use anchor matches that are unique in both the reference
                         and query
         --mumcand       Same as --mumreference
         --mumreference  Use anchor matches that are unique in the reference
                         but not necessarily unique in the query (default behavior)
         --maxmatch      Use all anchor matches regardless of their uniqueness

         -b|breaklen     Set the distance an alignment extension will attempt to
                         extend poor scoring regions before giving up (default 200)
         -c|mincluster   Sets the minimum length of a cluster of matches (default 65)
         --[no]delta     Toggle the creation of the delta file (default --delta)
         --depend        Print the dependency information and exit
         -d|diagfactor   Set the clustering diagonal difference separation factor
                         (default 0.12)
         --[no]extend    Toggle the cluster extension step (default --extend)
         -f
         --forward       Use only the forward strand of the Query sequences
         -g|maxgap       Set the maximum gap between two adjacent matches in a
                         cluster (default 90)
         -h
         --help          Display help information and exit
         -l|minmatch     Set the minimum length of a single match (default 20)
         -o
         --coords        Automatically generate the original NUCmer1.1 coords
                         output file using the 'show-coords' program
         --[no]optimize  Toggle alignment score optimization, i.e. if an alignment
                         extension reaches the end of a sequence, it will backtrack
                         to optimize the alignment score instead of terminating the
                         alignment at the end of the sequence (default --optimize)
         -p|prefix       Set the prefix of the output files (default "out")
         -r
         --reverse       Use only the reverse complement of the Query sequences
         --[no]simplify  Simplify alignments by removing shadowed clusters. Turn
                         this option off if aligning a sequence to itself to look
                         for repeats (default --simplify)
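
       The note on --[no]simplify suggests a self-alignment use case. The sketch below aligns a
       sequence to itself to look for repeats, using all anchor matches (--maxmatch) and turning
       simplification off (--nosimplify). The file genome.fasta and the prefix self are
       placeholders chosen for illustration.

```shell
genome=genome.fasta   # placeholder input file
prefix=self           # placeholder output prefix

# Reference and query are the same file for a self-alignment.
set -- nucmer --maxmatch --nosimplify -p "$prefix" "$genome" "$genome"
echo "$*"

# Run only when nucmer is installed and the input is readable.
if command -v nucmer >/dev/null 2>&1 && [ -r "$genome" ]; then
    "$@"
fi
```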

       promer
           promer generates amino acid alignments between two multi-FASTA DNA input
           files. Two output files are generated. The .cluster output file lists
           clusters of matches between each sequence. The .delta file lists the
           distance between insertions and deletions that produce maximal scoring
           alignments between each sequence. The DNA input is translated into all 6
           reading frames in order to generate the output, but the output coordinates
           reference the original DNA input.

       MANDATORY:
           Reference     Set the input reference multi-FASTA DNA file
           Query         Set the input query multi-FASTA DNA file

         --mum           Use anchor matches that are unique in both the reference
                         and query
         --mumcand       Same as --mumreference
         --mumreference  Use anchor matches that are unique in the reference
                         but not necessarily unique in the query (default behavior)
         --maxmatch      Use all anchor matches regardless of their uniqueness

         -b|breaklen     Set the distance an alignment extension will attempt to
                         extend poor scoring regions before giving up, measured in
                         amino acids (default 60)
         -c|mincluster   Sets the minimum length of a cluster of matches, measured in
                         amino acids (default 20)
         --[no]delta     Toggle the creation of the delta file (default --delta)
         --depend        Print the dependency information and exit
         -d|diagfactor   Set the clustering diagonal difference separation factor
                         (default 0.11)
         --[no]extend    Toggle the cluster extension step (default --extend)
         -g|maxgap       Set the maximum gap between two adjacent matches in a
                         cluster, measured in amino acids (default 30)
         -l|minmatch     Set the minimum length of a single match, measured in amino
                         acids (default 6)
         -m|masklen      Set the maximum bookend masking length, measured in amino
                         acids (default 8)
         -o
         --coords        Automatically generate the original PROmer1.1 ".coords"
                         output file using the "show-coords" program
         --[no]optimize  Toggle alignment score optimization, i.e. if an alignment
                         extension reaches the end of a sequence, it will backtrack
                         to optimize the alignment score instead of terminating the
                         alignment at the end of the sequence (default --optimize)

         -p|prefix       Set the prefix of the output files (default "out")
         -x|matrix       Set the alignment matrix number to 1 [BLOSUM 45],
                         2 [BLOSUM 62] or 3 [BLOSUM 80] (default 2)
       repeat-match Find all maximal exact matches in <genome-file>
         -E    Use exhaustive (slow) search to find matches
         -f    Forward strand only, don't use reverse complement
         -n #  Set minimum exact match length to #
         -t    Only output tandem repeats
         -V #  Set level of verbose (debugging) printing to #
       show-aligns
         -h      Display help information
         -q      Sort alignments by the query start coordinate
         -r      Sort alignments by the reference start coordinate
         -w int  Set the screen width - default is 60
         -x int  Set the matrix type - default is 2 (BLOSUM 62),
                 other options include 1 (BLOSUM 45) and 3 (BLOSUM 80)
                 note: only has effect on amino acid alignments
       show-coords
         -b          Merges overlapping alignments regardless of match dir
                     or frame and does not display any identity information.
         -B          Switch output to btab format
         -c          Include percent coverage information in the output
         -d          Display the alignment direction in the additional
                     FRM columns (default for promer)
         -g          Deprecated option. Please use 'delta-filter' instead
         -h          Display help information
         -H          Do not print the output header
         -I float    Set minimum percent identity to display
         -k          Knockout (do not display) alignments that overlap
                     another alignment in a different frame by more than 50%
                     of their length, AND have a smaller percent similarity
                     or are less than 75% of the size of the other alignment
                     (promer only)
         -l          Include the sequence length information in the output
         -L long     Set minimum alignment length to display
         -o          Annotate maximal alignments between two sequences, i.e.
                     overlaps between reference and query sequences
         -q          Sort output lines by query IDs and coordinates
         -r          Sort output lines by reference IDs and coordinates
         -T          Switch output to tab-delimited format

         Input  is the .delta output of either the "nucmer" or the "promer" program passed on the
       command line.

         Output is to stdout, and consists of a list of coordinates, percent identity, and  other
       useful  information  regarding  the  alignment  data  contained in the .delta file used as
       input.

         NOTE: No sorting is done by default, therefore the alignments will be ordered  as  found
       in the <deltafile> input.
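
         As an illustrative sketch, the command below sorts by reference (-r), adds coverage
       (-c) and sequence length (-l) columns, and keeps only alignments of at least 90 percent
       identity (-I 90) and length 1000 (-L 1000). The thresholds and the file name
       ref_qry.delta are arbitrary placeholders, not recommended values.

```shell
delta=ref_qry.delta              # placeholder delta file from nucmer or promer
coords=${delta%.delta}.coords    # derived output name: ref_qry.coords

# Assemble the command so it can be inspected before it is run.
set -- show-coords -rcl -I 90 -L 1000 "$delta"
echo "$*"

# Output goes to stdout, so redirect it; skip if the tool or input is missing.
if command -v show-coords >/dev/null 2>&1 && [ -r "$delta" ]; then
    "$@" > "$coords"
fi
```
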
       show-snps
         -C            Do not report SNPs from alignments with an ambiguous
                       mapping, i.e. only report SNPs where the [R] and [Q]
                       columns equal 0 and do not output these columns
         -h            Display help information
         -H            Do not print the output header
         -I            Do not report indels
         -l            Include sequence length information in the output
         -q            Sort output lines by query IDs and SNP positions
         -r            Sort output lines by reference IDs and SNP positions
         -S            Specify which alignments to report by passing
                       'show-coords' lines to stdin
         -T            Switch to tab-delimited format
         -x int        Include x characters of surrounding SNP context in the
                       output, default 0

         Input  is the .delta output of either the nucmer or promer program passed on the command
       line.

         Output is to stdout, and consists of a list of SNPs (or  amino  acid  substitutions  for
       promer)  with  positions  and other useful info.  Output will be sorted with -r by default
       and the [BUFF] column will always refer to the sequence whose positions have been  sorted.
       This value specifies the distance from this SNP to the nearest mismatch (end of alignment,
       indel, SNP, etc) in the same alignment, while the [DIST]  column  specifies  the  distance
       from  this  SNP  to  the  nearest sequence end. SNPs for which the [R] and [Q] columns are
       greater than 0 should be evaluated with caution, as these columns specify  the  number  of
       other alignments which overlap this position. Use -C to assure SNPs are only reported from
       unique alignment regions.
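
         Following the advice above, a sketch of a call that reports SNPs only from uniquely
       aligned regions (-C), sorted by reference (-r), with sequence lengths (-l) and
       tab-delimited output (-T). The file name ref_qry.delta is a placeholder.

```shell
delta=ref_qry.delta          # placeholder delta file
snps=${delta%.delta}.snps    # derived output name: ref_qry.snps

# -C drops SNPs from ambiguously mapped alignments and omits the [R] and [Q] columns.
set -- show-snps -ClrT "$delta"
echo "$*"

# Output goes to stdout, so redirect it; skip if the tool or input is missing.
if command -v show-snps >/dev/null 2>&1 && [ -r "$delta" ]; then
    "$@" > "$snps"
fi
```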

       show-tiling
         -a          Describe the tiling path by printing the tab-delimited
                     alignment region coordinates to stdout
         -c          Assume the reference sequences are circular, and allow
                     tiled contigs to span the origin
         -g int      Set maximum gap between clustered alignments [-1, INT_MAX]
                     A value of -1 will represent infinity
                     (nucmer default = 1000)
                     (promer default = -1)
         -i float    Set minimum percent identity to tile [0.0, 100.0]
                     (nucmer default = 90.0)
                     (promer default = 55.0)
         -l int      Set minimum length contig to report [-1, INT_MAX]
                     A value of -1 will represent infinity
                     (common default = 1)
         -p file     Output a pseudo molecule of the query contigs to 'file'
         -R          Deal with repetitive contigs by randomly placing them
                     in one of their copy locations (implies -V 0)
         -t file     Output a TIGR style contig list of each query sequence
                     that sufficiently matches the reference (non-circular)
         -u file     Output the tab-delimited alignment region coordinates
                     of the unusable contigs to 'file'
         -v float    Set minimum contig coverage to tile [0.0, 100.0]
                     (nucmer default = 95.0) sum of individual alignments
                     (promer default = 50.0) extent of syntenic region
         -V float    Set minimum contig coverage difference [0.0, 100.0]
                     i.e. the difference needed to determine one alignment
                     is 'better' than another alignment
                     (nucmer default = 10.0) sum of individual alignments
                     (promer default = 30.0) extent of syntenic region
         -x          Describe the tiling path by printing the XML contig
                     linking information to stdout

         Input is the .delta output of the nucmer program, run on very similar sequence data,  or
       the .delta output of the promer program, run on divergent sequence data.

         Output  is  to  stdout,  and  consists  of the predicted location of each aligning query
       contig as mapped to the reference sequences.  These coordinates reference  the  extent  of
       the  entire  query  contig, even when only a certain percentage of the contig was actually
       aligned (unless the -a option is used). Columns are: start in ref, end in ref, distance to
       next contig, length of this contig, alignment coverage, identity, orientation, and ID,
       respectively.
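
         The eight-column layout described above can be summarized with awk. This is a sketch
       only: the helper name summarize_tiling and the two sample rows are invented for
       illustration and are not real show-tiling output. Lines that do not have exactly eight
       fields are ignored.

```shell
# Count tiled contigs and sum their lengths (column 4, "length of this contig").
# Reads from the named files, or from stdin when no file is given.
summarize_tiling() {
    awk 'NF == 8 { n++; total += $4 }
         END { printf "%d contigs, %d bases\n", n, total }' "$@"
}

# Invented sample rows in the documented column order:
# start  end  gap-to-next  length  coverage  identity  orientation  ID
summary=$(printf '%s\n' \
    '1 1000 50 1000 100.00 99.50 + contig1' \
    '1051 3050 -20 2000 98.00 97.25 - contig2' | summarize_tiling)
echo "$summary"
```

         With MUMmer installed, the same helper could be fed real output, for example
       show-tiling ref_qry.delta | summarize_tiling, where ref_qry.delta is a placeholder name.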

SEE ALSO

       http://mummer.sourceforge.net/

       Open source MUMmer 3.0 is described in
       Versatile and open software for comparing large genomes.  S.  Kurtz,  A.  Phillippy,  A.L.
       Delcher,  M.  Smoot,  M.  Shumway, C. Antonescu, and S.L. Salzberg, Genome Biology (2004),
       5:R12.

AUTHOR

       mummer was written by S. Kurtz, A. Phillippy, A.L.  Delcher,  M.  Smoot,  M.  Shumway,  C.
       Antonescu, and S.L. Salzberg.

                                           May 21, 2005                                 MUMMER(1)