Provided by: patman_1.2.2+dfsg-6build1_amd64

NAME

       PatMaN - search for approximate patterns in DNA libraries

SYNOPSIS

       patman [ option | file ... ]

DESCRIPTION

       PatMaN searches for (small) patterns in (huge) DNA databases, allowing for some mismatches
       and optionally gaps.  Patterns and databases are read from  one  or  more  fasta(5)  files
       listed  as  non-option  arguments,  depending on whether the -D or -P option last preceded
       them, and matched against each other.  The output of PatMaN is a table containing one line
       for each match, consisting of tab-separated fields:

       •   name of database sequence,

       •   name of pattern,

       •   position of first matched base in database sequence (the first base of the sequence
           has position 1),

       •   position of last matched base in database sequence,

       •   strand (+ for literal match, - for reverse complement),

       •   edit distance (number of mismatches plus number of gaps).
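
       The six tab-separated fields listed above can be read back with a few lines of
       Python.  This is a minimal sketch, not part of PatMaN: the names Match and parse_match
       and the sample line are made up for illustration.

```python
from typing import NamedTuple


class Match(NamedTuple):
    """One line of PatMaN output (field names are illustrative)."""
    database: str   # name of database sequence
    pattern: str    # name of pattern
    start: int      # first matched base, 1-based
    end: int        # last matched base
    strand: str     # "+" literal match, "-" reverse complement
    edits: int      # mismatches plus gaps


def parse_match(line: str) -> Match:
    """Split one tab-separated output line into typed fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    db, pat, start, end, strand, edits = fields
    return Match(db, pat, int(start), int(end), strand, int(edits))


# Hypothetical sample line, not real PatMaN output.
print(parse_match("chr1\tprobe7\t1042\t1063\t-\t1"))
```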

OPTIONS

       -V, --version
              Print version number and exit.

       -e num, --edits num
              Allow up to num mismatches and/or gaps per match.

       -g num, --gaps num
              Allow up to num gaps per match.  Note that gaps count as mismatches, too, so the -e
              option  should always be set at least as high as the -g option.  Allowing many gaps
              can incur a considerable computational cost.

       -D, --databases
              Treat the following files as database.   Databases  must  be  in  fasta(5)  format.
              Multiple database files, including "-" for standard input, are allowed and are read
              in turn.

       -P, --patterns
              Treat the following files as patterns.  Pattern files must be in  fasta(5)  format.
              Multiple  pattern  files, including "-" for standard input, are allowed and are all
              read before touching the databases.

       -o file, --output file
              Redirect output to file.  The file name "-" causes output to be written to stdout,
              which is also the default.

       -a, --ambicodes
              Activate  the  interpretation  of ambiguity codes in patterns.  This results in the
              expansion of any pattern with ambiguity codes  into  multiple  patterns  which  can
              match independently.  Compare Unknown Nucleotides below.

       -s, --singlestrand
              Deactivate matching of reverse complements.  Normally, PatMaN tries to match
              patterns both literally and after reverse-complementing them; with this option
              set, only literal (forward-strand) matches are considered.

       -p num, --prefetch num
              Causes  num  pointers  to  be  prefetched  in  advance.   This  feature can improve
              performance, if PatMaN has been compiled for a processor architecture that supports
              prefetching.   The  optimum  value  for  your particular setup has to be determined
              empirically, but the default should be reasonably good.

       -l len, --min-length len
              Only consider patterns with a length of at least len.  Use  this  if  your  pattern
              collection  contains  short  sequences that you don't want lots of possible matches
              reported for.

       -x num, --chop3 num
              Cut off num bases from the 3' end of each pattern.   Use  this  for  patterns  with
              damaged,  edited,  etc.  3'  ends  that  should  be ignored.  The chopped bases are
              neither matched nor included in the reported match regions.

       -X num, --chop5 num
              Cut off num bases from the 5' end of each pattern.   Use  this  for  patterns  with
              damaged,  edited,  etc.  5'  ends  that  should  be ignored.  The chopped bases are
              neither matched nor included in the reported match regions.

       -A, --adenine-hack
              Allow adenine to be ignored in patterns.  This is  essentially  equivalent  to  not
              counting  gaps  in  the database, as long as it was an A that was gapped.  Using -A
              can be computationally extremely expensive,  both  in  terms  of  memory  and  time
              consumed.

       -q, --quiet
              Suppress  warnings  (about  unrecognized  characters  in input sequences or missing
              input files).  Even without -q, at most one such warning is given per run.

       -v, --verbose
              Prints additional progress information to stderr.

       -d flags, --debug flags
              Sets debugging flags to flags.  Flags may be the bitwise OR of any of the following
              values, each of which causes some output to appear on stderr.  Some of the values
              may only work if PatMaN has been compiled in debug mode.  The default value is 1.

       1      Print warnings.  Equivalent to not setting -q.

       2      Print progress information.  Equivalent to setting -v.

       4      Dump the suffix trie of the patterns.  Only available in debug build.

       8      Count number of visited nodes and  print  that  number  in  each  iteration.   Only
              available in debug build.

       16     Print total number of nodes fetched from memory after completing all databases.

       32     Output database sequence while it is being matched.
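
       Since the debug values above are distinct powers of two, combining them amounts to
       OR-ing (or, equivalently, adding) the individual values.  The following sketch shows
       the arithmetic; the constant and function names are illustrative and do not appear in
       PatMaN itself.

```python
# Debug values as documented above; the names are made up for this example.
WARNINGS = 1
PROGRESS = 2
DUMP_TRIE = 4
COUNT_VISITED = 8
TOTAL_FETCHED = 16
ECHO_DATABASE = 32


def debug_value(*flags: int) -> int:
    """Combine individual debug values with bitwise OR."""
    value = 0
    for flag in flags:
        value |= flag
    return value


# Warnings, progress information and the total node count together.
print(debug_value(WARNINGS, PROGRESS, TOTAL_FETCHED))  # prints 19
```

       The resulting number would be passed as the argument of -d, for example -d 19.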

NOTES

   Non-Option Arguments
       Non-option arguments (bare filenames) are treated as either database or pattern files,
       depending on whether the -D or -P option was the last that occurred before the
       filename.  If neither -D nor -P was given, file names are treated as pattern files.  If no
       database was given, it is instead read from standard input.  Standard input can be
       explicitly given as either a database or a pattern file by using the filename "-".  A
       warning is given if standard input is selected implicitly as database; an error message
       is given if no pattern files have been named at all.
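
       The rules above can be modelled as follows.  This is a simplified sketch of the
       described behaviour, not PatMaN's actual argument parser (which uses popt(3)): the
       function name classify_files is hypothetical, and all options other than -D and -P
       are assumed to have been removed from the argument list already.

```python
def classify_files(args):
    """Sort bare filenames into (databases, patterns) by the last -D/-P seen."""
    databases, patterns = [], []
    target = patterns              # without -D or -P, files are patterns
    for arg in args:
        if arg in ("-D", "--databases"):
            target = databases
        elif arg in ("-P", "--patterns"):
            target = patterns
        else:
            target.append(arg)     # bare filename, "-" means standard input
    if not patterns:
        raise ValueError("no pattern files named")
    if not databases:
        databases.append("-")      # implicit standard input (PatMaN warns here)
    return databases, patterns


print(classify_files(["reads.fa", "-D", "genome.fa", "-P", "more.fa"]))
```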

   Gapped Matching
       Allowing gaps often causes overlapping matches of single patterns at almost the same
       position.  PatMaN makes no attempt to filter these redundant matches.  Also note that
       allowing many gaps, and especially allowing an arbitrary number of gaps through the -A
       hack, can slow down PatMaN considerably and cause it to produce enormous amounts of
       output.  The use of some sort of post-processor to filter the redundant matches is
       highly recommended.
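
       One possible post-processor is sketched below.  It is an assumption about what a useful
       filter might look like, not something shipped with PatMaN: for each database sequence,
       pattern and strand, it groups matches whose regions overlap and keeps only the one
       with the lowest edit distance.  Matches are given as tuples in the order of the
       output fields; the function name filter_overlaps is made up.

```python
def filter_overlaps(matches):
    """Keep the lowest-edit match from each cluster of overlapping matches.

    matches: iterable of (database, pattern, start, end, strand, edits).
    """
    kept = []
    # Sort so that overlapping matches of the same pattern are adjacent.
    ordered = sorted(matches, key=lambda m: (m[0], m[1], m[4], m[2], m[3]))
    cluster_key = cluster_end = best = None
    for m in ordered:
        key = (m[0], m[1], m[4])
        if best is not None and key == cluster_key and m[2] <= cluster_end:
            # Overlaps the current cluster: extend it, keep the better match.
            cluster_end = max(cluster_end, m[3])
            if m[5] < best[5]:
                best = m
        else:
            # Start of a new cluster: emit the previous winner, if any.
            if best is not None:
                kept.append(best)
            cluster_key, cluster_end, best = key, m[3], m
    if best is not None:
        kept.append(best)
    return kept
```

       Other policies, such as keeping the longest match or all matches that tie for the
       lowest edit distance, may suit a given analysis better.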

   Unknown Nucleotides
       Unknown  nucleotides are most often encoded by the letter N.  If the --ambicodes option is
       not given, Ns in patterns are interpreted as  unknown  nucleotides  and  can  never  match
       without penalty.  If --ambicodes is given, Ns in patterns are expanded just like the other
       ambiguity codes, and effectively work as wildcards.  Unknown nucleotides can still be
       encoded  by  an  X  and will never match anything.  The database is treated differently in
       that anything other than A, C, G, T and  U,  including  ambiguity  codes,  is  treated  as
       unknown and can never match without penalty.
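
       To illustrate what expanding a pattern with ambiguity codes means, the sketch below
       enumerates every concrete sequence a pattern stands for, using the standard IUPAC
       nucleotide codes.  It only illustrates the idea and is not PatMaN's implementation;
       the names IUPAC and expand are made up, and leaving X and other unrecognized letters
       untouched is an assumption of this example.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(pattern: str) -> list:
    """Return all concrete sequences encoded by a pattern with ambiguity codes."""
    # Letters without an entry (such as X) are kept as they are.
    choices = [IUPAC.get(base, base) for base in pattern.upper()]
    return ["".join(combo) for combo in product(*choices)]


print(expand("ACRN"))  # 2 * 4 = 8 concrete patterns
```

       The number of expanded patterns is the product of the number of choices at each
       position, so a pattern containing k Ns alone expands into 4^k patterns.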

FILES

       /etc/popt
              The  system  wide  configuration  file  for  popt(3).   PatMaN identifies itself as
              "patman" to popt.

       ~/.popt
              Per user configuration file for popt(3).

BUGS

       None known.

AUTHOR

       Kay Pruefer <pruefer@eva.mpg.de>
       Udo Stenzel <udo_stenzel@eva.mpg.de>

SEE ALSO

       popt(3), fasta(5)