Provided by: meryl_0~20150903+r2013-8build3_amd64

NAME

       kmer-mask - mask and filter a set of nucleotide sequences by kmer content

SYNOPSIS

       kmer-mask {-novel|-confirmed} [-mdb mer-database] [-ms mer-size] [-edb exist-database] [-m
       min-size] [-e extend-size] [-lowthreshold l] [-highthreshold  h]  [-t  threads]  [-v]  [-h
       histogram] [-promote|-demote|-discard] -1 in.1.fastq [-2 in.2.fastq] -o output-prefix

DESCRIPTION

       Mask and filter a set of sequences (presumed to be reads) by kmer content.  Masking can
       either retain novel sequence not in the database, or retain confirmed sequence present in
       the database.  Filtering segregates the sequences into those that are fully masked,
       partially masked, or not masked.
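
       The two masking modes can be illustrated with a minimal sketch.  It is not the kmer-mask
       implementation: the function name mask_read, the 'N' mask character, and the rule that a
       base counts as present when any database kmer covers it are assumptions made for
       illustration, and reverse-complement (canonical) kmers and the -m and -e options are
       ignored.

```python
def mask_read(read, database_kmers, k, mode="confirmed", mask_char="N"):
    """Return `read` with every non-retained base replaced by `mask_char`.

    Hypothetical helper for illustration only.
    mode="confirmed" keeps bases covered by a database kmer;
    mode="novel" keeps bases not covered by any database kmer.
    """
    if mode not in ("novel", "confirmed"):
        raise ValueError("mode must be 'novel' or 'confirmed'")

    # Mark every base that lies inside at least one kmer found in the database.
    covered = [False] * len(read)
    for start in range(len(read) - k + 1):
        if read[start:start + k] in database_kmers:
            for pos in range(start, start + k):
                covered[pos] = True

    # In confirmed mode covered bases are retained; in novel mode uncovered ones are.
    keep_covered = (mode == "confirmed")
    return "".join(
        base if is_covered == keep_covered else mask_char
        for base, is_covered in zip(read, covered)
    )


database = {"ACG", "CGT"}
print(mask_read("ACGTTTGA", database, k=3, mode="confirmed"))  # ACGTNNNN
print(mask_read("ACGTTTGA", database, k=3, mode="novel"))      # NNNNTTGA
```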

OPTIONS

       -mdb mer-database
              load masking kmers from meryl(1) mer-database

       -ms mer-size
              size of the kmers, in bases

       -edb exist-database
              save masking kmers to an existDB(1) file exist-database for faster restarts

       -1 in.1.fastq

       -2 in.2.fastq
              input read files in fastq, fastq.gz, fastq.bz2 or fastq.xz format.  The second
              file is optional, but omitting it disrupts the output classification.

       -o out
              prefix for output reads

              out.fullymasked.[12].fastq
                     reads with less than 'lowthreshold' of their bases retained

              out.partiallymasked.[12].fastq
                     reads between the two thresholds

              out.retained.[12].fastq
                     reads with more than 'highthreshold' of their bases retained

              out.discarded.[12].fastq
                     reads with conflicting status

       -m min-size
              ignore database hits below this many consecutive kmers (0)

       -e extend-size
              extend database hits across this many missing kmers (0)

       -novel RETAIN novel sequence not present in the database

       -confirmed
              RETAIN confirmed sequence present in the database

       -promote
              promote the  lesser  RETAINED  read  to  the  status  of  the  more  RETAINED  read
              read1=fullymasked and read2=partiallymasked -> both are partiallymasked

       -demote
              demote  the  more  RETAINED  read  to  the  status  of  the  lesser  RETAINED  read
              read1=fullymasked and read2=partiallymasked -> both are fullymasked

       -discard
              discard pairs with conflicting status (DEFAULT)
              read1=fullymasked and read2=partiallymasked -> both are discarded

   stats on stderr, number of sequences with amount RETAINED:
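       -lowthreshold t
              reads with less than this fraction of bases RETAINED count as fully masked
              (0.3333)

       -highthreshold t
              reads with more than this fraction of bases RETAINED count as retained
              (0.6667)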

       -h histogram
              write a histogram of the amount of sequence RETAINED

       -t t   use t compute threads

       -v     show progress
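
       The threshold classification and the -promote, -demote and -discard policies described
       above can be sketched as follows.  This is an illustration under assumptions, not the
       program's code: the function names classify and resolve_pair are hypothetical, the
       thresholds are treated as fractions of bases RETAINED, and a read exactly on a threshold
       is assumed to be partially masked.

```python
# Statuses ordered from least to most sequence RETAINED.
ORDER = ["fullymasked", "partiallymasked", "retained"]


def classify(fraction_retained, low=0.3333, high=0.6667):
    """Map the fraction of bases RETAINED to a status (hypothetical helper)."""
    if fraction_retained < low:
        return "fullymasked"
    if fraction_retained > high:
        return "retained"
    # Assumption: values exactly on a threshold fall in the middle class.
    return "partiallymasked"


def resolve_pair(status1, status2, policy="discard"):
    """Reconcile the statuses of the two reads of a pair (hypothetical helper)."""
    if status1 == status2:
        return status1, status2
    if policy == "promote":
        # Both reads take the status of the more RETAINED read.
        best = max(status1, status2, key=ORDER.index)
        return best, best
    if policy == "demote":
        # Both reads take the status of the lesser RETAINED read.
        worst = min(status1, status2, key=ORDER.index)
        return worst, worst
    if policy == "discard":
        # Conflicting pairs are discarded (the documented default).
        return "discarded", "discarded"
    raise ValueError("policy must be 'promote', 'demote' or 'discard'")


read1 = classify(0.10)  # fullymasked
read2 = classify(0.50)  # partiallymasked
print(resolve_pair(read1, read2, "promote"))  # ('partiallymasked', 'partiallymasked')
print(resolve_pair(read1, read2, "demote"))   # ('fullymasked', 'fullymasked')
print(resolve_pair(read1, read2, "discard"))  # ('discarded', 'discarded')
```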

SEE ALSO

       meryl(1) existDB(1)