Ubuntu Manpage: mash-screen - determine whether query sequences are within a larger pool of sequences

Provided by: mash_2.3+dfsg-1build2_amd64

NAME

       mash-screen - determine whether query sequences are within a larger pool of sequences

SYNOPSIS

       mash screen [options] <queries>.msh <pool> [<pool>] ...
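
       A minimal sketch of assembling this command line from Python, using only the flags listed under
       OPTIONS. The function name build_screen_command and its parameter names are hypothetical and
       chosen for illustration; only the argument order follows the synopsis.

```python
def build_screen_command(queries, pools, threads=None, winner_takes_all=False,
                         min_identity=None, max_p_value=None):
    """Return an argv list of the form: mash screen [options] <queries>.msh <pool> ...

    All parameter names are illustrative; the flags come from the OPTIONS section.
    """
    if not pools:
        raise ValueError("at least one <pool> file (or '-') is required")

    argv = ["mash", "screen"]
    if threads is not None:
        argv += ["-p", str(threads)]        # -p <int>: number of threads
    if winner_takes_all:
        argv.append("-w")                   # -w: winner-takes-all strategy
    if min_identity is not None:
        argv += ["-i", str(min_identity)]   # -i <num>: minimum identity to report
    if max_p_value is not None:
        argv += ["-v", str(max_p_value)]    # -v <num>: maximum p-value to report

    argv.append(queries)                    # single .msh sketch file
    argv.extend(pools)                      # one or more pool files, or "-" for stdin
    return argv
```

       The resulting list can be passed to subprocess.run(argv, capture_output=True, text=True), provided
       the mash executable is installed and on the PATH.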

DESCRIPTION

       Determine how well query sequences are contained within a pool of sequences. The queries must be
       supplied as a single Mash sketch file (.msh), created with the mash sketch command. The <pool> files
       can be contigs or reads, in FASTA or FASTQ format, gzipped or not; "-" can be given for <pool> to read
       from standard input. The <pool> sequences are assumed to be nucleotides and will be six-frame
       translated if the <queries> are amino acids. The output fields are [identity, shared-hashes,
       median-multiplicity, p-value, query-ID, query-comment], where median-multiplicity is the median number
       of times the shared hashes were observed within the pool.
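
       The output fields above can be read into a small record type. The sketch below assumes that the
       fields are tab-separated, that shared-hashes is printed as "<shared>/<total>", and that
       median-multiplicity is an integer; none of this is stated on this page, so check it against real
       output. The names ScreenHit and parse_screen_line are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScreenHit:
    identity: float
    shared_hashes: int        # query hashes that were found in the pool
    total_hashes: int         # hashes in the query sketch
    median_multiplicity: int  # assumed to be printed as an integer
    p_value: float
    query_id: str
    query_comment: str


def parse_screen_line(line: str) -> ScreenHit:
    """Parse one line of mash screen output (assumed tab-separated)."""
    # Split into at most six fields so that tabs inside the comment survive.
    fields = line.rstrip("\n").split("\t", 5)
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(fields)}")

    identity, shared, multiplicity, p_value, query_id = fields[:5]
    comment = fields[5] if len(fields) == 6 else ""

    # Assumption: shared-hashes has the form "<shared>/<total>".
    shared_count, total_count = shared.split("/")

    return ScreenHit(
        identity=float(identity),
        shared_hashes=int(shared_count),
        total_hashes=int(total_count),
        median_multiplicity=int(multiplicity),
        p_value=float(p_value),
        query_id=query_id,
        query_comment=comment,
    )
```

       For example, the made-up line "0.998\t958/1000\t12\t0\tquery1.fna\texample comment" would parse
       into a ScreenHit with identity 0.998 and 958 of 1000 hashes shared.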

OPTIONS

       -h
           Help

       -p <int>
           Parallelism. This many threads will be spawned for processing.

       -w
           Winner-takes-all strategy for identity estimates. After counting hashes for each query, hashes that
           appear in multiple queries are removed from all but the query with the best identity (ties are
           broken in favor of the larger query), and the identities of the other queries are reduced
           accordingly. This removes output redundancy, providing a rough compositional outline.
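
           The winner-takes-all idea can be illustrated with a toy model over plain Python sets. This is a
           simplified sketch, not Mash's implementation: the function name, the data layout, and the
           identity estimator (containment fraction raised to the power 1/k) are all assumptions made for
           illustration.

```python
def winner_takes_all(query_hashes, pool_hashes, kmer_size=21):
    """Toy winner-takes-all pass.

    query_hashes: dict mapping query name -> set of hashes in its sketch
    pool_hashes:  set of hashes observed in the pool
    Returns a dict mapping query name -> identity after reassignment.
    """
    def identity(shared, total):
        # Assumed estimator: containment fraction raised to 1/k.
        return (shared / total) ** (1.0 / kmer_size) if shared else 0.0

    # Step 1: find the shared hashes of each query independently.
    shared = {name: hashes & pool_hashes for name, hashes in query_hashes.items()}

    # Step 2: rank queries by initial identity, ties broken by larger query.
    ranked = sorted(
        query_hashes,
        key=lambda name: (
            identity(len(shared[name]), len(query_hashes[name])),
            len(query_hashes[name]),
        ),
        reverse=True,
    )

    # Step 3: a shared hash is kept only by the best-ranked query that has it,
    # so lower-ranked queries lose those hashes and their identities drop.
    claimed = set()
    result = {}
    for name in ranked:
        kept = shared[name] - claimed
        claimed |= kept
        result[name] = identity(len(kept), len(query_hashes[name]))
    return result
```

           With queries A = {1, 2, 3, 4} and B = {3, 4, 5, 6} and a pool containing {1, 2, 3, 4, 5}, A keeps
           all four of its hashes, while B loses hashes 3 and 4 to A and is scored on hash 5 alone.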

   Output
       -i <num>
           Minimum identity to report. Inclusive unless set to zero, in which case only identities greater than
           zero (i.e. with at least one shared hash) will be reported. Set to -1 to output everything.

       -v <num>
           Maximum p-value to report.
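
           The reporting rules for -i and -v can be written as a single predicate. This is a sketch of the
           semantics described above, not Mash's code: the function name and the default values are
           hypothetical, and treating the p-value bound as inclusive is an assumption, since this page does
           not say.

```python
def should_report(identity, p_value, min_identity=0.0, max_p_value=1.0):
    """Decide whether a result passes the -i and -v thresholds.

    Defaults are illustrative only; this page does not list the real defaults.
    """
    if min_identity == 0:
        # Special case: zero means "strictly greater than zero",
        # i.e. at least one shared hash.
        passes_identity = identity > 0
    else:
        # Otherwise the bound is inclusive; -1 therefore lets everything through.
        passes_identity = identity >= min_identity

    # Assumption: the maximum p-value is treated as an inclusive bound.
    return passes_identity and p_value <= max_p_value
```

           Under these rules, a query with identity 0 is dropped when the minimum identity is 0 but reported
           when it is -1.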
