Provided by: mira-assembler_4.9.6-4build5_amd64

NAME

       mirabait - a 'grep' like tool to select reads with kmers up to 256 bp

SYNOPSIS

       mirabait [options] {-b baitfile [-b ...] | -B file | -j joblibrary} {-p file_1 file_2 | -P file3}* [file4
       ...]

DESCRIPTION

       mirabait selects reads from a read collection that are partly similar or equal to sequences defined as
       target baits. Similarity is defined as a user-adjustable number of k-mers (sequences of k consecutive
       bases) shared between the bait sequences and the screened sequences, searched either in forward
       direction only or in both forward and reverse complement direction. A DUST-like filter for repeats
       of up to 4 bases can optionally be applied.

       When used on paired files, selects sequences where at least one mate matches.
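
       The selection rule described above can be illustrated with a minimal Python sketch. It is not
       mirabait's implementation: the function names are hypothetical, only the case of a positive -n
       value (here min_hits) is covered, and no DUST-like filtering is done.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper case)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    # Anything that is not A, C, G or T is mapped to N.
    return "".join(complement.get(base, "N") for base in reversed(seq.upper()))


def kmers(seq, k):
    """Yield every run of k consecutive bases in seq."""
    seq = seq.upper()
    for start in range(len(seq) - k + 1):
        yield seq[start:start + k]


def build_bait_index(baits, k, both_strands=True):
    """Collect the k-mers of all bait sequences into one set."""
    index = set()
    for bait in baits:
        index.update(kmers(bait, k))
        if both_strands:  # corresponds to NOT giving -r
            index.update(kmers(reverse_complement(bait), k))
    return index


def read_matches(read, index, k, min_hits=1):
    """True if at least min_hits k-mers of the read occur in the index."""
    hits = sum(1 for kmer in kmers(read, k) if kmer in index)
    return hits >= min_hits


def pair_matches(mate1, mate2, index, k, min_hits=1):
    """A pair is selected when at least one mate matches."""
    return (read_matches(mate1, index, k, min_hits)
            or read_matches(mate2, index, k, min_hits))
```

       With the bait ACGTACGTTT and k=5, the read GGACGTAGG shares the k-mer ACGTA and is selected, while
       TTAAACGTT is selected through its reverse-complement k-mers even when the forward check alone
       would also succeed.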

OPTIONS

   Main options:
       -b file
              Load bait sequences from file (multiple -b allowed)

       -B file
              Load baits from kmer statistics file, not from sequence files.  Only one  -B  allowed,  cannot  be
              combined with -b.  (see -K for creating such a file)

       -j job Set options for a predefined job from the supplied MIRA library. Currently available jobs:

              rrna Bait rRNA sequences

       -p file1 file2
              Load paired sequences to search from file1 and file2. Files must contain the same number of
              sequences, and sequence names must be in the same order. Multiple -p allowed, but they must
              come before non-paired files.

       -P file
              Load paired sequences from file. File must be interleaved: pairs must follow each other,
              non-pairs are not allowed. Multiple -P allowed, but they must come before non-paired files.

       -k int kmer length of bait in bases (<=256, default=31)

       -n int If >0: minimum number of k-mer baits needed (default=1).
              If <=0: allowed number of missed kmers over sequence length.

       -d     Do not use kmers with microrepeats (DUST-like, see also -D)

       -D int Set length of microrepeats in kmers to discard from bait.
              - int > 0 microrepeat len in percentage of kmer length.  E.g.: -k 17 -D 67 --> 11.39 bases -->  12
              bases.
              - int < 0 microrepeat len in bases.
              - int != 0 implies -d, int=0 turns DUST filter off.

       -i     Selects sequences that do not hit bait

       -I     Selects sequences that hit and do not hit bait (to different files)

       -r     No checking of reverse complement direction

       -t     Number of threads to use (default=0 -> up to 4 CPU cores)
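
       The arithmetic behind the -D example (-k 17 -D 67 --> 11.39 bases --> 12 bases) can be written out
       as a short Python sketch. The function name is hypothetical, and rounding up is an assumption
       inferred from that single example.

```python
def microrepeat_length(k, d):
    """Translate a -D value into a microrepeat length in bases.

    d > 0: percentage of the k-mer length, rounded up (assumed from the
           documented example -k 17 -D 67 --> 11.39 --> 12).
    d < 0: length given directly in bases.
    d == 0: filter switched off, reported here as None.
    """
    if d == 0:
        return None
    if d < 0:
        return -d
    # Integer ceiling of k * d / 100, avoiding floating point rounding.
    return (k * d + 99) // 100
```

       For instance, microrepeat_length(17, 67) gives 12 and microrepeat_length(31, -4) gives 4.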

   Options for output definition:
       Normally mirabait writes separate result files (named 'bait_match_*' and 'bait_miss_*') for each input to
       the current directory. To change this behaviour and other output-related settings, use these options:

       -c     No case change of sequence to denote bait hits

       -l int length of a line (FASTA only, default 0=unlimited)

       -K file
              Save kmer statistics to 'file' (see also -B)

       -N name
              Change the prefix 'bait' to <name>. Has no effect if -o/-O is used and targets are not directories.

       -o <path>
              Save sequences matching bait to path. If path is a directory, write separate files into this
              directory. If not, combine all matching sequences from the input file(s) into a single file
              specified by the path.

       -O <path>
              Like -o, but for sequences not matching

   Other options:
       -T dir Use 'dir' as directory for temporary files instead of current working directory.

       -m integer
              Memory to use for computing kmer statistics
              0..100 = use percentage of free system memory
              >100 = amount of MiB to use (e.g. 16384 for 16 GiB)
              Default 75 (75% of free system memory).
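
       The two ranges of -m can be summarised in a small Python sketch. The function name and the free_mib
       argument are hypothetical; how mirabait measures free system memory is not described here.

```python
def memory_budget_mib(m, free_mib):
    """Interpret a -m value as a memory budget in MiB.

    0..100 is read as a percentage of free system memory,
    anything above 100 as an absolute amount in MiB.
    """
    if m < 0:
        raise ValueError("-m must not be negative")
    if m <= 100:
        return free_mib * m // 100
    return m
```

       With 16000 MiB free, the default of 75 yields a budget of 12000 MiB, whereas 16384 always means
       16384 MiB (16 GiB) regardless of the free memory.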

Defining file types to load/save:

       Normally mirabait recognises file types according to the file extension (even when packed). In cases
       where you need to force a certain file type because the file extension is non-standard, use the EMBOSS
       notation: <filetype>::<name_of_file>. E.g., to tell mirabait that "somefile.dat" is FASTQ, use:
       fastq::somefile.dat. Recognised types are: caf, fasta, fastq, gbf, gbk, gbff, maf and phd.
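
       How such an argument could be resolved is sketched below in Python. This is an illustration only:
       the function name is hypothetical, the list of packing suffixes is an assumption, and only
       extensions that equal a type name are recognised.

```python
import os

KNOWN_TYPES = {"caf", "fasta", "fastq", "gbf", "gbk", "gbff", "maf", "phd"}
PACK_SUFFIXES = {".gz", ".bz2"}  # assumption: not listed in this manual page


def resolve_file_type(spec):
    """Return (file_type, path) for a command-line file argument."""
    prefix, sep, rest = spec.partition("::")
    if sep and prefix.lower() in KNOWN_TYPES:
        # Explicit EMBOSS-style type, e.g. fastq::somefile.dat
        return prefix.lower(), rest
    # Otherwise fall back to the extension, skipping one packing suffix.
    stem, ext = os.path.splitext(spec)
    if ext.lower() in PACK_SUFFIXES:
        stem, ext = os.path.splitext(stem)
    file_type = ext.lstrip(".").lower()
    if file_type not in KNOWN_TYPES:
        raise ValueError("cannot determine file type of " + spec)
    return file_type, spec
```

       Here "fastq::somefile.dat" resolves to type fastq with path somefile.dat, while a bare
       "somefile.dat" is rejected because its extension is not a recognised type.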

       mirabait will write files in the same file type as the corresponding input files.

       Examples:

       mirabait -b b.fasta file.fastq

       mirabait -I -j rrna -p file_1.fastq file_2.fastq

       mirabait -b b1.fasta -b b2.gbk file.fastq

       mirabait -b fasta::baits.dat -p fastq::file_1.dat fastq::file_2.dat

       mirabait -b b.fasta -p file_1.fastq file_2.fastq -P file3.fasta file4.caf

       mirabait -I -b b.fasta -p file_1.fastq file_2.fastq -P file3.fasta file4.caf

       mirabait -k 27 -n 10 -b b.fasta file.fastq

       mirabait -b fasta::b.dat fastq::file.dat

       mirabait -o /dev/shm/ -b b.fasta -p file_1.fastq file_2.fastq

       mirabait -o /dev/shm/match -b b.fasta -p file_1.fastq file_2.fastq

       mirabait -b human_genome.fasta -K HG_kmerstats.mhs.gz -p file1.fastq file2.fastq

       mirabait -B HG_kmerstats.mhs.gz -p file1.fastq file2.fastq

       mirabait -d -B HG_kmerstats.mhs.gz -p file1.fastq file2.fastq

SEE ALSO

       mira(1), miraconvert(1)

       More extensive documentation is provided in the MIRA manual available online at

              http://mira-assembler.sourceforge.net/docs/DefinitiveGuideToMIRA.html

       On Debian, this can be installed with the mira-doc package and can then be found at
       /usr/share/doc/mira-assembler/DefinitiveGuideToMIRA.html. On other systems, you may want to check in
       /usr/local/share/mira/doc or run "locate DefinitiveGuideToMIRA" to find it locally.

       You can also subscribe to one of the MIRA mailing lists at

              http://www.chevreux.org/mira_mailinglists.html

       After subscribing, mail general questions to the MIRA talk mailing list:

              mira_talk@freelists.org

BUGS

       To report bugs or ask for features, please use the ticketing system at:

              http://sourceforge.net/projects/mira-assembler/

AUTHOR

       Bastien Chevreux <bach@chevreux.org>

       This  manual  page  was  written  by  Bastien Chevreux <bach@chevreux.org> but can be freely used for any
       documentation purpose.