Provided by: seqan-raptor_3.0.1+ds-3build1_amd64
NAME
Raptor-prepare - A fast and space-efficient pre-filter for querying very large collections of nucleotide sequences.
SYNOPSIS
raptor prepare --input <file> --output <directory> [--threads <number>] [--quiet] [--kmer <number>|--shape <01-pattern>] [--window <number>] [--kmer-count-cutoff <number>|--use-filesize-dependent-cutoff]
DESCRIPTION
Computes minimisers for use with raptor layout and raptor build. Can resume where it left off after a crash, or spread the computation across multiple runs.
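To make the term concrete, the following is a minimal sketch of how (k, w) minimisers can be selected from a sequence. It is an illustration only and not Raptor's implementation: it orders k-mers lexicographically and ignores shapes with gaps and reverse complements, whereas the actual tool may rank k-mers differently. The function names `kmers` and `minimisers` are hypothetical.

```python
def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order of position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def minimisers(seq, k, w):
    """Return (position, k-mer) pairs: the smallest k-mer of every window of
    w bases, with consecutive repeats of the same position collapsed.

    Assumption: "smallest" means lexicographically smallest, with ties
    resolved in favour of the leftmost k-mer.
    """
    if not 1 <= k <= w:
        raise ValueError("need 1 <= k <= w")
    all_kmers = kmers(seq, k)
    per_window = w - k + 1  # number of k-mers inside one window
    selected = []
    for start in range(len(all_kmers) - per_window + 1):
        window = all_kmers[start:start + per_window]
        # min() returns the first minimal index, i.e. the leftmost smallest k-mer
        offset = min(range(per_window), key=lambda i: window[i])
        position = start + offset
        if not selected or selected[-1][0] != position:
            selected.append((position, all_kmers[position]))
    return selected


print(minimisers("ACGTACGT", k=3, w=5))
# [(0, 'ACG'), (1, 'CGT'), (4, 'ACG')]
```

With w equal to k, which is the documented default for --window, every window holds exactly one k-mer, so each k-mer is selected. A larger window selects fewer k-mers.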
OPTIONS
General options
    --input (std::filesystem::path)
        File containing file names. The file must contain at least one file path per line, with multiple paths being separated by a whitespace. Each line in the file corresponds to one bin. Valid extensions for the paths in the file are [minimiser] when using preprocessed input from raptor prepare, and [embl,fasta,fa,fna,ffn,faa,frn,fas,fastq,fq,genbank,gb,gbk,sam], possibly followed by [bz2,gz,bgzf]. The input file must exist and read permissions must be granted.
    --output (std::filesystem::path)
        A valid path for the output directory. Will create a minimiser.list inside the output directory. This file contains a list of generated minimiser files, in the same order as the input. When you manually delete a .in_progress file, also delete the corresponding .header and .minimiser file! Created output files for each file:
        *.header: Contains the shape, window size, cutoff and minimiser count.
        *.minimiser: Contains binary minimiser values, one minimiser per line.
        *.in_progress: Temporary file to track process. Deleted after finishing computation.
    --threads (unsigned 8 bit integer)
        The number of threads to use. Default: 1. Value must be a positive integer.
    --quiet
        Do not print time and memory usage.

k-mer options
    --kmer (unsigned 8 bit integer)
        The k-mer size. Default: 20. Value must be in range [1,32].
    --window (unsigned 32 bit integer)
        The window size. Default: k-mer size. Value must be a positive integer.
    --shape (std::string)
        The shape to use for k-mers. Mutually exclusive with --kmer. Parsed from right to left. Default: 11111111111111111111 (a k-mer of size 20). Value must match the pattern '[01]+'.

Processing options
    --kmer-count-cutoff (unsigned 8 bit integer)
        Only store k-mers with at least (>=) x occurrences. Mutually exclusive with --use-filesize-dependent-cutoff. Default: 1. Value must be in range [1,254].
    --use-filesize-dependent-cutoff
        Apply cutoffs from Mantis (Pandey et al., 2018). Mutually exclusive with --kmer-count-cutoff.

Common options
    -h, --help
        Prints the help page.
    -hh, --advanced-help
        Prints the help page including advanced options.
    --version
        Prints the version information.
    --copyright
        Prints the copyright/license information.
    --export-help (std::string)
        Export the help page information. Value must be one of [html, man, ctd, cwl].
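The file passed to --input has one bin per line, with the paths of a bin separated by whitespace. The following is a minimal sketch of how such a list could be generated; it is not part of Raptor. The helper names (`is_sequence_file`, `bin_list_text`, `write_bin_list`) are hypothetical, the extension check covers only the sequence extensions listed above and matches them case-sensitively as an assumption, and paths containing whitespace are rejected because whitespace is the separator.

```python
from pathlib import Path

# Sequence extensions and optional compression suffixes as listed under --input.
SEQUENCE_EXTENSIONS = {
    ".embl", ".fasta", ".fa", ".fna", ".ffn", ".faa", ".frn", ".fas",
    ".fastq", ".fq", ".genbank", ".gb", ".gbk", ".sam",
}
COMPRESSION_EXTENSIONS = {".bz2", ".gz", ".bgzf"}


def is_sequence_file(path):
    """True if the path ends in a listed extension, optionally compressed."""
    suffixes = Path(path).suffixes
    if suffixes and suffixes[-1] in COMPRESSION_EXTENSIONS:
        suffixes = suffixes[:-1]
    return bool(suffixes) and suffixes[-1] in SEQUENCE_EXTENSIONS


def bin_list_text(bins):
    """Format bins (an iterable of lists of paths) as one line per bin."""
    lines = []
    for files in bins:
        files = [str(f) for f in files]
        if not files:
            raise ValueError("each bin needs at least one file path")
        for f in files:
            if any(ch.isspace() for ch in f):
                raise ValueError(f"path contains whitespace: {f!r}")
            if not is_sequence_file(f):
                raise ValueError(f"unexpected extension: {f!r}")
        lines.append(" ".join(files))
    return "\n".join(lines) + "\n"


def write_bin_list(bins, list_path):
    """Write the formatted list to list_path, e.g. 'bins.list'."""
    Path(list_path).write_text(bin_list_text(bins))


print(bin_list_text([["a.fasta", "b.fa.gz"], ["c.fq"]]), end="")
```

Here the first bin consists of two files and the second of one, so the output has two lines.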
EXAMPLES
raptor prepare --input bins.list --output some_directory --kmer 20 --window 24
raptor prepare --input bins.list --output some_directory --kmer-count-cutoff 2
raptor prepare --input bins.list --output some_directory --use-filesize-dependent-cutoff
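Since *.in_progress files are deleted only after a computation finishes, an interrupted run can leave them behind in the output directory. The following read-only sketch lists such markers together with the .header and .minimiser files that the --output description says must be deleted along with them. It is not part of Raptor. It assumes that the three files of one input share the same name up to the final extension, and it searches subdirectories as well because the exact layout of the output directory is not specified here; the function name `stale_outputs` is hypothetical.

```python
from pathlib import Path


def stale_outputs(output_dir):
    """Map each *.in_progress marker to its existing .header/.minimiser siblings.

    Nothing is deleted; the result only shows which files belong together.
    """
    result = {}
    for marker in sorted(Path(output_dir).rglob("*.in_progress")):
        # Assumption: siblings differ from the marker only in the last suffix.
        siblings = [marker.with_suffix(ext) for ext in (".header", ".minimiser")]
        result[marker] = [p for p in siblings if p.exists()]
    return result


if __name__ == "__main__":
    # "some_directory" is the output directory used in the examples above.
    for marker, siblings in stale_outputs("some_directory").items():
        print(marker, *siblings)
```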
VERSION
Last update: Unavailable
Raptor-prepare version: 3.0.1 (commit unavailable)
Sharg version: 1.1.1
SeqAn version: 3.3.0-rc.2
URL
https://github.com/seqan/raptor
LEGAL
Raptor-prepare Copyright: BSD 3-Clause License
Author: Enrico Seiler
Contact: enrico.seiler@fu-berlin.de
SeqAn Copyright: 2006-2023 Knut Reinert, FU-Berlin; released under the 3-clause BSDL.
In your academic works please cite: Raptor: A fast and space-efficient pre-filter for querying very large collections of nucleotide sequences; Enrico Seiler, Svenja Mehringer, Mitra Darvish, Etienne Turc, and Knut Reinert; iScience 2021 24 (7): 102782. doi: https://doi.org/10.1016/j.isci.2021.102782
For full copyright and/or warranty information see --copyright.