Provided by: python-cutadapt_1.9.1-1build1_amd64
NAME
cutadapt - manual page for cutadapt 1.8.3
DESCRIPTION
cutadapt version 1.8.3

Copyright © 2010-2015 Marcel Martin <marcel.martin@scilifelab.se>

cutadapt removes adapter sequences from high-throughput sequencing reads.

Usage:
    cutadapt -a ADAPTER [options] [-o output.fastq] input.fastq

For paired-end reads:
    cutadapt -a ADAPT1 -A ADAPT2 [options] -o out1.fastq -p out2.fastq in1.fastq in2.fastq

Replace "ADAPTER" with the actual sequence of your 3' adapter. IUPAC wildcard characters are supported. The reverse complement is *not* automatically searched. All reads from input.fastq will be written to output.fastq with the adapter sequence removed. Adapter matching is error-tolerant. Multiple adapter sequences can be given (use further -a options), but only the best-matching adapter will be removed.

Input may also be in FASTA format. Compressed input and output are supported and auto-detected from the file name (.gz, .xz, .bz2). Use the file name '-' for standard input/output. Without the -o option, output is sent to standard output.

Some other available features are:
    * Various other adapter types (5' adapters, "mixed" 5'/3' adapters etc.)
    * Trimming a fixed number of bases
    * Quality trimming
    * Trimming colorspace reads
    * Filtering reads by various criteria

Use "cutadapt --help" to see all command-line options. See http://cutadapt.readthedocs.org/ for full documentation.
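To make the notion of removing a 3' adapter concrete, the following Python sketch trims an adapter and everything after it from a read. It is a deliberately simplified illustration, not cutadapt's implementation: it uses exact string matching only, whereas cutadapt's matching is error-tolerant and supports IUPAC wildcards. The function name `trim_3prime_adapter` and the parameter `min_overlap` are hypothetical; the default of 3 mirrors the documented default of the -O option.

```python
def trim_3prime_adapter(read, adapter, min_overlap=3):
    """Remove a 3' adapter and everything after it.

    Simplified sketch: exact matching only (no mismatches, indels or
    wildcards), so it does NOT reproduce cutadapt's error-tolerant search.
    """
    # Case 1: the full adapter occurs somewhere in the read.
    # Cut at its first occurrence; the adapter and what follows are dropped.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]

    # Case 2: only the beginning of the adapter is present, at the very
    # end of the read. Try the longest possible overlap first.
    longest = min(len(adapter) - 1, len(read))
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:len(read) - overlap]

    # No adapter found: the read is returned unchanged.
    return read


adapter = "AGATCGGAAGAGC"  # example sequence chosen for illustration
print(trim_3prime_adapter("ACGTACGT" + adapter + "TTTT", adapter))  # ACGTACGT
print(trim_3prime_adapter("ACGTACGTAGATC", adapter))                # ACGTACGT
print(trim_3prime_adapter("ACGTACGT", adapter))                     # ACGTACGT
```

The second call shows why a minimum overlap matters: a partial adapter at the end of the read is only removed when enough bases match, which limits trimming caused by short random matches.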
OPTIONS
--version
    show program's version number and exit

-h, --help
    show this help message and exit

-f FORMAT, --format=FORMAT
    Input file format; can be either 'fasta', 'fastq' or 'sra-fastq'. Ignored when reading csfasta/qual files (default: auto-detect from file name extension).

Options that influence how the adapters are found:

Each of the following three parameters (-a, -b, -g) can be used multiple times and in any combination to search for an entire set of adapters of possibly different types. Only the best matching adapter is trimmed from each read (but see the --times option). Instead of giving an adapter directly, you can also write file:FILE and the adapter sequences will be read from the given FILE (which must be in FASTA format).

-a ADAPTER, --adapter=ADAPTER
    Sequence of an adapter that was ligated to the 3' end. The adapter itself and anything that follows is trimmed. If the adapter sequence ends with the '$' character, the adapter is anchored to the end of the read and only found if it is a suffix of the read.

-g ADAPTER, --front=ADAPTER
    Sequence of an adapter that was ligated to the 5' end. If the adapter sequence starts with the character '^', the adapter is 'anchored'. An anchored adapter must appear in its entirety at the 5' end of the read (it is a prefix of the read). A non-anchored adapter may appear partially at the 5' end, or it may occur within the read. If it is found within a read, the sequence preceding the adapter is also trimmed. In all cases, the adapter itself is trimmed.

-b ADAPTER, --anywhere=ADAPTER
    Sequence of an adapter that was ligated to the 5' or 3' end. If the adapter is found within the read or overlapping the 3' end of the read, the behavior is the same as for the -a option. If the adapter overlaps the 5' end (beginning of the read), the initial portion of the read matching the adapter is trimmed, but anything that follows is kept.

-e ERROR_RATE, --error-rate=ERROR_RATE
    Maximum allowed error rate (number of errors divided by the length of the matching region) (default: 0.1)

--no-indels
    Do not allow indels in the alignments (allow only mismatches). Currently only supported for anchored adapters. (default: allow both mismatches and indels)

-n COUNT, --times=COUNT
    Try to remove adapters at most COUNT times. Useful when an adapter gets appended multiple times (default: 1).

-O LENGTH, --overlap=LENGTH
    Minimum overlap length. If the overlap between the read and the adapter is shorter than LENGTH, the read is not modified. This reduces the number of bases trimmed purely due to short random adapter matches (default: 3).

--match-read-wildcards
    Allow IUPAC wildcards in reads (default: False).

-N, --no-match-adapter-wildcards
    Do not interpret IUPAC wildcards in adapters.

Options for filtering of processed reads:

--discard-trimmed, --discard
    Discard reads that contain the adapter instead of trimming them. Also use -O in order to avoid throwing away too many randomly matching reads!

--discard-untrimmed, --trimmed-only
    Discard reads that do not contain the adapter.

-m LENGTH, --minimum-length=LENGTH
    Discard trimmed reads that are shorter than LENGTH. Reads that are too short even before adapter removal are also discarded. In colorspace, an initial primer is not counted (default: 0).

-M LENGTH, --maximum-length=LENGTH
    Discard trimmed reads that are longer than LENGTH. Reads that are too long even before adapter removal are also discarded. In colorspace, an initial primer is not counted (default: no limit).

--no-trim
    Match and redirect reads to output/untrimmed-output as usual, but do not remove adapters.

--max-n=LENGTH
    Maximum number or proportion of N's allowed in a read. A number < 1 is treated as a proportion, while a number > 1 is treated as the maximum number of N's the read may contain.

--mask-adapter
    Mask adapters with 'N' characters instead of trimming them.

Options that influence what gets output to where:

--quiet
    Do not print a report at the end.

-o FILE, --output=FILE
    Write modified reads to FILE. FASTQ or FASTA format is chosen depending on input. The summary report is sent to standard output. Use '{name}' in FILE to demultiplex reads into multiple files. (default: trimmed reads are written to standard output)

--info-file=FILE
    Write information about each read and its adapter matches into FILE. See the documentation for the file format.

-r FILE, --rest-file=FILE
    When the adapter matches in the middle of a read, write the rest (after the adapter) into FILE.

--wildcard-file=FILE
    When the adapter has wildcard bases ('N's), write adapter bases matching wildcard positions to FILE. When there are indels in the alignment, this will often not be accurate.

--too-short-output=FILE
    Write reads that are too short (according to length specified by -m) to FILE. (default: discard reads)

--too-long-output=FILE
    Write reads that are too long (according to length specified by -M) to FILE. (default: discard reads)

--untrimmed-output=FILE
    Write reads that do not contain the adapter to FILE. (default: output to same file as trimmed reads)

Additional modifications to the reads:

-u LENGTH, --cut=LENGTH
    Remove LENGTH bases from the beginning or end of each read. If LENGTH is positive, the bases are removed from the beginning of each read. If LENGTH is negative, the bases are removed from the end of each read. This option can be specified twice if the LENGTHs have different signs.

-q [5'CUTOFF,]3'CUTOFF, --quality-cutoff=[5'CUTOFF,]3'CUTOFF
    Trim low-quality bases from 5' and/or 3' ends of reads before adapter removal. If one value is given, only the 3' end is trimmed. If two comma-separated cutoffs are given, the 5' end is trimmed with the first cutoff, the 3' end with the second. The algorithm is the same as the one used by BWA (see documentation). (default: no trimming)

--quality-base=QUALITY_BASE
    Assume that quality values are encoded as ascii(quality + QUALITY_BASE). The default (33) is usually correct, except for reads produced by some versions of the Illumina pipeline, where this should be set to 64. (Default: 33)

--trim-n
    Trim N's on ends of reads.

-x PREFIX, --prefix=PREFIX
    Add this prefix to read names

-y SUFFIX, --suffix=SUFFIX
    Add this suffix to read names

--strip-suffix=STRIP_SUFFIX
    Remove this suffix from read names if present. Can be given multiple times.

-c, --colorspace
    Colorspace mode: Also trim the color that is adjacent to the found adapter.

-d, --double-encode
    When in colorspace, double-encode colors (map 0,1,2,3,4 to A,C,G,T,N).

-t, --trim-primer
    When in colorspace, trim primer base and the first color (which is the transition to the first nucleotide)

--strip-f3
    For colorspace: Strip the _F3 suffix of read names

--maq, --bwa
    MAQ- and BWA-compatible colorspace output. This enables -c, -d, -t, --strip-f3 and -y '/1'.

--length-tag=TAG
    Search for TAG followed by a decimal number in the description field of the read. Replace the decimal number with the correct length of the trimmed read. For example, use --length-tag 'length=' to correct fields like 'length=123'.

--no-zero-cap
    Do not change negative quality values to zero. Colorspace quality values of -1 would appear as spaces in the output FASTQ file. Since many tools have problems with that, negative qualities are converted to zero when trimming colorspace data. Use this option to keep negative qualities.

-z, --zero-cap
    Change negative quality values to zero. This is enabled by default when -c/--colorspace is also enabled. Use --no-zero-cap to disable it.

Paired-end options:

The -A/-G/-B/-U options work like their -a/-g/-b/-u counterparts.

-A ADAPTER
    3' adapter to be removed from the second read in a pair.

-G ADAPTER
    5' adapter to be removed from the second read in a pair.

-B ADAPTER
    5'/3' adapter to be removed from the second read in a pair.

-U LENGTH
    Remove LENGTH bases from the beginning or end of the second read in a pair (see --cut).

-p FILE, --paired-output=FILE
    Write second read in a pair to FILE.

--untrimmed-paired-output=FILE
    Write the second read in a pair to this FILE when no adapter was found in the first read. Use this option together with --untrimmed-output when trimming paired-end reads. (Default: output to same file as trimmed reads.)
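The -q and --quality-base options above can be illustrated with a short Python sketch. It assumes the BWA-style 3' trimming approach as it is commonly described: subtract the cutoff from every quality value, accumulate the differences from the 3' end, and cut where the accumulated deficit is largest. The function names `decode_qualities` and `trim_index_3prime` are hypothetical, and the code is an illustration under those assumptions rather than cutadapt's own source.

```python
def decode_qualities(quality_string, quality_base=33):
    """Decode a FASTQ quality string, assuming ascii(quality + quality_base)."""
    return [ord(ch) - quality_base for ch in quality_string]


def trim_index_3prime(qualities, cutoff):
    """Return the length to keep after BWA-style 3' quality trimming.

    Sketch only: walks from the 3' end, summing (cutoff - quality), and
    remembers the position where that running sum was largest.
    """
    running = 0            # accumulated (cutoff - quality) from the 3' end
    best = 0               # largest accumulated value seen so far
    cut = len(qualities)   # keep everything unless a better cut is found
    for i in range(len(qualities) - 1, -1, -1):
        running += cutoff - qualities[i]
        if running < 0:
            # Good bases now outweigh the bad ones; stop scanning.
            break
        if running > best:
            best = running
            cut = i
    return cut


# Phred+33 and Phred+64 encodings of the same quality value (40).
print(decode_qualities("II5"))                    # [40, 40, 20]
print(decode_qualities("hh", quality_base=64))    # [40, 40]

# With a cutoff of 10, only the first four bases are kept.
quals = [42, 40, 26, 27, 8, 7, 11, 4, 2, 3]
print(trim_index_3prime(quals, cutoff=10))        # 4
```

Note that the base with quality 11 is removed even though it is above the cutoff: it is surrounded by low-quality bases, and the cumulative criterion treats the whole tail as one low-quality region.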
SEE ALSO
The full documentation for cutadapt is maintained as a Texinfo manual. If the info and cutadapt programs are properly installed at your site, the command info cutadapt should give you access to the complete manual.