NAME

       cutadapt - remove adapter sequences from high-throughput sequencing reads

SYNOPSIS

              cutadapt -a ADAPTER [options] [-o output.fastq] input.fastq

       For paired-end reads:

              cutadapt -a ADAPT1 -A ADAPT2 [options] -o out1.fastq -p out2.fastq in1.fastq in2.fastq

DESCRIPTION

       Replace  "ADAPTER"  with the actual sequence of your 3' adapter. IUPAC wildcard characters are supported.
       The reverse complement is *not* automatically searched. All reads from input.fastq  will  be  written  to
       output.fastq  with  the  adapter  sequence  removed. Adapter matching is error-tolerant. Multiple adapter
       sequences can be given (use further -a options), but only the best-matching adapter will be removed.

       Input may also be in FASTA format. Compressed input and output are supported; the compression format is
       auto-detected from the file name (.gz, .xz, .bz2). Use the file name '-' for standard input/output.
       Without the -o option, output is sent to standard output.
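
       The file-name-based detection described above can be pictured with the following Python sketch. It is
       an illustration only, not cutadapt's own code; the helper name open_by_extension is hypothetical, and
       the '-' convention for standard input/output is not handled.

```python
import bz2
import gzip
import lzma

# Hypothetical lookup table: file-name suffix -> standard-library opener.
_OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def open_by_extension(path, mode="rt"):
    """Open path in text mode, (de)compressing if the name ends in .gz, .bz2 or .xz."""
    for suffix, opener in _OPENERS.items():
        if path.endswith(suffix):
            return opener(path, mode)
    # No known suffix: treat the file as uncompressed.
    return open(path, mode)
```

       The decision depends only on the name, never on the file contents, which matches the "auto-detected
       from the file name" wording above.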

OPTIONS

       Run 'cutadapt --help' to show all command-line options.

       --version
              show program's version number and exit

       -h, --help
              show this help message and exit

       --debug
              Print debugging information.

       -f FORMAT, --format=FORMAT
              Input  file  format;  can  be  either  'fasta',  'fastq'  or  'sra-fastq'.  Ignored  when  reading
              csfasta/qual files.  Default: auto-detect from file name extension.

              Finding adapters:

              Parameters  -a,  -g, -b specify adapters to be removed from each read (or from the first read in a
              pair if data is paired). If specified multiple times, only the best matching  adapter  is  trimmed
              (but see the --times option). When the special notation 'file:FILE' is used, adapter sequences are
              read from the given FASTA file.

       -a ADAPTER, --adapter=ADAPTER
              Sequence  of  an  adapter  ligated to the 3' end (paired data: of the first read). The adapter and
              subsequent bases are trimmed. If a '$' character is appended ('anchoring'), the  adapter  is  only
              found if it is a suffix of the read.

       -g ADAPTER, --front=ADAPTER
              Sequence of an adapter ligated to the 5' end (paired data: of the first read). The adapter and any
              preceding  bases  are  trimmed.  Partial  matches at the 5' end are allowed. If a '^' character is
              prepended ('anchoring'), the adapter is only found if it is a prefix of the read.

       -b ADAPTER, --anywhere=ADAPTER
              Sequence of an adapter that may be ligated to the 5' or 3' end (paired data: of the  first  read).
              Both types of matches as described under -a and -g are allowed.  If the first base of the read is
              part of the match, the behavior is as with -g, otherwise as with -a. This  option  is  mostly  for
              rescuing  failed  library preparations - do not use if you know which end your adapter was ligated
              to!

       -e ERROR_RATE, --error-rate=ERROR_RATE
              Maximum allowed error rate (no. of errors divided by the length of the matching region).  Default:
              0.1
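
              As a worked illustration of this definition, the Python sketch below checks whether a match
              stays within the allowed error rate. The function name is hypothetical and the code is not
              taken from cutadapt; it only restates "errors divided by the length of the matching region".

```python
def within_error_rate(errors, match_length, error_rate=0.1):
    """Return True if errors / match_length does not exceed error_rate."""
    if match_length <= 0:
        raise ValueError("match_length must be positive")
    return errors / match_length <= error_rate
```

              With the default of 0.1, a 10-base match may contain one error, while a 9-base match may
              contain none.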

       --no-indels
              Allow only mismatches in alignments. Default: allow both mismatches and indels

       -n COUNT, --times=COUNT
              Remove up to COUNT adapters from each read. Default: 1

       -O MINLENGTH, --overlap=MINLENGTH
              If  the  overlap  between  the  read  and  the  adapter is shorter than MINLENGTH, the read is not
              modified.  Reduces the no. of bases trimmed due to random adapter matches. Default: 3

       --match-read-wildcards
              Interpret IUPAC wildcards in reads. Default: False

       -N, --no-match-adapter-wildcards
              Do not interpret IUPAC wildcards in adapters.
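
              The effect of the two wildcard options can be sketched in Python as follows. The table lists
              the standard IUPAC nucleotide codes; the function name and its keyword arguments are
              hypothetical, only upper-case input is handled, and the comparison rule is a simplification
              rather than cutadapt's actual alignment code.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def bases_match(adapter_base, read_base, adapter_wildcards=True, read_wildcards=False):
    """Compare one adapter character with one read character.

    adapter_wildcards=False mimics -N/--no-match-adapter-wildcards;
    read_wildcards=True mimics --match-read-wildcards.
    """
    adapter_set = set(IUPAC[adapter_base]) if adapter_wildcards else {adapter_base}
    read_set = set(IUPAC[read_base]) if read_wildcards else {read_base}
    # The characters match if they can stand for at least one common base.
    return bool(adapter_set & read_set)
```

              For example, bases_match("R", "G") is true because R stands for A or G, whereas an N in the
              read only matches an A in the adapter when read wildcards are enabled.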

       --no-trim
              Match and redirect reads to output/untrimmed-output as usual, but do not remove adapters.

       --mask-adapter
              Mask adapters with 'N' characters instead of trimming them.

              Additional read modifications:

       -u LENGTH, --cut=LENGTH
              Remove bases from each read (first read only if paired). If LENGTH is positive, remove bases  from
              the beginning. If LENGTH is negative, remove bases from the end. Can be used twice if LENGTHs have
              different signs.
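
              The sign convention can be illustrated with a short Python sketch. The function name cut is
              hypothetical and the code only models the behavior described above, not cutadapt's
              implementation.

```python
def cut(sequence, length):
    """Remove bases from the start (length > 0) or from the end (length < 0)."""
    if length > 0:
        return sequence[length:]
    if length < 0:
        return sequence[:length]
    return sequence
```

              Applying cut(read, 3) and then cut(result, -2) corresponds to giving -u 3 -u -2 on the
              command line.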

       -q [5'CUTOFF,]3'CUTOFF, --quality-cutoff=[5'CUTOFF,]3'CUTOFF
              Trim low-quality bases from 5' and/or 3' ends of each read before adapter removal. Applied to both
              reads if data is paired. If one value is given, only the 3' end is trimmed. If two comma-separated
              cutoffs are given, the 5' end is trimmed with the first cutoff, the 3' end with the second.
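
              This page does not state how the trimming position is chosen. The Python sketch below assumes
              a BWA-style running-sum method: the cutoff is compared with each quality value, partial sums
              are accumulated from the end inwards, and the read is cut where the sum is largest. Treat the
              function name and the details as assumptions made for illustration.

```python
def quality_trim_index(qualities, cutoff_front, cutoff_back):
    """Return (start, stop) such that qualities[start:stop] is the part to keep."""
    # 5' end: accumulate (cutoff - quality) from the left, cut after the maximum.
    start = 0
    running = 0
    best = 0
    for i, q in enumerate(qualities):
        running += cutoff_front - q
        if running < 0:
            break
        if running > best:
            best = running
            start = i + 1

    # 3' end: the same idea, scanning from the right.
    stop = len(qualities)
    running = 0
    best = 0
    for i in reversed(range(len(qualities))):
        running += cutoff_back - qualities[i]
        if running < 0:
            break
        if running > best:
            best = running
            stop = i

    # If the two trimmed regions meet, nothing is left of the read.
    if start >= stop:
        return 0, 0
    return start, stop
```

              Passing 0 as cutoff_front models the single-value form of -q, in which only the 3' end is
              trimmed.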

       --nextseq-trim=3'CUTOFF
              NextSeq-specific quality trimming (each read). Also trims dark cycles appearing as high-quality
              G bases (EXPERIMENTAL).

       --quality-base=QUALITY_BASE
              Assume that quality values in FASTQ are encoded as ascii(quality + QUALITY_BASE). This needs to be
              set to 64 for some old Illumina FASTQ files. Default: 33
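
              The encoding ascii(quality + QUALITY_BASE) can be reversed as in this Python sketch. The
              function name is hypothetical; the arithmetic follows directly from the formula above.

```python
def decode_qualities(quality_string, quality_base=33):
    """Convert a FASTQ quality string into a list of integer quality values."""
    return [ord(ch) - quality_base for ch in quality_string]
```

              The character 'I' decodes to 40 with the default base of 33, and 'h' decodes to 40 with a
              base of 64.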

       --trim-n
              Trim N's on ends of reads.

       -x PREFIX, --prefix=PREFIX
              Add this prefix to read names. Use {name} to insert the name of the matching adapter.

       -y SUFFIX, --suffix=SUFFIX
              Add this suffix to read names; can also include {name}

       --strip-suffix=STRIP_SUFFIX
              Remove this suffix from read names if present. Can be given multiple times.

       --length-tag=TAG
              Search for TAG followed by a decimal number in the description field  of  the  read.  Replace  the
              decimal  number  with  the  correct  length  of  the  trimmed read.  For example, use --length-tag
              'length=' to correct fields like 'length=123'.
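
              The replacement can be pictured with the following Python sketch. It is a minimal
              illustration using a regular expression; the function name is hypothetical and cutadapt's own
              matching rules may differ in detail.

```python
import re


def fix_length_tag(description, tag, new_length):
    """Replace the decimal number that follows tag with new_length."""
    pattern = re.escape(tag) + r"\d+"
    # A function is used as replacement so that tag is inserted literally.
    return re.sub(pattern, lambda match: tag + str(new_length), description)
```

              A description without the tag is returned unchanged.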

              Filtering of processed reads:

       --discard-trimmed, --discard
              Discard reads that contain an adapter. Also use -O to avoid discarding too many randomly  matching
              reads!

       --discard-untrimmed, --trimmed-only
              Discard reads that do not contain the adapter.

       -m LENGTH, --minimum-length=LENGTH
              Discard  trimmed reads that are shorter than LENGTH.  Reads that are too short even before adapter
              removal are also discarded. In colorspace, an initial primer is not counted. Default: 0

       -M LENGTH, --maximum-length=LENGTH
              Discard trimmed reads that are longer than LENGTH.  Reads that are too long  even  before  adapter
              removal are also discarded. In colorspace, an initial primer is not counted. Default: no limit

       --max-n=COUNT
              Discard  reads with too many N bases. If COUNT is an integer, it is treated as the absolute number
              of N bases. If it is between 0 and 1, it is treated as the proportion of N's allowed in a read.
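
              The two interpretations of COUNT can be sketched in Python as follows. The function name is
              hypothetical, and the sketch assumes that only values strictly between 0 and 1 are treated as
              a proportion.

```python
def too_many_n(sequence, max_n):
    """Return True if the read should be discarded under --max-n=max_n."""
    if not sequence:
        return False
    n_count = sequence.upper().count("N")
    if 0 < max_n < 1:
        # Fractional value: compare the proportion of N bases.
        return n_count / len(sequence) > max_n
    # Integer value: compare the absolute number of N bases.
    return n_count > max_n
```

              For the read ACGTNN, a COUNT of 1 discards the read (two N bases), while a COUNT of 0.5 keeps
              it (one third of the bases are N).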

              Output:

       --quiet
              Print only error messages.

       -o FILE, --output=FILE
              Write trimmed reads to FILE. FASTQ or FASTA format is  chosen  depending  on  input.  The  summary
              report  is sent to standard output. Use '{name}' in FILE to demultiplex reads into multiple files.
              Default: write to standard output
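
              The '{name}' placeholder can be illustrated with a small Python sketch that groups reads by
              the output path derived from the matching adapter's name. The function name and the input
              shape (pairs of read and adapter name) are assumptions made for this example.

```python
from collections import defaultdict


def demultiplex(matched_reads, template):
    """Group reads by output path; matched_reads yields (read, adapter_name) pairs."""
    files = defaultdict(list)
    for read, adapter_name in matched_reads:
        path = template.replace("{name}", adapter_name)
        files[path].append(read)
    return dict(files)
```

              With the template out-{name}.fastq, reads matching an adapter named bc1 end up under
              out-bc1.fastq.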

       --info-file=FILE
              Write information about each read and its adapter matches into FILE. See the documentation for the
              file format.

       -r FILE, --rest-file=FILE
              When the adapter matches in the middle of a read, write the rest (after the adapter) into FILE.

       --wildcard-file=FILE
              When the adapter has N bases (wildcards), write adapter bases matching wildcard positions to FILE.
              When there are indels in the alignment, this will often not be accurate.

       --too-short-output=FILE
              Write reads that are too short (according to length specified by -m)  to  FILE.  Default:  discard
              reads

       --too-long-output=FILE
              Write  reads  that  are  too  long (according to length specified by -M) to FILE. Default: discard
              reads

       --untrimmed-output=FILE
              Write reads that do not contain the adapter to FILE.  Default: output  to  same  file  as  trimmed
              reads

              Colorspace options:

       -c, --colorspace
              Enable colorspace mode: Also trim the color that is adjacent to the found adapter.

       -d, --double-encode
              Double-encode colors (map 0,1,2,3,4 to A,C,G,T,N).
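
              The mapping can be written as a translation table, as in this Python sketch. The function
              name is hypothetical; characters other than 0-4 are left unchanged here.

```python
# Translation table for the mapping 0,1,2,3,4 -> A,C,G,T,N.
DOUBLE_ENCODE = str.maketrans("01234", "ACGTN")


def double_encode(colors):
    """Return the color string with each digit replaced by its letter."""
    return colors.translate(DOUBLE_ENCODE)
```

              The result only looks like a nucleotide sequence; each letter still represents a color.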

       -t, --trim-primer
              Trim primer base and the first color (which is the transition to the first nucleotide)

       --strip-f3
              Strip the _F3 suffix of read names

       --maq, --bwa
              MAQ- and BWA-compatible colorspace output. This enables -c, -d, -t, --strip-f3 and -y '/1'.

       --no-zero-cap
              Do not change negative quality values to zero in colorspace data. By default, they are changed to
              zero, since many tools have problems with negative qualities.

       -z, --zero-cap
              Change negative quality values to zero. This is enabled by default when -c/--colorspace is also
              enabled. Use --no-zero-cap to disable it.

              Paired-end options:

              The -A/-G/-B/-U options work like their -a/-g/-b/-u counterparts, but are applied to the second
              read in each pair.

       -A ADAPTER
              3' adapter to be removed from second read in a pair.

       -G ADAPTER
              5' adapter to be removed from second read in a pair.

       -B ADAPTER
              5'/3' adapter to be removed from second read in a pair.

       -U LENGTH
              Remove LENGTH bases from second read in a pair (see --cut).

       -p FILE, --paired-output=FILE
              Write second read in a pair to FILE.

       --pair-filter=(any|both)
              Which of the reads in a pair must match the filtering criterion for the pair to be filtered out:
              'any' (at least one read) or 'both'. Default: any
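
              The difference between the two settings can be shown with a short Python sketch. The function
              and argument names are hypothetical; each flag states whether the respective read meets the
              filtering criterion (for example, is too short).

```python
def pair_is_filtered(read1_matches, read2_matches, pair_filter="any"):
    """Decide whether a read pair is filtered out."""
    if pair_filter == "any":
        return read1_matches or read2_matches
    if pair_filter == "both":
        return read1_matches and read2_matches
    raise ValueError("pair_filter must be 'any' or 'both'")
```

              With 'any', one failing read is enough to drop the pair; with 'both', the pair is kept unless
              both reads fail.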

       --interleaved
              Read and write interleaved paired-end reads.

       --untrimmed-paired-output=FILE
              Write second read in a pair to this FILE when no adapter was found in the first read. Use this
              option together with --untrimmed-output when trimming paired-end reads. Default: output to same
              file as trimmed reads

       --too-short-paired-output=FILE
              Write   second   read  in  a  pair  to  this  file  if  pair  is  too  short.  Use  together  with
              --too-short-output.

       --too-long-paired-output=FILE
              Write second read in a pair to this file if pair is too long. Use together with --too-long-output.

AUTHOR

       This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage
       of the program.

cutadapt 1.10                                       June 2016                                        CUTADAPT(1)
