Provided by: ea-utils_1.1.2+dfsg-3_amd64 

NAME
fastq-mcf - ea-utils: detect levels of adapter presence, compute likelihoods and locations of the
adapters
SYNOPSIS
fastq-mcf [options] <adapters.fa> <reads.fq> [mates1.fq ...]
DESCRIPTION
Version: 1.04.676
Detects levels of adapter presence, computes likelihoods and locations (start, end) of the adapters.
Removes the adapter sequences from the fastq file(s).
Stats go to stderr, unless -o is specified.
Specify -0 to turn off all default settings.
If you specify multiple 'paired-end' inputs, then a -o option is required for each, e.g.: -o read1.clip.fq
-o read2.clip.fq
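The paired-end rule above can be sketched as a small Python wrapper that assembles the argument list. This is only an illustration: build_mcf_command is a hypothetical helper, not part of ea-utils, and the file names are placeholders. Only the argument order and the -o flag are taken from the synopsis.

```python
def build_mcf_command(adapters, reads, outputs):
    """Build a fastq-mcf argv list (hypothetical helper, not part of ea-utils)."""
    # Multiple paired-end inputs need one -o per input file.
    if len(reads) > 1 and len(outputs) != len(reads):
        raise ValueError("each input file needs its own -o output file")
    cmd = ["fastq-mcf"]
    for out in outputs:
        cmd += ["-o", out]
    # Synopsis order: adapters file first, then reads and mates.
    cmd += [adapters] + list(reads)
    return cmd


cmd = build_mcf_command(
    "adapters.fa",
    ["read1.fq", "read2.fq"],
    ["read1.clip.fq", "read2.clip.fq"],
)
print(" ".join(cmd))
```

The resulting list can be passed to a process launcher as-is. Building it as a list avoids shell quoting problems with unusual file names.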
OPTIONS
-h This help
-o FIL Output file (stats to stdout)
-s N.N Log scale for adapter minimum-length-match (2.2)
-t N % occurrence threshold before adapter clipping (0.25)
-m N Minimum clip length, overrides scaled auto (1)
-p N Maximum adapter difference percentage (10)
-l N Minimum remaining sequence length (19)
-L N Maximum remaining sequence length (none)
-D N Remove duplicate reads: Read_1 has N identical bases (0)
-k N sKew percentage-less-than causing cycle removal (2)
-x N 'N' (Bad read) percentage causing cycle removal (20)
-q N quality threshold causing base removal (10)
-w N window-size for quality trimming (1)
-H remove >95% homopolymer reads (no)
-X remove low complexity reads (no)
-0 Set all default parameters to zero/do nothing
-U|u Force disable/enable Illumina PF filtering (auto)
-P N Phred-scale (auto)
-R Don't remove N's from the fronts/ends of reads
-n Don't clip, just output what would be done
-C N Number of reads to use for subsampling (300k)
-S Save all discarded reads to '.skip' files
-d Output lots of random debugging stuff
Quality adjustment options:
--cycle-adjust CYC,AMT Adjust cycle CYC (negative = offset from end) by amount AMT
--phred-adjust SCORE,AMT Adjust score SCORE by amount AMT
--phred-adjust-max SCORE Adjust scores > SCORE to SCORE
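As a rough illustration of what the three quality adjustment options describe, the following Python sketch applies them to a list of integer quality scores. It is not fastq-mcf's implementation. It assumes that cycles are 1-based, that CYC = -1 means the last cycle, and that adjustments are applied in the order cycle, score, cap. None of these assumptions is confirmed by this page, and adjust_quals is a hypothetical name.

```python
def adjust_quals(quals, cycle_adjust=None, phred_adjust=None, phred_adjust_max=None):
    """Apply cycle, score and cap adjustments to integer quality scores.

    Assumptions: 1-based cycles, negative CYC counts from the end,
    adjustments applied in the order cycle, score, cap.
    """
    out = list(quals)
    if cycle_adjust is not None:
        cyc, amt = cycle_adjust
        # Positive CYC is 1-based; negative CYC is an offset from the end.
        idx = cyc - 1 if cyc > 0 else len(out) + cyc
        if 0 <= idx < len(out):
            out[idx] += amt
    if phred_adjust is not None:
        score, amt = phred_adjust
        # Shift every occurrence of one specific score by AMT.
        out = [q + amt if q == score else q for q in out]
    if phred_adjust_max is not None:
        # Scores above the cap are set to the cap.
        out = [min(q, phred_adjust_max) for q in out]
    return out


adjusted = adjust_quals(
    [40, 38, 2, 41, 30],
    cycle_adjust=(-1, -5),
    phred_adjust=(2, -2),
    phred_adjust_max=40,
)
print(adjusted)  # [40, 38, 0, 40, 25]
```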
Filtering options*:
--[mate-]qual-mean NUM Minimum mean quality score
--[mate-]qual-gt NUM,THR At least NUM quals > THR
--[mate-]max-ns NUM Maximum N-calls in a read (can be a %)
--[mate-]min-len NUM Minimum remaining length (same as -l)
--homopolymer-pct PCT Homopolymer filter percent (95)
--lowcomplex-pct PCT Complexity filter percent (95)
If the mate- prefix is used, the filter applies to the second non-barcode read only.
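The per-read filters listed above can be pictured with a short Python sketch. It is a reading of the option descriptions, not the tool's code. passes_filters is a hypothetical name, quality scores are assumed to be decoded integers already, and max_ns is handled as a plain count only (the percentage form is left out).

```python
def passes_filters(seq, quals, qual_mean=None, qual_gt=None, max_ns=None, min_len=None):
    """Return True if an already clipped read passes every filter that is set."""
    if min_len is not None and len(seq) < min_len:
        return False
    if qual_mean is not None:
        # Minimum mean quality score; an empty read cannot pass.
        if not quals or sum(quals) / len(quals) < qual_mean:
            return False
    if qual_gt is not None:
        num, thr = qual_gt
        # At least NUM quality values must be strictly greater than THR.
        if sum(1 for q in quals if q > thr) < num:
            return False
    if max_ns is not None and seq.upper().count("N") > max_ns:
        return False
    return True


read_seq = "ACGTN"
read_quals = [30, 30, 30, 30, 2]
print(passes_filters(read_seq, read_quals, qual_mean=20, qual_gt=(4, 25), max_ns=1, min_len=5))
print(passes_filters(read_seq, read_quals, max_ns=0))
```

The first call accepts the read and the second rejects it, because the read contains one N call.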
Adapter files are 'fasta' formatted.
Specify n/a to turn off adapter clipping, and just use filters.
Increasing the scale makes recognition-lengths longer; a scale of 100 will force full-length recognition
of adapters.
Adapter sequences with _5p in their label will match 'end's, and sequences with _3p in their label will
match 'start's, otherwise the 'end' is auto-determined.
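The label rule above can be checked against an adapter file before running the tool. The Python sketch below reads FASTA text and reports which side each adapter would be matched on, following the wording of this page. The labels and sequences in EXAMPLE are illustrative placeholders, and adapter_ends is a hypothetical helper.

```python
def adapter_ends(fasta_text):
    """Map each adapter label to 'end', 'start' or 'auto' using the label rule."""
    ends = {}
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # sequence lines are not needed for the label rule
        label = line[1:].strip()
        if "_5p" in label:
            ends[label] = "end"
        elif "_3p" in label:
            ends[label] = "start"
        else:
            ends[label] = "auto"
    return ends


# Placeholder adapter file content (labels and sequences are examples only).
EXAMPLE = """>adapterA_3p
AGATCGGAAGAGC
>adapterB_5p
CTGTCTCTTATACACATCT
>adapterC
TGGAATTCTCGGGTGCCAAGG
"""

ends = adapter_ends(EXAMPLE)
for label, side in ends.items():
    print(label, side)
```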
Skew is when one cycle is poor, 'skewed' toward a particular base. If any nucleotide is less than the
skew percentage, then the whole cycle is removed. Disable for methyl-seq, etc.
Set the skew (-k) or N-pct (-x) to 0 to turn it off (should be done for miRNA, amplicon and other
low-complexity situations!)
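The skew test described above can be sketched in Python as follows. This only illustrates the stated rule (a cycle is flagged when any nucleotide falls below the skew percentage). It assumes N calls are excluded from the denominator, and it does not model how fastq-mcf subsamples reads or decides which flagged cycles to remove. skewed_cycles is a hypothetical name.

```python
from collections import Counter


def skewed_cycles(reads, skew_pct=2.0):
    """Return 0-based cycles where any of A/C/G/T is below skew_pct percent."""
    flagged = []
    ncycles = max((len(r) for r in reads), default=0)
    for i in range(ncycles):
        counts = Counter(r[i] for r in reads if i < len(r))
        # Assumption: only A/C/G/T calls count toward the total.
        total = sum(counts[b] for b in "ACGT")
        if total == 0:
            continue
        if any(100.0 * counts[b] / total < skew_pct for b in "ACGT"):
            flagged.append(i)
    return flagged


reads = ["ACGT", "CAGA", "GTGC", "TGGG"]
print(skewed_cycles(reads, skew_pct=2.0))  # [2]
print(skewed_cycles(reads, skew_pct=0))    # []
```

In the example, the third cycle is G in every read, so A, C and T are all at 0% there. With a skew percentage of 0 nothing can fall below the threshold, which matches the advice to set -k 0 to disable the check.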
Duplicate read filtering is appropriate for assembly tasks, but never when read length < expected
coverage. -D 50 will use 4.5 GB of RAM on 100 million DNA reads, so be careful. Great for RNA assembly.
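A minimal Python sketch of the duplicate filter idea follows. It assumes that "identical N bases" refers to the first N bases of Read_1, which this page does not state explicitly, and it is not the tool's implementation. drop_duplicates is a hypothetical name.

```python
def drop_duplicates(reads, n):
    """Keep the first read for each distinct N-base prefix; n <= 0 disables the filter."""
    if n <= 0:
        return list(reads)
    seen = set()
    kept = []
    for seq in reads:
        key = seq[:n]  # assumption: the key is the first N bases of the read
        if key in seen:
            continue
        seen.add(key)
        kept.append(seq)
    return kept


sample = ["ACGTAC", "ACGTTT", "TTGTAC"]
print(drop_duplicates(sample, 4))  # ['ACGTAC', 'TTGTAC']
print(drop_duplicates(sample, 0))  # ['ACGTAC', 'ACGTTT', 'TTGTAC']
```

The set of seen keys grows with the number of distinct reads, which is why memory use scales with both N and the size of the input.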
*Quality filters are evaluated after clipping/trimming
Homopolymer filtering is a subset of low-complexity, but will not be separately tracked unless both are
turned on.
fastq-mcf 1.1.2 July 2015 FASTQ-MCF(1)