Ubuntu Manpage: pilon - automated genome assembly improvement and variant detection tool

NAME

       pilon - automated genome assembly improvement and variant detection tool

SYNOPSIS

       pilon  --genome  genome.fasta [--frags frags.bam] [--jumps jumps.bam] [--unpaired unpaired.bam] [...other
       options...]

DESCRIPTION

       Pilon is a software tool which can be used to:

       • Automatically improve draft assemblies

       • Find variation among strains, including large event detection

       Pilon requires as input a FASTA file of the genome along with one or more BAM files of reads  aligned  to
       the  input  FASTA  file. Pilon uses read alignment analysis to identify inconsistencies between the input
       genome and the evidence in the reads. It  then  attempts  to  make  improvements  to  the  input  genome,
       including:

       • Single base differences

       • Small indels

       • Larger indel or block substitution events

       • Gap filling

       • Identification of local misassemblies, including optional opening of new gaps

OPTIONS

--help

for option details

INPUTS
--genome genome.fasta

The input genome we are trying to improve, which must be the reference used for the bam
alignments. At least one of --frags or --jumps must also be given.

--frags frags.bam

A bam file consisting of fragment paired-end alignments, aligned to the --genome argument using
bwa or bowtie2. This argument may be specified more than once.

--jumps jumps.bam

A bam file consisting of jump (mate pair) paired-end alignments, aligned to the --genome argument
using bwa or bowtie2. This argument may be specified more than once.

--unpaired unpaired.bam

A bam file consisting of unpaired alignments, aligned to the --genome argument using bwa or
bowtie2. This argument may be specified more than once.

--bam any.bam

A bam file of unknown type; Pilon will scan it and attempt to classify it as one of the above bam
types.

OUTPUTS
--output prefix

Prefix for output files

--outdir directory

Use this directory for all output files.

--changes

If specified, a file listing changes in the <output>.fasta will be generated.

--vcf

If specified, a vcf file will be generated

--vcfqe

If specified, the VCF will contain a QE (quality-weighted evidence) field rather than the default
QP (quality-weighted percentage of evidence) field.

--tracks

This options will cause many track files (*.bed, *.wig) suitable for viewing in a genome browser
to be written.

CONTROL
--variant

Sets up heuristics for variant calling, as opposed to assembly improvement; equivalent to "--vcf
--fix all,breaks".

--chunksize

Input FASTA elements larger than this will be processed in smaller pieces not to exceed this size
(default 10000000).

--diploid

Sample is from diploid organism; will eventually affect calling of heterozygous SNPs

--fix fixlist

A comma-separated list of categories of issues to try to fix:

"snps": try to fix individual base errors; "indels": try to fix small indels; "gaps": try to fill
gaps; "local": try to detect and fix local misassemblies; "all": all of the above (default);
"bases": shorthand for "snps" and "indels" (for back compatibility); "none": none of the above;
new fasta file will not be written.

The following are experimental fix types:

"amb": fix ambiguous bases in fasta output (to most likely alternative); "breaks": allow local
reassembly to open new gaps (with "local"); "circles": try to close circlar elements when used
with long corrected reads; "novel": assemble novel sequence from unaligned non-jump reads.

--dumpreads

Dump reads for local re-assemblies.

--duplicates

Use reads marked as duplicates in the input BAMs (ignored by default).

--iupac

Output IUPAC ambiguous base codes in the output FASTA file when appropriate.

--nonpf

Use reads which failed sequencer quality filtering (ignored by default).

--targets targetlist

Only process the specified target(s). Targets are comma-separated, and each target

is a fasta element name optionally followed by a base range. Example:
"scaffold00001,scaffold00002:10000-20000" would result in processing all of scaffold00001 and
coordinates 10000-20000 of scaffold00002. If "targetlist" is the name of a file, each line will
be treated as a target specification.

--threads

Degree of parallelism to use for certain processing (default 1). Experimental.

--verbose

More verbose output.

--debug

Debugging output (implies verbose).

--version

Print version string and exit.

HEURISTICS:
--defaultqual qual

Assumes bases are of this quality if quals are no present in input BAMs (default 15).

--flank nbases

Controls how much of the well-aligned reads will be used; this many bases at each end of the good
reads will be ignored (default 10).

--gapmargin

Closed gaps must be within this number of bases of true size to be closed (100000)

--K

Kmer size used by internal assembler (default 47).

--mindepth depth

Variants (snps and indels) will only be called if there is coverage of good pairs at this depth or
more; if this value is >= 1, it is an absolute depth, if it is a fraction < 1, then minimum depth
is computed by multiplying this value by the mean coverage for the region, with a minimum value of
5 (default 0.1: min depth to call is 10% of mean coverage or 5, whichever is greater).

--mingap

Minimum size for unclosed gaps (default 10)

--minmq

Minimum alignment mapping quality for a read to count in pileups (default 0)

--minqual

Minimum base quality to consider for pileups (default 0)

--nostrays

Skip making a pass through the input BAM files to identify stray pairs, that is, those pairs in
which both reads are aligned but not marked valid because they have inconsistent orientation or
separation. Identifying stray pairs can help fill gaps and assemble larger insertions, especially
of repeat content. However, doing so sometimes consumes considerable memory.

AUTHOR

       This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage
       of the program.

pilon 1.22                                          June 2018                                           PILON(1)