       pilon - automated genome assembly improvement and variant detection tool


       pilon   --genome   genome.fasta   [--frags   frags.bam]  [--jumps  jumps.bam]  [--unpaired
       unpaired.bam] [...other options...]


       Pilon is a software tool which can be used to:

       · Automatically improve draft assemblies

       · Find variation among strains, including large event detection

       Pilon requires as input a FASTA file of the genome along with one or  more  BAM  files  of
       reads  aligned  to  the  input  FASTA file. Pilon uses read alignment analysis to identify
       inconsistencies between the input genome and the evidence in the reads. It  then  attempts
       to make improvements to the input genome, including:

       · Single base differences

       · Small indels

       · Larger indel or block substitution events

       · Gap filling

       · Identification of local misassemblies, including optional opening of new gaps



              --genome genome.fasta

              The input genome we are trying to improve, which must be the reference used for the
              bam alignments.  At least one of --frags or --jumps must also be given.

              --frags frags.bam

              A bam file consisting of fragment paired-end alignments, aligned  to  the  --genome
              argument using bwa or bowtie2.  This argument may be specified more than once.

              --jumps jumps.bam

              A  bam  file  consisting  of jump (mate pair) paired-end alignments, aligned to the
              --genome argument using bwa or bowtie2.  This argument may be specified  more  than
              once.

              --unpaired unpaired.bam

              A  bam  file  consisting  of  unpaired alignments, aligned to the --genome argument
              using bwa or bowtie2.  This argument may be specified more than once.

              --bam any.bam

              A bam file of unknown type; Pilon will scan it and attempt to classify it as one of
              the above bam types.

              --output prefix

              Prefix for output files

              --outdir directory

              Use this directory for all output files.
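
              The input and output options above can be combined programmatically.  The following
              is a minimal sketch, not part of Pilon: build_pilon_command is a hypothetical helper
              that only assembles an argument list from the options documented above and does not
              run anything.  It assumes the executable is on the PATH as "pilon", as shown in the
              synopsis, and mirrors the stated rule that at least one of --frags or --jumps must
              be given.

```python
import shlex


def build_pilon_command(genome, frags=(), jumps=(), unpaired=(),
                        output=None, outdir=None):
    """Assemble an argv list for Pilon (hypothetical helper; runs nothing)."""
    if not frags and not jumps:
        raise ValueError("at least one of --frags or --jumps must be given")
    argv = ["pilon", "--genome", genome]
    # Each BAM option may be specified more than once, so repeat the flag.
    for flag, bams in (("--frags", frags), ("--jumps", jumps),
                       ("--unpaired", unpaired)):
        for bam in bams:
            argv += [flag, bam]
    if output is not None:
        argv += ["--output", output]
    if outdir is not None:
        argv += ["--outdir", outdir]
    return argv


cmd = build_pilon_command("genome.fasta", frags=["lib1.bam", "lib2.bam"],
                          output="polished", outdir="pilon_out")
print(shlex.join(cmd))  # shell-quoted form, ready to paste into a terminal
```

              The resulting list can be passed to subprocess.run once the file names are replaced
              with real paths.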


              --changes

              If specified, a file listing changes in the <output>.fasta will be generated.


              --vcf

              If specified, a vcf file will be generated.


              --vcfqe

              If  specified,  the  VCF will contain a QE (quality-weighted evidence) field rather
              than the default QP (quality-weighted percentage of evidence) field.


              --tracks

              This option will cause many track files (*.bed, *.wig) suitable for viewing  in  a
              genome browser to be written.


              --variant

              Sets  up  heuristics  for  variant  calling,  as  opposed  to assembly improvement;
              equivalent to "--vcf --fix all,breaks".


              --chunksize

              Input FASTA elements larger than this will be processed in smaller  pieces  not  to
              exceed this size (default 10000000).
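
              As an illustration of the size limit described above, the sketch below splits one
              FASTA element into consecutive pieces that never exceed the limit.  This is an
              assumption about the general idea only: split_into_chunks is a hypothetical helper,
              and the exact boundaries Pilon chooses internally may differ.

```python
def split_into_chunks(length, chunk_size=10_000_000):
    """Return 0-based half-open (start, end) pieces covering [0, length).

    Hypothetical helper: a simple left-to-right split, which is only one
    possible way to honor a maximum piece size.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    return [(start, min(start + chunk_size, length))
            for start in range(0, length, chunk_size)]


# A 25 Mb element with the default limit yields three pieces.
for start, end in split_into_chunks(25_000_000):
    print(start, end)
```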


              --diploid

              Sample  is  from  diploid  organism; will eventually affect calling of heterozygous

              --fix fixlist

              A comma-separated list of categories of issues to try to fix:

              "snps": try to fix individual base errors;
              "indels": try to fix small indels;
              "gaps": try to fill gaps;
              "local": try to detect and fix local misassemblies;
              "all": all of the above (default);
              "bases": shorthand for "snps" and "indels" (for backward compatibility);
              "none": none of the above; new fasta file will not be written.

              The following are experimental fix types:

              "amb": fix ambiguous bases in fasta output (to most likely alternative);
              "breaks": allow local reassembly to open new gaps (with "local");
              "circles": try to close circular elements when used with long corrected reads;
              "novel": assemble novel sequence from unaligned non-jump reads.
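
              The shorthand categories in a fixlist can be expanded as sketched below.  This is an
              illustration of the descriptions above, not Pilon's own parser: expand_fixlist is a
              hypothetical helper, and treating "none" as contributing no categories is an
              assumption.

```python
BASIC = ("snps", "indels", "gaps", "local")
EXPERIMENTAL = ("amb", "breaks", "circles", "novel")


def expand_fixlist(fixlist):
    """Expand a comma-separated fixlist into a sorted list of categories."""
    selected = set()
    for item in fixlist.split(","):
        item = item.strip()
        if item == "all":
            selected.update(BASIC)               # "all" covers the basic types
        elif item == "bases":
            selected.update(("snps", "indels"))  # backward-compatible shorthand
        elif item == "none":
            continue                             # assumption: adds nothing
        elif item in BASIC or item in EXPERIMENTAL:
            selected.add(item)
        else:
            raise ValueError(f"unknown fix category: {item!r}")
    return sorted(selected)


print(expand_fixlist("all,breaks"))
print(expand_fixlist("bases"))
```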


              --dumpreads

              Dump reads for local re-assemblies.


              --duplicates

              Use reads marked as duplicates in the input BAMs (ignored by default).


              --iupac

              Output IUPAC ambiguous base codes in the output FASTA file when appropriate.


              --nonpf

              Use reads which failed sequencer quality filtering (ignored by default).

              --targets targetlist

              Only process the specified target(s).  Targets are comma-separated, and each target
              is  a  fasta  element  name  optionally  followed  by  a  base   range.    Example:
              "scaffold00001,scaffold00002:10000-20000"   would   result  in  processing  all  of
              scaffold00001 and coordinates 10000-20000 of scaffold00002.  If "targetlist" is the
              name of a file, each line will be treated as a target specification.
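
              The target syntax described above can be parsed as in the following sketch.
              parse_targets is a hypothetical helper written for illustration; it handles only the
              comma-separated string form, not the case where "targetlist" names a file.

```python
import re

# "name" or "name:start-end"; the range must be two integers joined by "-".
_RANGE = re.compile(r"^(?P<name>.+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_targets(targetlist):
    """Return a list of (element_name, start, end); start/end are None
    when the whole element is requested."""
    targets = []
    for spec in targetlist.split(","):
        spec = spec.strip()
        if not spec:
            continue  # tolerate stray commas
        match = _RANGE.match(spec)
        if match:
            targets.append((match["name"],
                            int(match["start"]), int(match["end"])))
        else:
            targets.append((spec, None, None))
    return targets


print(parse_targets("scaffold00001,scaffold00002:10000-20000"))
```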


              --threads

              Degree of parallelism to use for certain processing (default 1). Experimental.


              --verbose

              More verbose output.


              --debug

              Debugging output (implies verbose).


              --version

              Print version string and exit.

              --defaultqual qual

              Assumes  bases  are  of this quality if quals are not present in input BAMs (default

              --flank nbases

              Controls how much of the well-aligned reads will be used; this many bases  at  each
              end of the good reads will be ignored (default 10).


              --gapmargin

              Gaps will be closed only if the closure is within this number of bases of the  true
              gap size (default 100000).


              --K

              Kmer size used by internal assembler (default 47).

              --mindepth depth

              Variants  (snps  and indels) will only be called if there is coverage of good pairs
              at this depth or more; if this value is >= 1, it is an absolute depth, if it  is  a
              fraction  < 1, then minimum depth is computed by multiplying this value by the mean
              coverage for the region, with a minimum value of 5 (default 0.1: min depth to  call
              is 10% of mean coverage or 5, whichever is greater).
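
              The rule above can be written out as a small calculation.  The sketch below is an
              illustration only: effective_min_depth is a hypothetical helper, and whether Pilon
              rounds the fractional result is not stated here, so the value is left unrounded.

```python
def effective_min_depth(mindepth, mean_coverage):
    """Depth threshold implied by a --mindepth value for one region."""
    if mindepth >= 1:
        # Values of 1 or more are taken as an absolute depth.
        return float(mindepth)
    # Fractions are scaled by the mean coverage, with a floor of 5.
    return max(5.0, mindepth * mean_coverage)


# With the default of 0.1: a 200x region needs about 20x of good pairs,
# while a 30x region falls back to the floor of 5.
print(effective_min_depth(0.1, 200.0))
print(effective_min_depth(0.1, 30.0))
print(effective_min_depth(8, 200.0))
```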


              --mingap

              Minimum size for unclosed gaps (default 10).


              --minmq

              Minimum alignment mapping quality for a read to count in pileups (default 0).


              --minqual

              Minimum base quality to consider for pileups (default 0).


              --nostrays

              Skip  making  a  pass through the input BAM files to identify stray pairs, that is,
              those pairs in which both reads are aligned but not marked valid because they  have
              inconsistent  orientation or separation. Identifying stray pairs can help fill gaps
              and assemble larger insertions, especially of repeat content.   However,  doing  so
              sometimes consumes considerable memory.


