Provided by: scoary_1.6.16-5_all

NAME

       scoary - pangenome-wide association studies

SYNOPSIS

       scoary  [-h] [-t TRAITS] [-g GENES] [-n NEWICKTREE] [-s START_COL] [--delimiter DELIMITER]
       [-r  RESTRICT_TO]  [-o  OUTDIR]  [-u]  [-p  P_VALUE_CUTOFF  [P_VALUE_CUTOFF   ...]]    [-c
       [{I,B,BH,PW,EPW,P}   [{I,B,BH,PW,EPW,P}   ...]]]  [-m  MAX_HITS]  [--include_input_columns
       GRABCOLS] [-w] [--no-time] [-e PERMUTE] [--no_pairwise] [--collapse]  [--threads  THREADS]
       [--test] [--citation] [--version]

OPTIONS

   optional arguments:
       -h, --help
              show this help message and exit

   Input options:
       -t TRAITS, --traits TRAITS
              Input trait table (comma-separated-values). Trait presence is indicated by 1, trait
              absence by 0.  Assumes strain names in the first column  and  trait  names  in  the
              first row
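
              As an illustration of the layout described above, the following Python sketch
              builds a minimal trait table. The strain names, the trait names, and the empty
              top-left cell are hypothetical choices made for this example only.

```python
import csv
import io

# Hypothetical trait table: trait names in the first row,
# strain names in the first column, 1 = trait present, 0 = trait absent.
rows = [
    ["", "resistant", "motile"],
    ["strain_A", 1, 0],
    ["strain_B", 0, 1],
    ["strain_C", 1, 1],
]

# Write the rows as comma-separated values into an in-memory buffer.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
trait_csv = buffer.getvalue()
print(trait_csv)
```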

       -g GENES, --genes GENES
              Input gene presence/absence table (comma-separated values) from Roary. Strain names
              must be equal to those in the trait table

       -n NEWICKTREE, --newicktree NEWICKTREE
              Supply a custom tree (Newick format) for phylogenetic analyses instead of
              calculating it internally.

       -s START_COL, --start_col START_COL
              The column in the gene presence/absence file at which individual strain information
              starts. Default=15. (1-based indexing)
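
              Because the option uses 1-based indexing, the default of 15 corresponds to index 14
              in a 0-based language. A minimal Python sketch of that conversion follows; the
              function name and the header contents are hypothetical and are not taken from
              Scoary's source.

```python
def strain_columns(header, start_col=15):
    """Return the header cells from start_col onward, where start_col is 1-based."""
    return header[start_col - 1:]

# Hypothetical header: 14 metadata columns followed by two strain columns.
header = ["Gene"] + [f"meta_{i}" for i in range(2, 15)] + ["strain_A", "strain_B"]
print(strain_columns(header))
```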

       --delimiter DELIMITER
              The delimiter between cells in the gene presence/absence and trait files,  as  well
              as the output file.

       -r RESTRICT_TO, --restrict_to RESTRICT_TO
              Use if you only want to analyze a subset of your strains. Scoary will read the
              provided comma-separated table of strains and restrict analyses to these.

   Output options:
       -o OUTDIR, --outdir OUTDIR
              Directory to place output files. Default = .

       -u, --upgma_tree
              This flag will cause Scoary to write the calculated UPGMA tree to a newick file

       -p P_VALUE_CUTOFF [P_VALUE_CUTOFF ...],  --p_value_cutoff  P_VALUE_CUTOFF  [P_VALUE_CUTOFF
       ...]
              P-value cut-off / alpha level. For Fisher's, Bonferroni's, and Benjamini-Hochberg's
              tests, Scoary will not report genes with higher p-values than this. For empirical
              p-values, this is treated as an alpha level instead, i.e. 0.02 will filter all
              genes except the lower and upper percentile from this test. Run with "-p 1.0" to
              report all genes. Accepts scientific notation (e.g. 1E-8). Provide a single value
              (applied to all) or exactly as many values as correction criteria and in
              corresponding order. (See example under correction). Default = 0.05
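
              The two-tailed treatment of empirical p-values described above can be sketched in
              Python as follows. This is an illustration of the stated rule only: the function
              name is hypothetical, and whether the tail boundaries are inclusive is an
              assumption.

```python
def passes_empirical_filter(empirical_p, alpha):
    """Keep a gene only if its empirical p-value lies in the lower or upper tail."""
    tail = alpha / 2  # alpha is split evenly between the two tails
    # Assumption: boundary values are kept (inclusive comparison).
    return empirical_p <= tail or empirical_p >= 1 - tail

# With alpha = 0.02, only the lower and upper percentile are kept.
print(passes_empirical_filter(0.005, 0.02))
print(passes_empirical_filter(0.5, 0.02))
```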

       -c    [{I,B,BH,PW,EPW,P}   [{I,B,BH,PW,EPW,P}   ...]],   --correction   [{I,B,BH,PW,EPW,P}
       [{I,B,BH,PW,EPW,P} ...]]
              Apply the indicated filtration measure. Allowed values are I, B, BH,  PW,  EPW,  P.
              I=Individual (naive) p-value. B=Bonferroni adjusted p-value. BH=Benjamini-Hochberg
              adjusted p. PW=Best (lowest) pairwise  comparison.  EPW=Entire  range  of  pairwise
              comparison  p-values.  P=Empirical p-value from permutations. You can enter as many
              correction  criteria  as  you  would  like.  These  will  be  associated  with  the
              p_value_cutoffs  you  enter.  For  example  "-c  I  EPW -p 0.1 0.05" will apply the
              following cutoffs: Naive p-value must be lower than 0.1 AND the entire range of
              pairwise comparison values must be below 0.05 for this gene. Note that the empirical
              p-values should be interpreted at both tails. Therefore, running  "-c  P  -p  0.05"
              will  apply  an  alpha  of  0.05 to the empirical (permuted) p-values, i.e. it will
              filter everything except the upper and  lower  2.5  percent  of  the  distribution.
              Default = Individual p-value. (I)
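
              The pairing of correction criteria with p-value cutoffs can be sketched in Python as
              below. The function name and the error message are hypothetical; the sketch only
              mirrors the rule stated in this manual (one cutoff for all criteria, or exactly one
              per criterion in the same order) and is not Scoary's own code.

```python
def pair_cutoffs(corrections, cutoffs):
    """Map each correction criterion to its p-value cutoff."""
    if len(cutoffs) == 1:
        # A single cutoff is applied to every criterion.
        return {criterion: cutoffs[0] for criterion in corrections}
    if len(cutoffs) != len(corrections):
        raise ValueError("give one cutoff, or exactly one cutoff per criterion")
    # Otherwise cutoffs are matched to criteria by position.
    return dict(zip(corrections, cutoffs))

# Corresponds to the example "-c I EPW -p 0.1 0.05".
print(pair_cutoffs(["I", "EPW"], [0.1, 0.05]))
```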

       -m MAX_HITS, --max_hits MAX_HITS
              Maximum number of hits to report. Scoary will only report the top max_hits results
              per trait

       --include_input_columns GRABCOLS
              Grab columns from the input Roary file and put them in the output. Handles commas
              and ranges, e.g. --include_input_columns 4,6,8,16-23. The special keyword ALL will
              include all relevant input columns in the output
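
              A column specification such as the one in the example above could be expanded as in
              the following Python sketch. The function name is hypothetical, ranges are assumed
              to be inclusive at both ends, and the ALL keyword is not handled here.

```python
def parse_columns(spec):
    """Expand a specification like '4,6,8,16-23' into a list of column numbers."""
    columns = []
    for part in spec.split(","):
        if "-" in part:
            # Assumption: a range includes both of its end points.
            first, last = part.split("-")
            columns.extend(range(int(first), int(last) + 1))
        else:
            columns.append(int(part))
    return columns

print(parse_columns("4,6,8,16-23"))
```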

       -w, --write_reduced
              Use with -r if you want Scoary to create a new gene presence/absence file from your
              reduced set of isolates. Note: Columns 1-14 (No. sequences, Avg group size nuc etc)
              in this file do not reflect the reduced dataset. These  are  taken  from  the  full
              dataset.

       --no-time
              Output  file  in  the  form TRAIT.results.csv, instead of TRAIT_TIMESTAMP.csv. When
              used with  the  -w  argument  will  output  a  reduced  gene  matrix  in  the  form
              gene_presence_absence_reduced.csv                    rather                    than
              gene_presence_absence_reduced_TIMESTAMP.csv

   Analysis options:
       -e PERMUTE, --permute PERMUTE
              Perform N permutations of the significant results post-analysis. Each permutation
              switches the phenotype labels, and a new p-value is calculated from the resulting
              dataset. After all N permutations are completed, the permuted p-values are sorted
              in ascending order, and the percentile of the original result in the permuted
              p-value distribution is reported.
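
              The label-switching procedure can be sketched in Python as below. This is a
              simplified illustration and not Scoary's implementation: a two-sided Fisher's exact
              test is used as a stand-in statistic, all function names are hypothetical, and the
              "+ 1" terms in the final estimate are a common convention adopted here as an
              assumption.

```python
import random
from math import comb

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(low, high + 1)) if p <= observed * (1 + 1e-9))

def gene_trait_p(gene, trait):
    """Build the 2x2 gene/trait table from 0/1 vectors and test it."""
    a = sum(1 for g, t in zip(gene, trait) if g and t)
    b = sum(1 for g, t in zip(gene, trait) if g and not t)
    c = sum(1 for g, t in zip(gene, trait) if not g and t)
    d = sum(1 for g, t in zip(gene, trait) if not g and not t)
    return fisher_two_sided(a, b, c, d)

def empirical_p(gene, trait, n_permutations, seed=0):
    """Position of the original p-value within the permuted p-value distribution."""
    rng = random.Random(seed)
    original = gene_trait_p(gene, trait)
    labels = list(trait)
    permuted = []
    for _ in range(n_permutations):
        rng.shuffle(labels)  # label switching of the phenotype
        permuted.append(gene_trait_p(gene, labels))
    permuted.sort()  # ascending order, as described above
    at_or_below = sum(1 for p in permuted if p <= original * (1 + 1e-9))
    # Assumption: add one to numerator and denominator so the estimate is never zero.
    return (at_or_below + 1) / (n_permutations + 1)

gene = [1, 1, 1, 1, 0, 0, 0, 0]
trait = [1, 1, 1, 1, 0, 0, 0, 0]
print(empirical_p(gene, trait, n_permutations=200))
```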

       --no_pairwise
              Do not perform pairwise comparisons. In this mode, Scoary will perform population
              structure-naive calculations only (Fisher's test, odds ratios etc.). Useful for
              summary operations and exploring sets (genes unique in groups, intersections etc.),
              but not for causal analyses.

       --collapse
              Add  this  to  collapse  correlated  genes  (genes that have identical distribution
              patterns in the sample) into merged units.
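
              The idea of merging genes with identical distribution patterns can be sketched in
              Python as follows. The function name, the gene names, and the "--" separator used to
              label a merged unit are hypothetical and do not describe Scoary's output format.

```python
from collections import defaultdict

def collapse_genes(presence):
    """Merge genes whose presence/absence pattern across strains is identical."""
    groups = defaultdict(list)
    for gene, pattern in presence.items():
        # Genes with the same pattern end up under the same key.
        groups[tuple(pattern)].append(gene)
    # Hypothetical naming: join the member gene names with "--".
    return {"--".join(genes): list(pattern) for pattern, genes in groups.items()}

presence = {
    "geneA": [1, 0, 1],
    "geneB": [1, 0, 1],
    "geneC": [0, 1, 1],
}
print(collapse_genes(presence))
```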

   Misc options:
       --threads THREADS
              Number of threads to use. Default = 1

       --test Run Scoary on the test set in exampledata, overriding all other parameters.

       --citation
              Show citation information, and exit.

       --version
              Display Scoary version, and exit.

       by Ola Brynildsrud (olbb@fhi.no)

AUTHOR

       This manpage was written by Andreas Tille for the Debian distribution and can be used  for
       any other usage of the program.