Provided by: beagle_220722-1_all

NAME

       Beagle - Genotype calling, genotype phasing and imputation of ungenotyped markers

SYNOPSIS

       java -Xmx[GB]g -jar /usr/share/beagle/beagle.jar [options]

DESCRIPTION

       Beagle performs genotype calling, genotype phasing, imputation of ungenotyped markers, and
       identity-by-descent segment detection. Genotypic imputation  works  on  phased  haplotypes
       using a Li and Stephens haplotype frequency model.  Beagle also implements the Refined IBD
       algorithm  for  detecting  homozygosity-by-descent  (HBD)  and  identity-by-descent  (IBD)
       segments.

OPTIONS

   Data input/output parameters
       gt=filename
              Optional
              Specifies a VCF file containing a GT (genotype) format field for each marker.  If a
              genotype contains the phased allele separator, "|", then Beagle will  preserve  the
              phase  of  the  genotype  during  the  analysis.   If  you use the gt argument, all
              genotypes in the output file will be phased and non-missing.

       gl=filename
              Optional
              Specifies a VCF file containing a GL or PL (genotype likelihood) format  field  for
              each  marker.   Any data in the GT format field will be ignored.  If both GL and PL
              format fields are present for a marker, the GL format will be used.

       gtgl=filename
              Optional
              Specifies a VCF file containing a GT, GL or PL format field for each marker.  If  a
              genotype  is  non-missing,  Beagle will ignore the genotype likelihood.  If both GL
              and PL format fields are present for a marker, the GL field will be used.

       ref=filename
              Optional
              Specifies a VCF  file  containing  phased  reference  genotypes.   See  the  impute
              parameter.

       out=prefix
              Required
              Specifies  the  output  filename prefix.  The prefix may be an absolute or relative
              filename, but it cannot be a directory name.

       excludesamples=filename
              Optional
              Specifies a file containing non-reference samples  (one  sample  per  line)  to  be
              excluded from the analysis and output files.

       excludemarkers=filename
              Optional
              Specifies  a  file containing markers (one marker per line) to be excluded from the
              analysis and the output files.  An excluded marker  identifier  can  either  be  an
              identifier  from  the  VCF record’s ID field or a genomic coordinate in the format:
              CHROM:POS.

       map=filename
              Optional
              Specifies a PLINK format genetic map on the cM scale.   HapMap  GRCh36  and  GRCh37
              genetic  maps  in  PLINK format are available for download from the Beagle website.
              Use of a genetic map is recommended if you are imputing ungenotyped markers.  If no
              genetic  map is specified, Beagle will assume a constant recombination rate of 1 cM
              / Mb.
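
              As a rough illustration of the fallback described above, the sketch below converts
              a base-pair position to a genetic position under a constant rate of 1 cM / Mb. The
              function name and its rate_cm_per_mb argument are hypothetical and are not part of
              Beagle.

```python
def default_genetic_pos_cm(pos_bp, rate_cm_per_mb=1.0):
    """Genetic position in cM under a constant recombination rate (assumed helper)."""
    return pos_bp / 1_000_000 * rate_cm_per_mb


# A marker at 2.5 Mb sits at 2.5 cM when no genetic map is supplied.
print(default_genetic_pos_cm(2_500_000))
```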

       chrom=chrom:start-end
              Optional
              Specifies a chromosome or chromosome interval using a chromosome identifier in  the
              VCF  file  and  the  starting  and  ending  positions  of the interval.  The entire
              chromosome, the beginning of the chromosome, and the end of the chromosome  can  be
              specified by chrom=[chrom], chrom=[chrom:-end], and chrom=[chrom:start-],
              respectively.
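
              The four accepted forms of the chrom value can be sketched with a small parser.
              This is only an illustration of the syntax; it assumes that chromosome identifiers
              contain no colon, and parse_chrom is a hypothetical helper, not Beagle's own
              parsing code.

```python
def parse_chrom(spec):
    """Split a chrom= value into (chrom, start, end); None marks an open bound."""
    chrom, sep, interval = spec.partition(":")
    if not chrom:
        raise ValueError("missing chromosome identifier")
    if not sep:
        # chrom=[chrom]: the entire chromosome
        return chrom, None, None
    start, dash, end = interval.partition("-")
    if not dash:
        raise ValueError("interval must contain '-'")
    return chrom, int(start) if start else None, int(end) if end else None


for spec in ("20", "20:1000-5000", "20:-5000", "20:1000-"):
    print(spec, parse_chrom(spec))
```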

       maxlr=number_≥_1
              Default = 5000
              Specifies the maximum likelihood ratio at a genotype.  If M is the maximum  of  the
              likelihoods of each possible genotype, any likelihood that is less than (M ⁄ maxlr)
              is set to 0.0 to improve computational efficiency.
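
              The thresholding rule can be sketched as follows. The function apply_maxlr is a
              hypothetical name used for illustration; it applies the rule as stated above and
              does not reproduce Beagle's internal code.

```python
def apply_maxlr(likelihoods, maxlr=5000.0):
    """Set to 0.0 every likelihood smaller than max(likelihoods) / maxlr."""
    if maxlr < 1:
        raise ValueError("maxlr must be >= 1")
    threshold = max(likelihoods) / maxlr
    return [0.0 if x < threshold else x for x in likelihoods]


# Three genotype likelihoods for one sample at one biallelic marker.
# The threshold is 0.9 / 5000 = 0.00018, so only the last value is zeroed.
print(apply_maxlr([0.9, 0.0999, 0.0001]))
```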

   General parameters
       nthreads=positive_integer
              Default: machine-dependent
              Specifies the number  of  threads  of  execution.   If  no  nthreads  parameter  is
              specified,  the  nthreads parameter will be set equal to the number of CPU cores on
              the host machine.

       lowmem=true/false
              Default = false
              Specifies whether  a  memory  efficient  algorithm  should  be  used.   The  memory
              efficient algorithm increases run-time by a factor of less than 2.0.

       window=positive_integer
              Default = 50000
              Specifies  the  number  of  markers  to include in each sliding window.  The window
              parameter must be at least twice as large as the  overlap  parameter.   The  window
              parameter  controls  the amount of memory used in the analysis.  For human data, it
              is recommended that the window parameter be greater than or equal  to  the  typical
              number of markers in 5 cM.

       overlap=positive_integer
              Default = 3000
              Specifies  the  number  of  markers  of overlap between sliding windows.  For human
              data, it is recommended that the overlap be set to the typical number of markers in
              0.5 cM (when ibd=false) or 2.0 cM (when ibd=true).
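
              The recommendations for window and overlap can be turned into numbers once the
              average marker density is known. The sketch below assumes a uniform density
              (markers divided by map length in cM) and uses a hypothetical helper,
              suggest_window_overlap. It returns the 5 cM lower bound for window, raised if
              necessary so that window is at least twice the overlap.

```python
def suggest_window_overlap(n_markers, map_length_cm, ibd=False):
    """Return (window, overlap) in markers from an assumed uniform marker density."""
    density = n_markers / map_length_cm           # markers per cM
    overlap_cm = 2.0 if ibd else 0.5              # recommended overlap length
    overlap = max(1, round(density * overlap_cm))
    # window should cover at least 5 cM and be at least twice the overlap
    window = max(round(density * 5.0), 2 * overlap)
    return window, overlap


# Hypothetical data set: 500,000 markers on a 100 cM chromosome.
print(suggest_window_overlap(500_000, 100.0))
print(suggest_window_overlap(500_000, 100.0, ibd=True))
```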

       seed=integer
              Default = -99999
              Specifies the seed for the random number generator.

   Phasing and imputation parameters
       niterations=non-negative_integer
              Default = 5
              Specifies the number of phasing iterations.  The phasing iterations are preceded by
              10 burn-in iterations which carry out the Beagle version 4.0 phasing algorithm.  If
              you   want  to  phase  your  data  with  the  Beagle  4.0  phasing  algorithm,  use
              niterations=0.  Accuracy and compute time increase with the number of iterations.

       impute=true/false
              Default = true
              Specifies whether markers that are present in the reference  panel  but  absent  in
              your  data  will be imputed.  This option has no effect if the ref and gt arguments
              are not used.

       gprobs=true/false
              Default = false
              Specifies whether a GP (genotype probability) format field will be included in  the
              output  VCF file when imputing ungenotyped markers.  By default, a GP field  is not
              printed because a DS (alternate allele dose) format field is  always  printed  when
              imputing ungenotyped markers.

       ne=integer
              Default = 1000000
              Specifies  the  effective  population  size when imputing ungenotyped markers.  The
              default value is suitable for a large outbred human population.  Smaller values  in
              the  hundreds  or thousands for the ne parameter are suggested for inbred human and
              animal populations.

       err=non-negative_number
              Default = 0.0001
              Specifies the allele miscall rate.  The default value should give good results  for
              most sequence and SNP array data.

       cluster=non-negative_number
              Default = 0.005
              Specifies the maximum cM distance between individual markers that are combined into
              an aggregate marker when imputing ungenotyped markers.

   IBD parameters
       ibd=true/false
              Default = false
              Specifies whether IBD analysis will be performed when the gt argument is used.

       ibdlod=non-negative_number
              Default = 3.0
              Specifies the minimum LOD score for reported IBD.

       ibdscale=non-negative_number
              Default: data-dependent
              Specifies the scale parameter used to build the haplotype frequency model  for  IBD
              analysis.   If  no ibdscale parameter is specified, the scale parameter for the IBD
              analysis will be set to max{2, sqrt[sample size]/100}, which we have found to  work
              well for outbred populations.
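
              The default scale formula can be evaluated directly. The helper name
              default_ibdscale is hypothetical, and the text above does not say whether sample
              size counts individuals or haplotypes, so the argument is simply passed through as
              given.

```python
import math


def default_ibdscale(sample_size):
    """Evaluate max{2, sqrt[sample size]/100} as given in the ibdscale description."""
    return max(2.0, math.sqrt(sample_size) / 100.0)


# The square-root term exceeds 2 only when the sample size is above 40,000.
for n in (1_000, 40_000, 1_000_000):
    print(n, default_ibdscale(n))
```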

       ibdtrim=non-negative_integer
              Default = 40
              Specifies  the  number  of  markers trimmed from the end of a shared haplotype when
              testing for IBD.  Note: The default ibdtrim  parameter  is  designed  for  European
              samples  genotyped  with a 1M SNP array (~ 1 marker per 3 kb).  For human SNP array
              data, it is recommended to set the ibdtrim  parameter  to  the  typical  number  of
              markers  in  a  0.15 cM region.  Pilot studies of randomly selected genomic regions
              can be used to fine-tune the values of the ibdtrim parameter.

SEE ALSO

       https://faculty.washington.edu/browning/beagle/beagle.html

AUTHOR

       Beagle was written by Brian L. Browning.

       This manual page was written by Dylan Aïssi <bob.dybian@gmail.com>, for the Debian project
       (but may be used by others).