Provided by: beagle_4.1~180127+dfsg-1_all

NAME

       Beagle - Genotype calling, genotype phasing and imputation of ungenotyped markers

SYNOPSIS

       java -Xmx[GB]g -jar /usr/share/beagle/beagle.jar [options]

DESCRIPTION

       Beagle  performs  genotype calling, genotype phasing, imputation of ungenotyped markers, and identity-by-
       descent segment detection. Genotypic imputation works on  phased  haplotypes  using  a  Li  and  Stephens
       haplotype  frequency model.  Beagle also implements the Refined IBD algorithm for detecting homozygosity-
       by-descent (HBD) and identity-by-descent (IBD) segments.

OPTIONS

   Data input/output parameters
       gt=filename
              Optional
              Specifies a VCF file containing a GT (genotype) format field  for  each  marker.   If  a  genotype
              contains  the  phased  allele  separator, "|", then Beagle will preserve the phase of the genotype
              during the analysis.  If you use the gt argument, all genotypes in the output file will be  phased
              and non-missing.

       gl=filename
              Optional
              Specifies  a  VCF  file  containing a GL or PL (genotype likelihood) format field for each marker.
              Any data in the GT format field will be ignored.  If both GL and PL format fields are present  for
              a marker, the GL format will be used.

       gtgl=filename
              Optional
              Specifies  a  VCF  file  containing a GT, GL or PL format field for each marker.  If a genotype is
              non-missing, Beagle will ignore the genotype likelihood.  If both GL  and  PL  format  fields  are
              present for a marker, the GL field will be used.

       ref=filename
              Optional
              Specifies a VCF file containing phased reference genotypes.  See the impute parameter.

       out=prefix
              Required
              Specifies  the output filename prefix.  The prefix may be an absolute or relative filename, but it
              cannot be a directory name.

       excludesamples=filename
              Optional
              Specifies a file containing non-reference samples (one sample per line) to be  excluded  from  the
              analysis and output files.

       excludemarkers=filename
              Optional
              Specifies a file containing markers (one marker per line) to be excluded from the analysis and the
              output files.  An excluded marker identifier can either be an identifier from the VCF record’s  ID
              field or a genomic coordinate in the format: CHROM:POS.

       map=filename
              Optional
              Specifies  a  PLINK  format genetic map on the cM scale.  HapMap GRCh36 and GRCh37 genetic maps in
              PLINK format are available for download from  the  Beagle  website.   Use  of  a  genetic  map  is
              recommended  if you are imputing ungenotyped markers.  If no genetic map is specified, Beagle will
              assume a constant recombination rate of 1 cM / Mb.

       chrom=chrom:start-end
              Optional
              Specifies a chromosome or chromosome interval using a chromosome identifier in the  VCF  file  and
              the  starting  and  ending positions of the interval.  The entire chromosome, the beginning of the
              chromosome, and the end of a chromosome can be specified by chrom=[chrom], chrom=[chrom:-end], and
              chrom=[chrom:start-] respectively.

       maxlr=number_≥_1
              Default = 5000
              Specifies  the  maximum likelihood ratio at a genotype.  If M is the maximum of the likelihoods of
              each possible genotype, any likelihood that is less than (M ⁄ maxlr) is  set  to  0.0  to  improve
              computational efficiency.
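
       The maxlr rule above can be illustrated with a short Python sketch.  This is not Beagle's own code; the
       function name apply_maxlr and the example likelihoods are hypothetical and only mirror the rule as it is
       described here.

```python
def apply_maxlr(likelihoods, maxlr=5000.0):
    """Set to 0.0 every likelihood smaller than (max likelihood / maxlr)."""
    if maxlr < 1:
        raise ValueError("maxlr must be >= 1")
    m = max(likelihoods)      # M: largest likelihood over the possible genotypes
    threshold = m / maxlr     # likelihoods below this value are dropped
    return [0.0 if x < threshold else x for x in likelihoods]


# Hypothetical likelihoods for the genotypes 0/0, 0/1 and 1/1.
print(apply_maxlr([0.9, 0.09, 0.0001]))  # the last value is below 0.9 / 5000
```

       A smaller maxlr zeroes more of the unlikely genotypes; maxlr=1 keeps only the genotypes that attain the
       maximum likelihood.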

   General parameters
       nthreads=positive_integer
              Default: machine-dependent
              Specifies the number of threads of execution.  If no nthreads parameter is specified, the nthreads
              parameter will be set equal to the number of CPU cores on the host machine.

       lowmem=true/false
              Default = false
              Specifies whether a memory efficient algorithm should be used.   The  memory  efficient  algorithm
              increases run-time by a factor of less than 2.0.

       window=positive_integer
              Default = 50000
              Specifies  the  number of markers to include in each sliding window.  The window parameter must be
              at least twice as large as the overlap parameter.  The window parameter  controls  the  amount  of
              memory  used  in  the  analysis.   For  human data, it is recommended that the window parameter be
              greater than or equal to the typical number of markers in 5 cM.

       overlap=positive_integer
              Default = 3000
              Specifies the number of markers of overlap  between  sliding  windows.   For  human  data,  it  is
              recommended that the overlap be set to the typical number of markers in 0.5 cM (when ibd=false) or
              2.0 cM (when ibd=true).

       seed=integer
              Default = -99999
              Specifies the seed for the random number generator.
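
       The window and overlap recommendations above are stated in cM, while the parameters take marker counts.
       The following Python sketch shows one way to convert between the two.  It assumes a uniform marker
       density and a constant recombination rate of 1 cM / Mb (the value Beagle assumes when no map is given);
       the helper names and the example density are hypothetical.

```python
def markers_in_cm(markers_per_mb, cm, cm_per_mb=1.0):
    """Estimate how many markers fall in a region spanning `cm` centimorgans."""
    mb = cm / cm_per_mb                 # physical length of the region in Mb
    return round(markers_per_mb * mb)


def window_is_valid(window, overlap):
    """The window must be at least twice as large as the overlap."""
    return window >= 2 * overlap


density = 1000 / 3                      # hypothetical: about 1 marker per 3 kb
min_window = markers_in_cm(density, 5.0)     # recommended lower bound for window
overlap = markers_in_cm(density, 0.5)        # recommended overlap when ibd=false
overlap_ibd = markers_in_cm(density, 2.0)    # recommended overlap when ibd=true
print(min_window, overlap, overlap_ibd)
print(window_is_valid(50000, 3000))          # the default values satisfy the constraint
```

       With real data, the density is better estimated from the VCF file and the genetic map than from a
       single genome-wide average.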

   Phasing and imputation parameters
       niterations=non-negative_integer
              Default = 5
              Specifies the number of phasing iterations.  The phasing iterations are  preceded  by  10  burn-in
              iterations  which  carry  out the Beagle version 4.0 phasing algorithm.  If you want to phase your
              data with the Beagle 4.0 phasing algorithm, use niterations=0.  Accuracy and compute time increase
              with the number of iterations.

       impute=true/false
              Default = true
              Specifies  whether markers that are present in the reference panel but absent in your data will be
              imputed.  This option has no effect if the ref and gt arguments are not used.

       gprobs=true/false
              Default = false
              Specifies whether a GP (genotype probability) format field will be included in the output VCF file
              when imputing ungenotyped markers.  By default, a GP field is not printed because a DS (alternate
              allele dose) format field is always printed when imputing ungenotyped markers.

       ne=integer
              Default = 1000000
              Specifies the effective population size when imputing ungenotyped markers.  The default  value  is
              suitable  for  a  large outbred human population.  Smaller values in the hundreds or thousands for
              the ne parameter are suggested for inbred human and animal populations.

       err=non-negative_number
              Default = 0.0001
              Specifies the allele miscall rate.  The default value should give good results for  most  sequence
              and SNP array data.

       cluster=non-negative_number
              Default = 0.005
              Specifies  the  maximum cM distance between individual markers that are combined into an aggregate
              marker when imputing ungenotyped markers.
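
       As an illustration of how the parameters above combine into the key=value command line shown in the
       SYNOPSIS, the following Python sketch assembles the arguments for an imputation run.  The helper
       beagle_command, the memory size and all file names are hypothetical; the sketch only builds and prints
       the command and does not run Beagle.

```python
import shlex


def beagle_command(out, mem_gb=4, jar="/usr/share/beagle/beagle.jar", **params):
    """Build the argument list for a Beagle run; `out` is the required prefix."""
    if not out:
        raise ValueError("the out=prefix parameter is required")
    cmd = ["java", f"-Xmx{mem_gb}g", "-jar", jar]
    for key, value in params.items():
        if isinstance(value, bool):
            value = "true" if value else "false"   # Beagle expects true/false
        cmd.append(f"{key}={value}")
    cmd.append(f"out={out}")
    return cmd


# Hypothetical input files: target genotypes, phased reference panel, genetic map.
cmd = beagle_command(
    "imputed",
    gt="target.vcf",
    ref="panel.vcf",
    map="plink.map",
    impute=True,
    gprobs=True,
)
print(shlex.join(cmd))
```

       The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where Java and the
       Beagle jar are installed.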

   IBD parameters
       ibd=true/false
              Default = false
              Specifies whether IBD analysis will be performed when the gt argument is used.

       ibdlod=non-negative_number
              Default = 3.0
              Specifies the minimum LOD score for reported IBD.

       ibdscale=non-negative_number
              Default: data-dependent
              Specifies the scale parameter used to build the haplotype frequency model for IBD analysis.  If no
              ibdscale  parameter  is  specified, the scale parameter for the IBD analysis will be set to max{2,
              sqrt[sample size]/100}, which we have found to work well for outbred populations.

       ibdtrim=non-negative_integer
              Default = 40
              Specifies the number of markers trimmed from the end of a shared haplotype when testing  for  IBD.
              Note: The default ibdtrim parameter is designed for European samples genotyped with a 1M SNP array
              (~ 1 marker per 3 kb).  For human SNP array data, it is recommended to set the  ibdtrim  parameter
              to  the typical number of markers in a 0.15 cM region.  Pilot studies of randomly selected genomic
              regions can be used to fine-tune the values of the ibdtrim parameter.
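
       The data-dependent default of the ibdscale parameter described above, max{2, sqrt[sample size]/100},
       can be evaluated with a short Python sketch.  The function name is hypothetical, and "sample size" is
       assumed here to mean the number of samples in the analysis.

```python
import math


def default_ibdscale(n_samples):
    """Evaluate max{2, sqrt(n_samples) / 100} for a given number of samples."""
    return max(2.0, math.sqrt(n_samples) / 100.0)


print(default_ibdscale(1000))    # small study: the lower bound of 2 applies
print(default_ibdscale(90000))   # sqrt(90000) / 100 = 3
```

       Under this formula the scale stays at 2 until the sample size exceeds 40000, the point at which
       sqrt[sample size]/100 becomes larger than 2.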

SEE ALSO

       https://faculty.washington.edu/browning/beagle/beagle.html

AUTHOR

       Beagle was written by Brian L. Browning.

       This manual page was written by Dylan Aïssi <bob.dybian@gmail.com>, for the Debian project  (but  may  be
       used by others).