Ubuntu Manpage: PLINK - whole genome SNP analysis

Provided by: plink2_2.00~a3.3-220801+dfsg-1_amd64

NAME

       PLINK - whole genome SNP analysis

DESCRIPTION

       PLINK v2.00a3 64-bit (9 Apr 2020)              www.cog-genomics.org/plink/2.0/
       (C) 2005-2020 Shaun Purcell, Christopher Chang   GNU General Public License v3

       In the command line flag definitions that follow,

              * <angle brackets> denote a required parameter, where the text between the
                angle brackets describes its nature.

              * ['square brackets + single-quotes'] denotes an optional modifier.  Use the
                EXACT text in the quotes.

              * [{bar|separated|braced|bracketed|values}] denotes a collection of mutually
                exclusive optional modifiers (again, the exact text must be used).  When
                there are no outer square brackets, one of the choices must be selected.

              * ['quoted_text='<description of value>] denotes an optional modifier that
                must begin with the quoted text, and be followed by a value with no
                whitespace in between.  '|' may also be used here to indicate mutually
                exclusive options.

              * [square brackets without quotes or braces] denote an optional parameter,
                where the text between the brackets describes its nature.

              * An ellipsis (...) indicates that you may enter multiple arguments of the
                specified type.

              * A "column set descriptor" is either
                1. a comma-separated sequence of column set names; this is interpreted as
                   the full list of column sets to include.
                2. a comma-separated sequence of column set names, all preceded by '+' or
                   '-'; this is interpreted as a list of changes to the default.
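
              The two forms of a column set descriptor can be illustrated with a short
              Python sketch.  This is not PLINK's own parser: resolve_column_sets() and
              its arguments are hypothetical names, and any extra interactions between
              column sets that PLINK may apply are ignored.  The KNOWN and DEFAULT lists
              are copied from the variant-based --missing report described in this page.

```python
def resolve_column_sets(descriptor, default, known):
    """Resolve a column set descriptor into a list of column set names.

    descriptor: text such as "chrom,pos" or "+pos,-nobs"
    default:    column sets that are active when no descriptor is given
    known:      every column set the report supports, in output order
    """
    tokens = [tok for tok in descriptor.split(",") if tok]
    if not tokens:
        raise ValueError("empty column set descriptor")
    signed = [tok[0] in "+-" for tok in tokens]
    if any(signed) and not all(signed):
        raise ValueError("'+'/'-' must prefix every name or none of them")
    bare = [tok[1:] if tok[0] in "+-" else tok for tok in tokens]
    unknown = [name for name in bare if name not in known]
    if unknown:
        raise ValueError("unknown column set(s): " + ",".join(unknown))
    if not any(signed):
        selected = set(bare)        # form 1: the full list of column sets
    else:
        selected = set(default)     # form 2: changes applied to the default
        for tok in tokens:
            if tok[0] == "+":
                selected.add(tok[1:])
            else:
                selected.discard(tok[1:])
    # Report the result in the report's own column order.
    return [name for name in known if name in selected]


# Column sets of the variant-based --missing report.
KNOWN = ["chrom", "pos", "ref", "alt1", "alt", "nmissdosage", "nmiss", "nmisshh",
         "hethap", "nobs", "fmissdosage", "fmiss", "fmisshh", "fhethap"]
DEFAULT = ["chrom", "nmiss", "nobs", "fmiss"]
```

              With these lists, "+pos,-nobs" resolves to chrom,pos,nmiss,fmiss, while
              "chrom,pos" resolves to exactly those two column sets.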

              plink2 <input flag(s)...> [command flag(s)...] [other flag(s)...]
              plink2 --help [flag name(s)...]

       Most PLINK runs require exactly one main input fileset.  The following flags are available
       for defining its form and location:

       --pfile <prefix> ['vzs']
              : Specify .pgen + .pvar[.zst] + .psam prefix.

       --pgen <filename>
              : Specify full name of .pgen/.bed file.

       --pvar <filename>
              : Specify full name of .pvar/.bim file.

       --psam <filename>
              : Specify full name of .psam/.fam file.

       --bfile <prefix> ['vzs']
              : Specify .bed + .bim[.zst] + .fam prefix.

       --bpfile <prefix> ['vzs']
              : Specify .pgen + .bim[.zst] + .fam prefix.

       --keep-autoconv ['vzs']
              :  When  importing non-PLINK-binary data, don't delete autogenerated fileset at end
              of run.

       --no-fid
              : .fam file does not contain column 1 (family ID).

       --no-parents
              : .fam file does not contain columns 3-4 (parents).

       --no-sex
              : .fam file does not contain column 5 (sex).

       --vcf <filename> ['dosage='<field>]

       --bcf <filename> ['dosage='<field>]
              : Specify full name of .vcf{|.gz|.zst} or BCF2 file to import.

              * These can be used with --psam/--fam.

              * By default, dosage information is not imported.  To import the GP field
                (must be VCFv4.3-style 0..1, one probability per possible genotype), add
                'dosage=GP' (or 'dosage=GP-force', see below).  To import Minimac3-style
                DS+HDS phased dosage, add 'dosage=HDS'.  'dosage=DS' (or anything else for
                now) causes the named field to be interpreted as a Minimac3-style dosage.
                Note that, in the dosage=GP case, PLINK 2 collapses the probabilities down
                to dosages; you cannot use PLINK 2 to losslessly convert VCF FORMAT:GP
                data to e.g. BGEN format.  To make this more obvious, PLINK 2 now errors
                out when dosage=GP is used on a file with a FORMAT:DS header line and
                --import-dosage-certainty wasn't specified, since dosage=DS extracts the
                same information more quickly in this situation.  You can suppress this
                error with 'dosage=GP-force'.  In all of these cases, hardcalls are
                regenerated from scratch from the dosages.  As a consequence, variants
                with no GT field can now be imported; they will be assumed to contain
                only diploid calls when HDS is also absent.

       --data <filename prefix> <REF/ALT mode> ['gzs']

       --bgen <filename> <REF/ALT mode> ['snpid-chr']

       --gen <filename> <REF/ALT mode>

       --sample <filename>
              : Specify an Oxford-format dataset to import.  --data specifies a .gen[.zst]
              + .sample pair, while --bgen specifies a BGEN v1.1+ file.

              * If a BGEN v1.2+ file contains sample IDs, it may be imported without a
                companion .sample file.

              * With 'snpid-chr', chromosome codes are read from the 'SNP ID' field
                instead of the usual chromosome field.

              * The following REF/ALT modes are supported:
                  'ref-first': The first allele for each variant is REF.
                  'ref-last': The last allele for each variant is REF.
                  'ref-unknown': The last allele for each variant is treated as
                                 provisional-REF.

       --haps <filename> [{ref-first | ref-last}]

       --legend <filename> <chr code>
              : Specify .haps [+ .legend] file(s) to import.

              * When --legend is specified, it's assumed that the --haps file doesn't
                contain header columns.

              * On chrX, the second male column may contain dummy '-' entries.  (However,
                PLINK 2 currently cannot handle omitted male columns.)

              * If not used with --sample, new sample IDs are of the form 'per#/per#'.

       --map <filename>
              : Specify full name of .map file.

       --import-dosage <allele dosage file> ['noheader'] ['id-delim='<char>]
                       ['skip0='<i>] ['skip1='<j>] ['skip2='<k>] ['dose1']
                       ['format='<m>] [{ref-first | ref-last}]
                       ['single-chr='<code>] ['chr-col-num='<#>]
                       ['pos-col-num='<#>]
              : Specify PLINK 1.x-style dosage file to import.

              * You must also specify a companion .psam/.fam file.

              * By default, PLINK assumes that the file contains a header line, which has
                'SNP' in (1-based) column i+1, 'A1' in column i+j+2, 'A2' in column i+j+3,
                and sample FID/IIDs starting from column i+j+k+4.  (i/j/k are normally
                zero, but can be changed with 'skip0', 'skip1', and 'skip2' respectively.
                FID/IID are normally assumed to be separate tokens, but if they're merged
                into a single token you can specify the delimiter with 'id-delim='.)  If
                such a header line is not present, use the 'noheader' modifier; samples
                will then be assumed to appear in the same order as they do in the
                .psam/.fam file.

              * You may specify a companion .map file.  If you do not,
                * 'single-chr=' can be used to specify that all variants are on the named
                  chromosome.  Otherwise, you can use 'chr-col-num=' to read chromosome
                  codes from the given (1-based) column number.
                * 'pos-col-num=' causes bp coordinates to be read from the given column
                  number.

              * The 'format=' modifier lets you specify the number of values used to
                represent each dosage.  'format=1' normally indicates a single 0..2 A1
                expected count; 'dose1' modifies this to a 0..1 frequency.  'format=2'
                indicates a 0..1 homozygous A1 likelihood followed by a 0..1 het
                likelihood.  'format=3' indicates 0..1 hom A1, 0..1 het, 0..1 hom A2.
                'format=infer' (the default) infers the format from the number of columns
                in the first nonheader line.
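
              The header layout and the 'format=' value groups described above can be
              sketched in Python as follows.  Both function names are hypothetical.  The
              conversion 2*P(hom A1) + P(het) is the textbook expected allele count; this
              page does not state how PLINK itself rescales or validates the values, so
              treat the second function as an assumption rather than the importer's
              actual behavior.

```python
def header_columns(skip0=0, skip1=0, skip2=0):
    """Return the 1-based header columns implied by skip0/skip1/skip2 (i/j/k)."""
    i, j, k = skip0, skip1, skip2
    return {
        "SNP": i + 1,
        "A1": i + j + 2,
        "A2": i + j + 3,
        "first_sample": i + j + k + 4,
    }


def a1_dosage(values, dose1=False):
    """Convert one sample's value group into a 0..2 expected A1 count.

    values holds 1, 2 or 3 numbers, matching format=1, format=2 and format=3.
    """
    if len(values) == 1:
        # format=1: a 0..2 expected count, or a 0..1 frequency with 'dose1'.
        return 2.0 * values[0] if dose1 else float(values[0])
    if len(values) in (2, 3):
        # format=2/3: hom A1 likelihood, then het likelihood (hom A2 is unused here).
        hom_a1, het = values[0], values[1]
        return 2.0 * hom_a1 + het
    raise ValueError("expected 1, 2 or 3 values per dosage")
```

              For example, with 'skip0=1', 'skip1=2' and 'skip2=3' the sketch places
              'SNP' in column 2, 'A1' in column 5, 'A2' in column 6, and the first sample
              ID in column 10.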

       --dummy <sample ct> <SNP ct> [missing dosage freq] [missing pheno freq]
               [{acgt | 1234 | 12}] ['pheno-ct='<count>] ['scalar-pheno']
               ['dosage-freq='<rate>]

              This generates a fake input dataset with the specified number of samples and
              SNPs.

              * By default, the missing dosage and phenotype frequencies are zero.  These
                can be changed by providing 3rd and 4th numeric arguments.

              * By default, allele codes are As and Bs; this can be changed with the
                'acgt', '1234', or '12' modifier.

              * By default, one binary phenotype is generated.  'pheno-ct=' can be used
                to change the number of phenotypes, and 'scalar-pheno' causes these
                phenotypes to be normally distributed scalars.

              * By default, all (nonmissing) dosages are in {0,1,2}.  To make some of
                them take on decimal values, use 'dosage-freq='.  (These dosages are
                affected by --hard-call-threshold and --dosage-erase-threshold.)

       --fa <filename>
              : Specify full name of reference FASTA file.

       Output  files  have names of the form 'plink2.<extension>' by default.  You can change the
       'plink2' prefix with

       --out <prefix>
              : Specify prefix for output files.

       Most runs also require at least one of the following commands:

       --rm-dup [mode] ['list']

              Remove all but one instance of each duplicate-ID variant (ignoring the missing
              ID), and (with the 'list' modifier) write a list of duplicated IDs to
              <output prefix>.rmdup.list.  The following modes of operation are supported:

              * 'error' (default) causes this to error out when there's a genotype data
                or other mismatch between the records.  A list of affected IDs is written
                to <output prefix>.rmdup.mismatch.

              * 'retain-mismatch' causes all instances of a duplicate-ID variant to be
                retained when there's a genotype data or variant info mismatch; otherwise
                one instance is kept.  The .rmdup.mismatch file is also written.

              * 'exclude-mismatch' removes all instances of duplicate-ID mismatched
                variants instead.

              * 'exclude-all' causes all instances of duplicate-ID variants to be
                removed, even when the actual records are identical.

              * 'force-first' causes only the first instance of duplicate-ID variants to
                be kept, under all circumstances.
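
              The five modes can be compared side by side with a small Python model.  This
              is a sketch of the rules as worded above, not PLINK's implementation: rm_dup()
              is a hypothetical name, a record is any comparable value standing in for the
              genotype data and variant info, the missing ID is assumed to be '.', and the
              "one instance" that survives is assumed to be the first.

```python
MISSING_ID = "."   # assumption: '.' is the missing variant ID
MODES = ("error", "retain-mismatch", "exclude-mismatch", "exclude-all", "force-first")


def rm_dup(variants, mode="error"):
    """Filter a list of (variant_id, record) pairs; returns the kept pairs in order."""
    if mode not in MODES:
        raise ValueError("unknown mode: " + mode)
    groups = {}
    for idx, (vid, _record) in enumerate(variants):
        if vid != MISSING_ID:              # the missing ID is never a duplicate
            groups.setdefault(vid, []).append(idx)
    drop = set()
    mismatched_ids = []
    for vid, idxs in groups.items():
        if len(idxs) == 1:
            continue
        records = [variants[i][1] for i in idxs]
        mismatch = any(rec != records[0] for rec in records[1:])
        if mismatch:
            mismatched_ids.append(vid)
        if mode == "force-first":
            drop.update(idxs[1:])          # keep the first, whatever the records say
        elif mode == "exclude-all":
            drop.update(idxs)              # remove even identical duplicates
        elif not mismatch:
            drop.update(idxs[1:])          # identical records: keep one instance
        elif mode == "exclude-mismatch":
            drop.update(idxs)
        # 'retain-mismatch' keeps every instance; 'error' is handled below.
    if mode == "error" and mismatched_ids:
        raise ValueError("mismatched duplicate IDs: " + ",".join(mismatched_ids))
    return [pair for idx, pair in enumerate(variants) if idx not in drop]
```

              In this model, two identical 'rs1' records collapse to one under every mode
              except 'exclude-all', while two differing 'rs2' records are both kept by
              'retain-mismatch' and both removed by 'exclude-mismatch'.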

       --make-pgen ['vzs'] ['format='<code>] ['trim-alts'] ['erase-phase']
                   ['erase-dosage'] ['pvar-cols='<col set descriptor>]
                   ['psam-cols='<col set descriptor>]

       --make-bpgen ['vzs'] ['format='<code>] ['trim-alts'] ['erase-phase']
                    ['erase-dosage']

       --make-bed ['vzs'] ['trim-alts']

              Create a new PLINK 2 binary fileset (--make-pgen = .pgen + .pvar[.zst] + .psam,
              --make-bpgen = .pgen + .bim[.zst] + .fam).

              * Unlike the automatic text-to-binary converters (which only heed
                chromosome filters), this supports all of PLINK's filtering flags.

              * The 'vzs' modifier causes the variant file (.pvar/.bim) to be
                Zstd-compressed.

              * The 'format' modifier requests an uncompressed fixed-variant-width .pgen
                file.  (These do not directly support multiallelic variants.)  The
                following format code is currently supported:
                  2: just like .bed, except with an extended (12-byte instead of 3-byte)
                     header containing variant/sample counts, and rotated genotype codes
                     (00 = hom ref, 01 = het, 10 = hom alt, 11 = missing).

              * The 'erase-phase' and 'erase-dosage' modifiers prevent phase and dosage
                information from being written to the new .pgen.

              * The first five columns of a .pvar file are always #CHROM/POS/ID/REF/ALT.
                Supported optional .pvar column sets are:
                  xheader: All ## header lines (yeah, this is technically not a column),
                           except for possibly FILTER/INFO definitions when those
                           column(s) have been removed.  Without this, only the #CHROM
                           header line is kept.
                  vcfheader: xheader, with additions to make it a valid VCF header.
                  maybequal: QUAL.  Omitted if all remaining values are missing.
                  qual: Force QUAL column to be written even when empty.
                  maybefilter: FILTER.  Omitted if all remaining values are missing.
                  filter: Force FILTER column to be written even when empty.
                  maybeinfo: INFO.  Omitted if all remaining values are missing, or if
                             INFO:PR is the only subfield.
                  info: Force INFO column to be written.
                  maybecm: Centimorgan coordinate.  Omitted if all remaining values = 0.
                  cm: Force CM column to be written even when empty.
                The default is xheader,maybequal,maybefilter,maybeinfo,maybecm.

              * Supported column sets for the .psam file are:
                  maybefid: Family ID, '0' = missing.  Omitted if all values missing.
                  fid: Force FID column to be written even when empty.
                  maybesid: Source ID, '0' = missing.  Omitted if all values missing.
                  sid: Force SID column to be written even when empty.
                  maybeparents: Father and mother IIDs.  Omitted if all values missing.
                  parents: Force PAT and MAT columns to be written even when empty.
                  sex: '1' = male, '2' = female, 'NA' = missing.
                  pheno1: First active phenotype.  If none, all column entries are set to
                          the --output-missing-phenotype string.
                  phenos: All active phenotypes, if any.  (Can be combined with pheno1 to
                          force at least one phenotype column to be written.)
                The default is maybefid,maybesid,maybeparents,sex,phenos.
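
              The rotated 2-bit genotype codes of format code 2 (00 = hom ref, 01 = het,
              10 = hom alt, 11 = missing) can be decoded as in the Python sketch below.
              The code table comes from the text above.  The packing order is an
              assumption: the sketch takes four genotypes per byte, lowest-order bit pair
              first, as in a PLINK 1 .bed file, and it does not parse the 12-byte header.
              decode_byte() is a hypothetical helper name.

```python
GENOTYPE_CODES = ("hom ref", "het", "hom alt", "missing")   # codes 00, 01, 10, 11


def decode_byte(byte, count=4):
    """Decode up to four 2-bit genotype codes from one byte.

    Assumes the first genotype occupies the two lowest-order bits.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in 0..255")
    if not 0 <= count <= 4:
        raise ValueError("count must be in 0..4")
    return [GENOTYPE_CODES[(byte >> (2 * i)) & 0b11] for i in range(count)]
```

              Under that assumption the byte 0b11100100 decodes to hom ref, het, hom alt,
              missing, in that order.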

       --make-just-pvar ['zs'] ['cols='<column set descriptor>]

       --make-just-psam ['cols='<column set descriptor>]

       --make-just-bim ['zs']

       --make-just-fam

              Variants  of  --make-pgen/--make-bed which only write a new variant or sample file.
              These don't always require an input genotype file.  USE THESE  CAUTIOUSLY.   It  is
              very  easy  to  desynchronize  your  binary  genotype  data and your sample/variant
              indexes if you use these commands improperly.  If you have any  doubt,  stick  with
              --make-[b]pgen/--make-bed.

       --export <output format(s)...> [{01 | 12}] ['bgz'] ['id-delim='<char>]
                ['id-paste='<column set descriptor>] ['include-alt']
                ['omit-nonmale-y'] ['spaces'] ['vcf-dosage='<field>] ['ref-first']
                ['bits='<#>]

              Create a new fileset with all filters applied.  The following output
              formats are supported:
              (actually, only A, AD, A-transpose, bcf, bgen-1.x, haps, hapslegend,
              ind-major-bed, oxford, and vcf are implemented for now)

              * '23': 23andMe 4-column format.  This can only be used on a single
                sample's data (--keep may be handy), and does not support multicharacter
                allele codes.

              * 'A': Sample-major additive (0/1/2) coding, suitable for loading from R.
                If you need uncounted alleles to be named in the header line, add the
                'include-alt' modifier.

              * 'AD': Sample-major additive (0/1/2) + dominant (het=1/hom=0) coding.
                Also supports 'include-alt'.

              * 'A-transpose': Variant-major 0/1/2.

              * 'beagle': Unphased per-autosome .dat and .map files, readable by early
                BEAGLE versions.

              * 'beagle-nomap': Single .beagle.dat file.

              * 'bgen-1.x': Oxford-format .bgen + .sample.  For v1.2/v1.3, sample
                identifiers are stored in the .bgen (with id-delim and id-paste settings
                applied), and default precision is 16-bit (use the 'bits' modifier to
                reduce this).

              * 'bimbam': Regular BIMBAM format.

              * 'bimbam-1chr': BIMBAM format, with a two-column .pos.txt file.  Does not
                support multiple chromosomes.

              * 'fastphase': Per-chromosome fastPHASE files, with
                .chr-<chr #>.phase.inp filename extensions.

              * 'fastphase-1chr': Single .phase.inp file.  Does not support
                multiple chromosomes.

              * 'haps', 'hapslegend': Oxford-format .haps + .sample[ + .legend].  All
                data must be biallelic and phased.  When the 'bgz' modifier is present,
                the .haps file is block-gzipped.

              * 'HV': Per-chromosome Haploview files, with .chr-<chr #>{.ped,.info}
                filename extensions.

              * 'HV-1chr': Single Haploview .ped + .info file pair.  Does not support
                multiple chromosomes.

              * 'ind-major-bed': PLINK 1 sample-major .bed (+ .bim + .fam).

              * 'lgen': PLINK 1 long-format (.lgen + .fam + .map), loadable with --lfile.

              * 'lgen-ref': .lgen + .fam + .map + .ref, loadable with --lfile +
                --reference.

              * 'list': Single genotype-based list, up to 4 lines per variant.  To omit
                nonmale genotypes on the Y chromosome, add the 'omit-nonmale-y' modifier.

              * 'rlist': .rlist + .fam + .map fileset, where the .rlist file is a
                genotype-based list which omits the most common genotype for each
                variant.  Also supports 'omit-nonmale-y'.

              * 'oxford': Oxford-format .gen + .sample.  When the 'bgz' modifier is
                present, the .gen file is block-gzipped.

              * 'ped': PLINK 1 sample-major (.ped + .map), loadable with --file.

              * 'compound-genotypes': Same as 'ped', except that the space between each
                pair of same-variant allele codes is removed.

              * 'structure': Structure-format.

              * 'transpose': PLINK 1 variant-major (.tped + .tfam), loadable with
                --tfile.

              * 'vcf', 'vcf-4.2', 'bcf', 'bcf-4.2': VCF (default version 4.3).  If PAR1
                and PAR2 are present, they are automatically merged with chrX, with
                proper handling of chromosome codes and male ploidy.
                When the 'bgz' modifier is present, the VCF file is block-gzipped.  (This
                always happens with BCF output.)
                The 'id-paste' modifier controls which .psam columns are used to construct
                sample IDs (choices are maybefid, fid, iid, maybesid, and sid; default is
                maybefid,iid,maybesid), while the 'id-delim' modifier sets the character
                between the ID pieces (default '_').
                Genotypes are always exported.  If you want to export a sites-only VCF
                instead, see --make-pgen/--make-just-pvar's 'vcfheader' column set.
                Dosages are not exported unless the 'vcf-dosage=' modifier is present.
                The following five dosage export modes are supported:
                  'GP': genotype posterior probabilities (v4.3 only).
                  'DS': Minimac3-style dosages, omitted for hardcalls.
                  'DS-force': Minimac3-style dosages, never omit.
                  'HDS': Minimac3-style phased dosages, omitted for hardcalls and unphased
                         calls.  Also includes 'DS' output.
                  'HDS-force': Always report DS and HDS.

              In addition,

              * The '12' modifier causes alt1 alleles to be coded as '1' and ref alleles
                to be coded as '2', while '01' maps alt1 -> 0 and ref -> 1.

              * The 'spaces' modifier makes the output space-delimited instead of
                tab-delimited, whenever both are permitted.

              * For biallelic formats where it's unspecified whether the reference/major
                allele should appear first or second, --export defaults to second for
                compatibility with PLINK 1.9.  Use 'ref-first' to change this.  (Note that
                this doesn't apply to the 'A', 'AD', and 'A-transpose' formats; use
                --export-allele to control which alleles are counted there.)
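
              How 'id-paste' and 'id-delim' combine into an exported sample ID can be
              modeled with the Python sketch below.  export_sample_id() is a hypothetical
              helper, not a PLINK function.  It simplifies in two ways that this page does
              not confirm: the 'maybe' pieces are decided per sample (present in the
              dictionary or not) rather than per column, and a forced but absent FID/SID
              is written as '0', the missing code given for the .psam column sets.

```python
def export_sample_id(sample, id_paste="maybefid,iid,maybesid", id_delim="_"):
    """Build one exported sample ID from a dict with 'IID' and optional 'FID'/'SID'."""
    pieces = []
    for name in id_paste.split(","):
        if name == "iid":
            pieces.append(sample["IID"])
        elif name in ("maybefid", "maybesid"):
            key = name[len("maybe"):].upper()      # 'FID' or 'SID'
            if key in sample:                      # include only when available
                pieces.append(sample[key])
        elif name in ("fid", "sid"):
            pieces.append(sample.get(name.upper(), "0"))   # assumed missing code
        else:
            raise ValueError("unsupported id-paste column set: " + name)
    return id_delim.join(pieces)
```

              With the defaults, a sample with FID 'fam1' and IID 's1' becomes 'fam1_s1',
              and a sample with only an IID keeps that IID unchanged.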

       --freq ['zs'] ['counts'] ['cols='<column set descriptor>] ['bins-only']
              ['refbins='<comma-separated bin boundaries> | 'refbins-file='<file>]
              ['alt1bins='<comma-separated bin boundaries> | 'alt1bins-file='<file>]

              Empirical allele frequency report.  By default, only founders are
              considered.  Dosages are taken into account (e.g. heterozygous haploid
              calls count as 0.5).  Supported column sets are:

                chrom: Chromosome ID.
                pos: Base-pair coordinate.
                (ID is always present, and positioned here.)
                ref: Reference allele.
                alt1: Alternate allele 1.
                alt: All alternate alleles, comma-separated.
                reffreq: Reference allele frequency/dosage.
                alt1freq: Alt1 frequency/dosage.
                altfreq: Comma-separated frequencies/dosages for all alternate alleles.
                freq: Similar to altfreq, except ref is also included at the start.
                eq: Comma-separated <allele>=<freq> for all present alleles.  (If no
                    alleles are present, the column contains a single '.'.)
                eqz: Same as eq, except zero-counts are included.
                alteq/alteqz: Same as eq/eqz, except reference allele is omitted.
                numeq: 0=<freq>,1=<freq>, etc.  Zero-counts are omitted.
                altnumeq: Same as numeq, except reference allele is omitted.
                machr2: Unphased MaCH imputation quality metric.
                minimac3r2: Phased Minimac3 imputation quality.
                nobs: Number of allele observations.

              The default is chrom,ref,alt,altfreq,nobs.

              Additional .afreq.{ref,alt1}.bins (or .acount.{ref,alt1}.bins with 'counts')
              file(s) are generated when 'refbins='/'refbins-file=' or
              'alt1bins='/'alt1bins-file=' is present; these report the total number of
              frequencies or counts in each left-closed, right-open interval.  (If you
              only want these histogram(s), and not the main report, add 'bins-only'.)
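
              The left-closed, right-open binning can be sketched in Python with the
              standard bisect module.  bin_counts() is a hypothetical name.  The sketch
              assumes an implicit first bin starting at 0 and a last bin that extends
              upward from the final boundary; this page does not spell out how the edge
              bins are defined.

```python
from bisect import bisect_right


def bin_counts(values, boundaries):
    """Count values per left-closed, right-open bin.

    Returns (bin_start, count) pairs.  A value equal to a boundary falls into
    the bin that starts at that boundary.
    """
    starts = [0.0] + sorted(boundaries)
    counts = [0] * len(starts)
    for value in values:
        if value < 0:
            raise ValueError("frequencies and counts cannot be negative")
        # bisect_right finds the first start greater than value; step back one bin.
        counts[bisect_right(starts, value) - 1] += 1
    return list(zip(starts, counts))
```

              With boundaries 0.1 and 0.5, the frequencies 0.0, 0.05, 0.1, 0.3, 0.5 and
              1.0 fall two apiece into the bins starting at 0, 0.1 and 0.5.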

       --geno-counts ['zs'] ['cols='<column set descriptor>]

              Variant-based hardcall genotype count report (considering both alleles
              simultaneously in the diploid case).  Nonfounders are now included; use
              --keep-founders if this is a problem.  Heterozygous haploid calls are treated
              as missing.  Supported column sets are:

                chrom: Chromosome ID.
                pos: Base-pair coordinate.
                (ID is always present, and positioned here.)
                ref: Reference allele.
                alt1: Alternate allele 1.
                alt: All alternate alleles, comma-separated.
                homref: Homozygous-ref count.
                refalt1: Heterozygous ref-alt1 count.
                refalt: Comma-separated het ref-altx counts.
                homalt1: Homozygous-alt1 count.
                altxy: Comma-separated altx-alty counts, in (1/1)-(1/2)-(2/2)-(1/3)-...
                       order.
                xy: Similar to altxy, except the reference allele is treated as alt0,
                    and the sequence starts (0/0)-(0/1)-(1/1)-(0/2)-...
                hapref: Haploid-ref count.
                hapalt1: Haploid-alt1 count.
                hapalt: Comma-separated haploid-altx counts.
                hap: Similar to hapalt, except ref is also included at the start.
                numeq: 0/0=<hom ref ct>,0/1=<het ref-alt1>,1/1=<hom alt1>,...,0=<hap ref>
                       etc.  Zero-counts are omitted.  (If all genotypes are missing, the
                       column contains a single '.'.)
                missing: Number of missing genotypes.
                nobs: Number of (nonmissing) genotype observations.

              The default is chrom,ref,alt,homref,refalt,altxy,hapref,hapalt,missing.
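
              The genotype ordering used by the 'altxy' and 'xy' column sets can be
              generated with a short Python sketch.  genotype_order() is a hypothetical
              helper; allele_ct is the total number of alleles at the variant, REF
              included.

```python
def genotype_order(allele_ct, include_ref):
    """List diploid genotypes in (1/1)-(1/2)-(2/2)-(1/3)-... order.

    With include_ref=True the reference allele is treated as allele 0 and the
    sequence starts (0/0)-(0/1)-(1/1)-(0/2)-..., as for the 'xy' column set.
    """
    first = 0 if include_ref else 1
    return [
        f"{low}/{high}"
        for high in range(first, allele_ct)
        for low in range(first, high + 1)
    ]
```

              For a variant with one REF and three ALT alleles, genotype_order(4, False)
              gives 1/1, 1/2, 2/2, 1/3, 2/3, 3/3.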

       --sample-counts ['zs'] ['cols='<column set descriptor>]

              Sample-based hardcall genotype count report.

              * Unknown-sex samples are treated as female.

              * Heterozygous haploid calls (MT included) are treated as missing.

              * As with other PLINK 2 commands, SNPs that have not been left-normalized
                are counted as non-SNP non-symbolic.  (Use e.g. --normalize when that's a
                problem.)

              * Supported column sets are:
                  maybefid: FID, if that column was present in the input.
                  fid: Force FID column to be written even when absent in the input.
                  (IID is always present, and positioned here.)
                  maybesid: SID, if that column was present in the input.
                  sid: Force SID column to be written even when absent in the input.
                  sex: '1' = male, '2' = female, 'NA' = missing.
                  hom: Homozygous genotype count.
                  homref: Homozygous-ref genotype count.
                  homalt: Homozygous-alt genotype count.
                  homaltsnp: Homozygous-alt SNP count.
                  het: Heterozygous genotype count.
                  refalt: Heterozygous ref-altx genotype count.
                  het2alt: Heterozygous altx-alty genotype count.
                  hetsnp: Heterozygous SNP count.
                  dipts: Diploid SNP transition count.
                  ts: SNP transition count (excluding chrY for females).
                  diptv: Diploid SNP transversion count.
                  tv: SNP transversion count.
                  dipnonsnpsymb: Diploid non-SNP, non-symbolic count.
                  nonsnpsymb: Non-SNP, non-symbolic count.
                  symbolic: Symbolic variant count.
                  nonsnp: Non-SNP count.
                  dipsingle: Number of singletons relative to this dataset, across just
                             diploid calls.  (Note that if the ALT allele in a chrX
                             biallelic variant appears in exactly one female and one
                             male, that counts as a singleton for just the female.)
                  single: Number of singletons relative to this dataset.
                  haprefwfemaley: Haploid-ref count, counting chrY for everyone.
                  hapref: Haploid-ref count, excluding chrY for females.
                  hapaltwfemaley: Haploid-alt count, counting chrY for everyone.
                  hapalt: Haploid-alt count, excluding chrY for females.
                  missingwfemaley: Missing call count, counting chrY for everyone.
                  missing: Missing call count, excluding chrY for females.
                The default is maybefid,maybesid,homref,homaltsnp,hetsnp,dipts,diptv,
                dipnonsnpsymb,dipsingle,haprefwfemaley,hapaltwfemaley,missingwfemaley.

              * The 'hetsnp', 'dipts'/'ts'/'diptv'/'tv', 'dipnonsnpsymb'/'nonsnpsymb',
                'symbolic', and 'nonsnp' columns count each ALT allele in a heterozygous
                altx-alty call separately, since they can be of different subtypes.  (I.e.
                if they are of the same subtype, the corresponding count is incremented by
                2.)  As a consequence, these columns are unaffected by variant split/join.
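
              The per-allele counting rule for heterozygous calls can be illustrated with
              a Python sketch restricted to transitions and transversions.  Both function
              names are hypothetical.  The sketch uses the standard definition (a
              transition stays within purines A/G or within pyrimidines C/T) and covers
              only heterozygous calls, because the text above describes only those.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def snp_subtype(ref, alt):
    """Return 'ts', 'tv', or None when ref/alt is not a single-base substitution."""
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        return None
    same_class = (ref in PURINES) == (alt in PURINES)
    return "ts" if same_class else "tv"


def het_call_increments(ref, alts, genotype):
    """Count ts/tv increments contributed by one heterozygous call.

    genotype is a pair of allele indexes: 0 = REF, 1.. = position in alts.
    Each ALT allele in the call is classified and counted on its own.
    """
    first, second = genotype
    if first == second:
        raise ValueError("this sketch only covers heterozygous calls")
    counts = {"ts": 0, "tv": 0}
    for idx in (first, second):
        if idx == 0:
            continue                      # REF copies contribute nothing
        subtype = snp_subtype(ref, alts[idx - 1])
        if subtype is not None:
            counts[subtype] += 1
    return counts
```

              A 1/2 call at a REF=A, ALT=C,T variant adds 2 to the transversion count,
              which matches what two separate ref-alt calls would add after the variant is
              split into biallelic records.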

       --missing ['zs'] [{sample-only | variant-only}]
                 ['scols='<column set descriptor>] ['vcols='<column set descriptor>]

              Generate sample- and variant-based missing data reports (or just one report if
              'sample-only'/'variant-only' is specified).  As of alpha 2, mixed MT hardcalls
              appear in the heterozygous haploid stats.  Supported column sets in the
              sample-based report are:

                maybefid: FID, if that column was present in the input.
                fid: Force FID column to be written even when absent in the input.
                (IID is always present, and positioned here.)
                maybesid: SID, if that column was present in the input.
                sid: Force SID column to be written even when absent in the input.
                misspheno1: First active phenotype missing (Y/N)?  Always 'Y' if no
                            phenotypes are loaded.
                missphenos: A Y/N column for each loaded phenotype.  (Can be combined
                            with misspheno1 to force at least one such column.)
                nmissdosage: Number of missing dosages.
                nmiss: Number of missing hardcalls, not counting het haploids.
                nmisshh: Number of missing hardcalls, counting het haploids.
                hethap: Number of heterozygous haploid hardcalls.
                nobs: Denominator (male count on chrY, otherwise total sample count).
                fmissdosage: Missing dosage rate.
                fmiss: Missing hardcall rate, not counting het haploids.
                fmisshh: Missing hardcall rate, counting het haploids.

              The default is maybefid,maybesid,missphenos,nmiss,nobs,fmiss.

              Supported column sets in the variant-based report are:

                chrom: Chromosome ID.
                pos: Base-pair coordinate.
                (ID is always present, and positioned here.)
                ref: Reference allele.
                alt1: Alternate allele 1.
                alt: All alternate alleles, comma-separated.
                nmissdosage: Number of missing dosages.
                nmiss: Number of missing hardcalls, not counting het haploids.
                nmisshh: Number of missing hardcalls, counting het haploids.
                hethap: Number of heterozygous haploid calls.
                nobs: Number of potentially valid calls.
                fmissdosage: Missing dosage rate.
                fmiss: Missing hardcall rate, not counting het haploids.
                fmisshh: Missing hardcall rate, counting het haploids.
                fhethap: Heterozygous haploid rate.

              The default is chrom,nmiss,nobs,fmiss.

       --hardy ['zs'] ['midp'] ['redundant'] ['cols='<column set descriptor>]

              Hardy-Weinberg exact test p-value report(s).

              * By default, only founders are considered; change this with --nonfounders.

              * chrX is now omitted from the main <output prefix>.hardy report.  Instead,
                (if present) it gets its own <output prefix>.hardy.x report based on the
                method described in Graffelman J, Weir BS (2016) Hardy-Weinberg
                equilibrium and the X chromosome.

              * For variants with k alleles where k>2, k separate 'biallelic' tests are
                performed, each reported on its own line.  However, biallelic variants
                are normally reported on a single line, since the counts/frequencies
                would be mirror-images and the p-values would be the same.  You can add
                the 'redundant' modifier to force biallelic variant results to be
                reported on two lines for parsing convenience.

              * There is currently no special handling of case/control phenotypes.

              Supported column sets are:

                chrom: Chromosome ID.
                pos: Base-pair coordinate.
                (ID is always present, and positioned here.)
                ref: Reference allele.
                alt1: Alternate allele 1.
                alt: All alternate alleles, comma-separated.
                (A1 is always present, and positioned here.)
                ax: Non-A1 allele(s), comma-separated.
                gcounts: Hom-A1 count, total number of het-A1 calls, and total number of
                         nonmissing calls with no copies of A1.  On chrX, these are
                         followed by male A1 and male non-A1 counts.
                gcount1col: gcounts values in a single comma-separated column.
                hetfreq: Observed and expected het-A1 frequencies.
                sexaf: Female and male A1 observed allele frequencies (chrX only).
                femalep: Female-only p/midp-value (chrX only).
                p: Hardy-Weinberg equilibrium exact test p/midp-value.

              The default is chrom,ax,gcounts,hetfreq,sexaf,p.

       --indep-pairwise <window size>['kb'] [step size (variant ct)]

              <unphased-hardcall-r^2 threshold>

              Generate a list of variants in approximate linkage equilibrium.
              * For multiallelic variants, major allele counts are used in the r^2
                computation.
              * With the 'kb' modifier, the window size is in kilobase instead of variant
                count units.  (Pre-'kb' space is optional, i.e.
                "--indep-pairwise 500 kb 0.5" and "--indep-pairwise 500kb 0.5" have the
                same effect.)
              * The step size now defaults to 1 if it's unspecified, and *must* be 1 if
                the window is in kilobase units.
              * Note that you need to rerun PLINK using --extract or --exclude on the
                .prune.in/.prune.out file to apply the list to another computation... and
                as with other applications of --extract/--exclude, duplicate variant IDs
                are a problem.  --indep-pairwise still runs to completion for now when
                duplicate variant IDs are present, but that will become an error in
                alpha 3.
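
              To make the "unphased-hardcall-r^2" quantity concrete, here is a minimal
              sketch that treats it as the squared Pearson correlation between two
              vectors of allele counts, skipping samples where either call is missing.
              The function name and the None-for-missing encoding are assumptions made
              for illustration; the windowing and the choice of which variant to drop
              are not modeled.

```python
def unphased_hardcall_r2(calls_a, calls_b):
    """Squared correlation of two hardcall vectors (0/1/2, None = missing).

    Returns None when fewer than two complete pairs remain or when either
    variant is monomorphic among the complete pairs.
    """
    pairs = [(a, b) for a, b in zip(calls_a, calls_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return None
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return None
    return cov * cov / (var_a * var_b)


# A pair exceeding the threshold would be a pruning candidate.
threshold = 0.5
r2 = unphased_hardcall_r2([0, 1, 2, 1, None], [0, 1, 2, 2, 1])
print(r2, r2 is not None and r2 > threshold)
```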

       --ld <variant ID> <variant ID> ['dosage'] ['hwe-midp']

              This displays diplotype frequencies, r^2, and D' for a single pair of
              variants.
              * For multiallelic variants, major allele counts/dosages are used.
              * Phase information is used when both variants are on the same chromosome.
              * When there is at least one sample with unphased het calls for both
                variants, diplotype frequencies are estimated using the Hill equation.
                If there are multiple biologically possible local maxima, all are
                displayed, along with HWE exact test statistics.
              * By default, only hardcalls are considered.  Add the 'dosage' modifier if
                you want dosages to be taken into account.  (In the diploid case, an
                unphased dosage of x is interpreted as P(0/0) = 1 - x, P(0/1) = x when x
                is in 0..1.)
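
              The dosage interpretation above can be sketched as follows.  The 0..1
              branch follows the text; the 1..2 branch is an assumed mirror image (the
              text only states the 0..1 case), and the function name is hypothetical.

```python
def unphased_dosage_to_probs(x):
    """Return (P(0/0), P(0/1), P(1/1)) for a diploid unphased dosage x."""
    if not 0.0 <= x <= 2.0:
        raise ValueError("diploid dosage must lie in [0, 2]")
    if x <= 1.0:
        # Stated in the text: P(0/0) = 1 - x, P(0/1) = x.
        return (1.0 - x, x, 0.0)
    # Assumption: symmetric rule for x in 1..2 (not stated in the text).
    return (0.0, 2.0 - x, x - 1.0)


print(unphased_dosage_to_probs(0.25))
print(unphased_dosage_to_probs(1.5))
```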

       --sample-diff ['id-delim='<char>] ['dosage' | 'dosage='<tolerance>]

              ['include-missing']  [{pairwise  |  counts-only}]   ['fname-id-delim='<c>]   ['zs']
              ['cols='<column  set  descriptor>] ['counts-cols='<column set descriptor>] {base= |
              ids=}<sample ID> [other sample ID(s)...]

       --sample-diff ['id-delim='<char>] ['dosage' | 'dosage='<tolerance>]

              ['include-missing']  [{pairwise  |  counts-only}]   ['fname-id-delim='<c>]   ['zs']
              ['cols='<column    set   descriptor>]   ['counts-cols='<column   set   descriptor>]
              file=<ID-pair file>

              (alias: --sdiff)
              Report discordances and discordance-counts between pairs of samples.  If
              chrX or chrY is present, sex must be defined and consistent.
              * There are three ways to specify which sample pairs to compare.  To
                compare a single baseline sample against some others, start the
                (space-delimited) sample ID list with 'base='.  To perform an all-vs.-all
                comparison, start it with 'ids=' instead.  To compare sample pairs listed
                in a file, use 'file='.
                Note that 'base='/'ids='/'file=' must be positioned after all modifiers.
              * Sample IDs are interpreted as if they were in a VCF header line, with
                'id-delim=' having the usual effect.
              * By default, comparisons are based on hardcalls.  Use 'dosage' to compare
                dosages instead; you can combine this with a tolerance in [0, 0.5).
              * By default, if one genotype is missing and the other isn't, that doesn't
                count as a difference; this can be changed with 'include-missing'.
              * By default, a single main report is written to
                <output prefix>[.<base ID>].sdiff.  To write separate pairwise
                <output prefix>.<ID1>.<ID2>.sdiff reports for each compared ID pair, add
                the 'pairwise' modifier.  To omit the main report, add the 'counts-only'
                modifier.
                (Note that, if you're only interested in nonmissing autosomal biallelic
                hardcalls, --make-king-table provides a more efficient way to compute
                just counts.)
              * By default, if an output filename has a multipart sample ID, the parts
                will be delimited by '_'; use 'fname-id-delim=' to change this.

              Supported main-report column sets are:

                chrom: Chromosome ID.
                pos: Base-pair coordinate.
                (Variant ID is always present, and positioned here.)
                ref: Reference allele.
                alt: All alternate alleles, comma-separated.
                maybefid: FID1/FID2, if that column was in the input.  Requires 'id'.
                fid: Force FID1/FID2 even when FID was absent in the input.
                id: IID1/IID2.
                maybesid: SID1/SID2, if that column was in the input.  Requires 'id'.
                sid: Force SID1/SID2 even when SID was absent in the input.
                geno: Unphased GT or DS for the two samples.

              The default is usually chrom,pos,ref,alt,maybefid,id,maybesid,geno; the
              sample IDs are removed from the default in 'pairwise' mode.

              Supported discordance-count-summary column sets are:

                maybefid: FID1/FID2, if that column was in the input.
                fid: Force FID1/FID2 even when FID was absent in the input.
                (IID1/IID2 are always present.)
                maybesid: SID1/SID2, if that column was in the input.
                sid: Force SID1/SID2 even when SID was absent in the input.
                nobs: Number of variants considered.  This includes variants where one
                      or both genotypes are missing iff 'include-missing' was specified.
                nobsibs: ibs0+ibs1+ibs2.
                ibs0: Number of diploid variants with no common hardcall alleles.
                ibs1: Number of diploid variants with exactly 1 common hardcall allele.
                ibs2: Number of diploid variants with both hardcall alleles matching.
                halfmiss: Number of variants with exactly 1 missing genotype/dosage.
                          Ignored without 'include-missing'.
                diff: Total number of differences.

              The default is maybefid,maybesid,nobs,halfmiss,diff.
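
              The discordance-count columns can be illustrated with a small sketch over
              biallelic diploid hardcalls coded as ALT allele counts (0/1/2), with None
              for a missing call.  The function name and the encoding are assumptions;
              treating a both-missing variant as observed but not different under
              'include-missing' is also our assumption, and dosage tolerance is not
              modeled.

```python
def discordance_counts(calls1, calls2, include_missing=False):
    """Count nobs/ibs0/ibs1/ibs2/halfmiss/diff for one pair of samples."""
    counts = {"nobs": 0, "ibs0": 0, "ibs1": 0, "ibs2": 0,
              "halfmiss": 0, "diff": 0}
    for g1, g2 in zip(calls1, calls2):
        n_missing = (g1 is None) + (g2 is None)
        if n_missing:
            if include_missing:
                counts["nobs"] += 1
                if n_missing == 1:
                    # One call missing, the other present: a difference
                    # only when 'include-missing' is in effect.
                    counts["halfmiss"] += 1
                    counts["diff"] += 1
            continue
        counts["nobs"] += 1
        shared = 2 - abs(g1 - g2)  # number of common hardcall alleles
        counts["ibs%d" % shared] += 1
        if g1 != g2:
            counts["diff"] += 1
    return counts


sample1 = [0, 1, 2, None, 2, None]
sample2 = [0, 2, 0, 1, 2, None]
print(discordance_counts(sample1, sample2))
print(discordance_counts(sample1, sample2, include_missing=True))
```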

       --make-king [{square | square0 | triangle}] [{zs | bin | bin4}]

              KING-robust kinship estimator, described by Manichaikul A, Mychaleckyj JC,
              Rich SS, Daly K, Sale M, Chen WM (2010) Robust relationship inference in
              genome-wide association studies.  By default, this writes a
              lower-triangular tab-delimited table of kinship coefficients to
              <output prefix>.king, and a list of the corresponding sample IDs to
              <output prefix>.king.id.  The first row of the .king file contains a
              single <genome 1-genome 2> kinship coefficient, the second row has the
              <genome 1-genome 3> and <genome 2-genome 3> kinship values in that order,
              etc.
              * Only autosomes are currently considered.
              * Pedigree information is currently ignored; the between-family estimator
                is used for all pairs.
              * For multiallelic variants, REF allele counts are used.
              * If the 'square' or 'square0' modifier is present, a square matrix is
                written instead; 'square0' fills the upper right triangle with zeroes.
              * If the 'zs' modifier is present, the .king file is Zstd-compressed.
              * If the 'bin' modifier is present, a binary (square) matrix of
                double-precision floating point values, suitable for loading from R, is
                instead written to <output prefix>.king.bin.  ('bin4' specifies
                single-precision numbers instead.)  This can be combined with 'square0'
                if you still want the upper right zeroed out, or 'triangle' if you don't
                want to pad the upper right at all.
              * The computation can be subdivided with --parallel.
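
              A minimal sketch of reading the default lower-triangular text layout into
              a symmetric matrix is shown below.  It parses a string rather than a file
              to stay self-contained; the function name is hypothetical, and filling the
              diagonal with 0.5 (self-kinship) is our assumption, since the file itself
              stores no diagonal.

```python
def read_king_triangle(text, diag=0.5):
    """Parse lower-triangular .king text into a full symmetric matrix.

    Row k of the file (1-based) holds the k kinship values between sample
    k+1 and samples 1..k, so n-1 rows describe n samples.
    """
    rows = [[float(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]
    n = len(rows) + 1
    matrix = [[diag if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, row in enumerate(rows, start=1):
        if len(row) != i:
            raise ValueError("row %d should have %d values" % (i, i))
        for j, value in enumerate(row):
            matrix[i][j] = matrix[j][i] = value
    return matrix


example = "0.25\n0.01\t-0.02\n"
for line in read_king_triangle(example):
    print(line)
```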

       --make-king-table ['zs'] ['counts'] ['rel-check'] ['cols='<col set descrip.>]

              Similar to --make-king, except results are reported in KING's original
              .kin0 text table format (with minor changes, e.g. row order is more
              friendly to incremental addition of samples), --king-table-filter can be
              used to restrict the report to high kinship values, and the 'rel-check'
              modifier can be used to restrict to same-FID pairs.

              Supported column sets are:

                maybefid: FID1/FID2, if that column was in the input.  Requires 'id'.
                fid: Force FID1/FID2 even when FID was absent in the input.
                id: IID1/IID2.
                maybesid: SID1/SID2, if that column was in the input.  Requires 'id'.
                sid: Force SID1/SID2 even when SID was absent in the input.
                nsnp: Number of variants considered (autosomal, neither call missing).
                hethet: Proportion/count of considered call pairs which are het-het.
                ibs0: Proportion/count of considered call pairs which are opposite homs.
                ibs1: HET1_HOM2 and HET2_HOM1 proportions/counts.
                kinship: KING-robust between-family kinship estimator.

              The default is maybefid,id,maybesid,nsnp,hethet,ibs0,kinship.
              hethet/ibs0/ibs1 values are proportions unless the 'counts' modifier is
              present.  If id is omitted, a .kin0.id file is also written.
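
              For orientation, the sketch below shows how the reported columns relate to
              the kinship value, using the between-family KING-robust estimator as it is
              usually stated from Manichaikul et al. (2010).  This is not taken from
              PLINK's source; the function and argument names are hypothetical, and the
              inputs may be counts or proportions as long as all four use the same
              scale.

```python
def king_robust_kinship(hethet, ibs0, het1_hom2, het2_hom1):
    """Between-family KING-robust kinship from pairwise genotype tallies.

    hethet:    both samples heterozygous
    ibs0:      opposite homozygotes
    het1_hom2: sample 1 het, sample 2 hom
    het2_hom1: sample 2 het, sample 1 hom
    """
    het1 = hethet + het1_hom2  # heterozygous calls in sample 1
    het2 = hethet + het2_hom1  # heterozygous calls in sample 2
    min_het = min(het1, het2)
    if min_het == 0:
        return None  # estimator undefined without heterozygous calls
    # Equivalent to 1/2 - sum((g1 - g2)^2) / (4 * min_het).
    return 0.5 - (4 * ibs0 + het1_hom2 + het2_hom1) / (4 * min_het)


print(king_robust_kinship(100, 0, 0, 0))    # identical genotypes
print(king_robust_kinship(50, 0, 50, 50))
```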

       --make-rel ['cov'] ['meanimpute'] [{square | square0 | triangle}]

              [{zs | bin | bin4}]

              Write a lower-triangular variance-standardized relationship matrix to
              <output prefix>.rel, and corresponding IDs to <output prefix>.rel.id.
              * This computation assumes that variants do not have very low MAF, or
                deviate greatly from Hardy-Weinberg equilibrium.
              * Also, it's usually best to perform this calculation on a variant set in
                approximate linkage equilibrium.
              * The 'cov' modifier replaces the variance-standardization step with basic
                mean-centering, causing a covariance matrix to be calculated instead.
              * The computation can be subdivided with --parallel.
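
              The difference between the default and 'cov' can be sketched on a toy
              genotype matrix.  This is a textbook-style computation under stated
              assumptions: no missing calls, allele frequencies estimated from the same
              samples, monomorphic variants skipped, and the result averaged over the
              variants used.  The function name is hypothetical and PLINK's exact
              normalization may differ.

```python
def relationship_matrix(genotypes, cov=False):
    """genotypes[v][s] is the allele count (0/1/2) of variant v in sample s."""
    n = len(genotypes[0])
    sums = [[0.0] * n for _ in range(n)]
    used = 0
    for calls in genotypes:
        mean = sum(calls) / n
        freq = mean / 2.0
        variance = 2.0 * freq * (1.0 - freq)  # binomial variance under HWE
        if variance == 0.0:
            continue  # assumption: skip monomorphic variants
        # 'cov' keeps mean-centering only; default also scales to unit variance.
        scale = 1.0 if cov else variance ** -0.5
        centered = [(g - mean) * scale for g in calls]
        for i in range(n):
            for j in range(i + 1):
                sums[i][j] += centered[i] * centered[j]
        used += 1
    if used == 0:
        raise ValueError("no polymorphic variants")
    for i in range(n):
        for j in range(i + 1):
            sums[i][j] /= used
            sums[j][i] = sums[i][j]
    return sums


toy = [[0, 1, 2, 1], [2, 1, 0, 1]]
print(relationship_matrix(toy)[0])
print(relationship_matrix(toy, cov=True)[0])
```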

       --make-grm-list ['cov'] ['meanimpute'] ['zs'] [{id-header | iid-only}]

       --make-grm-bin ['cov'] ['meanimpute'] [{id-header | iid-only}]

              --make-grm-list causes the relationships to be written to GCTA's original
              list format, which describes one pair per line, while --make-grm-bin
              writes them in GCTA 1.1+'s single-precision triangular binary format.
              Note that these formats explicitly report the number of valid observations
              (where neither sample has a missing call) for each pair, which is useful
              input for some scripts.

       --pca [count] [{approx | meanimpute}] ['scols='<col set descriptor>]

       --pca [{allele-wts | biallelic-var-wts}] [count] [{approx | meanimpute}]

              ['vzs'] ['scols='<col set descriptor>] ['vcols='<col set descriptor>]

              Extracts top principal components from the variance-standardized
              relationship matrix.
              * It is usually best to perform this calculation on a variant set in
                approximate linkage equilibrium, with no very-low-MAF variants.
              * By default, 10 PCs are extracted; you can adjust this by passing a
                numeric argument.  (Note that 10 is lower than the PLINK 1.9 default of
                20; this is due to the randomized algorithm's memory footprint growing
                quadratically w.r.t. the PC count.)
              * The 'approx' modifier causes the standard deterministic computation to be
                replaced with the randomized algorithm originally implemented for
                Galinsky KJ, Bhatia G, Loh PR, Georgiev S, Mukherjee S, Patterson NJ,
                Price AL (2016) Fast Principal-Component Analysis Reveals Convergent
                Evolution of ADH1B in Europe and East Asia.
                This can be a good idea when you have >5k samples, and is almost required
                with >50k.
              * The randomized algorithm always uses mean imputation for missing genotype
                calls.  For comparison purposes, you can use the 'meanimpute' modifier to
                request this behavior for the standard computation.
              * 'scols=' can be used to customize how sample IDs appear in the .eigenvec
                file.  (maybefid, fid, maybesid, and sid supported; default is
                maybefid,maybesid.)
              * The 'allele-wts' modifier requests an additional one-line-per-allele
                .eigenvec.allele file with PCs expressed as allele weights instead of
                sample weights.  When it's present, 'vzs' causes the .eigenvec.allele
                file to be Zstd-compressed.  'vcols=' can be used to customize the report
                columns; supported column sets are:

                  chrom: Chromosome ID.
                  pos: Base-pair coordinate.
                  (ID is always present, and positioned here.)
                  ref: Reference allele.
                  alt1: Alternate allele 1.
                  alt: All alternate alleles, comma-separated.
                  (A1 is always present, and positioned here.)
                  ax: Non-A1 alleles, comma-separated.
                  (PCs are always present, and positioned here.)

                Default is chrom,ref,alt.

              * For datasets with no multiallelic variants, the 'biallelic-var-wts'
                modifier requests the old .eigenvec.var format, which only reports
                weights for major alleles.  (These weights are 2x the corresponding
                .eigenvec.allele weights.)  Supported column sets are:

                  chrom: Chromosome ID.
                  pos: Base-pair coordinate.
                  (ID is always present, and positioned here.)
                  ref: Reference allele.
                  alt1: Alternate allele 1.
                  alt: All alternate alleles, comma-separated.
                  maj: Major allele.
                  nonmaj: Minor allele.
                  (PCs are always present, and positioned here.  Signs are w.r.t. the
                  major, not necessarily reference, allele.)

                Default is chrom,maj,nonmaj.

       --king-cutoff [.king.bin + .king.id fileset prefix] <threshold>

              Exclude one member of each pair of samples with KING-robust  kinship  greater  than
              the  given  threshold.   Remaining/excluded  sample  IDs  are  written  to  <output
              prefix>.king.cutoff.in.id + .king.cutoff.out.id.  If present,  the  .king.bin  file
              must be triangular (either precision is ok).
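
              The effect of the cutoff can be sketched as a graph-pruning problem:
              samples are nodes, pairs above the threshold are edges, and samples are
              removed until no edge remains.  The greedy "remove the sample with the
              most over-threshold relatives first" rule below is an assumption chosen
              for illustration; the text does not describe PLINK's selection rule, and
              the function name and dict-based input are hypothetical.

```python
def king_cutoff(kinship, threshold):
    """Return (kept, removed) sample ID lists.

    kinship maps (id1, id2) tuples to KING-robust kinship values.
    """
    related = {}
    for (a, b), value in kinship.items():
        related.setdefault(a, set())
        related.setdefault(b, set())
        if value > threshold:
            related[a].add(b)
            related[b].add(a)
    removed = []
    while related:
        # Highest-degree sample; ties broken by sorted ID order.
        worst = max(sorted(related), key=lambda s: len(related[s]))
        if not related[worst]:
            break  # no over-threshold pairs remain
        for other in related.pop(worst):
            related[other].discard(worst)
        removed.append(worst)
    return sorted(related), removed


pairs = {("A", "B"): 0.3, ("A", "C"): 0.3, ("B", "C"): 0.01, ("C", "D"): 0.0}
print(king_cutoff(pairs, 0.177))
```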

       --write-covar ['cols='<column set descriptor>]

              If  covariates  are  defined,  an  updated  version  (with  all filters applied) is
              automatically   written    to    <output    prefix>.cov    whenever    --make-pgen,
              --make-just-psam,  --export,  or  a similar command is present.  However, if you do
              not wish to simultaneously generate a new sample file, you can use --write-covar to
              just produce a pruned covariate file.  Supported column sets are:

                maybefid: FID, if that column was in the input.
                fid: Force FID column to be written even when absent in the input.
                maybesid: SID, if that column was in the input.
                sid: Force SID column to be written even when absent in the input.
                maybeparents: Father/mother IIDs ('0' = missing), if columns in input.
                parents: Force PAT/MAT columns to be written even when absent in input.
                sex: '1' = male, '2' = female, 'NA' = missing.
                pheno1: First active phenotype.  If none, all column entries are set to
                        the --output-missing-phenotype string.
                phenos: All active phenotypes, if any.  (Can be combined with pheno1 to
                        force at least one phenotype column to be written.)
                (Covariates are always present, and positioned here.)

              The default is maybefid,maybesid.

       --write-samples

              Report IDs of all samples which pass your filters/inclusion thresholds.

       --write-snplist ['zs'] ['allow-dups']

              List all variants which pass your filters/inclusion thresholds.  Unless the
              'allow-dups' modifier is provided, this now errors out when duplicate
              variant ID(s) remain.

       --glm ['zs'] ['omit-ref'] [{sex | no-x-sex}] ['log10'] ['pheno-ids']

              [{genotypic  |  hethom  |   dominant   |   recessive}]   ['interaction']   ['skip']
              ['hide-covar'] [{no-firth | firth-fallback | firth}] ['intercept'] ['cols='<col set
              desc>] ['local-covar='<file>] ['local-psam='<file>] ['local-pos-cols='<key col  #s>
              |        'local-pvar='<file>]       ['local-haps']       ['local-omit-last'       |
              'local-cats[0]='<category ct>] ['allow-no-covars']

              Basic association analysis on quantitative and/or case/control phenotypes.
              For each variant, a linear (for quantitative traits) or logistic (for
              case/control) regression is run with the phenotype as the dependent
              variable, and nonmajor allele dosage(s) and a constant-1 column as
              predictors.
              * There is usually an additive effect line for every nonmajor allele, and
                no such line for the major allele.  To omit REF alleles instead of major
                alleles, add the 'omit-ref' modifier.  (When performing interaction
                testing, this tends to cause the multicollinearity check to fail for
                low-ref-frequency variants.)
              * By default, sex (male = 1, female = 2; note that this is a change from
                PLINK 1.x) is automatically added as a predictor for X chromosome
                variants, and no others.  The 'sex' modifier causes it to be added
                everywhere (except chrY), while 'no-x-sex' excludes it entirely.
              * The 'log10' modifier causes p-values to be reported in -log10(p) form.
              * 'pheno-ids' causes the samples used in each set of regressions to be
                written to an .id file.  (When the samples differ on chrX or chrY, .x.id
                and/or .y.id files are also written.)
              * The 'genotypic' modifier adds an additive effect/dominance deviation 2df
                joint test (0-2 and 0..1..0 coding), while 'hethom' uses 0..0..1 and
                0..1..0 coding instead.
              * 'dominant' and 'recessive' specify a model assuming full dominance or
                recessiveness, respectively, for the ref allele.  I.e. the genotype
                column is recoded as 0..1..1 or 0..0..1, respectively.
              * 'interaction' adds genotype x covariate interactions to the model.  Note
                that this tends to produce 'NA' results (due to the multicollinearity
                check) when the reference allele is 'wrong'; --maj-ref can be used to
                enable analysis of those variants.
              * Additional predictors can be added with --covar.  By default, association
                statistics are reported for all nonconstant predictors; 'hide-covar'
                suppresses covariate-only results, while 'intercept' causes intercepts
                to be reported.
                Since running --glm without at least e.g. principal component covariates
                is usually an analytical mistake, the 'allow-no-covars' modifier is now
                required when you're intentionally running --glm without a covariate
                file.
              * By default, if the current phenotype and covariates are such that every
                regression on a chromosome will fail, PLINK 2 errors out.  To just skip
                the phenotype or chromosome instead, add the 'skip' modifier.
              * There are now three regression modes for case/control phenotypes:
                * 'no-firth' requests PLINK 1.x's behavior, where a NA result is reported
                  when basic logistic regression fails to converge.
                * 'firth-fallback' requests logistic regression, followed by Firth
                  regression whenever the logistic regression fails to converge.  This is
                  now the default.
                * 'firth' requests Firth regression all the time.
              * To add covariates which are not constant across all variants, add the
                'local-covar=' and 'local-psam=' modifiers, use full filenames for each,
                and use either 'local-pvar=' or 'local-pos-cols=' to provide variant ID
                or position information.
                Normally, the local-covar file should have c * n real-valued columns,
                where the first c columns correspond to the first sample in the
                local-psam file, columns (c+1) to 2c correspond to the second sample,
                etc.; and the mth line corresponds to the mth nonheader line of the
                local-pvar file when there is one.  (Variants outside of the local-pvar
                file are excluded from the regression.)  The local covariates are
                assigned the names LOCAL1, LOCAL2, etc.; to exclude the last local
                covariate from the regression (necessary if they are e.g. local ancestry
                coefficients which sum to 1), add 'local-omit-last'.
                Alternatively, with 'local-cats='<k>, the local-covar file is expected to
                have n columns with integer-valued entries in [1, k].  (This range is
                [0, k-1] with 'local-cats0='.)  These category assignments are expanded
                into (k-1) local covariates in the usual manner.
                When position information is in the local-covar file, this should be
                indicated by 'local-pos-cols='<number of header rows>,<chrom col #>,<pos
                start col #>,<first covariate col #>.
                'local-haps' indicates that there's one column or column-group per
                haplotype instead of per sample; they are averaged by --glm.
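
              The local-covar column layout described above (c consecutive columns per
              sample, samples in local-psam order) can be sketched as follows.  Both
              function names are hypothetical.  For 'local-cats=', the text only says
              the categories are expanded "in the usual manner"; using the last
              category as the reference level is our assumption.

```python
def split_local_covar_line(line, n_samples, n_covars):
    """Split one local-covar line into per-sample lists of c covariates."""
    values = [float(tok) for tok in line.split()]
    if len(values) != n_samples * n_covars:
        raise ValueError("expected %d columns, found %d"
                         % (n_samples * n_covars, len(values)))
    # Sample s (0-based) owns columns s*c .. (s+1)*c - 1.
    return [values[s * n_covars:(s + 1) * n_covars] for s in range(n_samples)]


def expand_local_category(category, k, zero_based=False):
    """Expand a category in [1, k] (or [0, k-1]) into k-1 indicator covariates."""
    index = category if zero_based else category - 1
    if not 0 <= index < k:
        raise ValueError("category out of range")
    # Assumption: the last category is the reference (all-zero) level.
    return [1.0 if index == j else 0.0 for j in range(k - 1)]


print(split_local_covar_line("0.1 0.9 0.4 0.6", n_samples=2, n_covars=2))
print(expand_local_category(1, 3), expand_local_category(3, 3))
```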

              The main report supports the following column sets:

                chrom: Chromosome ID.
                pos: Base-pair coordinate.
                (ID is always present, and positioned here.)
                ref: Reference allele.
                alt1: Alternate allele 1.
                alt: All alternate alleles, comma-separated.
                (A1 is always present, and positioned here.  For multiallelic variants,
                this column may contain multiple comma-separated alleles when the result
                doesn't depend on which allele is A1.)
                ax: Non-A1 alleles, comma-separated.
                a1count: A1 allele count (can be decimal with dosage data).
                totallele: Allele observation count (can be higher than --freq value, due
                           to inclusion of het haploids and chrX model).
                a1countcc: A1 count in cases, then controls (case/control only).
                totallelecc: Case and control allele observation counts.
                gcountcc: Genotype hardcall counts (neither-A1, het-A1, A1-A1) in cases,
                          then controls (case/control only).
                a1freq: A1 allele frequency.
                a1freqcc: A1 frequency in cases, then controls (case/control only).
                machr2: Unphased MaCH imputation quality (frequently labeled 'INFO').
                firth: Reports whether Firth regression was used (firth-fallback only).
                test: Test identifier.  (Required unless only one test is run.)
                nobs: Number of samples in the regression.
                beta: Regression coefficient (for A1 if additive test).
                orbeta: Odds ratio for case/control, beta for quantitative traits.
                        (Ignored if 'beta' column set included.)
                se: Standard error of beta.
                ci: Bounds of symmetric approximate confidence interval (requires --ci).
                tz: T-statistic for linear regression, Wald Z-score for logistic/Firth.
                p: Asymptotic p-value (or -log10(p)) for T/Z-statistic.
                err: Error code for NA results.

              The default is chrom,pos,ref,alt,firth,test,nobs,orbeta,se,ci,tz,p,err.
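
              The genotype codings named for the 'genotypic', 'hethom', 'dominant' and
              'recessive' modifiers of --glm can be written out as lookup tables over
              hardcall allele counts 0, 1, 2.  This sketch reads "0-2" as the additive
              0..1..2 coding; the predictor labels (ADD, DOMDEV, and so on) and the
              function name are illustrative and not guaranteed to match the report's
              TEST column.

```python
# Each tuple gives the predictor value for allele counts 0, 1 and 2.
CODINGS = {
    "additive": {"ADD": (0, 1, 2)},
    "genotypic": {"ADD": (0, 1, 2), "DOMDEV": (0, 1, 0)},
    "hethom": {"HOM": (0, 0, 1), "HET": (0, 1, 0)},
    "dominant": {"DOM": (0, 1, 1)},
    "recessive": {"REC": (0, 0, 1)},
}


def recode_genotypes(hardcalls, model):
    """Map hardcall allele counts to the predictor column(s) of a model."""
    return {name: [codes[g] for g in hardcalls]
            for name, codes in CODINGS[model].items()}


calls = [0, 1, 2, 1]
for model in ("genotypic", "hethom", "dominant", "recessive"):
    print(model, recode_genotypes(calls, model))
```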

       --score <filename> [i] [j] [k] [{header | header-read}]

              [{center  |  variance-standardize  |  dominant | recessive}] ['no-mean-imputation']
              ['se'] ['zs'] ['ignore-dup-ids'] [{list-variants | list-variants-zs}]  ['cols='<col
              set descriptor>]

              Apply linear scoring system(s) to each sample.  The input file should have
              one line per scored (variant, allele) pair.  Variant IDs are read from
              column #i and allele codes are read from column #j, where i defaults to 1
              and j defaults to i+1.
              * By default, a single column of input coefficients is read from column #k,
                where k defaults to j+1.  (--score-col-nums can be used to specify
                multiple columns.)
              * 'header-read' causes the first line of the input file to be treated as a
                header line containing score names.  Otherwise, score(s) are assigned the
                names 'SCORE1', 'SCORE2', etc.; and 'header' just causes the first line
                to be entirely ignored.
              * By default, copies of unnamed alleles contribute zero to score, while
                missing genotypes contribute an amount proportional to the loaded (via
                --read-freq) or imputed allele frequency.  To throw out missing
                observations instead (decreasing the denominator in the final average
                when this happens), use the 'no-mean-imputation' modifier.
              * You can use the 'center' modifier to shift all genotypes to mean zero, or
                'variance-standardize' to linearly transform the genotypes to mean-0,
                variance-1.
              * The 'dominant' modifier causes dosages greater than 1 to be treated as 1,
                while 'recessive' uses max(dosage - 1, 0) on diploid chromosomes.
                ('dominant', 'recessive', and 'variance-standardize' cannot be used with
                chrX.)
              * The 'se' modifier causes the input coefficients to be treated as
                independent standard errors; in this case, standard errors for the score
                average/sum are reported.  (Note that this will systematically
                underestimate standard errors when scored variants are in LD.)
              * By default, --score errors out if a variant ID in the input file appears
                multiple times in the main dataset.  Use the 'ignore-dup-ids' modifier to
                skip them instead (a warning is still printed if such variants are
                present).
              * The 'list-variants[-zs]' modifier causes variant IDs used for scoring to
                be written to <output prefix>.sscore.vars[.zst].

              The main report supports the following column sets:

                maybefid: FID, if that column was in the input.
                fid: Force FID column to be written even when absent in the input.
                (IID is always present, and positioned here.)
                maybesid: SID, if that column was in the input.
                sid: Force SID column to be written even when absent in the input.
                pheno1: First active phenotype.
                phenos: All active phenotypes, if any.
                nallele: Number of nonmissing alleles.
                denom: Denominator of score average (equal to nallele value when
                       'no-mean-imputation' specified).
                dosagesum: Sum of named allele dosages.
                scoreavgs: Score averages.
                scoresums: Score sums.

              The default is maybefid,maybesid,phenos,nallele,dosagesum,scoreavgs.

              For more sophisticated polygenic risk scoring, we recommend looking at the
              LDpred (https://github.com/bvilhjal/ldpred) and PRSice-2
              (https://www.prsice.info/) software packages.
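
              The default scoring rule and the effect of 'no-mean-imputation' can be
              sketched for a single diploid sample as below.  The function name and
              input layout are hypothetical; the sketch assumes one coefficient per
              variant, dosages of the named allele in 0..2 with None for missing, and
              a denominator counted in alleles (2 per diploid variant), which matches
              the 'denom' description but is otherwise an assumption.

```python
def score_sample(dosages, coefs, freqs, no_mean_imputation=False):
    """Return (score sum, score average, denominator) for one sample."""
    total = 0.0
    denom = 0
    for dosage, coef, freq in zip(dosages, coefs, freqs):
        if dosage is None:
            if no_mean_imputation:
                continue  # drop the observation and shrink the denominator
            dosage = 2.0 * freq  # expected dosage under the allele frequency
        total += coef * dosage
        denom += 2
    average = total / denom if denom else None
    return total, average, denom


dosages = [2, None, 1]
coefs = [0.5, 1.0, -0.2]
freqs = [0.3, 0.25, 0.5]
print(score_sample(dosages, coefs, freqs))
print(score_sample(dosages, coefs, freqs, no_mean_imputation=True))
```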

       --variant-score <filename> ['zs'] ['bin' | 'cols='<col set descriptor>]

              (alias: --vscore)
              Apply linear scoring system(s) to each variant.  Each reported variant
              score is the dot product of a sample-weight vector with the
              total-ALT-dosage vector, with MAF-based mean imputation applied to missing
              dosages.
              Input file format: one line per sample, each starting with an ID and
              followed by scoring weight(s); it can also have a header line with the
              sample ID representation and the score name(s).

              The usual .vscore text report supports the following column sets:

                chrom: Chromosome ID.
                pos: Base-pair coordinate.
                (ID is always present, and positioned here.)
                ref: Reference allele.
                alt1: Alternate allele 1.
                alt: All alternate alleles, comma-separated.
                altfreq: ALT allele frequency used for mean-imputation.
                nmiss: Number of missing (and thus mean-imputed) dosages.
                nobs: Number of (nonmissing) sample observations.
                (Variant scores are always present, and positioned here.)

              Default is chrom,pos,ref,alt.

              If binary output is requested instead, the main .vscore.bin matrix
              contains double-precision floating-point values, column (score) ID(s) are
              saved to <output prefix>.vscore.cols, and variant IDs are saved to
              <output prefix>.vscore.vars[.zst].

       --adjust-file <filename> ['zs'] ['gc'] ['cols='<column set descriptor>]

              ['log10'] ['input-log10'] ['test='<test name, case-sensitive>]

              Given a file with unfiltered association test results, report some basic
              multiple-testing corrections, sorted in increasing-p-value order.
              * 'gc' causes genomic-controlled p-values to be used in the formulas.
                (This tends to be overly conservative.  We note that LD Score regression
                usually does a better job of calibrating lambda; see Lee JJ, McGue M,
                Iacono WG, Chow CC (2018) The accuracy of LD Score regression as an
                estimator of confounding and genetic correlations in genome-wide
                association studies.)
              * 'log10' causes negative base 10 logs of p-values to be reported, instead
                of raw p-values.  'input-log10' specifies that the input file contains
                -log10(p) values.
              * If the input file contains multiple tests per variant which are
                distinguished by a 'TEST' column (true for --linear/--logistic/--glm),
                you must use 'test=' to select the test to process.

              The following column sets are supported:

              chrom: Chromosome ID.
              pos: Base-pair coordinate.
              (ID is always present, and positioned here.)
              ref: Reference allele.
              alt1: Alternate allele 1.
              alt: All alternate alleles, comma-separated.
              a1: Tested allele.  (Omitted if missing from input file.)
              unadj: Unadjusted p-value.
              gc: Devlin & Roeder (1999) genomic control corrected p-value (additive
                  models only).
              qq: P-value quantile.
              bonf: Bonferroni correction.
              holm: Holm-Bonferroni (1979) adjusted p-value.
              sidakss: Sidak single-step adjusted p-value.
              sidaksd: Sidak step-down adjusted p-value.
              fdrbh: Benjamini & Hochberg (1995) step-up false discovery control.
              fdrby: Benjamini & Yekutieli (2001) step-up false discovery control.

              Default set is chrom,a1,unadj,gc,bonf,holm,sidakss,sidaksd,fdrbh,fdrby.
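
              As an illustration of three of the corrections named above (bonf, holm,
              fdrbh), here is a minimal Python sketch using the textbook definitions.  It
              is not PLINK's own code; the function names are hypothetical, and any
              tie-handling or genomic-control details of the real implementation are not
              modeled.

```python
def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Holm step-down: scale the k-th smallest p by (m - k), enforce monotonicity."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * pvals[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


def benjamini_hochberg(pvals):
    """BH step-up: scale the k-th smallest p by m / k, enforce monotonicity from the top."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):
        i = order[rank]
        candidate = pvals[i] * m / (rank + 1)
        running_min = min(running_min, candidate)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    p = [0.01, 0.04, 0.03, 0.005]
    print(bonferroni(p))
    print(holm(p))
    print(benjamini_hochberg(p))
```

              Results are returned in input order, so they can be joined back to the
              variant IDs before sorting by p-value.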

       --genotyping-rate ['dosage']

              Report genotyping rate in log (this was automatic in PLINK 1.x).

       --pgen-info

              Reports basic information about a .pgen file.

       --validate

              Validates all variant records in a .pgen file.

       --zst-decompress <.zst file> [output filename]

              (alias:  --zd)  Decompress  a  Zstd-compressed  file.   If  no  output  filename is
              specified, the file is decompressed to standard output.  This cannot be  used  with
              any other flags, and does not cause a log file to be generated.

       The following other flags are supported.

       --script <fname>
              : Include command-line options from file.

       --rerun [log]
              : Rerun commands in log (default 'plink2.log').

       --version
              : Display only version number before exiting.

       --silent
              : Suppress regular output to console.  (Error-output is not suppressed.)

       --double-id
              : Set both FIDs and IIDs to the VCF/.bgen sample ID.

       --const-fid [ID]
              :  Set  all  FIDs  to  the  given constant.  If '0' (the default), no FID column is
              created.

       --id-delim [d]
              :   Normally   parses   single-delimiter   sample   IDs   as   <FID><d><IID>,   and
              double-delimiter   IDs   as   <FID><d><IID><d><SID>;   default  delimiter  is  '_'.
              --id-delim can no longer be used with --double-id/--const-fid; it will error out if
              any ID lacks the delimiter.

       --idspace-to <c>
              : Convert spaces in VCF/.bgen sample IDs to the given character.

       --iid-sid
              :  Make  --id-delim  and  --sample-diff  interpret  two-token sample IDs as IID-SID
              instead of FID-IID.
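
              The --id-delim and --iid-sid rules can be sketched in Python as follows.
              This is only an illustration of the parsing rules described above; the
              function name and the dictionary return shape are hypothetical, and
              rejecting IDs with more than two delimiters is an assumption.

```python
def parse_sample_id(sample_id, delim="_", iid_sid=False):
    """Split a VCF/.bgen sample ID into FID/IID/SID parts (None = not present)."""
    tokens = sample_id.split(delim)
    if len(tokens) == 2:
        if iid_sid:
            # --iid-sid: two-token IDs are read as IID-SID.
            return {"FID": None, "IID": tokens[0], "SID": tokens[1]}
        return {"FID": tokens[0], "IID": tokens[1], "SID": None}
    if len(tokens) == 3:
        return {"FID": tokens[0], "IID": tokens[1], "SID": tokens[2]}
    # An ID lacking the delimiter is an error per the text above; treating
    # more than two delimiters as an error too is an assumption of this sketch.
    raise ValueError(f"cannot parse sample ID {sample_id!r} with delimiter {delim!r}")


if __name__ == "__main__":
    print(parse_sample_id("FAM1_IND1"))
    print(parse_sample_id("IND1_S2", iid_sid=True))
    print(parse_sample_id("FAM1_IND1_S2"))
```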

       --vcf-require-gt
              : Skip variants with no GT field.

       --vcf-min-gq <val>
              : No-call genotypes when GQ is present and below the threshold.

       --vcf-max-dp <val>
       --vcf-min-dp <val>
              : No-call genotypes when DP is present and above/below the threshold.

       --vcf-half-call <m>
              : Specify how '0/.' and similar VCF GT values should be handled.  The
              following four modes are supported:
              * 'error'/'e' (default) errors out and reports line #.
              * 'haploid'/'h' treats them as haploid calls.
              * 'missing'/'m' treats them as missing.
              * 'reference'/'r' treats the missing value as 0.

       --oxford-single-chr <chr name>
              : Specify single-chromosome .gen/.bgen file with no useful chromosome info inside.

       --missing-code [string list]
              : Comma-delimited list of missing phenotype values for Oxford-format import
              (default 'NA').  (alias: --missing_code)

       --hard-call-threshold <val>
              : When importing dosage data, a hardcall is normally saved when the distance
              from the nearest hardcall, defined as

                0.5 * sum_i |x_i - round(x_i)|

              (where the x_i's are 0..2 allele dosages), is not greater than 0.1.  You can
              adjust this threshold by providing a numeric argument to
              --hard-call-threshold.  You can also use this with --make-[b]pgen to alter
              the saved hardcalls while leaving the dosages untouched, or --make-bed to
              tweak hardcall export.
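
              The distance-from-hardcall formula can be evaluated directly.  The Python
              sketch below is illustrative only (the function names are hypothetical); it
              takes one dosage per allele, so a biallelic genotype with ALT dosage 1.08 is
              passed as (0.92, 1.08).

```python
def distance_from_hardcall(allele_dosages):
    """0.5 * sum_i |x_i - round(x_i)| over the per-allele dosages (each in 0..2)."""
    return 0.5 * sum(abs(x - round(x)) for x in allele_dosages)


def keeps_hardcall(allele_dosages, threshold=0.1):
    """A hardcall is saved when the distance is not greater than the threshold."""
    return distance_from_hardcall(allele_dosages) <= threshold


if __name__ == "__main__":
    print(distance_from_hardcall((0.92, 1.08)), keeps_hardcall((0.92, 1.08)))
    print(distance_from_hardcall((0.70, 1.30)), keeps_hardcall((0.70, 1.30)))
```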

       --dosage-erase-threshold <val>
              :  --hard-call-threshold normally preserves the original dosages, and several PLINK
              2 commands use them when they're available.  Use --dosage-erase-threshold  to  make
              PLINK  2  erase  dosages and keep only hardcalls when distance-from-hardcall <= the
              given level.

       --import-dosage-certainty <val> : The PLINK 2 file format currently supports
              a single dosage for each allele.  Some other dosage file formats include a separate
              probability   for   every   possible   genotype,   e.g.  {P(0/0)=0.2,  P(0/1)=0.52,
              P(1/1)=0.28}, a highly uncertain call that is nevertheless treated  as  a  hardcall
              under  '--hard-call-threshold  0.1'.   To  make  PLINK  2 treat a dosage as missing
              whenever   the   largest   probability   is   less   than    a    threshold,    use
              --import-dosage-certainty.
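
              To make the example above concrete, the following Python sketch converts
              biallelic genotype probabilities to an ALT dosage, applies the hardcall
              distance rule, and then applies a certainty cutoff.  The function name and
              return shape are hypothetical, and treating the hardcall as missing
              together with the dosage is this sketch's reading of the text.

```python
def import_genotype_probs(p_hom_ref, p_het, p_hom_alt,
                          hard_call_threshold=0.1, certainty=None):
    """Return (alt_dosage, hardcall); None marks a missing value."""
    # Certainty filter: missing when the largest probability is below the cutoff.
    if certainty is not None and max(p_hom_ref, p_het, p_hom_alt) < certainty:
        return None, None
    alt = p_het + 2.0 * p_hom_alt   # expected ALT allele count
    ref = 2.0 - alt                 # expected REF allele count
    distance = 0.5 * (abs(ref - round(ref)) + abs(alt - round(alt)))
    hardcall = round(alt) if distance <= hard_call_threshold else None
    return alt, hardcall


if __name__ == "__main__":
    # {P(0/0)=0.2, P(0/1)=0.52, P(1/1)=0.28}: dosage 1.08, distance 0.08.
    print(import_genotype_probs(0.2, 0.52, 0.28))
    print(import_genotype_probs(0.2, 0.52, 0.28, certainty=0.9))
```

              Without a certainty cutoff the uncertain genotype still yields a
              heterozygous hardcall, because its ALT dosage of 1.08 lies within 0.1 of 1.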

       --input-missing-genotype <c> : '.' is always interpreted as a missing
              genotype  code in input files.  By default, '0' also is; you can change this second
              missing code with --input-missing-genotype.

       --allow-extra-chr
              : Permit unrecognized chromosome codes (alias --aec).

       --chr-set <autosome ct> ['no-x'] ['no-y'] ['no-xy'] ['no-mt']
              : Specify a nonhuman chromosome set.  The first parameter sets the number of
              diploid autosome pairs if positive, or haploid chromosomes if negative.
              Given diploid autosomes, the remaining modifiers indicate the absence of the
              named non-autosomal chromosomes.

       --cow/--dog/--horse/--mouse/--rice/--sheep : Shortcuts for those species.

       --autosome-num <val>
              : Alias for '--chr-set <value> no-y no-xy no-mt'.

       --human
              : Explicitly specify human chromosome set, and make output .pvar/VCF files  include
              a  ##chrSet  header  line.   (.pvar/VCF output files automatically include ##chrSet
              when a nonhuman set is specified.)

       --chr-override ['file'] : By default, if --chr-set/--autosome-num/--cow/etc.
              conflicts with an input  file  ##chrSet  header  line,  PLINK  2  will  error  out.
              --chr-override  with  no  argument  causes  the  command  line  to take precedence;
              '--chr-override file' defers to the file.

       --var-min-qual <val>
              : Skip variants with low/missing QUAL.

       --var-filter [exception(s)...]
              : Skip variants which have FILTER failures.

       --extract-if-info <key> <op> <val>
       --exclude-if-info <key> <op> <val>
              : Exclude variants which don't/do satisfy a comparison predicate on an INFO
              key, e.g.

                --extract-if-info "VT == SNP"

              Unless the operator is !=, the predicate always evaluates to false when the
              key is missing.  (aliases: --extract-if, --exclude-if)

       --require-info <key(s)...>
       --require-no-info <key(s)...>
              : Exclude variants based on nonexistence or existence of an INFO key.
              "<key>=." is treated as nonexistence.

       --extract-col-cond <f> [valcol] [IDcol] [skip]
       --extract-col-cond-match <(sub)string(s)...>
       --extract-col-cond-mismatch <(sub)string(s)...>
       --extract-col-cond-substr
       --extract-col-cond-min <min>
       --extract-col-cond-max <max>
              : Exclude all variants without a value-column entry satisfying a condition.

              * By default, values are read from column 2 of the file, and variant IDs
                are read from column 1.

              * Three types of conditions are supported:

                * When --extract-col-cond-match is specified without
                  --extract-col-cond-substr, the value is checked for equality with the
                  given strings, and kept iff one of them matches.  Similarly,
                  --extract-col-cond-mismatch without --extract-col-cond-substr causes
                  the variant to be kept iff the value matches none of the given strings.

                * When --extract-col-cond-match and/or -mismatch are specified with
                  --extract-col-cond-substr, the variant is kept iff none of the
                  --extract-col-cond-mismatch substrings are contained in the value, and
                  either --extract-col-cond-match was unspecified or at least one of its
                  substrings is contained.

                * Otherwise, the value is interpreted as a number, and the variant is
                  kept if the number is in [<min>, <max>] (default min=0, max=DBL_MAX).
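
              The three condition types can be summarized as a single predicate.  The
              Python sketch below is an illustration, not PLINK's implementation: the
              function and parameter names are hypothetical, and combining match and
              mismatch lists in the exact-equality case is an assumption, since the text
              above only describes them separately there.

```python
import sys


def keep_variant(value, match=None, mismatch=None, substr=False,
                 min_val=0.0, max_val=sys.float_info.max):
    """Decide whether a variant is kept, given its value-column entry (a string)."""
    match = list(match or [])
    mismatch = list(mismatch or [])
    if match or mismatch:
        if substr:
            # Substring mode: no mismatch substring may occur in the value, and
            # at least one match substring must occur if any were given.
            if any(s in value for s in mismatch):
                return False
            return not match or any(s in value for s in match)
        # Exact mode: the value must equal none of the mismatch strings and,
        # if match strings were given, equal one of them.
        if value in mismatch:
            return False
        return not match or value in match
    # Numeric mode: keep when min_val <= value <= max_val.
    return min_val <= float(value) <= max_val


if __name__ == "__main__":
    print(keep_variant("missense_variant", match=["missense"]))
    print(keep_variant("missense_variant", match=["missense"], substr=True))
    print(keep_variant("0.42", min_val=0.3))
```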

       --pheno ['iid-only'] <f> : Specify additional phenotype/covariate file.
              Comma-delimited files with a header line are now permitted.

       --pheno-name <name...>
              : Only load the designated phenotype(s) from the --pheno (if one was specified)  or
              .psam (if no --pheno) file.  Separate multiple names with spaces or commas, and use
              dashes to designate ranges.

       --pheno-col-nums <#...>
              : Only load the phenotype(s) in the designated column number(s)  from  the  --pheno
              file.

       --no-psam-pheno
              : Ignore phenotype(s) in .psam/.fam file.

       --strict-sid0
              :  By  default, if there is no SID column in the .psam/.fam (or --update-ids) file,
              but there is one in another input file (for e.g. --keep/--remove), the  latter  SID
              column  is  ignored;  sample IDs are considered matching as long as FID and IID are
              equal (with missing FID treated as '0').  If you also want to require SID = '0' for
              a sample ID match in this situation, add --strict-sid0.

       --input-missing-phenotype <v> : Set nonzero number to treat as a missing
              pheno/covar in input files (default -9).

       --no-input-missing-phenotype
              :  Don't  treat any nonzero number as a missing pheno/covar.  ('NA'/'nan' are still
              treated as missing.)

       --1    : Expect case/control phenotypes in input files to be coded as 0  =  control,  1  =
              case,  instead  of  the  usual 0 = missing, 1 = ctrl, 2 = case.  (Unlike PLINK 1.x,
              this does not force all phenotypes to be interpreted as case/ctrl.)

       --missing-catname <str>
              : Set missing-categorical-phenotype string (case-sensitive, default 'NONE').

       --covar ['iid-only'] <f> : Specify additional covariate file.
              Comma-delimited files with a header line are now permitted.

       --covar-name <name...>
              : Only load the designated covariate(s) from the --covar (if  one  was  specified),
              --pheno (if no --covar), or .psam (if no --covar or --pheno) file.

       --covar-col-nums <#...>
              :  Only  load  the covariate(s) in the designated column number(s) from the --covar
              (if one was specified) or --pheno (if no --covar) file.

       --within <f> [new pheno name]
              : Import a PLINK 1.x categorical phenotype.  (Phenotype name defaults to
              'CATPHENO'.)
              * If any numeric values are present, ALL values must be numeric.  In that
                case, 'C' is added in front of all category names.
              * 'NA' is treated as a missing value.

       --mwithin <n>
              : Load --within categories from column n+2.

       --family [new pheno name]
              : Create a categorical phenotype from FID.  Restrictions on and handling of numeric
              values are the same as for --within.

       --family-missing-catname <nm> : Make --family treat the specified FID as
              missing.

       --keep <fname...>
              : Exclude all samples not named in a file.

       --remove <fname...>
              : Exclude all samples named in a file.

       --keep-fam <fn...>
              : Exclude all families not named in a file.

       --remove-fam <f...>
              : Exclude all families named in a file.

       --extract [{bed0 | bed1}] <f...>
       --exclude [{bed0 | bed1}] <f...>
              : Usually excludes all variants (not) named in the given file(s).  When
              multiple files are named, they are concatenated.  With the 'bed0' or 'bed1'
              modifier, variants outside/inside the positional ranges in the interval-BED
              file(s) are excluded instead.  'bed0' tells PLINK 2 to assume the interval
              bounds follow the UCSC 0-based half-open convention, while 'bed1'
              (equivalent to PLINK 1.9 'range') specifies 1-based fully-closed.
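
              The difference between the two interval conventions is easy to get wrong by
              one base.  The Python sketch below (a hypothetical helper, not part of
              PLINK) tests whether a 1-based variant position falls inside an interval
              under each convention.

```python
def in_interval(pos, start, end, convention):
    """pos is a 1-based variant position; start/end come from the BED file."""
    if convention == "bed0":
        # UCSC 0-based half-open [start, end) covers 1-based positions
        # start+1 .. end.
        return start < pos <= end
    if convention == "bed1":
        # 1-based fully-closed [start, end].
        return start <= pos <= end
    raise ValueError(f"unknown convention: {convention!r}")


if __name__ == "__main__":
    for pos in (100, 101, 200, 201):
        print(pos,
              in_interval(pos, 100, 200, "bed0"),
              in_interval(pos, 100, 200, "bed1"))
```

              For the interval 100..200, only position 100 is classified differently:
              it is outside under 'bed0' and inside under 'bed1'.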

       --extract-intersect [{bed0 | bed1}] <f...> : Just like --extract, except that
              a  variant must be in the intersection, rather than just the union, of the files to
              remain.

       --bed-border-bp <n>
       --bed-border-kb <n>
              : Stretch BED intervals by the given amount on each side.

       --keep-cats <filename>
       --keep-cat-names <nm...>
              : These can be used individually or in combination to define a list of
              categories to keep; all samples not in one of the named categories are
              excluded.  Use spaces to separate category names for --keep-cat-names.
              Use the --missing-catname value (default 'NONE') to refer to the group of
              uncategorized samples.

       --keep-cat-pheno <pheno> : If more than one categorical phenotype is loaded,
              or  you wish to filter on a categorical covariate, --keep-cat-pheno must be used to
              specify which phenotype/covariate --keep-cats and --keep-cat-names apply to.

       --remove-cats <filename> : Exclude all categories named in the file.

       --remove-cat-names <...> : Exclude named categories.

       --remove-cat-pheno <phe> : Specify pheno for --remove-cats/--remove-cat-names.

       --split-cat-pheno [{omit-most | omit-last}] ['covar-01']
              [cat. pheno/covar name(s)...] :

              Split n-category phenotype(s) into n (or n-1, with 'omit-most'/'omit-last')
              binary phenotypes, with names of the form <orig. pheno name>=<cat. name>.
              (As a consequence, affected phenotypes and categories are not permitted to
              contain the '=' character.)
              * This happens after all sample filters.
              * If no phenotype or covariate names are provided, all categorical
                phenotypes (but not covariates) are processed.
              * By default, generated covariates are coded as 1=false, 2=true.  To code
                them as 0=false, 1=true instead, add the 'covar-01' modifier.

       --loop-cats <pheno/covar>
              : Run variant filters and subsequent operations on just the samples  in  the  first
              category;  then  just  the  samples  in  the  second  category;  and so on, for all
              categories in the named categorical phenotype.

       --no-id-header ['iid-only'] : Don't include a header line in .id output
              files.  This normally forces two-column FID/IID output;  add  'iid-only'  to  force
              just single-column IID.

       --variance-standardize [pheno/covar name(s)...]

       --covar-variance-standardize [covar name(s)...] :

              Linearly    transform   named   covariates   (and   quantitative   phenotypes,   if
              --variance-standardize) to mean-zero, variance 1.  If no  arguments  are  provided,
              all  possible  phenotypes/covariates are affected.  This is frequently necessary to
              prevent multicollinearity when dealing with  covariates  where  abs(mean)  is  much
              larger than abs(standard deviation), such as year of birth.
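
              A small Python sketch of the transformation, using year of birth as the
              covariate.  This is illustrative only: the function name is hypothetical,
              and the use of the sample (n-1) standard deviation is an assumption, since
              the text does not say which estimator is used.

```python
import statistics


def variance_standardize(values):
    """Linearly rescale values to mean 0 and (sample) variance 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: n-1 denominator
    return [(v - mean) / sd for v in values]


if __name__ == "__main__":
    birth_years = [1950, 1960, 1970, 1980, 1990]
    z = variance_standardize(birth_years)
    # abs(mean) is far larger than the standard deviation before rescaling.
    print(statistics.fmean(birth_years), statistics.stdev(birth_years))
    print(z)
```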

       --quantile-normalize [...]
       --pheno-quantile-normalize [...]
       --covar-quantile-normalize [...]
              : Force named covariates and quantitative phenotypes to a N(0,1)
              distribution, preserving only the original rank orders.

       --chr <chr(s)...>
              :  Exclude  all  variants not on the given chromosome(s).  Valid choices for humans
              are 0  (unplaced),  1-22,  X,  Y,  XY,  MT,  PAR1,  and  PAR2.   Separate  multiple
              chromosomes  with  spaces  and/or  commas,  and  use  a  dash  (no  adjacent spaces
              permitted) to denote a range, e.g.  '--chr 1-4, 22, par1, x, par2'.

       --not-chr <...>
              : Reverse of --chr (exclude variants on listed chromosomes).

       --autosome
              : Exclude all non-autosomal variants.

       --autosome-par
              : Exclude all non-autosomal variants, except those in a pseudo-autosomal region.

       --snps-only ['just-acgt'] : Exclude non-SNP variants.
              By default, SNP = all allele codes are single-character (so  multiallelic  variants
              with a mix of SNPs and non-SNPs are excluded; split your variants first if that's a
              problem).     The    'just-acgt'    modifier     restricts     SNP     codes     to
              {A,C,G,T,a,c,g,t,<missing>}.

       --from <var ID>
       --to <var ID>
              : Use ID(s) to specify a variant range to load.  When used together, both
              variants must be on the same chromosome.  (--snps can be used to specify
              intervals which cross chromosome boundaries.)

       --snp <var ID> : Specify a single variant to load.

       --exclude-snp <ID> : Specify a single variant to exclude.

       --window <kbs>
              : With --snp/--exclude-snp, loads/excludes all variants within half the
              specified kb distance of the named one.

       --from-bp <pos>
       --to-bp <pos>
       --from-kb <pos>
       --to-kb <pos>
       --from-mb <pos>
       --to-mb <pos>
              : Use base-pair coordinates to define a variant range to load.
              * You must use these with --chr, specifying a single chromosome.
              * Decimals and negative numbers are permitted.
              * The --to-bp(/-kb/-mb) position is no longer permitted to be smaller than
                the --from-bp position.

       --snps <var IDs...>
       --exclude-snps <...>
              : Use IDs to specify variant range(s) to load or exclude.  E.g.
              '--snps rs1111-rs2222, rs3333, rs4444'.

       --force-intersect
              : PLINK 2 normally errors out when multiple variant inclusion  filters  (--extract,
              --extract-col-cond,  --extract-intersect,  --from/--to,  --from-bp/--to-bp,  --snp,
              --snps) are specified.  --force-intersect  allows  the  run  to  proceed;  the  set
              intersection will be taken.

       --thin <p>
              : Randomly remove variants, retaining each with prob. p.

       --thin-count <n>
              : Randomly remove variants until n of them remain.

       --bp-space <bps>
              : Remove variants so that each pair is no closer than the given bp distance.

       --thin-indiv <p>
              : Randomly remove samples, retaining with prob. p.

       --thin-indiv-count <n> : Randomly remove samples until n of them remain.

       --keep-col-match <f> <val(s)...> : Exclude all samples without a 3rd column
              entry  in  the  given  file  exactly  matching one of the given strings.  (Separate
              multiple strings with spaces.)

       --keep-col-match-name <col name> : Check column with given name instead.

       --keep-col-match-num <n>
              : Check nth column instead.

       --geno [val] [{dosage | hh-missing}]

       --mind [val] [{dosage | hh-missing}] :

              Exclude variants (--geno) and/or samples (--mind)  with  missing  call  frequencies
              greater  than  a threshold (default 0.1).  (Note that the default threshold is only
              applied if --geno/--mind is invoked without an argument; when --geno/--mind is  not
              invoked,   no   missing   call   frequency  ceiling  is  enforced  at  all.   Other
              inclusion/exclusion default thresholds work the same  way.)   By  default,  when  a
              dosage  is  present  but a hardcall is not, the genotype is treated as missing; add
              the 'dosage' modifier to treat this case as nonmissing.  Alternatively, you can use
              'hh-missing' to also treat heterozygous haploid calls as missing.

       --require-pheno [name(s)...]
       --require-covar [name(s)...]
              : Remove samples missing any of the named phenotype(s)/covariate(s).  If no
              arguments are provided, all phenotype(s)/covariate(s) must be present.

       --maf [freq] [mode]
              : Exclude variants with allele frequency lower than a threshold (default
              0.01).  By default, the nonmajor allele frequency is used; the other
              supported modes are 'nref' (non-reference), 'alt1', and 'minor' (least
              frequent).  bcftools freq:mode notation is permitted.  (alias: --min-af)

       --max-maf <freq> [mode]
              : Exclude variants with MAF greater than the threshold.  (alias: --max-af)

       --mac <ct> [mode]
              : Exclude variants with allele dosage lower than the given threshold.
              (alias: --min-ac)

       --max-mac <ct> [mode]
              : Exclude variants with allele dosage greater than the given threshold.
              (alias: --max-ac)

       --maf-succ
              : Rule of succession allele frequency estimation  (used  in  EIGENSOFT).   Given  j
              observations of one allele and k observations of the other for a biallelic variant,
              infer allele frequencies of (j+1) / (j+k+2) and (k+1) / (j+k+2),  rather  than  the
              default j / (j+k) and k / (j+k).  Note that this does not affect --freq's output.
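
              The two estimators differ most for rare or unobserved alleles.  The
              following Python sketch (hypothetical function name) evaluates both
              formulas for a biallelic variant.

```python
def allele_freqs(j, k, succ=False):
    """Frequencies of two alleles observed j and k times, respectively."""
    if succ:
        # Rule of succession: add one pseudo-observation of each allele.
        total = j + k + 2
        return (j + 1) / total, (k + 1) / total
    total = j + k
    return j / total, k / total


if __name__ == "__main__":
    # An allele never observed among 10 allele copies.
    print(allele_freqs(0, 10))
    print(allele_freqs(0, 10, succ=True))
```

              With the rule of succession, an allele that was never observed still gets
              a nonzero frequency estimate (1/12 in this example) instead of exactly 0.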

       --min-alleles <ct> : Exclude variants with fewer than the given # of alleles.
              (When  a variant has exactly one ALT allele, and it's a missing-code, it's excluded
              by "--min-alleles 2".)

       --max-alleles <ct> : Exclude variants with more than the given # of alleles.

       --read-freq <file> : Load allele frequency estimates from the given --freq or
              --geno-counts (or PLINK 1.9 --freqx) report, instead  of  imputing  them  from  the
              immediate dataset.

       --hwe <p> ['midp'] ['keep-fewhet'] :

              Exclude variants with Hardy-Weinberg equilibrium exact test p-values below a
              threshold.
              * By default, only founders are considered.
              * chrX p-values are now computed using Graffelman and Weir's method.
              * For variants with k alleles with k>2, k separate 'biallelic' tests are
                performed, and the variant is filtered out if any of them fail.
              * With 'keep-fewhet', variants which fail the test in the too-few-hets
                direction are not excluded.  On chrX, this uses the ratio between the
                Graffelman/Weir p-value and the female-only p-value.
              * There is currently no special handling of case/control phenotypes.

       --mach-r2-filter [min] [max]
              : Exclude variants with MaCH imputation quality metric less than min or
              greater than max (defaults 0.1 and 2.0).  (Monomorphic variants, with r2 =
              nan, are not excluded.)
              * This is NOT identical to the R2 metric reported by Minimac3 0.1.13+; see
                below.
              * If a single argument is provided, it is treated as the minimum.
              * The metric is not computed on chrX and MT.

       --minimac3-r2-filter <min> [max]
              : Compute Minimac3 R2 values from scratch, and exclude variants with R2 less
              than min or (if max is provided) greater than max.
              * Note that this requires phased-dosage data for all samples and variants;
                otherwise this will systematically underestimate imputation quality,
                since unphased hardcalls/dosages are treated as if they were maximally
                uncertain.  (Use --extract-if-info/--exclude-if-info to filter on
                precomputed Minimac3 R2 in a VCF/.pvar INFO column.)

       --keep-females
              : Exclude male and unknown-sex samples.

       --keep-males
              : Exclude female and unknown-sex samples.

       --keep-nosex
              : Exclude all known-sex samples.

       --remove-females
              : Exclude female samples.

       --remove-males
              : Exclude male samples.

       --remove-nosex
              : Exclude unknown-sex samples.

       --keep-founders
              : Exclude nonfounder samples.

       --keep-nonfounders : Exclude founder samples.

       --keep-if <pheno/covar> <op> <val>
       --remove-if <pheno/covar> <op> <val>
              : Exclude samples which don't/do satisfy a comparison predicate, e.g.

                --keep-if "PHENO1 == case"

              Unless the operator is !=, the predicate always evaluates to false when the
              phenotype/covariate is missing.
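
              The missing-value rule matters when combining --keep-if and --remove-if.
              The Python sketch below illustrates it; the helper names and the operator
              table are assumptions made for the example, and it reads the rule as
              meaning that '!=' evaluates to true for a missing value.

```python
import operator

# Assumed operator set for illustration; not taken from the text above.
OPS = {
    "==": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def predicate(value, op, target):
    """Evaluate '<value> <op> <target>'; value None means missing."""
    if value is None:
        # Missing values fail every comparison except '!='.
        return op == "!="
    return OPS[op](value, target)


def keep_if(samples, key, op, target):
    """Keep samples (dicts) that satisfy the predicate."""
    return [s for s in samples if predicate(s.get(key), op, target)]


def remove_if(samples, key, op, target):
    """Remove samples (dicts) that satisfy the predicate."""
    return [s for s in samples if not predicate(s.get(key), op, target)]


if __name__ == "__main__":
    samples = [
        {"IID": "s1", "PHENO1": "case"},
        {"IID": "s2", "PHENO1": "control"},
        {"IID": "s3", "PHENO1": None},
    ]
    print([s["IID"] for s in keep_if(samples, "PHENO1", "==", "case")])
    print([s["IID"] for s in remove_if(samples, "PHENO1", "==", "case")])
```

              Note that the sample with a missing phenotype is dropped by the keep
              filter but retained by the remove filter, so the two are not exact
              complements of each other's arguments.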

       --nonfounders
              : Include nonfounders in allele freq/HWE calculations.

       --bad-freqs
              : When PLINK 2 needs decent allele frequencies, it normally errors out if
              they aren't provided by --read-freq and fewer than 50 founders are available
              to impute them from.  Use --bad-freqs to force PLINK 2 to proceed in this
              case.

       --bad-ld
              : PLINK 2 normally errors out when it needs to estimate LD between variants,
              but there are fewer than 50 founders to estimate from.  Use --bad-ld to
              force PLINK 2 to proceed.

       --export-allele <file> : With --export A/A-transpose/AD, count alleles named
              in the file, instead of REF alleles.

       --output-chr <MT code> : Set chromosome coding scheme in output files by
              providing the desired human mitochondrial code.  Options are '26', 'M', 'MT', '0M',
              'chr26',  'chrM', and 'chrMT'; default is now 'MT' (note that this is a change from
              PLINK 1.x, which defaulted to '26').

       --output-missing-genotype <ch> : Set the code used to represent missing
              genotypes in output files (default '.').

       --output-missing-phenotype <s> : Set the string used to represent missing
              phenotypes in output files (default 'NA').

       --sort-vars [mode]
              : Sort variants by chromosome, then position, then ID.  The following string
              orders are supported:
              * 'natural'/'n': Natural sort (default).
              * 'ascii'/'a': ASCII.
              This must be used with --make-[b]pgen/--make-bed.

       --set-hh-missing ['keep-dosage'] : Make --make-[b]pgen/--make-bed set non-MT
              heterozygous haploid hardcalls, and all female chrY  calls,  to  missing.   (Unlike
              PLINK  1.x,  this  treats  unknown-sex chrY genotypes like males, not females.)  By
              default, all associated dosages are also erased; use  'keep-dosage'  to  keep  them
              all.

       --set-mixed-mt-missing ['keep-dosage'] : Make --make-[b]pgen/--make-bed set
              mixed MT hardcalls to missing.

       --split-par <bp1> <bp2>
       --split-par <build>
              : Changes chromosome code of all X chromosome variants with bp position <=
              bp1 to PAR1, and those with position >= bp2 to PAR2.  The following build
              codes are supported as shorthand:
              * 'b36'/'hg18' = NCBI 36, 2709521/154584237
              * 'b37'/'hg19' = GRCh37, 2699520/154931044
              * 'b38'/'hg38' = GRCh38, 2781479/155701383
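
              The recoding rule can be written out in a few lines of Python.  This is a
              sketch for illustration (the function and table names are hypothetical);
              the boundary values are the ones listed above.

```python
# (bp1, bp2) boundaries per build code, as listed above.
PAR_BOUNDS = {
    "b36": (2709521, 154584237), "hg18": (2709521, 154584237),
    "b37": (2699520, 154931044), "hg19": (2699520, 154931044),
    "b38": (2781479, 155701383), "hg38": (2781479, 155701383),
}


def split_par(chrom, pos, build="b38"):
    """Return the chromosome code after splitting off PAR1/PAR2 from X."""
    if chrom != "X":
        return chrom
    bp1, bp2 = PAR_BOUNDS[build]
    if pos <= bp1:
        return "PAR1"
    if pos >= bp2:
        return "PAR2"
    return "X"


if __name__ == "__main__":
    print(split_par("X", 2781479, "b38"))
    print(split_par("X", 2781480, "b38"))
    print(split_par("X", 155701383, "hg38"))
```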

       --merge-par
              : Merge PAR1/PAR2 back with X.  Requires PAR1 to be positioned  immediately  before
              X,  and PAR2 to be immediately after X.  (Should *not* be used with "--export vcf",
              since it causes male homozygous/missing  calls  in  PAR1/PAR2  to  be  reported  as
              haploid.)

       --merge-x
              : Merge XY back with X.  This usually has to be combined with --sort-vars.

       --set-missing-var-ids <t>
       --set-all-var-ids <t>
              : Given a template string with a '@' where the chromosome code should go and
              '#' where the bp coordinate belongs, --set-missing-var-ids assigns
              chromosome-and-bp-based IDs to unnamed variants, while --set-all-var-ids
              resets all IDs.  You may also use '$r'/'$a' to refer to the ref and alt1
              alleles, or '$1'/'$2' to refer to them in alphabetical order.
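
              Template expansion can be sketched in Python as below.  This is an
              illustration rather than PLINK's implementation: the function name is
              hypothetical, 'alphabetical order' is approximated with Python's default
              string ordering, and allele-length limits (--new-id-max-allele-len) are not
              modeled.

```python
def render_var_id(template, chrom, pos, ref, alt1):
    """Expand '@', '#', '$r', '$a', '$1', '$2' in a variant ID template."""
    first, second = sorted([ref, alt1])  # assumption: plain string ordering
    subs = {"r": ref, "a": alt1, "1": first, "2": second}
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "@":
            out.append(str(chrom))
        elif ch == "#":
            out.append(str(pos))
        elif ch == "$" and i + 1 < len(template) and template[i + 1] in subs:
            out.append(subs[template[i + 1]])
            i += 1  # consume the character after '$'
        else:
            out.append(ch)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(render_var_id("@:#:$r:$a", "1", 12345, "G", "A"))
    print(render_var_id("@:#:$1:$2", "1", 12345, "G", "A"))
```

              The '$1'/'$2' form gives the same ID regardless of which allele is REF,
              which helps when REF/ALT assignments are still provisional.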

       --var-id-multi <t>
       --var-id-multi-nonsnp <t>
              : Specify alternative templates for multiallelic variants.  ('$a' and
              '$1'/'$2' should be avoided here, though they're technically still allowed.)

       --new-id-max-allele-len <len> [{error | missing | truncate}] :

              Specify  maximum  number  of leading characters from allele codes to include in new
              variant IDs, and behavior on longer codes (defaults 23, error).

       --missing-var-code <str>
              : Change  unnamed  variant  code  for  --rm-dup,  --set-{missing|all}-var-ids,  and
              --recover-var-ids (default '.').

       --update-map <f> [bpcol] [IDcol] [skip] : Update variant bp positions.

       --update-name <f> [newcol] [oldcol] [skip] : Update variant IDs.

       --recover-var-ids <file> ['strict-bim-order'] [{rigid | force}] ['partial'] :

              Undo --set-all-var-ids, given the original .pvar/VCF/.bim file.  Original
              IDs are looked up by position and allele codes.
              * By default, if the original-ID file is a .bim, allele order is ignored.
                Use 'strict-bim-order' to force A1=ALT, A2=REF.
              * If any variant has multiple matching records in the original-ID file, and
                the IDs conflict, --recover-var-ids writes the affected (current) ID(s) to
                <output prefix>.recoverid.dup, and normally errors out.  If the
                original-ID file has the same number of variants in the same order, you
                can still recover the old IDs with the 'rigid' modifier in this case.
                Alternatively, to proceed and assign the missing-ID code to these
                variants, add the 'force' modifier.  (The .recoverid.dup file is still
                written when 'rigid' or 'force' is specified.)
              * --recover-var-ids normally expects to replace all variant IDs, and errors
                out if any are left untouched.  Add the 'partial' modifier when you
                actually want to update just a proper subset.

       --update-alleles <fname>
              : Update variant allele codes.

       --update-ids <fname>
              : Update sample IDs.

       --update-parents <fname>
              : Update parental IDs.

       --update-sex <filename> ['col-num='<n>] ['male0'] :

              Update sex information.

              * By default, if there is a header line starting with '#FID'/'#IID', sex is
              loaded from the first column titled 'SEX' (any capitalization); otherwise, column 3
              is assumed.  Use 'col-num=' to force a column number.

              * Only the first character in the sex column is processed.  By default,
              '1'/'M'/'m' is interpreted as male, '2'/'F'/'f' is interpreted as female, and
              '0'/'N' is interpreted as unknown-sex.  To change this to '0'/'M'/'m' = male,
              '1'/'F'/'f' = female, anything else other than '2' = unknown-sex, add 'male0'.
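
              The first-character rule can be written out as a small lookup.  This is a sketch
              of the mapping described above, not PLINK's implementation.  The text does not
              say what happens to unlisted codes in the default mode, or to '2' under 'male0';
              the sketch raises an error in both cases, which is an assumption.

```python
MALE, FEMALE, UNKNOWN = "male", "female", "unknown"

def interpret_sex(field, male0=False):
    """Classify a sex-column entry using only its first character."""
    c = field[:1]
    if male0:
        if c in ("0", "M", "m"):
            return MALE
        if c in ("1", "F", "f"):
            return FEMALE
        if c == "2":
            # Assumption: '2' is rejected here; the text only says it is
            # not treated as unknown-sex.
            raise ValueError("'2' is not accepted with male0 in this sketch")
        return UNKNOWN
    if c in ("1", "M", "m"):
        return MALE
    if c in ("2", "F", "f"):
        return FEMALE
    if c in ("0", "N"):
        return UNKNOWN
    # Assumption: anything not listed in the default mapping is an error.
    raise ValueError(f"unrecognized sex code: {field!r}")
```

              For example, interpret_sex("Male") and interpret_sex("0", male0=True) both
              classify the sample as male, because only the leading character is examined.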

       --real-ref-alleles
              :  Treat A2 alleles in a PLINK 1.x fileset as actual REF alleles; otherwise they're
              marked as provisional.

       --maj-ref ['force'] : Set major alleles to reference, like PLINK 1.x
              automatically did.  (Note that this is now opt-in rather than opt-out;
              --keep-allele-order is no longer necessary to prevent allele-swapping.)

              * This can only be used in runs with --make-bed/--make-[b]pgen/--export and no
              other commands.

              * By default, this only affects variants marked as having 'provisional' reference
              alleles.  Add 'force' to apply this to all variants.

              * All new reference alleles are marked as provisional.

       --ref-allele ['force'] <filename> [refcol] [IDcol] [skip]

       --alt1-allele ['force'] <filename> [alt1col] [IDcol] [skip] :

              These set the alleles specified in the file to ref (--ref-allele) or alt1
              (--alt1-allele).  They can be combined in the same run.

              * These can only be used in runs with --make-bed/--make-[b]pgen/--export and no
              other commands.

              * "--ref-allele <VCF filename> 4 3 '#'", which scrapes reference allele
              assignments from a VCF file, is especially useful.

              * By default, these error out when asked to change a 'known' reference allele.
              Add 'force' to permit that (when e.g. switching to a new reference genome).

              * When --alt1-allele changes the previous ref allele to alt1, the previous alt1
              allele is set to reference and marked as provisional.

       --ref-from-fa ['force'] : This sets reference alleles from the --fa file when
              it can be done unambiguously (note that it's never possible for deletions  or  some
              insertions).   By  default,  it errors out when asked to change a 'known' reference
              allele; add the 'force' modifier to permit that.

       --normalize ['list'] (alias: --norm) :

              Left-normalize all variants, using the --fa file.  (Assumes no differences in
              capitalization.)  The 'list' modifier causes a list of affected variant IDs to be
              written to <output prefix>.normalized.

       --indiv-sort <mode> [f] : Specify sample ID sort order for merge and
              --make-[b]pgen/--make-bed.  The following four modes are supported:

              * 'none'/'0' keeps samples in the order they were loaded.  Default for non-merge.

              * 'natural'/'n' invokes "natural sort", e.g. 'id2' < 'ID3' < 'id10'.  Default
              when merging.

              * 'ascii'/'a' sorts in ASCII order, e.g. 'ID3' < 'id10' < 'id2'.

              * 'file'/'f' uses the order in the given file (named in the last argument).
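
              The difference between the 'natural' and 'ascii' modes can be reproduced with a
              short sketch.  The key function below is one common way to build a natural sort
              (digit runs compared as numbers, other text compared case-insensitively); it is
              an assumption for illustration and may differ from PLINK's own ordering in edge
              cases such as ties.

```python
import re

def natural_key(sample_id):
    """Split an ID into text and digit runs so that digit runs compare numerically."""
    key = []
    for chunk in re.split(r"(\d+)", sample_id):
        if chunk.isdecimal():
            key.append((0, int(chunk), ""))        # numeric run
        else:
            key.append((1, 0, chunk.lower()))      # text run, case-insensitive
    return key

ids = ["id10", "ID3", "id2"]
natural_order = sorted(ids, key=natural_key)   # ['id2', 'ID3', 'id10']
ascii_order = sorted(ids)                      # ['ID3', 'id10', 'id2']
```

              The ASCII order places 'ID3' first because uppercase letters have lower code
              points than lowercase ones, and 'id10' before 'id2' because '1' < '2'.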

       --king-table-filter <min>
              : Specify minimum kinship coefficient for inclusion in --make-king-table report.

       --king-table-subset <f> [kmin] : Restrict current --make-king-table run to
              sample pairs listed in the given .kin0 file.  If a  second  argument  is  provided,
              only  sample  pairs  with  kinship  >=  that  threshold  (in  the  input .kin0) are
              processed.

       --condition <variant ID> [{dominant | recessive}] ['multiallelic']

       --condition-list <fname> [{dominant | recessive}] ['multiallelic'] :

              Add the given variant, or all variants in the given file, as --glm covariates.   By
              default,  this  errors  out  if  any  of  the  variants  are  multiallelic; add the
              'multiallelic' ('m' for short) modifier to  allow  them.   They'll  effectively  be
              split  against the major allele (unless --glm's 'omit-ref' modifier was specified),
              and  all  induced  covariate  names--even  for  biallelic  variants--will  have  an
              underscore followed by the allele code at the end.

       --parameters <...> : Include only the given covariates/interactions in the
              --glm model, identified by a list of 1-based indices and/or ranges of them.

       --tests <...>

       --tests all :

              Perform a (joint) test on the specified term(s) in the --glm model, identified by
              1-based indices and/or ranges of them.

              * Note that, when --parameters is also present, the indices refer to the terms
              remaining AFTER pruning by --parameters.

              * You can use '--tests all' to include all terms.
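
              The renumbering after --parameters is easy to get wrong, so here is a small
              sketch of it.  The term names T1..T6 are placeholders rather than real --glm
              output, the helper names are hypothetical, and the token lists stand in for
              already-split command-line arguments.

```python
def expand_indices(tokens):
    """Expand tokens such as '1-3' or '6' into a list of 1-based indices."""
    indices = []
    for token in tokens:
        if "-" in token:
            first, last = token.split("-", 1)
            indices.extend(range(int(first), int(last) + 1))
        else:
            indices.append(int(token))
    return indices

def select_terms(terms, tokens):
    """Pick terms by 1-based index."""
    return [terms[i - 1] for i in expand_indices(tokens)]

model_terms = ["T1", "T2", "T3", "T4", "T5", "T6"]   # placeholder term names
kept = select_terms(model_terms, ["1-3", "6"])       # like: --parameters 1-3 6
tested = select_terms(kept, ["2", "4"])              # like: --tests 2 4
```

              Here index 4 given to --tests refers to T6, the fourth term that survived
              pruning, and not to T4 in the original model.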

       --vif <max VIF>
              :  Set  VIF  threshold for --glm multicollinearity check (default 50).  (This is no
              longer skipped for case/control phenotypes.)

       --max-corr <val>
              : Skip --glm regression when the absolute value  of  the  correlation  between  two
              predictors exceeds this value (default 0.999).

       --xchr-model <m>
              : Set the chrX --glm/--condition[-list]/--[v]score model.

              * '0' = skip chrX.

              * '1' = add sex as a covar on chrX, code males 0..1.

              * '2' (default) = chrX sex covar, code males 0..2.

              (Use the --glm 'interaction' modifier to test for interaction between genotype
              and sex.)

       --adjust ['zs'] ['gc'] ['log10'] ['cols='<column set descriptor>] :

              For  each  association  test  in  this  run,  report  some  basic  multiple-testing
              corrections,  sorted  in  increasing-p-value order.  Modifiers work the same way as
              they do on --adjust-file.

       --lambda
              : Set genomic control lambda for --adjust[-file].

       --adjust-chr-field <n...>

       --adjust-pos-field <n...>

       --adjust-id-field <n...>

       --adjust-ref-field <n...>

       --adjust-alt-field <n...>

       --adjust-a1-field <n...>

       --adjust-test-field <n...>

       --adjust-p-field <n...> :

              Set --adjust-file input field names.  When multiple arguments are given to these
              flags, earlier names take precedence over later ones.

       --ci <size>
              : Report confidence intervals for odds ratios/betas.

       --pfilter <val>
              : Filter out assoc. test results with higher p-values.

       --score-col-nums <...> : Process all the specified coefficient columns in the
              --score file, identified by 1-based indexes and/or ranges of them.

       --q-score-range <range file> <data file> [i] [j] ['header'] ['min'] :

              Apply --score to subset(s) of variants in the primary score list(s) based on e.g.
              p-value ranges.

              * The first file should have range labels in the first column, p-value lower
              bounds in the second column, and upper bounds in the third column.  Lines with
              too few entries, or nonnumeric values in the second or third column, are ignored.

              * The second file should contain a variant ID and a p-value on each line (except
              possibly the first).  Variant IDs are read from column #i and p-values are read
              from column #j, where i defaults to 1 and j defaults to i+1.  The 'header'
              modifier causes the first nonempty line of this file to be skipped.

              * By default, --q-score-range errors out when a variant ID appears multiple times
              in the data file (and is also present in the main dataset).  To use the minimum
              p-value in this case instead, add the 'min' modifier.
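
              The two input files can be pictured with the sketch below, which parses both
              and lists the variants falling into each range.  It is an illustration only:
              the function names are hypothetical, the range bounds are treated as inclusive
              (an assumption; the text above does not say), and the check against the main
              dataset is skipped, so any duplicate ID is treated as a conflict.

```python
def read_ranges(lines):
    """Parse 'label lower upper' lines, ignoring short or nonnumeric ones."""
    ranges = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue
        try:
            lower, upper = float(fields[1]), float(fields[2])
        except ValueError:
            continue
        ranges.append((fields[0], lower, upper))
    return ranges

def read_pvalues(lines, i=1, j=None, header=False, use_min=False):
    """Read variant IDs from column i and p-values from column j (1-based)."""
    if j is None:
        j = i + 1
    rows = [line for line in lines if line.strip()]
    if header:
        rows = rows[1:]               # skip the first nonempty line
    pvalues = {}
    for row in rows:
        fields = row.split()
        variant_id, p = fields[i - 1], float(fields[j - 1])
        if variant_id in pvalues:
            if not use_min:
                raise ValueError(f"duplicate variant ID: {variant_id}")
            p = min(p, pvalues[variant_id])
        pvalues[variant_id] = p
    return pvalues

def variants_per_range(ranges, pvalues):
    """Map each range label to the sorted variant IDs whose p-value falls in it."""
    return {
        label: sorted(v for v, p in pvalues.items() if lower <= p <= upper)
        for label, lower, upper in ranges
    }

range_lines = ["S1 0 0.01", "S2 0 0.5", "bad x y", "short 0"]
data_lines = ["ID P", "rs1 0.001", "rs2 0.2", "rs3 0.9"]
subsets = variants_per_range(read_ranges(range_lines),
                             read_pvalues(data_lines, header=True))
```

              With these made-up inputs, S1 contains only rs1, S2 contains rs1 and rs2, and
              the malformed lines 'bad x y' and 'short 0' are dropped from the range list.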

       --vscore-col-nums <...> : Process all the specified coefficient columns in
              the --variant-score file, identified by 1-based indexes and/or ranges of them.

       --parallel <k> <n> : Divide the output matrix into n pieces, and only compute
              the  kth piece.  The primary output file will have the piece number included in its
              name, e.g. plink2.king.13 or plink2.king.13.zst if k is  13.   Concatenating  these
              files  in  order  will  yield  the full matrix of interest.  (Yes, this can be done
              before decompression.)  N.B. This generally cannot be  used  to  directly  write  a
              symmetric square matrix.  Choose square0 or triangle shape instead, and postprocess
              as necessary.
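
              When reassembling the pieces, the order must be numeric (1, 2, ..., n); a plain
              shell glob would sort piece 10 before piece 2.  The sketch below concatenates
              the pieces byte for byte, which is also why it works on the .zst files before
              decompression, as noted above.  The function name and its arguments are
              hypothetical; only the <prefix>.<k>[.zst] naming comes from the text.

```python
import shutil

def concat_pieces(prefix, n, out_path, suffix=""):
    """Concatenate prefix.1 ... prefix.n (numeric order) into out_path.

    Pass suffix=".zst" for compressed pieces, e.g. plink2.king.13.zst.
    """
    with open(out_path, "wb") as out:
        for k in range(1, n + 1):
            with open(f"{prefix}.{k}{suffix}", "rb") as piece:
                shutil.copyfileobj(piece, out)   # raw byte copy, no parsing
```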

       --memory <val> ['require'] : Set size, in MiB, of initial workspace malloc
              attempt.  To error out instead of  reducing  the  request  size  when  the  initial
              attempt fails, add the 'require' modifier.

       --threads <val>
              : Set maximum number of compute threads.

       --d <char>
              : Change variant/covariate range delimiter (normally '-').

       --seed <val...>
              :  Set  random  number  seed(s).   Each  value  must  be  an  integer between 0 and
              4294967295 inclusive.  Note that --threads  and  "--memory  require"  may  also  be
              needed to reproduce some randomized runs.

       --output-min-p <p> : Specify minimum p-value to write to reports.
              (2.23e-308 is useful for preventing underflow in some programs.)
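
              The suggested value sits just above the smallest positive normal double
              (about 2.2251e-308), so reported p-values stay out of the subnormal range and
              never reach zero.  The clamp itself can be sketched as below; the function name
              is hypothetical and this is not PLINK's code.

```python
import sys

def clamp_p(p, output_min_p=2.23e-308):
    """Raise any p-value below the floor up to the floor."""
    return max(p, output_min_p)

# The suggested floor is still a normal (not subnormal) double.
floor_is_normal = 2.23e-308 >= sys.float_info.min
```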

       --debug
              : Use slower, more crash-resistant logging method.

       --randmem
              : Randomize initial workspace memory (helps catch uninitialized-memory bugs).

       --warning-errcode
              : Return a nonzero error code to the OS when a run completes with warning(s).

       --zst-level <lvl>
              : Set the Zstd compression level (1-22, default 3).

       Primary  methods  paper:  Chang CC, Chow CC, Tellier LCAM, Vattikuti S, Purcell SM, Lee JJ
       (2015) Second-generation PLINK: rising to the challenge of  larger  and  richer  datasets.
       GigaScience, 4.
