Provided by: qtltools_1.3.1+dfsg-2build2_amd64

NAME

       QTLtools pca - Conducts PCA

SYNOPSIS

       QTLtools pca --vcf [in.vcf|in.vcf.gz|in.bcf] | --bed in.bed.gz --out output_prefix [OPTIONS]

DESCRIPTION

       This mode performs a Principal Component Analysis (PCA) on either molecular phenotype
       quantifications or genotype data.  It is typically used (i) to detect outliers in the
       data, (ii) to detect stratification in the data, or (iii) to build a covariate matrix
       before QTL mapping.  QTLtools' PCA implementation uses singular value decomposition
       (SVD).  When building a covariate matrix to account for technical covariates, we
       recommend using --center and --scale.
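
       To illustrate what --center and --scale do to each variable, the following sketch
       applies both steps to a toy matrix with awk.  The helper name center_scale, the input
       values, and the use of the sample standard deviation (n - 1 denominator) are
       assumptions made for illustration; this page does not state the exact normalization
       QTLtools applies.

```shell
# center_scale is a hypothetical helper, not part of QTLtools.
# Input: one variable per row, one sample per column (made-up values).
center_scale() {
  awk '{
    n = NF; sum = 0
    for (i = 1; i <= n; i++) sum += $i
    mean = sum / n                       # --center: subtract this mean
    ss = 0
    for (i = 1; i <= n; i++) ss += ($i - mean) ^ 2
    sd = sqrt(ss / (n - 1))              # assumption: sample SD (n - 1)
    if (sd == 0) sd = 1                  # leave constant rows at zero
    for (i = 1; i <= n; i++)             # --scale: divide by the SD
      printf "%s%.4f", (i > 1 ? " " : ""), ($i - mean) / sd
    printf "\n"
  }'
}

printf '1 2 3\n10 20 30\n' | center_scale
```

       Both toy rows end up with identical standardized values even though the second has ten
       times the spread, so neither variable dominates the components purely because of its
       units.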

OPTIONS

       --vcf [in.vcf|in.bcf|in.vcf.gz|in.bed.gz]
              Genotypes in VCF/BCF/BED format.  REQUIRED unless --bed.

       --bed quantifications.bed.gz
              Quantifications in BED format.  REQUIRED unless --vcf.

       --out output_prefix
              Output file prefix.  REQUIRED.

       --center
              Center the variables (genotypes or phenotypes) by subtracting the mean from each
              value.

       --scale
              Scale the variables (genotypes or phenotypes) by dividing each value by the
              standard deviation.

       --region chr:start-end
              Genomic region to be processed.  E.g. chr4:12334456-16334456, or chr5

       --exclude-chrs string
              The chromosomes to exclude, given as a space-separated list.  Only applies to --vcf.
              DEFAULT="X Y M MT XY chrX chrY chrM chrMT chrXY"

       --maf float
              Exclude  sites  with minor allele frequency less than this.  Only applies to --vcf.
              DEFAULT=0.0

       --distance integer
              Only include sites separated by at least this many base pairs.  Only applies to --vcf.
              DEFAULT=0
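
       The effect of --distance can be pictured as thinning a sorted list of sites.  The
       sketch below is an assumption about the general idea, not QTLtools' actual code: the
       helper thin_by_distance is hypothetical, the positions are made up, and both the greedy
       left-to-right selection and the handling of sites exactly at the threshold may differ
       from what QTLtools does.

```shell
# thin_by_distance is a hypothetical helper, not part of QTLtools.
# Input: "chrom position" pairs, sorted by chromosome and then position.
# Keeps a site when it starts a new chromosome or lies at least d bp
# after the last site that was kept (greedy selection, an assumption).
thin_by_distance() {
  awk -v d="$1" '
    $1 != chr || $2 - last >= d { print; chr = $1; last = $2 }
  '
}

printf 'chr22 1000\nchr22 3000\nchr22 6500\nchr22 9000\nchr22 12000\n' |
  thin_by_distance 5000
```

       With a threshold of 5000, the toy sites at 3000 and 9000 are dropped because each lies
       too close to the previously kept site.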

OUTPUT FILES

       .pca
        This file contains the principal components that were calculated.  The name of each
        principal component, given in the first column, is composed of the output file prefix,
        whether the data was centered, whether the data was scaled, and the principal
        component number.

       .pca_stats
        This file contains the standard deviation of each principal component, and  the  variance
        and the cumulative variance explained by each PC.
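
       The three quantities reported in .pca_stats are related: the share of variance
       explained by a component is its squared standard deviation divided by the sum of all
       squared standard deviations.  The sketch below shows this arithmetic on made-up
       values; variance_explained is a hypothetical helper, and its one-number-per-line input
       is an assumption for illustration, not the layout of the .pca_stats file.

```shell
# variance_explained is a hypothetical helper, not part of QTLtools.
# Input: one standard deviation per line, in component order (toy values).
variance_explained() {
  awk '{ sd[NR] = $1; total += $1 * $1 }
       END {
         for (i = 1; i <= NR; i++) {
           v = sd[i] * sd[i] / total     # proportion of variance for PC i
           cum += v                      # running cumulative proportion
           printf "PC%d %.3f %.3f\n", i, v, cum
         }
       }'
}

printf '4\n3\n' | variance_explained
```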

EXAMPLES

       o Running pca on RNAseq quantifications to calculate technical covariates:

         QTLtools  pca  --bed  genes.50percent.chr22.bed.gz  --out genes.50percent.chr22 --center
         --scale

       o Running pca on genotypes to detect population stratification:

         QTLtools pca --vcf genotypes.chr22.vcf.gz --out genotypes.chr22 --center  --scale  --maf
         0.05 --distance 5000
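
       When the components are used as covariates, a common follow-up is to keep only the
       first few of them.  The sketch below assumes that a .pca file has one header line
       followed by one row per principal component; the header line is an assumption, as are
       the helper name top_pcs and the toy identifiers and values, so check a real output
       file before relying on it.

```shell
# top_pcs is a hypothetical helper, not part of QTLtools.
# Assumption: line 1 is a header, each later line is one principal component.
top_pcs() {
  k=$1
  head -n "$((k + 1))"     # header plus the first k component rows
}

# Toy stand-in for a .pca file (made-up IDs and values).
printf 'id s1 s2 s3\nPC1 0.5 -0.1 -0.4\nPC2 0.2 0.3 -0.5\nPC3 0.0 0.1 -0.1\n' |
  top_pcs 2
```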

SEE ALSO

       QTLtools(1)

       QTLtools website: <https://qtltools.github.io/qtltools>

BUGS

       o Versions up to and including 1.2 suffer from a bug in reading missing genotypes in
         VCF/BCF files.  This bug affects variants that have a DS field in their genotype
         FORMAT and a missing genotype (DS field is .) in one of the samples, in which case
         genotypes for all the samples are set to missing, effectively removing this variant
         from the analyses.

       Please submit bugs to <https://github.com/qtltools/qtltools>

AUTHORS

       Halit Ongen (halitongen@gmail.com), Olivier Delaneau (olivier.delaneau@gmail.com)