Provided by: qtltools_1.3.1+dfsg-2build2_amd64

NAME

       QTLtools quan - Quantify gene and exon expression from RNA-seq

SYNOPSIS

       QTLtools quan --bam [in.sam|in.bam|in.cram] --gtf gene_annotation.gtf --out-prefix output [OPTIONS]

DESCRIPTION

       This mode quantifies the expression of genes and exons in the provided --gtf file using the RNA-seq reads
       in the --bam file.  The method counts the number of reads overlapping the exons in the --gtf file.
       First, all exons of a gene are converted into meta-exons, where overlapping exons are merged into a
       single exon encompassing all of them.  Any overlap between a read and an exon is considered a match;
       that is, a read is not required to lie entirely between the start and end positions of an exon to count
       towards that exon's quantification.  A split read aligning to multiple exons contributes to each exon it
       overlaps in proportion to the fraction of the read that overlaps that exon.  Thus a split read
       contributes less than a single count to each of the exons it overlaps.  A read overlapping exons of
       multiple genes counts towards the quantification of every exon it overlaps.  If the --bam file contains
       paired-end reads and the two mates of a pair overlap each other (i.e. have an insert size < 0), then
       each of these reads contributes less than a single count towards the quantifications unless --no-merge
       is provided.  The following diagram, with two genes with overlapping exons and one paired-end read where
       both mates are split reads and overlap each other, illustrates how the quantification works:

                    x             x
                   / \           / \
        +---------+   +---------+   +---------+
        | Exon1|1 |   | Exon1|2 |   | Exon1|3 |      Gene1
        +---------+   +---------+   +---------+
                                  x
                                 / \
                   +------------+   +-------------+
                   |  Exon2|1   |   |   Exon2|2   |  Gene2
                   +------------+   +-------------+
                                     x
                                    / \
                      +------------+   +----+        RNAseq Read Mate1

                      |--a-||-b-||c|   |-d--||--e-|
                                     x
                                    / \
                            +------+   +----------+  RNAseq Read Mate2

         Left Mate1  = ((b * 0.5) + a) / (a + b + d)
         Right Mate1 = (d * 0.5)/(a + b + d)
         Left Mate2  = (b * 0.5)/(b + d + e)
         Right Mate2 = ((d * 0.5) + e)/(b + d + e)

         Exon1|2 = Left Mate1 + Left Mate2
         Exon1|3 = Right Mate1 + Right Mate2
         Exon2|1 = Left Mate1 + Left Mate2
         Exon2|2 = Right Mate1 + Right Mate2

         Gene1 = Exon1|2 + Exon1|3
         Gene2 = Exon2|1 + Exon2|2
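
       The formulas above can be checked numerically.  The following is a minimal Python sketch, not the
       QTLtools implementation: the function name pair_contributions is hypothetical, a, b, d and e are the
       segment lengths labelled in the diagram, and Exon2|2 is taken here as Right Mate1 + Right Mate2 by
       symmetry with Exon1|3 (an assumption of this sketch).

```python
from fractions import Fraction


def pair_contributions(a, b, d, e):
    """Return per-exon and per-gene counts for the read pair in the diagram.

    a, b, d, e are the segment lengths labelled in the diagram. Segments b
    and d are covered by both mates, so each mate is credited with half.
    """
    a, b, d, e = (Fraction(x) for x in (a, b, d, e))
    half = Fraction(1, 2)

    left_mate1 = (b * half + a) / (a + b + d)
    right_mate1 = (d * half) / (a + b + d)
    left_mate2 = (b * half) / (b + d + e)
    right_mate2 = (d * half + e) / (b + d + e)

    counts = {
        "Exon1|2": left_mate1 + left_mate2,
        "Exon1|3": right_mate1 + right_mate2,
        "Exon2|1": left_mate1 + left_mate2,
        # Assumption of this sketch: mirrors Exon1|3.
        "Exon2|2": right_mate1 + right_mate2,
    }
    counts["Gene1"] = counts["Exon1|2"] + counts["Exon1|3"]
    counts["Gene2"] = counts["Exon2|1"] + counts["Exon2|2"]
    return counts


if __name__ == "__main__":
    for name, value in pair_contributions(20, 10, 10, 20).items():
        print(name, float(value))
```

       Exact fractions are used so that the arithmetic can be compared with a hand calculation; with these
       formulas each mate contributes less than one count in total because the shared segments b and d are
       halved.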

       The quan mode in version 1.2 and above is not compatible with the quantifications generated by previous
       versions.  This is due to bug fixes and slight adjustments to the way we quantify.  DO NOT MIX
       QUANTIFICATIONS GENERATED BY EARLIER VERSIONS OF QTLTOOLS WITH QUANTIFICATIONS FROM VERSION 1.2 AND ABOVE
       AS THIS WILL CREATE A BIAS IN YOUR DATASET.
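
       The meta-exon construction described at the start of this section amounts to merging overlapping
       intervals.  A minimal Python sketch follows; the function name merge_exons is hypothetical, and the
       sketch assumes 1-based inclusive GTF coordinates and that exons which merely touch (for example 10-20
       and 21-30) are not merged.  QTLtools may treat such edge cases differently.

```python
def merge_exons(exons):
    """Merge overlapping (start, end) exons of one gene into meta-exons.

    Coordinates are assumed to be 1-based and inclusive, as in GTF.
    """
    merged = []
    for start, end in sorted(exons):
        if merged and start <= merged[-1][1]:
            # Shares at least one base with the previous meta-exon: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


if __name__ == "__main__":
    print(merge_exons([(150, 300), (100, 200), (400, 500)]))
```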

OPTIONS

       --gtf gene_annotation.gtf
              Gene  annotations  in  GTF  format.   These  can be obtained from <https://www.gencodegenes.org/>.
              REQUIRED.

       --bam [in.bam|in.sam|in.cram]
              Sequence data in BAM/SAM/CRAM format sorted by chromosome and then position.  One sample  per  BAM
              file.  REQUIRED.

       --out-prefix output
              Output prefix.  REQUIRED.

       --sample sample_name
              The  sample  name  of the BAM file.  If not provided the sample name will be taken as the BAM file
              path.

       --rpkm Output RPKM values.

       --tpm  Output TPM values.

       --xxhash
              Rather than using the GTF file name to generate a unique hash for the options used, use the hash
              of the GTF file.

       --no-hash
              Do  not include a hash signifying the options used in the quantification in the output file names.
              NOT RECOMMENDED.

       --gene-types gene_type ...
              Only quantify these  gene  types.   Requires  gene_type  attribute  in  GTF.   It  will  also  use
              transcript_type if present.

       --filter-mapping-quality integer
              Minimum  mapping  quality  for  a  read  or  read pair to be considered.  Set this to only include
              uniquely mapped reads.  DEFAULT=10.

       --filter-mismatch integer|float
              Maximum mismatches allowed in a read.  If between 0 and 1 taken as the fraction  of  read  length.
              Requires NM attribute in the BAM file.  DEFAULT=OFF.

       --filter-mismatch-total integer|float
              Maximum  total  mismatches  allowed  in paired-reads.  If between 0 and 1 taken as the fraction of
              combined read length.  Requires NM attribute in the BAM file.  DEFAULT=OFF.

       --filter-min-exon integer
              Minimum length of an exon for it to be quantified.  Exons smaller than this will  not  be  printed
              out in the exon quantifications, but will still count towards gene quantifications.  DEFAULT=0.

       --filter-remove-duplicates
              Remove duplicate sequencing reads, as indicated by the aligner, in the process.  NOT RECOMMENDED.

       --filter-failed-qc
              Remove fastq reads that fail sequencing QC as indicated by the sequencer.

       --check-proper-pairing
              If provided, only reads that the aligner marks as properly paired and that are in the correct
              orientation will be considered.  Otherwise, all pairs in the correct orientation will be
              considered.

       --check-consistency
              If provided checks the consistency of split reads with annotation, rather than pure overlap of one
              of the blocks of the split read.

       --no-merge
              If  provided  overlapping mate pairs will not be merged.  Default behavior is to merge overlapping
              mate pairs based on the amount of overlap, such that each mate pair counts for less than 1 read.

       --legacy-options
              Exactly replicate Dermitzakis lab original quantification script.  DO NOT USE.

       --region chr:start-end
              Genomic region to be processed.  E.g. chr4:12334456-16334456, or chr5.
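
       The integer|float convention of --filter-mismatch and --filter-mismatch-total can be illustrated as
       follows.  This is a Python sketch under stated assumptions, not the QTLtools code: the function name is
       hypothetical, values strictly between 0 and 1 are treated as fractions, no rounding is applied to the
       fractional threshold, and a read passes when its NM value does not exceed the threshold.

```python
def passes_mismatch_filter(nm, read_length, threshold=None):
    """Return True if a read with `nm` mismatches passes the filter.

    threshold=None models DEFAULT=OFF. For --filter-mismatch-total, pass the
    summed NM of both mates and their combined length.
    """
    if threshold is None:
        return True
    if 0 < threshold < 1:
        # Fractional threshold: scale by the (combined) read length.
        allowed = threshold * read_length
    else:
        # Integer threshold: absolute number of mismatches.
        allowed = threshold
    return nm <= allowed


if __name__ == "__main__":
    print(passes_mismatch_filter(3, 100, 0.05))
    print(passes_mismatch_filter(9, 150, 8))
```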

OUTPUT FILES

       Unless --no-hash is provided, all output file names will include a hash value corresponding to the
       combination of options used.  This prevents merging quantifications from samples that were quantified
       differently, which would create a bias in the dataset.

       .gene.count.bed .exon.count.bed .gene.rpkm.bed .exon.rpkm.bed .gene.tpm.bed .exon.tpm.bed
        These are the quantification results files with the following columns:

        1   chr           Phenotype's chromosome
        2   start         Phenotype's start position (0-based)
        3   end           Phenotype's end position (1-based)
        4   gene|exon     The gene or exon ID.
        5   info|geneID   Information about the gene or the gene ID of the exon.  The gene info is separated  by
                          semicolons, and L=gene length, T=gene type, R=gene positions, N=gene name
        6   strand        Phenotype's strand
        7   sample_name   The sample name of the BAM file
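
       A data line with the columns above could be read as in the following Python sketch.  The names
       QuanRecord, parse_quan_line and parse_gene_info are hypothetical, and the sketch assumes tab-separated
       fields and a numeric quantification in column 7; the sample line in the usage part is made up for
       illustration and is not real output.

```python
from typing import NamedTuple


class QuanRecord(NamedTuple):
    chrom: str
    start: int       # 0-based
    end: int         # 1-based, so end - start is the span in bases
    feature_id: str  # gene or exon ID
    info: str        # gene info, or the gene ID for exon files
    strand: str
    value: float     # quantification for the sample (assumed numeric)


def parse_quan_line(line):
    """Parse one tab-separated data line of a quantification file."""
    fields = line.rstrip("\n").split("\t")
    chrom, start, end, feature_id, info, strand, value = fields[:7]
    return QuanRecord(chrom, int(start), int(end), feature_id, info,
                      strand, float(value))


def parse_gene_info(info):
    """Split a gene info field such as 'L=..;T=..;R=..;N=..' into a dict."""
    return dict(item.split("=", 1) for item in info.split(";") if "=" in item)


if __name__ == "__main__":
    # Made-up example line, not real QTLtools output.
    example = "chr22\t1000\t5000\tGENE_A\tL=1200;T=protein_coding;N=NAME_A\t+\t42.5"
    record = parse_quan_line(example)
    print(record.end - record.start, parse_gene_info(record.info))
```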

       .stats
        Details the statistics of the quantification, with the following rows:

         1   filtered_secondary_alignments_(does_not_count_towards_total_reads)   Number of secondary alignments
         2   total_reads                                                          Number  of  reads  in  the BAM
                                                                                  file
         3   filtered_unmapped                                                    Number of unmapped reads
         4   filtered_failqc                                                      Number  of  reads   with   the
                                                                                  failed QC tag
         5   filtered_duplicate                                                   Number of duplicate reads
         6   filtered_mapQ_less_than_X                                            Number   of  reads  below  the
                                                                                  mapping quality threshold X
         7   filtered_notpaired                                                   Number of pairs that were  not
                                                                                  in  the correct orientation or
                                                                                  were not properly paired
         8   filtered_mismatches_greater_than_X_Y                                 Number of  reads  failing  the
                                                                                  mismatches  per  read,  X, and
                                                                                  mismatches total filters, Y
         9   filtered_unmatched_mate_pairs                                        Number of  reads  where  there
                                                                                  was   a   paired-read  with  a
                                                                                  missing mate
        10   total_good                                                           Number of  reads  that  passed
                                                                                  all filters
        11   total_exonic                                                         Number  of  reads that aligned
                                                                                  to  exons   and   passed   all
                                                                                  filters
        12   total_exonic_multi_counting                                          Number  of  reads that aligned
                                                                                  to exons when we  count  reads
                                                                                  that  align  to multiple exons
                                                                                  multiple times
        13   total_merged_reads                                                   Number of reads where the mate
                                                                                  pairs   were  overlapping  and
                                                                                  thus were merged
        14   total_exonic_multi_counting_after_merge_(used_for_rpkm)              Number of reads  that  aligned
                                                                                  to   exons   when   we   merge
                                                                                  overlapping mate pairs
        15   good_over_total                                                      Number of good reads over  the
                                                                                  total number of reads
        16   exonic_over_total                                                    Number  of  exonic  reads over
                                                                                  the total number of reads
        17   exonic_over_good                                                     Number of  exonic  reads  over
                                                                                  the number of good reads
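
       Rows 15 to 17 are ratios of earlier rows.  The Python sketch below recomputes them from the counts, which
       can serve as a sanity check on a parsed .stats file; the function name is hypothetical, the layout of the
       .stats file itself is not assumed here, and the counts are expected to be non-zero.

```python
def summary_ratios(total_reads, total_good, total_exonic):
    """Recompute the three ratio rows from the corresponding count rows."""
    return {
        "good_over_total": total_good / total_reads,
        "exonic_over_total": total_exonic / total_reads,
        "exonic_over_good": total_exonic / total_good,
    }


if __name__ == "__main__":
    print(summary_ratios(total_reads=1000, total_good=800, total_exonic=600))
```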

EXAMPLE

       o Quantifying  a  sample  mapped with GEM, outputting TPM and RPKM values, and taking the hash of the GTF
         file:

         QTLtools quan --bam HG00381.chr22.bam --gtf  gencode.v19.annotation.chr22.gtf.gz  --out-prefix  HG00381
         --sample HG00381 --rpkm --tpm --xxhash --filter-mismatch-total 8 --filter-mapping-quality 150
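
         When many samples are quantified from a script, the same command line can be assembled
         programmatically so that every sample gets identical options.  The Python sketch below only builds and
         prints the argument list; build_quan_command is a hypothetical helper, and running the result assumes
         QTLtools is on the PATH.

```python
import shlex


def build_quan_command(bam, gtf, out_prefix, sample=None, extra=()):
    """Build the argument list for one QTLtools quan run."""
    cmd = ["QTLtools", "quan", "--bam", bam, "--gtf", gtf,
           "--out-prefix", out_prefix]
    if sample is not None:
        cmd += ["--sample", sample]
    cmd += list(extra)
    return cmd


cmd = build_quan_command(
    "HG00381.chr22.bam",
    "gencode.v19.annotation.chr22.gtf.gz",
    "HG00381",
    sample="HG00381",
    extra=["--rpkm", "--tpm", "--xxhash",
           "--filter-mismatch-total", "8",
           "--filter-mapping-quality", "150"],
)
print(shlex.join(cmd))
# To execute: subprocess.run(cmd, check=True)
```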

SEE ALSO

       QTLtools(1)

       QTLtools website: <https://qtltools.github.io/qtltools>

BUGS

       Please submit bugs to <https://github.com/qtltools/qtltools>

CITATION

       Delaneau,  O.,  Ongen, H., Brown, A. et al. A complete tool set for molecular QTL discovery and analysis.
       Nat Commun 8, 15452 (2017).  <https://doi.org/10.1038/ncomms15452>

AUTHORS

       Halit Ongen (halitongen@gmail.com), Olivier Delaneau (olivier.delaneau@gmail.com)