Provided by: qtltools_1.3.1+dfsg-4build6_amd64 

NAME
QTLtools quan - Quantify gene and exon expression from RNA-seq
SYNOPSIS
QTLtools quan --bam [in.sam|in.bam|in.cram] --gtf gene_annotation.gtf --out-prefix output [OPTIONS]
DESCRIPTION
This mode quantifies the expression of genes and exons in the provided --gtf file using the RNA-seq reads
in the --bam file. The method counts the number of reads overlapping the exons in the --gtf file.
First, all exons of a gene are converted into meta-exons, where overlapping exons are merged into a
single exon encompassing all of them. Any overlap between a read and an exon is considered a match; that
is, a read is not required to lie entirely between the start and end positions of an exon to count
towards that exon's quantification. A split read aligning to multiple exons contributes to each exon it
overlaps in proportion to the fraction of the read that overlaps that exon. Thus a split read
contributes less than a single count to each of the overlapping exons. A read overlapping exons of
multiple genes counts towards the quantification of every exon it overlaps. If the --bam file contains
paired-end reads and the two mate pairs overlap each other (i.e. have an insert size < 0), then each of
these reads contributes less than a single count towards the quantifications unless --no-merge is
provided. The following diagram, with two genes with overlapping exons and one paired-end read where
both mate pairs are split reads and overlap each other, illustrates how the quantification works:
x x
/ \ / \
+---------+ +---------+ +---------+
| Exon1|1 | | Exon1|2 | | Exon1|3 | Gene1
+---------+ +---------+ +---------+
x
/ \
+------------+ +-------------+
| Exon2|1 | | Exon2|2 | Gene2
+------------+ +-------------+
x
/ \
+------------+ +----+ RNAseq Read Mate1
|--a-||-b-||c| |-d--||--e-|
x
/ \
+------+ +----------+ RNAseq Read Mate2
Left Mate1 = ((b * 0.5) + a) / (a + b + d)
Right Mate1 = (d * 0.5)/(a + b + d)
Left Mate2 = (b * 0.5)/(b + d + e)
Right Mate2 = ((d * 0.5) + e)/(b + d + e)
Exon1|2 = Left Mate1 + Left Mate2
Exon1|3 = Right Mate1 + Right Mate2
Exon2|1 = Left Mate1 + Left Mate2
Exon2|2 = Right Mate1 + Right Mate2
Gene1 = Exon1|2 + Exon1|3
Gene2 = Exon2|1 + Exon2|2
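The formulas above can be checked with a small worked example. The following is only a sketch in Python, not QTLtools code: the function names and the segment lengths (a=20, b=10, d=10, e=20 bases) are made up for illustration, and it assumes that Exon2|2 receives Right Mate1 + Right Mate2, mirroring Exon1|3.

```python
def mate_fractions(a, b, d, e):
    """Per-block contributions of the two overlapping mates in the diagram.

    a, b, d, e are the segment lengths labelled in the diagram. Segments b
    and d are covered by both mates, so each mate is credited with half.
    """
    len_mate1 = a + b + d
    len_mate2 = b + d + e
    left_mate1 = (b * 0.5 + a) / len_mate1
    right_mate1 = (d * 0.5) / len_mate1
    left_mate2 = (b * 0.5) / len_mate2
    right_mate2 = (d * 0.5 + e) / len_mate2
    return left_mate1, right_mate1, left_mate2, right_mate2


def quantify(a, b, d, e):
    """Sum the mate contributions per exon, then per gene."""
    l1, r1, l2, r2 = mate_fractions(a, b, d, e)
    exons = {
        "Exon1|2": l1 + l2,
        "Exon1|3": r1 + r2,
        "Exon2|1": l1 + l2,
        "Exon2|2": r1 + r2,  # assumption: same pattern as Exon1|3
    }
    genes = {
        "Gene1": exons["Exon1|2"] + exons["Exon1|3"],
        "Gene2": exons["Exon2|1"] + exons["Exon2|2"],
    }
    return exons, genes


# Hypothetical segment lengths, chosen only to keep the arithmetic simple.
exons, genes = quantify(a=20, b=10, d=10, e=20)
print(exons)
print(genes)
```

With these lengths each mate adds 0.75 in total, so the pair contributes 1.5 counts to each gene rather than 2, which is the effect of merging overlapping mate pairs.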
The quan mode in version 1.2 and above is not compatible with the quantifications generated by the
previous versions. This is due to bug fixes and slight adjustments to the way we quantify. DO NOT MIX
QUANTIFICATIONS GENERATED BY EARLIER VERSIONS OF QTLTOOLS WITH QUANTIFICATIONS FROM VERSION 1.2 AND ABOVE
AS THIS WILL CREATE A BIAS IN YOUR DATASET.
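The meta-exon step described earlier in this section can be pictured as a standard interval merge. The sketch below is an illustration in Python, not the QTLtools implementation; merge_exons is a hypothetical helper, and it assumes inclusive (start, end) coordinates where exons sharing at least one base are merged.

```python
def merge_exons(exons):
    """Merge overlapping (start, end) exon intervals into meta-exons.

    Assumption: coordinates are inclusive, and two exons are merged only
    when they share at least one base.
    """
    merged = []
    for start, end in sorted(exons):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous meta-exon: extend it if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(m) for m in merged]


# Hypothetical exons of one gene: the first two overlap, the third does not.
meta_exons = merge_exons([(100, 200), (150, 300), (400, 500)])
print(meta_exons)
```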
OPTIONS
--gtf gene_annotation.gtf
Gene annotations in GTF format. These can be obtained from <https://www.gencodegenes.org/>.
REQUIRED.
--bam [in.bam|in.sam|in.cram]
Sequence data in BAM/SAM/CRAM format sorted by chromosome and then position. One sample per BAM
file. REQUIRED.
--out-prefix output
Output prefix. REQUIRED.
--sample sample_name
The sample name of the BAM file. If not provided the sample name will be taken as the BAM file
path.
--rpkm Output RPKM values.
--tpm Output TPM values.
--xxhash
Rather than using the GTF file name to generate a unique hash for the options used, use the hash
of the GTF file.
--no-hash
Do not include a hash signifying the options used in the quantification in the output file names.
NOT RECOMMENDED.
--gene-types gene_type ...
Only quantify these gene types. Requires gene_type attribute in GTF. It will also use
transcript_type if present.
--filter-mapping-quality integer
Minimum mapping quality for a read or read pair to be considered. Set this to only include
uniquely mapped reads. DEFAULT=10.
--filter-mismatch integer|float
Maximum mismatches allowed in a read. If between 0 and 1 taken as the fraction of read length.
Requires NM attribute in the BAM file. DEFAULT=OFF.
--filter-mismatch-total integer|float
Maximum total mismatches allowed in paired-reads. If between 0 and 1 taken as the fraction of
combined read length. Requires NM attribute in the BAM file. DEFAULT=OFF.
--filter-min-exon integer
Minimum length of an exon for it to be quantified. Exons smaller than this will not be printed
out in the exon quantifications, but will still count towards gene quantifications. DEFAULT=0.
--filter-remove-duplicates
Remove duplicate sequencing reads, as indicated by the aligner, in the process. NOT RECOMMENDED.
--filter-failed-qc
Remove fastq reads that fail sequencing QC as indicated by the sequencer.
--check-proper-pairing
If provided only properly paired reads according to the aligner that are in correct orientation
will be considered. Otherwise all pairs in correct orientation will be considered.
--check-consistency
If provided checks the consistency of split reads with annotation, rather than pure overlap of one
of the blocks of the split read.
--no-merge
If provided overlapping mate pairs will not be merged. Default behavior is to merge overlapping
mate pairs based on the amount of overlap, such that each mate pair counts for less than 1 read.
--legacy-options
Exactly replicate Dermitzakis lab original quantification script. DO NOT USE.
--region chr:start-end
Genomic region to be processed. E.g. chr4:12334456-16334456, or chr5.
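The integer|float argument of --filter-mismatch and --filter-mismatch-total above can be read as follows. This is a sketch in Python, not QTLtools code: max_mismatches and passes_filter are hypothetical names, and the handling of the boundaries (values strictly between 0 and 1 are fractions, and a read with exactly the maximum number of mismatches is kept) is an assumption that the text above does not spell out.

```python
def max_mismatches(threshold, read_length):
    """Turn a --filter-mismatch style value into an absolute mismatch limit.

    Assumption: values strictly between 0 and 1 are a fraction of the read
    length (or combined read length for the paired filter); anything else is
    taken as an absolute count.
    """
    if 0 < threshold < 1:
        return threshold * read_length
    return threshold


def passes_filter(nm, threshold, read_length):
    """nm is the mismatch count, e.g. from the NM attribute of the read."""
    return nm <= max_mismatches(threshold, read_length)


# Hypothetical 75 bp read with a 5% limit, i.e. at most 3.75 mismatches.
print(passes_filter(nm=3, threshold=0.05, read_length=75))
print(passes_filter(nm=4, threshold=0.05, read_length=75))
# Absolute limit of 8 mismatches across a 150 bp pair.
print(passes_filter(nm=8, threshold=8, read_length=150))
```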
OUTPUT FILES
Unless --no-hash is provided, all output files will include a hash value corresponding to the combination of
the specific options used. This is given so that one does not merge quantifications from samples that
were quantified differently, which would create a bias in the dataset.
.gene.count.bed .exon.count.bed .gene.rpkm.bed .exon.rpkm.bed .gene.tpm.bed .exon.tpm.bed
These are the quantification results files with the following columns:
1 chr Phenotype's chromosome
2 start Phenotype's start position (0-based)
3 end Phenotype's end position (1-based)
4 gene|exon The gene or exon ID.
5 info|geneID Information about the gene or the gene ID of the exon. The gene info is separated by
semicolons, and L=gene length, T=gene type, R=gene positions, N=gene name
6 strand Phenotype's strand
7 sample_name The sample name of the BAM file
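One row of a gene quantification file with the columns listed above could be parsed as sketched below. This is a Python illustration only: parse_gene_row is a hypothetical helper, the row itself is made up, and both the tab delimiter and the exact layout of the R= field are assumptions.

```python
def parse_gene_row(line):
    """Split one gene-level row into the seven documented columns."""
    chrom, start, end, gene_id, info, strand, value = line.rstrip("\n").split("\t")
    # The info column is semicolon-separated: L=length, T=type, R=positions, N=name.
    fields = dict(item.split("=", 1) for item in info.split(";") if item)
    return {
        "chr": chrom,
        "start": int(start),  # 0-based
        "end": int(end),      # 1-based
        "gene": gene_id,
        "length": int(fields["L"]),
        "type": fields["T"],
        "positions": fields["R"],
        "name": fields["N"],
        "strand": strand,
        "value": float(value),
    }


# Made-up row; identifiers and numbers are not taken from real output.
example = ("chr22\t16062156\t16063236\tGENE_ID_EXAMPLE\t"
           "L=1080;T=pseudogene;R=chr22:16062157-16063236;N=EXAMPLE\t+\t12.5")
row = parse_gene_row(example)
print(row["name"], row["length"], row["value"])
```

Because start is 0-based and end is 1-based, end - start gives the span of the phenotype in bases.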
.stats
Details the statistics of the quantification, with the following rows:
1 filtered_secondary_alignments_(does_not_count_towards_total_reads)  Number of secondary alignments
2 total_reads  Number of reads in the BAM file
3 filtered_unmapped  Number of unmapped reads
4 filtered_failqc  Number of reads with the failed QC tag
5 filtered_duplicate  Number of duplicate reads
6 filtered_mapQ_less_than_X  Number of reads below the mapping quality threshold X
7 filtered_notpaired  Number of pairs that were not in the correct orientation or were not properly paired
8 filtered_mismatches_greater_than_X_Y  Number of reads failing the mismatches per read, X, and mismatches total filters, Y
9 filtered_unmatched_mate_pairs  Number of reads where there was a paired-read with a missing mate
10 total_good  Number of reads that passed all filters
11 total_exonic  Number of reads that aligned to exons and passed all filters
12 total_exonic_multi_counting  Number of reads that aligned to exons when we count reads that align to multiple exons multiple times
13 total_merged_reads  Number of reads where the mate pairs were overlapping and thus were merged
14 total_exonic_multi_counting_after_merge_(used_for_rpkm)  Number of reads that aligned to exons when we merge overlapping mate pairs
15 good_over_total  Number of good reads over the total number of reads
16 exonic_over_total  Number of exonic reads over the total number of reads
17 exonic_over_good  Number of exonic reads over the number of good reads
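The three ratio rows (15 to 17) follow directly from the count rows. A minimal Python sketch, using made-up counts and a hypothetical summary_ratios helper (how the .stats file is laid out on disk is not assumed here):

```python
def summary_ratios(counts):
    """Recompute rows 15-17 from total_reads, total_good and total_exonic."""
    total = counts["total_reads"]
    good = counts["total_good"]
    exonic = counts["total_exonic"]
    return {
        "good_over_total": good / total,
        "exonic_over_total": exonic / total,
        "exonic_over_good": exonic / good,
    }


# Made-up counts for illustration only.
counts = {"total_reads": 1000, "total_good": 800, "total_exonic": 600}
ratios = summary_ratios(counts)
print(ratios)
```

Comparing these ratios across samples is a quick way to spot a sample that lost an unusual share of reads to the filters.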
EXAMPLE
o Quantifying a sample mapped with GEM, outputting TPM and RPKM values, and taking the hash of the GTF
file:
QTLtools quan --bam HG00381.chr22.bam --gtf gencode.v19.annotation.chr22.gtf.gz --out-prefix HG00381
--sample HG00381 --rpkm --tpm --xxhash --filter-mismatch-total 8 --filter-mapping-quality 150
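For readers unfamiliar with the two normalised units requested by --rpkm and --tpm, the sketch below shows the standard textbook definitions in Python. It is not taken from the QTLtools source: the function names and numbers are hypothetical, and the only detail grounded in this page is that the read total used for RPKM is the total_exonic_multi_counting_after_merge row of the .stats file.

```python
def rpkm(count, length_bp, total_reads):
    """Reads per kilobase of feature per million reads (standard definition)."""
    return count * 1e9 / (length_bp * total_reads)


def tpm(counts, lengths_bp):
    """Transcripts per million (standard definition): length-normalise each
    count, then rescale so that the values sum to one million."""
    rates = [c / l for c, l in zip(counts, lengths_bp)]
    scale = sum(rates)
    return [r / scale * 1e6 for r in rates]


# Hypothetical gene: 500 counts, 2000 bp long, 1,000,000 reads in total.
gene_rpkm = rpkm(count=500, length_bp=2000, total_reads=1_000_000)
# Two hypothetical genes of equal length, one with twice the counts.
gene_tpm = tpm(counts=[10, 20], lengths_bp=[1000, 1000])
print(gene_rpkm, gene_tpm)
```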
SEE ALSO
QTLtools(1)
QTLtools website: <https://qtltools.github.io/qtltools>
BUGS
Please submit bugs to <https://github.com/qtltools/qtltools>
CITATION
Delaneau, O., Ongen, H., Brown, A. et al. A complete tool set for molecular QTL discovery and analysis.
Nat Commun 8, 15452 (2017). <https://doi.org/10.1038/ncomms15452>
AUTHORS
Halit Ongen (halitongen@gmail.com), Olivier Delaneau (olivier.delaneau@gmail.com)
QTLtools-v1.3 06 May 2020 QTLtools-quan(1)