Provided by: ea-utils_1.1.2+dfsg-9build1_amd64
NAME
sam-stats - ea-utils: produce digested statistics
SYNOPSIS
sam-stats [options] [file1] [file2...filen]
DESCRIPTION
Version: 1.38.681

Produces lots of easily digested statistics for the files listed

Options (default in parens):

-D             Keep track of multiple alignments
-O PREFIX      Output prefix enabling extended output (see below)
-R FIL         Coverage/RNA output (coverage, 3' bias, etc, implies -A)
-A             Report all chr sigs, even if there are more than 1000
-b INT         Number of reads to sample for per-base stats (1M)
-S INT         Size of ascii-signature (30)
-x FIL         File extension for handling multiple files (stats)
-M             Only overwrite if newer (requires -x, or multiple files)
-B             Input is bam, don't bother looking at magic
-z             Don't fail when zero entries in sam

OUTPUT:

If one file is specified, then the output is to standard out. If
multiple files are specified, or if the -x option is supplied,
the output file is <filename>.<ext>. Default extension is 'stats'.

Complete Stats:

  <STATS>           : mean, max, stdev, median, Q1 (25 percentile), Q3
  reads             : # of entries in the sam file, might not be # reads
  phred             : phred scale used
  bsize             : # reads used for qual stats
  mapped reads      : number of aligned reads (unique probe id sequences)
  mapped bases      : total of the lengths of the aligned reads
  forward           : number of forward-aligned reads
  reverse           : number of reverse-aligned reads
  snp rate          : mismatched bases / total bases (snv rate)
  ins rate          : insert bases / total bases
  del rate          : deleted bases / total bases
  pct mismatch      : percent of reads that have mismatches
  pct align         : percent of reads that aligned
  len <STATS>       : read length stats, ignored if fixed-length
  mapq <STATS>      : stats for mapping qualities
  insert <STATS>    : stats for insert sizes
  %<CHR>            : percentage of mapped bases per chr, followed by a signature

Subsampled stats (1M reads max):

  base qual <STATS> : stats for base qualities
  %A,%T,%C,%G       : base percentages

Meaning of the per-chromosome signature:

  An ascii-histogram of mapped reads by chromosome position.
  It is only output if the original SAM/BAM has a header.
  The values are the log2 of the # of mapped reads at each position + ascii '0'.

Extended output mode produces a set of files:

  .stats           : primary output
  .fastx           : fastx-toolkit compatible output
  .rcov            : per-reference counts & coverage
  .xdist           : mismatch distribution
  .ldist           : length distribution (if applicable)
  .mqdist          : mapping quality distribution
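
EXAMPLES
The per-chromosome signature described above encodes each position as log2 of the mapped-read count plus ascii '0'. The following Python sketch inverts that encoding. It is not part of sam-stats: the function names decode_signature and approx_counts are hypothetical, and the sketch assumes each character maps to exactly one log2 level. This page does not say how the value is rounded or how empty positions are written, so the recovered counts are only order-of-magnitude estimates.

```python
def decode_signature(sig):
    """Return the log2 level encoded by each signature character.

    Assumption: level = ord(char) - ord('0'), per the description
    "log2 of the # of mapped reads at each position + ascii '0'".
    """
    return [ord(ch) - ord("0") for ch in sig]


def approx_counts(sig):
    """Estimate mapped reads per position as 2**level.

    Assumption: rounding and the encoding of empty positions are not
    documented here, so treat these as rough magnitudes only.
    """
    return [2 ** level for level in decode_signature(sig)]


if __name__ == "__main__":
    example = "0358:"  # made-up signature, not real sam-stats output
    print(decode_signature(example))
    print(approx_counts(example))
```

Characters above '9' (such as ':') simply continue the scale, since the encoding adds the level to the character code of '0'.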