Provided by: libgenome-model-tools-music-perl_0.04-4_all bug

genome music smg

NAME

       genome music smg - Identify significantly mutated genes.

VERSION

       This document describes genome music smg version 0.04 (2018-07-05 at 09:17:13)

SYNOPSIS

       genome music smg --gene-mr-file=? --output-file=? [--max-fdr=?] [--skip-low-mr-genes]
       [--bmr-modifier-file=?] [--processors=?]

        ... music smg \
             --gene-mr-file output_dir/gene_mrs \
             --output-file output_dir/smgs

       (A "gene-mr-file" can be generated using the tool "music bmr calc-bmr".)

REQUIRED ARGUMENTS

       gene-mr-file  Text
           File with per-gene mutation rates (Created using "music bmr calc-bmr")

       output-file  Text
           Output file that will list significantly mutated genes and their p-values

OPTIONAL ARGUMENTS

       max-fdr  Number
           The maximum allowed false discovery rate for a gene to be considered an SMG

           Default value '0.2' if not specified

       skip-low-mr-genes  Boolean
           Skip testing genes with MRs lower than the background MR

           Default value 'true' if not specified

       noskip-low-mr-genes  Boolean
           Make skip-low-mr-genes 'false'

       bmr-modifier-file  Text
           Tab delimited multipliers per gene that modify BMR before testing [gene_name bmr_modifier]

       processors  Integer
           Number of processors to use (requires 'foreach' and 'doMC' R packages)

           Default value '1' if not specified

DESCRIPTION

       This script runs R-based statistical tools to identify Significantly Mutated Genes (SMGs), when given
       per-gene mutation rates categorized by mutation type, and the overall background mutation rates (BMRs)
       for each of those categories (gene_mr_file, created using "music bmr calc-bmr").

       P-values and false discovery rates (FDRs) for each gene in gene_mr_file is calculated using three tests:
       Fisher's Combined P-value test (FCPT), Likelihood Ratio test (LRT), and the Convolution test (CT). For a
       gene, if its FDR for at least 2 of these tests is <= max_fdr, it will be output as an SMG. Another output
       file with prefix "_detailed" will have p-values and FDRs for all genes.

ARGUMENTS

       --bmr-modifier-file
           The user can provide a BMR modifier for each gene in the ROI file, which is a multiplier for the
           categorized background mutation rates, before testing them against the gene's categorized mutation
           rates. Such a file can be used to correct for regional or systematic bias in mutation rates across
           the genome that may be correlated to CpG deamination or DNA repair processes like transcription-
           coupled repair or mismatch repair. Mutation rates have also been associated with DNA replication
           timing, where higher mutation rates are seen in late replicating regions. Note that the same per-gene
           multiplier is used on each mutation category of BMR. Any genes from the ROI file that are not in the
           BMR modifier file will be tested against unmodified overall BMRs per mutation category. BMR modifiers
           of <=0 are not permitted, because that's just silly.
       --skip-low-mr-genes
           Genes with consistently lower MRs than the BMRs across mutation categories, may show up in the
           results as an SMG (by CT or LRT). If such genes are not of interest, they may be assigned a p-value
           of 1. This should also speed things up. Genes with higher Indel or Truncation rates than the
           background will not be skipped even if the gene's overall MR is lower than the BMR. If bmr-modifiers
           are applied, this step uses the modified BMRs instead.

AUTHORS

        Qunyuan Zhang, Ph.D.
        Cyriac Kandoth, Ph.D.
        Nathan D. Dees, Ph.D.