Provided by: libgenome-model-tools-music-perl_0.04-1_all bug

genome music path-scan

NAME

       genome music path-scan - Find signifcantly mutated pathways in a cohort given a list of
       somatic mutations.

VERSION

       This document describes genome music path-scan version 0.04 (2013-05-25 at 17:45:01)

SYNOPSIS

       genome music path-scan --gene-covg-dir=? --bam-list=? --pathway-file=? --maf-file=?
       --output-file=? [--bmr=?] [--genes-to-ignore=?] [--min-mut-genes-per-path=?]
       [--skip-non-coding] [--skip-silent]

        ... music path-scan \
               --bam-list input_dir/bam_file_list \
               --gene-covg-dir output_dir/gene_covgs/ \
               --maf-file input_dir/myMAF.tsv \
               --output-file output_dir/sm_pathways \
               --pathway-file input_dir/pathway_dbs/KEGG.txt \
               --bmr 8.7E-07

REQUIRED ARGUMENTS

       gene-covg-dir  Text
           Directory containing per-gene coverage files (Created using music bmr calc-covg)

       bam-list  Text
           Tab delimited list of BAM files [sample_name, normal_bam, tumor_bam] (See Description)

       pathway-file  Text
           Tab-delimited file of pathway information (See Description)

       maf-file  Text
           List of mutations using TCGA MAF specifications v2.3

       output-file  Text
           Output file that will list the significant pathways and their p-values

OPTIONAL ARGUMENTS

       bmr  Number
           Background mutation rate in the targeted regions

           Default value '1e-06' if not specified

       genes-to-ignore  Text
           Comma-delimited list of genes whose mutations should be ignored

       min-mut-genes-per-path  Number
           Pathways with fewer mutated genes than this, will be ignored

           Default value '1' if not specified

       skip-non-coding  Boolean
           Skip non-coding mutations from the provided MAF file

           Default value 'true' if not specified

       skip-silent  Boolean
           Skip silent mutations from the provided MAF file

           Default value 'true' if not specified

DESCRIPTION

       Only the following four columns in the MAF are used. All other columns may be left blank.

        Col 1: Hugo_Symbol (Need not be HUGO, but must match gene names used in the pathway file)
        Col 2: Entrez_Gene_Id (Matching Entrez ID trump gene name matches between pathway file and MAF)
        Col 9: Variant_Classification
        Col 16: Tumor_Sample_Barcode (Must match the name in sample-list, or contain it as a substring)

       The Entrez_Gene_Id can also be left blank (or set to 0), but it is highly recommended, in
       case genes are named differently in the pathway file and the MAF file.

ARGUMENTS

       --pathway-file
           This is a tab-delimited file prepared from a pathway database (such as KEGG), with the
           columns: [path_id, path_name, class, gene_line, diseases, drugs, description] The
           latter three columns are optional (but are available on KEGG). The gene_line contains
           the "entrez_id:gene_name" of all genes involved in this pathway, each separated by a
           "|" symbol.
                   For example, a line in the pathway-file would look like:

                     hsa00061    Fatty acid biosynthesis    Lipid Metabolism    31:ACACA|32:ACACB|27349:MCAT|2194:FASN|54995:OXSM|55301:OLAH

                   Ensure that the gene names and entrez IDs used match those used in the MAF
                   file. Entrez IDs are not mandatory (use a 0 if Entrez ID unknown). But if a
                   gene name in the MAF does not match any gene name in this file, the entrez IDs
                   are used to find a match (unless it's a 0).

       --gene-covg-dir
           This is usually the gene_covgs subdirectory created when you run "music bmr calc-
           covg". It should contain files for each sample that report per-gene covered base
           counts.
       --bam-list
           Provide a file containing sample names and normal/tumor BAM locations for each. Use
           the tab- delimited format [sample_name normal_bam tumor_bam] per line. This tool only
           needs sample_name, so all other columns can be skipped. The sample_name must be the
           same as the tumor sample names used in the MAF file (16th column, with the header
           Tumor_Sample_Barcode).
       --bmr
           The overall background mutation rate. This can be calculated using "music bmr calc-
           bmr".
       --genes-to-ignore
           A comma-delimited list of genes to ignore from the MAF file. This is useful when there
           are recurrently mutated genes like TP53 which might mask the significance of other
           genes.

AUTHORS

        Michael Wendl, Ph.D.

CREDITS

       This module uses reformatted copies of data from the Kyoto Encyclopedia of Genes and
       Genomes (KEGG) database:

        * KEGG - http://www.genome.jp/kegg/