Ubuntu Manpage: gmap_setup - create a genome database for GMAP or GSNAP

Provided by: gmap_2012-06-12-1ubuntu1_amd64

NAME

       gmap_setup - create a genome database for GMAP or GSNAP

SYNOPSIS

       gmap_setup -d genomename [-D destdir] [-o Makefile] FASTA

OPTIONS

       -d     genome name

       -D     destination directory for installation (defaults to gmapdb directory specified at configure time)

       -o     name of output Makefile (default is "Makefile.<genome>")

       -M     use coordinates from an .md file (e.g., seq_contig.md file from NCBI)

       -C     try to parse chromosomal coordinates from each FASTA header

       -E     interpret argument as a command, instead of a list of FASTA files

       -O     order chromosomes in numeric/alphabetic order (0 = no; 1 = yes, the default)

   Advanced options
       -W     write some output directly to file, instead of using RAM (use only if RAM is limited)

       -q     GMAP indexing interval (default: 3 nt)

       -Q     PMAP indexing interval (default: 6 aa)

DESCRIPTION

       If  you  want  to treat each FASTA entry as a separate chromosome (either because it is in fact an entire
       chromosome or because you have  contigs  without  any  chromosomal  information),  you  can  simply  call
       gmap_setup like this:

              gmap_setup -d <genome> <fasta_file>...

       The  accession  of  each  FASTA header (the word following each ">") will be the name of each chromosome.
       GMAP can handle an unlimited number of "chromosomes", with arbitrarily long  names.  In  this  way,  GMAP
       could be used as a general search program for near-identity matches against a FASTA file.
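
       Because the chromosome names come straight from the headers, it can help to preview them before building
       the database. The following is a minimal sketch, assuming a POSIX shell with awk; the function names
       list_accessions and duplicate_accessions are hypothetical and not part of GMAP. How gmap_setup treats
       repeated names is not stated here, so the second function only reports them.

```shell
# Print the accession (the first word after ">") of every FASTA header
# in the given files, one per line.
list_accessions() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$@"
}

# Print each accession that occurs more than once.
duplicate_accessions() {
    list_accessions "$@" | sort | uniq -d
}
```

       Calling list_accessions on the same files you plan to pass to gmap_setup shows the names that would
       become chromosomes.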

       -M and -C
              If your sequences are contigs that map to specific chromosomal regions, gmap_setup can either
              parse each FASTA header to determine its chromosomal region (the -C flag) or read an .md file
              that lists those regions (the -M flag). NCBI releases often include .md files, but because
              their format changes often, gmap_setup will prompt you to confirm that it has parsed the file
              correctly.

       -E     If you need to pre-process the FASTA files before gmap_setup reads them, for example because
              they are compressed or because you need to insert chromosomal information into the header
              lines, you can specify a command instead of a list of FASTA files, as in these examples:

               gmap_setup -d <genome> -E 'gunzip -c genomefiles.gz'
               gmap_setup -d <genome> -E 'cat *.fa | ./add-chromosomal-info.pl'
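
              When some input files are compressed and others are not, a small wrapper can stand in for
              the commands above. This is only a sketch, assuming a POSIX shell with gzip available and
              assuming, as the examples suggest, that gmap_setup reads the command's standard output; the
              function name emit_fasta is hypothetical.

```shell
# Write the FASTA content of each argument to standard output,
# decompressing files whose names end in .gz and copying the rest as-is.
emit_fasta() {
    for f in "$@"; do
        case "$f" in
            *.gz) gzip -dc -- "$f" ;;
            *)    cat -- "$f" ;;
        esac
    done
}
```

              Saved as an executable script that ends by calling emit_fasta "$@" (for instance under the
              hypothetical name emit-fasta.sh), it could be supplied as the -E argument together with the
              list of files.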

       -W     The gmap_setup process works best if you have a computer with enough RAM to hold the entire genome
              (e.g.,  3  gigabytes  for  a  human- or mouse-sized genome). Since the resulting genome files work
              across all machine architectures, you can find any machine with sufficient RAM to build the genome
              files and then transfer the files to another machine. (GMAP itself  runs  fine  on  machines  with
              limited  RAM.)  If you cannot find any machine with sufficient RAM for gmap_setup, you can run the
              program with the -W flag to write the files directly, but this can be very slow.
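
              To judge in advance whether -W is likely to be needed, you can compare the genome size with
              the machine's memory. The sketch below assumes roughly one byte of RAM per base (in line with
              the 3 gigabyte figure above for a human-sized genome) and a Linux-style /proc/meminfo; both
              are assumptions, and the function names are hypothetical.

```shell
# Count sequence characters (everything except header lines and
# line terminators) across the given FASTA files.
count_bases() {
    awk '!/^>/ { gsub(/\r/, ""); n += length($0) } END { print n + 0 }' "$@"
}

# Succeed if the estimated genome size is below the total physical RAM.
# Returns 2 if the amount of RAM cannot be determined.
fits_in_ram() {
    bases=$(count_bases "$@")
    ram_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo 2>/dev/null)
    [ -n "$ram_kb" ] || return 2
    [ "$bases" -lt $(( ram_kb * 1024 )) ]
}
```

              If fits_in_ram fails for your FASTA files, the -W flag or a machine with more memory may be
              the better route.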

       -q and -Q
              If you specify a smaller interval (for example, 3  for  the  GMAP  interval),  you  can  create  a
              higher-resolution  database, which can be useful for mapping small oligomers (smaller than 18 nt).
              However, the corresponding genome index files will be larger (twice as big if you specify  -q  3).
              These  index  files  may  exceed  the  2  gigabyte  file  offset limit on some computers, and will
              therefore fail to work on those computers.
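
              After building, you can check whether any generated file crosses that limit before copying
              the database to another machine. This is a minimal sketch, assuming a POSIX shell and taking
              the limit to be 2^31 - 1 bytes (the largest signed 32-bit offset); the function name
              files_over_limit is hypothetical, and the directory to scan is whichever destination
              directory you used.

```shell
# List regular files under a directory whose size exceeds a byte limit.
# The limit defaults to 2147483647 (2^31 - 1).
files_over_limit() {
    dir=$1
    limit=${2:-2147483647}
    find "$dir" -type f | while IFS= read -r f; do
        size=$(( $(wc -c < "$f") ))
        if [ "$size" -gt "$limit" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

              An empty result means no file under the directory exceeds the limit.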

AUTHOR

       Thomas D. Wu and Colin K. Watanabe

REPORTING BUGS

       Report bugs to Thomas Wu <twu@gene.com>.

COPYRIGHT

       Copyright 2005 Genentech, Inc. All rights reserved.
