Provided by: cd-hit_4.6.1-2012-08-27-2_amd64

NAME

       cd-hit-2d-para.pl  -  divide  a big clustering job into pieces to run cd-hit-2d or cd-hit-
       est-2d jobs

SYNOPSIS

       cd-hit-2d-para.pl options

DESCRIPTION

              This script divides a big clustering job into pieces and submits the jobs to remote
              computers over a network so that they run in parallel.  After all the jobs have
              finished, the script merges the clustering results as if you had run a single
              cd-hit-2d or cd-hit-est-2d job.

              You can also use it to divide a big job on a single computer if that computer does
              not have enough RAM to run the job in one piece (see the --L option).
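
              As a rough illustration of the "pieces" mentioned above, the sketch below
              enumerates segment pairs for the documented defaults (--S 2, --S2 8).  It
              assumes that every segment of the 1st db is compared with every segment of the
              2nd db as one sub-job; this page does not describe the actual scheduling, and
              segment_pairs is a hypothetical helper, not part of cd-hit.

```python
from itertools import product


def segment_pairs(s1=2, s2=8):
    """Return (db1_segment, db2_segment) index pairs, one per assumed sub-job.

    Assumption: each db1 segment is paired with each db2 segment.
    """
    if s1 < 1 or s2 < 1:
        raise ValueError("segment counts must be positive")
    return list(product(range(s1), range(s2)))


pairs = segment_pairs()
print(len(pairs))  # 16 pairs with the documented defaults
```

              Under this assumption, raising either segment count shrinks each piece but
              multiplies the number of sub-jobs.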

   Requirements:
              1 When you run this script over a network, the directory where you run the
                script and the input files must be available on all the remote hosts under
                an identical path.

              2 If you choose "ssh" to submit jobs, you must have passwordless ssh access to
                every remote host; see the ssh manual for how to set up passwordless ssh.

              3 I suggest using a queuing system instead of ssh; I currently support PBS and
                SGE.

              4 cd-hit-2d, cd-hit-est-2d, cd-hit-div and cd-hit-div.pl must be in the same
                directory as this script.
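
              Requirement 4 can be checked before a long run is started.  The following is a
              minimal sketch, not part of cd-hit: missing_helpers is a hypothetical function
              and the directory passed to it is whatever location holds cd-hit-2d-para.pl on
              your system.

```python
import os

# Helper programs named in requirement 4.
HELPERS = ("cd-hit-2d", "cd-hit-est-2d", "cd-hit-div", "cd-hit-div.pl")


def missing_helpers(script_dir):
    """Return the helper programs that are absent from script_dir."""
    return [
        name
        for name in HELPERS
        if not os.path.isfile(os.path.join(script_dir, name))
    ]
```

              An empty list means all four programs were found next to the script; the check
              does not test whether they are executable.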

       Options

       -i     input filename for 1st db in fasta format, required

       -i2    input filename for 2nd db in fasta format, required

       -o     output filename, required

       --P    program, "cd-hit-2d" or "cd-hit-est-2d", default "cd-hit-2d"

       --B    filename of list of hosts, required unless the --Q or --L option is supplied

       --L    number of cpus on local computer, default 0.  When you are not running it over a
              cluster, you can use this option to divide a big clustering job into small pieces.
              I suggest you just use "--L 1" unless you have enough RAM for each cpu

       --S    Number of segments to split 1st db into, default 2

       --S2   Number of segments to split 2nd db into, default 8

       --Q    number of jobs to submit to the queuing system, default 0.  By default, the program
              uses ssh mode to submit remote jobs

       --T    type of queuing system, "PBS", "SGE" are supported, default PBS

       --R    restart file, used to resume a run after a crash

       -h     print this help

       More cd-hit-2d/cd-hit-est-2d options can be specified on the command line
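
       The options above can be combined as in the sketch below, which assembles the
       argument list for a single-machine run.  It uses only flags listed on this page;
       build_command and the file names db1.fasta, db2.fasta and out are hypothetical, and
       the extra argument stands for any further cd-hit-2d/cd-hit-est-2d options, which
       are passed through unchanged.

```python
def build_command(db1, db2, out, local_cpus=1, seg1=2, seg2=8, extra=()):
    """Assemble an argv list for a single-machine cd-hit-2d-para.pl run."""
    argv = [
        "cd-hit-2d-para.pl",
        "-i", db1,                # 1st db, fasta format
        "-i2", db2,               # 2nd db, fasta format
        "-o", out,                # output filename
        "--L", str(local_cpus),   # number of local cpus
        "--S", str(seg1),         # segments for the 1st db
        "--S2", str(seg2),        # segments for the 2nd db
    ]
    argv.extend(extra)  # further cd-hit-2d/cd-hit-est-2d options, unchanged
    return argv


print(" ".join(build_command("db1.fasta", "db2.fasta", "out")))
# prints: cd-hit-2d-para.pl -i db1.fasta -i2 db2.fasta -o out --L 1 --S 2 --S2 8
```

       The list form can be handed to subprocess.run without shell quoting; for a cluster
       run, "--L" would be replaced by "--B" with a host list file or by "--Q" and "--T".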

              Questions, bugs, contact Weizhong Li at liwz@sdsc.edu