Provided by: cd-hit_4.6.4-1_amd64

NAME

       cd-hit-para.pl - divide a big clustering job into pieces to run cd-hit or cd-hit-est jobs

SYNOPSIS

       cd-hit-para.pl options

DESCRIPTION

              This script divides a big clustering job into pieces and submits the pieces to
              remote computers over a network so that they run in parallel.  After all the jobs
              have finished, the script merges the clustering results, as if you had run a
              single cd-hit or cd-hit-est job.

              You can also use it to divide a big job on a single computer that does not have
              enough RAM (with the --L option).
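
              As a rough illustration of the single-computer mode, the sketch below assembles
              a local command line from the options documented on this page.  The file names
              and the segment count are hypothetical, and the command is executed only if the
              script and the input file exist in the current directory.

```shell
# Illustrative file names, not taken from this manual.
input="proteins.fasta"
output="proteins_nr"

# Build the argument list: run locally on one cpu (--L 1) and split the
# input into 16 segments (--S 16, an arbitrary value; the default is 64).
set -- ./cd-hit-para.pl -i "$input" -o "$output" --P cd-hit --L 1 --S 16
local_cmd="$*"
echo "$local_cmd"

# Execute only when the script and the input file are actually present.
if [ -x ./cd-hit-para.pl ] && [ -f "$input" ]; then
    "$@"
fi
```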

   Requirements:
              1 When you run this script over a network, the directory where you run the
              script, and the input files, must be available on all the remote hosts under an
              identical path.

              2 If you choose "ssh" to submit jobs, you must have passwordless ssh to every
              remote host; see the ssh manual for how to set up passwordless ssh.

              3 I suggest using a queuing system instead of ssh; PBS and SGE are currently
              supported.

              4 cd-hit, cd-hit-2d, cd-hit-est, cd-hit-est-2d, cd-hit-div and cd-hit-div.pl
              must be in the same directory as this script.
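
              The requirements above can be checked before a run with a small pre-flight
              script.  This is only a sketch under stated assumptions: the hosts file name
              (hosts.txt), the CDHIT_DIR variable and the function names are hypothetical,
              and the hosts file is assumed to carry one host name in the first field of each
              line.  The ssh option BatchMode=yes makes ssh fail instead of prompting for a
              password.

```shell
# Directory assumed to hold cd-hit-para.pl (hypothetical variable; defaults to ".").
script_dir="${CDHIT_DIR:-.}"

# Helper programs that requirement 4 says must sit next to the script.
required="cd-hit cd-hit-2d cd-hit-est cd-hit-est-2d cd-hit-div cd-hit-div.pl"

# Print the name of every required helper that is missing from directory $1.
missing_helpers() {
    for prog in $required; do
        [ -e "$1/$prog" ] || printf '%s\n' "$prog"
    done
}

# Succeed only if host $1 accepts a passwordless login (requirement 2)
# and directory $2 exists there under the same path (requirement 1).
host_ready() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" test -d "$2" 2>/dev/null
}

missing=$(missing_helpers "$script_dir")
if [ -n "$missing" ]; then
    echo "missing in $script_dir:" $missing
fi

# Check each host listed in the hosts file, if one is present.
hosts_file="hosts.txt"   # hypothetical file name
if [ -f "$hosts_file" ]; then
    while read -r host _; do
        [ -n "$host" ] || continue
        # </dev/null keeps ssh from consuming the rest of the hosts file.
        host_ready "$host" "$PWD" </dev/null ||
            echo "$host: no passwordless ssh, or $PWD is not available there"
    done < "$hosts_file"
fi
```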

       Options

       -i input filename in fasta format, required

       -o output filename, required

       --P program, "cd-hit" or "cd-hit-est", default "cd-hit"

       --B filename of list of hosts,

              required unless the --Q or --L option is supplied

       --L number of cpus on local computer, default 0

              when you are not running it over a cluster, you can use this option to divide a
              big clustering job into small pieces; I suggest you just use "--L 1" unless you
              have enough RAM for each cpu

       --S Number of segments to split input DB into, default 64

       --Q number of jobs to submit to the queuing system, default 0

              by default, the program uses ssh mode to submit remote jobs

       --T type of queuing system, "PBS", "SGE" are supported, default PBS

       --R restart file, used to resume after a run has crashed

       -h print this help

       More cd-hit/cd-hit-est options can be specified on the command line

              For questions or bug reports, contact Weizhong Li at liwz@sdsc.edu
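
              To show how the options combine, the sketch below prints one possible command
              line for ssh mode and one for queue mode.  All file names and numeric values
              are hypothetical.  The trailing -c option is the sequence identity threshold of
              cd-hit and cd-hit-est, used here as an example of an extra option passed
              through on the command line.

```shell
# Illustrative names, not taken from this manual.
input="nr.fasta"
output="nr90"
hosts="hosts.txt"

# ssh mode: --B names the file that lists the remote hosts.
ssh_cmd="./cd-hit-para.pl -i $input -o $output --P cd-hit --B $hosts --S 64 -c 0.9"

# Queue mode: --Q sets the number of jobs to submit, --T selects PBS or SGE.
queue_cmd="./cd-hit-para.pl -i $input -o $output --P cd-hit-est --Q 20 --T SGE --S 64 -c 0.95"

# Print the commands for review instead of running them.
echo "$ssh_cmd"
echo "$queue_cmd"
```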