Provided by: ray_2.3.1-7build1_amd64 bug

NAME

       Ray - assemble genomes in parallel using the message-passing interface

SYNOPSIS

       mpiexec -n NUMBER_OF_RANKS Ray -k KMERLENGTH -p l1_1.fastq l1_2.fastq -p l2_1.fastq l2_2.fastq -o test

       mpiexec -n NUMBER_OF_RANKS Ray Ray.conf # with commands in a file

DESCRIPTION:

       The  Ray  genome  assembler  is  built  on top of the RayPlatform, a generic plugin-based distributed and
       parallel compute engine that uses the message-passing interface for passing messages.

       Ray targets several applications:

              - de novo genome assembly (with Ray vanilla) - de novo meta-genome assembly (with Ray Meta)  -  de
              novo  transcriptome assembly (works, but not tested a lot) - quantification of contig abundances -
              quantification of  microbiome  consortia  members  (with  Ray  Communities)  -  quantification  of
              transcript  expression  -  taxonomy  profiling  of  samples (with Ray Communities) - gene ontology
              profiling of samples (with Ray Ontologies)

       -help

              Displays this help page.

       -version

              Displays Ray version and compilation options.

              Using a configuration file

              Ray can be launched with mpiexec -n 16 Ray Ray.conf The configuration file  can  include  comments
              (starting with #).

              K-mer length

       -k kmerLength

              Selects  the length of k-mers. The default value is 21.  It must be odd because reverse-complement
              vertices are stored together.  The maximum length  is  defined  at  compilation  by  MAXKMERLENGTH
              Larger k-mers utilise more memory.

              Inputs

       -p leftSequenceFile rightSequenceFile [averageOuterDistance standardDeviation]

              Provides  two  files  containing paired-end reads.  averageOuterDistance and standardDeviation are
              automatically computed if not provided.

       -i interleavedSequenceFile [averageOuterDistance standardDeviation]

              Provides  one  file   containing   interleaved   paired-end   reads.    averageOuterDistance   and
              standardDeviation are automatically computed if not provided.

       -s sequenceFile

              Provides a file containing single-end reads.

              Outputs

       -o outputDirectory

              Specifies the directory for outputted files. Default is RayOutput

              Assembly options (defaults work well)

       -disable-recycling

              Disables read recycling during the assembly reads will be set free in 3 cases: 1. the distance did
              not match for a pair 2. the read has not met its mate 3. the library population indicates a  wrong
              placement  see Constrained traversal of repeats with paired sequences.  Sebastien Boisvert, Elenie
              Godzaridis, Francois Laviolette & Jacques Corbeil.  First  Annual  RECOMB  Satellite  Workshop  on
              Massively Parallel Sequencing, March 26-27 2011, Vancouver, BC, Canada.

       -disable-scaffolder

              Disables the scaffolder.

       -minimum-contig-length minimumContigLength

              Changes the minimum contig length, default is 100 nucleotides

       -color-space

              Runs in color-space Needs csfasta files. Activated automatically if csfasta files are provided.

       -use-maximum-seed-coverage maximumSeedCoverageDepth

              Ignores any seed with a coverage depth above this threshold.  The default is 4294967295.

       -use-minimum-seed-coverage minimumSeedCoverageDepth

              Sets  the  minimum  seed  coverage  depth.  Any path with a coverage depth lower than this will be
              discarded. The default is 0.

              Distributed storage engine (all these values are for each MPI rank)

       -bloom-filter-bits bits

              Sets the number of bits for the Bloom filter Default is 268435456 bits, 0 bits disables the  Bloom
              filter.

       -hash-table-buckets buckets

              Sets the initial number of buckets. Must be a power of 2 !  Default value: 268435456

       -hash-table-buckets-per-group buckets

              Sets the number of buckets per group for sparse storage Default value: 64, Must be between >=1 and
              <= 64

       -hash-table-load-factor-threshold threshold

              Sets the load factor threshold for real-time resizing Default value: 0.75, must be >= 0.5 and < 1

       -hash-table-verbosity

              Activates verbosity for the distributed storage engine

              Biological abundances

       -search searchDirectory

              Provides a directory containing fasta files to be searched in the  de  Bruijn  graph.   Biological
              abundances       will       be       written       to      RayOutput/BiologicalAbundances      See
              Documentation/BiologicalAbundances.txt

       -one-color-per-file

              Sets one color per file instead of one per sequence.  By default, each sequence in each file has a
              different  color.   For files with large numbers of sequences, using one single color per file may
              be more efficient.

              Taxonomic profiling with colored de Bruijn graphs

       -with-taxonomy Genome-to-Taxon.tsv TreeOfLife-Edges.tsv Taxon-Names.tsv

              Provides   a   taxonomy.    Computes   and    writes    detailed    taxonomic    profiles.     See
              Documentation/Taxonomy.txt for details.

       -gene-ontology OntologyTerms.txt
              Annotations.txt

              Provides  an  ontology and annotations.  OntologyTerms.txt is fetched from http://geneontology.org
              Annotations.txt is a 2-column file (EMBL_CDS handle       &       gene  ontology  identifier)  See
              Documentation/GeneOntology.txt

              Other outputs

       -enable-neighbourhoods

              Computes     contig     neighborhoods     in     the     de     Bruijn    graph    Output    file:
              RayOutput/NeighbourhoodRelations.txt

       -amos

              Writes the AMOS file called RayOutput/AMOS.afg An AMOS file contains read  positions  on  contigs.
              Can be opened with software with graphical user interface.

       -write-kmers

              Writes  k-mer  graph  to  RayOutput/kmers.txt  The  resulting  file  is  not utilised by Ray.  The
              resulting file is very large.

       -write-read-markers

              Writes read markers to disk.

       -write-seeds

              Writes seed DNA sequences to RayOutput/Rank<rank>.RaySeeds.fasta

       -write-extensions

              Writes extension DNA sequences to RayOutput/Rank<rank>.RayExtensions.fasta

       -write-contig-paths

              Writes contig paths with coverage values to RayOutput/Rank<rank>.RayContigPaths.txt

       -write-marker-summary

              Writes marker statistics.

              Memory usage

       -show-memory-usage

              Shows memory usage. Data is fetched from /proc on GNU/Linux Needs __linux__

       -show-memory-allocations

              Shows memory allocation events

              Algorithm verbosity

       -show-extension-choice

              Shows the choice made (with other choices) during the extension.

       -show-ending-context

              Shows the ending context of each extension.  Shows the children of the vertex where extension  was
              too difficult.

       -show-distance-summary

              Shows summary of outer distances used for an extension path.

       -show-consensus

              Shows the consensus when a choice is done.

              Checkpointing

       -write-checkpoints checkpointDirectory

              Write checkpoint files

       -read-checkpoints checkpointDirectory

              Read checkpoint files

       -read-write-checkpoints checkpointDirectory

              Read and write checkpoint files

              Message routing for large number of cores

       -route-messages

              Enables  the Ray message router. Disabled by default.  Messages will be routed accordingly so that
              any rank can communicate directly with only a few others.  Without -route-messages, any  rank  can
              communicate   directly   with   any   other   rank.    Files  generated:  Routing/Connections.txt,
              Routing/Routes.txt and Routing/RelayEvents.txt and Routing/Summary.txt

       -connection-type type

              Sets the connection type for routes.  Accepted values are debruijn,  hypercube,  polytope,  group,
              random, kautz and complete. Default is debruijn.

              debruijn: a full de Bruijn graph a given alphabet and diameter hypercube: a hypercube, alphabet is
              {0,1} and the vertices is  a  power  of  2  polytope:  a  convex  regular  polytope,  alphabet  is
              {0,1,...,B-1}  and  the  vertices  is a power of B group: silly model where one representative per
              group can communicate with outsiders random: Erdos-Renyi model kautz: a full de Kautz graph, which
              is a subgraph of a de Bruijn graph complete: a full graph with all the possible connections

              With  the type debruijn, the number of ranks must be a power of something.  Examples: 256 = 16*16,
              512=8*8*8, 49=7*7, and so on.  Otherwise, don't use debruijn routing but use another one With  the
              type kautz, the number of ranks n must be n=(k+1)*k^(d-1) for some k and d

       -routing-graph-degree degree

              Specifies the outgoing degree for the routing graph.  See Documentation/Routing.txt

              Hardware testing

       -test-network-only

              Tests the network and returns.

       -write-network-test-raw-data

              Writes one additional file per rank detailing the network test.

       -exchanges NumberOfExchanges

              Sets the number of exchanges

       -disable-network-test

              Skips the network test.

              Debugging

       -verify-message-integrity

              Checks  message  data  reliability  for  any  non-empty  message.   add '-D CONFIG_SSE_4_2' in the
              Makefile to use hardware instruction (SSE 4.2)

       -run-profiler

              Runs the profiler as the code runs. By default,  only  show  granularity  warnings.   Running  the
              profiler increases running times.

       -with-profiler-details

              Shows  number  of  messages sent and received in each methods during in each time slices (epochs).
              Needs -run-profiler.

       -show-communication-events

              Shows all messages sent and received.

       -show-read-placement

              Shows read placement in the graph during the extension.

       -debug-bubbles

              Debugs bubble code.  Bubbles can be due to  heterozygous  sites  or  sequencing  errors  or  other
              (unknown) events

       -debug-seeds

              Debugs seed code.  Seeds are paths in the graph that are likely unique.

       -debug-fusions

              Debugs fusion code.

       -debug-scaffolder

              Debug the scaffolder.

       FILES

              Input files

              Note: file format is determined with file extension.

              .fasta   .fasta.gz   (needs   HAVE_LIBZ=y  at  compilation)  .fasta.bz2  (needs  HAVE_LIBBZ2=y  at
              compilation) .fastq .fastq.gz (needs HAVE_LIBZ=y at compilation) .fastq.bz2  (needs  HAVE_LIBBZ2=y
              at compilation) .sff (paired reads must be extracted manually) .csfasta (color-space reads)

              Outputted files

              Scaffolds

              RayOutput/Scaffolds.fasta

              The scaffold sequences in FASTA format

              RayOutput/ScaffoldComponents.txt

              The components of each scaffold

              RayOutput/ScaffoldLengths.txt

              The length of each scaffold

              RayOutput/ScaffoldLinks.txt

              Scaffold links

              Contigs

              RayOutput/Contigs.fasta

              Contiguous sequences in FASTA format

              RayOutput/ContigLengths.txt

              The lengths of contiguous sequences

              Summary

              RayOutput/OutputNumbers.txt

              Overall numbers for the assembly

              de Bruijn graph

              RayOutput/CoverageDistribution.txt

              The distribution of coverage values

              RayOutput/CoverageDistributionAnalysis.txt

              Analysis of the coverage distribution

              RayOutput/degreeDistribution.txt

              Distribution of ingoing and outgoing degrees

              RayOutput/kmers.txt

              k-mer graph, required option: -write-kmers

              The resulting file is not utilised by Ray.  The resulting file is very large.

              Assembly steps

              RayOutput/SeedLengthDistribution.txt

              Distribution of seed length

              RayOutput/Rank<rank>.OptimalReadMarkers.txt

              Read markers.

              RayOutput/Rank<rank>.RaySeeds.fasta

              Seed DNA sequences, required option: -write-seeds

              RayOutput/Rank<rank>.RayExtensions.fasta

              Extension DNA sequences, required option: -write-extensions

              RayOutput/Rank<rank>.RayContigPaths.txt

              Contig paths with coverage values, required option: -write-contig-paths

              Paired reads

              RayOutput/LibraryStatistics.txt

              Estimation of outer distances for paired reads

              RayOutput/Library<LibraryNumber>.txt

              Frequencies for observed outer distances (insert size + read lengths)

              Partition

              RayOutput/NumberOfSequences.txt

              Number of reads in each file

              RayOutput/SequencePartition.txt

              Sequence partition

              Ray software

              RayOutput/RayVersion.txt

              The version of Ray

              RayOutput/RayCommand.txt

              The exact same command provided

              AMOS

              RayOutput/AMOS.afg

              Assembly representation in AMOS format, required option: -amos

              Communication

              RayOutput/MessagePassingInterface.txt

              Number of messages sent

              RayOutput/NetworkTest.txt

              Latencies in microseconds

              RayOutput/Rank<rank>NetworkTestData.txt

              Network test raw data

       DOCUMENTATION

              -  mpiexec  -n  1  Ray  -help|less  (always up-to-date) - This help page (always up-to-date) - The
              directory  Documentation/  -  Manual  (Portable  Document   Format):   InstructionManual.tex   (in
              Documentation)                -                Mailing                list               archives:
              http://sourceforge.net/mailarchive/forum.php?forum_name=denovoassembler-users

       AUTHOR

              Written by Sebastien Boisvert.

       REPORTING BUGS

              Report      bugs       to       denovoassembler-users@lists.sourceforge.net       Home       page:
              <http://denovoassembler.sourceforge.net/>

       COPYRIGHT

              This program is free software: you can redistribute it and/or modify it under the terms of the GNU
              General Public License as published by the Free Software Foundation, version 3 of the License.

              This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;  without
              even  the  implied  warranty  of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
              General Public License for more details.

              You have received a copy of the GNU General Public License along with this program (see LICENSE).

       Ray 2.1.0

       License for Ray: GNU General Public License version 3 RayPlatform version: 1.1.0 License for RayPlatform:
       GNU Lesser General Public License version 3

       MAXKMERLENGTH:  32  KMER_U64_ARRAY_SIZE:  1  Maximum  coverage  depth stored by CoverageDepth: 4294967295
       MAXIMUM_MESSAGE_SIZE_IN_BYTES: 4000 bytes FORCE_PACKING = n ASSERT = n HAVE_LIBZ  =  y  HAVE_LIBBZ2  =  y
       CONFIG_PROFILER_COLLECT  = n CONFIG_CLOCK_GETTIME = n __linux__ = y _MSC_VER = n __GNUC__ = y RAY_32_BITS
       = n RAY_64_BITS = y MPI standard version: MPI 2.1 MPI library: Open-MPI 1.4.2 Compiler: GNU gcc/g++ 4.4.5