Provided by: mira-assembler_4.0-1_amd64

NAME

       convert_project - convert assembly and sequencing file types

DESCRIPTION

       This  program  is  part  of the MIRA assembler package. It is used to convert project file
       types into other types.   Please check out  the  documentation  below  for  more  detailed
       information about convert_project.

SYNOPSIS

       convert_project
              [-f  <fromtype>]  [-t  <totype>  [-t <totype> ...]]  [-aChimMsuZ] [-AcflnNoqrtvxXyz
              {...}] {infile} {outfile} [<totype> <totype> ...]

OPTIONS

       -f <fromtype>
              load this type of project files, where fromtype is:

       caf    a complete assembly or single sequences from CAF

       maf    a complete assembly or single sequences from MAF

       fasta  sequences from a FASTA file

       fastq  sequences from a FASTQ file

       gbf    sequences from a GBF file

       phd    sequences from a PHD file

       fofnexp
              sequences in EXP files from file of filenames

       -t <totype>
              write the sequences/assembly to this type (multiple mentions of -t are allowed):

       ace    sequences or complete assembly to ACE

       caf    sequences or complete assembly to CAF

       maf    sequences or complete assembly to MAF

       sam    complete assembly to SAM

       samnbb like above, but leaving out reference (backbones) in mapping assemblies

       gbf    sequences or consensus to GBF

       gff3   consensus to GFF3

       wig    assembly coverage info to wiggle file

       gcwig  assembly gc content info to wiggle file

       fasta  sequences or consensus to FASTA file (qualities to .qual)

       fastq  sequences or consensus to FASTQ file

       exp    sequences or complete assembly to EXP files in directories. Complete assemblies are
              suited for gap4 import as directed assembly. Note: using caf2gap to import into
              gap4 is recommended, though.

       text   complete assembly to text alignment (only when -f is caf, maf or gbf)

       html   complete assembly to HTML (only when -f is caf, maf or gbf)

       tcs    complete assembly to tcs

       hsnp   surrounding  of  SNP  tags  (SROc, SAOc, SIOc) to HTML (only when -f is caf, maf or
              gbf)

       asnp   analysis of SNP tags (only when -f is caf, maf or gbf)

       cstats contig statistics file like from MIRA (only when source contains contigs)

       crlist contig read list file like from MIRA (only when source contains contigs)

       maskedfasta
              reads where sequencing vector is masked out (with X) to FASTA  file  (qualities  to
              .qual)

       scaf   sequences or complete assembly to single sequences CAF

       -a     Append to target files instead of rewriting

       -A <string>
              String with MIRA parameters to be parsed. Useful when setting parameters that
              affect consensus calling, like -CO:mrpg. E.g.: -A "454_SETTINGS -CO:mrpg=3"

       -b     Blind data. Replaces all bases in reads/contigs with a 'c'.

       -C     Perform hard clip to reads. When reading formats which define clipping points, save
              only the unclipped part into the result file. Applies only to files/formats which
              do not contain contigs.

       -d     Delete gap-only columns. When output is contigs: delete columns that are entirely
              gaps (e.g. after reads were deleted during editing in gap4 or similar). When output
              is reads: delete gaps in reads.

       -F     Filter to read groups. Special use case, do not use yet.

       -m     Make contigs (only for -t caf or maf). Encase single reads as contig singlets in
              the CAF/MAF file.

       -n <filename>
              when given, selects only reads or contigs given by name in that file.

       -i     when -n is used, inverts the selection
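
       As an illustration of how a name file for -n could be prepared, the sketch below pulls
       read names out of a FASTA file. The helper select_names is hypothetical and not part of
       MIRA, and the one-name-per-line layout of the name file is an assumption; check the
       MIRA guide for the exact format expected.

```shell
#!/bin/sh
# Hypothetical helper (not part of MIRA): print the names of all FASTA
# records whose name matches an extended regular expression.
# Assumption: the -n name file holds one name per line.
select_names() {
    # $1 = extended regex matched against the record name
    # $2 = FASTA file
    # The name is the header text after ">" up to the first whitespace.
    sed -n 's/^>\([^[:space:]]*\).*/\1/p' "$2" | grep -E -- "$1"
}

# Possible use (file names are placeholders):
#   select_names '^read_' reads.fasta > keep.txt
#   convert_project -f fasta -t fasta -n keep.txt reads.fasta kept
# Adding -i to the second command would keep everything NOT listed in keep.txt.
```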

       -o <integer>
              FASTQ quality offset (only for -f fastq). Offset of quality values in the FASTQ
              file. The default of 0 tries to recognise the offset automatically.
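
       If automatic recognition is not wanted, the offset can be estimated beforehand. The
       sketch below is a common heuristic, not the method MIRA itself uses: it assumes plain
       four-line FASTQ records and treats any quality character below ';' (ASCII 59) as
       evidence for offset 33. The function name guess_fastq_offset is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of MIRA): guess the FASTQ quality offset
# from the lowest quality character found in the file.
# Assumptions: four lines per record, printable ASCII quality strings.
# Prints 33 or 64, or 0 (leave it to auto-detection) if no qualities are found.
guess_fastq_offset() {
    awk '
        BEGIN {
            # Lookup table from printable character to its ASCII code
            for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i
            min = 999
        }
        NR % 4 == 0 {
            # Every fourth line holds the quality string of a record
            n = length($0)
            for (j = 1; j <= n; j++) {
                ch = substr($0, j, 1)
                if ((ch in ord) && ord[ch] < min) min = ord[ch]
            }
        }
        END {
            if (min == 999)   print 0
            else if (min < 59) print 33
            else               print 64
        }
    ' "$1"
}

# Possible use (file names are placeholders):
#   convert_project -f fastq -t fasta -o "$(guess_fastq_offset reads.fastq)" reads.fastq reads
```

       The heuristic can misjudge files that contain only high-quality bases, so a result of
       64 deserves a second look.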

       -Q <quality>
              Set default quality for bases in file types without quality values. Furthermore, do
              not stop if expected quality files are missing (e.g. '.fasta').

       -R <name>
              Rename contigs/singlets/reads with the given name string, to which a counter is
              appended. Known bug: will create duplicate names if the input contains
              contigs/singlets as well as free reads, i.e. reads that are in neither contigs nor
              singlets.

       -S <name>
              (name)Scheme for renaming reads, important for paired-ends. Only 'solexa' is
              currently supported.

   The following switches work only when input (CAF or MAF) contains contigs.
       Beware: CAF and MAF can also contain just reads.

       -M     Do not extract contigs (or their consensus), but the sequence of the reads they are
              composed of.

       -N <filename>
              like -n, but sorts output according to order given in file.

       -r [cCqf]
              Recalculate consensus and/or consensus quality values and/or SNP feature tags.

              'c'    recalculate consensus and consensus qualities (with IUPAC)
              'C'    recalculate consensus and consensus qualities (forcing non-IUPAC)
              'q'    recalculate consensus qualities only
              'f'    recalculate SNP features

              Note: only the last of cCq is relevant; f works as a switch and can be combined
              with cCq (e.g. "-r C -r f").

              Note: if the CAF/MAF contains multiple strains, recalculation of consensus and
              consensus qualities is forced; you can only influence whether IUPACs are used or
              not.

       -s     split output into multiple files instead of creating a single file

       -u     'fillUp strain genomes'. Fill holes in the genome of one strain (N or @) with
              sequence from a consensus of other strains. Takes effect only with -r and -t gbf or
              fasta/q. In FASTA/Q, filled-up bases are in lower case; in GBF, filled-up bases are
              in upper case.

       -q <integer>
              Defines the minimum quality a consensus base of a strain must have; consensus bases
              below this will be 'N'. Default: 0. Only used with -r, when -f is caf/maf and -t is
              fasta or gbf.

       -v     Print version number and exit

       -x <integer>
              Minimum contig or read length. When loading, discard all contigs/reads with a
              length less than this value. Default: 0 (= switched off). Note: not applied to
              reads in contigs!

       -X <integer>
              Similar to -x but applies only to reads and then to the clipped length.

       -y <integer>
              Minimum average contig coverage. When loading, discard all contigs with an average
              coverage less than this value. Default: 1

       -z <integer>
              Minimum number of reads in contig. When loading, discard all contigs with fewer
              reads than this value. Default: 0 (= switched off)

       -l <integer>
              when output as text or HTML: number of bases shown in one alignment line.  Default:
              60.

       -c <character>
              when output as text or HTML: character used to pad endgaps. Default: ' ' (blank)

       Aliases: caf2html, exp2fasta, etc. Any combination of "<validfromtype>2<validtotype>" can
       be used as the program name (also via links), so that convert_project automatically sets
       -f and -t accordingly.
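
       Where a package does not ship a wanted alias, a link can provide it. The following is a
       minimal sketch, assuming symbolic links are acceptable on the system; the helper
       make_alias_links and the directory used in the example call are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of MIRA): create "<fromtype>2<totype>"
# alias links that all point at the same convert_project binary.
make_alias_links() {
    # $1 = directory that receives the links
    # $2 = path to the convert_project binary
    # remaining arguments = alias names, e.g. caf2fasta maf2sam
    link_dir=$1
    target=$2
    shift 2
    mkdir -p "$link_dir" || return 1
    for alias_name in "$@"; do
        # -f replaces an existing link of the same name
        ln -sf "$target" "$link_dir/$alias_name" || return 1
    done
}

# Possible use (paths are placeholders):
#   make_alias_links "$HOME/bin" "$(command -v convert_project)" caf2fasta maf2sam
#   caf2fasta source.caf dest
```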

EXAMPLES

              convert_project source.maf dest.sam

              convert_project source.caf dest.fasta wig ace

              convert_project -x 2000 -y 10 source.caf dest.caf

              caf2html -l 100 -c . source.caf dest
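
       The first example above can be applied to a whole directory with a small loop. This is
       only a sketch: batch_maf_to_sam and the DRY_RUN variable are hypothetical names, and by
       default the function merely prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper (not part of MIRA): one convert_project call per
# *.maf file in a directory, writing a .sam file next to each input.
# DRY_RUN defaults to 1, which only prints the commands; set DRY_RUN=0
# to actually execute them.
batch_maf_to_sam() {
    # $1 = directory containing *.maf files
    for maf in "$1"/*.maf; do
        # If nothing matches, the pattern stays literal; skip it
        [ -e "$maf" ] || continue
        out="${maf%.maf}.sam"
        if [ "${DRY_RUN:-1}" = 1 ]; then
            printf '%s\n' "convert_project $maf $out"
        else
            convert_project "$maf" "$out"
        fi
    done
}

# Possible use (directory name is a placeholder):
#   batch_maf_to_sam assemblies            # show what would be run
#   DRY_RUN=0 batch_maf_to_sam assemblies  # run the conversions
```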

SEE ALSO

       More extensive documentation is provided in the mira-doc package and can be found at
       /usr/share/doc/mira-assembler/DefinitiveGuideToMIRA.html.

       You can also subscribe to one of the MIRA mailing lists at

              http://www.chevreux.org/mira_mailinglists.html

       After subscribing, mail general questions to the MIRA talk mailing list:

              mira_talk@freelists.org

BUGS

       To report bugs or ask for features, please use the new ticketing system at:

              http://sourceforge.net/apps/trac/mira-assembler/

AUTHOR

       The author of the mira code is Bastien Chevreux <bach@chevreux.org>

       This manual page was written by Andreas Tille <tille@debian.org> but can  be  freely  used
       for any other distribution.