Ubuntu Manpage: convert_project - convert assembly and sequencing file types

Provided by: mira-assembler_4.9.5-5_amd64

NAME

       convert_project - convert assembly and sequencing file types

DESCRIPTION

       This program is part of the MIRA assembler package. It is used to convert project file types into other
       types. Please check out the documentation below for more detailed information about convert_project.

SYNOPSIS

       convert_project
              [-f <fromtype>] [-t <totype> [-t <totype> ...]]  [-aChimMsuZ]  [-AcflnNoqrtvxXyz  {...}]  {infile}
              {outfile} [<totype> <totype> ...]

OPTIONS

       -f <fromtype>
              load this type of project files, where fromtype is:

       caf    a complete assembly or single sequences from CAF

       maf    a complete assembly or single sequences from MAF

       fasta  sequences from a FASTA file

       fastq  sequences from a FASTQ file

       gbf    sequences from a GBF file

       phd    sequences from a PHD file

       fofnexp
              sequences in EXP files from file of filenames

       -t <totype>
              write the sequences/assembly to this type (multiple mentions of -t are allowed):

       ace    sequences or complete assembly to ACE

       caf    sequences or complete assembly to CAF

       maf    sequences or complete assembly to MAF

       sam    complete assembly to SAM

       samnbb like above, but leaving out reference (backbones) in mapping assemblies

       gbf    sequences or consensus to GBF

       gff3   consensus to GFF3

       wig    assembly coverage info to wiggle file

       gcwig  assembly gc content info to wiggle file

       fasta  sequences or consensus to FASTA file (qualities to .qual)

       fastq  sequences or consensus to FASTQ file

       exp    sequences or complete assembly to EXP files in directories. Complete assemblies are suited for
              gap4 import as directed assembly. Note: using caf2gap to import into gap4 is recommended, though.

       text   complete assembly to text alignment (only when -f is caf, maf or gbf)

       html   complete assembly to HTML (only when -f is caf, maf or gbf)

       tcs    complete assembly to tcs

       hsnp   surrounding of SNP tags (SROc, SAOc, SIOc) to HTML (only when -f is caf, maf or gbf)

       asnp   analysis of SNP tags (only when -f is caf, maf or gbf)

       cstats contig statistics file like from MIRA (only when source contains contigs)

       crlist contig read list file like from MIRA (only when source contains contigs)

       maskedfasta
              reads where sequencing vector is masked out (with X) to FASTA file (qualities to .qual)

       scaf   sequences or complete assembly to single sequences CAF
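
       As a rough sketch of how repeated -t options combine, the snippet below assembles a command line that
       reads a CAF file and requests FASTA, wiggle and ACE output in one run. The names source.caf and dest are
       placeholders, and the command is only printed, not executed, so it can be inspected first.

```shell
#!/bin/sh
# Hypothetical input file and output base name; replace with real paths.
infile="source.caf"
outbase="dest"

# Build one "-t <totype>" pair per requested output type.
targets=""
for totype in fasta wig ace; do
    targets="$targets -t $totype"
done

# Assemble the full command line (explicit -f, repeated -t).
cmd="convert_project -f caf$targets $infile $outbase"

# Dry run: print the command instead of executing it.
echo "$cmd"
```

       Going by the SYNOPSIS, the same target types could also be listed after the output file name instead
       of being given through -t.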

       -a     Append to target files instead of rewriting

       -A <string>
              String with MIRA parameters to be parsed. Useful when setting parameters affecting consensus
              calling, like -CO:mrpg etc. E.g.: -A "454_SETTINGS -CO:mrpg=3"

       -b     Blind data. Replaces all bases in reads/contigs with a 'c'.

       -C     Perform hard clip to reads. When reading formats which define clipping points, will save only
              the unclipped part into the result file. Applies only to files/formats which do not contain
              contigs.

       -d     Delete gap-only columns. When output is contigs: delete columns that are entirely gaps (like
              after having deleted reads during editing in gap4 or similar). When output is reads: delete
              gaps in reads.

       -F     Filter to read groups. Special use case, do not use yet.

       -m     Make contigs (only for -t = caf or maf). Encase single reads as contig singlets into the
              CAF/MAF file.

       -n <filename>
              when given, selects only reads or contigs given by name in that file.

       -i     when -n is used, inverts the selection

       -o     FASTQ quality offset (only for -f = 'fastq'). Offset of quality values in the FASTQ file. The
              default of 0 tries to recognise the offset automatically.

       -Q <quality>
              Set default quality for bases in file types without quality values. Furthermore, do not stop
              if expected quality files are missing (e.g. '.fasta').

       -R <name>
              Rename contigs/singlets/reads with the given name string, to which a counter is appended.
              Known bug: will create duplicate names if the input contains contigs/singlets as well as free
              reads, i.e. reads that are neither in contigs nor singlets.

       -S <name>
              Naming scheme for renaming reads, important for paired-ends. Only 'solexa' is currently
              supported.
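
       The selection options -n and -i can be illustrated with a small dry run. This is a sketch only: it
       assumes the name file holds one read or contig name per line (the format is not specified here), and
       the read names and file names used are made up for the example.

```shell
#!/bin/sh
# Assumption: the name file lists one read or contig name per line.
namefile=$(mktemp)
printf '%s\n' read_0001 read_0002 > "$namefile"   # hypothetical read names

# Select only the listed reads ...
keep_cmd="convert_project -f fastq -t fasta -n $namefile reads.fastq selected"
# ... or everything except them, by adding -i to invert the selection.
drop_cmd="convert_project -f fastq -t fasta -n $namefile -i reads.fastq remaining"

# Dry run: show both commands without executing them.
echo "$keep_cmd"
echo "$drop_cmd"
```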

   The following switches work only when input (CAF or MAF) contains contigs.
       Beware: CAF and MAF can also contain just reads.

       -M     Do not extract contigs (or their consensus), but the sequence of the reads they are composed of.

       -N <filename>
              like -n, but sorts output according to order given in file.

       -r [cCqf]
              Recalculate consensus and/or consensus quality values and/or SNP feature tags.
              'c'  recalc cons & cons qualities (with IUPAC)
              'C'  recalc cons & cons qualities (forcing non-IUPAC)
              'q'  recalc consensus qualities only
              'f'  recalc SNP features
              Note: only the last of cCq is relevant; f works as a switch and can be combined with cCq
              (e.g. "-r C -r f").
              Note: if the CAF/MAF contains multiple strains, recalculation of cons & cons qualities is
              forced; you can just influence whether IUPACs are used or not.

       -s     split output into multiple files instead of creating a single file

       -u     'fillUp strain genomes'. Fill holes in the genome of one strain (N or @) with sequence from a
              consensus of other strains. Takes effect only with -r and -t gbf or fasta/q.
              In FASTA/Q: bases filled up are in lower case.
              In GBF: bases filled up are in upper case.

       -q <integer>
              Defines minimum quality a consensus base of a strain must have; consensus bases below this
              will be 'N'. Default: 0. Only used with -r, when -f is caf/maf and -t is fasta or gbf.

       -v     Print version number and exit

       -x <integer>
              Minimum contig or read length. When loading, discard all contigs/reads with a length less than
              this value. Default: 0 (=switched off). Note: not applied to reads in contigs!

       -X <integer>
              Similar to -x but applies only to reads and then to the clipped length.

       -y <integer>
              Minimum average contig coverage. When loading, discard all contigs with an average coverage
              less than this value. Default: 1.

       -z <integer>
              Minimum number of reads in contig. When loading, discard all contigs with a number of reads
              less than this value. Default: 0 (=switched off).

       -l <integer>
              when output as text or HTML: number of bases shown in one alignment line. Default: 60.

       -c <character>
              when output as text or HTML: character used to pad endgaps. Default: ' ' (blank)
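
       The interplay of -r and -q described above might be used as in the following sketch. The helper
       recalc_cmd is hypothetical, as are the file names assembly.maf and consensus and the quality
       threshold of 30; the function only prints the command line so that it can be checked before use.

```shell
#!/bin/sh
# Print a convert_project call that recalculates the consensus.
# $1: recalculation mode, one of c, C or q (see -r)
# $2: minimum consensus base quality passed to -q
recalc_cmd() {
    case "$1" in
        c|C|q) ;;                                   # modes listed under -r
        *) echo "unknown -r mode: $1" >&2; return 1 ;;
    esac
    # assembly.maf and consensus are placeholder names; "-r f" additionally
    # requests recalculation of SNP feature tags.
    echo "convert_project -f maf -t fasta -r $1 -r f -q $2 assembly.maf consensus"
}

# Dry run: non-IUPAC consensus, bases below quality 30 reported as 'N'.
recalc_cmd C 30
```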

       Aliases: caf2html, exp2fasta, ... etc. Any combination of "<validfromtype>2<validtotype>" can be used as
       program name (also using links) so that convert_project automatically sets -f and -t accordingly.
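
       Alias links of this kind could be set up roughly as follows. This is a sketch under assumptions: the
       links are created in a scratch directory, the three type pairs are arbitrary picks from the lists
       above, and /usr/bin/convert_project is only a placeholder used when the program is not found in PATH.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)

# Locate convert_project; fall back to a placeholder path if it is not
# installed (the links are then dangling, which is fine for illustration).
target=$(command -v convert_project || echo /usr/bin/convert_project)

# Create one symbolic link per <fromtype>2<totype> pair of interest.
for pair in caf2fasta maf2sam fastq2fasta; do
    ln -s "$target" "$workdir/$pair"
done

# Show the alias names that were created.
ls "$workdir"
```

       To call the aliases by name, the directory holding the links would need to be added to PATH.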

EXAMPLES

              convert_project source.maf dest.sam

              convert_project source.caf dest.fasta wig ace

              convert_project -x 2000 -y 10 source.caf dest.caf

              caf2html -l 100 -c . source.caf dest

BUGS

       To report bugs or ask for features, please use the new ticketing system at:

              http://sourceforge.net/apps/trac/mira-assembler/

AUTHOR

       The author of the mira code is Bastien Chevreux <bach@chevreux.org>

       This manual page was written by Andreas Tille <tille@debian.org> but can be freely used for any other
       distribution.

3.9.17                                              June 2013                                 CONVERT_PROJECT(1)
