Ubuntu Manpage: miraconvert - convert assembly and sequencing file types

Provided by: mira-assembler_4.9.6-10_amd64

NAME

       miraconvert - convert assembly and sequencing file types

SYNOPSIS

       miraconvert   [-f   <fromtype>]   [-t   <totype>   [-t   <totype>   ...]]  [-aAbCdhimMsuZ]
       [-cflnNoPqrtvxXyYz {...}] {infile} {outfile} [<totype> <totype> ...]

OPTIONS

       -f <fromtype>
              load this type of project files, where fromtype is:

              caf a complete assembly or single sequences from CAF

              maf a complete assembly or single sequences from MAF

              fasta sequences from a FASTA file

              fastq sequences from a FASTQ file

              gb[f|k|ff] sequences from a GenBank file

              phd sequences from a PHD file

              fofnexp sequences in EXP files from file of filenames

       -t <totype>
              write the sequences/assembly to this type (multiple mentions of -t are allowed):

              ace sequences or complete assembly to ACE

              caf sequences or complete assembly to CAF

              maf sequences or complete assembly to MAF

              sam complete assembly to SAM

              samnbb like above, but leaving out reference (backbones) in mapping assemblies

              gb[f|k|ff] sequences or consensus to GenBank

              gff3 consensus to GFF3

              wig assembly coverage info to wiggle file

              gcwig assembly gc content info to wiggle file

              fasta sequences or consensus to FASTA file (qualities to .qual)

              fastq sequences or consensus to FASTQ file

              exp sequences or complete assembly to EXP files in directories. Complete assemblies
              are  suited  for  gap4  import as directed assembly.  Note: using caf2gap to import
              into gap4 is recommended though

              text complete assembly to text alignment (only when -f is caf, maf or gbf)

              html complete assembly to HTML (only when -f is caf, maf or gbf)

              tcs complete assembly to tcs

              hsnp surrounding of SNP tags (SROc, SAOc, SIOc) to HTML (only when -f is  caf,  maf
              or gbf)

              asnp analysis of SNP tags (only when -f is caf, maf or gbf)

              cstats contig statistics file like from MIRA (only when source contains contigs)

              crlist contig read list file like from MIRA (only when source contains contigs)

              maskedfasta  reads  where  sequencing  vector  is masked out (with X) to FASTA file
              (qualities to .qual)

              scaf sequences or complete assembly to single sequences CAF
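
              As a rough sketch of how -f and -t combine, the following builds a command line that
              reads a CAF file and requests three output types. The file names source.caf and dest
              are placeholders, and treating dest as a base name shared by all outputs is an
              assumption. The command is only printed here, not run.

```shell
# Placeholder file names (assumptions, not taken from the manpage).
infile=source.caf
outbase=dest

# One -f for the input type, one -t per requested output type.
cmd="miraconvert -f caf -t fasta -t wig -t ace $infile $outbase"

# Print the command for inspection before running it by hand.
echo "$cmd"
```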

       -a     Append to target files instead of rewriting

       -A     Do not Adjust sequence case

              When reading formats which define clipping points, and saving to formats  which  do
              not  have  clipping  information,  miraconvert  normally  adjusts  the case of read
              sequences: lower case for clipped parts, upper case for unclipped parts  of  reads.
              Use -A if you do not want this. See also -C.

              Applies only to files/formats which do not contain contigs.

       -b     Blind data

              Replaces all bases in reads/contigs with a 'c'

       -C     Perform hard clip to reads

              When  reading  formats  which  define clipping points, will save only the unclipped
              part into the result file.

              Applies only to files/formats which do not contain contigs.
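
              The difference between the default case adjustment and a hard clip (-C) can be
              illustrated on a single read. This is a sketch of the described behaviour, not of
              miraconvert's code; the read and both clip lengths are made-up values.

```shell
seq=ACGTACGTAC   # hypothetical 10-base read
left=2           # bases clipped at the start (assumed)
right=2          # bases clipped at the end (assumed)

len=${#seq}
keep_from=$((left + 1))
keep_to=$((len - right))

# Clipped parts go to lower case, the unclipped part stays upper case.
pre=$(printf '%s\n' "$seq" | cut -c1-"$left" | tr 'A-Z' 'a-z')
mid=$(printf '%s\n' "$seq" | cut -c"$keep_from"-"$keep_to")
post=$(printf '%s\n' "$seq" | cut -c"$((keep_to + 1))"-"$len" | tr 'A-Z' 'a-z')

adjusted="$pre$mid$post"   # default behaviour (no -A, no -C)
hardclipped="$mid"         # what -C would keep

echo "$adjusted"
echo "$hardclipped"
```

              With these values the adjusted read is acGTACGTac and the hard-clipped read is
              GTACGT.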

       -d     Delete gap only columns

              When output is contigs: delete columns that are entirely gaps  (like  after  having
              deleted reads during editing in gap4 or similar)

              When output is reads: delete gaps in reads

       -F     Filter read groups to different files

              Works only for input files with readgroups (CAF/MAF). 3 (or 4) files are generated:
              one or two for paired, one for unpaired and one for debris reads.

              Reads in paired file are interlaced by default, use -F  twice  to  create  separate
              files.

       -m     Make contigs (only for -t = caf or maf)

              Encase single reads as contig singlets into the CAF/MAF file.

       -n <filename>
              when given, selects only reads or contigs given by name in that file.

       -N <filename>
              like -n, but sorts output according to order given in file.

       -i     when -n is used, inverts the selection
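
              The selection semantics of -n, -N and -i can be mimicked on a plain list of names.
              The sketch below assumes a name file with one name per line and assumes that -n
              keeps the order of the input; the read names are invented, and grep only stands in
              for miraconvert's own selection.

```shell
# Scratch directory with hypothetical read names.
workdir=$(mktemp -d)
printf '%s\n' read_1 read_2 read_3 read_4 > "$workdir/input_names.txt"
printf '%s\n' read_3 read_1 > "$workdir/names.txt"

# Like -n: keep the listed names, in input order (assumed).
selected=$(grep -F -x -f "$workdir/names.txt" "$workdir/input_names.txt" | paste -s -d ' ' -)

# Like -n with -i: keep everything that is not listed.
inverted=$(grep -F -x -v -f "$workdir/names.txt" "$workdir/input_names.txt" | paste -s -d ' ' -)

# Like -N: keep the listed names, in the order of the name file.
ordered=$(grep -F -x -f "$workdir/input_names.txt" "$workdir/names.txt" | paste -s -d ' ' -)

echo "selected: $selected"
echo "inverted: $inverted"
echo "ordered:  $ordered"

rm -rf "$workdir"
```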

       -o <quality>
              FASTQ quality Offset (only for -f = 'fastq')

              Offset  of  quality  values  in  FASTQ file. Default of 33 loads Sanger/Phred style
              files, using 0 tries to automatically recognise.
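
              The offset is the value subtracted from the ASCII code of each quality character to
              obtain the quality value. A small worked example, assuming the default offset of 33
              and an arbitrarily chosen quality character:

```shell
offset=33   # Sanger/Phred style offset (the documented default)
qchar=I     # one quality character from a FASTQ quality line (example value)

# POSIX printf prints the character code when the argument starts with a quote.
ascii=$(printf '%d' "'$qchar")
phred=$((ascii - offset))

echo "$qchar -> ASCII $ascii -> quality $phred"
```

              Here 'I' has ASCII code 73, so the resulting quality is 40.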

       -P <string>
              String with MIRA parameters to be parsed

              Useful when setting parameters affecting consensus calling like -CO:mrpg etc.

              E.g.: -P "454_SETTINGS -CO:mrpg=3"

       -q <quality>
              Set default quality for bases in file types without quality values. Furthermore, do
              not stop if expected quality files are missing (e.g. '.fasta')

       -R <name>
              Rename  contigs/singlets/reads  with  given  name  string  to  which  a  counter is
              appended.

              Known bug: will create duplicate names if input contains contigs/singlets  as  well
              as free reads, i.e.  reads not in contigs nor singlets.

       -S <name>
              (name)Scheme  for  renaming  reads,  important  for  paired-ends.  Only 'solexa' is
              currently supported.

       -T     When converting single reads, trim/clip away stretches of  N  and  X  at  ends  of
              reads.  Note:  remember  to  use -C to also perform a hard clip (e.g. with FASTA as
              output).

       -v     Print version number and exit

       -Y <integer>
              Yield. Max (clipped/padded) bases to convert.

              When used on reads: output will contain first reads of file where length of clipped
              bases  totals at least -Y.  When used on contigs: output will contain first contigs
              of file where length of padded contigs totals at least -Y.
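
              The "at least -Y" rule means the output stops with the first read (or contig) that
              pushes the running total to or past the yield. A sketch with invented read lengths,
              using awk in place of miraconvert:

```shell
yield=300   # value that would be passed as -Y (example)

# Hypothetical clipped read lengths, in file order.
count=$(printf '%s\n' 100 150 200 120 |
    awk -v y="$yield" '
        { total += $1; n++ }          # add this read to the running total
        total >= y { exit }           # stop once the yield is reached
        END { print n }               # number of reads that would be written
    ')

echo "$count"
```

              With these lengths the running total is 100, 250, then 450, so three reads are
              kept.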

       The following switches work only when input (CAF or MAF) contains contigs. Beware: CAF and
       MAF can also contain just reads.

       -M     Do not extract contigs (or their consensus), but the sequence of the reads they are
              composed of.

       -r [cCqf]
              Recalculate consensus and / or consensus quality values and / or SNP feature tags.

              'c' recalc cons & cons qualities (with IUPAC)

              'C' recalc cons & cons qualities (forcing non-IUPAC)

              'q' recalc consensus qualities only

              'f' recalc SNP features

              Note: only the last of cCq is relevant, f works as a switch  and  can  be  combined
              with cCq (e.g. "-r C -r f")

              Note:  if  the  CAF/MAF  contains  multiple  strains,  recalculation of cons & cons
              qualities is forced, you can just influence whether IUPACs are used or not.

       -s     split output into multiple files instead of creating a single file

       -u     'fillUp strain genomes'

              Fill holes in the genome of one strain (N or @) with sequence from a  consensus  of
              other strains

              Takes effect only with -r and -t gbf or fasta/q.

              In FASTA/Q: bases filled up are in lower case.

              In GBF: bases filled up are in upper case.

       -Q <integer>
              Defines minimum quality a consensus base of a strain  must  have,  consensus  bases
              below this will be 'N'. Default: 0

              Only used with -r, and -f is caf/maf and -t is (fasta or gbf)

       -V <integer>
              Defines  minimum  coverage  a  consensus  base  of  a  strain must have, bases with
              coverage below this will be 'N'. Default: 0

              Only used with -r, and -t is (fasta or gbf)
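
              The effect of -Q (and, analogously, -V with coverage values) can be pictured as a
              per-base threshold on the consensus. The consensus string, the quality values and
              the threshold below are invented, and awk only imitates the described masking.

```shell
cons=ACGT            # hypothetical consensus
quals="40 10 35 5"   # hypothetical quality per consensus base
minq=30              # value that would be passed as -Q (example)

masked=$(printf '%s\n%s\n' "$cons" "$quals" |
    awk -v q="$minq" '
        NR == 1 { seq = $0; next }    # first line: the consensus bases
        {
            out = ""
            for (i = 1; i <= NF; i++)   # second line: one quality per base
                out = out (($i + 0 < q + 0) ? "N" : substr(seq, i, 1))
            print out
        }
    ')

echo "$masked"
```

              Bases two and four fall below the threshold, so the result is ANGN.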

       -x <integer>
              Minimum contig or unclipped read length

              When loading, discard all contigs / reads with  a  length  less  than  this  value.
              Default: 0 (=switched off)

              Note: not applied to reads in contigs!

       -X <integer>
              Similar to -x but applies only to reads and then to the clipped length.

       -y <integer>
              Minimum average contig coverage

              When loading, discard all contigs with an average coverage less than this value.
              Default: 1

       -z <integer>
              Minimum number of reads in contig

              When loading, discard all contigs with a number of reads less than this value.
              Default: 0 (=switched off)
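
              The loading filters behave like simple thresholds that a contig has to pass. The
              sketch below applies the equivalent of -x 2000 -y 10 to an invented table of
              contigs (name, length, average coverage, number of reads); it imitates the filter
              with awk and does not call miraconvert.

```shell
minlen=2000   # value that would be passed as -x (example)
mincov=10     # value that would be passed as -y (example)

# Hypothetical contigs: name, length, average coverage, number of reads.
kept=$(printf '%s\n' \
        'contig_1 5200 24.5 310' \
        'contig_2 1500 30.1 90' \
        'contig_3 8000 6.2 120' |
    awk -v x="$minlen" -v y="$mincov" '
        $2 + 0 >= x + 0 && $3 + 0 >= y + 0 { print $1 }   # keep contigs passing both
    ')

echo "$kept"
```

              Only contig_1 passes: contig_2 is too short and contig_3 has too little coverage.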

       -l <integer>
              when  output as text or HTML: number of bases shown in one alignment line. Default:
              60.

       -c <character>
              when output as text or HTML: character used to pad endgaps. Default: ' ' (blank)

EXAMPLES

              miraconvert source.maf dest.sam

              miraconvert source.caf dest.fasta wig ace

              miraconvert -x 2000 -y 10 source.caf dest.caf

              miraconvert -x 40 -C -F -F source.maf .fastq

BUGS

       To report bugs or ask for features, please use the ticketing system at:

              http://sourceforge.net/projects/mira-assembler/

AUTHOR

       Bastien Chevreux <bach@chevreux.org>

       This  manpage was written by Andreas Tille for the Debian distribution and can be used for
       any other usage of the program.
