Provided by: boxshade_3.3.1-11_amd64

NAME

       boxshade - Pretty-printing of multiple sequence alignments

SYNOPSIS

       boxshade

DESCRIPTION

       BOXSHADE is a program for pretty-printing multiple alignment output. The program itself does not
       perform any alignment; use a multiple alignment program such as ClustalW or Pileup and pass the
       output of that program to BOXSHADE as input.

       -help
           Show the help.

       -check
           Show the help and allow further parameters to be added to the command line.

       -def
           Use defaults, no unnecessary questions.

       -numdef
           Use default numbering.

       -dna
           Assume DNA sequences, use box_dna.par.

       -split
           Create separate files for multiple pages.

       -toseq=xxx
           Shade according to sequence number xxx.

       -in=xxxxx
           xxxxx is the input file name.

       -out=xxxxx
           xxxxx is the output file name.

       -par=xxxxx
           xxxxx is the parameter file name.

       -sim=xxxxx
           xxxxx is the name of the file defining similar residues.

       -grp=xxxxx
           xxxxx is the name of the file defining residue groups.

       -thr=x
           x is the fraction of sequences that must agree for a consensus.

       -dev=x
           x is the output device class (see below).

       -type=x
           x is the input file format (see below).

       -ruler
           Print ruler line.

       -cons
           Create consensus line.

       -symbcons=xyz
           xyz are the consensus symbols.

       -symbcons="xyz"
           Quoted form of the option above; try it if the unquoted form does not work.

       -unix
           Output file lines are terminated with LF only.

       -mac
           Output file lines are terminated with CR only.

       -dos
           Output file lines are terminated with CRLF.

       This manual page was written for the Debian(TM) distribution because the original program does not have a
       manual page. The information presented here comes from the documentation of the Web Service for
       version 3.21, which is not available as a Debian package.

       BOXSHADE is a program for creating good-looking printouts from multiple-aligned protein or DNA sequences.
       The program does no alignment by itself; it takes as input a file preprocessed by a multiple
       alignment program or a multiple sequence editor. See below for a list of supported input formats and
       output devices. In the standard BOXSHADE output, identical and similar residues in the
       multiple-alignment chart are represented by different colors or shadings. Further options control the
       kind of shading to be applied, sequence numbering, consensus output and so on. The user interface is a
       bit clumsy at the moment: one has to answer a lot of questions in order to get the desired output. It
       is, however, possible to use default parameters from a standard parameter file or to supply the
       program with parameters from the command line. At the moment, the VMS and DOS versions of BOXSHADE have
       identical user interfaces.

   Input formats
       BOXSHADE 3.2 knows about the following input file formats (some of them are generally used only on
       MSDOS or VMS systems):

       + CLUSTAL and CLUSTALV, multiple alignment program; DOS/VMS/MAC default extension .ALN

       + ESEE, multiple sequence editor; DOS default extension .ESE

       + PHYLIP, phylogenetic analysis package; DOS, VMS, UNIX default extension .PHY

       + PILEUP and PRETTY of the GCG sequence analysis package; VMS/UNIX default extensions .MSF and
         .PRE. NB: you are strongly encouraged NOT to use the PRETTY format as input; it may be
         incompatible with the revised version of .MSF input. We can't actually think why anyone would
         use this format now; .MSF files are more useful generally.

       + MALIGNED, multiple sequence editor; VMS only, default extension .MAL

       BOXSHADE tries to determine the file type from the extension but will also work if different
       extensions are used.
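
       The extension table above can be sketched as a simple lookup. This is only an illustration of the
       idea, not BOXSHADE's own detection code; the names FORMAT_BY_EXTENSION and guess_input_format are
       hypothetical, and matching the extension case-insensitively is an assumption.

```python
from pathlib import Path

# Default extensions as listed in this manual page.
# The table and function names are illustrative, not part of BOXSHADE.
FORMAT_BY_EXTENSION = {
    ".aln": "CLUSTAL",
    ".ese": "ESEE",
    ".phy": "PHYLIP",
    ".msf": "PILEUP",
    ".pre": "PRETTY",
    ".mal": "MALIGNED",
}


def guess_input_format(filename):
    """Return the format suggested by the file extension, or None if unknown."""
    # Assumption: extensions are compared without regard to case.
    return FORMAT_BY_EXTENSION.get(Path(filename).suffix.lower())
```

       When the extension is not in the table, the sketch returns None; in that situation the input format
       would have to be stated explicitly, for example with the -type option.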

   Output devices
       + POSTSCRIPT/EPS creates POSTSCRIPT(TM) files for printing on a laser printer or for further
         conversion with a POSTSCRIPT interpreter (like GHOSTSCRIPT).

       + HPGL for export to various graphics programs or for conversion/printing with the shareware
         program PRINTGL. Plotting BOXSHADE output on a plotter is generally not recommended.

       + RTF for export to various word-processing and graphics programs.

       + CRT uses direct screen writes to the PC monitor. Possible options depend on the graphics adapter
         used. This output device is supported only in the MSDOS version.

       + ANSI. On a PC, this option uses an ANSI device driver (ANSI.SYS) that has to be loaded in
         CONFIG.SYS beforehand. Possible character renditions are reverse, bold, underlined, blinking etc.
         On non-DOS systems, this option behaves more or less like the VT100 output mode.

       + VT100 for display on a VT100 compatible terminal or emulator.

       + ReGISterm for display on a ReGIS compatible graphics terminal or emulator.

       + ReGISfile for later conversion by the program RETOS (copyright DEC) in order to print on
         DIGITAL's printer series.

       + LJ250 for printing on DIGITAL's LJ250 color printer.

       + ASCII output showing either the conserved residues or the varying ones (others as '-').

       + FIG file for xfig 2.1.

       + PICT files for import to Mac and PC graphics programs.

       Some of the formats above offer the possibility of scaling the characters and of rotating the plot.
       Character size has to be entered in 'point' units. Normal output orientation is portrait mode
       (PS/EPS/HPGL/PICT only); to obtain output in landscape orientation, 'rotate plot = y' has to be
       chosen.

       When creating multi-page output, all pages are contained in a single output file. If one page per
       file is desired, one has to use the command line parameter /SPLIT. This is enforced when requesting
       EPSF or PICT file output, as multi-page EPSFs contradict the purpose of an EPSF and large PICT files
       would probably be too big for most personal computers. When the terminal is used as output device,
       the 'RETURN' key has to be pressed to obtain the next page of output.

   Sequence numbering
       Starting with version 2.2, numbering can be added to the output files. The numbers are printed
       between the sequence names and the sequence itself. Since most input files either use no numbering
       or always number the first position in the alignment "1" (which does not necessarily reflect the
       numbers within the original sequence), the user is asked to enter the starting position for each
       sequence. The command line flag /NUMDEF suppresses that question; a starting position of 1 is then
       assumed for all sequences. Boxshade starts with the value entered for the leftmost position and
       continues numbering every valid symbol, skipping blanks, '-', '.' and similar gap symbols.
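
       The numbering rule can be illustrated with a short sketch. It is not taken from the BOXSHADE
       sources; the function name number_positions is hypothetical, and the set of skipped symbols is
       assumed to be exactly blank, '-' and '.'.

```python
# Assumption: only blank, '-' and '.' are treated as gap symbols here.
GAP_SYMBOLS = set(" -.")


def number_positions(aligned_row, start=1):
    """Return one entry per alignment column: the residue number, or None for a gap."""
    numbers = []
    current = start
    for symbol in aligned_row:
        if symbol in GAP_SYMBOLS:
            numbers.append(None)      # gaps are skipped and do not advance the count
        else:
            numbers.append(current)   # every valid symbol gets the next number
            current += 1
    return numbers
```

       For a row "MK-.L" with starting position 10, the sketch numbers M, K and L as 10, 11 and 12 and
       leaves the two gap columns unnumbered.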

   Default parameters
       Several people using previous releases of BOXSHADE pointed out the need for default parameters for
       the various questions asked by the program. They argued that most sites use only one type of input
       file, one output device and one choice of colors for the output. I therefore added a management of
       default parameters allowing two levels of assistance to the user.

       1) All default parameters are contained in an ASCII file that can be modified easily to accommodate
          the user's taste. The format is roughly documented within the file header; it resembles the
          keyboard input one has to make when using the program interactively. Two such files are supplied
          with this release of BOXSHADE, BOX_DNA.PAR and BOX_PEP.PAR, holding some example parameters for
          DNA and peptide comparisons. There are no big differences between these two; the major one is
          that "similar" residues are not considered when shading DNA comparisons.

       2) To run the program with minimal user interaction, I have added the possibility to use command
          line parameters. At the moment, you can use:

          /check         list all allowed command line parameters (this list) and allow parameters to be
                         added
          /def           program runs without questions, BOX_PEP.PAR is used as default
          /dna           makes the program use BOX_DNA.PAR as parameter file
          /pep           makes the program use BOX_PEP.PAR as parameter file
          /in=xxx        makes the program take xxx as input file
          /out=yyy       makes the program take yyy as output file (note1)
          /par=zzz       makes the program use zzz as a default parameter file
          /type=1        makes the program assume an input file of type 1 (PRETTY/MSF)
          /dev=1         makes the program assume an output device of type 1 (CRT)
          /numdef        use default numbering (all sequences starting with "1")
          /thr=x         threshold fraction of residues that must agree for a consensus
          /split         forces one page per file output, creates multiple output files
          /cons          makes the program create an additional consensus line (see below)
          /symbcons=xyz  influences the way the consensus line is displayed (see below)
          /unix          writes output files in unix style (LF only) (note2)
          /dos           writes output files in DOS style (CR/LF) (note2)

          note1: for terminal output,
                 on unix machines, use out=OUTPUT
                 on DOS machines, use out=con:
                 on VMS machines, use out=tt:
          note2: if no mode is specified, the native style of the machine is used.

           ATTENTION
           On unix systems, the dash (-) has to be used instead of the slash (/) as the separation
           character for command line parameters. For example, a valid unix command line is:

               boxshade -def -numdef -cons -symbcons=" .*"
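
       When BOXSHADE is driven from a script, the unix-style command line can be assembled as an argument
       list. The sketch below only builds the list and does not run anything. The helper name
       boxshade_argv and its keyword arguments are hypothetical; the option names themselves are the ones
       documented on this page. Further options (for example -dev and -type) may be needed to avoid
       interactive questions, depending on the parameter file in use.

```python
def boxshade_argv(infile, outfile, *, dna=False, consensus=False,
                  symbcons=None, threshold=None):
    """Build a unix-style boxshade argument list (hypothetical helper, not part of BOXSHADE)."""
    argv = ["boxshade", "-def", "-numdef", f"-in={infile}", f"-out={outfile}"]
    if dna:
        argv.append("-dna")           # use box_dna.par instead of the peptide defaults
    if consensus:
        argv.append("-cons")          # add a consensus line
    if symbcons is not None:
        if len(symbcons) != 3:
            raise ValueError("symbcons must consist of exactly three symbols")
        argv.append(f"-symbcons={symbcons}")
    if threshold is not None:
        if not 0.0 <= threshold <= 1.0:
            raise ValueError("threshold must be between 0.0 and 1.0")
        argv.append(f"-thr={threshold}")
    return argv
```

       Passing such a list to Python's subprocess.run bypasses the shell, so a blank inside the symbcons
       value reaches the program as written and needs no extra quoting.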

   Shading strategies (similarity to consensus or single sequence)
       Starting with version 3, BOXSHADE has a new shading system. The first difference is the introduction
       of a threshold fraction of residues that must agree for there to be a consensus. Previously, the
       program assumed that SOME residue was always the consensus; if no two residues were the same, the
       first sequence provided the consensus residue. This threshold fraction can be any number between 0.0
       and 1.0. The number of sequences that must agree for there to be a consensus is, as you might expect,
       this fraction times the total number of sequences in the alignment (fractions of a sequence count as
       one, e.g. 3.2 becomes 4).

       The second difference is the idea of 'consensus by similarity'. This tries to take account of
       situations where all the sequences may have (for example) R or K at a position, but neither in a
       majority. It would not be logical to shade one type of residue as 'identical' and the other as
       'similar'; the threshold might also eliminate both as being too few in number. Therefore, if there
       is not a single residue that is conserved (greater than the threshold) at a position, the program
       looks for a 'group' of amino acids that fulfills the requirements. 'Groups' are defined in the .grp
       files. Users can tailor these to their personal prejudices. Any amino acid not listed is assumed not
       to be in a group. All members of a group are considered to be mutually similar, unlike in the .sim
       files described below. If a consensus by similarity is found, all the residues in the consensus are
       shaded using the 'similar' shading defined by the user. If the user does not select 'shading by
       similarity', only identity-type consensus is looked at.

       If an identity-type consensus is found, and similarity shading is in operation, the program checks
       whether the remaining residues are similar to the consensus residue. Here the box_xxx.sim files are
       used. The main difference between relationships in these files and those in the .grp files is that,
       in a .grp file, the line STA means that all three amino acids are mutually similar. In a .sim file,
       S TA means that both T and A are considered similar to S where there is a conserved S residue in
       more than the threshold number of sequences. However, it does NOT mean that T and A are similar to
       each other. Note that cases where two residues, or groups of residues, fulfill the threshold
       requirements (as could happen with values of the threshold fraction less than or equal to 0.5) are
       treated as having no consensus.

       This describes the main shading model, 'shading according to a consensus'. The alternative model is
       called 'shading according to a master sequence'. In this case the user is prompted for a sequence of
       the alignment, and that sequence is then taken to be the 'consensus'. Only those residues that are
       identical or similar to the chosen sequence become shaded. Output obtained with this option tends to
       be less shaded and neglects similarities between the other (non-chosen) sequences. Starting in V2.7,
       this 'master sequence' can be hidden. Thus, it only influences the shading of the other sequences
       without being shown itself.
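
       The 'shading according to a consensus' model can be sketched for a single alignment column as
       follows. This is a reading of the description above, not the BOXSHADE implementation. All names
       are hypothetical. Several points are assumptions: a residue or group counts as conserved when its
       count is at least the rounded-up threshold number, gap symbols are ignored when counting, and at
       least one sequence is always required. Groups are given as strings such as "STA" (as in a .grp
       file) and the similarity table maps a consensus residue to the residues similar to it (as in a
       .sim file).

```python
import math
from collections import Counter

GAPS = set(" -.")  # assumption: these symbols never take part in a consensus


def sequences_needed(threshold, n_sequences):
    """Threshold fraction times number of sequences, rounded up (3.2 becomes 4)."""
    # The small epsilon guards against floating-point noise such as 3.0000000000000004.
    return max(1, math.ceil(threshold * n_sequences - 1e-9))


def column_consensus(column, threshold, groups=()):
    """Return ("identity", residue), ("similarity", group) or (None, None)."""
    need = sequences_needed(threshold, len(column))
    counts = Counter(ch for ch in column if ch not in GAPS)
    winners = [res for res, n in counts.items() if n >= need]
    if len(winners) == 1:
        return ("identity", winners[0])
    if len(winners) > 1:
        return (None, None)            # two candidates: treated as no consensus
    hits = [g for g in groups if sum(counts[ch] for ch in set(g)) >= need]
    if len(hits) == 1:
        return ("similarity", hits[0])
    return (None, None)                # no group, or more than one group, qualifies


def shade_column(column, threshold, groups=(), sim=None):
    """Classify each residue of a column as 'identical', 'similar' or 'normal'."""
    sim = sim or {}
    kind, consensus = column_consensus(column, threshold, groups)
    shades = []
    for ch in column:
        if kind == "identity" and ch == consensus:
            shades.append("identical")
        elif kind == "identity" and ch in sim.get(consensus, ""):
            shades.append("similar")   # .sim relation: similar to the consensus only
        elif kind == "similarity" and ch in consensus:
            shades.append("similar")   # .grp relation: all group members are shaded
        else:
            shades.append("normal")
    return shades
```

       With a threshold of 0.5, the column "RRKK" yields no consensus in this sketch because both R and K
       reach the required count, which mirrors the note on threshold fractions of 0.5 or less.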

   Consensus display
       Starting with version 2.5, BOXSHADE can create an additional line holding a consensus symbol. This
       line can be obtained either by using the command line qualifier /CONS or interactively by answering
       the question ' create consensus? '. The way this consensus line is displayed can be modified by the
       command line parameter SYMBCONS=xyz, by editing the respective entry in the .PAR file, or
       interactively. Since the SYMBCONS syntax is not intuitive, here is a brief description. The SYMBCONS
       parameter consists of exactly three symbols:

       + the first one stands for 'normal' sequence residues that are not involved in any
         similar/identical relationship.

       + the second symbol represents positions that are similar in all sequences of the alignment. See
         the files BOX_PEP.SIM and BOX_DNA.SIM to see what residues are considered similar.

       + the third symbol represents positions that are identical in all sequences of the alignment.

       A SYMBCONS parameter string " .*" (blank/point/asterisk) means: label all positions in the alignment
       with totally identical residues by an asterisk, all positions with all similar residues by a point,
       and do not mark the other positions. The letter 'B' can be used instead of the blank; this is
       necessary e.g. when using the command line option /SYMBCONS=B.* which gives the same result as the
       above example. The option /SYMBCONS= .* would result in unexpected behaviour because MSDOS squeezes
       blanks out of the command line.

       Besides points, asterisks and other symbols, two characters have a special meaning when they appear
       in the SYMBCONS string: 'L' and 'U'. An 'L' means that a lowercase representation of the most
       abundant residue at that position is to be used instead of a fixed consensus symbol, while a 'U'
       means an uppercase representation of that residue. A possible application would be the SYMBCONS
       string " LU", where similar residues are represented by lowercase characters and identical residues
       by uppercase characters.
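
       The SYMBCONS rules can be illustrated with a short sketch. It is not BOXSHADE's own code; the
       function names are hypothetical. Each column is described by its kind ('normal', 'similar' or
       'identical') together with the most abundant residue at that position, and 'B' is assumed to stand
       for a blank in any of the three places.

```python
KINDS = ("normal", "similar", "identical")  # order of the three SYMBCONS symbols


def consensus_symbol(kind, residue, symbcons=" .*"):
    """Return the character printed in the consensus line for one column."""
    if len(symbcons) != 3:
        raise ValueError("SYMBCONS must consist of exactly three symbols")
    code = symbcons[KINDS.index(kind)]
    if code == "B":
        return " "                 # 'B' is a stand-in for the blank
    if code == "L":
        return residue.lower()     # lowercase form of the most abundant residue
    if code == "U":
        return residue.upper()     # uppercase form of the most abundant residue
    return code                    # any other symbol is printed as it is


def consensus_line(columns, symbcons=" .*"):
    """Build the consensus line from (kind, most_abundant_residue) pairs."""
    return "".join(consensus_symbol(kind, res, symbcons) for kind, res in columns)
```

       In this sketch the strings " .*" and "B.*" produce the same line, and " LU" prints similar
       positions in lowercase and identical positions in uppercase.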

   Shareware/PD programs useful in conjunction with BOXSHADE
       Multiple alignment files to be used by BOXSHADE can be created, amongst others, by the following
       PD/freeware programs:

       + PHYLIP by Joe Felsenstein, available by ftp from anthro.utah.edu

       + ESEE by Eric Cabot, available from the same sources as BOXSHADE (see above)

       + CLUSTAL by Des Higgins, ditto

       For preview/conversion of POSTSCRIPT files, the program GHOSTSCRIPT from GNU software foundation is
       highly recommended. It is available from all major MSDOS ftp sites (e.g. SIMTEL or
       ftp.uni-koeln.de). There is also a version tested for use with boxshade available at
       vax0.biomed.uni-koeln.de, although this might not be the most recent release. For Mac users, there
       is MacGhostscript, also available from the main archives (info-mac, umich and their mirrors). A
       *very* good tool for putting a preview image into an EPSF file, often a prerequisite for
       incorporating it into a drawing package, is PS2EPS, by Peter Lerup. This can be found on info-mac.

       For preview/conversion of HPGL files, the shareware program PRINTGL 1.18 by Cary Ravitz is highly
       recommended. It is available from many MSDOS ftp sites and from netserv@embl-heidelberg.de

       Output on dot printers: since PRINTGL offers a broad choice of printer types and is a nice program,
       I recommend its use for printing BOXSHADE output on non-POSTSCRIPT printers. To create an HPGL file
       (let's call it TEST.PLT), use HPGL output with the following options (these are the standard
       parameters in BOX_PEP.PAR):

           0F1N  for normal residues
           2F1N  for identical residues
           3F1N  for similar residues
           2F4N  for conserved residues
           8     for character size
           not rotated

       Now use PRINTGL either interactively by calling PMI or use a command line like:

           PRINTGL /Fx/S0340/Waaac/Ptest.plt

       where test.plt is to be replaced by the filename to convert and the x in the expression /Fx is to
       be replaced by the letter of the printer you use. (See the PRINTGL documentation for further
       details.)

RESTRICTIONS

       The RTF output and PHYLIP input implementations are still experimental. Please tell me of your
       experiences with the program.

       The current DOS version supports only 13 sequences with 2000 residues each. These parameters can
       easily be changed in the source code. If you cannot compile the sources because you lack a Pascal
       compiler, contact the author for precompiled versions.

CITING BOXSHADE

       There is no publication on BOXSHADE and none is planned. Most people just use it for figures in
       publications and don't mention anything; this is OK with the authors of BOXSHADE. If you really feel
       like mentioning BOXSHADE, you could acknowledge it either in the figure legend or in the Mat&Meth
       part on sequence analysis.

SEE ALSO

       /etc/boxshade/*.par

       seaview(1) kalign(1)

AUTHORS

       Kay Hofmann <kay.hofmann@memorec.com>
       ISREC, Bioinformatics Group,
       CH-1066 Epalinges s/Lausanne,
       Switzerland

           Wrote Boxshade.

       Michael Baron <michael.baron@bbsrc.ac.uk>
       BBSRC Institute for Animal Health,
       Pirbright,
       Surrey GU24 0NF,
       U.K.

           Wrote Boxshade.

       Hartmut Schirmer <hsc@techfak.uni-kiel.de>
       Technische Fakultaet,
       Kaiserstr. 2,
       D-24143 Kiel,
       Germany

           C port of Boxshade. (Don't send Kay or Michael any questions concerning the C version of boxshade.)

       Steffen Möller <moeller@debian.org>
           Wrote the manpage.

       Charles Plessy <plessy@debian.org>
           Updated the manpage.

COPYRIGHT

       Copyright © 1997 Kay Hofmann, Michael Baron and Hartmut Schirmer
       Copyright © 2003, 2007 Steffen Moeller, Charles Plessy

       The above copyright notices refer to the program and its manpage respectively.

       BOXSHADE is completely public-domain and may be passed around and modified without any notice to the
       authors.

       This manual page was written for the Debian(TM) system but may be used by others. Permission is granted
       to copy, distribute and/or modify this document under the same terms as boxshade itself.