NAME

       clustalw - Multiple alignment of nucleic acid and protein sequences

SYNOPSIS


       clustalw [-infile] file.ext [OPTIONS]

       clustalw [-help | -fullhelp]

DESCRIPTION

       Clustal W is a general-purpose multiple alignment program for DNA or proteins.

       The program performs simultaneous alignment of many nucleotide or amino acid sequences. It is typically
       run interactively, providing a menu and online help. To use it in command-line (batch) mode, you must
       supply options on the command line, the minimum being -infile.
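
       A minimal sketch of a batch-mode invocation, written in Python so that the argument list can be
       inspected before anything is run. The helper name clustalw_argv and the file name seqs.fasta are
       illustrative assumptions, not part of Clustal W; only -infile and -align are taken from this page.

```python
import shlex

def clustalw_argv(infile, *verbs, **params):
    """Build an argument list for clustalw (hypothetical helper; runs nothing)."""
    argv = ["clustalw", f"-infile={infile}"]
    # Verbs such as "align" or "tree" are written as bare flags.
    argv += [f"-{verb}" for verb in verbs]
    # Parameters are written as -name=value.
    argv += [f"-{name}={value}" for name, value in params.items()]
    return argv

argv = clustalw_argv("seqs.fasta", "align")
print(shlex.join(argv))
```

       If clustalw is installed, the resulting list can be handed to subprocess.run(argv, check=True).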

OPTIONS

   DATA (sequences)
       -infile=file.ext
           Input sequences.

       -profile1=file.ext and -profile2=file.ext
           Profiles (old alignments).

   VERBS (do things)
       -options
           List the command line parameters.

       -help or -check
           Outline the command line parameters.

       -fullhelp
           Output full help content.

       -align
           Do full multiple alignment.

       -tree
           Calculate NJ tree.

       -pim
           Output percent identity matrix (while calculating the tree).

       -bootstrap=n
           Bootstrap an NJ tree (n = number of bootstraps; default = 1000).

       -convert
           Output the input sequences in a different file format.

   PARAMETERS (set things)
       General settings:

           -interactive
               Read command line, then enter normal interactive menus.

           -quicktree
               Use FAST algorithm for the alignment guide tree.

           -type=
               PROTEIN or DNA sequences.

           -negative
               Protein alignment with negative values in matrix.

           -outfile=
               Sequence alignment file name.

           -output=
               GCG, GDE, PHYLIP, PIR or NEXUS.

           -outputorder=
               INPUT or ALIGNED

           -case=
               LOWER or UPPER (for GDE output only).

           -seqnos=
               OFF or ON (for Clustal output only).

           -seqnos_range=
               OFF or ON (NEW: for all output formats).

           -range=m,n
               Sequence range to write, from position m to position m+n.

           -maxseqlen=n
               Maximum allowed input sequence length.

           -quiet
               Reduce console output to minimum.

           -stats=file
               Log some alignment statistics to file.
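
       As an illustration of the general settings, the sketch below assembles a format-conversion command
       from -convert, -output and -outfile. The function name convert_argv and the file names are
       hypothetical. The set of formats contains only the values listed under -output= on this page; a
       given build may accept others.

```python
# Values listed under -output= on this page.
OUTPUT_FORMATS = {"GCG", "GDE", "PHYLIP", "PIR", "NEXUS"}

def convert_argv(infile, outfile, fmt):
    """Argument list that rewrites infile in another format (illustrative helper)."""
    fmt = fmt.upper()
    if fmt not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported -output value: {fmt}")
    return [
        "clustalw",
        f"-infile={infile}",
        "-convert",
        f"-output={fmt}",
        f"-outfile={outfile}",
    ]

print(convert_argv("seqs.aln", "seqs.phy", "phylip"))
```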

       Fast Pairwise Alignments:

           -ktuple=n
               Word size.

           -topdiags=n
               Number of best diags.

           -window=n
               Window around best diags.

           -pairgap=n
               Gap penalty.

           -score=
               PERCENT or ABSOLUTE.

       Slow Pairwise Alignments:

           -pwmatrix=
               Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename.

           -pwdnamatrix=
               DNA weight matrix=IUB, CLUSTALW or filename.

           -pwgapopen=f
               Gap opening penalty.

           -pwgapext=f
               Gap extension penalty.
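
       The two groups of pairwise options can be kept apart with a small helper such as the one sketched
       below. It assumes that the fast pairwise parameters are meant to accompany -quicktree and that the
       slow ones are used without it; this pairing is an assumption of the example, not a rule stated on
       this page. The function name and the numeric values are illustrative only.

```python
FAST_NAMES = {"ktuple", "topdiags", "window", "pairgap", "score"}
SLOW_NAMES = {"pwmatrix", "pwdnamatrix", "pwgapopen", "pwgapext"}

def pairwise_options(fast, **values):
    """Return pairwise-stage options as a list of strings (hypothetical helper)."""
    allowed = FAST_NAMES if fast else SLOW_NAMES
    kind = "fast" if fast else "slow"
    unknown = set(values) - allowed
    if unknown:
        raise ValueError(f"not a {kind} pairwise option: {sorted(unknown)}")
    # -quicktree selects the fast algorithm for the guide tree.
    opts = ["-quicktree"] if fast else []
    opts += [f"-{name}={value}" for name, value in values.items()]
    return opts

print(pairwise_options(True, ktuple=2, window=4))
print(pairwise_options(False, pwmatrix="GONNET", pwgapopen=10.0))
```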

       Multiple Alignments:

           -newtree=
               File for new guide tree.

           -usetree=
               File for old guide tree.

           -matrix=
               Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename.

           -dnamatrix=
               DNA weight matrix=IUB, CLUSTALW or filename.

           -gapopen=f
               Gap opening penalty.

           -gapext=f
               Gap extension penalty.

           -endgaps
               No end gap separation penalty.

           -gapdist=n
               Gap separation penalty range.

           -nogap
               Residue-specific gaps off.

           -nohgap
               Hydrophilic gaps off.

           -hgapresidues=
               List of hydrophilic residues.

           -maxdiv=n
               Percent identity for delay.

           -type=
               PROTEIN or DNA

           -transweight=f
               Transitions weighting.

           -iteration=
               NONE or TREE or ALIGNMENT.

           -numiter=n
               Maximum number of iterations to perform.

       Profile Alignments:

           -profile
               Merge two alignments by profile alignment.

           -newtree1=
               File for new guide tree for profile1.

           -newtree2=
               File for new guide tree for profile2.

           -usetree1=
               File for old guide tree for profile1.

           -usetree2=
               File for old guide tree for profile2.

       Sequence to Profile Alignments:

           -sequences
               Sequentially add profile2 sequences to profile1 alignment.

           -newtree=
               File for new guide tree.

           -usetree=
               File for old guide tree.
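
       The two profile modes differ only in the verb that follows the profile files: -profile merges two
       alignments, while -sequences adds the profile2 sequences to the profile1 alignment one at a time. A
       minimal sketch, in which the helper name profile_argv and all file names are hypothetical:

```python
def profile_argv(profile1, profile2, mode="profile", outfile=None):
    """Argument list for a profile or sequence-to-profile run (illustrative helper)."""
    if mode not in ("profile", "sequences"):
        raise ValueError("mode must be 'profile' or 'sequences'")
    argv = [
        "clustalw",
        f"-profile1={profile1}",
        f"-profile2={profile2}",
        f"-{mode}",
    ]
    if outfile is not None:
        argv.append(f"-outfile={outfile}")
    return argv

print(profile_argv("family.aln", "new_members.fasta", mode="sequences"))
```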

       Structure Alignments:

           -nosecstr1
               Do not use secondary structure-gap penalty mask for profile 1.

           -nosecstr2
               Do not use secondary structure-gap penalty mask for profile 2.

           -secstrout=STRUCTURE or MASK or BOTH or NONE
               Output in alignment file.

           -helixgap=n
               Gap penalty for helix core residues.

           -strandgap=n
               Gap penalty for strand core residues.

           -loopgap=n
               Gap penalty for loop regions.

           -terminalgap=n
               Gap penalty for structure termini.

           -helixendin=n
               Number of residues inside helix to be treated as terminal.

           -helixendout=n
               Number of residues outside helix to be treated as terminal.

           -strandendin=n
               Number of residues inside strand to be treated as terminal.

           -strandendout=n
               Number of residues outside strand to be treated as terminal.

       Trees:

           -outputtree=nj OR phylip OR dist OR nexus

           -seed=n
               Seed number for bootstraps.

           -kimura
               Use Kimura's correction.

           -tossgaps
               Ignore positions with gaps.

           -bootlabels=node OR branch
               Position of bootstrap values in tree display.

           -clustering=
               NJ or UPGMA.
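
       A bootstrap run combines the -bootstrap verb with the tree parameters above. The sketch below
       assumes that the input file already contains an alignment; the helper name bootstrap_argv, the file
       name and the seed value are illustrative. The default of 1000 replicates is the one stated for
       -bootstrap on this page.

```python
# Values listed under -outputtree on this page.
TREE_FORMATS = ("nj", "phylip", "dist", "nexus")

def bootstrap_argv(alignment, n=1000, seed=None, tree_format="phylip",
                   kimura=False, tossgaps=False):
    """Argument list for a bootstrapped tree (hypothetical helper; runs nothing)."""
    if n < 1:
        raise ValueError("number of bootstraps must be at least 1")
    if tree_format not in TREE_FORMATS:
        raise ValueError(f"unsupported -outputtree value: {tree_format}")
    argv = [
        "clustalw",
        f"-infile={alignment}",
        f"-bootstrap={n}",
        f"-outputtree={tree_format}",
    ]
    if seed is not None:
        argv.append(f"-seed={seed}")   # fixed seed for bootstraps
    if kimura:
        argv.append("-kimura")         # use Kimura's correction
    if tossgaps:
        argv.append("-tossgaps")       # ignore positions with gaps
    return argv

print(bootstrap_argv("family.aln", seed=42, kimura=True))
```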

BUGS

       The Clustal bug tracking system can be found at
       http://bioinf.ucd.ie/bugzilla/buglist.cgi?quicksearch=clustal.

REFERENCES

       •   Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM,
           Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG. (2007).  Clustal W and Clustal X version 2.0.[1]
           Bioinformatics, 23, 2947-2948.

       •   Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD. (2003).  Multiple
           sequence alignment with the Clustal series of programs.[2] Nucleic Acids Res., 31, 3497-3500.

       •   Jeanmougin F, Thompson JD, Gouy M, Higgins DG, Gibson TJ. (1998).  Multiple sequence alignment with
           Clustal X.[3] Trends Biochem Sci., 23, 403-405.

       •   Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. (1997).  The CLUSTAL_X windows
           interface: flexible strategies for multiple sequence alignment aided by quality analysis tools.[4]
           Nucleic Acids Res., 25, 4876-4882.

       •   Higgins DG, Thompson JD, Gibson TJ. (1996).  Using CLUSTAL for multiple sequence alignments.[5]
           Methods Enzymol., 266, 383-402.

       •   Thompson JD, Higgins DG, Gibson TJ. (1994).  CLUSTAL W: improving the sensitivity of progressive
           multiple sequence alignment through sequence weighting, position-specific gap penalties and weight
           matrix choice.[6] Nucleic Acids Res., 22, 4673-4680.

       •   Higgins DG. (1994).  CLUSTAL V: multiple alignment of DNA and protein sequences.[7] Methods Mol
           Biol., 25, 307-318.

       •   Higgins DG, Bleasby AJ, Fuchs R. (1992).  CLUSTAL V: improved software for multiple sequence
           alignment.[8] Comput. Appl. Biosci., 8, 189-191.

       •   Higgins DG, Sharp PM. (1989).  Fast and sensitive multiple sequence alignments on a
           microcomputer.[9] Comput. Appl. Biosci., 5, 151-153.

       •   Higgins DG, Sharp PM. (1988).  CLUSTAL: a package for performing multiple sequence alignment on
           a microcomputer.[10] Gene, 73, 237-244.

AUTHORS

       Des Higgins
           Copyright holder for Clustal.

       Julie Thompson
           Copyright holder for Clustal.

       Toby Gibson
           Copyright holder for Clustal.

       Charles Plessy <plessy@debian.org>
           Prepared this manpage in DocBook XML for the Debian distribution.

COPYRIGHT

       Copyright © 1988–2010 Des Higgins, Julie Thompson & Toby Gibson (Clustal)
       Copyright © 2008–2010 Charles Plessy (This manpage)

       This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser
       General Public License as published by the Free Software Foundation, either version 3 of the License, or
       (at your option) any later version.

       This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even
       the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General
       Public License for more details.

       You should have received a copy of the GNU Lesser General Public License along with this program. If not,
       see http://www.gnu.org/licenses/, or on Debian systems, /usr/share/common-licenses/LGPL-3.

       This manual page and its XML source can be used, modified, and redistributed as if it were in the
       public domain.

NOTES

1. Clustal W and Clustal X version 2.0.
http://www.ncbi.nlm.nih.gov/pubmed/17846036

2. Multiple sequence alignment with the Clustal series of programs.
http://www.ncbi.nlm.nih.gov/pubmed/12824352

3. Multiple sequence alignment with Clustal X.
http://www.ncbi.nlm.nih.gov/pubmed/9810230

4. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality
analysis tools.
http://www.ncbi.nlm.nih.gov/pubmed/9396791

5. Using CLUSTAL for multiple sequence alignments.
http://www.ncbi.nlm.nih.gov/pubmed/8743695

6. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence
weighting, position-specific gap penalties and weight matrix choice.
http://www.ncbi.nlm.nih.gov/pubmed/7984417

7. CLUSTAL V: multiple alignment of DNA and protein sequences.
http://www.ncbi.nlm.nih.gov/pubmed/8004173

8. CLUSTAL V: improved software for multiple sequence alignment.
http://www.ncbi.nlm.nih.gov/pubmed/1591615

9. Fast and sensitive multiple sequence alignments on a microcomputer.
http://www.ncbi.nlm.nih.gov/pubmed/2720464

10. CLUSTAL: a package for performing multiple sequence alignment on a microcomputer.
http://www.ncbi.nlm.nih.gov/pubmed/3243435

Clustal 2.1 12/28/2010 CLUSTALW(1)