Provided by: readseq_1-14_amd64
NAME
readseq - Reads and writes nucleic/protein sequences in various formats
SYNOPSIS
readseq [-options] in.seq > out.seq
DESCRIPTION
This manual page briefly documents the readseq command. It was written for the Debian GNU/Linux distribution because the original program does not have a manual page. Instead, it has documentation in text form; see below.

readseq reads and writes biosequences (nucleic/protein) in various formats. Data files may contain multiple sequences. readseq is particularly useful because it automatically detects many sequence formats and converts between them.
FORMATS
Formats which readseq currently understands:

  * IG/Stanford, used by Intelligenetics and others
  * GenBank/GB, genbank flatfile format
  * NBRF format
  * EMBL, EMBL flatfile format
  * GCG, single sequence format of GCG software
  * DNAStrider, for common Mac program
  * Fitch format, limited use
  * Pearson/Fasta, a common format used by Fasta programs and others
  * Zuker format, limited use. Input only.
  * Olsen, format printed by Olsen VMS sequence editor. Input only.
  * Phylip3.2, sequential format for Phylip programs
  * Phylip, interleaved format for Phylip programs (v3.3, v3.4)
  * Plain/Raw, sequence data only (no name, document, numbering)
  + MSF multi sequence format used by GCG software
  + PAUP's multiple sequence (NEXUS) format
  + PIR/CODATA format used by PIR
  + ASN.1 format used by NCBI
  + Pretty print with various options for nice looking output. Output only.
  + LinAll format, limited use (LinAll and ConStruct programs)
  + Vienna format used by ViennaRNA programs

See the included "Formats" file for detail on file formats.
OPTIONS
-help                 Show summary of options.
-a[ll]                Select All sequences
-c[aselower]          Change to lower case
-C[ASEUPPER]          Change to UPPER CASE
-degap[=-]            Remove gap symbols
-i[tem=2,3,4]         Select Item number(s) from several
-l[ist]               List sequences only
-o[utput=]out.seq     Redirect Output
-p[ipe]               Pipe (command line, <stdin, >stdout)
-r[everse]            Change to Reverse-complement
-v[erbose]            Verbose progress
-f[ormat=]#           Format number for output, or
-f[ormat=]Name        Format name for output:

      1. IG/Stanford           11. Phylip3.2
      2. GenBank/GB            12. Phylip
      3. NBRF                  13. Plain/Raw
      4. EMBL                  14. PIR/CODATA
      5. GCG                   15. MSF
      6. DNAStrider            16. ASN.1
      7. Fitch                 17. PAUP/NEXUS
      8. Pearson/Fasta         18. Pretty (out-only)
      9. Zuker (in-only)       19. LinAll
     10. Olsen (in-only)       20. Vienna

Pretty format options:

-wid[th]=#                Sequence line width
-tab=#                    Left indent
-col[space]=#             Column space within sequence line on output
-gap[count]               Count gap chars in sequence numbers
-nameleft, -nameright[=#] Name on left/right side [=max width]
-nametop                  Name at top/bottom
-numleft, -numright       Seq index on left/right side
-numtop, -numbot          Index on top/bottom
-match[=.]                Use match base for 2..n species
-inter[line=#]            Blank line(s) between sequence blocks
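
The numbered format table above can be mirrored in a small lookup, for instance when a script wants to check that a format is usable for output before invoking readseq. The following Python sketch is only an illustration: the names READSEQ_FORMATS, INPUT_ONLY, OUTPUT_ONLY, format_number and output_option are hypothetical, and the case-insensitive exact match is a choice made for this sketch, not a description of how readseq itself parses format names.

```python
# Format numbers and names copied from the table in this manual page.
READSEQ_FORMATS = {
    1: "IG/Stanford", 2: "GenBank/GB", 3: "NBRF", 4: "EMBL", 5: "GCG",
    6: "DNAStrider", 7: "Fitch", 8: "Pearson/Fasta", 9: "Zuker", 10: "Olsen",
    11: "Phylip3.2", 12: "Phylip", 13: "Plain/Raw", 14: "PIR/CODATA",
    15: "MSF", 16: "ASN.1", 17: "PAUP/NEXUS", 18: "Pretty", 19: "LinAll",
    20: "Vienna",
}
INPUT_ONLY = {9, 10}   # marked "in-only" in the table
OUTPUT_ONLY = {18}     # marked "out-only" in the table


def format_number(name):
    """Return the table number for a format name (case-insensitive exact match)."""
    wanted = name.strip().lower()
    for number, label in READSEQ_FORMATS.items():
        if label.lower() == wanted:
            return number
    raise KeyError(f"unknown format name: {name!r}")


def output_option(name):
    """Build a -format=# option, refusing formats the table marks as input-only."""
    number = format_number(name)
    if number in INPUT_ONLY:
        raise ValueError(f"{READSEQ_FORMATS[number]} is input-only")
    return f"-format={number}"
```

For example, output_option("MSF") yields the string -format=15, which follows the -f[ormat=]# syntax listed above.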
EXAMPLES
readseq
    -- for interactive use

readseq my.1st.seq my.2nd.seq -all -format=genbank -output=my.gb
    -- convert all of two input files to one genbank format output file

readseq my.seq -all -form=pretty -nameleft=3 -numleft -numright -numtop -match
    -- output to standard output a file in a pretty format

readseq my.seq -item=9,8,3,2 -degap -CASE -rev -f=msf -out=my.rev
    -- select 4 items from input, degap, reverse, and uppercase them

cat *.seq | readseq -pipe -all -format=asn > bunch-of.asn
    -- pipe a bunch of data thru readseq, converting all to asn
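
When conversions are driven from a script, the two-file GenBank conversion above could be assembled as an argument list rather than a shell string. This is a minimal sketch, assuming readseq is installed and on the PATH; build_convert_args is a hypothetical helper, and only options listed in this page are used. The command is printed rather than executed.

```python
import shlex


def build_convert_args(inputs, fmt, output, program="readseq"):
    """Assemble: readseq in1 in2 ... -all -format=FMT -output=OUT"""
    if not inputs:
        raise ValueError("at least one input file is required")
    return [program, *inputs, "-all", f"-format={fmt}", f"-output={output}"]


args = build_convert_args(["my.1st.seq", "my.2nd.seq"], "genbank", "my.gb")
print(shlex.join(args))

# To actually run the conversion (requires readseq to be installed):
#   import subprocess
#   subprocess.run(args, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems when file names contain spaces or other special characters.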
SEE ALSO
The programs are documented fully in text form. See the files in /usr/share/doc/readseq.
AUTHOR
This manual page was written by Stephane Bortzmeyer <bortzmeyer@debian.org>, for the Debian GNU/Linux system (but may be used by others).

READSEQ(1)