Provided by: bali-phy_3.6.1+dfsg-1_amd64

NAME

       alignment-thin - Remove sequences or columns from an alignment.

SYNOPSIS

       alignment-thin alignment-file [OPTIONS]

DESCRIPTION

       Remove sequences or columns from an alignment.

GENERAL OPTIONS:

       -h, --help
              Print usage information.

       -V, --verbose
              Output more log messages on stderr.

SEQUENCE FILTERING OPTIONS:

       -p arg, --protect arg
              Sequences that cannot be removed (comma-separated).

       -k arg, --keep arg
              Remove sequences not in comma-separated list arg.

       -r arg, --remove arg
              Remove sequences in comma-separated list arg.

       -l arg, --longer-than arg
              Remove sequences not longer than arg.

       -s arg, --shorter-than arg
              Remove sequences not shorter than arg.

       -c arg, --cutoff arg
              Remove similar sequences with fewer than arg mismatches.

       -d arg, --down-to arg
              Remove similar sequences down to arg sequences.

       --remove-crazy arg
              Remove arg outlier sequences, defined as sequences that are missing too many
              conserved sites.

       --conserved arg (=0.75)
              Fraction of sequences that must contain a letter at a site for that site to be
              considered conserved.
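
       The definition of a conserved site can be illustrated with a minimal Python sketch. It is
       not the BAli-Phy implementation. It assumes that '-' is the only gap character, that all
       sequences have the same aligned length, and that a site counts as conserved when the
       fraction of sequences with a letter is at least the --conserved value. The function names
       are hypothetical.

```python
GAP = "-"  # assumption: '-' is the only gap character


def conserved_columns(sequences, conserved=0.75):
    """Return indices of columns where at least `conserved` of the sequences have a letter."""
    n = len(sequences)
    cols = []
    for i, column in enumerate(zip(*sequences)):
        letters = sum(1 for ch in column if ch != GAP)
        # assumption: the threshold is inclusive (>=)
        if letters / n >= conserved:
            cols.append(i)
    return cols


def missing_conserved(sequences, conserved=0.75):
    """For each sequence, count the conserved columns in which it has a gap."""
    cols = conserved_columns(sequences, conserved)
    return [sum(1 for i in cols if seq[i] == GAP) for seq in sequences]


alignment = ["ACGT", "ACGT", "ACG-", "A---"]
print(conserved_columns(alignment))   # [0, 1, 2]
print(missing_conserved(alignment))   # [0, 0, 0, 2]
```

       In this toy alignment the last sequence lacks two of the three conserved sites, so it
       would be the most likely outlier candidate under these assumptions.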

COLUMN FILTERING OPTIONS:

       -K arg, --keep-columns arg
              Keep columns from sequence arg.

       -m arg, --min-letters arg
              Remove columns with fewer than arg letters.

       -u arg, --remove-unique arg
              Remove insertions in a single sequence if longer than arg letters.

       -e, --erase-empty-columns
              Remove columns with no characters (all gaps).
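
       The effect of the --min-letters and --erase-empty-columns filters can be sketched in
       Python as follows. This is an illustration only, not the program's code. It assumes that
       '-' is the only gap character and that every other character counts as a letter. The
       function names are hypothetical.

```python
GAP = "-"  # assumption: '-' is the only gap character


def count_letters(column):
    """Number of non-gap characters in one alignment column."""
    return sum(1 for ch in column if ch != GAP)


def filter_columns(sequences, min_letters=1):
    """Keep only columns with at least `min_letters` letters.

    With min_letters=1 this removes exactly the all-gap columns.
    """
    columns = list(zip(*sequences))  # transpose rows into columns
    kept = [col for col in columns if count_letters(col) >= min_letters]
    if not kept:
        return ["" for _ in sequences]
    return ["".join(row) for row in zip(*kept)]  # transpose back into rows


alignment = ["AC-GT-", "A--GTA", "AC-G--"]
print(filter_columns(alignment, min_letters=1))  # ['ACGT-', 'A-GTA', 'ACG--']
print(filter_columns(alignment, min_letters=2))  # ['ACGT', 'A-GT', 'ACG-']
```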

OUTPUT OPTIONS:

       -S, --sort
              Sort partially ordered columns to group similar gaps.

       -L, --show-lengths
              Just print out sequence lengths.

       -N, --show-names
              Just print out sequence names.

       -F arg, --find-dups arg
              For each sequence, find the closest other sequence.
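
       The idea of a "closest other sequence" can be sketched in Python as below. This is a
       hedged illustration rather than the tool's algorithm. It assumes that distance is the
       number of mismatched aligned positions, that positions where either sequence has a gap
       ('-') are ignored, and that ties go to the lower index. How alignment-thin actually
       counts mismatches is not stated here, and the function names are hypothetical.

```python
GAP = "-"  # assumption: '-' is the only gap character


def mismatches(a, b):
    """Count positions where both sequences have a letter and the letters differ."""
    return sum(1 for x, y in zip(a, b) if x != GAP and y != GAP and x != y)


def closest_other(sequences):
    """For each sequence, return (index of the closest other sequence, mismatch count)."""
    result = []
    for i, a in enumerate(sequences):
        # min over (distance, index) pairs, so ties resolve to the lower index
        dist, j = min((mismatches(a, b), j)
                      for j, b in enumerate(sequences) if j != i)
        result.append((j, dist))
    return result


alignment = ["ACGTAC", "ACGTAA", "TTGTAC", "AC-TAA"]
print(closest_other(alignment))  # [(1, 1), (3, 0), (0, 2), (1, 0)]
```

       Under these assumptions, the second and fourth sequences are 0 mismatches apart, so one
       of them would be a natural candidate for removal when thinning similar sequences.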

EXAMPLES:

       Remove columns without a minimum number of letters:

              % alignment-thin --min-letters=5 file.fasta > file-thinned.fasta

       Remove sequences by name:

              % alignment-thin --remove=seq1,seq2 file.fasta > file2.fasta

              % alignment-thin --keep=seq1,seq2   file.fasta > file2.fasta

       Remove short sequences:

              % alignment-thin --longer-than=250 file.fasta > file-long.fasta

       Remove similar sequences with <= 5 differences from the closest other sequence:

              % alignment-thin --cutoff=5 file.fasta > more-than-5-differences.fasta

       Remove similar sequences until we have the right number of sequences:

              % alignment-thin --down-to=30 file.fasta > file-30taxa.fasta

       Remove dissimilar sequences that are missing conserved columns:

              % alignment-thin --remove-crazy=10 file.fasta > file2.fasta

       Protect some sequences from being removed:

              % alignment-thin --down-to=30 file.fasta --protect=seq1,seq2 > file2.fasta

              % alignment-thin --down-to=30 file.fasta --protect=@filename > file2.fasta

REPORTING BUGS:

       BAli-Phy online help: <http://www.bali-phy.org/docs.php>.

       Please send bug reports to <bali-phy-users@googlegroups.com>.

AUTHORS

       Benjamin Redelings.

                                             Feb 2018                           alignment-thin(1)