Provided by: genometools_1.6.5+ds-2.2_amd64
NAME
gt-ltrdigest - Identifies and annotates sequence features in LTR retrotransposon candidates.
SYNOPSIS
gt ltrdigest [option ...] gff3_file
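As a rough illustration of the synopsis, the following Python sketch assembles an argument list without running anything. The helper name build_ltrdigest_argv and all file names are hypothetical. The option names, and the "--" that closes the -hmms list, are taken from the option descriptions under DESCRIPTION. The -hmms list is placed last so that "--" is followed directly by the GFF3 file.

```python
from typing import List, Optional, Sequence


def build_ltrdigest_argv(
    gff3_file: str,
    seqfile: Optional[str] = None,
    hmms: Sequence[str] = (),
    trnas: Optional[str] = None,
    outfileprefix: Optional[str] = None,
) -> List[str]:
    """Assemble a `gt ltrdigest` argument list (hypothetical helper)."""
    argv = ["gt", "ltrdigest"]
    if outfileprefix is not None:
        # Without -outfileprefix only GFF3 output is produced.
        argv += ["-outfileprefix", outfileprefix]
    if trnas is not None:
        # Without -trnas the PBS search is disabled.
        argv += ["-trnas", trnas]
    if seqfile is not None:
        argv += ["-seqfile", seqfile]
    if hmms:
        # The model list is space-separated and finished with "--".
        argv += ["-hmms", *hmms, "--"]
    # Options come first, the GFF3 file last, as in the synopsis.
    argv.append(gff3_file)
    return argv


# All file names below are placeholders.
example = build_ltrdigest_argv(
    "candidates.gff3",
    seqfile="genome.fas",
    hmms=["RVT_1.hmm", "rve.hmm"],
    trnas="trnas.fas",
    outfileprefix="foo",
)
print(" ".join(example))
```

The resulting list could be passed to subprocess.run on a system where gt is installed; that step is left out here because it depends on the local installation.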
DESCRIPTION
-outfileprefix [string]
    prefix for output files (e.g. foo will create files called foo_*.csv and foo_*.fas). Omit this option for GFF3 output only.

-metadata [yes|no]
    output metadata (run conditions) to separate file (default: yes)

-seqnamelen [value]
    set maximal length of sequence names in FASTA headers (e.g. for clustalw or similar tools) (default: 20)

-pptlen [start end]
    required PPT length range (default: [8..30])

-uboxlen [start end]
    required U-box length range (default: [3..30])

-uboxdist [value]
    allowed U-box distance range from PPT (default: 0)

-pptradius [value]
    radius around beginning of 3' LTR to search for PPT (default: 30)

-pptrprob [value]
    purine emission probability inside PPT (default: 0.970000)

-pptyprob [value]
    pyrimidine emission probability inside PPT (default: 0.030000)

-pptgprob [value]
    background G emission probability outside PPT (default: 0.250000)

-pptcprob [value]
    background C emission probability outside PPT (default: 0.250000)

-pptaprob [value]
    background A emission probability outside PPT (default: 0.250000)

-ppttprob [value]
    background T emission probability outside PPT (default: 0.250000)

-pptuprob [value]
    U/T emission probability inside U-box (default: 0.910000)

-trnas [filename]
    tRNA library in multiple FASTA format for PBS detection. Omit this option to disable PBS search.

-pbsalilen [start end]
    required PBS/tRNA alignment length range (default: [11..30])

-pbsoffset [start end]
    allowed PBS offset from LTR boundary range (default: [0..5])

-pbstrnaoffset [start end]
    allowed PBS/tRNA 3' end alignment offset range (default: [0..5])

-pbsmaxedist [value]
    maximal allowed PBS/tRNA alignment unit edit distance (default: 1)

-pbsradius [value]
    radius around end of 5' LTR to search for PBS (default: 30)

-hmms
    profile HMM models for domain detection (separate by spaces, finish with --) in HMMER3 format. Omit this option to disable pHMM search.

-pdomevalcutoff [value]
    global E-value cutoff for pHMM search (default: 1E-6)

-pdomcutoff [...]
    model-specific score cutoff, choose from TC (trusted cutoff) | GA (gathering cutoff) | NONE (no cutoffs) (default: NONE)

-aliout [yes|no]
    output pHMM to amino acid sequence alignments (default: no)

-aaout [yes|no]
    output amino acid sequences for protein domain hits (default: no)

-allchains [yes|no]
    output features from all chains and unchained features, labeled with chain numbers (default: no)

-maxgaplen [value]
    maximal allowed gap size between fragments (in amino acids) when chaining pHMM hits for a protein domain (default: 50)

-force_recreate [yes|no]
    force recreation of hmmpressed profiles (default: no)

-pbsmatchscore [value]
    match score for PBS/tRNA alignments (default: 5)

-pbsmismatchscore [value]
    mismatch score for PBS/tRNA alignments (default: -10)

-pbsinsertionscore [value]
    insertion score for PBS/tRNA alignments (default: -20)

-pbsdeletionscore [value]
    deletion score for PBS/tRNA alignments (default: -20)

-v [yes|no]
    be verbose (default: no)

-o [filename]
    redirect output to specified file (default: undefined)

-gzip [yes|no]
    write gzip compressed output file (default: no)

-bzip2 [yes|no]
    write bzip2 compressed output file (default: no)

-force [yes|no]
    force writing to output file (default: no)

-seqfile [filename]
    set the sequence file from which to take the sequences (default: undefined)

-encseq [filename]
    set the encoded sequence indexname from which to take the sequences (default: undefined)

-seqfiles
    set the sequence files from which to extract the features; use -- to terminate the list of sequence files

-matchdesc [yes|no]
    search the sequence descriptions from the input files for the desired sequence IDs (in GFF3), reporting the first match (default: no)

-matchdescstart [yes|no]
    exactly match the sequence descriptions from the input files for the desired sequence IDs (in GFF3) from the beginning to the first whitespace (default: no)

-usedesc [yes|no]
    use sequence descriptions to map the sequence IDs (in GFF3) to actual sequence entries. If a description contains a sequence range (e.g., III:1000001..2000000), the first part is used as sequence ID (III) and the first range position as offset (1000001) (default: no)

-regionmapping [string]
    set file containing sequence-region to sequence file mapping (default: undefined)

-help
    display help for basic options and exit

-help+
    display help for all options and exit

-version
    display version information and exit
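
The -usedesc rule for descriptions that contain a sequence range can be illustrated with a small Python sketch. It is not the GenomeTools implementation: the function name split_range_description is hypothetical, and the pattern assumes the range sits at the very start of the description in the form ID:start..end, as in the example given for -usedesc.

```python
import re
from typing import Optional, Tuple

# Assumed layout: "<seqid>:<start>..<end>" at the start of the description.
_RANGE_DESC = re.compile(r"^(?P<seqid>[^\s:]+):(?P<start>\d+)\.\.(?P<end>\d+)")


def split_range_description(description: str) -> Optional[Tuple[str, int]]:
    """Return (sequence ID, offset) for a range description, else None."""
    match = _RANGE_DESC.match(description)
    if match is None:
        return None
    # The first part is the sequence ID; the first range position is the offset.
    return match.group("seqid"), int(match.group("start"))


print(split_range_description("III:1000001..2000000"))  # ('III', 1000001)
```

Descriptions that do not match the assumed layout yield None in this sketch; how gt itself treats such descriptions is not stated here.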
REPORTING BUGS
Report bugs to https://github.com/genometools/genometools/issues.