Provided by: timbl_6.4.4-4_amd64

**NAME**

timbl - Tilburg Memory Based Learner

**SYNOPSIS**

timbl [options]

timbl -f data-file -t test-file
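For scripted runs, the second synopsis form can be assembled programmatically. The following is a minimal Python sketch, assuming a `timbl` binary is available on the PATH and that flags and their values may be separated by a space as in the synopsis. The helper names `build_timbl_command` and `run_timbl` and the file names are hypothetical; only flags documented on this page are used.

```python
import shutil
import subprocess


def build_timbl_command(data_file, test_file, algorithm="0", k=1):
    """Return an argv list for `timbl -f data-file -t test-file`.

    Hypothetical helper: it only combines -f, -t, -a and -k as described
    in the OPTIONS section of this page.
    """
    return [
        "timbl",
        "-f", data_file,    # training data
        "-t", test_file,    # test data
        "-a", algorithm,    # "0" selects IB1, the documented default
        "-k", str(k),       # number of nearest neighbors
    ]


def run_timbl(cmd):
    """Run the command if timbl is installed; return None otherwise."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, check=False)


# Illustrative file names, not part of the package.
print(" ".join(build_timbl_command("train.data", "test.data", k=3)))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.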

**DESCRIPTION**

TiMBL is an open-source software package implementing several memory-based learning algorithms. These include IB1-IG, an implementation of k-nearest neighbor classification with feature weighting suitable for symbolic feature spaces, and IGTree, a decision-tree approximation of IB1-IG. All implemented algorithms store some representation of the training set explicitly in memory. During testing, new cases are classified by extrapolation from the most similar stored cases.
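The idea of classifying from stored cases can be illustrated with a short Python sketch. This is not TiMBL's implementation: the function names are hypothetical, the distance is a plain weighted overlap over symbolic features, and `k` here counts individual stored cases, which is a simplifying assumption.

```python
from collections import Counter


def overlap_distance(a, b, weights=None):
    """Weighted overlap: sum the weights of the features whose values differ."""
    if weights is None:
        weights = [1.0] * len(a)
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def classify(memory, instance, k=1, weights=None):
    """Return the majority class among the k nearest stored cases.

    `memory` is a list of (features, label) pairs kept verbatim, mirroring
    the "store the training set explicitly" idea from the description.
    """
    ranked = sorted(
        memory, key=lambda case: overlap_distance(case[0], instance, weights)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data with symbolic feature values (illustrative only).
memory = [
    (("a", "b", "c"), "X"),
    (("a", "b", "d"), "X"),
    (("e", "f", "g"), "Y"),
]
print(classify(memory, ("e", "f", "c")))                     # nearest case is the third one
print(classify(memory, ("e", "f", "c"), weights=[0, 0, 1]))  # only the last feature counts
```

With uniform weights the query is closest to the third stored case. Once only the last feature carries weight, the first stored case becomes the nearest one, which shows why the choice of feature weighting matters.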

**OPTIONS**

-a <n> or -a <string>
    determines the classification algorithm. Possible values are:
        0 or IB       the IB1 (k-NN) algorithm (default)
        1 or IGTREE   a decision-tree-based approximation of IB1
        2 or TRIBL    a hybrid of IB1 and IGTREE
        3 or IB2      an incremental editing version of IB1
        4 or TRIBL2   a non-parametric version of TRIBL

-b n
    number of lines used for bootstrapping (IB2 only)

-B n
    number of bins used for discretization of numeric feature values

--Beam=<n>
    limit +v db output to n highest-vote classes

--clones=<n>
    number of threads to use for parallel testing

-c n
    clipping frequency for prestoring MVDM matrices

+D
    store distributions on all nodes (necessary for using +v db with IGTree, but wastes memory otherwise)

--Diversify
    rescale weight (see docs)

-d val
    weigh neighbors as function of their distance:
        Z        equal weights to all (default)
        ID       Inverse Distance
        IL       Inverse Linear
        ED:a     Exponential Decay with factor a (no whitespace!)
        ED:a:b   Exponential Decay with factor a and b (no whitespace!)

-e n
    estimate time until n patterns tested

-f file
    read from data file 'file' OR use filenames from 'file' for cross validation test

-F format
    assume the specified input format (Compact, C4.5, ARFF, Columns, Binary, Sparse)

-G normalization
    normalize distributions (+v db option only). Supported normalizations are:
        Probability or 0         normalize between 0 and 1
        addFactor:<f> or 1:<f>   add f to all possible targets, then normalize between 0 and 1 (default f=1.0)
        logProbability or 2      add 1 to the target weight, take the base-10 logarithm, and then normalize between 0 and 1

+H or -H
    write hashed trees (default +H)

-i file
    read the InstanceBase from 'file' (skips phase 1 & 2)

-I file
    dump the InstanceBase in 'file'

-k n
    search 'n' nearest neighbors (default n = 1)

-L n
    set value frequency threshold to back off from MVDM to Overlap at level n

-l n
    fixed feature value length (Compact format only)

-m string
    use feature metrics as specified in 'string'. The format is GlobalMetric:MetricRange:MetricRange, e.g.: mO:N3:I2,5-7
        C    cosine distance (global only; numeric features implied)
        D    dot product (global only; numeric features implied)
        DC   Dice coefficient
        O    weighted overlap (default)
        E    Euclidean distance
        L    Levenshtein distance
        M    modified value difference
        J    Jeffrey divergence
        S    Jensen-Shannon divergence
        N    numeric values
        I    Ignore named values

--matrixin=file
    read ValueDifference Matrices from file 'file'

--matrixout=file
    store ValueDifference Matrices in 'file'

-n file
    create a C4.5-style names file 'file'

-M n
    size of MaxBests Array

-N n
    number of features (default 2500)

-o s
    use s as output filename

--occurences=<value>
    the input file contains occurrence counts (at the last position). value can be one of: train, test or both

-O path
    save output using 'path'

-p n
    show progress every n lines (default p = 100,000)

-P path
    read data using 'path'

-q n
    set TRIBL threshold at level n

-R n
    solve ties at random with seed n

-s
    use the exemplar weights from the input file

-s0
    ignore the exemplar weights from the input file

-T n
    use feature n as the class label (default: the last feature)

-t file
    test using 'file'

-t leave_one_out
    test with the leave-one-out testing regimen (IB1 only). You may add --sloppy to speed up leave-one-out testing (but see docs)

-t cross_validate
    perform cross-validation test (IB1 only)

-t @file
    test using files and options described in 'file'. Supported options: d e F k m o p q R t u v w x % -

--Treeorder=value
    ordering of the Tree:
        DO    none
        GRO   using GainRatio
        IGO   using InformationGain
        1/V   using 1/# of Values
        G/V   using GainRatio/# of Values
        I/V   using InfoGain/# of Values
        X2O   using X-square
        X/V   using X-square/# of Values
        SVO   using Shared Variance
        S/V   using Shared Variance/# of Values
        GxE   using GainRatio * SplitInfo
        IxE   using InformationGain * SplitInfo
        1/S   using 1/SplitInfo

-u file
    read value-class probabilities from 'file'

-U file
    save value-class probabilities in 'file'

-V
    Show VERSION

+v level or -v level
    set or unset verbosity level, where level is:
        s    work silently
        o    show all options set
        b    show node/branch count and branching factor
        f    show calculated feature weights (default)
        p    show value difference matrices
        e    show exact matches
        as   show advanced statistics (memory consuming)
        cm   show confusion matrix (implies +vas)
        cs   show per-class statistics (implies +vas)
        cf   add confidence to output file (needs -G)
        di   add distance to output file
        db   add distribution of best matched to output file
        md   add matching depth to output file
        k    add a summary for all k neighbors to output file (sets -x)
        n    add nearest neighbors to output file (sets -x)
    You may combine levels using '+', e.g. +v p+db or -v o+di

-w n
    weighting:
        0 or nw   no weighting
        1 or gr   weigh using gain ratio (default)
        2 or ig   weigh using information gain
        3 or x2   weigh using the chi-square statistic
        4 or sv   weigh using the shared variance statistic
        5 or sd   weigh using standard deviation (all features must be numeric)

-w file
    read weights from 'file'

-w file:n
    read weight n from 'file'

-W file
    calculate and save all weights in 'file'

+% or -%
    do or don't save test result (%) to file

+x or -x
    do or don't use the exact match shortcut (IB1 and IB2 only, default is -x)

-X file
    dump the InstanceBase as XML in 'file'
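To make the default `-w` settings more concrete, the sketch below computes information gain and gain ratio for one symbolic feature using the standard textbook definitions (class entropy minus the expected entropy after splitting on the feature, divided by the split information for gain ratio). It is an illustration under those assumptions, not TiMBL's own code; the function names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )


def feature_weights(instances, labels, feature):
    """Return (information_gain, gain_ratio) for the feature at index `feature`."""
    # Group the class labels by the value this feature takes.
    by_value = defaultdict(list)
    for inst, label in zip(instances, labels):
        by_value[inst[feature]].append(label)

    total = len(labels)
    # Expected class entropy once the feature value is known.
    remainder = sum(len(part) / total * entropy(part) for part in by_value.values())
    info_gain = entropy(labels) - remainder
    # Split info: entropy of the feature's own value distribution.
    split_info = -sum(
        (len(part) / total) * math.log2(len(part) / total)
        for part in by_value.values()
    )
    gain_ratio = info_gain / split_info if split_info > 0 else 0.0
    return info_gain, gain_ratio


# Toy data: the first feature determines the class, the second is uninformative.
instances = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = ["P", "P", "N", "N"]
for index in range(2):
    print(index, feature_weights(instances, labels, index))
```

Dividing by the split information penalizes features with many distinct values, which would otherwise receive a high information gain simply because they fragment the data.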

**BUGS**

possibly

**AUTHORS**

Ko van der Sloot Timbl@uvt.nl

Antal van den Bosch Timbl@uvt.nl

**SEE ALSO**

timblserver(1)

2012 July 10 timbl(1)