Provided by: python-gamera.toolkits.ocr_1.2.2-5_all
NAME
ocr4gamera - OCR system using the Gamera framework
USAGE
ocr4gamera -x <traindata> [options] <imagefile>
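As a rough illustration of how the synopsis maps onto a concrete call, the following Python sketch assembles an argument list from a few of the flags described under OPTIONS. The helper name build_ocr4gamera_command and the file names traindata.xml, page.png and page.txt are hypothetical placeholders, not part of ocr4gamera. The two checks mirror the restrictions stated for -od and -ho. The sketch only builds the command line and does not run it.

```python
import shlex


def build_ocr4gamera_command(traindata, imagefile, deskew=False, output=None,
                             output_directory=None, hocr_out=False,
                             verbosity=0):
    """Assemble an ocr4gamera argument list (hypothetical helper).

    Only flags listed in the OPTIONS section are used.
    """
    # -od cannot be combined with -o, according to the option description.
    if output is not None and output_directory is not None:
        raise ValueError("-o and -od cannot be combined")
    # -ho is documented as working only together with -o.
    if hocr_out and output is None:
        raise ValueError("-ho only works together with -o")

    argv = ["ocr4gamera", "-x", traindata]
    if verbosity:
        argv += ["-v", str(verbosity)]
    if deskew:
        argv.append("-d")
    if output is not None:
        argv += ["-o", output]
    if output_directory is not None:
        argv += ["-od", output_directory]
    if hocr_out:
        argv.append("-ho")
    argv.append(imagefile)  # the image file comes last, as in the synopsis
    return argv


if __name__ == "__main__":
    # Placeholder file names, chosen for illustration only.
    cmd = build_ocr4gamera_command("traindata.xml", "page.png",
                                   deskew=True, output="page.txt")
    print(shlex.join(cmd))
```

The returned list could be passed to subprocess.run on a system where ocr4gamera is installed. A list is used instead of a single string so that file names containing spaces need no manual quoting.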
OPTIONS
-v <int>, --verbosity=<int>
       Set verbosity level to <int>. Possible values are 0 (default): silent operation; 1: information on progress; >2: segmentation info is written to PNG files with prefix debug_.

-h, --help
       Display help and exit.

--version
       Print version and exit.

-d, --deskew
       Do a skew correction (recommended).

-mf <ws>, --median_filter=<ws>
       Smooth the input image with a median filter with window size <ws>. Default is <ws> = 0, which means no smoothing.

-ds <s>, --despeckle=<s>
       Remove all speckles with size <= <s>. Default is <s> = 0, which means no despeckling.

-f, --filter
       Filter out very large components (images) and very small components (noise).

-a, --automatic-group
       Autogroup glyphs with classifier.

-x <file>, --xmlfile=<file>
       Read training data from <file>.

-o <xml>, --output=<xml>
       Write recognized text to file <xml> (otherwise it is written to stdout).

-od <dir>, --output_directory=<dir>
       For each input image <img>, write the recognized text to <dir>/<img>.txt. Note that this option cannot be used in combination with -o (--output).

-c <csv>, --extra_chars_csvfile=<csv>
       Read additional class name conversions from file <csv>. <csv> must contain one conversion per line.

-R <rules>, --heuristic_rules=<rules>
       Apply heuristic rules <rules> for disambiguation of some chars. <rules> can be roman (default) or none (for no rules).

-D, --dictionary-correction
       Correct words using a dictionary (requires aspell or ispell).

-L <lang>, --dictionary-language=<lang>
       Use <lang> as language for aspell (when option -D is set).

-e <int>, --edit-distance=<int>
       Correct words only when the edit distance is not more than <int>.

-ho, --hocr_out
       Write output as an hocr file (only works with the -o option).

-hi <hocrfile>, --hocr_in=<hocrfile>
       Use an hocr input file for textline segmentation.

OCR4GAMERA(1)