
NAME

       unicode - command line unicode database query tool

SYNOPSIS

       unicode [options] string

DESCRIPTION

       This manual page documents the unicode command.

       unicode is a command line unicode database query tool.

OPTIONS

-h --help

Show help and exit.

-x --hexadecimal

Assume string to be a hexadecimal number

-d --decimal

Assume string to be a decimal number

-o --octal

Assume string to be an octal number

-b --binary

Assume string to be a binary number

-r --regexp

Assume string to be a regular expression

-s --string

Assume string to be a sequence of characters

-a --auto

Try to guess type of string from one of the above (default)

-mMAXCOUNT
--max=MAXCOUNT

Maximum number of codepoints to display (default: 20). Use 0 for unlimited.

-iIOCHARSET
--io=IOCHARSET

I/O character set. For best results, run unicode on a UTF-8 capable terminal and
specify IOCHARSET as UTF-8. unicode tries to guess this value from your locale,
so with a properly configured locale you should not need to specify it.

--fcp=CHARSET
--fromcp=CHARSET

Convert numerical arguments from this encoding, default: no conversion. Multibyte
encodings are supported. This is ignored for non-numerical arguments.

-cADDCHARSET
--charset-add=ADDCHARSET

Show hexadecimal representation of displayed characters in this additional charset.

-CUSE_COLOUR
--colour=USE_COLOUR

USE_COLOUR is one of: on, off, auto.

--colour=on will use ANSI colour codes to colourise the output

--colour=off won't use colours.

--colour=auto will test if standard output is a tty, and use colours only when it
is.

--color is a synonym of --colour

-v --verbose

Be more verbose about displayed characters, e.g. display Unihan information, if
available.

-w --wikipedia

Spawn browser pointing to English Wikipedia entry about the character.

--wt --wiktionary

Spawn browser pointing to English Wiktionary entry about the character.

--brief

Display character information in brief format

--format=fmt

Use your own format for character information display. See the README for details.

--list

List (approximately) all known encodings.

--download

Try to download UnicodeData.txt into ~/.unicode/

--ascii

Display ASCII table

--brexit-ascii
--brexit

Display ASCII table (EU–UK Trade and Cooperation Agreement 2020 version)
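
The numeric options above (-x, -d, -o, -b) differ only in the base used to read
the string, and -c shows the bytes a character takes in another charset. The
following Python sketch illustrates both ideas. The names BASES, parse_codepoint
and hex_in_charset are hypothetical and are not taken from the unicode source.

```python
# Hypothetical mapping from the numeric options to the base they imply.
BASES = {"-x": 16, "-d": 10, "-o": 8, "-b": 2}


def parse_codepoint(option: str, text: str) -> int:
    """Read text as a number in the base implied by the option."""
    return int(text, BASES[option])


def hex_in_charset(ch: str, charset: str) -> str:
    """Hexadecimal form of ch once encoded in the given charset."""
    return ch.encode(charset).hex()


# The same codepoint, U+00E1, written in four bases.
for option, text in (("-x", "e1"), ("-d", "225"), ("-o", "341"), ("-b", "11100001")):
    print(option, text, f"U+{parse_codepoint(option, text):04X}")

# What an additional charset column could contain for this character.
print(hex_in_charset("\u00e1", "cp1250"), hex_in_charset("\u00e1", "utf-8"))
```

All four calls in the loop yield the same codepoint, which is why only one of
these options is needed for any given argument.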

USAGE

       unicode tries to guess the type of an argument. In particular, if the argument looks like
       a  valid  hexadecimal  representation  of a Unicode codepoint, it will be considered to be
       such. Using

       unicode face

       will display information about U+FACE CJK COMPATIBILITY IDEOGRAPH-FACE, and  it  will  not
       search for 'face' in character descriptions - for the latter, use:

       unicode -r face
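
       The ambiguity comes from the fact that every letter in "face" is also a hexadecimal
       digit. The Python sketch below shows one way such a guess could be made. It is an
       illustration only, not the heuristic unicode actually uses, and the function name
       looks_like_codepoint is hypothetical.

```python
import string


def looks_like_codepoint(arg: str) -> bool:
    """Return True if arg can be read as a hexadecimal Unicode codepoint."""
    text = arg.upper()
    if text.startswith("U+"):
        text = text[2:]
    # Every remaining character must be a hexadecimal digit.
    if not text or any(ch not in string.hexdigits for ch in text):
        return False
    # Unicode codepoints end at U+10FFFF.
    return int(text, 16) <= 0x10FFFF


for arg in ("face", "U+00E1", "smile"):
    kind = "codepoint" if looks_like_codepoint(arg) else "search term"
    print(f"{arg!r} -> {kind}")
```

       Under this sketch "face" is classified as a codepoint, which is why -r is needed to
       force a search of the character descriptions.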

       For  example,  you can use any of the following to display information about  U+00E1 LATIN
       SMALL LETTER A WITH ACUTE (á):

       unicode 00E1

       unicode U+00E1

       unicode á

       unicode 'latin small letter a with acute'
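
       These forms are equivalent because a codepoint number, a literal character and a
       character name all identify the same database entry. A small Python sketch using the
       standard unicodedata module shows the equivalence. It relies on Python's bundled
       copy of the Unicode database rather than on unicode itself, and the helper name
       describe is hypothetical.

```python
import unicodedata


def describe(ch: str) -> str:
    """Format a character as 'U+XXXX NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


by_number = chr(int("00E1", 16))                                 # from the hex codepoint
by_name = unicodedata.lookup("LATIN SMALL LETTER A WITH ACUTE")  # from the name
literal = "\u00e1"                                               # the character itself

print(describe(by_number))
print(by_number == by_name == literal)
```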

       You can specify a range of characters as arguments; unicode will show these characters in
       a tabular format, aligned to 256-codepoint boundaries.  Use two dots ".." to indicate the
       range, e.g.

       unicode 0450..0520

       will display the whole Cyrillic, Armenian and Hebrew blocks (characters from U+0400 to U+05FF)

       unicode 0400..

       will display just characters from U+0400 up to U+04FF
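
       The alignment in both examples amounts to clearing the low 8 bits of the start and
       setting the low 8 bits of the end. The Python sketch below reproduces the two ranges
       quoted above. Treating an open-ended range as ending in the same 256-codepoint page
       as its start is an assumption drawn from the second example, and the function name
       aligned_range is hypothetical.

```python
def aligned_range(spec: str) -> tuple[int, int]:
    """Expand 'START..END' (hexadecimal) to whole 256-codepoint pages."""
    start_text, _, end_text = spec.partition("..")
    start = int(start_text, 16)
    # Assumption: with no END given, stay within the page that holds START.
    end = int(end_text, 16) if end_text else start
    first = start & ~0xFF  # round down to a multiple of 256
    last = end | 0xFF      # round up to the last codepoint of the page
    return first, last


for spec in ("0450..0520", "0400.."):
    first, last = aligned_range(spec)
    print(f"{spec} -> U+{first:04X}..U+{last:04X}")
```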

       Use --fromcp to query codepoints from other encodings:

       unicode --fromcp cp1250 -d 200

       Multibyte encodings are supported:

       unicode --fromcp big5 -x aff3

       and multi-char strings are supported, too:

       unicode --fromcp utf-8 -x c599c3adc5a5
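
       Conceptually, --fromcp treats the number as a sequence of bytes in the given encoding
       and decodes those bytes to characters. The Python sketch below mimics that idea for
       the cp1250 and UTF-8 examples. It is an illustration under that assumption, not the
       tool's own code, and the function name from_codepage is hypothetical.

```python
def from_codepage(value: int, charset: str) -> str:
    """Interpret value as big-endian bytes and decode them from charset."""
    # Use the fewest bytes that hold the value. Leading zero bytes are
    # lost this way, which is acceptable for this sketch.
    length = max(1, (value.bit_length() + 7) // 8)
    raw = value.to_bytes(length, "big")
    return raw.decode(charset)


single = from_codepage(200, "cp1250")                      # decimal 200 is byte 0xC8
several = from_codepage(int("c599c3adc5a5", 16), "utf-8")  # three 2-byte sequences

for ch in single + several:
    print(f"U+{ord(ch):04X}")
```

       The UTF-8 example decodes to three characters because the six bytes form three
       two-byte sequences.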

BUGS

       Tabular format does not deal well with full-width, combining, control and RTL characters.
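
       These character classes break column alignment because they do not occupy exactly one
       terminal cell or are not drawn left to right. The Python sketch below shows how such
       characters could be detected with the standard unicodedata module. It is not part of
       unicode; the function name layout_hazard is hypothetical, and reading "full-width" as
       East Asian width F or W is an assumption.

```python
import unicodedata


def layout_hazard(ch: str) -> str:
    """Name the reason a character may misalign a fixed-width table."""
    if unicodedata.category(ch) == "Cc":
        return "control"
    if unicodedata.combining(ch):
        return "combining"
    # Assumption: "full-width" covers East Asian Fullwidth and Wide.
    if unicodedata.east_asian_width(ch) in ("F", "W"):
        return "full-width"
    if unicodedata.bidirectional(ch) in ("R", "AL"):
        return "RTL"
    return "none"


for ch in ("\t", "\u0301", "\uff21", "\u05d0", "a"):
    print(f"U+{ord(ch):04X} {layout_hazard(ch)}")
```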

AUTHOR

       Radovan Garabík <garabik @ kassiopeia.juls.savba.sk>

                                            2003-01-31                                 UNICODE(1)