Provided by: m17n-docs_1.6.2-2.1_all
NAME
M-text - M-text objects and API for them.

Typedefs

typedef struct MText MText
    Type of M-texts.

Enumerations

enum MTextFormat { MTEXT_FORMAT_US_ASCII, MTEXT_FORMAT_UTF_8, MTEXT_FORMAT_UTF_16LE, MTEXT_FORMAT_UTF_16BE, MTEXT_FORMAT_UTF_32LE, MTEXT_FORMAT_UTF_32BE, MTEXT_FORMAT_MAX }
    Enumeration for specifying the format of an M-text.
enum MTextLineBreakOption { MTEXT_LBO_SP_CM = 1, MTEXT_LBO_KOREAN_SP = 2, MTEXT_LBO_AI_AS_ID = 4, MTEXT_LBO_MAX }
    Enumeration for specifying a set of line breaking options.

Functions

int mtext_line_break (MText *mt, int pos, int option, int *after)
    Find a linebreak position of an M-text.
MText * mtext ()
    Allocate a new M-text.
MText * mtext_from_data (const void *data, int nitems, enum MTextFormat format)
    Allocate a new M-text with specified data.
void * mtext_data (MText *mt, enum MTextFormat *fmt, int *nunits, int *pos_idx, int *unit_idx)
    Get information about the text data in M-text.
int mtext_len (MText *mt)
    Number of characters in M-text.
int mtext_ref_char (MText *mt, int pos)
    Return the character at the specified position in an M-text.
int mtext_set_char (MText *mt, int pos, int c)
    Store a character into an M-text.
MText * mtext_cat_char (MText *mt, int c)
    Append a character to an M-text.
MText * mtext_dup (MText *mt)
    Create a copy of an M-text.
MText * mtext_cat (MText *mt1, MText *mt2)
    Append an M-text to another.
MText * mtext_ncat (MText *mt1, MText *mt2, int n)
    Append a part of an M-text to another.
MText * mtext_cpy (MText *mt1, MText *mt2)
    Copy an M-text to another.
MText * mtext_ncpy (MText *mt1, MText *mt2, int n)
    Copy the first few characters in an M-text to another.
MText * mtext_duplicate (MText *mt, int from, int to)
    Create a new M-text from a part of an existing M-text.
MText * mtext_copy (MText *mt1, int pos, MText *mt2, int from, int to)
    Copy characters in the specified range into an M-text.
int mtext_del (MText *mt, int from, int to)
    Delete characters in the specified range destructively.
int mtext_ins (MText *mt1, int pos, MText *mt2)
    Insert an M-text into another M-text.
int mtext_insert (MText *mt1, int pos, MText *mt2, int from, int to)
    Insert sub-text of an M-text into another M-text.
int mtext_ins_char (MText *mt, int pos, int c, int n)
    Insert a character into an M-text.
int mtext_replace (MText *mt1, int from1, int to1, MText *mt2, int from2, int to2)
    Replace sub-text of M-text with another.
int mtext_character (MText *mt, int from, int to, int c)
    Search for a character in an M-text.
int mtext_chr (MText *mt, int c)
    Return the position of the first occurrence of a character in an M-text.
int mtext_rchr (MText *mt, int c)
    Return the position of the last occurrence of a character in an M-text.
int mtext_cmp (MText *mt1, MText *mt2)
    Compare two M-texts character-by-character.
int mtext_ncmp (MText *mt1, MText *mt2, int n)
    Compare initial parts of two M-texts character-by-character.
int mtext_compare (MText *mt1, int from1, int to1, MText *mt2, int from2, int to2)
    Compare specified regions of two M-texts.
int mtext_spn (MText *mt, MText *accept)
    Search an M-text for a set of characters.
int mtext_cspn (MText *mt, MText *reject)
    Search an M-text for the complement of a set of characters.
int mtext_pbrk (MText *mt, MText *accept)
    Search an M-text for any of a set of characters.
MText * mtext_tok (MText *mt, MText *delim, int *pos)
    Look for a token in an M-text.
int mtext_text (MText *mt1, int pos, MText *mt2)
    Locate an M-text in another.
int mtext_search (MText *mt1, int from, int to, MText *mt2)
    Locate an M-text in a specific range of another.
int mtext_casecmp (MText *mt1, MText *mt2)
    Compare two M-texts ignoring case.
int mtext_ncasecmp (MText *mt1, MText *mt2, int n)
    Compare initial parts of two M-texts ignoring case.
int mtext_case_compare (MText *mt1, int from1, int to1, MText *mt2, int from2, int to2)
    Compare specified regions of two M-texts ignoring case.
int mtext_lowercase (MText *mt)
    Lowercase an M-text.
int mtext_titlecase (MText *mt)
    Titlecase an M-text.
int mtext_uppercase (MText *mt)
    Uppercase an M-text.

Variables

MSymbol Mlanguage

Variables: Default Endian of UTF-16 and UTF-32

enum MTextFormat MTEXT_FORMAT_UTF_16
    Variable of value MTEXT_FORMAT_UTF_16LE or MTEXT_FORMAT_UTF_16BE.
const int MTEXT_FORMAT_UTF_32
    Variable of value MTEXT_FORMAT_UTF_32LE or MTEXT_FORMAT_UTF_32BE.
Detailed Description
M-text objects and API for them.

In the m17n library, text is represented as an object called M-text rather than as a C-string (char * or unsigned char *). An M-text is a sequence of zero or more characters, and can be created from various character sources, e.g. C-strings, files, character codes, etc.

M-texts are more useful than C-strings in the following respects.

• M-texts can handle a mixture of characters of various scripts, including all Unicode characters and more. This is an indispensable facility when handling multilingual text.

• Each character in an M-text can have properties called text properties. Text properties store various kinds of information attached to parts of an M-text to provide application programs with a unified view of that information. As rich information can be stored in M-texts in the form of text properties, functions in application programs can be simple.

In addition, the library provides many functions to manipulate an M-text in the same way as a C-string.
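To illustrate why a sequence of characters differs from a C-string, the following sketch counts the characters encoded in a UTF-8 C-string using only the standard C library. It does not call the m17n library: the function name utf8_char_count is hypothetical, and the input is assumed to be well-formed UTF-8. Functions such as mtext_len() and mtext_ref_char() work in terms of characters, whereas a C-string is indexed by bytes.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the m17n library: count the
   characters in a NUL-terminated string, assuming it is well-formed
   UTF-8.  Continuation bytes have the bit pattern 10xxxxxx and are
   skipped; every other byte starts a new character.  */
size_t
utf8_char_count (const char *s)
{
  size_t n = 0;

  for (; *s != '\0'; s++)
    if (((unsigned char) *s & 0xC0) != 0x80)
      n++;
  return n;
}
```

For the five-byte string "\xC3\xA9t\xC3\xA9" (the UTF-8 encoding of "été"), the helper returns 3, while a byte count of the same string gives 5.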
Typedef Documentation
typedef struct MText MText Type of M-texts. The type MText is for an M-text object. Its internal structure is concealed from application programs.
Enumeration Type Documentation
enum MTextFormat
    Enumeration for specifying the format of an M-text.

    The enum MTextFormat is used as an argument of the mtext_from_data() function to specify the format of the data from which an M-text is created.

    Enumerator:
        MTEXT_FORMAT_US_ASCII
            US-ASCII encoding
        MTEXT_FORMAT_UTF_8
            UTF-8 encoding
        MTEXT_FORMAT_UTF_16LE
            UTF-16LE encoding
        MTEXT_FORMAT_UTF_16BE
            UTF-16BE encoding
        MTEXT_FORMAT_UTF_32LE
            UTF-32LE encoding
        MTEXT_FORMAT_UTF_32BE
            UTF-32BE encoding
        MTEXT_FORMAT_MAX

enum MTextLineBreakOption
    Enumeration for specifying a set of line breaking options.

    The enum MTextLineBreakOption controls the line breaking algorithm of the function mtext_line_break(). Pass the logical-or of its members as the argument option.

    Enumerator:
        MTEXT_LBO_SP_CM
            Enable the legacy support for a space character as the base for combining marks. See section 8.3 of UAX#14.
        MTEXT_LBO_KOREAN_SP
            Use space characters for line breaking Korean text.
        MTEXT_LBO_AI_AS_ID
            Treat characters of ambiguous line-breaking class as characters of ideographic line-breaking class.
        MTEXT_LBO_MAX
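Because the option argument of mtext_line_break() is a logical-or of enumerators, a set of options is an ordinary bitmask. The sketch below mirrors the three documented values (1, 2 and 4) in a local enum so that it compiles without the m17n headers. The DEMO_ names and both helper functions are hypothetical; a real program would presumably include the m17n header and use the MTEXT_LBO_ names directly.

```c
/* Local mirror of the values documented for enum
   MTextLineBreakOption.  The DEMO_ names are hypothetical stand-ins
   for MTEXT_LBO_SP_CM, MTEXT_LBO_KOREAN_SP and MTEXT_LBO_AI_AS_ID.  */
enum demo_line_break_option
  {
    DEMO_LBO_SP_CM = 1,
    DEMO_LBO_KOREAN_SP = 2,
    DEMO_LBO_AI_AS_ID = 4
  };

/* Return nonzero if FLAG is set in the option set OPTION.  */
int
demo_lbo_has (int option, int flag)
{
  return (option & flag) != 0;
}

/* Build the option set "SP_CM and AI_AS_ID" with bitwise OR.  */
int
demo_lbo_example (void)
{
  return DEMO_LBO_SP_CM | DEMO_LBO_AI_AS_ID;
}
```

Since each enumerator occupies a distinct bit, any combination can be passed as a single int, and 0 selects none of the options.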
Variable Documentation
enum MTextFormat MTEXT_FORMAT_UTF_16
    Variable of value MTEXT_FORMAT_UTF_16LE or MTEXT_FORMAT_UTF_16BE.

    The global variable MTEXT_FORMAT_UTF_16 is initialized to MTEXT_FORMAT_UTF_16LE on a 'Little Endian' system (storing words with the least significant byte first), and to MTEXT_FORMAT_UTF_16BE on a 'Big Endian' system (storing words with the most significant byte first).

    SEE ALSO
        mtext_from_data()

const int MTEXT_FORMAT_UTF_32
    Variable of value MTEXT_FORMAT_UTF_32LE or MTEXT_FORMAT_UTF_32BE.

    The global variable MTEXT_FORMAT_UTF_32 is initialized to MTEXT_FORMAT_UTF_32LE on a 'Little Endian' system (storing words with the least significant byte first), and to MTEXT_FORMAT_UTF_32BE on a 'Big Endian' system (storing words with the most significant byte first).

    SEE ALSO
        mtext_from_data()

MSymbol Mlanguage
    The symbol whose name is 'language'.
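The choice between the LE and BE variants depends only on the byte order of the running system. The following sketch shows one common way to detect that byte order at run time with the standard C library. It is not the m17n library's own initialization code: the enum demo_utf16_format and the function demo_native_utf16_format are hypothetical stand-ins for MTEXT_FORMAT_UTF_16LE, MTEXT_FORMAT_UTF_16BE and the logic that selects between them.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for MTEXT_FORMAT_UTF_16LE and
   MTEXT_FORMAT_UTF_16BE.  */
enum demo_utf16_format
  {
    DEMO_UTF_16LE,
    DEMO_UTF_16BE
  };

/* Store the 16-bit value 0x0001 and inspect its first byte in
   memory.  On a little endian system the least significant byte
   (0x01) comes first; on a big endian system the first byte is
   0x00.  */
enum demo_utf16_format
demo_native_utf16_format (void)
{
  uint16_t probe = 0x0001;
  unsigned char first;

  memcpy (&first, &probe, 1);
  return first == 0x01 ? DEMO_UTF_16LE : DEMO_UTF_16BE;
}
```

An application that holds UTF-16 data in native byte order can pass MTEXT_FORMAT_UTF_16 to mtext_from_data() instead of performing such a check itself.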
Author
Generated automatically by Doxygen for The m17n Library from the source code.
COPYRIGHT
Copyright (C) 2001 Information-technology Promotion Agency (IPA)
Copyright (C) 2001-2011 National Institute of Advanced Industrial Science and Technology (AIST)

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License <http://www.gnu.org/licenses/fdl.html>.