Ubuntu Manpage: FBB::String - Several operations on std::string objects

Provided by: libbobcat-dev_6.11.00-1_amd64

NAME

       FBB::String - Several operations on std::string objects

SYNOPSIS

       #include <bobcat/string>
       Linking option: -lbobcat

DESCRIPTION

       This  class  offers  facilities  for  often  used  transformations  on std::string objects, which are not
       supported by the std::string class itself. All members of FBB::String are static.

       Initially this class was derived from std::string. Deriving from std::string, however, is considered bad
       design as std::string was not designed as a base-class.

       FBB::String offers a series of static member functions providing the facilities originally implemented as
       non-static  members.  One  of  these  members  is  the (overloaded) split member, splitting a string into
       elements separated by one or more configurable characters. These  elements  may  contain  or  consist  of
       double-  or  single-quoted  (sub) strings and escape characters. Escape characters are converted to their
       implied byte-values (e.g., \n is converted to byte value 10) unless they are  embedded  in  single-quoted
       (sub)  strings.  Quotes surrounding double- and single-quoted (sub) strings are removed from the elements
       returned by the split members.
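
       The escape handling can be pictured with a small decoder. The sketch below follows the rules this page
       lists for the unescape member. decodeEscapes is a hypothetical stand-in written with the standard
       library only, not Bobcat's implementation, and keeping a lone trailing backslash is an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: decodes backslash escapes following the rules
// listed for unescape. Not Bobcat's code.
std::string decodeEscapes(std::string const &str)
{
    static std::string const names{ "abfnrtv" };
    static char const values[] = "\a\b\f\n\r\t\v";

    auto isOctal = [](char ch) { return ch >= '0' && ch <= '7'; };
    auto isHex = [](char ch)
                 {
                     return std::isxdigit(static_cast<unsigned char>(ch)) != 0;
                 };

    std::string ret;
    for (std::size_t idx = 0; idx < str.size(); ++idx)
    {
        if (str[idx] != '\\' || idx + 1 == str.size())
        {
            ret += str[idx];    // plain character, or a lone trailing
            continue;           // backslash (kept as-is: an assumption)
        }

        char ch = str[++idx];   // the character following the backslash
        std::size_t pos = names.find(ch);

        if (pos != std::string::npos)       // standard escape, e.g. \n -> 10
            ret += values[pos];
        else if (ch == 'x')                 // \x: at most two hex digits
        {
            std::size_t len = 0;
            while (len < 2 && idx + 1 + len < str.size()
                   && isHex(str[idx + 1 + len]))
                ++len;

            if (len == 0)
                ret += ch;                  // no digits: keep the x
            else
            {
                ret += static_cast<char>(
                            std::stoi(str.substr(idx + 1, len), nullptr, 16));
                idx += len;
            }
        }
        else if (isOctal(ch))               // at most three octal digits
        {
            std::size_t len = 1;            // ch is the first digit
            while (len < 3 && idx + len < str.size()
                   && isOctal(str[idx + len]))
                ++len;

            ret += static_cast<char>(
                        std::stoi(str.substr(idx, len), nullptr, 8));
            idx += len - 1;
        }
        else
            ret += ch;                      // drop the backslash, keep ch
    }
    return ret;
}
```

       Under these rules the text thr\x65\145 decodes to three: both \x65 (hexadecimal) and \145 (octal)
       denote byte value 101, the letter e.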

NAMESPACE

       FBB
       All constructors, members, operators and manipulators, mentioned in this man-page,  are  defined  in  the
       namespace FBB.

INHERITS FROM

--

ENUMERATIONS

       o      Type:
              This  enumeration  indicates  the nature of the content of an element in the array returned by the
              overloaded split members (see below).

              DQUOTE, a subset of the characters in the matching string element was delimited by double quotes
              in the string that was parsed by the split members.

              DQUOTE_UNTERMINATED,  the  content  of  the string that was parsed by the split members started at
              some point with a double quote, but the matching ending double quote was lacking.

              ESCAPED_END,  the content of the string that was parsed by the  split  members  ended  in  a  mere
              backslash.

              NORMAL, a normal string;

              SEPARATOR, a separator;

              SQUOTE, a subset of the characters in the matching string element was delimited by single quotes
              in the string that was parsed by the split members.

              SQUOTE_UNTERMINATED, the content of the string that was parsed by the split members started at
              some point with a single quote, but the matching ending single quote was lacking.

       o      SplitType:
              This  enumeration  is used to specify how split members should split the information in the string
              objects that are passed to these members:

              TOK: the split member acts like the standard C function strtok(3). The essence  here  is  that  no
              empty elements are returned. E.g., a string containing "a,," which is processed using the TOK mode
              returns a NORMAL element containing "a".

              TOKSEP:  the  split member acts like the standard C function strtok(3), also returning information
              about encountered separators. Since strtok  doesn’t  return  empty  elements,  TOKSEP  uses  empty
              elements  to  indicate  the  occurrence  of  separators.  E.g., a string containing "a,," which is
              processed using the TOKSEP mode returns a NORMAL element containing "a",  followed  by  two  empty
              SEPARATOR elements.

              STR:  the split member acts like the standard C function strstr(3). The essence here is that empty
              elements are also returned. E.g., a string containing "a,," which is processed using the STR  mode
              returns an element containing "a", followed by two empty NORMAL elements.

              STRSEP:  the  split member acts like the standard C function strstr(3), also returning information
              about encountered separators.  E.g., a string containing "a,," which is processed using the STRSEP
              mode returns a NORMAL element containing "a", followed by  a  SEPARATOR  element  containing  ",",
              followed  by  a  NORMAL empty element, followed by a SEPARATOR element containing ",", and finally
              followed by a NORMAL empty element.
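
       The four modes differ only in whether empty elements and separators are reported. The sketch below
       reproduces the "a,," cases described above using the standard library only. Kind, Mode, Pair and
       sketchSplit are hypothetical stand-ins, and quote and escape handling is deliberately left out, so
       this is not how Bobcat implements split.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for String::Type (two values only) and
// String::SplitType.
enum Kind { NORMAL, SEPARATOR };
enum Mode { TOK, TOKSEP, STR, STRSEP };
using Pair = std::pair<std::string, Kind>;

// Splits str at every character found in seps. No quote or escape
// handling: this only illustrates the four modes.
std::vector<Pair> sketchSplit(std::string const &str, Mode mode,
                              std::string const &seps)
{
    bool const keepEmpty = mode == STR || mode == STRSEP;
    std::vector<Pair> ret;
    std::string field;

    // store the field collected so far; empty fields only in STR modes
    auto flush = [&]()
                 {
                     if (keepEmpty || !field.empty())
                         ret.emplace_back(field, NORMAL);
                     field.clear();
                 };

    for (char ch : str)
    {
        if (seps.find(ch) == std::string::npos)
        {
            field += ch;                    // not a separator
            continue;
        }

        flush();
        if (mode == TOKSEP)                 // empty SEPARATOR element
            ret.emplace_back("", SEPARATOR);
        else if (mode == STRSEP)            // SEPARATOR holding the char
            ret.emplace_back(std::string(1, ch), SEPARATOR);
    }
    flush();
    return ret;
}
```

       With the input "a,," and "," as separator this sketch yields 1 element for TOK, 3 for TOKSEP and STR,
       and 5 for STRSEP, in line with the descriptions above.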

NESTED TYPE

       The struct SplitPair defines a std::pair<std::string, String::Type> and is used by some overloaded  split
       members (see below).
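
       To make the shape of SplitPair concrete, the sketch below models it as a plain alias and walks a
       vector of such pairs, optionally skipping SEPARATOR entries in the way the join member's all
       parameter is described. The enum order is copied from the type array in the EXAMPLE section;
       joinFields and the alias are hypothetical stand-ins, not Bobcat's definitions.

```cpp
#include <string>
#include <utility>
#include <vector>

// Stand-in for String::Type, in the order used by the EXAMPLE section.
enum Type
{
    DQUOTE_UNTERMINATED,
    SQUOTE_UNTERMINATED,
    ESCAPED_END,
    SEPARATOR,
    NORMAL,
    DQUOTE,
    SQUOTE,
};

// Stand-in for String::SplitPair: first is the text, second its Type.
using SplitPair = std::pair<std::string, Type>;

// Hypothetical helper: concatenates the first fields, separated by sep.
// With all == false, entries whose second field is SEPARATOR are skipped.
std::string joinFields(std::vector<SplitPair> const &entries, char sep,
                       bool all = true)
{
    std::string ret;
    bool first = true;

    for (auto const &entry : entries)
    {
        if (!all && entry.second == SEPARATOR)
            continue;

        if (!first)
            ret += sep;

        ret += entry.first;
        first = false;
    }
    return ret;
}
```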

STATIC MEMBER FUNCTIONS

       o      char const **argv(std::vector<std::string> const &words):
              Returns  a pointer to an allocated series of pointers to the C strings stored in the vector words.
              The caller is responsible for returning the array of pointers to the common pool, but  should  not
              delete  the  C-strings  to  which  the  pointers  point. The last element of the returned array is
              guaranteed to be a 0-pointer.

       o      int casecmp(std::string const &lhs, std::string const &rhs):
              Performs a case-insensitive comparison of the content of two std::string objects. A negative value
              is returned if lhs should be ordered before rhs; 0 is returned if the two strings  have  identical
              content; a positive value is returned if lhs should be ordered after rhs.

       o      std::string escape(std::string const &str, char const *series = "'\"\\"):
              Returns a copy of str in which all characters in series are prefixed by a backslash character.

       o      std::string join(std::vector<std::string> const &words, char sep):
              The elements of the words vector are returned as one string, separated from each other by the sep
              character.

       o      std::string join(std::vector<SplitPair> const &entries, char sep, bool all = true):
              The first fields of the elements in entries are returned as one string, separated from each  other
              by the sep character. If the parameter all is specified as false then elements whose second fields
              are equal to String::SEPARATOR are ignored.

       o      std::string lc(std::string const &str):
              Returns a copy of str in which all letters were transformed to lower case letters.

       o      std::vector<String::SplitPair> split(std::string const &str, SplitType mode,
              char const *sep = " \t"):
              The string str is split into substrings, separated by any of the characters in sep. The substrings
              are returned in a vector of SplitPair elements, using the specified SplitType mode (cf. the
              description of the various SplitType values and their effects in the ENUMERATIONS section).

       o      std::vector<String::SplitPair>  split(std::string const &str, char const *separators = " \t", bool
              addEmpty = false):
              This member acts like the previous one, using addEmpty == false to select mode TOK and addEmpty ==
              true to select mode TOKSEP.

       o      size_t split(std::vector<String::SplitPair> *entries, std::string const &str, SplitType mode, char
              const *sep = " \t"):
              Same functionality as the first split member, but this member stores the SplitPair elements in the
              vector pointed at by the entries parameter, first clearing the vector. This member returns the new
              value of entries->size().

       o      size_t split(std::vector<String::SplitPair> *entries, std::string const &str,
              char const *sep = " \t", bool addEmpty = false):
              This member acts like the previous one, using addEmpty == false to select mode TOK and addEmpty ==
              true to select mode TOKSEP.

       o      std::vector<std::string>  split(Type  *type,  std::string  const &str, SplitType stype, char const
              *sep = " \t"):
              Same functionality as the first split member, but this member merely stores the  first  fields  of
              the  SplitPair  elements in the returned vector. The String::Type variable whose address is passed
              to the type parameter is set to  NORMAL  if  the  final  entry  was  successfully  determined;  to
              DQUOTE_UNTERMINATED  if a final closing double quote could not be found; to SQUOTE_UNTERMINATED if
              a final closing single quote could not be found; and to ESCAPED_END if the final character in str
              is a backslash character.

       o      std::vector<std::string>  split(Type  *type, std::string const &str, char const *sep = " \t", bool
              addEmpty = false):
              This member acts like the previous one, using addEmpty == false to select mode TOK and addEmpty ==
              true to select mode TOKSEP.

       o      size_t split(std::vector<std::string> *words, std::string const &str, SplitType stype, char  const
              *sep = " \t"):
              Same  functionality  as  the first split member, but this member merely stores the first fields of
              the encountered SplitPair elements in the vector pointed at by words, first clearing  the  vector.
              This member returns the new value of words->size().

       o      size_t  split(std::vector<std::string>  *words,  std::string  const &str, char const *sep = " \t",
              bool addEmpty = false):
              This member acts like the previous one, using addEmpty == false to select mode TOK and addEmpty ==
              true to select mode TOKSEP.

       o      std::string to_string(Type value, unsigned base = 10):
              Returns value converted to a std::string using number system base. Type must be an unsigned
              integral type and 2 <= base <= 36. If the latter condition doesn’t hold true an empty string is
              returned. Leading zeroes are ignored, a single 0 is returned as a string containing a single
              ’0’-character. For base > 10 subsequent letters of the alphabet are used to represent the values
              10, 11, 12, ..., 35. The returned string doesn’t start with a prefix indicating the number system
              (e.g., there’s no 0x prefix when using base 16).

       o      std::string trim(std::string const &str):
              Returns a copy of str from which leading and trailing blank characters were removed.

       o      std::string uc(std::string const &str):
              Returns a copy of str in which all letters were capitalized.

       o      std::string unescape(std::string const &str):
              Returns  a  copy  of  str  in  which  the  escaped (i.e., prefixed by a backslash) characters were
              interpreted. All standard escape characters (\a, \b, \f, \n, \r, \t, \v)  are  recognized.  If  an
              escape character is followed by x at most the next two characters are interpreted as a hexadecimal
              number.  If  an  escape  character  is  followed  by  an  octal digit, then at most the next three
              characters following the backslash are interpreted as an octal number. In  all  other  cases,  the
              backslash is removed and the character following the backslash is kept.

       o      std::string urlDecode(std::string const &str):
              URL  specifications use %xx encoding to encode characters, except for alpha-numeric characters and
              the characters - _ . and ~, which are kept as-is. Other characters are encoded by a % character,
              followed  by  two  hexadecimal characters representing those characters’ byte value. E.g., a blank
              space is encoded as %20, a plus character is encoded  as  %2B.  The  member  urlDecode  returns  a
              std::string containing the decoded characters of the url-encoded string that is passed as argument
              to this member.

       o      std::string urlEncode(std::string const &str):
              See the member urlDecode: urlEncode returns a std::string containing the url-encoded characters of
              the characters in the string that is passed as argument to this member.
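
       The %xx scheme described for urlDecode and urlEncode can be sketched with the standard library as
       follows. percentEncode and percentDecode are hypothetical names; the use of upper case hexadecimal
       digits, the reliance on the default "C" locale for std::isalnum, and copying a malformed % sequence
       unchanged are assumptions, not statements about Bobcat's implementation.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: keeps alphanumerics and - _ . ~ as they are and
// writes every other byte as %XX (upper case hex: an assumption).
std::string percentEncode(std::string const &str)
{
    static char const hex[] = "0123456789ABCDEF";
    std::string ret;

    for (unsigned char ch : str)
    {
        if (std::isalnum(ch) || ch == '-' || ch == '_' || ch == '.'
            || ch == '~')
            ret += static_cast<char>(ch);
        else
        {
            ret += '%';
            ret += hex[ch >> 4];            // high nibble
            ret += hex[ch & 0x0F];          // low nibble
        }
    }
    return ret;
}

// Hypothetical helper: turns each %XX back into the byte it encodes.
// A % not followed by two hex digits is copied unchanged (an assumption).
std::string percentDecode(std::string const &str)
{
    auto isHex = [](char ch)
                 {
                     return std::isxdigit(static_cast<unsigned char>(ch)) != 0;
                 };

    std::string ret;
    for (std::size_t idx = 0; idx < str.size(); ++idx)
    {
        if (str[idx] == '%' && idx + 2 < str.size()
            && isHex(str[idx + 1]) && isHex(str[idx + 2]))
        {
            ret += static_cast<char>(
                        std::stoi(str.substr(idx + 1, 2), nullptr, 16));
            idx += 2;                       // skip the two hex digits
        }
        else
            ret += str[idx];
    }
    return ret;
}
```

       In this sketch a blank becomes %20 and a plus character becomes %2B, as in the description above,
       and decoding an encoded string gives back the original text.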

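       The conversion performed by to_string can be pictured as repeated division by the base. The sketch
       below is a standard-library illustration under stated assumptions: toBase is a hypothetical name,
       and the use of lower case letters for digit values above 9 is an assumption, since this page does
       not state which case Bobcat uses.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: converts an unsigned value to its representation
// in the given base (2..36). An empty string is returned for other bases.
template <typename Unsigned>
std::string toBase(Unsigned value, unsigned base = 10)
{
    static_assert(std::is_unsigned<Unsigned>::value,
                  "toBase requires an unsigned integral type");

    if (base < 2 || base > 36)
        return std::string{};

    // lower case letters for the digit values 10..35 (an assumption)
    static char const digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";

    std::string ret;
    do
    {
        // the remainder is the next least significant digit
        ret.insert(ret.begin(), digits[value % base]);
        value /= base;
    }
    while (value != 0);                 // a 0 value still yields "0"

    return ret;
}
```

       No prefix such as 0x is produced, and the do-while loop ensures that the value 0 is returned as a
       single 0 character.
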
EXAMPLE

       #include <iostream>
       #include <vector>
       #include <bobcat/string>

       using namespace std;
       using namespace FBB;

       static char const *type[] =
       {
           "DQUOTE_UNTERMINATED",
           "SQUOTE_UNTERMINATED",
           "ESCAPED_END",
           "SEPARATOR",
           "NORMAL",
           "DQUOTE",
           "SQUOTE",
       };

       int main(int argc, char **argv)
       {
           cout << "Program's name in uppercase: " << String::uc(argv[0]) << "\n\n";

           vector<String::SplitPair> splitpair;
           string text{ "one, two, 'thr\\x65\\145'" };
           string encoded{ String::urlEncode(text) };

                                       // split at ',' and ' ', reporting
                                       // separators and empty elements
           cout << "The string `" << text << "'\n"
                   "   as url-encoded string: `" << encoded << "'\n"
                   "   and the latter string url-decoded: " <<
                                           String::urlDecode(encoded) << "\n"
                   "\n"
                   "Splitting `" << text << "' into " <<
                           String::split(&splitpair, text, String::STRSEP, ", ") <<
                       " fields\n";

                                       // show each element's type and content
           for (auto it = splitpair.begin(); it != splitpair.end(); ++it)
               cout << (it - splitpair.begin() + 1) << ": " <<
                       type[it->second] << ": `" << it->first <<
                       "', unescaped: `" << String::unescape(it->first) <<
                       "'\n";

           cout << '\n' <<
               text << ":\n"
               "   upper case: " << String::uc(text) << ",\n"
               "   lower case: " << String::lc(text) << '\n';
       }

       /*
           Calling the program as
               driver
           results in the following output:
               Program's name in uppercase: DRIVER

               Splitting `one, two, 'thr\x65\145'' into 9 fields
               1: NORMAL: `one', unescaped: `one'
               2: SEPARATOR: `,', unescaped: `,'
               3: NORMAL: `', unescaped: `'
               4: SEPARATOR: ` ', unescaped: ` '
               5: NORMAL: `two', unescaped: `two'
               6: SEPARATOR: `,', unescaped: `,'
               7: NORMAL: `', unescaped: `'
               8: SEPARATOR: ` ', unescaped: ` '
               9: SQUOTE: `thr\x65\145', unescaped: `three'

               one, two, 'thr\x65\145':
                  upper case: ONE, TWO, 'THR\X65\145',
                  lower case: one, two, 'thr\x65\145'

       */

FILES

       bobcat/string - defines the class interface

BUGS

       None Reported.

BOBCAT PROJECT FILES

       o      https://fbb-git.gitlab.io/bobcat/: gitlab project page;

       Debian Bobcat project files:

       o      libbobcat6: debian package containing the shared library, changelog and copyright note;

       o      libbobcat-dev:  debian package containing the static library, headers, manual pages, and developer
              info;

BOBCAT

       Bobcat is an acronym of `Brokken’s Own Base Classes And Templates’.

COPYRIGHT

       This is free software, distributed under the terms of the GNU General Public License (GPL).

AUTHOR

       Frank B. Brokken (f.b.brokken@rug.nl).

libbobcat-dev_6.11.00                               2005-2025                               FBB::String(3bobcat)
