Ubuntu Manpage: FBB::String - Several operations on std::string objects

Provided by: libbobcat-dev_5.11.01-1_amd64

NAME

       FBB::String - Several operations on std::string objects

SYNOPSIS

       #include <bobcat/string>
       Linking option: -lbobcat

DESCRIPTION

       This  class offers facilities for often used transformations on std::string objects, which
       are not supported by the std::string class itself. All members of FBB::String are static.

       Initially this class was derived from std::string. Deriving from std::string, however,  is
       considered bad design as std::string was not designed as a base class.

       FBB::String offers a series of static member functions providing the facilities originally
       implemented as non-static members. One of these members is the (overloaded) split  member,
       splitting  a  string into elements separated by one or more configurable characters. These
       elements may contain or consist of double-  or  single-quoted  (sub)  strings  and  escape
       characters.  Escape  characters  are  converted  to their implied byte-values (e.g., \n is
       converted to byte value 10) unless they  are  embedded  in  single-quoted  (sub)  strings.
       Quotes  surrounding  double- and single-quoted (sub) strings are removed from the elements
       returned by the split members.
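
       The escape conversion mentioned above (e.g., \n becoming byte value 10) can be pictured
       with the following stand-alone sketch. It is not Bobcat's implementation: the names
       unescapeSketch and hexValue are hypothetical, and keeping a lone trailing backslash, or
       an x that is not followed by a hexadecimal digit, is an assumption of this sketch.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: numeric value of one hexadecimal digit.
int hexValue(char ch)
{
    if (ch >= '0' && ch <= '9')
        return ch - '0';
    return std::tolower(static_cast<unsigned char>(ch)) - 'a' + 10;
}

// Stand-alone sketch (not Bobcat's code): interprets backslash escapes in src.
std::string unescapeSketch(std::string const &src)
{
    std::string out;
    for (std::size_t idx = 0; idx < src.size(); ++idx)
    {
        if (src[idx] != '\\' || idx + 1 == src.size())
        {
            out += src[idx];        // plain character, or a lone trailing backslash (assumption)
            continue;
        }

        char next = src[++idx];     // the character following the backslash
        switch (next)               // the standard single-character escapes
        {
            case 'a': out += '\a'; continue;
            case 'b': out += '\b'; continue;
            case 'f': out += '\f'; continue;
            case 'n': out += '\n'; continue;
            case 'r': out += '\r'; continue;
            case 't': out += '\t'; continue;
            case 'v': out += '\v'; continue;
        }

        if (next == 'x')            // \x followed by at most two hexadecimal digits
        {
            int value = 0;
            int digits = 0;
            while (digits < 2 && idx + 1 < src.size() &&
                   std::isxdigit(static_cast<unsigned char>(src[idx + 1])))
            {
                value = value * 16 + hexValue(src[++idx]);
                ++digits;
            }
            out += digits == 0 ? 'x' : static_cast<char>(value);
            continue;
        }

        if (next >= '0' && next <= '7')     // at most three octal digits in total
        {
            int value = next - '0';
            for (int digits = 1; digits < 3 && idx + 1 < src.size() &&
                    src[idx + 1] >= '0' && src[idx + 1] <= '7'; ++digits)
                value = value * 8 + (src[++idx] - '0');
            out += static_cast<char>(value);
            continue;
        }

        out += next;                // any other character: only the backslash is dropped
    }
    return out;
}
```

       With this sketch, unescapeSketch("thr\\x65\\145") yields "three", since both \x65 and
       \145 denote the letter e.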

NAMESPACE

       FBB
       All constructors, members, operators and manipulators, mentioned  in  this  man-page,  are
       defined in the namespace FBB.

INHERITS FROM

--

ENUMERATIONS

       o      Type:
              This  enumeration  indicates  the  nature of the content of an element in the array
              returned by the overloaded split members (see below).

              DQUOTE, a subset of the characters in the matching string element was delimited  by
              double quotes in the string that was parsed by the split members.

              DQUOTE_UNTERMINATED, the content of the string that was parsed by the split members
              started at some point with a double quote, but the matching ending double quote was
              lacking.

              ESCAPED_END, the content of the string that was parsed by the split members ended
              in a mere backslash.

              NORMAL, a normal string;

              SEPARATOR, a separator;

              SQUOTE, a subset of the characters in the matching string element was delimited  by
              single quotes in the string that was parsed by the split members.

              SQUOTE_UNTERMINATED, the content of the string that was parsed by the split members
              started at some point with a single quote, but the matching ending single quote was
              lacking.

       o      SplitType:
              This enumeration is used to specify how split members should split the  information
              in the string objects that are passed to these members:

              TOK: the split member acts like the standard C function strtok(3). The essence here
              is that no empty elements are returned. E.g., a string containing  "a,,"  which  is
              processed using the TOK mode returns a NORMAL element containing "a".

              TOKSEP:  the  split  member  acts  like  the  standard  C  function strtok(3), also
              returning information about encountered separators.  Since  strtok  doesn’t  return
              empty   elements,  TOKSEP  uses  empty  elements  to  indicate  the  occurrence  of
              separators. E.g., a string containing "a,," which is  processed  using  the  TOKSEP
              mode  returns  a  NORMAL  element  containing  "a", followed by two empty SEPARATOR
              elements.

              STR: the split member acts like the standard C function strstr(3). The essence here
              is  that empty elements are also returned. E.g., a string containing "a,," which is
              processed using the STR mode returns an element containing  "a",  followed  by  two
              empty NORMAL elements.

              STRSEP:  the  split  member  acts  like  the  standard  C  function strstr(3), also
              returning information about encountered  separators.   E.g.,  a  string  containing
              "a,,"  which is processed using the STRSEP mode returns a NORMAL element containing
              "a", followed by a SEPARATOR element containing ",", followed  by  a  NORMAL  empty
              element,  followed by a SEPARATOR element containing ",", and finally followed by a
              NORMAL empty element.
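
       The difference between the TOK and STR modes described above can be illustrated with a
       stand-alone sketch. It does not use or reproduce Bobcat's code: splitSketch and its
       keepEmpty parameter are hypothetical names, and quotes, escape characters and SEPARATOR
       elements are deliberately left out.

```cpp
#include <string>
#include <vector>

// Stand-alone sketch (not Bobcat's code): splits str at any character in sep.
// keepEmpty == false mimics the strtok(3)-like behavior described for TOK,
// keepEmpty == true mimics the behavior described for STR.
std::vector<std::string> splitSketch(std::string const &str, bool keepEmpty,
                                     std::string const &sep = " \t")
{
    std::vector<std::string> fields;
    std::size_t begin = 0;

    while (true)
    {
        // Position of the next separator, or npos if there is none left.
        std::size_t end = str.find_first_of(sep, begin);

        // substr clamps the count, so end == npos selects the rest of str.
        std::string field = str.substr(begin, end - begin);

        if (keepEmpty || !field.empty())
            fields.push_back(field);

        if (end == std::string::npos)
            break;

        begin = end + 1;            // continue just beyond the separator
    }
    return fields;
}
```

       For the string "a,," and separator "," this sketch returns one element ("a") when
       keepEmpty is false, and three elements ("a" followed by two empty strings) when
       keepEmpty is true, matching the TOK and STR descriptions above.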

TYPEDEF

       The typedef SplitPair represents std::pair<std::string, String::Type> and is used by  some
       overloaded split members (see below).
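
       As a rough picture of how such pairs can be consumed, the sketch below uses local
       stand-ins rather than the Bobcat declarations: Kind, Pair and joinSketch are
       hypothetical names, and Kind mirrors only two of the Type values. It joins the first
       fields of the pairs and can skip the entries tagged as separators.

```cpp
#include <string>
#include <utility>
#include <vector>

// Local stand-ins, NOT the Bobcat declarations: only two Type values are mirrored.
enum class Kind { SEPARATOR, NORMAL };
using Pair = std::pair<std::string, Kind>;

// Joins the first fields using sep; when all is false, SEPARATOR entries are skipped.
std::string joinSketch(std::vector<Pair> const &entries, char sep, bool all = true)
{
    std::string out;
    bool first = true;

    for (auto const &entry : entries)
    {
        if (!all && entry.second == Kind::SEPARATOR)
            continue;               // ignore separator entries on request

        if (!first)
            out += sep;             // sep goes between fields, not before the first

        out += entry.first;
        first = false;
    }
    return out;
}
```

       Given the pairs ("a", NORMAL), (",", SEPARATOR) and ("b", NORMAL), joinSketch(pairs,
       '-', false) produces "a-b", whereas joinSketch(pairs, '-') produces "a-,-b".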

STATIC MEMBER FUNCTIONS

       o      char const **argv(std::vector<std::string> const &words):
              Returns a pointer to an allocated series of pointers to the C strings stored in the
              vector words. The caller is responsible for returning the array of pointers to  the
              common  pool,  but should not delete the C-strings to which the pointers point. The
              last element of the returned array is guaranteed to be a 0-pointer.

       o      int casecmp(std::string const &lhs, std::string const &rhs):
              Performs a case-insensitive comparison of the content of two std::string objects. A
              negative  value  is  returned if lhs should be ordered before rhs; 0 is returned if
              the two strings have identical content when case is ignored; a positive  value  is
              returned if lhs should be ordered after rhs.

       o      std::string escape(std::string const &str, char const *series = "'\"\\"):
              Returns a copy of str in which all characters in series are prefixed by a backslash
              character.

       o      std::string join(std::vector<std::string> const &words, char sep):
              The elements of the words vector are returned as one string,  separated  from  each
              other by the sep character.

       o      std::string join(std::vector<SplitPair> const &entries, char sep, bool all = true):
              The  first  fields of the elements in entries are returned as one string, separated
              from each other by the sep character. If the parameter all is  specified  as  false
              then elements whose second fields are equal to String::SEPARATOR are ignored.

       o      std::string lc(std::string const &str):
              Returns a copy of str in which all letters were transformed to lower case letters.

       o      std::vector<String::SplitPair>  split(std::string  const &str, SplitType mode, char
              const *sep = " \t"):
              The string str is split into substrings, separated by any of the characters in sep.
              The  substrings are returned in a vector of SplitPair elements, using the specified
              SplitType mode (cf. the description of  the  various  SplitType  values  and  their
              effects in the ENUMERATIONS section).

       o      std::vector<String::SplitPair> split(std::string const &str, char const *separators
              = " \t", bool addEmpty = false):
              This member acts like the previous one, using addEmpty == false to select mode  TOK
              and addEmpty == true to select mode TOKSEP.

       o      size_t   split(std::vector<String::SplitPair>  *entries,  std::string  const  &str,
              SplitType mode, char const *sep = " \t"):
              Same functionality as the first split member, but this member stores the  SplitPair
              elements  in  the  vector  pointed  at by the entries parameter, first clearing the
              vector. This member returns the new value of entries->size().

       o      size_t split(std::vector<String::SplitPair> *entries, std::string const &str,  char
              const *sep = " \t", bool addEmpty = false):
              This  member acts like the previous one, using addEmpty == false to select mode TOK
              and addEmpty == true to select mode TOKSEP.

       o      std::vector<std::string> split(Type *type, std::string const &str, SplitType stype,
              char const *sep = " \t"):
              Same  functionality  as  the  first split member, but this member merely stores the
              first fields of the SplitPair elements in the  returned  vector.  The  String::Type
              variable  whose  address  is  passed  to the type parameter is set to NORMAL if the
              final entry was successfully determined; to DQUOTE_UNTERMINATED if a final  closing
              double  quote  could not be found; to SQUOTE_UNTERMINATED if a final closing single
              quote could not be found; and to ESCAPED_END if the final character  in  str  is  a
              backslash character.

       o      std::vector<std::string>  split(Type *type, std::string const &str, char const *sep
              = " \t", bool addEmpty = false):
              This member acts like the previous one, using addEmpty == false to select mode  TOK
              and addEmpty == true to select mode TOKSEP.

       o      size_t  split(std::vector<std::string>  *words,  std::string  const &str, SplitType
              stype, char const *sep = " \t"):
              Same functionality as the first split member, but this  member  merely  stores  the
              first  fields  of  the  encountered  SplitPair elements in the vector pointed at by
              words,  first  clearing  the  vector.  This  member  returns  the  new   value   of
              words->size().

       o      size_t  split(std::vector<std::string>  *words,  std::string const &str, char const
              *sep = " \t", bool addEmpty = false):
              This member acts like the previous one, using addEmpty == false to select mode  TOK
              and addEmpty == true to select mode TOKSEP.

       o      std::string trim(std::string const &str):
              Returns  a  copy  of  str  from  which  leading  and trailing blank characters were
              removed.

       o      std::string uc(std::string const &str):
              Returns a copy of str in which all letters were capitalized.

       o      std::string unescape(std::string const &str):
              Returns a copy of str  in  which  the  escaped  (i.e.,  prefixed  by  a  backslash)
              characters  were  interpreted.  All standard escape characters (\a, \b, \f, \n, \r,
              \t, \v) are recognized. If an escape character is followed by x at  most  the  next
              two  characters  are interpreted as a hexadecimal number. If an escape character is
              followed by an octal digit, then at most the next three  characters  following  the
              backslash  are interpreted as an octal number. In all other cases, the backslash is
              removed and the character following the backslash is kept.

       o      std::string urlDecode(std::string const &str):
              URL specifications use %xx encoding to encode characters, except for  alpha-numeric
              characters  and  the characters - _ . and ~, which are kept as-is. Other characters
              are encoded by a % character, followed by two hexadecimal  characters  representing
              those  characters’  byte  value.  E.g.,  a  blank  space  is encoded as %20, a plus
              character is encoded as %2B. The member urlDecode returns a std::string  containing
              the decoded characters of the url-encoded string that is passed as argument to this
              member.

       o      std::string urlEncode(std::string const &str):
              See  the  member  urlDecode:  urlEncode  returns  a  std::string   containing   the
              url-encoded  characters  of the characters in the string that is passed as argument
              to this member.
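
       The %xx scheme described for urlDecode and urlEncode can be sketched as follows. This
       is a stand-alone illustration, not Bobcat's implementation: urlEncodeSketch and
       urlDecodeSketch are hypothetical names, the use of uppercase hexadecimal digits follows
       the %2B example above, and copying a malformed % sequence unchanged is an assumption of
       this sketch.

```cpp
#include <cctype>
#include <string>

// Stand-alone sketch (not Bobcat's code): keeps alpha-numeric characters and
// - _ . ~ as-is, and writes every other byte as % plus two hexadecimal digits.
std::string urlEncodeSketch(std::string const &str)
{
    static char const hex[] = "0123456789ABCDEF";
    std::string out;

    for (unsigned char ch : str)
    {
        if (std::isalnum(ch) || ch == '-' || ch == '_' || ch == '.' || ch == '~')
            out += static_cast<char>(ch);
        else
        {
            out += '%';
            out += hex[ch >> 4];        // high nibble
            out += hex[ch & 0x0F];      // low nibble
        }
    }
    return out;
}

// Reverses urlEncodeSketch: each %xx sequence becomes the byte with value xx.
std::string urlDecodeSketch(std::string const &str)
{
    std::string out;

    for (std::size_t idx = 0; idx < str.size(); ++idx)
    {
        if (str[idx] == '%' && idx + 2 < str.size() &&
            std::isxdigit(static_cast<unsigned char>(str[idx + 1])) &&
            std::isxdigit(static_cast<unsigned char>(str[idx + 2])))
        {
            out += static_cast<char>(std::stoi(str.substr(idx + 1, 2), nullptr, 16));
            idx += 2;                   // skip the two hexadecimal digits
        }
        else
            out += str[idx];            // ordinary character or malformed sequence
    }
    return out;
}
```

       In the default "C" locale, urlEncodeSketch("a b+c") returns "a%20b%2Bc", and
       urlDecodeSketch turns that string back into "a b+c".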

EXAMPLE

       #include <iostream>
       #include <vector>
       #include <bobcat/string>

       using namespace std;
       using namespace FBB;

       static char const *type[] =
       {
           "DQUOTE_UNTERMINATED",
           "SQUOTE_UNTERMINATED",
           "ESCAPED_END",
           "SEPARATOR",
           "NORMAL",
           "DQUOTE",
           "SQUOTE",
       };

       int main(int argc, char **argv)
       {
           cout << "Program's name in uppercase: " << String::uc(argv[0]) << "\n\n";

           vector<String::SplitPair> splitpair;
           // the single-quoted part contains a hexadecimal and an octal escape
           string text{ "one, two, 'thr\\x65\\145'" };
           string encoded{ String::urlEncode(text) };

           cout << "The string `" << text << "'\n"
                   "   as url-encoded string: `" << encoded << "'\n"
                   "   and the latter string url-decoded: " <<
                                           String::urlDecode(encoded) << "\n"
                   "\n"
                   "Splitting `" << text << "' into " <<
                           String::split(&splitpair, text, String::STRSEP, ", ") <<
                       " fields\n";

           // show each element's type, its content and its unescaped content
           for (auto it = splitpair.begin(); it != splitpair.end(); ++it)
               cout << (it - splitpair.begin() + 1) << ": " <<
                       type[it->second] << ": `" << it->first <<
                       "', unescaped: `" << String::unescape(it->first) <<
                       "'\n";

           cout << '\n' <<
               text << ":\n"
               "   upper case: " << String::uc(text) << ",\n"
               "   lower case: " << String::lc(text) << '\n';
       }

       /*
           Calling the program as
               driver
           results in the following output:
               Program's name in uppercase: DRIVER

               Splitting `one, two, 'thr\x65\145'' into 9 fields
               1: NORMAL: `one', unescaped: `one'
               2: SEPARATOR: `,', unescaped: `,'
               3: NORMAL: `', unescaped: `'
               4: SEPARATOR: ` ', unescaped: ` '
               5: NORMAL: `two', unescaped: `two'
               6: SEPARATOR: `,', unescaped: `,'
               7: NORMAL: `', unescaped: `'
               8: SEPARATOR: ` ', unescaped: ` '
               9: SQUOTE: `thr\x65\145', unescaped: `three'

               one, two, 'thr\x65\145':
                  upper case: ONE, TWO, 'THR\X65\145',
                  lower case: one, two, 'thr\x65\145'

       */

FILES

       bobcat/string - defines the class interface

BUGS

       None Reported.

BOBCAT PROJECT FILES

       o      https://fbb-git.gitlab.io/bobcat/: gitlab project page;

       o      bobcat_5.11.01-x.dsc: detached signature;

       o      bobcat_5.11.01-x.tar.gz: source archive;

       o      bobcat_5.11.01-x_i386.changes: change log;

       o      libbobcat1_5.11.01-x_*.deb: debian package containing the libraries;

       o      libbobcat1-dev_5.11.01-x_*.deb: debian package containing  the  libraries,  headers
              and manual pages;

BOBCAT

       Bobcat is an acronym of `Brokken’s Own Base Classes And Templates’.

COPYRIGHT

       This  is  free  software,  distributed  under  the terms of the GNU General Public License
       (GPL).

AUTHOR

       Frank B. Brokken (f.b.brokken@rug.nl).
