Utf8String Struct Reference

Contains a UTF-8 encoded string.

#include <WString.h>

Inherited by Utf8PrintfString.

Public Member Functions

 Utf8String ()
 
 Utf8String (Utf8CP str)
 
 Utf8String (bastring const &other)
 
 Utf8String (bastring const &__str, size_t __pos, size_t __n=npos)
 
 Utf8String (CharCP __s, size_t __n)
 
 Utf8String (size_t __n, char __c)
 
 Utf8String (iterator __beg, iterator __end)
 
 Utf8String (const_iterator __beg, const_iterator __end)
 
 Utf8String (reverse_iterator __beg, reverse_iterator __end)
 
 Utf8String (const_reverse_iterator __beg, const_reverse_iterator __end)
 
 Utf8String (WCharCP str)
 Construct a Utf8String by converting from a wchar_t string.
 
 Utf8String (WString str)
 Construct a Utf8String by converting from a wchar_t string.
 
Utf8StringR Assign (WCharCP str)
 Replace the contents of this string by converting from a wchar_t string.
 
size_type SizeInBytes () const
 Computes the size, in bytes, of this string's data, including its NULL-terminator.
 
bool IsAscii ()
 Test if this string contains only characters less than or equal to 127.
 
void VSprintf (Utf8CP format, va_list argptr)
 Replace the contents of this string with a formatted result.
 
void Sprintf (Utf8CP format,...)
 Replace the contents of this string with a formatted result.
 
int CompareToI (Utf8CP other) const
 Perform a case-insensitive comparison.
 
int CompareToI (Utf8StringCR other) const
 Perform a case-insensitive comparison.
 
int CompareTo (Utf8CP other) const
 Perform a (case-sensitive) comparison.
 
int CompareTo (Utf8StringCR other) const
 Perform a (case-sensitive) comparison.
 
bool Equals (Utf8CP other) const
 Test for equality with another string.
 
bool Equals (Utf8StringCR other) const
 Test for equality with another string.
 
bool EqualsI (Utf8StringCR other) const
 Test for equality with another string, ignoring case.
 
bool EqualsI (Utf8CP other) const
 Test for equality with another string, ignoring case.
 
void Trim ()
 Removes all whitespace from the left and right sides. Whitespace includes space, line feed, carriage return, and tab (e.g. iswspace).
 
void Trim (Utf8CP trimCharacters)
 Removes all instances of any of the given characters from the left and right sides.
 
Utf8StringR TrimEnd ()
 Removes all whitespace from the end. Whitespace includes space, line feed, carriage return, and tab (e.g. iswspace).
 
bool StartsWith (Utf8CP) const
 Determines if this instance starts with the provided string.
 
Utf8StringR AssignOrClear (Utf8CP in)
 Update this string to be equal to in. If in is NULL, clear this string.
 
size_t ReplaceAll (Utf8CP subStringToReplace, Utf8CP replacement)
 Replace all instances of a substring. Returns the number of replacements made.
 
void ToLower ()
 Converts this string, in-place, to all lower case.
 
size_t GetNextToken (Utf8StringR next, CharCP delims, size_t offset) const
 Reads the next token delimited by any character in delims or \0.
 

Static Public Member Functions

static bool IsNullOrEmpty (Utf8CP value)
 Utility function to test if value represents the empty string. This function interprets NULL to be the empty string.
 
static bool IsAsciiWhiteSpace (char val)
 Determine whether the supplied character is a whitespace character in the ASCII (below 128) code page.
 
static char ToLowerChar (char c)
 Equivalent to tolower.
 

Detailed Description

Contains a UTF-8 encoded string.

This class has many of the capabilities of std::string, except that it is intended to hold only UTF-8 encoded strings. This class also defines utility functions for constructing and manipulating the string.

Constructor & Destructor Documentation

Utf8String ( Utf8CP  str)
Utf8String ( bastring const &  other)
Utf8String ( bastring const &  __str,
size_t  __pos,
size_t  __n = npos 
)
Utf8String ( CharCP  __s,
size_t  __n 
)
Utf8String ( size_t  __n,
char  __c 
)
Utf8String ( iterator  __beg,
iterator  __end 
)
Utf8String ( const_iterator  __beg,
const_iterator  __end 
)
Utf8String ( reverse_iterator  __beg,
reverse_iterator  __end 
)
Utf8String ( const_reverse_iterator  __beg,
const_reverse_iterator  __end 
)
Utf8String ( WCharCP  str)
explicit

Construct a Utf8String by converting from a wchar_t string.

Utf8String ( WString  str)
explicit

Construct a Utf8String by converting from a wchar_t string.

Member Function Documentation

Utf8StringR Assign ( WCharCP  str)

Replace the contents of this string by converting from a wchar_t string.

Utf8StringR AssignOrClear ( Utf8CP  in)

Update this string to be equal to in. If in is NULL, clear this string.

References clear(), and NULL.
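
The NULL handling described above can be sketched as follows. This is only an illustration written against std::string as a stand-in for Utf8String; AssignOrClearSketch and DemoAssignOrClear are hypothetical names, not part of this API.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for AssignOrClear: a NULL input clears the target,
// any other input is copied into it. Returns the target for chaining.
inline std::string& AssignOrClearSketch(std::string& target, const char* in)
{
    if (in == nullptr)
        target.clear();
    else
        target.assign(in);
    return target;
}

// Usage check; call it from your own main or test runner.
inline void DemoAssignOrClear()
{
    std::string s = "old";
    assert(AssignOrClearSketch(s, "new") == "new");
    assert(AssignOrClearSketch(s, nullptr).empty());
}
```

Returning a reference mirrors the Utf8StringR return type, so calls can be chained.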

int CompareTo ( Utf8CP  other) const

Perform a (case-sensitive) comparison.

Returns
0 if the strings are equal, -1 if this string should come before other, or 1 if it should come after.
Parameters
other: The other string.
int CompareTo ( Utf8StringCR  other) const

Perform a (case-sensitive) comparison.

Returns
0 if the strings are equal, -1 if this string should come before other, or 1 if it should come after.
Parameters
other: The other string.
int CompareToI ( Utf8CP  other) const

Perform a case-insensitive comparison.

Returns
0 if the strings are equal (ignoring case), -1 if this string should come before other, or 1 if it should come after.
Parameters
other: The other string.
int CompareToI ( Utf8StringCR  other) const

Perform a case-insensitive comparison.

Returns
0 if the strings are equal (ignoring case), -1 if this string should come before other, or 1 if it should come after.
Parameters
other: The other string.

References Utf8String::CompareToI().

Referenced by Utf8String::CompareToI().
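
To make the -1/0/1 return convention concrete, here is a minimal sketch of a case-insensitive comparison over plain C strings. It is not the library implementation: CompareToISketch and FoldAscii are hypothetical names, and the sketch assumes only the ASCII letters A-Z are folded, since this page does not say how non-ASCII characters are handled.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: fold ASCII upper case to lower case, leave other bytes alone.
inline unsigned char FoldAscii(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? static_cast<unsigned char>(c - 'A' + 'a') : c;
}

// Returns 0 if equal ignoring ASCII case, -1 if lhs sorts first, 1 if rhs sorts first.
inline int CompareToISketch(const char* lhs, const char* rhs)
{
    assert(lhs != nullptr && rhs != nullptr);
    for (std::size_t i = 0; ; ++i)
    {
        const unsigned char a = FoldAscii(static_cast<unsigned char>(lhs[i]));
        const unsigned char b = FoldAscii(static_cast<unsigned char>(rhs[i]));
        if (a != b)
            return a < b ? -1 : 1;   // first differing byte decides the order
        if (a == '\0')
            return 0;                // both strings ended together
    }
}
```

An EqualsI-style test then reduces to checking that the result is 0.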

bool Equals ( Utf8CP  other) const

Test for equality with another string.

Returns
true if the strings are equal.
Parameters
other: The other string.
bool Equals ( Utf8StringCR  other) const

Test for equality with another string.

Returns
true if the strings are equal.
Parameters
other: The other string.
bool EqualsI ( Utf8StringCR  other) const

Test for equality with another string, ignoring case.

Returns
true if the strings are equal (ignoring case).
Parameters
other: The other string.
bool EqualsI ( Utf8CP  other) const

Test for equality with another string, ignoring case.

Returns
true if the strings are equal (ignoring case).
Parameters
other: The other string.
size_t GetNextToken ( Utf8StringR  next,
CharCP  delims,
size_t  offset 
) const

Reads the next token delimited by any character in delims or \0.

Parameters
[out] next: Set to the next token if found, or cleared if not.
[in] delims: The characters that can delimit the tokens.
[in] offset: Where to start the search.
Returns
1 beyond the end of the current token, or npos if no token was found.
Example
// Read lines from a string, where each line is delimited by \n. The last line need not have a trailing \n.
// m_virtuals is the Utf8String being tokenized; m receives each token in turn.
Utf8String m;
size_t offset = 0;
while ((offset = m_virtuals.GetNextToken (m, "\n", offset)) != Utf8String::npos)
    {
    printf ("%s\n", m.c_str());
    }
Note
If this string ends with a delimiter, then the last token returned is the one before the trailing delimiter. If this string does not end with a delimiter, then the last token is everything after the last delimiter (if any), up to the end of this string. If this string has no delimiters at all, then the first and only token returned is the string itself. For example, if this string were "abc\n", then GetNextToken (next, "\n", 0) would set next to "abc" and return 4, and GetNextToken (next, "\n", 4) would return npos. If this string were "abc", the two calls would return the same results.
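
The behavior in the note can be reproduced with a small sketch over std::string. GetNextTokenSketch is a hypothetical stand-in, not the library implementation; it was written only to match the "abc\n" and "abc" cases above, and its treatment of empty tokens between adjacent delimiters is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for GetNextToken, written against std::string.
inline std::size_t GetNextTokenSketch(const std::string& source, std::string& next,
                                      const char* delims, std::size_t offset)
{
    assert(delims != nullptr);
    next.clear();
    if (offset >= source.size())
        return std::string::npos;          // nothing left to read

    std::size_t end = source.find_first_of(delims, offset);
    if (end == std::string::npos)
        end = source.size();               // last token runs to the end of the string

    next.assign(source, offset, end - offset);
    return end + 1;                        // one past the delimiter (or the terminator)
}
```

Because the return value always points one past the delimiter or terminator, feeding it back in as offset walks the string token by token until npos is returned.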
bool IsAscii ( )

Test if this string contains only characters less than or equal to 127.
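
As an illustration of the test described above, the following sketch scans the bytes of a std::string standing in for Utf8String. IsAsciiSketch and DemoIsAscii are hypothetical names, and the byte-by-byte scan is an assumption about how such a check could be written.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in: true if every byte is in the range 0..127.
inline bool IsAsciiSketch(const std::string& utf8)
{
    for (unsigned char byte : utf8)
    {
        if (byte > 127)
            return false;   // part of a multi-byte UTF-8 sequence
    }
    return true;
}

// Usage check; call it from your own main or test runner.
inline void DemoIsAscii()
{
    assert(IsAsciiSketch("plain text"));
    assert(!IsAsciiSketch("\xC3\xA9"));   // U+00E9 encoded as two bytes
}
```

In UTF-8 every byte of a multi-byte sequence has its high bit set, so a single pass over the bytes is enough.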

static bool IsAsciiWhiteSpace ( char  val)
static

Determine whether the supplied character is a whitespace character in the ASCII (below 128) code page.

This is necessary because the C isspace function is locale-specific and sometimes returns true for the non-breaking space character (0xA0), which is not a valid test for a UTF-8 string. Note that this function does not test for VT or FF, as they are considered obsolete.
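
A locale-independent check of this kind might look like the sketch below. IsAsciiWhiteSpaceSketch is a hypothetical name, and the exact character set (space, tab, line feed, carriage return) is an assumption drawn from the Trim description on this page.

```cpp
#include <cassert>

// Hypothetical stand-in: fixed whitespace set, independent of the C locale.
inline bool IsAsciiWhiteSpaceSketch(char val)
{
    return val == ' ' || val == '\t' || val == '\n' || val == '\r';
}

// Usage check; call it from your own main or test runner.
inline void DemoIsAsciiWhiteSpace()
{
    assert(IsAsciiWhiteSpaceSketch('\t'));
    assert(!IsAsciiWhiteSpaceSketch('\v'));                      // VT is excluded
    assert(!IsAsciiWhiteSpaceSketch(static_cast<char>(0xA0)));   // non-breaking space byte
}
```

Comparing against literal characters avoids any dependence on the current locale, which is the problem with isspace described above.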

static bool IsNullOrEmpty ( Utf8CP  value)
static

Utility function to test if value represents the empty string. This function interprets NULL to be the empty string.

References NULL.

size_t ReplaceAll ( Utf8CP  subStringToReplace,
Utf8CP  replacement 
)

Replace all instances of a substring. Returns the number of replacements made.
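
The replace-and-count behavior can be sketched over std::string as follows. ReplaceAllSketch is a hypothetical stand-in; treating an empty search string as "no matches" and not rescanning inserted text are assumptions, since this page does not specify either.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for ReplaceAll: returns the number of replacements made.
inline std::size_t ReplaceAllSketch(std::string& text, const char* from, const char* to)
{
    assert(from != nullptr && to != nullptr);
    const std::size_t fromLen = std::strlen(from);
    const std::size_t toLen = std::strlen(to);
    if (fromLen == 0)
        return 0;           // assumption: an empty search string matches nothing

    std::size_t count = 0;
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos)
    {
        text.replace(pos, fromLen, to);
        pos += toLen;       // continue after the inserted text so it is never rescanned
        ++count;
    }
    return count;
}
```

Advancing past the replacement keeps the loop finite even when the replacement contains the search string.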

size_type SizeInBytes ( ) const

Computes the size, in bytes, of this string's data, including its NULL-terminator.

References size().
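
A rough illustration of the count described above, using std::string as a stand-in for Utf8String. SizeInBytesSketch is a hypothetical name, and the sketch assumes the result is simply size() plus one byte for the terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in, not the library implementation.
inline std::size_t SizeInBytesSketch(const std::string& utf8)
{
    return utf8.size() + 1;   // stored bytes plus the terminator
}

// Usage check; call it from your own main or test runner.
inline void DemoSizeInBytes()
{
    assert(SizeInBytesSketch("abc") == 4);
    assert(SizeInBytesSketch("\xC3\xA9") == 3);   // U+00E9 is two bytes in UTF-8
}
```

The second check shows that the count is in bytes rather than characters: one accented character occupies two bytes before the terminator is added.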

void Sprintf ( Utf8CP  format,
  ... 
)

Replace the contents of this string with a formatted result.

Parameters
format: The sprintf-like format string.
bool StartsWith ( Utf8CP  ) const

Determines if this instance starts with the provided string.
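
A prefix test of this kind could be written as in the sketch below, using std::string as a stand-in. StartsWithSketch is a hypothetical name; the sketch assumes a case-sensitive, byte-wise comparison and a non-NULL argument, neither of which is stated on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for StartsWith: byte-wise, case-sensitive prefix test.
inline bool StartsWithSketch(const std::string& text, const char* prefix)
{
    assert(prefix != nullptr);   // assumption: NULL handling is not documented
    const std::size_t prefixLen = std::strlen(prefix);
    return text.size() >= prefixLen && text.compare(0, prefixLen, prefix) == 0;
}
```

With this definition an empty prefix matches every string, which is the usual convention but is again an assumption here.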

void ToLower ( )

Converts this string, in-place, to all lower case.

Remarks
This function can be very slow if the string contains non-ASCII characters.

References begin(), clear(), end(), and WString::ToLower().

static char ToLowerChar ( char  c)
static

Equivalent to tolower.

void Trim ( )

Removes all whitespace from the left and right sides. Whitespace includes space, line feed, carriage return, and tab (e.g. iswspace).

void Trim ( Utf8CP  trimCharacters)

Removes all instances of any of the given characters from the left and right sides.
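
Both Trim overloads can be illustrated with one sketch over std::string. TrimSketch is a hypothetical stand-in; its default character set (space, tab, line feed, carriage return) is an assumption based on the whitespace list given above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for Trim: strips any of trimCharacters from both ends.
inline void TrimSketch(std::string& text, const char* trimCharacters = " \t\n\r")
{
    assert(trimCharacters != nullptr);
    const std::size_t first = text.find_first_not_of(trimCharacters);
    if (first == std::string::npos)
    {
        text.clear();   // every character was in the trim set
        return;
    }
    const std::size_t last = text.find_last_not_of(trimCharacters);
    text = text.substr(first, last - first + 1);
}
```

Characters from the trim set that appear in the interior of the string are left untouched; only the two ends are affected.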

Utf8StringR TrimEnd ( )

Removes all whitespace from the end. Whitespace includes space, line feed, carriage return, and tab (e.g. iswspace).

void VSprintf ( Utf8CP  format,
va_list  argptr 
)

Replace the contents of this string with a formatted result.

Parameters
format: The sprintf-like format string.
argptr: Arguments used by format.
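
The relationship between Sprintf and VSprintf can be sketched with the standard vsnprintf function. This is one common way to format into a growable string, not necessarily what the library does; VSprintfSketch and SprintfSketch are hypothetical names operating on std::string.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for VSprintf: replaces the contents of out.
inline void VSprintfSketch(std::string& out, const char* format, va_list argptr)
{
    assert(format != nullptr);
    va_list copy;
    va_copy(copy, argptr);                                  // measuring pass uses a copy
    const int length = std::vsnprintf(nullptr, 0, format, copy);
    va_end(copy);

    out.clear();
    if (length <= 0)
        return;                                             // encoding error or empty result

    out.resize(static_cast<std::size_t>(length) + 1);       // room for the terminator
    std::vsnprintf(&out[0], out.size(), format, argptr);
    out.resize(static_cast<std::size_t>(length));           // drop the terminator
}

// Hypothetical stand-in for Sprintf: forwards its variadic arguments.
inline void SprintfSketch(std::string& out, const char* format, ...)
{
    va_list args;
    va_start(args, format);
    VSprintfSketch(out, format, args);
    va_end(args);
}
```

The first vsnprintf call only measures the output, so the buffer can be sized exactly before the second call writes into it.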


Copyright © 2017 Bentley Systems, Incorporated. All rights reserved.