This chapter examines the Standard C++ string class, beginning with a look at what constitutes a C++ string and how the C++ version differs from a traditional C character array. You’ll learn about operations and manipulations using string objects, and you’ll see how C++ strings accommodate variation in character sets and string data conversion.[28]
Handling text is perhaps one of the oldest of all programming applications, so it’s not surprising that the C++ string draws heavily on the ideas and terminology that have long been used for this purpose in C and other languages. As you begin to acquaint yourself with C++ strings, this fact should be reassuring. No matter which programming idiom you choose, there are really only about three things you want to do with a string:
· Create or modify the sequence of characters stored in the string.
· Detect the presence or absence of elements within the string.
· Translate between various schemes for representing string characters.
You’ll see how each of these jobs is accomplished using C++ string objects.
In C, a string is simply an array of characters that always includes a binary zero (often called the null terminator) as its final array element.
The exact implementation of memory layout for the string class is not defined by the C++ Standard. This architecture is intended to be flexible enough to allow differing implementations by compiler vendors, yet guarantee predictable behavior for users. In particular, the exact conditions under which storage is allocated to hold data for a string object are not defined. String allocation rules were formulated to allow but not require a reference-counted implementation, but whether or not the implementation uses reference counting, the semantics must be the same. To put this a bit differently, in C, every char array occupies a unique physical region of memory. In C++, individual string objects may or may not occupy unique physical regions of memory, but if reference counting is used to avoid storing duplicate copies of data, the individual objects must look and act as though they exclusively own unique regions of storage. For example:
//: C03:StringStorage.cpp
//{L} ../TestSuite/Test
#include <iostream>
#include <string>
#include "../TestSuite/Test.h"
using namespace std;
class StringStorageTest : public TestSuite::Test {
public:
  void run() {
    string s1("12345");
    // This may copy the first to the second or
    // use reference counting to simulate a copy
    string s2 = s1;
    test_(s1 == s2);
    // Either way, this statement must ONLY modify s1
    s1[0] = '6';
    cout << "s1 = " << s1 << endl;
    cout << "s2 = " << s2 << endl;
    test_(s1 != s2);
  }
};
int main() {
  StringStorageTest t;
  t.run();
  return t.report();
} ///:~
An implementation that only makes unique copies when a string is modified is said to use a copy-on-write strategy.
Whether a library implementation uses reference counting or not should be transparent to users of the string class. Unfortunately, this is not always the case. In multithreaded programs, it is practically impossible to use a reference-counting implementation safely.[29]
Creating and initializing strings is straightforward and fairly flexible. In the SmallString.cpp example in this section, the first string, imBlank, is declared but given no initial value. Unlike a local C char array, which holds an indeterminate and meaningless bit pattern until it is initialized, imBlank does contain meaningful information. This string object is initialized to hold "no characters" and can properly report its zero length and absence of data elements through the use of class member functions.