B. Stroustrup - The C++ Programming Language (794319), page 51 (Chapter 8: Structures, Unions, and Enumerations)
Formally (§iso.3.9, §iso.9), a POD object must be of
• a standard layout type,
• a trivially copyable type, and
• a type with a trivial default constructor.

A related concept is a trivial type, which is a type with
• a trivial default constructor and
• trivial copy and move operations.

Informally, a default constructor is trivial if it does not need to do any work (use =default if you need to define one §17.6.1).

A type has standard layout unless it
• has a non-static member or a base that is not standard layout,
• has a virtual function (§3.2.3, §20.3.2),
• has a virtual base (§21.3.5),
• has a member that is a reference (§7.7),
• has multiple access specifiers for non-static data members (§20.5), or
• prevents important layout optimizations
    • by having non-static data members in more than one base class or in both the derived class and a base, or
    • by having a base class of the same type as the first non-static data member.

Basically, a standard layout type is one that has a layout with an obvious equivalent in C and is in the union of what common C++ Application Binary Interfaces (ABIs) can handle.

A type is trivially copyable unless it has a nontrivial copy operation, move operation, or destructor (§3.2.1.2, §17.6).
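The layout and copyability rules listed above can be probed at compile time with the predicates in <type_traits>. The following is only a sketch: the four struct names are made up for illustration and do not come from the text, and each one is meant to trip (or not trip) exactly one of the rules.

```cpp
#include <type_traits>

struct Plain {              // hypothetical C-like struct: nothing special
    int x;
    double y;
};

struct WithVirtual {        // hypothetical: has a virtual function
    virtual void f() {}
    int x;
};

struct MixedAccess {        // hypothetical: data members with different access
public:
    int a;
private:
    int b;
};

struct UserCopy {           // hypothetical: user-defined copy constructor
    UserCopy(const UserCopy& other) : x(other.x) {}
    int x;
};

// Plain satisfies all three requirements.
static_assert(std::is_standard_layout<Plain>::value, "Plain has standard layout");
static_assert(std::is_trivially_copyable<Plain>::value, "Plain is trivially copyable");
static_assert(std::is_trivial<Plain>::value, "Plain is trivial");

// A virtual function breaks both standard layout and trivial copying.
static_assert(!std::is_standard_layout<WithVirtual>::value, "virtual function");
static_assert(!std::is_trivially_copyable<WithVirtual>::value, "virtual function");

// Mixed access specifiers break standard layout only.
static_assert(!std::is_standard_layout<MixedAccess>::value, "mixed access");
static_assert(std::is_trivially_copyable<MixedAccess>::value, "still bitwise copyable");

// A user-defined copy breaks trivial copying only.
static_assert(std::is_standard_layout<UserCopy>::value, "layout unaffected");
static_assert(!std::is_trivially_copyable<UserCopy>::value, "user-defined copy");
```

If one of these assertions were wrong, the translation unit would simply fail to compile, which makes this a cheap way to test one's reading of the rules.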
Informally, a copy operation is trivial if it can be implemented as a bitwise copy. So, what makes a copy, move, or destructor nontrivial?
• It is user-defined.
• Its class has a virtual function.
• Its class has a virtual base.
• Its class has a base or a member that is not trivial.

An object of built-in type is trivially copyable, and has standard layout. Also, an array of trivially copyable objects is trivially copyable and an array of standard layout objects has standard layout. Consider an example:

    template<typename T>
    void mycopy(T* to, const T* from, int count);

I’d like to optimize the simple case where T is a POD. I could do that by only calling mycopy() for PODs, but that’s error-prone: if I use mycopy() can I rely on a maintainer of the code to remember never to call mycopy() for non-PODs? Realistically, I cannot.
Alternatively, I could call std::copy(), which is most likely implemented with the necessary optimization. Anyway, here is the general and optimized code:

    template<typename T>
    void mycopy(T* to, const T* from, int count)
    {
        if (is_pod<T>::value)
            memcpy(to,from,count*sizeof(T));    // POD: bitwise copy of the whole range
        else
            for (int i=0; i!=count; ++i)        // general case: element-wise assignment
                to[i]=from[i];
    }

The is_pod is a standard-library type property predicate (§35.4.1) defined in <type_traits> allowing us to ask the question "Is T a POD?" in our code.
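For experimentation, here is a self-contained variant of mycopy() that compiles on its own. It is a sketch under two assumptions that are not in the text: a C++17 compiler (for if constexpr), and std::is_trivially_copyable in place of is_pod, since is_pod is deprecated as of C++20 and trivial copyability is the property that memcpy actually relies on. The helper mycopy_demo() is a hypothetical name used only to exercise both branches.

```cpp
#include <cassert>
#include <cstddef>      // std::size_t
#include <cstring>      // std::memcpy
#include <string>
#include <type_traits>  // std::is_trivially_copyable

// Copy count elements from 'from' to 'to'; the two ranges must not overlap.
template<typename T>
void mycopy(T* to, const T* from, int count)
{
    if constexpr (std::is_trivially_copyable<T>::value) {
        // Bitwise copy is valid for trivially copyable types.
        std::memcpy(to, from, static_cast<std::size_t>(count) * sizeof(T));
    } else {
        // General case: element-wise copy assignment.
        for (int i = 0; i != count; ++i)
            to[i] = from[i];
    }
}

// Hypothetical helper: exercises the memcpy branch and the loop branch.
void mycopy_demo()
{
    int src[3] = {1, 2, 3};
    int dst[3] = {0, 0, 0};
    mycopy(dst, src, 3);                 // int: memcpy branch
    assert(dst[0] == 1 && dst[2] == 3);

    std::string from[2] = {"a", "bc"};
    std::string to[2];
    mycopy(to, from, 2);                 // std::string: loop branch
    assert(to[1] == "bc");
}
```

With if constexpr the branch that is not taken is discarded at compile time, so memcpy is never even instantiated for a type such as std::string.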
The best thing about is_pod<T> is that it saves us from remembering the exact rules for what a POD is.

Note that adding or subtracting non-default constructors does not affect layout or performance (that was not true in C++98).

If you feel an urge to become a language lawyer, study the layout and triviality concepts in the standard (§iso.3.9, §iso.9) and try to think about their implications for programmers and compiler writers. Doing so might cure you of the urge before it has consumed too much of your time.

8.2.7 Fields

It seems extravagant to use a whole byte (a char or a bool) to represent a binary variable (for example, an on/off switch), but a char is the smallest object that can be independently allocated and addressed in C++ (§7.2).
It is possible, however, to bundle several such tiny variables together as fields in a struct. A field is often called a bit-field. A member is defined to be a field by specifying the number of bits it is to occupy. Unnamed fields are allowed. They do not affect the meaning of the named fields, but they can be used to make the layout better in some machine-dependent way:

    struct PPN {                  // R6000 Physical Page Number
        unsigned int PFN : 22;    // Page Frame Number
        int : 3;                  // unused
        unsigned int CCA : 3;     // Cache Coherency Algorithm
        bool nonreachable : 1;
        bool dirty : 1;
        bool valid : 1;
        bool global : 1;
    };

This example also illustrates the other main use of fields: to name parts of an externally imposed layout.
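Where the bits of a field land within a word is implementation-defined, so code that must match an external layout bit for bit is sometimes written with explicit masks and shifts instead. The sketch below does that for a PPN-like word. It assumes, purely for illustration, that PFN occupies the low 22 bits, followed by the 3 unused bits, CCA, and then the four flags; the constant and function names are hypothetical and not taken from the text.

```cpp
#include <cstdint>

// Assumed bit positions, lowest bit first (an illustration, not the R6000 spec):
// PFN: 0..21, unused: 22..24, CCA: 25..27, nonreachable: 28, dirty: 29, valid: 30, global: 31
constexpr std::uint32_t pfn_mask   = (1u << 22) - 1;
constexpr int           cca_shift  = 25;
constexpr std::uint32_t cca_mask   = 0x7u << cca_shift;
constexpr std::uint32_t dirty_mask = 1u << 29;

// Extract the page frame number.
constexpr std::uint32_t get_pfn(std::uint32_t w) { return w & pfn_mask; }

// Extract the cache coherency algorithm bits.
constexpr std::uint32_t get_cca(std::uint32_t w) { return (w & cca_mask) >> cca_shift; }

// Test the dirty flag.
constexpr bool is_dirty(std::uint32_t w) { return (w & dirty_mask) != 0; }

// Return w with its PFN replaced; bits of pfn beyond 22 are dropped.
constexpr std::uint32_t set_pfn(std::uint32_t w, std::uint32_t pfn)
{
    return (w & ~pfn_mask) | (pfn & pfn_mask);
}

// Return w with its CCA replaced; bits of cca beyond 3 are dropped.
constexpr std::uint32_t set_cca(std::uint32_t w, std::uint32_t cca)
{
    return (w & ~cca_mask) | ((cca << cca_shift) & cca_mask);
}

// Return w with the dirty flag set or cleared.
constexpr std::uint32_t set_dirty(std::uint32_t w)   { return w | dirty_mask; }
constexpr std::uint32_t clear_dirty(std::uint32_t w) { return w & ~dirty_mask; }

// Compile-time checks: each field can be changed without disturbing the others.
static_assert(get_pfn(set_pfn(0u, 0x12345u)) == 0x12345u, "PFN round trip");
static_assert(get_cca(set_cca(set_pfn(0u, 0x12345u), 5u)) == 5u, "CCA round trip");
static_assert(get_pfn(set_dirty(set_pfn(0u, 0x12345u))) == 0x12345u, "dirty leaves PFN alone");
static_assert(!is_dirty(clear_dirty(set_dirty(0u))), "dirty can be cleared");
```

The bit-field version says the same thing with far less ceremony, which is the sense in which fields are a notational convenience; the hand-written version trades that convenience for a layout that does not depend on the compiler.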
A field must be of an integral or enumeration type (§6.2.1). It is not possible to take the address of a field. Apart from that, however, it can be used exactly like other variables. Note that a bool field really can be represented by a single bit. In an operating system kernel or in a debugger, the type PPN might be used like this:

    void part_of_VM_system(PPN* p)
    {
        // ...
        if (p->dirty) {     // contents changed
            // copy to disk
            p->dirty = 0;
        }
    }

Surprisingly, using fields to pack several variables into a single byte does not necessarily save space.
It saves data space, but the size of the code needed to manipulate these variables increases on most machines. Programs have been known to shrink significantly when binary variables were converted from bit-fields to characters! Furthermore, it is typically much faster to access a char or an int than to access a field. Fields are simply a convenient shorthand for using bitwise logical operators (§11.1.1) to extract information from and insert information into part of a word.

8.3 Unions

A union is a struct in which all members are allocated at the same address so that the union occupies only as much space as its largest member.
Naturally, a union can hold a value for only one member at a time. For example, consider a symbol table entry that holds a name and a value:

    enum Type { str, num };

    struct Entry {
        char* name;
        Type t;
        char* s;    // use s if t==str
        int i;      // use i if t==num
    };

    void f(Entry* p)
    {
        if (p->t == str)
            cout << p->s;
        // ...
    }

The members s and i can never be used at the same time, so space is wasted.
It can be easily recovered by specifying that both should be members of a union, like this:

    union Value {
        char* s;
        int i;
    };

The language doesn’t keep track of which kind of value is held by a union, so the programmer must do that:

    struct Entry {
        char* name;
        Type t;
        Value v;    // use v.s if t==str; use v.i if t==num
    };

    void f(Entry* p)
    {
        if (p->t == str)
            cout << p->v.s;
        // ...
    }

To avoid errors, one can encapsulate a union so that the correspondence between a type field and access to the union members can be guaranteed (§8.3.2).

Unions are sometimes misused for "type conversion." This misuse is practiced mainly by programmers trained in languages that do not have explicit type conversion facilities, so that cheating is necessary.
For example, the following "converts" an int to an int* simply by assuming bitwise equivalence:

    union Fudge {
        int i;
        int* p;
    };

    int* cheat(int i)
    {
        Fudge a;
        a.i = i;
        return a.p;    // bad use
    }

This is not really a conversion at all. On some machines, an int and an int* do not occupy the same amount of space, while on others, no integer can have an odd address. Such use of a union is dangerous and nonportable. If you need such an inherently ugly conversion, use an explicit type conversion operator (§11.5.2) so that the reader can see what is going on. For example:

    int* cheat2(int i)
    {
        return reinterpret_cast<int*>(i);    // obviously ugly and dangerous
    }

Here, at least the compiler has a chance to warn you if the sizes of objects are different and such code stands out like the sore thumb it is.

Use of unions can be essential for compactness of data and through that for performance.
However, most programs don’t improve much from the use of unions and unions are rather error-prone. Consequently, I consider unions an overused feature; avoid them when you can.

8.3.1 Unions and Classes

Many nontrivial unions have a member that is much larger than the most frequently used members. Because the size of a union is at least as large as its largest member, space is wasted. This waste can often be eliminated by using a set of derived classes (§3.2.2, Chapter 20) instead of a union.

Technically, a union is a kind of a struct (§8.2) which in turn is a kind of a class (Chapter 16). However, many of the facilities provided for classes are not relevant for unions, so some restrictions are imposed on unions:
[1] A union cannot have virtual functions.
[2] A union cannot have members of reference type.
[3] A union cannot have base classes.
[4] If a union has a member with a user-defined constructor, a copy operation, a move operation, or a destructor, then that special function is deleted (§3.3.4, §17.6.4) for that union; that is, it cannot be used for an object of the union type.
[5] At most one member of a union can have an in-class initializer (§17.4.4).
[6] A union cannot be used as a base class.

These restrictions prevent many subtle errors and simplify the implementation of unions.
The latter is important because the use of unions is often an optimization and we won’t want "hidden costs" imposed to compromise that.

The rule that deletes constructors (etc.) from a union with a member that has a constructor (etc.) keeps simple unions simple and forces the programmer to provide complicated operations if they are needed. For example, since Entry has no member with constructors, destructors, or assignments, we can create and copy Entrys freely.
For example:

    void f(Entry a)
    {
        Entry b = a;
    }

Doing so with a more complicated union would cause implementation difficulties or errors:

    union U {
        int m1;
        complex<double> m2;    // complex has a constructor
        string m3;             // string has a constructor (maintaining a serious invariant)
    };

To copy a U we would have to decide which copy operation to use.
For example:

    void f2(U x)
    {
        U u;                // error: which default constructor?
        U u2 = x;           // error: which copy constructor?
        u.m1 = 1;           // assign to int member
        string s = u.m3;    // disaster: read from string member
        return;             // error: which destructors are called for x, u, and u2?
    }

It’s illegal to write one member and then read another, but people do that nevertheless (usually by mistake). In this case, the string copy constructor would be called with an invalid argument. It is fortunate that U won’t compile.
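That the special functions of U really are deleted can be confirmed without writing code that fails to compile. This sketch asks the standard type predicates about U and, for contrast, about a simple union shaped like Value from §8.3 (renamed Simple here, a hypothetical name, to keep the block self-contained).

```cpp
#include <complex>
#include <string>
#include <type_traits>

union U {
    int m1;
    std::complex<double> m2;    // complex has a constructor
    std::string m3;             // string has constructors, assignments, and a destructor
};

union Simple {                  // hypothetical stand-in for Value (§8.3)
    char* s;
    int i;
};

// Declaring U is fine; creating, copying, or destroying a U object is not.
static_assert(!std::is_default_constructible<U>::value, "U u; would not compile");
static_assert(!std::is_copy_constructible<U>::value, "U u2 = x; would not compile");
static_assert(!std::is_copy_assignable<U>::value, "u = x; would not compile");
static_assert(!std::is_destructible<U>::value, "no usable destructor");

// A union of built-in types keeps all of its operations.
static_assert(std::is_default_constructible<Simple>::value, "Simple can be created");
static_assert(std::is_trivially_copyable<Simple>::value, "Simple can be copied bitwise");
static_assert(std::is_destructible<Simple>::value, "Simple can be destroyed");
```

Note that the union definition itself is accepted; it is only the use of a deleted operation, as in f2(), that the compiler rejects.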
When needed, a user can define a class containing a union that properly handles union members with constructors, destructors, and assignments (§8.3.2). If desired, such a class can also prevent the error of writing one member and then reading another.

It is possible to specify an in-class initializer for at most one member.
If so, this initializer will be used for default initialization. For example:

    union U2 {
        int a;
        const char* p {""};
    };

    U2 x1;        // default initialized to x1.p == ""
    U2 x2 {7};    // x2.a == 7

8.3.2 Anonymous unions

To see how we can write a class that overcomes the problems with misuse of a union, consider a variant of Entry (§8.3):

    class Entry2 {    // two alternative representations represented as a union
    private:
        enum class Tag { number, text };
        Tag type;     // discriminant

        union {       // representation
            int i;
            string s; // string has default constructor, copy operations, and destructor
        };
    public:
        struct Bad_entry { };    // used for exceptions

        string name;

        ~Entry2();
        Entry2& operator=(const Entry2&);    // necessary because of the string variant
        Entry2(const Entry2&);
        // ...

        int number() const;
        string text() const;

        void set_number(int n);
        void set_text(const string&);
        // ...
    };

I’m not a fan of get/set functions, but in this case we really need to perform a nontrivial user-specified action on each access.
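The excerpt declares the members of Entry2 but does not define them here. What follows is one way the definitions might look, offered as a sketch, not as the author's implementation. Two things are assumptions of this sketch: the default constructor (an Entry2 starts out holding the number 0) and the choice to build copy assignment on top of the two set functions. The helper entry2_demo() is a hypothetical name used only to exercise the class.

```cpp
#include <cassert>
#include <new>       // placement new
#include <string>

using std::string;

class Entry2 {
private:
    enum class Tag { number, text };
    Tag type;                    // discriminant: which union member is alive

    union {                      // representation
        int i;
        string s;
    };
public:
    struct Bad_entry { };        // thrown when the wrong member is read

    string name;

    Entry2() : type(Tag::number), i(0) { }     // assumption: start as the number 0

    Entry2(const Entry2& e) : type(e.type), name(e.name)
    {
        if (type == Tag::text)
            new(&s) string(e.s); // construct the string in place
        else
            i = e.i;
    }

    Entry2& operator=(const Entry2& e)
    {
        name = e.name;
        if (e.type == Tag::text)
            set_text(e.s);       // also correct for self-assignment
        else
            set_number(e.i);
        return *this;
    }

    ~Entry2()
    {
        if (type == Tag::text)
            s.~string();         // the string must be destroyed explicitly
    }

    int number() const
    {
        if (type != Tag::number) throw Bad_entry{};
        return i;
    }

    string text() const
    {
        if (type != Tag::text) throw Bad_entry{};
        return s;
    }

    void set_number(int n)
    {
        if (type == Tag::text) {
            s.~string();         // end the string's lifetime before reusing the storage
            type = Tag::number;
        }
        i = n;
    }

    void set_text(const string& ss)
    {
        if (type == Tag::text) {
            s = ss;              // a string is already alive: plain assignment
        } else {
            new(&s) string(ss);  // no string yet: construct one in place
            type = Tag::text;
        }
    }
};

// Hypothetical helper: switches representation and checks the guarded access.
void entry2_demo()
{
    Entry2 e;
    e.set_text("hello");
    assert(e.text() == "hello");

    bool threw = false;
    try { e.number(); } catch (const Entry2::Bad_entry&) { threw = true; }
    assert(threw);               // reading the wrong member is caught

    e.set_number(7);
    assert(e.number() == 7);
}
```

The discriminant is updated in exactly two places, set_number() and set_text(), so every other member can rely on type correctly naming the live union member.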