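Introduction: Effective C++: 55 Specific Ways to Improve Your Programs and Designs by Scott Meyers (Chinese title: 《Effective C++:改善程序与设计的55个具体做法》) is a classic C++ book, often described as required reading for C++ programmers. For this annotated edition, 电子工业出版社 (Publishing House of Electronics Industry) invited senior Chinese experts to add commentary and notes to the original English text, drawing on their own study and practical experience to guide readers and point out shortcuts. This article is excerpted from Chapter 1: Accustoming Yourself to C++.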
Regardless of your programming background, C++ is likely to take a little getting used to. It’s a powerful language with an enormous range of features, but before you can harness that power and make effective use of those features, you have to accustom yourself to C++’s way of doing things. This entire book is about how to do that, but some things are more fundamental than others, and this chapter is about some of the most fundamental things of all.
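Annotation: Every language has its own way of solving problems. Different approaches may reach the same destination, but the approach chosen can make a large difference to program performance. Some techniques get the most out of the language; others are simply conventions. Programmers work with other people, whether writing libraries for others to use or using libraries others have written, and following conventions reduces the cost of communication. These conventions have deep reasons behind them that cannot be explained in a sentence. Unfortunately, the C++ community is so large that not every question has a settled answer.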
Item 1: View C++ as a federation of languages.
In the beginning, C++ was just C with some object-oriented features tacked on. Even C++’s original name, “C with Classes,” reflected this simple heritage.
As the language matured, it grew bolder and more adventurous, adopting ideas, features, and programming strategies different from those of C with Classes. Exceptions required different approaches to structuring functions (see Item 29). Templates gave rise to new ways of thinking about design (see Item 41), and the STL defined an approach to extensibility unlike any most people had ever seen.
Today’s C++ is a multiparadigm programming language, one supporting a combination of procedural, object-oriented, functional, generic, and metaprogramming features. This power and flexibility make C++ a tool without equal, but can also cause some confusion. All the “proper usage” rules seem to have exceptions. How are we to make sense of such a language?
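Annotation: Nobody can master every corner of C++. This is neither an exaggeration nor a slight on the reader's intelligence: C++ itself keeps evolving and keeps acquiring new features.
Many years ago, in the first C++ compiler I learned with (Turbo C++ 1.0), template was merely a reserved keyword with no functionality behind it. Yet some years after C++ was born, templates became the foundation of the STL. Even the creator of C++ did not at first realize how much power this unassuming feature contained.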
The easiest way is to view C++ not as a single language but as a federation of related languages. Within a particular sublanguage, the rules tend to be simple, straightforward, and easy to remember. When you move from one sublanguage to another, however, the rules may change. To make sense of C++, you have to recognize its primary sublanguages. Fortunately, there are only four:
C. Way down deep, C++ is still based on C. Blocks, statements, the preprocessor, built-in data types, arrays, pointers, etc., all come from C. In many cases, C++ offers approaches to problems that are superior to their C counterparts (e.g., see Items 2 (alternatives to the preprocessor) and 13 (using objects to manage resources)), but when you find yourself working with the C part of C++, the rules for effective programming reflect C’s more limited scope: no templates, no exceptions, no overloading, etc.
Object-Oriented C++. This part of C++ is what C with Classes was all about: classes (including constructors and destructors), encapsulation, inheritance, polymorphism, virtual functions (dynamic binding), etc. This is the part of C++ to which the classic rules for object-oriented design most directly apply.
Template C++. This is the generic programming part of C++, the one that most programmers have the least experience with. Template considerations pervade C++, and it’s not uncommon for rules of good programming to include special template-only clauses (e.g., see Item 46 on facilitating type conversions in calls to template functions). In fact, templates are so powerful, they give rise to a completely new programming paradigm, template metaprogramming (TMP). Item 48 provides an overview of TMP, but unless you’re a hard-core template junkie, you need not worry about it. The rules for TMP rarely interact with mainstream C++ programming.
The STL. The STL is a template library, of course, but it’s a very special template library. Its conventions regarding containers, iterators, algorithms, and function objects mesh beautifully, but templates and libraries can be built around other ideas, too. The STL has particular ways of doing things, and when you’re working with the STL, you need to be sure to follow its conventions.
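Annotation: C++ is not C++ in any absolute sense. The second edition of this book had no Item 1; in this edition the passage is placed first, which shows that the author himself only gradually recognized how important the issue is. I strongly agree. This Item is the core of the whole book and should be read slowly and carefully before anything else. If you have read the second edition, compare the writing styles and you will find a great difference. The author no longer insists on how things must be done in C++; there is a faint note of resignation in the text, and this Item is the best evidence of it.
In my view, the differences between the various sides of C++ far outweigh what they have in common. C++ evolved into its present form over several decades, and blending so many programming styles into one language so that they coexist harmoniously is very hard. Because it has had to satisfy the differing needs of all kinds of projects and users at different times, C++ was not defined through careful deliberation at the outset. That the language has come this far and can still keep developing is remarkable. So it is easy to understand why the new C++ standard, more than ten years after 1998, has still not been fully finalized.
Some C++ textbooks (including the second edition of this book) repeatedly insist that you should not use C++ as C. In a sense that is right. But using only part of C++, just the C part, and relying on C++'s improvements merely to patch some of C's shortcomings, is also a sound option in engineering practice. How best to use C++ depends only on how your team defines the C++ it uses, and on whether everyone agrees. Google does this well; its published C++ coding guidelines circulate on the web and are worth reading. Having a standard that everyone follows matters far more than what exactly the standard says.
Between 2005 and 2006 I promoted a C-like subset of C++ within my team for a while. It was completely different in style from the C++ I had written in earlier years, and it worked well. That experience led me to rethink object orientation and templates a great deal, and eventually I switched to pure C development entirely.
Personally, I think you should try many different things rather than arbitrarily treat any one technique as the only truth. You can love object orientation and also give templates a try. But be wary: C++ lets you mix different programming styles, each with high-performance support, so you can take the best of each. It gives a feeling of holding the world in your hands and can give C++ programmers a constant thrill of invention. Yet the conflicts and complexity this creates can easily exceed what one person can control. That is especially dangerous for clever C++ programmers. It is hard to appreciate this by studying the language alone, without years of accumulated experience.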
Keep these four sublanguages in mind, and don’t be surprised when you encounter situations where effective programming requires that you change strategy when you switch from one sublanguage to another. For example, pass-by-value is generally more efficient than pass-by-reference for built-in (i.e., C-like) types, but when you move from the C part of C++ to Object-Oriented C++, the existence of user-defined constructors and destructors means that pass-by-reference-to-const is usually better. This is especially the case when working in Template C++, because there, you don’t even know the type of object you’re dealing with. When you cross into the STL, however, you know that iterators and function objects are modeled on pointers in C, so for iterators and function objects in the STL, the old C pass-by-value rule applies again. (For all the details on choosing among parameter-passing options, see Item 20.)
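As a rough illustration of how the parameter-passing advice shifts between sublanguages, the sketch below puts one small function from each on the same page. All names (square, Widget, nameLength, larger, countIf, IsEven, parameterPassingDemo) are hypothetical and chosen only for this example; it is one way to read the paragraph above, not code from the book.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// C part of C++: built-in types are cheap to copy, so pass by value.
int square(int x) { return x * x; }

// Object-Oriented C++: a user-defined type with a constructor.
// Widget is a hypothetical class used only for illustration.
class Widget {
public:
    explicit Widget(const std::string& name) : name_(name) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

// Pass by reference-to-const: no copy constructor or destructor call.
std::size_t nameLength(const Widget& w) { return w.name().size(); }

// Template C++: T is unknown, so reference-to-const is the safer default.
template<typename T>
T larger(const T& a, const T& b) { return a > b ? a : b; }

// STL: iterators and function objects are modeled on pointers,
// so they are conventionally passed by value.
template<typename Iter, typename Pred>
std::size_t countIf(Iter first, Iter last, Pred pred) {
    std::size_t n = 0;
    for (; first != last; ++first) {
        if (pred(*first)) ++n;
    }
    return n;
}

// A small function object in the STL style.
struct IsEven {
    bool operator()(int v) const { return v % 2 == 0; }
};

// Exercises each function once; the asserts document the expected results.
void parameterPassingDemo() {
    assert(square(4) == 16);
    assert(nameLength(Widget("gadget")) == 6);
    assert(larger(std::string("a"), std::string("b")) == "b");

    std::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    v.push_back(4);
    assert(countIf(v.begin(), v.end(), IsEven()) == 2);
}
```

Calling parameterPassingDemo() from main runs all four cases. The point of the sketch is only that the signature style changes with the sublanguage while the bodies stay ordinary.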
C++, then, isn’t a unified language with a single set of rules; it’s a federation of four sublanguages, each with its own conventions. Keep these sublanguages in mind, and you’ll find that C++ is a lot easier to understand.
Things to Remember
Rules for effective C++ programming vary, depending on the part of C++ you are using.
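Annotation: Defining how you intend to use C++ is very important; it determines whether your project can keep going until release. Even if you work alone, you will use other people's code (at least the standard library) or provide extension interfaces for others to build on. Either way you deal with code you did not write. Even if everything is under your sole control, you cannot use the coolest-looking C++ features at will, because you will always find something more interesting in C++ to dig into. That mindset is dangerous: the project gradually drifts from its original goal, and you end up writing C++ for the sake of writing C++ rather than to solve the problem.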
Item 2: Prefer consts, enums, and inlines to #defines.
This Item might better be called “prefer the compiler to the preprocessor,” because #define may be treated as if it’s not part of the language per se. That’s one of its problems. When you do something like this,
- #define ASPECT_RATIO 1.653
the symbolic name ASPECT_RATIO may never be seen by compilers; it may be removed by the preprocessor before the source code ever gets to a compiler. As a result, the name ASPECT_RATIO may not get entered into the symbol table. This can be confusing if you get an error during compilation involving the use of the constant, because the error message may refer to 1.653, not ASPECT_RATIO. If ASPECT_RATIO were defined in a header file you didn’t write, you’d have no idea where that 1.653 came from, and you’d waste time tracking it down. This problem can also crop up in a symbolic debugger, because, again, the name you’re programming with may not be in the symbol table.
The solution is to replace the macro with a constant:
- const double AspectRatio = 1.653; // uppercase names are usually for
- // macros, hence the name change
As a language constant, AspectRatio is definitely seen by compilers and is certainly entered into their symbol tables. In addition, in the case of a floating point constant (such as in this example), use of the constant may yield smaller code than using a #define. That’s because the preprocessor’s blind substitution of the macro name ASPECT_RATIO with 1.653 could result in multiple copies of 1.653 in your object code, while the use of the constant AspectRatio should never result in more than one copy.
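Annotation: Macros are not a good thing in C++. C++ conventions go to great lengths to avoid macro definitions. Why do macros, so common in C, fall out of favor in C++? Saying that C has no good alternative is not enough of an explanation.
C++ emphasizes strong typing, which helps free programmers from dazzling syntactic-sugar traps and helps the compiler catch the programmer's mistakes automatically. C's philosophy, in contrast, is explicitness: a program should be what it appears to be. C is more weakly typed, but it tries to show the actual work being done. In most cases, macros serve in the language to make programs more readable and configurable, not to change how the language looks or to provide a DSL (domain-specific language).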
When replacing #defines with constants, two special cases are worth mentioning. The first is defining constant pointers. Because constant definitions are typically put in header files (where many different source files will include them), it’s important that the pointer be declared const, usually in addition to what the pointer points to. To define a constant char*-based string in a header file, for example, you have to write const twice:
- const char * const authorName = "Scott Meyers";
For a complete discussion of the meanings and uses of const, especially in conjunction with pointers, see Item 3. However, it’s worth reminding you here that string objects are generally preferable to their char*-based progenitors, so authorName is often better defined this way:
- const std::string authorName("Scott Meyers");
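Annotation: This shows that built-in types are out of favor in C++. Unless you intend to write C-style C++, std::string is always somewhat better than const char *.
The void ** type, common in C interfaces, is also rarely seen in C++-style programs.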
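To see what each const in the pointer declaration buys, the following minimal sketch places both definitions of the author name side by side. The name authorNameStr and the helper checkAuthorNames are assumptions made for this example, since the text reuses authorName for both forms.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Pointer form: the first const protects the characters,
// the second const protects the pointer itself.
const char* const authorName = "Scott Meyers";

// String-object form (renamed here so both can coexist in one file).
const std::string authorNameStr("Scott Meyers");

// authorName = "Someone Else";  // would not compile: the pointer is const
// authorName[0] = 's';          // would not compile: the characters are const

// Hypothetical check that both constants hold the same text.
void checkAuthorNames() {
    assert(authorNameStr == authorName);
    assert(std::strlen(authorName) == authorNameStr.size());
}
```

The string object carries its own length and comparison operators, which is part of why it is usually the more convenient choice.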
The second special case concerns class-specific constants. To limit the scope of a constant to a class, you must make it a member, and to ensure there’s at most one copy of the constant, you must make it a static member:
- class GamePlayer {
- private:
- static const int NumTurns = 5; // constant declaration
- int scores[NumTurns]; // use of constant
- ...
- };
What you see above is a declaration for NumTurns, not a definition. Usually, C++ requires that you provide a definition for anything you use, but class-specific constants that are static and of integral type (e.g., integers, chars, bools) are an exception. As long as you don’t take their address, you can declare them and use them without providing a definition. If you do take the address of a class constant, or if your compiler incorrectly insists on a definition even if you don’t take the address, you provide a separate definition like this:
- const int GamePlayer::NumTurns; // definition of NumTurns; see
- // below for why no value is given
You put this in an implementation file, not a header file. Because the initial value of class constants is provided where the constant is declared (e.g., NumTurns is initialized to 5 when it is declared), no initial value is permitted at the point of definition.
Note, by the way, that there’s no way to create a class-specific constant using a #define, because #defines don’t respect scope. Once a macro is defined, it’s in force for the rest of the compilation (unless it’s #undefed somewhere along the line). Which means that not only can’t #defines be used for class-specific constants, they also can’t be used to provide any kind of encapsulation, i.e., there is no such thing as a “private” #define. Of course, const data members can be encapsulated; NumTurns is.
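The following sketch shows the declaration, the separate definition, and the address-taking that makes the definition necessary. It assumes NumTurns is made public purely so it can be inspected from outside the class, and the helper names addressOfNumTurns and checkNumTurns are hypothetical.

```cpp
#include <cassert>

class GamePlayer {
public:
    static const int NumTurns = 5;  // declaration with in-class initial value
    int scores[NumTurns];           // compile-time use needs no definition
};

// Definition. In a multi-file program this belongs in exactly one
// implementation file. No initial value is given here, because the
// value was already supplied in the class.
const int GamePlayer::NumTurns;

// Taking the address is what makes the definition above necessary.
const int* addressOfNumTurns() { return &GamePlayer::NumTurns; }

// Hypothetical check of the array size and of the constant's value.
void checkNumTurns() {
    GamePlayer p;
    assert(sizeof(p.scores) / sizeof(p.scores[0]) == 5);
    assert(*addressOfNumTurns() == 5);
}
```

Removing the out-of-class definition while keeping addressOfNumTurns would typically surface as a linker error rather than a compiler error, which is one reason the rule is easy to miss.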
Older compilers may not accept the syntax above, because it used to be illegal to provide an initial value for a static class member at its point of declaration. Furthermore, in-class initialization is allowed only for integral types and only for constants. In cases where the above syntax can’t be used, you put the initial value at the point of definition:
- class CostEstimate {
- private:
- static const double FudgeFactor; // declaration of static class
- ... // constant; goes in header file
- };
- const double // definition of static class
- CostEstimate::FudgeFactor = 1.35; // constant; goes in impl. file
This is all you need almost all the time. The only exception is when you need the value of a class constant during compilation of the class, such as in the declaration of the array GamePlayer::scores above (where compilers insist on knowing the size of the array during compilation). Then the accepted way to compensate for compilers that (incorrectly) forbid the in-class specification of initial values for static integral class constants is to use what is affectionately (and non-pejoratively) known as “the enum hack.” This technique takes advantage of the fact that the values of an enumerated type can be used where ints are expected, so GamePlayer could just as well be defined like this:
- class GamePlayer {
- private:
- enum { NumTurns = 5 }; // “the enum hack” — makes
- // NumTurns a symbolic name for 5
- int scores[NumTurns]; // fine
- ...
- };
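Annotation on enum { NumTurns = 5 }:
Early C++ compilers did not accept numeric constants initialized inside a class, and C++ does not allow a variable to be used as an array size. To avoid macros, using an enum was the common approach.
Annotation on int scores[NumTurns]:
STL purists may prefer std::vector or boost::array. But undeniably, almost nobody insists on never using built-in arrays at all.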
The enum hack is worth knowing about for several reasons. First, the enum hack behaves in some ways more like a #define than a const does, and sometimes that’s what you want. For example, it’s legal to take the address of a const, but it’s not legal to take the address of an enum, and it’s typically not legal to take the address of a #define, either. If you don’t want to let people get a pointer or reference to one of your integral constants, an enum is a good way to enforce that constraint. (For more on enforcing design constraints through coding decisions, consult Item 18.) Also, though good compilers won’t set aside storage for const objects of integral types (unless you create a pointer or reference to the object), sloppy compilers may, and you may not be willing to set aside memory for such objects. Like #defines, enums never result in that kind of unnecessary memory allocation.
A second reason to know about the enum hack is purely pragmatic. Lots of code employs it, so you need to recognize it when you see it. In fact, the enum hack is a fundamental technique of template metaprogramming (see Item 48).
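As a hedged illustration of both reasons, the sketch below shows the enum hack inside GamePlayer and a conventional compile-time factorial of the kind often used to introduce template metaprogramming. NumTurns is made public here only for inspection, and Factorial and checkEnumHack are names assumed for this example.

```cpp
#include <cassert>

class GamePlayer {
public:
    enum { NumTurns = 5 };  // the enum hack: a symbolic name for 5
    int scores[NumTurns];   // usable wherever a compile-time int is needed
};

// const int* p = &GamePlayer::NumTurns;  // would not compile:
//                                        // an enumerator has no address

// Template metaprogramming use: the enumerator holds a value that is
// computed entirely during compilation.
template<unsigned N>
struct Factorial {
    enum { value = N * Factorial<N - 1>::value };
};

// Base case that ends the recursion.
template<>
struct Factorial<0> {
    enum { value = 1 };
};

// Hypothetical check of the array size and the compile-time result.
void checkEnumHack() {
    GamePlayer p;
    assert(sizeof(p.scores) / sizeof(p.scores[0]) == 5);
    assert(Factorial<5>::value == 120);
}
```

Because the enumerators are not objects, no storage is set aside for them and nobody can form a pointer or reference to them.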
Getting back to the preprocessor, another common (mis)use of the #define directive is using it to implement macros that look like functions but that don’t incur the overhead of a function call. Here’s a macro that calls some function f with the greater of the macro’s arguments:
- // call f with the maximum of a and b
- #define CALL_WITH_MAX(a, b) f((a) > (b) ? (a) : (b))
Macros like this have so many drawbacks, just thinking about them is painful.
Whenever you write this kind of macro, you have to remember to parenthesize all the arguments in the macro body. Otherwise you can run into trouble when somebody calls the macro with an expression. But even if you get that right, look at the weird things that can happen:
- int a = 5, b = 0;
- CALL_WITH_MAX(++a, b); // a is incremented twice
- CALL_WITH_MAX(++a, b+10); // a is incremented once
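Annotation: max is the perennial cautionary example for macros. Sadly, the template solution given here is not perfect either. Interested readers can search Google for an article titled min, max, and more. That article was also written by Scott Meyers, the author of this book, and you will be amazed at how hard it is to get such a simple thing completely right.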
Here, the number of times that a is incremented before calling f depends on what it is being compared with!
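The behaviour can be checked with a small program. In this sketch f is a hypothetical stand-in that simply records its argument in a global named lastSeen, and macroDemo is an assumed wrapper around the two calls from the text.

```cpp
#include <cassert>

// Hypothetical stand-in for f: remembers the last value it was given.
int lastSeen = 0;
void f(int x) { lastSeen = x; }

// call f with the maximum of a and b
#define CALL_WITH_MAX(a, b) f((a) > (b) ? (a) : (b))

// Runs the two calls from the text and returns the final value of a.
int macroDemo() {
    int a = 5, b = 0;

    // Expands to f((++a) > (b) ? (++a) : (b)).
    // The comparison increments a to 6, then the selected branch
    // increments it again to 7.
    CALL_WITH_MAX(++a, b);
    assert(a == 7 && lastSeen == 7);

    // Expands to f((++a) > (b + 10) ? (++a) : (b + 10)).
    // The comparison increments a to 8; 8 > 10 is false, so the
    // second ++a is never evaluated and f receives 10.
    CALL_WITH_MAX(++a, b + 10);
    assert(a == 8 && lastSeen == 10);

    return a;
}
```

The double increment is well defined here, because the conditional operator fully evaluates its first operand before choosing a branch; the problem is that the result surprises the caller, not that it is undefined.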
Fortunately, you don’t need to put up with this nonsense. You can get all the efficiency of a macro plus all the predictable behavior and type safety of a regular function by using a template for an inline function (see Item 30):
- // because we don’t know what T is, we pass by reference-to-const; see Item 20
- template<typename T>
- inline void callWithMax(const T& a, const T& b)
- {
-   f(a > b ? a : b);
- }
This template generates a whole family of functions, each of which takes two objects of the same type and calls f with the greater of the two objects. There’s no need to parenthesize parameters inside the function body, no need to worry about evaluating parameters multiple times, etc. Furthermore, because callWithMax is a real function, it obeys scope and access rules. For example, it makes perfect sense to talk about an inline function that is private to a class. In general, there’s just no way to do that with a macro.
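A minimal sketch of the template version in use follows. As before, f is a hypothetical stand-in that records its argument in a global named lastSeen, and templateDemo is an assumed wrapper; only callWithMax itself comes from the text.

```cpp
#include <cassert>

// Hypothetical stand-in for f: remembers the last value it was given.
int lastSeen = 0;
void f(int x) { lastSeen = x; }

// T is unknown, so both parameters are passed by reference-to-const.
template<typename T>
inline void callWithMax(const T& a, const T& b)
{
    f(a > b ? a : b);
}

// Makes the same two calls that misbehave with a function-like macro
// and returns the final value of a.
int templateDemo() {
    int a = 5, b = 0;

    // Each argument is evaluated exactly once before the call.
    callWithMax(++a, b);
    assert(a == 6 && lastSeen == 6);

    // The temporary produced by b + 10 binds to the const reference.
    callWithMax(++a, b + 10);
    assert(a == 7 && lastSeen == 10);

    return a;
}
```

Note that both arguments must deduce the same T, so a call mixing, say, an int and a double would need an explicit template argument or a cast.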
Given the availability of consts, enums, and inlines, your need for the preprocessor (especially #define) is reduced, but it’s not eliminated. #include remains essential, and #ifdef/#ifndef continue to play important roles in controlling compilation. It’s not yet time to retire the preprocessor, but you should definitely give it long and frequent vacations.
Things to Remember
For simple constants, prefer const objects or enums to #defines.
For function-like macros, prefer inline functions to #defines.