C++ Programming Style

With Rationale


1 Introduction

1.1 Background

This document defines the C++ coding style for Wildfire, Inc. It also tries to provide guidelines on how to use the various features found in the C++ language. The establishment of a common style will facilitate understanding and maintaining code developed by more than one programmer as well as making it easier for several people to cooperate in the development of the same program. In addition, following a common programming style will enable the construction of tools that incorporate knowledge of these standards to help in the programming task.

Using a consistent coding style throughout a particular module, package, or project is important because it allows people other than the author to easily understand and (hopefully) maintain the code. Most programming styles are somewhat arbitrary, and this one is no exception. In the places where there were choices to be made, we attempted to include the rationale for our decisions.

This document contains rationale for many of the choices made. Rationale will be presented with this paragraph style.

One more thing to keep in mind is that when modifying an existing source file, the modifications should be coded in the same style as the file being modified. A consistent style is important, even if it isn't the one you usually use.

However, there are many variations in style that do not interfere with achieving these goals. This style guide is intended to be the minimum reasonable set of rules that accomplish these ends. It does not attempt to answer all questions about where ever character should go. We rely upon the good judgement of the programmer as much as possible.

This guide presents things in "programming order", that is, notes, rules, and guidelines about a particular programming construct are grouped together. In addition, the sections are in an order that approximates that used to write programs.

The section Miscellaneous on page 32 contains many useful tidbits of information that didn't fit well into any of the other sections.

Finally, there is a Bibliography and Reading List at the end of this document that contains quite a few titles. Many of the books there should be considered mandatory reading --- if nothing else, buy and read a copy of both the ARM [10] by Ellis & Stroustrup and Effective C++ [12] by Scott Meyers. Coplien's Advanced C++ Programming Styles and Idioms [13] is also highly recommended.

2 Fundamental MetaRule

A good style guide can enhance the quality of the code that we write. This style guide tries to present a standard set of methods for achieving that end.

It is, however, the end itself that is important. Deviations from this standard style are acceptable if they enhance readability and code maintainability. Major deviations require a explanatory comment at each point of departure so that later readers will know that you didn't make a mistake, but purposefully are doing a local variation for a good cause.

A good rule of thumb is that 10% of the cost of a project goes into writing code, while more than 50% is spent on maintaining it. Think about the trade-offs between ease-of-programming now vs. ease-of-maintenance for the next 5 to 10 years when you consider the rules presented here.

2.1 C++ is different from C

The C++ programming language differs substantially from the C programming language. In terms of usage, C is more like Pascal than it is like C++. This style guide differs from traditional C style guides in places where the "C mindset" is detrimental to the object-oriented outlook desired for C++ development.

3 Files

Code should compile without errors or warnings. "Compile" in this sense applies to lint-like code analyzers, a standard-validating compilers (ANSI-C++, POSIX, Style Guide Verification, etc.), and C++ compilers on all supported hardware/software platforms.

3.1 File Naming Conventions

Try to pick filenames that are meaningful and understandable. File names are not limited to 14 characters. The following table shows the file naming conventions we will use:

-------------------------------------------
File Contents             Name               
-------------------------------------------
C++ Source Code           filename.cc        
C++ Header File           filename.hh        
C Source Code             filename.c         
C Header File             filename.h         
Object Code               filename.o         
Archive Libraries         filename.a         
Dynamic Shared Libraries  filename.so.<ver>  
Shell Scripts             filename.sh        
Yacc/C Source Code        filename.y         
Yacc/C++ Source Code      filename.yy        
Lex/C Source Code         filename.l         
Lex/C++ Source Code       filename.ll        
Directory Contents        README             
Build rules for make      Makefile           
-------------------------------------------

POSIX specifies a maximum of 14 characters for filenames, but in practice this limit is too restrictive: source control systems like RCS and SCCS use 2 characters; the IDL compiler generates names with suffixes appended, etc.

3.2 File Organization
char *s1 = "hello\n"
           "world\n";                    // s1 is exactly the same as s2,
char *s2 = "hello\nworld\n";

The line length limit is related to the fact that many printers and terminals are limited to an 80 character line length. Source code that has longer lines will cause either line wrapping or truncation on these devices. Both of these behaviors result in code that is hard to read.

#pragma directives are, by definition, non-standard, and can cause unexpected behavior when compiled on other systems. On another system, a #pragma might even have the opposite meaning of the intended one.

In some cases #pragma is a necessary evil. Some compilers use #pragma directives to control template instantiations. In these rare cases the #pragma usage should be documented and, if possible, #ifdef directives should be to ensure other copilers don't trip over the usage. (See #error directive and 4.2 Conditional Compilation (#if and its ilk)

3.3 Header File Content

Header files should be functionally organized, with declarations of separate subsystems placed in separate header files. For class definitions, header files should be treated as interface definition files.

The required ordering in header files is as follows:

  1. A "stand-alone" copyright notice such as that shown below (1):
    // Copyright 1992 by Wildfire Communications, Inc.
    // remainder of Wildfire copyright notice
Don't place anything other than the copyright text in this comment --- the whole comment will be replaced programmatically to update the copyright text.

  1. An #ifndef that checks whether the header file has been previously included, and if it has, ignores the rest of the file. The name of the variable tested looks like _WF_FILE_HH, where "FILE_HH" is replaced by the header file name, using underscore for any character not legal in an identifier. Immediately after the test, the variable is defined.
    #ifndef _WF_FILENAME_HH
    #define _WF_FILENAME_HH

  1. A block comment describing the contents of the file. A description of the purpose of the entities in the files is more useful than just a list of class names. Keep the description short and to the point.

  2. The RCS $Header$ variable should be placed as the end of the block comment, or in a comment immediately following it:

// $Header$

  1. #include directives. Every header file should be self-sufficient, including all other header files it needs.

Since implementations will change, code that places "implementation-required #includes" in clients could cause them to become tied to a particular implementation.

The following items are a suggested order. There will be many times where this ordering is inappropriate, and should be changed.

  1. const declarations.

  2. Forward class, struct, and union declarations.

  3. struct or union declarations.

  4. typedef declarations.

  5. class declarations.

The rest of these items should be in found this order at the end of the header file.

  1. Global variable declarations (not definitions). Of course, global variables should be avoided, and should never be used in interfaces. A class scoped enum or const can be used to reduce the need for globals; if they are still required they should be either file-scope static or declared extern in a header file.

  2. External declarations of functions implemented in this module.

  3. The header guard's #endif need be followed by a comment describing the #ifdef head guard.
    After all, it is the last directive in the file and should obvious.
3.4 Source File Content

  1. Global scope variable definitions. Global variables (both external and file-static) are problematic in a multi-threaded and/or in a reentrant server context. They should be avoided. (This is also a problem for non-const class static member variables.)

  2. File scope (static) variable definitions.

  3. Function definitions. A comment should generally precede each function definition.

4 Preprocessor

    #include <module/name>
to get public header files from a standard place. The -I option of the compiler is the best way to handle the pseudo-public "package private" header files used when constructing libraries--- it permits reorganizing the directory structure without altering source files.
4.1 Macros (#define)

Macros are almost never necessary in C++.

The debugger can deal with them symbolically, while it can't with a #define, and their scope is controlled and they only occupy a particular namespace, while #define symbols apply everywhere except inside strings.

Macros should be used to hide the ## or #param features of the preprocessor and encapsulate debugging aids such as assert(). (Code that uses these features should be rare.) If you find that you must use macros, they must be defined so that they can be used anywhere a statement can. That is, they can not end in a semicolon. To accomplish this, multi-statement macros that cannot use the comma operator should use a do/while construct:

#define      ADD(sys,val)        do { \
                if (!known_##sys(val)) \
                  add_##sys(val);\
              } while(0)

This allows ADD() to be used anywhere a statement can, even inside an if/else construct:

  if (doAdd)
    ADD(name, "Corwin");
  else
    somethingElse();

It is also robust in the face of a missing semicolon between the ADD() and the else.

This technique should not be used for paired begin/end macros. In other words, if you have macros that bracket an operation, do not put a do in the begin macro and its closing while in the end macro.

This makes any break or continue between the begin and end macro invocations relative to the hidden do/while loop, not any outer containing loop.

4.2 Conditional Compilation (#if and its ilk)

In general, avoid using #ifdef. Modularize your code so that machine dependencies are isolated to different files and beware of hard coding assumptions into your implementation.


#ifdef sun

#define USE_MOTIF
#define RPC_ONC

#elif hpux

#define USE_OPENLOOK
#define RPC_OSF

#else

#error unknown machine type

#endif
#define      BEGIN    {        // EXTREMELY BAD STYLE!!!
#define      when    break;case        // EXTREMELY BAD STYLE!!!

This makes the program unintelligible to all but the perpetrator. C++ is hard enough to read as it is.


#ifdef RPC_ONC
    doONCStuff();
#endif


#ifdef DEBUG
    if (!debug)        // #ifdef breaks standard braces rule
#endif
    {
      doSomeStuff();
      doMoreStuff();
    }

5 Identifier Naming Conventions

Identifier naming conventions make programs more understandable by making them easier to read. They also give information about the purpose of the identifier. Each subsystem should use the same conventions consistently. For example, if the variable offset holds an offset in bytes from the beginning of a file cache, the same name should not be used in the same subsystem to denote an offset in blocks from the beginning of the file.

We have made an explicit decision to not use Hungarian Notation.(2)

5.1 General Rules

Well chosen names go a long way toward making a program self-documenting. What is an obvious abbreviation to you may be baffling to others, especially in other parts of the world. Abbreviations make it hard for others to remember the spelling of your functions and variables. They also obscure the meaning of the code that uses them.

5.2 Identifier Style

Identifiers are either upper caps, mixed case, or lower case. If an identifier is upper caps, word separation in multi-word identifiers is done with an underscore (for example, RUN_QUICK). If an identifier is mixed case, it starts with a capital, and word separation is done with caps (for example, RunQuick). If an identifier is lower case, words are separated by underscore (for example, run_quick). Preprocessor identifiers and template parameters are upper case. The mixed case identifiers are global variables, function names, types (including class names), class data members, enum members. Local variables and class member functions are lower case.

Template parameter names act much like #define identifiers over the scope of the template. Making them upper case calls them out so they are readily identifiable in the body of the template.

An initial or trailing underscore should never be used in any user-program identifiers.(3)

Prefixes are given for identifiers with global scope (some packages may extend the prefixes for their identifiers):

----------------------------------------------------------------------
Prefix  Used for                                                        
----------------------------------------------------------------------
WF_     preprocessor                                                    
_WF_    hidden preprocessor (e.g., protecting symbols for header file)  
Wf      Global scope (global variables, functions, type names).         
wf      File-static scope                                               
----------------------------------------------------------------------

File-static identifiers, are the only exception: they are mixed case, but start with a lower-case prefix. (for example, wfFileStaticVar).

5.3 Namespace Clashes

The goal of this section is to provide guidance to minimize potential name clashes in C++ programs and libraries.

There are two solution strategies: (1) minimize the number of clashable names, or (2) choose clashable names that minimize the probability of a clash. Strategy (1) is preferable, but clashable names cannot be totally eliminated.

Clashable names include: external variable names, external function names, top-level class names, type names in public header files, class member names in public header files, etc. (Class member names are scoped by the class, but can clash in the scope of a derived class. Explicit scoping can be used to resolve these clashes.)

There are two kinds of name clash problem:

  1. Clashes that prevent two code modules from being linked together. This problem affects external variable names, external function names, and top-level class names.

  2. Clashes that cause client code to fail to compile. This problem affects type names in public header files, and class member names in public header files. It is most egregious in the case of names that are intended to be private, such as the names of private class members, as a new version of the header file with new private names could cause old client code to break.

Solutions:

  1. Avoiding the use of external variables and functions, in favor of class data members and function members.

  2. Minimizing the number of top-level classes, by using nested classes.

  3. Minimizing the number of private class members declared in public header files. Private class members should be defined in public header files only where clients need to perform implementation inheritance. To minimize the number of dependencies on the data representation, define a single private data member of an opaque pointer type that points to the real data representation whose structure is not published.

Exception: A top-level class name used only as a naming scope can consist entirely of a distinctive prefix.

WfRenderingContext                 (a type name)
WfPrint()                 (a function name)
WfSetTopView()                 (a function name)
WfMasterIndex                 (a variable name)
Wf::String                 (a type name --- the class name serves as prefix)

For components of the Wildfire program, prefixes begin with Wf.

5.4 Reserved Namespaces

Listed below are explicitly reserved names which should not be used in human-written code (it may be permissible for program generators to use some of these).

_[A-Z_][0-9A-Za-z_]*
_[a-z][0-9A-Za-z_]*

E[0-9A-Z][0-9A-Za-z]*                  errno values
is[a-z][0-9A-Za-z]*                  Character classification
to[a-z][0-9A-Za-z]*                  Character manipulation
LC_[0-9A-Za-z_]*                  Locale
SIG[_A-Z][0-9A-Za-z_]*                  Signals
str[a-z][0-9A-Za-z_]*                  String manipulation
mem[a-z][0-9A-Za-z_]*                  Memory manipulation
wcs[a-z][0-9A-Za-z_]*                  Wide character manipulation

Note that the first three namespaces are hard to avoid. In particular, many accessor methods naturally fall into the is* namespace, and error conditions map onto the E* namespace. Be aware of these conflicts and make sure that you are not redefining existing identifiers.

6 Using White Space

Blank lines and blank spaces improve readability by offsetting sections of code that are logically related. A blank line should always be used in the following circumstances:

The guidelines for using spaces are:

  // no space between 'strcmp' and '(',

  // but space between 'if' and '('

  if (strcmp(input_value, "done") == 0)
    return 0;

This helps to distinguish keywords from procedure calls.

  for (expr1; expr2; expr3) {
    ...;
  }
  start= (a < b ? a : b);
  end= (a > b ? a : b);
6.1 Indentation

Only four-space line indentation should be used. The exact construction of the indentation (spaces only or tabs and spaces) is left unspecified. However, you may not change the settings for hard tabs in order to accomplish this. Hard tabs must be set every 8 spaces.

If this rule was not followed tabs could not be used since they would lack a well-defined meaning.

The rules for how to indent particular language constructs are described in Statements, § 10.

6.2 Long Lines

Occasionally an expression will not fit in the available space in a line; for example, a procedure call with many arguments, or a logical expression with many conditions. Such occurrences are especially likely when blocks are nested deeply or long identifiers are used.

  if (LongLogicalTest1 || LongLogicalTest2 ||
   LongLogicalTest3) {
    ...;
  }

  a = (long_identifier_term1 --- long_identifier_term2) *
     long_identifier_term3;
If there were some correlation among the terms of the expression, it might also be written as:
  if (ThisLongExpression < 0 ||
     ThisLongExpression > max_size ||
     ThisLongExpression == SomeOtherLongExpression) {
    ...;
  }

Placing the line break after an operator alerts the reader that the expression is continued on the next line. If the break were to be done before the operator, the continuation is less obvious.

Note also that, since temporary variables are cheap (an optimizing compiler will generate similar code whether or not you use them), they can be an alternative to a complicated expression:

temp1  = LongLogicalTest1;
temp2  = LongLogicalTest2;
temp3  = LongLogicalTest3;

if (temp1 || temp2 || temp3) {
    ...;
}
6.3 Comments

Comments should be used to give an overview of the code and provide additional information that is not readily understandable from the code itself. Comments should only contain information that is germane to reading and understanding the program.

//!! When we can, replace this code with a wombat -author

This gives maintainers some idea of whom to contact. It also allows one to easily grep the source looking for unfinished areas.

6.4 Block Comments

Block comments are used to describe a file's contents, a function's behavior, data structures, and algorithms.

This would require anyone changing the text in the box to continually deal with keeping the right-hand line straight.


statements;

// another block comment

// made up of C++ style comments

statements;
/*
 * Here is a C-style block comment
 * that takes up multiple lines.
 */

statements;
6.5 Single-Line Comments

Short comments may appear on a single line indented at least to the indentation level of the code that follows.

  if (argc > 1) {
    // Get option from the command line.
    ...;
  }
6.6 Trailing Comments

Very short comments may appear on the same line as the code they describe, but should be tabbed over far enough to separate them from the statements. Trailing comments are useful for documenting declarations.

  if (a == 2)
    return WfTrue;            // special case
  else
    return is_prime(a);            // works only for odd a

7 Types

7.1 Constants
class Foo
{
 public:
    enum {
      Success = 0, Failure = -1
    };
    ...;
}

if (foothing.foo_method("Argument") == Foo::Success) ...
  for (i = 0; i < size; i++) {
    // statements using array[i];
  }
double        factors[] = {
          0.1345,
          123.23451,
          0.0
        };

const int        num_factors = sizeof factors / sizeof factors[0];

This means that if the type of an object changes, all the associated sizeof operations will continue to be correct. The parentheses are forbidden for data objects so that sizeof on types (where the compiler requires them) will be easy to see.

7.2 Use of const

Both ANSI C and C++ add a new modifier to declarations, const. This modifier declares that the specified object cannot be changed. The compiler can then optimize code, and also warn you if you do something that doesn't match the declaration.

The first example is a modifiable pointer to constant integers: foo can be changed, but what it points to cannot be. Use this form for function parameter lists when you accept a pointer to something that you do not intend to change (for example, strlen(const char *string))

const int *foo;

foo = &some_constant_integer_variable

Next is a constant pointer to a modifiable integer: the pointer cannot be changed (once initialized), but the integer it points to can be changed at will:

int *const foo = &some_integer_variable;

Finally, we have a constant pointer to a constant integer. Neither the pointer nor the integer it points to can be changed:

const int *const foo = &some_const_integer_variable;

Note that const objects can be assigned to non-const objects (thereby making a copy), and the modifiable copy can of course be changed. However, pointers to const objects cannot be assigned to pointers to non-const objects, although the converse is allowed. Both of these forms of assignments are legal:

(const int *)          = (int *);

(int *)          = (int *const);

But both of these forms are illegal:

(int *)          = (const int *);          // illegal

(int *const)          = (int *);          // illegal

When const is used in an argument list, it means that the argument will not be modified. This is especially useful when you want to pass an argument by reference, but you don't want the argument to be modified.

void
block_move(const void *source, void *destination, size_t length);

Here we are explicitly stating that the source data will not be modified, but that the destination data will be modified. (Of course, if the length is 0, then the destination won't actually be modified.)

All of these rules apply to class objects as well:

class Foo
{
  public:
    void bar1() const;
    void bar2();
};

const Foo *foo_pointer;
foo_pointer->bar1();                    // legal

foo_pointer->bar2();                    // illegal

Inside a const member function like bar1(), the this pointer is type (const Foo *const), so you really can't change the object.

However, there is a distinction between bit-wise const and logical const. A bit-wise const function truly does not modify any bits of data in the object. This is what the compiler enforces for a const member function. A logical const function modifies the bits, but not the externally visible state; for example, it may cache a value. To users of a class, it is logical, not bit-wise, const is important. However, the compiler cannot know if a modification is logically const or not.

You get around this by casting away const, for example, by casting the pointer to be a (Foo *). This should only be done if you are absolutely sure that the function remains logically const after your operation, and must always be accompanied by an explanatory comment.

7.3 struct and union Declarations

A struct should only be used for grouping data; there should be no member functions. If you want member functions, you should be using a class. Hence, structs should be pretty rare.

struct Foo {
    int    size;        // Measured in inches
    char    *name;        // Label on icon
    ...;
};

Note that struct and enum tag names are valid types in C++, so the following common C idiom is obsoleted because foo can be used wherever you used to use Foo:

      typedef struct foo {                    /* Obsolete C idiom */
          ...;
      } Foo;
7.4 enum Declarations
class Color
{
  public:
    enum Component {
      Red, Green, Blue
    };
};
Color::Component foo = Color::Red;
const int        Red    = 0;              // Bad Form
const int        Blue    = 1;
const ink        Green    = 2;
enum ColorComponent {                          // Much Better
      Red,
      Blue,
      Green
};  
enum ColorComponent {                          // Explicit values can be given
      Red    = 0x10,                // to each item as well...
      Blue    = 0x20,
      Green    = 0x40
};
This causes ColorComponent to become a distinct type that is type-checked by the compiler. Values of type ColorComponent will be automatically converted to int as needed, but an int cannot be changed to a ColorComponent without a cast.

enum Color {
      Red,
      Blue,
      Green,
      LastColor = green
};
This trick should only be used when you need the number of elements, and will only work if none of the enumeration literals are explicitly assigned values.
7.5 Classes
7.6 class Declarations

This is to help users of vi, which has a simple "go to beginning of paragraph" command, and which recognizes such a line as a paragraph beginning. Thus, you can, in the middle of a long class declaration, go to the beginning of the class with a simple command. The usefulness of this feature was deemed to outweigh its inconsistency (also see. section 9.2).

The ordering is "most public first" so people who only wish to use the class can stop reading when they reach protected/private.

class Foo: public Blah, private Bar
{
  public:
             Foo();              // be sure to use better
            ~Foo();
    int         get_size(int    phase_of_moon) const;              // comments than these.
    int         set_size(int    new_size);
    virtual int         override_me() = 0;

  protected:

    static int         hidden_get_size();

  private:
    int         Size;              // meaningful comment
    void         utility_method();
};

Public and protected data members affect all derived classes and violate the basic object oriented philosophy of data hiding.

7.7 Class Constructors and Destructors

Constructors and destructors are used for initializing and destroying objects and for converting an object of one type to another type. There are lots of rules and exceptions to the use of constructors and destructors in C++, and programs that rely heavily on constructors being called implicitly are hard to understand and maintain. Be careful when using this feature of C++!

Be particularly careful when writing constructors that accept only one argument (or use default arguments that may allow a multi-argument constructor to be used as if it did) since such constructors specify a conversion from their argument type to the type of its class. Such constructors need not be called explicitly and can lead to unintended implicit uses of conversions. There are also other difficulties with constructors and destructors being called implicitly by the compiler when initializing references and when copying objects.

Things to do to avoid problems with constructors and destructors:

This will cause the compiler to generate a compile-time error if a
member-wise copy is attempted.


class Foo
{
  public:
           Foo();  
          ~Foo();

    int       get_size(int phase_of_moon) const;      

  private:

    ...

};

BusStop::BusStop() :
    PeopleQueue(30),
    Port("Default")
{
    ...;
}
BusStop::BusStop(char *some_argument) :
    PeopleQueue(30),
    Port(some_argument)
{
    ...;
}
7.8 Automatically-Provided Member Functions

C++ automatically provides the following methods for your classes (unless you provide your own):


class Empty { };              // You write this ...
class Empty              // You really get this ...
{
   public:
                 Empty() { }                 // constructor
            ~Empty() { }                 // destructor
                 Empty(const Empty &rhs);                 // copy constructor
        Empty         &operator=(const Empty &rhs);  // assignment operator
        Empty         *operator&();                 // address-of
    const Empty            *operator&() const;                 //      operators
};

Every class writer must consider whether the default functions are correct for that class. If they are, a comment must be provided where the function would be declared so that a reader of the class knows that the issue was considered, not forgotten.

If a class has no valid meaning for these functions, you should declare an implementation in the private section of the class. Such a function should probably call abort(), throw an exception, or otherwise generate a visible runtime error.

This ensures that the compiler will not use the default implementations, that it will not allow users to invoke that function, and that if a member function uses it by accident, it may at least be caught at runtime.

It is a good idea to always define a constructor, copy constructor, and a destructor for every class you write, even if they don't do anything.

7.9 Function Overloading

Overloading function names must only be done if the functions do essentially the same thing. If they do not, they must not share a name. Declarations of overloaded functions should be grouped together.

7.10 Operator Overloading

Deciding when to overload operators requires careful thought. Operator overloading does not simply create a short-hand for an operation --- it creates a set of expectations in the mind of the reader, and inherits precedence from the language.

foo->member()            // should be identical to
(*foo).member()            // which should also be identical to
foo[0].member()
Overloading == requires overloading !=, and vice versa.

If the expression (a != b) is not equivalent to !(a == b) we have unacceptably astonished the user.

This allows the user of the class to determine if a cast is more readable than a member function invocation, for example, to avoid casts that look like they should be automatically done by the compiler, but are explicit to invoke the cast.

7.11 Protected items

When a member of a class is declared protected, to clients of the class it is as if the member were private. Subclasses, however, can access the member as if it were declared private to them. This means that a subclass can access the member, but only as one of its own private fields. Specifically, it cannot access a protected field of its parent class via a pointer to the parent class, only via a pointer to itself (or a descendant).

7.12 friends

When using friends remember that private member access rights do not extend to subclasses of the friend class. Any method that depends on friend access to another class cannot be rewritten in a subclass.

The use of friend class or method declarations is discouraged, since the use of friend breaks the separation between the interface and the implementation. The only non-discouraged uses are for binary operators and for cooperating classes that need private communication, such as container/iterator sets.

7.12.1 friend Classes

class Secret
{
  private:   
    int    Data;

    int    method();

  friend        Base;
};

class Base
{
  protected:
    int    secret_data(Secret *income_info);
    int    secret_method(Secret *income_info);
};

int
Base::secret_data(Secret *income_info)
{
    return income_info->Data;
}

int
Base::seccet_method(Secret *income_info)
{
    return income_info->method();
}
Methods of the Secret class should not be accessed directly by methods of the friend class Base. Direct access makes it hard to cut-and-paste code from the base to a derived class:
void
Base::an_example(Secret *income_info)
{
    int  a = income_info->Data;                  // BAD:  Direct access is wrong
    int  b = secret_data(income_info);                  // GOOD: Use accessor functions!
}
7.12.2 friend Methods

Binary operators, except assignment operators, must almost always be friend member functions.

class String
{
  public:
            String(const char *);

  friend int operator==(const String &,const String &);
  friend int operator!=(const String &,const String &)
            { return !(string1 == string2); }
};

If the operator== were a member function, the conversion operator would only allow (String == char *) but not (char * == String) This would be quite surprising to the user of the class. Making operator== a friend member function allows the conversion implied by the constructor to work on both sides of the operator.

7.13 Templates

  template<class TYPE>
  class List
  {
    public:

    TYPE     front();
    ...
  };
  
template<class TYPE>
TYPE
List<TYPE>::front()
{
    ...;
}


  template<class TYPE, unsigned int SIZE>
  class Vector
  {
    private:

    Type     Data[SIZE];
  }
Here, the type stored by the Vector template class is named TYPE because it is a general purpose parameter. The SIZE parameter, however, is specific since it ultimately determines the size of a Vector<TYPE> object; its name reflects this specific purpose.

8 Variables

8.1 Placement of Declarations

Since C++ gives the programmer the freedom to place a variable definition wherever a statement can appear, they should be placed near their use. For efficiency, it may be desirable to invoke constructors only when necessary. Thus function code may define some local variables, do some parameter checking, and once the sanity checks have passed then define the class instances and invoke their constructors.

Where possible, you should initialize variables when they are defined.

char      *Foo[]             = { "Hello", ", ", "World\n" };

int      max_string_length            = BUFSIZE;

String      path("/usr/tmp/gogin");

This minimizes "used before initialization" bugs.

8.2 extern Declarations

class ClassName;

Place them in header files to prevent inconsistent declarations in each source file that uses it.

8.3 Indentation of Variables

int*      ip;
String&      str;

This style, though currently popular, lies about syntax, since int* p1, p2; implies p1 and p2 are both pointers, but one is not. Since we do not accept that only one variable should be declared on a line as a fixed rule, we cannot allow a style that lies about the meaning of multiple declarations on a line.


int       count             = 0;

char      **pointer_to_string            = &foo;
8.4 Number of Variables per Line
int      level    = 0;          // indentation level
int      size    = 0;          // size of symbol table
int      lines    = 0;          // lines read from input
is preferred over:
int      level, size, lines;              // Not Recommended
The latter style is frequently used for declaring several temporary variables of primitive types such as int or char, or strongly matched variables, such as x, y pairs, where changing the type of one requires changing the type of all.

long        db, OpenDb();            // Bad
long        db;            // Better
long        OpenDb();            // but still not recommended
#include <admintools/database.hh>                    // Best

Databae        db;

You should use a header file that contains an external function declaration of OpenDb() instead of hard-coding its definition in your source file.

8.5 Definitions Hiding Other Definitions

void
WfFunction()
{
    static int      boggle_count;          // Count of boggles in formyls

    if (condition) {
      int    boggle_count;          // Bad --- this hides the above instance
    }
}
8.6 Initialized Variables

This style is purposefully analogous to the function declaration style. It may look strange to some at first, but in the context of a complete program, it lends itself to an overall pleasing appearance of the code.


Cat      cats[] = {
        "Shamus",
        "Macka",
        "Tigger",
        "Xenephon",
      };

char     *name = "Framus";

9 Functions

9.1 Function Declarations
SomeType        *WfLibraryFunctionName(void *current_pointer,
				       size_t desired_new_size);
However, if a function takes only a few parameters, the declaration can be strung onto one line (if it fits):
int    strcmp(const char *s1, const char *s2);

We usually use a one-line-per-declaration form for several reasons.
(1) It is easy to comment the individual parameters,
(2) It makes it easier to read when there many parameters.
(3) It is easy to reorder the parameters, or to add one. The closing ); is on a line by itself to make it easier to add a new parameter at the end of the parameter list.
(4) It is designed to be visually similar to the other declaration statements.
(5) It works well with long identifier names.
However, with simple declarations the weight seems too great for the benefit.

int      getchar();
The ANSI C-compatible construct of (void) for a function with no parameters must only be used in header files designed to be included by both C and C++ (See Interaction with C on page 39.)

This provides internal documentation that can help people remember what a parameter is supposed to represent. It also allows comments in the file to refer to the parameter by name.

9.2 Function Definitions

Small functions promote clarity and correctness.


char *
WfString::cstr()
{
    // ...
}

void    
WfFoo(int param1, int /* optional_param2 */)
{
    // ...;
}

int
SystemInformationObject::get_number_of_users(Name host_name, Time idle_time)
{
    int    some_variable;

    statements;
    ...;
}

10 Statements



  argv++; argc--;                        // Multiple statements are bad

  if (err) 

    fprintf(stderr, "error\n"), exit(1);                      // Using `,' is worse



argv++;                          // The right way

argc--;

if (err) {

    fprintf(stderr, "error\n");

    exit(1);

}
10.1 Compound Statements

Compound statements are statements that contain lists of statements enclosed in braces.



{

    // New Block Scope

    int some_variable;



    statements;

}



if (condition) {                // braces required; following "if" is two lines

    if (other_condition)            // braces not required -- only one line follows

      statement;

}
Braces are not required for control structures with single-line bodies, except for do/while loops, whose always require enclosing braces. This single-line rule includes a full if/else/else/... statement:
if (condition)

    single_thing();

else if (other_condition)

    other_thing();

else

    final_thing();
Note that this is a "single-line rule", not a "single statement rule". It applies to things that fit on a single line.

Single-statement bodies are too simple to be worth the weight of the extra curlies.

10.2 if/else Statements

An else clause is joined to any preceding close curly brace that is part of its if. See also Comparing against Zero on page 34.

if (condition) {

    ...;

}
if (condition) {

    ...;

} else {

    ...;

}
if (condition) {

    ...;

} else if (condition) {

    ...;

} else {

    ...;

}
10.3 for Statements
  for (initialization; condition; update) {

    ...;

  }

If the three parts of the control structure of a for statement do not fit on one line, they each should be placed on separate lines or broken out of the loop:


for   (longinitialization;

      longcondition;

      longupdate

) {

    ...;

}
longinitialization;                  // Alternate form...

for (; condition; update) {

    ...;

}

When using the comma operator in the initialization or update clauses of a for statement, no more than two variables should be updated. If there are more, it is better to use separate statements outside the for loop (for the initialization clause), or at the end of the loop (for the update clause).

10.4 do Statements
  do {

    ...;

  } while (condition);
10.5 while Statements
  while (condition) {

    ...;

  }
10.6 Infinite Loops

The infinite loop is written using a for loop:

  for (;;) {

    ...;

  }

This form is better than the functionally equivalent while (TRUE) or while (1) since they imply a test against TRUE (or 1), which is neither necessary nor meaningful (if TRUE ever is not true, then we are all in real trouble).

10.7 Empty Loops

Loops that have no body must use the continue keyword to make it obvious that this was intentional.

while (*string_pointer++ != '\0')
    continue;
10.8 switch Statements

We use this indentation since the labels are conceptually part of the switch, but indenting by a full indent would mean that all code would be indented by two indent levels, which would be too much.

This prevents a fall-through error if another case is added after the last one.

    // FALLTHROUGH
where the break would normally be expected.

This makes it clear to the reader that it is this fallthrough was intentional.

Some C++ compilers will warn you if such a switch is missing a member. This warning will call out situations where you add a member to an enum definition but forget to add a case for it in a given switch. This is usually an error.



  switch (pixel_color) {

    case Color::blue:

    ...;

    break;



    case Color::red:

    found_red_one = TRUE;

    // FALLTHROUGH

    case Color::purple:

   {

      int    local_variable;

      ...;

      break;

   }



    default:          // handles green, mauve, and pink colors...

    ...;

    break;

  }

This is to catch unexpected inputs in more graceful ways than failing unpredictably somewhere else in the code.

10.9 goto Statements

While not completely avoidable, use of goto is discouraged. In many cases, breaking a procedure into smaller pieces, or using a different language construct will enable elimination of a goto.

The main place where a goto can be usefully employed is to break out of several nested levels of switch, for, or while nesting when an error is detected. In future versions of C++ exceptions should be used.

    for (...) {    

      for (...) {    

        ...;

        if (disaster) {

          goto error;

        }

      }

    }

    return true;

error:        // clean up the mess


    if (pool.is_empty()) {

      goto label;        // VERY WRONG

    }

    for (...) {

      Object obj;

label:

    }

Branching into a block may bypass the constructor invocations and initializations for the automatic variables in the block. Some compilers treat this as an error, others blissfully ignore it.

10.10 return Statements

The expressions associated with return statements are not required to be enclosed in parentheses.

10.11 try/catch Statements

The proposed C++ syntax for exception handling is not to be used in shared code at this time (5). This section specifies the syntax which will eventually be used to support the feature, but should be avoided in near term code.


We need a section describing an alternate way of handling exceptions so that we can use "boilerplate" code to do the right thing now and help ease the transition to exceptions in the future.

  try {
    statements;
  } catch (type) {
    statements;
  } catch (...) {                // This is the literal "..."
    statements;
  }

11 Miscellaneous

11.1 General Comments & Rules

This can used to support C programs being linked to C++ libraries without the use of a C++-aware linker. See Interaction with C on page 39.



void printf(const char *, ...);


if (BooleanExpression)

    return WfTrue;

else

    return WfFalse;
with:
return BooleanExpression;
Similarly,
if (condition)                    // Awkward

    return x;

return y;
is usually clearer when written as:

if (condition)                    // Clear

    return x;

else

    return y;
or
return (condition ? x : y);


if (x = y) {                    // Confusing

    ...;

}
it is hard to tell whether the programmer really meant assignment or the equality test.
Instead, use
if ((x = y) != 0) {                    // Understandable

    ...;

}
or something similar if the assignment is needed within the if statement. There is a time and a place for embedded assignments. The ++ and -- operators count as assignments. So, for many purposes, do functions with side effects.

As an aside, many of today's compilers can produce faster and/or smaller code if you don't use embedded assignments. If you are using such convoluted code to "help the compiler optimize the program", you may be doing more harm than good.

11.2 Limits on numeric precision
11.3 Comparing against Zero

Comparisons against zero values must be driven by the type of the expression. This section shows the valid ways to compare for given types. Anything not permitted here is forbidden.

When maintaining code it is very useful to be able to tell what "units" a comparison is using. As an example, an equality test against the constant 0 implies that the variable being tested is an integral type; testing against an explicit NULL implies a pointer comparison, while an implied NULL implies a boolean relationship.

(See if/else Statements on page 28)

11.3.1 Boolean

Choose variable names that make sense when used as a "condition". For example,

if (pool.is_empty())

makes sense, while

if (pool.state())

just confuses people. The generic form for Boolean comparisons are

if (boolean_variable)

if (!boolean_variable)

Note that this is the only case where an implicit test is allowed; all other comparisons (int, char, pointers, etc.) must explicitly compare against a value of their type. A standalone variable should always imply a boolean value.

Never use the boolean negation operator! with non-boolean expressions. In particular, never use it to test for a null pointer or to test for success of the strcmp() function:

if (!strcmp(s1, s2))                     // Bad

if (strcmp(s1, s2) == 0)                    // Good
11.3.2 Character
if (char_variable != '\0')

while (*char_pointer != '\0')
11.3.3 Integral
if (integer_variable == 0)

if (integer_variable != 0)
11.3.4 Floating Point
if (floating_point_variable > 0.0)

Always exercise care when comparing floating point values. It is generally not a good idea to use equality or inequality type comparisons. Use relative comparisons, possibly bounded by a "fuzz" factor in cases where an equality-like functionality is required.

11.3.5 Pointer
if (pointer_variable != NULL    )                  // Always use an explicit test vs. NULL

Implicit comparisons are not allowed:

if (pointer_variable)                      // WRONG
11.4 Use and Misuse of inline

Since a client using your inlined interface actually compiles your code into their executable, you can never change this part of your implementation. And no one else can provide an alternate implementation.

Within your implementation there may be places where you need to use inlines. Be aware that the use of inlines can easily make your (and other people's) code larger, which can overcome any efficiency gains. Here are some guidelines to help do it right.

class Dummy
{
public:
    inline int        do_something();
};

inline int
Dummy::do_something()
{
    // ... several lines of code
}
11.5 References vs. Pointers

The advantages of using references over pointers include (from [25]):

The advantages of using pointers over references include:

Use references where you reasonably can --- that is, when assigning a name to an already existing singular object. Use pointers for any of the other N meanings that pointers have traditionally held.

11.6 Portability

The advantages of portable code are well known and little appreciated. This section gives some guidelines for writing portable code, where the definition of portable is a source file that can be compiled and executed on different machines with the only source change being the inclusion of (possibly different) header files.


x &= 0177770
will clear only the three right most bits of a 16 bit int on a PDP-11. On a VAX (with 32 bit ints) it will also clear the entire upper 16 bits. Use
x &= ~07
instead, which works as expected on all machines. The operator | does not have these problems, nor do bit-fields.

  1. first look in the directory the build is being done in,

  2. then use the -I directory list
    while on System V derived systems it means

  3. first look in the directory the current file is in,

  4. then the directory the build is being done in,

  5. then the -I directory list
    Note that the BSD systems don't implicitly look in the directory that contains the source file if that directory isn't the one where the build is being done.
    As an example, assume a directory with the files foo.cc, foo.hh, and the subdirectory obj. Also assume that foo.cc does an #include "foo.hh". The command
        CC -c -o obj/foo.o foo.cc
will work on both systems, but
        cd obj

        CC -c -o foo.o ../foo.cc
will fail to find foo.hh on many BSD based systems.

12 Interaction with C

12.1 ANSI-C/C++ include files:

This is the list of the header files that ANSI-C (and thus C++) requires be provided by the language implementation. Use of any other "system" header file may not be portable.

Uses of these C header files are not required to be bracketed with the extern "C" { } construct.

#include <stddef.h>
#include <stdio.h>
#include <stdarg.h>
#include <stdlib.h>
#include <locale.h>
#include <ctype.h>
#include <string.h>
#include <time.h>
#include <limits.h>
#include <errno.h>
#include <assert.h>
#include <signal.h>
#include <setjmp.h>
#include <math.h>
#include <float.h>
12.2 Including C++ Header Files in C programs

Header files that must be included by both C and C++ source have slightly different rules than do C++-only header files.

This project has no interest in any non-ANSI-C dialects of the C language.

12.3 Including C Header Files in C++

The inclusion of every non-C++ header file must be surrounded by the extern "C" { } construct.

Note that the header files enumerated above (ANSI-C/C++ include files:, § 12.1) are considered C++ header files, and not subject to this rule.

extern "C" {
#include <somefile.h>
#include <otherfile.h>
}
12.4 C Code calling C++ Libraries

Function calls that are intended to be called from C that take input-only struct arguments may wish to use pointers, since C does not have references. Such pointers must, of course, be declared const.

In order to be able to export a C++ library to C users, you'd like to let the C users link to the library using the regular C/Unix linker.

The CC linker (also called the "C++ pre-linker" or "patch" or "munch") is the part of the C++ system that makes static constructors and static destructors work. These are the routines that get called if you have a global (or a local static) variable whose class has a constructor. The C++ pre-linker paws through your object files looking for the right pattern of mangled name that indicates constructors and/or destructors that need to be called for that file, puts them all together into an initialization routine named _main(), and links that synthesized _main() into your program. Cfront has inserted a call to _main() at the start of your main C++ program.

So, on SunOS 4.x, if your library has any global, file-static or local-static variables whose classes have constructors or destructors, you must use the CC command to link any application to that library.

SunOS 5.0 object files allow libraries to have a .init section that gets called to initialize the library. The constructors would be put into this section (and not _main()), thus avoiding the linking problems mentioned above.


Footnotes

(1)
The real, legal copyright text is still being created. It will be much longer and wordier than that which is shown here. A tool exists that allows us to simply put a comment containing the word "Copyright" somewhere in it at the beginning of the file and have the correct current copyright notice inserted automatically at a later date.
(2)
Hungarian notation is that style that has a identifier modifier (either suffix or prefix) for every potentially interesting fact about that identifier, for example, a "P" suffix for pointer, "l" prefix for local, "i" prefix for an int, "a" suffix for arrays, so each identifier can look something like liindexaP.
(3)
This does not apply to names used in libraries or generated by utilities.
(4)
Lining up the initializations, as shown in the examples, is a matter of personal preference.
(5)
As soon as exceptions are available in the compilers we use, this restriction will be revisited.