Smart Pointers Overview

An evolutionary approach to a classic C++ idea

by John M. Dlugosz

My interest in smart pointers grew out of my interest in multithreading. When data is being passed between threads, such as in the case of parameters to an asynchronous function, there is a serious ownership issue.

But, even in more mundane programming, the problem arises. Strings are a case in point: in the old days of C, returning a string from a function was a royal pain. There were numerous approaches to this, such as returning a dynamically allocated string that must then be freed by the caller, returning a static buffer that gets reused with each call, or passing a destination buffer and length as parameters. With discipline from the programmers involved, and with a little experience of getting burned a few times, it generally worked. But, itís a constant drag on the programmerís attention that ought to be better spent on other things.

With a modern string class, we have the power we see in most other high-level languages, where returning a string from a function is just as simple as returning an integer. You just return the thing, no muss, no fuss.

So, reference counting via smart pointers is the way to go in C++. There are many widespread uses of this device, once a reusable smart pointer component is available. The challenge is to use modern C++ (in particular, templates) to produce a serious smart pointer class that stands up to some rather challenging design requirements.

Reference Counting vs. Garbage Collection

One day a student came to Moon and said: "I understand how to make a better garbage collector. We must keep a reference count of the pointers to each cons."

Moon patiently told the student the following story:

"One day a student came to Moon and said: ĎI understand how to make a better garbage collectorÖ

óAn AI Koan, courtesy of The New Hackerís Dictionary a.k.a. The Jargon File.

Reference counting has a problem with recursive or circular references. As the Koan from an ancient LISP culture shows, the problem has long been understood. Garbage collection is not subject to this particular problem, so fans of garbage collection in languages tend to ridicule the proponents of reference counting.

But, garbage collection is not at all the same thing. C++ has a very distinctive feature that sets it apart from other contemporary object-oriented languages: value semantics. In Smalltalk, LISP, Object Pascal, and Java, for example, objects have reference semantics. Assigning one variable to another is a pointer operation, not an operation on the whole object.

In contrast, the assignment operator in C++ operates on the whole object, and pointers, if desired, are treated separately. This general idea gave us constructors and destructors, and led to the wonderful concept of resource acquisition as initialization. This is one of the great strengths of the C++ philosophy, and we should embrace it rather than try to circumvent it.

The meaning of garbage collection in C++ is to be a simulation of infinite memory. If the system can ascertain that a particular object will never be used again, it can swap it out to dev\nul for all anyone cares. Point is, the destructor is not called.

Reference counting, on the other hand, is a system of knowing exactly when the last use of an object is extinguished, and calling the destructor at that precise point in the execution.

Picture a class that represents a window on the screen. When the program is done using that window, why does it remain on the screen? Because the garbage collector hasnít noticed yet. For the concept of resource acquisition as initialization, we must have timely destruction. Objects are not just memory to be scavenged when we run lowóobjects in C++ are files, windows, communication channels, and other things with real effects due to destruction.

Destruction when the last user is finished is like a room where the last one out turns off the lights. The responsibility is shared among all users of the object. Putting this logic in the pointer itself is a very O-O concept, because the object is taking care of itself. It's like the light switch knowing when the last person left the room, and turning itself off.

The End of Innocence

A serious reference counting pointer in C++ needs to overcome this exact problem. The good news is, Iíve solved it. My smart pointer system handles circular references.

Thatís a doozie, but not the only problem. You can have multiple pointers to the same object in C++, where the values of the pointers donít match.

class A { /* Ö */ };
class B1 : public A { /* Ö */ };
class B2 : public A { /* Ö */ };
class C : public B1, public B2 { /* Ö */ };

C object;
B1* p1= &object;
B2* p2= &object;

Here, p1 and p2 point to the same object, but canít be the same address. With smart pointers in the same situation, you find that they are of different types, and hold different values, yet must realize that they have a common lifetime and hence cooperate with ownership in exactly the same manner as two smart pointers of the same type and same pointer value.

If thatís not enough, consider const pointers. Or rather, pointers to constants. Point is, const can be used in two places within a pointer definition, with different meanings.

const char* s1;
char* const s2;

The smart pointers need to have this same concept, with compile-time checking of constness in both cases.

If thatís not enough, there are two kinds of behavior that are often associated with smart pointers. One behavior is aliasing, where two smart pointers reference the same object and a change in one is seen by the other. On the other hand, you may prefer copy-on-write behavior, which allows efficient passing and returning of large values, and full value semantics. This is what you expect of a string class. Naturally, we want both types of behavior.

As a minor issue, but one that showed up during my experiments, the smart pointer should support the NULL value.

Next, the smart pointer should be able to accommodate data types that were never designed with smart pointers in mind. In particular, classes from other libraries should be handled by these smart pointers.

Oh, and remember the original need I mentioned for smart pointers? They must be thread safe.

The Tour Starts Here

This article explores the design of a smart pointer system that satisfies the above requirements. Future articles will explore individual issues and their impact on the implementation in more depth. But you donít have to wait: complete source code is available from my web site.

What Do Horses and Cows Have To Do With It?

There are three different kinds of smart pointers collectively called handles, that fundamentally behave in the way you expect a smart pointer should. Specifically, they all have an operator-> that provides access to the underlying object.

handle<C> h (new C);
handle<C> h= new C; //canít use Ď=í form.
C* p= new C;
h->foo();  //access through handle
p->foo();  //access through pointer

Note: the examples use the parenthesis form of initialization (formally called direct initialization) rather than the Ď=í form (copy initialization) because the explicit keyword on the constructor requires this. You could write the regular pointer with direct initialization too; e.g. C*p(new C); but most people donít write it that way.

As you can see, handle is a template which allows you to specify the type of the smart pointer. There are actually three such templates: handle, const_handle, and cow.

The const_handle is just like a handle, but provides read-only access. Consider the analogous case with regular pointers:

const C* p2= new C;
const_handle<C> h2 (new C);
p2->set_value (10);
x= p2->get_value();
h2->set_value (10);
x= h2->get_value();

p2= p;  //OK.  Adds const.
h2= h;  //OK.

p= p2;  //compile-time erroró
h= h2;  //attempt to remove const.

The smart pointer behaves analogously to the regular pointer. Specifically, operator-> on the const_handle returns a const pointer, while operator-> on the regular handle returns an unqualified pointer. Meanwhile, you may implicitly convert a const_handle to the same kind of handle, but not vice-versa. This is done by using inheritance.

template <typename T>
class const_handle {
public:
   const T* operator->() const;
   };

template <typename T>
class handle : public const_handle<T> {
public:
   T* operator->() const;
   };

A regular handle can do anything a const_handle can, and then some. So making it a derived class is a perfectly sensible thing to do. The less-restricted form of operator-> hides the one in the base class.

Notice that both versions of operator-> are const functions. Why isnít the derived one written

   T* operator->() const;
instead? Because pointers have two kinds of const attached to them. The constness of the thing being controlled by the pointer is distinct from the pointer itself. The dereference operation is a const function because dereferencing doesnít change the value of the pointer. The const on member functions of the smart pointer are used to distinguish this other dimension of constness, as follows:


C* p= new C;	//full access
const C* p2= new C;	//const object
C* const p3= new C;	//const pointer
const C* const p4= new C;	//const in both places

handle<C> h (new C);	//full access
const_handle<C> h2 (new C);	//const object
const handle<C> h3 (new C);	//const pointer
const const_handle<C> h4 (new C);	//const in both places

Note: the same applies to volatile, but my handle classes donít model this.

So, the difference between const_handle and handle is clear enough. So what of cow?

The requirements stated a need for both alias and copy-on-write functionality. If you use operator-> on a handle, you get the object, which may be pointed to by other handles. It doesnít matter to handle. The operator-> just returns the controlled object, and doesnít care whether other pointers are also pointing to it. This is the simplest and most straightforward implementation, and corresponds to the alias requirement. This is also how regular pointers work.

The other behavior is copy-on-write. In this case, an operator-> would check to see if there were other references, and if so, perform a copy first. The operator-> always returns a unique value, never a shared value. A different implementation of operator-> calls for another class. The template cow is also derived from const_handle, and works the same as handle, except for the copy-on-write checking.

There is no distinction between the two behaviors if the object is not written to, so only a single kind of const_handle exists. Notice that any use of operator-> on the cow handle will cause lazy-copying to be triggeredóit canít tell if you are calling a const member or not, so it assumes the worst. Because of this, it is more efficient to use the const_handle when you are not going to actually change the object. Remember, you can pass a cow<T> to any function taking a const_handle<T> parameter.

What You Donít Own You Can Borrow.

Iím tinkering with an experimental Win32 programming framework called Tomahawk. One of the design goals is to totally eliminate all ownership issues. I hate the way MFC and OWL both make you new various objects, and donít make it clear when ownership is held by some enclosing object and when it isnít. Most people just give up and never delete any framework object. This prevents duplicate frees, but causes memory and resource leaks, too. Clearly, smart pointers provide the answer.

I decided that all relationships between objects in Tomahawk, including such things as child windows, GDI objects, and system resources would be done through reference-counting, and most (if not all) functions work with value semantics and lazy copying (like the way a string class works). In short, no ownership problems, ever.

There were minor usability issues that I quickly faced, and tuned the smart pointer design appropriately. A major problem remained, though. The dreaded circular reference problem. Now, thanks to foreshadowing, you know I solved this problem, and is in fact the reason Iím writing this article. But when I first started, I felt like I was pinned against a stone wall, trapped between contradicting design goals.

The problem is that of child windows. More generally, it applies to any set of windows that send messages to each other, but I studied the problem specifically with the case of child windows. The parent window points to the child, and the child needs a pointer back to the parent. Yes, I tried doing without one or the other as actual pointers, using child ID numbers only or some such. But it was inefficient and didnít generalize. And, there was still a problem with multithreaded programs.

My attempts to keep track of the parent window using other means and special considerations eventually generalized to the concept of the baro handle. baro stands for back-reference object.

Hereís how the idea works. A normal smart pointer does two things: It provides access to another object, and it also controls the lifetime of that object. But, circular references cause problems, because two objects that refer to each other also own each other. Thatís not what weíre trying to model, though. The framework owns both windows, and the parent owns the child, but the child doesnít own the parent. But, the child needs to refer to the parent.

Most people who use smart pointers switch back to regular dumb pointers when they need to refer to the object without participating in that objectís lifetime. But that leads to dangling pointers. Suppose the parent window were destroyed by GUI interaction, but the child window, on its own thread, was in the middle of processing a message and, before it knows its being destroyed too, tries to refer to the parent. Woops.

The solution is not to use dumb pointers, but to use a different kind of smart pointer.

A baro works with the smart-pointer system so is aware of the objectís lifetime, but does not own the object. That is, outstanding baro handles to an object donít prevent the object from destructing. When all cows, handles, and const_handles to the object disappear, the object is destroyed. The baro handles donít count.

You might imagine that a baro implements yet another form of operator-> that checks for the continued existence of the object before returning. But thatís not the best solution, because of the threading issues. What if you call foo through a baro handle, and before foo returns, another thread drops the last regular handle, destroying the object? In short, a baro doesnít own the object, but when the object is actually in use, the object does need to be owned, until the call returns.

Instead, a baro doesnít have an operator-> at all! You cannot dereference a baro. What you can do is construct a handle, const_handle, or cow using a baro handle. The handle refers to the same object as the baro, assuming there was one, and hence gives you ownership. You then use the object through the handle, and when youíre done, you let that handle go out of scope.

baro<C> borrower (new C);
//... later
{
handle<C> hh (borrower);
hh->foo();
// destroy hh at end of the block.
}

The error checking takes place during the construction of a handle from a baro, and this constructor throws an exception if the object no longer exists. Once the handle is constructed, the code can be sure of the objectís continued existence. Is that elegant or what?

So back to our pathological window example: The parent owns the child, but the child only borrows the parent. The parent is destroyed when you click in the [x] in the corner of the window, but the child doesnít know that yet because of the inherent synchronization problems of multithreaded programs. Instead, when the child needs to send a message to the parent, it uses code like in the above sample. Its parent pointer is a baro, and upon trying to create a useable handle from it, finds out in no uncertain terms that the parent no longer exists.

The dangling pointer is always caught at run-time. There are no more stray pointer writes or other mysterious cancers in the code. This is useful in C++ thanks to exceptions, and much more efficient than tracking and nulling all outstanding references when an object is destroyed.

There is also a const_baro template, which only allows the creation of a const_handle. A regular baro can seed any kind of handle. This makes a total of five related smart pointers in this system.

Casting and Converting

Pointers can be converted to base class pointers, and can even be downcasted again, if you dare.

I need to pass a handle<Derived> to a function that takes a handle<Base>. Thatís rather simple actually, thanks to template member functions. The std::auto_ptr in the ANSI/ISO standard library does exactly that. Unfortunately, the Microsoft compiler doesnít implement template member functions.

I came up with two work-aroundís. First, I made a handle to a derived type another class, rather than a simple handle<Derived>. This used a template to create a class derived from handle<Derived> (what I really wanted) that also includes an implicit conversion function to handle<Base> as one of its members. Yuck, but it works fine.

A solution that doesnít intrude on the declaration of the needed handle types also led to the general suite of handle casts.

Look at the correspondence between handles and built-in pointers using this system:

handle<Derived> hder;
handle<Base> hbase;
Derived * der;
Base * base;

der= static_cast<Derived*> (base);  //unsafe down-cast
hder= static_handle_cast<handle<Derived> > (hbase);

der= dynamic_cast<Derived*> (base);  //safe
hder= dynamic_handle_cast<handle<Derived> > (hbase);

Basically, there are handle_cast forms that correspond to the keyword-style casting operators, with the same meanings. The thing in the angle brackets can be any of the five handle types on any object type, and works if the const rules of handles are satisfied and if the corresponding pointer cast works on the referenced object.

To get around the lack of implicit conversions (a flaw in the compiler, not in my design), I added an implicit_handle_cast form that only performs casts that should have been done implicitly. It wonít do anything dangerousóonly allows me to convert derived-to-base.

Because pointers and handles can be converted, itís possible for two handles of different types to point to the same object. In the above example, both hder and hbase point to the same object, and the lifetime issues need to deal with this.

Lost & Found

Itís possible, of course, to construct a smart pointer from a regular pointer. But, if that object is already "in the system", then it should understand that there are other smart pointers to this object already, and co-operate with them.

For example, we wish everyone would write using smart pointers alone:

handle<C> original (new C);
handle<C> copy;
copy= original;

but what sometimes happens is this:

handle<C> original (new C);
C* regular= original.operator->();
handle<C> copy (regular);

In the latter case, where the regular pointer is extracted from the smart pointer and then another smart pointer is constructed, it should have the same result as the first case, where one smart pointer is constructed directly from another. Naturally, this needs to handle derived-to-base pointer issues at the same time. If the reference count is part of the object, this lost & found effect is no big deal. But when handling objects that were not designed for reference counting, and in coping with the derived/base pointer issues, we need to make sure the design covers this point.

Conclusion

In order for smart pointers in C++ to solve specific issues of lifetime and ownership, they need to cope with the exact situations that make ownership issues so difficult. A serious smart pointer design must deal with a number of important issues. But the resulting component is wonderful to have.