Code Reviews #4
Elephant Hunting

John M. Dlugosz

Finding your prey can be easy. Then choose your weapons carefully.

Last month, the Guru talked with his students about the differences between antibugging and abugging.

“So what’s the difference between antibugging and abugging?” asked the second.

“Generally, antibugging deals with design-level issues, and abugging deals with implementation-level issues. Anything you see is probably abugging. Antibugging is like the elephant.”

“The elephant that’s not here?” asked the first.

“The same,” replied the Guru. “You didn’t notice it until I pointed it out.”

“So, antibugging is when everything comes out just right, and abugging is when you have to deal with issues?” mused the second.

Now, we’ll look at just what he meant in some detail, with examples.

Special Instructions

The need for antibugging can be seen in the documentation. It may not be printed formal docs—it may be your mental usage-notes as well. The thing to look for is relationships that influence your subsequent use of the code.

Let’s see how that applies to a real piece of code (sorry, Mike). Here is a class designed to fill a specific need in a program, where a lump of binary data is read from a disk file and then broken up into component objects; the data contains both fixed-size primitives (such as ints) and variable-sized items such as length-prefixed strings.

class read_buffer {
   //...  implementation details...
public:
   read_buffer (size_t size);
   ~read_buffer();
   byte* buffer();  //fetch internal buffer, for filling
   // various functions for extracting data
   void read (void* dest, size_t bytecount);
   int read_int();  //calls read()
   //... others
   };

To use the read_buffer class, first give the buffer’s size to the constructor, which allocates memory. Then call the buffer() member to get a pointer to this memory, into which you read the raw data from the file.

Then, use the read() members, which pull bytes out of the buffer and advance the current position. Finally, the destructor frees the memory.

This class must be used carefully: buffer() may not be called after you start using read(). Rather, it must be called only once, immediately after construction.
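To make that fragile protocol concrete, here is a sketch. The internals below are my own assumption (the article elides them), followed by a usage fragment that obeys the rule of calling buffer() exactly once, right after construction:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

typedef unsigned char byte;

// Hypothetical internals for read_buffer -- the real implementation
// details are not shown in the article.
class read_buffer {
    byte* data;
    size_t pos;
public:
    read_buffer (size_t size) : data(new byte[size]), pos(0) {}
    ~read_buffer() { delete[] data; }
    byte* buffer() { return data; }   // only valid before any read()!
    void read (void* dest, size_t bytecount) {
        std::memcpy (dest, data+pos, bytecount);
        pos += bytecount;             // advance the current position
    }
    int read_int() { int v; read (&v, sizeof v); return v; }
};

int demo()
{
    read_buffer buf (2*sizeof(int));
    // Fill the raw buffer once, as a file read would.
    int raw[2] = { 10, 32 };
    std::memcpy (buf.buffer(), raw, sizeof raw);
    // From here on, only the read() members may be used.
    return buf.read_int() + buf.read_int();
}
```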

The inverse class is even worse:

class write_buffer {
   //...   implementation details...
public:
   write_buffer();
   ~write_buffer();
   // various functions for storing data into the buffer
   void write (const void*, size_t);
   void write (int);  //write an int, calls general form
   //... others
   // functions to eject the filled buffer
   const void* get_first (size_t& chunksize);
   const void* get_next (size_t& chunksize);
   };

It starts out simple enough. After declaring a write_buffer (no constructor arguments), add data to it using the various forms of write(). The problem is with ejecting the filled buffer. First, you call get_first() which returns the first part of the data. The pointer to the data is the function’s return value, and the size is placed in the reference parameter. Once this call is made, you may not call write() again or bad things will happen. You read the rest of the buffer, in chunks, with repeated calls to get_next(). The size will be 0 after all chunks have been returned.
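The ejection protocol can be sketched as follows. The single-chunk internals here are an assumption for illustration; the real class stored its data in several pieces, which is why two fetch functions exist at all:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal single-chunk sketch of write_buffer; the real internals
// were more elaborate and are not shown in the article.
class write_buffer {
    std::vector<char> data;
    size_t eject_pos;
public:
    write_buffer() : eject_pos(0) {}
    void write (const void* src, size_t count) {
        const char* p = static_cast<const char*>(src);
        data.insert (data.end(), p, p+count);
    }
    void write (int v) { write (&v, sizeof v); }
    const void* get_first (size_t& chunksize) {
        eject_pos = 0;
        return get_next (chunksize);
    }
    const void* get_next (size_t& chunksize) {
        // One chunk only in this sketch, so the second call
        // reports size 0 and ends the caller's loop.
        chunksize = (eject_pos == 0) ? data.size() : 0;
        eject_pos += chunksize;
        return chunksize ? &data[0] : 0;
    }
};

size_t demo()
{
    write_buffer buf;
    buf.write (7);            // an int
    buf.write ("abc", 3);     // raw bytes
    // The ejection loop described in the text; real code would
    // write each chunk p to the output file.
    size_t total = 0, chunksize;
    for (const void* p = buf.get_first(chunksize); chunksize != 0;
         p = buf.get_next(chunksize))
        total += chunksize;   // (p would be written out here)
    return total;             // sizeof(int) + 3 bytes
}
```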

Telling the user what he must and mustn’t do is a sign that the class is overly complex to use. If I had a similar need and wrote this today, I might wonder what all the fuss was about. To do the reading, simply establish a file mapping, specifying the file, starting offset, and length in the constructor. For writing, where the overall size is not known ahead of time, use a large chunk of reserved but uncommitted memory, and commit pages as the buffer fills. The buffer dynamically grows, but remains in a single contiguous region of memory.

But, the above classes were designed for an embedded system, where memory was at a premium and there were no tricks available to play with virtual memory mappings. The inner workings of the buffers were designed to be fast and memory-efficient given the constraints of the problem. It was a good idea. The only “problem” is that the usage of the classes exposes this complexity. The peculiarities of using the class are artifacts of the implementation. A better design would be an abstraction, not a direct correspondence to the underlying implementation.

As it turns out, a better idea did not come immediately to mind. This is itself a sign that antibugging needs to be done at a higher level. It’s like the fountain in the Guru’s park: a detail that itself directly contradicts the overall good design. Here, a “better idea” would not be a cleaner public interface to buffer which somehow does the same thing. Rather, a “better idea” would be to back away a level and look at the code that is demanding this component. Could that be altered so it’s not demanding this exact awkward component?

In the actual program, a single instance was created, filled, dumped, and destroyed all within a single short passage of code. If usage was more widespread, this component would be more problematic. But since it was only used in one place, I decided redesign was not worth the trouble. Instead, I turned toward abugging.

I thought about slight improvements to the existing design. Why not implement it so the buffer() function can be called at any time? Why not eliminate the first/next fetch functions on the writer and just have one function? Can anything be done about the second return value found in the pass-by-reference parameter?

With antibugging, I was considering the overall system and how all the pieces fit together. Remember, antibugging is a design process, and something you don’t notice unless it’s wrong. The Guru asked about muddy sidewalks, and his students brainstormed ideas to cope with the mud. Dealing with these issues is abugging. The real answer in his riddle was antibugging: build sidewalks that don’t tend to get muddy, which makes all the ideas his students thought up unnecessary.

So, deciding that I’m stuck with these issues, I need to deal with them in the implementation. That drops down to abugging, like the Guru’s fountain. In the park, antibugging would be to put the fountain at the top of the hill, so it would not create standing water when it rains and overflows. But the design was to have a large flat plaza around the fountain, and it can’t be at the top of a hill because there must be a vantage point where one can look down over the fountain. Since the park’s designer could not sidestep this problem (antibugging), he had to deal with it (abugging) and included drains on the flat area.

In the buffer code above, the real problem is dealing with separate use and purge modes. Putting in a flag so the class asserts if it is incorrectly called would save someone some debugging someday, as the program is maintained. In general, if there are rules to live by (and no matter how simple it seems, some maintainer will mess it up), have the code itself check for adherence.
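Such a flag can be sketched directly. The mode flag below is my addition, not the original code; it makes read_buffer enforce its own “buffer() only before the first read()” rule with an assert instead of documentation:

```cpp
#include <cassert>
#include <cstddef>

// Abugging sketch: a hypothetical mode flag lets the class check
// its usage rule at run time. Storage details are elided.
class read_buffer {
    enum mode_t { Filling, Extracting };
    mode_t mode;
    // ... real storage and position bookkeeping elided ...
public:
    read_buffer() : mode(Filling) {}
    bool filling() const { return mode == Filling; }
    void* buffer() {
        assert (mode == Filling);   // fires if called after read()
        return 0;                   // real code returns the storage
    }
    void read (void* dest, size_t bytecount) {
        (void)dest; (void)bytecount;  // real extraction elided
        mode = Extracting;            // from now on, buffer() asserts
    }
};
```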

The “avoid special instructions” rule is central to antibugging. Many other individual antibugging techniques somehow reflect on “special instructions”.

Pointers and References

A very common place that “avoid special instructions” as an antibugging technique comes into play is when dealing with pointers. In fact, the whole concept behind forgetting to check error codes is a special case of the general idea of “avoid special instructions”. Telling the user of a library that after calling some function he must check an error code is certainly special instructions that could possibly be disobeyed. That’s kind of weak, I grant you, but the concept is more closely related when you have a single value that could be either the answer or an error code, as opposed to a separate way to explicitly check for errors.

This is exactly the case when a function returns a pointer as a result, or a null pointer on error. Consider a function that looks up a record in some kind of collection. One way to design it would be item* collection::lookup (key); where the function returns a pointer to the item in the collection if it is found, or null if it was not found. It won’t be long before someone writes p->lookup(the_key)->some_virtual_function(); and plooie, the null pointer doesn’t work very well when you try to call a virtual function on it.

Another design would be item& collection::lookup (key); where the function always returns a valid item. Perhaps it automatically adds an element if not present. Perhaps lookup() throws an exception if the key is not present. Either way, the potential for misusing a bad return value is eliminated.

You will notice this used in standard collections like std::map. The operator[] will insert an item with that key if necessary, so it always has something to return a reference to, while find() returns the end iterator if nothing was found. Likewise, dynamic_cast has a choice of pointer or reference forms. In both examples, use the pointer (or iterator) form when you will be checking for errors, and the reference form when you will not be.
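Both styles can be seen in a few lines of std::map use:

```cpp
#include <cassert>
#include <map>
#include <string>

int demo()
{
    std::map<std::string,int> table;
    table["apple"] = 3;

    // Reference style: operator[] always has something to return,
    // inserting a default-constructed value if the key is absent.
    int& count = table["banana"];   // inserts "banana" -> 0
    count = 5;

    // Pointer-like style: find() returns the end iterator on
    // failure, and the caller must check before dereferencing.
    std::map<std::string,int>::iterator it = table.find("cherry");
    if (it == table.end())
        return table["apple"] + table["banana"];
    return it->second;
}
```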

Complete the Functionality

Another antibugging issue is “complete the functionality”. This is a complete topic in itself, but a simple case is a function which the compiler lets you call in situations where the “special instructions” say you shouldn’t. The domain of a function is a simple example: the actual limits on a function’s inputs are stricter than the compiler’s type checking can provide for.

To deal with this, first see if you can make the full range in fact work. If that is not an option, make the function itself catch the bad inputs. You can see that this relates to the pointer example above, too. The null pointer dealt with a problematic output, but the invalid key is a problematic input. Often, a single issue will fall under multiple categories. That’s just saying that a single design flaw will affect the code in numerous ways. The design that automatically adds an element if necessary deals with this issue too: there are no longer any bad input values.
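As a minimal sketch of “make the function itself catch the bad inputs”, consider a hypothetical percent_of() whose real domain is narrower than what the int parameter type can express:

```cpp
#include <cassert>
#include <stdexcept>

// The compiler accepts any int, but the meaningful domain of the
// first argument is 0..100, so the function checks it itself.
// The function and its behavior are illustrative, not from the text.
int percent_of (int percent, int whole)
{
    if (percent < 0 || percent > 100)
        throw std::out_of_range ("percent must be 0..100");
    return whole * percent / 100;
}
```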

Prerequisites

Sometimes the “special instructions” will tell you to be sure to do something before doing something else. The concept of prerequisites gets special treatment in C++, because C++ has constructors. In general, you have to declare something before using it. You can make the fact that it is declared fulfill the prerequisite condition. This is usually taken for granted in good C++ design, where everything is “object-oriented”: objects are constructed to a usable state, and then take care of themselves.

So watch your encapsulation. Make internal state self-initializing whenever possible, so you can just call a member and not have to call a special setup member first! If this isn’t working cleanly, perhaps you need to rethink where your object boundaries lie.
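As a sketch, here is a hypothetical channel class (the name and members are mine, for illustration) that removes the special instruction “call connect() before send()” by connecting itself on first use:

```cpp
#include <cassert>

// Self-initializing internal state: the object performs its own
// setup the first time it is needed, so there is no setup member
// for a user to forget.
class channel {
    bool connected;
    int sent;
    void ensure_connected() {
        if (!connected) {
            // ... open the real connection here ...
            connected = true;
        }
    }
public:
    channel() : connected(false), sent(0) {}
    void send (int /*datum*/) {
        ensure_connected();   // no special setup call required
        ++sent;
    }
    int messages_sent() const { return sent; }
};
```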

Copy Semantics

The concept of strong copy semantics is central to C++, something lacking from object-oriented Pascal or Smalltalk. It’s a very powerful concept. Instead of needing “special instructions” saying “call this function to copy a person record; don’t use primitive assignment”, you just define an assignment operator that does the right thing. The user of the class never has to worry about it.

So why are programmers always forgetting this? Consider the function

char* C::bar (char*);

//in use:
char* input_value= new char[80];
strcpy (input_value, "hello world");
char* result= x.bar (input_value);

Hmm, why doesn’t the fragment say input_value= "hello world"; ? Instead there is a funny function called strcpy, and assignment to input_value would be syntactically legal but mean something else.

The result value is more problematic. Who owns the memory? Does result point to something that is already owned by something else, or does the caller have the responsibility of freeing it? Likewise with the input: does bar take over ownership of the string, or not?

Here, a string class would bypass all these problems. Antibugging would be to use a string class instead of char*s; strings simply don’t have these problems. But, I’ve been in a situation where a string class was not practical, due to memory overhead. There, I used abugging techniques to

  1. establish coding conventions so it would be clear when ownership is transferred,
  2. make sure all ownership issues are clearly documented for each function, and
  3. implement debugging code that catches double deletes and watches for memory leaks.

The system in question was running on an 80286, with 16-bit code and 640K of memory. The speed was probably around 2 MIPS. So what’s your excuse today, with machines thousands of times the capacity and speed?

A string class is a special (and rather common) case. But watch for this issue in other places, as well. A blanket antibugging solution is to use value semantics for everything, so everything has constructors, destructors, and assignment operators. Some of these objects are simple wrappers around pointers, but they keep track of ownership issues themselves.
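Such a pointer wrapper with full value semantics might look like this sketch (the text class and all its details are hypothetical): construction, copying, assignment, and destruction each manage the owned memory, so the user never tracks ownership by hand.

```cpp
#include <cassert>
#include <cstring>

// A simple owning wrapper around a char*: the constructor,
// destructor, copy constructor, and assignment operator together
// keep the ownership rules inside the class.
class text {
    char* p;
public:
    text (const char* s) : p(new char[std::strlen(s)+1])
        { std::strcpy (p, s); }
    text (const text& other) : p(new char[std::strlen(other.p)+1])
        { std::strcpy (p, other.p); }
    text& operator= (const text& other) {
        if (this != &other) {            // guard self-assignment
            char* q = new char[std::strlen(other.p)+1];
            std::strcpy (q, other.p);
            delete[] p;                   // free the old value
            p = q;
        }
        return *this;
    }
    ~text() { delete[] p; }
    const char* c_str() const { return p; }
};
```

With this in place, primitive assignment is the right way to copy one, and no “special instructions” are needed.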

Recap

When faced with programming issues, there is a fundamental difference between dealing directly with those issues, and avoiding the issues. Generally, think about antibugging first. Instead of saying “what do I do about X?” think about “Why is X happening, and can I avoid it?”. That doesn’t mean that every issue can be so eliminated. But, even if you decide to leave it as-is, you will have a deeper understanding of the nature of the problem. Then, tackle the remaining issues by confronting them directly or avoiding their consequences using abugging techniques.

