Object-oriented programming: rarer than you think
daniel bray, 10th May 2003

Introduction

Before you start reading this essay you should know where I'm pitching it. I'll start with what this is not:

What this is, is an elaboration. I've been in the working world for five years now and it has never ceased to amaze me how few people are writing object-oriented software. For sure, they're using objects and inheritance and whatnot, but their code is still (forgive me my vulgarity) functional. I would normally forgive this... erm, nostalgia, but it usually tends to be non-functioning, at least in terms of maintaining the software. You can normally spot this code a mile off 'cause it usually has a ton of identical switch statements, everywhere.

What I'm trying to do with this is take the simple constructs of object-oriented programming and describe how they need to be clearly understood and implemented rigorously. To take the glib one-liners handed down by computer science lecturers and show how, although they're usually safe, they hide the subtle mysteries that exist in all programming, and which can lead to truly awful software. Mostly, what I want to do is show that there is a HUGE difference between object-oriented languages and object-oriented programming, and to show a few constructs to help a coder spot that difference, and make the jump across it.

I think, though, that primarily I'm writing this in the hope that it might get out into the world and make some kind of difference, and that I won't have to spend so much time fixing almost identical bugs all the time.


I'm new to this whole essay-writing thing, so I'd be delighted to receive any comments, additions or clarifications you have on this. I live at daniel@braindelay.com.

I've written some more of these: they can be found at danielbray.com.


I is what I is
when something is what it isn't

One of the most heinously dangerous things I heard in college was the mantra:

Inheritance is "is-a"; aggregation is "has-a"

It sounds nice and simple and, for sure, it's easy to remember; and if there's one thing students like, it's things that are easy to remember. The problem that arises is that is has two meanings:

the is of generalisation
This is essentially a syllogism. You can say A is a B, because A happens to be a type of B. This is transitive: if A is a B and B is a C then A is a C.
the is of identification
This is what breaks everything. This is where you'd say something like: "Daniel is a man." This merely assigns some identity to something, but doesn't say anything about what it is. This is not transitive.
If you're scratching your head right now, relax: I'm about to show you a picture - pictures are always good.

A class hierarchy: Daniel extends Man extends Human extends Primate extends Species

This is a sample class hierarchy. Before we start I should tell you that it's awfully wrong, and I'm sorry if you're a creationist, but Man descending from Primate isn't the reason. If you can't see the reason, take it easy, you'll see it soon enough.

This is the kind of inheritance hierarchy that I see a lot, and it's always coming up because of a misunderstanding of what is means. If you follow the inheritance is "is-a" mantra, then this looks fine: Daniel is a man (I like to think so); a man is a human (most of the time); a human is a primate (I'm sorry, God); and a primate is a species of animal.

Now, you're going to get a lot of people telling you not to think about coding when you're in the design process, and, to be sure, that's usually good advice, but when you're just starting off in the business you should think about code all the time. In fact, even when you've been coding for a few years I'd suggest you think for about ten minutes about how you're going to put your designs down on paper. For a start, if you can't immediately think of a few implementations for your design, then there's a good chance that your design is an implementation, only in pretty pictures. Secondly, thinking about code will force you to think about the gritty details of coding, which you're going to have to get 'round to anyway, and if you find that you're going to be forced to write the same code over and over then you may want to change your design.

Now take another look at that class hierarchy. The fault here is that Daniel is not a Man, in the class inheritance sense, or at least shouldn't be. Suppose you add some more men.

A class hierarchy: Adam, Daniel and Simon extend Man extends Human extends Primate extends Species

You can probably see what's happening here. We're going to get a huge explosion in classes, one for every man, and there's nothing to distinguish them except their identity. This is mistake number one; the inheritance is "is-a" rule is only correct for the is of generalisation, not for the is of identification, and the reason is this:

Given the rules of class invariants, inheritance is transitive: A polygon is a shape; a square is a polygon; a square is a shape. When you're using is to define your inheritance hierarchy the is you're using must be transitive also. The is of identification isn't transitive, so it doesn't work.

So if you look at the hierarchy: Daniel is a Man; a Man is a Human; a Human is a Primate; a Primate is a Species. Since inheritance is transitive, the following is true: Daniel is a Human; Daniel is a Primate; and, at the end of it all, Daniel is a Species. A single man is not a species, and that's the absurdity.

So, Daniel can't extend from Man, but Daniel is a man... what to do?

A class hierarchy: Man (with an identity field) extends Human extends Primate extends Species

What you do is have a minor technical genocide and get rid of all of the classes you would have defined for each person, and associate Man with some identity instead.
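In Java it might look something like this; a minimal sketch, assuming the Human class from the diagram (the field and accessor names are just mine):

   abstract class Human { }   // stub standing in for the rest of the hierarchy

   // One Man class for everybody; identity is data, not a subclass.
   class Man extends Human {
      private final String identity;

      Man(String identity) {
         this.identity = identity;
      }

      String getIdentity() {
         return identity;
      }
   }

   // No class Daniel extends Man; Daniel is just an instance:
   //    Man daniel = new Man("Daniel");
   //    Man adam = new Man("Adam");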

You're probably thinking, "so what, that's a toy example and that kind of thing never happens in the real world," but you'd be wrong. The difference in the meaning of is can be very subtle and will force code duplication, which always leads to embarrassment in a code review. Still not convinced? Let's take another look at the class hierarchy and see what happens when we add women, and other animals.

A class hierarchy: Man and Woman (both with an identity field) extend Human extends Primate extends Species; Dog and Bitch (both with an identity field) extend Canine extends Species

Not so simple now, is it? If you didn't see this coming it's not so bad, 'cause most people don't. It's a very common mistake to firmly fix your gaze on the part of the hierarchy that needs the work done to it now and ignore the rest. Now, I'm not suggesting that you design an entire hierarchy for everything that may or may not happen in the future - that kind of creeping featuritis is the Hansen's disease of software - but I am suggesting that you pay absolute attention, when you're building your hierarchy, to what kind of is you're using.

You see, here we always assumed that Man was a perfectly viable class and put all of our focus on the obviously wrong Daniel class, but the fact of the matter is that gender belongs way, way up the hierarchy. I don't know quite where, 'cause I'm no biologist, but it's way, way up there. Also, the identity becomes a problem.

A class hierarchy: Different animals extend a gender-specific class, which extends species, an amoeba extends directly from species. Some of the specific instances of Species are implementing an Identifiable interface

And here's a candidate solution that avoids the problem of lots of identical classes. It achieves this because the is used to define the inheritance is the transitive is of generalisation. I know that this is the umpteenth time I've said this, but inheritance is such an integral part of object-oriented programming that it has to be done right, and this problem can be fairly subtle and easy to miss.
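Here's one possible reading of that picture in Java; every name is invented, and don't take the biology seriously:

   interface Identifiable {
      String getIdentity();
   }

   abstract class Species { }

   // Gender gets classes of its own, way up the hierarchy...
   abstract class Male extends Species { }
   abstract class Female extends Species { }

   // ...and only the creatures that need an identity implement Identifiable.
   class Dog extends Male implements Identifiable {
      private final String identity;   // one class, many identified dogs

      Dog(String identity) {
         this.identity = identity;
      }

      public String getIdentity() {
         return identity;
      }
   }

   // An amoeba extends Species directly: no gender, nothing worth naming.
   class Amoeba extends Species { }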


Possesion is nine points of the law
when to use aggregation or association

What's the difference between aggregation and inheritance? Easy, it's

Inheritance is "is-a" (careful here); aggregation is "has-a".

Okay then, here's another one: what's the difference between an association and an aggregation? It's....

It's not so simple, is it? We'll take some examples to clarify things.

Does a wall aggregate its bricks?
I'd say yes to this.
Does a swarm aggregate its bees?
Again, I'd say yes to this.
Does a company aggregate its employees?
This causes so many arguments, but I'd say no.

The thing is that the difference between associations and aggregations is tenuous at best since, in code, they both end up the same way: as fields added to the client class, or as increased visibility of fields or operations in the subject. It does, however, become important when you get 'round to putting the connection down in code; and the problem is feelings.

You'll hear this a lot in design reviews: I don't like that; it feels wrong. Whoever said this will have no real reason for their opinion, but their choice of words is right on the money, because if you argue with them you will hurt their feelings. What your team wants to do is pick one meaning for what you think the difference is between an association and an aggregation, if indeed you think there is any, and write it down. Write it down in a document with an important-sounding title and make sure that one of your technical managers puts their weight behind it. Stick in a sort of catechism, like I have above, of examples of what the differences are. Most of all, you have to be consistent. You'll save a ton of time this way.

Personally, I think there's a difference, I think that the difference is important and that ignoring it just causes hurt, sorrow and late nights refactoring.

And the difference is?

Aggregation
A simple containment: a linked list aggregates its elements; a picture aggregates its images. There are no, or else very simple, semantics concerning this relationship. If there is a semantic that says that when one end of the relationship is deleted the other is also, then this becomes easy: what you have is a composition, a specialisation of aggregation; it doesn't tip over into being an association.
Association
I would argue that whenever there are complicated semantics concerning the connection between two classes then you no longer have a simple containment (an aggregation), but rather an association. The reason for the distinction is that although an aggregation can wildly differ in its structure and operation, from the outside it is merely a container. An association doesn't have (in my terminology, at least) this limitation.

Again, I'll give you a picture to clarify things, and we'll go back to the catechism: Does a company aggregate its employees?

Let's suppose it does.

A class hierarchy: A Company has a one to many aggregation association with an Employee class

The employee, lest we forget, is a human being: it will have a name, an address, an age, a gender, etc. It will also have fields, such as salary, employment commencement date, union affiliation, etc., which have nothing to do with their humanity, but rather their employment.

The company will also contain operations, and probably some fields as well, about the employment of all of these employees.

The problem here is that no one class is directly, and solely, responsible for an employee's employment. The employee maintains the data, and the company will collate, update and make sense of it. If this kind of thing happens a lot, with lots of different classes, what you'll end up with is an absolute monster of a Company class, surrounded by little satellite data structures.

You may as well be writing this in PASCAL.

My own view of this problem is just a tadge different.

A class hierarchy: A Company has a one to many association with an Employment class, which in turn is associated with an Employee

Like I said, it's just a tadge different, at least on paper, but this approach puts the whole code in a different perspective: one which pushes the complicated behaviour associated with employing a person out of the company, leaving the company to just deal with employments. What that employment is, along with all of its complications, is contained within some Employment class. When you extend this to everywhere the company might have associations with complicated semantics, say with the buildings they own or rent, or the clients they deal with, this spreads out the load across the hierarchy, making each part of it simpler.
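A rough Java sketch of what I mean; the names, fields and operations here are all invented for illustration:

   import java.util.ArrayList;
   import java.util.Date;
   import java.util.List;

   class Employee {
      private String name;      // the humanity stays here
      private String address;
   }

   // The association class: everything about the employment itself.
   class Employment {
      private final Employee employee;
      private double salary;
      private Date commencementDate;

      Employment(Employee employee, double salary, Date commencementDate) {
         this.employee = employee;
         this.salary = salary;
         this.commencementDate = commencementDate;
      }

      void giveRaise(double amount) {
         salary += amount;      // complicated employment semantics live here
      }
   }

   // The company just deals in employments, so it stays small.
   class Company {
      private final List<Employment> employments = new ArrayList<Employment>();

      void employ(Employee who, double salary) {
         employments.add(new Employment(who, salary, new Date()));
      }
   }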

As an added bonus, I've noticed that in large systems these association semantics tend to be very similar; by placing them into association classes of their own you make them amenable to generalisation. Remember, this is object-oriented programming, and inheritance, applied properly, is a very good thing.


Inheritance tax
how to use polymorphism without slitting your own throat

When I was a kid, there was a TV programme with a plasticine hero called Morph. Like I said, Morph was made from plasticine and so could change into anything he wanted to. I remember it was very funny.

People out there hear the word "polymorphism" and they think that means that they can, through inheritance, turn any class into something else. This isn't funny at all.

If you've read my bit about what is means, then you'll know that you have to be very careful, when you're extending classes, that the class you're creating by extending another really is a type of that base class.

What you don't want to be doing is extending a class just so you can avoid writing a few lines of code. I once knew a coder who thought nothing of extending a Polygon from LinkedList, 'cause it gave him his API for setting his points for free. The fact that it gave him operations in his Polygon API that were faintly ludicrous wasn't an issue for him. Who knows, maybe one day he might need to turn his Polygon into an array of Objects...
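In Java terms, the difference looks something like this (Point here is just java.awt.Point, and Polygon is a made-up class for this sketch, not the AWT one):

   import java.awt.Point;
   import java.util.ArrayList;
   import java.util.LinkedList;
   import java.util.List;

   // The lazy version: a free API for adding points, plus dozens of
   // inherited operations that are faintly ludicrous on a polygon.
   class LazyPolygon extends LinkedList<Point> { }

   // The honest version: aggregation, not inheritance. Clients see only
   // operations that make sense for a polygon.
   class Polygon {
      private final List<Point> points = new ArrayList<Point>();

      void addPoint(Point point) {
         points.add(point);
      }
   }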

What you must always keep in mind, all the time, when you extend an operation is that there are very strict mathematical rules involved here which, if broken, might not break the build, but will certainly cause the behaviour of your program to be suspicious, especially if you're dealing with an interface whose implementation you have no control over.

A prime example of this is in almost every Java class I have ever seen, and, again, it has to do with identity. java.lang.Object has two operations that have to be kept in synch: equals and hashCode. equals defines the object identity, and that identity is an integral part of the definition of the result of hashCode. If hashCode isn't using the same fields to determine the hash code as equals uses to identify the object, then hashCode simply won't work.

People are forever overriding equals and ignoring hashCode because, they figure, I'm never going to use this as a key into a hash table, so why bother?

This is why...

java.util.Collection defines a contains operation that has, in its documentation, the requirement that equals be defined. Cool, you think, I've done that... nothing in here about hash codes.

Now, if your client is talking to an interface that returns a java.util.Collection, and that collection is, unknown to you, a java.util.HashSet, you're stymied, and contains is never going to find anything, unless you're really, really lucky. This is because java.util.HashSet is using the hash code to find elements for comparison, and you never defined hashCode. And there's no point crying foul because the API didn't tell you to override hashCode, because it did tell you to override equals, and the API for equals does tell you that you have to keep hashCode in synch with it.
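The cure is mechanical: whatever fields equals uses, hashCode must use too. A minimal sketch (Person is an invented class; any identity-bearing class will do):

   // equals and hashCode computed from the same field, so hash-based
   // collections can find this object again.
   class Person {
      private final String identity;

      Person(String identity) {
         this.identity = identity;
      }

      public boolean equals(Object other) {
         if (!(other instanceof Person)) {
            return false;
         }
         return identity.equals(((Person) other).identity);
      }

      public int hashCode() {
         return identity.hashCode();   // the same field equals uses
      }
   }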

The moral of this story is that you can't just extend operations willy-nilly; you have to follow the following rules:

Class invariants are sacrosanct.
Never strengthen an operation precondition.
Never weaken an operation postcondition.

The point of these rules is that an interface is what's supposed to define the observable behaviour of a class; a sub-class can, if it wants to, require less, and give back more than the super-class, but it must adhere to its interface.

Again, an example is called for:

When you learnt polymorphism it was probably in the form of a heterogeneous list; in my case it was an array of Shapes, so that's going to be my example now, only without the pointers.

A class hierarchy: Ellipse, Polygon and Curve extend from Shape; Shape has a draw operation with a Context parameter

Here, what we have is a simple enough toy example of a graphics utility that can support different shapes. An image would simply be an array of these shapes. To draw an image you'd simply draw every shape in this array.

Shape has one operation defined for this example, and that's draw(Context); where Context is a graphical context that the shape will use to draw itself. This would define operations like drawLine(x, y) and such. There'd be a different context for drawing to the screen, or to a printer or whatever.

Now I'll explain why the rules I laid out above are so important.

Class invariants are sacrosanct
This is really just a more general case of the following two rules, but the faults that can be caused if this rule is broken can be fairly subtle, especially if your class is maintaining a complex state; so subtle, that it deserves to be singled out on its own.

Essentially the rule is this: given that an object is defined as having state, behaviour and identity, the state will only be well-defined in certain configurations, and every operation call made to this object must leave that state in a well-defined and correct configuration. Where the behaviour is defined such that there is a connection between the functionality of some of the operations, if this connection isn't maintained in an extending class then there's a very good chance that the new sub-class is going to break the state when one, or more, of these operations is called.

Never strengthen an operation precondition
The beauty (so to speak) of this design is that when we want to draw the image we can do the following (in Java):
   Shape[] image = ......;              // define the array
   GraphicalContext context = ......;   // get the appropriate context for the task
   for (int i = 0; i != image.length; i++) {
     image[i].draw(context);
   }
And that's it, there's no need to know what type of Shape we're drawing; we know it's a Shape, and the interface on Shape tells us that we only need a GraphicalContext.

Now, suppose we broke this rule, and there was a Shape whose implementation of the draw operation needed something else, then we'd need to do something like this:
   Shape[] image = ......;              // define the array
   GraphicalContext context = ......;   // get the appropriate context for the task
   for (int i = 0; i != image.length; i++) {
     if (image[i] instanceof SpecialShape)
     {
       ((SpecialShape) image[i]).draw(context, somethingExtra);
     }
     else
     {
       image[i].draw(context);
     }
   }
What this is, isn't object-oriented programming. SpecialShape is breaking the interface defined by Shape. There's a good chance here that either SpecialShape isn't a Shape, or else the signature defined by Shape.draw(...) isn't generic enough. This is a common enough mistake: a designer stares intently at the one part of the hierarchy that they need to use and then, later, when they need to munge new classes onto it, they are very loath to change anything too fundamental in the framework. This go-lightly approach is liable to kill you early with a stroke, so avoid it.

Never weaken an operation postcondition
This is much the same as not strengthening a precondition. To properly deal with heterogeneous collections the client has to be able to treat every element in that collection the same way, and be able to assume that calling the same operation on all instances of a class will return dependable results.

For example, suppose GraphicalContext contained a cursor, and its drawLine(x, y) operation took its co-ordinates from the place where the last call left off: it would be vital that every Shape, once it had finished drawing, reset the cursor back to 0,0 so that the next Shape could be drawn properly. If any of the shapes didn't do this, then we'd have to do the following.
   Shape[] image = ......;              // define the array
   GraphicalContext context = ......;   // get the appropriate context for the task
   for (int i = 0; i != image.length; i++) {
     image[i].draw(context);
     if (image[i] instanceof SpecialShape)
     {
       context.setCursor(0, 0);
     }
   }
Again, this isn't object-oriented programming. This is, of course, a toy example, but in more complex systems you really have to stay on your toes or something very similar to this, only much more subtle, can happen.
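For the record, the well-behaved version keeps the promise itself. Something like this sketch, with a minimal stub standing in for the imaginary GraphicalContext:

   // Minimal stub of the imaginary graphical context from the example.
   class GraphicalContext {
      void drawLine(int x, int y) { /* ... */ }
      void setCursor(int x, int y) { /* ... */ }
   }

   abstract class Shape {
      // Postcondition: on return, the context's cursor is back at (0,0).
      abstract void draw(GraphicalContext context);
   }

   class Ellipse extends Shape {
      void draw(GraphicalContext context) {
         // ...drawLine calls that wander the cursor all over the place...
         context.setCursor(0, 0);   // restore the promised state, always
      }
   }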


Dangerous liaisons

classes are what they say they do, not how they do it

One of Newton's laws - I think it was the first - says that if a force isn't applied to an object then it will keep on moving along in a straight line. This pretty much holds true for everything, and software is no different. When an analysis concept changes, there's a big temptation to try to implement this new, altered concept by causing as little change as possible to the code. This is partly laziness, and for sure, there are times when there's a definite short-term business case for doing as little damage to the code as possible if there's only a day or so before a release; but it's usually the case that the coder just has a poor intuition as to what their software actually is: classes which should be of the same type are considered to be distinct; clients are using classes with full knowledge of how they are implemented; and in most cases the clients are implementing some of the classes' behaviour. This poor intuition tends to dissolve analysis concepts across different classes, without a clear interface, to the point where a coder can't take a concept and point to a specific class, or group of classes, which is clearly and succinctly responsible for the implementation of that concept. Changing the concept becomes very tricky.

This dissolving of analysis concepts into the code is usually caused by a poor understanding of what exactly a class or an object is, in the strictest sense.

It's extremely important that you always think about classes and objects in a very accurate way, and that way is this:

A class defines behaviour and state rules.
An object is an instance of a class, and has identity, state and behaviour.

If you think about objects and classes in any other way then you're going to snap... loudly. What you really don't want to be doing is this:

Thinking that the object is the core of the program.

Yeah, I know it's called object-oriented programming, but it's not; not really. For sure, there are a lot of objects, and it is objects that do the work at runtime, but all of the power comes from the definition of classes. It's focussing purely on objects that blinds programmers to the safe and efficient use of inheritance.

It's also worth noting something that a lot of coders are unaware of: access modifiers are defined on the class. Fields that are defined as private are private to the class, not to the instance object. It never ceases to amaze me how few coders understand this.
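If you don't believe me, this compiles: one Account object reading another Account's private field directly, because private is per class (Account is an invented class for this sketch):

   class Account {
      private long balance;

      Account(long balance) {
         this.balance = balance;
      }

      boolean richerThan(Account other) {
         // Perfectly legal: balance is private to the Account class,
         // not to this particular Account object.
         return this.balance > other.balance;
      }
   }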

Thinking that a class is its fields and operations

This is the most dangerous assumption (is misassumption a word?), 'cause it's the predominant failing in coders, in all languages and paradigms: the assumption that the components of the software are the concepts in the system.

A class is its behaviour and its state. The code, on paper, must always bend to the abstract notion of what the concept is, and not vice versa. Inertia in design (hacking) arises because this is rarely the case. This is not because coders are lazy... well not always, but because the assumption is there, albeit unspoken, that the code is the concept. The result of this assumption is that if the actual concept changes (usually due to a customer request) there's a resistance to changing the code and bits of a solution get placed all over the place.

I'll explain this with a conversation that I've heard waaay too often for comfort:

Mister Blue: I keep having the same problem, and I want to avoid it. The problem
arises 'cause we're returning lists of different objects and I keep having to
check each item in the collection to see what type it is before I know what to
do with it.

Mister Green: (sigh) Why are we asking for lists of different objects?

Mister Blue: Well they're the same kind of thing, sort of, but they're different
"objects."

Mister Green: If they're the same...?

Mister Blue: Well they extend the same object, so that they can use up a few of
the same methods, but they have two different sets of operations.

Mister Green: (sotto voce) Sweet suffering Jesus...

Mister Green: Are they the same type of thing or not?

Mister Blue: Yes, but they have different sets of operations.

Mister Green: When you get back this list of two different types, what are you
doing with them?

Mister Blue: Well for the first type I'm calling methods A, B and C; but for
the second type I'm calling methods D, E, F and G.

Mister Green: (feels a migraine coming on, he gets a lot these days) What are you
trying to do with them?

Mister Blue: Well for the first type I'm calling methods A, B...

Mister Green: In the abstract... (breathes deeply) Pretend this is the design
phase: what are you trying to do here?

Mister Blue: Well the first type needs to write to a plain file, and the second
needs to write to a socket.

Mister Green: So both are writing to something?

Mister Blue: But they're using two different sets of methods.

Mister Green: No they're not.

Mister Blue: Yes they are?

Mister Green: No, something else is calling two different sets of operations on
these two classes, and this is happening everywhere. What you want to do is add
some kind of abstract "write" operation to the base class and then define a method
for it in each of the sub-classes; each of these methods would call the operations
that the client code is calling all the time.

Mister Blue: Oh, I see! That would take the code that figures out what the type
is, and the two different "write" routines, out of the client code. It would
remove all of the code duplication and replace it with single implementations
in the sub-classes. Cool.

The problem with Mister Blue wasn't that he was dim, it's just that he was thinking about the class design purely in terms of how it was implemented. He looked at the two types and immediately saw them as two different entities because the current programmatic interface was different. Even after he saw that they both did essentially the same task, only differently, he still couldn't see that they were implementing the same concept, only differently.
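Here's roughly what Mister Green was driving at, in Java. Every name here is invented, and the file and socket targets are placeholders, not anybody's real system:

   import java.io.FileWriter;
   import java.io.IOException;
   import java.io.OutputStreamWriter;
   import java.io.Writer;
   import java.net.Socket;

   abstract class Record {
      // The one operation clients call; no instanceof required.
      abstract void write() throws IOException;
   }

   class FileRecord extends Record {
      void write() throws IOException {
         Writer out = new FileWriter("record.txt");     // placeholder target
         // ...what used to be methods A, B and C, called from the client...
         out.close();
      }
   }

   class SocketRecord extends Record {
      void write() throws IOException {
         Socket socket = new Socket("localhost", 9999); // placeholder target
         Writer out = new OutputStreamWriter(socket.getOutputStream());
         // ...what used to be methods D, E, F and G...
         out.close();
      }
   }

   // The client's loop collapses to: for each Record r, r.write();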

Considering a class as being defined by its implementation

The other thing that Mister Blue did, which can also cause problems, is that he talked about methods when he meant operations. This doesn't look like a huge problem most of the time, but there is a difference, and it is, I think, important.

Operations are the defined elements of a class' behaviour. For example,
public String toString();
is an operation. An operation is defined on a class, and it states part of that class' behaviour; there's no attempt here to define how this is done.

Methods define how operations are implemented.
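A tiny illustration, with invented names:

   // The operation: declared on the type; it says what, never how.
   interface Sink {
      void write(String text);
   }

   // A method: one particular "how" for that operation.
   class ConsoleSink implements Sink {
      public void write(String text) {
         System.out.println(text);
      }
   }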

Again, you're probably thinking Oh, come on now! That's just splitting hairs: surely it's only a name?

I would have thought by now that you would have started to trust me. The problem is that there's a HUGE difference between the two, and you are only making confusion more likely when you ignore this fact. You should take care to remind yourself constantly that this difference exists, otherwise you run the risk of considering a class as being defined by its implementation, and once you do that you're dead in the water. When you think of a class as being how its operations are implemented then you're going to find it very hard to consider using inheritance because, like Mister Blue, you're going to think that a class that writes text to a file and a class that writes text to a socket are two completely different types of creature, just because how the text is written is completely different. If you start to view classes as how they do things then you're never going to see the big object-oriented picture, 'cause you're always going to be rooting about in the minor trivialities of how almost identical things are different.

You won't be able to see the wood for the trees.


References

As much as I would like to believe that I came up with this on my own, I know I didn't, and I doubt if I even could have. The following books are either directly referenced in this essay, or were such an influence on my professional work that it would be hard to separate my thinking from the authors', so I feel that I should name them as references anyway.

The C++ Programming Language, Third Edition, by Bjarne Stroustrup
http://www.aw.com/catalog/academic/product/0,4096,0201889544,00.html

Design Patterns: Elements of Reusable Object-Oriented Software, by Gamma et al.
http://hillside.net/patterns/books/index.htm#Gamma

Mastering Object-Oriented Design in C++, by Cay Horstmann (my book from college)
http://www.horstmann.com/mood.html

Refactoring: Improving the Design of Existing Code, by Martin Fowler
http://www.awprofessional.com/catalog/product.asp?product_id={B372FE30-1699-4C71-8C46-36DF503ED0A1}

UML Distilled: Applying the Standard Object Modelling Language, by Martin Fowler
http://www.awprofessional.com/catalog/product.asp?product_id={4DEBAF3E-BC74-4566-BF05-19DC00360E16}

