Common Code Smells in OOP

Over years of reviewing Ruby code, the same things tend to come up over and over. In this post, I’d like to address some of the most common code smells I find when reviewing OOP code (and Ruby code in particular).

I personally think "code smell" is a great name. As all great names, it needs little or no explanation. It’s something which smells bad in code, which simply means "bad code".

Wikipedia provides us with a more formal definition:

In computer programming, a code smell is any characteristic in the source code of a program that possibly indicates a deeper problem. Determining what is and is not a code smell is subjective, and varies by language, developer, and development methodology.

Each developer, language and even ecosystem has its own code smells. Some are more general than others. In this post I’ll talk about very generic smells which apply to any object-oriented language, be it Ruby, JavaScript, C# or PHP.

Polluting Instance Data

This is one of the most frequent smells I find when reviewing code.

Object-oriented languages isolate data into objects, and only expose that data to the external world through messages — also called methods:

In the example above, our actual data is the X and Y coordinates for our little point. We can’t access them directly, so we do that through our point object. Also, when we want to change them, we have to do that through the point’s move message.

Objects not only have data, but also behavior which is tightly coupled with that data. In the example above, it makes sense to have the move behaviour inside the Point.

When dealing with objects, we don’t care much about the internal data or state, instead we care about the methods, or messages that it exposes. That’s the only way we can communicate with objects.

We use messages to ask them questions — list.empty? — and make them do stuff — mailer.send. (See Command/Query Separation for more on this, as it escapes the scope of this post).

So far, so good, but sometimes we end up adding more data than that which the object semantically represents. A common example of polluting the object’s data is adding helpers for calculations:

Here we are just adding a method to calculate the distance between two points.

The math is really not important, the thing to note here is that we are using instance variables where we could just get away with using regular variables.

Why are instance variables bad? The biggest issue is that they add actual data to the object which has little to do with its role. Why does a point care about the x_diff with some other point? After the calculation is over, this data is useless.

Another issue with this is that this calculation data won’t be cleared up by the garbage collector: it will be re-defined every time the method gets called. This makes our objects bigger, and if we are going to instantiate thousands of them, it’s easy to end up with bloated systems.

Yet another issue is that the more state we have, the more likely bugs are to emerge. We want to have as little state as possible, and ideally, we don’t want it to change (but that’s a subject for another post).

Another common example of misusing instance data is using it to collect intermediate results:

Can you spot the problem here? What happens when you call #import_people without having called #import_categories first? It breaks.

Not only are we adding data which is not relevant to what our object represents, but we are also making our code more fragile.

The solution is very simple:

By passing intermediate values through parameters, we don’t need to pollute the object’s internal state.

Leaky Logic

Another very common smell I encounter often is leaking logic. This means an object’s logic is leaking to the outside world.

Luckily, this one is very easy to identify, and it normally comes in the form of: a) conditionality and b) serial method calls.

Let’s see a conditionality example:

It’s not necessarily a bad thing to ask an object questions, and although it’s not ideal (Tell, Don’t Ask), sometimes it makes sense to do so.

But problems can arise when the actions taken after asking an object something also send messages to the same object.

In the example above, we are asking coffee a question, and then in the body of the conditional sending yet more messages to coffee.

This logic could just live in the coffee object:

The second example is about serial method calls. This applies when calling several methods in a particular order:

In the example above, you will always call open first, write second, and close last. It’s not a big smell if it just happens once, but once you know this smell, it’s always a good idea to just refactor by defining a more specific method in your object:

This way you make sure the methods are always called in order and it’s not possible to forget to call a method. Grouping these methods into one makes sense, and it deserves to be its own message.

Note that it doesn’t always have to be like the example above–sometimes you just spot the same methods being called over and over again:

In this case, we could simply create a new method with a nice name:

Excessive Conditionals

This one could be its own blog post, but I’ll try to keep it concise this time. It’s about replacing conditionals with inheritance.

One of the most powerful features OOP offers is polymorphism: the ability to just send messages to objects and not care about their specific type at runtime.

In dynamic languages like Ruby, you don’t need any extra ceremony to do this. In statically-typed languages like Java, you need a common interface. Regardless of language, we want to use this power to abstract away conditionality.

Let me show you an example and let’s see all the advantages we get for free:

The conditionality above might not seem like a big deal. Even if we have to repeat it in a few methods, it’s not the end of the world. But what if we have another requirement?

Now it’s starting to get messy. First of all, if we were to make this change, we’d have to make it in every single place where the logging was implemented, which could take considerable time.

We could extract it to its own method, but what if we want that functionality on other objects too?

Also, why should the store know about debug_mode?, development?, and production?? It even knows how to create a special log using the Log class, therefore introducing another dependency to our class.

It’s now clear that this doesn’t belong here. The first step is easy, move it to its own class:

Now, that is much better. At least we can test the logging functionality separately now. Still, because it has many conditionals, it’s a pain to test. Every conditional branches the code, and the amount of tests we have to write to cover all cases grows exponentially with each nested conditional.

Now for the interesting part:

Whoa, what happened? We created many specific logger objects, and we move the responsibility of knowing which logger to use to a factory object.

Now we can test each logger individually, and the instantiation logic happens in a single place–the factory–which can also be easily tested and reused.

Bonus

Alternatively, sometimes you can get away with automating the factory and completely removing the conditionality using convention over configuration.

Consider this example:

We could use a switch statement instead, but you get the idea. Now we can refactor this as follows:

In this case, we can just try to load a class named <user_role>Greeter. If we can find one, great. If it fails, we raise a helpful exception. The nice thing about this is that if we add more roles, all we need to do is implement a new class. For example, if we were to add a moderator role, simply creating a ModeratorGreeter class is all we’d need. So even if we forgot to write the new greeter class, the error message would be quite descriptive and help us immediately figure out what’s wrong.

That’s it!

It was just three code-smells this time, but I don’t want to make this too long. If you guys are interested, make sure to leave comments and I will continue this series. 🙂

Cheers!

Federico — Senior Web Developer @ Beezwax

Leave a Reply