High Cohesion

When I first started learning about object oriented programming, and really digging into the best practices, loose coupling and high cohesion kept cropping up.  I had a hard time trying to keep them straight and remember which one I wanted more of and which one I wanted to reduce.  My biggest difficulty was I couldn’t keep the definitions of coupling vs cohesion straight!

Coupling refers to how much a set of classes rely on one another.  If we have two classes, A and B that each use methods from the other, these classes would be tightly coupled.  If we have those same two classes and only one needs methods from the other, then those are more loosely coupled.

The little metal ends on a garden hose that you can twist together are called couplings.  You use them to join two pieces of hose together.  I like to use this to help me remember that coupling is about how two (or more) classes interact with one another.

Cohesion refers to how much the methods in a particular class belong together.  For example, let’s say we have a class representing a soup can.  The soup can should know about how big it is, what it is made out of, what shape it is and what color it is.  If we start adding information about the soup that is stored inside the can, then we are breaking the cohesion.  Information about the soup is not important to the actual can.  The information about the soup should be contained in a separate class that the can could know about.

Thinking about a cohesive group of people helps me remember what this means.  A cohesive group of people are people that work very well together and really seem to belong together, much the same way as a cohesive class design has methods that really belong together.

I have another post about coupling here.

High Cohesion

When we talk about cohesion, we are really talking about how well the ideas in a class or a data structure belong together.  I mention data structure  here because this principle can and should be used when designing database tables, or any data storage scheme, as well as when you are designing a class.  We are gong to focus on cohesion in code in this article.

Now that we have a better understanding of what cohesion means, why is it important?  Why should we worry about creating things that are really cohesive?  The program will work even if we throw a bunch of different concepts into one pile, right?

Technically, yes.  It is possible to write an entire application in one file.  And…you might even be able to hold all of the context of the application in your head at once while you are writing it.  Just because you can , doesn’t mean you should.

Why not put it all into one class?

There is a term for a class that ends up knowing too much about too many things…”god class”.  This is not the good kind of god, it is the spiteful, vengeful type of god, and you really don’t want to go creating them.

When you have a class that knows too much about too many things, it makes changing that class REALLY dangerous.

When a class knows too much, and does too many things, making a small change can impact many parts of the application, since it is likely used there too!  When we break things out into small classes, those classes are often used in fewer places, and their methods are very well defined.

Also, when everything is in one class, there is often a fear of touching that class.  We cannot understand everything that it is trying to do, so we don’t want to make a change that will cause unknown issues.  When we are afraid to touch classes, we don’t try to improve them, we just try to patch the hole and get out as quickly as possible.  This is not good for the overall health of our application, and is not good for our overall mental health.

 

Single Responsibility

High cohesion and the SRP (Single Responsibility Principle) go hand in hand.  When you design your classes, they should have one main purpose, one main reason to change.

Breaking your classes down this way not only means that each individual class will be changed less often, but it also helps us humans reason about the system as a whole.  Often systems are very complex, and contain a lot of concepts.  If we can break down the ideas into smaller and smaller parts, we can more easily understand these tiny parts, and can then build up our mental model much more easily.

 

Bad Code – Snippets from production

True, false … or maybe?

Lets start with something on the funny side. Some of the code that I have come across in our system.

boolean isFutureDeal = Boolean.parseBoolean(request.getParameter("loadFuture"));
if(isFutureDeal != true){
       isFutureDeal = false;
}

If it’s not true, set it to false. I guess this is to make sure it isn’t set to that other boolean value – true, false, maybe.

Give me that.  What?!  I don’t need that.

public ShoppingCart createNewCart(ShoppingCart cart, HttpSession session,
                     MiniUserBean member) throws Exception {

              if (cart == null) {
                     session.setAttribute("shoppingCart", new ShoppingCart());
                     cart = (ShoppingCart) session.getAttribute("shoppingCart");
                     cart.setCurrentRedemptionType(member.getRedemptiontype());
              }
              cart = (ShoppingCart) session.getAttribute("shoppingCart");

The cart object is passed in, but the first thing we do is read it from the session, so why even pass it?

Day of the week calculation

[pseudo-code...actually seen in production]
if (date == "11/28/2016" || date == "12/5/2016" || date == "12/12/2016" 
     || date == "12/19/2016" ...) {
          dayOfWeek = "Monday";
} else if ( date=="11/29/2016"...

You get the idea.  There were about 1/2 year’s dates in the if statements.  I found out about this gem when a customer called and complained that their script was not setting the day of the week any longer.  This was in perl, and I replaced that huge block with a few lines that would forever compute the day of the week.

Code Coverage

We all want to achieve 100% code coverage.  If we have 100% of our code executed in tests, we can feel confident that we are testing our code well. Right?

Not so fast!

We wanted to share some examples of tests that, while they may boost your code coverage, aren’t testing much.  Below are some examples we have actually seen (code has been changed to protect the guilty).

Trust, but verify

This is a code snippet using Mockito.

public void testSomething() {
   classInTest.doStuff();
   verify(classInTest).doStuff();
}

For those of you unfamiliar with what verify does in Mockito…it will pass if the method in question was called.  That’s right…this test is verifying that we called the “doStuff” method…which we did in the previous line.

Unfortunately for us, our code coverage tools will look at this and see that we executed lines inside the doStuff method and count them as covered, even though we didn’t actually test anything!

Preparation is key

public void testStuff() {
  when(thing.doStuff()).thenReturn("hi");
  when(otherThing.doStuff()).thenReturn("there");
  .... lots more setup ....

  classInTest.methodInTest();

}

That’s right…they took the time to do a whole bunch of setup and even called the method in test…but forgot to actually test anything!  There are no asserts or verifies.  Again, our helpful code coverage will show that lines inside of methodInTest were covered since they were executed in a test, even though we didn’t test a single thing.

Green means Go

public void testTest() {
  ... improper setup ...
  ... call method ...
  ... assert the wrong thing ...
}

Sometimes we make mistakes in our setup code and when we run the test, it fails, even though we are sure that the code is working correctly.  That leaves us with two options.  1) Figure out why our test is failing…step through, debug, print stuff out, etc.  See if it is a bug in the production code that we hadn’t noticed or if there is an issue with our test.  OR 2) Change the expected values in our test.

Option 2 is obviously the easier answer.  Yes, we’ve seen this.  No, this is not correct.  Tests are supposed to give you confidence that your code is working correctly.  It can also act as a sort of documentation for future developers.  Do you want future coders to look at the test and think that what you are testing is expected behavior?

What have we learned?

Code coverage is not actually telling the whole story.  As we have seen above, if you have poorly written tests lines of code covered don’t mean anything.  Anyone can be clever and find ways to execute more code in their tests…the smart programmer figures out how to actually test the code.

Please be vigilant in writing your tests and actually ensuring that your code is doing what you expect it to.  When doing code reviews, if you see any blunders like we show above, or other not so great tests, please help the other person write better tests.  You, your team and your code will be better off for it.

photo credit: <a href=”http://www.flickr.com/photos/62975503@N04/8227882239″>Facepalm</a&gt; via <a href=”http://photopin.com”>photopin</a&gt; <a href=”https://creativecommons.org/licenses/by/2.0/”>(license)</a&gt;

 

Design Principle: Encapsulate what changes

Principle

Encapsulate or isolate the parts of your system that change more often.

We can accomplish isolation in many different ways.  For example, we can simply separate out the parts of the code (or objects) that are likely to change into separate classes.

Another option that comes to mind is configuration files (or any type of external configuration).  These can include data (like a copyright notice) that might change depending on the deployment, as well as feature switches, or information on which style of a class to load.

Concerns

Shouldn’t I keep the behaviors of an object together in one class, rather than separating it out?

Yes.  However, isolating the parts that change doesn’t need to mean that your object doesn’t define what behaviors it has.

We use helper methods all of the time.  Sometimes it is to perform a calculation that we need to perform in many different places.  When we use a helper class to do some of this work, we will define a method on our object of interest, and have it simply call through to our helper.

For Example:

public class InterestingObject {
   private HelperObject helper;
   public InterestingObject(HelperObject helper) {
      this.helper = helper;
   }

   public void interestingFunctionality() {
      helper.interestingFunctionality();
   }
}

public class HelperObject {
   public void interestingFunctionality() {
      // some interesting functionality here
   }
}

When working with our InterestingObject, we now know that it has interestingFunctionality.  When working with that object, we don’t need to care that there is a helper object that is implements that functionality.

Why does it matter if I change some parts of a class more often than other parts?

First of all, making modifications to a class violates the open/closed principle which states that we should have our classes open to extension but closed to modification.  Granted, it is not always possible to follow this principle.

Imagine you have a class that has behaviors.  In that class, you will likely have some instance variables that you share in several methods.  Now, imagine that you need to change the data structure for one of those variables because you need it in a more convenient form for a new calculation.  In order to switch that data structure, you will need to touch all of the methods that use that data.  If, however, that method lived in a different object it would be free to use a new data structure to do the computation, as long as the API remained the same and it could still return the same type of object/data structure.

What I am getting at with all of those words is that methods inside a class can often be tightly coupled with one another, and that is ok.  However, when you need to touch one, you often have to touch others.  If you have those methods in separate classes, you can often get away with making a change just to the methods in question without having to touch the calling code.

What is all of this work really buying me?

Ideally this is buying you a safety net.  It is allowing you to ensure that the changes you make are only going to impact a small piece and only the piece that you intend.

If you have pulled out some things into a configuration file or table, you can often make changes to your application without even having to rebuild it.  Just make a change to the configuration and restart and viola!

 

How thin is too thin?

I recently had a conversation with a coworker about where the logic belonged in a piece of code.  I felt that his dao* (data access object) layer had too much logic.  I suggested that the objects that were communicating with the database should only have the queries needed to get the data from the database, and any business logic should be in a different layer (likely a service layer). My colleague said that he had done that before and he thought that made the dao layer too thin.

I let the comment slide, and let the code slide (though I am regretting that decision).  But, that got me to wondering if there is such a thing as a layer being “too thin”.

tomatoes
This, by the way, is probably too thin.

The only example I can think of where I might say that a layer was too thin was if it were just a wrapper around another object and it wasn’t really serving any purpose, it just called through to the object it wraps.  However, if the wrapper was serving a purpose (like wrapping a utility object that was all static methods so we can mock it, or wrapping a third party library that we want to encapsulate to make it easy to change out), then I wouldn’t have an issue.

In my mind, I try to think of each layer as a different level of abstraction.  Our controllers have the highest level of abstraction.  Reading the code there helps you understand the high level tasks we need to do (like, get some data, set some values and return some data).

The next level down is the service layer.  This contains business logic  It would contain the code that sets values, or does some calculations that we need to return to the user through the controller.  This layer is more detailed, but will often use other classes to assist.

The last level is the dao layer.  This is where we are getting our data, either from a database or file, or magic 8-ball.  This has nitty-gritty details of how to get the data, but contains NO logic for converting the data in any way.  It can contain queries that allow us to bring back only the data we are interested in.  Those queries can do things for us like sorting our data, etc.  But, the dao layer is simply in charge of executing the queries and returning the data to the service layer.

Do you have any examples where you have seen a layer and thought it was too thin?

*A data access object is any object that is used for accessing data, such as code to talk to a database, or read from a file.  It is a commonly used Design Pattern.  Look forward to a future post about this.

Keep abstractions at same level

During a code review I noticed that the developer had added an equals and hashcode method to an existing object.  We were not comparing these objects anywhere so we didn’t need the equals defined.

This caused me to dig a little further to understand why he added the equality code.  After a short conversation, it turned out that he added the equality methods in order to test code in another class.

Here is what his code was doing (methods changed to protect the innocent):

public class Stuff {
    private Jsonifier jsonifier;
    private Dao dao;
    public String getSomethingAndConvert() {
        List<Stuff> things = dao.getStuff();
        List<ConvertedStuff> convert = new ArrayList<ConvertedStuff>();
        for(Stuff thing: things) {
            converted.add(new ConvertedStuff(thing));
        }
        return jsonifier.jsonify(converted);
    }
}

When testing this method, since he was creating a bunch of new objects, he needed to have equals and hashcode defined on the ConvertedStuff object.  I suggested that he change it to this instead:

public class Stuff {
    private Jsonifier jsonifier;
    private Dao dao;
    private Converter converter;
    public String getSomething() {
        List<Stuff> things = dao.getStuff();
        List<ConvertedStuff> converted = converter.convert(things);
        return jsonifier.jsonify(converted);
    }
}

public class Converter {
    public List<ConvertedStuff> convert(List<Stuff> stuffToConvert) {
        List<ConvertedStuff> converted = new ArrayList();
        for(Stuff thing: stuffToConvert) {
            converted.add(new ConvertedStuff(thing));
        }
        return converted;
     }
}

Now the test for the method getSomething() in the Stuff class we can easily test without needing an equals and hashcode.  We can simply mock out the converter.convert method and see that the method is working correctly.

It seems we have just moved our problem to a different class.  Don’t we still need to have an equals and hashcode method on ConvertedStuff?  Well, no.

public void testConvert() {
    Stuff thing1 = mock(Stuff.class);
    thing1.setName("name");
    List<Stuff> stuffToConvert = Arrays.asList(thing1);

    List<ConvertedStuff> converted = converter.convert(stuffToConvert);

    assertEquals(1, converted.size());
    assertEquals("name", converted.get(0).getName());
}

We could have just written our test around the stuff.getSomethingAndConvert method like we see above.  That would have allowed us to get rid of the equals and hashcode. However, the issue I saw wasn’t really about the equality methods.  That is simply what tipped me off to a possible larger design issue.

We were doing too many things in the original method and we had multiple levels of abstraction in one method.  With the refactor, we are keeping the methods at the same level of abstraction.  This makes reading the code much easier.  We don’t have to look at the details of how we convert stuff to converted stuff, we just know that we do it.  If we need the details, we know exactly where to go.

 

The Coding Interview

 

What am I evaluating when I have a candidate do a pairing exercise with me during an interview?

Communication skills:  Communication is one of the most important skills a programmer has.  Of course, we all need to be able to solve a problem, but in order to identify that problem in the first place, we need to be able to communicate with our customers to understand it.

Most of us are not the only programmer in-house, so we need to be able to talk to other programmers about the code we are working on or the problem we are trying to solve.  We need the ability to clearly communicate our intentions through our code as well.

By doing a pairing exercise with the candidate I can get a pretty good feel of many aspects of their communication skills.  I get to see the code that they write to see if they are clearly communicating their intentions through the code (good names, good method decomposition, etc).  I also get to play customer and clarify the requirements for them.  This allows me to evaluate how well they can understand requirements and ask clarifying questions.

Lastly, I get to see how well they can talk about the code they are writing as a technical peer.  Are they able to articulate what they are trying to accomplish in the code?  Are they able to articulate why the approach they are taking is giving them troubles, or how they intend to break the problem down into manageable chunks.

Code quality:  This is the no-brainer answer.  Yes, I am interested in seeing how you code.  I want to see if you can produce quality, readable, maintainable code.

I am looking at things like naming (for variables, methods and classes), at method decomposition, simple formatting.  I am looking to see if the candidate will start with tests.  Does he/she at least write testable code?

What about comments?  If he/she comments, are they useful comments, or is it simply reiterating what the code does?  Is there any dead code remaining?  Does he/she refactor the code?

Algorithmic thinking: By giving the candidate a problem that does not have a readily apparent solution, I get to see how the candidate thinks through a tricky problem.

Does the candidate just freeze up?  Or is he/she able to break the problem down into smaller chunks?  If so, do these chunks seem like a logical way to break down the problem?

Ability to take feedback: While doing a coding exercise, I will typically either ask them to change how they are approaching a solution, or help advise them on a way to achieve their goals.  This is typically intended to allow me to see how well the candidate will incorporate feedback.  (I have seen candidates who refused to take any feedback given to them, even though they were struggling with achieving their goals).

Whether I am working in a pairing environment or not, I believe that multiple sets of eyes should see the code.  I encourage code reviews, and I expect each person to incorporate some, if not all, of the feedback given in such reviews (at the very least, the feedback needs to spark a conversation).