Archive for September, 2009

Using Polymorphism instead of Conditionals

Interviewing nowadays with tech companies has become run of the mill. You have phone screens, then you are brought on for on site interviews. And in each one of these, you are asked one mind-bending, off-ball algorithm question after another. I personally have been on both sides of the interview table, having asked and been asked my fair share of these. And after a while, I started questioning whether these questions provided any insight into how interviewees think other than their knowledge of algorithms.

It was then that I decided I wanted to try and find out if the candidates really understood polymorphism and other concepts, rather than their knowledge of algorithms, since every other interviewer would be covering that. And that was when I stumbled upon this gem of a question, which also underlies a fundamental concept of object oriented programming.

The question is simple. “Given a mathematical expression, like 2 + 3 * 5, which can be represented as a binary tree, how would you design the classes and code the methods so that I can call evaluate() and toString() on any node of the tree and get the correct value.”. Of course, I would clarify that populating the tree was out of the scope of the problem, so they had a filled in tree to work with. It also gives me a chance to figure out how the candidate thinks, whether he asks whether filling in the tree is his problem or whether he just assumes stuff. You could preface this question with another about Tree’s and traversal to check the candidate’s knowledge and whether this one would be a waste of time or not.

Now, one of three things can happen at this point. One, the candidate has no clue about trees and traversals, in which case there is no point proceeding down this line. Second, which seems to happen more often than not, is the candidate gives a class and method like the following :

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
class Node {
    char operator;
    int lhsValue, rhsValue;
    Node left, right;

    public int evaluate() {
      int leftVal = left == null ?
          lhsValue : left.evaluate();
      int rightVal = right == null ?
          rhsValue : right.evaluate();
      if (operator == '+') {
        return leftVal + rightVal;
      } else if (operator == '-') {
        // So on and so forth.
        // Same for toString()
      }
  }
}

Whenever I see code like the example above, it just screams that whoever is writing it has no clue about how to work with Polymorphism. While I agree that some conditionals are needed, like checks for boundary conditions, but when you keep working with similar variables, but apply different operations to them based on condition, that is the perfect place for polymorphism and reducing the code complexity.

In the above case, the biggest problem is that all the code and logic is enclosed in a single method. So when a candidate presents me with this solution, the first thing I ask is what happens when we need to add another operation, like division or something. When the prompt answer is that we add an if condition, that is when I prompt and ask if there is not a cleaner solution, which would keep code separate. Finally, often, you depend on third party libraries for functionality. Well, in those cases, you won’t be able to edit the original source code, leaving you cursing the developer who wrote it for not allowing an extensible design.

The ideal answer would, for this question, be that Node is an interface with evaluate() and toString(). Then, we have different implementations of Node, like a ValueNode, an AdditionOperationNode, and so on and so forth. The implementations would look as follows :

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
interface Node {
  int evaluate();
  String toString();
}

public class ValueNode
    implements Node {
  private int value;
  public int evaluate() {
    return value;
  }
  public String toString() {
    return value + "";
  }
}

public class AdditionOperationNode
    implements Node {
  Node left, right;
  public int evaluate() {
    return left.evaluate()
        + right.evaluate();
  }
  public String toString() {
    return left.toString() + " + "
        + right.toString();
  }
}

You could go one step further and have an abstract base class for all operations with a Node left and right, but I would be well satisfied with just the above solution. Now, adding another operation is as simple as just adding another class with the particular implementation. Testing-wise, each class can be tested separately and independently, and each class has one and only one responsibility.

Now there are usually two types of conditionals you can’t replace with Polymorphism. Those are comparatives (>, <) (or working with primitives, usually), and boundary cases, sometimes. And those two are language specific as well, as in Java only. Some other languages allow you to pass closures around, which obfuscate the need for conditionals.

Of course, one might say that this is overkill. The if conditions don’t really make it hard to read. Sure, with the example above, maybe. But when was the last time you had just one level of nesting? Most times, these conditionals are within conditionals which are within loops. And then, every bit of readability helps. Not to mention there is a combinatorial explosion of the amount of code paths through a method. In that case, wouldn’t it be easier to test that the correct method is called on a class, and just test those individually to do the right thing?

So next time you are adding a conditional to your code, stop and think about it for a second, before you go ahead and add it in.

, , ,

12 Comments

Testing function vs testing implementation

Often I have got complaints from developers that I work with that their unit tests are prone to breakages, or they don’t like writing unit tests because their code changes frequently, which causes them to change their tests as well. Its just extra overhead at that point, and starts being a chore. Atleast thats what their claim is. Now of course, I don’t agree with this at all. Not. One. Bit.

You see, when I hear this, its always tells me that there is something wrong with the way tests are written. A unit test that requires changes every time someone changes the code implies that there is a extremely strong coupling between how the code is written to how its tested. Some useful indicators of such a thing could be having a getter methods or properties which are visible only for tests, but not to external code. Or Tests which check if a loop happened 6 times or a mock was called 17 times. Sure, these assert that the function is working as intended, but say you optimize and reduce the recursion or method calls, then you need to go and update your expectations.

Of course, some of this is unavoidable when you are working with classes that have mocks injected into them. But in such a case, unless it is plain delegation, there must be some logic that must be happening. That should be the target of your tests, not the mock delegations. Usually, when I work with mocks, I have a few tests to make sure the right methods are getting called, and only if there is logic, I test it further. Otherwise, 1 or 2 tests and then I go and test the implementation of the mocked class to make sure it works under all conditions.

So lets consider a run of the mill binary search method that would be tested with mocks (A little bit contrived, but bear with me on this) :

public int binarySearch(List<Integer> items, int itemToFind, int low, int high) {
    // Do the needful, in a recursive fashion 
}
// A Brittle test
public void testUsingMocks() {
  List<Integer> list = mockery.mock(List.class);
  mockery.checking(new Expectations() {{
    oneOf(list).size(); will(returnValue(3));
    oneOf(list).get(1); will(returnValue(6));
  }});
  assertEquals(1, binarySearch(list, 6, 0, 2));
}

Now, while a bit contrived, this is a familiar sight when mocks are used to test. Or it might happen that to check the correctness of the algorithm, the indices at which the split happens is stored in a list, and verified in the test. These are the kind of whitebox tests that make unit tests brittle. And the more of them there are, the harder it is to maintain or refactor code. Rather than testing it with for some use cases and boundary conditions, this is testing whether the algorithm itself is correct. Useful for some particular cases, but normally not required unless you are developing algorithm.

I would argue that its rare to write these kinds of tests if you write your tests before you write the methods. With a TDD, you just write your expectations, what you expect to give the method and what you expect out. You then write your code to get it to pass, and you might use internal variables or logic which the test really doesn’t care about. These tests are durable and hold up to refactorings, and even give you a nice safety net. There are times when these end up becoming integration tests rather than unit tests, but I still believe that they deliver more bang for the buck.

Of course, when you start testing edge cases, you do end up getting mostly a code dependent white box test, and those still are fine since they are supposed to be edge cases, which shouldn’t change that often. Though the fact that there are conditionals usually signifies that there is a polymorphic object hiding in there. But thats a blog post for another day.

, , ,

1 Comment

Software Engineering vs Software Artistry

I never expected my last post about whether Inheritance was needed or could be done away with to spark such a furore. But spark a furore it did, especially at the Java DZone lobby. Maybe it was the inflammatory nature of the title (which could have been a tad bit exaggerated :) ), but whatever it was, it sure didn’t stop the flames. From being called “incredibly naive” to “nonsense” to even losing Dzone some subscribers, no stone was left overturned.

But what did surprise me at the end was the fact that for every dismissal of the idea, there was a proponent who understood my reasoning behind it. And of course, there were the people in the middle who would only say, “Depends on the situation.” And surprisingly, I agreed with a lot of their point of views. But that in turn led me down a line of thinking which led to this post.

Who is an engineer? In every other field other than computers and software, and engineer is one who uses scientific methodologies and time proven concepts to design and implement constructs / processes which reliably and safely perform specific tasks. Look at electrical engineering, or aerospace engineering. These guys consistently develop hardware (planes!!) which work. Every! Single! Time! No bugs, no defects. I mean, can you imagine a plane in the middle of the flight, and suddenly there’s a bug in the landing gear? Shudder….

These guys follow some tried and tested techniques. There’s probably lore that every engineer depends on to create his next system. Passed down from generation to generation of what works and what shouldn’t be done. Same with civil engineering, there are no two ways to construct, say a building. Sure, you might differ in how it looks and what materials you use, but the base work of, creating a frame, etc remains the same (Then again, I have no clue what does go into buildings). The probability of a bug, or building a system that the next person in finds it impossible to maintain, are far less from what I have heard (and I will admit that this is based on hearsay).

Now, an artist, on the other hand, is usually defined as someone who expresses themselves through a medium. Interestingly though, the oxford dictionary has one of the definitions for an artist as “A follower of a pursuit in which skill comes by study or practice – the opposite of a theorist“. Now what does that remind you of? Exactly, engineering. To an extent, artistry is engineering, except note that in artistry, while the basics might be the same, the end results are usually unique. There still is no defined methodology or “steps you follow to perform ABC”. You work with what you have in the best possible way you know and you churn out something that may or may not be what you desired.

Now where do we fit as software engineers? We have some lore, some history of tried and true practices. We have design patterns, we have team practices like Agile, XP, etc. And we almost have an algorithm for everything. Its almost like an Apple Iphone ad, “You need to search a graph? There’s an algorithm for that.” But when it comes to implementation and combining all these into a single product, there is so much divergence. Two people, given the exact same set of requirements, will come up with two almost completely differing solutions. And I’m not talking about just names. The architecture, the design patterns used, the way services are split up. And both of these may completely satisfy the requirements. Or they may end up being epic disasters.

What I’m saying is, there is no guaranteed recipe for success like in other fields of engineering. It is completely feasible to dig yourselves into a hole even while applying commonly known solid techniques. You might argue that this happens in other fields of engineering as well, like say, Boeings new 787 which has been delayed so many times. But to that, I say that they were trying to stretch the boundaries and innovate, and create something new. That rule applies to any engineering discipline, when you try to go above and beyond what currently exists.

But when you are creating run of the mill apps, like a Configuration system or a database data displayer, those should, by now, be trivial. But they aren’t. I know groups which spend more time and effort developing these than should be required. And finally developed, these turn into nightmares when you want to update them or add new features. You might say, “Well, I never do that.” To that, I say, sure, but remember the last time you moved onto a project with a legacy code base? Remember how that felt? Well, someone who was well-meaning, just like you, developed that disaster.

So are we Software Engineers or Artists? At the end of the day, it doesn’t matter what we are as long as the job gets done, but you would think we would finally start narrowing down on some concepts that can be universally agreed upon. Most software solutions I see as the end product are usually works of art. I have no clue how they made it, no clue how it works, but its beautiful nonetheless (or ugly, if that is how your artistic tendencies lie). Maybe we will be closer to being engineers in another 100 years? After all, civil engineering has been here for quite some time now.

, , , ,

5 Comments

Using Tracker for Agile projects

I first got exposed to the Agile methodology when I took a “Thinking in Agile” course. It was a two day course, and they walked you through what it meant to be agile, the processes involved, etc. But thinking back, I fell into the same trap I did back in college. It was mostly talk, and not enough action. I learn by doing, and ipso facto, nothing really stuck. One and half years later, still at google, I became part of an internal project, which needed a kick start. And lo and behold, we decided to subscribe to the agile philosophies and not half ass it as many do. And we decided to use Tracker to manage the project.

The basic ideologies of Agile (and Tracker) are as follows. You work in short iterations (preferably a week or two). Every iteration, the entire team gets together to talk about what got done, and what is up next. You interact directly with the customer or his proxy, who gives you things to do in the form of stories. For a company search app, a story might look something along the lines of “As a product owner, I want to be able to search for a particular employee, so that I know what he works on.” Notice that this story is very well defined, especially for you as a developer. It tells you the target audience, what functionality is needed and why. The why is important because sometimes (not always), you might be able to think of a better way to do something, in which case you can pipe up with said suggestion. The example project below ignores that suggestion and I didn’t spend time to change the demo to meet my suggestions.

Tracker screen

Tracker screen

Now tracker allows you to add and move stories around. You can move a story from the Icebox (which is where stories end up by default) into the Backlog, which are the things you need to work on. You can arrange them to order them by priority (and it is all immediate, so someone else with tracker open can see those changes as they happen). Now finally, during your iteration meeting, you sit as a team, and estimate how much each story is worth.

The most important point here is that clients are the only ones allowed to request stories (though you can probably figure out exceptions if you really think that some chores should get points). Customers get to decide what stories need to happen, and what order they need to happen in. Nothing more, nothing else. Now as developers, your role is to estimate how long it will take for the items in order. You can start off with some estimates like, “A point is one days work.” initially, when you are new. You can play planning poker to estimate, which is a fun little activity in itself to make sure no one is influenced by anyone else’s estimates. Now what tracker will do is it will measure how many stories you end up actually delivering per week, and calculate what we call your Average Velocity (pointed out in the screenshot above). This basically denotes how much tracker thinks you can do in the next iteration, assuming that your estimate base remains constant.

Another important thing to note, that only things that the customer cares about, or is customer facing, should be stories. These are thing the customer can see when delivered and brings value to him. So that refactoring you need to do to come out of that hole you dug for yourself? Guess what, its a chore and does not contribute to your velocity. Want to move database schemas to make it easier for yourself? Fixing that bug you introduced in trying to rush through all the stories? The customer doesn’t care (well he probably cares about the bugs, but you are not delivering value, you are cleaning up after your own mistakes), so no cookies nor any points for you.

So a basic flow for an iteration (after estimations and planning poker) is as follows ;

  1. Customer / Product owner prioritizes stories / chores / bugs in the backlog
  2. Tracker looks at average velocity and figures out how many it can squeeze into the next iteration
  3. Developers click start on a story / task when they start to work on it.
  4. They click finish when they are done implementing on it, but do not click on Deliver
  5. Pushmaster / Release Engineer clicks Deliver when those changes are pushed to a customer visible place
  6. Customer gets to try out each story, and then can decide whether it meets specifications, and decides if he wants to accept / reject.

Rinse and repeat, and you have a great way of managing requirements, release plans and so much more. You can figure out who’s working on what, you also get a great host of charting, including burndown charts, charts which allow you to figure out where you are spending the majority of time (whether stories, bugs, chores, etc). Example chart below :

An example burndown chart from Tracker

An example burndown chart from Tracker

Did I mention Tracker is currently free to sign up and start using? Regardless of whether you are a product manager who wants to keep his project in line, a developer interested in using agile practices, or just plain curious about what this thing is all about, Tracker has something for everyone. So, what are you waiting for, a personal invitation? You don’t need to be an Agile team to try this out for yourself.

, , , ,

2 Comments