
Facebook’s next big opportunity : Analytics

In my current term at ISB, I am taking a course on “Leveraging Social Media and Analytics”. It is an awesome and interesting course, and it includes a project where we take a deep dive into one company’s Google AdWords, Google Analytics and Facebook Ads data.

Now Google Analytics is brilliant at letting users see who is visiting their websites, where they are coming from, what they do there, etc. It’s a very powerful tool, especially since it integrates so well with Google AdWords. Together they give Google a great one-two punch, and that is their big selling point.

Now enter Facebook, with their Ads. The biggest thing missing from FB’s point of view is data on how useful their ads are, how many conversions you get, etc. It is still possible to figure this out by correlating FB’s ad data with Google Analytics, but it remains a huge pain point for both FB and FB’s users. So why doesn’t Facebook offer something like Google Analytics?

Well, you might say, Google Analytics is the biggest one out there, and people are required to put a code snippet in their websites to track usage, and they won’t do it twice or won’t want the hassle.

But think about this. Facebook already has their code snippets in most websites, through their Like buttons, Share to Facebook buttons and who knows what other buttons. All it takes is for them to include their tracking and analytics code snippet as part of these buttons. Suddenly, you realize that their tracking code could already be present in a gazillion odd websites, ready for analytics.

All Facebook needs to do is turn it on, and link to Facebook Analytics and voila : Facebook Analytics could have a huge installed base right off the bat!

Now this is all out there, but just a thought I had. Crazy? Logical? What do people think?


APIs and what not to do

APIs seem to be like opinions. Everyone has one, and no two people have the same concept of what constitutes a good one. An API is supposed to be an interface that is exposed for other programs or programmers to use to interact with your code. Except, each API, like an individual, is unique with its own flaws and niceties. A great API is one which reduces the amount of code you have to write when you use it. I personally feel amazing if I can get something done with minimal code. That just screams “GOOD API” to me.

On the other hand, a bad API leaves you feeling dirty, unclean even, as if you are committing grave sins against nature even by just using it. Here are a few common mistakes which end up leaving that bad taste in your mouth (with examples, of course!) :

Bad APIs

These are the worst offenders, the APIs which are supposedly there to make your life easier, but just end up making it more work to use it than rewriting it from scratch. I faced one of the bigger offenders of this one recently when I was working with GWT. I was trying to create a tree structure to represent a navigation hierarchy when it dawned on me.

A GWT Tree is created by creating a Tree object, and then creating a tree item for each node. To append children to each node, you create further tree items and add whatever text or elements you want to it. So to summarize, even if I have a data structure to represent my tree (which in most cases, I do), I will have to traverse it manually, create tree items, tell each one how to render itself and then append it to the correct items. Yuck.

Now consider how JFace creates a Tree (which I consider much more powerful and a nicer API altogether). You create a TreeViewer, set its data source / input. Then, you set a content provider which knows how to traverse your data object and get children / parents. You can also set a LabelProvider which tells it how to render its data elements. End result? Nice clean code that I actually feel satisfied about.
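To make the contrast concrete, here is a minimal sketch of the provider idea in plain Java. The ContentProvider, LabelProvider, NavNode and TreeRenderer names are made up for illustration and are not the JFace API; the point is only that the tree code asks the providers how to walk and label the model, so nobody builds tree items by hand. It assumes Java 8 or later for the lambdas.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interfaces that mimic the provider idea; NOT the JFace API.
interface ContentProvider<T> {
    List<T> childrenOf(T node);
}

interface LabelProvider<T> {
    String labelFor(T node);
}

// A made-up domain object for a navigation hierarchy.
class NavNode {
    final String title;
    final List<NavNode> children = new ArrayList<>();

    NavNode(String title) {
        this.title = title;
    }

    NavNode add(NavNode child) {
        children.add(child);
        return this;
    }
}

class TreeRenderer {
    // Walks the model through the providers and renders one indented line per node.
    static <T> String render(T root, ContentProvider<T> content, LabelProvider<T> labels) {
        StringBuilder out = new StringBuilder();
        renderNode(root, content, labels, 0, out);
        return out.toString();
    }

    private static <T> void renderNode(T node, ContentProvider<T> content,
                                       LabelProvider<T> labels, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(labels.labelFor(node)).append('\n');
        for (T child : content.childrenOf(node)) {
            renderNode(child, content, labels, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        NavNode root = new NavNode("Home")
            .add(new NavNode("Blog").add(new NavNode("Opinion")))
            .add(new NavNode("About"));
        // The caller only says how to find children and how to label a node.
        System.out.print(render(root, n -> n.children, n -> n.title));
    }
}
```

The data structure stays whatever it already was, and the rendering code never needs to know about it beyond the two small providers.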

Most of these are the end result of rushed or poorly thought out design. Having a concrete use case prior to designing it should have been enough to scream out “It’s ugly!!!”. Suggestion to prevent this : write a test / use case for anything you start designing, so you can get a feel for how it works in action. That should help you avoid a lot of these.

Not fully thought out APIs

This one is similar to the previous one, but I think it deserves a section and example of its own. This happens when you almost nail the API, but fail to consider some common uses of the API. The biggest offender of this one, I believe, is the Java List API.

The two most common use cases I have in Java when I work with lists are
1.) Iterating through them to perform some operation and
2.) Filtering the list to get a subset

The second operation is so common that I get annoyed now that I have to create an empty list, iterate through each one using a for each and conditionally add elements to the new list. Now I realize that Java doesn’t make it easy to pass in functions (check my older article about this) as arguments, but what I really really want here is the ability to do myList.filter(predicate) where predicate is a predicate function I decide, which returns the filtered list with elements matching the predicate.
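As a rough sketch of what that could look like, here is a small static helper. It assumes Java 8 or later, where java.util.function.Predicate and lambdas are available; the ListFilters class and its filter method are hypothetical names, not part of the List interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class ListFilters {
    // Returns a new list holding only the elements that match the predicate.
    static <T> List<T> filter(List<T> items, Predicate<? super T> predicate) {
        List<T> matches = new ArrayList<>();
        for (T item : items) {
            if (predicate.test(item)) {
                matches.add(item);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        // The caller decides the predicate; the loop lives in one place.
        List<Integer> evens = filter(numbers, n -> n % 2 == 0);
        System.out.println(evens);
    }
}
```

The empty list, the for each and the conditional add are still there, but they are written once instead of at every call site.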

There are many more such common operations missing on the List interface, but this is the most egregious one, I believe. Javascript also gets this wrong, but underscore, a JS library, adds a lot of this, which makes working with lists and collections a dream.

Misnamed APIs and methods

How many times have you called a method, only to realize that it didn’t really do what you thought it did? Or looked for a method XYZ, only to realize later that it had been named YXZ instead? Raise your hands if you have experienced this. For some reason, an apple for someone almost always turns out to be an orange for someone else.

I’ll switch to bashing on JS for this one, underscore in particular. For all the amazing methods that underscore provides in JS, they really have a problem with naming. I ended up looking for a collection.contains method, and ended up finding only indexOf, so I initially assumed that they didn’t have it. I mean, if I look for contains, at best, I will also look for has, hasKey. Browsing through the list of method names, I might have even accepted includes (though it would not have been my first choice). But never in all my life would I have expected it to be include (Yes, that is include, as in singular!). People, what were you thinking????

Liars

The final set of APIs which can annoy (but are easily worked around, just like the previous section) are APIs which lie. These include APIs which don’t do what their function names suggest (no obvious example from the open source land comes to mind, thankfully). The other kind is one where the object is not ready for use even after it has been created. Most times, it is the case of a lurking init / initialize method. And if you ever see an interface called Initializable, run in the opposite direction.
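A small sketch of the lurking init problem, using made-up classes (LazyReport and ReadyReport are hypothetical, not taken from any real library). The first one hands you an object that is not usable until a second call is made; the second one is done as soon as the constructor returns.

```java
// Hypothetical class with a lurking init step.
class LazyReport {
    private String title;

    LazyReport() {
        // The object exists after this, but it is not usable yet.
    }

    void init(String title) {
        this.title = title;
    }

    int titleLength() {
        // Throws a NullPointerException if init() was never called.
        return title.length();
    }
}

// Hypothetical class that is fully set up by its constructor.
class ReadyReport {
    private final String title;

    ReadyReport(String title) {
        this.title = title;
    }

    int titleLength() {
        return title.length();
    }
}

class InitDemo {
    public static void main(String[] args) {
        System.out.println(new ReadyReport("Q3").titleLength());
        try {
            new LazyReport().titleLength();
        } catch (NullPointerException e) {
            System.out.println("forgot to call init()");
        }
    }
}
```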


Is Strong Typing really needed?

This is something I have been struggling with for the last few months. I have had people argue ardently that all strong typing is good for is false comfort and lots of unneeded typing. But I was strong. I was undeterred. I dismissed this as the crazy rants of those JS developers, those dynamic language people who believe that obfuscation and compactness are everything, even at the cost of maintainability. I mean, how could a language where you don’t even know what is getting passed in be in any way better than one where the APIs are explicit and stop you from making mistakes? A dynamic language could work for a single developer, but definitely not for a team. That was my wholehearted conclusion.

Now, I’m not so sure anymore. It’s been 3 weeks since our team made the wholehearted switch. Has it been roses and sunshine? No. But it hasn’t been as bad as I expected it to be. And there are a few reasons for that. But before that, I’ll lay down the pros and cons the way I see them from my (assuredly very limited) experience :

Benefits of Strong Typing :
  1. Errors / Warnings in your editor
    Simply put, this might just be the single greatest benefit of strong typing, and the single reason why most Java developers (a lot of them rightly so) will never even consider leaving the safety of strong typing. While compilation support doesn’t necessarily go hand in hand with strong typing, most people tend to associate Java with it, so let’s run with that. With strong typing, your editor can (and should, I mean, if you are not going to get immediate feedback, what’s the point?) tell you right away when you have messed something up, whether that is using the wrong variable name, calling a method that does not exist, passing it the wrong parameters, or using the wrong type of object.

    To a Java developer, an IDE like Eclipse or IntelliJ is a godsend, as it tells you what is wrong in your world and lets you jump to the problems, gives you suggestions and autofixes and generally makes your life as painless as it can. And it is brilliant, I can tell you that.

    In Javascript (or any other dynamic language), everything is fine and dandy for the first 100 lines. After that, it becomes scarily unmanageable. The only way around this that I have found so far is to be super paranoid and write tests for every single line of code. If you can’t do that, stay far far away.

  2. Generics (but this is also a negative, in my opinion, which I’ll get to below)
    The idea behind generics is that they give developers some assurances about the types in a collection (or whatever it is you are genericizing). That way, all operations are type safe, without having to cast to and from different types. And you are assured that you will not be surprised suddenly by a different type of object popping up when you least expect it. But there are a lot of issues with them that I’ll cover in the second section.
  3. Ability to follow a chain and figure out what type of object is required at each step
    Now this is something I definitely miss in languages like Javascript and Python. The fact that I can trace (in my IDE, note that part) what the type of each variable / method call in an expression chain is simply amazing, especially when you are working with a new codebase. You never have to wonder what the parameter types of the method you are calling are. You don’t have to wonder what methods are available or visible. You just know this information (Again, assuming you are using an IDE. If not, god help you)
  4. Refactoring

    The biggest advantage of Strong typing though, in my opinion, is the ability to create IDEs which make refactoring a breeze. Renaming a method / variable? Trivial. Moving or extracting a method? Simple key combination. Stuff which can be extremely tedious and mind numbing are accomplished in a matter of minutes. (Want to know more about these shortcuts? Check out Eclipse shortcuts). This is simply not possible with languages like Python and Javascript.

Disadvantages of Strong typing :
  1. More concise and precise, less typing
    Dynamic languages do tend to be more dense, and it is much easier to accomplish in 10 lines what can easily take 50-100 in a language like Java, which is especially verbose. Consider trying to pass in a chunk of code to be executed at the end of a function in both Java and javascript (this is pretty common in web apps and task runners)
    Java :

    interface Function<T> {
        T execute();   // Optional parameters is not easy here :(
    }
    taskRunner.execute(taskArgument, new Function<String>() {
        public String execute() {
            return "Success";
        }
    });

    Javascript:

    taskRunner.execute(params, function() { return "Success"; });
  2. No badly implemented generics
    This is mostly Java’s fault for getting generics pretty badly wrong. The idea behind generics is sound, it’s the implementation that is horribly broken. Here are a few things which are wrong with it :
    Type erasure : This basically means that at runtime, there is no way to differentiate between, say, a List<String> and a List<Integer>. If you never work with reflection or Guice, then this might not be a problem. But it is also a pain with deeply nested generics and wildcards. I have seen compiling code which blows up at runtime because it cannot differentiate between a Provider<? extends Repository> and a Provider<? extends Resource>, even though Resource and Repository have nothing in common. Crazy….

    Verbosity : Map<String, List<String>> myMap = new HashMap<String, List<String>>();. Enuff said.

    Guice & Reflection : Generics and java.lang.reflect just don’t mix. They just don’t. Type erasure blows away all type information, so you are bound to be using stuff like new Entity<?> which totally defeats the purpose. And don’t get me started on Guice. In guice, normal bindings (non generic classes) look as follows :

    bind(MyInterface.class).toInstance(instance);

    With Generics involved, they now look as follows :

    bind(new TypeLiteral<MyInterface<String>>(){}).toInstance(instance);

    What the heck just happened there???

  3. Closures / Functions :
    Closures are a form of anonymous inner functions which can have an environment of their own, including variables bound to the scope of the function. The inner function has access to the local variables of the outer scope and can change state. What this allows is creating functions, as callbacks or for performing some quick little task in a repeated fashion, easily and quickly and pretty darn cheaply. Java has had a few proposals to add them (http://javac.info/) but none has passed the review committee yet. And probably won’t for the next few years. So till then, in Java, you are stuck creating interfaces, creating an implementation at runtime, passing in the variables you need access to through the constructor or some other mechanism, and generally being in a lot of pain. Thanks, but no thanks.
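Going back to the type erasure point in item 2, the effect is easy to see for yourself. This is a minimal sketch; ErasureDemo and sameRuntimeClass are names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Compares the runtime classes of two lists; the type arguments play no part.
    static boolean sameRuntimeClass(List<?> a, List<?> b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<String>();
        List<Integer> integers = new ArrayList<Integer>();
        // Both are plain java.util.ArrayList at runtime, so this prints true.
        System.out.println(sameRuntimeClass(strings, integers));
        System.out.println(strings.getClass().getName());
    }
}
```

Anything that inspects objects at runtime (reflection, a dependency injection container) sees the same class for both lists, which is why extra machinery is needed to carry the generic type along.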


What I miss in Java

So I finally got some time to sit down and write, after being knee deep in work the past month or two. And without a doubt, I wanted to write about what has been heckling and annoying me over the past month. I am an ardent defender of Java as a good language, especially defending it from Misko day in and day out, but even I will agree that it does suck at times. So today, Java, the gloves are off. I love you, but this is the way things are.

To give some context, I have been working on GWT a lot recently, and have done some crazy things with GWT generators (which I might cover in a few posts later). I love GWT, but for all of GWT’s aims to allow developing modern web apps without losing any of Java’s tooling support, there are a lot of things which are easier in Javascript. Let’s take a look at them one by one, shall we?

Closures (Ability to pass around methods)

So this was the straw that broke the camel’s back. I had this use case today where I wanted to set some fields through setters on a POJO. Simple enough right? Well, NO, because someone used defensive programming (Don’t get me started about precondition checks, that’s for another post) and so it threw a null pointer exception. Ok, since I can’t change the POJO (since it is in someone else’s code base), I needed to check for nulls on my side and not call the setter if the value was null. Simple enough, I do a check and call the method conditionally. Except when you have 10 odd properties, that’s a lot of conditionally crappy code.

Ok, so my other option is write a function which checks that, right? Except in java, you can’t pass around functions or closures. Ideally, I want to have a closure which takes a value and a function, and let the closure handle the null check and conditional calling. Something like :

callConditionally(myPojo.setValue, actualValue);

Except you can’t. Not in Java. I mean, I could create an interface to wrap it, but that just adds more boilerplate than necessary. I ended up creating a method which uses reflection to find the method by name and call it, but my point is that it shouldn’t be necessary. What should be two or three lines of code ended up being a 20 line monstrosity. And yes, before some smart aleck replies that if I wanted closures, I should go to Javascript, I will point out that there have been multiple proposals to include closures in Java, and Scala, which compiles to Java bytecode, supports closures as well.
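For comparison, here is roughly what the wished-for helper could look like once the language has method references. This sketch assumes Java 8 or later (java.util.function.Consumer and the :: syntax); Pojo and Setters are hypothetical classes written for the example, with a setter that rejects nulls like the one in the story.

```java
import java.util.function.Consumer;

// Hypothetical POJO whose setter rejects nulls.
class Pojo {
    private String value = "unset";

    void setValue(String value) {
        if (value == null) {
            throw new NullPointerException("value");
        }
        this.value = value;
    }

    String getValue() {
        return value;
    }
}

class Setters {
    // Calls the setter only when there is a non-null value to pass in.
    static <T> void callConditionally(Consumer<T> setter, T value) {
        if (value != null) {
            setter.accept(value);
        }
    }

    public static void main(String[] args) {
        Pojo myPojo = new Pojo();
        String missing = null;
        callConditionally(myPojo::setValue, missing);   // skipped, no exception
        System.out.println(myPojo.getValue());
        callConditionally(myPojo::setValue, "actual");  // setter runs
        System.out.println(myPojo.getValue());
    }
}
```

With 10 odd properties, that is one line per property and the null check is written exactly once.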

There are multiple JSRs and open source libraries which try to implement this for Java, and one of these days, I’m gonna give it a try. But for those interested, check out http://javac.info/ and http://code.google.com/p/lambdaj/. Both of them look promising.

Type inference and General Wordiness

They say a picture is worth a thousand words. Well, with Java, and especially with generics, it seems that even a simple declaration is at least a thousand words. For example :

Map<String, List<String>> myMap = new HashMap<String, List<String>>();

The above line could be so much shorter and sweeter as :

Map<String, List<String>> myMap = new HashMap();

There are very few cases when I would want a map of something else when I just declared it to be of a particular type. Other examples, like reading a file or working with regexes, abound, all of which require much more syntax than in other languages. And I definitely do miss being able to say

if (myValue)

instead of

if (myValue != null)

Sigh… And don’t even get me started with reflection. Reflection in Java is extremely powerful, but man is it wordy. Not only can you not recurse over the properties of an object directly (like say, in javascript), you also have to worry about exceptions (which I’ll get to in the next section)

Checked exceptions

That brings me to my last and biggest complaint. Checked exceptions in Java. They are just plain evil. I know people swear by them, and some of their arguments even make sense. Sometimes. But the fact remains that they make me write more boilerplate, more code that I don’t even care about, than anything else in Java. The idea behind checked exceptions is sound. It’s a great way to declare what the caller of a method needs to worry about. But the thing is, I should have an option other than rethrowing or logging it.

I did a very unscientific data gathering experiment of just looking at code randomly in different code bases (Codesearch was especially useful for this). And the majority of catch blocks I found either

  • Logged it using logger or System.err
  • Rethrew it as a wrapped exception

Me personally, I have changed Eclipse to generate all catch clauses for me by wrapping and rethrowing it as a RuntimeException so I don’t have to worry about adding a throws to my method declaration, when it is a non recoverable exception for the most part.

Furthermore, sometimes checked exceptions can even lead to clauses which will never ever be executed. Case in point :

try {
  java.net.URLEncoder.encode(myString, "UTF8");
} catch (UnsupportedEncodingException e) {
  // Can never be thrown, but I am forced to catch it.
  // Because it’s a checked exception!!!
}
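The wrap-and-rethrow habit mentioned earlier, applied to this same call, might look like the sketch below. Encoding is a hypothetical helper class; the only library calls are java.net.URLEncoder.encode(String, String) and the RuntimeException(String, Throwable) constructor.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class Encoding {
    // Wraps the checked exception so callers do not need a throws clause.
    static String encode(String text, String charsetName) {
        try {
            return URLEncoder.encode(text, charsetName);
        } catch (UnsupportedEncodingException e) {
            // Non recoverable for the caller, so surface it as unchecked.
            throw new RuntimeException("Unsupported encoding: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("a b", "UTF-8"));
    }
}
```

The catch block is still there, but it lives in one place, and the original exception is kept as the cause in case the impossible ever happens.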

There are many more cases like this, but I think this is enough of a rant for now.


Testing function vs testing implementation

Often I have got complaints from developers that I work with that their unit tests are prone to breakages, or they don’t like writing unit tests because their code changes frequently, which causes them to change their tests as well. It’s just extra overhead at that point, and starts being a chore. At least that’s what their claim is. Now of course, I don’t agree with this at all. Not. One. Bit.

You see, when I hear this, it always tells me that there is something wrong with the way the tests are written. A unit test that requires changes every time someone changes the code implies that there is an extremely strong coupling between how the code is written and how it’s tested. Some useful indicators of such a thing could be having getter methods or properties which are visible only for tests, but not to external code. Or tests which check if a loop happened 6 times or a mock was called 17 times. Sure, these assert that the function is working as intended, but say you optimize and reduce the recursion or method calls, then you need to go and update your expectations.

Of course, some of this is unavoidable when you are working with classes that have mocks injected into them. But in such a case, unless it is plain delegation, there must be some logic that must be happening. That should be the target of your tests, not the mock delegations. Usually, when I work with mocks, I have a few tests to make sure the right methods are getting called, and only if there is logic, I test it further. Otherwise, 1 or 2 tests and then I go and test the implementation of the mocked class to make sure it works under all conditions.

So let’s consider a run of the mill binary search method that would be tested with mocks (A little bit contrived, but bear with me on this) :

public int binarySearch(List<Integer> items, int itemToFind, int low, int high) {
    // Do the needful, in a recursive fashion
}

// A brittle test: it pins down exactly which list calls the search makes
@SuppressWarnings("unchecked")
public void testUsingMocks() {
  // final so the anonymous Expectations block below can refer to it
  final List<Integer> list = mockery.mock(List.class);
  mockery.checking(new Expectations() {{
    oneOf(list).size(); will(returnValue(3));
    oneOf(list).get(1); will(returnValue(6));
  }});
  assertEquals(1, binarySearch(list, 6, 0, 2));
}

Now, while a bit contrived, this is a familiar sight when mocks are used to test. Or it might happen that, to check the correctness of the algorithm, the indices at which the splits happen are stored in a list and verified in the test. These are the kind of whitebox tests that make unit tests brittle. And the more of them there are, the harder it is to maintain or refactor code. Rather than testing the method against some use cases and boundary conditions, this is testing the inner workings of the algorithm itself. Useful for some particular cases, but normally not required unless you are developing the algorithm itself.

I would argue that it’s rare to write these kinds of tests if you write your tests before you write the methods. With TDD, you just write your expectations: what you expect to give the method and what you expect to get out. You then write your code to get it to pass, and you might use internal variables or logic which the test really doesn’t care about. These tests are durable and hold up to refactorings, and even give you a nice safety net. There are times when these end up becoming integration tests rather than unit tests, but I still believe that they deliver more bang for the buck.
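Here is a sketch of what the durable version of the binary search test could look like. The recursive implementation is just one possible way to fill in the body that was left out earlier, and the Search class and check helper are hypothetical names; plain checks are used instead of a test framework so that the example stands on its own.

```java
import java.util.Arrays;
import java.util.List;

class Search {
    // Recursive binary search over a sorted list; returns the index, or -1 if absent.
    static int binarySearch(List<Integer> items, int itemToFind, int low, int high) {
        if (low > high) {
            return -1;
        }
        int mid = low + (high - low) / 2;
        int candidate = items.get(mid);
        if (candidate == itemToFind) {
            return mid;
        }
        return candidate < itemToFind
            ? binarySearch(items, itemToFind, mid + 1, high)
            : binarySearch(items, itemToFind, low, mid - 1);
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        List<Integer> sorted = Arrays.asList(2, 6, 9, 14);
        // Only inputs and outputs are checked, never how many lookups happened.
        check(binarySearch(sorted, 6, 0, 3) == 1, "finds a middle element");
        check(binarySearch(sorted, 2, 0, 3) == 0, "finds the first element");
        check(binarySearch(sorted, 14, 0, 3) == 3, "finds the last element");
        check(binarySearch(sorted, 7, 0, 3) == -1, "reports a missing element");
        System.out.println("all checks passed");
    }
}
```

Swap the recursion for a loop, or change how the midpoint is picked, and none of these checks need to be touched.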

Of course, when you start testing edge cases, you do end up with a mostly code dependent white box test, and those are still fine since they are supposed to be edge cases, which shouldn’t change that often. Though the fact that there are conditionals usually signifies that there is a polymorphic object hiding in there. But that’s a blog post for another day.


Software Engineering vs Software Artistry

I never expected my last post about whether Inheritance was needed or could be done away with to spark such a furore. But spark a furore it did, especially at the Java DZone lobby. Maybe it was the inflammatory nature of the title (which could have been a tad bit exaggerated :) ), but whatever it was, it sure didn’t stop the flames. From being called “incredibly naive” to “nonsense” to even losing DZone some subscribers, no stone was left unturned.

But what did surprise me at the end was the fact that for every dismissal of the idea, there was a proponent who understood my reasoning behind it. And of course, there were the people in the middle who would only say, “Depends on the situation.” And surprisingly, I agreed with a lot of their points of view. But that in turn led me down a line of thinking which led to this post.

Who is an engineer? In every field other than computers and software, an engineer is one who uses scientific methodologies and time proven concepts to design and implement constructs / processes which reliably and safely perform specific tasks. Look at electrical engineering, or aerospace engineering. These guys consistently develop hardware (planes!!) which works. Every! Single! Time! No bugs, no defects. I mean, can you imagine a plane in the middle of the flight, and suddenly there’s a bug in the landing gear? Shudder….

These guys follow some tried and tested techniques. There’s probably lore that every engineer depends on to create his next system, passed down from generation to generation, of what works and what shouldn’t be done. Same with civil engineering, there are no two ways to construct, say, a building. Sure, you might differ in how it looks and what materials you use, but the base work of creating a frame, etc. remains the same (Then again, I have no clue what does go into buildings). The probability of a bug, or of building a system that the next person finds impossible to maintain, is far lower from what I have heard (and I will admit that this is based on hearsay).

Now, an artist, on the other hand, is usually defined as someone who expresses themselves through a medium. Interestingly though, the Oxford dictionary has one of the definitions for an artist as “A follower of a pursuit in which skill comes by study or practice – the opposite of a theorist”. Now what does that remind you of? Exactly, engineering. To an extent, artistry is engineering, except note that in artistry, while the basics might be the same, the end results are usually unique. There still is no defined methodology or “steps you follow to perform ABC”. You work with what you have in the best possible way you know and you churn out something that may or may not be what you desired.

Now where do we fit as software engineers? We have some lore, some history of tried and true practices. We have design patterns, we have team practices like Agile, XP, etc. And we almost have an algorithm for everything. It’s almost like an Apple iPhone ad, “You need to search a graph? There’s an algorithm for that.” But when it comes to implementation and combining all these into a single product, there is so much divergence. Two people, given the exact same set of requirements, will come up with two almost completely differing solutions. And I’m not talking about just names. The architecture, the design patterns used, the way services are split up. And both of these may completely satisfy the requirements. Or they may end up being epic disasters.

What I’m saying is, there is no guaranteed recipe for success like in other fields of engineering. It is completely feasible to dig yourself into a hole even while applying commonly known solid techniques. You might argue that this happens in other fields of engineering as well, like say, Boeing’s new 787 which has been delayed so many times. But to that, I say that they were trying to stretch the boundaries and innovate, and create something new. That rule applies to any engineering discipline, when you try to go above and beyond what currently exists.

But when you are creating run of the mill apps, like a Configuration system or a database data displayer, those should, by now, be trivial. But they aren’t. I know groups which spend more time and effort developing these than should be required. And finally developed, these turn into nightmares when you want to update them or add new features. You might say, “Well, I never do that.” To that, I say, sure, but remember the last time you moved onto a project with a legacy code base? Remember how that felt? Well, someone who was well-meaning, just like you, developed that disaster.

So are we Software Engineers or Artists? At the end of the day, it doesn’t matter what we are as long as the job gets done, but you would think we would finally start narrowing down on some concepts that can be universally agreed upon. Most software solutions I see as the end product are usually works of art. I have no clue how they made it, no clue how it works, but it’s beautiful nonetheless (or ugly, if that is how your artistic tendencies lie). Maybe we will be closer to being engineers in another 100 years? After all, civil engineering has been here for quite some time now.


Is Inheritance overrated ? Needed even?

To give some context to this topic, the idea was brought forward to me by Alex Eagle. I was happily coding away when Alex sprung his idea for Composition over Inheritance for Noop, a language we are developing with testability and dependency injection in mind. My gut reaction was that this was blasphemy, and it couldn’t be done. You can’t just do away with inheritance, it’s one of the building blocks of OO based programming languages. But now, after I have let the idea digest for a few days, it doesn’t seem so far fetched any more. And here’s why.

Let me first talk about the biggest problems with vanilla inheritance as we have it in Java. Joshua Bloch hits the nail on the head in his Effective Java book item about “Favoring composition over inheritance.” But let’s do a quick recap anyway.

The biggest problem is that inheritance often ends up breaking encapsulation. This is because the child class depends on the implementation of the parent class. But between releases, something in the parent class implementation can change and can break all child classes without anyone even touching their code. Another common gotcha is in how protected fields and members are used. Often, the parent class changes the value of fields depending on how methods are called. Not understanding this behavior often leads to buggy or simply wrong behavior from the subclasses.

Another problem with a subclass, especially from the point of view of unit testing, is that there is no way to create an instance of the subclass in isolation. By this, I mean that every time I create an instance of the subclass, I am forced to have the parent class as well. In most cases, this shouldn’t be a problem, but I have run into situations where the parent class is just a landmine waiting to explode, with the default constructor not being explicit in stating its dependencies. So instant Kablaam!!! Or the parent class will load things you don’t really care about and make things slow in a test. There was this insidious test I ran into once, which extended a base test case, which did the same thing. About 7 layers deep. And the test itself didn’t really care about 3 or 4 of those layers, but had to jump through all the hoops and get everything because it was a parent class.

There are a few more issues, which are well documented in Effective Java item 16, “Favor composition over inheritance.” I won’t bore you further on this, assuming I have convinced the skeptics about the problems with inheritance. If not, go read that book, and you shall be convinced. But then, I wanted to postulate on whether it was at all possible to have a programming language which does away with inheritance (As Noop proposes).

So when do we use inheritance ? To me, Polymorphism is about the only time when inheritance and subclassing is deemed appropriate. Be it having different subtypes or just plain old code reuse. So unless you want to have a base abstract class which has some methods defined (Like Shape with draw() method and Circles and Rectangles), inheritance is not really needed.

In Java, interfaces allow you to perform polymorphic operations with abandon, and convert between types. And interfaces don’t saddle you with the requirement that you get the base class for every instance.

Also, if you use composition, then you can reuse code by using delegation. For example, you could define a Shape interface with a DefaultShape implementation. Now rather than subclassing a concrete type Shape, you could have a Rectangle which implements Shape. And if you wanted to reuse some code, let Rectangle take in a DefaultShape instance and just delegate to it when necessary. This offers multiple benefits. One, you are not tied down to getting things from the base class. In your test, you could pass in a mock, a null, whatever you want. The only problem is that this option is not viable if you don’t have an interface. If that is the case (or the thing you are subclassing is in a package outside of your control), then you are stuck doing inheritance the old fashioned way.
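A minimal sketch of that Shape example follows. The method names (name, describe) and the width and height fields are invented for illustration. One small liberty: Rectangle takes its delegate typed as the Shape interface rather than as DefaultShape, so that a test can hand it a stub.

```java
interface Shape {
    String name();
    String describe();
}

// Shared behavior lives here instead of in a base class.
class DefaultShape implements Shape {
    public String name() {
        return "shape";
    }

    public String describe() {
        return "a plain " + name();
    }
}

class Rectangle implements Shape {
    private final Shape delegate;
    private final int width;
    private final int height;

    Rectangle(Shape delegate, int width, int height) {
        this.delegate = delegate;
        this.width = width;
        this.height = height;
    }

    public String name() {
        return "rectangle";
    }

    // Reuses the delegate's text and adds its own detail on top.
    public String describe() {
        return delegate.describe() + ", specialized as a " + width + "x" + height + " " + name();
    }
}

class ShapeDemo {
    public static void main(String[] args) {
        Shape rectangle = new Rectangle(new DefaultShape(), 2, 3);
        System.out.println(rectangle.describe());
    }
}
```

In a test, the DefaultShape can be replaced with whatever stand-in is convenient, and Rectangle never drags a parent class along with it.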

And this is (at least the last time I heard the proposal) what Noop aims to solve. When you want to subclass, you tell the class what you want to compose. Regardless of whether it is an interface or not, it will create that class with an instance of your composition type. By default, all methods in the composition type will be available in the subclass, and it will delegate automatically, unless you override it. You get complete control over object creation, and this could potentially support multiple inheritance through this approach.

What do other people think? Is this feasible? Am I missing something obvious where inheritance is the only approach and composition just doesn’t cut it (both right now and in the Noop proposal)? Are you interested in Noop? Drop me a line.


6 Comments

What could be done to improve CS Degrees (Part 2)

Continuing on from last time, this time, I want to focus on what could be done in colleges. What courses could be taught, and what is wrong with the industrial approach as well. Again, all of these are my opinions, formed from my experiences, so yours might differ.

Why can’t there be a course which teaches you how to write maintainable software? Concepts like separating your concerns and reducing your coupling. Using interfaces as and when necessary to abstract out unnecessary details. How about a semester-long or two-semester-long class which gives a student a chance to develop something in the first semester, and then maintain it and add new features to it in the second? The only times I have heard of this happening are with independent studies.

And how can we forget about unit tests? This concept was rarely (if at all) touched upon during my four years in college. The only reason I had even heard of the concept was because I had participated in the Microsoft Imagine Cup and the Top Coder competitions. It sure wasn’t mentioned or required in any of my courses. Looking back, I just wonder how much easier a lot of my projects would have gotten if I had just written unit tests instead of trying to trace and debug them manually. But instead, I don’t remember it as more than an honorable mention in my Software Engineering class.

Why is such a fundamental cornerstone of software development not given more focus in college? Ideally, there would be a dedicated course talking about testing, the various aspects and kinds and when and where it could be applied. Along with practical examples and usages and projects where different kinds of tests were required. In addition, each class like Algorithms, OOAD and even Database Design would require testing of some sort or the other. Professors and TAs could just run the submitted code against some unit tests to check for validity and only have to manually look at the code for stylistic errors or for problems.

How many fresh graduates actually get to work on something completely new and exciting? Instead, most end up joining companies which have a large and existing code base. And let me tell you, working with legacy code is rarely, if ever, fun. But guess what? This is again not touched upon in college. Working with legacy code? You are on your own again. Why? This is such a fundamental aspect of being a software developer, yet it’s not a skill which is even touched upon in colleges. Even letting students loose on open source projects and asking them to contribute a patch or two is good experience, but we don’t do it.

Even the industry seems to be misleading in that sense. In most of my interviews, I was asked algorithmic questions. Sure, they were nice and tough to chew on, and they give a nice, contained question that the interviewees can code up. But think about it. When was the last time a developer had to write just one method? Isn’t the requirement more often than not developing a system or contributing to one? Why don’t we ask questions which test for this? I myself have started asking design questions which test, say, a candidate’s understanding of polymorphism by asking him to design a class structure to represent mathematical operations like addition and subtraction, with two methods, evaluate() and toString(). Does the candidate understand how to use inheritance, or does he end up using conditionals and switch cases instead?
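For the curious, one possible shape of an answer to that design question looks something like the sketch below. Only evaluate() and toString() come from the question itself; Expression, Constant, BinaryOperation, apply() and symbol() are names I am making up here, and this is by no means the only acceptable answer.

```java
// Hypothetical type names; only evaluate() and toString() are from the question.
interface Expression {
    int evaluate();
}

// A leaf of the expression tree: a plain number.
final class Constant implements Expression {
    private final int value;

    Constant(int value) {
        this.value = value;
    }

    @Override
    public int evaluate() {
        return value;
    }

    @Override
    public String toString() {
        return Integer.toString(value);
    }
}

// Shared structure for two-operand operations. Subclasses supply only
// what differs, so no switch on an operator type is needed anywhere.
abstract class BinaryOperation implements Expression {
    private final Expression left;
    private final Expression right;

    BinaryOperation(Expression left, Expression right) {
        this.left = left;
        this.right = right;
    }

    protected abstract int apply(int a, int b);

    protected abstract String symbol();

    @Override
    public final int evaluate() {
        return apply(left.evaluate(), right.evaluate());
    }

    @Override
    public final String toString() {
        return "(" + left + " " + symbol() + " " + right + ")";
    }
}

final class Addition extends BinaryOperation {
    Addition(Expression left, Expression right) {
        super(left, right);
    }

    @Override
    protected int apply(int a, int b) {
        return a + b;
    }

    @Override
    protected String symbol() {
        return "+";
    }
}

final class Subtraction extends BinaryOperation {
    Subtraction(Expression left, Expression right) {
        super(left, right);
    }

    @Override
    protected int apply(int a, int b) {
        return a - b;
    }

    @Override
    protected String symbol() {
        return "-";
    }
}

class ExpressionDemo {
    public static void main(String[] args) {
        Expression e = new Addition(
                new Constant(5),
                new Subtraction(new Constant(4), new Constant(1)));
        System.out.println(e + " = " + e.evaluate()); // prints "(5 + (4 - 1)) = 8"
    }
}
```

Adding multiplication in this design means adding one small class, whereas the conditional-heavy version needs a new branch in both evaluate() and toString().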

It just seems to me that colleges have, to an extent, lost track of what the industry needs, and the industry doesn’t seem to be helping its cause. Introducing some of the courses I outlined above, or even incorporating them into existing coursework, would give prospective graduates a leg up when they look for jobs. It’s setting them up for success, and we all want them to succeed.

5 Comments

What’s lacking in CS Degrees nowadays (Part 1)

I remember when I was a fresh grad, just joining Google. I was naive, starry eyed, and somehow scraped through the gauntlet of interviews thrown my way. Then came my first few weeks at work. I was given the task of writing this new testing framework for my product. In Java. Being the slinger that I am, I was in my zone. I worked for about three weeks, and churned out the code in no time. I had a working prototype which performed all that it needed to, and was customizable. I was pumped. And then came the point where I had to check this in. So of course, I sent it to one of the senior developers for review. And the first set of comments came back.

It almost seemed like there was a line of comment for each line of code I had sent. Suddenly, my starter project seemed like an insurmountable task. I ended up pairing with aforementioned developer, and refactoring the code till it was almost unrecognizable. But the end product after all that blood and tears and refactoring was something much more manageable and maintainable. We added unit tests for each component, and separated our concerns properly.

First, I thought maybe it was just me, I hadn’t learnt something in college. It was a humbling experience, and showed that I had much to learn. But then I saw this repeat. An intern I knew spent his entire internship developing a component for a bigger project, but he was unable to check it in till his internship finished. The code was simply horrible to maintain for anyone who had not written it, and there were no unit tests, so no additions could be done with confidence that nothing else was broken. This was code developed by a really smart guy, who was a pretty good programmer. And this was no isolated case, something along these lines happened again and again. So what happened ?

This was when I started questioning if what happened with me and this intern weren’t just isolated cases but part of something bigger? A conspiracy even? Well, I wouldn’t go so far, but simply put, why weren’t we taught in college how to actually develop software? Why doesn’t Software Engineering actually teach how to write maintainable, well tested applications? Why isn’t there a single CS course which taught us how to work with legacy systems?

I mean, the usual Computer Science degree consists of courses in Discrete Mathematics, Automata Theory, Data Structures and Algorithms, Object Oriented Analysis and Design, Software Engineering and many more. And sure, I learnt about how to define classes in my OOAD class, and how to write sorting algorithms and graph algorithms in my Algorithms class, and what the different steps for a Software project are and what approaches are present in my Software Engineering class. But looking back at it, none of them really prepped me for the work I would do in real life.

For instance, the OOAD class had a great project of creating a Chess game with AI. And I am proud to say I did get it working with a pretty solid AI backing it. But it was not code that I am proud of, nor would I ever want to go back and add a feature or fix a bug in it. My Software Engineering class had a project which was mostly talk and design, and really not that much implementation. And Algorithms was mostly write a function or one or two classes to implement an algorithm.

These were some great professors. And I gained a solid theoretical base in Algorithms and OOAD which would have been impossible otherwise. But some of these professors had industry experience. And the assignments they gave and the problems they assigned reflected nothing of that. And that hasn’t been just my experience, restricted to my university. Talking to my colleagues and friends who graduated around the same time, it has been the norm, not the exception. Why is this the case? Why couldn’t my CS degree have prepared me for what I would have to work with?

I will continue down this line of thought in my next post, where I try to articulate what I would have wanted to be taught in college, knowing what I do know now.


16 Comments

The ROI of Testing

Nowadays, when I talk with (read: rant at) anyone about why they should do test driven development or write unit tests, my spiel has gotten extremely similar and redundant to the point that I don’t have to think about it anymore. But even when I do pairing with skeptics, even as I cajole and coax testable code or some specific refactorings out of them, I wonder, why is it that I have to convince you of the worth of testing ? Shouldn’t it be obvious ?

And sadly, it isn’t. Not to many people. To many people, I come advocating the rise of the devil itself. To others, it is this redundant, totally useless thing that is covered by the manual testers anyway. The general opinion seems to be, “I’m a software engineer. It is my job to write software. Nowhere in the job description does it say that I have to write these unit tests.” Well, to be fair, I haven’t heard that too many times, but they might as well be thinking it, given their investment in writing unit tests. And last time I checked, an engineer’s role is to deliver working software. How do you even prove that your software works without having some unit tests to back you up? Do you pull it up and go through it step by step, and start cursing when it breaks? Because without unit tests, the odds are that it will.

But writing unit tests as you develop isn’t just to prove that your code works (though that is a great portion of it). There are so many more benefits to writing unit tests. Let’s talk in depth about a few of these below.

Instantaneous Gratification

The biggest and most obvious reason for writing unit tests (either as you go along, or before you even write code) is instantaneous gratification. When I write code (write, not spike. That is a whole different ball game that I won’t get into now), I love to know that it works and does what it should do. If you are writing a smaller component of a bigger app (especially one that isn’t complete yet), how are you even supposed to know if what you just painstakingly wrote even works or not ? Even the best engineers make mistakes.

Whereas with unit tests, I can write my code. Then just hit my shortcut keys to run my tests, and voila, within a second or two, I have the results, telling me that everything passed (in the ideal case) or what failed and at which line, so I know exactly what I need to work on. It just gives you a safety net to fall back on, so you don’t have to remember all the ways it is supposed to work in. Something tells you if it is or not.

Also, doing Test Driven Development when developing is one of the best ways to keep track of what you are working on. I have times when I am churning out code and tests, one after the other, before I need to take a break. The concept of TDD is that I write a failing test, and then I write just enough code to pass that test. So when I take a break, I make it a point to leave at a failing test, so that when I come back, I can jump right back into writing the code to get it to pass. I don’t have to spend 15 – 20 minutes reading through the code to figure out where I left off. My asserts usually tell me exactly what I need to do.

Imposing Modularity / Reusability

The very first rule of reusable code is that you have to be able to create an instance of the class before you can use it. And guess what? With unit tests, you almost always have to instantiate the class under test. Therefore, writing a unit test is always a great first step in making code reusable. And the minute you start writing unit tests, most likely, you will start running into the common pain points of not having injectable dependencies (unless of course, you are one of the converts, in which case, good for you!).

Which brings me to the next point. Once you start having to jump through fiery hoops to set up your class just right to test it, you will start to realize when a class is getting bloated, or when a certain component belongs in its own class. For instance, why test the House when what you really want to test is the Kitchen it contains? So if the Kitchen class was initially part of the House, when you start writing unit tests, it becomes obvious enough that it belongs separately. Before long, you have modular classes which are small and self-contained and can be tested independently without effort. And it definitely helps keep the code base cleaner and more comprehensible.
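As a rough sketch of where that House and Kitchen split ends up: the class names come from the example above, but everything they do here (installStove(), canCook(), isLivable()) is invented purely for illustration, and the checks are written as plain Java rather than with any particular test framework.

```java
// Hypothetical behaviour: the post only names the two classes.
class Kitchen {
    private boolean hasStove;

    void installStove() {
        hasStove = true;
    }

    boolean canCook() {
        return hasStove;
    }
}

// House receives its Kitchen through the constructor instead of building
// one internally, so a test can create either class on its own.
class House {
    private final Kitchen kitchen;

    House(Kitchen kitchen) {
        this.kitchen = kitchen;
    }

    boolean isLivable() {
        return kitchen.canCook();
    }
}

class KitchenCheck {
    public static void main(String[] args) {
        // The Kitchen is exercised without constructing a House at all.
        Kitchen kitchen = new Kitchen();
        if (kitchen.canCook()) {
            throw new AssertionError("a kitchen without a stove should not cook");
        }
        kitchen.installStove();
        if (!kitchen.canCook()) {
            throw new AssertionError("a kitchen with a stove should cook");
        }

        // The House is tested separately, given whatever Kitchen suits the test.
        if (!new House(kitchen).isLivable()) {
            throw new AssertionError("a house with a working kitchen should be livable");
        }
        System.out.println("Kitchen and House checks passed");
    }
}
```

The Kitchen checks never touch a House, which is exactly the kind of independence the hoops were pushing you towards.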

Refactoring Safety Net

Any project, no matter what you do, usually ends up at a juncture where the requirements change on you. And you are left with the option of refactoring your codebase to add / change it, or rewrite from scratch. One, never rewrite from scratch, always refactor. It’s always faster when you refactor, no matter what you may think. Two, what do you do when you have to refactor and you don’t have unit tests? How do you know you haven’t horribly broken something in that refactor? Granted, IDEs such as Eclipse and IntelliJ have made refactoring much more convenient, but adding new functionality or editing existing features is never simple.

More often than not, we end up changing some undocumented way the existing code behaved, and blow up 10 different things (it takes skill to blow up more, believe me, I have tried). And it’s often something as simple as changing the way a variable is set or unset. In those cases, having unit tests (remember those things you were supposed to have written?) to confirm that your refactoring broke nothing is a godsend. I can’t tell you the number of times I have had to refactor a legacy code base without this safety net. The only way to ensure I did it correctly was to write these large integration tests (because again, no unit tests usually tends to increase the coupling and reduce modularity, even in the most well designed code bases) which verified things at a higher level and pray fervently that I broke nothing. Then I would spend a few minutes bringing up the app every time, and clicking on random things to make sure nothing blew up. A complete waste of my time when I could have known the same thing by just running my unit tests.

Documentation

Finally, one of my favorite advantages to doing TDD or writing unit tests as I code. I have a short memory for code I have written. I could look back at the code I wrote two days ago, and have no clue what I was thinking. In those cases, all I have to do is go look at the test for a particular method, and that almost always will tell me what that method takes in as parameters, and what all it should be doing. A well constructed set of tests tells you about valid and invalid inputs, state that it should modify and output that it may return.

Now this is useful for people like me with short memory spans. But it is also useful, say, when you have a new person joining the team. We had this cushion the last time someone joined our team for a short period of time, and when we asked him to add a particular check to a method, we just pointed him to the tests for that method, which basically told him what the method does. He was able to understand the requirements, and go ahead and add the check with minimal handholding. And the tests give a safety net so he doesn’t break anything else while he is at it.

Also useful is the fact that later, when someone comes marching through your door, demanding you fix this bug, you can always check whether it was a bug (in which case, you are obviously missing a test case) or if it was a feature that they have now changed the requirements on (in which case you already have a test which proves it was your intent to do it, and thus not a bug).
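To show what I mean by tests reading like documentation, here is a small made-up example. The Account class and its withdraw() rules are entirely hypothetical, and the tests are plain static methods rather than anything from a real test framework, but the test names alone tell a newcomer what is valid input, what is invalid input, and what state should change.

```java
// Hypothetical class under test, invented for this example.
class Account {
    private int balance;

    Account(int openingBalance) {
        this.balance = openingBalance;
    }

    int balance() {
        return balance;
    }

    void withdraw(int amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        if (amount > balance) {
            throw new IllegalStateException("insufficient funds");
        }
        balance -= amount;
    }
}

class AccountTest {
    // Valid input, and the state it should modify.
    static void withdrawReducesBalance() {
        Account account = new Account(100);
        account.withdraw(30);
        if (account.balance() != 70) {
            throw new AssertionError("expected 70 but was " + account.balance());
        }
    }

    // Invalid input: zero and negative amounts are rejected.
    static void withdrawRejectsNonPositiveAmount() {
        Account account = new Account(100);
        try {
            account.withdraw(0);
        } catch (IllegalArgumentException expected) {
            return;
        }
        throw new AssertionError("expected IllegalArgumentException");
    }

    // Intent made explicit: refusing an overdraft is a feature, not a bug.
    static void withdrawRejectsOverdraft() {
        Account account = new Account(100);
        try {
            account.withdraw(150);
        } catch (IllegalStateException expected) {
            return;
        }
        throw new AssertionError("expected IllegalStateException");
    }

    public static void main(String[] args) {
        withdrawReducesBalance();
        withdrawRejectsNonPositiveAmount();
        withdrawRejectsOverdraft();
        System.out.println("All Account tests passed");
    }
}
```

If someone later walks in claiming that a refused overdraft is a bug, the last test is the proof that it was the intended behavior.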


4 Comments