Why Education Matters in Programming
A story about a small team's success in a mission-impossible-like project (and about how having a good team is a very important prerequisite).
It started quite some time ago as a nine-month contract for a relatively large and growing global company. The contractors all came through the same small local company, and the team was quite gelled (we had studied or worked with each other, or had common friends). The project we were assigned to was cancelled after a couple of months, and to honor the remaining months of the contract a new task was found for us. Nobody else was willing to work on it, and we had suddenly become a free resource.
The project itself seemed quite tedious. It was about converting dynamic pages, customized per client, from one obsolete ad hoc technology to a completely different one. Data and a sort of template come in, HTML comes out. All the templates had been designed by someone else, all were of quite poor quality, and all used deprecated and misunderstood HTML constructs that did not work in the newly appearing browsers (it was the time when IE6 ruled the world). We were supposed not just to convert them, but to test and fix them all so that they looked everywhere exactly as the original did in IE6. There seemed to be nothing reusable. An average template took about 2–4 hours to convert. There were thousands of them. There were 4 of us. Do the math.
We Are Not Monkeys (we are apes, well, technically monkeys as well)
The first thing we noticed was that many of us were duplicating each other's work: some templates had been created by copy–paste–modify, but nothing indicated which templates were similar to each other or how similar. In addition, there was a noticeable pattern of biological evolution (with weak natural selection) in action: several simple designs, blindly combined and mutated, had evolved to extreme complexity (table-based designs with deep nesting, redundant code with no function, etc.).
One day we went to lunch together and started complaining and hypothesizing about what we would need in order to deal with the problem effectively. Maybe one unfortunate person could be selected just to look at the templates and assign them to people appropriately. However, the visual appearance gave no clue about what the code looked like (visually similar templates could have been coded or recoded by different people, and visually different ones could have been made by copy–paste and changing sizes and images). The code itself was a mess, and the formatting was inconsistent or absent (did I mention that the HTML was created by string concatenation in a very unsuitable language?). So for a while we thought we would need an artificial intelligence to help us. But then I (studying more practical computer science at a university) and a colleague (studying a more theoretical one) started remembering some of the lessons we had gone through.
We could compare two files and get their distance, in terms of the number of keystrokes needed to convert one template into the other, using Levenshtein distance. That seemed to be a good approximation, expressed as a single number and mostly ignoring formatting.
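As a minimal sketch of that distance (in Python, not the C++ we used back then), the classic dynamic programming formulation keeps only two rows of the table. The function name and the example strings are illustrative assumptions, not the original code or data.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the empty prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # current[0] is the distance between a[:i] and the empty string.
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca from a
                current[j - 1] + 1,            # insert cb into a
                previous[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


# Hypothetical template fragments that differ in a single character.
print(levenshtein("<td width=100>", "<td width=120>"))
```

The running time grows with the product of the two lengths, which is why the speed of the program mattered once every pair of templates had to be compared.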
Then we would need a plan for ordering the templates: how to proceed from one to another while reusing the previous work as efficiently as possible. We remembered the travelling salesman problem and quickly dismissed it as not the right way. We did not need the template closest to the last finished template; we needed the template closest to any of the previously finished templates. Our education jumped in with an immediate answer: a minimum spanning tree.
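One way to sketch this is Prim's algorithm, which grows the tree by repeatedly attaching the unfinished node closest to any node already in the tree. This is exactly the "closest to any finished template" rule. Which algorithm our original program used is not recorded here, so treat the choice of Prim's algorithm, the function name, and the small distance matrix as assumptions made for illustration.

```python
def minimum_spanning_tree(dist):
    """Prim's algorithm on a symmetric distance matrix.

    Returns (parent, child, distance) edges in the order they are added,
    so each child is the node closest to the tree built so far.
    """
    n = len(dist)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [float("inf")] * n  # cheapest known link from the tree to each node
    parent = [-1] * n          # tree node that provides that cheapest link
    best[0] = 0                # start from node 0 (an arbitrary choice)
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((parent[u], u, dist[parent[u]][u]))
        # Attaching u may offer cheaper links to the remaining nodes.
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges


# Hypothetical distances between four templates.
example = [
    [0, 2, 9, 10],
    [2, 0, 3, 8],
    [9, 3, 0, 4],
    [10, 8, 4, 0],
]
print(minimum_spanning_tree(example))
```

Read in order, the edge list is also a work plan: each edge says "convert the child next, starting from the already finished parent".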
Since the resulting tree is planar (it can be drawn nicely on paper), we wanted to visualize it as clearly as possible to see how we could tackle the problem further. We remembered that one of the Java demos did that kind of thing, so we knew it was possible.
After this lunch we were quite excited that we could try to solve our problem. Moreover, we would get to do what we enjoyed much more: programming.
The Next Three Days
Levenshtein distance is usually computed using dynamic programming. The program turned out to be short and simple, and fast enough to compute the matrix of distances between all possible pairs of templates. For efficiency we used C++ (about a hundred lines). This matrix was fed to a program computing the optimal tree, which was very fast and took not much more than a hundred lines of C# (a fresh new language at that time). The result (a set of template ids, connections, and their distances: the nodes and edges of the tree) was loaded into a heavily customized Java graph layout application. When the layout was done (with human assistance), it was exported to a vector format and printed on A1 paper (an area of 0.5 square meters).
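For the last step of such a pipeline, here is a hedged sketch of handing the tree to a layout tool. We used a customized Java application. The sketch below instead writes Graphviz DOT text, a plainly swapped-in substitute. The function name, the template ids, and the assumption that ids contain no double quotes are all illustrative.

```python
def tree_to_dot(names, edges):
    """Render tree edges as Graphviz DOT text for an undirected graph.

    names: template ids, indexed by node number (assumed to contain no
           double quotes, since no escaping is done here).
    edges: (node_a, node_b, distance) tuples, the edges of the tree.
    """
    lines = ["graph templates {"]
    for name in names:
        lines.append(f'  "{name}";')
    for a, b, distance in edges:
        # Label each connection with its edit distance.
        lines.append(f'  "{names[a]}" -- "{names[b]}" [label="{distance}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical template ids and tree edges.
ids = ["t1", "t2", "t3", "t4"]
tree = [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
print(tree_to_dot(ids, tree))
```

The resulting text can be given to a Graphviz layout engine to produce a vector drawing, though a large tree would most likely still need manual tidying, as ours did.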
After this work was done (and for some of us even during it), we were very positively surprised. The result appeared to be even better than we had hoped and had extrapolated from the small sample we had already finished converting. There were very visible clusters of templates, and we could effectively split up the work just by drawing a loop around a cluster to allocate it, and then coloring the finished templates when done. Every time we moved to a different cluster, we could go to its most typical template and thereby have most of the work on the other templates done.
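One possible way to make "most typical" precise is the medoid: the cluster member with the smallest total distance to all the other members. This is an assumption for illustration, since on paper the choice can simply be made by eye. The function name and the distances below are hypothetical.

```python
def most_typical(cluster, dist):
    """Return the member of cluster with the smallest total distance
    to the other members (the medoid of the cluster)."""
    return min(cluster, key=lambda i: sum(dist[i][j] for j in cluster))


# Hypothetical distances between four templates.
dist = [
    [0, 2, 9, 10],
    [2, 0, 3, 8],
    [9, 3, 0, 4],
    [10, 8, 4, 0],
]
# Template 1 is the cheapest starting point for the cluster {0, 1, 2}.
print(most_typical([0, 1, 2], dist))
```

Converting the medoid first means every other template in the cluster is, on average, as few edits away from finished work as possible.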
The three days spent building the tools sped us up by orders of magnitude most of the time. There was almost constantly someone at the paper battle plan, allocating new work or marking completed templates. Of course, the faster you worked, the easier the work you could allocate to yourself, so everyone could choose their own pace. Different colors made it obvious who was the least productive. Nobody wanted that displayed on the wall.
To everyone's surprise, we finished 99% of the templates in a fraction of the estimated time. Our approach could also identify the 1% of templates that would each get us stuck for a day. Postponing work on those let us further reduce useless work, because some of those templates expired or were redesigned by a client while we worked on the easier ones.
We Are Hiring You – Gelled Shmelled
The work we had done impressed a few people enough that one CxO took our battle plan as an inspiring souvenir, and we were offered positions as regular employees (in effect, we were bought without our company actually being bought).
The people from our team were dissolved into the existing teams, and we started to work the way the others did. Only very infrequently was one of us asked to automate something (if we came up with the idea first). Over time the teams split and merged, people were transferred between projects and teams, and our initial success was forgotten, for it happened in those old times and left no traces. That is why I wrote this. The conclusion is that education sometimes matters big time, but the culture of many companies matters even more.