Solving software problems the hard way

Many years ago, I wrote software for the federal government. Our application helped a couple of agencies produce the documents they send to the President and Congress to ask for money. I mean their official annual budget request, not like “Hey, Joe, you got five bucks on you?” I assume they did that in person. The basic process went like this: our code produced some boilerplate, built some tables from numbers they uploaded, inserted text documents that they also uploaded, and then spit out a completed document. With modern technology, this would be pretty straightforward, but this was 2008 federal government tech, so it was held back by all sorts of things.

The code mostly worked fine without major changes EXCEPT for one thing. Treasury liked to change the table heading colors every year. This was a HUGE problem. At the time, we were using a very old document type called Rich Text Format. Maybe you are old enough to have seen an RTF file. It was made by Microsoft to be compatible with non-Microsoft Word programs. In hindsight this is sort of hilarious. To change the color on an RTF file, the code ripped the header off the file and inserted a new one. Then you ran the code and hoped that you did it right. Usually, you did not, so you tried again. It was an ENORMOUS amount of work for something that I thought should be trivial – there were all sorts of escape characters and weird abbreviations and none of it made any sense. It was sort of like HTML with a CSS file except where CSS is supposed to be simple and straightforward (Cascading Simple Straightforward, that’s CSS), this was the opposite.
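For a sense of what that header looks like, here is a minimal sketch in modern Java. It is not the original code: the class and method names (`RtfColors`, `colorEntry`, `replaceColorTable`) are made up for illustration, and it swaps only the color table rather than the whole header. It also assumes the color table contains no nested groups, which is typical but not guaranteed.

```java
public class RtfColors {

    /** Formats one RTF color table entry, e.g. \red0\green128\blue0; */
    public static String colorEntry(int red, int green, int blue) {
        return "\\red" + red + "\\green" + green + "\\blue" + blue + ";";
    }

    /**
     * Replaces the first {\colortbl ...} group in an RTF string.
     * Assumes the group has no nested braces; returns the input
     * unchanged if no color table is found.
     */
    public static String replaceColorTable(String rtf, String newTable) {
        int start = rtf.indexOf("{\\colortbl");
        if (start < 0) {
            return rtf;
        }
        int end = rtf.indexOf('}', start);
        if (end < 0) {
            return rtf;
        }
        return rtf.substring(0, start) + newTable + rtf.substring(end + 1);
    }

    public static void main(String[] args) {
        // A tiny hand-written RTF document with one blue color entry.
        String rtf = "{\\rtf1\\ansi{\\colortbl;\\red0\\green0\\blue255;}\\cf1 Heading}";
        // The leading ';' is the empty "auto" color slot.
        String greenTable = "{\\colortbl;" + colorEntry(0, 128, 0) + "}";
        System.out.println(replaceColorTable(rtf, greenTable));
    }
}
```

Text elsewhere in the file refers to colors by position in that table (`\cf1` is the first real entry), which is part of why a small mistake in the header could throw off the whole document.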

So I thought, surely there is software out there that we can use to do this instead of rolling our own. And there was! We were a Java team, and at the time, Apache POI was what everyone in Java used to handle text documents. I was excited! A polished open source library that was free to use! No more headaches!

And then there were headaches. You see, our Java code lived inside an Oracle database. And the Oracle database version we were stuck on only supported Java version 1.4. Apache POI required 1.5. There was no way I, a lowly software contractor, could get them to move to a newer Oracle version that would support 1.5.

So I did what any young software person in that situation would do – I wrote my own code to create and manipulate DOCX files. Do not attempt to read this code, you will cry, I promise. Interesting note – most (all?) Microsoft Office files (.docx and .xlsx for sure) are just a zipped folder of human-readable XML files. You can just unzip them and look at the files. During my time writing this code, I looked at those files A LOT. More than was healthy, probably. Much longer and I would have been able to open a Word document and see the XML, Matrix-style.
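If you want to see the zipped-XML structure without renaming a file and unzipping it by hand, the standard `java.util.zip` classes are enough. The sketch below is an illustration in modern Java (it would not have compiled on 1.4), not the code from the project. `DocxPeek`, `readParts`, and `buildSample` are hypothetical names, and the sample it builds in memory is a stand-in holding only the main `word/document.xml` part, not a complete document that Word would open.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class DocxPeek {

    /**
     * Reads every entry of a zip stream into a name -> text map.
     * Decodes each part as UTF-8, which suits the XML parts but
     * would mangle binary parts such as embedded images.
     */
    public static Map<String, String> readParts(InputStream in) throws IOException {
        Map<String, String> parts = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                parts.put(entry.getName(),
                        new String(buf.toByteArray(), StandardCharsets.UTF_8));
            }
        }
        return parts;
    }

    /** Builds a tiny stand-in package in memory with one XML part. */
    public static byte[] buildSample() throws IOException {
        String xml = "<w:document xmlns:w=\"http://schemas.openxmlformats.org/"
                + "wordprocessingml/2006/main\"><w:body><w:p><w:r><w:t>Hello</w:t>"
                + "</w:r></w:p></w:body></w:document>";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> parts = readParts(new ByteArrayInputStream(buildSample()));
        for (Map.Entry<String, String> part : parts.entrySet()) {
            System.out.println(part.getKey() + ": " + part.getValue());
        }
    }
}
```

To try it on a real file, pass a `FileInputStream` for any .docx to `readParts` instead of the in-memory sample.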

But in the end, it worked! You can see some of the documents my code produced if you look at the Congressional Justification and Budget in Brief at treasury.gov. I don’t know how many years they used my code – I left the project in 2010 – but I’m sure they used it in 2008. I get a kick out of knowing that budgeting decisions were made in part using a document I coded.
