I found out some obvious-in-retrospect things this week when doing a little optimisation work. This is work in C#, but I’m sure it’s applicable to many languages. If you want to skip the details but still want the take-home lesson, here it is: if you want reasonable performance with large-ish data sets (I was working with data on the order of 5 million elements), use primitive structures (i.e., arrays and hand-written code) rather than the convenience classes provided by your framework of choice. Now, on to the examples.
Firstly, if you are building up a large array — especially if you know a bound on the size — don’t use an ArrayList, use an array. I slashed a generation stage of my app in half, from 26 to 13 seconds, via this alone (after doing some study of the problem to figure out the maximum possible array size).
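A minimal sketch of the idea (the sizes and names here are mine, not from the original app): once you have a bound, a plain array is one allocation up front, with no capacity checks on each add and, for value types, none of the boxing the non-generic ArrayList forces.

```csharp
using System;

// Sketch only: bound found by analysing the problem, as described above.
const int MaxSize = 1_000_000;

int[] items = new int[MaxSize];   // one allocation, never resized
int count = 0;                    // track the logical length yourself

for (int v = 0; v < MaxSize; v++)
    items[count++] = v;           // plain store: no capacity check, no boxing

Console.WriteLine(count);         // → 1000000
```

The equivalent ArrayList code would box every int into an object and re-check capacity on every Add, which is where the time goes.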
Tuning the resizing behaviour is also useful. Most generic ArrayList implementations simply double the array size as they grow. If you know, for example, that the solution space expands exponentially rather than linearly, then you can write a far more efficient array re-allocation scheme.
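For illustration, a hand-rolled grow-and-copy with a problem-specific growth factor (the factor of 4 below is an invented example, not a recommendation): a larger factor means fewer reallocate-and-copy passes when you know the data expands quickly.

```csharp
using System;

int[] a = new int[4];
a = Grow(a, 100, 4.0);            // grow by 4x per pass instead of the usual 2x
Console.WriteLine(a.Length);      // → 256 (4 → 16 → 64 → 256)

// Sketch of a custom re-allocation policy: keep multiplying the size by
// `factor` until it covers `needed`, then copy the old contents once.
static int[] Grow(int[] old, int needed, double factor)
{
    int size = Math.Max(old.Length, 1);
    while (size < needed)
        size = (int)(size * factor);
    int[] fresh = new int[size];
    Array.Copy(old, fresh, old.Length);   // the one copy per reallocation
    return fresh;
}
```

The trade-off is wasted slack space at the tail, which is exactly the speed-vs-space tension discussed below.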
Secondly, memory matters, and more complicated structures are likely to be more wasteful than something you code yourself. By moving to my own array-plus-code in C# I saved about a gigabyte of memory. Granted, this was partly because of some slightly sloppy choices I made early on (forgivable, I’d argue, since at the time I thought we would be dealing with a few thousand items): I was duplicating memory all over the place. As I see it, you have two choices when resizing arrays: chain on extra blocks as you go, or allocate a single larger array and copy everything across.
There are obviously optimisations that can be made, but it’s a classic speed-vs-space trade-off. It’s a little more complicated than that, because it’s really a “fast-add, slow-get, more memory” versus “slow-add, fast-get, low memory” case, so it depends on how you are working. In my case the latter worked excellently, because my access pattern is few additions and many gets. Again, my understanding of the problem at hand allowed me to easily surpass the generic implementations.
Finally, as an aside more than anything, I realised there’s a super-fast stack implementation available if you know the maximum bound of the stack, which you quite often will if you take some time to analyse the problem. It can be illustrated as follows:
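The original listing didn’t survive my weblog software, so here is a sketch of the trick (the class and names are mine): a fixed-size array plus an index that always points one past the top element.

```csharp
using System;

var s = new BoundedStack(8);      // 8: the bound you derived from the problem
s.Push(1); s.Push(2); s.Push(3);
Console.WriteLine(s.Pop());       // → 3

// Fixed-capacity stack: no resizing, no bounds checks beyond the array's own.
class BoundedStack
{
    private readonly int[] data;
    private int i;                        // i == item count; top is data[i - 1]

    public BoundedStack(int maxDepth) { data = new int[maxDepth]; }

    public void Push(int value) { data[i++] = value; }   // store, then advance
    public int Pop()            { return data[--i]; }    // retreat, then read
}
```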
Yeah, it’s simple and obvious, but I hadn’t really thought about it before. The trick is that i always points to the position beyond the top of the stack. The ordering of ++ and -- is important, and presumes C-style post-increment and pre-decrement semantics.
So silly little things, but ones which will really make a difference. I think this is one reason why C programs are fast: you have to know what is going on and, as such, tend to use the most efficient methods for the problem at hand. My take-home lesson is that, while convenience classes certainly have a place, they are no substitute for code designed specifically for the problem at hand when efficiency is needed.