By Kevin Tatterson
In my previous post I explored the notion that ++d is faster than d++.
Now for an educated guess on what would happen if we got rid of the __asm nop and allowed the optimizer to inline. At the very least, the instructions in dark red in the previous example (lea, call myint::operator++, nop, and ret) would go away, leaving us with 8 clock cycles for pre-increment and 11 clock cycles for post-increment, which would make pre-increment about 27% faster (3 of 11 cycles saved)!
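For readers without the previous post at hand, here is a minimal sketch of why post-increment tends to cost more. It is not the myint class from that post: the class body, the `copies` counter, and the two helper functions are hypothetical stand-ins, added only to make the extra copy visible.

```cpp
#include <cassert>

// Hypothetical stand-in for the myint class from the previous post.
// The static counter exists only to make copy constructions visible.
class myint {
public:
    static int copies;  // number of copy constructions so far

    explicit myint(int v = 0) : value_(v) {}
    myint(const myint& other) : value_(other.value_) { ++copies; }
    myint& operator=(const myint&) = default;

    // Pre-increment: modify in place and return a reference. No copy needed.
    myint& operator++() {
        ++value_;
        return *this;
    }

    // Post-increment: must hand back the old value, so it copies first.
    myint operator++(int) {
        myint old(*this);  // the extra copy that pre-increment avoids
        ++value_;
        return old;        // may be elided by the compiler or copied again
    }

    int value() const { return value_; }

private:
    int value_;
};

int myint::copies = 0;

// Returns how many copy constructions a single pre-increment triggers.
int copies_for_pre_increment() {
    myint d(5);
    const int before = myint::copies;
    ++d;
    assert(d.value() == 6);
    return myint::copies - before;
}

// Returns how many copy constructions a single post-increment triggers.
int copies_for_post_increment() {
    myint d(5);
    const int before = myint::copies;
    d++;
    assert(d.value() == 6);
    return myint::copies - before;
}
```

In this sketch pre-increment never invokes the copy constructor, while post-increment invokes it at least once. Whether that difference survives optimization for a class whose copy has no side effects depends on the compiler.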
Back to reality for a moment. The myint example gives best-case figures for two reasons: both the myint copy constructor and the pre/post-increment operators are dead simple (one clock cycle each), and inlining works because the implementations are short. So what happens if these implementations get even just a little more complex?
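| Clock cycles of copy ctor & operator++ | Pre-increment (clock cycles) | Post-increment (clock cycles) | Pre-increment faster by |
|---|---|---|---|
| 2 (best case, our example) | 8 | 11 | 27% |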
Now consider that seemingly innocuous calls such as sprintf, malloc, new, and itoa will explode the number of clock cycles into the hundreds and thousands. Any one of them will blow this example out of the water and reduce the benefit to nil.
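A rough way to see the dilution is to hold the 3-cycle gap from the best case fixed and add the same extra work to both forms. The sketch below assumes the extra cycles land in code that pre- and post-increment share; the function name and the simple additive model are illustrative assumptions, not measurements.

```cpp
#include <cassert>

// Cycle counts taken from the best-case example in this post.
constexpr double kPreCycles  = 8.0;
constexpr double kPostCycles = 11.0;

// Percentage of cycles saved by pre-increment when BOTH forms pay
// `extra_cycles` of additional shared work (an assumption of this sketch).
double pre_increment_saving_percent(double extra_cycles) {
    assert(extra_cycles >= 0.0);
    const double pre  = kPreCycles  + extra_cycles;
    const double post = kPostCycles + extra_cycles;
    return 100.0 * (post - pre) / post;
}
```

Under this model, zero extra cycles gives the 27% from the best case, 100 extra cycles leaves about 2.7%, and 1000 extra cycles leaves about 0.3%.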
I have mixed feelings on whether to recommend pre-increment over post-increment:
- Your copy ctor and pre/post-incr implementation have to be dead simple to measure a win.
- It wouldn’t surprise me if compiler optimizers are able to determine when post-increment can be replaced with pre-increment, when your program’s semantics allow.
- It doesn’t change the semantics of your program much, but other developers might wonder why you favor pre-increment.
- In the grand scheme of things, few real-world algorithms’ performance will be measurably affected by favoring pre-increment.
Here at Spatial, I’d like to think that we take a pragmatic approach to our software’s performance. In CGM, ACIS, 3D InterOp, and IOP-CGM, we rarely concern ourselves with this level of minutiae. I’d describe our approach to performance as tactical and pareto-ized: in rare instances we do sweat the minutiae, but only when our profilers tell us to.
In the end, I’m okay if you use pre-increment – but for myself, I’ll aspire to loftier programmatic governances. What are these governances? That’s another blog – one that is sure to stir things a bit.