Pre-increment vs. Post-increment: Which is Faster? (Part II)

Tue Feb 15, 2011

By Kevin Tatterson

In my previous post I explored the notion that ++d is faster than d++. 

Ponderings

Now for an educated guess on what would happen if we got rid of the __asm nop and allowed the optimizer to inline. At the very least, the instructions in dark red in the previous example (lea, call myint::operator++, nop, and ret) would go away, leaving us with 8 clock cycles for pre-increment and 11 clock cycles for post-increment, which would make pre-increment 27% faster!
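To make the difference concrete, here is a minimal sketch of what a class like myint might look like. It is a hypothetical stand-in, not the exact class from the previous post: the copy counter and the helper functions copies_for_pre and copies_for_post are my own additions for illustration. It shows where post-increment's extra work comes from: it has to save a copy of the old value before incrementing.

```cpp
#include <cassert>

// Hypothetical stand-in for the myint class from the previous post.
// The static copy counter is an illustrative addition, not part of the original.
struct myint {
    int value;
    static int copies;  // number of copy-constructor calls so far

    explicit myint(int v = 0) : value(v) {}
    myint(const myint& other) : value(other.value) { ++copies; }
    myint& operator=(const myint&) = default;

    // Pre-increment: modify in place and return a reference. No copy needed.
    myint& operator++() {
        ++value;
        return *this;
    }

    // Post-increment: save the old value, increment, return the saved copy.
    myint operator++(int) {
        myint old(*this);  // the extra copy that pre-increment avoids
        ++value;
        return old;
    }
};

int myint::copies = 0;

// Returns how many copy-constructor calls a single pre-increment triggers.
int copies_for_pre() {
    myint d(41);
    myint::copies = 0;
    ++d;
    assert(d.value == 42);
    return myint::copies;
}

// Returns how many copy-constructor calls a single post-increment triggers.
int copies_for_post() {
    myint d(41);
    myint::copies = 0;
    d++;
    assert(d.value == 42);
    return myint::copies;
}
```

Without inlining, copies_for_pre() should report zero copies and copies_for_post() at least one. The exact count for post-increment can depend on whether the compiler elides the copy on return, so treat it as a lower bound rather than a fixed number.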

Back to reality for a moment. The myint example gives best-case figures for two reasons: the myint copy constructor and the pre/post-increment operators are dead simple (one clock cycle each), and inlining works because the implementations are short. So what happens if these implementations get even just a little more complex?

Cost of copy ctor & operator++ (clocks)   Pre-incr total clocks   Post-incr total clocks   % faster
2 (best case, our example)                8                       11                       27%
10                                        16                      19                       16%
20                                        26                      29                       10%
40                                        46                      49                       6%
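The arithmetic behind the table can be sketched in a few lines. The fixed overheads of 6 and 9 clocks are my inference from the best-case row (8 - 2 and 11 - 2), and the function names are hypothetical. "% faster" is taken as the clocks saved divided by the post-increment total, which reproduces the figures in the table.

```cpp
#include <cassert>
#include <cmath>

// Fixed per-call overheads inferred from the best-case row of the table
// (8 - 2 and 11 - 2). These are assumptions, not measured values.
constexpr int kPreOverhead = 6;
constexpr int kPostOverhead = 9;

// Total clocks for each form, given the combined cost of the
// copy constructor and operator++.
int pre_clocks(int cost) { return kPreOverhead + cost; }
int post_clocks(int cost) { return kPostOverhead + cost; }

// Share of post-increment's clocks that pre-increment saves,
// rounded to the nearest whole percent.
int percent_faster(int cost) {
    const double saved = post_clocks(cost) - pre_clocks(cost);
    return static_cast<int>(std::lround(100.0 * saved / post_clocks(cost)));
}

// Reproduces the four rows of the table.
void check_table() {
    assert(percent_faster(2) == 27);
    assert(percent_faster(10) == 16);
    assert(percent_faster(20) == 10);
    assert(percent_faster(40) == 6);
}
```

The saving stays fixed at 3 clocks while the denominator grows, which is why the percentage shrinks as the implementations get more expensive.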

 

Now consider that seemingly innocuous calls such as sprintf, malloc, new, and itoa can push the clock count into the hundreds or thousands. Any of these in the copy constructor or operator++ will blow this example out of the water and reduce the benefit to nil.

Conclusion

I have mixed feelings on whether to recommend pre-increment over post-increment:

  • Your copy ctor and pre/post-incr implementation have to be dead simple to measure a win.
  • It wouldn’t surprise me if compiler optimizers are able to determine when post-increment can be replaced with pre-increment, when your program’s semantics allow.
  • It doesn’t change the semantics of your program much, but other developers might wonder why you favor pre-increment.
  • In the grand scheme of things, few real world algorithms’ performance will be measurably affected by favoring pre-increment.
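As an illustration of "when your program's semantics allow": the two forms are interchangeable only when the value of the increment expression is discarded. The sketch below uses hypothetical helper names. In the loops the result is thrown away, so either form computes the same sum; in the last two functions the result is used, so the forms differ. Whether a given compiler actually performs this substitution for a user-defined type is not something the sketch demonstrates.

```cpp
#include <cassert>
#include <vector>

// Result of it++ is discarded, so it could be swapped for ++it.
int sum_with_post(const std::vector<int>& v) {
    int total = 0;
    for (auto it = v.begin(); it != v.end(); it++) total += *it;
    return total;
}

// Same loop written with pre-increment.
int sum_with_pre(const std::vector<int>& v) {
    int total = 0;
    for (auto it = v.begin(); it != v.end(); ++it) total += *it;
    return total;
}

// Here the value of the expression is used, so the forms are NOT interchangeable.
int value_of_post(int i) { return i++; }  // yields the old value
int value_of_pre(int i) { return ++i; }   // yields the incremented value

// Checks both claims on a small input.
void check_semantics() {
    const std::vector<int> v{1, 2, 3};
    assert(sum_with_post(v) == sum_with_pre(v));
    assert(value_of_post(5) != value_of_pre(5));
}
```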

Here at Spatial, I’d like to think that we take a pragmatic approach to our software’s performance. In CGM, ACIS, 3D InterOp, and IOP-CGM, we rarely concern ourselves with this level of minutiae. I’d like to describe our approach to performance as tactical and pareto-ized: we attend to such minutiae only in the rare instances when our profilers tell us to.

In the end, I’m okay if you use pre-increment – but for myself, I’ll aspire to loftier programmatic governances. What are these governances? That’s another blog – one that is sure to stir things a bit.
