Itanium exception handling performance

Post by Dan Foste » Mon, 19 Mar 2007 03:21:11

I'm curious -- why is exception handling performance so poor on Itanium?

Lots of overhead? Translated calls? Something else? Noticed a mention of
this in passing in the BRUDEN presentation PDF and wondered about the
technical reasons for it.


Post by Dan Foste » Mon, 19 Mar 2007 03:34:22

Is it due to Itanium being a processor with high levels of instruction
level parallelism and flagging an exception would result in a pipeline
flush with a corresponding performance penalty?

If so, I can see why exception handling would be 'expensive' as an
architectural limitation.

I'm just not sure if there's more to it from the OpenVMS side, or if
it's just the (processor) architectural design itself.



Post by Bob Gezelt » Mon, 19 Mar 2007 04:31:45


Actually, the pipelining on IA-64 is limited to what is explicitly
specified by the instruction set (which is, after all, what EPIC -
Explicitly Parallel Instruction Computing means).

Alpha is slower than VAX for a variety of reasons; pipelining is only
one of the issues.

As a beginning, draw a simple table of the number of items that must
be preserved on a context switch. With the VAX, it is the 16 general
registers, plus assorted other information. On Alpha, the number of
registers doubles. IA-64 has even more registers, and more context
information to save. Consider the fact that one of the most expensive
operations on Alpha was a procedure call, because of the call frame
processing. Also consider that the register windowing on IA-64
addresses that specific issue.
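
The "simple table" suggested above can be roughed out in a few lines of Python. This is only a back-of-the-envelope sketch: it counts the architectural register files (including hard-wired registers), assumes the in-memory sizes noted in the comments, and leaves out status words, application registers, and NaT bits. It also ignores the fact that a real operating system may save some of this state lazily, so treat the totals as upper bounds, not as what OpenVMS actually saves.

```python
# Rough register-state table: (register file, count, bytes per register in memory).
# Sizes are assumptions for illustration: IA-64 FP registers are counted at
# 16 bytes each (their spill size), and the 64 one-bit predicates as one 8-byte word.
REGISTER_FILES = {
    "VAX": [("general", 16, 4)],
    "Alpha": [("integer", 32, 8), ("floating-point", 32, 8)],
    "IA-64": [
        ("general", 128, 8),
        ("floating-point", 128, 16),
        ("branch", 8, 8),
        ("predicates (64 x 1 bit)", 1, 8),
    ],
}


def context_bytes(arch):
    """Sum the bytes needed to hold every listed register file for one architecture."""
    return sum(count * size for _name, count, size in REGISTER_FILES[arch])


if __name__ == "__main__":
    for arch in REGISTER_FILES:
        print(f"{arch:6s} {context_bytes(arch):5d} bytes")
```

Even with these simplifications, the totals grow by roughly a factor of eight from VAX to Alpha and by roughly another factor of six from Alpha to IA-64.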

I do not have micro-level timing analyses of the fault processing, but
I would expect that fault processing performance of IA-64 affects all
of the systems, not OpenVMS specifically.

- Bob Gezelter,

Post by Tom Linde » Wed, 21 Mar 2007 03:22:45

MIPS did something similar, which was a real nuisance. It was not the
right way to do it then, nor is it now, for the reasons you cited.

Post by John Reaga » Wed, 21 Mar 2007 04:20:58

Pipelining has nothing to do with it.

The extra context to save/restore is about half to blame, but it is also
related to the Intel Calling Standard that we use on OpenVMS I64. (We
started with it and made some enhancements, but using it allowed us to
incorporate Intel compiler technology for the C++ compiler much more
quickly.)

VAX and Alpha have frame-pointers. When an exception occurs, software
can quickly find out where the compilers (or hardware in the case of
VAX) saved various registers (including the return address). That
allows code to easily walk back up the stack.

I64 does not have a frame pointer. The Intel Calling Standard is
PC-based. So when an exception occurs, the system has to look up the PC
(in a balanced tree, look at the SYS$SET_UNWIND_TABLE system service).
Once the system finds the unwind info left behind by the compiler, it
has to interpret it. It is a rather complicated set of data structures.
Not as simple as the VAX register save masks or the Alpha Procedure
Descriptor register save information. Items like the address of a
routine's static handler are also encoded in these unwind descriptors,
unlike Alpha, where the frame-pointer could quickly find the procedure
descriptor which contains the static handler's address.
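
To make the contrast concrete, here is a minimal Python sketch of the two lookup styles described above. It is an illustration only: the table layout, the field names, and both helper functions are hypothetical and do not reflect the real OpenVMS or Intel unwind data formats. It also swaps the balanced tree for a sorted list with binary search, which has the same logarithmic lookup cost but is shorter to show.

```python
import bisect
from dataclasses import dataclass
from typing import Optional


@dataclass
class UnwindEntry:
    start_pc: int             # first address covered by the routine
    end_pc: int               # one past the last address covered
    handler: Optional[str]    # hypothetical name of the static handler, if any


# PC-based style: a table keyed by code address, searched on every exception.
UNWIND_TABLE = sorted(
    [
        UnwindEntry(0x2000, 0x2400, "io_handler"),
        UnwindEntry(0x1000, 0x1080, None),
        UnwindEntry(0x1080, 0x1200, "cleanup_handler"),
    ],
    key=lambda entry: entry.start_pc,
)
START_PCS = [entry.start_pc for entry in UNWIND_TABLE]


def find_by_pc(pc):
    """Binary-search the table for the entry whose range contains pc."""
    index = bisect.bisect_right(START_PCS, pc) - 1
    if index >= 0:
        entry = UNWIND_TABLE[index]
        if entry.start_pc <= pc < entry.end_pc:
            return entry
    return None  # pc is not covered by any registered routine


# Frame-pointer style: each frame links directly to its caller and its handler.
@dataclass
class Frame:
    handler: Optional[str]
    caller: Optional["Frame"]


def walk_frames(frame):
    """Follow caller links and collect handlers; no table search is needed."""
    handlers = []
    while frame is not None:
        if frame.handler is not None:
            handlers.append(frame.handler)
        frame = frame.caller
    return handlers
```

In the frame-pointer style each step is a single pointer dereference. In the PC-based style every frame needs a search, and in a real system the entry found would then still have to be decoded before the walk could continue.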

All of this makes the LIB$I64_ calling standard routines and/or
exception delivery much slower than their Alpha counterparts. We're
looking at improving the performance as much as possible, but it is not

John Reagan
HP Pascal/{A|I}MACRO/COBOL for OpenVMS Project Leader
Hewlett-Packard Company

Post by roge » Thu, 22 Mar 2007 01:11:25

Wow - what a kludge is that?
I remember when engineering used to be something you were proud of.

Post by Dan Foste » Thu, 22 Mar 2007 14:08:25


Thank you very much for the clear and concise technical
explanations. I found it fascinating, and found it made perfect sense. I
only wish I could reciprocate in some way.

MIPS, eh? That's interesting. I'm guessing Intel's PC-based (as
in program counter) approach may have been historical; it's hard to stop
a historical juggernaut once it's got enough steam.

I also appreciated the words of someone from a well-known
engineering laboratory who had additional insight into the subject,
just as we happened to be discussing it here. :-)

Incidentally, this reminds me of how Sun chose to implement
their Niagara processors differently (from an architectural

They can save all registers (for all threads) and related
context switch information with a single call since they sized it to be
big enough to do it all in one gulp. This is one of the reasons why
thread-switching (up to 8 hardware threads per core) is a cheap
operation on Niagara. It's roughly along the lines of changing a pointer
to another area of what I informally call 'thread information block'.

No criticism of other architectures; just merely mentioning a
different approach. Of course, Sun's first Niagara implementation also
had an abysmal FP engine (corrected in second generation). It also had
the benefit of hindsight as it is newer than most architectures today.

I will be sure to keep I64 issues in mind when writing code.
Thanks! I also found it interesting to read about how architectural
design choices affect real world apps, even indirectly. From my
reading, it does indeed sound like quite a challenge for the HP folks to
try and improve performance in some way, to the extent possible.


Post by Kilgalle » Thu, 22 Mar 2007 21:06:08

> I will be sure to keep I64 issues in mind when writing code.

Of course the term "exception" is supposed to be a code word that this
is an abnormal occurrence. Besides hardware designers there are going
to be language implementers and even language designers taking their
cue from that term and overloading "exception handling" with lots of
neat but expensive features.

When coding one should not depend on exceptions for "normal" processing.
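
As a small illustration of that advice, here are two ways to handle a case that is expected to happen routinely (a record that lacks a field). The sketch is in Python purely for brevity. Python's exceptions are not the OpenVMS condition mechanism and their cost is very different, and the function names are made up for the example.

```python
def sum_field_checked(records, key):
    """Treat a missing field as a normal case and test for it directly."""
    total = 0
    for record in records:
        value = record.get(key)  # returns None when the field is absent
        if value is not None:
            total += value
    return total


def sum_field_with_exceptions(records, key):
    """Same result, but every missing field goes through exception delivery."""
    total = 0
    for record in records:
        try:
            total += record[key]
        except KeyError:
            pass  # the exception is being used as routine control flow
    return total
```

Both functions return the same total. The difference is that the second pays for exception delivery once per missing field, which is exactly the path that is expensive on a platform where unwinding is slow.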

Post by Tom Linde » Thu, 22 Mar 2007 22:04:22

On Wed, 21 Mar 2007 04:06:08 -0800, Larry Kilgallen

Well, there are some exceptions (:-} )
/* finished reading the file now start the processing*/

Post by Andre » Sat, 24 Mar 2007 01:18:51

Each T1 core has 4 hardware threads. The core switches from thread
to thread in a single cycle, executing threads in round-robin order
(assuming there is something to run). If a thread stalls, the core
ignores it and continues to round-robin through the 3 remaining
threads until the stalled thread returns. This speeds up the execution
of the remaining threads while the stall happens.

So for example, if you have 4 threads running, each thread sees roughly
a 300 MHz processor, assuming the T1 runs at 1.2 GHz.
If one thread stalls, then the remaining 3 threads see a 400 MHz
processor, and so on.
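
The arithmetic above is just the core clock divided among the threads that are currently runnable. A minimal sketch, assuming ideal single-cycle round-robin switching with no other loses counted (the function name and parameters are made up for the example):

```python
def per_thread_mhz(core_mhz, hw_threads, stalled=0):
    """Effective clock seen by each runnable thread under ideal round-robin."""
    running = hw_threads - stalled
    if running <= 0:
        raise ValueError("no runnable threads")
    return core_mhz / running


if __name__ == "__main__":
    print(per_thread_mhz(1200, 4))             # all four threads running
    print(per_thread_mhz(1200, 4, stalled=1))  # one thread stalled
```

With a 1200 MHz core this gives 300 MHz per thread with four threads running and 400 MHz per thread with one stalled, matching the figures above.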

The key to Niagara is this thread switching efficiency which allows it
to ignore stalls boosting processor efficiency.

Andrew Harrison