The memory barrier instructions are described in a rather
implementation-oriented way in that article, which is not actually very
helpful.
So I looked for something a bit more definitive:
"Book II: PowerPC Virtual Environment Architecture", downloaded from
(Version 2.01, dated 10 Dec 2003). And what does it say but this (on
Because stores cannot be performed "out-of-order" (see Book III,
PowerPC Operating Environment Architecture), if a Store instruction
depends on the value returned by a preceding Load instruction (because
the value returned by the Load is used to compute either the effective
address specified by the Store or the value to be stored), the
corresponding storage accesses are performed in program order.
Also, on page 24 of Book III:
Stores are not performed out-of-order (even if the Store instructions
that caused them were executed out-of-order).
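To make "stores are not performed out-of-order" concrete, here is a minimal
sketch of the usual message-passing test in C11 with pthreads. All names
(data, flag, p1_writer, p2_reader, run_message_passing) are hypothetical and
do not come from the PowerPC books. P1 stores to two locations in program
order; P2 waits until it sees the second store and then reads the first
location. If P2 could observe P1's stores out of order, it could read 0. The
sketch uses release/acquire ordering, so the C language itself guarantees the
result whatever the hardware does; it illustrates the property and does not
settle what the hardware guarantees without barriers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names throughout; nothing here is taken from the manuals. */
static int data;          /* payload, stored first by P1 */
static atomic_int flag;   /* stored second by P1; 1 means "data is valid" */

/* P1: two stores in program order. */
static void *p1_writer(void *arg)
{
    (void)arg;
    data = 42;
    /* Release: the store to data must be visible before this store is. */
    atomic_store_explicit(&flag, 1, memory_order_release);
    return NULL;
}

/* P2: wait for P1's second store, then read the first location. */
static void *p2_reader(void *arg)
{
    int *seen = arg;
    while (atomic_load_explicit(&flag, memory_order_acquire) == 0)
        ;   /* spin until the store to flag is observed */
    *seen = data;
    return NULL;
}

/* Runs one round and returns the value P2 read from data. */
int run_message_passing(void)
{
    pthread_t t1, t2;
    int seen = -1;
    int rc;

    data = 0;
    atomic_store(&flag, 0);

    rc = pthread_create(&t2, NULL, p2_reader, &seen);
    assert(rc == 0);
    rc = pthread_create(&t1, NULL, p1_writer, NULL);
    assert(rc == 0);

    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return seen;
}
```

Calling run_message_passing() returns the value P2 saw. Weakening both
orderings to memory_order_relaxed (and making data atomic, to avoid a data
race) would turn this into a test of the hardware rather than of the
language.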
"Performed" is defined on page 2 of Book II:
A load or instruction fetch by a processor or mechanism (P1) is performed
with respect to any processor or mechanism (P2) when the value to be
returned by the load or instruction fetch can no longer be changed by a
store by P2. A store by P1 is performed with respect to P2 when a load
by P2 from the location accessed by the store will return the value
stored (or a value stored subsequently).
[snip irrelevant stuff about cache block invalidations]
The preceding definitions apply regardless of whether P1 and P2 are the
It is presumably not a coincidence that these are the same definitions used
Unfortunately, Book II only defines "performed with respect to", not "performed".
However, my reading of the above is that for each pair of processors P1 and
P2, P2 cannot observe stores made by P1 out-of-order. Also note that Book II
page 5 says that "In most systems the default is that all storage is Memory
Coherence Required", where
Memory coherence refers to the ordering of stores to a single location.
Atomic stores to a given location are coherent if they are serialized in
some order, and no processor or mechanism is able to observe any subset of
those stores as occurring in a conflicting order.
and all aligned stores are "atomic stores" (Book II page 3).
This would imply that aligned stores to locations of the default memory type
occur in a global total order, which is the order in which they are observed
by all processors.
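The single-location coherence definition quoted above can be sketched as a
test in C11 with pthreads. The names (x, LAST_VALUE, writer, observer,
count_coherence_violations) are hypothetical. One thread performs a series of
aligned atomic stores to one location; another repeatedly loads it and counts
any occasion on which a later load returns an older value, which would be a
"conflicting order" in the manual's terms. C11 requires coherence for atomic
objects even with relaxed ordering, so the count should always be zero. Note
that this only exercises ordering at one location, not ordering between
different locations.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define LAST_VALUE 100000

/* The single location under test (hypothetical name). */
static atomic_int x;

/* Writer: aligned atomic stores 1, 2, ..., LAST_VALUE to x. */
static void *writer(void *arg)
{
    (void)arg;
    for (int v = 1; v <= LAST_VALUE; v++)
        atomic_store_explicit(&x, v, memory_order_relaxed);
    return NULL;
}

/* Observer: counts loads that return an older value than the previous load. */
static void *observer(void *arg)
{
    long *violations = arg;
    int prev = 0;

    for (;;) {
        int cur = atomic_load_explicit(&x, memory_order_relaxed);
        if (cur < prev)
            (*violations)++;   /* stores seen in a conflicting order */
        prev = cur;
        if (cur == LAST_VALUE)
            break;             /* the writer's final store has been seen */
    }
    return NULL;
}

/* Runs one writer and one observer; returns the number of violations seen. */
long count_coherence_violations(void)
{
    pthread_t w, o;
    long violations = 0;
    int rc;

    atomic_store(&x, 0);

    rc = pthread_create(&o, NULL, observer, &violations);
    assert(rc == 0);
    rc = pthread_create(&w, NULL, writer, NULL);
    assert(rc == 0);

    pthread_join(w, NULL);
    pthread_join(o, NULL);
    return violations;
}
```

Because the stored values increase monotonically, any decrease seen by the
observer would mean it saw two of the stores in the opposite order from the
one in which they were serialized.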
- if an incompatible change has been made to the PowerPC architecture, why
is it not documented? (I don't think I missed a more up-to-date version of
the arch manual; the web page certainly doesn't give any indication that
there might be one.)
- if an incompatible change has not been made, then what have I missed, and
what is <http://www-128.ibm.com/developerworks/eserver/articles/power4_mem.html>
going on about?
David Hopwood < XXXX@XXXXX.COM >