Reference: GCC docs. A lot of the material here seems to have been written assuming that the inline assembly consists of a single instruction which reads inputs and then writes outputs (possibly overwriting inputs).
Inline assembler code for the memory management package uses the GNU extension of named operands, e.g., "%[foo]". We generally use the "&" constraint modifier on output operands; otherwise the compiler assumes that the output operands are written as the very last action of the code, after all input operands have been used, and it may therefore assign the same register to an input operand and an output operand. Using "&" marks the output operand as "early clobber", i.e., the register is altered early on, perhaps before all the inputs have been used.
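To make the early-clobber rule concrete, here is a minimal sketch that is not taken from the memory management package. The helper name `add_two_step` is hypothetical. It writes its output in the first instruction and reads its second input only in the second one, which is exactly the situation the "&" modifier exists for. The x86 branch and the plain C++ fallback are assumptions added only so the sketch builds on non-PowerPC hosts.

```cpp
#include <cassert>

// Hypothetical helper: returns a + b using two instructions, so the output
// register is written before the input b has been read.
unsigned add_two_step(unsigned a, unsigned b)
{
    unsigned out;
#if defined(__powerpc__)
    asm ("mr  %[out],%[a] \n\t"          // out = a   (output written early)
         "add %[out],%[out],%[b]"        // out += b  (input b read late)
         : [out]"=&r"(out)               // "&": never share a register with an input
         : [a]"r"(a), [b]"r"(b));
#elif defined(__x86_64__) || defined(__i386__)
    asm ("movl %[a],%[out] \n\t"         // out = a   (AT&T syntax: source, destination)
         "addl %[b],%[out]"              // out += b
         : [out]"=&r"(out)
         : [a]"r"(a), [b]"r"(b)
         : "cc");
#else
    out = a + b;                         // fallback for other targets, no inline asm
#endif
    return out;
}
```

Without the "&", the compiler would be allowed to place `out` in the register holding `b`; the first instruction would then overwrite `b` and the function could return `a + a`.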
...
Code Block |
---|
register unsigned tmp;
asm volatile
(// Read and store a Translation Lookaside Buffer entry from the MMU.
// High part.
"tlbrehi %[tmp],%[idx] \n\t"
"stw %[tmp],0(%[self]) \n\t"
// Low part.
"tlbrelo %[tmp],%[idx] \n\t"
"stw %[tmp],4(%[self])"
: "=m"(*this), [tmp]"=&r"(tmp)
: [idx]"r"(this->index), [self]"b"(this)
:
);
|
The "I" constraint lets you specify signed 16-bit constants so the above could also have been written like this, assuming that the data members affected were named "foo" and "bar":
Code Block |
---|
register unsigned tmp;
asm volatile
(// "Self" is a placeholder for the enclosing class, which this page does not
// name; offsetof() takes (type, member).
"tlbrehi %[tmp],%[idx] \n\t"
"stw %[tmp],%[foo](%[self]) \n\t"
"tlbrelo %[tmp],%[idx] \n\t"
"stw %[tmp],%[bar](%[self])"
: "=m"(*this), [tmp]"=&r"(tmp)
: [idx]"r"(this->index), [self]"b"(this),
  [foo]"I"(offsetof(Self, foo)), [bar]"I"(offsetof(Self, bar))
:
);
|
Counterexample. The following code is meant to atomically remove the head item from a linked list and return a pointer to the item. The output operand "tmp" is not marked early-clobber, so the compiler is in principle free to give it the register already used for the input "&_head". In one case the compiler did use r0 for both ("%2" and "%1"). Note also that "%2" may be used more than once if we have to loop, which makes it especially important never to reuse its register.
Code Block |
---|
Flink<T>* head;
Flink<T>* tmp;
asm volatile ("1:;"
"lwarx %0,0,%2;"
"lwzx %1,0,(%0);"
"stwcx. %1,0,%2;"
"bne- 1b;"
: "=&r" (head), "=r" (tmp) : "r" (&_head) : "cc", "memory");
return static_cast<T*>(head);
|
Now consider the effect of the above instructions when %1 shares a register with %2 and the list has one member (terminated by a special "empty" node), assuming that the stwcx. instruction always stores:
_head -> first -> empty
Instruction | _head | first | %2==%1 | %0 |
---|---|---|---|---|
Setup | &first | &empty | &_head | ? |
lwarx | &first | &empty | &_head | &first |
lwzx | &first | &empty | &empty | &first |
stwcx. | &first | &empty | &empty | &first |
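In this case stwcx. stores &empty into the empty node rather than into _head, so _head still points at "first" even though "first" has been handed to the caller. This is what was intended, and what happens when %1 gets a register of its own: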
Instruction | _head | first | %2 | %1 | %0 |
---|---|---|---|---|---|
Setup | &first | &empty | &_head | ? | ? |
lwarx | &first | &empty | &_head | ? | &first |
lwzx | &first | &empty | &_head | &empty | &first |
stwcx. | &empty | &empty | &_head | &empty | &first |
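The two traces can be reproduced with a small toy model. This is a sketch under stated assumptions, not PowerPC code: memory words and registers are plain integers, the names `simulate_pop` and `PopOutcome` are hypothetical, the empty node's own link word is arbitrarily set to 0, and stwcx. is assumed to always store, as in the tables.

```cpp
#include <cassert>
#include <cstdint>

// Outcome of one simulated pop.
struct PopOutcome {
    bool returned_first;        // %0 ended up holding &first
    bool head_points_to_empty;  // _head was updated to &empty
};

// Toy model: each memory word holds the address of the next word, and
// regs[] stands in for the machine registers assigned to %0, %1 and %2.
PopOutcome simulate_pop(bool tmp_shares_register)
{
    using Word = std::uintptr_t;
    auto addr_of = [](Word* p) { return reinterpret_cast<Word>(p); };
    auto word_at = [](Word a) -> Word& { return *reinterpret_cast<Word*>(a); };

    Word empty = 0;                 // link word of the "empty" node (assumed 0 here)
    Word first = addr_of(&empty);   // first -> empty
    Word head  = addr_of(&first);   // _head -> first

    Word regs[3] = {0, 0, 0};
    const int op0 = 0;                              // register for %0 (head)
    const int op2 = 2;                              // register for %2 (&_head)
    const int op1 = tmp_shares_register ? op2 : 1;  // register for %1 (tmp)

    regs[op2] = addr_of(&head);       // setup: %2 = &_head
    regs[op0] = word_at(regs[op2]);   // lwarx  %0,0,%2
    regs[op1] = word_at(regs[op0]);   // lwzx   %1,0,(%0)
    word_at(regs[op2]) = regs[op1];   // stwcx. %1,0,%2 (always stores)

    return { regs[op0] == addr_of(&first), head == addr_of(&empty) };
}
```

Calling `simulate_pop(true)` models the shared-register case: the caller still receives `&first`, but `_head` is never updated. `simulate_pop(false)` models the intended behaviour, where `_head` ends up pointing at the empty node.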