Opcode Parsing & Invalid Opcodes

Opcode Parsing & Invalid Opcodes

Post by Nimai » Wed, 07 Jul 2010 06:22:58


I'm learning to program in straight machine code, and I just finished
reading the Intel manuals.

I have a burning question that the books haven't answered, maybe I'm
just stupid and I missed it.

If I do a JMP to a bunch of garbled data, how does the prefetching
process know where the "instruction boundaries" are? Where will EIP
be when the inevitable invalid opcode exception is triggered?

In other words, if the instructions are garbage, how much garbage is
taken in? What are the rules?

My guess is, each possible opcode byte has something like a lookup
table entry, and after parsing a byte, the prefetcher either adds
another byte to the instruction, adds a modr/m byte to the instruction
and grabs displacement and immediate bytes, or ends the instruction
and sends it to the pipeline. This is entirely based on inference, I
can't find anything in the manuals to confirm or deny this.
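
To make my guess concrete, here is roughly the table-driven decode I'm
imagining, written out as C. This is purely illustrative: the flag names,
the table layout and the few example entries are mine, not anything taken
from a manual.

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-opcode decode flags -- my guess at the mechanism, not
   Intel's actual tables. Entries not listed default to 0, i.e. "plain
   one-byte opcode", purely for brevity. */
#define F_INVALID  0x01   /* no such opcode: raise #UD immediately      */
#define F_MODRM    0x02   /* a ModR/M byte follows the opcode           */
#define F_IMM8     0x04   /* an 8-bit immediate follows                 */
#define F_IMM32    0x08   /* a 32-bit immediate follows                 */

static const uint8_t opcode_flags[256] = {
    [0x90] = 0,           /* NOP: one byte, nothing follows             */
    [0x05] = F_IMM32,     /* ADD EAX, imm32                             */
    [0x88] = F_MODRM,     /* MOV r/m8, r8                               */
    /* ...every other entry filled in the same way...                   */
};

/* Returns the length in bytes of the instruction at p, or 0 for "#UD".
   (Ignores prefixes, SIB and displacement to keep the sketch short.)   */
static size_t guess_length(const uint8_t *p)
{
    uint8_t flags = opcode_flags[p[0]];
    size_t  len   = 1;    /* the opcode byte itself                     */

    if (flags & F_INVALID)
        return 0;         /* decoding stops here: invalid opcode        */
    if (flags & F_MODRM)
        len += 1;
    if (flags & F_IMM8)
        len += 1;
    if (flags & F_IMM32)
        len += 4;
    return len;
}

int main(void)
{
    const uint8_t code[] = { 0x05, 0x78, 0x56, 0x34, 0x12 };  /* ADD EAX, 0x12345678 */
    printf("%zu bytes\n", guess_length(code));                /* prints 5 */
    return 0;
}

Is the real prefetcher/decoder anything like that, byte for byte? That's
exactly what I can't find stated anywhere.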

Whatever process it uses, it MUST be entirely deterministic, or code
can't be. So where is it documented?
 
 
 

Opcode Parsing & Invalid Opcodes

Post by BGB / cr88192 » Wed, 07 Jul 2010 07:00:44


simple answer:
the bytes either look like an opcode the processor recognizes, or they don't;
if they look like a recognized instruction, the processor will behave as if
they were that instruction;
if they don't, the processor will #UD, leaving EIP pointing at the start of
the thing which looks like the bad instruction (any bytes read while
attempting to decode the opcode don't matter).

now, how many bytes that apparent instruction appears to span depends on the
particular processor and the particular bytes...

in my x86 interpreter, I basically just did a big pattern match over the
opcodes table for decoding each instruction. failing here (no match) was a
#UD. similarly, if there was no logic attached to the opcode in the
interpreter, this was also a #UD (may happen with opcodes which exist in the
ISA, but nothing was written into the interpreter for them to do...).
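
or, FWIW, something like this -- a heavily boiled-down sketch of the idea in
C, with made-up names and a 2-entry table, not the actual interpreter code:

#include <stdio.h>
#include <stdint.h>

/* the "CPU" here is just an instruction pointer */
typedef struct { uint32_t eip; } Cpu;

typedef struct {
    uint8_t  opcode;              /* first byte to match                     */
    void   (*handler)(Cpu *cpu);  /* emulation logic; NULL = "never written" */
} OpcodeEntry;

static void op_nop(Cpu *cpu) { cpu->eip += 1; }

static const OpcodeEntry opcode_table[] = {
    { 0x90, op_nop },             /* NOP: has logic attached                 */
    { 0xF4, NULL   },             /* HLT: in the table, but no logic written */
};

static void raise_ud(Cpu *cpu)    /* EIP is left pointing at the bad bytes   */
{
    printf("#UD at eip=%08x\n", (unsigned)cpu->eip);
}

static void step(Cpu *cpu, const uint8_t *mem)
{
    uint8_t b = mem[cpu->eip];
    for (size_t i = 0; i < sizeof opcode_table / sizeof opcode_table[0]; i++) {
        if (opcode_table[i].opcode == b) {
            if (opcode_table[i].handler) opcode_table[i].handler(cpu);
            else                         raise_ud(cpu);  /* no logic -> #UD  */
            return;
        }
    }
    raise_ud(cpu);                /* no match at all -> #UD                  */
}

int main(void)
{
    const uint8_t mem[] = { 0x90, 0xF4 };
    Cpu cpu = { 0 };
    step(&cpu, mem);              /* runs the NOP                            */
    step(&cpu, mem);              /* prints "#UD at eip=00000001"            */
    return 0;
}

the real hardware obviously doesn't loop over a table like this, but the
"match or #UD, and EIP stays put" part is the same idea...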

note that on newer processors, there is the NX bit, which will generate an
exception as soon as EIP/RIP lands in a page with that bit set, if this
matters...

or such...

 
 
 

Opcode Parsing & Invalid Opcodes

Post by Nimai » Wed, 07 Jul 2010 07:18:51

On Jul 5, 3:00 pm, "BGB / cr88192" < XXXX@XXXXX.COM >




Theories are nice, but where's the proof? I need word of god on this.

Also, your method works for fixed instruction lengths, but it doesn't
describe what happens to extra bytes trailing after an invalid opcode
is encountered. What if the modr/m byte is screwed up? Does it treat
immediate/displacement data as additional, invalid opcodes, or does
it throw them away? What if an opcode is corrupted, and its immediate
data or displacement contains a valid opcode?

These are all boundary conditions, sure, and probably useless to most
people, but I need to know!

I want to be able to take any possible stream of bits, no matter how
full of garbage, and be able to know EXACTLY what will happen.

I do NOT want to have to fuzz test my CPU. These things should be
documented.

Again, my question is this: Where is this documented?

Where does Intel specifically say, "instructions are parsed according
to these rules:"?
 
 
 

Opcode Parsing & Invalid Opcodes

Post by Bernhard Schornak » Wed, 07 Jul 2010 08:27:37


If you jump to any target, the prefetch mechanism
starts to read data from 00[jmp_target]. If these
data cannot be interpreted as an instruction, the
processor issues an invalid opcode exception. For
conditional jumps, the built-in branch prediction
logic determines if the code at 00[jmp_target] is
prefetched or the branch is not taken, and execu-
tion continues with the next opcode following the
conditional jump. If a prediction fails, the pre-
loaded code must be flushed and the real code has
to be loaded.
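
A failed prediction is not free: the fetched and decoded instructions are
thrown away and the pipeline has to be refilled. You can observe that cost
from ordinary user code with a toy test like the one below. This is only an
illustration of mine - absolute numbers depend entirely on the CPU, and it
should be compiled with low optimisation (e.g. -O1), because otherwise the
compiler may replace the branch with a conditional move and hide the effect.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 10000000

/* Sums every element below 128. With sorted data the branch is predicted
   almost perfectly; with random data it is mispredicted about half the
   time, and every miss costs a pipeline flush. */
static long sum_below_128(const unsigned char *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] < 128)                  /* the data-dependent branch */
            s += a[i];
    return s;
}

static double timed_run(const unsigned char *a)
{
    clock_t t0 = clock();
    long s = sum_below_128(a, N);
    clock_t t1 = clock();
    printf("sum=%ld  ", s);              /* keep the result alive */
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}

static int cmp(const void *x, const void *y)
{
    return *(const unsigned char *)x - *(const unsigned char *)y;
}

int main(void)
{
    static unsigned char a[N];
    for (size_t i = 0; i < N; i++)
        a[i] = (unsigned char)rand();

    printf("random: %.3f s\n", timed_run(a));   /* many mispredictions */
    qsort(a, N, 1, cmp);
    printf("sorted: %.3f s\n", timed_run(a));   /* few mispredictions  */
    return 0;
}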

The prefetcher is a stupid machine and reads data
from anywhere if the current instruction tells it
to do that. The required brain has to be provided
by the programmer or compiler (which got its true
brain from a human, as well).

If you write binary code, you should know how op-
codes are assembled. Every processor works like a
disassembler. It reads byte for byte and compares
them against its opcodes. If a match is found, it
determines if additional data are required - e.g.
the address of your jump target - and gets those,
as well. If everything was okay, this instruction
is executed (with the data you provided). If not,
your program crashes.

This probably is not documented anywhere. You can
figure it out with the opcodes listed in AMD's or
iNTEL's manuals on your own.


Greetings from Augsburg

Bernhard Schornak
 
 
 

Opcode Parsing & Invalid Opcodes

Post by Nimai » Wed, 07 Jul 2010 09:30:43

On Jul 5, 4:27 pm, Bernhard Schornak < XXXX@XXXXX.COM >
wrote:

Thanks, but I already know all that. It's not that I don't have a
pretty good idea about these answers, but that I need to be able to
confirm them in order to move forward, and at this point my only
options are to fuzz test, or RTFM, and TFM doesn't seem to exist.
Computing is not guesswork. What I really need is official
documentation on what an IA32 processor is "supposed" to do when it
fetches, decodes and executes an instruction, without ignoring all the
possible boundary conditions.

The original 386 manual has a useful tidbit about how invalid opcode
exceptions aren't triggered during prefetch. The current manual
doesn't have this information anywhere, and its sections on pipeline
behavior read more like marketing blurbs than technical reference.
Are these manuals really ALL the information we have about IA32?

Any solid information about the deterministic mechanisms behind
prefetch, pipeline, and execution, from a reliable source, would be
incredibly helpful. The behavior of the opcode interpreter is
arguably the most important part of the entire architecture, and it
seems like the only way to deduce its behavior is to reverse-engineer
it from the opcodes and the descriptions of their effects.

Any guesses and hypotheses about these things are nice, but if
documentation exists, they're a waste of time.







 
 
 

Opcode Parsing & Invalid Opcodes

Post by Rod Pemberton » Wed, 07 Jul 2010 15:37:49

Nimai" < XXXX@XXXXX.COM > wrote in message
news: XXXX@XXXXX.COM ...

Don't do that... Why would you intentionally jump to "garbled data"?

BTW, the single-byte x86 opcode instruction map is full. So, to generate an
"invalid instruction", you must use multi-byte instruction opcodes... I.e.,
there's no such thing as executing "garbled data" on x86. Any randomly
generated "garbled data" will most likely result in generating one of the
single-byte opcodes. They'll execute. Occasionally, it may be a multi-byte
instruction, which will execute also. And, rarely, it may be an invalid
multi-byte opcode, which will be trapped...


Why would it need to "know"?

The instruction decoder(s) decodes the instruction after some bytes have
already been (pre)fetched...

The 8086 has a 6-byte buffer prefetch. Later versions will likely have much
more, but at least 15, since that's the instruction size limit on 386 or
later.


Still pointing at the "inevitable invalid opcode", perhaps? If it's invalid
and the microprocessor prevented it from being executed, why would
it have a known size? ...

If an "invalid opcode" had a determinable size, that means the
microprocessor is able to decode an invalid opcode. That implies that the
microprocessor doesn't detect "invalid opcodes". x86 does (from 286?...).
Early micro's used an instruction mask, so they didn't have any invalid
opcodes. On them, an "invalid opcode" actually did *something*, but usually
something not that useful... Those micro's didn't detect invalid opcodes
prior to execution.


What rules?

It probably varies from microprocessor generation to microprocessor
generation, and manufacturer to manufacturer.

(Or, as a programmer, why would you *ever* need to know?)


That might've been true on an 8086/88. But, on later processors, it's my
guess that they read a large block of bytes at a time. The prefetched
bytes might be called a cache line in the cache...
http://en.wikipedia.org/wiki/CPU_cache

What's known is the maximum x86 instruction length:

386 or later has 15 byte maximum
286 has a 10 byte maximum
8086/88 has no maximum, but the basic instruction size was 1 to 6 bytes

Some info on the 8086:

8086 has a 6-byte buffer prefetch
8086 instruction size 1 to 6 bytes
8086 21-bit microinstructions (504)
8086 two stage pipeline


It's probably proprietary. But, there is some public information on how
various x86 microprocessors work, including prefetch. E.g.,

Inside AMD64 Architecture (See page 6)
http://www.hardwaresecrets.com/article/Inside-AMD64-Architecture/324/1

Into the Core: Intel's next-generation microarchitecture (maybe see page 5)
http://arstechnica.com/hardware/news/2006/04/core.ars/

You can find articles like that when searching for "reorder buffer" or
"micro-ops" or "macro-fusion" or "microinstructions" inconjunction with
"x86" or "AMD64" etc.


Rod Pemberton


 
 
 

Opcode Parsing & Invalid Opcodes

Post by Alexei A. » Wed, 07 Jul 2010 16:07:57


> If I do a JMP to a bunch of garbled data, how does the prefetching
> process know where the "instruction boundaries" are?
I'm not sure what exactly you mean here by prefetching and boundaries
together.
> Where will EIP
> be when the inevitable invalid opcode exception is triggered?

At the entry point of the #UD handler, which will have on its stack
the address of that invalid instruction.
> In other words, if the instructions are garbage, how much garbage is
> taken in? What are the rules?

If that "instruction" causes a #UD, none. #UD is a fault type of
exception. As such, returning from the #UD handler will force the CPU
to try to execute that instruction again. If it doesn't cause a #UD,
chances are it's an (officially) undocumented instruction.
> My guess is, each possible opcode byte has something like a lookup
> table entry, and after parsing a byte, the prefetcher either adds
> another byte to the instruction, adds a modr/m byte to the instruction
> and grabs displacement and immediate bytes, or ends the instruction
> and sends it to the pipeline. This is entirely based on inference, I
> can't find anything in the manuals to confirm or deny this.
>
> Whatever process it uses, it MUST be entirely deterministic, or code
> can't be.
Surely, the behavior is deterministic.
> So where is it documented?

Perhaps, Intel/AMD internal documentation that you're not going to get
access to?

There are a number of places in their public documents where things
are either not described in full detail or are hedged with statements like
"should (not)", "undefined" and such. If you carefully study the
explanation of shift/rotate instructions, you'll find "undefined"
there. These instructions, of course, do not generate random results
in the "undefined" cases, they consistently produce the same output
for the same inputs on the same CPU. But this output may be different
on different CPUs. I've found two different implementations of shift/
rotate on Intel CPUs and one on AMD, and all three are different.
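
For example, OF after a rotate by more than one bit position is one of
those "undefined" cases. A little probe like the following (GCC/Clang
inline asm; my own illustration, not something from the manuals) will give
you a perfectly repeatable answer on any one CPU, but nothing promises the
same answer on a different one:

#include <stdio.h>

/* Rotates 'value' left by 2 and returns the resulting OF flag. Intel
   documents OF as undefined for rotate counts other than 1, so the result
   is consistent on a given CPU but not guaranteed across CPUs. */
static unsigned probe_of_after_rol2(unsigned value)
{
    unsigned char of;
    __asm__ volatile (
        "roll $2, %0\n\t"    /* ROL r32, imm8 with count != 1 */
        "seto %1"            /* capture OF                    */
        : "+r"(value), "=q"(of)
        :
        : "cc");
    return of;
}

int main(void)
{
    for (unsigned v = 0; v < 8; v++)
        printf("ROL(0x%08x, 2): OF=%u\n", v << 29, probe_of_after_rol2(v << 29));
    return 0;
}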

You may try contacting Intel/AMD, but be prepared to get ignored or
RTFM'd by them. After all, why should they care about you and others
like you? Seriously? You'd only incur support costs.

Alex
 
 
 

Opcode Parsing & Invalid Opcodes

Post by BGB / cr88192 » Wed, 07 Jul 2010 17:51:07

"Nimai" < XXXX@XXXXX.COM > wrote in message
news: XXXX@XXXXX.COM ...
On Jul 5, 3:00 pm, "BGB / cr88192" < XXXX@XXXXX.COM >
wrote:

<--
Theories are nice, but where's the proof? I need word of god on this.
-->

you can't prove anything...

it is like, one can ask themselves how can they be certain they exist.
the best answer would seem to be that one can look at their hand, and infer
that only someone who exists can raise the question as to whether or not
they exist. but, at best, this is the heuristic...


how about this view of the world:
there are no guarantees, and there are no absolutes;
all is, at best, probability, heuristics, and guesswork.

really, it is a world built up in the air, a world built of words and
suppositions.

so we take these guesses and arbitrary statements, and simply pretend as if
they were the truth, and at any moment the specifics may be changed and
revised, and then we have some new "absolute" reality...

one can assert that reality is absolute, but how can one prove it?...
how can one prove what, if anything, then, is absolute?...
for all it matters, all that has been seen thus far could be instead
synthetic behavior, the world we see instead being a piece of machinery
built on top of some other, almost entirely different, universe.

and, one can ask, really, what does it matter?...

the world we live in could just as easily be folded up into a paper crane
for what it matters.


<--
Also, your method works for fixed instruction lengths, but it doesn't
describe what happens to extra bytes trailing after an invalid opcode
is encountered. What if the modr/m byte is screwed up? Does it treat
immediate/displacement data as additional, invalid opcodes, or does
it throw them away? What if an opcode is corrupted, and its immediate
data or displacement contains a valid opcode?
-->

simple answer:
ModR/M can't be screwed up...
why?... because all possible encodings are valid.
likewise for SIB and displacement...
one gets different results, but nothing can be "wrong" with these bytes as
far as the CPU is concerned (except when part of the opcode goes into "reg",
but then it is a #UD as before...).
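
to illustrate, here is a rough sketch in C of the 32-bit ModR/M / SIB /
displacement size rules, written from memory (so double-check it against
the tables in the manuals before trusting it):

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Given a pointer to a ModR/M byte (32-bit addressing modes only), returns
   how many bytes the ModR/M + optional SIB + displacement occupy. Every
   value 0x00..0xFF decodes to *something*; there is no "invalid" ModR/M
   byte, only different addressing forms. */
static size_t modrm_bytes(const uint8_t *p)
{
    uint8_t modrm = p[0];
    uint8_t mod = modrm >> 6;         /* bits 7..6 */
    uint8_t reg = (modrm >> 3) & 7;   /* bits 5..3: register, or for some
                                         opcodes an opcode extension
                                         ("/digit"); that field is where a
                                         bad value can make the whole thing
                                         a #UD */
    uint8_t rm  = modrm & 7;          /* bits 2..0 */
    size_t  len = 1;                  /* the ModR/M byte itself */

    (void)reg;

    if (mod == 3)                     /* register operand, nothing follows */
        return len;

    if (rm == 4) {                    /* an SIB byte follows */
        uint8_t sib = p[1];
        len += 1;
        if (mod == 0 && (sib & 7) == 5)
            len += 4;                 /* base=EBP with mod=00: disp32 */
    } else if (mod == 0 && rm == 5) {
        len += 4;                     /* [disp32] with no base register */
    }

    if (mod == 1) len += 1;           /* disp8  */
    if (mod == 2) len += 4;           /* disp32 */
    return len;
}

int main(void)
{
    const uint8_t bytes[] = { 0x44, 0x24, 0x08 };  /* ModR/M=44 SIB=24 disp8=08, i.e. [esp+8] */
    printf("%zu bytes\n", modrm_bytes(bytes));     /* prints 3 */
    return 0;
}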


<--
These are all boundary conditions, sure, and probably useless to most
people, but I need to know!

I want to be able to take any possible stream of bits, no matter how
full of garbage, and be able to know EXACTLY what will happen.

I do NOT want to have to fuzz test my CPU. These things should be
documented.

Again, my question is this: Where is this documented?

Where does Intel specifically say, "instructions are parsed according
to these rules:"?
-->

but, anyways, besides what is documented, the CPU is free to do whatever,
and really this depends a lot on the CPU.

after all, if the CPU had "absolute" behavior, where would there be all the
special edge-cases left to hack over and redefine as new behavior?...

there are almost always little specific details depending on the specific
vendor and model of processor, and before CPUID this was commonly how people
identified which version of which processor was in use...
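
as an example of that sort of probing: the old check for whether CPUID itself
exists is just "try to flip bit 21 of EFLAGS and see if it sticks", one of a
family of EFLAGS-bit checks that were used to tell processor generations
apart. a from-memory sketch (32-bit code, build with gcc -m32; verify before
relying on it):

#include <stdio.h>

/* If the ID bit (bit 21) of EFLAGS can be toggled, the CPUID instruction is
   supported; if the bit springs back, it is an older part. */
static int id_bit_toggles(void)
{
    unsigned int before, after;
    __asm__ volatile (
        "pushfl\n\t"
        "popl %0\n\t"              /* before = EFLAGS                    */
        "movl %0, %1\n\t"
        "xorl $0x200000, %1\n\t"   /* flip the ID bit                    */
        "pushl %1\n\t"
        "popfl\n\t"                /* try to write it back               */
        "pushfl\n\t"
        "popl %1"                  /* after = EFLAGS as the CPU kept it  */
        : "=r"(before), "=r"(after)
        :
        : "cc");
    return ((before ^ after) & 0x200000) != 0;
}

int main(void)
{
    printf("CPUID %s\n", id_bit_toggles() ? "available" : "not available");
    return 0;
}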



 
 
 

Opcode Parsing & Invalid Opcodes

Post by N0Spa » Wed, 07 Jul 2010 21:34:17

On Mon, 5 Jul 2010 14:22:58 -0700 (PDT), Nimai



It seems to me that the only real question here is where EIP
will be when the invalid opcode exception is triggered:
Will it be at the first byte of the garbage, or at the byte
where the decoder decides it is garbage?

Personally, I'd want EIP to point to the start of the bad
instruction, not several bytes along. I can't imagine any
use for the latter approach.

But wouldn't this question be easy enough to solve via
experiment? Make up a known-bad "opcode" and see what
happens. Surely if a few trials with different code show
EIP at the start, that would be convincing evidence of the
behavior... it would be *really* strange if the decoder
didn't use the same method every time.
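
Something along these lines ought to do it on Linux (just a sketch I would
want to double-check myself; UD2 is the opcode Intel reserves specifically
for raising the invalid opcode exception, and si_addr in the SIGILL handler
is where the CPU/kernel report the bad instruction to be):

#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

/* Two bytes of known-bad "code": UD2 (0F 0B), followed by a RET that is
   never reached. */
__asm__(
    ".text\n"
    ".globl bad_code\n"
    "bad_code:\n"
    "    ud2\n"
    "    ret\n");
extern void bad_code(void);

static void on_sigill(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    /* printf/exit in a signal handler are not strictly kosher, but this is
       only a one-shot experiment */
    printf("SIGILL at %p, bad_code starts at %p\n",
           info->si_addr, (void *)bad_code);
    exit(info->si_addr == (void *)bad_code ? 0 : 1);
}

int main(void)
{
    struct sigaction sa = {0};
    sa.sa_sigaction = on_sigill;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGILL, &sa, NULL);

    bad_code();           /* jump into the known-bad bytes */
    return 2;             /* never reached */
}

If it prints the same address twice and exits 0, the fault address was the
start of the bad instruction, which would match the "fault" classification
in the manuals.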

Best regards,


Bob Masta

DAQARTA v5.10
Data AcQuisition And Real-Time Analysis
www.daqarta.com
Scope, Spectrum, Spectrogram, Sound Level Meter
Frequency Counter, FREE Signal Generator
Pitch Track, Pitch-to-MIDI
DaqMusic - FREE MUSIC, Forever!
(Some assembly required)
Science (and fun!) with your sound card!
 
 
 

Opcode Parsing & Invalid Opcodes

Post by James Harr » Wed, 07 Jul 2010 23:19:01

On 5 July, 22:22, Nimai < XXXX@XXXXX.COM > wrote:


At machine level what determines the *meaning* of a byte is how it is
*used*. If you move part of an instruction into a register it is not
treated as an instruction but as a piece of data. It's just a pattern
of ones and zeros. Conversely if you try to execute some of your data
then, for the purposes of the CPU, it is not taken as data but
effectively *is* an instruction. Again, it's just a pattern of ones
and zeros. For example if you move into a register a byte containing
the value 83 (0x53) it will be taken as the number 83. If you try to
execute that byte (e.g. by jumping to it) it will be taken as the
instruction to push the EBX register on to the stack.
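
If you want to see the same bytes wearing both hats, something like the
following works on Linux, provided the system still allows a mapping that
is both writable and executable (illustrative only - many hardened setups
forbid that):

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    /* As data these are just the numbers 83, 91, 195. Jumped to, the CPU
       reads them as: push ebx/rbx; pop ebx/rbx; ret. */
    static const unsigned char bytes[] = { 0x53, 0x5B, 0xC3 };

    printf("as data: %u %u %u\n", bytes[0], bytes[1], bytes[2]);

    void *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }
    memcpy(buf, bytes, sizeof bytes);

    void (*as_code)(void) = (void (*)(void))buf;
    as_code();            /* the same three bytes, executed as instructions */
    printf("as code: pushed and popped a register, then returned\n");
    return 0;
}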

If the CPU can make sense of the instruction it will do whatever the
"instruction" tells it to do and then go on to the next instruction.
If it can't make sense of it it will issue an undefined opcode
exception.

So, to answer your question as to how much garbage is taken in: the
answer is that it takes in and executes as much of it as makes sense
(i.e. decodes to valid and permissible instructions). If and when it
comes across some bit patterns which are not valid instructions it
will generate an exception.


I think that's a good working model.


Modern x86 CPUs have documented behaviour. IIRC the old 8086 didn't
take an exception if it came across garbage that didn't decode to
valid instructions. Who knows what it did but the action possibly
depended on the internals of a given fabrication and wasn't
documented.

For anything modern check the reference manual under Interrupts and
Exceptions. See Exception 6, invalid opcode. It is classified as a
"fault." This means that the CPU will wind back to the beginning of
the faulting instruction before invoking the exception handler.

"Faults ... When a fault is reported, the processor restores the
machine state to the state prior to the beginning of execution of the
faulting instruction. The return address (saved contents of the CS and
EIP registers) for the fault handler points to the faulting
instruction, rather than to the instruction following the faulting
instruction."

Not all CPUs conveniently wind back. Some stop at wherever they got
to, making it virtually impossible to tell the start of the instruction
they are complaining about. x86 is OK though.

James