This puzzled me for a while, so I thought I'd explain it here. Don't hesitate
to contact me if it's still unclear, if I am still making mistakes, or if you
just feel like it :-)
If, like me, you want to encode or decode Intel IA-32 instructions by hand, you
have probably read about the Mod R/M byte. It is an extra byte of information
that some opcodes require. When present, it comes directly after the opcode,
and how it is interpreted depends on the instruction.
Encoding
Here is how it is encoded:
 7   6 5   3 2   0
-------------------
| MOD | R/O | R/M |
-------------------
- mod (2 bits): combined with r/m, selects one of 32 values: 8 registers or
24 addressing modes;
- r/o (3 bits): contains either a register number or extra opcode information;
- r/m (3 bits): names a register operand, or combines with mod to complete the
addressing mode.
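
The layout above boils down to a couple of shifts and masks. Here is a minimal sketch in Python; the function names `pack_modrm` and `unpack_modrm` are made up for illustration and do not come from any library:

```python
def pack_modrm(mod, reg, rm):
    """Build a Mod R/M byte from its three fields (hypothetical helper)."""
    if not (0 <= mod <= 3 and 0 <= reg <= 7 and 0 <= rm <= 7):
        raise ValueError("mod must be 0..3, reg and rm must be 0..7")
    # mod sits in bits 7-6, r/o in bits 5-3, r/m in bits 2-0
    return (mod << 6) | (reg << 3) | rm


def unpack_modrm(byte):
    """Split a Mod R/M byte back into (mod, reg, rm)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("a Mod R/M byte must fit in 8 bits")
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111
```

For instance, `unpack_modrm(0xD6)` splits the byte into mod = 3, r/o = 2 and r/m = 6.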
Then, in the Intel Instruction Set Reference (volume 2), you will notice that
some opcode numbers are followed by one or more of the following:
- /digit: digit is 0 to 7 and goes in the r/o field as an opcode extension;
only r/m is used as an operand;
- /r: both r/o and r/m are used as operands;
- cb, cw, cd, cp: the opcode is followed by a 1-, 2-, 4- or 6-byte code offset
value, respectively;
- ib, iw, id: the opcode is followed by a 1-, 2- or 4-byte immediate operand,
respectively;
- +rb, +rw, +rd: a register code (0 to 7) is added to the opcode itself;
- +i: a floating-point register code is added to the opcode itself.
Mod R/M table
What matters here are the "/r" and "/digit" notations. They select the column
in the Mod R/M table (or in the one for 16-bit addressing): the column is the
value of the r/o field, that is, the digit or the register number. Then you
pick the register or the effective address that you need, which selects the
row. The cell where the two meet is your Mod R/M byte!
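
The register rows of the table (those where mod is 11 in binary) can also be recomputed instead of looked up. Below is a rough sketch in Python, assuming the usual 8-bit register numbering (AL is 0, CL is 1, and so on up to BH at 7); `register_rows` is a hypothetical name, and the memory addressing rows are deliberately left out:

```python
# 8-bit registers in encoding order: the index is the register number
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]


def register_rows(digit):
    """Return {register: Mod R/M byte} for the mod=11 rows of column /digit."""
    if not 0 <= digit <= 7:
        raise ValueError("the column digit must be 0..7")
    # 0xC0 is mod=11; the column goes in bits 5-3, the row in bits 2-0
    return {name: 0xC0 | (digit << 3) | rm for rm, name in enumerate(REG8)}
```

Calling `register_rows(2)` gives the eight register cells of the "/2" column, keyed by register name.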
Quick example
Here follows a practical application just to be sure. Say you want to encode
this:
ADC $0x90, %dh
where $0x90 is an immediate value, and %dh is the destination register. The
Intel documentation mentions:
80 /2 ib ADC r/m8,imm8
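where 80 is the opcode (in hexadecimal), ADC the "Add with carry" instruction,
and r/m8,imm8 means it takes an 8-bit register or memory operand (described by
the Mod R/M byte) and a single-byte immediate value as arguments. The "/2"
puts 2 in the r/o field, and "ib" means the immediate byte follows. It will be
encoded like this: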
0x80 0xD6 0x90
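because in the "/2" column from the table (the third), the line corresponding
to the DH register contains the value D6. You can check it by hand: mod is 11
(register operand), r/o is 010 (the /2) and r/m is 110 (DH is register 6),
which gives 11010110 in binary, or D6 in hexadecimal.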
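
The same example can be checked with a few lines of Python. This is only a sketch for the register form of "80 /2 ib": the function name `encode_adc_r8_imm8` is hypothetical, and memory operands are not handled.

```python
# 8-bit registers in encoding order: the index is the register number
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]


def encode_adc_r8_imm8(reg_name, imm):
    """Encode ADC imm8 into an 8-bit register using opcode 80 /2 ib."""
    if not 0 <= imm <= 0xFF:
        raise ValueError("the immediate must fit in one byte")
    rm = REG8.index(reg_name.upper())       # raises ValueError if unknown
    modrm = (0b11 << 6) | (2 << 3) | rm     # mod=11, r/o=/2, r/m=register
    return bytes([0x80, modrm, imm])


print(encode_adc_r8_imm8("dh", 0x90).hex(" "))  # prints "80 d6 90"
```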
I thought it would be useful to summarize this in one place.