I said I'd never blog

DeforaOS, NetBSD, reverse-engineering and stuff


DRM patch for Linux 2.4.36.2
Wed May 21 04:11:10 CEST 2008

Since I already know that you'll ask why I have been doing this for an « outdated » kernel version (2.4...), and on a Tuesday evening until 4:00 am, well, I'll simply tell you:

  • I *really* appreciate the stability of the 2.4 series;
  • yes, because I don't use Linux that much anymore, so I'm glad I don't have to compile/update my kernel every time I multi-boot into it;
  • no, I am usually using NetBSD instead;
  • no, I haven't been gaming all night (or even trying to);
  • but rather, my last 2.6 update broke the sh*t out of everything on this system;
  • this system happens to be my last machine currently in running order that also takes a keyboard and a screen (every piece of hardware I touch fails, even without socks);
  • 3D support is the last working bit that was missing with my good old 2.4 grub entry;
  • it's an entertaining way to hack a bit on kernel code, too :)
  • (and I can't sleep until I have finished whatever I'm working on)

Enough chit chat, the patch is out, and on its way to the 2.4 kernel maintainer already.

Update: the patch will not be integrated into Linux 2.4, but it can now also be found on the LKUP page (here).

It may look trivial
Tue May 20 02:45:08 CEST 2008

It may look trivial indeed, by just looking at it:
hello3.c

But in fact, this image was generated with the graph plug-in of the DeforaOS C compiler. Here is the content of the file that was parsed:

#include <stdio.h>

int main()
{
	printf("%s", "Hello, world!\n");
	return 0;
}
And yes, <stdio.h> was processed. The parser is not yet complete enough to handle /usr/include from a mainstream system, but this is precisely where I am really glad to have written my own libc for DeforaOS: I kept it as simple as possible, which really helps here.

For the record, here is the command line I used to "compile" the image, from within the Apps/Devel/src/c99 directory:

$ make &&
make install > /dev/null 2> /dev/null &&
./src/c99 -M graph -c test/hello3.c &&
cat test/hello3.o &&
dot -Tpng test/hello3.o -o test/hello3.png &&
open test/hello3.png
where:

While it is far from being finished, I fully intend to keep working on this ambitious project. Meaning, a compiler that can:

  • target every architecture it supports without being recompiled;
  • be extended with (possibly proprietary) modules without being recompiled;
  • be useful not only for compiling code;
  • and last but not least, be used as a library.

If I have written this clearly enough, you will have understood that this would make cross-compiling obsolete, among other things.

But don't hold your breath: it's gonna take me a while, at best...

Filing bugs
Mon May 19 01:37:48 CEST 2008

How to read the "Mod R/M" table
Wed May 7 17:11:05 CEST 2008

This puzzled me for a while, so I thought I'd explain it here. Don't hesitate to contact me if it's still unclear, if I am still making mistakes, or if you feel like it :)

If, like me, you want to encode or decode Intel IA-32 instructions manually, you have probably read about the Mod R/M byte. It's an extra byte of information, which is required by some opcodes and absent for others. When present, it is found directly after the opcode, and is interpreted in a number of different ways.

Encoding

Here is how it is encoded:

  7  6 5   3 2    0
 ------------------
| MOD | R/O | R/M  |
 ------------------
  • mod: combined with r/m, selects one of 32 possible values: 8 registers or 24 addressing modes;
  • r/o: contains either a register number or extra opcode information;
  • r/m: defines a register as operand, or completes the addressing mode together with mod.
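
To make the layout above concrete, here is a minimal sketch in C that packs and unpacks the three fields. The ModRM structure and the modrm_pack() and modrm_unpack() names are made up for this illustration; only the bit positions come from the diagram.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type and helper names; the field layout is the one
 * from the diagram: mod in bits 7-6, r/o in bits 5-3, r/m in bits 2-0. */
typedef struct {
	uint8_t mod; /* bits 7-6 */
	uint8_t reg; /* bits 5-3: the "r/o" field */
	uint8_t rm;  /* bits 2-0 */
} ModRM;

/* Pack the three fields into a single Mod R/M byte. */
uint8_t modrm_pack(ModRM m)
{
	assert(m.mod <= 3 && m.reg <= 7 && m.rm <= 7);
	return (uint8_t)((m.mod << 6) | (m.reg << 3) | m.rm);
}

/* Split a Mod R/M byte back into its three fields. */
ModRM modrm_unpack(uint8_t byte)
{
	ModRM m;

	m.mod = (byte >> 6) & 0x3;
	m.reg = (byte >> 3) & 0x7;
	m.rm = byte & 0x7;
	return m;
}
```

For instance, unpacking 0xD6 gives mod = 3, r/o = 2 and r/m = 6, and packing those three values gives 0xD6 back.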

Then, in the Intel Instruction Set Reference (volume 2), you will notice that some opcode numbers are followed by one or more of the following:

  • /digit: where digit is 0 to 7; only the r/m field is used as an operand, and the r/o field holds the digit as an extension of the opcode;
  • /r: both r/o and r/m are used as operands;
  • cb, cw, cd, cp: the opcode is followed by a 1-, 2-, 4- or 6-byte code offset value, respectively;
  • ib, iw, id: the opcode is followed by a 1-, 2- or 4-byte immediate operand, respectively;
  • +rb, +rw, +rd: a register code (0 to 7) is added to the opcode itself;
  • +i: a floating-point register code is added to the opcode itself.
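
As an illustration of the "+rb" and "ib" notations from the list above, here is a small sketch in C. The encode_plus_rb() helper is hypothetical; it simply adds the register code to the opcode byte and appends the immediate, without any Mod R/M byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode an instruction documented as "base+rb ib": the register code
 * is added to the opcode byte itself, so no Mod R/M byte is emitted.
 * encode_plus_rb() is a hypothetical helper, not part of any library.
 * Returns the number of bytes written to out. */
size_t encode_plus_rb(uint8_t base, uint8_t reg, uint8_t imm8, uint8_t out[2])
{
	assert(reg <= 7);
	out[0] = (uint8_t)(base + reg); /* opcode + register code */
	out[1] = imm8;                  /* ib: 1-byte immediate */
	return 2;
}
```

The Intel reference lists MOV r8,imm8 as "B0+rb ib", so with base 0xB0, register code 6 (DH) and immediate 0x90, this sketch produces the bytes B6 90.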

Mod R/M table

What matters here are the "/r" and "/digit" notations. They actually index the columns of the Mod R/M table (or its counterpart for 16-bit addressing). Then you pick the register or the effective address that you need, which gives you the row. The value at their intersection is your Mod R/M byte!
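
The table is really just the three bit fields written out in full, so a row can be recomputed instead of looked up. Here is a rough sketch in C, assuming a row is identified by its (mod, r/m) pair and the columns run from /0 to /7; modrm_table_row() is a name I made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the eight values found on one row of the Mod R/M table.
 * A row is selected by (mod, r/m); the columns are /0 to /7.
 * modrm_table_row() is a hypothetical helper for illustration. */
void modrm_table_row(uint8_t mod, uint8_t rm, uint8_t row[8])
{
	uint8_t digit;

	assert(mod <= 3 && rm <= 7);
	for (digit = 0; digit < 8; digit++)
		row[digit] = (uint8_t)((mod << 6) | (digit << 3) | rm);
}
```

For the row of the DH register (mod 3, r/m 6), this fills the array with C6 CE D6 DE E6 EE F6 FE.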

Quick example

Here follows a practical application just to be sure. Say you want to encode this:

	ADC	$0x90, %dh
where $0x90 is an immediate value, and %dh is the destination register. The Intel documentation mentions:
	80 /2 ib ADC r/m8,imm8
where 80 is the opcode (in hexadecimal), ADC the "Add with carry" instruction, and r/m8,imm8 means it takes a mod r/m byte and an immediate value (single byte) as arguments. It will be encoded like this:
	0x80 0xD6 0x90
because in the "/2" column of the table (the third one), the row corresponding to the DH register contains the value D6 (mod = 11, r/o = 010, r/m = 110, that is 11010110 in binary).
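
The same example can be sketched in C. This only covers the register-operand case of the "80 /2 ib" form, and the reg8 enumeration and encode_adc_r8_imm8() function are names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 8-bit register codes, as used in the r/m field when mod is 11. */
enum reg8 { AL, CL, DL, BL, AH, CH, DH, BH };

/* Encode "ADC r8, imm8" using the "80 /2 ib" form.
 * encode_adc_r8_imm8() is a hypothetical name chosen for this sketch.
 * Returns the number of bytes written to out. */
size_t encode_adc_r8_imm8(enum reg8 dst, uint8_t imm8, uint8_t out[3])
{
	assert((unsigned)dst <= BH);
	out[0] = 0x80;                                 /* opcode */
	out[1] = (uint8_t)((3 << 6) | (2 << 3) | dst); /* mod=11, /2, r/m */
	out[2] = imm8;                                 /* ib */
	return 3;
}
```

Calling encode_adc_r8_imm8(DH, 0x90, buf) fills buf with 80 D6 90, which matches the encoding worked out by hand.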

I thought it would be useful to summarize this in one place.

Just for the record
Tue May 6 22:12:09 CEST 2008

There is something going on with pdp of gnucitizen called the "House of hackers". I won't comment on it. All I can see is that it looks like a social network for « hackers », conforming to these « Web 2.0 » principles. However, just look at the addresses of his latest posts, in chronological order:

I guess it will make me show up in the « trackbacks ». Whatever.

Creative Commons License