December 15, 2006
porting frustration
(9/10 on the boring scale)
So I just spent the last day or two (I honestly can't remember) getting Jesusonic
to run on OS X/x86. I already had it running on OS X/PPC, so I figured it would
be easy. Wrooong...
Jesusonic works by compiling code on the fly into native assembly. It actually
does this by gluing together stubs of functions (the most critical of which are
written in assembly for each platform) into code that it can execute. Overall
the code it generates is not terribly optimized, but it's definitely not slow
either.
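To make the "gluing" idea concrete, here is a minimal sketch in C. It is not Jesusonic's actual code: the `stub_t` type and `glue_stubs` function are hypothetical names, and it only shows the concatenation step, assuming each stub is already available as a run of machine-code bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stub record: pre-compiled machine code and its length. */
typedef struct {
    const unsigned char *code;
    size_t len;
} stub_t;

/* Copy each stub's bytes into `out`, one after another, in order.
   Returns the total number of bytes written, or 0 if `out` is too small. */
size_t glue_stubs(unsigned char *out, size_t cap, const stub_t *stubs, size_t n)
{
    size_t pos = 0;
    assert(out != NULL && stubs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (stubs[i].len > cap - pos)
            return 0;                      /* not enough room left */
        memcpy(out + pos, stubs[i].code, stubs[i].len);
        pos += stubs[i].len;
    }
    return pos;
}
```

To actually run the result, the output buffer would have to live in executable memory (for example a region obtained from `mmap` with `PROT_EXEC`), which this sketch leaves out.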
So this is what I found in porting JS.
1) the compiler bitched about my assembly code trashing EBX. Turns out, EBX is
used for position independent code (PIC) addressing. So I figured I should probably
not be messing with EBX, and that the OS needed it. With a bit of additional
work on the compile side I got it to not use EBX, but in the end it turned out
that EBX only really needs to be preserved within functions that use it, and
for my uses I really could use it. Oh well. Time wasted, but hey kinda useful.
Stuff still wasn't working right.
2) Many of the assembly stubs for particular functions needed to call C code,
whether it be a C library function like pow(), or some of our own code (FFTs,
file reading, accessing memory, etc). Since GCC was generating my functions
as PIC, the extended assembly syntax failed to assemble (ending up with assembly
like "movl ((symbolname-$LABELBLAHBLAH(%ebx))), %edi", etc.). So turns out MY
compile step needs to actually go generate absolute addresses at runtime,
instead of at compile time. Fair enough, that took an hour or so, and a bunch
of testing/fixing to make sure that nothing broke.
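One way to picture "generating absolute addresses at runtime" is a placeholder that gets overwritten once the real address is known. The sketch below is an assumption about how such a step could look, not the actual Jesusonic code: `patch_address` and the `0xFE` placeholder pattern are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical marker: a pointer-sized run of this byte stands in for an
   address that is only known at runtime. */
#define PLACEHOLDER_BYTE 0xFE

/* Find the first placeholder in `code` and overwrite it with `addr`.
   Returns the offset that was patched, or -1 if no placeholder was found. */
long patch_address(unsigned char *code, size_t len, uintptr_t addr)
{
    unsigned char marker[sizeof(uintptr_t)];
    assert(code != NULL);
    memset(marker, PLACEHOLDER_BYTE, sizeof marker);
    for (size_t i = 0; i + sizeof marker <= len; i++) {
        if (memcmp(code + i, marker, sizeof marker) == 0) {
            memcpy(code + i, &addr, sizeof addr);  /* native byte order */
            return (long)i;
        }
    }
    return -1;
}
```

In use, `addr` would be something like the address of pow() converted to `uintptr_t` (a conversion that works on the usual desktop platforms but is not guaranteed by the C standard), and the patch would run once per stub before the generated code is executed.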
3) And this was the big bitch, and it took me a long time to figure out. Turns
out, and this is well documented, you have to keep the stack aligned to 16
bytes. I would call pow(), and it would end up trying to do an SSE load/store
at an unaligned address, and things would proceed to blow up. So I had to go
update all of my stubs and functions to keep the stack nicely aligned, which
is probably not a bad idea anyway. Once I finally got them all correct, I
tried it out, and... it still didn't work. So I ended up spending a lot of
time with GDB (Xcode's debugger won't let me see registers, argh), and figured
out that, indeed, the stack was aligned when I called my generated code, but
no, the stack wasn't aligned when it got to pow().
After changing some build settings, I found that with -O0, it did in fact work.
So then I did some gcc -S -O0 file.c and gcc -S -O2 file.c and compared the
generated code for the assembly stubs, and it seems that with -O2, gcc itself
would let the stack get unaligned, as long as my stub wasn't obviously calling
another function.
I looked for a long time to see if I could disable this in gcc, and I gave up,
so on OS X/x86 Jesusonic will have this code for each stub:
pushl %ebp
movl %esp, %ebp
andl $-16, %esp
(run code)
leave
that way, whenever I call out, I can ensure that the stack is aligned, no
matter what kind of crap GCC is generating for the function.
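For anyone wondering what the andl is doing: -16 is 0xFFFFFFF0 as a 32-bit value, so the AND clears the low four bits of %esp and rounds it down to a multiple of 16, and leave later restores the original %esp from %ebp. Here is the same arithmetic as a small C sketch; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same effect as "andl $-16, %esp": clear the low four bits, which rounds
   the value down to the nearest multiple of 16. */
uintptr_t align_down_16(uintptr_t sp)
{
    uintptr_t aligned = sp & ~(uintptr_t)15;
    /* Rounding down never moves the stack pointer up; since the stack
       grows downward, this only ever reserves a few extra bytes. */
    assert(aligned <= sp);
    return aligned;
}

/* True when `sp` is a multiple of 16. */
int is_aligned_16(uintptr_t sp)
{
    return (sp & 15) == 0;
}
```

For example, a stack pointer of 0x1007 becomes 0x1000, while an already aligned 0x1010 is left alone.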
The better way of dealing with this would probably be to write these functions
in assembly directly, or improve the code that cuts up the stubs to have it
filter out the stack frame setup that GCC produces anyway, but hell I'm too
lazy and this works and it's reasonably fast enough as it is. And most
importantly, I get to get back to the fast, satisfying building of UI and
porting of easy things.