System calls from assembly language
Post by Deleted on Nov 12, 2013 7:04:14 GMT -5
To-day I want to say a little about OpenBSD system calls. A list of all their names may be found in /usr/include/sys/syscall.h, and if any one of those names interests you, you may type "man 2" followed by that name to discover more about it. That man page should point you to further include files which contain numerical definitions of all the values that may be used in the arguments. When calling one of these things the convention is that one pushes the arguments on to the stack, right-most argument first.
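To see where the numbers come from, here is a minimal sketch in c that merely reports what the header says. The helper name open_syscall_number is hypothetical, invented for this illustration, and the value it returns depends on the system: the disassembly further down shows 5 on the OpenBSD machine used here, while other systems use other numbers and some define no SYS_open at all.

```c
#include <sys/syscall.h>   /* defines the SYS_ names and their numbers */

/* Hypothetical helper: returns the number the header assigns to "open",
 * or -1 where the header defines no SYS_open at all. */
int open_syscall_number(void)
{
#ifdef SYS_open
    return SYS_open;
#else
    return -1;
#endif
}
```

Printing the result with printf("%d\n", open_syscall_number()) gives the number one would load into eax before the int 0x80.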
So to start with here is an example of how one might open a file in c (taken from lrmi_init):
fd = open(name, O_RDONLY);
if (fd == -1)
{
    perror("open");
    return 0;
}
It will be observed that, as the man page confirms, the return value is either a file descriptor or an error indication of minus one. And further, the man page tells us that in the error (-1) case one must go to "errno" to obtain an error code indicating the reason for failure.
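The C-library convention just described can be tried out with a short sketch such as the following. The function name try_open is hypothetical, and the choice to return the errno value rather than print it is made only so that the result is easy to inspect.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to open `path` read-only through the C library.
 * Returns 0 on success (closing the descriptor again), or the errno value
 * that the library reports when open() hands back -1. */
int try_open(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return errno;   /* the reason is available only through errno */
    close(fd);
    return 0;
}
```

Called with a path that does not exist, try_open should return ENOENT; the -1 itself says nothing about the cause.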
Well! Were one to attempt to transcribe that rigmarole directly into assembly language one would quickly come a cropper! The true system call returns quite different values.
This held me up for some time, until I wrote a little test in c and disassembled it, as follows:
. . .
int fd, tata;
fd = open( "/dev/mem", O_RDWR );
tata = fd;
printf( "Hello, fd is %d and tata is %d\n", fd, tata );
. . .
Get an assembly listing by way of something like this:
$ gcc -ggdb -S -static test.c -o test.s
Have a look at what c - silly c as I call it - has produced:
. . .
- subtract 36 from esp (making room in one lump for the locals and the outgoing arguments, which looks odd beside hand-written pushes)
- mov 2 to (esp+4) (this is O_RDWR)
- mov the address of the file name to (esp)
- call open
- mov eax to (ebp-8) (save the value of fd returned)
- mov (ebp-8) to eax (which, as members will we hope see, is entirely absurd, since the value is already in eax; no -O option was given, so gcc optimises nothing away)
- mov eax to (ebp-12) (setting tata)
- mov (ebp-12) to eax (the same sort of absurdity - for the tata in printf)
. . . and so on to the printf, and then
- add 36 to esp (giving back the space taken by the subtraction at the start)
. . .
Now, as I said, this does not reflect the values I get when I simply call "open" from an assembly programme. So let us look at the c business further. First we compile it properly, this time going through the link stage as well:
$ gcc -ggdb -static test.c -o test
And now inspect the resultant static executable:
$ objdump -d test | less
Well again! Instead of the simple "call open" the compiler or linker has now inserted this:
call 1c00044e <_thread_sys_open>
and this "_thread_sys_open" is a little subroutine elsewhere in the executable programme. What it does is:
- mov 5 to eax (the system call number for open)
- int 0x80
So far that corresponds to what I did in my assembly programme. But why was I getting different return values? Well, because after that int we see:
- jb 1c000448
- ret
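Now jb ("jump if below") is taken when the carry flag is set, so the system call must be setting the carry flag to report failure. This setting of the carry flag is not mentioned at all in the man page for the "open" system call, yet it is the fundamental method of indicating an error upon return!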
So what in fact does c do when the flag is set? At 1c000448 we find another little subroutine, called "___cerror" (ridiculous name) in the dump:
- it saves eax in the location 0x3c00c480
- then it moves -1 to eax
- and then it moves -1 to edx (presumably just to be certain should the caller expect a double-length return value)
- and then it returns to the caller of <_thread_sys_open>
- and then there follow eight unused bytes
So the caller, upon receiving the -1 in his return value, still does not know what the error was. To discover that, a fourth little subroutine, named "___errno", is called, which runs as follows:
- push ebp
- mov the contents of 0x3c00c480 back to eax (which we saw saved a few lines back - sillier and sillier)
- mov esp to ebp
- pop ebp (yes, that is what it does - nothing could be sillier!)
- ret (with the error number in eax)
- and then there follow six unused bytes
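The whole round trip through these little subroutines can be modelled in a few lines of c. This is only a sketch of the behaviour described in the dump above, not the library's source: the names wrapper_return, wrapper_errno and saved_error are hypothetical, and the carry flag and eax are passed in as ordinary arguments because c cannot see them directly.

```c
/* Stands in for the fixed location 0x3c00c480 seen in the dump. */
static int saved_error;

/* Model of the code following "int 0x80": `carry` is the carry flag and
 * `eax` the register, both as the kernel left them. */
long wrapper_return(int carry, long eax)
{
    if (!carry)
        return eax;          /* success: plain "ret", eax untouched */
    saved_error = (int)eax;  /* error: the ___cerror path saves eax ... */
    return -1;               /* ... and hands back -1 in its place */
}

/* Model of the ___errno step described above: fetch the saved value. */
int wrapper_errno(void)
{
    return saved_error;
}
```

So wrapper_return(0, 3) simply gives 3, whereas wrapper_return(1, 2) gives -1 and leaves the 2 to be collected afterwards by wrapper_errno().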
Anyway, after inspecting all this I was able to write a much saner assembly language "open" call, and simply test the carry flag for an error. eax upon return will contain either the new file descriptor or the error number, and that is that:
- push O_RDWR
- push the address of the file name
- push eax (just a dummy required by all OpenBSD system calls: it occupies the slot where a call instruction would have left the return address)
- mov 5 to eax (the number of the "open" system function)
- int 0x80 (perform the open)
- jnb good --> open succeeded
- the reason for the error is already in eax, so we can at once handle it (and of course adjust the stack if we wish to continue)
good: add 12 to esp
- save eax as the file descriptor
and continue happily.
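The three pushes in that listing, and the 12 that is later added back to esp, can be checked with a small model of the stack in c. It is a sketch under stated assumptions: stack_model, push32, peek32 and build_open_frame are hypothetical names, addresses are treated as plain 32-bit numbers, and the dummy is written as 0 although the real listing pushes whatever eax happens to hold.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16

/* A toy 32-bit stack that grows downwards, as the real one does. */
struct stack_model {
    uint32_t words[STACK_WORDS];
    size_t   sp;                 /* index of the word esp points at */
};

void stack_init(struct stack_model *s) { s->sp = STACK_WORDS; }

void push32(struct stack_model *s, uint32_t v) { s->words[--s->sp] = v; }

/* Read the word at (esp + byte_offset). */
uint32_t peek32(const struct stack_model *s, size_t byte_offset)
{
    return s->words[s->sp + byte_offset / 4];
}

/* Push the arguments for "open" right-most first, then the dummy.
 * Returns the number of bytes to add to esp afterwards. */
size_t build_open_frame(struct stack_model *s, uint32_t path_addr, uint32_t flags)
{
    size_t before = s->sp;

    push32(s, flags);       /* right-most argument first */
    push32(s, path_addr);   /* then the address of the file name */
    push32(s, 0);           /* dummy in the return-address slot */
    return (before - s->sp) * 4;
}
```

After build_open_frame the dummy sits at (esp), the file name address at (esp+4) and the flags at (esp+8), and the function returns 12, the same figure that is added back to esp at "good".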