Reading one character at a time
Sept 3, 2013 8:37:55 GMT -5
Post by Deleted on Sept 3, 2013 8:37:55 GMT -5
One of the many sillinesses of the "C" language is that it offers no standard way to read one character at a time as the user types at the keyboard - at least not under Unix. And yet that is what many users want to do! Some Windows compilers have the "getch" function, supported in a half-hearted way. Under OpenBSD there are several somewhat gimcrack ways to circumvent this lamentable omission.
1) Here is one way that I discovered through a web search; the essence of it is as follows:
#include <termios.h>
#include <unistd.h>

char ch;
struct termios ttyInfo;
tcgetattr( 0, &ttyInfo );
ttyInfo.c_lflag &= ~ECHO;    /* ECHO is 0x00000008 */
ttyInfo.c_lflag &= ~ICANON;  /* ICANON is 0x00000100 */
tcsetattr( 0, TCSANOW, &ttyInfo );
/* Get the character */
read( 0, &ch, 1 );
The structure termios and the two functions tcgetattr and tcsetattr are part of a system set up to control a terminal interface; foreign to "C" itself but tacked on. So, although I write in Assembly language, I decided to give that way a try. Yes, it did work! But little did I know what was really going on!
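For reference, a minimal sketch of that sequence wrapped up as a function. The name read_one_char is hypothetical, and the handling of VMIN/VTIME, the fallback for a descriptor that is not a terminal, and the restoring of the old settings are additions that the fragment above does not have:

```c
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: read one byte from fd without waiting for Enter.
 * Returns the byte (0..255), or -1 on end of file or error. */
int read_one_char(int fd)
{
    struct termios saved, raw;
    unsigned char ch;
    int is_tty = isatty(fd);
    int result;

    if (is_tty) {
        if (tcgetattr(fd, &saved) == -1)
            return -1;
        raw = saved;
        raw.c_lflag &= ~(ECHO | ICANON);  /* no echo, no line buffering */
        raw.c_cc[VMIN] = 1;               /* read() returns after 1 byte */
        raw.c_cc[VTIME] = 0;              /* no inter-byte timeout */
        if (tcsetattr(fd, TCSANOW, &raw) == -1)
            return -1;
    }

    /* On a pipe or file the terminal calls are skipped entirely. */
    result = (read(fd, &ch, 1) == 1) ? (int)ch : -1;

    if (is_tty)
        tcsetattr(fd, TCSANOW, &saved);   /* put the terminal back */

    return result;
}
```

Calling read_one_char(0) from main would then return each key as it is pressed. Restoring the saved settings matters because otherwise the shell is left without echo after the programme exits.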
2) The next step, then, was to make that sequence a function, and try calling that "C" function from an Assembly programme - something that is often done on systems other than OpenBSD. Here is - or rather was - my linker line (one of a number of varied trials):
ld -m elf_i386_obsd -o slt -static slt.o tcsub.o -lc
Well! More than a page of gibberish appears, beginning with:
tcsub.o(.text+0x7): In function 'mygetch':
: undefined reference to '__guard_local'
In English this means that something in my programme refers to a symbol with the name "__guard_local", and the linker cannot find it.
3) But the thing is, I did NOT call a function with the name "guard_local." What could it be? Again I turned to the much-vaunted "man pages": not a hint, not a mention of "guard_local". I searched through all the "include" directories: again this "guard_local" is utterly absent. Several days later I discovered something really obscure: it is part of a so-called "buffer overflow protection" - it also involves "terminator canaries" and "ProPolice":
en.wikipedia.org/wiki/Buffer_overflow_protection
It is something devised at IBM Japan, and included in the "C" compiler provided with OpenBSD, where it is switched on by default in order to detect stack buffer overflows.
I am accustomed to, and entitled to, the expectation that the code I write is the code that will get executed. But here is extra code, unknown to me and nowhere mentioned in the places I looked, silently inserted in every one's "C" programmes! From where I sit it behaves rather like a virus, does it not? The secret comes to light only when an attempt is made to link a non-gcc programme with it.
Now, there is a gcc option (-fno-stack-protector) to turn off this feature in the "C" programmes one compiles. But of course "tcsetattr" - which I was attempting to call - is not one of mine: it comes with OpenBSD, and the "ProPolice" checks have already been compiled into it.
4) My idea of calling a "C" function to use the precompiled termios functions was therefore pretty much impossible. So the next step was to download and inspect the source of tcsetattr. In fact it is very simple, consisting of a single system call: ioctl. It is easy to put system calls into an Assembly language programme; in that way we can forget "C" altogether.
5) The next step, then, is to find out what arguments to ioctl are used in the source of tcsetattr. Here they are:
return (ioctl(fd, TIOCSETA, t));
"fd" is sysin (file descriptor 0), thus the terminal; "t" is a pointer to the same "termios" structure - but what is that TIOCSETA? Another hunt through the "include" directories reveals this definition, in ttycom.h:
#define TIOCSETA _IOW('t', 20, struct termios)
Oh yes? Wheels within wheels. That _IOW looks like a macro, but where is it defined and what does it do? See section 7 below.
Similarly, TIOCGETA is defined as _IOR('t',19, struct termios).
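Looking ahead to the values found in steps 6 and 7, here is a rough sketch of what such a macro has to do: pack a direction, a size, a group letter and a function number into one 32-bit word. Everything named sketch_ or SKETCH_ is hypothetical, the bit layout is the BSD-style one assumed from those values rather than quoted from the OpenBSD headers, and the size 44 (0x2c) is the size of struct termios implied by step 7:

```c
#include <stdint.h>

/* Assumed BSD-style layout, not copied from the OpenBSD headers:
 * direction in the top bits, then a 13-bit size, group char, number. */
#define SKETCH_IOC_OUT   0x40000000u  /* kernel copies data out: a "read" */
#define SKETCH_IOC_IN    0x80000000u  /* kernel copies data in: a "write" */
#define SKETCH_SIZE_MASK 0x1fffu      /* assumption: 13 bits of size */

/* Hypothetical stand-in for what _IOR and _IOW expand to. */
uint32_t sketch_ioc(uint32_t dir, char group, unsigned num, unsigned size)
{
    return dir
         | ((uint32_t)(size & SKETCH_SIZE_MASK) << 16)
         | ((uint32_t)(unsigned char)group << 8)
         | (num & 0xffu);
}
```

With these assumptions, sketch_ioc(SKETCH_IOC_OUT, 't', 19, 44) and sketch_ioc(SKETCH_IOC_IN, 't', 20, 44) reproduce the two constants that appear in the compiler output below.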
6) So we can now put the lines:
iret = ioctl(0, TIOCGETA, &ttyinfo);
iret = ioctl(0, TIOCSETA, &ttyinfo);
into a dummy "C" programme, and compile it with the "-S" option (which stops before assembling and linking). The result is a file in the old-fashioned Assembly format used by gcc. And therein we find for TIOCGETA and TIOCSETA the equivalents $1076655123 and $-2144570348 (signed 32-bit decimal).
7) Entering those into my trusty Sharp Scientific calculator EL-506P (from 1985), and treating the negative one as a 32-bit two's-complement value, we get the hexadecimal equivalents 0x402c7413 and 0x802c7414.
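For those without the calculator to hand, the same conversion can be sketched in a few lines of "C". The function names are hypothetical; the only trick is reinterpreting the signed 32-bit value as unsigned before printing it in hexadecimal:

```c
#include <stdint.h>
#include <stdio.h>

/* Reinterpret a signed 32-bit value as its unsigned bit pattern. */
uint32_t as_unsigned32(int32_t value)
{
    return (uint32_t)value;
}

/* Print a constant from the gcc -S listing in decimal and in hex. */
void print_ioctl_constant(const char *name, int32_t value)
{
    printf("%-8s %11ld = 0x%08lx\n", name, (long)value,
           (unsigned long)as_unsigned32(value));
}
```

One would call it as print_ioctl_constant("TIOCGETA", 1076655123) and print_ioctl_constant("TIOCSETA", -2144570348).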
It was difficult to find a clear explanation of how these numbers are coded; the best is at
www.mjmwired.net/kernel/Documentation/ioctl/ioctl-decoding.txt
Quote:
===================
bits 31-30:
00 - no parameters: uses _IO macro
10 - read: _IOR (wrong I believe)
01 - write: _IOW (also wrong I believe)
11 - read/write: _IOWR
bits 29-16 size of arguments (2c)
bits 15-8 ascii character supposedly unique to each driver ( t = 74)
bits 7-0 function number (13 hex = read; 14 hex = write)
====================
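And indeed it now almost makes sense at last: 4-02c-74-13 and 8-02c-74-14. The one discrepancy is the read and write bits. The quoted table comes from the Linux kernel documentation; on OpenBSD, as the two values above show, the order is the other way round: 01 marks a read (_IOR) and 10 a write (_IOW).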
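The splitting-up can be sketched in "C" as follows. The struct and function names are hypothetical, and the 13-bit size field is an assumption taken from the BSD convention (the quoted table gives bits 29-16; for these two values either reading yields 2c):

```c
#include <stdint.h>

/* Hypothetical container for the fields of a BSD-style ioctl number. */
struct ioctl_fields {
    int copies_in;    /* bit 31: kernel reads the argument (_IOW)  */
    int copies_out;   /* bit 30: kernel fills the argument (_IOR)  */
    unsigned size;    /* size of the argument in bytes             */
    char group;       /* ASCII letter naming the driver            */
    unsigned number;  /* function number within that group         */
};

struct ioctl_fields decode_ioctl(uint32_t cmd)
{
    struct ioctl_fields f;

    f.copies_in  = (int)((cmd >> 31) & 1u);
    f.copies_out = (int)((cmd >> 30) & 1u);
    f.size       = (cmd >> 16) & 0x1fffu;  /* assumed 13-bit field */
    f.group      = (char)((cmd >> 8) & 0xffu);
    f.number     = cmd & 0xffu;
    return f;
}
```

Applied to 0x402c7413 this gives a 44-byte argument, group 't' and function 19 with the copy-out bit set, which matches _IOR('t', 19, struct termios).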
8) So here after all that is my code (in summary form) to enable the keyboard to supply characters one by one - any one is of course free to use it!
; Read the current input settings
push dword c_iflag ; ADDRESS of termios structure
push dword 0x402c7413 ; TIOCGETA
push 0 ; sysin
mov eax,54 ; ioctl system call
sub esp,4 ; Dummy for OpenBSD system call peculiarity
int 080h
add esp,4+12 ; (The dummy plus 3 arguments) * 4
; Make the required changes
and dword [c_lflag],~ECHO
and dword [c_lflag],~ICANON
; Write out the altered input settings
push dword c_iflag ; ADDRESS of termios structure
push dword 0x802c7414 ; TIOCSETA
push 0 ; sysin
mov eax,54 ; ioctl system call
sub esp,4 ; Dummy for OpenBSD system call peculiarity
int 080h
add esp,4+12 ; (The dummy plus 3 arguments) * 4
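As an aside, the two "and" instructions clear one bit each, so they could be merged into a single mask. A small "C" sketch of the arithmetic, using the ECHO and ICANON values quoted in step 1 (the OBSD_ names and the function name are hypothetical):

```c
#include <stdint.h>

/* Flag values as quoted in step 1 of this post. */
#define OBSD_ECHO   0x00000008u
#define OBSD_ICANON 0x00000100u

/* Clear both flags in one operation; all other bits are untouched. */
uint32_t clear_echo_icanon(uint32_t lflag)
{
    return lflag & ~(OBSD_ECHO | OBSD_ICANON);
}
```

The combined mask is 0xfffffef7, so in the Assembly version a single "and dword [c_lflag],~(ECHO|ICANON)" would do the same job.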
9) And now one is able to read a character at a time:
push dword 1 ; Byte count
push dword inbyte ; Address of a memory location to receive the character
push dword 0 ; sysin
mov eax,3 ; read system call
sub esp,4 ; Dummy
int 080h
add esp,16