mixing O_NONBLOCK and raw tty causes portability problems

Niels Möller nisse@lysator.liu.se
15 Oct 2003 16:01:19 +0200


Matthias Drochner <M.Drochner@fz-juelich.de> writes:

> I thought I'd give lsh a try, just to see how it compares to openssh...
> The client didn't work well on NetBSD, got a message like "unexpected
> EWOULDBLOCK" on each keystroke.

I rely on the following behaviour of VMIN > 0 and VTIME > 0, as
described by the glibc manual:

   * Both TIME and MIN are nonzero.

     In this case, TIME specifies how long to wait after each input
     character to see if more input arrives. After the first character
     received, `read' keeps waiting until either MIN bytes have
     arrived in all, or TIME elapses with no further input.

     `read' always blocks until the first character arrives, even if
     TIME elapses first. `read' can return more than MIN characters if
     more than MIN happen to be in the queue.

and on the general principle that if I call read after poll/select
has reported that an fd is readable, then read on that fd shouldn't
return EWOULDBLOCK.
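
For concreteness, here is a minimal sketch of the terminal setup this
behaviour depends on. The helper name set_min_time is made up for
illustration; it is neither lsh code nor part of the test program. It
only fills in the termios fields and leaves tcgetattr/tcsetattr to the
caller.

```c
#include <termios.h>

/* Hypothetical helper, not lsh code: request non-canonical input where
 * read() waits for the first byte, then returns once `min` bytes have
 * arrived or `tenths` tenths of a second pass with no further input. */
void
set_min_time(struct termios *ios, cc_t min, cc_t tenths)
{
  ios->c_lflag &= ~(ICANON | ECHO);  /* MIN/TIME only apply without ICANON */
  ios->c_cc[VMIN] = min;
  ios->c_cc[VTIME] = tenths;
}
```

A caller would typically do tcgetattr(fd, &ios), then
set_min_time(&ios, 4, 10), then tcsetattr(fd, TCSANOW, &ios).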

> -on Linux, if a key is pressed, the read returns immediately with
>  that one character
> -on NetBSD, the read returns with no data but EWOULDBLOCK
> -on D'UNIX, the poll() doesn't return before 4 keypresses are done;
>  the read() returns these 4 characters

Interesting, all three seem broken to me, in one way or another. I
get the same result as you on my Linux 2.4.18 laptop.

On NetBSD, if you type quickly, will read return several characters at
a time?

I also tried your test program on SunOS 5.9, where I had to add the
definition

#define cfmakeraw(ios) do {						   \
  (ios)->c_iflag &= ~(IGNBRK|BRKINT|PARMRK|ISTRIP|INLCR|IGNCR|ICRNL|IXON); \
  (ios)->c_oflag &= ~OPOST;						   \
  (ios)->c_lflag &= ~(ECHO|ECHONL|ICANON|ISIG|IEXTEN);			   \
  (ios)->c_cflag &= ~(CSIZE|PARENB); (ios)->c_cflag |= CS8;		   \
} while(0)

to make it compile. And then I increased VTIME to 10 to get a more
noticeable effect.

Then the test program works just the way it should: if I type fewer
than 4 characters, I have to wait a while until read returns them all
at the same time. If I type four characters quickly, they are returned
as soon as I type the fourth one.

> Indeed, in SUSv2's termios page is a sentence which says that if
> both O_NONBLOCK and VTIME>0 are set, the behaviour is more or less
> undefined.

Hmm, can you point me to the right place to read that?

> (Don't know what liboop uses under the hood, but in case it does
> poll(), anything with VMIN>1 wouldn't work with D'Unix...)

Does it matter if one uses poll or select? If so, that may be a reason
to prefer one over the other. (In general, I think the poll interface
is nice and clean, but in practice the details differ so much between
different systems that it is painful to use in portable code. select
behaviour is much more uniform).

Then, possible workarounds. What happens if one just ignores the
EWOULDBLOCK error? As long as the code doesn't degrade into a busy
loop around poll/select and read, ignoring EWOULDBLOCK is harmless.
This seems like the simplest solution. Untested patch below.

I don't quite like setting VMIN to 1, because that's a degradation for
systems like Solaris where these things actually work. I guess it's
possible to write a configure test that creates a pty pair and then
tests behaviour, but that doesn't sound particularly fun.

Regards,
/Niels

diff -u -a -p -r1.203 io.c
--- src/io.c	25 Sep 2003 14:46:02 -0000	1.203
+++ src/io.c	15 Oct 2003 14:00:25 -0000
@@ -570,7 +570,7 @@ do_consuming_read(struct io_callback *c,
 	    case EINTR:
 	      break;
 	    case EWOULDBLOCK:
-	      werror("io.c: read_consume: Unexpected EWOULDBLOCK\n");
+	      verbose("io.c: read_consume: Unexpected EWOULDBLOCK\n");
 	      break;
 	    case EPIPE:
 	      /* FIXME: I don't understand why reading should return