Daniel Bump bump@match.Stanford.EDU writes:
David wrote:
I have an email from David referring to man IPC::Open2 on the dangers of open2. But I can't say I really understand why this deadlock occurs or how it can be avoided.
The problem is that when you do fprintf(), there is no guarantee that the data has actually been written out of the process buffers. So process A thinks it has sent the command and is blocked waiting for the reply. But the message is still in A's stdio buffers, and so process is still blocked waiting to read it. => deadlock.
But in this case the printf is done by the perl twogtp command:
print $black_in "boardsize $size\n";
Here $black_in is the file descriptor created with:
$pidb = open2($black_out, $black_in, $black);
and man IPC::Open2 says:
The write filehandle will have autoflush turned on.
Doesn't this mean that the data should have been written out of the buffer?
(BTW forget about the bit I wrote previously about stdin being set up before files are changed : I had misremembered how the code worked... )
Let's see... I ran strace -f on that command line, and separated the output from the three processes.
pid 22039 = perl, starting after the fork. pid 22040 = metamachine pid 22041 = the gnuogo spawned by perl pid 22042 = the gnugo spawned by metamachine
(The last few lines of trace are the most useful)
We can clearly see that the boardsize command left process 22039 on fid 4, but no reply arrived on fid 5.
The boardsize command arrived in pid 22040 on fid 0. It looks like the reply "= \n\n" arrived, but was not send back to the perl process.
The boardsize command arrived in pid 22042, and the "= \n\n" reply was sent.
So it looks like a buffering problem in metamachine, where it is not flushing the reply from gnugo back to its parent.
dd
perl ----
[pid 22039] close(3) = 0 [pid 22039] munmap(0x40194000, 4096) = 0 [pid 22039] close(6) = 0 [pid 22039] munmap(0x4019e000, 4096) = 0 [pid 22039] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0 [pid 22039] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0 [pid 22039] pipe([3, 6]) = 0 [pid 22039] fcntl(3, F_GETFL) = 0 (flags O_RDONLY) [pid 22039] fstat(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22039] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40194000 [pid 22039] _llseek(3, 0, 0xbffff880, SEEK_CUR) = -1 ESPIPE (Illegal seek) [pid 22039] fcntl(6, F_GETFL) = 0x1 (flags O_WRONLY) [pid 22039] fstat(6, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22039] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4019e000 [pid 22039] _llseek(6, 0, 0xbffff878, SEEK_CUR) = -1 ESPIPE (Illegal seek) [pid 22039] fcntl(3, F_SETFD, FD_CLOEXEC) = 0 [pid 22039] fcntl(6, F_SETFD, FD_CLOEXEC) = 0 [pid 22039] pipe([7, 8]) = 0 [pid 22039] fcntl(7, F_GETFL) = 0 (flags O_RDONLY) [pid 22039] fstat(7, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22039] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4019f000 [pid 22039] _llseek(7, 0, 0xbffff880, SEEK_CUR) = -1 ESPIPE (Illegal seek) [pid 22039] fcntl(8, F_GETFL) = 0x1 (flags O_WRONLY) [pid 22039] fstat(8, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22039] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x401a0000 [pid 22039] _llseek(8, 0, 0xbffff878, SEEK_CUR) = -1 ESPIPE (Illegal seek) [pid 22039] fcntl(7, F_SETFD, FD_CLOEXEC) = 0 [pid 22039] fcntl(8, F_SETFD, FD_CLOEXEC) = 0 [pid 22039] fork() = 22041 [pid 22039] close(3) = 0 [pid 22039] munmap(0x40194000, 4096) = 0 [pid 22039] close(8) = 0 [pid 22039] munmap(0x401a0000, 4096) = 0 [pid 22039] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0 [pid 22039] rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0 [pid 22039] write(4, "boardsize 19\n", 13) = 13 [pid 22039] read(5, <unfinished ...>
metamachine -----------
[pid 22040] getpid() = 22040 [pid 22040] close(4) = 0 [pid 22040] munmap(0x40195000, 4096) = 0 [pid 22040] fcntl(3, F_GETFL) = 0 (flags O_RDONLY) [pid 22040] fstat(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22040] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40195000 [pid 22040] _llseek(3, 0, 0xbffff6e0, SEEK_CUR) = -1 ESPIPE (Illegal seek) [pid 22040] fstat(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22040] dup2(3, 0) = 0 [pid 22040] fcntl(0, F_SETFD, 0) = 0 [pid 22040] close(5) = 0 [pid 22040] munmap(0x40196000, 4096) = 0 [pid 22040] fcntl(6, F_GETFL) = 0x1 (flags O_WRONLY) [pid 22040] fstat(6, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22040] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40196000 [pid 22040] _llseek(6, 0, 0xbffff6e0, SEEK_CUR) = -1 ESPIPE (Illegal seek) [pid 22040] fstat(6, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22040] dup2(6, 1) = 1 [pid 22040] fcntl(1, F_SETFD, 0) = 0 [pid 22040] execve("./metamachine", ["./metamachine"], [/* 39 vars */]) = 0 [pid 22040] brk(0) = 0x8049cfc [pid 22040] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40014000 [pid 22040] open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory) [pid 22040] open("/etc/ld.so.cache", O_RDONLY) = 3 [pid 22040] fstat(3, {st_mode=S_IFREG|0644, st_size=16265, ...}) = 0 [pid 22040] mmap(0, 16265, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40015000 [pid 22040] close(3) = 0 [pid 22040] open("/lib/libc.so.6", O_RDONLY) = 3 [pid 22040] fstat(3, {st_mode=S_IFREG|0755, st_size=4101836, ...}) = 0 [pid 22040] read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\210\212"..., 4096) = 4096 [pid 22040] mmap(0, 1001532, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40019000 [pid 22040] mprotect(0x40106000, 30780, PROT_NONE) = 0 [pid 22040] mmap(0x40106000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0xec000) = 0x40106000 [pid 22040] mmap(0x4010a000, 14396, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4010a000 [pid 22040] close(3) = 0 [pid 22040] mprotect(0x40019000, 970752, PROT_READ|PROT_WRITE) = 0 [pid 22040] mprotect(0x40019000, 970752, PROT_READ|PROT_EXEC) = 0 [pid 22040] munmap(0x40015000, 16265) = 0 [pid 22040] personality(0 /* PER_??? */) = 0 [pid 22040] getpid() = 22040 [pid 22040] pipe([3, 4]) = 0 [pid 22040] pipe([5, 6]) = 0 [pid 22040] fork() = 22042 [pid 22040] fcntl(4, F_GETFL) = 0x1 (flags O_WRONLY) [pid 22040] brk(0) = 0x8049cfc [pid 22040] brk(0x8049dc4) = 0x8049dc4 [pid 22040] brk(0x804a000) = 0x804a000 [pid 22040] SYS_197(0x4, 0xbffff864, 0x40109d40, 0x8049d08, 0x4) = -1 ENOSYS (Function not implemented) [pid 22040] fstat(4, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22040] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40015000 [pid 22040] _llseek(4, 0, 0xbffff8a4, SEEK_CUR) = -1 ESPIPE (Illegal seek) [pid 22040] fcntl(5, F_GETFL) = 0 (flags O_RDONLY) [pid 22040] fstat(5, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22040] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40016000 [pid 22040] _llseek(5, 0, 0xbffff8a4, SEEK_CUR) = -1 ESPIPE (Illegal seek) [pid 22040] fstat(0, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22040] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40017000 [pid 22040] read(0, <unfinished ...> [pid 22040] <... read resumed> "boardsize 19\n", 4096) = 13 [pid 22040] write(4, "boardsize 19\n", 13) = 13 [pid 22040] read(5, <unfinished ...> [pid 22040] <... read resumed> "= \n\n", 4096) = 4 [pid 22040] fstat(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22040] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40018000 [pid 22040] read(0, <unfinished ...>
gnugo spawned by perl (not interesting) ---------------------
[pid 22041] getpid() = 22041 [pid 22041] close(6) = 0 [pid 22041] munmap(0x4019e000, 4096) = 0 [pid 22041] fcntl(3, F_GETFL) = 0 (flags O_RDONLY) [pid 22041] fstat(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22041] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4019e000 [pid 22041] _llseek(3, 0, 0xbffff6e0, SEEK_CUR) = -1 ESPIPE (Illegal seek) [pid 22041] fstat(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22041] dup2(3, 0) = 0 [pid 22041] fcntl(0, F_SETFD, 0) = 0 [pid 22041] close(7) = 0 [pid 22041] munmap(0x4019f000, 4096) = 0 [pid 22041] fcntl(8, F_GETFL) = 0x1 (flags O_WRONLY) [pid 22041] fstat(8, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22041] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4019f000 [pid 22041] _llseek(8, 0, 0xbffff6e0, SEEK_CUR) = -1 ESPIPE (Illegal seek) [pid 22041] fstat(8, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22041] dup2(8, 1) = 1 [pid 22041] fcntl(1, F_SETFD, 0) = 0 [pid 22041] brk(0x81ac000) = 0x81ac000 [pid 22041] execve("./gnugo", ["./gnugo", "--mode", "gtp", "--quiet"], [/* 39 vars */]) = 0 [pid 22041] brk(0) = 0x863c68c [pid 22041] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40014000 [pid 22041] open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory) [pid 22041] open("/etc/ld.so.cache", O_RDONLY) = 3 [pid 22041] fstat(3, {st_mode=S_IFREG|0644, st_size=16265, ...}) = 0 [pid 22041] mmap(0, 16265, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40015000 [pid 22041] close(3) = 0 [pid 22041] open("/usr/lib/libncurses.so.4", O_RDONLY) = 3 [pid 22041] fstat(3, {st_mode=S_IFREG|0755, st_size=274985, ...}) = 0 [pid 22041] read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340\335"..., 4096) = 4096 [pid 22041] mmap(0, 254540, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40019000 [pid 22041] mprotect(0x4004b000, 49740, PROT_NONE) = 0 [pid 22041] mmap(0x4004b000, 36864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x31000) = 0x4004b000 [pid 22041] mmap(0x40054000, 12876, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40054000 [pid 22041] close(3) = 0 [pid 22041] open("/lib/libm.so.6", O_RDONLY) = 3 [pid 22041] fstat(3, {st_mode=S_IFREG|0755, st_size=527442, ...}) = 0 [pid 22041] read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\320F\0"..., 4096) = 4096 [pid 22041] mmap(0, 117208, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40058000 [pid 22041] mprotect(0x40074000, 2520, PROT_NONE) = 0 [pid 22041] mmap(0x40074000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x1b000) = 0x40074000 [pid 22041] close(3) = 0 [pid 22041] open("/lib/libc.so.6", O_RDONLY) = 3 [pid 22041] fstat(3, {st_mode=S_IFREG|0755, st_size=4101836, ...}) = 0 [pid 22041] read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\210\212"..., 4096) = 4096 [pid 22041] mmap(0, 1001532, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40075000 [pid 22041] mprotect(0x40162000, 30780, PROT_NONE) = 0 [pid 22041] mmap(0x40162000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0xec000) = 0x40162000 [pid 22041] mmap(0x40166000, 14396, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40166000 [pid 22041] close(3) = 0 [pid 22041] mprotect(0x40075000, 970752, PROT_READ|PROT_WRITE) = 0 [pid 22041] mprotect(0x40075000, 970752, PROT_READ|PROT_EXEC) = 0 [pid 22041] munmap(0x40015000, 16265) = 0 [pid 22041] personality(0 /* PER_??? */) = 0 [pid 22041] getpid() = 22041 [pid 22041] rt_sigaction(SIGTERM, {0x804a614, [], SA_RESTART|0x4000000}, {SIG_DFL}, 8) = 0 [pid 22041] time(NULL) = 999018803 [pid 22041] brk(0) = 0x863c68c [pid 22041] brk(0x863c6c4) = 0x863c6c4 [pid 22041] brk(0x863d000) = 0x863d000 [pid 22041] mmap(0, 651264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4016a000 [pid 22041] mmap(0, 14307328, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40209000 [pid 22041] mmap(0, 1822720, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40fae000 [pid 22041] SYS_197(0, 0xbfffee04, 0x40165d40, 0x401638a0, 0) = -1 ENOSYS (Function not implemented) [pid 22041] fstat(0, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22041] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40015000 [pid 22041] read(0, <unfinished ...>
gnugo spawned by metamachine ----------------------------
[pid 22042] close(0) = 0 [pid 22042] dup(3) = 0 [pid 22042] close(1) = 0 [pid 22042] dup(6) = 1 [pid 22042] execve("/usr/local/bin/gnugo", ["gnugo", "--mode", "gtp", "--quiet"], [/* 39 vars */]) = 0 [pid 22042] brk(0) = 0x863c68c [pid 22042] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40014000 [pid 22042] open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory) [pid 22042] open("/etc/ld.so.cache", O_RDONLY) = 7 [pid 22042] fstat(7, {st_mode=S_IFREG|0644, st_size=16265, ...}) = 0 [pid 22042] mmap(0, 16265, PROT_READ, MAP_PRIVATE, 7, 0) = 0x40015000 [pid 22042] close(7) = 0 [pid 22042] open("/usr/lib/libncurses.so.4", O_RDONLY) = 7 [pid 22042] fstat(7, {st_mode=S_IFREG|0755, st_size=274985, ...}) = 0 [pid 22042] read(7, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340\335"..., 4096) = 4096 [pid 22042] mmap(0, 254540, PROT_READ|PROT_EXEC, MAP_PRIVATE, 7, 0) = 0x40019000 [pid 22042] mprotect(0x4004b000, 49740, PROT_NONE) = 0 [pid 22042] mmap(0x4004b000, 36864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0x31000) = 0x4004b000 [pid 22042] mmap(0x40054000, 12876, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40054000 [pid 22042] close(7) = 0 [pid 22042] open("/lib/libm.so.6", O_RDONLY) = 7 [pid 22042] fstat(7, {st_mode=S_IFREG|0755, st_size=527442, ...}) = 0 [pid 22042] read(7, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\320F\0"..., 4096) = 4096 [pid 22042] mmap(0, 117208, PROT_READ|PROT_EXEC, MAP_PRIVATE, 7, 0) = 0x40058000 [pid 22042] mprotect(0x40074000, 2520, PROT_NONE) = 0 [pid 22042] mmap(0x40074000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0x1b000) = 0x40074000 [pid 22042] close(7) = 0 [pid 22042] open("/lib/libc.so.6", O_RDONLY) = 7 [pid 22042] fstat(7, {st_mode=S_IFREG|0755, st_size=4101836, ...}) = 0 [pid 22042] read(7, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\210\212"..., 4096) = 4096 [pid 22042] mmap(0, 1001532, PROT_READ|PROT_EXEC, MAP_PRIVATE, 7, 0) = 0x40075000 [pid 22042] mprotect(0x40162000, 30780, PROT_NONE) = 0 [pid 22042] mmap(0x40162000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0xec000) = 0x40162000 [pid 22042] mmap(0x40166000, 14396, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40166000 [pid 22042] close(7) = 0 [pid 22042] mprotect(0x40075000, 970752, PROT_READ|PROT_WRITE) = 0 [pid 22042] mprotect(0x40075000, 970752, PROT_READ|PROT_EXEC) = 0 [pid 22042] munmap(0x40015000, 16265) = 0 [pid 22042] personality(0 /* PER_??? */) = 0 [pid 22042] getpid() = 22042 [pid 22042] rt_sigaction(SIGTERM, {0x804a614, [], SA_RESTART|0x4000000}, {SIG_DFL}, 8) = 0 [pid 22042] time(NULL) = 999018803 [pid 22042] brk(0) = 0x863c68c [pid 22042] brk(0x863c6c4) = 0x863c6c4 [pid 22042] brk(0x863d000) = 0x863d000 [pid 22042] mmap(0, 651264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4016a000 [pid 22042] mmap(0, 14307328, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40209000 [pid 22042] mmap(0, 1822720, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40fae000 [pid 22042] SYS_197(0, 0xbfffee04, 0x40165d40, 0x401638a0, 0) = -1 ENOSYS (Function not implemented) [pid 22042] fstat(0, {st_mode=S_IFIFO|0600, st_size=13, ...}) = 0 [pid 22042] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40015000 [pid 22042] read(0, "boardsize 19\n", 4096) = 13 [pid 22042] fstat(1, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0 [pid 22042] mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40016000 [pid 22042] write(1, "= \n\n", 4) = 4 [pid 22042] read(0, <unfinished ...>
dd