A number of us who are running Caudium on Pike 7.6.50 have been experiencing what I'll describe as "lockups" on a pretty frequent basis. Normally, the watchdog kills the process and restarts it, but I wanted to figure out what was causing the problem. I managed to catch the hang on my server a few minutes ago, and this is what I get:
Describing all threads:

Thread 1:
  Thread.Thread(3351712).Queue: Thread.Thread(3351712).Queue(0)->wait(_static_modules.Builtin()->mutex_key())
  /usr/local/pike/7.6.50/lib/modules/Thread.pmod:278: Thread.Thread(3351712).Queue(0)->read()
  base_server/caudium.pike:585: /opt/caudium/server/base_server/caudium()->handler_thread(2)

Thread 2:
  Thread.Thread(3351712).Queue: Thread.Thread(3351712).Queue(0)->wait(_static_modules.Builtin()->mutex_key())
  /usr/local/pike/7.6.50/lib/modules/Thread.pmod:278: Thread.Thread(3351712).Queue(0)->read()
  base_server/caudium.pike:585: /opt/caudium/server/base_server/caudium()->handler_thread(3)

Thread 3:
  Thread.Thread(3351712).Queue: Thread.Thread(3351712).Queue(0)->wait(_static_modules.Builtin()->mutex_key())
  /usr/local/pike/7.6.50/lib/modules/Thread.pmod:278: Thread.Thread(3351712).Queue(0)->read()
  base_server/caudium.pike:585: /opt/caudium/server/base_server/caudium()->handler_thread(4)

Thread 4:
  _static_modules.Builtin()->thread_id: Thread.Thread(1)->backtrace()
  base_server/caudium.pike:3493: /opt/caudium/server/base_server/caudium()->describe_all_threads()
  Stdio.File: Stdio.File("socket", "63.85.143.88 1198", 777 /* fd=103 */)->set_write_callback(Stdio.File("socket", "63.85.143.88 1198", 777 /* fd=103 */)$
  /usr/local/pike/7.6.50/lib/modules/Stdio.pmod/module.pmod:1097: Stdio.File("socket", "63.85.143.88 1198", 777 /* fd=103 */)->set_write_callback(SSL.ssl$
  /usr/local/pike/7.6.50/lib/modules/SSL.pmod/sslfile.pike:1321: SSL.sslfile(Fd(103))->direct_write()
  /usr/local/pike/7.6.50/lib/modules/SSL.pmod/sslfile.pike:516: SSL.sslfile(Fd(103))->close(UNDEFINED,UNDEFINED)
  protocols/http.pike:812: /opt/caudium/server/protocols/ssl3()->end(0,UNDEFINED)
  _static_modules.Builtin()->Backend: Pike.Backend(0)->`()(0)
  /usr/local/pike/7.6.50/lib/master.pike:2703: master()->_main(({"/usr/local/bin/pike","-DENABLE_THREADS","-DCAUDIUM","-DCAUDIUM_CACHE","-DROXEN","-Ietc/$

Thread 5:
  Thread.Thread(3351712).Queue: Thread.Thread(3351712).Queue(0)->wait(_static_modules.Builtin()->mutex_key())
  /usr/local/pike/7.6.50/lib/modules/Thread.pmod:278: Thread.Thread(3351712).Queue(0)->read()
  base_server/caudium.pike:585: /opt/caudium/server/base_server/caudium()->handler_thread(0)

Thread 6:
  Thread.Thread(3351712).Queue: Thread.Thread(3351712).Queue(0)->wait(_static_modules.Builtin()->mutex_key())
  /usr/local/pike/7.6.50/lib/modules/Thread.pmod:278: Thread.Thread(3351712).Queue(0)->read()
  base_server/caudium.pike:585: /opt/caudium/server/base_server/caudium()->handler_thread(1)

The hangup appears to be in the following call, where five of the six threads (all of the handler threads) are waiting:
Thread.Queue()->read()
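
A handler parked in a queue read is also what an idle worker pool looks like, so these traces alone may not show whether the handlers are stuck or simply have nothing queued for them. Here is a minimal Python sketch of that pattern. It is not Caudium's code: `handler_thread` and `run_demo` are hypothetical names, and Python's `queue.Queue.get()` stands in for Pike's `Thread.Queue()->read()`.

```python
import queue
import threading
import time


def handler_thread(work, results):
    """Hypothetical stand-in for a handler thread: block until work arrives."""
    while True:
        job = work.get()      # blocks here for as long as the queue is empty
        if job is None:       # sentinel value: shut this handler down
            return
        results.append(job())


def run_demo(num_handlers=3):
    work = queue.Queue()
    results = []
    handlers = [
        threading.Thread(target=handler_thread, args=(work, results), daemon=True)
        for _ in range(num_handlers)
    ]
    for t in handlers:
        t.start()

    # Nothing has been queued yet, so every handler is alive and waiting
    # inside work.get(); a stack dump taken now would show them all there.
    time.sleep(0.1)
    idle = sum(t.is_alive() for t in handlers)

    # The dispatching side hands over one request, then one sentinel per handler.
    work.put(lambda: "served")
    for _ in handlers:
        work.put(None)
    for t in handlers:
        t.join(timeout=2)
    return idle, results


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the handlers only wake up when something calls `work.put()`, so if the side that feeds the queue stops doing so, every handler stays in the blocking read.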
I also ran truss on the process to see what the threads were up to, and here's what I got:
bash-2.05$ sudo truss -p 26891
/1:     ioctl(105, DP_POLL, 0xFFBFE8B8) (sleeping...)
/2:     lwp_park(0x00000000, 1)         (sleeping...)
/3:     lwp_park(0x00000000, 0)         (sleeping...)
/4:     lwp_park(0x00000000, 1)         (sleeping...)
/5:     lwp_park(0x00000000, 0)         (sleeping...)
/6:     lwp_park(0x00000000, 1)         (sleeping...)
/1:     Received signal #5, SIGTRAP, in ioctl() [caught]
This is Solaris 9/SPARC... any suggestions on how I might be able to troubleshoot the problem further?
Bill