I've upgraded to Linux 3.13.0-34 and now the async_tls_close_test doesn't work anymore when called with 0 1 as parameters (blocking without clean close). Any ideas what to look for?
Martin Nilsson (Opera Mini - AFK!) @ Pike (-) developers forum wrote:
I've upgraded to Linux 3.13.0-34 and now the async_tls_close_test doesn't work anymore when called with 0 1 as parameters (blocking without clean close). Any ideas what to look for?
What was the previous kernel version you were running?
Stephen R. van den Berg wrote:
Martin Nilsson (Opera Mini - AFK!) @ Pike (-) developers forum wrote:
3.2.0-67
You're talking about:
Doing tests in tlib/modules/SSL.pmod/testsuite (241 tests, pid 19094)
test 16, line 244 No result from subprocess (died of signal SIGTERM)
test 18, line 246 No result from subprocess (died of signal SIGTERM)
Testing SSL 3.0..3.0 client with SSL 3.0..3.0 server (threaded)...
I presume?
It looks like a threading issue: some kind of deadlock, with the threads waiting for each other (lovely).
A cursory reading of the strace output seems to indicate that there are two threads talking to each other, and that both end up waiting in a poll on two file descriptors each (after having exchanged some data previously). So it does not seem to be a buffer issue where the kernel stalls. One would almost guess it's timing-related (even lovelier).
Thanks alot. ( https://www.google.com/search?q=alot&tbm=isch )
That brings me down to the normal three testsuite errors (below). Perhaps we should split out 8.1 and make the final fixes for an 8.0 release based on where we are now?
/home/nilsson/pike/src/testsuite.in:570: Test 140 (shift 2) failed.
1: mixed a() { return sprintf("%O", typeof(sprintf("%c", 1023))); ; }
2: mixed b() { return "string(1023..1023)"; }
3:
o->a(): "string"
o->b(): "string(1023..1023)"
/home/nilsson/pike/src/testsuite.in:574: Test 142 (shift 1) (CRNL) failed.
1: mixed a() { return sprintf("%O", typeof(sprintf("%c\n", Stdio))); ; }
2: mixed b() { return "string"; }
3:
o->a(): "__attribute__("sprintf_result", string)"
o->b(): "string"
Testing beginning of heap
Incorrect number of call outs!
([ /* 11 elements */
  0: -10000,
  /main()->f0: 10456,
  /main()->f1: 514,
  /main()->f2: 507,
  /main()->f3: 491,
  /main()->f4: 465,
  /main()->f5: 509,
  /main()->f6: 460,
  /main()->f7: 519,
  /main()->f8: 555,
  /main()->f9: 524
]) != ([ /* 10 elements */
  /main()->f0: 456,
  /main()->f1: 514,
  /main()->f2: 507,
  /main()->f3: 491,
  /main()->f4: 465,
  /main()->f5: 509,
  /main()->f6: 460,
  /main()->f7: 519,
  /main()->f8: 555,
  /main()->f9: 524
])
/home/nilsson/pike/src/testsuite.in:13157: Test 11320 (shift 1) failed.
1: mixed a() {
2: object pid = Process.create_process(RUNPIKE_ARRAY +
3: ({ "/home/nilsson/pike/src/test_co.pike" }));
4: int i;
5: for (i=0; i < 120; i++) {
6: if (pid->status()>0) break;
7: __signal_watchdog();
8: sleep(1);
9: }
10: if (pid->status() <= 0) {
11: pid->kill(9);
12: return "Killed";
13: }
14: return pid->wait();
15: ; }
16: mixed b() { return 0; }
17:
o->a(): 1
o->b(): 0
The last error doesn't show up if I run with dmalloc, though. I do see the following dmalloc issues:
Doing tests in post_modules/Nettle/testsuite (641 tests, pid 11588)
==LEAK==: (0x3fb7ce0) 8 bytes
**Block: 0x3fb7ce0 Type: PIKE_T_UNKNOWN Refs: 1431655765
**Cannot describe block of type PIKE_T_UNKNOWN (247)
*******************
Locations that handled 0x3fb7ce0: (gc generation: 2/7 gc pass: 0/0)
*** /home/nilsson/pike/src/modules/Gmp/mpz_glue.c:2235 malloc (1 times) !*!
--> /home/nilsson/pike/src/post_modules/Nettle/testsuite.in:1818: Test 607 (shift 1):0 (1 times) !*!
==LEAK==: (0x4613250) 8 bytes
**Block: 0x4613250 Type: PIKE_T_UNKNOWN Refs: 1431655765
**Cannot describe block of type PIKE_T_UNKNOWN (247)
*******************
Locations that handled 0x4613250: (gc generation: 2/7 gc pass: 0/0)
*** /home/nilsson/pike/src/modules/Gmp/mpz_glue.c:2235 malloc (1 times) !*!
--> /home/nilsson/pike/src/post_modules/Nettle/testsuite.in:1818: Test 607 (shift 1):0 (1 times) !*!
==LEAK==: (0x3ebade0) 8 bytes
**Block: 0x3ebade0 Type: PIKE_T_UNKNOWN Refs: 1431655765
**Cannot describe block of type PIKE_T_UNKNOWN (247)
*******************
Locations that handled 0x3ebade0: (gc generation: 2/7 gc pass: 0/0)
*** /home/nilsson/pike/src/modules/Gmp/mpz_glue.c:2235 malloc (1 times) !*!
--> /home/nilsson/pike/src/post_modules/Nettle/testsuite.in:1818: Test 607 (shift 1):0 (1 times) !*!
I sometimes also get this error in the SSL code:
Testing SSL 3.0..3.1 client with SSL 3.0..3.1 server (threaded)...Not open.
/home/nilsson/pike/lib/modules/SSL.pmod/File.pike:908: SSL.File(Stdio.File(0, 0, 777 /* fd=11 */), SSL.ClientConnection(handshaking|local_closed|peer_fatal))->write(({"��\217+�/�?.A�\215�\t��z.p\212\20 4\3�U�\177h\b�ap�\216�I\t�\31u��O�/\232\v\21�\2\24\6�\35\6�g\0*2\223� �\177K\202�B!\216O\204|f?�\r��\214��D�\17\6\213"\21Yg-�\b�\236\217\25�\r �n�\203H�99��k�^��\25yw!�\v�%�\20e-[��\u00167�\36{�\215e>�9�sfu�\2 01�s�\200�\4\25\31jbq9�j�1�\226{�&�\207�B�sdZxD"+[65326]}))
tlib/modules/SSL.pmod/testsuite:61: testsuite()->__lambda_66285_2_line_60()
Finally, there are a bunch of valgrind errors. Mostly these two, though there are a few more that show up several times.
==28114== Conditional jump or move depends on uninitialised value(s)
==28114==    at 0x7C6FEA7: ???
==28114==    by 0x431C4D: apply_lfun (interpret.c:1707)
==28114==    by 0x4C3F2D: object_index_no_free (object.c:1476)
==28114==    by 0x428B5E: opcode_F_LOCAL_INDEX (interpret_functions.h:1987)
==28114==    by 0x7C79EC9: ???
==28114==    by 0x41E270: catching_eval_instruction (interpret.c:1707)
==28114==    by 0x420DF7: inter_return_opcode_F_CATCH (interpret.c:1291)
==28114==    by 0x7C78DCD: ???
==28114==    by 0x42F6FD: mega_apply_low (interpret.c:1707)
==28114==    by 0x42CD80: lower_mega_apply (interpret.c:2168)
==28114==    by 0x42F6E3: mega_apply_low (interpret.c:2715)
==28114==    by 0x43234F: apply_external (interpret.c:3195)
==28114==
==28114== Conditional jump or move depends on uninitialised value(s)
==28114==    at 0x465AAD: do_docode2 (docode.c:1317)
==28114==    by 0x460750: do_docode2 (docode.c:314)
==28114==    by 0x4622CD: do_docode2 (docode.c:314)
==28114==    by 0x467D32: do_code_block (docode.c:314)
==28114==    by 0x55F623: dooptcode (las.c:5228)
==28114==    by 0x41B753: yyparse (language.yacc:919)
==28114==    by 0x4EC94E: run_pass2 (program.c:9142)
==28114==    by 0x4EF237: f_compilation_compile (program.c:9654)
==28114==    by 0x42CD80: lower_mega_apply (interpret.c:2168)
==28114==    by 0x431EB9: apply (interpret.c:2715)
==28114==    by 0x4D9753: f_compilation_env_compile (program.c:8668)
==28114==    by 0x42CD80: lower_mega_apply (interpret.c:2168)
==28114==
/home/nilsson/pike/src/testsuite.in:570: Test 140 (shift 2) failed.
1: mixed a() { return sprintf("%O", typeof(sprintf("%c", 1023))); ; }
2: mixed b() { return "string(1023..1023)"; }
3:
o->a(): "string"
o->b(): "string(1023..1023)"
The above test is now fixed. It was due to
sprintf("%c", 1023)
being optimized to
int2char(1023)
which didn't have the same level of strict return type...
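For anyone who wants to check this on their own build, a minimal sketch along the following lines should do. It only prints the compile-time type the compiler infers for each of the two expressions, mirroring what test 140 checks. The labels in the output are made up for illustration, and the exact type strings depend on the Pike revision, so none are claimed here.

```pike
// Sketch only: compare the static return types inferred for the
// two expressions discussed above.
int main()
{
  // typeof() yields the compile-time type; %O formats it as text,
  // the same way the testsuite entry does via sprintf("%O", ...).
  write("sprintf:  %O\n", typeof(sprintf("%c", 1023)));
  write("int2char: %O\n", typeof(int2char(1023)));
  return 0;
}
```

Going by the test output quoted above, the sprintf expression reported "string" before the fix, where the testsuite expects "string(1023..1023)".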
Does it work with a Pike version that is a few hundred revisions old? In that case bisect.
pike-devel@lists.lysator.liu.se