Hello list,
I have a segfault in Pike that happens rarely on a busy Camas server. To get a backtrace and file a bug report I tried running it under gdb, but so far no segfault has occurred there, and I can't run it under gdb for long since it is a production server (I have seen this error only once on the development server, which has too little traffic). It may not even be possible to trigger the segfault under gdb because of some odd side effect (perhaps the segfault only occurs when the process runs at full speed).
So I'd like to know whether it would be possible to catch SIGSEGV in Pike and output a backtrace, so that:
. You might get more bug reports.
. It would be possible to track down problems when running inside a debugger is not possible, or when the error does not occur there.
Thank you for your reply.
-- David Gourdelier
Can't you get a coredump? It's usually just as helpful.
Another thing you perhaps can try is to compile Pike with --with-rtldebug. It'll run slower but not unbearably so, and there's a fair chance the extra internal checks will give you a better error to report. (The next step is to add -d but that will probably make it too slow for heavy production use.)
As for catching SIGSEGV in Pike, it is strictly speaking possible (just use the normal signal() function), but I suspect that trying to run any Pike code after the fault has happened won't work well. Besides, it is usually the C-level state that is interesting for such errors, and you can't get at that from inside Pike. So running gdb on a coredump is much better.
/ Martin Stjernholm, Roxen IS
Previous text:
2003-09-12 00:15: Subject: Catching segfault
/ Brevbäraren
Another thing worth mentioning regarding coredumps: if you are running multithreaded on Linux, the kernel will only dump one of the threads. That makes the coredump useless in most cases. To fix it you have to patch the kernel.
Search the kernel archives for "tcore kernel patch" and you should find it. If not, I have one for 2.4.20 that I can make available for you.
/ Martin Stjernholm, Roxen IS
Previous text:
2003-09-12 01:14: Subject: Catching segfault
Hi,
> Can't you get a coredump? It's usually just as helpful.
Yes, but in this case I didn't get one and I don't know why (I checked ulimit).
> Another thing you perhaps can try is to compile Pike with --with-rtldebug. It'll run slower but not unbearably so, and there's a fair chance the extra internal checks will give you a better error to report. (The next step is to add -d but that will probably make it too slow for heavy production use.)
Ok.
> As for catching SIGSEGV in pike, it's possible strictly speaking (just use the normal signal() function), but I suspect it won't work well to try to run any pike code after it has happened. And besides, it's the C-level state that usually is interesting for such errors, and you can't get to that from inside pike. So running gdb on a coredump is much better.
Thank you also for the coredump tip about Linux and threads, since that is my situation.
/ David Gourdelier
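The ulimit check mentioned above can also be done from inside the process. The following is a minimal sketch in C, assuming a POSIX system with getrlimit()/setrlimit(); the function names core_dumps_allowed and raise_core_limit are hypothetical. It only covers the resource limit, which is one of several reasons a core file may fail to appear.

```c
#include <sys/resource.h>

/* Returns 1 if the soft core-size limit is non-zero, 0 if core
 * files are disabled by the limit, and -1 on error. */
int core_dumps_allowed(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    return rl.rlim_cur != 0;
}

/* Raises the soft core-size limit to the hard limit, which an
 * unprivileged process is always allowed to do.
 * Returns 0 on success, -1 on failure. */
int raise_core_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}
```

If the hard limit itself is zero, raise_core_limit() succeeds but core files stay disabled, so core_dumps_allowed() is worth checking afterwards.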
Are you changing the effective user or group? In that case you won't get coredumps since they would be a theoretical security problem.
/ Martin Stjernholm, Roxen IS
Previous text:
2003-09-12 16:23: Subject: Re: Catching segfault
Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
> Are you changing the effective user or group? In that case you won't get coredumps since they would be a theoretical security problem.
Yes I am. But why would this be a theoretical security problem?
Euid/egid is typically used to run untrusted code provided by a user as that user. If the kernel dumped core as root, the user could trigger core files to be written as root in directories (s)he doesn't have access to. If the kernel dumped core as the user, (s)he could trigger a coredump to gain access to sensitive data that the process holds in memory.
The kernel has a flag that enables coredumps, and it is cleared when seteuid et al are called. If you're using a sufficiently recent Pike (be it 7.2, 7.4 or 7.5), you can use system.dumpable() or System.dumpable() to re-enable the flag. We added that function in early March this year.
/ Martin Stjernholm, Roxen IS
Previous text:
2003-09-15 10:28: Subject: Re: Catching segfault
-- David Gourdelier
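For reference, the kernel flag discussed above can be inspected and set from C on Linux. This is a minimal sketch assuming Linux's prctl() with PR_SET_DUMPABLE and PR_GET_DUMPABLE; the function name reenable_coredumps is hypothetical, and whether Pike's System.dumpable() is implemented this way is an assumption, not something stated in this thread.

```c
#include <sys/prctl.h>

/* Sets the per-process "dumpable" flag and reads it back.
 * Intended to be called after seteuid()/setegid() and friends,
 * which clear the flag as described above.
 * Returns the flag value (1 when coredumps are enabled again),
 * or -1 if either prctl() call fails. */
int reenable_coredumps(void)
{
    if (prctl(PR_SET_DUMPABLE, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}
```

Re-enabling the flag brings back the exposure described above, so it is probably best limited to debugging sessions on machines where the core files are adequately protected.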
pike-devel@lists.lysator.liu.se