Mark H Weaver <mhw@netris.org> writes:
The SIGSEGV happens in the following call:
_nettle_cpuid (0, cpuid_data);
The problem appears to be that the PLT entry for '_nettle_cpuid' has not yet been initialized when 'fat_init' is called via 'nettle_memxor_resolve':
Sounds pretty bad... We really need to fix this one way or another. I'm not 100% sure I understand what's going on, but from your gdb session, I think I agree with your analysis.
I haven't seen any documentation explaining precisely what one can and cannot do in an ifunc resolver. Do you know?
Is RTLD_NOW part of the problem (i.e., does it work if you change the test program to use RTLD_LAZY and then call nettle_memxor)? I would expect the call to work if RTLD_NOW either
1. resolved all normal (i.e., not ifunc) symbols first, before calling the ifunc resolvers, or
2. first initialized the plt entries in the same way as for RTLD_LAZY, and then replaced the entries by resolving one symbol at a time.
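For the RTLD_LAZY versus RTLD_NOW experiment, something like the following rough sketch might do. The helper name load_and_call, the typedef, and the test buffers are made up for illustration; the only assumption about the library is that the symbol has a memxor-like signature (dst, src, n).

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed signature of the symbol under test (memxor-like). */
typedef void *(*xor_like_fn) (void *dst, const void *src, size_t n);

/* Hypothetical helper: open `path` with the given binding mode
   (RTLD_LAZY or RTLD_NOW), look up `symbol`, and call it once.
   Returns 0 if everything succeeded, -1 otherwise.  If the loader
   crashes while binding symbols, we never get as far as returning. */
static int
load_and_call (const char *path, const char *symbol, int mode)
{
  unsigned char dst[4] = { 1, 2, 3, 4 };
  const unsigned char src[4] = { 4, 3, 2, 1 };

  void *handle = dlopen (path, mode);
  if (!handle)
    {
      fprintf (stderr, "dlopen failed: %s\n", dlerror ());
      return -1;
    }

  xor_like_fn fn = (xor_like_fn) dlsym (handle, symbol);
  if (!fn)
    {
      fprintf (stderr, "dlsym failed: %s\n", dlerror ());
      dlclose (handle);
      return -1;
    }

  /* By the time this call executes, any ifunc resolver for the
     symbol must already have run. */
  fn (dst, src, sizeof dst);

  dlclose (handle);
  return 0;
}
```

Calling it once with RTLD_LAZY and once with RTLD_NOW on the libnettle shared object and "nettle_memxor" should show whether the binding mode matters. On older glibc versions the program needs to be linked with -ldl.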
Some things you could try:
* Undefine HAVE_LINK_IFUNC, falling back to the non-ifunc code.
* Declare _nettle_cpuid as having visibility hidden (then I think the call should not jump via the plt). Might need corresponding pseudo-ops also in x86_64/fat/cpuid.asm, I'm not sure.
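To make the second suggestion concrete, here is a minimal sketch of an ifunc resolver that calls a helper declared with hidden visibility, using GCC's function attributes. All demo_* names are invented for illustration and this is not the actual nettle code; in particular, demo_cpu_has_fast_path is only a stand-in for _nettle_cpuid, and whether hidden visibility really avoids the crash is still just my guess.

```c
#include <stddef.h>

/* Hypothetical stand-in for _nettle_cpuid.  With hidden visibility
   the symbol is bound inside the shared object, so calls from within
   the library should not need to go through the PLT. */
__attribute__ ((visibility ("hidden")))
int demo_cpu_has_fast_path (void);

int
demo_cpu_has_fast_path (void)
{
  return 0; /* placeholder: a real version would query the CPU */
}

typedef void *demo_memxor_func (void *dst, const void *src, size_t n);

/* Portable implementation: dst[i] ^= src[i] for each byte. */
static void *
demo_memxor_generic (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  for (size_t i = 0; i < n; i++)
    d[i] ^= s[i];
  return dst;
}

/* Placeholder for an optimized variant; it simply reuses the
   generic code here. */
static void *
demo_memxor_fast (void *dst, const void *src, size_t n)
{
  return demo_memxor_generic (dst, src, n);
}

/* The resolver is run by the dynamic linker while it processes
   relocations, and returns the implementation to bind to. */
static demo_memxor_func *
demo_memxor_resolve (void)
{
  return demo_cpu_has_fast_path () ? demo_memxor_fast
                                   : demo_memxor_generic;
}

void *demo_memxor (void *dst, const void *src, size_t n)
  __attribute__ ((ifunc ("demo_memxor_resolve")));
```

The ifunc attribute needs GCC (or a compatible compiler) on an ELF target with GNU ifunc support. For a function written in assembly, the equivalent of the visibility attribute would presumably be a .hidden directive in the .asm file.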
We'd really need to ask some glibc guru about the ordering. To me, it seems like a bug if ifunc resolver functions can't call any other functions in the library.
Regards, /Niels