On Tue, Dec 17, 2013 at 4:12 PM, Niels Möller nisse@lysator.liu.se wrote:
A solution to that issue would be to have a nettle library constructor that runs the equivalent of cpuid in ARM, and stores it to a global variable. Then each assembly module (e.g., aes-arm) will jump to the correct implementation detected at runtime.
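As a rough illustration of the constructor idea (a sketch only, not nettle code): the names fat_have_neon, fat_init, foo, foo_generic and foo_neon below are hypothetical, the C functions stand in for the assembly variants, and the detection assumes Linux with glibc's getauxval on 32-bit ARM. On any other platform it simply falls back to the generic version.

```c
#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_NEON */
#endif

/* Hypothetical global flag, filled in once when the library is loaded. */
static int fat_have_neon = 0;

/* GCC/Clang extension: runs before main(), or at dlopen() time. */
__attribute__((constructor))
static void fat_init(void)
{
#if defined(__linux__) && defined(__arm__)
  /* Assumption: the kernel reports NEON support through the HWCAP bits. */
  fat_have_neon = (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
  /* No detection available here; use the generic code path. */
  fat_have_neon = 0;
#endif
}

/* Stand-ins for two implementations of the same routine. */
static const char *foo_generic(void) { return "generic"; }
static const char *foo_neon(void)    { return "neon"; }

/* Public entry point: picks the variant selected at load time. */
const char *foo(void)
{
  return fat_have_neon ? foo_neon() : foo_generic();
}
```

In a real assembly module the test of the flag and the branch would sit at the function entry, or the constructor could store a function pointer instead of a flag.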
The difficult part is the configure work. We'd either have to build multiple object files for each function, with different link names, and then have some glue to select the right one at runtime.
Why not a big assembly function that contains everything? At the start it simply checks which CPU optimization is available and jumps to the appropriate label (I'm thinking x86 asm here, but I hope what I say applies to ARM as well).
Or use a "master file" for each function, say arm/fat/foo.asm, which includes the other files and makes the right thing happen.
That could work too.
Things get a bit more complex if we need to use the C version on some machines, since the current build setup assumes that an assembly file completely replaces the corresponding C file.
If everything were in a single file it would work like a charm, but even splitting it into multiple files would work if subdirectories are used and only the main file is considered the "real" asm. (I suppose you are referring to --disable-assembler?)
I clearly see the need for a runtime test for neon. Say, --enable-arm-neon=fat or a more general --enable-fat.
I like the name :) I think the latter makes more sense if it is to be used for x86 as well.
But you also mention v6 optimizations. For clarity, do you mean that you'd like to see runtime tests for that too? To me, it seems a bit unlikely to need a fat binary which supports both pre-v6 ARM and v6 and later. I'd expect pre-v6 ARM to be used only in embedded systems where the CPU flavor is known at build time.
You may be right; it may make sense to treat them separately.
regards, Nikos