The main challenge in making the class version fast is that you pay the overhead of an additional function call for every call to `[](). I think inlining does not work unless the function being inlined comes from the same class (or perhaps a parent class). It would be nice if the compiler generated a fast path that assumes your loop always calls `[]() in your class 'amapping'. In that case, it would be possible to optimize the function call in a way similar to what the 'fast_call' branch does with map(), etc.
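
To make the fast-path idea concrete, here is a rough sketch of what such a guarded fast path could look like if written by hand. Python is used purely as a stand-in, and the names (AMapping, sum_generic, sum_guarded) are made up for illustration. This is not what the compiler or the 'fast_call' branch actually does, only the general shape of the optimization: check the class once, then skip the per-element call.

```python
class AMapping:
    """Hypothetical stand-in for the 'amapping' class: a thin wrapper around a dict."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        # Plays the role of the index operator: every obj[key] pays for this call.
        return self._data[key]


def sum_generic(obj, keys):
    # Slow path: each lookup dispatches through the object's __getitem__.
    total = 0
    for key in keys:
        total += obj[key]
    return total


def sum_guarded(obj, keys):
    # Guard, checked once outside the loop. The exact type test means a
    # subclass that overrides __getitem__ will not take the fast path.
    if type(obj) is AMapping:
        data = obj._data  # body of __getitem__ "inlined" by hand
        total = 0
        for key in keys:
            total += data[key]  # direct lookup, no extra method call
        return total
    # Guard failed: fall back to the generic, fully dispatched loop.
    return sum_generic(obj, keys)
```

Both functions return the same result for an AMapping. The guarded version just avoids one method call per element, which is the overhead described above. A compiler doing this automatically would also have to handle the guard failing partway through the loop, which this sketch ignores.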