Re: nettle-2.7 release candidate

24 Apr 2013

On Tue, 23 Apr 2013, Niels Möller wrote:
...
Martin Storsjö <martin@martin.st> writes:
...
Hmm, yes, I think that might have been the case. So since we can't
rely on that being aligned anyway, we could just as well skip the 8
byte offset.
If it works now, I don't think we should touch this code further before
release.
Yes, that's probably wisest.
...
For later optimization (I don't know whether using aligned rather than
unaligned loads and stores really makes a difference to performance
here), one could keep the 8 byte extra allocation, then do something like
        lea	8(%rsp), %r10
        and	$-16, %r10
(%r10 should always be free for scratch use at both entry and exit,
right?). Then %r10 will be 16 byte aligned, and hold either %rsp or
%rsp + 8. And we can then do fully aligned loads and stores of the xmm
registers via offsets from %r10.
That would probably work. I don't know these things well enough to say 
whether there's any serious performance to be gained by doing this, 
compared to the inconvenience of wasting one register.
// Martin
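
To check the arithmetic behind the lea/and suggestion quoted above, here is a minimal sketch that models the two instructions on plain integers. It is only an illustration, not code from nettle: the function name aligned_base and the sample stack pointer values are made up, and it assumes %rsp is a multiple of 8 at that point.

```python
def aligned_base(rsp: int) -> int:
    """Model of: lea 8(%rsp), %r10 ; and $-16, %r10 (hypothetical helper)."""
    r10 = rsp + 8   # lea 8(%rsp), %r10
    r10 &= -16      # and $-16, %r10 clears the low four bits
    return r10


# Made-up stack pointer values; the only assumption is 8 byte alignment.
for rsp in (0x7FFC00001000, 0x7FFC00001008):
    r10 = aligned_base(rsp)
    print(hex(rsp), "->", hex(r10), "offset", r10 - rsp)
```

Under that assumption the result is always a multiple of 16 and sits either 0 or 8 bytes above %rsp, so a save area addressed from %r10 should still fit inside the allocation as long as the extra 8 bytes are kept.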
