Hi,
I've prepared a tarball for the first nettle-4.0 release candidate.
See https://www.lysator.liu.se/~nisse/archive/nettle-4.0rc1.tar.gz, https://www.lysator.liu.se/~nisse/archive/nettle-4.0rc1.tar.gz.sig
NEWS file at https://git.lysator.liu.se/nettle/nettle/-/blob/master/NEWS
All testing and feedback appreciated. I plan to get the release out next weekend.
Regards, /Niels
I'd say it is good to go, but some minor stuff for consideration:
1) Is including HTML and PDF manuals in 'make dist' tarballs still useful? Reproducing the tarballs from git requires more tools, especially the PDF. Removing them reduces tarball from 2.7MB to 2.0MB.
2) The tarball embeds some username info, some portability/reproducability TAR_OPTIONS for inspiration:
export TAR_OPTIONS = --owner=0 --group=0 --numeric-owner --sort=name --mode=go+u,go-w --format=ustar
3) On a riscv64 machine (RevyOS, derived from Debian trixie, with gcc --version 'gcc (Debian 14.3.0-10revyos2) 14.3.0') I got the warning below, and I'm not sure why it doesn't show up on amd64. Self-tests passes fine on the platform.
In function ‘xalloc’, inlined from ‘test_aead_message’ at testutils.c:989:19: testutils.c:37:13: warning: argument 1 value ‘18446744073709551615’ exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=] 37 | void *p = malloc(size); | ^~~~~~~~~~~~ In file included from testutils.h:11, from testutils.c:3: /usr/include/stdlib.h: In function ‘test_aead_message’: /usr/include/stdlib.h:672:14: note: in a call to allocation function ‘malloc’ declared here 672 | extern void *malloc (size_t __size) __THROW __attribute_malloc__ | ^~~~~~
4) Enabling -fanalyzer shows the warning below. I've had mixed experiences with -fanalyzer, but I've had real positives with it.
gcc -I. -I. -Wall -Warith-conversion -Wcast-align=strict -Wdate-time -Wdouble-promotion -Wduplicated-branches -Wduplicated-cond -Wextra -Wformat-signedness -Wflex-array-member-not-at-end -Winit-self -Winvalid-pch -Wlogical-op -Wmissing-include-dirs -Wnull-dereference -Wold-style-definition -Wopenmp-simd -Woverlength-strings -Wpacked -Wpointer-arith -Wstack-protector -Wstrict-prototypes -Wsuggest-attribute=cold -Wsuggest-final-methods -Wsuggest-final-types -Wsync-nand -Wtrampolines -Wuninitialized -Wunknown-pragmas -Wunsafe-loop-optimizations -Wunused-macros -Wvariadic-macros -Wvector-operation-performance -Wvla -Wwrite-strings -Warray-bounds=2 -Wattribute-alias=2 -Wbidi-chars=any,ucn -Wformat-overflow=2 -Wformat=2 -Wformat-truncation=2 -Wimplicit-fallthrough=5 -Wshift-overflow=2 -Wuse-after-free=3 -Wunused-const-variable=2 -Wvla-larger-than=4031 -Wno-analyzer-malloc-leak -Wno-sign-compare -Wno-system-headers -Wno-cast-align -Wno-clobbered -Wno-discarded-qualifiers -Wno-format -Wno-implicit-fallthrough -Wno-unused-parameter -fstrict-flex-arrays -Wstrict-flex-arrays -Werror -Wno-error=unused-macros -Wno-error=unused-const-variable= -Wno-error=format-truncation -fanalyzer -DHAVE_CONFIG_H -g -O2 -ggdb3 -Wall -W -Wno-sign-compare -Wmissing-prototypes -Wmissing-declarations -Wstrict-prototypes -Wpointer-arith -Wbad-function-cast -Wnested-externs -fpic -MT gosthash94.o -MD -MP -MF gosthash94.o.d -c gosthash94.c \ && true gosthash94.c: In function ‘gost_block_compress’: gosthash94.c:76:40: error: use of uninitialized value ‘v[1]’ [CWE-457] [-Werror=analyzer-use-of-uninitialized-value] 76 | w[0] = u[0] ^ v[0], w[1] = u[1] ^ v[1]; | ~^~~ ‘gosthash94_update_int.part.0’: event 1 | | 297 | gosthash94_update_int (struct gosthash94_ctx *ctx, | | ^~~~~~~~~~~~~~~~~~~~~ | | | | | (1) entry to ‘gosthash94_update_int.part.0’ | ‘gosthash94_update_int.part.0’: event 2 | |macros.h:184:8: | 184 | if ((ctx)->index) \ | | ^ | | | | | (2) following ‘false’ branch... gosthash94.c:301:5: note: in expansion of macro ‘MD_UPDATE’ | 301 | MD_UPDATE(ctx, length, msg, COMPRESS, ctx->count++); | | ^~~~~~~~~ | ‘gosthash94_update_int.part.0’: event 3 | |cc1: | (3): ...to here | ‘gosthash94_update_int.part.0’: event 4 | |macros.h:205:21: | 205 | while ((length) >= sizeof((ctx)->block)) \ | | ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~ | | | | | (4) following ‘true’ branch (when ‘length > 31’)... gosthash94.c:301:5: note: in expansion of macro ‘MD_UPDATE’ | 301 | MD_UPDATE(ctx, length, msg, COMPRESS, ctx->count++); | | ^~~~~~~~~ | ‘gosthash94_update_int.part.0’: event 5 | | 286 | #define COMPRESS(ctx, block) gost_compute_sum_and_hash((ctx), (block), sbox); | | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | (5) ...to here macros.h:207:9: note: in expansion of macro ‘COMPRESS’ | 207 | f((ctx), (data)); \ | | ^ gosthash94.c:301:5: note: in expansion of macro ‘MD_UPDATE’ | 301 | MD_UPDATE(ctx, length, msg, COMPRESS, ctx->count++); | | ^~~~~~~~~ | ‘gosthash94_update_int.part.0’: event 6 | | 286 | #define COMPRESS(ctx, block) gost_compute_sum_and_hash((ctx), (block), sbox); | | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | (6) calling ‘gost_compute_sum_and_hash’ from ‘gosthash94_update_int.part.0’ macros.h:207:9: note: in expansion of macro ‘COMPRESS’ | 207 | f((ctx), (data)); \ | | ^ gosthash94.c:301:5: note: in expansion of macro ‘MD_UPDATE’ | 301 | MD_UPDATE(ctx, length, msg, COMPRESS, ctx->count++); | | ^~~~~~~~~ | +--> ‘gost_compute_sum_and_hash’: events 7-8 | | 266 | gost_compute_sum_and_hash (struct gosthash94_ctx *ctx, const uint8_t *block, | | ^~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | (7) entry to ‘gost_compute_sum_and_hash’ |...... | 273 | for (i = carry = 0; i < 8; i++, block += 4) | | ~~~~~ | | | | | (8) following ‘true’ branch (when ‘i != 8’)... | ‘gost_compute_sum_and_hash’: event 9 | |macros.h:120:20: | 120 | ( (((uint32_t) (p)[3]) << 24) \ | | ~~~^~~ | | | | | (9) ...to here gosthash94.c:275:25: note: in expansion of macro ‘LE_READ_UINT32’ | 275 | block_le[i] = LE_READ_UINT32(block); | | ^~~~~~~~~~~~~~ | ‘gost_compute_sum_and_hash’: events 10-12 | | 273 | for (i = carry = 0; i < 8; i++, block += 4) | | ~~^~~ | | | | | (10) following ‘false’ branch (when ‘i == 8’)... |...... | 283 | gost_block_compress (ctx, block_le, sbox); | | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | (11) ...to here | | (12) calling ‘gost_block_compress’ from ‘gost_compute_sum_and_hash’ | +--> ‘gost_block_compress’: events 13-15 | | 65 | gost_block_compress (struct gosthash94_ctx *ctx, const uint32_t *block, | | ^~~~~~~~~~~~~~~~~~~ | | | | | (13) entry to ‘gost_block_compress’ |...... | 69 | uint32_t key[8], u[8], v[8], w[8], s[8]; | | ~ | | | | | (14) region created on stack here |...... | 76 | w[0] = u[0] ^ v[0], w[1] = u[1] ^ v[1]; | | ~~~~ | | | | | (15) use of uninitialized value ‘v[1]’ here | cc1: all warnings being treated as errors
5) Another warning, just for testsuite:
poly1305-test.c: In function ‘test_fixed’: poly1305-test.c:184:7: error: ‘test_poly1305_internal’ reading 16 bytes from a region of size 1 [-Werror=stringop-overread] 184 | test_poly1305_internal (key->data, message->length, message->data, nonce, digest->data); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ poly1305-test.c:184:7: note: referencing argument 1 of type ‘const uint8_t[16]’ {aka ‘const unsigned char[16]’} poly1305-test.c:184:7: note: referencing argument 4 of type ‘const uint8_t[16]’ {aka ‘const unsigned char[16]’} poly1305-test.c:184:7: error: ‘test_poly1305_internal’ reading 16 bytes from a region of size 1 [-Werror=stringop-overread] poly1305-test.c:184:7: note: referencing argument 5 of type ‘const uint8_t[16]’ {aka ‘const unsigned char[16]’} poly1305-test.c:127:1: note: in a call to function ‘test_poly1305_internal’ 127 | test_poly1305_internal (const uint8_t key[16], | ^~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors
6) Sometimes -fstrict-flex-arrays -Wstrict-flex-arrays finds interesting spots too -- https://blogs.oracle.com/linux/closing-a-hole-in-the-detection-of-buffer-ove... -- also for testsuite
gcc -I.. -I.. -Wall -Warith-conversion -Wcast-align=strict -Wdate-time -Wdouble-promotion -Wduplicated-branches -Wduplicated-cond -Wextra -Wformat-signedness -Wflex-array-member-not-at-end -Winit-self -Winvalid-pch -Wlogical-op -Wmissing-include-dirs -Wnull-dereference -Wold-style-definition -Wopenmp-simd -Woverlength-strings -Wpacked -Wpointer-arith -Wstack-protector -Wstrict-prototypes -Wsuggest-attribute=cold -Wsuggest-final-methods -Wsuggest-final-types -Wsync-nand -Wtrampolines -Wuninitialized -Wunknown-pragmas -Wunsafe-loop-optimizations -Wunused-macros -Wvariadic-macros -Wvector-operation-performance -Wvla -Wwrite-strings -Warray-bounds=2 -Wattribute-alias=2 -Wbidi-chars=any,ucn -Wformat-overflow=2 -Wformat=2 -Wformat-truncation=2 -Wimplicit-fallthrough=5 -Wshift-overflow=2 -Wuse-after-free=3 -Wunused-const-variable=2 -Wvla-larger-than=4031 -Wno-analyzer-malloc-leak -Wno-sign-compare -Wno-system-headers -Wno-cast-align -Wno-clobbered -Wno-discarded-qualifiers -Wno-format -Wno-implicit-fallthrough -Wno-unused-parameter -fstrict-flex-arrays -Wstrict-flex-arrays -Werror -Wno-error=unused-macros -Wno-error=unused-const-variable= -Wno-error=format-truncation -fanalyzer -Wno-error=overlength-strings -DHAVE_CONFIG_H -g -O2 -ggdb3 -Wall -W -Wno-sign-compare -Wmissing-prototypes -Wmissing-declarations -Wstrict-prototypes -Wpointer-arith -Wbad-function-cast -Wnested-externs -MT pss-test.o -MD -MP -MF pss-test.o.d -c pss-test.c && true pss-test.c: In function ‘test_main’: pss-test.c:57:13: error: array subscript 6 is above array bounds of ‘uint8_t[1]’ {aka ‘unsigned char[1]’} [-Werror=array-bounds=] 57 | salt->data[6] = 0x00; | ~~~~~~~~~~^~~ pss-test.c:57:13: error: trailing array ‘uint8_t[1]’ {aka ‘unsigned char[1]’} should not be used as a flexible array member [-Werror=strict-flex-arrays]
/Simon
Simon Josefsson simon@josefsson.org writes:
- Is including HTML and PDF manuals in 'make dist' tarballs still
useful? Reproducing the tarballs from git requires more tools, especially the PDF. Removing them reduces tarball from 2.7MB to 2.0MB.
I'm not sure which formats people find useful. I think it's valuable that one can get the tarball and read the docs without having to build or generate anything.
Nettle releases are a bit old-fashioned, with little thought to reproducibility and transparency.
- The tarball embeds some username info, some
portability/reproducability TAR_OPTIONS for inspiration:
export TAR_OPTIONS = --owner=0 --group=0 --numeric-owner --sort=name --mode=go+u,go-w --format=ustar
Trimming the tar meta data a little seems like an easy improvement. When I looked at this in a different project, I ended up with the invocation here: https://git.glasklar.is/system-transparency/core/system-transparency/-/blob/...
tar --exclude .git --exclude .gitmodules --sort=name --format=posix \ --pax-option=exthdr.name=%d/PaxHeaders/%f \ --pax-option=delete=atime,delete=ctime \ --clamp-mtime --mtime="./$(basename "${DIST_DIR}")/${LATEST_COMPONENT}" \ --numeric-owner --owner=0 --group=0 \ --mode=go+u,go-w \ -cf - "$(basename "${DIST_DIR}")" ) | gzip --no-name --best > "${DIST_DIR}.tar.gz"
based mostly on the section on reproducibility in the GNU tar manual. Is there some reason to prefer format ustar over posix?
If I remember correctly, fiddling with file timestamps was also rather important to get a reproducible tar file. But it may be a good first step to fix the non-timestamp metadata?
- On a riscv64 machine (RevyOS, derived from Debian trixie, with gcc
--version 'gcc (Debian 14.3.0-10revyos2) 14.3.0') I got the warning below, and I'm not sure why it doesn't show up on amd64. Self-tests passes fine on the platform.
In function ‘xalloc’, inlined from ‘test_aead_message’ at testutils.c:989:19: testutils.c:37:13: warning: argument 1 value ‘18446744073709551615’ exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=] 37 | void *p = malloc(size); | ^~~~~~~~~~~~ In file included from testutils.h:11, from testutils.c:3: /usr/include/stdlib.h: In function ‘test_aead_message’: /usr/include/stdlib.h:672:14: note: in a call to allocation function ‘malloc’ declared here 672 | extern void *malloc (size_t __size) __THROW __attribute_malloc__ | ^~~~~~
I've seen similar warnings occationally on other platforms, without making any sense of it. And it appears the code looks like
uint8_t *buf = xalloc (cipher->length + 1); uint8_t *copy = xalloc (cipher->length);
where your compiler warns for the second line, but not the first.
- Enabling -fanalyzer shows the warning below. I've had mixed
experiences with -fanalyzer, but I've had real positives with it.
Would be interesting to get gcc's analyzer to run reliably in the CI (currently, it runs clang's static analyzer). But I can't make sense of this report. As I read it, it is the code path that processes complete blocks.
| 184 | if ((ctx)->index) \ | | ^ | | | | | (2) following ‘false’ branch...
Means that the block buffer is empty, and
| 205 | while ((length) >= sizeof((ctx)->block)) \ | | ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~ | | | | | (4) following ‘true’ branch (when ‘length > 31’)...
means that caller's input is a complete block (or more). That input block is eventually passed to gost_compute_sum_and_hash, which reads it as 8 little-endian 32-bit numbers
uint32_t block_le[8]; ... for (i = carry = 0; i < 8; i++, block += 4) { block_le[i] = LE_READ_UINT32(block); ...
and passes to gost_block_compress which uses this to initizes a working copy,
uint32_t key[8], u[8], v[8], w[8], s[8]; ... memcpy (v, block, sizeof (v));
At this point, all of v should be properly initialized, unless the external caller passes an uninitialized message.
The trace from the analyzer mention this memcpy call as a relevant event, maybe that is where it loses track? Analyzer says
| 69 | uint32_t key[8], u[8], v[8], w[8], s[8]; | | ~ | | | | | (14) region created on stack here |...... | 76 | w[0] = u[0] ^ v[0], w[1] = u[1] ^ v[1]; | | ~~~~ | | | | | (15) use of uninitialized value ‘v[1]’ here
- Another warning, just for testsuite:
poly1305-test.c: In function ‘test_fixed’: poly1305-test.c:184:7: error: ‘test_poly1305_internal’ reading 16 bytes from a region of size 1 [-Werror=stringop-overread] 184 | test_poly1305_internal (key->data, message->length, message->data, nonce, digest->data); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I think this is related to the traditional "flexible arrays" used by the test type
struct tstring { struct tstring *next; size_t length; uint8_t data[1]; };
If I look at https://en.cppreference.com/w/c/language/array.html, it seems that instead declaring the flexible member as
uint8_t data[];
should be allowed in c99, so it's doable to update to the more modern style.
Regards, /Niels
Niels Möller nisse@lysator.liu.se writes:
- The tarball embeds some username info, some
portability/reproducability TAR_OPTIONS for inspiration:
export TAR_OPTIONS = --owner=0 --group=0 --numeric-owner --sort=name --mode=go+u,go-w --format=ustar
Trimming the tar meta data a little seems like an easy improvement. When I looked at this in a different project, I ended up with the invocation here: https://git.glasklar.is/system-transparency/core/system-transparency/-/blob/...
tar --exclude .git --exclude .gitmodules --sort=name --format=posix \ --pax-option=exthdr.name=%d/PaxHeaders/%f \ --pax-option=delete=atime,delete=ctime \ --clamp-mtime --mtime="./$(basename "${DIST_DIR}")/${LATEST_COMPONENT}" \ --numeric-owner --owner=0 --group=0 \ --mode=go+u,go-w \ -cf - "$(basename "${DIST_DIR}")" ) | gzip --no-name --best > "${DIST_DIR}.tar.gz"
based mostly on the section on reproducibility in the GNU tar manual. Is there some reason to prefer format ustar over posix?
I tried --format=posix for some time (also inspired by the tar manual), but realized it did not provide anything important, that ustar did not provide. I found that the ./PaxHeadeers/ virtual sub-directory with --format=posix archives annoying:
https://lists.gnu.org/archive/html/bug-gnulib/2025-01/msg00233.html
Using gzip with --rsyncable can dramatically improve some situations (e.g., transfer or storage of multiple nettle-*.tar.gz tarballs with block-based deduping), but it costs 2-3% of size.
If I remember correctly, fiddling with file timestamps was also rather important to get a reproducible tar file. But it may be a good first step to fix the non-timestamp metadata?
It is indeed trickier than everyone would want this to be... small incremental improvements like avoiding uid/gid help though.
/Simon
Simon Josefsson simon@josefsson.org writes:
- The tarball embeds some username info, some
portability/reproducability TAR_OPTIONS for inspiration:
export TAR_OPTIONS = --owner=0 --group=0 --numeric-owner --sort=name --mode=go+u,go-w --format=ustar
I'm trying this patch:
--- a/Makefile.in +++ b/Makefile.in @@ -648,8 +648,9 @@ distdir: $(DISTFILES) mkdir "$$sd" && $(MAKE) -C $$d distdir="`cd $$sd && pwd`" $@ ; \ done
+TAR_OPTIONS = --owner=0 --group=0 --numeric-owner --sort=name --mode=go+u,go-w --format=ustar dist: distdir - tar cf - $(distdir) | gzip -c >$(distdir).tar.gz + TAR_OPTIONS="$(TAR_OPTIONS)" tar cf - $(distdir) | gzip -c >$(distdir).tar.gz rm -rf $(distdir)
rm_distcheck = test ! -d distcheck-tmp \
My thinking is that
* I want it to be an environment variable, so that it is ignored if tar happens to not be GNU tar (or does other tar programs also use TAR_OPTIONS? Is there some GNU-specific environment variable I could use?)
* I want to set it locally, rather than using the export feature of make, to not have it affect other uses of tar (e.g., it's also used for unpacking in the distcheck target).
* It seems the --no-name option to gzip is not needed when gzip operates in a pipe.
Does that make sense?
I also wonder if there's some *easy* way to fix timestamps. Currently, the dist/distcheck targets copies files into $(distdir), and timestamps will then be the time that copying happens for each file. It's tempting to try to get a timestamp from git, but I prefer of the make rules don't depend on $(srcdir) being a git checkout (e.g., in the make distcheck case, that won't be the case). So I'd need some other source, like a time inside the contents or metadata of some specific file.
Mostly unrelated to this change, this also makes me think about the timestamps of the various stamp files included in the tarball (e.g., stamp-h.in). The make rules dealing with them make my head spin a bit, but I worry that if files get timestamps based on time of copy, they may get in the wrong order so that make from a tarball will want to run autoheader, which defeats the purpose of including that stamp file and config.h.in in the tarball in the first place. Forcing all files to have the same timestamp would probably solve that problem too.
Regards, /Niels
nettle-bugs@lists.lysator.liu.se