Forum: >>> Magnum BBS <<<

[gentoo-dev] proposal: use only one hash function in manifest files

From Jason A. Donenfeld@21:1/5 to All on Tue Apr 5 01:50:02 2022

Hi,

I'd like to propose the following for portage:

- Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
- Only generate and parse one hash function in Manifest files
- Remove support for multiple hash functions

In other words, what are we actually getting by having _both_ SHA2-512
and BLAKE2b for every file in every Manifest? It's not about file
integrity, since certainly a single hash handles that use case fine.
And it's not about security either, since for that we use gpg
signatures, and gpg signatures are carried out over a _single_ hash of
the plain text being hashed, so the security of the system reduces to
breaking SHA2-512 anyway. So, if it's not about file integrity and
it's not about security, what is it about?
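
For anyone who hasn't looked inside a Manifest recently: as far as I understand the current format, each distfile gets one DIST line carrying the file name, the size in bytes, and then alternating algorithm-name / hex-digest pairs. Below is a minimal Python sketch of reading such a line. `DistEntry` and `parse_dist_line` are names made up for this illustration, not Portage APIs, and the digests in the usage lines are placeholders rather than real hashes.

```python
from typing import Dict, NamedTuple


class DistEntry(NamedTuple):
    filename: str
    size: int
    hashes: Dict[str, str]  # algorithm name -> hex digest


def parse_dist_line(line: str) -> DistEntry:
    """Parse 'DIST <name> <size> <ALG> <hex> [<ALG> <hex> ...]'."""
    fields = line.split()
    # "DIST", name and size are three fields; each hash adds two more,
    # so a well-formed line has an odd field count of at least five.
    if len(fields) < 5 or fields[0] != "DIST" or len(fields) % 2 == 0:
        raise ValueError(f"not a well-formed DIST line: {line!r}")
    pairs = fields[3:]
    hashes = dict(zip(pairs[0::2], pairs[1::2]))
    return DistEntry(fields[1], int(fields[2]), hashes)


# Usage with placeholder digests (assumption: two hashes per entry, as today).
example = "DIST example-1.0.tar.gz 1024 BLAKE2B " + "ab" * 64 + " SHA512 " + "cd" * 64
entry = parse_dist_line(example)
print(entry.filename, entry.size, sorted(entry.hashes))
```

Under the proposal, the `hashes` mapping would simply hold one entry instead of two.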

I don't really care which one we use, so long as it's not already
broken or too obscure/new. So in other words, any one of SHA2-256,
SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
pick one and roll with it?

Jason

PS: there _is_ a good reason for recording the file size in Manifest
files as we do now: it's quicker to compare sizes on large files than
it is to read and hash the whole thing, so this gives us a "free" way
of noticing quick corruption.
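
To make the "size first, hash second" ordering concrete, here is a minimal Python sketch. It assumes a single SHA-512 digest per file; `verify_distfile` is a hypothetical helper written for this example and is not how Portage structures its checks.

```python
import hashlib
import os


def verify_distfile(path: str, expected_size: int, expected_sha512: str) -> bool:
    """Return True only if both the size and the SHA-512 digest match."""
    # Cheap check first: one stat() call, no file contents are read.
    if os.path.getsize(path) != expected_size:
        return False
    # Only now pay for reading and hashing the whole file.
    hasher = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == expected_sha512
```

A truncated download is rejected by the size comparison alone, so the expensive read only happens for files that are at least plausibly intact.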

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From John Helmert III@21:1/5 to Jason A. Donenfeld on Tue Apr 5 03:50:01 2022

I don't really have any strong opinion, but I'll note this was
discussed here last year, too:

https://archives.gentoo.org/gentoo-dev/message/a51ef62765b577dccfde67d5d2d727ae

On Tue, Apr 05, 2022 at 01:41:50AM +0200, Jason A. Donenfeld wrote:

Hi,

I'd like to propose the following for portage:

- Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
- Only generate and parse one hash function in Manifest files
- Remove support for multiple hash functions

In other words, what are we actually getting by having _both_ SHA2-512
and BLAKE2b for every file in every Manifest? It's not about file
integrity, since certainly a single hash handles that use case fine.
And it's not about security either, since for that we use gpg
signatures, and gpg signatures are carried out over a _single_ hash of
the plain text being hashed, so the security of the system reduces to
breaking SHA2-512 anyway. So, if it's not about file integrity and
it's not about security, what is it about?

I don't really care which one we use, so long as it's not already
broken or too obscure/new. So in other words, any one of SHA2-256,
SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
pick one and roll with it?

Jason

PS: there _is_ a good reason for recording the file size in Manifest
files as we do now: it's quicker to compare sizes on large files than
it is to read and hash the whole thing, so this gives us a "free" way
of noticing quick corruption.


From Jason A. Donenfeld@21:1/5 to Jason A. Donenfeld on Tue Apr 5 15:40:01 2022

To move things forward with something more concrete:

On 4/5/22, Jason A. Donenfeld <[email protected]> wrote:

Hi,

I'd like to propose the following for portage:

- Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
- Only generate and parse one hash function in Manifest files
- Remove support for multiple hash functions

[...]
I don't really care which one we use, so long as it's not already
broken or too obscure/new. So in other words, any one of SHA2-256,
SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
pick one and roll with it?

As you might have realized from my work on other projects, I like
BLAKE2 a lot. However, I think there are two strong reasons for going
with SHA512 exclusively here:

- GPG signatures are already over the SHA512 of the plain text, so
the security of the system already reduces to that. By choosing
SHA512, we don't add more risk, whilst choosing something else means
we're in trouble if either one has a problem.
- Other package managers use SHA512 in their recipes, so it makes it
easier to compare tarball checksums.

The principal advantage of BLAKE2b is 64-bit speed, but SHA512
performs okay enough in that regard anyway.

Therefore, to amend my proposal:

- Use SHA512 as the Manifest hash.

Any objections?

Jason


From Michał Górny@21:1/5 to Jason A. Donenfeld on Tue Apr 5 16:50:01 2022

On Tue, 2022-04-05 at 01:41 +0200, Jason A. Donenfeld wrote:

Hi,

I'd like to propose the following for portage:

- Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
- Only generate and parse one hash function in Manifest files
- Remove support for multiple hash functions

In other words, what are we actually getting by having _both_ SHA2-512
and BLAKE2b for every file in every Manifest? It's not about file
integrity, since certainly a single hash handles that use case fine.
And it's not about security either, since for that we use gpg
signatures, and gpg signatures are carried out over a _single_ hash of
the plain text being hashed, so the security of the system reduces to
breaking SHA2-512 anyway. So, if it's not about file integrity and
it's not about security, what is it about?

If you mean "remove entirely", then that's a bad idea. While
the original reasons for multiple hash functions might have been, erm,
not exactly correct, the dual-hash situation is needed for transitional
periods. Particularly because we have a number of fetch-restricted
packages where we simply need to wait for someone with the distfile to
rehash them (or eventually remove them, if we can't get a new hash).

I don't really care which one we use, so long as it's not already
broken or too obscure/new. So in other words, any one of SHA2-256,
SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
pick one and roll with it?

Back when we added BLAKE2b, the idea was to eventually remove SHA512
(the previous hash). However, this was rejected afterwards.

PS: there _is_ a good reason for recording the file size in Manifest
files as we do now: it's quicker to compare sizes on large files than
it is to read and hash the whole thing, so this gives us a "free" way
of noticing quick corruption.

The primary use of knowing the file size is to know whether to try to
resume fetching.
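
As a rough illustration of how a recorded size can drive that decision, here is a small Python sketch. The function name `fetch_action` and the four action labels are invented for this example; it shows the idea only and is not a description of Portage's actual fetch logic.

```python
from typing import Optional


def fetch_action(local_size: Optional[int], manifest_size: int) -> str:
    """Decide what to do with a (possibly partial) local distfile.

    local_size is None when no local file exists yet.
    """
    if local_size is None:
        return "fetch"    # nothing on disk, start from scratch
    if local_size < manifest_size:
        return "resume"   # partial download, try to continue it
    if local_size == manifest_size:
        return "verify"   # complete by size, now check the hash
    return "refetch"      # larger than expected, cannot be right
```

Without the size in the Manifest, a partial file and a complete one could only be told apart by hashing.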

--
Best regards,
Michał Górny


From Ulrich Mueller@21:1/5 to All on Tue Apr 5 16:20:01 2022

On Tue, 05 Apr 2022, Jason A Donenfeld wrote:

- GPG signatures are already over the SHA512 of the plain text, so
the security of the system already reduces to that. By choosing
SHA512, we don't add more risk, whilst choosing something else means
we're in trouble if either one has a problem.

The OpenPGP signature is for the top-level Manifest only. In case there
was any trouble, it would be trivial to change the hash algorithm used
for this.

In contrast to that, updating the hashes in all Manifest files is a
huge pain in the neck. Basically, you must download all distfiles, which
is not trivial. For example, think of fetch-restricted files. (I've
helped twice with updating Manifest files, so I believe I know what I'm
talking about. :)

I think that the benefit of dropping one of the hashes would be close to
zero, especially if we would drop the faster one.

Ulrich


From Jason A. Donenfeld@21:1/5 to [email protected] on Tue Apr 5 17:20:01 2022

Hi Ulrich,

On Tue, Apr 5, 2022 at 4:10 PM Ulrich Mueller <[email protected]> wrote:

The OpenPGP signature is for the top-level Manifest only. In case there
was any trouble, it would be trivial to change the hash algorithm used
for this.

In contrast to that, updating the hashes in all Manifest files is a
huge pain in the neck. Basically, you must download all distfiles, which
is not trivial. For example, think of fetch-restricted files. (I've
helped twice with updating Manifest files, so I believe I know what I'm
talking about. :)

The thing is, if SHA-512 is broken, that will really be the least of
our concerns. TLS itself will be broken....

Jason


From Jason A. Donenfeld@21:1/5 to All on Tue Apr 5 20:50:02 2022

Hi Michal,

On Tue, Apr 05, 2022 at 02:49:12PM +0000, Michał Górny wrote:

I don't really care which one we use, so long as it's not already
broken or too obscure/new. So in other words, any one of SHA2-256,
SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
pick one and roll with it?

Back when we added BLAKE2b, the idea was to eventually remove SHA512
(the previous hash). However, this was rejected afterwards.

Maybe we should pick that back up? Do you remember the ultimate
rationale for rejecting it? Do you suppose those are still valid?

Jason


From Matt Turner@21:1/5 to [email protected] on Tue Apr 5 21:00:01 2022

On Tue, Apr 5, 2022 at 11:47 AM Jason A. Donenfeld <[email protected]> wrote:

Hi Michal,

On Tue, Apr 05, 2022 at 02:49:12PM +0000, Michał Górny wrote:

I don't really care which one we use, so long as it's not already
broken or too obscure/new. So in other words, any one of SHA2-256,
SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
pick one and roll with it?

Back when we added BLAKE2b, the idea was to eventually remove SHA512
(the previous hash). However, this was rejected afterwards.

Maybe we should pick that back up? Do you remember the ultimate
rationale for rejecting it? Do you suppose those are still valid?

(Somehow you broke threading)

This was a topic in June 2021's Council meeting:

https://gitweb.gentoo.org/sites/projects/council.git/tree/meeting-logs/20210613-summary.txt#n33
https://gitweb.gentoo.org/sites/projects/council.git/tree/meeting-logs/20210613.txt#n137

Basically there was no great reason presented for making the change
and some (IMO specious) reasons for keeping multiple hashes. I don't
think anyone felt strongly enough about removing one hash to fight for
it.


From Jason A. Donenfeld@21:1/5 to [email protected] on Tue Apr 5 21:40:02 2022

Hi Matt,

On Tue, Apr 5, 2022 at 8:58 PM Matt Turner <[email protected]> wrote:

This was a topic in June 2021's Council meeting:

https://gitweb.gentoo.org/sites/projects/council.git/tree/meeting-logs/20210613-summary.txt#n33
https://gitweb.gentoo.org/sites/projects/council.git/tree/meeting-logs/20210613.txt#n137

Basically there was no great reason presented for making the change
and some (IMO specious) reasons for keeping multiple hashes. I don't
think anyone felt strongly enough about removing one hash to fight for
it.

Huh. Something not brought up there or https://bugs.gentoo.org/784710
is the fact that the _security_ of the system reduces to SHA-512 as
used by our GPG signatures.

By the way, we're not currently _checking_ two hash functions during
src_prepare(), are we?

Jason


From Ulrich Mueller@21:1/5 to All on Tue Apr 5 22:20:01 2022

On Tue, 05 Apr 2022, Jason A Donenfeld wrote:

Huh. Something not brought up there or https://bugs.gentoo.org/784710
is the fact that the _security_ of the system reduces to SHA-512 as
used by our GPG signatures.

The hash algorithm would be the least of my concerns about the security
of these signatures.

IIUC, the secret signing key is stored on a machine that is connected to
the network (Infra, please correct me if I'm wrong). So there are other
more likely attack vectors than a preimage attack on a 512 bit hash
function.

Also: https://xkcd.com/538/ :)

Ulrich


From Matt Turner@21:1/5 to [email protected] on Tue Apr 5 22:40:02 2022

On Tue, Apr 5, 2022 at 12:30 PM Jason A. Donenfeld <[email protected]> wrote:

By the way, we're not currently _checking_ two hash functions during
src_prepare(), are we?

I don't know, but the hashes are definitely checked before src_prepare().


From Jonas Stein@21:1/5 to All on Tue Apr 5 23:20:01 2022

Hi

I'd like to propose the following for portage:

- Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
- Only generate and parse one hash function in Manifest files
- Remove support for multiple hash functions

No, this has no benefit.

In other words, what are we actually getting by having _both_ SHA2-512
and BLAKE2b for every file in every Manifest?

Implementations are often broken and we have to expect zero day attacks
on hashes and on signatures. Hence it does not hurt to have a second hash.

It is very likely that we can not trust in X for a while in the next
years, but it is very unlikely that two different implementations are
affected.

Additionally calculating a second hash does not cost anything.
This was also the outcome of the discussion some time ago here.

--
Best,
Jonas


From Jason A. Donenfeld@21:1/5 to [email protected] on Tue Apr 5 23:40:01 2022

Hi Ulrich,

On Tue, Apr 5, 2022 at 10:15 PM Ulrich Mueller <[email protected]> wrote:

On Tue, 05 Apr 2022, Jason A Donenfeld wrote:

Huh. Something not brought up there or https://bugs.gentoo.org/784710
is the fact that the _security_ of the system reduces to SHA-512 as
used by our GPG signatures.

The hash algorithm would be the least of my concerns about the security
of these signatures.

IIUC, the secret signing key is stored on a machine that is connected to
the network (Infra, please correct me if I'm wrong). So there are other
more likely attack vectors than a preimage attack on a 512 bit hash
function.

You missed the point, which is that having two hashes, SHA512 and
BLAKE2b, doesn't actually help anything, since an attacker only must
attack SHA512 in order to break the signature system, which is
actually what we're relying on for security. Yes there are other
attacks too on the signature system. But in terms of hashing, my point
is that adding an additional hash to manifest files to the one used by
the signature doesn't help anything from a security perspective, since
if you have an attack on the signature's hash, then no additional
hashing is going to actually help.

Jason


From Jason A. Donenfeld@21:1/5 to [email protected] on Tue Apr 5 23:40:01 2022

Hi Jonas,

On Tue, Apr 5, 2022 at 11:20 PM Jonas Stein <[email protected]> wrote:

In other words, what are we actually getting by having _both_ SHA2-512
and BLAKE2b for every file in every Manifest?

Implementations are often broken and we have to expect zero day attacks
on hashes and on signatures. Hence it does not hurt to have a second hash.

It is very likely that we can not trust in X for a while in the next
years, but it is very unlikely that two different implementations are
affected.

This is the part that doesn't really make any sense to me. The
security of the system reduces to the SHA512 used by those GPG
signatures. If SHA512 breaks, the fact that our Manifest files also
use BLAKE2b isn't going to help us, since an attacker could
presumably, in that case, forge the signatures that we're using as a
root of trust. I don't see what a second hash buys us from a security
perspective here. What attack model do you have where it makes sense?

Additionally calculating a second hash does not cost anything.

How is that possible? Doesn't calculating two things always cost more
than calculating one? If what you actually mean is, "performance is
not important," we can discuss that, but it sounds like you're saying
that there's zero performance impact. How does that work exactly? Is
only one calculated at emerge time or something clever like that?

Jason


From Jason A. Donenfeld@21:1/5 to [email protected] on Tue Apr 5 23:50:01 2022

Hi Matt,

On Tue, Apr 5, 2022 at 10:38 PM Matt Turner <[email protected]> wrote:

On Tue, Apr 5, 2022 at 12:30 PM Jason A. Donenfeld <[email protected]> wrote:

By the way, we're not currently _checking_ two hash functions during
src_prepare(), are we?

I don't know, but the hashes are definitely checked before src_prepare().

Er, during the builtin fetch phase. Anyway, you know what I meant. :)

Anyway, looking at the portage source code, to answer my own question,
it looks like the file is actually being read twice and both hashes
computed. I would have at least expected an optimization like:

    hash1_init(&hash1);
    hash2_init(&hash2);
    for chunk in file:
        hash1_update(&hash1, chunk);
        hash2_update(&hash2, chunk);
    hash1_final(&hash1, out1);
    hash2_final(&hash2, out2);

But actually what's happening is the even less efficient:

    hash1_init(&hash1);
    for chunk in file:
        hash1_update(&hash1, chunk);
    hash1_final(&hash1, out1);
    hash2_init(&hash2);
    for chunk in file:
        hash2_update(&hash2, chunk);
    hash2_final(&hash2, out2);

So the file winds up being open and read twice. For huge tarballs like
chromium or libreoffice...

But either way you do it - the missed optimization above or the
unoptimized reality below - there's still twice as much work being
done. This is all unless I've misread the source code, which is
possible, so if somebody knows this code well and I'm wrong here,
please do speak up.
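
For illustration, the single-pass variant described above could look like the following in Python, using only the standard hashlib module. This is a sketch and is not taken from the Portage source; `hash_file_once`, its default algorithm list and its chunk size are all assumptions made for this example.

```python
import hashlib


def hash_file_once(path, algorithms=("sha512", "blake2b"), chunk_size=1 << 20):
    """Read the file a single time and feed every chunk to every hasher."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

This removes the second open-and-read of the file, but the CPU cost of running two hash functions over every byte remains, which is the point made above.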

Jason


From Sam James@21:1/5 to All on Wed Apr 6 02:10:01 2022

On 5 Apr 2022, at 22:13, Jonas Stein <[email protected]> wrote:

Hi

I'd like to propose the following for portage:
- Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
- Only generate and parse one hash function in Manifest files
- Remove support for multiple hash functions

No, this has no benefit.

Which part has no benefit? I could see a case (although I don't think it's a super strong one)
for keeping support for multiple hash types in Portage, but only 1 in a Manifest.

I think Jason's made a fair case for dropping it.

In other words, what are we actually getting by having _both_ SHA2-512
and BLAKE2b for every file in every Manifest?

Implementations are often broken and we have to expect zero day attacks on hashes and on signatures. Hence it does not hurt to have a second hash.

I don't think this is the case. They're not broken often, it's a very, very big deal when they are, and we'd also have far bigger problems in such a case (as already pointed out, TLS would be an issue, but also GPG signatures, git commit hashes, ...).

It is very likely that we can not trust in X for a while in the next years, but it is very unlikely that two different implementations are affected.

I don't think it is likely that e.g. SHA512 will be broken in the next few years, no, but if it is going to be, we have far bigger issues and we'd need to have double algorithms in our whole stack, which we don't have.

Additionally calculating a second hash does not cost anything.

It does have a cost at both Manifest-generation time and emerge-time.

Thanks,
sam


From Jason A. Donenfeld@21:1/5 to [email protected] on Wed Apr 6 02:20:01 2022

Hi Sam,

On Wed, Apr 6, 2022 at 2:02 AM Sam James <[email protected]> wrote:

This matches my views and recollection. We could revisit it
if there was a passionate advocate (which it looks like there may well be).

While I wasn't against it before, I was sort of ambivalent given
we had no strong reason to, but I'm more willing now given
we're also cleaning out other Portage cruft at the same time.

I think actually the argument I'm making this time might be subtly
different from the motions that folks went through last year.
Specifically, the idea last year was to switch to using BLAKE2b only.
I think what the arguments I'm making now point to is switching to
SHA2-512 only.

There are two reasons for this.

1) Security: since the GPG signatures use SHA2-512, then the whole
system breaks if SHA2-512 breaks. If we choose BLAKE2b as our only
hash, then if either SHA2-512 or BLAKE2b break, then the system
breaks. But if we choose SHA2-512 as our only hash, then we only need
to worry about SHA2-512 breaking.

2) Comparability: other distros use SHA2-512, as well as various
upstreams, which means we can compare our hashes to theirs easily.

A reason why some people might prefer BLAKE2b over SHA2-512 is a
performance improvement. However, seeing as right now we're opening
the file, reading it, computing BLAKE2b, closing the file, opening the
file again, reading it again, computing SHA2-512, closing the file, I
don't think performance is actually something people care about. Seen
differently, removing either one of them will already give us a
performance "boost" of sorts.

Jason


From Sam James@21:1/5 to All on Wed Apr 6 02:30:01 2022

On 6 Apr 2022, at 01:15, Jason A. Donenfeld <[email protected]> wrote:

Hi Sam,

On Wed, Apr 6, 2022 at 2:02 AM Sam James <[email protected]> wrote:

This matches my views and recollection. We could revisit it
if there was a passionate advocate (which it looks like there may well be).

While I wasn't against it before, I was sort of ambivalent given
we had no strong reason to, but I'm more willing now given
we're also cleaning out other Portage cruft at the same time.

I think actually the argument I'm making this time might be subtly
different from the motions that folks went through last year.
Specifically, the idea last year was to switch to using BLAKE2b only.
I think what the arguments I'm making now point to is switching to
SHA2-512 only.

Oh, right. I see!

(Aside: I should've been clearer in my first email, what I meant was: I'm
fine with revisiting this, but I remember us feeling kind of lacklustre
because even the proposer (mgorny) ended up not having the oomph to push it
through given (small) opposition. I don't recall who had the stiff opposition
at the time, but I do recall it was only small, and nobody really felt like
it was worth the hassle.

The overall Council feeling was "meh" without some momentum.)

There are two reasons for this.

1) Security: since the GPG signatures use SHA2-512, then the whole
system breaks if SHA2-512 breaks. If we choose BLAKE2b as our only
hash, then if either SHA2-512 or BLAKE2b break, then the system
breaks. But if we choose SHA2-512 as our only hash, then we only need
to worry about SHA2-512 breaking.

2) Comparability: other distros use SHA2-512, as well as various
upstreams, which means we can compare our hashes to theirs easily.

A reason why some people might prefer BLAKE2b over SHA2-512 is a
performance improvement. However, seeing as right now we're opening
the file, reading it, computing BLAKE2b, closing the file, opening the
file again, reading it again, computing SHA2-512, closing the file, I
don't think performance is actually something people care about. Seen
differently, removing either one of them will already give us a
performance "boost" of sorts.

I think this seems pretty reasonable and I don't have any objection to it.

2) is a nice point and it's something Robin raised last time around too.

Jason

best,
sam


From Rich Freeman@21:1/5 to [email protected] on Wed Apr 6 03:40:01 2022

On Tue, Apr 5, 2022 at 8:05 PM Sam James <[email protected]> wrote:

On 5 Apr 2022, at 22:13, Jonas Stein <[email protected]> wrote:

In other words, what are we actually getting by having _both_ SHA2-512
and BLAKE2b for every file in every Manifest?

Implementations are often broken and we have to expect zero day attacks on hashes and on signatures. Hence it does not hurt to have a second hash.

I don't think this is the case. They're not broken often, it's a very, very big deal when they are, and we'd also have far bigger problems in such a case (as already pointed out, TLS would be an issue, but also GPG signatures, git commit hashes, ...).

Our security fails currently if EITHER SHA2-512 or a hardened version
of SHA-1 are defeated. Our top gpg signature is bound to a git commit
record by SHA2-512, and the git commit record is bound to everything
else in the repository (including the manifest objects) by SHA-1,
because git hasn't transitioned away from that (as far as I'm aware it
is still a work in progress - the SHA-1 algorithm it uses is hardened
against known attacks).

That said, I think there is still an argument for having two hashes in
the manifests. If we have two independent manifests, then if either
SHA-1 or SHA2-512 are defeated all we need to do is update git+gpg to
the patched version (which no doubt would be rushed into a release
quickly), and then do a commit to the repo and sign it with the Gentoo
key. The new commit would have a full set of new hashes using a
secure hash function, and then a back-reference to the previous commit
using SHA-1 (assuming we didn't rebase the entire tree and lose all
our historical gpg signatures - we might consider creating a new repo
and saving a historical one). That would have new hashes all the way
from the top commit down to all the objects it references, so the top
commit would now be secure. When signed with an updated gpg the
signature would be attached with a secure hash. So now we're secure
again. If we're concerned about old signatures getting recycled in
preimage attacks we could of course revoke the key and issue a new
one.

What we don't need to do is redo all the manifests, and that is
important because we don't actually have the ability to redo those
centrally. Anybody can add a commit to the repo and re-sign it, but
we'd need all the maintainers to go through and generate new manifests
for anything that is fetch-restricted, or aggressively treeclean.

So it isn't that having two hashes can't fail, but rather that if it
does fail it is easier to recover.
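
The recovery argument can be sketched in a few lines of Python: if every entry records two digests, a verifier can stop trusting one algorithm and keep checking against the other without anyone re-downloading distfiles. `verify_with_trusted` and the `trusted` set are hypothetical names for this illustration, and the lowercase hashlib algorithm names stand in for the uppercase names used in Manifest files.

```python
import hashlib
from typing import Dict, Set


def verify_with_trusted(data: bytes, recorded: Dict[str, str], trusted: Set[str]) -> bool:
    """Check data against only those recorded digests whose algorithm is still trusted."""
    usable = {alg: digest for alg, digest in recorded.items() if alg in trusted}
    if not usable:
        # With a single recorded hash, distrusting it lands here:
        # the file has to be fetched again and rehashed.
        raise ValueError("no trusted hash recorded for this file")
    return all(
        hashlib.new(alg, data).hexdigest() == digest
        for alg, digest in usable.items()
    )
```

With both digests recorded, dropping "sha512" from `trusted` still leaves "blake2b" to check against. With only one recorded, the same change raises for every entry until someone rehashes the file.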

It is very likely that we can not trust in X for a while in the next years, but it is very unlikely that two different implementations are affected.

I don't think it is likely that e.g. SHA512 will be broken in the next few years, no, but if it is going to be, we have far bigger issues and we'd need to have double algorithms in our whole stack, which we don't have.

I agree that this is an unlikely scenario, so it is a judgement call
as to whether the ease of recovery in the event of a failure is worth
the cost to maintain the second hash. I agree that we'd need double
algorithms in the whole stack to prevent a failure, but in the current
state we do have advantages for recovering from a failure after the
fact.

It seems that the likely scenario is that we get advance warning of
weaknesses in a hash function, but without a practical exploit being
readily available. In that case we could do a more orderly
transition. We'd still save time with the double hashed manifests,
and whether this makes a difference is hard to say.

Additionally calculating a second hash does not cost anything.

It does have a cost at both Manifest-generation time and emerge-time.

This is certainly true, though if the current algorithm is reading the
files twice we could at least fix that.

I don't really have a strong opinion here. I just wanted to point out
the recovery benefit of having two hashes on just the manifests, given
that it isn't easy to access all the distfiles. I also wanted to
point out that we have SHA-1 exposure today, at least in git.

--
Rich


From Ulrich Mueller@21:1/5 to All on Wed Apr 6 06:20:01 2022

On Wed, 06 Apr 2022, Jason A Donenfeld wrote:

I think actually the argument I'm making this time might be subtly
different from the motions that folks went through last year.
Specifically, the idea last year was to switch to using BLAKE2b only.
I think what the arguments I'm making now point to is switching to
SHA2-512 only.

Still, I think that if we drop one of the hashes then we should proceed
with the original plan. That is, keep the more modern BLAKE2B (which was
a participant of the SHA-3 competition [1]) and drop the older SHA512.

Back then, we had the choice between adding SHA3_512 and BLAKE2B, and we
preferred BLAKE2B for performance reasons.

I also think that the argument about the OpenPGP signature isn't very
strong, because replacing that signature by another one using a
different hash is trivial. As I said before, replacing all Manifest
files in the tree isn't.

Ulrich

[1] https://en.wikipedia.org/wiki/NIST_hash_function_competition


From Jason A. Donenfeld@21:1/5 to Ulrich Mueller on Wed Apr 6 13:50:02 2022

Hi Ulrich,

On 4/6/22, Ulrich Mueller <[email protected]> wrote:

On Wed, 06 Apr 2022, Jason A Donenfeld wrote:

I think actually the argument I'm making this time might be subtly
different from the motions that folks went through last year.
Specifically, the idea last year was to switch to using BLAKE2b only.
I think what the arguments I'm making now point to is switching to
SHA2-512 only.

Still, I think that if we drop one of the hashes then we should proceed
with the original plan. That is, keep the more modern BLAKE2B (which was
a participant of the SHA-3 competition [1]) and drop the older SHA512.

Why? Then we're dependent on two things, either of which could break,
rather than one.

To be clear, I'm a big fan of BLAKE2 myself and have used it in a
number of projects. And either one breaking would be a big deal. So
maybe it doesn't really matter that much. But strictly formally, it
seems like SHA512 is the most sound decision? I spelled out two
reasons for that to Sam; if you still disagree, maybe you can address
why you think my two reasons aren't very meaningful?

I also think that the argument about the OpenPGP signature isn't very
strong, because replacing that signature by another one using a
different hash is trivial. As I said before, replacing all Manifest
files in the tree isn't.

I looked into changing gnupg to use BLAKE2b for signatures, but it
doesn't appear to be supported. It's in gcrypt but not gpg. From
--version: `Hash: SHA1, RIPEMD160, SHA256, SHA384, SHA512, SHA224`.
Since my argument rests on minimizing probability of a break, changing
the signature hash algo after it's broken doesn't help with much, so I
think this is something we'd want to happen now, rather than later, if
we're to use BLAKE2b exclusively.

I could potentially send a patch to gnupg for this if you want to take
the long path. But also: don't forget there's also the
interoperability argument that favors SHA512 too.

Jason


From Jason A. Donenfeld@21:1/5 to [email protected] on Wed Apr 6 19:10:02 2022

Hi Ulrich,

On Wed, Apr 6, 2022 at 6:38 PM Ulrich Mueller <[email protected]> wrote:

Why? Then we're dependent on two things, either of which could break, rather than one.

See? If either of these should happen, then we'll be happy that we still
have both hashes in our Manifest files.

OTOH, if that argument is not relevant because the probability of both
is close to zero, then (from a security POV) it doesn't matter which of
the two hashes we remove.

No, you're still missing the point.

If SHA-512 breaks, the security of the system fails, regardless of
what change we make. This is because GnuPG uses SHA-512 for its
signatures.

So I'll spell out the different possibilities:

1) GPG uses SHA-512. Manifest uses SHA-512 and BLAKE2b.
1a) Possibility: SHA-512 is broken. Result: system broken.
1b) Possibility: BLAKE2b is broken. Result: nothing.

2) GPG uses SHA-512. Manifest uses SHA-512.
2a) Possibility: SHA-512 is broken. Result: system broken.
2b) Possibility: BLAKE2b is broken. Result: nothing.

3) GPG uses SHA-512. Manifest uses BLAKE2b.
3a) Possibility: SHA-512 is broken. Result: system broken.
3b) Possibility: BLAKE2b is broken. Result: system broken.

See how from a security perspective, (2) is not worse than (1), but
(3) is worse than both (1) and (2)?
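
The case analysis above can be sketched as a toy model in Python. This is only an illustration under a stated assumption, not a real threat model: it assumes that an attacker succeeds if the signature hash is broken, or if every hash recorded in the Manifest is broken. The names `system_broken`, `CONFIGS` and `outcomes` are made up for this example.

```python
# Toy model of the three configurations discussed above.
# Assumption (hypothetical, for illustration only): the system is "broken"
# if the hash used by the signature is broken, or if ALL hashes recorded
# in the Manifest are broken.

def system_broken(gpg_hash, manifest_hashes, broken):
    """Return True if the set of broken hashes defeats this configuration."""
    if gpg_hash in broken:
        return True
    return all(h in broken for h in manifest_hashes)

# Configuration number -> (signature hash, Manifest hashes)
CONFIGS = {
    1: ("SHA512", ("SHA512", "BLAKE2B")),
    2: ("SHA512", ("SHA512",)),
    3: ("SHA512", ("BLAKE2B",)),
}

def outcomes():
    """Evaluate each configuration against a break of each single hash."""
    table = {}
    for num, (gpg_hash, manifest_hashes) in CONFIGS.items():
        for broken_hash in ("SHA512", "BLAKE2B"):
            table[(num, broken_hash)] = system_broken(
                gpg_hash, manifest_hashes, {broken_hash}
            )
    return table

if __name__ == "__main__":
    for (num, broken_hash), result in sorted(outcomes().items()):
        verdict = "system broken" if result else "nothing"
        print(f"({num}) {broken_hash} is broken -> {verdict}")
```

Note that this model only covers whether a break defeats the system; it says nothing about how hard it is to recover afterwards, which is a separate question.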

Jason


From Ulrich Mueller@21:1/5 to All on Wed Apr 6 18:40:01 2022

On Wed, 06 Apr 2022, Jason A Donenfeld wrote:

Why? Then we're dependent on two things, either of which could break,
rather than one.

See? If either of these should happen, then we'll be happy that we still
have both hashes in our Manifest files.

OTOH, if that argument is not relevant because the probability of both
is close to zero, then (from a security POV) it doesn't matter which of
the two hashes we remove.

Ulrich


From Robin H. Johnson@21:1/5 to Jason A. Donenfeld on Wed Apr 6 19:30:01 2022

On Wed, Apr 06, 2022 at 02:15:02AM +0200, Jason A. Donenfeld wrote:

2) Comparability: other distros use SHA2-512, as well as various
upstreams, which means we can compare our hashes to theirs easily.

Can we expand on this specific thread for a moment?

I was the author of GLEP59 about changing the Manifest hashes, and I
noted at the time, with references, that the effective strength of a set
of hashes is only that of the strongest hash.

One of my regrets from GLEP59 is that it's made it harder for use cases
outside of the normal user distfile workflow.

The use case that impacted me the most was being able to compare what our
distfiles were over time against external sources, esp. if the file goes
missing or was fetch-restricted and we can't produce a new hash of it.
Maybe upstream only ever published SHA1/SHA256, and we only ever
calculated SHA512/BLAKE2b on the file. Since we never had hashes from
both sides at the same time, we cannot prove it was the same file.

We need to be able to ship one or more hashes to users, for the specific
use case of validating the distfiles they download.

As a developer, I'd like to be able to track the other hashes for a
file, without forcing ourselves to retain the file. This might be to
compare with upstream published hashes, or to compare with other
distros.
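
A minimal Python sketch of the comparison problem described above, assuming hash records are kept as plain dictionaries mapping algorithm name to hex digest. The helper names `record_digests` and `same_file` are hypothetical and are not part of Portage; only `hashlib` from the standard library is used.

```python
import hashlib

def record_digests(data, algorithms=("sha1", "sha256", "sha512", "blake2b")):
    """Compute several digests of the same bytes, keyed by algorithm name."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}

def same_file(ours, theirs):
    """Compare two hash records.

    Returns True or False when at least one algorithm is shared, and None
    when the records have no algorithm in common, in which case nothing
    can be proven without the original file.
    """
    shared = set(ours) & set(theirs)
    if not shared:
        return None
    return all(ours[name] == theirs[name] for name in shared)

if __name__ == "__main__":
    data = b"example distfile contents"
    ours = record_digests(data, ("sha512", "blake2b"))   # what we recorded
    upstream = record_digests(data, ("sha1", "sha256"))  # what upstream published
    print(same_file(ours, upstream))                  # no shared algorithm
    print(same_file(record_digests(data), upstream))  # shared algorithms exist
```

The first comparison is the situation described above: both records are of the same file, but without a shared algorithm or the file itself there is no way to show it.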

In fact it would be really nice to have a semi-automated pipeline to
plug in signed upstream hashes to our Manifests, and make it possible to
prove our new SHA512/BLAKE2B hash was taken over the correct input in
the first place, and there wasn't any subtle supply-chain attack early
in the packaging process.

Where would those hashes go? They don't need to be in the Manifest, or
at the very least they don't need to be distributed via rsync to users
(it only costs a small amount of bytes to do so).

Where else could they go?
- Commit messages could work.
- Git notes to a lesser degree.
- alternate repos?

A reason why some people might prefer BLAKE2b over SHA2-512 is a
performance improvement. However, seeing as right now we're opening
the file, reading it, computing BLAKE2b, closing the file, opening the
file again, reading it again, computing SHA2-512, closing the file, I
don't think performance is actually something people care about. Seen
differently, removing either one of them will already give us a
performance "boost" of sorts.

Or just only verifying the "strongest" hash gives you that boost.

I do want to check into the code that you pointed out, because I'm
really sure much older versions of Portage did the CORRECT thing of only reading the file in a single pass.

--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail : [email protected]
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136


From Jason A. Donenfeld@21:1/5 to Rich Freeman on Wed Apr 6 19:30:01 2022

Hi Rich,

On 4/6/22, Rich Freeman <[email protected]> wrote:

On Tue, Apr 5, 2022 at 8:05 PM Sam James <[email protected]> wrote:
Our security fails currently if EITHER SHA2-512 or a hardened version
of SHA-1 are defeated. Our top gpg signature is bound to a git commit
record by SHA2-512, and the git commit record is bound to everything
else in the repository (including the manifest objects) by SHA-1,
because git hasn't transitioned away from that (as far as I'm aware it
is still a work in progress - the SHA-1 algorithm it uses is hardened
against known attacks).

Sort of. The security between infra and users relies on SHA2-512. The
security between devs and infra relies on SHA-1. I guess the "full
system" depends on both, but I've been focused on the more likely
issue of a community-run mirror serving bogus files.

I agree that this is an unlikely scenario, so it is a judgement call
as to whether the ease of recovery in the event of a failure is worth
the cost to maintain the second hash. I agree that we'd need double algorithms in the whole stack to prevent a failure, but in the current
state we do have advantages for recovering from a failure after the
fact.

It seems that the likely scenario is that we get advance warning of weaknesses in a hash function, but without a practical exploit being
readily available. In that case we could do a more orderly
transition. We'd still save time with the double hashed manifests,
and whether this makes a difference is hard to say.

Yea I see this argument, but I don't quite buy it. Maintaining two
sets of hashes for the unlikely event that one gets broken AND we
absolutely cannot transition gradually to an unbroken
one seems rather overblown.

Jason


From Ulrich Mueller@21:1/5 to All on Wed Apr 6 20:00:01 2022

On Wed, 06 Apr 2022, Jason A Donenfeld wrote:

So I'll spell out the different possibilities:

1) GPG uses SHA-512. Manifest uses SHA-512 and BLAKE2b.
1a) Possibility: SHA-512 is broken. Result: system broken.
1b) Possibility: BLAKE2b is broken. Result: nothing.

2) GPG uses SHA-512. Manifest uses SHA-512.
2a) Possibility: SHA-512 is broken. Result: system broken.
2b) Possibility: BLAKE2b is broken. Result: nothing.

3) GPG uses SHA-512. Manifest uses BLAKE2b.
3a) Possibility: SHA-512 is broken. Result: system broken.
3b) Possibility: BLAKE2b is broken. Result: system broken.

See how from a security perspective, (2) is not worse than (1), but
(3) is worse than both (1) and (2)?

No it isn't. We can replace the top-level signature easily, but
replacing all Manifest hashes in the tree is hard (i.e. 1a and 3a are
trivial to fix, but 2a and 3b aren't).

I've said this multiple times now, so I'm out of here.

Ulrich


From Robin H. Johnson@21:1/5 to Jason A. Donenfeld on Wed Apr 6 19:40:01 2022

On Wed, Apr 06, 2022 at 07:06:30PM +0200, Jason A. Donenfeld wrote:

No, you're still missing the point.

If SHA-512 breaks, the security of the system fails, regardless of
what change we make. This is because GnuPG uses SHA-512 for its
signatures.

Question directly for you Jason, because you make a professional study
of this: does the type of breakage/successful attack against
SHA-512 matter?

E.g., is it possible that some type of attack would only work against the
Manifest entry, but NOT against the GPG signature's embedded SHA-512 (or
the opposite)?

The best hypothetical idea I had was that there exists some large
special input that lets an attacker reset the output to an arbitrary
hash after their malicious payload: but it wouldn't fit in the GPG
signature space.

So I'll spell out the different possibilities:
1) GPG uses SHA-512. Manifest uses SHA-512 and BLAKE2b.

score -1 + 0 = -1

2) GPG uses SHA-512. Manifest uses SHA-512.

score -1 + 0 = -1

3) GPG uses SHA-512. Manifest uses BLAKE2b.

score -1 + -1 = -2

See how from a security perspective, (2) is not worse than (1), but
(3) is worse than both (1) and (2)?

Yes, (2) is not worse than (1) for the overall security perspective.
That leaves the question of whether (1) has other benefits / value
propositions that make it worth more than (2). (see my other thread)

--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail : [email protected]
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136


From Rich Freeman@21:1/5 to [email protected] on Wed Apr 6 20:40:01 2022

On Wed, Apr 6, 2022 at 1:29 PM Jason A. Donenfeld <[email protected]> wrote:

Sort of. The security between infra and users relies on SHA2-512. The security between devs and infra relies on SHA-1. I guess the "full
system" depends on both, but I've been focused on the more likely
issue of a community-run mirror serving bogus files.

Well, that depends on how you're syncing the tree. If you're using
rsync then there is a signed manifest in the root, so I agree in that
case it is just SHA2-512. If you're syncing using git then the
manifests only reference distfiles, and the only link between the
commit and the tree/objects are their SHA-1 hashes until git adopts a
different hash function.
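
To illustrate how git ties content to SHA-1, here is a minimal Python sketch of how a blob object ID is derived, assuming a repository that uses git's default SHA-1 object format. The function name `git_blob_sha1` is made up for this example. Git itself uses a collision-detecting SHA-1 implementation, which is expected to give the same digest as plain SHA-1 for ordinary inputs.

```python
import hashlib

def git_blob_sha1(content: bytes) -> str:
    """Object ID of a git blob: SHA-1 over a 'blob <size>\\0' header plus content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()

if __name__ == "__main__":
    # The well-known ID of the empty blob.
    print(git_blob_sha1(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Tree and commit objects are hashed the same way over their own serialized form, so a signed commit reaches the Manifest files only through a chain of SHA-1 object IDs.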

Yea I see this argument, but I don't quite buy it. Maintaining two
sets of hashes for the unlikely event that one gets broken AND we
absolutely cannot transition gradually to an unbroken
one seems rather overblown.

It is very much a hand-waving judgement call. This is one of those
low cost, low risk, high reward situations IMO. The cost of
calculating hashes is fairly low (especially if done in a more sane
way). The odds it will ever have a benefit are low. If it does have
a benefit, it will be in a situation where the world is on fire and
we'll be very happy to not have to go verify a gazillion distfiles on
top of everything else we have to fix. I'll defer to those wiser than
me to make the call. :)

--
Rich


From Marek Szuba@21:1/5 to All on Thu Apr 7 17:30:06 2022

On 2022-04-06 19:34, Rich Freeman wrote:

This is one of those low cost, low risk, high reward situations IMO.

*puts on Council hat*

The above pretty much covers my own opinion on the subject.

--
Marecki


From Joshua Kinard@21:1/5 to Jason A. Donenfeld on Tue Apr 12 01:20:01 2022

On 4/5/2022 17:49, Jason A. Donenfeld wrote:

Hi Matt,

On Tue, Apr 5, 2022 at 10:38 PM Matt Turner <[email protected]> wrote:

On Tue, Apr 5, 2022 at 12:30 PM Jason A. Donenfeld <[email protected]> wrote:

By the way, we're not currently _checking_ two hash functions during
src_prepare(), are we?

I don't know, but the hash-checking is definitely checked before src_prepare().

Er, during the builtin fetch phase. Anyway, you know what I meant. :)

Anyway, looking at the portage source code, to answer my own question,
it looks like the file is actually being read twice and both hashes
computed. I would have at least expected an optimization like:

    hash1_init(&hash1);
    hash2_init(&hash2);
    for chunk in file:
        hash1_update(&hash1, chunk);
        hash2_update(&hash2, chunk);
    hash1_final(&hash1, out1);
    hash2_final(&hash2, out2);

But actually what's happening is the even less efficient:

    hash1_init(&hash1);
    for chunk in file:
        hash1_update(&hash1, chunk);
    hash1_final(&hash1, out1);
    hash2_init(&hash2);
    for chunk in file:
        hash2_update(&hash2, chunk);
    hash2_final(&hash2, out2);

So the file winds up being open and read twice. For huge tarballs like chromium or libreoffice...

But either way you do it - the missed optimization above or the
unoptimized reality below - there's still twice as much work being
done. This is all unless I've misread the source code, which is
possible, so if somebody knows this code well and I'm wrong here,
please do speak up.

Not to go off-topic, but where in Portage's source is this logic at? It
seems like an easy fix for a slightly more efficient Portage.

--
Joshua Kinard
Gentoo/MIPS
[email protected]
rsa6144/5C63F4E3F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us. And
our lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic


From Mike Gilbert@21:1/5 to [email protected] on Tue Apr 12 14:50:01 2022

On Mon, Apr 11, 2022 at 7:14 PM Joshua Kinard <[email protected]> wrote:

On 4/5/2022 17:49, Jason A. Donenfeld wrote:

Hi Matt,

On Tue, Apr 5, 2022 at 10:38 PM Matt Turner <[email protected]> wrote:

On Tue, Apr 5, 2022 at 12:30 PM Jason A. Donenfeld <[email protected]> wrote:

By the way, we're not currently _checking_ two hash functions during
src_prepare(), are we?

I don't know, but the hash-checking is definitely checked before src_prepare().

Er, during the builtin fetch phase. Anyway, you know what I meant. :)

Anyway, looking at the portage source code, to answer my own question,
it looks like the file is actually being read twice and both hashes computed. I would have at least expected an optimization like:

    hash1_init(&hash1);
    hash2_init(&hash2);
    for chunk in file:
        hash1_update(&hash1, chunk);
        hash2_update(&hash2, chunk);
    hash1_final(&hash1, out1);
    hash2_final(&hash2, out2);

But actually what's happening is the even less efficient:

    hash1_init(&hash1);
    for chunk in file:
        hash1_update(&hash1, chunk);
    hash1_final(&hash1, out1);
    hash2_init(&hash2);
    for chunk in file:
        hash2_update(&hash2, chunk);
    hash2_final(&hash2, out2);

So the file winds up being open and read twice. For huge tarballs like chromium or libreoffice...

But either way you do it - the missed optimization above or the
unoptimized reality below - there's still twice as much work being
done. This is all unless I've misread the source code, which is
possible, so if somebody knows this code well and I'm wrong here,
please do speak up.

Not to go off-topic, but where in Portage's source is this logic at? It seems like an easy fix for a slightly more efficient Portage.

I believe it's the portage.checksum.verify_all() function.

https://gitweb.gentoo.org/proj/portage.git/tree/lib/portage/checksum.py?h=portage-3.0.30#n471
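
For reference, the single-pass variant from the pseudocode quoted above could look roughly like this in Python. This is a sketch using only the standard `hashlib` module, not Portage's actual code; the function name `hash_file_single_pass`, the default algorithm list and the chunk size are assumptions made for the example.

```python
import hashlib

def hash_file_single_pass(path, algorithms=("sha512", "blake2b"),
                          chunk_size=1 << 20):
    """Read the file once and feed every chunk to all hash objects.

    Returns (size_in_bytes, {algorithm_name: hex_digest}).
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    size = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            size += len(chunk)
            # Each chunk is read from disk once and hashed by every algorithm.
            for hasher in hashers.values():
                hasher.update(chunk)
    return size, {name: h.hexdigest() for name, h in hashers.items()}
```

This still does the hashing work once per algorithm, but the file is opened and read only once, and the size falls out of the same loop.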


From Robin H. Johnson@21:1/5 to Robin H. Johnson on Wed Apr 20 02:10:02 2022

On Wed, Apr 06, 2022 at 05:23:25PM +0000, Robin H. Johnson wrote:

On Wed, Apr 06, 2022 at 02:15:02AM +0200, Jason A. Donenfeld wrote:

2) Comparability: other distros use SHA2-512, as well as various
upstreams, which means we can compare our hashes to theirs easily.

Can we expand on this specific thread for a moment?

I was the author of GLEP59 about changing the Manifest hashes, and I
noted at the time, with references, that the effective strength of a set
of hashes is only that of the strongest hash.

Bump for my parent message; I'm very surprised at the lack of
responses to my two messages in this thread.

https://archives.gentoo.org/gentoo-dev/message/18216da0128ee79733fa68bb77fa8b69
https://archives.gentoo.org/gentoo-dev/message/a9974ec34dfb25810dab47e3fa322a52

--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail : [email protected]
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136


From Jason A. Donenfeld@21:1/5 to Robin H. Johnson on Wed Apr 20 16:00:01 2022

Hey Robin,

Sorry for the delay in getting back to you. As mentioned on IRC, both of
your messages bounced earlier, and I was at a conference all last week. Catching up with this thread now...

On Wed, Apr 06, 2022 at 05:23:25PM +0000, Robin H. Johnson wrote:

On Wed, Apr 06, 2022 at 02:15:02AM +0200, Jason A. Donenfeld wrote:

2) Comparability: other distros use SHA2-512, as well as various
upstreams, which means we can compare our hashes to theirs easily.

Can we expand on this specific thread for a moment?

I was the author of GLEP59 about changing the Manifest hashes, and I
noted at the time, with references, that the effective strength of a set
of hashes is only that of the strongest hash.

One of my regrets from GLEP59 is that it's made it harder for use cases outside of the normal user distfile workflow.

The use case that impacted me the most was being able to compare what our
distfiles were over time against external sources, esp. if the file goes
missing or was fetch-restricted and we can't produce a new hash of it.
Maybe upstream only ever published SHA1/SHA256, and we only ever
calculated SHA512/BLAKE2b on the file. Since we never had hashes from
both sides at the same time, we cannot prove it was the same file.

We need to be able to ship one or more hashes to users, for the specific
use case of validating the distfiles they download.

As a developer, I'd like to be able to track the other hashes for a
file, without forcing ourselves to retain the file. This might be to
compare with upstream published hashes, or to compare with other
distros.

In fact it would be really nice to have a semi-automated pipeline to
plug in signed upstream hashes to our Manifests, and make it possible to
prove our new SHA512/BLAKE2B hash was taken over the correct input in
the first place, and there wasn't any subtle supply-chain attack early
in the packaging process.

Where would those hashes go? They don't need to be in the Manifest, or
at the very least they don't need to be distributed via rsync to users
(it only costs a small amount of bytes to do so).

Where else could they go?
- Commit messages could work.
- Git notes to a lesser degree.
- alternate repos?

Interesting idea. This seems orthogonal to my proposal ("just use one
hash in the manifest and call it a day; make it the same as what gpg
uses for signing to minimize moving pieces"), and so I'm hesitant to
indulge too much in this thread, for fear of it being derailed with this different thing you want.

With that said, I'm not quite sure I understood everything you're asking
for. You said that you want "to have a semi-automated pipeline to plug
in signed upstream hashes to our Manifests, and make it possible to
prove our new SHA512/BLAKE2B hash was taken over the correct input", but
at the same time you also said that you want "to be able to track the
other hashes for a file, without forcing ourselves to retain the file."
What I'm wondering is: how do you propose that we calculate a SHA-512
hash of a file and "prove it correct" using, e.g., a signed SHA-256
hash, if we don't download the whole file?

It sounds like the thing that would be interesting to you would be for
infra to manage some sort of master hash database collecting all the
hashes from all over the internet of every file that hits distfiles,
verifying and then generating a bunch more hash variants of all kinds,
and then cross-verifying those with the hashes extracted from every
other distro, making for a wild hash verification aggregator machine. I
think I can see the utility of it. It would also unburden manifest
files, as those could then just have a SHA-512 hash and nothing else,
making things a bit lighter.

A reason why some people might prefer BLAKE2b over SHA2-512 is a performance improvement. However, seeing as right now we're opening
the file, reading it, computing BLAKE2b, closing the file, opening the
file again, reading it again, computing SHA2-512, closing the file, I
don't think performance is actually something people care about. Seen
differently, removing either one of them will already give us a
performance "boost" of sorts.

Or just only verifying the "strongest" hash gives you that boost.

I do want to check into the code that you pointed out, because I'm
really sure much older versions of Portage did the CORRECT thing of only reading the file in a single pass.

Let me know if your findings are different from mine...

Jason


From Jason A. Donenfeld@21:1/5 to Robin H. Johnson on Wed Apr 20 18:40:01 2022

Hi Robin,

On Wed, Apr 06, 2022 at 05:31:09PM +0000, Robin H. Johnson wrote:

On Wed, Apr 06, 2022 at 07:06:30PM +0200, Jason A. Donenfeld wrote:

No, you're still missing the point.

If SHA-512 breaks, the security of the system fails, regardless of
what change we make. This is because GnuPG uses SHA-512 for its
signatures.

Question directly for you Jason, because you make a professional study
of this: does the type of breakage/successful attack against
SHA-512 matter?

E.g., is it possible that some type of attack would only work against the
Manifest entry, but NOT against the GPG signature's embedded SHA-512 (or
the opposite)?

The best hypothetical idea I had was that there exists some large
special input that lets an attacker reset the output to an arbitrary
hash after their malicious payload: but it wouldn't fit in the GPG
signature space.

Generally speaking, the more control an attacker has over the input, the
easier certain types of attacks might be. So maybe in the most general
sense that applies. I wouldn't model a security analysis around that,
though. Rather, the usual way to apply that sort of thinking is to
design algorithms that rely on certain properties of hash functions, but
not others; for example, Ed25519 does not rely on the hash function
being collision resistant due to its construction.

Jason
