• [gentoo-dev] proposal: use only one hash function in manifest files

    From Jason A. Donenfeld@21:1/5 to All on Tue Apr 5 01:50:02 2022
    Hi,

    I'd like to propose the following for portage:

    - Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
    - Only generate and parse one hash function in Manifest files
    - Remove support for multiple hash functions

    In other words, what are we actually getting by having _both_ SHA2-512
    and BLAKE2b for every file in every Manifest? It's not about file
    integrity, since certainly a single hash handles that use case fine.
    And it's not about security either, since for that we use gpg
    signatures, and gpg signatures are carried out over a _single_ hash of
    the plain text being hashed, so the security of the system reduces to
    breaking SHA2-512 anyway. So, if it's not about file integrity and
    it's not about security, what is it about?
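
    For readers unfamiliar with the format under discussion: a DIST entry in a
    Manifest records the file name, the size in bytes, and then pairs of hash
    name and hex digest. The sketch below parses such a line. The helper name
    parse_dist_line, the file name, and the shortened digests are made up for
    illustration and are not taken from Portage.

```python
from typing import NamedTuple


class DistEntry(NamedTuple):
    name: str
    size: int
    hashes: dict  # e.g. {"BLAKE2B": "<hex>", "SHA512": "<hex>"}


def parse_dist_line(line: str) -> DistEntry:
    """Parse 'DIST <name> <size> <ALGO> <hex> [<ALGO> <hex> ...]'."""
    fields = line.split()
    # A valid entry has 3 leading fields plus at least one (name, digest) pair.
    if len(fields) < 5 or fields[0] != "DIST" or len(fields) % 2 == 0:
        raise ValueError(f"not a DIST entry: {line!r}")
    name, size = fields[1], int(fields[2])
    # Remaining fields alternate between algorithm name and hex digest.
    hashes = dict(zip(fields[3::2], fields[4::2]))
    return DistEntry(name, size, hashes)


# Hypothetical entry; real digests are 128 hex characters each.
example = "DIST foo-1.0.tar.gz 4096 BLAKE2B ab12 SHA512 cd34"
entry = parse_dist_line(example)
print(entry.size, sorted(entry.hashes))
```

    With a single-hash Manifest, the hashes dictionary would simply hold one
    pair instead of two.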

    I don't really care which one we use, so long as it's not already
    broken or too obscure/new. So in other words, any one of SHA2-256,
    SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
    pick one and roll with it?

    Jason

    PS: there _is_ a good reason for recording the file size in Manifest
    files as we do now: it's quicker to compare sizes on large files than
    it is to read and hash the whole thing, so this gives us a "free" way
    of noticing quick corruption.
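
    The size-first check described in the PS can be sketched as follows. This
    is a minimal illustration, assuming a single SHA-512 digest per file; the
    function name verify_distfile is hypothetical and this is not how Portage
    structures its code.

```python
import hashlib
import os


def verify_distfile(path: str, expected_size: int, expected_sha512: str) -> bool:
    """Return True if the file matches the recorded size and SHA-512 digest."""
    # Cheap check first: a stat() call, no file contents are read.
    if os.path.getsize(path) != expected_size:
        return False
    # Only files of the right size pay for a full read and hash.
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha512
```

    A truncated download fails at the size comparison without a single byte of
    the file being hashed.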

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From John Helmert III@21:1/5 to Jason A. Donenfeld on Tue Apr 5 03:50:01 2022
    I don't really have any strong opinion, but I'll note this was
    discussed here last year, too:

    https://archives.gentoo.org/gentoo-dev/message/a51ef62765b577dccfde67d5d2d727ae

    On Tue, Apr 05, 2022 at 01:41:50AM +0200, Jason A. Donenfeld wrote:
    Hi,

    I'd like to propose the following for portage:

    - Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
    - Only generate and parse one hash function in Manifest files
    - Remove support for multiple hash functions

    In other words, what are we actually getting by having _both_ SHA2-512
    and BLAKE2b for every file in every Manifest? It's not about file
    integrity, since certainly a single hash handles that use case fine.
    And it's not about security either, since for that we use gpg
    signatures, and gpg signatures are carried out over a _single_ hash of
    the plain text being hashed, so the security of the system reduces to
    breaking SHA2-512 anyway. So, if it's not about file integrity and
    it's not about security, what is it about?

    I don't really care which one we use, so long as it's not already
    broken or too obscure/new. So in other words, any one of SHA2-256,
    SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
    pick one and roll with it?

    Jason

    PS: there _is_ a good reason for recording the file size in Manifest
    files as we do now: it's quicker to compare sizes on large files than
    it is to read and hash the whole thing, so this gives us a "free" way
    of noticing quick corruption.


  • From Jason A. Donenfeld@21:1/5 to Jason A. Donenfeld on Tue Apr 5 15:40:01 2022
    To move things forward with something more concrete:

    On 4/5/22, Jason A. Donenfeld <[email protected]> wrote:
    Hi,

    I'd like to propose the following for portage:

    - Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
    - Only generate and parse one hash function in Manifest files
    - Remove support for multiple hash functions

    [...]
    I don't really care which one we use, so long as it's not already
    broken or too obscure/new. So in other words, any one of SHA2-256,
    SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
    pick one and roll with it?

    As you might have realized from my work on other projects, I like
    BLAKE2 a lot. However, I think there are two strong reasons for going
    with SHA512 exclusively here:

    - GPG signatures are already over the SHA512 of the plain text, so
    the security of the system already reduces to that. By choosing
    SHA512, we don't add more risk, whilst choosing something else means
    we're in trouble if either one has a problem.
    - Other package managers use SHA512 in their recipes, so it makes it
    easier to compare tarball checksums.

    The principal advantage of BLAKE2b is 64-bit speed, but SHA512
    performs okay enough in that regard anyway.
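
    The speed difference is easy to measure locally. The sketch below is a
    rough micro-benchmark using Python's hashlib; the function name
    throughput_mib_s and the 16 MiB default are arbitrary choices, and the
    numbers depend heavily on the CPU and on how the underlying hash
    implementations were built, so no particular result is claimed here.

```python
import hashlib
import time


def throughput_mib_s(algo: str, size_mib: int = 16) -> float:
    """Hash size_mib MiB of zero bytes and return the throughput in MiB/s."""
    data = bytes(1 << 20)  # one MiB of zeros, reused for every update
    h = hashlib.new(algo)
    start = time.perf_counter()
    for _ in range(size_mib):
        h.update(data)
    h.digest()
    elapsed = time.perf_counter() - start
    return size_mib / elapsed


for algo in ("sha512", "blake2b"):
    print(f"{algo}: {throughput_mib_s(algo):.0f} MiB/s")
```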

    Therefore, to amend my proposal:

    - Use SHA512 as the Manifest hash.

    Any objections?

    Jason

  • From Michał Górny@21:1/5 to Jason A. Donenfeld on Tue Apr 5 16:50:01 2022
    On Tue, 2022-04-05 at 01:41 +0200, Jason A. Donenfeld wrote:
    Hi,

    I'd like to propose the following for portage:

    - Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
    - Only generate and parse one hash function in Manifest files
    - Remove support for multiple hash functions

    In other words, what are we actually getting by having _both_ SHA2-512
    and BLAKE2b for every file in every Manifest? It's not about file
    integrity, since certainly a single hash handles that use case fine.
    And it's not about security either, since for that we use gpg
    signatures, and gpg signatures are carried out over a _single_ hash of
    the plain text being hashed, so the security of the system reduces to
    breaking SHA2-512 anyway. So, if it's not about file integrity and
    it's not about security, what is it about?

    If you mean "remove entirely", then that's a bad idea. While
    the original reasons for multiple hash functions might have been, erm,
    not exactly correct, the dual-hash situation is needed for transitional
    periods. Particularly because we have a number of fetch-restricted
    packages where we simply need to wait for someone with the distfile to
    rehash them (or eventually remove them, if we can't get a new hash).

    I don't really care which one we use, so long as it's not already
    broken or too obscure/new. So in other words, any one of SHA2-256,
    SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
    pick one and roll with it?

    Back when we added BLAKE2b, the idea was to eventually remove SHA512
    (the previous hash). However, this was rejected afterwards.

    PS: there _is_ a good reason for recording the file size in Manifest
    files as we do now: it's quicker to compare sizes on large files than
    it is to read and hash the whole thing, so this gives us a "free" way
    of noticing quick corruption.

    The primary use of knowing the file size is to know whether to try to
    resume fetching.
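
    A minimal sketch of that decision, assuming only the recorded size is
    consulted. The function name fetch_action and its return values are
    hypothetical; Portage's real fetch logic takes more conditions into
    account than this.

```python
import os


def fetch_action(path: str, manifest_size: int) -> str:
    """Decide what to do with a (possibly partial) distfile on disk."""
    try:
        local_size = os.path.getsize(path)
    except FileNotFoundError:
        return "fetch"      # nothing on disk yet
    if local_size < manifest_size:
        return "resume"     # partial download: continue from local_size
    if local_size == manifest_size:
        return "verify"     # complete by size: now check the hash
    return "refetch"        # larger than expected: cannot be a valid prefix
```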

    --
    Best regards,
    Michał Górny

  • From Ulrich Mueller@21:1/5 to All on Tue Apr 5 16:20:01 2022
    On Tue, 05 Apr 2022, Jason A Donenfeld wrote:

    - GPG signatures are already over the SHA512 of the plain text, so
    the security of the system already reduces to that. By choosing
    SHA512, we don't add more risk, whilst choosing something else means
    we're in trouble if either one has a problem.

    The OpenPGP signature is for the top-level Manifest only. In case there
    was any trouble, it would be trivial to change the hash algorithm used
    for this.

    In contrast to that, updating the hashes in all Manifest files is a
    huge pain in the neck. Basically, you must download all distfiles, which
    is not trivial. For example, think of fetch-restricted files. (I've
    helped twice with updating Manifest files, so I believe I know what I'm
    talking about. :)

    I think that the benefit of dropping one of the hashes would be close to
    zero, especially if we would drop the faster one.

    Ulrich

  • From Jason A. Donenfeld@21:1/5 to [email protected] on Tue Apr 5 17:20:01 2022
    Hi Ulrich,

    On Tue, Apr 5, 2022 at 4:10 PM Ulrich Mueller <[email protected]> wrote:
    The OpenPGP signature is for the top-level Manifest only. In case there
    was any trouble, it would be trivial to change the hash algorithm used
    for this.

    In contrast to that, updating the hashes in all Manifest files is a
    huge pain in the neck. Basically, you must download all distfiles, which
    is not trivial. For example, think of fetch-restricted files. (I've
    helped twice with updating Manifest files, so I believe I know what I'm
    talking about. :)

    The thing is, if SHA-512 is broken, that will really be the least of
    our concerns. TLS itself will be broken....

    Jason

  • From Jason A. Donenfeld@21:1/5 to All on Tue Apr 5 20:50:02 2022
    Hi Michal,

    On Tue, Apr 05, 2022 at 02:49:12PM +0000, Michał Górny wrote:
    I don't really care which one we use, so long as it's not already
    broken or too obscure/new. So in other words, any one of SHA2-256,
    SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
    pick one and roll with it?

    Back when we added BLAKE2b, the idea was to eventually remove SHA512
    (the previous hash). However, this was rejected afterwards.

    Maybe we should pick that back up? Do you remember the ultimate
    rationale for rejecting it? Do you suppose those are still valid?

    Jason

  • From Matt Turner@21:1/5 to [email protected] on Tue Apr 5 21:00:01 2022
    On Tue, Apr 5, 2022 at 11:47 AM Jason A. Donenfeld <[email protected]> wrote:

    Hi Michal,

    On Tue, Apr 05, 2022 at 02:49:12PM +0000, Michał Górny wrote:
    I don't really care which one we use, so long as it's not already
    broken or too obscure/new. So in other words, any one of SHA2-256,
    SHA2-512, SHA3, BLAKE2b, BLAKE2s would be fine with me. Can we just
    pick one and roll with it?

    Back when we added BLAKE2b, the idea was to eventually remove SHA512
    (the previous hash). However, this was rejected afterwards.

    Maybe we should pick that back up? Do you remember the ultimate
    rationale for rejecting it? Do you suppose those are still valid?

    (Somehow you broke threading)

    This was a topic in June 2021's Council meeting:

    https://gitweb.gentoo.org/sites/projects/council.git/tree/meeting-logs/20210613-summary.txt#n33
    https://gitweb.gentoo.org/sites/projects/council.git/tree/meeting-logs/20210613.txt#n137

    Basically there was no great reason presented for making the change
    and some (IMO specious) reasons for keeping multiple hashes. I don't
    think anyone felt strongly enough about removing one hash to fight for
    it.

  • From Jason A. Donenfeld@21:1/5 to [email protected] on Tue Apr 5 21:40:02 2022
    Hi Matt,

    On Tue, Apr 5, 2022 at 8:58 PM Matt Turner <[email protected]> wrote:
    This was a topic in June 2021's Council meeting:

    https://gitweb.gentoo.org/sites/projects/council.git/tree/meeting-logs/20210613-summary.txt#n33
    https://gitweb.gentoo.org/sites/projects/council.git/tree/meeting-logs/20210613.txt#n137

    Basically there was no great reason presented for making the change
    and some (IMO specious) reasons for keeping multiple hashes. I don't
    think anyone felt strongly enough about removing one hash to fight for
    it.

    Huh. Something not brought up there or https://bugs.gentoo.org/784710
    is the fact that the _security_ of the system reduces to SHA-512 as
    used by our GPG signatures.

    By the way, we're not currently _checking_ two hash functions during src_prepare(), are we?

    Jason

  • From Ulrich Mueller@21:1/5 to All on Tue Apr 5 22:20:01 2022
    On Tue, 05 Apr 2022, Jason A Donenfeld wrote:

    Huh. Something not brought up there or https://bugs.gentoo.org/784710
    is the fact that the _security_ of the system reduces to SHA-512 as
    used by our GPG signatures.

    The hash algorithm would be the least of my concerns about the security
    of these signatures.

    IIUC, the secret signing key is stored on a machine that is connected to
    the network (Infra, please correct me if I'm wrong). So there are other
    more likely attack vectors than a preimage attack on a 512 bit hash
    function.

    Also: https://xkcd.com/538/ :)

    Ulrich

  • From Matt Turner@21:1/5 to [email protected] on Tue Apr 5 22:40:02 2022
    On Tue, Apr 5, 2022 at 12:30 PM Jason A. Donenfeld <[email protected]> wrote:
    By the way, we're not currently _checking_ two hash functions during src_prepare(), are we?

    I don't know, but the hash checking is definitely done before src_prepare().

  • From Jonas Stein@21:1/5 to All on Tue Apr 5 23:20:01 2022
    Hi

    I'd like to propose the following for portage:

    - Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
    - Only generate and parse one hash function in Manifest files
    - Remove support for multiple hash functions

    No, this has no benefit.

    In other words, what are we actually getting by having _both_ SHA2-512
    and BLAKE2b for every file in every Manifest?

    Implementations are often broken and we have to expect zero day attacks
    on hashes and on signatures. Hence it does not hurt to have a second hash.

    It is very likely that we can not trust in X for a while in the next
    years, but it is very unlikely that two different implementations are
    affected.

    Additionally calculating a second hash does not cost anything.
    This was also the outcome of the discussion some time ago here.

    --
    Best,
    Jonas

  • From Jason A. Donenfeld@21:1/5 to [email protected] on Tue Apr 5 23:40:01 2022
    Hi Ulrich,

    On Tue, Apr 5, 2022 at 10:15 PM Ulrich Mueller <[email protected]> wrote:

    On Tue, 05 Apr 2022, Jason A Donenfeld wrote:

    Huh. Something not brought up there or https://bugs.gentoo.org/784710
    is the fact that the _security_ of the system reduces to SHA-512 as
    used by our GPG signatures.

    The hash algorithm would be the least of my concerns about the security
    of these signatures.

    IIUC, the secret signing key is stored on a machine that is connected to
    the network (Infra, please correct me if I'm wrong). So there are other
    more likely attack vectors than a preimage attack on a 512 bit hash
    function.

    You missed the point, which is that having two hashes, SHA512 and
    BLAKE2b, doesn't actually help anything, since an attacker only must
    attack SHA512 in order to break the signature system, which is
    actually what we're relying on for security. Yes there are other
    attacks too on the signature system. But in terms of hashing, my point
    is that adding an additional hash to manifest files to the one used by
    the signature doesn't help anything from a security perspective, since
    if you have an attack on the signature's hash, then no additional
    hashing is going to actually help.

    Jason

  • From Jason A. Donenfeld@21:1/5 to [email protected] on Tue Apr 5 23:40:01 2022
    Hi Jonas,

    On Tue, Apr 5, 2022 at 11:20 PM Jonas Stein <[email protected]> wrote:
    In other words, what are we actually getting by having _both_ SHA2-512
    and BLAKE2b for every file in every Manifest?

    Implementations are often broken and we have to expect zero day attacks
    on hashes and on signatures. Hence it does not hurt to have a second hash.

    It is very likely that we can not trust in X for a while in the next
    years, but it is very unlikely that two different implementations are
    affected.

    This is the part that doesn't really make any sense to me. The
    security of the system reduces to the SHA512 used by those GPG
    signatures. If SHA512 breaks, the fact that our Manifest files also
    use BLAKE2b isn't going to help us, since an attacker could
    presumably, in that case, forge the signatures that we're using as a
    root of trust. I don't see what a second hash buys us from a security
    perspective here. What attack model do you have where it makes sense?

    Additionally calculating a second hash does not cost anything.

    How is that possible? Doesn't calculating two things always cost more
    than calculating one? If what you actually mean is, "performance is
    not important," we can discuss that, but it sounds like you're saying
    that there's zero performance impact. How does that work exactly? Is
    only one calculated at emerge time or something clever like that?

    Jason

  • From Jason A. Donenfeld@21:1/5 to [email protected] on Tue Apr 5 23:50:01 2022
    Hi Matt,

    On Tue, Apr 5, 2022 at 10:38 PM Matt Turner <[email protected]> wrote:

    On Tue, Apr 5, 2022 at 12:30 PM Jason A. Donenfeld <[email protected]> wrote:
    By the way, we're not currently _checking_ two hash functions during src_prepare(), are we?

    I don't know, but the hash checking is definitely done before src_prepare().

    Er, during the builtin fetch phase. Anyway, you know what I meant. :)

    Anyway, looking at the portage source code, to answer my own question,
    it looks like the file is actually being read twice and both hashes
    computed. I would have at least expected an optimization like:

    hash1_init(&hash1);
    hash2_init(&hash2);
    for chunk in file:
        hash1_update(&hash1, chunk);
        hash2_update(&hash2, chunk);
    hash1_final(&hash1, out1);
    hash2_final(&hash2, out2);

    But actually what's happening is the even less efficient:

    hash1_init(&hash1);
    for chunk in file:
        hash1_update(&hash1, chunk);
    hash1_final(&hash1, out1);
    hash2_init(&hash2);
    for chunk in file:
        hash2_update(&hash2, chunk);
    hash2_final(&hash2, out2);

    So the file winds up being open and read twice. For huge tarballs like
    chromium or libreoffice...

    But either way you do it - the missed optimization above or the
    unoptimized reality below - there's still twice as much work being
    done. This is all unless I've misread the source code, which is
    possible, so if somebody knows this code well and I'm wrong here,
    please do speak up.
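
    The single-pass variant from the first pseudocode listing could look
    roughly like this in Python. It is a sketch built on hashlib, not code
    from Portage; the function name hash_file and the 1 MiB chunk size are
    assumptions made for the example.

```python
import hashlib


def hash_file(path: str, algos=("sha512", "blake2b"), chunk_size=1 << 20) -> dict:
    """Compute several digests while reading the file only once."""
    hashers = {name: hashlib.new(name) for name in algos}
    with open(path, "rb") as f:
        # Each chunk is read from disk once and fed to every hasher.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}
```

    This halves the I/O but not the hashing work: every byte still passes
    through both functions, which is the point made in the message.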

    Jason

  • From Sam James@21:1/5 to All on Wed Apr 6 02:10:01 2022
    On 5 Apr 2022, at 22:13, Jonas Stein <[email protected]> wrote:

    Hi

    I'd like to propose the following for portage:
    - Only support one "secure" hash function (such as sha2, sha3, blake2, etc)
    - Only generate and parse one hash function in Manifest files
    - Remove support for multiple hash functions

    No, this has no benefit.

    Which part has no benefit? I could see a case (although I don't think it's a super strong one)
    for keeping support for multiple hash types in Portage, but only 1 in a Manifest.

    I think Jason's made a fair case for dropping it.


    In other words, what are we actually getting by having _both_ SHA2-512
    and BLAKE2b for every file in every Manifest?

    Implementations are often broken and we have to expect zero day attacks on hashes and on signatures. Hence it does not hurt to have a second hash.

    I don't think this is the case. They're not broken often, it's a very very big deal when they do, and we'd also have far bigger problems in such a case (as already pointed out, TLS would be an issue, but also GPG signatures, git commit hashes, ...).


    It is very likely that we can not trust in X for a while in the next years, but it is very unlikely that two different implementations are affected.


    I don't think it is likely that e.g. SHA512 will be broken in the next few years, no, but if it is going to be, we have far bigger issues and we'd need to have double algorithms in our whole stack, which we don't have.

    Additionally calculating a second hash does not cost anything.

    It does have a cost at both Manifest-generation time and emerge-time.

    Thanks,
    sam


  • From Jason A. Donenfeld@21:1/5 to [email protected] on Wed Apr 6 02:20:01 2022
    Hi Sam,

    On Wed, Apr 6, 2022 at 2:02 AM Sam James <[email protected]> wrote:
    This matches my views and recollection. We could revisit it
    if there was a passionate advocate (which it looks like there may well be).

    While I wasn't against it before, I was sort of ambivalent given
    we had no strong reason to, but I'm more willing now given
    we're also cleaning out other Portage cruft at the same time.

    I think actually the argument I'm making this time might be subtly
    different from the motions that folks went through last year.
    Specifically, the idea last year was to switch to using BLAKE2b only.
    I think what the arguments I'm making now point to is switching to
    SHA2-512 only.

    There are two reasons for this.

    1) Security: since the GPG signatures use SHA2-512, then the whole
    system breaks if SHA2-512 breaks. If we choose BLAKE2b as our only
    hash, then if either SHA2-512 or BLAKE2b break, then the system
    breaks. But if we choose SHA2-512 as our only hash, then we only need
    to worry about SHA2-512 breaking.

    2) Comparability: other distros use SHA2-512, as well as various
    upstreams, which means we can compare our hashes to theirs easily.
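
    The security argument in point 1 can be written down as a small model.
    This is a deliberately simplified sketch: it treats "broken" as a single
    yes/no property per hash function and assumes an attacker wins either by
    forging the signature or by producing a distfile that matches every hash
    recorded in the Manifest. The function name system_broken is made up for
    the example.

```python
def system_broken(broken: set, signature_hash: str, manifest_hashes: set) -> bool:
    """Return True if the given set of broken hashes compromises the system."""
    forge_signature = signature_hash in broken
    forge_distfile = manifest_hashes <= broken  # all recorded hashes broken
    return forge_signature or forge_distfile


SIG = "SHA512"  # hash used by the GPG signature, per the thread
for manifest in ({"SHA512", "BLAKE2B"}, {"BLAKE2B"}, {"SHA512"}):
    for broken in ({"SHA512"}, {"BLAKE2B"}):
        print(sorted(manifest), sorted(broken), system_broken(broken, SIG, manifest))
```

    Under this model a BLAKE2B-only Manifest is the only configuration that
    fails when BLAKE2B alone breaks, while every configuration fails when
    SHA512 breaks.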

    A reason why some people might prefer BLAKE2b over SHA2-512 is a
    performance improvement. However, seeing as right now we're opening
    the file, reading it, computing BLAKE2b, closing the file, opening the
    file again, reading it again, computing SHA2-512, closing the file, I
    don't think performance is actually something people care about. Seen
    differently, removing either one of them will already give us a
    performance "boost" of sorts.

    Jason

  • From Sam James@21:1/5 to All on Wed Apr 6 02:30:01 2022
    On 6 Apr 2022, at 01:15, Jason A. Donenfeld <[email protected]> wrote:

    Hi Sam,

    On Wed, Apr 6, 2022 at 2:02 AM Sam James <[email protected]> wrote:
    This matches my views and recollection. We could revisit it
    if there was a passionate advocate (which it looks like there may well be).
    While I wasn't against it before, I was sort of ambivalent given
    we had no strong reason to, but I'm more willing now given
    we're also cleaning out other Portage cruft at the same time.

    I think actually the argument I'm making this time might be subtly
    different from the motions that folks went through last year.
    Specifically, the idea last year was to switch to using BLAKE2b only.
    I think what the arguments I'm making now point to is switching to
    SHA2-512 only.

    Oh, right. I see!

    (Aside: I should've been clearer in my first email, what I meant was: I'm
    fine with revisiting this, but I remember us feeling kind of lacklustre
    because even the proposer (mgorny) ended up not having the oomph to push
    it through given (small) opposition. I don't recall who had the stiff
    opposition at the time, but I do recall it was only small, but nobody
    really felt like it was worth the hassle.

    The overall Council feeling was "meh" without some momentum.)


    There are two reasons for this.

    1) Security: since the GPG signatures use SHA2-512, then the whole
    system breaks if SHA2-512 breaks. If we choose BLAKE2b as our only
    hash, then if either SHA2-512 or BLAKE2b break, then the system
    breaks. But if we choose SHA2-512 as our only hash, then we only need
    to worry about SHA2-512 breaking.

    2) Comparability: other distros use SHA2-512, as well as various
    upstreams, which means we can compare our hashes to theirs easily.

    A reason why some people might prefer BLAKE2b over SHA2-512 is a
    performance improvement. However, seeing as right now we're opening
    the file, reading it, computing BLAKE2b, closing the file, opening the
    file again, reading it again, computing SHA2-512, closing the file, I
    don't think performance is actually something people care about. Seen
    differently, removing either one of them will already give us a
    performance "boost" of sorts.


    I think this seems pretty reasonable and I don't have any objection to it.

    2) is a nice point and it's something Robin raised last time around too.

    Jason

    best,
    sam


  • From Rich Freeman@21:1/5 to [email protected] on Wed Apr 6 03:40:01 2022
    On Tue, Apr 5, 2022 at 8:05 PM Sam James <[email protected]> wrote:
    On 5 Apr 2022, at 22:13, Jonas Stein <[email protected]> wrote:

    In other words, what are we actually getting by having _both_ SHA2-512
    and BLAKE2b for every file in every Manifest?

    Implementations are often broken and we have to expect zero day attacks on hashes and on signatures. Hence it does not hurt to have a second hash.

    I don't think this is the case. They're not broken often, it's a very very big deal when they do, and we'd also have far bigger problems in such a case (as already pointed out, TLS would be an issue, but also GPG signatures, git commit hashes, ...).

    Our security fails currently if EITHER SHA2-512 or a hardened version
    of SHA-1 is defeated. Our top gpg signature is bound to a git commit
    record by SHA2-512, and the git commit record is bound to everything
    else in the repository (including the manifest objects) by SHA-1,
    because git hasn't transitioned away from that (as far as I'm aware it
    is still a work in progress - the SHA-1 algorithm it uses is hardened
    against known attacks).

    That said, I think there is still an argument for having two hashes in
    the manifests. If we have two independent hashes in the manifests, then if
    either SHA-1 or SHA2-512 is defeated all we need to do is update git+gpg to
    the patched version (which no doubt would be rushed into a release
    quickly), and then do a commit to the repo and sign it with the Gentoo
    key. The new commit would have a full set of new hashes using a
    secure hash function, and then a back-reference to the previous commit
    using SHA-1 (assuming we didn't rebase the entire tree and lose all
    our historical gpg signatures - we might consider creating a new repo
    and saving a historical one). That would have new hashes all the way
    from the top commit down to all the objects it references, so the top
    commit would now be secure. When signed with an updated gpg the
    signature would be attached with a secure hash. So now we're secure
    again. If we're concerned about old signatures getting recycled in
    preimage attacks we could of course revoke the key and issue a new
    one.

    What we don't need to do is redo all the manifests, and that is
    important because we don't actually have the ability to redo those
    centrally. Anybody can add a commit to the repo and re-sign it, but
    we'd need all the maintainers to go through and generate new manifests
    for anything that is fetch-restricted, or aggressively treeclean.

    So it isn't that having two hashes can't fail, but rather that if it
    does fail it is easier to recover.



    It is very likely that we can not trust in X for a while in the next years, but it is very unlikely that two different implementations are affected.


    I don't think it is likely that e.g. SHA512 will be broken in the next few years, no, but if it is going to be, we have far bigger issues and we'd need to have double algorithms in our whole stack, which we don't have.

    I agree that this is an unlikely scenario, so it is a judgement call
    as to whether the ease of recovery in the event of a failure is worth
    the cost to maintain the second hash. I agree that we'd need double
    algorithms in the whole stack to prevent a failure, but in the current
    state we do have advantages for recovering from a failure after the
    fact.

    It seems that the likely scenario is that we get advance warning of
    weaknesses in a hash function, but without a practical exploit being
    readily available. In that case we could do a more orderly
    transition. We'd still save time with the double hashed manifests,
    and whether this makes a difference is hard to say.


    Additionally calculating a second hash does not cost anything.

    It does have a cost at both Manifest-generation time and emerge-time.

    This is certainly true, though if the current algorithm is reading the
    files twice we could at least fix that.

    I don't really have a strong opinion here. I just wanted to point out
    the recovery benefit of having two hashes on just the manifests, given
    that it isn't easy to access all the distfiles. I also wanted to
    point out that we have SHA-1 exposure today, at least in git.

    --
    Rich

  • From Ulrich Mueller@21:1/5 to All on Wed Apr 6 06:20:01 2022
    On Wed, 06 Apr 2022, Jason A Donenfeld wrote:

    I think actually the argument I'm making this time might be subtly
    different from the motions that folks went through last year.
    Specifically, the idea last year was to switch to using BLAKE2b only.
    I think what the arguments I'm making now point to is switching to
    SHA2-512 only.

    Still, I think that if we drop one of the hashes then we should proceed
    with the original plan. That is, keep the more modern BLAKE2B (which was
    a participant of the SHA-3 competition [1]) and drop the older SHA512.

    Back then, we had the choice between adding SHA3_512 and BLAKE2B, and we preferred BLAKE2B for performance reasons.

    I also think that the argument about the OpenPGP signature isn't very
    strong, because replacing that signature by another one using a
    different hash is trivial. As I said before, replacing all Manifest
    files in the tree isn't.

    Ulrich

    [1] https://en.wikipedia.org/wiki/NIST_hash_function_competition

    -----BEGIN PGP SIGNATURE-----

    iQFDBAEBCAAtFiEEtDnZ1O9xIP68rzDbUYgzUIhBXi4FAmJNE1MPHHVsbUBnZW50 b28ub3JnAAoJEFGIM1CIQV4un2sIANm++5Qi/2/jlVJWyzNjwiIgs9OwQfBu76wW LfUDtCp0rBJUNcM64agDfkNr9c5if9mmXp1scBFszm4HL4OmErrUSGAvg3ZNkl1D ufp7lup40i8CMm7oekZftwWyy8de0i7OL8aHpMNfV/B2td76EG5pkdh2811Oxfjz ZKmiTHkVKLTJ7R8ve4+U9nffV+EnttNlgTVerE12Qe4RZYzWK1Cx90vjMqsct74u JuoqwsbJrJa8fe9//I8Ll8nNWZGMYP8g3L918V62hKZnbp8z6XFup6jalYQfBj4P sqgB/EtaAIdsuEh2xZ6VdVCqsbAQS1ZiMcBBZATH7WI+Bu2tGcI=
    =q+3W
    -----END PGP SIGNATURE-----

  • From Jason A. Donenfeld@21:1/5 to Ulrich Mueller on Wed Apr 6 13:50:02 2022
    Hi Ulrich,

    On 4/6/22, Ulrich Mueller <[email protected]> wrote:
    On Wed, 06 Apr 2022, Jason A Donenfeld wrote:

    I think actually the argument I'm making this time might be subtly
    different from the motions that folks went through last year.
    Specifically, the idea last year was to switch to using BLAKE2b only.
    I think what the arguments I'm making now point to is switching to
    SHA2-512 only.

    Still, I think that if we drop one of the hashes then we should proceed
    with the original plan. That is, keep the more modern BLAKE2B (which was
    a participant of the SHA-3 competition [1]) and drop the older SHA512.

    Why? Then we're dependent on two things, either of which could break,
    rather than one.

    To be clear, I'm a big fan of BLAKE2 myself and have used it in a
    number of projects. And either one breaking would be a big deal. So
    maybe it doesn't really matter that much. But strictly formally, it
    seems like SHA512 is the most sound decision? I spelled out two
    reasons for that to Sam; if you still disagree, maybe you can address
    why you think my two reasons aren't very meaningful?

    I also think that the argument about the OpenPGP signature isn't very
    strong, because replacing that signature by another one using a
    different hash is trivial. As I said before, replacing all Manifest
    files in the tree isn't.

    I looked into changing gnupg to use BLAKE2b for signatures, but it
    doesn't appear to be supported. It's in gcrypt but not gpg. From
    --version: `Hash: SHA1, RIPEMD160, SHA256, SHA384, SHA512, SHA224`.
    Since my argument rests on minimizing probability of a break, changing
    the signature hash algo after it's broken doesn't help with much, so I
    think this is something we'd want to happen now, rather than later, if
    we're to use BLAKE2b exclusively.

    I could potentially send a patch to gnupg for this if you want to take
    the long path. But also: don't forget there's also the
    interoperability argument that favors SHA512 too.

    Jason

  • From Jason A. Donenfeld@21:1/5 to [email protected] on Wed Apr 6 19:10:02 2022
    Hi Ulrich,

    On Wed, Apr 6, 2022 at 6:38 PM Ulrich Mueller <[email protected]> wrote:
    Why? Then we're dependent on two things, either of which could break, rather than one.

    See? If either of these should happen, then we'll be happy that we still
    have both hashes in our Manifest files.

    OTOH, if that argument is not relevant because the probability of both
    is close to zero, then (from a security POV) it doesn't matter which of
    the two hashes we remove.

    No, you're still missing the point.

    If SHA-512 breaks, the security of the system fails, regardless of
    what change we make. This is because GnuPG uses SHA-512 for its
    signatures.

    So I'll spell out the different possibilities:

    1) GPG uses SHA-512. Manifest uses SHA-512 and BLAKE2b.
    1a) Possibility: SHA-512 is broken. Result: system broken.
    1b) Possibility: BLAKE2b is broken. Result: nothing.

    2) GPG uses SHA-512. Manifest uses SHA-512.
    2a) Possibility: SHA-512 is broken. Result: system broken.
    2b) Possibility: BLAKE2b is broken. Result: nothing.

    3) GPG uses SHA-512. Manifest uses BLAKE2b.
    3a) Possibility: SHA-512 is broken. Result: system broken.
    3b) Possibility: BLAKE2b is broken. Result: system broken.

    See how from a security perspective, (2) is not worse than (1), but
    (3) is worse than both (1) and (2)?
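The enumeration above can be reproduced mechanically. The following is a minimal sketch, assuming a simplified model in which verification is defeated when the signature hash is broken or when every hash listed in the Manifest is broken. The function and variable names are hypothetical and exist only for illustration.

```python
def system_broken(gpg_hash, manifest_hashes, broken_hash):
    """Return True if breaking `broken_hash` defeats verification.

    Assumed model (illustrative only): the attacker wins if the hash used
    by the signature is broken, or if every Manifest hash is broken.
    """
    if broken_hash == gpg_hash:
        return True
    return all(h == broken_hash for h in manifest_hashes)


# The three configurations from the list above, keyed by their labels.
CONFIGURATIONS = {
    "1": ["SHA512", "BLAKE2B"],
    "2": ["SHA512"],
    "3": ["BLAKE2B"],
}


def outcome_table(gpg_hash="SHA512"):
    """Map each case label (1a, 1b, ...) to whether the system is broken."""
    table = {}
    for label, manifest_hashes in CONFIGURATIONS.items():
        for suffix, broken in (("a", "SHA512"), ("b", "BLAKE2B")):
            table[label + suffix] = system_broken(gpg_hash, manifest_hashes, broken)
    return table
```

Under this model only 1b and 2b come out unbroken, which is the asymmetry the list describes. The model says nothing about how hard each broken case is to repair.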

    Jason

  • From Ulrich Mueller@21:1/5 to All on Wed Apr 6 18:40:01 2022
    On Wed, 06 Apr 2022, Jason A Donenfeld wrote:

    Why? Then we're dependent on two things, either of which could break,
    rather than one.

    See? If either of these should happen, then we'll be happy that we still
    have both hashes in our Manifest files.

    OTOH, if that argument is not relevant because the probability of both
    is close to zero, then (from a security POV) it doesn't matter which of
    the two hashes we remove.

    Ulrich

    -----BEGIN PGP SIGNATURE-----

    iQFDBAEBCAAtFiEEtDnZ1O9xIP68rzDbUYgzUIhBXi4FAmJNwe4PHHVsbUBnZW50 b28ub3JnAAoJEFGIM1CIQV4u/4YH/05vVdiCDTUoWbtnhcYtGfZ24Y2W20i7RcJb IKV7z/kFxHa4LoBQDv6LW0tqfPuXIKsXjEmGNnEUH3MyhrqjoGZg3r7LTEsm7X2d P+v74bc9ZdR8RqQoJQq+cUHJvZ3IFXW0xlJxt0HS42QuFSrDF57zbhUlCsxstKpR rqwW/3XNak8VNvP6TZb8HmrNZq69ImyKOCHwA1E0GeOYeWMrMWcI4C68ns3rCeAQ aN2fy57jpVI3q7Uaoj7l0FwlE85XjDr2/PwD1ZQWTZh0RMsOF3lk1LKU6KNry1Ep qRijs7VxaXZ1NrwXrmeJHiiCMHR+Umoql+BTJ/RlJF7nmmVfF6M=
    =Nfbs
    -----END PGP SIGNATURE-----

  • From Robin H. Johnson@21:1/5 to Jason A. Donenfeld on Wed Apr 6 19:30:01 2022
    On Wed, Apr 06, 2022 at 02:15:02AM +0200, Jason A. Donenfeld wrote:
    2) Comparability: other distros use SHA2-512, as well as various
    upstreams, which means we can compare our hashes to theirs easily.
    Can we expand on this specific thread for a moment?

    I was the author of GLEP59 about changing the Manifest hashes, and I
    noted at the time, with references, that the effective strength of a set
    of hashes is only that of the strongest hash.

    One of my regrets from GLEP59 is that it's made it harder for use cases
    outside of the normal user distfile workflow.

    The use case that impacted me the most was being able to compare our
    distfiles over time against external sources, esp. if the file goes
    missing or was fetch-restricted and we can't produce a new hash of it.
    Maybe upstream only ever published SHA1/SHA256, and we only ever
    calculated SHA512/BLAKE2b on the file. Since we never had hashes from
    both sides at the same time, we cannot prove it was the same file.

    We need to be able to ship one or more hashes to users, for the specific
    use case of validating the distfiles they download.
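For readers unfamiliar with that use case, here is a minimal sketch of validating a downloaded distfile against a Manifest `DIST` entry, which has the form `DIST <filename> <size> <HASH> <hex> ...`. It is not Portage's code. The helper names are hypothetical, only the two hash names discussed in this thread are mapped, and the size is compared first because that check is cheap.

```python
import hashlib
import os

# Manifest hash names mapped to hashlib names (only the two from this thread).
HASHLIB_NAMES = {"SHA512": "sha512", "BLAKE2B": "blake2b"}


def parse_dist_line(line):
    """Parse 'DIST <name> <size> <HASH> <hex> [<HASH> <hex> ...]'."""
    fields = line.split()
    if len(fields) < 5 or fields[0] != "DIST" or len(fields) % 2 != 1:
        raise ValueError("not a DIST entry: %r" % line)
    name, size = fields[1], int(fields[2])
    hashes = dict(zip(fields[3::2], fields[4::2]))
    return name, size, hashes


def verify_distfile(path, size, hashes):
    """Return True if the file at `path` matches the size and every listed hash."""
    # Cheap check first: a size mismatch avoids reading the file at all.
    if os.path.getsize(path) != size:
        return False
    hashers = {name: hashlib.new(HASHLIB_NAMES[name]) for name in hashes}
    with open(path, "rb") as f:
        # Read the file once and feed each chunk to every hash object.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            for h in hashers.values():
                h.update(chunk)
    return all(hashers[n].hexdigest() == hashes[n].lower() for n in hashes)
```

A hash name missing from `HASHLIB_NAMES` raises `KeyError` in this sketch. A real implementation would need a policy for unknown algorithms.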

    As a developer, I'd like to be able to track the other hashes for a
    file, without forcing ourselves to retain the file. This might be to
    compare with upstream published hashes, or to compare with other
    distros.

    In fact it would be really nice to have a semi-automated pipeline to
    plug in signed upstream hashes to our Manifests, and make it possible to
    prove our new SHA512/BLAKE2B hash was taken over the correct input in
    the first place, and there wasn't any subtle supply-chain attack early
    in the packaging process.

    Where would those hashes go? They don't need to be in the Manifest, or
    at the very least they don't need to be distributed via rsync to users
    (it only costs a small amount of bytes to do so).

    Where else could they go?
    - Commit messages could work.
    - Git notes to a lesser degree.
    - alternate repos?

    A reason why some people might prefer BLAKE2b over SHA2-512 is a
    performance improvement. However, seeing as right now we're opening
    the file, reading it, computing BLAKE2b, closing the file, opening the
    file again, reading it again, computing SHA2-512, closing the file, I
    don't think performance is actually something people care about. Seen
    differently, removing either one of them will already give us a
    performance "boost" of sorts.
    Or just only verifying the "strongest" hash gives you that boost.

    I do want to check into the code that you pointed out, because I'm
    really sure much older versions of Portage did the CORRECT thing of only reading the file in a single pass.

    --
    Robin Hugh Johnson
    Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
    E-Mail : [email protected]
    GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
    GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v2
    Comment: Robbat2 @ Orbis-Terrarum Networks - The text below is a digital signature. If it doesn't make any sense to you, ignore it.

    iQKTBAABCgB9FiEEveu2pS8Vb98xaNkRGTlfI8WIJsQFAmJNzIxfFIAAAAAALgAo aXNzdWVyLWZwckBub3RhdGlvbnMub3BlbnBncC5maWZ0aGhvcnNlbWFuLm5ldEJE RUJCNkE1MkYxNTZGREYzMTY4RDkxMTE5Mzk1RjIzQzU4ODI2QzQACgkQGTlfI8WI JsRBZxAAvvDQcg//MO7J2rPKZeeqJHUO8cYlM4AaKp+hmukqu0sSlUhihpxdlk0d eh/Lpnoo0nMsJszTtYEXsW4UYETd2UjAod7zyvYFxeALd0Ww4GUr0SpQnZ+u66gM K/teCGNwQURKJOxSHyPPgdSrLihgb7ofWxfryfbzVGHnMTHSJBleemeK+q9xrrvI rcPMjYNlobxgp/dCsQzehJdbPU/s+JIsOZeuVvmXq7vRKUyTHcuEGKbCPDg1Z4cK SNTC/YFZE9s4K67WDfCT1q0qfS4iiCopg4tUkrFJ2MHRxfIUwG+qLCvzJB5OEie6 DHvEyTkgr7Vr0fnLjLFLYi0HkPGZ4nCFgFumNZjBPJrDcOmu+airFN2hL+EmZiAz 1qD9Yc9wvEoYFfdgS/B5
  • From Jason A. Donenfeld@21:1/5 to Rich Freeman on Wed Apr 6 19:30:01 2022
    Hi Rich,

    On 4/6/22, Rich Freeman <[email protected]> wrote:
    On Tue, Apr 5, 2022 at 8:05 PM Sam James <[email protected]> wrote:
    Our security fails currently if EITHER SHA2-512 or a hardened version
    of SHA-1 are defeated. Our top gpg signature is bound to a git commit
    record by SHA2-512, and the git commit record is bound to everything
    else in the repository (including the manifest objects) by SHA-1,
    because git hasn't transitioned away from that (as far as I'm aware it
    is still a work in progress - the SHA-1 algorithm it uses is hardened
    against known attacks).

    Sort of. The security between infra and users relies on SHA2-512. The
    security between devs and infra relies on SHA-1. I guess the "full
    system" depends on both, but I've been focused on the more likely
    issue of a community-run mirror serving bogus files.

    I agree that this is an unlikely scenario, so it is a judgement call
    as to whether the ease of recovery in the event of a failure is worth
    the cost to maintain the second hash. I agree that we'd need double algorithms in the whole stack to prevent a failure, but in the current
    state we do have advantages for recovering from a failure after the
    fact.

    It seems that the likely scenario is that we get advance warning of weaknesses in a hash function, but without a practical exploit being
    readily available. In that case we could do a more orderly
    transition. We'd still save time with the double hashed manifests,
    and whether this makes a difference is hard to say.

    Yea I see this argument, but I don't quite buy it. Maintaining two
    sets of hashes for the unlikely event that one gets broken AND we
    absolutely cannot incrementally transition gradually to an unbroken
    one seems rather overblown.

    Jason

  • From Ulrich Mueller@21:1/5 to All on Wed Apr 6 20:00:01 2022
    On Wed, 06 Apr 2022, Jason A Donenfeld wrote:

    So I'll spell out the different possibilities:

    1) GPG uses SHA-512. Manifest uses SHA-512 and BLAKE2b.
    1a) Possibility: SHA-512 is broken. Result: system broken.
    1b) Possibility: BLAKE2b is broken. Result: nothing.

    2) GPG uses SHA-512. Manifest uses SHA-512.
    2a) Possibility: SHA-512 is broken. Result: system broken.
    2b) Possibility: BLAKE2b is broken. Result: nothing.

    3) GPG uses SHA-512. Manifest uses BLAKE2b.
    3a) Possibility: SHA-512 is broken. Result: system broken.
    3b) Possibility: BLAKE2b is broken. Result: system broken.

    See how from a security perspective, (2) is not worse than (1), but
    (3) is worse than both (1) and (2)?

    No it isn't. We can replace the top-level signature easily, but
    replacing all Manifest hashes in the tree is hard (i.e. 1a and 3a are
    trivial to fix, but 2a and 3b aren't).
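The recovery argument can be sketched in the same spirit. This is a minimal illustration, assuming that a broken signature hash is repaired by re-signing with a different hash, and that Manifest entries need regenerating only when no unbroken hash remains in them. The function name and the step labels are hypothetical.

```python
def recovery_steps(gpg_hash, manifest_hashes, broken_hash):
    """List the repair work needed after `broken_hash` is broken.

    Assumed model (illustrative only):
      - "re-sign": the signature used the broken hash, so it is replaced
        by a signature made with another hash.
      - "regenerate-manifests": every Manifest hash is broken, so new
        hashes must be computed over all distfiles.
    """
    steps = []
    if broken_hash == gpg_hash:
        steps.append("re-sign")
    if all(h == broken_hash for h in manifest_hashes):
        steps.append("regenerate-manifests")
    return steps
```

In this model cases 1a and 3a need only the re-signing step, while 2a and 3b require regenerating the Manifests, which is the distinction drawn above.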

    I've said this multiple times now, so I'm out of here.

    Ulrich

    -----BEGIN PGP SIGNATURE-----

    iQFDBAEBCAAtFiEEtDnZ1O9xIP68rzDbUYgzUIhBXi4FAmJN08MPHHVsbUBnZW50 b28ub3JnAAoJEFGIM1CIQV4uu5IH/0KQbQBDLIThP4UrVFAzt+vd6+JnZteGSBJ/ Q/8UQnQIG0DLXiEchBNZJyZsixhHcwtgEqz3vrrEzPtH4j6W4XxjgDjWWbT9ikv9 R/KwwGxukfeWy7oJ8dlmEx/rP99zeU6LhZ8L1PlKqonKOlxEOkGxc7ijRlTf334f AU1PRLDzqvAD8m9R7oYBFzmz8LU2uAEkkbk4BukBIXnuU7cWW07PBJCvsmTT3CK6 +jKOvUSgz4sRwz1QsvQPWK/Vpdnu2dWNGsBaRh0T3k55BUmQStdM4RPr/q2XAF9f 9SOPsTYemnA6WirIJ1q8tbphFa54KcDWI2CPyK3ZZRT1yux173c=
    =TrNP
    -----END PGP SIGNATURE-----

  • From Robin H. Johnson@21:1/5 to Jason A. Donenfeld on Wed Apr 6 19:40:01 2022
    On Wed, Apr 06, 2022 at 07:06:30PM +0200, Jason A. Donenfeld wrote:
    No, you're still missing the point.

    If SHA-512 breaks, the security of the system fails, regardless of
    what change we make. This is because GnuPG uses SHA-512 for its
    signatures.
    Question directly for you Jason, because you make a professional study
    of this: does the type of breakage/successful attack against
    SHA-512 matter?

    e.g. is it possible that some type of attack would only work against the Manifest entry, but NOT against the GPG signature's embedded SHA-512 (or
    the opposite).

    The best hypothetical idea I had was that there exists some large
    special input that lets an attacker reset the output to an arbitrary
    hash after their malicious payload: but it wouldn't fit in the GPG
    signature space.


    So I'll spell out the different possibilities:
    1) GPG uses SHA-512. Manifest uses SHA-512 and BLAKE2b.
    score -1 + 0 = -1
    2) GPG uses SHA-512. Manifest uses SHA-512.
    score -1 + 0 = -1
    3) GPG uses SHA-512. Manifest uses BLAKE2b.
    score -1 + -1 = -2
    See how from a security perspective, (2) is not worse than (1), but
    (3) is worse than both (1) and (2)?
    Yes, (2) is not worse than (1) for the overall security perspective.
    That leaves the question of whether (1) has other benefits / value
    propositions that make it worth more than (2). (see my other thread)

    --
    Robin Hugh Johnson
    Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
    E-Mail : [email protected]
    GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
    GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v2
    Comment: Robbat2 @ Orbis-Terrarum Networks - The text below is a digital signature. If it doesn't make any sense to you, ignore it.

    iQKTBAABCgB9FiEEveu2pS8Vb98xaNkRGTlfI8WIJsQFAmJNzl1fFIAAAAAALgAo aXNzdWVyLWZwckBub3RhdGlvbnMub3BlbnBncC5maWZ0aGhvcnNlbWFuLm5ldEJE RUJCNkE1MkYxNTZGREYzMTY4RDkxMTE5Mzk1RjIzQzU4ODI2QzQACgkQGTlfI8WI JsTzZg//fTqB63tBFSatv1fcY0UAAZhU1uy65020iOqcfk8gBCfzjkfwXWt+CEOw UhiuSydfD4lRcpAwRkEkYelr9vmsEC5DPBCdE51asQSvmDqc7iEZdlrE5oWY/csj BAs2n8HhPhrNniqq7WWzn6y67MiGprLBHVYb//6K+KQeQCdAcmFWp5tVvU5qv5ZS zuAENB4ID0J1XZSz7I699hNRozMQm/V4rQoR6zs8lNO67cETh3wBYL84aPrx+fuV +n8JgG15E4R37eNDc4af6tpMu4ik/F70To+67vPUnB7vZ60Nz0X2yvFfAKi+Mp5n naHx00QIJQWnaQx7ZrXLYjT2ysvxpRumOl+XP/seXz1MAUpTmTUl3qedVPkHP9mL wfseOLHMEgjf9Z04Kd/M
  • From Rich Freeman@21:1/5 to [email protected] on Wed Apr 6 20:40:01 2022
    On Wed, Apr 6, 2022 at 1:29 PM Jason A. Donenfeld <[email protected]> wrote:

    Sort of. The security between infra and users relies on SHA2-512. The security between devs and infra relies on SHA-1. I guess the "full
    system" depends on both, but I've been focused on the more likely
    issue of a community-run mirror serving bogus files.

    Well, that depends on how you're syncing the tree. If you're using
    rsync then there is a signed manifest in the root, so I agree in that
    case it is just SHA2-512. If you're syncing using git then the
    manifests only reference distfiles, and the only link between the
    commit and the tree/objects are their SHA-1 hashes until git adopts a
    different hash function.
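To make the git case concrete: in a SHA-1 repository, a file's object ID is the SHA-1 of a short header followed by the file contents. The sketch below computes that ID for a blob. It uses plain SHA-1 from `hashlib` rather than the collision-detecting variant git ships, which is assumed to produce the same digest for ordinary inputs. The function name is hypothetical.

```python
import hashlib


def git_blob_sha1(data: bytes) -> str:
    """Return the object ID git assigns to a blob in a SHA-1 repository."""
    # The hashed stream is: "blob", a space, the decimal length, a NUL byte, the data.
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()
```

For a given file the result should match what `git hash-object <file>` prints. Tree and commit objects are identified the same way, so a signed commit reaches the file contents only through this chain of SHA-1 values.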

    Yea I see this argument, but I don't quite buy it. Maintaining two
    sets of hashes for the unlikely event that one gets broken AND we
    absolutely cannot incrementally transition gradually to an unbroken
    one seems rather overblown.

    It is very much a hand-waving judgement call. This is one of those
    low cost, low risk, high reward situations IMO. The cost of
    calculating hashes is fairly low (especially if done in a more sane
    way). The odds it will ever have a benefit are low. If it does have
    a benefit, it will be in a situation where the world is on fire and
    we'll be very happy to not have to go verify a gazillion distfiles on
    top of everything else we have to fix. I'll defer to those wiser than
    me to make the call. :)

    --
    Rich

  • From Marek Szuba@21:1/5 to All on Thu Apr 7 17:30:06 2022
    On 2022-04-06 19:34, Rich Freeman wrote:

    This is one of those low cost, low risk, high reward situations IMO.

    *puts on Council hat*

    The above pretty much covers my own opinion on the subject.

    --
    Marecki

    -----BEGIN PGP SIGNATURE-----

    iQIzBAEBCgAdFiEE+MBeYVMkcD2jfqCrKMQ7KFUeMgEFAmJPAWcACgkQKMQ7KFUe MgGxCQ/9Hh5YIP2ldJzZkoDhxU/8iyJhuEdSV/F9rHNV8ZPm9rNK4Uj7vZ6BmS6M 4d7Py7F92LVwUIWdUKZxZpYpowJNMZm/INr6nOXCY5ky0Lgt4Q1vJqC/dQCog7pO pzE9BYocBrYnqs1G/eBjmWMES1VDqTb8DjzFTwvUvMoP5ovSM0M+a6F9dE0NRRBO XuDhsVwfgmyWYq1SwLEnc4WwfiDlleY80UuQkEYOig0xlOj6sx4xAIYNlwBUDE/Y 5SgBah7F5aQpuD8AWwWGxfP+2zujZPByzIIed/l1kCsTnR8iYxiWyi+zfx1/jHV1 7lnNlRe2RHVkTfcHkcmWC2lPBte9v22N3pvmfD5emr5lffAvhEpNb59OoQvx+XI5 t+fQhAUXZBRkcMr1kNS5nJRqiOMYHTJjhVazSJaMhEkFT2zKu8qztXU7q2z8rkGT b5WBuB4dWfLuqf5XarKmESo6be7RRHeGoGOEymvKzGtw+VNkY8cF0UhdpkgjH1qn owDDtkwVxy1oldJyXvYs8+rLxzR//4c/cbJ7PQjqcyzC1bsPQTtSAppGvUsTvPj6 sVgZN7KafUmY/x42jm0VTe7e6io45qFetzsrCSKX4LhTSCR+YB26HWz3J7Dh+VCC yN89sy08fR8guKZ+c44Fo2PX7h80Uppu0mBOmvp7XHJnyCN8G4A=
    =ZsDB
    -----END PGP SIGNATURE-----

  • From Joshua Kinard@21:1/5 to Jason A. Donenfeld on Tue Apr 12 01:20:01 2022
    On 4/5/2022 17:49, Jason A. Donenfeld wrote:
    Hi Matt,

    On Tue, Apr 5, 2022 at 10:38 PM Matt Turner <[email protected]> wrote:

    On Tue, Apr 5, 2022 at 12:30 PM Jason A. Donenfeld <[email protected]> wrote:
    By the way, we're not currently _checking_ two hash functions during
    src_prepare(), are we?

    I don't know, but the hash-checking is definitely checked before src_prepare().

    Er, during the builtin fetch phase. Anyway, you know what I meant. :)

    Anyway, looking at the portage source code, to answer my own question,
    it looks like the file is actually being read twice and both hashes
    computed. I would have at least expected an optimization like:

    hash1_init(&hash1);
    hash2_init(&hash2);
    for chunk in file:
        hash1_update(&hash1, chunk);
        hash2_update(&hash2, chunk);
    hash1_final(&hash1, out1);
    hash2_final(&hash2, out2);

    But actually what's happening is the even less efficient:

    hash1_init(&hash1);
    for chunk in file:
        hash1_update(&hash1, chunk);
    hash1_final(&hash1, out1);
    hash2_init(&hash2);
    for chunk in file:
        hash2_update(&hash2, chunk);
    hash2_final(&hash2, out2);

    So the file winds up being open and read twice. For huge tarballs like chromium or libreoffice...

    But either way you do it - the missed optimization above or the
    unoptimized reality below - there's still twice as much work being
    done. This is all unless I've misread the source code, which is
    possible, so if somebody knows this code well and I'm wrong here,
    please do speak up.

    Not to go off-topic, but where in Portage's source is this logic at? It
    seems like an easy fix for a slightly more efficient Portage.

    --
    Joshua Kinard
    Gentoo/MIPS
    [email protected]
    rsa6144/5C63F4E3F5C6C943 2015-04-27
    177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

    "The past tempts us, the present confuses us, the future frightens us. And
    our lives slip away, moment by moment, lost in that vast, terrible in-between."

    --Emperor Turhan, Centauri Republic

  • From Mike Gilbert@21:1/5 to [email protected] on Tue Apr 12 14:50:01 2022
    On Mon, Apr 11, 2022 at 7:14 PM Joshua Kinard <[email protected]> wrote:

    On 4/5/2022 17:49, Jason A. Donenfeld wrote:
    Hi Matt,

    On Tue, Apr 5, 2022 at 10:38 PM Matt Turner <[email protected]> wrote:

    On Tue, Apr 5, 2022 at 12:30 PM Jason A. Donenfeld <[email protected]> wrote:
    By the way, we're not currently _checking_ two hash functions during
    src_prepare(), are we?

    I don't know, but the hash-checking is definitely checked before src_prepare().

    Er, during the builtin fetch phase. Anyway, you know what I meant. :)

    Anyway, looking at the portage source code, to answer my own question,
    it looks like the file is actually being read twice and both hashes computed. I would have at least expected an optimization like:

    hash1_init(&hash1);
    hash2_init(&hash2);
    for chunk in file:
        hash1_update(&hash1, chunk);
        hash2_update(&hash2, chunk);
    hash1_final(&hash1, out1);
    hash2_final(&hash2, out2);

    But actually what's happening is the even less efficient:

    hash1_init(&hash1);
    for chunk in file:
        hash1_update(&hash1, chunk);
    hash1_final(&hash1, out1);
    hash2_init(&hash2);
    for chunk in file:
        hash2_update(&hash2, chunk);
    hash2_final(&hash2, out2);

    So the file winds up being open and read twice. For huge tarballs like chromium or libreoffice...

    But either way you do it - the missed optimization above or the
    unoptimized reality below - there's still twice as much work being
    done. This is all unless I've misread the source code, which is
    possible, so if somebody knows this code well and I'm wrong here,
    please do speak up.

    Not to go off-topic, but where in Portage's source is this logic at? It seems like an easy fix for a slightly more efficient Portage.

    I believe it's the portage.checksum.verify_all() function.

    https://gitweb.gentoo.org/proj/portage.git/tree/lib/portage/checksum.py?h=portage-3.0.30#n471
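For reference, the single-pass variant from the quoted pseudocode is short to write in Python, the language Portage is written in. This is a minimal sketch and not the code in `portage.checksum`. The function name, the default algorithm list, and the chunk size are assumptions made for illustration.

```python
import hashlib


def hash_file_single_pass(path, algorithms=("sha512", "blake2b"), chunk_size=1 << 20):
    """Read `path` once and feed every chunk to all requested hash objects."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for h in hashers.values():
                h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}
```

This removes the second open and read of the file. The hashing work itself is still done once per algorithm, as the quoted message points out.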

  • From Robin H. Johnson@21:1/5 to Robin H. Johnson on Wed Apr 20 02:10:02 2022
    On Wed, Apr 06, 2022 at 05:23:25PM +0000, Robin H. Johnson wrote:
    On Wed, Apr 06, 2022 at 02:15:02AM +0200, Jason A. Donenfeld wrote:
    2) Comparability: other distros use SHA2-512, as well as various
    upstreams, which means we can compare our hashes to theirs easily.
    Can we expand on this specific thread for a moment?

    I was the author of GLEP59 about changing the Manifest hashes, and I
    noted at the time, with references, that the effective strength of a set
    of hashes is only that of the strongest hash.
    Bump for my parent message; I'm very surprised at the lack of
    responses to my two messages in this thread.

    https://archives.gentoo.org/gentoo-dev/message/18216da0128ee79733fa68bb77fa8b69
    https://archives.gentoo.org/gentoo-dev/message/a9974ec34dfb25810dab47e3fa322a52

    --
    Robin Hugh Johnson
    Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
    E-Mail : [email protected]
    GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
    GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136

  • From Jason A. Donenfeld@21:1/5 to Robin H. Johnson on Wed Apr 20 16:00:01 2022
    Hey Robin,

    Sorry for the delay in getting back to you. As mentioned on IRC, both of
    your messages bounced earlier, and I was at a conference all last week. Catching up with this thread now...

    On Wed, Apr 06, 2022 at 05:23:25PM +0000, Robin H. Johnson wrote:
    On Wed, Apr 06, 2022 at 02:15:02AM +0200, Jason A. Donenfeld wrote:
    2) Comparability: other distros use SHA2-512, as well as various
    upstreams, which means we can compare our hashes to theirs easily.
    Can we expand on this specific thread for a moment?

    I was the author of GLEP59 about changing the Manifest hashes, and I
    noted at the time, with references, that the effective strength of a set
    of hashes is only that of the strongest hash.

    One of my regrets from GLEP59 is that it's made it harder for use cases outside of the normal user distfile workflow.

    The use case that impacted me the most was being able to compare our
    distfiles over time against external sources, esp. if the file goes
    missing or was fetch-restricted and we can't produce a new hash of it.
    Maybe upstream only ever published SHA1/SHA256, and we only ever
    calculated SHA512/BLAKE2b on the file. Since we never had hashes from
    both sides at the same time, we cannot prove it was the same file.

    We need to be able to ship one or more hashes to users, for the specific
    use case of validating the distfiles they download.

    As a developer, I'd like to be able to track the other hashes for a
    file, without forcing ourselves to retain the file. This might be to
    compare with upstream published hashes, or to compare with other
    distros.

    In fact it would be really nice to have a semi-automated pipeline to
    plug in signed upstream hashes to our Manifests, and make it possible to
    prove our new SHA512/BLAKE2B hash was taken over the correct input in
    the first place, and there wasn't any subtle supply-chain attack early
    in the packaging process.

    Where would those hashes go? They don't need to be in the Manifest, or
    at the very least they don't need to be distributed via rsync to users
    (it only costs a small amount of bytes to do so).

    Where else could they go?
    - Commit messages could work.
    - Git notes to a lesser degree.
    - alternate repos?

    Interesting idea. This seems orthogonal to my proposal ("just use one
    hash in the manifest and call it a day; make it the same as what gpg
    uses for signing to minimize moving pieces"), and so I'm hesitant to
    indulge too much in this thread, for fear of it being derailed with this different thing you want.

    With that said, I'm not quite sure I understood everything you're asking
    for. You said that you want "to have a semi-automated pipeline to plug
    in signed upstream hashes to our Manifests, and make it possible to
    prove our new SHA512/BLAKE2B hash was taken over the correct input", but
    at the same time you also said that you want "to be able to track the
    other hashes for a file, without forcing ourselves to retain the file."
    What I'm wondering is: how do you propose that we calculate a SHA-512
    hash of a file and "prove it correct" using, e.g., a signed SHA-256
    hash, if we don't download the whole file?
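One way to read the pipeline idea is that the file is fetched once, and during that single read both the upstream-published digest and the new digest are computed over the same bytes. The sketch below illustrates that reading only. The function name and the chunk-iterable interface are hypothetical, and checking the upstream signature itself is out of scope.

```python
import hashlib


def bridge_digest(chunks, upstream_algo, upstream_hex, target_algo="sha512"):
    """Return the `target_algo` digest only if the same bytes match upstream's digest.

    `chunks` is any iterable of bytes objects, e.g. the pieces of a download.
    """
    upstream = hashlib.new(upstream_algo)
    target = hashlib.new(target_algo)
    for chunk in chunks:
        # Both hash objects see exactly the same byte stream.
        upstream.update(chunk)
        target.update(chunk)
    if upstream.hexdigest() != upstream_hex.lower():
        raise ValueError("upstream %s digest mismatch" % upstream_algo)
    return target.hexdigest()
```

The whole file still has to pass through once. What can be dropped afterwards is the file, provided both digests are recorded together.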

    It sounds like the thing that would be interesting to you would be for
    infra to manage some sort of master hash database collecting all the
    hashes from all over the internet of every file that hits distfiles,
    verifying and then generating a bunch more hash variants of all kinds,
    and then cross-verifying those with the hashes extracted from every
    other distro, making for a wild hash verification aggregator machine. I
    think I can see the utility of it. It would also unburden manifest
    files, as those could then just have a SHA-512 hash and nothing else,
    making things a bit lighter.


    A reason why some people might prefer BLAKE2b over SHA2-512 is a performance improvement. However, seeing as right now we're opening
    the file, reading it, computing BLAKE2b, closing the file, opening the
    file again, reading it again, computing SHA2-512, closing the file, I
    don't think performance is actually something people care about. Seen
    differently, removing either one of them will already give us a
    performance "boost" of sorts.
    Or just only verifying the "strongest" hash gives you that boost.

    I do want to check into the code that you pointed out, because I'm
    really sure much older versions of Portage did the CORRECT thing of only reading the file in a single pass.

    Let me know if your findings are different from mine...

    Jason

  • From Jason A. Donenfeld@21:1/5 to Robin H. Johnson on Wed Apr 20 18:40:01 2022
    Hi Robin,

    On Wed, Apr 06, 2022 at 05:31:09PM +0000, Robin H. Johnson wrote:
    On Wed, Apr 06, 2022 at 07:06:30PM +0200, Jason A. Donenfeld wrote:
    No, you're still missing the point.

    If SHA-512 breaks, the security of the system fails, regardless of
    what change we make. This is because GnuPG uses SHA-512 for its
    signatures.
    Question directly for you Jason, because you make a professional study
    of this: does the type of breakage/successful attack against
    SHA-512 matter?

    e.g. is it possible that some type of attack would only work against the Manifest entry, but NOT against the GPG signature's embedded SHA-512 (or
    the opposite).

    The best hypothetical idea I had was that there exists some large
    special input that lets an attacker reset the output to an arbitrary
    hash after their malicious payload: but it wouldn't fit in the GPG
    signature space.

    Generally speaking, the more control an attacker has over the input, the
    easier certain types of attacks might be. So maybe in the most general
    sense that applies. I wouldn't model a security analysis around that,
    though. Rather, the usual way to apply that sort of thinking is to
    design algorithms that rely on certain properties of hash functions, but
    not others; for example, Ed25519 does not rely on the hash function
    being collision resistant due to its construction.

    Jason
