• Seconding the General Resolution to Deploy Tag2upload: supporting the i

    From Sam Hartman@21:1/5 to All on Thu Jun 27 22:40:01 2024
    "Sean" == Sean Whitton <[email protected]> writes:

    Sean> ===== BEGIN FORMAL RESOLUTION TEXT

    Sean> tag2upload allows DDs and DMs to upload simply by using the
    Sean> git-debpush(1) script to push a signed git tag.

    Sean> 1. tag2upload, in the form designed and implemented by Sean
    Sean> Whitton and Ian Jackson, and design reviewed by Jonathan
    Sean> McDowell and Russ Allbery, should be deployed to official
    Sean> Debian infrastructure.

    Sean> 2. Under Constitution §4.1(3), we overrule the ftpmaster
    Sean> delegate's decision: the Debian Archive should be configured
    Sean> to accept and trust uploads from the tag2upload service.

    Sean> 3. Future changes to tag2upload should follow normal Debian
    Sean> processes.

    Sean> 4. Nothing in this resolution should be taken as requiring
    Sean> maintainers to use any particular git or salsa workflows.

    Sean> END FORMAL RESOLUTION TEXT =====


    Seconded.
    I realize I asked you to wait, but Russ's message [email protected] has convinced me I was wrong.

    In addition, I would like to respond to those who claim that a GR is inappropriate, or that ftpmaster should be given more time.

    Back in 2019, you and Ian approached ftpmaster asking for tag2upload
    support. There was an extensive debian-devel thread, and there was a fair
    bit of private discussion.

    At that time, I was DPL. You followed a reasonable process. You tried
    to understand their requirements. It became clear to you and to me [1]
    as a neutral party that your design was not going to be able to meet
    their requirements.

    [1]: I did support the idea of replacing Debian source packages with
    git in my DPL platform. Amusingly that was in response to a proposal
    Joerg made. I was not a specific proponent of either tag2upload or
    ftpmaster's concerns. I did support exploring tag2upload as a promising
    option.

    Ftpmaster did decide. No, perhaps it was not a formally announced
    decision. Perhaps it was only a blocking decision of multiple individual members rather than a team. No one proposed an option for how you could
    ask for a formal decision. Other members of ftpmaster could have
    indicated how you could ask for a collaborative design. They could have proposed ways to have collaborative design. They could have, but did not.
    They blocked your work.

    My takeaway was that they believed that essential security properties
    they cared about were incompatible with your design and there was no way forward. Even getting requirements out of ftpmaster was difficult. I had
    to intervene as DPL and talk about the importance of delegated teams
    working with the project as a whole. (It is not the DPL’s job to second
    guess a team’s decision, but it is the DPL’s job to prod teams when balls are getting dropped or when teams aren’t working to resolve cross-team issues.)

    Let’s be honest, personalities were involved. It looked fairly clearly
    like there were people on ftpmaster who didn’t want to work with Ian on
    this issue. I get that: I’ve been frustrated working with Ian before, and
    I know he’s been frustrated with me.

    There are solutions to allow things to move forward even when
    personality conflicts get in the way. Other members of ftpmaster could have worked with you. They could have even insisted that if tag2upload was going to move forward, you
    needed to get someone else involved to work on the design with them. (You
    might not have liked that, but tag2upload has enough support that if
    getting another designer involved was what it took to make progress, that
    could have happened.)

    Instead, there was silence. You worked with ftpmaster. They said no. That’s fine. There’s nothing wrong with that. These things are always
    complicated. It was probably a combination of believing that the
    security concerns they had were paramount, insufficient hours in the
    day, and emotionally draining discussions. All that is part of life in
    Debian. What should not be part of life in Debian is those combinations blocking work with no recourse to get a broader opinion.

    The project needs to be able to balance the decisions of teams
    against other competing interests. We delegate responsibility to our
    teams to move forward based on the overall needs of the project, not on
    the needs and desires of the delegates. Yes, delegates get a lot of
    discretion because they are doing the work, because we value being a
    do-ocracy, and because volunteer work is fun when we have the flexibility
    to do our jobs in a manner that appeals to us.

    For Debian to remain vibrant and fun, we need to have ways to resolve
    conflicts when one person’s doing crosses ways with another person’s
    doing. We need to have ways to ask a broader group when ftpmaster’s
    concerns about archive signatures run against your desire to make Debian easier to contribute to and to provide better traceability of our source
    packages.

    It turns out we do have ways of addressing that. They include GRs, the technical committee setting policy (such as policy on archive security requirements), and the DPL making delegations. We also have other less
    formal approaches.
    At any point, ftpmaster could have proposed a time-bounded decision
    strategy for resolving the conflict they were more comfortable with. You
    chose GR.

    Back in 2019, I told Ian there was not sufficient support for a formal
    process to break the conflict. In the debian-devel threads, it looked
    like not enough people understood the issues. Also, we’d had a lot of
    formal discussion already in 2019, and there’s only so much change that
    is good for the project in a year.
    I recommended working to increase the number of people who were familiar
    with tag2upload and the issues involved.

    That has clearly happened. Many people have participated in the discussion displaying an informed understanding of the issues. Today, we are ready to
    make an informed decision about the issues involved.

    Also, in the past five years, someone else who wanted a different git-based
    solution could have worked on it and proposed a prototype. Heck, they
    could have even taken your code as a starting point: it’s free software
    after all.

    That can still happen. If a year from now, we have a different way to do
    git pushes into the archive, we can revisit whether trusting tag2upload
    makes sense.

    For all these reasons and more, I think now is the time to make this
    decision, and I think a GR is an appropriate mechanism to do so.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Joerg Jaspert@21:1/5 to Sean Whitton on Sun Jun 30 01:20:01 2024
    On 17273 March 1977, Sean Whitton wrote:

    The ftpmaster team have refused to trust uploads coming from the
    tag2upload service. This GR is to override that decision.

    It took us a few days to discuss and get to a point, but hey, where is the
    hurry anyways? It's not like we are time bound in implementing this.

    Now, we have the following proposal on how to get t2u integrated. Note,
    we are not entirely happy with it and do not think this is the best way forward, but given the current situation, it is a way that gets things untangled, and then we see what the future will bring.

    So, in short: A t2u uploaded source package should consist of whatever
    t2u produces (normal Debian source package) *plus* two additional files.

    The first file contains client side generated data, but to *not*
    overburden the client, this *only* consists of the output of `git
    ls-files --format="%(objectmode) %(objectname) %(path)"` for the tag
    that should be uploaded, signed by the DD/DM key - or something
    similarly easily generated on client side. Exact format can be hashed
    out between t2u people and ftpmaster during implementation.

    The second file consists of a shallow git clone of the repository for
    the tag that t2u wants to upload, put into an appropriately named
    tarball.

    We believe that this, while not being ideal, serves both sides: t2u has
    their "just git and similar anyways-available-tools on client", while
    the archive gets enough uploaded to independently do whatever it wants
    to do.
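    A minimal sketch of how the first file could be read back on the archive side, assuming it holds one `%(objectmode) %(objectname) %(path)` entry per line exactly as the command above prints it. The name `parse_manifest` and the sample object names are made up for illustration; the real format is still to be worked out, and paths that git would quote (unusual characters) are not handled here.

```python
def parse_manifest(text):
    """Map each path to its (mode, object name) pair.

    Assumes one "<mode> <objectname> <path>" entry per line.
    """
    entries = {}
    for line in text.splitlines():
        if not line:
            continue
        # Split on the first two spaces only, since a path may contain spaces.
        mode, objectname, path = line.split(" ", 2)
        entries[path] = (mode, objectname)
    return entries


# Hypothetical listing; the object names are placeholders, not real hashes.
SAMPLE = (
    "100644 1111111111111111111111111111111111111111 debian/control\n"
    "100755 2222222222222222222222222222222222222222 debian/rules\n"
    "100644 3333333333333333333333333333333333333333 docs/read me.txt\n"
)

manifest = parse_manifest(SAMPLE)
```

    Keying the result by path keeps later lookups simple when the listing is compared against another tree.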

    Obvious (ha) note: The above mentions a command and stuff, but the exact
    way is up to implementation. It can even contain more data if deemed
    useful or add a file from t2u with a list of what it added/generated, for
    example. Or - in the future - might be completely replaced with another implementation we all agree on, provided it has similar minimal basic requirements as the thing proposed here. (Summarized as detached
    signature from the DD/DM over the tree of the tag, and that tree/tags
    data available in a file beside the upload).

    --
    for the FTPTeam,
    Joerg,
    Thorsten,
    Ansgar,
    Scott,
    Luke

  • From Russ Allbery@21:1/5 to Joerg Jaspert on Sun Jun 30 18:00:02 2024
    Joerg Jaspert <[email protected]> writes:

    Now, we have the following proposal on how to get t2u integrated. Note,
    we are not entirely happy with it and do not think this is the best way forward, but given the current situation, it is a way that gets things untangled, and then we see what the future will bring.

    Thank you very much for putting this together! I know how hard it is to coordinate a bunch of voices and turn them into something concrete. This
    is incredibly helpful and I really appreciate it.

    After reading this over and thinking about it for a bit, I had a few
    questions just to make sure I fully understood the proposal.

    So, in short: A t2u uploaded source package should consist of whatever
    t2u produces (normal Debian source package) *plus* two additional files.

    The first file contains client side generated data, but to *not*
    overburden the client, this *only* consists of the output of `git
    ls-files --format="%(objectmode) %(objectname) %(path)"` for the tag
    that should be uploaded, signed by the DD/DM key - or something
    similarly easily generated on client side. Exact format can be hashed
    out between t2u people and ftpmaster during implementation.

    You describe the contents here, but not the semantics, and I'm not sure
    that I fully understand what the intended semantics are (in other words,
    what packages dak will accept under this proposal). Would source packages
    that contain additional files not represented in this list of files and
    hashes be accepted, for example?

    Here's one specific concrete example for one workflow using tag2upload
    (there are other variations): suppose I, as the uploader, tag a Git tree
    that is patches-applied with no patches in debian/patches/*. I run the
    above command and include that signed data somewhere where tag2upload can
    get at it via the Git tag. tag2upload then turns that into a 3.0 (quilt) package and uploads that package along with the information as requested, including that list of files and hashes that I signed from the tree that I tagged.

    When dak sees the package, all of the files in debian/* in the source
    package will have the same hashes as in the git ls-files output, but the
    source package will have additional files in debian/patches/* that do not
    exist in the git ls-files output. Some of the upstream files will have
    hashes in the git ls-files output that match the contents of those files
    after unpacking the source package (and thus applying patches), but will
    not match the hashes of those files as they exist in the upstream
    orig.tar.gz.

    In this proposal, would dak be willing to accept such a package?
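    To make the question concrete, here is a rough sketch of the comparison a checker could run. It assumes a SHA-1 repository, where the object name of a file is the SHA-1 of a `blob <size>` header, a NUL byte, and the file content. The helper names and the tiny file set are hypothetical, and nothing here reflects how dak actually works.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Object name git gives a blob with this content (SHA-1 repositories)."""
    return hashlib.sha1(b"blob %d\0" % len(data) + data).hexdigest()


def compare(manifest, files):
    """Sort paths into matching / differing / extra relative to the manifest.

    manifest: {path: object name signed by the uploader}
    files:    {path: file content (bytes) as seen by the checker}
    """
    result = {"matching": [], "differing": [], "extra": []}
    for path in sorted(files):
        if path not in manifest:
            result["extra"].append(path)
        elif manifest[path] == git_blob_id(files[path]):
            result["matching"].append(path)
        else:
            result["differing"].append(path)
    return result


# Hypothetical patches-applied tree that the uploader tagged.
tagged = {"debian/control": b"Source: demo\n", "src/main.c": b"patched\n"}
manifest = {path: git_blob_id(data) for path, data in tagged.items()}

# Unpacked 3.0 (quilt) package: patches applied, debian/patches/* added.
unpacked = {**tagged, "debian/patches/fix.patch": b"--- a\n+++ b\n"}
# Upstream file as it sits in orig.tar.gz, before any patch is applied.
orig_view = {"src/main.c": b"pristine\n"}

print(compare(manifest, unpacked))
print(compare(manifest, orig_view))
```

    In the first comparison every tagged file matches and debian/patches/fix.patch shows up as extra; in the second, src/main.c is reported as differing. Those are the two situations the question is about.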

    There are a few other, similar sorts of cases. Perhaps the other that's
    the most interesting is the case where tag2upload has to build the
    orig.tar.gz file because this is a new upstream release. The contents of
    that tarball will be identical to the Git tree referenced by the upstream
    hash in the signed Git tag. Ignoring, for now, the case of a repository
    that contains only the debian/* files, the file hashes in the git ls-files output will *either* match the file hashes in the orig.tar.gz file *or*
    the file hashes of the corresponding unpacked source package, but those
    two hashes may be different for specific upstream files that are patched
    by Debian.

    The case of a repository that contains only the debian/* files poses
    another set of complications, but I don't think we have to get into that immediately. The above examples are probably enough to work through to understand what the intended semantics of this manifest is.

    The second file consists of a shallow git clone of the repository for
    the tag that t2u wants to upload, put into an appropriately named
    tarball.

    Just to double check, to make sure I'm not missing some subtlety, it's intentional that this file contains all of the same information as in the
    first file, and the first file is just a subset of this same information
    in a different form?

    In other words, someone could verify the signature on the Git tag in this
    file and then run the git ls-files command on the Git repository and get exactly the same information as in the first file, so the first file is technically redundant. I can think of some reasons why you might want
    that, but it's a little surprising, so I wanted to make sure that's intentional.

    Obvious (ha) note: The above mentions a command and stuff, but the exact
    way is up to implementation. It can even contain more data if deemed
    useful or add a file from t2u with a list of what it added/generated, for example. Or - in the future - might be completely replaced with another implementation we all agree on, provided it has similar minimal basic requirements as the thing proposed here. (Summarized as detached
    signature from the DD/DM over the tree of the tag, and that tree/tags
    data available in a file beside the upload).

    Oh, yeah, for sure, none of the details now should block future
    improvements worked out in the normal way.

    --
    Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>

  • From Ian Jackson@21:1/5 to Russ Allbery on Sun Jun 30 18:20:02 2024
    Joerg, thanks for trying to work with us to find a way forward.

    Russ Allbery writes ("Re: t2u in the archive"):
    Thank you very much for putting this together! I know how hard it is to coordinate a bunch of voices and turn them into something concrete. This
    is incredibly helpful and I really appreciate it.

    After reading this over and thinking about it for a bit, I had a few questions just to make sure I fully understood the proposal.

    Thanks for asking these questions. These are very similar to the
    questions I would ask. So I won't repeat them.

    Also I need a bit of time to digest this proposal.

    Thanks,
    Ian.

    --
    Ian Jackson <[email protected]> These opinions are my own.

    Pronouns: they/he. If I emailed you from @fyvzl.net or @evade.org.uk,
    that is a private address which bypasses my fierce spamfilter.

  • From Aigars Mahinovs@21:1/5 to Russ Allbery on Sun Jun 30 19:50:01 2024
    On Sun, 30 Jun 2024 at 19:28, Russ Allbery <[email protected]> wrote:

    Aigars Mahinovs <[email protected]> writes:
    Correct me if I'm wrong, but I believe the intention is to have two technically redundant data points saved into the archive:

    1) checksums of the contents of the shallow copy git tree in the
    maintainer work folder (signed by the maintainer)
    2) contents of the shallow copy git tree in the t2u server work folder (signed by t2u)

    Oh! Did I misunderstand Joerg's second point entirely? By "the tag that
    t2u wants to upload," I assumed that meant the tag the uploader signed or,
    in other words, the state of the tree *before* t2u started doing its work that has the uploader signature attached.

    I do not see that in what either Joerg or I wrote. And I also don't
    see much sense in that.

    In contrast, having a tarball of the git state *before* t2u starts its
    work would provide a tarball that *can* be verified against the
    checksums from the first file. That will give you a clear data point -
    t2u started its work with exactly the same workspace as the
    maintainer signed. And will provide a frozen copy of that starting
    workspace in the archive independent of the (more complex) dgit
    service.

    --
    Best regards,
    Aigars Mahinovs mailto:[email protected]
    #--------------------------------------------------------------#
    | .''`. Debian GNU/Linux (http://www.debian.org) |
    | : :' : Latvian Open Source Assoc. (http://www.laka.lv) |
    | `. `' Linux Administration and Free Software Consulting |
    | `- (http://www.aiteki.com) |
    #--------------------------------------------------------------#

  • From Russ Allbery@21:1/5 to Aigars Mahinovs on Sun Jun 30 19:30:01 2024
    Aigars Mahinovs <[email protected]> writes:
    On Sun, 30 Jun 2024 at 17:58, Russ Allbery <[email protected]> wrote:

    The second file consists of a shallow git clone of the repository for
    the tag that t2u wants to upload, put into an appropriately named
    tarball.

    Just to double check, to make sure I'm not missing some subtlety, it's
    intentional that this file contains all of the same information as in
    the first file, and the first file is just a subset of this same
    information in a different form?

    In other words, someone could verify the signature on the Git tag in
    this file and then run the git ls-files command on the Git repository
    and get exactly the same information as in the first file, so the first
    file is technically redundant. I can think of some reasons why you
    might want that, but it's a little surprising, so I wanted to make sure
    that's intentional.

    Correct me if I'm wrong, but I believe the intention is to have two technically redundant data points saved into the archive:

    1) checksums of the contents of the shallow copy git tree in the
    maintainer work folder (signed by the maintainer)
    2) contents of the shallow copy git tree in the t2u server work folder (signed by t2u)

    Oh! Did I misunderstand Joerg's second point entirely? By "the tag that
    t2u wants to upload," I assumed that meant the tag the uploader signed or,
    in other words, the state of the tree *before* t2u started doing its work
    that has the uploader signature attached.

    If the intent is to have it be the tree *after* t2u has done its work (in
    other words, the same patches-applied tree with debian/patches/* populated
    that t2u would push to dgit-repos), that's something a bit different.

    Personally, I think if we're including a Git bundle anyway, we should
    create a shallow clone that includes all of those things: the Git tag the maintainer signed, the Git tag that t2u pushed to dgit-repos, and the Git
    tag of the upstream tree. It's a Git bundle, it's easy to include more
    things.

    The only suggestion I would have here would be to have the shallow git
    clone on the t2u side have a variable depth that is selected so that the commits in the resulting depth are sufficient for the source package construction, like in case of a rebase workflow you'd need to have git history deep enough to include all Debian patches and the last upstream commit.

    Yes, that's also an interesting idea.

    --
    Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>

  • From Scott Kitterman@21:1/5 to All on Sun Jun 30 14:46:34 2024
    On Sunday, June 30, 2024 1:45:15 PM EDT Aigars Mahinovs wrote:
    On Sun, 30 Jun 2024 at 19:28, Russ Allbery <[email protected]> wrote:
    Aigars Mahinovs <[email protected]> writes:
    Correct me if I'm wrong, but I believe the intention is to have two technically redundant data points saved into the archive:

    1) checksums of the contents of the shallow copy git tree in the maintainer work folder (signed by the maintainer)
    2) contents of the shallow copy git tree in the t2u server work folder (signed by t2u)

    Oh! Did I misunderstand Joerg's second point entirely? By "the tag that t2u wants to upload," I assumed that meant the tag the uploader signed or, in other words, the state of the tree *before* t2u started doing its work that has the uploader signature attached.

    I do not see that in what either Joerg or I wrote. And I also don't
    see much sense in that.

    In contrast, having a tarball of the git state *before* t2u starts its
    work would provide a tarball that *can* be verified against the
    checksums from the first file. That will give you a clear data point -
    t2u started its work with exactly the same workspace as the
    maintainer signed. And will provide a frozen copy of that starting
    workspace in the archive independent of the (more complex) dgit
    service.

    It's one at the point the maintainer signed the tag.

    Scott K

  • From Russ Allbery@21:1/5 to Aigars Mahinovs on Sun Jun 30 21:00:01 2024
    Aigars Mahinovs <[email protected]> writes:

    In contrast, having a tarball of the git state *before* t2u starts its
    work would provide a tarball that *can* be verified against the
    checksums from the first file. That will give you a clear data point -
    t2u started its work with exactly the same workspace as the
    maintainer signed. And will provide a frozen copy of that starting
    workspace in the archive independent of the (more complex) dgit service.

    Oh, okay, that's what I thought Joerg was saying, and I misunderstood your message. So yes, the two files are technically redundant (I think they're
    both signed by t2u since presumably they're in *.changes).

    --
    Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>

  • From Joerg Jaspert@21:1/5 to Aigars Mahinovs on Sun Jun 30 21:40:02 2024
    On 17276 March 1977, Aigars Mahinovs wrote:

    The only suggestion I would have here would be to have the shallow git
    clone on the t2u side have a variable depth that is selected so that
    the commits in the resulting depth are sufficient for the source
    package construction, like in case of a rebase workflow you'd need to
    have git history deep enough to include all Debian patches and the
    last upstream commit.

    "A shallow clone that ensures whatever t2u does for magic can be
    reproduced". or something like it.

    --
    bye, Joerg

  • From Matthias Urlichs@21:1/5 to All on Sun Jun 30 21:40:02 2024
    On 30.06.24 19:28, Russ Allbery wrote:
    Oh! Did I misunderstand Joerg's second point entirely? By "the tag that
    t2u wants to upload," I assumed that meant the tag the uploader signed or,
    in other words, the state of the tree *before* t2u started doing its work
    that has the uploader signature attached.

    Given that the uploader basically signed an entire git tree, adding
    additional tarballs is rather redundant IMHO. Also, requiring a "fat"
    upload would preclude the "uploader behind an unreliable, molasses-slow,
    or heavily-metered network connection" use cases which the t2u design
    enables (and which I'd insist on, given the mobile networking situation
    in non-urban areas of quite a few ostensibly-first-world-countries – not
    to mention the rest of the planet).

    Including a "git ls-files" output from the client is something I can
    sortof live with, given that it'll compress quite well. Frankly I don't
    see the security advantage of such a file. On the other hand, if it's a
    case of "meet the ftpmasters' unease about the whole thing halfway", and
    if that's what it takes to avoid a GR – time that's better spent
    actually deploying t2u, – I'm +0 on the idea.

    --
    -- regards
    --
    -- Matthias Urlichs

  • From Aigars Mahinovs@21:1/5 to Scott Kitterman on Sun Jun 30 21:40:02 2024
    On Sun, 30 Jun 2024 at 20:47, Scott Kitterman <[email protected]> wrote:

    On Sunday, June 30, 2024 1:45:15 PM EDT Aigars Mahinovs wrote:
    On Sun, 30 Jun 2024 at 19:28, Russ Allbery <[email protected]> wrote:
    Aigars Mahinovs <[email protected]> writes:
    Correct me if I'm wrong, but I believe the intention is to have two technically redundant data points saved into the archive:

    1) checksums of the contents of the shallow copy git tree in the maintainer work folder (signed by the maintainer)
    2) contents of the shallow copy git tree in the t2u server work folder (signed by t2u)

    Oh! Did I misunderstand Joerg's second point entirely? By "the tag that t2u wants to upload," I assumed that meant the tag the uploader signed or,
    in other words, the state of the tree *before* t2u started doing its work that has the uploader signature attached.

    I do not see that in what either Joerg or I wrote. And I also don't
    see much sense in that.

    In contrast, having a tarball of the git state *before* t2u starts its
    work would provide a tarball that *can* be verified against the
    checksums from the first file. That will give you a clear data point -
    t2u started its work with exactly the same workspace as the
    maintainer signed. And will provide a frozen copy of that starting workspace in the archive independent of the (more complex) dgit
    service.

    It's one at the point the maintainer signed the tag.

    Yes, but I would like to point out the difference. It's where and when
    this tarball is created.

    The Debian developer/maintainer creates a signed git tag that contains
    (in its message, presumably, to avoid adding new communication lines)
    the file listing of the git checkout at the point of signing
    (including file names, modes and short SHA checksum hashes). This
    extra content is added at the end of the tag message, after the other
    metadata that t2u proposal/code already defines. This tag is pushed to
    a git server that t2u is monitoring. No files or tarballs are created
    at this point yet. No files or tarballs leave the developer's
    computer. Only the signed tag with a (now quite long)
    message.

    t2u starts its work by checking out the git tag and saving two files:
    1) the tag message with the corresponding Debian developer/maintainer signature, so it can be checked outside of git (can this be done
    easily?)
    2) tarball of the git clone before anything is done to it - filesystem
    tree in this tarball will be (presumably) the same as what was in the
    workspace of the maintainer when they signed the tag

    Both t2u and dak may check this equivalence at any time in the
    process. The contents of the listing in the tag *should* match the
    file tree content of the checkout that t2u gets from git. If that does
    not match, then something fishy is going on and the upload should be
    aborted.
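    The check described above might look roughly like this. It assumes the uploader's signature on the tag has already been verified and that both listings use the same line format. `ManifestMismatch` and `check_equivalence` are names invented for this sketch, not anything that exists in t2u or dak.

```python
class ManifestMismatch(Exception):
    """Raised when the signed listing and the checkout disagree."""


def check_equivalence(signed_listing, checkout_listing):
    """Abort (raise) unless both listings contain the same set of lines."""
    signed = set(signed_listing.splitlines())
    actual = set(checkout_listing.splitlines())
    if signed == actual:
        return
    only_signed = sorted(signed - actual)
    only_checkout = sorted(actual - signed)
    raise ManifestMismatch(
        f"only in signed listing: {only_signed}; "
        f"only in checkout: {only_checkout}"
    )


# Hypothetical listings with placeholder object names.
SIGNED = "100644 aaaa debian/control\n100755 bbbb debian/rules\n"
TAMPERED = "100644 aaaa debian/control\n100755 cccc debian/rules\n"

check_equivalence(SIGNED, SIGNED)  # same content: returns quietly
try:
    check_equivalence(SIGNED, TAMPERED)
except ManifestMismatch as err:
    print("upload aborted:", err)
```

    Comparing whole lines means a changed mode, a changed hash, or a renamed file all count as a mismatch, which fits the "something fishy" rule: any difference stops the upload.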

    After that t2u does all its processing, pushes the unified git tree to
    dgit, constructs the source packages, signs the source package and the
    two new files and sends it all to dak.

    IMHO that all makes sense and does not block anything while providing
    useful features, like a frozen git snapshot in the archive. Please
    correct me if I got anything wrong. This is getting into lower level git
    and t2u details that I am barely competent in :D

    --
    Best regards,
    Aigars Mahinovs mailto:[email protected]
    #--------------------------------------------------------------#
    | .''`. Debian GNU/Linux (http://www.debian.org) |
    | : :' : Latvian Open Source Assoc. (http://www.laka.lv) |
    | `. `' Linux Administration and Free Software Consulting |
    | `- (http://www.aiteki.com) |
    #--------------------------------------------------------------#

  • From Joerg Jaspert@21:1/5 to Russ Allbery on Sun Jun 30 21:40:01 2024
    On 17276 March 1977, Russ Allbery wrote:

    So, in short: A t2u uploaded source package should consist of whatever
    t2u produces (normal Debian source package) *plus* two additional files.

    The first file contains client side generated data, but to *not*
    overburden the client, this *only* consists of the output of `git
    ls-files --format="%(objectmode) %(objectname) %(path)"` for the tag
    that should be uploaded, signed by the DD/DM key - or something
    similarly easily generated on client side. Exact format can be hashed
    out between t2u people and ftpmaster during implementation.

    You describe the contents here, but not the semantics, and I'm not sure
    that I fully understand what the intended semantics are (in other words,
    what packages dak will accept under this proposal). Would source packages
    that contain additional files not represented in this list of files and
    hashes be accepted, for example?

    dak sets no limit on the packages here. That is, no matter what t2u has
    to do in its blackbox, go and have fun. Just include the signed list of
    files from the maintainer plus the shallow clone.

    The intention is that enough gets uploaded and stored somewhere that dak
    (or whoever later) can reconstruct what t2u did. And, obviously, if you
    then follow the steps t2u does and use as input the shallow clone
    (verified against the maintainer's sig), it really should get identical
    output. (Maybe minus timestamps, but for the important part).

    Here's one specific concrete example for one workflow using tag2upload
    (there are other variations): suppose I, as the uploader, tag a Git tree
    that is patches-applied with no patches in debian/patches/*. I run the
    above command and include that signed data somewhere where tag2upload can
    get at it via the Git tag. tag2upload then turns that into a 3.0 (quilt)
    package and uploads that package along with the information as requested,
    including that list of files and hashes that I signed from the tree that
    I tagged.

    In the thing I described it does not matter what t2u does, and yes, what
    it uploads may well contain more or fewer files, or changed files,
    compared to what the maintainer signed and started the process with. But
    those are all added/removed/changed files from the t2u process, so ought
    to be able to be redone, taking the same starting point. If *that* isn't
    true, something would be fundamentally wrong.

    When dak sees the package, all of the files in debian/* in the source
    package will have the same hashes as in the git ls-files output, but the
    source package will have additional files in debian/patches/* that do not
    exist in the git ls-files output. Some of the upstream files will have
    hashes in the git ls-files output that match the contents of those files
    after unpacking the source package (and thus applying patches), but will
    not match the hashes of those files as they exist in the upstream
    orig.tar.gz.

    In this proposal, would dak be willing to accept such a package?

    Unless dak goes and redoes the work that t2u is doing, i.e. constructs
    the whole thing from git and whatever magic is needed, dak will not care
    if the files in the source package are modified by t2u.

    The case of a repository that contains only the debian/* files poses
    another set of complications, but I don't think we have to get into that
    immediately. The above examples are probably enough to work through to
    understand what the intended semantics of this manifest is.

    I'm not entirely sure what is best to require here. I mean, the orig
    source has to be somewhere, including on the maintainer's machine, so it
    should be possible to include it in this without any extra large magic.

    The second file consists of a shallow git clone of the repository for
    the tag that t2u wants to upload, put into an appropriately named
    tarball.

    Just to double check, to make sure I'm not missing some subtlety, it's
    intentional that this file contains all of the same information as in the
    first file, and the first file is just a subset of this same information
    in a different form?

    It's not the same. The first contains a list of things with hashes or
    something (whatever will be defined between t2u and ftpmaster). The
    second contains the actual data. But yes, you can generate the first out
    of the second - except for the maintainer's signature.

    In other words, someone could verify the signature on the Git tag in this
    file and then run the git ls-files command on the Git repository and get
    exactly the same information as in the first file, so the first file is
    technically redundant. I can think of some reasons why you might want
    that, but it's a little surprising, so I wanted to make sure that's
    intentional.

    And yes, you can get the exact same thing out. That's how you can verify
    it later, if it matches. I mean, we CAN condense it down to "the
    maintainer does the shallow clone and signs that, and that gets
    uploaded", but I think that goes against the "simple on the client
    side".



    --
    bye, Joerg

  • From Matthias Urlichs@21:1/5 to All on Mon Jul 1 11:40:01 2024
    On 30.06.24 21:30, Aigars Mahinovs wrote:
    > The Debian developer/maintainer creates a signed git tag that contains
    > (in its message, presumably, to avoid adding new communication lines)
    > the file listing of the git checkout at the point of signing
    > (including file names, modes and short SHA checksum hashes). This
    > extra content is added at the end of the tag message,

    OK, maybe I'm just not getting it, but the tag *already* contains the
    file listing you want to add to the tag, implicitly: it refers a commit
    which refers a tree which refers to exactly those files.

    If it ever does not, then we'd all have _way_ worse problems than
    figuring out how to safely create a t2u tag.

    So what would this actually buy us, in terms of additional safety?

    --
    -- regards
    --
    -- Matthias Urlichs

  • From Ian Jackson@21:1/5 to Joerg Jaspert on Mon Jul 1 11:50:01 2024
    Hi again. Thanks for the clarifications. Speaking personally I've
    found your replies encouraging, and I'm cautiously optimistic that
    this might be a workable approach. We'll keep working on a proper
    response.


    In the meantime, I have a couple of questions.

    Joerg Jaspert writes ("Re: t2u in the archive"):
    The intention is that enough gets uploaded and stored somewhere that dak
    (or whoever later) can reconstruct what t2u did. And, obviously, if you
    then follow the steps t2u does and use as input the shallow clone
    (verified against the maintainer's sig), it really should get identical
    output. (Maybe minus timestamps, but for the important part).

    Firstly, you say a "shallow clone".

    It is not straightforward to include *precisely* the set of commits
    that are required to reproduce the output. The conversion might, in
    principle, go arbitrarily far into the maintainer's packaging branch;
    and, if the conversion involves an external tool such as
    git-debcherry, that tool probably won't currently report what
    commit(s) it used - so would need to be modified.

    I'm hoping the reason you say "shallow clone" is simply to avoid
    bloat.

    In that case, it's fairly simple: I find it difficult to imagine a
    future workflow that includes the history *of the upstream branch*.
    So the t2u server could exclude commits which are in the history of
    the nominated upstream tag. That would generally do the right thing,
    but it wouldn't *guarantee* not to include unwanted history. Would
    that be OK ?


    Secondly, the file listing. Thanks for the explanation. I'm still
    not quite sure we understand why you want it.

    Even so, I think I have a possible way to eliminate it, while still
    giving you the property that dak (or a future audit) can know the file
    list of the tree signed by the maintainer, without needing to actually
    run git.

    (I'm guessing that having dak not run git is why you don't think it's
    good enough that one can verify the contents directly from the git tag
    by running the git-ls-files rune.)

    The git tag is itself a Merkle tree, containing the information you
    need. So the hashes of all these things, and the filenames, are
    already signed by the maintainer - that's the git tag. The reason
    it's not readily verifiable without running git itself, is mostly
    because getting the actual object texts out of git is very
    complicated.

    How about we (the tag2upload team):

     * Have the git clone tarball contain the following
         - the tag itself
         - the tagged commit
         - the tagged tree objects (recursively)
         - the blobs
       as loose objects.

     * Provide a program that, given this information, recursively
       verifies the git hashes, and prints the list of files.

       That is, it does this:

         1. find the commitid in the tag (textual parse)
         2. find the commit object, as a file
         3. calculate the git objectid of that file, by re-hashing
            it with sha1sum and the appropriate prefix,
            and checking that it matches
         4. find the tree objectid in the commit object (textual parse)
         5. repeat step 3 for the tree objectid
         6. parse the tree object into a list of filenames, modes,
            and objectids
         7. for each referenced objectid, find it, and rehash it
         8. referenced objects must be trees (go to step 6)
            or blobs (now we know the path, print it).

     * When doing whatever verification this file list is for (I'm not
       sure if this is dak?), run that program to generate the file list
       directly from the maintainer's signed git tag, rather than using a
       separate copy plumbed through from the maintainer's system.
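
The steps above can be sketched in Python roughly as follows. This is only an illustration, not the program the tag2upload team wrote: the function names and the `objdir` layout (loose objects stored as `objdir/ab/cdef...`, as in `.git/objects`) are assumptions, it handles SHA-1 object ids only, it does not support submodule entries, and it does not check the PGP signature on the tag itself.

```python
import hashlib
import zlib
from pathlib import Path


def read_object(objdir, oid):
    """Load loose object `oid`, re-hash it, and return (type, body)."""
    raw = zlib.decompress((Path(objdir) / oid[:2] / oid[2:]).read_bytes())
    # A git object id is the SHA-1 of "<type> <size>\0<body>".
    if hashlib.sha1(raw).hexdigest() != oid:
        raise ValueError(f"hash mismatch for object {oid}")
    header, _, body = raw.partition(b"\0")
    kind, _, size = header.decode("ascii").partition(" ")
    if int(size) != len(body):
        raise ValueError(f"length mismatch for object {oid}")
    return kind, body


def header_value(body, key):
    """Textual parse of a tag or commit: first '<key> <value>' header line."""
    headers = body.split(b"\n\n", 1)[0]
    for line in headers.decode("utf-8", "replace").splitlines():
        name, _, value = line.partition(" ")
        if name == key:
            return value
    raise ValueError(f"no {key!r} header found")


def walk_tree(objdir, tree_oid, prefix=""):
    """Yield (mode, oid, path) for every blob reachable from a tree."""
    kind, body = read_object(objdir, tree_oid)
    if kind != "tree":
        raise ValueError(f"{tree_oid} is a {kind}, expected a tree")
    pos = 0
    while pos < len(body):
        # Entry layout: "<octal mode> <name>\0" then a 20-byte binary SHA-1.
        space = body.index(b" ", pos)
        nul = body.index(b"\0", space)
        mode = body[pos:space].decode("ascii")
        name = body[space + 1:nul].decode("utf-8", "surrogateescape")
        child = body[nul + 1:nul + 21].hex()
        pos = nul + 21
        if mode == "40000":  # subdirectory: recurse into the subtree
            yield from walk_tree(objdir, child, prefix + name + "/")
        else:
            child_kind, _ = read_object(objdir, child)  # re-hashes the blob
            if child_kind != "blob":
                raise ValueError(f"{prefix + name}: unsupported {child_kind}")
            yield mode, child, prefix + name


def list_tagged_files(objdir, tag_oid):
    """Verify tag -> commit -> tree -> blobs and return the file list."""
    kind, tag = read_object(objdir, tag_oid)
    if kind != "tag" or header_value(tag, "type") != "commit":
        raise ValueError("expected a tag object pointing at a commit")
    kind, commit = read_object(objdir, header_value(tag, "object"))
    if kind != "commit":
        raise ValueError("tagged object is not a commit")
    return list(walk_tree(objdir, header_value(commit, "tree")))
```

Each returned tuple carries the same three fields as the `git ls-files --format="%(objectmode) %(objectname) %(path)"` listing discussed earlier in the thread, so the output could be compared against such a listing line by line.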

    The new listing program could be written in the language of your
    choice. (I'm volunteering to write it.) I think this program would
    be quite simple. It would get a bit more complicated when we want to
    support longer git hashes, but not by very much. It does *not* need
    to parse git pack files. The only git thing it needs to parse is the
    tree object format, which is binary, but really quite simple, and the
    textual metadata to find the commit and the tree.

    (It's possible that such a program exists already. I don't know, but
    let's assume for the sake of argument that we'll have to write it, for
    one reason or another.)

    What do you think of this idea ?


    The case of a repository that contains only the debian/* files
    poses another set of complications, but I don't think we have to
    get into that immediately. The above examples are probably enough
    to work through to understand what the intended semantics of this
    manifest is.

    I'm not entirely sure what is best to require here. I mean, the orig
    source has to be somewhere, including on the maintainer's machine, so it
    should be possible to include it in this without any extra large magic.

    I think we could include it as a different ref in the git tarball.


    Thanks,
    Ian.

    --
    Ian Jackson <[email protected]> These opinions are my own.

    Pronouns: they/he. If I emailed you from @fyvzl.net or @evade.org.uk,
    that is a private address which bypasses my fierce spamfilter.

  • From Aigars Mahinovs@21:1/5 to Matthias Urlichs on Mon Jul 1 12:50:01 2024
    On Mon, 1 Jul 2024 at 11:33, Matthias Urlichs <[email protected]> wrote:

    On 30.06.24 21:30, Aigars Mahinovs wrote:

    The Debian developer/maintainer creates a signed git tag that contains
    (in its message, presumably, to avoid adding new communication lines)
    the file listing of the git checkout at the point of signing
    (including file names, modes and short SHA checksum hashes). This
    extra content is added at the end of the tag message,

    OK, maybe I'm just not getting it, but the tag *already* contains the
    file listing you want to add to the tag, implicitly: it refers a commit
    which refers a tree which refers to exactly those files.

    If it ever does not, then we'd all have _way_ worse problems than
    figuring out how to safely create a t2u tag.

    So what would this actually buy us, in terms of additional safety?

    Yes and no. See, what the git tag actually contains, and what the GPG
    signature actually signs, is just the one hash of the commit object.
    This commit object then refers to the other files of the repo, but the
    GPG signature does not directly sign those.

    I just did a local experiment to verify this. I created a new signed
    tag in one of my repos.

    $ git tag -s -m "test signed tag" 24w27.1-1

    Then I inspected the result with this command:

    $ git cat-file -p 24w27.1-1
    object 4d0e377e992901f873bfca5850eb8862b9a1f057
    type commit
    tag 24w27.1-1
    tagger Aigars Mahinovs <[email protected]> 1719829010 +0200

    test signed tag
    -----BEGIN PGP SIGNATURE-----

    iI0EABYIADUWIQSeqa3XEFMtCKeITWHKWshbVxu0FgUCZoKCEhccYWlnYXJzLm1h
    aGlub3ZzQGJtdy5kZQAKCRDKWshbVxu0Fq4fAQDSySaFH9ytCr70i+Bs0MxfPDRt
    BQ4O9Xp9JCoXnrVsiAEA/i8CUFqSlU51fy1UL6YTPC/O4pq1QUYcVJP7X9V5FAo=
    =r+o7
    -----END PGP SIGNATURE-----

    That looks like simple text and detached GPG signature. So to check
    that I split the text part and the signature part into two files:
    ```
    object 4d0e377e992901f873bfca5850eb8862b9a1f057
    type commit
    tag 24w27.1-1
    tagger Aigars Mahinovs <[email protected]> 1719829010 +0200

    test signed tag
    ```
    and
    ```
    -----BEGIN PGP SIGNATURE-----

    iI0EABYIADUWIQSeqa3XEFMtCKeITWHKWshbVxu0FgUCZoKCEhccYWlnYXJzLm1h
    aGlub3ZzQGJtdy5kZQAKCRDKWshbVxu0Fq4fAQDSySaFH9ytCr70i+Bs0MxfPDRt
    BQ4O9Xp9JCoXnrVsiAEA/i8CUFqSlU51fy1UL6YTPC/O4pq1QUYcVJP7X9V5FAo=
    =r+o7
    -----END PGP SIGNATURE-----
    ```

    And running `gpg --verify` with those two files showed a good
    signature. This confirmed my theory that the git tag signature is
    nothing more than a signature on this exact text that we see above.
    It signs the commit object hash, tag name, tagger identity and
    whatever is in the tag message.
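
The manual split described above can be sketched in Python. This is a hedged illustration only: `split_signed_tag` is a hypothetical helper name, the input is assumed to be the raw tag object bytes (for example as printed by `git cat-file tag <name>`), and the sketch simply cuts at the first signature marker that starts a line.

```python
SIG_START = b"-----BEGIN PGP SIGNATURE-----"


def split_signed_tag(raw):
    """Split raw tag text into (signed payload, detached signature)."""
    # The marker must start a line, so search for it after a newline.
    index = raw.find(b"\n" + SIG_START)
    if index == -1:
        raise ValueError("tag carries no PGP signature")
    cut = index + 1  # keep the newline that ends the payload
    return raw[:cut], raw[cut:]
```

Writing the two parts to files, say `tag.txt` and `tag.sig` (names chosen here for illustration), and running `gpg --verify tag.sig tag.txt` repeats the experiment without touching git again.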

    Just dumping the output of `git cat-file -p ${tagname}` would provide
    a way of verifying the uploader signature even without git.
    And including the output of `git ls-files ...` in that message would
    both provide a non-git way of matching this tag to the tarball and
    would additionally extend the scope of the GPG signature to hashes of
    all files in the repo checkout directly, without going through
    multiple hash object jumps in the git object hierarchy.

    As far as I can see doing things this way provides for a rather simple
    verification pathway that does not involve running git or parsing git
    objects. It's easier to reason about this compared to rather complex
    git internals. Can even be done in a shell script.
    Does this actually prevent or make harder any attack vectors right
    now? I don't think so.
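
As a rough illustration of such a git-free check, the sketch below compares an unpacked tree against a listing in the `<mode> <objectname> <path>` form produced by the `git ls-files --format=...` command mentioned in this thread. The helper names are made up for this example, and it makes simplifying assumptions: SHA-1 object names, no symlinks or submodules, file modes are not compared, paths contain no characters that git would quote, and `root` holds only the checked-out files (no `.git` directory).

```python
import hashlib
from pathlib import Path


def git_blob_id(data):
    """Object name git gives file contents: SHA-1 of 'blob <size>' NUL data."""
    return hashlib.sha1(b"blob %d\0" % len(data) + data).hexdigest()


def check_manifest(manifest_text, root):
    """Return a list of problems; an empty list means the tree matches."""
    root = Path(root)
    problems = []
    listed = set()
    for line in manifest_text.splitlines():
        mode, oid, path = line.split(" ", 2)
        listed.add(path)
        target = root / path
        if not target.is_file():
            problems.append(f"missing: {path}")
        elif git_blob_id(target.read_bytes()) != oid:
            problems.append(f"hash differs: {path}")
    # Anything on disk that the signed listing does not mention is suspect.
    for found in sorted(root.rglob("*")):
        rel = found.relative_to(root).as_posix()
        if found.is_file() and rel not in listed:
            problems.append(f"unlisted: {rel}")
    return problems
```

Whether an unlisted or changed file should abort the upload is a policy question the thread leaves open; the sketch only reports the differences.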

    However, implementing it this way does give us options later, for
    example if `git ls-files` gets an ability to print stronger checksums
    of files we can switch to use that and gain stronger checksum benefits
    even if the git repo itself remains based on the current checksumming
    scheme. If we (later) agree to require that the maintainer has not
    only git and gpg installed locally but also the sha512sum binary, for
    example, then the client command line can later be changed so that it
    uses that binary to create checksums of all files instead of using
    checksums that git already has. Same for another (not invented yet)
    hashing algorithm. And all without having to change git internal data
    structures for all the repos.

    --
    Best regards,
    Aigars Mahinovs mailto:[email protected]
    #--------------------------------------------------------------#
    | .''`. Debian GNU/Linux (http://www.debian.org) |
    | : :' : Latvian Open Source Assoc. (http://www.laka.lv) |
    | `. `' Linux Administration and Free Software Consulting |
    | `- (http://www.aiteki.com) |
    #--------------------------------------------------------------#

  • From Matthias Urlichs@21:1/5 to All on Mon Jul 1 15:10:01 2024
    On 01.07.24 12:46, Aigars Mahinovs wrote:
    > Yes and no. See what the git tag actually contains and what the GPG
    > signature actually signs is just the one hash of the commit object.
    > This commit object then refers to the other files of the repo, but the
    > GPG signature does not directly sign those.
    So it signs them indirectly instead. I don't consider that to be a
    problem.

    There's no material difference whether the tag signs a commit that
    hashes a tree that (eventually) hashes the files, or a list of the files
    plus their hashes, or a tarball of the files in question (except that
    the way we do the latter is too brittle – it depends on the file order
    and compression used).

    The single advantage of including a file list would be if it included
    the files' SHA256-or-better hashes, but given the difficulty of finding
    *and* exploiting a SHA1 collision it's a judgment call whether that's
    worth the effort.

    Creating an oversized tag object isn't a good idea IMHO. For reference,
    the list of files in the Linux kernel is 3.2 MBytes (git would compress
    that down to 450kB or so), plus 10 MBytes of sha512sums (compressible to
    5MB by definition); granted that the kernel is an extreme example but
    it's not the only one.

    Also, these tags are not just pushed to Salsa, pulled by t2u server, and
    subsequently ignored. Anybody who clones or pulls from an archive is
    likely to also pull its tags. I suspect (but would need to verify) that
    git does not do delta encoding when it sends tag objects, and we
    shouldn't depend on that in order to be reasonably efficient.

    If we do decide that a second hash is worth the effort, I *strongly*
    recommend to simply add an (optional) field with the output of "git
    ls-files -z | xargs -0 sha512sum | sort | sha512sum" to the tag. This
    has the exact same security implications as a list of paths and their
    sha512sum but is a heap of orders of magnitude smaller.

    One slight disadvantage of this scheme is that you'd need the full list
    on both sides to figure out exactly what went wrong, but if that ever
    happens people need to look at the situation on both ends anyway;
    repeating this command without the trailing pipe to "sha512sum" would be
    the least of our problems. Alternately the originator could keep the
    output around for a few days.

    > doing things this way provides for a rather simple
    > verification pathway that does not involve running git or parsing git
    > objects.
    >
    I agree that we might want a git-the-C-binary-independent way of working
    with git objects, but gitoxide[0] already exists (written in Rust).

    If we really want to do the bare-bones thing, the above "git ls-files -z
    | …" can easily be replaced with something like "find . -path ./.git
    -prune -o -type f -printf '%P\0' | …".

    [0] https://github.com/Byron/gitoxide

    --
    -- regards
    --
    -- Matthias Urlichs

  • From Simon Josefsson@21:1/5 to Matthias Urlichs on Mon Jul 1 18:20:01 2024
    Matthias Urlichs <[email protected]> writes:

    On 01.07.24 12:46, Aigars Mahinovs wrote:
    Yes and no. See what the git tag actually contains and what the GPG
    signature actually signs is just the one hash of the commit object.
    This commit object then refers to the other files of the repo, but the
    GPG signature does not directly sign those.

    So it signs them indirectly instead. I don't consider that to be a
    problem.

    There's no material difference whether the tag signs a commit that
    hashes a tree that (eventually) hashes the files, or a list of the
    files plus their hashes, or a tarball of the files in question (except
    that the way we do the latter is too brittle – it depends on the file
    order and compression used).

    The single advantage of including a file list would be if it included
    the files' SHA256-or-better hashes, but given the difficulty of
    finding *and* exploiting a SHA1 collision it's a judgment call whether
    that's worth the effort.

    I believe you only need a SHA1 collision to corrupt the t2u scheme, and
    those are not difficult. Unless I'm missing something, and to help
    everyone get on the same page of this analysis, and to get corrections
    from others if my analysis is wrong, here are the concrete steps for a
    malicious upstream maintainer or a malicious Salsa git committer:

    0) Gain commit access to target git repository.

    1) Create one new commit with a SHA1 collision HEAD object (using a git
    carefully modified to not use SHA1CD), with two different source code
    files (one malicious and one harmless).

    2) Add some harmless new commit with SHA1CD safe commit id.

    3) Push that into the public git repository.

    4) Over time add many other unrelated commits (which shouldn't touch the
    same content for the malicious commit - hint: put them in an opaque
    binary self-test file...).

    5) Create a t2u signed git tag on the then-current HEAD object.

    We now have a situation where an attacker can provide a separate git
    repository with different content than the intended version, in which
    signed git tags still verify correctly. I think this will be hard to
    make use of successfully in any reasonable scenario in practice, but it
    appears cryptographically possible.

    You can mitigate this by re-validating all commit hashes using a SHA1CD
    git implementation before trusting a git repository. I have not seen
    confirmation that 'git fsck' actually does that. If some new attack
    implementation on SHA1 appears that isn't detected by your SHA1CD
    variant, your validation can be bypassed. I don't think many
    researchers bother attacking SHA1 and publishing details publicly at
    this point, as it is proven broken already. I suppose that a SHA1
    collision-generating algorithm that bypasses SHA1CD still has market
    value.

    Note: I don't think this problem is a deal-breaker for the t2u scheme,
    nor that its design even has to change due to this. We already live
    decently with many theoretical risks.

    If we do decide that a second hash is worth the effort, I *strongly*
    recommend to simply add an (optional) field with the output of "git
    ls-files -z | xargs -0 sha512sum | sort | sha512sum" to the tag. This
    has the exact same security implications as a list of paths and their
    sha512sum but is a heap of orders of magnitude smaller.

    Something like this adds more strength, I like it.
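
For readers who want to see what the quoted pipeline computes, here is a minimal Python sketch of the same idea. It is an approximation under stated assumptions: `tree_digest` is a hypothetical name, the list of paths is passed in rather than obtained from git, and the result is meant to mirror the shell pipeline only for ordinary filenames (no backslashes or newlines, which `sha512sum` escapes) with `sort` running in the C locale.

```python
import hashlib
from pathlib import Path


def tree_digest(root, paths):
    """One SHA-512 over the sorted per-file 'digest  path' lines."""
    lines = []
    for path in paths:
        digest = hashlib.sha512((Path(root) / path).read_bytes()).hexdigest()
        # sha512sum prints "<hex digest>  <path>" (two spaces) per file.
        lines.append(f"{digest}  {path}\n".encode())
    # Sorting makes the result independent of the order paths arrive in.
    return hashlib.sha512(b"".join(sorted(lines))).hexdigest()
```

The single 128-character result is what would go into the tag; diagnosing a mismatch still needs the full per-file list from both sides, as noted in the message quoted above.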

    I think we should compare how other distributions handle this; I believe
    Guix hashes the file contents of source tarballs, instead of hashing
    the source tarball itself. Maybe the details of how to compute these are
    generic enough to be reusable by Debian.

    /Simon

  • From Russ Allbery@21:1/5 to Simon Josefsson on Mon Jul 1 19:10:01 2024
    Simon Josefsson <[email protected]> writes:

    You can mitigate this by re-validating all commit hashes using a SHA1CD
    git implementation before trusting a git repository. I have not seen
    confirmation that 'git fsck' actually does that.

    I convinced myself that it does. One of the things git fsck does is
    recalculate the hash of every object in the repository and ensure that it
    matches (this is, to a large extent, the entire point; the other checks
    are sort of an addition), and since Git uses SHA1CD now, git fsck will
    instantly detect this attack as soon as it does that.

    I admittedly did not go so far as to track down test objects with the
    same SHA-1 hash and construct an experiment. But I couldn't see any way
    where git fsck could *not* detect this problem unless I'm wrong that it
    recalculates all the hashes, and I'm fairly sure I'm not wrong about
    that.

    If some new attack implementation on SHA1 appears, that isn't detected
    by your SHA1CD variant, your validation can be by-passed.

    This is true.

    --
    Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>

  • From Joerg Jaspert@21:1/5 to Ian Jackson on Mon Jul 1 22:10:01 2024
    On 17277 March 1977, Ian Jackson wrote:

    Firstly, you say a "shallow clone".

    It is not straightforward to include *precisely* the set of commits
    that are required to reproduce the output. The conversion might, in
    principle, go arbitrarily far into the maintainer's packaging branch;
    and, if the conversion involves an external tool such as
    git-debcherry, that tool probably won't currently report what
    commit(s) it used - so would need to be modified.

    I'm hoping the reason you say "shallow clone" is simply to avoid
    bloat.

    Yes.

    In that case, it's fairly simple: I find it difficult to imagine a
    future workflow that includes the history *of the upstream branch*.
    So the t2u server could exclude commits which are in the history of
    the nominated upstream tag. That would generally do the right thing,
    but it wouldn't *guarantee* not to include unwanted history. Would
    that be OK ?

    Yes.

    Secondly, the file listing. Thanks for the explanation. I'm still
    not quite sure we understand why you want it.

    Even so, I think I have a possible way to eliminate it, while still
    giving you the property that dak (or a future audit) can know the file
    list of the tree signed by the maintainer, without needing to actually
    run git.

    (I'm guessing that having dak not run git is why you don't think it's
    good enough that one can verify the contents directly from the git tag
    by running the git-ls-files rune.)

    The git tag is itself a Merkle tree, containing the information you
    need. So the hashes of all these things, and the filenames, are
    already signed by the maintainer - that's the git tag. The reason
    it's not readily verifiable without running git itself, is mostly
    because getting the actual object texts out of git is very
    complicated.

    How about we (the tag2upload team):

    [... long description ...]

    Honestly I think that sounds way more complicated than a "git ls-files
    something" based file and process, and binds us more tightly to actual
    git than ls-files does (one could easily have more fields in there, if
    deemed necessary).

    But I don't see anything obviously so wrong that it's a NO, so fine.
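
    For comparison, a listing of the "git ls-files something" kind is easy
    to consume without running git. The thread does not say which
    invocation was meant; the sketch below assumes the output of
    `git ls-files -s -z`, where each NUL-terminated record is
    `<mode> <object> <stage>`, a tab, then the path. The `Entry` type, the
    function name and the sample data are hypothetical.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    mode: str
    object_id: str
    stage: int
    path: str


def parse_ls_files_stage(raw: bytes) -> list[Entry]:
    """Parse NUL-separated records in the `git ls-files -s -z` layout."""
    entries = []
    for record in raw.split(b"\0"):
        if not record:
            continue  # trailing NUL leaves an empty final chunk
        meta, path = record.split(b"\t", 1)
        mode, object_id, stage = meta.decode("ascii").split(" ")
        # Paths are arbitrary bytes; keep undecodable ones round-trippable.
        entries.append(
            Entry(mode, object_id, int(stage), path.decode("utf-8", "surrogateescape"))
        )
    return entries


# Hypothetical sample listing with two files.
SAMPLE = (
    b"100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0\tdebian/changelog\0"
    b"100755 ce013625030ba8dba906f756967f9e9ca394464a 0\tdebian/rules\0"
)

for entry in parse_ls_files_stage(SAMPLE):
    print(entry.mode, entry.object_id, entry.path)
```

    Because the format is a flat list of records, further fields could be
    added without changing the overall shape, which is the flexibility
    mentioned above.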

    The new listing program could be written in the language of your
    choice. (I'm volunteering to write it.)

    While I personally love Rust nowadays, dak is python, so python that
    will be. (While it won't be integrated into dak (so easier for others to
    take), it should share the same language).

    --
    bye, Joerg

  • From Ian Jackson@21:1/5 to Joerg Jaspert on Tue Jul 2 12:20:01 2024
    Joerg Jaspert writes ("Re: t2u in the archive"):
    Honestly I think that sounds way more complicated than a "git ls-files
    something" based file and process, and binds us more tightly to actual
    git than ls-files does (one could easily have more fields in there, if
    deemed necessary).

    But I don't see anything obviously so wrong that it's a NO, so fine.

    Thanks. We don't much like this either, but we like it much more than
    putting the extra stuff in the tag.

    The new listing program could be written in the language of your
    choice. (I'm volunteering to write it.)

    While I personally love Rust nowadays, dak is python, so python that
    will be. (While it won't be integrated into dak (so easier for others to
    take), it should share the same language).

    OK.

    Thanks! I am really looking forward to this turning into a coding
    problem.

    Sean, I think we should finish updating the design with these agreed
    changes. Joerg, when we have done that, will you review it and make
    sure we have properly captured what you think we've agreed ?

    Thanks,
    Ian.

    --
    Ian Jackson <[email protected]> These opinions are my own.

    Pronouns: they/he. If I emailed you from @fyvzl.net or @evade.org.uk,
    that is a private address which bypasses my fierce spamfilter.

  • From Joerg Jaspert@21:1/5 to Ian Jackson on Tue Jul 2 23:10:01 2024
    On 17278 March 1977, Ian Jackson wrote:

    Sean, I think we should finish updating the design with these agreed
    changes. Joerg, when we have done that, will you review it and make
    sure we have properly captured what you think we've agreed ?

    Sure.

    --
    bye, Joerg
