• tag2upload, reproducible .orig and dfsg repacks

    From Simon Richter@21:1/5 to All on Wed Jun 26 13:20:01 2024
    Hi,

    We basically have three cases:

    1. upstream has an official .orig.tar.* file we can use

    In my opinion, we'd want to use this because we don't need to explain
    how it was generated, and any signature from upstream can be used to
    verify that we are shipping exactly their release.

    I'm aware that there is disagreement over this point, and there is a
    faction that would like us to rebuild upstream archives from git tags to
    avoid problems like we had with xz-utils, but without an easy way for
    users to verify that an archive corresponds to upstream git, we're
    mainly introducing an explanation why signatures do not match and should
    be disregarded.

    In this case, I'd like a tag2upload service to have a mechanism to
    ensure the upload will use the correct file -- i.e. a mismatch in
    pristine-tar settings will not cause the file to be rebuilt differently
    and then uploaded simply because there is no verification step between
    constructing the source package and uploading it.

    2. upstream has no official release, but a git tag we can use

    Here, it obviously makes sense to use git-archive.

    3. upstream needs to be repacked for dfsg reasons

    So far, I believe this has no good representation in git, and the
    packages that do this basically generate a dfsg orig.tar.* file and
    reimport this into git -- which is pretty much the least ideal
    situation, because we have no links to the upstream repo.

    For 1. and 2., what I'd like to see as part of the interface to a
    tag2upload service is a way to explicitly specify what kind of .orig
    archive should be constructed, and this needs to become a condition for
    actually uploading. The magic tag containing maintainer intent would
    then explicitly say "the .orig archive needs to have this sha256sum" for
    the pristine-tar case, and "the .orig archive needs to have a git
    extended pax header containing this sha1sum/sha256sum" for the second.
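
A minimal sketch of the two upload conditions described above, assuming the tag would carry either an expected SHA-256 for the whole file or an expected commit id. It relies on `git archive` storing the commit id as a `comment` entry in the pax global header when it is given a commit or tag, which Python's `tarfile` exposes as `pax_headers`. The function names and the way the expectation is passed in are hypothetical and not part of any tag2upload interface.

```python
import hashlib
import tarfile


def sha256_of_file(path):
    """Return the hex SHA-256 of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def git_archive_commit(path):
    """Return the commit id from the pax global header, or None if absent."""
    with tarfile.open(path, "r:*") as archive:
        # tarfile reads the global header while loading the first member.
        return archive.pax_headers.get("comment")


def orig_matches_intent(path, expected_sha256=None, expected_commit=None):
    """Check an .orig archive against what the maintainer stated (hypothetical)."""
    if expected_sha256 is not None:
        # Case 1: upstream tarball, must be bitwise identical.
        return sha256_of_file(path) == expected_sha256
    if expected_commit is not None:
        # Case 2: git-archive output, must name the intended commit.
        return git_archive_commit(path) == expected_commit
    raise ValueError("the tag must state a checksum or a commit id")
```

An uploader following this idea would refuse to upload whenever `orig_matches_intent` returns `False`. Note that the pax header only records which commit the archive claims to come from; it does not by itself prove the contents match that commit.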

    I have no good idea for dfsg repacks so far.

    Simon

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Matthias Urlichs@21:1/5 to All on Wed Jun 26 15:40:02 2024
    On 26.06.24 13:18, Simon Richter wrote:
    > 3. upstream needs to be repacked for dfsg reasons
    >
    > So far, I believe this has no good representation in git […]

    IMHO this is a mostly-solved problem.

    You can feed hashes of the offenders to "git filter-repo
    --strip-blobs-with-ids ‹filename›". This operation is idempotent and
    deterministic.

    If we add these hashes to a file, let's say d/source/dfsg-filtered, we
    can thus reproducibly generate a dfsg-compliant version of whichever
    upstream commit or tag we want, and of course generate a tarball from
    there if required.

    --
    -- regards
    --
    -- Matthias Urlichs
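
As an illustration of the hash list that "git filter-repo --strip-blobs-with-ids" reads (one object id per line), here is a small sketch that computes git blob ids for a set of offending files. It assumes a SHA-1 repository, where a blob id is the SHA-1 of the header `blob <size>\0` followed by the file content; the function names are made up for this example, and d/source/dfsg-filtered is only the file name floated in the message above.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the SHA-1 object id git assigns to a blob with this content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def blob_id_list(paths):
    """Return sorted, de-duplicated blob ids for the given files, one per line."""
    ids = set()
    for path in paths:
        with open(path, "rb") as handle:
            ids.add(git_blob_id(handle.read()))
    # Sorting keeps the generated list stable regardless of input order.
    return "".join(blob_id + "\n" for blob_id in sorted(ids))
```

Writing the result of `blob_id_list(...)` to a file gives something that could be passed to `--strip-blobs-with-ids`. Because the ids depend only on file content, the same offending blob is matched in every upstream commit that contains it.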

  • From Russ Allbery@21:1/5 to Simon Richter on Wed Jun 26 17:50:02 2024
    Simon Richter <[email protected]> writes:

    > 1. upstream has an official .orig.tar.* file we can use
    >
    > In my opinion, we'd want to use this because we don't need to explain
    > how it was generated, and any signature from upstream can be used to
    > verify that we are shipping exactly their release.
    >
    > I'm aware that there is disagreement over this point, and there is a
    > faction that would like us to rebuild upstream archives from git tags to
    > avoid problems like we had with xz-utils, but without an easy way for
    > users to verify that an archive corresponds to upstream git, we're
    > mainly introducing an explanation why signatures do not match and should
    > be disregarded.

    I think there are two cases here: upstream produces a tarball release as
    their official release artifact and produces a Git tag as a side effect or
    doesn't make a Git tag at all, or upstream produces both a tarball release
    and a Git tag and treats them both as first-class release artifacts.

    The first case is the weakest case for tag2upload until it has support for
    upstream tarballs. I think there are various ways to add that support
    that aren't too bad (git-lfs for instance) and don't require pristine-tar,
    but it's future work and they're not supported now.

    The second case seems fine with tag2upload? Particularly if upstream
    signs the Git tag. Instead of pointing to a possibly signed release
    tarball, the tag2upload tag points to a signed upstream Git tag. We get
    basically the same properties and avoid dealing with opaque upstream
    tarballs.

    Obviously this depends on what things are added to the release tarball,
    and there are a bunch of cases with gnulib, etc., where it's difficult to
    reproduce what upstream does during the release process for one reason or
    another. But there are a lot of upstreams for which this is not the case.

    --
    Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>

  • From Simon Richter@21:1/5 to Russ Allbery on Wed Jun 26 18:20:01 2024
    Hi,

    On 6/27/24 00:41, Russ Allbery wrote:

    > The second case seems fine with tag2upload? Particularly if upstream
    > signs the Git tag. Instead of pointing to a possibly signed release
    > tarball, the tag2upload tag points to a signed upstream Git tag. We get
    > basically the same properties and avoid dealing with opaque upstream
    > tarballs.

    The one property we don't get is "our orig archive is bitwise identical
    with what is on upstream's release page" -- which is a *very* important
    property if I'm being asked to sponsor a package, as it saves me a long
    investigation.

    > Obviously this depends on what things are added to the release tarball,
    > and there are a bunch of cases with gnulib, etc., where it's difficult to
    > reproduce what upstream does during the release process for one reason or
    > another. But there are a lot of upstreams for which this is not the case.

    In my packages the git tree does not contain any autogenerated files,
    which means that people using it will have to run autogen.sh. I think
    pretty much everyone else using autotools is doing the same.

    Simon

  • From Ian Jackson@21:1/5 to Simon Richter on Wed Jun 26 18:20:01 2024
    Simon Richter writes ("tag2upload, reproducible .orig and dfsg repacks"):
    > For 1. and 2., what I'd like to see as part of the interface to a
    > tag2upload service is a way to explicitly specify what kind of .orig
    > archive should be constructed,

    This is already there.

    Firstly, everything I'm about to say applies only to a new upstream
    version upload. For an existing upstream version, the existing .orig
    from the archive is obtained and reused, so no orig construction takes
    place.

    Secondly, there is currently only one supported way to generate an
    orig: git-deborig aka git-archive. So the protocol in this area is
    fairly simple, and the possible extension to support other orig
    generation modes is not described in the document.

    The relevant protocol elements are the `upstream=` and `upstream-tag=`
    keywords in the tag2upload tag metadata. tag2upload(5) says:

    | The orig tarball will be generated with "git archive", as invoked
    | by "git deborig".

    If we were to support pristine-tar we would include that information
    in the tag. Probably, we'd specify the pristine-tar branch commitid,
    and a ref to obtain it from (maybe the pristine-tar branch). But I
    haven't designed this in detail.

    Ian.

    --
    Ian Jackson <[email protected]> These opinions are my own.

    Pronouns: they/he. If I emailed you from @fyvzl.net or @evade.org.uk,
    that is a private address which bypasses my fierce spamfilter.

  • From Ian Jackson@21:1/5 to Ian Jackson on Wed Jun 26 18:30:01 2024
    Ian Jackson writes ("Re: tag2upload, reproducible .orig and dfsg repacks"):
    > ...
    > Secondly, there is currently only one supported way to generate an
    > orig: git-deborig aka git-archive.

    An implication, which I should state explicitly, is that if you want
    something else (eg, to be sure to use unmodified upstream tarballs), you
    cannot use tag2upload for your new-upstream-version uploads, until
    some kind of support for that scenario is implemented.

    (While I'm here, what Russ said in his reply is correct.)

    Ian.

    --
    Ian Jackson <[email protected]> These opinions are my own.

    Pronouns: they/he. If I emailed you from @fyvzl.net or @evade.org.uk,
    that is a private address which bypasses my fierce spamfilter.

  • From Simon Richter@21:1/5 to Ian Jackson on Wed Jun 26 18:30:01 2024
    Hi,

    On 6/27/24 01:16, Ian Jackson wrote:

    > Secondly, there is currently only one supported way to generate an
    > orig: git-deborig aka git-archive. So the protocol in this area is
    > fairly simple, and the possible extension to support other orig
    > generation modes is not described in the document.

    So if I use pristine-tar, it is very important that new upstream
    versions are not uploaded through tag2upload? Otherwise, all further
    uploads until the next upstream release would also have to go through
    tag2upload, and the .orig archive would fail validation if we later get
    a service that checks the archive contents against git.

    Would it make sense to lock this out for the time being so we don't
    accidentally upload a repacked .orig after taking a lot of care to store
    the upstream archive in pristine-tar?

    That happens way too often already when I do this manually -- my git
    workflow includes building the source package and verifying that it is
    indeed reproduced correctly, and that is one of the reasons I find the
    git workflows tedious if I'm the sole maintainer of a package, and why I
    end up force-pushing quite a lot to salsa.

    Simon

  • From Russ Allbery@21:1/5 to Simon Richter on Wed Jun 26 20:00:01 2024
    Simon Richter <[email protected]> writes:
    > On 6/27/24 00:41, Russ Allbery wrote:

    >> The second case seems fine with tag2upload? Particularly if upstream
    >> signs the Git tag. Instead of pointing to a possibly signed release
    >> tarball, the tag2upload tag points to a signed upstream Git tag. We
    >> get basically the same properties and avoid dealing with opaque
    >> upstream tarballs.

    > The one property we don't get is "our orig archive is bitwise identical
    > with what is on upstream's release page" -- which is a *very* important
    > property if I'm being asked to sponsor a package, as it saves me a long
    > investigation.

    Instead, we have "our orig archive is treesame to the upstream signed Git
    tag." This seems equivalent? We don't have equally simple tools right now
    to *check* this property, but that's a fixable problem, and the amount of
    *information* is the same.
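
To make the "check" concrete, here is a rough sketch of one way such a tool could compare an orig tarball against an unpacked checkout of the upstream tag. It is only an approximation of treesame: it compares regular file contents by SHA-256 and ignores file modes, symlinks and empty directories, and it does not verify the tag signature. All function names are hypothetical, and the tarball is assumed to have a single top-level directory.

```python
import hashlib
import os
import tarfile


def tarball_manifest(path, strip_components=1):
    """Map each regular file in the tarball to the SHA-256 of its content."""
    manifest = {}
    with tarfile.open(path, "r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue
            # Drop the leading "pkg-1.0/" style directory from the name.
            parts = member.name.split("/")[strip_components:]
            if not parts:
                continue
            data = archive.extractfile(member).read()
            manifest["/".join(parts)] = hashlib.sha256(data).hexdigest()
    return manifest


def directory_manifest(root):
    """Map each regular file under root (skipping .git) to its SHA-256."""
    manifest = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip any .git directory so repository metadata is not compared.
        dirnames[:] = [name for name in dirnames if name != ".git"]
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            if os.path.islink(full):
                continue
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, "rb") as handle:
                manifest[rel] = hashlib.sha256(handle.read()).hexdigest()
    return manifest


def compare_manifests(tar_manifest, tree_manifest):
    """Return (only_in_tarball, only_in_tree, changed) as sorted path lists."""
    tar_paths, tree_paths = set(tar_manifest), set(tree_manifest)
    changed = sorted(
        path for path in tar_paths & tree_paths
        if tar_manifest[path] != tree_manifest[path]
    )
    return sorted(tar_paths - tree_paths), sorted(tree_paths - tar_paths), changed
```

Three empty lists mean the two sides agree on every regular file; anything else is a list of paths a sponsor would want to look at.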

    By all means, don't use tag2upload if you don't like its upstream tarball
    handling, and I think supporting pristine-lfs in tag2upload is a good idea
    to handle the cases where upstream is tarball-centric. (I personally
    would not be eager to support pristine-tar only because I don't think it's
    sustainable, but it is very widely used, so my personal opinion may be
    wrong.) But I do think using the signed upstream Git tag even when they
    also have a signed tarball release is defensible and a matter of personal
    preference.

    > In my packages the git tree does not contain any autogenerated files,
    > which means that people using it will have to run autogen.sh. I think
    > pretty much everyone else using autotools is doing the same.

    Right, but that's a feature, not a bug. If it's just a matter of running
    the autotools, it's *better*, in my opinion, to start from the Git tag so
    that you don't have the already-generated files that you are trying to
    discard sitting around obscuring the situation and possibly still
    lingering because we have some bug in the code that's supposed to move
    them aside.

    I do not currently practice what I preach and base the Debian packages
    for code for which I'm also upstream on signed tarballs, but that's
    because I'm a creature of habit and haven't gotten around to changing
    that yet.

    --
    Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>

  • From Brian May@21:1/5 to Matthias Urlichs on Thu Jun 27 03:30:01 2024
    Matthias Urlichs <[email protected]> writes:

    > IMHO this is a mostly-solved problem.
    >
    > You can feed hashes of the offenders to "git filter-repo
    > --strip-blobs-with-ids ‹filename›". This operation is idempotent and
    > deterministic.
    >
    > If we add these hashes to a file, let's say d/source/dfsg-filtered, we
    > can thus reproducibly generate a dfsg-compliant version of whichever
    > upstream commit or tag we want, and of course generate a tarball from
    > there if required.

    Sometimes files have to be edited and/or created in order to make the
    tarball DFSG compliant and not fail to build. Just deleting a list of
    files is not sufficient.

    For example, if an individual file contains a mixture of non-dfsg stuff
    and dfsg stuff that is required for building.

    For more details, see this really old discussion from 2008:
    https://lists.debian.org/debian-devel/2008/06/msg00233.html

    I hope I haven't just opened a can of worms here :-)
    --
    Brian May @ Debian

  • From Andreas Tille@21:1/5 to All on Thu Jun 27 07:50:01 2024
    Hi Matthias,

    On Wed, Jun 26, 2024 at 03:25:34PM +0200, Matthias Urlichs wrote:
    > You can feed hashes of the offenders to "git filter-repo
    > --strip-blobs-with-ids ‹filename›". This operation is idempotent and
    > deterministic.
    >
    > If we add these hashes to a file, let's say d/source/dfsg-filtered, we
    > can thus reproducibly generate a dfsg-compliant version of whichever
    > upstream commit or tag we want, and of course generate a tarball from
    > there if required.

    Your suggestion sounds sensible. However, I'd prefer if we would not
    invent a file that might duplicate the content of the d/copyright
    Files-Excluded field - but this seems to be some implementation detail.

    Kind regards
    Andreas.

    --
    https://fam-tille.de

  • From Matthias Urlichs@21:1/5 to All on Thu Jun 27 09:10:01 2024
    On 27.06.24 07:48, Andreas Tille wrote:
    > I'd prefer if we would not
    > invent a file that might duplicate the content of the d/copyright
    > Files-Excluded field - but this seems to be some implementation detail.

    You have a point there. We could use "git filter-repo --invert-paths
    --paths-from-file <(extract-excluded-paths d/copyright)" instead.

    On the other hand: if the file isn't present anyway, why would we list
    it there in the first place?

    Brian:

    > For example, if an individual file contains a mixture of non-dfsg stuff
    > and dfsg stuff that is required for building.

    These cases would require some more intrusive editing. "git filter-repo"
    does support that: you can use a blob callback to edit individual objects.

    Of course we'd have to be a bit more careful to ensure that the callback
    is idempotent and all that, but again I don't see a major problem here,
    other than a somewhat-annoying intermediate step when you pull from
    upstream -- but we also need an annoying step when we build the next DFSG
    tarball, so that's no loss. :-P

    --
    -- with kind regards
    --
    -- Matthias Urlichs
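
The helper named in the command above does not exist as far as this thread says, so the following is only a guess at what it might do: read the Files-Excluded field from the header paragraph of d/copyright and print one path directive per line for "--paths-from-file". Emitting each pattern with a `glob:` prefix is an assumption made here, and the glob rules applied by uscan for Files-Excluded may not match those of git filter-repo exactly, so the output would need checking against a real package.

```python
def files_excluded(copyright_text):
    """Return the Files-Excluded patterns from the header paragraph."""
    # The header is the first paragraph; paragraphs are separated by blank lines.
    header = copyright_text.split("\n\n", 1)[0]
    patterns = []
    in_field = False
    for line in header.splitlines():
        if line[:1] in (" ", "\t"):
            # Continuation line: belongs to the field currently being read.
            if in_field:
                patterns.extend(line.split())
            continue
        name, _, value = line.partition(":")
        in_field = name.strip().lower() == "files-excluded"
        if in_field:
            patterns.extend(value.split())
    return patterns


def filter_repo_path_lines(patterns):
    """Format patterns as glob directives, one per line (assumed mapping)."""
    return "".join("glob:%s\n" % pattern for pattern in patterns)
```

Used together, `filter_repo_path_lines(files_excluded(text))` yields the text that the process substitution in the command above would supply. It only covers whole-file removal; the mixed-content files Brian mentions would still need the blob callback.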
