Hello everyone,
This is a draft GR. I'm posting it now for textual review, because of
the relative shortness of our official discussion periods.
After some time for review, I'll post again seeking seconds.
The first sections are an introductory discussion. For the actual GR
text, scroll down to the bottom of this e-mail. Thanks.
=====
INTRODUCTION
The tag2upload system, designed for deployment on official Debian infrastructure, allows DDs and DMs to make source-only uploads simply by pushing a signed git tag. There are two key advantages:
- it will be much quicker and easier for us to do most of our uploads
- it improves the traceability and auditability of our source-only
uploads, in ways that are particularly salient in the wake of xz-utils.
The system works like this:
1. Maintainer types 'git debpush' to sign and push a suitable git tag.
The tag includes certain metadata that makes the maintainer's
intention to upload fully traceable, and unambiguous.
2. A robot on DSA infrastructure automatically, reliably and traceably
builds the source package, and uploads it to the Debian Archive.
tag2upload will be an additional option for your source-only uploads;
no-one will be required to use it. For more information on the details
of the system itself, I've included some links down below.
ftpmaster stated a hard requirement that dak has to be able to
completely re-perform the verification of maintainer intent done by the tag2upload service. That goal cannot be met without fatally undermining
the tag2upload design and user experience.
Russ Allbery, and others, tried very hard to get ftpmaster to explain
why this should be a requirement, but we never got an answer that we
could understand as a strong technical objection, despite many attempts.
On Tuesday, June 11, 2024 3:25:02 PM MST Sean Whitton wrote:
ftpmaster stated a hard requirement that dak has to be able to
completely re-perform the verification of maintainer intent done by the
tag2upload service. That goal cannot be met without fatally
undermining the tag2upload design and user experience.
Russ Allbery, and others, tried very hard to get ftpmaster to explain
why this should be a requirement, but we never got an answer that we
could understand as a strong technical objection, despite many
attempts.
In order to make an informed decision, can you please explain in what
way dak is not able to "completely re-perform the verification of
maintainer intent done by the tag2upload service"?
And on the implementation details, I really do not like the idea of
having a competing git forge with Salsa. This dgit server seems to just
be a ye olde git-web interface.
If this goes forward, in my opinion it should exclusively use Salsa as
the git server, to avoid duplicating infrastructure.
Sean,
Thanks for taking the time to put this together.
In order to make an informed decision, can you please explain in what way dak is not able to "completely re-perform the verification of maintainer intent done by the tag2upload service"?
The short answer is that the input to dak is a source package, not a git
tag. And it's the latter that is signed by the maintainer, under
tag2upload.
A longer answer is that for dak to do that, it would need to reimplement
all of tag2upload. As you will see from the design docs, we have
carefully sandboxed the various stages of tag2upload's processing, for security isolation. It wouldn't make sense to implement all that again
on dak. And indeed, the git-to-source-package processing should not
happen on the same host where we have the master archive signing keys.
As tag2upload is security-sensitive, the design has had careful,
independent security review from Russ Allbery and Jonathan McDowell,
THE DESIGN & IMPLEMENTATION ARE LATE-STAGE
We wish to be clear that tag2upload can be deployed without *any*
code changes to dak. It just needs to be given a suitably trusted
key, very similar to how buildds have trusted keys.
Should this GR pass, then the tag2upload project will be unstuck, and
could be deployed in a matter of months, and the source-only uploads of
as many of us who want it can become just 'git debpush' and done,
without any other workflow changes or learning.
As I said several times before: the implementation has known security
bugs (unless you fixed them). But I guess this is going to get ignored again anyway...
Could you describe what known security vulnerabilities you believe exist,
Ansgar 🙀 writes:
In addition it reintroduces trust in weak cryptographic hashes which
effort was spent to remove.
I think this concern is significantly overblown and attempted to explain precisely why I believe that in my security review. I'll also point out
that using SHA-256 hashes in *.dsc files does not somehow mean that Debian
is no longer trusting SHA-1 hashes, given that most Debian development is done in Git using SHA-1 hashes.
I think we're all agreed that switching Git to SHA-256 hashes would be
great and, once that work is done, we should take advantage of it,
including in tag2upload.
Luca Boccassi writes:
And on the implementation details, I really do not like the idea of
having a competing git forge with Salsa. This dgit server seems to just
be a ye olde git-web interface.
Does it support gitweb? I thought it only supported regular Git
operations, but I could be mistaken.
If this goes forward, in my opinion it should exclusively use Salsa as
the git server, to avoid duplicating infrastructure.
I think you want the Git archive to be entirely separate from Salsa so
that it's a reliable source of tracing information. You don't want to
support force pushes, for example; the whole point is that it should be
append-only, which would be a controversial choice for Salsa but which
is fine for the archives of the uploaded packages. I would also want a
much smaller attack surface for that type of record than GitLab. GitLab
is designed as a place to do interactive work, not to keep a reliable
permanent record.
That Git archive is not parallel to or competitive with Salsa and doesn't provide most of the functionality that Salsa does. It has a different purpose.
On Wed, 12 Jun 2024 at 02:31, Russ Allbery wrote:
Luca Boccassi writes:
And on the implementation details, I really do not like the idea of having a competing git forge with Salsa. This dgit server seems to just be a ye olde git-web interface.
Does it support gitweb? I thought it only supported regular Git operations, but I could be mistaken.
I might be wrong, but this is what this looks like to me (it was
linked to me on IRC yesterday, wasn't aware of it before):
https://browse.dgit.debian.org/
If this goes forward, in my opinion it should exclusively use Salsa
as the git server, to avoid duplicating infrastructure.
I think you want the Git archive to be entirely separate from Salsa
so that it's a reliable source of tracing information. You don't
want to support force pushes, for example; the whole point is that it should be append-only, which would be a controversial choice for
Salsa but which is fine for the archives of the uploaded packages. I
would also want a much smaller attack surface for that type of record
than GitLab. GitLab is designed as a place to do interactive
work, not to keep a reliable permanent record.
The git repositories, sure. The git forge? I don't see why. You can
have these repositories in a separate namespace, which sets strong
branch and tag protection rules to achieve what you describe. As far
as I am aware, this is possible to do in Salsa already, it doesn't
have to be a per-forge rule, it can be per-namespace, I think this is possible to achieve in Gitlab. I have not used tag protection rules
(on gitlab, I used them on github though), but I do regularly use
branch protection rules on my Salsa repositories.
To be clear, I am exclusively talking about the git forge, as in salsa.debian.org, not the git repositories as they might exist on
Salsa under the debian/ namespace or any other namespace.
Having a separate namespace with strong ACLs seems exactly what you
want, even if it duplicates the individual repositories (the backend
git store deduplicates it anyway, so in practice it should be quite
cheap). Having an entire separate git forge that competes with Salsa
seems orthogonal to this, and counterproductive for the project.
That Git archive is not parallel to or competitive with Salsa and doesn't provide most of the functionality that Salsa does. It has a different purpose.
I disagree strongly. As we have seen in the recent Salsa thread on
d-private, there are a few but very strongly opinionated people who
are vehemently against Salsa and would like to see it gone. Having a
parallel and competing git forge I fear would give them very strong ammunition to do so: "if the real uploads and the real repositories
are on a separate and independent git forge, why have Salsa at all?
Get rid of it and use the other forge exclusively."
Russ Allbery writes:
Ansgar 🙀 writes:
In addition it reintroduces trust in weak cryptographic hashes which effort was spent to remove.
I think this concern is significantly overblown and attempted to explain precisely why I believe that in my security review. I'll also point out that using SHA-256 hashes in *.dsc files does not somehow mean that Debian is no longer trusting SHA-1 hashes, given that most Debian development is done in Git using SHA-1 hashes.
I think we're all agreed that switching Git to SHA-256 hashes would be great and, once that work is done, we should take advantage of it, including in tag2upload.
I have not more than skimmed the architecture, so forgive me if this
makes no sense: Could this fear (whether overblown or not) not be
alleviated by including in the tag2upload structured metadata a SHA-256
hash of all the files in the given commit?
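As an illustration of the suggestion above, here is a minimal sketch of how a SHA-256 manifest over the files of a checked-out commit could be computed. It is not part of tag2upload or dgit; the function names `sha256_manifest` and `manifest_digest` and the manifest line format are assumptions made up for this example, and the sketch ignores file modes and symlink targets, which a real scheme would have to cover.

```python
import hashlib
from pathlib import Path


def sha256_manifest(tree_root):
    """Return sorted (relative_path, sha256_hex) pairs for the files under tree_root.

    Hypothetical helper: anything inside a .git directory is skipped.
    """
    root = Path(tree_root)
    entries = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if path.is_file() and ".git" not in rel.parts:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            entries.append((rel.as_posix(), digest))
    return entries


def manifest_digest(entries):
    """Collapse the manifest into a single SHA-256 value.

    The "<digest>  <path>" line format is an assumption for illustration.
    """
    h = hashlib.sha256()
    for rel_path, digest in entries:
        h.update(f"{digest}  {rel_path}\n".encode("utf-8"))
    return h.hexdigest()
```

The single value returned by `manifest_digest` is the kind of thing that could be quoted in signed tag metadata, so that a verifier can recompute it from the tree without relying on Git's SHA-1 object names.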
On Wed, 2024-06-12 at 06:25 +0800, Sean Whitton wrote:
As tag2upload is security-sensitive, the design has had careful, independent security review from Russ Allbery and Jonathan McDowell,
As I said several times before: the implementation has known security
bugs (unless you fixed them). But I guess this is going to get ignored
again anyway... Reviewing the design doesn't help with this.
In addition it reintroduces trust in weak cryptographic hashes which
effort was spent to remove.
ftpmaster stated a hard requirement that dak has to be able to
completely re-perform the verification of maintainer intent done by the tag2upload service. That goal cannot be met without fatally undermining the tag2upload design and user experience.
That's not the only issue. Known security issues are another.
In addition, from the history of WebPKI compromises, it should be well
understood that having several paths to certificate issuance is not a
good idea. Several paths to introduce source to Debian have similar
problems.
THE DESIGN & IMPLEMENTATION ARE LATE-STAGE
We wish to be clear that tag2upload can be deployed without *any*
code changes to dak. It just needs to be given a suitably trusted
key, very similar to how buildds have trusted keys.
And we also remove the Debian Maintainer role as dak would no longer
know who uploaded the package? Debian is larger than only Debian
Developers.
If only one could use regular git instead of a custom, non-standard VCS
built on top of Git that makes some workflows impossible and team
maintenance harder by not supporting publishing intermediate work. :-(
On Tue, Jun 11, 2024 at 10:27:56PM -0700, Russ Allbery wrote:
As I said several times before: the implementation has known security
bugs (unless you fixed them). But I guess this is going to get ignored
again anyway...
Could you describe what known security vulnerabilities you believe
exist,
does it matter if this GR is about a design? currently the RFC is not
to vote about an implementation... :/
On Wed, Jun 12, 2024 at 06:50:44AM +0200, Ansgar 🙀 wrote:
In addition it reintroduces trust in weak cryptographic hashes which
effort was spent to remove.
Thanks for reminding. While I've seen arguments in favour of the
weaknesses of sha1 not affecting our use much, the xz-incident changes
the weights of those arguments for me. I am now wondering whether we can
be more proactive about changing the hash function used by git. For new repositories, this seems as simple as git init --object-format=sha256.
Doing so will make repositories inaccessible to people running buster or older and I guess we can live with that limitation. It is not clear to
me how repositories are converted. To me it seems plausible to deny use
of sha1 hashes with debpush at this time (even though that is not
implemented right now).
I note that use of weak hashes in the current tag2upload is not a
fundamental blocker but something that I expect proponents to work on in
case the GR passes. Would one of the proponents confirm that they see
this as worth spending their time on (on the condition that the GR
passes)?
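To make the idea of refusing SHA-1 object names concrete, here is a minimal sketch. It relies only on the fact that Git object names are 40 hex digits in SHA-1 repositories and 64 hex digits in SHA-256 repositories. The function names are hypothetical, and nothing like this is implemented in git-debpush as far as this thread says.

```python
import re

_HEX = re.compile(r"[0-9a-f]+")


def object_format(oid):
    """Guess the Git object format from the length of an object name."""
    oid = oid.strip().lower()
    if not _HEX.fullmatch(oid):
        raise ValueError(f"not a hex object name: {oid!r}")
    if len(oid) == 40:
        return "sha1"
    if len(oid) == 64:
        return "sha256"
    raise ValueError(f"unexpected object name length: {len(oid)}")


def is_acceptable(oid):
    """Hypothetical policy: accept only SHA-256 object names."""
    return object_format(oid) == "sha256"
```

In an actual repository, `git rev-parse --show-object-format` reports the format directly, which would be more robust than inferring it from the length.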
As far as I understand, the GR is about pushing the design and
implementation as is, without any changes. It very explicitly says so.
On Wed, 2024-06-12 at 09:18 +0200, Gard Spreemann wrote:
I have not more than skimmed the architecture, so forgive me if this
makes no sense: Could this fear (whether overblown or not) not be
alleviated by including in the tag2upload structured metadata a SHA-256
hash of all the files in the given commit?
Yes, that was suggested as a compromise in the past, but tag2upload
upstream was not interested in having any changes.
Quoting Luca Boccassi (2024-06-12 10:21:40)
Having a separate namespace with strong ACLs seems exactly what you
want, even if it duplicates the individual repositories (the backend
git store deduplicates it anyway, so in practice it should be quite
cheap). Having an entire separate git forge that competes with Salsa
seems orthogonal to this, and counterproductive for the project.
I fail to recognize how strong ACLs achieve exactly the same as separate storage on a separate host. Especially when the purpose is to minimize attack vectors.
That Git archive is not parallel to or competitive with Salsa and doesn't provide most of the functionality that Salsa does. It has a different purpose.
I disagree strongly. As we have seen in the recent Salsa thread on d-private, there are a few but very strongly opinionated people who
are vehemently against Salsa and would like to see it gone. Having a parallel and competing git forge I fear would give them very strong ammunition to do so: "if the real uploads and the real repositories
are on a separate and independent git forge, why have Salsa at all?
Get rid of it and use the other forge exclusively."
I don't follow d-private, but sounds to me like that argument goes both
ways - i.e. also "if the real uploads and the real repositories are on
(some specially locked down section of) same git forge, why not embrace additional features offered from same vendor of said forge?"
On Wed, 12 Jun 2024 at 09:35, Jonas Smedegaard wrote:
I fail to recognize how strong ACLs achieve exactly the same as separate storage on a separate host. Especially when the purpose is to minimize attack vectors.
As per the security review just shared, admin access to Salsa allows
to push commits anyway which would get uploaded just the same, and
again as per security review, this case benefits from centralizing:
one host to maintain, and one set of admins to trust, is better than
two. Especially as Salsa is Gitlab, which is maintained upstream and
benefits from the many-eyes-and-many-users situation, while a
completely custom local git forge reimplementation, other than
inevitably suffering from bitrot at some point in the future, like all
custom infrastructure, will have the disadvantage that nobody else
uses it. This is the reason Alioth is gone, and it's a very good
reason.
That Git archive is not parallel to or competitive with Salsa and doesn't
provide most of the functionality that Salsa does. It has a different purpose.
I disagree strongly. As we have seen in the recent Salsa thread on d-private, there are a few but very strongly opinionated people who
are vehemently against Salsa and would like to see it gone. Having a parallel and competing git forge I fear would give them very strong ammunition to do so: "if the real uploads and the real repositories
are on a separate and independent git forge, why have Salsa at all?
Get rid of it and use the other forge exclusively."
I don't follow d-private, but sounds to me like that argument goes both ways - i.e. also "if the real uploads and the real repositories are on (some specially locked down section of) same git forge, why not embrace additional features offered from same vendor of said forge?"
I don't follow, we already use features from Salsa? Like the CI
pipeline, which is awesome. ACLs on repositories are not really unique
or particular to Github, modern forges pretty much have to support
them, Github has them too.
Quoting Luca Boccassi (2024-06-12 12:28:21)
As per the security review just shared, admin access to Salsa allows
to push commits anyway which would get uploaded just the same, and
again as per security review, this case benefits from centralizing:
one host to maintain, and one set of admins to trust, is better than
two. Especially as Salsa is Gitlab, which is maintained upstream and benefits from the many-eyes-and-many-users situation, while a
completely custom local git forge reimplementation, other than
inevitably suffering from bitrot at some point in the future, like all custom infrastructure, will have the disadvantage that nobody else
uses it. This is the reason Alioth is gone, and it's a very good
reason.
So your argument is that strong ACLs achieve exactly the same as
separate storage on a separate host, because separate storage on a
separate host inevitably leads to bitrot and lack of eyeballs.
I rest my case.
That Git archive is not parallel to or competitive with Salsa and doesn't
provide most of the functionality that Salsa does. It has a different
purpose.
I disagree strongly. As we have seen in the recent Salsa thread on d-private, there are a few but very strongly opinionated people who
are vehemently against Salsa and would like to see it gone. Having a parallel and competing git forge I fear would give them very strong ammunition to do so: "if the real uploads and the real repositories
are on a separate and independent git forge, why have Salsa at all?
Get rid of it and use the other forge exclusively."
I don't follow d-private, but sounds to me like that argument goes both ways - i.e. also "if the real uploads and the real repositories are on (some specially locked down section of) same git forge, why not embrace additional features offered from same vendor of said forge?"
I don't follow, we already use features from Salsa? Like the CI
pipeline, which is awesome. ACLs on repositories are not really unique
or particular to Github, modern forges pretty much have to support
them, Github has them too.
Sorry, I cannot possibly get a point across a cloud of awesomeness.
On Tue, Jun 11, 2024 at 10:27:56PM -0700, Russ Allbery wrote:
As I said several times before: the implementation has known security bugs (unless you fixed them). But I guess this is going to get ignored again anyway...
Could you describe what known security vulnerabilities you believe exist,
does it matter if this GR is about a design? currently the RFC is not to
vote about an implementation... :/
(to be clear: I do find it very wrong to vote about a design, not an
implementation. I'd probably also find it wrong to vote for an
implementation, but it's worse to decide by vote that a design should
be implemented.)
the RFC for this GR only links to some design documents, at least
that's what the RFC says, I haven't followed those links yet.
And it certainly doesn't describe a (minimal) version/git tag/release of said implementation.
Luca Boccassi writes ("Re: [RFC] General Resolution to deploy tag2upload"):
On Wed, 12 Jun 2024 at 02:31, Russ Allbery wrote:
Does it support gitweb? I thought it only supported regular Git operations, but I could be mistaken.
I might be wrong, but this is what this looks like to me (it was
linked to me on IRC yesterday, wasn't aware of it before):
https://browse.dgit.debian.org/
Thanks for taking an interest.
That is indeed a cgit view of the dgit git server.
The git repositories, sure. The git forge?
I'm not sure what you mean by "forge". I think "forge" means
"something like gitlab / sourcehut / github / ...". Ie, a system
which doesn't just do git repository hosting, but also has (some or
all of) a linked issue tracker, merge request review system,
discussion tooling, CI, etc.
The t2u/dgit git server doesn't have any of those things. It's purely
a git server, with some bespoke access control. So I don't think it's
a "forge".
It *does* present the contents of its repositories via a web interface
for use by a web browser, but that view is completely read-only.
(Indeed in the current setup in Debian it's served from a mirror.)
Whatever the definition of "forge" the key point you are making is
that you see it as competing with Salsa. But, I don't think it does
compete with Salsa. It doesn't offer any of the useful features that
Salsa has.
(In another sense, the Debian archive + ci.debian.net + the BTS +
britney etc. etc., *is* competing with Salsa. In this view of things, browse.dgit.d.o is a view of part of the archive. But I don't think
that's where you're coming from.)
Salsa is not really suitable for use as the t2u/dgit repos git server,
for the reasons others have explained in this thread. Very early in
dgit's history, the dgit repos were hosted on Alioth, but we replaced
that with a dedicated server for many reasons, some of which are security-related and discussed in Russ's review.
As far as I can tell, from what was shared in these documents, the
security feature needed is an append-only repository, with safeguards
that an individual developer cannot bypass. As far as I can tell, the
same setup can be achieved with repository ACLs, and it would have the
same vulnerability: an admin with full access to the server can bypass
such measures, in either case. Is there something else I am missing?
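For readers unfamiliar with what "append-only" means at the level of ref updates, here is a minimal sketch of the rule a server-side check could apply to each `old new ref` triple that Git feeds to a pre-receive hook. This is a generic illustration, not the dgit server's actual access control and not a GitLab feature. The name `check_update` and the exact policy (refuse deletions, refuse moving existing tags, refuse non-fast-forward branch updates) are assumptions.

```python
import subprocess


def is_zero(oid):
    """Git uses an all-zero object name to signal ref creation or deletion."""
    return set(oid) == {"0"}


def git_is_ancestor(old, new):
    """True if old is an ancestor of new; must run inside the repository."""
    result = subprocess.run(["git", "merge-base", "--is-ancestor", old, new])
    return result.returncode == 0


def check_update(old, new, ref, is_ancestor=git_is_ancestor):
    """Return None if the update keeps history append-only, else a reason."""
    if is_zero(new):
        return f"{ref}: deletion refused"
    if is_zero(old):
        return None  # creating a new ref only adds history
    if ref.startswith("refs/tags/"):
        return f"{ref}: existing tag may not be replaced"
    if not is_ancestor(old, new):
        return f"{ref}: non-fast-forward update refused"
    return None
```

A hook would call `check_update` for every line on standard input and exit non-zero if any call returns a reason. As noted in the message above, a rule like this only binds ordinary pushers; someone with administrative access to the host can bypass it on either kind of server.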
On Wed, 12 Jun 2024 at 12:03, Jonas Smedegaard wrote:
Quoting Luca Boccassi (2024-06-12 12:28:21)
On Wed, 12 Jun 2024 at 09:35, Jonas Smedegaard wrote:
Quoting Luca Boccassi (2024-06-12 10:21:40)
On Wed, 12 Jun 2024 at 02:31, Russ Allbery wrote:
Luca Boccassi writes:
And on the implementation details, I really do not like the idea of
having a competing git forge with Salsa. This dgit server seems to just
be a ye olde git-web interface.
Does it support gitweb? I thought it only supported regular Git operations, but I could be mistaken.
I might be wrong, but this is what this looks like to me (it was linked to me on IRC yesterday, wasn't aware of it before):
https://browse.dgit.debian.org/
If this goes forward, in my opinion it should exclusively use Salsa
as the git server, to avoid duplicating infrastructure.
I think you want the Git archive to be entirely separate from Salsa so that it's a reliable source of tracing information. You don't want to support force pushes, for example; the whole point is that it
should be append-only, which would be a controversial choice for Salsa but which is fine for the archives of the uploaded packages. I
would also want a much smaller attack surface for that type of record
than GitLab. GitLab is designed as a place to do interactive work, not to keep a reliable permanent record.
The git repositories, sure. The git forge? I don't see why. You can have these repositories in a separate namespace, which sets strong branch and tag protection rules to achieve what you describe. As far as I am aware, this is already possible on Salsa: it doesn't have to be a per-forge rule, it can be per-namespace, and I think Gitlab supports that. I have not used tag protection rules on Gitlab (I have used them on Github, though), but I do regularly use branch protection rules on my Salsa repositories.
To be clear, I am exclusively talking about the git forge, as in salsa.debian.org, not the git repositories as they might exist on Salsa under the debian/ namespace or any other namespace.
Having a separate namespace with strong ACLs seems exactly what you want, even if it duplicates the individual repositories (the backend git store deduplicates it anyway, so in practice it should be quite cheap). Having an entire separate git forge that competes with Salsa seems orthogonal to this, and counterproductive for the project.
I fail to recognize how strong ACLs achieve exactly the same as separate storage on a separate host. Especially when the purpose is to minimize attack vectors.
As per the security review just shared, admin access to Salsa allows
to push commits anyway which would get uploaded just the same, and
again as per security review, this case benefits from centralizing:
one host to maintain, and one set of admins to trust, is better than
two. Especially as Salsa is Gitlab, which is maintained upstream and benefits from the many-eyes-and-many-users situation, while a
completely custom local git forge reimplementation, other than
inevitably suffering from bitrot at some point in the future, like all custom infrastructure, will have the disadvantage that nobody else
uses it. This is the reason Alioth is gone, and it's a very good
reason.
So your argument is that strong ACLs achieve exactly the same as separate storage on a separate host, because separate storage on a
separate host inevitably leads to bitrot and lack of eyeballs.
I rest my case.
No, my argument is that append-only can (as far as I can tell) be
achieved on Salsa too, it doesn't seem to necessitate a bespoke forge.
The centralizing argument is not mine, it's from the security review
that was published this morning:
"My security recommendation in this case is therefore to centralize
the risk as much as possible, moving it off of individual uploader
systems with unknown security profiles and onto a central system that
can be analyzed and iteratively improved."
https://lists.debian.org/debian-vote/2024/06/msg00004.html
That Git archive is not parallel to or competitive with Salsa and doesn't
provide most of the functionality that Salsa does. It has a different
purpose.
I disagree strongly. As we have seen in the recent Salsa thread on d-private, there are a few but very strongly opinionated people who are vehemently against Salsa and would like to see it gone. Having a parallel and competing git forge I fear would give them very strong ammunition to do so: "if the real uploads and the real repositories are on a separate and independent git forge, why have Salsa at all? Get rid of it and use the other forge exclusively."
I don't follow d-private, but it sounds to me like that argument goes both ways - i.e. also "if the real uploads and the real repositories are on (some specially locked down section of) the same git forge, why not embrace additional features offered by the same vendor of said forge?"
I don't follow, we already use features from Salsa? Like the CI
pipeline, which is awesome. ACLs on repositories are not really unique
or particular to Gitlab; modern forges pretty much have to support
them, Github has them too.
Sorry, I cannot possibly get a point across a cloud of awesomeness.
"Having an easy-to-use and working CI is really bad for a software development organization, actually" is... a bold take, no doubt about
that.
But anyway, thanks for proving my point for me: there is a small but
loud minority who would like to kill Salsa, and this proposal as
implemented would help them achieve that goal. If it goes to a GR,
this is enough to make me vote against it, as while the concept is
really nice and I like it a lot, it's not worth jeopardizing Salsa's existence.
I have not more than skimmed the architecture, so forgive me if this
makes no sense: Could this fear (whether overblown or not) not be
alleviated by including in the tag2upload structured metadata a SHA-256
hash of all the files in the given commit?
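To make the question concrete, here is a minimal sketch of what such a hash could look like. It is only an illustration, not part of tag2upload's actual tag metadata: the function names `tree_manifest` and `manifest_digest` and the manifest line format are assumptions invented for this example. It hashes every regular file in a checked-out tree with SHA-256 and then hashes the sorted manifest, so that one value commits to all file contents independently of git's own object hash.

```python
import hashlib
from pathlib import Path


def tree_manifest(root):
    """Return sorted (relative path, SHA-256 hex digest) pairs for the files under root.

    Anything inside a .git directory is skipped, so only the checked-out
    tree contents are covered.
    """
    root = Path(root)
    entries = []
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        if path.is_file() and ".git" not in rel.parts:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            entries.append((rel.as_posix(), digest))
    entries.sort()
    return entries


def manifest_digest(entries):
    """Hash the manifest itself, giving a single value that could go into tag metadata."""
    h = hashlib.sha256()
    for name, digest in entries:
        # Hypothetical line format, loosely modelled on sha256sum output.
        h.update(f"{digest}  {name}\n".encode("utf-8"))
    return h.hexdigest()
```

A verifier would recompute `manifest_digest(tree_manifest(checkout))` and compare it with the value recorded in the signed tag. Note that this sketch ignores file modes and symlink targets, which a real scheme would have to cover.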
I think we probably need to provide more links, then.[...]
Let me try to help.
Hope this helps.
Luca Boccassi writes ("Re: [RFC] General Resolution to deploy tag2upload"):
As far as I can tell, from what was shared in these documents, the
security feature needed is an append-only repository, with safeguards
that an individual developer cannot bypass. As far as I can tell, the
same setup can be achieved with repository ACLs, and it would have the
same vulnerability: an admin with full access to the server can bypass
such measures, in either case. Is there something else I am missing?
There is also an assurance question. Salsa is running gitlab, which
is an extremely complicated piece of software with very many features.
Any one of those features (which are constantly changing) offers an opportunity for compromise of Salsa. Also, we don't have the
resources to audit all the code coming from gitlab upstream.
The attack surface of the dgit repos server is much smaller. Its
supply chain integrity is much better. So it is much less likely to
be compromised. (Also, diversity of implementation is helpful.)
And, while I find gitlab and Salsa very convenient, we have already
had one git forge fall by the wayside. As I say, the dgit repos
server has already survived the death of one forge and it needs to
survive any problem with Salsa.
Finally, using Salsa instead would involve modelling the Debian upload permissions model in repository ACLs, which we don't currently do.
We would need to link uploaders' PGP keys to their ssh keys and rely
on ssh keys, I think.
Thanks for reminding. While I've seen arguments in favour of the
weaknesses of sha1 not affecting our use much, the xz-incident changes
the weights of those arguments for me. I am now wondering whether we can
be more proactive about changing the hash function used by git. For new repositories, this seems as simple as git init --object-format=sha256.
I note that use of weak hashes in the current tag2upload is not a
fundamental blocker but something that I expect proponents to work on in
case the GR passes. Would one of the proponents confirm that they see
this as worth spending their time on (on the condition that the GR
passes)?
I think this depends on the bigger picture. When the /usr-merge transition was started, it was repeatedly sold as opt-in, but it really
was not meant as opt-in. Maybe tag2upload is similar. While the mail
thread suggests that it does not have to be used, I wouldn't be
surprised to be talking about terminating non-git uploads later. Once
that happens, we'd be back to one path of issuance and that one path is tag2upload then. It is not entirely clear whether this is the vision or
not and to me this vision would make the argument for tag2upload
stronger due to the reason you give.
We only really have two ways of changing the upload process. Either we temporarily add a new process and later remove the old process or we do
a flag-day transition. I guess that most of us agree that doing a
flag-day transition from .dsc uploads to git-debpush would not pass
muster. So we have little option but to add it if we agree that the
upload process needs changes.
And we also remove the Debian Maintainer role as dak would no longer
know who uploaded the package? Debian is larger than only Debian Developers.
This is a policy aspect. When we need to revoke a key used for uploading
this happens via keyring maintainers as far as I understand, but in
urgent cases it is ftp master who can also deny upload rights more
quickly than via a keyring update. In moving to tag2upload as a service external to ftp, we partially move this capability from ftp master to
the entity running tag2upload (DSA afaiui). Is there a sensible way to
leave this policy aspect with the ftp team when using the tag2upload
service? In effect, I'm asking whether ftp could somehow provide an authorization oracle to be used by tag2upload.
If only one could use regular git instead of a custom, non-standard VCS built on top of Git that makes some workflows impossible and team maintenance harder by not supporting publishing intermediate work. :-(
Even though the people behind the tag2upload work are the same as dgit,
the tag2upload service has been carefully designed to actually work with
most maintainer views. It also uploads to dgit, but I think tag2upload
would also work if that dgit part were skipped (please correct me if I'm wrong about this). Hence, tag2upload provides a very important value to
me: I then get an authentication chain from the Debian archive signing
key to the actual git object used for uploading. That is a property that
I would formerly only get from using dgit and only for the dgit view.
With tag2upload, we would be authenticating actual maintainer tags of maintainer histories on salsa from the archive signing key. To me, this
is a significant step affecting my workflows in a positive way.
Quoting Luca Boccassi (2024-06-12 14:40:01)
On Wed, 12 Jun 2024 at 12:52, Ian Jackson
<[email protected]> wrote:
Luca Boccassi writes ("Re: [RFC] General Resolution to deploy tag2upload"):
As far as I can tell, from what was shared in these documents, the security feature needed is an append-only repository, with safeguards that an individual developer cannot bypass. As far as I can tell, the same setup can be achieved with repository ACLs, and it would have the same vulnerability: an admin with full access to the server can bypass such measures, in either case. Is there something else I am missing?
There is also an assurance question. Salsa is running gitlab, which
is an extremely complicated piece of software with very many features. Any one of those features (which are constantly changing) offers an opportunity for compromise of Salsa. Also, we don't have the
resources to audit all the code coming from gitlab upstream.
The attack surface of the dgit repos server is much smaller. Its
supply chain integrity is much better. So it is much less likely to
be compromised. (Also, diversity of implementation is helpful.)
Given we had a very well done and professional security review (thanks Russ!), I think we should defer to that and take it into serious consideration, and its conclusion seems quite clear to me in this
regard:
"My security recommendation in this case is therefore to centralize
the risk as much as possible, moving it off of individual uploader
systems with unknown security profiles and onto a central system that
can be analyzed and iteratively improved."
So I don't think this is a good argument. One system is better than
two. And we need to secure all of it anyway, as Salsa is a component
of the solution anyway.
I read the analysis more as saying that two systems are better than one thousand systems.
I.e. centralizing (compared to building done on developers' systems) onto a system that can be analyzed (which is quite a challenge to do for Gitlab).
On Wed, 12 Jun 2024 at 12:52, Ian Jackson
<[email protected]> wrote:
Luca Boccassi writes ("Re: [RFC] General Resolution to deploy tag2upload"):
As far as I can tell, from what was shared in these documents, the security feature needed is an append-only repository, with safeguards that an individual developer cannot bypass. As far as I can tell, the same setup can be achieved with repository ACLs, and it would have the same vulnerability: an admin with full access to the server can bypass such measures, in either case. Is there something else I am missing?
There is also an assurance question. Salsa is running gitlab, which
is an extremely complicated piece of software with very many features.
Any one of those features (which are constantly changing) offers an opportunity for compromise of Salsa. Also, we don't have the
resources to audit all the code coming from gitlab upstream.
The attack surface of the dgit repos server is much smaller. Its
supply chain integrity is much better. So it is much less likely to
be compromised. (Also, diversity of implementation is helpful.)
Given we had a very well done and professional security review (thanks Russ!), I think we should defer to that and take it into serious consideration, and its conclusion seems quite clear to me in this
regard:
"My security recommendation in this case is therefore to centralize
the risk as much as possible, moving it off of individual uploader
systems with unknown security profiles and onto a central system that
can be analyzed and iteratively improved."
So I don't think this is a good argument. One system is better than
two. And we need to secure all of it anyway, as Salsa is a component
of the solution anyway.
On Wed, 12 Jun 2024 at 13:47, Jonas Smedegaard <[email protected]> wrote:
Luca Boccassi writes ("Re: [RFC] General Resolution to deploy tag2upload"):
As far as I can tell, from what was shared in these documents, the security feature needed is an append-only repository, with safeguards that an individual developer cannot bypass. As far as I can tell, the same setup can be achieved with repository ACLs, and it would have the
same vulnerability: an admin with full access to the server can bypass
such measures, in either case. Is there something else I am missing?
I read the analysis more that two systems is better than one thousand systems.
I.e. centralizing (compared to building done on developers' systems)
to a system that can be analyzed (which Gitlab is quite a challenge
to do).
"centralize the risk as much as possible" applies to both cases, as
does the justification for it. And again, Salsa is already part of the solution, so this argument doesn't seem very strong to me.
Quoting Luca Boccassi (2024-06-12 14:55:13)
On Wed, 12 Jun 2024 at 13:47, Jonas Smedegaard <[email protected]> wrote:
[...]
Luca Boccassi writes ("Re: [RFC] General Resolution to deploy tag2upload"):
As far as I can tell, from what was shared in these documents, the security feature needed is an append-only repository, with safeguards
that an individual developer cannot bypass. As far as I can tell, the
same setup can be achieved with repository ACLs, and it would have the
same vulnerability: an admin with full access to the server can bypass
such measures, in either case. Is there something else I am missing?
[...]
I read the analysis more that two systems is better than one thousand systems.
I.e. centralizing (compared to building done on developers' systems)
to a system that can be analyzed (which Gitlab is quite a challenge
to do).
"centralize the risk as much as possible" applies to both cases, as
does the justification for it. And again, Salsa is already part of the solution, so this argument doesn't seem very strong to me.
No, not centralizing as much as possible, only as much as sensible.
You apparently find it equally sensible, specifically as a security
measure, to a) apply ACLs on an otherwise massively multi-user-write-access
host and b) use a separate far-less-featured host.
You claim that both setups have equal vulnerabilities.
I disagree. I think you are mistaken - and no, it is totally irrelevant
for this accusation whether or not I am a fan of Salsa, and whether or
not I represent a loud or silent minority or majority. This is not about
me.
A side note, but related: recently I had the (dis)pleasure of having
to deal with a git repository that was switched to sha256 (SUSE's
Gitea instance). The conversion is destructive, and breaks, for
example, git submodules functionality (a sha1 repository cannot use
sha256 submodules). So I'd be very careful in assuming repositories
can be converted later.
On Tuesday, June 11, 2024 6:25:02 PM EDT Sean Whitton wrote:
- it improves the traceability and auditability of our source-only
uploads, in ways that are particularly salient in the wake of xz-utils.
As I understand it, Debian was affected by the xz-utils hack, in
part, because some artifacts were inserted into an upstream tarball
that were not represented in the upstream git. Please explain how
use of tag2upload is relevant to this scenario? I'm afraid I don't
follow.
On Wed, 12 Jun 2024 at 14:15, Jonas Smedegaard <[email protected]> wrote:
You apparently find it equally sensible, specifically as a security measure, a) apply ACLs on an otherwise massively multi-user-write-access host and b) use a separate far-less-featured host.
You claim that both setups have equal vulnerabilities.
No, I claim they have different sets of vulnerabilities, disadvantages
and advantages, and that both can provide the required feature:
disallow force pushes/deleting tags. The hardest thing with security
is that it requires a constant, ongoing effort, that will never end,
and will only get harder. A widely used software like Gitlab is better
for this, as is a widely used kernel like Linux. Or are you suggesting
such a server should run on Hurd, given it's far-less-featured and
thus has a much smaller attack surface than Linux?
I disagree. I think you are mistaken - and no, it is totally
irrelevant for this accusation whether or not I am a fan of Salsa,
and whether or not I represent a loud or silent minority or majority.
This is not about me.
And I think it is very much relevant, given the obvious end goal of
some individuals is to kill Salsa, which this proposal - as it stands
- would facilitate.
As I understand it, Debian was affected by the xz-utils hack, in part, because some artifacts were inserted into an upstream tarball that were not represented in the upstream git. Please explain how use of tag2upload is relevant to this scenario? I'm afraid I don't follow.
I think that it was assumed, and I agree, that a well-maintained Debian
Luca Boccassi writes ("Re: [RFC] General Resolution to deploy tag2upload"):
And I think it is very much relevant, given the obvious end goal of
some individuals is to kill Salsa, which this proposal - as it stands
- would facilitate.
Gosh. Are you serious?
For the avoidance of doubt: I'm a fan of gitlab and of Salsa.
I don't want it killed.
In addition it reintroduces trust in weak cryptographic hashes which
effort was spent to remove.
While SHA-1 is generally deprecated, it is not "weak" in the way that it
Scott Kitterman writes ("Re: [RFC] General Resolution to deploy tag2upload"):
On Tuesday, June 11, 2024 6:25:02 PM EDT Sean Whitton wrote:
- it improves the traceability and auditability of our source-only
uploads, in ways that are particularly salient in the wake of xz-utils.
As I understand it, Debian was affected by the xz-utils hack, in
part, because some artifacts were inserted into an upstream tarball
that were not represented in the upstream git. Please explain how
use of tag2upload is relevant to this scenario? I'm afraid I don't
follow.
Disclaimer: I don't know precisely the Debian xz's maintainer's
workflow.
tag2upload, like dgit, ensures and insists that the git tree you are uploading corresponds precisely [1] to the generated source package.
If you base your Debian git maintainer branch on the upstream git (as
you should) and there is a discrepancy between the contents of the
upstream git branch, and the .orig.tar.gz you're using, the upload
will fail.
In the xz case, if the .orig.tar.gz is upstream's, that would have
detected the attack. More realistically, since the attacker was
targeting Debian, they would instead have had to put all of the
malicious code into the git repository, which is possible, but riskier
- so it makes the attack harder, or easier to detect, but doesn't rule
it out.
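The kind of discrepancy described above can be sketched roughly as follows. This is not how dgit or tag2upload implement their check; the function names, the decision to strip one leading path component from tarball members, and the decision to ignore the top-level .git and debian directories are all assumptions made for illustration. The sketch hashes every regular file in an .orig tarball and in a checkout of the upstream git tree, then reports what differs.

```python
import hashlib
import tarfile
from pathlib import Path


def tarball_hashes(tar_path, strip_components=1):
    """Map each regular file in the tarball to its SHA-256, dropping the top-level directory."""
    hashes = {}
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if not member.isfile():
                continue
            parts = Path(member.name).parts[strip_components:]
            if not parts:
                continue
            data = tar.extractfile(member).read()
            hashes["/".join(parts)] = hashlib.sha256(data).hexdigest()
    return hashes


def tree_hashes(root, ignore=(".git", "debian")):
    """Map each regular file in a checkout to its SHA-256, skipping ignored top-level dirs."""
    root = Path(root)
    hashes = {}
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        if path.is_file() and rel.parts[0] not in ignore:
            hashes[rel.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes


def discrepancies(tar_path, root):
    """Report files that exist on only one side or whose contents differ."""
    tar, tree = tarball_hashes(tar_path), tree_hashes(root)
    return {
        "only_in_tarball": sorted(tar.keys() - tree.keys()),
        "only_in_git": sorted(tree.keys() - tar.keys()),
        "content_differs": sorted(n for n in tar.keys() & tree.keys() if tar[n] != tree[n]),
    }
```

In an xz-like case, a file shipped only in the release tarball would show up under `only_in_tarball`. For Autotools projects that list is expected to be non-empty for legitimate generated files too, which is why a difference needs review rather than proving an attack.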
There are some caveats to this.
I believe some maintainers maintain an "upstream tarball imports"
branch, which has upstream git as its ancestor, but whose tree
contents are the upstream tarballs. They then base the Debian branch
on that. That workflow is vulnerable to "random stuff" in the
tarballs.
It would also be possible to create a debian/patches/ patch [2]
representing the difference between git and the tarball. There are
various tools in Debian that might make such a patch, including (I
think) dpkg-source, gbp and perhaps dgit, depending on what workflow
and options and so on.
There are probably other workflows that have similar weaknesses.
I wouldn't recommend any of them.
Stepping back a bit, the underlying theme is (obviously) that the
upstream tarball wasn't great, in this case.
In Debian we have historically had a strong culture of wanting to use upstream release tarballs. That made a lot of sense 20-30 years ago
when almost all free software projects released tarballs, and
considered them primary, and the VCS situation was a total mess.
Nowadays, for most projects, the upstream developers work in git. So
git is the source code. Upstream provides tarballs via some
semi-automated process, but it's not what they work with. I.e. the
tarballs are an intermediate build product.
In Debian we are supposed to use the source code. We should be using
the same thing as upstream.
There are other reasons why tarballs can be worse, besides the risk that
they are maliciously modified. Often tarballs contain prebuilt stuff
of various kinds. In Debian we usually want to build everything from
source. That's much easier to get right if we start from the actual
source!
Ian.
[1] Modulo "patches-applied" vs "patches-unapplied" and some other
fiddly details which aren't relevant to this discussion.
[2] Assuming a gbp workflow and `3.0 (quilt)`, for the moment.
tag2upload, like dgit, ensures and insists that the git tree you are uploading corresponds precisely [1] to the generated source package.
If you base your Debian git maintainer branch on the upstream git (as
you should) and there is a discrepancy between the contents of the
upstream git branch, and the .orig.tar.gz you're using, the upload
will fail.
In the xz case, if the .orig.tar.gz is upstream's, that would have
detected the attack.
Having a separate namespace with strong ACLs seems exactly what you
want, even if it duplicates the individual repositories (the backend
git store deduplicates it anyway, so in practice it should be quite
cheap). Having an entire separate git forge that competes with Salsa
seems orthogonal to this, and counterproductive for the project.
On Wed, 12 Jun 2024 at 15:20:45 +0100, Ian Jackson wrote:
tag2upload, like dgit, ensures and insists that the git tree you are uploading corresponds precisely [1] to the generated source package.
If you base your Debian git maintainer branch on the upstream git (as
you should) and there is a discrepancy between the contents of the
upstream git branch, and the .orig.tar.gz you're using, the upload
will fail.
Is your position here that if your upstream releases source tarballs
that intentionally differ from what's in git (notably this is true
for Autotools `make dist`), then any Good™ maintainer must generate
their own .orig.tar.* from upstream git and use those in the upload, disregarding upstream's source tarball entirely?
Is your position here that if your upstream releases source tarballs
that intentionally differ from what's in git (notably this is true
for Autotools `make dist`), then any Good™ maintainer must generate
their own .orig.tar.* from upstream git and use those in the upload, disregarding upstream's source tarball entirely?
It is mine, and this is what I have been doing for a long time for all
my packages.
On Wed, 12 Jun 2024 at 15:34, Ian Jackson
For the avoidance of doubt: I'm a fan of gitlab and of Salsa.
I don't want it killed.
I did not say you do! Nor did I say it was intentional. I am saying
that a few people do (and you know this), and I am worried that this
proposal could be taken advantage of for that purpose.
If we need a design, then we can easily avoid the problem points. There
is a working counter proposal open:
https://bblank.thinkmo.de/introducing-uploads-debian-git.html
As per the security review just shared, admin access to Salsa allows
to push commits anyway which would get uploaded just the same,
Is your position here that if your upstream releases source tarballs
that intentionally differ from what's in git (notably this is true
for Autotools `make dist`), then any Good™ maintainer must generate
their own .orig.tar.* from upstream git and use those in the upload, disregarding upstream's source tarball entirely?
It is mine, and this is what I have been doing for a long time for all
my packages.
On Wed, 12 Jun 2024 at 16:04:34 -0000, Marco d'Itri wrote:
Is your position here that if your upstream releases source tarballs
that intentionally differ from what's in git (notably this is true for
Autotools `make dist`), then any Good™ maintainer must generate their
own .orig.tar.* from upstream git and use those in the upload,
disregarding upstream's source tarball entirely?
It is mine, and this is what I have been doing for a long time for all
my packages.
If there is consensus that devref is lagging behind best-practice and actually this is fine (or preferable, or should-be-required), perhaps
someone who advocates this model could propose a replacement for devref §6.8.8?
There is also an assurance question. Salsa is running gitlab, which
is an extremely complicated piece of software with very many features.
Any one of those features (which are constantly changing) offers an opportunity for compromise of Salsa. Also, we don't have the
resources to audit all the code coming from gitlab upstream.
The attack surface of the dgit repos server is much smaller. Its
supply chain integrity is much better. So it is much less likely to
be compromised. (Also, diversity of implementation is helpful.)
And, while I find gitlab and Salsa very convenient, we have already
had one git forge fall by the wayside. As I say, the dgit repos
server has already survived the death of one forge and it needs to
survive any problem with Salsa.
Finally, using Salsa instead would involve modelling the Debian upload permissions model in repository ACLs, which we don't currently do.
We would need to link uploaders' PGP keys to their ssh keys and rely
on ssh keys, I think.
Given how much resistance there is to even t2u's current design, I
don't think placing this much reliance on Salsa etc. is politically
viable even if it were wise (which I think it wouldn't be).
Is your position here that if your upstream releases source tarballs
that intentionally differ from what's in git (notably this is true
for Autotools `make dist`), then any Good™ maintainer must generate their own .orig.tar.* from upstream git and use those in the upload, disregarding upstream's source tarball entirely?
It is mine, and this is what I have been doing for a long time for all
my packages.
If there is consensus that devref is lagging behind best-practice and actually this is fine (or preferable, or should-be-required), perhaps
someone who advocates this model could propose a replacement for devref §6.8.8?
I am still making my way through the discussion, however, and there
are many bits I haven't understood. But the project has (mostly)
decided and adopted Salsa as our project-wide Git "thingy". If it were feasible to adapt Salsa by adding the ACLs needed for tag2upload to be securely deployable, I don't follow the need to have a second Git implementation we'd all have to interface with (in order to use
tag2upload).
And even if Salsa is deemed insufficiently prepared (or having too
large a vulnerability footprint), a second, hidden Git-based server
could be made to pull from Salsa, quietly syncing and acting when the
right tags are found. And, of course, loudly complaining to users if
any invalid operation (i.e. history rewrites involving published tags)
were attempted.
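As a rough illustration of the "loudly complaining" part, the check such a mirror would need could look like the sketch below. It is an assumption-laden toy, not a description of the dgit-repos server: the function name `check_append_only` and the idea of keeping a recorded tag-to-commit map are invented for this example. It compares the tags the mirror has already recorded against what a fresh fetch from Salsa reports, accepting new tags and flagging any published tag that was moved or deleted.

```python
def check_append_only(recorded, fetched):
    """Compare recorded tags with freshly fetched ones.

    Both arguments map tag names to commit IDs. New tags are accepted;
    a recorded tag that now points elsewhere, or has vanished, is a
    violation of the append-only rule.
    """
    new_tags = {tag: commit for tag, commit in fetched.items() if tag not in recorded}
    moved = sorted(tag for tag in recorded if tag in fetched and fetched[tag] != recorded[tag])
    deleted = sorted(tag for tag in recorded if tag not in fetched)
    return {"new": new_tags, "moved": moved, "deleted": deleted}


# Hypothetical data: one published tag was rewritten, one new tag appeared.
report = check_append_only(
    recorded={"debian/1.0-1": "aaa111", "debian/1.0-2": "bbb222"},
    fetched={"debian/1.0-1": "aaa111", "debian/1.0-2": "fff999", "debian/1.0-3": "ccc333"},
)
if report["moved"] or report["deleted"]:
    print("refusing to sync, published tags were rewritten:",
          report["moved"] + report["deleted"])
```

This only covers tags. Detecting rewritten branch history would additionally need an ancestry check between the old and new branch tips.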
In my personal opinion, tag2upload is more compelling for packages where
the upstream maintainer treats Git tags as their primary release artifact, and less compelling for packages where the upstream maintainer views Git
as a possibly incomplete implementation detail of their workflow and
signed tarball releases as the only supported release artifact. I would
pick and choose when to use it based on those sorts of factors. That's
one of the reasons, in my mind, why use of it would be entirely optional. It's an extension of Debian packaging practices to a Git-first world, and therefore makes the most sense when upstream has adopted a Git-first development approach.
I am still making my way through the discussion, however, and there are
many bits I haven't understood. But the project has (mostly) decided and adopted Salsa as our project-wide Git "thingy". If it were feasible to adequate Salsa to add the ACLs needed for tag2upload to be securely deployable, I don't follow the need to have a second Git implementation
we'd all have to interface with (in order to use tag2upload).
And even if Salsa is deemed insufficiently prepared (or having a too
large vulnerability footprint), a second, hidden Git-based server could
be made to pull from Salsa, quietly syncing and acting when the right
tags are found. And, of course, loudly complaining to users if any
invalid operation (i.e. history rewrites involving published tags) were attempted.
There was more confusion about this point than I had anticipated, so
I want to emphasize that the dgit-repos server is not a forge, is not
a competitor to Salsa, doesn't replace Salsa in any way, and is not
something that people interact with the way that they interact with
Salsa.
It's much closer to a Git equivalent of archive.debian.org: a
persistent historical record accessible via the Git protocol and (as
I discovered during this thread) a cgit web interface.
On Wed, 2024-06-12 at 10:43 -0700, Russ Allbery wrote:
There was more confusion about this point than I had anticipated, so I
want to emphasize that the dgit-repos server is not a forge, is not a
competitor to Salsa, doesn't replace Salsa in any way, and is not
something that people interact with the way that they interact with
Salsa. It's much closer to a Git equivalent of archive.debian.org: a
persistent historical record accessible via the Git protocol and (as I
discovered during this thread) a cgit web interface.
In that sense, it's more like snapshot.debian.org, I think?
"Adam D. Barratt" <[email protected]> writes:
In that sense, it's more like snapshot.debian.org, I think?
Yes, apologies, that's a much better analogy.
Quoting Luca Boccassi (2024-06-12 15:27:36)
On Wed, 12 Jun 2024 at 14:15, Jonas Smedegaard <[email protected]> wrote:
You apparently find it equally sensible, specifically as a security measure, a) apply ACLs on an otherwise massively multi-user-write-access host and b) use a separate far-less-featured host.
You claim that both setups have equal vulnerabilities.
No, I claim they have different sets of vulnerabilities, disadvantages
and advantages, and that both can provide the required feature:
disallow force pushes/deleting tags. The hardest thing with security
is that it requires a constant, ongoing effort, that will never end,
and will only get harder. A widely used software like Gitlab is better
for this, as is a widely used kernel like Linux. Or are you suggesting
such a server should run on Hurd, given it's far-less-featured and
thus has a much smaller attack surface than Linux?
No, I am not suggesting the use of the Hurd here, and I am having a hard
time assuming good faith with the potential undertones of that question.
To answer your convoluted question, I am suggesting that Salsa and
tag2upload have very different needs (multi-user write versus multi-user append-only, drastically simplified), and consequently not to argue that reuse of Salsa for hosting tag2upload is a security benefit.
Luca Boccassi <[email protected]> writes:
As per the security review just shared, admin access to Salsa allows
to push commits anyway which would get uploaded just the same,
I'm not sure that I understand what you're saying here, but if I did understand this correctly, no, this is not correct. My security review
says the exact opposite of this: admin access to Salsa does not allow you
to bypass the tag2upload checks or upload a source package.
But you don't push to snapshot, it's just a backup method, it doesn't
take any input from DDs (AFAIK? Am I wrong?). Given https://browse.dgit.debian.org/ exists and has tons of stuff already,
and this proposal for tag2upload doesn't exist yet, I gather that dgit
is already a thing that is used independently of tag2upload?
So I don't think this analogy works. One couldn't say "let's remove archive.debian.org, just push to snapshot.debian.org", but one could say "let's remove salsa.debian.org, just push to dgit.debian.org".
On Wed, 12 Jun 2024 at 17:46, Russ Allbery <[email protected]> wrote:
I'm not sure that I understand what you're saying here, but if I did
understand this correctly, no, this is not correct. My security review
says the exact opposite of this: admin access to Salsa does not allow
you to bypass the tag2upload checks or upload a source package.
Probably "push commits anyway" was a wrong oversimplification, what I
was referring to was all the various "someone with admin access on
Salsa" mentions on the document you shared.
Luca Boccassi <[email protected]> writes:
But you don't push to snapshot, it's just a backup method, it doesn't
take any input from DDs (AFAIK? Am I wrong?). Given https://browse.dgit.debian.org/ exists and has tons of stuff already,
and this proposal for tag2upload doesn't exist yet, I gather that dgit
is already a thing that is used independently of tag2upload?
Correct. I've been using dgit to upload my packages for years now. The packages are all maintained on Salsa. dgit push-source (the command to upload a package) both pushes to the dgit-repos server and runs dput,
along with a few other things including source package construction and signing.
(dgit, the command-line tool, is independent of the tag2upload server.
You do not have to use dgit to use tag2upload.)
So I don't think this analogy works. One couldn't say "let's remove archive.debian.org, just push to snapshot.debian.org", but one could say "let's remove salsa.debian.org, just push to dgit.debian.org".
In what sense could one say that? What do you think pushing to dgit.debian.org would do? I think you have some confusion here about what the dgit-repos server is for and what it does, but I'm having a hard time figuring out the exact source of the confusion.
If you're saying that if one doesn't care about making work in progress available, doesn't want pull requests, doesn't want multiple people to be able to work on a package together, and doesn't want CI, but only and exclusively wants a Git server to archive a record of what Git trees were uploaded as source packages, one could use only the dgit-repos server and
not Salsa, then yes, that's true.
"My security recommendation in this case is therefore to centralize
the risk as much as possible, moving it off of individual uploader
systems with unknown security profiles and onto a central system that
can be analyzed and iteratively improved."
So I don't think this is a good argument. One system is better than
two. And we need to secure all of it anyway, as Salsa is a component
of the solution anyway.
Yes, that's the argument - all Salsa features are bad and "bloat":
issues are bad, teams are bad, CIs are bad, merge requests are bad, the
only thing needed is to push&pull to some git backend, everything else
is bad and unneeded.
On 17258 March 1977, Luca Boccassi wrote:
"My security recommendation in this case is therefore to centralize
the risk as much as possible, moving it off of individual uploader
systems with unknown security profiles and onto a central system that
can be analyzed and iteratively improved."
So I don't think this is a good argument. One system is better than
two. And we need to secure all of it anyway, as Salsa is a component
of the solution anyway.
Nah. Without having looked through the dgit source - having a system
beside salsa do this for Debian is much preferable.
The gitlab for salsa is
a.) forcing us to follow a way that does *not* fit how Debian works for
uploads
b.) a codebase so much larger and made out of so many more components
than all of this proposals code combined together, it will be *worse*.
I mean, look at the security history of Gitlab. Sure, they are fast in
fixing. But they are *constantly* fixing things up with "critical
release, apply ASAP".
On Wed, 12 Jun 2024 at 15:20, Jonas Smedegaard <[email protected]> wrote:
Quoting Luca Boccassi (2024-06-12 15:27:36)
On Wed, 12 Jun 2024 at 14:15, Jonas Smedegaard <[email protected]> wrote:
You apparently find it equally sensible, specifically as a security measure, a) apply ACLs on an otherwise massively multi-user-write-access
host and b) use a separate far-less-featured host.
You claim that both setups have equal vulnerabilities.
No, I claim they have different sets of vulnerabilities, disadvantages and advantages, and that both can provide the required feature:
disallow force pushes/deleting tags. The hardest thing with security
is that it requires a constant, ongoing effort, that will never end,
and will only get harder. A widely used software like Gitlab is better for this, as is a widely used kernel like Linux. Or are you suggesting such a server should run on Hurd, given it's far-less-featured and
thus has a much smaller attack surface than Linux?
No, I am not suggesting the use of the Hurd here, and I am having a hard time assuming good faith with the potential undertones of that question.
To answer your convoluted question, I am suggesting that Salsa and tag2upload have very different needs (multi-user write versus multi-user append-only, drastically simplified), and consequently not to argue that reuse of Salsa for hosting tag2upload is a security benefit.
The argument is about attack surface, number of features, size of code
base, auditability, etc. If you make that argument about the git stack running on a server, then the same argument applies for every other
component in the same server that interact in any way with the
payload(s) - kernel, libc, compilers, etc. Otherwise you are just cherrypicking what is convenient, and ignoring what is not. If Gitlab
can't be used in a security-relevant component because it's too big to
audit, then so are the Linux kernel and GCC.
My argument is that having a single system is beneficial for
maintenance costs (fewer platforms, fewer moving parts), for security (components in widespread usage with heavy commercial backing spending
the big $$$$ to ensure it's not completely borken), and for
rationalizing and avoiding duplication.
And we also remove the Debian Maintainer role as dak would no longer
know who uploaded the package? Debian is larger than only Debian
Developers.
This is a policy aspect. When we need to revoke a key used for uploading
this happens via keyring maintainers as far as I understand, but in
urgent cases it is ftp master who can also deny upload rights more
quickly than via a keyring update. In moving to tag2upload as a service
external to ftp, we partially move this capability from ftp master to
the entity running tag2upload (DSA afaiui). Is there a sensible way to
leave this policy aspect with the ftp team when using the tag2upload
service? In effect, I'm asking whether ftp could somehow provide an
authorization oracle to be used by tag2upload.
tag2upload leaves this policy with the ftpmaster team. It uses the
archive keyrings and dak's list of Debian Maintainers.
So there is no change here.
I understand the proposal doesn't directly say "oh yeah, we're actually thinking we should ditch salsa and replace it with all those nice little small components", but it is certainly taking a stand that Salsa is not
good enough to provide the level of security that is required to upload packages in Debian, and saying that is saying a lot because I suspect we
are *actually* trusting Salsa and GitLab with our code much more than we would like to admit...
Whatever end goals some individuals may have is *NOT* a good base to
decide on how a technical implementation for Debian should be.
If it turns out that this new thingie makes Salsa entirely unnecessary,
then so be it. Good for us.
I highly doubt this will happen. The dgit stuff only implements a small
subset of features. The BTS does *not* provide what Salsa issues do.
There isn't anything even near to do what MRs do. CI integration? Even
less so. No idea why anyone should fear this dgit thing will lead to
Salsa getting turned off, at this point.
But really, if we end up getting something that makes an installation of
gitlab unnecessary, then yay, party. It is not something to be feared.
So in other words, I am 100% right to worry about this being the thin
end of the wedge that some will use to try and kill Salsa. Sounds like
below NoTA if it goes to GR for anybody who, like me, doesn't want
Salsa to be jeopardised, then.
Quoting Luca Boccassi (2024-06-12 22:00:04)
On Wed, 12 Jun 2024 at 15:20, Jonas Smedegaard <[email protected]> wrote:
Quoting Luca Boccassi (2024-06-12 15:27:36)
On Wed, 12 Jun 2024 at 14:15, Jonas Smedegaard <[email protected]> wrote:
You apparently find it equally sensible, specifically as a security measure, a) apply ACLs on an otherwise massively multi-user-write-access
host and b) use a separate far-less-featured host.
You claim that both setups have equal vulnerabilities.
No, I claim they have different sets of vulnerabilities, disadvantages and advantages, and that both can provide the required feature: disallow force pushes/deleting tags. The hardest thing with security
is that it requires a constant, ongoing effort, that will never end, and will only get harder. A widely used software like Gitlab is better for this, as is a widely used kernel like Linux. Or are you suggesting such a server should run on Hurd, given it's far-less-featured and
thus has a much smaller attack surface than Linux?
No, I am not suggesting the use of the Hurd here, and I am having a hard time assuming good faith with the potential undertones of that question.
To answer your convoluted question, I am suggesting that Salsa and tag2upload have very different needs (multi-user write versus multi-user append-only, drastically simplified), and consequently not to argue that reuse of Salsa for hosting tag2upload is a security benefit.
The argument is about attack surface, number of features, size of code base, auditability, etc. If you make that argument about the git stack running on a server, then the same argument applies for every other component in the same server that interact in any way with the
payload(s) - kernel, libc, compilers, etc. Otherwise you are just cherrypicking what is convenient, and ignoring what is not. If Gitlab
can't be used in a security-relevant component because it's too big to audit, then so are the Linux kernel and GCC.
My point above, reframed to your new context, is that regardless of how overwhelmingly large the attack surface of GCC+linux is, the attack
surface of GCC+linux+Gitlab is much larger, while that of GCC+linux+tag2upload is little larger.
My argument is that having a single system is beneficial for
maintenance costs (fewer platforms, fewer moving parts), for security (components in widespread usage with heavy commercial backing spending
the big $$$$ to ensure it's not completely borken), and for
rationalizing and avoiding duplication.
Ok, if your argument is no longer that "the same setup can be achieved
with repository ACLs, and it would have the same vulnerability" but that security+economy+maintenance combined makes your previous security-only
point less relevant to discuss, then I have nothing sensible to
contribute to this new path of yours.
On 17258 March 1977, Ian Jackson wrote:
And we also remove the Debian Maintainer role as dak would no longer
know who uploaded the package? Debian is larger than only Debian
Developers.
This is a policy aspect. When we need to revoke a key used for uploading
this happens via keyring maintainers as far as I understand, but in
urgent cases it is ftp master who can also deny upload rights more
quickly than via a keyring update. In moving to tag2upload as a service
external to ftp, we partially move this capability from ftp master to
the entity running tag2upload (DSA afaiui). Is there a sensible way to
leave this policy aspect with the ftp team when using the tag2upload
service? In effect, I'm asking whether ftp could somehow provide an
authorization oracle to be used by tag2upload.
tag2upload leaves this policy with the ftpmaster team. It uses the
archive keyrings and dak's list of Debian Maintainers.
So there is no change here.
Actually, we can set acls on fingerprints and then that key won't be able
to upload anymore. That is not something recorded in the keyrings or the
DM list. Obviously that is not something used often (really really
seldom), it is more for "this key is compromised badly, please turn off
anything with it *NOW*" situations, which is what Helmut meant with the
urgent cases.
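To make the three layers mentioned in this exchange concrete (keyring membership, the Debian Maintainer list, and an emergency per-fingerprint block), here is a minimal Python sketch of how such a check could be ordered. It is not dak's code or data model: `UploadPolicy`, its field names, and the assumption that DM rights are tracked per package are all illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class UploadPolicy:
    """Hypothetical upload authorization model (not dak's actual schema)."""

    dd_keyring: set = field(default_factory=set)     # fingerprints of Debian Developers
    dm_packages: dict = field(default_factory=dict)  # DM fingerprint -> packages allowed
    blocked: set = field(default_factory=set)        # emergency per-fingerprint blocks

    def may_upload(self, fingerprint, package):
        # The emergency block is checked first, so it overrides the keyrings
        # without waiting for a keyring update to be pushed.
        if fingerprint in self.blocked:
            return False
        if fingerprint in self.dd_keyring:
            return True
        return package in self.dm_packages.get(fingerprint, set())
```

In this model the concern raised above reduces to one question: whichever component makes the upload decision has to consult the same `blocked` set, otherwise the emergency switch only covers one of the two upload paths.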
On 17258 March 1977, Luca Boccassi wrote:
Whatever end goals some individuals may have is *NOT* a good base to
decide on how a technical implementation for Debian should be.
If it turns out that this new thingie makes Salsa entirely unnecessary,
then so be it. Good for us.
I highly doubt this will happen. The dgit stuff only implements a small
subset of features. The BTS does *not* provide what Salsa issues do.
There isn't anything even near to do what MRs do. CI integration? Even
less so. No idea why anyone should fear this dgit thing will lead to
Salsa getting turned off, at this point.
But really, if we end up getting something that makes an installation of
gitlab unnecessary, then yay, party. It is not something to be feared.
So in other words, I am 100% right to worry about this being the thin
end of the wedge that some will use to try and kill Salsa. Sounds like below NoTA if it goes to GR for anybody who, like me, doesn't want
Salsa to be jeopardised, then.
WTF is up with you? Honest question. I just explained, in a load of
words, that this thing is *really* unlikely to provide whatever it needs
to replace Salsa, as there is basically nothing actually providing the features Salsa provides. And your conclusion is more "trying to kill
Salsa"?
If it turns out that this new thingie makes Salsa entirely unnecessary, then so be it. Good for us.
But really, if we end up getting something that makes an installation of gitlab unnecessary, then yay, party. It is not something to be feared.
THE DESIGN & IMPLEMENTATION ARE LATE-STAGE
We wish to be clear that tag2upload can be deployed without *any*
code changes to dak. It just needs to be given a suitably trusted
key, very similar to how buildds have trusted keys.
And we also remove the Debian Maintainer role as dak would no longer
know who uploaded the package? Debian is larger than only Debian
Developers.
Should this GR pass, then the tag2upload project will be unstuck, and
could be deployed in a matter of months, and the source-only uploads of
as many of us who want it can become just 'git debpush' and done,
without any other workflow changes or learning.
If only one could use regular git instead of a custom, non-standard VCS
built on top of Git that makes some workflows impossible and team
maintenance harder by not supporting publishing intermediate work. :-(
On Wed, 12 Jun 2024 at 22:26, Jonas Smedegaard <[email protected]> wrote:
My point above, reframed to your new context, is that regardless of
how overwhelmingly large the attack surface of GCC+linux is, the
attack surface of GCC+linux+Gitlab is much larger, while that of GCC+linux+tag2upload is little larger.
The attack surface of GCC+linux+tag2upload is orders of magnitude
larger than that of TCC+hurd+tag2upload. Are you going to advocate for
that switch to happen? If not, why? Why do you think it's worth to
deprecate Salsa because of its much larger attack surface, but it's
not worth deprecating Linux and GCC for their demonstrably much larger
attack surfaces? Could it be, maybe, perhaps, that a superficial
comparison of perceived attack surfaces alone is not really a good
metric to make a decision?
On Wed, 12 Jun 2024 at 22:35, Joerg Jaspert <[email protected]> wrote:
On 17258 March 1977, Luca Boccassi wrote:
Whatever end goals some individuals may have is *NOT* a good base to
decide on how a technical implementation for Debian should be.
If it turns out that this new thingie makes Salsa entirely unnecessary,
then so be it. Good for us.
I highly doubt this will happen. The dgit stuff only implements a small
subset of features. The BTS does *not* provide what Salsa issues do.
There isn't anything even near to do what MRs do. CI integration? Even
less so. No idea why anyone should fear this dgit thing will lead to
Salsa getting turned off, at this point.
But really, if we end up getting something that makes an installation of
gitlab unnecessary, then yay, party. It is not something to be feared.
So in other words, I am 100% right to worry about this being the thin
end of the wedge that some will use to try and kill Salsa. Sounds like below NoTA if it goes to GR for anybody who, like me, doesn't want
Salsa to be jeopardised, then.
WTF is up with you? Honest question. I just explained, in a load of
words, that this thing is *really* unlikely to provide whatever it needs
to replace Salsa, as there is basically nothing actually providing the features Salsa provides. And your conclusion is more "trying to kill Salsa"?
You _literally_ just wrote:
If it turns out that this new thingie makes Salsa entirely unnecessary, then so be it. Good for us.
But really, if we end up getting something that makes an installation of gitlab unnecessary, then yay, party. It is not something to be feared.
Not even an hour ago. How can you expect someone to reach any _other_ conclusion? WTF right back at you.
So there is no change here.
Actually, we can set acls on fingerprints and then that key won't be able
to upload anymore. That is not something recorded in the keyrings or the
DM list. Obviously that is not something used often (really really
seldom), it is more for "this key is compromised badly, please turn off
anything with it *NOW*" situations, which is what Helmut meant with the
urgent cases.
Could you say more specifically how seldom, and also how long it usually
takes between you flicking the emergency switch, and the keyring team
pushing an update?
On Wed, 12 Jun 2024 at 19:24, Russ Allbery <[email protected]> wrote:
"Adam D. Barratt" <[email protected]> writes:
On Wed, 2024-06-12 at 10:43 -0700, Russ Allbery wrote:
There was more confusion about this point than I had anticipated, so I
want to emphasize that the dgit-repos server is not a forge, is not a
competitor to Salsa, doesn't replace Salsa in any way, and is not
something that people interact with the way that they interact with
Salsa. It's much closer to a Git equivalent of archive.debian.org: a
persistent historical record accessible via the Git protocol and (as I
discovered during this thread) a cgit web interface.
In that sense, it's more like snapshot.debian.org, I think?
Yes, apologies, that's a much better analogy.
But you don't push to snapshot, it's just a backup method, it doesn't
take any input from DDs (AFAIK? Am I wrong?). Given
https://browse.dgit.debian.org/ exists and has tons of stuff already,
and this proposal for tag2upload doesn't exist yet, I gather that dgit
is already a thing that is used independently of tag2upload? I mean,
that's how it was explained to me yesterday anyway.
So I don't think this analogy works. One couldn't say "let's remove
archive.debian.org, just push to snapshot.debian.org", but one could
say "let's remove salsa.debian.org, just push to dgit.debian.org".
Hello Antoine,
Thank you for your interest.
I think I should say right away that tag2upload != dgit.
With tag2upload, you will be able to replace 'dpkg-buildpackage -S' and 'dput' with just 'git debpush'. Your other gbp usage is unchanged.
On Wed 12 Jun 2024 at 11:08am -04, Antoine Beaupré wrote:
I understand the proposal doesn't directly say "oh yeah, we're actually
thinking we should ditch salsa and replace it with all those nice little
small components", but it is certainly taking a stand that Salsa is not
good enough to provide the level of security that is required to upload
packages in Debian, and saying that is saying a lot because I suspect we
are *actually* trusting Salsa and GitLab with our code much more than we
would like to admit...
I don't think we are taking a stand that salsa is not good enough to
provide any particular form of security.
In fact, I don't think that tag2upload changes the extent to which we
trust salsa: we would not be trusting it any more nor any less. Perhaps
you could take another look at the design.
(In the background: I very much share your view that we are actually
trusting salsa far more than we generally think we are.)
Yes, that's the argument - all Salsa features are bad and "bloat":
issues are bad, teams are bad, CIs are bad, merge requests are bad,
the only thing needed is to push&pull to some git backend, everything
else is bad and unneeded.
[email protected] wrote:
As I understand it, Debian was affected by the xz-utils hack, in part,
because some artifacts were inserted into an upstream tarball that were not
represented in the upstream git. Please explain how use of tag2upload is
relevant to this scenario? I'm afraid I don't follow.
I think that it was assumed, and I agree, that a well-maintained Debian
git source tree has the upstream branch pulled from the upstream git
repository, keeping the complete history, and not created locally by
importing upstream tar release archives.
Bastian Blank <[email protected]> writes:
If we need a design, then we can easily avoid the problem points. There
is a working counter proposal open:
https://bblank.thinkmo.de/introducing-uploads-debian-git.html
It requires a sufficiently reproducible build for source packages.
Right now it is only known to work with the special 3.0 (gitarchive)
source format, but even that requires the latest version of this
format. No idea if it is possible to use others, like 3.0 (quilt) for
this purpose.
This sounds like a major blocker to me. tag2upload works with the
existing representations of Debian packages in Git and with the existing supported source package formats.
This is largely in the eye of the beholder as there's no strict
definition that I am aware of, so one could or could not include
these, but I do note that what you describe above is not really that different from Alioth - that also didn't have merge requests or CIs,
and we didn't really use the rudimentary ticket system (IIRC it did
have one? Might be wrong). If Alioth was a forge, and I think it was,
then this alternative system also sounds like a forge to me.
One thing I really dislike is having a single gpg key to upload them all. I very much preferred the design that Didier explained during Debconf Kosovo, where the .changes signature is uploaded together with the tagged commit.
Your thoughts?
Cheers,
Thomas Goirand (zigo)
P.S: The thread is huge, I have no time to read it all, sorry if someone else also raised the same concern.
Now, again: tag2upload/dgit is not in this category. Not even a little
nor close to.
And then something that didn't appear yet: Has anyone asked the Salsa
admins if they even would like tag2upload? Tell you what, the answer is
*no*. This does *NOT* belong on Salsa. This should *not* end up on
Salsa, and we will fight any such move. This is good to go on a
different host and stay separate. Different people and different
machine. It is an addition, probably a useful one, but nothing to
co-exist on the existing forge.
I am mentioning this because I see quite a bit of friction in this
regard. Some people see your tag2upload proposal as a step to diminish Salsa's place in Debian and probably even have it fully replaced in
the future.
thanks to all for this GR. I like tag2upload in principle. The only
thing I'm a bit scared about is that it simplifies uploading something
that was never built before on the local machine. Sure, this can be
done with source-only uploads as well, but tag2upload makes it even
easier.
Maybe I missed it in the long thread and I need to admit that I have
not read the docs, thus the explicit question here: Is the package
undergoing some CI test (maybe not only building but also autopkgtest
which I'm doing locally for any package I'm uploading) before it is
forwarded to dak?
On Wed, 12 Jun 2024 at 15:20:45 +0100, Ian Jackson wrote:
tag2upload, like dgit, ensures and insists that the git tree you are
uploading corresponds precisely [1] to the generated source package.
If you base your Debian git maintainer branch on the upstream git (as
you should) and there is a discrepancy between the contents of the
upstream git branch, and the .orig.tar.gz you're using, the upload
will fail.
Is your position here that if your upstream releases source tarballs
that intentionally differ from what's in git (notably this is true
for Autotools `make dist`), then any Good™ maintainer must generate
their own .orig.tar.* from upstream git and use those in the upload, disregarding upstream's source tarball entirely?
That approach has many advantages, but it flatly contradicts what devref claims a Good™ maintainer would do, which is to always use the pristine source tarball as released by upstream (unless it's non-free) - which
implies that if they're using dgit, then the upstream tree must match
an import of the tarball.
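The check being debated here (does the upstream git tree match the contents of the .orig tarball?) can be illustrated with a small Python sketch. It is not how dgit or tag2upload implement the comparison: `tarball_manifest`, `tree_manifest` and `discrepancies` are hypothetical helpers, the git tree is stood in for by a plain `{path: bytes}` mapping, and file modes and symlinks are ignored.

```python
import hashlib
import io
import tarfile


def tarball_manifest(data, strip_components=1):
    """Map each regular file in a tarball (given as bytes) to its SHA-256."""
    manifest = {}
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Drop the leading "pkg-1.0/" style directory, as unpacking would.
            path = "/".join(member.name.split("/")[strip_components:])
            manifest[path] = hashlib.sha256(tar.extractfile(member).read()).hexdigest()
    return manifest


def tree_manifest(files):
    """Same mapping for a tree given as {path: bytes} (a stand-in for a git tree)."""
    return {path: hashlib.sha256(blob).hexdigest() for path, blob in files.items()}


def discrepancies(tree, tarball):
    """Return sorted paths that are missing on one side or differ in content."""
    return sorted(p for p in tree.keys() | tarball.keys() if tree.get(p) != tarball.get(p))
```

With an Autotools `make dist` tarball, generated files such as `configure` would show up in the result because they exist in the tarball but not in git, which is exactly the situation the question above is about.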
Hi,
On Wed, 2024-06-12 at 08:59 +0200, Holger Levsen wrote:
Am Tue, Jun 11, 2024 at 10:27:56PM -0700 schrieb Russ Allbery:
As I said several times before: the implementation has known security
bugs (unless you fixed them). But I guess this is going to get ignored
again anyway...
Could you describe what known security vulnerabilities you believe
exist,
does it matter if this GR is about a design? currently the RFC is not
to vote about an implementation... :/
As far as I understand, the GR is about pushing the design and
implementation as is, without any changes. It very explicitly says so.
On 17258 March 1977, Sean Whitton wrote:
So there is no change here.
Actually, we can set acls on fingerprints and then that key won't be able
to upload anymore. That is not something recorded in the keyrings or the
DM list. Obviously that is not something used often (really really
seldom), it is more for "this key is compromised badly, please turn off
anything with it *NOW*" situations, which is what Helmut meant with the
urgent cases.
Could you say more specifically how seldom, and also how long it usually
takes between you flicking the emergency switch, and the keyring team
pushing an update?
*Really* seldom. I would have to dig and see when, especially for the
timing thing with keyring team.
Considering that tag2upload is supposed to become a critical
=====
BEGIN FORMAL RESOLUTION TEXT
tag2upload allows DDs and DMs to upload simply by using the
git-debpush(1) script to push a signed git tag.
1. tag2upload, in the form designed and implemented by Sean Whitton and
Ian Jackson, and reviewed by Jonathan McDowell and Russ Allbery,
should be deployed to official Debian infrastructure.
To answer your convoluted question, I am suggesting that Salsa and
tag2upload have very different needs (multi-user write versus multi-user append-only, drastically simplified), and consequently not to argue that reuse of Salsa for hosting tag2upload is a security benefit.
On Wed, 12 Jun 2024 at 12:03, Jonas Smedegaard <[email protected]> wrote:
Quoting Luca Boccassi (2024-06-12 12:28:21)
On Wed, 12 Jun 2024 at 09:35, Jonas Smedegaard <[email protected]> wrote:
Quoting Luca Boccassi (2024-06-12 10:21:40)
On Wed, 12 Jun 2024 at 02:31, Russ Allbery <[email protected]> wrote:
Luca Boccassi <[email protected]> writes:
And on the implementation details, I really do not like the idea of
having a competing git forge with Salsa. This dgit server seems to just
be a ye olde git-web interface.
Does it support gitweb? I thought it only supported regular Git
operations, but I could be mistaken.
I might be wrong, but this is what this looks like to me (it was
linked to me on IRC yesterday, wasn't aware of it before):
https://browse.dgit.debian.org/
If this goes forward, in my opinion it should exclusively use Salsa
as the git server, to avoid duplicating infrastructure.
I think you want the Git archive to be entirely separate from Salsa
so that it's a reliable source of tracing information. You don't
want to support force pushes, for example; the whole point is that it
should be append-only, which would be a controversial choice for
Salsa but which is fine for the archives of the uploaded packages. I
would also want a much smaller attack surface for that type of record
than GitLab. GitLab is designed as a place to do interactive
work, not to keep a reliable permanent record.
The git repositories, sure. The git forge? I don't see why. You can
have these repositories in a separate namespace, which sets strong
branch and tag protection rules to achieve what you describe. As far
as I am aware, this is possible to do in Salsa already, it doesn't
have to be a per-forge rule, it can be per-namespace, I think this is
possible to achieve in Gitlab. I have not used tag protection rules
(on gitlab, I used them on github though), but I do regularly use
branch protection rules on my Salsa repositories.
To be clear, I am exclusively talking about the git forge, as in
salsa.debian.org, not the git repositories as they might exist on
Salsa under the debian/ namespace or any other namespace.
Having a separate namespace with strong ACLs seems exactly what you
want, even if it duplicates the individual repositories (the backend
git store deduplicates it anyway, so in practice it should be quite
cheap). Having an entire separate git forge that competes with Salsa
seems orthogonal to this, and counterproductive for the project.
I fail to recognize how strong ACLs achieve exactly the same as separate
storage on a separate host. Especially when the purpose is to minimize
attack vectors.
As per the security review just shared, admin access to Salsa allows
to push commits anyway which would get uploaded just the same, and
again as per security review, this case benefits from centralizing:
one host to maintain, and one set of admins to trust, is better than
two. Especially as Salsa is Gitlab, which is maintained upstream and
benefits from the many-eyes-and-many-users situation, while a
completely custom local git forge reimplementation, other than
inevitably suffering from bitrot at some point in the future, like all
custom infrastructure, will have the disadvantage that nobody else
uses it. This is the reason Alioth is gone, and it's a very good
reason.
So your argument is that strong ACLs achieve exactly the same as
separate storage on a separate host, because separate storage on a
separate host inevitably leads to bitrot and lack of eyeballs.
I rest my case.
No, my argument is that append-only can (as far as I can tell) be
achieved on Salsa too, it doesn't seem to necessitate a bespoke forge.
The centralizing argument is not mine, it's from the security review
that was published this morning:
"My security recommendation in this case is therefore to centralize
the risk as much as possible, moving it off of individual uploader
systems with unknown security profiles and onto a central system that
can be analyzed and iteratively improved."
https://lists.debian.org/debian-vote/2024/06/msg00004.html
That Git archive is not parallel to or competitive with Salsa and doesn't
provide most of the functionality that Salsa does. It has a different
purpose.
I disagree strongly. As we have seen in the recent Salsa thread on
d-private, there are a few but very strongly opinionated people who
are vehemently against Salsa and would like to see it gone. Having a
parallel and competing git forge I fear would give them very strong
ammunition to do so: "if the real uploads and the real repositories
are on a separate and independent git forge, why have Salsa at all?
Get rid of it and use the other forge exclusively."
I don't follow d-private, but sounds to me like that argument goes both
ways - i.e. also "if the real uploads and the real repositories are on
(some specially locked down section of) same git forge, why not embrace
additional features offered from same vendor of said forge?"
I don't follow, we already use features from Salsa? Like the CI
pipeline, which is awesome. ACLs on repositories are not really unique
or particular to Gitlab, modern forges pretty much have to support
them, Github has them too.
Sorry, I cannot possibly get a point across a cloud of awesomeness.
"Having an easy-to-use and working CI is really bad for a software development organization, actually" is... a bold take, no doubt about
that.
But anyway, thanks for proving my point for me: there is a small but
loud minority who would like to kill Salsa, and this proposal as
implemented would help them achieve that goal. If it goes to a GR,
this is enough to make me vote against it, as while the concept is
really nice and I like it a lot, it's not worth jeopardizing Salsa's existence.
thanks to all for this GR. I like tag2upload in principle. The only
thing I'm a bit scared about is that it simplifies uploading something
that was never built before on the local machine. Sure, this can be
done with source-only uploads as well, but tag2upload makes it even
easier.
I don't believe it makes any difference. We already have 'dgit
push-source' which will do a source-only upload with a single command invocation. And if 'dgit push-source' errors out, that's equivalent to tag2upload failing to upload and e-mailing you.
No, there is nothing additional being done.
Now we have salsa CI, though, we have various good options for automated pre-upload testing.
On 2024/06/12 10:21, Luca Boccassi wrote:
Having a separate namespace with strong ACLs seems exactly what you
want, even if it duplicates the individual repositories (the backend
git store deduplicates it anyway, so in practice it should be quite
cheap). Having an entire separate git forge that competes with Salsa
seems orthogonal to this, and counterproductive for the project.
I found the overview of tag2upload from Ian at MDC Cambridge quite
useful (and the workflow diagrams that he presented). From my
understanding (and I may still have the wrong end of a stick here), the additional git store used for tag2upload becomes a replacement for
source packages that happens to use git. So from my understanding, it's
more a competitor to source packages rather than to salsa.
I think it is more accurate to say that they are mirrors. They both contain details of current and historical packages. The difference is that snapshot is downstream of the archive, while these putative tag2upload repositories are upstream.
It is being upstream of the primary archive that makes it far more security sensitive.
[and] Actually, we can set acls on fingerprints and then that key won't be able
to upload anymore. That is not something recorded in the keyrings or the
DM list. Obviously that is not something used often (really really
seldom), it is more for "this key is compromised badly, please turn off
anything with it *NOW*" situations, which is what Helmut meant with the
urgent cases.
*Really* seldom. I would have to dig and see when, especially for the
timing thing with keyring team.
Thanks. Then possibly it is sufficient for ftpmaster just to disable tag2upload's whole key until the keyring update is pushed.
Sean Whitton writes ("Re: [RFC] General Resolution to deploy tag2upload"):
[Joerg Jaspert wrote:]
[and] Actually, we can set acls on fingerprints and then that key won't be able
to upload anymore. That is not something recorded in the keyrings or the
DM list. Obviously that is not something used often (really really
seldom), it is more for "this key is compromised badly, please turn off
anything with it *NOW*" situations, which is what Helmut meant with the
urgent cases.
*Really* seldom. I would have to dig and see when, especially for the
timing thing with keyring team.
Thanks. Then possibly it is sufficient for ftpmaster just to disable tag2upload's whole key until the keyring update is pushed.
I'm not sure this is a sufficient answer. We don't want uploads by
revoked keys to appear on *.dgit.d.o either.
On Wed, Jun 12, 2024 at 04:23:29PM +0100, Simon McVittie wrote:
On Wed, 12 Jun 2024 at 15:20:45 +0100, Ian Jackson wrote:
tag2upload, like dgit, ensures and insists that the git tree you are uploading corresponds precisely [1] to the generated source package.
If you base your Debian git maintainer branch on the upstream git (as
you should) and there is a discrepancy between the contents of the upstream git branch, and the .orig.tar.gz you're using, the upload
will fail.
How would it fail?
This actually means we need to get rid of orig.tar completely.
Something that does not exist can't differ.
(Because git inherently has history, the dgit-repos server can
perform both functions at once.)
Considering that tag2upload is supposed to become a critical
component of our infrastructure, I am missing (or may have
overlooked) some information on how the deployment is going to be maintained.
I assume that you will continue to work on the code itself, but who
is going to be responsible for keeping the tag2upload service
operational? Are you going to manage the deployment as well, has DSA
agreed to do it, or do you have an altogether different arrangement
in mind?
Again, thank you for your work on this!
On 6/13/24 18:57, Ian Jackson wrote:
(Because git inherently has history, the dgit-repos server can
perform both functions at once.)
Do we actually want or need to hoard all the collaboration history?
Correct me if I am wrong, but if we are looking at dgit.d.o as
snapshot and audit log of the tag2upload service, would it not be
beneficial for the auditing and back-tracing process to actually
keep the code that someone tried to upload via tag2upload even if
their key is revoked, expired or signature is invalid?
On Wed 12 Jun 2024 at 11:14am +02, Ansgar 🙀 wrote:
As far as I understand, the GR is about pushing the design and implementation as is, without any changes. It very explicitly says
so.
It does not say this.
Do we actually want or need to hoard all the collaboration history?
Of course: this makes auditing much easier.
Right now my workflow is basically git-buildpackage + salsa + dput, reluctantly using pristine-tar sometimes.
I have *tried* to use dgit, but [...]
1. how does this change my gbp/salsa/dput workflow?
can i *just* s/dput/dgit/?
Can I just keep doing gbp + salsa and switch the "dput" bit to
"dgit" or "tag2upload" without changing anything else? That would be
kind of neat, but I'm not sure *why* I would do that in the first
place...
2. does this scale to the archive?...
==================================
So what's the plan for dealing with the sheer size of the Debian
archive, assuming that eventually everything might reasonably be
expected to be *both* on dgit and salsa, if I understand the proposal correctly?
(Well, technically, the proposal says "this is opt-in, entirely
optional", but Ian at least has explicitly stated he expects people to enthusiastically start to use dgit massively in the future, so even if
that's not actually part of the proposal, we should take that scenario
into account.)
3. what does this mean for salsa/jenkins/bts/etc?
In the long term, what do you actually think we should do about the duplication of tools out there? We are wasting a lot of energy here maintaining two CI systems (Jenkins and GitLab CI), two bug trackers
(BTS and GitLab issues), two wiki systems (MoinMoin and GitLab Wikis),
two (or more?) VCS hosting systems (dgit and GitLab repos)?
I understand the proposal doesn't directly say "oh yeah, we're actually thinking we should ditch salsa and replace it with all those nice little small components", but it is certainly taking a stand that Salsa is not
good enough to provide the level of security that is required to upload packages in Debian, and saying that is saying a lot because I suspect we
are *actually* trusting Salsa and GitLab with our code much more than we would like to admit...
Anyways, I hope I'm not throwing a brick here, I do really have those questions and concerns and I am hoping a GR would pre-emptively answer
them so we have a better idea of what we're actually voting on here,
because I think the proposal, as it stands now, hides a lot of the
unresolved issues and problems we have.
That means some package build process is done before the source
package is forwarded to dak and sends some e-mail back?
I know we have this. My point is that tag2upload users might forget
to use it before using tag2upload service. I simply want to make
sure that tag2upload is not another way to upload anything that does
not build on buildservices.
On Thu, 2024-06-13 at 16:58 +0800, Sean Whitton wrote:
On Wed 12 Jun 2024 at 11:14am +02, Ansgar 🙀 wrote:
As far as I understand, the GR is about pushing the design and
implementation as is, without any changes. It very explicitly says
so.
It does not say this.
Quote:
---
1. tag2upload, in the form designed and implemented by Sean Whitton and
Ian Jackson, and reviewed by Jonathan McDowell and Russ Allbery,
should be deployed to official Debian infrastructure.
---
The statement also reads like the implementation was reviewed by Russ
which as far as I understand isn't the case either? Or do you only plan
to deploy a version once such a review happened?
Andreas Tille writes ("Re: [RFC] General Resolution to deploy tag2upload"):
That means some package build process is done before the source
package is forwarded to dak and sends some e-mail back?
Only a source package build.
Hello,
On Thu 13 Jun 2024 at 01:05pm +02, Ansgar 🙀 wrote:
The statement also reads like the implementation was reviewed by Russ
which as far as I understand isn't the case either? Or do you only plan
to deploy a version once such a review happened?
We weren't planning for this to be done, no.
Do we actually want or need to hoard all the collaboration history?
Of course: this makes auditing much easier.
As far as I understand in the current proposal the trigger is a
webhook running on Salsa after a push - have you considered instead
having the trigger be a stage in the salsa-ci pipeline, that would run
after the previous stages have completed successfully?
Luca Boccassi <[email protected]> [2024-06-13 14:23]:
As far as I understand in the current proposal the trigger is a
webhook running on Salsa after a push - have you considered instead
having the trigger be a stage in the salsa-ci pipeline, that would run
after the previous stages have completed successfully?
I hate that idea. From past experience, the Salsa CI pipeline is
slower and much more flaky than the buildds, so I'm not going to
spend several hours (and retries) per upload waiting to see if the
Salsa CI deemed my upload worthy.
As far as I understand in the current proposal the trigger is a
webhook running on Salsa after a push - have you considered instead
having the trigger be a stage in the salsa-ci pipeline, that would run
after the previous stages have completed successfully? I.e., like we can
do today with aptly or pages publishing, for example. What runs in the pipeline is still under the control of the individual repo
maintainers, but the default would mean having this additional CI
step, which I think is what Andreas is hinting at, but solve it on the
other end of the pipeline - at the beginning, rather than at the end.
Timo Röhling writes ("Re: [RFC] General Resolution to deploy tag2upload"):
Luca Boccassi <[email protected]> [2024-06-13 14:23]:
As far as I understand in the current proposal the trigger is a
webhook running on Salsa after a push - have you considered instead
having the trigger be a stage in the salsa-ci pipeline, that would run
after the previous stages have completed successfully?
I hate that idea. From past experience, the Salsa CI pipeline is
slower and much more flaky than the buildds, so I'm not going to
spend several hours (and retries) per upload waiting to see if the
Salsa CI deemed my upload worthy.
I hope Luca wasn't suggesting that Salsa CI as a blocker ought to be mandatory. Like so many things in this space, some people love what
others hate.
Antoine Beaupré writes ("Re: [RFC] General Resolution to deploy tag2upload"):
Right now my workflow is basically git-buildpackage + salsa + dput,
reluctantly using pristine-tar sometimes.
I have *tried* to use dgit, but [...]
I think maybe I should make a blog post explaining what dgit is, and
isn't. But that's probably rather out-of-scope for this thread.
1. how does this change my gbp/salsa/dput workflow?
can i *just* s/dput/dgit/?
By "this" I'm going to take you to mean "tag2upload".
With tag2upload you don't run dgit.
You replace gbp/salsa/dput with git-debpush. git-debpush will push
your branch to salsa for you, as well as making and pushing the git tag.
The tag2upload service will take care of the rest.
You'll still want to run gbp, etc., as part of your pre-upload
testing, of course.
Can I just keep doing gbp + salsa and switch the "dput" bit to
"dgit" or "tag2upload" without changing anything else? That would be
kind of neat, but I'm not sure *why* I would do that in the first
place...
tag2upload and dgit have many additional safety checks that help avoid mistakes. For example, you can be sure that the git tree you are
about to upload is precisely what ends up in the archive - so you can
rely on git diff and never need to run debdiff on source packages.
It is much harder to accidentally undo an NMU. etc.
2. does this scale to the archive?...
==================================
So what's the plan for dealing with the sheer size of the Debian
archive, assuming that eventually everything might reasonably be
expected to be *both* on dgit and salsa, if I understand the proposal
correctly?
It's true that this is a lot of data. It's going to be comparable in
size to the archive. Scalability is a reasonable concern.
There is one singleton service push.dgit.d.o, which is used only by
uploaders (and the tag2upload robot). So it shouldn't become
overloaded.
Non-uploading clients use {browse,git}.dgit.d.o. Currently that is a
single host, which is also shared with some other services. But it is
a read-only mirror and we could scale up to multiple mirrors.
3. what does this mean for salsa/jenkins/bts/etc?
Nothing.
In the long term, what do you actually think we should do about the
duplication of tools out there? We are wasting a lot of energy here
maintaining two CI systems (Jenkins and GitLab CI), two bug trackers
(BTS and GitLab issues), two wiki systems (MoinMoin and GitLab Wikis),
I don't think I have an opinion about that. (Or at least, maybe I do,
but it's not relevant.)
tag2upload is not a competitor to any of the things you list.
In the long term, tag2upload depends on there being one or more things
that are enough like git forges that they can call webhooks and
serve up git tags. Right now that's Salsa. If Debian wants to
replace gitlab with some other forge that's not something that
tag2upload has much of an opinion about.
Ultimately, *.dgit.d.o is in some sense a competitor to
archive.debian.org, but I don't see us abolishing archive.d.o.
Instead, tag2upload is getting us further towards dual running,
where we accept either source packages or git trees, and publish both.
Anyways, I hope I'm not throwing a brick here, I do really have those
questions and concerns and I am hoping a GR would pre-emptively answer
them so we have a better idea of what we're actually voting on here,
because I think the proposal, as it stands now, hides a lot of the
unresolved issues and problems we have.
Past experience with GRs suggests very strongly that GR proposals
should be short. So I think the background has to be outside the
formal GR - in places like this discussion thread.
On 6/13/24 20:29, Marco d'Itri wrote:
Of course: this makes auditing much easier.
That is a *massive* amount of data though, especially if we're expected
to import the entire upstream git history as well and base the packaging branch on top of an upstream commit.
We will also need to be prepared for removal requests, so there needs to
be a procedure in place for that, people authorized to perform it, and
an audit framework for that.
We could add some mechanisms, like enforcing that merge commits pulling
in a new upstream version will only modify files outside of debian/ in
one subtree, and files inside debian/ in the other, but that conflicts
with workflows that maintain Debian-specific patches as commits instead
of patch files.
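As a rough illustration of the merge-commit rule floated above, the sketch below checks a merge given two lists of changed paths. It rests on an assumed reading of the rule: relative to the packaging parent the merge may only touch files outside debian/, and relative to the upstream parent it may only touch files inside debian/. The function names and the idea of passing in precomputed path lists are hypothetical.

```python
from typing import Iterable, List


def is_packaging_path(path: str) -> bool:
    """True for the debian/ directory itself and anything below it."""
    return path == "debian" or path.startswith("debian/")


def check_upstream_merge(changed_vs_packaging_parent: Iterable[str],
                         changed_vs_upstream_parent: Iterable[str]) -> List[str]:
    """Return a list of rule violations; an empty list means the merge passes."""
    problems = []
    # Compared with the old packaging branch, only upstream files may change.
    for path in changed_vs_packaging_parent:
        if is_packaging_path(path):
            problems.append(f"packaging file changed by upstream merge: {path}")
    # Compared with the new upstream commit, only debian/ may differ.
    for path in changed_vs_upstream_parent:
        if not is_packaging_path(path):
            problems.append(f"upstream file differs from upstream commit: {path}")
    return problems
```

A branch that carries a Debian-specific change to an upstream file as a commit would fail the second check, which is the conflict with patches-as-commits workflows described above.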
We have several 90% solutions of mapping Debian packaging onto git, but
all of these are incomplete and annoying to use because we disagree with
git on what constitutes data, and what constitutes metadata, so the data model does not match reality or requirements, and from a security
standpoint that concerns me more than improved forensics.
I was not, I wasn't suggesting to make this a hard requirement, as
you say that's more complicated. Merely moving the fire-and-forget
webhook as the last stage of the pipeline, as the default
setting/setup/config/whatever. This is not to provide strong
guarantees, but merely an easy default that encourages a QA pass
first. Then maintainers can override the pipeline config and skip
it, if they don't want it for any reason. If it was the default, I
suspect de-facto the majority of uploads would go through it, and
we would gain in quality, on average (exceptions apply, etc etc).
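The "default that maintainers can override" behaviour described here can be sketched as a small decision function. This is only a model of the idea, not Salsa CI configuration: the stage names, the `skip_qa` flag and the function name are assumptions for illustration.

```python
from typing import Mapping


def should_trigger_upload(stage_results: Mapping[str, bool],
                          skip_qa: bool = False) -> bool:
    """Decide whether the final pipeline stage fires the tag2upload trigger.

    stage_results maps a stage name to whether that stage succeeded.
    skip_qa models a maintainer overriding the default pipeline config.
    """
    if skip_qa:
        # Maintainer opted out: behave like the plain fire-and-forget webhook.
        return True
    # Default: only trigger once every earlier stage has passed.
    return all(stage_results.values())
```

By default a failed stage blocks the trigger, and a maintainer who sets `skip_qa=True` gets the current fire-and-forget behaviour back.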
Sean Whitton <[email protected]> wrote on 13/06/2024 at 14:44:57+0200:
Hello,
On Thu 13 Jun 2024 at 01:05pm +02, Ansgar 🙀 wrote:
The statement also reads like the implementation was reviewed by Russ
which as far as I understand isn't the case either? Or do you only plan
to deploy a version once such a review happened?
We weren't planning for this to be done, no.
I'm sorry but I have a problem here.
You stated in your first mail that both rra and noodles audited your
work, and here it seems that audited is potentially a bit more than what
has been done.
Could you elaborate explicitly on what you mean with "audited"?
On 2024-06-13 12:38:36, Ian Jackson wrote:...
Antoine Beaupré writes ("Re: [RFC] General Resolution to deploy tag2upload"):
3. what does this mean for salsa/jenkins/bts/etc?
Nothing.
I don't think I have an opinion about that. (Or at least, maybe I do,
but it's not relevant.)
I do think it's relevant. [...]
You, I suspect, have a bias as well. If you don't state it clearly,
people will (and have, already!) speculate as to what your underlying intentions are. Sean, for example, has clearly stated he likes Salsa and wants it to stick around, which probably will comfort people who worry
about this.
I think if you, in particular, would speak your mind about this, it
could help alleviate some of those concerns, or at least clarify the
scope of concerns people should have. :p
Well, isn't tag2upload part of dgit? Or at least git-debpush, the binary package, seems to be part of the dgit source package here... we're also talking frequently about dgit.debian.org as part of this infrastructure,
so clearly this whole thing is kind of a part of dgit...
I am not sure saying those things are completely separate here is
helpful, it would be more useful to clarify exactly what component we're adopting and what patterns we need to change if we want to adopt
this. For example, does this respect DEP-14? Which parts?
tag2upload and dgit have many additional safety checks that help avoid mistakes. For example, you can be sure that the git tree you are
about to upload is precisely what ends up in the archive - so you can
rely on git diff and never need to run debdiff on source packages.
It is much harder to accidentally undo an NMU. etc.
This brings another question to mind. Right now, I understand that some people use dgit for NMUs, on packages they do not own. Does this
workflow still support the old NMU process where I get a debdiff with an upload someone makes for me, or if someone opts in to this process, for
an NMU, *I*, as a maintainer, now have to figure out dgit? :)
...2. does this scale to the archive?
There is one singleton service push.dgit.d.o, ...
Non-uploading clients use {browse,git}.dgit.d.o. ...
Are those two different hosts with their own replicas of the git repos?
Because then that means we have *three* replicas (push.dgit.d.o, browse.dgit.d.o and salsa.d.o) of those repositories...
Ultimately, *.dgit.d.o is in some sense a competitor to
archive.debian.org, but I don't see us abolishing archive.d.o.
Instead, tag2upload is getting us further towards dual running,
where we accept either source packages or git trees, and publish both.
hmmm... maybe I'm missing something, but archive.d.o also has binary packages, dgit.d.o doesn't do that, does it? Or are you only referring to
the source packages part?
Thanks. Then possibly it is sufficient for ftpmaster just to disable
tag2upload's whole key until the keyring update is pushed.
I'm not sure this is a sufficient answer. We don't want uploads by
revoked keys to appear on *.dgit.d.o either.
Joerg, is there some way that this fingerprint block information could
be made available in a more timely manner? Ideally we would update push.dgit.d.o to use this information, regardless of tag2upload.
(And the t2u conversion system should use it too.)
I think maybe we should take this to a different venue, than this
thread on -vote. How about a bug against ftp.d.o and/or
dgit-infrastructure ?
I think it is possible that there will be a handful of packages where
things are significantly more awkward, which might not be able to
adopt tag2upload.
On 17259 March 1977, Ian Jackson wrote:
Thanks. Then possibly it is sufficient for ftpmaster just to disable
tag2upload's whole key until the keyring update is pushed.
I'm not sure this is a sufficient answer. We don't want uploads by
revoked keys to appear on *.dgit.d.o either.
Joerg, is there some way that this fingerprint block information could
be made available in a more timely manner? Ideally we would update
push.dgit.d.o to use this information, regardless of tag2upload.
(And the t2u conversion system should use it too.)
I think maybe we should take this to a different venue, than this
thread on -vote. How about a bug against ftp.d.o and/or
dgit-infrastructure ?
I think this is a minor issue, actually. It does not happen often. For
the time it will, we can have something like "ftpmaster pushes a list of
fingerprints via $mechanism" (ssh forced command is widely used for
similar things, for example).
That's really simple to implement.
I agree that this isn't a major design issue, but I think it is
something that needs to be addressed before deployment of
tag2upload. The need is certainly rare, but when it's needed, it's
needed because it's important.
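The "ftpmaster pushes a list of fingerprints" idea could be consumed on the receiving side along the following lines. This is a sketch under stated assumptions, not how dak or push.dgit.d.o work: the one-fingerprint-per-line file format with `#` comments, and the names `load_blocklist` and `may_upload`, are invented for illustration.

```python
from typing import Iterable, Set


def normalize_fingerprint(fpr: str) -> str:
    """Drop whitespace and uppercase so differently formatted fingerprints compare equal."""
    return "".join(fpr.split()).upper()


def load_blocklist(lines: Iterable[str]) -> Set[str]:
    """Parse the assumed format: one fingerprint per line, '#' starts a comment."""
    blocked = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            blocked.add(normalize_fingerprint(entry))
    return blocked


def may_upload(signer_fpr: str, blocked: Set[str]) -> bool:
    """Reject any tag or upload signed by a key on the emergency blocklist."""
    return normalize_fingerprint(signer_fpr) not in blocked
```

Both the tag2upload conversion step and the git server could apply the same `may_upload` check before accepting a signed tag, so that a blocked key is refused in both places.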
We might get additional insights after a breach, perhaps, if Github
decide to take a compromised repository offline and our copy is still accessible.
We have several 90% solutions of mapping Debian packaging onto git, but
all of these are incomplete and annoying to use because we disagree with
git on what constitutes data, and what constitutes metadata, so the data model does not match reality or requirements, and from a security
standpoint that concerns me more than improved forensics.
On June 13, 2024 3:02:48 PM UTC, Joerg Jaspert <[email protected]> wrote:
I think this is a minor issue, actually. It does not happen often. For
the time it will, we can have something like "ftpmaster pushes a list of
fingerprints via $mechanism" (ssh forced command is widely used for
similar things, for example).
That's really simple to implement.
I agree that this isn't a major design issue, but I think it is something that needs to be addressed before deployment of tag2upload. The need is certainly rare, but when it's needed, it's needed because it's important.
It also suggests to me that it's premature to freeze and mandate the current design via GR.
One thing I really dislike, is having a single gpg key to upload them
all. I very much preferred the design that Didier explained during
Debconf Kosovo, where the .changes signature is uploaded together with
the tagged commit.
Actually, we can set acls on fingerprints and then that key wont be able
to upload anymore. That is not something recorded in the keyrings or the
DM list. Obviously that is not something used often (really really
seldom), it is more for "this key is compromised badly, please turn off anything with it *NOW*" situations, which is what Helmut meant with the urgent cases.
Could you say more specifically how seldom, and also how long it usually takes between you flicking the emergency switch, and the keyring team
pushing an update?
From a user perspective some intermediate binary build wouldn't be more difficult, though.
Andreas Tille writes ("Re: [RFC] General Resolution to deploy tag2upload"):
That means some package build process is done before the source
package is forwarded to dak and sends some e-mail back?
Only a source package build.
I know we have this. My point is that tag2upload users might forget
to use it before using tag2upload service. I simply want to make
sure that tag2upload is not another way to upload anything that does
not build on buildservices.
I'm afraid that tag2upload is precisely another way to do that.
That's because that's how uploading works now, and tag2upload is
another way to make an upload. Uploads must be source-only nowadays
(in most cases). So there is, by design, nothing in the existing
setup that ensures that a maintainer built binaries.
(I get the
feeling that you're not happy with this situation, but that's how
Debian is now, and I think it's a jolly good thing.)
You might argue that tag2upload makes this worse because it makes it
easier to perform uploads. It certainly *does* make it easier to
perform uploads. That's a big part of the point.
I think this can only be a *downside* if you think it is a good
thing that uploading is difficult.
From a user perspective some intermediate binary build wouldn't be more difficult, though. I think we could make things more safe by the
expense of extra power consumption and for large packages (which could
On Thu, 13 Jun 2024 at 15:08:15 +0100, Ian Jackson wrote:
I think it is possible that there will be a handful of packages where
things are significantly more awkward, which might not be able to
adopt tag2upload.
This would presumably be the same minority of packages where maintainers
use a debian/-only workflow (even if they normally prefer to keep upstream source in git) and avoid dgit (even if they normally prefer to use it), because the upstream source is too bulky to be convenient to track in git? Such as the openarena-data family and other large game assets?
Those packages are already exceptional and already need to be handled specially. They'd only be a problem if dgit and/or tag2upload became mandatory, which (as far as I understand it) is not the plan.
We have several 90% solutions of mapping Debian packaging onto git, but
all of these are incomplete and annoying to use because we disagree with
git on what constitutes data, and what constitutes metadata, so the data
model does not match reality or requirements, and from a security
standpoint that concerns me more than improved forensics.
This is why people are working on incremental improvements. I think such improvements are more likely to get us closer to where we want to be than
a boil-the-ocean approach that attempts wholesale change to how Debian
works. It's easy to come up with new designs that in theory would be more coherent and straightforward, and very hard in practice to avoid that
turning into <https://xkcd.com/927/>.
It would be nice to not do this on the tag2upload server, though, to
maintain some security separation.
Given all of that, I think it would be more promising to look into a
deeper integration with Salsa to check if the Salsa CI has succeeded, as discussed earlier in this thread. That would also match common upstream practice in Git-first development where the workflow for generating the release artifact depends on all of the tests passing through the normal CI mechanism.
On 6/14/24 00:50, Russ Allbery wrote:
This is why people are working on incremental improvements. I think
such improvements are more likely to get us closer to where we want to
be than a boil-the-ocean approach that attempts wholesale change to how
Debian works. It's easy to come up with new designs that in theory
would be more coherent and straightforward, and very hard in practice
to avoid that turning into <https://xkcd.com/927/>.
The reason we have multiple git workflows is because they are
incremental designs that do not try to change the way Debian works, or
the way git works.
At the very least, we need to make it explicit which repository layout
is to be used, and version and document that interface, then support it
for several years in the future even as we make incremental changes,
because we want to be able to regenerate packages from the git archive.
One _incremental_ change I'd like to see would be archive support for .orig.bundle.* (containing a shallow copy of the upstream commit) and .debian.bundle.* (containing the differences between the upstream
commit and the package), which would be an absolute game changer for
git integration, the archive side would probably be fairly simple to implement, and it would allow us to ship the "preferred form for modification" for a lot of projects more easily.
Mirrors would still get a size-minimal representation, this format
does not impose a particular workflow and can be easily generated from
and validated against the full tree.
As I understand, the proper way to resolve disagreement over technical
issues is to bring the matter to the Technical Committee. Why are you proposing a GR instead?
I don't think a shallow copy will work generally. Instead you want
to upload the entire upstream git repository as a bundle.
Please read his lightning talk "debconf22-94-lightning-talks.webm". Here's the
first to talk in the video:
https://meetings-archive.debian.net/pub/debian-meetings/2022/DebConf22/
What I found super nice with his design is that:
* there's no need to modify anything on the Debian infrastructure
* there's no need for a GR or a change of any Debian current policy.
* packages continue to be signed with your own DD key
Why can't we move to this route, with standardized tooling?
The reason we have multiple git workflows is because they are
incremental designs that do not try to change the way Debian works, or
the way git works.
By creating an upload service, we elevate git to "interface" status.
That would be a good thing if there was a single interface. However, we
have three (that I know of), none of these were designed to talk to
anything but itself, and the service uses a heuristic to determine which
one is used.
At the very least, we need to make it explicit which repository layout
is to be used, and version and document that interface, then support it
for several years in the future even as we make incremental changes,
because we want to be able to regenerate packages from the git archive.
I think it's possible to avoid running arbitrary code from the package
during a source package build because tag2upload doesn't need to run the clean step since it's starting from a fresh Git checkout (please check me
on this).
The TC can overrule individual developers (§6.1.4 in the Debian constitution), but it can't overrule a position delegated by the DPL,
and the ftp team is such a position. We have had situations in the past
where an issue involving the ftp team was brought to the TC, and the
most we could do about it was to "offer advice" (agree among ourselves
on a non-binding opinion, §6.1.5) and hope that the ftp team might
reconsider their decisions on the basis of that advice.
To the best of my understanding, the only mechanism the project has for overruling a DPL delegate is a GR.
tag2upload already supports most existing workflows (including the one
you yourself prefer, where only debian/ is committed to git).
On Thu, 2024-06-13 at 05:58 +0800, Sean Whitton wrote:
tag2upload already supports most existing workflows (including the one you yourself prefer, where only debian/ is committed to git).
How does this work? Does the builder run the `get-orig-source` target
in debian/rules?
Ansgar 🙀 writes ("Re: [RFC] General Resolution to deploy tag2upload"):
On Thu, 2024-06-13 at 05:58 +0800, Sean Whitton wrote:
tag2upload already supports most existing workflows (including the one you yourself prefer, where only debian/ is committed to git).
How does this work? Does the builder run the `get-orig-source` target
in debian/rules?
No. The git commitid of the upstream source is named in the tag
generated by git-debpush. (So that upstream git branch has to be in
your git repo somewhere - just not in your branch.) The t2u server
will use that (ultimately, via git-archive).
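To make the "named commit, then git-archive" idea concrete, here is a minimal sketch of how a tool could turn an upstream commit ID into a `git archive` invocation. It is not the tag2upload implementation; the function name, the package name "hello" and the prefix layout are assumptions made for illustration. Only the `git archive` options `--format` and `--prefix` are real git flags.

```python
import re

# A full SHA-1 object name is 40 lowercase hex digits. Requiring the full
# id (rather than a branch or tag name) keeps the export tied to exactly
# the commit that was named in the signed tag.
_COMMIT_ID = re.compile(r"[0-9a-f]{40}")


def upstream_archive_command(commit_id, package, upstream_version):
    """Build (but do not run) a `git archive` command for one commit.

    The prefix "<package>-<upstream_version>/" is an assumed layout for
    the top-level directory inside the generated tarball.
    """
    if not _COMMIT_ID.fullmatch(commit_id):
        raise ValueError(f"not a full commit id: {commit_id!r}")
    prefix = f"{package}-{upstream_version}/"
    return [
        "git", "archive",
        "--format=tar",
        f"--prefix={prefix}",
        commit_id,
    ]
```

The command is returned as an argument list so that a caller can pass it to subprocess.run() without going through a shell. The commit must already exist in the repository being exported, which matches the point above that the upstream branch has to be present in the packaging repository.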
Ian Jackson <[email protected]> writes:
No. The git commitid of the upstream source is named in the tag
generated by git-debpush. (So that upstream git branch has to be in
your git repo somewhere - just not in your branch.) The t2u server
will use that (ultimately, via git-archive).
How does t2u find out the URL to the upstream git repository? Is
https:// enforced, or are http:// or git:// URLs supported? Is the
upstream git branch recorded anywhere, or just the commit?
"Simon" == Simon McVittie <[email protected]> writes:
Simon Josefsson <[email protected]> writes:
Ian Jackson <[email protected]> writes:
No. The git commitid of the upstream source is named in the tag
generated by git-debpush. (So that upstream git branch has to be in
your git repo somewhere - just not in your branch.) The t2u server
will use that (ultimately, via git-archive).
How does t2u find out the URL to the upstream git repository? Is
https:// enforced, or are http:// or git:// URLs supported? Is the
upstream git branch recorded anywhere, or just the commit?
My understanding is that there is no separate upstream Git repository. I believe that's what Ian means by "the upstream Git branch has to be in
your Git repo somewhere." In other words, you have to push upstream to
your Salsa packaging repository to use tag2upload, but you don't have to merge it with your packaging branch.
The ftp team could ignore that too, but the ftp team could also ignore a
GR, and I think the DPL would be well justified for removing a delegate either for ignoring an override in a GR or for failing to follow
sensible policies adopted by the TC under 6.1.1.
Hello zigo,
On Fri 14 Jun 2024 at 11:39am +02, Thomas Goirand wrote:
Please read his lightning talk "debconf22-94-lightning-talks.webm". Here's the
first to talk in the video:
https://meetings-archive.debian.net/pub/debian-meetings/2022/DebConf22/
What I found super nice with his design is that:
* there's no need to modify anything on the Debian infrastructure
* there's no need for a GR or a change of any Debian current policy.
The work has already been done to prepare the additional infrastructure
(note that there is no need to *modify* any existing infrastructure),
and to prepare this GR.
We are enthusiastic to complete the remaining work. The mere fact that change is required shouldn't hold us back from going for what we think
is the best solution, if there are people willing to implement it.
* packages continue to be signed with your own DD key
Why can't we move to this route, with standardized tooling?
Well, to put it simply, because it's better to do things using only
signed git tags than to do something highly Debian-specific.
It is better if new contributors don't have to learn about source
packages and dput at all. It is also much more convenient for existing contributors. Take a look at how git-debpush works -- it's really very simple and lightweight. I think you'll like it.
Scott Kitterman <[email protected]> writes:
I agree that this isn't a major design issue, but I think it is
something that I think needs to be addressed before deployment of
tag2upload. The need is certainly rare, but when it's needed, it's
needed because it's important.
I don't understand why this would be a blocker given that dak can redo the authorization check at the same point that it does authorization checks
now, should it so desire. This does require a small change to dak to retrieve the key fingerprint from the source package in the case where the source package is signed with the tag2upload key, but that doesn't seem
too difficult.
On June 13, 2024 3:29:21 PM UTC, Russ Allbery <[email protected]> wrote:
I don't understand why this would be a blocker given that dak can redo
the authorization check at the same point that it does authorization
checks now, should it so desire. This does require a small change to
dak to retrieve the key fingerprint from the source package in the case
where the source package is signed with the tag2upload key, but that
doesn't seem too difficult.
I think that if the proposers want to direct use of a specific design
via GR, it ought to be complete.
It's unclear to me how the FTP Masters could ask for this after the GR,
since the GR takes anything to do with tag2upload out of their hands
going forward.
Post GR, it's not clear to me who gets to decide if changes are needed without another GR.
Scott Kitterman <[email protected]> writes:
On June 13, 2024 3:29:21 PM UTC, Russ Allbery <[email protected]> wrote:
I don't understand why this would be a blocker given that dak can redo
the authorization check at the same point that it does authorization
checks now, should it so desire. This does require a small change to
dak to retrieve the key fingerprint from the source package in the case
where the source package is signed with the tag2upload key, but that
doesn't seem too difficult.
I think that if the proposers want to direct use of a specific design
via GR, it ought to be complete.
Sorry, I don't understand. What isn't complete? I just explained how dak could continue to enforce all the same authorization checks as it does
today. This is part of the design as proposed. The key fingerprint of
the original tag signer is present in the Git-Tag-Info header in the *.dsc file as uploaded to dak.
It's unclear to me how the FTP Masters could ask for this after the GR, since the GR takes anything to do with tag2upload out of their hands
going forward.
I don't believe this is a correct interpretation of how GRs that override
a delegate decision are applied in Debian.
For one, absolutely nothing about a GR or any other action in Debian constrains what FTP Masters can *ask* for. Surely that's obvious. It
would only constrain what FTP Masters can *demand*. One would hope that,
in the presence of new guidance from the project as a whole about the technical direction, FTP Masters and the tag2upload developers would work collaboratively together to improve the entire architecture. Nothing in
the GR prevents that; that's never been how we interpret GRs.
Second, the specific thing that the GR requires of FTP Master is that tag2upload be allowed to upload source packages signed with its key, following the architecture spelled out here. That architecture includes providing dak, and everyone else looking at the *.dsc file, with the fingerprint of the original tag signer. It does not preclude dak from performing the normal authorization checks for uploads to the archive,
only from rejecting packages because they are uploaded through tag2upload.
Post GR, it's not clear to me who gets to decide if changes are needed without another GR.
This was much-discussed after both
https://www.debian.org/vote/2007/vote_002 and https://www.debian.org/vote/2007/vote_003. Kurt is authoritative on this point, I think, since it's a question of constitutional interpretation,
but my understanding of the project consensus is that a GR is not forever-binding. We all understand that circumstances change in the
future and we do not need to strictly follow the exact text of a GR into
the indefinite future. It's not a law. The exact time frame is not
defined anywhere, but I would think of it as a sort of "slow decay" where, over time, the GR should be seen as a directional statement but the exact architecture should and will change based on new requirements, new issues, and more experience.
Scott Kitterman <[email protected]> writes:
On June 13, 2024 3:29:21 PM UTC, Russ Allbery <[email protected]> wrote:
I don't understand why this would be a blocker given that dak can redo the authorization check at the same point that it does authorization checks now, should it so desire. This does require a small change to dak to retrieve the key fingerprint from the source package in the case where the source package is signed with the tag2upload key, but that doesn't seem too difficult.
I think that if the proposers want to direct use of a specific design
via GR, it ought to be complete.
Sorry, I don't understand. What isn't complete? I just explained how dak could continue to enforce all the same authorization checks as it does today. This is part of the design as proposed. The key fingerprint of
the original tag signer is present in the Git-Tag-Info header in the *.dsc file as uploaded to dak.
On Friday, June 14, 2024 2:45:55 PM EDT Russ Allbery wrote:
Sorry, I don't understand. What isn't complete? I just explained how
dak could continue to enforce all the same authorization checks as it
does today. This is part of the design as proposed. The key
fingerprint of the original tag signer is present in the Git-Tag-Info
header in the *.dsc file as uploaded to dak.
Can, but doesn't currently. Elsewhere it has been claimed that
tag2upload can be implemented with no changes elsewhere and I think
that's just not true.
Which means that in the future, the tag2upload developers can make
whatever changes they want and the FTP team is required to accept them?
I'm still concerned about how this is going to work in practice. The tag2upload developers seem to be very confident that they have a good
design that is ready to be deployed and once the FTP Masters are
overridden on this, until such time as this natural decay runs, there's
no incentive for them to cooperate.
Is anyone volunteering to do the DAK changes to use Git-Tag-Info header
to get the signature if the source package is signed by a tag2upload
key?
On Fri, 2024-06-14 at 11:45 -0700, Russ Allbery wrote:
Sorry, I don't understand. What isn't complete? I just explained how
dak could continue to enforce all the same authorization checks as it
does today. This is part of the design as proposed. The key
fingerprint of the original tag signer is present in the Git-Tag-Info
header in the *.dsc file as uploaded to dak.
This would require the check to be implemented correctly in tag2upload. Otherwise whatever check dak performs is fairly useless.
We would also have a new critical system written and maintained by 1.2
people in a fairly old-style Perl dialect that have previously not kept
up with promises to maintain software stacks (e.g., systemd-shim which
then had to be replaced by other people with something else).
I think that some of the posts on this thread are exactly backwards in
their understanding of human motivation. Blocking someone's work from
being used until it's done the way that you would have done it yourself is not motivating, it's horribly demotivating. Seeing your work deployed
live and actively used by Debian does not eliminate the motivation to make any further changes; rather, it increases the willingness to do further
work drastically.
Okay, so we have to accept a path into the archive that is known to
accept malicious uploads that would have been rejected by dak so maybe
that path will be changed later? I don't see that happening given all suggestions to change this have been rejected, even when fairly simple
to implement.
My understanding is that there is no separate upstream Git repository. I
believe that's what Ian means by "the upstream Git branch has to be in
your Git repo somewhere." In other words, you have to push upstream to
your Salsa packaging repository to use tag2upload, but you don't have to
merge it with your packaging branch.
Ansgar 🙀 <[email protected]> writes:
On Fri, 2024-06-14 at 11:45 -0700, Russ Allbery wrote:
Sorry, I don't understand. What isn't complete? I just explained how
dak could continue to enforce all the same authorization checks as it
does today. This is part of the design as proposed. The key
fingerprint of the original tag signer is present in the Git-Tag-Info
header in the *.dsc file as uploaded to dak.
This would require the check to be implemented correctly in tag2upload. Otherwise whatever check dak performs is fairly useless.
It requires that the signature on the Git tag be correctly checked and
that fingerprint be put into the *.dsc file, yes.
It doesn't require that dak then also trust the authorization checks.
We would also have a new critical system written and maintained by 1.2 people in a fairly old-style Perl dialect that have previously not kept
up with promises to maintain software stacks (e.g., systemd-shim which
then had to be replaced by other people with something else).
Yes, the tag2upload developers implemented the service the way that they implemented it, and the proposed GR would say that they can deploy that implementation. Asking them to redo that work in a different programming language or with a substantially different architecture before it can be deployed is not, at this point, a reasonable request, even apart from the general principle that Debian is a volunteer project and no one is
required to do work.
I think that some of the posts on this thread are exactly backwards in
their understanding of human motivation. Blocking someone's work from
being used until it's done the way that you would have done it yourself is not motivating, it's horribly demotivating. Seeing your work deployed
live and actively used by Debian does not eliminate the motivation to make any further changes; rather, it increases the willingness to do further
work drastically.
On Friday, June 14, 2024 5:25:33 PM EDT Russ Allbery wrote:
It requires that the signature on the Git tag be correctly checked and
that fingerprint be put into the *.dsc file, yes.
It doesn't require that dak then also trust the authorization checks.
Yes. It does. Since DAK has no way to check the signature of the tag against the keyring, it has to trust the source package signature done
by tag2upload. The only two choices are blindly trust tag2upload is
correct or don't accept uploads from tag2upload.
My impression (and I may be wrong, because it was awhile ago and since
I'm not an FTP Master I wasn't super focused on it) is that the
fundamental issue is tag2upload inherently requiring DAK to blindly
accept anything tag2upload signs and the FTP delegates not being
comfortable with that.
That was the issue last time this was discussed (IIRC) and it doesn't
appear that anything has changed. I don't see how it can with the
current architecture.
I suspect a vote of no confidence by the project in the FTP Masters
would not be super motivating either.
Scott Kitterman <[email protected]> writes:
On Friday, June 14, 2024 5:25:33 PM EDT Russ Allbery wrote:
It requires that the signature on the Git tag be correctly checked and
that fingerprint be put into the *.dsc file, yes.
It doesn't require that dak then also trust the authorization checks.
Yes. It does. Since DAK has no way to check the signature of the tag against the keyring, it has to trust the source package signature done
by tag2upload. The only two choices are blindly trust tag2upload is correct or don't accept uploads from tag2upload.
That's exactly what I just said. It has to trust that tag2upload verified the signature on the Git tag correctly. It does not have to trust that tag2upload performed the authorization check correctly; it has the fingerprint and can redo that itself.
It is entirely correct that deployment of tag2upload means that there are
two separate systems performing the OpenPGP signature verification for upload, and dak has to trust tag2upload's performance of that
verification. This is inherent in the design: dak and tag2upload are verifying signatures over different types of objects, and the verification
of the tag signature is not useful without also performing the
transformation to a source package. That is exactly what the whole tag2upload server is there to do.
dak should not be doing the source package transformation, because that is
a much more complicated process and therefore a larger security attack surface. That's why it's done in a sandbox with a bunch of privilege separation. That does indeed mean that dak has to trust the tag2upload verification of the original Git tag and its verification of the semantics
of that Git tag, because that's part and parcel with the rest of the work that tag2upload is doing. The tag2upload developers believe that the
schemes proposed for trying to make the original signature portable to the generated *.dsc file are too awkward and complex to be supportable, and personally I agree.
But this is entirely separate from the *authorization* check. After tag2upload uploads the *.dsc and *.changes file to dak, dak is in
possession of the key fingerprint of the original signer, the source
package name, the suite, and so forth. It can redo the *authorization*
check itself if it so chooses. The only thing it can't do is the *authentication* check.
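The split described above can be sketched as follows: the archive side takes the tag signer's fingerprint from the Git-Tag-Info field of the *.dsc as given (authentication was done by tag2upload), and then applies its own upload-permission rules to that fingerprint (authorization). This is not dak code. The "tag=... fp=..." layout of the field value, the function names, and the shape of the permissions table are all assumptions made for illustration.

```python
import re
from typing import Dict, Optional, Set


def tag_signer_fingerprint(dsc_text: str) -> Optional[str]:
    """Return the fp= value of the Git-Tag-Info field, or None.

    Assumes the field value looks like "tag=<object id> fp=<fingerprint>"
    with a 40-hex-digit fingerprint (an assumed layout).
    """
    for line in dsc_text.splitlines():
        if line[:1] in (" ", "\t"):
            continue  # continuation line of a previous field
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "git-tag-info":
            match = re.search(r"\bfp=([0-9A-Fa-f]{40})\b", value)
            return match.group(1).upper() if match else None
    return None


def is_authorized(dsc_text: str, source: str,
                  permissions: Dict[str, Set[str]]) -> bool:
    """Redo the authorization check for an already-authenticated signer.

    permissions is a hypothetical table mapping a key fingerprint to the
    source packages it may upload; "*" stands for unrestricted rights.
    """
    fingerprint = tag_signer_fingerprint(dsc_text)
    if fingerprint is None:
        return False
    allowed = permissions.get(fingerprint, set())
    return "*" in allowed or source in allowed
```

Note what the sketch cannot do: it has no way to confirm that the fingerprint really belongs to whoever signed the tag. That is the authentication step, and in this design it remains with tag2upload.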
My impression (and I may be wrong, because it was awhile ago and since
I'm not an FTP Master I wasn't super focused on it) is that the
fundamental issue is tag2upload inherently requiring DAK to blindly
accept anything tag2upload signs and the FTP delegates not being comfortable with that.
Yes, I believe that's the core disagreement. I don't believe there is any way around that without breaking one or more design goals of tag2upload.
It's not clear to me why it is considered a blocker for signature verification in the tag2upload case to be done by a different piece of software running on limited-access Debian project infrastructure instead
of in dak, a piece of software running on limited-access Debian project infrastructure. But that's fine; it doesn't need to be clear to me. I believe it is in the remit of the FTP team delegation to make that
decision, but there is also a constitutional process for appealing that decision to the project as a whole. The tag2upload developers have made their case, the FTP team can make their case for why they don't want to
allow this, and the project can decide. That's how our system works.
That was the issue last time this was discussed (IIRC) and it doesn't appear that anything has changed. I don't see how it can with the
current architecture.
I agree.
I suspect a vote of no confidence by the project in the FTP Masters
would not be super motivating either.
I think interpreting this GR as a vote of no confidence by the project in
the FTP Masters would be an extreme overreaction. The FTP Masters were overruled in https://www.debian.org/vote/2007/vote_002 and life went on.
All of us are at odds with the general consensus of the project at one
point or another. That's just part of working collaboratively with people who are not clones of us. Feedback from the project as a whole can be extremely helpful and constructive. There's no reason to take it
personally. I have been overruled in my design decisions many times in my life, including by people who were and remain close friends.
Just because I think the FTP team made the wrong decision in this
particular case does not mean I have no confidence in their regular work.
On 6/14/24 12:01, Sean Whitton wrote:
Well, to put it simply, because it's better to do things using only
signed git tags than to do something highly Debian-specific.
In what ways aren't we discussing debian-specific things anyways? I don't understand this part. If we just type "push2upload" and it's doing some magic behind the scenes, what's the issue? Moving the magic inside the CI hides things even more than doing them on the local computer.
It is better if new contributors don't have to learn about source
packages and dput at all. It is also much more convenient for existing
contributors. Take a look at how git-debpush works -- it's really very
simple and lightweight. I think you'll like it.
Here as well, I don't understand. If we have the necessary tools to do the way
Didier did, why would a new contributor need to learn about dput? The CI would
upload for them...
As for "learn about source packages", I'm not sure what there would be
to learn, except having to configure a correct build environment. That
indeed is an impressive amount of things to learn, but:
1/ one has to learn how to build packages to be able to contribute
Maybe. Maybe this breaks the thing into two parts in a way it wasn't
before. If you verify the signature on the source package and the key is
in the keyring, you know that the package was uploaded by someone
authorized to do so and that the code you have is what they signed.
With tag2upload you have neither. You have tag2upload's claim of who
signed the tag and the source package constructed by tag2upload. The connection to what the uploader intended to upload is completely
indirect.
I don't think there's any real mystery about this, but the claim in the
draft GR was that there was an unwillingness to communicate.
Some or all of them may be unwilling to continue to be responsible for managing the security of the archive once the security of the system has
been (in what I believe to be their view) compromised.
Scott Kitterman <[email protected]> writes:
Maybe. Maybe this breaks the thing into two parts in a way it wasn't before. If you verify the signature on the source package and the key is
in the keyring, you know that the package was uploaded by someone authorized to do so and that the code you have is what they signed.
With tag2upload you have neither. You have tag2upload's claim of who signed the tag and the source package constructed by tag2upload. The connection to what the uploader intended to upload is completely
indirect.
I guess I consider separating authentication from authorization to be a pretty routine thing to do, since I've worked on lots of systems that do that. But yes, it is a change from dak's perspective in that it is no
longer the sole agent involved in authentication checks and it has to
trust that tag2upload did its part correctly.
I don't think there's any real mystery about this, but the claim in the draft GR was that there was an unwillingness to communicate.
I cannot speak for the authors of the draft GR, but the claim that I would make is that there seems to be a lot of reluctance on the part of the FTP team to communicate *why* they think that trusting tag2upload is a
problem. My conversation with Ansgar felt typical to me: vague assertions
of security problems without an explanation of what those assertions are based on.
Again, this is their prerogative under the Debian constitution, although it has reached a point that I personally find a bit rude. But the project
can decide how much weight to put on those assertions.
With any luck, there's an explanation already waiting in my inbox while
I'm writing this and I'll be happily wrong. :)
I guess my assumption is that the security objections are based on a gut feeling or vibes, which makes them hard to explain. That's a real thing,
and I am familiar with the feeling, but I also don't expect it to be that persuasive to other people. When it comes down to rejecting substantial amounts of work that other people have put into solving a problem they
care deeply about, I feel like it's my responsibility to really dig in and figure out what my vibes are based on or to let go of my objection.
That's my personal take; obviously other people can have their own
opinions on that score.
If the objection is that there should be one and only one piece of
software that verifies package upload signatures, meh, sure, all other
things being equal it's better to only have to trust one system than two,
but the whole point is that all other things aren't equal. Additional complexity is always a drawback, but it's also often the cost of adding
new features. If that's the sole objection, it seems pretty weak to me.
If the objection is that the implementation of the tag2upload security
checks is not secure, then that is a very real problem that would need to
be fixed and someone should spit out the details so that we can have a
real discussion. But I have a hard time imagining that this is a blocking architectural objection. It's clearly possible to securely verify a Git
tag signature, modulo concerns about SHA-1 hashes that have been discussed exhaustively elsewhere. If tag2upload is doing it wrong, then tag2upload
can be fixed.
But this is all speculation on my part. I don't actually know what the objection is because no one has explained it, at least that I have seen
and understood. Maybe I just missed it.
Some or all of them may be unwilling to continue to be responsible for managing the security of the archive once the security of the system has been (in what I believe to be their view) compromised.
You narrowly dodged me going off on a long rant about one of my pet peeves about the computer security profession, but I had a nice dinner with my family and decided to spare everyone. :)
I guess the main thing that I will say to this is that I certainly hope people are not feeling this way because I think that would be an unhealthy way to approach a dispute. I guess this gets into personal philosophy,
but I think it's important to not hold one's positions so tightly that
they become brittle. I find that when I do that, *I* become brittle, and it's a deeply unpleasant experience.
It's not very helpful to think of systems as secure or insecure. Security
is always and forever a tradeoff. This is particularly true of the sort
of discussion we're having, where no one is identifying a concrete attack that could be performed against tag2upload today. Instead, we're discussing design principles that may or may not make tag2upload
vulnerable to problems in the future. Those discussions are very
important -- my security review was full of them -- but they're also inherently speculation and opinion, and it's always possible that one's design intuition is wrong.
One of the reasons why I wanted to write a proper security review and post
it publicly is because I want people to check my work. If someone finds a serious problem that I didn't think of, I would hope that I would change
my opinion accordingly. I know that's psychologically difficult to do
once I've publicly committed to a position, but that's all the more reason
to constantly restate to myself the importance of holding opinions loosely and being open to new information. This should be a collaborative
process. The goal is to enable people to do the work they want to do in a reasonably secure fashion, not to stand in front of people and declaim
"you are secure" or "you are not secure."
I have not done a review of the implementation. I'm not a great code reviewer because I have a lot of difficulty separating my aesthetic preferences from my analysis. I'm better at architecture. If someone is willing to do a detailed security review of the code, that, as far as I'm concerned, would be great. That's how we get better. If that turns up problems, clearly those problems should be fixed.
I am absolutely confident that if someone discovers an exploitable vulnerability in tag2upload, the tag2upload developers would be the very first people to turn the service off until they can figure out how to fix
the vulnerability. Everyone involved in this discussion cares deeply
about not compromising the security of the archive.
At the end of the day, it's just code. Most code problems are fixable.
We're pretty good at what we do. I have great confidence in Debian's
ability to make tag2upload work in a secure manner if we decide that's something we want to do.
FTP Masters, commonly referred to as "ftpmaster", oversee and maintain
the well-being of Debian's official package repositories.
Ansgar 🙀 <[email protected]> writes:
Okay, so we have to accept a path into the archive that is known to
accept malicious uploads that would have been rejected by dak so maybe
that path will be changed later? I don't see that happening given all suggestions to change this have been rejected, even when fairly simple
to implement.
This is not known. You have asserted this, and then come up with increasingly implausible excuses for why you cannot clearly explain wtf
you are talking about.
It's entirely possible that there are security bugs in the current
tag2upload implementation, just like it's entirely possible that there are security bugs in dak and in any other piece of software. The way we deal with those, now and in the future, is that someone explains what the
security bug is and then we see if we can fix it.
The ftpmaster team have refused to trust uploads coming from the
tag2upload service. This GR is to override that decision.
My point is that it's not doing any magic. It's less than 500 lines of shell.
(And given upstream's hard policy to not merge changes not signing off
extra legal stuff, I sadly cannot give a more detailed bug report as
any fix created as a derived work from that would be unmergable unless
there was a Debian fork of the project with a different policy.)
*shrug* For tag2upload even trivial patches fixing bugs like references
to undefined functions won't be applied.
I doubt any more involved patches to fix security issues would be
applied. So I decided to not waste my time on that (but I checked
briefly and at a quick glance it looks like issues from ~5 years ago
are still not resolved) and not stand in the way to create another
stalemate in case someone wants to fix them.
BEGIN FORMAL RESOLUTION TEXT
tag2upload allows DDs and DMs to upload simply by using the
git-debpush(1) script to push a signed git tag.
So, why am I proposing a GR?
The ftpmaster team have refused to trust uploads coming from the
tag2upload service. This GR is to override that decision.
While we want people in Debian in critical paths for archive security to
be relatively conservative, that conservatism can be taken too far, and
we think that is what has happened in this case.
In fact, tag2upload significantly *improves* the traceability of our source-only uploads.
From that, t2u can do its magic, build a source package, sign that with its key - and in the source include the full maintainer sig. Field in
On Wed, Jun 12, 2024 at 06:25:02AM +0800, Sean Whitton wrote:
BEGIN FORMAL RESOLUTION TEXT
tag2upload allows DDs and DMs to upload simply by using the
git-debpush(1) script to push a signed git tag.
Question. Does the tag signer need to trust the remote vcs and its
admins at the moment of tag signing? With a .changes file the signer has
full local control: local source code inspection, local checksums
generation, and local signing. I wonder how tag2upload would offer this
level of control without lowering the value of the signatures.
FTPMaster *is* in support of t2u, if it ends up in a way that allows dak doing the final verification/authorization of the upload, NOT needing to trust some other instance.
As generating changes and dsc on the maintainer side is out (we want a
git $something workflow now), that verification ought to be over the
content. So whatever tool the maintainer ends up calling ought to
generate a signature over the content of the package and put that into
git (a tag, or whatever t2u uses).
That then allows dak to do what it does now and trust the thing
originates from the maintainer.
My understanding is that the problem with this design from their
perspective is that it requires a fat client on the uploader's system,
and whole point of tag2upload is to stop requiring a
fat client on the uploader's system. In particular, it requires all the
code to reconstruct the source package from a Git tree be installed
locally, which is basically a full dgit implementation.
[email protected] wrote:
My understanding is that the problem with this design from their
perspective is that it requires a fat client on the uploader's system,
and whole point of tag2upload is to stop requiring a fat client on the
uploader's system. In particular, it requires all the code to
reconstruct the source package from a Git tree be installed locally,
which is basically a full dgit implementation.
Does it? What if both the tag2upload client and server implemented
instead some very simple serialization and canonicalization algorithm
over the source package?
I am thinking about hashing something like a sorted list of (file name,
file hash) tuples.
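A minimal sketch of the "sorted list of (file name, file hash) tuples" idea, for illustration only. The function name `manifest_digest`, the choice of SHA-256, and the NUL/newline framing are assumptions, not anything specified in this thread. It covers regular file contents only and takes the files as an in-memory mapping.

```python
import hashlib


def manifest_digest(files):
    """Digest a mapping of relative path (str) -> file content (bytes).

    Hypothetical helper: hashes a sorted list of (file name, file hash)
    entries so that the result does not depend on enumeration order.
    """
    outer = hashlib.sha256()
    for name in sorted(files):
        file_hash = hashlib.sha256(files[name]).hexdigest()
        # NUL ends the name and newline ends the entry, so two different
        # file lists cannot serialize to the same byte stream.
        outer.update(name.encode("utf-8") + b"\0")
        outer.update(file_hash.encode("ascii") + b"\n")
    return outer.hexdigest()
```

Both sides would have to agree on exactly this serialization, and both would need the same unpacked file list to start from.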
The serialization isn't the problem, constructing the source package is.
Once you have a source package, there are lots of things you can do, but
the problem is precisely that going from a Git tree to a source package
is non-trivial and involves a whole bunch of Debian-specific code.
Yes, I understand this. But I think that the goal can be much simpler:
Maybe, but it should not be hard to add this kind of metadata.
I am thinking about hashing something like a sorted list of (file name,
file hash) tuples.
I was trying to figure out while I was walking today whether that would
be all you need, and I'm not sure it is. I couldn't convince myself that
you could ignore file permissions, symlinks, hard links, and so forth.
I'm currently a bystander. And while I reply to Joerg's mail, I'm not directly referencing
any of the points in his mail, so no quotes.
I'd like to point out though, that signing the content of the package is not possible if the
developer should only need to do `git $something`.
They would also need to generate the source package, as I don't see a guarantee that
regenerating the source package from the same git tag (by t2u) would necessarily result
in a bitwise identical source package.
What would be possible would be (if dak has sufficient network access) to check the
signed git tag that t2u used and re-check the signature on that. The problem remains
that this only verified that the tag was set, not that t2u actually used the code that tag
points to. That would again require trust in t2u or reproducible source package builds
(and for dak to rebuild from the git repo).
In essence: I don't see how to fulfill the mentioned requirements by ftpmasters while
keeping the workflow of developers minimal. The only way I see to fulfill them is to have
the workflow that t2u is supposed to simplify and host actually run on the developer-controlled machines instead of a centralized service. Which defeats the
purpose IMHO.
On 17258 March 1977, Sean Whitton wrote:
So, why am I proposing a GR?
This one took me by surprise, honestly.
Looking into my notmuch, the last time tag2upload came up in my
ftpmaster inbox was in 2019. Between then and now there doesn't appear to
be any serious contact with us about it. There had been mentionings on
some mailing list somewhere, but nothing coming to us, that I can
find.
Even then, back in 2019, one of the major points that ftpteam members
raised had been "the archive has to be the final point to check if an
upload is accepted" and that we do retain *all* user signatures of
source packages, and that such a service must provide the same level
of possible verification. Some other requirements on the signature too (collision resistant, need to be verifiable with only stuff included
in the source package). Also something about not using Perl, but meh,
let's ignore that one here.
So, 5 years of (hopefully) development, but the major point (this should *not* bypass/circumvent archive upload checks and restrictions) did not
get addressed. More like, entirely ignored.
I'm a bit confused by the claim that no infrastructure changes are needed for this to go forward.
If I have been following the proposal correctly, source packages will be signed by tag2upload and not the uploader. Doesn't that mean changes are going to be needed so that we know in the archive who uploaded the package?
Hi Sean
On 2024/06/15 02:14, Sean Whitton wrote:
My point is that it's not doing any magic. It's less than 500 lines of shell.
Where do I find this? I searched for tag2upload on salsa and did an 'apt-cache
search tag2upload', but couldn't find anything.
On Fri 14 Jun 2024 at 06:06pm GMT, Scott Kitterman wrote:
I'm a bit confused by the claim that no infrastructure changes are needed for
this to go forward.
If I have been following the proposal correctly, source packages will be
signed by tag2upload and not the uploader. Doesn't that mean changes are
going to be needed so that we know in the archive who uploaded the package?
Ah, do you mean how tracker.d.o shows (signed by: [email protected]) for a
sponsored upload?
(...)
The tag2upload developers have been working on this system for many years now. Deployment has been blocked by an FTP team decision. At this point, the tag2upload developers are quite understandably not willing to do more work unless their work can actually be used by the project. The design
has gone through multiple iterative improvements, and while it can always
be better, at some point we need to make a decision about whether we're
going to ask them to abandon it or let them deploy it. The decision of
the FTP team appears to be that they should abandon it. The tag2upload developers are proposing to appeal that decision to the project as a
whole.
Deciding to deploy the service does not mean freezing the whole thing
as-is forever. It's inevitable that new issues will arise once the
service is running, and those will need to be addressed. It does mean extending trust to the tag2upload developers to manage their portion of
the service, similar to how we trust the FTP team to manage dak. That
trust includes expecting them to respond reasonably to concerns and
problems as they arise.
There is no need for this to be personal or hostile. I'm also a project delegate in another area, and I consider it part of the role of a project delegate to accept guidance from the project. I will make mistakes. I
will also make decisions that I don't think are mistakes but that don't
match what the project wants. The project gets to decide via GR if I'm wrong. That's part of the bargain I accepted when I joined the project.
If they so decide, it's my responsibility to go along with that decision
in good faith and with good will, or to decide that I no longer want to do the delegated work.
The point of the GR is to provide clear guidance from the project: either deploy this thing so that we can see how it works and further improve it,
or abandon the idea. We need to get some closure on this; talking about
it forever while blocking forward progress creates animosity and
frustration. Once the project provides that closure, I would expect
everyone involved to reorient around the guidance that the project has provided and work collaboratively together on Debian, just like we all try
to do in every other area.
Today I can download any source package in the archive and verify who uploaded the package and is responsible for its contents. It doesn't
matter if I download it from the main archive or a mirror. Personally,
I think that's an important characteristic of our package archive, which
is lost by tag2upload.
Scott Kitterman <[email protected]> writes:
Today I can download any source package in the archive and verify
who uploaded the package and is responsible for its contents. It
doesn't matter if I download it from the main archive or a
mirror. Personally, I think that's an important characteristic
of our package archive, which is lost by tag2upload.
The same *information* is there, provided that the tag2upload
metadata is trustworthy, but it is not trivial to verify that
tag2upload did its part of the job properly. You can trace the
package back to tag2upload and you can see who tag2upload asserted
uploaded the package, and you can then retrieve that signed Git tag
and verify it, but in order to establish the last missing link, you
would have to redo the work that tag2upload did to assemble the
source package to check that it was done properly.
On 13.06.24 10:26, Sean Whitton wrote:
Yes. A proposal that has not yet engaged with the complexities of
3.0 (quilt) is not one in which we can yet have any confidence.
The proposal simply intends to do whatever the uploader would do to build
the source package from a tagged git worktree, except in a controlled and sandboxed environment.
I fail to understand why we should have any less confidence in that than in whatever the uploader does manually to achieve the same result (we hope!!).
--
-- regards
--
-- Matthias Urlichs
Hello,
On Sat 15 Jun 2024 at 06:03pm +02, Joerg Jaspert wrote:
On 17258 March 1977, Sean Whitton wrote:
So, why am I proposing a GR?
This one took me by surprise, honestly.
Looking into my notmuch, the last time tag2upload came up in my
ftpmaster inbox was in 2019. Between then and now there doesn't appear to be any serious contact with us about it. There had been mentionings on
some mailing list somewhere, but nothing coming to us, that I can
find.
In recent years, you have stepped in with your expertise in a number of emergencies, and I am most grateful for that.
But with respect, you have not otherwise been active in the ftpmaster
team, and you didn't significantly participate in the original
tag2upload discussions. So I think you may be missing things.
We have been seeking help behind the scenes over the past four years.
No progress was made, so we decided to draft a GR.
Even then, back in 2019, one of the major points that ftpteam members raised had been "the archive has to be the final point to check if an upload is accepted" and that we do retain *all* user signatures of
source packages, and that such a service must provide the same level
of possible verification. Some other requirements on the signature too (collision resistant, need to be verifiable with only stuff included
in the source package). Also something about not using Perl, but meh,
let's ignore that one here.
So, 5 years of (hopefully) development, but the major point (this should *not* bypass/circumvent archive upload checks and restrictions) did not
get addressed. More like, entirely ignored.
Like Russ, I'm grateful for how you've set out some things more clearly
in this message. I'm looking forward to reading your reply to him.
I would ask you not to characterise the disagreement we are having as
merely over a technical detail.
It's the essence of tag2upload that the tag metadata is minimal, and
easily generated by a short shell script, like git-debpush.
We did not ignore your position: we argued against it. No-one from
ftpmaster has responded to our arguments for wanting the metadata to be minimal. So as I say, I'm looking forward to your reply to Russ.
I currently do not have too deep a thought on how good their
implementation is. Just one thing I've seen picked at multiple times,
and in different places: The current implementation appears to move
the final integrity check linking an upload to a person away from the
archive software to some other.
That's a no-go.
Note: I do not say it must be "a dsc" "a git commit" or "a something"
that is used for this check. That is an implementation detail. But the
final check/link of an upload with a maintainer(s key) has to be "in"
the archive. Systems before it can *additionally* do any number of them,
but the final one is in dak.
On 15.06.24 00:37, Russ Allbery wrote:
dak should not be doing the source package transformation, because that
is a much more complicated process and therefore a larger security
attack surface. That's why it's done in a sandbox with a bunch of
privilege separation.
… which incidentally is far more secure than what Joe Random DD does
when he generates a source package.
The difference of course is that if somebody manages to compromise
tag2upload then they could insert backdoors into any and all Debian
packages, not just Joe's. On the other hand we can mitigate this by
careful auditing and monitoring. There is no auditing and monitoring on
Joe's development system, and the XZ backdoor has shown that you don't
even need to compromise Joe; a hit on the tarball that Joe uses is sufficient.
Would it be possible for tag2upload to generate some sort of log or diff of
its operation? Then, a verifier does not have to reimplement the whole
dgit logic with all its edge cases, it merely has to apply the same tree transformation(s) as t2u and verify that this will indeed produce the
source package from the signed Git tag.
I believe that's what tag2upload pushes to the dgit-repos server,
although I'm not sure that exactly matches what you're asking for.
I was pondering over a way to securely link the Git tag with the
On 13.06.24 21:51, Gunnar Wolf wrote:
But there can be many reasons the
three of us (keyring-maints) are unreachable for several hours.
Maybe a "send your key revocation certificate here and this will be done automagically" [email protected] email address might be a good idea …?
However this is mostly unrelated to the current GR discussion IMHO.
Do we delete all our old snapshots from snapshot.d.o if/when
infringing or non-Free content is detected in a package?
AFAIK: no we don't.
Looking into my notmuch, the last time tag2upload came up in my
ftpmaster inbox was in 2019. Between then and now there doesn't appear
to be any serious contact with us about it. There had been mentionings
on some mailing list somewhere, but nothing coming to us, that I can
find.
But with respect, you have not otherwise been active in the ftpmaster
team, and you didn't significantly participate in the original
tag2upload discussions. So I think you may be missing things.
We have been seeking help behind the scenes over the past four years.
No progress was made, so we decided to draft a GR.
So, 5 years of (hopefully) development, but the major point (this should
*not* bypass/circumvent archive upload checks and restrictions) did not
get addressed. More like, entirely ignored.
Like Russ, I'm grateful for how you've set out some things more clearly
in this message. I'm looking forward to reading your reply to him.
I would ask you not to characterise the disagreement we are having as
merely over a technical detail.
It's the essence of tag2upload that the tag metadata is minimal, and
easily generated by a short shell script, like git-debpush.
From what I think right now, what we want would fit this.
Marco d'Itri <[email protected]> writes:
[email protected] wrote:
My understanding is that the problem with this design from their
perspective is that it requires a fat client on the uploader's system,
and whole point of tag2upload is to stop requiring a fat client on the
uploader's system. In particular, it requires all the code to
reconstruct the source package from a Git tree be installed locally,
which is basically a full dgit implementation.
Does it? What if both the tag2upload client and server implemented
instead some very simple serialization and canonicalization algorithm
over the source package?
The serialization isn't the problem, constructing the source package is.
Once you have a source package, there are lots of things you can do, but
the problem is precisely that going from a Git tree to a source package is non-trivial and involves a whole bunch of Debian-specific code.
FTPMaster *is* in support of t2u, if it ends up in a way that allows dak
doing the final verification/authorization of the upload, NOT needing to
trust some other instance.
Why is this your red line? Is it only that you don't want to add another
system to the trusted set, or is there something more specific that
you're concerned about?
As generating changes and dsc on the maintainer side is out (we want a
git $something workflow now), that verification ought to be over the
content. So whatever tool the maintainer ends up calling ought to
generate a signature over the content of the package and put that into
git (a tag, or whatever t2u uses).
I want to talk about designs from the perspective of threat models and
constraints, so I'm going to try to reverse engineer those from your
proposed solution so that we can have a more structured discussion of
the security properties. Please check this and make sure that I've
correctly captured your thought process here:
The threat that you are trying to protect against is a compromise of the
tag2upload server. I think you're trying to find a design that meets the
following constraints:
1. You want there to be some external check on the tag2upload server to
ensure that it correctly constructed a source package from the
uploader-signed artifact.
2. You (correctly, in my opinion) do not want dak to perform the
construction of the source package from a Git tag, so you are looking
for some other agent in the system to serve as the check on the
tag2upload server.
Is that correct?
If that's right, it sounds like your solution is to push that
verification work to the uploader. I know you're trying to avoid
specifics, but let me make this slightly more specific so that I have
something concrete to talk about. If I understand correctly, a design
that you would approve would look something like this:
1. The uploader performs the work to transform a Git tree into an
unpacked source package and calculates a Merkle tree hash of that
unpacked source package or something equivalent.
2. The uploader creates a signed Git tag over the corresponding Git tree
in the same way as in the tag2upload design but additionally includes
the Merkle tree hash in the signed data.
3. tag2upload functions in the same way as designed, starting from the
Git tag, constructing the source package, and passing it to dak. It
additionally conveys the signed Git tag object to dak in some form,
such as a separate file.
4. dak verifies the Git tag signature, performs normal authorization
checks against it, unpacks the source package, calculates the same
Merkle tree hash, and ensures that the hash matches the one in the Git
tag.
Am I correct that this is the type of design that you are asking for and
that you would approve this design, modulo the normal sort of details
that would need to be hashed out?
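To make steps 1 and 4 concrete, here is a rough sketch of a Merkle-style hash over an unpacked source tree, where each directory's digest is computed from its children's digests. Everything in it is an assumption for illustration: the name `tree_hash`, the use of SHA-256, the record framing, and the decision to cover only the executable bit and symlink targets. Hard links and other metadata are not distinguished, and this is not the format used by dak, dgit, or tag2upload.

```python
import hashlib
import os
import stat


def tree_hash(path):
    """Return a hex SHA-256 digest of the file, symlink, or directory at path.

    Hypothetical helper: directory digests are built from sorted child
    entries, so the root digest changes if anything below it changes.
    """
    st = os.lstat(path)  # lstat so that symlinks are not followed
    h = hashlib.sha256()
    if stat.S_ISLNK(st.st_mode):
        h.update(b"link\0" + os.fsencode(os.readlink(path)))
    elif stat.S_ISDIR(st.st_mode):
        h.update(b"dir\0")
        for name in sorted(os.listdir(path)):
            child = tree_hash(os.path.join(path, name))
            h.update(os.fsencode(name) + b"\0" + child.encode("ascii") + b"\n")
    elif stat.S_ISREG(st.st_mode):
        # Only the owner-executable bit is recorded (an assumption).
        executable = b"x" if st.st_mode & stat.S_IXUSR else b"-"
        h.update(b"file\0" + executable + b"\0")
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
    else:
        raise ValueError(f"unsupported file type: {path}")
    return h.hexdigest()
```

In the design described above, the uploader would compute this value locally and include it in the signed tag, and dak would recompute it over the unpacked source package and compare the two.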
If this is the case, then I think it's incorrect to say that the
tag2upload maintainers have ignored this feedback. I can't speak for
them, obviously, but I believe I've seen them answer essentially this
feedback multiple times. My understanding is that the problem with this
design from their perspective is that it requires a fat client on the
uploader's system, and whole point of tag2upload is to stop requiring a
fat client on the uploader's system. In particular, it requires all the
code to reconstruct the source package from a Git tree be installed
locally, which is basically a full dgit implementation.
From rereading the 2019 thread a bit, the argument *seems* to be that
this requires Debian specific tools and something locally doing work,
This is a real trade off about which we can disagree! This is a useful
thing for us to argue about and vote about. I agree that the design that
you propose is somewhat more secure in that it adds a check on the
security of the tag2upload server that would catch some classes of
compromise, although I believe I have a substantial caveat to your
analysis that I'll talk about more below. But it's a trade off, like
most things in security: the cost is that it's still not possible to
upload a Debian package via a signed Git tag with some metadata that one
can manually construct if one wishes. A Debian uploader still has to
have a Debian-specific program installed locally that does a bunch of
complex transformations of a Git tree before they can trigger an upload.
If the disagreement is over whether that user interface property is
worth the security trade off, then that's a concrete thing that we can
argue about, but I want to make sure that this fully captures your
objection.
That then allows dak to do what it does now and trust the thing
originates from the maintainer.
I think this is probably my strongest point of disagreement with your
analysis. I think you're putting more weight on this idea of maintainer
intent than it can actually support, and I think your analysis of
maintainer intent is somewhat incorrect.
It sounds like you are assuming that the maintainer has vetted the thing
that they sign. I am extremely dubious that this is the case. I believe
that the typical maintainer workflow today is that the maintainer works
on the package in a working directory (usually but not always in Git)
until they are happy with the results. Then they run a build tool that
generates a source package, and they blindly sign and upload that source
package. They do not verify that the resulting source package matches
their intent in their working tree apart from building binary packages
based on it and running them.
In other words, the intent that the maintainer who uses Git is trying to
express is "upload something corresponding to this Git tree and this
upstream orig tarball to the archive." By asking for the signature to be
over the source package instead of over the Git tree, we are already
diluting maintainer intent. The thing the maintainer signs is not the
source code of the package; that's the Git tree. It's a build product of
the source code.
In that sense, the signature verification that the tag2upload server
does is *closer* to actual maintainer intent than a signature
verification on the *.dsc file. We're diluting maintainer intent by
moving to the source package.
That's one of my objections. My other objection is that I think that the
uploader's system is already the weakest link in our current security
model. Relying on it for additional security properties is something
that we're currently doing, and having the uploader's system redundantly
check the tag2upload server does have some security benefit, but I think
that security benefit is substantially less than the benefit of, say, a
reproducible source package build server in a separate security domain
but with a similar secured architecture rather than whatever state the
uploader's system is in.
In other words, if the goal is to create a redundant check on the
tag2upload server, doing that via something the uploader signs is not
clearly better (and I think arguably worse) than having two tag2upload
servers in separate security domains that perform the same operations.
In both cases you're still trusting the same code to perform the source
package transformation, but the tag2upload server has a better security
model than the uploader's local system.
We want dak (and anyone else) to be able to say "Yes, DD/DM $x has
signed off this content". That only works, if dak (and later, the
public, if they want to check too) have the signature for this in a way
they can verify it. And not just a line somewhere "Sure, $service
checked this for you, trust us, please".
Yes, you have been very clear from the start that this is what you want.
On 2024-06-15 5 h 03 a.m., Philip Hands wrote:
Sean Whitton <[email protected]> writes:
...
The ftpmaster team have refused to trust uploads coming from the
tag2upload service. This GR is to override that decision.
Full disclosure:
I'm a happy dgit user. The support I've had from Ian for dgit (when I
messed things up, generally) has been outstandingly good, and has
generally resulted in a change to dgit that prevents me (and others)
from messing up in a similar manner. It strikes me that tag2upload is
another stride in the same direction, so I'd like to have the chance
to use it, because I suspect that it will also make contributing to
Debian easier, less error-prone, and just more pleasant.
[Note: in the following, I am NOT trying to suggest a technical fix, so
please don't start nitpicking the details -- it's just a thought
experiment that I hope might shed some light on the situation]
If it were easy to deploy an instance of tag2upload in my house,
populated with a sub-key of my GPG key, I would probably set that up
(and then start worrying about the security of the sub-key ;-) ).
If I did that, I believe the FTP masters would still accept my uploads.
Should they? or is it perhaps the case that they are objecting to the
idea that tag2upload is capable of reliably generating a source package from a git tag. (I personally trust Ian when he says that it is capable)
If Ian were to offer a hosting service for such personal tag2upload instances, in a way that he assured me could not be used to sign
packages unless I had signed a matching git-tag, I would be willing to trust his assurances, and may well take him up on the offer.
It seems to me that such a centralised service is more likely to do
things like keep the keys in an HSM, and have effective separation of
the components, than something set up by a random developer at home, so
one could argue that it's going to be more secure than the self-hosted version.
Would the FTP masters still be OK with that? If not, what's changed?
If that's OK, but tag2upload as proposed is not, are we really drawing a line based on what name is on the signing key?
Would it make any difference to the FTP masters if there was some way
for me to assert that I trust the tag2upload service/key to build/sign source packages for me?
For instance, if one had to sign something with a GPG key that matches
the one that later signs a gpg tag, before tag2upload would be willing
to process one's signed tags, would that make the FTP masters happier?
Personally, I'm not convinced that would really add anything, since if
one has sufficient control of the key to push a signed tag, then one's
also going to be able to sign a statement that you want tag2upload to
act on that tag, but I thought that describing the options might help narrow down what the perceived problem is.
Of course, without something describing exactly what the problem is from the FTP master's point of view, it's very hard to judge the merits of
their position.
Cheers, Phil.
Thanks for this thought experiment. Although I was already in favor of
the proposal, it helped me get a better grasp of what is at stake in
this GR.
As many others have asked already, if there is really an opposition from
the FTP masters to the t2u proposal as stated in the draft GR, I urge
them to make it heard, especially with regards to Phil's email.
But maybe you can answer the question: Given the .dsc file, how can
you, and more critical the public, verify that you and only you signed
that upload?
If Ian were to offer a hosting service for such personal tag2upload instances, in a way that he assured me could not be used to sign
packages unless I had signed a matching git-tag, I would be willing to
trust his assurances, and may well take him up on the offer.
If that's OK, but tag2upload as proposed is not, are we really drawing a
line based on what name is on the signing key?
Would it make any difference to the FTP masters if there was some way
for me to assert that I trust the tag2upload service/key to build/sign
source packages for me?
Of course, without something describing exactly what the problem is from
the FTP master's point of view, it's very hard to judge the merits of
their position.
Bastian Blank <[email protected]> writes:
But maybe you can answer the question: Given the .dsc file, how can
you, and more critical the public, verify that you and only you signed
that upload?
Why is this, specifically, important?
I can turn that question around: given the .dsc file, how can I find the
Git tree that the maintainer vetted and intended to upload to the archive?
Why should I have any faith in the archive if I cannot verify that?
I don't think this is a useful way to talk about the security guarantees
that we can provide. You are massively overindexing on a very specific
implementation detail that does not prove what you seem to think it
proves.
But maybe you can answer the question: Given the .dsc file, how can
you, and more critical the public, verify that you and only you signed
that upload?
Still, we should find a way to keep the existing property of verifying
what the uploader signed to upload *without* requiring a third-party
$something to be available.
Verifying what the uploader signed is simple enough, it's a git tag. You
fetch it and verify that the hashes match ("git fsck"; current git is
hardened against SHAttered) and that it's signed by the correct key.
You want to verify t2u's work? Simple enough, run dgit and compare to
whatever t2u sent you. No $something required.
Oh wait, t2u isn't even "third party". It's a Debian tool running on
properly-administered (we assume) Debian hardware, running just another
build step in a sandbox.
Another way of doing this would be to teach t2u to simply push the tag
to an append-only git store. Then teach the builders that instead of
their equivalent of "apt-get source" they should fetch this tag from
our
git store and run dgit (and then push the legacy source tarballs)
themselves.
Would that scheme work better for you?
Hi Ansgar,
On Fri, Jun 14, 2024 at 10:39:11PM +0200, Ansgar 🙀 wrote:
...
Could you please expand on this and/or provide references? I have no idea what you're even talking about here.
I doubt any more involved patches to fix security issues would be
applied. So I decided to not waste my time on that (but I checked
briefly and at a quick glance it looks like issues from ~5 years ago
are still not resolved) and not stand in the way to create another stalemate in case someone wants to fix them.
Glances can be deceiving. From what I've seen from dgit bugs Ian just likes to keep bugs open for discussion. I see no problem if that's what you're seeing.
What bugs are you looking at? Please be more concrete.
1) because it is the job of FTPmaster to authenticate and authorize the
uploader (and Joerg sees that as "human uploader", which I somewhat agree
with)
If this were the actual issue then the ftpmasters could just run the tag2upload server themselves (which I think would make sense).
2) because Joerg wants third parties to be able to verify the signature of
the human uploader without the need for Debian specific tools.
Yes, I understand what he wants. But again, it is not obvious why we
There is another aspect he mentioned: he thinks the uploader needs to test
the build of the package. (In theory I agree, but there are situations
Everybody can upload totally untested packages even without tag2upload:
As someone who has read every email in this chain, I have a couple of recommendations.
1. Clarify that the GR does not prevent future flexibility in changing the tag2upload service and does not give Sean Whitton, Ian Jackson, Jonathan McDowell, or Russ Allbery perpetual power to direct how it is implemented. This was already discussed as the intent, but I think it should be clearer in the GR itself who has authority to implement and manage tag2upload going forward.
My recommendation is that either, 1) the tag2upload service should
come under the implementation and management umbrella of the
ftpmasters delegation, or, 2) the DPL should create a new delegation
to implement and manage the tag2upload service. The DPL might
consider appointing one or more of the four people listed above as
either ftpmasters or to the new delegation (assuming they are
interested and willing).
My understanding is that the problem with this
design from their perspective is that it requires a fat client on the uploader's system,
Timo Röhling <[email protected]> writes:
Would it be possible for tag2upload to generate some sort of log or diff of its operation? Then, a verifier does not have to reimplement the whole
dgit logic with all its edge cases, it merely has to apply the same tree transformation(s) as t2u and verify that this will indeed produce the source package from the signed Git tag.
I believe that's what tag2upload pushes to the dgit-repos server, although I'm not sure that exactly matches what you're asking for.
Another way of doing this would be to teach t2u to simply push the tag
to an append-only git store.
Then teach the builders that instead of their equivalent of "apt-get
source" they should fetch this tag from our git store and run dgit
(and then push the legacy source tarballs) themselves.
On 17262 March 1977, Sean Whitton wrote:
I would ask you not to characterise the disagreement we are having as merely over a technical detail.
You see this as personal? I don't, but if it is not technical, what
else?
Which behind the scenes? To who did you talk?
Also, currently we have the nicety that we store all signatures directly besides the source package, available for everyone to go and check.
Linking back to the actual Uploader, not to a random service key. You
can take that, run a gpgv on it and via the checksums of the files then
see that, sure, this is the code that the maintainer took and uploaded.
You do *not* need to trust any other random key on that. Not that of tag2upload. *AND* not that of FTPMaster.
[...]
We want dak (and anyone else) to be able to say "Yes, DD/DM $x has
signed off this content". That only works, if dak (and later, the
public, if they want to check too) have the signature for this in a way
they can verify it. And not just a line somewhere "Sure, $service
checked this for you, trust us, please".
Say I need to apply a security patch to some package's git tree on
Salsa. How can I be sure to even create the same source tree as the
previous uploader? I don't know which tool the maintainer used, nor the options supplied to it, so I can't.
Thus I need to ignore the maintainer's git tree in favor of "apt-get
source",
* We made fairly formal appeals to two sitting DPLs. What we got
was, basically, attempts at mediation, or facilitation of
discussions. We didn't see that as helpful, since we saw an
irreconcilable gap between our position and ftpmaster's.
Sean and I were under the impression that the most recent response
we got from a sitting DPL was sent to us after consulting with
ftpmaster.
On 17.06.24 12:14, Ian Jackson wrote:
[1] "precisely the patches in d/patches" turns out to be extremely complicated in the general case. Different maintainer tooling
interprets d/patches differently. dpkg-source and gbp do not agree!
There are maintainer workflows and git trees with partially
incompatible notions!
That's an important point IMHO.
Say I need to apply a security patch to some package's git tree on
Salsa. How can I be sure to even create the same source tree as the
previous uploader? I don't know which tool the maintainer used, nor the options supplied to it, so I can't.
Thus I need to ignore the maintainer's git tree in favor of "apt-get source", manually apply the fix, upload that to the archive, then apply
the (hopefully) exact same patch to the actual git sources. Sorry but
WTF? [1]
[how about a design which includes:]
- there exists some tool that can extract the information from the
DSC, verify the git signature, and that it generates a tar with
the same content?
On 17263 March 1977, Matthias Urlichs wrote:
Still, we should find a way to keep the existing property of verifying
what the uploader signed to upload *without* requiring a third-party
$something to be available.
Verifying what the uploader signed is simple enough, it's a git tag.
You fetch it and verify that the hashes match ("git fsck"; current git
is hardened against SHAttered) and that it's signed by the correct
key.
That's a third party.
You want to verify t2u's work? Simple enough, run dgit and compare to
whatever t2u sent you. No $something required.
$something is required. It is not there with the source package on your mirror. It is a random other place. Sure, hosted by Debian, but it's
still elsewhere and another thing required to have.
Another way of doing this would be to teach t2u to simply push the tag
to an append-only git store. Then teach the builders that instead of
their equivalent of "apt-get source" they should fetch this tag from
our git store and run dgit (and then push the legacy source tarballs)
themselves.
Would that scheme work better for you?
Not with an unchanged archive structure.
Oh wait, you mean the builders generate the source package as it will
end up in the archive? Where is the difference in builders generating
vs. t2u generating here, as the other part, signature from uploader,
would still not be there?
Quoting Matthias Urlichs (2024-06-17 13:05:17)
[...]
Thus I need to ignore the maintainer's git tree in favor of "apt-get source", manually apply the fix, upload that to the archive, then apply
the (hopefully) exact same patch to the actual git sources. Sorry but
WTF? [1]
[...]
The topic of this GR is not streamlining Debian use of git, but allowing
a simpler path from existing messy git to acceptance into Debian.
Your WTF seems to be from a false assumption that git is central to
Debian package maintenance. It isn't. It is popular, but not central,
nor standardized.
The topic of this GR is not streamlining Debian use of git, but allowing
a simpler path from existing messy git to acceptance into Debian.
Thus I need to ignore the maintainer's git tree in favor of "apt-get source", manually apply the fix, upload that to the archive, then apply the (hopefully) exact same patch to the actual git sources. Sorry but WTF? [1]
[...]
The topic of this GR is not streamlining Debian use of git, but allowing
a simpler path from existing messy git to acceptance into Debian.
Is this a GR? If it is, don't we have a process that's designed to
eventually stop never-ending back and forth disagreements, like the many
that have been seen in these threads?
On Sun, Jun 16, 2024 at 03:31:25PM +0200, Matthias Urlichs wrote:
On 13.06.24 10:26, Sean Whitton wrote:
Yes. A proposal that has not yet engaged with the complexities of
3.0 (quilt) is not one in which we can yet have any confidence.
The proposal simply intends to do whatever the uploader would do to build the source package from a tagged git worktree, except in a controlled and sandboxed environment.
I fail to understand why we should have any less confidence in that than in whatever the uploader does manually to achieve the same result (we hope!!).
One could argue that neither matters. It is the outcome that matters: the source
package itself. That's what gets distributed.
Is this a GR?
If it is, don't we have a process that's designed to eventually
stop never-ending back and forth disagreements, like the many that have been seen in these threads?
Jonas Smedegaard writes ("Re: [RFC] General Resolution to deploy tag2upload [and 1 more messages]"):
Your WTF seems to be from a false assumption that git is central to
Debian package maintenance. It isn't. It is popular, but not central,
nor standardized.
git is central to most software maintenance in the world at large.
Not all, by any means. But, overwhelmingly, most.
In Debian it's unstandardised *unless* you use dgit push or tag2upload.
Then the git representation *is* standardised, albeit complex.
The topic of this GR is not streamlining Debian use of git, but allowing
a simpler path from existing messy git to acceptance into Debian.
I don't think this is true. tag2upload (like dgit) imposes a taxonomy
of git approaches, and defines precisely what each of these named
approaches means.
On 17.06.24 00:04, Joerg Jaspert wrote:
Still, we should find a way to keep the existing property of verifying
what the uploader signed to upload *without* requiring a third-party $something to be available.
Verifying what the uploader signed is simple enough, it's a git tag. You fetch it and verify that the hashes match ("git fsck"; current git is hardened against SHAttered) and that it's signed by the correct key.
You want to verify t2u's work? Simple enough, run dgit and compare to whatever t2u sent you. No $something required.
Oh wait, t2u isn't even "third party". It's a Debian tool running on properly-administered (we assume) Debian hardware, running just another build step in a sandbox.
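The two checks named above ("git fsck" and verifying the tag signature) can be sketched as follows. This is a minimal illustration, not part of tag2upload or dgit: the function names and the tag name in the usage note are made up for the example, and `git verify-tag` only tells you that the signature is valid for some key in the local GnuPG keyring. Deciding whether that key is the *correct* one (for instance, that it belongs to a DD or DM allowed to upload the package) is a separate step that this sketch leaves out.

```python
import subprocess


def tag_verification_commands(tag: str) -> list[list[str]]:
    """Return the git commands for the two checks described in the mail."""
    return [
        ["git", "fsck"],             # object hashes are internally consistent
        ["git", "verify-tag", tag],  # the tag carries a valid signature
    ]


def verify_tag(repo_dir: str, tag: str) -> bool:
    """Run both checks inside repo_dir; True only if every command exits 0."""
    for cmd in tag_verification_commands(tag):
        result = subprocess.run(cmd, cwd=repo_dir, capture_output=True)
        if result.returncode != 0:
            return False
    return True
```

For example, `verify_tag("/path/to/clone", "debian/1.0-1")` would run both commands in an existing clone; the path and tag name here are placeholders.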
Hi,
On Mon, 2024-06-17 at 08:30 +0200, Matthias Urlichs wrote:
On 17.06.24 00:04, Joerg Jaspert wrote:
Still, we should find a way to keep the existing property of verifying
what the uploader signed to upload *without* requiring a third-party
$something to be available.
Verifying what the uploader signed is simple enough, it's a git tag. You
fetch it and verify that the hashes match ("git fsck"; current git is
hardened against SHAttered) and that it's signed by the correct key.
That's not usable though to match to what dak gets.
You want to verify t2u's work? Simple enough, run dgit and compare to
whatever t2u sent you. No $something required.
No, I just want it not to duplicate authentication and authorization in incompatible ways. Sadly tag2upload developers explicitly do not want
that.
Oh wait, t2u isn't even "third party". It's a Debian tool running on
properly-administered (we assume) Debian hardware, running just another
build step in a sandbox.
It's a third party that would accept uploads that dak would reject for security and/or policy reasons (including security critical ones); that
is not easily fixable if tag2upload is deployed as is (and the
developers have indicated that they do not want to change that).
It essentially introduces an alternative authentication system (and authorization system as tag2upload seems to care about DM status) that *replaces* the one in dak *and* *disagrees* with it. Even when you fix one
of the instances where the systems disagree, the basic problem remains
(at least technical debt). It is very bad design to have multiple of these for a single system as you significantly increase the attack
surface (and one of these usually ends up with less maintenance than
the other). (Only one of the systems has to allow the upload, i.e., a
big "*OR*".)
Joerg Jaspert writes ("Re: [RFC] General Resolution to deploy tag2upload"):
On 17262 March 1977, Sean Whitton wrote:
I would ask you not to characterise the disagreement we are having as
merely over a technical detail.
You see this as personal? I don't, but if it is not technical, what
else?
I think Sean means it's not a detail. From our point of view, we're
talking about a critical property of our design.
Which behind the scenes? To who did you talk?
Firstly, I want to ask: would it have made any difference if we had
raised the matter in public again on -devel?
Based on your replies here, it seems that ftpmaster's objections are
still just as firm now as they have been over the past four years.
We wouldn't want to keep asking the same question on a list like
-devel, when we are pretty sure the answer will just be the same; that
would be rude to both ftpmaster and the rest of Debian.
Anyway, if we're going down this route:
I think we didn't speak to ftpmaster directly about this since 2019.
I don't want to name names in case this turns into a finger-pointing exercise, but, over the years, we have spoken to various people, with
varying levels of formality:
* We made fairly formal appeals to two sitting DPLs. What we got
was, basically, attempts at mediation, or facilitation of
discussions. We didn't see that as helpful, since we saw an
irreconcilable gap between our position and ftpmaster's.
Sean and I were under the impression that the most recent response
we got from a sitting DPL was sent to us after consulting with
ftpmaster.
* We have asked for help from two sitting members of the TC, and one
former DPL. I don't think any of those people would have spoken to
ftpmaster, but neither did they suggest that we should raise the
matter on -devel again.
One outcome of that was encouragement to give the talk I did at the
2023 Cambridge minidebconf. I explicitly stated that the project
was blocked by ftpmaster, but of course I don't expect ftpmaster to
have necessarily seen that talk.
When it comes to tag2upload, I believe it's something that most people
would want. At least it doesn't take away from any existing workflow or
force people to change their habits right away, so in terms of being
able to gain support for it, it has a lot going for it.
Michael Lustfield writes ("Re: [RFC] General Resolution to deploy
tag2upload [and 1 more messages]"):
Is this a GR?
It is not yet a formally proposed GR. So in one sense, no.
If it is, don't we have a process that's designed to eventually
stop never-ending back and forth disagreements, like the many that have
been seen in these threads?
Actually, it seems to me that these threads, while long, have mostly
avoided repetitive back-and-forth.
The formal GR discussion period is very short. We have had a couple
of important points raised here that imply changes to the resolution.
I think the thread so far has been very useful to help everyone
understand our proposal; to reconfirm that our position and
ftpmaster's are still irreconcilable; and to help us identify
questions we probably want to address in the FAQ we're preparing.
I don't think we would have achieved that with the formal GR
discussion period. We anticipated that there would be many questions,
which is why we started with a draft.
I agree that we don't want to drag this out. I have been trying to
avoid replying when I wouldn't be adding anything. I think Sean and
Russ have been doing the same.
And, we'll bring this to a formal GR soon, so hopefully you'll only
have to bear a few more weeks of this. In the meantime, thanks for
your forbearance.
Ian.
--
Ian Jackson <[email protected]> These opinions are my
own.
Pronouns: they/he. If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.
A few more weeks, eh?
To me, it seems like we're intentionally avoiding the GR process because we don't like the process and have decided to simply ignore it for the sake of extending the discussion.
I think if you want to step away from the implementation details, the
more abstract point is that you don't need data from outside the archive
(or a mirror of the archive) in order to verify that the source package
you downloaded has not been modified since then and who uploaded it.
As it happens though you can't tell if what's in the archive matches the uploader intent with tag2upload either.
All you can vet is that the tag2upload service claims it does. You may
think that's better, but neither of them are entirely free of risk.
There is another aspect he mentioned: he thinks the uploader needs to test
the build of the package. (In theory I agree, but there are situations
Everybody can upload totally untested packages even without tag2upload:
maybe tag2upload would make this marginally easier, but then I do not
believe that this is a compelling enough argument to offset the benefits
of a tag2upload-like service.
On 6/17/24 20:38, Jonathan Carter wrote:
When it comes to tag2upload, I believe it's something that most people would want. At least it doesn't take away from any existing workflow or force people to change their habits right away, so in terms of being
able to gain support for it, it has a lot going for it.
Is tag2upload completely orthogonal to any efforts to move all packages
to git-based team maintenance?
To me, it seems like we're intentionally avoiding the GR process because
we don't like the process and have decided to simply ignore it for the
sake of extending the discussion.
A few more weeks, eh?
To me, it seems like we're intentionally avoiding the GR process because we
don't like the process and have decided to simply ignore it for the sake of
extending the discussion.
Technically, the GR process advantages the ones proposing a GR
for giving them more time to think things through,
while the others are restricted to the short discussion period.
So doing it like this is more fair play.
On 17 Jun 2024, at 14:53, Ansgar 🙀 <[email protected]> wrote:
It essentially introduces an alternative authentication system (and authorization system as tag2upload seems to care about DM status) that *replaces* the one in dak *and* *disagrees* with it. Even when you fix one
of the instances where the systems disagree, the basic problem remains
(at least technical debt). It is very bad design to have multiple of these for a single system as you significantly increase the attack
surface (and one of these usually ends up with less maintenance than
the other). (Only one of the systems has to allow the upload, i.e., a
big "*OR*".)
Would an API for tag2upload to use satisfy that concern? It feeds in
a source package name and key fingerprint (or the signature, or
whatever’s deemed useful), dak replies whether it’s valid for
uploading. Then you don’t need to trust tag2upload’s authorisation
checks beyond that it adheres to what dak says each time.
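The query suggested above (package name and key fingerprint in, yes/no out) could look roughly like the sketch below. Everything here is hypothetical: dak exposes no such API in this thread, and `UploadAcl` and `may_upload` are names invented for illustration. The rule encoded is a deliberate simplification: a DD key may upload any source package, a DM key only the packages it has been granted.

```python
from dataclasses import dataclass, field


@dataclass
class UploadAcl:
    """Hypothetical stand-in for dak's view of who may upload what."""
    dd_fingerprints: set[str] = field(default_factory=set)
    # DM key fingerprint -> source packages that DM may upload
    dm_allowed: dict[str, set[str]] = field(default_factory=dict)


def may_upload(acl: UploadAcl, source: str, fingerprint: str) -> bool:
    """Answer the question tag2upload would ask before signing an upload."""
    if fingerprint in acl.dd_fingerprints:
        return True
    return source in acl.dm_allowed.get(fingerprint, set())
```

With a single decision function of this kind, tag2upload would only have to be trusted to ask the question every time and obey the answer, which is the property the mail above describes.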
Quoting Ian Jackson (2024-06-17 14:13:00)
git is central to most software maintenance in the world at large.
Not all, by any means. But, overwhelmingly, most.
In Debian it's unstandardised *unless* you use dgit push or tag2upload. Then the git representation *is* standardised, albeit complex.
Has Debian standardized dgit for git source package representation?
Or do you simply mean that dgit is standardized by the dgit developers,
same as various other tools like git-buildpackage arguably has
standardized their various structures as well?
The topic of this GR is not streamlining Debian use of git, but allowing a simpler path from existing messy git to acceptance into Debian.
I don't think this is true. tag2upload (like dgit) imposes a taxonomy
of git approaches, and defines precisely what each of these named approaches means.
Oh, if one of the proposers of the GR insists that the GR is indeed about streamlining Debian use of git, then I'll back off.
Which behind the scenes? To who did you talk?
Firstly, I want to ask: would it have made any difference if we had
raised the matter in public again on -devel?
Based on your replies here, it seems that ftpmaster's objections are
still just as firm now as they have been over the past four years.
We wouldn't want to keep asking the same question on a list like
-devel, when we are pretty sure the answer will just be the same; that
would be rude to both ftpmaster and the rest of Debian.
Anyway, if we're going down this route:
I think we didn't speak to ftpmaster directly about this since 2019.
I don't want to name names in case this turns into a finger-pointing exercise, but, over the years, we have spoken to various people, with
varying levels of formality:
* We made fairly formal appeals to two sitting DPLs. What we got
was, basically, attempts at mediation, or facilitation of
discussions. We didn't see that as helpful, since we saw an
irreconcilable gap between our position and ftpmaster's.
Sean and I were under the impression that the most recent response
we got from a sitting DPL was sent to us after consulting with
ftpmaster.
* We have asked for help from two sitting members of the TC, and one
former DPL. I don't think any of those people would have spoken to
ftpmaster, but neither did they suggest that we should raise the
matter on -devel again.
One outcome of that was encouragement to give the talk I did at the
2023 Cambridge minidebconf. I explicitly stated that the project
was blocked by ftpmaster, but of course I don't expect ftpmaster to
have necessarily seen that talk.
Ian.
Your WTF seems to be from a false assumption that git is central to
Debian package maintenance. It isn't. It is popular, but not central,
nor standardized.
git is central to most software maintenance in the world at large.
Not all, by any means. But, overwhelmingly, most.
Thanks, Joerg and Ian, for summing it up in this subthread and
arriving at the mail I am quoting. I think that, if we are to vote on
this topic, both sides to this disagreement should be made clear and explicit. The call for votes should include, or at least prominently
link to, where the disagreement lies. Seeing this as an example of
(failed) collaboration between two teams with differing goals, with
both not willing to compromise in their position, makes it much
clearer to those of us who have only watched at (quite) a distance.
On 17261 March 1977, Russ Allbery wrote:
Why is this your red line? Is it only that you don't want to add
another system to the trusted set, or is there something more specific
that you're concerned about?
There ought to be one point that is doing this step, not many, yes.
Includes that it is the delegated work and task description of FTPMaster
to do this, though that can be addressed by either us ending up running
it, or adjusting delegations. Not sure the latter ends up with happy
people, but is one existing way.
Also, currently we have the nicety that we store all signatures directly besides the source package, available for everyone to go and check.
Linking back to the actual Uploader, not to a random service key. You
can take that, run a gpgv on it and via the checksums of the files then
see that, sure, this is the code that the maintainer took and uploaded.
You do *not* need to trust any other random key on that. Not that of tag2upload. *AND* not that of FTPMaster.
Unsure those are the right words. We want to have the uploader create a signature over the content they want to have appear in the archive. In a
way, that this signature can be taken and placed beside the source, and
then independently verified. *Currently* this is done using .dsc files.
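To make the ".dsc plus checksums" step concrete, here is a minimal sketch of the second half of that check: given the text of a .dsc whose signature has already been verified separately (for example with gpgv), confirm that the files on the mirror match its Checksums-Sha256 field. The function names are made up for this example, and the parser handles only this one field; it is not a full Debian control-file parser.

```python
import hashlib


def parse_sha256_field(dsc_text: str) -> dict[str, str]:
    """Map filename -> SHA-256 hex digest from the Checksums-Sha256 field."""
    checksums: dict[str, str] = {}
    in_field = False
    for line in dsc_text.splitlines():
        if line.startswith("Checksums-Sha256:"):
            in_field = True
            continue
        if in_field:
            if not line.startswith(" "):  # continuation lines are indented
                break
            digest, _size, name = line.split()
            checksums[name] = digest
    return checksums


def files_match(dsc_text: str, files: dict[str, bytes]) -> bool:
    """True if exactly the listed files are present with matching digests."""
    expected = parse_sha256_field(dsc_text)
    if set(expected) != set(files):
        return False
    return all(
        hashlib.sha256(data).hexdigest() == expected[name]
        for name, data in files.items()
    )
```

Nothing outside the archive is consulted here: the .dsc, its signature, and the files it lists are all that is needed, which is the property being defended in the paragraph above.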
I basically assume that the uploader *does* need to have their source locally, no matter what. (Their git cloned).
I also do assume that the uploader will build things, to see if the
stuff they are going to "push to the archive" (and our users) actually
does what they intended it to do - and to test it.
Well, if the maintainer's system is broken into, it makes no difference if
a git tag or a dsc or whatever else is signed.
I think we cannot increment ourselves into a good solution here.
There are lots of other problems with the various mappings in use, each
makes different trade-offs. Accommodating them all inside tag2upload is
an ab-initio commitment to technical debt.
Joerg Jaspert <[email protected]> writes:
I also do assume that the uploader will build things, to see if the
stuff they are going to "push to the archive" (and our users) actually
does what they intended it to do - and to test it.
This is the assumption that I think is no longer valid given Salsa CI. It used to be that this was the only way to test a package; now we can do equally well and often better by letting Salsa CI do the hard work.
On Jun 17, 2024 6:23 PM, Ansgar 🙀 <[email protected]> wrote:
Hi,
On Mon, 2024-06-17 at 14:59 +0100, Jessica Clarke wrote:
On 17 Jun 2024, at 14:53, Ansgar 🙀 <[email protected]> wrote:
It essentially introduces an alternative authentication system (and authorization system as tag2upload seems to care about DM status) that
*replaces* the one in dak *and* *disagrees* with it. Even when you fix one
of the instances where the systems disagree, the basic problem remains
(at least technical debt). It is very bad design to have multiple of
these for a single system as you significantly increase the attack surface (and one of these usually ends up with less maintenance than the other). (Only one of the systems has to allow the upload, i.e., a
big "*OR*".)
Would an API for tag2upload to use satisfy that concern? It feeds in
a source package name and key fingerprint (or the signature, or whatever’s deemed useful), dak replies whether it’s valid for uploading. Then you don’t need to trust tag2upload’s authorisation checks beyond that it adheres to what dak says each time.
Hmm, a signed manifest solves that problem and also adds some integrity verification and possibility for third parties to check the signature itself as well.
Back to square one: Didier's proof of concept design is much better, as it solves all of the concerns. No need to trust a 3rd party key, packages are signed and identified with the uploader's key, and respect all ACLs. No
need to change anything to our infrastructure. Added benefit: packages must be reproducible to support it.
The point it isn't solving: contributors still need to learn how to build *source* packages locally. Is this a problem? I don't think so: we are talking about contributing to packaging anyways. Isn't this the bare
minimum knowledge to expect?
Thomas
Which behind the scenes? To who did you talk?
Firstly, I want to ask: would it have made any difference if we had
raised the matter in public again on -devel?
Based on your replies here, it seems that ftpmaster's objections are
still just as firm now as they have been over the past four years.
We wouldn't want to keep asking the same question on a list like
-devel, when we are pretty sure the answer will just be the same; that
would be rude to both ftpmaster and the rest of Debian.
I think we didn't speak to ftpmaster directly about this since 2019.
discussions. We didn't see that as helpful, since we saw an
irreconcilable gap between our position and ftpmaster's.
There are plenty of valid use cases that do not create a dsc locally.
Please be specific in why it is unacceptable to have a local tool do local (very quick) computation in a fully automated way for you. Why is this unacceptable?
The whole point is that dsc package is *not* source. It is not the format most commonly used for development work. It is an intermediate software generated artifact.
What point are you trying to make here? A .dsc is metadata for the upload indeed, just like the
.changes. Then what? Why should you care if a local tooling on your laptop is building it, adding it to the signed tag, and maybe (optionally) deleting your local copy after sending it to Salsa CI for the upload?
Everything after that: dsc, deb, Packages.gz, Release.gz are generated from that actual source tree and can be handled by appropriate automation.
Have you ever created a .dsc file by hand? I suppose the answer is no. So what is the trouble? Or are you saying I am proposing to do that? I suppose not as well. So what bothers you? :)
That your laptop needs to calculate the hash of your debian.tar.xz when tagging? Isn't this a very small deal to make, so we don't need to touch a bit of our infrastructure, auth and ACLs? Plus having no "a single key to sign them all" would be an
awesome feature.
How do you upload then? There's somewhere a script that actually creates
the .dsc and .changes files for upload, right?
So, you're proposing to upload a signed git tag. I'm proposing that you
do that, PLUS 2 signed artifacts within that git tag, that are later
checked (expected to be bit-by-bit identical) against something that's generated in a trusted environment.
Agreed 100%. I'm just voting for a different way to write the
automation, one where the .dsc and .changes *SIGNATURES* are produced on
a DD laptop, just like now. Everything else is the way you describe.
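The "bit-by-bit identical" check in this proposal amounts to comparing two sets of named artifacts and rejecting the upload on any difference. A minimal sketch, with a made-up function name and in-memory byte strings standing in for the .dsc and .changes files:

```python
def mismatched_artifacts(
    signed: dict[str, bytes], rebuilt: dict[str, bytes]
) -> list[str]:
    """Names of artifacts that are missing on either side or differ in content.

    `signed` holds what the uploader produced and signed locally; `rebuilt`
    holds what the trusted environment generated from the same git tag.
    An empty result means the two sets are bit-by-bit identical.
    """
    names = sorted(set(signed) | set(rebuilt))
    return [
        name
        for name in names
        if name not in signed
        or name not in rebuilt
        or signed[name] != rebuilt[name]
    ]
```

The scheme only works if source package generation is fully reproducible, since any difference in the regenerated files makes the comparison fail.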
On 6/18/24 10:03, Aigars Mahinovs wrote:
The point is that with certain git-centric workflows (like what Russ described for git-debrebase) there never is a *.dsc or a debian.tar.xz
or even an orig.tar.gz. Those are never there to be checksummed. And
the process for getting from the real git tree that a developer
*actually* does their work on and verifies the contents of to these generated source artifacts is sufficiently non-trivial that people end
up never actually verifying the files they are signing. The signature
on the dsc is signing something that people never actually check.
How do you upload then? There's somewhere a script that actually creates
the .dsc and .changes files for upload, right?
On Tue, 18 Jun 2024 at 17:44, Soren Stoutner <[email protected]> wrote:
From a security perspective, it makes sense to me that the DD should create
a .dsc and .changes and sign them, and then tag2upload should create them as well and verify they match exactly.
They will not. Translation from a git tree to a Debian source package
with dsc and changes is not a trivial operation.
I wonder if we have a good idea of what the project believes to be the case between #1 and #2:
1) Is the source of a package the debian source distribution?
2) Is the source of a package the VCS where the source is held?
Or, to extend it once more in the context of this discussion -- should the source be built by a buildd from the "true" source? Why do we bother having
a maintainer sign this intermediate artifact, like we used to with debs?
Even more extremely -- should we bother with dscs anymore if they're just
an intermediate artifact?
Most extremely -- do we need a new dpkg source format?
Should buildds build off git tags? Do we need to overhaul how we treat sources?
From a security perspective, it makes sense to me that the DD should
create a .dsc and .changes and sign them, and then tag2upload should
create them as well and verify they match exactly.
So far, from this thread, it looks like the decision from 2019 may still
stand, but I think there are still places to explore.
There ought to be one point that is doing this step, not many, yes.
Includes that it is the delegated work and task description of FTPMaster
to do this, though that can be addressed by either us ending up running
it, or adjusting delegations. Not sure the latter ends up with happy
people, but is one existing way.
Elsewhere in this thread, Jessica Clarke made the excellent suggestion
that perhaps the authentication check concern could be resolved by dak
providing an API for performing the authentication and authorization
check. I am embarrassed that I didn't think of that; thank you very much
to Jessica for that suggestion.
That gives me some hope that this point has a relatively neat solution, so
I'm going to focus on exactly what dak needs the uploader signature to
cover in order to accept the package.
Also, currently we have the nicety that we store all signatures directly
besides the source package, available for everyone to go and check.
Linking back to the actual Uploader, not to a random service key. You
can take that, run a gpgv on it and via the checksums of the files then
see that, sure, this is the code that the maintainer took and uploaded.
You do *not* need to trust any other random key on that. Not that of
tag2upload. *AND* not that of FTPMaster.
The dgit-repos server similarly archives the signed Git tag with the Git
tree over which it is a signature, ensuring that this is independent of
Salsa where the tag could potentially be deleted by someone. This is not
in the archive, of course, but I don't see any technical reason why some
version of that data couldn't also be uploaded to the archive if one
wanted to use the archive as a highly distributed backup of the dgit-repos
server. There is, however, the long-standing concern about any variation
on the 3.0 (git) source package format that the Git tree the maintainer
signed may contain non-free code somewhere in its history.
So here too, I'm not sure that this is inherently a blocker, although in
the past the FTP team has been reluctant to include in the archive the
data that is required to preserve a complete record of what is signed by a
Git tag. (One obvious potential solution is to only put a shallow clone
in the archive, so you can verify the signature but some of the
content-addressable store references are unresolved.)
Unsure those are the right words. We want to have the uploader create a
signature over the content they want to have appear in the archive. In a
way, that this signature can be taken and placed beside the source, and
then independently verified. *Currently* this is done using .dsc files.
Okay, so again I think it's easier to talk about specifics, so let me make
this concrete by using myself as the use case.
I use the git-debrebase workflow for maintaining most of my Debian
packages. What this means, for those who aren't familiar with it, is that
my workflow looks like this (this is idealized; I'm still migrating my
packages fully to this workflow so the specifics currently vary somewhat):
Now, I would like to use tag2upload rather than using dgit locally to make
the upload. I want to move my testing into Salsa CI so that my overall
workflow more closely matches the way that I do all of my development in
my day job. Salsa CI is great about not getting lazy and skipping test
steps just because I am in a hurry to get a package uploaded, and I can
capture every test that was useful and not have to remember to re-run it.
(This is the part that I haven't done yet; I know I want to do it and have
not yet found the time.)
What signed artifact do I need to provide so that the FTP team will be comfortable accepting my tag2upload-built source package?
Note, importantly, that the source package contains things that are not in
the files present in the working tree of a local Git checkout of my source
package. The patch descriptions and committer information and related
metadata are where they are supposed to be in Git: in the metadata for the
corresponding Git commit, not in a file in my working tree. The
transformation that puts that data into a 3.0 (quilt) source package is
not rocket science, but it's not trivial either.
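To make the point above concrete, here is a minimal Python sketch of how files under debian/patches/ can be synthesized from commit metadata rather than read from the working tree. It is an illustration only: the function name `synthesize_patches`, the file naming scheme, and the header layout are assumptions, not what dgit, git-debrebase, or tag2upload actually emit.

```python
import re


def synthesize_patches(commits):
    """Map commit records to debian/patches/* file contents.

    `commits` is a list of dicts with 'author', 'subject' and 'diff' keys,
    standing in for data that lives in Git history, not in the working tree.
    File naming and header layout are assumptions made for illustration.
    """
    files = {}
    series = []
    for index, commit in enumerate(commits, start=1):
        # Derive a filesystem-friendly name from the commit subject.
        slug = re.sub(r"[^a-z0-9]+", "-", commit["subject"].lower()).strip("-")
        name = f"{index:04d}-{slug}.patch"
        header = f"From: {commit['author']}\nSubject: {commit['subject']}\n\n"
        files[f"debian/patches/{name}"] = header + commit["diff"]
        series.append(name)
    # The series file lists the patches in application order.
    files["debian/patches/series"] = "".join(n + "\n" for n in series)
    return files
```

None of the returned paths exist in a patches-applied checkout, which is why a checksum list computed over the working tree alone cannot describe the resulting source package.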
The signed artifact that I'm naturally providing is a signature across the
entire Git tree, which includes all of the history and thus all of the
data that goes into the source package. So everything that goes into the
source package *is signed*, by me, when I trigger a tag2upload upload.
The problem comes when dak wants to verify the correspondence between that
data structure and the source package. It certainly can verify that my
Git tag is valid and it can verify that the tag specifies the correct
source package, version, and so forth. But if it wants to verify that the
construction of the debian/patches/* directory is correct, I think it
would have to perform the same transformation on my Git history that dgit
and tag2upload perform.
I basically assume that the uploader *does* need to have their source
locally, no matter what. (Their git cloned).
Yes, I agree. I don't think there's any way to avoid this: the source has
to be in the same place that the key is in, or close to in the case of
secure key storage, in order for the uploader to sign it and know what
they are signing.
I also do assume that the uploader will build things, to see if the
stuff they are going to "push to the archive" (and our users) actually
does what they intended it to do - and to test it.
This is the assumption that I think is no longer valid given Salsa CI. It
used to be that this was the only way to test a package; now we can do
equally well and often better by letting Salsa CI do the hard work.
On 17263 March 1977, Russ Allbery wrote:
What signed artifact do I need to provide so that the FTP team will be
comfortable accepting my tag2upload-built source package?
I do like that workflow. The one magic part seems to be the thing that creates debian/patches files out of your git tree. As soon as that step
is done, I think everything should be there that makes up the thing that
gets uploaded?
But I bet there are other ways to work that make it less trivial magic
than what you seem to have.
To answer your question from above: At the point the magic is done,
anything that lists the files that will appear when one unpacks the
source package, together with their checksums (ignoring metadata like
file timestamps).
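As a rough illustration of the kind of listing described here (files plus checksums, with timestamps ignored), a minimal Python sketch follows. The names `source_manifest` and `format_manifest` and the output layout are assumptions made for this example; they are not a format that dak, dgit, or tag2upload define.

```python
import hashlib
import os


def source_manifest(root):
    """Return sorted (relative_path, sha256_hex) pairs for regular files under root.

    Hypothetical helper: timestamps, ownership and permissions are ignored,
    and symlinks are skipped to keep the sketch short.
    """
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            entries.append((rel, digest))
    return sorted(entries)


def format_manifest(entries):
    """Render the manifest as 'digest  path' lines, one per file."""
    return "".join(f"{digest}  {path}\n" for path, digest in entries)
```

A text rendering like this could be what the uploader signs, but it only helps if the uploader can produce the unpacked source package view locally.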
On 17265 March 1977, Russ Allbery wrote:
Does anyone think I've missed anything?
Yah. About the whole first half of my mail.
The API details need to be defined, of course, but the basic idea is
there and sounds good, and I believe one can get t2u and dak to
interoperate with that with about all of the client side requirements
the t2u people want to have stay intact.
Does anyone think I've missed anything?
While I still answer to some of the rest of your mail, this api thing
sounds like a good way forward to me.
On Mon, 2024-06-17 at 14:59 +0100, Jessica Clarke wrote:
Would an API for tag2upload to use satisfy that concern? It feeds in a
source package name and key fingerprint (or the signature, or
whatever’s deemed useful), dak replies whether it’s valid for
uploading. Then you don’t need to trust tag2upload’s authorisation
checks beyond that it adheres to what dak says each time.
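The shape of such a check can be sketched in a few lines of Python. Everything here is hypothetical: `UploadAcl` and `may_upload` are invented names, and the two-level permission model (keys allowed for any package, plus per-package grants) is an assumed simplification of whatever dak actually stores.

```python
from dataclasses import dataclass, field


@dataclass
class UploadAcl:
    """Hypothetical stand-in for the permission data dak already holds."""
    full_uploaders: set = field(default_factory=set)  # fingerprints allowed for any package
    per_package: dict = field(default_factory=dict)   # source name -> set of fingerprints


def may_upload(acl, source, fingerprint):
    """Answer the single question the proposed API would ask."""
    # Normalize spacing and case so equivalent fingerprints compare equal.
    fpr = fingerprint.replace(" ", "").upper()
    if fpr in acl.full_uploaders:
        return True
    return fpr in acl.per_package.get(source, set())
```

With an interface of this shape, tag2upload would keep no permission data of its own; it would only pass on the package name and the key that signed the tag, and act on the answer.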
Hmm, a signed manifest solves that problem and also adds some integrity verification and possibility for third parties to check the signature
itself as well.
An API that gets the signed data (signed tag) with some metadata (which package, version, target suite, maybe some other bits of d/control,
would have to think; parts of that might also be in the tag) would
still allow having a single system make the decisions.
A downside is that integrity verification and third parties (possibly) verifying the data falls flat. For me integrity verification would be somewhat nice and third parties a bit less interesting (given they can
get the tag, compare files and possibly redo what tag2upload does if
they also care about .dsc and stuff).
We are not very good at doing integrated services though. (Please no RPC
via SSH forced commands...)
With regard to integrity verification, it's probably not fully clear
what one exactly wants there. Joerg's idea to have tag2upload run by ftp-master is related to that: if it was part of the archive, all
integrity promises still come from within the archive. (I don't really
want to run the service though; outsourcing is probably better.)
Arguably we do trust some external services somewhat already (like
buildds for binary packages).
Ansgar 🙀 <[email protected]> writes:
A downside is that integrity verification and third parties (possibly) verifying the data falls flat. For me integrity verification would be somewhat nice and third parties a bit less interesting (given they can
get the tag, compare files and possibly redo what tag2upload does if
they also care about .dsc and stuff).
Integrity verification of the source package construction by dak was the
part of this that I was the most worried about, because that's the place where it had looked like the tag2upload goals couldn't meet the FTP team's requirements. If that's something that can be relaxed, that's huge for being able to find an implementation that hopefully makes everyone happy.
My understanding of the FTP team position is that the uploader must
generate and sign either the final source package, or the files that
constitute the unpacked view of the final source package, or a
trivial variation thereof, and include that signature in the upload.
[...]
My understanding of the tag2upload developer position is that this requirement prohibits the goal of tag2upload. People who want to
build the source package locally can already use the same algorithm
today; that's what dgit is. The whole point of the tag2upload
project is to remove the requirement that the package uploader
install dgit or equivalent software locally and build the source
package locally. This FTP team requirement therefore makes it
impossible for tag2upload to proceed; any system that would have the
required property would fail to accomplish the core goal of the
tag2upload project. Therefore, this delegate decision is blocking
the deployment of tag2upload.
The code the tag2upload developers wrote is perfectly able to do that:
git-debpush, the tag2upload client by the tag2upload developers,
requires neither dgit nor building the source package, and is documented
in the initial mail about the GR as the tool people would use. It already
looks at patched and unpatched source trees (and checks that patches
apply) and compares them with the tree in Git.
It could easily compute an integrity hash as well.
Or is git-debpush itself incompatible with the goals of tag2upload?
What would a client-side compatible with the goals then look like?
Will such a client be available before the GR?
I hope the tag2upload developers requirements will not make it
impossible to proceed and they will not continue to block the
deployment of tag2upload.
This is *exactly* the same situation as we already have with
source-only uploads. There is a state of the software upload that the
developer signs off on and then there are further technical build
artifacts that the developer does *not* sign - they are signed by the
technical systems that generated those artifacts. And those systems are
centrally maintained for scalability, convenience and security.
Just include a hash
similar to [1] in the signed tag data
it might need minor changes if
one cares about file permissions[2].
I agree with that, but it effectively changes what we consider a "source package", and that comes with all the baggage of archival:
- we need to store the actual contents in the archive, not just a
reference to an online service, or the online service becomes part of
the archive.
- we need to distribute that on CDs and mirrors (which both have size constraints)
- we need to keep and archive also the tools required for processing
- we still need to be able to comply with removal requests
- we need to be able to deal with "epochs" in package development
For example, the "clinfo" packages that were shipped in jessie and in
stretch and following are completely unrelated, they just have similar
enough output that one could be used as a replacement for the other.
If those had been git-maintained packages, how would those have been archived? Is the dgit package namespace separate from Debian source
package names?
On Wed, 19 Jun 2024 at 07:54:45 +0200, Ansgar 🙀 wrote:
Just include a hash
similar to [1] in the signed tag data
Prior art: this is conceptually the same as git-evtag from
src:git-evtag. You can see real-world use of git-evtag in the upstream
tags (e.g. v0.9.0) of src:bubblewrap.
it might need minor changes if
one cares about file permissions[2].
If this is something that will be used as a security mechanism, then
I think it probably needs to represent symbolic links as well. I think git-evtag does (it checksums all git "blobs" and I believe that includes symlinks), but it seems sumdb/dirhash behaves as though symlinks didn't exist.
git specifically *doesn't* care about file permissions, beyond a 1-bit representation of whether it's executable or not, so anything like
tag2upload that is based on git-as-source will have to cope with mtimes
and detailed permissions possibly differing between what was obtained
from git and what's in the .dsc. When people have talked about code being "treesame" elsewhere in this thread, I believe they mean "all facts that
git tracks in its tree are the same, facts that git does not track might
not be".
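A minimal Python sketch of a digest with these "treesame"-style properties is shown below: it covers paths, contents, symlink targets and the executable bit, and ignores mtimes and detailed permissions. The function name `tree_digest` and the record encoding are assumptions for illustration; this is neither git's tree hash, nor git-evtag's algorithm, nor sumdb/dirhash.

```python
import hashlib
import os
import stat


def tree_digest(root):
    """Hash paths, entry kinds and contents under root; ignore mtimes and full modes.

    Illustrative only: not git's tree hash and not git-evtag's algorithm.
    """
    records = []
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk lists symlinks to directories in dirnames, so scan both lists.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            st = os.lstat(path)
            if stat.S_ISLNK(st.st_mode):
                kind, payload = b"link", os.fsencode(os.readlink(path))
            elif stat.S_ISREG(st.st_mode):
                # Only a single executable bit is recorded, as git does.
                kind = b"exec" if st.st_mode & stat.S_IXUSR else b"file"
                with open(path, "rb") as fh:
                    payload = fh.read()
            else:
                continue  # directories are implied by the paths of their contents
            records.append((os.fsencode(rel), kind, hashlib.sha256(payload).digest()))
    h = hashlib.sha256()
    for rel, kind, digest in sorted(records):
        h.update(kind + b"\0" + rel + b"\0" + digest)
    return h.hexdigest()
```

Two trees that differ only in timestamps or in permission bits other than the owner-executable bit produce the same digest, while a changed symlink target or a flipped executable bit changes it.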
I wonder if we have a good idea of what the project believes to be the case between #1 and #2:
1) Is the source of a package the debian source distribution?
2) Is the source of a package the VCS where the source is held?
Or, to extend it once more in the context of this discussion --
should the source be built by a buildd from the "true" source? Why
do we bother having a maintainer sign this intermediate artifact,
like we used to with debs?
Even more extremely -- should we bother with dscs anymore if they're
just an intermediate artifact?
Most extremely -- do we need a new dpkg source format? Should
buildds build off git tags? Do we need to overhaul how we treat
sources?
Galaxy brain extremely -- what does GPL compliance mean if the dsc is not the true source? (ok this one isn't serious, there's no doubt it's corresponding source :) )
Ansgar 🙀 <[email protected]> writes:
It doesn't require dak to reproduce whatever steps tag2upload runs to
generate the .dsc from that or source packages to be reproducible; the
uploader only needs to know which files end up in the source package,
something I would expect an uploader to know.
No, the uploader doesn't know this. Some of the files (the ones in debian/patches) are synthesized from Git commits and do not exist at all
in the checkout of the Git tree, which will often be in patches-applied
form.
On Tue, 2024-06-18 at 18:25 -0700, Russ Allbery wrote:
Integrity verification of the source package construction by dak was
the part of this that I was the most worried about, because that's the
place where it had looked like the tag2upload goals couldn't meet the
FTP team's requirements. If that's something that can be relaxed,
that's huge for being able to find an implementation that hopefully
makes everyone happy.
I don't think it is hard to include that as well. Just include a hash
similar to [1] in the signed tag data; it might need minor changes if
one cares about file permissions[2].
It doesn't require dak to reproduce whatever steps tag2upload runs to generate the .dsc from that or source packages to be reproducible; the uploader only needs to know which files end up in the source package, something I would expect an uploader to know.
- we need to store the actual contents in the archive, not just a
reference to an online service, or the online service becomes part of
the archive.
Effectively the dgit.debian.org becomes the archive or the snapshot service of the git view of the Debian source packages.
People interacting with Debian on the Debian source package level can keep doing that exactly as before. But to access a deeper, git level of source you would, naturally, have to use different tools and access a different service.
- we need to distribute that on CDs and mirrors (which both have size
constraints)
Do we? Already Debian source packages are in reality a separate set of DVDs (we don't even provide source on CDs anymore)
and only a subset of mirrors. While having an option to have a dgit server mirrored might be nice, it does not really have to be inside the current mirror
or archive structure. And it does not have to be a blocker.
- we need to keep and archive also the tools required for processing
Which should be trivial as long as they are packaged.
For example, the "clinfo" packages that were shipped in jessie and in
stretch and following are completely unrelated, they just have similar
enough output that one could be used as a replacement for the other.
If those had been git-maintained packages, how would those have been
archived? Is the dgit package namespace separate from Debian source
package names?
Well, there are many ways to do that. For example have a merge commit
that merges in the new upstream into the old upstream branch where the
merge commit itself deletes all old files and replaces them with the new
files. The Debian changelog would note the changed upstream and move on
as normal.
Ansgar 🙀 <[email protected]> writes:
It doesn't require dak to reproduce whatever steps tag2upload runs to generate the .dsc from that or source packages to be reproducible; the uploader only needs to know which files end up in the source package, something I would expect an uploader to know.
No, the uploader doesn't know this. Some of the files (the ones in debian/patches) are synthesized from Git commits and do not exist at all
in the checkout of the Git tree, which will often be in patches-applied
form.
On Wed, 2024-06-19 at 07:45 -0700, Russ Allbery wrote:
No, the uploader doesn't know this. Some of the files (the ones in
debian/patches) are synthesized from Git commits and do not exist at
all in the checkout of the Git tree, which will often be in
patches-applied form.
Hmm, I did not think of people effectively forking the upstream project instead of doing only packaging.
People could just move to native packages if they do that: that also
works for changes to binary files, no longer requires synthesizing
patches and thus brings the Debian source package closer to the Git
state. This is also easier to compare to maintainers' repository.
(I'm a bit biased to only doing packaging, ideally unrelated to the
upstream source code as it only describes how to use said code.)
People could just move to native packages if they do that: that also
works for changes to binary files, no longer requires synthesizing
patches and thus brings the Debian source package closer to the Git
state. This is also easier to compare to maintainers' repository.
Sure, we could tell people to use 3.0 (native) for everything with Debian changes to the upstream source and stop trying to use 3.0 (quilt). You're not the first person to make that suggestion, and it has some real merit
for simplicity of representation of source packages. But that means that
we now can't share the .orig.tar.gz file between Debian package releases, which has implications for the size of the archive.
Ansgar 🙀 <[email protected]> writes:
People could just move to native packages if they do that: that also
works for changes to binary files, no longer requires synthesizing
patches and thus brings the Debian source package closer to the Git
state. This is also easier to compare to maintainers' repository.
Sure, we could tell people to use 3.0 (native) for everything with Debian changes to the upstream source and stop trying to use 3.0 (quilt). You're not the first person to make that suggestion, and it has some real merit
for simplicity of representation of source packages.
But that means that
we now can't share the .orig.tar.gz file between Debian package releases, which has implications for the size of the archive.
It breaks tools to
extract the patches from a 3.0 (quilt) package. And, based on previous debian-devel discussions of exactly this, people have a lot of strong opinions about not using 3.0 (native) when there is an upstream.
tag2upload is intended to work with what package maintainers are doing
today as much as reasonably possible, not to force them to use different workflows and source package representations.
On Wed, 2024-06-19 at 08:39 -0700, Russ Allbery wrote:
Sure, we could tell people to use 3.0 (native) for everything with
Debian changes to the upstream source and stop trying to use 3.0
(quilt). You're not the first person to make that suggestion, and it
has some real merit for simplicity of representation of source
packages.
Yes, it is both simpler and allows for more integrity guarantees.
Other ideas like waldi's 3.0 (gitarchive) also go in that direction.
I'm interested what other ftp-masters prefer when considering a trade
off between space and additional integrity guarantees here. I have a preference for the integrity side.
Well, it doesn't change what package maintainers do as the purpose of tag2upload is that package maintainers don't have to think about source package representation? So changing those should not affect maintainers
much?
If dgit becomes the archive, not being able to mirror it is a major
regression.
It's not quite as trivial as I'd like to pack git archives (shallow or
not) so that they can be mirrored and mostly-seamlessly unpacked at the destination, but it's possible. You can even do it incrementally.
I think the main problem with 3.0 (native) without a canonicalization step for maintainer workflows is that it forces patches-applied. This is
totally fine with *me*, since this is how I want to work on all of my packages, but as I recall from past discussions it is very much not fine
with a lot of maintainers. Some folks really do want to directly maintain
a stack of patches in debian/patches. This breaks with 3.0 (native)
because 3.0 (native) turns off all the dpkg mechanisms to apply those patches. Now you have to add some goo to debian/rules to apply the
patches during the build and we're back to the world of dpatch and I don't think any of us want that.
So, I think this reintroduces the same problem with a different set of
source packages and transformations: the Git tree doesn't represent the format of the buildable source package, and there's no way to easily
provide dak with the final form of the source package because that has to
be constructed by applying all of the patches.
(And in case you're now wondering whether tag2upload can just bifurcate
here and produce 3.0 (native) for patches-applied and 3.0 (quilt) for patches-unapplied, I don't think that works either. There are yet other cases that we haven't talked about. For just one example, I believe one large maintainer team uses a combination of some changes in debian/patches and some changes committed directly to the upstream code. I personally would not do this, but it is supportable and supported and they have their reasons for wanting to do it that way. See dpkg-source --auto-commit.)
tag2upload canonicalizes all of this random stuff to 3.0 (quilt) with specific predictable properties, which has some real and non-trivial
benefits for everything downstream that wants to analyze the archive.
I think it's also a trade off in supported workflows. If we start telling people exactly how they have to use Git to work on Debian packages, we can simplify a lot of things, but wow do I ever not want to open that can of worms. Every time we open it on debian-devel, it's a giant mess. (Even more than this thread!)
Well, it doesn't change what package maintainers do as the purpose of tag2upload is that package maintainers don't have to think about source package representation? So changing those should not affect maintainers much?
I wish it wouldn't change what package maintainers do, but the only way I think that works is to interpose a relatively complex build step between
the maintainer representation and the archive, which is exactly what we're currently stuck on.
Other ideas like waldi's 3.0 (gitarchive) also go in that direction.
Similarly, I'm not sure a source package based on Git avoids the need for
a source package build system. There are a ton of ways that maintainers store things in Git, and I'm not sure it makes sense to upload all of
those as-is.
The things that break are slightly different, but, for instance, some maintainers do not want the upstream source in the same
branch as their Debian packaging files. You may have to add quite a lot
of unwanted complexity to 3.0 (gitarchive) to represent those cases.
Or reintroduce a source package build step, in which case we're back to where
we started.
I'm interested what other ftp-masters prefer when considering a trade
off between space and additional integrity guarantees here. I have a preference for the integrity side.
I think it's also a trade off in supported workflows. If we start telling people exactly how they have to use Git to work on Debian packages, we can simplify a lot of things, but wow do I ever not want to open that can of worms. Every time we open it on debian-devel, it's a giant mess. (Even
more than this thread!)
On Wed, 2024-06-19 at 11:18 -0700, Russ Allbery wrote:
(And in case you're now wondering whether tag2upload can just bifurcate
here and produce 3.0 (native) for patches-applied and 3.0 (quilt) for
patches-unapplied, I don't think that works either. There are yet
other cases that we haven't talked about. For just one example, I
believe one large maintainer team uses a combination of some changes in
debian/patches and some changes committed directly to the upstream
code. I personally would not do this, but it is supportable and
supported and they have their reasons for wanting to do it that way.
See dpkg-source --auto-commit.)
Then those packages can't use tag2upload as is. That doesn't seem to be
a critical problem as tag2upload doesn't support all cases anyway.
tag2upload canonicalizes all of this random stuff to 3.0 (quilt) with
specific predictable properties, which has some real and non-trivial
benefits for everything downstream that wants to analyze the archive.
Some of this random stuff, not all of it.
All these transformations remind me of running autoreconf to include generated configure scripts in the release tarballs. That is a
relatively complex build step between the maintainer representation and
the release. Running autoreconf at that stage has come a bit out of
fashion, but I feel we talk about hanging on to that here...
I also think the maintainer should have a chance to know what actually
ends in the source package that gets used as input by buildds. Complex transformations happening on a remote black-box do not make that easier.
On Wed, Jun 19, 2024 at 11:18:36AM -0700, Russ Allbery wrote:
Similarly, I'm not sure a source package based on Git avoids the need
for a source package build system. There are a ton of ways that
maintainers store things in Git, and I'm not sure it makes sense to
upload all of those as-is.
Do you have some examples of weird things?
But why would tag2upload need to support them? It is something new, it
can tell people: we support the following, use it or not.
IMHO it is one of the problem of Debian that we fail to move forward.
I'm interested what other ftp-masters prefer when considering a trade
off between space and additional integrity guarantees here. I have a
preference for the integrity side.
I think it's also a trade off in supported workflows. If we start telling
people exactly how they have to use Git to work on Debian packages, we can
simplify a lot of things, but wow do I ever not want to open that can of
worms. Every time we open it on debian-devel, it's a giant mess. (Even
more than this thread!)
Well, it doesn't change what package maintainers do as the purpose of
tag2upload is that package maintainers don't have to think about source
package representation? So changing those should not affect maintainers
much?
I wish it wouldn't change what package maintainers do, but the only way I
think that works is to interpose a relatively complex build step between
the maintainer representation and the archive, which is exactly what we're
currently stuck on.
For repeated downloads one could use Git to get incremental updates and
use only the integrity information from the archive (if desired).
I'm interested what other ftp-masters prefer when considering a trade
off between space and additional integrity guarantees here. I have a preference for the integrity side.
The point is to support a wide variety of workflows and not impose a
workflow on the maintainer. The cost is a source package build step.
Maybe there are some workflows that can be supported without that, but
that's not the point; in the general and hardest cases, the build step is
required and the transformations are not trivial. The goal is that the
maintainer doesn't have to reproduce that step and sign the results; they
can sign a Git tag and let tag2upload do the work.
You and Joerg both sounded like you were considering accepting that. Specifically, you said:
| A downside is that integrity verification and third parties (possibly)
| verifying the data falls flat. For me integrity verification would be
| somewhat nice and third parties a bit less interesting (given they can
| get the tag, compare files and possibly redo what tag2upload does if
| they also care about .dsc and stuff).
"Somewhat nice" is not "this is a blocking requirement." It's "I don't
like that we don't have this, but maybe there's some room here." So... is
there some room here?
From my side, it hasn't changed. A new way of representing things is not
a blocker.
On 17265 March 1977, Russ Allbery wrote:
I wish it wouldn't change what package maintainers do, but the only way
I think that works is to interpose a relatively complex build step
between the maintainer representation and the archive, which is exactly
what we're currently stuck on.
Well, no. We *can* say "This new thingie *only* works this way. If you
want it, your package has to look like this, if not, do it whatever way
you prefer, but this new thing then is not for you". I don't think that's
bad.
I don't think it's bad in any inherent way, but it's not tag2upload. It's not the thing that the developers have been working on, it doesn't solve
the problems they're trying to solve, and it doesn't let people use the workflows that they want to support.
On Wed, 2024-06-19 at 14:43 -0700, Russ Allbery wrote:
I don't think it's bad in any inherent way, but it's not tag2upload.
It's not the thing that the developers have been working on, it doesn't
solve the problems they're trying to solve, and it doesn't let people
use the workflows that they want to support.
You basically say "nothing would work at all".
Is any change a hard blocker from the tag2upload team perspective? Or
is there some room for changes, even though it would be a design that is
not identical to the one currently proposed by the tag2upload
developers?
Because from my perspective it mostly looks like ftp-master is willing
to find some compromise, but the tag2upload side is hard blocking on any
possible change.
Removing nearly all usefulness from the system and preventing it from
getting more useful over time is not a compromise. That is blocking
by a wrecking amendment.
On Thu, 20 Jun 2024, 01:03 Ansgar 🙀, <[email protected]> wrote:
Hi,
On Wed, 2024-06-19 at 14:43 -0700, Russ Allbery wrote:
I don't think it's bad in any inherent way, but it's not tag2upload.
It's not the thing that the developers have been working on, it doesn't
solve the problems they're trying to solve, and it doesn't let people
use the workflows that they want to support.
You basically say "nothing would work at all".
Is any change a hard blocker from the tag2upload team perspective? Or
is there some room for changes, even though it would be a design that is
not identical to the one currently proposed by the tag2upload
developers?
Because from my perspective it mostly looks like ftp-master is willing
to find some compromise, but the tag2upload side is hard blocking on any
possible change.
If there is absolutely no space for changes, then it's probably not
useful to have any discussion as we would just turn in circles.
Ansgar
I appreciate that you're trying really hard to find a way to represent the Git tree directly in a source package so that no build step is required
and building the tree of files locally is trivial. I understand that you truly think that this accomplishes the same goal with some acceptable
lossage around the edges. But it doesn't; it's missing the point. The point is that there are a wide variety of potential transformations
between the Git tree and the source package required to accommodate the
range of Git workflows used in Debian today *and tomorrow*. Having a
source package build system in the middle is what allows us to *decouple*
the workflow from the build output so that people can use the workflow
that works best for them and we get standardized Debian source packages
out the other end.
But the number of workflows
that can be supported under that restriction is extremely limited without making the uploader build the source package locally, still. It might be useful for some relatively simple cases, but it doesn't move the source package build step to a Debian project infrastructure system and it still requires that the uploader build the source package file tree locally.
At the risk of trying to argue by analogy, it feels akin to saying "okay,
you can have a binary buildd, but only if it doesn't use a compiler and
only copies files around." Yes, that's a compromise compared to no binary buildds, but in a way that makes the whole picture more complicated and doesn't achieve the point of the design.
On Wed, 2024-06-19 at 16:06 -0700, Russ Allbery wrote:
I appreciate that you're trying really hard to find a way to represent
the Git tree directly in a source package so that no build step is
required and building the tree of files locally is trivial. I
understand that you truly think that this accomplishes the same goal
with some acceptable lossage around the edges. But it doesn't; it's
missing the point. The point is that there are a wide variety of
potential transformations between the Git tree and the source package
required to accommodate the range of Git workflows used in Debian today
*and tomorrow*.
Let us talk about *today*. How many packages would not be possible to
upload via tag2upload if one required a signature covering content of packages? Is it 0.1%? Is it 90%?
For *tomorrow* we might change things in the future. Some things like arbitrary code execution at .dsc construction time are fairly useful
after all (required for some workflows, even when it might not change
the files ending in the source package).
Pretty much all changes in Debian (say systemd, usrmerge, ...) happened incrementally. Why should that not be appropriate here? (Or was a slow
move wrong in retrospect and we should have decided to support only
systemd and drop sysvinit in 2014, and to move to only usrmerge also
several releases earlier?)
Why should there be a standardized Debian source package in the end
(where tag2upload might try to build quilt patches) when nobody but a
machine is supposed to use them?
Also, if they are standardized why are there options for the maintainer
to control how the source package gets constructed? (Like an option to control how many patches end up in d/patches.)
Similarly, if one has a package that could be dealt with by a limited tag2upload, and then upstream changed something that nudged you into the problematic territory, you'll be confronted with having to abandon tag2upload, or perhaps having to start doing trickery to live within the limitations (e.g. performing the problematic steps and then committing
the results of that to git, say, which will just make the package
horrible).
If that's the choice available, I'll be sticking with local dgit from
the start, because at least that's going to be able to deal with
tomorrow's version from upstream.
If that's the rational choice which any well-informed uploader
will adopt, then such a limited tag2upload really serves no purpose.
Ansgar 🙀 <[email protected]> writes:[...]
Let us talk about *today*. How many packages would not be possible to upload via tag2upload if one required a signature covering content of packages? Is it 0.1%? Is it 90%?
I personally do not have those numbers. I know there are a huge variety
of workflows mostly from previous debian-devel discussions, which gives me
an appreciation for the scope of the problem that tag2upload solves but doesn't give me numbers.
I checked on trends.debian.net to see if by
chance it was trying to collect workflow data, but the closest thing to relevant is graphs showing the overwhelming popularity of 3.0 (quilt) as a source package format.
I can say that for the packages I maintain personally, 100% of them would
not be possible to upload this way at some point over time. As mentioned previously, I frequently have reasons to carry a Debian-specific patch for some period of time (which is a file that's generated at source package
build time)
On 21.06.24 13:18, Ansgar 🙀 wrote:
If we want to drop integrity checks,
IMHO tag2upload does not drop integrity checks, for the simple reason
that a maintainer who uses dgit today does not perform any such test.
I would like to at least know how
many packages would benefit from such a change.
At minimum, all packages that already use dgit: https://browse.dgit.debian.org/ contains ~3000 repositories.
On 21.06.24 14:33, Ansgar 🙀 wrote:
I think you misunderstand my question: I would like to know which packages *could not* provide an integrity hash for the archive.
Ah, sorry.
Answer: Lots of them. AFAIK you can simply click on a package in https://browse.dgit.debian.org/ and check whether the latest "archive/debian/*" and "debian/*" tags are different.
Of the last 50 uploads, I counted 42. Thus, >80%.
And you do not have a working tree containing either the patched source
that would allow computing an integrity hash using 3.0 (native) or
separate debian/patches where 3.0 (quilt) would work?
On 21.06.24 14:33, Ansgar 🙀 wrote:
IMHO tag2upload does not drop integrity checks, for the simple reason that a maintainer who uses dgit today does not perform any such test.
The change request is for the archive to drop them.
Umm, no. We're not dropping the check; we required a maintainer's
signature before, and we still do so after. We just place a packaging-and-possibly-source-mangling server between A and B.
Also it's not an integrity check. It doesn't verify that the files in
the uploaded tarball correspond to either the git tag of the source the maintainer worked on *or* the contents of their file system when they
ran "dpkg-buildpackage -S".
Fundamentally, the fact remains that when you do a "dgit push-source"
today, dak integrity-checks some tar files that were generated and
signed on a random machine with random and possibly-malevolent software
that could have silently replaced any file it wanted to – files which
you currently can't auto-verify independently and which no human will
examine (unless there's a strong external suspicion of foul play).
The source
is now a git tag on Salsa whose history people who work on the code
actually use and examine, the dgit job runs in a VM with defined state,
and the correctness of its output is easily machine-verifiable.
If we are talking about hypothetical ways upstream might nudge you into:
there is a large territory that requires arbitrary code execution
at build time (say to instantiate d/control from a template) which
neither proposal for tag2upload allows.
Ansgar 🙀 <[email protected]> writes:
And you do not have a working tree containing either the patched source that would allow computing an integrity hash using 3.0 (native) or
separate debian/patches where 3.0 (quilt) would work?
Wait, why would I ever want to upload a 3.0 (native) package for a
non-native package with the tooling as it is today in Debian?
If I did
that *today*, which I thought was the context of the question, that would
be a significant regression and I would lose functionality that I, and
other people downstream of me, rely on.
But to take the most obvious example, I rely on
<https://udd.debian.org/patches.cgi> to communicate with upstream about
exactly what changes I'm applying to their code and allow them to easily
pull those patches any time they want. You would need to develop a
replacement that worked with 3.0 (native) if you wanted to make this
change in Debian.
On 21.06.24 16:57, Ansgar 🙀 wrote:
So your suggested method of measurement and the resulting 80% seem very
questionable and only usable as a very generous upper boundary.
You're right, but AFAICT this can only happen when the two tags are
consecutive; I spot-checked a couple of them to verify this.
Excluding packages where this is the case drops the count to … 39. Plus
26 for another 50.
Thus, not *that* generous.
On Fri, 2024-06-21 at 16:55 +0200, Matthias Urlichs wrote:
Umm, no. We're not dropping the check; we required a maintainer's
signature before, and we still do so after. We just place a
packaging-and-possibly-source-mangling server between A and B.
So we drop the integrity check using the maintainer's signature in the archive...
On Fri, 2024-06-21 at 08:29 -0700, Russ Allbery wrote:
Wait, why would I ever want to upload a 3.0 (native) package for a
non-native package with the tooling as it is today in Debian?
As far as I understand this whole thread is about changing the tooling.
Why do you point them to build artifacts instead of just the diff
(including individual commits) between the upstream tree and Debian tree
in Git?
I would still like to understand how your packages would not work with
the suggested integrity check. Could you give an example, possibly with
a reference to a Git repository as well?
Ansgar 🙀 <[email protected]> writes:
I would still like to understand how your packages would not work with
the suggested integrity check. Could you give an example, possibly with
a reference to a Git repository as well?
https://tracker.debian.org/pkg/tf5
See the "debian patches" link in the right sidebar.
Ansgar 🙀 <[email protected]> writes:
On Fri, 2024-06-21 at 08:29 -0700, Russ Allbery wrote:
Wait, why would I ever want to upload a 3.0 (native) package for a
non-native package with the tooling as it is today in Debian?
As far as I understand this whole thread is about changing the tooling.
This whole thread is about deploying something that already works with our
existing tooling and doesn't require boil-the-ocean changes to Debian
infrastructure or workflows.
On 21.06.24 17:37, Ansgar 🙀 wrote:
I also wrote:
(Please also consider in your reply the suggested changes to make
hashing possible/easier...)
Did you consider that in your new analysis? If yes, how?
No, but I just checked: among the newest 25 packages in browse.dgit.d.o,
16 contain a d/patches directory and would thus (presumably) not be
eligible for creating the hash locally.
This whole thread is about a draft GR to override a FTP Master decision
based on a claim that they had refused to engage with the tag2upload developers for years to explain their concerns or work on resolving
them.
None of that turned out to be accurate.
The only published branch (debian/unstable) looks like it would be
trivial for git-debpush to compute an integrity hash as suggested
using only a bit of shell around Git commands (no dpkg, dgit or anything)?
(AFAIU git-debpush would already check that patches apply cleanly here,
so it could provide hashes of either patches-applied or
patches-unapplied.)
Scott Kitterman <[email protected]> writes:
This whole thread is about a draft GR to override a FTP Master decision
based on a claim that they had refused to engage with the tag2upload
developers for years to explain their concerns or work on resolving
them.
None of that turned out to be accurate.
tag2upload is still being blocked by the FTP team so far as I can tell (I
don't understand if Joerg's last message changes this), and a GR is still
the only way to unblock that work that I can see.
It is true that we have now finally had the discussion, with actual
engagement, that we needed to have. (And this is exactly why I am so in
favor of draft GRs for controversial proposals. That final check is often
so incredibly useful at uncovering communication problems and getting
discussions to happen.)
But so far, this has not resolved the problem that one team's work is
being blocked by a delegate decision with no obvious path forward that
achieves their goals. Maybe we will still be able to resolve this: for a
brief moment, it looked like there was some movement, and then Ansgar
walked it back. But I'm still willing to talk about that (and also
because Joerg asked to keep talking about it for a bit longer).
As things currently stand, though, a GR is still the only path forward.
I think it would be better to reset and actually have the conversation
that was assumed to have happened before taking the step of a GR.
I see broad agreement on the goals of tag2upload (at least to a certain
level of detail)
and I don't think there's any clear evidence that a solution that meets
those goals while also addressing the FTP Master's concerns isn't
possible.
This is the soapbox that I climb onto whenever a GR is proposed, and I
guess I need to climb onto it again. More fundamentally than my position
on any given GR, I am on team closure. One of the worst and, I would
argue, actually abusive things that Debian systematically does to people
is string them along for long periods of time saying neither yes nor no. I personally would rather be told "no" than "maybe" if the "maybe" involves staying in limbo for months or years.
Obviously it's not my GR and I'm not going to make any of those decisions. Maybe the tag2upload maintainers would prefer "maybe" to "no" and that's their choice to make. But if it were me, I'd want this concluded here and now so that I can either deploy my thing or abandon it and find something else to do with my time and, more importantly, my emotional energy.
We've just done a whole ton of work to reach a better shared understanding
of all of the corners of the problem, and I for one have spent a truly
absurd amount of time and energy over the past week trying to make the problem clear. It makes no sense to me, and I think would be actively
cruel, to stop now and have to do substantial amounts of that work all over again some time later. Debian is profligate in wasting the time and
energy of its contributors already; we don't need to make it worse.
I am frequently told that Debian is a do-ocracy: the people who are
willing to do the work have wide latitude over how that work is done. One
of the implications of that is that delegates don't get to force other
people to do their work in arbitrarily different ways just because they
would personally like that better. There is an obligation that comes with the delegate position to only block things for clear and important reasons that matter, and to do that, one also has to be *correct*. If I make a delegate decision on an incorrect basis, I am wrong and I can and should
be overridden.
So yes, you're right, the git-debrebase example is not nearly as
interesting as I had thought because the tooling works differently than I
had realized.
Russ Allbery writes ("Re: [RFC] General Resolution to deploy tag2upload"):
So yes, you're right, the git-debrebase example is not nearly as
interesting as I had thought because the tooling works differently than
I had realized.
As ever, it's all more complicated than you thought (and than you now
think). I'm going to give just a few examples of the frantic paddling
that dgit is doing underneath the waterline. This is therefore an *extremely* long message.
However, you can also run `dgit push-source --split-view=always`. This
is an alternative workflow. In that case, the synthetic git commits
which introduce d/patches don't end up in your own maintainer git
branch. (I'm not sure Russ knows this feature exists.) This mode is
nicer because you don't get diff noise about changes to the completely autogenerated contents of d/patches. Specifically, without the split
view, each upload introduces a bunch of patches onto the maintainer
branch, which the next run of git-debrebase after the upload immediately deletes.
[email protected] wrote:
In this message I discuss in some detail five packaging workflows.
I am more familiar with the gbp patches-unapplied workflow: can you
point us to some educationally relevant example repositories using the git-debrebase workflows?
(Maybe without dgit, to make things easier to understand.)
The alternative design I've been positing supposes including a
manifest of the contents of the unpacked source package. Ie, patches applied.
But why does it have to be patches-applied?
Then both sides could easily (?) compute a canonical hash of the patches-unapplied git repositories, and it would still provide the same security properties.
The difference is the expectation that the delegates will continue to
perform this work and therefore need to deal with the long term
impact. One-time contributions are welcomed as long as they are a net positive, but not all of them are, and some take up hundreds of hours of volunteer time several years down the line.
Deploying tag2upload *as a service* is an ongoing commitment, which
means creating a new delegation, or altering the scope of an existing
one. We need to be explicit which one it is.
First, as I understand the position of the FTP Masters involved in this discussion (for clarity, I'm a non-delegated member of the FTP Team
(i.e. FTP Assistant)), their view is that determining if an upload is
from a person authorized to upload to the Debian archive is a function
that is within the scope of their delegated authority and the current tag2upload proposal takes over that function.
Simon Richter <[email protected]> writes:
The difference is the expectation that the delegates will continue to perform this work and therefore need to deal with the long term
impact. One-time contributions are welcomed as long as they are a net positive, but not all of them are, and some take up hundreds of hours of volunteer time several years down the line.
Sure. I don't think anyone involved has ever intended for tag2upload to
be a one-time contribution. It's an ongoing service and the developers
have been clear that they intend to maintain and further improve it if it
is deployed.
Deploying tag2upload *as a service* is an ongoing commitment, which
means creating a new delegation, or altering the scope of an existing
one. We need to be explicit which one it is.
I don't know what the general practice is for Debian project
infrastructure. There isn't a separate delegation for the buildds, even though I don't believe they're run by the FTP team, and I don't think
they're entirely covered by the DSA delegation. tag2upload is essentially
a source package buildd, so however the buildds are handled might make
sense, although I realize binary buildds are a bit more complicated since it's often different people per architecture.
In other words, I'm not sure that "ongoing commitment" == "delegation" (either new or existing). I think there are lots of things in Debian that are ongoing commitments but not delegations. But I can also see the
argument for considering that a bug and wanting a delegation for any new ongoing service.
Scott Kitterman <[email protected]> writes:
First, as I understand the position of the FTP Masters involved in this discussion (for clarity, I'm a non-delegated member of the FTP Team
(i.e. FTP Assistant)), their view is that determining if an upload is
from a person authorized to upload to the Debian archive is a function
that is within the scope of their delegated authority and the current tag2upload proposal takes over that function.
As mentioned in the summary, I believe we've found a resolution to this problem provided that the FTP team is willing to implement the protocol I described in dak, which Ansgar seemed supportive of. That allows them to
do both the authentication and authorization check directly on the Git tag signed by the uploader, which means the trust extended to tag2upload is
then almost precisely equivalent to the trust extended to a binary buildd: start from an independently-verified maintainer-signed thing and produce a build artifact.
On Sunday, June 23, 2024 11:48:09 AM EDT Russ Allbery wrote:
As mentioned in the summary, I believe we've found a resolution to this
problem provided that the FTP team is willing to implement the protocol
I described in dak, which Ansgar seemed supportive of. That allows
them to do both the authentication and authorization check directly on
the Git tag signed by the uploader, which means the trust extended to
tag2upload is then almost precisely equivalent to the trust extended to
a binary buildd: start from an independently-verified
maintainer-signed thing and produce a build artifact.
I will confess having trouble keeping track of all the back and forth.
After that, it was my impression that the press was on to deploy as is.
Regardless, I think the major point is that running on a DSA managed
host doesn't necessarily equate to DSA running the service.
I think who is going to run the tag2upload service and if some
delegation for doing so is appropriate are both questions that aren't answered by "DSA will run the host".
Scott Kitterman <[email protected]> writes:
On Sunday, June 23, 2024 11:48:09 AM EDT Russ Allbery wrote:
As mentioned in the summary, I believe we've found a resolution to this
problem provided that the FTP team is willing to implement the protocol
I described in dak, which Ansgar seemed supportive of. That allows
them to do both the authentication and authorization check directly on
the Git tag signed by the uploader, which means the trust extended to
tag2upload is then almost precisely equivalent to the trust extended to
a binary buildd: start from an independently-verified
maintainer-signed thing and produce a build artifact.
I will confess having trouble keeping track of all the back and forth. After that, it was my impression that the press was on to deploy as is.
So far as I know, no one in this discussion has ever asked for the FTP
team to deploy tag2upload. The only hard request of the FTP team is to
not block uploads made with it. If the FTP team refuses to do any work whatsoever on anything related to tag2upload, it is still possible to
deploy it (with some assistance from other teams such as DSA, of course).
There are some very obvious and relatively minor changes to dak that would make Debian as a whole more secure and that I would hope the FTP team
would be willing to make, such as a separate keyring for tag2upload that
only allows source packages similar to the separate keyring for buildds
that only allows binary packages. One of them is to allow tag2upload
uploads to contain an additional file holding the original signed Git tag, and then dak can choose to repeat the authentication and authorization
checks on that tag, verify that the fields in the *.dsc match the tag,
etc. tag2upload can be deployed without those changes, but it would be better if those changes were also made.
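As a rough illustration of the last check mentioned above (verifying that the fields in the *.dsc match the tag), here is a minimal Python sketch. It is not dak code and not the real tag2upload tag format: the `[upload-intent source=... version=...]` metadata line and the function names are hypothetical, chosen only for this example, and the .dsc is assumed to be already signature-checked and stripped of its OpenPGP armour.

```python
import re
from email.parser import Parser


def parse_dsc_fields(dsc_text):
    """Parse the RFC 822-style fields of an unsigned .dsc body into a dict."""
    msg = Parser().parsestr(dsc_text, headersonly=True)
    return {key: value.strip() for key, value in msg.items()}


def parse_tag_metadata(tag_message):
    """Extract key=value pairs from a hypothetical '[upload-intent ...]' line.

    The line format is an assumption made for this sketch only.
    """
    match = re.search(r"^\[upload-intent ([^\]]*)\]$", tag_message, re.MULTILINE)
    if match is None:
        raise ValueError("no upload-intent metadata line found")
    pairs = (item.split("=", 1) for item in match.group(1).split() if "=" in item)
    return dict(pairs)


def dsc_matches_tag(dsc_text, tag_message):
    """Return True if Source and Version in the .dsc agree with the tag metadata."""
    dsc = parse_dsc_fields(dsc_text)
    meta = parse_tag_metadata(tag_message)
    return (
        dsc.get("Source") == meta.get("source")
        and dsc.get("Version") == meta.get("version")
    )
```

A real implementation would verify the tag signature and the uploader's authorization first; this sketch only covers the field comparison that follows.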
If the FTP team is willing to incorporate those changes into dak but doesn't
want to write them, I am sure that we can find volunteers to do so. That volunteer might be me, for example; implementing something practical would
be a nice break from arguing about things.
Regardless, I think the major point is that running on a DSA managed
host doesn't necessarily equate to DSA running the service.
Yes, I think everyone agrees with this and no one expected DSA to run the service. Maybe I'm wrong, in which case someone please correct me.
Sorting out exactly what "run" means and how labor is divided is something that I assumed they would work out with DSA once there was a path forward
for deploying the service.
I personally don't know what the standard model for Debian infrastructure services is, but I believe Ian has already been through this process with
the dgit-repos service and knows how it works.
I think who is going to run the tag2upload service and if some
delegation for doing so is appropriate are both questions that aren't answered by "DSA will run the host".
I believe the answer to who is going to run the tag2upload service is Ian
and Sean.
Russ Allbery writes ("Re: [RFC] General Resolution to deploy tag2upload"):
So yes, you're right, the git-debrebase example is not nearly as
interesting as I had thought because the tooling works differently than I
had realized.
As ever, it's all more complicated than you thought (and than you now
think). I'm going to give just a few examples of the frantic paddling
that dgit is doing underneath the waterline. This is therefore an *extremely* long message.
First, though, I want to summarise:
In this message I discuss in some detail five packaging workflows.
For the three current workflows I discuss their workings in some
detail; I explain some of the wrinkles, anomalies and complications
that dgit currently deals with, and that tag2upload takes care of.
For the two future workflows - one near future, and one speculative -
I sketch out what the support might look like.
I also discuss my understanding of the alternative design proposed by
some ftpmasters. In each case, the tag2upload design handles the
situation well. In each case, the alternative design works
significantly less well, or requires significantly more complexity in
more places - usually both. In some cases the alternative design
can't sensibly work at all.
I want to emphasise that these are *examples*. I feel we have
spent much of this thread (and much of previous conversations)
playing whack-a-mole with "but you could fix that anomaly by doing
X" and "you could handle that other situation by doing Y". Where "X"
and "Y" are each not great, but perhaps might be tolerable, if they
were the only limitation.
So, yes, it is true, that in *some* of these cases, including
perhaps many actual packages in practice, the alternative design
could be made to work. But the alternative design does *not* solve
all the problems that tag2upload does, and the problems that it
does solve it handles in a more complicated and ugly way, with more
limitations.
Taking this all together, the alternative proposal is sufficiently
limited in scope, and poor in its outcomes, that it's not worth
pursuing.
Right then.
1. git-debrebase:
Firstly, this is one of the easier cases from tag2upload's point of
view. git-debrebase is modern and git-based, so has fewer warts.
It's true that git-debrebase can make patches.
But, the calls to git-debrebase that you make as a maintainer do not
make any patches in debian/patches. Indeed, usually, if git-debrebase
finds anything in debian/patches, it simply deletes it all.
What happens is that dgit has special knowledge about git-debrebase:
it knows that git-debrebase can make patches. (This is actually there
as an optimisation: git-debrebase can make patches much faster.)
When you do `dgit push-source` (which is how git-debrebase users
upload), dgit knows it needs to maybe make patches, because that's how
a "3.0 (quilt)" source package works. This is the "quilt-fixup" step
of uploading, which is what (for historical reasons) the source
package canonicalisation is called.
So iff you are using git-debrebase with "3.0 (quilt)", dgit uses git-debrebase to make the patches and commit them to your branch.
However, you can also run `dgit push-source --split-view=always`.
This is an alternative workflow. In that case, the synthetic git
commits which introduce d/patches don't end up in your own maintainer
git branch. (I'm not sure Russ knows this feature exists.) This mode
is nicer because you don't get diff noise about changes to the
completely autogenerated contents of d/patches. Specifically, without
the split view, each upload introduces a bunch of patches onto the
maintainer branch, which the next run of git-debrebase after the
upload immediately deletes.
So in that case the maintainer branch never has patches and isn't
treesame to a "3.0 (quilt)" source package.
Also! You can use git-debrebase with 1.0-with-diff, or with 1.0
native. (I'm not sure Russ knows this, either.) This is often a nice
way of working, for a small package which usually has an empty or tiny
patch queue. If you do that then there are no patches, ever, just git commits and an output tarball. And, there's a wrinkle: you can't use git-debrebase with "3.0 (native)" because of a bug in dpkg-source [1].
So whether there are patches depends on the maintainer workflow, the
intended source package format, and the surrounding context (eg
sponsorship), and they are made by dgit, which calls out to
git-debrebase as an optimisation.
Relationship to tag2upload:
git-debpush and the tag2upload tag don't know anything about any of
this chaos. git-debpush simply signs a tag saying "this git branch is
in a format suitable for quilt fixup in linear patches mode".
git-debpush has *no* code to deal with any of the above. All of this
is left to the tag2upload service.
With a git-based sponsorship workflow, the sponsor may not need to
learn git-debrebase. They can review the git *tree*, diffing it
against the upstream (ideally, upstream's signed tag), and likewise
they can diff it against the previous upload. They'll declare the
nicely predictable "linear" workflow mode in their tag. They can be
sure that the output source package will be precisely the code they've reviewed in git.
(git-debpush does have one piece of git-debrebase-specific knowledge -
an overrideable sanity check to guard against a user error causing an anomalous branch state. It's 9 lines of code - and nothing to do
with source package construction or package contents. This sanity
check is not an essential part of git-debpush, and another tag
generation utility, or a human, could omit it.)
ftpmaster's alternative design, AIUI:
(Here I'm going to compare tag2upload with the alternative design
where the uploader signature covers a manifest of all the files in the unpacked source package - ie, of the result of dpkg-source -x. The ftpmasters haven't produced a complete design, but I think I can infer
the properties that a full proposal would have.)
In this alternative design, software making an upload intent tag for a git-debrebase package would need code to generate the contents of debian/patches. Realistically, that means it needs a copy of
git-debrebase.
And, the person authorising the upload now needs to learn about and
run and trust git-debrebase, which in our design they often didn't.
2. linear quilt mode, especially with NMUs
I'm going to explain this in terms of git-based NMUs. Similar
situations can arise in other contexts, including certain (I think
not widely used) maintainer workflows.
When doing an NMU with git, you first obtain a suitable
patches-applied git branch from somewhere. (Currently `dgit clone` is
the best way to do that, but tag2upload will open up the
possibility[2] of making it be just a `git clone` in the future.)
You then make commit(s) representing your changes, and test them.
(NB that testing them doesn't necessarily involve making a
"3.0 (quilt)" source package. You can build binaries from git.)
When you're happy, you file the NMUdiff bug report (you can use git-format-patch or git-diff for this), and you `dgit push-source`.
Note that at no point have you done anything with d/patches.
So at this stage, your git working tree has some applied patches in d/patches, plus also some changes that are only in git commits.
dgit knows how to figure out *which* git commits need making into
patches, which is a nontrivial problem. The basic algorithm is to
calculate what the tree looks like if you take the orig tarball and
apply the contents of debian/patches - that gives dgit the tree at the
last upload. Then dgit walks backwards through the git history hoping
to find a commit whose tree matches that last upload. Then it can
walk forward again and make patches out of the commits.
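The walk-back, walk-forward search described above can be sketched roughly as follows. This is a toy model, not dgit's actual code: it assumes a linear history, and the names `History`, `commits_needing_patches` and the tree identifiers are all hypothetical. `last_upload_tree` stands for the tree you get from the orig tarball with debian/patches applied.

```python
from typing import Optional

# Toy model: history is a list of (commit_id, tree_id) pairs, oldest first.
# Real git history is a DAG and trees are git tree objects; this sketch
# assumes a linear history and opaque tree identifiers.
History = list[tuple[str, str]]


def commits_needing_patches(history: History, last_upload_tree: str) -> Optional[list[str]]:
    """Return the commits made since the last upload, oldest first.

    Returns None if no commit in the history has the last-upload tree.
    """
    # Walk backwards from the branch tip looking for a treesame commit.
    for index in range(len(history) - 1, -1, -1):
        _commit, tree = history[index]
        if tree == last_upload_tree:
            # Walk forward again: every later commit becomes a patch.
            return [commit for commit, _tree in history[index + 1:]]
    return None


# Hypothetical NMU: two commits on top of the last upload.
history = [
    ("c1", "tree-upstream"),
    ("c2", "tree-last-upload"),
    ("c3", "tree-nmu-fix-1"),
    ("c4", "tree-nmu-fix-2"),
]
print(commits_needing_patches(history, "tree-last-upload"))
```

The point of the sketch is that the search needs the last upload's tree as an input, which in turn needs the orig tarball from the archive.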
There's more. dgit wants to make patches that the NMU recipient won't
object to. So, we can't just use gbp pq because some maintainers
don't like its output and want the patches in closer to DEP-3 format. Therefore, dgit makes these patches by calling `dpkg-source --commit`
with a stunt value of `EDITOR`.
Again, all of this is only necessary with "3.0 (quilt)". It also
depends on the archive contents - it's important to be using the orig
tarball from the archive.
Finally, did you know that dpkg-source and git can disagree about the
meaning of patches? There are patches that dpkg-source can apply, but
which git fails on. There are also patches that they *both* apply,
but *disagree* about the meaning of! Real packages, including highly important core packages, are sometimes afflicted. dgit has code in it
to deal with that too.
Relationship to tag2upload:
Once again, git-debpush and the tag2upload tag don't know anything
about any of this chaos. git-debpush simply signs a tag saying "this
git branch is in a format suitable for quilt fixup in linear patches
mode".
ftpmaster's alternative design, AIUI:
In the alternative design it is probably not feasible to support NMUs
of arbitrary "3.0 (quilt)" packages.
Likewise maintainer workflows that rely on dgit's sophisticated git to
quilt linearisation algorithm are also not supportable.
3. gbp
git-buildpackage and gbp pq, and its patches-unapplied branch format,
are probably the most common workflow in Debian right now.
With gbp pq, the maintainer's DEP-14 tag (the tag2upload tag) is on
that unapplied branch. With a "3.0 (quilt)" source package, it is not actually strictly necessary to apply the patches to make the source
package, since the applied form of the files is not directly
represented. Instead, dpkg-source applies the patches on extraction.
But there is a wrinkle. gbp inherits a bug in dpkg-source[4]: if the maintainer has edited the upstream .gitignore, in their git
representation, this is *not* represented in the source package
generated by git-buildpackage. IMO this is a clear DFSG violation[5].
If the maintainer uses `dgit push-source --quilt=gbp`, dgit will spot
this situation and make an additional patch in debian/patches,
representing the maintainer's edits to .gitignore. That patch appears
only in the canonical git branch and the source package, not in the maintainer's view of debian/patches.
How does this relate to tag2upload?
The tag2upload git tag does not contain any detailed information about
any of this. It simply specifies that the quilt mode `gbp` should be
used. The tag2upload server does all the work.
(git-debpush *does* contain an overrideable sanity check that upstream
files match and the patches apply. Again, this is not an essential
part of its functionality and another signing tool wouldn't need it.)
ftpmaster's alternative design, AIUI:
The alternative design I've been positing supposes including a
manifest of the contents of the unpacked source package. Ie, patches applied.
In that alternative design, any utility which wanted to make an upload
intent tag would need to be able to apply the patches. The patch
application code becomes an essential part of the tag generation
software.
Also, the tag generation utility would need to have special knowledge
about .gitignore. There are two options here: (1) have code to find
the upstream .gitignores, compare them with the maintainer's
.gitignores, and generate a synthetic patch. Or, (2) find the
upstream .gitignores and arrange to include the hashes of the upstream .gitignores rather than the maintainer's .gitignores in the manifest
(which IMO violates the DFSG [5]). In either case the tag generation
utility needs special knowledge about gbp's .gitignore behaviour. Or
of course we could (3) forbid maintainers from editing or adding
.gitignore in the upstream part of the package.
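As a rough illustration of what option (1) would demand of a tag generation utility, here is a minimal sketch that compares an upstream .gitignore with the maintainer's version and produces a unified diff. It is not what dgit does internally; the function name `gitignore_patch` and the sample file contents are made up for the example, and a real tool would also have to locate every .gitignore in the tree and add DEP-3 headers.

```python
import difflib


def gitignore_patch(upstream_text: str, maintainer_text: str, path: str = ".gitignore") -> str:
    """Return a unified diff turning the upstream file into the maintainer's.

    An empty string means the two versions are identical and no patch is needed.
    """
    diff = difflib.unified_diff(
        upstream_text.splitlines(keepends=True),
        maintainer_text.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


# Hypothetical contents: the maintainer added one ignore rule.
upstream = "*.o\n"
maintainer = "*.o\n/debian/files\n"
print(gitignore_patch(upstream, maintainer), end="")
```

Even this small amount of logic would become an essential part of every tool that wants to sign an upload intent tag under the manifest design.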
4. git-debcherry
git-debcherry is an interesting git patch workflow utility. It is not currently supported by dgit, but that's not because it's impossible,
or even particularly difficult. We just haven't got around to it. [6]
I don't fully understand git-debcherry, but AIUI the basic principle
is that it is a tool for constructing debian/patches based on a patches-applied maintainer branch. It has an interesting algorithm
with some nice properties, including that it doesn't constrain the
maintainer git branch structure.
Only git-debcherry knows what patches it's going to produce, and
it takes the orig tarball as an input.
Support in dgit would be to have dgit call git-debcherry at an
appropriate point in the source package construction (during what dgit
calls "quilt fixup").
Relationship to tag2upload:
tag2upload doesn't support this yet, but it could do. We would add
the support in dgit, and when that was deployed to the tag2upload
server, git-debcherry would be useable with tag2upload right away.
As with the other workflows, git-debpush wouldn't need any code
specific to git-debcherry. Like the other patches-applied workflows,
the authorising uploader (eg, a sponsor) does not need to understand,
or run, git-debcherry.
ftpmaster's alternative design, AIUI:
git-debcherry uses the orig tarball, so it couldn't be supported,
since the uploading developer doesn't have any tarballs.
It might be supportable if we also made changes to git-debcherry, to
allow it to work off an upstream git tag instead.
5. language team monorepos
Several teams handling upstream language-specific package managers
have a monorepo on salsa containing metadata and patches. I'm aware
of at least Rust and Haskell working this way. The precise contents
of the monorepo vary, and each team has team-specific tooling.
The fragmentation is a problem, and the workflows can be very awkward. Typically .dscs are constructed on maintainer laptops using
team-specific tooling, taking both the team monorepo and upstream
artifacts as inputs.
None of these are supported by `dgit push-source` right now. It would
be nice to be able to improve this, by formalising and streamlining
the conversion process including source package construction. I think
that would be possible in principle, but the design space is large and
as far as I'm aware there haven't been any serious conversations,
involving both source handling experts (like the dgit team) and
multiple monorepo packaging teams, about common aspects of their
workflows, differing requirements, etc.
(I should say that at least for Rust, which I know very well, I have
serious doubts as to whether the monorepo is the right approach, but
that's a whole other can of worms.)
Relationship to tag2upload:
If we deploy tag2upload, we'll be greatly streamlining the usual
upload case. This will increase the gap between the existing monorepo workflows on the one hand, and the majority of packages (which are
supported by tag2upload) on the other hand.
The potential gains from improving the monorepo workflows will be
bigger, and also more evident to a wider set of people.
In summary, supporting monorepo team(s) with more-git-based workflows
is probably possible, in the medium to long term. I think it's likely
to happen with tag2upload.
ftpmaster's alternative design, AIUI:
I think the alternative design couldn't ever handle multi-package
monorepos in the style of the Rust or Haskell teams.
Ian.
Footnotes.
[1] dpkg-source hates "3.0 (native)" with non-native version,
despite TC request to please allow it:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=737634#107
[2] To support NMUs based on just "git clone" we'd need to start
importing every non-git-based[3] .dsc into git, which isn't a sensible
thing to do until the git repository and everything is scaled up due
to git-based .dscs being more common, which will be an effect of
tag2upload.
[3] By "git-based" I mean that the .dsc tells you which git commit it
was made from, and the git tags etc. tell you how. I don't mean to
include ad-hoc source package construction from untraceable git trees
using untracked software on maintainer laptops.
[4] The dpkg-source bug about the .gitignore DFSG violation:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=908747
[5] Reading the bug report[4] it's clear that not everyone agrees that discarding our .gitignore changes is a DFSG violation. I find that
position quite implausible but I'm hoping we don't need to resolve it
here.
[6] dgit feature request ticket "want dgit --quilt=debcherry"
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=930881
Ansgar, Joerg,
Discussion has died down without a resolution of our impasse, but Ian
sent a very long message, so perhaps you are working through it.
Could you let me know if you are still working on further responses,
and if so, roughly how long you think you need?
On 27.06.24 09:50, Ansgar 🙀 wrote:
leading to having no idea
why a checksum as suggested isn't possible (it would work trivially for
the counterexamples given...).
I assume that "checksum" refers to something like https://pkg.go.dev/golang.org/x/mod/sumdb/dirhash#Hash1 which you referred to in a message on 06-19.
Question: How can the tag2upload client, which is not going to run dgit
by design, add the hash of dgit's output to the tag it creates?
Answer: it's impossible to do that. This is the whole point of tag2upload.
Please enlighten me if I have missed something here.
On Mon, 2024-06-24 at 16:12 +0800, Sean Whitton wrote:
Ansgar, Joerg,
Discussion has died down without a resolution of our impasse, but Ian
sent a very long message, so perhaps you are working through it.
Could you let me know if you are still working on further responses,
and if so, roughly how long you think you need?
As I guess you are aware (given you are on #-ftp-private), I haven't
found time to read recent mails.
In particular I haven't had time to find anything after the discussion
with Russ, leading to having no idea why a checksum as suggested isn't possible (it would work trivially for the counterexamples given...).
On Thu 27 Jun 2024 at 09:50am +02, Ansgar 🙀 wrote:
In particular I haven't had time to find anything after the discussion
with Russ, leading to having no idea why a checksum as suggested isn't possible (it would work trivially for the counterexamples given...).
This isn't true. You are just asserting it, without evidence.
You have not done the detailed work on this topic that we have.
I'll expand on this here slightly for your benefit:
$ git clone https://salsa.debian.org/rra/tf5.git
[...]
$ apt-get source tf5
[...]
$ rm -rf tf5/.git tf5-5.0beta8/.pc
$ diff -Nur tf5 tf5-5.0beta8; echo $?
0
If one is really bored:
$ (cd tf5; sha256sum $(find . -type f | sort) | sha256sum -)
8d7820471fb44382a0c752319906064a1276ff18873fb4730dec1319aaf7b459 -
$ (cd tf5-5.0beta8; sha256sum $(find . -type f | sort) | sha256sum -)
8d7820471fb44382a0c752319906064a1276ff18873fb4730dec1319aaf7b459 -
I will leave it as an exercise to you to compare the output and to
reason about results of different ways to compare both trees.
As soon as I start using tag2upload, I am no longer running dgit locally
and that patch generation will be represented in the Git tree that I
sign to trigger tag2upload.
On Friday, 28 June 2024, 08:32:43 CEST, Ansgar 🙀 wrote:
I'll expand on this here slightly for your benefit:
$ git clone https://salsa.debian.org/rra/tf5.git
[...]
$ apt-get source tf5
[...]
$ rm -rf tf5/.git tf5-5.0beta8/.pc
$ diff -Nur tf5 tf5-5.0beta8; echo $?
0
If one is really bored:
$ (cd tf5; sha256sum $(find . -type f | sort) | sha256sum -)
8d7820471fb44382a0c752319906064a1276ff18873fb4730dec1319aaf7b459 -
$ (cd tf5-5.0beta8; sha256sum $(find . -type f | sort) | sha256sum -)
8d7820471fb44382a0c752319906064a1276ff18873fb4730dec1319aaf7b459 -
I will leave it as an exercise to you to compare the output and to
reason about results of different ways to compare both trees.
It looks to me that you have taken (by choice, or by chance) an example
that too conveniently fits what you want to demonstrate: in which the
git repository and the .dsc are treesame. That is often the case, but
not always, as documented in the various git workflows' documentation provided by dgit. The salsa repo can be patches-applied and not have
the debian/patches files, they'd be created at dgit push-source time.
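To make the treesame point concrete, here is a small sketch in the spirit of the `sha256sum` pipeline quoted above (it does not reproduce that pipeline's digests byte for byte). The helper names `tree_digest` and `write_tree` and the file contents are invented for the example. It builds two toy trees, one standing in for a patches-applied git branch without debian/patches and one for the source package that would be generated from it, and compares their digests.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Hash every regular file under root, in sorted path order."""
    outer = hashlib.sha256()
    files = sorted(p for p in root.rglob("*") if p.is_file())
    for path in files:
        relative = path.relative_to(root).as_posix()
        inner = hashlib.sha256(path.read_bytes()).hexdigest()
        outer.update(f"{inner}  {relative}\n".encode())
    return outer.hexdigest()


def write_tree(root: Path, files: dict[str, str]) -> None:
    """Create the given files (relative path -> content) under root."""
    for name, content in files.items():
        target = root / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)


with tempfile.TemporaryDirectory() as tmp:
    git_tree = Path(tmp) / "git"
    source_package = Path(tmp) / "dsc"
    common = {"src/main.c": "int main(void) { return 0; }\n"}
    # The git branch is patches-applied but carries no debian/patches.
    write_tree(git_tree, common)
    # The generated source package additionally contains debian/patches.
    write_tree(source_package, {**common, "debian/patches/series": "fix.patch\n"})
    same = tree_digest(git_tree) == tree_digest(git_tree)
    differs = tree_digest(git_tree) != tree_digest(source_package)
    print(same, differs)
```

A digest computed over the git tree alone therefore cannot match one computed over the unpacked source package whenever debian/patches only comes into existence at source package construction time.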
On Friday, 28 June 2024, 08:32:43 CEST, Ansgar 🙀 wrote:
I'll expand on this here slightly for your benefit:
$ git clone https://salsa.debian.org/rra/tf5.git
[...]
$ apt-get source tf5
[...]
$ rm -rf tf5/.git tf5-5.0beta8/.pc
$ diff -Nur tf5 tf5-5.0beta8; echo $?
0
If one is really bored:
$ (cd tf5; sha256sum $(find . -type f | sort) | sha256sum -)
8d7820471fb44382a0c752319906064a1276ff18873fb4730dec1319aaf7b459 -
$ (cd tf5-5.0beta8; sha256sum $(find . -type f | sort) | sha256sum -)
8d7820471fb44382a0c752319906064a1276ff18873fb4730dec1319aaf7b459 -
I will leave it as an exercise to you to compare the output and to
reason about results of different ways to compare both trees.
It looks to me that you have taken (by choice, or by chance) an example that too conveniently fits what you want to demonstrate: in which the git repository and the .dsc are treesame.
If I understand your position correctly (please correct me if needed): you (with a ftpmaster hat) would like all uploads to come with a signed artefact of hashes corresponding to the set of files as represented by the current Debian source package format, as accepted by the archive today. And you would
like this artefact's signature to be a signature by the human uploader. Did I get
this right?
If I understand dgit and tag2upload correctly, in the cases where the git repository is treesame to the source package (patches-applied, with debian/patches files stored in git, as pointed to by a tag), this artefact has the exact
same cryptographic value as the git tag, pointing to the git tree, pointing to
the git objects (modulo the SHA-1 vs SHA-256 hash functions choice, which was
clarified elsewhere). One such example is the tf5 source that you used as example.
In that case, would you still want an outside-of-git hash, signed by
the human uploader?
In the cases where the git repository is _not_ treesame to the source package
(patches-applied, but debian/patches not stored in git), uploads are already possible via dgit push-source (and the human upload signature covers the source package as it goes in the archive, not the git tree). In that other case, would you still want a signed artifact of hashes, signed by the human uploader?
And do we both understand that this means that some git repository
layouts could hence not be uploaded via tag2upload (because that would
need a much heavier git tag client, one that builds the final source package, hashes its contents, and creates the git tag)?