There is a stark disagreement over the importance of that signature, and
it appears to be the remaining blocking issue. I have argued that it
makes little difference from a security standpoint whether the source
package construction step happens before the uploader signature (the
current dak upload process) or after the uploader signature (the
tag2upload process), given that the uploader doesn't (and can't, realistically) check the output in either case. I believe at least one
FTP team delegate disagrees. I personally don't understand the
disagreement.
Again, this summary is only my opinion. I am not a tag2upload maintainer, nor is the draft GR my GR. I cannot speak for any of the parties involved here, only for myself.
On 22.06.24 07:36, Soren Stoutner wrote:...
1. Maintain a complete history of the source of Debian packages in Git, including their Git history.
2. Create the source packages in a controlled, centralized environment.
3. Eliminate the need for a fat client or any special Debian tooling on the
DD’s end, and handle everything with standard Git tools.
There is literally nothing for tag2upload to do when (3) falls away
because these goals do not require dgit to tag *or* upload anything
(beyond the simple, unsigned tagging and uploading that it already does,
of course).
"dgit push-source" already does (1). Nothing to do here.
To achieve (2) it could simply add a field to the .dsc with the hash of
the git commit that it already pushes to its append-only archive.
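As a rough illustration of the suggestion above (recording the pushed commit in the .dsc), here is a minimal Python sketch. The field name `Git-Commit` and the helper `add_dsc_field` are hypothetical and chosen only for illustration. The sketch works on an unsigned, single-paragraph deb822 text and makes no claim about how dgit or dpkg-source handle this.

```python
def add_dsc_field(dsc_text: str, name: str, value: str) -> str:
    """Append a `Name: value` field to an unsigned, single-paragraph .dsc text.

    Raises ValueError if the field is already present, so that an existing
    value is never silently overwritten.
    """
    lines = dsc_text.rstrip("\n").split("\n")
    prefix = name.lower() + ":"
    for line in lines:
        # Continuation lines start with whitespace and are never field names.
        if line[:1] not in (" ", "\t") and line.lower().startswith(prefix):
            raise ValueError(f"field {name!r} already present")
    lines.append(f"{name}: {value}")
    return "\n".join(lines) + "\n"


# Made-up example content; the commit id is a placeholder, not a real commit.
example_dsc = (
    "Format: 3.0 (quilt)\n"
    "Source: hello\n"
    "Version: 2.10-3\n"
)
updated = add_dsc_field(
    example_dsc, "Git-Commit", "0123456789abcdef0123456789abcdef01234567"
)
print(updated, end="")
```

A real .dsc is normally wrapped in an OpenPGP clearsign envelope, so any such field would have to be added before signing.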
Thank you, Russ, for summarizing the discussion so far. As far as it represents the tag2upload proponents' point of view, I think it is a fair summary of what was said so far. A few aspects seem to be missing from
my point of view, so I'd like to share my own opinion (I don't know
whether it qualifies as a reasonable complement to Russ's summary).
Russ's summary seems to miss the questions Paul R. Tagliamonte raised in
[1] and expanded on in [2]. In essence, what do we consider as source
after all?
[1] https://lists.debian.org/debian-vote/2024/06/msg00363.html
[2] https://lists.debian.org/debian-vote/2024/06/msg00424.html
On 23.06.24 04:45, Russ Allbery wrote:
that just feels wrong to me. Rude. Dismissive. And self-defeating
for Debian as a whole.
100% agree. Though again: that *feels* rude and dismissive. I'm *not* ascribing *intent* to be rude or dismissive to anybody here, and I'm sure Russ isn't either.
Matthias Urlichs <[email protected]> writes:
On 23.06.24 04:45, Russ Allbery wrote:
that just feels wrong to me. Rude. Dismissive. And self-defeating
for Debian as a whole.
100% agree. Though again: that *feels* rude and dismissive. I'm *not* ascribing *intent* to be rude or dismissive to anybody here, and I'm sure Russ isn't either.
Yes, thank you, that's an important clarification and I agree. I do not
want to speculate about other people's intent, and I think it's very easy
to get mentally stuck in a corner when new work changes an invariant that
you believed was important, even if it comes with an argument for why that invariant isn't as important as you thought.
I think that can work both ways. I am old enough to have seen many
instances of some new hotness coming along and any objection to it being swept aside because it was clear that the people objecting just didn't understand why the new hotness was so wonderful and why their concerns
didn't matter anymore. My experience has been that when those concerns
have been ignored (they usually are), things often don't end well.
Scott Kitterman <[email protected]> writes:
I think that can work both ways. I am old enough to have seen many instances of some new hotness coming along and any objection to it being swept aside because it was clear that the people objecting just didn't understand why the new hotness was so wonderful and why their concerns didn't matter anymore. My experience has been that when those concerns have been ignored (they usually are), things often don't end well.
I'm not quite sure how to phrase this (mostly because I want to use much stronger language), but I find the belief that what we have just done over the past week and a half somehow constitutes ignoring concerns to be
rather remarkable.
A whole lot of other people have been involved in this discussion and deep
in the analysis, but for the moment, I'll just speak for myself here.
I have, to the absolute best of my ability, taken every concern that
people have raised very seriously. I have spelled out exactly where I
agree with them and where I disagree with them, I have tried to explain in great detail precisely why I disagree with the concerns that I disagree
with, and I posted an entire formal security analysis to that effect. In
the places where I was wrong, I have tried to say openly that I was wrong
and go back and correct the mistaken things that I said.
Having all of that quite significant work, which has substantially eaten
into a much-needed vacation and which has literally kept me up nights, dismissed as ignoring concerns is....
Well, I guess I don't have words for that. At least not ones that I want
to write on this mailing list.
You are entitled to believe that my analysis is wrong. You are not
entitled to claim that I didn't do the work that I did, quite publicly and openly, right here on this mailing list for everyone to see.
On Sunday, June 23, 2024 11:43:47 AM EDT Russ Allbery wrote:
You are entitled to believe that my analysis is wrong. You are not
entitled to claim that I didn't do the work that I did, quite publicly
and openly, right here on this mailing list for everyone to see.
This was not intended as a personal attack on you. I think you've been
very diligent in your work and clearly you are trying to be careful to address concerns. I don't think that's true of everyone involved in
this conversation.
My impression is that there's still a communication gap between people.
I think it's, mostly, in good faith, but it's there.
As an example, I think the fact that I can download any source package
in the archive and cryptographically verify who uploaded it and that
it's unmodified from what was uploaded is an important property of our current archive structure. IIRC, you've claimed it's not. I don't
think either of us has a very good understanding of why the other
believes that. I think for both of us it's just too obviously true/not
true to be easy to explain.
P.S. FWIW, the emotional reaction I infer you had when you read my last message on this topic is pretty close to the one I had when I read the message I was replying to.
So rather than attacking me, you were insinuating attacks on other people. I'm not sure that's any better.
In general our traditional approach of handling source packages is,
we upload upstream's source archive plus our modifications (patches)
and instructions how to build it (the packaging). Our tooling
(basically dpkg-source) references all these from a Debian source
control file, the .dsc file.
From what I understood so far, the overall promise tag2upload tries
to make is to simplify package maintenance to an extent that we
offload the separation of the patches and the packaging from the
upstream source to Git.
In the tag2upload concept, the uploaded files referenced by the
.dsc file and the .dsc file itself are implicitly considered a
second class citizen of the Debian ecosystem
Still the tag2upload proponents dismiss a suggestion to also take
changing the source format or tooling into account as an unfeasible "boil-the-ocean approach" (see [8]).
Scott Kitterman <[email protected]> writes:
On Sunday, June 23, 2024 11:43:47 AM EDT Russ Allbery wrote:
You are entitled to believe that my analysis is wrong. You are not
entitled to claim that I didn't do the work that I did, quite publicly
and openly, right here on this mailing list for everyone to see.
This was not intended as a personal attack on you. I think you've been very diligent in your work and clearly you are trying to be careful to address concerns. I don't think that's true of everyone involved in
this conversation.
So rather than attacking me, you were insinuating attacks on other people. I'm not sure that's any better.
If you wanted to know why I have been doing most of the discussion rather than Sean and Ian, you could just ask. There is a quite straightforward answer:
1. I did an independent security review and most of the discussion has
focused on security and, specifically, on things that I already
considered in my review. I therefore have opinions that I thought
about for weeks before the draft GR was posted, and security is my area
of professional expertise, so it makes sense for me to address those
concerns.
2. I have a lot of patience for sprawling Debian arguments like this, and
I get some amount of personal satisfaction out of keeping them
constructive. I therefore tend to try to step in and make the
arguments in a way that I think is the most productive, often before
someone else can compose a response. I doubtless do this more than I
actually need to.
3. I'm a self-destructive idealist with poor boundary control who ends up
thinking about these discussions whether I want to or not, and since
I'm already waking up in the middle of the night drafting email
messages in my head, I may as well write them down so that I can go
back to sleep? That may be a little harsh on myself. :) But I seem
to allow Debian to destroy my vacations like clockwork every time I
take one, so why stop now.
My impression is that there's still a communication gap between people.
I think it's, mostly, in good faith, but it's there.
Oh, probably, but also there is a limit to how much energy one can
possibly sink into a discussion, so at some point, if the discussion is
still stuck, we have to accept that we did the best we could and couldn't resolve them and it's time for a GR. I'm not going to rephrase things literally forever until somehow I find the magic phrasing that works and
gets through the communication gap. There's only so much that's humanly possible.
As an example, I think the fact that I can download any source package
in the archive and cryptographically verify who uploaded it and that
it's unmodified from what was uploaded is an important property of our current archive structure. IIRC, you've claimed it's not. I don't
think either of us has a very good understanding of why the other
believes that. I think for both of us it's just too obviously true/not true to be easy to explain.
This is a disagreement. This is not either of us ignoring each other's arguments. This is us failing to convince each other. That's not the
same thing at all. (And for what it's worth, I don't think it's too obviously true to explain. Quite the contrary, I have written at least
two comprehensive explanations for exactly why I think this. But that doesn't mean they were convincing to someone else.)
I rewrote my original message four times to try to avoid implying the category into which the FTP team concerns fell. If I still failed, then I sincerely apologize, but I don't think I did. Here is precisely what I
said:
| Blocking people's work because it's actively dangerous, sure, sometimes
| we have to do that and it sucks but it may make sense. But blocking
| people's work because it didn't solve a larger problem than they wanted
| to solve, or cared more about backward compatibility than one might
| wish, or changed a security model in a way that's a little better in
| places and a little worse than others... that just feels wrong to me.
| Rude. Dismissive. And self-defeating for Debian as a whole.
There are two options here: actively dangerous, or a bucket of other
possible objections. I very carefully did not try to classify people's objections into either of those buckets because that wasn't the point.
This (from an earlier paragraph) was the point:
| I do think we should review major changes to ensure that they don't
| create serious new problems, and we have also failed on that score in
| the past. That's part of why I invested a lot of effort in trying to
| help check that in this case. And you may disagree with my evaluation
| there. But what I would ask is to separate that from the question of
| what we ideally should do, separate it from how you would like to see
| the work done, and be very precise and clear about whether there is an
| actual, serious problem. Not just "less than ideal" or "not any better
| than what we already have" or "we could solve so many more problems with
| a more radical design."
In other words, I am arguing for a standard of review for delegate
decisions.
If the FTP team truly believes that tag2upload would create serious new problems or is actively dangerous, then I agree that they are applying the correct standard of review. I happen to disagree with that decision and I think I have a lot of evidence to support my disagreement. If the
tag2upload developers choose to appeal that decision to the project as a whole, I know which way I'm voting. But that is the correct standard of review and hopefully we could have a GR on the merits.
If their standard of review is something else, then I am asking them to please reconsider. Rejecting work that other people care deeply about has
a very high cost for Debian and should require correspondingly serious justification.
P.S. FWIW, the emotional reaction I infer you had when you read my last message on this topic is pretty close to the one I had when I read the message I was replying to.
I have lots of emotional reactions to messages on threads like this. My emotional reaction is just a fact about me. It doesn't necessarily imply anything about the message that provokes it. It's on me to try to figure
out where that emotion is coming from, how to deal with it in a
constructive way, and whether it is justified or driven by some other personal reaction.
In this particular case, I believed, and still believe, that you were implying something that I think is clearly false, and in so doing you belittled a lot of really difficult work that I and others have been doing
in a way that I found insulting. I considered that reaction, probably not
as long as I should have, and I decided you crossed a boundary that I
wanted to enforce. I declined to edit the emotion out of my message the
way that I usually try to do because there are times when that emotional reaction is justified and I think this is one of them. This is one of the cases where I think the constructive approach is to make it clear that
this was an unacceptable way to have a conversation.
Since the implication was made in public and, were it true, is material to
an eventual GR, it felt correct to respond in public to rebut it.
This may have been the wrong choice. This discussion has been a bit of a marathon and I certainly don't have the emotional reserves that I had at
the start of it.
Micha Lenk writes ("Re: Summary of the current state of the tag2upload discussion"):
In general our traditional approach of handling source packages is,
we upload upstream's source archive plus our modifications (patches)
and instructions how to build it (the packaging). Our tooling
(basically dpkg-source) references all these from a Debian source
control file, the .dsc file.
This is the "Debian source packages in 1993" slide from my 2023 talk.
This was true, once. Nowadays, for most packages, this is a pretence.
Most people maintain their packages in git. Many upstreams regard git
as canonical, and publish tarballs not at all, or only as a
concession. The proportion of software for which this is true is
steadily rising. Many teams (including of serious, important and longstanding packages) work only in git. The Debian Xen team take
upstream signed git tags as their input and work entirely in git.
From what I understood so far, the overall promise tag2upload tries
to make is to simplify package maintenance to an extent that we
offload the separation of the patches and the packaging from the
upstream source to Git.
I don't think this is the right way to look at it.
tag2upload formalises and standardises and streamlines the *existing* git-based workflows which are in use in Debian today.
In the tag2upload concept, the uploaded files referenced by the
.dsc file and the .dsc file itself are implicitly considered a
second class citizen of the Debian ecosystem
The tag2upload system is part of a *parallel* git-based system for
handling source code. The intent is that all source packages will be available *both* ways (and can be edited using *both* views).
What I am trying to do is put git on the same footing as .dscs: you'll
be able to use both, with the same level of fidelity and officialness.
Still the tag2upload proponents dismiss a suggestion to also take
changing the source format or tooling into account as an unfeasible
"boil-the-ocean approach" (see [8]).
People have been thinking about changing source formats, or making
other kinds of more radical changes, for many many years.
Certainly since at least 2013, when Joey Hess and I and some others
came up with a plan to improve things that we thought would be
workable and deployable, unlike the attempts (and suggestions) that
had been made so far. We've come a long way already, and
significantly improved some maintainers' workflows.
I think we have seen, and still see, with usrmerge how difficult and cumbersome
a project that was initially presented as simple turned out to be. I
understand Scott's answer as pointing in that direction; at least, this is a
reservation of mine.
For the record, usrmerge became difficult and cumbersome because non-technical reasons prevented implementing the simple solution in
dpkg.
On 23.06.24 at 20:32, Ian Jackson wrote:
Most people maintain their packages in git. Many upstreams regard git
as canonical, and publish tarballs not at all, or only as a
concession. [...]
Yet our tooling and official workflows don't work that way.
tag2upload formalises and standardises and streamlines the *existing* git-based workflows which are in use in Debian today.
Okay, but couldn't the existing git-based workflows get formalized and standardized without changing the way you upload to Debian? Why is
it so important to tie the support for so many workflows to the method
by which the upload is done?
Certainly since at least 2013, when Joey Hess and I and some others
came up with a plan to improve things that we thought would be
workable and deployable, unlike the attempts (and suggestions) that
had been made so far. We've come a long way already, and
significantly improved some maintainers' workflows.
It is possible that I don't have all the needed background here. Would
you mind sharing a few pointers that outline how these plans worked
out? I am genuinely interested in anything that could explain to me why
previous attempts at making Git a first class citizen in the Debian ecosystem haven't succeeded so far. This could help me to get a better understanding of your long way, and maybe shed some light on why you're insisting so starkly on some of the tag2upload design decisions.
As an example, I think the fact that I can download any source package
in the archive and cryptographically verify who uploaded it and that
it's unmodified from what was uploaded is an important property of our
current archive structure. IIRC, you've claimed it's not. I don't
think either of us has a very good understanding of why the other
believes that. I think for both of us it's just too obviously true/not
true to be easy to explain.
There are a few problems with that.
1. The source package is not the end state and not what the end user ends
up using in their system. Users use a binary deb that is ... generated by
the build and signed automatically by the build key. You have to trust the
build in the source-to-deb translation. Nowadays most builds are
reproducible, but not all, and they were not when buildd keys were
introduced. The git-to-source conversion is a much simpler process than a
build. I have not seen any good arguments against applying the same logic
here.
2. What is signed is not the same as what the developer has been writing or
reading. You are putting a lot of weight on this signature of intermediate,
generated artifacts. Developers basically never verify that the contents of
the tarballs and diffs they are signing actually match the contents of their
work folder. It would be trivial to create a modified tool in the
dpkg-source chain that would inject malicious software into the source
package files just before they are signed, especially if targeting a
particular developer.
The tag in git is closest to what the developer inspected and actually can
sign with confidence. Everything downstream from that is a generated artifact
that may be tampered with and is much harder to manually verify. From this
perspective, relying on Debian source package signatures is less secure than
the tag2upload proposal, but that is what we have historically agreed
to accept.
There is a bunch of steps between the developer uploading the source package
to the archive and the end user downloading and installing the final package
into their system. And we (as Debian) have been diligently working towards
automating and centrally managing as many of them as feasible. All this does
is (for some packages) move the automation one more step closer to the actual
source code. Just because this step used to be the first does not make it so
special as not to be extendable in the same way.
And then we can work on reproducible source builds, running two
conversion servers and comparing the results, and other security improvements
(which is a higher bar than what we demand for the binary packages that actual
end users are running).
On Sun, Jun 23, 2024, 19:17 Scott Kitterman <[email protected]> wrote:
As an example, I think the fact that I can download any source package
in the archive and cryptographically verify who uploaded it and that
it's unmodified from what was uploaded is an important property of our
current archive structure. IIRC, you've claimed it's not. I don't
think either of us has a very good understanding of why the other
believes that. I think for both of us it's just too obviously true/not
true to be easy to explain.
There are a few problems with that.
Signing something that you did not write and something that you don't read
is a bad security practice that exposes you to various attacks.
Just because we have been doing this poor security practice for a long time
does not make it better. Now better methods are possible and we shouldn't
prevent them from being used just because we are used to the weaker
approach.
On Mon, 24 Jun 2024, 18:34 Scott Kitterman, <[email protected]> wrote:
None of that changes the fact that it's what they signed. Historically,
the project has found that useful and I think it still is.
the victim as part of normal delivery. There will be nothing suspicious
about it unless someone else does an NMU and sees a bigger than expected
debdiff.
Even if the developer is very security minded and maintains a separate
air-gapped signing laptop, that doesn't help unless you first actually
analyse the actual artifact that you are signing.
Maybe it would even be possible to trick the developer into signing an
upload of a different package (add a binary package with a high version to
their source package?).
With tag2upload there is no obscured source package file to be signed, so
all content going into the archive must already be visible in the git repo
being signed and will also be visible in the dgit repo. Any difference to
the upstream will be quite obvious in either case.
That is the difference between signing something that no human will ever be
reading and signing the actual source that everyone will be looking at. And
that is the difference between needing to secure just one service
(tag2upload) instead of securing a thousand work PCs of all DDs. And we do
this already for build machines. If one wanted to sneak stuff into
Debian, hacking a buildd would be the best target - you are putting hacked
binaries into end user machines without leaving traces in source packages
or repos.
An attack on upstream where a release tarball is different from the upstream
git tree would also be side-stepped by the Debian maintainer simply using
only the git tree as upstream and completely ignoring the tarballs. It
would not provide a solution for code hidden in the upstream git itself
that the maintainer missed.
On Mon, 24 Jun 2024, 22:03 Scott Kitterman, <[email protected]> wrote:
Do you have any examples of problems that this would have avoided
(xz-utils isn't one - due to the way its releases are done, it wouldn't be
suitable for tag2upload)?
Scott K
On June 24, 2024 6:36:59 PM UTC, Aigars Mahinovs <[email protected]> wrote:
Signing something that you did not write and something that you don't read
is a bad security practice that exposes you to various attacks.
Just because we have been doing this poor security practice for a long time
does not make it better. Now better methods are possible and we shouldn't
prevent them from being used just because we are used to the weaker
approach.
On Mon, 24 Jun 2024, 18:34 Scott Kitterman, <[email protected]> wrote:
None of that changes the fact that it's what they signed. Historically,
the project has found that useful and I think it still is.
Russ Allbery:
For the third purpose, I believe only weak intent information can be
derived from the uploader signature today. It is common practice in
Debian to verify the Git tree that one wants to upload, run a package
build step, and then blindly sign the resulting source package. The
uploader signature therefore does not say that the uploader verified
the correctness of the source package, only that they triggered a build
process and trusted its results.
Assuming they run this on the same system, I don't think that for
integrity protection the source package build process makes a meaningful difference.
It might be a bit easier for someone else to spot a suspect git change,
but that's far from given.
Also based on the description in this thread at least for dgit workflows
you can already generate and check that the uploaded source is
tree-same.
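To make the "tree-same" idea concrete, here is a minimal Python sketch that compares two unpacked directory trees by hashing relative paths and file contents. It is only an illustration under simplifying assumptions: the helpers `tree_digest` and `tree_same` are hypothetical names, file modes, symlinks and empty directories are ignored, and this is not how dgit itself performs the comparison.

```python
import hashlib
import os
import tempfile


def tree_digest(root: str) -> str:
    """Return a SHA-256 digest over relative paths and file contents under root."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the walk order deterministic
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            rel = os.path.relpath(path, root)
            with open(path, "rb") as handle:
                data = handle.read()
            # Length-prefix each part so that boundaries between path and
            # content are unambiguous.
            for part in (rel.encode("utf-8"), data):
                digest.update(len(part).to_bytes(8, "big"))
                digest.update(part)
    return digest.hexdigest()


def tree_same(a: str, b: str) -> bool:
    """True if both trees contain the same files with the same contents."""
    return tree_digest(a) == tree_digest(b)


def _write(root: str, rel: str, content: str) -> None:
    path = os.path.join(root, rel)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as handle:
        handle.write(content)


# Made-up example: one tree stands in for the git checkout, the other for
# the unpacked source package.
with tempfile.TemporaryDirectory() as git_tree, tempfile.TemporaryDirectory() as unpacked:
    for root in (git_tree, unpacked):
        _write(root, "debian/changelog", "hello (2.10-3) unstable\n")
        _write(root, "src/main.c", "int main(void) { return 0; }\n")
    same_before = tree_same(git_tree, unpacked)
    # Simulate a modification slipped into the source package only.
    _write(unpacked, "src/main.c", "int main(void) { return 1; }\n")
    same_after = tree_same(git_tree, unpacked)
```

A check like this only says that the two trees differ, not where or why; finding the actual change still needs a diff.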
Also the code in the source package is not just a hidden artifact that
is temporarily used during the build. It's surfaced at various places,
like sources.debian.org, apt-get source, dgit clone, .... So spotting an unusual change by accident is possible here too.
This is equivalent in the tag2upload case except that the uploader
signature happens before the build process (on the Git tag) rather than
after the build process (on the source package). In both cases, the
uploader does not subsequently verify the content of the source
package; in both cases, if the system on which the source package is
built is compromised, the source package may be compromised and the
uploader signature step will not detect this compromise.
But you are trusting the Developer system that signs the tag or source package anyway. If compromised it can simply sign malicious code in both cases.
Therefore, for this purpose, the current uploader signature appears to
provide stronger evidence of intent than it truly does. The signature
on the original Git tag triggering the upload provides equal evidence
of uploader intent.
As described above I'm not convinced by that "intent" argument from a security point of view.
What might be worth mentioning is that verifying the developer's signature on a .dsc or a git tag can actually be quite involved.
Especially if you don't want to trust the Debian archive (which some
seem to see as a feature of shipping the signed .dscs). You first need
to find out when the signature was made without trusting the included timestamps (since they could be faked). Then check the state of the
Debian keyring at that time. Then check if the key was revoked later.
Then you can finally check the actual file signature.
Given this I'm not sure if tag2upload would make it much harder (but yes
you would need things not on the mirrors, although you already kind of
need this for checking the timestamps).
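The key-validity part of the steps above can be sketched roughly as follows. This is a minimal illustration, not an existing tool: the `KeyHistory` record and the `assess_signature` function are hypothetical, the signing date is assumed to come from an independent source rather than from the timestamp inside the signature, and the final cryptographic check of the file signature is not shown.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class KeyHistory:
    """Hypothetical record of one key's history in the keyring."""
    fingerprint: str
    added: date
    removed: Optional[date] = None
    revoked: Optional[date] = None


def assess_signature(key: KeyHistory, signed_on: date) -> str:
    """Classify a signature, given an independently established signing date."""
    # Step 2: was the key in the keyring when the signature was made?
    in_keyring = key.added <= signed_on and (
        key.removed is None or signed_on < key.removed
    )
    if not in_keyring:
        return "key not in keyring at signing time"
    # Step 3: was the key revoked, either before or after the signature?
    if key.revoked is not None:
        if key.revoked <= signed_on:
            return "key already revoked at signing time"
        return "key revoked later: needs a closer look"
    # Step 4 (checking the signature itself) would follow here.
    return "ok to proceed to the cryptographic check"


# Made-up example data.
key = KeyHistory(fingerprint="ABCD", added=date(2015, 1, 1), revoked=date(2023, 6, 1))
early = assess_signature(key, date(2014, 5, 1))
during = assess_signature(key, date(2020, 3, 1))
late = assess_signature(key, date(2024, 1, 1))
```

Whether a later revocation should invalidate older signatures depends on why the key was revoked, which is why the sketch only flags that case instead of deciding it.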
#### Misattribution of the source package
But like it or not, mistakes can happen. E.g., somebody applies a security update to the project and uploads it to Debian, but forgets to do a git
push to salsa.
Then later on - maybe months, or years; the packages I deal with don't
change frequently - somebody else makes changes to the git based on the
salsa repo.
Do you have any examples of problems that this would have avoided
(xz-utils isn't one - due to the way its releases are done, it
wouldn't be suitable for tag2upload)?
I do not think it is reasonable that a particular (git?) workflow, specific to the way *YOU* prefer working, gains special upload rights.
I've read about the deb-rebase workflow, I would hate it, and prefer managing patches with quilt directly.
Does this mean your tag2upload doesn't work for me?
I believe tag2upload supports all of the source tree layouts that dgit
does, and I regularly upload gbp-pq-based packages with `dgit push-source`, which works fine.
Scott Kitterman <[email protected]> writes:
Do you have any examples of problems that this would have avoided
(xz-utils isn't one - due to the way its releases are done, it
wouldn't be suitable for tag2upload)?
I'm somehow reminded of Ignaz Semmelweis's attempts to improve medical
hygiene by getting doctors to emulate the local midwives, who scrubbed
their hands between patients, whereas the doctors generally didn't, and
would alternate between performing autopsies and attending deliveries.
I'd guess someone may well have pushed back against that, thus:
Can you name a single patient who has suffered as a result of
existing practice?
If I stretch that metaphor (possibly beyond breaking point), then one
might think of our developers' laptops as the (potentially infected)
cadavers, the newly uploaded source packages as the live births, and our
tooling as the doctors' hands that may carry the infectious material
from one to the other.
I hope that we've been lucky enough to not actually have any of the
relevant "infections" in the population of laptops that produce our
packages, but would it not be wise to make it more difficult for such an
infection to be silently transmitted?
People state that a compromised machine can as easily commit malicious
code to git as it could insert it into a source package, but the
difference is that the malicious commit then needs to be pushed in
order to work, exposing it to examination.
In our metaphor perhaps the git commit step would equate to requiring
doctors to touch a new Petri dish before each patient, which would at
least record what was going on, and might give the opportunity to deal
with the situation before real harm is done.
On 6/25/24 11:56, Matthias Urlichs wrote:
(a) I don't think of the tag2upload service as "3rd party". it's
(proposed to be) part of our infrastructure and treated as such,
including restricted access, security updates, assembling the source in
a sandbox, and all that. Just like the binary builders with their
somewhat larger attack surface which we happen to also trust.
It's 3rd party in the sense that the person uploading isn't generating
or even signing the source package.
It's blind because I'm not running it: some infrastructure is.
https://isdebianreproducibleyet.com/
IMO, we can consider it is there, and fix the remaining 3.6%, or at
least, exclude these from the discussion and consider they need to be
fixed to be part of the game.
I expect that the vast majority of DDs are using sbuild on their
laptops.
I'm opposed to trusting only a signed git tag in your proposed implementation, when it has been proven we can do much better.
On Jun 25, 2024 5:50 PM, Russ Allbery <[email protected]> wrote:
The tag2upload proposal moves the source package build from 1 to 2.
NO! That is NOT what you are proposing. There's been a 10-year-long
effort to achieve package reproducibility, and your proposal is throwing
it all away. How does one check the reproducibility of the git to source
package transformation?
If we were signing source package manifests locally, and tag2upload also
produced the manifest, checked that it is the same as in the pushed tag,
and used the signature, then we'd be good. But you don't want this because:
- you feel it is not convenient
- it is hard to implement
As an aside, I'm not sure there's any ethical way to do this (and any
way to do this that doesn't result in people panicking about a test),
but the security person in me badly wants to run a red team exercise
with reproducible binary builds. If we intentionally introduce a
(benign) bit of code into an amd64 binary build without anyone involved
in either reproducible builds or maintenance of that package knowing,
how long would it take for this to be flagged as a possible compromise?
Are you hereby advocating that reproducibility failures become RC bugs?
Watch the Kosovo lightning talk where Didier shows what he did. It is a proven concept.
Also, source package builds generally aren't done inside sbuild if I
remember the architecture correctly.
I'm opposed to trusting only a signed git tag in your proposed
implementation, when it has been proven we can do much better.
"Proven" to me implies that we have an implementation of tag2upload that
has better security properties. I don't think this is true? If it is,
I'd love to look at it.
For the third purpose, I believe only weak intent information can be derived from the uploader signature today. It is common practice in Debian to verify the Git tree that one wants to upload, run a package build step, and then blindly sign the resulting source package. [...]
I feel this is somehow ... wrong. I think, *currently*, it should be a moral obligation for a DD to make sure the resulting source package is correct.
Although many people claim the source package is a build artifact, I
think the source package is still supposed to be readable by a human,
unlike binary packages. I think this is true especially for patches under
d/patches/, as they are very similar to git commits.
In my own config, which uses cowbuilder to provide the clean chroot
build environment, the process I use creates a source package in the
cloned repo in my normal filesystem, then hands that source package into
the clean chroot build environment where it is used to build binary
packages. I test those packages in various ways, and if I like the test results, I sign and upload the source package. In that way, I'm at
least testing the source package I create even if I'm not carefully inspecting it?
On 24.06.24 23:20, [email protected] wrote:
I see it as signing the very thing that is pushed to the Debian archive.
You aren't uploading a bunch of git SHAs to the archive but a source
package. It feels very normal, therefore, that this is the thing that we
would like you to sign. Too bad this is less convenient for your
workflow, but that is the correct semantics.
Well, yes. Right now this is the case, and t2u adds an additional step to that equation which historically we didn't have.
However …
(a) the thing I'm signing isn't the thing I worked on. I didn't look at it and, given a git-centric workflow, nobody else will either. It feels very abnormal to me that I'm signing some artifact that I didn't even look at. Heck, it felt abnormal to me 20 years ago when I joined and built my first packages.
(b) we might decide, sometime in the future, that sources.dgit.d.o is to be treated as part of "the Debian archive" and that our builders shall pull
from there instead of unpacking a tarball if the maintainer used t2u, thus effectively removing your objection.
--
-- with kind regards
--
-- Matthias Urlichs
Russ Allbery <[email protected]> writes:
For the third purpose, I believe only weak intent information can be
derived from the uploader signature today. It is common practice in
Debian to verify the Git tree that one wants to upload, run a package
build step, and then blindly sign the resulting source package. [...]
I feel this is somehow ... wrong. I think, *currently*, it should be a
moral obligation for a DD to make sure the resulting source package is correct.
However, due to possible bugs or changes in the toolchain, malware,
mistakes, or other things, the DD's intent may not be present in the
resulting source/binary package. And *currently* on the buildds, binary
packages are still built from source packages. So I think it should be a
moral obligation for a DD to make sure his/her intent is present in the
source packages (and finally present in the binary packages).[1] I know
that DDs are volunteers and it is impossible for them to perform a
thorough inspection of the source package. But I feel that a DD falls
short of that moral obligation by blindly signing the resulting source
package without even spending a few seconds looking at what is inside it,
if he/she knows the resulting source package may differ from his/her
intent. And for tag2upload, I think there is the same moral obligation
for a DD even though he/she does not need to sign the source package.
Do you actually check that the contents of the source *package* (after all operations done by dpkg-source and possibly other tools) actually match
what you were looking at before in your source work tree folder?
On 25.06.24 01:26, HW42 wrote:
But you are trusting the Developer system that signs the tag or source package anyway. If compromised it can simply sign malicious code in both cases.
It's not that easy.
Hiding compromised code in git is difficult, given that actual people routinely look at commit histories and diffstats and diffs, esp. right between "I create the signed tag" and "I push the t2u tag to Salsa" (we
could even tell the push script to show the git log by default before doing so).
Hiding compromised code in our tarballs is easy. Nobody will ever look at them (ordinarily) and you only need a single shell script running as the developer in question; given inotify it doesn't even show up in "top" while it waits for an opportunity to strike.
"Russ" == Russ Allbery <[email protected]> writes:
Russ> I worked on an update of my security review last night to take
Hi,
On 6/26/24 03:42, Aigars Mahinovs wrote:
Do you actually check that the contents of the source *package* (after all operations done by dpkg-source and possibly other tools) actually match what you were looking at before in your source work tree folder?
Yes, although more from a quality control perspective, so I don't do it for minimal changes or simple "declare different target distribution" style backports.
With the 1.0 format, that was a lot easier (zless on the .debian.diff.gz), but with mc that is still doable with 3.0.
I've quite often found problems like accidentally included files, or inconsistencies in the control file because I forgot to apply a change to one package.
[email protected] writes:
Watch the Kosovo lightning talk where Didier shows what he did. It is a proven concept.
If this is the proof of concept where the *.dsc file is encoded in a Git
tag (sorry, there have been several proofs of concept and I lose track of which person was associated with which one), I understand it and have
already said why I don't think it will work reliably enough in its current form. (Summary: It relies on the reproducibility of tar and compression programs.) We should measure reproducible source packages by comparing
the unpacked source packages. It's a lot more robust.
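To illustrate the difference between the two ways of measuring, here is a minimal Python sketch. It is not the tag2upload or dak implementation; the helper names `make_tarball` and `content_manifest` are invented for this example, and the tarballs are tiny stand-ins for real source packages. It builds the same file contents into two archives with different member order and timestamps, then compares them byte for byte and by unpacked content.

```python
import hashlib
import io
import tarfile


def make_tarball(files, mtime):
    """Build an in-memory .tar.gz; `mtime` stands in for tool-dependent metadata."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def content_manifest(tarball):
    """Map each regular file in the archive to the SHA-256 of its content."""
    manifest = {}
    with tarfile.open(fileobj=io.BytesIO(tarball), mode="r:gz") as tar:
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                manifest[member.name] = hashlib.sha256(data).hexdigest()
    return manifest


# Hypothetical package contents, used only for the demonstration.
files = [
    ("debian/control", b"Source: demo\n"),
    ("src/main.c", b"int main(void) { return 0; }\n"),
]
first = make_tarball(files, mtime=0)
second = make_tarball(list(reversed(files)), mtime=1_700_000_000)

print(first == second)                                      # False: the bytes differ
print(content_manifest(first) == content_manifest(second))  # True: the contents match
```

The sketch deliberately ignores file modes, symlinks, and directory entries, which a real comparison would have to decide how to treat.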
"Matthias" == Matthias Urlichs <[email protected]> writes:
The basic problem (and this may sound flippant but it isn't) is that DDs
are humans and humans are so bad at the type of validation that you would want them to perform that there is essentially no chance that this will happen systematically and effectively.
People, when trying to solve a security problem, will tend towards a
fairly straightforward approach at first: figure out where the attack happened, figure out how to detect the attack, and then add a new process
to check for that attack. When another attack is found, add another
process to detect that attack. And so forth.
This is a very natural approach, and in some cases it can work, but in the general case what you get is airport security. It has precisely that
model: planes were hijacked with weapons or blown up with bombs, and therefore we check every passenger's luggage for weapons or bombs. Each
time we find a new weapon, we add a new check. In this case, we have gone far, far beyond anything that Debian would have the resources to do: we
hire an army of people whose entire job is to do these checks, we give
them a bunch of expensive equipment, and we regularly test them to see how well they do at performing this screening.
In independent tests of TSA screening (the US airport security agency),
that screening misses 80-95% of weapons or bombs.
This is a very common cautionary tale in the security community, and it's
not because the agency is incompetent, or at least any more incompetent
than any group of fallible humans told to do a boring job over and over
again will be. This is not a TSA problem, it's a human being problem.
The assumption on which this security approach is based is not as true as people want it to be. You cannot simply tell people to check something
and expect them to find problems if those problems are extremely rare and
99% of the time they will find nothing. The human brain is literally
hostile to this activity. It is not optimized for it and will not do it reliably. To get the accuracy as high as airport security achieves
already requires training, testing, and a whole lot of assistance from technology that doesn't get bored or inattentive, and it still fails more often than not.
You have probably run into a similar version of this problem if you have proofread something that you have written. If you are like most people (there are some people who are an exception to this), you will miss
obvious, glaring errors that someone else will see immediately because you know what you intended to write and therefore that's what you see. With a great deal of concentration and techniques to override your brain's normal behavior, such as reading all the sentences in reverse order, you can
improve your accuracy rate, but you will still regularly miss errors.
Security is even worse because it's adversarial. You're not attempting to spot static bugs that are sitting there waiting to be detected. You are trying to defeat an intelligent, creative, adaptive enemy who watches what you do and adjusts their attack approach to focus on your weak spots. And
in the case of computer security, *unlike* airport security, catching the attacker once doesn't let you arrest them and prevent them from ever attacking you again. The attacker can just try again, and again, and
again until they succeed.
This doesn't mean that spot checks are useless. Doing an occasional
manual check can catch unanticipated problems and snare a lot of
attackers, and it's part of a good overall security strategy. But this approach isn't *reliable* and shouldn't be the front line of the strategy.
This problem is one of the reasons why there is so much emphasis in
computer security on looking at the whole system, including the humans involved, and anticipating how the humans will fail. Humans are
vulnerable to boredom, to inattentiveness, to alert fatigue, to seeing
what they expect to see, to laziness and haste, to emotional manipulation, and any number of other factors that interfere with them following
processes. The security design has to account for this.
Some processes are worse than others. A process that requires a human to
check something for errors that will not be present 99% of the time, and
where it is overwhelmingly likely that no one will ever know if the human
did the check properly or not, is nearly a worst-case scenario. The check
just won't happen reliably, no matter the intentions of everyone involved.
People will skip it "just this time." People will open the thing they're
supposed to look at and their eyes will point at the screen but their
brain will not actually analyze anything they're looking at because it
doesn't believe it will find anything.
Systems that need better humans won't work reliably. You can say that
it's a moral obligation and make people feel guilty for not following the system properly, but this still doesn't get you reliable performance, only
a bunch of stressed, guilty people who are now demotivated on top of all
the problems that you already had.
This is why good security design is all about having humans do the things that they're good at or at least okay at (patch review, for example, which can still be tedious but which is substantially more novel and interesting than reviewing source packages that will be exactly what you expect nearly every time), and not relying on humans to do the things they're bad at.
This is the whole philosophy behind reproducible builds: don't make humans tediously compare things, set up mechanisms so that computers can compare things. Computers don't get bored or inattentive or tired and do not care
in the slightest about doing exactly the same thing millions of times to
get only one anomalous result.
This is also why even for the case of finding bugs in source packages,
which is much less of a worst-case scenario than finding maliciously
injected code that is trying to hide, "test the package before you upload
it" is only a fallback strategy when you don't have anything better. A
much better strategy is to encode the tests that you would perform as programs and make the computer run them. This is why we write test suites and autopkgtests: then all the tests happen with every upload and the computer finds problems and no human has to try to force themselves (and likely fail) to perform the same tedious steps over and over again.
One final philosophical note: humans also have incredibly limited energy. This is particularly a problem in a volunteer project, as you noted in
your footnote. There is way more to do than we have resources to do. I
want to use that energy as wisely as possible. That means I
*particularly* do not want that energy to go into doing things that humans are bad at and that probably won't be done well anyway. This means
designing the whole upload system so that we can create mechanisms like reproducible binary builds, reproducible source builds, autopkgtests, and other ways to move the load onto computers and off of humans and save that precious human attention for the things that only humans can do.
(a) checking the source package is not a one-liner. You need to untar to someplace temporary, run a recursive diff (remembering to not skip new files), then clean up the tempdir.
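The steps listed in (a) can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the function names `list_files`, `compare_trees`, and `check_tarball` are hypothetical, the package is treated as a single tarball with no top-level directory to strip, and nothing here handles the `.dsc`, the orig tarball, or patch application that `dpkg-source` performs for a real source package.

```python
import tarfile
import tempfile
from pathlib import Path


def list_files(root):
    """Return {relative path: absolute path} for every regular file under root."""
    root = Path(root)
    return {p.relative_to(root).as_posix(): p for p in root.rglob("*") if p.is_file()}


def compare_trees(work_tree, unpacked):
    """Recursive diff that also reports files present on only one side."""
    a, b = list_files(work_tree), list_files(unpacked)
    return {
        "only_in_work_tree": sorted(a.keys() - b.keys()),
        "only_in_package": sorted(b.keys() - a.keys()),
        "changed": sorted(
            name for name in a.keys() & b.keys()
            if a[name].read_bytes() != b[name].read_bytes()
        ),
    }


def check_tarball(work_tree, tarball_path):
    """Unpack to a temporary directory, compare, and clean up afterwards."""
    # Use the safer "data" extraction filter where this Python version has it.
    extract_kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tempfile.TemporaryDirectory() as tmp:  # removed when the block exits
        with tarfile.open(tarball_path) as tar:
            tar.extractall(tmp, **extract_kwargs)
        return compare_trees(work_tree, tmp)
```

A call such as `check_tarball("path/to/worktree", "path/to/package.tar.gz")` returns three lists. Anything non-empty is a difference between what was reviewed and what would be uploaded, which is the kind of comparison that a computer will perform on every upload without getting bored.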