I noticed that Fedora 42 was released and their docker images lack a
'awk' tool. Debian trixie images ship with 'mawk' pre-installed right
now. While I'm not convinced the removal game is necessarily a good
one, I can see that it does have some advantages. Is it possible to
drop 'mawk' from the set of default tools in trixie? If not, what are
the blockers? What is the method to find out what the blockers are?
Installed size of mawk is 263 MB which is really small for today's standards.
Simon Josefsson <[email protected]> writes:
I noticed that Fedora 42 was released and their docker images lack a
'awk' tool. Debian trixie images ship with 'mawk' pre-installed right
now. While I'm not convinced the removal game is necessarily a good
one, I can see that it does have some advantages. Is it possible to
drop 'mawk' from the set of default tools in trixie? If not, what are
the blockers? What is the method to find out what the blockers are?
awk is in the essential set in Debian, so this would be a very substantial amount of work. See the Pre-Depends in base-files, which is there to make some awk implementation essential while still allowing the user to switch between implementations.
Installed size of mawk is 263 MB which is really small for today's standards.
awk is in the essential set in Debian, so this would be a very substantial amount of work.
On Thu, Apr 17, 2025 at 08:40:42PM +0200, Santiago Vila wrote:
Installed size of mawk is 263 MB which is really small for today's standards.
KB rather than MB, thankfully!
Debian trixie images ship with 'mawk' pre-installed right now. While
I'm not convinced the removal game is necessarily a good one, I can
see that it does have some advantages. Is it possible to drop 'mawk'
from the set of default tools in trixie? If not, what are the
blockers? What is the method to find out what the blockers are?
On Thu, Apr 17, 2025 at 11:27:22AM -0700, Russ Allbery wrote:
awk is in the essential set in Debian, so this would be a very substantial amount of work.
{Docker image comaintainer hat on}
This is right. More specifically -- the Debian docker images are (intentionally) -- "just" `debootstrap --variant=minbase`.
Changing what packages are "pre-installed" with the Docker image is not a negotiation that we wanted to have in isolation as the people who keep
the image current. Our goal was to have an image that wasn't unique (or surprising) to a Debian project member -- rather, IMVHO, the package(s)
should be added or removed from the minbase set via our usual conventions.
This has come up from time to time (in the form of some people asking to 'please install X', or 'why did Y go away') -- but the result and push to sync these two ecosystems (debootstrap and the image) is something I believe to be correct, and don't have any real intention of changing as of right
now.
If we want to drop something from the Docker image -- that's great! I'd love that. It's just something we'd have to work through the usual process of changing priority, deps, or what have you. Which -- I will note -- benefits the whole operating system on all platforms, not just one container image (this is the way).
Our goal was to have an image that wasn't unique (or surprising) to a
Debian project member -- rather, IMVHO, the package(s) should be added
or removed from the minbase set via our usual conventions.
So, personally, I think getting mktemp(1) added to POSIX would be
better for portability in the long run anyway.
I noticed that Fedora 42 was released and their docker images lack a
'awk' tool. Debian trixie images ship with 'mawk' pre-installed right
now. While I'm not convinced the removal game is necessarily a good
one, I can see that it does have some advantages. Is it possible to
drop 'mawk' from the set of default tools in trixie? If not, what are
the blockers? What is the method to find out what the blockers are?
On Thu 17 Apr 2025 at 08:02pm -05, Richard Laager wrote:
So, personally, I think getting mktemp(1) added to POSIX would be
better for portability in the long run anyway.
Eventually. POSIX.1-2017 is going to be the thing to target for a long
time, I think.
GNU m4 doesn't follow POSIX strictly, unfortunately.
See these workarounds for both the potential lack of m4 and the lack of
GNU m4 behaving POSIXly:
Is it possible to drop 'mawk' from the set of default tools in trixie?
I noticed that Fedora 42 was released and their docker images lack a
'awk' tool.
On Fri, Apr 18, 2025 at 02:52:17PM +0800, Sean Whitton wrote:
On Thu 17 Apr 2025 at 08:02pm -05, Richard Laager wrote:
So, personally, I think getting mktemp(1) added to POSIX would be
better for portability in the long run anyway.
Eventually. POSIX.1-2017 is going to be the thing to target for a long time, I think.
I think POSIX is mostly a relic, and not worth worrying about except as one of
many inputs. Too many mistakes were made too early on, and it's just too late to get everyone to agree on a common standard because real world implementations diverged in too many ways. If someone wants to make a program that works reliably across platforms sh isn't the right tool in 2025. (And I say that as someone who quotes POSIX regularly: it has value for things like choosing amongst a set of possible implementations, but not for making assumptions about what will work in the real world.)
I'm curious what modern platform doesn't have mktemp; is this more than an academic question?
Hello,
On Fri 18 Apr 2025 at 08:18am -04, Michael Stone wrote:
On Fri, Apr 18, 2025 at 02:52:17PM +0800, Sean Whitton wrote:
On Thu 17 Apr 2025 at 08:02pm -05, Richard Laager wrote:
So, personally, I think getting mktemp(1) added to POSIX would be
better for portability in the long run anyway.
Eventually. POSIX.1-2017 is going to be the thing to target for a long time, I think.
I think POSIX is mostly a relic, and not worth worrying about except as one of
many inputs. Too many mistakes were made too early on, and it's just too late
to get everyone to agree on a common standard because real world
implementations diverged in too many ways. If someone wants to make a program
that works reliably across platforms sh isn't the right tool in 2025. (And I say that as someone who quotes POSIX regularly: it has value for things like choosing amongst a set of possible implementations, but not for making
assumptions about what will work in the real world.)
I have interpreted scripts that I want to run on any FreeBSD and Debian machine, because they are part of my OS bootstrapping. What else is
there than POSIX sh for this? Therefore, it's still relevant.
They likely lack perl, as well. Most/all awk usage in maintainer
scripts could probably be replaced with perl. But, if you are in the minimizing game, perhaps you'd rather remove perl from the essential
set? A substantially harder project.
I have interpreted scripts that I want to run on any FreeBSD and Debian machine, because they are part of my OS bootstrapping. What else is
there than POSIX sh for this? Therefore, it's still relevant.
On Sat, Apr 19, 2025 at 08:05:54PM +0800, Sean Whitton wrote:
I have interpreted scripts that I want to run on any FreeBSD and Debian machine, because they are part of my OS bootstrapping. What else is
there than POSIX sh for this? Therefore, it's still relevant.
With that requirement, what you really want to know is how to write a script that works on FreeBSD and Debian--which POSIX can't tell you. (Neither of those is POSIX certified or fully compliant.) POSIX might be a starting point,
but you'll have to read man pages and figure out the discrepancies. If you're stuck doing that anyway, I seriously question the value of artificially limiting yourself to what unix tools did 30 or 40 years ago--newer tools or options often let you accomplish tasks much more efficiently. Maybe it would be worth avoiding those if POSIX really did let you write once and run anywhere...but it doesn't.
On Fri, Apr 18, 2025 at 10:17:12PM +0100, Jonathan Dowland wrote:
They likely lack perl, as well. Most/all awk usage in maintainer
scripts could probably be replaced with perl. But, if you are in the minimizing game, perhaps you'd rather remove perl from the essential
set? A substantially harder project.
If the goal is a minimal container image, why use debian at all vs a distribution optimized for that purpose? Running alpine without perl
is already a solved problem...
This just hasn't been my experience. You don't need perfect
compatibility (or certification). By restricting myself to the POSIX specifications of sh, awk, find, grep and sed, I've profitably written several non-trivial programs that work correctly on any FreeBSD install
and any Debian install that wasn't specifically engineered to be
minimal.
If the goal is a minimal container image, why use debian at all vs a distribution optimized for that purpose? Running alpine without perl
is already a solved problem...
Because I want to use a real libc, for a start.
* Michael Stone <[email protected]> [250419 15:47]:
If the goal is a minimal container image, why use debian at all vs a distribution optimized for that purpose? Running alpine without perl
is already a solved problem...
This is true for a lot of things Debian is used for. As an example:
GNOME desktop users could also use Fedora, and the work of
maintaining GNOME in Debian would be saved.
Debian is a general purpose OS that can form the foundation for a lot
of variants. But, that flexibility has a cost, and the cost is size & complexity. /var/lib/apt and /var/lib/dpkg alone are the size of a
minimal linux distribution, without even accounting for actual
executables. You can shrink the minimal set by making some components replaceable, but for a general purpose OS that implies the 60k update-alternatives program plus /etc/alternatives plus /var/lib/dpkg/alternatives--all to support reconfiguration that won't
ever happen in a container image.
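To put numbers on the size claim above, one could measure the package-manager state directories in an unpacked image. The following is a minimal sketch, not an established tool: the helper names `tree_size` and `report` are made up for illustration, and the default path list is only an assumption about which directories are of interest.

```python
import os

def tree_size(root):
    """Return the total size in bytes of regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so that link targets are not counted.
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # Unreadable or vanished file: ignore it in this rough estimate.
                pass
    return total

def report(rootfs="/", paths=("var/lib/apt", "var/lib/dpkg", "etc/alternatives")):
    """Map each path (relative to rootfs) to its size in bytes."""
    return {p: tree_size(os.path.join(rootfs, p)) for p in paths}
```

Calling `report(rootfs="/path/to/unpacked/image")` returns a dictionary of byte counts; a missing directory simply counts as 0.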
And the extra symlinks in `/etc/alternatives` don't take much size; I
agree you don't need update-alternatives, but then, you also don't
strictly need the entire dpkg and apt packages, if you're already
omitting their files under /var/lib.
has anyone considered if Debian should have official containers
without apt and dpkg?
Simon Josefsson wrote:
Debian trixie images ship with 'mawk' pre-installed right now. While
I'm not convinced the removal game is necessarily a good one, I can
see that it does have some advantages. Is it possible to drop 'mawk'
from the set of default tools in trixie? If not, what are the
blockers? What is the method to find out what the blockers are?
I would *love* to see the Essential set reduced. But I think this is combining a couple of steps, and we'd do better to separate those steps.
One is "should we make dependencies on awk explicit, rather than having
them be implicit and undocumented because awk is Essential".
The other is "should we reduce dependencies on awk".
The latter may or may not happen in any individual case, but I think the former would have a lot of value independently.
And with the former
done, we'd have the opportunity to *consider* the latter on a
case-by-case basis, with rationales like "if packages A and B didn't use
awk, then we'd simplify bootstrapping", or "if packages B and C didn't
use awk, it'd be possible for XYZ useful class of minimal systems/containers/VMs to not need it installed".
...
In general, I think this is roughly the right approach for any proposed
work on the Essential set, with the first step being to declare
dependencies explicitly.
- Josh Triplett
...
It leads us to analyzing the effort and impact. Being in the essential
set means that dependencies are not spelled out. So the first step is locating those dependencies. As we will likely not be able to audit
Debian's source code for awk uses in a reasonable amount of time,
empirical methods are likely needed.
* Rebuild the archive with awk dropped and see what fails
* Consider using reproducible builds to additionally see what packages
change as a result of dropping awk (for those that happen to be
reproducible)
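As a cheap complement to the empirical methods listed above, a text scan of maintainer scripts could give a first list of candidates. This is only a heuristic sketch under stated assumptions: the function name `find_awk_uses` is hypothetical, comment stripping is crude, and awk calls hidden behind variables or `eval` will be missed, which is why a rebuild would still be needed.

```python
import re

# Matches awk, mawk, gawk or original-awk as a standalone command word,
# including when it is written with a path such as /usr/bin/awk.
AWK_CALL = re.compile(r"(?<![\w.-])(?:original-awk|[mg]?awk)(?![\w-])")

def find_awk_uses(script_text):
    """Return (line_number, line) pairs where an awk command appears."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        # Crude comment handling: drop everything after the first '#'.
        # This also hides a '#' inside a quoted string, and shebang lines.
        code = line.split("#", 1)[0]
        if AWK_CALL.search(code):
            hits.append((number, line.strip()))
    return hits
```

Feeding it the text of a postinst or prerm script yields the lines worth reviewing by hand.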
...
Helmut
The former without the latter is just a lot of wasted work without any benefits.
BTW: Replacing mawk with original-awk in installs might be a low-hanging
fruit to save 100kB in forky, having original-awk as only AWK
variant installed is already a supported configuration.
Josh Triplett <[email protected]> writes:
And the extra symlinks in `/etc/alternatives` don't take much size; I
agree you don't need update-alternatives, but then, you also don't
strictly need the entire dpkg and apt packages, if you're already
omitting their files under /var/lib.
Right -- has anyone considered if Debian should have official containers without apt and dpkg? I think that for many use-cases for containers,
apt and dpkg will not be used and just take up space. Guix packs (containers) doesn't get Guix installed unless you specify that as a
package you want to have installed (which is usually not necessary), so something like this should be possible.
On Thu, Apr 17, 2025 at 01:38:18PM -0700, Josh Triplett wrote:[...]
Simon Josefsson wrote:
Debian trixie images ship with 'mawk' pre-installed right now. While
I'm not convinced the removal game is necessarily a good one, I can
see that it does have some advantages. Is it possible to drop 'mawk' from the set of default tools in trixie? If not, what are the
blockers? What is the method to find out what the blockers are?
I would *love* to see the Essential set reduced. But I think this is combining a couple of steps, and we'd do better to separate those steps.
One is "should we make dependencies on awk explicit, rather than having them be implicit and undocumented because awk is Essential".
The other is "should we reduce dependencies on awk".
The latter may or may not happen in any individual case, but I think the former would have a lot of value independently.
The former without the latter is just a lot of wasted work without any benefits.
In general, I think this is roughly the right approach for any proposed work on the Essential set, with the first step being to declare dependencies explicitly.
It's just a waste of time, especially if the end goal is not defined
from the start.
If someone wants to remove awk from the essential set,[...]
then replacing the far larger sed would also be desirable.
Unless someone wants to get rid of Perl in the essential set,
which is 10 times the size of AWK and sed combined.
The sane starting point would be discussing which tools should be part
of the (transitive) essential set.
[...]Debian trixie images ship with 'mawk' pre-installed right now. While
I'm not convinced the removal game is necessarily a good one, I can
see that it does have some advantages. Is it possible to drop 'mawk'
from the set of default tools in trixie? If not, what are the
blockers? What is the method to find out what the blockers are?
I would *love* to see the Essential set reduced. But I think this is
combining a couple of steps, and we'd do better to separate those steps.
One is "should we make dependencies on awk explicit, rather than having
them be implicit and undocumented because awk is Essential".
The other is "should we reduce dependencies on awk".
The latter may or may not happen in any individual case, but I think the former would have a lot of value independently.
The former without the latter is just a lot of wasted work without any
benefits.
In general, I think this is roughly the right approach for any proposed
work on the Essential set, with the first step being to declare
dependencies explicitly.
It's just a waste of time, especially if the end goal is not defined
from the start.
What I'm suggesting here is that if every individual package that needs
awk has a Depends on it (via a package that allows switching implementations), rather than relying on Essential, then it becomes
possible to make incremental progress, and that incremental progress
benefits people who are willing to carefully remove some of what Debian normally always has installed packages.
On Sun, Apr 20, 2025 at 12:48:08PM +0200, Simon Josefsson wrote:
Josh Triplett <[email protected]> writes:
And the extra symlinks in `/etc/alternatives` don't take much size; I agree you don't need update-alternatives, but then, you also don't strictly need the entire dpkg and apt packages, if you're already omitting their files under /var/lib.
Right -- has anyone considered if Debian should have official containers without apt and dpkg? I think that for many use-cases for containers,
apt and dpkg will not be used and just take up space. Guix packs (containers) doesn't get Guix installed unless you specify that as a package you want to have installed (which is usually not necessary), so something like this should be possible.
The tricky part of that would be that you then couldn't use that
container image as a base and install any further packages. Offering a "stock" container image without dpkg and apt would mean that the
container image has to *already* have everything installed that people
using the container need. (By contrast, if someone is installing their
own container they could then finalize it by removing dpkg and apt and
other things not needed at runtime.)
I think it's a good idea to support this case, but I would ideally want
to support it in tools that people use to build containers. For
instance, suppose we had an mmdebstrap option to purge dpkg and apt and associated paraphernalia, after installing everything needed.
...
On Sun, Apr 20, 2025 at 06:05:13PM +0100, Josh Triplett wrote:
On Sun, Apr 20, 2025 at 12:48:08PM +0200, Simon Josefsson wrote:
Josh Triplett <[email protected]> writes:
And the extra symlinks in `/etc/alternatives` don't take much size; I agree you don't need update-alternatives, but then, you also don't strictly need the entire dpkg and apt packages, if you're already omitting their files under /var/lib.
Right -- has anyone considered if Debian should have official containers without apt and dpkg? I think that for many use-cases for containers, apt and dpkg will not be used and just take up space. Guix packs (containers) doesn't get Guix installed unless you specify that as a package you want to have installed (which is usually not necessary), so something like this should be possible.
The tricky part of that would be that you then couldn't use that
container image as a base and install any further packages. Offering a "stock" container image without dpkg and apt would mean that the
container image has to *already* have everything installed that people using the container need. (By contrast, if someone is installing their
own container they could then finalize it by removing dpkg and apt and other things not needed at runtime.)
I think it's a good idea to support this case, but I would ideally want
to support it in tools that people use to build containers. For
instance, suppose we had an mmdebstrap option to purge dpkg and apt and associated paraphernalia, after installing everything needed.
...
This would be for the use case where a user does not want to be able to install security updates,
but does need binary compatibility with Debian.
Should we start declaring deps on all essential packages explicitly?
From what I've seen, there are two arguments for Essential:
On Sun, Apr 20, 2025 at 08:58:29PM +0300, Adrian Bunk wrote:
On Sun, Apr 20, 2025 at 06:05:13PM +0100, Josh Triplett wrote:
On Sun, Apr 20, 2025 at 12:48:08PM +0200, Simon Josefsson wrote:
Josh Triplett <[email protected]> writes:
And the extra symlinks in `/etc/alternatives` don't take much size; I agree you don't need update-alternatives, but then, you also don't strictly need the entire dpkg and apt packages, if you're already omitting their files under /var/lib.
Right -- has anyone considered if Debian should have official containers
without apt and dpkg? I think that for many use-cases for containers, apt and dpkg will not be used and just take up space. Guix packs (containers) doesn't get Guix installed unless you specify that as a package you want to have installed (which is usually not necessary), so something like this should be possible.
The tricky part of that would be that you then couldn't use that container image as a base and install any further packages. Offering a "stock" container image without dpkg and apt would mean that the container image has to *already* have everything installed that people using the container need. (By contrast, if someone is installing their own container they could then finalize it by removing dpkg and apt and other things not needed at runtime.)
I think it's a good idea to support this case, but I would ideally want to support it in tools that people use to build containers. For
instance, suppose we had an mmdebstrap option to purge dpkg and apt and associated paraphernalia, after installing everything needed.
...
This would be for the use case where a user does not want to be able to install security updates,
With this style of container use case, you handle security updates (or
any other package version upgrade) by creating a new container with the
new package versions, and deploying that new container. That doesn't
require having apt or dpkg in the container.
but does need binary compatibility with Debian.
Or is just familiar with Debian, appreciates the variety of packages and
the maintenance and stability, and prefers to use it as their base.
Container size is obviously not a priority for such users.
On a slightly related note, one of these days I'd love to figure out how
we could stop systematically installing /usr/share/lintian/overrides *in binary packages*, and move them to some form of metadata that doesn't
get installed.
Yes please! This is why I almost never add overrides to binary packages.
With embedded distributions a whole system of bootloader, kernel and userspace easily fits on 16 MB flash, even when including bloated stuff
like glibc and systemd, with plenty of space left for the application
that should run on the device.
You can't do that with Debian.
No, because the goal is to be able to use the whole Debian packages
On Sun, Apr 20, 2025 at 02:56:58PM +0300, Adrian Bunk wrote:
On Thu, Apr 17, 2025 at 01:38:18PM -0700, Josh Triplett wrote:
Simon Josefsson wrote:
Debian trixie images ship with 'mawk' pre-installed right now. While I'm not convinced the removal game is necessarily a good one, I can
see that it does have some advantages. Is it possible to drop 'mawk' from the set of default tools in trixie? If not, what are the blockers? What is the method to find out what the blockers are?
I would *love* to see the Essential set reduced. But I think this is combining a couple of steps, and we'd do better to separate those steps.
One is "should we make dependencies on awk explicit, rather than having them be implicit and undocumented because awk is Essential".
The other is "should we reduce dependencies on awk".
The latter may or may not happen in any individual case, but I think the former would have a lot of value independently.
The former without the latter is just a lot of wasted work without any benefits.[...]
In general, I think this is roughly the right approach for any proposed work on the Essential set, with the first step being to declare dependencies explicitly.
It's just a waste of time, especially if the end goal is not defined
from the start.
What I'm suggesting here is that if every individual package that needs
awk has a Depends on it (via a package that allows switching implementations), rather than relying on Essential, then it becomes
possible to make incremental progress, and that incremental progress
benefits people who are willing to carefully remove some of what Debian normally always has installed packages.
If you're already building the kind of container that will want to
remove dpkg and apt (among other things) when you're done building it,
it'd be nice to have dependency metadata that helps you figure out what
is and isn't still used. That's useful even if not everything eliminates
its dependencies yet.
By way of example: e2fsprogs uses awk (in e2scrub), but many container builders will remove that package (or never install it in the first
place), so it's not particularly important to do anything about its dependency on awk, *other than declaring it*. If other, harder-to-remove packages manage to stop using awk, then awk becomes removable, in a less error-prone way.
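The point about declared dependencies can be made concrete with a toy model: once every user of awk declares it, "is awk still needed on this system?" becomes a lookup instead of an audit. The sketch below uses made-up metadata (the `DEPENDS` table and the `pkg-a`/`pkg-b` names are hypothetical, not archive data), and `blockers` is an illustrative helper, not part of any Debian tool.

```python
def blockers(target, depends, installed):
    """Return the installed packages that declare a dependency on target."""
    return sorted(pkg for pkg in installed if target in depends.get(pkg, ()))

# Hypothetical metadata, for illustration only: not real archive data.
DEPENDS = {
    "e2fsprogs": {"awk", "libc6"},
    "pkg-a": {"awk"},
    "pkg-b": {"libc6"},
}

if __name__ == "__main__":
    # With e2fsprogs and pkg-a installed, both block removal of awk.
    print(blockers("awk", DEPENDS, {"e2fsprogs", "pkg-a", "pkg-b"}))
    # Once they are gone, nothing installed declares a need for awk.
    print(blockers("awk", DEPENDS, {"pkg-b"}))
```

An empty result means no installed package declares a need for the target, which is only as trustworthy as the declarations themselves.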
...
I think removing awk is a bad idea. It will break legacy scripts as
has already been suggested. I am mostly an observer on this list and
say very little but I think that awk is used by a lot of people. I
used it in a script that analyzed mail logs for example. It was
previously written in perl but I redid it in bash with awk and it ran
faster.
On Sun, Apr 20, 2025 at 06:25:53PM +0100, Josh Triplett wrote:
On Sun, Apr 20, 2025 at 02:56:58PM +0300, Adrian Bunk wrote:
On Thu, Apr 17, 2025 at 01:38:18PM -0700, Josh Triplett wrote:
Simon Josefsson wrote:
Debian trixie images ship with 'mawk' pre-installed right now. While I'm not convinced the removal game is necessarily a good one, I can see that it does have some advantages. Is it possible to drop 'mawk' from the set of default tools in trixie? If not, what are the blockers? What is the method to find out what the blockers are?
I would *love* to see the Essential set reduced. But I think this is combining a couple of steps, and we'd do better to separate those steps.
One is "should we make dependencies on awk explicit, rather than having them be implicit and undocumented because awk is Essential".
The other is "should we reduce dependencies on awk".
The latter may or may not happen in any individual case, but I think the
former would have a lot of value independently.
The former without the latter is just a lot of wasted work without any benefits.[...]
In general, I think this is roughly the right approach for any proposed work on the Essential set, with the first step being to declare dependencies explicitly.
It's just a waste of time, especially if the end goal is not defined
from the start.
What I'm suggesting here is that if every individual package that needs
awk has a Depends on it (via a package that allows switching implementations), rather than relying on Essential, then it becomes possible to make incremental progress, and that incremental progress benefits people who are willing to carefully remove some of what Debian normally always has installed packages.
If you're already building the kind of container that will want to
remove dpkg and apt (among other things) when you're done building it,
it'd be nice to have dependency metadata that helps you figure out what
is and isn't still used. That's useful even if not everything eliminates its dependencies yet.
If you have no need to install security updates
With embedded distributions a whole system of bootloader, kernel and userspace
By way of example: e2fsprogs uses awk (in e2scrub), but many container builders will remove that package (or never install it in the first
place), so it's not particularly important to do anything about its dependency on awk, *other than declaring it*. If other, harder-to-remove packages manage to stop using awk, then awk becomes removable, in a less error-prone way.
...
"harder-to-remove packages manage to stop using awk" is such awfully
passive language.
Let's rather talk about what Debian should officially support,
and how Josh Triplett plans to implement it.
Trying to officially support removing essential packages sounds to me
like a maintenance nightmare with little benefit; you have to do some explaining of how you will keep this maintainable when you do it.
Andrey Rakhmatullin wrote:[...]
Should we start declaring deps on all essential packages explicitly?
I personally think that would be a good idea, though I'm not currently
trying to make the case for that across the board here. Right now, I'm
From what I've seen, there are two arguments for Essential:
1) Shrinking the Packages file. This is something that good compression
handles quite well, and it's not obvious that it provides much of a
win. And if we *really* care about shrinking the Packages file,
there's a lot of low-hanging fruit there: MD5sum, tags
(https://lists.debian.org/debian-devel/2023/11/msg00226.html), and
several others. Eliminating MD5sum alone would save more than 1MB of
*compressed* size from the currently ~8MB Packages.xz. And the names
of common packages are *much* more compressible than MD5sums. :)
2) Maintenance: missing dependencies are hard to track and test. But
these days, we have much more automatic testing infrastructure, much
more install/upgrade/removal testing infrastructure, and many other
things. And note, in particular, that there's nothing stopping us
from adding some of these packages to *Build-Essential* at the same
time we dropped them from Essential, for convenience.
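The compression argument in point 1 can be tried out on synthetic data. The sketch below builds a fake Packages-style index (the stanzas, package names and checksum values are generated, not taken from the archive), strips the `MD5sum` field, and compares xz-compressed sizes. The helper names are hypothetical, and the result says nothing about the real Packages.xz figures quoted above.

```python
import hashlib
import lzma

def make_stanza(i):
    """Build a synthetic stanza with checksum-like hex strings."""
    name = f"pkg{i}"
    data = name.encode()
    fake_md5 = hashlib.sha256(b"md5" + data).hexdigest()[:32]  # 32 hex chars, like an MD5
    fake_sha = hashlib.sha256(data).hexdigest()
    return (f"Package: {name}\nVersion: 1.0-{i}\n"
            f"MD5sum: {fake_md5}\nSHA256: {fake_sha}\n")

def strip_field(packages_text, field):
    """Remove every line that starts with 'field:' from the index text."""
    prefix = field + ":"
    return "".join(line for line in packages_text.splitlines(keepends=True)
                   if not line.startswith(prefix))

def xz_size(text):
    """Return the size in bytes of the xz-compressed text."""
    return len(lzma.compress(text.encode()))

if __name__ == "__main__":
    index = "\n".join(make_stanza(i) for i in range(1000))
    print("with MD5sum:   ", xz_size(index))
    print("without MD5sum:", xz_size(strip_field(index, "MD5sum")))
```

Because hex digests are close to random, they compress poorly, so the version without the field should come out smaller; repeated package names, by contrast, compress well.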
Let's rather talk about what Debian should officially support,
and how Josh Triplett plans to implement it.
I would be more than happy to work on it, in collaboration with others proposing such changes. I expect that such work consists of 10% doing
careful archive-wide scans to detect usage of packages, 10% writing
tooling, 5% writing relatively small patches,
and 75% discussion threads
having to defend the value of such work from people who have no interest
in working on it themselves but spend a lot of energy telling other
people it's a waste of time.
Right -- has anyone considered if Debian should have official containers without apt and dpkg? I think that for many use-cases for containers,
apt and dpkg will not be used and just take up space. Guix packs (containers) doesn't get Guix installed unless you specify that as a
package you want to have installed (which is usually not necessary), so something like this should be possible.
I would suggest you hit up some of the current maintainers of Essential: yes packages, and leave the naysayers on d-devel to themselves.
El 21/4/25 a las 14:02, Chris Hofstaedtler escribió:
I would suggest you hit up some of the current maintainers of Essential: yes packages, and leave the naysayers on d-devel to themselves.
Note that there might be some overlap in those two sets of people.
The set of currently essential packages, and the fact that awk is among them in particular, reflects a consensus which may be seen as nearly "foundational".
Consensus is not sacred and everything is open to discussion, but extraordinary breakages of consensus (like this one) should require extraordinary benefits,
and in this case we are talking about 263KB in a container image several orders of magnitude bigger.
[..]
One is "should we make dependencies on awk explicit, rather than having
them be implicit and undocumented because awk is Essential".
The other is "should we reduce dependencies on awk".
original-awk's man page admits to one area of POSIX-nonconformance:
BUGS
...
POSIX‐standard interval expressions in regular expressions are not
supported.
...which I think weakens the case for your proposal helping us to have
AWK scripts that don't exercise extensions to POSIX. (But maybe the
newer original-awk that supports CSV data--a non-POSIX extension--fixes that.)
...
original-awk's man page admits to one area of POSIX-nonconformance:
BUGS
...
POSIX‐standard interval expressions in regular expressions are not
supported.
...which I think weakens the case for your proposal helping us to have
AWK scripts that don't exercise extensions to POSIX. (But maybe the
newer original-awk that supports CSV data--a non-POSIX extension--fixes that.)
I wonder if it'd be less effort to _review_ what AWK scripts we have
in maintainer scripts for satisfiability by any POSIX-conforming AWK.
How many can there be? </Jeremy Clarkson>
Regards,
Branden
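A review like the one suggested above could start from a mechanical first pass. The sketch below is a rough heuristic only: the name list covers identifiers I believe to be gawk extensions outside POSIX (an assumption worth double-checking against the gawk manual), the interval check is a naive pattern match on one-line regex literals, and `review_awk` is a hypothetical helper.

```python
import re

# Identifiers assumed to be gawk extensions rather than POSIX awk.
NON_POSIX_NAMES = (
    "gensub", "strftime", "systime", "mktime", "asort", "asorti",
    "patsplit", "PROCINFO", "FPAT", "IGNORECASE", "BEGINFILE", "ENDFILE",
)
NAME_RE = re.compile(r"\b(" + "|".join(NON_POSIX_NAMES) + r")\b")

# Naive match for an interval such as {3} or {1,3} inside /.../ on one line.
INTERVAL_RE = re.compile(r"/[^/\n]*\{\d+(?:,\d*)?\}[^/\n]*/")

def review_awk(program):
    """Return (line_number, note) pairs for constructs worth a manual look."""
    findings = []
    for number, line in enumerate(program.splitlines(), start=1):
        for match in NAME_RE.finditer(line):
            findings.append((number, "non-POSIX name: " + match.group(1)))
        if INTERVAL_RE.search(line):
            findings.append((number, "interval expression (unsupported by some awks)"))
    return findings
```

Interval expressions are flagged even though they are POSIX, since the man page excerpt above says original-awk does not support them.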
...
You happened to pick two of the most compatible OSs--it's not hard to
be portable between linux & freebsd *by accident* as there's a long
history of cross-pollination between them. (E.g., coreutils routinely
looks to see what parameters freebsd used when implementing a new
feature.)
Expand the problem set to include running on SunOS and AIX and OSX and
QNX and ... and the problem becomes much harder. But if you don't care
about all those oddballs, why limit yourself to POSIX--whose point was
to try to enable that degree of cross-platform interoperability?
[...]
It's just one of those things where regardless of what standard you
are writing to, you still need to check to see how reality matches the standard.
On Sun, Apr 20, 2025 at 12:48:08PM +0200, Simon Josefsson wrote:
Josh Triplett <[email protected]> writes:
And the extra symlinks in `/etc/alternatives` don't take much size; I agree you don't need update-alternatives, but then, you also don't strictly need the entire dpkg and apt packages, if you're already omitting their files under /var/lib.
Right -- has anyone considered if Debian should have official containers without apt and dpkg? I think that for many use-cases for containers,
apt and dpkg will not be used and just take up space. Guix packs (containers) don't get Guix installed unless you specify that as a package you want to have installed (which is usually not necessary), so something like this should be possible.
The tricky part of that would be that you then couldn't use that
container image as a base and install any further packages. Offering a "stock" container image without dpkg and apt would mean that the
container image has to *already* have everything installed that people
using the container need. (By contrast, if someone is installing their
own container they could then finalize it by removing dpkg and apt and other things not needed at runtime.)
I think it's a good idea to support this case, but I would ideally want to support it in tools that people use to build containers. For instance, suppose we had an mmdebstrap option to purge dpkg and apt and associated paraphernalia, after installing everything needed.
On Sun, Apr 20, 2025 at 06:25:53PM +0100, Josh Triplett wrote:
What I'm suggesting here is that if every individual package that needs
awk has a Depends on it (via a package that allows switching implementations), rather than relying on Essential, then it becomes possible to make incremental progress, and that incremental progress benefits people who are willing to carefully remove some of the packages that Debian normally always has installed.
Should we start declaring deps on all essential packages explicitly?
On Sun, Apr 20, 2025 at 11:22:04PM +0500, Andrey Rakhmatullin wrote:
On Sun, Apr 20, 2025 at 06:25:53PM +0100, Josh Triplett wrote:
What I'm suggesting here is that if every individual package that needs
awk has a Depends on it (via a package that allows switching
implementations), rather than relying on Essential, then it becomes
possible to make incremental progress, and that incremental progress
benefits people who are willing to carefully remove some of the packages
that Debian normally always has installed.
Should we start declaring deps on all essential packages explicitly?
There are maintainers scripts that run without the dependencies installed
(or even without the package being installed).
They can only use Essential:yes packages.
There is currently no place to write such a dependency.
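To make the "explicit dependency" idea concrete, here is a minimal sketch that checks whether a Depends field names some awk implementation. The function name and the `AWK_PROVIDERS` set are assumptions for illustration. As noted above, such a check says nothing about maintainer scripts that run before dependencies are installed.

```python
import re

# Assumed set of package names that satisfy an awk dependency (illustrative).
AWK_PROVIDERS = {"awk", "mawk", "gawk", "original-awk"}


def depends_on_awk(depends_field):
    """Return True if any alternative in a Depends field names an awk provider."""
    for clause in depends_field.split(","):
        for alternative in clause.split("|"):
            # Drop version constraints "(>= 1.0)" and arch lists "[amd64]",
            # then any ":any"-style qualifier after the package name.
            name = re.sub(r"\(.*?\)|\[.*?\]", "", alternative).strip().split(":")[0]
            if name in AWK_PROVIDERS:
                return True
    return False
```

For example, `depends_on_awk("libc6 (>= 2.34), mawk | awk")` reports an explicit dependency, while a field listing only `libc6` does not.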
Having some mechanism to create package-specific users seems like one
useful goal, and I don't understand why each package has to write
scripts to invoke 'adduser' and deal with all the complexity around
that on their own.
Having some mechanism to create package-specific users seems like one
useful goal, and I don't understand why each package has to write
scripts to invoke 'adduser' and deal with all the complexity around that
on their own. There could be a declarative interface a package can use
and say 'USERS+=saned' or 'USERS+=munin' or 'USERS+=openldap' and that's
it.
We have one: it is documented in sysusers.d(5).
On May 12, Simon Josefsson <[email protected]> wrote:
Having some mechanism to create package-specific users seems like one
useful goal, and I don't understand why each package has to write
scripts to invoke 'adduser' and deal with all the complexity around that
on their own. There could be a declarative interface a package can use
and say 'USERS+=saned' or 'USERS+=munin' or 'USERS+=openldap' and that's
it.
We have one: it is documented in sysusers.d(5).
Now you just need to persuade everybody to use it.
Marco d'Itri <[email protected]> writes:
On May 12, Simon Josefsson <[email protected]> wrote:
Having some mechanism to create package-specific users seems like one
useful goal, and I don't understand why each package has to write
scripts to invoke 'adduser' and deal with all the complexity around that
on their own. There could be a declarative interface a package can use
and say 'USERS+=saned' or 'USERS+=munin' or 'USERS+=openldap' and that's
it.
We have one: it is documented in sysusers.d(5).
Now you just need to persuade everybody to use it.
Oh I wasn't aware of that, thanks for the pointer. Is there any known
reason (except lack of time) that people aren't using it? I'll see if I
can come up with a way to use it in some packages, I think 'pqconnect'
would be a good candidate -- the postinst script is only there to call addgroup+adduser and it always felt like a hack.
https://salsa.debian.org/python-team/packages/pqconnect/-/issues/13
Marco d'Itri <[email protected]> writes:
On May 12, Simon Josefsson <[email protected]> wrote:
Having some mechanism to create package-specific users seems like one useful goal, and I don't understand why each package has to write
scripts to invoke 'adduser' and deal with all the complexity around that on their own. There could be a declarative interface a package can use and say 'USERS+=saned' or 'USERS+=munin' or 'USERS+=openldap' and that's it.
We have one: it is documented in sysusers.d(5).
Now you just need to persuade everybody to use it.
Oh I wasn't aware of that, thanks for the pointer. Is there any known
reason (except lack of time) that people aren't using it? I'll see if I
can come up with a way to use it in some packages, I think 'pqconnect'
would be a good candidate -- the postinst script is only there to call addgroup+adduser and it always felt like a hack.
https://salsa.debian.org/python-team/packages/pqconnect/-/issues/13
Relatively new perhaps. Needs a little fiddling to work with debhelper compat level 13 (needs dh helper called from d/rules).
Hi Ahmad,
On 13 May 2025 00:30:09 CEST, Ahmad Khalifa <[email protected]> wrote:
Relatively new perhaps. Needs a little fiddling to work with debhelper compat level 13 (needs dh helper called from d/rules).
You might want to build-depend on dh-sequence-sysusers instead. This way, you don't need to fiddle with d/rules.
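For readers who have not seen the declarative format, this small sketch renders the kind of line a package would ship under /usr/lib/sysusers.d/. It assumes the sysusers.d(5) field order of Type, Name, ID, GECOS, Home, with "-" meaning an automatically allocated ID. The helper name, the description and the home directory are made up for illustration.

```python
def sysusers_user_line(name, description, home=None):
    """Render a sysusers.d 'u' line that declares a system user and group."""
    # "u" creates a user plus a group of the same name; "-" lets the
    # system pick the numeric ID automatically.
    fields = ["u", name, "-", f'"{description}"']
    if home:
        fields.append(home)
    return " ".join(fields)


# Hypothetical values for the pqconnect case mentioned in the thread.
print(sysusers_user_line("pqconnect", "PQConnect daemon", "/var/lib/pqconnect"))
```

The resulting single line replaces the addgroup+adduser calls in a postinst; the packaging helper then arranges for systemd-sysusers to process it.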
Are there any guarantees on semantics for package removals? Will the user/group be removed from /etc/{passwd,group} or not? Will it remove
the home directory? What happens if the home directory is not empty?
Will it remove files owned by that user/group elsewhere? I recall
different packages have different preferences on these topics.
On Wed, May 14, 2025 at 09:30:25AM +0200, Simon Josefsson wrote:
Are there any guarantees on semantics for package removals? Will the
user/group be removed from /etc/{passwd,group} or not? Will it remove
the home directory? What happens if the home directory is not empty?
Will it remove files owned by that user/group elsewhere? I recall
different packages have different preferences on these topics.
See #228692 from 2004, or maybe jump directly to https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=228692#63
from at least 2020.
To quote Russ once again: 'I think Policy should say something like
"created users and groups should not be removed by default, but may be removed on purge if the local administrator explicitly requests this, either for that package or as a system-wide default."'
I think this is still the best practice, even if underdocumented.
Finally I'd like to add that I would not remove home directories even on purge, unless they are empty.
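The practice described above can be written down as a small decision function. This is only one reading of it: the function name and the `admin_opted_in` flag are hypothetical, and tying home-directory removal to user removal is an assumption rather than something stated in the thread.

```python
import os


def removal_plan(action, admin_opted_in, home_dir):
    """Decide what to delete; 'action' is 'remove' or 'purge'."""
    # Users and groups stay by default; they go only on purge, and only
    # if the local administrator explicitly asked for that.
    remove_user = action == "purge" and admin_opted_in
    # A home directory is kept unless it exists and is empty.
    home_is_empty = os.path.isdir(home_dir) and not os.listdir(home_dir)
    # Assumption: never remove the home directory while keeping the user.
    remove_home = remove_user and home_is_empty
    return {"remove_user": remove_user, "remove_home": remove_home}
```

Under this sketch a plain removal never touches the account, and a non-empty home directory survives even an opted-in purge.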
systemd's sysuser support is only apparently slightly better than
adduser (in that it is declarative), otherwise it feels rather
underwhelming in the package management context. It does not solve being
able to use such users in .deb files w/o maintainer scripts,
This is much less of an issue than it used to be, because nowadays the /{etc,run,var/*}/$NAME directories can be created most of the time on [...]
it currently also uses maintainer scripts for its normal operation (you just do not
write them explicitly),
I suppose that you could implement support for calling systemd-sysusers directly in dpkg, if you think it is needed.
it does not solve bootstrapping issues,
What do you mean here?
does not support setting a system-wide policy on whether to remove the users/groups on package purge, etc.
I think that this could be implemented in systemd-sysusers, if somebody [...]
To quote Russ once again: 'I think Policy should say something like
"created users and groups should not be removed by default, but may be removed on purge if the local administrator explicitly requests this, either for that package or as a system-wide default."'
I think this is still the best practice, even if underdocumented.
On Sun, Apr 20, 2025 at 10:12:24PM +0300, Adrian Bunk wrote:
Container size is obviously not a priority for such users.
That is incorrect. Many, many users use Debian as the basis for
containers, and many such users care about container size, sufficiently
so to work on reducing it. You are suggesting that because they want to
use Debian, they don't care at all; I'm observing that they want to use Debian and they care enough to try to make Debian better.
Also, while the idea of Josh might sound good in theory (adding
dependencies will not harm anybody, we just want to see the
dependencies explicit),
it might create some undeserved pressure on maintainers to stop using
awk.
In some cases I'm sure that it would be easy to rewrite the code, but
in some others the alternate construction may be a lot less readable,
and overall worse.
Note also that the base system and the container images are expected
to grow over time, because everything grows over time, but machines
hosting those container images also grow over time, so one would
naturally wonder why awk has become a problem now when it was never
a problem due to its extremely small size.
My modest proposal here after trixie, if there is a consensus that
it's a good step, would be to replace mawk by original-awk in the
base system and see what we can learn from that.
I would see that little change as something similar to what we did
with /bin/sh being replaced by dash to ensure compatibility and
standards compliance
(back then, we discovered some bashisms, and we either rewrote them to
be sh-compliant or used #!/bin/bash instead, and everybody was happy
with those little incremental changes).
I don't think we have many mawk-isms in the distribution, but this
would be an opportunity to check if all AWKs are really
interchangeable.