XPost: linux.debian.bugs.dist, linux.debian.devel
From:
[email protected]
Package: coreutils
Version: 9.5-1
Severity: wishlist
Tags: patch upstream
X-Debbugs-Cc:
[email protected],
[email protected]
Hi Michael,
we recently considered ways of exercising more CPU cores during package
builds on
[email protected]. The discussion starts at
https://lists.debian.org/debian-devel/2024/11/msg00498.html. There, we
considered extending debhelper and dpkg. Neither of those options looked
really attractive, because they limit the parallelism of the complete
build. However, different phases of a build tend to require different
amounts of memory. Typically, linking requires more memory than
compiling, and test suites may have different requirements again. The
ninja build tool partially accommodates this by providing different
"pools" for different processes, which allows limiting linker
concurrency. Generally, individual packages have the best knowledge of
their individual memory requirements, but turning that into parallelism
is presently inconvenient. This is where I see nproc helping.
On Thu, Dec 05, 2024 at 09:23:24AM +0100, Helmut Grohne wrote:
> How about instead we try to extend coreutils' nproc? How about adding
> more options to it?
I propose adding new options to the nproc utility to support these use
cases. For one thing, I suggest adding --assume to override initial
detection. This allows passing the parallel=N value from
DEB_BUILD_OPTIONS as initial value to nproc. The added value arises from
a second option --require-mem that reduces the amount of parallelism
based on available system RAM and user-provided requirements.
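To illustrate the intended arithmetic, here is a minimal sketch of the
clamping step. It is not the code from the attached patch: the function
name limit_by_memory is hypothetical, the caller is assumed to supply
the available memory in bytes, and the floor of one job is my assumption
rather than something the proposal specifies.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: cap a processor count
   so that each parallel job can have PER_JOB_BYTES of memory.  */
unsigned long
limit_by_memory (unsigned long nproc, uint64_t avail_bytes,
                 uint64_t per_job_bytes)
{
  /* No requirement given: leave the count as it is.  */
  if (per_job_bytes != 0)
    {
      /* Number of jobs that fit into the available memory.  */
      uint64_t fit = avail_bytes / per_job_bytes;
      if (fit < nproc)
        nproc = (unsigned long) fit;
    }

  /* Assumption: never report fewer than one job.  */
  return nproc > 0 ? nproc : 1;
}
```

With 16 GiB available and an initial value of 8, a requirement of 2 GiB
per job keeps the result at 8, while a requirement of 4 GiB per job
reduces it to 4.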
Let me sketch some expected uses:
* Typically build daemons now limit the number of system processors
by downsizing VMs to avoid builds failing with OOM. Instead, they
could supply an adjusted DEB_BUILD_OPTIONS=parallel=$(nproc
--require-mem=2G).
* Individual packages already reduce parallelism based on available
memory. A number of them attempt to parse /proc/meminfo. Instead they
could include /usr/share/dpkg/buildopts.mk and compute parallelism as
NPROC=$(shell nproc --assume=$(or $(DEB_BUILD_OPTION_PARALLEL),1)
--require-mem=3G).
* When using meson with ninja, the linker parallelism can be selected
separately using backend_max_links, and a different value can be
computed for it by passing a different --require-mem argument.
I expect these options to reduce complexity in debian/rules files and
hope that in providing better tooling we can require packages to be
buildable with higher default parallelism. Finding a good place to share
this tooling is difficult, but nproc seems like a sensible spot. The
nproc binary grows by 4 kB as a result and this impacts the essential
base system.
I'm attaching a patch and note that a significant part of the diff is a
gnulib update of the physmem module. The option naming can be improved.
What do you think about this approach?
Helmut
--- coreutils-9.5.orig/src/nproc.c
+++ coreutils-9.5/src/nproc.c
@@ -23,6 +23,7 @@
#include "system.h"
#include "nproc.h"
+#include "physmem.h"
#include "quote.h"
#include "xdectoint.h"
@@ -34,13 +35,17 @@
enum
{
ALL_OPTION = CHAR_MAX + 1,
- IGNORE_OPTION
+ ASSUME_OPTION,
+ IGNORE_OPTION,
+ REQUIRE_RAM_OPTION
};
static struct option const longopts[] =
{
{"all", no_argument, nullptr, ALL_OPTION},
+ {"assume", required_argument, nullptr, ASSUME_OPTION},
{"ignore", required_argument, nullptr, IGNORE_OPTION},
+ {"require-mem", required_argument, nullptr, REQUIRE_RAM_OPTION},
{GETOPT_HELP_OPTION_DECL},
{GETOPT_VERSION_OPTION_DECL},
{nullptr, 0, nullptr, 0}
@@ -61,7 +66,9 @@
"), stdout);
fputs (_("\
--all print the number of installed processors\n\
+ --assume=N assume the given number of processors before applying limits\n\
--ignore=N if possible, exclude N processing units\n\
+ --require-mem=M reduce emit