• Re: Architecture variants for Debian / Ubuntu (3/4)

    From Michael Hudson-Doyle@1:229/2 to Guillem Jover on Thu Nov 23 05:50:01 2023
    [continued from previous message]

    so that when adding a new arch, for example you can specify whether<br>
    that arch is runnable, which could help dpkg decide whether to allow<br>
    installing M-A:foreign packages by default.<br></blockquote><div><br></div><div>Ah. Would this be a change to /var/lib/dpkg/arch or an additional file or ...?</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:
    1px solid rgb(204,204,204);padding-left:1ex">
    I guess this is similar, so such future interface should probably take<br>
    this into account as something to support too. Will check where this<br>
    is tracked and add a note to it.<br></blockquote><div><br></div><div>Did you find this place? :)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    And of course that is fine as a guardrail, but if a user hit that out<br>
    of running a frontend, then that would already be too late, which to<br>
    me means that frontends need to be aware of this too (and not pass<br>
    packages that dpkg would/could/might refuse to install), when deciding<br>
    what to pass to dpkg.<br></blockquote><div><br></div><div>Good point.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    But in any case, as you say, this currently would not be worse than<br>
    configuring a foreign arch, installing some foreign package and trying<br>
    to run it, but it might make it potentially more common. And as<br>
    mentioned above, the layer where this needs to be decided seems<br>
    higher anyway (even if dpkg could provide the infra for it).<br>

    &gt; &gt; If the only change in the package filename format is in the &lt;arch&gt; part<br>
    &gt; &gt; where we&#39;d use a name which would otherwise be valid as an arch name (so,<br>
    &gt; &gt; no weird symbols, or «-» separators that are not intended to split &lt;os&gt;<br>
    &gt; &gt; and &lt;cpu&gt; or similar), then using a name for the variant/ISA would be<br>
    &gt; &gt; fine.<br>
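
As a rough illustration of the naming constraint described above, here is a minimal Python sketch. It assumes that "valid as an arch name" means lowercase letters and digits only, which is a simplification made for this example and not dpkg's actual validation rule; the helper names `is_arch_like` and `deb_filename` are hypothetical.

```python
import re

# Assumption for this sketch: an arch-like name is lowercase letters and
# digits only. Hyphens are rejected because, per the discussion, "-" is
# meant to split <os> and <cpu>. This is not dpkg's real validation.
ARCH_LIKE = re.compile(r"[a-z0-9]+")


def is_arch_like(name: str) -> bool:
    """Return True if the name could pass for a plain architecture name."""
    return ARCH_LIKE.fullmatch(name) is not None


def deb_filename(package: str, version: str, arch_or_variant: str) -> str:
    """Compose name_version_arch.deb, with a variant name in the arch slot.

    The version is used as given; no epoch handling is attempted here.
    """
    if not is_arch_like(arch_or_variant):
        raise ValueError(f"not usable as an arch name: {arch_or_variant!r}")
    return f"{package}_{version}_{arch_or_variant}.deb"
```

With these assumptions, a variant such as `amd64v3` slots into the existing filename format unchanged, while something like `amd64-v3` is rejected.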

    &gt; Right. I think that (when possible) pretending e.g. &quot;amd64v3&quot; is a distinct<br>
    &gt; architecture will generally make things easier. E.g. I think britney<br>
    &gt; wouldn&#39;t need to know about the relationship between &quot;amd64&quot; and &quot;amd64v3&quot;.<br>

    I guess that depends on whether the intention is to create a full<br>
    optimized archive, or just a partial overlay one. In the latter case<br>
    then it might need to know to be able to satisfy dependencies.<br></blockquote><div><br></div><div>Maybe! Depends on details I think.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);
    padding-left:1ex">
    &gt; &gt; That would be one solution yes, which could give automatic bijective<br>
    &gt; &gt; mappings, although ideally with a machine-readable way to get at it,<br>
    &gt; &gt; which I&#39;m not sure we have currently.<br>

    &gt; I think &quot;gcc -Q --help=target | grep -e &#39;^\s*-march&#39;&quot; is about as machine<br>
    &gt; readable as it gets currently, for better or worse (mostly worse).<br>
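
To make the point about machine readability concrete, here is a minimal Python sketch of what scraping that output involves. It assumes the `-march=` line has the option name followed by whitespace and then the value, which is how current GCC releases appear to print it; the function names `parse_march` and `query_gcc_march` are hypothetical, and nothing here is an interface that dpkg provides.

```python
import subprocess
from typing import Optional


def parse_march(help_target_output: str) -> Optional[str]:
    """Extract the -march value from `gcc -Q --help=target` output.

    Assumes a line of the form "  -march=   <value>"; returns None if no
    such line is found or the value is empty.
    """
    for line in help_target_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("-march="):
            value = stripped[len("-march="):].strip()
            return value or None
    return None


def query_gcc_march(cc: str = "gcc") -> Optional[str]:
    """Run the compiler and parse its default -march (requires GCC)."""
    result = subprocess.run(
        [cc, "-Q", "--help=target"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_march(result.stdout)
```

The fragility is visible in the parser itself: it depends on an output layout meant for humans, which is the concern raised in the reply that follows.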

    That does not look very satisfactory, though. </blockquote><div><br></div><div>Agreed!</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">And llvm/clang does not
    support it. :/<br></blockquote><div><br></div><div>Ah I did not know that.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    &gt; &gt; For example code in dpkg-dev<br>
    &gt; &gt; already runs «$CC -dumpmachine» to infer the host architecture to use<br>
    &gt; &gt; during builds.<br>
    &gt; &gt;<br>
    &gt; &gt; While using a triplet variation could be a way to do that, that would<br>
    &gt; &gt; require such triplet support for each variant/ISA, which tends to be<br>
    &gt; &gt; very painful to introduce if it&#39;s not there already, so I&#39;d not<br>
    &gt; &gt; consider this specific way a viable option.<br>
    &gt; <br>
    &gt; I admit I&#39;m not an expert on triplet intricacies but I think a new triplet<br>
    &gt; is not appropriate here (a bit like a new Debian architecture for a<br>
    &gt; variant/ISA choice is not the right concept).<br>

    We have i386 or arm (?) as (bad IMO) examples where the triplet can<br>
    define the arch baseline. The problem is that this requires updating<br>
    the GNU config.git upstream, and then getting that to trickle down into<br>
    every package that might be using autotools and not using autoreconf<br>
    at build time, or to even update triplet matches in configure scripts<br>
    and similar, which might be &quot;acceptable&quot; for a new arch, but seems<br>
    disproportionate for a new ISA, so yes, as mentioned I agree it&#39;s not<br>
    viable.<br></blockquote><div><br></div><div>OK. Let&#39;s stop worrying about that then :)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    &gt; &gt; On Thu, 2023-09-21 at 14:43:42 +1200, Michael Hudson-Doyle wrote:<br>
    &gt; &gt; &gt; * Should the ISA influence the toolchain via toolchain defaults or<br>
    &gt; &gt; &gt; dpkg-buildflags?<br>
    &gt; &gt; &gt; * How is the default ISA for a buildd chroot selected?<br>
    &gt; &gt;<br>
    &gt; &gt; So the clear downsides of either modifying the default toolchain or<br>
    &gt; &gt; having to provide an additional one is that this seems pretty heavy<br>
    &gt; &gt; weight. Also because people might want to build optimized variants<br>
    &gt; &gt; locally w/o having to mess with their already existing toolchains.<br>
    &gt; &gt; (I&#39;m not sure whether something going along the lines of<br>
    &gt; &gt; &lt;<a href="https://git.hadrons.org/cgit/debian/fakecross.git" rel="noreferrer" target="_blank">https://git.hadrons.org/cgit/debian/fakecross.git</a>&gt; could be an<br>
    &gt; &gt; option, although as mentioned above, if that would imply new triplets,<br>
    &gt; &gt; then probably not.)<br>
    &gt; &gt;<br>
    &gt; &gt; So the easiest way might indeed be by controlling this via an envvar,<br>
    &gt; <br>
    &gt; DEB_HOST_ARCH_ISA?<br>

    Yeah, that works, and follows the lead of the current DEB_*_ARCH_ABI<br>
    variables, for example.<br>

    &gt; &gt; which dpkg-buildpackage could also setup internally via a new option,<br>
    &gt; &gt; say --arch-isa=amd64v3 or similar<br>

    &gt; --host-arch-isa would be more coherent I think.<br>

    Ah absolutely! For some reason I had --arch in mind as a valid option<br>
    (I only see it now in dpkg-scanpackages :D, or maybe I was thinking<br>
    about --host-isa :).<br>

    &gt; I guess one could add support for --target-host-arch-isa to build a<br>
    &gt; toolchain that defaults to a particular ISA. But well.<br>

    Yes, the ISA support in dpkg should be extensive enough (so that if<br>
    this needs to be supported in the toolchain, then it is possible).<br>
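
The envvar-plus-option arrangement discussed above can be sketched roughly as follows, in Python. Both `DEB_HOST_ARCH_ISA` and `--host-arch-isa` are proposals from this thread, not existing dpkg interfaces, and the precedence shown (option overrides environment) is an assumption made for the example. The function name `resolve_host_arch_isa` is hypothetical.

```python
import argparse
from typing import Mapping, Optional, Sequence


def resolve_host_arch_isa(
    argv: Sequence[str], environ: Mapping[str, str]
) -> Optional[str]:
    """Pick the ISA from --host-arch-isa, else DEB_HOST_ARCH_ISA, else None.

    Assumption: an explicit command-line option takes precedence over the
    environment variable. Unrelated arguments are ignored.
    """
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("--host-arch-isa", dest="isa", default=None)
    known, _unused = parser.parse_known_args(list(argv))
    if known.isa:
        return known.isa
    # Treat an empty variable the same as an unset one.
    return environ.get("DEB_HOST_ARCH_ISA") or None
```

A build driver following this sketch would call `resolve_host_arch_isa(sys.argv[1:], os.environ)` once and export the result to the tools it runs, so that a local optimized build needs no change to the installed toolchain.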

    &gt; So to summarise, here are the generic changes that I think need to be made<br>
    &gt; to src:dpkg to support variant ISAs as a thing:<br>
    &gt; <br>
    &gt;  * add get_host_arch_isa() to Dpkg::Arch<br>

    Yes (perhaps as mentioned below also just get_host_isa()).<br>

    &gt;  * dpkg-gencontrol records DEB_HOST_ARCH_ISA into DEBIAN/control as<br>
    &gt;    ArchitectureIsa<br>


    [continued in next message]

    --- SoupGate-Win32 v1.05
    * Origin: you cannot sedate... all the things you hate (1:229/2)
  • From Michael Hudson-Doyle@1:229/2 to Guillem Jover on Thu Sep 21 05:10:02 2023
    [continued from previous message]

        <a href="https://github.com/rpm-software-management/rpm/blob/master/rpmrc.in" rel="noreferrer" target="_blank">https://github.com/rpm-software-management/rpm/blob/master/rpmrc.in</a>)<br></blockquote><div><br></div><div>I agree that it is not completely clear what the best approach here is: do we change the defaults of gcc, or influence things via default buildflags?</div><div><br></div><div>I&#39;m sure there are packages that do not respect dpkg-buildflags during build, but the consequences of this do not seem all that serious -- such packages would not be optimized for the variant / ISA but if someone manages to notice this, they can fix the bug.</div><div><br></div><div>OTOH, having the compiler default change may be a bit of a surprise for people who build binaries for deployment not via Debian packages. (Do our compilers in general target the same baseline as Debian does for a given architecture?)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
     - Perhaps that&#39;s a limitation from the archive software side, but<br>
       requiring to place the binary packages in the same pool seems<br>
       rather restrictive (it forces different filenames for example).<br></blockquote><div><br></div><div>We are considering supporting multiple variant/ISAs in the primary Ubuntu archive, so if we get that far then yes, we want to have all the binary
    packages in the same pool. The first steps don&#39;t have to support this I guess.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
     - I guess it might be nice for the ISA to be passed down to the<br>
       dpkg tools, but I don&#39;t think this is strictly necessary? A<br>
       frontend like apt could also decide based on metadata in say the<br>
       Release file, although not having the actual installed package<br>
       metadata on whether it was a different ISA build or not would make<br>
       its job more inconvenient. In any case I don&#39;t have a big issue<br>
       with recording this via dpkg-gencontrol or similar if necessary.<br></blockquote><div><br></div><div>I agree, I don&#39;t think it&#39;s /strictly/ required that the target ISA is recorded in the deb. But I think adding a field for it reduces scope
    for confusion later.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    On the specific implementation details:<br>

     - Changing the Architecture format (as in adding colons there) seems<br>
       like a non-starter, and I expect that would break lots of things<br>
       (I mean it could be done but I&#39;m not sure it&#39;s worth it for this).<br>
       Recording this mostly as a hint than anything else, via another<br>
       field (if necessary at all) I think would be best.<br></blockquote><div><br></div><div>Agreed.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
     - As covered in previous discussions, dpkg could (but I don&#39;t think<br>
       it&#39;s necessary) check whether the .deb is runnable on the current<br>
       hw, but that&#39;s tricky as chrootless installs need to be taken<br>
       into account, etc. It should certainly not be part of dependency<br>
       resolution.<br></blockquote><div><br></div><div>I&#39;m sorry, what is a chrootless install? But I think I agree here too: tricky and just not really worth it.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
     - I&#39;m not fond of having to change the binary package name format<br>
       either for this (name_version_arch.deb) even if at least dpkg<br>
       itself does not care (but I know other tools do care), and<br>
       depending on the format I&#39;d expect things to break (this goes<br>
       back to the shared pool concern).<br></blockquote><div><br></div><div>I don&#39;t think this is avoidable in the long run. I must admit I have generally thought of the presence of the architecture name in the .deb file name to be more a convention
    than part of the format (and the &quot;real&quot; indication of a binary package&#39;s architecture is in DEBIAN/control).</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
     - If dpkg-architecture needs to be aware of this, then this might need<br>
       to be auto-detectable from just the current toolchain being used.<br></blockquote><div><br></div><div>So you are saying to configure a build environment for, say, x86-64-v3 you would configure gcc with --with-arch64=x86-64-v3 and then dpkg-architecture would parse the output of gcc -Q --help=target to set DEB_HOST_ARCH_VARIANT appropriately? (modulo mistakes in details) Or do you mean something else entirely?</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
    Some of the above problems could perhaps be avoided if we introduced<br>
    a concept of architecture aliases/ISAs (similar to what rpm has), which<br>
    would side-step the pool sharing issue, the binary package renaming,<br>
    etc. One big issue with this is that it requires dpkg to have an<br>
    exhaustive table of all such aliases, and if there&#39;s ever a new alias<br>
    added, old dpkg versions need to be updated or they will not understand<br>
    what they match with. So this does not seem ideal either. So I guess this<br>
    is a variation over your proposal, but perhaps this could still be used<br>
    in specific contexts, say only at build-time (but not for dependency<br>
    relationships), for repo management (say binary-arm64v9/Packages.xz),<br>
    or binary package names where the field would specify the actual name<br>
    for the filename, say:<br>

      Architecture: arm64<br>
      ArchitectureIsa: arm64v9<br>

    or maybe better:<br>

      Architecture: arm64<br>
      ArchitectureIsa: v9<br>

    resulting in dpkg-deb generating:<br>

      binpkg_1.0-1_arm64v9.deb<br>

    but targeting arm64.</blockquote><div><br></div><div>I&#39;m not sure but I think you have talked yourself into suggesting something very similar to my proposal here?</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;
    border-left:1px solid rgb(204,204,204);padding-left:1ex">I also think I prefer naming this explicitly as ISA<br>
    variants, if you will, than just architecture variants as that gives<br>
    way too much room</blockquote><div><br></div><div>Certainly I think all the interesting use cases are basically changing the set of instructions emitted by the toolchains by default. I suppose you could have a variant that changed the set of hardening
    flags or something but that doesn&#39;t seem an especially good idea. So I guess I&#39;d be happy with s/ArchitectureVariant/ArchitectureISA/ everywhere.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px
    solid rgb(204,204,204);padding-left:1ex"> (which perhaps we want, but then that has other<br>
    implications over compatibility), and for the field perhaps just Isa is<br>

    [continued in next message]
