• Re: Non-LLM example where we do not in practice use original training d

    From Ansgar 🙀@21:1/5 to Sam Hartman on Mon May 5 22:40:01 2025
    Hi,

    On Mon, 2025-05-05 at 14:27 -0600, Sam Hartman wrote:
    If I wanted to package up my classifier state and distribute it under a
    free software license, I think it should be DFSG free.
    I think that to satisfy the DFSG I would need to include all the
    training data I still had and any scripts I used.

    And the training data would have to be under a DFSG-free license. I
    doubt phishing or spam mail comes with proper licensing; even ham
    doesn't do this (what are the license terms of this mail?). So if you
    were required to include training data it wouldn't be possible even for
    fairly boring classifiers.

    Ansgar

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Sam Hartman@21:1/5 to All on Mon May 5 23:00:01 2025
    "Ansgar" == Ansgar 🙀 <[email protected]> writes:

    Ansgar> Hi,
    Ansgar> On Mon, 2025-05-05 at 14:27 -0600, Sam Hartman wrote:
    >> If I wanted to package up my classifier state and distribute it
    >> under a free software license, I think it should be DFSG free. I
    >> think that to satisfy the DFSG I would need to include all the
    >> training data I still had and any scripts I used.

    Ansgar> And the training data would have to be under a DFSG-free
    Ansgar> license. I doubt phishing or spam mail comes with proper
    Ansgar> licensing; even ham doesn't do this (what are the license
    Ansgar> terms of this mail?). So if you were required to include
    Ansgar> training data it wouldn't be possible even for fairly boring
    Ansgar> classifiers.

    Thank you. I should have caught that.
    I guess even under my proposed option, packaging the classifier might be
    tricky. If I deleted the training data and no longer had it, then I
    think under my option, the classifier could be DFSG free.
    If I retained the training data, then ftpmaster would need to decide
    whether I as upstream had a more preferred form of modification than
    the rest of the world. (My understanding is we approach upstreams with
    well-justified suspicion when they have source-like things that the rest
    of the world does not have, and I tried to capture that in my option.)

  • From Russ Allbery@21:1/5 to [email protected] on Mon May 5 23:10:01 2025
    Ansgar 🙀 <[email protected]> writes:
    On Mon, 2025-05-05 at 14:27 -0600, Sam Hartman wrote:

    If I wanted to package up my classifier state and distribute it under a
    free software license, I think it should be DFSG free. I think that to
    satisfy the DFSG I would need to include all the training data I still
    had and any scripts I used.

    And the training data would have to be under a DFSG-free license. I
    doubt phishing or spam mail comes with proper licensing; even ham
    doesn't do this (what are the license terms of this mail?). So if you
    were required to include training data it wouldn't be possible even for
    fairly boring classifiers.

    Debian is not required to be a distribution point for every type of
    software or database file that people have thought of. I don't believe
    that a Bayesian spam filter database trained in this way is DFSG-free, and
    I don't think it should be included in Debian main.

    That doesn't mean I think it's bad or immoral or anything like that. I
    have a database like that myself. :) It's simply not free software, and is
    outside the scope of what Debian is for. Not even all of Debian's own data
    is free software. For example, I would not consider the BTS database or
    the mailing list archives to be free software because the licensing status
    is not sufficiently clear.

    There is lots of useful software and data in the world that is not free
    software, and there are lots of other projects that can distribute it.

    Obviously, this is just my opinion, and I realize other people disagree.

    --
    Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>

  • From Russ Allbery@21:1/5 to Sam Hartman on Mon May 5 23:20:01 2025
    Sam Hartman <[email protected]> writes:

    I guess even under my proposed option, packaging the classifier might be
    tricky. If I deleted the training data and no longer had it, then I
    think under my option, the classifier could be DFSG free.

    I realize that we have made exceptions in the past for files where the
    original source has been lost so the form in which they exist, although
    quite clearly not the original source, is now de facto the preferred form
    of modification. (Sam may be thinking of the same PDF files that I'm
    thinking of.)

    However, I am very leery about extending that exception to cases where
    people are intentionally creating that situation by deleting the input
    data on purpose. It's one thing when the source has been lost but the
    output document is still important for historical purposes. I think that
    falls into the category of cases where humans are not computers and we do
    not have to blindly follow rules without considering their underlying
    purpose. But extending that case to cases where people are intentionally
    discarding the training data so that they don't have to produce it is a
    step too far down a slippery slope for me.

    I think we would think very hard before accepting a compiled binary into
    the archive whose original source had been lost, and would be very
    unlikely to accept a compiled binary where the original source was
    intentionally deleted so as to make it DFSG-free. The "preferred form of
    modification" test is not, to me, the only test for compliance with the
    DFSG. The point of the DFSG is more than to *only* put everyone on an
    equal footing. There is also a straightforward desire to actually have
    meaningful and useful source code, without which free software is kind of
    pointless.

    --
    Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>

  • From Stefano Zacchiroli@21:1/5 to Russ Allbery on Tue May 6 14:00:01 2025
    On Mon, May 05, 2025 at 02:13:58PM -0700, Russ Allbery wrote:
    However, I am very leery about extending that exception to cases where
    people are intentionally creating that situation by deleting the input
    data on purpose.

    I agree with you on this. I do wonder however where you would place the
    case where the training data is available (possibly: publicly
    available), and the model trainers would even want to distribute it, but
    cannot due to unclear licensing terms. Would you say that it is a "less
    nasty" case than that where training data is deleted on purpose, or
    would you consider it as bad?

    FWIW, in terms of free software ethics, I consider non-open data to be
    "less nasty" than non-free code. That's because with code we can take
    the activist approach of just rewriting it under a free software license
    (provided enough development resources are available). With non-open
    data, there are cases in which you cannot just recreate and release it
    under a free license, no matter how many resources you have.

    The ability to exploit non-open-data to serve the needs of free software
    (as it would be the case with DFSG-free models, trained on non-DFSG-free
    data) is something I hesitate giving up on.

    Cheers
    --
    Stefano Zacchiroli . [email protected] . https://upsilon.cc/zack
    Full professor of Computer Science, Télécom Paris, Polytechnic Institute of Paris
    Co-founder & CSO Software Heritage
    Mastodon: https://mastodon.xyz/@zacchiro

  • From Sam Hartman@21:1/5 to All on Tue May 6 16:10:01 2025
    "Stefano" == Stefano Zacchiroli <[email protected]> writes:

    Stefano> On Mon, May 05, 2025 at 02:13:58PM -0700, Russ Allbery wrote:
    >> However, I am very leery about extending that exception to cases
    >> where people are intentionally creating that situation by
    >> deleting the input data on purpose.

    Stefano> I agree with you on this.

    FWIW, I also agree. I let the rules lawyer side of me get ahead of
    everything else.
    I am collecting my thoughts and will put them together into a summary
    with my thinking updated based on Russ's input and Ansgar's input after
    I come up with something concise to say.
    I just wanted to respond to this point because in retrospect the
    position I took yesterday on intentionally deleting training data was
    not well-considered.

  • From Simon McVittie@21:1/5 to Stefano Zacchiroli on Tue May 6 16:40:01 2025
    On Tue, 06 May 2025 at 13:58:57 +0200, Stefano Zacchiroli wrote:
    FWIW, in terms of free software ethics, I consider non-open data to be
    "less nasty" than non-free code.

    Debian is unusual in the way we interpret our mission statement as
    extending to everything we distribute being Free, not just our
    executable code. Many other FOSS distributions apply the DFSG, the OSD,
    the FSF's guidelines or similar principles to executable code (only),
    and do not see a problem with having non-executable data that Debian
    would consider to be non-Free.

    Game assets are a prominent example: for example Debian puts alien-arena
    (a Free engine for a non-DFSG game) in contrib, and alien-arena-data
    (the non-DFSG textures, models, etc.) in non-free, whereas distributions
    like Fedora and Mageia put all of it in their equivalent of main.
    Non-Free documentation is another common example.

    I'm carefully avoiding saying "software" here because it's ambiguous
    whether that refers to executable code only, or to all works stored
    electronically; see also GRs 2004-003, 2004-004, 2006-004 which
    (somewhat) clarify that Debian is currently interpreting the DFSG as
    applying to all works, whether they are executable code or not.

    smcv

  • From Andrey Rakhmatullin@21:1/5 to Simon Josefsson on Tue May 6 17:40:01 2025
    On Tue, May 06, 2025 at 05:12:12PM +0200, Simon Josefsson wrote:
    FWIW, in terms of free software ethics, I consider non-open data to be
    "less nasty" than non-free code.

    Debian is unusual in the way we interpret our mission statement as
    extending to everything we distribute being Free, not just our
    executable code.

    I don't think Debian is perfectly consistent in applying that principle:
    for example, the text of the Developer Certificate of Origin (DCO) is
    included in Debian packages (in 'main') and has a clearly non-free
    license, and IIRC sometimes not even in debian/copyright.

    Same for the text of GPL.

    There are
    other examples historically of similar content too, e.g., IETF RFCs

    These are usually considered bugs.


    --
    WBR, wRAR

  • From Russ Allbery@21:1/5 to Andrey Rakhmatullin on Tue May 6 17:50:01 2025
    Andrey Rakhmatullin <[email protected]> writes:
    Simon Josefsson wrote:

    I don't think Debian is perfectly consistent in applying that
    principle: for example, the text of the Developer Certificate of
    Origin (DCO) is included in Debian packages (in 'main') and has a
    clearly non-free license, and IIRC sometimes not even in
    debian/copyright.

    Same for the text of GPL.

    License texts have always been a special exception, and I kind of wish we
    would amend the DFSG to make that clear. Not because I think the status of
    license texts is somehow in question, but because having one undeclared
    exception makes people think we should have other undeclared exceptions. I
    would much prefer to take the time to enumerate all of our major
    exceptions.

    This is probably the rules lawyer in me who likes having everything pinned
    down as well as we can.

    I do still want us to remain flexible around edge cases and interpret the
    DFSG as humans and not like a computer program, but licenses are a
    sufficiently obvious exception that I think we should ideally spell that
    out, along with anything else that's similarly substantial and common.

    --
    Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>

  • From Russ Allbery@21:1/5 to Stefano Zacchiroli on Tue May 6 17:40:01 2025
    Stefano Zacchiroli <[email protected]> writes:
    On Mon, May 05, 2025 at 02:13:58PM -0700, Russ Allbery wrote:

    However, I am very leery about extending that exception to cases where
    people are intentionally creating that situation by deleting the input
    data on purpose.

    I should say explicitly that I jumped quite a bit down a slippery slope in
    this reply to Sam to make a rhetorical point, and there is a VAST excluded
    middle between "training data available" and "training data intentionally
    deleted to avoid having to disclose it."

    In most cases, I suspect the real situation will be more that the training
    data was just unmanageably large and the people doing the training saw no
    reason to retain it because they considered it easier to, for instance,
    scrape the web again than to keep all the data on hand. This is Sam's
    point, as I understand it, and it's entirely valid *if* you believe that
    the DFSG provision for source code is primarily about putting everyone,
    including upstream, on an equal footing. I agree that if the training data
    was never kept or intended to be kept, upstream is clearly indicating that
    they don't consider it the "preferred form of modification" and they are
    on equal footing with everyone else.

    I do *not* consider putting everyone on equal footing to be the only or
    even the primary goal of the requirement to have source code. I am
    concerned about other ethical issues such as transparency and auditability
    that come with source code.

    I agree with you on this. I do wonder however where you would place the
    case where the training data is available (possibly: publicly
    available), and the model trainers would even want to distribute it, but
    cannot due to unclear licensing terms. Would you say that it is a "less
    nasty" case than that where training data is deleted on purpose, or
    would you consider it as bad?

    I think it's clearly less bad in some sense, in that there isn't the
    feeling of someone gaming the system and thus I'm less leery of their
    motives. This case is instead the far more familiar and typical case that
    free software encounters all the time: portions of the source are under
    unclear licenses and are not clearly DFSG-free.

    No one in those situations is doing anything wrong, in my opinion, but we
    still don't allow such software into Debian main because we are a free
    software distribution and that is not free software. There are other forms
    of good in the world besides free software, and I am very glad there are
    other organizations to pursue them, but I don't see the justification for
    Debian to expand its scope.

    FWIW, in terms of free software ethics, I consider non-open data to be
    "less nasty" than non-free code.

    I agree with this in terms of ethics, but I think they're equivalent in
    terms of what we put in Debian main.

    The ability to exploit non-open-data to serve the needs of free software
    (as it would be the case with DFSG-free models, trained on non-DFSG-free
    data) is something I hesitate giving up on.

    Well, first, I continue to object to the idea that a model can be
    DFSG-free if it's trained on non-DFSG-free data. I think that makes it
    definitionally non-free. (I have read Aigars's arguments to the contrary
    and do not find them at all persuasive.)

    But, more directly to your point, I agree with you, but I don't understand
    why this implies that it's necessary to put non-free data in Debian main.
    I can exploit all sorts of non-open data from my Debian computer by
    obtaining it from any number of other sources. I don't see the need for
    Debian to host it.

    --
    Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>

  • From Simon Josefsson@21:1/5 to Simon McVittie on Tue May 6 17:20:01 2025
    Simon McVittie <[email protected]> writes:

    On Tue, 06 May 2025 at 13:58:57 +0200, Stefano Zacchiroli wrote:
    FWIW, in terms of free software ethics, I consider non-open data to be
    "less nasty" than non-free code.

    Debian is unusual in the way we interpret our mission statement as
    extending to everything we distribute being Free, not just our
    executable code.

    I don't think Debian is perfectly consistent in applying that principle:
    for example, the text of the Developer Certificate of Origin (DCO) is
    included in Debian packages (in 'main') and has a clearly non-free
    license, and IIRC sometimes not even in debian/copyright. There are
    other examples historically of similar content too, e.g., IETF RFCs and
    Unicode tables. So I think there is some implicit acknowledgement from
    Debian's (in)actions that non-open data like documentation or firmware
    is not as evil as non-free code. Extending that approach to AI models
    isn't unreasonable, although I would prefer if we didn't.

    /Simon

    Developer Certificate of Origin
    Version 1.1

    Copyright (C) 2004, 2006 The Linux Foundation and its contributors.

    Everyone is permitted to copy and distribute verbatim copies of this
    license document, but changing it is not allowed.


    Developer's Certificate of Origin 1.1

    By making a contribution to this project, I certify that:

    (a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

    (b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

    (c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

    (d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

  • From Simon McVittie@21:1/5 to Simon Josefsson on Tue May 6 19:10:01 2025
    On Tue, 06 May 2025 at 17:12:12 +0200, Simon Josefsson wrote:
    Simon McVittie <[email protected]> writes:
    Debian is unusual in the way we interpret our mission statement as
    extending to everything we distribute being Free, not just our
    executable code.

    I don't think Debian is perfectly consistent in applying that principle:
    for example, the text of the Developer Certificate of Origin (DCO) is
    included in Debian packages (in 'main') and has a clearly non-free
    license, and IIRC sometimes not even in debian/copyright.

    I think this is seen as part of making a wider exception for licenses
    and similar legal texts. The text of the GPL has similar wording, and we
    certainly can't be a legally-valid Linux distribution without shipping a
    copy of that.

    There are
    other examples historically of similar content too, e.g., IETF RFCs and
    Unicode tables.

    I believe project policy is currently that IETF RFCs under the "old"
    non-Free license, even if they are only in source packages, are a DFSG
    violation (although I'm sure there are plenty of undiscovered bugs for
    this topic, and I have no particular interest in proactively looking for
    them).

    My understanding is that Unicode tables are under a Free license, and at
    least some packages go to significant effort to ensure that we have a
    preferred form for modification for their content (for example src:glib2.0
    has a copy of the subset of unicode-data necessary to regenerate its
    internal tables, because the version of Unicode that it implements is
    part of its external API and should not be arbitrarily changed
    downstream, but we cannot guarantee that it will always be perfectly in
    sync with the current version of src:unicode-data in Debian).

    smcv

  • From Simon Josefsson@21:1/5 to Russ Allbery on Tue May 6 22:50:01 2025
    Russ Allbery <[email protected]> writes:

    Andrey Rakhmatullin <[email protected]> writes:
    Simon Josefsson wrote:

    I don't think Debian is perfectly consistent in applying that
    principle: for example, the text of the Developer Certificate of
    Origin (DCO) is included in Debian packages (in 'main') and has a
    clearly non-free license, and IIRC sometimes not even in
    debian/copyright.

    Same for the text of GPL.

    License texts have always been a special exception, and I kind of wish we
    would amend the DFSG to make that clear. Not because I think the status of
    license texts is somehow in question, but because having one undeclared
    exception makes people think we should have other undeclared exceptions. I
    would much prefer to take the time to enumerate all of our major
    exceptions.

    Are you suggesting that the DCO is a license text that has to be part of
    the licensing information for a piece of work, and mentioned in
    debian/copyright? Clarifying that would be good. To me the DCO reads
    as information intended for contributors to some project, to govern
    policies around contributions, and I don't find it particularly relevant
    for debian/copyright in the same way I don't find CLA texts relevant for
    inclusion. I think the DCO content is similar to some of the
    Non-Variant GFDL sections rejected by general principle by Debian today,
    explaining matters related to contributing to a project.

    To me this handling feels like having a generic rule ("no non-free
    texts") and applying the rule in different ways ("DCO text" vs
    "Non-Variant GFDL text") depending on what outcome we want. I find that
    worse than having a (known incomplete) number of specific rules that are
    applied consistently.

    /Simon

  • From Russ Allbery@21:1/5 to Simon Josefsson on Tue May 6 23:10:01 2025
    Simon Josefsson <[email protected]> writes:

    Are you suggesting that the DCO is a license text that has to be part of
    the licensing information for a piece of work, and mentioned in
    debian/copyright?

    It struck me as a similar sort of thing on first glance, so I used your
    comment to make a broader point, but it sounds like this is a more
    complicated situation that I don't know anything about. I haven't done any
    research here, so I have no opinion about whether the DCO should be
    included in Debian packages or not.

    My general point is that we will need to have some exceptions for various
    reasons, and I'd rather document them explicitly in the DFSG rather than
    having a set of well-understood exceptions within Debian that aren't
    recorded where people would expect that information to be. I think that's
    in general alignment with your point.

    I will make the general comment that I think it's reasonable to care more
    about code or data that is integral to the functionality of something we
    package and less about ancillary files that aren't particularly important
    to the normal functioning of the package (such as files that exist only in
    source packages and don't contribute to the binary package). Bad licensing
    for the latter is still a bug, to be clear, but the centrality of the code
    or data to functionality does affect my opinion about the severity of the
    bug (unless, of course, we would get into legal trouble for distributing
    it at all).

    Upstreams put all sorts of weird things in source packages, so there will
    be a steady stream of bugs about files with odd licensing. We should
    strive to fix them all, but I would prioritize ones that affect the code
    that users run or the documentation that they read. Those are more central
    to what we're trying to accomplish as a project. That's why I'm
    particularly interested in the source for AI models and less worried about
    whether we manage to find and excise every poorly-licensed RFC in a Debian
    source package, although I agree that the presence of the latter is a bug
    and we should fix those bugs (but perhaps less urgently than fixing bugs
    where the code itself is not free).

    --
    Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>

  • From Marco d'Itri@21:1/5 to [email protected] on Wed May 7 00:30:01 2025
    [email protected] wrote:

    Debian is unusual in the way we interpret our mission statement as
    extending to everything we distribute being Free, not just our
    executable code. Many other FOSS distributions apply the DFSG, the OSD,
    the FSF's guidelines or similar principles to executable code (only),
    and do not see a problem with having non-executable data that Debian
    would consider to be non-Free.
    I have been a Debian developer for almost 30 years, and I remember that
    when I joined the project we had no plans to apply the DFSG to e.g.
    documentation.
    Then the "editorial changes" (not) GR happened, and some people were
    very surprised by the practical outcome.

    --
    ciao,
    Marco

  • From Clint Adams@21:1/5 to Russ Allbery on Wed May 7 02:10:01 2025
    On Tue, May 06, 2025 at 08:36:50AM -0700, Russ Allbery wrote:
    Well, first, I continue to object to the idea that a model can be
    DFSG-free if it's trained on non-DFSG-free data. I think that makes it
    definitionally non-free. (I have read Aigars's arguments to the contrary
    and do not find them at all persuasive.)

    We appear to have plenty of pre-trained models, apparently trained on
    non-DFSG-free data, in main right now, which strikes me as a violation
    of our current "preferred form of modification" rule.

  • From Russ Allbery@21:1/5 to Clint Adams on Wed May 7 03:00:02 2025
    Clint Adams <[email protected]> writes:
    On Tue, May 06, 2025 at 08:36:50AM -0700, Russ Allbery wrote:

    Well, first, I continue to object to the idea that a model can be
    DFSG-free if it's trained on non-DFSG-free data. I think that makes it
    definitionally non-free. (I have read Aigars's arguments to the
    contrary and do not find them at all persuasive.)

    We appear to have plenty of pre-trained models, apparently trained on non-DFSG-free data, in main right now, which strikes me as a violation
    of our current "preferred form of modification" rule.

    Yes. That's the conclusion I've arrived at as well, after thinking about
    this over the course of the discussion, although I suppose it's also an argument that I'm thinking about this wrong and the current status quo is
    fine.

    This is not something we've paid a lot of attention to, and I think we've defaulted to accepting stuff that claims to be under DFSG licenses. That's certainly what I did with gnubg when I was maintaining it. I never really thought about this issue. That means there are some practical problems
    with changing the de facto policy.

    I think if any of the options in the current GR except Aigars's (and maybe Sam's?) passes, that would effectively be a change in our current policy,
    even if the current policy is not precisely intentional. Personally, I
    think it would bring us back closer in line with our principles, but that doesn't make the practical problems go away. Right now is a really bad
    time to change our policies. Whatever we do, I don't think we should try
    to change anything before the release. Shortly after the release would be
    a much better time, so that we have some time to sort this out.

    We have previously had a lot of problems with the implementation of
    changes (either via GR or via delegate interpretation) from a de facto licensing policy and had to thrash out the implications with multiple GRs
    [1], which isn't very fun. I'm not sure that we've thought through the implications of this proposed change yet, and I'm not sure that we have a
    plan. The plan doesn't need to be in the GR, but I'd feel more comfortable
    if we had a list of affected packages and some idea of what we're going to
    do with them.

    That means I'm not sure how to vote on the current proposal as it
    currently stands. I'd rather not have to do a second GR just to clarify
    the timing of implementation given the upcoming release. Maybe there's
    still time to address that directly? Also, there is absolutely nothing
    wrong with temporarily withdrawing a GR (or even having it fail because we didn't realize in previous debian-project discussions that we'd not fully worked through the idea yet), and then bringing it back up when we have an implementation plan.

    [1] https://www.debian.org/vote/2004/vote_003
    https://www.debian.org/vote/2004/vote_004
    https://www.debian.org/vote/2006/vote_007
    https://www.debian.org/vote/2008/vote_003

    --
    Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>

  • From Simon Josefsson@21:1/5 to Aigars Mahinovs on Wed May 7 11:40:01 2025
    Aigars Mahinovs <[email protected]> writes:

    On Wed, 7 May 2025 at 02:56, Russ Allbery <[email protected]> wrote:


    I think if any of the options in the current GR except Aigars's (and maybe Sam's?) passes, that would effectively be a change in our current policy,
    even if the current policy is not precisely intentional.


    IMHO my option will also be a change in our current policy, but, instead of requiring the training data itself, my option would just require adding a documentation section describing how to create/gather and process data required to train such models *if* someone would want to reproduce them.

    Would failure for anyone else to be able to reproduce them be a RC bug?

    Do the tools required for reproducing the model have to be in Debian
    main, or are non-free or external proprietary tools okay?

    Does the toolchain for LLM models support bit-by-bit reproducible outputs?

    Is a Build-Depends on such an LLM model acceptable? Then we could
    eventually replace the source code for `sudo` in Debian with an LLM
    prompt like "write me a secure replacement for sudo and output an
    executable ELF binary for my host architecture". In fact, with a bit
    more irony, we could replace a lot of insecure source code this way.

    I'm not convinced this approach leads to something desirable. I fear it
    means people will have yet another way to add proprietary content into
    Debian, and that Debian will give up further on caring about user freedom.
    But this is already the case, so I feel at a loss as to how to use this
    argument.

    /Simon

  • From Simon Josefsson@21:1/5 to Aigars Mahinovs on Wed May 7 14:30:01 2025
    Aigars Mahinovs <[email protected]> writes:

    (While I find the tone of the email a bit exasperated, I will try to
    reply factually and I hope it will be received as such.)

    Thanks for answers! Surprisingly I now find myself agreeing that your
    approach is reasonable and is consistent with existing Debian practices.
    I just wish that the existing practices were more libre and more
    consistent with documented policies, but I also think this is not the
    popular opinion.

    /Simon

  • From Clint Adams@21:1/5 to Simon Josefsson on Wed May 7 16:10:01 2025
    On Wed, May 07, 2025 at 02:20:44PM +0200, Simon Josefsson wrote:
    Thanks for answers! Surprisingly I now find myself agreeing that your
    approach is reasonable and is consistent with existing Debian practices.
    I just wish that the existing practices were more libre and more
    consistent with documented policies, but I also think this is not the
    popular opinion.

    So, let's delve deeper on the practical impact of such consistency
    or not. Let's say we have a hypothetical package called
    gnipgnop-rattrap. It's an accessibility tool which tracks elements
    of your face using pretrained Haar cascade classifier models, and
    based on where you look, moves the "mouse" pointer. The models
    we ship it with have been trained solely on 75 gigabytes of images
    captured from Disney films, which are not available anywhere
    because the people who trained the models are afraid of being sued.

    What should Debian do? Remove the package from the archive so no
    one can use it? Patch it to download the models from a random
    URL which may or may not be accessible? Construct 75 gigabytes of
    DFSG-free annotated training data to stuff into the source package?

  • From Simon McVittie@21:1/5 to Marco d'Itri on Wed May 7 16:20:01 2025
    On Tue, 06 May 2025 at 22:10:28 -0000, Marco d'Itri wrote:
    [email protected] wrote:
    Debian is unusual in the way we interpret our mission statement as
    extending to everything we distribute being Free, not just our
    executable code. Many other FOSS distributions apply the DFSG, the OSD,
    the FSF's guidelines or similar principles to executable code (only),
    and do not see a problem with having non-executable data that Debian
    would consider to be non-Free.

    I have been a Debian developer for almost 30 years, and I remember that
    when I joined the project we had no plans to apply the DFSG to e.g.
    documentation.
    Then the "editorial changes" (not) GR happened, and some people were
    very surprised by the practical outcome.

    Yes, I didn't mean to imply that I think our interpretation is
    necessarily the one that brings most benefit to our users and Free
    Software, only that it's the one that the project enforces.

    I personally think there's a risk that we put too much emphasis on
    following the chain of "true" source code to justifiable but impractical conclusions, at the expense of our ability to spend developers' finite
    time and motivation on making our distribution better; but I can't claim
    to have assessed whether this is a position that has consensus.

    I also think we spend too much time thinking about "*the* preferred form
    for modification" when it's sometimes more appropriate to be looking for
    "*a* preferred form for modification" or even just "a form that would be reasonable to modify as a way to exercise your Free Software rights".
    Questions that have a clear answer for typical C/C++ source code do not
    always have an equally clear answer for other digital works.

    Hopefully there is room for some sort of nuance and cost/benefit
    analysis beyond "if you and your upstream do not both meet the demands
    of the most DFSG-maximalist developer, then your work will be thrown
    away", which is unlikely to be good for anyone's state of mind
    (certainly it isn't good for mine).

    smcv

  • From Simon Josefsson@21:1/5 to Clint Adams on Wed May 7 18:20:01 2025
    Clint Adams <[email protected]> writes:

    On Wed, May 07, 2025 at 02:20:44PM +0200, Simon Josefsson wrote:
    Thanks for answers! Surprisingly I now find myself agreeing that your
    approach is reasonable and is consistent with existing Debian practices.
    I just wish that the existing practices were more libre and more
    consistent with documented policies, but I also think this is not the
    popular opinion.

    So, let's delve deeper on the practical impact of such consistency
    or not. Let's say we have a hypothetical package called
    gnipgnop-rattrap. It's an accessibility tool which tracks elements
    of your face using pretrained Haar cascade classifier models, and
    based on where you look, moves the "mouse" pointer. The models
    we ship it with have been trained solely on 75 gigabytes of images
    captured from Disney films, which are not available anywhere
    because the people who trained the models are afraid of being sued.

    What should Debian do? Remove the package from the archive so no
    one can use it? Patch it to download the models from a random
    URL which may or may not be accessible? Construct 75 gigabytes of
    DFSG-free annotated training data to stuff into the source package?

    Doesn't Aigars' reply answer that? Assuming it wins the vote.

    https://lists.debian.org/debian-vote/2025/05/msg00075.html

    My reading is that if it is possible for a skilled person to re-create
    an equivalent model following some description, under Aigars' proposal,
    it would be permissible to have gnipgnop-rattrap in Debian main,
    including the model trained on 75 gigabytes of Disney films.

    That is not my preference nor what I would want to see happen, but I
    think it is consistent with how Debian approaches including non-free
    firmware in the official installer images, and how Debian approaches
    licensing on other non-source files inside packages.

    /Simon

  • From Simon Josefsson@21:1/5 to Clint Adams on Wed May 7 19:40:01 2025
    Clint Adams <[email protected]> writes:

    On Wed, May 07, 2025 at 06:10:37PM +0200, Simon Josefsson wrote:
    That is not my preference nor what I would want to see happen, but I
    think it is consistent with how Debian approaches including non-free
    firmware in the official installer images, and how Debian approaches
    licensing on other non-source files inside packages.

    So what is your preference and what would you want to see happen?
    I ask because I see no good options here. I am thinking about
    this from the perspective of a user who wants to use the models
    unmodified and from the perspective of a user who wants to
    modify the models to work better with a face that the models
    "consider" an outlier.

    I think Thorsten Glaser's proposal on the surface looks more in line
    with what I would want to see, but I don't think we understand the full implications of any of the proposals right now.

    https://lists.debian.org/debian-vote/2025/04/msg00118.html

    Some approach to have LLM tools in 'main' when they can work with models
    that would be appropriate for inclusion in 'main' seems fine to me.
    Then we can ship models for that tool in 'non-free', for people who want
    to work with some larger model. I don't see a need to permit LLM tools
    in 'main' that are unable to work with any libre LLM model, those tools
    could go into 'contrib'.

    /Simon

  • From Clint Adams@21:1/5 to Simon Josefsson on Wed May 7 19:30:02 2025
    On Wed, May 07, 2025 at 06:10:37PM +0200, Simon Josefsson wrote:
    That is not my preference nor what I would want to see happen, but I
    think it is consistent with how Debian approaches including non-free
    firmware in the official installer images, and how Debian approaches
    licensing on other non-source files inside packages.

    So what is your preference and what would you want to see happen?
    I ask because I see no good options here. I am thinking about
    this from the perspective of a user who wants to use the models
    unmodified and from the perspective of a user who wants to
    modify the models to work better with a face that the models
    "consider" an outlier.

  • From Clint Adams@21:1/5 to Simon Josefsson on Wed May 7 20:20:01 2025
    On Wed, May 07, 2025 at 07:36:13PM +0200, Simon Josefsson wrote:
    I think Thorsten Glaser's proposal on the surface looks more in line
    with what I would want to see, but I don't think we understand the full
    implications of any of the proposals right now.

    https://lists.debian.org/debian-vote/2025/04/msg00118.html

    Some approach to have LLM tools in 'main' when they can work with models
    that would be appropriate for inclusion in 'main' seems fine to me.
    Then we can ship models for that tool in 'non-free', for people who want
    to work with some larger model. I don't see a need to permit LLM tools
    in 'main' that are unable to work with any libre LLM model, those tools
    could go into 'contrib'.

    I'm not talking about LLMs. I barely care about LLMs more than
    I care about megahal. My fictitious example is based on an
    actual problem we have with computer vision classifier models.
    So, if the models which already exist in main cannot exist in
    main (either under current policy, Thorsten's proposal, or Mo's
    proposal), are we going to pretend that they're not there? Are
    we going to throw them out? Both those things are bad for our
    users. Are we going to fix them instead? If so, how? Thorsten's
    §5 handwaves implementation details away, but I think that this is
    actually very important, and I don't even need face detection
    technology to use my computer.

  • From Soren Stoutner@21:1/5 to All on Wed May 7 15:57:42 2025
    On Tuesday, May 6, 2025 2:07:29 PM Mountain Standard Time Russ Allbery wrote:
    My general point is that we will need to have some exceptions for various reasons, and I'd rather document them explicitly in the DFSG rather than having a set of well-understood exceptions within Debian that aren't
    recorded where people would expect that information to be.

    I second this. I think we ought to be explicit about what types of exceptions we allow.


    --
    Soren Stoutner
    [email protected]
  • From Stefano Zacchiroli@21:1/5 to Russ Allbery on Thu May 8 16:40:01 2025
    On Tue, May 06, 2025 at 08:36:50AM -0700, Russ Allbery wrote:
    But, more directly to your point, I agree with you, but I don't understand why this implies that it's necessary to put non-free data in Debian main.
    I can exploit all sorts of non-open data from my Debian computer by
    obtaining it from any number of other sources. I don't see the need for Debian to host it.

    On Wed, May 07, 2025 at 05:23:21PM +0000, Clint Adams wrote:
    So what is your preference and what would you want to see happen?
    I ask because I see no good options here. I am thinking about
    this from the perspective of a user who wants to use the models
    unmodified and from the perspective of a user who wants to
    modify the models to work better with a face that the models
    "consider" an outlier.

    What I strongly suspect would happen, if proposal A wins (which I also
    consider quite likely) is that Debian maintainers of free software
    products that use trained ML models that lack DFSG-free training data,
    will have to go down the rabbit hole of patching that software to
    systematically download the models on first use. Or just give up on
    maintaining those packages, of course.

    Answering Russ upthread, I understand very well how such a situation
    will make us Debian people feel good, because we are not hosting it.
    But I fail to see how this helps in delivering software freedom to our
    users. First, they will have the models in question anyway, probably
    automatically, so we will really not be "protecting" them from this evil
    OSAID-but-not-DFSG-free stuff. (Or are we going to rule that free
    software that does this cannot be in main too?)

    Second, it will be more work for our maintainers, and deliver an overall
    worse experience in terms of security, mirroring, etc.

    Finally, we will also be making things harder for people that are fine
    with the limited modifications that are possible without the training
    data (e.g., fine tuning) as they will not be able to find the full
    sources (that are enough for their needs) within the Debian archive.

    Cheers
    --
    Stefano Zacchiroli . [email protected] . https://upsilon.cc/zack
    Full professor of Computer Science, Télécom Paris, Polytechnic Institute of Paris
    Co-founder & CSO, Software Heritage
    Mastodon: https://mastodon.xyz/@zacchiro

  • From Sam Hartman@21:1/5 to All on Thu May 8 18:20:01 2025
    "Stefano" == Stefano Zacchiroli <[email protected]> writes:


    Stefano> What I strongly suspect would happen, if proposal A wins
    Stefano> (which I also consider quite likely) is that Debian
    Stefano> maintainers of free software products that use trained ML
    Stefano> models that lack DFSG-free training data, will have to go
    Stefano> down the rabbit hole of patching that software to
    Stefano> systematically download the models on first use. Or just
    Stefano> give up on maintaining those packages, of course.

    For me this would give up on one of the big benefits of Debian.
    Debian is mostly self-contained.
    If I can restrict myself to things in the archive, I can throw a full
    Debian mirror into environments where I cannot reach the internet and
    mostly get very good results.
    The more Debian moves to a model where it encourages downloading
    non-mirrored artifacts, the harder that use case becomes.


    I don't care whether the artifacts I need are in main. I would be fine
    with another archive section.
    But I suspect you are right and rather than going through that
    complexity, especially since stuff in main cannot even recommend outside
    of main, they will download because it provides a better experience than
    trying to support a model data package in non-free.

    It will also enhance challenges when versions of software in stable want
    to use models that are not in the places they used to be.

  • From Russ Allbery@21:1/5 to Stefano Zacchiroli on Thu May 8 18:50:01 2025
    Stefano Zacchiroli <[email protected]> writes:

    Answering Russ upthread, I understand very well how such a situation
    will make us Debian people feel good, because we are not hosting it. But
    I fail to see how this helps in delivering software freedom to our
    users. First, they will have the models in question anyway, probably
    automatically, so we will really not be "protecting" them from this evil
    OSAID-but-not-DFSG-free stuff. (Or are we going to rule that free
    software that does this cannot be in main too?)

    Second, it will be more work for our maintainers, and deliver an overall worse experience in terms of security, mirroring, etc.

    Finally, we will also be making things harder for people that are fine
    with the limited modifications that are possible without the training
    data (e.g., fine tuning) as they will not be able to find the full
    sources (that are enough for their needs) within the Debian archive.

    But these are all arguments for merging non-free, or at the very least non-free-firmware, into main.

    There have always been good arguments for that. The proprietary NVIDIA
    drivers are quite important for people to be able to use their computer properly (thankfully hopefully becoming less so over time, but
    historically computers with NVIDIA graphics cards were nearly unusable
    without them, and they're still quite important for a lot of computing applications), and many of our users did not appreciate us "protecting"
    them from the drivers. If our primary goal was to make the most convenient distribution possible for our users, I think we would selectively include
    the most important non-free packages in main. It would be a better and
    more integrated user experience.

    I don't understand why machine learning models are any different. Or,
    rather, I understand why they're different to people who truly believe
    they really are free software. That argument makes sense to me; I just
    don't agree with it. But I don't understand the argument if one agrees
    that models without training data are non-free.

    Maybe the answer is that they're just too useful to the distribution to
    not package regardless of our opinions about whether they're free
    software. User experience and free software principles *are* often in
    tension and it's fine for us to shift that balance, in my opinion. But I
    guess I would have expected us to do that via a mechanism similar to non-free-firmware if we wanted to make it easy for users to use software
    that is OSAID-approved but not DFSG-free, at least if we have a lot of it.

    --
    Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>

  • From Clint Adams@21:1/5 to Stefano Zacchiroli on Thu May 8 20:00:01 2025
    On Thu, May 08, 2025 at 05:38:52PM +0300, Stefano Zacchiroli wrote:
    What I strongly suspect would happen, if proposal A wins (which I also
    consider quite likely) is that Debian maintainers of free software
    products that use trained ML models that lack DFSG-free training data,
    will have to go down the rabbit hole of patching that software to
    systematically download the models on first use. Or just give up on
    maintaining those packages, of course.

    That seems like widespread failure to me, but I'm still hoping
    that someone who supports either Mo's or Thorsten's proposal will
    articulate a better vision.

    On Thu, May 08, 2025 at 09:42:25AM -0700, Russ Allbery wrote:
    I don't understand why machine learning models are any different. Or,
    rather, I understand why they're different to people who truly believe
    they really are free software. That argument makes sense to me; I just
    don't agree with it. But I don't understand the argument if one agrees
    that models without training data are non-free.

    I'm not sure that these are quite the right terms. This email itself
    is non-free software, but if Sam wants to train some kind of deep
    learning model on it and release the model, without training data,
    under the Expat license, I definitely would not refer to the model
    as non-free. Would I prefer that copyright law be abolished and
    there be no impediments to providing the training data as well?
    Of course I would. But, absent that, there would be no way for Sam
    to distribute the training data as free software.

    To free some non-free firmware, in theory, the copyright holders
    just need to be motivated enough to do it. To free Sam's
    hypothetical email corpus, you would have to convince every single
    email author, including the spammers, to relicense. One of them
    is more of a pipe dream than the other.

    Maybe the answer is that they're just too useful to the distribution to
    not package regardless of our opinions about whether they're free
    software. User experience and free software principles *are* often in
    tension and it's fine for us to shift that balance, in my opinion. But I
    guess I would have expected us to do that via a mechanism similar to
    non-free-firmware if we wanted to make it easy for users to use software
    that is OSAID-approved but not DFSG-free, at least if we have a lot of it.

    Maybe that is what we should be doing; I'm not sure.

  • From Russ Allbery@21:1/5 to Clint Adams on Thu May 8 20:50:01 2025
    Clint Adams <[email protected]> writes:

    I'm not sure that these are quite the right terms. This email itself
    is non-free software, but if Sam wants to train some kind of deep
    learning model on it and release the model, without training data,
    under the Expat license, I definitely would not refer to the model
    as non-free. Would I prefer that copyright law be abolished and
    there be no impediments to providing the training data as well?
    Of course I would. But, absent that, there would be no way for Sam
    to distribute the training data as free software.

    I'm not sure that I agree that it would be great if copyright law were
    abolished. I think it's deeply flawed and I can certainly imagine
    different legal structures for achieving some of the same goals that I
    think would be superior, but right now, for all of its many problems,
    copyright law is one of the few tools that we have for consent. One of my
    problems with the stance that Aigars has summarized (not his fault -- it's
    a common view) is its premise that consent should not be required to train
    models.

    I think your point is that someone training a Bayesian filter on my email
    messages should not require my consent. My views on that are more
    complicated. I think there are circumstances when it shouldn't require
    that consent and circumstances when it should, and it's a tricky moral
    question that, for me, is heavily influenced by how the model is used.

    But let me slide down the slippery slope a bit farther and present a case
    that I think is a natural extension of that position. Suppose that instead
    of training a Bayesian spam filter on a bunch of mail messages without
    explicit consent, someone instead gathered every email message that I had
    ever sent to a public mailing list and used them to train an LLM to
    impersonate me.

    I don't think someone should be allowed to do that without my consent.
    Right now, the tool I have for expressing that consent is based on
    copyright law, for better or worse.

    Now, there is a pretty good argument here that copyright law is the wrong
    tool to prevent that and we should have other laws that tackle that
    directly without relying on copyright law, such as the laws now being
    passed to prohibit "nudification" image transformation models. And I
    would agree! But those laws largely don't exist right now and copyright
    law does, and until someone fixes the problem in some other way, I don't
    want to give up the protection that I may still have, even if it's murky
    and contingent.

    This is about larger questions of morality and law, but what I would say
    about Debian's rules specifically is that we should have some obligation
    to behave ethically. That's going to mean different things to different
    people, and we quite rightly don't incorporate in our foundation documents
    ethical principles beyond the scope of free software. But I still have my
    personal ethics and those will guide my vote on questions of what ethics
    Debian should adopt around free software.

    I think using other people's work without their consent is sometimes
    unethical. It depends a *lot* on the circumstances to me, but I think
    machine learning models, and LLMs and image manipulation models in
    particular, have opened new frontiers for unethical things that can be
    done using other people's work.

    This is not equivalent to the existing human capability to do the same
    thing manually precisely because the whole point of writing computer
    programs to do something is that you can do that thing cheaply and at
    scale. Some other human being can, today, study my writing style and try
    to impersonate me, and I can't stop that with copyright law. I understand
    that. But also this is hard and manual and it's very difficult for someone
    to keep that up at length. An LLM trained on my writing can potentially
    impersonate me trivially and extensively, for essentially free.

    Debian's free software principles cannot solve all, or even most, problems
    in this world. But I think they are both directly relevant and rather good
    at addressing at least Debian's involvement in this sort of activity.
    Applying free software rules to training data is a bit of a heavy hammer
    and maybe it's too much, but it does hold an ethical line about consent
    that I think we should hold. Maybe there's a different way to hold that
    line, and I'm open to being convinced by a different approach, but I don't
    want to give up this ethical line completely.

    --
    Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>

  • From Sam Hartman@21:1/5 to All on Thu May 8 21:40:01 2025
    "Clint" == Clint Adams <[email protected]> writes:


    >> Maybe the answer is that they're just too useful to the
    >> distribution to not package regardless of our opinions about
    >> whether they're free software. User experience and free software
    >> principles *are* often in tension and it's fine for us to shift
    >> that balance, in my opinion. But I guess I would have expected us
    >> to do that via a mechanism similar to non-free-firmware if we
    >> wanted to make it easy for users to use software that is
    >> OSAID-approved but not DFSG-free, at least if we have a lot of
    >> it.

    Clint> Maybe that is what we should be doing; I'm not sure.

    I'd support this, especially if

    1) the name of the section did not make a negative judgment about using
    it. Between the text in the social contract and the name non-free, we
    come across as making a judgment against using non-free software. People
    who do that are tolerated; we recognize that their needs exist, but we
    hope for a world where they are just able to use free software. We have
    (or had) programs like vrms to encourage people to use only free
    software. But under Russ's reasoning at least, I will never get a free
    spam classifier, or a free writing assistant. I'm fine with that but not
    okay with making the same judgment against wanting spam classifiers or
    writing assistants that we do make about non-free software.

    2) Providing some mechanism (allowing recommends is the obvious solution
    to me) so that programs in main can get their model data without having
    to download it from non-Debian sources. Being able to have a complete
    system with things like spam classifiers, OCR, text to speech, only
    given a Debian mirror is very important to me.

  • From Russ Allbery@21:1/5 to Aigars Mahinovs on Fri May 9 19:20:01 2025
    Aigars Mahinovs <[email protected]> writes:

    Just because something can be done cheaper or at scale with the help of
    automation does not make the method of automation morally wrong. See
    torrent, see mass manufacturing techniques that allow factories in China
    to make millions of knock-offs of known toys.

    I'm sorry, I flatly and completely disagree with this as a general
    statement. There are indeed some things that do become wrong because they
    are done at scale.

    This is part of what it means to live in a society: we have to balance
    good and harm and put some thoughtful rules in place around what part of
    someone's work becomes fair use and what part of someone's work remains
    under their control. Those are necessary compromises within our current
    economic and political system if we want people to be able to afford to
    make new work, if we want to avoid fraud and misrepresentation, and if we
    want to respect the human dignity of artists and their right to be
    associated with their work and to *not* be associated with things that
    are *not* their work.

    I am extremely sympathetic to the argument that copyright as currently
    designed does not succeed in balancing these factors correctly. It
    certainly has a wealth of problems. But you will never have my support for
    simply breaking it and to hell with the consequences and anyone who gets
    hurt in the process. If you want to replace the foundation of a building,
    you need to build the new foundation first, not just knock down all the
    support pillars and then blame the building for failing to remain
    standing.

    Here we have a *monumental* movement in the development of both software
    and the entire copyright landscape as a whole - a movement that could,
    finally, permanently wound the corporate silos keeping the lid on the
    boiling pot of human knowledge. We finally have a legal tool that could
    free all that knowledge that is currently locked behind copyright walls
    and make it available for everyone to use freely and automatically.

    This position is hopelessly, almost cartoonishly naive. If you think that
    copyright is primarily protecting corporations from you, rather than the
    other way around, then you have completely misunderstood the entire
    history of copyright law, not to mention basic facts about how power works
    in a society. If there is no organized force based on moral principles in
    place to force a balance, the most powerful entities in society will crush
    anyone who opposes them.

    Corporations absolutely do abuse copyright law to their own ends, just
    like corporations attempt to abuse every other law to their own ends
    because they are fundamentally amoral entities. But just because a law is
    abused doesn't mean that the underlying principle is entirely erroneous
    and should be discarded.

    If you want to make copyright law more friendly to individual people and
    more hostile to corporations, you have my support. If you have some other
    non-copyright system that will protect the moral rights of artists and
    their ability to negotiate for food and shelter in exchange for their work
    because we unfortunately still live in a capitalist society, I will listen
    to your argument. If you want to replace capitalism with some different
    system that no longer requires people to barter their art for the means of
    existence, I'm right there with you, but I hope you're aware that it's
    going to be a lot of work.

    If you just want to smash everything that is abused by someone you don't
    like and to hell with anyone who gets hurt in the process, I think your
    politics are an active danger to people I care about and I will do what I
    can to oppose you.

    --
    Russ Allbery ([email protected]) <https://www.eyrie.org/~eagle/>
