Forum: >>> Magnum BBS <<<

Non-LLM example where we do not in practice use original training data

From Sam Hartman@21:1/5 to All on Mon May 5 22:30:01 2025

I think many of us modify machine learning models on a regular basis.
And I think when we make those modifications, we do not go back to the
original training data; instead, we modify the model weights directly.

I suspect I am not the only one who uses rspamd with both its Bayesian
classifier and its neural network classifier, both of which are machine
learning models.

My point here is that there is a common case where the preferred form of
modification for a model definitely is not the original training data.
Some people on the list probably do retain all the messages they submit
for learning.
I know I do not.
(I retain a significant subset and probably could reproduce something if
I had to.)
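
To make the point concrete, here is a minimal sketch of a toy Bayesian
classifier whose state is only per-class token counts. It is not rspamd's
implementation; the class name TokenCounts, the whitespace tokenizer, and
the add-one smoothing are all assumptions chosen for illustration. The
thing to notice is that learn() updates the stored state in place and then
discards the message, so later modifications work on the counts, not on
the original training data.

```python
import math
from collections import Counter


class TokenCounts:
    """Toy classifier state: token and message counts per class (hypothetical)."""

    def __init__(self):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        # Fold one message into the stored counts; the text itself is not kept.
        self.tokens[label].update(set(text.lower().split()))
        self.messages[label] += 1

    def spam_score(self, text):
        # Log-odds of spam versus ham with add-one smoothing.
        # Positive leans spam, negative leans ham.
        score = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for tok in set(text.lower().split()):
            p_spam = (self.tokens["spam"][tok] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.tokens["ham"][tok] + 1) / (self.messages["ham"] + 2)
            score += math.log(p_spam / p_ham)
        return score


state = TokenCounts()
state.learn("cheap pills now", "spam")
state.learn("meeting agenda attached", "ham")
before = state.spam_score("cheap meeting")

# Responding to a "false positive" report: feed one more message into the
# existing state, without revisiting anything learned earlier.
state.learn("cheap meeting rooms", "ham")
after = state.spam_score("cheap meeting")
```

After the extra learn() call the score for the same text moves toward ham,
even though the two earlier messages are no longer available anywhere.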

If I wanted to package up my classifier state and distribute it under a
free software license, I think it should be DFSG free.
I think that to satisfy the DFSG I would need to include all the
training data I still had and any scripts I used.
But I think in that circumstance the model weights would be a reasonable
preferred form of modification.
If the way I responded to bug reports was to manually run messages
through rspamc, I think that ought to be DFSG free based on decisions we
have made in similar circumstances in the past.

I appreciate that coming up with a classifier state that was generic
enough to be valuable to package in Debian would be difficult. However,
I think this serves as an example we can all get our heads around to see
that in practice, real users do often use model weights as the preferred
form of modification.

-----BEGIN PGP SIGNATURE-----

iHUEARYKAB0WIQSj2jRwbAdKzGY/4uAsbEw8qDeGdAUCaBkfLwAKCRAsbEw8qDeG dM1yAP9NHf1eGblwJrrL9uyaKBJkx6tPN2xln4zdXonKMabhrgEA543m+ufiKjSX ZJ1F5gk9rMQ8x54rkjCRo2jtOLu4JAU=
=Byrs
-----END PGP SIGNATURE-----

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)
