• Prolog totally missed the AI Boom

    From Mild Shock@21:1/5 to All on Sat Feb 22 13:06:33 2025
    Inductive logic programming at 30
    https://arxiv.org/abs/2102.10556

    The paper contains not a single reference to autoencoders!
    Still they show this example:

    Fig. 1 ILP systems struggle with structured examples that
    exhibit observational noise. All three examples clearly
    spell the word "ILP", with some alterations: 3 noisy pixels,
    shifted and elongated letters. If we would be to learn a
    program that simply draws "ILP" in the middle of the picture,
    without noisy pixels and elongated letters, that would
    be a correct program.

    I guess ILP is 30 years behind the AI boom. An early autoencoder,
    a forerunner of the transformer, was already reported here (*):

    SERIAL ORDER, Michael I. Jordan - May 1986 https://cseweb.ucsd.edu/~gary/PAPER-SUGGESTIONS/Jordan-TR-8604-OCRed.pdf

    Well, ILP might have its merits. Maybe we should not ask
    for a marriage of LLMs and Prolog, but of autoencoders and ILP.
    But it's tricky: I am still trying to decode the da Vinci code of
    things like stacked tensors. Are they related to k-literal clauses?

    (*) The paper I referenced is found in this excellent video:

    The Making of ChatGPT (35 Year History) https://www.youtube.com/watch?v=OFS90-FX6pg

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mild Shock@21:1/5 to Mild Shock on Sat Feb 22 22:53:16 2025
    Hi,

    One idea I had was that autoencoders would
    become kind of invisible, and work under the hood
    to compress Prolog facts. Take these facts:

    % standard _, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    data(seg7, [0,0,0,0,0,0,0], [0,0,0,0,0,0,0]).
    data(seg7, [1,1,1,1,1,1,0], [1,1,1,1,1,1,0]).
    data(seg7, [0,1,1,0,0,0,0], [0,1,1,0,0,0,0]).
    data(seg7, [1,1,0,1,1,0,1], [1,1,0,1,1,0,1]).
    data(seg7, [1,1,1,1,0,0,1], [1,1,1,1,0,0,1]).
    data(seg7, [0,1,1,0,0,1,1], [0,1,1,0,0,1,1]).
    data(seg7, [1,0,1,1,0,1,1], [1,0,1,1,0,1,1]).
    data(seg7, [1,0,1,1,1,1,1], [1,0,1,1,1,1,1]).
    data(seg7, [1,1,1,0,0,0,0], [1,1,1,0,0,0,0]).
    data(seg7, [1,1,1,1,1,1,1], [1,1,1,1,1,1,1]).
    data(seg7, [1,1,1,1,0,1,1], [1,1,1,1,0,1,1]).
    % alternatives 9, 7, 6, 1
    data(seg7, [1,1,1,0,0,1,1], [1,1,1,1,0,1,1]).
    data(seg7, [1,1,1,0,0,1,0], [1,1,1,0,0,0,0]).
    data(seg7, [0,0,1,1,1,1,1], [1,0,1,1,1,1,1]).
    data(seg7, [0,0,0,0,1,1,0], [0,1,1,0,0,0,0]).

    https://en.wikipedia.org/wiki/Seven-segment_display

    Or more visually, 9 7 6 1 have variants trained:

    :- show.
    _0123456789(9)(7)(6)(1)

    The autoencoder would create a latent space, an
    encoder, and a decoder. We could then basically query
    ?- data(seg7, X, Y) with X as input and Y as output.

    The variants of 9 7 6 1 were corrected:

    :- random2.
    0, 0
    _01234567899761

    The autoencoder might also tolerate errors in the
    input that are not in the data, giving it some inferential
    capability. It might then choose an output that is again not in
    the data, giving it some generative capabilities.
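
    The error-tolerant query can be sketched without any neural network.
    The following is only a stand-in under stated assumptions, not a
    trained autoencoder: it decodes an input by picking the stored fact
    with the smallest Hamming distance. The predicates hamming/3 and
    denoise/3 are hypothetical names, and only a subset of the facts
    above (1, 7, 9 and three variants) is repeated to keep it short.

```prolog
% Subset of the seg7 facts from the post: data(Kind, Input, Output).
data(seg7, [0,1,1,0,0,0,0], [0,1,1,0,0,0,0]).  % 1
data(seg7, [1,1,1,0,0,0,0], [1,1,1,0,0,0,0]).  % 7
data(seg7, [1,1,1,1,0,1,1], [1,1,1,1,0,1,1]).  % 9
data(seg7, [1,1,1,0,0,1,1], [1,1,1,1,0,1,1]).  % variant of 9
data(seg7, [1,1,1,0,0,1,0], [1,1,1,0,0,0,0]).  % variant of 7
data(seg7, [0,0,0,0,1,1,0], [0,1,1,0,0,0,0]).  % variant of 1

% hamming(+Xs, +Ys, -D): number of positions where two bit lists differ.
hamming([], [], 0).
hamming([X|Xs], [Y|Ys], D) :-
    hamming(Xs, Ys, D0),
    (   X =:= Y -> D = D0 ; D is D0 + 1 ).

% denoise(+Kind, +X, -Y): output of the stored fact whose input is
% nearest to X. Ties are broken by fact order, since keysort/2 is stable.
denoise(Kind, X, Y) :-
    findall(D-Out, ( data(Kind, In, Out), hamming(X, In, D) ), Pairs),
    keysort(Pairs, [_-Y|_]).
```

    With this subset, an input that is not among the facts (a 1 with
    one extra segment lit) is still mapped to the standard 1:

    ?- denoise(seg7, [0,1,1,0,0,0,1], Y).
    Y = [0,1,1,0,0,0,0].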

    Bye

    See also:

    What is Latent Space in Deep Learning? https://www.geeksforgeeks.org/what-is-latent-space-in-deep-learning/

  • From Mild Shock@21:1/5 to Somebody on Sun Feb 23 18:35:49 2025
    Hi,

    Somebody wrote:

    It’s a self-supervised form of ILP.
    No autoencoders anywhere at all.

    And this only proves my point that ILP doesn’t
    solve the problem of making autoencoders and transformers
    available directly in Prolog, which was the issue I posted
    at the top of this thread.

    Consequently, I would not look to ILP for Prolog
    autoencoders and transformers; that is my point exactly. ILP
    is most likely unaware of the concept of latent space.
    Latent space has quite some advantages:

    - *Dimensionality Reduction:* It captures the essential
    structure of high-dimensional data in a more
    compact form.

    - *Synthetic Data:* Instead of modifying raw data, you can
    use the latent space, to generate variations for
    further learning.

    - *Domain Adaptation:* Well-structured latent space can
    help transfer knowledge from abundant domains to
    underrepresented ones.

    If you don’t mention autoencoders and transformers at
    all, you are possibly also not aware of the above advantages
    and other properties of autoencoders and transformers.

    In ILP the concept of latent space is most likely dormant
    or blurred, since the stance is: well, we invent predicates,
    ergo relations. There is no attempt to break
    down relations further:

    https://www.v7labs.com/blog/autoencoders-guide

    Basically autoencoders and transformers, by imposing some
    hidden layer, are further structuring relations into an
    encoder and a decoder. So a relation is seen as a join.

    The H is the bottleneck on purpose:

    relation(X, Y) :- encoder(X, H), decoder(H, Y).

    The values of H go through the latent space which is
    invented during the learning process. It is not simply
    the input or output space.
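
    To make the join concrete, here is a minimal sketch with a
    hand-written bottleneck. It assumes a 4-bit latent code per digit,
    chosen by hand (a real autoencoder would learn its latent space),
    and it only covers the digits 1 and 7 with their variants from the
    seg7 facts posted earlier in this thread.

```prolog
% encoder(?Input7, ?Latent4): several 7-bit inputs share one latent code.
encoder([0,1,1,0,0,0,0], [0,0,0,1]).  % 1
encoder([0,0,0,0,1,1,0], [0,0,0,1]).  % variant of 1
encoder([1,1,1,0,0,0,0], [0,1,1,1]).  % 7
encoder([1,1,1,0,0,1,0], [0,1,1,1]).  % variant of 7

% decoder(?Latent4, ?Output7): one canonical output per latent code.
decoder([0,0,0,1], [0,1,1,0,0,0,0]).
decoder([0,1,1,1], [1,1,1,0,0,0,0]).

% The relation is the join of encoder and decoder over the bottleneck H.
relation(X, Y) :- encoder(X, H), decoder(H, Y).
```

    Since it is an ordinary relation, it can be run in both directions:

    ?- relation([0,0,0,0,1,1,0], Y).
    Y = [0,1,1,0,0,0,0].

    ?- findall(X, relation(X, [1,1,1,0,0,0,0]), Xs).
    Xs = [[1,1,1,0,0,0,0],[1,1,1,0,0,1,0]].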

    This design has some very interesting repercussions.

    Bye


  • From Mild Shock@21:1/5 to Mild Shock on Wed Mar 19 21:00:45 2025
    Hi,

    I first wanted to use a working title:

    "new frontiers in logic programming"

    But upon reflection and because of fElon,
    here is another idea for a working title:

    "neuro infused logic programming" (NILP)

    What could it mean? Or does it have some
    alternative phrasing already?

    Try this paper:

    Compositional Neural Logic Programming
    Son N. Tran - 2021
    The combination of connectionist models for low-level
    information processing and logic programs for high-level
    decision making can offer improvements in inference
    efficiency and prediction performance
    https://www.ijcai.org/proceedings/2021/421

    Browsing through the bibliography I find:

    [Cohen et al., 2017]
    Tensorlog: Deep learning meets probabilistic

    [Donadello et al., 2017]
    Logic tensor networks

    [Larochelle and Murray, 2011]
    The neural autoregressive distribution estimator

    [Manhaeve et al., 2018]
    Neural probabilistic logic programming

    [Mirza and Osindero, 2014]
    Conditional generative adversarial nets

    [Odena et al., 2017]
    auxiliary classifier GANs

    [Pierrot et al., 2019]
    compositional neural programs

    [Reed and de Freitas, 2016]
    Neural programmer-interpreters

    [Riveret et al., 2020]
    Neuro-Symbolic Probabilistic Argumentation Machines

    [Serafini and d’Avila Garcez, 2016]
    logic tensor networks.

    [Socher et al., 2013]
    neural tensor networks

    [Towell and Shavlik, 1994]
    Knowledge-based artificial neural networks

    [Tran and d’Avila Garcez, 2018]
    Deep logic networks

    [Wang et al., 2019]
    compositional neural information fusion


  • From sobriquet@21:1/5 to All on Thu Mar 20 05:02:49 2025

    Set theory relates to logic as category theory relates to... ?

    https://www.youtube.com/watch?v=1KUhLHlgG2Q

  • From Mild Shock@21:1/5 to sobriquet on Thu Mar 20 20:12:09 2025
    Hi,

    I guess:

    category theory is to set theory

    what

    autograd is to calculus

    LoL

    Bye

    See also:

    Another beautiful day doing math that has no real world applications https://x.com/MathMatize/status/1902708970306891901

  • From Mild Shock@21:1/5 to Mild Shock on Wed Jul 23 13:59:13 2025
    Looks like sorting of rational trees
    needs an existential type, if we go full “logical”.
    If I use my old code from 2023 which computes
    a finest (*), i.e. non-monster, bisimulation
    pre-quotient (**) in prefix order:

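    % factorize(Term, Context, Factored)//: emits equations V = S in prefix order.
    % Context is a stack of Subterm-Variable pairs for the subterms being visited.
    factorize(T, _, T) --> {var(T)}, !.
    % A subterm already on the stack is replaced by its variable.
    factorize(T, C, V) --> {compound(T), member(S-V, C), S == T}, !.
    factorize(T, C, V) --> {compound(T)}, !,
        [V = S],
        {T =.. [F|L]},
        factorize_list(L, [T-V|C], R),   % factorize_list//3 is not shown in this post
        {S =.. [F|R]}.
    % Atomic terms are returned unchanged.
    factorize(T, _, T) --> [].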

    I see that it always generates new
    intermediate variables:

    ?- X = f(f(X)), factorize(X, [], T, L, []), write(L-T), nl.
    [_8066=f(_8066)]-_8066

    ?- X = f(f(X)), factorize(X, [], T, L, []), write(L-T), nl.
    [_10984=f(_10984)]-_10984

    It would be swell if it generated an
    existential quantifier, something like T^([T = f(T)]-T)
    in the above case. Then, using alpha conversion,
    different factorization runs would be equal
    when they differ only in the introduced
    intermediate variables. But Prolog has no alpha
    conversion, only λ-Prolog has such things.
    So what can we do? How can we produce a
    representation that can be used for sorting?

    (*) Why finest and not coarsest? Because it uses
    non-monster instructions and not monster
    instructions.

    (**) Why only pre-quotient? Because a
    XXX_with_stack algorithm does not fully
    deduplicate the equations; that would
    probably need a XXX_with_memo algorithm.
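
    One possible workaround, sketched under assumptions: instead of an
    explicit quantifier, name the intermediate variables canonically
    with numbervars/3, so that runs differing only in their fresh
    variables yield identical ground terms. The helper factorize_list//3
    is not shown in the post, so the definition below is a guess, and
    canonical/2 is a hypothetical name. The sketch also assumes that the
    input has no free variables of its own and that the Prolog system
    supports rational trees, as SWI-Prolog does.

```prolog
% factorize//3 as posted.
factorize(T, _, T) --> {var(T)}, !.
factorize(T, C, V) --> {compound(T), member(S-V, C), S == T}, !.
factorize(T, C, V) --> {compound(T)}, !,
    [V = S],
    {T =.. [F|L]},
    factorize_list(L, [T-V|C], R),
    {S =.. [F|R]}.
factorize(T, _, T) --> [].

% Assumed helper (not in the post): factorize each argument in turn.
factorize_list([], _, []) --> [].
factorize_list([X|Xs], C, [Y|Ys]) -->
    factorize(X, C, Y),
    factorize_list(Xs, C, Ys).

% canonical(+Tree, -Canon): Canon is a ground term Eqs-Root in which the
% intermediate variables are named '$VAR'(0), '$VAR'(1), ... in order of
% first occurrence.
canonical(Tree, Canon) :-
    factorize(Tree, [], Root, Eqs, []),
    copy_term(Eqs-Root, Canon),
    numbervars(Canon, 0, _).
```

    Two runs on the example now compare equal with ==, and lists of
    such canonical forms can be ordered with msort/2:

    ?- X = f(f(X)), canonical(X, C1), canonical(X, C2), C1 == C2.

    This only removes the dependence on variable names. It does not
    change the pre-quotient caveat from footnote (**).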

  • From Mild Shock@21:1/5 to Mild Shock on Wed Jul 23 14:04:54 2025
    Hi,

    So do we see a new wave of interest in bisimulation,
    especially in computing existential types for all
    kinds of things? It seems so. Quite a fascinating find:

    BQ-NCO: Bisimulation Quotienting for Efficient
    Neural Combinatorial Optimization
    https://arxiv.org/abs/2301.03313

    It has none other than Jean-Marc Andreoli on the
    author list. Possibly the same person known from the earlier
    work on focusing and linear logic, who was associated with
    ECRC Munich in the 1990s, but is now working for naverlabs.com.

    Bye

  • From Mild Shock@21:1/5 to Mild Shock on Wed Jul 23 15:20:54 2025
    Hi,

    To do bi-simulation, you don't need to wear
    this t-shirt. Bi-simulation doesn't refer to
    any sexual orientation, although you could give
    it a game-theoretic touch with Samson and Delilah:

    Why Are You Geh T-Shirt https://www.amazon.co.uk/Why-Are-You-Gay-T-Shirt/dp/B0DJMZFQN8

    “Bi-simulation equivalent” is sometimes simply
    called “bi-similar”. There is a nice paper by Manuel
    Carro which gives a larger bisimilarity example:

    An Application of Rational Trees in a Logic
    Programming Interpreter for a Procedural Language
    Manuel Carro - 2004
    https://arxiv.org/abs/cs/0403028v1

    He treats the case of “goto” in a programming
    language, where labels are not needed; rational
    tree sharing and looping can simply be used instead.

    The case from Figure 5, “Threading the code into
    a rational tree”, uses the simpler bisimilarity in
    its result and doesn’t need a more
    elaborate bisimulation later.

    You can use dict lookups (not SWI-Prolog dicts, but
    some table operations) to create the
    rational tree. But I guess you can also use dicts
    (again table operations) for the reverse: find
    some factorization of a rational tree and
    recreate the labels and jumps.
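
    The forward direction can be sketched as follows. This is a
    hypothetical minimal version, not Carro's code: the program is
    assumed to be a list of Label-Instruction pairs, the table is a
    plain list of Label-Variable pairs searched with memberchk/2, and
    step/2, goto/1 and halt are made-up names. It relies on unification
    without occurs check to create the cycles, as in SWI-Prolog.

```prolog
% thread(+Program, -Tree): turn a labelled program into a rational tree
% in which every goto(Label) is replaced by the code at that label.
thread(Program, Tree) :-
    label_table(Program, Table),
    build(Program, Table, Tree).

% label_table(+Program, -Table): one fresh variable per label.
label_table([], []).
label_table([Label-_|Rest], [Label-_Var|Table]) :-
    label_table(Rest, Table).

% build(+Program, +Table, ?Code): Code is the tree for the rest of the program.
build([], _, halt).
build([Label-Instr|Rest], Table, Code) :-
    memberchk(Label-Code, Table),
    (   Instr = goto(Target)
    ->  memberchk(Target-Code, Table),   % jump: share the target's subtree
        build(Rest, Table, _)
    ;   Code = step(Instr, Next),
        build(Rest, Table, Next)
    ).
```

    A loop over two instructions then becomes a cyclic term, with no
    labels left in the result:

    ?- thread([a-inc, b-dec, c-goto(a)], T), T = step(inc, step(dec, T2)), T2 == T.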

    Bye

    Mild Shock schrieb:
    Hi,

    So do we see a new wave in interst in bismulation,
    especially in computing existential types for all
    kind of things? It seems so, quite facinating find:

    BQ-NCO: Bisimulation Quotienting for Efficient
    Neural Combinatorial Optimization
    https://arxiv.org/abs/2301.03313

    Has nobody less than Jean-Marc Andreoli on the
    author list. Possibly the same guy from earlier
    Focusing and Linear Logic, who was associated with

    ECRC Munich in 1990’s, but now working for naverlabs.com.

    Bye

    Mild Shock schrieb:

    Looks like sorting of rational trees
    needs an existential type, if we go full “logical”.
    If I use my old code from 2023 which computes
    a finest (*), i.e. non-monster, bisimulation

    pre-quotient (**) in prefix order:

    factorize(T, _, T) --> {var(T)}, !.
    factorize(T, C, V) --> {compound(T), member(S-V, C), S == T}, !.
    factorize(T, C, V) --> {compound(T)}, !,
        [V = S],
        {T =.. [F|L]},
        factorize_list(L, [T-V|C], R),
        {S =.. [F|R]}.
    factorize(T, _, T) --> [].

    I see that it always generates new
    intermediate variables:

    ?- X = f(f(X)), factorize(X, [], T, L, []), write(L-T), nl.
    [_8066=f(_8066)]-_8066

    ?- X = f(f(X)), factorize(X, [], T, L, []), write(L-T), nl.
    [_10984=f(_10984)]-_10984

    What would be swell if it would generate an
    existential quantifier, something like T^([T = f(T)]-T)
    in the above case. Then using alpha conversion
    different factorization runs would be equal,

    when they only differ by the introduced
    intermediate variables. But Prolog has no alpha
    conversion, only λ-Prolog has such things.
    So what can we do, how can we produce a

    representation, that can be used for sorting?

    (*) Why finest and not corsets? Because it uses
    non-monster instructions and not monster
    instructions

    (**) Why only pre-quotient? Because a
    XXX_with_stack algorithm does not fully
    deduplicate the equations, would
    probably need a XXX_with_memo algorithm.


  • From Mild Shock@21:1/5 to All on Wed Jul 23 19:15:12 2025
    Hi,

    But you might then run into the problem
    that the usual extensionality axiom of
    set theory is not enough: there could
    be two Quine atoms y = {y} and x = {x}
    with x =/= y.

    On the other hand, SWI-Prolog is convinced
    that X = [X] and Y = [Y] are the same.
    It can even apply member/2 to them, since
    it has built-in rational trees:

    /* SWI-Prolog 9.3.25 */
    ?- X = [X], Y = [Y], X == Y.
    X = Y, Y = [Y].

    ?- X = [X], member(X, X).
    X = [X].

    But Peter Aczel’s original AFA statement was
    only uniqueness of solutions to graph equations,
    whereas today we would say that equality is the
    existence of a bisimulation relation.
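
    To make "equality = existence of a bisimulation" concrete,
    here is a minimal sketch for term equations rather than
    sets. The predicate names bisim/3, bisim/5 and bisim_list/5
    are made up for this illustration, and the representation is
    an assumption: a list of Node = Term equations in which nodes
    are atoms and every argument of a Term is again a node that
    has its own equation. Pairs already under comparison are
    assumed equal, which is what makes the check terminate on
    cycles.

    ```prolog
    :- use_module(library(lists)).

    % bisim(+Eqs, +X, +Y): nodes X and Y of the equation system
    % Eqs unfold to the same rational tree. Hypothetical sketch.
    bisim(Eqs, X, Y) :-
        bisim(Eqs, X, Y, [], _).

    % bisim(+Eqs, +X, +Y, +Seen0, -Seen): Seen0 holds the pairs
    % currently assumed equal, Seen the pairs after this check.
    bisim(_, X, Y, Seen, Seen) :-
        memberchk(X-Y, Seen), !.        % already assumed equal
    bisim(Eqs, X, Y, Seen0, Seen) :-
        memberchk(X = TX, Eqs),         % fails if X has no equation
        memberchk(Y = TY, Eqs),
        TX =.. [F|Xs],
        TY =.. [F|Ys],                  % same functor name
        bisim_list(Eqs, Xs, Ys, [X-Y|Seen0], Seen).

    % Compare argument nodes pairwise; the lists must have the
    % same length, so an arity mismatch fails here.
    bisim_list(_, [], [], Seen, Seen).
    bisim_list(Eqs, [X|Xs], [Y|Ys], Seen0, Seen) :-
        bisim(Eqs, X, Y, Seen0, Seen1),
        bisim_list(Eqs, Xs, Ys, Seen1, Seen).
    ```

    For example, with the equations x = f(x), y = f(z), z = f(y)
    the goal bisim([x = f(x), y = f(z), z = f(y)], x, y)
    succeeds, although x and y are different names with
    different equations. This is a plain pair list, not the
    union-find structure of Hopcroft and Karp, so it is
    quadratic in the number of nodes at best.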

    Bye

    Hi,

    you do need a theory of terms, and a specific one

    You could pull an Anti Ackermann. Negate the
    infinity axiom like Ackermann did here, where
    he also kept the regularity axiom:

    Die Widerspruchsfreiheit der allgemeinen Mengenlehre
    Ackermann, Wilhelm - 1937 https://www.digizeitschriften.de/id/235181684_0114%7Clog23

    But instead of Ackermann, you get an Anti (-Foundation)
    Ackermann if you drop the regularity axiom. As a result, you
    get a lot of exotic sets, among which are also the
    famous Quine atoms:

    x = {x}

    Funny that in the setting I just described, where
    there is the negation of the infinity axiom, i.e.
    all sets are finite, contrary to the usual vulgar
    view, x = {x} is a finite object. Just like in Prolog,
    X = f(X) is in principle a finite object: it has
    only one subtree. Or, as Alain Colmerauer
    already postulated:

    Definition: a "rational" tree is a tree which
    has a finite set of subtrees.

    Bye

  • From Mild Shock@21:1/5 to Mild Shock on Fri Jul 25 21:29:40 2025
    Hi,

    That is extremely embarrassing. I don’t know
    what you are bragging about, when you wrote
    the below. You are wrestling with a ghost!
    Maybe you didn’t follow my superb link:

    seemingly interesting paper. In stead
    particular, his final coa[l]gebra theorem

    The link I gave behind Hopcroft and Karp (1971),
    which is a Bisimulation and Equirecursive
    Equality hand-out, has a coalgebra example
    that I used to derive pairs.pl:

    https://www.cs.cornell.edu/courses/cs6110/2014sp/Lectures/lec35a.pdf

    Bye
