• Prolog totally missed the AI Boom

    From Mild Shock@21:1/5 to All on Sat Feb 22 13:07:12 2025
    Inductive logic programming at 30
    https://arxiv.org/abs/2102.10556

    The paper contains not a single reference to autoencoders!
    Still they show this example:

    Fig. 1 ILP systems struggle with structured examples that
    exhibit observational noise. All three examples clearly
    spell the word "ILP", with some alterations: 3 noisy pixels,
    shifted and elongated letters. If we would be to learn a
    program that simply draws "ILP" in the middle of the picture,
    without noisy pixels and elongated letters, that would
    be a correct program.

    I guess ILP is 30 years behind the AI boom. An early autoencoder,
    which later turned into the transformer, was already reported here (*):

    SERIAL ORDER, Michael I. Jordan - May 1986 https://cseweb.ucsd.edu/~gary/PAPER-SUGGESTIONS/Jordan-TR-8604-OCRed.pdf

    Well, ILP might have its merits. Maybe we should not ask
    for a marriage of LLM and Prolog, but of autoencoders and ILP.
    But it's tricky; I am still trying to decode the da Vinci code of
    things like stacked tensors. Are they related to k-literal clauses?

    (*) The paper I referenced is found in this excellent video:

    The Making of ChatGPT (35 Year History) https://www.youtube.com/watch?v=OFS90-FX6pg

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mild Shock@21:1/5 to Mild Shock on Sat Feb 22 22:54:36 2025
    Hi,

    One idea I had was that autoencoders would
    become kind of invisible, and work under the hood
    to compress Prolog facts. Take these facts:

    % standard _, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    data(seg7, [0,0,0,0,0,0,0], [0,0,0,0,0,0,0]).
    data(seg7, [1,1,1,1,1,1,0], [1,1,1,1,1,1,0]).
    data(seg7, [0,1,1,0,0,0,0], [0,1,1,0,0,0,0]).
    data(seg7, [1,1,0,1,1,0,1], [1,1,0,1,1,0,1]).
    data(seg7, [1,1,1,1,0,0,1], [1,1,1,1,0,0,1]).
    data(seg7, [0,1,1,0,0,1,1], [0,1,1,0,0,1,1]).
    data(seg7, [1,0,1,1,0,1,1], [1,0,1,1,0,1,1]).
    data(seg7, [1,0,1,1,1,1,1], [1,0,1,1,1,1,1]).
    data(seg7, [1,1,1,0,0,0,0], [1,1,1,0,0,0,0]).
    data(seg7, [1,1,1,1,1,1,1], [1,1,1,1,1,1,1]).
    data(seg7, [1,1,1,1,0,1,1], [1,1,1,1,0,1,1]).
    % alternatives 9, 7, 6, 1
    data(seg7, [1,1,1,0,0,1,1], [1,1,1,1,0,1,1]).
    data(seg7, [1,1,1,0,0,1,0], [1,1,1,0,0,0,0]).
    data(seg7, [0,0,1,1,1,1,1], [1,0,1,1,1,1,1]).
    data(seg7, [0,0,0,0,1,1,0], [0,1,1,0,0,0,0]).

    https://en.wikipedia.org/wiki/Seven-segment_display

    Or more visually, 9 7 6 1 have variants trained:

    :- show.
    _0123456789(9)(7)(6)(1)

    The autoencoder would create a latent space, an
    encoder, and a decoder. We could then basically query
    ?- data(seg7, X, Y) with X as input and Y as output.

    The variants of 9 7 6 1 were corrected:

    :- random2.
    0, 0
    _01234567899761

    The autoencoder might also tolerate errors in the
    input that are not in the data, giving it some inferential
    capability. It might then choose an output that is again not in
    the data, giving it some generative capabilities.
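
    To make the error tolerance idea concrete, here is a minimal sketch.
    It is not an autoencoder: it is a plain nearest-neighbour lookup by
    Hamming distance over the facts above, standing in for what a trained
    network might do. The predicates hamming/3 and nearest/2 are
    hypothetical helpers, not part of any library.

```prolog
% Training pairs copied from the post: data(Name, Input, Output).
data(seg7, [0,0,0,0,0,0,0], [0,0,0,0,0,0,0]).
data(seg7, [1,1,1,1,1,1,0], [1,1,1,1,1,1,0]).
data(seg7, [0,1,1,0,0,0,0], [0,1,1,0,0,0,0]).
data(seg7, [1,1,0,1,1,0,1], [1,1,0,1,1,0,1]).
data(seg7, [1,1,1,1,0,0,1], [1,1,1,1,0,0,1]).
data(seg7, [0,1,1,0,0,1,1], [0,1,1,0,0,1,1]).
data(seg7, [1,0,1,1,0,1,1], [1,0,1,1,0,1,1]).
data(seg7, [1,0,1,1,1,1,1], [1,0,1,1,1,1,1]).
data(seg7, [1,1,1,0,0,0,0], [1,1,1,0,0,0,0]).
data(seg7, [1,1,1,1,1,1,1], [1,1,1,1,1,1,1]).
data(seg7, [1,1,1,1,0,1,1], [1,1,1,1,0,1,1]).
data(seg7, [1,1,1,0,0,1,1], [1,1,1,1,0,1,1]).
data(seg7, [1,1,1,0,0,1,0], [1,1,1,0,0,0,0]).
data(seg7, [0,0,1,1,1,1,1], [1,0,1,1,1,1,1]).
data(seg7, [0,0,0,0,1,1,0], [0,1,1,0,0,0,0]).

% hamming(+Xs, +Ys, -D): hypothetical helper, counts the positions
% where two equal-length lists of integers differ.
hamming([], [], 0).
hamming([X|Xs], [Y|Ys], D) :-
    hamming(Xs, Ys, D0),
    (   X =:= Y
    ->  D = D0
    ;   D is D0 + 1
    ).

% nearest(+X, -Y): hypothetical helper, returns the output of the
% stored input closest to X. keysort/2 is stable, so a tie goes to
% the fact that is listed first.
nearest(X, Y) :-
    findall(D-Out, ( data(seg7, In, Out), hamming(X, In, D) ), Pairs),
    keysort(Pairs, [_-Y|_]).
```

    For example, [0,0,0,0,1,1,1] is not among the facts, yet
    ?- nearest([0,0,0,0,1,1,1], Y). gives Y = [0,1,1,0,0,0,0], the
    standard 1. Unlike a real autoencoder, this lookup can only return
    outputs already in the data, so it shows the inferential side but
    not the generative one.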

    Bye

    See also:

    What is Latent Space in Deep Learning? https://www.geeksforgeeks.org/what-is-latent-space-in-deep-learning/

  • From Mild Shock@21:1/5 to Somebody on Sun Feb 23 18:34:49 2025
    Hi,

    Somebody wrote:

    It’s a self-supervised form of ILP.
    No autoencoders anywhere at all.

    And this only proves my point that ILP doesn’t
    solve the problem of making autoencoders and transformers
    available directly in Prolog, which was the issue I posted
    at the top of this thread.

    Consequently I would not look into ILP for Prolog
    autoencoders and transformers; that is exactly my point. Most
    likely ILP is unaware of the concept of latent space.
    Latent space has quite some advantages:

    - *Dimensionality Reduction:* It captures the essential
    structure of high-dimensional data in a more
    compact form.

    - *Synthetic Data:* Instead of modifying raw data, you can
    use the latent space to generate variations for
    further learning.

    - *Domain Adaptation:* Well-structured latent space can
    help transfer knowledge from abundant domains to
    underrepresented ones.

    If you don’t mention autoencoders and transformers at
    all, you are possibly also not aware of the above advantages
    and other properties of autoencoders and transformers.

    In ILP the concept of latent space is most likely dormant
    or blurred, since the stance is: well, we invent predicates,
    ergo relations. There is no attempt to break
    down relations further:

    https://www.v7labs.com/blog/autoencoders-guide

    Basically autoencoders and transformers, by imposing some
    hidden layer, are further structuring relations into an
    encoder and a decoder. So a relation is seen as a join.

    The H is the bottleneck on purpose:

    relation(X, Y) :- encoder(X, H), decoder(H, Y).

    The values of H go through the latent space which is
    invented during the learning process. It is not simply
    the input or output space.
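
    As a small illustration of this join, here is a hand-written
    sketch for three of the seven-segment digits. The latent codes
    one, seven and nine are hypothetical symbolic stand-ins; a trained
    autoencoder would invent numeric latent vectors instead, and the
    encoder/2 and decoder/2 tables here are written by hand, not learned.

```prolog
% encoder(+Segments, -H): hypothetical hand-written encoder.
% Both the standard and the alternative pattern of a digit are
% mapped to the same latent code H.
encoder([0,1,1,0,0,0,0], one).     % standard 1
encoder([0,0,0,0,1,1,0], one).     % alternative 1
encoder([1,1,1,0,0,0,0], seven).   % standard 7
encoder([1,1,1,0,0,1,0], seven).   % alternative 7
encoder([1,1,1,1,0,1,1], nine).    % standard 9
encoder([1,1,1,0,0,1,1], nine).    % alternative 9

% decoder(+H, -Segments): hypothetical hand-written decoder.
% Each latent code is expanded to the standard pattern only.
decoder(one,   [0,1,1,0,0,0,0]).
decoder(seven, [1,1,1,0,0,0,0]).
decoder(nine,  [1,1,1,1,0,1,1]).

% The relation is the join of encoder and decoder through H.
relation(X, Y) :- encoder(X, H), decoder(H, Y).
```

    With these tables, ?- relation([1,1,1,0,0,1,1], Y). gives
    Y = [1,1,1,1,0,1,1], so the alternative 9 is corrected to the
    standard 9. The bottleneck is visible in the counts: six input
    patterns pass through only three values of H.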

    This design has some very interesting repercussions.

    Bye

  • From Mild Shock@21:1/5 to Mild Shock on Wed Mar 19 21:00:00 2025
    Hi,

    I first wanted to use a working title:

    "new frontiers in logic programming"

    But upon reflection, and because of fElon,
    here is another idea for a working title:

    "neuro infused logic programming" (NILP)

    What could it mean? Or does it have some
    alternative phrasing already?

    Try this paper:

    Compositional Neural Logic Programming
    Son N. Tran - 2021
    The combination of connectionist models for low-level
    information processing and logic programs for high-level
    decision making can offer improvements in inference
    efficiency and prediction performance
    https://www.ijcai.org/proceedings/2021/421

    Browsing through the bibliography I find:

    [Cohen et al., 2017]
    Tensorlog: Deep learning meets probabilistic

    [Donadello et al., 2017]
    Logic tensor networks

    [Larochelle and Murray, 2011]
    The neural autoregressive distribution estimator

    [Manhaeve et al., 2018]
    Neural probabilistic logic programming

    [Mirza and Osindero, 2014]
    Conditional generative adversarial nets

    [Odena et al., 2017]
    auxiliary classifier GANs

    [Pierrot et al., 2019]
    compositional neural programs

    [Reed and de Freitas, 2016]
    Neural programmer-interpreters

    [Riveret et al., 2020]
    Neuro-Symbolic Probabilistic Argumentation Machines

    [Serafini and d’Avila Garcez, 2016]
    logic tensor networks.

    [Socher et al., 2013]
    neural tensor networks

    [Towell and Shavlik, 1994]
    Knowledge-based artificial neural networks

    [Tran and d’Avila Garcez, 2018]
    Deep logic networks

    [Wang et al., 2019]
    compositional neural information fusion

  • From Mild Shock@21:1/5 to Mild Shock on Wed Jul 23 13:58:10 2025
    Looks like sorting of rational trees
    needs an existential type, if we go full “logical”.
    If I use my old code from 2023, which computes
    a finest (*), i.e. non-monster, bisimulation
    pre-quotient (**) in prefix order:

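    % factorize(+Term, +Context, -Var)//: emits equations Var = Shallow.
    factorize(T, _, T) --> {var(T)}, !.
    factorize(T, C, V) --> {compound(T), member(S-V, C), S == T}, !.
    factorize(T, C, V) --> {compound(T)}, !,
       [V = S],
       {T =.. [F|L]},
       factorize_list(L, [T-V|C], R),
       {S =.. [F|R]}.
    factorize(T, _, T) --> [].

    % factorize_list//3 is not shown in the post; assumed here to
    % apply factorize//3 to each argument with the same context.
    factorize_list([], _, []) --> [].
    factorize_list([X|L], C, [Y|R]) --> factorize(X, C, Y), factorize_list(L, C, R).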

    I see that it always generates new
    intermediate variables:

    ?- X = f(f(X)), factorize(X, [], T, L, []), write(L-T), nl.
    [_8066=f(_8066)]-_8066

    ?- X = f(f(X)), factorize(X, [], T, L, []), write(L-T), nl.
    [_10984=f(_10984)]-_10984

    What would be swell is if it generated an
    existential quantifier, something like T^([T = f(T)]-T)
    in the above case. Then, using alpha conversion,
    different factorization runs would be equal
    when they differ only by the introduced
    intermediate variables. But Prolog has no alpha
    conversion; only λ-Prolog has such things.
    So what can we do? How can we produce a
    representation that can be used for sorting?
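
    One possible answer, as a hedged sketch only: freeze the fresh
    variables of a copy with numbervars/3, which numbers them in order
    of first occurrence. Two results that differ only in variable
    naming then become the same ground term, and that term can serve
    as a sort key. The names canonical/2 and same_up_to_renaming/2 are
    hypothetical, and the sketch assumes the factorized result contains
    no '$VAR'(N) terms of its own.

```prolog
% canonical(+Term, -Canon): hypothetical helper. Canon is a copy of
% Term whose variables are bound to '$VAR'(0), '$VAR'(1), ... in
% order of first occurrence. Term itself is left untouched.
canonical(Term, Canon) :-
    copy_term(Term, Canon),
    numbervars(Canon, 0, _).

% same_up_to_renaming(+A, +B): hypothetical helper. Succeeds when A
% and B are equal modulo a consistent renaming of their variables.
same_up_to_renaming(A, B) :-
    canonical(A, CA),
    canonical(B, CB),
    CA == CB.
```

    For the two runs shown above, [_8066=f(_8066)]-_8066 and
    [_10984=f(_10984)]-_10984 both map to the ground term
    ['$VAR'(0)=f('$VAR'(0))]-'$VAR'(0), so they compare equal with ==
    and can be ordered with the standard order of terms. This fixes
    only the variable names; it relies on the prefix order of the
    equations being the same in both runs.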

    (*) Why finest and not coarsest? Because it uses
    non-monster instructions and not monster
    instructions.

    (**) Why only pre-quotient? Because a
    XXX_with_stack algorithm does not fully
    deduplicate the equations; it would
    probably need a XXX_with_memo algorithm.

  • From Mild Shock@21:1/5 to Mild Shock on Fri Jul 25 21:28:12 2025
    Hi,

    That is extremely embarrassing. I don’t know
    what you are bragging about, when you wrote
    the below. You are wrestling with a ghost!
    Maybe you didn’t follow my superb link:

    seemingly interesting paper. In stead
    particular, his final coa[l]gebra theorem

    The link behind Hopcroft and Karp (1971) that I
    gave, which is a Bisimulation and Equirecursive
    Equality hand-out, has a coalgebra example
    that I used to derive pairs.pl from:

    https://www.cs.cornell.edu/courses/cs6110/2014sp/Lectures/lec35a.pdf

    Bye
