• Novacore goes Bisimulation: Scryer Prolog is Slow!

    From Mild Shock@21:1/5 to All on Fri Jul 25 14:26:56 2025
    Hi,

    Having fun with Bisimulation, and a new test
    suite of nasty circular pairs. But how store
    circular pairs, if clauses do not support

    circular terms. Well chop it up into equations,
    I create 1000 such equation pairs:

    test([A = c(A, B), C = c(A, D), E = n, _ = c(F, B),
    F = c(C, C), G = c(G, D), D = c(E, C), B = n],
    [H = c(H, I), J = c(K, I), L = c(L, I), I = c(M, N),
    O = c(K, J), N = n, K = c(K, J), M = c(K, O)]).
    Etc..

    The pairs are nasty because the usual compare_with_stack/2
    chokes on them. Here some results:

    /* SWI-Prolog 9.3.26 */
    ?- time((between(1,30,_), part, fail; true)).
    % 540,118 inferences, 0.047 CPU in 0.041 seconds (115% CPU, 11522517 Lips) true.

    /* Trealla Prolog 2.78.40 */
    ?- time((between(1,30,_), part, fail; true)).
    % Time elapsed 0.113s, 1143903 Inferences, 10.157 MLips
    true.

    /* Scryer Prolog 0.9.4-417 */
    ?- time((between(1,30,_), part, fail; true)).
    % CPU time: 0.226s, 1_117_809 inferences
    true.

    /* Dogelog Player 1.3.5 */
    ?- time((between(1,30,_), part2, fail; true)).
    % Zeit 309 ms, GC 0 ms, Lips 8693718, Uhr 25.07.2025 13:47
    true.

    The amazing thing is, I compared a 100% Prolog
    implementation, so there is a lot of head room
    for improvement:

    part2 :-
    bitest(X,Y), X ~~ Y, fail; true.

    The operator (~~)/2 is part of library(math),
    and has been implemented with same_term/2 so far.

    Bye

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mild Shock@21:1/5 to All on Mon Jul 28 10:50:52 2025
    Hi,

    Assume that we live in a world where we
    have excess memory. So we can afford stacks!
    And then make the crucial observation,

    we can use the stack of the Prolog engine,
    no need to create an artificial stack in C,
    or use the native stack of C.

    I guess SWI-Prolog has already groked the
    first we can "afford stacks". But did anybody
    already grok the "100% Prolog" idea?

    Well we are not yet there 100% Prolog
    has still an overhead. Here is a little
    test acyclic_term/2:

    /* SWI-Prolog 9.3.26, C Stacks and/or Agendas */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % 330,150 inferences, 0.016 CPU in 0.023 seconds
    (69% CPU, 21129600 Lips)
    true.

    /* Trealla Prolog 2.79.6, ?? */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % Time elapsed 0.063s, 643413 Inferences, 10.166 MLips
    true.

    /* Dogelog Player 1.3.5, 100% Prolog */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % Zeit 115 ms, GC 0 ms, Lips 11803904, Uhr 28.07.2025 10:03
    true.

    /* Scryer Prolog 0.9.4-417, Deutsch-Schorr-Waite */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % CPU time: 0.130s, 626_829 inferences
    true.

    Bye

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mild Shock@21:1/5 to Mild Shock on Mon Jul 28 11:01:03 2025
    Hi,

    So the 115 ms by Dogelog Player are faster than the
    0.130s by Scryer Prolog. Quite amazing! From
    Dogelog Player library(math):

    /**
    * acyclic_term(T): [TC2 8.3.11]
    * The predicate succeeds when the Prolog term T is an acyclic term,
    * otherwise the predicate fails.
    */
    % acyclic_term(+Term)
    acyclic_term(X) :- sys_cyclic_term(X, []), !, fail.
    acyclic_term(_).

    % sys_cyclic_term(+Term, +List)
    sys_cyclic_term(X, S) :- compound(X),
    member(Y, S),
    same_term(X, Y), !.
    sys_cyclic_term(X, S) :- compound(X),
    X =.. [_|L],
    member(Y, L),
    sys_cyclic_term(Y, [X|S]).

    Algorithms that implicitly detect sharing cannot
    do much? Since we only need to find a first cycle?
    Also a challenge to bring the association list to
    native and lets say turn it into a hash table,

    don't know yet when and how even somebody would
    attempt that. Big advantage of anything not using
    Deutsch-Schorr-Waite , no write operation into the
    given term. Write operation are sometimes

    poison for the CPU and the RAM.

    Bye

    Mild Shock schrieb:
    Hi,

    Assume that we live in a world where we
    have excess memory. So we can afford stacks!
    And then make the crucial observation,

    we can use the stack of the Prolog engine,
    no need to create an artificial stack in C,
    or use the native stack of C.

    I guess SWI-Prolog has already groked the
    first we can "afford stacks". But did anybody
    already grok the "100% Prolog" idea?

    Well we are not yet there 100% Prolog
    has still an overhead. Here is a little
    test acyclic_term/2:

    /* SWI-Prolog 9.3.26, C Stacks and/or Agendas */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % 330,150 inferences, 0.016 CPU in 0.023 seconds
    (69% CPU, 21129600 Lips)
    true.

    /* Trealla Prolog 2.79.6, ?? */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % Time elapsed 0.063s, 643413 Inferences, 10.166 MLips
       true.

    /* Dogelog Player 1.3.5, 100% Prolog */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % Zeit 115 ms, GC 0 ms, Lips 11803904, Uhr 28.07.2025 10:03
    true.

    /* Scryer Prolog 0.9.4-417, Deutsch-Schorr-Waite  */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % CPU time: 0.130s, 626_829 inferences
    true.

    Bye

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mild Shock@21:1/5 to Mild Shock on Mon Jul 28 11:37:14 2025
    Hi,

    Interstingly there is an obvious way how
    to do it without a braching association list,
    that has different futures so to speack,

    the below code repeatedly generates
    dfferent association lists via [X|S].
    We could of course instead do a threading

    of the association list through some predicate,
    and use some state on each entry inside the
    association list, similar like in acyclic_decompose/3

    we used a threading and entry state.

    Bye

    Mild Shock schrieb:
    Hi,

    So the 115 ms by Dogelog Player are faster than the
    0.130s by Scryer Prolog. Quite amazing! From
    Dogelog Player library(math):

    /**
     * acyclic_term(T): [TC2 8.3.11]
     * The predicate succeeds when the Prolog term T is an acyclic term,
     * otherwise the predicate fails.
     */
    % acyclic_term(+Term)
    acyclic_term(X) :- sys_cyclic_term(X, []), !, fail.
    acyclic_term(_).

    % sys_cyclic_term(+Term, +List)
    sys_cyclic_term(X, S) :- compound(X),
       member(Y, S),
       same_term(X, Y), !.
    sys_cyclic_term(X, S) :- compound(X),
       X =.. [_|L],
       member(Y, L),
       sys_cyclic_term(Y, [X|S]).

    Algorithms that implicitly detect sharing cannot
    do much? Since we only need to find a first cycle?
    Also a challenge to bring the association list to
    native and lets say turn it into a hash table,

    don't know yet when and how even somebody would
    attempt that. Big advantage of anything not using
    Deutsch-Schorr-Waite , no write operation into the
    given term. Write operation are sometimes

    poison for the CPU and the RAM.

    Bye

    Mild Shock schrieb:
    Hi,

    Assume that we live in a world where we
    have excess memory. So we can afford stacks!
    And then make the crucial observation,

    we can use the stack of the Prolog engine,
    no need to create an artificial stack in C,
    or use the native stack of C.

    I guess SWI-Prolog has already groked the
    first we can "afford stacks". But did anybody
    already grok the "100% Prolog" idea?

    Well we are not yet there 100% Prolog
    has still an overhead. Here is a little
    test acyclic_term/2:

    /* SWI-Prolog 9.3.26, C Stacks and/or Agendas */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % 330,150 inferences, 0.016 CPU in 0.023 seconds
    (69% CPU, 21129600 Lips)
    true.

    /* Trealla Prolog 2.79.6, ?? */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % Time elapsed 0.063s, 643413 Inferences, 10.166 MLips
        true.

    /* Dogelog Player 1.3.5, 100% Prolog */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % Zeit 115 ms, GC 0 ms, Lips 11803904, Uhr 28.07.2025 10:03
    true.

    /* Scryer Prolog 0.9.4-417, Deutsch-Schorr-Waite  */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % CPU time: 0.130s, 626_829 inferences
    true.

    Bye


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mild Shock@21:1/5 to Mild Shock on Mon Jul 28 17:26:15 2025
    Write operation are sometimes
    poison for the CPU and the RAM.

    Try this where SIZE is a prime number:

    /* Cache-Trashing Example */
    void column_major_access() {
    for (int col = 0; col < SIZE; col++) {
    for (int row = 0; row < SIZE; row++) {
    matrix[row][col] += 1;
    }
    }
    }

    /* Cache-Friendly Example */
    void row_major_access() {
    for (int row = 0; row < SIZE; row++) {
    for (int col = 0; col < SIZE; col++) {
    matrix[row][col] += 1;
    }
    }
    }

    Mild Shock schrieb:
    Hi,

    Interstingly there is an obvious way how
    to do it without a braching association list,
    that has different futures so to speack,

    the below code repeatedly generates
    dfferent association lists via [X|S].
    We could of course instead do a threading

    of the association list through some predicate,
    and use some state on each entry inside the
    association list, similar like in acyclic_decompose/3

    we used a threading and entry state.

    Bye

    Mild Shock schrieb:
    Hi,

    So the 115 ms by Dogelog Player are faster than the
    0.130s by Scryer Prolog. Quite amazing! From
    Dogelog Player library(math):

    /**
      * acyclic_term(T): [TC2 8.3.11]
      * The predicate succeeds when the Prolog term T is an acyclic term,
      * otherwise the predicate fails.
      */
    % acyclic_term(+Term)
    acyclic_term(X) :- sys_cyclic_term(X, []), !, fail.
    acyclic_term(_).

    % sys_cyclic_term(+Term, +List)
    sys_cyclic_term(X, S) :- compound(X),
        member(Y, S),
        same_term(X, Y), !.
    sys_cyclic_term(X, S) :- compound(X),
        X =.. [_|L],
        member(Y, L),
        sys_cyclic_term(Y, [X|S]).

    Algorithms that implicitly detect sharing cannot
    do much? Since we only need to find a first cycle?
    Also a challenge to bring the association list to
    native and lets say turn it into a hash table,

    don't know yet when and how even somebody would
    attempt that. Big advantage of anything not using
    Deutsch-Schorr-Waite , no write operation into the
    given term. Write operation are sometimes

    poison for the CPU and the RAM.

    Bye

    Mild Shock schrieb:
    Hi,

    Assume that we live in a world where we
    have excess memory. So we can afford stacks!
    And then make the crucial observation,

    we can use the stack of the Prolog engine,
    no need to create an artificial stack in C,
    or use the native stack of C.

    I guess SWI-Prolog has already groked the
    first we can "afford stacks". But did anybody
    already grok the "100% Prolog" idea?

    Well we are not yet there 100% Prolog
    has still an overhead. Here is a little
    test acyclic_term/2:

    /* SWI-Prolog 9.3.26, C Stacks and/or Agendas */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % 330,150 inferences, 0.016 CPU in 0.023 seconds
    (69% CPU, 21129600 Lips)
    true.

    /* Trealla Prolog 2.79.6, ?? */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % Time elapsed 0.063s, 643413 Inferences, 10.166 MLips
        true.

    /* Dogelog Player 1.3.5, 100% Prolog */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % Zeit 115 ms, GC 0 ms, Lips 11803904, Uhr 28.07.2025 10:03
    true.

    /* Scryer Prolog 0.9.4-417, Deutsch-Schorr-Waite  */
    ?- time((between(1,30,_), acyclic2, fail; true)).
    % CPU time: 0.130s, 626_829 inferences
    true.

    Bye



    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mild Shock@21:1/5 to Mild Shock on Wed Jul 30 14:44:46 2025
    Hi,

    Looks like the pairs are really nasty.
    I have them on the Novacore GIT.

    This one is no problem for Scryer-Prolog:

    test :-
    share(X), share(Y), X == Y.

    /* SWI-Prolog 9.3.26 */
    ?- time(test).
    % 32,140 inferences, 0.000 CPU in 0.002 seconds (0% CPU, Infinite Lips)
    true.

    /* Scryer Prolog 0.9.4-417 */
    ?- time(test).
    % CPU time: 0.006s, 32_042 inferences
    true.

    /* Trealla Prolog 2.79.6 */
    ?- time(test).
    % Time elapsed 0.016s, 42012 Inferences, 2.582 MLips
    true.

    /* Dogelog Player 1.3.5 */
    ?- time(test2).
    % % Zeit 83 ms, GC 0 ms, Lips 12117469, Uhr 30.07.2025 05:47
    % true.

    The Dogelog Player solutions uses a variant
    of library(hash), called library(maps).
    Both hash tables are itself 100% Prolog.

    Bye

    Mild Shock schrieb:
    Hi,

    Having fun with Bisimulation, and a new test
    suite of nasty circular pairs. But how store
    circular pairs, if clauses do not support

    circular terms. Well chop it up into equations,
    I create 1000 such equation pairs:

    test([A = c(A, B), C = c(A, D), E = n, _ = c(F, B),
          F = c(C, C), G = c(G, D), D = c(E, C), B = n],
         [H = c(H, I), J = c(K, I), L = c(L, I), I = c(M, N),
          O = c(K, J), N = n, K = c(K, J), M = c(K, O)]).
    Etc..

    The pairs are nasty because the usual compare_with_stack/2
    chokes on them. Here some results:

    /* SWI-Prolog 9.3.26 */
    ?- time((between(1,30,_), part, fail; true)).
    % 540,118 inferences, 0.047 CPU in 0.041 seconds (115% CPU, 11522517 Lips) true.

    /* Trealla Prolog 2.78.40 */
    ?- time((between(1,30,_), part, fail; true)).
    % Time elapsed 0.113s, 1143903 Inferences, 10.157 MLips
       true.

    /* Scryer Prolog 0.9.4-417 */
    ?- time((between(1,30,_), part, fail; true)).
       % CPU time: 0.226s, 1_117_809 inferences
       true.

    /* Dogelog Player 1.3.5 */
    ?- time((between(1,30,_), part2, fail; true)).
    % Zeit 309 ms, GC 0 ms, Lips 8693718, Uhr 25.07.2025 13:47
    true.

    The amazing thing is, I compared a 100% Prolog
    implementation, so there is a lot of head room
    for improvement:

    part2 :-
       bitest(X,Y), X ~~ Y, fail; true.

    The operator (~~)/2 is part of library(math),
    and has been implemented with same_term/2 so far.

    Bye

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Mild Shock@21:1/5 to Mild Shock on Wed Jul 30 14:50:22 2025
    Hi,

    The sharing example was painful for my GC.
    I got a long GC pause:

    /* Dogelog Player 1.3.5 for JavaScript */
    ?- time((share(X), acyclic3(X), fail; true)).
    % Zeit 122 ms, GC 0 ms, Lips 4906262, Uhr 30.07.2025 04:54
    true.

    ?- time((share(X), acyclic3(X), fail; true)).
    % Zeit 445 ms, GC 313 ms, Lips 1345087, Uhr 30.07.2025 04:54
    true.

    But fixed now:

    /* Dogelog Player 1.3.5 for JavaScript */
    ?- time((share(X), acyclic3(X), fail; true)).
    % Zeit 125 ms, GC 7 ms, Lips 4788512, Uhr 30.07.2025 09:11
    true.

    Here is share/1, the idea is to have some
    deep term, and share it widely:

    share(L) :-
    fac(s(s(s(s(s(s(s(n))))))), X),
    length(L, 5000),
    maplist(=(X), L).

    With usual peano arithmetic. n/0 stands for
    null, and s/1 for successor:

    add(n, X, X).
    add(s(X), Y, Z) :- add(X, s(Y), Z).

    mul(n, _, n).
    mul(s(X), Y, Z) :- mul(X, Y, H), add(Y, H, Z).

    fac(n, s(n)).
    fac(s(X), Y) :- fac(X, H), mul(s(X), H, Y).

    Bye

    Mild Shock schrieb:
    Hi,

    Looks like the pairs are really nasty.
    I have them on the Novacore GIT.

    This one is no problem for Scryer-Prolog:

    test :-
       share(X), share(Y), X == Y.

    /* SWI-Prolog 9.3.26 */
    ?- time(test).
    % 32,140 inferences, 0.000 CPU in 0.002 seconds (0% CPU, Infinite Lips)
    true.

    /* Scryer Prolog 0.9.4-417 */
    ?- time(test).
       % CPU time: 0.006s, 32_042 inferences
       true.

    /* Trealla Prolog 2.79.6 */
    ?- time(test).
    % Time elapsed 0.016s, 42012 Inferences, 2.582 MLips
       true.

    /* Dogelog Player 1.3.5 */
    ?- time(test2).
    % % Zeit 83 ms, GC 0 ms, Lips 12117469, Uhr 30.07.2025 05:47
    % true.

    The Dogelog Player solutions uses a variant
    of library(hash), called library(maps).
    Both hash tables are itself 100% Prolog.

    Bye

    Mild Shock schrieb:
    Hi,

    Having fun with Bisimulation, and a new test
    suite of nasty circular pairs. But how store
    circular pairs, if clauses do not support

    circular terms. Well chop it up into equations,
    I create 1000 such equation pairs:

    test([A = c(A, B), C = c(A, D), E = n, _ = c(F, B),
           F = c(C, C), G = c(G, D), D = c(E, C), B = n],
          [H = c(H, I), J = c(K, I), L = c(L, I), I = c(M, N),
           O = c(K, J), N = n, K = c(K, J), M = c(K, O)]).
    Etc..

    The pairs are nasty because the usual compare_with_stack/2
    chokes on them. Here some results:

    /* SWI-Prolog 9.3.26 */
    ?- time((between(1,30,_), part, fail; true)).
    % 540,118 inferences, 0.047 CPU in 0.041 seconds (115% CPU, 11522517
    Lips)
    true.

    /* Trealla Prolog 2.78.40 */
    ?- time((between(1,30,_), part, fail; true)).
    % Time elapsed 0.113s, 1143903 Inferences, 10.157 MLips
        true.

    /* Scryer Prolog 0.9.4-417 */
    ?- time((between(1,30,_), part, fail; true)).
        % CPU time: 0.226s, 1_117_809 inferences
        true.

    /* Dogelog Player 1.3.5 */
    ?- time((between(1,30,_), part2, fail; true)).
    % Zeit 309 ms, GC 0 ms, Lips 8693718, Uhr 25.07.2025 13:47
    true.

    The amazing thing is, I compared a 100% Prolog
    implementation, so there is a lot of head room
    for improvement:

    part2 :-
        bitest(X,Y), X ~~ Y, fail; true.

    The operator (~~)/2 is part of library(math),
    and has been implemented with same_term/2 so far.

    Bye


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)