Forum: >>> Magnum BBS <<<

Re: Higher Order Logic Programming and Autograd

From Mild Shock@21:1/5 to Mild Shock on Sat Mar 15 16:36:58 2025

What can we do with these new toys? We can
implement vector operations and matrix
operations, and then apply them, for example,
to layered neural networks by representing
them as:

/**
 * Network is represented as [N0,M1,N1,...,Mn,Nn]
 * - Where N0 is the input neurons vector
 * - Where N1 .. Nn-1 are the hidden neurons vectors
 * - Where Nn is the output neurons vector
 * - Where M1 .. Mn are the transition weight matrices
 */

?- mknet([3,2], X).
X = [''(-1, 1, 1), ''(''(1, 1, -1), ''(1, 1, -1)), ''(-1, 1)].

The model evaluation at a data point
is straightforward:

% eval(+Network, -Network): forward pass that fills in the activations
eval([V], [V]) :- !.
eval([V,M,_|L], [V,M|R]) :- !,
    matmul(M, V, H),
    vecact(H, expit, J),
    eval([J|L], R).
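The post does not show matmul/3 and vecact/3. A minimal sketch of what they might look like follows, under the assumption that vectors are plain lists and matrices are lists of rows (the original uses compounds with the empty atom '' as functor instead). The helper dot/3 is a name introduced only for this illustration; eval/2 and expit/2 are repeated from the thread so that the block runs on its own.

```prolog
% Assumption (not from the post): vectors are lists of numbers and
% matrices are lists of rows, instead of ''-compounds.

% expit(+X, -Y): logistic activation
expit(X, Y) :- Y is 1/(1+exp(-X)).

% dot(+Vector, +Vector, -Number): hypothetical helper, scalar product
dot([], [], 0).
dot([X|Xs], [Y|Ys], S) :- dot(Xs, Ys, S0), S is S0 + X*Y.

% matmul(+Matrix, +Vector, -Vector): one scalar product per row
matmul([], _, []).
matmul([Row|Rows], V, [H|Hs]) :-
    dot(Row, V, H),
    matmul(Rows, V, Hs).

% vecact(+Vector, +Closure, -Vector): apply Closure element-wise
vecact([], _, []).
vecact([X|Xs], C, [Y|Ys]) :-
    call(C, X, Y),
    vecact(Xs, C, Ys).

% eval/2 as posted
eval([V], [V]) :- !.
eval([V,M,_|L], [V,M|R]) :- !,
    matmul(M, V, H),
    vecact(H, expit, J),
    eval([J|L], R).
```

With these definitions, a query such as `?- eval([[0,0],[[1,1]],[0]], R).` replaces the placeholder output vector by the computed activation, here expit(0) = 0.5.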

The backward calculation of deltas
is equally straightforward:

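% Output layer: difference between U and the output V,
% multiplied by the activation derivative
back([V], U, [D]) :- !,
    vecact(U, V, sub, E),
    vecact(E, V, mulderiv, D).
% Inner layers: propagate the delta D back through the transposed weights
back([V,M,W|L], U, [D2,M,D|R]) :-
    back([W|L], U, [D|R]),
    mattran(M, M2),
    matmul(M2, D, E),
    vecact(E, V, mulderiv, D2).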

You can use this to compute weight changes
and drive a gradient algorithm.
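As a rough sketch of how the deltas could drive a weight update, the block below adds Eta*D_i*V_j to each weight. Everything beyond back/3 and mulderiv/3 is an assumption made for illustration: vectors are lists, matrices are lists of rows, sub/3 computes U - V (so the deltas point towards the target and are added), and update/4, dot/3 and heads_tails/3 are hypothetical names that do not appear in the post.

```prolog
% Assumptions (not from the post): list-based vectors and matrices,
% sub/3 computes U - V, and Eta is a learning rate chosen by the caller.
sub(X, Y, Z) :- Z is X-Y.
mulderiv(X, Y, Z) :- Z is X*Y*(1-Y).

dot([], [], 0).
dot([X|Xs], [Y|Ys], S) :- dot(Xs, Ys, S0), S is S0 + X*Y.

matmul([], _, []).
matmul([Row|Rows], V, [H|Hs]) :- dot(Row, V, H), matmul(Rows, V, Hs).

% vecact(+Vector, +Vector, +Closure, -Vector): element-wise binary operation
vecact([], [], _, []).
vecact([X|Xs], [Y|Ys], C, [Z|Zs]) :-
    call(C, X, Y, Z),
    vecact(Xs, Ys, C, Zs).

% mattran(+Matrix, -Matrix): transpose a list of rows
mattran([], []) :- !.
mattran([[]|_], []) :- !.
mattran(M, [Col|Cols]) :-
    heads_tails(M, Col, Rest),
    mattran(Rest, Cols).

heads_tails([], [], []).
heads_tails([[X|Xs]|Rows], [X|Hs], [Xs|Ts]) :- heads_tails(Rows, Hs, Ts).

% back/3 as posted
back([V], U, [D]) :- !,
    vecact(U, V, sub, E),
    vecact(E, V, mulderiv, D).
back([V,M,W|L], U, [D2,M,D|R]) :-
    back([W|L], U, [D|R]),
    mattran(M, M2),
    matmul(M2, D, E),
    vecact(E, V, mulderiv, D2).

% update(+Activations, +Deltas, +Eta, -Network): hypothetical update step,
% each weight W_ij becomes W_ij + Eta*D_i*V_j
update([V], [_], _, [V]).
update([V,M,W|L], [_,_,D|R], Eta, [V,M2|S]) :-
    maplist(update_row(Eta, V), M, D, M2),
    update([W|L], [D|R], Eta, S).

update_row(Eta, V, Row, Di, Row2) :-
    maplist(update_weight(Eta, Di), Row, V, Row2).

update_weight(Eta, Di, W, Vj, W2) :- W2 is W + Eta*Di*Vj.
```

A training step would then evaluate the network at a data point, call back/3 with the target vector, and pass both results to update/4. If sub/3 is defined the other way round, the sign of the update has to be flipped.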


--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Mild Shock@21:1/5 to All on Sat Mar 15 16:36:20 2025

new Prolog system. I thought that, since my new Prolog
system has only monomorphic caches, I would never be able
to replicate what I did for my old Prolog system with
arity-polymorphic caches. This changed when I had
the idea to dynamically add a cache for the duration
of a higher-order loop such as maplist/n, foldl/n, etc.

So this is the new implementation of maplist/3:

% maplist(+Closure, +List, -List)
maplist(C, L, R) :-
    sys_callable_cacheable(C, D),
    sys_maplist(L, D, R).

% sys_maplist(+List, +Closure, -List)
sys_maplist([], _, []).
sys_maplist([X|L], C, [Y|R]) :-
    call(C, X, Y),
    sys_maplist(L, C, R).
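Since sys_callable_cacheable/2 is internal to the author's system, the code above does not run elsewhere as is. A portable sketch of the same shape follows, with the cache step replaced by a do-nothing stub. The names prepare_closure/2, my_maplist/3 and my_maplist_/3 are hypothetical and chosen only to avoid clashing with the library maplist/3.

```prolog
% prepare_closure(+Closure, -Closure): hypothetical stand-in for
% sys_callable_cacheable/2; this stub just returns the closure unchanged.
prepare_closure(C, C).

% my_maplist(+Closure, +List, -List)
my_maplist(C, L, R) :-
    prepare_closure(C, D),
    my_maplist_(L, D, R).

% my_maplist_(+List, +Closure, -List): the list comes first, so the two
% clauses differ in their first argument ([] versus [_|_])
my_maplist_([], _, []).
my_maplist_([X|L], C, [Y|R]) :-
    call(C, X, Y),
    my_maplist_(L, C, R).
```

For example, `?- my_maplist(succ, [1,2,3], R).` should give `R = [2,3,4]` on systems that provide succ/2.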

It is similar to the SWI-Prolog implementation in that
it reorders the arguments for better first-argument
indexing. The new thing is sys_callable_cacheable/2,
which prepares the closure to be called more
efficiently. The invocation of the closure is already
quite fast since call/3 is implemented natively,
but the cache adds a bit more speed. Here are some
measurements that I did:

/* SWI-Prolog 9.3.20 */
?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
    maplist(succ,L,_),fail; true)), fail.
% 2,003,000 inferences, 0.078 CPU in 0.094 seconds

/* Scryer Prolog 0.9.4-350 */
?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
    maplist(succ,L,_),fail; true)), fail.
% CPU time: 0.318s, 3_007_105 inferences

/* Dogelog Player 1.3.1 */
?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
    maplist(succ,L,_),fail; true)), fail.
% Zeit 342 ms, GC 0 ms, Lips 11713646, Uhr 10.03.2025 09:18

/* Trealla Prolog 2.64.6-2 */
?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
    maplist(succ,L,_),fail; true)), fail.
% Time elapsed 1.694s, 15004003 Inferences, 8.855 MLips

Not surprisingly, SWI-Prolog is fastest (0.078 s CPU).
A small surprise is that Scryer Prolog does quite well
(0.318 s), possibly because they use maplist/n heavily all
over the place and came up with things like '$fast_call'
in their call/n implementation. Trealla Prolog (1.694 s)
is a little disappointing at the moment.


From Mild Shock@21:1/5 to Mild Shock on Sat Mar 15 16:37:58 2025

But where is Autograd, automatic differentiation from
some symbolic input? In general you can objectify
neural networks, which I already did with the Prolog
list, and routines such as back/3 are pure Prolog.
Basically you could symbolically derive expit
(activation), mulderiv (the product with the derivative
of the activation) and mattran (the Jacobian without
activation) from a DAG of vector functions. In a linear
neural network, the Jacobian without activation is
the same as the weights, and expit has a simple derivative
that is based on the expit result itself, which is
already stored as the activation:

/* g(x) = logistic function */
expit(X, Y) :- Y is 1/(1+exp(-X)).

/* g'(x) = g(x)*(1-g(x)) */
mulderiv(X, Y, Z) :- Z is X*Y*(1-Y).
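
The identity g'(x) = g(x)*(1-g(x)) can be checked numerically. The following sketch compares mulderiv/3, called with factor 1, against a central finite difference. The predicates numeric_deriv/2 and check/1, the step size and the tolerance are assumptions made for this illustration only.

```prolog
/* g(x) = logistic function */
expit(X, Y) :- Y is 1/(1+exp(-X)).

/* g'(x) = g(x)*(1-g(x)) */
mulderiv(X, Y, Z) :- Z is X*Y*(1-Y).

% numeric_deriv(+X, -D): hypothetical helper, central finite difference
% of expit at X with an assumed step size of 1.0e-5
numeric_deriv(X, D) :-
    H = 1.0e-5,
    X1 is X+H,
    X0 is X-H,
    expit(X1, Y1),
    expit(X0, Y0),
    D is (Y1-Y0)/(2*H).

% check(+X): succeeds if the closed form and the finite difference
% agree up to an assumed tolerance of 1.0e-8
check(X) :-
    expit(X, Y),
    mulderiv(1, Y, A),
    numeric_deriv(X, N),
    abs(A-N) < 1.0e-8.
```

A query such as `?- check(0), check(1.5), check(-2).` should succeed, which illustrates why back/3 only needs the stored activation Y and never the pre-activation input.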

See also:

A Gentle Introduction to torch.autograd https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html


From Mild Shock@21:1/5 to Mild Shock on Sat Mar 15 16:44:04 2025

A storm of symbolic differentiation libraries
was posted. But what can these Prolog code
fossils do?

Does any of these libraries support something like
Python's symbolic Piecewise? For example, one can define
the rectified linear unit (ReLU) with it:

           /  x    x >= 0
ReLU(x) := <
           \  0    otherwise

With the above, one can already translate a
propositional logic program that uses negation
as failure into a neural network:

NOT    \+ p           1 - x
AND    p1, ..., pn    ReLU(x1 + ... + xn - (n-1))
OR     p1; ...; pn    1 - ReLU(-x1 - ... - xn + 1)
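
The three translations can be tried out directly in Prolog on truth values 0 and 1. The predicate names relu/2, neg/2, conj/2, disj/2 and sum_of/2 below are chosen for this sketch and do not come from the post.

```prolog
% relu(+X, -Y): rectified linear unit, Y = max(0, X)
relu(X, Y) :- Y is max(0, X).

% sum_of(+Numbers, -Sum): hypothetical helper
sum_of([], 0).
sum_of([X|Xs], S) :- sum_of(Xs, S0), S is S0 + X.

% neg(+X, -Y): NOT as 1 - x
neg(X, Y) :- Y is 1 - X.

% conj(+Xs, -Y): AND as ReLU(x1 + ... + xn - (n-1))
conj(Xs, Y) :-
    sum_of(Xs, S),
    length(Xs, N),
    relu(S - (N-1), Y).

% disj(+Xs, -Y): OR as 1 - ReLU(-x1 - ... - xn + 1)
disj(Xs, Y) :-
    sum_of(Xs, S),
    relu(1 - S, R),
    Y is 1 - R.
```

For instance, `?- conj([1,1,0], Y).` gives `Y = 0` and `?- disj([0,1,0], Y).` gives `Y = 1`, matching the Boolean truth tables.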

For clauses, just use Clark Completion. It makes
the defined predicate a new neuron that depends on
other predicate neurons through a network of
intermediate neurons. Because of the constant shift
in AND and OR, the neurons will have a bias b.

So rule-based programs in zero-order (propositional)
logic are a subset of neural networks.
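
As a small worked example, take the two clauses `p :- q, \+ r.` and `p :- s.`, whose completion is p <-> (q, \+ r) ; s. The sketch below writes each intermediate neuron as ReLU(weights . inputs + bias). The program, the predicate names neuron/4, dot/3 and p/4, and the concrete weights are assumptions made for illustration.

```prolog
% neuron(+Weights, +Bias, +Inputs, -Out): Out = ReLU(Weights . Inputs + Bias)
neuron(Ws, B, Xs, Y) :-
    dot(Ws, Xs, S),
    Y is max(0, S + B).

% dot(+Vector, +Vector, -Number): hypothetical helper, scalar product
dot([], [], 0).
dot([W|Ws], [X|Xs], S) :- dot(Ws, Xs, S0), S is S0 + W*X.

% Program:     p :- q, \+ r.     p :- s.
% Completion:  p <-> (q, \+ r) ; s
% p(+Q, +R, +S, -P): truth value of p for 0/1 inputs
p(Q, R, S, P) :-
    % AND of q and (1 - r): ReLU(q + (1-r) - 1) = ReLU(q - r), bias 0
    neuron([1,-1], 0, [Q,R], H),
    % OR of h and s: 1 - ReLU(-h - s + 1), bias 1
    neuron([-1,-1], 1, [H,S], T),
    P is 1 - T.
```

Here the constant from NOT cancels the shift of the AND neuron, so only the OR neuron ends up with a non-zero bias. Queries such as `?- p(1, 0, 0, P).` and `?- p(1, 1, 0, P).` give `P = 1` and `P = 0` respectively.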

Python symbolic Piecewise https://how-to-data.org/how-to-write-a-piecewise-defined-function-in-python-using-sympy/

rectified linear unit (ReLU) https://en.wikipedia.org/wiki/Rectifier_(neural_networks)

Clark Completion https://www.cs.utexas.edu/~vl/teaching/lbai/completion.pdf

