Re: Higher Order Logic Programming and Autograd
From Mild Shock@21:1/5 to Mild Shock on Sat Mar 15 16:36:58 2025
What can we do with these new toys? We
can implement vector and matrix operations,
and then apply them, for example, to layered
neural networks by representing them as:
/**
 * Network is represented as [N0,M1,N1,...,Mn,Nn]
 * - Where N0 is the input neuron vector
 * - Where N1 .. Nn-1 are the hidden neuron vectors
 * - Where Nn is the output neuron vector
 * - Where M1 .. Mn are the transition weight matrices
 */
?- mknet([3,2], X).
X = [''(-1, 1, 1), ''(''(1, 1, -1), ''(1, 1, -1)), ''(-1, 1)].
The model evaluation at a data point
is straightforward:
eval([V], [V]) :- !.
eval([V,M,_|L], [V,M|R]) :- !,
   matmul(M, V, H),       % weighted input of the next layer
   vecact(H, expit, J),   % apply the logistic activation
   eval([J|L], R).
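The helpers matmul/3, vecact/3, vecact/4, mattran/2 and sub/3 are not shown in this post. Below is a minimal sketch of how they could look, assuming vectors are compound terms such as ''(-1,1,1), a matrix is a compound term of row vectors, and maplist/N comes from library(apply) as in SWI-Prolog. The helper names dot, columns, head_tail, term_args and args_term are made up for this sketch, and the argument order of sub/3 is a guess.

```prolog
:- use_module(library(apply)).   % maplist/3..5, SWI-Prolog assumed

% vecact(+Vec, +Closure, -Vec): apply Closure to every element
vecact(V, C, W) :-
   V =.. [F|Xs],
   maplist(C, Xs, Ys),
   W =.. [F|Ys].

% vecact(+Vec, +Vec, +Closure, -Vec): combine two vectors elementwise
vecact(U, V, C, W) :-
   U =.. [F|Xs],
   V =.. [_|Ys],
   maplist(C, Xs, Ys, Zs),
   W =.. [F|Zs].

% matmul(+Matrix, +Vec, -Vec): one dot product per matrix row
matmul(M, V, W) :-
   M =.. [F|Rows],
   V =.. [_|Xs],
   maplist(dot(Xs), Rows, Ys),
   W =.. [F|Ys].

dot(Xs, Row, S) :- Row =.. [_|Ys], dot(Xs, Ys, 0, S).

dot([], [], S, S).
dot([X|Xs], [Y|Ys], A, S) :- B is A + X*Y, dot(Xs, Ys, B, S).

% mattran(+Matrix, -Matrix): rows become columns
mattran(M, T) :-
   M =.. [F|Rows],
   maplist(term_args, Rows, Lists),
   columns(Lists, Cols),
   maplist(args_term(F), Cols, TRows),
   T =.. [F|TRows].

term_args(T, As) :- T =.. [_|As].
args_term(F, As, T) :- T =.. [F|As].

columns([], []) :- !.
columns([[]|_], []) :- !.
columns(Rows, [Col|Cols]) :-
   maplist(head_tail, Rows, Col, Rests),
   columns(Rests, Cols).

head_tail([H|T], H, T).

% sub/3 as used by back/3 (argument order assumed), expit/2 for eval/2
sub(X, Y, Z) :- Z is X - Y.
expit(X, Y) :- Y is 1/(1+exp(-X)).
```

With these definitions, the first layer of the mknet([3,2], X) example gets the weighted input ''(-1,-1), since each row (1,1,-1) times the input (-1,1,1) sums to -1.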
The backward calculation of the deltas
is equally straightforward:
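back([V], U, [D]) :- !,
   vecact(U, V, sub, E),        % error between target U and output V
   vecact(E, V, mulderiv, D).   % scale by the activation derivative
back([V,M,W|L], U, [D2,M,D|R]) :-
   back([W|L], U, [D|R]),       % deltas of the later layers first
   mattran(M, M2),
   matmul(M2, D, E),            % propagate the delta back through M
   vecact(E, V, mulderiv, D2).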
You can use this to compute weight changes
and drive a gradient algorithm.
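One way such a weight change could look is sketched below. It is not the code behind this post: update/4, matstep/5, rowstep/5, wstep/5 and the step size Eta are names invented for the sketch. It assumes the deltas come from back/3 with sub/3 computing target minus output, so that adding Eta*D[i]*V[j] to each weight moves downhill on the squared error. maplist/N is taken from library(apply) as in SWI-Prolog.

```prolog
:- use_module(library(apply)).   % maplist/4..5, SWI-Prolog assumed

% update(+Activations, +Deltas, +Eta, -Network)
%   Activations: result of eval/2, [V0,M1,V1,...,Mn,Vn]
%   Deltas:      result of back/3, [D0,M1,D1,...,Mn,Dn]
update([V], [_], _, [V]).
update([V,M,W|L], [_,_,D|R], Eta, [V,M2|S]) :-
   matstep(M, D, V, Eta, M2),
   update([W|L], [D|R], Eta, S).

% matstep: M2[i][j] is M[i][j] + Eta*D[i]*V[j]
matstep(M, D, V, Eta, M2) :-
   M =.. [F|Rows], D =.. [_|Ds], V =.. [_|Xs],
   maplist(rowstep(Xs, Eta), Rows, Ds, Rows2),
   M2 =.. [F|Rows2].

rowstep(Xs, Eta, Row, Di, Row2) :-
   Row =.. [F|Ws],
   maplist(wstep(Eta, Di), Ws, Xs, Ws2),
   Row2 =.. [F|Ws2].

wstep(Eta, Di, W, X, W2) :- W2 is W + Eta*Di*X.
```

A training loop would then alternate eval/2, back/3 and update/4 over the data points until the error is small enough.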
--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)
From Mild Shock@21:1/5 to All on Sat Mar 15 16:36:20 2025
new Prolog system. I thought that since my new Prolog
system has only monomorphic caches, I would never be able
to replicate what I did for my old Prolog system with
arity-polymorphic caches. This changed when I had
the idea to dynamically add a cache for the duration
of a higher-order loop such as maplist/n, foldl/n, etc.
So this is the new implementation of maplist/3:
% maplist(+Closure, +List, -List)
maplist(C, L, R) :-
   sys_callable_cacheable(C, D),
   sys_maplist(L, D, R).

% sys_maplist(+List, +Closure, -List)
sys_maplist([], _, []).
sys_maplist([X|L], C, [Y|R]) :-
   call(C, X, Y),
   sys_maplist(L, C, R).
It is similar to the SWI-Prolog implementation in that
it reorders the arguments for better first-argument
indexing. The new thing is sys_callable_cacheable/2,
which prepares the closure so that it can be called more
efficiently. The invocation of the closure is already
quite fast since call/3 is implemented natively,
but the cache adds a bit more speed. Here are some
measurements that I did:
/* SWI-Prolog 9.3.20 */
?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
maplist(succ,L,_),fail; true)), fail.
% 2,003,000 inferences, 0.078 CPU in 0.094 seconds
/* Scryer Prolog 0.9.4-350 */
?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
maplist(succ,L,_),fail; true)), fail.
% CPU time: 0.318s, 3_007_105 inferences
/* Dogelog Player 1.3.1 */
?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
maplist(succ,L,_),fail; true)), fail.
% Zeit 342 ms, GC 0 ms, Lips 11713646, Uhr 10.03.2025 09:18
/* Trealla Prolog 2.64.6-2 */
?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
maplist(succ,L,_),fail; true)), fail.
% Time elapsed 1.694s, 15004003 Inferences, 8.855 MLips
Not surprisingly, SWI-Prolog is the fastest. A little
surprising is that Scryer Prolog is also quite fast,
possibly because they use maplist/n heavily all
over the place and came up with things like '$fast_call'
in their call/n implementation. Trealla Prolog is
a little bit disappointing at the moment.
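The same pattern should carry over to foldl/n. Here is a sketch for the foldl/4 case under the name my_foldl/4, chosen to avoid a clash with a library foldl/4. Since sys_callable_cacheable/2 is specific to the system discussed here, the sketch replaces it with a stand-in that just passes the closure through, so no cache is actually installed. sys_foldl/4 and add/3 are invented names.

```prolog
% Stand-in (assumption): the real built-in would attach a cache to
% the closure; this version returns the closure unchanged.
sys_callable_cacheable(C, C).

% my_foldl(+Closure, +List, +Acc0, -Acc)
my_foldl(C, L, A0, A) :-
   sys_callable_cacheable(C, D),
   sys_foldl(L, D, A0, A).

% sys_foldl(+List, +Closure, +Acc0, -Acc)
% The list comes first so that first-argument indexing applies.
sys_foldl([], _, A, A).
sys_foldl([X|L], C, A0, A) :-
   call(C, X, A0, A1),
   sys_foldl(L, C, A1, A).

% Example closure: add an element to the accumulator
add(X, A0, A) :- A is A0 + X.
```

For example, ?- my_foldl(add, [1,2,3], 0, S). binds S to 6.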
From Mild Shock@21:1/5 to Mild Shock on Sat Mar 15 16:37:58 2025
But where is Autograd, automatic differentiation from
some symbolic input? In general you can objectify
neural networks, which I already did with the Prolog
list, and routines such as back/3 are pure Prolog.
Basically you could symbolically derive expit
(activation), mulderiv (the product with the derivative
of the activation) and mattran (the Jacobian without
activation) from a DAG of vector functions. In a linear
neural network, the Jacobian without activation is
the same as the weights, and expit has a simple derivative
that is based on the expit result itself, which is
already stored as the activation:
/* g(x) = logistic function */
expit(X, Y) :- Y is 1/(1+exp(-X)).
/* g'(x) = g(x)*(1-g(x)) */
mulderiv(X, Y, Z) :- Z is X*Y*(1-Y).
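To illustrate what symbolic derivation could mean here, below is a toy differentiator for scalar expressions, not for a DAG of vector functions. d/3 and check_expit/3 are invented names and only the operators needed for the logistic function are covered. The check compares the symbolically derived derivative of 1/(1+exp(-X)) with the closed form g(x)*(1-g(x)) used by mulderiv/3.

```prolog
% d(+Expr, +Var, -Deriv): derivative of Expr with respect to Var,
% where Var is an unbound Prolog variable occurring in Expr.
d(E, X, D) :- var(E), !, ( E == X -> D = 1 ; D = 0 ).
d(E, _, 0) :- number(E), !.
d(-U, X, -DU) :- d(U, X, DU).
d(U+V, X, DU+DV) :- d(U, X, DU), d(V, X, DV).
d(U*V, X, DU*V+U*DV) :- d(U, X, DU), d(V, X, DV).
d(1/U, X, -DU/(U*U)) :- d(U, X, DU).
d(exp(U), X, exp(U)*DU) :- d(U, X, DU).

% check_expit(+X0, -Sym, -Closed): evaluate both forms at X0
check_expit(X0, Sym, Closed) :-
   G = 1/(1+exp(-X)),
   d(G, X, DG),          % differentiate while X is still unbound
   X = X0,               % then plug in the data point
   Sym is DG,
   Y is G,
   Closed is Y*(1-Y).
```

Both values agree up to rounding, for example at X0 = 0 they are 0.25. The derived term is not simplified, so a real Autograd would want a simplifier or would work on the stored activations as mulderiv/3 does.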
See also:
A Gentle Introduction to torch.autograd
https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html
From Mild Shock@21:1/5 to Mild Shock on Sat Mar 15 16:44:04 2025
A storm of symbolic differentiation libraries
was posted. But what can these Prolog code
fossils do?
Does any of these libraries support something like
Python's symbolic Piecewise? For example, one can define
the rectified linear unit (ReLU) with it:
           /  x    if x >= 0
ReLU(x) := <
           \  0    otherwise
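In plain Prolog the two cases can be written directly as two clauses. The sketch below also adds a counterpart of mulderiv/3 for ReLU under the made-up name mulderiv_relu/3; it takes the derivative at 0 to be 0, which is a convention, not something the definition above fixes.

```prolog
% relu(+X, -Y): Y is X for X >= 0, otherwise 0
relu(X, Y) :- X >= 0, !, Y = X.
relu(_, 0).

% mulderiv_relu(+X, +Y, -Z): multiply X with the ReLU derivative,
% where Y is the stored ReLU output (1 where Y > 0, else 0)
mulderiv_relu(X, Y, Z) :- Y > 0, !, Z = X.
mulderiv_relu(_, _, 0).
```

As with expit, the derivative only needs the stored activation Y, so it would fit into back/3 the same way mulderiv/3 does.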
With the above one can already translate a
propositional logic program, that uses negation
as failure, into a neural network:
NOT    \+ p           1 - x
AND    p1, ..., pn    ReLU(x1 + ... + xn - (n-1))
OR     p1; ...; pn    1 - ReLU(-x1 - ... - xn + 1)
For clauses just use Clark completion: it makes
the defined predicate a new neuron that depends on
other predicate neurons
through a network of intermediate neurons. Because
of the constant shift in AND and OR, the neurons
will have a bias b.
So rule-based systems in zero-order logic are a subset
of neural networks.
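The translation can be tried out on truth values 0 and 1 with a few lines of Prolog. The sketch below implements the three formulas for NOT, AND and OR and applies them to a made-up two-clause program. The names neuron_not/2, neuron_and/2, neuron_or/2, neuron_p/4 and total/2 are invented for this example.

```prolog
relu(X, Y) :- X >= 0, !, Y = X.
relu(_, 0).

% total(+List, -Sum): sum of the inputs
total([], 0).
total([X|Xs], S) :- total(Xs, S0), S is S0 + X.

% NOT: 1 - x
neuron_not(X, Y) :- Y is 1 - X.

% AND: ReLU(x1 + ... + xn - (n-1)), weights 1 and bias -(n-1)
neuron_and(Xs, Y) :-
   total(Xs, S), length(Xs, N),
   A is S - (N-1),
   relu(A, Y).

% OR: 1 - ReLU(-x1 - ... - xn + 1), inner weights -1 and bias 1
neuron_or(Xs, Y) :-
   total(Xs, S),
   A is 1 - S,
   relu(A, R),
   Y is 1 - R.

% Hypothetical program:   p :- q, \+ r.    p :- s.
% Clark completion:       p <-> (q, \+ r) ; s
neuron_p(Q, R, S, P) :-
   neuron_not(R, NR),
   neuron_and([Q, NR], A),
   neuron_or([A, S], P).
```

For example, ?- neuron_p(1, 0, 0, P). gives P = 1, while ?- neuron_p(1, 1, 0, P). gives P = 0, matching what the logic program would derive under negation as failure.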
Python symbolic Piecewise
https://how-to-data.org/how-to-write-a-piecewise-defined-function-in-python-using-sympy/
rectified linear unit (ReLU)
https://en.wikipedia.org/wiki/Rectifier_(neural_networks)
Clark Completion
https://www.cs.utexas.edu/~vl/teaching/lbai/completion.pdf