GSOC Project - Algebraic Expressions
Hello clojurians. I am glad to say that my Proposal for the Algebraic Expression project idea was accepted! It is a very interesting project and can be a big benefit for the scientific computing interested clojure programmers. In this blog post I want to tell what I plan to achieve in expresso ( a good name for the project :)), and also to discuss some relevant design decisions I have to make. I really appreciate every comment or advice. The more I get, the better I can make the library for you! So please comment.
Table of Contents
1 General Idea
Expresso aims to be a general purpose library to syntactically manipulate algebraic expressions, and combines clojure and the declarative power of core.logic to do this.
It will contain a solver and an optimizer, which are inspired by sympy and theano.
To give a more concrete idea, here ist the synopsis part of my proposal:
The goal is to implement a library for manipulating and analyzing algebraic expressions. This would allow several advanced features like solving a system of equations in regard to a variable X, or optimizing an expression for executing via core.matrix. The library should also be extensible, to build a foundation for symbolic manipulation of expressions, on which other more sophisticated transformations can be based.
I plan to use the search and backtracking capabilities of core.logic - the clojure port of mini-kanren - to manipulate expressions. The scope of the library would be:
- setting up a logic programming context, in which manipulations can be applied
- providing a general rule-based transformation facility - with convenient syntax for defining rules. A rule can be any logical clause, which enables even whole manipulations to be stated as rules
- transforming an expression into a (or more) suitable standart form (i.e unnesting sums, eliminating zero multiplication,…) for further manipulations.
- an input namespace, which makes it easy to transform and existing expression using core.matrix to a suitable one for algebraic manipulation
Also, it should contain two more sophisticated transformations
- an optimizer, which transforms an expression to be used with core.matrix to an equivalent expression, which runs faster. It could do optimizations, like removing redundant calculations,moving computation to compile time, analyzing the expression for parrallel executiion, …
- a solver for solving (sets of) equations. It would analyze the expression and recursively try to e.g isolate the variable if it occurs only once, solve the equation as an polynominal, contract multiple occurences of the variable. each of this transformations above can be a rule by itself and can contain rules by itself. This would make the solver extensible.
2 Core design decisions
Here are the core design decisions I need to address in order to be able to implement the more sophisticated solver and optimizer on top of it
- how to encode an expression
I currently plan to encode expressions as just plain old s-expressions. An extra type will be problematic because s-expressions are easy to handle with core.logic. I did not manage to get a self defined type play well in all situations with core.logic, but maybe some core.logic specialist could give advice if it is possible and recommended to use a custom type with core.logic. This raises other issues such as
- core/* and core.matrix/* behave differently - symbols will have to be fully qualified but there are many types on which + behaves the same - So I also want to have one set of rules for these +'ses. Clojures ad-hoc hierarchies could help here.
- I will need to store additional information on the expressions, like mathematical properties wich can be used to optimize it. without a custom type I could use metadata for it.
- how to express a rewrite rule
expresso will heavily rely on rewrite rules and also let the set of rules be extensible by users
- I plan to use a core.logic clause as a rule. It will succeed when the rule is applicable and fail when not when succeeding it unificates the second argument with the simplified rule.
- this would allow arbitrary complex rules, such as a whole calculation routing as a rule
- It will be useful to have a good syntax to create rules - I currently use a very simple one based on matche
- how to apply rules on an expression
- there are many ways to traverse an expression tree, it is probably best to support multiple and see which performs best
- rule based translators tend to perform badly with many rules. There need to be a good strategie to reduce the time to search for the right rule.
- how existing rule-based translators are extensible by users
- where to add a new rewrite rule - order matters
- how to construct an expression
Use of the library should be so seamless as possible. It would be perfect, if it could mirror the core.matrix api in an input namespace to construct an expression on which the library can work.
This are the central issues I can think of for now. There are others for the solver and optimizer, and surely other for the core part which I forgot…
3 The code
Mike Anderson created the expresso library and I committed to it to test my ideas. For now it contains a proof of concept rule-based translator and solver. In the code snippet below there are the rules used by simplification and solving and test cases. Remember that it is only a proof of concept.
; a rule to calculate an expression - is just a simple core.logic clause (defn calculo [eq eq1] (no-variablo eq) (resulto eq eq1)) ; A rule is a clause which gets an expression and returns a modified expression ; If a rule is not applicable, it fails (def simp-rules [(rule ['+ 0 x] :=> x) (rule ['+ x 0] :=> x) (rule ['* 0 x] :=> 0) (rule ['* x 0] :=> 0) (rule ['* 1 x] :=> x) (rule ['* x 1] :=> x) (rule ['- 0 x] :=> x) (rule ['- x 0] :=> x) (rule ['- x x] :=> 0) (rule ['* y ['+ 'x z]] :=> ['+ ['* y 'x] ['* y z]]) (rule ['* ['+ 'x z] y] :=> ['+ ['* y 'x] ['* y z]]) (rule ['+ x x] :=> ['* 2 x]) (rule ['+ ['* y 'x] ['* z 'x]] :=> ['* ['+ z y] 'x]) (rule ['+ ['* y 'x] ['* 'x z]] :=> ['* ['+ z y] 'x]) (rule ['+ ['* 'x y] ['* z 'x]] :=> ['* ['+ z y] 'x]) (rule ['+ ['* 'x y] ['* 'x z]] :=> ['* ['+ z y] 'x]) calculo ]) ;simple example (run* [q] (simplifyo '(+ x 0) q)) ;=> (x) ;rules are applied recursively (run* [q] (simplifyo '(* x (+ 0 (* 3 (* x 0)))) q)) ;=> (0) ;calculate is also just a rule, which checks if there is no variable in expression and calculates it (run* [q] (simplifyo '(+ (* 3 4) (+ (* x 0) 7)) q)) ;=> (19) ; will be used from the solver 1 means the variable x is on the left side (def isolate-rules [ (rule [1 ['= ['+ x y] z]] :=> ['= x ['- z y]]) (rule [2 ['= ['+ y x] z]] :=> ['= x ['- z y]]) (rule [1 ['= ['* x y] z]] :=> ['= x ['/ z y]]) (rule [2 ['= ['* y x] z]] :=> ['= x ['/ z y]]) (rule [1 ['= ['- x y] z]] :=> ['= x ['+ z y]]) (rule [2 ['= ['- y x] z]] :=> ['= ['- 0 x] ['- z y]]) ]) (run* [q] (solveo '(= (+ (* 3 (+ (* x 4) (* x 3))) 5) 89) 'x q)) ;=> ('(= x 4))
You can see the code here. The wiki pages contain more information. If you have any feature request or comment/advice for me, please comment or post on the mailing list. If you already have a use case on which you would want to use the library please tell me so I can have an accurate picture on what forms of expressions it should be able to handle and how it can be best used.
I am really glad that I managed to get into gsoc with such a good project. I am sure the summer will be interesting!