Designs for a Computer Algebra System

Creating a modern computer algebra system from the ground up, as has been a plan of mine for some months now, is no trivial goal, as anyone who has even a vague conceptual understanding of computational algebra must surely know. My efforts in this area, grouped under the title of “the Syracuse Project“, have involved mostly research and large amounts of contemplation so far, but I feel that I have finally managed to formulate enough solid ideas that they are worth presenting in a short article. I was also fortunate enough to find someone on IRC (in #math-software on Freenode) with whom I could discuss and refine many of our mutual ideas. The ideas discussed in this post are the culmination of my own thoughts and much of the knowledge I gained from my conversations with Robert Smith (nickname ‘Quadrescence’) online, which has served as the basis for some of my own investigation and explanations here, in a modified form.

What I am going to focus on here falls mainly under the domain of my Euclid.NET project, which is essentially the core of Archimedes, which one might describe as the “CAS kernel”.  Think of Euclid.NET as the framework for symbolic mathematics upon which Archimedes is built. Its primary purpose is to handle such tasks as expression evaluation, simplification, differentiation, series expansion, and so on. (Bonus points if you can spot the connection between the names of Archimedes and Syracuse.)

TeX.NET, which is the primary parser (expression tree builder) for the project at this moment, has already seen a stable release, and will soon see the second with any luck. As you might guess from the name, it takes input in TeX syntax (well, actually LaTeX, with a few extensions).

To begin, I thought I would tackle one of the tougher aspects of computer algebra systems, namely, simplification. More trivial features such as expression evaluation and differentiation are essential to a CAS, yet are hardly worth an in-depth discussion, at least not for now.

So what is simplification?

Simplification, to state the banal, is concerned with making an expression more “simple”. Unlike many every terms that have been borrowed by mathematics, “simple” or “simplification” truly does not have a more rigorous definition. The problem here largely stems from the fact that the definition of what is simple is somewhat fluid – it depends to a great extent on the context. To illustrate this nature more concretely, here a few examples.

  1. (x+1)^2 or x^2 + 2x + 1?
  2. y^{-3.5} or 1/y^{3.5} or 1/(y^3 \sqrt{y})?
  3. \tan x \sin x or \dfrac{\sin^2 x}{\cos x}?
  4. e^{A+(2x-y)} or e^A\left[\dfrac{e^{2x}}{e^{y}}\right]?
  5. 4 \sqrt{x^3 + 6x^2 + 9x} or 4(x + 3)^2 \sqrt{x} ?

A simple expression treeA visualisation of a very simple expression tree. The equivalent infix expression is (2 + 2) + (2 + 2) + (3 + 3). Note that in general such trees are not restricted to be binary.

I think that anyone who has had sufficient experience using maths and manipulating many formulae and expressions will realise that in some scenarios, one of the given equivalent expressions in 1 to 5 is desired in a certain scenario, while another is desired in a second situation. Hence, any conceivable simplification algorithm cannot be treated as a rigid mechanical process, bur rather must adjust itself depending on the parameters it is given, which hint at the sort of result that is desired.

When humans perform simplification of mathematical expressions, they often use so-called intuition, developed from much prior experience, along with trial and error, to (in most cases) quickly and accurately simplify maths. It is inevitable that even the best mathematicians hit dead ends when trying to simplify complicated expressions. A computer, perhaps unsurprisingly, cannot do any better. Moreover, mathematical simplification is, in my view, one of the few aspects of mathematical methodology that overall better suits a holistic intelligence, rather than the traditional sequential one that is most often associated with maths (theorem proving being a notable exception). In fact, mathematics is not all so logical and all-encompassing as even mathematicians not long ago thought – thanks to the magnificent Incompleteness Theorem proposed by Kurt Godel in the 1930s – this is however a subject for another day.

How can we measure simplicity?

Fortunately, and perhaps surprisingly, the field of evolutionary computation presents a rather handy way of treating the “simplicity” of expressions, that is, a fitness function – in its most abstract sense, something that measures the absolute “fitness” of any given solution for a certain optimisation problem (most commonly genetic algorithms, which lent the term “fitness”). The “fitness” may be thought of qualitatively as the value, worth, or suitability of a particular solution. The solutions, in this case, are of course expression trees.

To begin, it is greatly helpful to reduce the problem by extracting a certain (small) number of parameters from the expression tree, rather than trying to analyse the entire thing holistically. This reduces the parameter space (the set of all possible parameters) dramatically, which is typically highly beneficial in optimisation problems. The fitness function itself is allowed to take any form in general, though we shall see shortly that one or two classes of function in particular are desirable. For now, let us just focus on the set of parameters. After some consideration, I came to the conclusion that any effective parameter space must consist of the following variables:

  • Size of the expression tree (i.e. the number of nodes)
  • Height of the expression tree  (i.e. the number of layers)
  • Width of the expression tree (i.e. the number of nodes in the bottom layer).

These are all of course integer values, and thus the value of the function is restricted to lie within the set of integers. After building up an image that measuring simplicity is a tricky thing, this may seem like a rather straightforward framework; indeed, it is in some ways, thought it is worth noting that the apparent problems arise when we decide what function to use and when it should be applied. The reasons for the choice of these parameters should become apparent soon.

Let us first consider a basic fitness function that simply weights each parameter individually in a linear combination. In other words, the fitness function, F(S, H, W) may be defined as the following, where S, H, and W correspond to the size, height, and width of the expression tree respectively.

F(S, H, W) = aS + bH + cW

The constants a, b, c may be any integer (postive or negative), and should be passed to the simplifier routine rather than predefined, according to the desired output. It should be quite evident that by choosing different magnitudes as well as signs for these constants, each of the three parameters may be independently rewarded or penalised to greater or lesser extents. The question you might then ask is: why not a more complicated function depending on S, H, and W? My answer is a straightforward one: there is no need. A linear combination of terms gives enough control over the desired function that extending to the function to add higher-order or even exponential terms would be quite pointless and arbitrary. I have, however, far from closed my mind on this matter – as I design and test the system progressively, certain discoveries may be made that suggest a slightly different approach.

Indeed, my only other real consideration for a fitness function thus far is one of even more basic form. Suppose, for example, that the fitness function only depended on two things: a) which parameter should be prioritised, b) whether this parameter should be promoted or demoted. The other two parameters would simply be minimised in the case that the first shows no preference between two trees. I have not finalised the implementation of this, but hopefully this brief description should give you an idea.

I now only leave, as an exercise to the reader, the five example expressions given in the previous section, and considered how any desired result (simplified form) can be achieved through the selection of the appropiate fitness function (using either of the two I have just proposed). Of course, feel free to post a comment regarding any queries or findings you have regarding this matter.

An effective algorithm for simplification

So far, I have discussed how simplicity can be measured in absolute terms, and how in this way the most “simple” of a set of solutions (expression trees) can be chosen as the result of the algorithm. What I have not really mentioned, however, is how an algorithm might actually search out the possible solutions. Although the nature of the algorithm is mainly independent of the fitness function  and the evaluation of expression trees, it is helpful to discuss this second so as to give a clearer image to the approach as a whole.

Simplification when done by a computer, as when done by humans, involves at its heart the application of a large number of mathematical rules that transform expressions. To give some examples of a few of the more basic rules:

  • x + 0 \rightarrow x
  • 1 * x \rightarrow x
  • x * x \rightarrow x^2
  • x * (y / x) \rightarrow y
  • x^2 - y^2 \rightarrow (x + y)(x - y)
  • \sin^2 x + \cos^2 x \rightarrow 1

Given a complete set of all simplification rules, we can find a path (or derivation) between any two equivalent expressions. (In theory, this is no issue, since the set is finite, though rather large. The practicality of  all the required rules is another issue that I will not go into here.) Note that the rules are bidirectional; they allow you take a simple expression and sequentially transform it into an arbitrarily complex, yet still identical one. (For example, x \rightarrow x + 0 - 0 + 0 - 0 + 0.)

Assuming (quite reasonably) that this assertion that a “finite number of application of simplification rules can derive any equivalent expression from the original” is valid, we must then consider how the search should proceed. Under the utterly naive brute force approach, the algorithm is clearly non-terminating, but we can do a lot better than this.

The search algorithm is all about compromisation in essence. If we search every possible derived expression (up to a certain size), then it could take an unreasonably length of time to simplify even relatively tiny expressions. On the other hand, we make too many assumptions and cut off many branches in the search tree prematurely, the algorithm may terminate quickly, though not necessarily with the simplest solution (or even anything close). Hence, the idea of using genetic algorithms, albeit initially appealing, is in my view to restrictive to a problem that requires a “perfect” answer in most cases, and should not have any stochastic nature.

My currently planned approach is one that does not differ greatly from a simple brute force evaluation of all the simplification paths. The main improvement is one that falls out rather naturally from using a fitness function. The idea is that each node of the search tree is evaluated by the fitness function upon its creation, and if the fitness is below a certain (specified) threshold, the search from that particular node terminates. For a start, this prevents originally small/simple expression trees being transformed into ones that are absurdly large, while still allowing some limited expansion of expression trees in the hope that they may later be simplified very effectively. Many nonsensical (what we might call counter-intuitive) paths of reduction of the expression would also be eliminated in a similar way. There is also one practical problem of important note here: many disparate simplification paths along the tree do converge to the same solution during a search (some quite quickly), so it would be quite foolish to branch twice from identical nodes of the search tree. Instead, we really want to cache any nodes (expression trees) already visited during the search process, and not compute their descendants (derived children) more than once. A simple hash table (set, in fact) would seem like the most effective way of accomplish this. Creating an efficient and relatively collision-free hash function for an arbitrary tree structure is however no trivial task. I was to get a number of quite sensible and useful responds when I asked the question on StackOverflow.

Apologies if this discussion of “search trees” and “expression trees” and their corresponding nodes has led to some confusion regarding what is what. It is most important to recognise that the node of a search tree is itself an expression tree (what I sometimes call a “solution” to the “simplification problem”). Due to the risk of losing reader interest at the cost of an even longer post, I shall stop my ramblings here, and leave further elaboration of the search algorithm for another post.

The future of the project

What has been discussed so far is largely theoretical, yet I have tried to present it in such a way that the method of implementation is for the most part self-evident. Work on this project will likely proceed slowly in the short-term future, though as it advances, the features will surely solidify. I am hoping that at least by summer there should be some tangible results to these efforts. Regardless, I shall try to give status updates along the way.

Leave a Reply