Sunday, March 28, 2021

Mistakes in Symbolic Computation I

I've started a series of posts on the Julia Zulip, trying to put together my thoughts on mistakes made in previous CASes. I will also post them here, as a good way to keep track of them. I'm taking the first 3 posts and merging them into this first post here.

Problem: Zulip is markdown+latex and Blogger does not like this. So this is going to look awful until I switch engines, which I think I'm now forced to do.

Symbols are not Imperative Variables

An enormous mistake that Maple made was to confuse symbols and variables, made doubly worse because Maple is fundamentally imperative at its core (even though it also has a smattering of functional features).

Being able to do `p := 5*x^2+7*x-5` and then having the value of `p` change (upon evaluation) **after** `x := 12` is executed is sheer idiocy. Assignment is a rotten way to do substitution. Much worse, as Maple is untyped, one can do all sorts of stuff like `x := infinity`, `x := <some matrix>`, `x := a + 5` and `x := [3,4,5]`, all of which have very different effects on `p`. It's even possible to do weird things by assigning to `x` something which depends on `x` itself (or on `p`). Sometimes these are caught, sometimes these lead to infinite loops in evaluation. It's all very bad.

I don't believe this is currently a problem in Julia - just don't get suckered into going this route.
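For contrast, here is roughly how this looks in Symbolics.jl today (a minimal sketch, assuming the usual `@variables`/`substitute` API): substitution is an explicit operation that returns a new expression, and a later binding of some Julia variable cannot silently reach inside `p`.

```julia
using Symbolics

@variables x
p = 5x^2 + 7x - 5

# Substitution is explicit and produces a *new* expression; `p` itself is untouched.
substitute(p, Dict(x => 12))       # 799
substitute(p, Dict(x => x + 5))    # a new expression, still symbolic in x

p                                  # still 5x^2 + 7x - 5
```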

Symbols are not Variables either

It's extremely tempting to allow "substitution" for variables, in a very generous manner.

Mathematically this ends up being problematic very quickly, unfortunately. While it is tempting to see `x + y` as a kind of open term that is like a first-class "function body" with free variables `x` and `y`, this analogy only works for a while, and then falls apart.  While $$R[x,y]$$ (for some ring $$R$$) and $$R \rightarrow R \rightarrow R$$ do share some properties, the fact remains that the latter space is enormously larger.  For example, for $$R=\mathbb{Z}$$, $$\mathbb{Z}[x]$$ has decidable equality but $$\mathbb{Z} \rightarrow \mathbb{Z}$$ does not. So elements of $$\mathbb{Z}[x]$$ are *names* for 'a few' of the elements of $$\mathbb{Z} \rightarrow \mathbb{Z}$$.

Much, much worse is the situation in $$\mathbb{Q}(x)$$. Here we have that $$x/x=1$$, completely unambiguously so, while it's a bit odd to claim that $$\lambda\ x\ .\ x/x$$ should be *equal* to $$\lambda\ x\ .\ 1$$. Of course, this is the most degenerate of cases - it's easy to make it harder to see.  What about $$\lambda\ x\ .\ \frac{x^2-2x+1}{(x-1)(x+1)}$$ ?
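Working that last one out: the numerator factors as $$(x-1)^2$$, so a naive cancellation gives $$\frac{x-1}{x+1}$$, which is perfectly well-defined at $$x=1$$ -- while the original expression is $$0/0$$ there. The "simplified" function and the original function disagree at a point.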

Moving up to $$R=\mathbb{R}$$ gets even more fun.  Two of my all-time favourites are $$\frac{1}{\frac{1}{x-x}}$$ and $$\frac{(|x|-x)(|x|+x)}{(\sqrt{x^2}-x)(\sqrt{x^2}+x)}$$.  Is the first one $$0$$?  Is the second one $$1$$? There are ways to justify both answers... There are very easy ways for simplifiers to arrive there too.
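To spell these out: applying $$\frac{1}{1/y} \rightarrow y$$ turns the first one into $$x-x = 0$$, while simplifying the innermost part first gives $$1/(1/0)$$, i.e. undefined. For the second, rewriting $$|x|$$ as $$\sqrt{x^2}$$ makes numerator and denominator syntactically identical, so a $$T/T \rightarrow 1$$ rule happily answers $$1$$; but expanding each product gives $$|x|^2 - x^2 = 0$$ over $$\mathbb{R}$$, so an equally plausible route answers $$0/0$$.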

**In my opinion** the way out is via *denotational semantics*. You need to figure out what your terms mean. And when you move terms from one denotational domain to another -- which is a completely legitimate thing to do, and want to do -- this needs to be understood as an action (or effect, in current PL-speak). You do want various commutative diagrams to hold between your symbolic manipulation and your denotational semantics. This is indeed the whole point of symbolic computation! It's usually implicit, but there it is: the real desire is to shuffle some symbols around, with the aim that those shuffles are actually meaningful.

Again, **in my opinion**, a really good way to write this down formally is via *biform theories* (see papers 1, 3 and 4 on the [MathScheme publications](http://www.cas.mcmaster.ca/research/mathscheme/publications.html) page). This lets you clearly say (and prove) what your symbol shuffling is supposed to correspond to. The fact that "rational function normalization" does not in fact do what you expect is explained in [Towards Specifying Symbolic Computation](https://arxiv.org/abs/1904.02729), and alternatives are given that respect the more usual notion of equality of functions.

x-x is not necessarily equal to 0

One of the most basic equalities that term rewrite systems for simplification implement is $$x-x=0$$. This seems completely uncontroversial, right?

In this post, I will use $$x$$ for a symbol, and $$T$$ for a term in some algebra.

So $$x-x = 0\cdot x$$ is always true, same as with $$T-T=0\cdot T$$. This is because, in any reasonable term algebra in which $$-$$ makes sense at all, both sides will either be defined (and equal), or both undefined.  This holds even in weird cases like $$\infty$$ -- although it is possible to create strange algebras where it fails, in those cases you shouldn't really be using $$-$$ at all.

But there are so many cases where things go wrong. First, what if $$x$$ stands for an $$n\times n$$ matrix?  $$x-x$$ makes perfect sense, but the result is **not** $$0$$ (the scalar) but $$\mathbf{0}_{n\times n}$$, the zero matrix. $$x=\infty$$ is another obvious issue. [This is also linked to Symbols are not Variables either].
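Plain Julia already gets the typed version of this right: subtracting a value from itself gives the zero *of that type*, not a universal scalar $$0$$. A tiny illustration:

```julia
A = [1 2; 3 4]

A - A       # 2×2 matrix of zeros, *not* the scalar 0
zero(A)     # same thing: the additive identity for that type
zero(1.5)   # 0.0, the additive identity for Float64
```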

Worse still is $$T-T=0$$.  First, consider $$T = 1/0$$. Obviously you don't want $$1/0 - 1/0 = 0$$ (unless you define $$1/0=0$$ like HOL-Light does). But, of course, it gets worse really quickly: there are many different situations in which deciding whether $$T$$ is equal to $$0$$ or not is anywhere from extremely hard to impossible -- so that you can't know whether $$1/T - 1/T$$ should simplify.

Types can help a lot. In some cases, you have meta-theorems that say that, at some types, there are no problems, so it's perfectly safe. With a caveat, of course: the types of interest here are not the value or denotation types, but rather the *expression types*, combined with value types. Polynomials whose coefficients are explicit constants from $$\mathbb{Q}$$ are fine (yay, decidable equality!). As soon as you get to operations that are partial, all hell breaks loose.
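To make the decidability point concrete, here is a minimal sketch -- not how Symbolics.jl represents anything, and `normalform`/`poly_eq` are made-up names -- of deciding equality in $$\mathbb{Q}[x]$$ with dense coefficient vectors over `Rational{Int}`:

```julia
# Q[x] as dense coefficient vectors (constant term first), coefficients in Rational{Int}.
# Normal form: drop trailing zero coefficients; equality is then plain comparison.
function normalform(p::Vector{Rational{Int}})
    q = copy(p)
    while !isempty(q) && iszero(last(q))
        pop!(q)
    end
    return q
end

poly_eq(p, q) = normalform(p) == normalform(q)

# 5x^2 + 7x - 5, written two different ways:
poly_eq([-5//1, 7//1, 5//1], [-5//1, 7//1, 5//1, 0//1])   # true
```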

I need to look more deeply at the types involved in Julia and in Symbolics.jl. **In my opinion** that is where one can make a big leap beyond what is done in Maple, Mathematica, SageMath, Macsyma, etc.

But if you always simplify $$T-T$$ to $$0$$, you are guaranteed to have bugs, because you will eventually make important singularities disappear or, worse, take something that was entirely undefined and turn it into a computation that is defined everywhere (or at the wrong type). Both cause a lot of debugging pain.
