Friday, July 23, 2021

Notes on "A Canonical Scheme for Model Composition"

 Unlike the previous paper, we're doing better here - no deeply embedded OO conceit right from the start. And 'conforms to' is the first relation that shows up.

Section 2 is a survey of existing tools.

Section 3 gets into the meat of things, by offering fairly concrete definitions. The starting point is, as usual, a graph (directed multigraph). "model-based" anything seems to always want to end up being visual, so it's hard to escape graphs, even though it's not always clear that that's the right answer. Again, there is a notion of conformance (with too much emphasis on nodes... and edges get no types of their own, since both edges and nodes are typed by nodes). Interestingly, there is a need for self-reference if one wants finitely many levels!
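
To pin down my reading (this is my sketch, not the paper's actual definitions), the shape seems to be roughly the following, with the self-typed top level being what makes the tower finite:

```python
# Rough sketch of my reading (not the paper's definitions): a directed
# multigraph, with conformance mapping every node *and* edge name of a model
# to a *node* of its metamodel; the top level conforms to itself.
class Graph:
    def __init__(self, nodes, edges):
        self.nodes = set(nodes)        # node names
        self.edges = dict(edges)       # edge name -> (source node, target node)

def conforms_to(model, typing, metamodel):
    names = model.nodes | set(model.edges)
    return all(typing.get(n) in metamodel.nodes for n in names)

meta = Graph({"Node", "Edge"}, {"e": ("Node", "Node")})
m    = Graph({"a", "b"},       {"f": ("a", "b")})

assert conforms_to(m,    {"a": "Node", "b": "Node", "f": "Edge"}, meta)
assert conforms_to(meta, {"Node": "Node", "Edge": "Node", "e": "Edge"}, meta)
```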

The advantage, in a sense, is that everything is a graph, so operations are simple to define. The disadvantage is that it's not clear that graph-level operations are actually meaningful when the graph is interpreted as a model. It looks to be too reductionist a formalism. This untypedness and fuzziness shows up already in definitions 6 and 7; the match operation "searches" (!) for equivalences!

It's disappointing that definition 9, "model transformation", mentions "rules". And "set of models"? If a set of models is an important entity, why not give it a name? And then, a model transformation should be some kind of homomorphism of that entity, not just a function.

For example, it feels like Definition 6 (Correspondence model) is trying to say "monomorphic graph homomorphism" but with a lot of verbiage. Compose (Def'n 10) is unclear, but it feels like pushout? Or is that Merge?

Section 4 first talks about requirements (good, but fuzzy), and then looks at what the existing tools do.

It's cited a lot. Too much to go through.

Notes on "The Meta-Object Facility Typed"

 Obligatory meta-note: why am I even reading this #$%#$% on anything OO, related to the OMG and MOF? Because I'm involved in a project doing model-driven development (MDD) and model-driven engineering (MDE). Both of which are cool. It's just my pain that some of the relevant work comes from places that I'd much prefer to ignore entirely. The paper in question has gotten quite a few cites.

So the paper starts off well: using type theory to formalize things. Of course, it calls it CTT (constructive type theory) which, while not wrong, does not bode well.

The problem with the MOF is that it embeds the ideas of objects and classes too deeply into its very 'foundations'. If I were doing modelling, I wouldn't put typeclasses (or traits or ...) in my foundations either: they are all just a little too narrowly associated with a particular point of view. The 4 levels are also a bit of an illusion: yes, absolutely, there needs to be a tower of models. But the point isn't about the levels themselves, but about the relations between the levels. In particular, the conforms to relation, which should be central. Sigh. So many of the details of MOF (and explanations thereof) seem like over-complicating things. Oh, it's not wrong, but it feels like trying to understand things by doing massive inlining (i.e. abstraction breaking).

Section 3 is an introduction to type theory. Not much to see here.

Section 4 starts with the actual encoding. Already the very first definition is weird because it uses a set-encoding (i.e. which isn't type theory). And it uses a predicate for well-formedness when the natural way of doing this would be to use a context / sequent / telescope instead. This is dependently typed after all, why not go all the way? It's way too verbatim a translation. Of course, some of this is due to the fact that metamodels themselves allow ridiculous amounts of self-reference.
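
The contrast I have in mind, as a minimal sketch (mine, in Lean, not the paper's CTT encoding): the set-style version carries a separate well-formedness predicate around, while the dependently-typed version bakes the invariant into the types, so ill-formed data is not even expressible.

```lean
-- Set-style encoding: raw data plus a separate well-formedness predicate.
structure RawGraph where
  node : List String
  edge : List (String × String)

def wellFormed (g : RawGraph) : Prop :=
  ∀ e, e ∈ g.edge → e.1 ∈ g.node ∧ e.2 ∈ g.node

-- Dependently-typed style: sources and targets are total by construction,
-- so there is nothing left to check.
structure Graph where
  Node : Type
  Edge : Type
  src  : Edge → Node
  tgt  : Edge → Node
```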

As the paper is old, it suffers from non-reproducibility: there are no available artifacts.

Going through who cites this paper, one finds "An Algebraic Semantics for MOF" by A. Boronat and J. Meseguer, which is indeed promising.

Sunday, March 28, 2021

Mistakes in Symbolic Computation I

 I've started a series of posts on the Julia Zulip, trying to put together my thoughts on errors done in previous CASes. I will also post them here, as a good way to keep track of them.  I'm going to take the first 3 posts and make them into this first post here.

Problem: Zulip is markdown+latex and Blogger does not like this. So this is going to look awful, until I switch engines. Which I think I'm forced to now.

Symbols are not Imperative Variables

An enormous mistake that Maple made was to confuse symbols and variables, made doubly worse because Maple is fundamentally imperative at its core (even though it also has a smattering of functional features).

Being able to do `p := 5*x^2+7*x-5` and then having the value of `p` change (upon evaluation) **after** `x := 12` is executed is sheer idiocy. Assignment is a rotten way to do substitution. Much worse, as Maple is untyped, one can do all sorts of stuff like `x := infinity`, `x := <some matrix>`, `x := a + 5` and `x := [3,4,5]`, all of which have very different effects on `p`. It's even possible to do weird things by assigning to `x` something which depends on `x` itself (or on `p`). Sometimes these are caught, sometimes these lead to infinite loops in evaluation. It's all very bad.
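
For contrast, a minimal sketch of the distinction (using SymPy here purely as a convenient stand-in, not as a claim about Julia): symbols and program variables are separate things, and substitution is an explicit operation rather than a side effect of assignment.

```python
# The symbol x and the program variable p holding an expression are distinct;
# substituting a value for x is explicit and does not mutate p.
from sympy import symbols

x = symbols('x')            # a symbol, not an imperative variable
p = 5*x**2 + 7*x - 5        # p holds an expression mentioning the symbol x

print(p.subs(x, 12))        # 799 -- explicit substitution
print(p)                    # 5*x**2 + 7*x - 5 -- p itself is unchanged
```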

I don't believe this is currently a problem in Julia - just don't get suckered into going this route.

Symbols are not Variables either

It's extremely tempting to allow "substitution" for variables, in a very generous manner.

Mathematically this ends up being problematic very quickly, unfortunately. While it is tempting to see `x + y` as a kind of open term that is like a first-class "function body" with free variables `x` and `y`, this analogy only works for a while, and then falls apart.  While $$R[x,y]$$ (for some ring $$R$$) and $$R \rightarrow R \rightarrow R$$ do share some properties, the fact remains that the latter space is enormously larger.  For example, for $$R=\mathbb{Z}$$, $$\mathbb{Z}[x]$$ has decidable equality but $$\mathbb{Z} \rightarrow \mathbb{Z}$$ does not. So elements of $$\mathbb{Z}[x]$$ are *names* for 'a few' of those of $$\mathbb{Z} \rightarrow \mathbb{Z}$$.

Much, much worse is the situation in $$\mathbb{Q}(x)$$. Here we have that $$x/x=1$$, completely unambiguously so, while it's a bit odd to claim that $$\lambda\ x\ .\ x/x$$ should be *equal* to $$\lambda\ x\ .\ 1$$. Of course, this is the most degenerate of cases - it's easy to make it harder to see.  What about $$\lambda\ x\ .\ \frac{x^2-2x+1}{(x-1)(x+1)}$$ ?
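
For what it's worth, a CAS will happily perform exactly this move; a small check (SymPy assumed as the example system):

```python
# Rational-function normalization cancels common factors: harmless for
# elements of Q(x), but it changes the *function* the term names.
from sympy import symbols, cancel

x = symbols('x')
print(x / x)                                          # 1 -- done automatically, at construction time
print(cancel((x**2 - 2*x + 1) / ((x - 1)*(x + 1))))   # (x - 1)/(x + 1) -- defined at x = 1, while the original is 0/0 there
```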

Moving up to $$R=\mathbb{R}$$ gets even more fun.  Two of my all-time favourites are $$\frac{1}{\frac{1}{x-x}}$$ and $$\frac{(|x|-x)(|x|+x)}{(\sqrt{x^2}-x)(\sqrt{x^2}+x)}$$.  Is the first one $$0$$?  Is the second one $$1$$? There are ways to justify both answers... There are very easy ways for simplifiers to arrive there too.

**In my opinion** the way out is via *denotational semantics*. You need to figure out what your terms mean. And when you move terms from one denotational domain to another -- which is a completely legitimate thing to do, and want to do -- this needs to be understood as an action (or effect, in current PL-speak). You do want various commutative diagrams to hold between your symbolic manipulation and your denotational semantics. This is indeed the whole point of symbolic computation! It's usually implicit, but there it is: the real desire is to shuffle some symbols around, with the aim that those shuffles are actually meaningful.
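
Spelled out in my own notation (nothing official here): writing $$[\![ e ]\!]_D$$ for the meaning of an expression $$e$$ in a denotational domain $$D$$, a symbol-shuffling step $$s$$ is legitimate exactly when $$[\![ s(e) ]\!]_D = [\![ e ]\!]_D$$ wherever both sides are defined; and moving to another domain $$D'$$ changes the bracket, not the term, which is why it deserves to be tracked as an action.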

Again, **in my opinion**, a really good way to write this down formally is via *biform theories* (see papers 1, 3 and 4 on the [MathScheme publications](http://www.cas.mcmaster.ca/research/mathscheme/publications.html) page). This lets you clearly say (and prove) what your symbol shuffling is supposed to correspond to. The fact that "rational function normalization" does not in fact do what you expect is explained in [Towards Specifying Symbolic Computation](https://arxiv.org/abs/1904.02729), and alternatives are given that respect the more usual notion of equality of functions.

x-x is not necessarily equal to 0

One of the most basic equalities that term rewrite systems for simplification implement is $$x-x=0$$. This seems completely uncontroversial, right?

In this post, I will use $$x$$ for a symbol, and $$T$$ for a term in some algebra.

So $$x-x = 0\cdot x$$ is always true, same as $$T-T=0\cdot T$$. This is because, in any reasonable term algebra in which $$-$$ makes sense at all, both sides will either be defined (and equal), or both undefined. That holds even in weird cases like $$\infty$$, although it is possible to create strange algebras where it doesn't -- but in those cases, you shouldn't really be using $$-$$ at all.

But there are so many cases where things go wrong. First, what if $$x$$ stands for an $$n\times n$$ matrix?  $$x-x$$ makes perfect sense, but the result is **not** $$0$$ (the scalar) but $$\mathbf{0}_{n\times n}$$, the zero matrix. $$x=\infty$$ is another obvious issue. [This is also linked to Symbols are not Variables either].

Worse still is $$T-T=0$$.  First, consider $$T = 1/0$$. Obviously you don't want $$1/0 - 1/0 = 0$$ (unless you define $$1/0=0$$ like HOL Light does). But, of course, it gets worse really quickly: there are many different situations in which deciding whether $$T$$ is equal to $$0$$ or not is anywhere from extremely hard to impossible -- so that you can't know whether $$1/T - 1/T$$ should simplify.
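
A quick concrete check of $$T-T$$ for a few choices of $$T$$ (SymPy assumed; the same points apply to any CAS):

```python
# x - x collapses to the scalar 0 automatically; for the "bad" T's the right
# answer is not 0 at all.
from sympy import symbols, oo, zoo

x = symbols('x')
print(x - x)       # 0   -- collapsed at construction time
print(oo - oo)     # nan -- infinity minus infinity is *not* 0
print(zoo - zoo)   # nan -- and neither is (1/0) - (1/0); zoo is what 1/0 evaluates to here
```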

Types can help a lot. In some cases, you have meta-theorems that say that, at some types, there are no problems, so it's perfectly safe. With a caveat, of course: the types of interest here are not the value or denotation types, but rather the *expression types*, combined with value types. Polynomials whose coefficients are explicit constants of $$\mathbb{Q}$$ are fine (yay, decidable equality!). As soon as you get to operations that are partial, all hell breaks loose.
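
The kind of safe/unsafe split I mean, sketched with SymPy (its `Poly` type standing in for an "expression type" with decidable equality):

```python
# Equality over Q[x] is decidable; equality of expressions involving partial
# or branch-dependent operations is not decided without more information.
from sympy import symbols, Poly, QQ, sqrt, simplify

x = symbols('x')

p = Poly(x**2 - 1, x, domain=QQ)
q = Poly((x - 1)*(x + 1), x, domain=QQ)
print(p == q)               # True -- decidable equality in Q[x]

r = sqrt(x**2) - x
print(simplify(r))          # sqrt(x**2) - x -- not reduced to 0, which would only be valid for x >= 0
```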

I need to look more deeply at the types involved in Julia and in Symbolics.jl. **In my opinion** that is where one can make a big leap from what is done in Maple, Mathematica, SageMath, Macsyma, etc.

But if you always simplify $$T-T$$ to $$0$$, you are guaranteed to have bugs, because you will eventually make important singularities disappear or, worse, turn something that was entirely undefined into a computation that is defined everywhere (or at the wrong type). Both of which cause a lot of debugging pain.

Saturday, March 13, 2021

Notes on "An Introduction to Multi-Paradigm Modelling and Simulation"

 [Experiment: I'm taking active notes when reading some papers of real interest to me. I may as well make these notes public too. The notes are taken in an as-I-read fashion, and should in theory be very long annotations to the paper itself, but I couldn't publish that.]

As a first set of notes, it makes sense to first give context: I'm trying to understand "modelling". Abstractly I already do. But I'm trying to get a slightly more formal idea of things like "what is a model". Of course, that's likely to remain an informal notion. Nevertheless, I'm trying.

Also, it's important to remember that I bring my own context, knowledge and biases to my understanding. So the kinds of math that I know, the things in CS that I prefer (PLs, type theory, symbolic computation, etc) will colour my view of everything.

Thursday, March 11, 2021

Computers helping Mathematicians

So I wrote this for the FOM (Foundations of Mathematics) mailing list. There has been some discussion on the topic there, and the discussion was quite silly. It veered off way too quickly into when computers will "replace" mathematicians by doing what they do now, i.e. all the creative, insightful parts. Enjoy.


I honestly think that a lot of you are looking in the wrong places to find “computers helping mathematicians”!

Example: it was not that long ago that ‘human computers’ were used to create large tables of logarithms and other functions. The first editions of Abramowitz & Stegun (A&S) were all done by humans, after all. Slowly, we learned to automate a lot of the more tedious parts of A&S.  Today, there is active work on automating almost all of the content of A&S. This is because our understanding of ‘special functions’ has grown so much that the vast majority of that book, including the pages upon pages of symbolic identities, is now automatable. A lot of mathematics, including oddball things like Ore Rings, went into automating what a priori looked like analysis.

Another example: LaTeX. It is a tool used by the vast majority of mathematicians, and has greatly helped make the typesetting of papers come out of the awfulness that was the 1960s and 70s. It is a mathematically boring tool, but its effect is very large indeed.

More recently, web sites like MathOverflow, and math-aware web technology, and even just standard support for Unicode in web browsers and chat apps like Zulip, have made the act of communicating among a much larger community massively easier.  Various such technologies, and others, also helped the Polymath projects into existence.  Git and GitHub are also playing a highly non-trivial role in the speed at which systems like Lean and its growing mathlib can proceed.  Without the underlying tools, all of that would be too painful to proceed at the current speed.

The place to look for “computers helping mathematicians” is: the boring stuff. All the stuff that just needs to get done, requires time, but no active brain cells to perform. The more we can automate that, the more time we can rescue for human mathematicians to THINK about all the fun stuff. What is considered “boring” can evolve over time. I certainly have my pet peeves of things still done by hand which shouldn’t be [and I am working on it].

One day, computers may actually play a more active supporting role in the ‘fun’, creative stuff of mathematics. And we should be working towards that too.

A few of us deign to work “down in the weeds”, to automate the boring stuff, as that’s the only way that the rest will eventually happen.  No, it’s not math. No, it’s not glamorous. It’s even hard work.  But, please, don’t under-estimate its value.

Friday, April 27, 2018

Symbolic Computation

So what is symbolic computation? To better understand what it is, we need to understand the two terms that make up the name: symbolic and computation.

The use of the word symbolic is ancient -- from the Latin symbolicus, Greek συμβολικός meaning "of, or belonging to a symbol"; and thence to symbol (Ancient Greek συμβάλλω), meaning "a sign by which one infers something". And, most interestingly, its meaning has not changed! To quote from Wiktionary:

Any object, typically material, which is meant to represent another (usually abstract) even if there is no meaningful relationship.
     The dollar symbol has no relationship to the concept of currency or any related idea.
In the same way, the symbol π (pi) denotes a particular real number, but has no concrete relation to that number, other than through convention. It is important to note that "2" and "53" are likewise symbols, as are "+", "*" and so on. They do not have intrinsic meaning, even though it is harder for people trained in mathematics (and computer science) to remember so.

Computation is actually significantly harder to define. For our purposes, it suffices to say that it is something implementable (as a procedure) on a computer. (Procedure is used here even for what functional programmers would call 'pure functions', to separate them from 'functions' which, for the purposes of this blog post, are reserved for 'mathematical functions')

So symbolic computation is computation in which symbols feature prominently.

On one level, as most everything done on a computer involves at least one level of encoding, one could claim that all computation is symbolic. While strictly true, this isn't necessarily a very useful way to look at things. What we want to focus on here are those computations that involve symbols in a crucial way.

So an example is in order. Assuming that x is a symbol denoting some real number, we know that sin(x)^2+cos(x)^2=1. This equation is true for all values of x. Note that there are way too many real numbers -- the vast majority of them cannot be written down. And yet this equation holds. It is a prime example of a symbolic equation. So what do we really mean by it? We mean that "given any actual real number r, if we substitute r for x in sin(x)^2+cos(x)^2=1, then evaluate the resulting expression, the result will be a tautology". Evaluation is often pictured as a form of computation; however, as used above, it is different, since that computation cannot actually be carried out on a computer: arbitrary real numbers are not representable. But it is perfectly fine as an evaluation that happens in the field of real numbers. So we have evaluation, which happens in the space of denotations of our expression (here, the field of reals), and computation, which happens over (symbolic!) expressions.
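
A small illustration of the two sides (SymPy assumed, purely for concreteness): the symbolic computation establishes the identity once and for all, while evaluation at a point can only ever confirm it for representable inputs.

```python
# simplify works on the symbolic side; subs + evalf approximates the
# evaluation side at one (representable) point.
from sympy import symbols, sin, cos, simplify, Rational

x = symbols('x', real=True)
expr = sin(x)**2 + cos(x)**2

print(simplify(expr))                        # 1 -- the symbolic computation
print(expr.subs(x, Rational(1, 3)).evalf())  # 1.00000000000000 -- evaluation at a point
```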

In a sense, symbolic computation is symbol shuffling. If that's really all it was, it wouldn't be very useful. Most symbolic computation, however, obeys a very strict set of rules. Those rules are usually rather far from arbitrary. But if that is so, where do those rules come from? Denotational semantics! In other words, our symbols have meaning, and the objects in the space of denotations obey some rules. And, somewhat miraculously, we can write down many of these rules. We give them names and render them in phrases such as "addition is commutative" and "multiplication is associative".

So symbolic computation is really symbol shuffling with a purpose, and that purpose is to respect the rules that hold in the space of meanings we intend. When you attempt to compute a closed form for the derivative of an expression e, what you are really asking is for the resulting expression to denote a function which is related to (the denotation of) the input expression e by differentiation. The derivative of e is usually computed by various elaborate symbol shuffling operations, while derivatives are defined via limits. What is desired is that the symbolic manipulation preserves the meaning of 'derivative'.
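
The commuting requirement, made concrete (SymPy assumed, toy expression mine): the rule-based derivative and the limit definition must agree.

```python
# diff does the symbol shuffling; the limit encodes the definition via limits.
from sympy import symbols, diff, limit, simplify

x, h = symbols('x h')
e = x**3 - 2*x

by_rules = diff(e, x)                               # 3*x**2 - 2
by_limit = limit((e.subs(x, x + h) - e) / h, h, 0)  # 3*x**2 - 2

print(simplify(by_rules - by_limit))                # 0 -- the diagram commutes here
```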

Much of the difficulty with properly understanding symbolic computation stems from several issues:

  1. All human mathematics is written down, and so we cannot easily see the difference between syntax (symbols) and semantics (their denotation).
  2. We never write down 'denotation brackets' in our equations.
  3. We treat functions and the name of the function as being largely equivalent.
  4. We silently accept unevaluated functions as symbols, but are happy to evaluate them when we can.
That last point needs elaborating on: no one will blink if sin(0) is reduced, without mention, to 0, but sin(1/3) can stay 'as is' without difficulty. But this is incoherent: either sin is a function and it should reduce in all cases, or it is a symbol, in which case doing 'reduction' is an active process that should be clearly indicated.
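
The incoherence in three lines (SymPy assumed; mainstream CASes behave similarly):

```python
# sin at 0 reduces on the spot; sin at 1/3 is kept as an unevaluated
# application until a numerical evaluation is explicitly requested.
from sympy import sin, Rational

print(sin(0))                       # 0
print(sin(Rational(1, 3)))          # sin(1/3) -- stays 'as is'
print(sin(Rational(1, 3)).evalf())  # 0.32719... -- only reduced on request
```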

In any case, the point is that symbolic computation is computing with symbols, which come from a specific language, and those computations are supposed to model a particular meaning. In other words, a particular computation on symbols is supposed to reflect the evaluation of a function on the space of denotations of that language.

One last remark on why this is rather non-trivial:

  1. Some spaces of denotations (like all functions from the reals to the reals) are unbelievably larger than the corresponding space of terms.
  2. Some spaces of denotations (like all boolean-valued functions on 2 boolean variables, i.e. of size 16) are quite a bit smaller than the corresponding space of terms (countably infinite if given 'and', 'or', 'not' and 2 variable names)
  3. Some questions, like "factor this polynomial", have answers which involve concepts outside the natural domain of denotation: a polynomial has a bag of factors, which requires us to enlarge the space of denotations quite a bit to answer them. This is because, in a sense, factoring is a naturally syntactic operation, whose meaning is destroyed by semantic operations. (x-1)(x+1) = (x^2-1) is a tautology if our semantics for polynomials is a pair (d, v) of a degree d and a vector v (of length d+1) of coefficients, with the caveat that the leading coefficient be non-zero.
Thus the correspondence between syntax and semantics is necessarily quite complex. This seems to be largely under-appreciated.

Monday, November 29, 2010

On Polymorphism

Suppose I take a relational view of types and polymorphism (which is quite natural given Reynolds' work on relational parametricity). Then, rather than thinking of a polymorphic type, that is a type (usually) represented by a free variable ranging over types, as free, I believe that it is better to think of it as unconstrained. Of course, unconstrained is the double-negation of free!

But since this is CS, it is useful to think constructively, so that these do not necessarily correspond. This is doubly natural since we know that the usual formalisms for parametric polymorphism do not have models in Set.

Which allows us to switch to viewing types in a relational setting. If we think of the usual lattice of types as our 'points', then the usual way of looking at (principal) types is that we want the least-upper-bound (in the 'information order' induced by the Scott topology) of all the typings, with this LUB being a valid type as well (where we use continuity to achieve that). So the space of points is not uniform like Euclidean space, but rather layered.

But what if we stop thinking of Euclidean space in terms of points and switch to looking at it through the eyes of the Zariski topology? Well now Euclidean space starts to look rather layered too! And how does one explain the layers? Well, via ideals. More precisely, via systems of polynomials. But, in this setting, what is a principal ideal? It is one which is generated by a single element -- just like a principal type!

If we are to take any lesson at all from algebraic geometry, it is that to really study varieties, one needs to switch away from looking at sets-of-points, and study ideals very carefully. By analogy, the lesson should be to stop looking at types themselves, and to focus instead on the constraints which engender them. Then types are just the solutions to these constraint equations. In fact, one can even learn more from algebraic geometry and see that a lot of systems of constraints can be interpreted in multiple contexts (yes, I refrained from saying that they are polymorphic); and in each context, these systems can possess different solution 'sets'.

Rather than using Ring-theoretic methods, it might be better to focus on Lattice-theoretic methods, since Relation Algebras (seen as operators acting on the lattice of types) have a more natural ideal structure arising from its (residuated) lattice structure rather than its interpretation as an idempotent ring. But the ideas are still fundamentally the same.

In other words, a polymorphic type should be seen as the 'solution' to the empty relation. Principal types occur when a relation is fully triangularizable. This correspondence between the solutions-of-relations and types really needs a full blog post (if not a paper) to explain and spell out properly. But I wanted to jot down the line of reasoning, because it escaped me again.