Monday, November 29, 2010

On Polymorphism

Suppose I take a relational view of types and polymorphism (which is quite natural given Reynolds's work on relational parametricity). Then, rather than thinking of a polymorphic type, that is, a type (usually) represented by a free variable ranging over types, as free, I believe that it is better to think of it as unconstrained. Of course, unconstrained is the double negation of free!
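
To see the difference in familiar Haskell terms (a small sketch of my own; the function names are just illustrative):

    -- 'a' is unconstrained: nothing is demanded of it, so (by parametricity)
    -- every relation between instantiations is preserved
    identity :: a -> a
    identity x = x

    -- here 'a' is genuinely constrained: it ranges only over types with Eq
    member :: Eq a => a -> [a] -> Bool
    member x = any (== x)

Calling the variable 'unconstrained' rather than 'free' emphasizes that it is the empty constraint set doing the work, not some intrinsic freedom of the variable.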

But since this is CS, it is useful to think constructively, and constructively a double negation need not collapse, so 'free' and 'unconstrained' do not necessarily coincide. This is doubly natural since we know that the usual formalisms for parametric polymorphism do not have models in Set.

This allows us to switch to viewing types in a relational setting. If we think of the usual lattice of types as our 'points', then the usual way of looking at (principal) types is that we want the least upper bound (in the 'information order' induced by the Scott topology) of all the typings, with this LUB itself being a valid type (continuity is what guarantees that). So the space of points is not uniform like Euclidean space, but rather layered.

But what if we stop thinking of Euclidean space in terms of points and switch to looking at it through the eyes of the Zariski topology? Well, now Euclidean space starts to look rather layered too! And how does one explain the layers? Via ideals; more precisely, via systems of polynomials. But, in this setting, what is a principal ideal? It is one which is generated by a single element -- just like a principal type!
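
To spell the analogy out (the notation insts(σ) is mine, not standard): a principal ideal in a commutative ring R collects all multiples of a single generator f, just as a principal type scheme σ collects all of its substitution instances:

    \[
      (f) \;=\; \{\, g \cdot f \mid g \in R \,\}
      \qquad\longleftrightarrow\qquad
      \mathrm{insts}(\sigma) \;=\; \{\, \sigma\theta \mid \theta \text{ a substitution} \,\}
    \]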

If we are to take any lesson at all from algebraic geometry, it is that to really study varieties, one needs to switch away from looking at sets-of-points, and study ideals very carefully. By analogy, the lesson should be to stop looking at types themselves, and to focus instead on the constraints which engender them. Then types are just the solutions to these constraint equations. In fact, one can even learn more from algebraic geometry and see that a lot of systems of constraints can be interpreted in multiple contexts (yes, I refrained from saying that they are polymorphic); and in each context, these systems can possess different solution 'sets'.
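
Haskell's qualified types already have this flavour (a small illustration of my own): the constraint system comes first, and types are its solutions, which differ from context to context.

    -- the constraint {Num a} is the 'equation'; the types are its solutions
    total :: Num a => [a] -> a
    total = foldr (+) 0

    -- one system of constraints, solved in two different contexts
    solvedAtInt    = total [1, 2, 3]  :: Int
    solvedAtDouble = total [1.5, 2.5] :: Double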

Rather than using Ring-theoretic methods, it might be better to focus on Lattice-theoretic methods, since Relation Algebras (seen as operators acting on the lattice of types) have a more natural ideal structure arising from their (residuated) lattice structure than from their interpretation as idempotent rings. But the ideas are still fundamentally the same.

In other words, a polymorphic type should be seen as the 'solution' to the empty relation. Principal types occur when a relation is fully triangularizable. This correspondence between solutions-of-relations and types really needs a full blog post (if not a paper) to explain and spell out properly. But I wanted to jot down the line of reasoning, because it escaped me again.

Monday, September 13, 2010

Source code architecture, part II

The first part told the story of the requirements surrounding a packaging mechanism for code chunks. Alex left some helpful comments (on Buzz only, not here). Here, we expand upon what was said in the first part. This should establish a sufficient understanding upon which to build a solution.

Assets

Why talk about assets and code chunks instead of functions (procedures, class members, what have you)? The influence here is completely from literate programming, where one can name fairly arbitrary chunks of code and re-use them. But surely in a language that has been influenced by Haskell (amongst many others), functions would be sufficient? Maybe, if the polymorphism supported by the language is sufficiently rich. And that's the source of my hedging: at this point, I don't want to mix these two issues (organization and polymorphism). But it is an issue to keep in mind: in a strongly typed language, insufficient polymorphism and/or persistent abstractions lead to macros. [A 'persistent' abstraction is an abstraction mechanism over which the programmer has no control, i.e. one the programmer cannot reliably tell the compiler to inline. I'll probably make a post on that in the future.] Macros are not evil per se, but they are a way of "giving up" on a language's intrinsic features as "not powerful enough". I would like to avoid that.
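
To illustrate the macro point with a deliberately tiny sketch: in C one reaches for something like #define SWAP(t, x, y) precisely because swap cannot be written once at all types, whereas with parametric polymorphism it is an ordinary, typed function.

    -- one definition, valid at every pair of types; no macro needed
    swap :: (a, b) -> (b, a)
    swap (x, y) = (y, x)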

All this to say that, in theory, all assets should be 'like functions', with external dependencies, both input and output, visible and controllable. In the meantime, let's just go for "code chunks".

Naming

There was an implicit assumption that all our chunks are (somehow) named; this can be taken as an additional axiom. Names are not the only 'solution' (for theoretical purposes, as well as for ease of implementation, there are others), but they sure are convenient. The explicit design choice here is to favour human understandability above other measures. So names it is.

Another assumption is that names are expected to be locally meaningful, but not necessarily globally stable. In other words, it is implicitly assumed that we have a renaming mechanism. That is an easy assumption because, at the lower levels, we already do have a renaming mechanism, it is used a lot, and it is very handy. So we just extend it to the higher levels too.
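
At the module level, Haskell already shows what such a renaming mechanism looks like (a standard sketch; the phoneBook example is mine):

    import qualified Data.Map.Strict as M   -- locally rename the module
    import Data.Map.Strict (Map)            -- and pull in one name explicitly

    phoneBook :: Map String String
    phoneBook = M.fromList [("alice", "555-0100"), ("bob", "555-0101")]

The alias M is locally meaningful but makes no claim to global stability; another file is free to choose a different name for the same module.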

Assembly

Because we privilege human understanding (of source code), machine-ready code might require a fair bit of processing to 'build'. Clearly such processing can be an important part of 'understanding' too. Let us see it another way: whether it is the denotational semantics (of the source code) or the operational semantics (of the assembly code) which is 'complicated', both will present a barrier to understandability. Ideally, the operational semantics of the assembly code would be a straightforward implementation of some kind of hierarchical denotational semantics for the source code.

But we already assume that we 'understand' the MathScheme language. And the language has built-in features dealing with syntax, as well as black boxes. So it is a rather simple leap to regard 'assembly' as an internal rather than an external entity. In other words, as a program, it should be written in the same language.

Context

Why do we name chunks at all? To be able to conveniently refer to them. Why would we want to do that? To be able to (re)use them. The simplest situation is naming a function that we just want to call. In the situation at hand, what we really want to do is to create a context, a.k.a. a set of definitions, within which the code we are currently reading/writing 'makes sense'.
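
In Haskell terms (my own toy example), a where clause is the smallest version of this: a set of definitions within which the body 'makes sense'.

    -- 'distance' only makes sense within the context established by 'where'
    distance :: Double -> Double
    distance t = velocity t * t / 2
      where
        g          = 9.81        -- the context: the definitions the body needs
        velocity u = g * u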

Ideally, everything would be done by reference, so as to minimize actual duplication. In practice, we sometimes do want duplication. But that will be considered a separate problem - though it needs to be 'part of' the design.

Thursday, September 02, 2010

Source code architecture, part I

As MathScheme grows, various scalability issues need to be solved. Various solutions leap to mind, coupled with words such as packages, libraries, and modules. But are these actually solutions to the problems we are encountering? Which leads to this post: what are the problems that need a solution? This is a requirements issue first and foremost; we shouldn't jump into designs and solutions until that's been properly figured out. But let's walk through the discovery process rather than jump ahead too far. [Be prepared for a long tortuous story, but hopefully there will be a happy ending.]

The issue first popped up when we were doing some data-structure development (well, theories of types of data-structure, to be precise), and we wanted to separate that out from the base theories that cover algebraic structures. But data-structures very quickly leverage algebraic structure, so that we needed to build our second library 'on top of' the first. The obvious thing to do is to implement some kind of import mechanism. But something naive really doesn't scale: C's #include mechanism, for example, is a complete nightmare for scalability and maintenance.

A saner method is something like Haskell's import (or Java's package, for that matter). This provides namespace support, and so allows you to segregate your code into discrete chunks and still be able to use it seamlessly. But this isn't perfect either: with a large enough set of imports, one no longer knows where any particular function came from. This is made considerably worse if the import mechanism silently does shadowing. There are two refinements of this idea: explicit imports and qualified imports. Explicit imports require you to name all the functions you want, while qualified imports let you give a "short name" (or alias) to the whole package. But both of these have problems as well: explicit import lists can grow to be very large, and no matter how "short", these aliases clutter the resulting code in sometimes rather unaesthetic ways. Furthermore, a lot of these solutions are file-based, in that the name of the 'chunk' of code must somehow correspond to the name of the file it is in, which comes with all sorts of problems of its own (as file names are really not portable from OS to OS, with directory names being even worse).
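
Concretely, the two refinements look like this in Haskell (a routine sketch), and both costs are visible: the explicit list grows with use, and the alias recurs at every call site.

    -- explicit import: every needed name is spelled out (the list can grow)
    import Data.List (nub, sortBy)

    -- qualified import with an alias: short, but it marks every use
    import qualified Data.Map as M

    uniqueSorted :: Ord a => [a] -> [a]
    uniqueSorted = nub . sortBy compare

    wordCounts :: [String] -> M.Map String Int
    wordCounts ws = M.fromListWith (+) [(w, 1) | w <- ws]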

At this point, two further solution ideas pop into mind. [I don't really know where my education has failed me, but so much emphasis on problem-solving has led to an instinct of jumping from potential solution to potential solution. A better process would have been to focus much earlier on refining the exact problem statement to be solved, but this step is something which I must consciously remind myself to do, rather than the unconscious coming-up-with-solutions.] In any case, these are literate programming and URIs. But since these are again in solution-space, let's not explore them now. It is at this point that it became rather clear that the thinking process was solution-driven, and what was needed was some problem-space thinking.

So what is the real problem? We have an asset (code), which we want to use in many different ways. The code base will grow to be large enough that it needs to be organized. But some methods of organization are too obtrusive, and are a larger inconvenience than the problem they solve (imagine if all your functions had ~100 character names and you had to type them out in full everywhere). The instinct is then to come up with a reasonable compromise between order and convenience.

But that's not necessarily a good solution either! The point is, we want to use our code in many different ways. Crucial point: different tasks require different levels of order and convenience. Which means that we really need to understand what the different tasks are. And what the background axioms are. 

(My) Axioms:
  • There will be many people wanting to develop the code base simultaneously.
  • There is no one organization which will fit all uses.
  • When we name and use an entity, it should be (relatively) easy to pinpoint where its definition is. [Yes, ad hoc overloading and AOP, I'm looking at you.]
  • Complex resolution rules are not 'easy'.
  • Tying the organization mechanism to concepts from the underlying operating system is a bad idea.
  • Local code should be clutter-free.
  • Global and glue code should be precise.
  • Source code is primarily intended for humans to read and understand.
  • Source code may require significant processing before it is machine-ready.
  • Names of entities should not be artificially restricted (e.g. by requiring global uniqueness).
  • There are different scales for a code base.
Some explanations are necessary: local code is code which, while defined within a context, can largely be understood on its own; low-level libraries are obvious examples. Global (and glue) code is code which uses a large context for its work (it pulls in concepts from a lot of different places). And what is a context? It is the set of all definitions necessary for the current code to 'make sense'; it is not necessary to recurse through these definitions and expand them as well. And scale is really a synonym for the size of a context, which is purely the number of definitions (rather than their complexity).

There are also some clear tensions: if we use the same name in two different parts of the source code and then try to use these two pieces simultaneously, we have a problem. Our axioms tell us that the system should not artificially force us to restrict names (so as to avoid this problem). Of course, if these two pieces are meant to be used together, then it is good design to avoid needless clashes - but that is a human design decision. If this naming overlap makes sense, then there needs to be a mechanism to allow for precision; any kind of implicit resolution mechanism which relies on subtle ordering properties of either the source code or (worse) the assembly process hampers understanding too much to be viable. Some restrictions on names of entities are natural: when coding 'in the small', clearly one wants names to be unique -- but there should be no restriction that forces global uniqueness. There are some names which are just very natural in certain contexts, and it ought to be possible to use such a name, at least locally. For example, shadowing seems like a bad solution - it goes against ease of resolution.
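
As a sketch of such a precision mechanism (the module names here are hypothetical): when two pieces of code legitimately use the same natural name, qualification resolves the clash without forcing either side to rename, without shadowing, and without any reliance on import order.

    import qualified Geometry.Circle as Circle   -- hypothetical modules that
    import qualified Geometry.Square as Square   -- both export 'area'

    -- the overlap is resolved explicitly at the point of use
    totalArea :: Double
    totalArea = Circle.area 1.0 + Square.area 2.0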

A number of tasks were embedded (or at least inferable) in those axioms:
  • Write clutter-free local code.
  • Establish a context.
  • Write global code.
  • Trace where definitions come from.
  • Assemble 'code' from 'source code' aimed at humans.
  • Organize 'source code' so that it can be understood by humans.
  • Organize 'source code' so that it can be assembled by programs.
What I see as important is not so much how things are organized, but rather how one establishes a context. In other words, how do you name chunks of code so that later one can refer to these chunks as the context in which to work? And it seems to all come down to a matter of scale: small contexts are easily understood, large contexts are confusing.

Of course, one can use organization to deal with this problem: assemble chunks of source code into modules and packages. If each of these exposes only a small number of 'functions', then a context made up of a (small) number of modules/packages should be reasonably easy to understand. Wishful thinking, you say? Agreed.

In the next part, analysis of the above will be continued.  A solution may even emerge.  More likely, the tasks will need to be refined, and new ones will arise, requiring further analysis.

Wednesday, June 16, 2010

Completely addicted to Math and Stack Overflow

I used to spend a lot of time on MaplePrimes. Sure, I was King of the Heap (highest points), but because I wasn't learning anything, nor getting any other kind of positive reinforcement out of it, I got bored after a while. I also wove in and out of Lambda the Ultimate; the quality of LtU greatly depends on who 'hangs around', and the crowd had changed, so I was not learning so much anymore (although it has picked up again very recently).

For a while, my web presence was fairly low. But now I am completely addicted to both Math Overflow and Stack Overflow. I am learning a lot from MO, it's great. Now and then I can contribute too. I was rather ambivalent about Stack Overflow for a while -- the great majority of questions (and answers!) are of absolutely no interest to me at all. But, slowly, I learned to navigate the giant morass of questions to find the ones which I might find interesting. An 'interesting question' is not one to which I have an answer, but really one that interests me. Sometimes these questions interested me in the past, so that now I can provide some answer. Sometimes the interest is current, and so I learn something new from the answer(s). In either case, fun happens. And there is no chance at all that I'll become top-ranked (by reputation) on either of those sites - but I really don't care! As I said, 'fun happens'.

Having said all of that, neither Math Overflow nor Stack Overflow is perfect. There are too few people who worry about 'Mechanized Mathematics' on either, so I still have nowhere to discuss certain issues. I should dip into the nForum again (it's part of the nLab), though; there are some kindred souls there. What I really need is a Math System Implementors Discussion Forum. Hmmm, maybe I should start it!

Sunday, June 13, 2010

Restarting

I created this blog a long time ago, and then let it lie there as I got really busy. But now I really need some place to save some of my thoughts, so I'll pick it up again. Or at least I'll post here until I find another host where I can easily post mathematics and syntax-highlighted code.