A mathematical pun?
2023/10/28
Iʼve been working on some theoretical aspects of programming recently. Writing new compiler optimization passes. Thinking about parsers through the lens of selective applicative functors, and tweaking them to encode exclusive choice.
If you go far enough down the rabbit hole, it turns out that you want semirings for static analysis. This is not unheard of in compilers! Itʼs a really good technique for analyzing control flow, for example: information about exclusive branches is combined additively, and information from sequential operations is combined multiplicatively. It is especially appropriate because, semantically speaking, you want those sequential operations to distribute over the branching.
(You can already see this in more typical typeclasses like `Alternative`, which is naturally analyzed by taking `<*>` to `*` and `<|>` to `+`. Itʼs just that Iʼm interested in augmenting `Selective` to encode exclusive choice too.)
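As a tiny illustration of that analogy, here is a sketch of an annotation-only structure (the `Analysis` type and its operations are hypothetical, not from any library): sequencing multiplies the static analyses, and choice adds them.

```purescript
-- Hypothetical: a "parser" that does no parsing, only carries
-- a static analysis in some semiring `r`.
newtype Analysis r a = Analysis r

-- Sequential composition (the shape of `<*>`) multiplies...
seqA :: forall r a b. Semiring r => Analysis r (a -> b) -> Analysis r a -> Analysis r b
seqA (Analysis f) (Analysis x) = Analysis (f * x)

-- ...and choice (the shape of `<|>`) adds.
altA :: forall r a. Semiring r => Analysis r a -> Analysis r a -> Analysis r a
altA (Analysis l) (Analysis r) = Analysis (l + r)
```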
This led me to come up with this construction: how to make a semiring out of a semilattice.
This construction answers the question, “if you need a semiring for static analysis, how do you also keep other data around that does not care about the branching structure?” (like, say, a monoid).
Specifically in selective applicative parsers, I need it to answer the question of why aggregating information about the grammar is a valid thing to do across parser combinator segments, no matter how they are combined.
And in the compiler pass I was doing, I was implementing demand analysis via semirings (especially the min tropical semiring). I donʼt actually have specific information in mind to aggregate as a semilattice, but itʼs a possibility that might come up, especially if I want to fuse some passes together. Right now my one pass is really three traversals of the tree, with various monad stacks of reader, writer, and state. (Yes, I used all three.)
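For reference, here is a minimal sketch of that min tropical semiring in PureScript (the names here are mine, not from a particular library): addition is `min`, multiplication is numeric `+`, zero is infinity, and one is `0`.

```purescript
import Prelude

-- Min tropical semiring: `add` is `min` (identity: infinity),
-- `mul` is numeric addition (identity: 0).
data Tropical = Finite Int | Infinity

instance Semiring Tropical where
  zero = Infinity  -- identity for `min`, annihilator for `mul`
  one = Finite 0   -- identity for numeric addition
  add (Finite x) (Finite y) = Finite (min x y)
  add Infinity y = y
  add x Infinity = x
  mul (Finite x) (Finite y) = Finite (x + y)
  mul _ _ = Infinity
```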
I don’t know if this construction is well-known! Let me know if you have a reference for it.
You can make a semiring out of a semilattice by adjoining a new zero element. Lifting the semilattice operation in the two obvious ways gives you `+` and `*`. Idempotence gives distributivity(!).
(Bounded) semilattices are commutative monoids whose operation is also idempotent: $x \diamond x = x$ for all $x$. I will write the monoid operation as `x <> y` and as $x \diamond y$, and the empty element as `mempty` or $\varepsilon$.
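If PureScript did have such a class, it might look like this sketch (hypothetical, as noted later; the idempotence law cannot be enforced by the compiler, only documented):

```purescript
-- Hypothetical class: a bounded semilattice is a commutative
-- monoid whose operation is additionally idempotent:
--   x <> x = x
-- (The laws live in documentation; the compiler cannot check them.)
class Monoid t <= Semilattice t
```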
Semilattices have deep connections to order theory: they induce a really nice preorder given by $x \le y$ when $x \diamond y = y$ (or vice-versa, depending on whether you are talking about meet or join semilattices – and no, I cannot keep them straight 🏳️🌈). But we donʼt need the order theory here.
Semirings are rings without subtraction: just addition and multiplication and their identities, zero and one, respectively. And distributivity and annihilation laws to intertwine these two monoids.
The funny part of this is that “semi-” means different things: semirings are just missing subtraction (kind of a weird use of semi, which is why some call them rigs), but semilattices are literally half of a lattice (one idempotent commutative monoid instead of two interlinked).
(Lattices are actually closely related to semirings: they have the same shape of operations, and you can turn every bounded distributive lattice into a semiring – in two ways, in fact, since you can make a lattice with the opposite order.)
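Booleans are the tiniest example: they form a bounded distributive lattice, and either orientation gives a semiring. A sketch with a hypothetical newtype (named to avoid clashing with existing instances):

```purescript
import Prelude

-- Booleans as a semiring: `||` as add (identity false),
-- `&&` as mul (identity true). The opposite order gives the
-- other semiring: `&&` as add and `||` as mul.
newtype Any = Any Boolean

instance Semiring Any where
  zero = Any false
  one = Any true
  add (Any x) (Any y) = Any (x || y)
  mul (Any x) (Any y) = Any (x && y)
```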
So itʼs like a mathematical joke that they can be related to each other at all!
How do we get two monoids out of one??
The key idea is to adjoin a zero. Thatʼs it.
The rest of the moves can be determined from that premise, so letʼs see how it works:
```purescript
data WithZero t = Zero | NonZero t

-- Imagine that `t` is really a `Semilattice`
-- (this does not exist as a class in PureScript)
instance Monoid t => Semiring (WithZero t) where
  zero = Zero
  one = NonZero mempty

  add Zero Zero = Zero
  add Zero r = r
  add l Zero = l
  add (NonZero l) (NonZero r) = NonZero (l <> r)

  mul (NonZero l) (NonZero r) = NonZero (l <> r)
  mul _ _ = Zero
```
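To make that concrete: sets under union form a bounded semilattice (union is idempotent), so `WithZero (Set String)` can track, say, the set of variables an expression might mention, with `Zero` marking dead code. A usage sketch (the example values are made up):

```purescript
import Data.Set (Set)
import Data.Set as Set

-- `Set` under union is a bounded semilattice: idempotent,
-- commutative, associative, with identity `Set.empty`.
mentioned :: WithZero (Set String)
mentioned =
  mul (NonZero (Set.singleton "x"))        -- sequencing: union the facts
    ( add (NonZero (Set.singleton "y"))    -- branching: also union
        (add Zero (NonZero (Set.singleton "z")))  -- `Zero`: a dead branch
    )
-- evaluates to NonZero (Set.fromFoldable ["x", "y", "z"])
```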
The two operations are the semilattice operation lifted through `Maybe` in the two possible ways:

- `add` follows the pattern of the default `Semigroup a => Monoid (Maybe a)` instance, which uses `Nothing` as its identity. This makes sense since weʼre adding `Zero`, the identity for `add`. Indeed, it is forced by the laws and the fact that we only have one binary operation to use.
- `mul` is like the other possible instance, `Monoid a => Monoid (App Maybe a)`, formed by `(<>) = lift2 (<>)`. This is likewise forced by the annihilation law and the fact that we only have one binary operation to use.

Iʼm going to be lazy and use math notation for the laws, with the understanding that when I say $x \neq 0$, for example, it means in Haskell/PureScript that `x = NonZero x' :: WithZero t` for some unique `x' :: t`, and if $y \neq 0$ then $x \diamond y$ means `NonZero (x' <> y')`.
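Spelled out in that notation, the two lifted operations are:

$$
x + y = \begin{cases} y & \text{if } x = 0 \\ x & \text{if } y = 0 \\ x \diamond y & \text{otherwise} \end{cases}
\qquad
x \cdot y = \begin{cases} 0 & \text{if } x = 0 \text{ or } y = 0 \\ x \diamond y & \text{otherwise} \end{cases}
$$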
The fun part is the left and right distributivity laws.

To prove left distributivity, $x \cdot (y + z) = x \cdot y + x \cdot z$, we look at some cases: if $x = 0$, both sides are $0$ by annihilation; if $y = 0$ (or symmetrically $z = 0$), both sides collapse to $x \cdot z$ (respectively $x \cdot y$); and if everything is nonzero, it reduces to a calculation in the semilattice, where commutativity, associativity, and idempotence do the work:

$$x \diamond (y \diamond z) = (x \diamond x) \diamond (y \diamond z) = (x \diamond y) \diamond (x \diamond z).$$

We prove right distributivity, $(y + z) \cdot x = y \cdot x + z \cdot x$, in the same way, based on the calculation

$$(y \diamond z) \diamond x = (y \diamond z) \diamond (x \diamond x) = (y \diamond x) \diamond (z \diamond x).$$
The takeaway is that idempotence of the semilattice gives us distributivity of the semiring. This is why having a semilattice and not merely a monoid is essential.
This does make some sense: if weʼre aggregating information that does not care about branching structure at all, well, semilattices are great models for accumulating knowledge. Idempotence says you only learn a fact once.
We donʼt require multiplication to be commutative, so if you drop the left-distributivity law, you could get away with a right-regular band, with the law $x \diamond y \diamond x = y \diamond x$.
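To see that this law suffices, here is the right-distributivity calculation again, now using only associativity and right-regularity applied to the inner $x \diamond z \diamond x$:

$$(y \diamond x) \diamond (z \diamond x) = y \diamond (x \diamond z \diamond x) = y \diamond (z \diamond x) = (y \diamond z) \diamond x.$$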
I think left-distributivity is a bit weirder than right-distributivity, in the context of control flow. Right distributivity just says you can copy any trailing code into each case branch.
However, in general Iʼm a fan of left-regular bands (those satisfy $x \diamond y \diamond x = x \diamond y$ instead), since they intuitively preserve order.
Also, to be fair, you could absolutely disregard some semiring laws for the sake of static analysis of programs: you donʼt always want to treat programs as purely algebraic structures, and often want to dig into the details of how they were constructed.
Like, if youʼve factored out common control flow, thatʼs almost always for a reason! So your static analysis should reflect that.
As for the rest of the semiring laws:

Annihilation, $0 \cdot x = 0 = x \cdot 0$: we made this true by definition of `mul`. We also made the additive identity laws, $0 + x = x = x + 0$, true by definition of `add`. So for associativity of addition, we just need to prove that $(x + y) + z = x + (y + z)$ for $x, y, z \neq 0$. But that follows from the semilatticeʼs associativity: $(x' \diamond y') \diamond z' = x' \diamond (y' \diamond z')$.

Is addition commutative? Yes.

For the multiplicative identity laws, $1 \cdot x = x = x \cdot 1$, we need two cases: $x = 0$ and $x \neq 0$. But if $x = 0$, it is trivial still. (This is the nice way the identities and annihilator elements interact. They donʼt add any proof burden to the other.) So for $x \neq 0$ (and since $1 \neq 0$, and $1$ is given by the semilatticeʼs identity $\varepsilon$), we look at the underlying semilattice and find that $\varepsilon \diamond x' = x' = x' \diamond \varepsilon$, as we want.

Associativity of multiplication is the same case analysis as usual: if $x, y, z \neq 0$ then we get associativity from the semilattice, otherwise both sides equal $0$ by the power of the annihilator.

Is multiplication commutative? Yes.
Note that we cannot make a lattice out of the semilattice – thatʼs a step too far. Intuitively, from the order-theory point of view, thereʼs no reason why we would be able to, since the meet and join operations of a lattice have opposite views of the preorder of the lattice.
And algebraically, the two absorption laws would fail in general: $x + (x \cdot y) = x$ and $x \cdot (x + y) = x$ (even stating them like that looks weird). For $x, y \neq 0$, by idempotence of the semilattice we would see $x \diamond (x \diamond y) = x \diamond y$, which only equals $x$ if $x \diamond y = x$. Thereʼs just no way to get rid of the extra $y$ there if we are sticking to one operation.
You could technically iterate this construction, since `add` and `mul` are both idempotent, commutative, associative operations now. However, itʼs not terribly interesting: you end up adjoining some number of identities and annihilators to the underlying semilattice. (New top/bottom elements, depending on which way you look at it.) The order you do this in does not matter, only how many times you choose to do it each way.
Want a semiring without zero? No need to adjoin a zero, then – just use the same carrier type. The remaining laws still just work.
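Concretely: on the bare carrier, take $x + y = x \cdot y = x \diamond y$ and $1 = \varepsilon$; then distributivity is once again exactly idempotence:

$$x \cdot (y + z) = x \diamond y \diamond z = (x \diamond y) \diamond (x \diamond z) = x \cdot y + x \cdot z.$$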
For static analysis, the zero is only good for representing unreachable/error cases. But the identity of the semilattice is indispensable: itʼs the empty analysis, for when you know nothing yet or have nothing to contribute.
It is important to note that all of these algebraic constructs (monoids, semilattices, semirings – but not fields) are closed under taking products. This is why I said “how do you also keep other data around” in the introduction.
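A sketch of why, with a hypothetical pair type (I believe PureScriptʼs `Tuple` already carries an instance like this): everything is defined componentwise, so every law holds componentwise.

```purescript
data Both a b = Both a b

-- The product of two semirings is a semiring, componentwise.
instance (Semiring a, Semiring b) => Semiring (Both a b) where
  zero = Both zero zero
  one = Both one one
  add (Both a b) (Both x y) = Both (add a x) (add b y)
  mul (Both a b) (Both x y) = Both (mul a x) (mul b y)
```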
The concept of a semiring is an abstract conception of what a number is. A particular semiring is a specific conception of what can be a number. We can manipulate these “numbers” in the familiar ways – mostly.