The Haskell subreddit recently had a post asking for the cleverest piece of Haskell code you know, and the comments ignited a discussion of Martin Escardo's exhaustive search of the Cantor set, which can be thought of as the set of all infinite chains of 1's and 0's. The techniques are exciting, but initially confounding for many, including me. I set out to fully understand the processes involved and devoured as many of the sparse resources as I could find. I didn't really start to get a mental grip on it until I started playing with code, specifically Luke Palmer's infinite search code. This code doesn't search the Cantor set but instead exhaustively searches the natural numbers, and it's just a little easier to dissect mentally as a precursor to the Cantor set. I think I now know how it works, what its limitations are, and how to break it. My goal is that by reading this post, you'll come away with the same feeling.
Disclaimer: I am very much a Haskell newbie and not a computer scientist by training. I just think this stuff is cool.
Representation of natural numbers
First, let’s cover the representation of the domain, the natural numbers. The representation of natural numbers is defined recursively from Zero:
data Nat = Zero | Succ Nat
That is, a natural number is either Zero, or it is the Succ (successor) of another natural number. So for instance 0 = Zero, 1 = Succ (Zero), 2 = Succ (Succ (Zero)) etc. While this may seem cumbersome at first, the key advantage to representing the domain this way is that it makes comparison operations much more powerful. Consider the comparison
Succ (Zero) /= Zero
which represents that 1 is not equal to 0, which is true. Now consider what happens if we reformulate the expression as
Succ (?) /= Zero
where ? can be Zero, Succ (Zero), etc. By examining only the heads of each side of the expression, we are no longer comparing 1 with 0; we are comparing every number greater than or equal to 1 with 0, and like before, the expression is still true. If we compare
Succ (?) /= Succ (?)
then by examining just the head, we can only claim that the two values being compared are possibly equal. We cannot even claim to know which values are being compared, except to say that neither is Zero. So in this case, we have to examine more than just the heads of each expression to determine equivalency. We in fact have to examine as much of the expression as necessary to find a Zero on either the left side, the right side, or both. If a Zero does not exist, such as in
infinity = Succ infinity
infinity == infinity
then equivalency can never be determined. If you try to execute this in GHCi, you’ll get an infinite loop.
These comparison advantages are the reason why Palmer says that in order to use these infinite search techniques, the domain must be representable as a recursive polynomial type, and why Escardo says that the predicate must be computable. These conditions allow us to construct seemingly infinite comparisons as long as one side of the comparison reduces to something finite eventually.
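As an aside, the Cantor set from the introduction fits the same mold. One way to represent it in Haskell is as an infinite stream of bits; this little sketch is only for illustration, and the type and constructor names are my own invention (they aren't used anywhere else in this post):

data Bit    = B0 | B1          -- a single binary digit
data Cantor = Bit :# Cantor    -- an infinite chain of 1's and 0's: there is no "nil" case

Like infinity for Nat, every (total) value of Cantor is necessarily infinite, so a terminating predicate can only ever inspect a finite prefix of it.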
Testing equivalence
I implemented my own instance of Eq for Nat instead of using Haskell's derived implementation. You can see it in the full code listing at the bottom if you're curious. Let's test out some comparisons:
*Main> Succ (Succ (Succ (Zero))) == Succ (Zero)
-- Comparing Succ == Succ --> Possibly equal
-- Comparing Succ == Zero --> Not equal
False
Here we are comparing 3 and 1. The salient point here is that we can determine that the two values are not equal by only comparing the first two elements of each representation. The same is true when comparing to infinity:
*Main> infinity == Succ (Zero)
-- Comparing Succ == Succ --> Possibly equal
-- Comparing Succ == Zero --> Not equal
False
Expanding the Num instance
Palmer’s Num instance includes the operations (+), (*), and fromInteger. The purpose of the first two is clear. fromInteger takes an integer literal and transforms it into our Nat representation so we can compare it with another Nat representation:
*Main> fromInteger 2::Nat
-- Transforming 2 into Nat form.
---- Returning Succ as element of the transformation
---- Returning Succ as element of the transformation
---- Returning Zero as element of the transformation
Succ (Succ Zero)
*Main> Succ (Zero) == fromInteger 2
-- Transforming 2 into Nat form.
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Zero == Succ --> Not equal
False
The operation (-), however, is not defined. We can define it by adding the following to the Num instance:
Zero - y = Zero
Succ x - Zero = Succ x
Succ x - Succ y = (x - y)
This definition of (-) carries two caveats. First, Zero minus anything is Zero; more generally, n - (n + k) is Zero for any k >= 0. This can lead to unexpected behaviors like
Zero - 130 == Zero
-- Comparing Zero == Zero --> Equal
True
so we need to ensure we are using it appropriately given the properties we have defined for it.
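For example (a sketch of a GHCi session, with the transformation and comparison trace lines elided), the truncation shows up whenever we subtract a number that is at least as large as the one we start from:

*Main> 2 - 7 == Zero
True
*Main> 7 - 2 == Succ (Succ (Succ (Succ (Succ Zero))))
True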
The second caveat is that, unlike addition, subtraction between two Nats cannot be done lazily. With addition, we know that for any x and y
Succ x + y = Succ (x + y)
so we don’t need to know what x and y are in order to start returning the result because we know it begins with Succ. However, with subtraction we have
Succ x - Succ y = (x - y)
so we have no idea whether the result will start with Succ or Zero until we traverse x and y and find a Zero in one of them. Indeed, something like
infinity - infinity
will never return a result.
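A few quick checks make the asymmetry concrete (again a sketch, with the trace output elided). Addition can commit to a Succ without inspecting either argument, so comparing against a finite number terminates even with an infinite operand. Subtraction only terminates once the traversal finds a Zero, which happens in the second line below only because the 3 is finite:

*Main> infinity + 3 == 2
False
*Main> infinity - 3 == 2
False
*Main> infinity - infinity == 2
-- (never returns)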
The search’ function
With the background out of the way, let’s examine the algorithm itself, starting with the search’ function. You’ll notice that my code throughout this post is slightly different than the Palmer code and littered with trace calls. These will be useful later when we start exploring examples.
search' fTop | trace "Starting search. Starting top level predicate call (TLPC)" fTop $ bestGuess = Just bestGuess
             | otherwise = Nothing
    where
        bestGuess = trace "The TLPC has no defined x yet, so start building the best guess with lyingSearch" lyingSearch $ fTop
The important thing to note about search’ is that all it does is apply the predicate fTop to a single natural number, bestGuess. That’s all. In fact, the entire infinite search algorithm begins when this single application of fTop begins to be evaluated, and ends when fTop finishes its evaluation (the Maybe wrapping notwithstanding). I changed the name of the predicate in the search’ function to fTop to try to capture this special significance.
The lyingSearch function
The bestGuess that search’ applies fTop to is merely the result of the lyingSearch call made above. So what does lyingSearch do? It recursively builds the best-guess solution for satisfying fTop while fTop is being evaluated. Let’s examine the code.
lyingSearch :: (Nat -> Bool) -> Nat
lyingSearch fRec = if trace "Can we finish the best guess with a Zero and succeed? Let's try it" (fRec Zero)
                   then trace "Yes, so end best guess search with Zero. Next, complete TLPC" Zero
                   else trace "No, so continue best guess with Succ and do lyingSearch again" (Succ (lyingSearch (fRec . Succ)))
The input argument to lyingSearch, fRec, does two things:
- Includes the predicate we are trying to satisfy
- Records the progress of the best-guess solution
The record keeping is achieved by composing fRec with an additional Succ each time lyingSearch decides that the next element of the best-guess solution is Succ. Thus, when search' first calls lyingSearch with fTop, the best guess is empty because no Succ's have been tacked onto the end yet.
Let's go through lyingSearch step by step. Assume that fRec is fTop composed with zero or more Succ's that represent the current state of the best guess. By applying fRec to Zero, we are testing whether the next element of the best guess should be Zero. If true, then the next element should indeed be Zero because the predicate fTop will be satisfied. If false, then we just assume that the rest of the best guess must be Succ plus whatever is produced by another call to lyingSearch. Assuming this about the best guess does not guarantee that the predicate will be satisfied; we just know that it won't be satisfied with Zero. In this way, lyingSearch could eventually return a result that does not satisfy the predicate (a lie).
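Here are the same two functions with the trace calls stripped out, which makes the structure easier to see (the behavior is identical apart from the logging):

lyingSearch :: (Nat -> Bool) -> Nat
lyingSearch fRec
    | fRec Zero = Zero                              -- ending the guess with Zero satisfies fRec
    | otherwise = Succ (lyingSearch (fRec . Succ))  -- otherwise commit to Succ and keep building

search' :: (Nat -> Bool) -> Maybe Nat
search' fTop
    | fTop bestGuess = Just bestGuess               -- the single top-level predicate call (TLPC)
    | otherwise      = Nothing
    where
        bestGuess = lyingSearch fTop                -- built lazily while the TLPC evaluates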
Example: finding a solution
Let’s go through a few search examples and examine the traces that are produced step by step. I’ve included some additional annotations to the trace output in parentheses to add a little specificity where needed.
predFxn1 x = let lhs = x*x
                 rhs = 4
             in trace ("-- Comparing LHS with RHS") (==) lhs rhs
*Main> search' predFxn1
Starting search. Starting top level predicate call (TLPC)
-- Comparing LHS with RHS (This is part of the TLPC)
The TLPC has no defined x yet, so start building the best guess with lyingSearch
Can we finish the best guess with a Zero and succeed? Let's try it
-- Comparing LHS with RHS (This begins the application of fRec to Zero)
-- Transforming 4 into Nat form.
---- Returning Succ as element of the transformation
-- Comparing Zero == Succ --> Not equal (This ends the application of fTop to Zero)
No, so continue best guess with Succ and do lyingSearch again
-- Transforming 4 into Nat form. (This is part of the TLPC)
---- Returning Succ as element of the transformation (The first element of the best guess is Succ!)
-- Comparing Succ == Succ --> Possibly equal (4 begins to be compared to (best guess)^2. Best guess so far is just Succ)
Can we finish the best guess with a Zero and succeed? Let's try it
-- Comparing LHS with RHS (This begins the application of fTop . Succ to Zero)
-- Transforming 4 into Nat form.
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Zero == Succ --> Not equal (This ends the application of fTop . Succ to Zero)
No, so continue best guess with Succ and do lyingSearch again
---- Returning Succ as element of the transformation (The second element of the best guess is Succ!)
-- Comparing Succ == Succ --> Possibly equal (Continues the comparison of (best guess)^2 to 4 in TLPC)
Can we finish the best guess with a Zero and succeed? Let's try it
-- Comparing LHS with RHS (This begins the application of fTop . Succ . Succ to Zero)
-- Transforming 4 into Nat form.
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Zero as element of the transformation
-- Comparing Zero == Zero --> Equal (This ends the application of fTop . Succ . Succ to Zero)
Yes, so end best guess search with Zero. Next, complete TLPC
---- Returning Succ as element of the transformation (Continue and complete comparison of (best guess)^2 to 4)
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Zero as element of the transformation
-- Comparing Zero == Zero --> Equal
Just (Succ (Succ Zero))
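As a sanity check, we can apply the predicate to the returned value directly (trace output elided):

*Main> predFxn1 (Succ (Succ Zero))
True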
This is all well and good, but how does infinite search determine that there is no solution?
Example: finding no solution
No solution exists when we try to satisfy an impossible predicate. We saw in the last example that whenever we try to end the best guess with Zero but fail, we assume the right path forward is to use a Succ instead. If the predicate always returns false, then our best guess becomes an infinite chain of Succ’s. So then how does the algorithm know to give up the search?
predFxn2 x = let lhs = x*x
                 rhs = 3
             in trace ("-- Comparing LHS with RHS") (==) lhs rhs
*Main> search' predFxn2
Starting search. Starting top level predicate call (TLPC)
... Uninteresting sections of trace omitted here. Best guess is now Succ (Succ (Zero)) ...
Can we finish the best guess with a Zero and succeed? Let's try it
-- Comparing LHS with RHS
-- Transforming 3 into Nat form.
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Zero as element of the transformation
-- Comparing Succ == Zero --> Not equal (This is the first time (best guess)^2 overshoots 3)
No, so continue best guess with Succ and do lyingSearch again (Best guess is now Succ (Succ (Succ (Zero))))
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal (This is part of the TLPC)
Can we finish the best guess with a Zero and succeed? Let's try it
-- Comparing LHS with RHS
-- Transforming 3 into Nat form.
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Zero as element of the transformation
-- Comparing Succ == Zero --> Not equal
No, so continue best guess with Succ and do lyingSearch again (Best guess is now Succ (Succ (Succ (Succ (Zero)))))
---- Returning Zero as element of the transformation (We've reached the end of 3's Nat representation in the TLPC)
-- Comparing Succ == Zero --> Not equal (This is the conclusion of the TLPC)
Nothing
What's interesting here begins at the comparison annotated as the first time (best guess)^2 overshoots 3. The right hand side of the comparison, 3, is not changing, and we know the operation x*x results in a longer and longer series of Succs. With this information, you and I could conclude that this predicate will never be satisfied. The algorithm, however, doesn't conclude that no solution can be found until the same comparison fails inside the TLPC, at the very end of the trace.
To summarize, the infinite search algorithm uses the predicate in two ways:
- Deciding what the next element of the best guess should be
- Deciding whether or not the best guess succeeds or fails
Usage 2 imposes a constraint on the structure of the predicates we can look to satisfy. Specifically, the algorithm can only declare that no solution exists if the predicate evaluates to False when given an infinite chain of Succs (i.e. infinity).
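Both of the predicates we have used so far satisfy this constraint. Because their right hand sides are finite, feeding them an infinite chain of Succs fails after finitely many comparisons (trace output elided):

*Main> predFxn1 infinity
False
*Main> predFxn2 infinity
False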
Breaking infinite search
With the constraint on the predicate in mind, let’s try to break infinite search. Here are a couple of simple predicates that work just fine, one with a solution and one without.
predFxn3 x = x*x == 6 - x
predFxn4 x = x*x == 7 - x
*Main> predFxn3 infinity
False
*Main> predFxn4 infinity
False
Now we know that for either of these predicates, search’ will give up the search if no solution exists.
*Main> search' predFxn3
Just (Succ (Succ Zero))
*Main> search' predFxn4
Nothing
Great. What about these modified predicates, again one with and one without a solution?
predFxn5 x = x*x == 6 + x
predFxn6 x = x*x == 7 + x
When evaluated with infinity, both of these predicates generate an infinite loop because, at the end of the day, we’re evaluating infinity == infinity. However, predFxn5
*Main> search' predFxn5
Just (Succ (Succ (Succ Zero)))
returns its solution, while search’ predFxn6 looks for one forever and never gives up. We can try to be clever and reformulate the predicate like this
predFxn7 x = x*x - x == 7
to eliminate infinity from one side, but all we’ve done is replace the non-terminating operator (==) with the non-terminating operator (-). Alas, predFxn7 will not work, either.
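We can see the problem by applying predFxn7 to infinity directly, the same test we used above. The left hand side becomes infinity*infinity - infinity, and the subtraction never finds a Zero on either side, so the constraint from the summary is violated (a sketch; this hangs just like infinity == infinity did):

*Main> predFxn7 infinity
-- (never returns)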
Other explorations
Hopefully I’ve been able to make infinite search a little less confounding, or at least this particular application of it. Here are some other questions to ponder for the curious mind. I’ll be pondering these myself.
- Can Nat infinite search be modified to handle predFxn6 or predFxn7?
- Can these techniques be used to perform epsilon-delta proofs on compact subsets of the reals?
- Must all lists contain at least three items?
Code listing
{-- Playing with Haskell infinite search --}

import Debug.Trace

data Nat = Zero | Succ Nat
    deriving (Ord, Show)

infinity = Succ infinity -- This line is interesting but not necessary

fromInteger' 0 = trace "---- Returning Zero as element of the transformation" Zero
fromInteger' n = trace "---- Returning Succ as element of the transformation" Succ (fromInteger' (n-1))

instance Num Nat where
    Zero + y = y
    Succ x + y = Succ (x + y)
    Zero - y = Zero
    Succ x - Zero = Succ x
    Succ x - Succ y = (x - y)
    Zero * y = Zero
    Succ x * y = y + (x * y)
    fromInteger n = trace ("-- Transforming " ++ show n ++ " into Nat form.") $ x
        where x = fromInteger' n

instance Eq Nat where
    Succ x == Zero = trace ("-- Comparing Succ == Zero --> Not equal") False
    Zero == Succ x = trace ("-- Comparing Zero == Succ --> Not equal") False
    Zero == Zero = trace ("-- Comparing Zero == Zero --> Equal") True
    Succ x == Succ y = trace ("-- Comparing Succ == Succ --> Possibly equal") (x == y)

-- lyingSearch f returns a Nat n such that f n, but if there is none, then
-- it returns a Nat anyway.
lyingSearch :: (Nat -> Bool) -> Nat
lyingSearch fRec = if trace "Can we finish the best guess with a Zero and succeed? Let's try it" (fRec Zero)
                   then trace "Yes, so end best guess search with Zero. Next, complete TLPC" Zero
                   else trace "No, so continue best guess with Succ and do lyingSearch again" (Succ (lyingSearch (fRec . Succ)))

search' fTop | trace "Starting search. Starting top level predicate call (TLPC)" fTop $ bestGuess = Just bestGuess
             | otherwise = Nothing
    where
        bestGuess = trace "The TLPC has no defined x yet, so start building the best guess with lyingSearch" lyingSearch $ fTop

predFxn1 x = let lhs = x*x
                 rhs = 4
             in trace ("-- Comparing LHS with RHS") (==) lhs rhs

predFxn2 x = let lhs = x*x
                 rhs = 3
             in trace ("-- Comparing LHS with RHS") (==) lhs rhs

predFxn3 x = x*x == 6 - x
predFxn4 x = x*x == 7 - x
predFxn5 x = x*x == 6 + x
predFxn6 x = x*x == 7 + x
predFxn7 x = x*x - x == 7