Emulation of TL2 Shared Stash

Posted on March 27, 2013 by dan

A lot of cool news to report on FNIStash!

After successfully pulling png icons out of the TL2 game data, the next step was to get a GUI going. As I’ve written about previously, GUIs on Haskell (especially windows) are a PITA. However, after doing a clean reinstall of Haskell Platform, I was able to successfully install Ji and build the example executables.

What is Ji?

The GUI landscape on Haskell is pretty bad. There are lots of frameworks out there, but actually installing them (much less using them) is often a big pain. There really hasn’t been much effort lately to make these frameworks more user friendly.

Where does Haskell get a lot of attention? The Web. Ji is a package that takes advantage of this web attention by turning your web browser into desktop GUI. You control the browser GUI from your Haskell application with commands to add elements, handle events, etc. Ji achieves this by running a Javascript library on your page that transparently communicates with the Haskell server using JSON to execute commands and send event notifications. The low latency of a desktop connection to itself makes this feasible, but technically Ji could be used for web apps that have a sufficiently fast connection (probably not a good idea, though). It’s all automatic and requires very little set up.

For FNIStash, I needed to expand on the Ji package by enabling assignment of element IDs and lookups thereof. I found that most of the infrastructure for this was already there but unused, which made adding the functionality pretty easy once I figured out how the package worked. The updates went into Ji earlier this morning, but Ji’s creator, Chris Done, has other things on his plate and wanted to unload responsibility for the package. Heinrich Apfelmus, of reactive-banana FRP fame, volunteered to take the project over with my assistance. That transition is still in the works.

Ok, so FNIStash…

I recently discovered how TL2 stores the location information for items. There is a hierarchical system of containers, slots, and indices, each of which has an ID to keep track of it all. From a raw item binary, I now know where that item is among TL2’s various item containers. The result?

A working shared stash display!

This is a working shared stash display, albeit unpolished. Each item is inserted into the emulated stash in its same position just like the game. The selection tabs at the top work, so clicking on a different tab changes the class of items that are displayed. Ji handles this all over the wire, although I admit is was a little tricky getting the click event handler written.

The error at the top indicates that there are some items in this shared stash file that I am not parsing correctly. It’s true that there are cases my parsing logic doesn’t currently handle, mainly because I don’t understand the file format entirely. For the time being, handling most cases is sufficient for moving development forward.

It’s actually starting to look like a real, genuine desktop app. I’m excited!

A comparison of Haskell vs F#

Posted on March 9, 2013 by dan

In my last post, I wrote about my frustrations as a Haskell user on Windows. It led to a good discussion on Reddit about a number of points I raised, as well as some good comments on this site. I received more than one recommendation to try out F# as an alternative to Haskell. I’ve been casually exploring it for a few weeks now, and I have decided not to pursue it. My impressions are below. Keep in mind that I am coming to F# as a barely experienced Haskell user, but a Haskell user nonetheless. My impressions are also restricted to F# in Windows, as I did not investigate Mono or other Unix tools.

F# – The Good

Full citizen status – Most immediately noticeable when coming to F# from Haskell is that Windows users come first. You do not need to install any Unixy tool chains to build what you want. In fact, almost everything you could possibly want is downloadable straight from Microsoft’s enormous and comprehensive .Net framework. Everything just works.

Documentation – Being part of the .Net framework, F# has a lot of corporate resources behind it at MS. This includes in depth documentation about nearly every .Net component. What is more, the .Net community is enormous, so finding examples online for pretty much anything is much easier than it is for Haskell.

Advanced tooling – One of the things Haskell suffers from, in my opinion, is lack of advanced development tools, including a standard and full-featured IDE. In F#, you get Visual Studio, a fully featured IDE that has been tested by thousands, if not millions. Haskell has no equivalent. Leksah is decent, but it is nowhere near as capable as VS.

For example, in F# on VS you can scroll over (almost) any value or function and see the inferred type signature for it. In Haskell, if I have a tough type error, I usually have to start explicitly defining the types of values and functions to narrow down the scope of my mistake. In VS, I can just scroll over the item in question and immediately get hints as to where I went wrong. Unfortunately, as far as I know this only works for named functions and not functions defined as symbols (operators).

Sane strings – Haskell’s default implementation of strings is a linked list of characters. While providing some advantages, in most cases it’s plain inefficient and unusable. GHC has some extensions to help fix this, such as the OverloadedStrings extension, and there are libraries to help as well, such as the text library. But different libraries choose to handle strings in different ways, so you often find yourself calling a number of conversion operations just to glue all the right types together. F#, on the other hand, has one standard string implementation that everyone uses.

Standards – Somewhat similar to the previous point, the Haskell ecosystem sometimes feels like the wild west, with different projects experimenting with types and strategies. Frequently, this can lead to ambiguity in how a newbie should approach a particular problem. In F#, there is usually a standard, recommended way to do things – the .Net way.

Computation expressions – This is a feature in F# similar to monads that is in some ways less powerful (see below) and in some ways more powerful. Like do notation, CE’s are syntactic sugar for monad-like operations. The familiar Bind and Return are present, like Haskell, but F# allows additional desugaring of loop expressions, the ability to implement lazy evaluation, and more.

No lazy evaluation (unless you want it) – Arguably a con overall, but my point here is that reasoning about performance and space usage is easier in F# due to strict evaluation semantics. The profiling utilities built into VS make this even easier still.

F# – The Bad

Verbose and odd syntax – A number of aspects of F# rub me the wrong way.

First, every declaration requires a let keyword. Let let let let let. Haskell in a lot of ways is much more concise because several values can be declared using one let.

Secondly (and this drives me nuts), F# function and type declarations must be made before the function is used. In Haskell, I often use a few helper functions that are defined down near the bottom of the file, but in F# those helpers must be defined near the top of else they can’t be used. Having such an antiquated restriction as this in a language a modern as F# boggles my mind.

Thirdly, the declaration of types and their constructors is sometimes not consistent. For instance, a constructed tuple in F# is (a,b) (like Haskell), but the type of this tuple is c * d. It’s not a hard thing to get over, but it does accentuate how self-consistent Haskell is.

No type classes! – I put an exclamation point on this one because it might be the single most important reason F# left a bad taste in my mouth. I did not realize the power of type classes until I tried to do functional programming without them. Take for example the monad type class. If a type is an instance of the monad type class, it is a given that it will work with replicateM, sequence, and a number of other functions that work on monads. If you have a new type, just declare a monad instance for it and sequence will work with it as well.

Trying to do this in F# is an exercise in frustration. There are ways to simulate type classes, but you have to do all kinds of inlining and type gymnastics to make the .Net CLR happy. Since it’s not readily supported by the language, there is no standard library of monad functions, and in a lot of situations you’ll have to reimplement these extraordinarily useful utilities yourself. This can lead to a good deal of code duplication. Just check out the FSharpx monad library code for examples.

Dependence on .Net – .Net is a comprehensive, full-featured framework, but it is huge and designed for OO languages. This results in classes like Stream, StreamReader, BinaryReader, TextReader, etc., which are all slight variations on each other. F# uses these same classes. Digging through the docs to find what you need can be challenge, but less so if you’re already familiar with .Net. In Haskell though, almost all of these can be replaced by mapping to/from a plain old byte string. Why does OO have so much seemingly unnecessary complexity to simply deserialize a type?

“Functional first” – F# is often described as a “functional first” language, meaning it supports both functional and OO programming paradigms. For some this is a plus (right tool for the right job and all that), but to me it seems like F# suffers from split personality disorder. Some of the F# libraries feel like little more than functional wrappers around an OO core. For a newbie, it’s not entire clear if these are merely measures of convenience or of necessity.

Back to Haskell

I doubt I was able to completely commit my thoughts on F# to this page, but this is at least a good start. I’ve decided to continue to wrestle with Haskell on Windows because even though the ecosystem is a house of cards, actually coding in Haskell is much more enjoyable than F#. I’ve started over with Haskell Platform to hopefully resolve as many compatibility issues as i can.

Reflections After a Hard Day in Haskell GUI Land

Posted on February 17, 2013 by dan

My progress on my current project (FNIStash) is unsteady, but yesterday I decided the time was right to start putting together a rudimentary GUI as both a learning exercise and as a boost of motivation for moving the program along. After working on getting a basic toy GUI for an entire day, I have absolutely nothing to show for it. It was truly the most disheartening and disappointing interaction with the Haskell ecosystem that I have had so far. Here are some of my impressions.

As a Windows User

Trying to navigate the Haskell ecosystem as a Win 7 user is like making a deposit at a bank in loose change. Some banks will accommodate you and others will turn you away, but in either case you’ll get the feeling that the way you are doing things is fundamentally looked down upon. In cases where you are accommodated, you usually have to jump through many hoops, such as installing specialized build environments like MINGW32 or cygwin (which themselves might require several different installers) and manually building cross-platform libraries. After a lot of work, you’ll find you’re ready to install the package you wanted to install all along, only to have it break for who-knows-why. A lot of libraries are just not as easy as “cabal install haskell-gui-support”

Cabal

Cabal is a frustratingly constraining tool. Far too frequently I encountered packages that, when trying to install, would say installing this package will break a dozen others. If not that, then I often would be notified that the dependencies could not be resolved. If you decide that you don’t care that your other packages break and go ahead with the install, it’s a tossup as to whether or not the install will be successful, but oops, you just broke a dozen of your old packages already, so if it fails you have some cleaning up to do. I’ve become pretty skilled at clearing out my cabal and ghc databases and invoking cabal install world to just bring my packages back to a self-consistent state after trying and failing to install a new package to explore.

For packages that leverage a cross platform library, this is frustraing but understandable. Often this house of cards is built by taking C or C++ source meant for a Unix like build environment, installing it through MINGw32 (as the special Windows installation instructions tell you to do), then installing a package from cabal that brings in the Haskell wrappers. It’s not pretty, but it’s kind of expected when pulling in functionality across platforms and across languages.

What I don’t understand is why cabal should ever have a problem with some packages breaking others or dependencies not being resolved. Cabal knows what package I want to install, it knows what versions of its dependencies it needs, and those dependencies are freely available from hackage (as are their own dependencies). You would think that this is a simple scenario with a straight forward solution – recurse and build all versions of everything as needed. But for some reason, cabal does not allow side-by-side installation of different versions of packages. If you have a package that requires a specific version of a dependency, you are effectively prevented from using any package that requires a different version. The utility cabal-dev is available, which allows the set of packages to be sandboxed on a per project basis, but that is small consolation if you (like me) need to get out of hell on a single project.

Side-by-side installation of multiple versions is a known and demonstrated solution to dependency hell. I alternate between blaming the community for not solving this problem and blaming myself for not understanding why it is so difficult. Solutions like Haskell Platform seem to me to be treating the symptom and not the disease.

Haskell Community

The Haskell community is widely regarded as newb-friendly. If you need newb help, the haskell IRC channel is just a click away, and in cases where I didn’t get help from the IRC channel or haskell cafe, I’ve simply selected a knowledgable member of the Haskell subreddit and messaged them directly. No one has ever called me a newb or made me feel dumb for asking questions, even when the answer was staring me in the face in the docs and I just didn’t notice. It’s a great community.

What has happened (albeit occasionally) is that I might get no response at all, and as my questions become more advanced and specific, the likelihood that my queries get lost in the void increases. This is no different from any other language community. However, the Haskell community is tiny compared to, say, Python, so it’s pretty easy to exhaust all avenues of help. If you’re in the Haskell Windows community, it’s even smaller.

Desperation in Finding a GUI Package

In my search to find a viable GUI package, the only one that I had absolutely any success with was HGamer3d, and it’s probably not a coincidence that it is designed to work on Windows. It was the only package that I was able to install successfully (without any headaches either), and it also had several example sources that worked effortlessly. The problem is that is has next to nothing for documentation. I tried modifying an example slightly only to have the executable fail on startup for unknown reasons. Maybe I will revisit it in the near future, but I would prefer to not have to learn Ogre to decipher the HGamer3d source code.

What I wanted at the start of all of this was a GUI library or framework that supported DDS texture files. By the end, I was willing to settle for anything that would successfully install and would display an image in any format. So why not the web frameworks? Yesod can’t resolve dependencies, happstack fails to load the network package (for which I found a work around online that, alas, doesn’t work), and snap fails to build the lens package due to a segfault in the generated code. I’m sure any of these would work if I had the “right” set of packages installed, but this is not a viable solution (see above). I can’t rip my environment apart every time I need something new.

wxHaskell is the most popular GUI toolset based on this reddit discussion, but I forgot to sacrifice a goat as step 0 of the installation process, so I had very little success with it.

Now

I feel pretty similar to the author of this reddit discussion, except I had even less success. GLUT, GLFW, elm, reactive this, FRP that. I feel as if I have looked at everything and found success in nothing. Others work in Haskell GUIs all the time, so I know it is possible and that part of the blame lies on my side of the screen.

This experience has been a completely different kind of difficulty compared to learning the Haskell language. List comprehension, immutable variables, monads, type classes…these are all well documented concepts with essentially no variability from case to case. Read up long enough and eventually they click, and you are on your way. The problem I face now is that I don’t know where to go from here.

String Keys Suck

Posted on January 12, 2013 by dan

In my previous post, I mentioned that I needed 95 MB of space memory just to run my simple test of extracting data from the Torchlight 2 PAK file. I did some investigating to figure out what the heck was going on. The culprit: using a FilePath (which is just a String) as the key to my map.

Prepping for profiling

To profile in Haskell with GHC, you need to compile your program with the -prof option, and throw in -auto-all to automatically add cost centers to the profile output. You then execute the program with some additional flag to tell the runtime to collect profiling data. After that, you can look at the resulting .prof file for nice tabular data, but I prefer graphs. There are a few annoying steps to this whole process, so I created this batch script to handle most of it for me, which I named hprofile. It runs the exe, generates the products, and tags the prof and graph with a description.

@echo off
%2.exe %3 %4 %5 %6 %7 %8 %9
hp2ps -e8in -c "%2.hp"
DEL %2.hp
set ID=%~1
set newname=%2(%ID%)
IF EXIST "%newname%.prof" (DEL "%newname%.prof")
RENAME "%2.prof" "%newname%.prof"
IF EXIST "%newname%.ps" (DEL "%newname%.ps")
RENAME "%2.ps" "%newname%.ps"
DEL "%2.aux"
CALL ps2pdf "%newname%.ps"
DEL "%newname%.ps"

Baseline – with String keys

Here’s the graph resulting from

>hprofile "baseline" FNIStash-Debug +RTS -p -hc

Memory usage for String keys in Map

The forText2/pakFileList is the function that generates the keys in the Map. In this case, the keys are Strings (FilePaths).

Improvement – with Text keys

I changed the type of the key in the map from FilePath to Text. This actually made a lot of sense since I parsed them out as Text anyway, but chose FilePath before so I could use the path handling utilities in System.FilePath. The lookup function on the map still takes a FilePath as the key. Now, however, the FilePath is converted to Text within the lookup function. Here is the result.

Memory usage for Text keys in Map

No more runaway memory usage! The moral of the story: avoid String.

Reading the TL2 PAK file

Posted on January 10, 2013 by dan

I’m gradually continuing to make progress with FNIStash, gradually being the key word. It’s pretty tough to learn a new language, decode a file format, and plan a wedding at the same time.

Lately, I’ve been working on a Haskell module to decode the Torchlight 2 PAK file. The PAK file is a single archive file that contains all the configuration and asset files for the game (well, at least most of them). I’m hoping to read this PAK file and decode its contents so I can, for instance, grab the bitmaps for the various inventory items and use them in the GUI for FNIStash.

In writing the PAK module, I came tried a few different strategies for parsing the binary data, with varying levels of success. I won’t describe the PAK format here because it is already well described in the readme file included in CUE’s Torchlight II packer.

Objective

The PAK.MAN file associates a file path to a particular part of the PAK file. What I’d like to do is create a lookup table of all the files archived in the PAK file. The key would be the file path, and the value would be the right data out of the PAK file. I call this a PAKEntry. So the mapping would look something like

type PAKFiles = M.Map FilePath PAKEntry

with PAK entries being parsed with the following Get operation

getPAKEntry :: Get PAKEntry
getPAKEntry = do
    hdr <- getWord32le
    decSize <- getWord32le
    encSize <- getWord32le
    encData <- getLazyByteString $ fromIntegral encSize
    return $ PAKEntry hdr decSize encSize encData

Attempt 1 – Using lazy Get

My first attempt used the lazy binary Get monad. This was also lazy on my part. My thinking here was that, since I already head a Get monad operation coded up that could parse out a PAKEntry, I could just add a line at the beginning of that operation that skips to the right part of the file. Something like

(skip . fromIntegral) (offset-4)

with the offset being passed in as a function argument (so the type signature would be Word32 -> Get PAKEntry). This worked, but was horrendously slow. What this does is read all the data of the PAK file into memory up to the offset (approximately), and then reads in as much more as it needs to parse out the PAK entry. For parsing entries near the end of the 800 MB file, this took a long time. I used this function to construct the table:

readPAKFiles :: FilePath -> FilePath -> IO PAKFiles
readPAKFiles manFile pakFile = do
    man <- readPAKMAN manFile
    pak <- BS.readFile pakFile
    let fileList = pakFileList man
        offsetList = pakFileOffsets man
        fileOffsetList = L.zip fileList offsetList
        f offset = flip runGet pak ((getPAKEntry . fromIntegral) offset)
        mapList = L.zip fileList (L.map f offsetList) :: [(FilePath, PAKEntry)]
    return $ M.fromList mapList

Attempt 2 – Using strict Get

Attempt 2 used the strict Get monad and strict bytestrings. The code is nearly identical to attempt 1. Here, the problem was that the BS.readFile function read in the entire contents of the PAK file to memory, then skipped to the right location of the PAK entry (instead of streaming the file into memory as necessary as it did with lazy strings). This was *much* faster, but required a huge 800 MB of memory to be read in just to parse out a tiny 26k PNG file (the file I was using for testing.).

Attempt 3 – Handle and lazy Get

Check out the type signature of the PAKFiles type. It maps a FilePath to a PAKEntry. No where in that definition is IO mentioned. So PAKFiles cannot do any IO, by definition. The previous two attempts have PAKFiles defined as a pure result of parsing binary data read through IO. What I really needed is for PAKFiles to have this type signature:

type PAKFiles = M.Map FilePath (IO PAKEntry)

Now, the result of a lookup operation is an IO action that retrieves a PAKEntry.

Reading the PAK entry lazily is the behavior I wanted to keep because I didn’t know how much data I needed to read in until after I started reading the entry. But I didn’t want to lazily read in and throw away the data in order to skip to the right part in the file. My next attempt used handles:

readPAKFiles :: FilePath -> FilePath -> IO PAKFiles
readPAKFiles manFile pakFile = do
    man <- readPAKMAN manFile
    let fileList = pakFileList man
        offsetList = pakFileOffsets man
        fileOffsetList = L.zip fileList offsetList
        f offset = do
            h <- openBinaryFile pakFile ReadMode
            hSeek h AbsoluteSeek (fromIntegral offset - 4)
            pak <- BS.hGetContents h
            let parsedEntry = flip runGet pak getPAKEntry
            hClose h
            return parsedEntry
        mapList = L.zip fileList (L.map f offsetList) :: [(FilePath, IO PAKEntry)]
    return $ M.fromList mapList

There’s a problem with this, though. I kept getting errors that the handle was closed before I tried to read it!

Attempt 4 – (Mostly) success with $!

The problem with attempt 3 is that runGet is executed lazily, but hOpen and hClose are not. When runGet is finally performed, the handle is already closed. This is easily fixed with the $! operator, which strict evaluates its argument:

readPAKFiles :: FilePath -> FilePath -> IO PAKFiles
readPAKFiles manFile pakFile = do
    man <- readPAKMAN manFile
    let fileList = pakFileList man
        offsetList = pakFileOffsets man
        f offset = do
            withBinaryFile pakFile ReadMode (\h -> do
                hSeek h AbsoluteSeek (fromIntegral offset - 4)
                pak <- BS.hGetContents h
                return $! (flip runGet pak getPAKEntry))
        mapList = L.zip fileList (L.map f offsetList) :: [(FilePath, IO PAKEntry)]
    return $ M.fromList mapList

This version of the operation is fast and uses a low amount of memory. By “low”, I mean 95 MB to parse out a 26k file. Something is wrong there, which is my next thing to investigate. Maybe that will make for another post.

– Update 01/23/2013 –

This post last ended with my creating a lookup table that generated IO actions that returned PAKEntry records. This turns out to be a horrible idea. Every time a lookup is made, IO has to be run. In order to do this, you need to pollute huge parts of your program with the IO monad. Instead, I reformulated this to be pure and more forward looking – you give the function a clue of what content you want in the table (such as a file path prefix), and it will read all of the necessary files into memory right then and there. After that, lookups can be done without the IO monad.

Installing Leksah on Windows 7

Posted on November 19, 2012 by dan

This past weekend, I convinced myself to give Leksah, the Haskell IDE written in Haskell, a try. I’m glad I did. Writing Haskell in it is much more enjoyable than using Notepad++ and WinGHCi side by side. So much more enjoyable that I actually managed to sit down and write something useful!

Installing Leksah on Windows is not hard, but it is easy to mess up, as I found out. Here is what I hope is a step-by-step guide to doing it on Windows 7. Note that this walkthrough assumes you already have GHC installed, probably by installing the Haskell Platform.

Determine your ghc version by typing “ghc –version” at a command prompt. In my case, it was 7.0.4.
Download the Leksah binary installer that matches your ghc version. Matching of versions is not optional!
Run the installer and accept the default installation location. Mine was C:\Program Files (x86)\Leksah\
Start up Leksah (you can search for it in the Start menu). When you first start up Leksah, you will be asked to provide the path to your cabal packages. This is so Leksah can process your packages into an explorable database.
Add the packages path into the text field, then click the Add button. You should see the path listed in the path list as confirmation of your selection. My path was C:\Users\MyUserName\AppData\Roaming\cabal\packages. Then click OK.
Leksah will process your installed packages. This takes a few minutes. If it looks like Leksah performed this processing really quickly/nearly instantly, then double-check that your Leksah installer matches your ghc version. (This was my problem for quite a while.)
Leksah saves your package data in a .leksah folder in the top level of your user directory. If you delete this folder, Leksah will revert its behavior as if it were freshly installed. This is useful if the guide above doesn’t work for you and you need to debug some stuff but don’t want to endlessly uninstall then reinstall.

And that’s it! If you need help on using Leksah, try this series of youtube videos.

Testing equivalence of rationals using natural encodings

Posted on June 12, 2012 by dan

I’ve recently been spending a decent amount of my free time in self-study of Ulrich Kohlenbach’s Applied Proof Theory: Proof Interpretations and their Use in Mathematics. As someone who has no experience with Constructivism, it is definitely challenging, but there is no better way to learn than hands-on. Here I explore one of the topics the book mentions – encoding rational numbers into the naturals – using Haskell.

The idea is simple – take operations on $\mathbb{Q}$ and transform them into operations on $\mathbb{N}$. We’ll use a special encoding to assign a unique $n \in \mathbb{N}$ for each $q \in \mathbb{Q}$ (yes, there are just as many natural numbers as there are rationals!). We can then perform operations (in this exercise, equivalence tests) in the encoded set ($\mathbb{N}$) and have the results be valid in the set we care about ($\mathbb{Q}$). Some of the steps below will seem silly, but this is just an exercise in theory.

The gist is available here.

Before we jump in, we’ll establish some prerequisite machinery for our exploration. First, we’ll take the Nat instance from the exploration on infinite search and expand it a bit using the nomenclature in Kohlenbach. Next, we’ll borrow the least function from Seemingly Impossible Functional Programs, slightly modified:

least p = if p 0 then 0 else 1 + least(\n -> p (1 + n))

Lastly, we’ll build the one-point compactification of $\mathbb{N}$ for use in the infinite search monad. The premise of this monad is that by defining an infinite set, an algorithm for exhaustively searching that set is automatically generated.

natural :: Set Nat
natural = union (singleton Z) (S <$> natural)

Now we’re ready for math.

Rationals can be thought of as pairs of integers, one being the numerator $num$ and the other the denominator $den$. There are two challenges in trying to encode this information into $\mathbb{N}$.

The encoding has two inputs and one output.
Rationals have sign, but the naturals have no sign.

We’ll tackle 2 first. If $q$ is positive, then its numerator $num$ is mapped to the even natural $2|num|$ and the odd natural $2|num|-1$ otherwise. In this way, we can encode the sign information in $q$ into the parity of a natural $n_{num}$. The denominator $den$, on the other hand, maps to separate natural $n_{den}$ through $|den| – 1$.

Now that we have $num$ and $den$ mapping to two naturals, we’ll map the pair of naturals $(n_{num}, n_{den})$ to a single natural $q_c \in \mathbb{N}$, with $q_c$ meaning an encoded rational. Using the Cantor pairing function

$$j(x^0, y^0) := \left\{ \begin{array}{lr} \mbox{min }u \le (x+y)^2 + 3x + y [2u = (x+y)^2 +3x + y] \mbox{ if it exists} \\ 0\mbox{ otherwise} \end{array} \right. $$

The function numden2QCode below performs this encoding, returning a natural number of type QCode representing a rational from the rational’s numerator and denominator.

numden2QCode :: Integer -> Integer -> QCode
numden2QCode num den  | (num >= 0 && den > 0) || (num <= 0 && den < 0)  = j (fromInteger $ 2*abs(num)) (fromInteger $ abs(den)-1)
                      | otherwise                                       = j (fromInteger $ 2*abs(num)-1) (fromInteger $ abs(den)-1)

j :: Nat -> Nat -> QCode
j x y = if k <= t then QCode k else QCode 0
        where   k = least (\u -> 2*u == t)
                t = (x+y)*(x+y) + 3*x + y

Testing for equivalency of rationals using QCodes might seem like a straightforward process of comparing the QCodes themselves, but one characteristic of the numden2QCode encoding we’ve developed is that $(num, den)$ (1, 2) has a different QCode than (2,4), even though these rationals are equivalent. To test for equivalence, we’ll project the QCode back into the $n_{num}$ and $n_{den}$ pairs .

$$ j_1(z) := \mbox{min }x \le z[\exists y \le z (j(x, y) = z)] $$

$$ j_2(z) := \mbox{min }y \le z[\exists x \le z (j(x, y) = z)] $$

Here, $z$ is our rational encoding $q_c$, $j_1$ returns $n_{num}$ and $j_2$ returns $n_{den}$. For example,

>j1 $ j 2 3
S (S Z)

>j2 $ j 3 4
S (S (S (S Z)))

Lastly, to test equivalence, we’ll extend QCode to be an instance of the Eq typeclass. The trick here is that we need to handle odd and even $n_{num}$ cases separately, since that indicates the rational is positive or negative. Once we back out the $n_{num}$ and $n_{den}$, we convert back to rational num/den form and equate using cross multiplication.

instance Eq QCode where
(==) n1 n2 | evenNat j1n1 && evenNat j1n2 = (j1n1/2)*(j2n2 + 1) == (j1n2/2)*(j2n1 + 1)
           | oddNat  j1n1 && oddNat  j1n2 = ((j1n1+1)/2)*(j2n2 + 1) == ((j1n2+1)/2)*(j2n1 + 1)
           | otherwise                    = False
           where j1n1 = j1 n1
                 j1n2 = j1 n2
                 j2n1 = j2 n1
                 j2n2 = j2 n2

Finally, we can equate some rationals!

print $ numden2QCode (-1) 1 == numden2QCode (-2) 2 -- True
print $ numden2QCode (-1) 1 == numden2QCode (-3) 3 -- True
print $ numden2QCode (-1) 2 == numden2QCode 2 (-4) -- True
print $ numden2QCode 1 2    == numden2QCode 2 4    -- True
print $ numden2QCode 1 2    == numden2QCode 1 3    -- False

Note that you’ll probably want to compile these into an executable like I did because they are slow otherwise. In executable form, they are done in about 1.5 seconds. In a future post, I’ll explore why a few changes to the Num instance for Nat reduced execution time from 88 seconds down to 1.5.

Natural numbers and the infinite search monad

Posted on May 20, 2012 by dan

My previous post explored the usage of infinite search techniques to exhaustively search the natural numbers to find a witness satisfying a total predicate, or decide when no such witness exists. Turns out, these “searchable” spaces form a monad, available in the infinite search package on hackage.

To use the monad for search across the naturals, we have to first define the set of naturals. Defining the set is all that is required to start performing infinite searches. Here is one of my failed attempts at defining the naturals:

natural :: Set Nat
natural = foldr1 union $ map (singleton) $ iterate S Z

What’s wrong with this definition? It defines the set {Z, S Z, SS Z, …}, which represents the entire space of naturals. Let’s try to do a search:

> search natural (\x -> x+2 == 1)
Nothing

> search natural (\x -> x*2 == 1)
(infinite loop)

Why does the second predicate above, which works just fine using search’ and lyingSearch from the previous post, fail on this predicate? The issue is that the search’ function searches across the one-point compactification of the naturals (i.e. including infinity). Our definition of natural above doesn’t include infinity because every element of the set ends in Z due to the iterate function. Without infinity, the set is not compact, and hence not searchable. Thanks goes out to Luke Palmer for providing the correct (and prettier!) definition:

natural :: Set Nat
natural = union (singleton Z) (S <$> natural)

This recursive set definition defines infinity as part of the set, so now…

> search natural (\x -> x*2 == 1)
Nothing

Infinite search explained

Posted on April 9, 2012 by dan

The Haskell subreddit recently had a post about the cleverest piece of Haskell code you know, and the comments ignited a discussion of Martin Escardo’s exhaustive search of the Cantor set, which can be thought of as the set of all infinite chains of 1’s and 0’s. The techniques are exciting, but initially confounding for many, including me. I set out to fully understand the processes involved and devoured as many of the sparse resources as I could find. I didn’t really start to get a mental grip on it until I started playing with code – specifically Luke Palmer’s infinite search code. This code doesn’t search the Cantor set but instead exhaustively searches the natural numbers, and it’s just a little easier to dissect mentally as a precursor to the Cantor set. I think I now know how it works, what its limitations are, and how to break it. My goal is that by reading this post, you’ll come away with the same feeling.

Disclaimer: I am very much a Haskell newbie and not a computer scientist by training. I just think this stuff is cool.

Representation of natural numbers

First, let’s cover the representation of the domain, the natural numbers. The representation of natural numbers is defined recursively from Zero:

data Nat = Zero | Succ Nat

That is, a natural number is either Zero, or it is the Succ (successor) of another natural number. So for instance 0 = Zero, 1 = Succ (Zero), 2 = Succ (Succ (Zero)) etc. While this may seem cumbersome at first, the key advantage to representing the domain this way is that it makes comparison operations much more powerful. Consider the comparison

Succ (Zero) /= Zero

which represents that 1 is not equal to 0, which is true. Now consider what happens if we reformulate the expression as

Succ (?) /= Zero

where ? can be Zero, Succ (Zero), etc. By examining only the heads of each side of the expression, we are no longer comparing 1 with 0; we are comparing every number greater than or equal to 1 with 0, and like before, the expression is still true. If we compare

Succ (?) /= Succ (?)

then by examining just the head, we can only claim that the two values being compared are possibly equal. We cannot even claim to know which values are being compared, except to say that neither is Zero. So in this case, we have to examine more than just the heads of each expression to determine equivalency. We in fact have to examine as much of the expression as necessary to find a Zero on either the left side, the right side, or both. If a Zero does not exist, such as in

infinity = Succ infinity

infinity == infinity

then equivalency can never be determined. If you try to execute this in GHCi, you’ll get an infinite loop.

These comparison advantages are the reason why Palmer says that in order to use these infinite search techniques, the domain must be representable as a recursive polynomial type, and why Escardo says that the predicate must be computable. These conditions allow us to construct seemingly infinite comparisons as long as one side of the comparison reduces to something finite eventually.

Testing equivalence

I implemented by own instance of Eq for Nat instead of using Haskell’s derived implementation. You can see it in the full code listing at the bottom if you’re curious. Let’s test out some comparisons:

*Main> Succ (Succ (Succ (Zero))) == Succ (Zero)
-- Comparing Succ == Succ --> Possibly equal
-- Comparing Succ == Zero --> Not equal
False

Here we are comparing 3 and 1. The salient point here is that we can determine that the two values are not equal by only comparing the first two elements of each representation. The same is true when comparing to infinity:

*Main> infinity == Succ (Zero)
-- Comparing Succ == Succ --> Possibly equal
-- Comparing Succ == Zero --> Not equal
False

Expanding the Num instance

Palmer’s Num instance includes the operations (+), (*), and fromInteger. The purpose of the first two is clear. fromInteger takes an integer literal and transforms it into our Nat representation so we can compare it with another Nat representation:

*Main> fromInteger 2::Nat
-- Transforming 2 into Nat form.
---- Returning Succ as element of the transformation
---- Returning Succ as element of the transformation
---- Returning Zero as element of the transformation
Succ (Succ Zero)
*Main> Succ (Zero) == fromInteger 2
-- Transforming 2 into Nat form.
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Zero == Succ --> Not equal
False

The operation (-), however, is not defined. We can define it by adding the following to the Num instance:

Zero   - y = Zero
Succ x - Zero = Succ x
Succ x - Succ y = (x - y)

This definition of (-) carries two caveats. First, Zero minus anything is Zero. Consistently, any n – (n + k), where k >= 0 is Zero. This can lead to unexpected behaviors like

Zero - 130 == Zero
-- Comparing Zero == Zero --> Equal
True

so we need to ensure we are using it appropriately given the properties we have defined for it.

The second caveat is that, unlike addition, subtraction between two Nats cannot be done lazily. With addition, we know that for any x and y

Succ x + y = Succ (x + y)

so we don’t need to know what x and y are in order to start returning the result because we know it begins with Succ. However, with subtraction we have

Succ x - Succ y = (x - y)

so we have no idea whether the result will start with Succ or Zero until we traverse x and y and find a Zero in one of them. Indeed, something like

infinity - infinity

will never return a result.

The search’ function

With the background out of the way, let’s examine the algorithm itself, starting with the search’ function. You’ll notice that my code throughout this post is slightly different than the Palmer code and littered with trace calls. These will be useful later when we start exploring examples.

search' fTop    | trace "Starting search.  Starting top level predicate call (TLPC)" fTop $ bestGuess = Just bestGuess
                | otherwise       = Nothing
        where
        bestGuess = trace "The TLPC has no defined x yet, so start building the best guess with lyingSearch" lyingSearch $ fTop

The important thing to note about search’ is that all it does is apply the predicate fTop to a single natural number, bestGuess. That’s all. In fact, the entire infinite search algorithm begins when this single application of fTop begins to be evaluated, and ends when fTop finishes its evaluation (the Maybe wrapping notwithstanding). I changed the name of the predicate in the search’ function to fTop to try to capture this special significance.

The lyingSearch function

The bestGuess that search’ applies fTop to is merely the result of the lyingSearch call made above. So what does lyingSearch do? It recursively builds the best-guess solution for satisfying fTop while fTop is being evaluated. Let’s examine the code.

lyingSearch :: (Nat -> Bool) -> Nat
lyingSearch fRec = if trace "Can we finish the best guess with a Zero and succeed?  Let's try it" (fRec Zero)
                    then trace "Yes, so end best guess search with Zero.  Next, complete TLPC" Zero
                    else trace "No, so continue best guess with Succ and do lyingSearch again" (Succ (lyingSearch (fRec . Succ)))

The input argument to lyingSearch, fRec, does two things:

Includes the predicate we are trying to satisfy
Records the progress of the best-guess solution

The record keeping is achieved by composing fRec with an additional Succ each time lyingSearch decides that the next element of the best-guess solution is Succ. Thus, when search’ first calls lyingSearch with fTop, the best-guess is empty because no Succ’s have yet to be tacked on the end.

Let’s go through lyingSearch step by step. Assume that fRec is fTop composed with zero or more Succ’s that represent the current state of the best guess. By applying fRec to Zero, we are testing whether or not the next element of the best guess should be Zero. If true, then the next element should indeed be zero because the predicate fTop will be satisfied. If false, then we just assume that the rest of the best guess must be Succ plus whatever is produced by another call to lyingSearch. Assuming this about the best guess does not guarantee that the predicate will be satisfied; we just know that it won’t be satisfied with Zero. In this way, lyingSearch could eventually return a result that does not satisfy the predicate (a lie).

Example: finding a solution

Let’s go through a few search examples and examine the traces that are produced step by step. I’ve included some additional annotations to the trace output in parentheses to add a little specificity where needed.

predFxn1 x = let lhs = x*x
                 rhs  = 4
             in trace ("-- Comparing LHS with RHS") (==) lhs rhs

*Main> search' predFxn1
Starting search. Starting top level predicate call (TLPC)
-- Comparing LHS with RHS (This is part of the TLPC)
The TLPC has no defined x yet, so start building the best guess with lyingSearch
Can we finish the best guess with a Zero and succeed? Let's try it
-- Comparing LHS with RHS (This begins the application of fRec to Zero)
-- Transforming 4 into Nat form.
---- Returning Succ as element of the transformation
-- Comparing Zero == Succ --> Not equal (This ends the application of fTop to Zero)
No, so continue best guess with Succ and do lyingSearch again
-- Transforming 4 into Nat form. (This is part of the TLPC)
---- Returning Succ as element of the transformation (The first element of the best guess is Succ!)
-- Comparing Succ == Succ --> Possibly equal (4 begins to be compared to (best guess)^2. Best guess so far is just Succ)
Can we finish the best guess with a Zero and succeed? Let's try it
-- Comparing LHS with RHS (This begins the application of fTop . Succ to Zero)
-- Transforming 4 into Nat form.
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Zero == Succ --> Not equal (This ends the application of fTop . Succ to Zero)
No, so continue best guess with Succ and do lyingSearch again
---- Returning Succ as element of the transformation (The second element of the best guess is Succ!)
-- Comparing Succ == Succ --> Possibly equal (Continues the comparison of (best guess)^2 to 4 in TLPC)
Can we finish the best guess with a Zero and succeed? Let's try it
-- Comparing LHS with RHS (This begins the application of fTop . Succ . Succ to Zero)
-- Transforming 4 into Nat form.
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Zero as element of the transformation
-- Comparing Zero == Zero --> Equal (This ends the application of fTop . Succ . Succ to Zero)
Yes, so end best guess search with Zero. Next, complete TLPC
---- Returning Succ as element of the transformation (Continue and complete comparison of (best guess)^2 to 4)
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Zero as element of the transformation
-- Comparing Zero == Zero --> Equal
Just (Succ (Succ Zero))

This is all well and good, but how does infinite search determine that there is no solution?

Example: finding no solution

No solution exists when we try to satisfy an impossible predicate. We saw in the last example that whenever we try to end the best guess with Zero but fail, we assume the right path forward is to use a Succ instead. If the predicate always returns false, then our best guess becomes an infinite chain of Succ’s. So then how does the algorithm know to give up the search?

predFxn2 x = let lhs = x*x
                 rhs  = 3
             in trace ("-- Comparing LHS with RHS") (==) lhs rhs

*Main> search' predFxn2
Starting search.  Starting top level predicate call (TLPC)
... Uninteresting sections of trace omitted here.  Best guess is now Succ (Succ (Zero)) ...
Can we finish the best guess with a Zero and succeed?  Let's try it
-- Comparing LHS with RHS
-- Transforming 3 into Nat form.
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Zero as element of the transformation
-- Comparing Succ == Zero --> Not equal (This is the first time (best guess)^2 overshoots 3)
No, so continue best guess with Succ and do lyingSearch again (Best guess is now Succ (Succ (Succ (Zero))))
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal (This is part of the TLPC)
Can we finish the best guess with a Zero and succeed?  Let's try it
-- Comparing LHS with RHS
-- Transforming 3 into Nat form.
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Succ as element of the transformation
-- Comparing Succ == Succ --> Possibly equal
---- Returning Zero as element of the transformation
-- Comparing Succ == Zero --> Not equal
No, so continue best guess with Succ and do lyingSearch again (Best guess is now Succ (Succ (Succ (Succ (Zero)))))
---- Returning Zero as element of the transformation (We've reached the end of 3's Nat representation in the TLPC)
-- Comparing Succ == Zero --> Not equal (This is the conclusion of the TLPC)
Nothing

What’s interesting here begins on line 14. The right hand side of the comparison, 3, is not changing, and we know the operation x*x results in a longer and longer series of Succs. With this information, you and I could conclude that this predicate will never be satisfied. The algorithm, however, doesn’t conclude that no solution can be found until the same comparison fails in the TLPC (line 31).

To summarize, the infinite search algorithm uses the predicate in two ways:

Deciding what the next element of the best guess should be
Deciding whether or not the best guess succeeds or fails

Usage 2 imposes a constraint on the structure of the predicates we can look to satisfy. Specifically, the algorithm can only declare that no solution exists if the predicate evaluates to False when given an infinite chain of Succs (i.e. infinity).

Breaking infinite search

With the constraint on the predicate in mind, let’s try to break infinite search. Here are a couple of simple predicates that work just fine, one with a solution and one without.

predFxn3 x = x*x == 6 - x
predFxn4 x = x*x == 7 - x
*Main> predFxn3 infinity
False
*Main> predFxn4 infinity
False

Now we know that for either of these predicates, search’ will give up the search if no solution exists.

*Main> search' predFxn3
Just (Succ (Succ Zero))
*Main> search' predFxn4
Nothing

Great. What about these modified predicates, again one with and one without a solution?

predFxn5 x = x*x == 6 + x
predFxn6 x = x*x == 7 + x

When evaluated with infinity, both of these predicates generate an infinite loop because, at the end of the day, we’re evaluating infinity == infinity. However, predFxn5

*Main> search' predFxn5
Just (Succ (Succ (Succ Zero)))

returns its solution, while search’ predFxn6 looks for one forever and never gives up. We can try to be clever and reformulate the predicate like this

predFxn7 x = x*x - x == 7

to eliminate infinity from one side, but all we’ve done is replace the non-terminating operator (==) with the non-terminating operator (-). Alas, predFxn7 will not work, either.

Other explorations

Hopefully I’ve been able to make infinite search a little less confounding, or at least this particular application of it. Here are some other questions to ponder for the curious mind. I’ll be pondering these myself.

Can Nat infinite search be modified to handle predFxn6 or predFxn7?
Can these techniques be used to perform epsilon-delta proofs on compact subsets of the reals?
Must all lists contain at least three items?

Code listing

{-- Playing with Haskell infinite search --}

import Debug.Trace

data Nat = Zero | Succ Nat
    deriving (Ord, Show)
infinity = Succ infinity   -- This line is interesting but not necessary

fromInteger' 0 = trace "---- Returning Zero as element of the transformation" Zero
fromInteger' n = trace "---- Returning Succ as element of the transformation" Succ (fromInteger' (n-1))

instance Num Nat where
    Zero   + y = y
    Succ x + y = Succ (x + y)

    Zero   - y = Zero
    Succ x - Zero = Succ x
    Succ x - Succ y = (x - y)

    Zero   * y = Zero
    Succ x * y = y + (x * y)
    fromInteger n = trace ("-- Transforming " ++ show n ++ " into Nat form.") $ x
                    where x = fromInteger' n

instance Eq Nat where
    Succ x == Zero = trace ("-- Comparing Succ == Zero --> Not equal") False
    Zero == Succ x = trace ("-- Comparing Zero == Succ --> Not equal") False
    Zero == Zero = trace ("-- Comparing Zero == Zero --> Equal") True
    Succ x == Succ y = trace ("-- Comparing Succ == Succ --> Possibly equal") (x == y)

-- lyingSearch f returns a Nat n such that f n, but if there is none, then
-- it returns a Nat anyway.
lyingSearch :: (Nat -> Bool) -> Nat
lyingSearch fRec = if trace "Can we finish the best guess with a Zero and succeed?  Let's try it" (fRec Zero)
                    then trace "Yes, so end best guess search with Zero.  Next, complete TLPC" Zero
                    else trace "No, so continue best guess with Succ and do lyingSearch again" (Succ (lyingSearch (fRec . Succ)))

search' fTop    | trace "Starting search.  Starting top level predicate call (TLPC)" fTop $ bestGuess = Just bestGuess
                | otherwise       = Nothing
        where
        bestGuess = trace "The TLPC has no defined x yet, so start building the best guess with lyingSearch" lyingSearch $ fTop

predFxn1 x = let lhs = x*x
                 rhs  = 4
             in trace ("-- Comparing LHS with RHS") (==) lhs rhs

predFxn2 x = let lhs = x*x
                 rhs  = 3
             in trace ("-- Comparing LHS with RHS") (==) lhs rhs

predFxn3 x = x*x == 6 - x
predFxn4 x = x*x == 7 - x
predFxn5 x = x*x == 6 + x
predFxn6 x = x*x == 7 + x
predFxn7 x = x*x - x == 7

(Continues the comparison of (best guess)^2 to 4 in TLPC)

Quick update – Games, Reality, and Haskell

Posted on July 28, 2011 by dan

It had been a little while since my last post, so I wanted to just jot down some stuff I’ve been doing lately.

First, in order to learn Haskell, I started working though Haskell’s 99 Questions, which is a series of short functions of increasing difficulty. I have my solutions hosted on my github repo.

The other thing I have going on is I’m reading Jane McGonigal’s Reality Is Broken. The book leverages a lot of research in positive psychology to argue that games (the book takes the slant of video games, but really any kind of unnecessary obstacle activity such as rock climbing) are “better” than reality. The basic argument of the book so far is this: if modern psychological research shows us that people are in their happiest and most activated state when playing games, then why must we endure our normal lives? Why can we not use the best aspects of games in igniting human happiness and incorporate them into our societies to improve the quality of life of everyone?

I’m less than 100 pages in so far, and I have mixed feelings. The book does a pretty good job of highlighting interesting findings in positive psychology, and the points made in that regard I feel are spot on. I would read a short paragraph and think to myself, “You know…you’re right. That *does* make my happy.” On the other hand, I think the book really stretches to apply the research findings to the field of video games. The author, being a game designer, clearly has a bias in this sense, but that also drives her passionate prose style. She spends a lot of time explaining the mechanics of certain video games like SMB 2 and Tetris, but then fails to really drive home her point that video games are the ideal medium in which to engineer human happiness. It’s almost like she thinks it is self-evident that games and positive psychology go hand in hand.

Again, I’m only a little ways in, so maybe I’m trying to jump ahead to points she makes (or doesn’t make) later. I just learned recently that my good friend Seth read this book and wrote a paper on it for a class, so I’m eager to chat with him about it.