Busybox silently failing at startup

I had a problem at work recently that drove me bonkers for a long time. The embedded linux product I am working on would start to boot up just fine but then freeze at “Freeing init memory.” For the longest time, I thought the device files for the serial connection were not configured properly, causing the remaining output to get lost. Turns out the problem was a lot more troublesome.

The real problem was that my uclibc dynamic linker .so library was not built/configured properly. I used the “generic” linux headers that were included in the (very old) legacy buildroot distribution for this product. (It’s so old in fact that I believe it predates the name “buildroot.”) The installation scripts said that the generic headers would be fine, but apparently they weren’t, even though the whole toolchain was built successfully. Because the headers were wrong, the ld library caused a silent segmentation fault as soon as it tried to run busybox init. I really learned my lesson that I need to build my toolchain with the exact same headers as the kernel!

Debugging this was a pain. First, I built a static executable to just print a message then sleep for a long time, and set this as my initial executable in the kernel boot arguments. I saw the message just fine, so the serial connection was working, and static inits appeared to work. The problem had to be elsewhere.

I then built busybox statically instead of dynamically and ran its init as the initial executable. That worked just fine, so something was wrong with the dynamic executable.

With the machine booting up with the static busybox exe, I could pull the dynamic exe onto the system via ftp and execute it manually to see what was going on. I also added a few printfs into the init_main busybox function to quickly see where the hang up was. I ran the binary and, ah ha, segmentation fault. But I didn’t get any print outputs to the screen, so the segfault was happening before entering int_main. Weird.

I found some debug compile options for uClibc that enabled dynamic linker debugging, rebuilt, and ran the dynamic busybox init manually using the new linker library. The failure occurred in what looked like a kernel system call. I believed my kernel was fine, so the problem had to be with the library. I chased down the includes and discovered the wrong headers were being used.

It sounds like a very methodical debugging session, but this problem kicked my butt for days. I hope this post prevents similar frustration for someone else.

Increase RabbitVCS Emblem Size

RabbitVCS is an integrated VCS GUI client similar to TortoiseSVN. Like TortoiseSVN, it shows emblems over folders indicating the folder’s status (modified, unmodified, unversioned, etc.). The emblems packed with RabbitVCS, however, are too tiny for me, especially since I prefer to use list view when using my file explorer. I searched and did trial-and-error for a long time to try to increase the size of those emblems. This is the best solution I could manage.

Instead of crawling through arcane nautilus/nemo visualization settings and theme configuration files trying to find the right setting to increase the emblem size, I recommend simply changing the emblem icon file themselves. In fact, rabbit’s emblem files have lots of empty space that we can get rid of. Each emblem generally takes up an unchangeable amount of space over a folder icon, but rabbitvcs designs their emblems specifically to only fill the bottom left corner of this space. The remaining space is empty. By cropping the emblems so there is no longer this empty space, the emblems will appear larger when rendered over icons.

For this, you’ll need to install Inkscape and have access to a command line. Execute these commands (modeled after the one here)

cd /usr/share/icons/hicolor/scalable/emblems
sudo inkscape --verb=FitCanvasToDrawing --verb=FileSave --verb=FileClose emblem-rabbitvcs*.svg

This will command Inkscape to open up each SVG emblem file beginning with ’emblem-rabbitvcs,’ crop the image to include only the visible parts, then save and close each file. When you’re done, you might need to log out/log in to see the newly modified icons take effect.

The /usr/share/icons/hicolor/scalable/emblems directory is where rabbitvcs installed the emblems on my system. I am using Linux Mint 17.3.

A look at ECS component approaches

For game implementation, an ECS architecture has a lot of things going for it. This approach puts all the components of a particular type (say, all the Transform components) together in memory by component type instead of by the parent entity. In addition to having the flexibility of a general component architecture, the ECS’s use of contiguous memory to store components and Systems to operate on subsets of those components enables high performance processing.

However, there’s a key challenge that many beginner ECS tutorials don’t address. In these explorations, each example of a System simply iterates along a single array or vector of homogeneous components. What possible kind of interesting behavior can be achieved by examining only one kind of Component at a time? Take even the basic task of movement. In a System processing movement, the calculations would probably go something like this.

  1. Examine the Movement component to get linear and rotational velocities.
  2. Calculate the position and rotation change based on time step size.
  3. Modify the Transform component by the calculated deltas.

This example doesn’t even consider relative position to other entities, which would complicate the matter even more. The issue is that single components alone aren’t enough – we need to be able to easily retrieve and operate on multiple component types that are associated with the same entity. There are several approaches to make this easier, and I consider three of them here.


For context, my interest in these approaches is with respect to usage in a platformer game similar in style to Mario or Megaman. I expect games of this type to have the following characteristics:

  • Players move through levels mostly linearly from beginning to end.
  • Most of the entities exist at the start gameplay.
  • Entities are deleted when destroyed by the player.
  • The player typically destroys enemies from beginning to end, so enemies are destroyed in a somewhat predictable sequence.
  • New enemies are spawned near the player’s location.

Based on the above, a reasonable ordering for the components in a contiguous block of memory is to have the entities encountered first be near the end of the memory block. To see why, let’s assume a vector. When an enemy is destroyed, its associated components get removed from each vector of components: its Movement component get removed from the Movement component vector, its Transform component gets removed from the Transform component vector, etc. Ideally, the removed components would be the very last elements of each vector so that, when removed, there is no shifting of elements to fill the space in memory left by the removed component. By ordering the entities (and by extension, their constituent components) such that the ones encountered by the player first are near the end of the vector, we can greatly reduced the amount of element shifting when entities are removed. Similarly, when entities are added to the game, they are added to the end of the vector to limit element shifting.

Approach 1: Index is entity ID

This is one of the approaches described at t-machine. In each component array, the index of the component corresponds to its associated entity. In other words, Movement[8] and Transform[8] are associated with the same entity. If an entity has no component of a particular type, then the bytes at the index for the component type are null.

I’m only going to touch on this briefly because I don’t like some of the difficulties this method raises.

  • While you save some memory by allowing the position of the component to serve as its identifier, you can lose a lot by having lots of null bytes in the component array to “fill in” the spaces for entities that don’t use components of that particular type.
  • You can’t re-use array positions without some tricks. For instance, let’s say that one of the component types stores a reference to a different entity (say, for a targeted missile). The Targeting component could then store the value 8 to mean that the entity represented by position 8 in each component array is the target. But, before the missile hits its target, entity 8 is destroyed or removed by other means. Can entity 8 be replaced by a new entity 8? Will the missile seek out the new entity 8 automatically, and is this the desired behavior? It requires a lot of bookkeeping to get right, which begins to look a lot like the Handle method (also explored below).
  • Some good things are that entity component lookup is O(1) and the contiguous components keeps the cache full. For a game with a big scope, this might matter, but the scope in which I’m interested is much smaller. Also a big win is that it’s easy to iterate along multiple component arrays simultaneously to do per-entity operations using multiple components.

Approach 2: Handles

Handles are a way to keep track of a resource even if it gets moved around, as explained here. Essentially, a handle works similarly to a component index, but instead of encoding the position of the component directly, it encodes the position of an entry in a lookup table that describes where the component is. This is really useful for a number of reasons.

  • There are no more voids in the component array wasting memory, but we are using additional memory for all the bookkeeping data.
  • It allows us to add, remove, shift, and swap components back and forth in the contiguous memory block without losing track. For example, to remove a component, we can swap it with the end component, pop the new end off, and update the lookup table accordingly.
  • Reusing space for a component is the same as reusing an entry in the lookup table. One advantage of handles is that they not only keep track of the position in the lookup table, but they also keep track of which version of the lookup position was used. This allows outdated handles to automatically become invalidated once the corresponding entry is reused for a different resource.

So at first glance, it looks like our problems with Approach 1 are solved! We actually lost something important, though. With Approach 1, a common index in each respective component array both identifies the location of the component’s data and associates the different components together to form an entity implicitly. Handles, however, have no relationship to each other, so while they can be used to find an individual component quickly, they don’t associate different components together.

We have to do the component association on our own. For simple cases, we might get away with one component directly holding the handle for another, but we’ll quickly get handle spaghetti with that approach. Instead, we could have a component reference a container object that holds handles to all components belonging to an entity. If one component needs the data from another, it first references the container, then it finds the handle of the needed component, and then it can get the needed data.

This approach isn’t terrible, but it doesn’t seem very clean to me. Take the Transform and Movement system mentioned earlier. For each Movement component, you must get the association object, then check that a handle to a Transform component exists, then get and modify that data accordingly. It’s very possible that all these lookups and bouncing around in memory might blow the cache performance. It also subtly introduces a pseudo-singleton pattern by requiring a single, almost-global set of component association objects.

Approach 3: Flat map

This approach returns to the simplicity of Approach 1. Instead of association of components implicitly through their position in the array, it is done explicitly using a matching array of IDs. Thus, each collection of components is a vector of IDs and a vector of components of matching length. I like this approach a lot.

  • Deletion of components for existing entities or adding new components to existing entities could happen anywhere in vector and cause elements shifts, but the linear progression of the levels means it will mostly be happening near the end and be rare compared to other calculations.
  • When looking for an individual component, we search in reverse from the end linearly for the matching ID. We can abort the search early if the iteration encounters IDs lower than the desired one. As a future enhancement, we could even save the position of the previous lookup and then start the next linear search from that position, with the expectation of increasing average lookup speed.
  • Iterating across two component lists is easy because it’s essentially performing an operation on the intersection of two sorted arrays. We can iterate across each linearly in an alternating pattern to find the common elements and then apply the desired operation.
  • There’s no external object associating components together. It’s just an ID.
  • One downside is that the onus of generating unique IDs for each entity is now on the programmer. Typically, we want each new entity to have an ID higher than any other entity so that the new components get appended near the end of the vector. This is easily handled with a counter as long as possible rollover is taken into account.

It’s lightweight and flexible. The real downsides are the linear time complexity for searching as well as the shifting of elements when deleting, but I think the expected access pattern will cause those negatives to be negligible.

Comparison of handles and flat map

I implemented Approach 2 and Approach 3 and timed how long it took each to do the same operation. These aren’t scientifically controlled tests by any stretch. I simply implemented the two approaches in a straightforward, obvious way, and did the same for the tests. I call each collection of components an object_manager. Each implementation was used independently of the other in different git branches, so I don’t think there was any coupling between the tests.

Test 1: Add/remove components

In this test, the object manager was populated with 4000 simple components (which were just ints). Then I timed the amount of time it took to:

  1. Add new component 1 at the end.
  2. Add new component 2 at the end.
  3. Remove the previous end component (now third to end).
  4. Remove new component 1.
  5. Remove new component 2.
  6. Repeat this test until component collection is empty.

This insertion and access pattern is supposed to simulate an action platformer. New components 1 and 2 might belong to two new projectiles launched by the player. The first hits an enemy, so the enemy and projectile are removed. The second projectile misses and is removed once out of bounds.

When using the handles methods, my machine did this test in about 0.0002 seconds for 4k elements. Using the flat map method, the time was a greater 0.0002 to 0.0003 seconds.

Test 2: Dual iteration

In this test, two object managers were used. One had 4000 components (doubles) for IDs 0 to 3999, and another had 2000 components (ints) for every odd ID. For the handle case, the components also had the requisite handle pointing to a game object, which was just a struct with handles to both component types. The test was to update the double of C1 with 1.5 times the value of C2’s int for all entities that had both components (i.e., every odd numbered entity) and skip those that did not meet this criterion.

Using the handles method, my machine completed this in about 4e-5 seconds. The flat map method, however, did it in only 1e-5 seconds! The discrepancy is likely caused by all the checking and auxiliary structures needed in the handles method. Each iteration requires following the handle to the game object container, checking whether the matching component exists, then following the handle to the matching component. The flat map just iterates along two sorted ID vectors looking for common elements.

So which one is best?

It’s hard to say which one of the two methods I tested is best. The flat map method is much faster when iterating along component arrays, which is what most of the work is in the game loop. The handle method is a bit faster when creating and deleting components, but that’s a relatively rare occurrence in the game loop. In games, though, the speed of processing doesn’t really matter as long as it’s being done fast enough to render the game at a smooth 60 fps in both normal and exceptional circumstances. In terms of easy of programming use and decoupling, the flat map is my clear favorite.

Automatic array of struct to struct of array transformation

It’s been quite a while since I’ve posted. A lot happened in the mean time. I changed web hosts. I spent a few months helping a nascent startup. I learned I’m going to have a son.

But this post isn’t about any of that. Instead, it’s about my recent interest in data-oriented design, such as when implementing component-based game engines. Here’s the gist:

  • Accessing data stored in RAM is very expensive compared to how fast the processor can crunch it.
  • Instead of accessing only the tiny bit of data needed for a particular operation, the CPU pulls in a big chunk (usually 64 bytes) and saves it in its own small but fast-access storage.
  • If the program needs data that is already cached, the CPU can access it easily (a cache hit) instead of needing to slowly retrieve it out of RAM (a cache miss).
  • Given the above, your program can experience possibly huge gains in speed by organizing its data in a way maximize cache hits.

One way that programmers can take advantage of this is to keep object collections in contiguous memory. The data of the collection can be arranged in the block in multiple different ways, however. One is the array of struct in which all the data elements of a particular object are grouped together in memory. An alternative is the struct of arrays in which all the values of a particular field are grouped together in memory. Depending on access patterns of the data in the collection, one method of organization will provide more cache hits than the other, and thus be faster even for the same process.

A downside of the struct of arrays pattern is that it’s a little inconvenient to have the members of a particular object spread out in different memory locations. Managing the different field arrays can get unwieldy. I wanted to know if there was a way to use C++ and templates to somehow generate and handle all the array juggling for me.

This was the closest I found to a solution. It uses a lot of template trickery and macros to handle everything. I wanted to avoid the need for macros, but I believe it is not possible without liberal use of tuples. The reason is that the field names of the objects cannot be specified by templates. If you are happy to settle with each “object” being a tuple and accessing fields using syntax similar to get<1>(obj), then it should be possible. But the only way to specify convenient field names is to actually write them out, and that calls for macros.

Giving up on NixOS (for now)

When I wrote a blog series about setting up NixOS for Haskell development, I was deep within cabal hell trying to get different Haskell projects with differing dependencies to build and install reliably. NixOS, with it’s rapid path reconfiguration and nix-shell environments saved the day. The cost, however, was the relatively minor headache of needing to understand cabal2nix to automatically build packages from Hackage.

Since then, though, my development preferences have shifted. My language of choice (for the time being) is Purescript, a new and shifting language with new features added each compiler release. As such, being able to easily grab the newest compiler version without jumping through a lot of nix expression hoops is essential. Also, each of my purescript projects uses bower, so my projects are already self-contained within their own directories without needing to use nix-shell. In that light, NixOS doesn’t have much to offer my development flow.

So I’m moving away from my NixOS VM for the time being and switching to Lubuntu. So far, I’m not missing NixOS. With only a handful of actual applications on my dev machine, dependency hell is less likely without needing to learn the arcane nix language. The help resources for Ubuntu distributions are also much more numerous than those for NixOS, so in many cases, problems can be solved fast. If I ever need better build isolation than current workflow requires, I might give it another shot.

Purescript Puzzler 2 – an experiment in FP architecture

Previously, I wrote about Purescript Puzzler, a toy game I wrote to see how the MVI architecture works in practice. A few things kind of bothered me about my formulation of the architecture, some of which is due to the MVI philosophy generally and probably some due to my inexperience with it. A few notable points:

  1. The model contains all state, even GUI state. This feels like a violation of separation of concerns. Ideally, I should be able keep my model and replace the GUI with a new one if I wanted, with minimal code changes.
  2. The intent is very thin. I still don’t understand its purpose really. It seems like the goal is to keep the Model distinct from the View by having the Intent work as a translator between the two, but if the Model needs to store GUI state anyway, then there’s not much point.
  3. Messages are sent between components using an ADT on a single channel between each component pair. Since channels can only accommodate a single type, my ADT needs to encompass every type of action communicated between components. Purescript Puzzler, a minimal application by any reasonable assessment, had 7 actions in its ADT. I can’t imagine how many would be in a decently sized one. The explosion of data constructors in the ADT is also not very compositional; what if I want to build my app out of smaller logical units? Now I need an ADT to wrap the ADTs for each component so I can put them all on the same channel.
  4. I felt like there was too much display logic in the GUI. I usually want the GUI to consist of reusable components that are configurable externally, and that can’t happen with so much logic in the View. This is my own fault, though. I could have put the display logic in the Model, but that would have coupled the Model even more to the View than it already was.

So I set out to update Purescript Puzzler with a new architecture with the following goals:

  1. The Model consists of the domain model and nothing more. Specifically, if the user were to save their game during play, the Model consist only of the data that should persist between play sessions.
  2. The View should consist of reusable components whose rendering and display logic is controlled externally.
  3. To reduce the possible “explosion” of ADTs as the application grows, communicate functions over the channel instead of ADT messages.
  4. As an experiment, have the GUI be ‘stateless”. Each GUI is configured with actions that it can perform, and state required to perform those actions is carried in closures.

Overall, I think points 1-3 were pretty successful, but point 4 became a big headache. In the rearchitected version, I renamed the Intent to Controller since it was no longer following the MVI philosophy. Here’s a link again to the playable, original Purescript Puzzler.

Limiting the Model

I don’t like the idea of the Model containing GUI state. To me, the model should represent the “save file” (meaning persistent state) for the application and the operations that can be used to query the model or update the model to a new, valid state. In that sense, model operations should be atomic. I used this example in my previous Purescript Puzzler post.

$ move west 50 yards

This is an atomic action that transitions the model from one valid state to a new valid state. However, in a GUI you might need to select several buttons in sequence to stage the command you want to execute before passing that data to a model function to perform the transition. For example, select Move, select West, enter 50 yards, then submit. The state of the application at the end of each of those actions includes the Model state (which has not changed since the action hasn’t been submitted yet) as well as the transient GUI state. It was this transient GUI state I wanted to keep out of the Model. As a result, previous model items like selectedPiece and dropTarget were removed because those were related solely to staging GUI actions (i.e. selecting the piece, then hovering over the board to see where to place it). The result is that my Model only contained clean, atomic actions.

(In retrospect, I think a better approach would be to utilize extensible records to have a Model that operates on only Model fields and a GUI that operates only on GUI fields. Logically, they are distinct, but can populate the same record to be a single source of truth.)

Using configurable View components

In the original Purescript Puzzler, I had code in the View such as this:

case gs.victory of
  Nothing -> vtext "Purescript Puzzler!"
  Just true -> vtext "You win!!!!!!"
  Just false -> vtext "You looooose.... :'("

This code is entirely dependent on the Model being displayed, so the View component can’t be re-used in any other context. Instead, we can pass the behavior logic for the component in as an input, then choose new behaviors as the situation requires. I called this set of behaviors the Spec, which seems to be fairly common terminology for this kind of pattern. For instance, here is the structure for the GridViewSpec.

newtype GridViewSpec = GridViewSpec
  { id :: String
  , className :: Maybe String
  , gridSize :: { r :: Number, c :: Number }
  , click :: Callback
  , squareClass :: Number -> Number -> Maybe String
  , squareFill :: Number -> Number -> Maybe String
  , enterSquare :: Number -> Number -> Callback
  , exitSquare :: Number -> Number -> Callback
  , clickSquare :: Number -> Number -> Callback
  , dblClickSquare :: Number -> Number -> Callback

The Spec includes all actions the component can take, including actions to query for required data (like id) as well as callback actions invoked as a result of user input. In this case, the GridViewSpec is used to draw a rectangular grid of squares. In Purescript Puzzler 2, GridViewSpec is used to draw both the puzzle board as well as each selectable puzzle piece, with customized display logic handling each case.

This pattern largely worked well except for a few challenges. One is that I did not define a default Spec for each component type, so I had to completely define each one instead of redefining parts of the default. This led to pretty verbose code that in most cases merely specified that the functionality wasn’t used. Along similar lines, a Spec needs to strike a balance between being versatile enough to be useful in many situations, but not so general that typical functionality is difficult to define. Finding that balance can sometimes be difficult.

Another challenge was implementing components that contain other components, such as the area that contains all the selectable puzzle pieces. These containers should generically contain a list of displayable components, so to that end I created a Display type class with a single function display that takes an input (the Spec) and produces a VTree output. The component container could then contain a list of Specs with Display instances, or a list of raw VTrees, using the instance below.

class Display a where
display :: a -> VTree

instance vtreeDisplay :: Display VTree where
display = id

One side effect of Specs needing instances for Display is that all my Spec types needed to be newtypes instead of type synonyms. This posed some additional annoyances I’ll get to later, but otherwise the ComponentsContainerViewSpec worked pretty well in terms of displaying a homogenous collection of items. Sometimes, however, it was difficult to update the behavior of a single component in the collection. Also, a useful addition would be some kind of existential wrapper of each component so the container could display a heterogeneous collection of items.

Sending functions over the channel

The most successful development in Purescript Puzzler 2 compared to the original is communicating functions over channels instead of ADTs that represent actions.

Using ADTs, the space of possible actions can grow and grow as the application grows. Additionally, if you want to composed different Models together in a new Model’, you need to declare a new ADT just to wrap the ADTs of each component so that you can put them on the same channel. When received by Model’, the actions are unwrapped and routed to each component for execution and state update. Let’s look at the updateGame function for the original Purescript Puzzler, which has a monolithic Model.

updateGame (TogglePiece p) s = ...
updateGame (TargetDrop r c) s = ...
updateGame Hint s = ...

Notice the pattern? Each has type GameAction -> GameState -> GameState. So why do we need an ADT anyway? What if, instead, we did:

togglePiece p s = ...
targetDrop r c s = ...
hint s = ...

When curried with the data previously in the ADT, each of these functions has the same type GameState -> GameState, which takes the state and updates it. As such, we can send any of them on the channel to update the Model. And now, instead of requiring a dummy Init data constructor to initialize the channel, we can use simple, polymorphic id. We can even build new functions on the fly to update the game state, so the space of possible actions is limitless without needing to expand an ADT in parallel.

Another benefit to communicating between modules with update functions is that composing different Models together becomes really easy; simply compose the multiple constituent models together into a Model’ and any function from Model' -> Model' can update any subset of models, check relationships between them, etc.

To unlock this compositional advantage in a convenient way, however, we need to use lenses. One of the great things about lenses is that they allow easy composition of multiple, individual updater functions. For instance, here is an abridged lens from Purescript Puzzler 2 that updates the behaviors of several different visual components at once.

-- first change how the pieces are drawn so border is shown
(_PuzzlerViewSpec..pieces.._ComponentsContainerViewSpec..components .~ ...
.. -- Now change hover behavior for board
(_PuzzlerViewSpec..board.._GridViewSpec..enterSquare .~ ...
.. -- Now change click behavior of board
(_PuzzlerViewSpec..board.._GridViewSpec..clickSquare .~ ...

One annoying thing with using lenses with newtypes is how verbose the lenses become because a lens is required to unwrap the newtype. This is kind of silly since a newtype only has one component that a lens can focus on anyway, so there’s not really any other options. I think a useful solution to this would be to define a type class that defines a newtype wrapper/unwrapper lens. Then use the function (...) (three dots instead of two) to handle the lookup automatically. The above would then become:

-- first change how the pieces are drawn so border is shown
(...pieces...components .~ ...
.. -- Now change hover behavior for board
(...board...enterSquare .~ ...
.. -- Now change click behavior of board
(...board...clickSquare .~ ...

Note that the trailing ... after each set operation .~ are indicating removed code in the abridged version, not invocations of the (...) function.

Using lenses in practice was a big step in my FP skill development, and was pretty easy after deciphering how they work. Their use above still has room for improvement, though.

Carrying GUI state in closures

If sending functions through the channel was the big success with this experiment, carrying GUI state through closures was the big failure.

My original thinking on this issue was to think of GUIs as having no persistent state. Instead, they have a set of actions they can do in response to user input. Each action results in a new set of actions the GUI can perform. At no point does the GUI have data about the “state”, only the set of things it can do.

So, given the above assumptions, how can the new set of GUI actions depend on the previous actions? It can if the actions themselves carry their needed information, either by closure or by currying.

One of the problems with actions returning new actions, however, is that the callbacks for an action need to set the callbacks of the action it returns, and those callbacks need to set the callbacks of the actions they return. It’s callbacks all the way down. Generally, though, the callbacks only need to be defined until you reach some kind of “baseline” state that results when the Model performs an update action. But these baseline states might be a slight modification of another baseline state, which is what you’re trying to define in the first place!

To see what I mean, take a look at this excerpt from my code.

pieceSpec mSel p = GridViewSpec
    { id: ""
    , className: if (mSel == Just p) then Just "piece selected" else Just "piece"
    , gridSize: { r: rows p, c:cols p }

    , click: callback $ const $ send chan $ defer \_ -> 
        let newSelection = case mSel of
              Just sel | sel == p -> Nothing -- unselecting currently selected piece
              _ -> Just p -- new selection

        in  if isNothing newSelection
            -- if no piece is selected, return to base spec
            then \_ -> controller chan gs
            -- else, modify the behavior of the pieces area and board to
            -- highlight the selected piece and enable drop preview
            -- first change how the pieces are drawn so border is shown
            (_PuzzlerViewSpec..pieces.._ComponentsContainerViewSpec..components .~
              A.map (pieceSpec newSelection) ps

Look at the first line and the last line of the code. The specification of actions for the selectable piece pieceSpec defines the resulting set of available actions from a click event as a function of a new pieceSpec using the newly selected piece! This recursive definition would be right at home in Haskell, but the strict evaluation of semantics of Purescript won’t fly here. I had to “lazify” this code using the purescript-lazy package to defer its evaluation until it is actually called by the onClick event. This was my first clue that this approach wasn’t going to work too well.

Another problem with carrying GUI state in the action closures is that there is no longer one source of truth. For instance, if you pass the GameState to two different closures, there are two different version of truth, so updating one closure with new GameState will not automatically update the other. This was an issue in Purescript Puzzler 2 when the user has already selected a piece and wants to remove another from the board before placing the selected piece. The selection action binds the current game state to the closure when a piece is selected so the user can see what placements are legal and which are not. However, if the game state changes before the piece is placed (like by removing some other piece from the board), the old game state is still bound to the placement validity check, so valid moves might be shown as invalid. Really, it became kind of a mess, and in some cases I just left features from the original Purescript Puzzler unimplemented to avoid the hassle.

Yet another problem with using closures to record GUI state is that it’s never clear which objects are included in the closure and which aren’t. Is it possible that some objects are never garbage collected because they are always included in a closure in one way or another? I think it’s definitely possible. Toward that end, it’s much harder to inspect the environment included in a closure than it is to just inspect the state of a record in memory, so debugging is more difficult.

Next steps

Purescript Puzzler 2 was a success in that it informed what techniques in FP architecture might work and which won’t. If there’s ever a Purescript Puzzler 3, I’d include the following attributes:

  1. The Model can indeed include all domain model state as well as all GUI state (unless GUI state is stored in GUI components like in React), but extensible records should be use to logically separate the two.
  2. Communicate updater functions through the channel instead of using ADT messages. This was a nice improvement.
  3. Use lenses to define update relationships between components, but find some way to ameliorate the pain of newtype unwrapping.
  4. Try out a GUI framework like React. Virtual-dom works well for a from-scratch approach to a GUI, but I know of zero libraries of reusable virtual-dom components, whereas there are several React component libraries available.

That’s all, folks. Happy Valentine’s Day!

How functional programming lenses work

Lenses can be a powerful component of a functional programmer’s toolbox. This post is not about their power or how you can use them. There are plenty of decent tutorials for that, so if you want to know what the lens fuss is about, I encourage you to check them out.

Instead, I will try to explain the magic behind how lenses work. After scrutinizing the types and code for a while, I finally had my burrito moment, so hopefully this explanation helps someone else out. I’ll be explaining the implementation behind the purescript-lens library, which is modeled after the Haskell lens library. Other implementations might be out there, too, but this is the one I’m focusing on. If you’ve ever looked at this implementation and wondered “How the hell do you get a getter and setter out of that?!”, then this post is meant for you.

The gist: a lens is a function that executes a series of steps to get an intermediate result, performs an operation on the intermediate result, then follows the series of steps backwards to build up a new result. The manner in which the new result is built up is controlled by the functor used by the operation.

Lens type explained

A lens is a function with the following type:

type Lens s t a b = forall f. (Functor f) => (a -> f b) -> s -> f t

Looking at this type, it’s a little difficult to see what is going on. Essentially, a lens uses an operation a -> f b to convert data of type s into data of type f t. I call a -> f b the focus conversion and s -> f t the lens conversion.

Let’s start with the types s and a. s is the parent object that is being inspected, and a is the part of the parent the lens puts focus on. For example:

type Foo = 
    { foo :: {bar :: Number} 
    , baz :: Boolean }

bazL :: Lens Foo Foo Boolean Boolean

The bazL type definition using Foo Foo instead of two different types simply means that the lens conversion does not transform inputs of type Foo into some unrelated type, excepting the functor f. Indeed, a Foo input will become an f Foo output of the lens conversion. Similarly, Boolean Boolean indicates that the focus conversion does not change the type of of the focus (again, excepting the functor f) – Boolean becomes f Boolean .

This is the common pattern used to define a lens that works as both a getter and a setter. It’s so common that most lens libraries define a type synonym for it type LensP s a = Lens s s a a.

More than just a getter and setter

Getters and setters are common use cases for lenses, but really lenses perform a complete transformation across data. This is why there are four type variables instead of just two. Let’s look at a more general type for the bazL lens that allows it to operate on any record with the baz field and transform any baz type a into f b.

bazL :: forall a b r. Lens {baz :: a | r} {baz :: b | r} a b
bazL = lens (\o -> o.baz) (\o x -> o{baz = x})

This type indicates that the lens conversion is from a record with a baz field of any type to a record of a baz field with a possibly different type (wrapped in a functor). The focus conversion goes from a type a to a type f b. The lens function above is a utility function that builds a lens that can be used as both a getter and a setter from a getter function and a setter function.

Is it possible for a lens to have type Lens a a b c, meaning the focus conversion performs a type change but the lens conversion does not? I’m actually not sure, but I think yes. Such a lens could be composed with Lens b c d d to get Lens a a d d, which is the familiar type we already know. In fact, lens composition is just normal function composition because lenses are just regular functions!

Behind the curtain

What’s going on behind the scenes when we compose lenses? Let’s look at our Foo type again, along with a couple new lenses and the lens help function.

type Foo = 
    { foo :: {bar :: Number} 
    , baz :: Boolean }

fooL :: forall a b r. Lens {foo :: a | r} {foo :: b | r} a b 
fooL = lens (\o -> o.foo) (\o x -> o{foo = x})

barL :: forall a b r. Lens {bar :: a | r} {bar :: b | r} a b 
barL = lens (\o -> o.bar) (\o x -> o{bar = x})

lens :: forall s t a b. (s -> a) -> (s -> b -> t) -> Lens s t a b 
lens s2a s2b2t a2fb s = s2b2t s <$> a2fb (s2a s)

When we use lens to build fooL from a getter and a setter, here’s what the result looks like, with types specialized for our Foo data type instead of the more general extensible record types.

fooL :: ({bar :: Number} -> f {bar :: Number}) -> Foo -> f Foo 
fooL focusCon parent = 
    (\o x -> o{foo = x}) parent <$> focusCon ((\o -> o.foo) parent) 
-- which we can simplify to 
fooL focusCon parent = \x -> parent{foo = x} <$> focusCon (parent.foo)

Our resulting lens applies the focusCon function to parent.foo, then incorporates the result back into the parent structure using <$> applied for the functor.

The resulting fooL lens is just function, so we can compose it with other functions.

(..) = (<<<) -- normal function composition
(..) :: (b -> c) -> (a -> b) -> a -> c
(..) f g a = f (g a)
-- or alternatively
(..) f g = \a -> f (g a)

When we compose lenses fooL..barL, we have:

fooL..barL =
\fc -> fooL (barL fc) =
\fc parent -> \x -> parent{foo = x} <$> (barL fc) (parent.foo) =
\fc parent -> \x -> parent{foo = x} <$> 
    (\parent2 -> \y -> parent2{bar = y} <$> fc parent2.bar) 

-- and now we can reason the following

fc :: Number -> f Number -- fc is the focus conversion function
parent :: Foo
fooL..barL :: (Number -> f Number) -> Foo -> f Foo
fooL..barL :: Lens Foo Foo Number Number

That’s amazing! Lenses compose using normal function composition.

Take a second look at the function pipeline above and assume that the focus conversion fc has been defined already. First, parent.foo is piped in as parent2. fc is applied to parent2.bar to give a result of type f Number. The result is mapped over using <$> to apply \y -> parent2{bar = y} to get a result of f {bar :: Number}. This is then mapped over again using <$> to get a result of type f Foo, which is the output of the lens conversion.

This is the essence of a lens; the composed lens looks deep within the object, focuses on a value of a certain type, does something with that type, then recurses back up the object, “combining” the results together using <$>.

That functor magic

One thing we haven’t looked at yet is the type variable f, the functor used by the focus conversion function. Not only is the functor used in the focus conversion function, but it is also used every time the lens recurses up one layer of the object and combines the results together. The functor, therefore, controls how the new result is built out intermediate results. The choice of functor is what determines the behavior of the lens.

Let’s take our fooL..barL example again.

fooL..barL =
\fc parent -> \x -> parent{foo = x} <$> 
    (\parent2 -> \y -> parent2{bar = y} <$> fc parent2.bar) 

Let’s pick the Const functor for f, which has the following implementation (<$>) fx (Const x) = Const x. In other words, no matter what function fx is applied with <$>, a Const x value will never change. Let’s see what happens when we simplify our lens using Const as the focus function.

fooL..barL Const =
\parent -> \x -> parent{foo = x} <$> 
    (\parent2 -> \y -> parent2{bar = y} <$> Const parent2.bar)
    parent.foo =
\parent -> \x -> parent{foo = x} <$> 
    (\parent2 -> \y -> Const parent2.bar) 
    parent.foo =
\parent -> \x -> parent{foo = x} <$> Const parent.foo.bar =
\parent -> Const parent.foo.bar

When using the Const functor, the lens is a getter! Can you guess what happens when the Identity functor is used? For reference, (<$>) fx (Identity x) = Identity (fx x), which is just normal function application after unwrapping the Identity constructor.

fooL..barL Identity =
\parent -> \x -> parent{foo = x} <$> 
    (\parent2 -> \y -> parent2{bar = y} <$> Identity parent2.bar) 
    parent.foo =
\parent -> \x -> parent{foo = x} <$> 
    (\parent2 -> Identity (\y -> parent2{bar = y} $ parent2.bar)) 
    parent.foo =
\parent -> \x -> parent{foo = x} <$> 
    (\parent2 -> Identity (parent2{bar = parent2.bar})) 
    parent.foo =
\parent -> \x -> parent{foo = x} <$> 
    Identity (parent.foo{bar = parent.foo.bar}) =
\parent -> Identity (\x -> parent{foo = x} $ 
    parent.foo{bar = parent.foo.bar})) =
\parent -> Identity $ parent{foo = parent.foo{bar = parent.foo.bar}} =
\parent -> Identity parent

Under the Identity functor, the lens take a Foo and returns an equal Foo by building one from the ground up.

So what if, instead of using just Identity as the focus conversion, we first run the focus through a function? I.E. the focus function is \val -> Identity (fun val) for some fun? I’ll spare the derivation and jump to the end: we get back the parent we put in except the focus will have the value returned from fun. Set fun = const newVal and you have a setter!

As a little aside, I can say personally that understanding how drastically the functor changes the behavior of the lens was a major hurdle for me. So many lens tutorials say that lenses “combine a getter and a setter” or something along those lines, and then I’d check the implementation and see no obvious getter or setter and wonder, “What kind of devilry is this?!”

If the functor controls the rebuilding done at the end of the lens conversion, then how is the functor selected? They are built into the implementations of view, set, and any of the numerous operators that can be used to run a lens conversion.

The gist, again

Lenses are tools for ripping something into constituent pieces, doing something with the final piece, then recombining all the pieces together into something new. At least that’s how the person behind this keyboard thinks about them now. There’s probably an even more general interpretation that describes all kinds of wonky things lenses can do beyond the usual getter, setter, identity, and similar behaviors, but this covers the most common uses pretty well. I’ll save that more general discover for another day after my cognitive burrito appetite returns.