Parsing strategy


Since ipuz is a very loosely defined spec, we take a similarly tolerant approach to parsing ipuz files. We try to map GObject properties to fields in the puzzle whenever possible. However, that’s a really bad fit for the board: we want a single data structure for it, but multiple fields in the file describe it.

Since json-glib’s JsonSerializable interface doesn’t let us map multiple nodes to a single property (such as mapping "puzzle" and "solution" to a board property), we have to do custom parsing.

When we call ipuz_puzzle_new_*, we build the JSON DOM and then:

  1. Confirm the ipuz file is a version of ipuz that we can handle (.v2)

  2. Read the kind field and create a sub-class appropriately.

  3. Go through all members of the toplevel object in the file, and call :load_node on each one:

    1. The default handler should handle setting properties with the same name from scalar JSON nodes (booleans, strings, ints, etc.).

    2. For more complex types, the :load_node handler should handle them manually.

    3. Be sure to chain up to the super type!

  4. We then call :post_load_node to catch any types that need loading after the rest of the members have been loaded. As an example, both "solution" and "puzzle" need a lot of previously parsed values (dimensions, block, etc.).

    1. There is no need to chain up on post_load_node.

  5. Next, there’s a fixup stage. This covers work such as calculating enumerations or cell areas.

  6. Lastly, there’s a validate stage that makes sure the puzzle makes sense. It catches nonsensical puzzles, such as ones without a grid or clues.
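The six stages above can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the libipuz implementation: `Node`, `Puzzle`, `PuzzleClass`, and `puzzle_load` are hypothetical stand-ins for the json-glib nodes and GObject classes, and the version and kind strings follow the URI style used by the ipuz spec. The point it shows is that "puzzle" can appear before "dimensions" in the file and still load, because it is deferred to the post-load pass.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are libipuz or json-glib types. */
typedef enum { NODE_SCALAR, NODE_COMPLEX } NodeKind;

typedef struct {
  const char *name;  /* member name in the toplevel object */
  NodeKind    kind;
  const char *str;   /* scalar payload, or flattened rows for "puzzle" */
  int         w, h;  /* only used by the "dimensions" node */
} Node;

typedef struct Puzzle Puzzle;

typedef struct {
  void (*load_node)      (Puzzle *self, const Node *node);
  void (*post_load_node) (Puzzle *self, const Node *node);
  void (*fixup)          (Puzzle *self);
  bool (*validate)       (const Puzzle *self);
} PuzzleClass;

struct Puzzle {
  const PuzzleClass *klass;
  const char *title;   /* scalar property, set by the base handler */
  int width, height;   /* complex node, set by the subclass handler */
  const char *board;   /* set in the post-load pass */
  int n_cells;         /* derived value, set in fixup */
};

/* Default handler: scalar members map onto same-named properties.
 * Anything it does not recognise is silently ignored. */
static void base_load_node (Puzzle *self, const Node *node)
{
  if (node->kind == NODE_SCALAR && strcmp (node->name, "title") == 0)
    self->title = node->str;
}

static void crossword_load_node (Puzzle *self, const Node *node)
{
  if (strcmp (node->name, "dimensions") == 0)
    {
      self->width = node->w;
      self->height = node->h;
      return;
    }
  base_load_node (self, node);  /* chain up to the parent handler */
}

/* "puzzle" needs the dimensions, so it is only handled in this pass. */
static void crossword_post_load_node (Puzzle *self, const Node *node)
{
  if (strcmp (node->name, "puzzle") == 0 && self->width > 0 && self->height > 0)
    self->board = node->str;
}

static void crossword_fixup (Puzzle *self)
{
  self->n_cells = self->width * self->height;
}

static bool crossword_validate (const Puzzle *self)
{
  return self->board != NULL && strlen (self->board) == (size_t) self->n_cells;
}

static const PuzzleClass crossword_class = {
  crossword_load_node, crossword_post_load_node,
  crossword_fixup, crossword_validate
};

static bool puzzle_load (Puzzle *p, const char *version, const char *kind,
                         const Node *nodes, size_t n)
{
  if (strcmp (version, "http://ipuz.org/v2") != 0)  /* 1. version check */
    return false;
  if (strstr (kind, "crossword") == NULL)           /* 2. pick the subclass */
    return false;
  p->klass = &crossword_class;
  for (size_t i = 0; i < n; i++)                    /* 3. load_node */
    p->klass->load_node (p, &nodes[i]);
  for (size_t i = 0; i < n; i++)                    /* 4. post_load_node */
    p->klass->post_load_node (p, &nodes[i]);
  p->klass->fixup (p);                              /* 5. fixup */
  return p->klass->validate (p);                    /* 6. validate */
}
```

Running two passes over the same member list keeps the handlers independent of member order, at the cost of visiting each member twice.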


We have two types of internal methods with similar names:

  • _fixup() methods enforce internal code requirements and assumptions. An example is making sure every cell has a link back to its associated clues.

  • _fix() methods enforce puzzle requirements and assumptions. An example is making sure every crossword has its numbering in sequential order.

These two can overlap at times, but perform very different actions.
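To make the distinction concrete, here is a small sketch with one function of each kind on a fixed 3x3 grid. Everything in it is hypothetical (`Cell`, `grid_fix_numbering`, `grid_fixup_across_links`), and it assumes the common crossword convention that a cell is numbered when it starts an across or down entry. The _fix-style function rewrites puzzle data (the visible numbers); the _fixup-style function only rebuilds an internal back-link from each cell to its across clue.

```c
#include <stdbool.h>

#define W 3
#define H 3

/* Hypothetical cell layout, not the libipuz one. */
typedef struct {
  bool block;
  int  number;       /* puzzle data: visible clue number, 0 if none */
  int  across_clue;  /* internal back-link: across clue number, 0 if none */
} Cell;

static bool is_open (Cell g[H][W], int r, int c)
{
  return r >= 0 && r < H && c >= 0 && c < W && !g[r][c].block;
}

/* _fix-style: enforce a puzzle rule. Numbers are reassigned so that they
 * run sequentially in row-major order. Returns the highest number used. */
static int grid_fix_numbering (Cell g[H][W])
{
  int next = 1;

  for (int r = 0; r < H; r++)
    for (int c = 0; c < W; c++)
      {
        g[r][c].number = 0;
        if (g[r][c].block)
          continue;
        bool starts_across = !is_open (g, r, c - 1) && is_open (g, r, c + 1);
        bool starts_down   = !is_open (g, r - 1, c) && is_open (g, r + 1, c);
        if (starts_across || starts_down)
          g[r][c].number = next++;
      }
  return next - 1;
}

/* _fixup-style: enforce an internal invariant. Every cell that belongs to
 * an across entry records the number of that entry. */
static void grid_fixup_across_links (Cell g[H][W])
{
  for (int r = 0; r < H; r++)
    {
      int current = 0;

      for (int c = 0; c < W; c++)
        {
          if (g[r][c].block)
            {
              current = 0;
              g[r][c].across_clue = 0;
              continue;
            }
          if (!is_open (g, r, c - 1))  /* first open cell of a run */
            current = is_open (g, r, c + 1) ? g[r][c].number : 0;
          g[r][c].across_clue = current;
        }
    }
}
```

In this sketch the fixup depends on the numbers being correct, so the fix has to run first. That ordering is one way the two kinds of method can overlap.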

NOTE: It’s probably wise to rename the fixup functions in the future to remove naming confusion.


Note that saving a file won’t reproduce the exact file that was loaded. At a minimum, the formatting and indentation won’t be preserved. In addition, clues are always converted to objects with the "cells" key set to whatever libipuz was able to calculate from the board. If show_enumerations is TRUE, then enumerations will be calculated and set to the clue length.
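As a rough illustration of that normalization, the sketch below writes a clue out in object form regardless of how it was originally spelled in the file. The `Clue` struct, `append`, and `clue_save` are hypothetical names, the key names ("number", "clue", "cells", "enumeration") are assumed to match the ipuz clue object, and the clue text is not JSON-escaped, which a real serializer would have to do.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical in-memory clue, not the libipuz IpuzClue. */
typedef struct {
  int         number;
  const char *text;
  size_t      n_cells;
  int         cells[8][2];  /* coordinates as calculated from the board */
} Clue;

/* Append formatted text to buf, tracking how much has been used.
 * Once the buffer is full, further appends are skipped. */
static void append (char *buf, size_t size, size_t *used, const char *fmt, ...)
{
  va_list args;
  int n;

  if (*used >= size)
    return;
  va_start (args, fmt);
  n = vsnprintf (buf + *used, size - *used, fmt, args);
  va_end (args);
  if (n > 0)
    *used += (size_t) n;
}

/* Always emit the object form, with "cells" filled in. The enumeration
 * is only written when show_enumerations is set, and equals the length. */
static void clue_save (const Clue *clue, bool show_enumerations,
                       char *buf, size_t size)
{
  size_t used = 0;

  append (buf, size, &used, "{\"number\":%d,\"clue\":\"%s\",\"cells\":[",
          clue->number, clue->text);
  for (size_t i = 0; i < clue->n_cells; i++)
    append (buf, size, &used, "%s[%d,%d]", i ? "," : "",
            clue->cells[i][0], clue->cells[i][1]);
  append (buf, size, &used, "]");
  if (show_enumerations)
    append (buf, size, &used, ",\"enumeration\":\"%zu\"", clue->n_cells);
  append (buf, size, &used, "}");
}
```

A clue that was loaded from a bare string would therefore come back out as an object, which is one reason a load/save round trip is not byte-for-byte identical.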


  • We ignore fields that are malformed or that we don’t understand, in the interest of compatibility. There is no strict parsing mode right now.

  • The GError handling on load really only catches malformed files. We don’t provide a way to warn about unhandled elements right now.

  • Boxed types (e.g. IpuzStyle and IpuzClue) have a _load_node convenience function.

  • Cells have _parse_* functions, because they’re allocated statically and we need to fill in their data.

  • The "puzzle" and "solution" elements can place blocks in different cells, which can lead to conflicting puzzles. To keep things simple, we ignore the block in the solution field and leave the puzzle inconsistent.
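The last two points can be sketched with a pair of hypothetical parse functions that fill in cells the caller has already allocated. This is an assumption-laden illustration, not the libipuz code: `Cell`, `cell_parse_puzzle`, and `cell_parse_solution` are made-up names, "#" is assumed to be the block marker (the ipuz default), and storing a solution letter on a cell that "puzzle" marked as a block is a choice made for this sketch only.

```c
#include <stddef.h>
#include <string.h>

typedef enum { CELL_NORMAL, CELL_BLOCK } CellType;

/* Hypothetical cell, not the libipuz IpuzCell. */
typedef struct {
  CellType type;
  char     solution;  /* '\0' when unknown */
} Cell;

/* Fill in a caller-owned cell from one entry of the "puzzle" element. */
static void cell_parse_puzzle (Cell *cell, const char *token)
{
  if (token == NULL)  /* malformed entry: ignore it, keep the defaults */
    return;
  if (strcmp (token, "#") == 0)
    cell->type = CELL_BLOCK;
}

/* Fill in a caller-owned cell from one entry of the "solution" element. */
static void cell_parse_solution (Cell *cell, const char *token)
{
  if (token == NULL || token[0] == '\0')  /* malformed entry: ignore it */
    return;
  if (strcmp (token, "#") == 0)  /* blocks on the solution side are ignored */
    return;
  cell->solution = token[0];     /* assumption: stored even on a block cell */
}
```

With "puzzle" entries "1", "#", "0" and "solution" entries "A", "B", "#", the second cell ends up as a block that still carries a letter, and the third stays a normal cell with no solution. Neither case is repaired, which matches the "leave the puzzle inconsistent" choice described above.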