# Parsing strategy

## Loading

Since ipuz is a very loosely defined spec, we take a similarly tolerant approach to parsing ipuz files. We try to map GObject properties to fields in the puzzle whenever possible. However, that's a poor fit for the board: we want one data structure for the board, but multiple fields can represent it. Since the Json-glib::serializable interface doesn't let us map multiple nodes to a single property (such as mapping `"puzzle"` and `"solution"` to a `board` property), we have to do custom parsing.

When we call `ipuz_puzzle_new_*`, we build the JSON DOM and then:

1. Confirm the ipuz file is a version of ipuz that we can handle (v2).
1. Read the `kind` field and create the appropriate subclass.
1. Go through all members of the toplevel object in the file, and call `:load_node` on each:
   1. The default handler should handle setting properties with the same name from SCALAR JSON nodes (i.e. booleans, strings, ints, etc.).
   1. For more complex types, the `:load_node` handler should handle them manually.
   1. Be sure to chain up to the super type!
1. We then call `:post_load_node` to catch any types that need loading after the rest of the nodes have been loaded. As an example, both `"solution"` and `"puzzle"` need a lot of previously parsed values (dimensions, block, etc.).
   1. There is no need to chain up on `post_load_node`.
1. Next, there's a `fixup` stage, which handles things like calculating enumerations or cell areas.
1. Lastly, there's a `validate` stage that makes sure the puzzle makes sense. It catches nonsensical puzzles, such as puzzles without a grid or clues.

### Naming

We have two types of internal methods with similar names:

* `_fixup()` methods enforce internal code requirements and assumptions. An example is making sure every cell has a link back to its associated clues.
* `_fix()` methods enforce puzzle requirements and assumptions. An example is making sure every crossword has its numbering in sequential order.

These two can overlap at times, but perform very different actions.

**NOTE:** It's probably wise to rename the fixup functions in the future to remove naming confusion.

## Saving

Note that saving a file won't reproduce exactly the file that was loaded. At a minimum, the formatting and indentation won't be preserved. In addition, clues are always converted to objects with the `"cells"` key set to whatever libipuz was able to calculate from the board. If `show_enumerations` is `TRUE`, then enumerations will be calculated and set to the clue length.

### Notes:

* We ignore fields that are malformed or that we don't understand, in the interest of compatibility. There is no strict parsing mode right now.
* The `GError` handling on load really only catches malformed files. We don't currently provide a way to warn about unhandled elements.
* Boxed types (e.g. `IPuzStyle` and `IPuzClue`) have a `_load_node` convenience function.
* Cells have `_parse_*` functions, because they're allocated statically and we need to fill in their data.
* It's possible for the `"puzzle"` and `"solution"` elements to place blocks in different cells. This can lead to conflicting puzzles. To make things simpler, we ignore the block in the solution field and leave the puzzle inconsistent.
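
### Sketch of the load stages

To make the ordering of the load stages described under Loading more concrete, here is a minimal, dependency-free sketch in plain C. It is not the libipuz implementation. Every name in it (`Member`, `Puzzle`, `puzzle_load`, `base_load_node`, `crossword_load_node`, `crossword_post_load_node`) is hypothetical. The JSON DOM is replaced by a simple array of name/value pairs, dimensions are written as `"2x2"` text rather than a JSON object, and the version and kind checks assume strings of the form `http://ipuz.org/v2` and `http://ipuz.org/crossword#1`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one member of the toplevel JSON object. */
typedef struct {
  const char *name;
  bool        is_scalar;   /* booleans, strings, ints, etc. */
  const char *value;       /* simplified: the raw text of the node */
} Member;

typedef struct {
  const char *title;         /* set by the default handler */
  int         width, height; /* set by the subclass handler */
  bool        board_built;   /* set during post_load_node */
  bool        fixed_up;      /* set during fixup */
  int         ignored;       /* members nobody understood */
} Puzzle;

typedef bool (*LoadNodeFunc) (Puzzle *p, const Member *m);

/* Default handler: map scalar members onto same-named properties. */
static bool
base_load_node (Puzzle *p, const Member *m)
{
  if (m->is_scalar && strcmp (m->name, "title") == 0)
    {
      p->title = m->value;
      return true;
    }
  p->ignored++;  /* tolerant parsing: skip what we don't understand */
  return false;
}

/* Subclass handler: deal with complex members, then chain up. */
static bool
crossword_load_node (Puzzle *p, const Member *m)
{
  if (strcmp (m->name, "dimensions") == 0)
    return sscanf (m->value, "%dx%d", &p->width, &p->height) == 2;
  if (strcmp (m->name, "puzzle") == 0 || strcmp (m->name, "solution") == 0)
    return true;  /* deferred: the board needs the dimensions first */
  return base_load_node (p, m);  /* chain up to the super type */
}

/* Runs after every member has been seen, so dimensions are known. */
static void
crossword_post_load_node (Puzzle *p, const Member *m)
{
  if (strcmp (m->name, "puzzle") == 0 && p->width > 0 && p->height > 0)
    p->board_built = true;
}

bool
puzzle_load (Puzzle *p, const char *version, const char *kind,
             const Member *members, size_t n_members)
{
  assert (p != NULL);
  memset (p, 0, sizeof *p);

  /* 1. Version check (assumed version string format). */
  if (strstr (version, "ipuz.org/v2") == NULL)
    return false;

  /* 2. Pick the handler from the kind (stands in for subclassing). */
  bool is_crossword = strstr (kind, "crossword") != NULL;
  LoadNodeFunc load_node = is_crossword ? crossword_load_node : base_load_node;

  /* 3. Call load_node on every toplevel member, in file order. */
  for (size_t i = 0; i < n_members; i++)
    load_node (p, &members[i]);

  /* 4. Second pass for members that depend on earlier results. */
  if (is_crossword)
    for (size_t i = 0; i < n_members; i++)
      crossword_post_load_node (p, &members[i]);

  /* 5. fixup: placeholder for enumerations, cell areas, etc. */
  p->fixed_up = true;

  /* 6. validate: a puzzle without a grid is rejected. */
  return p->board_built;
}
```

The point of the two passes is that member order in the file doesn't matter: a `"puzzle"` member that appears before `"dimensions"` is accepted by `crossword_load_node` but only acted on in the second pass. Unknown members just bump a counter instead of failing the load, which mirrors the tolerant approach described above.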