Struct

IpuzCharset

Description

struct IpuzCharset {
  /* No available fields */
}

An opaque, immutable data structure that stores an ordered count of Unicode characters.

Charsets are surprisingly versatile. Fundamentally, each one maps every gunichar it contains to a single guint count. They can be used to keep track of how many times each Unicode character appears in a puzzle, or to represent a set of valid characters.
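
One way to picture this mapping is as a sorted array of (code point, count) pairs that is searched with a binary search. The sketch below is a conceptual model in plain C, not libipuz's implementation. Every Model* type and model_* function is a hypothetical name introduced for illustration, and the decoder assumes well-formed UTF-8 input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model types; these are NOT libipuz's internals. */
typedef struct {
  uint32_t c;      /* Unicode code point */
  unsigned count;  /* number of occurrences */
} ModelEntry;

typedef struct {
  ModelEntry *entries; /* sorted by code point, no duplicates */
  size_t n_chars;      /* number of distinct code points */
} ModelCharset;

/* Decode one code point and return a pointer to the next one.
 * Assumes well-formed UTF-8. */
static const char *
model_next_code_point (const char *p, uint32_t *out)
{
  const unsigned char *s = (const unsigned char *) p;
  uint32_t c;
  int extra;

  if (s[0] < 0x80)      { c = s[0];        extra = 0; }
  else if (s[0] < 0xE0) { c = s[0] & 0x1F; extra = 1; }
  else if (s[0] < 0xF0) { c = s[0] & 0x0F; extra = 2; }
  else                  { c = s[0] & 0x07; extra = 3; }

  for (int i = 1; i <= extra; i++)
    c = (c << 6) | (s[i] & 0x3F);

  *out = c;
  return p + extra + 1;
}

static int
model_compare (const void *a, const void *b)
{
  uint32_t ca = ((const ModelEntry *) a)->c;
  uint32_t cb = ((const ModelEntry *) b)->c;
  return (ca > cb) - (ca < cb);
}

/* Build the model from UTF-8 text; returns NULL on allocation failure. */
ModelCharset *
model_charset_new (const char *text)
{
  ModelCharset *set;
  size_t n = 0, out = 0;

  assert (text != NULL);
  set = malloc (sizeof *set);
  if (set == NULL)
    return NULL;

  /* strlen() is an upper bound on the number of code points. */
  set->entries = malloc ((strlen (text) + 1) * sizeof (ModelEntry));
  if (set->entries == NULL)
    {
      free (set);
      return NULL;
    }

  for (const char *p = text; *p != '\0'; n++)
    {
      p = model_next_code_point (p, &set->entries[n].c);
      set->entries[n].count = 1;
    }

  qsort (set->entries, n, sizeof (ModelEntry), model_compare);

  /* Merge each run of equal code points into a single entry. */
  for (size_t i = 0; i < n; i++)
    {
      if (out > 0 && set->entries[out - 1].c == set->entries[i].c)
        set->entries[out - 1].count++;
      else
        set->entries[out++] = set->entries[i];
    }

  set->n_chars = out;
  return set;
}

/* Binary search; a code point that is absent has a count of 0. */
unsigned
model_charset_get_char_count (const ModelCharset *set, uint32_t c)
{
  size_t lo = 0, hi = set->n_chars;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (set->entries[mid].c == c)
        return set->entries[mid].count;
      if (set->entries[mid].c < c)
        lo = mid + 1;
      else
        hi = mid;
    }
  return 0;
}

void
model_charset_free (ModelCharset *set)
{
  free (set->entries);
  free (set);
}
```

In this model, a "set of valid characters" is simply a charset in which every count is one, and membership is the test count > 0.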

They are constructed from an IpuzCharsetBuilder, or a simple one can be created via ipuz_charset_deserialize(). A common case is a charset in which every character has a count of one, such as an alphabet, which can then be used to filter text.

Charsets are designed to be used in performance-critical areas. As a result, tradeoffs have been made to keep them as fast as possible.

Examples:

An example of creating a charset from an existing string:

g_autoptr (IpuzCharset) charset = NULL;

charset = ipuz_charset_deserialize ("ABCDEEE");

// Show that charset contains three 'E's
g_assert_cmpint (ipuz_charset_get_char_count (charset, g_utf8_get_char ("E")),
                 ==,
                 3);

A second example of creating an alphabet filter:

IpuzCharsetBuilder *builder;
g_autoptr (IpuzCharset) charset = NULL;
g_autoptr (GString) filtered = NULL;

builder = ipuz_charset_builder_new_for_language ("en");

// builder is consumed with this call
charset = ipuz_charset_builder_build (builder);

g_assert_true (ipuz_charset_check_text (charset, "ENGLISH"));
g_assert_false (ipuz_charset_check_text (charset, "ESPAÑOL!"));

// Filter string to only include English alphabet characters
filtered = g_string_new (NULL);
// A string literal must not be modified, so walk it through a const pointer
for (const gchar *p = "ESPAÑOL!"; p[0]; p = g_utf8_next_char (p))
  {
    gunichar c = g_utf8_get_char (p);
    if (ipuz_charset_get_char_count (charset, c) > 0)
      g_string_append_unichar (filtered, c);
  }

// Make sure characters are filtered out
g_assert_cmpstr (filtered->str, ==, "ESPAOL");

Iteration

To iterate through a charset, one can do:

for (guint i = 0; i < ipuz_charset_get_n_chars (charset); i++)
  {
    IpuzCharsetValue value;

    ipuz_charset_get_value (charset, i, &value);
    // do something with value
  }

Limitations

Like the rest of libipuz, the charset operates on individual Unicode code points rather than grapheme clusters. This means that glyphs made up of multiple code points can’t be stored in a charset.
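
To illustrate the distinction, the following plain C sketch counts the code points in a UTF-8 string. It does not use libipuz; count_code_points is a hypothetical helper, and it assumes well-formed UTF-8 input.

```c
#include <assert.h>
#include <stddef.h>

/* Count Unicode code points in well-formed UTF-8 by counting every
 * byte that is not a continuation byte (bit pattern 10xxxxxx). */
size_t
count_code_points (const char *utf8)
{
  size_t n = 0;

  assert (utf8 != NULL);
  for (const unsigned char *p = (const unsigned char *) utf8; *p != '\0'; p++)
    {
      if ((*p & 0xC0) != 0x80)
        n++;
    }
  return n;
}
```

The precomposed "é" (U+00E9, bytes "\xC3\xA9") is a single code point, whereas "e" followed by the combining acute accent U+0301 (bytes "e\xCC\x81") renders as the same glyph but is two code points. Only the first form is a single character from the charset's point of view.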

Functions

ipuz_charset_deserialize

Creates a new character set by deserializing from a string.

Instance methods

ipuz_charset_check_text

ipuz_charset_equal

ipuz_charset_get_char_count

ipuz_charset_get_char_index

ipuz_charset_get_n_chars

ipuz_charset_get_total_count

ipuz_charset_get_value

ipuz_charset_ref

ipuz_charset_serialize

ipuz_charset_subset

ipuz_charset_unref