liblex
A library for converting input text into tokens defined by regular expressions.
Rationale
liblex is part of an attempt to write a compiler entirely from scratch. It is the part of the compiler that converts input text into tokens to be evaluated by the parser.
Usage
First, an evaluation configuration has to be created. This configuration stores the regular expressions to be used during lexing. The code below is an example of initializing and populating an evaluation configuration.
/* Declares the configuration */
eval_config config;
/* Initializes the configuration for use */
eval_config_init(&config);
/* Registers regular expressions to be used. The IDs, given as the third
parameter, are also used for priority: the higher the ID, the higher the
priority. */
eval_config_add(&config, "[ \n]+", 0);
eval_config_add(&config, "[a-zA-Z_][a-zA-Z_0-9]*", 1);
eval_config_add(&config, "if", 2);
eval_config_add(&config, "else", 3);
eval_config_add(&config, "[0-9]+", 4);
eval_config_add(&config, "{|}", 5);
It should be noted that this example is incomplete. eval_config_add returns a liblex_result, which represents the result of the operation. LIBLEX_SUCCESS means that no errors occurred. LIBLEX_MALLOC, on the other hand, means that the function failed to allocate the necessary memory, and LIBLEX_INVALID means that the regular expression provided was not correctly formatted.
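A minimal sketch of checking those results might look like the following. It keeps the patterns from the example above in a table and stops at the first failure. So that it compiles on its own, it declares stand-ins for liblex_result, eval_config, and eval_config_add; the enumerator order, the struct contents, and the parameter types are assumptions rather than the library's actual definitions. The token_id names, token_rule, describe_result, and add_rules are hypothetical helpers and are not part of liblex.

```c
#include <stdio.h>

/* Stand-ins so the sketch compiles without liblex. In a real program these
   come from the library's headers; everything in this section is assumed. */
typedef enum { LIBLEX_SUCCESS, LIBLEX_MALLOC, LIBLEX_INVALID } liblex_result;
typedef struct { int pattern_count; } eval_config;

/* Stub with the same call shape as the real function: it rejects NULL or
   empty patterns and otherwise only counts registrations. */
static liblex_result eval_config_add(eval_config *config, const char *regex, int id) {
    (void)id;
    if (regex == NULL || regex[0] == '\0') return LIBLEX_INVALID;
    config->pattern_count++;
    return LIBLEX_SUCCESS;
}

/* Hypothetical names for the IDs used in the example above. */
enum token_id {
    TOKEN_SPACE = 0, TOKEN_IDENT = 1, TOKEN_IF = 2,
    TOKEN_ELSE = 3, TOKEN_NUMBER = 4, TOKEN_BRACE = 5
};

/* One pattern together with the ID it is registered under. */
struct token_rule { const char *regex; int id; };

static const struct token_rule example_rules[] = {
    { "[ \n]+", TOKEN_SPACE },
    { "[a-zA-Z_][a-zA-Z_0-9]*", TOKEN_IDENT },
    { "if", TOKEN_IF },
    { "else", TOKEN_ELSE },
    { "[0-9]+", TOKEN_NUMBER },
    { "{|}", TOKEN_BRACE },
};

/* Maps a result code to a short human-readable message. */
static const char *describe_result(liblex_result result) {
    switch (result) {
    case LIBLEX_SUCCESS: return "success";
    case LIBLEX_MALLOC:  return "out of memory";
    case LIBLEX_INVALID: return "invalid regular expression";
    }
    return "unknown result";
}

/* Registers each rule in order. Returns LIBLEX_SUCCESS if all were added,
   otherwise reports and returns the first failing result. */
static liblex_result add_rules(eval_config *config,
                               const struct token_rule *rules, size_t count) {
    for (size_t i = 0; i < count; i++) {
        liblex_result result = eval_config_add(config, rules[i].regex, rules[i].id);
        if (result != LIBLEX_SUCCESS) {
            fprintf(stderr, "rule %zu (\"%s\"): %s\n", i,
                    rules[i].regex ? rules[i].regex : "(null)",
                    describe_result(result));
            return result;
        }
    }
    return LIBLEX_SUCCESS;
}
```

With the real library, the stand-in section would be replaced by liblex's own headers, and the configuration would first be set up with eval_config_init as shown earlier. Going by the priority rule stated above, TOKEN_IF (2) outranks TOKEN_IDENT (1) for text that both patterns match.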
After the evaluation configuration has been set up, a string is tokenized by creating a linked list and populating it with the resulting tokens (called matches).
/* Declares the linked list. */
ll match_ll;
/* Initializes the linked list. */
ll_init(&match_ll);
/* The first parameter is the input string, the second is the index at which
to begin parsing. */
eval_all(string, 0, &config, &match_ll);
Once done, some things need to be cleaned up. The eval_foreach_match_free function can be passed to ll_foreach, along with the list containing the matches, to release them:
ll_foreach(&match_ll, NULL, compare_always, eval_foreach_match_free);
ll_clear(&match_ll);
The configuration can then be freed using:
eval_config_free(&config);