# liblex
A library for converting input text into tokens defined by regular expressions.
## Rationale
liblex is a part of an attempt to write / create a compiler entirely from scratch.
This part of the compiler would be used to convert input text into tokens
to be evaluated by the parser.
## Usage
First, an evaluation configuration has to be created. This configuration
stores the regular expressions used during lexing. The code below
initializes and configures an evaluation configuration.
```C
/* Declares the configuration */
eval_config config;
/* Initializes the configuration for use */
eval_config_init(&config);
/* Registers regular expressions to be used. The IDs, given as the third
parameter, also act as priorities: the higher the ID, the higher the
priority. */
eval_config_add(&config, "[ \n]+", 0);
eval_config_add(&config, "[a-zA-Z_][a-zA-Z_0-9]*", 1);
eval_config_add(&config, "if", 2);
eval_config_add(&config, "else", 3);
eval_config_add(&config, "[0-9]+", 4);
eval_config_add(&config, "{|}", 5);
```
Note that this example is incomplete: it ignores the return value of
`eval_config_add`. The function returns a `liblex_result`, which represents
the result of the operation. `LIBLEX_SUCCESS` means that no errors occurred.
`LIBLEX_MALLOC` means that the function failed to allocate the necessary
memory, and `LIBLEX_INVALID` means that the regular expression provided was
not correctly formatted.
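A minimal sketch of checking those results follows. To compile without liblex it defines its own stand-ins for `liblex_result`, `eval_config`, and `eval_config_add`; these are assumptions for illustration only, and in a real program they would be removed in favor of the library's headers. The helpers `liblex_result_str` and `add_checked` are hypothetical names that are not part of liblex.

```C
#include <stdio.h>
#include <string.h>

/* Stand-ins so the sketch compiles on its own. The real definitions come
   from liblex and may differ from these assumed ones. */
typedef enum { LIBLEX_SUCCESS, LIBLEX_MALLOC, LIBLEX_INVALID } liblex_result;
typedef struct { int count; } eval_config;

/* Stand-in for the real eval_config_add. It rejects an empty pattern only
   so that the error path can be exercised here. */
liblex_result eval_config_add(eval_config *config, const char *regex, int id) {
    (void)id;
    if (regex[0] == '\0') return LIBLEX_INVALID;
    config->count++;
    return LIBLEX_SUCCESS;
}

/* Hypothetical helper: maps a result code to a readable message. */
const char *liblex_result_str(liblex_result result) {
    switch (result) {
    case LIBLEX_SUCCESS: return "success";
    case LIBLEX_MALLOC:  return "out of memory";
    case LIBLEX_INVALID: return "invalid regular expression";
    }
    return "unknown result";
}

/* Hypothetical helper: registers a pattern and reports any failure on
   stderr, passing the result back to the caller. */
liblex_result add_checked(eval_config *config, const char *regex, int id) {
    liblex_result result = eval_config_add(config, regex, id);
    if (result != LIBLEX_SUCCESS) {
        fprintf(stderr, "pattern %d (\"%s\"): %s\n",
                id, regex, liblex_result_str(result));
    }
    return result;
}
```

A caller could then replace each `eval_config_add(&config, ...)` call with `add_checked(&config, ...)` and stop registering patterns at the first result other than `LIBLEX_SUCCESS`.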
Once the configuration is set up, a string is tokenized by creating a
linked list and populating it with the resulting tokens (called matches).
```C
/* Declares the linked list. */
ll match_ll;
/* Initializes the linked list. */
ll_init(&match_ll);
/* The first parameter is the input string, the second is the index at which
to begin parsing. */
eval_all(string, 0, &config, &match_ll);
```
Once lexing is done, the matches need to be released. Passing the
`eval_foreach_match_free` function to `ll_foreach` on the list containing
the matches frees them, after which the list itself can be cleared:
```C
ll_foreach(&match_ll, NULL, compare_always, eval_foreach_match_free);
ll_clear(&match_ll);
```
And the configuration can be freed using:
```C
eval_config_free(&config);
```