# liblex

A library for converting input text into tokens defined by regular expressions.

## Rationale

liblex is part of an attempt to write a compiler entirely from scratch.
This part of the compiler converts input text into tokens
to be evaluated by the parser.

## Usage

First, an evaluation configuration has to be created. This configuration
stores the regular expressions to be used during lexing. The code below
is an example of initializing and populating an evaluation configuration.

```C
/* Declares the configuration */
eval_config config;
/* Initializes the configuration for use */
eval_config_init(&config);
/* Registers regular expressions to be used. The IDs, given as the third
   parameter, are also used for priority: the higher the ID, the higher the
   priority. */
eval_config_add(&config, "[ \n]+", 0);
eval_config_add(&config, "[a-zA-Z_][a-zA-Z_0-9]*", 1);
eval_config_add(&config, "if", 2);
eval_config_add(&config, "else", 3);
eval_config_add(&config, "[0-9]+", 4);
eval_config_add(&config, "{|}", 5);
```

Note that this example is incomplete: `eval_config_add` returns
a `liblex_result`, which reports the outcome of the operation and should be
checked. `LIBLEX_SUCCESS` means that no errors occurred, `LIBLEX_MALLOC` means
that the function failed to allocate the necessary memory, and `LIBLEX_INVALID`
means that the regular expression provided was malformed.
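
A minimal sketch of checking that result, written in C to match the other examples. Because the liblex header is not shown in this README, the enum below is a stand-in that only reproduces the three names mentioned above (their order and numeric values are assumptions), and `describe_result` and `check_result` are hypothetical helpers, not part of liblex.

```C
#include <stdio.h>

/* Stand-in for the real liblex type. Only the three names come from the
   README; their order and numeric values here are assumptions. */
typedef enum {
    LIBLEX_SUCCESS,
    LIBLEX_MALLOC,
    LIBLEX_INVALID
} liblex_result;

/* Hypothetical helper (not part of liblex): maps a result to a message. */
static const char *describe_result(liblex_result result) {
    switch (result) {
    case LIBLEX_SUCCESS: return "success";
    case LIBLEX_MALLOC:  return "out of memory";
    case LIBLEX_INVALID: return "invalid regular expression";
    }
    return "unknown result";
}

/* Hypothetical helper (not part of liblex): returns 1 on success; on
   failure, reports the offending pattern on stderr and returns 0. */
static int check_result(liblex_result result, const char *pattern) {
    if (result == LIBLEX_SUCCESS) {
        return 1;
    }
    fprintf(stderr, "could not add \"%s\": %s\n",
            pattern, describe_result(result));
    return 0;
}
```

With the real liblex header in place of the stand-in enum, a registration could then be guarded as `if (!check_result(eval_config_add(&config, "[0-9]+", 4), "[0-9]+")) { /* handle the failure */ }`.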

After the evaluation configuration has been set up, a string is tokenized
by creating a linked list and populating it with the resulting tokens
(called matches).

```C
/* Declares the linked list. */
ll match_ll;
/* Initializes the linked list. */
ll_init(&match_ll);

/* The first parameter is the input string, the second is the index at which
   to begin parsing. The configuration and the list that receives the
   matches follow. */
eval_all(string, 0, &config, &match_ll);
```
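
The priority rule from the configuration step matters when several registered expressions match the same text: with the example configuration, the input `if` fits both the identifier expression (ID 1) and the `if` expression (ID 2), and the higher ID should win. The sketch below only illustrates that rule. `pick_highest_priority` is a hypothetical helper, not a liblex function, and how liblex applies the rule internally (for example, how it weighs match length) is not covered by this README. It assumes IDs are non-negative, as in the example.

```C
#include <stddef.h>

/* Hypothetical helper (not part of liblex): given the IDs of every
   registered expression that matched the same text, returns the winner
   under the "higher ID, higher priority" rule.
   Assumes IDs are non-negative; returns -1 when nothing matched. */
static int pick_highest_priority(const int *matched_ids, size_t count) {
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (matched_ids[i] > best) {
            best = matched_ids[i];
        }
    }
    return best;
}
```

For the input `if`, the candidate IDs would be 1 and 2, so the helper returns 2 and the text would be reported as the `if` token rather than as an identifier.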

Once tokenizing is done, the matches and the list need to be cleaned up.
The `eval_foreach_match_free` function can be passed to `ll_foreach`, along
with the list containing the matches, to release them:
```C
ll_foreach(&match_ll, NULL, compare_always, eval_foreach_match_free);
ll_clear(&match_ll);
```
And the configuration can be freed using:
```C
eval_config_free(&config);
```