Merge branch 'master' of dev.danilafe.com:Experiments/liblex
# liblex

A library for converting input text into tokens defined by regular expressions.

## Rationale

liblex is part of an attempt to write a compiler entirely from scratch.
It is the part of the compiler that converts input text into tokens
to be evaluated by the parser.

## Usage

First, an evaluation configuration has to be created. This configuration
stores the regular expressions to be used during lexing. The code below
is an example of initializing and configuring an evaluation configuration.

```C
/* Declares the configuration */
eval_config config;
/* Initializes the configuration for use */
eval_config_init(&config);
/* Registers the regular expressions to be used. The IDs, given as the third
   parameter, are also used for priority: the higher the ID, the higher the
   priority. */
eval_config_add(&config, "[ \n]+", 0);
eval_config_add(&config, "[a-zA-Z_][a-zA-Z_0-9]*", 1);
eval_config_add(&config, "if", 2);
eval_config_add(&config, "else", 3);
eval_config_add(&config, "[0-9]+", 4);
eval_config_add(&config, "{|}", 5);
```

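The IDs in the example above are bare integers, which makes the priority ordering easy to get wrong. A minimal sketch of one way to name them follows. The `token_id` enum and the `higher_priority` helper are hypothetical and not part of liblex; the helper only restates the rule quoted in the comment above (the higher ID wins) and makes no claim about how liblex handles matches of different lengths.

```C
/* Hypothetical names for the IDs used in the example above. liblex itself
   only sees the integer values; this enum is not part of the library. */
enum token_id {
    TOKEN_WHITESPACE = 0,
    TOKEN_IDENTIFIER = 1,
    TOKEN_IF = 2,
    TOKEN_ELSE = 3,
    TOKEN_NUMBER = 4,
    TOKEN_BRACE = 5
};

/* Illustrates the stated priority rule: of two IDs whose patterns match
   the same text, the one with the higher value takes precedence. */
static int higher_priority(int id_a, int id_b) {
    return id_a > id_b ? id_a : id_b;
}
```

Under this numbering, the text `if` matches both the identifier pattern (ID 1) and the keyword pattern (ID 2), and the rule selects the keyword because its ID is higher. This is why the keywords are registered with higher IDs than the general identifier pattern.
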
Note that this example is incomplete: `eval_config_add` returns a
`liblex_result`, which represents the result of the operation and is ignored
above. `LIBLEX_SUCCESS` means that no errors occurred. `LIBLEX_MALLOC`, on the
other hand, means that the function failed to allocate the necessary memory,
and `LIBLEX_INVALID` means that the regular expression provided was not
correctly formatted.

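The result codes described above could be checked with a small helper like the sketch below. The `liblex_result` typedef here is only a stand-in so that the snippet compiles on its own; the real definition comes from the liblex headers and may differ in ordering or contain more values. `describe_result` and `check_result` are hypothetical helpers, not part of the library.

```C
#include <stdio.h>

/* Stand-in for the real definition from the liblex headers, so that this
   sketch is self-contained. Remove it when compiling against liblex. */
typedef enum {
    LIBLEX_SUCCESS,
    LIBLEX_MALLOC,
    LIBLEX_INVALID
} liblex_result;

/* Hypothetical helper: maps a result code to a human-readable message. */
static const char *describe_result(liblex_result result) {
    switch (result) {
    case LIBLEX_SUCCESS: return "success";
    case LIBLEX_MALLOC:  return "memory allocation failed";
    case LIBLEX_INVALID: return "invalid regular expression";
    default:             return "unknown result";
    }
}

/* Hypothetical helper: returns 1 if the result is LIBLEX_SUCCESS.
   Otherwise prints a diagnostic to stderr and returns 0. */
static int check_result(liblex_result result, const char *pattern) {
    if (result == LIBLEX_SUCCESS) {
        return 1;
    }
    fprintf(stderr, "liblex: could not register \"%s\": %s\n",
            pattern, describe_result(result));
    return 0;
}
```

Each registration can then be wrapped, for example `check_result(eval_config_add(&config, "[0-9]+", 4), "[0-9]+")`, and the caller can stop configuring as soon as one call returns 0.
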
After the evaluation configuration has been set up, a string is tokenized
by creating a linked list and populating it with the resulting tokens
(called matches).

```C
/* Declares the linked list. */
ll match_ll;
/* Initializes the linked list. */
ll_init(&match_ll);

/* The first parameter is the input string, the second is the index at which
   to begin lexing. The matches are stored in the linked list passed as the
   last parameter. */
eval_all(string, 0, &config, &match_ll);
```

Once lexing is done, the matches and the list need to be cleaned up. Passing
`eval_foreach_match_free` to `ll_foreach` on the list containing the matches
releases them, after which the list itself can be cleared:
```C
ll_foreach(&match_ll, NULL, compare_always, eval_foreach_match_free);
ll_clear(&match_ll);
```

And the configuration can be freed using:
```C
eval_config_free(&config);
```