mirror of https://github.com/DanilaFe/abacus synced 2024-12-22 15:30:09 -08:00

Updated Internals (markdown)

Danila Fedorin 2017-07-26 21:48:38 -07:00
parent 3d3d0d5554
commit cffbf63564

@@ -5,7 +5,7 @@ Abacus, being designed with future extensions in mind, does not simply operate on
All `NumberInterface` implementations are immutable: any operation, such as addition, returns a new instance instead of modifying its operands. This prevents accidental side effects in functions, and removes the need to copy instances of `NumberInterface` when returning them from functions or storing them in variables.
## Promotion System
Abacus allows different `NumberInterface` implementations to occur within the same expression. For the primitive methods of `NumberInterface` to work across two different implementations, these implementations must be able to interact correctly. One option would be to use `instanceof` checks and apply the operations accordingly. However, this would mean a number of hardcoded `instanceof` checks and potential duplication between implementations, which is undesirable. To work around this, a promotion system was implemented. It is based on the assumption that it's always possible to convert one of the implementations to another, but not necessarily the other way around.
As such, when applying operations to numbers, Abacus checks which implementation is the most "general", that is, which implementation every number present can be converted into. It then "promotes" all the other numbers into it. At this point, all of the numbers have the same implementation, and all the primitive operations are defined.
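The promotion step described above can be sketched roughly as follows. This is a minimal illustration, not the Abacus source: the `Num` interface, the `IntNum` and `DecimalNum` classes, and the integer `generality()` ranking are all hypothetical stand-ins for `NumberInterface` and its implementations, and the real project may decide which implementation is most general in a different way.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class PromotionSketch {
    /** Hypothetical stand-in for NumberInterface; not the Abacus API. */
    interface Num {
        int generality();          // assumed ranking: higher means more general
        Num promoteTo(int target); // convert into the implementation with that rank
        Num add(Num other);        // primitive operation; both operands must share a class
    }

    /** Least general implementation: whole numbers only. */
    static final class IntNum implements Num {
        final long value;
        IntNum(long value) { this.value = value; }
        public int generality() { return 0; }
        public Num promoteTo(int target) {
            // Every integer is representable as a decimal, so promotion always succeeds.
            return target == 0 ? this : new DecimalNum(BigDecimal.valueOf(value));
        }
        public Num add(Num other) { return new IntNum(value + ((IntNum) other).value); }
        @Override public String toString() { return Long.toString(value); }
    }

    /** Most general implementation in this sketch. */
    static final class DecimalNum implements Num {
        final BigDecimal value;
        DecimalNum(BigDecimal value) { this.value = value; }
        public int generality() { return 1; }
        public Num promoteTo(int target) {
            // Conversion only goes one way: a decimal cannot become an IntNum here.
            if (target != 1) throw new IllegalArgumentException("cannot demote a DecimalNum");
            return this;
        }
        public Num add(Num other) {
            return new DecimalNum(value.add(((DecimalNum) other).value));
        }
        @Override public String toString() { return value.toPlainString(); }
    }

    /** Finds the most general implementation present and promotes every number to it. */
    static List<Num> promoteAll(Num... nums) {
        int target = Arrays.stream(nums).mapToInt(Num::generality).max().orElse(0);
        return Arrays.stream(nums).map(n -> n.promoteTo(target)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Num> promoted = promoteAll(new IntNum(2), new DecimalNum(new BigDecimal("0.5")));
        // Both operands are now DecimalNum, so add() needs no instanceof checks.
        System.out.println(promoted.get(0).add(promoted.get(1))); // prints 2.5
    }
}
```

Because each operation returns a new object and `promoteTo` never modifies its receiver, the sketch also stays within the immutability rule stated earlier.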
# Tokenization
Before the input can be converted into an expression tree and evaluated, Abacus needs to convert it to a list of tokens that can then be rearranged into postfix (postfix form makes it easier to construct a tree). The tokenization is not done via a set of if-else checks, or even a handwritten lexer. Rather, Abacus adopts a regular-expression-based approach, following a modified version of Ken Thompson's [regular expression matching algorithm](https://swtch.com/~rsc/regexp/regexp1.html), implemented using nondeterministic finite automata (NFAs). An instance of `Lexer` is provided with regular expressions, which are then recursively compiled into NFAs and given identifiers, allowing for easy additions of possible token types and changes in grammar. The implementation of Thompson's algorithm is modified to check for all token types at once, examining each input character only once.
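To make the "all token types at once" idea concrete, here is a small sketch of simulating several NFAs together. It is an illustration under assumptions, not the Abacus `Lexer`: the class `NfaLexerSketch`, its `register` method, and the `State` type are hypothetical, and the patterns are assembled by hand from character predicates instead of being compiled from regular expression strings. Each accepting state carries the identifier of its token type, and all patterns hang off one shared start state, so a single set of active states is advanced per input character.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.IntPredicate;

final class NfaLexerSketch {
    /** One NFA state: guarded edges, epsilon edges, and an optional token identifier. */
    static final class State {
        final List<IntPredicate> guards = new ArrayList<>();
        final List<State> targets = new ArrayList<>();
        final List<State> epsilon = new ArrayList<>();
        String tokenId; // non-null only on accepting states

        void on(IntPredicate guard, State target) {
            guards.add(guard);
            targets.add(target);
        }
    }

    private final State start = new State(); // shared entry point for every pattern

    /** Registers "one matching character" or, if repeat is set, "one or more". */
    void register(String id, IntPredicate matches, boolean repeat) {
        State first = new State();
        State accept = new State();
        accept.tokenId = id;
        first.on(matches, accept);
        if (repeat) accept.on(matches, accept);
        start.epsilon.add(first);
    }

    /** Adds a state and everything reachable from it through epsilon edges. */
    private static void addWithClosure(State s, Set<State> set) {
        if (set.add(s)) {
            for (State e : s.epsilon) addWithClosure(e, set);
        }
    }

    /** Splits the input into "id:text" tokens, always taking the longest match. */
    List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            Set<State> current = new LinkedHashSet<>();
            addWithClosure(start, current);
            String lastId = null;
            int lastEnd = pos;
            for (int i = pos; i < input.length() && !current.isEmpty(); i++) {
                char c = input.charAt(i);
                Set<State> next = new LinkedHashSet<>();
                // Advance every active state of every pattern on this one character.
                for (State s : current) {
                    for (int k = 0; k < s.guards.size(); k++) {
                        if (s.guards.get(k).test(c)) addWithClosure(s.targets.get(k), next);
                    }
                }
                current = next;
                // Remember the most recent accepting position; earlier patterns win ties.
                for (State s : current) {
                    if (s.tokenId != null) {
                        lastId = s.tokenId;
                        lastEnd = i + 1;
                        break;
                    }
                }
            }
            if (lastId == null) throw new IllegalArgumentException("no token at index " + pos);
            tokens.add(lastId + ":" + input.substring(pos, lastEnd));
            pos = lastEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        NfaLexerSketch lexer = new NfaLexerSketch();
        lexer.register("num", Character::isDigit, true);
        lexer.register("op", c -> "+-*/".indexOf(c) >= 0, false);
        System.out.println(lexer.tokenize("12+3*45")); // prints [num:12, op:+, num:3, op:*, num:45]
    }
}
```

Adding a token type in this sketch is one more `register` call, which mirrors the flexibility described above. Unlike the implementation described in the text, this simplified loop can look at the character that ends a token a second time when the next token starts.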