Abacus/src/main/java/org/nwapw/abacus/parsing/LexerTokenizer.java

package org.nwapw.abacus.parsing;

import org.nwapw.abacus.lexing.Lexer;
import org.nwapw.abacus.lexing.pattern.Match;
import org.nwapw.abacus.lexing.pattern.Pattern;
import org.nwapw.abacus.plugin.PluginListener;
import org.nwapw.abacus.plugin.PluginManager;
import org.nwapw.abacus.tree.TokenType;

import java.util.Comparator;
import java.util.List;

/**
 * A tokenizer that uses the lexer class and registered function and operator
 * names to turn input into tokens in O(n) time.
 */
public class LexerTokenizer implements Tokenizer<Match<TokenType>>, PluginListener {

    /**
     * Comparator used to sort the tokens produced by the lexer.
     */
    protected static final Comparator<TokenType> TOKEN_SORTER = Comparator.comparingInt(e -> e.priority);

    /**
     * The lexer instance used to turn strings into matches.
     */
    private Lexer<TokenType> lexer;

    /**
     * Creates a new lexer tokenizer.
     */
    public LexerTokenizer() {
        lexer = new Lexer<TokenType>() {{
            register(" ", TokenType.WHITESPACE);
            register(",", TokenType.COMMA);
            register("[0-9]*(\\.[0-9]+)?", TokenType.NUM);
            register("\\(", TokenType.OPEN_PARENTH);
            register("\\)", TokenType.CLOSE_PARENTH);
            register("[a-zA-Z]+", TokenType.VARIABLE);
        }};
    }

    @Override
    public List<Match<TokenType>> tokenizeString(String string) {
        return lexer.lexAll(string, 0, TOKEN_SORTER);
    }

    @Override
    public void onLoad(PluginManager manager) {
        for (String operator : manager.getAllOperators()) {
            lexer.register(Pattern.sanitize(operator), TokenType.OP);
        }
        for (String function : manager.getAllFunctions()) {
            lexer.register(Pattern.sanitize(function), TokenType.FUNCTION);
        }
    }

    @Override
    public void onUnload(PluginManager manager) {
        for (String operator : manager.getAllOperators()) {
            lexer.unregister(Pattern.sanitize(operator), TokenType.OP);
        }
        for (String function : manager.getAllFunctions()) {
            lexer.unregister(Pattern.sanitize(function), TokenType.FUNCTION);
        }
    }

}
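
The class above delegates matching to the project's Lexer, whose internals are not shown here. As a rough illustration of the idea (longest match wins, with a priority used to break ties), here is a minimal self-contained sketch built on java.util.regex. TokenizerSketch, Kind, Rule, and the priority values are all hypothetical names and numbers made up for this example, and the assumption that a higher priority wins a tie is not confirmed by the source. The sketch also drops whitespace tokens and skips empty matches for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the project's lexer; this is NOT the Abacus API.
class TokenizerSketch {

    // Token kinds; the priority values are assumptions made for this example.
    enum Kind {
        WHITESPACE(0), NUM(1), VARIABLE(1), FUNCTION(2), OPEN_PARENTH(3), CLOSE_PARENTH(3);

        final int priority;

        Kind(int priority) {
            this.priority = priority;
        }
    }

    // Pairs a compiled pattern with the kind of token it produces.
    static final class Rule {
        final Pattern pattern;
        final Kind kind;

        Rule(String regex, Kind kind) {
            this.pattern = Pattern.compile(regex);
            this.kind = kind;
        }
    }

    // Mirrors the registrations in the constructor above, plus one sample function.
    private static final List<Rule> RULES = Arrays.asList(
            new Rule(" ", Kind.WHITESPACE),
            new Rule("[0-9]*(\\.[0-9]+)?", Kind.NUM),
            new Rule("\\(", Kind.OPEN_PARENTH),
            new Rule("\\)", Kind.CLOSE_PARENTH),
            new Rule("[a-zA-Z]+", Kind.VARIABLE),
            new Rule(Pattern.quote("sin"), Kind.FUNCTION));

    // Returns "KIND:text" entries for every non-whitespace token in the input.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int index = 0;
        while (index < input.length()) {
            Rule best = null;
            int bestEnd = index;
            for (Rule rule : RULES) {
                Matcher matcher = rule.pattern.matcher(input);
                matcher.region(index, input.length());
                // Skip rules that do not match here or that match the empty string.
                if (!matcher.lookingAt() || matcher.end() == index) {
                    continue;
                }
                boolean longer = matcher.end() > bestEnd;
                boolean winsTie = matcher.end() == bestEnd && best != null
                        && rule.kind.priority > best.kind.priority;
                if (longer || winsTie) {
                    best = rule;
                    bestEnd = matcher.end();
                }
            }
            if (best == null) {
                throw new IllegalArgumentException("Unexpected character at index " + index);
            }
            if (best.kind != Kind.WHITESPACE) {
                tokens.add(best.kind + ":" + input.substring(index, bestEnd));
            }
            index = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("sin(3.5) x"));
    }
}
```

In this sketch "sin" matches both the VARIABLE and FUNCTION rules with the same length, so the tie goes to FUNCTION because of its higher assumed priority, while an input such as "sinx" would come out as a single VARIABLE because the longer match wins first.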
Implement a LexerTokenizer and a ShuntingYard parser. These are basically two pieces of the old TreeBuilder, but decoupled and reimplemented conventionally. 2017-07-29 21:37:32 -07:00
Format code. 2017-07-30 21:11:32 -07:00
recognise variables 2017-08-07 15:03:14 -07:00