Generic very simple lexer, capable of matching tokens defined by regular expressions.
The Lexer works on an input string, sequentially building a list of matched Tokens (or throwing an Exception if the input string cannot be tokenized).
The Lexer is context independent and performs no lookahead; it cannot, for example, distinguish between - used as a binary subtraction operator and - used as unary negation. This is handled by the parser.
Tokens are added to the Lexer as TokenDefinition instances and are stored in an ordered list. Hence some care has to be taken when defining the Lexer (see the implementation of StdMathLexer). For example, if we want the lexer to recognize sin as well as sinh as separate tokens, the more specific sinh pattern should be added to the Lexer before sin.
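The ordered-list, first-match behaviour described above can be sketched as follows. This is an illustrative Python sketch, not the library's PHP implementation: the class names mirror the library's Lexer and TokenDefinition, but the add method, the tuple-based tokens, and the ValueError are assumptions made for the example.

```python
import re

class TokenDefinition:
    """A (pattern, name) pair; hypothetical stand-in for the library's class."""
    def __init__(self, pattern, name):
        self.regex = re.compile(pattern)
        self.name = name

class Lexer:
    def __init__(self):
        self.definitions = []  # ordered: earlier entries win

    def add(self, definition):
        self.definitions.append(definition)

    def tokenize(self, text):
        tokens, pos = [], 0
        while pos < len(text):
            # Try definitions in insertion order; first match wins.
            for d in self.definitions:
                m = d.regex.match(text, pos)
                if m:
                    tokens.append((d.name, m.group(0)))
                    pos = m.end()
                    break
            else:
                # No definition matched at this position.
                raise ValueError(f"unknown token at position {pos}")
        return tokens

lexer = Lexer()
lexer.add(TokenDefinition(r'sinh', 'SINH'))  # more specific pattern first
lexer.add(TokenDefinition(r'sin', 'SIN'))
lexer.add(TokenDefinition(r'\(', 'LPAREN'))
lexer.add(TokenDefinition(r'\)', 'RPAREN'))
lexer.add(TokenDefinition(r'x', 'VAR'))

print(lexer.tokenize('sinh(x)'))
# [('SINH', 'sinh'), ('LPAREN', '('), ('VAR', 'x'), ('RPAREN', ')')]
```

Because SINH is registered before SIN, the input sinh(x) lexes as one SINH token; with the order reversed, the lexer would consume sin and then fail on the leftover h.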
MathParser\Lexing\Lexer::tokenize ( $input )
Convert an input string to a list of tokens.
Using the list of known tokens, sequentially match the input string against the known token definitions. Note that the first matching token from the list is chosen, so if some tokens share parts of their patterns (e.g. sin and sinh), care should be taken to add sinh before sin; otherwise the lexer will never match sinh.
- Parameters
  string $input: String to tokenize.
- Return values
  Token[]: Sequence of recognized tokens.
- Exceptions
  UnknownTokenException: thrown when encountering characters in the input string that do not match any known token.
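To illustrate the failure mode behind UnknownTokenException, here is a minimal, self-contained Python sketch (the library itself is PHP; the tokenize function and the ValueError below are illustrative stand-ins, not the library's API). It shows how registering sin before sinh makes the input sinh unlexable.

```python
import re

def tokenize(text, definitions):
    """Minimal first-match tokenizer over an ordered list of
    (name, compiled-regex) pairs; illustrative only."""
    tokens, pos = [], 0
    while pos < len(text):
        for name, regex in definitions:
            m = regex.match(text, pos)
            if m:
                tokens.append((name, m.group(0)))
                pos = m.end()
                break
        else:
            # Analogous to the library throwing UnknownTokenException.
            raise ValueError(f"unknown token at position {pos}")
    return tokens

# With 'sin' listed before 'sinh', the input 'sinh' lexes as SIN
# followed by an unmatchable 'h', triggering the error:
defs = [('SIN', re.compile('sin')), ('SINH', re.compile('sinh'))]
try:
    tokenize('sinh', defs)
except ValueError as e:
    print(e)  # unknown token at position 3
```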