tmilos / lexer
Lexical analyzer with individual token definition with regular expressions
Installs: 983 473
Dependents: 2
Suggesters: 0
Security: 0
Stars: 21
Watchers: 3
Forks: 3
Open Issues: 1
Requires
- php: >=5.5
Requires (Dev)
- phpunit/phpunit: ~5.6
- satooshi/php-coveralls: ^1.0
This package is not auto-updated.
Last update: 2024-10-26 20:32:23 UTC
README
PHP implementation of Lexical Analyzer.
Warning This is not a GENERATOR like classical lex is. It does not produce any php code. It's a simple plain scanner of the given input string and tokenizer into given set of tokens by matching regular expressions. Thus, at runtime you can change the token definition and use one same code for any token set.
Token Definition
Tokens are defined with TokenDefinition
class that holds token name and regular expression. Token name can be
empty, and in that case, lexer will ignore/skip such tokens.
Lexer Configuration
The lexer configuration holds a list of all token definitions. With LexerArrayConfig
it can be easily created from an array
where keys are regular expressions and values are names of tokens.
Full scan
Lexer's static method scan($config, $input)
can be used to scan given input string and return an array of tokens.
Lexer with state
Instance of the Lexer
class be used to walk trough scanned tokens with single look-ahead token.
It's similar in API to doctrine/lexer, just tokens are defined and scanned differently, w/out
the need for recognizing the token type/name from the tokenized value - rather the token type/name is given by the same TokenDefn
that gave the regex to recognize the token.
Examples
<?php $config = new LexerArrayConfig([ '\\s' => '', '\\d+' => 'number', '\\+' => 'plus', '-' => 'minus', '\\*' => 'mul', '/' => 'div', ]); // static scan method that returns an array of $tokens = Lexer::scan($config, '2 + 3'); array_map(function ($t) { return $t->getName(); }, $tokens); // ['number', 'plus', 'number'] // lexer instance $lexer = new Lexer($config); $lexer->setInput('2 + 3'); $lexer->moveNext(); while ($lexer->getLookahead()) { print $lexer->getLookahead()->getName(); $lexer->moveNext(); }