# Pack tokenize -- design_notes.md
The library started as a small, lightweight set of predicates for a common but very limited form of lexing. As we extend it, we aim to keep the scope modest, striking a balance between ease of use and flexibility.
tokenize does not aspire to become an industrial-strength lexer generator. We
aim to serve most users' needs in the middle ground between raw input and a
structured form ready for parsing by a DCG.
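To make that division of labor concrete, here is a minimal sketch of the intended workflow. It assumes the pack's `tokenize/2` and a `word/1` token form; both the exact token shapes and the toy grammar are illustrative assumptions, not the library's documented behavior.

```prolog
:- use_module(library(tokenize)).  % assumed: the pack's main module

% A toy DCG that parses over the token list rather than raw characters.
greeting --> [word(hello)], [word(_Name)].

% tokenize("hello world", Tokens) is assumed to produce a list along the
% lines of [word(hello), spc(' '), word(world)]; once spacing tokens are
% filtered out, the DCG consumes the rest:
%
% ?- phrase(greeting, [word(hello), word(world)]).
% true.
```

The point is only the shape of the pipeline: tokenize flattens raw text into a token list, and all language-specific structure lives in the DCG that follows.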
If a user is parsing a language with reserved keywords
and wants to distinguish those keywords from variable names,
tokenize isn't going to
give them this out of the box. But it should provide an easy means of achieving
this result through a subsequent lexing pass.
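One way such a second pass might look, as a sketch: `keyword/1`, the `kw/1` token, and the `word/1` token form are all illustrative assumptions here, not part of the library's API.

```prolog
% Keywords of a hypothetical object language.
keyword(if).
keyword(then).
keyword(else).

% promote_keywords(+TokensIn, -TokensOut)
% Rewrites word/1 tokens whose atom is a keyword into kw/1 tokens,
% passing every other token through unchanged.
promote_keywords([], []).
promote_keywords([word(W)|Ts0], [kw(W)|Ts]) :-
    keyword(W), !,
    promote_keywords(Ts0, Ts).
promote_keywords([T|Ts0], [T|Ts]) :-
    promote_keywords(Ts0, Ts).
```

For example, `promote_keywords([word(if), word(x), word(then)], Ts)` binds `Ts = [kw(if), word(x), kw(then)]`.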
(Tokenization needs to return tokens represented with the same arity.)
`numbers, punctuation` should yield
`[pnct('-'), number(12), pnct('.'), number(3)]`, while
`punctuation, numbers` should yield