| Copyright | (c) Alec Theriault 2017-2018 |
|---|---|
| License | BSD-style |
| Maintainer | alec.theriault@gmail.com |
| Stability | experimental |
| Portability | portable |
| Safe Haskell | None |
| Language | Haskell2010 |
Language.Rust.Parser.Lexer
Contents
Description
As much as possible, this follows Rust's choices for tokenization, including punting some things to
the parser. For instance, the last two > in Vec<Option<i32>> are lexed as a single
GreaterGreater token while the last two tokens of Vec<Option<Option<i32>>> are
GreaterGreater and Greater.
Yet weirder (but very useful in parsing for dealing with conflicts and precedences of logical and,
bitwise and, and unary reference), &&&x&&&y lexes into AmpersandAmpersand, Ampersand,
, IdentTok "x"AmpersandAmpersand, Ampersand, . Although the parser sometimes
needs to "break apart" tokens, it never has to think about putting them together. That means it can
easily figure out that IdentTok "y"&&&x&&&y parses as &(&(&x)) && (&y) and not &(&(&x)) & (&(&y)) even if
bitwise conjunctions bind more tightly that logical conjunctions.
This sort of amguity where one token need to be broken up by the parser occurs for
&&in patterns like&&mut x||in closures with no arguments like|| x<<in qualified type paths likeFromIterator<<A as IntoIterator>::Item>>>in qualified paths like<Self as Foo<T>>::Bar>=in equality predicates likeF<A>=i32>>=in equality predicates likeF<G<A>>=i32
- lexToken :: P (Spanned Token)
- lexNonSpace :: P (Spanned Token)
- lexTokens :: P (Spanned Token) -> P [Spanned Token]
- lexShebangLine :: P (Maybe String)
- data Token
- = Equal
- | Less
- | Greater
- | Ampersand
- | Pipe
- | Exclamation
- | Tilde
- | Plus
- | Minus
- | Star
- | Slash
- | Percent
- | Caret
- | GreaterEqual
- | GreaterGreaterEqual
- | AmpersandAmpersand
- | PipePipe
- | LessLess
- | GreaterGreater
- | EqualEqual
- | NotEqual
- | LessEqual
- | LessLessEqual
- | MinusEqual
- | AmpersandEqual
- | PipeEqual
- | PlusEqual
- | StarEqual
- | SlashEqual
- | CaretEqual
- | PercentEqual
- | At
- | Dot
- | DotDot
- | DotDotEqual
- | DotDotDot
- | Comma
- | Semicolon
- | Colon
- | ModSep
- | RArrow
- | LArrow
- | FatArrow
- | Pound
- | Dollar
- | Question
- | OpenDelim !Delim
- | CloseDelim !Delim
- | LiteralTok LitTok (Maybe Name)
- | IdentTok Ident
- | LifetimeTok Ident
- | Space Space Name
- | Doc String !AttrStyle !Bool
- | Shebang
- | Eof
- | Interpolated (Nonterminal Span)
- lexicalError :: P a
Lexing
lexToken :: P (Spanned Token) Source #
Lexer for one Token. The only token this cannot produce is Interpolated.
lexNonSpace :: P (Spanned Token) Source #
Lexer for one non-whitespace Token. The only tokens this cannot produce are Interpolated
and Space (which includes comments that aren't doc comments).
lexTokens :: P (Spanned Token) -> P [Spanned Token] Source #
Apply the given lexer repeatedly until (but not including) the Eof token. Meant for debugging
purposes - in general this defeats the point of a threaded lexer.
lexShebangLine :: P (Maybe String) Source #
Lex the first line, if it immediately starts with #! (but not #![ - that should be an
inner attribute). If this fails to find a shebang line, it consumes no input.
Tokens
A general token (based on syntax::parse::token::Token).
Unlike its libsyntax counterpart, Token has folded in syntax::parse::token::BinOpToken
and syntax::parse::token::BinOpEqToken as regular tokens.
Constructors
| Equal |
|
| Less |
|
| Greater |
|
| Ampersand |
|
| Pipe |
|
| Exclamation |
|
| Tilde |
|
| Plus |
|
| Minus |
|
| Star |
|
| Slash |
|
| Percent |
|
| Caret |
|
| GreaterEqual |
|
| GreaterGreaterEqual |
|
| AmpersandAmpersand |
|
| PipePipe |
|
| LessLess |
|
| GreaterGreater |
|
| EqualEqual |
|
| NotEqual |
|
| LessEqual |
|
| LessLessEqual |
|
| MinusEqual |
|
| AmpersandEqual |
|
| PipeEqual |
|
| PlusEqual |
|
| StarEqual |
|
| SlashEqual |
|
| CaretEqual |
|
| PercentEqual |
|
| At |
|
| Dot |
|
| DotDot |
|
| DotDotEqual |
|
| DotDotDot |
|
| Comma |
|
| Semicolon |
|
| Colon |
|
| ModSep |
|
| RArrow |
|
| LArrow |
|
| FatArrow |
|
| Pound |
|
| Dollar |
|
| Question |
|
| OpenDelim !Delim | One of |
| CloseDelim !Delim | One of |
| LiteralTok LitTok (Maybe Name) | a literal token with an optional suffix (something like |
| IdentTok Ident | an arbitrary identifier (something like |
| LifetimeTok Ident | a lifetime (something like |
| Space Space Name | whitespace |
| Doc String !AttrStyle !Bool | doc comment with its contents, whether it is outer/inner, and whether it is inline or not |
| Shebang |
|
| Eof | end of file token |
| Interpolated (Nonterminal Span) | can be expanded into several tokens in macro-expansion |
Error reporting
lexicalError :: P a Source #
Signal a lexical error.