oxc/crates/oxc_parser
overlookmotel 66a7a68f9f
perf(parser): lexer byte handlers consume ASCII chars faster (#2046)
In the lexer, most `BYTE_HANDLER`s immediately consume the current char
with `lexer.consume_char()`.

Byte handlers are only called if there's a certain value (or range of
values) for the next char. This is their entire purpose. So in all cases
we know for sure that we're not at EOF, and that the next char is a
single-byte ASCII character.

The compiler, however, doesn't seem to be able to "see through" the
`BYTE_HANDLERS[byte](self)` call and understand these invariants. So it
produces very verbose ASM for `lexer.consume_char()`.

This PR replaces `lexer.consume_char()` in the byte handlers with an
unsafe `lexer.consume_ascii_char()` which skips on to next char with a
single `inc` instruction.

The difference in codegen can be seen here:
https://godbolt.org/z/1ha3cr9W5 (compare the 2 x
`core::ops::function::FnOnce::call_once` handlers).

Downside is that this does introduce a lot of unsafe blocks, but in my
opinion they're all pretty trivial to validate.

---------

Co-authored-by: Boshen <boshenc@gmail.com>
2024-01-16 12:31:45 +08:00
..
examples chore(clippy): enable undocumented_unsafe_blocks 2023-10-16 15:18:14 +08:00
src perf(parser): lexer byte handlers consume ASCII chars faster (#2046) 2024-01-16 12:31:45 +08:00
Cargo.toml perf(parser): lexer byte handlers consume ASCII chars faster (#2046) 2024-01-16 12:31:45 +08:00