oxc/crates/oxc_parser
overlookmotel 20679d1e1e
perf(parser): pad Token to 16 bytes (#2211)
Counter-intuitively, it seems that *increasing* the size of `Token`
improves performance slightly.

This appears to be because when `Token` is 16 bytes, copying `Token` is
a single 16-byte load/store. At present, it's 12 bytes which requires an
8-byte load/store + a 4-byte load/store.

https://godbolt.org/z/KPYsn3ab7

This suggests that either:

1. #2010 could be reverted at no cost, and the overhead of the hash
table removed.
or:
2. We need to get `Token` down to 8 bytes!

I have an idea how to *maybe* do (2), so I'd suggest leaving it as is
for now until I've been able to research that.

NB I also tried putting `#[repr(align(16))]` on `Token` so that copying
uses aligned loads/stores. That [hurt the benchmarks very
slightly](https://codspeed.io/overlookmotel/oxc/branches/lexer-pad-token),
though it might produce a gain on architectures where unaligned loads
are more expensive (ARM64 I think?). But I can't test that theory, so
have left it out.
2024-01-30 11:47:26 +08:00
..
examples chore(clippy): enable undocumented_unsafe_blocks 2023-10-16 15:18:14 +08:00
src perf(parser): pad Token to 16 bytes (#2211) 2024-01-30 11:47:26 +08:00
Cargo.toml chore(deps): update cargo (#2138) 2024-01-23 14:48:04 +08:00