Since this is a temporary solution in the time that we are waiting for the `#[span]` hint, And there are already other workarounds used in our `ast_codegen` I propose removing it right away - sorry in my opinion adding it in the first place was a mistake - in favor of adding an edge case in the codegen. It is better to do the refactoring in the codegen instead of the production code which people may depend on.
`Atom` is just a wrapper around `&str`, so better not to pass `&Atom` to functions, as that's a double-reference. Prefer `Atom` or `&str` instead to avoid indirection.
Follow-on after #3296.
Make parsing binary/octal/hex numeric literals a little more efficient.
These changes all rely on that we know more than the compiler does -
that strings passed to these `parse_*` functions can only contain a
certain set of characters.
## What This PR Does
- perf(lexer): use bit shifting when parsing hex, octal, and binary
integers instead of `mul_add`-ing on `f64`s. Check out the difference in
assembly generated [here](https://godbolt.org/z/zMEKaeYzh)
- perf(lexer): skip redundant utf8 check when parsing BigInts
- refactor(lexer): remove `unsafe` usage (as per @overlookmotel's
request
[here](https://github.com/oxc-project/oxc/pull/3283#issuecomment-2111814598))
- test(lexer): add numeric parsing unit tests
I don't expect this PR to have a large performance improvement, since
the most common case (`Kind::Decimal`) is not affected. We could do
this, however, by splitting `Kind::Decimal` into `Kind::DecimalFloat`
and `Kind::DecimalInt` when the lexer encounters a `.`
## What This PR Does
Updates numeric literal token lexing to record when separator characters
(`_`) are found in a new `Token` flag. This then gets passed to
`parse_int` and `parse_float`, removing the need for a second `_` check
in those two functions.
When run locally, I see no change to lexer benchmarks and minor
improvements to codegen benchmarks. For some reason, semantic and source
map benches seem to be doing slightly worse.
Note that I attempted to implement this with `bitflags!` (making
`escaped` and `is_on_newline` flags as well) and this caused performance
degradation. My best guess is that it turned reads on these flags from a
`mov` to a `mov` + a binary and.
---------
Co-authored-by: Boshen <boshenc@gmail.com>
part of #3213
We should only have one diagnostic struct instead 353 copies of them, so we don't end up choking LLVM with 50k lines of the same code due to monomorphization.
If the proposed approach is good, then I'll start writing a codemod to turn all the existing structs to plain functions.
---
Background:
Using `--timings`, we see `oxc_linter` is slow on codegen (the purple part).

The crate currently contains 353 miette errors. [cargo-llvm-lines](https://github.com/dtolnay/cargo-llvm-lines) displays
```
cargo llvm-lines -p oxc_linter --lib --release
Lines Copies Function name
----- ------ -------------
830350 33438 (TOTAL)
29252 (3.5%, 3.5%) 808 (2.4%, 2.4%) <alloc::boxed::Box<T,A> as core::ops::drop::Drop>::drop
23298 (2.8%, 6.3%) 353 (1.1%, 3.5%) miette::eyreish::error::object_downcast
19062 (2.3%, 8.6%) 706 (2.1%, 5.6%) core::error::Error::type_id
12610 (1.5%, 10.1%) 65 (0.2%, 5.8%) alloc::raw_vec::RawVec<T,A>::grow_amortized
12002 (1.4%, 11.6%) 706 (2.1%, 7.9%) miette::eyreish::ptr::Own<T>::boxed
9215 (1.1%, 12.7%) 115 (0.3%, 8.2%) core::iter::traits::iterator::Iterator::try_fold
9150 (1.1%, 13.8%) 1 (0.0%, 8.2%) oxc_linter::rules::RuleEnum::read_json
8825 (1.1%, 14.9%) 353 (1.1%, 9.3%) <miette::eyreish::error::ErrorImpl<E> as core::error::Error>::source
8822 (1.1%, 15.9%) 353 (1.1%, 10.3%) miette::eyreish::error::<impl miette::eyreish::Report>::construct
8119 (1.0%, 16.9%) 353 (1.1%, 11.4%) miette::eyreish::error::object_ref
8119 (1.0%, 17.9%) 353 (1.1%, 12.5%) miette::eyreish::error::object_ref_stderr
7413 (0.9%, 18.8%) 353 (1.1%, 13.5%) <miette::eyreish::error::ErrorImpl<E> as core::fmt::Display>::fmt
7413 (0.9%, 19.7%) 353 (1.1%, 14.6%) miette::eyreish::ptr::Own<T>::new
6669 (0.8%, 20.5%) 39 (0.1%, 14.7%) alloc::raw_vec::RawVec<T,A>::try_allocate_in
6173 (0.7%, 21.2%) 353 (1.1%, 15.7%) miette::eyreish::error::<impl miette::eyreish::Report>::from_std
6027 (0.7%, 21.9%) 70 (0.2%, 16.0%) <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
6001 (0.7%, 22.7%) 353 (1.1%, 17.0%) miette::eyreish::error::object_drop
6001 (0.7%, 23.4%) 353 (1.1%, 18.1%) miette::eyreish::error::object_drop_front
5648 (0.7%, 24.1%) 353 (1.1%, 19.1%) <miette::eyreish::error::ErrorImpl<E> as core::fmt::Debug>::fmt
```
It's totalling more than 50k llvm lines, and is putting pressure on rustc codegen (the purple part on `oxc_linter` in the image above.
---
It's pretty obvious by looking at https://github.com/zkat/miette/blob/main/src/eyreish/error.rs, the generics can expand out to lots of code.