Revert #5192 and add a comment that it's not a perf gain.
This was really surprising to me, but the benchmarks do demonstrate it.
Please see the benchmarks commit-by-commit on this PR. Adding `#[inline]` to the function does give a +1% gain, but that is no better than it was before #5192. So I think it's preferable to just revert to the simpler original.
I think the likely explanation is that the compiler is already performing this optimization itself. And if it does the optimization itself, it understands the code better and can make better decisions about inlining.
https://godbolt.org/z/xzhWWeMoe seems to demonstrate this - there are 2 calls to `Item::gen` in the generated assembly, so it has split the loop into 2.
Tries to fix: https://github.com/rolldown/rolldown/issues/2013
1. Previously we only considered the case where the AST is untouched, but consider
this scenario:
```js
const a = /*__PURE__*/ test(),
// ^^^ ^^^^^^ is removed during transform
b = a();
```
With the previous algorithm, `PURE` would be attached to `b =
a()`.
2. Now, we try to attach comments as much as possible, unless the
comment is separated from the node by non-whitespace content. For the case above, `PURE` will not
be attached to `a()` since the content between `/*__PURE__*/` and `b = a()`
is not all whitespace.
3. We added back `MoveMap` for this special case:
```js
/*__NODE_SIDE_EFFECTS__*/ export const c = 100;
// ^^^^^^^^^^^^^^^^^^^^^ should be attached to first declarator,
// ^^^^^^ are not whitespace
```
`LineOffsetTables` records mappings from byte offset to line and column numbers (with column number in UTF-16 characters).
Most lines do not contain any non-ASCII characters, and for these lines there is an exact correspondence between the number of bytes from the start of the line and the UTF-16 column number, so no column lookup table is required.
Reduce the data stored for each line from 32 bytes to 8 bytes by storing the column offset lookup tables separately for the rare lines which do contain non-ASCII characters.
Additionally, store column lookup tables as a `Box<[u32]>` instead of `Vec<u32>` to reduce the size of `ColumnOffsets` by 8 bytes.
Oxc has a 4 GiB limit on the size of source files, so `u32` is sufficient to hold line and column offsets. Use `u32` for these values in `LineOffsetTable`, which reduces the size of the type by 8 bytes.
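For illustration, the layout described above might look something like the following. This is a simplified sketch, not the actual implementation: `Line`, `LineTable`, `ASCII_LINE` and the field names are made up for the example, and the sentinel-index scheme is an assumption.

```rust
/// Sentinel meaning "this line is pure ASCII, no column table needed".
const ASCII_LINE: u32 = u32::MAX;

/// Per-line entry: two `u32`s, so 8 bytes per line.
struct Line {
    /// Byte offset of the start of the line within the source.
    byte_offset: u32,
    /// Index into `LineTable::column_offsets`, or `ASCII_LINE`.
    column_offsets_id: u32,
}

struct LineTable {
    lines: Vec<Line>,
    /// Only for non-ASCII lines: UTF-16 column for each byte of the line.
    column_offsets: Vec<Box<[u32]>>,
}

impl LineTable {
    fn new(source: &str) -> Self {
        let mut lines = Vec::new();
        let mut column_offsets: Vec<Box<[u32]>> = Vec::new();
        let mut start = 0usize;
        for line in source.split_inclusive('\n') {
            let id = if line.is_ascii() {
                ASCII_LINE
            } else {
                let mut cols = Vec::with_capacity(line.len() + 1);
                let mut col = 0u32;
                for ch in line.chars() {
                    // Every byte of a char maps to that char's column.
                    for _ in 0..ch.len_utf8() {
                        cols.push(col);
                    }
                    col += ch.len_utf16() as u32;
                }
                cols.push(col); // entry for the end-of-line offset
                column_offsets.push(cols.into_boxed_slice());
                (column_offsets.len() - 1) as u32
            };
            lines.push(Line { byte_offset: start as u32, column_offsets_id: id });
            start += line.len();
        }
        // An empty source or a trailing newline starts one more (empty) line.
        if source.is_empty() || source.ends_with('\n') {
            lines.push(Line { byte_offset: start as u32, column_offsets_id: ASCII_LINE });
        }
        Self { lines, column_offsets }
    }

    /// Zero-based (line, UTF-16 column) for a byte offset into the source.
    fn line_and_column(&self, offset: u32) -> (u32, u32) {
        // Last line whose start is <= offset.
        let idx = self.lines.partition_point(|l| l.byte_offset <= offset) - 1;
        let line = &self.lines[idx];
        let byte_col = offset - line.byte_offset;
        let col = if line.column_offsets_id == ASCII_LINE {
            byte_col // ASCII line: byte offset and UTF-16 column coincide
        } else {
            self.column_offsets[line.column_offsets_id as usize][byte_col as usize]
        };
        (idx as u32, col)
    }
}
```

The common ASCII-only path touches just the 8-byte `Line` entry; the boxed slices are only consulted for lines that actually need them.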
After studying google closure compiler, I'm leaning towards a multi-ast-pass infrastructure for the minifier.
This is one of the few places where we are going to favor maintainability over performance, given that the goal of the minifier is compression size, not performance.
All of the terminology and separation of concerns are aligned with google closure compiler.
The infrastructure of `terser` and `esbuild` is not suitable for us to study or pursue. Their code is so tightly coupled that I failed to comprehend either of them every time I tried to walk through a piece of optimization. Google closure compiler, despite being written in Java, is actually the most readable minifier out there.
To improve performance between ast passes, I envision a change detection system over a portion of the code.
The benchmark will demonstrate the performance regression of running 5 ast passes instead of 2.
To complete this PR, I need to figure out "fix-point" and order of these ast passes.
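As a rough picture of what "fix-point" means here, the sketch below runs a list of passes repeatedly until a full round reports no change. It uses a toy `Expr` type and two made-up passes (`fold_add`, `fold_neg`), not the oxc AST or the real minifier passes; the changed flag returned by each pass is the simplest possible form of change detection.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
    Neg(Box<Expr>),
}

/// A pass rewrites a single node and reports whether it changed anything.
type Pass = fn(&mut Expr) -> bool;

/// Post-order walk: children first, then the node itself.
fn walk(expr: &mut Expr, rewrite: Pass) -> bool {
    let mut changed = match expr {
        Expr::Add(l, r) => {
            let a = walk(l, rewrite);
            let b = walk(r, rewrite);
            a || b
        }
        Expr::Neg(e) => walk(e, rewrite),
        Expr::Num(_) => false,
    };
    changed |= rewrite(expr);
    changed
}

/// Toy pass: `Add(Num(a), Num(b))` becomes `Num(a + b)`.
fn fold_add(expr: &mut Expr) -> bool {
    let folded = match expr {
        Expr::Add(l, r) => match (&**l, &**r) {
            (Expr::Num(a), Expr::Num(b)) => Some(a + b),
            _ => None,
        },
        _ => None,
    };
    match folded {
        Some(n) => { *expr = Expr::Num(n); true }
        None => false,
    }
}

/// Toy pass: `Neg(Num(n))` becomes `Num(-n)`.
fn fold_neg(expr: &mut Expr) -> bool {
    let folded = match expr {
        Expr::Neg(e) => match &**e {
            Expr::Num(n) => Some(-*n),
            _ => None,
        },
        _ => None,
    };
    match folded {
        Some(n) => { *expr = Expr::Num(n); true }
        None => false,
    }
}

/// Runs all passes in order until a whole round changes nothing
/// (or `max_rounds` is hit). Returns the number of rounds executed.
fn run_to_fixed_point(expr: &mut Expr, passes: &[Pass], max_rounds: usize) -> usize {
    let mut rounds = 0;
    while rounds < max_rounds {
        rounds += 1;
        let mut changed = false;
        for &pass in passes {
            changed |= walk(expr, pass);
        }
        if !changed {
            break;
        }
    }
    rounds
}
```

Note that the final round does no useful work; it only confirms that nothing changed. That overhead, plus sensitivity to pass order, is part of what still needs to be worked out.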
Refactors a lot of case-insensitive comparisons from
```rust
a.to_lowercase() == b.to_lowercase()
```
to
```rust
a.eq_ignore_ascii_case(b)
```
These mostly happened when checking JSX props, so I'm expecting the most benefit from JSX-related rules.
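A small side-by-side of the two forms, for reference. The wrapper function names are just for illustration. `to_lowercase()` allocates a new `String` for each operand on every call, while `eq_ignore_ascii_case` compares in place without allocating.

```rust
/// Before: builds two temporary `String`s, then compares them.
fn eq_lowercase(a: &str, b: &str) -> bool {
    a.to_lowercase() == b.to_lowercase()
}

/// After: compares the existing bytes in place; only ASCII letters
/// are treated case-insensitively.
fn eq_ascii_case(a: &str, b: &str) -> bool {
    a.eq_ignore_ascii_case(b)
}
```

The two are not equivalent for non-ASCII input: `"É"` and `"é"` compare equal with `to_lowercase()` but not with `eq_ignore_ascii_case`. For names that are ASCII in practice, such as the JSX prop names checked here, that difference should not matter.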