The sourcemap [`debugId`
proposal](https://github.com/tc39/source-map/blob/main/proposals/debug-id.md)
adds globally unique build or debug IDs to source maps and generated
code, making build artifacts self-identifying.
Support for debug IDs was added to
[`rust-sourcemap`](https://github.com/getsentry/rust-sourcemap/pull/66)
in 2023, and Sentry uses it to match source files with their sourcemaps
without having to worry about path mismatches or release versions.
I want to add debug ID support to Rolldown but it uses `oxc::sourcemap`
so it looks like I need to start here first!
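
To illustrate what "self-identifying" means in practice, here is a minimal sketch that reads a debug ID back out of generated code. It assumes the `//# debugId=<id>` comment form described in the proposal; `find_debug_id` is a hypothetical helper and is not part of `oxc::sourcemap` or `rust-sourcemap`.

```rust
/// Hypothetical helper (not an API of any existing crate): return the ID from
/// the last `//# debugId=<id>` comment in a generated file, if there is one.
fn find_debug_id(generated: &str) -> Option<&str> {
    generated
        .lines()
        .rev() // the comment normally sits near the end of the file
        .find_map(|line| line.trim().strip_prefix("//# debugId="))
        .map(str::trim)
}
```

A tool could compare this value with the `debugId` field of a candidate sourcemap instead of relying on file paths.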
Introduce new method `ConcatSourceMapBuilder::from_sourcemaps`.
When all the sourcemaps being concatenated already exist at the time you
create the `ConcatSourceMapBuilder`, it's faster to use `from_sourcemaps`,
because it pre-allocates enough space for the data it will hold and so
avoids reallocating and copying memory as the builder grows.
Before:
```rs
let mut builder = ConcatSourceMapBuilder::default();
builder.add_sourcemap(&sourcemap1, 0);
builder.add_sourcemap(&sourcemap2, 100);
builder.add_sourcemap(&sourcemap3, 200);
let combined = builder.into_sourcemap();
```
After:
```rs
let builder = ConcatSourceMapBuilder::from_sourcemaps(&[
    (&sourcemap1, 0),
    (&sourcemap2, 100),
    (&sourcemap3, 200),
]);
let combined = builder.into_sourcemap();
```
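
The pre-allocation idea can be sketched independently of the real builder. `MiniMap` and `concat_preallocated` below are hypothetical stand-ins, not the actual `ConcatSourceMapBuilder` internals: the total size is computed first, so the output buffer is allocated once.

```rust
/// Hypothetical stand-in for a sourcemap: just the generated line of each token.
struct MiniMap {
    token_lines: Vec<u32>,
}

/// Concatenate token lines, shifting each map by its line offset.
/// Because every input is known up front, the output is sized exactly once.
fn concat_preallocated(maps: &[(&MiniMap, u32)]) -> Vec<u32> {
    let total: usize = maps.iter().map(|(map, _)| map.token_lines.len()).sum();
    let mut out = Vec::with_capacity(total);
    for (map, line_offset) in maps {
        out.extend(map.token_lines.iter().map(|line| line + line_offset));
    }
    out
}
```

Adding maps one at a time cannot know `total` in advance, so the buffer may have to grow and be copied several times.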
Speed up source map VLQ encoding by removing a couple of operations from `serialize_mappings`'s hot loop.
On a local benchmark of just VLQ encoding, this change produces a 5% performance increase (benchmarked on a MacBook Pro M1).
Clone `Arc<str>`s for source text instead of creating new `Arc<str>`s and copying the string data.
For the shorter strings (names and source filenames) it's cheaper to create a new `Arc<str>` than to clone, presumably because of the overhead of atomic operations involved in `Arc::clone`.
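
The difference between the two approaches, sketched with the standard library only. The function names are illustrative and do not come from the crate.

```rust
use std::sync::Arc;

/// Share a (potentially large) source text: only the reference count is
/// bumped atomically; the string data is not copied.
fn share_source_text(text: &Arc<str>) -> Arc<str> {
    Arc::clone(text)
}

/// Copy a short string (a name or source filename) into a fresh `Arc<str>`:
/// one allocation plus a small copy, with no atomic operation on the original.
fn copy_short_string(name: &str) -> Arc<str> {
    Arc::from(name)
}
```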
In source map VLQ encoding, keep a local copy of the previous `Token`, rather than looking it up from `tokens`.
On a local benchmark of just VLQ encoding, this change produces a 6% performance increase (benchmarked on a MacBook Pro M1).
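
A minimal sketch of the pattern, assuming a simplified `Token` with only generated positions (the real type has more fields). Mappings are delta-encoded, so each token is compared with the one before it; holding that previous token in a local avoids indexing back into the slice on every iteration.

```rust
/// Simplified, hypothetical token: generated line and column only.
#[derive(Clone, Copy)]
struct Token {
    dst_line: u32,
    dst_col: u32,
}

/// Compute the generated-column delta for each token. The column base resets
/// to 0 whenever a token starts a new generated line.
fn dst_col_deltas(tokens: &[Token]) -> Vec<i64> {
    let mut deltas = Vec::with_capacity(tokens.len());
    // Local copy of the previous token, instead of reading `tokens[i - 1]`.
    let mut prev = Token { dst_line: 0, dst_col: 0 };
    for &token in tokens {
        let base = if token.dst_line == prev.dst_line { prev.dst_col } else { 0 };
        deltas.push(i64::from(token.dst_col) - i64::from(base));
        prev = token;
    }
    deltas
}
```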
Reduce the number of operations in the main loop of source map VLQ encoding.
#4583 made pushing a byte to the output only 2 instructions, which makes it workable to repeat `push_byte_unchecked` both inside and outside the loop.
A local benchmark of just VLQ encoding shows this increases performance by 16% (on top of the 11% from #4583).
The main gain probably comes from creating a fast path for encoding `0`, which is common.
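
For reference, a minimal sketch of Base64 VLQ encoding with an explicit fast path for `0`. This uses safe `String::push` and is not the crate's actual `serialize_mappings` code; `encode_vlq` is an illustrative name.

```rust
const B64: &[u8; 64] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Append the Base64 VLQ encoding of `value` to `out`.
fn encode_vlq(out: &mut String, value: i64) {
    // Fast path: zero always encodes to the single digit 'A'.
    if value == 0 {
        out.push('A');
        return;
    }
    // The sign goes in the lowest bit, the magnitude in the bits above it.
    let mut vlq: u64 = if value < 0 {
        (value.unsigned_abs() << 1) | 1
    } else {
        (value as u64) << 1
    };
    loop {
        let mut digit = (vlq & 0b1_1111) as usize; // low 5 bits
        vlq >>= 5;
        if vlq != 0 {
            digit |= 0b10_0000; // continuation bit: more digits follow
        }
        out.push(B64[digit] as char);
        if vlq == 0 {
            break;
        }
    }
}
```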
In `oxc_sourcemap`'s VLQ encoding, avoid bounds checks when pushing bytes to the encoded string in the hot loop.
Those bounds checks are quite expensive as they involve a function call to `alloc::raw_vec::RawVec::grow_one`, and that happens on every single pushed byte.
https://godbolt.org/z/44G8jjss3
Not much difference on benchmarks, as VLQ encoding is only a small part of source map generation, but a local benchmark of just VLQ encoding shows this increases performance by 11%.
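
A sketch of the general technique, under the assumption that capacity is reserved once up front so each individual push can skip the check. The `push_byte_unchecked` below is written for illustration only and may differ from the function of the same name in `oxc_sourcemap`.

```rust
/// Push one ASCII byte without a capacity check (illustrative version).
///
/// # Safety
/// The caller must guarantee `out.len() < out.capacity()` and that `byte` is
/// ASCII, so the string stays valid UTF-8.
unsafe fn push_byte_unchecked(out: &mut String, byte: u8) {
    unsafe {
        let vec = out.as_mut_vec();
        let len = vec.len();
        vec.as_mut_ptr().add(len).write(byte);
        vec.set_len(len + 1);
    }
}

/// Reserve once, then push every byte without per-byte capacity checks.
fn push_ascii_bytes(out: &mut String, bytes: &[u8]) {
    assert!(bytes.is_ascii());
    out.reserve(bytes.len());
    for &byte in bytes {
        // SAFETY: `reserve` above guarantees room for all of `bytes`, and the
        // assertion guarantees every byte is ASCII.
        unsafe { push_byte_unchecked(out, byte) };
    }
}
```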
Also optimize memory allocation when escaping strings. The default buffer size in `serde_json` is 1024 for the `String` type; we pre-allocate `string.len() * 2 + 2` bytes for every string to reduce re-allocation during escaping.
I tried to hand-write a SIMD implementation, but it was too complex, so I use `v_jsonescape` here. It doesn't have SIMD implementations for `aarch64` or `wasm32`, though, so we need to contribute those upstream!
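
The allocation strategy can be sketched with a scalar escaper. This is a simplified stand-in, not the `v_jsonescape`-based code: `escape_json_string` is a hypothetical name, and it escapes all remaining control characters as `\u00XX`.

```rust
/// Quote and escape `s` as a JSON string, pre-allocating `len * 2 + 2` bytes:
/// room for both quotes plus a two-byte escape for every input byte.
fn escape_json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() * 2 + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Rare case: six bytes per character, so the buffer may still
            // grow here. The estimate is a heuristic, not a strict bound.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}
```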
Reduce memory copies when encoding source map as JSON, extending approach taken in #4476 to also avoid memory copies for source texts.
I believe the reason this shows no benefit on benchmarks is that our benchmarks only create a source map from a single source file; it should result in a speed-up when there are multiple sources.
Building `token_chunks` requires pre-visiting and collecting the tokens,
which could instead be done during the add-tokens phase. This change
exports it so that Rolldown can improve sourcemap encoding in `renderChunks`.
A Rolldown plugin hook can return a sourcemap as an object. Casting it
to a string in Node and then decoding it in Rust adds unnecessary JSON
overhead. This change exports a new function so that Rolldown can use
`JSONSourceMap` to generate a `SourceMap` directly.
The sourcemap implementation is ported from
[rust-sourcemap](https://github.com/getsentry/rust-sourcemap), but
differs from it in a few ways:
- Encode the sourcemap in parallel, including quoting `sourceContent` and
encoding tokens to `vlq` mappings.
- Avoid the overhead of some `SourceMap` methods. For example,
`SourceMap::tokens()` adds extra overhead in common cases, so
`SourceViewToken` is used instead.
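
The parallel quoting step can be sketched with standard-library scoped threads. This is an illustration under simplifying assumptions, not the crate's implementation: one thread is spawned per source (a thread pool would normally be preferable), and `quote_simplified` escapes only `"` and `\`. Both function names are hypothetical.

```rust
use std::thread;

/// Simplified quoting: wraps `s` in quotes and escapes only `"` and `\`.
fn quote_simplified(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        if c == '"' || c == '\\' {
            out.push('\\');
        }
        out.push(c);
    }
    out.push('"');
    out
}

/// Quote every source text on its own scoped thread, preserving input order.
fn quote_all_parallel(sources: &[&str]) -> Vec<String> {
    thread::scope(|scope| {
        let handles: Vec<_> = sources
            .iter()
            .map(|src| scope.spawn(move || quote_simplified(src)))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("quoting thread panicked"))
            .collect()
    })
}
```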