Commit graph

376 commits

Author SHA1 Message Date
Boshen
d43d404d61
fix: update lock file 2024-04-08 11:17:26 +08:00
renovate[bot]
cd6f4f1938
chore(deps): lock file maintenance rust crates (#2913) 2024-04-08 02:49:53 +00:00
Boshen
366a7fb0d4
Release crates v0.11.2 2024-04-03 19:36:54 +08:00
Boshen
54f7cd3978
Release crates v0.11.1 2024-04-03 16:57:52 +08:00
Boshen
93897c530c
chore: bump syn to v2 (#2888) 2024-04-02 20:57:09 +08:00
Boshen
a0532adc65
chore: bump criterion2, which removes clap and 10s compile time 2024-04-02 16:48:40 +08:00
Boshen
3ca6721783
ci: always run cargo shear 2024-03-31 16:06:22 +08:00
Boshen
293b9f482a
feat(transformer): add transform-typescript boilerplate (#2866) 2024-03-30 20:48:35 +08:00
Boshen
c1a2958a5a
chore: remove oxc_transformer for a reimplementation (#2865)
closes #2860
2024-03-30 17:19:46 +08:00
Boshen
31ed532b79
Release crates v0.11.0 2024-03-30 13:54:53 +08:00
underfin
b199cb89a2
feat: add oxc sourcemap crate (#2825)
The sourcemap implement port from
[rust-sourcemap](https://github.com/getsentry/rust-sourcemap), but has
some different with it.

- Encode sourcemap at parallel, including quote `sourceContent` and
encode token to `vlq` mappings.
- Avoid `Sourcemap` some methods overhead, like `SourceMap::tokens()`
caused extra overhead at common cases. Here using `SourceViewToken` to
instead of it.
2024-03-28 19:36:38 +08:00
renovate[bot]
8c6936ab74
chore(deps): lock file maintenance rust crates (#2827) 2024-03-26 16:04:53 +00:00
Boshen
e32a3b3783
ci: use cargo-shear (#2810) 2024-03-26 00:43:10 +08:00
renovate[bot]
3d761f17c2
chore(deps): lock file maintenance rust crates (#2804) 2024-03-25 17:24:30 +08:00
renovate[bot]
525031b7a2
chore(deps): update rust crates (#2802) 2024-03-25 09:47:26 +08:00
underfin
dc3f6e7570
chore: update sourcemap 8.0.0 (#2788) 2024-03-22 21:52:21 +08:00
Andi Pabst
4c5abb590e
feat(cli): wildcard expansion in paths for windows (#2767)
Unlike on other OS, on Windows there is no wildcard expansion/globbing
by the shell. Instead the application has to handle this. Therefore I
used the `glob` package to handle wildcards on Windows.

I also had to make the parent directory check more strict due to the
glob package resolving `..` in the middle of the path as well.

This closes #2695.
2024-03-22 00:21:30 +08:00
underfin
a2cfc867cb
feat: SourcemapVisualizer (#2773)
Export `SourcemapVisualizer` from codegen, it will be used oxc and
rolldown sourcemap test, so it support multiply source print, it will
using sourcemap `sourcesContent` as original source.
2024-03-21 11:19:09 +08:00
overlookmotel
508091314f
feat(tasks): benchmark NodeJS parser (#2770)
Add NodeJS parser to benchmarks.

Previous attempt #2724 did not work due CodSpeed producing very
inaccurate results (https://github.com/CodSpeedHQ/action/issues/96).

This version runs the actual benchmarks without CodSpeed's
instrumentation. Then another faux-benchmark runs within Codspeed's
instrumented action and just performs meaningless calculations in a loop
for as long as is required to take same amount of time as the original
uninstrumented benchmarks took.

It's unfortunate that we therefore don't get flame graphs on CodSpeed,
but this seems to be the best we can do for now.
2024-03-20 05:06:09 +00:00
renovate[bot]
1e9c0bc484
chore(deps): update rust crates (#2747) 2024-03-17 16:40:42 +00:00
Boshen
178d205d53
chore: criterion2 (#2737) 2024-03-16 18:55:04 +08:00
Boshen
a5ddb5b452
Release crates v0.10.0 2024-03-14 18:23:34 +08:00
Boshen
cbc2f5ff97
refactor: remove unused dependencies (#2718) 2024-03-14 17:50:31 +08:00
Boshen
697b6b70c0
feat: merge features serde and wasm to serialize (#2716)
This PR merges the previous confusing features `serde` and `wasm` into a
single `serialize` feature.

We'll eventually do serialize + type information for both wasm and napi
targets.

`oxc_macros` is removed from `oxc_ast`'s dependency because it requires
`syn` and friends, which goes against our policy ["Third-party
dependencies should be
minimal."](https://oxc-project.github.io/docs/contribute/rules.html#development-policy)
2024-03-14 17:13:12 +08:00
underfin
2dfc0cc28e
chore: upgrade sourcemap 7.1.1 (#2701) 2024-03-13 13:52:36 +08:00
renovate[bot]
b822b6d9eb
chore(deps): update rust crates (#2671) 2024-03-11 13:39:02 +08:00
overlookmotel
3c1e0db53f
refactor: reduce cfg_attr boilerplate with SerAttrs derive (#2669)
Closes #2641.

Also added `tsify` attribute to the `SerAttrs` derive macro, so `#[cfg_attr(feature = "wasm", tsify(...))]` can also be reduced to `#[tsify(...)]`.
2024-03-11 13:38:24 +08:00
Boshen
32303b20fb
New tool: oxc_module_lexer (#2650)
# Oxc Module Lexer

This is not a lexer. The name "lexer" is used for easier recognition.

## [es-module-lexer](https://github.com/guybedford/es-module-lexer)

Outputs the list of exports and locations of import specifiers,
including dynamic import and import meta handling.

Does not have any
[limitations](https://github.com/guybedford/es-module-lexer?tab=readme-ov-file#limitations)
mentioned in `es-module-lexer`.

I'll also work on the following cases to make this feature complete.

- [ ] get imported variables
https://github.com/guybedford/es-module-lexer/issues/163
- [ ] track star exports as imports as well
https://github.com/guybedford/es-module-lexer/issues/76
- [ ] TypeScript specific syntax
- [ ] TypeScript `type` import / export keyword

## [cjs-module-lexer](https://github.com/nodejs/cjs-module-lexer)

- [ ] TODO

## Benchmark

This is 2 times slower than `es-module-lexer`, but will be significantly
faster when TypeScript is processed.

The difference is around 10ms vs 20ms on a large file (700k).
2024-03-09 23:23:55 +08:00
Boshen
265b2fb640
feat: miette v7 (#2465) 2024-03-08 15:50:00 +08:00
Boshen
1f14d946aa
chore: update Cargo.toml and deny.yaml 2024-03-05 16:31:05 +08:00
Boshen
cca6eb073c
Release crates v0.9.0 2024-03-05 15:57:31 +08:00
cinchen
35ce3ccdc0
feat(linter): eslint-plugin-jest: prefer-to-have-length (#2580)
Rule Detail:
[link](https://github.com/jest-community/eslint-plugin-jest/blob/main/src/rules/prefer-to-have-length.ts)
2024-03-05 09:15:19 +08:00
renovate[bot]
9bd1d5b25e
chore(deps): update rust crates (#2589) 2024-03-04 11:23:06 +08:00
Boshen
8bb1084863
feat(codegen): add sourcemap (#2565)
Co-authored-by: underfin <2218301630@qq.com>
2024-03-03 14:44:49 +08:00
Boshen
c56b6cb643
refactor: replace InlinableString with CompactString for Atom (#2517)
relates #2516
2024-02-26 23:11:46 +08:00
Boshen
be6b8b7ce6
[BREAKING CHANGE] Change Atom to Atom<'a> to make it safe (#2497)
Part of #2295

This PR splits the `Atom` type into `Atom<'a>` and `CompactString`.

All the AST node strings now use `Atom<'a>` instead of `Atom` to signify
it belongs to the arena.

It is now up to the user to select which form of the string to use.

This PR essentially removes the really unsafe code 


93742f89e9/crates/oxc_span/src/atom.rs (L98-L107)

which can lead to 

![image](https://github.com/oxc-project/oxc/assets/1430279/8c513c4f-19b0-4b63-b61c-e07c187c95b5)
2024-02-26 19:34:40 +08:00
Boshen
4fabe66621
Publish crates v0.8.0 2024-02-26 19:01:51 +08:00
renovate[bot]
29b213eac7
chore(deps): update rust crates (#2503) 2024-02-26 10:38:11 +08:00
Boshen
f64c7e04a3
feat(linter): handle cjs module.exports.foo = bar and exports.foo = bar (#2492) 2024-02-24 23:54:43 +08:00
Boshen
7f867221ca
feat(linter): handle built-in modules in import/no_unresolved (#2479) 2024-02-23 22:10:27 +08:00
Boshen
fca6e02f84
refactor(cli): remove the codeowners feature (#2448)
I should've done a better job at selecting features. Every feature
requires more than just code to get it right.

linting by codeowners' files sounds good on paper but actually not that
useful.
2024-02-20 12:54:16 +08:00
renovate[bot]
daaea40d0e
chore(deps): update rust crate env_logger to 0.11.2 (#2430) 2024-02-19 10:32:34 +08:00
overlookmotel
90f9266d00
chore(deps): update bumpalo crate (#2417)
Latest version of `bumpalo` includes a couple of performance fixes for
`String` (e.g. https://github.com/fitzgen/bumpalo/pull/229) which may
help the parser a little.
2024-02-18 11:49:31 +08:00
Boshen
1cbd7539fb
feat(coverage): add prettier idempotency test (#2402)
closes #1329
2024-02-12 15:30:16 +08:00
Boshen
70a0076eed
refactor: remove global allocator from non-user facing apps (#2401)
The runtime performance gains does not out weight the compilation speed from
building the custom allocators, which takes about a minute to build on
slower machines.
2024-02-12 14:09:05 +08:00
renovate[bot]
e5fcf82b93
chore(deps): update rust crates (#2396) 2024-02-12 11:42:57 +08:00
overlookmotel
383f5b3081
perf(parser): consume multi-line comments faster (#2377)
Consume multi-line comments faster.

* Initially search for `*/`, `\r`, `\n` or `0xE2` (first byte of
irregular line breaks).
* Once a line break is found, switch to faster search which only looks
for `*/`, as it's not relevant whether there are more line breaks or
not.

Using `memchr` for the 2nd simpler search, as it's efficient for a
search with only one "needle".

Initializing `memchr::memmem::Finder` is fairly expensive, and tried
numerous ways to handle it. This is most performant way I could find.
Any ideas how to avoid re-creating it for each Lexer pass? (it can't be
a `static` as `Finder::new` is not a const function, and `lazy_static!`
is too costly)
2024-02-11 12:43:14 +08:00
Boshen
d6d921ea1f
Publish crates v0.7.0 2024-02-09 23:01:12 +08:00
overlookmotel
d3a59f27f7
perf(parser): lex identifiers as bytes not chars (#2352)
This PR re-implements lexing identifiers with a fast path for the most common case - identifiers which are pure ASCII characters, using the new `Source` / `SourcePosition` APIs.

Lexing identifiers is a hot path, and accounts for the majority of the time the Lexer spends. The performance bump from this change is (if I do say so myself!) quite decent.

I've spent a lot of time tuning the implementation, which gained a further 10-15% on the Lexer benchmarks compared to my first, simpler attempt. Some of the design decisions, if they look odd, are likely motivated by gains in performance.

### Techniques

This implementation uses a few different strategies for performance:

* Search byte-by-byte, not char-by-char.
* Process batches of 32 bytes at a time to reduce bounds checks.
* Mark uncommon paths `#[cold]`.

### Structure

The implementation is built in 3 layers:

1. ASCII characters only.
2. ASCII and Unicode characters.
3. `\` escape sequences (and all the above).

`identifier_name_handler` starts at the top layer, and is optimized for consuming ASCII as fast as possible. Each "layer" is considered more uncommon than the previous, and dropping down a layer is a de-opt.

I'm assuming that 95%+ of JavaScript code does not include either Unicode characters or escapes in identifiers, so the speed of the fast path is prioritised.

That said, once a Unicode character is encountered, the next layer does expect to find further Unicode characters, rather than de-opting over and over again. If an identifier *starts* with a Unicode character, it enters the code straight on the 2nd layer, so is not penalised by going through a `#[cold]` boundary. Lexing Unicode is never going to be as fast as ASCII, but still I felt it was important not to penalise it unnecessarily, so as not to be Anglo-centric.

### ASCII search macro

The main ASCII search is implemented as a macro. I found that, for reasons I don't understand, it's significantly faster to have all the code in a single function, even compared to multiple functions marked `#[inline]` or `#[inline(always)]`. The fastest implementation also requires some code to be repeated twice, which is nicer to do with a macro.

This macro, and the `ByteMatchTable` types that go with it, are designed to be re-usable. Next step will be to apply them for whitespace and strings, which should be fairly simple.

Searching in batches of 32 bytes is also designed to be forward-compatible with SIMD.

### Bye bye `AutoCow`

`AutoCow` is removed. Instead, a string-builder is only created if it's needed, when a `\` escape is first encountered. The string builder is also more efficient than `AutoCow` was, as it copies bytes in chunks, rather than 1-by-1.

This won't make much difference for identifiers, as escapes are so rare anyway, but this same technique can be used for strings, where they're more common.
2024-02-09 12:01:30 +08:00
renovate[bot]
41d1876650
chore(deps): update rust crates (#2302) 2024-02-05 14:36:53 +08:00