Commit graph

208 commits

Author SHA1 Message Date
Boshen
21509ae15b
chore: disable profile.dev debug symbols because we don't need it that much 2024-04-13 15:38:19 +08:00
Boshen
56c71e2b1e
chore: enable some rust lints 2024-04-13 15:38:00 +08:00
Boshen
b15bf2826b
feat(napi/parser): remove experimental flexbuffer api (#2957) 2024-04-13 14:59:31 +08:00
Boshen
bd56d51443
chore(macros): only select required features from syn to reduce compile time (#2955) 2024-04-13 13:37:59 +08:00
Boshen
f366d9bd7c
chore(minsize): remove brotlic because it takes too long to compile (#2954) 2024-04-13 13:24:25 +08:00
branchseer
f159f60084
Make ast types covariant over the allocator lifetime. (#2943)
## Why

Due to the usage of `&'alloc mut T` in `oxc_allocator::Box`, and
`bumpalo::collections::Vec` in `oxc_allocator::Vec`, ast types are
currently invariant over their allocator lifetime `'a`. This prevents
`ouroboros` from generating `borrow_*` on ast type fields, leading to
the unfriendly `with_*` api:
c250b288ef/crates/oxc_parser/examples/multi-thread.rs (L82-L84)

## How

- For `oxc_allocator::Vec`, switch to `allocator_api2::vec::Vec`, which
has a covariant relationship with the allocator lifetime.
- For `oxc_allocator::Box`, use `std::ptr::NonNull` which is
specifically designed to be covariant. I don't use
`allocator_api2::boxed::Box` because it holds the allocator for
dropping, so the size is bigger.

## Downside

Now that `oxc_allocator::Box` uses the unsafe `NonNull`, it has to be a
private field to be safe. This makes it impossible to do `Box(....)`
pattern matching.
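
## Illustration

A minimal sketch of the variance point above, not the actual `oxc_allocator` code. `ArenaBox`, `Node` and `shorten` are hypothetical names introduced only for this example. Because the pointer is stored as `NonNull<T>` rather than `&'alloc mut T`, the wrapper stays covariant over `'alloc`, so a value with a long lifetime can be used where a shorter one is expected:

```rust
use std::marker::PhantomData;
use std::ptr::NonNull;

/// Hypothetical stand-in for an arena-allocated box.
/// The pointer field is private so callers cannot forge an invalid pointer.
pub struct ArenaBox<'alloc, T>(NonNull<T>, PhantomData<(&'alloc (), T)>);

impl<'alloc, T> ArenaBox<'alloc, T> {
    /// Wrap a value that stays valid (and uniquely borrowed) for `'alloc`.
    pub fn new(value: &'alloc mut T) -> Self {
        Self(NonNull::from(value), PhantomData)
    }

    /// Read access to the boxed value.
    pub fn get(&self) -> &T {
        // SAFETY: the pointer was created from a valid `&'alloc mut T`,
        // and `'alloc` outlives `self`.
        unsafe { self.0.as_ref() }
    }
}

/// Hypothetical AST-like node holding an arena box.
pub struct Node<'a> {
    pub value: ArenaBox<'a, u32>,
}

/// Compiles only because `Node<'a>` is covariant over `'a`.
/// With a `&'a mut u32` field inside `ArenaBox` this would still compile for
/// `u32`, but any `T` that mentions `'a` would make the type invariant.
pub fn shorten<'long: 'short, 'short>(node: Node<'long>) -> Node<'short> {
    node
}
```

The trade-off described under "Downside" is visible here: the tuple field is private, so code outside the module cannot destructure `ArenaBox(..)` and has to go through `get`.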
2024-04-12 18:12:18 +08:00
Boshen
614f73b66c
Release crates v0.12.3 2024-04-11 16:18:17 +08:00
Boshen
09452659e2
Release crates v0.12.2 2024-04-08 11:13:13 +08:00
renovate[bot]
cd6f4f1938
chore(deps): lock file maintenance rust crates (#2913) 2024-04-08 02:49:53 +00:00
Boshen
366a7fb0d4
Release crates v0.11.2 2024-04-03 19:36:54 +08:00
Boshen
504698ab4a
chore: guard against unsafe code as much as possible. 2024-04-03 19:35:07 +08:00
Boshen
54f7cd3978
Release crates v0.11.1 2024-04-03 16:57:52 +08:00
Boshen
93897c530c
chore: bump syn to v2 (#2888) 2024-04-02 20:57:09 +08:00
Boshen
31ed532b79
Release crates v0.11.0 2024-03-30 13:54:53 +08:00
underfin
b199cb89a2
feat: add oxc sourcemap crate (#2825)
The sourcemap implementation is ported from
[rust-sourcemap](https://github.com/getsentry/rust-sourcemap), with
some differences:

- Encode the sourcemap in parallel, including quoting `sourceContent` and
encoding tokens to `vlq` mappings.
- Avoid the overhead of some `Sourcemap` methods. For example,
`SourceMap::tokens()` causes extra overhead in common cases, so
`SourceViewToken` is used instead.
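
For readers unfamiliar with the `vlq` mappings mentioned above, here is a minimal sketch of Base64 VLQ encoding as used by source maps. It is not the code from this commit: `encode_vlq` and `encode_segment` are hypothetical names, and the sketch runs sequentially rather than in parallel.

```rust
const BASE64: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Append the Base64 VLQ encoding of `value` to `out`.
pub fn encode_vlq(value: i32, out: &mut String) {
    // The sign is stored in the lowest bit, the magnitude in the bits above it.
    let sign = u64::from(value < 0);
    let mut rest = (u64::from(value.unsigned_abs()) << 1) | sign;
    loop {
        // Each Base64 character carries 5 data bits.
        let mut digit = (rest & 0b1_1111) as usize;
        rest >>= 5;
        if rest != 0 {
            // The 6th bit marks that more characters follow.
            digit |= 0b10_0000;
        }
        out.push(BASE64[digit] as char);
        if rest == 0 {
            break;
        }
    }
}

/// Encode one mapping segment, i.e. a short list of relative offsets.
pub fn encode_segment(fields: &[i32]) -> String {
    let mut out = String::new();
    for &field in fields {
        encode_vlq(field, &mut out);
    }
    out
}
```

Since each segment only depends on the offsets passed in, independent chunks of tokens can in principle be encoded on separate threads and concatenated afterwards, provided the relative offsets at chunk boundaries are computed first.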
2024-03-28 19:36:38 +08:00
renovate[bot]
8c6936ab74
chore(deps): lock file maintenance rust crates (#2827) 2024-03-26 16:04:53 +00:00
Boshen
95fc28168c
chore: apply cargo autoinherit (#2826)
See https://github.com/mainmatter/cargo-autoinherit
2024-03-26 23:57:50 +08:00
Ali Rezvani
3d0ea545ca
chore: using numeric value for profile.dev.debug. (#2820)
Change the `profile.dev.debug` value from `limited` to `1`, which is the
same thing according to
[this](https://doc.rust-lang.org/cargo/reference/profiles.html#debug).

For some reason, the string value was failing when running the codspeed
benchmark.

------

#### Edit:

It was resulting in the following error:

```
   failed to parse manifest at `/home/runner/work/oxc/oxc/Cargo.toml`
```

Related to #2812
2024-03-26 16:15:34 +08:00
Boshen
33eca79440
chore: try speeding up compilation by setting debug = "limited" for [profile.dev] (#2812) 2024-03-26 01:55:46 +08:00
renovate[bot]
525031b7a2
chore(deps): update rust crates (#2802) 2024-03-25 09:47:26 +08:00
Boshen
ef1108a749
chore: Rust v1.77.0 (#2781) 2024-03-21 17:21:57 +00:00
Andi Pabst
4c5abb590e
feat(cli): wildcard expansion in paths for windows (#2767)
Unlike on other operating systems, on Windows the shell performs no
wildcard expansion/globbing, so the application has to handle it itself.
I therefore used the `glob` package to handle wildcards on Windows.

I also had to make the parent directory check more strict due to the
glob package resolving `..` in the middle of the path as well.

This closes #2695.
2024-03-22 00:21:30 +08:00
renovate[bot]
1e9c0bc484
chore(deps): update rust crates (#2747) 2024-03-17 16:40:42 +00:00
Boshen
178d205d53
chore: criterion2 (#2737) 2024-03-16 18:55:04 +08:00
Boshen
125edb2650
refactor: remove unused dependencies (#2729) 2024-03-15 18:43:22 +08:00
Boshen
a5ddb5b452
Release crates v0.10.0 2024-03-14 18:23:34 +08:00
Boshen
1facc8d35d
chore: add clippy::cargo rules 2024-03-13 15:01:08 +08:00
renovate[bot]
b822b6d9eb
chore(deps): update rust crates (#2671) 2024-03-11 13:39:02 +08:00
Boshen
32303b20fb
New tool: oxc_module_lexer (#2650)
# Oxc Module Lexer

This is not a lexer. The name "lexer" is used for easier recognition.

## [es-module-lexer](https://github.com/guybedford/es-module-lexer)

Outputs the list of exports and locations of import specifiers,
including dynamic import and import meta handling.

Does not have any
[limitations](https://github.com/guybedford/es-module-lexer?tab=readme-ov-file#limitations)
mentioned in `es-module-lexer`.

I'll also work on the following cases to make this feature complete.

- [ ] get imported variables
https://github.com/guybedford/es-module-lexer/issues/163
- [ ] track star exports as imports as well
https://github.com/guybedford/es-module-lexer/issues/76
- [ ] TypeScript specific syntax
- [ ] TypeScript `type` import / export keyword

## [cjs-module-lexer](https://github.com/nodejs/cjs-module-lexer)

- [ ] TODO

## Benchmark

This is 2 times slower than `es-module-lexer`, but will be significantly
faster when TypeScript is processed.

The difference is around 10ms vs 20ms on a large file (700k).
2024-03-09 23:23:55 +08:00
Boshen
c72675e89e
chore: Rust v1.76.0 (#2643) 2024-03-08 20:54:36 +08:00
Boshen
265b2fb640
feat: miette v7 (#2465) 2024-03-08 15:50:00 +08:00
Boshen
1f14d946aa
chore: update Cargo.toml and deny.yaml 2024-03-05 16:31:05 +08:00
Boshen
cca6eb073c
Release crates v0.9.0 2024-03-05 15:57:31 +08:00
Boshen
fbb7a5a75c
fix: revert Cargo.toml change 2024-03-05 15:51:15 +08:00
Boshen
bf42158ad7
perf(parser): inline end_span and parse_identifier_kind which are on the hot path (#2612) 2024-03-05 15:39:53 +08:00
renovate[bot]
9bd1d5b25e
chore(deps): update rust crates (#2589) 2024-03-04 11:23:06 +08:00
Boshen
570ca68b1e
chore: bump Minimum Supported Rust Version to 1.74
closes #2514
2024-02-26 23:25:03 +08:00
Boshen
4fabe66621
Publish crates v0.8.0 2024-02-26 19:01:51 +08:00
renovate[bot]
29b213eac7
chore(deps): update rust crates (#2503) 2024-02-26 10:38:11 +08:00
Boshen
f64c7e04a3
feat(linter): handle cjs module.exports.foo = bar and exports.foo = bar (#2492) 2024-02-24 23:54:43 +08:00
overlookmotel
90f9266d00
chore(deps): update bumpalo crate (#2417)
Latest version of `bumpalo` includes a couple of performance fixes for
`String` (e.g. https://github.com/fitzgen/bumpalo/pull/229) which may
help the parser a little.
2024-02-18 11:49:31 +08:00
renovate[bot]
e5fcf82b93
chore(deps): update rust crates (#2396) 2024-02-12 11:42:57 +08:00
Boshen
d6d921ea1f
Publish crates v0.7.0 2024-02-09 23:01:12 +08:00
overlookmotel
d3a59f27f7
perf(parser): lex identifiers as bytes not chars (#2352)
This PR re-implements lexing identifiers with a fast path for the most common case - identifiers which are pure ASCII characters, using the new `Source` / `SourcePosition` APIs.

Lexing identifiers is a hot path, and accounts for the majority of the time the Lexer spends. The performance bump from this change is (if I do say so myself!) quite decent.

I've spent a lot of time tuning the implementation, which gained a further 10-15% on the Lexer benchmarks compared to my first, simpler attempt. Some of the design decisions, if they look odd, are likely motivated by gains in performance.

### Techniques

This implementation uses a few different strategies for performance:

* Search byte-by-byte, not char-by-char.
* Process batches of 32 bytes at a time to reduce bounds checks.
* Mark uncommon paths `#[cold]`.
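
A minimal sketch of these three strategies, written for illustration and not taken from the PR. The names `IDENT_CONTINUE`, `ascii_ident_len` and `tail_len` are hypothetical, and the sketch simply stops at the first non-ASCII byte instead of dropping down to a Unicode layer:

```rust
const BATCH: usize = 32;

/// Build a 256-entry table: `true` for bytes that can continue an ASCII identifier.
const fn build_table() -> [bool; 256] {
    let mut table = [false; 256];
    let mut i = 0;
    while i < 256 {
        let b = i as u8;
        table[i] = b.is_ascii_alphanumeric() || b == b'_' || b == b'$';
        i += 1;
    }
    table
}

static IDENT_CONTINUE: [bool; 256] = build_table();

/// Length in bytes of the ASCII identifier at the start of `src`.
pub fn ascii_ident_len(src: &[u8]) -> usize {
    let mut pos = 0;
    // Fast path: take whole 32-byte batches, so there is one bounds check
    // per batch instead of one per byte.
    while let Some(batch) = src.get(pos..pos + BATCH) {
        for (i, &b) in batch.iter().enumerate() {
            if !IDENT_CONTINUE[b as usize] {
                return pos + i;
            }
        }
        pos += BATCH;
    }
    // Fewer than 32 bytes remain before the end of the input.
    tail_len(src, pos)
}

/// Byte-by-byte scan near the end of the input. In a real source file this is
/// rare, so it is marked `#[cold]` to keep it out of the hot path.
#[cold]
fn tail_len(src: &[u8], pos: usize) -> usize {
    let rest = &src[pos..];
    let len = rest
        .iter()
        .position(|&b| !IDENT_CONTINUE[b as usize])
        .unwrap_or(rest.len());
    pos + len
}
```

The table lookup works on raw bytes, so no UTF-8 decoding happens while the input is pure ASCII.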

### Structure

The implementation is built in 3 layers:

1. ASCII characters only.
2. ASCII and Unicode characters.
3. `\` escape sequences (and all the above).

`identifier_name_handler` starts at the top layer, and is optimized for consuming ASCII as fast as possible. Each "layer" is considered more uncommon than the previous, and dropping down a layer is a de-opt.

I'm assuming that 95%+ of JavaScript code does not include either Unicode characters or escapes in identifiers, so the speed of the fast path is prioritised.

That said, once a Unicode character is encountered, the next layer does expect to find further Unicode characters, rather than de-opting over and over again. If an identifier *starts* with a Unicode character, it enters the code straight on the 2nd layer, so is not penalised by going through a `#[cold]` boundary. Lexing Unicode is never going to be as fast as ASCII, but still I felt it was important not to penalise it unnecessarily, so as not to be Anglo-centric.

### ASCII search macro

The main ASCII search is implemented as a macro. I found that, for reasons I don't understand, it's significantly faster to have all the code in a single function, even compared to multiple functions marked `#[inline]` or `#[inline(always)]`. The fastest implementation also requires some code to be repeated twice, which is nicer to do with a macro.

This macro, and the `ByteMatchTable` types that go with it, are designed to be re-usable. Next step will be to apply them for whitespace and strings, which should be fairly simple.

Searching in batches of 32 bytes is also designed to be forward-compatible with SIMD.

### Bye bye `AutoCow`

`AutoCow` is removed. Instead, a string-builder is only created if it's needed, when a `\` escape is first encountered. The string builder is also more efficient than `AutoCow` was, as it copies bytes in chunks, rather than 1-by-1.

This won't make much difference for identifiers, as escapes are so rare anyway, but this same technique can be used for strings, where they're more common.
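
The "only build a string when needed" idea can be sketched roughly as follows. This is an illustration under simplified assumptions, not the PR's implementation: `unescape` is a hypothetical function, and the escape rule here (a `\` makes the next character literal) is far simpler than JavaScript's real escape syntax.

```rust
use std::borrow::Cow;

/// Return `src` with simplified `\x` escapes resolved.
/// No allocation happens unless an escape is actually present.
pub fn unescape(src: &str) -> Cow<'_, str> {
    // Common case: no escape at all, so borrow the source text as-is.
    if !src.contains('\\') {
        return Cow::Borrowed(src);
    }

    // An escape exists: only now create the string builder.
    let mut out = String::with_capacity(src.len());
    let mut rest = src;
    while let Some(idx) = rest.find('\\') {
        // Copy the run before the escape as one chunk, not char by char.
        out.push_str(&rest[..idx]);
        let mut chars = rest[idx + 1..].chars();
        // Simplified rule: the character after `\` is taken literally.
        // A trailing lone `\` is dropped.
        if let Some(c) = chars.next() {
            out.push(c);
        }
        rest = chars.as_str();
    }
    out.push_str(rest);
    Cow::Owned(out)
}
```

Callers that only read the result can treat both variants as `&str`, so the fast path costs nothing beyond the scan for `\`.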
2024-02-09 12:01:30 +08:00
Dunqing
ed29207781
chore(clippy): disable nursery group rules (#2319)
#2318
2024-02-05 18:43:15 +08:00
renovate[bot]
41d1876650
chore(deps): update rust crates (#2302) 2024-02-05 14:36:53 +08:00
Boshen
6002560fa1
feat(span): fix memory leak by implementing inlineable string for oxc_allocator (#2294)
closes #1803

This string is currently unsafe, but I want to get miri working before
introducing more changes.

I want to progress from memory leak to unsafe and then to safety.
It's harder to do all the steps in one go.
2024-02-04 19:28:23 +08:00
Boshen
d2b304b1f8
Publish crates v0.6.0 2024-02-03 22:35:30 +08:00
Boshen
5ac61f09a0
feat: setup wasm parser for npm (#2221) 2024-01-30 21:40:10 +08:00
Nicholas Roberts
cd5026c015
feat(ast): TypeScript definition for wasm target (#2158)
Closes #2151
2024-01-30 15:43:03 +08:00