Commit graph

50 commits

Author SHA1 Message Date
Boshen
d942a8d41a
chore: Rust v1.83.0 changes (#7535)
This PR does not upgrade rustc. Only changes are applied.

We cannot upgrade to the latest Rust version yet due to wasm-bindgen
breaking some generated types.

There are also some elided lifetimes in `**/generated/**`, which require
modifications to ast tools.
2024-11-29 11:59:45 +08:00
Boshen
896ff860f9 fix(minifier): do not fold if statement block with lexical declaration (#7519) 2024-11-28 10:37:41 +00:00
overlookmotel
39afb48025 feat(allocator): introduce Vec::from_array_in (#7331)
Because we lack specialization in stable Rust, `Vec::from_iter_in` is unable to take advantage of the fact that `[T; N]` has a statically knowable size.

Introduce `Vec::from_array_in` for this case, which should be able to create the `Vec` with a single static-sized memcpy, or may allow the compiler to see that it can construct the array directly in the arena, rather than construct on stack and then copy to the arena.

Also add a corresponding `AstBuilder::vec_from_array` method, and use it in various places in codebase.
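
To illustrate the idea, here is a minimal sketch using `std::vec::Vec` rather than the arena-backed `oxc_allocator::Vec`. The function name `vec_from_array` is a stand-in for this example and is not the actual implementation from this PR. Because `N` is a compile-time constant, the capacity is known up front and the elements can be moved with a single fixed-size copy.

```rust
use std::mem::ManuallyDrop;
use std::ptr;

/// Illustrative only: build a `Vec` from an array with one exact-size
/// allocation and one fixed-size copy. Uses the global allocator, not an arena.
fn vec_from_array<T, const N: usize>(array: [T; N]) -> Vec<T> {
    // Prevent the array's elements from being dropped here;
    // ownership of them is transferred to the `Vec` below.
    let array = ManuallyDrop::new(array);
    let mut vec: Vec<T> = Vec::with_capacity(N);
    // SAFETY: `vec` has capacity for at least `N` elements, the source and
    // destination do not overlap, and `array` is never read or dropped again.
    unsafe {
        ptr::copy_nonoverlapping(array.as_ptr(), vec.as_mut_ptr(), N);
        vec.set_len(N);
    }
    vec
}
```

An iterator-based constructor has to go through `size_hint` and per-element pushes instead, which is the overhead this commit aims to avoid.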
2024-11-18 02:35:46 +00:00
Song Gao
cf3415b0e4
chore(doc): replace main/master to tag/commit to make the url always accessible (#7298) 2024-11-16 21:00:30 +08:00
overlookmotel
adb50398b8 refactor(allocator): add impl GetAddress for Address (#6891)
This allows passing an `Address` to methods like `StatementInjectorStore::insert_before` if you want to.
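
A rough sketch of why this is convenient, with simplified stand-in types (`Address`, `GetAddress` and `InjectorStore` here are assumptions for illustration and do not mirror the real definitions): once `Address` itself implements `GetAddress`, a method that is generic over `GetAddress` accepts either a node or an address obtained earlier.

```rust
use std::collections::HashMap;

/// Stand-in for an arena address; the real type may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Address(usize);

trait GetAddress {
    fn address(&self) -> Address;
}

/// An `Address` trivially knows its own address.
impl GetAddress for Address {
    fn address(&self) -> Address {
        *self
    }
}

/// A boxed node is identified by the address of its heap allocation.
impl<T> GetAddress for Box<T> {
    fn address(&self) -> Address {
        Address(&**self as *const T as usize)
    }
}

/// Hypothetical store keyed by address, loosely modelled on the
/// `insert_before` method mentioned above.
#[derive(Default)]
struct InjectorStore {
    before: HashMap<Address, Vec<&'static str>>,
}

impl InjectorStore {
    fn insert_before<A: GetAddress>(&mut self, target: &A, stmt: &'static str) {
        self.before.entry(target.address()).or_default().push(stmt);
    }
}
```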
2024-10-25 15:20:21 +00:00
overlookmotel
419343bdd5 feat(traverse): implement GetAddress for Ancestor (#6877)
Closes #6803. Allow getting `Address` of an `Ancestor`.
2024-10-25 10:52:01 +00:00
overlookmotel
ab8aa2f7b0 refactor(allocator): move GetAddress trait into oxc_allocator (#6738)
Pure refactor. It makes more sense to me for `Address` and `GetAddress` to be defined in the same place. Also then we can use the trait for `impl GetAddress for Box`.
2024-10-21 11:46:29 +00:00
overlookmotel
e1c2d30d44 fix(allocator)!: make Vec non-drop (#6623)
`oxc_allocator::Vec` is intended for storing AST types in the arena. `Vec` is intended to be non-drop, because all AST types are non-drop. If they were not, it would be a memory leak, because those types will not have `drop` called on them when the allocator is dropped.

However, `oxc_allocator::Vec` is currently a wrapper around `allocator_api2::vec::Vec`, which *is* drop. That unintentionally makes `oxc_allocator::Vec` drop too.

This PR fixes this by wrapping the inner `allocator_api2::vec::Vec` in `ManuallyDrop`. This makes `oxc_allocator::Vec` non-drop.

The wider consequence of this change is that the compiler is now able to see that loads of other types which contain `oxc_allocator::Vec` are also non-drop. Once it can prove that, it can remove a load of code which handles dropping these types in the event of a panic. This probably also then allows it to make many further optimizations on that simplified code.
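
The effect can be observed with `std::mem::needs_drop`. The sketch below uses `std::vec::Vec` and made-up wrapper names (`DropVec`, `NonDropVec`, `Node`) purely for illustration; it is not the code from this PR.

```rust
use std::mem::{needs_drop, ManuallyDrop};

/// A plain wrapper inherits the drop glue of the inner `Vec`.
struct DropVec<T>(Vec<T>);

/// Wrapping the inner `Vec` in `ManuallyDrop` removes the drop glue.
/// With the global allocator this would leak the buffer; in an arena the
/// memory is reclaimed when the arena itself is dropped or reset.
struct NonDropVec<T>(ManuallyDrop<Vec<T>>);

/// A type containing only non-drop fields is itself non-drop.
struct Node {
    children: NonDropVec<u32>,
    kind: u8,
}

/// Reports whether each of the three types above needs drop glue.
fn drop_report() -> (bool, bool, bool) {
    (
        needs_drop::<DropVec<u32>>(),
        needs_drop::<NonDropVec<u32>>(),
        needs_drop::<Node>(),
    )
}
```

Only the first entry of `drop_report()` should be `true`, which is the property that lets the compiler omit unwinding cleanup code for types like `Node`.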

Strictly speaking, this PR is a breaking change. If `oxc_allocator::Vec` is abused to store drop types, then in some circumstances this change could produce a memory leak where there was none before. However, we've always been clear that only non-drop types should be stored in the arena, so such usage was always a bug.

#6622 fixes the only place in Oxc where we mistakenly stored drop types in `Vec`.

The change to `oxc_prettier` is because the compiler can now deduce that `Doc` is non-drop, which causes clippy to raise a warning about using `then` instead of `then_some`.

As follow-up, we should:

1. Wrap other `allocator_api2` types (e.g. `IntoIter`) in `ManuallyDrop` too, so the compiler can prove they are non-drop as well (or reimplement `Vec` ourselves - #6488).
2. Introduce static checks to prevent drop types being used in `Box` and `Vec`, so that such memory leaks are detected at compile time and become impossible.
2024-10-19 15:43:54 +00:00
overlookmotel
9f555d7c7f docs(allocator): clarify docs for Box (#6625)
Clarify the consequences of storing `Drop` types in `oxc_allocator::Box`.
2024-10-16 15:35:53 +00:00
DonIsaac
06e75b032e docs(allocator): enable lint warnings on missing docs, and add missing doc comments (#6613)
Part of https://github.com/oxc-project/backlog/issues/130
2024-10-15 22:50:48 +00:00
DonIsaac
5ee1ef3d92 feat(allocator): add Vec::into_boxed_slice (#6195)
Note that this PR does not implement the inverse operation (`Box::to_vec` for `[T]`).
2024-10-12 04:29:43 +00:00
overlookmotel
f7d113625e refactor(allocator): remove unnecessary Vec impl (#6213)
`impl<'alloc, T> ops::Index<usize> for &'alloc Vec<'alloc, T>` is unnecessary, as we already have `impl<'alloc, T> ops::Index<usize> for Vec<'alloc, T>`, whose `index` method takes a `&self`.
2024-10-01 10:54:47 +00:00
DonIsaac
5db9b3002c perf(allocator): use lower bound of size hint when creating Vecs from an iterator (#6194) 2024-10-01 00:07:29 +00:00
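
A minimal sketch of the technique named in this commit title, assuming `std::vec::Vec` in place of the arena `Vec` (the function name `collect_with_hint` is made up for this example): reserve the lower bound reported by `size_hint` before filling the vector, so that iterators with an accurate hint need no reallocation.

```rust
/// Illustrative only: pre-allocate using the iterator's lower size bound.
fn collect_with_hint<T, I: IntoIterator<Item = T>>(iter: I) -> Vec<T> {
    let iter = iter.into_iter();
    // The lower bound is a guaranteed minimum; the upper bound may be `None`.
    let (lower, _upper) = iter.size_hint();
    let mut vec = Vec::with_capacity(lower);
    // If the iterator yields more than `lower` items, `extend` grows as needed.
    vec.extend(iter);
    vec
}
```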
DonIsaac
3099709dcd docs(allocator): document oxc_allocator crate (#6037)
Part of #5870
2024-09-25 02:16:32 +00:00
Boshen
1bed5ce2a5 chore: run cargo +nightly fmt to sort imports (#5503)
They are never going to be stable are they ... cedf7a4daa/.rustfmt.toml (L8-L16)
2024-09-06 04:04:26 +00:00
overlookmotel
e8bdd12438 feat(allocator): add AsMut impl for Box (#5515) 2024-09-05 23:52:56 +00:00
overlookmotel
a4247e9353 refactor(allocator): move Box and Vec into separate files (#5034)
Pure refactor. Split `Box` and `Vec` definitions into separate files. Definitions are completely unchanged.
2024-08-21 01:08:08 +00:00
overlookmotel
90d0b2ba65 refactor(allocator, ast, span, ast_tools): use allocator as var name for Allocator (#4900)
We mostly use `allocator` as var name for an `Allocator`, but in some places used the shorter name `alloc`. Use `allocator` everywhere for consistency.
2024-08-15 10:49:11 +00:00
overlookmotel
a6967b30f3 refactor(allocator): correct code comment (#4904)
Correct code comment in `Box::unbox`.
2024-08-14 17:44:54 +00:00
overlookmotel
8e10e25ded feat(allocator): introduce Address (#4810)
Closes #4807.

Introduce `Address`. `Address` can be obtained from a `Box<T>` and can act as a unique identifier for an AST node in arena.

NB: It will also be unique across 2 ASTs in different allocators as long as neither allocator is dropped.
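
As a rough illustration of the identity property, the sketch below uses `std::boxed::Box` and a simplified `Address` newtype; both the representation and the `from_box` constructor are assumptions for this example, not the actual API.

```rust
/// Stand-in for the arena `Address`: the location of a boxed value in memory.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Address(usize);

impl Address {
    /// Take the address of the allocation behind the box, not of the box itself.
    fn from_box<T>(boxed: &Box<T>) -> Self {
        Address(&**boxed as *const T as usize)
    }
}
```

Two live boxes of a non-zero-sized type never share an allocation, so their addresses differ, and moving a box does not move its contents, so the address stays stable for as long as the allocation lives.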
2024-08-11 03:31:45 +00:00
rzvxa
23b0040c16 feat(allocator): introduce CloneIn trait. (#4726)
Introduce the trait discussed in #4284.
2024-08-07 17:28:54 +00:00
Luca Bruno
4c6d19d440
perf(allocator): use capacity hint (#4584) 2024-07-31 18:41:27 -04:00
overlookmotel
0677a91e14 refactor(allocator): make Box::new_in code more explicit (#4432)
Replace `.into` with `NonNull::from`. I feel this is a bit clearer.
2024-07-23 15:41:31 +00:00
overlookmotel
504daeda24 refactor(allocator): rename fn params for Box::new_in (#4431)
Rename function params for `Box::new_in` to be more descriptive and match our naming conventions.
2024-07-23 15:24:30 +00:00
Boshen
a71787572e
chore: remove unsafe_code = "warn" rust lint
Feels too verbose as we already have unsafe comment turned on
2024-07-15 10:39:08 +08:00
rzvxa
115ac3b81b feat(allocator): introduce FromIn and IntoIn traits. (#4088) 2024-07-08 07:07:17 +00:00
Boshen
051ceb6539
chore: improve some format by running cargo +nightly fmt 2024-06-19 00:48:30 +08:00
Don Isaac
8f5655dfe6
feat(linter): add eslint/no-useless-constructor (#3594)
Co-authored-by: Boshen <boshenc@gmail.com>
2024-06-13 13:12:18 +08:00
overlookmotel
514228ad42
deps(allocator): disable serde dep by default (#3120)
`oxc_allocator` currently depends on `serde`, although it's generally
not required.

This PR puts the dependency behind a feature `serialize`.

NB: `serde` is needed for the crate's tests, but this can be enabled by
adding it to `dev-dependencies` and putting the impls behind
`#[cfg(any(feature = "serialize", test))]`.
2024-04-28 22:17:32 +08:00
overlookmotel
7e1fe36c68
refactor(ast): squash nested enums (#3115)
OK, this is a big one...

I have done this as part of work on Traversable AST, but I believe it
has wider benefits, so thought better to spin it off into its own PR.

## What this PR does

This PR squashes all nested AST enum types (#2685).

e.g.: Previously:

```rs
pub enum Statement<'a> {
    BlockStatement(Box<'a, BlockStatement<'a>>),
    /* ...other Statement variants... */
    Declaration(Declaration<'a>),
}

pub enum Declaration<'a> {
    VariableDeclaration(Box<'a, VariableDeclaration<'a>>),
    /* ...other Declaration variants... */
}
```

After this PR:

```rs
#[repr(C, u8)]
pub enum Statement<'a> {
    BlockStatement(Box<'a, BlockStatement<'a>>) = 0,
    /* ...other Statement variants... */

    VariableDeclaration(Box<'a, VariableDeclaration<'a>>) = 32,
    /* ...other Declaration variants... */
}

#[repr(C, u8)]
pub enum Declaration<'a> {
    VariableDeclaration(Box<'a, VariableDeclaration<'a>>) = 32,
    /* ...other Declaration variants... */
}
```

All `Declaration`'s variants are combined into `Statement`, but
`Declaration` type still exists.

As both types are `#[repr(C, u8)]`, and the discriminants are aligned, a
`Declaration` can be transmuted to a `Statement` at zero cost.
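
A toy version of that conversion, with `u32` payloads standing in for the boxed AST nodes and hand-written conversions standing in for what the macro generates (all names and discriminant values below are illustrative):

```rust
use std::mem;

#[repr(C, u8)]
#[derive(Debug, PartialEq)]
enum Declaration {
    Variable(u32) = 32,
    Function(u32) = 33,
}

#[repr(C, u8)]
#[derive(Debug, PartialEq)]
enum Statement {
    Block(u32) = 0,
    // "Inherited" variants share discriminants and payload types
    // with `Declaration`.
    Variable(u32) = 32,
    Function(u32) = 33,
}

impl From<Declaration> for Statement {
    fn from(decl: Declaration) -> Self {
        // SAFETY: both enums are `#[repr(C, u8)]` with the same size, and every
        // `Declaration` discriminant names a `Statement` variant with an
        // identical payload type.
        unsafe { mem::transmute::<Declaration, Statement>(decl) }
    }
}

impl Statement {
    fn as_declaration(&self) -> Option<&Declaration> {
        match self {
            // SAFETY: these variants have the same layout in both enums.
            Statement::Variable(_) | Statement::Function(_) => {
                Some(unsafe { &*(self as *const Statement as *const Declaration) })
            }
            Statement::Block(_) => None,
        }
    }
}
```

The reverse direction needs the discriminant check shown in `as_declaration`, because not every `Statement` is a valid `Declaration`.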

This is the same thing as #2847, but here applied to *all* nested enums
in the AST, and with improved helper methods.

No enums increase in size, and a few get smaller. Indirection is reduced
for some types (this removes multiple levels of boxing).

## Why?

1. It is a prerequisite for Traversable AST (#2987).
2. It would help a lot with AST Transfer (#2409) - it solves the only
remaining blocker for this.
3. It is a step closer to making the whole AST `#[repr(C)]`.

## Why is it a good thing for the AST to be `#[repr(C)]`?

Oxc's direction appears to be increasingly to build up control over the
fundamental primitives we use, in order to unlock performance and
features. We have our own allocator, our own custom implementations for
`Box` and `Vec`, our own `IndexVec` (TBC). The AST is the central
building block of Oxc, and taking control of its memory layout feels
like a step in this same direction.

Oxc has a major advantage over other similar libraries in that it keeps
all the AST data in an arena. This opens the door to treating the AST
either as Rust types or as *pure data* (just bytes). That data can be
moved around and manipulated beyond what Rust natively allows.

However, to enable that, the types need to be well-specified, with
completely stable layouts. `#[repr(C)]` is the only tool Rust provides
to do this.

Once the types are `#[repr(C)]`, various features become possible:

1. Cheap transfer of the AST across boundaries without ser/deser - the
property used by AST Transfer.
2. Having multiple versions of the AST (standard, read-only,
traversable), which can be converted to one another at zero cost via
transmute - the property used by the Traversable AST scheme.
3. Caching AST data on disk (#3079) or transferring across network.
4. Stuff we haven't thought of yet!

Allowing the AST to be treated as pure data will likely unlock other
"next level" features further down the track (caching for "edge
bundling" comes to mind).

## The problem with `#[repr(C)]`

It's not *required* to squash nested enums to make the AST `#[repr(C)]`.

But the problem with `#[repr(C)]` is that it disables some compiler
optimizations. Without `#[repr(C)]`, the compiler squashes enums itself
in some cases (which is how `Statement` is currently 16 bytes). But
making the types `#[repr(C)]` as they are currently disables this
optimization.

So this PR essentially makes explicit what the compiler is already doing
- and in fact goes a bit further with the optimization than the compiler
is able to, in squashing 3 or 4 layers of nested enums (the compiler
only does up to 2 layers).

## Implementation

One enum "inheriting" variants from another is implemented with the
`inherit_variants!` macro.

```rs
inherit_variants! {
#[repr(C, u8)]
pub enum Statement<'a> {
    BlockStatement(Box<'a, BlockStatement<'a>>),
    /* ...other Statement variants... */
    
    // `Declaration` variants added here by `inherit_variants!` macro
    @inherit Declaration
    // `ModuleDeclaration` variants added here by `inherit_variants!` macro
    @inherit ModuleDeclaration
}
}
```

The macro is *fairly* lightweight, and I think the above is quite easy
to understand. No proc macros.

The macro also implements utility methods for converting between enums
e.g. `Statement::as_declaration`. These methods are all zero-cost
(essentially transmutes).

New patterns for dealing with nested enums are introduced:

Creation:

```rs
// Old
let stmt = Statement::Declaration(Declaration::VariableDeclaration(var_decl));

// New
let stmt = Statement::VariableDeclaration(var_decl);
```

Conversion:

```rs
// Old
let stmt = Statement::Declaration(decl);

// New
let stmt = Statement::from(decl);
```

Testing:

```rs
// Old
if matches!(stmt, Statement::Declaration(_)) { }
if matches!(stmt, Statement::ModuleDeclaration(m) if m.is_import()) { }

// New
if stmt.is_declaration() { }
if matches!(stmt, Statement::ImportDeclaration(_)) { }
```

Branching:

```rs
// Old
if let Statement::Declaration(decl) = &stmt { decl.do_stuff() };

// New
if let Some(decl) = stmt.as_declaration() { decl.do_stuff() };
```

Matching:

```rs
// Old
match stmt {
    Statement::Declaration(decl) => visitor.visit(decl),
}

// New (exhaustive match)
match stmt {
    match_declaration!(Statement) => visitor.visit(stmt.to_declaration()),
}

// New (alternative)
match stmt {
    _ if stmt.is_declaration() => visitor.visit(stmt.to_declaration()),
}
```

New syntax has pluses and minuses vs the old. `match` syntax is worse,
but when working with a deeply nested enum, the code is much nicer -
it's shorter and easier to read.

This PR removes 200 lines from the linter with changes like this:


https://github.com/oxc-project/oxc/pull/3115/files#diff-dc417ff57352da6727a760ec6dee22de6816f8231fb69dbef1bf05d478699103L92-R95

```diff
- let AssignmentTarget::SimpleAssignmentTarget(simple_assignment_target) =
-     &assignment_expr.left
- else {
-     return;
- };
- let SimpleAssignmentTarget::AssignmentTargetIdentifier(ident) =
-     simple_assignment_target
+ let AssignmentTarget::AssignmentTargetIdentifier(ident) = &assignment_expr.left
else {
    return;
};
```
2024-04-28 20:40:37 +08:00
overlookmotel
6bc18e15e0
refactor(bench): reuse allocator in parser + lexer benchmarks (#3053)
Re-use allocator in parser + lexer benchmarks.

I believe this is the recommended usage when parsing a bunch of files -
to re-use one allocator rather than create a fresh one for each run, so
it makes sense to me that this is what the benchmark should measure.

Doesn't show much difference on CodSpeed because it only runs the
benchmark once, and it treats allocations as free anyway. But I imagine
the difference may show up a bit more in a standard criterion benchmark.
2024-04-22 09:03:26 +08:00
Boshen
063b281c39
feat(allocator): make Box's PhantomData own the passed in T (#2952) 2024-04-13 12:31:40 +08:00
branchseer
f159f60084
Make ast types covariant over the allocator lifetime. (#2943)
## Why

Due to the usage of `&'alloc mut T` in `oxc_allocator::Box`, and
`bumpalo::collections::Vec` in `oxc_allocator::Vec`, ast types are
currently invariant over their allocator lifetime `'a`. This prevents
`ouroboros` from generating `borrow_*` on ast type fields, leading to
the unfriendly `with_*` api:
c250b288ef/crates/oxc_parser/examples/multi-thread.rs (L82-L84)

## How

- For `oxc_allocator::Vec`, switch to `allocator_api2::vec::Vec`, which
has a covariant relationship with the allocator lifetime.
- For `oxc_allocator::Box`, use `std::ptr::NonNull` which is
specifically designed to be covariant. I don't use
`allocator_api2::boxed::Box` because it holds the allocator for
dropping, so the size is bigger.
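
A minimal sketch of the `NonNull` approach, using a toy `ArenaBox` that leaks a `std` allocation instead of bump-allocating (the type name, its constructor and the `shorten` helper are assumptions for illustration only). The `shorten` function compiles only because the type is covariant in both `'alloc` and `T`; with a `&'alloc mut T` field it would be rejected.

```rust
use std::marker::PhantomData;
use std::ops::Deref;
use std::ptr::NonNull;

/// Toy arena box: a bare pointer plus a marker tying it to `'alloc`.
/// `NonNull<T>` and the `PhantomData` marker are both covariant.
struct ArenaBox<'alloc, T>(NonNull<T>, PhantomData<(&'alloc (), T)>);

impl<'alloc, T> ArenaBox<'alloc, T> {
    /// Deliberately leaks a heap allocation to stand in for arena memory,
    /// which a real arena would reclaim wholesale.
    fn new(value: T) -> Self {
        Self(NonNull::from(Box::leak(Box::new(value))), PhantomData)
    }
}

impl<'alloc, T> Deref for ArenaBox<'alloc, T> {
    type Target = T;

    fn deref(&self) -> &T {
        // SAFETY: the pointer comes from a leaked `Box`, so it is valid,
        // aligned, and never freed while this value exists.
        unsafe { self.0.as_ref() }
    }
}

/// Compiles only because `ArenaBox` is covariant: a box valid for a long
/// lifetime can be used where a shorter one is expected.
fn shorten<'long: 'short, 'short>(
    boxed: ArenaBox<'long, &'long str>,
) -> ArenaBox<'short, &'short str> {
    boxed
}
```

The box stays one pointer wide because it stores no allocator handle, which matches the size concern raised above.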

## Downside

Now that `oxc_allocator::Box` uses the unsafe `NonNull`, it has to be a
private field to be safe. This makes it impossible to do `Box(....)`
pattern matching.
2024-04-12 18:12:18 +08:00
Boshen
504698ab4a
chore: guard against unsafe code as much as possible. 2024-04-03 19:35:07 +08:00
Boshen
a1271af5df
docs(allocator): document behaviour of Box 2024-01-29 21:34:45 +08:00
Boshen
a6d9356ffa
feat(allocator): add From API (#1908)
closes #1701
2024-01-06 12:45:27 +08:00
Boshen
4886d408eb
chore(clippy): enable undocumented_unsafe_blocks 2023-10-16 15:18:14 +08:00
Boshen
12798e075f
refactor: improve code coverage a little bit 2023-08-25 23:07:14 +08:00
Boshen
fdf288c685
refactor: improve code coverage in various places (#721) 2023-08-11 15:17:49 +08:00
Boshen
38e11956be
chore(rust): rust version 1.71.0 nightly 2023-07-13 23:10:10 +08:00
Don Isaac
0346adb1eb
feat(linter): add eslint/no-control-regex (#516) 2023-07-10 10:20:57 +08:00
Boshen
ad2835f11b
chore(rustfmt): run cargo fmt 2023-05-21 11:52:26 +08:00
Boshen
7f93e58f10
chore: remove all #[must_use] 2023-05-11 21:08:00 +08:00
Boshen
becc5d0a3b
feat(hir): add HirId and HirBuilder for facilitating lowering (#318) 2023-04-25 23:16:04 +08:00
Boshen
f194c84f0b
chore: remove the confusing unsafe impl from ast and allocator 2023-04-25 18:54:15 +08:00
Boshen
50749f7c7f
refactor(allocator): clean up and add unit tests 2023-04-16 12:07:03 +08:00
Boshen
94cb990a48
refactor(oxc_ast): clean up doc 2023-04-16 00:39:07 +08:00
Boshen
9dfd4cd936
chore(rust): remove unnecessary missing_const_for_fn 2023-03-22 12:35:52 +08:00
Boshen
4ae70b9592 feat(parser): add lexer 2023-02-11 02:29:54 -08:00
Boshen
664c37631e feat(allocator): add allocator 2023-02-11 01:05:07 -08:00