Commit graph

47 commits

Author SHA1 Message Date
ottomated
9725e3c2ec feat(ast_tools): Add #[estree(always_flatten)] to Span (#6935)
Part of #6347

Other changes:
- added #[estree(skip)] to thisSpan in TSThisParameter
- Flattened the span in BoundaryAssertion (regex)
2024-10-28 02:13:24 +00:00
ottomated
169fa22350 feat(ast_tools): Default enums to rename_all = "camelCase" (#6933)
Part of #6347
2024-10-28 01:39:25 +00:00
ottomated
ce5b609514 feat(ast): remove explicit untagged marker on enums (#6915)
This assumes that any enum whose variants each have exactly one field is untagged, and that all other enums are tagged.
2024-10-26 08:21:40 +00:00
leaysgur
90c786c420 feat(regular_expression)!: Support ES2025 Duplicated named capture groups (#6847)
Closes #6358

@preyneyv I know you've been working on this problem.

This is an implementation that has been sitting dormant on my local machine for a while.

- All tests are passing
- However, the approach is simple but not general, so there might be some edge cases that were missed
- There's also room for improvement in terms of performance

For these reasons, I had marked it as WIP.

I believe the test cases and other parts are usable, so feel free to fork and replace them with your implementation if you'd like.
2024-10-25 02:13:57 +00:00
Boshen
423d54cb74 refactor(rust): remove the annoying clippy::wildcard_imports (#6860)
2024-10-24 13:57:19 +00:00
ottomated
1145341a92 feat(ast_tools): output typescript to a separate package (#6755)
Part of #6347.

Moves typescript logic from derive_estree into a new ast_tools generator.
2024-10-24 13:08:57 +00:00
leaysgur
8032813bf8 fix(regular_expression)!: Migrate to new regexp parser API (#6741)
Follow up #6635

- [x] Remove old APIs
- [x] Update linter usage
- [x] Update parser usage
- [x] Update transformer usage
2024-10-22 05:34:18 +00:00
leaysgur
f8e1907c4f feat(regular_expression): Intro ConstructorParser(and LiteralParser) to handle escape sequence in RegExp('pat') (#6635)
Preparation for #6141

`oxc_regular_expression` can already parse and validate both `/regexp-literal/` and `new RegExp("string-literal")`.

But one thing that was not well supported was reporting `Span` for the `RegExp("string-literal-with-\\escape")` case.

For example, these two cases produce the same `RegExp` instances in JavaScript:

- `/\d+/`
- `new RegExp("\\d+")`

For now, mainly in `oxc_linter`, the latter case is parsed with `oxc_parser` -> `ast::literal::StringLiteral` AST node -> `value` property.

At this point, escape sequences are already resolved(!), so `oxc_regular_expression` can handle the aligned `&str` as an argument without any problem in both cases.

However, in terms of `Span` representation, these cases should be handled differently because of the `\\` in string literals...

As a result, the parsed AST's `Span` for `new RegExp("string-literal")` is not accurate if it contains escape sequences.

e.g. a01a5dfdaf/crates/oxc_linter/src/snapshots/no_invalid_regexp.snap (L118-L122)

Each time the `\` appears, the subsequent position is shifted. `_` should be placed under `*` in this case.

So... to resolve this issue, we need to implement `string_literal_parser` first, and use its output as the reading units of `oxc_regular_expression`.
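
The offset shift described above can be sketched as follows. This is only an illustration, not the crate's `string_literal_parser`: the helper name `resolve_with_offsets` is hypothetical, and it assumes only simple two-character escapes (`\\`, `\"`, `\n`, `\t`). Each resolved character keeps the byte offset where it started in the source, so a `Span` can still point at the original text.

```rust
/// Resolve simple escapes in the raw body of a string literal while
/// remembering where each resolved character started in the source.
/// Hypothetical helper for illustration only.
fn resolve_with_offsets(raw: &str) -> Vec<(char, usize)> {
    let mut out = Vec::new();
    let mut chars = raw.char_indices();
    while let Some((start, ch)) = chars.next() {
        if ch == '\\' {
            // The escape occupies two source characters but yields one.
            // A lone trailing backslash is dropped in this sketch.
            if let Some((_, escaped)) = chars.next() {
                let resolved = match escaped {
                    'n' => '\n',
                    't' => '\t',
                    other => other, // `\\`, `\"`, `\'`, ...
                };
                out.push((resolved, start));
            }
        } else {
            out.push((ch, start));
        }
    }
    out
}
```

For the raw source text `\\d+` (as written inside `new RegExp("\\d+")`), the resolved pattern is `\d+`, but `d` is recorded at source offset 2 rather than at index 1 of the resolved string.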
2024-10-21 07:07:27 +00:00
Boshen
3711c32f22 chore(coverage): bump test262, babel and TypeScript (#6702)
closes #6692
2024-10-20 15:02:26 +00:00
overlookmotel
85e69a11ef refactor(ast_tools): add line breaks to generated code for ESTree derive (#6680)
Follow-on after #6404. Style nit. Add line breaks to generated code, to make it easier to read.
2024-10-19 19:50:13 +00:00
overlookmotel
ad8e293197 refactor(ast_tools): shorten generated code for impl Serialize (#6684)
Follow-on after #6404. Shorten generated code for `impl Serialize`.
2024-10-19 19:50:12 +00:00
overlookmotel
9ba2b0e3a3 refactor(ast_tools): move #[allow] attrs to top of generated files (#6679)
Follow-on after #6404. Shorten generated code for `impl Serialize` by moving `#[allow]` attrs to top of file.
2024-10-19 19:50:12 +00:00
overlookmotel
11458a5bfd refactor(ast_tools): shorten generated code by avoiding ref in matches (#6675)
Follow-on after #6404.

Shorten generated code for deriving `ESTree` by avoiding `ref` in matches.
2024-10-19 19:50:10 +00:00
ottomated
e310e52ca2 feat(parser): Generate Serialize impls in ast_tools (#6404)
Beginning of #6347. Instead of using serde-derive, we generate
`Serialize` impls manually.

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: overlookmotel <theoverlookmotel@gmail.com>
2024-10-19 09:38:44 +01:00
Tapan Prakash
9f9057b99f fix(regular_expression): Fixed control Y regular expression (#6524)
Fixes https://github.com/oxc-project/oxc/issues/6413

Fixed regular expression for control Y
2024-10-14 11:19:37 +08:00
DonIsaac
7c200560c7 perf(regex): reduce string allocations in Display impls (#6528)
There's still room for improvement here.
2024-10-13 19:34:18 +00:00
leaysgur
b5b0af98cb feat(regular_expression): Support RegExp Modifiers (#6410)
Fixes #6354
2024-10-10 14:46:17 +00:00
leaysgur
c822b48d4f fix(regular_expression): Fix CharacterClass negative codegen (#6415)
Part of #6413; fixes these mismatches.

```
  × Regular Expression content mismatch for `/[^]a/m`: `[]a` == `[]a`
  × Regular Expression content mismatch for `/a[^]/`: `a[]` == `a[]`
  × Regular Expression content mismatch for `/[^]/`: `[]` == `[]`
  × Regular Expression content mismatch for `/[^]/`: `[]` == `[]`
```
2024-10-10 05:00:45 +00:00
ottomated
384d5be40b fix(regular_expression): Flatten Spans on regex AST nodes (#6396)
cc @overlookmotel
2024-10-10 09:13:18 +08:00
leaysgur
5a73a663dc refactor(regular_expression)!: Simplify public APIs (#6262)
This PR makes 2 changes to improve parts of the existing API that are not very useful.

- Remove `(Literal)Parser` and `FlagsParser` and their ASTs
- Add `with_flags(flags_text)` helper to `ParserOptions`

Here are the details.

> Remove `(Literal)Parser` and `FlagsParser` and their ASTs

Previously, the `oxc_regular_expression` crate exposed 3 parsers.

- `(Literal)Parser`: assumes `/pattern/flags` format
- `PatternParser`: assumes `pattern` part only
- `FlagsParser`: assumes `flags` part only

However, it turns out that in actual use cases the `PatternParser` alone is sufficient, as the pattern and flags are validated and sliced in advance on the `oxc_parser` side.

The current use case for `(Literal)Parser` is mostly internal testing.

There were also some misuses of `(Literal)Parser` that rebuild the literal with `format!("/{pattern}/{flags}")` and then parse it again with `(Literal)Parser`.

Therefore, only `PatternParser` is now published, and unnecessary ASTs have been removed.
(This also obsoletes #5592.)

> Add `with_flags(flags_text)` helper to `ParserOptions`

Strictly speaking, there was a subtle difference between the "flag" strings that users were aware of and the "mode" recognised by the parser.

Therefore, it was a common mistake to forget to enable `unicode_mode` when using the `v` flag.

With this helper, crate users no longer need to distinguish between flags and modes.
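
As a rough sketch of the idea (not the crate's actual code), a `with_flags` helper could derive the parser modes from the flags text. The struct below is a hypothetical stand-in for `ParserOptions`, and the `unicode_sets_mode` field is an assumption made for illustration; only `unicode_mode` and the `v` flag are named in this description.

```rust
/// Hypothetical stand-in for the crate's `ParserOptions`.
/// `unicode_sets_mode` is an assumed field name for illustration.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
struct ParserOptions {
    unicode_mode: bool,
    unicode_sets_mode: bool,
}

impl ParserOptions {
    /// Derive parser modes from the flags text, e.g. "gi" or "v".
    fn with_flags(mut self, flags_text: &str) -> Self {
        for flag in flags_text.chars() {
            match flag {
                'u' => self.unicode_mode = true,
                // `v` turns on Unicode mode as well as Unicode sets mode,
                // so callers cannot forget one of the two.
                'v' => {
                    self.unicode_mode = true;
                    self.unicode_sets_mode = true;
                }
                // Other flags do not change the parser modes in this sketch.
                _ => {}
            }
        }
        self
    }
}
```

With this shape, `ParserOptions::default().with_flags("v")` enables both modes in one call, which is the mistake the helper is meant to prevent.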
2024-10-03 02:47:08 +00:00
leaysgur
acab777c0a refactor(regular_expression): Misc fixes (#6234)
Preparation for #6141

- Keep `enum` size + add size asserts tests
- Arrange AST related directories
- Renaming
2024-10-02 13:32:29 +00:00
camchenry
8d026e1dd9 feat(regular_expression): implement GetSpan for RegExp AST nodes (#6056)
To make it easier to get the `Span` for some node in the Regex AST, I've implemented the `GetSpan` trait for all necessary structs.
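
A minimal sketch of what such a trait can look like, assuming a reduced, hypothetical AST (the real `Span` and node types live in the oxc crates and differ from these):

```rust
/// Simplified stand-in for a source span (byte offsets).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: u32,
    end: u32,
}

/// Uniform access to the span of any node.
trait GetSpan {
    fn span(&self) -> Span;
}

// Hypothetical node types for illustration only.
struct Character {
    span: Span,
}

enum Term {
    Character(Character),
    Dot(Span),
}

impl GetSpan for Character {
    fn span(&self) -> Span {
        self.span
    }
}

impl GetSpan for Term {
    fn span(&self) -> Span {
        // Enums delegate to whichever variant they hold.
        match self {
            Term::Character(character) => character.span(),
            Term::Dot(span) => *span,
        }
    }
}
```

Callers can then ask any node for its span without matching on the variant themselves.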
2024-09-26 05:51:35 +00:00
camchenry
77647931e4 feat(regular_expression): implement visitor pattern trait for regex AST (#6055)
- resolves https://github.com/oxc-project/oxc/issues/5977
- supersedes https://github.com/oxc-project/oxc/pull/5951

To facilitate easier traversal of the Regex AST, this PR defines a `Visit` trait with default implementations that will walk the entirety of the Regex AST. Methods in the `Visit` trait can be overridden with custom implementations to do things like analyzing only certain nodes in a regular expression, which will be useful for regex-related `oxc_linter` rules.

In the future, we should consider automatically generating this code as it is very repetitive, but for now a handwritten visitor is sufficient.
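
The pattern can be sketched on a heavily reduced, hypothetical AST (the node types and method names below are assumptions, not the crate's real ones): default methods walk the tree, and an implementor overrides only the methods it cares about.

```rust
// Hypothetical, heavily reduced regex AST; the real node types differ.
enum Term {
    Character(char),
    Group(Vec<Term>),
}

trait Visit {
    // Default implementation: keep walking into the children.
    fn visit_term(&mut self, term: &Term) {
        walk_term(self, term);
    }

    // Default implementation: do nothing at the leaves.
    fn visit_character(&mut self, _ch: char) {}
}

fn walk_term<V: Visit + ?Sized>(visitor: &mut V, term: &Term) {
    match term {
        Term::Character(ch) => visitor.visit_character(*ch),
        Term::Group(terms) => {
            for inner in terms {
                visitor.visit_term(inner);
            }
        }
    }
}

/// Overrides a single method to count character nodes.
struct CharCounter {
    count: usize,
}

impl Visit for CharCounter {
    fn visit_character(&mut self, _ch: char) {
        self.count += 1;
    }
}
```

A lint rule would follow the same shape as `CharCounter`: override the one node kind of interest and let the defaults handle traversal.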
2024-09-26 05:04:46 +00:00
leaysgur
304ce25446 fix(regular_expression): Keep LegacyOctalEscape raw digits for to_string (#5692)
Fixes #5690

- Update `CharacterKind` enum from `Octal` to `Octal1`, `Octal2` and `Octal3`
- Stylistic refactoring for `impl Display`
2024-09-11 07:07:00 +00:00
leaysgur
2da42efb6f refactor(regular_expression): Improve AST docs with refactoring may_contain_strings (#5665)
Follow up #5661
2024-09-10 07:32:28 +00:00
leaysgur
0511d55aa8 fix(regular_expression): Report more MayContainStrings error in (nested)class (#5661)
Fixes #5632
2024-09-10 01:55:51 +00:00
leaysgur
41582ea00c fix(regular_expression): Improve RegExp to_string() results (#5635)
Hopefully fixes #5487

- `/^([a-zA-Z]+:)?[\p{L}0-9@/.\-_\\]+$/u`
- e3bc0ddcba/packages/cli-tools/src/launchEditor.ts (L53-L57)
2024-09-09 08:17:55 +00:00
leaysgur
28aad281b6 fix(regular_expression): Handle - in /[\-]/u as escaped character (#5631)
Fixes #5487
2024-09-09 04:14:17 +00:00
Don Isaac
0ac420d6f9 refactor(linter): use meaningful names for diagnostic parameters (#5564)
Also add `pending` fix labels to many rules.
2024-09-06 18:14:56 -04:00
leaysgur
dec139529d refactor(regular_expression): Align diagnostics (#5543)
Manage all diagnostics for LiteralParser, FlagsParser and PatternParser in one place, with the same message format.
2024-09-06 16:28:06 +00:00
Boshen
1bed5ce2a5 chore: run cargo +nightly fmt to sort imports (#5503)
They are never going to be stable are they ... cedf7a4daa/.rustfmt.toml (L8-L16)
2024-09-06 04:04:26 +00:00
leaysgur
88b7ddb7e0 fix(regular_expression): Handle unterminated character class (#5523)
`/[/` is reported by `debug_assert!`, but should not be.
2024-09-06 03:28:33 +00:00
rzvxa
9b984b31bd fix(regex): panic on displaying surrogated UnicodeEscape characters. (#5469)
fixes https://github.com/oxc-project/oxc/pull/5387#issuecomment-2330534180
2024-09-05 06:18:11 +00:00
rzvxa
ccc8a27e4f refactor(ast, ast_tools): use full method path for generated derives trait calls. (#5462)
As of now, if we remove the implementation of a trait for a type and implement the method on that type directly, nothing breaks, even though the call no longer goes through the original trait, so that method might do something entirely different.
This change is more explicit on trait calls so we hit compile errors on these kinds of changes.
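
A small sketch of the difference, using a hypothetical `Describe` trait and `Node` type (neither is from the codebase). Method-call syntax silently prefers an inherent method of the same name, while the full path always goes through the trait and stops compiling if the trait impl is removed.

```rust
// Hypothetical trait and type, for illustration only.
trait Describe {
    fn describe(&self) -> &'static str;
}

struct Node;

impl Describe for Node {
    fn describe(&self) -> &'static str {
        "trait"
    }
}

impl Node {
    // An inherent method with the same name as the trait method.
    fn describe(&self) -> &'static str {
        "inherent"
    }
}

/// Method-call syntax: resolves to the inherent method.
fn via_method_syntax(node: &Node) -> &'static str {
    node.describe()
}

/// Full method path: always calls the trait implementation, and fails
/// to compile if `impl Describe for Node` is removed.
fn via_full_path(node: &Node) -> &'static str {
    Describe::describe(node)
}
```

Generated code written in the second style turns an accidental change of meaning into a compile error.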
2024-09-05 05:36:50 +00:00
rzvxa
90facd3657 feat(ast): add ContentHash trait; remove noop Hash implementation from Span (#5451)
closes #5283

Also removes the noop Hash implementation on `Span` in favor of a real implementation.
2024-09-05 07:20:04 +03:30
rzvxa
23285f431d feat(ast): add ContentEq trait. (#5427)
Part of #5283
2024-09-04 11:53:50 +00:00
overlookmotel
e7bd49dae4 refactor(regular_expression): correct typo (#5429)
Just correct a misspelling.
2024-09-04 00:54:22 +00:00
rzvxa
59abf27d95 feat(ast, parser): add oxc_regular_expression types to the parser and AST. (#5256)
closes #5060
2024-09-03 02:36:37 +00:00
rzvxa
c0b6269cef feat(regular_expression): implement Display for RegularExpression type. (#5304)
Part of #5060
2024-09-03 02:20:45 +00:00
leaysgur
15b87adb05 chore(regular_expression): Extract diagnostics (#5287)
- Extract `Diagnostic::error()`s to separate file
- Align error message prefix
2024-08-28 03:19:29 +00:00
leaysgur
cffce11620 fix(regular_expression): Prevent panic on too large number (#5282)
Partially close #5257

Use `checked_(mul|add)` to prevent panic.
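
A minimal sketch of the technique, assuming a hypothetical `parse_decimal` helper rather than the parser's real code: accumulating digits with `checked_mul` and `checked_add` yields `None` on overflow instead of panicking.

```rust
/// Parse a run of ASCII decimal digits. Returns `None` for an empty
/// input, a non-digit, or a value that does not fit in `u64`.
/// Hypothetical helper for illustration only.
fn parse_decimal(digits: &str) -> Option<u64> {
    if digits.is_empty() {
        return None;
    }
    let mut value: u64 = 0;
    for ch in digits.chars() {
        let digit = u64::from(ch.to_digit(10)?);
        // Plain `value * 10 + digit` would panic on overflow in debug builds.
        value = value.checked_mul(10)?.checked_add(digit)?;
    }
    Some(value)
}
```

The caller can then turn `None` into a regular diagnostic.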
2024-08-28 01:31:54 +00:00
leaysgur
46b641b75d feat(regular_expression): Validate max quantifier value (#5218)
I've never seen it in practice, but `/a{9007199254740991}/` is valid, and this is the maximum value for a quantifier.

Also left a comment about the #5210 experiment.
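
The limit quoted above equals 2^53 - 1. A bounds check along these lines would reject anything larger; the constant and function names are hypothetical and the error type is simplified to a `String` for this sketch.

```rust
/// 2^53 - 1, the maximum quantifier value quoted in this commit message.
const MAX_QUANTIFIER: u64 = 9_007_199_254_740_991;

/// Accept a quantifier value only if it does not exceed the maximum.
/// Hypothetical helper for illustration only.
fn validate_quantifier(value: u64) -> Result<u64, String> {
    if value > MAX_QUANTIFIER {
        Err(format!("quantifier {value} exceeds {MAX_QUANTIFIER}"))
    } else {
        Ok(value)
    }
}
```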
2024-08-26 07:11:04 +00:00
leaysgur
c7b81f5762 chore(regular_expression): Update example to support RegExp constructor (#5106)
- Fix example to handle `new RegExp()` too
- Update NOTE comments

- - -

Until I tried interacting with the actual AST parsed by `oxc_parser`, I thought that the current `oxc_regular_expression` lacked support for the `RegExp` constructor due to escape sequences.

This was because `"\""` remained `"\""` after reading the source text from `.js` files.

However, once it was parsed by `oxc_parser`, I found that everything was [resolved](8ef85a43c0/crates/oxc_parser/src/lexer/string.rs)! (Wonderful work as usual. 👏🏻 )

Now there is nothing to worry about. 😌
2024-08-23 04:57:32 +00:00
leaysgur
96f57984eb refactor(regular_expression): Misc refactoring for body_parser (#5062)
- Add examples to list all `RegExp`s in source code
- Refactor `MayContainStrings` related part
2024-08-22 11:21:41 +00:00
Boshen
afe728a73a feat(parser): parse regular expression with regex parser (#4998)
Many false positives and incorrect errors. @leaysgur Enjoy 😁

Run `just conformance` to update the snapshot.
2024-08-22 03:09:55 +00:00
Boshen
081e2a37d9 refactor(regular_expression): s/RegExpLiteral/RegularExpression
2024-08-20 14:26:32 +08:00
Boshen
8d3f61bb54 chore(oxc_regular_expression): rename crate
2024-08-20 10:59:00 +08:00