oxc/crates/oxc_regular_expression/examples/regex_visitor.rs
leaysgur f8e1907c4f feat(regular_expression): Intro ConstructorParser(and LiteralParser) to handle escape sequence in RegExp('pat') (#6635)
Preparation for #6141

`oxc_regular_expression` can already parse and validate both `/regexp-literal/` and `new RegExp("string-literal")`.

However, one thing that was not well supported is reporting an accurate `Span` for the `RegExp("string-literal-with-\\escape")` case.

For example, these two cases produce the same `RegExp` instances in JavaScript:

- `/\d+/`
- `new RegExp("\\d+")`

For now, mainly in `oxc_linter`, the latter case is handled by parsing with `oxc_parser`, taking the resulting `ast::literal::StringLiteral` AST node, and reading its `value` property.

At this point, escape sequences are already resolved(!), so `oxc_regular_expression` receives the same `&str` as an argument in both cases and can handle it without any problem.

However, in terms of `Span` representation, these cases should be handled differently because of the `\\` in string literals...

As a result, the parsed AST's `Span` for `new RegExp("string-literal")` is not accurate if it contains escape sequences.

e.g. a01a5dfdaf/crates/oxc_linter/src/snapshots/no_invalid_regexp.snap (L118-L122)

Each time a `\` appears, every subsequent position is shifted. In this case, `_` should be placed under `*`.
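
To make the shift concrete, here is a minimal sketch that resolves a tiny subset of string-literal escapes while remembering where each resolved character came from in the raw source text. The helper `resolve_with_offsets` is hypothetical and is not the `string_literal_parser` from this PR; it only illustrates the idea of reading units that carry their original offsets.

```rust
/// Hypothetical helper (not part of oxc): resolve a small subset of JS string
/// escapes and pair each resolved char with its byte offset in the raw text.
fn resolve_with_offsets(raw: &str) -> Vec<(usize, char)> {
    let mut units = Vec::new();
    let mut chars = raw.char_indices();
    while let Some((offset, ch)) = chars.next() {
        if ch == '\\' {
            // The resolved char keeps the offset of the backslash that starts the escape.
            if let Some((_, escaped)) = chars.next() {
                let resolved = match escaped {
                    'n' => '\n',
                    't' => '\t',
                    other => other, // `\\`, `\"`, `\'` and other identity escapes
                };
                units.push((offset, resolved));
            }
        } else {
            units.push((offset, ch));
        }
    }
    units
}

fn main() {
    // Raw text between the quotes of `new RegExp("\\d+*")`.
    let raw = r"\\d+*";
    let units = resolve_with_offsets(raw);

    // The resolved value is what a regex parser sees when it is given `StringLiteral.value`.
    let value: String = units.iter().map(|&(_, c)| c).collect();

    // Position of `*` in the resolved value vs. in the raw source text.
    let index_in_value = units.iter().position(|&(_, c)| c == '*').unwrap();
    let offset_in_raw = units[index_in_value].0;

    println!("value = {value}");
    println!("`*` is at {index_in_value} in the value, but at {offset_in_raw} in the raw text");
}
```

A `Span` computed from the resolved value alone would point one position too early for every preceding `\\`, whereas the recorded offsets point back into the original source.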

So... to resolve this issue, we need to implement `string_literal_parser` first, and use its output as the reading units of `oxc_regular_expression`.
2024-10-21 07:07:27 +00:00


#![allow(clippy::print_stdout)]
use oxc_allocator::Allocator;
use oxc_regular_expression::{
    visit::{RegExpAstKind, Visit},
    LiteralParser, Options,
};
use oxc_span::GetSpan;

struct TestVisitor;

impl Visit<'_> for TestVisitor {
    fn enter_node(&mut self, kind: RegExpAstKind) {
        println!("enter_node: {:?} {kind:?}", kind.span());
    }

    fn leave_node(&mut self, kind: RegExpAstKind) {
        println!("leave_node: {:?} {kind:?}", kind.span());
    }
}

fn main() {
    let source_text = r"(https?:\/\/github\.com\/(([^\s]+)\/([^\s]+))\/([^\s]+\/)?(issues|pull)\/([0-9]+))|(([^\s]+)\/([^\s]+))?#([1-9][0-9]*)($|[\s\:\;\-\(\=])";

    let allocator = Allocator::default();
    let parser = LiteralParser::new(&allocator, source_text, None, Options::default());
    let pattern = parser.parse().unwrap();

    let mut visitor = TestVisitor;
    visitor.visit_pattern(&pattern);
}