Add `SemanticBuilder::with_excess_capacity` method to request that `SemanticBuilder` over-allocate space in `Semantic`'s `Vec`s.
Use this method to reserve 200% extra capacity for transformer to create more scopes, symbols and references.
200% is an unscientific guess of how much extra capacity is required. Obviously it depends on what transforms are enabled and content of the source code.
Realized we can get the source type from the AST.
The next PR will introduce `unambiguous` to `SourceType` and directly set `Program::source_type` to either `script` or `module`.
Follow-on after #5232. `oxc_wasm` build scopes text with a single AST traversal. Previous implementation was O($n^2$).
If we can assume scopes are listed in traversal order, then we could do it a bit more efficiently just from `ScopeTree`, but this approach of using `Visit` will handle out-of-order scope IDs (which you'd get if printing a post-transform `ScopeTree`).
Also reduce creating and discarding `String`s for indentation - reuse a single string instead.
- Feat: add `source_type` to `ParserOptions`
- Refactor: use plain objects for options instead of `new
OxcRunOptions()` and `new OxcParserOptions()`, allowing easier
serialization.
We no longer need to build semantic data inside the transformer.
The caller should be responsible for handling semantic data and its
errors.
The best way to achieve this in via `CompilerInterface`.
closes#3565
Previously `ScopeTree::get_child_ids` and `ScopeTree::get_child_ids_mut` returned an `Option`. Instead return the unwrapped value, in line with other methods e.g. `get_flags`.
After studying google closure compiler, I'm leaning towards a multi-ast-pass infrastructure for the minifier.
This is one of the few places where we are going to trade maintainability over performance, given the goal of the minifier is compression size not performance.
All of the terminologies and separation of concerns are aligned with google closure compiler.
Infrastructure of `terser` and `esbuild` are not suitable for us to study nor pursuit. Their code are so tightly coupled - I failed to comprehend any of them every time I try to walk through a piece of optmization. Google closure compiler despite being written in Java, it's actually the most readable minifier out there.
To improve performance between ast passes, I envision a change detection system over a portion of the code.
The benchmark will demonstrate the performance regression of running 5 ast passes instead of 2.
To complete this PR, I need to figure out "fix-point" and order of these ast passes.
Make `ScopeId` a type with a niche, like `SymbolId` and `ReferenceId`. This makes `Option<ScopeId>` 4 bytes instead of 8, and shrinks various AST types e.g. `ArrowFunctionExpression` by 8 bytes, and halves the size of the `Vec` in `ScopeTree::parent_ids`.
The snapshot change on `prefer-hooks-in-order` lint rule appears incidental - it doesn't alter what errors are reported, only the order they're reported in. This appears to be because it changes the order of keys in a hashmap keyed by `ScopeId` that [the rule uses](a49f4915de/crates/oxc_linter/src/rules/jest/prefer_hooks_in_order.rs (L143)).
> This PR is (unfortunately) quite large, but all changes are needed in tandem for this to work properly.
## What This PR Does
Updates the linter to populate diagnostics reported by rules with error codes statically derived from `RuleMeta` + `RuleEnum`.
Doing so required changing how we handle vitest rules. I know @mysterven was hoping to refactor that part of the code, and I think this approach is an improvement (but could probably be cleaned up further).
## Changes
### 1. Auto-Populate Error Codes
`LintContext` now sets an error code scope + error code number for diagnostics reported by lint rules. `LintContext` will not clobber existing codes set by rules, allowing for rule-specific override behavior (e.g. to use `eslint-plugin-react-hooks` as an error scope).
In order to accomplish this, I had to update every diagnostic factory for every rule. While doing this I found some incorrect error messages, or messages that could be easily improved. This is where a large majority of the snapshot diffs come from. Additionally, I was able to reduce string allocations from `format!` usages in diagnostic factories, especially within jest rules.
### 2. Framework and Library Detection
This PR adds `FrameworkFlags`, which specify what (if any) set of libraries and frameworks are being used by a project and/or file. They are passed in two ways:
1. `LintOptions` can specify a set of `framework_hints` that apply to the entire target codebase. Right now these are always empty, but I'm thinking in the future we could sniff `package.json`. It may be helpful for enabling/disabling default rules.
2. When `Linter` gets run on a file, framework information is sniffed from the `LintContext`. Right now, we are only checking for `vitest` imports in `ModuleRecord` and test path prefixes from `source_path`. It may be useful to do something similar for React/NextJS rules in the future. I know that [next/no-html-link-for-pages](https://nextjs.org/docs/messages/no-html-link-for-pages) could benefit greatly from this.
This PR adds a new method `build_with_symbols_and_scopes` to make semantic building optional, there may be prior steps that has the semantic data already built.