The parser returns a simple `ModuleRecord` that is allocated in the arena for performance reasons.
The linter uses a more complicated, `Send` + `Sync` `ModuleRecord` that will hold more cross-module information.
The next step is to return more esm information from the parser to eliminated the need of the `oxc_module_lexer` crate.
This has the benefit of:
* expose dynamic import / import meta info from parser
* 1 less ast shallow in semantic builder
* no ast walk in oxc's module lexer
* some more benefits coming soon
Firstly, a massive thanks to @alisnic for starting this (incredibly complicated) lint rule in https://github.com/oxc-project/oxc/pull/2637 !
still a draft. current state:
3x false positives (all todo with refs)
3x false negatives (TS will catch these
13x false negatvies todo with refs
1x false negative TODO
closes #2637
relates to #2174
> Closes#5046
This PR migrates the linter crate and oxlint app to use the new `LinterBuilder` API. This PR has the following effects:
1. `plugins` in oxlint config files are now supported
2. irons out weirdness when using CLI args and config files together. CLI args are now always applied on top of config file settings, overriding them.
# Breaking Changes
Before, config files would override rules set in CLI arguments. For example, running this command:
```sh
oxlint -A correctness -c oxlintrc.json
```
With this config file
```jsonc
// oxlintrc.json
{
"rules": {
"no-const-assign": "error"
}
}
```
Would result in a single rule, `no-const-assign` being turned on at an error level with all other rules disabled (i.e. set to "allow").
Now, **CLI arguments will override config files**. That same command with the same config file will result with **all rules being disabled**.
## Details
For a more in-depth explanation, assume we are running the below command using the `oxlintrc.json` file above:
```sh
oxlint -A all -W correctness -A all -W suspicious -c oxlintrc.json
```
### Before
> Note: GitHub doesn't seem to like deeply nested lists. Apologies for the formatting.
Here was the config resolution process _before_ this PR:
<details><summary>Before Steps</summary>
1. Start with a default set of filters (["correctness", "warn"]) if no filters were passed to the CLI. Since some were, the filter list starts out empty.
2. Apply each filter taken from the CLI from left to right. When a filter allows a rule or category, it clears the configured set of rules. So applying those filters looks like this
a. start with an empty list `[]`
b. `("all", "allow")` -> `[]`
c. `("correctness", "warn")` -> `[ <correctness rules> ]`
d. `("all", "allow")` -> `[]`
e. `("suspicious", "warn")` -> `[ <suspicious rules> ]`. This is the final rule set for this step
3. Apply overrides from `oxlintrc.json`. This is where things get a little funky, as mentioned in point 2 of what this PR does. At this point, all rules in the rules list are only from the CLI.
a. If a rule is only set in the CLI and is not present in the config file, there's no effect
b. If a rule is in the config file but not the CLI, it gets inserted into the list.
c. If a rule is already in the list and in the config file
i. If the rule is only present once (e.g. `"no-loss-of-precision": "error"`), unconditionally override whatever was in the CLI with what was set in the config file
ii. If the rule is present twice (e.g. `"no-loss-of-precision": "off", "@typescript-eslint/no-loss-of-precision": "error"`),
a. if all rules in the config file are set to `allow`, then turn the rule off
b. If one of them is `warn` or `deny`, then update the currently-set rule's config object, but _leave its severity alone_.
So, for our example, the final rule set would be `[<all suspicious rules: "warn">, no-const-assign: "error"]`
</details>
### After
Rule filters were completely re-worked in a previous PR. Now, lint filters aren't kept on hand-only the rule list is.
<details><summary>After Steps</summary>
1. Start with the default rule set, which is all correctness rules for all enabled plugins (`[<all correctness rules: "warn">]`)
2. Apply configuration from `oxlintrc.json` _first_.
a. If the rule is warn/deny exists in the list already, update its severity and config object. If it's not in the list, add it.
b. If the rule is set to allow, remove it from the list.
The list is now `[<all correctness rules except no-const-assign: "warn">, no-const-assign: "error"]`.
3. Apply each filter taken from the CLI from left to right. This works the same way as before. So, after they're applied, the list is now `[<suspicous rules: "warn">]`
</details>
Minifier benchmark currently includes the time to run parser and semantic as well as the minifier itself. Similar to transformer benchmark, only measure the minifier itself in the benchmark.
This will give us a clearer picture of performance changes, and should also reduce variance in the benchmarks.
Add `SemanticBuilder::with_excess_capacity` method to request that `SemanticBuilder` over-allocate space in `Semantic`'s `Vec`s.
Use this method to reserve 200% extra capacity for transformer to create more scopes, symbols and references.
200% is an unscientific guess of how much extra capacity is required. Obviously it depends on what transforms are enabled and content of the source code.
Realized we can get the source type from the AST.
The next PR will introduce `unambiguous` to `SourceType` and directly set `Program::source_type` to either `script` or `module`.
Fixing code is an important part of linter logic. We want to make sure fixers for each rule, and the code responsible for applying those fixes, are included in benchmarks.
As it currently stands, fixer closures are applied regardless of whether the user wants fixers to be applied. However, this is an implementation detail and is subject to change. I also want to bench the performance of `Fixer`.
Transformer benchmark measure only the transformer itself. Parse, generate `Semantic`, and drop `Semantic` outside of the measured section.
This should:
1. Give us greater visibility of how changes to transformer affect its performance.
2. Reduce variance in transformer benchmarks, since they will no longer include the variance introduced by `SemanticBuilder`.
Not ready to merge yet. We should first add an "end to end" benchmark testing the entire compilation process (parse - semantic - transform - minify - codegen).
This PR depends on https://github.com/Boshen/criterion2.rs/pull/49. This PR currently makes Oxc's dependency on `criterion2` a git dependency on that PR's branch. That can be changed once the upstream PR is merged.
Include dropping `Semantic` in semantic benchmark measure.
If you create a `Semantic`, you have to drop it, so the time it takes to
drop is part of the cost of using this API, and we should be working to
reduce it. Therefore I think it should be included in the benchmark.
It'll be interesting to see what effect a PR like #5232 which removes a
bunch of `Vec`s from `Semantic` has on the drop time.
We no longer need to build semantic data inside the transformer.
The caller should be responsible for handling semantic data and its
errors.
The best way to achieve this in via `CompilerInterface`.
closes#3565
Introduce new method `ConcatSourceMapBuilder::from_sourcemaps`.
Where all the sourcemaps being concatenated exist at time that you
create `ConcatSourceMapBuilder`, it's faster to use `from_sourcemaps`,
because it pre-allocates enough space for the data it will hold and so
avoids memory copying.
Before:
```rs
let mut builder = ConcatSourceMapBuilder::default();
builder.add_sourcemap(&sourcemap1, 0);
builder.add_sourcemap(&sourcemap2, 100);
builder.add_sourcemap(&sourcemap3, 100);
let combined = builder.into_sourcemap();
```
After:
```rs
let builder = ConcatSourceMapBuilder::from_sourcemaps(&[
(&sourcemap1, 0),
(&sourcemap2, 100),
(&sourcemap3, 200),
]);
let combined = builder.into_sourcemap();
```
After studying google closure compiler, I'm leaning towards a multi-ast-pass infrastructure for the minifier.
This is one of the few places where we are going to trade maintainability over performance, given the goal of the minifier is compression size not performance.
All of the terminologies and separation of concerns are aligned with google closure compiler.
Infrastructure of `terser` and `esbuild` are not suitable for us to study nor pursuit. Their code are so tightly coupled - I failed to comprehend any of them every time I try to walk through a piece of optmization. Google closure compiler despite being written in Java, it's actually the most readable minifier out there.
To improve performance between ast passes, I envision a change detection system over a portion of the code.
The benchmark will demonstrate the performance regression of running 5 ast passes instead of 2.
To complete this PR, I need to figure out "fix-point" and order of these ast passes.
Make 2 changes to sourcemap benchmark:
1. Move counting line breaks in output text to outside of the measured loop. This operation is reasonably expensive, and isn't part of what we're trying to measure.
2. It looks like benchmark is trying to measure concatenating 2 source maps, but `for i in 0..1` was actually only concatenating 1.
> This PR is (unfortunately) quite large, but all changes are needed in tandem for this to work properly.
## What This PR Does
Updates the linter to populate diagnostics reported by rules with error codes statically derived from `RuleMeta` + `RuleEnum`.
Doing so required changing how we handle vitest rules. I know @mysterven was hoping to refactor that part of the code, and I think this approach is an improvement (but could probably be cleaned up further).
## Changes
### 1. Auto-Populate Error Codes
`LintContext` now sets an error code scope + error code number for diagnostics reported by lint rules. `LintContext` will not clobber existing codes set by rules, allowing for rule-specific override behavior (e.g. to use `eslint-plugin-react-hooks` as an error scope).
In order to accomplish this, I had to update every diagnostic factory for every rule. While doing this I found some incorrect error messages, or messages that could be easily improved. This is where a large majority of the snapshot diffs come from. Additionally, I was able to reduce string allocations from `format!` usages in diagnostic factories, especially within jest rules.
### 2. Framework and Library Detection
This PR adds `FrameworkFlags`, which specify what (if any) set of libraries and frameworks are being used by a project and/or file. They are passed in two ways:
1. `LintOptions` can specify a set of `framework_hints` that apply to the entire target codebase. Right now these are always empty, but I'm thinking in the future we could sniff `package.json`. It may be helpful for enabling/disabling default rules.
2. When `Linter` gets run on a file, framework information is sniffed from the `LintContext`. Right now, we are only checking for `vitest` imports in `ModuleRecord` and test path prefixes from `source_path`. It may be useful to do something similar for React/NextJS rules in the future. I know that [next/no-html-link-for-pages](https://nextjs.org/docs/messages/no-html-link-for-pages) could benefit greatly from this.
This PR adds a new method `build_with_symbols_and_scopes` to make semantic building optional, there may be prior steps that has the semantic data already built.