preserve token positions

Daniel Bulant 2022-02-21 22:12:42 +01:00
parent 33571b0c03
commit b000e924e8
4 changed files with 164 additions and 161 deletions

@ -4,27 +4,37 @@ Rust shell. Inspired by Ion.
In case you're reading this: rush is in the works and not a priority. Features may be missing even if defined below.
## Scopes
Variables are block scoped.
Block scope creation:
- `if`
- `while`
- `else`
- `for..`
- `$(expr)`
Functions have a copy of their scope.
Files create file scopes, to which functions are scoped.
## Syntax
`;` is an alias for a newline.
Syntax and type errors crash the program.
Each block has its own scope. A function doesn't get access to variables that its body couldn't access if it weren't a function:
```sh
fn testing
echo $t
end
if true
testing # Error! t is not defined
end
```
Variables are scoped to their block, and immediately freed when their block is left.
### Variables
String variables using `$`, arrays using `@`.
String variable value can be obtained using `$`, arrays using `@`.
When an array is stringified (referred to with `$`), its contents are joined with spaces.
No special treatment of `PATH`.
Currently, the shell doesn't error out when a variable doesn't exist; instead, it is replaced by an empty string.
Assigned using `let`.
Left side is evaluated to a string as well.
@ -37,8 +47,9 @@ echo $d # c
```
Arrays are assigned using `[ var ]`. You can join arrays and strings by simply passing them there, like `[ $var @var ]`.
Arrays and maps cannot be nested during definition (`[ $var [@var] ]` should have the same effect).
All assignments are done via the `let` keyword.
All assignments are done via the `let` keyword. If the variable exists, it is overwritten (even in upper scopes).
Instead of `=`, other operations are supported:
* `*=` - multiply
@ -55,29 +66,9 @@ Instead of `=`, other operations are supported:
`env::` namespace contains the environment (and doesn't error out if the variable doesn't exist, instead, empty string is returned)
`color::` (alias `c::`) has a number of colors
#### Types
Based on the value assigned in the `let` (the one with just `=`), a type is inferred (unless specifically set using `let x:type = ...`). This type is then used for the operations that follow.
Supported types:
* `i32` (alias int)
* `i64`
* `i128`
* `u32`
* `u64`
* `u128`
* `f32`
* `f64`
* `str`
* `hmap[T]` (where T is one of the other types, except array)
* `[T]` (where T is one of the other types, except hmap)
A HashMap is basically an array, but with string keys (instead of numbers) in random order.
### Return
Sets the exit code (and possibly exits function/script early). If no return is set, the return code is set to the return code of the last expression.
Sets the exit code (and possibly exits function/script early). If no return is set, the return code is set to the return code of the last expression (`$?`).
### Math
@ -99,29 +90,32 @@ Slices: `[x..y]` gets a substring (or subarray) of the variable. When `x` ommite
Bracketless. Scopes are ended by the keyword `end`.
`if` - Runs its scope if the command returns `0`. Useful together with the `test` builtin.
`for $val of @arr` - Runs for each value of the array (or hashmap)
`for $val of ...` - Runs for each number in the range.
`while` - Runs in loop as long as the command returns `0`
- `if` - Runs its scope if the command returns `0`. Useful together with the `test` builtin. `else` is supported. `else if` doesn't require another `end`.
- `for $val of @arr` - Runs for each value of the array (or hashmap)
- `for $val of X..Y` - Runs for each number in the range from `X` to `Y` (both inclusive).
- `while` - Runs in loop as long as the command returns `0`
### Functions
Defined by `fn name arg -- desc`. `arg` can be omitted or repeated. `desc` will be printed when the `arg` is missing (or when the `describe` command is used).
Defined by `fn name [...arg] [--flags]`. `arg` can be omitted or repeated.
`--flags` can be used to add additional functionality.
#### Special
Functions are scoped per file, even if they use `on-event` or similar to be triggered.
Use `source` to load external files with functions to be triggered.
From config (defined by `~/.rushrc`), special functions can be defined.
* `PROMPT` will be run to render the prompt
* `HIGHLIGHT` will run (for each key - make it fast) to highlight the text.
- `--desc` sets the function's description.
- `--on-event` will run the function when an event is run
#### Builtins
* `let` for assigning variables
* `export` for exporting variables to env
`let` is a special case which cannot be dynamically addressed (i.e. using `$(echo let) var = value`).
* `let` for assigning variables (`let var = value`)
* `export` for exporting variables to env (`export var` to export var, or `export var = value`)
* `test` tests for evaluation (`=` for equality, `>`, `<`, `<=`, `>=` for number comparisons)
* `exists` for existence of a given string, or, if given a flag (`-F`unctions, `-v`ariables, `-e`nv, `-f`ile, `-d`irectory, `-r`eadable file, `-w`ritable file, e`-x`ecutable file), existence of the selected object
* `true` returns `0`
* `false` returns `1`
* `source` to run another file in the same file scope
Some GNU standard utils may be overwritten by rush builtins, but must be made compatible.
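
The scoping rules in the README changes above (block scopes that are freed when the block is left, `let` overwriting an existing variable even in an upper scope, and missing variables expanding to an empty string) could be modelled roughly as a stack of maps. This is only a minimal sketch under those stated rules, not rush's actual implementation; `Scopes`, `enter`, `leave`, `set`, and `get` are hypothetical names, and function/file scopes are left out.

```rust
use std::collections::HashMap;

/// Hypothetical model of block scopes: the innermost scope is last.
pub struct Scopes {
    stack: Vec<HashMap<String, String>>,
}

impl Scopes {
    /// Starts with a single outermost (file-level) scope.
    pub fn new() -> Self {
        Scopes { stack: vec![HashMap::new()] }
    }

    /// Called when a block such as `if`, `while`, or `for` starts.
    pub fn enter(&mut self) {
        self.stack.push(HashMap::new());
    }

    /// Called at `end`: the block's variables are dropped immediately.
    /// The outermost scope is never popped.
    pub fn leave(&mut self) {
        if self.stack.len() > 1 {
            self.stack.pop();
        }
    }

    /// `let name = value`: overwrite the nearest existing binding
    /// (even in an upper scope), otherwise define it in the current scope.
    pub fn set(&mut self, name: &str, value: &str) {
        for scope in self.stack.iter_mut().rev() {
            if let Some(slot) = scope.get_mut(name) {
                *slot = value.to_string();
                return;
            }
        }
        self.stack
            .last_mut()
            .expect("the outermost scope is never popped")
            .insert(name.to_string(), value.to_string());
    }

    /// `$name`: a missing variable expands to an empty string.
    pub fn get(&self, name: &str) -> String {
        self.stack
            .iter()
            .rev()
            .find_map(|scope| scope.get(name))
            .cloned()
            .unwrap_or_default()
    }
}
```

With this model, a variable first defined inside an `if` block disappears at `end`, while an assignment inside the block to a variable defined outside it stays visible afterwards.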

@ -71,19 +71,82 @@ impl Shell {
let v = stdin.lock().lines().next().unwrap().unwrap();
self.term.input = v;
}
}
fn start_shell() {
let mut shell = Shell::new();
loop {
print!("$: ");
io::stdout().flush().unwrap();
shell.collect();
shell.term.input += "\n";
let res = parser::exec(&mut shell.term.input.as_bytes(), &mut shell.ctx);
match res {
Err(err) => eprintln!("rush: {}", err),
Ok(_) => {}
fn edit(&mut self) {
let stdin = io::stdin();
let mut stdout = io::stdout().into_raw_mode().unwrap();
for c in stdin.keys() {
let c = c.unwrap();
match c {
Key::Char('\n') => {
if self.term.input.chars().nth(self.term.idx).unwrap_or(' ') == '\\' {
self.term.insert_str(self.term.idx, "\\\n");
} else {
break;
}
}
Key::Backspace => {
if self.term.input.len() > 0 && self.term.idx > 0 {
if self.term.idx == self.term.input.len() - 1 {
self.term.input.pop();
} else {
self.term.remove(self.term.idx - 1);
}
self.term.idx -= 1;
}
}
Key::Delete => {
if self.term.idx < self.term.input.len() {
self.term.remove(self.term.idx);
}
}
Key::End => {
self.term.idx = cmp::max(self.term.input.len(), 1) - 1;
}
Key::Home => {
self.term.idx = 0;
}
Key::Left => {
if self.term.idx > 0 {
self.term.idx -= 1;
}
}
Key::Right => {
if self.term.idx < self.term.input.len() - 1 {
self.term.idx += 1;
}
}
Key::Ctrl('c') => {
process::exit(1);
}
Key::Ctrl('d') => {
process::exit(0);
}
Key::Char(char) => {
self.term.insert(self.term.idx, char);
self.term.idx += 1;
}
_ => {}
}
self.term.print(&mut stdout);
stdout.flush().unwrap();
}
stdout.suspend_raw_mode().unwrap();
}
fn start() {
let mut shell = Shell::new();
loop {
print!("$: ");
io::stdout().flush().unwrap();
shell.collect();
shell.term.input += "\n";
let res = parser::exec(&mut shell.term.input.as_bytes(), &mut shell.ctx);
match res {
Err(err) => eprintln!("rush: {}", err),
Ok(_) => {}
}
}
}
}
@ -127,7 +190,7 @@ fn main() {
},
None => {}
};
start_shell();
Shell::start();
}
#[cfg(test)]
@ -169,64 +232,3 @@ mod test {
load_and_run("test/while.rush")
}
}
fn editor() -> Shell {
let stdin = io::stdin();
let mut stdout = io::stdout().into_raw_mode().unwrap();
let mut shell = Shell::new();
for c in stdin.keys() {
let c = c.unwrap();
match c {
Key::Char('\n') => {
if shell.term.input.chars().nth(shell.term.idx).unwrap_or(' ') == '\\' {
shell.term.insert_str(shell.term.idx, "\\\n");
} else {
break;
}
}
Key::Backspace => {
if shell.term.input.len() > 0 && shell.term.idx > 0 {
if shell.term.idx == shell.term.input.len() - 1 {
shell.term.input.pop();
} else {
shell.term.remove(shell.term.idx - 1);
}
shell.term.idx -= 1;
}
}
Key::Delete => {
if shell.term.idx < shell.term.input.len() {
shell.term.remove(shell.term.idx);
}
}
Key::End => {
shell.term.idx = cmp::max(shell.term.input.len(), 1) - 1;
}
Key::Home => {
shell.term.idx = 0;
}
Key::Left => {
if shell.term.idx > 0 {
shell.term.idx -= 1;
}
}
Key::Right => {
if shell.term.idx < shell.term.input.len() - 1 {
shell.term.idx += 1;
}
}
Key::Ctrl('c') => {
process::exit(1);
}
Key::Char(char) => {
shell.term.insert(shell.term.idx, char);
shell.term.idx += 1;
}
_ => {}
}
shell.term.print(&mut stdout);
stdout.flush().unwrap();
}
stdout.suspend_raw_mode().unwrap();
shell
}
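
One detail in the key handling above may be worth a second look: `Key::Right` compares against `input.len() - 1`, and subtracting from a `usize` that is `0` panics in debug builds (and wraps in release builds), so pressing Right on an empty line could be a problem. `Key::End` already avoids this with `cmp::max(..., 1) - 1`. A minimal sketch of the same cursor moves using `saturating_sub` follows; `Line` is a hypothetical stand-in for the state kept in `term`, not the real type.

```rust
/// Hypothetical stand-in for the cursor state kept in `term`.
/// Like the code in the diff, it counts bytes, so it assumes ASCII input.
pub struct Line {
    pub input: String,
    pub idx: usize,
}

impl Line {
    /// Last valid cursor index: 0 for an empty line, `len - 1` otherwise.
    pub fn last_idx(&self) -> usize {
        self.input.len().saturating_sub(1)
    }

    /// Move right, stopping at the last character.
    pub fn right(&mut self) {
        if self.idx < self.last_idx() {
            self.idx += 1;
        }
    }

    /// Move left, stopping at the start of the line.
    pub fn left(&mut self) {
        self.idx = self.idx.saturating_sub(1);
    }

    /// Jump to the start of the line.
    pub fn home(&mut self) {
        self.idx = 0;
    }

    /// Jump to the last character (or stay at 0 on an empty line).
    pub fn end(&mut self) {
        self.idx = self.last_idx();
    }
}
```

Keeping the bound in one helper means `Right` and `End` cannot drift apart, which is the main design point of the sketch.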

@ -1,4 +1,4 @@
use crate::parser::tokens::Tokens;
use crate::parser::tokens::{Token, Tokens};
use anyhow::{bail, Context, Result};
#[derive(Debug)]
@ -126,7 +126,7 @@ pub enum Expression {
#[derive(Debug)]
struct Tree {
tokens: Vec<Tokens>,
tokens: Vec<Token>,
i: usize
}
@ -177,7 +177,7 @@ impl Tree {
buf.push(val);
if self.i >= end - 1 { break }
self.i += 1;
token = self.tokens.get(self.i).unwrap();
token = &self.tokens.get(self.i).unwrap().token;
if matches!(token, Tokens::CommandEnd(_)) { break }
}
match &token {
@ -196,7 +196,7 @@ impl Tree {
self.inc();
let mut len = 0;
for token in &self.tokens[self.i..] {
match token {
match token.token {
Tokens::ExportSet => { break },
_ => len += 1
}
@ -218,7 +218,7 @@ impl Tree {
let mut found_first = false;
for token in &self.tokens[self.i..] {
val_end += 1;
match token {
match token.token {
Tokens::Space => if found_first { break },
Tokens::CommandEnd(_) => if !found_first { bail!("Unexpected command end") } else { break },
Tokens::FileRead => bail!("Unexpected file read (<)"),
@ -242,7 +242,7 @@ impl Tree {
let mut found_first = false;
for token in &self.tokens[self.i..] {
val_end += 1;
match token {
match token.token {
Tokens::Space => if found_first { break },
Tokens::CommandEnd(_) => if !found_first { bail!("Unexpected command end") } else { break },
Tokens::FileRead => bail!("Unexpected file read (<)"),
@ -381,7 +381,7 @@ impl Tree {
let mut lvl = 1;
self.inc();
for token in &self.tokens[self.i..] {
match token {
match token.token {
Tokens::SubStart => lvl += 1,
Tokens::StringFunction(_) => lvl += 1,
Tokens::ArrayFunction(_) => lvl += 1,
@ -468,7 +468,7 @@ impl Tree {
let mut lvl = 1;
self.inc();
for token in &self.tokens[self.i..] {
match token {
match token.token {
Tokens::ParenthesisStart => lvl += 1,
Tokens::ParenthesisEnd => lvl -= 1,
_ => {}
@ -547,11 +547,11 @@ impl Tree {
self.i += 1;
self
}
fn get_current_token(&self) -> &Tokens { self.tokens.get(self.i).unwrap() }
fn get_next_token(&self) -> &Tokens { self.tokens.get(self.i + 1).unwrap() }
fn get_current_token(&self) -> &Tokens { &self.tokens.get(self.i).unwrap().token }
fn get_next_token(&self) -> &Tokens { &self.tokens.get(self.i + 1).unwrap().token }
}
pub fn build_tree(tokens: Vec<Tokens>) -> Result<Vec<Expression>> {
pub fn build_tree(tokens: Vec<Token>) -> Result<Vec<Expression>> {
let mut expressions: Vec<Expression> = Vec::new();
let mut tree = Tree { tokens, i: 0 };
loop {
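
In the hunks above the parser only reaches through the new wrapper via `.token`; the stored `start`/`end` are not used yet. A possible next step, sketched here under assumptions, is to attach them to error messages. The `Tokens` enum below is a reduced stand-in (the `Word` variant is hypothetical), `check` is a hypothetical helper, and a plain `String` error replaces `anyhow` to keep the example self-contained.

```rust
/// Reduced stand-in for the real `Tokens` enum; `Word` is hypothetical.
#[derive(Debug)]
pub enum Tokens {
    Space,
    FileRead,
    Word(String),
}

/// Same shape as the `Token` struct introduced by this commit.
#[derive(Debug)]
pub struct Token {
    pub token: Tokens,
    pub start: usize,
    pub end: usize,
}

/// Rejects the first `<` it finds and reports where it was seen.
pub fn check(tokens: &[Token]) -> Result<(), String> {
    for token in tokens {
        // Match on a reference so nothing is moved out of the slice.
        match &token.token {
            Tokens::FileRead => {
                return Err(format!(
                    "Unexpected file read (<) at {}..{}",
                    token.start, token.end
                ));
            }
            _ => {}
        }
    }
    Ok(())
}
```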

@ -1,5 +1,12 @@
use anyhow::{Result, bail};
#[derive(Debug)]
pub struct Token {
pub token: Tokens,
pub start: usize,
pub end: usize
}
#[derive(Debug)]
pub enum Tokens {
Space,
@ -86,7 +93,7 @@ impl Tokens {
}
fn read_var_ahead(i: usize, text: &String) -> (usize, Tokens) {
fn read_var_ahead(i: usize, text: &String) -> (usize, Token) {
let mut x = i;
let mut buf = String::new();
let parens_mode = text.chars().nth(x + 1).unwrap() == '{';
@ -113,14 +120,14 @@ fn read_var_ahead(i: usize, text: &String) -> (usize, Tokens) {
}
}
let token = match text.chars().nth(i).unwrap() {
'$' => Tokens::StringVariable(buf, parens_mode),
'@' => Tokens::ArrayVariable(buf, parens_mode),
'$' => Token { token: Tokens::StringVariable(buf, parens_mode), start: i, end: i + x },
'@' => Token { token: Tokens::ArrayVariable(buf, parens_mode), start:i , end: i+x },
a => panic!("Invalid value {}", a)
};
(x - i - 1, token)
}
pub fn tokenize(reader: &mut dyn std::io::BufRead) -> Result<Vec<Tokens>> {
pub fn tokenize(reader: &mut dyn std::io::BufRead) -> Result<Vec<Token>> {
let mut quote_active = false;
let mut double_quote_active = false;
let mut escape_active = false;
@ -128,10 +135,10 @@ pub fn tokenize(reader: &mut dyn std::io::BufRead) -> Result<Vec<Tokens>> {
reader.read_to_string(&mut text)?;
let mut text_length = text.len();
let mut tokens: Vec<Tokens> = Vec::new();
let mut tokens: Vec<Token> = Vec::new();
fn save_buf(buf: &mut String, tokens: &mut Vec<Tokens>) {
if buf.len() > 0 { tokens.push(Tokens::detect(std::mem::take(buf))) }
fn save_buf(buf: &mut String, tokens: &mut Vec<Token>, i: usize) {
if buf.len() > 0 { tokens.push(Token { token: Tokens::detect(std::mem::take(buf)), end: i, start: i - buf.len() }) }
}
let mut buf = String::new();
@ -147,22 +154,22 @@ pub fn tokenize(reader: &mut dyn std::io::BufRead) -> Result<Vec<Tokens>> {
'"' => if !escape_active && !quote_active { double_quote_active = !double_quote_active; buf_add = false },
'\'' => if !escape_active && !double_quote_active { quote_active = !quote_active; buf_add = false },
'$' => if !escape_active && !quote_active {
save_buf(&mut buf, &mut tokens);
save_buf(&mut buf, &mut tokens, i);
if text_length > i && text.chars().nth(i + 1).unwrap() == '(' {
tokens.push(Tokens::SubStart);
tokens.push(Token { token: Tokens::SubStart, start: i, end: i+1 });
skipper = 1;
buf_add = false;
} else {
let (skippers, mut token) = read_var_ahead(i, &text);
match token {
match token.token {
Tokens::StringVariable(ref str, bool) => if !bool && !double_quote_active {
if text.len() > i + skippers && text.chars().nth(i + skippers).unwrap() == '(' {
token = Tokens::StringFunction(str.clone());
token = Token { token: Tokens::StringFunction(str.clone()), end: i + skippers, start: i };
}
},
Tokens::ArrayVariable(ref str, bool) => if !bool && !double_quote_active {
if text.len() > i + skippers && text.chars().nth(i + skippers).unwrap() == '(' {
token = Tokens::ArrayFunction(str.clone());
token = Token { token: Tokens::ArrayFunction(str.clone()), end: i+skippers, start: i };
}
}
_ => bail!("Cannot happen")
@ -173,8 +180,8 @@ pub fn tokenize(reader: &mut dyn std::io::BufRead) -> Result<Vec<Tokens>> {
}
},
';' | '\r' | '\n' => if !escape_active && !quote_active && !double_quote_active {
save_buf(&mut buf, &mut tokens);
tokens.push(Tokens::CommandEnd(letter.clone()));
save_buf(&mut buf, &mut tokens, i);
tokens.push(Token { token: Tokens::CommandEnd(letter.clone()), start: i, end: i });
let mut x = 0;
while x < text.len() - 1 && matches!(text.chars().nth(x).unwrap(), '\n' | '\r' | ';' | ' ') {
x += 1;
@ -185,28 +192,28 @@ pub fn tokenize(reader: &mut dyn std::io::BufRead) -> Result<Vec<Tokens>> {
buf_add = false;
},
'&' => if !escape_active && !quote_active && !double_quote_active {
save_buf(&mut buf, &mut tokens);
save_buf(&mut buf, &mut tokens, i);
if i + 1 < text.len() && text.chars().nth(i+1).unwrap() == '&' {
tokens.push(Tokens::And);
tokens.push(Token { token: Tokens::And, start: i, end: i+1 });
skipper = 1;
} else {
tokens.push(Tokens::JobCommandEnd);
tokens.push(Token { token: Tokens::JobCommandEnd, start: i , end: i });
}
buf_add = false;
},
'|' => if !escape_active && !quote_active && !double_quote_active {
save_buf(&mut buf, &mut tokens);
save_buf(&mut buf, &mut tokens, i);
if i + 1 < text.len() && text.chars().nth(i+1).unwrap() == '|' {
tokens.push(Tokens::Or);
tokens.push(Token { token: Tokens::Or, start: i, end: i+1 });
skipper = 1;
} else {
tokens.push(Tokens::RedirectInto);
tokens.push(Token { token: Tokens::RedirectInto, start: i, end: i });
}
buf_add = false;
},
' ' => if !escape_active && !quote_active && !double_quote_active {
save_buf(&mut buf, &mut tokens);
tokens.push(Tokens::Space);
save_buf(&mut buf, &mut tokens, i);
tokens.push(Token { token: Tokens::Space, start: i, end: i });
let mut x = i;
while text.chars().nth(x).unwrap() == ' ' {
x += 1;
@ -215,13 +222,13 @@ pub fn tokenize(reader: &mut dyn std::io::BufRead) -> Result<Vec<Tokens>> {
buf_add = false;
},
'(' => if !quote_active && !double_quote_active && !escape_active {
save_buf(&mut buf, &mut tokens);
tokens.push(Tokens::ParenthesisStart);
save_buf(&mut buf, &mut tokens, i);
tokens.push(Token { token: Tokens::ParenthesisStart, start: i, end: i });
buf_add = false;
}
')' => if !quote_active && !double_quote_active && !escape_active {
save_buf(&mut buf, &mut tokens);
tokens.push(Tokens::ParenthesisEnd);
save_buf(&mut buf, &mut tokens, i);
tokens.push(Token { token: Tokens::ParenthesisEnd, start: i, end: i });
buf_add = false;
},
'\\' => if !escape_active {
@ -231,12 +238,12 @@ pub fn tokenize(reader: &mut dyn std::io::BufRead) -> Result<Vec<Tokens>> {
escape_active = false;
},
'=' => if !escape_active && !quote_active && !double_quote_active {
save_buf(&mut buf, &mut tokens);
tokens.push(Tokens::ExportSet);
save_buf(&mut buf, &mut tokens, i);
tokens.push(Token { token: Tokens::ExportSet, start: i, end: i });
buf_add = false;
},
'#' => if !escape_active && !quote_active && !double_quote_active {
save_buf(&mut buf, &mut tokens);
save_buf(&mut buf, &mut tokens, i);
buf_add = false;
let mut x = 0;
while x + i + 1 < text.len() && text.chars().nth(x + i + 1).unwrap() != '\n' {
@ -251,7 +258,7 @@ pub fn tokenize(reader: &mut dyn std::io::BufRead) -> Result<Vec<Tokens>> {
buf.push(*letter);
}
}
save_buf(&mut buf, &mut tokens);
save_buf(&mut buf, &mut tokens, text.len());
Ok(tokens)
}
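
A note on the new `save_buf`: Rust evaluates struct field initialisers in the order they are written, so in `Token { token: Tokens::detect(std::mem::take(buf)), end: i, start: i - buf.len() }` the buffer has already been emptied by the time `buf.len()` is read, and `start` ends up equal to `i`. The sketch below shows one way to avoid that by measuring the length first. `Span` is a hypothetical simplified stand-in for `Token`, `end` is treated as exclusive, and the buffer is assumed to hold exactly the source bytes (quotes and escapes that are dropped from the buffer would need separate handling).

```rust
/// Hypothetical simplified stand-in for `Token`: raw text plus positions.
#[derive(Debug, PartialEq)]
pub struct Span {
    pub text: String,
    pub start: usize,
    pub end: usize,
}

/// Flushes `buf` as a span that ends just before byte offset `i`.
/// Assumes `i >= buf.len()`, i.e. the buffer was filled from the same text.
pub fn save_buf(buf: &mut String, out: &mut Vec<Span>, i: usize) {
    if buf.is_empty() {
        return;
    }
    // Measure before `take` empties the buffer: field initialisers run in
    // source order, so reading `buf.len()` after the take would yield 0.
    let len = buf.len();
    out.push(Span {
        text: std::mem::take(buf),
        start: i - len,
        end: i,
    });
}
```

Flushing the buffer `"echo"` at offset `4` then yields a span covering `0..4` and leaves the buffer empty for the next word.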