Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add better errors for missing commas in arrays and objects #5210

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
6 changes: 4 additions & 2 deletions src/lang/abstractSyntaxTree.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,9 @@ describe('parsing errors', () => {
const result = parse(code)
if (err(result)) throw result
const error = result.errors[0]
expect(error.message).toBe('Array is missing a closing bracket(`]`)')
expect(error.sourceRange).toEqual([28, 29, 0])
expect(error.message).toBe(
'Unexpected character encountered. You might be missing a comma in between elements.'
)
expect(error.sourceRange).toEqual([29, 31, 0])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer we didn't test error messages from type script, can we just remove this check (or the whole file)?

Copy link
Collaborator Author

@jtran jtran Feb 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

})
})
178 changes: 152 additions & 26 deletions src/wasm-lib/kcl/src/parsing/parser.rs
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@
error::{ErrMode, StrContext, StrContextValue},
prelude::*,
stream::Stream,
token::{any, one_of, take_till},
token::{any, one_of, take_till, take_until},
};

use super::{
Expand Down Expand Up @@ -739,20 +739,38 @@
.context(expected("array contents, a list of elements (like [1, 2, 3])"))
.parse_next(i)?;
ignore_whitespace(i);
let end = close_bracket(i)
.map_err(|e| {
if let Some(mut err) = e.clone().into_inner() {
err.cause = Some(CompilationError::fatal(
open.as_source_range(),
"Array is missing a closing bracket(`]`)",
));
ErrMode::Cut(err)
} else {
// ErrMode::Incomplete, not sure if it's actually possible to end up with this here
e
}
})?
.end;

let maybe_closing_bracket: PResult<TokenSlice<'_>> = peek(take_until(0.., "]")).parse_next(i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a little eager to find a closing bracket - it can scan to the end of the source file(unless I misunderstand what Winnow would do) so could get a lot of false positives (as well as being bad for performance). At the least, we shouldn't do this check until we know we are recovering from an error (and even then it feels bad since we might call this function to test if this is an array). But I think for correctness the forward search should be terminated by anything which can't occur before a ] in correct code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right. I can definitely move the check into the if statement on 761. How would you suggest checking for incorrect literals before the ]? If I'm thinking about it correctly, we would need to track the literals as we progressed through the tokens.
Like, a parenthesis could be invalid, but only if it's not balanced. Same with a comma. ,, is definitely wrong, but multiple commas aren't inherently incorrect

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the closing bracket checks into the error branch. Let me know about how best to go about bailing out earlier and I can add that in as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's not easy! I don't know how best to integrate it with Winnow. Typically for this kind of recovery you come up with a set of tokens which cannot appear within the sequence, the easiest place to start is keywords. For matching delimiters you need to use your own stack.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I got that working as requested I think. I didn't do anything too fancy for things like braces, brackets, and commas. Let me know if I should adjust or add anything in. I had some other comments/questions I'll leave on this PR once it's up to date with my branch

let has_closing_bracket = maybe_closing_bracket.is_ok();
let maybe_end = close_bracket(i).map_err(|e| {
if let Some(mut err) = e.clone().into_inner() {
err.cause = Some(CompilationError::fatal(
open.as_source_range(),
"Array is missing a closing bracket(`]`)",
));
ErrMode::Cut(err)
} else {
// ErrMode::Incomplete, not sure if it's actually possible to end up with this here
e

Check warning on line 754 in src/wasm-lib/kcl/src/parsing/parser.rs

View check run for this annotation

Codecov / codecov/patch

src/wasm-lib/kcl/src/parsing/parser.rs#L754

Added line #L754 was not covered by tests
}
});

// if there is a closing bracket at some point, but it's not the next token after skipping the
// whitespace and ignoring a trailing comma, it's likely that they forgot a comma between some
// of the elements
if has_closing_bracket && maybe_end.is_err() {
// safe to unwrap here because we checked it was Ok above
let start_range = maybe_closing_bracket.unwrap().as_source_range();
let e = ContextError {
context: vec![],
cause: Some(CompilationError::fatal(
start_range,
"Unexpected character encountered. You might be missing a comma in between elements.",
)),
};
return Err(ErrMode::Cut(e));
}
let end = maybe_end?.end;

// Sort the array's elements (i.e. expression nodes) from the noncode nodes.
let (elements, non_code_nodes): (Vec<_>, HashMap<usize, _>) = elements.into_iter().enumerate().fold(
Expand Down Expand Up @@ -811,7 +829,7 @@
}

fn object_property_same_key_and_val(i: &mut TokenSlice) -> PResult<Node<ObjectProperty>> {
let key = nameable_identifier.context(expected("the property's key (the name or identifier of the property), e.g. in 'height: 4', 'height' is the property key")).parse_next(i)?;
let key = nameable_identifier.context(expected("the property's key (the name or identifier of the property), e.g. in 'height = 4', 'height' is the property key")).parse_next(i)?;
ignore_whitespace(i);
Ok(Node {
start: key.start,
Expand All @@ -837,7 +855,7 @@
ignore_whitespace(i);
let expr = expression_but_not_ascription
.context(expected(
"the value which you're setting the property to, e.g. in 'height: 4', the value is 4",
"the value which you're setting the property to, e.g. in 'height = 4', the value is 4",
))
.parse_next(i)?;

Expand Down Expand Up @@ -870,7 +888,7 @@
alt((
// Normally you need a comma.
comma_sep,
// But, if the array is ending, no need for a comma.
// But, if the object is ending, no need for a comma.
peek(preceded(opt(whitespace), close_brace)).void(),
))
.parse_next(i)
Expand All @@ -893,10 +911,45 @@
)),
)
.context(expected(
"a comma-separated list of key-value pairs, e.g. 'height: 4, width: 3'",
"a comma-separated list of key-value pairs, e.g. 'height = 4, width = 3'",
))
.parse_next(i)?;
ignore_trailing_comma(i);
ignore_whitespace(i);

let maybe_closing_brace: PResult<TokenSlice<'_>> = peek(take_until(0.., "}")).parse_next(i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above

let has_closing_brace = maybe_closing_brace.is_ok();
// This will be an error if the next token is not a `}`, this can be because they forgot to
// include a `}` entirely or they forgot a comma somewhere in the object
let maybe_end = close_brace(i).map_err(|e| {
if let Some(mut err) = e.clone().into_inner() {
err.cause = Some(CompilationError::fatal(
open.as_source_range(),
"Object is missing a closing brace(`}`)",
));
ErrMode::Cut(err)
} else {
// ErrMode::Incomplete, not sure if it's actually possible to end up with this here
e

Check warning on line 933 in src/wasm-lib/kcl/src/parsing/parser.rs

View check run for this annotation

Codecov / codecov/patch

src/wasm-lib/kcl/src/parsing/parser.rs#L933

Added line #L933 was not covered by tests
}
});
// if there is a closing brace at some point, but it's not the next token after skipping the
// whitespace and ignoring a trailing comma, it's likely that they forgot a comma between some
// of the properties
if has_closing_brace && maybe_end.is_err() {
// okay to unwrap here because we checked it was Ok above
let start_range = maybe_closing_brace.unwrap().as_source_range();
let e = ContextError {
context: vec![],
cause: Some(CompilationError::fatal(
start_range,
"Unexpected character encountered. You might be missing a comma in between properties.",
)),
};
return Err(ErrMode::Cut(e));
}

let end = maybe_end?.end;
// Sort the object's properties from the noncode nodes.
let (properties, non_code_nodes): (Vec<_>, HashMap<usize, _>) = properties.into_iter().enumerate().fold(
(Vec::new(), HashMap::new()),
Expand All @@ -912,9 +965,7 @@
(properties, non_code_nodes)
},
);
ignore_trailing_comma(i);
ignore_whitespace(i);
let end = close_brace(i)?.end;

let non_code_meta = NonCodeMeta {
non_code_nodes,
..Default::default()
Expand Down Expand Up @@ -3694,7 +3745,7 @@
assert_eq!(
src_expected,
src_actual,
"expected error would highlight {} but it actually highlighted {}",
"expected error would highlight `{}` but it actually highlighted `{}`",
&p[src_expected[0]..src_expected[1]],
&p[src_actual[0]..src_actual[1]],
);
Expand Down Expand Up @@ -3769,7 +3820,7 @@
#[test]
fn test_parse_member_expression_binary_expression_in_array_number_second_missing_space() {
let code = r#"obj = { a: 1, b: 2 }
height = [obj["a"] -1, 0]"#;
height = [obj["a"] -1, 0]"#;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implies there's indentation in the KCL code, which is probably not intended

crate::parsing::top_level_parse(code).unwrap();
}

Expand Down Expand Up @@ -3885,7 +3936,11 @@

#[test]
fn test_parse_weird_lots_of_fancy_brackets() {
assert_err(r#"zz({{{{{{{{)iegAng{{{{{{{##"#, "Unexpected token: (", [2, 3]);
assert_err(
r#"zz({{{{{{{{)iegAng{{{{{{{##"#,
"Object is missing a closing brace(`}`)",
[3, 4],
);
}

#[test]
Expand Down Expand Up @@ -4447,11 +4502,82 @@
}

#[test]
fn test_parse_missing_closing_bracket() {
fn test_parse_array_missing_closing_bracket() {
let some_program_string = r#"
sketch001 = startSketchOn('XZ') |> startProfileAt([90.45, 119.09, %)"#;
assert_err(some_program_string, "Array is missing a closing bracket(`]`)", [51, 52]);
}
#[test]
fn test_parse_array_missing_comma() {
let some_program_string = r#"
sketch001 = startSketchOn('XZ') |> startProfileAt([90.45 119.09], %)"#;
assert_err(
some_program_string,
"Unexpected character encountered. You might be missing a comma in between elements.",
[52, 65],
);
}
#[test]
fn test_parse_array_random_brace() {
let some_program_string = r#"
sketch001 = startSketchOn('XZ') |> startProfileAt([}], %)"#;
assert_err(
some_program_string,
"Unexpected character encountered. You might be missing a comma in between elements.",
[52, 54],
);
}

#[test]
fn test_parse_object_missing_closing_brace() {
let some_program_string = r#"{
foo = bar,
"#;

assert_err(some_program_string, "Object is missing a closing brace(`}`)", [0, 1]);
}
#[test]
fn test_parse_object_missing_comma() {
let some_program_string = r#"{
foo = bar,
bar = foo
bat = man
}"#;

assert_err(
some_program_string,
"Unexpected character encountered. You might be missing a comma in between properties.",
[37, 78],
);
}

#[test]
fn test_parse_object_random_bracket() {
let some_program_string = r#"{]}"#;

assert_err(
some_program_string,
"Unexpected character encountered. You might be missing a comma in between properties.",
[1, 3],
);
}

#[test]
fn test_parse_object_shorthand_missing_comma() {
let some_program_string = r#"
bar = 1
{
foo = bar,
bar
bat = man
}"#;

assert_err(
some_program_string,
"Unexpected character encountered. You might be missing a comma in between properties.",
[54, 89],
);
}

#[test]
fn warn_object_expr() {
Expand Down
19 changes: 19 additions & 0 deletions src/wasm-lib/kcl/src/parsing/token/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -147,7 +147,9 @@ impl IntoIterator for TokenStream {
#[derive(Debug, Clone)]
pub(crate) struct TokenSlice<'a> {
stream: &'a TokenStream,
/// Current position of the leading Token in the stream
start: usize,
/// The number of total Tokens in the stream
end: usize,
}

Expand Down Expand Up @@ -175,6 +177,15 @@ impl<'a> TokenSlice<'a> {
stream: self.stream,
}
}

pub fn as_source_range(&self) -> SourceRange {
let first_token = self.token(0);
SourceRange::new(
first_token.start,
self.stream.tokens[self.end].end,
first_token.module_id,
)
}
}

impl<'a> IntoIterator for TokenSlice<'a> {
Expand Down Expand Up @@ -279,6 +290,14 @@ impl<'a> winnow::stream::StreamIsPartial for TokenSlice<'a> {
}
}

impl<'a> winnow::stream::FindSlice<&str> for TokenSlice<'a> {
fn find_slice(&self, substr: &str) -> Option<std::ops::Range<usize>> {
self.iter()
.enumerate()
.find_map(|(i, b)| if b.value == substr { Some(i..self.end) } else { None })
}
}

#[derive(Clone, Debug)]
pub struct Checkpoint(usize, usize);

Expand Down
Loading