-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Custom delimiter on array fields #227
Conversation
8533d42
to
bf5a5fb
Compare
bf5a5fb
to
0942c3c
Compare
@@ -300,5 +300,34 @@ describe('Parse Values - parseFieldValue', () => { | |||
it('Boolean array field, rejects array where value is missing (two delimiters are adjacent)'); | |||
expect(parseFieldValue(',true,false,TRUE', fieldBooleanArrayRequired).success).false; | |||
}); | |||
describe('Custom delimiter', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice feature! "out loud" thoughts about this, not to question your coverage but to narrate my analysis: asking myself "how do you escape the delimiter if it's also meant to be used as a value?" like you would in a comprehensive array of symbols. the answer I'd guess would be "use a compound delimiter", and then I realised it's not clear what happens if I give the delimiter repeatedly. does the array end up with an empty value? or does the function filter empties?
e.g. assume default (i.e. ,
), what if I give you symbols = ',",\,%,,,#,*
do I end up with [' , " , \ , % , , , # , *].length === 8
? or [' , " , \ , % , # , *].length === 6
... and what would be the representation of the quotes in JS, since one of the two first elements here would have to be escaped to make a string.
There's special handling here that could use being documented, methinks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you put the delimiter back-to-back this results in an empty value between them. This is not allowed in arrays where each element is expected to have a value. Validation of the data will fail.
lectern/packages/validation/test/parseValues/parseField.spec.ts
Lines 269 to 271 in 1c32ea4
it('Rejects array where value is missing (two delimiters are adjacent)', () => { | |
const result = parseFieldValue('12, 45,,-111111', fieldIntegerArrayRequired); | |
expect(result.success).false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no handling for escaping characters. If you are providing data in a TSV, all characters are allowed. If you are providing data in JSON, then the JSON parsing will strip escape characters. In general, parsing raw data handles escape characters, once the string is a variable in memory we do no escape character handling other than splitting on the delimiter.
We do not have a mechanism for allowing the delimiter in the string value. We could investigate this but I prefer pushing for the dictionary to use a delimiter that is not part of the data set (prefer because its simpler for devs and data submitters, though it puts the burden on the dictionary author)
| -------------- | -------- | ---------------------- | --------------------------------------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------- | | ||
| `name` | Required | | NameString (no whitespace or `.`) | Name of the field. This will be used as the header in TSV files in this field's schema, and in any paths referencing this field. | `"example_field` | | ||
| `valueType` | Required | | [Field Data Type](#field-data-types) | Type of value stored in this field | `"string"` | | ||
| `delimiter` | Optional | `,` | `string` | Character or string that will be used to split multiple values into an array. The default delimiter is a comma `,`. Any characters can be used as a delimiter. The delimiter value can be one or more characters long, but cannot be an empty string. Note: This property has no effect unless the field has `isArray: true`. | `"\|"` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to double check and confirm, I assume there's validation and TS preventing a submission where valueType: 'integer', isArray: true, delimiter: '0'
?
isArray is true but valueType is not string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is not. That is a valid delimiter, and it would prevent numeric values with 0
in them. I would advise against using this delimiter but the spec allows it atm.
@@ -138,7 +138,7 @@ const convertArrayValue = (value: string, fieldDefinition: SchemaField): Result< | |||
|
|||
/* === Start of convertArrayValue logic === */ | |||
const { valueType } = fieldDefinition; | |||
const delimiter = DEFAULT_DELIMITER; | |||
const delimiter = fieldDefinition.delimiter !== undefined ? fieldDefinition.delimiter : DEFAULT_DELIMITER; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line as is would allow empty strings, null values or arrays be used as a delimiter; empty strings not being a delimiter at all. am I missing something here?
possible fix/workaround:
const delimiter = fieldDefinition.delimiter !== undefined ? fieldDefinition.delimiter : DEFAULT_DELIMITER; | |
const delimiter = typeof fieldDefinition.delimiter === 'string' && fieldDefinition.delimiter.length ? fieldDefinition.delimiter : DEFAULT_DELIMITER; |
if the argument "TypeScript would not let that happen at this point" may seem tempting, that's not a tool available at PR reviews shrug... also does TS check for string length?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just realised zod schema states min 1... code here could still benefit from the additional declarativeness, though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 ...
This is good type checking and I am considering how this fits.
- Any dictionary that has been validated by the lectern server or client will have a minimum delimiter length of 1.
- Any use of this with a dictionary that hasn't been validated could have
''
as a delimiter. Splitting a string by this results in each character being its own string value to be parsed.
If I provide a delimiter of ''
to the parsing function, the above behaviour is consistent. It is distinct behaviour from an undefined delimiter value. We also protect against this use case for any dictionaries read from a lectern server or validated using the dictionary package.
Alternately - with the proposed solution - a use case where a non-validated delimiter of ''
is provided, no splitting would occur. This is inconsistent since there is a delimiter provided but not used. I can see the situation where a user expects this behaviour, since they consider this as a "no value" case (though, it is not). In this case, some documentation would be good. The documentation does say that it shouldn't be an empty string, but I can make it more explicit.
Both of these outcomes are
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left comment+suggestion to improve declarativeness as well as triple ensure type safety for the custom delimiters 👍
Nice feature!
Ticket: #204
Objective
Adds the ability to define a custom delimiter on any field. If the field has an array value, the value will use this custom delimiter to split the string value into an array.
The property is named
delimiter
and accepts a string value. Delimiter values must be at least one character in length, but can be longer and can contain any combination of characters.The dictionary-reference documentation has been updated to reflect the array delimiter implementation.
Example
String array split on
|
character:Then when used to parse data by the function
parseFieldValue()
from the validation library: