Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make get, decode, and split handle edge cases correctly #1769

Merged
merged 2 commits into from
Aug 30, 2024

Conversation

annevk
Copy link
Member

@annevk annevk commented Aug 19, 2024

Correct how it handles when a header is present but has no value (should be a list solely containing an empty string value) and when there is a trailing comma (should be a list ending with an empty string value).

Fixes #1768.


@domenic @MattMenke2 this was an oversight in #831 it appears. If you'd like to review in addition to @CarloCannas I'd appreciate it. Thanks!


Preview | Diff

@CarloCannas
Copy link

Hmm, what about an header with empty value?

If they are allowed (which seems the case to me, reading both RFC 9110 and header value definition) I think this algorithm should return a list of a single empty string, not an empty list.

What do you think?

Maybe something like this? (sorry for the bad formatting)

    Let input be the result of isomorphic decoding value.

    Let position be a position variable for input, initially pointing at the start of input.

    Let values be a list of strings, initially empty.

    Let temporaryValue be the empty string.

    While true:

        Append the result of collecting a sequence of code points that are not U+0022 (") or U+002C (,) from input, given position, to temporaryValue.

        The result might be the empty string.

        If position is not past the end of input and the code point at position within input is U+0022 ("), then:

                Append the result of collecting an HTTP quoted string from input, given position, to temporaryValue.

                If position is not past the end of input, then continue.

        Remove all HTTP tab or space from the start and end of temporaryValue.

        Append temporaryValue to values.

        If position is past the end of input

        	return values

        Set temporaryValue to the empty string.

        Assert: the code point at position within input is U+002C (,).

        Advance position by 1.

@MattMenke2
Copy link
Contributor

Without reviewing the actual verbiage, I agree with CarloCannas that a terminal comma should result in adding an empty string.

Specs allow for:

Foo: Bar
Foo:

to be replaced with:

Foo: Bar,

So the latter should presumably return two values when parsed.

@annevk
Copy link
Member Author

annevk commented Aug 19, 2024

@CarloCannas yeah, that seems reasonable and it looks like that should be compatible with existing callers. I guess with that in theory we could make the "parent" algorithm return the empty list instead of null when the header is missing, but null is clearer. I'll see about making that change.

@annevk annevk changed the title Make get, decode, and split handle a trailing comma correctly Make get, decode, and split handle edge cases correctly Aug 19, 2024
Copy link
Member

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't really run this algorithm in my head enough to verify it's correct, but editorially this seems reasonable. Let's count on @CarloCannas for the normative review :).

A reference implementation and tests file with 100% code coverage would be the best solution of course.

@annevk
Copy link
Member Author

annevk commented Aug 27, 2024

@CarloCannas any further thoughts on this?

I'll merge this Monday if there's no more feedback.

@CarloCannas
Copy link

Hi Anne,
I reviewed the changes again, it looks fine to me. Go ahead and merge!

@annevk annevk merged commit 3153e5e into main Aug 30, 2024
2 checks passed
@annevk annevk deleted the annevk/trailing-comma branch August 30, 2024 05:34
jamierocks added a commit to jamierocks/ladybird that referenced this pull request Sep 25, 2024
awesomekling pushed a commit to LadybirdBrowser/ladybird that referenced this pull request Sep 26, 2024
ebanner pushed a commit to ebanner/ladybird that referenced this pull request Oct 15, 2024
nico pushed a commit to nico/serenity that referenced this pull request Nov 9, 2024
See:
- whatwg/fetch#1769
- whatwg/fetch@3153e5e

(cherry picked from commit 84351dfa51dc8bd0046c3a9ec3e574c58fe9f790)
nico pushed a commit to nico/serenity that referenced this pull request Nov 10, 2024
See:
- whatwg/fetch#1769
- whatwg/fetch@3153e5e

(cherry picked from commit 84351dfa51dc8bd0046c3a9ec3e574c58fe9f790)
nico pushed a commit to SerenityOS/serenity that referenced this pull request Nov 10, 2024
See:
- whatwg/fetch#1769
- whatwg/fetch@3153e5e

(cherry picked from commit 84351dfa51dc8bd0046c3a9ec3e574c58fe9f790)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

"get, decode, and split" doesn't return example output
4 participants