Bracketed paste support #73257

adamsitnik · 2022-08-02T18:42:22Z

ghost · 2022-08-02T18:42:44Z

Tagging subscribers to this area: @dotnet/area-system-console
See info in area-owners.md if you want to be subscribed.

Issue Details

Author:	adamsitnik
Assignees:	-
Labels:	`area-System.Console`
Milestone:	7.0.0

am11 · 2022-08-02T19:32:13Z

src/libraries/System.Console/tests/KeyParserTests.cs

+    [InlineData("\u001B[201~\u001B[200~", "")] // same, but with no content
+    // Terminals that were tested don't allow for pasting empty clipboard to the Terminal.
+    // In case some allow for that, an empty content is returned and the caller must re-try read() call
+    [InlineData("\u001B[200~\u001B[201~", "")]


Perhaps it would make sense to add one more case for nested start sequence:

[InlineData("\u001B[200~nested \u001B[200~brackets \u001B[201~start \u001B[201~", "nested brackets start ")]

(trailing space is intentional)

@am11 I tested nesting by pasting "nested" to the terminal, reading the input via read(), copying to clipboard with https://github.com/SimonCropp/TextCopy) and pasting to the Terminal again:

it seems that in such cases xterm is removing the escape characters from the "nested" tags. I can add a test case for that, but imo it would be a little bit odd (verifying that incomplete tags remain in the input).

[InlineData("\u001B[200~[200~nested[201~\u001B[201~", "[200~nested[201~ ")

but I can definitely add white spaces to show that they are not being trimmed.

I added more test cases (whitespaces + nested)

I was testing it on Ubuntu terminal, and noticed this:

$ printf '\u001B[200~nested \u001B[200~brackets \u001B[201~start \u001B[201~' nested brackets start $

@am11 I see. But this is something that you have printed to the Terminal, not something that you have read from it. Is it possible for the Terminal to return such data from read?

I am asking because with the current implementation we avoid copying memory and the code is relatively simple. I would prefer to keep it that way if there is no evidence that there is scenario that we don't support.

Yes the terminal will remove the escape tag exactly the way this implementation does. If I am reading the change right, your implementation will remove the nested tag and extract the text in between just the way how terminal does it. That is why I was suggesting to test to preserve this behavior if it makes sense. Currently you added:

[InlineData("\u001B[200~[200~nested[201~\u001B[201~", "[200~nested[201~")]

which is missing \u001B from the nested start tag so it is not testing the nested tag scenario. We may as well remove it since [200~ without \u001B has no meaning.

not something that you have read from it.

That was the intention. printf will send the raw string. To see what value is read, I used xxd to get the hex representation:

$ printf '\u001B[200~nested \u001B[200~brackets \u001B[201~start \u001B[201~' | xxd 00000000: 1b5b 3230 307e 6e65 7374 6564 201b 5b32 .[200~nested .[2 00000010: 3030 7e62 7261 636b 6574 7320 1b5b 3230 00~brackets .[20 00000020: 317e 7374 6172 7420 1b5b 3230 317e 1~start .[201~ # same with 'echo -e' $ echo -e '\u001B[200~nested \u001B[200~brackets \u001B[201~start \u001B[201~' | xxd 00000000: 1b5b 3230 307e 6e65 7374 6564 201b 5b32 .[200~nested .[2 00000010: 3030 7e62 7261 636b 6574 7320 1b5b 3230 00~brackets .[20 00000020: 317e 7374 6172 7420 1b5b 3230 317e 0a 1~start .[201~.

am11 · 2022-08-03T09:42:29Z

src/libraries/System.Console/tests/KeyParserTests.cs

+    [InlineData(" test123\u001B[201~", " test123")] // no start tag
+    [InlineData("\u001B[200~", "")] // just start tag
+    // "nested" pasted to the Terminal, raw input copied and pasted again (yes, the escape characters got eaten by the xterm)
+    [InlineData("\u001B[200~[200~nested[201~\u001B[201~", "[200~nested[201~")]


Suggested change

[InlineData("\u001B[200~[200~nested[201~\u001B[201~", "[200~nested[201~")]

[InlineData("\u001B[200~\u001B[200~nested\u001B[201~ tag\u001B[201~ ", "nested tag ")]

DHowett · 2022-08-09T17:17:45Z

src/libraries/System.Console/src/System/IO/StdInReader.cs

+                            //    b) The second read populates the buffer with the pasted text and the end tag.
+                            // We need to assume that other scenarios can happen. That is why once we remove the tags, the loop checks
+                            // whether the unprocessed buffer is still empty. If it is, the read() is performed again.
+                            ReadOnlySpan<byte> parsableBytes = useNet6KeyParser ? receivedBytes : KeyParser.RemoveBracketedPasteTags(receivedBytes);


I'm a bit worried about how this fits into the consumer's "world view." I'm untrained here, and have not studied the new VT parsing APIs. However, it looks like here deep in StdInReader, the consumer loses the ability to determine whether an input string came in by paste or simply by virtue of it having all shown up in the same call to read().

Stripping out the bracketing sequences seems like a stopgap that will help get them out of the input stream, but not serve the overarching purpose of paste bracketing. 😄

Stripping out the bracketing sequences seems like a stopgap that will help get them out of the input stream, but not serve the overarching purpose of paste bracketing.

I am not sure if I understand. The code here is Console.ReadKey which is supposed to read a single key at a time. When user pastes "ABC", the first call is going to return A, second B, third C . No matter how big the pasted input will be, the order is going to be preserved.

DHowett · 2022-08-09T17:18:56Z

src/libraries/System.Console/src/System/IO/KeyParser.cs

+        return receivedBytes;
+
+        static bool IsTag(ReadOnlySpan<byte> b)
+            => b[0] == '\u001B' && b[1] == '[' && b[2] == '2' && b[3] == '0' && (b[4] == '0' || b[4] == '1') && b[5] == '~';


I may be reading this wrong, but it looks like this will also pass malformed sequences where the end tag comes before the start tag -- a frame shift could easily result in this happening "in the wild", right? If we miss part of the introducer, we might detect a span flowing from the next finalizer to the next introducer as having been a bracketed paste.

end tag comes before the start tag

As I wrote in the unit tests comments, this handles hypothetical scenario where read() returns end tag from previous pasted content followed by start tag from next pasted content.

Example: user pastes "AB", then "CD".

1st read returns: <startTag>AB
2nd read returns: </endTag><startTag>CD</endTag>

runtime/src/libraries/System.Console/tests/KeyParserTests.cs

Lines 413 to 416 in da4d351

// the test case above are based on manual testing and observations

// the test cases below were not reproduced during manual testing, but hypothetically are possible

[InlineData("\u001B[201~", "")] // just end tag

[InlineData("\u001B[201~\u001B[200~test123", "test123")] // end tag (from previous pasted content) and start tag from next one

I could stick only to the test cases I've confirmed with manual testing, but each Terminal may implement this feature differently and I can't guarantee that we are going to handle all of them properly.

Should we validate that we get at least one start and one end tag (and ignore any nested tags that may occur)?

jozkee

Is a parsing the right thing to do for bracketing paste? Why did you decided that? #60101 doesn't sound to me like is asking for this; it looks more like a question about the mode as-is being reliable or not.

Wouldn't be better to let .NET developers decide whether they want to honor bracketing paste themselves or not? And if they don't make the decision, anyone using a .NET program can just disable the mode.

jozkee · 2022-08-09T21:29:41Z

src/libraries/System.Console/src/System/IO/KeyParser.cs

+    // In theory, a single buffer can contain end tag from previous paste followed immediately by start tag from next.
+    internal static ReadOnlySpan<byte> RemoveBracketedPasteTags(ReadOnlySpan<byte> receivedBytes)
+    {
+        // The Terminal is configured to return from read() as soon as at least byte is available for reading.


Suggested change

// The Terminal is configured to return from read() as soon as at least byte is available for reading.

// The Terminal is configured to return from read() as soon as at least one byte is available for reading.

jozkee · 2022-08-09T21:50:28Z

src/libraries/System.Console/src/System/IO/StdInReader.cs

+                        int result = ReadStdin(bufPtr, BytesToBeRead);
+                        if (result > 0)
+                        {
+                            ReadOnlySpan<byte> receivedBytes = new (bufPtr, result);


Suggested change

ReadOnlySpan<byte> receivedBytes = new (bufPtr, result);

ReadOnlySpan<byte> receivedBytes = new(bufPtr, result);

jozkee · 2022-08-09T22:22:52Z

src/libraries/System.Console/src/System/IO/KeyParser.cs

+        return receivedBytes;
+
+        static bool IsTag(ReadOnlySpan<byte> b)
+            => b[0] == '\u001B' && b[1] == '[' && b[2] == '2' && b[3] == '0' && (b[4] == '0' || b[4] == '1') && b[5] == '~';


Should we validate that we get at least one start and one end tag (and ignore any nested tags that may occur)?

jozkee · 2022-08-09T22:29:39Z

src/libraries/System.Console/src/System/IO/KeyParser.cs

+        const int TagLength = 6; // "\u001B[200~".Length
+        while (receivedBytes.Length >= TagLength)
+        {
+            if (IsTag(receivedBytes.Slice(0, TagLength)))


Move the escape sequences to a static field and do this instead:

Suggested change

if (IsTag(receivedBytes.Slice(0, TagLength)))

if (receivedBytes.Slice(0, TagLength).SequenceEqual(s_startBracketedPaste))

jozkee · 2022-08-09T22:30:09Z

src/libraries/System.Console/src/System/IO/KeyParser.cs

+            {
+                receivedBytes = receivedBytes.Slice(TagLength);
+            }
+            else if (IsTag(receivedBytes.Slice(receivedBytes.Length - TagLength)))


Suggested change

else if (IsTag(receivedBytes.Slice(receivedBytes.Length - TagLength)))

else if (receivedBytes.Slice(receivedBytes.Length - TagLength).SequenceEqual(s_endBracketedPaste))

jozkee · 2022-08-09T22:30:44Z

src/libraries/System.Console/src/System/IO/KeyParser.cs

+        static bool IsTag(ReadOnlySpan<byte> b)
+            => b[0] == '\u001B' && b[1] == '[' && b[2] == '2' && b[3] == '0' && (b[4] == '0' || b[4] == '1') && b[5] == '~';


Suggested change

static bool IsTag(ReadOnlySpan<byte> b)

=> b[0] == '\u001B' && b[1] == '[' && b[2] == '2' && b[3] == '0' && (b[4] == '0' || b[4] == '1') && b[5] == '~';

static bool IsTag(ReadOnlySpan<byte> b)

=> b[0] == Escape && b[1] == '[' && b[2] == '2' && b[3] == '0' && (b[4] == '0' || b[4] == '1') && b[5] == '~';

adamsitnik · 2022-08-10T08:35:03Z

Is a parsing the right thing to do for bracketing paste? Why did you decided that? #60101 doesn't sound to me like is asking for this; it looks more like a question about the mode as-is being reliable or not.

I think it is the right thing as I would expect the users to prefer not to deal with removing the tags themselves.

cc @daxian-dbw who created the original issue. @daxian-dbw what is your opinion here?

daxian-dbw · 2022-08-10T17:39:32Z

@adamsitnik We want the tags to be preserved by ReadKey instead of being stripped off.

In PSReadLine, right click pasting causes lots of problems (see PowerShell/PSReadLine#579). So, we want a way to detect a right click pasting, and the Bracketed Paste mode sounds to be the right solution given that Windows Terminal supports it now.

I need to know a Bracketed Paste is started, then just collect the pasted content until seeing the end tag, and then add the content text as a whole to the PSReadLine's buffer. Without seeing the tags, PSReadLine will have to assume the incoming keys are from interactive typing and handle them one by one (existing behavior today), and that won't help in solving the right click pasting issue.

So, what we want would be for ReadKey to return back the tags as is (maybe including the [ char that is missing today?), and also want the confirmation from .NET team that .NET application can depend on this behavior to handle Bracketed Paste.

adamsitnik · 2022-08-10T18:22:26Z

We want the tags to be preserved by ReadKey instead of being stripped off.

Thanks for the confirmation! In such a case this issue has been fixed with #73257 and the missing [ chars should be back now.

daxian-dbw · 2022-08-10T18:27:30Z

@adamsitnik That's great! Thanks!

bracketed paste support

455378d

adamsitnik added the area-System.Console label Aug 2, 2022

adamsitnik added this to the 7.0.0 milestone Aug 2, 2022

adamsitnik requested a review from jozkee August 2, 2022 18:42

ghost assigned adamsitnik Aug 2, 2022

am11 reviewed Aug 2, 2022

View reviewed changes

add more test cases, make test failure more readable

da4d351

am11 reviewed Aug 3, 2022

View reviewed changes

adamsitnik mentioned this pull request Aug 9, 2022

[API Proposal]: Allow Console.ReadKey to parse VT sequences when ENABLE_VIRTUAL_TERMINAL_INPUT is set on Windows #60107

Open

DHowett reviewed Aug 9, 2022

View reviewed changes

jozkee reviewed Aug 9, 2022

View reviewed changes

adamsitnik closed this Aug 10, 2022

daxian-dbw mentioned this pull request Aug 10, 2022

Console.ReadKey returns incomplete VT sequences for bracketed paste #60101

Closed

ghost locked as resolved and limited conversation to collaborators Sep 9, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bracketed paste support #73257

Bracketed paste support #73257

adamsitnik commented Aug 2, 2022

ghost commented Aug 2, 2022

am11 Aug 2, 2022 •

edited

Loading

adamsitnik Aug 3, 2022

adamsitnik Aug 3, 2022

am11 Aug 3, 2022

adamsitnik Aug 3, 2022

am11 Aug 3, 2022

am11 Aug 3, 2022

DHowett Aug 9, 2022

adamsitnik Aug 9, 2022

DHowett Aug 9, 2022

adamsitnik Aug 9, 2022

jozkee Aug 9, 2022

jozkee left a comment

jozkee Aug 9, 2022

jozkee Aug 9, 2022

jozkee Aug 9, 2022

jozkee Aug 9, 2022

jozkee Aug 9, 2022

jozkee Aug 9, 2022

adamsitnik commented Aug 10, 2022

daxian-dbw commented Aug 10, 2022

adamsitnik commented Aug 10, 2022

daxian-dbw commented Aug 10, 2022

	[InlineData("\u001B[200~[200~nested[201~\u001B[201~", "[200~nested[201~")]
	[InlineData("\u001B[200~\u001B[200~nested\u001B[201~ tag\u001B[201~ ", "nested tag ")]

	// the test case above are based on manual testing and observations
	// the test cases below were not reproduced during manual testing, but hypothetically are possible
	[InlineData("\u001B[201~", "")] // just end tag
	[InlineData("\u001B[201~\u001B[200~test123", "test123")] // end tag (from previous pasted content) and start tag from next one

	// The Terminal is configured to return from read() as soon as at least byte is available for reading.
	// The Terminal is configured to return from read() as soon as at least one byte is available for reading.

	ReadOnlySpan<byte> receivedBytes = new (bufPtr, result);
	ReadOnlySpan<byte> receivedBytes = new(bufPtr, result);

	if (IsTag(receivedBytes.Slice(0, TagLength)))
	if (receivedBytes.Slice(0, TagLength).SequenceEqual(s_startBracketedPaste))

	else if (IsTag(receivedBytes.Slice(receivedBytes.Length - TagLength)))
	else if (receivedBytes.Slice(receivedBytes.Length - TagLength).SequenceEqual(s_endBracketedPaste))

		static bool IsTag(ReadOnlySpan<byte> b)
		=> b[0] == '\u001B' && b[1] == '[' && b[2] == '2' && b[3] == '0' && (b[4] == '0' \|\| b[4] == '1') && b[5] == '~';

Bracketed paste support #73257

Bracketed paste support #73257

Conversation

adamsitnik commented Aug 2, 2022

ghost commented Aug 2, 2022

am11 Aug 2, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jozkee left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

adamsitnik commented Aug 10, 2022

daxian-dbw commented Aug 10, 2022

adamsitnik commented Aug 10, 2022

daxian-dbw commented Aug 10, 2022

am11 Aug 2, 2022 •

edited

Loading