`stdin.read` (and `stdin.readSync`) corrupt non-ASCII input on Windows #18240

lionel-rowe · 2023-03-17T02:31:19Z

stdin.read (and stdin.readSync) corrupt non-ASCII input on Windows.

To reproduce:

const c = new Uint8Array(6);
Deno.stdin.read(c).then(() => console.log(c));

Then, enter a non-ASCII character. The resulting bytes will be corrupted on Windows.

Examples, with trailing LF/CRLF/null bytes truncated:

| Input | Expected | Actual | Actual decoded as UTF-8 |
| --- | --- | --- | --- |
| ÿ | `[195, 191]` | `[152]` | "�" (invalid char) |
| Ā | `[196, 128]` | `[65]` | "A" |
| ā | `[196, 129]` | `[97]` | "a" |
| 啊 | `[229, 149, 138]` | `[63]` | "?" |
| 🦄 | `[240, 159, 166, 132]` | `[63, 63]` | "??" |

Expected results are the UTF-8 bytes. Results on Linux are as expected.
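The Expected and decoded columns can be checked with the standard `TextEncoder` and `TextDecoder`. This is a minimal TypeScript sketch: the Actual bytes are copied from the table rather than read from a Windows console, and the variable names are only illustrative.

```typescript
// UTF-8 is the only encoding TextEncoder produces; TextDecoder defaults to
// UTF-8 and replaces invalid sequences with U+FFFD instead of throwing.
const encoder = new TextEncoder();
const decoder = new TextDecoder();

const inputs: string[] = ["ÿ", "Ā", "ā", "啊", "🦄"];

// Expected column: the UTF-8 bytes of each input.
const expectedBytes: number[][] = inputs.map((s) =>
  Array.from(encoder.encode(s))
);

// Actual column as reported in the issue (not produced by this code).
const actualBytes: number[][] = [[152], [65], [97], [63], [63, 63]];

// Last column: the reported bytes interpreted as UTF-8.
const decodedActual: string[] = actualBytes.map((bytes) =>
  decoder.decode(new Uint8Array(bytes))
);

inputs.forEach((input, i) => {
  console.log(input, expectedBytes[i], actualBytes[i], JSON.stringify(decodedActual[i]));
});
```

The byte 152 (0x98) is a lone UTF-8 continuation byte, which is why it decodes to the replacement character rather than to any letter.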


njhanley · 2023-04-03T00:22:17Z

This stems from Deno reading directly from console input as a file, which uses the console's current code page rather than Unicode (see High-Level Console I/O). In the example, 'ÿ' maps to 152 in code page 437 (OEM-US).
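To make the code page effect concrete, here is a toy TypeScript model of a lossy conversion to code page 437. It is only a sketch: the table is a small hand-picked subset of CP437, the function name `toCodePage` is hypothetical, and this is not the conversion routine Windows runs.

```typescript
// Illustrative subset of code page 437; the real code page has 256 entries.
const cp437 = new Map<string, number>([
  ["Ä", 0x8e],
  ["Ö", 0x99],
  ["Ü", 0x9a],
  ["ÿ", 0x98], // 0x98 === 152, the byte reported in the issue
]);

// Toy model: ASCII passes through, characters present in the table become
// their code page byte, and every other UTF-16 code unit becomes "?" (63).
function toCodePage(text: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit < 0x80) {
      out.push(unit);
    } else {
      out.push(cp437.get(text[i]) ?? 0x3f);
    }
  }
  return out;
}

console.log(toCodePage("ÿ"));  // [152]
console.log(toCodePage("啊")); // [63]
console.log(toCodePage("🦄")); // [63, 63], one "?" per UTF-16 code unit
```

Under this model the two question marks for 🦄 come from its two UTF-16 code units. The reported `Ā` → `A` result suggests that Windows also applies best-fit substitutions, which the sketch does not model (it would yield `[63]` for `Ā`).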

Instead, ReadConsoleW should be used, as in Rust's std::io::Stdin.
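Because ReadConsoleW returns UTF-16 code units, a fix along these lines also needs a UTF-16 to UTF-8 step, and a read can end between the two halves of a surrogate pair. The following TypeScript sketch shows one way to handle that. It is for illustration only: the class name `Utf16ToUtf8` and its buffering strategy are assumptions, not Deno's or Rust's actual code.

```typescript
// Hypothetical helper: converts chunks of UTF-16 code units (what a
// wide-character console read yields) into UTF-8 bytes.
class Utf16ToUtf8 {
  // High surrogate held back from the previous chunk, if any.
  private pendingHigh: number | null = null;
  private encoder = new TextEncoder();

  convert(units: Uint16Array): Uint8Array {
    const all: number[] = [];
    if (this.pendingHigh !== null) {
      all.push(this.pendingHigh);
      this.pendingHigh = null;
    }
    all.push(...Array.from(units));

    // If the chunk ends on a high surrogate (0xD800-0xDBFF), keep it for the
    // next call so the pair is encoded as one 4-byte UTF-8 sequence.
    const last = all[all.length - 1];
    if (last !== undefined && last >= 0xd800 && last <= 0xdbff) {
      this.pendingHigh = last;
      all.pop();
    }

    // Fine for small chunks; very large chunks would need batching.
    return this.encoder.encode(String.fromCharCode(...all));
  }
}

// "ÿ" followed by "🦄" (U+1F984 = 0xD83E 0xDD84), split mid-pair.
const converter = new Utf16ToUtf8();
const firstChunk = converter.convert(new Uint16Array([0x00ff, 0xd83e]));
const secondChunk = converter.convert(new Uint16Array([0xdd84]));
console.log(Array.from(firstChunk));  // [195, 191]
console.log(Array.from(secondChunk)); // [240, 159, 166, 132]
```

Holding back the unpaired high surrogate avoids emitting a replacement character when the pair happens to straddle two reads.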

@dsherret Is there a reason Deno doesn't use Rust/Tokio's stdio implementation? If not, would a PR be welcome?

dsherret · 2023-04-03T15:58:04Z

Yes, a PR would be welcome. I believe it should use std::io::stdin here when the resource kind is StdFileResourceKind::Stdin:

deno/ext/io/lib.rs

Lines 398 to 400 in 3cd7abf

    
    StdFileResourceKind::File | StdFileResourceKind::Stdin => {
      self.file.read(buf)
    }

Similar to how it does this for write:

deno/ext/io/lib.rs

Lines 377 to 383 in 3cd7abf

    
    StdFileResourceKind::Stdout => {
      // bypass the file and use std::io::stdout()
      let mut stdout = std::io::stdout().lock();
      stdout.write_all(buf)?;
      stdout.flush()?;
      Ok(())
    }

Mqxx · 2024-07-24T13:01:11Z

Hey, I just stumbled across this issue and have the same problem. Non-ASCII characters like ÄÖÜ are corrupted. Is there any update on when this will be fixed?

Thanks

dsherret added the bug and windows labels · Apr 3, 2023

njhanley linked a pull request that will close this issue · Apr 14, 2023: fix(cli): read stdin as utf-8 on windows #18699 (Open)
