Skip to content

Argument unpacking (spread operator) doesn't normalise keys from an iterator #18581

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
IMSoP opened this issue May 17, 2025 · 1 comment
Open

Comments

@IMSoP
Copy link
Contributor

IMSoP commented May 17, 2025

Description

Since PHP 8.0, the unpack/spread operator ...$iterable can be used with either int or string keys, with integers mapped to positional parameters (regardless of value) and strings mapped to named parameters.

With an iterator, it's possible to return a numeric string key which could not exist in a normal array. In the simple case, this leads to an error such as "Unknown named parameter $0" https://3v4l.org/IQBdd

When combined with a variadic parameter, the behaviour becomes rather more peculiar: the key is collected into the variadic parameter, even if there are other positional parameters left to fill; and the key is left as a string, even though any other context would force it to an int.

I can think of two sensible behaviours here:

  1. Normalise numeric string keys coming back from the iterator during the spread operation, and assign them to positional, rather than named, parameters. This would mean that yield '0' => 'foo'; gives the same behaviour as yield 0 => 'foo';, and that spreading into arguments behaved the same ways as spreading into an array.
  2. Forbid any string key which would not be valid as a parameter name, including numeric strings. Right now, it's possible to include any character you like in the key, with potentially dangerous side-effects (see below).

Here's an example: https://3v4l.org/jOddJ

function a() {
    return [
        100 => 'first',
        101 => 'second',
        102 => 'third',
        'named' => 'fourth',
    ];
}
function b() {
    yield 100 => 'first';
    yield 101 => 'second';
    yield 102 => 'third';
    yield 'named' => 'fourth';
}
function c() {
    yield '100' => 'first';
    yield '101' => 'second';
    yield '102' => 'third';
    yield 'named' => 'fourth';
}
function test($x=null, $y=null, ...$z) {
    var_dump($x, $y, $z);
}

test(...a());
echo "\n";
test(...b());
echo "\n";
test(...c());

Casess a() and b() populate the positional parameters $x and $y with "first" and "second", and "third" is treated as the first variadic argument, with key 0. When given an iterator with string keys, all four items are collected, and the keys are not normalised, resulting in this broken array:

array(4) {
  ["100"]=>
  string(5) "first"
  ["101"]=>
  string(6) "second"
  ["102"]=>
  string(5) "third"
  ["named"]=>
  string(6) "fourth"
}

It is even possible to end up with both 0 and "0" as keys in the same array: https://3v4l.org/qNMep

function a() {
    yield 0 => 'first';
    yield 0 => 'second';
}
function b() {
    yield 0 => 'first';
    yield '0' => 'second';
}
function c() {
    yield '0' => 'first';
    yield '0' => 'second';
}
function test(...$args) {
    var_dump($args);
}

test(...a());
echo "\n";
test(...b());
echo "\n";
test(...c());

Results in:

array(2) {
  [0]=>
  string(5) "first"
  [1]=>
  string(6) "second"
}

array(2) {
  [0]=>
  string(5) "first"
  ["0"]=>
  string(6) "second"
}


Fatal error: Uncaught Error: Named parameter $0 overwrites previous argument in /in/qNMep:23
Stack trace:
#0 {main}
  thrown in /in/qNMep on line 23

While writing this up, I realised that as well as numeric keys from iterators, even array spreading can be abused in potentially dangerous ways.

Newlines: https://3v4l.org/QfRmi

function test() {}
test(...["name\nwith\nnewlines\nembedded" => 'horrible']);
Fatal error: Uncaught Error: Unknown named parameter $name
with
newlines
embedded in /in/QfRmi:4
Stack trace:
#0 {main}
  thrown in /in/QfRmi on line 4

Null bytes: https://3v4l.org/XLC8M

function test() {}
test(...["string truncated\0 at the first null byte" => 'evil']);
Fatal error: Uncaught Error: Unknown named parameter $string truncated in /in/XLC8M:4
Stack trace:
#0 {main}
  thrown in /in/XLC8M on line 4

PHP Version

Reproducible on 3v4l for all versions from 8.0.0 to 8.4.7 inclusive

Operating System

No response

@IMSoP
Copy link
Contributor Author

IMSoP commented May 18, 2025

Implementation-wise, it looks like the checking could be done at the same time as checking for keys of invalid type here: https://heap.space/xref/php-src/Zend/zend_vm_execute.h?r=3f03f7ed3d988567b5a59ae542579fd91cdfde42#2512

An additional check could be made using php_valid_var_name: https://heap.space/xref/php-src/ext/standard/array.c#php_valid_var_name (This is the function used by extract, which in its default mode silently discards anything with an invalid variable name; as far as I know, the rules for parameter names are the same as those for variable names.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants