Skip to content

Conversation

@hikari-no-yume
Copy link
Contributor

@laruence laruence added the RFC label Jan 21, 2016
{ $$ = zend_ast_create(ZEND_AST_ARRAY_ELEM, $3, $1); }
| expr T_DOUBLE_ARROW T_LIST '(' assignment_list ')'
{ $$ = zend_ast_create(ZEND_AST_ARRAY_ELEM, $5, $1); }
;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Combining the implementations and throwing a clear compile-time error would be preferable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be necessary if we ever replace list() with [], but the way I've done it thus far makes things neater.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, doing this in the compiler means the possibility of a nicer error message.

@jwatzman
Copy link

jwatzman commented Feb 6, 2016

You might want to consider some test cases for some of the more perverse behavior you can get into with list() -- see, e.g.,

which I fought with for like a week getting HHVM to match PHP7's semantics (and it still doesn't quite get it right in one corner case).

Both PHP5 and PHP7 have differently perversely broken semantics, and it would be sad if this made things worse :) In particular, your RFC claims you can desugar list($a, $b) = $c into $a = $c[0]; $b = $c[1]; which is emphatically not true; I strongly suspect you know this, but some of the ways it's not true are very interesting edge case tests for your new feature.

@hikari-no-yume
Copy link
Contributor Author

Most of the edge-cases with keyed list() are the same (assignment is left-to-right, same opcode). However, you can probably have fun times if you abuse the ability to use arbitrary expressions as keys.

In particular, your RFC claims you can desugar list($a, $b) = $c into $a = $c[0]; $b = $c[1]; which is emphatically not true; I strongly suspect you know this, but some of the ways it's not true are very interesting edge case tests for your new feature.

Hmm, what's the big difference? They compile to different opcodes (ZEND_FETCH_LIST + ZEND_ASSIGN vs. ZEND_FETCH_DIM_R + ZEND_ASSIGN), but do they have significantly different behaviour?

@jesseschalken
Copy link

Here is one difference:

$a = 'hello';

$b = $a[0];
var_dump($b); // 'h'

list($b) = $a;
var_dump($b); // null

I'm not sure why that is. 😕

@hikari-no-yume
Copy link
Contributor Author

Oh, sure, it doesn't work for strings. That's actually specified behaviour now: https://wiki.php.net/rfc/fix_list_behavior_inconsistency

@jesseschalken
Copy link

Cool. From the tests @jwatzman linked to the only other different behaviour I could see is that list($a) = $b evaluates sub-expressions in $b before sub-expressions in $a, while $a = $b[0] does the opposite.

$h = [['a'], ['b'], ['c']];
$i = 0;
list($j[$i++]) = $h[$i++];
var_dump($j);
array(1) {
  [1] =>
  string(1) "a"
}
$h = [['a'], ['b'], ['c']];
$i = 0;
$j[$i++] = $h[$i++][0];
var_dump($j);
array(1) {
  [0] =>
  string(1) "b"
}

While that is strange, I assume that behaviour would just be carried forward for list() with keys.

@hikari-no-yume
Copy link
Contributor Author

The behaviour isn't that strange when you consider that list() makes multiple assignments. The thing it assigns from is evaluated before the things it assigns to.

And yes, I think this behaviour is preserved with keys.

@jesseschalken
Copy link

It's still surprising that ... = ... and list(...) = [...] aren't equivalent, but provided list() with keys is consistent with existing list() behaviour (as you said) it's not necessarily relevant to this RFC.

@nikic
Copy link
Member

nikic commented Feb 6, 2016

@jwatzman To make sure we're on the same page, the problematic case you have in mind (notwithstanding a string as the RHS, which is just depressing), is that list(var1, var2) = expr; doesn't desugar into the expected

$tmp = expr;
var1 = $tmp[0];
var2 = $tmp[1];

but rather into

$tmp = expr;
$_tmp = $tmp[0];
var1 = $_tmp;
$_tmp = $tmp[1];
var2 = $_tmp;

If that's the problem you have in mind, I think we might be open to fix this. Unlike the fact that the primary expression is executed first, the fact that the individual offset accesses are executed first as well is not strictly intentional.

@nikic
Copy link
Member

nikic commented Feb 6, 2016

Though honestly, I have no clue what the "correct" behavior for your complex test case is. I can see how evaluating the side effects of the LHS before the side effects of the RHS might make sense as well there.

@hikari-no-yume
Copy link
Contributor Author

<?php

$c = [1, 2];
$a = $c[0];
$b = $c[1];

$c = [1, 2];
list($a, $b) = $c;

[Welcome to phpdbg, the interactive PHP debugger, v0.5.0]
To get help using phpdbg type "help" and press enter
[Please report bugs to <http://bugs.php.net/report.php>]
[Successful compilation of /Users/ajf/Projects/2014/PHP/php-src/foobarqux.php]
prompt> print exec 
[Context /Users/ajf/Projects/2014/PHP/php-src/foobarqux.php (11 ops)]
L1-8 {main}() /Users/ajf/Projects/2014/PHP/php-src/foobarqux.php - 0xa74000 + 11 ops
 L3    #0     ASSIGN                  $c                   array(2)                                 
 L4    #1     FETCH_DIM_R             $c                   0                    @1                  
 L4    #2     ASSIGN                  $a                   @1                                       
 L5    #3     FETCH_DIM_R             $c                   1                    @3                  
 L5    #4     ASSIGN                  $b                   @3                                       
 L7    #5     ASSIGN                  $c                   array(2)                                 
 L8    #6     FETCH_LIST              $c                   0                    @6                  
 L8    #7     ASSIGN                  $a                   @6                                       
 L8    #8     FETCH_LIST              $c                   1                    @8                  
 L8    #9     ASSIGN                  $b                   @8                                       
 L8    #10    RETURN                  1                                                             

Am I missing something?

PHP 7.1.0-dev (cli) (built: Feb  5 2016 11:18:55) ( NTS DEBUG )
Copyright (c) 1997-2016 The PHP Group
Zend Engine v3.1.0-dev, Copyright (c) 1998-2016 Zend Technologies

@jesseschalken
Copy link

The difference is when the target of the assignment contains a subexpression:

<?php

$c = [1, 2];
$a[f] = $c[0];
$b[f] = $c[1];

$c = [1, 2];
list($a[f], $b[f]) = $c;
 L3    #0     EXT_STMT
 L3    #1     ASSIGN                  $c                   array(2)
 L4    #2     EXT_STMT
 L4    #3     FETCH_CONSTANT                               "f"                  ~1
 L4    #4     FETCH_DIM_R             $c                   0                    @3
 L4    #5     ASSIGN_DIM              $a                   ~1
 L4    #6     OP_DATA                 @3
 L5    #7     EXT_STMT
 L5    #8     FETCH_CONSTANT                               "f"                  ~4
 L5    #9     FETCH_DIM_R             $c                   1                    @6
 L5    #10    ASSIGN_DIM              $b                   ~4
 L5    #11    OP_DATA                 @6
 L7    #12    EXT_STMT
 L7    #13    ASSIGN                  $c                   array(2)
 L8    #14    EXT_STMT
 L8    #15    FETCH_LIST              $c                   0                    @8
 L8    #16    FETCH_CONSTANT                               "f"                  ~9
 L8    #17    ASSIGN_DIM              $a                   ~9
 L8    #18    OP_DATA                 @8
 L8    #19    FETCH_LIST              $c                   1                    @11
 L8    #20    FETCH_CONSTANT                               "f"                  ~12
 L8    #21    ASSIGN_DIM              $b                   ~12
 L8    #22    OP_DATA                 @11
 L8    #23    RETURN                  1

The FETCH_CONSTANTs happen before the FETCH_DIM_Rs but after the FETCH_LISTs. So not only does list($a) = $b evaluate all of $b before any of $a (unlike $a = $b[0], which will evaluate sub-expressions of $a first), but it also evaluates $b[0] before any of $a, as well, which you need a temporary variable to simulate.

I suspect this would all be cleared up if plain assignment evaluated all of the RHS before any of the LHS, but the relevant question is whether or not this PR makes the situation any worse.

@jwatzman
Copy link

jwatzman commented Feb 7, 2016

(notwithstanding a string as the RHS, which is just depressing)

That's the worst one that really makes it not work, but there are lots of other subtitles, yeah.

the fact that the individual offset accesses are executed first as well is not strictly intentional.

I don't particularly care what the "right" behavior is, just that there isn't an accidental mismatch between the existing list() and the new behavior in this PR :) PHP is full of unintentional stuff like that, which sometimes accidentally changes from release to release, and so test cases trying to codify it are a good idea IMO.

Though honestly, I have no clue what the "correct" behavior for your complex test case is.

Haha, don't ask me, I would have sworn I lifted the test case from your UVS PR :) I guess not.

but the relevant question is whether or not this PR makes the situation any worse

Exactly.


For particular test cases, I'd probably start by taking the two HHVM tests I had above, and replacing all the occurrences of list($a, $b) = ... with list(0 => $a, 1 => $b) = ... and making sure the output is the same, since AIUI those two are intended to be the same.

I'd also decide if list(0 => $a, 1 => $b) = $c and list(1 => $b, 0 => $a) = $c are supposed to be the same or different, and write a test codifying that. (The difference is observable in a few ways, I know my test cases look for it; I think $arr = []; list(0 => $arr[], 1 => $arr[]) = [1, 2]; is the simplest case, but you can see a bunch of other fun things like my tests do.) The same for string keys instead of integers -- do the assignments happen in the order listed in the list() LHS, or the iteration order of the array on the RHS? Is this consistent if the LHS is actually a nested list()?

You should also consider what the semantics of stuff like the below is, and write tests to make sure the behavior is consistent with the existing list() where there is a clear analogue (and I think that's most places), or at least doesn't accidentally change in a minor release:

  • list($a => $a) = $b
  • list($b => $a, $a => $c) = $d -- i.e., one of your string key variables is overwritten mid-evaluation; this is the most interesting one IMO.
  • list($a => $a, $a => $a) = $a
  • All of the above when any of the variables involved are actually an ArrayAccess object, with a __toString if needed. (This allows it to not make "assignment" do what you expect, and also makes the order things happen in easily observable.)
  • (etc, you can get pretty perverse, particularly when you start nesting list())

@nikic
Copy link
Member

nikic commented Feb 7, 2016

@jwatzman The cases list($b => $a, $a => $c) = $d and list($a => $a, $a => $a) = $a look very tricky. Not entirely sure how it should behave. I'd probably go with the "obvious" behavior, i.e.

$a = $d[$b];
$c = $d[$a];
// and respectively
$_a = $a;
$a = $_a[$a];
$a = $_a[$a];

The last example follows the rule that if the RHS variable is used as an LHS target we force it to be evaluated first.

Another case to consider is something like

$i = 0;
$a = ...;
list($a[$i++] => $a[$i++], $a[$i++] => $a[$i++]) = $a[$i++];

Following the rule from #1730 (comment) this would be equivalent to

$a = ...;
list($a[1] => $a[2], $a[3] => $a[4])  = $a[0];

which I guess is reasonable (apart from RHS evaluation first maybe, but that's existing behavior).

@hikari-no-yume
Copy link
Contributor Author

I haven't written this up for the language specification yet. So I can specify the behaviour there and write tests for it.

@hikari-no-yume
Copy link
Contributor Author

I've spent the past two weeks procrastinating about writing tests for this. I'm not sure I know where to start.

@jesseschalken
Copy link

All of the above when any of the variables involved are actually an ArrayAccess object, with a __toString if needed. (This allows it to not make "assignment" do what you expect, and also makes the order things happen in easily observable.)

This may be a useful test using ArrayAccess to capture the evaluation/assignment order directly:

<?php

class Foo extends ArrayObject {
    function offsetGet($k) {
        print "get: $k\n";
        return parent::offsetGet($k);
    }
    function offsetSet($k, $v) {
        print "set: $k => $v\n";
        parent::offsetSet($k, $v);
    }
    function offsetExists($k) {
        print "exists?: $k\n";
        return parent::offsetExists($k);
    }
    function offsetUnset($k) {
        print "unset: $k\n";
        parent::offsetUnset($k);
    }
}

function trace($v) {
    print "val: $v\n";
    return $v;
}

$b = new Foo();

list(
    trace('i') => $b[trace(7)],
    trace('f') => list(
        trace('c') => $b[],
        trace('a') => $b[trace(6)]
    ),
    trace('g') => $b[]
) = new Foo([
    'f' => new Foo([
        'a' => 'b',
        'c' => 'd',
    ]),
    'g' => 'h',
    'i' => 'j',
]);

Even if it doesn't capture everything, it's probably a worthwhile test to pin down the result of anyway.

@jesseschalken
Copy link

@TazeTSchnitzel Here are some more tests based on @jwatzman's comments.

<?php

// All the following should print 'a' then 'b'

list($a, $b) = ['a', 'b'];
var_dump($a);
var_dump($b);

list(0 => $a, 1 => $b) = ['a', 'b'];
var_dump($a);
var_dump($b);

list(1 => $b, 0 => $a) = ['a', 'b'];
var_dump($a);
var_dump($b);

$arr = [];
list($arr[], $arr[]) = ['a', 'b'];
var_dump($arr[0]);
var_dump($arr[1]);

$arr = [];
list(0 => $arr[], 1 => $arr[]) = ['a', 'b'];
var_dump($arr[0]);
var_dump($arr[1]);

$arr = [];
list(1 => $arr[], 0 => $arr[]) = ['b', 'a'];
var_dump($arr[0]);
var_dump($arr[1]);

$arr = [];
list(list(1 => $arr[], 0 => $arr[])) = [['b', 'a']];
var_dump($arr[0]);
var_dump($arr[1]);

$arr = [];
list('key1' => $arr[], 'key2' => $arr[]) = ['key2' => 'b', 'key1' => 'a'];
var_dump($arr[0]);
var_dump($arr[1]);

// This should print 'foo'
$a = 0;
list($a => $a) = ['foo', 'bar'];
var_dump($a);

// This should print 'bar' then 'foo'
$a = 0;
$b = 1;
list($b => $a, $a => $c) = ['foo', 'bar'];
var_dump($a);
var_dump($c);

@jesseschalken
Copy link

And @nikic's last comment, as a test:

<?php

$i = 0;
$a = [
    0 => [
        'b' => 'bar',
        'a' => 'foo',
    ],
    1 => 'a',
    3 => 'b',
];
list($a[$i++] => $a[$i++], $a[$i++] => $a[$i++]) = $a[$i++];
var_dump($i); // should be 5
var_dump($a[2]); // should be 'foo'
var_dump($a[4]); // should be 'bar'

@hikari-no-yume
Copy link
Contributor Author

Okay, hikari-no-yume@ab532db adds a test which observes the evaluation order using some objects which produce log messages. It's not everything, but it's a start.

@hikari-no-yume
Copy link
Contributor Author

And hikari-no-yume@61def0c adds a second test which looks at nesting.

I like these tests because they can directly check the evaluation order. But they probably aren't enough given PHP or HHVM could do completely different things for non-ArrayAccess.

@hikari-no-yume
Copy link
Contributor Author

I've made a language specification patch as well: php/php-langspec#152

@jesseschalken
Copy link

I like these tests because they can directly check the evaluation order. But they probably aren't enough given PHP or HHVM could do completely different things for non-ArrayAccess.

Yeah, testing a mix of ways to observe assignment/evaluation order would be good for this reason.

@hikari-no-yume
Copy link
Contributor Author

Merged: 37c8bb5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants