bpo-45214: add the LOAD_NONE opcode #28376
Conversation
(force-pushed from e672cbd to d31b1fb)
gvanrossum left a comment:
That’s an impeccable PR. Do you have hard perf numbers for PyPerformance? Pablo may be able to help you (he has a benchmark machine set up at work).
OOI, why did this end up as the chosen design, as opposed to […]?
I don't think it's been definitively decided yet (this is why I did the refactor PRs first, so we can more easily experiment). The thinking is that None is so common that it may be faster for it to have a dedicated opcode (and another STORE_FAST_NONE). There will probably be another LOAD_COMMON_CONST opcode for True, False and a few others (zeros, 0.5, 2.0, …). Then there is the question of whether to fold MAKE_INT into that, as I did in an earlier PR, or have a separate MAKE_INT opcode which takes items from the small-int array (I think you suggested splitting out MAKE_INT before your vacation, or maybe it was Guido).
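For context, on the main branch loading None goes through the function's constants table like any other literal; a minimal dis sketch of what a dedicated opcode would replace (output shape is approximate for a 3.11-era build):

```python
import dis

def f():
    x = None
    return x

# On main, None is loaded through the constants table like any other
# literal; this PR would emit a dedicated LOAD_NONE instead, skipping
# the co_consts lookup for the most common constant.
dis.dis(f)
# Expected shape (roughly):
#   LOAD_CONST   0 (None)
#   STORE_FAST   0 (x)
#   LOAD_FAST    0 (x)
#   RETURN_VALUE
```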
👍 As an aside, do we know how common the use of […] is?
To count that, one could do this experiment: take this PR, add a dedicated RETURN_NONE opcode, and count how many LOAD_NONE opcodes are generated. Easy to add to Tools/scripts/count_opcodes.py.
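For anyone experimenting outside the CPython tree, a rough sketch of that counting experiment (Tools/scripts/count_opcodes.py does it properly; this version only looks at top-level code objects, and the Lib/ path assumes you run it from a CPython checkout):

```python
import dis
import glob
from collections import Counter

counts = Counter()
for path in glob.glob("Lib/**/*.py", recursive=True):
    try:
        with open(path, encoding="utf-8") as f:
            code = compile(f.read(), path, "exec")
    except (SyntaxError, UnicodeDecodeError):
        continue
    # Top-level code object only, for brevity; a full count would
    # recurse into the nested code objects stored in co_consts.
    counts.update(instr.opname for instr in dis.get_instructions(code))

print(counts.most_common(10))
```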
I disabled a bunch of background stuff and meditated while the benchmarks ran; hopefully these results are real:
We can probably get 1% from quite a few different ideas. Ten different things each giving us 1% is 10%, so I am excited about this improvement!
I looked at the dis output for the worst benchmark, pidigits (that result seems repeatable). I don't see any difference except what we expect (I thought maybe some optimisation was missed). Maybe the PREDICTs I put in ceval are wrong for generators, not sure. Will continue looking.
It's probably not repeatable; there's some noise, and how I launch the benchmark seems to make a difference (3 separate shell commands are noisy; one shell command that begins with "repeat 3" is more stable). I just repeated the whole thing; the overall picture is […]
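For what it's worth, pyperf spawns multiple worker processes and reports the spread, which helps with exactly this kind of run-to-run noise. A minimal micro-benchmark sketch (the benchmark name is made up):

```python
# Save as bench_none.py and run with: python bench_none.py
import pyperf

runner = pyperf.Runner()
# pyperf runs this statement in several worker processes and reports
# the mean and standard deviation across them.
runner.timeit("assign_none", stmt="x = None")
```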
Don't worry too much about pidigits. It mostly exercises very long integer arithmetic, so the difference is most likely due to some other random CPU cache effect.
@pablogsal Is your benchmark machine less noisy than what I posted above?
I will try to benchmark this tomorrow on my benchmark machine. If it turns out to be less noisy, I can try to give you access so you can benchmark there, if you find it useful.
I tried to benchmark, but I keep getting errors related to the freeze module. I tried to apply the diff cleanly on main and regenerate it, but I keep getting them :(
@ericsnowcurrently do you know why this may be happening? |
Oh, actually, since the 15th of September all benchmarks on the benchmark server have been failing with the same error :( https://speed.python.org/
I'm pretty sure Victor fixed this earlier.
Here are the results of the benchmark:
@pablogsal Are your results consistent, as in: if you compare two runs of the same Python version, do you see no diff? I don't see a reason why LOAD_NONE would make unpickle 3% slower. Anyway, the picture emerging is that while timeit clearly shows "x = None" being about 8% faster with LOAD_NONE, overall this doesn't seem to make a measurable difference on the benchmarks. I think something like a RETURN_NONE opcode, which replaces a pair of opcodes, might do better. I will try that (but not sure how much time I will have this week - got some distractions).
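The pair in question shows up in any function with an implicit return; a quick dis sketch (output is approximate for the interpreter this PR targets):

```python
import dis

def f():
    pass

# An implicit return compiles to a LOAD_CONST None / RETURN_VALUE
# pair; a RETURN_NONE opcode would fuse the two into one instruction.
dis.dis(f)
# Expected shape (roughly):
#   LOAD_CONST    0 (None)
#   RETURN_VALUE
```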
I ran it again and it is quite consistent. Here is the comparison of […]
I wouldn't give up on this completely yet, but it would be nice if we could measure e.g. the size decrease in pyc files. @pablogsal There is some weird shit in the table, e.g. […] Is that just something that was copy-pasted wrong? I could imagine that what you actually ran was […] and some tool expanded the bpo-45214 reference in the middle of the filename to Markdown, and that got picked up when you copy-pasted it. I also noticed that you specified […]
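Measuring the pyc size change is cheap. A minimal sketch (the Lib path is an assumption; run it once per build and compare the totals):

```python
import compileall
import os
import sys

def pyc_bytes(tree):
    """Total size in bytes of all .pyc files under tree."""
    total = 0
    for root, _, files in os.walk(tree):
        for name in files:
            if name.endswith(".pyc"):
                total += os.path.getsize(os.path.join(root, name))
    return total

# Run from each build's own checkout (or clear __pycache__ between
# runs, since two builds of the same dev version share a cache tag),
# then compare the printed totals.
compileall.compile_dir("Lib", quiet=1)
print(sys.version, pyc_bytes("Lib"))
```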
Yeah, I have no idea where that is coming from. It may be GitHub recognizing the bpo prefix and doing some markdown shenanigans.
Yeah, I normally also use […]
Here is the full table:
Confirmed, it is GitHub. Try making a comment with something like "bpo-45214": it will automatically add the link even if you write it as plain text.
All the analysis here is out of date now, closing.
https://bugs.python.org/issue45214