
Conversation

@Fidget-Spinner (Member) commented Jan 5, 2021

This patch isolates the static caches in _json.c, and may also help decrease refleaks at finalization.

There are no measured performance slowdowns in json_loads and json_dumps in pyperformance with this patch. There is a measurable slowdown of around 8% in error raising, but that isn't usually very hot code.

https://bugs.python.org/issue42834

@vstinner (Member) left a comment


Did you consider using the _Py_IDENTIFIER() API? It is compatible with subinterpreters and it provides interned strings. _PyUnicode_FromId() can fail with a memory allocation error (at the first call, when the string is created). There are "Id" variants of many functions.
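
For illustration, a minimal sketch of how that API is typically used (not the exact code for this PR; decoder_module is a placeholder, and the error handling assumes a function returning PyObject *):

/* Declares a static identifier; the string is interned once, lazily. */
_Py_IDENTIFIER(JSONDecodeError);

/* _PyUnicode_FromId() returns a borrowed reference to the interned
   string; it can fail (returning NULL) only on the first call, when
   the string object is created. */
PyObject *name = _PyUnicode_FromId(&PyId_JSONDecodeError);
if (name == NULL) {
    return NULL;
}

/* Many functions have "Id" variants, e.g. _PyObject_GetAttrId()
   instead of PyObject_GetAttrString(). */
PyObject *exc = _PyObject_GetAttrId(decoder_module, &PyId_JSONDecodeError);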

@Fidget-Spinner (Member, Author) commented Jan 6, 2021

> Did you consider using the _Py_IDENTIFIER() API? It is compatible with subinterpreters and it provides interned strings. _PyUnicode_FromId() can fail with a memory allocation error (at the first call, when the string is created). There are "Id" variants of many functions.

No, that didn't cross my mind, due to my inexperience with the internal C API. Thank you for taking the time to review this and for suggesting it! Please give me a week to address your suggestions, as I'm currently quite busy. Thanks!

@Fidget-Spinner (Member, Author) commented:

Sorry, it seems I was wrong: my personal computer isn't producing consistent pyperformance results, so I can't see any real slowdown/speedup.

@Fidget-Spinner changed the title from "bpo-42834: Convert static cache variables in _json.c to heap variables" to "bpo-42834: Make static cache variables in _json.c compatible with subinterpreters" on Jan 10, 2021
@Fidget-Spinner (Member, Author) commented:

@vstinner I think this should be ready now. Pyperformance on Windows (this isn't really accurate, but it's something at least) says:

json_dumps: Mean +- std dev: [json-main2] 17.5 ms +- 0.2 ms -> [json-heap] 17.8 ms +- 0.1 ms: 1.01x slower (+1%)
Benchmark hidden because not significant (1): json_loads

If it gets merged, I'll monitor speed.python.org for a while and see if anything crops up. Thanks for your patience and help in all of this!

@Fidget-Spinner (Member, Author) commented Jan 29, 2021

@vstinner I was finally able to test it on a Linux machine (btw, json_dumps is slightly unstable on my machine and pyperformance sometimes complains; only json_loads is stable):
on master branch:

### json_dumps ###
Mean +- std dev: 47.5 ms +- 5.6 ms

### json_loads ###
Mean +- std dev: 81.1 us +- 10.5 us

on this branch:

### json_dumps ###
Mean +- std dev: 48.2 ms +- 4.4 ms

### json_loads ###
Mean +- std dev: 80.0 us +- 8.7 us
> pyperf compare_to master2.json json-heap2.json 
Benchmark hidden because not significant (2): json_dumps, json_loads

Geometric mean: 1.00x slower

Seems like performance isn't affected :). Please take a look when you have the time for it. Thank you!

Review thread on this hunk in Modules/_json.c:

/* Use JSONDecodeError exception to raise a nice looking ValueError subclass */
static PyObject *JSONDecodeError = NULL;
PyObject *exc;
_Py_static_string(PyId_decoder, "json.decoder");
@vstinner (Member) commented:

Do we really need the PyId_decoder and PyId_JSONDecodeError micro-optimization? Decoding an ASCII string is fast in Python. Why not keep PyImport_ImportModule & PyObject_GetAttrString?

I don't care much about the performance of the error path.
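
For comparison, that plain (non-Id) error path would look roughly like this (a sketch only, assuming a void-returning error helper like _json.c's raise_errmsg; a fresh import and attribute lookup happen on every raised error):

/* Plain path: import the module and fetch the attribute each time. */
PyObject *decoder = PyImport_ImportModule("json.decoder");
if (decoder == NULL) {
    return;
}
PyObject *exc = PyObject_GetAttrString(decoder, "JSONDecodeError");
Py_DECREF(decoder);
if (exc == NULL) {
    return;
}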

@Fidget-Spinner (Member, Author) commented Jan 29, 2021

I agree that performance on the error path is usually not important. However, I think json may be a little special: many people use it to load data from sources that may provide invalid JSON, so this pattern is pretty common:

try:
    val = json.loads(...)
except json.decoder.JSONDecodeError:
    pass
else:
    ...

So raising the error is quite a common operation. Using PyId here surprisingly seems to speed up error raising by around 20%.

@vstinner (Member) commented:

> So raising the error is quite a common operation. Using PyId here surprisingly seems to speed up error raising by around 20%.

What is your benchmark showing a 20% difference?

@Fidget-Spinner (Member, Author) commented Jan 30, 2021

> So raising the error is quite a common operation. Using PyId here surprisingly seems to speed up error raising by around 20%.
>
> What is your benchmark showing a 20% difference?

It's a microbenchmark, so I don't see it as 100% representative of a real-world scenario:

pyperf timeit "import json" "try: json.loads('{dkfjdkjf')
except: pass" 
# without pyid

Mean +- std dev: 9.89 us +- 0.45 us

# with pyid

Mean +- std dev: 8.36 us +- 0.54 us

# compare

Mean +- std dev: [json-heap-no-static] 9.89 us +- 0.45 us -> [json-heap-static-string] 8.36 us +- 0.54 us: 1.18x faster

Edit: Honestly, I'm quite surprised. I did not expect the static string interning to save that much time. I expected more of the call overhead to come from the internal non-string ops of PyImport_ImportModule and PyObject_GetAttrString, rather than almost 20% of the time being spent on strings.

Edit 2: Wow, master (where the exception is cached) is only slightly faster than the PyId version with no cached exception:

# no cached exception, no PyId
Mean +- std dev: [master-clean] 7.81 us +- 0.50 us -> [json-heap-no-static] 9.89 us +- 0.45 us: 1.27x slower

# no cached exception, PyId
Mean +- std dev: [master-clean] 7.81 us +- 0.50 us -> [json-heap-static-string] 8.36 us +- 0.54 us: 1.07x slower
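
For reference, master's speed makes sense: the exception is looked up once and cached in a process-wide static, so after the first raise there is no import or attribute lookup at all. Roughly (a sketch of the old static-cache pattern this PR removes, not the verbatim code):

static PyObject *JSONDecodeError = NULL;  /* process-wide cache */

if (JSONDecodeError == NULL) {
    PyObject *decoder = PyImport_ImportModule("json.decoder");
    if (decoder == NULL) {
        return;
    }
    JSONDecodeError = PyObject_GetAttrString(decoder, "JSONDecodeError");
    Py_DECREF(decoder);
    if (JSONDecodeError == NULL) {
        return;
    }
}
/* Cached across all (sub)interpreters - exactly what makes it unsafe
   for subinterpreters and leak-prone at finalization. */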

@vstinner (Member) commented Feb 1, 2021

Can you please compare using the current master branch as a reference?

  • bench 1: reference, master
  • bench 2: your PR without static strings (expected to be slower)
  • bench 3: your PR using static strings

> pyperf timeit "import json" "try: json.loads('{dkfjdkjf')

It seems like you import json at each iteration. Use -s "import json".
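
For clarity, the corrected invocation would look like this (a sketch mirroring the command above):

python3 -m pyperf timeit -s "import json" "try: json.loads('{dkfjdkjf')
except: pass"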

@Fidget-Spinner (Member, Author) commented Feb 1, 2021

> It seems like you import json at each iteration. Use -s "import json".

Wow! I didn't know that. I assumed that the first argument was automatically treated as setup code without having to specify -s. Thanks for correcting me.

> Can you please compare using the current master branch as a reference?
>
> • bench 1: reference, master
> • bench 2: your PR without static strings (expected to be slower)
> • bench 3: your PR using static strings

I did:

  • sudo pyperf system tune
  • Ensured that I only saw a 0.1 us difference between runs before starting the benchmark.

python3 -m pyperf timeit -s "import json" "try: json.loads('{')
except: pass"

bench 1: reference, master
Mean +- std dev: 8.00 us +- 0.19 us

bench 2: PR without static strings (expected to be slower)
Mean +- std dev: 10.3 us +- 0.2 us

bench 3: PR using static strings
Mean +- std dev: 8.52 us +- 0.26 us

# compare_to bench1, bench2, bench3
python3 -m pyperf compare_to master.json  json-heap-no-pyid.json json-heap-pyid.json 
Mean +- std dev: [master] 8.00 us +- 0.19 us -> [json-heap-no-pyid] 10.3 us +- 0.2 us: 1.28x slower
Mean +- std dev: [master] 8.00 us +- 0.19 us -> [json-heap-pyid] 8.52 us +- 0.26 us: 1.07x slower

Edit: 1us -> 0.1us

@vstinner merged commit b5931f1 into python:master on Feb 1, 2021
@vstinner (Member) commented Feb 1, 2021

Merged, thanks. Well, since the code already uses static strings for the "hot code" (true/false), it's OK to "optimize" the error path (raising an exception).

@Fidget-Spinner (Member, Author) commented:

Thanks for your patience and guidance on this PR, Victor! I have waaaay more respect for all the pyperf/pyperformance people now - benchmarks are pretty hard 😆.

@Fidget-Spinner deleted the json-heap branch February 1, 2021 16:33
@vstinner (Member) commented Feb 1, 2021

> Ensured that I only saw a 0.1 us difference between runs before starting the benchmark.

I don't understand what you mean. If you don't use CPU isolation, you should run benchmarks on an idle machine (don't run other programs in parallel).

You should not pick which numbers look better to you, but accumulate more data. Use the --append option rather than -o/--output when you create JSON files. Using pyperf, you can run a benchmark many times and accumulate more and more runs; pyperf computes the mean and std dev. Then you can have fun with the stats and hist commands ;-)

https://pyperf.readthedocs.io/en/latest/analyze.html
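
For example (a sketch; bench.json is just a placeholder filename):

# Each invocation appends its runs to the same file instead of overwriting it
python3 -m pyperf timeit -s "import json" "json.loads('{}')" --append bench.json
python3 -m pyperf timeit -s "import json" "json.loads('{}')" --append bench.json

# Inspect the accumulated runs
python3 -m pyperf stats bench.json
python3 -m pyperf hist bench.json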

@Fidget-Spinner (Member, Author) commented Feb 1, 2021

> You should not pick which numbers look better to you, but accumulate more data. Use the --append option rather than -o/--output when you create JSON files. Using pyperf, you can run a benchmark many times and accumulate more and more runs; pyperf computes the mean and std dev. Then you can have fun with the stats and hist commands ;-)
>
> https://pyperf.readthedocs.io/en/latest/analyze.html

Thanks, TIL something new again, about --append.

> I don't understand what you mean. If you don't use CPU isolation, you should run benchmarks on an idle machine (don't run other programs in parallel).

Sorry, I wasn't clear here. What I meant was: I did that to make sure all the tuning was working, not to cherry-pick data. On Windows, when I ran pyperf, there was actually still quite a lot of noise between runs (>1 us, because the tuning didn't really work!). So this just meant my machine was quite stable.

On the Linux benchmarking machine I didn't have any other user programs running (well, other than the desktop environment and terminal :p).

@Fidget-Spinner (Member, Author) commented:

Point taken about accumulating data, though - I never knew such a useful feature existed. If I had known, I'd definitely have used it way more. Thanks!

@vstinner (Member) commented Feb 1, 2021

Oh, I only run benchmarks on Linux. Someone should write documentation explaining how to run reproducible benchmarks on Windows. pyperf changes the process priority if it has the required dependency and the permissions, but I don't recall the details.

@Fidget-Spinner (Member, Author) commented Feb 1, 2021

> Oh, I only run benchmarks on Linux. Someone should write documentation explaining how to run reproducible benchmarks on Windows. pyperf changes the process priority if it has the required dependency and the permissions, but I don't recall the details.

Yep. After a while I gave up and just installed Linux 😅. I don't know if there's an easy way for pyperf to disable turbo boost properly on Windows, and some of the other tuning options that affect accuracy can't easily be changed either. So I just installed Linux on a real computer and ran all the benchmarks you see there.

Edit: Just to clarify: the 1st benchmark (1 month ago) was on Windows; everything else in the past week was on Linux.

adorilson pushed a commit to adorilson/cpython that referenced this pull request on Mar 13, 2021:

Make internal caches of the _json extension module compatible with subinterpreters.
