Skip to content

Same memory leaks. Now tested under python 2.#270

Closed
Adriandorr wants to merge 8 commits intoultrajson:mainfrom
genomics-dev:master
Closed

Same memory leaks. Now tested under python 2.#270
Adriandorr wants to merge 8 commits intoultrajson:mainfrom
genomics-dev:master

Conversation

@Adriandorr
Copy link
Copy Markdown

@Adriandorr Adriandorr commented Jun 19, 2017

This contains a fixes for a number of memory leaks, mainly related to dictionary keys and values.

borman and others added 6 commits March 23, 2017 10:19
Using ujson.dumps() with objects having __json__() method leaked memory
for object's JSON representation.
PyObject_CallObject() returns a PyObject*; discarding it leaked
memory for the result of output.write().
Merged pull request to fix memory leak.
@qsniyg
Copy link
Copy Markdown

qsniyg commented Nov 2, 2017

I can confirm this works, hope it'll get pulled upstream.

if (GET_TC(tc)->itemName) {
Py_DECREF(GET_TC(tc)->itemName);
GET_TC(tc)->itemName = NULL;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidental extra indentation here.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @Adriandorr , this looks like the last remaining problem..?

tests/tests.py Outdated

def test_ujson_has_no_memory_leak(self):
import gc
import objgraph
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new dependency should be added to .travis.yml.

@daicang
Copy link
Copy Markdown

daicang commented Dec 27, 2017

Nice work. This pr fixes memory leaks when dumps() dict with string/unicode/other container-type key & value. And seems it also fixes memory leak when dump to files.

Some examples I can confirm:
dumps() with:

{string: 1} string key
{unicode: 1} unicode key
{1: [1,]} list value
{1: {1:1}} dict value
{1: string} string value
{1: unicode} unicode value

danmaas added a commit to spinpunch/ultrajson that referenced this pull request Jan 6, 2018
@danmaas
Copy link
Copy Markdown

danmaas commented Jan 6, 2018

Thank you, this is a helpful fix. Hope it can be merged soon!

@mxr
Copy link
Copy Markdown

mxr commented May 10, 2018

@Jahaja @orivej have you had a chance to take a look at this PR? Thanks!

@nav-agarwal
Copy link
Copy Markdown

nav-agarwal commented Apr 24, 2019

@Adriandorr I am not able to reproduce these memory leaks with ujson(1.35) and python(2.7.14). I have used tests added by you in this fix. Reference count before and after was same.

Here is the script that I used:

import ujson
import sys

def test_does_not_leak_dictionary_values():
    import gc
    gc.collect()
    value = ["abc"]
    data = {"1": value}
    ref_count = sys.getrefcount(value)
    ujson.dumps(data)
    print(ref_count, sys.getrefcount(value))

def test_does_not_leak_dictionary_keys():
    import gc
    gc.collect()
    key1 = "1"
    key2 = "1"
    value1 = ["abc"]
    value2 = [1,2,3]
    data = {key1: value1, key2: value2}
    ref_count1 = sys.getrefcount(key1)
    ref_count2 = sys.getrefcount(key2)
    ujson.dumps(data)
    print(ref_count1, sys.getrefcount(key1))
    print(ref_count2, sys.getrefcount(key2))

def test_does_not_leak_dictionary_string_key():
    import gc
    gc.collect()
    key1 = "1"
    value1 = 1
    data = {key1: value1}
    ref_count1 = sys.getrefcount(key1)
    ujson.dumps(data)
    print(ref_count1, sys.getrefcount(key1))

def test_does_not_leak_dictionary_tuple_key():
    import gc
    gc.collect()
    key1 = ("a",)
    value1 = 1
    data = {key1: value1}
    ref_count1 = sys.getrefcount(key1)
    ujson.dumps(data)
    print(ref_count1, sys.getrefcount(key1))

def test_does_not_leak_dictionary_bytes_key():
    import gc
    gc.collect()
    key1 = b"1"
    value1 = 1
    data = {key1: value1}
    ref_count1 = sys.getrefcount(key1)
    ujson.dumps(data)
    print(ref_count1, sys.getrefcount(key1))

def test_does_not_leak_dictionary_None_key():
    import gc
    gc.collect()
    key1 = None
    value1 = 1
    data = {key1: value1}
    ref_count1 = sys.getrefcount(key1)
    ujson.dumps(data)
    print(ref_count1, sys.getrefcount(key1))

test_does_not_leak_dictionary_values()
test_does_not_leak_dictionary_keys()
test_does_not_leak_dictionary_string_key()
test_does_not_leak_dictionary_tuple_key()
test_does_not_leak_dictionary_bytes_key()
test_does_not_leak_dictionary_None_key()

@Adriandorr
Copy link
Copy Markdown
Author

Have you tried adding the tests from the pull requests without the fixes?

@nav-agarwal
Copy link
Copy Markdown

nav-agarwal commented Apr 24, 2019 via email

@Adriandorr
Copy link
Copy Markdown
Author

Strange. It’s been a while since I used this. I think it was with python 3, but might well be wrong.

if (GET_TC(tc)->itemName) {
Py_DECREF(GET_TC(tc)->itemName);
GET_TC(tc)->itemName = NULL;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @Adriandorr , this looks like the last remaining problem..?

sadovnychyi added a commit to lawinsider/srsly that referenced this pull request Jun 3, 2019

if (!(GET_TC(tc)->itemValue = PyObject_GetItem(GET_TC(tc)->dictObj, GET_TC(tc)->itemName)))
{
if (GET_TC(tc)->itemValue) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I hadn't seen this and fixed it differently in #353.

@hugovk
Copy link
Copy Markdown
Member

hugovk commented Mar 1, 2020

I can reproduce the test failures locally on this branch, it's not always the same failure, and sometimes it passes:

$ python3.6 tests/tests.py
................................................................E...............................................................
======================================================================
ERROR: test_does_not_leak_dictionary_None_key (__main__.UltraJSONTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/tests.py", line 917, in test_does_not_leak_dictionary_None_key
    ujson.dumps(data)
OverflowError: Unterminated UTF-8 sequence when encoding string

----------------------------------------------------------------------
Ran 128 tests in 0.887s

FAILED (errors=1)
$ python3.6 tests/tests.py
................................................................E...............................................................
======================================================================
ERROR: test_does_not_leak_dictionary_None_key (__main__.UltraJSONTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/tests.py", line 917, in test_does_not_leak_dictionary_None_key
    ujson.dumps(data)
OverflowError: Unterminated UTF-8 sequence when encoding string

----------------------------------------------------------------------
Ran 128 tests in 1.156s

FAILED (errors=1)
$ python3.6 tests/tests.py
................................................................E...............................................................
======================================================================
ERROR: test_does_not_leak_dictionary_None_key (__main__.UltraJSONTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/tests.py", line 917, in test_does_not_leak_dictionary_None_key
    ujson.dumps(data)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbb in position 9: invalid start byte

----------------------------------------------------------------------
Ran 128 tests in 0.625s

FAILED (errors=1)
$ python3.6 tests/tests.py
................................................................................................................................
----------------------------------------------------------------------
Ran 128 tests in 0.892s

OK
$

They pass when using pytest, it must enforce better test isolation:

$ mv tests/tests.py tests/test_ujson.py
$ python3.6 -m pytest
==================================================== test session starts ====================================================
platform darwin -- Python 3.6.9, pytest-5.3.5, py-1.8.1, pluggy-0.13.1
rootdir: /Users/hugo/github/ultrajson
plugins: cov-2.8.1
collected 128 items

tests/test_ujson.py ................................................................................................. [ 75%]
...............................                                                                                       [100%]

==================================================== 128 passed in 1.46s ====================================================
$ python3.6 -m pytest
==================================================== test session starts ====================================================
platform darwin -- Python 3.6.9, pytest-5.3.5, py-1.8.1, pluggy-0.13.1
rootdir: /Users/hugo/github/ultrajson
plugins: cov-2.8.1
collected 128 items

tests/test_ujson.py ................................................................................................. [ 75%]
...............................                                                                                       [100%]

==================================================== 128 passed in 2.08s ====================================================
$ python3.6 -m pytest
==================================================== test session starts ====================================================
platform darwin -- Python 3.6.9, pytest-5.3.5, py-1.8.1, pluggy-0.13.1
rootdir: /Users/hugo/github/ultrajson
plugins: cov-2.8.1
collected 128 items

tests/test_ujson.py ................................................................................................. [ 75%]
...............................                                                                                       [100%]

==================================================== 128 passed in 1.41s ====================================================
$

I've created PR #363 to switch over to pytest.

@hugovk hugovk mentioned this pull request Mar 1, 2020
@hugovk
Copy link
Copy Markdown
Member

hugovk commented Mar 1, 2020

Copy link
Copy Markdown
Member

@hugovk hugovk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subsets of this PR have been merged in #257 and #353.

Let's include the rest from here when the tests pass and conflicts are resolved, and anyone please feel free to create a new PR if necessary, it would be good to get this released.

@AndreLouisCaron
Copy link
Copy Markdown

@nav-agarwal Thanks for providing those tests! Thanks to them, I was able to confirm our upgrade to json>=2 introduced memory leaks in our application.

For others stumbling here, see code in this gist: https://gist.github.com/AndreLouisCaron/a18ae434e2eaf2866ffa4a24f693b2e8 to run tests against multiple ujson versions.

Note that I have the same results whether I use basepython = python3.8 or basepython = python2.7.

@hugovk
Copy link
Copy Markdown
Member

hugovk commented May 30, 2022

Closing this as it's very much out of date but new PRs always welcome to fix things like this (and there have been quite a few recently). Thanks!

@hugovk hugovk closed this May 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog: Fixed For any bug fixes

Projects

None yet

Development

Successfully merging this pull request may close these issues.