
Throttle simultaneous DNS requests#1924

Closed
pfreixes wants to merge 12 commits intoaio-libs:masterfrom
pfreixes:throttle_dns_requests

Conversation

@pfreixes
Contributor

@pfreixes pfreixes commented May 26, 2017

What do these changes do?

Added a new option to TCPConnector, a keyword argument called throttle_dns, that
allows the connector to throttle simultaneous DNS requests for a specific
domain. This functionality is only compatible with an enabled DNS cache,
and it is disabled by default.

This feature gets rid of the dogpile effect when there is a cache miss: only one request performs the DNS resolution, at hostname granularity, and requests that keep arriving while that resolution is in flight wait until it finishes.

When the DNS cache is not enabled this feature is not available, and every request ends up making its own DNS query.

Are there changes in behavior for the user?

No

Related issue number

No

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.
  • Add a new entry to CHANGES.rst
    • Choose any open position to avoid merge conflicts with other PRs.
    • Add a link to the issue you are fixing (if any) using #issue_number format at the end of changelog message. Use Pull Request number if there are no issues for PR or PR covers the issue only partially.

Pau Freixes added 6 commits May 26, 2017 16:25
Added a new option at the TCPConnector level called `throttle_dns` that
allows the connector to throttle simultaneous DNS requests for a specific
domain. This functionality is only compatible when the DNS cache is
enabled, and it is disabled by default. Without it, many requests can
end up making the same DNS resolution.
@codecov-io

codecov-io commented May 26, 2017

Codecov Report

Merging #1924 into master will increase coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1924      +/-   ##
==========================================
+ Coverage   97.05%   97.06%   +0.01%     
==========================================
  Files          37       38       +1     
  Lines        7610     7637      +27     
  Branches     1328     1336       +8     
==========================================
+ Hits         7386     7413      +27     
  Misses        101      101              
  Partials      123      123
Impacted Files Coverage Δ
aiohttp/connector.py 97.66% <100%> (+0.08%) ⬆️
aiohttp/locks.py 100% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c96322c...89accd7. Read the comment docs.

@pfreixes
Contributor Author

This MR is getting old :) Any feedback, or are you too busy these days, @asvetlov @fafhrd91?

@asvetlov
Member

Hmm. The PR adds an extra level of complexity to the aiohttp client API, but we could probably live with it.
_throttle_dns_events doesn't close events on .
If an event has an exception inside, it will be reported as 'exception was never retrieved'.
Also, I strongly object to hacking asyncio.Event. This class is not intended for inheritance. If you need a custom implementation, just copy-paste it and extend it on your own, or use the standard locking primitives.

@pfreixes
Contributor Author

@asvetlov fair enough about the locking primitive: if the Event class is changed in some way, it might break the derived class. I will refactor it to avoid inheritance.

I'm not sure what you mean by:

_throttle_dns_events doesn't close events on

Events are supposed to be temporary, locking only those requests that had a cache miss and were waiting on the same host resolution. Once the creator of the event has either filled the cache or got an exception, the event is removed from _throttle_dns_events. What am I missing?

I'm also not sure about:

If event have an exception inside it will be reported as 'exception was never retrieved'.

Regarding the complexity, I would like to justify why this change is interesting. The following code shows a situation where 10 requests to the same domain are made when there is a cache miss:

import asyncio
import aiohttp

from aiohttp.connector import TCPConnector
from aiohttp.resolver import DefaultResolver

class MyResolver(DefaultResolver):

    def __init__(self, *args, **kwargs):
        self.cnt = 0
        super().__init__(*args, **kwargs)

    async def resolve(self, *args, **kwargs):
        self.cnt += 1
        return await super().resolve(*args, **kwargs)


async def fetch(session, url):
    # Release the response so connections return to the pool.
    async with session.get(url) as resp:
        await resp.read()

async def main():
    url = 'http://www.vilaweb.cat/'
    resolver = MyResolver()
    connector = TCPConnector(resolver=resolver, use_dns_cache=True)
    session = aiohttp.ClientSession(connector=connector)
    tasks = [fetch(session, url) for _ in range(10)]
    await asyncio.gather(*tasks)
    print('DNS resolve calls %d' % resolver.cnt)
    for _ in range(4):
        await fetch(session, url)
        print('DNS resolve calls %d' % resolver.cnt)
    await session.close()

if __name__ == '__main__':
    asyncio.get_event_loop().run_until_complete(main())

This is by design: the concurrent calls are all paused at the same code path, and they won't be resumed until the next full scheduler iteration, so each of them misses the cache and issues its own DNS query. This MR gets rid of that issue, resulting in only one DNS query.
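The dogpile effect just described can be reproduced without any network access. This is a minimal, hypothetical sketch (the host name and counter are illustrative) of why a plain cache alone does not help concurrent callers:

```python
import asyncio

# Network-free sketch of the dogpile effect: with only a plain cache and
# no throttling, every concurrent caller misses the cache and performs
# its own lookup.
cache = {}
lookups = 0

async def resolve(host):
    global lookups
    if host in cache:
        return cache[host]
    lookups += 1                 # every gathered task reaches this line
    await asyncio.sleep(0.01)    # stand-in for the DNS round trip
    cache[host] = ['192.0.2.1']
    return cache[host]

async def main():
    await asyncio.gather(*(resolve('example.com') for _ in range(10)))
    print(lookups)   # 10: all tasks were suspended before any could fill the cache
    await resolve('example.com')
    print(lookups)   # still 10: later sequential calls do hit the cache

asyncio.run(main())
```

All ten tasks check the empty cache before any of them has had a chance to fill it, so each performs the lookup; only calls made after the first batch completes benefit from the cache.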

@asvetlov
Member

asvetlov commented Jun 28, 2017

Ok, your motivation is very clear.

What I'm talking about is the missing handling of errors from _throttle_dns_events in TCPConnector.close().

Suppose we have failed events in _throttle_dns_events that have not been processed yet.
We call

connector.close()
del connector

Pretty simple scenario, isn't it?
Python destroys all of the connector's internal data, and future objects that store exceptions raise an 'exception was never retrieved' warning, with more comprehensive info in PYTHONASYNCIODEBUG mode.

To get rid of it you should handle these exceptions in connector.close().

@pfreixes
Contributor Author

Checking this scenario, I could see that simply calling set_exception creates a new reference to the Future involved, held by the loop because of the call_soon [1]. Hence, the destructor of the Future won't be called. IMHO this is the behavior I would expect, the same as for the happy path with set_result. Therefore, an explicit close of the TCPConnector should not produce this log entry at execution time.

It is true that once the program finishes, these messages might be printed if the exception is not gathered by the caller, but IMHO this is, or should be, the default behavior.

[1] https://github.com/python/cpython/blob/master/Lib/asyncio/futures.py#L129

@asvetlov
Member

No. The default should be finishing without warnings.
Just call fut.cancel() or fut.exception().
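The suggestion above can be illustrated in a few lines. This is a minimal sketch, not aiohttp code: consuming a future's stored exception with `fut.exception()` (or cancelling it with `fut.cancel()`) marks the exception as retrieved, so dropping the future no longer logs 'exception was never retrieved'.

```python
import asyncio

# Create a future, store a failure in it, then retrieve the exception
# before the future is garbage collected.
async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    fut.set_exception(OSError('resolution failed'))
    exc = fut.exception()   # retrieves the exception; no warning on GC
    return type(exc).__name__

result = asyncio.run(main())
print(result)  # OSError
```

Without the `fut.exception()` call, destroying the future with an unretrieved exception would trigger the warning asvetlov describes, with a fuller traceback in debug mode.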

@pfreixes
Contributor Author

Handling the futures that might have already finished with an exception but are still waiting to be processed by their callers would require some deep changes in this MR; keep in mind that the current pattern pops the keys as soon as the first caller ends [1]. The main reason is that once the cache is filled, it does not make sense to keep stacking future callers on the Event.

Aside from that, I'm still not convinced about that need. Stopping the TCPConnector explicitly and making a del of the instance won't always delete the instance, only when there are no more references to it.

Correct me if I'm wrong, but even without this MR the current code can suffer from the same issue. Let me explain.

Imagine that you have a bunch of callers that are paused waiting for the DNS response, and the resolution has finished with an error; that error is an exception saved in the futures that link the paused callers. If the TCPConnector is stopped and deleted (in fact it won't be deleted), the callers will still have the chance to gather that exception if the loop keeps running, or, if the loop is stopped, the exception will be printed to the log.

If this is true, fixing the issue you describe makes no sense, because it is the same behaviour that the current code already exhibits.

[1] https://github.com/aio-libs/aiohttp/pull/1924/files#diff-7f25afde79f309e2f8722c26cf1f10adR699

Pau Freixes added 2 commits July 17, 2017 23:33
Will help the user throttle DNS requests to the same domain. By
design this will always happen when many URLs behind the same domain are
scheduled at the same time, perhaps via *gather*.

The Event class is no longer derived from asyncio.locks.Event;
it is a full copy of that class, to avoid future issues.
@pfreixes
Contributor Author

@asvetlov I've modified the Event class, getting rid of the inheritance; it is basically a full copy of the official class, and I have tried to get 100% coverage.

Also, I've left throttle_dns enabled by default; it does not make sense to have a flag to enable or disable it, since the user won't enable it because in most scenarios they won't be aware of the problem.

Regarding the controversial topic of the futures, I would like to have another reviewer and gather other opinions. What do you think?
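The wrapper approach that replaced the Event subclass can be sketched as follows. This is an illustrative example, not the exact class that landed in aiohttp; the name and methods are assumptions for the sake of the demo. Waiters either proceed normally or re-raise the creator's exception, without inheriting from asyncio.Event:

```python
import asyncio

# Hypothetical Event wrapper: composition over inheritance. set() records
# an optional failure and wakes every waiter; wait() re-raises it.
class EventResultOrError:
    def __init__(self):
        self._exc = None
        self._event = asyncio.Event()

    def set(self, exc=None):
        self._exc = exc
        self._event.set()

    async def wait(self):
        await self._event.wait()
        if self._exc is not None:
            raise self._exc

async def main():
    ev = EventResultOrError()

    async def waiter():
        try:
            await ev.wait()
            return 'ok'
        except OSError:
            return 'dns error'

    tasks = [asyncio.ensure_future(waiter()) for _ in range(3)]
    await asyncio.sleep(0)                 # let the waiters block on the event
    ev.set(exc=OSError('resolution failed'))
    return await asyncio.gather(*tasks)

result = asyncio.run(main())
print(result)  # ['dns error', 'dns error', 'dns error']
```

Because the wrapper only holds an asyncio.Event rather than subclassing it, changes to the Event internals cannot break it, which addresses the inheritance objection raised earlier in the thread.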

@asvetlov
Member

Sorry, I'm mostly unavailable until roughly the end of the month.
Please wait a while for the review.

@pfreixes
Contributor Author

Any ideas @fafhrd91 why make doc-spelling is failing in Travis?

It works on my computer.

@asvetlov
Member

asvetlov commented Jul 18, 2017 via email

@pfreixes pfreixes closed this Jul 18, 2017
asvetlov pushed a commit that referenced this pull request Jul 31, 2017
* Throttle simultaneous DNS requests

Added a new option at the TCPConnector level called `throttle_dns` that
allows the connector to throttle simultaneous DNS requests for a specific
domain. This functionality is only compatible when the DNS cache is
enabled, and it is disabled by default. Without it, many requests can
end up making the same DNS resolution.

* Explicit loop for sleep calls

* Fixed typos in documentation

* Added DNS request throttling in CHANGES.rst

* Dogpile invalid word by doc-spelling

* Get full coverage for locks

* Throttle DNS always enabled

Will help the user throttle DNS requests to the same domain. By
design this will always happen when many URLs behind the same domain are
scheduled at the same time, perhaps via *gather*.

The Event class is no longer derived from asyncio.locks.Event;
it is a full copy of that class, to avoid future issues.

* Remove implicit loop tests for locks.Event

* Use create_future helper

* Updated spelling wordlist

* Use a simple Event wrapper instead of a full Event class implementation

* Pass the loop explicitly

* Cancel pending throttled DNS requests if the connector is closed

* Removed unused attribute

* Pass loop explicit

* Throttle DNS feature adapted to the CHANGES protocol

* Update 2111.feature

* Changed class name
@lock

lock bot commented Oct 28, 2019

This thread has been automatically locked since there has not been
any recent activity after it was closed. Please open a new issue for
related bugs.

If you feel like there are important points made in this discussion,
please include those excerpts in the new issue.

@lock lock bot added the outdated label Oct 28, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 28, 2019