Add HEAD requests support to DelayedRequester#865
Conversation
180d6d2 to
367118f
Compare
HEAD requests support to DelayedRequester
| self.session = requests.Session() | ||
|
|
||
| def get(self, url, params=None, **kwargs): | ||
| def _make_request(self, method, url, **kwargs): |
There was a problem hiding this comment.
| def _make_request(self, method, url, **kwargs): | |
| def _make_request(self, method: Literal['get', 'head'], url: str, **kwargs): |
There was a problem hiding this comment.
👋 Hi @obulat - thanks for the suggestion. Could you help me understand this literal type constraint better? I was under the impression that the calls to _make_request were passing methods from the Requests module and not string values, but I'm probably missing something.
To experiment I changed the Literal params to something nonsensical (instead of get and head) and just lint and just test passed when I was expecting them to fail.
Also, if you're suggesting that we make the Requests module calls only within _make_request based on the string passed then that makes sense, but I wanted to get confirmation. Thank you!
There was a problem hiding this comment.
Gah, that's right! This is actually of type callable, rather than Literal. I think the way it is presently is fine, but it might be worth adding the generic callable type annotation (trying to match the call signatures of both .head and .get will be a head-ache 😄)
There was a problem hiding this comment.
Thank you for stepping in, @AetherUnbound, and sorry I couldn't reply sooner, @twstokes.
| url: URL to make the request as a string. | ||
| params: Dictionary of query string params | ||
| **kwargs: Optional arguments that will be passed to `requests.get` | ||
| **kwargs: Optional arguments that will be passed to the `requests` |
There was a problem hiding this comment.
The docstrings params need to be updated, too: add method and remove params (or keep it if it's possible to document a **kwargs parameter).
|
Looks like we could fold one more fix into this! WordPress/openverse#1578 |
|
Thank you @obulat and @AetherUnbound for the suggestions! I've been AFK but will have a chance to address these soon. 🙇 |
Hey @AetherUnbound! 👋 I attempted to:
here, but ran into some exceptions that may be related to what you've dealt with in the past. At that point I didn't identify a quick fix and felt like I was going outside of the original mission of this PR. If you have any guidance I'll be glad to wrap it into these changes. Thanks! |
|
I wanted to highlight this comment by @krysal and that I'm not totally confident the responses are the same because I wasn't sure of how to test the |
|
Thank you @twstokes! And ah, Freesound is indeed a special case. I think we can move forward with this PR without that as a fix, since we'll work to address it in WordPress/openverse-catalog#754 anyway. If you're interested in including it still, I think we'd want to keep the decorator but also add the Probably the best way to ensure that we're getting the same content back from the requests is to test running each of the altered provider DAGs locally and making sure we get the appropriate filesizes where expected! I will try to run each of those today and share the results 🙂 |
AetherUnbound
left a comment
There was a problem hiding this comment.
I ran both the Stocksnap and WordPress providers locally (after the suggested changes below) and they looked good! We'll definitely need to address the return issue before this is merged.
If you'd like to try and run this yourself @twstokes, do the following:
- Add
AIRFLOW_VAR_INGESTION_LIMIT=200to your.envfile - Run
just up - Login to Airflow using
airflow/airflowas user/pass respectively. - Flip the switch on the
stocksnap_workflowDAG and make sure it runs successfully (you can click on the DAG name itself to get to more useful views).
Co-authored-by: Madison Swain-Bowden <[email protected]>
|
Thanks for all your help @AetherUnbound 😄. I ran both the |
AetherUnbound
left a comment
There was a problem hiding this comment.
This looks great! I have one more suggestion, but after that this is good to go. Thanks for all the iteration on this! 🚀
obulat
left a comment
There was a problem hiding this comment.
Great documentation and typing!
Fixes
HEADmethod toDelayedRequesteropenverse#1579 by @AetherUnboundWordPressDataIngester::_get_filesizeshould not useget_response_jsonopenverse#1373 by @AetherUnboundDescription
This PR adds a
headmethod to theDelayedRequesterclass and updates file size calls in the StockSnap and WordPress ingesters to use it.Testing Instructions
See steps @AetherUnbound provided here.
Checklist
Update index.md).main) or a parent feature branch.Developer Certificate of Origin
Developer Certificate of Origin