Cache avatar url #12355
Conversation
There was a problem hiding this comment.
Pull request overview
Restores and fixes caching for User.get_avatar_url so avatar URL lookups are cached per-user in memcache (avoiding repeated site.get() / account lookups), and adds tests to validate correctness and caching behavior.
Changes:
- Re-enables
@cache.memoizeonUser.get_avatar_urlwith a per-username cache key. - Adds unit tests verifying avatar URL generation and caching behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| openlibrary/core/models.py | Reintroduces memcache memoization for avatar URL lookup using a per-username cache key. |
| openlibrary/plugins/upstream/tests/test_models.py | Adds tests for correct avatar URL generation and caching semantics. |
Comments suppressed due to low confidence (1)
openlibrary/core/models.py:1060
- This memoized value is cached indefinitely (default
expires=0). If an account is created/linked later orinternetarchive_itemnameis temporarily missing, the function can return a bad URL (e.g. ending inNone) and that result will be cached until memcache eviction. Consider adding a reasonableexpires(e.g. hours/day) and/or using thecacheableparameter to avoid caching whenitemnameis falsy.
@cache.memoize(
engine="memcache", key=lambda cls, username: f"user-avatar-{username}"
)
def get_avatar_url(cls, username: str) -> str:
username = username.rsplit('/people/', maxsplit=1)[-1]
user = web.ctx.site.get(f'/people/{username}')
itemname = user.get_account().get('internetarchive_itemname')
return f'https://archive.org/services/img/{itemname}'
| @cache.memoize( | ||
| engine="memcache", key=lambda cls, username: f"user-avatar-{username}" | ||
| ) | ||
| def get_avatar_url(cls, username: str) -> str: | ||
| username = username.rsplit('/people/', maxsplit=1)[-1] |
There was a problem hiding this comment.
The memoize key is computed from the raw username argument, but get_avatar_url later normalizes username by stripping a /people/ prefix. If callers pass both forms (e.g. "foo" vs "/people/foo"), this will create separate cache entries and reduce cache hit rate. Consider normalizing inside the key function (or normalize username before memoization) so both inputs map to the same cache key.
| def test_caches_result_for_same_username(self, monkeypatch): | ||
| web.ctx.site.save({"key": "/people/cachetest_user", "type": {"key": "/type/user"}}) | ||
| web.ctx.site.store["account/cachetest_user"] = { | ||
| "internetarchive_itemname": "@cachetest_user-archive", | ||
| } | ||
| call_count = 0 | ||
| original_get = web.ctx.site.get | ||
|
|
||
| def counting_get(key, *args, **kwargs): | ||
| nonlocal call_count | ||
| call_count += 1 | ||
| return original_get(key, *args, **kwargs) | ||
|
|
||
| monkeypatch.setattr(web.ctx.site, "get", counting_get) | ||
| models.User.get_avatar_url("cachetest_user") | ||
| models.User.get_avatar_url("cachetest_user") | ||
| assert call_count == 1 |
There was a problem hiding this comment.
These tests exercise the memcache-backed decorator, but they don't clear the global memcache between tests/runs. If the user-avatar-* key already exists (e.g. when re-running a subset of tests in the same process), call_count could be 0 and the test would fail. Consider deleting the relevant memcache key(s) at the start of the test (or flushing the mock memcache in setup_method) to make the caching assertions deterministic.
…method and updated cacheable import in TestGetAvatarUrl
…y into cache-avatar-url-9282 # Conflicts: # openlibrary/core/models.py
…l-9282 Cache avatar url
…l-9282 Cache avatar url
…l-9282 Cache avatar url
Closes #9282
Technical
get_avatar_urlhad a@cache.memoizedecorator that was commented out because its key was the static string"user-avatar", which would have cached one user's avatar URL for all users. This PR restores the decorator with a correct per-username key (f"user-avatar-{username}"), so each user's avatar URL is cached independently in memcache, avoiding repeatedsite.get()andget_account()calls for the same user across requests.Note: the first page load still pays the full cost of 50 calls per page of 25 followers (25x
site.get('/people/X')and 25xget_account()). Subsequent loads are served from cache with 0 calls per user. A bulk fetch viasite.get_many()inmy_follows.GETwould reduce the first load from 25 individualsite.getcalls down to 1, but requires template changes and is flagged for a follow-up.Testing
/people/<username>/followersor/people/<username>/followingsite.get()calls per userScreenshot
NA no UI changes
Stakeholders
@mekarpeles