Lock to avoid concurrent refresh of pyx tokens#17479
Conversation
Prevent concurrent processes from all refreshing the token simultaneously by using a file lock and timestamp-based debounce. When a refresh is needed: 1. Acquire an exclusive lock on refresh.lock 2. Check if another process recently refreshed (within 5 seconds) 3. If so, re-read the tokens from disk instead of making another request 4. Otherwise, perform the refresh and update the timestamp This prevents thundering herd issues when multiple concurrent processes invoke `uv auth credential-helper` at the same time.
The lock file is now specific to the token type:
- OAuth tokens: tokens.lock
- API key tokens: {api_key_digest}.lock
This ensures that different API keys don't unnecessarily block each other
when refreshing concurrently.
When another process recently refreshed, we re-read the token from disk. Now we also verify the token's expiration is beyond the tolerance before returning it. If the re-read token is stale (e.g., the other process failed to write a fresh token), we fall through to perform the refresh ourselves.
Consolidate the token freshness check into a single `is_fresh` method and use it in both places: the initial check and the debounce check. This reduces duplication and simplifies the code.
The is_fresh helper now includes all the debug logging, so we can use it in both the initial freshness check and the debounce check without duplicating the checking logic.
…eason
Move the freshness checking logic to a method on PyxTokens that returns
Result<(), ExpiredTokenReason>. The ExpiredTokenReason enum implements
Display, allowing callers to log "Refreshing token due to {reason}" with
descriptive messages like "missing expiration", "zero tolerance",
"token expired", or "token expiring soon".
ExpiringSoon now carries the expiration timestamp and displays it as
"token will expire within tolerance (`{exp}`)" for more informative logs.
When we read tokens from disk during debounce but they're still not fresh, log the reason so it's clear why we're falling through to refresh.
… comments - Expired variant now includes the timestamp like ExpiringSoon - Removed trailing periods from inline comments
check_fresh now returns Ok(expiration) instead of Ok(()), allowing callers
to log the expiration timestamp when the token is fresh, matching the
original "Access token is up-to-date (`{exp}`)" log message.
|
cc @charliermarsh I think this is probably sound even if it's not causing the Bazel issue @zsol is going to look into? |
crates/uv-auth/src/pyx.rs
Outdated
| /// Read the last refresh timestamp from the lock file. | ||
| async fn read_last_refresh(&self, lock_path: &Path) -> Option<u64> { | ||
| let data = fs_err::tokio::read_to_string(lock_path).await.ok()?; | ||
| data.trim().parse().ok() | ||
| } | ||
|
|
||
| /// Write the current timestamp to the lock file. | ||
| async fn write_last_refresh(&self, lock_path: &Path) -> Result<(), io::Error> { | ||
| let now = SystemTime::now() | ||
| .duration_since(UNIX_EPOCH) | ||
| .expect("system time before Unix epoch") | ||
| .as_secs(); | ||
| fs_err::tokio::write(lock_path, now.to_string()).await | ||
| } |
There was a problem hiding this comment.
I would version the contents of this lockfile, just in case we need to change its contents in future releases. It could be as simple as the first 2 bytes containing a version marker, followed by content
|
This seems to resolve the issue I'm seeing, verified with the following simple script: #!/usr/bin/env python3
"""Request pyx auth tokens from uv in concurrent attempts."""
import argparse
import asyncio
async def get_auth_token(
index: int, uv_binary: str
) -> tuple[int, str | None, str | None]:
"""Request an auth token from uv."""
proc = await asyncio.create_subprocess_exec(
uv_binary,
"auth",
"helper",
"--preview-features",
"auth-helper",
"--protocol",
"bazel",
"get",
stdin=asyncio.subprocess.PIPE,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
)
input_json = b'{"uri": "https://api.pyx.dev/"}'
stdout, stderr = await proc.communicate(input=input_json)
if proc.returncode == 0:
return (index, stdout.decode().strip(), None)
else:
return (index, None, stderr.decode().strip())
async def main(invocations: int, uv_binary: str) -> None:
"""Run n concurrent auth token requests."""
print(f"Starting {invocations} concurrent auth token requests...")
tasks = [get_auth_token(i, uv_binary) for i in range(invocations)]
results = await asyncio.gather(*tasks)
successes = 0
failures = 0
for index, token, error in results:
if token:
successes += 1
# Only print first few characters of token for privacy.
print(f"[{index:3d}] Success: {token[:5]}...")
else:
failures += 1
print(f"[{index:3d}] Failed: {error}")
print(f"\nResults: {successes} successes, {failures} failures")
if __name__ == "__main__":
parser = argparse.ArgumentParser(
description="Request PyX auth tokens from uv in concurrent attempts"
)
parser.add_argument(
"-n",
"--invocations",
type=int,
default=100,
help="Number of concurrent requests (default: 100)",
)
parser.add_argument(
"--uv",
default="uv",
help="Path to uv binary (default: uv)",
)
args = parser.parse_args()
asyncio.run(main(args.invocations, args.uv))On my machine, calling this with However, with |
|
We opted not to include the complexity of a debounce for now, and instead just changed the tolerance for the credential-helper fetch. |
This MR contains the following updates: | Package | Update | Change | |---|---|---| | [astral-sh/uv](https://github.com/astral-sh/uv) | patch | `0.9.24` → `0.9.26` | MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot). **Proposed changes to behavior should be submitted there as MRs.** --- ### Release Notes <details> <summary>astral-sh/uv (astral-sh/uv)</summary> ### [`v0.9.26`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#0926) [Compare Source](astral-sh/uv@0.9.25...0.9.26) Released on 2026-01-15. ##### Python - Add CPython 3.15.0a5 ##### Enhancements - Add a hint to update uv when a managed Python download is not found ([#​17461](astral-sh/uv#17461)) - Improve cache initialization failure error message ([#​17469](astral-sh/uv#17469)) - Improve error message for abi3 wheels on free-threaded Python ([#​17442](astral-sh/uv#17442)) - Add support for `--no-sources-package` ([#​14910](astral-sh/uv#14910)) ##### Preview features - Add `METADATA.json` and `WHEEL.json` in uv build backend ([#​15510](astral-sh/uv#15510)) - Add support for GCS request signing ([#​17474](astral-sh/uv#17474)) - Adjust the process ulimit to the maximum allowed on startup ([#​17464](astral-sh/uv#17464)) ##### Bug fixes - Lock to avoid concurrent refresh of pyx tokens ([#​17479](astral-sh/uv#17479)) ##### Documentation - Add linting and formatting instructions to the CONTRIBUTING guide ([#​17470](astral-sh/uv#17470)) - Avoid rendering `pyproject.toml` examples for more system-level settings ([#​17462](astral-sh/uv#17462)) ### [`v0.9.25`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#0925) [Compare Source](astral-sh/uv@0.9.24...0.9.25) Released on 2026-01-13. ##### Python - Add CPython 3.15.0a4 - Upgrade Tcl/Tk used by CPython to 9.0 ##### Enhancements - Add `--compile-bytecode` to `uv python install` and `uv python upgrade` to compile the standard library ([#​17088](astral-sh/uv#17088)) - Allow disabling `exclude-newer` per package ([#​16854](astral-sh/uv#16854)) - Broadcast `WM_SETTINGCHANGE` on `uv tool update-shell` ([#​17404](astral-sh/uv#17404)) ##### Preview features - Detect workspace from `uv run` target ([#​17423](astral-sh/uv#17423)) ##### Bug fixes - Avoid unwrapping size for file responses ([#​17434](astral-sh/uv#17434)) - Use keyring authentication when retrieving `tool@latest` version ([#​17448](astral-sh/uv#17448)) - Use latest Pyodide version for each python version ([#​17372](astral-sh/uv#17372)) - Improve trampoline file handle closing ([#​17374](astral-sh/uv#17374)) - Fix error message when installing musl python on armv7 ([#​17213](astral-sh/uv#17213)) </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this MR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box --- This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0Mi44MC4xIiwidXBkYXRlZEluVmVyIjoiNDIuODEuOCIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOlsiUmVub3ZhdGUgQm90IiwiYXV0b21hdGlvbjpib3QtYXV0aG9yZWQiLCJkZXBlbmRlbmN5LXR5cGU6OnBhdGNoIl19-->
Prevent concurrent processes from all refreshing the token simultaneously by using a file lock. When a refresh is needed:
This prevents thundering herd issues when multiple concurrent processes are invoking
uv auth tokenoruv auth helperat the same time.Also, set the tolerance for
uv auth helperto the same asuv auth token(5 minutes).