Skip to content

Comments

Fix stream did not contain valid UTF-8#8120

Merged
BurntSushi merged 1 commit intoastral-sh:mainfrom
kahojyun:fix-utf8
Oct 11, 2024
Merged

Fix stream did not contain valid UTF-8#8120
BurntSushi merged 1 commit intoastral-sh:mainfrom
kahojyun:fix-utf8

Conversation

@kahojyun
Copy link
Contributor

@kahojyun kahojyun commented Oct 11, 2024

Summary

Related issues: #8009 #7549

Although PYTHONIOENCODING=utf-8 forces python to use UTF-8 for stdout/stderr, it can't prevent code like sys.stdout.buffer.write() or subprocess.call(["cl.exe", ...]) to bypass the encoder. This PR uses lossy UTF-8 conversion to avoid decoding error.

Alternative

Using bstr crate might be better since it can preserve original information. Or we should follow the Windows convention, unset PYTHONIOENCODING and decode with system default encoding.

Test Plan

Running locally with non-ASCII character in UV_CACHE_DIR works fine, but I have no unit test plan. Testing locale problem is hard :(

@charliermarsh
Copy link
Member

\cc @BurntSushi

let stdout_reader = tokio::io::BufReader::new(child.stdout.take().unwrap()).lines();
let stderr_reader = tokio::io::BufReader::new(child.stderr.take().unwrap()).lines();
let stdout_reader = tokio::io::BufReader::new(child.stdout.take().unwrap()).split(b'\n');
let stderr_reader = tokio::io::BufReader::new(child.stderr.take().unwrap()).split(b'\n');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First thought upon seeing this was thinking, "but what about \r?" But it looks like you handled that above. Nice. :-)

match reader.next_segment().await? {
Some(line_buf) => {
let line_buf = line_buf.strip_suffix(b"\r").unwrap_or(&line_buf);
let line = String::from_utf8_lossy(line_buf).into();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think bstr would be better for this, and would suggest using it (since we already depend on it, albeit transitively, in uv), but my memory here is that we need to write valid UTF-8. So while bstr wouldn't require this "lossy" step, I believe writing to the underlying stderr would still require valid UTF-8 anyway. So bstr would just absolve you of needing to handle line iteration, but I think this is fine as-is.

@Super1Windcloud
Copy link

@charliermarsh

please provide next fixed version as soon as possible

tmeijn pushed a commit to tmeijn/dotfiles that referenced this pull request Oct 15, 2024
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [astral-sh/uv](https://github.com/astral-sh/uv) | patch | `0.4.20` -> `0.4.21` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>astral-sh/uv (astral-sh/uv)</summary>

### [`v0.4.21`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#0421)

[Compare Source](astral-sh/uv@0.4.20...0.4.21)

##### Enhancements

-   Add support for managed installations of free-threaded Python ([#&#8203;8100](astral-sh/uv#8100))
-   Add note about `uvx` to `uv tool run` short help ([#&#8203;7695](astral-sh/uv#7695))
-   Enable HTTP/2 requests ([#&#8203;8049](astral-sh/uv#8049))
-   Support `uv tree --no-dev` ([#&#8203;8109](astral-sh/uv#8109))
-   Support PEP 723 metadata with `uv run -` ([#&#8203;8111](astral-sh/uv#8111))
-   Support `pip install --exact` ([#&#8203;8044](astral-sh/uv#8044))
-   Support `uv export --no-header` ([#&#8203;8096](astral-sh/uv#8096))
-   ADd Python 3.13 images to Docker publish ([#&#8203;8105](astral-sh/uv#8105))
-   Support remote (`https://`) scripts in `uv run` ([#&#8203;6375](astral-sh/uv#6375))
-   Allow comma value-delimited arguments in `uv run --with` ([#&#8203;7909](astral-sh/uv#7909))

##### Configuration

-   Support wildcards in `UV_INSECURE_HOST` ([#&#8203;8052](astral-sh/uv#8052))

##### Performance

-   Use shared index when fetching metadata in lock satisfaction routine ([#&#8203;8147](astral-sh/uv#8147))

##### Bug fixes

-   Add prerelease compatibility check to `uv python` CLI ([#&#8203;8020](astral-sh/uv#8020))
-   Avoid deleting a project environment directory if we cannot tell if a `pyvenv.cfg` file exists ([#&#8203;8012](astral-sh/uv#8012))
-   Avoid excluding valid wheels for exact `requires-python` bounds ([#&#8203;8140](astral-sh/uv#8140))
-   Bump `netrc` crate to latest commit ([#&#8203;8021](astral-sh/uv#8021))
-   Fix `uv python pin 3.13t` failure when parsing version for project requires check ([#&#8203;8056](astral-sh/uv#8056))
-   Fix handling of != intersections in `requires-python` ([#&#8203;7897](astral-sh/uv#7897))
-   Remove the newly created tool environment if sync failed ([#&#8203;8038](astral-sh/uv#8038))
-   Respect dynamic extras in `uv lock` and `uv sync` ([#&#8203;8091](astral-sh/uv#8091))
-   Treat resolver failures as fatal in lockfile validation ([#&#8203;8083](astral-sh/uv#8083))
-   Use `git config --get` for author information for improved backwards compatibility ([#&#8203;8101](astral-sh/uv#8101))
-   Use comma-separated values for `UV_FIND_LINKS` ([#&#8203;8061](astral-sh/uv#8061))
-   Use shared resolver state between add and lock to avoid double Git update ([#&#8203;8146](astral-sh/uv#8146))
-   Make `--relocatable` entrypoints robust to symlinking ([#&#8203;8079](astral-sh/uv#8079))
-   Improve compatibility with VSCode PS1 prompt ([#&#8203;8006](astral-sh/uv#8006))
-   Fix "Stream did not contain valid UTF-8" failures in Windows ([#&#8203;8120](astral-sh/uv#8120))
-   Use `--with-requirements` in `uvx` error hint ([#&#8203;8112](astral-sh/uv#8112))

##### Documentation

-   Include `uvx` installation in Docker examples ([#&#8203;8179](astral-sh/uv#8179))
-   Make the instructions for the Windows standalone installer consistent across README and documentation ([#&#8203;8125](astral-sh/uv#8125))
-   Update pip compatibility guide to note transitive URL dependency support ([#&#8203;8081](astral-sh/uv#8081))
-   Document `--reinstall` with `--exclude-newer` to ensure downgrades ([#&#8203;6721](astral-sh/uv#6721))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy40NDAuNyIsInVwZGF0ZWRJblZlciI6IjM3LjQ0MC43IiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJSZW5vdmF0ZSBCb3QiXX0=-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants