I apologize in advance for not diligently following the CONTRIBUTING.md format, but I'm not actually sure what the expected result of these commands should be (read below for why). But regardless of what it should be, there's a problem.
To preface, 'ば', '日' and '本' are all letters (according to isLetter from Data.Char). Now let's see what happens when we try to create a new project with each of them as a name:
This one just completely gives you an incorrect project name:
$ stack new "ば"
Downloading template "new-template" to create project "p" in p/ ...
This one fails with a fairly bizarre-to-the-user error:
$ stack new "日"
Cannot decode byte '\xe5': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream
And this one is a complicated ball of I'm not sure what:
$ stack new "本"
Expected valid package name, but got: 本
- '本' is a letter, according to isLetter from Data.Char at least, so it should be allowed.
- But are cabal package names in ASCII, or can you actually use unicode?
- If it is supposed to fail, it only accidentally fails. Turns out it's truncating '本' down to ',' via Data.ByteString.Char8.pack and then fails because ',' isn't a valid package name.
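The truncation can be reproduced outside of stack. The following is a minimal sketch, assuming only the bytestring package that ships with GHC; the helper name truncated is made up for illustration and is not part of stack. It prints code points rather than the characters themselves, so the result does not depend on the terminal encoding.

```haskell
import qualified Data.ByteString.Char8 as B8
import Data.Char (ord)

-- Hypothetical helper: round-trip a String through Char8.pack,
-- which keeps only the low 8 bits of every Char.
truncated :: String -> String
truncated = B8.unpack . B8.pack

main :: IO ()
main = mapM_ report ["ば", "日", "本"]
  where
    -- Show the code points before and after the round trip.
    report s =
      putStrLn (show (map ord s) ++ " -> " ++ show (map ord (truncated s)))
-- [12400] -> [112]   (U+3070 becomes 0x70, 'p')
-- [26085] -> [229]   (U+65E5 becomes 0xE5)
-- [26412] -> [44]    (U+672C becomes 0x2C, ',')
```

This matches the three outcomes above: a project named "p", a stray byte 0xE5, and a name consisting of a single comma.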
What's happening is:
- The optparse-applicative parser calls parsePackageNameFromString, which uses Data.ByteString.Char8.pack, truncating any Chars outside of the Word8 range. Then packageNameParser is used to parse the result, yielding some pretty odd outcomes.
- The decodeUtf8 error message on "日" comes from packageNameText, which is used in some logging in the New command code.
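The decoding failure for "日" can be shown in isolation. This is a sketch under the assumption that the truncated bytes are later decoded as UTF-8, as the error message suggests; it uses decodeUtf8' from the text package (the variant that returns an Either instead of throwing), and failsToDecode is a made-up name rather than anything in stack.

```haskell
import qualified Data.ByteString.Char8 as B8
import qualified Data.Text.Encoding as TE

-- Hypothetical helper: True when the bytes produced by Char8.pack
-- are not a valid UTF-8 stream.
failsToDecode :: String -> Bool
failsToDecode s =
  either (const True) (const False) (TE.decodeUtf8' (B8.pack s))

main :: IO ()
main = do
  -- '日' truncates to the lone byte 0xE5, an incomplete UTF-8 sequence.
  print (failsToDecode "日")
  -- 'ば' truncates to 0x70 ('p'), which is valid ASCII and decodes fine.
  print (failsToDecode "ば")
```

That would explain why "ば" silently becomes "p" while "日" blows up: only the second one truncates to a byte that is invalid on its own.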
So, my question is: what should actually be done to fix this? In my mind, there are two options:
- Detect package names outside of ASCII range and reject them outright with a sane error message.
- Correctly handle unicode characters, which I'm not sure how extensive the changes required would be. Probably a lot of code paths would have to at least be looked at, and PackageName would have to change at least a moderate amount.
I suspect that the first choice is the correct one, but can anyone confirm?
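For the first option, a rough sketch of what an up-front check might look like is below. The function checkAsciiName and its error messages are hypothetical and not taken from stack's code; the idea is only that the check runs on the String before anything is packed into a ByteString, so the user sees the name they actually typed.

```haskell
import Data.Char (isAscii)

-- Hypothetical validator (not stack's actual API): reject names with
-- non-ASCII characters before any ByteString conversion happens.
checkAsciiName :: String -> Either String String
checkAsciiName name
  | null name = Left "Package name must not be empty"
  | all isAscii name = Right name
  | otherwise =
      -- show escapes the offending characters, so the message is
      -- itself safe to print in any locale.
      Left ("Package name must be ASCII, but got: " ++ show name)

main :: IO ()
main = mapM_ (print . checkAsciiName) ["my-project", "本"]
```

A name that passes this check would still need to go through the existing package name parser; the sketch only covers the "reject non-ASCII with a sane error message" part.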
The above behavior is identical between the following two versions:
$ stack --version
Version 0.1.7.0, Git revision 2f00f7bd350192cef1c61a8f07cbe7341c1e735f (2555 commits) x86_64
$ stack --version
Version 0.1.6.0, Git revision e22271f5ce9afa2cb5be3bad9cafa392c623f85c (2313 commits) x86_64