-
-
Notifications
You must be signed in to change notification settings - Fork 939
fix glob dir on windows #8654
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix glob dir on windows #8654
Conversation
24c51ad to
6f3b3b1
Compare
6f3b3b1 to
0a0b745
Compare
enebo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we should be using the String construction listed above in my comment for your createResource line. I think links.size() > 0 section also uses the same call and it should also be changed.
I am going to ask @headius on channel to review this since glob is always too complicated.
| buf.append( isRoot(base) ? EMPTY : SLASH ); | ||
| buf.append( getBytesInUTF8(file) ); | ||
| if ( SLASH_INDEX == -1 ) { | ||
| resource = JRubyFile.createResource(runtime, cwd, RubyString.byteListToString(buf)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up above we see the same resource creation but it uses the encoding for charset:
resource = JRubyFile.createResource(runtime, cwd, new String(buf.unsafeBytes(), buf.begin(), buf.length(), enc.getCharset()));This enc comes from the original pattern and that pattern is fed through RubyFile#get_path/filePathConvert which has special logic for transcoding if NOT on windows to match filesystem encoding. I think the use of byteListToString (used both here and below) is wrong. It will make an ISO-8859-1 string where each byte is a char vs the line above which properly decodes the bytes given with the charset provided.
I think the reason we do not ever see problems is almost all files are single-byte ASCII.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think this risks losing encoding and possibly dealing with mangled characters. If we cannot fix it by working with the possibly-multibyte characters directly, then we should use the functions in RubyEncoding to decode them to String properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorted
| resource = JRubyFile.createResource(runtime, cwd, RubyString.byteListToString(link)); | ||
| if ( resource.isDirectory() ) { | ||
| final int len = link.getRealSize(); | ||
| buf.length(0); | ||
| buf.append(link); | ||
| buf.append(path, SLASH_INDEX, end - SLASH_INDEX); | ||
| status = glob_helper(runtime, cwd, scheme, buf, buf.getBegin() + len, flags, func, arg); | ||
| } | ||
| final int len = link.getRealSize(); | ||
| buf.length(0); | ||
| buf.append(link); | ||
| buf.append(path, SLASH_INDEX, end - SLASH_INDEX); | ||
| status = glob_helper(runtime, cwd, scheme, buf, buf.getBegin() + len, flags, func, arg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so i moved the isDirectory check to where we append to links rather than the loop. This might cause an issue where links would have contained only non-directories, it should still call break mainLoop, although I haven't seen any issues. I don't know how I would cause this to happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reverted this change
86cbebf to
03945c6
Compare
Resolve #8623