-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix copy_file_range usage for files > 2GB #114
Fix copy_file_range usage for files > 2GB #114
Conversation
ping @stevvooe @dmcgowan @AkihiroSuda @Random-Liu This is a nasty gotcha. Considering this a P0 for moby/moby. |
fs/copy_linux.go
Outdated
|
||
if int64(n) != st.Size() { | ||
return errors.Wrapf(err, "short copy: %d of %d", int64(n), st.Size()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this errors.Wrap
is incorrect and always returns nil.
2735bca
to
3293598
Compare
fs/copy_linux.go
Outdated
bufferPool.Put(buf) | ||
return err | ||
} | ||
first := true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you arent setting this to false
later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. It's probably useless, but figured why not have the extra protection.
3293598
to
1e30e55
Compare
fs/copy_linux.go
Outdated
first = false | ||
written += int64(n) | ||
} | ||
if written != size { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this surely cant happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely not, but I do break if n == 0
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i.e., I guess only if there was a bug in the kernel in the go syscall implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its not really worth checking for random kernel bugs
1e30e55
to
6a20386
Compare
fs/copy_linux.go
Outdated
} | ||
first := true | ||
for written < size { | ||
n, err := unix.CopyFileRange(int(src.Fd()), nil, int(dst.Fd()), nil, int(size-written), 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you shouldnt cast size-written
to int, it could be larger. you will need a check for is it >2G and then write a smaller amount given the length arg is an int.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(the syscall doesnt take an int here so in C you would be able to pass a 64 bit value)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
2399682
to
1b54a75
Compare
1b54a75
to
c8b9d21
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
fs/copy_linux.go
Outdated
@@ -43,28 +43,41 @@ func copyFileInfo(fi os.FileInfo, name string) error { | |||
return nil | |||
} | |||
|
|||
const maxCopySize = 2147483647 // max 32bit integer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we just use math.MaxInt32
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Certainly. Didn't want to add the import just to get one number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
c8b9d21
to
f77b648
Compare
fs/copy_linux.go
Outdated
|
||
for written < size { | ||
var desired int | ||
if size-written > math.MaxInt32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this size-written
looks ugly without spaces, but apparently it's gofmt way. So I'd suggest to rewrite it as
desired := size - written
if desired > math.MaxInt32 {
desired=int(math.MaxInt32)
}
fs/copy_linux.go
Outdated
|
||
for written < size { | ||
var desired int | ||
if size-written > math.MaxInt32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK in fact I think we should not do something like this instead:
sfd := int(src.Fd())
dfd := int(dst.Fd())
size := st.Size()
for size > 0 {
n, err := unix.CopyFileRange(sfd, nil, dfd, nil, int(size), 0)
if err != nil {
// ...
}
size -= n
}
This way we don't have any knowledge of how big the size can be, but just relying on the fact that the syscall returns the number of bytes actually copied.
Note that in here int(size)
can indeed be negative, but it is internally used as size_t
, ie an unsigned type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was wondering why it didn't fail before when I did this. This makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
LGTM. Will update the vendor once this is merged. |
edbcbea
to
b652955
Compare
fs/copy_test.go
Outdated
@@ -46,6 +46,45 @@ func TestCopyDirectoryWithLocalSymlink(t *testing.T) { | |||
} | |||
} | |||
|
|||
// TestCopyWithLargeFile tests copying a file whose size is > 32bit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe size > 2^32 bytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
b652955
to
7a7385b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM; a nit in a comment (s/perrmission/permission/
)
copy_file_range (or any Linux syscall) will only ever copy 2GB at a time. In this case there is no error returned from the system call. This fix uses copy_file_range in a loop until either 0 bytes are copied, or the desired amount has been copied (st.Size()). Signed-off-by: Brian Goff <[email protected]>
7a7385b
to
afba265
Compare
@@ -46,6 +46,45 @@ func TestCopyDirectoryWithLocalSymlink(t *testing.T) { | |||
} | |||
} | |||
|
|||
// TestCopyWithLargeFile tests copying a file whose size > 2^32 bytes. | |||
func TestCopyWithLargeFile(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll consider this for a follow up, but skipping on testing.Short()
may be a good idea
@@ -46,6 +46,45 @@ func TestCopyDirectoryWithLocalSymlink(t *testing.T) { | |||
} | |||
} | |||
|
|||
// TestCopyWithLargeFile tests copying a file whose size > 2^32 bytes. | |||
func TestCopyWithLargeFile(t *testing.T) { | |||
dataR, dataW := io.Pipe() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a pseudorandom limit reader would make it cleaner
Some comments on test, I will address those in a follow up. Thanks for getting this in. LGTM |
copy_file_range (or any Linux syscall) will only ever copy 2GB at a
time. In this case there is no error returned from the system call.
This fix uses copy_file_range in a loop until either 0 bytes are copied,
or the desired amount has been copied (st.Size()).