Commit 7be3f09

acln0 authored and ianlancetaylor committed
os, internal/poll, internal/syscall/unix: use copy_file_range on Linux
Linux 4.5 introduced (and Linux 5.3 refined) the copy_file_range system call, which gives file systems the opportunity to implement copy acceleration techniques. This commit adds support for copy_file_range(2) to the os package.

Introduce a new ReadFrom method on *os.File, which makes *os.File implement the io.ReaderFrom interface. If dst and src are both files, this enables io.Copy(dst, src) to call dst.ReadFrom(src), which, in turn, will call copy_file_range(2) if possible. If copy_file_range(2) is not supported by the host kernel, or if either of dst or src refers to a non-regular file, ReadFrom falls back to the regular io.Copy code path.

Add internal/poll.CopyFileRange, which acquires locks on the appropriate poll.FDs and performs the actual work, as well as internal/syscall/unix.CopyFileRange, which wraps the copy_file_range system call itself at the lowest level.

Rework file layout in internal/syscall/unix to accommodate the additional system call numbers needed for copy_file_range. Merge these definitions with the ones used by getrandom(2) into sysnum_linux_$GOARCH.go files.

A note on additional optimizations: if dst and src both refer to pipes in the invocation dst.ReadFrom(src), we could, in theory, use the existing splice(2) code in package internal/poll to splice directly from src to dst. Attempting this runs into trouble with the poller, however. If we call splice(src, dst) and see EAGAIN, we cannot know if it came from src not being ready for reading or dst not being ready for writing. The write end of src and the read end of dst are not under our control, so we cannot reliably use the poller to wait for readiness. Therefore, it seems infeasible to use the new ReadFrom method to splice between pipes directly. In conclusion, for now, the only optimization enabled by the new ReadFrom method on *os.File is the copy_file_range optimization.

Fixes #36817.

Change-Id: I696372639fa0cdf704e3f65414f7321fc7d30adb
Reviewed-on: https://go-review.googlesource.com/c/go/+/229101
Run-TryBot: Ian Lance Taylor <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
1 parent c2e0f01 commit 7be3f09

16 files changed

Lines changed: 568 additions & 28 deletions
Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
+// Copyright 2020 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package poll
+
+import (
+	"internal/syscall/unix"
+	"sync/atomic"
+	"syscall"
+)
+
+var copyFileRangeSupported int32 = 1 // accessed atomically
+
+const maxCopyFileRangeRound = 1 << 30
+
+// CopyFileRange copies at most remain bytes of data from src to dst, using
+// the copy_file_range system call. dst and src must refer to regular files.
+func CopyFileRange(dst, src *FD, remain int64) (written int64, handled bool, err error) {
+	if atomic.LoadInt32(&copyFileRangeSupported) == 0 {
+		return 0, false, nil
+	}
+	for remain > 0 {
+		max := remain
+		if max > maxCopyFileRangeRound {
+			max = maxCopyFileRangeRound
+		}
+		n, err := copyFileRange(dst, src, int(max))
+		switch err {
+		case syscall.ENOSYS:
+			// copy_file_range(2) was introduced in Linux 4.5.
+			// Go supports Linux >= 2.6.33, so the system call
+			// may not be present.
+			//
+			// If we see ENOSYS, we have certainly not transferred
+			// any data, so we can tell the caller that we
+			// couldn't handle the transfer and let them fall
+			// back to more generic code.
+			//
+			// Seeing ENOSYS also means that we will not try to
+			// use copy_file_range(2) again.
+			atomic.StoreInt32(&copyFileRangeSupported, 0)
+			return 0, false, nil
+		case syscall.EXDEV, syscall.EINVAL:
+			// Prior to Linux 5.3, it was not possible to
+			// copy_file_range across file systems. Similarly to
+			// the ENOSYS case above, if we see EXDEV, we have
+			// not transferred any data, and we can let the caller
+			// fall back to generic code.
+			//
+			// As for EINVAL, that is what we see if, for example,
+			// dst or src refer to a pipe rather than a regular
+			// file. This is another case where no data has been
+			// transferred, so we consider it unhandled.
+			return 0, false, nil
+		case nil:
+			if n == 0 {
+				// src is at EOF, which means we are done.
+				return written, true, nil
+			}
+			remain -= n
+			written += n
+		default:
+			return written, true, err
+		}
+	}
+	return written, true, nil
+}
+
+// copyFileRange performs one round of copy_file_range(2).
+func copyFileRange(dst, src *FD, max int) (written int64, err error) {
+	// The signature of copy_file_range(2) is:
+	//
+	// ssize_t copy_file_range(int fd_in, loff_t *off_in,
+	//                         int fd_out, loff_t *off_out,
+	//                         size_t len, unsigned int flags);
+	//
+	// Note that in the call to unix.CopyFileRange below, we use nil
+	// values for off_in and off_out. For the system call, this means
+	// "use and update the file offsets". That is why we must acquire
+	// locks for both file descriptors (and why this whole machinery is
+	// in the internal/poll package to begin with).
+	if err := dst.writeLock(); err != nil {
+		return 0, err
+	}
+	defer dst.writeUnlock()
+	if err := src.readLock(); err != nil {
+		return 0, err
+	}
+	defer src.readUnlock()
+	n, err := unix.CopyFileRange(src.Sysfd, nil, dst.Sysfd, nil, max, 0)
+	return int64(n), err
+}
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+// Copyright 2020 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package unix
+
+import (
+	"syscall"
+	"unsafe"
+)
+
+func CopyFileRange(rfd int, roff *int64, wfd int, woff *int64, len int, flags int) (n int, err error) {
+	r1, _, errno := syscall.Syscall6(copyFileRangeTrap,
+		uintptr(rfd),
+		uintptr(unsafe.Pointer(roff)),
+		uintptr(wfd),
+		uintptr(unsafe.Pointer(woff)),
+		uintptr(len),
+		uintptr(flags),
+	)
+	n = int(r1)
+	if errno != 0 {
+		err = errno
+	}
+	return
+}

src/internal/syscall/unix/getrandom_linux.go

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ func GetRandom(p []byte, flags GetRandomFlag) (n int, err error) {
 	if atomic.LoadInt32(&randomUnsupported) != 0 {
 		return 0, syscall.ENOSYS
 	}
-	r1, _, errno := syscall.Syscall(randomTrap,
+	r1, _, errno := syscall.Syscall(getrandomTrap,
 		uintptr(unsafe.Pointer(&p[0])),
 		uintptr(len(p)),
 		uintptr(flags))

src/internal/syscall/unix/getrandom_linux_386.go renamed to src/internal/syscall/unix/sysnum_linux_386.go

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@
44

55
package unix
66

7-
// Linux getrandom system call number.
8-
// See GetRandom in getrandom_linux.go.
9-
const randomTrap uintptr = 355
7+
const (
8+
getrandomTrap uintptr = 355
9+
copyFileRangeTrap uintptr = 377
10+
)

src/internal/syscall/unix/getrandom_linux_amd64.go renamed to src/internal/syscall/unix/sysnum_linux_amd64.go

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@
44

55
package unix
66

7-
// Linux getrandom system call number.
8-
// See GetRandom in getrandom_linux.go.
9-
const randomTrap uintptr = 318
7+
const (
8+
getrandomTrap uintptr = 318
9+
copyFileRangeTrap uintptr = 326
10+
)

src/internal/syscall/unix/getrandom_linux_arm.go renamed to src/internal/syscall/unix/sysnum_linux_arm.go

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@
44

55
package unix
66

7-
// Linux getrandom system call number.
8-
// See GetRandom in getrandom_linux.go.
9-
const randomTrap uintptr = 384
7+
const (
8+
getrandomTrap uintptr = 384
9+
copyFileRangeTrap uintptr = 391
10+
)

src/internal/syscall/unix/getrandom_linux_generic.go renamed to src/internal/syscall/unix/sysnum_linux_generic.go

Lines changed: 7 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -7,10 +7,11 @@
77

88
package unix
99

10-
// Linux getrandom system call number.
11-
// See GetRandom in getrandom_linux.go.
12-
//
1310
// This file is named "generic" because at a certain point Linux started
14-
// standardizing on system call numbers across architectures. So far this means
15-
// only arm64 and riscv64 use the standard numbers.
16-
const randomTrap uintptr = 278
11+
// standardizing on system call numbers across architectures. So far this
12+
// means only arm64 and riscv64 use the standard numbers.
13+
14+
const (
15+
getrandomTrap uintptr = 278
16+
copyFileRangeTrap uintptr = 285
17+
)

src/internal/syscall/unix/getrandom_linux_mips64x.go renamed to src/internal/syscall/unix/sysnum_linux_mips64x.go

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@
66

77
package unix
88

9-
// Linux getrandom system call number.
10-
// See GetRandom in getrandom_linux.go.
11-
const randomTrap uintptr = 5313
9+
const (
10+
getrandomTrap uintptr = 5313
11+
copyFileRangeTrap uintptr = 5320
12+
)

src/internal/syscall/unix/getrandom_linux_mipsx.go renamed to src/internal/syscall/unix/sysnum_linux_mipsx.go

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@
66

77
package unix
88

9-
// Linux getrandom system call number.
10-
// See GetRandom in getrandom_linux.go.
11-
const randomTrap uintptr = 4353
9+
const (
10+
getrandomTrap uintptr = 4353
11+
copyFileRangeTrap uintptr = 4360
12+
)

src/internal/syscall/unix/getrandom_linux_ppc64x.go renamed to src/internal/syscall/unix/sysnum_linux_ppc64x.go

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@
66

77
package unix
88

9-
// Linux getrandom system call number.
10-
// See GetRandom in getrandom_linux.go.
11-
const randomTrap uintptr = 359
9+
const (
10+
getrandomTrap uintptr = 359
11+
copyFileRangeTrap uintptr = 379
12+
)
