Skip to content

Conversation

@StefanoDuo
Copy link
Contributor

@StefanoDuo StefanoDuo commented Aug 3, 2020

This patch set increases the total FUSE code coverage by adding a pseudo-syscalls (and some other stuff) which helps syzkaller mimics a FUSE daemon behavior.

Note: make presubmit fails at the make lint with the following error message pkg/host/syscalls_linux.go:173:1: cyclomatic complexity 27 of func "isSupportedSyzkall" is high (> 24). This seems to be due to the additional branches in the "isSupportedSyzkall", does anyone have any solution for that?
I have run make test, make presubmit_arch and make presubmit_race manually and they pass.


CC: @balsini

@dvyukov
Copy link
Collaborator

dvyukov commented Aug 4, 2020

Note: make presubmit fails at the make lint with the following error message pkg/host/syscalls_linux.go:173:1: cyclomatic complexity 27 of func "isSupportedSyzkall" is high (> 24). This seems to be due to the additional branches in the "isSupportedSyzkall", does anyone have any solution for that?

Hi Stefano,

It needs to be refactored somehow to reduce cyclomatic complexity.
Perhaps a map from syscall name to checking function will help to reduce it once and for all.
An intermediate solution may be to combine some functions into a single case, but I am not sure how exactly the checker counts, maybe it won't help.

@codecov
Copy link

codecov bot commented Aug 6, 2020

Codecov Report

Merging #2001 into master will increase coverage by 0.1%.
The diff coverage is 50.0%.

Impacted Files Coverage Δ
pkg/host/syscalls_linux.go 73.6% <50.0%> (+0.6%) ⬆️
prog/rotation.go 93.9% <0.0%> (-1.5%) ⬇️
pkg/csource/common.go 89.6% <0.0%> (ø)
sys/openbsd/init.go 82.9% <0.0%> (+1.2%) ⬆️
prog/mutation.go 91.3% <0.0%> (+2.7%) ⬆️
prog/target.go 68.2% <0.0%> (+5.8%) ⬆️

@dvyukov
Copy link
Collaborator

dvyukov commented Aug 10, 2020

My local instance running without syz_fuse_mount for 20h more now got to 61% of readdir.c. So it seems that the fuzzer is capable of getting all that coverage, maybe just somewhat slower. Slightly bumping priority of using the directory with a new pseudo-syscall does not look like a good RoI. And this helps only 1 filesystem out of 92, while applicable to all of them. I am thinking about some alternative approaches.

  1. It seem that we could slightly modify existing syz_mount_image to get the same effect: pass empty image for fuse, and then if image is empty, skip the memfd/loop part and just do mkdir+mount and add return of the dir fd. This looks equivalent and will benefit all filesystems with few additional lines of C code.
  2. Preseeding corpus with more complex scenarios seems to be useful. Please add the getdents program to sys/linux/test as well and we will figure out how to inject them into fuzzer corpus. If we start injecting them, then more non-trivial programs there will be useful. I hate working around fuzzer deficiencies with brute-force preseeding, but, well, all is fair in love, war and fuzzing.
  3. Perhaps we can do some generic improvement for O_CREAT that will benefit all operations on all filesystems. But I am not sure what exactly it should be... any ideas?
  4. Perhaps we can do some generic improvement in how the fuzzer works with files/directories. For example, how it chooses existing/new filename:

    syzkaller/prog/rand.go

    Lines 278 to 302 in 7030187

    func (r *randGen) filenameImpl(s *state) string {
    if r.oneOf(100) {
    return specialFiles[r.Intn(len(specialFiles))]
    }
    if len(s.files) == 0 || r.oneOf(10) {
    // Generate a new name.
    dir := "."
    if r.oneOf(2) && len(s.files) != 0 {
    dir = r.randFromMap(s.files)
    if dir != "" && dir[len(dir)-1] == 0 {
    dir = dir[:len(dir)-1]
    }
    if r.oneOf(10) && filepath.Clean(dir)[0] != '.' {
    dir += "/.."
    }
    }
    for i := 0; ; i++ {
    f := fmt.Sprintf("%v/file%v", dir, i)
    if !s.files[f] {
    return f
    }
    }
    }
    return r.randFromMap(s.files)
    }

    or how fd/fd_dir are organized. fd_dir is a bit of hack now. It does not really anyhow biased towards opening directories, and e.g. we may already have an fd that refers to a dir, but then if we need an fd_dir that existing fd won't be used. As far as I remember I introduced fd_dir just to mark some arguments as "in this position we want a directory". Maybe a better split would be fd and fd_file (refers to some file/dir rather than pipe/dev/memfd/epoll/etc). Or maybe we need to change filenameImpl to use fileN and dirN elements so that it knows later what is what. But I did not think much about this, I am not sure if it will work out or not.

@StefanoDuo
Copy link
Contributor Author

My local instance running without syz_fuse_mount for 20h more now got to 61% of readdir.c. So it seems that the fuzzer is capable of getting all that coverage, maybe just somewhat slower. Slightly bumping priority of using the directory with a new pseudo-syscall does not look like a good RoI. And this helps only 1 filesystem out of 92, while applicable to all of them. I am thinking about some alternative approaches.

  1. It seem that we could slightly modify existing syz_mount_image to get the same effect: pass empty image for fuse, and then if image is empty, skip the memfd/loop part and just do mkdir+mount and add return of the dir fd. This looks equivalent and will benefit all filesystems with few additional lines of C code.

I was thinking that a lot more was going on inside syz_mount_image, but I guess that is not the case. I have refactored it and added what I was previously doing inside syz_fuse_mount.

  1. Preseeding corpus with more complex scenarios seems to be useful. Please add the getdents program to sys/linux/test as well and we will figure out how to inject them into fuzzer corpus. If we start injecting them, then more non-trivial programs there will be useful. I hate working around fuzzer deficiencies with brute-force preseeding, but, well, all is fair in love, war and fuzzing.

Agreed.

Copy link
Collaborator

@dvyukov dvyukov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think overall we are close. There is a bunch of nits, but on high level this looks good. With a small change to syz_mount_image we improve fuzzing of all filesystems, which is good.

Currently fuse_in.unique and fuse_out.unique are not linked by any
dependency chain. This causes the majority of the replies to the kernel
to be dropped because not referring to a previously sent request.
By defining them as a resource, we push the fuzzer in the right
direction (i.e., try to respond using a previously issued unique value).
The read syscall has been updated accordingly, it now expects and parses
a fuse_in header and some additional data.
Instead of using a generic int32 for fuse_attr.mode, use the expected
file mode flags.
@dvyukov
Copy link
Collaborator

dvyukov commented Aug 14, 2020

The change looks good to me.
Please rebase to resolve conflicts. We stepped onto each other, sorry. go test ./executor will be a fast way to test for new warnings.

At the moment syzkaller is able to respond to FUSE with a syntactically
correct response using the specific write$FUSE_*() syscalls, but most of
the times these responses are not related to the type of request that
was received.
With this pseudo-syscall we are able to provide the correct response
type while still allowing the fuzzer to fuzz its content. This is done
by requiring each type of response as an input parameter and then
choosing the correct one based on the request opcode.
Notice that the fuzzer is still free to mix write$FUSE_*() and
syz_fuse_handle_req() syscalls, so it is not losing any degree of
freedom.

syz_fuse_handle_req() retrieves the FUSE request and resource
fuse_unique internally (by performing a read() on the /dev/fuse file
descriptor provided as input). For this reason, a new template argument has
been added to fuse_out (renamed to _fuse_out) so that the unique field
can be both an int64 (used by syz_fuse_handle_req()) and a fuse_unique
resource (used by the write$FUSE_*() syscalls) without any code
duplication.
Use the type bytelen instead of len to express the length of the data
structure being pointed by the pointer passed to read$FUSE().
Refactor syz_mount_image() to support filesystems not requiring a
backing device and filesystem image (e.g. FUSE). To do that, we check for
the presence of the pointer to the array of struct fs_image_segment: if
missingi, there is no need to setup the loop device and we can proceed
directly with the mount() syscall.
Add syz_mount_image$fuse() (specialization for FUSE) inside
sys/linux/fs_fuse.txt.
Add syzkaller program which correctly handles a getdents64() syscall
on a FUSE directory. Here the related comment/discussion
google#2001 (comment).
@StefanoDuo
Copy link
Contributor Author

Please rebase to resolve conflicts. We stepped onto each other, sorry. go test ./executor will be a fast way to test for new warnings.

All conflicts should be resolved now. Thanks for suggesting go test ./executor, it definitely helped to speed up the process.

@StefanoDuo StefanoDuo requested a review from dvyukov August 14, 2020 16:13
@dvyukov dvyukov merged commit 7c06bb2 into google:master Aug 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants