FUSE: increase code coverage #2001

StefanoDuo · 2020-08-03T17:22:42Z

This patch set increases the total FUSE code coverage by adding a pseudo-syscalls (and some other stuff) which helps syzkaller mimics a FUSE daemon behavior.

Note: make presubmit fails at the make lint with the following error message pkg/host/syscalls_linux.go:173:1: cyclomatic complexity 27 of func "isSupportedSyzkall" is high (> 24). This seems to be due to the additional branches in the "isSupportedSyzkall", does anyone have any solution for that?
I have run make test, make presubmit_arch and make presubmit_race manually and they pass.

CC: @balsini

executor/common_linux.h

dvyukov · 2020-08-04T05:39:47Z

Note: make presubmit fails at the make lint with the following error message pkg/host/syscalls_linux.go:173:1: cyclomatic complexity 27 of func "isSupportedSyzkall" is high (> 24). This seems to be due to the additional branches in the "isSupportedSyzkall", does anyone have any solution for that?

Hi Stefano,

It needs to be refactored somehow to reduce cyclomatic complexity.
Perhaps a map from syscall name to checking function will help to reduce it once and for all.
An intermediate solution may be to combine some functions into a single case, but I am not sure how exactly the checker counts, maybe it won't help.

executor/common_linux.h

pkg/host/syscalls_linux.go

sys/linux/fs_fuse.txt

sys/linux/test/fuse_deadlock

sys/linux/fs_fuse.txt

codecov · 2020-08-06T12:13:17Z

Codecov Report

Merging #2001 into master will increase coverage by 0.1%.
The diff coverage is 50.0%.

Impacted Files	Coverage Δ
pkg/host/syscalls_linux.go	`73.6% <50.0%> (+0.6%)`	⬆️
prog/rotation.go	`93.9% <0.0%> (-1.5%)`	⬇️
pkg/csource/common.go	`89.6% <0.0%> (ø)`
sys/openbsd/init.go	`82.9% <0.0%> (+1.2%)`	⬆️
prog/mutation.go	`91.3% <0.0%> (+2.7%)`	⬆️
prog/target.go	`68.2% <0.0%> (+5.8%)`	⬆️

dvyukov · 2020-08-10T06:28:16Z

My local instance running without syz_fuse_mount for 20h more now got to 61% of readdir.c. So it seems that the fuzzer is capable of getting all that coverage, maybe just somewhat slower. Slightly bumping priority of using the directory with a new pseudo-syscall does not look like a good RoI. And this helps only 1 filesystem out of 92, while applicable to all of them. I am thinking about some alternative approaches.

It seem that we could slightly modify existing syz_mount_image to get the same effect: pass empty image for fuse, and then if image is empty, skip the memfd/loop part and just do mkdir+mount and add return of the dir fd. This looks equivalent and will benefit all filesystems with few additional lines of C code.
Preseeding corpus with more complex scenarios seems to be useful. Please add the getdents program to sys/linux/test as well and we will figure out how to inject them into fuzzer corpus. If we start injecting them, then more non-trivial programs there will be useful. I hate working around fuzzer deficiencies with brute-force preseeding, but, well, all is fair in love, war and fuzzing.
Perhaps we can do some generic improvement for O_CREAT that will benefit all operations on all filesystems. But I am not sure what exactly it should be... any ideas?

Perhaps we can do some generic improvement in how the fuzzer works with files/directories. For example, how it chooses existing/new filename:

syzkaller/prog/rand.go

Lines 278 to 302 in 7030187

    
           func (r *randGen) filenameImpl(s *state) string { 
        
           	if r.oneOf(100) { 
        
           		return specialFiles[r.Intn(len(specialFiles))] 
        
           	} 
        
           	if len(s.files) == 0 || r.oneOf(10) { 
        
           		// Generate a new name. 
        
           		dir := "." 
        
           		if r.oneOf(2) && len(s.files) != 0 { 
        
           			dir = r.randFromMap(s.files) 
        
           			if dir != "" && dir[len(dir)-1] == 0 { 
        
           				dir = dir[:len(dir)-1] 
        
           			} 
        
           			if r.oneOf(10) && filepath.Clean(dir)[0] != '.' { 
        
           				dir += "/.." 
        
           			} 
        
           		} 
        
           		for i := 0; ; i++ { 
        
           			f := fmt.Sprintf("%v/file%v", dir, i) 
        
           			if !s.files[f] { 
        
           				return f 
        
           			} 
        
           		} 
        
           	} 
        
           	return r.randFromMap(s.files) 
        
           }

or how fd/fd_dir are organized. fd_dir is a bit of hack now. It does not really anyhow biased towards opening directories, and e.g. we may already have an fd that refers to a dir, but then if we need an fd_dir that existing fd won't be used. As far as I remember I introduced fd_dir just to mark some arguments as "in this position we want a directory". Maybe a better split would be fd and fd_file (refers to some file/dir rather than pipe/dev/memfd/epoll/etc). Or maybe we need to change filenameImpl to use fileN and dirN elements so that it knows later what is what. But I did not think much about this, I am not sure if it will work out or not.

StefanoDuo · 2020-08-10T17:15:13Z

My local instance running without syz_fuse_mount for 20h more now got to 61% of readdir.c. So it seems that the fuzzer is capable of getting all that coverage, maybe just somewhat slower. Slightly bumping priority of using the directory with a new pseudo-syscall does not look like a good RoI. And this helps only 1 filesystem out of 92, while applicable to all of them. I am thinking about some alternative approaches.

It seem that we could slightly modify existing syz_mount_image to get the same effect: pass empty image for fuse, and then if image is empty, skip the memfd/loop part and just do mkdir+mount and add return of the dir fd. This looks equivalent and will benefit all filesystems with few additional lines of C code.

I was thinking that a lot more was going on inside syz_mount_image, but I guess that is not the case. I have refactored it and added what I was previously doing inside syz_fuse_mount.

Preseeding corpus with more complex scenarios seems to be useful. Please add the getdents program to sys/linux/test as well and we will figure out how to inject them into fuzzer corpus. If we start injecting them, then more non-trivial programs there will be useful. I hate working around fuzzer deficiencies with brute-force preseeding, but, well, all is fair in love, war and fuzzing.

Agreed.

dvyukov

I think overall we are close. There is a bunch of nits, but on high level this looks good. With a small change to syz_mount_image we improve fuzzing of all filesystems, which is good.

pkg/host/syscalls_linux.go

executor/common_linux.h

sys/linux/fs_fuse.txt

executor/common_linux.h

Currently fuse_in.unique and fuse_out.unique are not linked by any dependency chain. This causes the majority of the replies to the kernel to be dropped because not referring to a previously sent request. By defining them as a resource, we push the fuzzer in the right direction (i.e., try to respond using a previously issued unique value). The read syscall has been updated accordingly, it now expects and parses a fuse_in header and some additional data.

Instead of using a generic int32 for fuse_attr.mode, use the expected file mode flags.

dvyukov · 2020-08-14T12:40:04Z

The change looks good to me.
Please rebase to resolve conflicts. We stepped onto each other, sorry. go test ./executor will be a fast way to test for new warnings.

At the moment syzkaller is able to respond to FUSE with a syntactically correct response using the specific write$FUSE_*() syscalls, but most of the times these responses are not related to the type of request that was received. With this pseudo-syscall we are able to provide the correct response type while still allowing the fuzzer to fuzz its content. This is done by requiring each type of response as an input parameter and then choosing the correct one based on the request opcode. Notice that the fuzzer is still free to mix write$FUSE_*() and syz_fuse_handle_req() syscalls, so it is not losing any degree of freedom. syz_fuse_handle_req() retrieves the FUSE request and resource fuse_unique internally (by performing a read() on the /dev/fuse file descriptor provided as input). For this reason, a new template argument has been added to fuse_out (renamed to _fuse_out) so that the unique field can be both an int64 (used by syz_fuse_handle_req()) and a fuse_unique resource (used by the write$FUSE_*() syscalls) without any code duplication.

Use the type bytelen instead of len to express the length of the data structure being pointed by the pointer passed to read$FUSE().

Refactor syz_mount_image() to support filesystems not requiring a backing device and filesystem image (e.g. FUSE). To do that, we check for the presence of the pointer to the array of struct fs_image_segment: if missingi, there is no need to setup the loop device and we can proceed directly with the mount() syscall. Add syz_mount_image$fuse() (specialization for FUSE) inside sys/linux/fs_fuse.txt.

Add syzkaller program which correctly handles a getdents64() syscall on a FUSE directory. Here the related comment/discussion google#2001 (comment).

StefanoDuo · 2020-08-14T16:13:10Z

Please rebase to resolve conflicts. We stepped onto each other, sorry. go test ./executor will be a fast way to test for new warnings.

All conflicts should be resolved now. Thanks for suggesting go test ./executor, it definitely helped to speed up the process.

xairy reviewed Aug 3, 2020

View reviewed changes

executor/common_linux.h Outdated Show resolved Hide resolved

dvyukov reviewed Aug 4, 2020

View reviewed changes

executor/common_linux.h Outdated Show resolved Hide resolved

dvyukov reviewed Aug 4, 2020

View reviewed changes

executor/common_linux.h Outdated Show resolved Hide resolved

dvyukov reviewed Aug 4, 2020

View reviewed changes

StefanoDuo mentioned this pull request Aug 4, 2020

Refactor isSupportedSyzkall #2005

Merged

StefanoDuo force-pushed the fs_fuse branch from d6306df to 47a365c Compare August 6, 2020 11:56

StefanoDuo force-pushed the fs_fuse branch from 47a365c to b9b1cc6 Compare August 10, 2020 17:06

dvyukov reviewed Aug 13, 2020

View reviewed changes

StefanoDuo force-pushed the fs_fuse branch from b9b1cc6 to 09ae13b Compare August 13, 2020 16:23

StefanoDuo force-pushed the fs_fuse branch from 09ae13b to 5ec83e5 Compare August 13, 2020 16:32

StefanoDuo added 2 commits August 14, 2020 09:29

sys/linux/fs_fuse.txt: use expected flags for fuse_attr.mode

13f1f99

Instead of using a generic int32 for fuse_attr.mode, use the expected file mode flags.

StefanoDuo added 4 commits August 14, 2020 15:47

sys/linux/fs_fuse.txt: use bytelen instead of len

ec04256

Use the type bytelen instead of len to express the length of the data structure being pointed by the pointer passed to read$FUSE().

sys/linux/test: add test performing getdents64() on FUSE dir

4de9995

Add syzkaller program which correctly handles a getdents64() syscall on a FUSE directory. Here the related comment/discussion google#2001 (comment).

StefanoDuo force-pushed the fs_fuse branch from 5ec83e5 to 4de9995 Compare August 14, 2020 16:03

StefanoDuo requested a review from dvyukov August 14, 2020 16:13

dvyukov merged commit 7c06bb2 into google:master Aug 14, 2020

FUSE: increase code coverage #2001

FUSE: increase code coverage #2001

Uh oh!

Conversation

StefanoDuo commented Aug 3, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

dvyukov commented Aug 4, 2020

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov bot commented Aug 6, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

dvyukov commented Aug 10, 2020

Uh oh!

StefanoDuo commented Aug 10, 2020

Uh oh!

dvyukov left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

dvyukov commented Aug 14, 2020

Uh oh!

StefanoDuo commented Aug 14, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

StefanoDuo commented Aug 3, 2020 •

edited

Loading

codecov bot commented Aug 6, 2020 •

edited

Loading