@voidptr127
Contributor

This pull request adds a new environment variable that lets users work with virtual file formats. I called the variable AFL_POST_PROCESS_KEEP_ORIGINAL because it describes exactly what it does: it keeps the original file, as it was prior to post-processing, in the input queue, but executes the post-processed file.

Usually post processing will write the post processed file to a testcase and then replace the file in the queue. However, when working on intermediate or virtual file formats, the file must be kept in the queue as it is, while the post processed file must be executed. We require this for our grammar-based fuzzer ATNwalk which relies on using bit sequences that encode parse trees. The bit sequences are kept in the queue, but the actual high-level structured inputs are executed (e.g. SQL files etc.).
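
To make the difference concrete, here is a minimal toy model of the behaviour described above. It is not AFL++ code: `post_process`, `run_one`, and the SQL-like output are hypothetical names and data chosen for illustration only.

```python
# Toy model (not AFL++ code): shows which bytes end up in the queue and which
# bytes reach the target, with and without AFL_POST_PROCESS_KEEP_ORIGINAL.


def post_process(buf: bytes) -> bytes:
    """Hypothetical post-processor: turn a 'virtual' input into a real one."""
    return b"SELECT " + buf.hex().encode() + b";"


def run_one(buf: bytes, keep_original: bool):
    """Return (bytes stored in the queue, bytes executed by the target)."""
    processed = post_process(buf)
    executed = processed  # the target always sees the post-processed input
    queued = buf if keep_original else processed
    return queued, executed


if __name__ == "__main__":
    raw = b"\x01\x02"
    print(run_one(raw, keep_original=False))
    print(run_one(raw, keep_original=True))
```

With `keep_original=True` the queue keeps the raw bit sequence while the target still receives the decoded input, which is the case ATNwalk needs.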

Here is another example of what happens when the environment variable is set: imagine a post-processing function that fixes the checksums of a PNG file. The PNG file with the fixed checksums is executed by the target, but the PNG file with the invalid checksums is kept in the queue.
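
A rough sketch of such a checksum-fixing post-processor could look like the following. It is an illustration under stated assumptions, not part of this pull request: `fix_png_crcs` is a hypothetical helper, and the `post_process(buf)` hook name is assumed to follow AFL++'s Python custom mutator interface.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def fix_png_crcs(buf: bytes) -> bytes:
    """Recompute the CRC32 of every complete chunk; copy anything else as-is."""
    if not buf.startswith(PNG_SIGNATURE):
        return buf
    out = bytearray(PNG_SIGNATURE)
    pos = len(PNG_SIGNATURE)
    # A chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
    while pos + 12 <= len(buf):
        (length,) = struct.unpack(">I", buf[pos:pos + 4])
        end = pos + 8 + length  # end of chunk type + chunk data
        if end + 4 > len(buf):
            break  # truncated chunk: the remainder is copied unchanged below
        body = buf[pos + 4:end]  # the CRC covers chunk type + chunk data
        crc = zlib.crc32(body) & 0xFFFFFFFF
        out += buf[pos:pos + 4] + body + struct.pack(">I", crc)
        pos = end + 4
    out += buf[pos:]
    return bytes(out)


def post_process(buf):
    # Assumed hook: receives the test case and returns the bytes to execute.
    return bytearray(fix_png_crcs(bytes(buf)))
```

With AFL_POST_PROCESS_KEEP_ORIGINAL set, the output of `post_process` would only be executed, while the unfixed input stays in the queue.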

This pull request also includes the respective custom mutator that is required by ATNwalk (see the README in custom_mutators/atnwalk for further details on how to set it up).

I applied code formatting with make code-format and I confirm that I comply with the Developer Certificate of Origin.

Please note that this is my first pull request.

Best regards,
Maik Betka

@vanhauser-thc
Member

looks great! will review it sunday or monday.

@vanhauser-thc
Member

looks good. I tried various custom mutators, with Python and without, and with and without this env var set, to check whether it causes crashes with an ASAN build, and it did not.

is there a paper about this grammar mutator? if so please send a PR that adds a link to the readme.

merging now :)

@vanhauser-thc vanhauser-thc merged commit c5e5a17 into AFLplusplus:dev Apr 22, 2023
@voidptr127
Contributor Author

Thank you for merging the PR so quickly! :)

A paper is currently under review, but I will come back upon acceptance (hopefully). Maybe a pre-print might be useful. Let's see.

We are also working on a follow-up project with ATNwalk. So expect more stuff to come in the future ;)

@vanhauser-thc
Member

looking forward to that :)

@vanhauser-thc
Member

do you have a fuzzbench fuzzer setup for this?

@voidptr127
Contributor Author

Unfortunately, I don't have a fuzzbench setup for this. I am also not sure whether it is easily (read: quickly) doable. I considered it in the past, but then dropped it due to the time constraints I had in the project. There was neither a setup for Nautilus nor for Gramatron that I could've simply adapted. In general, I couldn't find anything nice for grammar-based fuzzers. But maybe I am wrong and just couldn't find anything?

Hence, I've written my own testbed which did the job for me. I am not against having something "more official", but, at least right now, I am not willing to spend the additional effort when the testbed covers my use cases, in particular: benchmarking and developing :)

@vanhauser-thc
Member

There is a nautilus setup based on libafl on fuzzbench. Gramatron custom mutator is super easy to set up. Would only leave atn, I would need to check how complex that is to set up.
but for bugs only xml and php benchmarks exist, so it is a limited test.
I could take care of that this week

@voidptr127
Contributor Author

If you don't mind and want to set it up, go ahead, I'd look forward to that :)

However, I don't have a grammar for XML.

The procedure to set up ATNwalk is pretty straightforward:

In fact, if you want to shortcut it, it should be no more than adding the install scripts from the testbed to the Dockerfile and running them!

I.e., append the following line to the Dockerfile mentioned above and it will install AFL++, the mutator, ATNwalk, and will also include the above grammars:

RUN git clone https://github.com/atnwalk/testbed && bash /home/rocky/testbed/install/atnwalk.bash

Running ATNwalk is described here, but it is basically no more than:

# create the required random seed first
mkdir -p ~/campaign/example/seeds
cd ~/campaign/example/seeds
head -c1 /dev/urandom | ~/atnwalk/build/javascript/bin/decode -wb > seed.decoded 2> seed.encoded

# create the required atnwalk directory and copy the seed
cd ../
mkdir -p atnwalk/in
cp ./seeds/seed.encoded atnwalk/in/seed
cd atnwalk

# assign to a single core when benchmarking it, change the CPU number as required
CPU_ID=0

# start the ATNwalk server
nohup taskset -c ${CPU_ID} ${HOME}/atnwalk/build/javascript/bin/server 100 > server.log 2>&1 &

# start AFL++ with ATNwalk
AFL_SKIP_CPUFREQ=1 \
  AFL_DISABLE_TRIM=1 \
  AFL_CUSTOM_MUTATOR_ONLY=1 \
  AFL_CUSTOM_MUTATOR_LIBRARY=${HOME}/AFLplusplus/custom_mutators/atnwalk/atnwalk.so \
  AFL_POST_PROCESS_KEEP_ORIGINAL=1 \
  ~/AFLplusplus/afl-fuzz -t 100 -i in/ -o out -b ${CPU_ID} -- ~/jerryscript/build/bin/jerry

@vanhauser-thc
Member

@voidptr127 could you email me your submitted paper? Or the results of your testbed runs? vh(at)thc(dot)org

I started to make fuzzers for fuzzbench for nautilus, gramatron and autotoken, but for atnwalk you use rocky instead of ubuntu, which complicates porting it to fuzzbench etc., and there would only be one target for benchmarking - so I think what I did so far is more of a waste of time than helpful.
magma has sqlite, xml, lua and php, so it does not fit much better.
