AFL++ Campaign Structure

h2hack

These are notes we compiled on structuring an AFL++ fuzzing campaign; we figured it would be helpful to write them down. The attachments contain a demo campaign with a toy parser, a persistent harness, and a setup you can play around with.

    
┌────────────────────────────────┐             414141414141414141414141414141
│~!`*^%$#@AA}<>?;:   {}!%$@^*&()_│             414141414141414141414141414141
│*^%┌────────────────────────────┴───┐         414141414141414141414141414141
│<>?│~!`*^%$#@AA}<>?;:)_+{}!%$@^*&()_│         414141414141414141414141414141
│|[]│*^%┌────────────────────────────┴───┐     414141414141414141414141414141
│#@&│<>?│~!`*^%$#@AA}<>?;:)_ {}!%$@^*&()_│     414141414141414141414141414141
│~!`│|[]│*^%┌────────────────────────────┴───┐ 414141414141414141414141414141
│*^%│#@&│<>?│~!`*^%$#@AA}<>?;:'*&{}!%$@^*&()_│ 414141414141414141414141414141
│<>?│~!`│|[]│*^%$#@!~!AA{}|[]<};'.,:;'"^&*^%$│ 414141414141414141414141414141
│|[]│*^%│#@&│<>?;'":|[AA}^&$#@|[]()_+|~%$#@!%│ 414141414141414141414141414141
│#@&│<>?│~!`│|[]<>!@#$AA&*^%$#>?/`'*&^%$#@&*(│ 414141414141414141414141414141
│~!`│|[]│*^%│#@&$%^^&*AA_+|[]{!*&:",./<>?`~! │ 414141414141414141414141414141
│*^%│#@&│<>?│~!`*^%$#@AA}<>?;:@!~{}!%$@^*&()_│ 414141414141414141414141414141
│<>?│~!`│|[]│*^%$#@!~!AA{}|[]<};'.,:;'"^&*^%$│ 414141414141414141414141414141
│|[]│*^%│#@&│<>?;'":|[AA}^&$#@|[]()_+|~%$#@!%│ 414141414141414141414141414141
│{~!│<>?│~!`│|[]<>!@#$AA&*^%$#>?/`'*&^%$#@&*(│ 414141414141414141414141414141
│&*^│|[]│*^%│#@&$%^^&*AA%^|[]{!*&:",./<>?`~! │ 414141414141414141414141414141
│.<>│{~!│<>?│~!`*^%$#@AA()>?;:@!~{}!%$@^*&()_│ 414141414141414141414141414141
│{|[│&*^│|[]│*^%$#@!~!AA()|[]<>?/.,:;'"^&*^%$│ 414141414141414141414141414141
│~!`│.<>│{~!│<>?;'":|[AA&~&$#@!*&()_+|~%$#@!%│ 414141414141414141414141414141
│*^%│{|[│&*^│|[]<>!@#$AA$+^%$#@!~`'*&^%$#@&*(│ 414141414141414141414141414141
│<>?│~!`│.<>│{~!`*^%$#AA%{<>?;:|[]{}!%$@     │ 414141414141414141414141414141
│|[]│*^%│{|[│&*^%$#@!~AA;)}|[]<>?/.,:;'"     │ 414141414141414141414141414141
│*^%│<>?│~!`│.<>?;'":|AA]~^&$#@!*&()_+|~     │ 414141414141414141414141414141
│<>?│|[]│*^%│{|[]<>!@#AA#+*^%$#@!~`'*&^%     │ 414141414141414141414141414141
│|[]│*^%│<>?│~!`*^%$#@AA${>?;:|[]{}!%$@      │ 414141414141414141414141414141
│   │<>?│|[]│*^%$#@!~!AA─^|[]<>?/.,:;'"      │ 414141414141414141414141414141
└───┤|[]│*^%│<>?;'":|[AA ^&$#@!*&()_+|~      │ 414141414141414141414141414141
    │   │<>?│|[]<>!@#$AA─*^%$#@!~`'*&^%      │ 414141414141414141414141414141
    └───┤|[]│*^%$#@!~!AA                     │ 414141414141414141414141414141
        │   │*<>?;'":|[AA                    │ 414141414141414141414141414141
        └───┤||[]<>!@#$AA                    │ 414141414141414141414141414141
            │         AA                     │ 414141414141414141414141414141
            └────────────────────────────────┘ 414141414141414141414141414141

General Thoughts

Compiler choice plays a crucial role in performance. If a compiler wrapper has "fast" in its name, the label is usually accurate. For instance, using afl-gcc-fast instead of afl-gcc doubled our execution speed. Here is our unofficial tier list based on performance, from fastest to slowest:

afl-clang-lto
afl-clang-fast
afl-gcc-fast
afl-clang
afl-gcc
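
As a rough sketch of how the tier list could be applied in a build script, the helper below returns the first compiler wrapper from the list that is installed. The function name pick_afl_cc is our own invention for illustration and is not part of AFL++.

```shell
#!/bin/sh
# pick_afl_cc is a hypothetical helper (not shipped with AFL++).
# It prints the first candidate found in PATH, in the order given.
pick_afl_cc() {
    for cc in "$@"; do
        if command -v "$cc" >/dev/null 2>&1; then
            printf '%s\n' "$cc"
            return 0
        fi
    done
    echo "no AFL++ compiler wrapper found" >&2
    return 1
}

# Candidates in the order of the tier list above.
AFL_CC=$(pick_afl_cc afl-clang-lto afl-clang-fast afl-gcc-fast afl-clang afl-gcc) \
    || AFL_CC=""
echo "selected compiler: ${AFL_CC:-none}"
```

Assuming your build system honors the CC variable, the result could then be passed along as CC="$AFL_CC" make.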

Using sanitizers is essential. They help identify bugs that may not directly cause crashes. However, it is advisable to run only one sanitizer per instance due to the significant memory and speed overhead. Choose the sanitizer based on your specific goals. One particularly interesting option is ASAN, as it "colors" memory regions to detect memory corruption issues. To enable sanitizers, set one of the following environment variables during compilation:

    
AFL_USE_ASAN=1      memory corruptions
AFL_USE_MSAN=1      read access on uninitialized memory
AFL_USE_UBSAN=1     undefined behaviour by the standard
AFL_USE_CFISAN=1    type confusion vulns (cpp)
AFL_USE_TSAN=1      thread race conditions

Always implement a parallel setup. Given the performance limitations imposed by sanitizers, we recommend having a master instance run the harness without any sanitizers activated. This master instance can execute significantly faster than the slave instances (which run with sanitizers), allowing it to generate an in-depth corpus that the other instances can profit from.

For a typical setup on a four-core machine, compile the harness twice: once with AFL_USE_ASAN and once without. Launch the AFL++ instances in separate screen sessions, and monitor their status using afl-whatsup. Limit the memory for each instance with the -m option to match the constraints of your target embedded device or process. Since ASAN typically requires more memory, you may need to omit this option for those specific instances.

    
                                     ┌───────────────┐  
                                     │SLAVE01        │  
                              ┌─────►│               │  
                              │      │./harness-asan │  
                              │      └───────────────┘  
                              │                         
┌────────────────┐            │      ┌───────────────┐  
│MASTER          │   corpus   │      │SLAVE02        │  
│                ├────────────┼─────►│               │  
│./harness -m1024│            │      │./harness-asan │  
└────────────────┘            │      └───────────────┘  
                              │                         
                              │      ┌───────────────┐  
                              │      │SLAVE03        │  
                              └─────►│               │  
                                     │./harness-asan │  
                                     └───────────────┘
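
Compiling the harness twice, as described above, might look roughly like the following. This is a sketch under assumptions: the source file name harness.c, the compiler flags, and the AFL_CC override variable are ours for illustration; only AFL_USE_ASAN and the output names harness and harness-asan come from the setup in this post.

```shell
#!/bin/sh
# Sketch only: the source file passed in and the flags are assumptions,
# adjust them to your tree. AFL_CC can be overridden, e.g. AFL_CC=afl-clang-lto.
AFL_CC="${AFL_CC:-afl-clang-fast}"

# Plain build for the fast master instance (no sanitizer).
build_plain() {
    "$AFL_CC" -O2 -g -o harness "$1"
}

# ASAN build for the slave instances. AFL_USE_ASAN is read by the
# AFL++ compiler wrapper at compile time, not by afl-fuzz.
build_asan() {
    AFL_USE_ASAN=1 "$AFL_CC" -O2 -g -o harness-asan "$1"
}
```

Calling build_plain harness.c followed by build_asan harness.c would then leave both binaries next to each other, ready to be referenced from the screen sessions.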

Here is a sample screenrc configuration for this setup. You can start it with screen -c screenrc.

screen -t master 0 afl-fuzz -i in/ -o out/ -m 4096 -M master  -- ./harness/harness
screen -t slave1 1 afl-fuzz -i in/ -o out/ -S slave1  -- ./harness/harness-asan
screen -t slave2 2 afl-fuzz -i in/ -o out/ -S slave2  -- ./harness/harness-asan
screen -t slave3 3 afl-fuzz -i in/ -o out/ -S slave3  -- ./harness/harness-asan
screen -t top 4 top
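
If the number of cores differs from machine to machine, the screenrc above could also be generated. The helper below is a hypothetical sketch (gen_screenrc is not an AFL++ tool); it mirrors the sample layout with one sanitizer-free master and the remaining instances running the ASAN build, and reuses the paths from the sample as-is.

```shell
#!/bin/sh
# gen_screenrc is a hypothetical helper: it prints a screenrc with one
# sanitizer-free master and (N - 1) ASAN slaves, mirroring the sample above.
# The paths (in/, out/, ./harness/...) are copied from the sample and may
# differ in your setup.
gen_screenrc() {
    instances="$1"
    echo "screen -t master 0 afl-fuzz -i in/ -o out/ -m 4096 -M master -- ./harness/harness"
    i=1
    while [ "$i" -lt "$instances" ]; do
        echo "screen -t slave$i $i afl-fuzz -i in/ -o out/ -S slave$i -- ./harness/harness-asan"
        i=$((i + 1))
    done
    # Last window shows system load, as in the sample.
    echo "screen -t top $instances top"
}

gen_screenrc 4
```

On a system with GNU coreutils, something like gen_screenrc "$(nproc)" > screenrc would size the campaign to the machine.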

Corpus Recycling

A fuzzer's effectiveness relies heavily on the quality of the harness you develop. Creating an efficient harness is an iterative process, and you shouldn't expect to achieve perfect orchestration from the get-go. Begin with a file-based fuzzer and gradually enhance the harness, reusing the corpus you have already generated along the way.

For reuse, it is essential to minimize your queue directory using afl-cmin. This tool eliminates test cases that do not contribute to new paths or coverage. Once you have done this, you can further reduce the size of these test cases using afl-tmin. Depending on the inputs this can take some time. If you are in a hurry, just skip tmin.

If you encounter a crash with a large or unwieldy test case, you can also use tmin to minimize it. Given that a corpus can include thousands of inputs, we recommend writing a small wrapper script around these tools. Below is an example of our implementation:

#!/bin/bash
# Usage: $0 <queue-dir> -- <target> [target args...]

dir="$1"

shift

# Check for the -- separator
if [ "$1" != "--" ]; then
    echo "Error: Missing '--' separator" >&2
    exit 1
fi

shift

# Keep the target invocation as an array so arguments with spaces survive
invocation=("$@")

OUTPUT_CMIN="$(mktemp -d /tmp/cmin.XXXXX)"
OUTPUT_TMIN="$(basename "$dir").tmin"

mkdir -p "$OUTPUT_TMIN"

# Drop test cases that add no new coverage
afl-cmin -i "$dir" -o "$OUTPUT_CMIN" -- "${invocation[@]}"

# Shrink each remaining test case
for file in "$OUTPUT_CMIN"/*; do
    if [ -f "$file" ]; then
        filename=$(basename "$file")
        afl-tmin -i "$file" -o "$OUTPUT_TMIN/${filename}.tmin" -- "${invocation[@]}"
    fi
done

Coverage Analysis

Suppose your fuzzing campaign has been running for about a day, and you notice that no new paths have been discovered for over 12 hours. At this point, it is advisable to pause the campaign and recompile your binary with coverage support. This allows you to run the binary using the generated corpus as input and visualize any missing paths. Afterwards you can craft new inputs and add them to the queue directory of your master.
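
To spot such a stall without watching the UI, one option is to read the timestamp of the last new find from the instance's fuzzer_stats file. The helper below is a minimal sketch: hours_since_last_find is a name we made up, and it assumes the stats file carries a "last_find" line with a Unix timestamp (older AFL versions call the field "last_path"), so check the field name against your version.

```shell
#!/bin/sh
# hours_since_last_find is a hypothetical helper. It assumes fuzzer_stats
# contains a "last_find : <unix time>" line (or "last_path" in older AFL
# versions); verify the field name against your AFL++ version.
# Arguments: path to fuzzer_stats, optional "now" timestamp (for testing).
hours_since_last_find() {
    stats="$1"
    now="${2:-$(date +%s)}"
    last=$(awk -F: '/^last_(find|path)[ \t]*:/ { gsub(/[ \t]/, "", $2); print $2; exit }' "$stats")
    # A missing field or a zero timestamp means no find has been recorded.
    if [ -z "$last" ] || [ "$last" -eq 0 ]; then
        echo "unknown"
        return 1
    fi
    echo $(( (now - last) / 3600 ))
}
```

With the layout used in this post, hours_since_last_find out/master/fuzzer_stats would print the number of full hours since the master last found a new path.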

Before generating coverage data, ensure you compile your target with the flags -fprofile-arcs -ftest-coverage. These options instrument the binary so that every run records which code was executed. For further setup details, refer to the attached archive. Once the coverage data is collected, you can export it to HTML for analysis.

#!/bin/sh

dir="$1"

# generate cov data for all files in dir
for file in "$dir"/*; do
    if [ -f "$file" ]; then
        ./toyparser/toyparser "$file"
    fi
done

mkdir -p cov

# parse coverage (note: the rest of the script runs inside cov/)
cd cov && gcov ../toyparser/parser.c

# create info
geninfo ../toyparser/ -b ../toyparser/ -o cov.info

# export to html; we are already inside cov/, so this ends up in cov/html
genhtml cov.info -o html

All referenced files and information can be found here. Feel free to use it for any purpose.

    
(\(\                     char buf[32];
(-.-)                    memcpy(buf,input,strlen(input));
^_(")")                  return 0;

240824