Fuzzing an Open-Source Project with AFL
Steps to Reproduce
Setting Up the Fuzzing Environment
The first step is to set up the fuzzing environment. Because a manual installation can be tedious, an unofficial Docker image (mykter/afl-training) that ships with all the tools preinstalled can be used instead. The image was originally built for a workshop that introduces the American Fuzzy Lop (AFL) fuzzer through practical exercises.
- Install Docker by running the official installation script.
  curl -fsSL https://get.docker.com -o /tmp/get-docker.sh
  sh /tmp/get-docker.sh
- Pull the image and create a container (in privileged mode, with port 2222 exposed and a password set through an environment variable).
  docker run \
      --privileged \
      --publish 2222:2222 \
      --env PASSMETHOD=env \
      --env PASS=thispasscantbefuzzed \
      ghcr.io/mykter/fuzz-training
- Log in to the created container through port 2222, using the fuzzer account and the password set above.
  ssh fuzzer@localhost -p 2222
Finding and Compiling (with Instrumentation) an Open-Source Project
Because every project implements its own build process, this guide focuses on one specific open-source project: HiColor. It is a fairly popular program (judging by its stars on GitHub) for converting PNG images into 15- or 16-bit RGB color images.
- Clone the project into the home directory of the fuzzer user (the default directory after the login above).
  git clone https://github.com/dbohdan/hicolor
  cd ~/hicolor
- Because the source code is available, AFL can instrument the program at compile time and use the collected coverage to guide the fuzzing. To achieve this, the binary must incorporate the instrumentation, so the compilation has to be slightly altered by swapping in AFL's compiler wrappers. Optionally, the Makefile can be edited to add the -g option to the CFLAGS variable, which includes debug symbols in the executable.
  CC=afl-clang CXX=afl-clang++ make
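To make the idea of instrumentation more concrete, the sketch below imitates the edge-coverage bookkeeping described in AFL's technical documentation: every basic block gets a random ID, and a shared map counts transitions between consecutive blocks. The type and function names (edge_tracer, tracer_visit and so on) are hypothetical and chosen only for this illustration; the real code is injected by the compiler wrapper and is not written by hand.

```c
#include <stdint.h>
#include <string.h>

/* Size of the coverage map; classic AFL uses a 64 KiB map. */
#define MAP_SIZE 65536u

/* Illustrative tracer state (hypothetical names, not AFL's own). */
typedef struct {
    uint8_t  map[MAP_SIZE]; /* one hit counter per (previous, current) bucket */
    uint32_t prev;          /* shifted ID of the previously visited block     */
} edge_tracer;

/* Clear the map and forget the previous block. */
void tracer_reset(edge_tracer *t)
{
    memset(t, 0, sizeof *t);
}

/* Would run at the start of every basic block; `cur` stands for the
 * random ID that the instrumenting compiler assigns at build time. */
void tracer_visit(edge_tracer *t, uint32_t cur)
{
    cur %= MAP_SIZE;
    t->map[cur ^ t->prev]++;  /* count the transition prev -> cur         */
    t->prev = cur >> 1;       /* shift so that A -> B differs from B -> A */
}

/* Number of distinct map buckets touched so far. */
uint32_t tracer_edges(const edge_tracer *t)
{
    uint32_t n = 0;
    for (uint32_t i = 0; i < MAP_SIZE; i++)
        n += (t->map[i] != 0);
    return n;
}
```

Visiting block 100 and then block 200 touches two buckets, while visiting them in the opposite order touches a different pair. This is what lets the fuzzer tell the two execution orders apart, and the "map density" field of the AFL status screen reports how much of such a map has been filled.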
Fuzzing
The fuzzer works better when it starts from valid inputs that it can mutate to generate new ones, so a few sample PNG images are provided to AFL.
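As a rough illustration of what "mutating" a valid input means, the sketch below flips a single bit of a buffer in place, which is one of the simplest mutations a fuzzer can apply. The function name flip_bit is hypothetical, and AFL's real mutation engine combines many more strategies (the crash analyzed later in this guide was found by its havoc stage).

```c
#include <stddef.h>
#include <stdint.h>

/* Flip one bit of `buf` in place, counting bits from the most significant
 * bit of the first byte. Returns 0 on success, -1 if the index is out of
 * range. */
int flip_bit(uint8_t *buf, size_t len, size_t bit_index)
{
    if (bit_index >= len * 8)
        return -1;
    buf[bit_index / 8] ^= (uint8_t)(0x80u >> (bit_index % 8));
    return 0;
}
```

Applied to a seed PNG, flipping bit 0 turns the first signature byte 0x89 into 0x09, which produces a file that most parsers reject immediately. Flips that land inside the header fields instead keep the file plausible while changing values such as the image dimensions.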
- Create a directory for storing sample images.
  mkdir --parents ~/sample_images/png
  cd ~/sample_images/png
- Manually download 5 PNG images into the newly created directory.
- Convert each image into its .hic (HiColor format) counterpart, using the binary built inside the cloned repository.
  mkdir ~/sample_images/hic
  find . \
      -name "*.png" \
      -exec ~/hicolor/hicolor encode {} ~/sample_images/hic/{}.hic \;
- Solve the AFL errors (only if required).
  export AFL_I_DONT_CARE_ABOUT_MISSING_CRASHES=1
  echo performance \
      | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
- Start fuzzing the encoding functionality.
  cd ~/hicolor
  afl-fuzz \
      -m none \
      -i ~/sample_images/png \
      -o encode_results \
      ./hicolor encode \
      @@ /dev/null
- Start fuzzing the decoding functionality.
  afl-fuzz \
      -m none \
      -i ~/sample_images/hic \
      -o decode_results \
      ./hicolor decode \
      @@ /dev/null
- Start fuzzing the custom format parsing functionality.
  afl-fuzz \
      -m none \
      -i ~/sample_images/hic \
      -o info_results \
      ./hicolor info \
      @@ /dev/null
Analyzing the Results
After running the last 3 commands, the following outputs are generated:
Encode Functionality Fuzzing
american fuzzy lop ++3.00c (default) [fast] {0}
┌─ process timing ────────────────────────────────────┬─ overall results ────┐
│ run time : 0 days, 0 hrs, 6 min, 14 sec │ cycles done : 0 │
│ last new path : 0 days, 0 hrs, 1 min, 25 sec │ total paths : 1024 │
│ last uniq crash : 0 days, 0 hrs, 4 min, 41 sec │ uniq crashes : 8 │
│ last uniq hang : none seen yet │ uniq hangs : 0 │
├─ cycle progress ───────────────────┬─ map coverage ─┴──────────────────────┤
│ now processing : 1000.8 (97.7%) │ map density : 0.05% / 0.11% │
│ paths timed out : 143 (13.96%) │ count coverage : 1.47 bits/tuple │
├─ stage progress ───────────────────┼─ findings in depth ───────────────────┤
│ now trying : splice 1 │ favored paths : 17 (1.66%) │
│ stage execs : 55/73 (75.34%) │ new edges on : 18 (1.76%) │
│ total execs : 479k │ total crashes : 8 (8 unique) │
│ exec speed : 1327/sec │ total tmouts : 1688 (8 unique) │
├─ fuzzing strategy yields ──────────┴───────────────┬─ path geometry ───────┤
│ bit flips : n/a, n/a, n/a │ levels : 7 │
│ byte flips : n/a, n/a, n/a │ pending : 4.29G │
│ arithmetics : n/a, n/a, n/a │ pend fav : 0 │
│ known ints : n/a, n/a, n/a │ own finds : 24 │
│ dictionary : n/a, n/a, n/a │ imported : 0 │
│havoc/splice : 32/224k, 0/236k │ stability : 100.00% │
│ py/custom : 0/0, 0/0 ├───────────────────────┘
│ trim : 99.54%/11.9k, n/a │ [cpu000: 62%]
└────────────────────────────────────────────────────┘
Decode Functionality Fuzzing
american fuzzy lop ++3.00c (default) [fast] {0}
┌─ process timing ────────────────────────────────────┬─ overall results ────┐
│ run time : 0 days, 0 hrs, 2 min, 43 sec │ cycles done : 0 │
│ last new path : 0 days, 0 hrs, 0 min, 3 sec │ total paths : 1160 │
│ last uniq crash : none seen yet │ uniq crashes : 0 │
│ last uniq hang : none seen yet │ uniq hangs : 0 │
├─ cycle progress ───────────────────┬─ map coverage ─┴──────────────────────┤
│ now processing : 1090.1 (94.0%) │ map density : 0.43% / 0.51% │
│ paths timed out : 1 (0.09%) │ count coverage : 5.56 bits/tuple │
├─ stage progress ───────────────────┼─ findings in depth ───────────────────┤
│ now trying : trim 1024/1024 │ favored paths : 35 (3.02%) │
│ stage execs : 107/137 (78.10%) │ new edges on : 36 (3.10%) │
│ total execs : 26.6k │ total crashes : 0 (0 unique) │
│ exec speed : 679.2/sec │ total tmouts : 63 (1 unique) │
├─ fuzzing strategy yields ──────────┴───────────────┬─ path geometry ───────┤
│ bit flips : n/a, n/a, n/a │ levels : 3 │
│ byte flips : n/a, n/a, n/a │ pending : 722 │
│ arithmetics : n/a, n/a, n/a │ pend fav : 29 │
│ known ints : n/a, n/a, n/a │ own finds : 159 │
│ dictionary : n/a, n/a, n/a │ imported : 0 │
│havoc/splice : 120/7524, 39/3989 │ stability : 100.00% │
│ py/custom : 0/0, 0/0 ├───────────────────────┘
│ trim : 76.39%/5762, n/a │ [cpu000:112%]
└────────────────────────────────────────────────────┘
Custom Format Parsing Functionality Fuzzing
american fuzzy lop ++3.00c (default) [fast] {0}
┌─ process timing ────────────────────────────────────┬─ overall results ────┐
│ run time : 0 days, 0 hrs, 2 min, 24 sec │ cycles done : 1 │
│ last new path : none yet (odd, check syntax!) │ total paths : 1001 │
│ last uniq crash : none seen yet │ uniq crashes : 0 │
│ last uniq hang : none seen yet │ uniq hangs : 0 │
├─ cycle progress ───────────────────┬─ map coverage ─┴──────────────────────┤
│ now processing : 0.1160 (0.0%) │ map density : 0.01% / 0.01% │
│ paths timed out : 0 (0.00%) │ count coverage : 1.00 bits/tuple │
├─ stage progress ───────────────────┼─ findings in depth ───────────────────┤
│ now trying : splice 1 │ favored paths : 1 (0.10%) │
│ stage execs : 16/32 (50.00%) │ new edges on : 1 (0.10%) │
│ total execs : 376k │ total crashes : 0 (0 unique) │
│ exec speed : 2193/sec │ total tmouts : 0 (0 unique) │
├─ fuzzing strategy yields ──────────┴───────────────┬─ path geometry ───────┤
│ bit flips : n/a, n/a, n/a │ levels : 1 │
│ byte flips : n/a, n/a, n/a │ pending : 0 │
│ arithmetics : n/a, n/a, n/a │ pend fav : 0 │
│ known ints : n/a, n/a, n/a │ own finds : 0 │
│ dictionary : n/a, n/a, n/a │ imported : 0 │
│havoc/splice : 0/296k, 0/71.8k │ stability : 100.00% │
│ py/custom : 0/0, 0/0 ├───────────────────────┘
│ trim : 100.00%/28, n/a │ [cpu000: 87%]
└────────────────────────────────────────────────────┘
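From this information, a coarse conclusion can be drawn: within these short runs, only the encoding functionality shows implementation problems, as AFL found 8 unique crashes in it. For the other two functionalities, namely decoding and custom format parsing, the fuzzer did not find any crash. Note that the run against the info command discovered no new paths at all (AFL itself flags this with "odd, check syntax!"), so its result says little about the robustness of that code.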
Analysis of One Crash
The first generated crash was chosen and analyzed further with a dynamic technique, namely debugging. gdb was instructed to start the encoding program automatically, with the file that triggered the first crash as input. Note that the names of the dumped files vary depending on several factors (the order in which the crashes were found, the time, the current operation).
gdb \
    --ex run \
    --args ./hicolor encode encode_results/default/crashes/id:000000,sig:11,src:001013,time:93879,op:havoc,rep:2 encoded.hic
The line on which the program receives a SIGSEGV is the one highlighted below (line 34). It tries (and fails) to dereference the rgb_img pointer, which is NULL.
│ 29 hicolor_rgb* cp_to_rgb(const cp_image_t img) │
│ 30 { │
│ 31 hicolor_rgb* rgb_img = malloc(sizeof(hicolor_rgb) * img.w * img.h); │
│ 32 │
│ 33 for (uint32_t i = 0; i < (uint32_t) img.w * (uint32_t) img.h; i++) { │
│ >34 rgb_img[i].r = img.pix[i].r; │
│ 35 rgb_img[i].g = img.pix[i].g; │
│ 36 rgb_img[i].b = img.pix[i].b; │
│ 37 } │
│ 38 │
│ 39 return rgb_img; │
│ 40 } │
Inspecting the parameters used in the malloc call (with the command p img) reveals huge values. With them, the product on line 31 evaluates to 34582635452664 bytes. This value is not truncated (sizeof yields a size_t, which is 64 bits wide on a 64-bit architecture), so malloc is asked to allocate roughly 31 terabytes on the heap; it fails and returns a NULL pointer.
{
w = 0xf3ff,
h = 0xb000258,
pix = 0x0
}
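The size of the request can be reproduced with a few lines of C. The sketch below assumes that hicolor_rgb holds three one-byte channels, an assumption that is consistent with the reported product, since 3 × 62463 × 184549976 = 34582635452664. The constant and function names are hypothetical and used only for this calculation.

```c
#include <stdint.h>

/* Assumed size of one hicolor_rgb pixel: three 8-bit channels. */
#define RGB_PIXEL_BYTES 3u

/* Bytes requested by malloc(sizeof(hicolor_rgb) * w * h), computed in
 * 64 bits as it would be with a 64-bit size_t. */
uint64_t rgb_request_bytes(uint32_t w, uint32_t h)
{
    return (uint64_t)RGB_PIXEL_BYTES * w * h;
}

/* The same request expressed in whole tebibytes (2^40 bytes). */
uint64_t rgb_request_tib(uint32_t w, uint32_t h)
{
    return rgb_request_bytes(w, h) >> 40;
}
```

With the values printed by gdb, rgb_request_bytes(0xf3ff, 0xb000258) returns 34582635452664 and rgb_request_tib returns 31. The result is far above the 32-bit limit of 4294967295, which is why the product would have been truncated only if it had been stored in a 32-bit type.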
The PNG image that triggered the crash can be analyzed with file, which only parses the header of the file, or with identify, a utility from the ImageMagick suite that inspects the PNG file more deeply.
file encode_results/default/crashes/id:000000,sig:11,src:001013,time:93879,op:havoc,rep:2
encode_results/default/crashes/id:000000,sig:11,src:001013,time:93879,op:havoc,rep:2: PNG image data, 62463 x 184549976, 8-bit/color RGB, non-interlaced
identify ../crashes/id:000007,sig:11,src:001013,time:93879,op:havoc,rep:2
identify-im6.q16: insufficient image data in file `../crashes/id:000007,sig:11,src:001013,time:93879,op:havoc,rep:2' @ error/png.c/ReadPNGImage/4098.
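The dimensions reported by file come from the IHDR chunk at the start of the PNG. As a minimal sketch of such a header-only check, the function below validates the 8-byte PNG signature and reads the big-endian width and height stored at byte offsets 16 and 20. The name png_read_dimensions is hypothetical; this is neither the code of file nor the code of cute_png.h.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian integer from four consecutive bytes. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Extract width and height from the first 24 bytes of a PNG file.
 * Layout: 8-byte signature, 4-byte chunk length, "IHDR", width, height.
 * Returns false if the buffer is too short or is not a PNG header. */
bool png_read_dimensions(const uint8_t *buf, size_t len,
                         uint32_t *w, uint32_t *h)
{
    static const uint8_t sig[8] = {0x89, 'P', 'N', 'G',
                                   0x0d, 0x0a, 0x1a, 0x0a};
    if (len < 24)
        return false;
    if (memcmp(buf, sig, sizeof sig) != 0)
        return false;
    if (memcmp(buf + 12, "IHDR", 4) != 0)
        return false;
    *w = read_be32(buf + 16);
    *h = read_be32(buf + 20);
    return true;
}
```

A header whose width bytes are 00 00 f3 ff and whose height bytes are 0b 00 02 58 yields 62463 x 184549976, the values shown in the file output above. A check of this kind says nothing about whether the pixel data that follows is actually present, which is the part that identify complains about.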
Correlating this information with the source code of the analyzed program leads to the root cause. During its input generation phase, AFL replaced the dimensions (width and height) stored in the PNG header with the large values seen in the output of the file command. The project uses an open-source, header-only C library (namely cute_png.h) to load the PNG files.
Furthermore, neither the library nor hicolor checks whether the image is valid (unlike identify, which detects that the pixels declared in the header are not actually present in the data section), so the image is loaded as it is. When the image needs to be converted to the custom file format, the malloc call fails and the program tries to dereference a NULL pointer, which raises a SIGSEGV.
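One possible way to harden a conversion routine of this kind is sketched below: it rejects missing pixel data and implausible dimensions, and it checks the result of malloc before writing to it. This is only an illustration and not the fix adopted by the project; the rgb_px type, the MAX_PIXELS cap and the copy_rgb_checked function are hypothetical stand-ins introduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for hicolor_rgb (three 8-bit channels assumed). */
typedef struct { uint8_t r, g, b; } rgb_px;

/* Hypothetical upper bound on the number of pixels accepted. */
#define MAX_PIXELS ((size_t)1 << 28)

/* Copy a w x h pixel buffer, returning NULL instead of crashing when the
 * source is missing, the dimensions are invalid, or allocation fails.
 * The caller owns the returned buffer and must free() it. */
rgb_px *copy_rgb_checked(const rgb_px *src, int w, int h)
{
    if (src == NULL || w <= 0 || h <= 0)
        return NULL;
    /* Division-based check: w * h <= MAX_PIXELS without overflowing. */
    if ((size_t)w > MAX_PIXELS / (size_t)h)
        return NULL;

    size_t n = (size_t)w * (size_t)h;
    rgb_px *dst = malloc(n * sizeof *dst);
    if (dst == NULL)
        return NULL;

    memcpy(dst, src, n * sizeof *dst);
    return dst;
}
```

With the dimensions from the crashing input (0xf3ff by 0xb000258), the function returns NULL before any allocation is attempted, so the caller can report an error instead of dereferencing a NULL pointer.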
Last Step
The bug discovered in this guide was reported by creating a new issue on the repository.