Fuzzing & Vulnerability Research
AFL++, LibFuzzer, coverage-guided fuzzing, finding 0-days, code auditing for security, and structured vulnerability research.
Tags: fuzzing, AFL, LibFuzzer, vulnerability research, 0-day, code audit, coverage guided fuzzing, sanitizers
Real-World Analogy
Fuzzing is like hiring a monkey to randomly bang on a keyboard — except this monkey is infinitely patient, tracks which keyboard combinations caused the application to crash, and systematically explores variations. The crashes are your vulnerabilities.
What is Fuzzing?
Fuzzing generates large amounts of semi-random input, feeds it to a program, and monitors for crashes (memory corruption) or hangs (infinite loops/DoS).
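To make the generate, feed, and monitor loop concrete, here is a minimal sketch of a blind mutational fuzzer. The target `parse_record` and its planted bug are hypothetical, invented purely for illustration, and a Python exception stands in for a native crash. Real fuzzers run the target in a separate process and use timeouts to detect hangs, which this sketch leaves out.

```python
import random


def parse_record(data: bytes) -> int:
    """Hypothetical target: a length-prefixed record parser with a planted bug."""
    if len(data) < 2:
        raise ValueError("too short")  # clean rejection, not a bug
    length = data[0]
    payload = data[1:]
    # Planted bug: trusts the length byte and may index past the payload.
    return payload[length - 1] if length else 0


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Overwrite one to three random bytes of the seed."""
    buf = bytearray(seed)
    for _ in range(rng.randint(1, 3)):
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(target, seed: bytes, iterations: int, rng_seed: int = 0):
    """Feed mutated inputs to the target and collect the ones that fail unexpectedly."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(iterations):
        data = mutate(seed, rng)
        try:
            target(data)
        except ValueError:
            pass  # input rejected cleanly
        except Exception as exc:  # anything else counts as a "crash"
            crashes.append((data, type(exc).__name__))
    return crashes


crashes = fuzz(parse_record, seed=b"\x03abc", iterations=2000)
print(f"{len(crashes)} crashing inputs found")
```

The length byte sits at offset 0 of a four-byte input, so random overwrites hit it often and the bug falls out quickly. Bugs guarded by several specific bytes are much harder to reach this way, which is the problem coverage feedback addresses.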
Types of fuzzers:
Blackbox → no source code, treat as black box
+ works on proprietary software
- low coverage, many dead branches
Greybox → instrument binary for coverage feedback
AFL, AFL++ — the dominant approach
+ much better coverage than blackbox
+ works on binaries
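Whitebox → analyzes program logic with symbolic execution
KLEE (works on LLVM bitcode built from source), angr and Mayhem (work on binaries)
+ can in principle reach any feasible path
- path explosion makes it very slow on large programs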
Coverage-guided → tracks which code paths each input covers
prioritizes inputs that reach new code
→ AFL++, LibFuzzer, Honggfuzz
AFL++: Coverage-Guided Fuzzing
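Before the AFL++ commands, it may help to see the feedback loop that coverage-guided fuzzers implement. The sketch below is a toy under stated assumptions: the `target` function and its planted bug are hypothetical, line coverage is collected with Python's `sys.settrace` rather than compile-time instrumentation, and the only mutation is a single-byte overwrite. AFL++ follows the same idea with edge coverage and many more mutation strategies.

```python
import random
import sys


def target(data: bytes) -> None:
    """Hypothetical parser: each magic byte guards deeper code."""
    if len(data) > 0 and data[0] == ord("F"):
        if len(data) > 1 and data[1] == ord("U"):
            if len(data) > 2 and data[2] == ord("Z"):
                if len(data) > 3 and data[3] == ord("Z"):
                    raise RuntimeError("planted bug reached")


def run_with_coverage(func, data: bytes):
    """Run func(data); return (set of executed line numbers in func, crashed?)."""
    lines = set()

    def tracer(frame, event, arg):
        if frame.f_code is func.__code__:
            if event == "line":
                lines.add(frame.f_lineno)
            return tracer  # keep tracing this frame
        return None  # ignore every other frame

    sys.settrace(tracer)
    try:
        func(data)
        crashed = False
    except RuntimeError:
        crashed = True
    finally:
        sys.settrace(None)
    return lines, crashed


def fuzz(func, seed: bytes, max_iters: int = 200_000, rng_seed: int = 1):
    """Mutate corpus entries; keep any input that executes a line not seen before."""
    rng = random.Random(rng_seed)
    corpus = [seed]
    seen, _ = run_with_coverage(func, seed)
    for i in range(max_iters):
        buf = bytearray(rng.choice(corpus))
        buf[rng.randrange(len(buf))] = rng.randrange(256)
        data = bytes(buf)
        lines, crashed = run_with_coverage(func, data)
        if crashed:
            return data, i + 1, corpus
        if not lines <= seen:  # this input reached new code
            seen |= lines
            corpus.append(data)
    return None, max_iters, corpus


crash_input, iterations, corpus = fuzz(target, b"AAAA")
print(f"crash input {crash_input!r} after {iterations} executions, corpus size {len(corpus)}")
```

A fuzzer that picked four bytes blindly would have about one chance in 256^4 per attempt of hitting the bug. With feedback, each correct byte is saved to the corpus as soon as it unlocks a new line, so the bytes are discovered one at a time.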
# Install AFL++
sudo apt install afl++
# Or build from source for latest features
git clone https://github.com/AFLplusplus/AFLplusplus
cd AFLplusplus
make distrib
sudo make install
# Step 1: Compile target with AFL++ instrumentation
# Replace your compiler with afl-clang-fast
CC=afl-clang-fast CXX=afl-clang-fast++ ./configure --disable-shared
make -j$(nproc)
# Or for a simple program:
afl-clang-fast -o target target.c
# Step 2: Prepare corpus (seed inputs)
mkdir inputs/
# Put valid input files that the program accepts:
echo "valid input" > inputs/sample1.txt
cp /usr/share/doc/target/*.txt inputs/ # grab real inputs if available
# Step 3: Run AFL++
mkdir outputs/
afl-fuzz -i inputs/ -o outputs/ -- ./target @@
# @@ = placeholder for input file
# Step 4: Monitor progress
# AFL++ shows:
# - Exec speed (want: >1000/sec for good fuzzing)
# - Coverage (map density, unique paths found)
# - Crashes / hangs found
# - Cycle progress
# Step 5: Analyze crashes
ls outputs/default/crashes/
# Each file is an input that caused a crash
# Re-run the target on a saved input to reproduce the crash (IDs start at 000000)
./target outputs/default/crashes/id:000000,*
Sanitizers: Catch More Bugs
Build with sanitizers to catch bugs that don’t always crash:
# AddressSanitizer (ASan) — heap/stack buffer overflows, use-after-free
clang -fsanitize=address -g -o target target.c
ASAN_OPTIONS=detect_leaks=1 ./target
# MemorySanitizer (MSan) — use of uninitialized memory
clang -fsanitize=memory -g -o target target.c
# UndefinedBehaviorSanitizer (UBSan) — integer overflow, null deref, etc.
clang -fsanitize=undefined -g -o target target.c
# Combine with AFL++ (AFL_USE_ASAN=1 makes afl-clang-fast add -fsanitize=address itself)
AFL_USE_ASAN=1 afl-clang-fast -g -o target target.c
afl-fuzz -i inputs/ -o outputs/ -- ./target @@
# AFL++ now saves any input that triggers an ASan report as a crash
LibFuzzer: In-Process Fuzzing
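LibFuzzer runs the fuzzing loop inside the target process and calls the function under test directly, so it avoids forking a new process for each input as AFL++ does by default. It requires source code, because the harness is compiled and linked together with the target.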
// fuzz_target.c: a fuzz target (harness)
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include "json.h" // placeholder: whichever header declares json_parse
// LibFuzzer calls LLVMFuzzerTestOneInput once per generated input.
// No extern "C" here because this file is compiled as C; a C++ harness needs it.
int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    if (Size < 1) return 0;
    // Example: fuzzing a JSON parser that expects a NUL-terminated string
    char *input = malloc(Size + 1);
    if (!input) return 0;
    memcpy(input, Data, Size);
    input[Size] = '\0';
    json_parse(input); // the function under test
    free(input);
    return 0;
}
# Compile with LibFuzzer
clang -fsanitize=fuzzer,address -g -o fuzz_target fuzz_target.c libjson.a
# Run, keeping newly discovered inputs in corpus/
mkdir -p corpus
./fuzz_target corpus/ -max_total_time=3600 # fuzz for 1 hour
# LibFuzzer also tracks coverage and prioritizes interesting inputs
# Crashes are written to the current directory as crash-<hash> (change with -artifact_prefix=)
Fuzzing Protocols and File Formats
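Formats that carry checksums, such as PNG, reject most blindly mutated inputs before the interesting parsing code runs. One way around this is structure-aware mutation: change the payload, then recompute the checksum. The sketch below assumes only the basic PNG chunk layout (4-byte big-endian length, 4-byte type, payload, CRC-32 over type and payload); the function names are illustrative, and the seed is a hand-built 1x1 grayscale image.

```python
import random
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, payload: bytes) -> bytes:
    """Serialize one chunk: length, type, payload, CRC-32 over type + payload."""
    crc = zlib.crc32(chunk_type + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + chunk_type + payload + struct.pack(">I", crc)


def split_chunks(png: bytes):
    """Return [(type, payload), ...]; stored CRCs are ignored."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG")
    chunks, pos = [], len(PNG_SIGNATURE)
    while pos + 12 <= len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        payload = png[pos + 8:pos + 8 + length]
        chunks.append((chunk_type, payload))
        pos += 12 + length
    return chunks


def mutate_png(png: bytes, rng: random.Random) -> bytes:
    """Flip one bit in a random non-empty chunk payload, then re-emit with valid CRCs."""
    chunks = split_chunks(png)
    candidates = [i for i, (_, payload) in enumerate(chunks) if payload]
    idx = rng.choice(candidates)
    chunk_type, payload = chunks[idx]
    buf = bytearray(payload)
    buf[rng.randrange(len(buf))] ^= 1 << rng.randrange(8)
    chunks[idx] = (chunk_type, bytes(buf))
    return PNG_SIGNATURE + b"".join(make_chunk(t, p) for t, p in chunks)


# Seed: 1x1, 8-bit grayscale. IHDR is width, height, bit depth, color type,
# compression, filter, interlace; IDAT holds one filter byte plus one pixel.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
idat = zlib.compress(b"\x00\x00")
seed = (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
        + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))

mutant = mutate_png(seed, random.Random(0))
print(f"seed and mutant differ: {mutant != seed}, both {len(mutant)} bytes")
```

Another common approach is to disable checksum verification in the fuzzing build of the target, which lets a general-purpose mutator work unmodified.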
# Fuzzing a file format parser (e.g., image library)
# Mutate existing valid images and feed to parser
afl-fuzz -i valid-png-samples/ -o output/ -- ./png-parser @@
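# Fuzzing a network protocol
# Use AFLNet for stateful protocol fuzzing
git clone https://github.com/aflnet/aflnet
cd aflnet && make clean all
# Record a real session to build seeds from (loopback traffic needs -i lo)
sudo tcpdump -i lo -w seed.pcap port 1234 &
nc 127.0.0.1 1234 < valid_input.txt
# AFLNet seeds are raw client request sequences, so extract the requests
# from the capture into files under seeds/ before fuzzing
# Fuzz the server (AFLNet builds its own afl-fuzz binary; -P must match the server's protocol)
./afl-fuzz -i seeds/ -o output/ -N tcp://127.0.0.1/1234 \
-P FTP -D 10000 -q 3 -s 3 -- ./server
# Boofuzz: Python network protocol fuzzer
pip install boofuzz
# Define the protocol structure, then boofuzz mutates and sends it
Code Auditing for Security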
Manual source code review to find vulnerabilities:
# grep for dangerous function patterns
# C/C++:
grep -n "gets\|strcpy\|sprintf\|scanf(" *.c
grep -n "strcat\|memcpy\|strncat" *.c # check bounds
grep -n "system(\|popen(\|execve(" *.c # command injection
# Python:
grep -n "eval(\|exec(\|os.system(" *.py
grep -n "shell=True" *.py # subprocess calls that go through a shell
grep -n "pickle.loads\|yaml.load(" *.py # unsafe deserialization
# PHP:
grep -n "eval(\|system(\|exec(\|shell_exec(" *.php
grep -n "\$_GET\|\$_POST\|\$_REQUEST" *.php # user input sources
grep -n "mysql_query\|mysqli_query" *.php # SQL queries
# JavaScript/Node.js:
grep -n "eval(\|Function(" *.js
grep -n "child_process\|exec(" *.js
grep -n "innerHTML\|document.write(" *.js # XSS sinks
grep -n "res.send(req\.\|res.json(req\." *.js # direct input reflection
Data Flow Analysis
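Text search like the grep commands above also matches comments and string literals, and it cannot tell a real call from a lookalike. As a small step from pattern matching toward data flow analysis, the sketch below uses Python's standard `ast` module to report actual call sites of a few dangerous functions in Python code. The list of names and the sample snippet are illustrative assumptions, not a complete rule set.

```python
import ast

DANGEROUS_CALLS = {"eval", "exec", "os.system", "pickle.loads", "yaml.load"}


def dotted_name(node: ast.AST) -> str:
    """Turn a Name/Attribute chain such as os.system into the string 'os.system'."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""  # e.g. a call result or subscript, which this sketch ignores


def find_sinks(source: str, filename: str = "<string>"):
    """Return sorted (line, description) pairs for dangerous calls in the source."""
    findings = []
    for node in ast.walk(ast.parse(source, filename)):
        if not isinstance(node, ast.Call):
            continue
        name = dotted_name(node.func)
        if name in DANGEROUS_CALLS:
            findings.append((node.lineno, name))
        for kw in node.keywords:
            # Any call passing shell=True, e.g. subprocess.run(cmd, shell=True)
            if kw.arg == "shell" and isinstance(kw.value, ast.Constant) and kw.value.value is True:
                findings.append((node.lineno, f"{name}(shell=True)"))
    return sorted(findings)


# The sample is only parsed, never executed.
sample = '''\
import os, subprocess
# eval( in a comment is not a call
os.system("ls " + user_dir)
subprocess.run(cmd, shell=True)
value = eval(expr)
'''

for lineno, what in find_sinks(sample):
    print(f"line {lineno}: {what}")
```

The comment on line 2 of the sample would match a grep for `eval(`, but it is not reported here because the parser never sees it as a call.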
Source → Sink analysis:
Source = where attacker-controlled data enters
HTTP parameters, headers, body
File contents
Environment variables
Database values (if DB is also controlled)
Sink = where dangerous operations happen
SQL queries
System commands
File writes
eval()
Deserialization
Taint = track data from source to sink
Does it get validated/sanitized along the way?
Is validation bypassable?
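The source, sink, and taint ideas above can be sketched as a very small static analysis. This is a toy under loud assumptions: it only handles straight-line, top-level Python assignments, the `SOURCES`, `SANITIZERS`, and `SINKS` sets are illustrative choices, and it trusts every listed sanitizer completely. Whether a sanitizer is actually bypassable remains a manual question.

```python
import ast

SOURCES = {"input", "request.args.get"}          # where attacker data enters
SANITIZERS = {"int", "shlex.quote"}              # calls assumed to neutralize it
SINKS = {"eval", "os.system", "cursor.execute"}  # dangerous operations


def call_name(node: ast.AST) -> str:
    """Dotted name of a call target, e.g. 'os.system'; '' if not a plain name chain."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""


def is_tainted(expr: ast.AST, tainted: set) -> bool:
    """True if the expression may carry attacker-controlled data."""
    if isinstance(expr, ast.Call):
        name = call_name(expr.func)
        if name in SOURCES:
            return True
        if name in SANITIZERS:
            return False
    if isinstance(expr, ast.Name):
        return expr.id in tainted
    return any(is_tainted(child, tainted) for child in ast.iter_child_nodes(expr))


def find_flows(source: str):
    """Walk top-level statements in order; report sink calls that receive tainted data."""
    tainted, findings = set(), []
    for stmt in ast.parse(source).body:
        if isinstance(stmt, ast.Assign):
            hit = is_tainted(stmt.value, tainted)
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if hit:
                        tainted.add(target.id)
                    else:
                        tainted.discard(target.id)
        for node in ast.walk(stmt):
            if isinstance(node, ast.Call) and call_name(node.func) in SINKS:
                if any(is_tainted(arg, tainted) for arg in node.args):
                    findings.append((node.lineno, call_name(node.func)))
    return findings


# The sample is only parsed, never executed.
sample = '''\
name = input()
count = int(input())
safe = shlex.quote(name)
os.system("echo " + safe)
os.system("echo " + name)
cursor.execute(f"SELECT * FROM t LIMIT {count}")
cursor.execute("SELECT * FROM t WHERE n = %s" % name)
'''

for lineno, sink in find_flows(sample):
    print(f"line {lineno}: tainted data reaches {sink}")
```

Lines 4 and 6 of the sample are not reported because the data passed through a listed sanitizer first; lines 5 and 7 are reported because `name` flows from `input()` to a sink unchanged.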
Tools:
Semgrep: pattern-matching SAST (find dangerous patterns)
CodeQL: semantic code analysis (find dataflow paths)
Joern: code property graph analysis
# Semgrep: find SQL injection patterns
# pip install semgrep
# Run built-in security rules
semgrep --config=p/security-audit .
semgrep --config=p/owasp-top-ten .
semgrep --config=p/python .
# Custom rule example
# .semgrep/sqli.yaml (run with: semgrep --config .semgrep/sqli.yaml .)
rules:
  - id: python-string-format-sql
    # pattern-either matches if ANY pattern matches; "patterns" would require all of them
    pattern-either:
      - pattern: |
          $DB.execute(f"... {$INPUT} ...")
      - pattern: |
          $DB.execute("..." % $INPUT)
      - pattern: |
          $DB.execute("..." + $INPUT)
    message: "Potential SQL injection via string formatting"
    severity: ERROR
    languages: [python]
Finding 0-Days: The Process
1. Target selection
- High-value software with wide deployment
- Software that hasn't been fuzzed recently
- Code recently changed (new features = new bugs)
- Parsing code (parsers are historically buggy)
2. Attack surface identification
- What inputs does the software accept?
- Which input parsing code is most complex?
- What assumptions does the code make about input?
3. Fuzzing strategy
- Write/find a good corpus (real-world inputs)
- Write a harness that exercises the right code paths
- Run with sanitizers
- Let it run for days/weeks on a cluster
4. Manual audit
- Focus on complex parsing logic
- Integer arithmetic (overflow/underflow)
- Memory allocation (size calculations)
- Format strings, type confusion, use-after-free patterns
5. Crash triage
- Is this exploitable? (what's the control flow?)
- What's the impact? (what data is accessible?)
- Write a minimal reproducible proof of concept
6. Responsible disclosure
- Report to vendor immediately
- Give 90 days for patch
- Publish after patch is available
Real Project: Fuzzing libpng
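Step 4 of the process above singles out integer arithmetic in size calculations, and image parsers like the target of this walkthrough are a classic place to audit for it. The sketch below simulates what a C expression such as `width * height * bytes_per_pixel` yields in 32-bit unsigned arithmetic. The function names and dimensions are hypothetical and do not describe any specific bug in libpng.

```python
UINT32_MAX = 0xFFFFFFFF


def alloc_size_unchecked(width: int, height: int, bytes_per_pixel: int) -> int:
    """Result of the multiplication when it silently wraps at 32 bits."""
    return (width * height * bytes_per_pixel) & UINT32_MAX


def alloc_size_checked(width: int, height: int, bytes_per_pixel: int) -> int:
    """Same computation, but rejecting sizes that do not fit in 32 bits."""
    size = width * height * bytes_per_pixel
    if size > UINT32_MAX:
        raise OverflowError(f"image needs {size} bytes, which overflows a 32-bit size")
    return size


width, height, bpp = 0x10000, 0x10001, 4  # attacker-chosen header fields
needed = width * height * bpp
allocated = alloc_size_unchecked(width, height, bpp)
print(f"needed {needed} bytes, wrapped allocation is {allocated} bytes")
```

When the wrapped value is passed to an allocator and the decoder then writes the full image, the result is a heap buffer overflow. During an audit, the thing to look for is a multiplication of attacker-controlled fields that feeds an allocation without an overflow check.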
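# Step 1: Get libpng source
apt source libpng1.6
# Step 2: Write fuzz harness (it is C++, so use a .cc extension)
cat > fuzz_png.cc << 'EOF'
#include <png.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
// In-memory input that libpng reads from instead of a file
struct Reader { const uint8_t *data; size_t size; size_t pos; };
static void read_cb(png_structp png, png_bytep out, size_t n) {
    Reader *r = static_cast<Reader *>(png_get_io_ptr(png));
    // Fail the read instead of returning short, uninitialized data
    if (n > r->size - r->pos) png_error(png, "read past end of input");
    memcpy(out, r->data + r->pos, n);
    r->pos += n;
}
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size < 8) return 0;
    png_structp png = png_create_read_struct(PNG_LIBPNG_VER_STRING, NULL, NULL, NULL);
    if (!png) return 0;
    png_infop info = png_create_info_struct(png);
    if (!info) { png_destroy_read_struct(&png, NULL, NULL); return 0; }
    Reader reader = {data, size, 0};
    png_set_read_fn(png, &reader, read_cb);
    // libpng reports errors by longjmp-ing back here; clean up and stop
    if (setjmp(png_jmpbuf(png))) {
        png_destroy_read_struct(&png, &info, NULL);
        return 0;
    }
    png_read_png(png, info, PNG_TRANSFORM_IDENTITY, NULL);
    png_destroy_read_struct(&png, &info, NULL);
    return 0;
}
EOF
# Step 3: Compile
# Note: this links the system libpng, which has no coverage instrumentation.
# For useful feedback, build the libpng source from Step 1 with
# -fsanitize=fuzzer-no-link,address and link that static library instead.
clang++ -fsanitize=fuzzer,address -g fuzz_png.cc -lpng -lz -o fuzz_png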
# Step 4: Get PNG corpus
mkdir corpus/
find /usr/share -name "*.png" -exec cp {} corpus/ \;
# Step 5: Fuzz!
./fuzz_png corpus/ -max_total_time=86400 # run for 24 hours
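Copying every PNG under /usr/share tends to produce a corpus in which many files exercise the same code. Tools such as afl-cmin and LibFuzzer's `-merge=1` mode shrink a corpus while preserving its coverage. The idea can be sketched as a greedy set cover, shown below with hypothetical file names, sizes, and edge IDs. The real tools use their own heuristics, so treat this as an approximation in the same spirit rather than a description of either implementation.

```python
def minimize_corpus(coverage: dict, sizes: dict) -> list:
    """Greedy set cover: repeatedly keep the file that adds the most uncovered
    edges, breaking ties in favor of the smaller file."""
    remaining = set().union(*coverage.values())
    kept = []
    while remaining:
        best = max(
            coverage,
            key=lambda name: (len(coverage[name] & remaining), -sizes[name]),
        )
        kept.append(best)
        remaining -= coverage[best]
    return kept


# Hypothetical measurements: which coverage edges each seed file reaches.
coverage = {
    "icon.png": {1, 2, 3},
    "logo.png": {1, 2, 3, 4},
    "photo.png": {2, 5, 6},
    "tiny.png": {1},
}
sizes = {"icon.png": 900, "logo.png": 4000, "photo.png": 52000, "tiny.png": 70}

kept = minimize_corpus(coverage, sizes)
print("keep:", kept)
```

A smaller corpus with the same coverage means each fuzzing cycle spends less time re-running redundant inputs.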