
Fuzzing & Vulnerability Research

AFL++, LibFuzzer, coverage-guided fuzzing, finding 0-days, code auditing for security, and structured vulnerability research.

Tags: fuzzing, AFL, LibFuzzer, vulnerability research, 0-day, code audit, coverage-guided fuzzing, sanitizers

Real-World Analogy

Fuzzing is like hiring a monkey to bang randomly on a keyboard, except this monkey is infinitely patient, remembers which key combinations crashed the application, and systematically explores variations of them. Each crash is a candidate vulnerability that still has to be triaged.

What is Fuzzing?

Fuzzing generates large amounts of semi-random input, feeds it to a program, and monitors for crashes (often a sign of memory corruption) or hangs (infinite loops, which are a denial-of-service risk).

Types of fuzzers:
  Blackbox  → no source code or instrumentation; only observes outputs and crashes
              + works on proprietary software
              - low coverage; most branches are never reached

  Greybox   → lightweight instrumentation provides coverage feedback
              AFL, AFL++ (the dominant approach)
              + much better coverage than blackbox
              + can also work on binaries (e.g., AFL++ QEMU mode)

  Whitebox  → full program analysis (source or binary), symbolic execution
              KLEE, angr, Mayhem
              + can in theory reach every feasible path
              - very slow; path explosion limits scaling

  Coverage-guided → the feedback loop that greybox fuzzers use:
                    tracks which code paths each input covers
                    prioritizes inputs that reach new code
                    → AFL++, LibFuzzer, Honggfuzz
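
The coverage-guided loop described above can be sketched in a few lines of Python. This is a toy illustration, not how AFL++ or LibFuzzer are implemented: the `target` function, its branch IDs, and the magic bytes `FUZZ` are all invented for the example, and real fuzzers get coverage from compiler instrumentation rather than from a returned set.

```python
import random

def target(data: bytes) -> set:
    """Toy target: returns the set of branch IDs the input reached.
    Raises RuntimeError (the 'crash') when the input starts with b'FUZZ'."""
    cov = {0}
    if len(data) > 0 and data[0] == ord("F"):
        cov.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            cov.add(2)
            if len(data) > 2 and data[2] == ord("Z"):
                cov.add(3)
                if len(data) > 3 and data[3] == ord("Z"):
                    raise RuntimeError("crash")
    return cov

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte, or append one random byte."""
    buf = bytearray(data)
    if buf and rng.random() < 0.7:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.append(rng.randrange(256))
    return bytes(buf)

def fuzz(seeds, max_iters=200_000, seed=1234):
    """Return (crashing_input, executions), or (None, max_iters) if none found."""
    rng = random.Random(seed)
    corpus = list(seeds)
    seen = set()
    for s in corpus:
        seen |= target(s)
    for i in range(max_iters):
        candidate = mutate(rng.choice(corpus), rng)
        try:
            cov = target(candidate)
        except RuntimeError:
            return candidate, i + 1
        if not cov <= seen:          # reached a branch no earlier input reached
            seen |= cov
            corpus.append(candidate)  # keep it as a base for future mutations
    return None, max_iters

crash_input, execs = fuzz([b"AAAA"])
print(crash_input, execs)
```

The key line is `corpus.append(candidate)`: an input that reaches a new branch is kept even though it did not crash, so the four-byte check is solved one byte at a time instead of requiring a single lucky guess out of 2^32.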

AFL++ — Coverage-Guided Fuzzing

# Install AFL++
sudo apt install afl++

# Or build from source for latest features
git clone https://github.com/AFLplusplus/AFLplusplus
cd AFLplusplus
make distrib
sudo make install

# Step 1: Compile target with AFL++ instrumentation
# Replace your compiler with afl-clang-fast
CC=afl-clang-fast CXX=afl-clang-fast++ ./configure --disable-shared
make -j$(nproc)

# Or for a simple program:
afl-clang-fast -o target target.c

# Step 2: Prepare corpus (seed inputs)
mkdir inputs/
# Put valid input files that the program accepts:
echo "valid input" > inputs/sample1.txt
cp /usr/share/doc/target/*.txt inputs/   # grab real inputs if available

# Step 3: Run AFL++
mkdir outputs/
afl-fuzz -i inputs/ -o outputs/ -- ./target @@
# @@ = placeholder for input file

# Step 4: Monitor progress
# AFL++ shows:
# - Exec speed (want: >1000/sec for good fuzzing)
# - Coverage (map density, unique paths found)
# - Crashes / hangs found
# - Cycle progress

# Step 5: Analyze crashes
ls outputs/default/crashes/
# Each file is an input that caused a crash
./target outputs/default/crashes/id:000001,*
# Run manually to reproduce
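
Replaying every crash file by hand gets tedious once there are dozens of them. The following is a minimal sketch of a replay helper, assuming a target that takes the input file path as its last argument (as in the `@@` example above); the function names `classify` and `replay` are made up for this example.

```python
import signal
import subprocess
from pathlib import Path

def classify(returncode: int) -> str:
    """Map a subprocess return code to a short label."""
    if returncode < 0:
        # On POSIX, a negative code means the child died from signal -returncode.
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            return f"signal {-returncode}"
    return "clean exit" if returncode == 0 else f"exit {returncode}"

def replay(target_cmd, crash_dir, timeout=5):
    """Run the target once per crash file and record how each run ended."""
    results = {}
    for path in sorted(Path(crash_dir).glob("id:*")):
        try:
            proc = subprocess.run(
                [*target_cmd, str(path)],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            results[path.name] = classify(proc.returncode)
        except subprocess.TimeoutExpired:
            results[path.name] = "timeout"
    return results

# Example (assumes ./target and the AFL++ output directory exist):
# for name, outcome in replay(["./target"], "outputs/default/crashes").items():
#     print(outcome, name)
```

A crash file that ends in "clean exit" on replay usually points to a non-deterministic bug or to a difference between the fuzzing build and the binary being replayed.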

Sanitizers — Catch More Bugs

Build with sanitizers to catch bugs that don’t always crash:

# AddressSanitizer (ASan) — heap/stack buffer overflows, use-after-free
clang -fsanitize=address -g -o target target.c
ASAN_OPTIONS=detect_leaks=1 ./target

# MemorySanitizer (MSan) — use of uninitialized memory
clang -fsanitize=memory -g -o target target.c

# UndefinedBehaviorSanitizer (UBSan) — integer overflow, null deref, etc.
clang -fsanitize=undefined -g -o target target.c

# Combine with AFL++
# AFL_USE_ASAN=1 makes the AFL++ compiler wrapper add the ASan flags itself
AFL_USE_ASAN=1 afl-clang-fast -g -o target target.c
afl-fuzz -i inputs/ -o outputs/ -- ./target @@
# AFL++ now records ASan-detected memory errors as crashes
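
ASan reports are also useful for grouping crashes: many crash files often come from the same bug. Below is a sketch that reduces a report to a signature (bug type plus the top frames of the first stack trace). The sample report is hand-written for illustration in the usual ASan layout, not captured from a real run, and the regexes are a simplification that assumes function names without spaces.

```python
import re

# Illustrative report text (addresses, paths, and function names are made up).
SAMPLE_REPORT = """\
==4242==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000014 at pc 0x0000004f3b2a bp 0x7ffc0000 sp 0x7ffc0008
READ of size 1 at 0x602000000014 thread T0
    #0 0x4f3b29 in parse_header /src/target.c:42:13
    #1 0x4f3c10 in main /src/target.c:88:5

0x602000000014 is located 0 bytes to the right of 4-byte region
allocated by thread T0 here:
    #0 0x4a1f3d in malloc (/src/target+0x4a1f3d)
    #1 0x4f3a80 in parse_header /src/target.c:30:17
"""

FRAME_RE = re.compile(r"\s*#\d+ 0x[0-9a-fA-F]+ in (\S+)")

def asan_signature(report: str, frames: int = 3):
    """Return (bug_type, top_frames) for the first stack trace, or None."""
    kind = re.search(r"ERROR: AddressSanitizer: (\S+)", report)
    if kind is None:
        return None
    stack = []
    for line in report.splitlines():
        m = FRAME_RE.match(line)
        if m:
            stack.append(m.group(1))
        elif stack:
            break  # first non-frame line after the crash stack ends it
    return kind.group(1), tuple(stack[:frames])

print(asan_signature(SAMPLE_REPORT))
```

Two crash inputs with the same signature are probably the same bug, so only one of them needs a full manual analysis.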

LibFuzzer — In-Process Fuzzing

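LibFuzzer runs the fuzzing loop inside the target process and calls the function under test directly. That avoids per-input process startup, so it is usually faster than file-based AFL++ runs, but it requires source code and a harness.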

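// Write a fuzz target (harness), saved as fuzz_target.c and compiled as C
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>   // malloc, free
#include <string.h>   // memcpy

// Placeholder prototype for the function under test; in a real project,
// include the library's header instead (this signature is an assumption).
void json_parse(const char *text);

// LLVMFuzzerTestOneInput is called by LibFuzzer with each generated input.
// In a C++ harness, declare it extern "C".
int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    if (Size < 1) return 0;

    // Example: fuzzing a JSON parser that expects a NUL-terminated string
    char *input = malloc(Size + 1);
    if (!input) return 0;
    memcpy(input, Data, Size);
    input[Size] = '\0';

    json_parse(input);  // the function under test

    free(input);
    return 0;
}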

# Compile with LibFuzzer
clang -fsanitize=fuzzer,address -g -o fuzz_target fuzz_target.c libjson.a

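# Run (pass a corpus directory so interesting inputs are saved between runs)
mkdir -p corpus/
./fuzz_target corpus/ -max_total_time=3600   # fuzz for 1 hour

# LibFuzzer also tracks coverage and prioritizes interesting inputs
# Crash inputs are written to the current directory as crash-<hash>
# (use -artifact_prefix=DIR/ to write them elsewhere)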

Fuzzing Protocols and File Formats

# Fuzzing a file format parser (e.g., image library)
# Mutate existing valid images and feed to parser
afl-fuzz -i valid-png-samples/ -o output/ -- ./png-parser @@

# Fuzzing a network protocol
# Use AFLnet for stateful protocol fuzzing
git clone https://github.com/aflnet/aflnet
cd aflnet && make

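# Record a real network session to use as a seed
sudo tcpdump -i lo -w seed.pcap port 1234 &
nc 127.0.0.1 1234 < valid_input.txt
# AFLNet seeds are raw client request sequences, so extract the
# client-to-server bytes from the capture into files under seeds/

# Fuzz the server (AFLNet builds its own afl-fuzz binary in the repo directory)
./afl-fuzz -i seeds/ -o output/ -N tcp://127.0.0.1/1234 \
  -P FTP -D 10000 -q 3 -s 3 -- ./server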

# Boofuzz — Python network protocol fuzzer
pip install boofuzz
# Define protocol structure, then boofuzz mutates and sends
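
The "define the structure, then mutate" idea can be shown without any dependencies. The sketch below is not boofuzz's API: `Field`, `FUZZ_VALUES`, `render`, and `generate_cases` are names invented here, and the FTP-style `USER` command is only an example message. It mutates one field at a time while keeping the rest of the message valid, which is the general approach of block-based protocol fuzzers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Field:
    default: bytes          # value used when this field is not being fuzzed
    fuzzable: bool = True   # static fields (delimiters, terminators) stay fixed

# A tiny, hand-picked set of classic bad values (real fuzzers ship many more).
FUZZ_VALUES = [b"", b"A" * 5000, b"%n%n%n%n", b"\x00", b"../../../../etc/passwd"]

def render(fields, override_index=None, override_value=b""):
    """Serialize the message, optionally replacing one field's value."""
    return b"".join(
        override_value if i == override_index else f.default
        for i, f in enumerate(fields)
    )

def generate_cases(fields):
    """Yield one message per (fuzzable field, fuzz value) pair."""
    for i, f in enumerate(fields):
        if not f.fuzzable:
            continue
        for value in FUZZ_VALUES:
            yield render(fields, i, value)

# Example message layout: "USER anonymous\r\n"
USER_CMD = [
    Field(b"USER"),
    Field(b" ", fuzzable=False),
    Field(b"anonymous"),
    Field(b"\r\n", fuzzable=False),
]

cases = list(generate_cases(USER_CMD))
print(len(cases), cases[0], cases[5])
```

Sending each case over a socket and checking whether the server still responds afterwards is the part a framework such as boofuzz handles, along with session state and crash logging.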

Code Auditing for Security

Manual source code review to find vulnerabilities:

# grep for dangerous function patterns
# C/C++:
grep -n "gets\|strcpy\|sprintf\|scanf(" *.c
grep -n "strcat\|memcpy\|strncat" *.c    # check bounds
grep -n "system(\|popen(\|execve(" *.c   # command injection

# Python:
grep -n "eval(\|exec(\|os\.system(" *.py
grep -n "shell=True" *.py                  # subprocess calls that go through a shell
grep -n "pickle\.loads\|yaml\.load(" *.py  # unsafe deserialization

# PHP:
grep -n "eval(\|system(\|exec(\|shell_exec(" *.php
grep -n "\$_GET\|\$_POST\|\$_REQUEST" *.php  # user input sources
grep -n "mysql_query\|mysqli_query" *.php    # SQL queries

# JavaScript/Node.js:
grep -n "eval(\|Function(" *.js
grep -n "child_process\|exec(" *.js
grep -n "innerHTML\|document.write(" *.js  # XSS sinks
grep -n "res.send(req\.\|res.json(req\." *.js  # direct input reflection
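
grep matches text, so it also reports hits inside comments and strings. For Python targets, a short script built on the standard `ast` module looks only at real call expressions. This is a minimal sketch: the `DANGEROUS_CALLS` set mirrors the Python grep patterns above, and the `SAMPLE` source is invented for the demonstration.

```python
import ast

DANGEROUS_CALLS = {"eval", "exec", "os.system", "pickle.loads", "yaml.load"}

def dotted_name(node):
    """Turn a Name/Attribute chain such as os.system into 'os.system'."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None  # e.g. a call on the result of another call

def find_dangerous_calls(source: str):
    """Return sorted (line, description) pairs for risky calls in the source."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        name = dotted_name(node.func)
        if name in DANGEROUS_CALLS:
            findings.append((node.lineno, name))
        for kw in node.keywords:
            if (kw.arg == "shell" and isinstance(kw.value, ast.Constant)
                    and kw.value.value is True):
                findings.append((node.lineno, f"{name or '<call>'}(shell=True)"))
    return sorted(findings)

# Invented sample; it is only parsed, never executed.
SAMPLE = '''\
import os, subprocess
# eval( inside a comment is not a call
os.system("ls " + user_dir)
subprocess.run(cmd, shell=True)
result = eval(expr)
'''

for line, what in find_dangerous_calls(SAMPLE):
    print(line, what)
```

Like the grep patterns, this only finds candidate sinks. Whether attacker-controlled data can actually reach them is a data flow question.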

Data Flow Analysis

Source → Sink analysis:
  Source = where attacker-controlled data enters
    HTTP parameters, headers, body
    File contents
    Environment variables
    Database values (if DB is also controlled)

  Sink = where dangerous operations happen
    SQL queries
    System commands
    File writes
    eval()
    Deserialization

  Taint = track data from source to sink
    Does it get validated/sanitized along the way?
    Is validation bypassable?

Tools:
  Semgrep — pattern-matching SAST (find dangerous patterns)
  CodeQL — semantic code analysis (find dataflow paths)
  Joern — code property graph analysis
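
Source-to-sink analysis amounts to a reachability search over data-flow edges. The sketch below models that idea on a hand-written graph; the node names (`http_param`, `sql_string`, and so on) are hypothetical, and tools such as CodeQL or Joern build a far richer graph from the code itself.

```python
from collections import deque

def tainted_paths(flows, sources, sinks, sanitizers):
    """Breadth-first search for source-to-sink paths that avoid sanitizers."""
    paths = []
    for src in sources:
        queue = deque([[src]])
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                paths.append(path)
                continue
            for nxt in flows.get(node, []):
                if nxt in sanitizers or nxt in path:
                    continue  # data is cleaned here, or we would loop
                queue.append(path + [nxt])
    return paths

# Hypothetical data-flow edges: "value in A flows into B".
FLOWS = {
    "http_param": ["user_id"],
    "user_id": ["validated_id", "sql_string"],
    "validated_id": ["db_lookup"],   # passed through int() validation first
    "sql_string": ["db_execute"],    # concatenated straight into SQL text
}

findings = tainted_paths(
    FLOWS,
    sources=["http_param"],
    sinks={"db_execute", "db_lookup"},
    sanitizers={"validated_id"},
)
print(findings)
```

Only the path through `sql_string` is reported, because the other route passes through a sanitizer. In a real audit, the follow-up question from the list above still applies: is that validation bypassable?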

# Semgrep — find SQL injection patterns
# pip install semgrep

# Run built-in security rules
semgrep --config=p/security-audit .
semgrep --config=p/owasp-top-ten .
semgrep --config=p/python .

# Custom rule example
# .semgrep/sqli.yaml:
rules:
  - id: python-string-format-sql
    pattern-either:
      - pattern: |
          $DB.execute(f"... {$INPUT} ...")
      - pattern: |
          $DB.execute("..." % $INPUT)
      - pattern: |
          $DB.execute("..." + $INPUT)
    message: "Potential SQL injection via string formatting"
    severity: ERROR
    languages: [python]
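
To see what such a rule is protecting against, here is code of the shape the rule is meant to flag next to the parameterized alternative, using the standard `sqlite3` module. The table, function names, and payload are made up for this example.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn

def find_user_unsafe(conn, name):
    # String formatting: user input becomes part of the SQL text.
    return conn.execute(
        f"SELECT name FROM users WHERE name = '{name}'"
    ).fetchall()

def find_user_safe(conn, name):
    # Parameterized query: the driver passes `name` as data, never as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()

conn = make_db()
payload = "' OR '1'='1"

print(find_user_unsafe(conn, payload))  # the WHERE clause is always true
print(find_user_safe(conn, payload))    # no user has this literal name
```

With the payload, the unsafe version returns every row in the table, while the parameterized version returns nothing.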

Finding 0-Days: The Process

1. Target selection
   - High-value software with wide deployment
   - Software that hasn't been fuzzed recently
   - Code recently changed (new features = new bugs)
   - Parsing code (parsers are historically buggy)

2. Attack surface identification
   - What inputs does the software accept?
   - Which input parsing code is most complex?
   - What assumptions does the code make about input?

3. Fuzzing strategy
   - Write/find a good corpus (real-world inputs)
   - Write a harness that exercises the right code paths
   - Run with sanitizers
   - Let it run for days/weeks on a cluster

4. Manual audit
   - Focus on complex parsing logic
   - Integer arithmetic (overflow/underflow)
   - Memory allocation (size calculations)
   - Format strings, type confusion, use-after-free patterns

5. Crash triage
   - Is this exploitable? (what's the control flow?)
   - What's the impact? (what data is accessible?)
   - Write a minimal reproducible proof of concept

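6. Responsible disclosure
   - Report to the vendor promptly, through their security contact if one exists
   - Agree on a disclosure deadline (90 days is a common convention)
   - Publish once a patch is available or the deadline passes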
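
The integer arithmetic and size calculation items in step 4 are worth a worked example, since this bug class is common in parsers. The sketch below uses Python to mimic a C expression like `uint32_t size = width * height * 4`; the function names and the image dimensions are chosen for illustration.

```python
def alloc_size_u32(width, height, bytes_per_pixel=4):
    """Mimic a 32-bit unsigned multiplication in C: the result wraps mod 2**32."""
    return (width * height * bytes_per_pixel) & 0xFFFFFFFF

def alloc_size_checked(width, height, bytes_per_pixel=4, limit=0xFFFFFFFF):
    """The fix: compute the true size and reject it if it does not fit."""
    size = width * height * bytes_per_pixel
    if size > limit:
        raise OverflowError("image dimensions overflow the size type")
    return size

width, height = 65536, 16385
true_size = width * height * 4           # bytes the decoder will actually write
wrapped = alloc_size_u32(width, height)  # bytes the buggy code would allocate

print(true_size, wrapped)
```

Here the true size is 2^32 + 2^18 bytes, but the wrapped value is only 2^18 (262144) bytes. A decoder that allocates the wrapped size and then writes `width * height` pixels overflows the heap buffer, which is the pattern to look for around every `malloc` whose size comes from file-controlled fields.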

Real Project: Fuzzing libpng

# Step 1: Get libpng source
apt source libpng1.6

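# Step 2: Write the fuzz harness (it is C++, so use a .cpp extension)
cat > fuzz_png.cpp << 'EOF'
#include <png.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Cursor over the fuzzer-provided buffer, handed to libpng as its I/O pointer.
struct ReadState { const uint8_t *data; size_t size; size_t pos; };

// Custom read callback: serve bytes from memory instead of a FILE*.
static void read_cb(png_structp png, png_bytep out, png_size_t n) {
    ReadState *s = static_cast<ReadState *>(png_get_io_ptr(png));
    if (n > s->size - s->pos) png_error(png, "read past end of input");
    memcpy(out, s->data + s->pos, n);
    s->pos += n;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    if (size < 8) return 0;  // shorter than the PNG signature

    png_structp png = png_create_read_struct(PNG_LIBPNG_VER_STRING, NULL, NULL, NULL);
    if (!png) return 0;

    png_infop info = png_create_info_struct(png);
    if (!info) { png_destroy_read_struct(&png, NULL, NULL); return 0; }

    ReadState state = {data, size, 0};

    // libpng reports errors via longjmp; clean up and return when that happens.
    if (setjmp(png_jmpbuf(png))) {
        png_destroy_read_struct(&png, &info, NULL);
        return 0;
    }

    png_set_read_fn(png, &state, read_cb);
    png_read_png(png, info, PNG_TRANSFORM_IDENTITY, NULL);
    png_destroy_read_struct(&png, &info, NULL);
    return 0;
}
EOF

# Step 3: Compile
# Note: the distro's libpng is not instrumented, so coverage feedback stops at
# the harness. For real runs, build the Step 1 source with
# -fsanitize=fuzzer-no-link,address and link that libpng instead.
clang++ -fsanitize=fuzzer,address -g fuzz_png.cpp -lpng -lz -o fuzz_png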

# Step 4: Get PNG corpus
mkdir corpus/
find /usr/share -name "*.png" -exec cp {} corpus/ \;

# Step 5: Fuzz!
./fuzz_png corpus/ -max_total_time=86400   # run for 24 hours
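
The `find` command in Step 4 can pull in duplicate files and very large images, both of which slow the fuzzer down. Below is a sketch of a more selective seed collector; the `collect_seeds` name and the 64 KiB size cap are assumptions made for this example, not a recommendation from the libpng or LibFuzzer documentation.

```python
import hashlib
import shutil
from pathlib import Path

def collect_seeds(src_dir, dst_dir, pattern="*.png", max_bytes=64 * 1024):
    """Copy unique, small files matching `pattern` into dst_dir.

    Files are named after their content hash, so identical images found
    under different paths end up as a single seed.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    seen = set()
    copied = []
    for path in sorted(Path(src_dir).rglob(pattern)):
        if not path.is_file() or path.stat().st_size > max_bytes:
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        target = dst / f"{digest[:16]}{path.suffix}"
        shutil.copyfile(path, target)
        copied.append(target)
    return copied

# Example (assumes /usr/share contains PNG files):
# seeds = collect_seeds("/usr/share", "corpus")
# print(len(seeds), "seeds copied")
```

Small, distinct seeds give the mutator more executions per second and less redundant work than a large pile of near-identical files.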